<< Chapter < Page | Chapter >> Page > |
At least the top layer of the memory hierarchy, the CPU registers, operate as fast as the rest of the processor. The goal is to keep operands in the registers as much as possible. This is especially important for intermediate values used in a long computation such as:
X = G * 2.41 + A / W - W * M
While computing the value of A divided by W, we must store the result of multiplying G by 2.41. It would be a shame to have to store this intermediate result in memory and then reload it a few instructions later. On any modern processor with moderate optimization, the intermediate result is stored in a register. Also, the value W is used in two computations, and so it can be loaded once and used twice to eliminate a “wasted” load.
Compilers have been very good at detecting these types of optimizations and efficiently making use of the available registers since the 1970s. Adding more registers to the processor has some performance benefit. It’s not practical to add enough registers to the processor to store the entire problem data. So we must still use the slower memory technology.
Notification Switch
Would you like to follow the 'High performance computing' conversation and receive update notifications?