If you have been in high performance computing since its beginning in the 1950s, you have programmed in several languages during that time. During the 1950s and early 1960s, you programmed in assembly language. Constraints on memory and slow clock rates made every instruction precious. With small memories, overall program size was typically small, so assembly language was sufficient. Toward the end of the 1960s, programmers began writing more of their code in a high-level language such as FORTRAN. Writing in a high-level language made your work much more portable, reliable, and maintainable. Given the increasing speed and capacity of computers, the cost of using a high-level language was something most programmers were willing to accept. Even in the 1970s, if a program spent a particularly large amount of time in a particular routine, or if the routine was part of the operating system or a commonly used library, it was most likely written in assembly language.
During the late 1970s and early 1980s, optimizing compilers continued to improve to the point that all but the most critical portions of general-purpose programs were written in high-level languages. On average, compilers generated better code than most assembly language programmers. This was often because a compiler could make better use of hardware resources such as registers. In a processor with 16 registers, a programmer might adopt a convention regarding the use of registers just to help keep track of what value was in which register. A compiler could use each register as freely as it liked because it could track precisely when a register was available for another use.
However, during that time, high performance computer architecture was also evolving. Cray Research was developing vector processors at the very top end of the computing spectrum. Compilers were not quite ready to determine when these new vector instructions could be used. Programmers were forced to write assembly language or create highly hand-tuned FORTRAN that called the appropriate vector routines in their code. In a sense, vector processors turned back the clock for a while when it came to trusting the compiler. Programmers never lapsed completely into assembly language, but some of their FORTRAN started looking rather un-FORTRAN like. As the vector computers matured, their compilers became increasingly able to detect when vectorization could be performed. At some point, the compilers again became better than programmers on these architectures. These new compilers reduced the need for extensive directives or language extensions. The Livermore Loops was a benchmark that specifically tested the ability of a compiler to optimize a set of loops effectively. In addition to being a performance benchmark, it was also a compiler benchmark.
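As a simple illustration (the routine and variable names here are invented, not taken from any particular vendor's library), the loop below is the kind of code that early vector programmers hand-tuned or replaced with calls to vendor vector routines, and that a mature vectorizing compiler could eventually translate into vector loads, a vector add, and vector stores on its own:

      SUBROUTINE VADD(A, B, C, N)
C     Illustrative sketch only: element-wise vector addition.
C     A vectorizing compiler can turn this loop into vector
C     instructions with no directives or assembly language.
      INTEGER N, I
      REAL A(N), B(N), C(N)
      DO 10 I = 1, N
         A(I) = B(I) + C(I)
   10 CONTINUE
      RETURN
      END

Written this way, it is ordinary, portable FORTRAN; written with machine-specific directives or explicit calls to a vendor's vector library, the same computation is what made programs of that era look "rather un-FORTRAN like."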
The RISC revolution led to an increasing dependence on the compiler. Programming early RISC processors such as the Intel i860 was painful compared to programming CISC processors. Subtle differences in the way a program was coded in machine language could have a significant impact on its overall performance. For example, a programmer might have to count the instruction cycles between a load instruction and the use of its result in a computational instruction. As superscalar processors were developed, certain pairs of instructions could be issued simultaneously while others had to be issued serially. Because so many different RISC processors were produced, programmers did not have time to learn the nuances of wringing the last bit of performance out of each one. It was much easier to lock the processor designer and the compiler writer together (ideally both working for the same company) and have them hash out the best way to generate the machine code. Then everyone could use the compiler and get code that made reasonably good use of the hardware.
The compiler became an important tool in the processor design cycle. Processor designers had much greater flexibility in the kinds of changes they could make. For example, it might be a perfectly good design for the next revision of a processor to run existing binaries 10% slower than the current revision, provided that recompiling the code for the new processor made it run 65% faster. Of course, it was important to actually deliver that compiler when the new processor shipped, and for the compiler to provide that level of performance across a wide range of codes rather than on just one particular benchmark suite.