A = -B + C * D / E
Taken all at once, this statement has four operators and four operands: /, *, +, and - (negate), and B, C, D, and E. This is clearly too much to fit into one quadruple. We need a form with exactly one operator and, at most, two operands per statement. The recast version that follows manages to do this, employing temporary variables to hold the intermediate results:
T1 = D / E
T2 = C * T1
T3 = -B
A = T3 + T2
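To make the quadruple form concrete, here is a minimal C sketch that stores the four recast statements as data and executes them in order. The `struct quad` layout, the slot numbering, and the function names `run_quads` and `eval_a` are assumptions made for this illustration; they are not a format the text defines.

```c
#include <assert.h>
#include <stddef.h>

/* One quadruple: result = arg1 op arg2. Unary negate ignores arg2. */
enum op { OP_ADD, OP_MUL, OP_DIV, OP_NEG };

struct quad {
    enum op op;
    int arg1;    /* slot index of the first operand */
    int arg2;    /* slot index of the second operand, or -1 if unused */
    int result;  /* slot index that receives the value */
};

/* Hypothetical slot numbering for the variables and temporaries. */
enum { SLOT_A, SLOT_B, SLOT_C, SLOT_D, SLOT_E,
       SLOT_T1, SLOT_T2, SLOT_T3, NSLOTS };

static const struct quad program[] = {
    { OP_DIV, SLOT_D,  SLOT_E,  SLOT_T1 },  /* T1 = D / E   */
    { OP_MUL, SLOT_C,  SLOT_T1, SLOT_T2 },  /* T2 = C * T1  */
    { OP_NEG, SLOT_B,  -1,      SLOT_T3 },  /* T3 = -B      */
    { OP_ADD, SLOT_T3, SLOT_T2, SLOT_A  },  /* A  = T3 + T2 */
};

/* Execute the quadruples in order against the slot array. */
static void run_quads(double slot[NSLOTS])
{
    size_t n = sizeof program / sizeof program[0];
    for (size_t i = 0; i < n; i++) {
        const struct quad *q = &program[i];
        assert(q->result >= 0 && q->result < NSLOTS);
        double x = slot[q->arg1];
        double y = (q->arg2 >= 0) ? slot[q->arg2] : 0.0;
        switch (q->op) {
        case OP_ADD: slot[q->result] = x + y; break;
        case OP_MUL: slot[q->result] = x * y; break;
        case OP_DIV: slot[q->result] = x / y; break;
        case OP_NEG: slot[q->result] = -x;    break;
        }
    }
}

/* Convenience wrapper: load B, C, D, E, run the program, return A. */
static double eval_a(double b, double c, double d, double e)
{
    double slot[NSLOTS] = { 0 };
    slot[SLOT_B] = b; slot[SLOT_C] = c; slot[SLOT_D] = d; slot[SLOT_E] = e;
    run_quads(slot);
    return slot[SLOT_A];
}
```

Calling `eval_a(1.0, 2.0, 6.0, 3.0)` walks the four quadruples and leaves 3.0 in A. Note that the recast version computes C * (D / E); in floating point this can round differently from (C * D) / E, even though the two are mathematically equal.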
A workable intermediate language would, of course, need some other features, like pointers. We’re going to suggest that we create our own intermediate language to investigate how optimizations work. To begin, we need to establish a few rules:
Assignments are of the form X := Y op Z, meaning X gets the result of op applied to Y and Z.
Temporaries are written tn (t1, t2, and so on).
If we were building a compiler, we’d need to be a little more specific. For our purposes, this will do. Consider the following bit of C code:
while (j < n) {
    k = k + j * 2;
    m = j * 2;
    j++;
}
This loop translates into the intermediate language representation shown here:
A:: t1  := j
    t2  := n
    t3  := t1 < t2
    jmp (B) t3
    jmp (C) TRUE
B:: t4  := k
    t5  := j
    t6  := t5 * 2
    t7  := t4 + t6
    k   := t7
    t8  := j
    t9  := t8 * 2
    m   := t9
    t10 := j
    t11 := t10 + 1
    j   := t11
    jmp (A) TRUE
C::
Each C source line is represented by several IL statements. On many RISC processors, our IL code is so close to machine language that we could turn it directly into object code.
See [link] for some examples of machine code translated directly from intermediate language. Often the lowest optimization level does a literal translation from the intermediate language to machine code. When this is done, the code generally is very large and performs very poorly. Looking at it, you can see places to save a few instructions. For instance, j gets loaded into temporaries in four places; surely we can reduce that. We have to do some analysis and make some optimizations.
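One way to see both the equivalence and the redundancy is to transliterate the IL into C, one IL statement per C statement, using labels and goto. The sketch below does that and counts how often j is loaded at run time. The `struct state` type, the function names, and the load counter are assumptions added for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct state { int j, k, m; };

/* The original loop from the text. */
static struct state run_source(struct state s, int n)
{
    while (s.j < n) {
        s.k = s.k + s.j * 2;
        s.m = s.j * 2;
        s.j++;
    }
    return s;
}

/* The same loop as a literal, unoptimized translation of the IL.
   *j_loads receives the number of times j was copied into a temporary. */
static struct state run_il(struct state s, int n, int *j_loads)
{
    int j = s.j, k = s.k, m = s.m;
    int t1, t2, t3, t4, t5, t6, t7, t8, t9, t10, t11;
    int loads = 0;

    assert(j_loads != NULL);

A:  t1 = j;           loads++;   /* first load of j  */
    t2 = n;
    t3 = t1 < t2;
    if (t3) goto B;              /* jmp (B) t3       */
    goto C;                      /* jmp (C) TRUE     */
B:  t4 = k;
    t5 = j;           loads++;   /* second load of j */
    t6 = t5 * 2;
    t7 = t4 + t6;
    k = t7;
    t8 = j;           loads++;   /* third load of j  */
    t9 = t8 * 2;
    m = t9;
    t10 = j;          loads++;   /* fourth load of j */
    t11 = t10 + 1;
    j = t11;
    goto A;                      /* jmp (A) TRUE     */
C:  *j_loads = loads;
    s.j = j; s.k = k; s.m = m;
    return s;
}
```

Starting from j = k = m = 0 with n = 3, both functions finish with j = 3, k = 6, and m = 4. The literal translation performs 13 loads of j for three iterations: four per trip through the loop plus one for the final failing test.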
After generating our intermediate language, we want to cut it into basic blocks. These are code sequences that start with an instruction that either follows a branch or is itself a target for a branch. Put another way, each basic block has one entrance (at the top) and one exit (at the bottom). [link] represents our IL code as a group of three basic blocks. Basic blocks make code easier to analyze. By restricting flow of control within a basic block from top to bottom and eliminating all the branches, we can be sure that if the first statement gets executed, the second one does too, and so on. Of course, the branches haven’t disappeared, but we have forced them outside the blocks in the form of the connecting arrows that make up the flow graph.
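The block boundaries can be found mechanically by marking "leaders", the statements that begin a block. The following C sketch applies the rule to the loop's IL. It rests on two assumptions that are not in the text: the `struct il` encoding is invented for the example, and a conditional jmp followed directly by an unconditional jmp is treated as a single two-way branch, so the second jmp does not start a block of its own. With that reading the result matches the three blocks described above.

```c
#include <assert.h>
#include <stddef.h>

struct il {
    char label;        /* 'A', 'B', 'C', or 0 if the statement is unlabeled */
    const char *text;  /* statement text, kept for display only */
    char target;       /* label a jmp goes to, or 0 if not a jmp */
};

static const struct il code[] = {
    { 'A', "t1 := j",        0  },
    {  0,  "t2 := n",        0  },
    {  0,  "t3 := t1 < t2",  0  },
    {  0,  "jmp (B) t3",    'B' },
    {  0,  "jmp (C) TRUE",  'C' },
    { 'B', "t4 := k",        0  },
    {  0,  "t5 := j",        0  },
    {  0,  "t6 := t5 * 2",   0  },
    {  0,  "t7 := t4 + t6",  0  },
    {  0,  "k := t7",        0  },
    {  0,  "t8 := j",        0  },
    {  0,  "t9 := t8 * 2",   0  },
    {  0,  "m := t9",        0  },
    {  0,  "t10 := j",       0  },
    {  0,  "t11 := t10 + 1", 0  },
    {  0,  "j := t11",       0  },
    {  0,  "jmp (A) TRUE",  'A' },
    { 'C', "(end of loop)",  0  },
};
enum { NCODE = sizeof code / sizeof code[0] };

/* Return 1 if some jmp in the program targets this label. */
static int is_jump_target(char label)
{
    if (label == 0) return 0;
    for (size_t i = 0; i < NCODE; i++)
        if (code[i].target == label) return 1;
    return 0;
}

/* Set leader[i] to 1 where statement i starts a basic block.
   Returns the number of basic blocks found. */
static int find_leaders(int leader[NCODE])
{
    int count = 0;
    for (size_t i = 0; i < NCODE; i++) {
        /* First non-jmp statement after a jmp (or jmp pair). */
        int after_branch = i > 0 && code[i - 1].target != 0
                                 && code[i].target == 0;
        leader[i] = (i == 0) || is_jump_target(code[i].label) || after_branch;
        count += leader[i];
    }
    return count;
}
```

For this listing, `find_leaders` marks statements 0, 5, and 17 (counting from zero), which are the statements labeled A, B, and C.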
We are now free to extract information from the blocks themselves. For instance, we can say with certainty which variables a given block uses and which variables it defines (sets the value of). We might not be able to do that if the block contained a branch. We can also gather the same kind of information about the calculations it performs. After we have analyzed the blocks so that we know what goes in and what comes out, we can modify them to improve performance and then worry only about the interaction between blocks.
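As a rough illustration of that bookkeeping, the C sketch below scans block B of the loop from top to bottom and collects two sets: the names the block defines, and the names it reads before defining them (what goes in). Treating "uses" as "read before any definition in the block" is an assumption of this sketch, as are the `struct stmt` and `struct nameset` types; constants such as 2 and 1 are simply not tracked.

```c
#include <assert.h>
#include <string.h>

/* One IL statement: dst := src1 op src2. Untracked operands are NULL. */
struct stmt { const char *dst, *src1, *src2; };

static const struct stmt block_b[] = {
    { "t4",  "k",   NULL },   /* t4  := k        */
    { "t5",  "j",   NULL },   /* t5  := j        */
    { "t6",  "t5",  NULL },   /* t6  := t5 * 2   */
    { "t7",  "t4",  "t6" },   /* t7  := t4 + t6  */
    { "k",   "t7",  NULL },   /* k   := t7       */
    { "t8",  "j",   NULL },   /* t8  := j        */
    { "t9",  "t8",  NULL },   /* t9  := t8 * 2   */
    { "m",   "t9",  NULL },   /* m   := t9       */
    { "t10", "j",   NULL },   /* t10 := j        */
    { "t11", "t10", NULL },   /* t11 := t10 + 1  */
    { "j",   "t11", NULL },   /* j   := t11      */
};

#define MAXNAMES 32
struct nameset { const char *name[MAXNAMES]; int n; };

static int set_contains(const struct nameset *s, const char *name)
{
    for (int i = 0; i < s->n; i++)
        if (strcmp(s->name[i], name) == 0) return 1;
    return 0;
}

static void set_add(struct nameset *s, const char *name)
{
    if (name == NULL || set_contains(s, name)) return;
    assert(s->n < MAXNAMES);
    s->name[s->n++] = name;
}

/* use: names read before any definition in the block.
   def: names assigned anywhere in the block. */
static void use_def(const struct stmt *b, int len,
                    struct nameset *use, struct nameset *def)
{
    use->n = 0;
    def->n = 0;
    for (int i = 0; i < len; i++) {
        if (b[i].src1 && !set_contains(def, b[i].src1)) set_add(use, b[i].src1);
        if (b[i].src2 && !set_contains(def, b[i].src2)) set_add(use, b[i].src2);
        set_add(def, b[i].dst);
    }
}
```

For block B this yields a use set of k and j, and a def set of eleven names: the temporaries t4 through t11 plus k, m, and j. Because the block has no internal branches, one top-to-bottom pass is enough to get these sets right.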