<< Chapter < Page | Chapter >> Page > |
IF (N .EQ. 0) THEN
DO I=1,KA(I) = A(I) + B(I) * C
ENDDOELSE
DO I=1,KA(I) = 0
ENDDOENDIF
The effect on the runtime is dramatic. Not only have we eliminated
K-1
copies of the test, we have also assured that the computations in the middle of the loop are not control-dependent on the if-statement, and are therefore much easier for the compiler to pipeline.
We remember helping someone optimize a program with loops containing similar conditionals. They were checking to see whether debug output should be printed each iteration inside an otherwise highly optimizable loop. We can’t fault the person for not realizing how much this slowed the program down. Performance wasn’t important at the time. The programmer was just trying to get the code to produce good answers. But later on, when performance mattered, by cleaning up invariant conditionals, we were able to speed up the program by a factor of 100.
For
loop index dependent conditionals, the test is true for certain ranges of the loop index variables. It isn’t always true or always false, like the conditional we just looked at, but it does change with a predictable pattern, and one that we can use to our advantage. The following loop has two index variables,
I
and
J
.
DO I=1,N
DO J=1,NIF (J .LT. I)
A(J,I) = A(J,I) + B(J,I) * CELSE
A(J,I) = 0.0ENDIF
ENDDOENDDO
Notice how the if-statement partitions the iterations into distinct sets: those for which it is true and those for which it is false. You can take advantage of the predictability of the test to restructure the loop into several loops — each custom-made for a different partition:
DO I=1,N
DO J=1,I-1A(J,I) = A(J,I) + B(J,I) * C
ENDDODO J=I,N
A(J,I) = 0.0ENDDO
ENDDO
The new version will almost always be faster. A possible exception is when
N
is a small value, like 3, in which case we have created more clutter. But then, the loop probably has such a small impact on the total runtime that it won’t matter which way it’s coded.
It would be nice if you could optimize every loop by partitioning it. But more often than not, the conditional doesn’t directly depend on the value of the index variables. Although an index variable may be involved in addressing an array, it doesn’t create a recognizable pattern in advance — at least not one you can see when you are writing the program. Here’s such a loop:
DO I=1,N
DO J=1,NIF (B(J,I) .GT. 1.0) A(J,I) = A(J,I) + B(J,I) * C
ENDDOENDDO
There is not much you can do about this type of conditional. But because every iteration is independent, the loop can be unrolled or can be performed in parallel.
When the conditional is based on a value that changes with each iteration of the loop, the compiler has no choice but to execute the code exactly as written. For instance, the following loop has an if-statement with built-in scalar recursion:
Notification Switch
Would you like to follow the 'High performance computing' conversation and receive update notifications?