<< Chapter < Page Chapter >> Page >

The most basic form of loop optimization is loop unrolling. It is so basic that most of today’s compilers do it automatically if it looks like there’s a benefit. There has been a great deal of clutter introduced into old dusty-deck FORTRAN programs in the name of loop unrolling that now serves only to confuse and mislead today’s compilers.

We’re not suggesting that you unroll any loops by hand. The purpose of this section is twofold. First, once you are familiar with loop unrolling, you might recognize code that was unrolled by a programmer (not you) some time ago and simplify the code. Second, you need to understand the concepts of loop unrolling so that when you look at generated machine code, you recognize unrolled loops.

The primary benefit in loop unrolling is to perform more computations per iteration. At the end of each iteration, the index value must be incremented, tested, and the control is branched back to the top of the loop if the loop has more iterations to process. By unrolling the loop, there are less “loop-ends” per loop execution. Unrolling also reduces the overall number of branches significantly and gives the processor more instructions between branches (i.e., it increases the size of the basic blocks).

For illustration, consider the following loop. It has a single statement wrapped in a do-loop:


DO I=1,N A(I) = A(I) + B(I) * CENDDO

You can unroll the loop, as we have below, giving you the same operations in fewer iterations with less loop overhead. You can imagine how this would help on any computer. Because the computations in one iteration do not depend on the computations in other iterations, calculations from different iterations can be executed together. On a superscalar processor, portions of these four statements may actually execute in parallel:


DO I=1,N,4 A(I) = A(I) + B(I) * CA(I+1) = A(I+1) + B(I+1) * C A(I+2) = A(I+2) + B(I+2) * CA(I+3) = A(I+3) + B(I+3) * C ENDDO

However, this loop is not exactly the same as the previous loop. The loop is unrolled four times, but what if N is not divisible by 4? If not, there will be one, two, or three spare iterations that don’t get executed. To handle these extra iterations, we add another little loop to soak them up. The extra loop is called a preconditioning loop :


II = IMOD (N,4) DO I=1,IIA(I) = A(I) + B(I) * C ENDDODO I=1+II,N,4A(I) = A(I) + B(I) * C A(I+1) = A(I+1) + B(I+1) * CA(I+2) = A(I+2) + B(I+2) * C A(I+3) = A(I+3) + B(I+3) * CENDDO

The number of iterations needed in the preconditioning loop is the total iteration count modulo for this unrolling amount. If, at runtime, N turns out to be divisible by 4, there are no spare iterations, and the preconditioning loop isn’t executed.

Speculative execution in the post-RISC architecture can reduce or eliminate the need for unrolling a loop that will operate on values that must be retrieved from main memory. Because the load operations take such a long time relative to the computations, the loop is naturally unrolled. While the processor is waiting for the first load to finish, it may speculatively execute three to four iterations of the loop ahead of the first load, effectively unrolling the loop in the Instruction Reorder Buffer.

Questions & Answers

what is decentralised
mithlesh Reply
Ayele, K., 2003. Introductory Economics, 3rd ed., Addis Ababa.
Widad Reply
can you send the book attached ?
Ariel
?
Ariel
What is economics
Widad Reply
the study of how humans make choices under conditions of scarcity
AI-Robot
U(x,y) = (x×y)1/2 find mu of x for y
Desalegn Reply
U(x,y) = (x×y)1/2 find mu of x for y
Desalegn
what is ecnomics
Jan Reply
this is the study of how the society manages it's scarce resources
Belonwu
what is macroeconomic
John Reply
macroeconomic is the branch of economics which studies actions, scale, activities and behaviour of the aggregate economy as a whole.
husaini
etc
husaini
difference between firm and industry
husaini Reply
what's the difference between a firm and an industry
Abdul
firm is the unit which transform inputs to output where as industry contain combination of firms with similar production 😅😅
Abdulraufu
Suppose the demand function that a firm faces shifted from Qd  120 3P to Qd  90  3P and the supply function has shifted from QS  20  2P to QS 10  2P . a) Find the effect of this change on price and quantity. b) Which of the changes in demand and supply is higher?
Toofiq Reply
explain standard reason why economic is a science
innocent Reply
factors influencing supply
Petrus Reply
what is economic.
Milan Reply
scares means__________________ends resources. unlimited
Jan
economics is a science that studies human behaviour as a relationship b/w ends and scares means which have alternative uses
Jan
calculate the profit maximizing for demand and supply
Zarshad Reply
Why qualify 28 supplies
Milan
what are explicit costs
Nomsa Reply
out-of-pocket costs for a firm, for example, payments for wages and salaries, rent, or materials
AI-Robot
concepts of supply in microeconomics
David Reply
economic overview notes
Amahle Reply
Got questions? Join the online conversation and get instant answers!
Jobilize.com Reply

Get Jobilize Job Search Mobile App in your pocket Now!

Get it on Google Play Download on the App Store Now




Source:  OpenStax, High performance computing. OpenStax CNX. Aug 25, 2010 Download for free at http://cnx.org/content/col11136/1.5
Google Play and the Google Play logo are trademarks of Google Inc.

Notification Switch

Would you like to follow the 'High performance computing' conversation and receive update notifications?

Ask