
Figure: A pipeline. Instructions enter on the left, pass through five numbered stages in sequence, and results come out on the right.

[link] shows a conceptual diagram of a pipeline. An operation entering at the left proceeds on its own for five clock ticks before emerging at the right. Because the pipeline stages are independent of one another, up to five operations can be in flight at a time, as long as each instruction is delayed just long enough for the previous instruction to clear the stage ahead of it. Consider how powerful this mechanism is: where before it would have taken five clock ticks to get a single result, the pipeline can produce as much as one result every clock tick.

Pipelining is useful when a procedure can be divided into stages. Instruction processing fits into that category. The job of retrieving an instruction from memory, figuring out what it does, and doing it are separate steps we usually lump together when we talk about executing an instruction. The number of steps varies, depending on whose processor you are using, but for illustration, let’s say there are five:

  1. Instruction fetch: The processor fetches an instruction from memory.
  2. Instruction decode: The instruction is recognized or decoded.
  3. Operand fetch: The processor fetches the operands the instruction needs. These operands may be in registers or in memory.
  4. Execute: The instruction gets executed.
  5. Writeback: The processor writes the results back to wherever they are supposed to go, possibly registers, possibly memory.

Ideally, instruction 1 will be entering the operand fetch stage as instruction 2 enters the instruction decode stage and instruction 3 starts instruction fetch, and so on.

Our pipeline is five stages deep, so it should be possible to get five instructions in flight all at once. If we could keep it up, we would see one instruction complete per clock cycle.
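To make the arithmetic concrete, here is a small C sketch (not part of the original text) that models the ideal case. Each clock tick it prints which instruction occupies which stage, and it reports that N instructions finish in N + 4 ticks rather than the 5 * N ticks a machine with no pipelining would need.

    #include <stdio.h>

    #define STAGES    5
    #define NUM_INSTR 8

    static const char *stage_name[STAGES] = {
        "Fetch", "Decode", "OpFetch", "Exec", "Writeback"
    };

    int main(void)
    {
        /* In the ideal case instruction i (0-based) occupies stage s on
         * clock tick i + s, so N instructions finish in N + STAGES - 1
         * ticks instead of N * STAGES. */
        int total_ticks = NUM_INSTR + STAGES - 1;

        for (int tick = 0; tick < total_ticks; tick++) {
            printf("tick %2d:", tick + 1);
            for (int s = 0; s < STAGES; s++) {
                int i = tick - s;                 /* instruction in stage s */
                if (i >= 0 && i < NUM_INSTR)
                    printf("  %s=I%d", stage_name[s], i + 1);
            }
            printf("\n");
        }
        printf("%d instructions: %d ticks pipelined, %d ticks unpipelined\n",
               NUM_INSTR, total_ticks, NUM_INSTR * STAGES);
        return 0;
    }

Running it shows the classic staircase pattern: the pipeline spends the first few ticks filling up, and from then on one instruction completes every tick.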

Simple as this illustration seems, instruction pipelining is complicated in real life. Each step must be able to occur on different instructions simultaneously, and delays in any stage have to be coordinated with all those that follow. In [link] we see three instructions being executed simultaneously by the processor, with each instruction in a different stage of execution.

Figure: Three instructions in flight through one pipeline. Each instruction moves through the same five stages (instruction fetch, instruction decode, operand fetch, execute, writeback), offset by one clock tick from the instruction ahead of it.

For instance, if a complicated memory access occurs in stage three, the instruction needs to be delayed before going on to stage four because it takes some time to calculate the operand’s address and retrieve it from memory. All the while, the rest of the pipeline is stalled. A simpler instruction, sitting in one of the earlier stages, can’t continue until the traffic ahead clears up.
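The effect of such a stall on overall timing can be sketched with a simple cycle-count model; the stall lengths below are made up for illustration. In an in-order pipeline, every tick an instruction spends waiting also holds up everything behind it, so the stall ticks simply add to the ideal total.

    #include <stdio.h>

    #define STAGES 5

    /* Extra ticks each instruction spends stalled in its operand fetch
     * stage, e.g. waiting on a slow memory access (assumed numbers). */
    static const int stall_ticks[] = { 0, 0, 3, 0, 0, 0, 0, 0 };

    int main(void)
    {
        int n = sizeof(stall_ticks) / sizeof(stall_ticks[0]);
        int ideal = STAGES + n - 1;      /* fully overlapped case */
        int stalls = 0;

        /* Every stall tick delays all of the instructions behind the
         * slow one, so the delays accumulate. */
        for (int i = 0; i < n; i++)
            stalls += stall_ticks[i];

        printf("ideal: %d ticks, with stalls: %d ticks\n",
               ideal, ideal + stalls);
        return 0;
    }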

Now imagine how a jump to a new program address, perhaps caused by an if statement, could disrupt the pipeline flow. The processor doesn’t know an instruction is a branch until the decode stage. It usually doesn’t know whether a branch will be taken or not until the execute stage. As shown in [link], during the four cycles after the branch instruction was fetched, the processor blindly fetches instructions sequentially and starts these instructions through the pipeline.
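A rough back-of-the-envelope model, with an assumed branch frequency and guess accuracy rather than figures from the text, shows how these wasted fetches erode the one-result-per-tick ideal. In this simplified model the branch is fetched in stage 1 and resolved in stage 4, so the instructions fetched in between were fetched on a guess and must be discarded if the guess was wrong.

    #include <stdio.h>

    int main(void)
    {
        const int fetch_stage = 1, execute_stage = 4;
        int guessed = execute_stage - fetch_stage;   /* fetched on a guess */

        double branch_fraction = 0.20;  /* assumed: 1 in 5 instructions branches */
        double wrong_fraction  = 0.50;  /* assumed: half the guesses are wrong */

        /* Average ticks per instruction rises above the ideal value of 1.0. */
        double tpi = 1.0 + branch_fraction * wrong_fraction * guessed;
        printf("instructions fetched on a guess: %d\n", guessed);
        printf("average ticks per instruction:   %.2f\n", tpi);
        return 0;
    }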

Figure: Detecting a branch. The branch works its way through fetch, decode, operand fetch, and execute; the instructions fetched behind it in the meantime are marked as guesses, and only the fetch that lines up with the branch's execute stage is marked as sure.

Source: OpenStax, High Performance Computing. OpenStax CNX, Aug 25, 2010. Download for free at http://cnx.org/content/col11136/1.5