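In general, the time delay d is equivalent to a clock pulse, and the stage delay is much larger than d. Suppose that n instructions are processed with no branches. The total time required on a k-stage pipeline, measured in cycle times, is
T = [k + (n-1)]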
– Suppose a task is divided into k subtasks, each requiring 1 time unit to complete
– The task itself then requires k time units to complete. For n iterations of the task, the execution times will be:
– With no pipelining: nk time units
– With pipelining: k + (n-1) time units
The speedup of a k-stage pipeline is thus
S = nk / [k + (n-1)], which approaches k for large n
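The speedup formula above can be checked numerically. The following is a minimal sketch, assuming every stage takes exactly one time unit and no branches or stalls occur; the function names and the choice of k = 6 are illustrative only.

```python
def sequential_time(k: int, n: int) -> int:
    """Time units for n tasks of k subtasks each, with no pipelining."""
    return n * k


def pipeline_time(k: int, n: int) -> int:
    """Time units for n tasks on a k-stage pipeline (1 unit per stage).

    The first task needs k units to fill the pipeline; each of the
    remaining n - 1 tasks then completes one unit after the previous one.
    """
    return k + (n - 1)


def speedup(k: int, n: int) -> float:
    """Speedup S = nk / [k + (n - 1)] of the pipeline over sequential execution."""
    return sequential_time(k, n) / pipeline_time(k, n)


if __name__ == "__main__":
    k = 6  # illustrative stage count (assumption)
    for n in (1, 10, 100, 10_000):
        print(f"n = {n:>6}: speedup = {speedup(k, n):.3f}")
```

For a single task the speedup is 1, since nothing overlaps; as n grows, the value approaches k but never reaches it.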
Several factors limit pipeline performance. If the six stages are not of equal duration, there will be some waiting at various pipeline stages. Another difficulty is the conditional branch instruction, or an unpredictable event such as an interrupt. Memory conflicts can also occur, so the system must contain logic to account for this type of conflict.
- Data dependencies also factor into the effective length of pipelines
- Logic to handle memory and register use and to control the overall pipeline increases significantly with increasing pipeline depth
– If the speedup is based on the number of stages, why not build lots of stages?
– Each stage uses latches at its input (output) to buffer the next set of inputs
+ If the stage granularity is reduced too much, the latches and their control become a significant hardware overhead
+ There is also a time overhead from the propagation time through the latches
- This limits the rate at which data can be clocked through the pipeline
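The effect of latch overhead on deep pipelines can be illustrated with a simple model. This is only a sketch under stated assumptions that are not in the text: the total logic delay is split evenly across k stages, each stage adds a fixed latch delay, and the unpipelined design pays no latch delay. The function names and the numbers used are hypothetical.

```python
def cycle_time(logic_delay: float, k: int, latch_delay: float) -> float:
    """Clock period of a k-stage pipeline under the even-split assumption."""
    return logic_delay / k + latch_delay


def throughput_speedup(logic_delay: float, k: int, latch_delay: float) -> float:
    """Large-n speedup: unpipelined time per task divided by the pipelined cycle time."""
    return logic_delay / cycle_time(logic_delay, k, latch_delay)


if __name__ == "__main__":
    logic_delay = 60.0  # hypothetical total logic delay, arbitrary units
    latch_delay = 2.0   # hypothetical per-stage latch delay, same units
    for k in (1, 6, 12, 30, 60, 600):
        s = throughput_speedup(logic_delay, k, latch_delay)
        print(f"k = {k:>3}: speedup = {s:.2f}")
```

Under this model the speedup can never exceed logic_delay / latch_delay, however many stages are added, which is one way to see why building "lots of stages" stops paying off.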
– Pipelining must ensure that computed results are the same as if the computation were performed in strict sequential order
– With multiple stages, two instructions “in execution” in the pipeline may have data dependencies, so the pipeline must be designed to keep such dependencies from producing incorrect results.
– Data dependency examples:
A = B + C
D = E + A
C = G x H
A = D / H
Data dependencies limit when an instruction can be input to the pipeline.
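The dependencies in the four example statements can be found mechanically. The sketch below compares every pair of statements and labels each conflict with the usual names RAW (read after write), WAR (write after read), and WAW (write after write). These labels, the parsing approach, and the function names are additions for illustration and do not come from the text. The check is deliberately conservative: it does not consider whether an intermediate statement overwrites the variable in between.

```python
import re

# The example statements from the text, in program order.
PROGRAM = [
    "A = B + C",
    "D = E + A",
    "C = G x H",
    "A = D / H",
]


def parse(stmt):
    """Split 'X = Y op Z' into (destination, set of source variable names)."""
    dest, expr = (part.strip() for part in stmt.split("="))
    sources = set(re.findall(r"[A-Z]", expr))  # single uppercase letters only
    return dest, sources


def find_dependencies(program):
    """Return (earlier, later, kind, variable) tuples using 1-based statement numbers."""
    parsed = [parse(s) for s in program]
    deps = []
    for j, (dest_j, src_j) in enumerate(parsed):
        for i in range(j):
            dest_i, src_i = parsed[i]
            if dest_i in src_j:      # later statement reads what the earlier one wrote
                deps.append((i + 1, j + 1, "RAW", dest_i))
            if dest_j in src_i:      # later statement overwrites what the earlier one read
                deps.append((i + 1, j + 1, "WAR", dest_j))
            if dest_i == dest_j:     # both statements write the same variable
                deps.append((i + 1, j + 1, "WAW", dest_j))
    return deps


if __name__ == "__main__":
    for earlier, later, kind, var in find_dependencies(PROGRAM):
        print(f"statement {later} depends on statement {earlier}: {kind} on {var}")
```

For example, the second statement reads A, which the first statement writes, so it cannot read its operands until the first statement has produced its result.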
One of the major problems in designing an instruction pipeline is ensuring a steady flow of instructions to the initial stages of the pipeline. However, 15-20% of instructions in an assembly-level stream are (conditional) branches. Of these, 60-70% take the branch to a target address. Until the instruction is actually executed, it is impossible to determine whether the branch will be taken or not.
- The impact of branches is that the pipeline never really operates at its full capacity.
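– The average time to complete a pipelined instruction becomes
Tave = (1 - pb)(1) + pb[pt(1 + b) + (1 - pt)(1)]
where pb is the probability that an instruction is a branch, pt is the probability that a branch is taken, and b is the branch penalty in cycles.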
– A number of techniques can be used to minimize the impact of the branch instruction (the branch penalty).
- Several approaches have been taken for dealing with conditional branches:
+ Multiple streams
+ Prefetch branch target
+ Loop buffer
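To see how much the branch penalty matters, the average-time expression Tave given earlier can be evaluated directly. In this sketch pb is taken as the fraction of instructions that are branches, pt as the fraction of branches that are taken, and b as the branch penalty in cycles. The values 0.2 and 0.65 fall within the ranges quoted in the text, while b = 3 is a hypothetical penalty chosen only for illustration. Expanding the expression shows that it reduces to 1 + pb*pt*b, which the second function computes.

```python
def average_time(pb: float, pt: float, b: float) -> float:
    """Tave = (1 - pb)(1) + pb[pt(1 + b) + (1 - pt)(1)], in cycles per instruction."""
    return (1 - pb) * 1 + pb * (pt * (1 + b) + (1 - pt) * 1)


def average_time_simplified(pb: float, pt: float, b: float) -> float:
    """Algebraically equivalent form: only taken branches add the penalty b."""
    return 1 + pb * pt * b


if __name__ == "__main__":
    pb, pt = 0.2, 0.65  # within the 15-20% and 60-70% ranges quoted in the text
    b = 3               # hypothetical branch penalty in cycles (assumption)
    print(f"Tave       = {average_time(pb, pt, b):.3f} cycles")
    print(f"simplified = {average_time_simplified(pb, pt, b):.3f} cycles")
```

The techniques listed above all aim to shrink the pb*pt*b term, mainly by reducing the effective penalty b.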