<< Chapter < Page | Chapter >> Page > |
Maximum speedup of O(log n)
Maximum speedup of O(n / ln n)
The machines are the true parallel processors (also called concurrent processors)
These paralle machines fall into Flynn’s taxonomy classes of SIMD and MIMD systems
– SIMD: Single Instruction stream and Multiple Data streams
– MIMD: Multiple Instruction streams and Multiple Data streams
– Extensions of scalar instructions
Adds, stores, multiplies, etc. become vector operations executed in all processors concurrently
– Must add the ability to transfer vector and scalar data between processors to the instruction set -- attributes of a “parallel language”
Vector addition
C(I) = A(I) + B(I)
Complexity O(n) in SISD systems for I=1 to n do
C(I) = A(I) + B(I)
Complexity O(1) in SIMD systems
Matrix multiply
A, B, and C are n-by-n matrices
Compute C= AxB
Complexity O(n3) in SISD systems
n2 dot products, each of which is O(n)
Complexity O(n2) in SIMD systems
Perform n dot products in parallel across M the array
Image smoothing
– Smooth an n-by-n pixel image to reduce “noise”
– Each pixel is replaced by the average of itself
and its 8 nearest neighbors
– Complexity O(n2) in SISD systems
– Complexity O(n) in SIMD systems
Pixel and 8 neighbors
instructions
– Rather than forcing all processors to perform the same task at the same time, processors can be assigned different tasks that, when taken as a whole, complete the assigned application
– Each processor executes its own copy of the SIMD algorithm
– Distributed simulations is a good example of a
very hard MIMD application
– Process synchronization
– Process scheduling
awaiting data from another processor
– By the programmer through the use of parallel language constructs
Specify apriori what processes will be instantiated and where they will be
executed
– During program execution by spawning processes off for execution on available processors.
Fork-join construct in some languages
SIMD
– Illiac IV
One of the first massively parallel systems 64 processors
– Goodyear Staran: 256 bit-serial processors
– Current system from Cray Computer Corp.uses supercomputer (Cray 3) front end coupled to an SIMD array of 1000s of processors
MIMD
– Intel hypercube series:
Supported up to several hundred CISC processors
Next-gen Paragon
– Cray Research T3D
Cray Y-MP coupled to a massive array of Dec Alphas
Target: sustained teraflop performance
– Hardware is relatively easy to build
– Massively parallel systems just take massive amounts of money to build
– How should/can the large numbers of processors be interconnected
– The real trick is building software that will exploit the capabilities of the system
– Outside of a limited number of high-profile applications, parallel processing is still a“young” discipline
– Parallel software is still fairly sparse
– Risky for companies to adopt parallel strategies, just wait for the next new SISD system.
Notification Switch
Would you like to follow the 'Computer architecture' conversation and receive update notifications?