Entries that bridge the gap between two areas are welcome. For example, a module about a particular accelerated computing system might describe its accompanying language, and thus fall into both the Parallel Programming Models and Languages and Accelerated Computing categories. However, such modules can only be entered in one category each. To be eligible for awards in multiple categories, the module would need to be divided into appropriate parts, each entered in one category.
Parallel architectures
Today, nearly all computers have some parallel aspects. However, there are a variety of ways that processors can be organized into effective parallel systems. The classic classification of
parallel architectures is
Flynn’s taxonomy based on the number of distinct instruction and data streams supplied to the parallel processors.
- Single Instruction, Single Data (SISD) – These are sequential computers. This is the only class that is not a parallel computer. We include it only for completeness.
- Multiple Instruction, Single Data (MISD) – Several independent processors work on the same stream. Few computers of this type exist. Arguably the clearest examples are fault tolerant systems that replicate a computation, comparing answers to detect errors; the flight controller on the US Space Shuttle is based on such a design. Pipelined computations are also sometimes considered MISD. However, the data passed between stages of the pipeline has been changed, so the “single” data aspect is murky at best.
-
Single Instruction, Multiple Data (SIMD) – A central controller sends the same stream of instructions to a set of identical processors, each of which operates on its own data. Additional control instructions move data or exclude unneeded processors. At a low level of modern architectures, this is often used to update arrays or large data structures. For example, GPUs typically get most of their speed from SIMD operation. Perhaps the most famous large-scale SIMD computer was the Thinking Machines CM-2 in the 1980s, which boasted up to 65,536 bit-serial processors.
-
Multiple Instruction, Multiple Data (MIMD) – All processors execute their own instruction stream, each operating on its own data. Additional instructions are needed to synchronize and communicate between processors. Most computers sold as “parallel computers” today fall into this class. Examples include the supercomputers, servers, grid computers, and multicore chips described above.
Hierarchies are also possible. For example, a MIMD supercomputer might include SIMD chips as accelerators on each of its boards.
Beyond general class, many architectural decisions are critical in designing a parallel computer architecture. Two of the most important include:
- Memory hierarchy and organization. To reduce data access time, most modern computers use a hierarchy of caches to keep frequently-used data accessible. This becomes particularly important in parallel computers, where many processors mean even more accesses. Moreover, parallel computers must arrange for data to be shared between processors.
Shared memory architectures do this by allowing multiple processors to access the same memory.
Nonshared memory architectures allot each processor its own memory, and require explicit communication to move data to another processor. A hybrid approach – non-uniform shared memory – places memory with each processor for fast access, but allows slower access to other processors’ memory.
- Interconnection topology. To communicate and synchronize, processors in a parallel computer need a connection with each other. However, as a practical matter not all of these can be direct connections. Thus is born the need for interconnection networks. For example, interconnects in use today include simple buses, crossbar switches, token rings, fat trees, and 2- and 3-D torus topologies. At the same time, the underlying technology or system environment may affect the networks that are feasible. For example, grid computing systems typically have to accept the wide-area network that they have available.