<< Chapter < Page | Chapter >> Page > |
One ordering distinction is between recursion and iteration. As expressed above, the Cooley-Tukey algorithm could be thought of asdefining a tree of smaller and smaller DFTs, as depicted in [link] ; for example, a textbook radix-2 algorithm would divide size into two transforms of size , which are divided into four transforms of size , and so on until a base case is reached (in principle, size 1). This might naturally suggest a recursive implementation inwhich the tree is traversed “depth-first” as in [link] (right) and the algorithm of [link] —one size transform is solved completely before processing the other one, and so on. However, most traditionalFFT implementations are non-recursive (with rare exceptions [link] ) and traverse the tree “breadth-first” [link] as in [link] (left)—in the radix-2 example, they would perform (trivial) size-1 transforms, then combinations into size-2 transforms, then combinations into size-4 transforms, and so on, thus making passes over the whole array. In contrast, as we discuss in "Discussion" , FFTW employs an explicitly recursive strategy that encompasses both depth-first and breadth-first styles, favoring the former since it has sometheoretical and practical advantages as discussed in "FFTs and the Memory Hierarchy" .
:
IF n=1 THEN
ELSE
FOR
TO
DO
END FOR
END IF
A depth-first recursive radix-2 DIT Cooley-Tukey FFT to
compute a DFT of a power-of-two size
. The input is an array
of length
with stride
(i.e., the inputs are
for
) and the output is an array
of length
(with
stride 1), containing the DFT of
[Equation 1].
denotes the array beginning with
. This algorithm operates
out-of-place, produces in-order output, and does not require aseparate bit-reversal stage.
A second ordering distinction lies in how the digit-reversal is performed. The classic approach is a single, separate digit-reversalpass following or preceding the arithmetic computations; this approach is so common and so deeply embedded into FFT lore that manypractitioners find it difficult to imagine an FFT without an explicit bit-reversal stage. Although this pass requires only time [link] , it can still be non-negligible, especially if the data is out-of-cache; moreover, it neglects the possibility that datareordering during the transform may improve memory locality. Perhaps the oldest alternative is the Stockham auto-sort FFT [link] , [link] , which transforms back and forth between two arrays with each butterfly, transposing one digit eachtime, and was popular to improve contiguity of access for vector computers [link] . Alternatively, an explicitly recursive style, as in FFTW, performs the digit-reversal implicitlyat the “leaves” of its computation when operating out-of-place (see section "Discussion" ). A simple example of this style, which computes in-order output using an out-of-place radix-2FFT without explicit bit-reversal, is shown in the algorithm of [link] [corresponding to [link] (right)]. To operate in-placewith scratch storage, one can interleave small matrix transpositions with thebutterflies [link] , [link] , [link] , [link] , and a related strategy in FFTW [link] is briefly described by "Discussion" .
Notification Switch
Would you like to follow the 'Fast fourier transforms' conversation and receive update notifications?