If the DFT is calculated directly using [link], the algorithm is called a prime factor algorithm (PFA) [link], [link] and was discussed in Winograd’s Short DFT Algorithms and Multidimensional Index Mapping: In-Place Calculation of the DFT and Scrambling. When the short DFT's are calculated by the very efficient algorithms of Winograd discussed in Factoring the Signal Processing Operators, the PFA becomes a very powerful method that is as fast as or faster than the best Cooley-Tukey FFT's [link], [link].
A flow graph is not as helpful with the PFA as it was with the Cooley-Tukey FFT; however, the representation in [link], which combines Multidimensional Index Mapping: Figure 1 and Winograd’s Short DFT Algorithms: Figure 2, gives a good picture of the algorithm for the example of Multidimensional Index Mapping: Equation 25.
If N is factored into three factors, the DFT of [link] would have three nested summations and would be a three-dimensional DFT. This principle extends to any number of factors; however, recall that the Type-1 map requires that all the factors be relatively prime. A very simple three-loop indexing scheme has been developed [link] which gives a compact, efficient PFA program for any number of factors. The basic program structure is illustrated in [link], with the short DFT's omitted for clarity. Complete programs are given in [link] and in the appendices.
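To make the three-factor case concrete, the nested summations can be written out as below. This is a sketch under one particular choice of map constants (all multipliers set to one in the input map, and the output indices taken as residues of k); the equation referenced above may use different constants, and the symbols n_i, k_i, and W_M are notation assumed here rather than taken from the text.

```latex
% Assumes N = N_1 N_2 N_3 with pairwise relatively prime factors
% and W_M = e^{-j 2\pi / M}.
\begin{align}
n &= \left( N_2 N_3\, n_1 + N_1 N_3\, n_2 + N_1 N_2\, n_3 \right) \bmod N,
  \qquad k_i = k \bmod N_i, \\
X(k) &= \sum_{n_3=0}^{N_3-1} \sum_{n_2=0}^{N_2-1} \sum_{n_1=0}^{N_1-1}
  x(n)\, W_{N_1}^{n_1 k_1}\, W_{N_2}^{n_2 k_2}\, W_{N_3}^{n_3 k_3}.
\end{align}
```

No twiddle factors appear because each term of n k contributes W_N^{(N/N_i) n_i k} = W_{N_i}^{n_i k}, and the exponent of W_{N_i} only matters modulo N_i.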
C---------------PFA INDEXING LOOPS--------------
      DO 10 K = 1, M
         N1 = NI(K)
         N2 = N/N1
         I(1) = 1
         DO 20 J = 1, N2
            DO 30 L = 2, N1
               I(L) = I(L-1) + N2
               IF (I(L) .GT. N) I(L) = I(L) - N
   30       CONTINUE
            GOTO (20,102,103,104,105), N1
            I(1) = I(1) + N1
   20    CONTINUE
   10 CONTINUE
      RETURN
C----------------MODULE FOR N=2-----------------
  102 R1 = X(I(1))
      X(I(1)) = R1 + X(I(2))
      X(I(2)) = R1 - X(I(2))
      R1 = Y(I(1))
      Y(I(1)) = R1 + Y(I(2))
      Y(I(2)) = R1 - Y(I(2))
      GOTO 20
C----------------OTHER MODULES------------------
  103 Length-3 DFT
  104 Length-4 DFT
  105 Length-5 DFT
      etc.
Part of a FORTRAN PFA Program
As in the Cooley-Tukey program, the DO 10 loop steps through the M stages (factors of N) and the DO 20 loop calculates the N/N1 length-N1 DFT's. The input index map of [link] is implemented in the DO 30 loop and the statement just before label 20. In the PFA, each stage or factor requires a separately programmed module or butterfly. This lengthens the PFA program, but an efficient Cooley-Tukey program will also require three or more butterflies.
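The loop structure described above can be sketched in Python as follows. This is a minimal illustration, not the author's program: it uses zero-based indices, complex lists instead of separate X and Y arrays, and a direct O(n^2) short DFT as a stand-in for the hand-coded modules. The names `short_dft`, `stage_indices`, and `pfa_in_place` are hypothetical.

```python
import cmath


def short_dft(v):
    """Direct DFT of a short list; a stand-in for an optimized module."""
    n = len(v)
    return [sum(v[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]


def stage_indices(n, n1):
    """Index groups for one stage: N/N1 groups of N1 indices each."""
    n2 = n // n1
    groups = []
    start = 0                                   # plays the role of I(1)
    for _ in range(n2):                         # DO 20 loop
        # DO 30 loop: step by N2, wrapping modulo N
        groups.append([(start + l * n2) % n for l in range(n1)])
        start += n1                             # I(1) = I(1) + N1
    return groups


def pfa_in_place(x, factors):
    """In-place PFA sketch; factors must be pairwise relatively prime
    and multiply to len(x). The result is left in scrambled order."""
    n = len(x)
    for n1 in factors:                          # DO 10 loop: one stage per factor
        for idx in stage_indices(n, n1):
            out = short_dft([x[i] for i in idx])
            for i, val in zip(idx, out):        # write back to the same locations
                x[i] = val
    return x
```

For N = 15 and N1 = 3, `stage_indices(15, 3)` gives the groups [0, 5, 10], [3, 8, 13], [6, 11, 1], [9, 14, 4], [12, 2, 7], which are the indices produced by the FORTRAN loops shifted down by one.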
Because the PFA is calculated in-place using the input index map, the output is scrambled. There are five approaches to dealing with this scrambled output. First, there are some applications where the output does not have to be unscrambled, as in the case of high-speed convolution. Second, an unscrambler can be added after the PFA to give the output in correct order, just as the bit-reversed counter is used for the Cooley-Tukey FFT. A simple unscrambler is given in [link], [link], but it is not in place. The third method does the unscrambling in the modules while they are being calculated. This is probably the fastest method, but the program must be written for a specific length [link], [link]. A fourth method is similar and achieves the unscrambling by choosing the multiplier constants in the modules properly [link]. The fifth method uses a separate indexing method for the input and output of each module [link], [link].
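The second approach can be illustrated with a small out-of-place unscrambler. The sketch below is not the unscrambler from the cited references. It assumes zero-based indexing, an in-place PFA whose input map is n = sum over i of (N/N_i) n_i modulo N, and modules that compute ordinary short DFT's; under those assumptions the value X(k) ends up at the location obtained by substituting k mod N_i for n_i in that map. The names `scrambled_location` and `unscramble` are hypothetical.

```python
def scrambled_location(k, factors):
    """Location holding X(k) after the in-place PFA (assumptions above)."""
    n = 1
    for f in factors:
        n *= f
    # Substitute the residues k mod N_i into the input index map.
    return sum((n // f) * (k % f) for f in factors) % n


def unscramble(y, factors):
    """Return a new list in natural order; not in place."""
    return [y[scrambled_location(k, factors)] for k in range(len(y))]
```

Because the factors are relatively prime, the locations form a permutation of 0 to N-1, so every output is read exactly once. Like the simple unscrambler mentioned in the text, this one needs a second array of length N.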