Split-radix FFT algorithm

The split-radix FFT is a fast Fourier transform (FFT) algorithm for computing the discrete Fourier transform (DFT), and was first described in an initially little-appreciated paper by R. Yavne (1968) and subsequently rediscovered simultaneously by various authors in 1984. (The name "split radix" was coined by two of these reinventors, P. Duhamel and H. Hollmann.) In particular, split radix is a variant of the Cooley-Tukey FFT algorithm that uses a blend of radices 2 and 4: it recursively expresses a DFT of length N in terms of one smaller DFT of length N/2 and two smaller DFTs of length N/4.

The split-radix FFT, along with its variations, long had the distinction of achieving the lowest published arithmetic operation count (total exact number of required real additions and multiplications) to compute a DFT of power-of-two sizes N. The arithmetic count of the original split-radix algorithm was improved upon in 2004 (with the initial gains made in unpublished work by J. Van Buskirk via hand optimization for N=64 ), but it turns out that one can still achieve the new lowest count by a modification of split radix (Johnson and Frigo, 2007). Although the number of arithmetic operations is not the sole factor (or even necessarily the dominant factor) in determining the time required to compute a DFT on a computer, the question of the minimum possible count is of longstanding theoretical interest. (No tight lower bound on the operation count has currently been proven.)

The split-radix algorithm can only be applied when N is a multiple of 4, but since it breaks a DFT into smaller DFTs it can be combined with any other FFT algorithm as desired.

Split-radix decomposition

Recall that the DFT is defined by the formula:

 X_k =  \sum_{n=0}^{N-1} x_n \omega_N^{nk}

where k is an integer ranging from 0 to N-1 and \omega_N denotes the primitive root of unity:

\omega_N = e^{-\frac{2\pi i}{N}},

and thus :\omega_N^N = 1.

The split-radix algorithm works by expressing this summation in terms of three smaller summations. (Here, we give the "decimation in time" version of the split-radix FFT; the dual decimation in frequency version is essentially just the reverse of these steps.)

First, a summation over the even indices x_{2n_2}. Second, a summation over the odd indices broken into two pieces: x_{4n_4+1} and x_{4n_4+3}, according to whether the index is 1 or 3 modulo 4. Here, n_m denotes an index that runs from 0 to N/m-1. The resulting summations look like:

 X_k =  \sum_{n_2=0}^{N/2-1} x_{2n_2} \omega_{N/2}^{n_2 k} 
+ \omega_N^k \sum_{n_4=0}^{N/4-1} x_{4n_4+1} \omega_{N/4}^{n_4 k}
+ \omega_N^{3k} \sum_{n_4=0}^{N/4-1} x_{4n_4+3} \omega_{N/4}^{n_4 k}

where we have used the fact that \omega_N^{m n k} = \omega_{N/m}^{n k}. These three sums correspond to portions of radix-2 (size N/2) and radix-4 (size N/4) Cooley-Tukey steps, respectively. (The underlying idea is that the even-index subtransform of radix-2 has no multiplicative factor in front of it, so it should be left as-is, while the odd-index subtransform of radix-2 benefits by combining a second recursive subdivision.)

These smaller summations are now exactly DFTs of length N/2 and N/4, which can be performed recursively and then recombined.

More specifically, let U_k denote the result of the DFT of length N/2 (for k = 0,\ldots,N/2-1), and let Z_k and Z'_k denote the results of the DFTs of length N/4 (for k = 0,\ldots,N/4-1). Then the output X_k is simply:

X_k = U_k + \omega_N^k Z_k + \omega_N^{3k} Z'_k.

This, however, performs unnecessary calculations, since k \geq N/4 turn out to share many calculations with k < N/4. In particular, if we add N/4 to k, the size-N/4 DFTs are not changed (because they are periodic in k), while the size-N/2 DFT is unchanged if we add N/2 to k. So, the only things that change are the \omega_N^k and \omega_N^{3k} terms, known as twiddle factors. Here, we use the identities:

\omega_N^{k+N/4} = -i \omega_N^k
\omega_N^{3(k+N/4)} = i \omega_N^{3k}

to finally arrive at:

X_k = U_k + \left( \omega_N^k Z_k + \omega_N^{3k} Z'_k \right),
X_{k+N/2} = U_k - \left( \omega_N^k Z_k + \omega_N^{3k} Z'_k \right),
X_{k+N/4} = U_{k+N/4} - i \left( \omega_N^k Z_k - \omega_N^{3k} Z'_k \right),
X_{k+3N/4} = U_{k+N/4} + i \left( \omega_N^k Z_k - \omega_N^{3k} Z'_k \right),

which gives all of the outputs X_k if we let k range from 0 to N/4-1 in the above four expressions.

Notice that these expressions are arranged so that we need to combine the various DFT outputs by pairs of additions and subtractions, which are known as butterflies. In order to obtain the minimal operation count for this algorithm, one needs to take into account special cases for k = 0 (where the twiddle factors are unity) and for k = N/8 (where the twiddle factors are (1 \pm i)/\sqrt{2} and can be multiplied more quickly); see, e.g. Sorensen et al. (1986). Multiplications by \pm 1 and \pm i are ordinarily counted as free (all negations can be absorbed by converting additions into subtractions or vice versa).

This decomposition is performed recursively when N is a power of two. The base cases of the recursion are N=1, where the DFT is just a copy X_0 = x_0, and N=2, where the DFT is an addition X_0 = x_0 + x_1 and a subtraction X_1 = x_0 - x_1.

These considerations result in a count: 4 N \log_2 N - 6N + 8 real additions and multiplications, for N>1 a power of two. This count assumes that, for odd powers of 2, the leftover factor of 2 (after all the split-radix steps, which divide N by 4) is handled directly by the DFT definition (4 real additions and multiplications), or equivalently by a radix-2 Cooley–Tukey FFT step.

References

This article is issued from Wikipedia - version of the Sunday, September 13, 2015. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.