Introducción
Hay mucha información gratuita en Internet sobre DSP. Se apuntará en la sección de links
Algoritmos implementados en el DSP C67x
Para hacer la convolución FFT.
FFT
- DSPF_sp_bitrev_cplx
- DSPF_sp_cfftr4_dif:
(14*n/4 + 23)*log4(n) + 20
e.g., if n = 256, cycles = 3696.
- DSPF_sp_cfftr2_dit:
(14*n/4 + 23)*log4(n) + 20
e.g., if n = 256, cycles = 3696.
- DSPF_sp_fftSPxSP:
cycles = 3 * ceil(log4(N)-1) * N + 21 * ceil(log4(N)-1) + 2*N + 44
e.g., N = 1024, cycles = 14464
e.g., N = 512, cycles = 7296
e.g., N = 256, cycles = 2923
e.g., N = 128, cycles = 1515
e.g., N = 64, cycles = 598
- DSPF_sp_ifftSPxSP:
cycles = 3 * ceil(log4(N)-1) * N + 21*ceil(log4(N)-1) + 2*N + 44
e.g., N = 1024, cycles = 14464
e.g., N = 512, cycles = 7296
e.g., N = 256, cycles = 2923
e.g., N = 128, cycles = 1515
e.g., N = 64, cycles = 598
- DSPF_sp_icfft2_dif:
2*n*log2(n) + 37
e.g., IF n = 64, cycles = 805
e.g., IF n = 128, cycles = 1829
El coste para una señal de 256 taps es de 3600 ciclos. 48000 taps por segundo son 188 fragmentos de 256.
El paso al dominio de la frecuencia cuesta: 3600*188 = 677000 ciclos por segundo.
Multiplicaciones
Hay que hacer la multiplicación compleja entre el filtro en el dominio de la frecuencia y la señal, una vez pasada al ominio de la frecuencia.
- DSPF_sp_vecmul: se puede hacer el producto en el dominio complejo mediante esta técnica.
2*floor((n-1)/2) + 18
e.g., for n = 200, cycles = 216
El producto será aproximadamente el doble de la longitud: 48000 * 2 = 100000 ciclos
Coste de la recombinación de los trozos
NPI
Total
700000 ciclos por filtro siendo optimista.
Partitioned convolution
The central idea found in many of these improvements, is that the impulse response, that is the filter, is partitioned into several smaller parts. When each part is filtered with the input, the results delayed suitably and finally added together, one gets the same result as when processing the whole filter at once. As far as I know, the earliest user of this simple but powerful concept is T.G. Stockham [16], who published his results only one year after the famous Cooley and Tukey FFT paper [5]. The concept can be used to solve several problems. Stockham used it for saving memory, but in later work made in the eighties and early nineties, at the time when realtime DSP became feasible for the first time, it was stated that it can also be used to reduce quantisation erros, reduce I/O-delay, and adapt to optimal FFT lengths of a specific implementation. All these improvements are described by J.S. Soo and K.K. Pang [14], [15]. Other realtime partitioned convolution pioneers are B.D. Kulp [17], P.C.W. Sommen [12], [13] and J.M.P. Borrallo and M. G. Otero [4]. Their work is a good place to start reading for the one interested in getting a more detailed description of partitioned convolution. The convolution algorithm in BruteFIR is conceptually exactly the same as the one found in these papers.
Preguntas a contestar
- ¿Cuál es la diferencia entre los distintos algoritmos presentados?






