International Journal of VLSI design & Communication Systems (VLSICS) Vol.7, No.4, August 2016
DOI: 10.5121/vlsic.2016.7403
IMPLEMENTATION OF SDC - SDF
ARCHITECTURE FOR RADIX-4 FFT
G. Deeshma Venkatakanakadurga (1) and Dr. G.R.L.V.N. Srinivasaraju (2)
(1) PG Scholar, Dept of ECE, SVECW, Bhimavaram, AP, India
(2) Professor, Head of the Dept, Dept of ECE, SVECW, Bhimavaram, AP, India
ABSTRACT
Very large scale integration (VLSI) and digital signal processing (DSP) have been crucial technologies
over the last few decades. DSP applications require high-performance, low-area, and low-power VLSI
circuits. This paper discusses the FFT, which is one of the vital components of digital signal
processing. We propose a single-path delay commutator-feedback (SDC-SDF) architecture for the radix-4
FFT and present its simulation and synthesis results. The radix-4 FFT architecture consists of
log4 N - 1 SDC stages and 1 SDF stage, whereas the earlier radix-2 SDC-SDF FFT architecture includes
log2 N - 1 SDC stages and 1 SDF stage. The proposed radix-4 SDC-SDF architecture reduces the number of
multiplications and additions as well as the number of stages, which reduces area and power. The
architecture is simulated using Modelsim; design verification and synthesis are carried out using
Xilinx ISE. Compared with the radix-2 SDC-SDF FFT, the proposed architecture achieves lower area and
lower power consumption.
KEYWORDS
Radix-2 FFT, Radix-4 FFT, Single path delay commutator – feedback (SDC-SDF), Bit reverser.
1. INTRODUCTION
The Fast Fourier Transform (FFT) is one of the vital components in the field of digital signal
processing. It is an efficient algorithm for computing the discrete Fourier transform (DFT), which is
one of the most important operations in that field. The DFT, with a transform length N equal to a
power of 2, is usually implemented with the FFT. Hardware designers have long tried to develop
architectures for computing the FFT that meet the high-performance and real-time requirements of
modern applications. Pipelined hardware architectures provide the high throughput and low latency
needed for real-time operation, together with low area and power consumption.
The FFT is a vital component of orthogonal frequency division multiplexing (OFDM) systems [1].
OFDM has been adopted in a wide range of applications, from wired communication modems, such as
digital subscriber lines, to wireless communication modems, such as IEEE 802.11 Wi-Fi, IEEE 802.16
WiMAX, or 3GPP Long Term Evolution (LTE), to process baseband data.
Several FFT architectures have been implemented in previous work: the multi-path delay commutator
(MDC), the single-path delay feedback (SDF), and the single-path delay commutator (SDC). The MDC
architecture [3]-[6] is typically used to process multiple input data streams because of its high
throughput rate, but it is not suited to a single input data stream. MDC architectures also require
more hardware than the combined SDC-SDF architecture. The SDC-SDF (single-path delay
commutator-feedback) architecture reduces the memory size and can utilize the multipliers fully;
however, the utilization of the adders is still very low. The SDC architecture on its own is seldom
used to process a single input data stream, because it uses more memory resources than SDF and has
more complicated control.
The radix-2 FFT butterfly mainly performs two operations, an addition and a subtraction. The result
of the subtraction is then multiplied by a complex twiddle factor.
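The radix-2 butterfly described here can be sketched in a few lines of Python. This is only an illustration, assuming the decimation-in-frequency form (sum on one path, twiddled difference on the other); the function name radix2_dif_butterfly and the parameters k and n are hypothetical names chosen for the example and do not come from the paper.

```python
import cmath


def radix2_dif_butterfly(a, b, k, n):
    """One radix-2 decimation-in-frequency butterfly (illustrative sketch).

    a, b : complex inputs whose indices differ by n/2
    k, n : twiddle index and transform length, so the twiddle is W_n^k
    """
    twiddle = cmath.exp(-2j * cmath.pi * k / n)  # W_n^k = exp(-j*2*pi*k/n)
    upper = a + b                # addition path, no multiplication
    lower = (a - b) * twiddle    # subtraction path, then complex multiplication
    return upper, lower


# With k = 0 the twiddle factor is 1, so this is a plain 2-point DFT.
upper, lower = radix2_dif_butterfly(1 + 2j, 3 - 1j, k=0, n=8)
```

Only the subtraction path carries a multiplier, which is why multiplier utilization is a design concern in pipelined radix-2 architectures.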
Among FFT algorithms with a radix other than 2, one of the most important is radix-4. The radix-4
FFT can be used only when N is a power of 4. A higher radix reduces the computational complexity,
while the operation of the radix-4 FFT remains similar to that of the radix-2 FFT.
In the radix-4 FFT, the sequence is divided into 4 subsequences, each of which is again divided
into 4 subsequences, and so on. The radix-4 butterfly is based on the four-point DFT, so the radix-4
algorithm requires somewhat fewer multiplications than the radix-2 algorithm.
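To make the repeated division into 4 subsequences concrete, the following Python sketch implements a recursive radix-4 decimation-in-frequency FFT and a direct DFT for comparison. It is a software model under stated assumptions, not the hardware pipeline proposed in this paper: the names dft and fft_radix4 are hypothetical, and the decimation-in-frequency form is assumed.

```python
import cmath


def dft(x):
    """Direct O(N^2) DFT, used only as a reference."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]


def fft_radix4(x):
    """Recursive radix-4 DIF FFT; len(x) must be a power of 4."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 4 != 0:
        raise ValueError("length must be a power of 4")
    q = n // 4
    y = [[0j] * q for _ in range(4)]
    for m in range(q):
        a, b, c, d = x[m], x[m + q], x[m + 2 * q], x[m + 3 * q]
        w = cmath.exp(-2j * cmath.pi * m / n)       # W_N^m
        # Four-point DFT (multiplications by +-1 and +-j only), then twiddles.
        y[0][m] = a + b + c + d
        y[1][m] = (a - 1j * b - c + 1j * d) * w
        y[2][m] = (a - b + c - d) * w ** 2
        y[3][m] = (a + 1j * b - c - 1j * d) * w ** 3
    out = [0j] * n
    for r in range(4):
        sub = fft_radix4(y[r])                      # each quarter is split again
        for k in range(q):
            out[4 * k + r] = sub[k]                 # X[4k + r]
    return out
```

For a 16-point input the recursion has two levels, compared with four levels for a radix-2 decomposition of the same length.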
In this paper, we propose an efficient combined SDC-SDF (single-path delay commutator-feedback)
radix-4 FFT architecture, which contains log4 N - 1 SDC stages, 1 SDF stage, and 1 bit reverser.
This architecture produces the output sequence in the same order as the input [19].
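The stage counts quoted above (log4 N - 1 SDC stages for radix-4, log2 N - 1 for radix-2) can be tabulated with a small helper. The function name sdc_stage_count is hypothetical and the sketch only evaluates the formula; it says nothing about the internal structure of a stage.

```python
def sdc_stage_count(n, radix):
    """Number of SDC stages, log_radix(n) - 1, for an n-point FFT."""
    if n < radix:
        raise ValueError("n must be at least one full radix")
    stages = 0
    while n > 1:
        if n % radix != 0:
            raise ValueError("n must be a power of the radix")
        n //= radix
        stages += 1
    return stages - 1  # the final stage is the SDF stage, not an SDC stage


for n in (16, 64, 256, 1024):
    print(n, sdc_stage_count(n, 2), sdc_stage_count(n, 4))
```

For N = 16 this gives 3 SDC stages for radix-2 and 1 SDC stage for radix-4.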
2. THE COMBINED SDC-SDF RADIX-2 FFT
The existing single-path delay commutator-feedback (SDC-SDF) radix-2 FFT architecture contains 1
pre-stage, log2 N - 1 SDC stages, 1 post-stage, 1 SDF stage, and 1 bit reverser, as shown in Figure
1(a) [1]. The pre-stage converts the complex input data into a new sequence in which each real part
is followed by the corresponding imaginary part.
Each SDC stage contains an SDC processing engine (PE), which achieves 100% utilization of the
arithmetic resources, both complex adders and complex multipliers. The SDC PE, shown in Figure 1(b),
contains a real add/sub unit, a data commutator, and an optimum complex multiplier unit. In stage t,
the data commutator reorders its input data to generate a new data sequence whose index difference is
N/2^t, where t is the index of the SDC stage. The output of the data commutator is the input to the
real add/sub unit, which consists of one adder and one subtracter. Both operations are performed for
each input data item.
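The index difference N/2^t can be illustrated by listing which sample indices meet at the add/sub unit in stage t. The sketch below assumes the standard decimation-in-frequency pairing inside blocks of 2 * N/2^t samples; the exact commutator schedule of the architecture in [1] is not given in this paper, and the name commutator_pairs is hypothetical.

```python
def commutator_pairs(n, t):
    """Index pairs (i, i + n / 2**t) combined in SDC stage t of an n-point FFT."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of 2")
    diff = n >> t                      # index difference N / 2^t
    if t < 1 or diff < 1:
        raise ValueError("stage index t out of range")
    block = 2 * diff                   # pairs are formed inside blocks of this size
    return [(base + i, base + i + diff)
            for base in range(0, n, block)
            for i in range(diff)]


print(commutator_pairs(8, 1))  # [(0, 4), (1, 5), (2, 6), (3, 7)]
print(commutator_pairs(8, 2))  # [(0, 2), (1, 3), (4, 6), (5, 7)]
```

Each later stage halves the distance between paired samples, which is why the delay memory shrinks from stage to stage.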
The optimum complex multiplier unit of the radix-2 SDC PE in Figure 1(b) contains 2 multiplexers, 2
multipliers, 1 real adder, and 1.5 words of memory. The signal s controls the real adder, selecting
between its addition and subtraction operations.
Figure 1(a). The combined SDC-SDF architecture for Radix-2 FFT
Figure 1(b). The SDC PE for Radix-2 FFT
The post-stage converts the new sequence back to the complex format. The last stage is the
single-path delay feedback (SDF) stage, which is similar to the radix-2 butterfly and requires a
complex adder and a complex subtracter. With the modified addressing method, the bit reverser
requires a data buffer of only N/2 and delivers the data in normal order.
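The reordering performed by a bit reverser can be modelled as a permutation of indices. The sketch below shows only that permutation; it does not model the modified addressing method or the N/2 buffer mentioned above, and the names bit_reverse and to_normal_order are hypothetical.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def to_normal_order(data):
    """Reorder bit-reversed FFT output into normal order."""
    n = len(data)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of 2")
    bits = n.bit_length() - 1
    return [data[bit_reverse(i, bits)] for i in range(n)]


print([bit_reverse(i, 3) for i in range(8)])  # [0, 4, 2, 6, 1, 5, 3, 7]
```

Bit reversal is its own inverse, so applying to_normal_order twice returns the original list.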
3. PROPOSED SYSTEM
The main advantage of the proposed SDC-SDF (single-path delay commutator-feedback) radix-4 FFT
architecture is that inputs are applied through a single path and outputs are obtained through a
single path. The proposed single-path delay commutator processing engine requires fewer complex
multipliers and adders than the existing SDC-SDF radix-2 FFT architecture.
The proposed SDC-SDF radix-4 FFT architecture requires 1 pre-stage, log4 N - 1 SDC stages, 1
post-stage, 1 SDF stage, and 1 bit reverser, as shown in Figure 2(a).
This architecture is based on the radix-4 butterfly operation, in which 4 operations are performed
at the same time. The pre-stage separates the complex input data into real parts followed by the
imaginary parts. For example, if the input data arrive in the form 0_r, 0_i, 1_r, 1_i, and so on,
the pre-stage outputs 0_r, 1_r, 2_r, 3_r in the 1st cycle and 0_i, 1_i, 2_i, 3_i in the 2nd cycle.
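The pre-stage reordering in this example, and the inverse reordering done by the post-stage, can be sketched as list operations. This is a behavioural illustration only, assuming groups of 4 samples as in the example; the names pre_stage, post_stage, and the group parameter are hypothetical.

```python
def pre_stage(samples, group=4):
    """Emit the real parts of each group of samples, then their imaginary parts."""
    if len(samples) % group != 0:
        raise ValueError("number of samples must be a multiple of the group size")
    out = []
    for start in range(0, len(samples), group):
        chunk = samples[start:start + group]
        out.extend(z.real for z in chunk)   # e.g. 0_r, 1_r, 2_r, 3_r
        out.extend(z.imag for z in chunk)   # e.g. 0_i, 1_i, 2_i, 3_i
    return out


def post_stage(stream, group=4):
    """Inverse of pre_stage: rebuild complex samples from the split stream."""
    if len(stream) % (2 * group) != 0:
        raise ValueError("stream length must be a multiple of twice the group size")
    out = []
    for start in range(0, len(stream), 2 * group):
        reals = stream[start:start + group]
        imags = stream[start + group:start + 2 * group]
        out.extend(complex(r, i) for r, i in zip(reals, imags))
    return out


samples = [complex(i, 10 + i) for i in range(4)]
print(pre_stage(samples))  # [0.0, 1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 13.0]
```

Splitting the stream this way lets the following stages operate on real-valued words rather than complex pairs.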
Next, the output of the pre-stage is the input to the SDC stages. The number of single-path delay
commutator stages depends on N: the proposed architecture consists of log4 N - 1, or equivalently
(1/2) log2 N - 1, SDC stages. The single-path delay commutator processing engine consists of a data
commutator, a radix-4 butterfly, and complex multipliers 1 and 2, as shown in Figure 2(b). The data
commutator shuffles the real input data into a new data sequence whose index differences are 3N/4,
N/2, and N/4. The new data sequence is multiplied by complex multipliers 1 before it enters the
radix-4 butterfly. Here the value of k varies from 0 to 3.
The data commutator operates over 4 cycles: k = 0 in the first cycle, k = 1 in the second, k = 2 in
the third, and k = 3 in the fourth. Depending on the value of k, the outputs of the data commutator
are multiplied by complex multipliers 1.
The next block is the radix-4 butterfly, which receives its data from complex multipliers 1. The
main advantage of the radix-4 butterfly is that it performs 4 operations at the same time.
Internally it consists of adders and subtractors: it takes 4 inputs, performs additions and
subtractions among them, and generates 4 outputs. The outputs of the radix-4 butterfly are multiplied
by complex multipliers 2, and this multiplication also depends on the value of k. Finally we obtain 4
outputs: the real output, complex output 1, complex output 2, and complex output 3.
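The statement that the radix-4 butterfly consists only of adders and subtractors can be checked with a short sketch of the four-point DFT. Multiplication by -j is done by swapping the real and imaginary parts and negating one of them, so no multiplier is needed. The function name radix4_butterfly is hypothetical, and the sketch covers the butterfly alone, without complex multipliers 1 and 2.

```python
def radix4_butterfly(a, b, c, d):
    """Four-point DFT of complex inputs using additions, subtractions and swaps."""
    s0 = a + c
    s1 = a - c
    s2 = b + d
    s3 = b - d
    # -j * s3 without a multiplier: (x + jy) * (-j) = y - jx
    minus_j_s3 = complex(s3.imag, -s3.real)
    x0 = s0 + s2          # a + b + c + d
    x1 = s1 + minus_j_s3  # a - jb - c + jd
    x2 = s0 - s2          # a - b + c - d
    x3 = s1 - minus_j_s3  # a + jb - c - jd
    return x0, x1, x2, x3


outputs = radix4_butterfly(1 + 2j, 3 - 1j, -2 + 0.5j, 4j)
```

The butterfly needs 8 complex additions or subtractions in total and no general multiplications, which is the source of the multiplier savings relative to two cascaded radix-2 stages.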
Figure 2(a). Proposed Architecture for combined SDC-SDF Radix-4 FFT
Figure 2(b). The single path delay commutator processing engine for Radix-4 FFT
The process continues by applying the remaining inputs to SDC1 and so on. Once the above process
has been carried through all log4 N - 1 single-path delay commutator stages, the major part of the
radix-4 FFT operation is complete.
The output of the SDC stages is the input to the post-stage, which performs exactly the opposite
operation to the pre-stage: it converts the new sequence back to complex data. The next stage is the
SDF stage, which takes its input from the post-stage. The single-path delay feedback stage consists
of a radix-4 butterfly and three N/4 delay elements. The advantage of the single-path delay feedback
stage is that it changes the data sequence so that the delay memory required by the bit reverser is
reduced to N/4. This combined SDC-SDF architecture produces the output in normal order, the same
order as the input.
4. RESULTS AND COMPARISON
The combined SDC-SDF (single-path delay commutator-feedback) architecture for the radix-4 FFT was
designed in the Verilog hardware description language (Verilog HDL).
The simulation results were obtained using Modelsim 6.3c, and synthesis performance was estimated
using Xilinx ISE 14.1.
Figure 3(a). Simulation Waveform of Radix-4 SDC-SDF FFT
In Figure 3(a), the complex input consists of a real part, in_real, and an imaginary part, in_imag.
Here 16 complex inputs, each 32 bits wide, are applied through a single path, together with clk,
control, and a 3-bit twiddle signal. The 4-bit signal s represents the number of inputs.
Figure 3(b). Simulation Waveform (continued) of Radix-4 SDC-SDF FFT
In Figure 3(b), the complex output consists of a real part, out_real, and an imaginary part,
out_imag. After the 16 complex inputs have been received, the 32-bit outputs out_real and out_imag
are produced through a single path.
Fig. 4(a) RTL Schematic of Radix-4 SDC-SDF FFT
The RTL schematic in Fig. 4(a) shows the input and output signals. The inputs are clk, in_real (32
bits), in_imag (32 bits), control (5 bits), twiddle (3 bits), and s. The outputs are out_real (32
bits) and out_imag (32 bits).
Fig. 4(b): RTL Schematic detailed view of Radix-4 SDC-SDF FFT
The detailed RTL schematic view of the SDC-SDF radix-4 FFT in Fig. 4(b) shows 5 blocks: 1
pre-stage, 1 SDC stage, 1 post-stage, 1 SDF stage, and 1 bit reverser. The design was verified
through this RTL schematic view.
Table 1: Design Summary of Single path delay commutator-feedback Radix-4 FFT

S.No.  Parameter                         Value
1.     No. of slice registers (in %)     2
2.     No. of slice LUTs (in %)          61
3.     No. of DSP48E1s (in %)            9
4.     Min. clock period                 39.473 ns
5.     Frequency                         25.334 MHz
6.     On-chip logic                     0.007
7.     Dynamic power                     0.047 W
8.     Quiescent power                   0.073 W
9.     Total power                       0.120 W
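Two rows of Table 1 follow arithmetically from other rows: the frequency is the reciprocal of the minimum clock period, and the total power is the sum of the dynamic and quiescent power. The short check below uses only values from the table; the variable names are illustrative.

```python
clock_period_ns = 39.473
# 1 / (period in ns) is a frequency in GHz; multiply by 1000 for MHz.
max_frequency_mhz = 1000.0 / clock_period_ns

dynamic_power_w = 0.047
quiescent_power_w = 0.073
total_power_w = dynamic_power_w + quiescent_power_w

print(round(max_frequency_mhz, 3))  # 25.334
print(round(total_power_w, 3))      # 0.12
```

Both results agree with the reported 25.334 MHz and 0.120 W.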
The single-path delay commutator-feedback radix-4 FFT is compared with the SDC-SDF radix-2 FFT on
several parameters: dynamic power, quiescent power, number of slice registers, number of slice LUTs,
and number of DSP48E1s. Both implementations give the same outputs, but the radix-4 design consumes
less power and area than the SDC-SDF radix-2 FFT.
Table 2: Comparison between Single path delay commutator-feedback Radix-4 FFT and Radix-2 FFT

S.No.  Parameter                         SDC-SDF Radix-2 FFT    SDC-SDF Radix-4 FFT
1.     No. of slice registers (in %)     3                      2
2.     No. of slice LUTs (in %)          72                     61
3.     No. of DSP48E1s (in %)            25                     9
4.     Dynamic power                     0.048 W                0.047 W
5.     Quiescent power                   0.073 W                0.073 W
6.     Total power                       0.122 W                0.120 W
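The size of each improvement in Table 2 can be expressed as a relative reduction with respect to the radix-2 figure. The helper below is a hypothetical illustration that uses only the table values; note that the resource rows are device-utilization percentages, so the result is a reduction of that percentage, not an absolute count.

```python
def relative_reduction(radix2_value, radix4_value):
    """Reduction of the radix-4 figure relative to the radix-2 figure, in percent."""
    return 100.0 * (radix2_value - radix4_value) / radix2_value


table2 = {
    "slice LUTs (%)": (72, 61),
    "DSP48E1s (%)": (25, 9),
    "total power (W)": (0.122, 0.120),
}
for name, (radix2, radix4) in table2.items():
    print(f"{name}: {relative_reduction(radix2, radix4):.1f}% lower")
```

On these figures the DSP48E1 utilization drops by 64%, while the total power changes by less than 2%.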
Fig.5 Comparison of Number of slice LUTs of SDC-SDF of Radix-4 and Radix-2 FFT
Fig. 5 shows that the single-path delay commutator-feedback (SDC-SDF) Radix-4 FFT uses fewer slice
LUTs than the SDC-SDF Radix-2 FFT (61% versus 72%), that is, it occupies less logic area. We take
this to indicate that the output of the SDC-SDF Radix-4 FFT is obtained at least as fast as that of
the SDC-SDF Radix-2 FFT.
Fig.6 Comparison of Number of DSP48E1s of SDC-SDF of Radix-4 and Radix-2 FFT
Fig. 6 shows that the single-path delay commutator-feedback (SDC-SDF) Radix-4 FFT uses fewer
DSP48E1s than the SDC-SDF Radix-2 FFT (9% versus 25%). We take this to indicate that the output of
the SDC-SDF Radix-4 FFT is obtained at least as fast as that of the SDC-SDF Radix-2 FFT.
Fig.7 Comparison of Dynamic Power of SDC-SDF of Radix-4 and Radix-2 FFT
Fig. 7 shows that the dynamic power of the single-path delay commutator-feedback Radix-4 FFT is
lower than that of the SDC-SDF Radix-2 FFT (0.047 W versus 0.048 W), which indicates improved
performance of the SDC-SDF Radix-4 FFT architecture.
Fig.8 Comparison of Total Power of SDC-SDF of Radix-4 and Radix-2 FFT
Fig. 8 shows that the total power of the single-path delay commutator-feedback (SDC-SDF) Radix-4
FFT is lower than that of the SDC-SDF Radix-2 FFT (0.120 W versus 0.122 W), which again indicates
improved performance of the SDC-SDF Radix-4 FFT architecture.
5. CONCLUSION
The proposed SDC-SDF (single-path delay commutator-feedback) radix-4 FFT architecture produces the
output data in the same order as the input. It reduces the number of complex multiplications as well
as the number of stages compared with the radix-2 FFT architecture. The architecture was simulated
using Modelsim, and the design verification, area, and power reports were generated using Xilinx ISE
14.1. The proposed architecture achieves reduced area and low power consumption.
REFERENCES
[1] Z. Wang, X. Liu, B. He, and F. Yu, “A Combined SDC-SDF Architecture for Normal I/O Pipelined
Radix-2 FFT”, IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 23, no. 5, pp. 973-977, May
2015.
[2] L. J. Cimini, “Analysis and simulation of a digital mobile channel using orthogonal frequency
division multiplexing”, IEEE Trans. Commun., vol. 33, no. 7, pp. 665-675, Jul. 1985.
[3] C. Cheng and K. K. Parhi, “High throughput VLSI architecture for FFT computation”, IEEE Trans.
Circuits Syst. II, Exp. Briefs, vol. 54, no. 10, pp. 339-344, Oct. 2007.
[4] S. N. Tang, J. W. Tsai, and T. Y. Chang, “A 2.4-GS/s FFT processor for OFDM-based WPAN
applications”, IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 57, no. 6, pp. 451-455, Jun. 2010.
[5] Y. Jung, H. Yoon, and J. Kim, “New efficient FFT algorithm and pipeline implementation results
for OFDM/DMT applications”, IEEE Trans. Consum. Electron., vol. 49, no. 1, pp. 14-20, Feb. 2003.
[6] M. Shin and H. Lee, “A high-speed, four-parallel radix-2^4 FFT Processor for UWB applications”,
in Proc. IEEE ISCAS, May 2008, pp. 960-963.
[7] E. H. Wold and A. M. Despain, “Pipeline and Parallel-Pipeline FFT processors for VLSI
implementation”, IEEE Trans. Comput., vol. C-33, no. 5, pp. 414-426, May 1984.
[8] Y. N. Chang, “An efficient VLSI architecture for normal I/O order pipeline FFT design”, IEEE
Trans. Circuits Syst. II, Exp. Briefs, vol. 55, no. 12, pp. 1234-1238, Dec. 2008.
[9] X. Liu, F. Yu, and Z. K. Wang, “A pipelined architecture for normal I/O order FFT”, J. Zhejiang
Univ. Sci. C, vol. 12, no. 1, pp. 76-82, Jan. 2011.
[10] T. Sansaloni, A. Perez-Pascual, V. Torres, and J. Valls, “Efficient pipeline FFT processors for
WLAN MIMO-OFDM systems”, Electron. Lett., vol. 41, no. 19, pp. 1043-1044, Sep. 2005.
[11] J. Y. Oh and M. S. Lim, “Area and power efficient pipeline FFT algorithm”, in Proc. IEEE
Workshop Signal Process. Syst. Design and Implementation, Nov. 2005, pp. 520-525.
[12] T. Cho, S. Tsai, and H. Lee, “A high-speed low-complexity modified radix-2^5 FFT processor for
high rate WPAN applications”, IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 21, no. 1, pp.
187-191, Jan. 2013.
[13] A. Cortes, I. Velez, and J. F. Sevillano, “Radix r^k FFTs: Matricial representation and SDC/SDF
pipeline implementation”, IEEE Trans. Signal Process., vol. 57, pp. 2824-2839, Jul. 2009.
[14] M. Garrido, J. Grajal, M. Sanchez, and O. Gustafsson, “Pipelined radix-2^k feedforward FFT
architectures”, IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 21, no. 1, pp. 23-32, Jan.
2013.
[15] L. Yang, K. Zhang, H. Liu, J. Huang, and S. Huang, “An efficient locally pipelined FFT
processor”, IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 53, no. 7, pp. 585-589, Jul. 2006.
[16] T. Lenart and V. Owall, “Architectures for dynamic data scaling in 2/4/8k pipeline FFT cores”,
IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 14, no. 11, pp. 1286-1290, Nov. 2006.
[17] M. Ayinala, M. Brown, and K. K. Parhi, “Pipelined parallel FFT architectures via folding
transformation”, IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 20, no. 6, pp. 1068-1081,
Jun. 2012.
[18] G. Bi and E. V. Jones, “A Pipelined FFT processor for word-sequential data”, IEEE Trans.
Acoust., Speech, Signal Process., vol. 37, no. 12, pp. 1982-1985, Dec. 1989.
[19] B. Gold and C. M. Rader, Digital Processing of Signals. New York, NY, USA: McGraw-Hill, 1969,
ch. 6.

More Related Content

PDF
Review on low power high speed 32 point cyclotomic parallel FFT Processor
PDF
FPGA Implementation of Mixed Radix CORDIC FFT
PDF
Design and Power Measurement of 2 And 8 Point FFT Using Radix-2 Algorithm for...
PDF
Modified Distributive Arithmetic Based DWT-IDWT Processor Design and FPGA Imp...
PDF
A Novel VLSI Architecture for FFT Utilizing Proposed 4:2 & 7:2 Compressor
PDF
Implementation of High Throughput Radix-16 FFT Processor
PDF
B1030610
PDF
Gn3311521155
Review on low power high speed 32 point cyclotomic parallel FFT Processor
FPGA Implementation of Mixed Radix CORDIC FFT
Design and Power Measurement of 2 And 8 Point FFT Using Radix-2 Algorithm for...
Modified Distributive Arithmetic Based DWT-IDWT Processor Design and FPGA Imp...
A Novel VLSI Architecture for FFT Utilizing Proposed 4:2 & 7:2 Compressor
Implementation of High Throughput Radix-16 FFT Processor
B1030610
Gn3311521155

What's hot (16)

PDF
Aw25293296
PPTX
Instruction Level Parallelism Compiler optimization Techniques Anna Universit...
PDF
Instruction Level Parallelism (ILP) Limitations
PPT
PDF
Area and Speed Efficient Reversible Fused Radix-2 FFT Unit using 4:3 Compressor
PDF
Computer organiztion4
PPTX
Addressing Modes
PDF
Cs8591 qb
PPT
Addressing modes of 8051
PDF
Different addressing mode and risc, cisc microprocessor
DOCX
PDF
Design of Processing Element (PE3) for Implementing Pipeline FFT Processor
PDF
Chapter 3 instruction level parallelism and its exploitation
PPT
Synthesis
PPT
Addressing modes
Aw25293296
Instruction Level Parallelism Compiler optimization Techniques Anna Universit...
Instruction Level Parallelism (ILP) Limitations
Area and Speed Efficient Reversible Fused Radix-2 FFT Unit using 4:3 Compressor
Computer organiztion4
Addressing Modes
Cs8591 qb
Addressing modes of 8051
Different addressing mode and risc, cisc microprocessor
Design of Processing Element (PE3) for Implementing Pipeline FFT Processor
Chapter 3 instruction level parallelism and its exploitation
Synthesis
Addressing modes
Ad

Viewers also liked (19)

PDF
A novel low power high dynamic threshold swing limited repeater insertion for...
PDF
Design approach for fault
PDF
MODIFIED MICROPIPLINE ARCHITECTURE FOR SYNTHESIZABLE ASYNCHRONOUS FIR FILTER ...
PDF
LOW POWER, LOW NOISE AMPLIFIERS DESIGN AND ANALYSIS FOR RF RECEIVER FRONT END...
PDF
Design of ultra low power 8 channel analog multiplexer using dynamic threshol...
PDF
Transistor level implementation of digital reversible circuits
PDF
Architecture of a novel configurable
PDF
Power Optimized Datapath Units of Hybrid Embedded Core Architecture Using Clo...
PDF
An integrated approach for designing and testing specific processors
PDF
DESIGN AND IMPLEMENTATION OF 10 BIT, 2MS/s SPLIT SAR ADC USING 0.18um CMOS TE...
PDF
A novel architecture of rns based
PDF
A new efficient fpga design of residue to-binary converter
PDF
DUALISTIC THRESHOLD BASED MIN-MAX METHOD FOR VOICE SIGNAL ENHANCEMENT
PDF
SCHOTTKY TUNNELING SOURCE IMPACT IONIZATION MOSFET (STS-IMOS) WITH ENHANCED D...
PDF
DESIGN AND IMPLEMENTATION OF AN IMPROVED CARRY INCREMENT ADDER
PDF
A novel handover algorithm for lte
PDF
Static power optimization using dual sub threshold supply voltages in digital...
PDF
A low power cmos analog circuit design for acquiring multichannel eeg signals
PDF
EVALUATION OF ATM FUNCTIONING USING VHDL AND FPGA
A novel low power high dynamic threshold swing limited repeater insertion for...
Design approach for fault
MODIFIED MICROPIPLINE ARCHITECTURE FOR SYNTHESIZABLE ASYNCHRONOUS FIR FILTER ...
LOW POWER, LOW NOISE AMPLIFIERS DESIGN AND ANALYSIS FOR RF RECEIVER FRONT END...
Design of ultra low power 8 channel analog multiplexer using dynamic threshol...
Transistor level implementation of digital reversible circuits
Architecture of a novel configurable
Power Optimized Datapath Units of Hybrid Embedded Core Architecture Using Clo...
An integrated approach for designing and testing specific processors
DESIGN AND IMPLEMENTATION OF 10 BIT, 2MS/s SPLIT SAR ADC USING 0.18um CMOS TE...
A novel architecture of rns based
A new efficient fpga design of residue to-binary converter
DUALISTIC THRESHOLD BASED MIN-MAX METHOD FOR VOICE SIGNAL ENHANCEMENT
SCHOTTKY TUNNELING SOURCE IMPACT IONIZATION MOSFET (STS-IMOS) WITH ENHANCED D...
DESIGN AND IMPLEMENTATION OF AN IMPROVED CARRY INCREMENT ADDER
A novel handover algorithm for lte
Static power optimization using dual sub threshold supply voltages in digital...
A low power cmos analog circuit design for acquiring multichannel eeg signals
EVALUATION OF ATM FUNCTIONING USING VHDL AND FPGA
Ad

Similar to IMPLEMENTATION OF SDC - SDF ARCHITECTURE FOR RADIX-4 FFT (20)

PDF
Design Radix-4 64-Point Pipeline FFT/IFFT Processor for Wireless Application
PDF
G010233540
PDF
PDF
High Speed Area Efficient 8-point FFT using Vedic Multiplier
PDF
Design of Scalable FFT architecture for Advanced Wireless Communication Stand...
PDF
HIGH PERFORMANCE SPLIT RADIX FFT
PDF
Iaetsd fpga implementation of cordic algorithm for pipelined fft realization and
PDF
Implementation Of Grigoryan FFT For Its Performance Case Study Over Cooley-Tu...
PDF
Iaetsd pipelined parallel fft architecture through folding transformation
PDF
IEEE_Peer_Reviewed_Paper_1
PDF
PERFORMANCE EVALUATIONS OF GRIORYAN FFT AND COOLEY-TUKEY FFT ONTO XILINX VIRT...
PDF
Performance evaluations of grioryan fft and cooley tukey fft onto xilinx virt...
PDF
IRJET- VLSI Architecture for Reversible Radix-2 FFT Algorithm using Programma...
DOCX
s.Magesh kumar DECE,BTECH,ME (ASAN MEMORIAL COLLEGE OF ENGINEERING AND TECHNO...
DOCX
A combined sdc
PDF
J0166875
PDF
Iaetsd finger print recognition by cordic algorithm and pipelined fft
PDF
IRJET- Low Complexity Pipelined FFT Design for High Throughput and Low Densit...
PPTX
PDF
IRJET - Design and Implementation of FFT using Compressor with XOR Gate Topology
Design Radix-4 64-Point Pipeline FFT/IFFT Processor for Wireless Application
G010233540
High Speed Area Efficient 8-point FFT using Vedic Multiplier
Design of Scalable FFT architecture for Advanced Wireless Communication Stand...
HIGH PERFORMANCE SPLIT RADIX FFT
Iaetsd fpga implementation of cordic algorithm for pipelined fft realization and
Implementation Of Grigoryan FFT For Its Performance Case Study Over Cooley-Tu...
Iaetsd pipelined parallel fft architecture through folding transformation
IEEE_Peer_Reviewed_Paper_1
PERFORMANCE EVALUATIONS OF GRIORYAN FFT AND COOLEY-TUKEY FFT ONTO XILINX VIRT...
Performance evaluations of grioryan fft and cooley tukey fft onto xilinx virt...
IRJET- VLSI Architecture for Reversible Radix-2 FFT Algorithm using Programma...
s.Magesh kumar DECE,BTECH,ME (ASAN MEMORIAL COLLEGE OF ENGINEERING AND TECHNO...
A combined sdc
J0166875
Iaetsd finger print recognition by cordic algorithm and pipelined fft
IRJET- Low Complexity Pipelined FFT Design for High Throughput and Low Densit...
IRJET - Design and Implementation of FFT using Compressor with XOR Gate Topology

Recently uploaded (20)

PDF
NewMind AI Weekly Chronicles - August'25 Week I
PDF
Profit Center Accounting in SAP S/4HANA, S4F28 Col11
PDF
Per capita expenditure prediction using model stacking based on satellite ima...
PDF
Spectral efficient network and resource selection model in 5G networks
PDF
Reach Out and Touch Someone: Haptics and Empathic Computing
PPTX
Big Data Technologies - Introduction.pptx
PDF
The Rise and Fall of 3GPP – Time for a Sabbatical?
PDF
Unlocking AI with Model Context Protocol (MCP)
PDF
Network Security Unit 5.pdf for BCA BBA.
PDF
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
PDF
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
PPTX
MYSQL Presentation for SQL database connectivity
PPTX
sap open course for s4hana steps from ECC to s4
PPTX
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
PPTX
20250228 LYD VKU AI Blended-Learning.pptx
PPTX
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
PDF
Empathic Computing: Creating Shared Understanding
PPTX
ACSFv1EN-58255 AWS Academy Cloud Security Foundations.pptx
PPT
Teaching material agriculture food technology
PPTX
Cloud computing and distributed systems.
NewMind AI Weekly Chronicles - August'25 Week I
Profit Center Accounting in SAP S/4HANA, S4F28 Col11
Per capita expenditure prediction using model stacking based on satellite ima...
Spectral efficient network and resource selection model in 5G networks
Reach Out and Touch Someone: Haptics and Empathic Computing
Big Data Technologies - Introduction.pptx
The Rise and Fall of 3GPP – Time for a Sabbatical?
Unlocking AI with Model Context Protocol (MCP)
Network Security Unit 5.pdf for BCA BBA.
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
MYSQL Presentation for SQL database connectivity
sap open course for s4hana steps from ECC to s4
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
20250228 LYD VKU AI Blended-Learning.pptx
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
Empathic Computing: Creating Shared Understanding
ACSFv1EN-58255 AWS Academy Cloud Security Foundations.pptx
Teaching material agriculture food technology
Cloud computing and distributed systems.

IMPLEMENTATION OF SDC - SDF ARCHITECTURE FOR RADIX-4 FFT

  • 1. International Journal of VLSI design & Communication Systems (VLSICS) Vol.7, No.4, August 2016 DOI : 10.5121/vlsic.2016.7403 29 IMPLEMENTATION OF SDC - SDF ARCHITECTURE FOR RADIX-4 FFT G. Deeshma Venkatakanakadurga1 and Dr.G.R.L.V.N. Srinivasaraju2 1 PG Scholar, Dept of ECE, SVECW, Bhimavaram, AP,India 2 Professor, Head of the Dept, Dept of ECE, SVECW, Bhimavaram, AP,India ABSTRACT Very large scale integration and Digital signal processing are the very crucial technologies from the last few decades. DSP applications require high performance, low area and low power VLSI circuits. This paper is discussing about FFT which is one of the vital component in the digital signal processing. In this Paper, we propose a single path delay commutator–feedback (SDC-SDF) Architecture for Radix-4 FFT and presented its simulation and synthesis results. The Radix-4 FFT architecture consists of log4 N-1 SDC Stages and 1 SDF stage. Previously, the radix-2 SDC-SDF (Single path delay commutator-feedback) FFT architecture was includes log2 N-1 SDC Stages and 1 SDF stage. The proposed Radix-4 SDC-SDF architecture reduces the number of multiplications and additions as well as number of stages which achieves reduced area and low power. The resultant architecture is simulated using Modelsim, design verification and synthesis results are done using Xilinx ISE. The proposed architecture is compared with Radix-2 SDC-SDF FFT and it can achieve less area as well as low power consumption. KEYWORDS Radix-2 FFT, Radix-4 FFT, Single path delay commutator – feedback (SDC-SDF), Bit reverser. 1. INTRODUCTION The Fast Fourier Transform (FFT) is one of the vital components in the field of digital signal processing. It is very helpful to calculate the discrete Fourier transform (DFT) accurately. DFT is one of the important operations in the field of digital signal processing. The DFT, with a transform length N equal to a power of 2, is usually implemented with the fast Fourier transform. 
Hardware designers are always tried to develop good architectures for the computation of the FFT to get high performance and real-time requirements of modern applications. Pipelined hardware architectures provide high throughputs and low latencies suitable for real time, as well as a low area and power consumption. Fast Fourier Transform (FFT) is the vital component in orthogonal frequency division multiplexing (OFDM) systems [1]. OFDM has been adopted in a wide range of applications from wired- communication modems, such as digital subscriber lines , to wireless communication modems, such as IEEE802.11 Wi-Fi, IEEE802.16 Wi-Max or 3GPP long term evolution(LTE), to process baseband data.
Several pipelined FFT architectures have been implemented in earlier work: the multi-path delay commutator (MDC), the single-path delay feedback (SDF), and the single-path delay commutator (SDC). The MDC architecture [3]-[6] is typically used to process multiple input data streams because of its high throughput rate, but it is not suited to a single input data stream, and it requires more hardware than the combined SDC-SDF architecture. The SDC-SDF (single-path delay commutator-feedback) architecture reduces the memory size and can fully utilize the multipliers; however, the utilization of the adders is still very low. The SDC architecture is seldom used to process a single input data stream, because it uses more memory than SDF and has more complicated control.

The radix-2 FFT butterfly mainly performs two operations, an addition and a subtraction; the subtraction is followed by a complex multiplication. Among FFT algorithms for radices other than 2, radix-4 is one of the most important. The radix-4 FFT can be used only when N is a power of 4. A higher radix gives lower computational complexity. The radix-4 FFT operates much like the radix-2 FFT: the sequence is divided into 4 subsequences, each of which is again divided into 4 subsequences, and so on. The radix-4 butterfly is based on the four-point DFT, so the radix-4 algorithm requires somewhat fewer multiplications than the radix-2 algorithm.

In this paper, we propose an efficient combined SDC-SDF radix-4 FFT architecture, which contains log4 N - 1 SDC stages, 1 SDF stage, and 1 bit reverser. This architecture produces the output sequence in the same order as the input [19].

2. THE COMBINED SDC-SDF RADIX-2 FFT

The existing SDC-SDF radix-2 FFT architecture contains 1 pre-stage, log2 N - 1 SDC stages, 1 post-stage, 1 SDF stage, and 1 bit reverser, as shown in Figure 1(a) [1]. The pre-stage reorders the complex input data into a new sequence in which the real parts are followed by the corresponding imaginary parts. Each SDC stage contains an SDC processing engine (PE), which achieves 100% arithmetic resource utilization for both the complex adders and the complex multipliers. The SDC PE, shown in Figure 1(b), contains a data commutator, a real add/sub unit, and an optimum complex multiplier unit. In stage t, the data commutator reorders its input data to generate a new data sequence whose index difference is N/2^t, where t is the index of the SDC stage. The output of the data commutator is the input to the real add/sub unit, which consists of one adder and one subtracter; both operations are performed on the input data. The optimum complex multiplier unit of the radix-2 SDC PE in Figure 1(b) contains 2 multiplexers, 2 multipliers, 1 real adder, and 1.5 words of memory. The signal s selects whether the real adder performs an addition or a subtraction.
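The index difference N/2^t used by the radix-2 data commutator can be illustrated with a short Python sketch. It only lists which sample indices are paired in stage t, assuming the usual decimation-in-frequency ordering; the function name `radix2_pairs` is hypothetical, and the cycle-by-cycle streaming schedule of the hardware commutator in [1] is not modelled.

```python
def radix2_pairs(n_points, stage):
    """List the index pairs (i, i + d) of one radix-2 stage, with d = N / 2**stage.

    Assumption: decimation-in-frequency ordering, stages numbered from t = 1.
    """
    d = n_points // (2 ** stage)          # index difference N / 2^t
    pairs = []
    for block_start in range(0, n_points, 2 * d):
        for offset in range(d):
            i = block_start + offset
            pairs.append((i, i + d))      # the two samples fed to the add/sub unit
    return pairs


if __name__ == "__main__":
    for t in (1, 2, 3):
        print("stage", t, radix2_pairs(16, t)[:4])
```

For N = 16 the difference is 8 in stage 1, 4 in stage 2, and 2 in stage 3, and every stage still forms N/2 = 8 pairs.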
Figure 1(a). The combined SDC-SDF architecture for Radix-2 FFT

Figure 1(b). The SDC PE for Radix-2 FFT

The post-stage converts the reordered sequence back to the complex format. The last stage is the SDF stage, which is similar to the radix-2 butterfly and requires a complex adder and a complex subtracter. With the modified addressing method, the bit reverser requires only an N/2 data buffer, and the data are obtained in normal order.

3. PROPOSED SYSTEM

The main advantage of the proposed SDC-SDF radix-4 FFT architecture is that the inputs are applied through a single path and the outputs are obtained through a single path. The proposed SDC processing engine requires fewer complex multipliers and adders than the existing SDC-SDF radix-2 FFT architecture. The proposed architecture requires 1 pre-stage, log4 N - 1 SDC stages, 1 post-stage, 1 SDF stage, and 1 bit reverser, as shown in Figure 2(a). It is based on the radix-4 butterfly operation, in which 4 operations are performed at the same time.

The pre-stage reorders the complex input data so that the real parts are followed by the imaginary parts. For example, if the data initially arrive as 0_r, 0_i, 1_r, 1_i, and so on, the pre-stage outputs 0_r, 1_r, 2_r, 3_r in the 1st cycle and 0_i, 1_i, 2_i, 3_i in the 2nd cycle. The output of the pre-stage is the input to the SDC stages. The number of SDC stages depends on N: the proposed architecture has log4 N - 1, or equivalently ½ log2 N - 1, SDC stages.
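The pre-stage reordering and the stage count can be sketched in Python as follows. This is an illustrative software model only, not the Verilog design: the function names `pre_stage` and `sdc_stage_count` are hypothetical, and the group size of 4 is taken from the 0_r, 1_r, 2_r, 3_r example above.

```python
def pre_stage(samples, group=4):
    """Reorder complex samples so that each group of real parts is followed
    by the matching group of imaginary parts (group size 4 is an assumption
    taken from the example in the text)."""
    out = []
    for start in range(0, len(samples), group):
        block = samples[start:start + group]
        out.extend(z.real for z in block)   # e.g. 0_r, 1_r, 2_r, 3_r
        out.extend(z.imag for z in block)   # e.g. 0_i, 1_i, 2_i, 3_i
    return out


def sdc_stage_count(n_points, radix):
    """Return log_radix(N) - 1, the number of SDC stages, for N a power of the radix."""
    if n_points < radix:
        raise ValueError("N must be at least the radix")
    stages, n = 0, n_points
    while n > 1:
        if n % radix:
            raise ValueError("N must be a power of the radix")
        n //= radix
        stages += 1
    return stages - 1


if __name__ == "__main__":
    data = [complex(i, 10 + i) for i in range(4)]
    print(pre_stage(data))
    print(sdc_stage_count(16, 2), sdc_stage_count(16, 4))
```

For N = 16, the radix-2 architecture needs 3 SDC stages while the radix-4 architecture needs 1, which is the reduction in stage count that the proposed design relies on.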
The SDC processing engine consists of a data commutator, a radix-4 butterfly, and complex multipliers 1 and 2, as shown in Figure 2(b). The data commutator shuffles the real input data into a new data sequence whose index differences are 3N/4, N/2, and N/4. After the new data sequence has been generated, and before it enters the radix-4 butterfly, it is multiplied by complex multipliers 1. Here the value of k varies from 0 to 3.
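As a reference model for the radix-4 butterfly named above, the following Python sketch computes a four-point DFT using only additions, subtractions, and a multiplication by -j, and checks it against a direct DFT. The function names are hypothetical, and the grouping in `radix4_groups` (samples N/4, N/2, and 3N/4 apart) assumes a first decimation-in-frequency stage; the twiddle multiplications of complex multipliers 1 and 2 are not modelled.

```python
import cmath


def radix4_butterfly(x0, x1, x2, x3):
    """Four-point DFT built from adders/subtracters and one multiplication by -j."""
    a = x0 + x2
    b = x0 - x2
    c = x1 + x3
    d = (x1 - x3) * -1j            # multiplying by -j only swaps and negates parts
    return (a + c, b + d, a - c, b - d)


def dft_direct(x):
    """Direct O(N^2) DFT, used here only as a reference for checking."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]


def radix4_groups(n_points):
    """Index groups (i, i + N/4, i + N/2, i + 3N/4) fed to one butterfly
    (assumed first-stage decimation-in-frequency grouping)."""
    q = n_points // 4
    return [(i, i + q, i + 2 * q, i + 3 * q) for i in range(q)]


if __name__ == "__main__":
    x = [1 + 2j, -3 + 0.5j, 0.25 - 1j, 4 + 4j]
    print(radix4_butterfly(*x))
    print(dft_direct(x))
    print(radix4_groups(16))
```

Both print statements give the same four values up to rounding, which is the sense in which the butterfly performs 4 operations at once without a general complex multiplier.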
The data commutator operates over 4 cycles: k = 0 in the first cycle, k = 1 in the second, k = 2 in the third, and k = 3 in the fourth. Depending on the value of k, the outputs of the data commutator are multiplied by complex multipliers 1. The next block is the radix-4 butterfly, which takes its data from complex multipliers 1. The main advantage of the radix-4 butterfly is that it performs 4 operations at the same time. Internally it consists of adders and subtractors: it takes 4 inputs, performs additions and subtractions among them, and generates 4 outputs. The output of the butterfly is multiplied by complex multipliers 2; this multiplication also depends on the value of k. Finally, we obtain 4 outputs: the real output and complex outputs 1, 2, and 3.

Figure 2(a). Proposed Architecture for combined SDC-SDF Radix-4 FFT

Figure 2(b). The single path delay commutator processing engine for Radix-4 FFT

The process continues by applying the remaining groups of inputs to SDC1, and so on. Once this process has been carried through all log4 N - 1 SDC stages, most of the radix-4 FFT operation is complete. The output of the SDC stages is the input to the post-stage, which does exactly the opposite of the pre-stage: it shuffles the reordered sequence back into complex data. The next stage is the SDF stage, which takes its input from the post-stage. It consists of a radix-4 butterfly and three N/4 delay elements. The advantage of the SDF stage is that it changes the data sequence so that the delay memory of the bit reverser is reduced to N/4. The combined SDC-SDF architecture produces the output in normal order, the same order as the input.

4. RESULTS AND COMPARISON

The combined SDC-SDF architecture for the radix-4 FFT was designed in the Verilog Hardware Description Language (Verilog HDL).
The simulation results were evaluated using ModelSim 6.3c, and the synthesis performance was estimated using Xilinx ISE 14.1.

Figure 3(a). Simulation Waveform of Radix-4 SDC-SDF FFT

In Figure 3(a), the complex input consists of a real part, in_real, and an imaginary part, in_imag. We apply 16 complex inputs of 32 bits through a single path, together with clk, control, and a 3-bit twiddle signal. The 4-bit signal s represents the number of inputs.

Figure 3(b). Simulation Waveform continue1 of Radix-4 SDC-SDF FFT

In Figure 3(b), the complex output consists of a real part, out_real, and an imaginary part, out_imag. After the 16 complex inputs have been received, the outputs (out_real and out_imag) appear through a single 32-bit path.

Fig. 4(a) RTL Schematic of Radix-4 SDC-SDF FFT
In Fig. 4(a), the RTL schematic shows the input and output signals. The inputs are clk, in_real (32 bits), in_imag (32 bits), control (5 bits), twiddle (3 bits), and s. The outputs are out_real (32 bits) and out_imag (32 bits).

Fig. 4(b): RTL Schematic detailed view of Radix-4 SDC-SDF FFT

In Fig. 4(b), the detailed RTL schematic of the SDC-SDF radix-4 FFT shows 5 blocks: 1 pre-stage, 1 SDC stage, 1 post-stage, 1 SDF stage, and 1 bit reverser. The design was verified through this RTL schematic view.

Table 1: Design Summary of Single path delay commutator-feedback Radix-4 FFT

S.No.  Parameter                        Value
1.     No. of slice registers (in %)    2
2.     No. of slice LUTs (in %)         61
3.     No. of DSP48E1s (in %)           9
4.     Min. clock period                39.473 ns
5.     Frequency                        25.334 MHz
6.     On-chip logic                    0.007
7.     Dynamic power                    0.047 W
8.     Quiescent power                  0.073 W
9.     Total power                      0.120 W
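Two rows of Table 1 follow from other rows by simple arithmetic, which the short Python check below makes explicit: the frequency is the reciprocal of the minimum clock period, and the total power is the sum of the dynamic and quiescent power. The variable names are illustrative; the numbers are those reported in Table 1.

```python
# Values copied from Table 1.
min_clock_period_ns = 39.473
dynamic_power_w = 0.047
quiescent_power_w = 0.073

# f = 1 / T; with T in nanoseconds, 1e3 / T gives the frequency in MHz.
max_frequency_mhz = 1e3 / min_clock_period_ns

# Total power is the sum of the dynamic and quiescent contributions.
total_power_w = dynamic_power_w + quiescent_power_w

if __name__ == "__main__":
    print(f"frequency   = {max_frequency_mhz:.3f} MHz")
    print(f"total power = {total_power_w:.3f} W")
```

Rounded to three decimals, the computed values agree with the 25.334 MHz and 0.120 W entries of the table.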
The SDC-SDF radix-4 FFT is compared with the SDC-SDF radix-2 FFT in terms of dynamic power, quiescent power, number of slice registers, number of slice LUTs, and number of DSP48E1s. Both implementations give the same outputs, but the radix-4 design consumes less power and area than the SDC-SDF radix-2 FFT.

Table 2: Comparison between Single path delay commutator-feedback Radix-4 FFT and Radix-2 FFT

S.No.  Parameter                        SDC-SDF Radix-2 FFT   SDC-SDF Radix-4 FFT
1.     No. of slice registers (in %)    3                     2
2.     No. of slice LUTs (in %)         72                    61
3.     No. of DSP48E1s (in %)           25                    9
4.     Dynamic power                    0.048 W               0.047 W
5.     Quiescent power                  0.073 W               0.073 W
6.     Total power                      0.122 W               0.120 W

Fig. 5 Comparison of Number of slice LUTs of SDC-SDF of Radix-4 and Radix-2 FFT

Fig. 5 shows that the SDC-SDF radix-4 FFT uses fewer slice LUTs than the SDC-SDF radix-2 FFT (61% against 72%). We take this to suggest that the output of the radix-4 design is obtained faster than that of the radix-2 design.
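The entries of Table 2 can be expressed as relative savings of the radix-4 design over the radix-2 design. The Python sketch below does this with the values reported in the table; the helper name `relative_reduction` and the dictionary layout are illustrative choices, not part of the original design flow.

```python
# (radix-2 value, radix-4 value) for each row of Table 2.
table2 = {
    "slice registers (%)": (3, 2),
    "slice LUTs (%)": (72, 61),
    "DSP48E1s (%)": (25, 9),
    "dynamic power (W)": (0.048, 0.047),
    "quiescent power (W)": (0.073, 0.073),
    "total power (W)": (0.122, 0.120),
}


def relative_reduction(radix2_value, radix4_value):
    """Saving of the radix-4 design relative to the radix-2 design, in percent."""
    return 100.0 * (radix2_value - radix4_value) / radix2_value


if __name__ == "__main__":
    for name, (r2, r4) in table2.items():
        print(f"{name:22s} {relative_reduction(r2, r4):6.2f} % lower")
```

On these numbers the saving is largest for the DSP48E1 slices (64%) and the slice LUTs (about 15%), while the dynamic and total power differ by roughly 2%.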
Fig. 6 Comparison of Number of DSP48E1s of SDC-SDF of Radix-4 and Radix-2 FFT

Fig. 6 shows that the SDC-SDF radix-4 FFT uses fewer DSP48E1s than the SDC-SDF radix-2 FFT (9% against 25%). We take this to suggest that the output of the radix-4 design is obtained faster than that of the radix-2 design.

Fig. 7 Comparison of Dynamic Power of SDC-SDF of Radix-4 and Radix-2 FFT

Fig. 7 shows that the dynamic power of the SDC-SDF radix-4 FFT is lower than that of the SDC-SDF radix-2 FFT (0.047 W against 0.048 W), which indicates improved performance of the radix-4 architecture.

Fig. 8 Comparison of Total Power of SDC-SDF of Radix-4 and Radix-2 FFT
Fig. 8 shows that the total power of the SDC-SDF radix-4 FFT is lower than that of the SDC-SDF radix-2 FFT (0.120 W against 0.122 W), which indicates improved performance of the radix-4 architecture.

5. CONCLUSION

The proposed SDC-SDF radix-4 FFT architecture produces the output data in the same order as the input. It reduces the number of complex multiplications as well as the number of stages compared with the radix-2 FFT architecture. The architecture was simulated using ModelSim, and design verification and the area and power reports were obtained using Xilinx ISE 14.1. The proposed architecture achieves reduced area and low power consumption.

REFERENCES

[1] Zeke Wang, Xue Liu, Bingsheng He, and Feng Yu, “A Combined SDC-SDF Architecture for Normal I/O Pipelined Radix-2 FFT”, IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 23, no. 5, pp. 973-977, May 2015.
[2] L. J. Cimini, “Analysis and simulation of a digital mobile channel using orthogonal frequency division multiplexing”, IEEE Trans. Commun., vol. 33, no. 7, pp. 665-675, Jul. 1985.
[3] C. Cheng and K. K. Parhi, “High throughput VLSI architecture for FFT computation”, IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 54, no. 10, pp. 339-344, Oct. 2007.
[4] S. N. Tang, J. W. Tsai, and T. Y. Chang, “A 2.4-GS/s FFT processor for OFDM-based WPAN applications”, IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 57, no. 6, pp. 451-455, Jun. 2010.
[5] Y. Jung, H. Yoon, and J. Kim, “New efficient FFT algorithm and pipeline implementation results for OFDM/DMT applications”, IEEE Trans. Consum. Electron., vol. 49, no. 1, pp. 14-20, Feb. 2003.
[6] M. Shin and H. Lee, “A high-speed, four-parallel radix-2^4 FFT processor for UWB applications”, in Proc. IEEE ISCAS, May 2008, pp. 960-963.
[7] E. H. Wold and A. M. Despain, “Pipeline and parallel-pipeline FFT processors for VLSI implementation”, IEEE Trans. Comput., vol. C-33, no. 5, pp. 414-426, May 1984.
[8] Y. N. Chang, “An efficient VLSI architecture for normal I/O order pipeline FFT design”, IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 55, no. 12, pp. 1234-1238, Dec. 2008.
[9] X. Liu, F. Yu, and Z. K. Wang, “A pipelined architecture for normal I/O order FFT”, J. Zhejiang Univ. Sci. C, vol. 12, no. 1, pp. 76-82, Jan. 2011.
[10] T. Sansaloni, A. Perez-Pascual, V. Torres, and J. Valls, “Efficient pipeline FFT processors for WLAN MIMO-OFDM systems”, Electron. Lett., vol. 41, no. 19, pp. 1043-1044, Sep. 2005.
[11] J. Y. Oh and M. S. Lim, “Area and power efficient pipeline FFT algorithm”, in Proc. IEEE Workshop Signal Process. Syst. Design and Implementation, Nov. 2005, pp. 520-525.
[12] T. Cho, S. Tsai, and H. Lee, “A high-speed low-complexity modified radix-2^5 FFT processor for high rate WPAN applications”, IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 21, no. 1, pp. 187-191, Jan. 2013.
[13] A. Cortes, I. Velez, and J. F. Sevillano, “Radix r^k FFTs: Matricial representation and SDC/SDF pipeline implementation”, IEEE Trans. Signal Process., vol. 57, pp. 2824-2839, Jul. 2009.
[14] M. Garrido, J. Grajal, M. Sanchez, and O. Gustafsson, “Pipelined radix-2^k feedforward FFT architectures”, IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 21, no. 1, pp. 23-32, Jan. 2013.
[15] L. Yang, K. Zhang, H. Liu, J. Huang, and S. Huang, “An efficient locally pipelined FFT processor”, IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 53, no. 7, pp. 585-589, Jul. 2006.
[16] T. Lenart and V. Owall, “Architectures for dynamic data scaling in 2/4/8k pipeline FFT cores”, IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 14, no. 11, pp. 1286-1290, Nov. 2006.
[17] M. Ayinala, M. Brown, and K. K. Parhi, “Pipelined parallel FFT architectures via folding transformation”, IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 20, no. 6, pp. 1068-1081, Jun. 2012.
[18] G. Bi and E. V. Jones, “A pipelined FFT processor for word-sequential data”, IEEE Trans. Acoust., Speech, Signal Process., vol. 37, no. 12, pp. 1982-1985, Dec. 1989.
[19] B. Gold and C. M. Rader, Digital Processing of Signals. New York, NY, USA: McGraw-Hill, 1969, ch. 6.