Pipelined Architectures for High-Speed and Area-Efficient Viterbi Decoders
Chen, Chao-Nan
Chu, Hsi-Cheng
Convolutional code
Viterbi decoder
In-place path metric updating
Inserting pipeline levels into ACS
Convolutional Codes
• Convolutional encoders map information streams into a long code sequence.
• k = 1 bit input blocks produce n = 2 code symbols each.
• The code rate k/n expresses the information per coded bit, and the constraint length v defines the encoder memory order.
• This encoder has 2^(v-1) = 4 states.
Fig. 1. A simple rate ½, v = 3 convolutional encoder (labels: input, 1st code symbol, 2nd code symbol, output)
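The encoder described above can be sketched in a few lines of Python. The slides do not give the generator polynomials (the figure is lost), so the common pair 7 and 5 (octal) is assumed here purely for illustration; `conv_encode` and its parameters are hypothetical names.

```python
def conv_encode(bits, v=3, gens=(0b111, 0b101)):
    """Rate 1/n convolutional encoder with constraint length v.

    gens are ASSUMED generator polynomials (7 and 5 octal); the slides
    do not state which taps Fig. 1 uses.
    """
    mask = (1 << (v - 1)) - 1          # the state holds the last v-1 input bits
    state = 0
    symbols = []
    for b in bits:
        reg = (state << 1) | b         # newest bit enters at the LSB
        # each code symbol is the parity of the tapped register bits
        symbols.append(tuple(bin(reg & g).count("1") & 1 for g in gens))
        state = reg & mask
    return symbols


print(conv_encode([1, 0, 1, 1]))       # [(1, 1), (1, 0), (0, 0), (0, 1)]
```

With v = 3 the state is 2 bits wide, which gives the 2^(v-1) = 4 states mentioned above.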
Viterbi Algorithm (VA)
• The VA is the most commonly employed decoding technique and can be implemented in either software or digital hardware.
• The VA uses the trellis diagram (Fig. 2) and can theoretically perform maximum likelihood decoding.
• It finds the most likely path by means of a suitable distance metric between the received sequence and all the trellis paths.
Fig. 2. Trellis diagram representation of the encoder of Fig. 1 (branches labeled with the code symbol pairs 00, 01, 10, 11; legend: input bit 0, input bit 1)
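To make the decoding criterion concrete, the sketch below performs maximum likelihood decoding by brute force: it encodes every possible input sequence and keeps the one whose code sequence has the smallest Hamming distance to the received sequence. This is not the Viterbi algorithm itself, only the search it carries out efficiently. The generator polynomials (7 and 5 octal) and all function names are assumptions made for illustration.

```python
from itertools import product


def encode(bits, v=3, gens=(0b111, 0b101)):
    """Rate 1/2 encoder; gens are ASSUMED polynomials (7, 5 octal)."""
    mask = (1 << (v - 1)) - 1
    state, symbols = 0, []
    for b in bits:
        reg = (state << 1) | b
        symbols.append(tuple(bin(reg & g).count("1") & 1 for g in gens))
        state = reg & mask
    return symbols


def hamming(seq_a, seq_b):
    """Number of differing bits between two sequences of code symbols."""
    return sum(a != b for sa, sb in zip(seq_a, seq_b) for a, b in zip(sa, sb))


def brute_force_ml(received, n_bits):
    """Try every trellis path of length n_bits and keep the closest one."""
    best = min(product((0, 1), repeat=n_bits),
               key=lambda cand: hamming(encode(cand), received))
    return list(best)


# Code sequence for input 1 0 1 1 0 0 with one bit flipped in the third symbol.
received = [(1, 1), (1, 0), (1, 0), (0, 1), (0, 1), (1, 1)]
print(brute_force_ml(received, 6))     # [1, 0, 1, 1, 0, 0]
```

The cost of this search doubles with every input bit, which is why the trellis-based recursion of the VA is used in practice.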
Viterbi Decoder
BMU: Branch metrics (BMs) are computed from the incoming input data.
ACSU: Path metrics (PMs) of all states are updated according to equation (1).
SMU: The stored decisions are used to build a unique decoded output.
PM[i](t+1) = min over all possible k of ( PM[k](t) + BM[k][i](t) )    (1)
PM[k](t): path metric of state k at instant t
BM[k][i](t): branch metric of the transition from state k at t to state i at t+1
Fig. 3. Basic computation units in a Viterbi decoder: Input → Branch Metric Unit (BMU) → ACS Unit → Survivor-Path Memory Unit (SMU) → Output
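The three units can be illustrated with a minimal hard-decision decoder in Python. This is a sketch under stated assumptions, not the architecture of the slides: the generator polynomials (7 and 5 octal), the Hamming-distance branch metric, the start in state 0, and the full-length traceback are all choices made here for illustration, and `viterbi_decode` is a hypothetical name.

```python
def viterbi_decode(received, v=3, gens=(0b111, 0b101)):
    """Hard-decision Viterbi decoder for a rate 1/n code.

    received: list of n-bit tuples. gens are ASSUMED polynomials.
    """
    n_states = 1 << (v - 1)
    mask = n_states - 1
    inf = float("inf")
    pm = [0] + [inf] * (n_states - 1)      # assume the encoder starts in state 0
    decisions = []                          # SMU: best predecessor per state and step
    for symbol in received:
        new_pm = [inf] * n_states
        dec = [None] * n_states
        for k in range(n_states):           # predecessor state k
            for bit in (0, 1):
                reg = (k << 1) | bit
                i = reg & mask               # successor state i
                expected = tuple(bin(reg & g).count("1") & 1 for g in gens)
                # BMU: Hamming distance between expected and received symbol
                bm = sum(e != r for e, r in zip(expected, symbol))
                cand = pm[k] + bm            # ACSU: add ...
                if cand < new_pm[i]:         # ... compare and select, as in eq. (1)
                    new_pm[i] = cand
                    dec[i] = k
        pm = new_pm
        decisions.append(dec)
    # SMU: trace back from the best final state
    state = min(range(n_states), key=pm.__getitem__)
    bits = []
    for dec in reversed(decisions):
        bits.append(state & 1)               # newest input bit is the LSB of the state
        state = dec[state]
    return bits[::-1]


# Code sequence for input 1 0 1 1 0 0 with one bit flipped in the third symbol.
received = [(1, 1), (1, 0), (1, 0), (0, 1), (0, 1), (1, 1)]
print(viterbi_decode(received))              # [1, 0, 1, 1, 0, 0]
```

Here every state is updated in every step inside one loop; a hardware ACSU would either update all N states in parallel or time-share P < N ACS units, which is the trade-off the later slides address.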
In-place Path Metric Updating
Efficiently saves half of the path metric memory.
Fig. 4. Example for v = 3: (a) butterflies in the traditional approach; (b) states and butterflies during one full cycle of in-place computation. State order in memory over the cycle in (b): 0 1 2 3 4 5 6 7 → 0 2 4 6 1 3 5 7 → 0 4 1 5 2 6 3 7 → 0 1 2 3 4 5 6 7.
Fig. 3. Partial trellis diagram, or butterfly, for in-place computation of updated path metrics: states i and i+2^(v-1) feed states 2i and 2i+1; the new metric of state 2i overwrites the previous metric of state i, and the new metric of state 2i+1 overwrites the previous metric of state i+2^(v-1).
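The in-place scheme can be checked with a small Python simulation. It follows the convention of the v = 3 example with eight states, that is N = 2^v. The branch metric callback `bm(k, i)` and both function names are hypothetical, and the double-buffered `step_reference` is included only as a baseline for comparison.

```python
def butterfly_inplace(mem, layout, bm, v):
    """One trellis step that updates the path metrics in place.

    mem[a]    : path metric stored at address a
    layout[a] : state whose metric currently lives at address a
    bm(k, i)  : branch metric of transition k -> i (hypothetical callback)
    """
    half = 1 << (v - 1)
    addr_of = {state: a for a, state in enumerate(layout)}
    for i in range(half):
        a0, a1 = addr_of[i], addr_of[i + half]
        p0, p1 = mem[a0], mem[a1]            # read both predecessors first
        for a, succ in ((a0, 2 * i), (a1, 2 * i + 1)):
            mem[a] = min(p0 + bm(i, succ), p1 + bm(i + half, succ))
            layout[a] = succ                 # address a now holds state succ


def step_reference(pm, bm, v):
    """Same update with a second buffer (twice the metric storage)."""
    half = 1 << (v - 1)
    return [min(pm[s >> 1] + bm(s >> 1, s),
                pm[(s >> 1) + half] + bm((s >> 1) + half, s))
            for s in range(1 << v)]


mem, layout = [5, 1, 4, 2, 7, 0, 3, 6], list(range(8))
for _ in range(3):
    butterfly_inplace(mem, layout, lambda k, i: (3 * k + 5 * i) % 7, 3)
    print(layout)
# [0, 2, 4, 6, 1, 3, 5, 7]
# [0, 4, 1, 5, 2, 6, 3, 7]
# [0, 1, 2, 3, 4, 5, 6, 7]
```

Because each butterfly reads and writes the same two addresses, no second buffer is needed, and after v steps the states return to their natural order.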
Figure 5. The diagram of BF unit (states i and i+32 feed states 2i and 2i+1)
Table 1. State arrangement and path metric updating for constraint length 7 (64 states)
Figure 6. A novel architecture for the Viterbi decoder

Cycle                  0 1 2 3 4 5 6 7
Iteration 0
  Address (DpRAM0-3)   0 1 2 3 4 5 6 7
  Address (DpRAM4-7)   0 1 2 3 4 5 6 7
Iteration 1
  Address (DpRAM0-3)   0 2 4 6 1 3 5 7
  Address (DpRAM4-7)   1 3 5 7 0 2 4 6
Table 2. Address scrambling of path metric memory for constraint length 7 (64 states)
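As a small observation on Table 2, the Iteration 1 rows coincide with a 3-bit left rotation of the cycle index for DpRAM0-3, and with the same value XOR 1 for DpRAM4-7. The slides do not state the scrambling rule, so the Python sketch below only reproduces the two listed iterations; the helper `rotl3` and the XOR pattern are assumptions, not the authors' address generator.

```python
def rotl3(x, r=1):
    """Rotate a 3-bit value left by r positions."""
    r %= 3
    return ((x << r) | (x >> (3 - r))) & 0b111


cycles = range(8)

# Iteration 0: both groups of RAMs are addressed in natural order.
iter0_low = [rotl3(c, 0) for c in cycles]
iter0_high = [rotl3(c, 0) for c in cycles]

# Iteration 1: ASSUMED pattern that matches the rows of Table 2.
iter1_low = [rotl3(c, 1) for c in cycles]        # DpRAM0-3
iter1_high = [rotl3(c, 1) ^ 1 for c in cycles]   # DpRAM4-7

print(iter1_low)     # [0, 2, 4, 6, 1, 3, 5, 7]
print(iter1_high)    # [1, 3, 5, 7, 0, 2, 4, 6]
```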
Insert Pipeline Levels into ACS
• In general, the maximum number of ACS pipeline levels depends only on the ratio N/P (N: number of states; P: number of ACS units).

N/P                   1   2   4   8   16   32   64
ACS pipeline levels   1   1   2   5   10   20   40
Table 3. The maximum pipeline levels for N/P from 1 to 64

Figure 7. A simple example of inserting pipeline levels into an ACS unit (inputs PM[k](t), BM[k][i](t), PM[j](t), BM[j][i](t); two adders feed a comparator and a selector that produce PM[i](t+1))
Conclusion
Assuming the pipeline levels are equally distributed within the ACS unit, the decoding speed is LP/N ≈ 5/8 of that of a state-parallel ACS instead of P/N, where L is the number of ACS pipeline levels.
The maximum possible area saving can be obtained by selecting a large enough ratio N/P.
This is a favorable solution for applications where area saving, and hence power saving, is most crucial and a moderate degradation in decoding speed is allowed.
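The speed figures can be reproduced from Table 3 with a few lines of Python. The sketch assumes that L in LP/N is the maximum pipeline level count listed in Table 3 for the given ratio N/P; `relative_speed` is a hypothetical helper name.

```python
from fractions import Fraction

# Table 3 of the slides: ratio N/P -> maximum number of ACS pipeline levels L
MAX_PIPELINE_LEVELS = {1: 1, 2: 1, 4: 2, 8: 5, 16: 10, 32: 20, 64: 40}


def relative_speed(ratio, pipelined=True):
    """Decoding speed relative to a state-parallel ACS (N/P = 1).

    Returns L*P/N with pipelining and P/N without it.
    """
    levels = MAX_PIPELINE_LEVELS[ratio] if pipelined else 1
    return Fraction(levels, ratio)


for ratio in sorted(MAX_PIPELINE_LEVELS):
    print(ratio, relative_speed(ratio), relative_speed(ratio, pipelined=False))
```

For N/P = 64, for instance, the pipelined speed is 40/64 = 5/8 of the state-parallel speed, compared with 1/64 without pipelining. For N/P below 8 the table values give ratios other than 5/8, which is consistent with the "≈" in the conclusion.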