International Conference on Information Engineering, Management and Security 2016 (ICIEMS 2016) 70
Cite this article as: R P Meenaakshi Sundhari, M Karthickumar, S Pavithra, E Madura. “Implementation of MAC
using Modified Booth Algorithm”. International Conference on Information Engineering, Management and Security
2016: 70-76. Print.
International Conference on Information Engineering, Management and Security 2016 [ICIEMS]
ISBN 978-81-929866-4-7 VOL 01
Website iciems.in eMail iciems@asdf.res.in
Received 02 – February – 2016 Accepted 15 - February – 2016
Article ID ICIEMS014 eAID ICIEMS.2016.014
Implementation of MAC using Modified Booth Algorithm
R P Meenaakshi Sundhari¹, M Karthickumar², S Pavithra³ and E Madura⁴
¹,²,⁴ Assistant Professor and ³ PG Scholar
¹,³,⁴ Angel College of Engineering and Technology, Tirupur, India.
² Erode Sengunthar Engineering College, Erode, India.
Abstract- The proposed system is an efficient implementation of a 16-bit Multiplier Accumulator (MAC) using the Radix-8 and Radix-16 modified Booth algorithms
with several adders (SPST adder, carry-select adder, parallel prefix adder), described in VHDL (Very High Speed Integrated Circuit Hardware Description
Language). The proposed system provides low power, high speed and low delay. For both Booth multipliers, the power consumption (mW) and the
estimated delay (ns) are calculated and compared. Digital signal processing applications such as the fast Fourier transform, finite impulse
response filtering and convolution need high-speed, low-power MAC (Multiplier and Accumulator) units. By reducing glitches
(1-to-0 transitions) and spikes (0-to-1 transitions), the speed of operation is improved and dynamic power is reduced. The adder designed with
SPST avoids unwanted glitches and spikes, which reduces the switching power dissipation and the dynamic power. The speed can be improved by reducing
the number of partial products to half, by grouping the bits of the multiplier term. The proposed Radix-8 and Radix-16 Modified Booth Algorithm
MAC with SPST reduces the delay and achieves low power consumption compared to the array MAC.
Keywords: Radix-8 modified Booth algorithm, Radix-16 modified Booth algorithm, Digital signal processing, VHDL (Very High Speed Integrated
Circuit Hardware Description Language), Spurious Power Suppression Technique (SPST).
I INTRODUCTION
Multiplication is a fundamental operation in digital signal processing applications, and it consumes considerable power and area. Consequently,
there is a need for a low-power Booth multiplier design. The Booth algorithm is a standard technique which provides significant
improvement in terms of chip area and power compared to other multiplication techniques. The implementation of the multiplier
depends on the type of adder used in the MAC unit. Performance has improved by combining the multiplication with the accumulation
and by developing hybrid adders such as the parallel prefix adder and the carry-save adder. Several commercial
processors have selected the Radix-8 multiplier architecture to increase the speed of operation by reducing the number of partial
products in the multiplication. Radix-8 encoding shortens the signed-digit representation of the multiplier
compared to Radix-2 multiplication. Its performance is bottlenecked by the generation of the term 3X (three times the multiplicand), also referred to as
the hard multiple. The proposed MAC unit accumulates the intermediate result as sum and carry bits instead of the output of
the final adder, which optimizes the pipeline and improves the overall performance. The modified Booth’s algorithm based on
Radix-8, generally called Booth-2, is the most popular approach for implementing fast multipliers using parallel encoding. In
general, multi-operand addition is part of many complex arithmetic algorithms, such as multiplication and certain DSP algorithms.
One of the most popular multi-operand adders is the carry-save adder, which is capable of adding more than two operands at a time.
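As an illustration of the carry-save idea mentioned above, the following is a minimal behavioral sketch in Python, not the VHDL used in this work. It assumes non-negative integer operands; the function names `carry_save_add` and `multi_operand_add` are hypothetical and chosen only for this example. Three operands are compressed into a sum word and a carry word with no carry propagation, and a single carry-propagate addition is performed only at the end.

```python
def carry_save_add(a, b, c):
    """3:2 compressor: reduce three operands to a (sum, carry) pair.

    No carry ripples between bit positions; each bit is handled independently.
    """
    partial_sum = a ^ b ^ c                      # bitwise sum without carries
    carry = ((a & b) | (b & c) | (a & c)) << 1   # majority bits, weighted by 2
    return partial_sum, carry


def multi_operand_add(operands):
    """Add any number of non-negative operands using a chain of 3:2 compressors."""
    s, c = 0, 0
    for op in operands:
        s, c = carry_save_add(s, c, op)
    return s + c  # the only carry-propagate addition


print(multi_operand_add([3, 5, 7, 9]))
```

The point of the structure is that the slow carry-propagate step appears once, regardless of how many operands are added.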
This paper is prepared exclusively for International Conference on Information Engineering, Management and Security 2016 [ICIEMS 2016] which is published by
ASDF International, Registered in London, United Kingdom under the directions of the Editor-in-Chief Dr. K. Saravanan and Editors Dr. Daniel James, Dr.
Kokula Krishna Hari Kunasekaran and Dr. Saikishore Elangovan. Permission to make digital or hard copies of part or all of this work for personal or classroom use
is granted without fee provided that copies are not made or distributed for profit or commercial advantage, and that copies bear this notice and the full citation on
the first page. Copyrights for third-party components of this work must be honoured. For all other uses, contact the owner/author(s). Copyright Holder can be
reached at copy@asdf.international for distribution.
2016 © Reserved by Association of Scientists, Developers and Faculties [www.ASDF.international]
Figure 1. Hardware Architecture of General MAC Array Multiplier
The objective of this paper is to introduce the flexibility of adding three input operands in a regular adder, thereby reducing the need
for a special adder for the same process. The general architecture of the MAC is shown in Figure 1. The proposed approach is implemented
in VHDL and simulated with ModelSim 6.5c. The MAC executes the multiplication operation by multiplying the multiplier and the
multiplicand. The multiplier is denoted X and the multiplicand Y; their product is added to the previous multiplication result Z as the
accumulation step.
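The multiply-accumulate step described above, Z = Z + X × Y, can be sketched as a small behavioral model. This is only an illustration in Python under stated assumptions: the class name `Mac` and the 32-bit accumulator width are hypothetical and are not taken from the VHDL design.

```python
class Mac:
    """Behavioral model of a multiplier-accumulator: Z <- Z + X * Y."""

    def __init__(self, acc_bits=32):
        # Assumed accumulator width; results wrap modulo 2**acc_bits.
        self.mask = (1 << acc_bits) - 1
        self.z = 0

    def step(self, x, y):
        """Multiply x by y and add the product to the stored result Z."""
        self.z = (self.z + x * y) & self.mask
        return self.z


mac = Mac()
for x, y in [(3, 4), (5, 6), (7, 8)]:
    mac.step(x, y)
print(mac.z)
```

Repeating `step` over pairs of samples and coefficients is the inner loop of an FIR filter or a convolution, which is why the MAC unit dominates such applications.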
II Types of Adders
A. SPST Adder
In the SPST adder, the 16-bit adder/subtractor is divided into an MSP (Most Significant Part) and an LSP (Least Significant Part) between the
8th and 9th bits.
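The MSP/LSP split can be sketched behaviorally as follows. This Python fragment is a simplified illustration, not the SPST circuit itself: the function name `spst_style_add` is hypothetical, and the `msp_idle` flag is an assumed stand-in for the detection logic, reporting when both upper bytes are pure sign extension (all zeros or all ones) so that hardware could hold the MSP adder inputs steady. The model still computes the upper byte by ordinary addition.

```python
MASK8 = 0xFF
MASK16 = 0xFFFF


def spst_style_add(a, b):
    """Add two 16-bit words as an LSP (bits 7..0) and an MSP (bits 15..8).

    Returns (result, msp_idle). msp_idle is a simplified, assumed detection
    signal: True when both upper bytes are all zeros or all ones.
    """
    a &= MASK16
    b &= MASK16
    lsp_total = (a & MASK8) + (b & MASK8)
    carry_into_msp = lsp_total >> 8
    a_msp, b_msp = a >> 8, b >> 8
    msp_idle = a_msp in (0x00, 0xFF) and b_msp in (0x00, 0xFF)
    msp = (a_msp + b_msp + carry_into_msp) & MASK8
    return (msp << 8) | (lsp_total & MASK8), msp_idle


print(spst_style_add(0x0012, 0xFFF0))
```

When `msp_idle` is true, the upper half carries no new information beyond the sign and the carry from the LSP, which is the situation in which suppressing MSP switching saves dynamic power.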
Figure 2. Proposed Low Power SPST Equipped Multiplier
The MSP of the original adder is modified to include detection logic circuits, data controlling circuits, sign extension circuits,
latch and clock circuits, and logic for calculating the carry-in and carry-out signals. Figure 2 shows the proposed low-power SPST-equipped
multiplier, which consists of a latch, detection logic and sign extension logic.
B. Carry Select Adder
A 16-bit carry-select adder with a uniform block size of 4 can be created with three carry-select blocks and a 4-bit ripple-carry adder.
Since the carry-in is known at the beginning of the computation, a carry-select block is not needed for the first four bits. The delay of this
adder is four full-adder delays plus three MUX delays. A 16-bit carry-select adder with variable block sizes can be created similarly,
as shown in Figure 3. Here an adder with block sizes of 2-2-3-4-5 is used. This break-up is ideal when the full-adder delay is equal to the
MUX delay, which is unlikely. The total delay is two full-adder delays plus four multiplexer delays.
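The carry-select structure described above can be sketched as a behavioral model. This is a Python illustration under stated assumptions, not the VHDL design: the names `ripple_add` and `carry_select_add` are hypothetical, and block sizes are listed from the least significant block upward. Every block after the first computes two candidate results, one for carry-in 0 and one for carry-in 1, and the real incoming carry selects between them as a multiplexer would.

```python
def ripple_add(a, b, cin, width):
    """Add two width-bit values with a carry-in; returns (sum, carry_out)."""
    total = a + b + cin
    return total & ((1 << width) - 1), total >> width


def carry_select_add(a, b, block_sizes=(4, 4, 4, 4), cin=0):
    """Carry-select addition; block_sizes are listed from the LSB block upward."""
    result, shift, carry = 0, 0, cin
    for index, width in enumerate(block_sizes):
        mask = (1 << width) - 1
        a_blk, b_blk = (a >> shift) & mask, (b >> shift) & mask
        if index == 0:
            # Carry-in is known at the start, so a plain ripple adder suffices.
            s, carry = ripple_add(a_blk, b_blk, carry, width)
        else:
            s0, c0 = ripple_add(a_blk, b_blk, 0, width)  # candidate for carry-in 0
            s1, c1 = ripple_add(a_blk, b_blk, 1, width)  # candidate for carry-in 1
            s, carry = (s1, c1) if carry else (s0, c0)   # multiplexer
        result |= s << shift
        shift += width
    return result, carry


print(carry_select_add(0x1234, 0x0FFF))
print(carry_select_add(0x1234, 0x0FFF, block_sizes=(2, 2, 3, 4, 5)))
```

Both block configurations give the same sum; they differ only in how the delay is distributed between the adders and the multiplexers in hardware.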
Figure 3. Variable Sized CSLA
C. Parallel Binary Adder
The goal of this paper is to present architectures that provide the flexibility, within a regular adder, to increment or decrement the sum
of two numbers by a constant as part of the addition process. This flexibility extends the functionality of a regular adder
while achieving performance comparable to conventional designs, thereby eliminating the need for a dedicated adder unit to
perform the same tasks. When the third operand is a constant, a design that accomplishes three-input addition is required. These
designs are called Enhanced Flagged Binary Adders (EFBA), shown in Figure 4. The performance of the adder is also examined when
the operand size is expanded from 16 bits to 32 and 64 bits. A detailed analysis compares the performance of the
new designs with carry-save adders in terms of delay, power dissipated and area consumed.
Figure 4. EFBA Block diagram
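The paper does not spell out the internal logic of the EFBA, so the following Python sketch is only an assumption-laden illustration of the general flag-bit idea, limited to the constants +1 and -1. The names `flag_bits` and `flagged_add` are hypothetical. The sum of two numbers is computed once, and the incremented or decremented result is then obtained by inverting a run of low-order bits selected by flag bits, with no second carry-propagate addition.

```python
def flag_bits(s, width, want):
    """Bit i is set when every bit of s below position i equals `want`."""
    flags, run = 0, 1
    for i in range(width):
        flags |= run << i
        if ((s >> i) & 1) != want:
            run = 0  # the run of matching low-order bits has ended
    return flags


def flagged_add(a, b, width=16, adjust=0):
    """Return (a + b + adjust) mod 2**width for adjust in {-1, 0, +1}."""
    mask = (1 << width) - 1
    s = (a + b) & mask
    if adjust == 1:
        return s ^ flag_bits(s, width, 1)  # flip trailing ones and the first zero
    if adjust == -1:
        return s ^ flag_bits(s, width, 0)  # flip trailing zeros and the first one
    return s


print(flagged_add(3, 4, adjust=1), flagged_add(3, 4, adjust=-1))
```

A general constant, as handled by the enhanced design described above, would need more than this single-step inversion; the sketch only shows why a small adjustment can reuse the sum of a regular adder.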
III Implementation
Booth multiplication is a technique that allows faster multiplication by grouping the multiplier bits. The grouping of multiplier bits
and Radix-8 Booth encoding reduce the number of partial products to half. In conventional shift-and-add multiplication, every column of the
multiplier term is taken and the multiplicand is multiplied by 1 or 0. Here every second column is taken and the multiplicand is multiplied by ±1, ±2, or 0. The
advantage of this method is the halving of the number of partial products. In Booth encoding the multiplier bits are formed into blocks of
three, such that each block overlaps the previous block by only one bit. Grouping starts from the LSB side, and the first block uses only
two bits of the multiplier term. Figure 5 below shows the grouping of bits from the multiplier term.
To obtain the correct partial product, each block is decoded from the grouped terms. Table I shows the encoding of the multiplier
value Y, which uses the Modified Booth Algorithm and generates the following five signed digits: -2, -1, 0, +1, +2. Each encoded
digit in the multiplier performs a certain operation on the multiplicand X.
Figure 5. Grouping of bits from the multiplier term in the multiplication operation
Table I Encoded five signed digits
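The grouping into overlapping blocks of three bits and the five signed digits can be sketched as follows. This Python fragment is a behavioral illustration, not the VHDL implementation; the names `BOOTH_DIGIT`, `booth_digits` and `booth_multiply` are hypothetical. The mapping used is the standard recoding for overlapping three-bit groups (commonly referred to as radix-4 modified Booth recoding), and it assumes that the multiplier is a signed integer that fits in `width` bits, with `width` even.

```python
# Standard recoding of an overlapping 3-bit group (y[i+1], y[i], y[i-1]).
BOOTH_DIGIT = {
    0b000: 0, 0b001: +1, 0b010: +1, 0b011: +2,
    0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0,
}


def booth_digits(y, width=16):
    """Recode a signed multiplier into width/2 digits in {-2, ..., +2}, LSB group first."""
    y &= (1 << width) - 1
    padded = y << 1  # implicit 0 to the right of the LSB, so the first block uses two real bits
    return [BOOTH_DIGIT[(padded >> i) & 0b111] for i in range(0, width, 2)]


def booth_multiply(x, y, width=16):
    """Sum the partial products: each digit selects 0, +-X or +-2X, shifted by 2 bits per step."""
    return sum(d * x * 4 ** k for k, d in enumerate(booth_digits(y, width)))


print(booth_digits(6))
print(booth_multiply(7, -3))
```

For a 16-bit multiplier this yields 8 digits, so 8 partial products are accumulated instead of 16, and every one of them is obtained from X by shifting and negation only.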
IV Modified Booth Algorithm for Radix-16
The number of subsequent calculation stages is decreased by enhancing the parallelism of the operation, so one way to realize
high-speed multipliers is to enhance parallelism. The Radix-4 Booth multiplier is the modified version of the
conventional Booth algorithm (Radix-2). Radix-2 and Radix-8 multiplication generally requires some
kind of carry-propagate adder, which increases the latency, mainly because of the long wires required to propagate carries from
the less significant to the more significant bits.
High-speed modulo multipliers that use Booth encoding for partial product generation have been proposed; the Booth encoding
technique reduces the number of partial products to be generated and accumulated. In Radix-4 Booth encoding, all modulo-reduced
partial products can be generated by shifting and negation. Greater savings in area and dynamic power dissipation are
feasible for large word-length multipliers by increasing the radix beyond four.
In the Radix-8 Booth encoding method shown in Figure 6, the number of partial products is reduced by two-thirds. However, this
reduction in the number of partial products leads to increased complexity in their generation. Compared with many other arithmetic
operations, multiplication is time-consuming and power-hungry. Thus enhancing the performance of the circuit and reducing the
power dissipation are the most important design challenges for all applications in which the multiplier unit dominates the system
performance and power dissipation.
An effective way to increase the speed of the multiplier is to reduce the number of partial products. The number of partial
products can be reduced with a higher-radix Booth encoder, but the number of hard multiples, which are costly to generate, increases
at the same time. To increase speed and performance, many parallel MAC architectures have been proposed.
Figure 6. Block Diagram of Radix-8 MBA
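The trade-off between fewer partial products and the cost of the hard multiple can be made concrete with a behavioral sketch of radix-8 recoding. This is a Python illustration under stated assumptions, not the block diagram of Figure 6: the names `radix8_digits` and `radix8_multiply` are hypothetical, the multiplier is assumed to be a signed integer that fits in `width` bits, and it is sign-extended to a multiple of three bits. Each overlapping group of four bits yields one digit in the range -4 to +4.

```python
def radix8_digits(y, width=16):
    """Recode a signed multiplier into ceil(width/3) digits in {-4, ..., +4}."""
    n_digits = -(-width // 3)                 # ceil(width / 3)
    y_ext = y & ((1 << (3 * n_digits)) - 1)   # two's complement, sign-extended
    padded = y_ext << 1                       # implicit 0 to the right of the LSB
    digits = []
    for k in range(n_digits):
        group = (padded >> (3 * k)) & 0b1111
        b0, b1 = group & 1, (group >> 1) & 1
        b2, b3 = (group >> 2) & 1, (group >> 3) & 1
        digits.append(-4 * b3 + 2 * b2 + b1 + b0)
    return digits


def radix8_multiply(x, y, width=16):
    """Accumulate partial products selected from 0, X, 2X, 3X and 4X."""
    multiples = {
        0: 0,
        1: x,
        2: x << 1,
        3: (x << 1) + x,  # hard multiple: needs a real addition, not just a shift
        4: x << 2,
    }
    total = 0
    for k, d in enumerate(radix8_digits(y, width)):
        pp = multiples[abs(d)]
        total += (-pp if d < 0 else pp) << (3 * k)
    return total


print(radix8_digits(3))
print(radix8_multiply(123, -45))
```

With a 16-bit multiplier the recoding produces 6 digits, so only 6 partial products are accumulated, but the 3X entry is the one multiple that cannot be formed by shifting and negation alone.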
There are two common approaches that make use of parallelism to enhance multiplication performance. The difference
between the two is that the latter carries out accumulation by feeding back the final CSA (Carry Save Adder) output rather than the
final adder result. The entire parallel MAC is based on radix-8 Booth encoding. The
implementation results and the characteristics of the parallel MAC based on both Booth encodings are presented in the following section.
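Feeding back the carry-save output instead of the final adder result can be sketched behaviorally as follows. This Python fragment is an illustration under stated assumptions, not the proposed hardware: the names `csa` and `mac_carry_save` and the 32-bit accumulator width are hypothetical, and each product is computed directly rather than through Booth-encoded partial products. The accumulator is held as a sum word and a carry word, and the carry-propagate addition is performed only once, when the final value is read.

```python
MASK = (1 << 32) - 1  # assumed accumulator width


def csa(a, b, c):
    """3:2 carry-save compression of three words, modulo 2**32."""
    s = (a ^ b ^ c) & MASK
    cy = (((a & b) | (b & c) | (a & c)) << 1) & MASK
    return s, cy


def mac_carry_save(pairs):
    """Accumulate x*y products while keeping the running total in (sum, carry) form."""
    s, c = 0, 0
    for x, y in pairs:
        # The previous (sum, carry) pair is fed back together with the new product.
        s, c = csa(s, c, (x * y) & MASK)
    return (s + c) & MASK  # final adder, used once at the end


print(mac_carry_save([(3, 4), (5, 6), (7, 8)]))
```

Because the final adder is outside the accumulation loop, its delay is not paid on every cycle, which is the motivation for the second approach described above.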
V Results
The simulation results for the 16-bit Radix-2 and Radix-8 modified Booth algorithms with three different adders and the MAC are shown
below. Tables II and III show the synthesis reports for the array MAC and for the Radix-2 and Radix-8 modified Booth algorithms with the adders used in
the MAC. The design is downloaded onto the target device, a Spartan 3E (Xc3s500eft2564), with the following settings: inputs (frequency of asynchronous nets
set to 10 MHz), signals (frequency of asynchronous nets set to 10 MHz) and outputs (capacitive load of outputs set to 28000 pF).
Table II compares the power consumption and estimated delay of the Radix-2 Modified Booth Algorithm with three
different adders in the MAC. Table III shows the same comparison for Radix-8, using the same adders as in the Radix-2 MAC. The design summary and
simulation results are also shown below.
Table II Comparison of radix-2 MBA

| Device parameters | Array multiplier & accumulator | SPST adder | Parallel prefix adder | Parallel binary adder |
|---|---|---|---|---|
| Number of 4-input LUTs | 636 out of 29504 | 1093 out of 29504 | 1083 out of 29504 | 1222 out of 9312 |
| Number of gate count for design | 4209 | 5987 | 7167 | 7155 |
| Estimated delay (ns) | 217.8 | 39.69 | 24.936 | 66.10 |
| Power consumption (mW) | 154 | 144 | 138.80 | 19.93 |

Table III Comparison of Radix-8 MBA

| Device parameters | Array multiplier & accumulator | SPST adder | Parallel prefix adder | Parallel binary adder |
|---|---|---|---|---|
| Number of 4-input LUTs | 636 out of 29504 | 1093 out of 29504 | 1083 out of 29504 | 549 out of 9312 |
| Number of gate count for design | 4209 | 5987 | 7167 | 3768 |
| Estimated delay (ns) | 217.8 | 39.69 | 24.93 | 53.084 |
| Power consumption (mW) | 154 | 144 | 138.80 | 16.533 |
Figure 7. Graphical comparison of different parameter of the adders
Figure 8. Simulation results for a 16-bit multiplier using radix-2 modified Booth algorithm with Parallel Prefix adder
Figure 9. Simulation results for a 16-bit multiplier using radix-8 modified Booth algorithm with Parallel Prefix adder
Figure 10. Design Summary of Radix-2 MBA for Parallel Prefix Adder
Figure 11. Design Summary of Radix-8 MBA for Parallel Prefix Adder
VI Conclusion
The different adders are compared on various measures, and each performs well in either power dissipation or delay, so the performance of
each adder differs from the others. The SPST adder avoids unwanted glitches and spikes, so switching power dissipation is
minimized. The Radix-2 modified Booth algorithm reduces the number of partial products to half by grouping bits from the
multiplier term in the multiplication operation, which improves the speed.
VII Future Scope
The modified Booth algorithm, which differs from the original Booth algorithm, is commonly used. The Radix-2 and Radix-8
Booth algorithms can be used for all multiplication processes; they shorten the critical path and reduce the power consumption. In
this paper, a 16-bit Radix-8 Modified Booth Algorithm using the spurious power suppression technique is implemented, and a Radix-16 MBA is also
implemented from the designed Radix-8 MBA. The benefits of miniaturization are high packing density, good circuit speed and low
power consumption. A fixed-width multiplier is required to maintain a fixed format with minimum accuracy loss in the output data.

More Related Content

PDF
Design of a Novel Multiplier and Accumulator using Modified Booth Algorithm w...
PDF
International Journal of Computational Engineering Research(IJCER)
PDF
IRJET- Design of 16 Bit Low Power Vedic Architecture using CSA & UTS
PDF
Paper id 25201467
PDF
A Pipelined Fused Processing Unit for DSP Applications
PDF
Bn26425431
PDF
Implementation and Performance Analysis of a Vedic Multiplier Using Tanner ED...
PDF
F011123134
Design of a Novel Multiplier and Accumulator using Modified Booth Algorithm w...
International Journal of Computational Engineering Research(IJCER)
IRJET- Design of 16 Bit Low Power Vedic Architecture using CSA & UTS
Paper id 25201467
A Pipelined Fused Processing Unit for DSP Applications
Bn26425431
Implementation and Performance Analysis of a Vedic Multiplier Using Tanner ED...
F011123134

What's hot (17)

PDF
Fpga based efficient multiplier for image processing applications using recur...
PDF
An Enhanced Performance Pipelined Bus Invert Coding For Power Optimization Of...
PDF
A Novel Low Complexity Histogram Algorithm for High Performance Image Process...
PDF
Design of High Speed and Low Power Veterbi Decoder for Trellis Coded Modulati...
PDF
A Fast Floating Point Double Precision Implementation on Fpga
PDF
IRJET- Review Paper on Study of Various Interleavers and their Significance
PDF
High Speed Low Power Veterbi Decoder Design for TCM Decoders
PDF
Analysis of various mcm algorithms for reconfigurable rrc fir filter
PDF
IRJET- Review Paper on Radix-2 DIT Fast Fourier Transform using Reversible Gate
PDF
Transformation and dynamic visualization of images from computer through an F...
PDF
Analysis, verification and fpga implementation of low power multiplier
PDF
Image transmission in wireless sensor networks
PDF
Area Efficient VHDL implementation of AHB arbiter IP
PDF
A Review on Analysis on Codes using Different Algorithms
PDF
Reconfigurable and versatile bil rc architecture
PDF
FPGA based Efficient Interpolator design using DALUT Algorithm
PDF
V3I8-0460
Fpga based efficient multiplier for image processing applications using recur...
An Enhanced Performance Pipelined Bus Invert Coding For Power Optimization Of...
A Novel Low Complexity Histogram Algorithm for High Performance Image Process...
Design of High Speed and Low Power Veterbi Decoder for Trellis Coded Modulati...
A Fast Floating Point Double Precision Implementation on Fpga
IRJET- Review Paper on Study of Various Interleavers and their Significance
High Speed Low Power Veterbi Decoder Design for TCM Decoders
Analysis of various mcm algorithms for reconfigurable rrc fir filter
IRJET- Review Paper on Radix-2 DIT Fast Fourier Transform using Reversible Gate
Transformation and dynamic visualization of images from computer through an F...
Analysis, verification and fpga implementation of low power multiplier
Image transmission in wireless sensor networks
Area Efficient VHDL implementation of AHB arbiter IP
A Review on Analysis on Codes using Different Algorithms
Reconfigurable and versatile bil rc architecture
FPGA based Efficient Interpolator design using DALUT Algorithm
V3I8-0460
Ad

Similar to Implementation of MAC using Modified Booth Algorithm (20)

PDF
EFFICIENT IMPLEMENTATION OF 16-BIT MULTIPLIER-ACCUMULATOR USING RADIX-2 MODIF...
PDF
Ijarcet vol-2-issue-3-1036-1040
PDF
J045075661
PDF
International Journal of Computational Engineering Research(IJCER)
PDF
IRJET- Realization of Decimal Multiplication using Radix-16 Modified Booth En...
PDF
Q045079298
PDF
International Journal of Engineering and Science Invention (IJESI)
PDF
Iaetsd mac using compressor based multiplier and carry save adder
PDF
Ijetr011743
PDF
Comparative Design of Regular Structured Modified Booth Multiplier
PDF
IRJET- An Efficient Multiply Accumulate Unit Design using Vedic Mathematics A...
PDF
Implementation of area optimized low power multiplication and accumulation
PDF
High Performance MAC Unit for FFT Implementation
PDF
IRJET - Design of a Low Power Serial- Parallel Multiplier with Low Transition...
PDF
PDF
Project on digital vlsi design
PDF
A Review of Different Methods for Booth Multiplier
PDF
Lo3420902093
PDF
Fast Multiplier for FIR Filters
PDF
Implementation of Radix-4 Booth Multiplier by VHDL
EFFICIENT IMPLEMENTATION OF 16-BIT MULTIPLIER-ACCUMULATOR USING RADIX-2 MODIF...
Ijarcet vol-2-issue-3-1036-1040
J045075661
International Journal of Computational Engineering Research(IJCER)
IRJET- Realization of Decimal Multiplication using Radix-16 Modified Booth En...
Q045079298
International Journal of Engineering and Science Invention (IJESI)
Iaetsd mac using compressor based multiplier and carry save adder
Ijetr011743
Comparative Design of Regular Structured Modified Booth Multiplier
IRJET- An Efficient Multiply Accumulate Unit Design using Vedic Mathematics A...
Implementation of area optimized low power multiplication and accumulation
High Performance MAC Unit for FFT Implementation
IRJET - Design of a Low Power Serial- Parallel Multiplier with Low Transition...
Project on digital vlsi design
A Review of Different Methods for Booth Multiplier
Lo3420902093
Fast Multiplier for FIR Filters
Implementation of Radix-4 Booth Multiplier by VHDL
Ad

More from Association of Scientists, Developers and Faculties (20)

PDF
Core conferences bta 19 paper 12
PDF
Core conferences bta 19 paper 10
PDF
PDF
PDF
PDF
PDF
PDF
PDF
PDF
International Conference on Cloud of Things and Wearable Technologies 2018
PDF
A Typical Sleep Scheduling Algorithm in Cluster Head Selection for Energy Eff...
PDF
Application of Agricultural Waste in Preparation of Sustainable Construction ...
PDF
Survey and Research Challenges in Big Data
PDF
Asynchronous Power Management Using Grid Deployment Method for Wireless Senso...
Core conferences bta 19 paper 12
Core conferences bta 19 paper 10
International Conference on Cloud of Things and Wearable Technologies 2018
A Typical Sleep Scheduling Algorithm in Cluster Head Selection for Energy Eff...
Application of Agricultural Waste in Preparation of Sustainable Construction ...
Survey and Research Challenges in Big Data
Asynchronous Power Management Using Grid Deployment Method for Wireless Senso...

Recently uploaded (20)

PDF
SM_6th-Sem__Cse_Internet-of-Things.pdf IOT
PPTX
Foundation to blockchain - A guide to Blockchain Tech
PPTX
bas. eng. economics group 4 presentation 1.pptx
PPT
Project quality management in manufacturing
DOCX
573137875-Attendance-Management-System-original
PPTX
KTU 2019 -S7-MCN 401 MODULE 2-VINAY.pptx
PPTX
Engineering Ethics, Safety and Environment [Autosaved] (1).pptx
PDF
R24 SURVEYING LAB MANUAL for civil enggi
PPTX
Internet of Things (IOT) - A guide to understanding
PPTX
UNIT 4 Total Quality Management .pptx
PPTX
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
PDF
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
PPTX
Construction Project Organization Group 2.pptx
PDF
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
PPTX
Welding lecture in detail for understanding
PPTX
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
PPTX
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
PDF
Digital Logic Computer Design lecture notes
PDF
Enhancing Cyber Defense Against Zero-Day Attacks using Ensemble Neural Networks
PDF
Automation-in-Manufacturing-Chapter-Introduction.pdf
SM_6th-Sem__Cse_Internet-of-Things.pdf IOT
Foundation to blockchain - A guide to Blockchain Tech
bas. eng. economics group 4 presentation 1.pptx
Project quality management in manufacturing
573137875-Attendance-Management-System-original
KTU 2019 -S7-MCN 401 MODULE 2-VINAY.pptx
Engineering Ethics, Safety and Environment [Autosaved] (1).pptx
R24 SURVEYING LAB MANUAL for civil enggi
Internet of Things (IOT) - A guide to understanding
UNIT 4 Total Quality Management .pptx
M Tech Sem 1 Civil Engineering Environmental Sciences.pptx
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
Construction Project Organization Group 2.pptx
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
Welding lecture in detail for understanding
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
Digital Logic Computer Design lecture notes
Enhancing Cyber Defense Against Zero-Day Attacks using Ensemble Neural Networks
Automation-in-Manufacturing-Chapter-Introduction.pdf

Implementation of MAC using Modified Booth Algorithm

  • 1. International Conference on Information Engineering, Management and Security 2016 (ICIEMS 2016) 70 Cite this article as: R P Meenaakshi Sundhari, M Karthickumar, S Pavithra, E Madura. “Implementation of MAC using Modified Booth Algorithm”. International Conference on Information Engineering, Management and Security 2016: 70-76. Print. International Conference on Information Engineering, Management and Security 2016 [ICIEMS] ISBN 978-81-929866-4-7 VOL 01 Website iciems.in eMail iciems@asdf.res.in Received 02 – February – 2016 Accepted 15 - February – 2016 Article ID ICIEMS014 eAID ICIEMS.2016.014 Implementation of MAC using Modified Booth Algorithm R P Meenaakshi Sundhari1 , M Karthickumar2 , S Pavithra3 and E Madura4 1, 2, 4 Assistant Professor and 3 PG Scholar 1, 3, 4 Angel College of Engineering and Technology, Tirupur, India. 2 Erode Sengunthar Engineering College, Erode, India. Abstract- The proposed system is an efficient processing of 16-bit Multiplier Accumulator using Radix-8 and Radix-16 modified Booth Algorithm and other adders (SPST adder, Carry select adder, Parallel Prefix adder) using VHDL (Very High Speed Integrated Circuit Hardware Description Language). This proposed system provides low power, high speed and fewer delays. In both booth multipliers, comparison between the power consumption (mw) and estimated delay (ns) are calculated. The application of digital signal processing like fast fourier transform, finite impulse response and convolution needs high speed and low power MAC (Multiplier and Accumulator) units to construct an added. By reducing the glitches (from 1 to 0 transition) and spikes (from 0 to 1 transition), the speed of operation is improved and dynamic power is reduced. The adder designed with SPST avoids the unwanted glitches and spikes, reduce the switching power dissipation and the dynamic power. The speed can be improved by reducing the number of partial products to half, by grouping of bits in the multiplier term. 
The proposed Radix-8 and Radix-16 Modified Booth Algorithm MAC with SPST reduces the delay and obtain low power consumption as compared to array MAC. Keywords: Radix-8 modified booth algorithm Radix- 16 modified booth algorithm, Digital signal processing, VHDL (Very High Speed Integrated Circuit Hardware Description Language), Spurious Power Suppression Technique (SPST). I INTRODUCTION Multiplication is a fundamental operation in digital signal processing application which consumes more power and area. Consequently, there is a need for designing low power Booth Algorithm. Booth algorithm is a standard technique which provides significant improvement in terms of chip area and power compared to other multiplication techniques. The implementation of the multiplier depends on the type of adder which is used in the MAC unit. By combining the multiplication with the accumulation the development of a hybrid type of adders like Parallel prefix adder and Carry save adder, the performance has improved. Several commercial processors have selected the Radix-8 multiplier architecture to increase the speed of operation, thereby reducing the number of partial products in the multiplication terms. The Radix-8 encoding reduces the digit number length in a signed digit representation as compared to Radix-2 multiplication. Its performance is bottleneck by the generation of the term 3X (Multiplicand), also referred to as hard multiple. The proposed MAC unit accumulates intermediate result in the terms of sum and carries bits instead of the output of the final adder, which optimize the pipeline system to improve the overall performance. The modified Booth’s algorithm based on the Radix-8, generally called Booth-2, is the most popular approach for implementing the fast multipliers using parallel encoding. In general, multi-operand addition is the part of many complex arithmetic algorithms, such as multiplication and certain DSP algorithms. 
One of the most popular multi-operand adders is the carry-save adder which is capable of adding more than two operands at a time. This paper is prepared exclusively for International Conference on Information Engineering, Management and Security 2016 [ICIEMS 2016] which is published by ASDF International, Registered in London, United Kingdom under the directions of the Editor-in-Chief Dr. K. Saravanan and Editors Dr. Daniel James, Dr. Kokula Krishna Hari Kunasekaran and Dr. Saikishore Elangovan. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage, and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honoured. For all other uses, contact the owner/author(s). Copyright Holder can be reached at copy@asdf.international for distribution. 2016 © Reserved by Association of Scientists, Developers and Faculties [www.ASDF.international]
  • 2. International Conference on Information Engineering, Management and Security 2016 (ICIEMS 2016) 71 Cite this article as: R P Meenaakshi Sundhari, M Karthickumar, S Pavithra, E Madura. “Implementation of MAC using Modified Booth Algorithm”. International Conference on Information Engineering, Management and Security 2016: 70-76. Print. Figure 1. Hardware Architecture of General MAC Array Multiplier The objective of this paper is to introduce the flexibility of adding three-input operands to a regular adder, thereby reducing the need of a special adder to the same process. General architecture of MAC is shown in Figure 1. The proposed approach is implemented using VHDL design with ModelSim 6.5c software. This executes the multiplication operation by multiplying the multiplier and the multiplicand. Multiplier is considered as X and multiplicand is Y which is added to the previous multiplication result Z as an accumulation step. II Types of Adders A. SPST Adder In SPST Adder, the 16-bit adder /subtractor are divided into MSP (Most Significant Part) and LSP (Least Significant Part) between the 8th and 9th bits. Figure 2. Proposed Low Power SPST Equipped Multiplier The MSP of the original adder is modified to include the detection logical circuits, data controlling circuits, sign extension circuits, latch and clock circuits and logic for calculating carry-in and carry-out signals. Figure 2 shows the Proposed Low Power SPST Equipped Multiplier which consist of Latch, Detection logic and Sign extension logic. B. Carry Select Adder The 16-bit carry-select adder with a uniform block size of 4 can be created with three of these blocks and a 4- bit ripple carry adder is used. Since carry-in is known at the beginning of computation, a carry select block is not needed for first four bits. The delay of this adder will be four full adder delays, plus three MUX delays. The 16-bit carry-select adder with variable size can be similarly created shown in Figure 3. 
Here an adder with block sizes of 2-2-3-4-5 is used. This break-up is ideal when the full-adder delay equals the MUX delay, which is unlikely. The total delay is two full-adder delays plus four multiplexer delays.
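The two block arrangements described above can be checked with a small behavioural model. The following Python sketch is illustrative only and is not the paper's VHDL design: the function names `ripple_add`, `carry_select_add` and `csla_delay` are hypothetical, and the delay model assumes that every full adder costs `t_fa`, every multiplexer costs `t_mux`, and each later block computes its two candidate sums in parallel with the earlier blocks.

```python
def ripple_add(a_bits, b_bits, cin):
    """Ripple-carry add two equal-length, LSB-first bit lists."""
    out, carry = [], cin
    for x, y in zip(a_bits, b_bits):
        out.append(x ^ y ^ carry)
        carry = (x & y) | (carry & (x ^ y))
    return out, carry


def carry_select_add(a, b, block_sizes=(2, 2, 3, 4, 5), cin=0):
    """Add a and b with a carry-select structure; returns (sum, carry_out).

    The first block is a plain ripple-carry adder because its carry-in is
    known. Every later block is evaluated for carry-in 0 and carry-in 1,
    and the real incoming carry selects one result (the MUX).
    """
    width = sum(block_sizes)
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    result, carry, pos = [], cin, 0
    for k, size in enumerate(block_sizes):
        xa, xb = a_bits[pos:pos + size], b_bits[pos:pos + size]
        if k == 0:
            s, carry = ripple_add(xa, xb, carry)
        else:
            s0, c0 = ripple_add(xa, xb, 0)   # candidate for carry-in = 0
            s1, c1 = ripple_add(xa, xb, 1)   # candidate for carry-in = 1
            s, carry = (s1, c1) if carry else (s0, c0)
        result.extend(s)
        pos += size
    total = sum(bit << i for i, bit in enumerate(result))
    return total, carry


def csla_delay(block_sizes, t_fa=1, t_mux=1):
    """Worst-case delay under the simplified unit-delay model (assumption)."""
    t = block_sizes[0] * t_fa                 # first block ripples
    for size in block_sizes[1:]:
        # The block's own ripple runs in parallel; the MUX waits for both.
        t = max(t, size * t_fa) + t_mux
    return t
```

With equal unit delays, `csla_delay((4, 4, 4, 4))` gives 7 (four full-adder delays plus three MUX delays) and `csla_delay((2, 2, 3, 4, 5))` gives 6 (two full-adder delays plus four MUX delays), which agrees with the delays quoted above.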
Figure 3. Variable Sized CSLA

C. Parallel Binary Adder

The goal of this paper is to present architectures that give a regular adder the flexibility to increment or decrement the sum of two numbers by a constant during the addition process. This flexibility adds to the functionality of a regular adder while achieving performance comparable to conventional designs, thereby eliminating the need for a dedicated adder unit to perform the same tasks. When the third operand is a constant, a design that accomplishes three-input addition is required. These designs are called Enhanced Flagged Binary Adders (EFBA) and are shown in Figure 4. The paper also examines the performance of the adder when the operand size is expanded from 16 bits to 32 and 64 bits. A detailed analysis compares the performance of the new designs with carry-save adders in terms of delay, power dissipated and area consumed.

Figure 4. EFBA Block diagram

III Implementation

Booth multiplication is a technique that allows faster multiplication by grouping the multiplier bits. Grouping the multiplier bits with Radix-4 (modified) Booth encoding reduces the number of partial products to half. Conventional multiplication shifts and adds for every column of the multiplier term, multiplying by 1 or 0. Here, only every second column is taken and multiplied by ±1, ±2, or 0. The advantage of this method is the halving of the number of partial products. In Booth encoding the multiplier bits are grouped in blocks of three, such that each block overlaps the previous block by only one bit.
Grouping starts from the LSB side, and the first block uses only two bits of the multiplier term, with an implied 0 below the LSB. Figure 5 below shows the grouping of bits from the multiplier term. To obtain the correct partial product, each block is decoded from the grouped terms. Table I shows the encoding of the multiplier value Y, which uses the Modified Booth Algorithm and generates the following five signed digits: -2, -1, 0, +1, +2. Each encoded digit in the multiplier performs a certain operation on the multiplicand X.

Figure 5. Grouping of bits from the multiplier term in the multiplication operation
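The grouping described above (overlapping blocks of three bits, signed digits from -2 to +2) is the radix-4 form of modified Booth recoding, and it can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' VHDL: the names `BOOTH4_TABLE`, `booth_radix4_digits` and `booth_multiply` are hypothetical, operands are taken to be 16-bit two's-complement values, and, following the text, `y` is the multiplier and `x` the multiplicand.

```python
# Block (y[i+1], y[i], y[i-1]) -> signed digit = -2*y[i+1] + y[i] + y[i-1]
BOOTH4_TABLE = {
    0b000: 0, 0b001: +1, 0b010: +1, 0b011: +2,
    0b100: -2, 0b101: -1, 0b110: -1, 0b111: 0,
}


def booth_radix4_digits(y, width=16):
    """Recode a two's-complement multiplier into radix-4 Booth digits.

    Digits are returned least significant first. `width` must be even.
    """
    y &= (1 << width) - 1      # keep the two's-complement bit pattern
    padded = y << 1            # implied 0 below the LSB for the first block
    digits = []
    for i in range(0, width, 2):
        block = (padded >> i) & 0b111      # three bits, overlapping by one
        digits.append(BOOTH4_TABLE[block])
    return digits


def booth_multiply(x, y, width=16):
    """Multiply signed x by signed y using one partial product per digit."""
    product = 0
    for i, d in enumerate(booth_radix4_digits(y, width)):
        # Each partial product is 0, +/-x or +/-2x, weighted by 4**i.
        product += (d * x) * (4 ** i)
    return product
```

For example, `booth_radix4_digits(7)` returns `[-1, 2, 0, 0, 0, 0, 0, 0]`, since 7 = -1 + 2·4. A 16-bit multiplier therefore yields eight partial products instead of sixteen.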
Table I Encoded five signed digits

IV Modified Booth Algorithm for Radix-16

Enhancing parallelism decreases the number of subsequent calculation stages, so it is one way to realize high-speed multipliers. The Radix-4 Booth multiplier is the modified version of the conventional Booth algorithm (Radix-2). Radix-2 and Radix-8 multiplication generally requires some kind of carry-propagate adder, which increases latency, mainly because of the long wires required to propagate carries from the less significant to the more significant bits. High-speed modulo multipliers that use Booth encoding for partial product generation have been proposed; the Booth encoding technique reduces the number of partial products to be generated and accumulated. In Radix-4 Booth encoding, all modulo-reduced partial products can be generated by shifting and negation. Greater savings in area and dynamic power dissipation are feasible for large word-length multipliers by increasing the radix beyond four. In the Radix-8 Booth encoding method, shown in Figure 6, the number of partial products is reduced by two-thirds. However, this reduction in the number of partial products leads to increased complexity in their generation. Compared with many other arithmetic operations, multiplication is time consuming and power hungry.
Thus, enhancing the performance of the circuit and reducing the power dissipation are the most important design challenges for all applications in which multiplier units dominate the system performance and power dissipation. An effective way to increase the speed of the multiplier is to reduce the number of partial products. The number of partial products can be reduced with a higher-radix Booth encoder, but the number of hard multiples, which are costly to generate, increases at the same time. To increase speed and performance, many parallel MAC architectures have been proposed.

Figure 6. Block Diagram of Radix-8 MBA

There are two common approaches that use parallelism to enhance multiplication performance. The difference between the two is that the latter carries out accumulation by feeding back the final CSA (Carry Save Adder) output rather than the final adder result. The entire process of the parallel MAC is based on Radix-8 Booth encoding. The implementation results and the characteristics of the parallel MAC based on both Booth encodings are also presented.
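The trade-off noted above, fewer partial products at the price of hard multiples, can be seen in a behavioural sketch of radix-8 Booth recoding feeding a multiply-accumulate step. The Python below is an illustration under stated assumptions and does not model the carry-save feedback path or any other hardware detail of the paper's design: `booth_radix8_digits` and `mac_radix8` are hypothetical names, operands are 16-bit two's-complement values, and the multiplier is sign-extended to 18 bits so that it splits into six groups.

```python
def booth_radix8_digits(y, width=16):
    """Recode a two's-complement multiplier into radix-8 Booth digits.

    Each digit is -4*y[3i+2] + 2*y[3i+1] + y[3i] + y[3i-1], so it lies in
    -4..+4. Digits are returned least significant first.
    """
    n_digits = -(-width // 3)          # ceil(width / 3): 6 for 16 bits
    ext = 3 * n_digits                 # sign-extended width (18 for 16 bits)
    y &= (1 << width) - 1
    if y >> (width - 1):               # reinterpret the pattern as signed
        y -= 1 << width
    # Sign-extend to `ext` bits, then append the implied 0 below the LSB.
    pattern = (y & ((1 << ext) - 1)) << 1
    digits = []
    for i in range(n_digits):
        g = (pattern >> (3 * i)) & 0b1111   # four bits, overlapping by one
        b3, b2, b1, b0 = (g >> 3) & 1, (g >> 2) & 1, (g >> 1) & 1, g & 1
        digits.append(-4 * b3 + 2 * b2 + b1 + b0)
    return digits


def mac_radix8(x, y, z, width=16):
    """Return x*y + z, forming the product from radix-8 partial products."""
    acc = z                            # previous accumulation result
    for i, d in enumerate(booth_radix8_digits(y, width)):
        # d*x is 0, +/-x, +/-2x, +/-3x or +/-4x. The +/-3x case is the
        # "hard multiple": it needs an addition (x + 2x), not just a shift.
        acc += (d * x) * (8 ** i)
    return acc
```

Under these assumptions a 16-bit multiplier produces six partial products, compared with eight for radix-4 recoding, but the digit values ±3 require an extra adder to form 3X before the partial product can be selected.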
V Results

The simulation results for the 16-bit Radix-2 and Radix-8 modified Booth algorithm with three different adders and the MAC are shown below. Tables II and III show the synthesis report for the array MAC and for the Radix-2 and Radix-8 modified Booth algorithm with the adders used in the MAC. The code is downloaded onto the target device, Spartan 3E (Xc3s500eft2564), with the following settings: inputs (frequency of asynchronous nets set to 10 MHz), signals (frequency of asynchronous nets set to 10 MHz) and outputs (capacitive load of outputs set to 28000 pF). Table II compares the power consumption and estimated delay of the Radix-2 Modified Booth Algorithm with three different adders in the MAC. Table III shows the same comparison for Radix-8, using the same adders as in the Radix-2 MAC. The design summary and simulation results are also shown below.

Table II Comparison of radix-2 MBA

Device parameters | Array Multiplier & accumulator | SPST adder | Parallel prefix adder | Parallel Binary adder
Number of 4 input LUTs | 636 out of 29504 | 1093 out of 29504 | 1083 out of 29504 | 1222 out of 9312
Number of gate count for design | 4209 | 5987 | 7167 | 7155
Estimated delay (ns) | 217.8 | 39.69 | 24.936 | 66.10
Power consumption (mW) | 154 | 144 | 138.80 | 19.93

Table III Comparison of Radix-8 MBA

Device parameters | Array Multiplier & accumulator | SPST adder | Parallel prefix adder | Parallel Binary adder
Number of 4 input LUTs | 636 out of 29504 | 1093 out of 29504 | 1083 out of 29504 | 549 out of 9312
Number of gate count for design | 4209 | 5987 | 7167 | 3768
Estimated delay (ns) | 217.8 | 39.69 | 24.93 | 53.084
Power consumption (mW) | 154 | 144 | 138.80 | 16.533
Figure 7. Graphical comparison of different parameters of the adders

Figure 8. Simulation results for a 16-bit multiplier using radix-2 modified Booth algorithm with Parallel Prefix adder

Figure 9. Simulation results for a 16-bit multiplier using radix-8 modified Booth algorithm with Parallel Prefix adder

Figure 10. Design Summary of Radix-2 MBA for Parallel Prefix Adder

Figure 11. Design Summary of Radix-8 MBA for Parallel Prefix Adder
VI Conclusion

The different adders are compared on several measures, and each works well in either power dissipation or delay, so the performance of each adder differs from that of the others. The SPST adder avoids unwanted glitches and spikes, so switching power dissipation is minimized. The Radix-2 modified Booth algorithm reduces the number of partial products to half by grouping bits of the multiplier term in the multiplication operation, which improves the speed.

VII Future Scope

The modified Booth algorithm, which differs from the original Booth algorithm, is commonly used. The Radix-2 and Radix-8 Booth algorithms are used for the multiplication process; they shorten the critical path and reduce the power consumption. In this paper, a 16-bit Radix-8 Modified Booth Algorithm using the spurious power suppression technique is implemented, and a Radix-16 MBA is also implemented from the designed Radix-8 MBA. The benefits of miniaturization are high packing densities, good circuit speed and low power consumption. A fixed-width multiplier is required to maintain a fixed format and minimum accuracy loss in the output data.