FPGA-Based Multi-Level Approximate Multipliers for High-Performance Error-Resilient Applications

Objective:
The main objective of the paper is to design approximate multipliers that can be deployed efficiently on
Field Programmable Gate Arrays (FPGAs) at different levels of accuracy by using newly proposed
approximate logic compressors.
Introduction:
In many error-resilient applications, such as multimedia, data mining, image processing, and machine
learning, precise computations are not always necessary. Results with some degradation of accuracy can
still be acceptable and meaningful for such applications. This property makes it possible to trade
accuracy against the electrical performance of a circuit: some accuracy is sacrificed in exchange for
lower power dissipation, smaller occupied area, and shorter delay. In this paper, approximate 8×8, 16×16,
and 32×32 multipliers with different accuracies are implemented efficiently on FPGAs by using the
proposed approximate compressors.
Exact and Approximate 4–2 Compressors:
A conventional exact 4–2 compressor consists of two full adders as shown in the figure below. It has a
total of 5 inputs and 3 outputs. The output sum has the same weight of 1 as the inputs while the
outputs cout and carry have a weight of 2. The reduction tree of an 8×8 multiplier is illustrated in the
second figure in which the dashed shapes represent half and full adders and the solid shapes are the exact
4–2 compressors.
4–2 compressors: (a) a conventional exact 4–2 compressor, (b) an approximate 4–2 compressor.

The partial product reduction tree of an 8×8 multiplier using conventional 4–2 compressors (solid),
half-adders, and full-adders (dashed).
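The behavior of the conventional exact 4–2 compressor described above can be checked with a small bit-level model. The following is only a minimal sketch in Python, assuming the usual cascade of two full adders; the function names are illustrative and do not come from the paper. It confirms that the five input bits always equal sum + 2 × (carry + cout).

```python
from itertools import product


def full_adder(a, b, c):
    """Return (sum, carry) for three input bits."""
    s = a ^ b ^ c
    carry = (a & b) | (a & c) | (b & c)
    return s, carry


def exact_compressor_4_2(x1, x2, x3, x4, cin):
    """Exact 4-2 compressor modeled as two cascaded full adders.

    Returns (sum, carry, cout): sum has weight 1, carry and cout have weight 2.
    """
    s1, cout = full_adder(x1, x2, x3)    # first full adder
    total, carry = full_adder(s1, x4, cin)  # second full adder
    return total, carry, cout


def check_exact_compressor():
    """Check all 32 input patterns against the arithmetic definition."""
    for bits in product((0, 1), repeat=5):
        s, carry, cout = exact_compressor_4_2(*bits)
        if sum(bits) != s + 2 * (carry + cout):
            return False
    return True


if __name__ == "__main__":
    print(check_exact_compressor())
```

Because the check is exhaustive over all 32 input patterns, it also documents why the three outputs need two different weights: five input bits can sum to at most 5, which needs one weight-1 bit and two weight-2 bits.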
An approximate 4–2 compressor has two outputs that approximately count the 1s at its four inputs. The
block diagram of an approximate 4–2 compressor is illustrated above. The two outputs can have the same
weight or different weights. Unlike the conventional exact 4–2 compressor, the approximate compressor
has no carry input. This shortens the delay and reduces the occupied area and power consumption
drastically.
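To make the idea of "approximately counting the 1s" concrete, the sketch below models one possible approximate 4–2 compressor with no carry input and two outputs of weight 1 and 2. The logic equations are a hypothetical example chosen for illustration; they are not the paper's CP1 to CP4 designs, whose exact equations are not given in this summary. The script lists the input patterns for which the approximate count differs from the true count.

```python
from itertools import product


def approx_compressor_4_2(x1, x2, x3, x4):
    """Illustrative approximate 4-2 compressor (hypothetical, not CP1..CP4).

    Returns (low, high): low has weight 1, high has weight 2.
    """
    low = (x1 ^ x2) | (x3 ^ x4)
    high = (x1 & x2) | (x3 & x4)
    return low, high


def error_table():
    """Map each erroneous input pattern to (approximate - exact) count."""
    errors = {}
    for bits in product((0, 1), repeat=4):
        low, high = approx_compressor_4_2(*bits)
        err = (low + 2 * high) - sum(bits)
        if err != 0:
            errors[bits] = err
    return errors


if __name__ == "__main__":
    for pattern, err in sorted(error_table().items()):
        print(pattern, err)
```

For this particular example, 5 of the 16 input patterns are miscounted, and every error is an underestimate. A real design would pick its equations so that the erroneous patterns are unlikely for partial-product bits and map well onto FPGA look-up tables.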
Approximate 8×8 Multipliers:
The proposed 8×8 approximate multiplier implementations consist of several steps: partial product
generation, reduction of the partial product matrix (PPM) at the first stage by using approximate 4–2
compressors in class 2, PPM reduction at the following stages n (n > 1) by using approximate 4–2
compressors in class 1, and generation of the final result by using a ripple carry adder (RCA). Note
that partial product generation and the first-stage PPM reduction are combined to save hardware
resources, which also helps reduce power consumption. This is a considerable difference between the
proposed FPGA-based implementations and ASIC-based ones.
A dot diagram of the proposed 8×8 approximate multiplier is illustrated below. Different kinds of
approximate compressors are applied to compress the PPM at different reduction stages, each chosen
according to its hardware cost and accuracy. For example, the approximate 4–2 compressors in class 2 are
applied to compress the PPM at the first stage since they require few hardware resources and dissipate
less power. The approximate 4–2 compressors in class 1 are suited to reducing the PPM height at stages
n (n > 1) because they show considerably higher accuracy and consume less dynamic power.

The dot diagram of the proposed approximate 8×8 multiplier. Approximate 4–2 compressors CP3
(CP4) are used to reduce the PPM at the first reduction stage while the compressors CP1 (CP2)
are utilized for the reduction stage 2.
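The shape of the dot diagram follows directly from how the partial products are generated: bit i of one operand ANDed with bit j of the other lands in column i + j. The sketch below is a small Python illustration of this, with hypothetical helper names; it builds the PPM column by column, reports the column heights, and checks that the weighted column sums reproduce the exact product.

```python
def partial_product_columns(a, b, width=8):
    """Return the PPM as a list of columns; column c holds the bits of weight 2**c."""
    columns = [[] for _ in range(2 * width - 1)]
    for i in range(width):
        for j in range(width):
            bit = ((a >> i) & 1) & ((b >> j) & 1)
            columns[i + j].append(bit)
    return columns


def column_heights(width=8):
    """Number of dots in each column of the dot diagram, LSB column first."""
    return [len(col) for col in partial_product_columns(0, 0, width)]


def weighted_sum(columns):
    """Accumulate all columns exactly; this equals a * b."""
    return sum(sum(col) << c for c, col in enumerate(columns))


if __name__ == "__main__":
    print(column_heights())
    print(weighted_sum(partial_product_columns(200, 123)) == 200 * 123)
```

For an 8×8 multiplier the heights rise from 1 to 8 and fall back to 1 across 15 columns, which is why the middle columns need the most compression.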
Moreover, the proposed approximate multiplier is divided into two parts with different accuracies. The
most significant partial products are accumulated by using accurate compressors and adders, while the
least significant partial products are accumulated by using the presented approximate compressors. In
this work, different configurations of approximate multipliers with various accuracies were implemented.
For example, in the above figure the approximate multiplier is implemented with the 5 leftmost columns
of the partial products compressed by accurate compressors and adders. An RCA adds the last two rows of
the final stage to generate the final result, since the dedicated RCA is known to be one of the fastest
adders on FPGAs to date.
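The effect of splitting the multiplier into an accurate and an approximate part can be explored with a simple behavioral model. The sketch below is a strong simplification and rests on several assumptions: it uses a hypothetical approximate 4–2 compressor rather than the paper's CP1 to CP4, it compresses each approximate column in a single pass instead of modeling the multi-stage reduction tree, and the function `approx_multiply` and its parameters are illustrative names. The parameter `k` is the number of leftmost PPM columns accumulated exactly.

```python
def approx_compressor_4_2(x1, x2, x3, x4):
    """Hypothetical approximate 4-2 compressor; returns its approximate count."""
    low = (x1 ^ x2) | (x3 ^ x4)    # weight 1
    high = (x1 & x2) | (x3 & x4)   # weight 2
    return low + 2 * high


def approx_multiply(a, b, k, width=8):
    """Behavioral model: the k leftmost PPM columns are exact, the rest approximate."""
    n_cols = 2 * width - 1
    result = 0
    for c in range(n_cols):
        # Partial-product bits of weight 2**c.
        bits = [((a >> i) & 1) & ((b >> (c - i)) & 1)
                for i in range(width) if 0 <= c - i < width]
        if c >= n_cols - k:
            value = sum(bits)                      # accurate part
        else:
            full = len(bits) - len(bits) % 4       # bits covered by whole groups of 4
            value = sum(approx_compressor_4_2(*bits[g:g + 4])
                        for g in range(0, full, 4))
            value += sum(bits[full:])              # leftover bits are counted exactly
        result += value << c
    return result


if __name__ == "__main__":
    a, b = 201, 173
    for k in (0, 5, 10, 15):
        print(k, approx_multiply(a, b, k), a * b)
```

With `k = 15` every column is exact and the model returns the true product; lowering `k` moves more columns into the approximate part, so the error can only grow or stay the same in magnitude for this compressor, which never overestimates.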
Approximate 8×8 Multiplier Configurations
The proposed multi-level approximate architecture improves the accuracy, shortens the delay, and reduces
the power dissipation and the occupied area of the approximate multiplier. All the proposed approximate
8×8 multipliers with different accuracy configurations are summarized in the above table. The
M8_CP13_k group includes approximate 8×8 multipliers that use the approximate 4–2 compressor CP3 at the
first reduction stage and the compressor CP1 at the second reduction stage. Here k is the number of
leftmost partial product columns compressed by accurate compressors, full-adders, and half-adders.

In this work, the approximate 4–2 compressor CP2 is not used to construct approximate 8×8 multipliers.
Since only a small number of approximate 4–2 compressors are used in reduction stage 2, the reduction in
dynamic power dissipation would be small while the loss of accuracy could be considerable. The
effectiveness of CP2 for dynamic power reduction is instead evaluated on multipliers with larger operand
sizes (i.e., 16×16 and 32×32 multipliers).
Conclusion:
In this paper, the proposed approximate 8×8, 16×16, and 32×32 multipliers with different accuracies
were efficiently implemented on FPGAs by using the proposed approximate compressors. The multipliers
were evaluated at both the circuit level and the application level to demonstrate their effectiveness
and applicability. This was the first work to implement approximate multipliers and measure their
dynamic power consumption on an FPGA board. Finally, the proposed multipliers were shown to be suitable
for high-performance and low-power error-resilient applications.
