Akanksha Yadav Int. Journal of Engineering Research and Applications www.ijera.com 
ISSN : 2248-9622, Vol. 4, Issue 11(Version - 6), November 2014, pp.37-41 
www.ijera.com 37 | P a g e 
High Speed Memory Efficient Multiplier-less 1-D 9/7 Wavelet Filters Based NEDA Technique
Akanksha Yadav1, Anushree2
Hindustan College Of Science & Technology Farah Mathura (India)
ABSTRACT: Conventional distributed arithmetic (DA) is popular in field-programmable gate array (FPGA) design, and it relies on on-chip ROM to achieve high speed and regularity. In this paper, we describe a high-speed, area-efficient 1-D discrete wavelet transform (DWT) using 9/7 filters based on the new efficient distributed arithmetic (NEDA) technique. Being an area-efficient architecture free of ROM, multiplication, and subtraction, NEDA can also expose the redundancy existing in the adder array, whose entries are 0 and 1. The architecture supports image pixel values of any size and any level of decomposition. The parallel structure has 100% hardware utilization efficiency.
Keywords: 1-D Discrete Wavelet Transform (DWT), NEDA, Low Pass Filter, High Pass Filter, Xilinx Simulation.
I. INTRODUCTION 
The well-known image coding standards MPEG-4 and JPEG 2000 have adopted the 1-D DWT as the transform coder due to its advantages over other transforms. For lossy compression, the Daubechies 9/7 biorthogonal filter is used as the default wavelet filter in JPEG 2000. Efficient implementation of the 1-D DWT using 9/7 filters in resource-constrained hand-held devices, with the capability for real-time processing of computation-intensive multimedia applications, is therefore a necessary challenge. A multiplier-less hardware implementation addresses this problem because of its scope for lower hardware complexity and higher computational throughput.
Several parallel and pipelined systems that meet the computational requirements of the discrete wavelet transform have been proposed. Some of them need a multiprocessor, which makes the system complex, time-consuming, and costly [1]. The field-programmable gate array (FPGA) provides a new approach to digital signal processing [2]. Several multiplier-based and multiplier-less designs have been proposed for the 1-D DWT, based on the principles of multiplier-based design (MBD), distributed arithmetic (DA), and canonic signed digit (CSD) representation [1]–[3]. One such DA structure distributes the bits of the fixed coefficients instead of the bits of the input samples; consequently, its adder complexity depends on the DA matrix of the fixed coefficients [2]. The CSD representation is popular because it represents a number with the minimum possible number of nonzero digits, hence the name canonic. The CSD representation of a number is unique, and CSD numbers cover the range (-4/3, 4/3), of which the values in the range [-1, 1) are of greatest interest. Martina et al. [5] approximated the 9/7 filter coefficients, noting that the performance of a hardware implementation of the 9/7 filter bank depends on the accuracy of the coefficient representation. By that approach, they significantly reduced the adder complexity of the 9/7 DWT. Gaurav et al. [7] suggested a LUT-less DA-based design for the implementation of the 1-D DWT. They eliminated the ROM cells required by DA-based structures at the cost of additional adders and multiplexers, so the adder complexity of this structure is significantly higher than that of the other multiplier-less structures. Other designs need ROM, which makes the system complex, time-consuming, and costly [4].
In this paper, we propose an efficient scheme to derive NEDA-based bit-parallel structures for low-hardware, high-speed computation of the DWT using 9/7 filters [4]. The remainder of the paper is organized as follows. NEDA-based computation of the 1-D DWT using 9/7 filters is presented in Section II. The proposed structures are presented in Section III. The hardware and time complexity of the proposed structures are discussed and compared with existing structures in Section IV. The conclusion is presented in Section V.
II. NEW EFFICIENT DISTRIBUTED ARITHMETIC (NEDA)
Let us consider the following sum of products [4]:
RESEARCH ARTICLE OPEN ACCESS
$$R = \sum_{k=1}^{L} X_k Y_k \tag{1}$$
where $X_k$ are the fixed coefficients and $Y_k$ are the input data words. Equation (1) can be expressed in the form of a matrix product as:
$$R = \begin{bmatrix} X_1 & X_2 & \cdots & X_L \end{bmatrix}
\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_L \end{bmatrix} \tag{2}$$
Both $X_k$ and $Y_k$ are in two's complement format. The two's complement representation of $X_k$ may be expressed as
$$X_k = -X_k^{M}\,2^{M} + \sum_{i=N}^{M-1} X_k^{i}\,2^{i} \tag{3}$$
where $X_k^{i} = 0$ or $1$ for $i = N, N+1, \ldots, M$; $X_k^{M}$ is the sign bit and $X_k^{N}$ is the least significant bit (LSB).
Equation (3) can be expressed in matrix form as:
$$X_k = \begin{bmatrix} 2^{N} & 2^{N+1} & \cdots & -2^{M} \end{bmatrix}
\begin{bmatrix} X_k^{N} \\ X_k^{N+1} \\ \vdots \\ X_k^{M} \end{bmatrix} \tag{4}$$
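Similarly, $Y_k$ can be represented in two's complement format as:
$$Y_k = -Y_k^{X}\,2^{X} + \sum_{i=W}^{X-1} Y_k^{i}\,2^{i} \tag{5}$$
where $Y_k^{i} = 0$ or $1$ for $i = W, W+1, \ldots, X$; $Y_k^{X}$ is the sign bit and $Y_k^{W}$ is the least significant bit (LSB). Here the superscript $X$ denotes the sign-bit position of the input word and is distinct from the coefficient $X_k$.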
Combining equations (1) and (3), we get
$$R = -\left(R^{M}\cdot 2^{M}\right) + \sum_{i=N}^{M-1}\left(R^{i}\cdot 2^{i}\right) \tag{6}$$
where
$$R^{i} = \sum_{k=1}^{L} X_k^{i}\,Y_k, \qquad i = N, N+1, \ldots, M.$$
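Equation (6) can be checked numerically. The Python sketch below is only an illustration under stated assumptions, not the authors' implementation: it assumes integer coefficients with N = 0, and the function names and sample values are hypothetical. Each $R^{i}$ is formed by adding the inputs selected by bit $i$ of the coefficients, and the bit planes are then weighted by $2^{i}$, with a negative weight for the sign bit.

```python
def bit_plane(x: int, i: int) -> int:
    """Bit i of x in two's complement (Python ints sign-extend under >>)."""
    return (x >> i) & 1


def neda_inner_product(coeffs, inputs, msb):
    """Evaluate R = sum_k X_k * Y_k through Eq. (6), with N = 0 and M = msb.

    Assumes every coefficient fits in a signed word of msb + 1 bits.
    """
    total = 0
    for i in range(msb + 1):
        # R^i: add the inputs whose coefficient has a 1 in bit position i.
        r_i = sum(y for x, y in zip(coeffs, inputs) if bit_plane(x, i))
        # The sign-bit plane (i == msb) has weight -2^M, the others +2^i.
        weight = -(1 << i) if i == msb else (1 << i)
        total += weight * r_i
    return total


coeffs = [71, -38, -4, 6]   # illustrative 8-bit coefficients, so msb = 7
inputs = [1, 2, 3, 4]       # illustrative input words
print(neda_inner_product(coeffs, inputs, msb=7))              # 7
print(sum(x * y for x, y in zip(coeffs, inputs)))             # 7 (direct check)
```

No multiplication by a coefficient is needed in the loop: the coefficient bits only select which inputs are added, and the weights are powers of two, which hardware realizes as shifts.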
III. PROPOSED ARCHITECTURE 
In this paper, we propose a high-speed, area-efficient, multiplier-less 1-D 9/7 wavelet filter architecture based on the NEDA technique. The 9/7 wavelet filter bank has 9 low-pass and 7 high-pass coefficients; because both filters are symmetric, only the 5 distinct low-pass and 4 distinct high-pass coefficients are listed in Table 1. We multiply the filter coefficients by 128 and round to the nearest integer, so that each coefficient fits in an 8-bit two's complement word. The calculation of the 1-D high-pass filter output is explained by an example.
Table 1: Low-pass and high-pass wavelet filter coefficients.
Coefficient   Value             Multiplied by 128   8-bit binary (two's complement for negative numbers)
h0             0.60294901823     77                 01001101
h1             0.26686441184     34                 00100010
h2            -0.07822326652    -10                 11110110
h3            -0.01686411844     -2                 11111110
h4             0.026748757410     3                 00000011
g0             0.55754352622     71                 01000111
g1            -0.29563588155    -38                 11011010
g2            -0.02877176311     -4                 11111100
g3             0.045635881557     6                 00000110
Here h0, h1, h2, h3 and h4 are the low-pass filter coefficients, and g0, g1, g2 and g3 are the high-pass filter coefficients.
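The scaling step can be sketched in a few lines of Python. This is a minimal illustration that assumes round-to-nearest quantization, which is consistent with the integers listed in Table 1; the helper names `quantize` and `to_twos_complement` are hypothetical.

```python
def quantize(coefficient: float, scale: int = 128) -> int:
    """Scale a real coefficient and round it to the nearest integer."""
    return round(coefficient * scale)


def to_twos_complement(value: int, bits: int = 8) -> str:
    """Return the two's complement bit string of a signed integer."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError(f"{value} does not fit in {bits} bits")
    # Masking maps negative values onto their two's complement bit pattern.
    return format(value & ((1 << bits) - 1), f"0{bits}b")


high_pass = [0.55754352622, -0.29563588155, -0.02877176311, 0.045635881557]
scaled = [quantize(g) for g in high_pass]
print(scaled)                                   # [71, -38, -4, 6]
print([to_twos_complement(q) for q in scaled])
# ['01000111', '11011010', '11111100', '00000110']
```

A scale factor of 128 keeps every scaled coefficient inside the signed 8-bit range, at the cost of a quantization error of at most half a unit per coefficient.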
If we multiply the high-pass coefficients $g_0$, $g_1$, $g_2$ and $g_3$ by $r_1$, $r_2$, $r_3$ and $r_4$, we get the high-pass output $Y_H$ of the 9/7 filter as [6]:
$$Y_H = \begin{bmatrix} g_0 & g_1 & g_2 & g_3 \end{bmatrix}
\begin{bmatrix} r_1 \\ r_2 \\ r_3 \\ r_4 \end{bmatrix} \tag{7}$$
where
$$r_1 = Y(n) + Y(n-6), \quad r_2 = Y(n-1) + Y(n-5), \quad r_3 = Y(n-2) + Y(n-4), \quad r_4 = Y(n-3).$$
Let $r_1 = 1$, $r_2 = 2$, $r_3 = 3$, $r_4 = 4$; then
$$Y_H = \begin{bmatrix} 71 & -38 & -4 & 6 \end{bmatrix}
\begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \end{bmatrix} = 7 \tag{8}$$
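The folding of the seven input samples into $r_1$ to $r_4$ and the direct evaluation of Eq. (8) can be sketched as follows. The sketch assumes that each pair of samples is added, as is usual for a symmetric filter, and that the coefficients are the values scaled by 128; the function names and the sample window are hypothetical and chosen so that the folded inputs equal 1, 2, 3 and 4.

```python
G_SCALED = (71, -38, -4, 6)   # g0..g3 multiplied by 128


def fold_inputs(window):
    """Fold the samples [Y(n), Y(n-1), ..., Y(n-6)] into r1..r4."""
    if len(window) != 7:
        raise ValueError("expected 7 samples")
    r1 = window[0] + window[6]
    r2 = window[1] + window[5]
    r3 = window[2] + window[4]
    r4 = window[3]            # the centre sample has no partner
    return [r1, r2, r3, r4]


def high_pass_output(window):
    """Reference result computed with ordinary multiplications."""
    return sum(g * r for g, r in zip(G_SCALED, fold_inputs(window)))


window = [1, 1, 2, 4, 1, 1, 0]      # hypothetical samples Y(n) .. Y(n-6)
print(fold_inputs(window))          # [1, 2, 3, 4]
print(high_pass_output(window))     # 7
```

Folding halves the number of products that the NEDA stage has to realize, since each coefficient is applied once to a pre-added pair of samples.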
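If we implement this with NEDA, the coefficients are written in 8-bit two's complement form:
$$Y_H = \begin{bmatrix} \mathtt{01000111} & \mathtt{11011010} & \mathtt{11111100} & \mathtt{00000110} \end{bmatrix}
\begin{bmatrix} r_1 \\ r_2 \\ r_3 \\ r_4 \end{bmatrix} \tag{9}$$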
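The DA matrix is formed from the bits of the filter coefficients, with one column per coefficient and one row per bit position (sign bit in the top row, LSB in the bottom row):
$$B_k = \begin{bmatrix}
0 & 1 & 1 & 0 \\
1 & 1 & 1 & 0 \\
0 & 0 & 1 & 0 \\
0 & 1 & 1 & 0 \\
0 & 1 & 1 & 0 \\
1 & 0 & 1 & 1 \\
1 & 1 & 0 & 1 \\
1 & 0 & 0 & 0
\end{bmatrix} \tag{10}$$
Multiplying the DA matrix by the input vector gives the partial sums $P_8, \ldots, P_1$ that the adder array must produce for $Y_H$:
$$\begin{bmatrix} P_8 \\ P_7 \\ P_6 \\ P_5 \\ P_4 \\ P_3 \\ P_2 \\ P_1 \end{bmatrix}
= B_k \begin{bmatrix} r_1 \\ r_2 \\ r_3 \\ r_4 \end{bmatrix}
= \begin{bmatrix}
r_2 + r_3 \\
r_1 + r_2 + r_3 \\
r_3 \\
r_2 + r_3 \\
r_2 + r_3 \\
r_1 + r_3 + r_4 \\
r_1 + r_2 + r_4 \\
r_1
\end{bmatrix} \tag{11}$$
Following Eq. (6), the output is $Y_H = -2^{7} P_8 + \sum_{i=1}^{7} 2^{i-1} P_i$. The sum $r_2 + r_3$ appears in four of the eight rows, so it needs to be computed only once; this is the redundancy in the adder array that NEDA exposes.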
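The construction of the DA matrix and of the adder-array outputs can be reproduced with a short Python sketch. It is a hedged illustration rather than the authors' hardware description: rows are stored LSB first (the reverse of the printed matrix), and the names `da_matrix` and `partial_sums` are hypothetical.

```python
def da_matrix(coeffs, bits=8):
    """Rows are bit positions (LSB first); columns are the coefficients."""
    return [[(c >> i) & 1 for c in coeffs] for i in range(bits)]


def partial_sums(matrix, r):
    """One adder-array output per row: the sum of the inputs selected by a 1."""
    return [sum(x for bit, x in zip(row, r) if bit) for row in matrix]


coeffs = [71, -38, -4, 6]     # scaled high-pass coefficients g0..g3
r = [1, 2, 3, 4]              # folded inputs r1..r4 of the worked example

B = da_matrix(coeffs)
P = partial_sums(B, r)        # P[0] is the LSB row, P[7] the sign-bit row

for row in reversed(B):       # print the sign-bit row first
    print(row)
print(P)                      # [1, 7, 8, 5, 5, 3, 6, 5]

# Identical rows need only one adder chain, which is the redundancy NEDA removes.
print(len({tuple(row) for row in B}))   # 6 distinct rows out of 8
```

Counting distinct rows is only a first-order view of the redundancy; shared sub-sums such as `r2 + r3` inside `r1 + r2 + r3` allow further savings.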
In Figure 2, the NEDA technique is applied in steps. In step 1, all inputs are converted to binary numbers. In step 2, the binary inputs are sign-extended. The sign-extended inputs are then applied to the adder array, which for the example inputs gives
$P_1 = 00001$, $P_2 = 00111$, $P_3 = 01000$, $P_4 = 00101$, $P_5 = 00101$, $P_6 = 00011$, $P_7 = 00110$, $P_8 = 00101$.
Figure 1: Proposed multiplier-less 9/7 wavelet filter using the NEDA technique (block labels: input Y(n), delayed samples Y(n-1) to Y(n-8), adders A, two NEDA Technique blocks, outputs YL and YH).
The adder-array outputs are applied to the MUX one at a time, starting with $P_1$. At each step the accumulated output is shifted right by 1 bit and the next MUX output is added. In the listings below, the bit before the apostrophe is the one introduced by the shift, and the decimal value of each result is given in parentheses.
YP(0) = MUX(1) = P1 = 00001
YP(1) = YP(0) shifted right by 1 bit + MUX(2):
    0'00001
  + 0 0111
  = 0 01111            (15)
YP(2) = YP(1) shifted right by 1 bit + MUX(3):
    0'001111
  + 0 1000
  = 0 101111           (47)
YP(3) = YP(2) shifted right by 1 bit + MUX(4):
    0'0101111
  + 0 0101
  = 0 1010111          (87)
YP(4) = YP(3) shifted right by 1 bit + MUX(5):
    0'01010111
  + 0 0101
  = 0 10100111         (167)
YP(5) = YP(4) shifted right by 1 bit + MUX(6):
    0'010100111
  + 0 0011
  = 0 100000111        (263)
YP(6) = YP(5) shifted right by 1 bit + MUX(7):
    0'0100000111
  + 0 0110
  = 0 1010000111       (647)
YP(7) = YP(6) shifted right by 1 bit + two's complement of MUX(8), because $P_8$ belongs to the sign-bit row:
    0'01010000111
  + 1 1011             (two's complement of P8 = 00101)
  = 1 000000000111
The carry is rejected, so the total output is YP(7) = 000000000111 = 7, which agrees with Eq. (8).
Figure 2: Mathematical calculation of the NEDA technique of the low-pass wavelet filter output (block labels: inputs r1 to r4, sign extension, adder-array outputs P1 to P8, "1", MUX, right shift by 1 bit).
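The shift-and-add sequence above can be replayed in software. The Python sketch below is an illustration under stated assumptions: every partial sum is taken to be non-negative and to fit in 5 bits, and the function name `shift_accumulate` is hypothetical. Instead of shifting the accumulator right, it keeps all low-order bits and adds each partial sum at increasing weight, which yields the same bit patterns as the worked example.

```python
def shift_accumulate(partial, width=5):
    """Combine adder-array outputs P1..P8 (LSB row first, sign row last).

    Returns the intermediate values YP(0)..YP(6) and the final signed output.
    """
    steps = len(partial) - 1                 # the sign row has weight 2**steps
    total_bits = width + steps               # 12 bits for width=5 and 8 rows
    acc = 0
    trace = []
    for i, p in enumerate(partial[:-1]):
        # Shifting the accumulator right and adding P(i+1) at the top is the
        # same as adding P(i+1) * 2**i when all low-order bits are kept.
        acc += p << i
        trace.append(acc)
    # Sign row: add the two's complement of P8 instead of subtracting it.
    negated = (-partial[-1]) & ((1 << width) - 1)
    acc += negated << steps
    acc &= (1 << total_bits) - 1             # reject the carry out of the top bit
    if acc >= 1 << (total_bits - 1):         # reinterpret as a signed word
        acc -= 1 << total_bits
    return trace, acc


trace, y_h = shift_accumulate([1, 7, 8, 5, 5, 3, 6, 5])
print(trace)   # [1, 15, 47, 87, 167, 263, 647]
print(y_h)     # 7
```

Adding the two's complement of the sign-row sum is what lets the structure avoid a subtractor: the final step is still an addition, and discarding the carry restores the correct signed result.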
IV. SIMULATION RESULT 
The proposed architecture has very low hardware complexity compared to DA-based structures, because conventional DA requires ROM. In the proposed architecture, the high-pass and low-pass wavelet filter outputs are calculated using the NEDA scheme, which does not require ROM. The proposed structure consists of only 33 adders, zero MUX and 29 registers. The proposed architecture is better than the other architectures, as shown in Table 2.
Table 2: Comparison of the proposed architecture with existing architectures
Arch.: architecture, MUL: multiplier, MUX: multiplexer, ROM: read-only memory, REG: register, CP: cycle period
Arch.                MUL   Adder   MUX   ROM   REG   CP
Alam et al. [2]      0     43      9     4     8     12 TA
Martina et al. [5]   0     36      5     4     8     9 TA
Martina et al. [6]   0     36      4     4     8     6 TA
Gaurav et al. [7]    0     30      1     4     8     6 TA
Proposed             0     30      1     0     8     6 TA
V. CONCLUSION
We propose a novel distributed arithmetic paradigm named NEDA for VLSI implementation of digital signal processing (DSP) algorithms involving inner products of vectors and vector-matrix multiplication. A mathematical proof is given for the validity of the NEDA scheme. We demonstrate that NEDA is a very efficient architecture with adders as the main component and free of ROM (memory-free), multiplication, and subtraction. For the adder array, a systematic approach is introduced to remove the potential redundancy so that a minimum number of additions is necessary. NEDA is an accuracy-preserving scheme and is capable of maintaining satisfactory performance even at low DA precision.
REFERENCES
[1] S. G. Mallat, "A Theory for Multiresolution Signal Decomposition: The Wavelet Representation," IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 11, no. 7, pp. 674-693, July 1989.
[2] M. Alam, C. A. Rahman, and G. Jullien, "Efficient distributed arithmetic based DWT architectures for multimedia applications," in Proc. IEEE Workshop on SoC for Real-Time Applications, pp. 333-336, 2003.
[3] X. Cao, Q. Xie, C. Peng, Q. Wang, and D. Yu, "An efficient VLSI implementation of distributed architecture for DWT," in Proc. IEEE Workshop on Multimedia and Signal Process., pp. 364-367, 2006.
[4] Archana Chidanandan and Magdy Bayoumi, "Area-Efficient NEDA Architecture for the 1-D DCT/IDCT," ICASSP 2006.
[5] M. Martina and G. Masera, "Low-complexity, efficient 9/7 wavelet filters VLSI implementation," IEEE Trans. on Circuits and Syst. II, Express Briefs, vol. 53, no. 11, pp. 1289-1293, Nov. 2006.
[6] M. Martina and G. Masera, "Multiplierless, folded 9/7-5/3 wavelet VLSI architecture," IEEE Trans. on Circuits and Syst. II, Express Briefs, vol. 54, no. 9, pp. 770-774, Sep. 2007.
[7] Gaurav Tewari, Santu Sardar, and K. A. Babu, "High-Speed & Memory Efficient 2-D DWT on Xilinx Spartan3A DSP using scalable Polyphase Structure with DA for JPEG2000 Standard," 978-1-4244-8679-3/11/$26.00 ©2011 IEEE.
[8] B. K. Mohanty and P. K. Meher, "Memory Efficient Modular VLSI Architecture for High-throughput and Low-Latency Implementation of Multilevel Lifting 2-D DWT," IEEE Transactions on Signal Processing, vol. 59, no. 5, May 2011.
[9] B. K. Mohanty and P. K. Meher, "Memory-Efficient High-Speed Convolution-based Generic Structure for Multilevel 2-D DWT," IEEE Transactions on Circuits and Systems for Video Technology.
[10] B. K. Mohanty and P. K. Meher, "Efficient Multiplierless Designs for 1-D DWT using 9/7 Filters Based on Distributed Arithmetic," ISIC 2009.
