International Journal of Engineering Science Invention
ISSN (Online): 2319 – 6734, ISSN (Print): 2319 – 6726
www.ijesi.org || Volume 5 Issue 5 || May 2016 || PP.1-7
New binary memristor crossbar architecture
based neural networks for speech recognition
Van-Tien Nguyen¹, Minh-Huan Vo²
¹,²Department of Electrical Electronic Engineering, HCMC University of Technology and Education, Viet Nam.
ABSTRACT: In this paper, we propose a new binary memristor crossbar architecture based on neural networks for speech recognition. The circuit can recognize five vowels. The proposed crossbar was tested with 1,000 speech samples and recognized 94% of them. We use Monte Carlo simulation to estimate the recognition rate under memristance variation: as the percentage variation in memristance increases from 0% to 15%, the recognition rate degrades from 94% to 82%.
KEYWORDS – Memristors, Crossbar, Speech recognition, Binary memristors.
I. INTRODUCTION
According to Moore's Law, the transistor count per chip doubles about every two years. IC technology development in recent years has confirmed Moore's Law. However, current estimates suggest that the technology will reach the limit of Moore's Law in the near future, meaning that device dimensions will reach critical values below which accuracy and stability can no longer be ensured. Many new approaches are therefore being studied to continue progress beyond Moore's Law. One of them is the memristor, which Leon Chua proposed in 1971 [1] and which a team led by Stanley Williams realized in 2008 [2]. This device has opened up new directions for development and is considered a promising alternative to the CMOS technology that has thrived so far.
Anyone with basic knowledge of electrical engineering knows that there are four fundamental circuit variables: current i, voltage v, charge q and flux φ. With these four variables, there are six possible pairwise relations. Five of these are well understood, and three of them define the passive two-terminal fundamental circuit elements, namely the resistor R, the capacitor C and the inductor L. Unlike active components, which can generate energy, these three elements can only store or dissipate energy. The relationships between voltage and current, voltage and charge, and current and flux are defined by the resistor, the capacitor and the inductor, respectively. No device related charge and magnetic flux until Leon Chua introduced a new circuit element called the "memristor". In 2008, a research group at HP Labs led by Stanley Williams succeeded in fabricating the device at nanometer scale. Since then, research on memristors has gained momentum and the number of publications has grown rapidly. There are two types of memristor: analog and binary. An analog memristor changes its memristance depending on the voltage or current applied to it, but setting its memristance to an exact value is difficult. By contrast, the memristance of a binary memristor is easy to set and more exact. A binary memristor has two states, a high-resistance state (HRS) or a low-resistance state (LRS), so it can store only '1' or '0'.
Recent research focuses on using crossbar architectures to emulate synaptic systems, and one application uses memristors for speech recognition [3]. Our research focuses on recognizing five vowels, 'a', 'e', 'i', 'o' and 'u', from the human voice. First, features are extracted from the voice signal by MFCC [4], giving 48 feature values. These are then used to train a neural network to recognize the 5 vowels. After training, the weights are quantized to 4 bits and stored in the binary memristor crossbar circuit. Each memristor acts as a 2-terminal switch that changes its resistance between the high-resistance state (HRS, logic "0") and the low-resistance state (LRS, logic "1"), so it can store '1' or '0'. We can then recognize each vowel by multiplying the input signal by the weights stored in the binary memristors: the products are summed, and the largest of the 5 outputs identifies the input signal. We propose a new binary memristor crossbar circuit based on a neural network model for recognizing five vowels. In addition, statistical simulations are performed, and the simulation results are discussed and summarized in this paper.
II. METHODS
The recognition model consists of two main processes: weight installation and recognition. In weight installation, the voice input is processed and used to train a neural network model; the resulting weights are quantized to 4 bits, and the obtained bits are stored in the binary memristors. In the recognition process, the voice input is processed and then applied to the weighted memristor circuit to determine the output.

In the first process, features are extracted from the voice signal by the MFCC method, which includes preprocessing, framing, windowing, DFT, mel-frequency filtering and logarithm. The features are then used to train a neural network in Matlab. The neural network model has one neuron in the hidden layer, and the transfer function is logsig [5]. Training is run five times, once for each of the five vowels: the target output is '1' for the vowel being trained and '0' otherwise. Recognition is then performed for each sound. In the recognition process, the input is quantized to 4 bits, giving 16 levels from 0 to 15. Before training, the input is normalized to the range -15 to 15.

After training, each vowel has 48 corresponding weights. For the vowel 'a', for example, there is an array of voice inputs and an array of weights, and the weights are to be quantized to 4 bits. However, the weights contain both positive and negative values, so the negative values must be processed before quantization. An auxiliary array was created for this purpose: the output depends on the product of the input and the weight, and the inputs are replaced by new input values obtained by multiplying them with this array. The training process thus yields positive weight values for each of the five vowels. The weights are scaled proportionally by the corresponding coefficient so that their values lie in the range (0; 15) and are then quantized to 4 bits. The bit values "1" and "0" are stored in memristors as the memristance values Ron and Roff, respectively.
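One possible reading of the negative-weight handling can be sketched in Python. This is an illustration under stated assumptions, not the authors' Matlab procedure: the auxiliary array is assumed to hold the sign (+1 or -1) of each weight, and the scaling rule (divide by the largest magnitude, multiply by 15, round) is also an assumption. The names `sign_array`, `quantize_weights` and `to_bits`, and all numeric values, are hypothetical.

```python
def sign_array(weights):
    """Return +1 for non-negative weights and -1 for negative ones."""
    return [1 if w >= 0 else -1 for w in weights]


def quantize_weights(weights, levels=15):
    """Scale weight magnitudes into 0..levels and round to integers.

    Assumption: proportional scaling by the largest magnitude.
    """
    signs = sign_array(weights)
    magnitudes = [w * s for w, s in zip(weights, signs)]  # all >= 0
    peak = max(magnitudes) or 1.0                         # avoid dividing by zero
    codes = [round(m / peak * levels) for m in magnitudes]
    return signs, codes


def to_bits(code, width=4):
    """Binary string stored in `width` memristors (1 = LRS, 0 = HRS)."""
    return format(code, f"0{width}b")


weights = [0.8, -0.4, 0.0, -1.2]   # hypothetical trained weights
inputs = [3, 5, 7, 2]              # hypothetical feature values

signs, codes = quantize_weights(weights)
flipped_inputs = [x * s for x, s in zip(inputs, signs)]

# Flipping the sign of both input and weight leaves each product unchanged,
# because s * s == 1 for s in {+1, -1}.
original = sum(x * w for x, w in zip(inputs, weights))
rewritten = sum(xf * abs(w) for xf, w in zip(flipped_inputs, weights))
```

Under this reading only non-negative weight codes need to be stored in the crossbar, while the sign information moves to the input side.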
Fig 1: Proposed crossbar architecture for recognition based on the comparison signal.
Previous research proposed a crossbar architecture for speech recognition based on signal comparison [6]. This approach does not work well if the input signal is unstable or the voice samples come from various people [6]. The output is largest when the test sample best matches the trained samples. In Figure 1a, the data input is '1100' and the four columns hold the weights: the first column is '1111', the second '0111', the third '1001' and the fourth '1100'. In Figure 1b, the column that best matches the input vector '1100' is the fourth, for which the number of matched cells is 4. By adding the cells in the M+ and M− arrays, we can find the number of matched cells and hence determine which of the four columns best matches the input vector.
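The matching rule of Figure 1 can be illustrated with a short Python sketch. It assumes that a "matched cell" is a bit position where the input bit equals the stored bit, which is how the M+ and M− arrays are read here; the function names are hypothetical.

```python
def match_count(input_bits, column_bits):
    """Count the positions where the input bit equals the stored bit."""
    return sum(1 for a, b in zip(input_bits, column_bits) if a == b)


def best_column(input_bits, columns):
    """Return (index, scores) for the column with the most matched cells."""
    scores = [match_count(input_bits, col) for col in columns]
    return scores.index(max(scores)), scores


columns = ["1111", "0111", "1001", "1100"]   # stored columns from Figure 1a
winner, scores = best_column("1100", columns)
print(winner, scores)   # index 3 is the fourth column
```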
Fig 2: Previous crossbar architecture causes error.
However, if the test voice does not completely match the trained samples, the matching can fail. In Figure 2, the data input is '0111' (7) and the four columns are '1000' (8), '0001' (1), '0100' (4) and '1111' (15). The input value 7 is nearly the same as the first column, 8. However, when the memristor crossbar architecture is applied, the output of the first column is 0, the smallest among the four columns, which is a bad result. This architecture is therefore a cause of a low recognition rate. In addition, it cannot recognize samples from many different people, because each person has an individual way of speaking. To raise the recognition rate for varied human speech, we propose a new memristor crossbar architecture based on a neural network model. The output of the neural network is calculated as $y = \sum_i x_i w_i$, where $x_i$ is the i-th input and $w_i$ is its weight: each input is multiplied by its weight, and the sum of the products is the output. Figure 3 shows the proposed memristor crossbar architecture.
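The failure case of Figure 2 can be checked numerically. The sketch below again assumes that a matched cell is a position where input and stored bits agree, and compares the match score with the plain numeric distance between the values; the variable names are hypothetical.

```python
def match_count(input_bits, column_bits):
    """Count the positions where the input bit equals the stored bit."""
    return sum(1 for a, b in zip(input_bits, column_bits) if a == b)


input_bits = "0111"                                # 7
columns = ["1000", "0001", "0100", "1111"]         # 8, 1, 4, 15

scores = [match_count(input_bits, c) for c in columns]
distances = [abs(int(input_bits, 2) - int(c, 2)) for c in columns]

closest = distances.index(min(distances))   # column nearest in value
best_match = scores.index(max(scores))      # column chosen by bit matching
print(scores, distances, closest, best_match)
```

The column closest in value (8) receives the lowest match score, while bit matching selects the column holding 15, the column farthest in value from the input.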
Fig 3: New crossbar architecture based on a neural network.
To multiply the input by the weight, the memristors are arranged as in Figure 3. In Figures 3a and 3b, the data input is '0111' (7), and the row weights are '0101' (5) and '1011' (11), respectively. The circuit works like the multiplication of two 4-bit numbers, which gives 7 columns with the factors 1, 2, 4, 8, 16, 32 and 64. The result is 7 × 5 = 35 in Figure 3a and 7 × 11 = 77 in Figure 3b, showing that if b < c then a × b < a × c, as for the multiplication of two integers. In Figure 3c, the data input is '1011' (11) and the weight is '0111' (7); the result is 11 × 7 = 77, the same as in Figure 3b. This shows that a × b = b × a, the commutative property of integer multiplication. The new memristor crossbar architecture therefore reproduces the neural network computation accurately. For recognizing five vowels, we use 48 input channels of 4 bits each, corresponding to the 48 features of the voice input. In Figure 4, each column of the 'a' crossbar array has its own voltage: one for the 'x1' column, one for the 'x2' column and, similarly, one each for the 'x4', 'x8', 'x16', 'x32' and 'x64' columns. Here, 'x1' means that the weight factor of this column is 1.
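The shift-and-add arrangement can be sketched in Python as follows. The sketch assumes that row i is driven by input bit i and stores the 4-bit weight shifted left by i positions, so that column k collects all partial products with factor 2^k; this is one reading of Figure 3, and the function name is hypothetical.

```python
def crossbar_multiply(x, w, bits=4):
    """Multiply two unsigned `bits`-bit integers with shifted rows.

    Returns (column_sums, product). Column k collects every partial
    product x_i * w_j with i + j == k and carries the factor 2**k.
    """
    x_bits = [(x >> i) & 1 for i in range(bits)]   # least significant bit first
    w_bits = [(w >> j) & 1 for j in range(bits)]
    column_sums = [0] * (2 * bits - 1)             # 7 columns for 4-bit operands
    for i, xb in enumerate(x_bits):
        for j, wb in enumerate(w_bits):
            column_sums[i + j] += xb & wb          # a cell contributes only if both bits are 1
    product = sum(s << k for k, s in enumerate(column_sums))
    return column_sums, product


cols_a, prod_a = crossbar_multiply(7, 5)     # input '0111', weight '0101'
cols_b, prod_b = crossbar_multiply(7, 11)    # input '0111', weight '1011'
cols_c, prod_c = crossbar_multiply(11, 7)    # operands swapped
print(cols_a, prod_a, prod_b, prod_c)
```

The seven column sums weighted by 1, 2, 4, 8, 16, 32 and 64 reproduce ordinary integer multiplication, including its ordering and commutative properties.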
In Figure 4, 'x2', 'x4', 'x8', 'x16', 'x32' and 'x64' mean that the weight factors of the corresponding columns in the 'a' crossbar array are 2, 4, 8, 16, 32 and 64, respectively. The output voltage of the 'a' array is the weighted summation of its seven column voltages. Similarly, the output voltages for recognizing 'e', 'i', 'o' and 'u' are the weighted summations of the column voltages of their arrays. These five voltages are the inputs of the winner-take-all circuit, which compares them with each other to determine which vowel has the largest output; that vowel is taken as the voice input. Figure 4 also shows the five outputs of the winner-take-all circuit, one per vowel. The voice is recognized by measuring their voltage levels.
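The data flow of Figure 4 can be summarized with a small Python sketch. It uses three input channels instead of 48 to stay short, treats column voltages as ideal integer counts, and uses hypothetical weight values; `column_counts`, `array_output` and `winner_take_all` are illustrative names, not taken from the paper.

```python
FACTORS = [1, 2, 4, 8, 16, 32, 64]        # column factors 'x1' to 'x64'


def column_counts(inputs, weights, bits=4):
    """Count contributing cells per column over all input channels."""
    counts = [0] * len(FACTORS)
    for x, w in zip(inputs, weights):
        for i in range(bits):
            for j in range(bits):
                counts[i + j] += ((x >> i) & 1) & ((w >> j) & 1)
    return counts


def array_output(inputs, weights):
    """Weighted summation of the seven column values of one vowel array."""
    return sum(f * c for f, c in zip(FACTORS, column_counts(inputs, weights)))


def winner_take_all(outputs):
    """Return the vowel whose array output is largest."""
    return max(outputs, key=outputs.get)


inputs = [7, 3, 12]                        # hypothetical 4-bit feature values
weights = {                                # hypothetical 4-bit weights per vowel
    "a": [5, 0, 15],
    "e": [1, 9, 2],
    "i": [0, 15, 1],
    "o": [4, 4, 4],
    "u": [15, 0, 0],
}
outputs = {v: array_output(inputs, w) for v, w in weights.items()}
recognized = winner_take_all(outputs)
print(outputs, recognized)
```

Each array output equals the dot product of the inputs with that vowel's weights, and the winner-take-all step reduces to picking the largest of the five.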
Fig 4: The block diagram of the proposed binary memristor crossbar circuit with 4-bit 48 input channels.
Figure 5a shows the schematic of the binary memristor crossbar circuit. The circuit has 48 input channels with a 4-bit binary value in each channel, and each 4-bit binary weight is stored in one row. The 4-bit weight is set into 4 memristors, so each row has 4 memristors; successive rows are shifted left to create 7 columns. In the testing process, the 48 input channels carry the 48 features extracted from the voice by MFCC. These input channels have values in the range (-15, 15). Because the input voltage has both negative and positive values, we add a bias voltage to shift the input to positive voltage values. Figure 5b shows the multiplier circuit, whose outputs correspond to the multipliers 1, 2, 4, 8, 16, 32 and 64. Figure 5c shows the adder circuit. The five capacitors represent the five vowels 'a', 'e', 'i', 'o' and 'u'. The vowel that corresponds to the fastest-charged of the five capacitors is the one that best matches the input human voice. Each capacitor is charged by the weighted summation of its array: if the weighted summation is large, the capacitor is charged to VCC very fast; if it is small, the capacitor takes longer to charge to VCC. The capacitor node voltages are then compared with a reference voltage by comparators. If a node voltage is larger than the reference voltage, the corresponding comparator output becomes high; while the node voltages are smaller than the reference voltage, the comparator outputs stay low. The comparator outputs feed OR gates, and a delay time τ creates a short CLK pulse. The outputs are registered by flip-flops. The simulated waveforms are shown in Figure 6. There, the capacitor node for 'a' is charged to VCC faster than the other capacitor nodes, so the vowel 'a' is the best match among the five vowels.
The timing diagram of the important signals is shown in Figure 6. When the CLK signal is high, all five capacitor nodes are charged toward VCC. Once a node voltage becomes higher than the reference voltage, the corresponding comparator output becomes high. If the capacitor node for 'a' is charged faster than the other nodes, it has the highest voltage level among the five nodes and is the first to exceed the reference voltage, so its comparator output is the fastest-rising signal among the comparator outputs. This rising edge generates the locking pulse that clocks the D flip-flops FF1, FF2, FF3, FF4 and FF5, so the FF1 register produces a high output signal. In this way, we can determine which vowel is most similar to the voice input: the output for 'a' becomes high and the other output signals become low, as shown in Figure 6.
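The charging race and the latching step can be modelled behaviourally in Python. The sketch assumes a constant charging current proportional to the weighted summation, so the node voltage rises linearly; the real circuit is analog and this linear model, the parameter values and the function names are all assumptions made for illustration.

```python
def crossing_time(weighted_sum, v_ref=0.5, capacitance=1.0, gain=1.0):
    """Time for a capacitor node to reach v_ref under a constant current.

    Assumption: current = gain * weighted_sum, so V(t) = current * t / capacitance.
    """
    if weighted_sum <= 0:
        return float("inf")        # the node never reaches the reference
    return capacitance * v_ref / (gain * weighted_sum)


def latch_winner(weighted_sums):
    """Return one-hot outputs: only the earliest-crossing node is latched high."""
    times = {name: crossing_time(s) for name, s in weighted_sums.items()}
    first = min(times, key=times.get)
    return {name: int(name == first) for name in weighted_sums}


sums = {"a": 215, "e": 58, "i": 57, "o": 88, "u": 105}   # hypothetical weighted sums
outputs = latch_winner(sums)
print(outputs)
```

The node with the largest weighted summation crosses the reference first, and locking the flip-flops at that instant yields a one-hot result.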
Fig 5: The schematics of the binary memristor crossbar circuit for speech recognition. (a) The schematic of the binary memristor crossbar circuit, (b) voltage multiplier circuit, (c) adder circuit, (d) the schematic of the winner-take-all circuit.
Fig 6: Voltage waveforms of the binary memristor crossbar and winner-take-all circuits.
III. RESULTS AND DISCUSSION
In this work, we used the MFCC method to process the voice signal. MFCC includes preprocessing, framing, windowing, the discrete Fourier transform and mel filtering. This yields 48 MFCC values, which are the features of the voice signal and the input data for the training process. Training is run five times, once for each of the five vowels. After training, the weights of each vowel are quantized to 4-bit binary values and stored in the memristor crossbar array as HRS or LRS. The test voice is likewise quantized and converted to 4-bit voltage levels '-1', '0', '1'. In the recognition circuit, the input voltage is applied to the binary memristors, and the output voltage of the winner-take-all circuit decides which vowel was tested.
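The first stages of the MFCC front end (pre-emphasis, framing, windowing and DFT) can be sketched in plain Python. This is not the authors' Matlab code: the mel filter bank and logarithm stages are omitted, a synthetic tone stands in for a recorded vowel, and the pre-emphasis coefficient, frame size and step are assumed values.

```python
import cmath
import math


def pre_emphasis(signal, alpha=0.97):
    """Boost high frequencies: y[n] = x[n] - alpha * x[n-1]."""
    return [signal[0]] + [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]


def frames(signal, size, step):
    """Split the signal into overlapping frames of `size` samples."""
    return [signal[s:s + size] for s in range(0, len(signal) - size + 1, step)]


def hamming(frame):
    """Apply a Hamming window to one frame."""
    n = len(frame)
    return [x * (0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1))) for k, x in enumerate(frame)]


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0 .. N/2 (slow, for illustration only)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


# A 1 kHz test tone sampled at 8 kHz stands in for a recorded vowel.
rate, tone = 8000, 1000
signal = [math.sin(2 * math.pi * tone * t / rate) for t in range(256)]
windowed = [hamming(f) for f in frames(pre_emphasis(signal), size=64, step=32)]
spectra = [power_spectrum(f) for f in windowed]
peak_bin = max(range(len(spectra[0])), key=spectra[0].__getitem__)
print(len(windowed), peak_bin)   # bin spacing is rate / size = 125 Hz
```

In a full MFCC pipeline the power spectra would next pass through a mel filter bank and a logarithm before the feature values are formed.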
Fig 7: The simulation results for the recognition rate of the proposed new binary memristor crossbar circuit.
Figure 7 shows the recognition rate for 1000 input voices covering five different vowels. Each vowel is tested with 200 different voices. The average recognition rate over the five vowels is estimated to be around 94%. The recognition rate of 'e' is the highest at 96%, while those of 'i' and 'o' are 95% and 94%, respectively. The vowels 'u' and 'a' have the lowest recognition rate at 90%.
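As a quick arithmetic check, the per-vowel rates quoted in the text can be combined into an overall rate. The sketch below assumes the quoted percentages are exact and that each vowel has exactly 200 test voices; the variable names are hypothetical.

```python
rates = {"a": 90, "e": 96, "i": 95, "o": 94, "u": 90}   # per-vowel rates from the text (%)
samples_per_vowel = 200

correct = sum(r * samples_per_vowel // 100 for r in rates.values())
total = samples_per_vowel * len(rates)
average = 100 * correct / total
print(correct, total, average)
```

With these quoted values the overall rate comes to 93.0%, within about one point of the reported average of around 94%.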
Fig 8: The statistical variation of memristance in HRS and LRS with the standard deviation (=σ) of 10%.
Fig 9: The recognition rate of the binary memristor crossbar with variation in memristance.
Figure 8 shows the statistical variation of memristance. The memristance values of 1 HRS and 10 LRS have a standard deviation (σ) of 10%. The statistical variation was evaluated by Monte Carlo simulation performed in Matlab. The Monte Carlo simulation estimates the tolerance of the recognition rate when the memristance variation ranges from 0% to 15%. In Figure 9, the recognition rate of the binary memristor crossbar decreases from 94% to 82% when the percentage statistical variation in memristance increases from 0% to 15%.
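The idea of the Monte Carlo analysis can be illustrated with a toy Python model (the paper itself used Matlab). Every memristance is perturbed by a Gaussian relative variation, and the fraction of trials in which the correct array still wins is counted. The resistance values `R_ON` and `R_OFF`, the two-vowel weight sets and the samples are hypothetical, so the numbers produced do not correspond to Figure 9.

```python
import random

R_ON, R_OFF = 10e3, 1e6      # hypothetical LRS / HRS resistances in ohms


def sample_resistance(bit, sigma, rng):
    """Draw one memristance with Gaussian relative variation `sigma`."""
    nominal = R_ON if bit else R_OFF
    return max(nominal * (1.0 + rng.gauss(0.0, sigma)), 1.0)   # keep it positive


def array_current(inputs, weights, sigma, rng, bits=4):
    """Weighted column current of one array for a unit read voltage."""
    total = 0.0
    for x, w in zip(inputs, weights):
        for i in range(bits):
            if not (x >> i) & 1:
                continue                     # a '0' input bit drives no current
            for j in range(bits):
                r = sample_resistance((w >> j) & 1, sigma, rng)
                total += (1 << (i + j)) / r  # column factor times conductance
    return total


def recognition_rate(samples, weight_sets, sigma, trials=200, seed=0):
    """Fraction of trials in which the perturbed arrays pick the labelled vowel."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        inputs, label = rng.choice(samples)
        currents = {v: array_current(inputs, w, sigma, rng) for v, w in weight_sets.items()}
        hits += max(currents, key=currents.get) == label
    return hits / trials


weight_sets = {"a": [15, 0, 3], "e": [0, 15, 3]}         # hypothetical 4-bit weights
samples = [([12, 1, 5], "a"), ([1, 12, 5], "e")]         # hypothetical labelled inputs

for sigma in (0.0, 0.05, 0.10, 0.15):
    print(sigma, recognition_rate(samples, weight_sets, sigma))
```

A sweep of this kind, run over the real 48-channel weights and test samples, is what produces a curve of recognition rate against memristance variation.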
IV. CONCLUSION
In this paper, the new binary memristor crossbar based on a neural network model recognized the five vowels 'a', 'e', 'i', 'o' and 'u' using 48 channels. Because each voice input has its own features, the proposed crossbar array can determine the output signal from the weights stored as memristance. We tested 1000 speech samples and verified that the circuit recognizes 94% of them. This shows that applying a neural network in a binary memristor crossbar gives a better result than comparison among samples. The neural network uses only one neuron in the hidden layer, so training is run five times. In further research, we will use more neurons in the hidden layers to raise the recognition rate and will focus on low power consumption and leakage current in the binary memristor crossbar circuit.
REFERENCES
[1]. L. O. Chua, "Memristor – the missing circuit element," IEEE Trans. Circuit Theory, vol. CT-18, no. 5, pp. 507-519, Sep. 1971.
[2]. D. B. Strukov, G. S. Snider, D. R. Stewart, and R. S. Williams, "The missing memristor found," Nature, vol. 453, pp. 80-83, May 2008.
[3]. Son Ngoc Truong, Seok-Jin Ham, and Kyeong-Sik Min, "Neuromorphic crossbar circuit with nanoscale filamentary-switching binary memristors for speech recognition," Nanoscale Research Letters, vol. 9, 629, 2014.
[4]. L. Muda, M. Begam, and I. Elamvazuthi, "Voice recognition algorithms using Mel frequency cepstral coefficient (MFCC) and dynamic time warping (DTW) techniques," J. Comput., vol. 2, no. 3, pp. 138-143, 2010.
[5]. Raqibul Hasan and Tarek M. Taha, "Enabling back propagation training of memristor crossbar neuromorphic processors," 2014 International Joint Conference on Neural Networks (IJCNN).
[6]. Son Ngoc Truong, SangHak Shin, Sang-Don Byeon, JaeSang Song, Hyun-Sun Mo, and Kyeong-Sik Min, "Comparative study on statistical-variation tolerance between complementary crossbar and twin crossbar of binary nanoscale memristors for pattern recognition," Nanoscale Research Letters, vol. 10, 405, 2015.

More Related Content

PPTX
Lecture Notes: EEEC6440315 Communication Systems - Information Theory
PDF
Design, Analysis and Implementation of Modified Luby Transform Code
PDF
Neuro genetic key based recursive modulo 2 substitution using mutated charact...
DOCX
PDF
An Optimal Software Framework for Parallel Computation of CRC
PDF
Welcome to International Journal of Engineering Research and Development (IJERD)
PDF
Performance Analysis of Steepest Descent Decoding Algorithm for LDPC Codes
PPT
Lecture 09
Lecture Notes: EEEC6440315 Communication Systems - Information Theory
Design, Analysis and Implementation of Modified Luby Transform Code
Neuro genetic key based recursive modulo 2 substitution using mutated charact...
An Optimal Software Framework for Parallel Computation of CRC
Welcome to International Journal of Engineering Research and Development (IJERD)
Performance Analysis of Steepest Descent Decoding Algorithm for LDPC Codes
Lecture 09

What's hot (20)

PDF
Manchester & Differential Manchester encoding scheme
PDF
Data communication 123
DOCX
Dccn solution
PPTX
Error control coding techniques
PPT
PPTX
Source coding
PPT
Digital Communication: Channel Coding
PPT
Full error detection and correction
PDF
Error Control Coding -Introduction
PPT
Lecture 08
PDF
40120140501016
PPT
Ch10 2 v1
PDF
Design and Performance Analysis of Convolutional Encoder and Viterbi Decoder ...
PPTX
Data communications 4 1
PDF
Applying Deep Learning Machine Translation to Language Services
PDF
Reliability Improvement in Logic Circuit Stochastic Computation
PDF
Data Communication & Computer Networks : Unipolar & Polar coding
PPT
Error detection and correction
PDF
A genetic algorithm to solve the
Manchester & Differential Manchester encoding scheme
Data communication 123
Dccn solution
Error control coding techniques
Source coding
Digital Communication: Channel Coding
Full error detection and correction
Error Control Coding -Introduction
Lecture 08
40120140501016
Ch10 2 v1
Design and Performance Analysis of Convolutional Encoder and Viterbi Decoder ...
Data communications 4 1
Applying Deep Learning Machine Translation to Language Services
Reliability Improvement in Logic Circuit Stochastic Computation
Data Communication & Computer Networks : Unipolar & Polar coding
Error detection and correction
A genetic algorithm to solve the
Ad

Viewers also liked (20)

PDF
Certain D - Operator for Srivastava H B - Hypergeometric Functions of Three V...
PDF
Synaptic memristor bridge circuit with pulse width based programable weights ...
PDF
Approaches to Administrative Leadership
PDF
An Exposition of Qualities of Leadership
PDF
Efficient Algorithm for Constructing KU-algebras from Block Codes
PDF
Multi-Mode Conceptual Clustering Algorithm Based Social Group Identification ...
PDF
New classes of Adomian polynomials for the Adomian decomposition method
PDF
Bagasse based high pressure co-generation in Pakistan
PDF
Design and Fabrication of a Recreational Human-Powered Vehicle
PDF
Power Aware Geocast Based Geocast Region Tracking Using Mobile Node in Wirele...
PDF
Effect of Wind Direction through Double Storied Building Model Configurations...
PDF
Middle Range Theories as Coherent Intellectual Frameworks
PDF
E-Voting and Credible Elections in Nigeria
PDF
Importance and Functions of Bills of Quantities in the Construction Industry:...
PDF
Fuzzy random variables and Kolomogrov’s important results
PDF
Trust based Mechanism for Secure Cloud Computing Environment: A Survey
PDF
Numerical Study of Some Iterative Methods for Solving Nonlinear Equations
PDF
The Case for Developing and Introducing the M-Procurement System in Nigeria, ...
PDF
Face Recognition System Using Local Ternary Pattern and Signed Number Multipl...
PDF
Comparison of Supercritical Fluid Extraction with Steam Distillation for the ...
Certain D - Operator for Srivastava H B - Hypergeometric Functions of Three V...
Synaptic memristor bridge circuit with pulse width based programable weights ...
Approaches to Administrative Leadership
An Exposition of Qualities of Leadership
Efficient Algorithm for Constructing KU-algebras from Block Codes
Multi-Mode Conceptual Clustering Algorithm Based Social Group Identification ...
New classes of Adomian polynomials for the Adomian decomposition method
Bagasse based high pressure co-generation in Pakistan
Design and Fabrication of a Recreational Human-Powered Vehicle
Power Aware Geocast Based Geocast Region Tracking Using Mobile Node in Wirele...
Effect of Wind Direction through Double Storied Building Model Configurations...
Middle Range Theories as Coherent Intellectual Frameworks
E-Voting and Credible Elections in Nigeria
Importance and Functions of Bills of Quantities in the Construction Industry:...
Fuzzy random variables and Kolomogrov’s important results
Trust based Mechanism for Secure Cloud Computing Environment: A Survey
Numerical Study of Some Iterative Methods for Solving Nonlinear Equations
The Case for Developing and Introducing the M-Procurement System in Nigeria, ...
Face Recognition System Using Local Ternary Pattern and Signed Number Multipl...
Comparison of Supercritical Fluid Extraction with Steam Distillation for the ...
Ad

Similar to New binary memristor crossbar architecture based neural networks for speech recognition (20)

PPT
Flow cytometry
DOCX
Ieee 2013 matlab abstracts part b
PDF
Quantum Noise and Error Correction
PDF
Cs8591 Computer Networks
PDF
Simulation of Quantum Cryptography and use of DNA based algorithm for Secure ...
PPT
Lecture1
PPTX
Data Communications- Unit-4.pptx
PDF
Cellonics-Seminar-Report[1]
PPTX
Information Theory Final.pptx
PDF
Acquisition of Long Pseudo Code in Dsss Signal
PDF
Quantum Communications Q&A with Gemini LLM
PDF
Nt1330 Unit 4.2 Paper
DOC
Framework for Channel Attenuation Model Final Paper
PDF
Technical details
PDF
Performance Analysis of Various Symbol Detection Techniques in Wireless MIMO ...
PDF
Performance Analysis of Various Symbol Detection Techniques in Wireless MIMO ...
PDF
Performance Analysis of Various Symbol Detection Techniques in Wireless MIMO ...
PPT
RADIO FREQUENCY COMMUNICATION SYSTEMS, ANTENNA THEORY AND MICROWAVE DEVICES
PDF
Adaptive Channel Equalization using Multilayer Perceptron Neural Networks wit...
PDF
Fpga implementation of soft decision low power convolutional decoder using vi...
Flow cytometry
Ieee 2013 matlab abstracts part b
Quantum Noise and Error Correction
Cs8591 Computer Networks
Simulation of Quantum Cryptography and use of DNA based algorithm for Secure ...
Lecture1
Data Communications- Unit-4.pptx
Cellonics-Seminar-Report[1]
Information Theory Final.pptx
Acquisition of Long Pseudo Code in Dsss Signal
Quantum Communications Q&A with Gemini LLM
Nt1330 Unit 4.2 Paper
Framework for Channel Attenuation Model Final Paper
Technical details
Performance Analysis of Various Symbol Detection Techniques in Wireless MIMO ...
Performance Analysis of Various Symbol Detection Techniques in Wireless MIMO ...
Performance Analysis of Various Symbol Detection Techniques in Wireless MIMO ...
RADIO FREQUENCY COMMUNICATION SYSTEMS, ANTENNA THEORY AND MICROWAVE DEVICES
Adaptive Channel Equalization using Multilayer Perceptron Neural Networks wit...
Fpga implementation of soft decision low power convolutional decoder using vi...

Recently uploaded (20)

PPTX
web development for engineering and engineering
PDF
R24 SURVEYING LAB MANUAL for civil enggi
PPTX
Current and future trends in Computer Vision.pptx
PDF
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
PDF
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
PPTX
Sustainable Sites - Green Building Construction
PPTX
FINAL REVIEW FOR COPD DIANOSIS FOR PULMONARY DISEASE.pptx
PDF
Enhancing Cyber Defense Against Zero-Day Attacks using Ensemble Neural Networks
PPT
introduction to datamining and warehousing
PDF
Evaluating the Democratization of the Turkish Armed Forces from a Normative P...
PDF
SM_6th-Sem__Cse_Internet-of-Things.pdf IOT
PDF
Operating System & Kernel Study Guide-1 - converted.pdf
PPTX
Lecture Notes Electrical Wiring System Components
PDF
Well-logging-methods_new................
PPTX
OOP with Java - Java Introduction (Basics)
PPTX
additive manufacturing of ss316l using mig welding
PPTX
Internet of Things (IOT) - A guide to understanding
PPTX
CYBER-CRIMES AND SECURITY A guide to understanding
PPTX
UNIT-1 - COAL BASED THERMAL POWER PLANTS
PPTX
Geodesy 1.pptx...............................................
web development for engineering and engineering
R24 SURVEYING LAB MANUAL for civil enggi
Current and future trends in Computer Vision.pptx
BMEC211 - INTRODUCTION TO MECHATRONICS-1.pdf
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
Sustainable Sites - Green Building Construction
FINAL REVIEW FOR COPD DIANOSIS FOR PULMONARY DISEASE.pptx
Enhancing Cyber Defense Against Zero-Day Attacks using Ensemble Neural Networks
introduction to datamining and warehousing
Evaluating the Democratization of the Turkish Armed Forces from a Normative P...
SM_6th-Sem__Cse_Internet-of-Things.pdf IOT
Operating System & Kernel Study Guide-1 - converted.pdf
Lecture Notes Electrical Wiring System Components
Well-logging-methods_new................
OOP with Java - Java Introduction (Basics)
additive manufacturing of ss316l using mig welding
Internet of Things (IOT) - A guide to understanding
CYBER-CRIMES AND SECURITY A guide to understanding
UNIT-1 - COAL BASED THERMAL POWER PLANTS
Geodesy 1.pptx...............................................

New binary memristor crossbar architecture based neural networks for speech recognition

  • 1. International Journal of Engineering Science Invention ISSN (Online): 2319 – 6734, ISSN (Print): 2319 – 6726 www.ijesi.org || Volume 5 Issue 5 || May 2016 || PP.1-7 www.ijesi.org 1 | Page New binary memristor crossbar architecture based neural networks for speech recognition Van-Tien Nguyen1 , Minh-Huan Vo2 1,2 Department of Electrical Electronic Engineering, HCMC University of Technology and Education, Viet Nam. ABSTRACT : In this paper, we propose a new binary memristor crossbar architecture based neural networks for speech recognition. The circuit can recognize five vowels. The proposed crossbar is tested by 1,000 speech samples and recognized 94% of the tested samples. We use Monte Carlo simulation to estimate recognitition rate. The percentage variation in memristance is increased from 0% to 15%, the recognition rate is degraded from 94% to 82%. KEYWORDS – Memristors, Crossbar, Speech recognition, Binary memristors. I. INTRODUCTION According to Moore's Law, transistor count per chip will double within two years. The IC technology development in recent years has shown the validity of Moore's law. However, according to estimates in the near future, the technology will reach the limit of Moore's Law, which means that the chip size will reach critical values to ensure accuracy and stability. So, there are many new methods being studied to replace Moore's law. And memristor of Leon. Chua was mentioned in 1971 [1], which was completed by Stanley- Williams in 2008 [2]. The future of technology has opened up new development. This technology is even better than CMOS technology has been thriving. Anyone with a basic knowledge in electrical engineering knows that there are four fundamental circuit variables: Current i, Voltage v, Charge q, and Flux f. Then it is clear that with these four parameters, there can be six possible combinations for relating them to each other. 
So far we have complete understanding and control over five of these combinations in which three of them are passive twoterminal fundamental circuit elements, namely the resistor R, the capacitor C and the inductor L. Unlike the active components which can generate energy, these three components are passive elements which are only capable of storing or dissipating energy but not generating it. The relationship between 'voltage and current', 'voltage and charge', and 'current and flux' are defined by a resistor, capacitor and an inductor, respectively. No device was there to relate the charge and the magnetic flux until Leon Chua introduced his new circuit element called “memristor”. In 2008, a research group at HP Labs lead by Stanley Williams succeeded to fabricate the device in nanometer scale. Since then, the research being conducted on memristors gained momentum and the number of publications have boosted quite rapidly. Memristor have two types, are analog memristors and binary memristors. The analog memristor can change the value memristance depend on voltage or electric current applied to it. However, installing memristance value is difficult, not exactly. On the contrary, memristance of binary memristor is easy to install, and more exact. Binary memristors have two state either a high resistance state (HRS) or a low resistance state (LRS), so they can be stored only„1‟ or „0‟ in binary memristors. Recent research focuses on using crossbar architecture to simulate synaptic systems. Thus, an application uses memristor for speech recognition [3]. Our research focuses on recognizing five vowels: „a‟, „e‟, „i‟, „o‟ and „u‟ from the human voice. To do this, First, a voice signal will be extracted features by MFCC [4]. There are 48 feature values. Then, they are trained by neutral network to recognize 5 vowels. After training, weightings are quantized in 4 bits, their values were stored in binary memristor crossbar circuit. 
The memristor can achieve either a high-resistance state (HRS) or a low-resistance state (LRS). It means that memristor can store „1‟ and „0‟ with two states. This memristor plays a role as a 2-terminal switch to change the resistance between high resistance state (HRS, logic “0”) and low resistance state (LRS, logic “1”). By doing so, we can recognize each vowel by multiplication of input signal and weight stored in binary memrisor. The summation of the multiplication results decides the biggest output among 5 outputs that will represent input signal. We suggest a new binary memristor crossbar circuit based neural network model for recognizing five vowels. In addition, statistical simulations are performed, and the simulation results are discussed and finally summarized in this paper.
II. METHODS
The recognition model consists of two main processes: weight installation and recognition. In weight installation, the voice input is processed and used to train the neural network model. The resulting weights are quantized to 4 bits, and the obtained bits are stored in the binary memristors. In recognition, the voice input is processed and then applied to the weighted memristor circuit to determine the output.

In the first process, features are extracted from the voice signal by the MFCC method, which includes preprocessing, framing, windowing, DFT and the mel-frequency log. The features are then used to train a neural network in Matlab. The neural network model has one neuron in the hidden layer, and the transfer function is logsig [5]. Training is carried out five times, once per vowel. The target output is '1' for the vowel being trained and '0' otherwise. Recognition is performed for each sound.

In the recognition process, the input is quantized to 4 bits, giving 16 levels from 0 to 15. Before training, the input is normalized to the range -15 to 15. After training, each vowel has 48 corresponding weights. Consider the voice input and the weights for 'a'. The weights are to be quantized to 4 bits, but they contain both positive and negative values, so the negative values must be handled before quantization. An array is created for this purpose: the input is multiplied by this array to give new input values, and the output then depends on the product of the new input values and the resulting weights. After this step the weights of all five vowels take positive values. The weights are then scaled proportionally so that their values lie in the range 0 to 15, and quantized to 4 bits.
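The handling of negative weights and the 4-bit quantization can be illustrated with a short sketch. It rests on an interpretation, not on the authors' Matlab code: the auxiliary array is assumed to hold the sign (+1 or -1) of each weight, so that each product w·x equals |w|·(sign·x) and the weighted sum is unchanged. Scaling by the largest weight and rounding to the nearest level are likewise assumptions, and every name and number below is hypothetical.

```python
from typing import List, Tuple


def fold_signs(weights: List[float], inputs: List[float]) -> Tuple[List[float], List[float]]:
    """Move the sign of each weight onto the matching input.

    Since w * x == abs(w) * (sign(w) * x), the weighted sum is preserved
    while every stored weight becomes non-negative.
    """
    signs = [-1.0 if w < 0 else 1.0 for w in weights]  # assumed sign array
    positive_weights = [abs(w) for w in weights]
    new_inputs = [s * x for s, x in zip(signs, inputs)]
    return positive_weights, new_inputs


def quantize_4bit(positive_weights: List[float]) -> List[int]:
    """Scale non-negative weights proportionally into 0..15 and round."""
    largest = max(positive_weights)
    if largest == 0:
        return [0 for _ in positive_weights]
    return [round(w / largest * 15) for w in positive_weights]


def to_bits(level: int) -> str:
    """Format a level 0..15 as a 4-bit string, most significant bit first."""
    return format(level, "04b")


# Made-up example with 4 weights instead of 48.
weights = [0.9, -0.4, 0.1, -1.5]
inputs = [3.0, -5.0, 7.0, 2.0]
pos_w, new_x = fold_signs(weights, inputs)
levels = quantize_4bit(pos_w)
bits = [to_bits(q) for q in levels]
print(levels, bits)
```

Each bit of the resulting 4-bit strings would then correspond to one binary memristor state.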
The bit values "1" and "0" are stored in memristors as the memristance values Ron and Roff.

Fig 1: Proposed crossbar architecture for recognition based on the comparison signal.

Previous research proposed a crossbar architecture for speech recognition based on signal comparison [6]. This approach breaks down if the input signal is unstable or the voice samples come from different people [6]. In that architecture, the output is largest when the test sample best matches a trained sample. In Figure 1a the input is '1100' and the four columns hold the weights '1111', '0111', '1001' and '1100'. In Figure 1b, the column that best matches the input vector '1100' is the fourth, with 4 matched cells. By adding the cells in the M+ and M- arrays, we find the number of matched cells, and hence the best-matched column among the four.

Fig 2: Previous crossbar architecture causes error.
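The column matching described for Figure 1 can be reproduced with a small behavioural sketch. It assumes that the M+ array contributes the positions where input and stored bit are both '1' and the M- array the positions where both are '0'; this split is an interpretation of the text, and the function names are hypothetical.

```python
from typing import List


def match_count(input_bits: str, column_bits: str) -> int:
    """Count the cells where the input bit equals the stored bit."""
    if len(input_bits) != len(column_bits):
        raise ValueError("bit strings must have the same length")
    pairs = list(zip(input_bits, column_bits))
    ones = sum(1 for a, b in pairs if a == "1" and b == "1")   # assumed M+ part
    zeros = sum(1 for a, b in pairs if a == "0" and b == "0")  # assumed M- part
    return ones + zeros


def best_column(input_bits: str, columns: List[str]) -> int:
    """Return the zero-based index of the column with the most matched cells."""
    scores = [match_count(input_bits, c) for c in columns]
    return scores.index(max(scores))


# Values taken from the Figure 1 description.
columns = ["1111", "0111", "1001", "1100"]
scores = [match_count("1100", c) for c in columns]
print(scores, best_column("1100", columns))
```

The fourth column (index 3) reaches 4 matched cells, in line with the description of Figure 1b.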
However, if the test voice does not completely match the trained samples, this architecture can fail to match it at all. In Figure 2 the input is '0111' (7) and the four columns are '1000' (8), '0001' (1), '0100' (4) and '1111' (15). The input value 7 is numerically closest to the first column, 8, but the comparison-based crossbar gives that column an output of 0, the smallest of the four, because none of its bits match the input. This architecture therefore leads to a low recognition rate. In addition, it cannot recognize samples from many different people, because each person's speech is different. To raise the recognition rate across speakers, we propose a new memristor crossbar architecture based on a neural network model. The output of the neural network is calculated as a weighted sum: each input is multiplied by its weight, and the sum of the products is the output. Figure 3 shows the proposed memristor crossbar architecture.

Fig 3: New crossbar architecture based neural network.

To multiply the input by the weight, the memristors are arranged as in Figure 3. In Figures 3a and 3b the input is '0111' (7). The row weights are '0101' (5) and '1011' (11). The circuit works like the multiplication of two 4-bit numbers, which gives 7 columns with factors 1, 2, 4, 8, 16, 32 and 64. Figure 3a gives the result for 7 × 5 and Figure 3b the result for 7 × 11; the output values reported are 63 and 155, respectively, whereas the exact integer products are 35 and 77. The results show that if b < c then a × b < a × c, as when multiplying two integers. In Figure 3c the input is '1011' (11) and the weight is '0111' (7). The reported result for 11 × 7 is 155, the same as in Figure 3b, which shows that a × b = b × a, the commutative property of integer multiplication. The new memristor crossbar architecture therefore reproduces the neural network model accurately.
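The shift-and-add arrangement with seven columns and factors 1 to 64 can be sketched as idealized integer arithmetic. The sketch assumes that each cell contributes one unit to its column when both the input bit and the stored weight bit are '1', and that bit strings are written most significant bit first. It models only the arithmetic, not the analog behaviour of the memristors, and the function names are hypothetical.

```python
from typing import List

COLUMN_FACTORS = [1, 2, 4, 8, 16, 32, 64]  # the 'x1' ... 'x64' columns


def column_counts(input_bits: str, weight_bits: str) -> List[int]:
    """Count contributing cells per column for a 4-bit by 4-bit product.

    Input bit i and weight bit j (counted from the least significant bit)
    meet in column i + j, which is why 4 + 4 - 1 = 7 columns are needed.
    """
    counts = [0] * len(COLUMN_FACTORS)
    for i, x_bit in enumerate(reversed(input_bits)):
        for j, w_bit in enumerate(reversed(weight_bits)):
            if x_bit == "1" and w_bit == "1":
                counts[i + j] += 1
    return counts


def crossbar_product(input_bits: str, weight_bits: str) -> int:
    """Weighted summation of the column counts with factors 1, 2, ..., 64."""
    counts = column_counts(input_bits, weight_bits)
    return sum(f * c for f, c in zip(COLUMN_FACTORS, counts))


# Binary 0111 is 7, 0101 is 5 and 1011 is 11.
print(column_counts("0111", "0101"))
print(crossbar_product("0111", "0101"), crossbar_product("0111", "1011"))
```

In this idealized model the weighted column sum equals the ordinary integer product, so swapping input and weight leaves the result unchanged and a larger weight always gives a larger output.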
So, to recognize five vowels, we use 48 input channels of 4 bits each, corresponding to the 48 features of the voice input. Each channel carries a 4-bit binary value. In Figure 4, each of the columns 'x1', 'x2', 'x4', 'x8', 'x16', 'x32' and 'x64' of the 'a' crossbar array has its own column voltage. The labels 'x1', 'x2', 'x4', 'x8', 'x16', 'x32' and 'x64' mean that the weight factors of the corresponding columns are 1, 2, 4, 8, 16, 32 and 64, respectively. The output voltage for 'a' is the weighted summation of these seven column voltages. Similarly, the output voltage for 'e' is the weighted summation of the column voltages of the 'e' crossbar array, and the same holds for the remaining vowels. The five output voltages are the inputs of the winner-take-all circuit, which compares them to determine which vowel gives the largest value; that vowel is the recognized voice input. Figure 4 also shows the five outputs of the winner-take-all circuit, whose voltage levels can be measured to identify the voice.

Fig 4: The block diagram of the proposed binary memristor crossbar circuit with 4-bit 48 input channels.

Figure 5a shows the schematic of the binary memristor crossbar circuit. The circuit has 48 input channels with a 4-bit binary value in each, and each 4-bit binary weight is stored in one row. The 4-bit weight is set in 4 memristors. Each row has 4 memristors, and successive rows are shifted left to create 7 columns. In testing, the 48 input channels correspond to the 48 features extracted from the voice by MFCC. These input channels take values in the range (-15, 15). Because the input voltage has both negative and positive values, a bias voltage is added to shift the input to positive voltages. Figure 5b shows the multiplier circuit, whose output corresponds to the multipliers 1, 2, 4, 8, 16, 32 and 64. Figure 5c shows the adder circuit.

Five capacitors represent the five vowels 'a', 'e', 'i', 'o' and 'u'. The vowel whose capacitor charges fastest is the one that best matches the input voice. Each capacitor is charged by the weighted summation of the column voltages of its crossbar array: if the weighted summation is large, the capacitor charges to VCC very quickly; if it is small, charging takes longer. The capacitor voltages are then compared with a reference voltage by comparators. When a capacitor voltage is higher than the reference voltage, the corresponding comparator output becomes high; while the capacitor voltages are lower than the reference voltage, the comparator outputs are low. The circuit also uses OR gates, and a delay time τ between two signals creates a short CLK pulse. Flip-flops register the result. The simulated waveforms are shown in Figure 6. There, the capacitor node for 'a' charges to VCC faster than the other capacitor nodes, so the vowel 'a' is the best match.

The timing diagram of the important signals is also shown in Figure 6. When the CLK signal is high, all the capacitor nodes are charged to VCC. At this time the capacitor voltages are higher than the reference voltage, so the comparator outputs can be high. If the capacitor for 'a' charges to VCC faster than the others, its node has the highest voltage level among the five. When that voltage becomes higher than the reference voltage, the corresponding comparator output becomes high, and it is the fastest-rising signal among the comparator outputs. This signal generates the locking pulse that clocks the D flip-flops FF1, FF2, FF3, FF4 and FF5, so the FF1 register produces a high output signal.
We can thus determine which vowel is most similar to the voice input. The output for that vowel goes high while the other output signals become low, as shown in Figure 6.
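The winner-take-all decision, in which the fastest-charging capacitor identifies the vowel, can be illustrated with a simplified timing model. The sketch assumes ideal constant-current charging, so that the time to reach the reference voltage is t = C·V/I with the current proportional to the weighted summation. The capacitance, reference voltage, current scale and example sums are all hypothetical values chosen for illustration; the real circuit in Figure 5 is not modelled.

```python
from typing import Dict


def time_to_reference(weighted_sum: float,
                      capacitance: float = 1e-12,    # farads, hypothetical
                      v_ref: float = 0.5,            # volts, hypothetical
                      amps_per_unit: float = 1e-6) -> float:
    """Time for a capacitor to reach v_ref under constant-current charging."""
    current = weighted_sum * amps_per_unit
    if current <= 0:
        return float("inf")  # never reaches the reference voltage
    return capacitance * v_ref / current


def winner_take_all(sums: Dict[str, float]) -> Dict[str, int]:
    """Set the output of the fastest-charging vowel to 1 and the rest to 0."""
    times = {vowel: time_to_reference(s) for vowel, s in sums.items()}
    winner = min(times, key=times.get)
    return {vowel: int(vowel == winner) for vowel in sums}


# Made-up weighted summations for one input voice.
sums = {"a": 420.0, "e": 310.0, "i": 150.0, "o": 280.0, "u": 90.0}
outputs = winner_take_all(sums)
print(outputs)
```

Because a larger weighted summation means a shorter charging time, picking the earliest crossing is equivalent to picking the largest sum, which is why a timing-based circuit can stand in for an explicit maximum search.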
Fig 5: The schematics of the binary memristor crossbar circuit for speech recognition. (a) The schematic of the binary memristor crossbar circuit, (b) voltage multiplier circuit, (c) adder circuit, (d) the schematic of the winner-take-all circuit of the binary memristor circuit.
Fig 6: Voltage waveforms of the binary memristor crossbar and winner-take-all circuits.

III. RESULTS AND DISCUSSION
In this work, we used the MFCC method to process the voice signal. MFCC includes preprocessing, framing, windowing, the discrete Fourier transform and mel filtering. This yields 48 MFCC values, which are the features of the voice signal and the input data for training. Training is carried out 5 times, once for each of the 5 vowels. After training, the weights of each vowel are quantized to 4-bit binary values and stored in the memristor crossbar array as HRS or LRS. The test voice is quantized and converted to 4-bit voltage levels '-1', '0', '1'. In the recognition circuit, the input voltage is applied to the binary memristors, and the output voltage of the winner-take-all circuit decides which vowel was tested.

Fig 7: The simulation results for the recognition rate of the proposed new binary memristor crossbar circuit.

Figure 7 shows the recognition rate for 1000 input voices covering five different vowels. Each vowel is tested with 200 different voices. The average recognition rate over the five vowels is estimated to be around 94%. The recognition rate of 'e' is the highest at 96%, while those of 'i' and 'o' are 95% and 94%, respectively. The vowels 'u' and 'a' have the lowest recognition rate, 90%.

Fig 8: The statistical variation of memristance in HRS and LRS with the standard deviation (σ) of 10%.
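A statistical variation of the kind shown in Fig 8 can be generated with a short Monte Carlo sketch. It assumes a normal distribution around the nominal memristance with a standard deviation of 10% of that value; the distribution actually used is not stated. The nominal values of 10 kΩ (LRS) and 1 MΩ (HRS), the sample count and the seed are hypothetical choices for illustration.

```python
import random
import statistics
from typing import List


def sample_memristance(nominal: float, sigma_fraction: float,
                       count: int, rng: random.Random) -> List[float]:
    """Draw memristance samples from a normal distribution around nominal."""
    return [rng.gauss(nominal, sigma_fraction * nominal) for _ in range(count)]


def relative_spread(samples: List[float]) -> float:
    """Sample standard deviation divided by the sample mean."""
    return statistics.stdev(samples) / statistics.mean(samples)


rng = random.Random(0)   # fixed seed so the run is repeatable
R_LRS = 10e3             # ohms, hypothetical nominal low-resistance state
R_HRS = 1e6              # ohms, hypothetical nominal high-resistance state

lrs_samples = sample_memristance(R_LRS, 0.10, 10000, rng)
hrs_samples = sample_memristance(R_HRS, 0.10, 10000, rng)
lrs_spread = relative_spread(lrs_samples)
hrs_spread = relative_spread(hrs_samples)
print(round(lrs_spread, 3), round(hrs_spread, 3))
```

To estimate the recognition rate under variation, samples like these would replace the ideal Ron and Roff values in a circuit model, and the recognition test would be repeated for each variation level.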
Fig 9: The recognition rate of the binary memristor crossbar with variation in memristance.

Figure 8 shows the statistical variation of memristance. The HRS and LRS memristances, given as 1 and 10 respectively, have a standard deviation (σ) of 10%. The statistical variation was obtained by Monte Carlo simulation performed in Matlab. The Monte Carlo simulation estimates how well the recognition rate tolerates memristance variation over the range 0% to 15%. In Figure 9, the recognition rate of the binary memristor crossbar degrades from 94% to 82% as the percentage statistical variation in memristance increases from 0% to 15%.

IV. CONCLUSION
In this paper, the new binary memristor crossbar based on a neural network model recognized the five vowels 'a', 'e', 'i', 'o' and 'u' with 48 channels. Because each voice input has its own features, the proposed crossbar array can determine the output signal from the weights stored as memristance. We tested 1000 speech samples and verified that the circuit recognizes 94% of them. This shows that applying a neural network in a binary memristor crossbar gives better results than comparison among samples. The neural network uses only one neuron in the hidden layer, so training is carried out 5 times. In further research, we will use more neurons in the hidden layers to raise the recognition rate, and focus on low power consumption and leakage current in the binary memristor crossbar circuit.

REFERENCES
[1]. L. O. Chua, "Memristor – the missing circuit element," IEEE Trans. Circuit Theory, vol. CT-18, no. 5, pp. 507-519, Sep. 1971.
[2]. D. B. Strukov, G. S. Snider, D. R. Stewart, and R. S. Williams, "The missing memristor found," Nature, vol. 453, pp. 80-83, May 2008.
[3]. Son Ngoc Truong, Seok-Jin Ham and Kyeong-Sik Min, "Neuromorphic crossbar circuit with nanoscale filamentary-switching binary memristors for speech recognition," Nanoscale Research Letters 2014, 9:629.
[4]. Muda L, Begam M, Elamvazuthi I, "Voice recognition algorithms using Mel frequency cepstral coefficient (MFCC) and dynamic time warping (DTW) techniques," J Comput 2010, 2(3):138–143.
[5]. Raqibul Hasan and Tarek M. Taha, "Enabling Back Propagation Training of Memristor Crossbar Neuromorphic Processors," 2014 International Joint Conference on Neural Networks (IJCNN).
[6]. Son Ngoc Truong, SangHak Shin, Sang-Don Byeon, JaeSang Song, Hyun-Sun Mo and Kyeong-Sik Min, "Comparative Study on Statistical-Variation Tolerance Between Complementary Crossbar and Twin Crossbar of Binary Nanoscale Memristors for Pattern Recognition," Nanoscale Research Letters (2015) 10:405.