VOICE IDENTIFICATION AND
RECOGNITION SYSTEM
A SIMPLE YET COMPLEX APPROACH TO MODERN SOPHISTICATION
VOICE IDENTIFICATION AND RECOGNITION SYSTEM 1
GROUP MEMBERS
• SOHAIB TALLAT SP13-BCE-040
• FARHAN SHAHID SP13-BCE-013
• ABDUL SAMAD SP13-BCE-002
• MATTI ULLAH ABBASI SP13-BCE-025
INTRODUCTION AND INSPIRATION
• Simplicity has taken its toll. In this age of sophisticated technologies, efficient security systems
have to be part of everyday life.
• The “VOICE IDENTIFICATION AND RECOGNITION SYSTEM” was developed to control access to services
such as banking and database systems, which are used to secure
confidential information.
• We were inspired to build this project to make lock mechanisms speech-automated,
especially for the ease of physically disabled people.
ABSTRACT
• Approaches for making voice recognition systems:
a. Linear Prediction Coding (LPC)
b. Mel-Frequency Cepstrum Coefficients (MFCC) and others.
• Principle Used: Mel-Frequency Cepstrum Coefficients (MFCC)
• Working
THE VOICE IDENTIFICATION ALGORITHM
• Principles of Speaker Recognition:
a. Identification
b. Verification
[Figure 1: Speaker Identification. Block diagram: input speech → feature extraction → similarity against each reference model (Speaker #1 … Speaker #N) → maximum selection → identification result (Speaker ID).]
[Figure 2: Speaker Recognition. Block diagram of verification: input speech → feature extraction → similarity against the reference model of the claimed speaker (Speaker #M, selected by the Speaker ID) → decision against a threshold → verification result (Accept/Reject).]
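The difference between the two modes can be sketched in a few lines. This is a minimal illustration in Python rather than the project's MATLAB; the score values, the speaker names and the threshold of 0.8 are hypothetical, and the similarity measure itself is left abstract.

```python
def identify(scores):
    """Identification: pick the enrolled speaker with the highest similarity."""
    return max(scores, key=scores.get)


def verify(score, threshold):
    """Verification: accept the claimed identity only if its similarity reaches the threshold."""
    return score >= threshold


# Hypothetical similarity scores between one utterance and three reference models.
scores = {"speaker_1": 0.42, "speaker_2": 0.91, "speaker_3": 0.57}

best = identify(scores)                                  # maximum selection over all models
accepted = verify(scores["speaker_3"], threshold=0.8)    # claim "I am speaker_3"
```

Identification always returns one of the enrolled speakers, while verification can reject a claim outright, which is why it needs a threshold.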
FEATURE EXTRACTION
• Feature extraction is the process of extracting a small amount of data from the voice signal that can
later be used to represent each speaker.
• A wide range of possibilities exists for parametrically representing the speech signal for the speaker
recognition task, such as Mel-Frequency Cepstrum Coefficients (MFCC).
[Figure 3: Example of Speech Signal. Waveform plotted against time (seconds), from 0 to 0.018 s, with amplitude between -0.5 and 0.5.]
MEL-FREQUENCY CEPSTRUM COEFFICIENTS (MFCC) PROCESSOR
[Block diagram: continuous speech → Frame Blocking → frame → Windowing → FFT → spectrum → Mel-frequency Wrapping → mel spectrum → Cepstrum → mel cepstrum.]
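The first three stages of the processor (frame blocking, windowing and FFT) can be sketched as follows. This is an illustrative Python/NumPy version, not the project's MATLAB code; the 8 kHz sampling rate, the 256-sample frame, the 100-sample hop, the Hamming window and the synthetic 1 kHz tone are all assumptions chosen for the example.

```python
import numpy as np


def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames of frame_len samples, hop samples apart."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]


def power_spectrum(frames):
    """Apply a Hamming window to each frame, then take the squared FFT magnitude."""
    window = np.hamming(frames.shape[1])
    spectrum = np.fft.rfft(frames * window, axis=1)
    return np.abs(spectrum) ** 2


# Assumed parameters: 8 kHz sampling, 256-sample frames, 100-sample hop.
fs = 8000
t = np.arange(fs) / fs
signal = np.sin(2 * np.pi * 1000 * t)      # synthetic 1 kHz tone standing in for speech

frames = frame_signal(signal, frame_len=256, hop=100)
power = power_spectrum(frames)
peak_hz = np.argmax(power[0]) * fs / 256   # frequency of the strongest bin in the first frame
```

Each row of `power` is the spectrum of one windowed frame, which is the input the mel-frequency stage expects.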
MFCC PROCESSOR ELABORATED
• Frame Blocking
• Windowing
• Fast Fourier Transform
• Mel-Frequency Wrapping
• Cepstrum
[Figure 4: Example of mel-spaced frequency bank. Plot titled “Mel-spaced filterbank”: horizontal axis Frequency (Hz), 0 to 7000; vertical axis 0 to 2.]
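The last two stages, mel-frequency wrapping and the cepstrum, can be sketched as below. This is a Python/NumPy illustration under stated assumptions, not the project's code: the common 2595·log10(1 + f/700) mel formula, triangular filters, 20 filters, 12 coefficients and a DCT-II are choices made for the example, and the flat input spectrum is a placeholder.

```python
import numpy as np


def hz_to_mel(f):
    """Convert frequency in Hz to mels (a commonly used approximation)."""
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(fs / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / fs).astype(int)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):        # rising edge
            bank[i, k] = (k - left) / (center - left)
        for k in range(center, right):       # falling edge
            bank[i, k] = (right - k) / (right - center)
    return bank


def mel_cepstrum(power, bank, n_coeffs):
    """Log of the mel spectrum followed by a DCT-II, keeping the first n_coeffs terms."""
    log_mel = np.log(power @ bank.T + 1e-10)   # small offset avoids log(0)
    n = bank.shape[0]
    k = np.arange(n_coeffs)[:, None]
    basis = np.cos(np.pi * k * (np.arange(n) + 0.5) / n)
    return log_mel @ basis.T


fs, n_fft = 8000, 256
bank = mel_filterbank(n_filters=20, n_fft=n_fft, fs=fs)
power = np.ones((3, n_fft // 2 + 1))           # flat placeholder spectrum, 3 frames
mfcc = mel_cepstrum(power, bank, n_coeffs=12)
```

Each row of `mfcc` is one acoustic vector. The filters are narrow at low frequencies and wider at high frequencies, which is the shape Figure 4 illustrates.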
FEATURE MATCHING
• Feature matching is the actual procedure of identifying the unknown speaker by comparing the
features extracted from his/her voice input with those from a set of known speakers.
• The goal of pattern recognition is to classify objects of interest into one of a number of categories or
classes.
• The objects of interest are called patterns; in our case they are sequences of acoustic vectors
extracted from the input speech.
• The classes correspond to individual speakers.
PATTERN RECOGNITION TECHNIQUE
• The feature matching technique used in the “VOICE IDENTIFICATION AND RECOGNITION SYSTEM” is Vector
Quantization (VQ).
• VQ is a process of mapping vectors from a large vector space to a finite number of regions in that space.
Each region is called a cluster and can be represented by its center, called a codeword. The collection of
all codewords is called a codebook.
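The mapping from vectors to codewords can be sketched as a nearest-neighbour lookup. This is a small Python/NumPy illustration, not the project's MATLAB code; the two-codeword codebook, the sample vectors and the use of Euclidean distance are assumptions for the example.

```python
import numpy as np


def quantize(vectors, codebook):
    """Map each vector to the index of its nearest codeword (Euclidean distance)."""
    dists = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
    return np.argmin(dists, axis=1)


# Hypothetical 2-D codebook with two codewords (cluster centers).
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
vectors = np.array([[1.0, -1.0], [9.0, 11.0], [0.5, 0.2], [12.0, 8.0]])

labels = quantize(vectors, codebook)   # cluster index for every vector
```

Here the first and third vectors fall in the cluster of codeword 0, the other two in the cluster of codeword 1.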
RECOGNITION PROCESS
[Figure 5: Conceptual Diagram Illustrating Vector Quantization Codebook Formation. Labels in the diagram: Speaker 1 samples and centroids, Speaker 2 samples and centroids, VQ distortion.]
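Recognition with VQ is commonly done by scoring an unknown utterance against every speaker's codebook and choosing the smallest distortion. The sketch below assumes that scheme; it is written in Python/NumPy rather than MATLAB, and the codebooks, speaker names and test vectors are hypothetical.

```python
import numpy as np


def vq_distortion(vectors, codebook):
    """Average distance from each vector to its nearest codeword."""
    dists = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
    return dists.min(axis=1).mean()


def identify_speaker(vectors, codebooks):
    """Return the speaker whose codebook gives the smallest average VQ distortion."""
    return min(codebooks, key=lambda name: vq_distortion(vectors, codebooks[name]))


# Hypothetical codebooks trained for two speakers.
codebooks = {
    "speaker_1": np.array([[0.0, 0.0], [1.0, 1.0]]),
    "speaker_2": np.array([[5.0, 5.0], [6.0, 6.0]]),
}
unknown = np.array([[5.2, 4.9], [6.1, 5.8]])   # acoustic vectors of the unknown voice

result = identify_speaker(unknown, codebooks)
```

A low distortion means the unknown vectors sit close to that speaker's centroids.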
LINDE-BUZO-GRAY ALGORITHM
The Linde–Buzo–Gray algorithm (introduced by Yoseph Linde,
Andrés Buzo and Robert M. Gray in 1980) is a vector quantization
algorithm to derive a good codebook.
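[Flowchart of the LBG algorithm, where m is the current codebook size, M the required size, D the current distortion and D' the previous one:
a. Find centroid.
b. If m < M, split each centroid (m = 2*m); otherwise stop.
c. Cluster vectors.
d. Find centroids.
e. Compute D (distortion).
f. If (D' - D) / D < ε, return to step b; otherwise set D' = D and return to step c.]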
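The splitting procedure can be sketched in code as follows. This is a minimal Python/NumPy illustration rather than the project's MATLAB implementation; the multiplicative split factor `split_eps`, the tolerance `tol`, the Euclidean distance and the toy training vectors are assumptions, and `target_size` is assumed to be a power of two.

```python
import numpy as np


def lbg(vectors, target_size, split_eps=0.01, tol=1e-3):
    """Grow a codebook by repeated splitting until it holds target_size codewords."""
    # Find centroid: a 1-vector codebook is the mean of all training vectors.
    codebook = vectors.mean(axis=0, keepdims=True)
    while codebook.shape[0] < target_size:
        # Split each centroid into two slightly perturbed copies (m = 2*m).
        codebook = np.vstack([codebook * (1 + split_eps), codebook * (1 - split_eps)])
        prev = np.inf
        while True:
            # Cluster vectors: assign each one to its nearest codeword.
            dists = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
            labels = dists.argmin(axis=1)
            # Compute D: average distortion of this assignment.
            distortion = dists.min(axis=1).mean()
            # Find centroids: move each codeword to the mean of its cluster.
            for j in range(codebook.shape[0]):
                members = vectors[labels == j]
                if len(members) > 0:
                    codebook[j] = members.mean(axis=0)
            # Stop refining once the relative improvement falls below tol.
            if prev - distortion <= tol * distortion:
                break
            prev = distortion
    return codebook


# Toy training set: two well-separated groups of 2-D vectors.
vectors = np.array([[1.0, 1.0], [1.2, 0.8], [0.8, 1.2],
                    [5.0, 5.0], [5.2, 4.8], [4.8, 5.2]])
codebook = lbg(vectors, target_size=2)
```

On this toy data the two codewords settle on the centers of the two groups. Note that a multiplicative split cannot separate a centroid that is exactly zero, so an additive perturbation may be preferable for zero-mean features.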
THE GRAPHICAL USER INTERFACE
• There are many ways to build a custom Graphical User Interface (GUI) in MATLAB: you can lay it out
manually, or you can use the more efficient “GUIDE” approach.
Figure 6: Guide Quick Start Window
Figure 7: Our Custom GUI
EMBEDDING CODE TO THE GUI
• Note that in the figure we have six essential buttons, each of which performs its own task.
a. “Add New Sound To The Database”
b. “Speaker Recognition From Mike”
c. “DATABASE INFORMATION”
d. “PLOT DATABASE”
e. “Delete Database”
f. “EXIT”
Figure 7: Our Custom GUI
ADDING A BACKGROUND TO THE GUI
CODE:
% create an axes that spans the whole GUI
ah = axes('Units', 'normalized', 'Position', [0 0 1 1]);
% import the background image and show it on the axes
bg = imread('project image 3.jpg');
imagesc(bg);
% prevent plotting over the background and turn the axis off
set(ah, 'HandleVisibility', 'off', 'Visible', 'off');
% make sure the background is behind all the other uicontrols
uistack(ah, 'bottom');
Figure 8: Our Custom Background
Figure 9: Our Final Program
APPLICATION DEPLOYMENT
Figure 10: Standalone application deployment window
Figure 11: Our Custom Splash screen
REFERENCES
• L.R. Rabiner and B.H. Juang, Fundamentals of Speech Recognition, Prentice-Hall, Englewood Cliffs, N.J., 1993.
• S.B. Davis and P. Mermelstein, “Comparison of parametric representations for monosyllabic word recognition
in continuously spoken sentences”, IEEE Transactions on Acoustics, Speech, and Signal Processing, Vol. ASSP-28,
No. 4, August 1980.
• Y. Linde, A. Buzo and R. Gray, “An algorithm for vector quantizer design”, IEEE Transactions on Communications,
Vol. 28, pp. 84-95, 1980.
• S. Furui, “Speaker independent isolated word recognition using dynamic features of speech spectrum”, IEEE
Transactions on Acoustics, Speech, and Signal Processing, Vol. ASSP-34, No. 1, pp. 52-59, February 1986.
• F.K. Soong, A.E. Rosenberg and B.H. Juang, “A vector quantisation approach to speaker recognition”, AT&T
Technical Journal, Vol. 66-2, pp. 14-26, March 1987.
• comp.speech Frequently Asked Questions WWW site,
http://svr-www.eng.cam.ac.uk/comp.speech/
Voice Identification And Recognition System, Matlab
