Speech Processing
Feature Extraction
By.Mohamed Essam
Sound Feature
Sound features can be used to identify speakers and to estimate
gender, age, diseases and much more from the voice.
Energy
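The energy of a signal is the total magnitude of the signal,
i.e. roughly how loud the signal is. It can be used as a feature
for classifying whether someone is angry or not.
It is defined as:
$$E = \sum_{n} |x(n)|^2$$
where x(n) is the n-th sample of the signal.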
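A minimal sketch of this feature, assuming the standard definition of energy as the sum of squared sample magnitudes. The function names `energy` and `rms` and the two toy signals are illustrative and do not come from the slides.

```python
import math

def energy(signal):
    """Total energy: sum of squared sample magnitudes."""
    return sum(abs(x) ** 2 for x in signal)

def rms(signal):
    """Root-mean-square level, a loudness proxy that does not grow with length."""
    return math.sqrt(energy(signal) / len(signal))

# Two toy signals with the same shape but different amplitude.
quiet = [0.1, -0.1, 0.1, -0.1]
loud = [0.5, -0.5, 0.5, -0.5]

print(energy(quiet), energy(loud))
print(rms(quiet), rms(loud))
```

The louder signal has the larger energy. RMS is often preferred when comparing frames of different lengths, because raw energy grows with the number of samples.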
Tempo (BPM)
An estimate of the tempo in beats per minute (BPM):
the rate of speed of a musical piece or passage.
Like a heartbeat, it can also be thought of as the 'pulse' of
the music. Tempo is measured in BPM, or beats per minute.
- It can indicate that someone is arrested.
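As a rough illustration of what a BPM estimate means, the sketch below converts a list of beat timestamps into a tempo. It assumes the beat times (in seconds) have already been detected by some other means; the function name `tempo_bpm` and the sample timestamps are hypothetical.

```python
from statistics import median

def tempo_bpm(beat_times):
    """Estimate tempo in BPM from beat timestamps given in seconds."""
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:])]
    if not intervals:
        raise ValueError("need at least two beat times")
    # The median interval is less sensitive to a single missed or extra beat.
    return 60.0 / median(intervals)

# Beats every half second correspond to 120 beats per minute.
beats = [0.0, 0.5, 1.0, 1.5, 2.0]
print(tempo_bpm(beats))
```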
Bank of filters
Digital filter-banks are an integral part of many speech and
audio processing algorithms used in today’s communication
systems. They are commonly employed for adaptive subband
filtering.
Another frequent task is speech enhancement by noise
reduction.
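To make the idea of a filter bank concrete, here is a deliberately simple two-band sketch: a moving-average low-pass filter and its complement as the high-pass band. This is an assumption chosen for illustration, not the adaptive subband design the slide refers to, and the names `moving_average` and `two_band_filter_bank` are hypothetical.

```python
def moving_average(signal, width=5):
    """Simple low-pass filter: centered moving average (edges use fewer samples)."""
    half = width // 2
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

def two_band_filter_bank(signal, width=5):
    """Split a signal into a low band and the complementary high band."""
    low = moving_average(signal, width)
    high = [x - l for x, l in zip(signal, low)]
    return low, high

# A slow ramp with a fast alternating component on top.
signal = [0.1 * n + (0.5 if n % 2 else -0.5) for n in range(20)]
low, high = two_band_filter_bank(signal)

# Adding the two bands back together recovers the input.
rebuilt = [l + h for l, h in zip(low, high)]
```

Because the high band is defined as the input minus the low band, the two bands always sum back to the original signal. Practical filter banks use more bands and sharper filters.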
Pitches
Pitch is the quality of a sound that depends on how fast the
object producing it vibrates.
When an object vibrates quickly, high-pitched sounds are
heard.
Low-pitched sounds come from things that vibrate more
slowly.
Humans can hear sounds of many different pitches, but there are
sounds they cannot hear: human ears cannot detect very
low-pitched sounds, known as infrasound, or very high-pitched
sounds, called ultrasound.
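One common way to estimate pitch from samples is to find the lag at which the signal best matches a shifted copy of itself (autocorrelation). The sketch below assumes that approach; the slides do not name a method, and the function name, the 50 to 500 Hz search range, and the test tone are illustrative choices.

```python
import math

def estimate_pitch(signal, sample_rate, fmin=50.0, fmax=500.0):
    """Estimate the fundamental frequency (Hz) by autocorrelation peak picking."""
    min_lag = int(sample_rate / fmax)
    max_lag = int(sample_rate / fmin)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the signal with itself shifted by `lag` samples.
        score = sum(signal[i] * signal[i + lag] for i in range(len(signal) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# A 200 Hz sine wave sampled at 8 kHz repeats every 40 samples.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(800)]
print(estimate_pitch(tone, sample_rate))
```

A faster vibration gives a shorter period, so the best lag shrinks and the estimated pitch rises.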
2.Feature Extraction
Zero Crossing Rate
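The zero-crossing rate (ZCR) is the rate at which a signal
changes sign, from positive to zero to negative or from negative to
zero to positive. Its value has been widely used in both
speech recognition and music information retrieval, being a
key feature to classify percussive sounds. It is defined formally as
$$zcr = \frac{1}{T-1} \sum_{t=1}^{T-1} \mathbb{1}_{\mathbb{R}_{<0}}(s_t s_{t-1})$$
where s is a signal of length T and \mathbb{1}_{\mathbb{R}_{<0}} is an indicator
function.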
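A minimal sketch of the zero-crossing rate, assuming the usual formulation in which a crossing is counted whenever the product of two consecutive samples is negative. The function name and the toy signals are illustrative.

```python
def zero_crossing_rate(signal):
    """Fraction of consecutive sample pairs whose product is negative."""
    if len(signal) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(signal, signal[1:]) if a * b < 0)
    return crossings / (len(signal) - 1)

# A signal that flips sign at every step versus one that never does.
noisy = [1, -1, 1, -1, 1]
smooth = [1, 2, 3, 2, 1]

print(zero_crossing_rate(noisy))
print(zero_crossing_rate(smooth))
```

With this strict test, a sample that is exactly zero does not count as a crossing on its own; implementations differ on how they treat that case.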
Spectrogram
A spectrogram is a 2D plot of frequency against time in which
the color intensity at each point represents the
amplitude of a particular frequency at a particular time.
In simple terms, the spectrogram
shows the spectrum of frequencies of a signal as it
varies with time.
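The sketch below shows one way such a time-frequency grid can be computed: cut the signal into overlapping frames, apply a Hann window, and take the magnitude of a discrete Fourier transform of each frame. The frame size, hop length, and function names are assumptions for illustration, and the DFT is written out naively for clarity rather than speed.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitudes of the non-negative-frequency DFT bins of one frame."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]

def spectrogram(signal, frame_size=64, hop=32):
    """One magnitude spectrum per frame (rows = time, columns = frequency)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / frame_size)
              for t in range(frame_size)]
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = [signal[start + t] * window[t] for t in range(frame_size)]
        frames.append(dft_magnitudes(frame))
    return frames

# A 1000 Hz tone at 8 kHz: with 64-sample frames each bin spans 125 Hz,
# so the energy should concentrate around bin 8 in every frame.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(256)]
spec = spectrogram(tone)
peak_bins = [max(range(len(row)), key=row.__getitem__) for row in spec]
```

Plotting `spec` as an image, with color mapped to magnitude, gives the familiar spectrogram picture.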
Spectral Centroid
The spectral centroid is a measure used in
digital signal processing to characterise a
spectrum. It indicates where the center of mass
of the spectrum is located. Perceptually, it has a
robust connection with the impression of
brightness of a sound.
Because of this perceptual link to brightness, the spectral
centroid is also used to characterise musical timbre.
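The "center of mass" idea can be sketched as a magnitude-weighted average of the bin frequencies. The function name, the assumed sample rate and frame size, and the two made-up magnitude spectra below are hypothetical values chosen for illustration.

```python
def spectral_centroid(magnitudes, sample_rate, frame_size):
    """Magnitude-weighted mean frequency (Hz) of one spectrum."""
    total = sum(magnitudes)
    if total == 0:
        return 0.0  # Silent frame: no meaningful centroid.
    freqs = [k * sample_rate / frame_size for k in range(len(magnitudes))]
    return sum(f * m for f, m in zip(freqs, magnitudes)) / total

# Made-up spectra with bins spaced 500 Hz apart (8000 Hz / 16 samples).
dull = [0, 4, 2, 1, 0, 0, 0, 0]      # energy concentrated at low frequencies
bright = [0, 0, 0, 0, 1, 2, 4, 0]    # energy concentrated at high frequencies

c_dull = spectral_centroid(dull, sample_rate=8000, frame_size=16)
c_bright = spectral_centroid(bright, sample_rate=8000, frame_size=16)
print(c_dull, c_bright)
```

The spectrum with more high-frequency content has the higher centroid, which matches the impression of a brighter sound.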
