Speech Compression
• Recommended Reading: J. Harrington and S. Cassidy, “Techniques in Speech Acoustics”, Kluwer, 1999
• Contents
  – Uncompressed audio data rates
  – ADPCM
  – SB-ADPCM
  – LPC

Uncompressed audio data rates
• Voice: 8000 samples/sec, 8 bits/sample, mono
  = 64000 bits/sec (64kbps)
• CD: 44100 samples/sec, 16 bits/sample, stereo
  = 1411200 bits/sec (~1.4Mbps)
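The uncompressed rates are simply sample rate × bits per sample × number of channels. A minimal sketch of that arithmetic (the function name `pcm_bit_rate` is made up for illustration):

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


voice_bps = pcm_bit_rate(8000, 8, 1)   # telephone-quality voice, mono
cd_bps = pcm_bit_rate(44100, 16, 2)    # CD audio, stereo

print(voice_bps)  # 64000
print(cd_bps)     # 1411200
```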




ADPCM (Adaptive Differential PCM)
• Uses the statistical properties of human speech (=> not compatible with fax/modem signals)
• Makes a prediction about the size of the next sample, based on previous info
• Transmitter then sends only the difference between the real value and the predicted value
• Receiver uses the same prediction algorithm, together with the differences, to reconstruct the speech data
• Enables the data rate to be reduced to 32kbps
• Used on international telephone links
• Specified in G.721, G.722, G.723, G.726, G.727

[Figure: ADPCM block diagram: measured value, subtractor (-), adaptive quantiser, predictor, transmitted value]
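The predict / send-the-difference / reconstruct loop can be sketched as below. This is a toy illustration only, not the G.726 algorithm: the predictor is simply the previous reconstructed sample, and the step-size adaptation rule, the 4-bit code range and all function names are assumptions made for the example. With 4-bit codes at 8000 samples/sec the rate is 8000 × 4 = 32000 bits/sec.

```python
import math


def _adapt(step, code):
    # Toy adaptation rule (assumed): grow the step after large codes,
    # shrink it after small ones, and keep it within fixed bounds.
    magnitude = abs(code)
    if magnitude >= 6:
        step *= 2.0
    elif magnitude <= 1:
        step *= 0.75
    return min(max(step, 1.0), 2048.0)


def adpcm_encode(samples, step=16.0):
    """Encode samples as 4-bit difference codes in the range -8..7."""
    codes = []
    predicted = 0.0
    for x in samples:
        diff = x - predicted                      # real value minus prediction
        code = max(-8, min(7, round(diff / step)))
        codes.append(code)
        predicted += code * step                  # mirror the decoder's state
        step = _adapt(step, code)
    return codes


def adpcm_decode(codes, step=16.0):
    """Rebuild the signal using the same predictor and adaptation rule."""
    out = []
    predicted = 0.0
    for code in codes:
        predicted += code * step
        out.append(predicted)
        step = _adapt(step, code)
    return out


# A 200 Hz test tone sampled at 8 kHz (40 samples per cycle).
signal = [int(1000 * math.sin(2 * math.pi * n / 40)) for n in range(200)]
codes = adpcm_encode(signal)
decoded = adpcm_decode(codes)
```

Because the encoder updates its prediction from the quantised code rather than from the true sample, encoder and decoder stay in step without any side information.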




SB-ADPCM (Sub-band ADPCM)
• Given 64kbps, ADPCM could produce better than toll voice quality (e.g. radio)
• Sub-bands are 0-4kHz (given 48kbps) and 4-7kHz (given 16kbps)
• Low band contains more audio energy, high band contains intelligibility info.
• Standardised in G.722

[Figure: SB-ADPCM encoder: analogue signal in, input filters, upper sub-band ADPCM encoder (4-7kHz, 16kbps), lower sub-band ADPCM encoder (50Hz-4kHz, 48kbps), MUX, digital signal out]
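The 48kbps / 16kbps split can be checked with a little arithmetic. The sketch below assumes the allocation used by G.722 in its 64kbps mode, which the slide does not state: the input is sampled at 16 kHz, each sub-band is decimated to 8000 samples/sec, and the lower and upper bands get 6 and 2 bits per sample respectively.

```python
SUBBAND_SAMPLE_RATE_HZ = 8000  # assumed: 16 kHz input, 2:1 decimation per band


def subband_bit_rate(bits_per_sample, sample_rate_hz=SUBBAND_SAMPLE_RATE_HZ):
    """Bit rate of one sub-band ADPCM stream in bits per second."""
    return bits_per_sample * sample_rate_hz


lower_bps = subband_bit_rate(6)   # lower band, assumed 6 bits/sample
upper_bps = subband_bit_rate(2)   # upper band, assumed 2 bits/sample
total_bps = lower_bps + upper_bps

print(lower_bps, upper_bps, total_bps)  # 48000 16000 64000
```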




Linear Predictive Coding (LPC)
• Introduced in the 1960s
• nth signal sample is represented as a linear combination of the previous p samples, plus a residual representing the prediction error:

x(n) = a_1 x(n-1) + a_2 x(n-2) + … + a_p x(n-p) + e(n)

• If the error (‘e’) is small enough, we can just transmit the coefficients (‘a’s)

LPC
• Coefficients (‘a’s) correspond to those of a vocal tract filter and the error signal (‘e’) corresponds to a source signal
• Source signal will approximate either a voiced signal (which looks like a series of impulses) or a white noise source
• So, LPC involves “exciting” a vocal tract filter with a source signal
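The source-filter idea can be sketched by running the LPC difference equation as a synthesis filter, driven either by an impulse train (voiced) or by white noise (unvoiced). The coefficient values, the frame length of 160 samples and the pitch period of 80 samples are hypothetical values chosen for illustration; they do not come from the slides.

```python
import random


def lpc_synthesise(coeffs, excitation):
    """x(n) = a_1 x(n-1) + ... + a_p x(n-p) + e(n), with x(n) = 0 for n < 0."""
    out = []
    for n, e in enumerate(excitation):
        x = e
        for i, a in enumerate(coeffs, start=1):
            if n - i >= 0:
                x += a * out[n - i]
        out.append(x)
    return out


def impulse_train(length, period):
    """Voiced source: one unit impulse every `period` samples."""
    return [1.0 if n % period == 0 else 0.0 for n in range(length)]


def white_noise(length, seed=0):
    """Unvoiced source: Gaussian white noise (seeded for repeatability)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(length)]


coeffs = [1.3, -0.8]  # hypothetical stable 2nd-order "vocal tract" filter
voiced = lpc_synthesise(coeffs, impulse_train(160, 80))
unvoiced = lpc_synthesise(coeffs, white_noise(160))
```

A decoder built this way needs only the coefficients plus a compact description of the source (voiced or unvoiced, pitch period, gain) rather than the samples themselves.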




Impulses and Filters
[Figure]

LPC – Autocorrelation
• Minimise the error signal by choosing optimal coefficients (‘a’s)
• Use the autocorrelation criterion (aka root mean squared criterion):

$$\sum_{i=1}^{p} a_i R(i - j) = R(j), \quad \text{for } 1 \le j \le p,$$

where R is the autocorrelation of x(n), defined as
 R(i) = E[x(n)x(n-i)]
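In practice the expectation is replaced by an average over a frame of samples. A minimal sketch, assuming the common biased estimate (divide by the frame length) and using R(-i) = R(i) for a real signal; the function names and the three-sample frame are made up for the example:

```python
def autocorrelation(x, max_lag):
    """Biased estimate of R(i) = E[x(n) x(n-i)] for i = 0..max_lag."""
    n_samples = len(x)
    return [
        sum(x[n] * x[n - i] for n in range(i, n_samples)) / n_samples
        for i in range(max_lag + 1)
    ]


def normal_equations(R, p):
    """Matrix with entries R(|i - j|) and right-hand side r_j = R(j), j = 1..p."""
    matrix = [[R[abs(i - j)] for i in range(p)] for j in range(p)]
    rhs = [R[j] for j in range(1, p + 1)]
    return matrix, rhs


frame = [1.0, 2.0, 3.0]            # tiny hypothetical frame
R = autocorrelation(frame, 2)      # R(0), R(1), R(2)
matrix, rhs = normal_equations(R, 2)
```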




LPC – Solving the autocorrelation formula
• In matrix form the equation can be written as
                      R*a=r
  where the autocorrelation matrix R is a symmetric Toeplitz matrix with elements r_{i,j} = R(i - j), vector r is the autocorrelation vector r_j = R(j), and vector a is the parameter vector of a_i
• An algorithm by N. Levinson (proposed in 1947) and modified by J. Durbin (in 1959) recursively calculates the solution to the Toeplitz system.
• GSM coder uses an integer version of the Schur recursion (1917)

LPC
• Used in:
  – GSM (Groupe Spécial Mobile) (Regular Pulse Excited LPC) (13kbps)
  – LD-CELP (Low-Delay Code Excited Linear Prediction) (G.728) (16kbps)
  – CS-ACELP (Conjugate Structure-Algebraic CELP) (G.729) (8kbps)
  – MP-MLQ (Multi Pulse – Maximum Likelihood Quantisation) (G.723.1) (6.3kbps)…
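The Levinson-Durbin recursion for R*a=r can be sketched in a few lines. This is a floating-point textbook form, assuming the sign convention x(n) = a_1 x(n-1) + … + a_p x(n-p) + e(n); it is not the integer Schur recursion used in the GSM coder, and the function name and test values are made up for the example.

```python
def levinson_durbin(R, p):
    """Solve sum_i a_i R(|i - j|) = R(j) for j = 1..p.

    R holds R(0)..R(p). Returns (a, err), where a[0] is a_1 and err is
    the final prediction error power.
    """
    a = []
    err = R[0]
    for m in range(1, p + 1):
        # Reflection coefficient for order m.
        acc = R[m] - sum(a[i] * R[m - 1 - i] for i in range(m - 1))
        k = acc / err
        # Update the order m-1 coefficients, then append a_m = k.
        a = [a[i] - k * a[m - 2 - i] for i in range(m - 1)] + [k]
        err *= 1.0 - k * k
    return a, err


# Autocorrelation of a first-order process with a_1 = 0.5: R(i) = 0.5**i.
coeffs, err = levinson_durbin([1.0, 0.5, 0.25], 2)
print(coeffs, err)  # [0.5, 0.0] 0.75
```

Each pass raises the model order by one and reuses the previous solution, so the cost grows with p squared rather than p cubed as in general matrix inversion.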






Z24 4 Speech Compression
