RyAudio,
A Real-time Audio Spectrogram
with Application to Sound-Driven Games
in Python 3, Pyaudio, Pygame, and Pylab
Renyuan Lyu
呂仁園
May 18, 14:30, @R1
Preface
• Python helped me implement a real-time
spectrogram at the beginning of this year
(2014).
– After doing speech signal processing research for
a long time, I am excited to share this
with friends.
– So I submitted my program, with a YouTube demo, to
this conference.
• The following are the scores and comments
given by the reviewers:
• Reviewer #1: Score: 2
– No comments
• Reviewer #2: Score: 3
– real-time speech recognizer !!!
• Reviewer #3: Score: 3
– I'll admit being a bit selfish here. I have been planning
to work on audio analysing for a while. This looks like
a good start. :)
• Reviewer #4: Score: 0
– After reviewing his code carefully, I have to say
that his spectrogram analysis is not good for
speech processing. He took every 512 samples to
perform FFT to get spectrum under 16kHz
sampling rate. As far as I know, speech processing
will use so-called short-term frequency analysis,
which is different than this one. Well, it might be
an interesting topic for Python users as long as he
provides accurate and correct information about
DSP.
• Reviewer #5: Score: 2
– Sound recognition is different than speech
recognition, right?
• Reviewer #6: Score: 3
– I am too excited to give any comment. I would
even love to pay for his ticket just to listen to this
talk.
• By the way, Python 3 allows me to use my
native (most fluent) language to name the
variables, functions, and classes.
– That is an even more wonderful experience.
– I can have much more precise, more elegant
vocabulary to construct the program.
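As a small illustration of this point, here is a minimal sketch of Python 3 code with Chinese names for a function, a class, and variables. All identifiers here are made up for this example and are not taken from RyAudio or ryApp.py.

```python
import math

def 正弦波(頻率, 取樣率, 點數):
    """Return 點數 samples of a sine wave (頻率 in Hz, sampled at 取樣率 Hz)."""
    return [math.sin(2 * math.pi * 頻率 * n / 取樣率) for n in range(點數)]

class 聲音:
    """A tiny container for audio samples (hypothetical, for illustration)."""

    def __init__(self, 樣本):
        self.樣本 = 樣本

    def 能量(self):
        """Mean squared amplitude of the samples."""
        return sum(x * x for x in self.樣本) / len(self.樣本)

# One full period of a 1000 Hz tone sampled at 16000 Hz.
音 = 聲音(正弦波(頻率=1000, 取樣率=16000, 點數=16))
print(音.能量())
```

Keyword arguments such as `頻率=1000` work the same way as with ASCII names; only the built-in keywords and library names stay in English.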
Overview
• Some Background on this talk
• Signal Processing, Speech
• Spectrum, Spectrogram
• Processing in Real Time
• An Awesome Example: Friture
• RyAudio
• A lighter example of a real-time spectrogram
• Demo
• Some Comments on Programming in Native
Languages
• Using Chinese in Python 3
Signal Processing
• Signal Processing deals with operations on or
analysis of analog or digital signals,
representing time varying or spatially varying
physical quantities, like sound, image or video.
http://upload.wikimedia.org/wikipedia/commons/4/46/Signal_processing_system.png
Speech
• Speech is a 1-dimensional signal
– a subclass of audio signal
• a representation of sound, typically as an electrical voltage
• with frequencies in the audio frequency range
– roughly 20 to 20,000 Hz (the limits of human hearing)
– the vocalized form of human language
• carrying linguistic information
– a frequency range up to about 8,000 Hz is enough for speech
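The 8,000 Hz figure connects to the 16 kHz sampling rate mentioned in the reviewer comments: by the sampling theorem, the sampling rate should be at least twice the highest frequency to be captured. The sketch below is only an illustration of that arithmetic; the function names are made up for this example.

```python
def nyquist_rate(max_freq_hz):
    """Lowest sampling rate (Hz) that can represent content up to max_freq_hz."""
    return 2 * max_freq_hz

def alias_frequency(freq_hz, rate_hz):
    """Frequency (Hz) at which a pure tone appears after sampling at rate_hz."""
    folded = freq_hz % rate_hz
    return min(folded, rate_hz - folded)

print(nyquist_rate(8000))            # sampling rate needed for an 8 kHz band
print(alias_frequency(7000, 16000))  # inside the band: stays where it is
print(alias_frequency(9000, 16000))  # above the band: folds back below 8 kHz
```

A tone above half the sampling rate is not lost but folded back into the band, which is why audio is normally low-pass filtered before sampling.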
(Optical) Spectrum
• The word spectrum was first used
scientifically within the field of
optics
– to describe the rainbow of colors
in visible light
• when separated using a prism.
Audio/Speech Spectrum
• A spectrum can also be obtained from an
audio/speech signal,
• where it represents the frequency distribution of the signal.
• Fast Fourier Transform (FFT)
• the core algorithm for computing such a spectrum.
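A minimal sketch of obtaining a spectrum with the FFT, assuming NumPy is available. The test signal (two tones at 1000 Hz and 3000 Hz) and the 512-sample, 16 kHz setup are chosen for illustration only.

```python
import numpy as np

def magnitude_spectrum(signal, rate):
    """Return (frequencies in Hz, amplitudes) for a real-valued signal."""
    n = len(signal)
    amplitudes = np.abs(np.fft.rfft(signal)) / (n / 2)  # scale so a unit sine reads 1.0
    freqs = np.fft.rfftfreq(n, d=1.0 / rate)
    return freqs, amplitudes

rate = 16000                      # samples per second
n = 512                           # number of samples analysed
t = np.arange(n) / rate
signal = np.sin(2 * np.pi * 1000 * t) + 0.5 * np.sin(2 * np.pi * 3000 * t)

freqs, amplitudes = magnitude_spectrum(signal, rate)
strongest = np.sort(freqs[np.argsort(amplitudes)[-2:]])
print(strongest)                  # the two dominant frequencies in Hz
```

Both tones fall exactly on FFT bins here (the bin spacing is 16000 / 512 = 31.25 Hz), so no window function is needed; for arbitrary signals a window is usually applied first.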
Spectrogram
• Speech is a time-varying signal
• a short-time FFT is applied in the spectral analysis to form
a time-frequency spectrogram
– Typically the short-time frame is about 20 ms long.
• Free analysis tools for speech processing
• Audacity, Praat, etc.
• Perfect for off-line, non-real-time processing
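The short-time analysis can be sketched as follows, assuming NumPy. The 20 ms frame comes from the slide; the 10 ms hop and the Hann window are assumptions made for this example. For comparison, the 512-sample frame at 16 kHz mentioned in the reviewer comments spans 512 / 16000 = 32 ms, which is in the same short-time range.

```python
import numpy as np

def spectrogram(signal, rate, frame_ms=20.0, hop_ms=10.0):
    """Magnitude spectrogram: one row per frame, one column per frequency bin."""
    frame_len = int(rate * frame_ms / 1000)
    hop = int(rate * hop_ms / 1000)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

# One second of audio: 500 Hz for the first half, 2000 Hz for the second half.
rate = 16000
t = np.arange(rate) / rate
signal = np.where(t < 0.5, np.sin(2 * np.pi * 500 * t), np.sin(2 * np.pi * 2000 * t))

S = spectrogram(signal, rate)
print(S.shape)   # (99, 161): 99 frames, 161 frequency bins of 50 Hz each
```

Each row is the spectrum of one 20 ms slice, so the jump from 500 Hz to 2000 Hz shows up as the peak moving from bin 10 to bin 40 halfway through the rows.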
Processing in Real-time
• Real-time processing
– Acquiring, processing, and responding simultaneously
• An example: Friture
– A Python application to visualize and analyze live
audio data in real-time.
– imports PyQt, PyQwt, PyAudio, Numpy, Scipy,
Cython, OpenGL, etc.
– http://friture.org/
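One common way to overlap acquiring, processing, and responding is a producer thread feeding a queue that the main loop drains. The sketch below uses only the standard library and a synthetic 440 Hz tone in place of a sound card; it is an assumed structure for illustration, not how Friture or RyAudio is organised.

```python
import math
import queue
import threading

RATE = 16000        # samples per second
FRAME_SIZE = 512    # samples per frame

def acquire(frames_out, n_frames):
    """Stand-in for a sound card: emits frames of a 440 Hz tone."""
    for k in range(n_frames):
        start = k * FRAME_SIZE
        frame = [math.sin(2 * math.pi * 440 * (start + i) / RATE)
                 for i in range(FRAME_SIZE)]
        frames_out.put(frame)      # blocks if the consumer falls behind
    frames_out.put(None)           # sentinel: no more audio

def process(frame):
    """Per-frame analysis; here just the RMS level."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def run(n_frames=10):
    frames = queue.Queue(maxsize=8)
    producer = threading.Thread(target=acquire, args=(frames, n_frames))
    producer.start()
    levels = []
    while True:
        frame = frames.get()
        if frame is None:
            break
        levels.append(process(frame))   # a real app would respond here (draw, move a sprite)
    producer.join()
    return levels

print(run(5))
```

The bounded queue keeps acquisition and processing loosely coupled: processing must finish a frame within roughly one frame duration (32 ms here) on average, or the queue fills up.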
An Awesome Example: Friture
• I found this app in 2011.
• It was implemented in Python.
– this is one of the reasons I was drawn into the Python
world
• Comments on Friture:
– Cool, Splendid, Wonderful, Awesome!!
– But,
• Imports too many modules
– PyQt, PyQwt, PyAudio, Numpy, Scipy, Cython, OpenGL, etc.
• Python 2 only, not yet Python 3
– I have only a Python 3 environment installed
• Too complicated for me as a newbie to follow
– The Zen of Python
» Simple is better than complex.
» Complex is better than complicated.
A smaller set of dependencies
• A smaller set of modules is enough to implement a
real-time spectrogram
– PyAudio
• For acquiring speech
– http://people.csail.mit.edu/hubert/pyaudio/
– Pylab
• For DSP (FFT, etc.)
– http://wiki.scipy.org/PyLab
– PyGame
• For display and GUI
– http://www.pygame.org/news.html
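A minimal sketch of how these pieces might fit together, assuming PyAudio and NumPy. This is not the RyAudio source: the function names, the Hann window, and the -80 dB floor are assumptions for illustration, while the 512-sample frame and 16 kHz rate are taken from the reviewer comments. PyAudio is imported lazily, so the conversion step runs without audio hardware.

```python
import numpy as np

RATE = 16000    # samples per second
CHUNK = 512     # samples per read

def open_microphone(rate=RATE, chunk=CHUNK):
    """Open a mono 16-bit input stream with PyAudio (needs a microphone)."""
    import pyaudio  # imported here so the rest works without PyAudio installed
    pa = pyaudio.PyAudio()
    stream = pa.open(format=pyaudio.paInt16, channels=1, rate=rate,
                     input=True, frames_per_buffer=chunk)
    return pa, stream

def chunk_to_column(raw_bytes, floor_db=-80.0):
    """Turn one chunk of 16-bit PCM bytes into a column of 0..255 brightness values."""
    samples = np.frombuffer(raw_bytes, dtype=np.int16).astype(np.float64) / 32768.0
    windowed = samples * np.hanning(len(samples))
    magnitude = np.abs(np.fft.rfft(windowed)) / len(samples)
    level_db = 20.0 * np.log10(magnitude + 1e-12)       # small offset avoids log(0)
    scaled = (np.clip(level_db, floor_db, 0.0) - floor_db) / -floor_db
    return (scaled * 255).astype(np.uint8)

# Synthetic stand-in for stream.read(CHUNK): a 1 kHz tone as 16-bit PCM bytes.
t = np.arange(CHUNK) / RATE
fake_chunk = (0.5 * 32767 * np.sin(2 * np.pi * 1000 * t)).astype(np.int16).tobytes()
column = chunk_to_column(fake_chunk)
print(column.shape, int(np.argmax(column)))
```

In a live loop, `stream.read(CHUNK)` would supply the bytes, and each returned column would become one vertical strip of the scrolling spectrogram drawn with PyGame.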
RyAudio
~ a class to deal with audio processing
Source code can be found here:
https://gist.github.com/renyuanL/f9cb017a3a5b6c621b43
ryApp.py
~ a real-time spectrogram app
• Source code can be found here
– https://gist.github.com/renyuanL/f9cb017a3a5b6c621b43
Demo
• Demo on YouTube
• http://youtu.be/sFtKlLF88DU
Some Comments
on Programming in Native Languages
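For people whose native language is not English (especially children),
being allowed to write programs in their mother tongue
makes it more likely that they get a foothold in programming, and even that ideas start to flow freely,
so they can produce content quickly.
Once the content has largely settled,
and in order to share the result with people around the world,
the native-language program needs to be converted into an English program
so that it can circulate globally.
To draw an analogy with writing:
Jin Yong (金庸) had to write in Chinese to produce "The Return of the Condor Heroes" (神鵰俠侶), and
Mark Twain had to write in English to produce "Tom Sawyer".
Once a work is excellent and well known, the need to translate it into other languages
and share it with more people follows naturally.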
Python Code Translation
ryApp.py → ryApp_en.py
• https://gist.github.com/renyuanL/f9cb017a3a5b6c621b43
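One possible way to automate such a translation is to rename identifiers token by token with the standard `tokenize` module, so that string literals and comments stay untouched. This is a hypothetical sketch, not the tool used to produce ryApp_en.py; the glossary and the sample source are invented for this example.

```python
import io
import tokenize

def translate_identifiers(source, mapping):
    """Rename identifiers in Python source; strings and comments are left alone."""
    renamed = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string in mapping:
            tok = tok._replace(string=mapping[tok.string])
        renamed.append(tok)
    return tokenize.untokenize(renamed)

# Hypothetical glossary; these names are not taken from ryApp.py.
glossary = {"頻率": "frequency", "取樣率": "sample_rate", "週期": "period"}

chinese_source = (
    "頻率 = 440\n"
    "取樣率 = 16000\n"
    "週期 = 取樣率 / 頻率  # 每週期的點數\n"
    "標籤 = '頻率'\n"
)

english_source = translate_identifiers(chinese_source, glossary)
print(english_source)
```

Names missing from the glossary (here `標籤`) pass through unchanged, so a translation can be built up incrementally. Comments and docstrings would still need to be translated separately.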
Examples of Chinese Programs
• http://apython.blogspot.tw/
• A set of Chinese Programs in Python 3
– https://gist.github.com/renyuanL/044b6bc6142dc71086bc
– https://gist.github.com/renyuanL/a36f2a121c4d27753d8c
陰陽 Yinyang
• C:\Python33\Lib\turtledemo\yinyang.py
Coding in your own native language
• If readability counts, then it is at its
maximum when you code in your own native
language.
Thank you
for Listening

Pycon apac 2014
