A (Mis-) Guided Tour of the “Web Audio API” 
Edward B. Rockower, Ph.D. 
Presented 10/15/14 
Monterey Bay Information Technologists (MBIT) Meetup 
ed@rockower.net
Abstract 
• Audio for websites has a very checkered past. 
• The HTML5 <audio> tag is a big step forward. 
• The “Web Audio API” is more of a giant leap: 
– modeled on a modular graph of “audio nodes” 
– provides filters, gains, convolvers, spectral analysis, and spatially-located sound sources 
– very important for sounds in games, online music synthesis, speech recognition, and analyses 
• Javascript Arrays and XHR2 (AJAX) 
• “getUserMedia” to capture real-time camera and microphone input 
• arriving “as we speak” (check availability: www.CanIUse.com)
Organizing Principles (Evolutionary → Revolutionary) 
[Diagram: enabling technologies converging on the Web Audio API: events, asynchronous/callbacks, Web Workers; artificial sounds (the Theremin, the Moog audio synthesizer) and natural sounds (birds, voices); the transistor, printed circuits, FFT, A/D, PCM, DSP; the Internet, browser wars, human/computer interaction; online music & games, computer-generated demos; music and audio engineering.]
What’s New in AJAX, HTML5, & Javascript 
• New XHR2 (arraybuffer, typed arrays, CORS) 
• Asynchronous (callbacks, non-blocking, events) 
• Audio threads (Audio Worker) 
• getUserMedia (HTML5, WebRTC) 
• requestAnimationFrame (60 fps, HTML5) 
• <audio> (HTML5 mediaElement) 
• Web Audio API (optimized ‘native’ C code, modular audio graph, Fast Fourier Transform) 
• Vendor-prefixed syntax (webkitAudioContext) 
• Firefox v. 32 Dev Tools displays/edits the Web Audio graph
Both Sound-Amplitude & Time are Quantized
PCM Digitization: Analog to Digital (A/D) 
• 4 bits → 2^4 = 16 different values 
– quantization of values 
– encoded as binary numbers 
• Ts = sampling interval 
• 1/Ts = sampling frequency 
• 44.1 kHz is used in compact discs 
– Nyquist freq. = 44.1 kHz / 2 = 22.05 kHz, roughly the upper limit of human hearing
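A minimal JavaScript sketch of this kind of quantization (the 4-bit depth and the 440 Hz test tone are illustrative choices, not part of any API):

var bits = 4;
var levels = Math.pow(2, bits);               // 2^4 = 16 quantization levels
var Ts = 1 / 44100;                           // CD-audio sampling interval, in seconds

function quantize(sample) {                   // sample assumed in [-1, 1]
  var level = Math.round((sample + 1) / 2 * (levels - 1)); // nearest of the 16 levels
  return (level / (levels - 1)) * 2 - 1;      // map back to [-1, 1]
}

// Sample a 440 Hz sine wave at 44.1 kHz and quantize each sample:
for (var n = 0; n < 5; n++) {
  console.log(quantize(Math.sin(2 * Math.PI * 440 * n * Ts)));
}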
Buffers and views: “typed array” architecture 
JavaScript typed arrays split the implementation into buffers and views: 
• a “buffer” (an ArrayBuffer object) represents a chunk of data 
– it has no format to speak of 
– and no mechanism for accessing its contents 
• you need a “view”, which provides the context: data type, starting offset, number of elements 
• not your standard Arrays, but fast! 
• 16 bytes = 16 × 8 bits = 128 bits 
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Typed_arrays
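A minimal sketch of the buffer/view split described above, using the standard ArrayBuffer and typed-array constructors:

var buffer = new ArrayBuffer(16);             // 16 bytes = 128 bits; no format, no direct access

var asUint8 = new Uint8Array(buffer);         // view: 16 one-byte unsigned integers
var asFloat32 = new Float32Array(buffer);     // the same bytes viewed as 4 floats

asFloat32[0] = 0.5;                           // writes 4 bytes into the shared buffer
console.log(asUint8[0], asUint8[1], asUint8[2], asUint8[3]); // the raw bytes of 0.5
console.log(asFloat32.length);                // 4 (16 bytes / 4 bytes per float)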
Buffers, Arrays, XHR, … 
• XMLHttpRequest (XHR) → request.responseType = 'arraybuffer'; 
• audioContext.decodeAudioData(request.response, function (anotherBuffer) { … }); 
• frequencyArray = new Uint8Array(analyserNode.frequencyBinCount); // create the array for the data values 
• analyserNode.getByteFrequencyData(frequencyArray); → Fast Fourier Transform (FFT), i.e. the spectrum; the FFT populates frequencyArray 
• requestAnimationFrame plots the data at each “frame” redraw (60 fps) → more efficient than setTimeout() or setInterval() 
(Here 8 bits is the quantization in the value of each measurement/sample ‘frame’, whereas the inverse of the sampling rate, e.g. 1/22,050 s ≈ 45 µs, is the quantization in time.)
Leon Theremin
http://mdn.github.io/violent-theremin/ 
http://youtu.be/w5qf9O6c20o
Moog Synthesizer → Audio Graphs
Bourne Identity: Sound Engineers (explaining how car sounds are modified to be more exciting)
Audio Graph Setup: Typical Workflow 
1. Create an audio context. 
2. Inside the context, create sources, e.g. <audio>, oscillator, stream. 
3. Create effects nodes, e.g. reverb, biquad filter, panner, compressor. 
4. Choose the final destination of the audio, for example your system speakers. 
5. Connect the sources up to the effects, and the effects to the destination (see the sketch below). 
developer.mozilla.org/en-US/docs/Web/API/Web_Audio_API
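A minimal sketch of this five-step workflow (the oscillator source and lowpass filter are arbitrary choices for illustration):

var context = new (window.AudioContext || window.webkitAudioContext)(); // 1. create a context

var oscillator = context.createOscillator();  // 2. a source, created inside the context
oscillator.frequency.value = 440;

var filter = context.createBiquadFilter();    // 3. an effects node
filter.type = "lowpass";

oscillator.connect(filter);                   // 5. source -> effect
filter.connect(context.destination);          // 5. effect -> 4. the speakers

oscillator.start();                           // begin playback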
Polyfills – Vendor-prefixed (webkit, moz, ms) (e.g. using a self-executing function) 
(function() { 
  // Polyfill for AudioContext 
  window.AudioContext = window.AudioContext || 
                        window.webkitAudioContext || 
                        window.mozAudioContext; 
  // Polyfill for requestAnimationFrame (replaces setTimeout) 
  var requestAnimationFrame = window.requestAnimationFrame || 
                              window.mozRequestAnimationFrame || 
                              window.webkitRequestAnimationFrame || 
                              window.msRequestAnimationFrame; 
  window.requestAnimationFrame = requestAnimationFrame; 
})();
Audio Sources 
• <audio> 
• new Audio('sounds/mySound.mp3'); 
• XHR (AJAX) 
• oscillatorNode(s) 
• getUserMedia() (live, USB microphone) 
• procedurally generated (Script Processor)
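Two of these source styles in a minimal sketch ('sounds/mySound.mp3' is the placeholder path from the list above):

var context = new AudioContext();

// An <audio>-element-backed source:
var audioEl = new Audio('sounds/mySound.mp3');
var mediaSource = context.createMediaElementSource(audioEl);
mediaSource.connect(context.destination);
audioEl.play();

// A pure oscillator source:
var osc = context.createOscillator();
osc.connect(context.destination);
osc.start();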
Draw the AudioBuffer (no audio graph) 
var audioContext = new AudioContext(); 

function initAudio() { 
  var audioRequest = new XMLHttpRequest(); 
  audioRequest.open("GET", "sounds/myAudio.ogg", true); 
  audioRequest.responseType = "arraybuffer"; 
  audioRequest.onload = function() { 
    audioContext.decodeAudioData(audioRequest.response, function(buffer) { 
      var canvas = document.getElementById("view1"); 
      drawBuffer(canvas.width, canvas.height, canvas.getContext('2d'), buffer); 
    }); 
  }; 
  audioRequest.send(); 
} 

// Draw the min/max amplitude of each pixel column of the canvas 
function drawBuffer(width, height, context, buffer) { 
  var data = buffer.getChannelData(0);        // channel 0 samples, in [-1, 1] 
  var step = Math.ceil(data.length / width);  // samples per pixel column 
  var amp = height / 2; 
  for (var i = 0; i < width; i++) { 
    var min = 1.0; 
    var max = -1.0; 
    for (var j = 0; j < step; j++) { 
      var datum = data[(i * step) + j]; 
      if (datum < min) min = datum; 
      if (datum > max) max = datum; 
    } 
    context.fillRect(i, (1 + min) * amp, 1, Math.max(1, (max - min) * amp)); 
  } 
} 

Draws a Web Audio AudioBuffer to a canvas. 
https://github.com/cwilso/Audio-Buffer-Draw/commits/master
Plot Audio Spectrum 
var audioEl = document.querySelector('audio');    // <audio> 
var audioCtx = new AudioContext(); 
var canvasEl = document.querySelector('canvas');  // <canvas> 
var canvasCtx = canvasEl.getContext('2d'); 
var mySource = audioCtx.createMediaElementSource(audioEl); // create source 
var myAnalyser = audioCtx.createAnalyser();                // create analyser 
mySource.connect(myAnalyser);                 // connect audio nodes 
myAnalyser.connect(audioCtx.destination);     // connect to speakers 

function processIt() { 
  var freqData = new Uint8Array(myAnalyser.frequencyBinCount); 
  myAnalyser.getByteFrequencyData(freqData);  // place spectrum in freqData 
  requestAnimationFrame(function() { 
    canvasCtx.clearRect(0, 0, canvasEl.width, canvasEl.height); 
    canvasCtx.fillStyle = "#ff0000"; 
    for (var i = 0; i < freqData.length; i++) { 
      // one bar per frequency bin, rising from the bottom of the canvas 
      canvasCtx.fillRect(i, canvasEl.height - freqData[i], 1, freqData[i]); 
    } // end for 
  }); // end requestAnimationFrame 
} // end fcn processIt 

setInterval(processIt, 1000/60);
Plot Audio Spectrogram* 
var audioEl = document.querySelector('audio');    // <audio> 
var audioCtx = new AudioContext(); 
var canvasEl = document.querySelector('canvas');  // <canvas> 
var canvasCtx = canvasEl.getContext('2d'); 
var mySource = audioCtx.createMediaElementSource(audioEl); 
var myAnalyser = audioCtx.createAnalyser(); 
myAnalyser.smoothingTimeConstant = 0; 
var myScriptProcessor = audioCtx.createScriptProcessor(myAnalyser.frequencyBinCount, 1, 1); 
mySource.connect(myAnalyser); 
myAnalyser.connect(audioCtx.destination);     // speakers/headphones 
myScriptProcessor.connect(audioCtx.destination); 

var x = 0; 
myScriptProcessor.onaudioprocess = function () { 
  if (!audioEl.paused) { 
    x += 1; 
    var freqData = new Uint8Array(myAnalyser.frequencyBinCount); 
    myAnalyser.getByteFrequencyData(freqData); 
    requestAnimationFrame(function() { 
      if (x > canvasEl.width) {               // wrap around at the right edge 
        canvasCtx.clearRect(0, 0, canvasEl.width, canvasEl.height); 
        x = 0; 
      } 
      for (var i = 0; i < freqData.length; i++) { 
        // one pixel per frequency bin; hue encodes magnitude 
        canvasCtx.fillStyle = "hsl(" + freqData[i] + ",100%, 50%)"; 
        canvasCtx.fillRect(x, canvasEl.height - i, 1, 1); 
      } // end for 
    }); // end requestAnimationFrame 
  } // end if 
} // end onaudioprocess 

*A spectrogram is a plot of the spectrum as a function of time (time on the horizontal axis, frequency on the vertical axis).
Types of Audio Nodes 
• Sources 
– <audio> element 
– Buffer Source (use with XHR) 
– Oscillator 
• Analyser Node 
• Panner 
– Doppler shift (cf. voice changer)? 
– http://chromium.googlecode.com/svn/trunk/samples/audio/doppler.html 
• Script Processor / AudioWorker (e.g. add your own higher-resolution FFT) 
• Compressor (e.g. avoid ‘clipping’) 
• Convolution (e.g. add the impulse response of a large cathedral) 
• Delay 
• …
Developer Tools Console “Hints”: Explore the Latest Syntax, Methods & Params (e.g. Firefox)
A Fluid Specification 
• http://webaudio.github.io/web-audio-api for the latest 
• Updated frequently: W3C Editor's Draft, 14 October 2014 
– August 29th + … 
– September 29th + … 
– October 5th, 8th, 14th 
• Boris Smus web book with syntax changes 
– http://chimera.labs.oreilly.com/books/1234000001552 
• The Script Processor Node is deprecated; use createAudioWorker 
• “AudioProcessingEvent” (deprecated) is dispatched to the ScriptProcessorNode. When the ScriptProcessorNode is replaced by AudioWorker, we’ll use AudioProcessEvent.
Boris Smus Book, Deprecations (http://chimera.labs.oreilly.com/books/1234000001552/apa.html) 
• AudioBufferSourceNode.noteOn() has been changed to start(). 
• AudioBufferSourceNode.noteGrainOn() has been changed to start(). 
• AudioBufferSourceNode.noteOff() has been changed to stop(). 
• AudioContext.createGainNode() has been changed to createGain(). 
• AudioContext.createDelayNode() has been changed to createDelay(). 
• AudioContext.createJavaScriptNode() has been changed to createScriptProcessor() (changing to Audio Workers). 
• OscillatorNode.noteOn() has been changed to start(). 
• OscillatorNode.noteOff() has been changed to stop().
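In code, the migration looks like this (a minimal sketch; the commented-out line shows a deprecated call from the list above):

var context = new AudioContext();
var source = context.createBufferSource();
var gain = context.createGain();            // formerly context.createGainNode()
var delay = context.createDelay();          // formerly context.createDelayNode()

// source.noteOn(0);                        // deprecated
source.start(0);                            // current syntax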
Firefox Web Audio Editor 
https://developer.mozilla.org/en-US/docs/Tools/Web_Audio_Editor 
[Screenshot: activating the Web Audio Editor in the developer-tools settings.]
Firefox Web Audio Editor (cont.) 
• Press F12 or Ctrl-Shift-K → show Developer Tools 
• Select the “Web Audio” tab → Oscillator Node → AudioParams 
• Edit the AudioParams 
• The audio graph (and the sound!) updates in real time
Demos 
• http://borismus.github.io/spectrogram (realtime, “getUserMedia”) 
• http://webaudioapi.com (Boris Smus) 
• https://webaudiodemos.appspot.com (Chris Wilson) 
• https://webaudiodemos.appspot.com/Vocoder 
• https://webaudiodemos.appspot.com/slides/mediademo 
• http://chromium.googlecode.com/svn/trunk/samples/audio/doppler.html 
• http://chromium.googlecode.com/svn/trunk/samples/audio/ (shows you the files; can view sources) 
• http://labs.dinahmoe.com/ToneCraft 
• Localhost demos: C:\Users\rockower\Dropbox\Audio\MBIT-WebAudioTalk\demos\startPythonServer.bat 

@echo off 
rem start Python3 Web Server in demos folder 
call python -m http.server 80
http://webaudioplayground.appspot.com/ 
• Web Audio Playground: interactive creation of an audio graph 
• getUserMedia requests permission to access the microphone 
[Screenshot: webaudioplayground.appspot.com]
source.connect(B); B.connect(C); … 
…; C.connect(audioContext.destination);
Impulse Response, Convolution, Spatialization, … 
• http://www.openairlib.net 
• http://www.openairlib.net/auralizationdb/content/r1-nuclear-reactor-hall 
– upload a sound (.wav, < 5 MB) to hear it in that space 
– or download an “impulse response” to convolve with your sound 
• Boris Smus says (in his O’Reilly book): 
– Room effects: ‘The convolver node “smushes” the input sound and its impulse response by computing a convolution, a mathematically intensive function. The result is something that sounds as if it was produced in the room where the impulse response was recorded.’ 
– Spatialized sounds: the Web Audio API comes with built-in positional audio features 
– position and orientation of sources and listeners 
– parameters associated with the source audio cones 
– relative velocities of sources and listeners (Doppler shift)
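A minimal sketch of that convolution workflow (the impulse-response file name is a hypothetical placeholder for one downloaded from openairlib.net):

var context = new AudioContext();
var convolver = context.createConvolver();

var request = new XMLHttpRequest();
request.open('GET', 'impulses/reactor-hall.wav', true);  // placeholder IR file
request.responseType = 'arraybuffer';
request.onload = function () {
  context.decodeAudioData(request.response, function (irBuffer) {
    convolver.buffer = irBuffer;            // the room's impulse response
    // Any source routed through the convolver now "sounds like" that room:
    // source.connect(convolver); convolver.connect(context.destination);
  });
};
request.send();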
References/links 
• http://webaudio.github.io/web-audio-api latest specification 
• http://webaudioapi.com/ Boris Smus site 
• http://chimera.labs.oreilly.com/books/1234000001552 “Web Audio API” book online 
• https://developer.mozilla.org/en-US/docs/Web/JavaScript/Typed_arrays 
• https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/ArrayBuffer 
• http://www.html5rocks.com/en/tutorials/webaudio/intro/ (Smus) 
• https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest/Using_XMLHttpRequest 
• http://webaudiodemos.appspot.com/ Chris Wilson 
• http://webaudioplayground.appspot.com create an ‘audio graph’; include analyser, gain, filter, delay 
• http://www.html5rocks.com/en/tutorials/file/xhr2/ Bidelman tutorial 
• Book: “Javascript Creativity”, Shane Hudson, Apress, chapter 3, etc. 
• https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_API/Using_Web_Audio_API 
Caveat: many audio websites have outdated, i.e. non-working, syntax for AudioContext and/or audio nodes; some are “vendor-prefixed”, e.g. webkitAudioContext (as well as for requestAnimationFrame).
Backup Slides
To make it as an audio engineer, you MUST know: 
• Digital audio 
• The ins and outs of signal flow and patch bays 
• How analog consoles work 
• In-depth study of analog consoles 
• Audio processing 
• Available audio plugins and how they work 
• Signal processing and compressors 
• How to perform a professional mix-down 
• How various studios are designed and how their monitors work 
• Electronic music and beat matching 
• Sync and automation 
• Recording and mixing ins and outs 
• Surround mixing 
http://www.recordingconnection.com/courses/audio-engineering
What is a “biquad” filter? 
• A digital biquad filter is a second-order recursive linear filter, 
• containing two poles and two zeros. 
• “Biquad” is an abbreviation of “biquadratic”: in the Z domain, its transfer function is the ratio of two quadratic functions.
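In standard notation, the biquad transfer function is (with the conventional coefficient names):

H(z) = \frac{b_0 + b_1 z^{-1} + b_2 z^{-2}}{a_0 + a_1 z^{-1} + a_2 z^{-2}}

The two roots of the numerator quadratic are the filter's zeros; the two roots of the denominator quadratic are its poles.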
Uint8Array(k) has k samples, where each ‘sample’ is a quantized measurement or computed value with 8 bits per value 
• The analog signal is sampled every Ts seconds. 
• Ts is referred to as the sampling interval. 
• fs = 1/Ts is called the sampling rate or sampling frequency (e.g. for CD audio, fs = 44.1 kHz, so Ts = 1/44,100 s ≈ 22.7 µs).
Abstract of Presentation 
Audio for websites has a very checkered past. Finally, however, we can forget about using media tags like “embed” & “object”, browser plugins like Flash, and the annoying “bgsound” of IE. The HTML5 <audio> tag is a big step forward…. But the “Web Audio API”, modeled on a graph of “audio nodes” providing filters, gains, spectral analysis, and spatially-located sound sources, is more of a giant leap forward for sounds in games and online music synthesis. That, along with “getUserMedia” to capture real-time camera and microphone input, is arriving “as we speak”. Plan on lots of eye- (and ear-) candy to whet your appetite, with a modest taste of geeky code and advances in Javascript Arrays and XHR2.
General audio graph definition 
General containers and definitions that shape audio graphs in Web Audio API usage: 
• AudioContext: represents an audio-processing graph built from audio modules linked together, each represented by an AudioNode. An audio context controls the creation of the nodes it contains and the execution of the audio processing, or decoding. You need to create an AudioContext before you do anything else, as everything happens inside a context. 
• AudioNode: represents an audio-processing module, such as an audio source (e.g. an HTML <audio> or <video> element), an audio destination, or an intermediate processing module (e.g. a filter like BiquadFilterNode, or a volume control like GainNode). 
• AudioParam: represents an audio-related parameter, like one of an AudioNode’s. It can be set to a specific value or a change in value, and can be scheduled to happen at a specific time and following a specific pattern (see the sketch below). 
• ended (event): fired when playback has stopped because the end of the media was reached.
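A minimal sketch of AudioParam scheduling, fading a GainNode's gain parameter in over two seconds (the nodes and timings are illustrative):

var context = new AudioContext();
var gainNode = context.createGain();
gainNode.connect(context.destination);

var now = context.currentTime;
gainNode.gain.setValueAtTime(0, now);                 // a specific value at a specific time
gainNode.gain.linearRampToValueAtTime(1.0, now + 2);  // a scheduled pattern of change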
Interfaces defining audio sources 
• OscillatorNode: represents a sine wave. It is an AudioNode audio-processing module that causes a given frequency of sine wave to be created. 
• AudioBuffer: represents a short audio asset residing in memory, created from an audio file using the AudioContext.decodeAudioData() method, or created with raw data using AudioContext.createBuffer(). Once decoded into this form, the audio can then be put into an AudioBufferSourceNode. 
• AudioBufferSourceNode: represents an audio source consisting of in-memory audio data, stored in an AudioBuffer. It is an AudioNode that acts as an audio source. 
• MediaElementAudioSourceNode: represents an audio source consisting of an HTML5 <audio> or <video> element. It is an AudioNode that acts as an audio source. 
• MediaStreamAudioSourceNode: represents an audio source consisting of a WebRTC MediaStream (such as a webcam or microphone). It is an AudioNode that acts as an audio source (see the sketch below).
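A minimal sketch of the MediaStream case, assuming the 2014-era prefixed navigator.getUserMedia callback API:

var context = new AudioContext();
navigator.getUserMedia = navigator.getUserMedia ||
                         navigator.webkitGetUserMedia ||
                         navigator.mozGetUserMedia;

navigator.getUserMedia({ audio: true }, function (stream) {
  var micSource = context.createMediaStreamSource(stream); // MediaStreamAudioSourceNode
  var analyser = context.createAnalyser();
  micSource.connect(analyser);   // analyse the live microphone (no speaker feedback loop)
}, function (err) {
  console.log('getUserMedia error:', err);
});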
Effects you can apply to audio sources 
• BiquadFilterNode: represents a simple low-order filter; covers different kinds of filters, tone-control devices, and graphic equalizers. 
• ConvolverNode: performs a linear convolution on a given AudioBuffer, often used to achieve a reverb effect. 
• DelayNode: causes a delay between the arrival of input data and its propagation to the output. 
• DynamicsCompressorNode: a compression effect; lowers the volume of the loudest parts of the signal to help prevent the clipping and distortion that occur when multiple sounds are played and multiplexed together. 
• GainNode: represents a change in volume; causes a given gain to be applied to the input signal. 
• WaveShaperNode: represents a non-linear distorter; uses a curve to apply a waveshaping distortion, often used to add a warm feeling (see the sketch below). 
• PeriodicWave: defines a periodic waveform that can be used to shape the output of an OscillatorNode.
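A minimal sketch of one of these effects, a WaveShaperNode with a soft-clipping curve (the tanh curve shape is an illustrative choice, not prescribed by the API):

var context = new AudioContext();
var shaper = context.createWaveShaper();

var curve = new Float32Array(256);
for (var i = 0; i < curve.length; i++) {
  var x = (i / (curve.length - 1)) * 2 - 1;   // map index to [-1, 1]
  curve[i] = Math.tanh(3 * x);                // soft clipping
}
shaper.curve = curve;
// source.connect(shaper); shaper.connect(context.destination);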
Audio Analysis, Spatialization & Destinations 
• AnalyserNode: provides real-time frequency- and time-domain analysis, for data analysis and visualization. 
• Audio spatialization adds panning effects to your audio sources: 
– AudioListener: represents the position and orientation of the unique person listening to the audio scene. 
– PannerNode: represents the behavior of a signal in space, describing its position with right-hand Cartesian coordinates, its movement using a velocity vector, and its directionality using a directionality cone. 
• AudioDestinationNode: represents the end destination of an audio source in a given context, usually the speakers of your device. 
• MediaStreamAudioDestinationNode: represents an audio destination consisting of a WebRTC MediaStream with a single AudioMediaStreamTrack. 
– It can be used in a similar way to a MediaStream obtained from Navigator.getUserMedia, and acts as an audio destination (see the sketch below).
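A minimal sketch of the stream destination (the oscillator is just a stand-in source):

var context = new AudioContext();
var osc = context.createOscillator();
var streamDest = context.createMediaStreamDestination(); // MediaStreamAudioDestinationNode
osc.connect(streamDest);
osc.start();
// streamDest.stream is a MediaStream, usable like one from getUserMedia
// (e.g. hand it to a WebRTC PeerConnection).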
Firefox Web Audio Editor: AudioParams to adjust