PhoneGap Day US 2013 - Simon MacDonald: Speech Recognition

JavaScript Speech Recognition Applications
and maybe some other HTML5 goodness

Who is this guy?
IBM'er
PhoneGap core contributor
nutty about speech recognition
@macdonst
macdonst on GitHub
simonmacdonald.com

Why do I care about speech rec?
Here's a conversation between
two Cape Bretoners
P1: jeet?
P2: naw, jew?
P1: naw, t'rly t'eet bye.

And here's the translation
P1: jeet?
P1: Did you eat?
P2: naw, jew?
P2: No, did you?
P1: naw, t'rly t'eet bye.
P1: No, it's too early to eat buddy.

Speech
recognition is
the process of
translating

The process of speech
rec includes
1. Record and digitize the audio data
2. Split data into phonemes
3. Apply the phonemes to the recognition model
4. Analyze the results against the grammar
5. Return a confidence weighted result
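Steps 4 and 5 can be pictured with a toy sketch. This is only an illustration, not how any real engine is built: the `hypotheses` array, the `grammar` list and the function name `rankAgainstGrammar` are all made up for this example.

```javascript
// Toy illustration of steps 4 and 5 only; real engines do this internally.
// `hypotheses` and `grammar` are hypothetical inputs invented for this sketch.
function rankAgainstGrammar(hypotheses, grammar) {
  const allowed = new Set(grammar.map((phrase) => phrase.toLowerCase()));
  return hypotheses
    // step 4: keep only hypotheses the grammar accepts
    .filter((h) => allowed.has(h.text.toLowerCase()))
    // step 5: order what is left by confidence, best first
    .sort((a, b) => b.confidence - a.confidence);
}

const ranked = rankAgainstGrammar(
  [
    { text: "jazz", confidence: 0.62 },
    { text: "chaz", confidence: 0.71 },
    { text: "rock", confidence: 0.12 },
  ],
  ["jazz", "rock", "stop"]
);
console.log(ranked);
```

Note that "chaz" scores highest on its own but is dropped because the grammar does not contain it.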

So how do we
add speech
rec to our
app?

You may look at the W3C
Speech API Specification

but only Chrome on the
desktop has
implemented that spec
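Since support is patchy, it is worth checking before using the API. A minimal sketch, assuming the constructor is exposed either as `SpeechRecognition` or under Chrome's prefixed name `webkitSpeechRecognition`; the helper name `getSpeechRecognition` is made up for this example.

```javascript
// Returns a recognition constructor from the given global scope, or null.
// `webkitSpeechRecognition` is the prefixed name Chrome exposes.
function getSpeechRecognition(scope) {
  return scope.SpeechRecognition || scope.webkitSpeechRecognition || null;
}

// In a page or a PhoneGap webview this would be called with `window`.
const scope = typeof window !== "undefined" ? window : {};
const Recognition = getSpeechRecognition(scope);
if (Recognition) {
  console.log("speech recognition available");
} else {
  console.log("speech recognition not available in this environment");
}
```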

The spec looks like this:
interface SpeechRecognition : EventTarget {
    // recognition parameters
    attribute SpeechGrammarList grammars;
    attribute DOMString lang;
    attribute boolean continuous;
    attribute boolean interimResults;
    attribute unsigned long maxAlternatives;
    attribute DOMString serviceURI;
    // methods to drive the speech interaction
    void start();
    void stop();
    void abort();
};

With additional event
handlers to control
behaviour:
attribute EventHandler onaudiostart;
attribute EventHandler onsoundstart;
attribute EventHandler onspeechstart;
attribute EventHandler onspeechend;
attribute EventHandler onsoundend;
attribute EventHandler onaudioend;
attribute EventHandler onresult;
attribute EventHandler onnomatch;
attribute EventHandler onerror;
attribute EventHandler onstart;
attribute EventHandler onend;
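To see when each of these handlers fires, one option is to log them all. A minimal sketch: the helper `traceLifecycle` and the `LIFECYCLE_EVENTS` list are made up for this example, and it only assigns the `on...` attributes listed above.

```javascript
// Event names taken from the handler attributes listed above (without the "on").
const LIFECYCLE_EVENTS = [
  "start", "audiostart", "soundstart", "speechstart",
  "speechend", "soundend", "audioend",
  "result", "nomatch", "error", "end",
];

// Assigns an on<event> handler for every lifecycle event.
// Each handler passes the event name and the event object to `log`.
function traceLifecycle(recognition, log) {
  LIFECYCLE_EVENTS.forEach((name) => {
    recognition["on" + name] = (event) => log(name, event);
  });
  return recognition;
}

// A plain object stands in for a real recognition instance here.
const traced = traceLifecycle({}, (name) => console.log("event:", name));
traced.onstart({});
traced.onend({});
```

With a real instance the call would look like `traceLifecycle(new SpeechRecognition(), console.log).start()`.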

Let's recognize some
speech
Click to Speak
hello world
var recognition = new SpeechRecognition();
recognition.onresult = function(event) {
    if (event.results.length > 0) {
        // show the top alternative of the first result
        var test1 = document.getElementById("test1");
        test1.innerHTML = event.results[0][0].transcript;
    }
};
recognition.start();
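The handler above always takes `results[0][0]`. If `maxAlternatives` is set above 1, each result can carry several alternatives, each with a `transcript` and a `confidence`. A minimal sketch of picking the most confident one; the helper name `bestAlternative` is made up, and the event below is a hand-made stand-in for a real one.

```javascript
// Returns the highest-confidence alternative of the first result, or null.
function bestAlternative(event) {
  if (!event.results || event.results.length === 0) return null;
  // A result is array-like, so copy it into a real array first.
  const alternatives = Array.from(event.results[0]);
  if (alternatives.length === 0) return null;
  return alternatives.reduce((best, alt) =>
    alt.confidence > best.confidence ? alt : best
  );
}

// Hand-made event object standing in for a real recognition event.
const sampleEvent = {
  results: [[
    { transcript: "hello world", confidence: 0.4 },
    { transcript: "yellow world", confidence: 0.9 },
  ]],
};
console.log(bestAlternative(sampleEvent).transcript);
```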

...if taking
dictation gets
you going

But I want to
do something
more exciting
with the

Let's do something a little
less trivial
Click to Speak
recognition.onresult = function(event) {
    // normalize so " Jazz" and "jazz" hit the same case
    var result = event.results[0][0].transcript.trim().toLowerCase();
    var music = document.getElementById("music");
    switch(result) {
        case "jazz":
            music.src="jazz.mp3";
            music.play();
            break;
        case "rock":
            music.src="rock.mp3";
            music.play();
            break;
        case "stop":
        default:
            // anything we don't recognize stops the music
            music.pause();
    }
};
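The switch only matches when the whole transcript is exactly one command word. A minimal sketch of a looser match that finds a command anywhere in the phrase; the helper `findCommand` and the `COMMANDS` list are made up for this example.

```javascript
// Command words taken from the switch statement above.
const COMMANDS = ["jazz", "rock", "stop"];

// Returns the first command (in `commands` order) that appears as a whole
// word in the transcript, or null when none does.
function findCommand(transcript, commands) {
  const words = transcript.toLowerCase().split(/\W+/);
  return commands.find((command) => words.includes(command)) || null;
}

console.log(findCommand("play some jazz, please", COMMANDS));
console.log(findCommand("what day is it today", COMMANDS));
```

The result can then feed the same switch, with `null` falling through to the default case.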

Let's ask the web a
question
Click to Speak
Q: what day is it today
A: Friday July 19th, 2013

Works pretty
good...
...but ugly!

Let's style our
button with
some CSS

<a class="speechinput">
    <img src="images/mic.png">
</a>
#speechinput input {
    cursor:pointer;
    margin:auto;
    margin:15px;
    color:transparent;
    background-color:transparent;
    border:5px;
    width:15px;
    -webkit-transform: scale(3.0, 3.0);
}

And we'll add some color
using Pure CSS Speech Bubbles
by Nicolas Gallagher

Q: what is steve jobs middle name
A: Steven Paul Jobs

But wait, why
am I using my
eyes like a
sucker

We'll output the answer
using SpeechSynthesis

The SpeechSynthesis
spec looks like this:
interface SpeechSynthesis {
    readonly attribute boolean pending;
    readonly attribute boolean speaking;
    readonly attribute boolean paused;
    void speak(SpeechSynthesisUtterance utterance);
    void cancel();
    void pause();
    void resume();
    SpeechSynthesisVoiceList getVoices();
};

The
SpeechSynthesisUtterance
spec looks like this:
interface SpeechSynthesisUtterance : EventTarget {
    attribute DOMString text;
    attribute DOMString lang;
    attribute DOMString voiceURI;
    attribute float volume;
    attribute float rate;
    attribute float pitch;
};

With additional event
handlers to control
behaviour:
attribute EventHandler onstart;
attribute EventHandler onend;
attribute EventHandler onerror;
attribute EventHandler onpause;
attribute EventHandler onresume;
attribute EventHandler onmark;
attribute EventHandler onboundary;
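Putting the two interfaces together, speaking an answer could look roughly like this. A minimal sketch: the helper `speakAnswer` is made up for this example, and it takes the synthesizer and the utterance constructor as arguments so it can be tried outside a browser. The "en-US" default is an assumption.

```javascript
// Builds an utterance for `text` and hands it to the synthesizer.
// `synth` is expected to look like SpeechSynthesis, `Utterance` like
// the SpeechSynthesisUtterance constructor.
function speakAnswer(synth, Utterance, text, lang) {
  const utterance = new Utterance(text);
  utterance.lang = lang || "en-US"; // assumed default language
  utterance.rate = 1.0;
  utterance.onend = () => console.log("finished speaking");
  synth.speak(utterance);
  return utterance;
}

// Only runs where the speech synthesis globals actually exist.
if (typeof speechSynthesis !== "undefined" &&
    typeof SpeechSynthesisUtterance !== "undefined") {
  speakAnswer(speechSynthesis, SpeechSynthesisUtterance, "Steven Paul Jobs");
}
```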

Q: who won the stanley cup this year
A: Chicago Blackhawks

Plugin repos
SpeechRecognitionPlugin - https://github.com/macdonst/SpeechRecognitionPlugin
SpeechSynthesisPlugin - https://github.com/macdonst/SpeechSynthesisPlugin
