Introduction to Voice Design
Presented by
Jeff LeBlanc, Director of User Experience
Boston UX
1
About the Presenter
Jeff LeBlanc
• Director of UX at Boston UX and ICS
• Software developer for 20+ years
• Adjunct faculty at WPI teaching HCI
Contact me at jeffl@bostonux.com
2
About Boston UX
• Providing inspired UX design for embedded devices
• Design arm of Integrated Computer Solutions
• Over 30 years of software excellence
• Specialists in intuitive interface design for touch- and voice-powered smart
devices
3
Fascinated by the Future
• Genius
• Billionaire
• Playboy
• Philanthropist
• Futurist
4
Turning Science Fiction Into Fact
• Fiction has inspired innovation and design for years
• Jules Verne inspired Simon Lake
• H.G. Wells inspired Robert Goddard
5
Turning Science Fiction Into Fact
6
Tony Stark’s Mansion
Good morning. It’s 7:00 a.m. The weather in Malibu is 72 degrees with
scattered clouds. The surf conditions are fair with waist-to-shoulder high
lines. High tide will be at 10:52 a.m.
7
Voice Interaction
• Voice is an excellent input modality
• Allows for hands-free interaction and multitasking
• Fast, often beats typing or manual controls
• Nuances of voice output can add to experience
• Devices can be designed:
• Screen first, add voice later
• Voice first, add screen later
8
Voice Interaction – Why not?
• May not be suitable for public spaces, privacy concerns
• Medical and HIPAA situations
• Disturbing others in your living space
• Can add to cognitive load
• Remember instead of recognize
9
Design Considerations
• Automated Speech Recognition
• What did I say?
• Natural Language Understanding
• Do what I mean!
• Voice User Interface (VUI)
10
Automated Speech Recognition
• DRAGON was started in 1975
• Dragon NaturallySpeaking - 1997, recognized around 100 words per minute
• Major platforms now report a 95% recognition rate
• About the same as human-to-human accuracy
• Automated Speech Recognition
• Accuracy achieved by cloud processing
• Local processing, less accurate
11
Automated Speech Recognition
• Intel RealSense initiative, started in 2012
• Free SDK, limited hardware (only supported Creative camera)
• Accuracy level has varied between versions of the SDK and target hardware
• In practice, simple commands recognized with 40% to 60% confidence
12
Natural Language Understanding
• The harder problem, requires lots of computing and “intelligence”
• Neural nets, machine learning, etc.
• Challenge is getting meaning from a set of words
• What I say: “Do you have a pen?”
• What I mean: “I need a pen and if you have one please give it to me now.”
14
Natural Language Understanding
“Skip the spinning rims! We’re on the clock!”
15
Natural Language Understanding
• 1950 - Turing test
• Test of a machine’s ability to communicate as a human
• 1990 - Loebner Prize
• Artificial intelligence competition
• 2017 - Alexa Prize
• Build “a socialbot that can converse coherently and engagingly with
humans on popular topics for 20 minutes.”
16
Natural User Interfaces
• Designing a UI that has a “natural” feel
• Need to be able to hold a conversation and understand context
• Conversation is more than a single “turn”
• “Jarvis, turn on kitchen lights.”
• “Yes sir.”
vs.
• “Jarvis, turn on the lights.”
• “Which lights, sir?”
• “The kitchen lights.”
• Rules of conversation are learned by being human
17
VUI Design
“Designing Voice User Interfaces:
Principles of Conversational
Experiences”
Cathy Pearl, 1st edition
18
VUI Design
• Confidence thresholds and confirmation
• “Jarvis, order me some paper towels”
• > 80% “Yes sir, ordering you more paper towels”
• Implicit confirmation - repeating part of the question in the answer
• 45% - 79% “Sir, I think you said to order more paper towels, is that correct?”
• Uncertain, ask for confirmation
• < 45% “I’m sorry sir, I didn’t understand what you said”
• Non-speech confirmation
• “Jarvis, turn on the lights”
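The confidence tiers above can be sketched in a few lines. This is a minimal sketch, assuming the recognizer reports a confidence score between 0 and 1. The function name `respond` and the response wording are illustrative, and the 0.80 and 0.45 cutoffs are simply the ones from the slide, not values tuned for any real recognizer.

```python
def respond(request: str, confidence: float) -> str:
    """Pick a confirmation strategy from the recognizer's confidence score.

    `confidence` is assumed to be a float in [0, 1]. The cutoffs mirror the
    tiers on the slide and would need tuning for a real recognizer.
    """
    if confidence >= 0.80:
        # High confidence: act, and confirm implicitly by echoing the request.
        return f"Yes sir, I will {request}."
    if confidence >= 0.45:
        # Medium confidence: ask before acting (explicit confirmation).
        return f"Sir, I think you said to {request}, is that correct?"
    # Low confidence: do not guess; say so and let the user try again.
    return "I'm sorry sir, I didn't understand what you said."


print(respond("order more paper towels", 0.92))
print(respond("order more paper towels", 0.60))
print(respond("order more paper towels", 0.10))
```

For low-risk actions with a visible result, such as turning on the lights, the spoken reply can be skipped entirely and the action itself serves as the confirmation.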
19
VUI Design
• Command and Control - typically “one off” voice commands
• Triggers
• Push to talk
• The “open mic” problem
• Wake words
• Alexa and similar assistants listen for the wake word locally, then send the audio to the cloud
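The wake word pattern can be illustrated with a toy gate. This is only a sketch: text strings stand in for chunks of audio, the wake word "jarvis" and the function name `gate` are assumptions for illustration, and a real device would run a small keyword spotter on the audio signal itself.

```python
WAKE_WORD = "jarvis"  # hypothetical wake word for this sketch


def gate(utterances, wake_word=WAKE_WORD):
    """Yield only the commands that follow the wake word.

    Each utterance is a text stand-in for a chunk of audio. Anything that
    does not start with the wake word is dropped locally and never forwarded.
    """
    for utterance in utterances:
        words = utterance.lower().replace(",", " ").split()
        if words and words[0] == wake_word:
            # Only this part would be sent on for full recognition.
            yield " ".join(words[1:])


heard = ["Jarvis, turn on the lights", "what a nice day", "jarvis stop"]
print(list(gate(heard)))
```

Keeping the check local is what addresses the "open mic" problem: speech that is not addressed to the device stays on the device.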
20
VUI Design
• How do we handle error conditions?
• No speech detected
• Speech detected but not recognized
• Recognized but system does the wrong thing
• Escalation error handling
• prompt for more info
• don’t blame the user!
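Escalating error handling can be sketched as a list of prompts that get more specific with each consecutive failure. The prompt wording and the `reprompt` helper are assumptions for illustration. Note that every prompt puts the failure on the system ("I didn't catch that") rather than on the user.

```python
# Hypothetical prompts, ordered from brief to most detailed.
ESCALATING_PROMPTS = [
    "Sorry, I didn't catch that.",
    "I still didn't get that. Which lights should I turn on?",
    "You can say something like 'kitchen lights' or 'bedroom lights'.",
]


def reprompt(failed_attempts: int) -> str:
    """Return a more detailed prompt each time recognition fails in a row."""
    # Clamp to the last (most detailed) prompt once we run out of levels.
    level = min(max(failed_attempts, 1), len(ESCALATING_PROMPTS))
    return ESCALATING_PROMPTS[level - 1]


for attempt in (1, 2, 3, 4):
    print(attempt, reprompt(attempt))
```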
21
VUI Design
• Natural conversations use a variety of words for the same meaning
• Design for intents and support synonyms of meaning
• Jarvis: “Shall I render using proposed specifications?”
• Tony Stark: “Thrill me.”
• Saying “yes” - yup, yep, yeah, uh huh, please do, go for it, etc.
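A simple way to support synonyms is to map many phrasings onto one intent. The sketch below uses plain set lookup. The affirmative phrases come from the slide, while the negative phrases and the names `classify_reply`, "confirm" and "deny" are assumptions for illustration. Production systems typically rely on a trained language understanding model rather than fixed lists.

```python
AFFIRMATIVE = {"yes", "yup", "yep", "yeah", "uh huh", "please do",
               "go for it", "thrill me"}
NEGATIVE = {"no", "nope", "nah", "cancel", "never mind"}  # assumed examples


def classify_reply(utterance: str) -> str:
    """Map a free-form reply onto a confirm/deny intent."""
    # Normalize case and strip surrounding spaces and end punctuation.
    phrase = utterance.lower().strip(" .!?")
    if phrase in AFFIRMATIVE:
        return "confirm"
    if phrase in NEGATIVE:
        return "deny"
    return "unknown"


print(classify_reply("Thrill me."))
print(classify_reply("Nope"))
print(classify_reply("Maybe later"))
```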
22
VUI Design
• Intent - an action that the user wants to take
• Intent can sometimes be inferred from the object
• “Show my calendar” versus “Add a meeting to my calendar”
• Design VUIs to handle intents first
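The calendar example can be sketched as a lookup from a verb and an object to an intent. The intent names `ViewCalendar` and `CreateMeeting`, the `INTENTS` table and the keyword matching are all assumptions for illustration, and real platforms resolve intents with trained models rather than keyword pairs.

```python
import re

# Hypothetical intent table: (verb, object) pairs mapped to intent names.
INTENTS = {
    ("show", "calendar"): "ViewCalendar",
    ("add", "calendar"): "CreateMeeting",
}


def detect_intent(utterance: str):
    """Return the first intent whose verb and object both appear, else None."""
    words = re.findall(r"[a-z']+", utterance.lower())
    for (verb, obj), intent in INTENTS.items():
        if verb in words and obj in words:
            return intent
    return None


print(detect_intent("Show my calendar"))
print(detect_intent("Add a meeting to my calendar"))
print(detect_intent("Play some music"))
```

Both utterances mention the same object, so the object narrows the domain and the verb selects the action within it.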
23
VUI Design
• A limitation of voice is that output is serial
• With a visual display, we can process in parallel
• Support the ability to interrupt output (barge in)
• Jarvis: “Sir, at 19% power, the odds of reaching that altitude ...”
• Tony Stark: “I KNOW THE MATH! DO IT!!”
• Break up long lists into smaller chunks
• “Would you like to hear more?”
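Breaking a long list into chunks can be sketched with a generator. The function name `speak_in_chunks` and the default chunk size of three are assumptions for illustration. A real system would wait for the user's answer after each chunk, and because this is a generator the caller can simply stop iterating when the user says no.

```python
def speak_in_chunks(items, chunk_size=3):
    """Yield spoken prompts that read a long list a few items at a time."""
    for start in range(0, len(items), chunk_size):
        chunk = items[start:start + chunk_size]
        prompt = ", ".join(chunk) + "."
        # Offer to continue only when more items remain.
        if start + chunk_size < len(items):
            prompt += " Would you like to hear more?"
        yield prompt


meetings = ["stand-up", "design review", "lunch", "board call"]
for prompt in speak_in_chunks(meetings):
    print(prompt)
```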
24
VUI Design
• Consider multiple modalities
• Combinations of screen and voice are very powerful
• Amazon Echo Show
• Example: a voice query that displays results on a touch screen
25
VUI Design
• Handling the error condition of speech detected but not recognized
• Solution: provide a list of probable matches, scored and sorted by
context or other knowledge
• N-Best lists solution
• Example: “find” versus “fine”
• Make your error messages as useful as possible
• Talk as a person, not as software
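The N-best idea can be sketched as rescoring the recognizer's hypotheses against what the dialog expects next. This is a minimal sketch under stated assumptions: the recognizer is assumed to return (text, score) pairs, the name `pick_from_n_best` is hypothetical, and the 0.2 context bonus is an arbitrary illustrative value.

```python
def pick_from_n_best(n_best, expected_words):
    """Choose a hypothesis by combining recognizer score with context.

    `n_best` is a list of (text, score) pairs as a recognizer might return.
    `expected_words` is the vocabulary that makes sense at this point in
    the dialog. The 0.2 bonus is an arbitrary value for illustration.
    """
    def rescored(hypothesis):
        text, score = hypothesis
        in_context = any(w in expected_words for w in text.lower().split())
        return score + (0.2 if in_context else 0.0)

    return max(n_best, key=rescored)[0]


hypotheses = [("fine my phone", 0.55), ("find my phone", 0.50)]
print(pick_from_n_best(hypotheses, {"find", "call", "play"}))
```

Here the acoustically stronger hypothesis "fine my phone" loses to "find my phone" because only the latter contains a command the system expects.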
26
VUI Design
• Disambiguation - speech is understood but we don’t have enough information
to take action
• “I’d like a large, please”
• Often requires a conversation
• Not enough information - “What is the weather in Springfield?”
• Too much information - “I have a cough and a fever”
• Properly handling negation - “I’m not feeling well” should not be answered with “Great to hear!”
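Disambiguation is often implemented as slot filling: the system keeps asking until it has every piece of information the action needs. The slot names, questions and `next_prompt` helper below are assumptions for illustration, using the "I'd like a large, please" example.

```python
REQUIRED_SLOTS = ("size", "item")  # hypothetical slots for an order

QUESTIONS = {
    "size": "What size would you like?",
    "item": "What would you like to order?",
}


def next_prompt(slots: dict) -> str:
    """Ask for the first missing slot, or confirm once all are filled."""
    for name in REQUIRED_SLOTS:
        if not slots.get(name):
            return QUESTIONS[name]
    return f"Okay, one {slots['size']} {slots['item']}."


# "I'd like a large, please" fills the size but not the item.
print(next_prompt({"size": "large"}))
print(next_prompt({"size": "large", "item": "coffee"}))
```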
27
VUI Design
• Support for conversations is challenging
• Requires memory of “past” parts of conversation
• Support for pronouns
• Jarvis: “The render is complete.”
• Tony Stark: “Hey, I like it. Fabricate it.”
• Building a “dialog” in the original sense of the word
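Pronoun support needs some memory of what was just discussed. The sketch below remembers only the most recently mentioned object and substitutes it for "it". The class name `DialogContext` and its methods are hypothetical, and real coreference resolution is considerably more involved.

```python
class DialogContext:
    """Remember the most recently mentioned object so "it" can be resolved."""

    def __init__(self):
        self.last_object = None

    def mention(self, obj: str) -> None:
        """Record the object the system or user just talked about."""
        self.last_object = obj

    def resolve(self, utterance: str) -> str:
        """Replace the word "it" with the remembered object, if any."""
        if self.last_object is None:
            return utterance
        words = [self.last_object if w.lower() == "it" else w
                 for w in utterance.rstrip(".").split()]
        return " ".join(words)


context = DialogContext()
context.mention("the render")            # Jarvis: "The render is complete."
print(context.resolve("Fabricate it."))  # Tony Stark's follow-up command
```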
28
VUI Design
• Speech output vastly influences the overall experience
• Stephen Hawking and DECtalk - Perfect Paul
• JARVIS - Paul Bettany doing voice overs
• Pre-recorded voice
• Limited responses, but more emotional
• Computer generated voice
• More flexible, less nuanced
29
VUI Design
• You are designing a persona, not a device
• Siri, Jarvis, Friday, Alexa, Cortana: they all have personalities
• You are not designing a person, but a persona
• You can say “I” or “me”, but avoid “we” so as not to group the persona with humans
• Design your persona as you would in any design activity, with traits and quirks
• Jarvis – British butler, said “Sir”
• Friday – Irish assistant gal, said “Boss”
30
VUI Design
Consider human rules of conversation to make the interactions as natural as possible
• Use transitional markers
• “Next, I’ll need to know…”
• Be helpful and respectful within the persona
• Provide acknowledgements and feedback
• “Okay, got it.”
• Avoid jargon and tech speak
• Don’t: “Sorry, there was a server error”
• Do: “Hm, I’m having trouble doing that right now”
31
VUI Design
• When working with pure voice, you are always working with a blank slate
• No affordances, nothing to recognize
• Provide prompts
• “You can do things like…”
• Consider your intents and synonyms carefully
• Test, test, test!
32
Thanks for Attending
Contact me @ jeffl@bostonux.com
33
