by Isidore Gotto
Getting Started with Voice UI
Last updated: Feb. 2018
“Hello, I’m _____.
TOPICS
Voice Overview
• What is Voice?
• Why should you consider adding Voice?
• Voice Only: Pros & Cons
• Voice: Things to Consider
• Introducing Voice to SDLC
• Crawl, Walk, Run Approach
Designing for Voice
• Intro
• 5 Steps to Designing for Voice before Coding
• 7 Principles for Designing Voice
• Real Life User Conditions
• Error Handling
• Identifying the Problem
• Complexity by Data Inputs
• Voice AI Persona, Personality, Tone …
Resources & Reference Links
• Designer Tool-Kit Downloads
• UX Research Result
• Platform Comparison
• Industry Best Practices URLs
• Prototyping & Development Tools
Voice Overview
What is Voice?
Voice experiences have been around since the 1950s. Today’s advances in technology and the demand for innovation have brought us to the next evolution of human-computer interaction.
In today’s market you may come across many terms for Voice, e.g. voice assistant, voice-enabled speaker, Voice UI (VUI), Conversational UI (CUI), Artificial Intelligence (AI), etc. All you need to understand is that it’s software that listens for grammatical details and attempts to recognize sentence structure in order to understand the context and meaning of instructions.
Voice User Interface (VUI) is the next generation of human-computer interaction. VUIs allow people to use their voice to interact with computers/systems, instead of using their hands with a mouse, keyboard, or touch screen.
This method of interacting with your products & services has enormous potential.
* Apple’s Siri, Google Assistant, Amazon’s Alexa, and Microsoft’s Cortana are all prime examples of consumer-level AI that can respond to a request, control some physical devices, offer options based on internet searches, and more.
** IBM’s Watson, a business-to-business solution, takes AI to another level by adding the ability to make predictions, assumptions, and even some reasoning about computational outcomes.
Technology has Arrived
As of June 2017, Amazon Alexa has grown to 15,000 skills & 98% speech recognition accuracy.
Tech giants like Google, IBM, Apple, Cisco and even Slack are all investing in voice technology.
Why should you consider introducing Voice Assistance to your products & services?
Introducing Voice Assistance to your products & services will one day help improve your overall client experience.
Problem
• Today it takes the average new user multiple attempts and extensive training to get familiar with your products & services.
• Fact: the majority of call volume across all business types revolves around “how do I…”
Key benefits to users & clients:
• Simplification / Ease of Use - “everyone knows how to talk…”
• Speed & Convenience in Hands-Free / Screen-Free Situations
• Multi-Tasking - working in one file while requesting info from another
* Taking it beyond Voice Only and introducing multi-modal Voice experiences with a new Voice GUI brings contextual navigation, orientation, personalization & additional benefits to users.
** Voice assistants can help with human empathy, as humans have a difficult time understanding tone via the written word alone. Voice, which includes tone, volume, intonation, personality and rate of speech, conveys a great deal of information.
“Voice is being seen as the future of software & computer interaction.”
Voice Only: Pros vs. Cons
Pros
✓ Get a specific question addressed more easily and quickly: Ask/Command & Done!
✓ Great for specific info/data lookup and data analysis tasks that are either buried or not accessible via current navigation
✓ Focused conversation & a limited number of choices lend speed & confidence to decision making
✓ Handy when the user’s situation requires a hands-free setup
✓ More natural interactions – “Humanize the experience”
Cons
✗ May not be obvious to the user that they can initiate a conversation, or what/how to ask
✗ User may need to adjust their work environment
✗ High risk of exceeding the user’s cognitive load when processing a voice response
✗ Not suitable for complex tasks that require visual guidance or user input, or that involve many choices
✗ Privacy & security concerns with speaking out loud
Voice: Things to Consider
Benefit of Introducing a Voice + GUI Experience: “Multi-Modal Interactions” – combining two or more modes of interaction.
• Multi-modal allows you to compensate for the limits of working memory & for task complexity, through the current visual interface or by introducing a new Voice GUI overlay.
Examples:
• Leverage a voice/chat-based experience.
• Visual confirmations (the Hound app does a great job: input is voice only, responses are voice + visual).
UX Challenges to Overcome
• User Input - Speech-to-Text recognition (subject to the constraints of the technology selected)
• Type of Data Input – will vary based on the complexity of the use case & task
• Privacy - speaking sensitive information out loud (the system needs to be able to identify sensitive information and not respond with audio)
The challenge is making the experience more natural while tackling the wide variety of ambiguity that may occur.
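The privacy challenge above (identify sensitive information and do not respond with audio) could be sketched roughly as follows. This is a minimal Python illustration only: the pattern list, the `Response` type and the function names are hypothetical and not part of any voice platform, and a real system would need far more robust detection than a few regular expressions.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns for content that should not be spoken aloud.
SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),                      # SSN-like number
    re.compile(r"\b(password|pin|balance)\b", re.IGNORECASE),  # sensitive keywords
]


@dataclass
class Response:
    speak: str    # what the assistant says out loud
    display: str  # what is shown on screen ("" if nothing is shown)


def is_sensitive(text: str) -> bool:
    """Return True if any hypothetical sensitive pattern matches the text."""
    return any(p.search(text) for p in SENSITIVE_PATTERNS)


def build_response(answer: str, has_screen: bool) -> Response:
    """Choose the output modality for an answer based on its sensitivity."""
    if not is_sensitive(answer):
        # Safe content: speak it, and mirror it visually when a screen exists.
        return Response(speak=answer, display=answer if has_screen else "")
    if has_screen:
        # Sensitive content on a multi-modal device: show it, do not say it.
        return Response(speak="I've put that on your screen.", display=answer)
    # Sensitive content on a voice-only device: decline to read it aloud.
    return Response(speak="I can't read that out loud. Please check the app.",
                    display="")
```

On a multi-modal device the sensitive answer moves to the visual channel; on a voice-only device the sketch simply declines, which is one possible policy among several.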
Introducing Voice to your SDLC
A conversational or natural language user interface is a method of interacting with computers through text or voice
commands.
With good speech recognition, accurate instruction detection & quick responses, voice interaction is starting to feel natural.
Designing for Voice
Voice User Interface (VUI) systems understand voice commands and respond either by speaking back or by showing a visual response.
The difference between voice-only and ‘multi-modal’ interactions is that a multi-modal interface can convey more information to the user than a voice-only device. Multi-modal interfaces could help drive huge advances in the workplace.
Voice-only interactions, meanwhile, benefit the user in hands-free situations and provide quick answers to short commands.
Adding voice to any system gives it a sense of life, personality & character. Moving forward with voice, we must think about how verbal conversations sound, feel, and flow.
“Using a VUI should feel as natural as speaking, and listening, to any other human.”
5 Steps to Designing a Voice Experience before Coding
1. Discover
What problem can voice solve? How will voice provide value to your users? (Consider all environments.)
2. Define
Voice Persona – tone, voice, personality…
Evaluate Capabilities – will voice be a good fit for this use case or task? Start by introducing 1 to 5 capabilities. - Download Voice Evaluation Worksheet
3. Detail Conversation Flow
Begin with the “Happy Path”: a conversational flow in which the voice app can respond to the user’s request without any exceptions or errors. Then move on to detailing the conversation flow for exceptions and errors. - Download Design Kit
4. Describe Alternative Words & Phrases for NLP
People don’t always use the same words to say the same thing, and voice apps need to be taught that. Phrase mapping is an exercise to teach voice apps to accommodate variation in the way users phrase their requests.
5. Refine
Test, learn, measure & refine with user research.
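To make the phrase-mapping step concrete, here is a small Python sketch. It assumes a hand-written map from intents to sample phrasings and uses simple word overlap (Jaccard similarity) to match an utterance. The intent names, sample phrases and threshold are all hypothetical, and production platforms use trained NLP models rather than this kind of matching.

```python
import re
from typing import Optional, Set

# Hypothetical phrase map: each intent lists sample phrasings a user might say.
PHRASE_MAP = {
    "check_order_status": [
        "did my package ship",
        "where is my order",
        "what's my order status",
        "track my package",
    ],
    "find_phone_number": [
        "what is the phone number for support",
        "how do i call support",
        "give me the support number",
    ],
}


def normalize(utterance: str) -> Set[str]:
    """Lowercase the utterance and return its set of words, ignoring punctuation."""
    return set(re.findall(r"[a-z']+", utterance.lower()))


def match_intent(utterance: str, threshold: float = 0.5) -> Optional[str]:
    """Return the intent whose sample phrase best overlaps the utterance, or None."""
    words = normalize(utterance)
    best_intent, best_score = None, 0.0
    for intent, phrases in PHRASE_MAP.items():
        for phrase in phrases:
            phrase_words = normalize(phrase)
            # Jaccard similarity: shared words divided by all distinct words.
            score = len(words & phrase_words) / len(words | phrase_words)
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent if best_score >= threshold else None
```

An utterance that matches nothing returns `None`, which is the point where the conversation flow for exceptions and errors would take over.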
Steps to VUI: Discover → Define → Detail → Describe → Refine
Principles for Designing for Voice
*Voice Design Guides by Google, Amazon, Cortana
Voice UI & Conversational UI Design Kit - Download
Voice Task Evaluation Worksheet - Download
Things to Consider when Designing for Voice
“It’s hard enough to speak with another human.”
REAL LIFE USER CONDITIONS
• interrupted
• self-correction
• cut off too soon
• background noise
• confused
• too many choices
• didn’t understand
• talked too long
• speaks in other terms
• coughs
• hesitation
• connection cuts off
• language
• accents
• soft spoken
• culture
• jumps from one thought to another
• privacy
Error Handling
I Don’t Understand You
When a so-called “error” occurs in a conversation, it should be treated simply as a new turn in the dialog, only with different conditions.
Example:
• I did not understand your request. Did you say A or B?
• I am currently not able to process your request. Would you prefer A or B?
• I am not able to process your request. Would you like me to connect you with a Service Representative?
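One way to treat an error as “a new turn in the dialog” is to track consecutive misses and escalate the reprompt, ending with a hand-off to a person. The Python sketch below reuses the example prompts from this slide. The `DialogState` type, the `next_prompt` function and the "OK." acknowledgement are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

# Escalating reprompts, based on the example prompts in this section.
REPROMPTS = [
    "I did not understand your request. Did you say {a} or {b}?",
    "I am currently not able to process your request. Would you prefer {a} or {b}?",
]
HANDOFF = ("I am not able to process your request. "
           "Would you like me to connect you with a Service Representative?")


@dataclass
class DialogState:
    misses: int = 0  # consecutive turns the app failed to understand


def next_prompt(state: DialogState, understood: bool, a: str, b: str) -> str:
    """Return the next thing to say; a miss is just another turn with new conditions."""
    if understood:
        state.misses = 0  # recovery: clear the miss counter
        return "OK."
    state.misses += 1
    if state.misses <= len(REPROMPTS):
        # Offer a narrowed choice instead of repeating the same question.
        return REPROMPTS[state.misses - 1].format(a=a, b=b)
    # Too many misses in a row: offer a human hand-off.
    return HANDOFF
```

Each miss narrows the options or changes the offer rather than repeating the same question, and a successful turn resets the counter.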
Identifying if a Voice UI experience is the right solution
GET STARTED WITH ASKING: What user problem are you looking to solve?
• First, identify your intended user persona & personality.
• Then, lay out their typical journey when using your application.
• Next, identify areas where Voice will benefit the user.
• Then, identify what other personas will benefit from the same or a similar Voice experience.
• Design, prototype and test – more on this later
Examples where Voice can make a BIG difference assisting users today:
1. Difficulty finding or navigating applications, e.g. How do I… Where is… Shortcuts…
2. What’s my status? e.g. Did my package ship?
3. What is __________ phone number?
4. I have a specific question on ___________.
5. Look up _________ information or data.
6. Show me _________ report.
7. Calculate total or difference between _________ & _________.
Note: where possible, use data/analytics first to identify the areas of your applications that are most frequently used or generate the largest call volume. Then use the Voice Task Evaluation Worksheet to evaluate them.
Our GOAL is to build a complete & seamless Voice Experience across all your products.
Voice UI & Conversational UI Design Kit - Download
Complexity by Data Input Types on Users via Voice/Conversational UI
On/Off (checkbox, switch)
• Voice Only (standalone): Easy
• Voice + GUI (multi-modal exp.): Easy
• Conversational UI, chat/text: Easy
• Pro-active Conversational UI w/ AI (multi-modal exp.): Easy
Select one or multiple from options offered (radio options, dropdown menus, checkboxes, cards, multi-select)
• Voice Only (standalone): Difficult (cognitive load without visual aid)
• Voice + GUI (multi-modal exp.): Easy (multi-modal: two or more modes of interaction; GUI used for data entry, selection, validation, confirmation)
• Conversational UI, chat/text: Difficult. Presentation of choices needs to be limited, especially multiple choice.
• Pro-active Conversational UI w/ AI (multi-modal exp.): Difficult. Presentation of choices needs to be limited, especially multiple choice.
Structured fields (dates, currency, etc.)
• Voice Only (standalone): Difficult (inconsistent voice recognition performance)
• Voice + GUI (multi-modal exp.): Easy (multi-modal: two or more modes of interaction; GUI used for data entry, selection, validation, confirmation)
• Conversational UI, chat/text: Easy, but could be tedious when multiple fields are involved. Recommend that large input forms be designed in traditional UI format.
• Pro-active Conversational UI w/ AI (multi-modal exp.): Easy, but could be tedious when multiple fields are involved. Recommend that large input forms be designed in traditional UI format.
Text fields with variable data (email addresses, people’s names, addresses)
• Voice Only (standalone): Difficult (voice recognition of variable data)
• Voice + GUI (multi-modal exp.): Easy (multi-modal: two or more modes of interaction; GUI used for data entry, selection, validation, confirmation)
• Conversational UI, chat/text: Easy, but could be tedious when multiple fields are involved. Recommend that large input forms be designed in traditional UI format.
• Pro-active Conversational UI w/ AI (multi-modal exp.): Easy, but could be tedious when multiple fields are involved. Recommend that large input forms be designed in traditional UI format.
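The matrix on this slide can also be expressed as a small lookup, which may be handy when evaluating a task that mixes several input types. The Python sketch below encodes only the Easy/Difficult ratings (the "tedious with many fields" caveat is not modelled). The key names and the `suitable_modes` helper are hypothetical.

```python
from enum import Enum
from typing import Dict, List


class Difficulty(Enum):
    EASY = "easy"
    DIFFICULT = "difficult"


E, D = Difficulty.EASY, Difficulty.DIFFICULT

# Column order follows the slide: voice only, voice + GUI, chat/text, pro-active w/ AI.
MODES = ["voice_only", "voice_gui", "chat_text", "proactive_ai"]

# Ratings transcribed from the matrix; the key names are illustrative.
COMPLEXITY: Dict[str, Dict[str, Difficulty]] = {
    "on_off":           dict(zip(MODES, [E, E, E, E])),
    "select_options":   dict(zip(MODES, [D, E, D, D])),
    "structured_field": dict(zip(MODES, [D, E, E, E])),
    "variable_text":    dict(zip(MODES, [D, E, E, E])),
}


def suitable_modes(input_types: List[str]) -> List[str]:
    """Return the interaction modes rated Easy for every input type in a task."""
    return [m for m in MODES
            if all(COMPLEXITY[t][m] is E for t in input_types)]
```

For example, a task that needs both an on/off toggle and a selection from options is rated Easy only in the voice + GUI mode, which matches the slide’s case for multi-modal experiences.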
Creating the Voice of A.I. for your Product
Characteristics of Voice for A.I.
1. Tone of Voice
2. Gender of Voice
3. Personality
4. Character
5. Word & Phrase Choices
6. Functional Design
7. Style & Technique
Base your characteristics on:
• Your user population
• Their needs
• The imagery & qualities associated with your brand
Resources
Reference Links, Research Results, Frameworks & more.
Reference & Resource Links
We have created several downloadable tool-kits to help you get started with adopting Voice/Conversational UI experiences in your products.
• Customer Journey & Scripting for Voice – helps you facilitate stakeholder discussions to evaluate where in your customer journey Voice UI would make an impact: Product Discovery, Initial Setup of a new Client, First Benefit/Use, Re-Use. Also includes samples on designing conversational UI, with scripts and prototype references. – download
• Voice Use Case / Task Evaluation Worksheet – helps you quickly evaluate your product use cases for Voice prior to designing. – download
• Voice Personality Development – expands on traditional personas, looking deeper into user personality traits and character, and into your AI personality.
Industry UX design best practices and heuristics for voice & conversational UI:
• Amazon: https://developer.amazon.com/designing-for-voice/design-process/
• Apple Siri: https://developer.apple.com/sirikit/
• Google: https://developers.google.com/actions/design/checklist | https://developers.google.com/actions/design/principles
• Microsoft: https://docs.microsoft.com/en-us/cortana/skills/design-principles
• Samsung Bixby: http://bixby.samsung.com/
Platform Comparison (as of Oct. 2017)
Amazon Skills
• Available on: Standalone, mobile (Alexa for Business announced Nov. 2017)
• Pros: 95-98% accuracy; languages: US, Europe, German, Japanese …
• Cons: To Be Delivered (TBD)
Apple Siri Kit
• Available on: iPhone, iPad, Mac, MacBook, Apple Watch, HomePod
• Pros: 88% accuracy; multi-language supported …
• Cons: …
Google Assistant
• Available on: Phone, tablet, laptop, standalone devices & web
• Pros: 95-98% accuracy; multi-language supported …
• Cons: …
Microsoft Cortana
• Available on: Laptop, desktop, standalone devices
• Pros: 95-98% accuracy; multi-language supported …
• Cons: …
Samsung Bixby
• Available on: Phone, tablet, TV
• Pros / Cons: To Be Delivered (TBD)
Company Virtual Assistant
• Available on: Company ecosystem of products & services, online or native app
• Pros / Cons: To Be Delivered (TBD)
Other platforms…
Platforms
• Google Assistant - https://developers.google.com/assistant/sdk/overview | https://developers.google.com/assistant/sdk/
• Google Speech - https://cloud.google.com/speech/
• Apple Siri Kit - https://developer.apple.com/sirikit/
• Microsoft Cortana - https://developer.microsoft.com/en-us/cortana
• Microsoft Bing Speech API - https://azure.microsoft.com/en-us/services/cognitive-services/speech/
• UWP Speech Recognition - https://docs.microsoft.com/en-us/windows/uwp/input-and-devices/speech-recognition
• Microsoft Cortana Skills Kit - https://developer.microsoft.com/en-us/cortana
• Microsoft speech recognition reached a 5.1% error rate in Aug. 2017 - https://techcrunch.com/2017/08/20/microsofts-speech-recognition-system-hits-a-new-accuracy-milestone/
• Finnish IT company Blucup wanted to find a way for its salespeople to input customer data and generate leads while in the field - https://customers.microsoft.com/en-us/story/blucup-discrete-manufacturing-cognitive-services
• Samsung Bixby - http://developer.samsung.com/home.do | https://news.samsung.com/global/bixby-a-new-way-to-interact-with-your-phone
• Amazon Alexa - https://developer.amazon.com/alexa
• Amazon Voice Design Guide - https://developer.amazon.com/designing-for-voice/
Design Guides
• Amazon - https://developer.amazon.com/designing-for-voice/
• Google - https://developers.google.com/actions/design/
• Facebook - https://developers.facebook.com/docs/messenger-platform/introduction/general-best-practices
• Slack - https://api.slack.com/best-practices
• Apple - https://developer.apple.com/ios/human-interface-guidelines/overview/themes/
Paid Vendors
• KeenResearch - http://keenresearch.com/
• DialogFlow - Conversational UX Platform for Web, Mobile and IoT - https://dialogflow.com/
• SpeechMatics - https://www.speechmatics.com/
Open Source Vendors
• SoundHound “Hound” - https://soundhound.com/hound
• CMU Sphinx - https://cmusphinx.github.io/
• OpenEars - https://www.politepix.com/openears/
• iSpeech - https://www.ispeech.org/
Prototyping & Development Tools
Non-Developer
• Wizard of Oz – set of microphones and speakers
• Sayspring.com (voice only, can be connected to Amazon and Google)
• InVisionApp, Axure, Keynote etc. (used to create the GUI part of the experience)
Development Skills Required
• Wit.ai
• Dialogflow.com
• SoundHound.com ‘Houndify’
• Amazon Alexa Skills
• Google Cloud Platform
• Apple Speech Recognition
• IBM Watson – Speech to Text and Text to Speech
Voice Analytics
• VoiceLabs.com
24

More Related Content

PPTX
Marketing plan for app "Share the Food - Help Needy"
PDF
UX 101: Personas
PDF
How CleverTap helped Dream11 Drive Exceptional User Growth
PPTX
The Ethics of AI
PPT
Social Networking Project (2)
PDF
Ui vs UX design
PPTX
UI / UX Design Presentation
PPSX
Introduction to mobile application
Marketing plan for app "Share the Food - Help Needy"
UX 101: Personas
How CleverTap helped Dream11 Drive Exceptional User Growth
The Ethics of AI
Social Networking Project (2)
Ui vs UX design
UI / UX Design Presentation
Introduction to mobile application

What's hot (20)

PPTX
Disadvantages of Social Media
PPTX
Brief history of social media
PDF
Personal website design
DOCX
Artificial Intelligence Report
PDF
Future of work employability and digital skills march 2021
PDF
Technology - Yesterday, Today and Tomorrow
DOCX
Artificial intelligence report
PPTX
PDF
Topics for Presentation
PDF
UX para dispositivos móviles
PDF
Artificial intelligence a bane or boon-pdf
PPTX
Technology’s impact on society
PDF
The Rise of Douyin/TikTok
PPTX
Artificial Intelligence: Classification, Applications, Opportunities, and Cha...
PDF
Fitness app proposal
PPTX
Impact Of Artificial Intelligence (AI) On Society_ Presentation .pptx
PDF
UI/UX Workshop - Hackvision
PPTX
smart note taker
PPTX
Effects of social networking
PPT
Trends in New Media
Disadvantages of Social Media
Brief history of social media
Personal website design
Artificial Intelligence Report
Future of work employability and digital skills march 2021
Technology - Yesterday, Today and Tomorrow
Artificial intelligence report
Topics for Presentation
UX para dispositivos móviles
Artificial intelligence a bane or boon-pdf
Technology’s impact on society
The Rise of Douyin/TikTok
Artificial Intelligence: Classification, Applications, Opportunities, and Cha...
Fitness app proposal
Impact Of Artificial Intelligence (AI) On Society_ Presentation .pptx
UI/UX Workshop - Hackvision
smart note taker
Effects of social networking
Trends in New Media
Ad

Similar to Getting Started with Voice UI (20)

PPTX
Designing Voice-Driven Game Experiences | Dave Isbitski
PDF
Voice Tech TO #1
PPTX
Designing for Voice UI: Planning and Writing for Voice Interaction
PPTX
Great Voice Experiences Start with Listening: Best Practices in Research and ...
PDF
Introduction to Voice Design
 
PDF
What Voice User Interface (VUI) Means in Web Design
PDF
Getting ready for voice
PDF
VUI Design
PDF
Designing applications for voice interface platforms
PDF
Understanding Voice User Interface Design
PDF
Stratis Valachis, Designing for Voice Interfaces
PDF
Designing for Voice
PDF
HAMBURG Voice MEETUP #4 LEARN voice user interface design!
PDF
UX STRAT Europe 2019: Zhaochang He, VMware
PDF
Designing Voice Applications - Create For Voice
PPTX
What do you understand by voice user interface design (VUI).pptx
PPTX
Conversational User Interfaces, Past and Future
PPTX
Design Principal for Action on Google
PDF
Conversational user interfaces (by Jochem Grietens)
PDF
Conversational experience by Systango
Designing Voice-Driven Game Experiences | Dave Isbitski
Voice Tech TO #1
Designing for Voice UI: Planning and Writing for Voice Interaction
Great Voice Experiences Start with Listening: Best Practices in Research and ...
Introduction to Voice Design
 
What Voice User Interface (VUI) Means in Web Design
Getting ready for voice
VUI Design
Designing applications for voice interface platforms
Understanding Voice User Interface Design
Stratis Valachis, Designing for Voice Interfaces
Designing for Voice
HAMBURG Voice MEETUP #4 LEARN voice user interface design!
UX STRAT Europe 2019: Zhaochang He, VMware
Designing Voice Applications - Create For Voice
What do you understand by voice user interface design (VUI).pptx
Conversational User Interfaces, Past and Future
Design Principal for Action on Google
Conversational user interfaces (by Jochem Grietens)
Conversational experience by Systango
Ad

More from Isidore Gotto (14)

PDF
User Behavior Analytics
PDF
7 Principles for Designing for Voice
PDF
Things to consider when designing for voice
PDF
Conversational UI User/Technology Path
PDF
Conversational UI / Voice UI Use Case Evaluation
PDF
The Rise of Voice Invoca Report: Nov 2017
PDF
User Testing Webinar: Mobile Banking Industry Insights 02.21.2018
PDF
Forrester 2018 predictions
PDF
Answer lab best practices in research and design for voice user interfaces
PDF
Game Changers 2018
PDF
What is the Social Graph?
PDF
User Experience Tools for the UX Professional
PDF
Social Media and Technology Events
DOCX
Microsoft html5 web camp june 15 in nyc notes
User Behavior Analytics
7 Principles for Designing for Voice
Things to consider when designing for voice
Conversational UI User/Technology Path
Conversational UI / Voice UI Use Case Evaluation
The Rise of Voice Invoca Report: Nov 2017
User Testing Webinar: Mobile Banking Industry Insights 02.21.2018
Forrester 2018 predictions
Answer lab best practices in research and design for voice user interfaces
Game Changers 2018
What is the Social Graph?
User Experience Tools for the UX Professional
Social Media and Technology Events
Microsoft html5 web camp june 15 in nyc notes

Recently uploaded (20)

PPTX
Understanding_Digital_Forensics_Presentation.pptx
PDF
Agricultural_Statistics_at_a_Glance_2022_0.pdf
PPTX
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
PDF
CIFDAQ's Market Insight: SEC Turns Pro Crypto
PPT
“AI and Expert System Decision Support & Business Intelligence Systems”
DOCX
The AUB Centre for AI in Media Proposal.docx
PDF
Encapsulation_ Review paper, used for researhc scholars
PDF
Dropbox Q2 2025 Financial Results & Investor Presentation
PPTX
MYSQL Presentation for SQL database connectivity
PDF
Unlocking AI with Model Context Protocol (MCP)
PPTX
20250228 LYD VKU AI Blended-Learning.pptx
PPT
Teaching material agriculture food technology
PDF
NewMind AI Monthly Chronicles - July 2025
PDF
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
PPTX
A Presentation on Artificial Intelligence
PDF
NewMind AI Weekly Chronicles - August'25 Week I
PDF
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
PDF
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
PDF
Bridging biosciences and deep learning for revolutionary discoveries: a compr...
PDF
Building Integrated photovoltaic BIPV_UPV.pdf
Understanding_Digital_Forensics_Presentation.pptx
Agricultural_Statistics_at_a_Glance_2022_0.pdf
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
CIFDAQ's Market Insight: SEC Turns Pro Crypto
“AI and Expert System Decision Support & Business Intelligence Systems”
The AUB Centre for AI in Media Proposal.docx
Encapsulation_ Review paper, used for researhc scholars
Dropbox Q2 2025 Financial Results & Investor Presentation
MYSQL Presentation for SQL database connectivity
Unlocking AI with Model Context Protocol (MCP)
20250228 LYD VKU AI Blended-Learning.pptx
Teaching material agriculture food technology
NewMind AI Monthly Chronicles - July 2025
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
A Presentation on Artificial Intelligence
NewMind AI Weekly Chronicles - August'25 Week I
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
Bridging biosciences and deep learning for revolutionary discoveries: a compr...
Building Integrated photovoltaic BIPV_UPV.pdf

Getting Started with Voice UI

  • 1. by Isidore Gotto Getting Started with Voice UI Last updated: Feb. 2018 “Hello, I’m _____. “
  • 2. TOPICS Voice Overview Designing for Voice Resources & Reference Links 2 • What is Voice? • Why you should consider adding Voice? • Voice Only: Pro’s & Con’s • Voice: Things to Consider • Introducing Voice to SDLC • Crawl, Walk, Run Approach • Intro • 5 Steps to Designing for Voice before Coding • 7 Principles for Designing Voice • Real Life User Conditions • Error Handling • Identifying the Problem • Complexity by Data Inputs • Voice AI Persona, Personality, Tone … • Designer Tool-Kit Downloads • UX Research Result • Platform Comparison • Industry Best Practices URLs 4 5 6 7 8 9-10 12 13 14 15 16 17 18 19 21 22 23 24 • Prototyping & Development Tools 25
  • 4. What is Voice? Voice experience has been around since the 1950’s. Today’s enhancements in technology & demand for innovation has brought us to the next evolution of human computer interactions. In today’s market you may come across all types of terms for Voice; i.e. voice assistant, voice-enabled speakers, Voice UI (VUI), Conversational UI (CUI), Artificial Intelligence (AI), etc. All you need to understand is that its software that listens out for grammatical details and attempts to recognize sentence structure to understand the context and meaning of instructions. Voice User Interface (VUI) is the next generation of human computer interaction. VUIs allow people to use the power of their voice to interact with computers/systems, instead of using their hands with a mouse, keyboard, or touch screen. This method of interacting with your product & services has unlimited potential. * Apple’s Siri, Google’s Assistant, Amazon’s Alexa, and Microsoft’s Cortana are all prime examples of consumer level AI that can respond to a request, control some level of physical devices, help give options based on internet searches, and more. ** IBM’s Watson, a business to business solution, takes AI to another level by adding the ability to make predictions, assumptions, and even some reasoning to computational outcomes. 4 “ Technology has Arrived as of June 2017 Amazon Alexa has grown to 15,000 skills & 98% speech recognition accuracy. Tech giants like Google, IBM, Apple, Cisco and even Slack are all investing into voice technology. “Hello, I’m _____.
  • 5. Why should you consider introducing Voice Assistance to your products & services? • Simplification / Ease of Use - “everyone knows how to talk…” • Speed & Convenience in Hands Free / Screen Free Situations • Multi-Tasking - working on one file requesting info from another * Taking it beyond Voice Only an introducing multi-modal Voice experiences with a new Voice GUI, we now bring contextual navigation, orientation, personalization & additional benefits to users. ** Voice assistants can help with human empathy as humans have a difficult time understanding tone via the written word alone. Voice, which includes tone, volume, intonation, personality and rate of speech conveys a great deal of information. 5 “ Introducing Voice Assistance to your product & services will one day help improve your overall client experience. Key benefits to users & clients: Problem  Today it takes an avg. new user multiple attempts, endless amount of training to get familiar with your product & services.  Fact: majority of call volume across all business types revolves around – “how do I…” “Voice is being seen as the future of software & computer interaction.” “Hello, I’m _____.
  • 6. 6 Voice Only: Pro’s vs. Con’s Pro’s  Get a specific question addressed more easily and faster; Ask/Command & Done!  Great for specific info/data lookup and data analysis tasks, that are either buried or not accessible via current navigation  Focused conversation & limiting number of choices lends to speed & confidence with decision making  Handy when user situation requires a hands-free setup  More Natural interactions – “Humanize the experience” Con’s ✗ May not be obvious to user that they can initiate conversation or what/how to ask ✗ User may need to adjust their work environment ✗ High Risk of exceeding cognitive load to process voice response ✗ Not suitable for complex tasks that require visual guidance, user input or involve many choices ✗ Privacy & security concerns with speaking out loud
  • 7. 7 Voice: Things to Consider Benefit of Introducing Voice + GUI Experience “Multi-Modal Interactions“– combining two or more modes of interaction. • Multi-Modal allows you to compensate for cognitive memory weaknesses & task complexity, through current visual interface or by introducing a new Voice GUI overlay. Examples: Leverage a voice/chat based experience. Visual Confirmations (Hound app does a great job with only voice input, responses are voice + visual. ) UX Challenges to Overcome • User Input - Speech 2 Text Recognition (based on technology selection constraints) • Type of Data Input – will vary based on complexity of Use Case & Task • Privacy - Speaking Out Loud Sensitive Information (system needs to be able to identify sensitive information and not respond with audio) The challenge is making the experience more natural, tackling the wide variety of ambiguity that may occur.
  • 8. 8 Introducing Voice to your SDLC A conversational or natural language user interface is a method of interacting with computers through text or voice commands. With good speech recognition, accurate instruction detection & quick responses, voice interaction is starting to feel natural. “ “Hello, I’m _____.
  • 9. 9
  • 11. 11 Designing for Voice Voice User Interface (VUI) systems understand voice commands, and respond either by speaking back, or by showing a visual response. The difference between Voice-Only interactions & ‘multi-modal’ means more information can be conveyed to the user than on voice only devices. Multi-modal interfaces could help drive huge advances in the workplace. While Voice-Only interactions benefit the user in hands free situations and providing quick answers to short commands. Adding voice to any system will give it the sense of life, personality, & character. Moving forward with voice, we must think about how verbal conversations sound, feel, and flow. ““Using a VUI should feel as natural as speaking, and listening, to any other human.” “Hello, I’m _____.
  • 12. 5 Steps to Designing a Voice Experience before #Coding 1. Discover What problem can voice solve? How will voice provide value to your users? i.e. consider all environments 2. Define Voice Persona – Tone, Voice, Personality… Evaluate Capabilities – Will voice be a good fit for this use case or task? i.e. start with introducing 1 to 5 capabilities. - Download Voice Evaluation Worksheet 3. Detail Conversation Flow Begin with the “Happy Path” a conversational flow in which the voice app can respond to the users request without any expectations or error. Then move on to detailing the conversation flow for exceptions and errors. - Download Design Kit 4. Describe Alternative Words & Phrases for NLP People don’t always use the same words to say the same thing and voice apps need to be taught that. Phrase- mapping is an exercise to teach voice apps to accommodate variation in the way users phrase their requests. 5. Refine Test, learn, measure & refine with user research. 12 “ Steps to VUI Discover Define DetailsDescribe Refine “Hello, I’m _____.
  • 13. GoogleAmazon Cortana 13 Principles for Designing for Voice *Voice Design Guides by “ Voice UI & Conversational UI Design Kit - Download Voice Task Evaluation Worksheet - Download “Hello, I’m _____.
  • 14. interrupted self correction cut off to soon background noise confused too many choices didn’t understand talked too long speaks in other termscoughs hesitation connection cuts off REAL LIFE USER CONDITIONS } language accents soft spoken “It’s hard enough to speak with another human.” culture jumps from one thought to another 14 Things to Consider when Designing for Voice privacy
  • 15. Error Handling: “I Don’t Understand You” When a so-called “error” occurs in a conversation, it should be treated simply as a new turn in the dialog, only with different conditions. Examples: • I did not understand your request. Did you say A or B? • I currently am not able to process your request. Would you prefer A or B? • I am not able to process your request. Would you like me to connect you with a Service Representative?
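Treating an error as just another dialog turn can be sketched as a small state machine. The sketch below is an assumption-laden illustration, not the deck's method: `DialogState`, `next_turn`, and the idea of escalating one prompt per consecutive miss are hypothetical choices; only the prompt wording comes from the slide's examples.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class DialogState:
    """Tracks how many turns in a row the app failed to understand."""
    misses: int = 0


def next_turn(state: DialogState,
              understood: Optional[str],
              options: Sequence[str] = ("A", "B")) -> str:
    """Return the app's next prompt; a miss is simply a new turn."""
    if understood in options:
        state.misses = 0  # back on the happy path
        return f"Okay, {understood}."

    state.misses += 1
    first, second = options[0], options[1]
    if state.misses == 1:
        # First miss: re-ask and restate the choices.
        return (f"I did not understand your request. "
                f"Did you say {first} or {second}?")
    if state.misses == 2:
        # Second miss: change the wording so the user can rephrase.
        return (f"I currently am not able to process your request. "
                f"Would you prefer {first} or {second}?")
    # Third miss onward: offer a hand-off (assumed escalation order).
    return ("I am not able to process your request. Would you like me "
            "to connect you with a Service Representative?")


if __name__ == "__main__":
    state = DialogState()
    for heard in (None, "C", None, "A"):
        print(next_turn(state, heard))
```

Resetting the counter on a successful turn keeps one earlier misunderstanding from pushing the user toward a hand-off later in the same conversation.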
  • 16. Identifying if a Voice UI experience is the right solution
      Get started by asking: What user problem are you looking to solve?
      • First, identify your intended user persona & personality.
      • Then, lay out their typical journey when using your application.
      • Next, identify areas where Voice will benefit the user.
      • Then, identify what other personas will benefit from the same or a similar Voice experience.
      • Design, prototype, and test (more on this later).
      Examples where Voice can make a BIG difference assisting users today:
      1. Difficulty finding or navigating applications, e.g. “How do I…”, “Where is…”, shortcuts...
      2. What’s my status? e.g. “Did my package ship?”
      3. What is __________ phone number?
      4. I have a specific question on ___________.
      5. Look up _________ information or data.
      6. Show me _________ report.
      7. Calculate total or difference between _________ & _________.
      Note: where possible, use data/analytics first to identify the areas of your applications that are most frequently used or that generate the largest call volume. Then use the Voice Task Evaluation Worksheet to evaluate them.
      Our GOAL is to build a complete & seamless Voice Experience across all your products.
      (Download: Voice UI & Conversational UI Design Kit)
  • 17. Complexity by Data Input Type for Users via Voice/Conversational UI
      Modes compared: (a) Voice only (standalone); (b) Voice + GUI (multi-modal experience); (c) Conversational UI, chat/text; (d) Pro-active conversational UI with AI (multi-modal experience).
      On/Off (checkbox, switch)
          (a) Easy. (b) Easy. (c) Easy. (d) Easy.
      Select one or multiple from options offered (radio options, dropdown menus, checkboxes, cards, multi-select)
          (a) Difficult (cognitive load without a visual aid).
          (b) Easy (multi-modal: two or more modes of interaction; the GUI is used for data entry, selection, validation, and confirmation).
          (c) Difficult. Presentation of choices needs to be limited, especially multiple choice.
          (d) Difficult. Presentation of choices needs to be limited, especially multiple choice.
      Structured fields (dates, currency, etc.)
          (a) Difficult (inconsistent voice recognition performance).
          (b) Easy (multi-modal: two or more modes of interaction; the GUI is used for data entry, selection, validation, and confirmation).
          (c) Easy, but could be tedious when multiple fields are involved. Large input forms are best designed in a traditional UI format.
          (d) Easy, but could be tedious when multiple fields are involved. Large input forms are best designed in a traditional UI format.
      Text fields with variable data (email addresses, people's names, addresses)
          (a) Difficult (voice recognition of variable data).
          (b) Easy (multi-modal: two or more modes of interaction; the GUI is used for data entry, selection, validation, and confirmation).
          (c) Easy, but could be tedious when multiple fields are involved. Large input forms are best designed in a traditional UI format.
          (d) Easy, but could be tedious when multiple fields are involved. Large input forms are best designed in a traditional UI format.
  • 18. Creating the Voice of A.I. for your Product
      Characteristics of Voice for A.I.: 1. Tone of Voice 2. Gender of Voice 3. Personality 4. Character 5. Word & Phrase Choices 6. Functional Design 7. Style & Technique
      Base your characteristics on: • Your user population • Their needs • The imagery & qualities associated with your brand
  • 19. Resources: Reference Links, Research Results, Frameworks & more.
  • 20. Reference & Resource Links We have created several downloadable tool-kits to help you get started with adopting Voice/Conversational UI experiences in your products.
      • Customer Journey & Scripting for Voice: helps you facilitate stakeholder discussions to evaluate where in your customer journey Voice UI would make an impact, from Product Discovery and Initial Setup of a new Client to First Benefit/Use and Re-Use. It also includes samples of designing conversational UI with scripts and prototype references. (download)
      • Voice Use Case / Task Evaluation Worksheet: helps you quickly evaluate your product use cases for Voice prior to designing. (download)
      • Voice Personality Development: expands on traditional personas, looking deeper into user personality traits and character and into your AI personality.
  • 21. Reference & Resource Links Industry UX design best practices and heuristics for voice & conversational UI.
      Amazon: https://developer.amazon.com/designing-for-voice/design-process/
      Apple Siri: https://developer.apple.com/sirikit/
      Google: https://developers.google.com/actions/design/checklist | https://developers.google.com/actions/design/principles
      Microsoft: https://docs.microsoft.com/en-us/cortana/skills/design-principles
      Samsung Bixby: http://bixby.samsung.com/
  • 22. Platform Comparison (as of Oct. 2017)
      Amazon Skills
          Available on: Standalone, mobile (Alexa for Business announced Nov. 2017)
          Pros: 95-98% accuracy; languages US, Europe, German, Japanese …
          Cons: To Be Delivered (TBD)
      Apple Siri Kit
          Available on: iPhone, iPad, Mac, MacBook, Apple Watch, HomePod
          Pros: 88% accuracy; multi-language supported …
          Cons: …
      Google Assistant
          Available on: Phone, tablet, laptop, standalone devices & web
          Pros: 95-98% accuracy; multi-language supported …
          Cons: …
      Microsoft Cortana
          Available on: Laptop, desktop, standalone devices
          Pros: 95-98% accuracy; multi-language supported …
          Cons: …
      Samsung Bixby
          Available on: Phone, tablet, TV
          Pros / Cons: To Be Delivered (TBD)
      Company Virtual Assistant
          Available on: Company ecosystem of products & services, online or native app
          Pros / Cons: To Be Delivered (TBD)
      Other platforms…
  • 23. Google Assistant - https://developers.google.com/assistant/sdk/overview | https://developers.google.com/assistant/sdk/
      Google Speech - https://cloud.google.com/speech/
      Apple Siri Kit - https://developer.apple.com/sirikit/
      Microsoft Cortana - https://developer.microsoft.com/en-us/cortana
      Microsoft Bing Speech API - https://azure.microsoft.com/en-us/services/cognitive-services/speech/
      UWP Speech Recognition - https://docs.microsoft.com/en-us/windows/uwp/input-and-devices/speech-recognition
      Microsoft Cortana Skills Kit - https://developer.microsoft.com/en-us/cortana
      Microsoft speech recognition reached a 5.1% error rate in Aug. 2017 - https://techcrunch.com/2017/08/20/microsofts-speech-recognition-system-hits-a-new-accuracy-milestone/
      Case study: Finnish IT company Blucup wanted to find a way for its salespeople to input customer data and generate leads while in the field - https://customers.microsoft.com/en-us/story/blucup-discrete-manufacturing-cognitive-services
      Samsung Bixby - http://developer.samsung.com/home.do | https://news.samsung.com/global/bixby-a-new-way-to-interact-with-your-phone
      Amazon Alexa - https://developer.amazon.com/alexa
      Voice Design Guide - https://developer.amazon.com/designing-for-voice/
      Amazon - https://developer.amazon.com/designing-for-voice/
      Google - https://developers.google.com/actions/design/
      Facebook - https://developers.facebook.com/docs/messenger-platform/introduction/general-best-practices
      Slack - https://api.slack.com/best-practices
      Apple - https://developer.apple.com/ios/human-interface-guidelines/overview/themes/
      Paid Vendors
          KeenResearch - http://keenresearch.com/
          DialogFlow (Conversational UX Platform for Web, Mobile and IoT) - https://dialogflow.com/
          SpeechMatics - https://www.speechmatics.com/
      Open Source Vendors
          SoundHound “Hound” - https://soundhound.com/hound
          CMU Sphinx - https://cmusphinx.github.io/
          OpenEars - https://www.politepix.com/openears/
          iSpeech - https://www.ispeech.org/
  • 24. Prototyping & Development Tools
      Non-Developer: • Wizard of Oz (a set of microphones and speakers) • Sayspring.com (voice only; can be connected to Amazon and Google) • InvisionApp, Axure, Keynote, etc. (used to create the GUI part of the experience)
      Development Skills Required: • Wit.ai • Dialogflow.com • SoundHound.com ‘Houndify’ • Amazon Alexa Skills • Google Cloud Platform • Apple Speech Recognition • IBM Watson (Speech to Text and Text to Speech)
      Voice Analytics: • VoiceLabs.com

Editor's Notes

  • #3: Created to help product teams, designers & researchers ramp up on the benefits of Voice, how to design for Voice and how to test voice.
  • #9: To build a Voice Assistant we need to introduce: 1. Speech Recognition 2. Natural Language Processing 3. Machine Learning 4. Artificial Intelligence 5. Automation
  • #13: Discover: How will voice provide value to my users? Take into consideration why and where people use voice apps. People use voice interfaces because of the benefits of hands-free interaction, the speed of interaction, and the ease of use.
      Define: Persona (tone, voice, personality). Capabilities: what would benefit your users on a voice-driven device in a shared space? Start by introducing 1 to 5 capabilities.
      Detail: Conversation flow. Begin with the “Happy Path”, a conversational flow in which the voice app can respond to the user’s request without any exceptions or errors. Then move on to detailing the conversation flow for exceptions and errors. Use the Voice UI & Conversational UI Design Spreadsheet. Then, with your team, play the conversation out loud and debate it.
      Describe: Alternate words & phrases. People don’t always use the same words to say the same thing, and voice apps need to be taught that. Phrase-mapping is an exercise to teach voice apps to accommodate variation in the way users phrase their requests. For each path you detailed in step 3, now think of the different ways users could word those requests.
      Refine: Test the voice interface with users.
  • #15: In conversations, there are no “errors”. We need to handle these conditions: error handling; decision support; ability to start over or go back.
  • #16: We need to handle these conditions - Error handling - Decision support - Ability to start over or go back
  • #18: Voice Only - We can speak 3 to 4 times faster than we type. Chat / Text - More than 2.1 billion users today use a social messaging app. – Portio research
  • #23: Smart Assistant companies claim their speech recognition software is now at 5.5% word error rate. Humans average around 5.1%.
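For readers unfamiliar with the metric, word error rate (WER) is conventionally the number of word substitutions, deletions, and insertions needed to turn the recognizer's output into the reference transcript, divided by the number of words in the reference. The sketch below computes it with a standard word-level edit distance; the function name and sample sentences are illustrative assumptions, and production evaluations usually apply more careful text normalization first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One wrong word out of four.
    print(word_error_rate("did my package ship", "did my package slip"))
    # One dropped word out of four.
    print(word_error_rate("show me the report", "show the report"))
```

A rate of 5.1% therefore means roughly one word in twenty is wrong, missing, or extra relative to the reference transcript.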
  • #24: Additional deeper R&D required into: • Limits set by how open the platform is & its capabilities • How does platform presence intersect with your app & brand experiences? • Discovery & voice command recall • Setting the right user expectations related to domain coverage
  • #25: https://wit.ai/
      https://dialogflow.com/
      https://soundhound.com/
      https://cloud.google.com/products/machine-learning/
      https://www.ibm.com/watson/services/speech-to-text/
      https://www.ibm.com/watson/services/text-to-speech/
      https://developer.amazon.com/alexa
      https://azure.microsoft.com/en-gb/services/cognitive-services/
      https://developer.apple.com/documentation/speech
      http://voicelabs.co/