Creating a Voice User Interface with Speech Server 2007 Jason Townsend
Jason Townsend President, Bartlesville .NET User Group Sr. Analyst, ConocoPhillips 11+ Years Development Experience Father of 4 wonderful children Married to an amazing and forgiving wife! Avid Sailor
 
Speech Server 2007 Speech Server is an IVR (interactive voice response) platform that allows you to develop telephony applications using standards such as Speech Application Language Tags (SALT) and VoiceXML. New Features Native Voice Over IP (VoIP) Voice Response Workflow Conversational Grammar Builder
Common Application Scenarios Customer Service Pay bills by phone (ex: ChoicePay) Order products (ex: Tickets.com) Customer Support (ex: Dell) Banking (ex: Bank of America) Information Worker Markets Pipeline workers Insurance Appraisers Realtors For workers that may not be in front of a desktop
New Features Support for .NET 2.0 Framework Support for VoiceXML  Voice Response Workflow Applications Based on Windows Workflow Foundation Native Support for VoIP Integrated into Office Communications Server.
Speech Server Architecture
Speech Recognition Supported Languages English – Australia English – United Kingdom English – North America German – Germany Spanish – Americas More to come…
VoiceXML W3C’s standard XML Format for specifying interactive voice dialogues between a human and a computer Interpreted by a voice browser
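For illustration, here is a hedged sketch of what a small VoiceXML 2.0 dialogue might look like, embedded in Python so that it can be checked for well-formedness with the standard library. The form id, field name, and prompt wording are made up for this example, and the built-in `boolean` field type is an optional feature that a given voice browser may not support.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"

# Hypothetical one-question dialogue; names and wording are illustrative only.
DIALOGUE = """<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="speakers">
    <field name="wantsSpeakers" type="boolean">
      <prompt>Would you like to hear about today's speakers?</prompt>
      <filled>
        <prompt>Thank you.</prompt>
      </filled>
    </field>
  </form>
</vxml>
"""


def field_names(document):
    """Parse the document and return the names of its <field> elements."""
    root = ET.fromstring(document)
    return [field.get("name") for field in root.iter(f"{{{VXML_NS}}}field")]


print(field_names(DIALOGUE))
```

A voice browser would interpret the same document by playing the prompt, listening for an answer, and running the `filled` block once the field has a value.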
SALT SALT Forum was founded on October 15, 2001 Microsoft Cisco Comverse Intel Philips ScanSoft W3C work initiated in July 2002 SALT Forum seems to have gone dead. The last press release was in 2003. Main concept was multimodal applications Speechify the web, IVR, handhelds, etc…
SALT Usage Microsoft Speech Server 2004 Only SALT Microsoft Speech Server 2007 SALT and VXML Plugin for Internet Explorer
Key Workflow Concepts Workflows are a set of activities The workflow itself is an Activity Activities are the building blocks of the application A single unit of Reuse A single unit of Execution An Activity has associated properties, conditions, and events Developers can build their own Custom Activity Libraries Imagine your own Telerik RAD Controls, Infragistics Controls, etc… Just for VUIs A Workflow runs within a Host Process WAS IIS .EXE Windows Managed Services
Dialogue Flow is a Workflow Speech Server only supports sequential workflow development
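The idea that a workflow is a set of activities, and is itself an activity, can be sketched in plain Python. This is only a conceptual illustration: the `Activity` and `SequentialWorkflow` classes below are hypothetical and are not the Windows Workflow Foundation or Speech Server API.

```python
class Activity:
    """A single unit of reuse and execution."""

    def __init__(self, name):
        self.name = name

    def execute(self, log):
        # A real activity would play a prompt, ask a question, and so on.
        log.append(self.name)


class SequentialWorkflow(Activity):
    """A workflow is itself an activity that runs its children in order."""

    def __init__(self, name, activities):
        super().__init__(name)
        self.activities = list(activities)

    def execute(self, log):
        for activity in self.activities:
            activity.execute(log)


# Because a workflow is an activity, workflows can be nested.
confirm = SequentialWorkflow("confirm", [Activity("ask"), Activity("verify")])
call = SequentialWorkflow("call", [Activity("greet"), confirm, Activity("goodbye")])

log = []
call.execute(log)
print(log)
```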
Speech Application Development Define the dialogue flow Statements, questions, answers, etc… Other activities Specify possible answers (grammars) Record questions (prompts) Integrate into the back-end (Web Services) Deploy, test, and tune application
Developing Your Prototype Managed Code Assembly
Tuning Applications Out of the box speech applications Are not robust to real world user input Need real data to optimize  Trial phases required for gathering data Wizard of Oz phase Pilot phases Visual Studio Integrated Analytics and Tuning Studio tool can be used to analyze the data and find problems
Reporting in Speech Server Measuring application performance and server performance Call-Volume Self Service completion rates Sharing reporting data throughout the business Speech Server can leverage the full SQL Server stack Reporting Services Analysis Services Integration Services
Data Management – Trace Logging Logs Call details Application instrumentation Audio and grammars Server latencies More… Saved in Speech Server log files Can import via Log import tool into your SQL Server Database/Farm Analyze via Speech Server 2007 Analytics and Tuning Studio Present reports via SQL Server Reporting Services
Logged Information - Prompt Prompt Content Barge-in detection Rate/Volume Persona
Logged Information - Response Input Mode Speech DTMF Grammar Content (coverage) Rule weights Pronunciations Confirmation Threshold SR configuration Speech Detection Rejection Threshold Silence Timeout Endsilence Decoder … Acoustic Models …
 
Voice User Interface (VUI) Allows for human interaction with computers through a voice/speech platform VUI is the interface to any speech application Drive to make them conversational Instead of Browser Incompatibility you have dialect incompatibility. Not all business processes are suited to VUIs. Some are too complex Sometimes automation is impossible or impractical
Grammars Best practice: constrain the grammar as much as possible. Good prompt design guides the caller to use in-grammar responses. Out-of-grammar (OOG) responses are handled with more explicit prompting to elicit in-grammar response.
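As a minimal sketch of the idea, assuming a hypothetical yes/no grammar (the phrase list and the `match_grammar` and `next_prompt` helpers are illustrative, not part of Speech Server), a tightly constrained grammar with out-of-grammar handling might look like this in Python:

```python
# Hypothetical constrained grammar: each accepted phrase maps to a semantic value.
YES_NO_GRAMMAR = {
    "yes": "yes",
    "yeah": "yes",
    "yes please": "yes",
    "no": "no",
    "nope": "no",
    "no thanks": "no",
}


def match_grammar(utterance, grammar=YES_NO_GRAMMAR):
    """Return the semantic value for an in-grammar utterance, or None if out-of-grammar."""
    normalized = " ".join(utterance.lower().split())
    return grammar.get(normalized)


def next_prompt(utterance):
    """Use a more explicit prompt when the response is out-of-grammar (OOG)."""
    value = match_grammar(utterance)
    if value is None:
        return "Sorry, I didn't get that. Please say yes or no."
    return f"You said {value}."
```

Keeping the phrase list small is the point: every phrase added to a grammar is another phrase the recognizer can confuse with its neighbors.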
VUI Design Best Practices Use DTMF for long numbers Don’t use open ended prompts Don’t repeat prompts Focus on grammar accuracy If natural dialogs fail, fall back to directed dialog Always confirm what was recognized Generate prompts based on recognition confidence scores. Bail out if too many errors occur Keep text-to-speech output to a minimum Be aware of human memory “Platinum Rule” Let the Caller Drive
Use DTMF for Long Numbers Limit spoken digits to 4 or fewer This rule is often broken for: Credit Card Numbers Social Security Numbers Bank Account Numbers Telephone Numbers DON’T Break This Rule!!! Remember customer privacy!
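The four-digit rule can be expressed as a simple input-mode check. This is a sketch only; the `input_mode_for` helper and the example digit counts are assumptions made for illustration.

```python
MAX_SPOKEN_DIGITS = 4  # limit from the slide: speak at most 4 digits


def input_mode_for(expected_digits):
    """Pick speech for short numbers and DTMF (keypad entry) for long ones."""
    return "speech" if expected_digits <= MAX_SPOKEN_DIGITS else "dtmf"


# Typical lengths, assumed here for illustration.
for label, digits in [("PIN", 4), ("telephone number", 10), ("credit card number", 16)]:
    print(label, "->", input_mode_for(digits))
```

Keypad entry also helps with privacy, since the caller does not have to say a card or account number out loud.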
Don’t Use Open Ended Prompts BAD: “Hello, thank you for calling Tulsa Techfest. May I help you?” BETTER: “Hello, thank you for calling Tulsa Techfest, would you like to hear about today’s speakers?”
Don’t Repeat Prompts When a prompt is repeated, callers tend to repeat the same response that was not understood the first time Provide Escalated Help
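One way to provide escalated help is to keep a list of prompts that become more explicit after each failed attempt. The prompt wording and the `prompt_for_attempt` helper below are hypothetical.

```python
# Hypothetical prompts that become more explicit with each failed attempt.
ESCALATING_PROMPTS = [
    "Which session would you like to hear about?",
    "Sorry, I didn't catch that. You can say a session name, such as 'Speech Server'.",
    "Please say 'Speech Server' or 'Workflow', or press 1 for Speech Server or 2 for Workflow.",
]


def prompt_for_attempt(failed_attempts):
    """Return a more explicit prompt as failures accumulate; stay on the last one afterwards."""
    index = max(0, min(failed_attempts, len(ESCALATING_PROMPTS) - 1))
    return ESCALATING_PROMPTS[index]
```

Each step narrows what the caller is expected to say, and the last step offers keypad input as a fallback.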
Focus on Grammar Accuracy Spend time TUNING and REFINING your grammars Accuracy is IMPERATIVE To reduce recognition failures: Create prompts that make it clear what the user can and should say Test grammars with many different utterances from several people Record incoming calls once the system is in production and use this information to continually tune the grammars. Watch for dialects!
If Natural Dialogs Fail, Fall back to Directed Dialog Natural Dialogs are great, but they have a higher rate of failure. Don’t want to frustrate the user
Always Confirm What Was Recognized Mismatches are common Austin/Boston Sharp/Shark Britney Spears/Kevin Federline Even for grammars with low ambiguity it’s important to confirm your recognition Implicit confirmation “OK Jason, are you coming to Techfest?” QA Control makes it easy to provide confirmation
Generate Prompts Based on Recognition Confidence Scores Speech recognition errors are common How to handle? Changing prompts Falling back to directed dialogs Transferring to operator Humans change their interaction based on perceived confidence, whether implicitly or explicitly N-Best lists are of great value here
Confidence Scores & N-Best Lists The recognition engine returns a confidence score along with a result The recognition engine can return several “guesses” of what it understood. You tell the engine to return up to N guesses.
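A sketch of how confidence scores might drive the next prompt, assuming confidence values on a 0.0 to 1.0 scale and two made-up thresholds (real thresholds have to come from tuning against recorded calls):

```python
# Assumed thresholds on a 0.0-1.0 confidence scale; real values need tuning.
ACCEPT_THRESHOLD = 0.8
REJECT_THRESHOLD = 0.4


def choose_action(n_best):
    """Decide what to do next from an N-best list of (text, confidence) guesses."""
    if not n_best:
        return ("reprompt", None)
    text, confidence = max(n_best, key=lambda guess: guess[1])
    if confidence >= ACCEPT_THRESHOLD:
        return ("implicit_confirm", text)   # e.g. "OK, Austin. ..."
    if confidence >= REJECT_THRESHOLD:
        return ("explicit_confirm", text)   # e.g. "Did you say Austin?"
    return ("reprompt", None)


print(choose_action([("Austin", 0.91), ("Boston", 0.55)]))
```

High confidence gets a light implicit confirmation, medium confidence gets an explicit yes/no question, and low confidence is treated as a recognition failure.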
Skip Lists Skip List is a  type  of N-Best processing Keep track of results that caller has confirmed ‘no’ to, and don’t ask again.
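A skip list can be sketched as a set of rejected results that is consulted while walking the N-best list. The `next_candidate` helper and the sample scores are illustrative assumptions.

```python
def next_candidate(n_best, skip_list):
    """Return the best guess the caller has not already rejected, or None."""
    ranked = sorted(n_best, key=lambda guess: guess[1], reverse=True)
    for text, _confidence in ranked:
        if text not in skip_list:
            return text
    return None


skip_list = set()
n_best = [("Boston", 0.62), ("Austin", 0.58), ("Houston", 0.31)]

first = next_candidate(n_best, skip_list)    # best guess overall
skip_list.add(first)                         # caller answered "no" to it
second = next_candidate(n_best, skip_list)   # next-best guess, never the rejected one
print(first, second)
```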
Bail Out If Too Many Errors Don’t make your customer become a “0” (zero) jammer Transfer to a live person if they error out more than twice Remember, some people have speech impediments, or patterns that may not correlate well into recognition confidence. Find the threshold!  (This takes testing)
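The bail-out rule amounts to a per-call error counter. In this sketch the `ErrorTracker` class is hypothetical, and the limit of two comes from the slide; the right value for a real application takes testing.

```python
MAX_ERRORS = 2  # from the slide: transfer once the caller errors out more than twice


class ErrorTracker:
    """Counts recognition failures for one call."""

    def __init__(self, max_errors=MAX_ERRORS):
        self.max_errors = max_errors
        self.errors = 0

    def record_error(self):
        """Count one failure and report whether the call should go to a live person."""
        self.errors += 1
        return self.errors > self.max_errors


tracker = ErrorTracker()
results = [tracker.record_error() for _ in range(3)]
print(results)
```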
Keep TTS Output to a Minimum Does not sound professional Hire a voice talent. The payoff will justify the upfront cost Can use as a fall back for data or prompts that need to be dynamic
Be Aware of Human Memory Make lists short No more than 5 items Present large lists in chunks Make the prompts short
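Presenting a large list in chunks of at most five items could be done with a small helper like the one below. The `chunk` function and the session names are made up for illustration.

```python
def chunk(items, size=5):
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


sessions = [f"session {n}" for n in range(1, 13)]
for group in chunk(sessions):
    # Each group would be read as one short prompt, followed by "say 'more' to hear more".
    print(", ".join(group))
```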
Platinum Rule Treat users as they want to be treated, not how you want to be treated Step into their shoes Use vocabulary they understand
Let The Caller Drive Provide instant gratification (lets the caller get in a zone, and they enjoy the experience due to small successes) Only ask for what you need, not everything at once.
VUI Design is a Science Design before development Wizard of Oz Testing Find balance between business requirements and the caller experience Run usability trials on test subjects to validate your design Use a pilot to trial the application.  If caller behavior is not as expected, make adjustments.
Demos
 
Additional Information http://www.microsoft.com/speech http://www.microsoft.com/uc http://www.gotspeech.net http://www.nuance.com https://www.intervoice.com/ http://www.tellme.com/ http://www.vuidesign.org/
Further Resources My Blog http://www.okcodemonkey.com LinkedIn http://www.linkedin.com/in/okcodemonkey Bartlesville .NET User Group http://www.bdnug.com Twitter http://twitter.com/okcodemonkey Email [email_address]
Key Terms
Voice Dialogue
 
Voice Browser “Web browser” that presents an IVR VUI to the user Provides interface to the PSTN or a PBX Works with Voice Dialogues (where web browsers work with HTML/XHTML) Presents information aurally via: Text-To-Speech Prerecorded prompts Obtains information through: Speech Recognition DTMF detection
Speech Recognition Converts spoken words to machine readable input
DTMF (Dual-tone Multi-Frequency) Used for telephone signaling over the line in the voice-frequency band to the call switching center. Standardized by ITU-T Recommendation Q.23
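Each DTMF key is signaled by one low-group and one high-group frequency played together. The sketch below builds the standard 4x4 key-to-frequency table in Python; the `tone_sample` helper and its amplitude parameter are illustrative additions, not part of any telephony API.

```python
import math

LOW_HZ = (697, 770, 852, 941)        # row (low-group) frequencies
HIGH_HZ = (1209, 1336, 1477, 1633)   # column (high-group) frequencies
KEYPAD = ("123A", "456B", "789C", "*0#D")

# Map each key to its (low, high) frequency pair.
DTMF = {
    key: (LOW_HZ[row], HIGH_HZ[col])
    for row, keys in enumerate(KEYPAD)
    for col, key in enumerate(keys)
}


def tone_sample(key, t, amplitude=0.5):
    """Value of the two summed sine waves for `key` at time t, in seconds."""
    low, high = DTMF[key]
    return amplitude * (math.sin(2 * math.pi * low * t) + math.sin(2 * math.pi * high * t))


print(DTMF["5"])
```

A DTMF detector works in the other direction: it looks for energy at exactly one row frequency and one column frequency and reports the key at their intersection.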
Text-To-Speech (Speech Synthesis) Artificial production of human speech Computer used is called the speech synthesizer Can be implemented in software or hardware Converts normal language text into speech
PSTN (Public Switched Telephone Network) Network of the world’s public circuit switched telephone networks Similar to the way the Internet is the network of the world’s public IP-based packet-switched networks. Originally a network of fixed-line analog telephone systems Now almost completely digital and includes mobile phones Governed by technical standards created by the ITU-T, and uses E.163/E.164 addresses (telephone numbers)
ITU-T (International Telecommunication Union Standardization Sector) Coordinates standards for telecommunications on behalf of the International Telecommunication Union Based in Geneva, Switzerland Original work dates back to 1865, with the birth of the International Telegraph Union Became a United Nations specialized agency in 1947
ITU (International Telecommunication Union) Established to standardize and regulate international radio and telecommunications. Founded as the International Telegraph Union on May 17, 1865 in Paris Main tasks include standardization, allocation of the radio spectrum, and organizing interconnection agreements between countries
PBX (Private Branch Exchange) Is a telephone exchange that serves a particular business or office, as opposed to one that a common carrier or telephone company operates for many businesses


Tulsa Techfest 2008 - Creating A Voice User Interface With Speech Server
