Speech Devices SDK
Vision: Microsoft Speech Services
• Speech To Text: Convert speech to text and back again, and understand its intent
• Custom Speech to Text: Fine-tune speech recognition for anyone, anywhere
• TTS/Custom Voice: Speech recognition, analytics, and transcription
• Speaker Recognition: Give your app the ability to know who's talking
• Translator: Speech translation
The Speech Devices SDK is pre-packaged software, fine-tuned to specific hardware (dev kits), that makes it easy to integrate with the full range of cloud-based Microsoft Speech services, creating rich user experiences for customers.

The Speech Devices SDK allows you to choose your device’s custom “wake word”, the cue that initiates a user interaction. Working with Microsoft Speech services and other APIs, the SDK enables tailored voice AI experiences.
Speech Services
Speech Devices SDK: http://ddk.roobo.com
• Circular Dev kit
• Linear Dev kit
• Far-field
• Custom Wake Word
Software and Services:
• Speech Devices SDK (from Microsoft)
• Premium audio processing solution
• Wake Word recognition
• Communicate with the Microsoft Speech services
• Wake Word customization (from Microsoft)
• (Microsoft Speech services and other Azure services for additional cost)
• Device logic and tools (from the dev kit manufacturer)
Dev Kit (from the 3rd party provider):
• CPU: Qualcomm AP08009, 4-core A7, 1.1 GHz
• Memory: LPDDR3+eMMC, 1GB + 8GB
• Mic Array: I2S Mic x 6+1 or 4
• Network: 802.11 b/g/n
• Charging: DC jack 2.5mm 12V 1.5A
Documentation and Support:
• Hardware tech docs (from the dev kit provider)
• Device hardware specs (from the dev kit provider)
• Sample app and sample code (from Microsoft)
• Documentation (from Microsoft http://aka.ms/sdsdk-info)
• Online Support
[Architecture diagram. Components: Speech Services SDK, Azure Speech Services, Microphone Array Audio Stack, Keyword Spotter, Client API, Multi-Channel Raw Audio Input, Your Application. Flow labels: Speech Audio; Text Transcription and intent.]
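One way to picture the role of the Keyword Spotter in this architecture is as a gate in front of the cloud client. The sketch below is a toy illustration in plain Java, not SDK code: `KeywordGate`, `offer`, and the text "frames" are hypothetical names, and it assumes that audio is forwarded only after the keyword spotter fires, which the diagram suggests but does not state.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a keyword spotter sitting between the audio stack and the cloud client. */
final class KeywordGate {
    private final String keyword;      // hypothetical wake word, e.g. "computer"
    private boolean open = false;      // true once the wake word has been heard
    private final List<String> forwarded = new ArrayList<>();

    KeywordGate(String keyword) {
        this.keyword = keyword;
    }

    /** Offers one processed audio frame (modeled here as a word of text) to the gate. */
    void offer(String frame) {
        if (open) {
            forwarded.add(frame);      // after the wake word, frames go on to the client
        } else if (frame.equalsIgnoreCase(keyword)) {
            open = true;               // the wake word itself is not forwarded
        }
        // Frames heard before the wake word are dropped.
    }

    /** Frames that would have been sent on to the speech service. */
    List<String> forwarded() {
        return new ArrayList<>(forwarded);
    }

    public static void main(String[] args) {
        KeywordGate gate = new KeywordGate("computer");
        for (String frame : new String[] {"background", "computer", "play", "music"}) {
            gate.offer(frame);
        }
        System.out.println(gate.forwarded()); // prints [play, music]
    }
}
```

A real device works on audio buffers rather than words, but the gating logic has the same shape: nothing leaves the device until the wake word is detected.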
Speech Devices SDK
• Evaluate building an ambient device: visit the Azure Speech Service site and learn more at https://aka.ms/sdsdk-info and http://ddk.roobo.com
• Decide to try
• Order the Dev Kit: through a third party’s website
• Wait for the Dev Kit to arrive: optionally try out the Speech Services on a PC while waiting for the hardware
• Receive the Dev Kit: use the sample code and the default keyword spotter (KWS) to test everything end to end (E2E)
• Customize the KWS: through the Custom Speech portal, then deploy the model for the custom keyword
• Run everything E2E: build the sample app or your own application and get everything working E2E
• Complete the evaluation
• Move to the commercialization phase: satisfied with the evaluation, and want to move to production
Contact Dev Kit Provider
• Customization
• Production
• Pricing
• Certification/Testing
• Shipping, etc.
Contact Microsoft
• Pricing/package discussion for the Speech service
• Customization of the service, if applicable
• Pricing discussion for other Azure services, if applicable
Move to Production
• Get a Speech Subscription Key
• Sign up for the SDK
• Download the SDK
https://aka.ms/sdsdk-info
https://aka.ms/sdsdk-signup
https://aka.ms/csspeech/javaref
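// Single-shot recognition: recognize one utterance and read the final text.
// `factory` and `setOnTaskCompletedListener` are defined elsewhere in the application.
final SpeechRecognizer reco = factory.createSpeechRecognizer();
final Task<SpeechRecognitionResult> task = reco.recognizeAsync();
setOnTaskCompletedListener(task, result -> {
    final String s = result.getRecognizedText();
});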
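The slides do not show `setOnTaskCompletedListener`. A minimal sketch of what such a helper could look like follows, assuming the SDK's `Task<T>` behaves like a standard `java.util.concurrent.Future<T>`. `TaskListeners`, `OnTaskCompleted`, and the daemon-thread executor are illustrative names and choices, not part of the SDK.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical stand-in for a setOnTaskCompletedListener helper. */
final class TaskListeners {
    /** Callback invoked with the task's result once it is available. */
    interface OnTaskCompleted<T> {
        void onCompleted(T result);
    }

    // Daemon threads so that a waiting worker never keeps the JVM alive.
    private static final ExecutorService EXECUTOR = Executors.newCachedThreadPool(runnable -> {
        Thread thread = new Thread(runnable);
        thread.setDaemon(true);
        return thread;
    });

    /** Waits for the task on a background thread, then hands the result to the listener. */
    static <T> Future<?> setOnTaskCompletedListener(Future<T> task, OnTaskCompleted<T> listener) {
        return EXECUTOR.submit(() -> {
            T result = task.get();     // blocks the worker thread, not the caller
            listener.onCompleted(result);
            return null;
        });
    }

    public static void main(String[] args) throws Exception {
        // An already-completed future stands in for reco.recognizeAsync().
        Future<String> task = CompletableFuture.completedFuture("turn on the lights");
        setOnTaskCompletedListener(task, text -> System.out.println("Recognized: " + text)).get();
    }
}
```

The point of the pattern is that the caller's thread (on Android, typically the UI thread) never blocks on `get()`; the wait happens on a worker thread and the listener runs there once the result arrives.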
// Single-shot recognition with intermediate (partial) results shown while the user speaks.
final SpeechRecognizer reco = factory.createSpeechRecognizer();
reco.IntermediateResultReceived.addEventListener((o, speechRecognitionResultEventArgs) -> {
    final String s = speechRecognitionResultEventArgs.getResult().getRecognizedText();
    Log.i(logTag, "Intermediate result received: " + s);
    setRecognizedText(s);
});
final Task<SpeechRecognitionResult> task = reco.recognizeAsync();
setOnTaskCompletedListener(task, result -> {
    final String s = result.getRecognizedText();
});
// Continuous recognition: results arrive through events until recognition is stopped.
final SpeechRecognizer reco = factory.createSpeechRecognizer();
reco.IntermediateResultReceived.addEventListener((o, speechRecognitionResultEventArgs) -> {
    // Partial hypothesis for the utterance in progress
    final String s = speechRecognitionResultEventArgs.getResult().getRecognizedText();
});
reco.FinalResultReceived.addEventListener((o, speechRecognitionResultEventArgs) -> {
    // Final text for a completed utterance
    final String s = speechRecognitionResultEventArgs.getResult().getRecognizedText();
});
final Task<?> task = reco.startContinuousRecognitionAsync();
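The `addEventListener` calls in these snippets follow a plain observer pattern. The sketch below imitates that pattern in self-contained Java so the ordering of intermediate and final results can be seen without a device. `EventSignal` and `FakeRecognizer` are hypothetical classes written for this example; they only mimic the shape of the SDK calls shown on the slides, and the word-by-word partial results are an assumption about how hypotheses grow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

/** Minimal observer list, imitating reco.X.addEventListener((sender, args) -> ...). */
final class EventSignal<T> {
    private final List<BiConsumer<Object, T>> listeners = new ArrayList<>();

    void addEventListener(BiConsumer<Object, T> listener) {
        listeners.add(listener);
    }

    void fire(Object sender, T args) {
        for (BiConsumer<Object, T> listener : listeners) {
            listener.accept(sender, args);
        }
    }
}

/** Hypothetical recognizer that replays a fixed utterance word by word. */
final class FakeRecognizer {
    final EventSignal<String> IntermediateResultReceived = new EventSignal<>();
    final EventSignal<String> FinalResultReceived = new EventSignal<>();

    /** Fires a growing partial hypothesis per word, then the full text as the final result. */
    void recognize(String utterance) {
        String[] words = utterance.split(" ");
        StringBuilder partial = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            if (i > 0) {
                partial.append(' ');
            }
            partial.append(words[i]);
            if (i < words.length - 1) {
                IntermediateResultReceived.fire(this, partial.toString());
            }
        }
        FinalResultReceived.fire(this, partial.toString());
    }

    public static void main(String[] args) {
        FakeRecognizer reco = new FakeRecognizer();
        reco.IntermediateResultReceived.addEventListener((o, s) -> System.out.println("partial: " + s));
        reco.FinalResultReceived.addEventListener((o, s) -> System.out.println("final: " + s));
        reco.recognize("play some music");
    }
}
```

In an application, the intermediate handler would typically refresh on-screen text, while only the final handler would trigger an action.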
// Keyword-triggered recognition: listen for the wake word, then recognize speech.
// KeywordModel (model file path) and Keyword (wake word text) are defined elsewhere in the application.
reco = factory.createSpeechRecognizer();
reco.SessionEvent.addEventListener((o, sessionEventArgs) -> {
    if (sessionEventArgs.getEventType() == SessionEventType.SessionStartedEvent) {
        // Custom handling when a session starts
    }
});
reco.IntermediateResultReceived.addEventListener((o, intermediateResultEventArgs) -> {
    final String s = intermediateResultEventArgs.getResult().getRecognizedText();
});
reco.FinalResultReceived.addEventListener((o, finalResultEventArgs) -> {
    String s = finalResultEventArgs.getResult().getRecognizedText();
});
final Task<?> task = reco.startKeywordRecognitionAsync(KeywordRecognitionModel.fromFile(KeywordModel));
setOnTaskCompletedListener(task, result -> {
    // Keyword listening has started: prompt the user to say the wake word.
    content.set(0, "say `" + Keyword + "`...");
    setRecognizedText(TextUtils.join(delimiter, content));
    continuousListeningStarted = true;
});
// Intent recognition: map intent IDs to the intent names defined in the LUIS model.
final HashMap<String, String> intentIdMap = new HashMap<>();
intentIdMap.put("1", "play music");
intentIdMap.put("2", "stop");
final IntentRecognizer reco = factory.createIntentRecognizer();
LanguageUnderstandingModel intentModel = LanguageUnderstandingModel.fromSubscription(LuisRegion, LuisSubscriptionKey, LuisAppId);
// Register each intent name under its ID.
for (Map.Entry<String, String> entry : intentIdMap.entrySet()) {
    reco.addIntent(entry.getKey(), intentModel, entry.getValue());
}
reco.IntermediateResultReceived.addEventListener((o, intentRecognitionResultEventArgs) -> {
    final String s = intentRecognitionResultEventArgs.getResult().getRecognizedText();
});
final Task<IntentRecognitionResult> task = reco.recognizeAsync();
setOnTaskCompletedListener(task, result -> {
    String s = result.getRecognizedText();
    String intentId = result.getIntentId();
    // Look up the intent name; stays empty if the ID is not registered.
    String intent = "";
    if (intentIdMap.containsKey(intentId)) {
        intent = intentIdMap.get(intentId);
    }
});
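The lookup at the end of the intent sample can be exercised on its own. The sketch below isolates it in plain Java; `IntentNames` and `resolve` are illustrative names that do not come from the SDK, and the empty-string fallback mirrors the sample's `String intent = ""` default.

```java
import java.util.HashMap;
import java.util.Map;

/** Maps intent IDs reported by a recognizer to the intent names registered for them. */
final class IntentNames {
    private final Map<String, String> intentIdMap = new HashMap<>();

    IntentNames() {
        // Same IDs and names as in the sample.
        intentIdMap.put("1", "play music");
        intentIdMap.put("2", "stop");
    }

    /** Returns the intent name for an ID, or "" when the ID is unknown or null. */
    String resolve(String intentId) {
        if (intentId == null) {
            return "";
        }
        return intentIdMap.getOrDefault(intentId, "");
    }

    public static void main(String[] args) {
        IntentNames names = new IntentNames();
        System.out.println(names.resolve("1"));           // prints play music
        System.out.println(names.resolve("7").isEmpty()); // prints true
    }
}
```

Returning an empty string for an unrecognized ID keeps the calling code free of null checks, at the cost of not distinguishing "no intent" from "unknown intent"; an application that needs that distinction might return an `Optional<String>` instead.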
