PortDial: Language Resources for
Portable Spoken Dialogue Systems

      Aris Karanikas, CCO, VoiceWeb
           European Data Forum
              June 6-7, 2012
          Copenhagen, Denmark
Spoken Dialogue Systems
DATA

•   Speech recordings (used for training acoustic models)
•   Text data (used for training language models)
•   Ontologies (used to define application domain)
•   Grammars (used for recognition)
    etc.
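
As a rough illustration of how the last two resource types relate, the sketch below pairs a toy ontology with a single grammar rule. Everything in it is an assumption made for illustration: the flight-booking domain, the names ONTOLOGY, RULE, compile_rule and parse, and the use of a regular expression in place of a real recognition grammar. It does not show PortDial's actual resource formats.

```python
import re

# Hypothetical toy ontology: domain concept -> surface forms (assumed, not from the slides).
ONTOLOGY = {
    "city": ["athens", "copenhagen", "berlin"],
    "day": ["monday", "tuesday", "friday"],
}

# Hypothetical grammar rule: a template whose <slots> name ontology concepts.
RULE = "i want to fly to <city> on <day>"


def compile_rule(rule, ontology):
    """Compile a slot template into a regex with one named group per concept."""
    # re.split with a capturing group alternates literal text and slot names.
    parts = re.split(r"<(\w+)>", rule)
    pieces = []
    for index, part in enumerate(parts):
        if index % 2 == 0:
            # Even positions hold literal text from the template.
            pieces.append(re.escape(part))
        else:
            # Odd positions hold a concept name; expand it to its surface forms.
            alternatives = "|".join(re.escape(word) for word in ontology[part])
            pieces.append(f"(?P<{part}>{alternatives})")
    return re.compile("".join(pieces))


def parse(utterance, pattern):
    """Return the slot values found in the utterance, or None if it does not match."""
    match = pattern.fullmatch(utterance.lower().strip())
    return match.groupdict() if match else None


PATTERN = compile_rule(RULE, ONTOLOGY)

print(parse("I want to fly to Athens on Friday", PATTERN))
print(parse("Book a hotel", PATTERN))
```

In this toy setup, porting to a new language would mean replacing the surface forms and the rule template, while porting to a new domain would mean replacing the concepts themselves.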
Challenge

A major roadblock in spoken dialogue system
(SDS) design is the lack of linguistic resources
that would enable the rapid porting of speech
services to new domains and languages
PortDial Objectives
• Devise machine-aided methods for creating,
  cleaning up and publishing multilingual domain
  ontologies and grammars for SDS prototyping
• Create a platform that supports cost-effective
  language resource building for the domain and
  language porting scenarios
• Create and support a sustainable pool of users that
  contribute to a linguistic resources data exchange
Main Innovations
Technological Innovation:
    Combining knowledge-based and data-driven
    approaches for ontology and grammar induction
    from web-harvested data
Market Innovation:
  – Speech services prototyping/porting platform
    reduces time-to-market and barrier-to-entry
  – Spoken dialogue resources/data as a service
Partners
• Expertise: language engineering, spoken
  dialogue systems, semantic web, speech
  services, semantic networks, linked-data
PortDial Scenarios

• Porting to a new application domain
   – Focus on adaptation
• Porting to a new language
   – Focus on translation
• Resource-rich scenario
   – Focus on reusability/adaptation of existing resources
   – Use targeted web-mined data to enrich resources
• Resource-poor scenario
   – Focus on data-driven bottom-up creation of
     resources using the web
Main Concept
Main Outputs
• A commercial platform for rapid prototyping of
  speech service resources for new domains and
  languages
• A collection of multilingual speech service resources
  (ontologies, grammars) for entertainment, banking
  and customer service domains
• Languages covered: English, German, Italian,
  Spanish, Greek, Turkish, Hebrew
Target Groups
• SMEs worldwide in the mobile application
  development industry lacking the expertise/resources
  to develop speech services in-house
• Non-commercial actors including the research
  community that can maintain and enrich the
  free version of the data pool
Impact

The SDS linguistic resources will lower the
barrier to entry to speech services for European
SMEs, allowing for inexpensive proof-of-concept
demonstrator development and opening up new
markets and application domains.

