Neural-based
Context Representation Learning for
Dialog Act Classification
Daniel Ortega, Ngoc Thang Vu
(Institute for Natural Language Processing (IMS) University of
Stuttgart)
Proceedings of the SIGDIAL 2017 Conference
Slides by Park JeeHyun
25 OCT 2018
Introduction
• The study of spoken dialogs between two or more speakers
can be approached by analyzing the dialog acts (DAs).
• Dialog Acts
• the intention of the speaker at every utterance during a
conversation.
• e.g.
Introduction. Automatic_DA_classification
• Automatic DA classification
• This classification task has been approached using traditional
statistical methods such as
• hidden Markov models (HMMs) (Stolcke et al., 2000),
• conditional random fields (CRF) (Zimmermann, 2009) and
• support vector machines (SVMs) (Henderson et al., 2012).
• Recently, deep learning (DL) techniques have brought state-of-the-art
models in DA classification, such as
• convolutional neural networks (CNNs) (Kalchbrenner and Blunsom, 2013;
Lee and Dernoncourt, 2016),
• recurrent neural networks (RNNs) (Lee and Dernoncourt, 2016; Ji et al., 2016)
and
• long short-term memory (LSTM) models (Shen and Lee, 2016).
Introduction. Automatic_DA_with_context_info
• In many cases, utterances are too short to be classified
on their own;
• for example, the utterance ‘Right’ can be
either an Agreement or a Backchannel,
and in such cases the context plays a key role in disambiguation.
• Therefore, using context information from the previous utterances
in a dialog flow is a crucial step for improving DA classification.
Introduction. Attention_Mechanisms
• Attention mechanisms (AMs) introduced by Bahdanau et al.
(2014) have contributed to significant improvements in
many natural language processing tasks.
Model
• CNN-based Dialog Utterance Representation
• Internal Attention Mechanism
• Neural-based Context Modeling
Model. CNN-based_Dialog_Utterance_Representation
• A CNN is used for the representation of each utterance.
• CNNs perform a discrete convolution on an input matrix with
a set of different filters.
• For the DA classification task, the input matrix represents
a dialog utterance and its context, i.e., the n previous
utterances:
• each column of the matrix stores the word embedding of the
corresponding word.
• 2D filters f (with width |f|) spanning all embedding
dimensions d.
Model. CNN-based_Dialog_Utterance_Representation
• 2D filters f (with width |f|) spanning all embedding
dimensions d.
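The convolution described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: filter values would normally be learned, and the filter widths and embedding dimension here are arbitrary toy choices. Each 2D filter spans all d embedding dimensions and slides along the word positions, followed by max-over-time pooling.

```python
import numpy as np

def conv_utterance(E, filters):
    """Convolve 2D filters over a d x T embedding matrix E.

    E:       d x T matrix, one word embedding per column.
    filters: list of d x |f| filter matrices spanning all d dims.
    Returns one max-pooled feature per filter.
    """
    d, T = E.shape
    features = []
    for F in filters:
        w = F.shape[1]  # filter width |f|
        # valid convolution along the word-position axis
        scores = [np.sum(E[:, t:t + w] * F) for t in range(T - w + 1)]
        # max-over-time pooling -> a single scalar per filter
        features.append(max(scores))
    return np.array(features)

# toy example: 4-dim embeddings, 5 words, two filters of widths 2 and 3
rng = np.random.default_rng(0)
E = rng.standard_normal((4, 5))
filters = [rng.standard_normal((4, 2)), rng.standard_normal((4, 3))]
u = conv_utterance(E, filters)  # utterance vector, one value per filter
```

With k filters, the utterance representation is a k-dimensional vector, independent of utterance length thanks to the pooling step.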
Model. Internal_Attention_Mechanism
• Attention mechanisms can be applied in different
sequences of input vectors, e.g. representations of
consecutive dialog utterances.
(Figure: attention equations, with separate formulations for input attention
and output attention; the attention strength is computed over the input
vector at time step t−i.)
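The attention step on this slide can be sketched as a softmax-normalized weighted sum. This is a hedged illustration of the general mechanism, not the paper's exact scoring function: the scoring vector `v` stands in for whatever learned parameters produce the attention strengths.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def attend(X, v):
    """Attention over a sequence of input vectors X (n x d).

    alpha_i plays the role of the attention strength for the
    input vector at step i; the output is the weighted context vector.
    v is a hypothetical learned scoring vector.
    """
    scores = X @ v            # one score per input vector
    alpha = softmax(scores)   # attention strengths, sum to 1
    return alpha, alpha @ X   # context = sum_i alpha_i * x_i

rng = np.random.default_rng(1)
X = rng.standard_normal((3, 4))  # e.g. 3 consecutive utterance vectors
v = rng.standard_normal(4)
alpha, c = attend(X, v)
```

Applied at the input, the weights rescale the vectors before further processing; applied at the output, they pool the processed states into a single context vector.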
Model. Neural-based_Context_Modeling
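The context-modeling step combines the per-utterance vectors with an RNN (per the conclusions, RNN architectures are combined with attention at different context levels). A minimal vanilla-RNN sketch, with hypothetical weights Wx, Wh and bias b, might look like:

```python
import numpy as np

def rnn_context(U, Wx, Wh, b):
    """Run a vanilla RNN over a sequence of utterance vectors U (n x d).

    The final hidden state summarizes the dialog context.
    Wx (h x d), Wh (h x h), and b (h,) are hypothetical learned weights.
    """
    h = np.zeros(Wh.shape[0])
    for u in U:  # oldest utterance first
        h = np.tanh(Wx @ u + Wh @ h + b)
    return h  # context representation

rng = np.random.default_rng(2)
U = rng.standard_normal((4, 6))         # 4 utterance vectors of dim 6
Wx = rng.standard_normal((5, 6)) * 0.1  # small init for a stable toy run
Wh = rng.standard_normal((5, 5)) * 0.1
b = np.zeros(5)
ctx = rnn_context(U, Wx, Wh, b)
```

The resulting context vector can then be concatenated with the current utterance's representation before the final classification layer.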
Experimental Setup
• Data
• MRDA:
ICSI Meeting Recorder Dialog Act Corpus (Janin et al., 2003; Shriberg et al.,
2004; Dhillon et al., 2004), a dialog corpus of multiparty meetings. The
5-tag-set used in this work was introduced by Ang et al. (2005).
• SwDA:
Switchboard Dialog Act Corpus (Godfrey et al., 1992; Jurafsky et al., 1997), a
dialog corpus of 2-speaker conversations.
In both datasets the classes are highly imbalanced:
the majority class covers 59.1% of MRDA
and 33.7% of SwDA.
Experimental Setup
• Hyperparameters and Training
Experimental Results
• Baseline Models
• a one-layer CNN for sentence classification based on Kim (2014)
• Baseline I: The input is a single utterance at a time, without any contextual
information
• Baseline II: The input is the concatenation of the current utterance and
previous utterances.
Q) Which model did they use as the baseline?
Static or non-static?
Or multi-channel?
Experimental Results. Results
Q) How about using attention on both ends?
(RNN-Input-Output-Attention???)
Experimental Results. Impact_of_Context_Length
Comparison with Other Works
Lee and Dernoncourt (CNN-FF & LSTM-FF) is
the most recent work in DA classification
that published train/validation splits and
claimed state-of-the-art results on that setup.
Therefore, an accurate comparison of our results
can only be made with this work.
Conclusions
• We explored different neural-based context representation learning
methods for dialog act classification which combine RNN
architectures with attention mechanisms at different context levels.
• Our results on two benchmark datasets reveal that using an
RNN architecture is important for learning the context representation.
• Moreover, attention mechanisms contribute to the overall
improvements;
however, where the AM should be applied depends on
the nature of the dataset.