Dstc6 an introduction

DSTC6 – Dialogue System Technology Challenges
An Introduction
Sogang University Natural Language Processing Lab
허광호
2017.07.12

DSTC6 Tracks
• Track 1 – End-to-End Goal Oriented Dialog Learning
• Y-Lan Boureau et al. (Facebook AI Research)
• Track 2 – End-to-End Conversation Modeling
• Chiori Hori et al. (Mitsubishi Electric Research Laboratories)
• Track 3 – Dialog Breakdown Detection
• Ryuichiro Higashinaka et al. (NTT)

Track 3 – Dialog Breakdown Detection
NB: Not a breakdown
PB: Possible breakdown
B: Breakdown

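• Why it is needed
• Voice agent services are being released commercially, but
• they still cannot converse as naturally as two humans.
• The biggest problem is that voice agents sometimes generate inappropriate utterances that cause a dialogue breakdown.
• Uses
• Breakdown detection is useful where keeping the conversation going matters, as in chat-oriented dialogue.
• It can also be used for error recovery in dialogue systems.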

• Dataset
• 100 chat-oriented dialogues (21 utterances per dialogue) – 24 annotators.
• 1000 chat-oriented dialogues – 2 to 3 annotators.
• 300 chat-oriented dialogues – 30 annotators.
• Unfortunately, the data above are in Japanese.
• In addition, 100 English dialogues will reportedly be collected and distributed.
• Evaluation method
• Classification-Related metrics – Accuracy, Precision, Recall, F-measure
• Distribution-related metrics – JS Divergence and Mean squared error
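The two metric families can be illustrated with a small sketch. It assumes that the gold label distribution for a system utterance is simply the fraction of annotators who chose each of NB, PB, and B, and that JS divergence uses base-2 logarithms; both are assumptions for illustration, not details confirmed by the slides. All function names are hypothetical.

```python
import math
from collections import Counter

LABELS = ("NB", "PB", "B")  # label order used for every distribution below


def label_distribution(votes):
    """Turn a list of annotator labels into a probability vector over LABELS."""
    counts = Counter(votes)
    total = len(votes)
    return [counts[label] / total for label in LABELS]


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two distributions (0 = identical)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def mean_squared_error(p, q):
    """Mean of the squared per-label differences."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q)) / len(p)


def precision_recall_f(gold, predicted, target="B"):
    """Precision, recall, and F-measure for one target label."""
    tp = sum(1 for g, p in zip(gold, predicted) if g == target and p == target)
    n_pred = sum(1 for p in predicted if p == target)
    n_gold = sum(1 for g in gold if g == target)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    # 30 hypothetical annotators for one system utterance.
    gold = label_distribution(["NB"] * 15 + ["PB"] * 6 + ["B"] * 9)
    predicted = [0.6, 0.1, 0.3]  # hypothetical system output
    print(js_divergence(gold, predicted), mean_squared_error(gold, predicted))
```

Distribution-related metrics reward a system for matching how strongly annotators disagreed, while classification-related metrics only look at a single predicted label per utterance.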

• LREC 2016 Breakdown Detection (in Japanese) results
• Baseline: CRF-based method
• Team1: LSTM-RNN-based method
• Features: Word2Vec + co-occurrence freq. vector + Sent2Vec vector
• Team2: LSTM-RNN-based method (Word2Vec)
• Team3: Rule-based method (keywords extracted from system utterances)
• Team4: SVM-based method (Word frequency vector)
• Team5: DNN-based method
• Features: dialogue act of the system and previous user utterance.
• Team6: LSTM-RNN-based method
• Features: Word vectors encoded using an NCM (Neural Conversation Model), an LSTM, bag-of-words embedding, and an extended NCM.

Classification-Related Metrics
Chart legend (chart not reproduced): Baseline (CRF), Team1 (LSTM-RNN-based), Team3 (Rule-based), Team4 (SVM-based), Team5 (DNN-based), Team6 (LSTM-RNN-based)
Source: The Dialogue Breakdown Detection Challenge - Task Description, Datasets, and Evaluation Metrics

Distribution-Related Metrics
Chart legend (chart not reproduced): Baseline (CRF), Team3 (Rule-based), Team4 (SVM-based), Team5 (DNN-based), Team6 (LSTM-RNN-based)
Source: The Dialogue Breakdown Detection Challenge - Task Description, Datasets, and Evaluation Metrics

Track 1 – End-to-End Goal Oriented Dialog Learning
• Goal-oriented Dialog Learning
• Goal-oriented dialogue requires skills beyond language modeling.
• Asking questions to clearly define a user request.
• Querying Knowledge Bases (KBs).
• Interpreting results from queries to display options to users or completing a transaction.
• Dialogue domain
• Restaurant reservation system
• Uses the Facebook AI Research open resource as the corpus (Bordes et al. 2017)

• Task composition
• The capabilities a goal-oriented dialogue system needs are split into sub-tasks, and each is evaluated separately.
• Task 1: Issuing API calls
• Task 2: Updating API calls
• Task 3: Displaying options
• Task 4: Providing extra information
• Task 5: Conducting full dialogs
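As a rough illustration of Tasks 1 and 2, the sketch below tracks the slots of a restaurant request, issues an API call once every slot is filled, and re-issues it after the user changes a value. The slot names, the `api_call <cuisine> <location> <party_size> <price>` string format, and the `ReservationState` class are assumptions made for this example, not a specification taken from the challenge.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical slot inventory for a restaurant reservation request.
SLOTS = ("cuisine", "location", "party_size", "price")


@dataclass
class ReservationState:
    """Minimal dialog state: which slots the user has filled so far."""

    slots: Dict[str, str] = field(default_factory=dict)

    def update(self, **values: str) -> None:
        """Fill or overwrite slots (overwriting models Task 2, updating)."""
        for name, value in values.items():
            if name not in SLOTS:
                raise KeyError(f"unknown slot: {name}")
            self.slots[name] = value

    def missing(self) -> List[str]:
        """Slots the system still has to ask the user about."""
        return [name for name in SLOTS if name not in self.slots]

    def api_call(self) -> Optional[str]:
        """Return the API call string, or None while slots are missing."""
        if self.missing():
            return None
        return "api_call " + " ".join(self.slots[name] for name in SLOTS)


if __name__ == "__main__":
    state = ReservationState()
    state.update(cuisine="italian", location="paris")
    print(state.missing())   # the system should ask about these next
    state.update(party_size="two", price="cheap")
    print(state.api_call())  # Task 1: issue the call
    state.update(cuisine="french")
    print(state.api_call())  # Task 2: updated call
```

In the end-to-end setting the model is expected to learn this behaviour from dialogs alone, so an explicit state object like this one is only a way to make the target behaviour concrete.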

• Dataset composition
• 10,000 examples per task
• Correct utterance + candidate utterances
• Evaluation
• Not done by language generation;
• instead, the system ranks the candidate utterances.
• This is called Next-Utterance Classification
• (Lowe et al. 2016)
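A minimal sketch of candidate ranking is shown here: every candidate is scored against the dialog context, the candidates are sorted by score, and the example counts as correct if the true utterance appears in the top k. The word-overlap scorer is a deliberately naive stand-in for a trained model, and all names are hypothetical.

```python
from typing import Callable, List, Sequence


def overlap_score(context: str, candidate: str) -> float:
    """Toy scorer: number of candidate words that also occur in the context."""
    context_words = set(context.lower().split())
    return float(sum(1 for w in candidate.lower().split() if w in context_words))


def rank_candidates(
    context: str,
    candidates: Sequence[str],
    score: Callable[[str, str], float],
) -> List[str]:
    """Sort candidates from best to worst; ties keep their original order."""
    return sorted(candidates, key=lambda c: score(context, c), reverse=True)


def recall_at_k(ranked: Sequence[str], correct: str, k: int) -> float:
    """1.0 if the correct utterance is among the top-k candidates, else 0.0."""
    return 1.0 if correct in ranked[:k] else 0.0


if __name__ == "__main__":
    context = "i would like an italian restaurant in paris"
    candidates = ["what time is it", "api_call italian paris", "hello"]
    ranked = rank_candidates(context, candidates, overlap_score)
    print(ranked[0], recall_at_k(ranked, "api_call italian paris", k=1))
```

Averaging `recall_at_k` over all test examples gives a single score per task; with k = 1 it reduces to plain accuracy of the top-ranked candidate.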

Results published in ICLR 2017
Source: Learning End-to-End Goal-Oriented Dialog (Bordes et al. 2017)

Track 2 – End-to-End Conversation Modeling
• The system must generate a response to the user's input given the dialogue history, and it may use external knowledge from the web.

• Dataset
• Training Data (OpenSubtitles/Twitter ≈ 1M dialogs, 2.2M utterances)
• Test Data: 500 – 1000 dialogs

• Baseline System
• An LSTM-based seq2seq generation system and a pre-trained model will be provided.
• Evaluation
• Objective measure: Perplexity, BLEU, etc.
• Subjective measure: Human rating via crowdsourcing.
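The objective measures can be sketched as follows. Perplexity is computed from per-token natural-log probabilities that a model assigns to the reference response, and the clipped n-gram precision is the building block of BLEU. This is not the full BLEU score (no brevity penalty, no averaging over n-gram orders, a single reference), and the function names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """exp of the negative mean log-probability (natural log) per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def ngram_counts(tokens: List[str], n: int) -> Counter:
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(hypothesis: List[str], reference: List[str], n: int) -> float:
    """Clipped n-gram precision: each hypothesis n-gram is credited at most
    as many times as it occurs in the reference."""
    hyp = ngram_counts(hypothesis, n)
    ref = ngram_counts(reference, n)
    total = sum(hyp.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref[gram]) for gram, count in hyp.items())
    return clipped / total


if __name__ == "__main__":
    # A model that gives every token probability 0.25 has perplexity 4.
    print(perplexity([math.log(0.25)] * 4))
    print(modified_precision("i am fine".split(), "i am fine thanks".split(), 2))
```

Lower perplexity means the model finds the reference responses less surprising, while higher n-gram precision means the generated response shares more wording with the reference.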

DSTC6 Tracks Conclusion
• Track 1 – End-to-End Goal Oriented Dialog Learning
• Next-Utterance Classification Task
• Track 2 – End-to-End Conversation Modeling
• Language Generation Task
• Track 3 – Dialog Breakdown Detection
• Label Classification Task
