Challenging Reading Comprehension on Daily Conversation: Passage Completion on Multiparty Dialog

Kaixin Ma, Tomasz Jurczyk, Jinho D. Choi
Department of Mathematics and Computer Science, Emory University
Objective
• Test a machine's ability to comprehend daily conversation.
• Given a dialog and a passage describing the dialog, the machine is asked to
predict the missing word in the passage.
Corpus Creation
• Transcripts from the TV show Friends are used, comprising 10 seasons, 236
episodes and 3,107 scenes in total.
• Each scene is treated as a separate dialog.
• Plot summaries are collected from fan sites.
• A second set of passages is created without using dominant characters.
• In total, 1,681 dialogs are used and 4,646 passages are generated, of which
2,994 come from plot summaries, 615 are different descriptions and 1,037
are ones that do not use major characters.
Sample dialog from our dataset
Passages associated with this dialog
Queries generated from passages
Experimental splits of our dataset
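The step from passages to queries can be illustrated with a minimal sketch. It assumes that a query is formed by masking one character mention in a passage and that the masked name is the answer; the poster only states that the machine predicts the missing word. The marker `@placeholder`, the function name `generate_queries`, and the example passage are hypothetical and are not taken from the dataset.

```python
import re

PLACEHOLDER = "@placeholder"  # hypothetical marker for the masked word


def generate_queries(passage, characters):
    """Return one (query, answer) pair per character mention in the passage.

    Assumption: each mention of a known character name is masked in turn,
    and the masked name becomes the answer for that query.
    """
    queries = []
    for name in characters:
        pattern = r"\b" + re.escape(name) + r"\b"
        for match in re.finditer(pattern, passage):
            query = passage[:match.start()] + PLACEHOLDER + passage[match.end():]
            queries.append((query, name))
    return queries


# Invented passage, used only to show the shape of the output.
passage = "Joey asks Ross for advice, and Ross tells Joey to wait."
for query, answer in generate_queries(passage, ["Joey", "Ross"]):
    print(answer, "->", query)
```

Under this assumption, a passage with several mentions yields several queries, each differing only in which mention is masked.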
Overall structure of the model
Utterance-level attention from similarity matrix
Dialog-level attention to optimize global view
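As a rough illustration of attention derived from a similarity matrix, the sketch below scores every query word against every utterance word, normalizes each row with a softmax, and uses the weights to build an utterance summary per query word. The dot-product similarity, the row-wise softmax, and all function names are assumptions made for illustration; the poster's figure also involves a 1D convolution and a dialog-level attention that are not reproduced here.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def similarity_matrix(query, utterance):
    """S[i][j] is the similarity of query word i and utterance word j."""
    return [[dot(q, u) for u in utterance] for q in query]


def attend(query, utterance):
    """Return attention weights and the attended utterance vectors.

    Each row of weights sums to 1; each attended vector is the weighted
    sum of the utterance word vectors for one query word.
    """
    weights = [softmax(row) for row in similarity_matrix(query, utterance)]
    dim = len(utterance[0])
    attended = [
        [sum(w * u[d] for w, u in zip(row, utterance)) for d in range(dim)]
        for row in weights
    ]
    return weights, attended


# Toy word vectors (2 query words, 3 utterance words), invented for the demo.
query = [[1.0, 0.0], [0.0, 1.0]]
utterance = [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
weights, attended = attend(query, utterance)
```

In this toy input, the first query word puts most of its weight on the first utterance word, because that pair has the largest dot product.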
Results
Results of all of our experiments
• To test the models' performance on longer dialogs, 3 more datasets are created
by adding utterances from consecutive scenes while keeping the same set of queries.
• On all 4 datasets, our models converge in fewer epochs, and as the dialogs get
longer, our models outperform the Bi-LSTM by a larger margin.
Attention matrix in the dialog-level attention
• Keywords in the query such as misses, take, and good time get relatively
high attention. Utterances 14, 15 and 17, which give the answer, also get more
attention, showing the effectiveness of our attention mechanism.
• During error analysis, it is found that the Bi-LSTM is better at capturing exact
word or phrase matches, whereas our models are better at answering
questions that require inference from multiple utterances.
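The construction of the longer-dialog datasets can be sketched as follows. The poster states only that utterances from consecutive scenes are added and that the queries stay the same; appending the scenes that follow, and controlling the length with a count of extra scenes (`extra_scenes`), are assumptions of this sketch, as is the function name.

```python
def extend_dialogs(scenes, extra_scenes):
    """Build one extended dialog per scene.

    scenes: list of scenes in their original order, each a list of utterances.
    extra_scenes: how many following scenes to append (assumed parameter).
    The queries attached to each original scene are left untouched, so they
    are not handled here.
    """
    extended = []
    for i, scene in enumerate(scenes):
        dialog = list(scene)  # copy so the original scene is not modified
        for following in scenes[i + 1 : i + 1 + extra_scenes]:
            dialog.extend(following)
        extended.append(dialog)
    return extended


# Invented utterance identifiers, used only to show the behavior.
scenes = [["a1", "a2"], ["b1"], ["c1", "c2"]]
longer = extend_dialogs(scenes, 1)
```

Calling the function with increasing values of `extra_scenes` would give progressively longer versions of the same dialogs, each still paired with its original queries.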
Conclusion
• We introduced a new corpus consisting of multiparty dialogs and crowdsourced
annotations for the task of passage completion.
• We presented a neural architecture combining CNN, RNN and two levels of
attention. Models trained with this architecture outperform ones trained with
a pure LSTM, especially on longer dialogs.
• To the best of our knowledge, this is the first time the passage completion
task has been thoroughly examined with a challenging dataset on multiparty dialog
using deep learning models.
• For future work, it would be interesting to expand the annotation to other entity
types and add an entity linker to automatically link mentions to
their entities.
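Approaches
[Figure: model architecture diagrams. Labels recoverable from the overall structure: Convolution over each utterance U1 … Uk, utterance vectors u1 … uk, forward and backward LSTMs over the dialog D and over the query Q, final hidden states h, Softmax. Labels recoverable from the attention panels: Q, Ui, Si, U'i, Vi, ⊗; A, D, Q, Q', D', P, pc, pr, aq, ad; 1D Convolution; Sum.]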
