Reading Comprehension Quiz Generation using Generative Pre-trained Transformers

Ramon Dijkstra, Zülküf Genç, Subhradeep Kayal and Jaap Kamps
The 23rd International Conference on Artificial Intelligence in Education (AIED’2022)
Fourth Workshop on Intelligent Textbooks (iTextbooks)
27 July 2022

Agenda
• Background
• Goal
• Demo
• Approach
• Experimental Setup
• Experimental Results
• Analysis
• Revisiting the demo
• Main Takeaways
• Q&A

Background
Quiz Generation
• Question Generation
• Question Answering
• Distractor Generation

Goal
Educational Text → Multiple-choice quiz
Why?
• Enhance intelligent textbooks with assessments
• Students could test themselves during the learning phase
• Teachers could use the tool to generate assessments

Approach
Large pre-trained transformers have shown superior performance on several text
generation tasks.
Generative Pre-trained Transformer 3 (GPT-3) can be fine-tuned for downstream
tasks through the OpenAI API:
• Train on prompt-completion pairs
• Give the model a previously unseen prompt at inference time
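
A minimal sketch of how prompt-completion pairs could be serialized for fine-tuning, assuming a JSON Lines file with `prompt` and `completion` keys. The helper name `to_jsonl` and the example pair are hypothetical and not taken from the presentation.

```python
import json

def to_jsonl(pairs):
    """Serialize (prompt, completion) pairs as JSON Lines, one object per line."""
    lines = []
    for prompt, completion in pairs:
        record = {"prompt": prompt, "completion": completion}
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines)

# Hypothetical training pair: an educational text and the quiz written for it.
pairs = [
    (
        "Edell prepared carefully before hosting the awards show.",
        "Question: What did Edell do before the show?\n"
        "True answer: She prepared carefully.",
    ),
]

jsonl = to_jsonl(pairs)
print(jsonl)
```

Newlines inside a completion are escaped by `json.dumps`, so each training pair stays on a single line of the file.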

Approach
Prompt: Educational Text
Completion: Quiz
End-to-End Quiz Generation Template
Question: . . .
True answer: . . .
False answer: . . .
False answer: . . .
False answer: . . .
We call this fine-tuned model EduQuiz.
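
The template above can be rendered programmatically when building completions. This is an illustrative sketch only: the function name `render_quiz` is hypothetical, and the example item is taken from the generated quizzes shown later in the deck.

```python
def render_quiz(question, true_answer, false_answers):
    """Render one multiple-choice item in the template layout shown above."""
    lines = [f"Question: {question}", f"True answer: {true_answer}"]
    lines += [f"False answer: {distractor}" for distractor in false_answers]
    return "\n".join(lines)

completion = render_quiz(
    "What advice does Edell give us?",
    "Get ready for any opportunity in life.",
    [
        "Try to live a colorful life.",
        "Take any chance that comes up.",
        "Explore your potential talents.",
    ],
)
print(completion)
```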

Experimental Setup – quiz generation techniques
Two quiz generation techniques:
• Step-Wise Quiz Generation (SWQG)
• End-to-End Quiz Generation (EEQG)

Experimental Setup – models
Two models:
• GPT-3
• Macaw-11b
Macaw-11b is a general-purpose model trained on various text generation tasks.

Experimental Setup – dataset
EQG-RACE dataset:
• 18,501 training
• 1,035 validation
• 950 test
A processed version of the RACE dataset in which only examination questions are kept.

Experimental Setup – automatic evaluation
Metrics:
• BLEU-4: measures n-gram overlap (up to 4-grams) between a prediction and the
ground-truth instances
• ROUGE-L: measures the longest common subsequence between the prediction
and the ground-truth instances
• METEOR: similar to BLEU-4, but also takes synonyms, stemming,
and paraphrasing into account
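
To make the first two metrics concrete, here is a simplified sketch, not the evaluation code used in the experiments. It assumes whitespace tokenization, computes only the clipped 4-gram precision (full BLEU-4 also combines 1- to 3-gram precisions and a brevity penalty), and reports ROUGE-L as a plain F1 score. All function names are hypothetical.

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def clipped_precision(pred, ref, n):
    """Fraction of predicted n-grams that also occur in the reference (counts clipped)."""
    pred_counts = Counter(ngrams(pred, n))
    ref_counts = Counter(ngrams(ref, n))
    total = sum(pred_counts.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref_counts[gram]) for gram, count in pred_counts.items())
    return overlap / total

def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def rouge_l_f1(pred, ref):
    """ROUGE-L as the F1 of LCS-based precision and recall (assumed weighting)."""
    lcs = lcs_length(pred, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(pred)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)

ref = "what is edell's strongest character ?".split()
pred = "what advice does edell give us ?".split()

print(clipped_precision(pred, ref, 4))
print(rouge_l_f1(pred, ref))
```

The two questions share no 4-gram, so the 4-gram precision is zero even though both ask about the same person. This illustrates how a reasonable generated question can still receive a BLEU-4 of 0.0.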

Experimental Results
Macaw-11b could be configured to perform EEQG, but in our experiments only 14%
of its generated quizzes contained all the elements of a quiz. We therefore do
not use Macaw-11b for EEQG.
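
A completeness check of this kind could look like the sketch below. The slides do not state the exact criterion behind the 14% figure, so this assumes a quiz is complete when it has one question, one true answer, and three false answers, as in the EEQG template. The function names are hypothetical.

```python
def is_complete_quiz(text, n_false=3):
    """Check for one question, one true answer, and n_false distractors."""
    counts = {"Question:": 0, "True answer:": 0, "False answer:": 0}
    for line in text.splitlines():
        stripped = line.strip()
        for label in counts:
            # Count a field only if some text follows its label.
            if stripped.startswith(label) and stripped[len(label):].strip():
                counts[label] += 1
    return (
        counts["Question:"] == 1
        and counts["True answer:"] == 1
        and counts["False answer:"] == n_false
    )

def completeness_rate(outputs):
    """Share of generated outputs that pass the completeness check."""
    return sum(is_complete_quiz(output) for output in outputs) / len(outputs)

complete = "\n".join([
    "Question: What advice does Edell give us?",
    "True answer: Get ready for any opportunity in life.",
    "False answer: Try to live a colorful life.",
    "False answer: Take any chance that comes up.",
    "False answer: Explore your potential talents.",
])
truncated = "\n".join([
    "Question: What advice does Edell give us?",
    "True answer: Get ready for any opportunity in life.",
])

print(completeness_rate([complete, truncated]))
```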

Experimental Results
Original quiz:
Question: What is Edell’s strongest character?
True answer: Liking to try new things.
False answer: Liking to remember things.
False answer: Liking to help others.
False answer: Liking to think of problems differently.
Macaw-11b SWQG-generated quiz (BLEU-4 = 0.0):
Question: The writer’s mother told her that if she wanted
to be successful in life, she should be prepared for any
kind of change. What did Edell do to prepare for the
Asian MTV Awards?
True answer: she read the biographies of the guests
False answer: she practiced her English with her friends
False answer: she made a list of the names of the guests
False answer: she made a plan to talk with the guests
GPT-3 SWQG-generated quiz (BLEU-4 = 0.0):
Question: What advice does Edell give us?
True answer: Get ready for any opportunity in life.
False answer: Try to live a colorful life.
False answer: Take any chance that comes up.
False answer: Explore your potential talents.
EduQuiz-generated quiz (BLEU-4 = 0.0):
Question: What advice does Edell give to young people?
True answer: Try to get yourself well-prepared in life.
False answer: Have a rich collection of CDs.
False answer: Never miss an opportunity to learn ballet.
False answer: Be a hostess of the Asian MTV Awards.

Main Takeaways
• Already useful for formative feedback and for increasing engagement during the
learning phase
• Currently limited to English and to reading comprehension texts
• Too early to replace educational professionals
• Current performance requires a human in the loop to check quality
