How to turn Wikipedia into a Quiz Game
7th April 2017
Roberto Turrin Andrea Cappelli
PyCon Otto - Florence, Italy
About Us
Roberto Turrin, Head of Technology, PhD (@robytur)
Andrea Cappelli, Data Scientist, PhD (@Skep86)
Amazon Echo
Agenda
Wikipedia and Wikidata as knowledge sources
NLP with Google Natural Language
Using entities and dependencies to generate questions
IoT integration with Alexa
Wikipedia and Wikidata as knowledge sources
Wikipedia and Wikidata
Wikidata stores structured information about several Wikipedia entities.
Wikidata is a document-oriented database.
Information is represented by statements, i.e., key-value pairs.
Wikidata can be queried via PetScan, SPARQL, or AutoList.
Querying Wikidata with SPARQL
subject (resource): <urn:x-states:New%20York>
predicate (resource): <http://purl.org/dc/terms/alternative>
object (resource/literal): "NY"
“New York has the postal abbreviation NY”
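The subject/predicate/object pattern above carries over directly to a Wikidata query. The following is a minimal Python sketch, not the authors' code: it assumes that the Wikidata property P1441 ("present in work") is the relation of interest and that results arrive in the standard SPARQL JSON results format. The helper names `build_query` and `parse_bindings` and the sample response are illustrative only, and no network call is made.

```python
import json


def build_query(work_qid, limit=10):
    """Return a SPARQL query for English Wikipedia articles about entities of a work.

    Assumption: P1441 ("present in work") links a character to its work.
    """
    return f"""
SELECT ?item ?article WHERE {{
  ?item wdt:P1441 wd:{work_qid} .
  ?article schema:about ?item ;
           schema:isPartOf <https://en.wikipedia.org/> .
}}
LIMIT {limit}
""".strip()


def parse_bindings(response_text):
    """Extract (item URI, article URL) pairs from a SPARQL JSON results document."""
    data = json.loads(response_text)
    return [
        (row["item"]["value"], row["article"]["value"])
        for row in data["results"]["bindings"]
    ]


# Hand-written, illustrative response in the SPARQL JSON results format.
SAMPLE = json.dumps({
    "head": {"vars": ["item", "article"]},
    "results": {"bindings": [{
        "item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q7810"},
        "article": {"type": "uri", "value": "https://en.wikipedia.org/wiki/Homer_Simpson"},
    }]},
})

query = build_query("Q886")      # Q886 is The Simpsons
pairs = parse_bindings(SAMPLE)
```

To run the query for real, the string would typically be sent to the public endpoint at https://query.wikidata.org/sparql with a JSON format parameter; that step is left out so the sketch stays offline.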
Downloading knowledge
Querying Wikidata to retrieve Wikipedia links of interest (SPARQL)
Downloading Wikipedia pages as XML (REST HTTP call)
Parsing Wikipedia dump (WikiExtractor)
Reference knowledge
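The download step can be sketched as follows. This is a minimal Python example under stated assumptions: it uses MediaWiki's `Special:Export` page as the XML source (the talk does not say which REST call was used), and it reads the raw wikitext with the standard library instead of WikiExtractor. The helper names and the sample XML are illustrative.

```python
from urllib.parse import quote
import xml.etree.ElementTree as ET

# Assumption: pages are fetched through MediaWiki's Special:Export.
EXPORT_BASE = "https://en.wikipedia.org/wiki/Special:Export/"


def export_url(title):
    """Build the export URL for a page title."""
    return EXPORT_BASE + quote(title.replace(" ", "_"))


def _local(tag):
    """Strip the XML namespace, which changes between export schema versions."""
    return tag.rsplit("}", 1)[-1]


def pages_from_export(xml_text):
    """Return (title, wikitext) pairs from a MediaWiki export document."""
    root = ET.fromstring(xml_text)
    pages = []
    for page in root.iter():
        if _local(page.tag) != "page":
            continue
        title, text = None, ""
        for node in page.iter():
            name = _local(node.tag)
            if name == "title":
                title = node.text
            elif name == "text":
                text = node.text or ""
        pages.append((title, text))
    return pages


# Hand-written sample shaped like an export document (no network call made).
SAMPLE_XML = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Homer Simpson</title>
    <revision><text>'''Homer Simpson''' is a fictional character.</text></revision>
  </page>
</mediawiki>"""

url = export_url("Homer Simpson")
pages = pages_from_export(SAMPLE_XML)
```

The returned text is still wikitext; turning it into plain sentences is the job the slides assign to WikiExtractor.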
NLP with Google Natural Language
Natural Language Processing
Main NLP tasks:
Syntax analysis
Semantic analysis
Entity recognition
Sentiment analysis
Google Natural Language - overview
Homer Simpson stole Ned’s air conditioner.
https://cloud.google.com/natural-language/
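A syntax analysis request for the example sentence might look like the sketch below. It assumes the REST method `documents:analyzeSyntax` of the Cloud Natural Language API and only builds the request body and flattens a response; authentication and the HTTP call are omitted. The sample response is hand-written and trimmed to three tokens, so the labels a real call returns may differ.

```python
ENDPOINT = "https://language.googleapis.com/v1/documents:analyzeSyntax"


def syntax_request(text):
    """Build the JSON body for a syntax analysis request on plain text."""
    return {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF8",
    }


def tokens_from_response(response):
    """Flatten tokens into (text, lemma, dependency label, head index) tuples."""
    return [
        (
            tok["text"]["content"],
            tok["lemma"],
            tok["dependencyEdge"]["label"],
            tok["dependencyEdge"]["headTokenIndex"],
        )
        for tok in response.get("tokens", [])
    ]


# Hand-written, trimmed sample response (illustrative, not captured from the API).
SAMPLE_RESPONSE = {
    "tokens": [
        {"text": {"content": "Homer"}, "lemma": "Homer",
         "dependencyEdge": {"headTokenIndex": 1, "label": "NN"}},
        {"text": {"content": "Simpson"}, "lemma": "Simpson",
         "dependencyEdge": {"headTokenIndex": 2, "label": "NSUBJ"}},
        {"text": {"content": "stole"}, "lemma": "steal",
         "dependencyEdge": {"headTokenIndex": 2, "label": "ROOT"}},
    ]
}

body = syntax_request("Homer Simpson stole Ned's air conditioner.")
tokens = tokens_from_response(SAMPLE_RESPONSE)
```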
Using entities and dependencies to generate questions
From statement to question
Homer Simpson stole Ned’s air conditioner.
Dependencies: nsubj (Homer Simpson), verb stole (lemma: steal), dobj (Ned’s air conditioner)
Entities: https://en.wikipedia.org/wiki/Homer_Simpson https://en.wikipedia.org/wiki/Ned_Flanders
• Identify a relevant phrase and remove it
__________ stole Ned’s air conditioner.
• Rephrase the sentence as a question
Q: What did Homer Simpson steal?
A: Ned’s air conditioner.
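One way to go from the dependency parse to the question and answer above is sketched below. It is a minimal illustration, not the authors' implementation: the token list is a hand-written parse whose labels and head indices are assumptions, the single pattern handled is subject + verb + direct object, and the auxiliary "did" assumes a past-tense sentence.

```python
from collections import namedtuple

Token = namedtuple("Token", "text lemma label head")

# Hand-written parse of the example sentence; labels and heads are assumptions.
TOKENS = [
    Token("Homer", "Homer", "nn", 1),
    Token("Simpson", "Simpson", "nsubj", 2),
    Token("stole", "steal", "root", 2),
    Token("Ned", "Ned", "poss", 6),
    Token("'s", "'s", "ps", 3),
    Token("air", "air", "nn", 6),
    Token("conditioner", "conditioner", "dobj", 2),
    Token(".", ".", "p", 2),
]


def subtree(tokens, index):
    """Return the sorted indices of a token and all of its descendants."""
    found = {index}
    changed = True
    while changed:
        changed = False
        for i, tok in enumerate(tokens):
            if i not in found and tok.head in found:
                found.add(i)
                changed = True
    return sorted(found)


def phrase(tokens, indices):
    """Join tokens with spaces, attaching clitics such as 's to the previous word."""
    out = ""
    for i in indices:
        text = tokens[i].text
        out += text if (not out or text.startswith("'")) else " " + text
    return out


def make_question(tokens):
    """Turn a subject-verb-object statement into a 'What did ...?' question."""
    root = next(i for i, t in enumerate(tokens) if t.label == "root")
    subj = next(i for i, t in enumerate(tokens) if t.label == "nsubj" and t.head == root)
    obj = next(i for i, t in enumerate(tokens) if t.label == "dobj" and t.head == root)
    question = f"What did {phrase(tokens, subtree(tokens, subj))} {tokens[root].lemma}?"
    answer = phrase(tokens, subtree(tokens, obj)) + "."
    return question, answer


question, answer = make_question(TOKENS)
```

Using the verb's lemma ("steal") rather than its surface form ("stole") is what lets the question read naturally after "did".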
What about wrong answers?
“Homer Simpson stole Ned’s air conditioner.” (Q7810)
Q7810, Q646166, Q727156, Q324430, … each “appears in” The Simpsons (Q886)
Good “wrong answers” (distractors) for quizzes
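The idea of drawing distractors from entities that share the "appears in" relation can be sketched as below. This is a minimal Python illustration using only the identifiers shown on the slide, treated as opaque IDs; the function name `pick_distractors` and the in-memory dictionary are assumptions standing in for a real Wikidata lookup.

```python
import random

# "appears in" relations as listed on the slide: entity QID -> work QID.
APPEARS_IN = {
    "Q7810": "Q886",
    "Q646166": "Q886",
    "Q727156": "Q886",
    "Q324430": "Q886",
}


def pick_distractors(correct_qid, appears_in, k=3, rng=None):
    """Pick up to k entities from the same work as the correct answer."""
    rng = rng or random.Random()
    work = appears_in[correct_qid]
    candidates = sorted(
        qid for qid, w in appears_in.items() if w == work and qid != correct_qid
    )
    return rng.sample(candidates, min(k, len(candidates)))


# A seeded generator keeps the selection reproducible.
choices = pick_distractors("Q7810", APPEARS_IN, k=3, rng=random.Random(0))
```

Restricting candidates to the same work keeps the wrong answers plausible: they are the same kind of entity as the correct one.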
Demo
clda.co/wiki-trivia
IoT integration with Alexa
What is Amazon Alexa (Echo Dot)?
Based on Amazon Alexa Voice Service
Enabling HCI via voice
Quick build with AWS Lambda or pointing to web API
Intent-based with slot-filling
Retains memory within each session
“Has Skills”
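The intent, slot, and session-memory points above can be illustrated with a small AWS Lambda handler. This is a sketch under stated assumptions: the intent name `AnswerIntent`, the slot name `Answer`, and the question itself are hypothetical, and only the basic fields of the Alexa custom-skill request and response JSON are used.

```python
# Hypothetical question; a real skill would draw it from the generated quiz.
QUESTION = {"text": "Who stole Ned's air conditioner?", "answer": "homer simpson"}


def build_response(speech, session_attributes, end_session=False):
    """Build a plain-text Alexa response that carries the session attributes."""
    return {
        "version": "1.0",
        "sessionAttributes": session_attributes,
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": end_session,
        },
    }


def lambda_handler(event, context):
    """Ask a question on launch, then check the slot value of the answer intent."""
    request = event["request"]
    attributes = (event.get("session") or {}).get("attributes") or {}

    if request["type"] == "LaunchRequest":
        # Session memory: store the expected answer for the next turn.
        attributes["expected"] = QUESTION["answer"]
        return build_response(QUESTION["text"], attributes)

    if request["type"] == "IntentRequest" and request["intent"]["name"] == "AnswerIntent":
        slots = request["intent"].get("slots", {})
        given = (slots.get("Answer", {}).get("value") or "").lower()
        correct = given == attributes.get("expected")
        speech = "Correct!" if correct else "Wrong answer."
        return build_response(speech, attributes, end_session=True)

    return build_response("Sorry, I did not understand.", attributes)
```

The expected answer travels in `sessionAttributes`, so the handler itself stays stateless between invocations.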
Deploying Alexa Skills
Create new skill (name and invocation)
Define intents, slots and utterances
Link to Lambda or web API
Test
Conclusions
Overview
Download knowledge from Wikipedia and Wikidata
Extract entities and dependencies with NLP
Generate questions from NLP outcome and Wikidata entities
Configure Alexa to serve questions
Future work
Other patterns to generate questions
Custom entity extraction
Iterative correction of bad questions
More complex distractors
Thank you
Q & A
7th April 2017
