CrowdTruth.org
Machine-Human Computation for Harnessing Disagreement in Semantic Interpretation
Oana Inel, Khalid Khamkham, Tatiana Cristea, Anca Dumitrache, Arne Rutjes, Jelle v.d Ploeg, Lukasz Romaszko, Lora Aroyo, Robert-Jan Sips

Importance of Human Annotation
• Semantic interpretation of data is needed in all sciences
• Humans analyze examples and annotate them for the “correct” interpretation
• Machines learn & are evaluated from those examples
Lora Aroyo @laroyo

HUMAN DISAGREEMENT IS ESSENTIAL IN HELPING MACHINES WITH SEMANTIC INTERPRETATION!

disagreement can reflect the degree of clarity in a sentence
Does each sentence express the TREAT relation?
ANTIBIOTICS are the first line treatment for indications of TYPHUS. → 95%
Patients with TYPHUS who were given ANTIBIOTICS exhibited side-effects. → 80%
With ANTIBIOTICS in short supply, DDT was used during WWII to control the insect vectors of TYPHUS. → 50%
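
One way to read the percentages on this slide is as the fraction of workers who answered "yes" for each sentence. The sketch below illustrates that reading only; it is not the CrowdTruth implementation. The function name `treat_score`, the sentence labels, and the 20 votes per sentence are hypothetical, with the votes chosen to reproduce the slide's figures.

```python
from typing import Dict, List


def treat_score(votes: List[bool]) -> float:
    """Fraction of workers who said the sentence expresses TREAT."""
    if not votes:
        raise ValueError("need at least one vote")
    return sum(votes) / len(votes)


# Hypothetical votes (20 workers per sentence), not data from the slides.
votes_by_sentence: Dict[str, List[bool]] = {
    "first line treatment": [True] * 19 + [False] * 1,
    "side-effects": [True] * 16 + [False] * 4,
    "DDT during WWII": [True] * 10 + [False] * 10,
}

for sentence, votes in votes_by_sentence.items():
    # A score near 1.0 suggests a clear sentence; a score near 0.5
    # suggests the workers are split, i.e. the sentence is unclear.
    print(f"{sentence}: {treat_score(votes):.0%}")
```

Under this reading the score is kept as a graded value per sentence rather than collapsed to a single yes/no label.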

disagreement can indicate ambiguity of the relation
What is the RELATION between the highlighted terms?
GADOLINIUM agents are useful for patients with renal impairment, but in patients with severe renal failure requiring dialysis it presents a risk of nephrogenic systemic FIBROSIS.
CAUSE? (70%) or SIDE EFFECT? (45%)
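
The two percentages sum to more than 100%, which suggests that a worker could select more than one relation for the same sentence; that reading is an assumption. Under it, each relation gets its own score: the fraction of workers who selected it. The sketch below is illustrative only, with a hypothetical function name `relation_scores` and hypothetical annotations from 20 workers chosen to match the slide's figures.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def relation_scores(annotations: Iterable[Set[str]]) -> Dict[str, float]:
    """Per-relation fraction of workers who selected that relation."""
    annotations = list(annotations)
    counts = Counter(rel for chosen in annotations for rel in chosen)
    return {rel: n / len(annotations) for rel, n in counts.items()}


# Hypothetical multi-select annotations, not data from the slides:
# 5 workers pick both relations, 9 only CAUSE, 4 only SIDE EFFECT, 2 neither.
annotations = (
    [{"CAUSE", "SIDE EFFECT"}] * 5
    + [{"CAUSE"}] * 9
    + [{"SIDE EFFECT"}] * 4
    + [set()] * 2
)

for relation, score in sorted(relation_scores(annotations).items()):
    print(f"{relation}: {score:.0%}")
```

Two relations that both receive substantial scores on the same sentence are a signal that the relations overlap, rather than that the workers are wrong.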

disagreement can indicate low quality workers
Does each sentence express the TREAT relation?
• S1: ANTIBIOTICS are the first line treatment for indications of TYPHUS.
• S2: QUININE is not a reliable cure for MALARIA.
Worker: S1, S2
Worker 1: yes, no
Worker 2: yes, no
Worker 3: yes
Worker 4: no
Worker 5: no, yes
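
A simple way to surface a worker who consistently disagrees with the rest is to average that worker's agreement with every other worker over the sentences they both annotated. This is a minimal sketch of the idea, not the CrowdTruth worker metric. The function name `mean_pairwise_agreement` is hypothetical, and only Workers 1, 2 and 5 are included because the table gives answers for both sentences for those three.

```python
from typing import Dict

# worker -> sentence -> answer
Judgments = Dict[str, Dict[str, str]]


def mean_pairwise_agreement(judgments: Judgments) -> Dict[str, float]:
    """Share of shared (other worker, sentence) pairs on which each worker agrees."""
    scores: Dict[str, float] = {}
    for worker, answers in judgments.items():
        agree = total = 0
        for other, other_answers in judgments.items():
            if other == worker:
                continue
            # Compare only the sentences that both workers annotated.
            for sentence in answers.keys() & other_answers.keys():
                total += 1
                agree += answers[sentence] == other_answers[sentence]
        scores[worker] = agree / total if total else float("nan")
    return scores


judgments: Judgments = {
    "Worker 1": {"S1": "yes", "S2": "no"},
    "Worker 2": {"S1": "yes", "S2": "no"},
    "Worker 5": {"S1": "no", "S2": "yes"},
}

scores = mean_pairwise_agreement(judgments)
for worker, score in scores.items():
    print(f"{worker}: {score:.2f}")
```

With these three rows, Worker 5 agrees with nobody on either sentence and receives the lowest score, which matches the slide's point that some disagreement points to the worker rather than to the sentence.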

CrowdTruth Software Components: Machines & Crowds Workflow
• Machine Pre-processing: optimizing crowdsourcing
• Micro-task Template Library: reuse & optimization
• CrowdTruth Analytics: disagreement-based metrics
• Novel approach to ground truth data collection & evaluation
• PROV for tracking versions of data and processing steps
• Reusability in variety of annotation tasks & domains with text, image, video (thinking about sound)

CrowdTruth Software
• Open source: https://github.com/CrowdTruth
• Web service: http://stable.crowdtruth.org

CrowdTruth Software: Crowdsourcing Job Analytics

CrowdTruth Software: Worker Analytics

crowdtruth.org
github.com/CrowdTruth
