THE HUMAN FACE OF AI:
HOW COLLECTIVE AND
AUGMENTED INTELLIGENCE
CAN HELP SOLVE SOCIETAL
PROBLEMS
Elena Simperl
ACM-W UK, June 2020
@esimperl
AUGMENTED
INTELLIGENCE
Human-centred design paradigm for systems
that utilise artificial intelligence (AI)
People and AI work together to enhance
cognitive performance, support decision
making and create new experiences
AI DEPENDS
ON PEOPLE
Applications require more or better data, e.g.
from mobile or IoT devices
Machine learning algorithms learn from
human labellers
Knowledge-based AI approaches acquire
domain knowledge from people
AI BENEFITS
FROM
COLLECTIVE
INTELLIGENCE
Collective intelligence (CI) emerges when
groups or communities come together,
implicitly or explicitly, to achieve a common
goal
CI techniques help AI applications design and
manage interactions with people
In human computation a machine performs a
function by outsourcing some steps to people
How do we design systems that bring together
human, collective & computational intelligence?
IN THIS TALK
Design patterns for socio-
technical systems
Socio-technical challenges
when defining and applying
the patterns
Directions for future research
EXAMPLE:
SUPPORTING DISASTER RELIEF
Human computation has driven major advances in
disaster relief efforts
• 40,000 independent reports mapped through Ushahidi after
the 2010 Haiti earthquake
Crisis teams sift through large volumes of
crowdsourced reports from social media and other
sources
Volunteer efforts are predominantly limited to the
initial phase of recovery
Human interest and effort often fade before the later
stages of the process
CHALLENGE:
MAKING CROWDSOURCING SUSTAINABLE
Learning
Engagement
TASK ALLOCATION
EXPERIMENT
Increase learning and engagement by ordering
tasks by difficulty or by content similarity
Public dataset of tweet URLs about hurricanes
Harvey, Irma and Maria, curated manually to 2,000
tweets: 1,000 text-only, 1,000 with images
People were asked to classify tweets to help
recovery teams process social media reports
Recruitment via Amazon’s Mechanical Turk
Labels train machine learning classifiers
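As a rough illustration of the three ordering policies compared in this experiment, the sketch below orders a task queue by a crowd-rated difficulty score, by greedy content similarity, or at random. The task fields and the similarity function are assumptions for illustration, not the experiment's actual implementation.

```python
import random

def order_by_difficulty(tasks, easiest_first=True):
    """Order tasks by an assumed crowd-rated difficulty score in [0, 1]."""
    return sorted(tasks, key=lambda t: t["difficulty"], reverse=not easiest_first)

def order_by_similarity(tasks, similarity):
    """Greedy chain: always show the task most similar in content to the previous one."""
    remaining = list(tasks)
    ordered = [remaining.pop(0)]
    while remaining:
        nxt = max(remaining, key=lambda t: similarity(ordered[-1], t))
        remaining.remove(nxt)
        ordered.append(nxt)
    return ordered

def order_randomly(tasks, seed=None):
    """Random-baseline condition."""
    shuffled = list(tasks)
    random.Random(seed).shuffle(shuffled)
    return shuffled
```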
TASK DESIGN
Presented participants with disaster relief tweets
(text or text + image)
Participants asked to:
• Classify text based on content
• Rate task according to difficulty
Three conditions:
• Random baseline
• Difficult tweets
• Easy tweets
Monitored accuracy of responses
FINDINGS
Accuracy was influenced by difficulty
• Text: weak association when comparing easy and difficult
clusters
• Images: strong association when comparing difficult and random
clusters
No significant association between difficulty and
volume of completed tasks
Only 30% of workers completed more than one task
FEEDBACK EXPERIMENT
Two forms of feedback
• Expert feedback (using gold standard)
• Crowd feedback (randomly selected)
Workflow:
• Participant gives answer
• Prompted with pre-existing answer and offered chance to
edit
• Asked to explain decision
Monitored decisions and justifications
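The workflow can be summarised in a short sketch. The participant object and its classify/revise/justify methods are hypothetical stand-ins for the real task interface; the gold-standard and crowd-answer lookups follow the two conditions described above.

```python
import random

def feedback_trial(task, participant, condition, gold_labels, crowd_answers):
    """One trial: answer, see feedback, optionally revise, justify."""
    answer = participant.classify(task)                 # initial answer
    if condition == "expert":
        shown = gold_labels[task.id]                    # feedback from the gold standard
    else:
        shown = random.choice(crowd_answers[task.id])   # randomly selected crowd answer
    revised = participant.revise(task, answer, shown)   # chance to edit
    reason = participant.justify(task, revised, shown)  # explain the decision
    return {"initial": answer, "shown": shown, "final": revised,
            "changed": revised != answer, "justification": reason}
```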
FINDINGS
Participants were generally poor at taking feedback
into account:
• 57% of workers felt the expert feedback matched their responses
(it actually did for only 7%)
• 36% of workers felt the crowd feedback matched (it actually did
for only 4%)
Participants presented with crowd feedback were more
likely to change their answer in response (41% vs
26%)
They were also more likely to deem crowd feedback
incorrect than expert feedback (22% vs 16%)
FUTURE WORK
Difficulty impacts accuracy, but not
engagement
Participants struggled with more
complex tasks
Significant support required for
maximum accuracy
Generic feedback is not sufficient; more
personalised support is required, which is
resource-intensive
EXAMPLE:
URBAN AUDITING
ON DEMAND
Urban datasets are often out of date
• Survey methodologies: expensive,
error-prone, no validation
• VGI (e.g. OpenStreetMap): no
control over data updates,
coverage, etc.
Online tool using paid microtask
crowdsourcing
• Uses digital street view imagery
• Task performed remotely
• Participants recruited from online
marketplaces
VIRTUAL CITY EXPLORER
QROWD-POI.HEROKUAPP.COM/
Urban planner defines an area
and the instructions for the
participants
Participants explore an area
virtually and identify points of
interest
Urban planner monitors task
execution, quality and rewards
CHALLENGE:
CROWDSOURCING DESIGN
TASK DESIGN · DATA QUALITY · INCENTIVES · FAIRNESS
EXPERIMENT: CYCLING TRENTO & NANTES
150 participants per city, random starting positions
5 PoIs (bike racks) per participant for $0.15
Total cost per city: $45 (7 days)
Mixed methods approach, including metrics and
manual inspection
• RQ1: Feasibility and precision as task progresses
• RQ2: Completeness (overlap with benchmark datasets)
• RQ3: Coverage (percentage of visited nodes on explorable path)
• RQ4: Crowd experience (interface errors triggered, number of
escapes)
                      Trento     Nantes
Area                  0.347 km²  0.336 km²
Nodes                 906        1,177
Explorable distance   9,127 m    12,104 m
StreetView coverage   93%        92%
RQ1: TASK FEASIBILITY AND PRECISION AS TASK PROGRESSES
UX supports discovery
of PoIs
Photoshoot paradigm
and triangulation
method help identify
low-quality answers
Precision drops as all
PoIs are submitted
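A minimal sketch of the triangulation idea, assuming a flat local coordinate frame in metres and compass headings taken from two StreetView positions; the actual VCE pipeline may differ. Rays that are parallel or that intersect behind a camera are flagged as likely low-quality submissions.

```python
import math

def bearing_vector(heading_deg):
    """Unit direction vector for a compass heading (0° = north, 90° = east)."""
    rad = math.radians(heading_deg)
    return (math.sin(rad), math.cos(rad))

def cross(a, b):
    return a[0] * b[1] - a[1] * b[0]

def triangulate(p1, heading1, p2, heading2):
    """Intersect two bearing rays from points p1, p2 (local x/y in metres)."""
    d1, d2 = bearing_vector(heading1), bearing_vector(heading2)
    denom = cross(d1, d2)
    if abs(denom) < 1e-9:
        return None  # parallel bearings: cannot triangulate
    diff = (p2[0] - p1[0], p2[1] - p1[1])
    t = cross(diff, d2) / denom
    s = cross(diff, d1) / denom
    if t < 0 or s < 0:
        return None  # intersection behind a camera: likely low-quality answer
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])
```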
RQ2: DATA COMPLETENESS
Approach complements existing
data sources and is able to find
new PoIs
Highly customisable (area of
interest, budget, questions, timing)
RQ3: COVERAGE OF THE
DESIGNATED AREA
Approach achieves high coverage of the
area of interest
Some parts of the map are visited more
often than others (resources)
Black dots are points on StreetView that
are difficult to explore
RQ4: CROWD EXPERIENCE
Most participants were able to complete
their tasks without any incidents. Some did
not manage to triangulate or stepped
outside of the designated area
Positive feedback and payment perceived as fair,
despite the taboo mechanism. A small percentage
submitted some data and then dropped out.
Most participants who dropped out did not seem to
attempt the task
FINDINGS
VCE adds value to urban auditing methods
• Accuracy comparable to OpenStreetMap, easier to
manage than VGI
• Additional resources on demand (at a cost)
Free exploration achieves good coverage
Taboo mechanism helps reduce costs and avoid
duplicated work
FUTURE WORK
Allocating starting positions: randomly, at the centre, to confirm an item, to cover
a new area, etc.
Coordinating among participants: map showing progress of other participants
Understanding the impact of urban topology on feasibility, accuracy,
coverage
Direct comparisons with other approaches
Hybrid workflows with crowds on the ground and online
EXAMPLE:
UNDERSTANDING MOBILITY
PATTERNS
City planners lack detailed mobility
information about their residents
Human-AI workflow
Bespoke app for data collection
Combination of symbolic and numerical ML
classifiers to match trip segments to modes of
transport
Active learning approach to ask travellers to
validate trips the machine is unsure about
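A hedged sketch of the active-learning step, assuming a trained classifier that exposes per-segment mode probabilities; the dict-based interface and the 0.7 threshold are illustrative, not the project's actual code. Only segments below the confidence threshold are routed to the traveller for validation.

```python
def segments_to_validate(segments, classifier, confidence_threshold=0.7):
    """Pick trip segments whose predicted transport mode is too uncertain."""
    uncertain = []
    for seg in segments:
        probs = classifier.predict_proba(seg)  # e.g. {"walk": 0.4, "bus": 0.35, ...}
        mode, confidence = max(probs.items(), key=lambda kv: kv[1])
        if confidence < confidence_threshold:
            uncertain.append((seg, mode, confidence))
    # ask the traveller about the least confident segments first
    return sorted(uncertain, key=lambda x: x[2])
```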
CHALLENGE:
USER EXPERIENCE
Iterative UX development via citizen lab to improve
journey data and ML predictions
Lab and field studies with 250+ participants
CHALLENGE:
ASSESSING THE QUALITY OF THE DATA
Naïve model assumes people will notice and correct errors in journeys detected by the
algorithm
Is this true? If not, can we detect errors and estimate the residual error rate?
Are people employing specific ‘strategies’ to check and correct journeys?
EXPERIMENT DESIGN
No independent ground truth!
Inject artificial errors and measure whether
they are corrected
Assume artificial errors are not corrected
by accident
Use the ratio of discovered natural errors to
discovered artificial errors to estimate the
initial and residual natural error rates
Assume natural errors are comparable to
artificial ones and that people are not
adding new errors ('mis-corrections')
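The estimation logic amounts to a capture-style calculation: if participants correct a fraction p of the injected artificial errors, and natural errors are assumed to behave the same way, the discovered natural errors scale up by 1/p. A minimal sketch, with illustrative variable names:

```python
def estimate_natural_errors(injected, injected_found, natural_found):
    """Estimate initial and residual natural error counts from correction rates."""
    if injected_found == 0:
        raise ValueError("no injected errors found; detection rate unidentifiable")
    detection_rate = injected_found / injected  # p: fraction of artificial errors corrected
    initial = natural_found / detection_rate    # estimated natural errors before checking
    residual = initial - natural_found          # estimated natural errors left afterwards
    return detection_rate, initial, residual

# Illustrative numbers: 20 injected, 15 corrected (p = 0.75); 6 natural errors
# corrected -> ~8 natural errors estimated initially, ~2 estimated to remain.
```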
EXPERIMENT DESIGN (2)
10 participants, ~5 journeys per participant, from Google Timelines (KML)
Pre-process to add artificial errors in four classes:
• Under-segmentation
• Over-segmentation
• Bad mode
• Bad point (100 m or 400 m GPS point move)
Scoring: manual process, tool-supported
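For illustration, a sketch of the error-injection pre-processing under an assumed journey representation (a list of segments, each with a transport mode and GPS points); the schema and mode list are assumptions, not the project's actual format.

```python
import copy
import random

MODES = ["walk", "bike", "bus", "car", "train"]  # assumed mode list

def inject_error(journey, kind, rng=None):
    """Return a copy of the journey with one artificial error of the given kind.

    A journey is assumed to be a list of segments:
    {"mode": str, "points": [(lat, lon), ...]} with at least two points each.
    """
    rng = rng or random.Random()
    j = copy.deepcopy(journey)
    i = rng.randrange(len(j))
    seg = j[i]
    if kind == "over_segmentation":      # split one real segment into two
        mid = len(seg["points"]) // 2
        j[i:i + 1] = [{"mode": seg["mode"], "points": seg["points"][:mid]},
                      {"mode": seg["mode"], "points": seg["points"][mid:]}]
    elif kind == "under_segmentation" and len(j) > 1:  # merge two segments
        k = min(i, len(j) - 2)
        j[k:k + 2] = [{"mode": j[k]["mode"],
                       "points": j[k]["points"] + j[k + 1]["points"]}]
    elif kind == "bad_mode":             # label a segment with a wrong mode
        seg["mode"] = rng.choice([m for m in MODES if m != seg["mode"]])
    elif kind == "bad_point":            # move one GPS point by 100 m or 400 m
        lat, lon = seg["points"][0]
        offset = rng.choice([100, 400]) / 111_000  # metres -> degrees latitude (approx.)
        seg["points"][0] = (lat + offset, lon)
    return j
```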
PRELIMINARY FINDINGS:
MORE RESEARCH NEEDED INTO DATA COLLECTION
METHODOLOGIES FOR ML
Errors can be corrected
Errors can mislead
Errors can persist
A range of complex cases
How do we design systems that bring together
human, collective & computational intelligence?
Mix of CI approaches
Iterative UX design
Methods to assess data quality and
improve human-AI interactions
Aligned motivation and incentives
THANKS TO LUIS-DANIEL IBÁÑEZ, EDDY
MADDALENA, RICHARD GOMER, NEAL REEVES,
THE QROWD PROJECT, NESTA AND THE
EUROPEAN COMMISSION
@esimperl
Maddalena, E., Ibáñez, L.D. and Simperl, E., 2020. On the mapping of
Points of Interest through StreetView imagery and paid crowdsourcing. To
appear in ACM TIST.
qrowd-poi.herokuapp.com
Nesta, June 2020. Combining Crowds and Machines: Experiments in
collective intelligence design 1.0. nesta.org.uk/report/combining-crowds-
and-machines/
Nesta, June 2020. Collective intelligence grants 1.0.
nesta.org.uk/feature/collective-intelligence-grants/