Provenance and Uncertainty in Human
Terrain Visual Analytics
Kai Xu
Middlesex University, UK
Background: DIVA Project
• DIVA: Data Intensive Visual Analytics
• EPSRC (UK Research Council) and Dstl (Defence
Science and Technology Laboratory)
• Uncertainty in Human Terrain Analysis
– Help ground troops understand local social structure
– Working with large and heterogeneous data sets

• Approach
– Visual Analytics
– Provenance
Provenance
• “The place of origin or earliest
known history of something”
(Oxford Dictionary)
• “The sources of information,
such as entities and processes,
involved in producing an
artefact” (W3C).
Different Types of Provenance
• Data provenance:
– Data source and collection
– Data changes & quality issues

• Computation provenance:
– Workflow
– Parameters & results

• Visual exploration
provenance:
– User interactions
– Insights

• Reasoning/sensemaking
provenance:
– Reasoning artefact: evidence,
hypothesis, etc.

[Figure: Analytic Provenance. Stages shown: Data Collection; Transformation and Analysis; Visualisation and Interaction; Knowledge and insights; Conclusions / Decisions]
Why Provenance?
• Provide the ‘context’ of
– Data and analysis
– Reasoning and decision

• Reproducibility
– Trace the source
– Automatic update

• Help others understand the process
– Collaboration
– Reporting

• Data/analysis quality
– Missing data, errors, and uncertainty
– Computational analysis artefacts
– Human reasoning bias

• Trust
– Understanding of data, analysis, and reasoning helps build the trust
DIVA Project - Details
• Process for this project (participatory design)
• Schema for data and provenance (ProveML)
• Prototype system for HTVA
• Constructing narratives
• Demo/Video
Workshops
Requirements: Data Characteristics
• Semi-structured
• Clear language
• Different perspectives
• Synthesized or derived data
Requirements: Uncertainty Types
• Source uncertainty
• Collection bias
• Spoofing or astroturfing
• Automated extraction of information
The Data-Intensive Visual Analytics (DIVA) project
• Process for this project
• Schema for data and provenance
• Prototype system for HTVA
• Constructing narratives
• Demo/video
ProveML and Facets
• ProveML: Provenance XML
• Facets: document, author, place, time, and
theme
• Review as ‘document’
[Diagram: a Document and its facets. Labels: Place, Author, Write, Theme, Document, Time]
Insight as ‘Document’
[Diagram: example ProveML graph. Places: Mariachi Tequila Shack, Pancho Villa's Quesadilla, Paco's Bar and Grill. Tags: Mexican, Restaurant. Insight by A. N. Analyst: “Mexican food is becoming more popular”. Facet labels: Place, Author, Write, Time, Theme, Document, Insight]
Reviews & insights ↔ A ProveML graph
[Diagram: ProveML graph with a collection. Places: Mariachi Tequila Shack, Pancho Villa's Quesadilla, Paco's Bar and Grill. Tags: Mexican, Restaurant. Insight by A. N. Analyst: “Mexican food is becoming more popular”. Collection: all places tagged with both Mexican and Restaurant]
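The faceted graph and the collection on these slides can be illustrated with a small sketch. The slides do not show the actual ProveML schema, so the class names, fields, and the `collect_places` helper below are hypothetical, the tags assigned to each place are assumed from the diagram, and the fourth place is invented purely to show the filter excluding something.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class Place:
    """A place node with its tags (hypothetical structure)."""
    name: str
    tags: FrozenSet[str]


@dataclass(frozen=True)
class Document:
    """A review or an insight, described by the facets on the slides."""
    text: str
    author: str
    kind: str                              # "review" or "insight"
    places: FrozenSet[str] = frozenset()   # place facet: names of linked places
    time: Optional[str] = None             # time facet; format is an assumption


def collect_places(places: Iterable[Place], required: Iterable[str]) -> List[Place]:
    """Return every place tagged with all of the required tags."""
    wanted = frozenset(required)
    return [p for p in places if wanted <= p.tags]


places = [
    # Tag assignments are assumed from the slide diagram.
    Place("Mariachi Tequila Shack", frozenset({"Mexican", "Restaurant"})),
    Place("Pancho Villa's Quesadilla", frozenset({"Mexican", "Restaurant"})),
    Place("Paco's Bar and Grill", frozenset({"Mexican", "Restaurant"})),
    # Invented place, not on the slides, so the filter has something to drop.
    Place("Hypothetical Noodle Bar", frozenset({"Restaurant"})),
]

# Collection: all places tagged with both Mexican and Restaurant.
collection = collect_places(places, {"Mexican", "Restaurant"})

# The insight is itself a document, written by the analyst about the collection.
insight = Document(
    text="Mexican food is becoming more popular",
    author="A. N. Analyst",
    kind="insight",
    places=frozenset(p.name for p in collection),
)

print(sorted(insight.places))
```

Treating the insight as just another document means it can carry the same facets as a review, which is what lets reviews and insights sit in one graph.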
• Process for this project
• Schema for data and provenance
• Prototype system for HTVA
• Constructing narratives
• Demo/video
Fig. 4: Summary graphics showing the distribution of values for each
• Process for this project
• Schema for data and provenance
• Prototype system for HTVA
• Constructing narratives
• Demo/Video
Visual Exploration in ProveML
[Diagram labels: Collection, State, <visual encoding>, A. N. Analyst]
Link to the Rest of ProveML Graph
[Diagram labels: Bookmark, Collection, State, “A comment about why this is important”, A. N. Analyst]
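One way to picture states and bookmarks is the minimal sketch below. It is not the project's implementation: `State`, `Bookmark`, `ExplorationTrail`, and all of their fields are hypothetical names chosen to mirror the diagram labels, and the visual encoding is reduced to a free-text description.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class State:
    """One step of visual exploration (hypothetical structure)."""
    analyst: str
    collection: str        # description of the data subset in view
    visual_encoding: str   # free-text stand-in for the <visual encoding> node


@dataclass(frozen=True)
class Bookmark:
    """Marks a state as important and records why."""
    state_index: int
    analyst: str
    comment: str


@dataclass
class ExplorationTrail:
    """An ordered series of states plus the bookmarks attached to them."""
    states: List[State] = field(default_factory=list)
    bookmarks: List[Bookmark] = field(default_factory=list)

    def record(self, state: State) -> int:
        """Append a state and return its position in the series."""
        self.states.append(state)
        return len(self.states) - 1

    def bookmark(self, index: int, analyst: str, comment: str) -> Bookmark:
        """Attach a commented bookmark to an already recorded state."""
        if not 0 <= index < len(self.states):
            raise IndexError(f"no state at position {index}")
        mark = Bookmark(index, analyst, comment)
        self.bookmarks.append(mark)
        return mark

    def bookmarked_states(self) -> List[Tuple[State, str]]:
        """Return each bookmarked state together with its comment."""
        return [(self.states[b.state_index], b.comment) for b in self.bookmarks]


trail = ExplorationTrail()
trail.record(State("A. N. Analyst", "all places", "map"))
latest = trail.record(
    State("A. N. Analyst", "places tagged Mexican and Restaurant", "map, coloured by tag")
)
trail.bookmark(latest, "A. N. Analyst", "A comment about why this is important")

for state, comment in trail.bookmarked_states():
    print(state.collection, "|", comment)
```

Keeping the full series of states and marking only some of them separates the automatic record of exploration from the analyst's own judgement about which steps mattered.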
Visual Summary of a State
A Series of States
Spatial Uncertainty
Constructing Narrative
• Process for this project
• Schema for data and provenance
• Prototype system for HTVA
• Constructing narratives
• Demo/Video
Social Media: VAST Challenge 2011
Conclusions and Future Work
• Framework for provenance and uncertainty in
Human Terrain Analysis
• Some confidence that our work is relevant and
directly related to Dstl requirements
• Try ProveML with other data sets
• Semantically-rich provenance in the future:
infer analyst intent from actions
The Team
• City University (London)
– Jason Dykes
– Jo Wood
– Aidan Slingsby

• Loughborough University, UK
– Derek Stephens

• Middlesex University (London)
– William Wong
– Rick Walker
– Phong Nguyen
– Yongjun Zheng
Visit Us @ Middlesex University
• North West London: Google Map
• Interaction Design Centre
• Lots of Visual Analytics Research
– UK Visual Analytics Consortium: Oxford, Imperial,
UCL, and Bangor
– Visual Analytics Summer School and MSc program
– MoD, EPSRC, and EU projects

• Always looking for collaboration
