Victor de Boer
User-centered Data Science for Digital Humanities
DIVE, Dutch Ships and Sailors, and ArchiMediaL
Digital Humanities
Part of the humanities researcher's effort has moved
from physical archives to digital ones
New possibilities for humanities research
Image credits: www.doaks.org, www.dkrz.de
Integrating collections as Linked Data
Tools built on top of the data
Continuous enrichment
Embed in humanities methodology
Continuous collection enrichment
Multimedia analysis (image, text, video)
Human computation
Linked Data
Human-based computation
Professional annotation
Nichesourcing
Crowdsourcing
Nichesourcing: niche groups of amateur experts with shared characteristics
Dutch Ships and Sailors
(Semi-)automatically establish links between
datasets and to external sources
[Figure: RDF graph of one record described with gzmvoc: and dss: terms and connected to a das: voyage. Recoverable node and edge labels:]
Classes: dss:Record, gzmvoc:Telling, dss:Ship, gzmvoc:Schip, dss:ShipType, gzmvoc:Scheepstype, das:Voyage
Resources: gzmvoc:telling-1046-De_Berkel, gzmvoc:schip-1046-De_Berkel, gzmvoc:type-Ship, das:voyage-1918_61, __bnode_1
Properties: rdfs:label, dss:has_ship, gzmvoc:schip, dss:scheepsnaam, gzmvoc:scheepsnaam, dss:has_shiptype, gzmvoc:has_shiptype, gzmvoc:scheepstype, gzmvoc:aziatischeBemanning, dss:azRegistratieKop, gzmvoc:azAantalMatrozen, gzmvoc:telling, gzmvoc:heeft DAS heenreis
Literals: "1046", "Schip", "De Berkel", "21", "Moorse mattroosen"
Locations
Ranks
Ship types
Voyages
…
Mentioned in
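The (semi-)automatic link establishment on this slide can be illustrated with a small sketch. It is not the project's actual linking code: the label normalisation rule, the `ex:` records, and the `ex:sameShipAs` predicate are hypothetical and invented for illustration. Only the identifier `gzmvoc:schip-1046-De_Berkel` and the label "De Berkel" come from the slide. The output is a list of candidate links, which a person would still review before they are added to the data.

```python
import re
import unicodedata


def normalise(label: str) -> str:
    """Lower-case, strip accents and punctuation, and sort the words."""
    text = unicodedata.normalize("NFKD", label)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    words = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(sorted(words))


def propose_links(source, target, predicate="ex:sameShipAs"):
    """Return candidate triples for resources whose normalised labels match."""
    index = {}
    for uri, label in target.items():
        index.setdefault(normalise(label), []).append(uri)
    links = []
    for uri, label in source.items():
        for match in index.get(normalise(label), []):
            links.append((uri, predicate, match))
    return links


# The first identifier and its label appear on the slide.
gzmvoc_ships = {"gzmvoc:schip-1046-De_Berkel": "De Berkel"}

# Hypothetical records from another dataset (assumption, not from the slides).
other_ships = {
    "ex:ship-a": "Berkel, De",    # same name, inverted word order
    "ex:ship-b": "De Amsterdam",  # different ship, should not match
}

candidates = propose_links(gzmvoc_ships, other_ships)
for triple in candidates:
    print(triple)
```

Sorting the words makes "De Berkel" and "Berkel, De" compare equal, which is a deliberately simple choice; exact-match linking of this kind misses spelling variants and can confuse distinct ships that share a name, which is one reason to keep the process semi-automatic.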
Novel data analysis and visualisation
DIVE+ INTO THE EVENT-ENRICHED
LINKED OPEN CULTURAL HERITAGE
Access to Integrated Online Multimedia collections
using Linked Open Data
Interactive Exploration & Discovery in Context
linking objects to events and entities
building automatic storylines (narratives)
DIVE+
Collections and Vocabularies
OPENIMAGES.EU (Netherlands Institute for Sound & Vision): 3,220 news broadcasts; GTAA thesaurus
DELPHER.NL: 197,199 scans of radio bulletins, 1937 – 1984
AMSTERDAM MUSEUM: 73,447 cultural heritage objects; AM Thesaurus
TROPENMUSEUM: 78,270 cultural heritage objects; SVNC thesaurus
Hybrid enrichment pipeline
ENTITY EXTRACTION
EVENTS CROWDSOURCING AND LINKING TO CONCEPTS THROUGH CROWDTRUTH.ORG
SEGMENTATION & KEYFRAMES
LINKING EVENTS AND CONCEPTS TO KEYFRAMES
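One way to picture the hybrid (machine plus crowd) step of such a pipeline is sketched below. This is an assumption-laden illustration, not the DIVE+ pipeline and not the CrowdTruth metrics: it uses a plain agreement ratio, and the function name `accept_candidates`, the threshold `min_agreement`, and all judgements are invented. The entity strings are taken from the example graph in this deck.

```python
from collections import Counter


def accept_candidates(machine_candidates, crowd_judgements, min_agreement=0.6):
    """Keep machine-proposed annotations that enough crowd workers confirmed."""
    confirmations = Counter()
    totals = Counter()
    for candidate, confirmed in crowd_judgements:
        totals[candidate] += 1
        if confirmed:
            confirmations[candidate] += 1
    accepted = []
    for candidate in machine_candidates:
        if totals[candidate] == 0:
            continue  # never shown to the crowd: leave undecided
        if confirmations[candidate] / totals[candidate] >= min_agreement:
            accepted.append(candidate)
    return accepted


# Entities proposed by an automatic extraction step (hypothetical output).
machine_candidates = ["Djakarta", "Mohammad Hatta", "Paul Spies"]

# (candidate, worker confirmed it?) pairs; all judgements are invented.
crowd_judgements = [
    ("Djakarta", True), ("Djakarta", True), ("Djakarta", False),
    ("Mohammad Hatta", True), ("Mohammad Hatta", True),
    ("Paul Spies", False), ("Paul Spies", True), ("Paul Spies", False),
]

accepted = accept_candidates(machine_candidates, crowd_judgements)
print(accepted)
```

A fixed threshold treats disagreement as noise; approaches that model disagreement explicitly would replace the ratio test with their own scores.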
[Figure: event-centric RDF graph connecting media objects from different collections. Recoverable node and edge labels:]
Media objects (DIVE:MediaObject): "Nieuws uit Indonesië: opheffing van het KNIL" (News from Indonesia: dissolution of the KNIL); "Mannen bij het huis van Paul Spies aan de Parapattan 42, Djakarta" (Men at the house of Paul Spies, Parapattan 42, Djakarta); "ANP:1950-08-11:50"; "Schaal" (Dish)
Events (sem:Event): "ontbindingsceremonie" (dissolution ceremony)
Times (sem:Time): 25 July 1950; 11 August 1950
Places (sem:Place): Djakarta; Indonesië (Indonesia)
Actors (sem:Actor): Mohammad Hatta
Edge labels: dive:depictedBy, dive:isRelatedTo, dive:relatedPlace, dive:relatedActor, sem:hasPlace, sem:hasActor, sem:hasTimestamp
Integration of Heterogeneous Collections
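The idea of connecting objects from different collections through shared events, places, and actors can be sketched as a small graph traversal. The exact edges of the diagram are not recoverable from the flattened slide text, so the triples below are an assumed reading, and the `event:`, `media:`, `place:`, and `actor:` identifiers are hypothetical. Only the property names and labels come from the slide. The traversal itself is a plain breadth-first search, not the DIVE+ implementation.

```python
from collections import deque

# Assumed reading of the diagram; identifiers are invented for illustration.
TRIPLES = [
    ("event:ontbindingsceremonie", "sem:hasPlace", "place:Djakarta"),
    ("event:ontbindingsceremonie", "sem:hasActor", "actor:Mohammad_Hatta"),
    ("event:ontbindingsceremonie", "dive:depictedBy", "media:ANP-1950-08-11-50"),
    ("event:ontbindingsceremonie", "dive:depictedBy", "media:opheffing-KNIL"),
    ("media:huis-Paul-Spies", "dive:relatedPlace", "place:Djakarta"),
]


def related_media(start, triples, max_hops=3):
    """Map each other media object within max_hops of start to its distance."""
    # Treat every triple as an undirected edge between subject and object.
    neighbours = {}
    for subject, _predicate, obj in triples:
        neighbours.setdefault(subject, set()).add(obj)
        neighbours.setdefault(obj, set()).add(subject)

    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbours.get(node, ()):
            if nxt not in distance:
                distance[nxt] = distance[node] + 1
                queue.append(nxt)

    return {
        node: hops
        for node, hops in distance.items()
        if node.startswith("media:") and node != start
    }


related = related_media("media:opheffing-KNIL", TRIPLES)
for node, hops in sorted(related.items(), key=lambda item: item[1]):
    print(hops, node)
```

Under these assumed triples, the newsreel item reaches the second depiction of the same event in two hops and the photograph in three hops via the shared place, which is the kind of cross-collection path an exploratory interface can surface.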
Innovative exploratory UI
diveplus.beeldengeluid.nl
ArchiMediaL
Developing Post-colonial Interpretations of Built Form
through Heterogeneous Linked Digital Media
Computer Vision + crowdsourcing
How to identify (elements of) buildings
across different representations
Flexible data model allows for multi-interpretation
Continuous enrichment and linking of heterogeneous
collections bring new possibilities for access and analysis
Using automatic methods
Always with human(s) in the loop
Victor de Boer
v.de.boer@vu.nl
