The Golden Agents project
Disambiguating Person Name Entities
Chiara Latronico, International Archival Conference, Čačak, 10-10-2019
Agenda
● About the Golden Agents project
● Amsterdam City Archive registries
● Transforming and basic interpreting. Processing and enriching the datasets.
● Disambiguating between Persons: Lenticular Lenses II
● Network analysis
About the Golden Agents project
Partners
NWO Large Infrastructure Project: funding ca. $ 4 million, budget € 6 million, 2017-2021 (5 years)
Golden Agents Infrastructure: Aim
Understanding the dynamics in the creative industries of the Dutch Golden Age
● By analyzing interactions between various branches of the creative industries
● By analyzing interactions between producers and consumers of the creative industries
Problem 1: data about the production of the creative industries is dispersed over separate databases
Problem 2: lack of digital data about the consumption of the creative industries of the Dutch Golden Age
Golden Agents: Infrastructure
● Combines Semantic Web and multi-agent technology to analyze and interact with existing and new datasets (in linked data) about the Dutch Golden Age
● Develops ontologies incorporating these dynamic interactions as storylines and events in linked data
● Uses a combination of handwritten text recognition and crowdsourcing to disclose 2 million scans of notarial acts
Amsterdam City Archive registries
Amsterdam City Archives Registries
● Notarial Acts (Notarieel archief)
● Baptism Registries (Doopregisters)
● Burial Registries (Begraafregisters)
● Prenuptial Agreements (Ondertrouwregisters)
● Transport Acts (Kwijtscheldingen)
● Confession Books (Confessieboeken)
● Burghers Books (Poorterboeken)
● Fines on marriage and burying (Boetes op trouwen en begraven)
https://www.amsterdam.nl/stadsarchief/organisatie/open-data/
Notarial Acts
● 3.5 km
● 731 notaries (1578-1915)
● 30,000 inventory numbers
● 6,419,017 scans
● 783 volunteers
● Millions of deeds
Amsterdam City Archives Data Pipeline
Transforming and basic interpreting. Processing and enriching the datasets.
https://data.goldenagents.org/
Basic Conversion XML to RDF
● Golden Agent Ontology
○ PREFIX saaOnt: <http://goldenagents.org/uva/SAA/ontology/>
■ saaOnt:identifier
■ saaOnt:mentionsBride
■ saaOnt:mentionsGroom
■ saaOnt:mentionsEarlierHusband
■ saaOnt:mentionsEarlierBride
○ Person names are stored as pnv:PersonName
■ No interpretation: PersonName = Name
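To make the "no interpretation" step concrete, here is a minimal sketch of such a conversion using only the Python standard library. The XML layout, the `http://example.org/saa/` base IRI and the use of `pnv:literalName` under the namespace `https://w3id.org/pnv#` are assumptions made for illustration; only the `saaOnt:` properties and `pnv:PersonName` come from the slide. The project's actual converter may look quite different.

```python
import xml.etree.ElementTree as ET

SAA_ONT = "http://goldenagents.org/uva/SAA/ontology/"
PNV = "https://w3id.org/pnv#"             # assumed namespace for the pnv: prefix
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/saa/"          # hypothetical base IRI for minted resources

# Hypothetical index record; the real archive XML is not shown in the slides.
SAMPLE_XML = """
<record id="otr-0001">
  <bride>Anna Jans</bride>
  <groom>Pieter Claesz</groom>
</record>
"""

# XML element name -> saaOnt property named on the slide
ROLE_TO_PROPERTY = {"bride": "mentionsBride", "groom": "mentionsGroom"}


def literal(text):
    """Serialize a plain string as an N-Triples literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def record_to_ntriples(xml_text):
    """Convert one record to N-Triples lines without interpreting the names."""
    root = ET.fromstring(xml_text)
    rec_id = root.get("id")
    record = f"<{BASE}record/{rec_id}>"
    triples = [f"{record} <{SAA_ONT}identifier> {literal(rec_id)} ."]
    for role, prop in ROLE_TO_PROPERTY.items():
        for i, element in enumerate(root.findall(role)):
            name_node = f"<{BASE}name/{rec_id}/{role}/{i}>"
            # The name is kept as written: PersonName = Name, no Person yet.
            triples.append(f"{record} <{SAA_ONT}{prop}> {name_node} .")
            triples.append(f"{name_node} <{RDF_TYPE}> <{PNV}PersonName> .")
            triples.append(
                f"{name_node} <{PNV}literalName> {literal(element.text.strip())} ."
            )
    return triples


if __name__ == "__main__":
    print("\n".join(record_to_ntriples(SAMPLE_XML)))
```

Each name becomes its own `pnv:PersonName` resource attached to the record, so nothing is claimed yet about which names refer to the same individual.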
Enrichment and Interpretation
● Event-based data model
● PersonNames attached to Persons
● Relations between Persons are indicated
○ schema:spouse, bio:Marriage
● Implicit person mentions are added as Person
○ Child of
○ Widow of
● Georeferences are expressed with possible geometries
● Further information structuring if needed
○ Splitting StringLiterals into individual resources and types
● Duplicate records are removed.
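Three of the steps listed above (removing duplicates, attaching names to Persons, and adding implicit persons) can be sketched as follows. The `Mention` record, the English "widow of" pattern and the `widowOf` relation label are hypothetical stand-ins. The source records are Dutch and the project's own rules are not given in the slides.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """A person name as it appears in one record (hypothetical structure)."""
    record_id: str
    role: str
    name: str


# Hypothetical pattern for an implicit person mention inside a name string.
WIDOW_PATTERN = re.compile(
    r"^(?P<name>.+?),?\s+widow of\s+(?P<spouse>.+)$", re.IGNORECASE
)


def deduplicate(mentions):
    """Drop exact duplicate mentions while keeping the original order."""
    seen, unique = set(), []
    for mention in mentions:
        if mention not in seen:
            seen.add(mention)
            unique.append(mention)
    return unique


def enrich(mentions):
    """Attach each name to a Person and add implicit persons and relations."""
    persons, relations = [], []
    for mention in deduplicate(mentions):
        match = WIDOW_PATTERN.match(mention.name)
        if match is None:
            persons.append({"name": mention.name, "role": mention.role,
                            "record": mention.record_id, "implicit": False})
            continue
        widow_id = len(persons)
        persons.append({"name": match["name"], "role": mention.role,
                        "record": mention.record_id, "implicit": False})
        spouse_id = len(persons)
        # The earlier husband is only mentioned indirectly, so mark him implicit.
        persons.append({"name": match["spouse"], "role": "earlier husband",
                        "record": mention.record_id, "implicit": True})
        relations.append((widow_id, "widowOf", spouse_id))
    return persons, relations


SAMPLE_MENTIONS = [
    Mention("otr-0001", "bride", "Anna Jans, widow of Jan Pietersz"),
    Mention("otr-0001", "bride", "Anna Jans, widow of Jan Pietersz"),  # duplicate
    Mention("otr-0001", "groom", "Pieter Claesz"),
]

if __name__ == "__main__":
    persons, relations = enrich(SAMPLE_MENTIONS)
    for person in persons:
        print(person)
    print(relations)
```

Flagging the derived person as implicit keeps the difference visible between someone who acts in the record and someone who is only referred to.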
Enrichment: Geo-Referencing
link to historical and contemporary map
Disambiguating between Persons: Lenticular Lenses II
Disambiguating Person Names
One of several experiments
● Prenuptial Agreements
● Baptism Registries
● ECARTICO
ECARTICO is a comprehensive collection of structured biographical data concerning painters, engravers, printers, booksellers, gold- and silversmiths and others involved in the ‘cultural industries’ of the Low Countries in the sixteenth and seventeenth centuries. (http://www.vondel.humanities.uva.nl/ecartico/)
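As a rough illustration of what linking person names across two of these datasets involves, the sketch below proposes candidate links between grooms in prenuptial records and fathers in baptism records. It is not the Lenticular Lenses method, which the slides do not detail. The field names, the string-similarity measure (`difflib.SequenceMatcher`), the 0.85 threshold and the 25-year window are all assumptions chosen for the example.

```python
from difflib import SequenceMatcher

# Hypothetical sample records; the field names are illustrative only.
MARRIAGES = [{"id": "otr-1", "groom": "Pieter Claesz", "year": 1630}]
BAPTISMS = [
    {"id": "dtb-1", "father": "Pieter Claesz.", "year": 1632},
    {"id": "dtb-2", "father": "Pieter Claesz", "year": 1680},
    {"id": "dtb-3", "father": "Jan Jansz", "year": 1633},
]


def normalize(name):
    """Lowercase the name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def name_similarity(a, b):
    """Similarity in [0, 1] between two normalized name strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def candidate_links(marriages, baptisms, min_similarity=0.85, max_years=25):
    """Propose groom/father pairs with similar names and a plausible time gap."""
    links = []
    for marriage in marriages:
        for baptism in baptisms:
            gap = baptism["year"] - marriage["year"]
            # Context check: the baptism should follow the marriage within the window.
            if not 0 <= gap <= max_years:
                continue
            score = name_similarity(marriage["groom"], baptism["father"])
            if score >= min_similarity:
                links.append((marriage["id"], baptism["id"], round(score, 2)))
    return links


if __name__ == "__main__":
    for link in candidate_links(MARRIAGES, BAPTISMS):
        print(link)
```

Name similarity alone is a weak identity criterion, since many people shared the same given name and patronymic. The date window here stands in for the contextual evidence that a real disambiguation step would need.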
Contextual Entity Disambiguation in Domains with Weak Identity Criteria:
Disambiguating Golden Age Amsterdamers (Accepted)
Thank you!
Questions?
c.latronico@uva.nl
