Towards Linked Vital Registration Data for
Reconstituting Families and Creating
Longitudinal Health Histories
Oya Beyan, Ciara Breathnach, Sandra Collins,
Christophe Debruyne, Stefan Decker, Dolores Grant,
Rebecca Grant, and Brian Gurrin
21st of July 2014 – KR4HC Workshop – Vienna, Austria
Irish Record Linkage, 1864-1913
• Developing a platform applying semantic
technologies to historical birth, death and
marriage certificates.
• Answering questions such as: “How accurate are
historic maternal mortality rates (MMR) and
infant mortality rates (IMR) for Dublin?”
• Team consists of researchers (historians), digital
archivists, and knowledge engineers.
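As a rough illustration of the two rates named in the research question, the sketch below computes them from counts. The denominators (per 100,000 live births for MMR, per 1,000 live births for IMR) are the modern reporting conventions and are an assumption here; the slides do not say which convention the project uses, and the function names are hypothetical.

```python
def rate_per(events: int, live_births: int, per: int) -> float:
    """Return the number of events per `per` live births."""
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    return events * per / live_births


def maternal_mortality_rate(maternal_deaths: int, live_births: int,
                            per: int = 100_000) -> float:
    # Assumed convention: maternal deaths per 100,000 live births.
    return rate_per(maternal_deaths, live_births, per)


def infant_mortality_rate(infant_deaths: int, live_births: int,
                          per: int = 1_000) -> float:
    # Assumed convention: deaths under one year of age per 1,000 live births.
    return rate_per(infant_deaths, live_births, per)
```

The accuracy question on the slide is about the numerators: each maternal or infant death that was never linked to a birth, or was certified under a vague cause, changes the computed rate.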
Data: General Register Office (GRO) Records
• Vital registration data
– Birth certificates
– Death certificates
– Marriage records
• Digitised TIFF images of
hardcopy indexes and
registers.
• 2 TB of data
• Database describing the
digitised records allowing
searches on some fields.
© General Register Office of Ireland 2014
Challenges
• Certified causes of death that can be attributed to maternal
death
– Within 42 days after labour – before (1864) it was 12
– Septicemia (blood poisoning), Fever, …
– “Corresponding” birth certificate?
• Death certificates with no corresponding birth certificate
• “Gaps” in sibship intervals for which no birth or death
certificates can be found.
• The terminology used pre-1900. E.g., “debile” to denote
weak or a failure to thrive.
• Capturing the socio-economic status of the families via,
for instance, the professions and ranks of fathers.
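The first challenge, finding deaths that may be maternal, can be sketched as a date-window join between death and birth records. This is only an illustration under strong assumptions: the `Birth` and `Death` shapes, the pre-normalised person identifiers, and the keyword list are hypothetical and much simpler than the real GRO fields.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


# Hypothetical, simplified record shapes; the real GRO fields differ.
@dataclass(frozen=True)
class Birth:
    mother: str      # assumed to be an already-normalised identifier
    born_on: date


@dataclass(frozen=True)
class Death:
    person: str      # same identifier scheme as Birth.mother (assumption)
    died_on: date
    cause: str


# Illustrative hints taken from the slide; not a complete cause list.
MATERNAL_CAUSE_HINTS = ("septicemia", "septicaemia", "fever")


def candidate_maternal_deaths(births: List[Birth], deaths: List[Death],
                              window_days: int = 42) -> List[Tuple[Death, Birth]]:
    """Pair each death whose cause matches a hint with any birth by the
    same woman that happened at most `window_days` before the death."""
    pairs = []
    for d in deaths:
        cause = d.cause.lower()
        if not any(hint in cause for hint in MATERNAL_CAUSE_HINTS):
            continue
        for b in births:
            delta = (d.died_on - b.born_on).days
            if b.mother == d.person and 0 <= delta <= window_days:
                pairs.append((d, b))
    return pairs
```

Deaths that match a cause hint but have no birth in the window are exactly the "corresponding birth certificate?" cases, so a real pipeline would likely report them separately rather than drop them.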
Conceptual Architecture
[Figure: conceptual architecture in two parts, preservation and data analytics. Labelled elements: Digital Archivist, Repository, GRO records as RDF, Linker, Links, Updater, Updates, triplestore, SPARQL endpoint / Linked Data Server, Analytics, Researcher.]
Links to external datasets: e.g., Logainm – a database of Irish historical and
contemporary place names to provide additional context.
Development of 2 ontologies
[Figure: a GRO triplestore and a second triplestore for data analysis, kept apart for separation of concerns.]
• Transformation from one model to another
– SPIN (SPARQL Inferencing Notation)
– SWRL / RuleML
– SPARQL Construct
– …
• Due to the sensitive nature of the data, data protection is key.
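The idea of transforming one model into another can be sketched in plain Python over triples. This mimics what a SPARQL CONSTRUCT rule does (match a pattern in the source graph, emit new triples for the target graph) without using a triplestore. The `ana:` prefix and the target class names are hypothetical; the slides do not name the classes of the analysis ontology.

```python
from typing import Iterable, Set, Tuple

Triple = Tuple[str, str, str]

RDF_TYPE = "rdf:type"

# Hypothetical mapping from record classes to analysis classes.
# The "ana:" names are invented for illustration only.
CLASS_MAP = {
    "irl:BirthRecord": "ana:Birth",
    "irl:DeathRecord": "ana:Death",
}


def transform(source: Iterable[Triple]) -> Set[Triple]:
    """Emit one typing triple in the analysis model for every resource
    whose class in the records model has an entry in CLASS_MAP.

    Roughly what this CONSTRUCT rule would do for one class:
        CONSTRUCT { ?r a ana:Death } WHERE { ?r a irl:DeathRecord }
    """
    target = set()
    for s, p, o in source:
        if p == RDF_TYPE and o in CLASS_MAP:
            target.add((s, RDF_TYPE, CLASS_MAP[o]))
    return target
```

Only triples that a rule explicitly constructs reach the target set. In a design that keeps two stores apart, that property is one way to stop sensitive fields from leaking into the analysis side by default.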
Development of 2 ontologies
• 2 ontologies were developed – separation of concerns
• First ontology for describing the contents of records
– OWL 2 shallow, “flat ontology”
• Second ontology for data analysis
– OWL 2 + rules
– Rules to capture background and domain knowledge
– Developed by having the historians formulate competency
questions (Grüninger and Fox)
– Captured graphically using Object Role Modelling (ORM)
Graphical Representation in ORM
### Prefixes omitted …
irl:Record a owl:Class ;
    rdfs:label "Record" ; .
irl:Certificate a owl:Class ;
    rdfs:label "Certificate" ;
    rdfs:subClassOf irl:Record ; .
irl:BirthRecord a owl:Class ;
    rdfs:label "Birth Record" ;
    rdfs:subClassOf irl:Certificate ; .
irl:DeathRecord a owl:Class ;
    rdfs:label "Death Record" ;
    rdfs:subClassOf irl:Certificate ; .
irl:MarriageRecord a owl:Class ;
    rdfs:label "Marriage Record" ;
    rdfs:subClassOf irl:Record ; .
irl:Return a owl:Class ;
    rdfs:label "Return" ; .
…
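To make the hierarchy in this excerpt easier to read, the sketch below copies its `rdfs:subClassOf` statements into a dictionary and walks them upward. It covers only the classes shown above; the elided part of the ontology may add more, and a class with several superclasses would need a different structure than this single-parent mapping.

```python
from typing import Dict, List

# rdfs:subClassOf statements copied from the excerpt above.
SUBCLASS_OF: Dict[str, str] = {
    "irl:Certificate": "irl:Record",
    "irl:BirthRecord": "irl:Certificate",
    "irl:DeathRecord": "irl:Certificate",
    "irl:MarriageRecord": "irl:Record",
}


def superclasses(cls: str) -> List[str]:
    """Return all superclasses of `cls`, nearest first, by following
    the subClassOf links until a class with no parent is reached."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain
```

With the statements as listed, a birth record is a certificate and therefore a record, while a marriage record is a record but not a certificate, and `irl:Return` has no declared superclass.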
Conclusions
• Presented the problem and highlighted the
challenges
• Developed two ontologies
– Encoding the contents of digitised GRO records for
long-term digital preservation (DRI)
– Data analytics to answer the researcher’s
questions – in this case, a historian’s
• Data exploration and annotation of the
records started on a subset of the dataset