A Semantic Knowledge Graph at
National Library Board Singapore
BIBFRAME Workshop in Europe 2024
17th September 2024 - Helsinki
Richard Wallis
Evangelist and Founder
Data Liberate
richard.wallis@dataliberate.com
Independent Consultant, Evangelist & Founder
W3C Community Groups:
• Bibframe2Schema (Chair) – Standardised conversion path(s)
• Schema Bib Extend (Chair) - Bibliographic data
• Schema Architypes (Chair) - Archives
• Financial Industry Business Ontology – Financial schema.org
• Tourism Structured Web Data (Co-Chair)
• Schema Course Extension
• Schema IoT Community
• Educational & Occupational Credentials in Schema.org
richard.wallis@dataliberate.com — @dataliberate
40+ Years – Computing
30+ Years – Cultural Heritage technology
20+ Years – Semantic Web & Linked Data
Worked With:
• Google – Schema.org vocabulary, site, extensions, documentation and community
• OCLC – Global library cooperative
• FIBO – Financial Industry Business Ontology Group
• Various Clients – Implementing/understanding Linked Data, Schema.org:
National Library Board Singapore
British Library — Stanford University — Europeana
Agenda for today
• National Library and their resources
• Knowledge Graph ambition
• Linked Data Management System – the LDMS delivered
• Continued development
– Data sharing with the Entity Data Service
– User experience enrichment – the sidebar API
– Data quality enhancement utilizing external authorities
National Library Board Singapore
Public Libraries
• Network of 28 Public Libraries, including 2 partner libraries*
• Reading Programmes and Initiatives
• Programmes and Exhibitions targeted at Singapore communities
National Archives
• Transferred from NHB to NLB in Nov 2012
• Custodian of Singapore’s Collective Memory: Responsible for Collection, Preservation and Management of Singapore’s Public and Private Archival Records
• Promotes Public Interest in our Nation’s History and Heritage
National Library
• Preserving Singapore’s Print and Literary Heritage, and Intellectual memory
• Reference Collections
• Legal Deposit (including electronic)
*Partner libraries are libraries which are partner owned and funded but managed by NLB/NLB’s subsidiary Libraries and Archives Solutions Pte Ltd. Library@Chinatown and the Lifelong Learning Institute Library are Partner libraries.
Reference Collection
• Over 560,000 Singapore & SEA items
• Over 147,000 Chinese, Malay & Tamil Languages items
• Over 62,000 Social Sciences & Humanities items
• Over 39,000 Science & Technology items
• Over 53,000 Arts items
• Over 19,000 Rare Materials items
Archival Materials
• Over 290,000 Government files & Parliament papers
• Over 190,000 Audiovisual & sound recordings
• Over 70,000 Maps & building plans
• Over 1.14m Photographs
• Over 35,000 Oral history interviews
• Over 55,000 Speeches & press releases
• Over 7,000 Posters
Lending Collection
• Over 5m print collection
• Over 2.4m music tracks
• 78 databases
• Over 7,400 e-newspapers and e-magazines titles
• Over 8,000 e-learning courses
• Over 1.7m e-books and audio books
National Library Board Online Services
The Ambition
• To enable the discovery & display of entities from different sources
in a combined interface
• To bring together resources physical and digital
• To bring together diverse systems across the National Library,
National Archives, and Public Libraries in a Linked Data Environment
• To provide a staff interface to view and manage all entities, their
descriptions and relationships
The Ambition – Technical Challenges
• To produce a Knowledge Graph that is [daily] up to date
• Not to replace current cataloguing processes & practices
– MARC cataloguing in the ILS
– TTE maintenance in authority control
– Dublin Core content management for CMS sites and Archives
• Data sharable with the world
– Linked Open Data
– Schema.org
• An aggregated source of truth
Contract Awarded
metaphactory platform
Low-code knowledge graph platform
Semantic knowledge modeling
Semantic search & discovery
AWS Partner
Public sector partner
Singapore based
Linked Data, Structured data, Semantic
Web, bibliographic meta data, Schema.org
and management systems consultant
Basic Data Model
• Linked Data
– BIBFRAME to capture detail of bibliographic records
– Schema.org to deliver structured data for search engines
– Schema.org representation of CMS, NAS, TTE data
– Schema.org enrichment of BIBFRAME
• Schema.org as the ‘lingua franca’ vocabulary of the Knowledge graph
– All entities described using Schema.org as a minimum.
Data Data Data!
Data Source    Source Records    Entity Count    Update Frequency
ILS            1.4m              7.9m            Daily
CMS            82k               228k            Weekly
NAS            1.6m              6.7m            Monthly
TTE            3k                317k            Monthly
Total          3.1m              15.15m
Data Ingest Pipelines
• Triggered by data upload from source system
• ILS – daily
– MARC-XML processed through open source scripts:
• marc2bibframe2 – Library of Congress
• bibframe2schema – Bibframe2Schema.org
• TTE Authorities – Monthly
– Bespoke CSV conversion
• CMS & NAS – Weekly / Monthly
– Dublin Core to Schema.org
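The Dublin Core to Schema.org step for CMS and NAS records could be sketched roughly as follows. This is an illustration only, not the NLB pipeline: the property mapping, the `dc_to_schema` helper and the sample record are all assumptions made for this example, since the slides do not list the actual conversion rules.

```python
# Illustrative only: a hypothetical Dublin Core -> Schema.org property mapping.
# The real NLB conversion rules are not described in the slides.
DC_TO_SCHEMA = {
    "dc:title": "name",
    "dc:creator": "creator",
    "dc:subject": "about",
    "dc:description": "description",
    "dc:date": "dateCreated",
    "dc:identifier": "identifier",
}


def dc_to_schema(record, schema_type="CreativeWork"):
    """Convert a flat Dublin Core record (dict) into a Schema.org JSON-LD dict."""
    doc = {"@context": "https://schema.org", "@type": schema_type}
    for dc_key, value in record.items():
        schema_key = DC_TO_SCHEMA.get(dc_key)
        if schema_key is None:
            continue  # unmapped elements are skipped in this sketch
        doc[schema_key] = value
    return doc


# Hypothetical sample record.
sample = {
    "dc:title": "Example photograph",
    "dc:creator": "Unknown",
    "dc:format": "image/jpeg",  # not in the mapping, so it is dropped
}
converted = dc_to_schema(sample)
```

A production conversion would also need to choose a more specific Schema.org type per record and decide what to do with unmapped elements rather than silently dropping them.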
Technical Architecture (simplified)
Hosted on Amazon Web Services
[Architecture diagram. Components shown: SOURCE DATA IMPORT; Batch Scripts (import control, etc.); Pipeline processing; GraphDB Cluster (label appears four times); EDS; DMI]
A need for entity reconciliation …..
• Lots (and lots and lots) of source entities – 10 million entities
• Lots of duplication
– Lee, Kuan Yew – 1st Prime Minister of Singapore
• 160 individual entities in ILS source data
– Singapore Art Museum
• Entities from source data
• 21 CMS, 1 NAS, 66 ILS, 1 TTE
• Users only want 1 of each!
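One simple way to surface candidate duplicates like these is to group source entities on a normalised form of their name. The sketch below assumes hypothetical records with `source`, `id` and `name` fields and a deliberately naive normalisation (sorting name tokens can conflate genuinely different names). The slides do not describe the reconciliation rules actually used.

```python
import re
import unicodedata
from collections import defaultdict


def normalise(name):
    """Lower-case, strip accents and punctuation, and sort the name tokens."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(sorted(tokens))


def group_candidates(entities):
    """Group source entity IDs whose normalised names are identical."""
    groups = defaultdict(list)
    for entity in entities:
        groups[normalise(entity["name"])].append(entity["id"])
    return dict(groups)


# Hypothetical source entities, for illustration only.
entities = [
    {"source": "ILS", "id": "ils-1", "name": "Lee, Kuan Yew"},
    {"source": "ILS", "id": "ils-2", "name": "Lee Kuan Yew"},
    {"source": "CMS", "id": "cms-1", "name": "Singapore Art Museum"},
]
groups = group_candidates(entities)
```

Each resulting group is only a candidate set; dates, identifiers and entity types would normally be checked before treating the members as the same thing.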
Adaptive Data Model Concepts
• Source entities
– Individual representation of source data
• Aggregation entities
– Tracking relationships between source entities for the same thing
– No copying of attributes
• Primary Entities
– Searchable by users
– Displayable to users
– Consolidation of aggregated source data & managed attributes
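The three layers can be pictured with a small data-structure sketch. The class names, fields and the precedence rule (first source to supply a value wins, managed attributes override) are assumptions for illustration; the slides state only that aggregation entities track relationships without copying attributes and that primary entities consolidate aggregated source data with managed attributes.

```python
from dataclasses import dataclass, field


@dataclass
class SourceEntity:
    uri: str
    source: str       # e.g. "ILS", "CMS", "NAS", "TTE"
    attributes: dict


@dataclass
class AggregationEntity:
    # Holds only references to source entities; no attributes are copied.
    uri: str
    source_uris: list = field(default_factory=list)


def consolidate(aggregation, sources_by_uri, managed=None):
    """Build a primary-entity view from aggregated sources plus managed attributes."""
    primary = {}
    for uri in aggregation.source_uris:
        for key, value in sources_by_uri[uri].attributes.items():
            primary.setdefault(key, value)  # first source to supply a value wins
    primary.update(managed or {})           # curator-managed values override
    return primary


# Hypothetical data, for illustration only.
sources = {
    "src:1": SourceEntity("src:1", "ILS", {"name": "Singapore Art Museum"}),
    "src:2": SourceEntity("src:2", "CMS", {"name": "Singapore Art Museum (SAM)",
                                           "url": "https://example.org/sam"}),
}
agg = AggregationEntity("agg:1", ["src:1", "src:2"])
primary = consolidate(agg, sources, managed={"description": "Curated description"})
```

Because the aggregation entity stores only URIs, a change to a source record is picked up the next time the primary view is consolidated, without touching the aggregation itself.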
The entity iceberg
[Iceberg diagram. Entity layers: Primary, Aggregation, Source. Accompanying labels: Discovery, Management, Ingestion Pipelines]
The NLB Knowledge Graph
• 666M Triples
• 10M Source Entities
• 5.8M Primary Entities
– Aggregation of source derived entities
– Searchable
– Shared with world
NLB Linked Data Management System (LDMS)
• Powered by the Knowledge Graph
• Updated daily
• A new separate environment built on established systems
• No changes in cataloguing practices
• No cataloguer retraining
• Not just the bibliographic (MARC) data
• No replacement systems – to implement Linked Data
– MARC based ILS swap out occurred mid project – without LDMS impact
• Delivering Linked Data benefits back into the organization
Building on the Knowledge Graph
Entity Data Service
• Open Linked Data interface
• Dereferencing entity URIs
• Content negotiation for RDF/XML / JSON-LD / Turtle / N-Triples
• Download formats RDF/XML / JSON-LD / Turtle / N-Triples
• Embedded Schema.org
• Enhanced navigation
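Content negotiation over the four serialisations listed above can be sketched as a small function that reads an HTTP `Accept` header. The media types are the standard registered ones for these formats; the `negotiate` function, its Turtle default and its handling of wildcards (ignored) are assumptions for this sketch and say nothing about how the Entity Data Service is implemented.

```python
# Registered media types for the four serialisations named on the slide.
FORMATS = {
    "application/rdf+xml": "RDF/XML",
    "application/ld+json": "JSON-LD",
    "text/turtle": "Turtle",
    "application/n-triples": "N-Triples",
}


def negotiate(accept_header, default="text/turtle"):
    """Pick the supported media type with the highest q-value in an Accept header."""
    best, best_q = None, 0.0
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type, q = fields[0].lower(), 1.0
        for param in fields[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # malformed q-value: treat as not acceptable
        if media_type in FORMATS and q > best_q:
            best, best_q = media_type, q
    return best or default  # fall back when nothing supported was requested


chosen = negotiate("text/html, application/ld+json;q=0.9, text/turtle;q=0.5")
```

A client would typically send such a header when dereferencing an entity URI and receive the same entity description in whichever serialisation was selected.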
Building on the Knowledge Graph
Enriching the User Journey
• Systems are often silos
• User search and navigation constrained to the data in the silo
• Knowledge Graph populated from several individual systems
• Entities aggregated and related across system sources
• The fuel to explore between systems
• Via a navigational interface sidebar – bridging the silos
• Plugged into user interface
• Powered by a JavaScript Sidebar API
Use of the JavaScript Sidebar API
→ API call to KG – article ID passed as parameter
← Description of associated Primary entity returned
Description includes list of ‘about’ related entity IDs used to build display and navigation links
Clicking sidebar links triggers new API calls to rebuild the sidebar display as entity relationships are followed
Knowledge Graph navigation via a sidebar
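The request and response cycle described above can be mocked up as follows. The delivered API is JavaScript; this sketch is in Python purely for illustration, with an in-memory dictionary standing in for the Knowledge Graph. Every identifier, field name and function here (`KG`, `describe`, `build_sidebar`) is hypothetical.

```python
# Stand-in for the Knowledge Graph; all IDs and field names are hypothetical.
KG = {
    "primary-1": {"name": "Example topic", "about": ["primary-2"]},
    "primary-2": {"name": "Example related entity", "about": []},
}
# An article ID resolves to the description of its associated Primary entity.
ARTICLE_TO_PRIMARY = {"article-123": "primary-1"}


def describe(identifier):
    """Stub for the API call: return the Primary entity description for an ID."""
    primary_id = ARTICLE_TO_PRIMARY.get(identifier, identifier)
    return KG[primary_id]


def build_sidebar(identifier):
    """Build the sidebar model: a heading plus one link per 'about' entity."""
    entity = describe(identifier)
    links = [{"label": describe(rel)["name"], "target": rel}
             for rel in entity["about"]]
    return {"heading": entity["name"], "links": links}


sidebar = build_sidebar("article-123")                   # initial call with the article ID
followed = build_sidebar(sidebar["links"][0]["target"])  # user clicks the first link
```

Each click simply repeats the same call with the target entity ID, which is what lets the sidebar walk relationships that the host system itself knows nothing about.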
KG Quality Enhancement from Authorities
LCNAF URI Ingestion
• For Person / Organization entities with LCNAF URIs
• Created via the marc2bibframe2 scripts - from $0 subfield
• Create rdfs:label values from the MARC record
e.g. 700$a + 700$d
• These values are not controlled – entity can have several different labels
• Use LCNAF authority data to introduce naming consistency
• Lookup against LCNAF to identify & ingest authoritative version
• LCNAF values take precedence in primary entity consolidation
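A minimal sketch of why uncontrolled labels diverge and how an authoritative label can take precedence. The `marc_label` and `preferred_label` helpers, the fallback rule and the sample subfield values are assumptions for illustration and are not taken from the marc2bibframe2 scripts or the LDMS.

```python
def marc_label(subfields):
    """Join 700$a and 700$d into a label, following the slide's example."""
    parts = [subfields.get("a", "").strip(), subfields.get("d", "").strip()]
    return " ".join(p for p in parts if p)


def preferred_label(record_labels, lcnaf_label=None):
    """Prefer the LCNAF authoritative label; otherwise fall back to a record label."""
    if lcnaf_label:
        return lcnaf_label
    # Arbitrary but deterministic fallback for this sketch.
    return sorted(set(record_labels))[0] if record_labels else None


# Two hypothetical records carrying the same LCNAF URI but different name strings.
labels = [
    marc_label({"a": "Lee, Kuan Yew,", "d": "1923-2015"}),
    marc_label({"a": "Lee, Kuan Yew", "d": "1923-"}),
]
# Illustrative authoritative value, as it might be returned by an LCNAF lookup.
chosen = preferred_label(labels, lcnaf_label="Lee, Kuan Yew, 1923-2015")
```

Without the authority lookup the entity would keep both labels, and any choice between them would be arbitrary.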
Quality Enrichment from Authorities
LCNAF URI Ingestion
[Slide example, images not captured: two MARC XML extracts, each with its Bibframe RDF conversion, and the entity result in the Knowledge Graph]
Which is correct?
Ingest from LCNAF and give precedence in consolidation
Quality Enhancement from Authorities
LCNAF Person & Organization Name Matching
• For all Person and Organization primary entities
• Perform a string-matching LCNAF lookup for schema:name values
• Automatic background process
• If exact match
– Ingest LCNAF entity – takes precedence in consolidation
• If close match
– Add to list of match candidates
– [Human] curator either accepts as a match or not
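The exact / close / no-match triage can be sketched with Python's standard `difflib`. This is a stand-in technique: the slides say only that a string-matching lookup is used, so the similarity measure, the 0.85 threshold and the function names below are assumptions.

```python
from difflib import SequenceMatcher


def classify_match(name, lcnaf_label, close_threshold=0.85):
    """Return 'exact', 'close' or 'none' for a name against an LCNAF label."""
    if name == lcnaf_label:
        return "exact"
    ratio = SequenceMatcher(None, name.lower(), lcnaf_label.lower()).ratio()
    return "close" if ratio >= close_threshold else "none"


def triage(name, lcnaf_labels):
    """Split LCNAF labels into one to ingest and a list for curator review."""
    ingest, candidates = None, []
    for label in lcnaf_labels:
        outcome = classify_match(name, label)
        if outcome == "exact" and ingest is None:
            ingest = label            # ingested automatically
        elif outcome == "close":
            candidates.append(label)  # queued for a human curator
    return ingest, candidates


# Hypothetical lookup results for one schema:name value.
ingest, candidates = triage(
    "Singapore Art Museum",
    ["Singapore Art Museum", "Singapore Art Museum.", "British Library"],
)
```

Keeping close matches out of the automatic path is the design point: only exact matches change the graph unattended, and everything else waits for a curator's decision.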
NLB Linked Data Management System (LDMS)
• 2 years in development
• Live and operational for 1.5 years
• Built on a 666M triple Knowledge Graph
• Automatically updated daily
• Using Bibframe & Schema.org
• Built on – not replacing – established systems & practices
• A Linked Data Service for NLB
– Utilizing external authorities to enrich and standardize descriptions
– Part of Open Linked Data Cloud – via Entity Data Service
– Enriching user journeys on non-linked data systems – via sidebar API

More Related Content

PPTX
Facts Behind Traditional Practices
PPTX
The indus valley
PDF
Building a Semantic Knowledge Graph split.pdf
PDF
From Ambition to Go Live SWIB.pdf
PDF
From Ambition to Go Live
PPTX
131205 KU Leuven and the LIBISnet consortium on the way to the next generatio...
PDF
ESWC 2017 Tutorial Knowledge Graphs
PPT
Marc and beyond: 3 Linked Data Choices
Facts Behind Traditional Practices
The indus valley
Building a Semantic Knowledge Graph split.pdf
From Ambition to Go Live SWIB.pdf
From Ambition to Go Live
131205 KU Leuven and the LIBISnet consortium on the way to the next generatio...
ESWC 2017 Tutorial Knowledge Graphs
Marc and beyond: 3 Linked Data Choices

Similar to Building a Semantic Knowledge Graph - web.pdf (20)

PPTX
Linked Open Data and The Digital Archaeological Workflow at the Swedish Natio...
PPTX
PDF
Open source glam tools for building sustainable cultural heritage and digital...
PPTX
Preservation of Research Data: Dataverse / Archivematica Integration by Allan...
PPTX
Development of a MODS-RDF Cataloguing Tool for Information Professionals CONU...
PPTX
‘Development of a MODS-RDF Cataloguing Tool for the Digital Resources and Ima...
PDF
Archiving the French Web: the BnF web archiving workflow. Sara Aubry
PDF
Achille Felicetti "Introduction to the Ariadne winter school and to the ARIAD...
PDF
Why do you consider to adopt Koha Open Source Integrated Library System for y...
PDF
Linked Open Data: Identifying Opportunities
PPTX
Scaling up Linked Data
PPTX
Evaluating and Selecting Library Services PlatformNEW
PPSX
INNOVATION AND ‎RESEARCH (Digital Library ‎Information Access)‎
PPTX
Shared Print in the Orbis Cascade Alliance and Colorado Alliance (Levine-Clark)
PPTX
150310 Implementing Alma for LIBISnet
PPTX
Successful E-Resource Acquisitions: Looking Beyond Selecting, Ordering, Payin...
PPTX
Scaling up Linked Data
PPTX
The Canadian Linked Data Initiative: Charting a Path to a Linked Data Future
PDF
Industry@RuleML2015 DataGraft
Linked Open Data and The Digital Archaeological Workflow at the Swedish Natio...
Open source glam tools for building sustainable cultural heritage and digital...
Preservation of Research Data: Dataverse / Archivematica Integration by Allan...
Development of a MODS-RDF Cataloguing Tool for Information Professionals CONU...
‘Development of a MODS-RDF Cataloguing Tool for the Digital Resources and Ima...
Archiving the French Web: the BnF web archiving workflow. Sara Aubry
Achille Felicetti "Introduction to the Ariadne winter school and to the ARIAD...
Why do you consider to adopt Koha Open Source Integrated Library System for y...
Linked Open Data: Identifying Opportunities
Scaling up Linked Data
Evaluating and Selecting Library Services PlatformNEW
INNOVATION AND ‎RESEARCH (Digital Library ‎Information Access)‎
Shared Print in the Orbis Cascade Alliance and Colorado Alliance (Levine-Clark)
150310 Implementing Alma for LIBISnet
Successful E-Resource Acquisitions: Looking Beyond Selecting, Ordering, Payin...
Scaling up Linked Data
The Canadian Linked Data Initiative: Charting a Path to a Linked Data Future
Industry@RuleML2015 DataGraft
Ad

Building a Semantic Knowledge Graph - web.pdf

  • 1. A Semantic Knowledge Graph at National Library Board Singapore
      BIBFRAME Workshop in Europe 2024, 17th September 2024, Helsinki
      Richard Wallis, Evangelist and Founder, Data Liberate
      richard.wallis@dataliberate.com
  • 2. Independent Consultant, Evangelist & Founder
      W3C Community Groups:
      • Bibframe2Schema (Chair) – Standardised conversion path(s)
      • Schema Bib Extend (Chair) – Bibliographic data
      • Schema Architypes (Chair) – Archives
      • Financial Industry Business Ontology – Financial schema.org
      • Tourism Structured Web Data (Co-Chair)
      • Schema Course Extension
      • Schema IoT Community
      • Educational & Occupational Credentials in Schema.org
      richard.wallis@dataliberate.com, @dataliberate
      40+ Years – Computing
      30+ Years – Cultural Heritage technology
      20+ Years – Semantic Web & Linked Data
      Worked With:
      • Google – Schema.org vocabulary, site, extensions, documentation and community
      • OCLC – Global library cooperative
      • FIBO – Financial Industry Business Ontology Group
      • Various Clients – Implementing/understanding Linked Data, Schema.org: National Library Board Singapore, British Library, Stanford University, Europeana
  • 10. Agenda for today
      • National Library and their resources
      • Knowledge Graph ambition
      • Linked Data Management System – the LDMS delivered
      • Continued development
          – Data sharing with the Entity Data Service
          – User experience enrichment – the sidebar API
          – Data quality enhancement utilizing external authorities
  • 11. National Library Board Singapore
      • Public Libraries
          – Network of 28 Public Libraries, including 2 partner libraries*
          – Reading Programmes and Initiatives
          – Programmes and Exhibitions targeted at Singapore communities
      • National Archives
          – Transferred from NHB to NLB in Nov 2012
          – Custodian of Singapore’s Collective Memory: Responsible for Collection, Preservation and Management of Singapore’s Public and Private Archival Records
          – Promotes Public Interest in our Nation’s History and Heritage
      • National Library
          – Preserving Singapore’s Print and Literary Heritage, and Intellectual memory
          – Reference Collections
          – Legal Deposit (including electronic)
      *Partner libraries are libraries which are partner owned and funded but managed by NLB/NLB’s subsidiary Libraries and Archives Solutions Pte Ltd. Library@Chinatown and the Lifelong Learning Institute Library are Partner libraries.
  • 12. National Library Board Singapore
      • Reference Collection
          – Over 560,000 Singapore & SEA items
          – Over 147,000 Chinese, Malay & Tamil Languages items
          – Over 62,000 Social Sciences & Humanities items
          – Over 39,000 Science & Technology items
          – Over 53,000 Arts items
          – Over 19,000 Rare Materials items
      • Archival Materials
          – Over 290,000 Government files & Parliament papers
          – Over 190,000 Audiovisual & sound recordings
          – Over 70,000 Maps & building plans
          – Over 1.14m Photographs
          – Over 35,000 Oral history interviews
          – Over 55,000 Speeches & press releases
          – Over 7,000 Posters
      • Lending Collection
          – Over 5m print collection
          – Over 2.4m music tracks
          – 78 databases
          – Over 7,400 e-newspaper and e-magazine titles
          – Over 8,000 e-learning courses
          – Over 1.7m e-books and audio books
  • 13. National Library Board Online Services
  • 17. The Ambition
      • To enable the discovery & display of entities from different sources in a combined interface
      • To bring together resources physical and digital
      • To bring together diverse systems across the National Library, National Archives, and Public Libraries in a Linked Data Environment
      • To provide a staff interface to view and manage all entities, their descriptions and relationships
  • 24. The Ambition – Technical Challenges
      • To produce a Knowledge Graph that is [daily] up to date
      • Not to replace current cataloging processes & practices
          – MARC cataloguing in the ILS
          – TTE maintenance in authority control
          – Dublin Core content management for CMS sites and Archives
      • Data sharable with the world
          – Linked Open Data
          – Schema.org
      • An aggregated source of truth
  • 25. Contract Awarded
      • metaphactory platform: Low-code knowledge graph platform; Semantic knowledge modeling; Semantic search & discovery
      • AWS Partner; Public sector partner; Singapore based
      • Linked Data, Structured data, Semantic Web, bibliographic metadata, Schema.org and management systems consultant
  • 30. Basic Data Model
      • Linked Data
          – BIBFRAME to capture detail of bibliographic records
          – Schema.org to deliver structured data for search engines
          – Schema.org representation of CMS, NAS, TTE data
          – Schema.org enrichment of BIBFRAME
      • Schema.org as the ‘lingua franca’ vocabulary of the Knowledge graph
          – All entities described using Schema.org as a minimum.
  • 31. Data Data Data!
      Data Source   Source Records   Entity Count   Update Frequency
      ILS           1.4m             7.9m           Daily
      CMS           82k              228k           Weekly
      NAS           1.6m             6.7m           Monthly
      TTE           3k               317k           Monthly
      Total         3.1m             15.15m
  • 38. Data Ingest Pipelines
      • Triggered by data upload from source system
      • ILS – daily
          – MARC-XML parsed through Open Source scripts:
              • Marc2bibframe2 – Library of Congress
              • Bibframe2schema – Bibframe2Schema.org
      • TTE Authorities – Monthly
          – Bespoke CSV conversion
      • CMS & NAS – Weekly / Monthly
          – Dublin Core to Schema.org
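
The per-source routing on this slide can be pictured as a small dispatch table. The sketch below is only an illustration under stated assumptions: the `Batch` structure, the `on_upload` function, and the `csv_to_schema` and `dublin_core_to_schema` step names are hypothetical, and every step is a placeholder that merely records which vocabulary it would add. The real Marc2bibframe2 and Bibframe2schema scripts are not invoked here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Batch:
    """One upload from a source system (hypothetical structure)."""
    source: str
    payload: str
    vocabularies: List[str] = field(default_factory=list)


Step = Callable[[Batch], Batch]


def marc2bibframe2(batch: Batch) -> Batch:
    # Placeholder for the Library of Congress MARC-XML to BIBFRAME conversion.
    batch.vocabularies.append("BIBFRAME")
    return batch


def bibframe2schema(batch: Batch) -> Batch:
    # Placeholder for the Bibframe2Schema.org step: BIBFRAME is kept and
    # Schema.org is added alongside it.
    batch.vocabularies.append("Schema.org")
    return batch


def csv_to_schema(batch: Batch) -> Batch:
    # Placeholder for the bespoke CSV conversion of TTE authorities.
    batch.vocabularies.append("Schema.org")
    return batch


def dublin_core_to_schema(batch: Batch) -> Batch:
    # Placeholder for the Dublin Core to Schema.org mapping (CMS and NAS).
    batch.vocabularies.append("Schema.org")
    return batch


# Which conversion steps run for each source system, in order.
PIPELINES: Dict[str, List[Step]] = {
    "ILS": [marc2bibframe2, bibframe2schema],
    "TTE": [csv_to_schema],
    "CMS": [dublin_core_to_schema],
    "NAS": [dublin_core_to_schema],
}


def on_upload(source: str, payload: str) -> Batch:
    """Run the pipeline registered for the uploading source system."""
    batch = Batch(source=source, payload=payload)
    for step in PIPELINES[source]:
        batch = step(batch)
    return batch
```

Keeping the routing in one table means a source system can be swapped without touching the other pipelines, which fits the slide's point that each pipeline is triggered independently by its own upload.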
  • 43. Technical Architecture (simplified)
      [Diagram, hosted on Amazon Web Services. Labels: Source data import; Batch scripts, import control, etc.; Pipeline processing; GraphDB Cluster (label repeated four times); EDS; DMI]
  • 47. A need for entity reconciliation …
      • Lots (and lots and lots) of source entities
          – 10 million entities
      • Lots of duplication
          – Lee, Kuan Yew – 1st Prime Minister of Singapore
              • 160 individual entities in ILS source data
          – Singapore Art Museum
              • Entities from source data
              • 21 CMS, 1 NAS, 66 ILS, 1 TTE
      • Users only want 1 of each!
  • 50. Adaptive Data Model Concepts
      • Source entities
          – Individual representation of source data
      • Aggregation entities
          – Tracking relationships between source entities for the same thing
          – No copying of attributes
      • Primary Entities
          – Searchable by users
          – Displayable to users
          – Consolidation of aggregated source data & managed attributes
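
A minimal sketch of how the three entity layers might relate, assuming a simple in-memory model. The class names, the `consolidate` function, the example URIs and attribute values, and the rule that managed attributes and higher-precedence sources win are all assumptions made for illustration; the slides do not describe the actual consolidation logic.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SourceEntity:
    uri: str
    source: str                 # e.g. "ILS", "CMS", "NAS", "TTE"
    attributes: Dict[str, str]  # individual representation of source data


@dataclass
class AggregationEntity:
    # Only tracks which source entities describe the same thing;
    # no attributes are copied here.
    uri: str
    source_uris: List[str] = field(default_factory=list)


@dataclass
class PrimaryEntity:
    uri: str
    attributes: Dict[str, str]  # what users search and see


def consolidate(primary_uri: str,
                aggregation: AggregationEntity,
                sources: Dict[str, SourceEntity],
                managed: Dict[str, str],
                precedence: List[str]) -> PrimaryEntity:
    """Merge source attributes; earlier names in `precedence` win (assumed rule)."""
    def rank(entity: SourceEntity) -> int:
        if entity.source in precedence:
            return precedence.index(entity.source)
        return len(precedence)

    members = [sources[uri] for uri in aggregation.source_uris]
    merged: Dict[str, str] = {}
    # Apply lowest-precedence sources first so higher ones overwrite them.
    for entity in sorted(members, key=rank, reverse=True):
        merged.update(entity.attributes)
    merged.update(managed)  # staff-managed attributes applied last (assumption)
    return PrimaryEntity(uri=primary_uri, attributes=merged)


# Invented example data, for illustration only.
SOURCES = {
    "https://example.org/src/ils/1": SourceEntity(
        "https://example.org/src/ils/1", "ILS",
        {"name": "Singapore Art Museum.", "type": "Organization"}),
    "https://example.org/src/tte/1": SourceEntity(
        "https://example.org/src/tte/1", "TTE",
        {"name": "Singapore Art Museum"}),
}
AGGREGATION = AggregationEntity("https://example.org/agg/1", list(SOURCES))
PRIMARY = consolidate("https://example.org/primary/1", AGGREGATION, SOURCES,
                      managed={"description": "Curated description"},
                      precedence=["TTE", "ILS"])
```

Because the aggregation entity holds only references, a change in any source entity can be picked up by re-running the consolidation rather than by editing copied data.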
  • 72. The NLB Knowledge Graph
      • 666M Triples
      • 10M Source Entities
      • 5.8M Primary Entities
          – Aggregation of source derived entities
          – Searchable
          – Shared with world
  • 81. NLB Linked Data Management System (LDMS)
      • Powered by the Knowledge Graph
      • Updated daily
      • A new separate environment built on established systems
      • No changes in cataloguing practices
      • No cataloguer retraining
      • Not just the bibliographic (MARC) data
      • No replacement systems – to implement Linked Data
          – MARC based ILS swap out occurred mid project – without LDMS impact
      • Delivering Linked Data benefits back into the organization
  • 87. Building on the Knowledge Graph: Entity Data Service
      • Open Linked Data interface
      • Dereferencing entity URIs
      • Content negotiation for RDF/XML / JSON-LD / Turtle / N-Triples
      • Download formats RDF/XML / JSON-LD / Turtle / N-Triples
      • Embedded Schema.org
      • Enhanced navigation
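
Content negotiation over the four listed serialisations can be sketched as below. The media types are the standard registered ones for these formats; the `negotiate` function, its simplified `Accept` parsing, and the choice of Turtle as the fallback for `*/*` are assumptions for illustration and say nothing about how the Entity Data Service is actually implemented.

```python
from typing import Optional

# Standard media types for the four serialisations named on the slide.
SUPPORTED = {
    "application/rdf+xml": "RDF/XML",
    "application/ld+json": "JSON-LD",
    "text/turtle": "Turtle",
    "application/n-triples": "N-Triples",
}


def negotiate(accept: str, default: str = "text/turtle") -> Optional[str]:
    """Pick the best supported media type for an Accept header.

    Returns None when nothing acceptable is supported. The default used
    for a missing header or "*/*" is an assumption of this sketch.
    """
    if not accept.strip():
        return default
    candidates = []
    for position, part in enumerate(accept.split(",")):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        # Sort by descending quality, then by position in the header.
        candidates.append((-quality, position, media_type))
    for neg_quality, _, media_type in sorted(candidates):
        if neg_quality == 0:
            continue  # q=0 marks a type as not acceptable
        if media_type in SUPPORTED:
            return media_type
        if media_type == "*/*":
            return default
    return None
```

For example, a client sending `Accept: application/ld+json` would be served JSON-LD for the dereferenced entity URI, while one sending only `text/html` gets no RDF match from this sketch.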
  • 99. Building on the Knowledge Graph: Enriching the User Journey
      • Systems are often silos
      • User search and navigation constrained to the data in the silo
      • Knowledge Graph populated from several individual systems
      • Entities aggregated and related across system sources
      • The fuel to explore between systems
      • Via a navigational interface sidebar – bridging the silos
      • Plugged into user interface
      • Powered by a JavaScript Sidebar API
  • 108. Use of the JavaScript Sidebar API: Knowledge Graph navigation via a sidebar
      → API call to KG – article ID passed as parameter
      ← Description of associated Primary entity returned
      • Description includes list of ‘about’ related entity IDs used to build display and navigation links
      • Clicking a sidebar link triggers a new API call to rebuild the sidebar display as entity relationships are followed
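
The request and response flow on this slide can be simulated as follows. The real sidebar is driven by a JavaScript API; this Python sketch only models the sequence of calls against an in-memory stand-in for the Knowledge Graph. All identifiers, field names, and function names (`call_api_for_article`, `call_api_for_entity`, `build_sidebar`, `click`) are hypothetical, and the data is invented.

```python
from typing import Dict, List

# Invented stand-in for the Knowledge Graph: article IDs map to a Primary
# entity, and each description lists the IDs of the entities it is 'about'.
ARTICLE_TO_ENTITY: Dict[str, str] = {"article-42": "entity-1"}
ENTITIES: Dict[str, dict] = {
    "entity-1": {"id": "entity-1", "name": "Example article",
                 "about": ["entity-2", "entity-3"]},
    "entity-2": {"id": "entity-2", "name": "Example person",
                 "about": ["entity-3"]},
    "entity-3": {"id": "entity-3", "name": "Example place", "about": []},
}


def call_api_for_article(article_id: str) -> dict:
    """-> article ID passed as parameter, <- Primary entity description."""
    return ENTITIES[ARTICLE_TO_ENTITY[article_id]]


def call_api_for_entity(entity_id: str) -> dict:
    """Follow-up call made when a sidebar link is clicked."""
    return ENTITIES[entity_id]


def build_sidebar(description: dict) -> dict:
    """Turn a description into a title plus one navigation link per 'about' ID."""
    links: List[dict] = []
    for entity_id in description["about"]:
        related = call_api_for_entity(entity_id)
        links.append({"entity_id": entity_id, "label": related["name"]})
    return {"title": description["name"], "links": links}


def click(link: dict) -> dict:
    """Clicking a link triggers a new API call and rebuilds the sidebar."""
    return build_sidebar(call_api_for_entity(link["entity_id"]))


# Initial page load, then the user follows the first link.
sidebar = build_sidebar(call_api_for_article("article-42"))
next_sidebar = click(sidebar["links"][0])
```

The host page only needs to know its own article ID; everything else in the sidebar comes from the entity relationships returned by each call.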
  • 115. KG Quality Enhancement from Authorities: LCNAF URI Ingestion
      • For Person / Organization entities with LCNAF URIs
      • Created via the marc2bibframe2 scripts – from $0 subfield
      • Create rdfs:label values from the MARC record, e.g. 700$a + 700$d
      • These values are not controlled – entity can have several different labels
      • Use LCNAF authority data to introduce naming consistency
      • Lookup against LCNAF to identify & ingest authoritative version
      • LCNAF values take precedence in primary entity consolidation
  • 124. Quality Enrichment from Authorities: LCNAF URI Ingestion
      [Slide images: two MARC XML records, the Bibframe RDF produced from each, and the resulting entity in the Knowledge Graph]
      • Which is correct?
      • Ingest from LCNAF and give precedence in consolidation
  • 130. Quality Enhancement from Authorities: LCNAF Person & Organization Name Matching
      • For all Person and Organization primary entities
      • Perform a string-matching LCNAF lookup for schema:name values
      • Automatic background process
      • If exact match
          – Ingest LCNAF entity – takes precedence in consolidation
      • If close match
          – Add to list of match candidates
          – [Human] curator either accepts as a match or not
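
The exact / close / no-match triage can be sketched with the standard library's `difflib`. This is an illustration only: the slides do not say which string-matching method or threshold the LDMS uses, so the `classify` function, the similarity measure, and the 0.9 threshold are assumptions, and the authority labels are passed in as a plain list rather than fetched from LCNAF.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def classify(name: str,
             authority_labels: List[str],
             close_threshold: float = 0.9) -> Tuple[str, List[str]]:
    """Triage a schema:name value against candidate authority labels.

    Returns ("exact", [label]) when a label is identical, ("close", labels)
    with the best candidates first when similarity reaches the threshold,
    and ("none", []) otherwise. The threshold is an assumed value.
    """
    close: List[Tuple[float, str]] = []
    for label in authority_labels:
        if label == name:
            # Exact match: the authority entity would be ingested automatically.
            return "exact", [label]
        score = SequenceMatcher(None, name.casefold(), label.casefold()).ratio()
        if score >= close_threshold:
            close.append((score, label))
    if close:
        # Close matches are queued for a human curator to accept or reject.
        ranked = sorted(close, reverse=True)
        return "close", [label for _, label in ranked]
    return "none", []
```

Only the exact branch acts without review; everything in the close branch stays a candidate until a curator decides, which mirrors the split between the automatic background process and the human step on the slide.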
  • 131. Quality Enrichment from Authorities LCNAF Person & Organization Name Matching
  • 142. NLB Linked Data Management System (LDMS)
      • 2 years in development
      • Live and operational for 1.5 years
      • Built on a 666M triple Knowledge Graph
      • Automatically updated daily
      • Using Bibframe & Schema.org
      • Built on – not replacing – established systems & practices
      • A Linked Data Service for NLB
          – Utilizing external authorities to enrich and standardize descriptions
          – Part of Open Linked Data Cloud – via Entity Data Service
          – Enriching user journeys on non-linked data systems – via sidebar API