The Research and Education Space
a pathway to bring our cultural heritage
(including the BBC archive) to life
Dr Chiara Del Vescovo
Data Architect at BBC
Vision
• Web-like
• Web-based
• Interlinking heterogeneous resources
• Capturing semantic interrelations
• Reliable, provably cleared for education
Linked Open Data
A pathway
users, developers
aggregating platform
BL, BM, BFI, Tate, V&A, …, BBC
RES (BBC, Jisc, BUFVC)
Core Platform: “Acropolis”
Project RES: Technical Approach
1. The crawler fetches data via HTTP from published sources. Once retrieved, it is indexed by the full-text store and passed to the aggregation engine for evaluation.
2. The results of the aggregation engine's evaluation process are stored in the aggregate store, which contains minimal browse information and information about the similarity of entities.
3. The public face of the core platform is an extremely basic browsing interface (which presents the data in tabular form to aid application developers), and read-write RESTful APIs.
4. Applications may use the APIs to locate information about aggregated entities, and also to store annotations and activity data.
5. Each component employs standard protocols and formats. For example, we can make use of any capable quad-store as our aggregate store.
Components: linked data crawler (Anansi), aggregation engine (Spindle), full-text store, aggregate store, minimal browse interface & APIs (Quilt), activity store
Acropolis (index!), informed by planned pilots
users, developers
BL, BM, BFI, Tate, V&A, …, BBC
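The aggregation step (point 2) can be pictured with a toy example. This is only a sketch under stated assumptions, not how Spindle is actually implemented: it assumes co-reference is asserted with owl:sameAs, keeps quads as plain tuples, and uses made-up example.org URIs.

```python
from collections import defaultdict

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def find(parent, node):
    """Return the representative of node's cluster, compressing the path."""
    parent.setdefault(node, node)
    while parent[node] != node:
        parent[node] = parent[parent[node]]
        node = parent[node]
    return node


def aggregate(quads):
    """Group subjects linked by owl:sameAs into clusters of equivalent entities.

    Each quad is (subject, predicate, object, graph); the graph names the
    source the statement was crawled from.
    """
    parent = {}
    for subject, predicate, obj, _graph in quads:
        find(parent, subject)
        if predicate == OWL_SAME_AS:
            # Merge the two clusters: both URIs describe the same thing.
            parent[find(parent, subject)] = find(parent, obj)
    clusters = defaultdict(set)
    for node in list(parent):
        clusters[find(parent, node)].add(node)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical crawled statements (all URIs are placeholders).
SAMPLE_QUADS = [
    ("http://example.org/bm/object/1", OWL_SAME_AS,
     "http://example.org/hub/thing/A", "http://example.org/bm/"),
    ("http://example.org/bl/item/9", OWL_SAME_AS,
     "http://example.org/hub/thing/A", "http://example.org/bl/"),
    ("http://example.org/tate/work/5", "http://purl.org/dc/terms/title",
     "Untitled", "http://example.org/tate/"),
]

for cluster in aggregate(SAMPLE_QUADS):
    print(cluster)
```

The two records that point at the same hub URI end up in one cluster, while the unlinked record stays on its own; an aggregate store would keep one entry per cluster.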
Acropolis
beta.acropolis.org.uk
What I do
(with my colleague Alex)
1. Devise a publishing scheme to determine URIs
2. Translate original metadata into RDF
3. Discover links and reconcile entities with “hubs” (e.g., LoC, GeoNames, DBpedia)
4. Make the existing schema explicit as a local ontology
5. Match the ontology onto well-established ontologies (e.g., DCMI, FOAF, SKOS, CIDOC-CRM)
6. Advise on how to express machine-readable licenses, for both resources and metadata
7. Provide technical support to publish LOD
BL, BM, BFI, Tate, V&A, …, BBC
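Steps 1 and 2 might look roughly like this for a single catalogue record. It is a minimal sketch, not the scheme used with any of the institutions above: the base URI, the `mint_uri` pattern and the record fields are all hypothetical, and Dublin Core Terms is just one possible target vocabulary.

```python
import re

BASE = "http://example.org/collection/"  # hypothetical base URI
DCT = "http://purl.org/dc/terms/"        # Dublin Core Terms namespace


def mint_uri(kind, local_id):
    """Build a stable URI from a record type and its local identifier."""
    slug = re.sub(r"[^a-z0-9]+", "-", str(local_id).lower()).strip("-")
    return f"{BASE}{kind}/{slug}"


def literal(value):
    """Serialise a plain string as an N-Triples literal."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def record_to_ntriples(record):
    """Translate a flat metadata record into N-Triples lines."""
    subject = mint_uri("object", record["id"])
    lines = [f"<{subject}> <{DCT}title> {literal(record['title'])} ."]
    if record.get("creator"):
        lines.append(f"<{subject}> <{DCT}creator> {literal(record['creator'])} .")
    return lines


# Hypothetical source record.
record = {"id": "MS Add. 4521", "title": 'The "Great" Map', "creator": "Unknown"}
print("\n".join(record_to_ntriples(record)))
```

The creator is still a literal here; swapping such literals for links to hub URIs is the job of step 3.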
DBPedialite
British Museum
DBPedia
Europeana
• “general” Data Model (EDM)
• collection holders are responsible for fitting their resources and metadata into EDM
British Library
Extreme cases
Challenges
Stakeholders go quiet!
1. Which metadata?
• Currently, resource metadata is mostly oriented towards “physical proximity”
  - i.e., indexes reflect similarity of author’s surname, broad subject, format, media, etc.
• Heterogeneous platforms and data models
  - incompatibility, transformations needed
• Even when RDF is used, there’s a proliferation of terms, vocabularies, and formats adopted
  - little (if any) validation
2. Linking
• Systems that do not use RDF do not allow collection holders to express their knowledge as they wish
  - underspecified knowledge
• Even when RDF is used, information is often provided as literals rather than links to URIs
  - ad hoc solutions unavailable in a machine-readable format
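To make the literals-versus-links point concrete, here is a small sketch of replacing literal values with URIs when a match is found. Everything in it is an assumption for illustration: the lookup table stands in for a real hub, its URIs are placeholders, and matching is a naive exact match on normalised labels.

```python
import unicodedata

# Hypothetical lookup table; a real one would be built from a hub such as
# GeoNames or DBpedia. These URIs are placeholders, not real hub entries.
HUB_INDEX = {
    "london": "http://example.org/hub/place/london",
    "william shakespeare": "http://example.org/hub/person/william-shakespeare",
}


def normalise(label):
    """Lower-case, strip accents and collapse whitespace before matching."""
    decomposed = unicodedata.normalize("NFKD", label)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(unaccented.lower().split())


def reconcile(triples, predicates):
    """Swap literal objects for hub URIs where a normalised match exists.

    Only the given predicates are considered; unmatched values stay literals.
    """
    result = []
    for subject, predicate, value in triples:
        uri = HUB_INDEX.get(normalise(value)) if predicate in predicates else None
        result.append((subject, predicate, ("uri", uri) if uri else ("literal", value)))
    return result


SPATIAL = "http://purl.org/dc/terms/spatial"
TITLE = "http://purl.org/dc/terms/title"
triples = [
    ("http://example.org/item/1", SPATIAL, "  LONDON "),
    ("http://example.org/item/1", TITLE, "London"),
]
for triple in reconcile(triples, {SPATIAL}):
    print(triple)
```

Restricting the swap to chosen predicates keeps a title that happens to read "London" as a literal, while the place value becomes a link that other datasets can share.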
3. Usability
• Reliability
• Lack of tools
  - developers have little contact with collection holders
• Licensing issues
  - resource licensing (not always explicit)
  - metadata licensing
  - users need to be aware of what that means
  - (note that in education things are slightly easier: blanket licensing etc.)
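One way to see what "not always explicit" means in practice is to check which resources and which metadata documents carry a machine-readable license statement. The sketch below assumes licenses are expressed with dcterms:license and uses made-up example.org URIs; other properties are possible.

```python
DCT_LICENSE = "http://purl.org/dc/terms/license"


def missing_licenses(triples, resource_uris, metadata_uris):
    """Report resources and metadata documents with no explicit license triple."""
    licensed = {subject for subject, predicate, _obj in triples if predicate == DCT_LICENSE}
    return {
        "resources": sorted(set(resource_uris) - licensed),
        "metadata": sorted(set(metadata_uris) - licensed),
    }


# Hypothetical data: one licensed object, one unlicensed object, and an
# unlicensed metadata document describing the first object.
triples = [
    ("http://example.org/object/1", DCT_LICENSE,
     "http://creativecommons.org/licenses/by/4.0/"),
]
report = missing_licenses(
    triples,
    resource_uris=["http://example.org/object/1", "http://example.org/object/2"],
    metadata_uris=["http://example.org/object/1.ttl"],
)
print(report)
```

Keeping the two lists separate mirrors the distinction above: a resource can be cleared while the metadata describing it is not, and the other way round.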
Interested?
• get in touch!
• chiara.delvescovo@bbc.co.uk
• alex.tucker@bbc.co.uk
• new advertised position as Junior Data Architect
  - careershub.bbc.co.uk

Documents, services, and data on the web
