Linking digitized collections across metadata silos
Jeff Mixter and Titia van der Werf
OCLC Research
July 2, 2014
LIBER 2014
Connecting the Dots
Introduction
• Projects such as Europeana and the Digital Public Library of America have
highlighted the importance of sharing metadata across silos
• While both of these projects have been successful in harvesting collections data,
they have had problems with rationalizing the data and forming a coherent
understanding of the aggregation
• To share data properly across silos and on the Web, for both human and machine
consumption, there needs to be a concerted effort to apply best practices and
standards that are universally understood and consumed
Current Situation
• Organizations create digital collections and generate metadata in repository silos. This
metadata generally does not connect:
  • digitized items to their analogue sources
  • names to authority records (persons, organizations, places, etc.) or subject
  descriptions to controlled vocabularies
  • items to related online items accessible elsewhere
• Aggregators harvest this metadata, which generally gets “dumbed down” in the process:
  • The University of Illinois OAI-PMH Data Provider Registry notes that 2964 repositories use
  Dublin Core (dc). The next highest is MARC21, at 545 repositories
  • Even if dc extensions are used, they are often lost in the OAI-PMH harvesting process
  • Aggregators usually ignore idiosyncratic use of metadata schemas and enforce the use of
  designated metadata fields
• Digital collection items are not very visible to search engines
  • A recent JISC project determined “Only about 50% of items appeared on the first page of
  Google results using the item name or title”
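The “dumbing down” described above can be illustrated with a minimal sketch. It assumes a hypothetical rich repository record whose values carry authority URIs, and a hypothetical crosswalk function `to_simple_dc` that keeps only the 15 simple Dublin Core elements as plain strings. The record, the field name `analogueSource`, and the example.org URIs are invented for illustration and do not come from any real harvester.

```python
# The 15 elements of simple (unqualified) Dublin Core.
SIMPLE_DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Hypothetical rich record: some values are linked to authority URIs.
rich_record = {
    "title": ["Example exhibition catalogue"],
    "creator": [{"label": "Example, Author",
                 "uri": "http://example.org/authority/person/1"}],
    "subject": [{"label": "Poetry",
                 "uri": "http://example.org/vocab/subject/42"}],
    # Local extension field that simple Dublin Core has no slot for.
    "analogueSource": [{"label": "Printed original",
                        "uri": "http://example.org/catalogue/record/7"}],
}


def to_simple_dc(record):
    """Flatten a rich record to simple DC: known elements only, strings only."""
    flat = {}
    for field, values in record.items():
        if field not in SIMPLE_DC_ELEMENTS:
            continue  # extension fields are silently dropped
        # Linked values are reduced to their display label; the URI is lost.
        flat[field] = [v["label"] if isinstance(v, dict) else str(v)
                       for v in values]
    return flat


def lost_links(record):
    """List the URIs in the rich record, none of which survive flattening."""
    return [v["uri"]
            for values in record.values()
            for v in values
            if isinstance(v, dict) and "uri" in v]


flat = to_simple_dc(rich_record)
dropped = lost_links(rich_record)
```

In this sketch the flattened record still has a creator and a subject, but only as strings, and the link to the analogue source disappears entirely. That is the kind of loss the bullets above describe.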
A case study: “a good example”
Search string: exposition organisée pour le centenaire des "Fleurs du Mal"
Full-text search string from the document: "Eugène Crépet"
Search in:
1. BnF Catalogue (Library Catalogue)
2. Gallica (Repository)
3. WorldCat (Aggregator via DCG harvester)
4. TEL (Aggregator)
5. Europeana (Aggregator)
6. Google (Search Engine)
Observations
1. Much duplicated effort and wasted resources in developing aggregator services
within the same domain
2. Many missed opportunities to connect to related data inside and outside one's
own silo (at both repository and aggregation levels)
3. Visibility/discoverability via SEO is a sign of digital maturity
4. Aggregators generally do not use the full-text (FT) indexes available from
repositories to enrich their search functionality
Problem Statements
1. How to share metadata and reduce costs?
2. How to make digital collections more interoperable
across data silos?
3. How to make digital collections more visible to search
engines?
Data sharing
• ‘Data sharing’ is a rather simple term that does not do justice to what it means in
today’s knowledge society
• What we want to do is:
1. Publish data on the Web in a format that can be consumed and indexed by
aggregators and web applications
2. Share data with other organizations with the goal of ‘connecting the dots’
3. Connect points in your data to points in other organizations’ data. These
could be People, Places, Events, Organizations, Topics, etc.
4. Connect data across silos so that patrons can browse and navigate related
data and items without having to run multiple searches in multiple portals
Data sharing
[Diagram labels: BNF (France), DNB (Germany), BL (UK), BNE (Spain), KB (Netherlands); Europeana, TEL, APEnet]
A Knowledge Graph
• In essence what we want to build is a massive knowledge graph of data from digital
collections
[Diagram labels: BNF (France), DNB (Germany), KB (Netherlands), BNE (Spain), BL (UK)]
A Knowledge Graph
• Better yet, we actually want to connect individual dots within and across data silos.
This is the essence of Linked Data
• This requires changes in how repository data is published
[Diagram labels: “Vincent van Gogh”, repeated five times]
Linked Data
https://www.freebase.com/m/07_m2
http://viaf.org/ 9854560
http://dbpedia.org/Vincent_van_Gogh
Linked Data
•Linked Data is a way of publishing data on the Web in a format that can be easily
consumed and understood by both humans and machines. It relies on linking data
points together to form a complex graph of information
• Linked Data relies on identifiers called URIs
• Things NOT Strings!
• Linked Data can also be used to help connect data across silos and across
domains of practice
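A minimal sketch of “Things NOT Strings”, assuming three hypothetical silos that each spell the same creator's name differently but also record one shared identifier. The records, the name forms, and the example.org URI are invented for illustration.

```python
from collections import defaultdict

# Hypothetical URI standing in for a shared authority identifier.
PERSON = "http://example.org/person/vincent-van-gogh"

# Hypothetical records from three silos, each with a different name string.
records = [
    {"silo": "A", "item": "Letter",
     "creator_name": "Vincent van Gogh", "creator_uri": PERSON},
    {"silo": "B", "item": "Drawing",
     "creator_name": "Gogh, Vincent van", "creator_uri": PERSON},
    {"silo": "C", "item": "Painting",
     "creator_name": "Van Gogh, Vincent, 1853-1890", "creator_uri": PERSON},
]


def group_items(records, key):
    """Group item names by the value found under `key` in each record."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["item"])
    return dict(groups)


# Matching on strings splits one person into three unrelated groups.
by_string = group_items(records, "creator_name")

# Matching on the URI brings all three items together across silos.
by_thing = group_items(records, "creator_uri")
```

Grouping on the name string keeps the three items apart, while grouping on the URI collects them under one entity. In practice, agreeing on which URI to use is the hard part, and this sketch simply assumes it has been done.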
Schema.org
• Schema.org is a Linked Data vocabulary that is understood and indexed by search
engines
• It is widely used:
  • on 15% of web pages harvested by Google
  • on over 5 million web sites
  • over 25 billion referenced entities
• Google Webmaster Tools can tell users how much structured data Google is seeing
and indexing
  • WorldCat.org has 4.63 million unique structured data entities across 1.48 million
  pages
• So why Schema.org?
  • Discoverability on the web
  • Interoperability with data outside of the library domain
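As a rough illustration of what Schema.org markup for a digital collection item might look like, the sketch below builds a JSON-LD description and wraps it in a script tag for embedding in an item page. JSON-LD is only one of the syntaxes Schema.org can be expressed in, and the slides do not say which syntax any particular service uses. The helper names, the item, and the example.org URLs are hypothetical.

```python
import json


def schema_org_item(name, page_url, creator_name, creator_uri):
    """Describe an item as a Schema.org CreativeWork with a linked creator."""
    return {
        "@context": "https://schema.org",
        "@type": "CreativeWork",
        "@id": page_url,
        "name": name,
        "creator": {
            "@type": "Person",
            "@id": creator_uri,  # a thing (URI), not just a name string
            "name": creator_name,
        },
    }


def as_script_tag(data):
    """Serialize the description as a JSON-LD script element for an HTML page."""
    body = json.dumps(data, indent=2, ensure_ascii=False)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical item and identifiers, for illustration only.
doc = schema_org_item(
    name="Example exhibition catalogue",
    page_url="http://example.org/items/1",
    creator_name="Example, Author",
    creator_uri="http://example.org/authority/person/1",
)
tag = as_script_tag(doc)
```

The design point is that the creator is given as a typed entity with its own identifier, so consumers outside the library domain can connect it to other data about the same person.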
OCLC Projects
• In 2012, OCLC added Schema.org tags to WorldCat.org records, improving the way in
which library information is represented to search engines.
http://www.worldcat.org/oclc/808127130
OCLC Projects
• In 2012, OCLC published VIAF data as Linked Open Data
• In 2013, OCLC developed a VIAF bot for Wikipedia
http://inkdroid.org/journal/2012/05/15/diving-into-viaf/comment-page-1/
OCLC Projects
• In April 2014, OCLC released a beta version of its Works data as Linked Data,
marked up in Schema.org (197M work descriptions)
This points to the ‘manifestation’ in WorldCat.org
http://experiment.worldcat.org/entity/work/data/51196.html
Other OCLC Projects
• An exploratory project is underway to take the Digital Collections Gateway
metadata and create more granular Linked Data descriptions
  • A USC collection was used as a test case
  • Using the original metadata, rich descriptions of people, places, events and items
  were created
• As OCLC continues to apply the Schema.org vocabulary to items found in libraries,
archives and museums, we have begun to create extension terms to address
shortcomings in Schema.org
  • There is also a W3C Community Group, Schema Bib Extend, that proposes
  additional terms to Schema.org for review and consideration
Becoming a player in the web of data
Questions?
Thank You!
©2014 OCLC. This work is licensed under a Creative Commons Attribution 3.0 Unported License. Suggested attribution: “This
work uses content from [presentation title] © OCLC, used under a Creative Commons Attribution license:
http://creativecommons.org/licenses/by/3.0/”
Jeff Mixter – mixterj@oclc.org
Titia van der Werf – titia.vanderwerf@oclc.org

Editor's Notes

  • #3: Set the framework for why there needs to be a change in how we share and publish repository data
  • #4: Explain how we currently do things
  • #6: The author names are linked to the French authority record. The full-text index of this resource is seemingly not indexed in this catalogue.
  • #7: Gallica does not link the author names to any authority record/related resources. The full-text index of this resource is indexed in Gallica. Searching on “Eugène Crépet” gives 3 results extracted from the FT of this specific document, amongst other results.
  • #8: Searching on the string “Eugène Crépet” yields 12 results, one of which (no. 8) comes from the FT of the document under scrutiny. There are three extracts from the FT containing this string.
  • #9: The metadata coming into WorldCat via the Digital Collection Gateway is dumbed down and not enriched with links to authority records within or outside of WorldCat. The full-text index of this resource is seemingly not indexed in WorldCat.
  • #10: The authors are not linked to other resources within or outside TEL. The full-text index of this resource is seemingly not indexed in TEL.
  • #11: The author names are linked to other works WITHIN the Europeana aggregation – not to an authority record outside of Europeana (e.g. BnF authority record or VIAF). The full-text index of this resource is seemingly not indexed in Europeana.
  • #12: One would expect that Gallica would be top ranked on this result page (with the direct link to the FT-resource), but the DCG-WorldCat record is on top of the list – which shows that WorldCat is better in SEO. Gallica is still ranked second – which is not bad. TEL and Europeana are not visible.
  • #16: This reflects how data sharing currently works. Europeana harvests repository data in bulk uploads and then publishes it. They do some behind-the-scenes clean-up, but because the data is already simple Dublin Core, these efforts are very difficult and rather futile.
  • #17: What we actually want is to link the individual repositories together. Using a rich, granular, standard Web vocabulary, organizations will be able to publish their data without loss. The task of linking it to other repositories will still be difficult, but the data experts will at least be able to work with very detailed source metadata.
  • #18: The linking that goes on between individual data sets will actually be micro-linking. This is linking individual dots (metadata points) to other dots in other data sets.
  • #19: URIs are what we are actually linking together
  • #20: Brief overview of Linked Data and discussion of efforts already undertaken in Europe
  • #21: Brief overview of Schema.org – I purposely kept the discussion of Schema.org brief so as not to give the impression that we are trying to push it on people. I am not sure if that would come off as offensive to anyone.
  • #27: Overview of OCLC Linked Data releases and then a more detailed discussion about the exploratory project that I am working on. The emphasis is on just getting data published using Schema.org. The conclusion of this slide will be that the goal of developing better Digital Collection Gateway Linked Data records is to enable us to build better descriptions of these items. The job would be much easier if we could simply link to the Linked Data descriptions published by the original creator of the data (i.e. the repository). The repository has the benefit of being able to generate Linked Data from highly granular source metadata, whereas the Digital Collections Gateway has to rely on simple Dublin Core data.