June 22 – 27, 2017
BIBFRAME and OCLC Works:
Defining models and discovering evidence
Carol Jean Godby and Diane Vizine-Goetz
Senior Research Scientists
OCLC Membership and Research
Library of Congress BIBFRAME Update Session. 26 June 2017
• OCLC researchers are acting on a request from the Library of Congress to align OCLC Works with BIBFRAME works.
• We are working in the larger context of two Program for Cooperative Cataloging Task Groups: PCC-Work and PCC-URI.
A starting place
OCLC’s linked-data models reflect what can be discovered in aggregations of data converted from MARC.
This outcome is congruent with BIBFRAME and other efforts in the library community to specify a future in which library data is more machine-understandable than it is today.
One highlight….
[Diagram with three columns: Inputs, Model outputs, Visible outputs. Labels shown: MARC Uniform Title Record; cluster; VIAF Work records; VIAF; WorldCat Works; Research prototypes; VIAF Expression; VIAF “Extended Relationship” (xR) record; “same”; Work cluster; MARC bibliographic records.]
RELATIONSHIPS TO BIBFRAME
BF Work
  Title
    Main Title: “Civil disobedience”
  Contribution
    Agent: <Henry David Thoreau>
    Role: <aut>
  Subject: <civil disobedience>
  Genre Form: <sound recording>
  Credits: “Read by Archibald McLeish”
BF Instance
  Audio Issue Number: “TC 1263”
  Provision Activity Statement:
    Agent: Caedmon
    Place: <New York, N.Y>
    Date: “1968”
  Dimensions: “12 in.”
exampleOfWork
OCLC Work
23 descriptions of sound recordings, with different:
• Publishers
• Issue dates
• Carriers
• Narrators
OCLC Works and LC records
WorldCat
• 394,835,538 records
• 78% of works (178,375,018) are singleton works
• Non-singleton work clusters average 4.18 records
• 230,113,951 total works
LC subset
• 10,321,873 records
• 26% of records (2,706,494) are singleton works
• Non-singleton work clusters containing LC records average 7.74 records
• 9,575,077 total works
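The percentages on this slide can be checked against the raw counts. The sketch below is only an arithmetic reading of the figures above, not a description of how OCLC computed them; the variable and function names are illustrative. It suggests that the WorldCat 78% is a share of works, while the LC 26% is a share of records.

```python
# Counts copied from the slide; all names here are illustrative.
worldcat_records = 394_835_538
worldcat_works = 230_113_951
worldcat_singletons = 178_375_018

lc_records = 10_321_873
lc_works = 9_575_077
lc_singletons = 2_706_494


def share(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# WorldCat: the quoted 78% matches singletons as a share of works.
worldcat_share_of_works = share(worldcat_singletons, worldcat_works)
worldcat_share_of_records = share(worldcat_singletons, worldcat_records)

# LC subset: the quoted 26% matches singletons as a share of records.
lc_share_of_records = share(lc_singletons, lc_records)
lc_share_of_works = share(lc_singletons, lc_works)

# Average size of a non-singleton WorldCat cluster: remove the singletons
# from both the work count and the record count, then divide.
multi_works = worldcat_works - worldcat_singletons
multi_records = worldcat_records - worldcat_singletons
avg_cluster_size = round(multi_records / multi_works, 2)

print(worldcat_share_of_works, worldcat_share_of_records)
print(lc_share_of_records, lc_share_of_works)
print(avg_cluster_size)
```

The same check cannot be repeated for the LC average of 7.74, because those clusters also contain non-LC records whose count is not given on the slide.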
Don Quixote in WorldCat Works
– 260 LC records with Uniform title: Don Quixote*
– 247 records for books
• 104 LC records in primary Don Quixote work cluster (~7900 WorldCat records)
• 79 singleton works
• 64 records in 21 additional clusters; includes translations not [yet] recognized by the clustering algorithms and other works
THE “WORK” IN CONTEXT
FRBR | BIBFRAME | OCLC
Work. “A distinct intellectual or artistic creation” | Work. “The highest level of abstraction…reflects the conceptual essence of the cataloged resource: authors, languages, and what it is about (subjects).” | Author, title, format; content properties such as subject or genre
Expression. “A Work realized as a distinct intellectual or artistic form” | | A language tag or evidence of a translation relationship
Manifestation. “Physical embodiment of an Expression” | Instance. “…one or more individual embodiments of a Work” | A product code, such as an ISBN
Item. “Exemplar of a Manifestation” | Item. “…an actual copy of an Instance.” | Evidence of uniqueness
FRBR in the OCLC model of Works
“Work”: schema:CreativeWork
“Expression”: schema:CreativeWork (Creative Work with a “language” descriptor or a translator)
“Manifestation”: schema:CreativeWork, schema:ProductModel (Creative Work with a “Product” identifier)
“Item”: schema:CreativeWork, schema:IndividualProduct (Creative Work with a “Uniqueness” identifier)
Relationships between levels: exampleOfWork, workExample
Alignment with BIBFRAME
[Diagram. OCLC entities: OCLC Work, OCLC Expression (shown twice), OCLC Manifestation, OCLC Item. BIBFRAME entities: BIBFRAME Work, BIBFRAME Instance, BIBFRAME Item. Language tags: lang es, lang en. Relationship labels: workExample, exampleOfWork, workExample / workTranslation, workExample / bf:translation, bf:instanceOf, bf:itemOf.]
SYNERGIES
• hasPart, partOf
• accompaniedBy, accompanies
• hasDerivative, derivativeOf
• precedes, precededBy
Why OCLC’s “work on Works” is in synch with BIBFRAME
• We’re describing the same things.
• Our goal is to discover evidence for standards, not propose new ones.
• We agree that Works are fundamentally important in the domain of library resource description.
Jean Godby
OCLC Membership and Research
godby@oclc.org
For more information
• Jean Godby. 2016. A Division of Labor. In Linked Data for Cultural Heritage (Ed Jones and Michele Seikel, eds). ALCTS monograph.
• PCC SCS/LDAC Task Group on the Work Entity.
• PCC Task Group on URIs in MARC.
• Karen Smith-Yoshimura. 2017. Representing Translations as Linked Data. OCLC Linked Data Roundtable. ALA Annual.
• Roy Tennant and Jean Godby. 2017. OCLC’s Work on Works. ALA Midwinter.
• WorldCat Cookbook Finder.

Editor's Notes

  • #3: The PCC-Work TF is doing a comparative analysis of the Work models being developed in the library community and writing up the results in a white paper. BIBFRAME and OCLC Works are among them. Others: FRBR, RDA, and IFLA’s Library Reference Model. The PCC-URI TF is defining best practices for
  • #4: In other words, OCLC is pursuing a data-driven approach to modeling. That is the theme I want to develop today, with respect to our “work on Works.” This is a companion to the presentation that my OCLC colleague Roy Tennant gave six months ago, at the Midwinter LC BIBFRAME Update session.
  • #5: Like BIBFRAME, OCLC’s Works are discovered in two resources: MARC bibliographic and Uniform Title authority records. In general, authority records produce VIAF Works and bibliographic records produce OCLC Works. Traditionally, they have been processed in separate projects, but they’re obviously related. There is a VIAF Work for William Faulkner’s book “Go Down Moses.” And there is a WorldCat Work. Since they describe the same thing, they’re “the same” in some respect. Our previous presentation focused on VIAF Works, and here we will drill down into OCLC Works. In short: a Work is a cluster of bibliographic descriptions that have the same author, title, genre, and resource type, according to algorithms working directly on the data. The raw output is simply a cluster of MARC records (shown in the middle of the slide). Downstream processes (shown in the right-hand segment of the slide) create 1) the hierarchical display of search results on WorldCat; 2) research prototypes such as Classify and Cookbook Finder; and 3) the RDF dataset known as WorldCat Works. Broadly speaking, the process that starts with bibliographic records (WorldCat) is more complex than the VIAF process. But there is some overlap. For example, VIAF records sometimes show ‘xR’ as a source authority. That’s an ‘Extended Relationship’ record. An xR record is a synthetic uniform title record created from a WorldCat Work cluster. It is an experimental attempt to increase the number of uniform title records because we all know there aren’t enough of them. In our data, only a small number of WorldCat Work clusters contain bib records with references to uniform title authority records.
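The one-sentence definition in note #5 (a Work is a cluster of descriptions sharing author, title, genre, and resource type) can be illustrated with a toy grouping. This is a minimal sketch that assumes exact matching after simple normalization; the notes do not describe OCLC's actual algorithms, and the record layout, field names, and sample values below are invented for illustration.

```python
from collections import defaultdict

# The four properties named in the note; the dict keys are hypothetical.
KEY_FIELDS = ("author", "title", "genre", "resource_type")


def work_key(record):
    """Build a grouping key from the four properties, ignoring case and padding."""
    return tuple(record[field].strip().lower() for field in KEY_FIELDS)


def cluster_works(records):
    """Group record ids by work key; each group stands in for one Work cluster."""
    clusters = defaultdict(list)
    for record in records:
        clusters[work_key(record)].append(record["id"])
    return dict(clusters)


# Made-up sample descriptions, not real WorldCat data.
records = [
    {"id": "r1", "author": "Thoreau, Henry David", "title": "Civil disobedience",
     "genre": "essays", "resource_type": "sound recording"},
    {"id": "r2", "author": "Thoreau, Henry David ", "title": "Civil Disobedience",
     "genre": "Essays", "resource_type": "sound recording"},
    {"id": "r3", "author": "Rombauer, Irma", "title": "Joy of cooking",
     "genre": "cookbooks", "resource_type": "book"},
]

clusters = cluster_works(records)
singletons = [ids for ids in clusters.values() if len(ids) == 1]
print(clusters)
print(singletons)
```

Exact matching of this kind would miss translations and variant titles, which is consistent with the later remark that some translations are not yet recognized by the clustering algorithms.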
  • #6: This section describes some of the tangible outputs of the process for creating Works from bibliographic records and compares them to BIBFRAME Works.
  • #7: With this background, we can get started. I’ll show a bit more about the applications derived from the model outputs. WorldCat Cookbook Finder is a research prototype. It builds a user experience for browsing and searching bibliographic records organized as Works and other FRBR Group I categories. One of the inputs is the set of Work clusters. The result is a human-readable description of the properties that describe the content of Works such as Irma Rombauer’s Joy of Cooking. Shown here: author, description, date range for published editions.
  • #8: Authors, subjects, class assignment. Again: Joy of Cooking is not built on linked data, but is an application derived from the same source, the cluster of MARC records that we call Works. Note the Work identifier, which is associated with the Work cluster.
  • #9: Links to editions, or Manifestations. OCLC’s holdings are another input.
  • #10: Here is the corresponding WorldCat Linked Data description derived from the same source. The Work ID is the same as the one shown in Cookbook Finder: 399397. This is the raw data, designed for machine understanding. It can be delivered via Web protocols shown in the turquoise tabs across the top: HTML, JSON-LD, RDF XML, etc. Cookbook Finder filters the raw data for a clean user experience. For example, it displays the most popular subject (Cooking, American). The complete list is excerpted here. Because WorldCat Works is agnostic about how it might be used and is expressed as Linked Data, this is the typical point of comparison with BIBFRAME Works.
  • #11: The next set of slides shows links to WorldCat Works in WorldCat catalog records and compares them to corresponding BIBFRAME descriptions. This is an example from a dataset in the BIBFRAME 2.0 converter distribution. It describes an audiobook of Thoreau’s Civil Disobedience, narrated by Archibald MacLeish and published by Caedmon in 1968.
  • #12: Applying the BIBFRAME converter to this record produces two outputs. The first is the Work, containing subject-level information very similar to the output of OCLC’s processes. The data in blue is shorthand for data represented as URIs, not text strings. We have URIs for Henry David Thoreau playing the role of author, and for subject headings and genres. LC is also investigating the possibility of extracting information about Works from MARC linking fields and $t subfields. This is also consistent with what OCLC is doing. The second is the Instance, describing this edition of Civil Disobedience, with publication information and a product identifier. In progress: BF Work descriptions are being built out in two ways: 1) aggregated in clusters; 2) associated with instances and items. This is similar to OCLC’s approach, though we place more emphasis on automated processes. At the bottom of the WorldCat.org page is OCLC’s Linked Data representation of this MARC record. It contains many details seen in BIBFRAME output. It contains a URI to a WorldCat Works description, which resolves to a cluster of records. In other words, the 1968 audiobook edition of Civil Disobedience is only one member of the Work cluster. In the long run, BF and OCLC Works are aiming for the same goal, achieved with similar strategies: 1) a reduction step that extracts work-level information; 2) an aggregation step that creates a composite description.
  • #13: The raw data looks like this, and is just another WorldCat Works page. But when the ‘example of work’ links are dereferenced, the other members of the cluster are retrieved. Three are shown here. (Summary points) Note that an aggregate is included in the cluster. This is obviously an error, but it occurs because it is difficult for a machine process to detect aggregates with certainty. I’ll return to this point later.
  • #14: Extracted records from WorldCat with DLC in subfields $a and/or $c of MARC 040 (Cataloging Source). For example, 040##$aDLC$cDLC indicates cataloging produced and input by the Library of Congress. Result: 10,321,873 records.
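The selection rule in note #14 can be sketched as a small predicate. This is a minimal illustration, assuming a MARC 040 field has already been parsed into a list of (subfield code, value) pairs; that representation and the function name are hypothetical, and the notes do not say what tooling was actually used for the extraction.

```python
def is_lc_cataloged(field_040):
    """Return True if DLC appears in subfield $a and/or $c of a parsed 040 field.

    field_040 is assumed to be a list of (subfield_code, value) pairs.
    """
    return any(
        code in ("a", "c") and value.strip() == "DLC"
        for code, value in field_040
    )


# The example from the note: 040 ## $a DLC $c DLC
lc_record = [("a", "DLC"), ("c", "DLC")]

# A made-up counterexample: DLC appears only in $d (modifying agency),
# which the rule in the note does not consider.
modified_only = [("a", "OCoLC"), ("c", "OCoLC"), ("d", "DLC")]

print(is_lc_cataloged(lc_record))
print(is_lc_cataloged(modified_only))
```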
  • #15: Examples of singleton works: 1) Primera y segvnda parte del ingenioso hidalgo Don Qvixote de la Mancha; 2) Don Quixote de la Mancha : an old-spelling control edition based on the first editions of parts 1 and 2 / prepared by R.M. Flores. Examples in additional clusters: 1) El ingenioso hidalgo Don Quijote de la Mancha / Miguel de Cervantes ; ilustraciones de José Ramón Sánchez ; edición, introducción y notas de Angel Basanta ; apéndices de Eduardo Pérez-Rasilla [and others]; 2) Don Quixote of the Mancha / retold by Judge Parry ; and illustrated by Walter Crane.
  • #16: Now that we’ve gone through a couple of examples, I’d like to step back and put WorldCat and BIBFRAME Works in the context of FRBR and related models.
  • #17: FRBR, BIBFRAME (and other models such as RDA and IFLA’s Library Reference Model) are specifications. Roughly speaking, BIBFRAME and the OCLC model of Works represent a simplification of the FRBR model because only three levels are clearly distinguished: Work, Manifestation, and Item. The distinction between Work and Expression is only a soft one. But the color change in the slide is meant to suggest that OCLC’s model of Works is different from the two other model specifications shown here. OCLC’s model does not propose new definitions but is simply a list of the criteria we use to discover categories in MARC data defined by FRBR. The results are expressed in Schema.org for two reasons: 1) we can’t be sure that they’re exactly the same as FRBR because empirical reality only approximates theory; 2) if what we do discover is expressed in a general-purpose vocabulary, other users outside the library community are in a better position to consume it.
  • #18: A graphical representation of OCLC’s rules for identifying FRBR categories, in slightly more technical detail. Bottom line: FRBR Group I categories can be expressed in a lightweight fashion. They are all labeled (or “typed”) as creative works and are labeled as “examples” of the category above them in the hierarchy. There are three Schema.org type assignments corresponding to the three categories that can be distinguished. As we move down the hierarchy, the description becomes more and more specific. At the top are Work and Expression. If this description also has a Product identifier such as an ISBN, it is labeled as a Manifestation. If the description also has a barcode or other uniqueness identifier, it is labeled as an Item. The relationships between the levels are generic. The same relationships can be used at all levels. exampleOfWork points up (relating an Expression to a Work). workExample points down (relating an Expression to a Manifestation). The properties are defined in schema.org: “workExample: Example/instance/realization/derivation of the concept of this creative work.” Note that the definition is a cover term for the relationships spelled out in the FRBR definitions.
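The labeling rules in note #18 can be read as a short decision procedure. The sketch below is one possible reading under stated assumptions: the dictionary keys (`language`, `translator`, `product_id`, `uniqueness_id`) are hypothetical names for the kinds of evidence the note mentions, and the function is not OCLC's implementation.

```python
def classify(description):
    """Return (FRBR label, Schema.org types) for a description.

    The most specific evidence wins: a uniqueness identifier implies Item,
    a product identifier implies Manifestation, and a language tag or
    translator implies Expression. Everything is typed schema:CreativeWork.
    """
    types = ["schema:CreativeWork"]
    if description.get("uniqueness_id"):
        return "Item", types + ["schema:IndividualProduct"]
    if description.get("product_id"):
        return "Manifestation", types + ["schema:ProductModel"]
    if description.get("language") or description.get("translator"):
        return "Expression", types
    return "Work", types


# Made-up descriptions with increasing amounts of evidence.
work = {"title": "Don Quixote"}
expression = {"title": "Don Quixote", "language": "en"}
manifestation = {"title": "Don Quixote", "language": "en", "product_id": "isbn-placeholder"}
item = {"title": "Don Quixote", "product_id": "isbn-placeholder", "uniqueness_id": "barcode-placeholder"}

for sample in (work, expression, manifestation, item):
    print(classify(sample))
```

Checking the most specific evidence first mirrors the note's observation that descriptions become more specific as one moves down the hierarchy.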
  • #19: This diagram is like the one that was described in our previous presentation at ALA Midwinter, but it has more detail. It’s just another way of emphasizing the points of similarity between the OCLC model of Work and BIBFRAME. The OCLC model and BIBFRAME both recognize Work, Manifestation, and Item categories. Where there is evidence, as in the Translation relationship, a category interpreted as an Expression can also be identified. The relationships are generic. If more information is available, a more specific “workTranslation” relationship can be expressed (and its converse). BIBFRAME is similar. OCLC has one general relation; BF has two (itemOf/instanceOf). The BF relations can also be upgraded to something more specific (translation/translationOf … etc).
  • #21: I have hinted several times that the “Translation” relationship is important in OCLC’s model of Works. Karen Smith-Yoshimura’s presentation in the Linked Data roundtable covers this point in more detail. Translations are inherently important, of course, but in our data-driven approach to FRBR, translations represent the best evidence for the Expression concept, and for a Work-to-Expression relationship. But other relationships have obviously been defined, particularly in the RDA Relationship ontology. And BIBFRAME 2.0 has incorporated many of these relationships into an ontology that specifies the type of relationship. For example, hasPart and accompaniedBy are examples of the “relatedTo” relationship.
  • #22: And more relations…hasReproduction, dataSource… Translations are a type of “Derivative” relationship between Works and Instances. The relationship and the things related are very close to what we would say. In the future, we will attempt to discover some of these other relationships. For example, we mentioned earlier that some types of aggregates create errors in our Work clusters. Thus, the creation of original descriptions, such as BIBFRAME descriptions, that include these relationships would represent a major step toward our community’s goal of upgrading MARC and creating a model of our domain that is more machine-understandable. In other words, bibliographic descriptions that reference BIBFRAME relationships would make it easier to do what OCLC is trying to do: aggregate the world’s descriptions of library resources to improve the connection between users and the information they seek, and provide a backbone for the library operations that support this goal.