XML & MODS
INTRODUCTION TO XML AND THE METADATA OBJECT DESCRIPTION
SCHEMA IN THE CTDA
XML
XML stands for eXtensible Markup Language. XML was designed to describe data whereas HTML
was designed to display data.
XML uses “tags”. In metadata land, these are also referred to as “labels”, “elements”, or “fields”.
These tags are not predefined but are meant to be self-descriptive.
You can invent your own tags in XML.
By itself, XML DOES NOT DO ANYTHING. XML needs a script written by someone or a piece of
software to receive, send, transform, or display it.
XML is a software and hardware independent tool for carrying information. It is not a
replacement for HTML but can be a complement to HTML.
Extensible
You can create and define your own tags.
<note>
<myAwesomeNote>
<thisIsMyTag>
The power of being extensible is the ability to customize your XML.
Markup
It’s all about the <tags>. The angle brackets are the most recognizable feature of XML. These
tags or elements are very similar to the ones in HTML.
Element names are surrounded by angle brackets. Each element has an opening and a closing
tag, as in HTML.
Language
XML is a language, or rather a “meta-language”. XML allows you to create and define other
languages.
Have you ever heard of RSS feeds, XSLT, or XSD?
Languages such as XSLT and XSD are sometimes referred to as members of the XML family.
XSLT is eXtensible Stylesheet Language Transformations
XSD is XML Schema Definition
XML Documents
When you create an XML file or document, you are essentially creating a text file with the
extension .xml. Because it is a text file, it can be read by any type of software or hardware. This is
why XML simplifies data sharing and transport. It also helps when you change platforms because
text can be read by a large number of programs and systems.
XML documents all have the same structure, called a tree. There is a trunk, limbs, and leaves.
The XML declaration declares that this is an XML document. The trunk of the tree is called the
root. The limbs and leaves are called children. Because every other element descends from it, the
root is also the ultimate “parent”.
XML Document
<?xml version="1.0" encoding="UTF-8"?>
<note>
<to>Homer</to>
<from>Marcy</from>
<heading>Reminder</heading>
<body>Don’t forget about the BBQ this weekend</body>
</note>
__
<root>
<child>
<subchild>….</subchild>
</child>
</root>
XML Declaration: <?xml version="1.0" encoding="UTF-8"?>
The root or ultimate parent element: <note>
Child elements of the parent element <note>, which is also the root: <to>, <from>, <heading>, <body>
Tree diagram: note
    to, from, heading, body
XML Expanded
<note> is the root. It is also the parent to 4 children.
<to>, <from>, <heading>, and <body> are children of their parent, <note>, and are siblings of one another.
A parent element does not necessarily have to be the root element in the XML file.
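The parent, child, and sibling vocabulary can be checked by parsing the note example. This is a minimal sketch that assumes Python and its standard xml.etree.ElementTree module; the workshop itself only relies on XML editors.

```python
import xml.etree.ElementTree as ET

# The note document from the earlier slide, typed with straight quotes.
NOTE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<note>
  <to>Homer</to>
  <from>Marcy</from>
  <heading>Reminder</heading>
  <body>Don't forget about the BBQ this weekend</body>
</note>"""

root = ET.fromstring(NOTE_XML.encode("utf-8"))

# Iterating over an element yields its direct children in document order.
child_names = [child.tag for child in root]

print(root.tag)      # the root element
print(child_names)   # its children, which are siblings of one another
```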
All elements must have a closing tag.
Element names are case sensitive.
All elements must be properly nested.
All XML documents (or files) must have a root element.
All attribute values must be quoted.
Special characters in content (such as &, <, and ") must be written using the 5 predefined entity references: &amp; &lt; &gt; &quot; &apos;.
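If you generate XML from a script, the escaping can be left to a library. A small sketch, assuming Python's standard xml.sax.saxutils module; the sample text is made up for illustration.

```python
from xml.sax.saxutils import escape

# Hypothetical text containing characters that are special in XML.
text = 'Fish & Chips < "Dinner"'

# escape() always handles &, < and >; the extra mapping adds the two quote entities.
escaped = escape(text, {'"': "&quot;", "'": "&apos;"})

print(escaped)
```

The escaped string can then be placed safely inside an element or a quoted attribute value.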
More XML
XML has comments that appear in the following syntax:
<!-- Add your comments here -->
White space is preserved in XML: “Hello      Homer.” keeps its extra spaces, whereas HTML would collapse them and display “Hello Homer.”
A new line in XML is stored as just a line feed, whereas in Windows it is a carriage return and line feed. Use
Notepad++ or Oxygen to edit your XML.
An XML document is well-formed if it conforms to the rules above.
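Any XML parser will refuse a document that breaks these rules, which gives a quick well-formedness check. The sketch below assumes Python's standard xml.etree.ElementTree; the function name is_well_formed and the sample strings are illustrative only.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts xml_text, False if it raises a parse error."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True

good = "<note><to>Homer</to></note>"
not_closed = "<note><to>Homer</note>"               # <to> is never closed
bad_nesting = "<note><to><b>Homer</to></b></note>"  # tags overlap
wrong_case = "<note><To>Homer</to></note>"          # To and to are different names
unquoted = "<note type=memo>Hi</note>"              # attribute value not quoted
two_roots = "<note/><note/>"                        # more than one root element

for sample in (good, not_closed, bad_nesting, wrong_case, unquoted, two_roots):
    print(is_well_formed(sample), sample)
```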
What is an element?
An element is everything from the start tag to the closing tag.
An element can contain:
• Other elements
• Text
• Attributes
• Mix of the above
<bookstore>
<book category="children">
<title>Harry Potter</title>
<author>J.K. Rowling</author>
<year>2005</year>
</book>
<book category="young adult">
<title>Hunger Games, book 1</title>
<author>Suzanne Collins</author>
<year>2008</year>
</book>
</bookstore>
Elements
XML Naming rules:
•Element names are case sensitive.
•Element names must start with a letter or an underscore.
•Element names can’t start with the letters xml (XML, xMl, xmL, etc.)
•Element names can contain letters, digits, hyphens, underscores, and periods
•Element names cannot contain spaces
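The naming rules above can be expressed as a short check. This is a deliberately simplified, ASCII-only sketch in Python: the XML specification also allows many non-ASCII letters, and the function name is_valid_element_name is made up for this example.

```python
import re

# First character: a letter or underscore.
# Remaining characters: letters, digits, periods, underscores, or hyphens.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")

def is_valid_element_name(name: str) -> bool:
    """Apply the slide's naming rules to a candidate element name (ASCII only)."""
    if name.lower().startswith("xml"):
        return False  # names may not start with xml in any letter case
    return NAME_PATTERN.fullmatch(name) is not None

for candidate in ("myAwesomeNote", "_note-1.a", "2ndNote", "my note", "XmlNote"):
    print(candidate, is_valid_element_name(candidate))
```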
Attributes
Attributes provide additional information about elements. Values must be placed in quotes.
<person gender="female">
<book category="young adult">
Notice that attribute values can have spaces. Attributes can’t contain multiple values or tree structures
and are therefore not very extensible. The same information can be recorded as a child element instead:
<person>
<gender>female</gender>
</person>
When would you use an attribute and not an element? It depends on what you want and if you are
writing an XML document based on definitions already decided for you such as a metadata standard.
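From the point of view of a program reading the XML, the two forms are simply reached in different ways. A minimal sketch, assuming Python's standard xml.etree.ElementTree:

```python
import xml.etree.ElementTree as ET

as_attribute = ET.fromstring('<person gender="female"/>')
as_element = ET.fromstring("<person><gender>female</gender></person>")

# An attribute is read from the element itself; a child element is looked up by name.
from_attribute = as_attribute.get("gender")
from_element = as_element.findtext("gender")

print(from_attribute, from_element)
```

The element form could later grow children or repeat; the attribute form cannot, which is the trade-off described above.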
Name Conflicts
Because you can create your own elements, there are times when elements have the same
name but refer to very different things.
Here’s an HTML table:
Here’s a table that is a piece of furniture:
If we combine these XML documents, there will be a conflict. How do you know that <table> is
different from <table>?
Namespaces – The Name Authority of XML
Name conflicts such as this are resolved by adding a prefix. The prefix stands for a namespace and must
be defined by using the xmlns attribute in the start tag of the root or of the element that uses it.
xmlns:prefix="URI"
The URI can be fictional in some cases. In many cases it is not, and it refers to what is called a
schema or document type definition (DTD). A schema (XSD) is like a dictionary and grammar for an XML
document. It outlines the syntax and semantics that an XML document needs to follow in order
to conform to that schema.
For example, an XML document that is a MODS record and references the MODS schema must conform to
the syntax and semantics required by MODS as specified by the MODS schema. Likewise, if you want to
write in German, you need a German dictionary and grammar book to help you.
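To see how prefixes keep two table elements apart, here is a small sketch assuming Python's standard xml.etree.ElementTree. The prefixes h and f and both example.org URIs are invented for illustration (fictional URIs are allowed, as noted above).

```python
import xml.etree.ElementTree as ET

DOC = """<root xmlns:h="http://example.org/html"
      xmlns:f="http://example.org/furniture">
  <h:table><h:tr><h:td>Apples</h:td></h:tr></h:table>
  <f:table><f:name>Coffee table</f:name></f:table>
</root>"""

root = ET.fromstring(DOC)

# The parser replaces each prefix with its URI, so the two tables get different names.
tags = [child.tag for child in root]
print(tags)

# Searches take a prefix-to-URI mapping; only the URI has to match the document.
ns = {"f": "http://example.org/furniture"}
furniture_name = root.findtext("f:table/f:name", namespaces=ns)
print(furniture_name)
```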
Metadata Object Description Schema
MODS is an XML-based bibliographic description schema developed and maintained by the
Library of Congress. It is a compromise between the simplicity of Dublin Core and the complexity
of MARC. It was developed in 2002. MODS is currently in version 3.6.
The main web site for MODS: http://www.loc.gov/standards/mods/.
This site provides information about the standard, guidelines, tools, schemas (for each version of
MODS), conversions, etc.
The CTDA does not implement the full standard of MODS.
CTDA Implementation of MODS
CTDA’s implementation guidelines and metadata application profile can be found online on our
web site (http://ctdigitalarchive.org/resources-for-participants).
These guidelines and profile are based on the full standard and in part on the technical
infrastructure’s capabilities for managing metadata. Such capabilities include indexing,
mapping/transforming, re-using, sharing, displaying, or extracting metadata.
CTDA implements MODS version 3.5 and references that version in MODS XML records using the
XML namespace declaration, xmlns, and the prefix, mods.
Minimum MODS XML
XML Declaration
◦ <?xml version="1.0" encoding="UTF-8"?>
Root
◦ <mods:mods xmlns:mods="http://www.loc.gov/mods/v3" xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="3.5"
xsi:schemaLocation="http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-5.xsd">
Title
◦ <mods:titleInfo><mods:title>
Resource type
◦ <mods:typeOfResource>
Digital Resource
◦ <mods:physicalDescription><mods:digitalOrigin>
Minimum MODS XML Continued
Held By
◦ <mods:note type="ownership">
Rights
◦ <mods:accessCondition type="use and reproduction">
Persistent Identifier
◦ <mods:identifier type="hdl">
Language of MODS record
◦ <mods:recordInfo><mods:languageOfCataloging><mods:languageTerm type="code" authority="iso639-2b">
Remember that each opening tag needs a closing tag and there is a specific MODS tree to follow
according to the MODS specification or the schema version 3.5.
Example of Minimal MODS XML Document
<?xml version="1.0" encoding="UTF-8"?>
<mods:mods xmlns:mods="http://www.loc.gov/mods/v3" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="3.5"
xsi:schemaLocation="http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-5.xsd">
<mods:titleInfo>
<mods:title>This is an example title for an image</mods:title>
</mods:titleInfo>
<mods:typeOfResource>still image</mods:typeOfResource>
<mods:physicalDescription>
<mods:digitalOrigin>reformatted digital</mods:digitalOrigin>
</mods:physicalDescription>
<mods:note type="ownership">Bridgeport History Center, Bridgeport Public Library</mods:note>
<mods:accessCondition type="use and reproduction">Rights statement</mods:accessCondition>
<mods:identifier type="hdl">http://hdl.handle.net/11134/110002:495858</mods:identifier>
<mods:recordInfo>
<mods:languageOfCataloging>
<mods:languageTerm type="code" authority="iso639-2b">eng</mods:languageTerm>
</mods:languageOfCataloging>
</mods:recordInfo>
</mods:mods>
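A record can be checked against this minimum list with a few lines of code. The sketch below assumes Python's standard xml.etree.ElementTree and the MODS namespace URI http://www.loc.gov/mods/v3 as published by the Library of Congress; the function name and the short sample record are made up, and this is not a CTDA tool.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# One path per item on the "Minimum MODS XML" slides, relative to the root.
REQUIRED_PATHS = [
    "mods:titleInfo/mods:title",
    "mods:typeOfResource",
    "mods:physicalDescription/mods:digitalOrigin",
    "mods:note[@type='ownership']",
    "mods:accessCondition[@type='use and reproduction']",
    "mods:identifier[@type='hdl']",
    "mods:recordInfo/mods:languageOfCataloging/mods:languageTerm",
]

def missing_minimum_elements(mods_xml: str) -> list:
    """Return the required paths that cannot be found in the record."""
    root = ET.fromstring(mods_xml)
    return [path for path in REQUIRED_PATHS if root.find(path, MODS_NS) is None]

# Hypothetical, deliberately incomplete record.
SAMPLE = """<mods:mods xmlns:mods="http://www.loc.gov/mods/v3" version="3.5">
  <mods:titleInfo><mods:title>Example</mods:title></mods:titleInfo>
  <mods:typeOfResource>still image</mods:typeOfResource>
</mods:mods>"""

missing = missing_minimum_elements(SAMPLE)
for path in missing:
    print("missing:", path)
```

This only tests for presence; it does not replace validation against the MODS 3.5 schema.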
MODS XML Explained
Tree of the minimal record:
XML declaration
+
Root (mods:mods)
    mods:titleInfo
        mods:title
    mods:typeOfResource (controlled vocabulary)
    mods:physicalDescription
        mods:digitalOrigin (controlled vocabulary)
    mods:note (attribute: type)
    mods:accessCondition (attribute: type)
    mods:identifier (attribute: type)
    mods:recordInfo
        mods:languageOfCataloging
            mods:languageTerm (attributes: type, authority)
Steps for writing the record:
XML declaration
Open root
Open 1st child (titleInfo)
Open 1st grandchild (or child of parent titleInfo) (title)
Add content
Close 1st grandchild (title)
Close 1st child (titleInfo)
Open 2nd child (typeOfResource)
Add content
Close 2nd child (typeOfResource)
Open 3rd child (physicalDescription)
Open child of parent physicalDescription (digitalOrigin)
Add content using one of the required terms from schema
Close child of parent (digitalOrigin)
Close 3rd child (physicalDescription)
Open 4th child (note)
Add attribute type with suggested value based on LC recommendations
Add content
Close 4th child
ETC.
Attributes go in the opening tag only.
Particulars of MODS
typeOfResource has a required value list: text; cartographic; notated music; sound recording-musical; sound
recording-nonmusical; sound recording; still image; moving image; three dimensional object; software,
multimedia; mixed material.
digitalOrigin has a required value list: born digital; reformatted digital; digitized microfilm; digitized other analog.
languageTerm requires the attribute type with the value code and the attribute authority set to iso639-2b.
The attribute qualifier for dateIssued has a required value list: approximate; inferred; questionable.
There is an ORDER to how elements appear. For example, the element scale must appear before coordinates.
We don’t use the MODS element relatedItem.
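The required value lists lend themselves to an automated check. The following is a sketch only, assuming Python's standard xml.etree.ElementTree; the function name and sample record are hypothetical. Note that "software, multimedia" is treated as one value, as it is in the MODS schema.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Value lists taken from the slide above.
ALLOWED = {
    "mods:typeOfResource": {
        "text", "cartographic", "notated music", "sound recording-musical",
        "sound recording-nonmusical", "sound recording", "still image",
        "moving image", "three dimensional object", "software, multimedia",
        "mixed material",
    },
    "mods:physicalDescription/mods:digitalOrigin": {
        "born digital", "reformatted digital", "digitized microfilm",
        "digitized other analog",
    },
}

def vocabulary_problems(mods_xml: str) -> list:
    """Report every controlled element whose text is not in its value list."""
    root = ET.fromstring(mods_xml)
    problems = []
    for path, allowed in ALLOWED.items():
        for element in root.findall(path, MODS_NS):
            value = (element.text or "").strip()
            if value not in allowed:
                problems.append(f"{path}: {value!r} is not an allowed value")
    return problems

# Hypothetical record with one bad value.
SAMPLE = """<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
  <mods:typeOfResource>photograph</mods:typeOfResource>
  <mods:physicalDescription>
    <mods:digitalOrigin>reformatted digital</mods:digitalOrigin>
  </mods:physicalDescription>
</mods:mods>"""

problems = vocabulary_problems(SAMPLE)
print(problems)
```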
Particulars of CTDA MODS - Name
When you want to include a name such as an author or contributor, the role must be specified
and the entire name goes into one namePart element. The element name requires the attribute
type, which must have one of the values personal, corporate, family, or conference. The child of role,
roleTerm, requires the attributes authority and type with the values marcrelator and
text, respectively.
<mods:name type="personal">
<mods:namePart>Smith, John, 1850-1899</mods:namePart>
<mods:role>
<mods:roleTerm authority="marcrelator" type="text">Author</mods:roleTerm>
</mods:role>
</mods:name>
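If names come from a spreadsheet or database, the same structure can be built programmatically. This is one possible sketch, assuming Python's standard xml.etree.ElementTree; make_name is a hypothetical helper, not part of any CTDA or MODS tooling.

```python
import xml.etree.ElementTree as ET

MODS_URI = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_URI)  # serialize with the mods: prefix

NAME_TYPES = {"personal", "corporate", "family", "conference"}

def make_name(name_part: str, role: str, name_type: str = "personal") -> ET.Element:
    """Build a mods:name element with one namePart and one text roleTerm."""
    if name_type not in NAME_TYPES:
        raise ValueError(f"type must be one of {sorted(NAME_TYPES)}")
    name = ET.Element(f"{{{MODS_URI}}}name", {"type": name_type})
    ET.SubElement(name, f"{{{MODS_URI}}}namePart").text = name_part
    role_el = ET.SubElement(name, f"{{{MODS_URI}}}role")
    role_term = ET.SubElement(
        role_el, f"{{{MODS_URI}}}roleTerm",
        {"authority": "marcrelator", "type": "text"},
    )
    role_term.text = role
    return name

name = make_name("Smith, John, 1850-1899", "Author")
print(ET.tostring(name, encoding="unicode"))
```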
Particulars of CTDA MODS - Date
Dates are not required. If you add a date, CTDA implements the element dateIssued and requires the w3cdtf encoding and the attribute
keyDate. For date ranges, it is necessary to add the attribute point with either the value start or end.
Single Date:
<mods:originInfo>
<mods:dateIssued encoding="w3cdtf" keyDate="yes">2010</mods:dateIssued>
</mods:originInfo>
Date Range:
<mods:originInfo>
<mods:dateIssued encoding="w3cdtf" keyDate="yes" point="start">1907</mods:dateIssued>
<mods:dateIssued encoding="w3cdtf" point="end">1917</mods:dateIssued>
</mods:originInfo>
Single Date with Qualifier:
<mods:originInfo>
<mods:dateIssued encoding="w3cdtf" keyDate="yes" qualifier="inferred">1908</mods:dateIssued>
</mods:originInfo>
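The three date patterns follow one rule set, so they can be generated by a single function. A sketch under assumptions: Python's standard xml.etree.ElementTree, a hypothetical helper named date_issued, and the qualifier placed on the first date only, since the slides show a qualifier only on a single date.

```python
import xml.etree.ElementTree as ET

MODS_URI = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_URI)

QUALIFIERS = {"approximate", "inferred", "questionable"}

def date_issued(start: str, end: str = None, qualifier: str = None) -> ET.Element:
    """Return mods:originInfo holding a single date or a start/end range."""
    origin = ET.Element(f"{{{MODS_URI}}}originInfo")
    first = ET.SubElement(
        origin, f"{{{MODS_URI}}}dateIssued",
        {"encoding": "w3cdtf", "keyDate": "yes"},
    )
    first.text = start
    if qualifier is not None:
        if qualifier not in QUALIFIERS:
            raise ValueError(f"qualifier must be one of {sorted(QUALIFIERS)}")
        first.set("qualifier", qualifier)
    if end is not None:
        # A range: mark the first date as the start and add an end date.
        first.set("point", "start")
        last = ET.SubElement(
            origin, f"{{{MODS_URI}}}dateIssued",
            {"encoding": "w3cdtf", "point": "end"},
        )
        last.text = end
    return origin

single = date_issued("2010")
date_range = date_issued("1907", "1917")
inferred = date_issued("1908", qualifier="inferred")
print(ET.tostring(date_range, encoding="unicode"))
```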
Particulars of CTDA MODS - Coordinates
In CTDA you can record both a center point and a bounding box. The center point is recorded in the element
<mods:coordinates>. MODS 3.5 does not have a convenient way to record a bounding box. We use the
<mods:extension> element to record bounding box information following the FGDC Content Standard for Digital Geospatial Metadata (CSDGM).
<mods:cartographics>
<mods:scale>0.4583333333333333</mods:scale>
<mods:coordinates>42.023187, -71.852071</mods:coordinates>
</mods:cartographics>
<mods:extension xmlns:fgdc="http://www.fgdc.gov/schemas/metadata/fgdc-std-001-1998.xsd">
<fgdc:metadata>
<fgdc:idinfo>
<fgdc:spdom>
<fgdc:bounding>
<fgdc:westbc>-71.852071</fgdc:westbc>
<fgdc:eastbc>-71.841559</fgdc:eastbc>
<fgdc:northbc>42.030805</fgdc:northbc>
<fgdc:southbc>42.023187</fgdc:southbc>
</fgdc:bounding>
</fgdc:spdom>
</fgdc:idinfo>
</fgdc:metadata>
</mods:extension>
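Reading the bounding box back out shows how the two namespaces work together in one record. This sketch assumes Python's standard xml.etree.ElementTree and takes the fgdc namespace URI to be the FGDC schema address used in the example; the function name bounding_box and the trimmed sample record are illustrative.

```python
import xml.etree.ElementTree as ET

NS = {
    "mods": "http://www.loc.gov/mods/v3",
    "fgdc": "http://www.fgdc.gov/schemas/metadata/fgdc-std-001-1998.xsd",
}

BOUNDING_PATH = "mods:extension/fgdc:metadata/fgdc:idinfo/fgdc:spdom/fgdc:bounding"

def bounding_box(mods_root: ET.Element):
    """Return the four bounding coordinates as floats, or None if absent."""
    bounding = mods_root.find(BOUNDING_PATH, NS)
    if bounding is None:
        return None
    return {
        side: float(bounding.findtext(f"fgdc:{side}bc", namespaces=NS))
        for side in ("west", "east", "north", "south")
    }

# Trimmed record using the coordinate values from the slide.
SAMPLE = """<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
  <mods:extension xmlns:fgdc="http://www.fgdc.gov/schemas/metadata/fgdc-std-001-1998.xsd">
    <fgdc:metadata><fgdc:idinfo><fgdc:spdom><fgdc:bounding>
      <fgdc:westbc>-71.852071</fgdc:westbc>
      <fgdc:eastbc>-71.841559</fgdc:eastbc>
      <fgdc:northbc>42.030805</fgdc:northbc>
      <fgdc:southbc>42.023187</fgdc:southbc>
    </fgdc:bounding></fgdc:spdom></fgdc:idinfo></fgdc:metadata>
  </mods:extension>
</mods:mods>"""

box = bounding_box(ET.fromstring(SAMPLE))
print(box)
```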
Particulars of CTDA MODS – Aggregating Content
There is one repository where all content is stored for long-term preservation purposes. Content
can be presented on different “channels” or sites. One way of doing this is using what are called
Aggregation Tags. These tags are 3 uppercase letters. Each tag designates a particular channel.
The index is configured to recognize these tags and then push content to where it needs to go.
CTDA has 2 tags: CHO, GEO. These tags are values that go in the element
<mods:targetAudience>. This element, targetAudience, CANNOT be used for any other type of
content or tags that are made up on the fly.
<mods:targetAudience>CHO</mods:targetAudience>
<mods:targetAudience>GEO</mods:targetAudience>
Question: What is the parent element of this element?
Question: What’s the difference between <mods:targetAudience> and <targetAudience>?
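Because targetAudience is reserved for aggregation tags, a record can be screened for stray values before it is loaded. A minimal sketch, assuming Python's standard xml.etree.ElementTree; the function name and sample record are hypothetical, and the tag list simply repeats the two tags named above.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}
AGGREGATION_TAGS = {"CHO", "GEO"}

def invalid_aggregation_tags(mods_xml: str) -> list:
    """Return targetAudience values that are not recognized aggregation tags."""
    root = ET.fromstring(mods_xml)
    values = [
        (element.text or "").strip()
        for element in root.findall(".//mods:targetAudience", MODS_NS)
    ]
    return [value for value in values if value not in AGGREGATION_TAGS]

# Hypothetical record with one made-up value.
SAMPLE = """<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
  <mods:targetAudience>CHO</mods:targetAudience>
  <mods:targetAudience>GEO</mods:targetAudience>
  <mods:targetAudience>adults</mods:targetAudience>
</mods:mods>"""

bad_tags = invalid_aggregation_tags(SAMPLE)
print(bad_tags)
```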
How To Recognize Parent/Child Relationships?
If you go to the MODS 3.5 outline on the main MODS web site
(http://www.loc.gov/standards/mods/mods-outline-3-5.html), you will see a list of the TOP
LEVEL elements. Top level elements are all children of the root. Each top level element is
then described in terms of its children, required or recommended attributes, and
other requirements.
Requirements of CTDA MODS
Well-Formed XML
The MODS XML document conforms to the requirements
of the XML standard.
Do you remember the requirements?
There are online tools to check this:
http://www.w3schools.com/xml/xml_validator.asp
http://xmlgrid.net/validator.html
Oxygen XML Editor software
Valid Document
The MODS XML document conforms to the requirements
of MODS version 3.5.
http://www.loc.gov/standards/mods/v3/mods-3-5.xsd
What does this mean?
There are online tools to check this:
http://www.xmlvalidation.com/
http://www.utilities-online.info/xsdvalidation/#.VVS9x_lVhBc (requires you to input
both your XML and the MODS 3.5 XSD)
Oxygen XML Editor software
An example
Let’s write a MODS xml document from scratch….
Questions
Links:
http://ctdigitalarchive.org/resources-for-participants/
http://www.loc.gov/standards/mods/
http://www.w3schools.com/xml/default.asp
http://www.w3schools.com/xml/xml_schema.asp
http://www.w3schools.com/xml/xml_validator.asp
http://www.utilities-online.info/xsdvalidation/#.VVTCc_lVhBc
http://www.oxygenxml.com/
https://notepad-plus-plus.org/

CTDA Workshop on XML and MODS

  • 1. XML & MODS INTRODUCTION TO XML AND THE METADATA OBJECT DESCRIPTION SCHEMA IN THE CTDA
  • 2. XML XML stands for eXtensible markup language. XML was designed to describe data whereas HTML was designed to display data. XML uses “tags”. In metadata land, these are also referred to as “labels”, “elements”, or “fields”. These tags are not predefined but are meant to be self-descriptive. You can invent your own tags in XML. By itself, XML DOES NOT DO ANYTHING. XML needs a script written by someone or a piece of software to receive, send, transform, or display it. XML is a software and hardware independent tool for carrying information. It is not a replacement for HTML but can be a complement to HTML.
  • 3. Extensible You can create and define your own tags. <note> <myAwesomeNote> <thisIsMyTag> The power of being extensible is the ability to customize your xml.
  • 4. Markup It’s all about the <tags>. The angle brackets are the most recognizable feature of XML. These tags or elements are very similar to the ones in HTML. Elements are surrounding by angle brackets. Each element has an opening and closing designation like HTML.
  • 5. Language XML is a language or rather a “meta” – language. XML allows you to create and definite other languages. Have you ever heard of RSS feeds, XSLT, or XSD? Languages such as XSLT and XSD are sometimes referred to as members of the XML family. XSLT is eXtensible stylesheet transformation XSD is eXtensible schema definition
  • 6. XML Documents When you create an XML file or document, you essentially are creating a text file with the extension .xml. Because it is a text file, it can be read by any type of software or hardware. This is why xml simplifies data sharing and transport. It also helps when you change platforms because text can be read a large number of programs and systems. XML documents all have the same structure, called a tree. There is a branch, limbs, and leaves. The XML declaration declares that this is an XML document. The branch of the tree is called the root. The limbs and leaves are called children. Another name for the root is “parent”.
  • 7. XML Document <?xml version=“1.0” encoding=“UTF-8”?> <note> <to>Homer</to> <from>Marcy</from> <heading>Reminder</heading> <body>Don’t forget about the BBQ this weekend</body> </note> __ <root> <child> <subchild>….</subchild> </child> </root> XML Declaration The Root or ultimate parent element Children elements to the parent element, note, which is also the root note to from heading body
  • 8. XML Expanded <note> is the root. It is also the parent to 4 children. <to>, <from>, <heading>, <body> are children to its parent, <note>, and are siblings. A parent element does not necessarily have to be the root element in the XML file. All elements must have a closing tag. All elements are case sensitive. All elements must be properly nested. All XML documents (or files) must have a root element. All attributes values must be quoted. All entity references (such as &, <, “, etc.) must use the 5 pre-defined entity references.
  • 9. More XML XML has comments that appear in the following syntax: <!-- Add your comments here --> White-space is preserved in XML. Hello Homer. Hello Homer. A new line in XML is just a line feed whereas in Windows it is a carriage return and line feed. Use Notepad++ or Oxygen to edit your XML. An XML document is well-formed is it conforms to the rules above.
  • 10. What is an element? An element is everything from the start tag to the closing tag. An element can contain: • Other elements • Text • Attributes • Mix of the above <bookstore> <book category=“children”> <title>Harry Potter</title> <author>J.K. Rowling</author> <year>2005</year> </book> <book category=“young adult”> <title>Hunger Games, book 1</title> <author>Suzanne Collins</author> <year>2008</year> </book> </bookstore>
  • 11. Elements XML Naming rules: •Elements are case sensitive. •Element names must start with a letter or an underscore. •Element names can’t start with the letters xml (XML, xMl, xmL, etc.) •Element names can contain letters, digits, hyphens, underscores, and periods •Element names cannot contain spaces
  • 12. Attributes Attributes provide additional information about elements. Values must be placed in quotes. <person gender=“female”> <book category=“young adult”> Notice that attribute values can have spaces. Attributes can’t have multiple values, tree structures and are there not very extensible. <person> <gender>female</gender> </person> When would you use an attribute and not an element? It depends on what you want and if you are writing an XML document based on definitions already decided for you such as a metadata standard.
  • 13. Name Conflicts Because you can create your own elements, there are times when elements have the same name but refer to very different things. Here’s an HTML table: Here’s a table that is a piece of furniture: If we combine these XML documents, there will be a conflict. How do you know that <table> is different from <table>?
  • 14. Namespaces – The Name Authority of XML Name conflicts such as this are resolved by adding a prefix. The prefix is a namespace and must be defined by using the xmlns attribute in the start tag of the root or element. xmlns:prefix=“URI” The URI can be fictional in some cases. In many cases, it is not and refers to what is called a schema or document definition type. A schema, XSD, is like a dictionary and grammar for an XML document. It outlines the syntax and semantics that an XML document needs to follow in order to conform to that schema. For example, an XML that is a MODS file and that references the MODS schema must conform to the syntax and semantics required by MODS as specified by the MODS schema. If you want to learn German, you need a German dictionary and grammar book to help you write in German.
  • 15. Metadata Object Description Schema MODS is an XML based bibliographic description schema developed and maintained by the Library of Congress. It is a compromise between the simplicity of Dublin Core and the complexity of MARC. It was developed in 2002. Currently, MODS is now in version 3.6. The main web site for MODS: http://www.loc.gov/standards/mods/. This site provides information about the standard, guidelines, tools, schemas (for each version of MODS), conversions, etc. The CTDA does not implement the full standard of MODS.
  • 16. CTDA Implementation of MODS CTDA’s implementation guidelines and metadata application profile can be found online on our web site (http://ctdigitalarchive.org/resources-for-participants). These guidelines and profile are based on the full standard and in part on the technical infrastructure’s capabilities for managing metadata. Such capabilities include indexing, mapping/transforming, re-using, sharing, displaying, or extracting metadata. CTDA implements MODS version 3.5 and references that version in MODS XML records using the XML namespace declaration, xmlns, and the prefix, mods.
  • 17. Minimum MODS XML XML Declaration ◦ <?xml version=“1.0” encoding=“UTF-8”> Root ◦ <mods:mods xmlns:mods=“http://www.loc.gov/mods/v3” xmlns:xlink=“http://www.w3.org/1999/xlink” xmlns:xsi=“http://www.w3.org/2001/XMLSchema-instance” version=“3.5” xsi:schemaLocation=“http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3- 5.xsd”> Title ◦ <mods:titleInfo><mods:title> Resource type ◦ <mods:typeOfResource> Digital Resource ◦ <mods:physicalDescription><mods:digitalOrigin>
  • 18. Minimum MODS XML Continued Held By ◦ <mods:note type=“ownership”> Rights ◦ <mods:accessCondition type=“use and reproduction”> Persistent Identifier ◦ <mods:identifier type=“hdl”> Language of MODS record ◦ <mods:recordInfo><mods:languageOfCataloging><mods:languageTerm type=“code” authority=“iso639- 2b”> Remember that each opening tag needs a closing tag and there is a specific MODS tree to follow according to the MODS specification or the schema version 3.5.
  • 19. Example of Minimal MODS XML Document <?xml version=“1.0” encoding=“UTF-8”> <mods:mods xmlns:mods=“http://www.loc.gov/mods/v3” xmlns:xlink=“http://www.w3.org/1999/xlink” xmlns:xsi=“http://www.w3.org/2001/XMLSchema-instance” version=“3.5” xsi:schemaLocation=“http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-5.xsd”> <mods:titleInfo> <mods:title>This is an example title an image</mods:title> </mods:titleInfo> <mods:typeOfResource>still image</mods:typeOfResource> <mods:physicalDescription> <mods:digitalOrigin>reformatted digital</mods:digitalOrigin> </mods:physicalDescription> <mods:note type=“ownership”>Bridgeport History Center, Bridgeport Public Library</mods:note> <mods:accessCondition type=“use and reproduction”>Rights statement</mods:accessCondition> <mods:identifier type=“hdl”>http://hdl.handle.net/11134/110002:495858</mods:identifier> <mods:recordInfo> <mods:languageOfCataloging> <mods:languageTerm type=“code” authority=“iso639-2b”>eng</mods:languageTerm> </mods:languageOfCataloging> </mods:recordInfo> </mods:mods>
  • 20. MODS XML Explained XML declaration + Root (mods:mods) mods:titleInfo mods:title mods:typeOfResource (controlled vocabulary) mods:physicalDescription mods:digitalOrigin (controlled vocabulary) mods:note mods:accessCondition mods:identifier mods:recordInfo mods:languageOfCataloging mods:languageTerm XML declaration Open root Open 1st child (titleInfo) Open 1st grandchild (or child of parent titleInfo) (title) Add content Close 1st grandchild (title) Close 1st child (titleInfo) Open 2nd child (typeOfResource) Add content Close 2nd child (typeOfResource) Open 3rd child (physicalDescription) Open child of parent physicalDescription (digitalOrigin) Add content using one of the required terms from schema Close child of parent (digitalOrigin) Close 3rd child (physicalDescription) Open 4th child (note) Add attribute type with suggested value based on LC recommendations Add content Close 4th child ETC. type type type type authority Attributes go in the opening tag only.
  • 21. Particulars of MODS typeOfResource has a required value list: text; cartographic; notated music; sound recording-musical; sound recording-nonmusical; sound recording; still image; moving image; three dimensional object; software; multimedia; mixed material. digitalOrigin has a required value list: born digital, reformatted digital, digitized microfilm, digitized other analog languageTerm requires the attribute type with the value of code and the attribute authority set to iso639-2b The attribute qualifier for dateIssued has a required value list: approximate, inferred, questionable. There is an ORDER to how elements appear. For example, the element scale must appear before coodinates. We don’t use the MODS element relatedItem.
  • 22. Particulars of CTDA MODS - Name
    When you want to include a name such as an author or contributor, the role must be specified and the entire name goes into one namePart element. The element name requires the attribute type, whose value must be one of personal, corporate, family, or conference. The child of role, roleTerm, requires the attributes authority and type, with the required values marcrelator and text respectively.
    <mods:name type="personal">
      <mods:namePart>Smith, John, 1850-1899</mods:namePart>
      <mods:role>
        <mods:roleTerm authority="marcrelator" type="text">Author</mods:roleTerm>
      </mods:role>
    </mods:name>
  • 23. Particulars of CTDA MODS - Date
    Dates are not required. If you add a date, CTDA uses the element dateIssued and requires the attribute encoding with the value w3cdtf and the attribute keyDate. For date ranges, add the attribute point with the value start or end.
    Single Date:
    <mods:originInfo>
      <mods:dateIssued encoding="w3cdtf" keyDate="yes">2010</mods:dateIssued>
    </mods:originInfo>
    Date Range:
    <mods:originInfo>
      <mods:dateIssued encoding="w3cdtf" keyDate="yes" point="start">1907</mods:dateIssued>
      <mods:dateIssued encoding="w3cdtf" point="end">1917</mods:dateIssued>
    </mods:originInfo>
    Single Date with Qualifier:
    <mods:originInfo>
      <mods:dateIssued encoding="w3cdtf" keyDate="yes" qualifier="inferred">1908</mods:dateIssued>
    </mods:originInfo>
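The three date patterns differ only in which attributes appear on dateIssued, so one small helper can produce all of them. This is a sketch under the assumption that the patterns on this slide are the only ones needed. The function name `origin_info` is invented for the example.

```python
import xml.etree.ElementTree as ET

MODS_URI = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_URI)  # serialize with the mods: prefix

DATE_ISSUED = f"{{{MODS_URI}}}dateIssued"


def origin_info(start, end=None, qualifier=None):
    """Build an originInfo element for a single date or a date range."""
    origin = ET.Element(f"{{{MODS_URI}}}originInfo")
    # The first (or only) date always carries the encoding and the key date.
    attrs = {"encoding": "w3cdtf", "keyDate": "yes"}
    if end is not None:
        attrs["point"] = "start"
    if qualifier is not None:
        attrs["qualifier"] = qualifier
    ET.SubElement(origin, DATE_ISSUED, attrs).text = start
    if end is not None:
        # The end of a range repeats the encoding but not keyDate.
        end_attrs = {"encoding": "w3cdtf", "point": "end"}
        ET.SubElement(origin, DATE_ISSUED, end_attrs).text = end
    return origin


date_range = origin_info("1907", "1917")
inferred = origin_info("1908", qualifier="inferred")
print(ET.tostring(date_range, encoding="unicode"))
```

Passing the dates as strings keeps partial w3cdtf values such as a bare year intact.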
  • 24. Particulars of CTDA MODS - Coordinates
    In CTDA you can record both a center point and a bounding box. The center point is recorded in the element <mods:coordinates>. MODS 3.5 does not have a convenient way to record a bounding box. We use the <mods:extension> element to record bounding box information following the FGDC content standard CSDGM.
    <mods:cartographics>
      <mods:scale>0.4583333333333333</mods:scale>
      <mods:coordinates>42.023187, -71.852071</mods:coordinates>
    </mods:cartographics>
    <mods:extension xmlns:fgdc="http://www.fgdc.gov/schemas/metadata/fgdc-std-001-1998.xsd">
      <fgdc:metadata>
        <fgdc:idinfo>
          <fgdc:spdom>
            <fgdc:bounding>
              <fgdc:westbc>-71.852071</fgdc:westbc>
              <fgdc:eastbc>-71.841559</fgdc:eastbc>
              <fgdc:northbc>42.030805</fgdc:northbc>
              <fgdc:southbc>42.023187</fgdc:southbc>
            </fgdc:bounding>
          </fgdc:spdom>
        </fgdc:idinfo>
      </fgdc:metadata>
    </mods:extension>
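Because the point and the box are recorded in two different places, a quick consistency check can catch swapped or mistyped values. The Python sketch below assumes the coordinates are written as "latitude, longitude", as in the example. The function name `point_within_box` is hypothetical, and the sample wraps cartographics in mods:subject, which is where the MODS schema places it as far as I know. The searches use `.//`, so they do not depend on that nesting.

```python
import xml.etree.ElementTree as ET

NS = {
    "mods": "http://www.loc.gov/mods/v3",
    "fgdc": "http://www.fgdc.gov/schemas/metadata/fgdc-std-001-1998.xsd",
}

SAMPLE = """<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
  <mods:subject>
    <mods:cartographics>
      <mods:coordinates>42.023187, -71.852071</mods:coordinates>
    </mods:cartographics>
  </mods:subject>
  <mods:extension xmlns:fgdc="http://www.fgdc.gov/schemas/metadata/fgdc-std-001-1998.xsd">
    <fgdc:metadata><fgdc:idinfo><fgdc:spdom><fgdc:bounding>
      <fgdc:westbc>-71.852071</fgdc:westbc>
      <fgdc:eastbc>-71.841559</fgdc:eastbc>
      <fgdc:northbc>42.030805</fgdc:northbc>
      <fgdc:southbc>42.023187</fgdc:southbc>
    </fgdc:bounding></fgdc:spdom></fgdc:idinfo></fgdc:metadata>
  </mods:extension>
</mods:mods>"""


def point_within_box(xml_text):
    """Return True if the recorded point lies inside or on the bounding box."""
    root = ET.fromstring(xml_text)
    coords = root.findtext(".//mods:coordinates", namespaces=NS)
    lat, lon = (float(part) for part in coords.split(","))
    box = root.find(".//fgdc:bounding", NS)
    west = float(box.findtext("fgdc:westbc", namespaces=NS))
    east = float(box.findtext("fgdc:eastbc", namespaces=NS))
    north = float(box.findtext("fgdc:northbc", namespaces=NS))
    south = float(box.findtext("fgdc:southbc", namespaces=NS))
    return south <= lat <= north and west <= lon <= east


inside = point_within_box(SAMPLE)
print(inside)
```

In the slide's example the point coincides with the south-west corner of the box, so the comparison has to include the boundary.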
  • 25. Particulars of CTDA MODS – Aggregating Content
    There is one repository where all content is stored for long-term preservation purposes. Content can be presented on different “channels” or sites. One way of doing this is using what are called Aggregation Tags. These tags are 3 uppercase letters. Each tag designates a particular channel. The index is configured to recognize these tags and then push content to where it needs to go.
    CTDA has 2 tags: CHO, GEO.
    These tags are values that go in the element <mods:targetAudience>. This element, targetAudience, CANNOT be used for any other type of content or for tags that are made up on the fly.
    <mods:targetAudience>CHO</mods:targetAudience>
    <mods:targetAudience>GEO</mods:targetAudience>
    Question: What is the parent element of this element?
    Question: What’s the difference between <mods:targetAudience> and <targetAudience>?
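One way to explore the second question is to see what a namespace-aware parser does with each spelling. In the Python sketch below, the function names `aggregation_tags` and `unknown_tags` and the three sample strings are assumptions for illustration. The allowed tags are the two listed on this slide.

```python
import xml.etree.ElementTree as ET

MODS_URI = "http://www.loc.gov/mods/v3"
KNOWN_TAGS = {"CHO", "GEO"}  # the two CTDA aggregation tags


def aggregation_tags(xml_text):
    """Return the content of every targetAudience in the MODS namespace."""
    root = ET.fromstring(xml_text)
    qualified = f"{{{MODS_URI}}}targetAudience"
    return [(el.text or "").strip() for el in root.iter(qualified)]


def unknown_tags(xml_text):
    """Return tags that are not in the known set."""
    return [tag for tag in aggregation_tags(xml_text) if tag not in KNOWN_TAGS]


# Same namespace, written with a prefix.
PREFIXED = (
    '<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">'
    "<mods:targetAudience>CHO</mods:targetAudience>"
    "<mods:targetAudience>GEO</mods:targetAudience>"
    "</mods:mods>"
)
# Same namespace, declared as the default, so no prefix is written.
DEFAULT_NS = (
    '<mods xmlns="http://www.loc.gov/mods/v3">'
    "<targetAudience>CHO</targetAudience>"
    "<targetAudience>GEO</targetAudience>"
    "</mods>"
)
# No namespace declared at all.
NO_NS = "<mods><targetAudience>CHO</targetAudience></mods>"

print(aggregation_tags(PREFIXED))
print(aggregation_tags(DEFAULT_NS))
print(aggregation_tags(NO_NS))
```

The first two records name the same elements in two spellings. In the third, the unprefixed element belongs to no namespace and is not found.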
  • 26. How To Recognize Parent/Child Relationships?
    If you go to the MODS 3.5 outline on the main MODS web site (http://www.loc.gov/standards/mods/mods-outline-3-5.html), you will see a list of the TOP LEVEL elements. Top level elements are all children of the root. Each top level element is then described in terms of its children, required or recommended attributes, and other requirements.
  • 27. Requirements of CTDA MODS
    Well-Formed XML
    The MODS XML document conforms to the requirements of the XML standard. Do you remember the requirements?
    There are online tools to check this:
      http://www.w3schools.com/xml/xml_validator.asp
      http://xmlgrid.net/validator.html
      Oxygen XML editing software
    Valid Document
    The MODS XML document conforms to the requirements of MODS version 3.5: http://www.loc.gov/standards/mods/v3/mods-3-5.xsd
    What does this mean?
    There are online tools to check this:
      http://www.xmlvalidation.com/
      http://www.utilities-online.info/xsdvalidation/#.VVS9x_lVhBc (requires you to input both your XML and the MODS 3.5 XSD)
      Oxygen XML editing software
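Well-formedness can also be checked locally. The sketch below uses only the Python standard library. The function name `check_well_formed` is an invention for this example. It covers only the first requirement: checking validity against the MODS 3.5 XSD needs a schema-aware tool and is not shown here.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, message) if the text parses as XML, else (False, error)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, "well-formed"


GOOD = (
    '<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">'
    "<mods:titleInfo><mods:title>Example</mods:title></mods:titleInfo>"
    "</mods:mods>"
)
# The title element is never closed before its parent closes.
UNCLOSED = (
    '<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">'
    "<mods:titleInfo><mods:title>Example</mods:titleInfo>"
    "</mods:mods>"
)
# Curly quotes, as often pasted from slides or word processors,
# are not valid attribute delimiters.
CURLY = "<note type=“ownership”>text</note>"

for sample in (GOOD, UNCLOSED, CURLY):
    print(check_well_formed(sample))
```

A record can pass this check and still be invalid MODS, for example by using a typeOfResource value that is not on the required list.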
  • 28. An example Let’s write a MODS xml document from scratch….