Archival Information Packages for NASA HDF-EOS Data
R. Duerr, Kent Yang, Azhar Sikander
Outline
• What is an Archival Information Package?
  - HDF-AIP

• Standards? What Standards?
  - METS
  - DIF/FGDC/ISO 19115-2
  - PREMIS

• Results
• Next Steps

Archival Information Packages for NASA HDF-EOS Data, presented 11/4/09 by R. Duerr HDF
and HDF-EOS Workshop XIII
OAIS Reference Model [1]
Archive Information Package

[1] Reference Model for an Open Archival Information System (OAIS), CCSDS 650.0-B-1, Blue Book, January 2002.

Archival Information Package Contents
• Content Information
  - The data object to be preserved
  - Information that describes the data object
    o Typically interpreted as the syntax and semantics of the file structure

• Preservation Description Information
  - Provenance – origin or source of the data, any changes that have taken place since, and who has had custody of it
  - Fixity – the authentication mechanisms (with keys) needed to ensure that the data object has not been altered in an undocumented manner
  - Reference – identification mechanisms and values
  - Context – relation of the object to its environment
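Fixity is often implemented by recording a checksum at ingest and recomputing it later. The following is a minimal Python sketch of that idea, not something taken from the slides: the choice of SHA-256 and the helper names `file_digest` and `verify_fixity` are assumptions, and a plain digest only detects change (a keyed mechanism such as an HMAC would be needed to cover the "with keys" part).

```python
import hashlib
import os
import tempfile


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, recorded_digest, algorithm="sha256"):
    """Return True if the file still matches the digest recorded at ingest."""
    return file_digest(path, algorithm) == recorded_digest.lower()


# A throwaway file stands in for an HDF data file (illustrative only).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"stand-in granule bytes")
    granule_path = tmp.name

recorded = file_digest(granule_path)               # value stored in the AIP
unchanged = verify_fixity(granule_path, recorded)  # file not yet touched

with open(granule_path, "ab") as handle:           # simulate an undocumented change
    handle.write(b"!")
altered_detected = not verify_fixity(granule_path, recorded)

os.remove(granule_path)
```

The recorded digest would normally live in the preservation metadata that accompanies the data file, so that it can be re-checked independently of the file itself.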

HDF-Archive Information Packages
• The HDF Group was funded to investigate and propose a design for a complete archival information package for HDF data files
• The result was a METS metadata file to accompany the HDF data file
http://www.hdfgroup.org/projects/hdf5_aip/hdf5_aip_wp.html
Metadata Standards - METS
• Metadata Encoding and Transmission Standard
• An initiative of the Digital Library Federation
• Provides the means to convey the metadata necessary for
  - management of digital objects within a repository
  - exchange of objects between repositories (or between repositories and their users)

• Designed to facilitate
  - shared development of information management tools/services
  - interoperable exchange of digital materials

METS - A very brief overview
• metsHdr: describes the METS document itself, e.g., creator, editor
• dmdSec: describes the object using some external standard, e.g., MARC, FGDC, Dublin Core
• amdSec: describes object creation, storage, intellectual property rights, source info, provenance, etc., e.g., PREMIS
• fileSec: provides an inventory of all of the files that are part of the object described
• structMap: a physical or logical map of the organization of the materials described
• structLink: allows specification of hyperlinks between parts of the map (mostly useful when preserving websites)
• behaviorSec: used to associate executable code with parts of the content
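METS defines seven top-level sections (metsHdr, dmdSec, amdSec, fileSec, structMap, structLink, behaviorSec). As a rough illustration, the Python sketch below reports which of them appear in a document. The sample document is invented and deliberately incomplete, and the function name `present_sections` is hypothetical.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"

# Top-level METS sections, in schema order.
METS_SECTIONS = [
    "metsHdr", "dmdSec", "amdSec", "fileSec",
    "structMap", "structLink", "behaviorSec",
]

# Illustrative skeleton only; a real METS file carries content in each section.
SAMPLE = f"""<mets xmlns="{METS_NS}">
  <metsHdr CREATEDATE="2009-11-04T00:00:00"/>
  <dmdSec ID="DMD1"/>
  <amdSec ID="AMD1"/>
  <fileSec/>
  <structMap/>
</mets>"""


def present_sections(mets_xml):
    """List the top-level METS sections found in a document, in schema order."""
    root = ET.fromstring(mets_xml)
    # Strip the "{namespace}" prefix ElementTree puts on each tag.
    found = {child.tag.split("}", 1)[-1] for child in root}
    return [name for name in METS_SECTIONS if name in found]


sections = present_sections(SAMPLE)
```

Here `sections` omits `structLink` and `behaviorSec`, which matches the slide's remark that those sections mostly matter for web sites and executable content.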

Metadata Standards - Descriptive Metadata

• Discovery, Assess and Access Metadata
  - GCMD DIF
  - FGDC CSDGM (derived from DIF)
  - ISO 19115

Metadata Standards - ISO 19115:2003
• The international equivalent of the FGDC standard
• Most fields can be mapped or generated from FGDC metadata
• The exception is the Dataset Topic Keywords
• Allows for national profiles

Is there a metadata standard for AIP information?
Archive Information Package [1]

[1] Reference Model for an Open Archival Information System (OAIS), CCSDS 650.0-B-1, Blue Book, January 2002.

Preservation Metadata Implementation Strategies (PREMIS)
• Provides a core preservation metadata set with broad applicability across the digital preservation community
• Developed by an OCLC- and RLG-sponsored international working group
  - Representatives from libraries, museums, archives, government, and the private sector
• Maintained by the Library of Congress
• Based on the OAIS reference model

PREMIS - Entity-Relationship Diagram
• Intellectual Entities: “a coherent set of content that is reasonably described as a unit”
  - For example, a web site, data set or collection of data sets
• Objects: “a discrete unit of information in digital form”
  - For example, a data file
• Events: “an action that involves at least one object or agent known to the preservation repository”
  - e.g., created, archived, migrated
• Agents: “a person, organization, or software program associated with preservation events in the life of an object”
  - e.g., Dr. Spock donated it
• Rights: “assertions of one or more rights or permissions pertaining to an object or an agent”
  - e.g., copyright notice, legal statute, deposit agreement
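To make the relationships between PREMIS objects, events and agents concrete, here is a small Python sketch. It is a simplification under stated assumptions: the class and field names are hypothetical and are not PREMIS semantic unit names, and the identifiers and dates are invented for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class PremisObject:
    identifier: str
    # Maps a digest algorithm name to its hex value (fixity information).
    fixity: dict = field(default_factory=dict)


@dataclass
class Agent:
    identifier: str
    name: str
    agent_type: str  # "person", "organization", or "software"


@dataclass
class Event:
    event_type: str   # e.g., "creation", "migration"
    when: datetime
    object_ids: list  # identifiers of the objects the event involved
    agent_ids: list   # identifiers of the agents that carried it out
    detail: str = ""


def provenance(events, object_id):
    """Return the events that touched one object, oldest first."""
    hits = [event for event in events if object_id in event.object_ids]
    return sorted(hits, key=lambda event: event.when)


# Invented example records.
granule = PremisObject("granule-001.h5")
converter = Agent("agent-1", "HDF4-to-HDF5 converter", "software")
events = [
    Event("migration", datetime(2009, 6, 2, tzinfo=timezone.utc),
          [granule.identifier], [converter.identifier], "HDF4 to HDF5"),
    Event("creation", datetime(2009, 1, 15, tzinfo=timezone.utc),
          [granule.identifier], []),
]

history = [event.event_type for event in provenance(events, granule.identifier)]
```

Ordering the events that reference an object gives a simple provenance trail, which is the kind of information descriptive metadata alone does not carry.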
Is there a metadata standard for AIP information?
PREMIS
ISO 19115

[1] Reference Model for an Open Archival Information System (OAIS), CCSDS 650.0-B-1, Blue Book, January 2002.

NOAA Data Stewardship Prototype
• NSIDC and THG (The HDF Group) demonstrated the feasibility of migrating NASA data to a standard HDF-AIP format
• Motivation:
  - Technologies change regularly, organizations come and go, but data must survive
  - But preserving data takes more than just preserving the bits; all the components of an AIP are critical
Project Goals
• Prototype development of Archive Information Packages for HDF data:
  - For entire data sets
  - For individual “granules”

• Test usability of digital library standards with geospatial data

Program Plan (Modified)
[Figure: flow diagram. Recoverable box labels: NSIDC/ECS HDF4-data; H4toH5; NetCDF4/HDF5-data; NSIDC/ECS Metadata; ECS to METS (Data Set); ECS to METS (Granule); METS; ISO-19115; CDM/NetCDF4; NetCDF4/HDF5 Data; HDF5-AIP]
HDF5 Granule Level Archive Information Packages
HDF5 AIP Components
• Data file: HDF5
• Metadata file: METS

Primary Schema (METS)         Extension Schema
<mets>
|---<dmdSec> ---------------- ISO 19115
|---<amdSec> ---------------- PREMIS
|      |--<techMD>
|      |--<rightsMD>
|      |--<sourceMD>
|---<fileGrp>
|---<structMap>

http://www.hdfgroup.uiuc.edu/papers/papers/AIP/HDF5_AIP_White_Paper.pdf
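A granule-level METS skeleton along these lines could be generated as in the Python sketch below. This is an illustration, not the layout defined in the white paper: the function name `build_granule_mets`, the ID values, the file name and the use of `MDTYPE="OTHER"` with `OTHERMDTYPE="ISO19115"` are all assumptions, and the `mdWrap` elements are left empty, so the output is not schema-valid. Note that in the METS schema `fileGrp` sits inside `fileSec`.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def q(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_granule_mets(hdf5_href):
    """Build an empty METS skeleton that points at one HDF5 data file."""
    mets = ET.Element(q("mets"))

    # Descriptive metadata: placeholder wrapper for an ISO 19115 record.
    dmd = ET.SubElement(mets, q("dmdSec"), ID="DMD1")
    ET.SubElement(dmd, q("mdWrap"), MDTYPE="OTHER", OTHERMDTYPE="ISO19115")

    # Administrative metadata: placeholder wrappers for PREMIS records.
    amd = ET.SubElement(mets, q("amdSec"), ID="AMD1")
    for index, section in enumerate(("techMD", "rightsMD", "sourceMD"), 1):
        sec = ET.SubElement(amd, q(section), ID=f"ADM{index}")
        ET.SubElement(sec, q("mdWrap"), MDTYPE="PREMIS")

    # File inventory: a single entry for the HDF5 data file.
    file_sec = ET.SubElement(mets, q("fileSec"))
    group = ET.SubElement(file_sec, q("fileGrp"))
    entry = ET.SubElement(group, q("file"), ID="FILE1")
    ET.SubElement(entry, q("FLocat"),
                  {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": hdf5_href})

    # Structural map: one division that references the data file.
    struct_map = ET.SubElement(mets, q("structMap"))
    div = ET.SubElement(struct_map, q("div"), TYPE="granule")
    ET.SubElement(div, q("fptr"), FILEID="FILE1")
    return mets


root = build_granule_mets("granule-001.h5")
xml_text = ET.tostring(root, encoding="unicode")
```

Keeping the HDF5 file separate and referencing it from `FLocat` mirrors the two-file design on the slide: one data file, one METS metadata file.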

File Level AIP Activity Status
• Developed a map from NSIDC/ECS metadata to METS/PREMIS/ISO 19115 components
• Prototype software completed
• Issues
  - What goes in PREMIS vs ISO 19115?
  - Auxiliary file handling - own AIP or not?
    o E.g., browse files, processing history, PGEs
  - Granules vs files

Issues and Questions
• Inconsistent use of terminology between standards – for example, what is a data set?
• Many of the standards care about distribution formats
  - Are these even relevant concepts any more?
  - Do you really want to have to update the metadata record just because a new distribution format was added?
  - What about new access services?

Next Steps
• NSIDC is updating our non-ECS data systems' handling of metadata, including support for PREMIS and related metadata on all holdings
• Work is underway to upgrade granule-level metadata for NSIDC flagship sea ice products (PREMIS/METS/ISO AIP packages)
• Work to improve the archivability of data stored in HDF formats is ongoing – NASA is implementing a standard XML description of contents across its archives
Acknowledgement
This work was supported under NOAA Scientific Stewardship Program grant number NA07OAR4310286. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of NOAA.

Editor's Notes

  • #3: Lots of background material that I won’t really discuss – indicated
  • #5: Syntax - XFDU - DFDL - ESML Semantics?
  • #8: A couple of interesting and useful things about METS: it is deliberately designed to handle objects at a wide variety of scales (single files, complex web sites), and rather than attempting to define descriptive and administrative metadata needs for all kinds of objects, its designers built the standard to incorporate a variety of other standards (e.g., FGDC for geospatial metadata).
  • #9: When you talk to a geoscientist or data scientist who deals with geospatial data, these are the standards they know and care about. GCMD: the oldest and internationally accepted; NASA/NOAA/NSF require it for data set descriptions; the Global Change Master Directory is the data equivalent of WorldCat. FGDC: the Content Standard for Digital Geospatial Metadata; derived from DIF; mandated for all federally funded data by Executive Order. ISO 19115: the most recent standard, replacing FGDC; adopted by NOAA and likely NASA.
  • #13: But more than just descriptive metadata is needed. It is equally important to know what has happened to the data since its creation, to know its provenance.
  • #14: The PREMIS entity-relationship diagram. Representation - “the set of files needed for a complete and reasonable rendition of an Intellectual Entity”. File. Bitstream - “contiguous or non-contiguous data within a file that has meaningful common properties for preservation purposes”. So how does this apply to science data?
  • #17: Had been keeping track of events in the digital library world for a few years. Noticed that they’ve come up with standards to deal with a wide variety of information types. NOAA and USGS were to be the ultimate home of much of NASA’s EOS data. THG, with funding ultimately from the National Archives and Records Administration, had written a white paper defining an HDF-AIP using a digital library standard, a standard called METS.
  • #19: Primary Schema / Extension Schema. National Geospatial Digital Archive - LOC NDIIPP (National Digital Information Infrastructure and Preservation Program). Recommendation by Nancy Hoebelheinrich of Stanford.
  • #20: Different data sets are different: some data sets have 1 file per granule, others have many; some data sets have a browse file for each granule, while in others the mapping is 1 to many, many to 1, or many to many.
  • #21: In ISO 19115 parlance, a dataset is an “identifiable collection of data,” where a dataset may reside in a larger dataset, can be as small as a single feature, and could even be a single map or chart (see ISO 19115:2003(E) page 3). This is in contrast to a data series, which is a “collection of datasets sharing the same product specification,” where the phrase “product specification” is totally undefined. In NASA, NOAA, and NSF parlance, a data set is the collection of all of the files for a particular project, from a particular instrument, etc., preferentially all of the same type. A data set is comprised of data files or data granules. In HDF parlance, a Science Data Set is the unit within a file that contains a particular data array.
  • #21: In ISO 19115 parlance, a dataset is an “identifiable collection of data,” where a dataset may reside in a larger dataset, can be as small as a single feature, and could even be a single map or chart (see ISO 19115:2003(E) page 3). This is in contrast to a data series which is a “collection of datasets sharing the same product specification” where the phrase “product specification” is totally undefined. In NASA, NOAA, and NSF parlance a data set is the collection of all of the files for a particular project, from a particular instrument, etc. preferentially that are all of the same type. A data set is comprised of data files or data granules.In HDF parlance, a Science Data Set is the unit within a file that contains a particular data array.