The HDF Group

HDF4 Mapping Project Update
www.hdfgroup.org/projects/h4map

Ruth Aydt
(aydt@hdfgroup.org)

The HDF Group
The 15th HDF and HDF-EOS Workshop
April 17-19, 2012
Apr. 17-19, 2012

HDF/HDF-EOS Workshop XV

1

www.hdfgroup.org
Project Motivation
HDF4 file

DVD
HDFView

HDF4
Library

Project Purpose

Ensure long-term access
to EOS data
stored in HDF4 files.

Project Scope
April 2012

Time

HDF4 Library
HDF4 Files with EOS Data produced
HDF4 Files with EOS Data valuable to community
Concern
Idea

HDF4 Mapping Project Scope

Proof of Concept Prototype
Develop Product
Support

Verification Requirements Study

? Verification Implementation
HDF4 File Content Maps

Concern – Workshop VIII (2004)
“HDF and HDF EOS: Implications for Long-Term
Archiving and Data Access”
- Ruth Duerr, NSIDC

Slide Notes:
“Without human
readability you are
locked into having
to maintain the read
software forever!”

Idea – Workshop X (2006)
“Leveraging HDF Utilities” - Chris Lynnes, GES-DISC

HDF4 File Contents – User View
Objects & Relationships

Object Data

User Metadata

HDF4 File Contents – Format View
Diagram of the format-level objects behind one user-level variable, with multiplicities (1, 0...1, 0…*):
variable: name = variable_name, rank, type, storagetype
Vgroup: name = variable_name, class = Var0.0
NDG, SDD, SD, NT
data
byte order, chunked storage, compression, …
Object Data
attribute: name = attribute_name
Vdata: name = attribute_name, class = Attr0.0

Proof of Concept (8/07 - 7/08)
• Categorize HDF4 data held by NASA
• Build a prototype
HDF4 File

bytestreams

Map Writer
linked with
HDF4 library
request
Reader

HDF4 File Content
Map (XML)

Objects & Relationships;
User Metadata;
Object Data retrieval &
reconstruction information
2 independent readers
in C and Perl

Object Data

Develop Product (11/09 - 7/11)
Tasks:
A. Investigate integration of mapping schema
with existing standards
B. Determine HDF-EOS 2 requirements
C. Redesign and expand the XML schema
D. Implement production quality map writer
E. Develop demo map reader
F. Deploy tools at select NASA data centers

For preservation, we must get it right while the HDF4
library, tools, documentation, and expertise are around.

Develop Product (Tasks C & D)
C: HDF4 File Content Maps
• Have enough information to stand alone
• Described by schema

D: Production Quality Map Writer
• Read HDF4 file and create Map
• Command-line options fine-tune behavior

HDF4 Library
• New functions added to facilitate map creation

Surprise!
• Expected hardest part to be support for retrieval
and reconstruction of object data.
• In fact, making sure all user-created HDF4
objects were found and represented correctly
was a bigger challenge.
• Existing tools didn’t always
report same user-level
information.
• “Correctness” can be subject
to interpretation – not always
able to know intent of file
creator.
Image from publications.usa.gov

Project Actions in Response
User View
• Map from top down and bottom up
• Watch for extra parts

• “Over include” in map if any
doubt (e.g., 2 palettes for 1 raster)

Format View
• Improve HDF4 library, tools,
and documentation to
address ambiguities

HDF4 File Content Map

Represents HDF4 Objects and Relationships
Information needed to access and interpret object data in HDF4 file
Select object data values included to help reader program verify binary data handled properly

E: Develop Demo Reader
Developed by student at NSIDC
Only given Content Maps
• Written in Python
• Reader extracts object data from HDF4 file
• Output in ASCII (csv) or binary (numpy)
• Compares extracted data to values for verification
in Content Map
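The reader's core steps (seek to a byte range named in a Content Map, decode it, write CSV, and compare against the values given for verification) can be sketched in Python. This is a minimal sketch under stated assumptions, not the NSIDC reader: the function names, the idea of passing `offset`, `count`, and `byte_order` as plain arguments, and the 32-bit integer type are all hypothetical. A real reader would take this information from the map's XML and would also have to handle chunked and compressed storage.

```python
import csv
import io
import struct


def read_values(path, offset, count, byte_order=">", type_code="i"):
    """Read `count` fixed-size values starting at byte `offset` of `path`.

    byte_order is a struct prefix (">" big-endian, "<" little-endian);
    type_code is a struct format character ("i" = 32-bit signed integer).
    All parameters are hypothetical stand-ins for information a Content Map
    would supply.
    """
    fmt = f"{byte_order}{count}{type_code}"
    nbytes = struct.calcsize(fmt)
    with open(path, "rb") as f:
        f.seek(offset)
        raw = f.read(nbytes)
    if len(raw) != nbytes:
        raise ValueError("byte range extends past end of file")
    return list(struct.unpack(fmt, raw))


def to_csv(values):
    """Render the values as a single CSV row."""
    buf = io.StringIO()
    csv.writer(buf).writerow(values)
    return buf.getvalue().strip()


def matches_verification(values, expected):
    """Compare extracted values with (index, value) pairs given for verification."""
    return all(values[i] == v for i, v in expected)
```

Because nothing here calls the HDF4 library, the sketch reflects the point of the exercise: the file is treated as plain bytes, and everything needed to interpret them comes from outside the file.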

Releases & Support
Date | Version | Comments
July 2011 | 1.0.0 schema, 1.0.0 writer | First official release, http://www.hdfgroup.org/projects/h4map
Sept 2011 | 1.0.1 writer | Minor bug fixes
Nov 2011 | 1.0.1 schema, 1.0.2 writer | Robustly handle empty SDS
March 2012 | ECS Release 8.1 |
May 2012 (planned) | 1.0.3 writer | Minor bug fixes; support 2 palettes with same reference number
? | |

HDF4 File Content Maps
Content Map generation at GES-DISC
• Datasets mapped
• TOVS Pathfinder
For example: ftp://disc1.gsfc.nasa.gov/data/s4pa/tovs/TOVSADNG/1986/330/

• MERRA Model Output

• In progress
• TRMM
• AIRS

ECS Release 8.1 – March 2012
“Raytheon EED deployed the HDF4 File Content Maps
capability as part of ECS Release 8.1. This capability wraps
the Content Map Writer in the ECS Map Generation Server.
ECS DAACs can choose whether or not to enable map
generation in operations.
With workload spec testing, seeing 2-3 maps/second under
load and 10-15 on unloaded system”
-- Evelyn Nakamura, Raytheon

“We installed our new big ECS software release which
included the code for creating maps. The installers set it up
to create maps (not in operations mode) for MOD10A1 and
it produced 20 or 30 thousand. We haven't had a chance
to look at them yet.”
-- Doug Fowler, NSIDC
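For a rough sense of scale, the two quotes can be combined into back-of-the-envelope arithmetic. The sketch below assumes, without support from either quote, that the MOD10A1 run proceeded at the rates seen in the workload testing; `hours_to_generate` is a hypothetical helper.

```python
def hours_to_generate(n_maps, maps_per_second):
    """Wall-clock hours to write n_maps at a constant rate."""
    return n_maps / maps_per_second / 3600.0


# Under load (2-3 maps/second): fastest and slowest cases for 20,000-30,000 maps.
loaded = (hours_to_generate(20_000, 3), hours_to_generate(30_000, 2))
# Unloaded system (10-15 maps/second).
unloaded = (hours_to_generate(20_000, 15), hours_to_generate(30_000, 10))

print(f"under load: {loaded[0]:.1f} to {loaded[1]:.1f} hours")
print(f"unloaded:   {unloaded[0] * 60:.0f} to {unloaded[1] * 60:.0f} minutes")
```

If those rates held, the run would take roughly 2 to 4 hours under load, or under an hour on an unloaded system.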

Verification* Study (1/12 - 4/12)
“Work with DAAC personnel to identify
requirements that would produce appropriate
and efficient methods of verifying, concurrent
with operation activities, correctness of the
HDF4 maps that are produced with the ECS 8.1
capability.”

* The terms Verification and Validation are used interchangeably.

Verification Study Activities
Webinars with ASDC, LP DAAC, NSIDC, Raytheon
• Provide background on Mapping Project
• Gather input on requirements and concerns
• Collect sample datasets and generate Content Maps
Exposed 3 bugs: 1 in HDF4 library & 2 in Map Writer; Fixed.

• Discuss possible approaches
• Seek guidance from NASA on expectations regarding
Map creation timeline and verification responsibilities

Prototype possible approaches
• Demonstrate functionality and assess feasibility

Verification Study Findings (1)
• Automate verification as much as possible.
• Focus verification at the ESDT version level.

• No definitive specification for user-level
objects expected in a given HDF4 file.
• Scientists look at visualizations, not
directly at data.

Verification Study Findings (2)
• Every DAAC is different
• Flexibility in deciding when to generate Maps
• May need involvement of science teams to
confirm correctness

• Content Maps should be produced near end
of mission, or sooner if users want them.
• AMSR-E identified
• NSIDC involved with Mapping project from the
start and comfortable with verification using
demo reader

Verification Study Findings (3)
• Interest in web-based tools is growing.
• XSLT stylesheets

• DAAC representatives are very concerned
about long-term access to data.
• This is beyond the scope of the study
• But, something to keep in mind when considering
different approaches

Verification Dilemma
Translator to

DVD

Reader

Possible Approach

DVD

DVD
Creator

DVD

Applied to Content Maps
HDF4 File Content
Map (XML)

HDF4 File
request
bytestreams
HDF4
Reader
Retranslator

Objects & Relationships;
User Metadata;
Object Data retrieval &
reconstruction information

Object Data
HDF4 File

Replace this… with this…

Verification Recommendations (1)
• Check h4mapwriter errors
• Run xmllint
• Check for well-formed XML
• Validate Map conforms to schema

These checks are possible now
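The well-formedness check can also be scripted. The sketch below uses only Python's standard library, so it covers well-formedness but not schema validation, which still needs xmllint or a schema-aware library. The function name and the sample XML strings are illustrative and are not taken from the Content Map schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, None) if xml_text parses, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, None


# Illustrative strings only; the element names are made up.
ok, _ = is_well_formed("<map><object name='a'/></map>")
bad, reason = is_well_formed("<map><object></map>")
print(ok, bad)
```

Returning the parser's message alongside the flag lets a batch job log why a given map failed rather than only that it failed.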

Verification Recommendations (2)
• Develop content map checker to check
• File size and checksum
• Object data values
• Values for verification
• Attribute values in Map

What people expect to be enough
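The first item, file size and checksum, is simple to automate. A minimal sketch, assuming the expected size and an MD5 digest are available as plain values: how a Content Map records them, and which checksum algorithm it uses, are not stated here, so both are assumptions, and `file_matches` is a hypothetical helper.

```python
import hashlib
import os


def file_matches(path, expected_size, expected_md5, block_size=1 << 20):
    """Check a file's size and MD5 digest against expected values."""
    if os.path.getsize(path) != expected_size:
        return False
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in blocks so large granules need not fit in memory.
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest() == expected_md5.lower()
```

Checking the size first is a cheap way to reject a mismatched file before reading it in full.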

Verification Recommendations (3)
• Develop retranslator to create new HDF4 file
• Allows use of familiar tools (GrADS, IDL,
HDFView, hdiff, …)
• If new file is not equivalent to original (from
user perspective), investigate ASAP.

Needed since no definitive source of correctness
for original HDF4 files.

Verification Recommendations (4)
• Build content map checker and retranslator on
common modular infrastructure.

Not just for Preservation!
“I find the HDF Map writer and reader very useful
when I am in the discovery phase of new projects
using HDF4 datasets.
• They enable me to analyze the full structure of CERES hdf4
datasets and ensure HDF Attributes from the archived HDF4
files are preserved in subsetted files.
• I am building a capability to subset MOPITT HDF4 data and
am using them to help validate SDS data arrays over 4
dimensions.
• A team of consultants is working with ASDC on an
experimental semantic database implemented on a 'grand
challenge' scale. They are interested in using CERES
datasets, but are unfamiliar with HDF. They are using the
HDF4 map application to analyze the structure of proposed
CERES datasets and to help extract metadata and data from
target files.”

--- Walt Baskin, ASDC

Presentation “Take Away”
HDF4 Content Maps are the best thing since
sliced bread!
More seriously …
• Content Maps can be created now and you may find them useful
• Ask questions and report problems
We want to know about issues ASAP
• Feedback regarding proposed Verification approach very welcome
Project report / recommendations due next week

Project Contributors
• The HDF Group
• Ruth Aydt, Peter Cao, Jo Eads, Mike Folk, Joe Lee, Elena
Pourmal, Binh-Minh Ribler, Kent Yang, and others

• NASA / DAACs
• Jeanne Behnke, Dan Marinelli, H. K. "Rama" Ramapriyan
• ASDC: Walt Baskin, Greg Cates, Gerald Lemay, Lindsay
Parker, Steve Protack
• GES-DISC: Guang-Dih Lei, Chris Lynnes
• LP DAAC: Matt Martens, Bhaskar Ramachandran, Jody
Rundell, Jim Vermeer
• NSIDC: Jonathan Crider, Ruth Duerr, Doug Fowler, Luis Lopez

• Raytheon
• Evelyn Nakamura, Lou Swentek, Abe Taaheri

Acknowledgements
This work was supported by Subcontract number
114820 under Raytheon Contract number
NNG10HP02C, funded by the National Aeronautics
and Space Administration (NASA), and by
cooperative agreement number NNX08AO77A from
NASA. Any opinions, findings, conclusions, or
recommendations expressed in this material are
those of the authors and do not necessarily reflect
the views of Raytheon or the National Aeronautics
and Space Administration.

The HDF Group

Questions/comments?

Editor's Notes

  • #19: TOVS Pathfinder: http://mirador.gsfc.nasa.gov/cgi-bin/mirador/presentNavigation.pl?tree=project&project=TOVS MERRA Model Output: mirador.gsfc.nasa.gov/cgi-bin/mirador/presentNavigation.pl?tree=project&project=MERRA To find the map files, go all the way down to the granule level, then copy the FTP link and take off the file part, e.g.: ftp://disc1.gsfc.nasa.gov/data/s4pa/tovs/TOVSADNG/1986/330/ Thanks to Chris Lynnes for the info & links.
  • #20: Maps can be generated. Because of concerns that they can’t be verified in an automatic, scalable way, map generation doesn’t have to be turned on. Hence the Verification Study.
  • #21: With the ability to generate content maps, DAACs wanted to know how they should verify that dataset files are adequately described. In many cases they were not responsible for creating the files or for understanding the content in them; they typically just look at checksums and file sizes before distributing. In part because of our surprise in the product phase, we felt it would be best to discuss some of the uncertainties related to verification: why just comparing the values in the object data isn’t enough, and how the uncertainty regarding creator intent (in some cases) could be addressed.
  • #22: Here’s a high-level rundown of the activities that have gone on during the project. DAAC personnel have been very responsive to questions and made room in their schedules to meet on fairly short notice.
  • #24: A summary of the findings. Details are in meeting minutes.
  • #25: Will a “Map Reader” replace the HDF4 library as the way to access data at some point in the future? How will a “Map Reader” or other utilities be supported?