What will be new in HDF5?

October 15, 2008

HDF and HDF-EOS Workshop XII

HDF5 Road Map

Performance
Ease of use
Robustness
Innovation

Outline
• Performance improvements
• Fortran 2003 features
• HDF5 file recovery (metadata journaling)

Performance Improvements

Performance Improvements in HDF5
• Examples of completed work:
• New implementation of metadata cache to
improve I/O performance and memory usage when
accessing many objects (HDF5 1.8.0)
• Faster, more scalable storage and access for
large groups (HDF5 1.8.0)

Performance Improvements in HDF5
• Work in progress
• New implementation of free-space management
• Affects “dynamic” applications that
add/delete/modify existing objects
• When creating HDF5 objects, space is allocated
from available space tracked by the free-space
manager
• When deleting objects, unused space is added to
the free-space pool via free-space manager
• Current implementation uses O(N^2) operations for
every N allocations or frees of space
• New implementation: O(log_2 N)
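
The allocate/free cycle described above can be sketched as a toy free-space manager. This is only an illustration under stated assumptions, not the HDF5 implementation: the class name `FreeSpaceManager`, the best-fit policy, and the size-sorted list are all invented for the example. Keeping the free sections sorted by size lets the lookup use binary search instead of a linear scan.

```python
import bisect


class FreeSpaceManager:
    """Toy free-space manager (hypothetical, not the HDF5 implementation)."""

    def __init__(self, end_of_file=0):
        self.eof = end_of_file   # first unused address past the end of the file
        self.free = []           # free sections as (size, address), sorted by size

    def alloc(self, size):
        """Return an address for `size` bytes, reusing freed space if possible."""
        # Binary search for the smallest free section that is large enough.
        i = bisect.bisect_left(self.free, (size, 0))
        if i == len(self.free):
            # Nothing fits: extend the file.
            addr = self.eof
            self.eof += size
            return addr
        sec_size, addr = self.free.pop(i)
        if sec_size > size:
            # Return the unused tail of the section to the pool.
            bisect.insort(self.free, (sec_size - size, addr + size))
        return addr

    def release(self, addr, size):
        """Add the space of a deleted object to the free-space pool."""
        bisect.insort(self.free, (size, addr))


fsm = FreeSpaceManager()
a = fsm.alloc(100)     # file grows, object placed at address 0
b = fsm.alloc(50)      # file grows, object placed at address 100
fsm.release(a, 100)    # object deleted, its 100 bytes become reusable
c = fsm.alloc(60)      # reuses the freed section at address 0
```

In this sketch only the search is logarithmic; inserting into a Python list still shifts elements, and adjacent free sections are not merged. A balanced tree or skip list would be needed to make every step logarithmic.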

Free-space Management
• Test: creating/deleting attributes
• Create first set of attributes
• Delete odd-numbered attributes from the first set
• Create second set of attributes
• Delete all attributes from the second set

Number of attributes   Old implementation   New implementation   Improvement ratio
500                    786.5 sec            68.2 sec             11.5x
1000                   11000 sec            289 sec              38x
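
As a quick check of the table, the improvement ratios can be recomputed from the timings. The growth figures in the second half of the sketch are derived here and do not appear on the slide; with only two data points they are indicative at best.

```python
# Timings from the table above, in seconds: attributes -> (old, new)
timings = {500: (786.5, 68.2), 1000: (11000.0, 289.0)}

# Improvement ratio = old time / new time
for n, (old, new) in timings.items():
    print(f"{n} attributes: {old / new:.1f}x faster")

# How much each implementation slows down when the attribute count doubles
old_growth = timings[1000][0] / timings[500][0]
new_growth = timings[1000][1] / timings[500][1]
print(f"doubling the attributes: old {old_growth:.1f}x slower, new {new_growth:.1f}x slower")
```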

Performance Improvements in HDF5
• Work in progress
• Fast data append (along slowest changing
dimension)

• Future areas of interest
• Efficient chunking cache implementation
(NetCDF4)
• Efficient handling of variable-length data,
including compression (will affect the sizes of NPOESS
files)

Fortran 2003 features

Status of the HDF5 Fortran Library
• The HDF5 Fortran library is part of the standard HDF5
distribution
• First release goes back to 1999
• Implemented as Fortran 90 wrappers on top of the
HDF5 C library
• Supported on Linux, Windows, Mac Intel, Solaris,
VMS, clusters, etc.
• Compilers
• Open source: gfortran, g95
• Vendors: SUN, Intel, PGI, Absoft
• 32- and 64-bit versions

HDF5 Fortran Library
• Mimics HDF5 C APIs
• Fortran 90 features used:
• Modules
• Function overloading
• Function interfaces
• Dynamic memory allocation
• Optional parameters
• Many Fortran APIs are simpler than their C
counterparts

Current Limitations
• Supports only “native” Fortran types such as
• INTEGER
• REAL
• CHARACTER
• DOUBLE PRECISION (obsolete Fortran feature)

• No support for INTEGER*1, INTEGER*2,
INTEGER*8, INTEGER*16, REAL*8, REAL*16
• Cannot write/read buffers of those types

• Fortran types have to match C types
• No support for the -r8 and -r16 compiler flags (with
them, Fortran REAL no longer matches C float)

Current Limitations
• No support for
INTEGER(KIND=n) and REAL(KIND=m)
• Integers n and m are called KIND parameters
• Returned by
selected_int_kind(r)
kind of an integer type that holds all values i with -10^r < i < 10^r
selected_real_kind(p, r)
p – precision (decimal digits)
r – decimal exponent range
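
To make the range rule concrete, here is a small sketch of the idea behind selected_int_kind(r). The function name `selected_int_bytes` and the list of candidate widths are assumptions for illustration; real compilers differ in which integer widths they provide and in how KIND values map to them.

```python
def selected_int_bytes(r, sizes=(1, 2, 4, 8, 16)):
    """Smallest integer width in bytes holding every i with -10**r < i < 10**r.

    Mirrors the idea of Fortran's SELECTED_INT_KIND; returns -1 if no size fits.
    The candidate `sizes` are an assumption, not a property of any compiler.
    """
    for nbytes in sizes:
        largest = 2 ** (8 * nbytes - 1) - 1   # two's-complement maximum
        if 10 ** r - 1 <= largest:
            return nbytes
    return -1


for r in (2, 4, 9, 18):
    print(f"r = {r:2d} -> {selected_int_bytes(r)} byte(s)")
```

A request such as r = 4 already needs a 2-byte integer, which is exactly the kind of type the library could not read or write at the time.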

Current Limitations
• Limited support for derived types (compare with C
structures and HDF5 compound datatypes)
• Supports derived types with “native” fields only
• Doesn’t support complex HDF5 datatypes
(e.g., with array member, or nested compound)

• Writes/reads HDF5 compound datasets by fields
only
• Cannot be used in parallel applications
• No support for enum types

• No support for callback functions

Current Limitations - Summary
• Any application written to the Fortran 95/2003
standard will struggle to use the HDF5
Fortran library
• Many HDF5 features are not available to Fortran
applications

Fortran 2003 Features
• Fortran 2003 provides a standardized mechanism
for interoperating with C
• Module ISO_C_BINDING for interoperability of
intrinsic types
• C_PTR and C_FUNCPTR for interoperability with
C pointers
• C_LOC(x) and C_FUNLOC(x) inquiry functions for
getting addresses of variables and procedures
• BIND(C) attribute for interoperability of derived types
with C structures, and for enumerated types

Fortran 2003 Features and HDF5
• New 2003 features allowed us to support
• Any Fortran INTEGER and REAL type data in
HDF5 files
• Fortran derived types and HDF5 compound
datatypes
• Fortran enumerated types and HDF5 enumerated
types
• HDF5 APIs with callbacks

Fortran Compilers with 2003 Features
Compiler        Current versions and status of HDF5              Future versions
Intel           Versions 10.1 and 11; all F2003 functionality    Version 11 will have a fix that will allow
                works in HDF5                                    us to remove common blocks
g95             August 2008 and later; all Fortran 2003
                functionality works in HDF5
gfortran        Version 4.4; limited support for C               No plans from compiler developers to
                interoperability; HDF5 doesn’t work              improve the support
SUN compilers   Express-July 2008 build; HDF5 doesn’t work       No timeline for fixes; may be in a year
PGI             Version 7.2.1; HDF5 doesn’t work                 Fixes will be available in Version 8.

Example
HDF5 "SDScompound.h5" {
GROUP "/" {
   DATASET "ArrayOfStructures" {
      DATATYPE H5T_COMPOUND {
         H5T_ARRAY { [13] H5T_STRING {
            STRSIZE 1;
            STRPAD H5T_STR_SPACEPAD;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         } } "chr_name";
         H5T_STD_I8LE "a_name";
         H5T_IEEE_F64LE "c_name";
         H5T_IEEE_F32LE "b_name";
      }
…..
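
For comparison, the compound datatype in the dump above has the shape of a C-style structure, sketched here with Python's `ctypes`. The class name `Record` is hypothetical and the member order is simply taken from the dump; the Fortran derived type used on the example slides is not part of this transcript, so this only illustrates the kind of structure that BIND(C) makes interoperable.

```python
import ctypes


class Record(ctypes.Structure):
    """C-style struct matching the members listed in the dump (name is hypothetical)."""
    _fields_ = [
        ("chr_name", ctypes.c_char * 13),  # H5T_ARRAY [13] of 1-character strings
        ("a_name", ctypes.c_int8),         # H5T_STD_I8LE
        ("c_name", ctypes.c_double),       # H5T_IEEE_F64LE
        ("b_name", ctypes.c_float),        # H5T_IEEE_F32LE
    ]


# Print each member's byte offset and size as laid out by the platform's C ABI.
for name, _ctype in Record._fields_:
    field = getattr(Record, name)
    print(f"{name:8s} offset={field.offset:2d} size={field.size}")
print("total size:", ctypes.sizeof(Record))
```

The in-memory layout may contain padding before `c_name`, so the offsets printed here depend on the platform and need not match the offsets stored in the file's datatype.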

HDF5 File Recovery, or Surviving a System Failure through Metadata Journaling

Surviving a System Failure in HDF5
• Problem:
• Data in an open HDF5 file is susceptible to
corruption in the event of an application or system
crash.
• Corruption is possible if an open HDF5 file has
been updated when the crash occurs.

• Initial Objective:
• Guarantee an HDF5 file with consistent metadata
can be reconstructed in the event of a crash.
• No guarantee on state of raw data – contains
whatever made it to disk prior to crash.

Crash Survivability in HDF5
• Approach: Metadata Journaling
• When an HDF5 file is opened with Metadata
Journaling enabled, a companion Journal file is
created.
• When an HDF5 API function that modifies
metadata is completed, a transaction is recorded in
the Journal file.
• If the application crashes, a recovery program can
replay the journal by applying in order all metadata
writes until the end of the last completed
transaction written to the journal file.
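
The replay step described above can be sketched as follows. This is a minimal illustration, not the h5recover algorithm or the real journal format: the record layout, the function name `replay_journal`, and the use of a `bytearray` as a stand-in for the HDF5 file are all assumptions, and transactions are assumed to be written one after another without interleaving.

```python
def replay_journal(image, journal):
    """Apply journaled metadata writes up to the last completed transaction.

    `image` is a bytearray standing in for the HDF5 file; `journal` is a list
    of records, either ("write", txn_id, offset, data) or ("end", txn_id).
    This record layout is invented for illustration only.
    """
    # Find the last record that closes a transaction; anything after it
    # belongs to a transaction that was still in progress at the crash.
    last_end = -1
    for i, record in enumerate(journal):
        if record[0] == "end":
            last_end = i

    # Replay, in order, every write up to that point.
    for record in journal[: last_end + 1]:
        if record[0] == "write":
            _, _txn, offset, data = record
            image[offset : offset + len(data)] = data
    return image


journal = [
    ("write", 1, 0, b"AAAA"),
    ("end", 1),
    ("write", 2, 4, b"BBBB"),
    ("end", 2),
    ("write", 3, 0, b"CCCC"),   # transaction 3 never completed
]
restored = replay_journal(bytearray(8), journal)
print(restored)
```

The write from the unfinished third transaction is ignored, so the metadata ends up in the state left by the last completed API call.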

HDF5 Metadata Journaling Recovery
[Diagram: after the application crashes, the h5recover tool takes the corrupted HDF5 file and its companion journal file and produces a restored HDF5 file]

Implementation Status
• Serial HDF5 with synchronous write mode
• Alpha1 released August 2008
• User interface (API definition and h5recover tool)
and file format may change

Metadata Journaling Plans
• Serial HDF5 with synchronous write mode
• Finalize User interface definitions and file format

• Serial HDF5 with asynchronous write mode
• To improve Journal file write speed

• More features (need funding)
• Make raw data operations atomic
• Allow "super‐transactions" to be created by
applications
• Enable journaling for Parallel HDF5

Questions?

Acknowledgement
• This report is based upon work supported in part
by a Cooperative Agreement with the National
Aeronautics and Space Administration (NASA)
under NASA Awards NNX06AC83A and
NNX08AO77A. Any opinions, findings, and
conclusions or recommendations expressed in
this material are those of the author(s) and do not
necessarily reflect the views of the National
Aeronautics and Space Administration.



Editor's Notes

  • #25: Maybe Objective bullets do belong on later slide… not sure.