Conf-DDDD-IN
HDF5 Roadmap 2019-2020
Summer ESIP 2019
This work was supported by NASA/GSFC under Raytheon Co. contract number NNG15HZ39C.
This document does not contain technology or Technical Data controlled under either the U.S. International Traffic
in Arms Regulations or the U.S. Export Administration Regulations.
Elena Pourmal
The HDF Group EED2 Team Lead
epourmal@hdfgroup.org
Outline
• New features in Hierarchical Data Format 5 (HDF5) Release 1.12.0
• Support for HDF5 versions 1.8 and 1.10
• Summary of HDF5 Roadmap
HDF5 References
V1 | V2 | temp
---- |----- |-----
12 | 23 | 3.1
15 | 24 | 4.2
17 | 21 | 3.6
An HDF5 dataset may store references to other objects
or to selected elements of a dataset.
References to objects or dataset regions stored in
other files are coming in HDF5 1.12.0.
[Figure: files A.h5 and B.h5, with the label "Time step 36,000". A dataset in B.h5 stores references to datasets stored in A.h5.]
HDF5 Reference to Attribute
V1 | V2 | temp
---- |----- |-----
12 | 23 | 3.1
15 | 24 | 4.2
17 | 21 | 3.6
HDF5 1.12.0 will allow references to an
attribute of a dataset or a group. Here, a dataset in
A.h5 stores references to attributes in
A.h5 and B.h5.
[Figure: files A.h5 and B.h5, with the labels "Time step 36,000" and "Time step 100,000".]
HDF5 New Architecture
• HDF5 1.12.0 enables HDF5 to store data on new types of storage using Virtual Object Layer (VOL) connectors and Virtual File Drivers (VFDs).
• VOL connectors developed by The HDF Group and its collaborators are available from the public repository
https://bitbucket.hdfgroup.org/projects/HDF5VOL
HDF5 New Architecture (cont’d)
[Figure: architecture diagram. Recoverable labels: "S3", "DAOS Object Store", "DAOS REST", "S3", "AIO". Legend: 1 – HDF5 Application Programming Interface]
S3 VFD
• HDF5 1.12.0 will have a VFD for accessing HDF5 files via Amazon Simple Storage Service (Amazon S3)
• Requires minimal changes to the application code
• The h5dump and h5ls tools have a flag to specify the driver used to access an HDF5 file on S3:
h5ls --vfd=ros3 https://s3.us-east-2.amazonaws.com/file.h5
S3 VFD (cont’d)
• Uses “range get” commands to get “bytes” from an HDF5 file stored on S3
• New API to set up the S3 VFD:
herr_t H5Pset_fapl_ros3(hid_t fapl_id, H5FD_ros3_fapl_t *fa);
• Credentials are passed via a parameter to the function
• Demo
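The “range get” access pattern on this slide can be pictured as an ordinary HTTP GET that carries a `Range` header, which S3 honors for object reads. The sketch below only formats such a header. It is an illustration under that assumption, not the S3 VFD's actual implementation, and `format_range_header` is a hypothetical helper name that is not part of the HDF5 API.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (NOT part of the HDF5 API): format the HTTP Range
 * header that asks a server for `length` bytes starting at byte `offset`.
 * HTTP byte ranges are inclusive, so the last byte is offset + length - 1.
 * Returns 0 on success, -1 if the request is empty or the buffer is too
 * small to hold the whole header. */
int format_range_header(char *buf, size_t buf_size,
                        unsigned long long offset,
                        unsigned long long length)
{
    if (buf == NULL || length == 0)
        return -1;

    int n = snprintf(buf, buf_size, "Range: bytes=%llu-%llu",
                     offset, offset + length - 1);

    /* snprintf reports the length it wanted to write; treat truncation
     * as an error rather than sending a malformed header. */
    if (n < 0 || (size_t)n >= buf_size)
        return -1;
    return 0;
}
```

For example, asking for 8 bytes at offset 0 yields `Range: bytes=0-7`, and 1024 bytes at offset 512 yields `Range: bytes=512-1535`. Reading only the byte ranges an operation needs is what lets a tool such as h5ls inspect a remote file without downloading it in full.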
HDFS VFD
• HDF5 1.12.0 will have a VFD for accessing HDF5 files on the Hadoop Distributed File System (HDFS).
• New API to access an HDF5 file on HDFS:
herr_t H5Pset_fapl_hdfs(hid_t fapl_id);
• HDF5 command-line tools built with the HDFS VFD enabled allow users to extract metadata and raw data from HDF5 and netCDF4 files on HDFS, and to use Hadoop streaming to collect data from multiple HDF5 files.
• Demo
New default for string encoding
• UTF-8 will be the default string encoding instead of ASCII starting with HDF5 1.12.0
– Names of groups, datasets, attributes
– Names of compound datatype fields and enums, and names of files that are stored in HDF5 according to the File Format spec (VDS, family and split VFDs), have to be ASCII.
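Because some names still have to be ASCII, an application may want to check a name before passing it to the library. The following is a minimal sketch of such a check. `name_is_ascii` is a hypothetical helper, not an HDF5 function, and the slide does not say how or where the library itself enforces the restriction.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (NOT part of the HDF5 API): return true when every
 * byte of the NUL-terminated string is 7-bit ASCII (0x00-0x7F).
 * UTF-8 encodes each non-ASCII character using only bytes with the high
 * bit set, so a single such byte is enough to reject the name. */
bool name_is_ascii(const char *name)
{
    if (name == NULL)
        return false;

    for (const unsigned char *p = (const unsigned char *)name; *p != '\0'; ++p) {
        if (*p > 0x7F)
            return false;
    }
    return true;
}
```

With this check, a field name such as `temp` passes, while a name containing the UTF-8 bytes `0xC3 0xA9` (the letter é) is rejected. A group, dataset, or attribute name would not need this check under the UTF-8 default described above.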
Support for HDF5 1.8 and 1.10
• The HDF Group will continue to support HDF5 1.8.* and 1.10.* releases
– Security patches
– Critical bug fixes
– Performance improvements
– Thorough testing of HDF5 1.8.* for forward compatibility with HDF5 1.10.* and 1.12.0
• Files created by applications that are built with an HDF5 version later than 1.8.* and that do not use any new features of that version are readable by 1.8.*
HDF5 Roadmap
[Figure: timeline with quarterly ticks spanning 2019 to 2022 and a "Today" marker. Release tracks as follows.]
HDF5 1.8
• Fall 2019: 1.8.22
– Security patches
– Forward compatibility with 1.12
• Summer 2020: 1.8.23
– Security patches
– Forward compatibility with 1.12
HDF5 1.10
• Nov 1.10.6, May 1.10.7, Nov 1.10.8
– Address degradation of performance between major HDF5 releases 1.6 – 1.10
– VDS HDF5 performance
– Public performance benchmark test suite
– Dynamically loaded VFDs
– S3, HDFS, Spark VFD connectors
• May 1.10.9
– Address degradation of performance between major HDF5 releases 1.6 – 1.10
– Security updates
– Bug fixes
HDF5 1.12
• July 2019: 1.12.0
– VOL architecture
– References
– Parallel compression (enhanced)
• Nov 2019: 1.12.1
– VOL plugins (DAOS)
– VFD plugins
– Dynamically loaded VFDs
– S3, HDFS, Spark VFD connectors
• 2020-2021: 1.12.X
– Query and Indexing
– Full SWMR
– Mirror VFD
– Provenance (Onion) VFD
HDF5 2.0 instead of 1.12.0?
• Storage beyond the current file systems and HDF5 file format