NetCDF and HDF5
Ed Hartnett, Unidata/UCAR, 2010
Unidata
• Mission: To provide the data services, tools,
and cyberinfrastructure leadership that
advance Earth system science, enhance
educational opportunities, and broaden
participation.
Unidata Software
• NetCDF – data format and libraries.
• NetCDF-Java/common data model – reads
many data formats (HDF5, HDF4, GRIB,
BUFR, many more).
• THREDDS – Data server for cataloging and
serving data.
• IDV – Integrated Data Viewer
• IDD/LDM – Peer to peer data distribution.
• UDUNITS – Unit conversions.
What is NetCDF?
• NetCDF is a set of software libraries and machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data.
• First released in 1989.
• NetCDF-4.0 (June 2008) introduces many new features, while maintaining full code and data compatibility.
The NetCDF-4 Project
• Does not indicate any lack of
commitment or compatibility for
classic formats.
• Uses HDF5 as data storage layer.
• Also provides read-only access to
some HDF4, HDF5 archives.
• Parallel I/O for high performance
computing.
NetCDF Disk Formats
NetCDF and HDF5
Commitment to Backward Compatibility
Because preserving access to archived data
for future generations is sacrosanct:
• NetCDF-4 provides both read and write access to
all earlier forms of netCDF data.
• Existing C, Fortran, and Java netCDF programs
will continue to work after recompiling and
relinking.
• Future versions of netCDF will continue to support
both data access compatibility and API
compatibility.
Who Uses NetCDF?
• NetCDF is widely used in the university Earth science community.
• Used for IPCC data sets.
• Used by NASA and other large data producers.
The OPeNDAP Client
• OPeNDAP (http://www.opendap.org/) is a widely supported protocol for access to remote data.
• Defined and maintained by the OPeNDAP organization.
• Designed to serve as an intermediate format for accessing a wide variety of data sources.
• The client is now built into the netCDF C library.
Using OPeNDAP Client
• In order to access DAP data sources, you need a special-format URL:
[limit=5]http://test.opendap.org/dods/dts/test.32.X?windW[0:10:2]&CS02.light>
• Location of the data source and its part, where X is one of "dds", "das", or "dods".
• Constraints on what part of the data source is to be sent.
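For a concrete sketch in C (assuming the library was built with the DAP client enabled; the URL is the test server above, with no constraints):

#include <netcdf.h>
int ncid;
/* A DAP URL is passed to nc_open() just like a file name; access is read-only. */
if (nc_open("http://test.opendap.org/dods/dts/test.32", NC_NOWRITE, &ncid) == NC_NOERR)
    nc_close(ncid);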
NetCDF Data Models
• The netCDF data model, consisting of variables, dimensions, and attributes (the classic model), has been expanded in version 4.0.
• The enhanced 4.0 model adds expandable dimensions, strings, 64-bit integers, unsigned integers, groups, and user-defined types.
• The 4.0 release also adds some features that need not use the enhanced model, like compression, chunking, endianness control, checksums, and parallel I/O.
NetCDF Classic Model
• Contains dimensions, variables, and attributes.
NetCDF Classic Model
Enhanced Model
NetCDF Enhanced Model
A netCDF-4 file can organize variables, dimensions, and attributes in groups, which can be nested.
(diagram: nested groups, each containing its own variables and attributes)
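A brief C sketch of two enhanced-model features, groups and strings (file and names are illustrative; error checks omitted):

int root_id, grp_id, dimid, varid;
nc_create("enhanced.nc", NC_NETCDF4, &root_id);    /* enhanced-model file */
nc_def_grp(root_id, "observations", &grp_id);      /* groups can nest */
nc_def_dim(grp_id, "station", 10, &dimid);
nc_def_var(grp_id, "station_name", NC_STRING, 1, &dimid, &varid);  /* string type */
nc_close(root_id);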
Reasons to Use Classic Model
• Provides compatibility with existing netCDF
programs.
• Still possible to use chunking, parallel I/O,
compression, endianness control.
• Simple and powerful data model.
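As a sketch of the second point (names illustrative; error checks omitted), the NC_CLASSIC_MODEL flag creates an HDF5-based file restricted to the classic model, with netCDF-4 performance features still available:

int ncid, dimid, varid;
nc_create("classic.nc", NC_NETCDF4|NC_CLASSIC_MODEL, &ncid);
nc_def_dim(ncid, "x", 1000, &dimid);
nc_def_var(ncid, "data", NC_INT, 1, &dimid, &varid);
nc_def_var_deflate(ncid, varid, 1, 1, 4);   /* shuffle on, zlib compression level 4 */
nc_close(ncid);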
Accessing HDF5 Data with NetCDF
NetCDF (starting with version 4.1) provides read-only access to existing HDF5 files if they do not violate some rules:
• Must not use circular group structure.
• The HDF5 reference type (and some other more obscure types) is not understood.
• Write access is still only possible with netCDF-4/HDF5 files.
Reading HDF5 with NetCDF
• Before netCDF-4.1, HDF5 files had to use creation ordering and dimension scales in order to be understood by netCDF-4.
• Starting with netCDF-4.1, read-only access is possible to HDF5 files with alphabetical ordering and no dimension scales (created by HDF5 1.6, perhaps).
• An HDF5 file may have dimension scales for all dimensions, or for no dimensions (not for just some of them).
Accessing HDF4 Data with NetCDF
• Starting with version 4.1, netCDF will be able to read HDF4 files created with the “Scientific Dataset” (SD) API.
• This is read-only: NetCDF can't write HDF4!
• The intention is to make netCDF software work automatically with important HDF4 scientific data collections.
Confusing: HDF4 Includes NetCDF v2 API
• A netCDF v2 API is provided with HDF4 which writes SD data files.
• This must be turned off at HDF4 install time if netCDF and HDF4 are to be linked in the same application.
• There is no easy way to use both HDF4 with the netCDF API and netCDF with HDF4 read capability in the same program.
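One mitigation (treat the exact option name as an assumption to verify against your HDF4 version): HDF4's configure script can be told not to build its netCDF v2 API at install time, e.g.:

./configure --disable-netcdf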
Building NetCDF for HDF5/HDF4 Access
• This is only available for those who also build netCDF with HDF5.
• HDF4, HDF5, zlib, and other compression libraries must exist before netCDF is built.
• Build like this:
./configure --with-hdf5=/home/ed --enable-hdf4
Building User Programs with HDF5/HDF4 Access
• Include the locations of the netCDF, HDF5, and HDF4 include directories:
-I/loc/of/netcdf/include -I/loc/of/hdf5/include -I/loc/of/hdf4/include
• The HDF4 and HDF5 libraries (and associated libraries) are needed and must be linked into all netCDF applications. The locations of the lib directories must also be provided:
-L/loc/of/netcdf/lib -L/loc/of/hdf5/lib -L/loc/of/hdf4/lib -lmfhdf -ldf -ljpeg -lhdf5_hl -lhdf5 -lz
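Putting it together, a complete build line might look like this (paths illustrative; -lnetcdf itself must also be linked, ahead of the HDF libraries):

cc -o myapp myapp.c -I/loc/of/netcdf/include -I/loc/of/hdf5/include \
   -I/loc/of/hdf4/include -L/loc/of/netcdf/lib -L/loc/of/hdf5/lib \
   -L/loc/of/hdf4/lib -lnetcdf -lmfhdf -ldf -ljpeg -lhdf5_hl -lhdf5 -lz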
Using HDF4
• You don't need to identify the file as HDF4 when opening it with netCDF, but you do have to open it read-only.
• The HDF4 SD API provides a named, shared dimension, which fits easily into the netCDF model.
• The HDF4 SD API uses other HDF4 APIs (like vgroups) to store metadata. This can be confusing when using the HDF4 data dumping tool hdp.
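As a sketch of the first point, the same nc_open() call used for any netCDF file works, provided NC_NOWRITE is given (the file name is the MODIS file dumped on the next slide):

int ncid;
/* HDF4 SD files are detected automatically; read-only access is required. */
nc_open("MOD29.A2000055.0005.005.2006267200024.hdf", NC_NOWRITE, &ncid);
/* ... read as usual, then nc_close(ncid); */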
HDF4 MODIS File ncdumped
../ncdump/ncdump -h
MOD29.A2000055.0005.005.2006267200024.hdf
netcdf MOD29.A2000055.0005.005.2006267200024 {
dimensions:
Coarse_swath_lines_5km:MOD_Swath_Sea_Ice = 406 ;
Coarse_swath_pixels_5km:MOD_Swath_Sea_Ice = 271 ;
Along_swath_lines_1km:MOD_Swath_Sea_Ice = 2030 ;
Cross_swath_pixels_1km:MOD_Swath_Sea_Ice = 1354 ;
variables:
float Latitude(Coarse_swath_lines_5km:MOD_Swath_Sea_Ice,
Coarse_swath_pixels_5km:MOD_Swath_Sea_Ice) ;
Latitude:long_name = "Coarse 5 km resolution latitude" ;
Latitude:units = "degrees" ;
...
Accessing HDF4-EOS Data with NetCDF
• Data can be read, but netCDF does not (yet) understand how to break down the StructMetadata attribute into useful information.
// global attributes:
:HDFEOSVersion = "HDFEOS_V2.9" ;
:StructMetadata.0 = "GROUP=SwathStructure\n\tGROUP=SWATH_1\n\t\t
SwathName=\"MOD_Swath_Sea_Ice\"\n\t\tGROUP=Dimension\n\t\t\t
OBJECT=Dimension_1\n\t\t\t\tDimensionName=\"Coarse_swath_lines_5km\"\n\t\t\t\t
Size=406\n\t\t\tEND_OBJECT=Dimension_1\n\t\t\tOBJECT=Dimension_2\n\t\t\t\t
DimensionName=\"Coarse_swath_pixels_5km\"\n\t\t\t\tSize=271\n\t\t\t...
Contribute Code to Write HDF4?
• Some programmers use the netCDF v2 API to write HDF4 files.
• It would not be too hard to write the glue code to allow v2 API -> HDF4 output from the netCDF library.
• The next step would be to allow netCDF v3/v4 API code to write HDF4 files.
• Writing HDF4 seems like a low priority to our users. I would be happy to help any user who would like to undertake this task.
Parallel I/O with NetCDF
• Parallel I/O allows many processes to
read/write netCDF data at the same time.
• Used properly, parallel I/O allows users to
overcome I/O bottlenecks in high
performance computing environments.
• A parallel I/O file system is required for
much improvement in I/O throughput.
• NetCDF-4 can use parallel I/O with netCDF-4/HDF5 files, or with netCDF classic files (via the pnetcdf library).
Parallel I/O C Example
/* Create a netCDF-4 file for parallel access over the given MPI communicator. */
nc_create_par(FILE, NC_NETCDF4|NC_MPIIO, comm, info, &ncid);
nc_def_dim(ncid, "d1", DIMSIZE, &dimids[0]);
nc_def_dim(ncid, "d2", DIMSIZE, &dimids[1]);
nc_def_var(ncid, "v1", NC_INT, NDIMS, dimids, &v1id);

/* Set up slab for this process: each rank writes DIMSIZE/mpi_size rows. */
start[0] = mpi_rank * DIMSIZE/mpi_size;
start[1] = 0;
count[0] = DIMSIZE/mpi_size;
count[1] = DIMSIZE;

/* Use independent (non-collective) access, then write this rank's slab. */
nc_var_par_access(ncid, v1id, NC_INDEPENDENT);
nc_put_vara_int(ncid, v1id, start, count, &data[mpi_rank*QTR_DATA]);
NetCDF APIs
• The netCDF core library is written in C and Java.
• Fortran 77 is “faked” when netCDF is built – actually C functions are called by the Fortran 77 API.
• A C++ API also calls the C API; a new C++ API is under development to support netCDF-4 more fully.
C API
/* Create the file, define two dimensions and a variable, leave define
   mode, write the data, and close. */
nc_create(FILE_NAME, NC_CLOBBER, &ncid);
nc_def_dim(ncid, "x", NX, &x_dimid);
nc_def_dim(ncid, "y", NY, &y_dimid);
dimids[0] = x_dimid;
dimids[1] = y_dimid;
nc_def_var(ncid, "data", NC_INT, NDIMS, dimids, &varid);
nc_enddef(ncid);
nc_put_var_int(ncid, varid, &data_out[0][0]);
nc_close(ncid);
Fortran API
call check( nf90_create(FILE_NAME, NF90_CLOBBER,
ncid) )
call check( nf90_def_dim(ncid, "x", NX, x_dimid) )
call check( nf90_def_dim(ncid, "y", NY, y_dimid) )
dimids = (/ y_dimid, x_dimid /)
call check( nf90_def_var(ncid, "data", NF90_INT, dimids,
varid) )
call check( nf90_enddef(ncid) )
call check( nf90_put_var(ncid, varid, data_out) )
call check( nf90_close(ncid) )
New C++ API (cxx4)
• The existing C++ API works with netCDF-4 classic model files.
• The existing API was written before many features of C++ became standard, and thus needed updating.
• A new C++ API has been partially developed.
• You can build the new API (which is not complete!) with --enable-cxx4.
Java API
dataFile = NetcdfFileWriteable.createNew(filename, false);
// Create netCDF dimensions,
Dimension xDim = dataFile.addDimension("x", NX );
Dimension yDim = dataFile.addDimension("y", NY );
ArrayList dims = new ArrayList();
// define dimensions
dims.add( xDim);
dims.add( yDim);
...
Tools
• ncdump – ASCII or NcML dump of data file.
• ncgen – Take ASCII or NcML and create data
file.
• nccopy – Copy a file, changing format,
compression, chunking, etc.
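For example (a hedged sketch; check each tool's -h output for the options your version supports):

ncdump -h data.nc           # print only the header (CDL) of a file
ncgen -o data.nc data.cdl   # create a netCDF file from a CDL description
nccopy -d 5 in.nc out.nc    # copy, compressing variables at deflate level 5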
Conventions
• The NetCDF User's Guide recommends some conventions (e.g. "units" and "Conventions" attributes).
• Conventions are published agreements about how data of a particular type should be represented to foster interoperability.
• Most conventions use attributes.
• Use of an existing convention is highly recommended. Use the CF Conventions, if applicable.
• A netCDF file should use the global "Conventions" attribute to identify which conventions it uses.
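As a sketch, recording the convention from C is a single attribute call (the convention name here is only an example):

#include <string.h>
static const char conventions[] = "CF-1.3";
/* Global attribute naming the convention this file follows. */
nc_put_att_text(ncid, NC_GLOBAL, "Conventions", strlen(conventions), conventions);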
Climate and Forecast Conventions
• The CF Conventions are becoming a widely used standard for atmospheric, ocean, and climate data.
• The NetCDF Climate and Forecast (CF) Metadata Conventions, Version 1.3, describes consensus representations for climate and forecast data using the netCDF-3 data model.
LibCF
• The NetCDF CF Library supports the creation of scientific data files conforming to the CF conventions, using the netCDF API.
• Now distributed with netCDF.
• Now home of GRIDSPEC: a standard for the description of grids used in Earth System models, developed by V. Balaji, GFDL, proposed as a Climate and Forecast (CF) convention.
UDUNITS
• The Unidata units library, udunits, supports conversion of unit specifications between formatted and binary forms, arithmetic manipulation of unit specifications, and conversion of values between compatible scales of measurement.
• Now being distributed with netCDF.
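A minimal sketch of a value conversion with the udunits-2 C API (assuming the default unit database is installed; error checks omitted):

#include <udunits2.h>
/* Load the default unit database, parse two compatible units, and convert. */
ut_system *usys = ut_read_xml(NULL);
ut_unit *km = ut_parse(usys, "km", UT_ASCII);
ut_unit *mile = ut_parse(usys, "mile", UT_ASCII);
cv_converter *conv = ut_get_converter(km, mile);
double miles = cv_convert_double(conv, 42.0);   /* 42 km expressed in miles */
cv_free(conv);
ut_free(km);
ut_free(mile);
ut_free_system(usys);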
NetCDF 4.1.2 Release
• Performance improvements: much faster file
opens (factor of 200 speedup).
• Better memory handling, much better
testing for leaks and memory errors in
netCDF and HDF5.
• nccopy now can compress and re-chunk
data.
• Refactoring of dispatch layer (invisible to
user).
NetCDF Future Plans
• By “plans” we really mean “aspirations.”
• We use agile programming, with aggressive refactoring and heavy reliance on automatic testing.
• Our highest priority is fixing bugs so that we
do not have a bug-list to maintain and
prioritize.
Plans: Fortran Refactor
• We plan a complete Fortran re-factor within the
next year.
• Fortran 90 and Fortran 77 backward compatibility will be preserved. No user code will need to be rewritten.
• Fortran 90 compilers will be required (even for F77 API code). Fortran 77 compilers will not work with netCDF releases after the refactor.
• Fortran 90 API will be rewritten with Fortran
2003 C interoperability features. Fortran 77 API
will be rewritten in terms of Fortran 90 API.
Plans: Windows Port
• Recent refactoring of netCDF architecture
requires (yet another) Windows port. This is
planned for the end of 2010.
• Windows ports are not too hard, but require a
detailed knowledge of Microsoft's latest changes
and developments of the Windows platform.
• I invite collaboration with any Windows
programmer who would like to help with the
Windows port.
Plans: Virtual Files
• There are some uses (including
LibCF/GRIDSPEC) for disk-less netCDF files –
that is, files which exist only in memory.
• I am experimenting with this now – interested
users should contact me at:
ed@unidata.ucar.edu
Plans: More Formats
• The NetCDF Java library can read many formats
that are a mystery to the C-based library.
• Recent refactoring of the netCDF architecture
makes it easier to support additional formats.
• We would like to support GRIB and BUFR next.
We seek collaboration with interested users.
NetCDF Team – Russ Rew
• Vision.
• nccopy
• classic library
NetCDF Team – Ed Hartnett
• NetCDF-4
• Release engineering
• Parallel I/O
• LibCF
• Fortran libraries
NetCDF Team – Dennis Heimbigner
• OPeNDAP client.
• New ncdump/ncgen
• Some netCDF-Java
NetCDF Team – John Caron
• NetCDF-Java
• Common Data Model
Support
• Send bug reports to: support-netcdf@unidata.ucar.edu
• Your support email will enter a support tracking system, which will ensure that it does not get lost.
• But it may take us a while to solve your problem...
Snapshot Releases and Daily Testing
• Automatic daily test runs at Unidata ensure
that our changes don't break netCDF.
• Test results available on-line at NetCDF web
site.
• Daily snapshot release provided so users
can get latest code, and iterate fixes with
netCDF developers.
NetCDF Workshop
• Annual netCDF workshop is a good place to
learn the latest developments in netCDF,
and talk to netCDF developers.
• October 28-29, 2010, at the swanky Mesa Lab at NCAR – great views, mountain trails, without the usual riffraff.
• Preceded by data format summit.
Questions?
• Any questions?
