Introduction to NetCDF4
MuQun Yang
The HDF Group

11/6/2007

HDF and HDF-EOS Workshop XI, Landover, MD
Notes
• Requires basic knowledge of HDF5 and netCDF3
• Covers general NetCDF4 concepts
  - Several new features and their performance
• Covers some NetCDF4 APIs but won’t review all new APIs
• Is not a netCDF3 tutorial

Contents
• History review
• Overview of NetCDF4 features, builds, etc.
• Performance issues
• Suggestions for users

History Review
• Funded by NASA ESTO AIST Program
• Joint project between Unidata and HDF Group
• Used HDF5 as the storage layer of NetCDF

NetCDF-4/HDF5 Goals
• Combine desirable characteristics of netCDF and
HDF5, while taking advantage of their separate
strengths:
- Widespread use and simplicity of netCDF
- Generality and performance of HDF5

• Preserve format and API compatibility for netCDF
users
• Demonstrate benefits of combination in advanced
Earth science modeling efforts

(From Russ Rew et al.'s talk at the HDF and HDF-EOS Workshop VII)

NetCDF-4 Architecture
[Figure: NetCDF-4 architecture diagram. Labeled components: netCDF-3 applications, netCDF-4 applications, and HDF5 applications; the netCDF-3 interface, the netCDF-4 library, and the HDF5 library; netCDF files, netCDF-4 HDF5 files, and HDF5 files.]

(From Russ Rew et al.'s talk at the HDF and HDF-EOS Workshop VII)

Contents
• History review
• Overview of NetCDF4 features, builds, etc.
• Performance issues
• Suggestions for users

Current Status
• http://www.unidata.ucar.edu/software/netcdf/netcdf-4/
• 4.0 beta 1, based on HDF5 1.8 beta 1, released in April 2007
• 4.0 beta 2 release is coming soon

Compilers, platforms and language support
• Platforms
  - Linux, IBM AIX, Sun OS, HP-UX, OSF1, IRIX, Cygwin
• Programming languages
  - C/C++ and Fortran
• Compilers
  - Vendor compilers on the supported platforms
• Watch for snapshots:
  http://www.unidata.ucar.edu/software/netcdf/builds/snapshot/netcdf-4

Configuration
• Only NetCDF3 will be built if you just type ./configure
• Before building NetCDF4, one must
  - install HDF5 1.8 beta 1 or later (note: parallel HDF5 needs a separate build)
  - install the zlib library if using data compression
• To build the sequential version
  - ./configure --enable-netcdf-4 --with-hdf5=/HDF5path --with-zlib=/zlibpath
• To build the parallel version
  - ./configure --enable-netcdf-4 --enable-parallel --disable-shared --with-hdf5=/parallelHDF5path --with-zlib=/zlibpath

Parallel NetCDF4 needs more work. It has been tested on IBM AIX.

API Changes
• Existing APIs: essentially no differences, but with new flags
  NetCDF3:
    nc_create(FILE_NAME, NC_NOCLOBBER, &ncid);
  NetCDF4:
    nc_create(FILE_NAME, NC_NETCDF4, &ncid);
• New APIs added for new features, such as:
    nc_def_var_deflate(ncid, varid, shuffle, deflate, deflate_level)

In the original slides, blue text in an API signature marks an output parameter.

Overview of NetCDF4 new features
• Data Type
  - Compound data type
  - Variable length type
• Group
• Multiple Unlimited Dimension
• Compression
• Parallel IO

A compound datatype example
types:
  compound wind_vector_t {
    float eastward ;
    float northward ;
  }
dimensions:
  lat = 18 ;
  lon = 36 ;
  pres = 15 ;
  time = 4 ;
variables:
  wind_vector_t gwind(time, pres, lat, lon) ;
    gwind:long_name = "geostrophic wind vector" ;
    gwind:standard_name = "geostrophic_wind_vector" ;
data:
  gwind = {1, -2.5}, {-1, 2}, {20, 10}, {1.5, 1.5}, ...;

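As a rough illustration of how the CDL compound type above maps onto a C struct, the sketch below mirrors wind_vector_t in memory and records each field's byte offset and size, which is the kind of layout information a compound type definition needs. The field_desc struct and both helper functions are hypothetical names introduced here for illustration; they are not part of netCDF.

```c
#include <assert.h>
#include <stddef.h>

/* In-memory mirror of the CDL compound type wind_vector_t. */
typedef struct {
    float eastward;
    float northward;
} wind_vector_t;

/* Hypothetical descriptor for one compound field: name, byte offset, size. */
typedef struct {
    const char *name;
    size_t offset;
    size_t size;
} field_desc;

/* Fill `fields` (room for at least 2 entries) with the struct layout.
 * Returns the number of fields written. */
size_t wind_vector_fields(field_desc *fields)
{
    assert(fields != NULL);
    fields[0] = (field_desc){ "eastward", offsetof(wind_vector_t, eastward), sizeof(float) };
    fields[1] = (field_desc){ "northward", offsetof(wind_vector_t, northward), sizeof(float) };
    return 2;
}

/* Number of compound elements in gwind(time, pres, lat, lon),
 * using the dimension lengths from the CDL example. */
size_t gwind_element_count(void)
{
    const size_t time = 4, pres = 15, lat = 18, lon = 36;
    return time * pres * lat * lon;
}
```

With the dimension lengths from the CDL, gwind holds 4 x 15 x 18 x 36 = 38880 wind vectors, each stored as one wind_vector_t.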
Variable length type
Simple example: ragged array
types:
  float(*) row_of_floats ;
dimensions:
  m = 50 ;
variables:
  row_of_floats ragged_array(m) ;

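To make the ragged array concrete, here is a minimal sketch of what such data can look like in memory: an array of m rows, each a length plus a pointer to its own floats. The float_row struct is a hypothetical stand-in that mirrors the length-plus-pointer shape of nc_vlen_t, and the row lengths (row i holds i + 1 values) are an arbitrary choice for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one variable-length element. */
typedef struct {
    size_t len;  /* number of floats in this row */
    float *p;    /* heap storage for the row */
} float_row;

/* Release every row and then the row array itself. */
void ragged_free(float_row *rows, size_t m)
{
    if (rows == NULL) return;
    for (size_t i = 0; i < m; i++)
        free(rows[i].p);
    free(rows);
}

/* Allocate m rows where row i holds i + 1 zero-initialized floats.
 * Returns NULL if any allocation fails. */
float_row *ragged_alloc(size_t m)
{
    assert(m > 0);
    float_row *rows = calloc(m, sizeof *rows);
    if (rows == NULL) return NULL;
    for (size_t i = 0; i < m; i++) {
        rows[i].len = i + 1;
        rows[i].p = calloc(rows[i].len, sizeof *rows[i].p);
        if (rows[i].p == NULL) {
            ragged_free(rows, m);  /* rows not yet filled hold NULL pointers */
            return NULL;
        }
    }
    return rows;
}

/* Total number of floats stored across all rows. */
size_t ragged_total(const float_row *rows, size_t m)
{
    size_t total = 0;
    for (size_t i = 0; i < m; i++)
        total += rows[i].len;
    return total;
}
```

For m = 50 as in the CDL, this layout stores 1 + 2 + ... + 50 = 1275 floats, where a rectangular 50 x 50 array would need 2500.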
An Example – variable length and compound datatype
#include <netcdf.h>

/* FILE_NAME, DIM_NAME and DIM_LEN are assumed to be defined elsewhere. */
struct sea_sounding
{
    int sounding_no;
    nc_vlen_t temp_vl;
} data[DIM_LEN];
int ncid, dimid, varid;
nc_type temp_typeid, sounding_typeid;

/* 1. Create a netcdf-4 file. */
nc_create(FILE_NAME, NC_NETCDF4, &ncid);
/* 2. Create the vlen type, with a float base type. */
nc_def_vlen(ncid, "temp_vlen", NC_FLOAT, &temp_typeid);
/* 3. Create the compound type to hold a sea sounding. */
nc_def_compound(ncid, sizeof(struct sea_sounding), "sea_sounding", &sounding_typeid);
nc_insert_compound(ncid, sounding_typeid, "sounding_no",
                   NC_COMPOUND_OFFSET(struct sea_sounding, sounding_no), NC_INT);
nc_insert_compound(ncid, sounding_typeid, "temp_vl",
                   NC_COMPOUND_OFFSET(struct sea_sounding, temp_vl), temp_typeid);
/* 4. Define a dimension, and a 1D var of sea sounding compound type. */
nc_def_dim(ncid, DIM_NAME, DIM_LEN, &dimid);
nc_def_var(ncid, "fun_soundings", sounding_typeid, 1, &dimid, &varid);
/* 5. Write our array of sounding data to the file, all at once. */
nc_put_var(ncid, varid, data);
/* 6. Close the file. */
nc_close(ncid);

Group
• Use of Groups is optional, with backward compatibility
maintained by putting everything in the top-level
unnamed Group.
• Unlike HDF5, netCDF-4 requires that Groups form a
strict hierarchy.
• Potential uses for Groups include
o Factoring out common information
o Containers for data within regions, ensembles

o Organizing a large number of variables
o Providing name spaces for multiple uses of same names
for dimensions, variables, attributes
o Modeling large hierarchies
Group APIs
• APIs for creating groups (define APIs)
    nc_def_grp(parent_group_id, group_name, &group_id)
  Examples:
    nc_def_grp(ncid, HENRY_VII, &henry_vii_id)
    nc_def_grp(henry_vii_id, MARGARET, &margaret_id)
• APIs for inquiring information from a group (inquiry APIs)
  Number of groups:
    nc_inq_grps(group_id, &num_grps, NULL);
  Child group ID list:
    nc_inq_grps(group_id, NULL, group_id_list);
  Child group name:
    nc_inq_grpname(group_id_list[0], child_group_name);

Multiple Unlimited Dimension APIs
• APIs for defining multiple unlimited dimensions
  Old API with the same flag:
    nc_def_dim(ncid, dimension_name, NC_UNLIMITED, int *idp)
  Examples:
    nc_def_dim(ncid, dimname_1, NC_UNLIMITED, &dimid[0])
    nc_def_dim(ncid, dimname_2, NC_UNLIMITED, &dimid[1])
• APIs for inquiring about multiple unlimited dimensions
  Old API: nc_inq_unlimdim(ncid, int *idp)
  New API: nc_inq_unlimdims(ncid, int *nunlimdimsp, int unlimdimid[])
• How to use the new API
  1) First obtain the number of unlimited dimensions:
    nc_inq_unlimdims(ncid, &nunlimdims, NULL)
  2) Then obtain the list of unlimited dimension IDs:
    nc_inq_unlimdims(ncid, &nunlimdims, unlimdimid)

Compression
• Deflate now
• Scaleoffset, N-bit and maybe szip in the future
• Only need to add one routine:
    nc_def_var_deflate(int ncid,           /* netCDF ID */
                       int varid,          /* variable ID */
                       int shuffle,
                       int deflate,
                       int deflate_level);

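The shuffle argument turns on a byte-shuffle step ahead of deflate. The sketch below shows the general idea only, under the assumption that shuffling regroups the data so that byte 0 of every element comes first, then byte 1, and so on; it is not HDF5's filter code, and shuffle_bytes and unshuffle_bytes are hypothetical names. Grouping bytes this way tends to produce longer runs of similar values, which often helps deflate on numeric data.

```c
#include <assert.h>
#include <stddef.h>

/* Regroup `nelems` elements of `elem_size` bytes each so that `out` holds
 * byte 0 of every element, then byte 1 of every element, and so on.
 * `in` and `out` must not overlap. */
void shuffle_bytes(const unsigned char *in, unsigned char *out,
                   size_t nelems, size_t elem_size)
{
    assert(in != NULL && out != NULL && in != out);
    for (size_t e = 0; e < nelems; e++)
        for (size_t b = 0; b < elem_size; b++)
            out[b * nelems + e] = in[e * elem_size + b];
}

/* Inverse of shuffle_bytes: restore the original element-by-element order. */
void unshuffle_bytes(const unsigned char *in, unsigned char *out,
                     size_t nelems, size_t elem_size)
{
    assert(in != NULL && out != NULL && in != out);
    for (size_t e = 0; e < nelems; e++)
        for (size_t b = 0; b < elem_size; b++)
            out[e * elem_size + b] = in[b * nelems + e];
}
```

For two 4-byte elements 11 22 33 44 and 55 66 77 88, the shuffled buffer is 11 55 22 66 33 77 44 88, and unshuffling restores the original order.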
Compression example code
----- Data writing -----
1. Define variable
   nc_def_var(ncid, VAR_BYTE_NAME, NC_BYTE, 2, dimids, &byte_varid);

2. Set deflate compression
   nc_def_var_deflate(ncid, byte_varid, 0, 1, DEFLATE_LEVEL_3);

3. Write the data
   nc_put_var_schar(ncid, byte_varid, (signed char *)byte_out);

----- Data reading -----
   nc_get_var_schar(ncid, byte_varid, (signed char *)byte_in);

Parallel IO
• Supports either collective or independent access
• Supports MPI-IO or MPI-POSIX IO via parallel HDF5
• Special functions are used to create/open a netCDF file in parallel.

New APIs to do parallel IO
• nc_create_par
    nc_create_par(const char *path, int mode, MPI_Comm comm, MPI_Info info, int *ncidp)
  “mode” must be NC_NETCDF4|NC_MPIIO or NC_NETCDF4|NC_MPIPOSIX

• nc_var_par_access
    nc_var_par_access(int ncid, int varid, int data_access)
  “data_access” can be either NC_COLLECTIVE or NC_INDEPENDENT

• nc_open_par
    nc_open_par(const char *path, int mode, MPI_Comm comm, MPI_Info info, int *ncidp)
  “mode” must be either NC_MPIIO or NC_MPIPOSIX

Parallel IO Programming Model
Data writing:
/* 1. Initialize MPI. */
MPI_Init(&argc, &argv);

/* 2. Create a parallel netcdf-4 file. */
nc_create_par(FILE, NC_NETCDF4|NC_MPIIO, comm, info, &ncid);
/* (Define the dimensions and the variable v1id here.) */
nc_var_par_access(ncid, v1id, NC_COLLECTIVE);

/* 3. Write data. */
nc_put_vara_int(ncid, v1id, start, count, data);

/* 4. Close the file. */
nc_close(ncid);

/* 5. Shut down MPI. */
MPI_Finalize();

Data reading:
Use nc_open_par instead of nc_create_par

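In the writing step above, each process passes its own start and count to nc_put_vara_int so that the processes write disjoint parts of the variable. The slides do not show how those values are chosen; the sketch below is one common choice, a contiguous block decomposition along a single dimension. The function name block_decompose is hypothetical, and the code makes no MPI or netCDF calls, so the rank and process count are plain arguments.

```c
#include <assert.h>
#include <stddef.h>

/* Split `n` rows across `nprocs` ranks as evenly as possible.
 * On return, *start is the first row owned by `rank` and *count is how
 * many rows it owns. The first (n % nprocs) ranks get one extra row. */
void block_decompose(size_t n, int nprocs, int rank,
                     size_t *start, size_t *count)
{
    assert(nprocs > 0 && rank >= 0 && rank < nprocs);
    assert(start != NULL && count != NULL);
    size_t p = (size_t)nprocs;
    size_t r = (size_t)rank;
    size_t base = n / p;   /* rows every rank gets */
    size_t extra = n % p;  /* leftover rows, one each for the lowest ranks */
    *count = base + (r < extra ? 1 : 0);
    *start = r * base + (r < extra ? r : extra);
}
```

For 10 rows on 4 processes this gives (start, count) pairs of (0, 3), (3, 3), (6, 2) and (8, 2), which cover the rows exactly once.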
Other features
• Datatype
  - More atomic datatypes: unsigned integers (1, 2, 4 and 8 bytes)
  - Strings: replace character arrays
  - Enums, opaque types
  - User-defined datatypes
• Fletcher32 checksum filter
• UTF-8 support
• Reader-makes-right conversion
• Using HDF5 dimension scales

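For readers unfamiliar with the checksum named in the list above, here is a minimal sketch of the generic Fletcher-32 algorithm: two running sums over 16-bit words, each taken modulo 65535. It assumes little-endian word assembly and zero padding for an odd trailing byte; the filter inside HDF5 may differ in those details, so this is an illustration of the technique rather than a reimplementation of the filter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Generic Fletcher-32 over `len` bytes, read as 16-bit little-endian words.
 * An odd trailing byte is treated as a word with a zero high byte. */
uint32_t fletcher32(const unsigned char *data, size_t len)
{
    assert(data != NULL || len == 0);
    uint32_t sum1 = 0, sum2 = 0;
    for (size_t i = 0; i < len; i += 2) {
        uint32_t word = data[i];
        if (i + 1 < len)
            word |= (uint32_t)data[i + 1] << 8;
        sum1 = (sum1 + word) % 65535u;  /* simple sum of words */
        sum2 = (sum2 + sum1) % 65535u;  /* sum of the running sums */
    }
    return (sum2 << 16) | sum1;
}
```

Because the second sum depends on the order of the words, swapping two words changes the checksum, which a plain sum would not detect.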
Content
• History review
• Overview of NetCDF4 features, builds, etc.
• Performance issues
• Suggestions for users

NetCDF4 Data Compression: Size

[Figure: file size chart; the only surviving annotation reads "<2 %".]

NetCDF4 Data Compression: Data Write time

NetCDF4 Data Compression: Data Read Time

WRF Output in HDF5 -File Size
[Figure: chart of file size (MB, axis 0 to 2500) for four model runs (Run 1 to Run 4), with no compression versus with szip.]

WRF Output in HDF5- Data writing time
[Figure: chart of data writing time (axis 0 to 6) for four model runs (Run 1 to Run 4), with no compression versus with szip compression.]

EUMETNET OPERA Report in 2006
They evaluated the following data formats:
• FM 92 GRIB, NORDRAD, Universal Format,
• netCDF, HDF4, HDF5,
• XML and Scalable Vector Graphics (SVG), and GeoTIFF

Their recommendation:
• Based on the results of the detailed evaluation, HDF5 is recommended for consideration as an official European standard format for weather radar data and products.

Why?
• Compared to other formats, HDF5’s compression algorithm (ZLIB) is more efficient…
• A file format with efficient compression and platform independence is essential

PyTables
One of the beauties of PyTables is that it supports compression on tables and arrays

Evaluation of Parallel NetCDF4 Performance
• Regional Ocean Modeling System (ROMS)
• History file writer in parallel NetCDF4 (PnetCDF4)
• History file writer in Parallel NetCDF from Argonne (PnetCDF)
• Data:
  - 60 1D-4D double-precision float and integer arrays

PnetCDF4 and PnetCDF performance comparison

[Figure: chart of bandwidth (MB/s, axis 0 to 160) versus number of processors (axis 0 to 144), comparing PnetCDF collective and NetCDF4 collective.]

• Fixed problem size = 995 MB
• Performance of PnetCDF4 is close to PnetCDF

ROMS Output with Parallel NetCDF4
[Figure: chart of bandwidth (MB/s, axis 0 to 300) versus number of processors (axis 0 to 144) for two output sizes, 995 MB and 15.5 GB.]

• I/O performance improves as the file size increases.
• Parallel NetCDF4 can provide decent I/O performance for large problem sizes.

Chunking
• Use chunking wisely
• Review the chunking tips for HDF5

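One way to reason about a chunk shape is to count how many chunks a dataset needs and how many chunks a typical read touches, since in chunked storage a chunk is read, written and compressed as a whole. The sketch below does that arithmetic; chunk_count and chunks_touched_1d are hypothetical helper names, and the code only computes numbers, it does not call HDF5 or netCDF.

```c
#include <assert.h>
#include <stddef.h>

/* Number of chunks needed to cover a dataset: ceil(dim / chunk) in each
 * dimension, multiplied together. Edge chunks count as whole chunks. */
size_t chunk_count(const size_t *dims, const size_t *chunks, size_t ndims)
{
    assert(dims != NULL && chunks != NULL);
    size_t total = 1;
    for (size_t i = 0; i < ndims; i++) {
        assert(chunks[i] > 0);
        total *= (dims[i] + chunks[i] - 1) / chunks[i];
    }
    return total;
}

/* Number of chunks touched along one dimension by a read or write that
 * covers `count` elements starting at index `start`. */
size_t chunks_touched_1d(size_t start, size_t count, size_t chunk)
{
    assert(chunk > 0);
    if (count == 0) return 0;
    size_t first = start / chunk;
    size_t last = (start + count - 1) / chunk;
    return last - first + 1;
}
```

For a 100 x 100 variable with 30 x 30 chunks, the dataset needs 4 x 4 = 16 chunks, and a read of 10 elements starting at index 25 crosses a chunk boundary and therefore touches 2 chunks in that dimension.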
Content
• History review
• Overview of NetCDF4 features, builds, etc.
• Performance issues
• Suggestions for users

NetCDF Classic Model

Using the NetCDF Classic Model
• NetCDF-4 files can be created with the CLASSIC_MODEL flag. This enforces the rules of the classic netCDF data model on this file.
    nc_create(FILE_NAME, NC_NETCDF4|NC_CLASSIC_MODEL, &ncid)
• Once a classic model file, always a classic model file. The flag sticks with the file and there is no way to change it within the netCDF API.
• Classic model files don't use any elements of the expansion of the data model in netCDF-4. They don't have groups, user-defined types, multiple unlimited dimensions, or the new atomic types.
• Since they conform to the classic model, they can be read and understood by any existing netCDF software (as soon as that software upgrades to netCDF-4 and HDF5 1.8.0).
• NetCDF-4 features which don't affect the data model are still available: compression, parallel I/O.

HDF5 Features not in current NetCDF4.0
• No Scaleoffset, N-bit, or szip filters (planned for the 4.1 release)
• No support for user-defined filters
• Can only read HDF5 files that have dimension scales
• Can only write data in chunked storage
• No Fortran 90 APIs
• No corresponding APIs for optimizations
  - cache, MPI-IO

NetCDF 4.1 Plan
• http://www.unidata.ucar.edu/software/netcdf/netcdf4/req_4_1.html

NetCDF4 or HDF5: which one should I use?
Evaluate the following:
• Familiarity
• Features
• Performance
• Compatibility
• Release/feature lags

Based on stability of NetCDF4

Priority
High performance + many advanced HDF5 features

Recommendation
HDF5 definitely

Care about performance, possibly need to use many new advanced features

NetCDF4: avoid transition cost from NetCDF to HDF5

NetCDF4: maybe

1. Just need one or two HDF5 features for intensive NetCDF applications
   NetCDF4/CLASSIC_MODEL (compression, parallel IO)
2. Existing NetCDF software or applications that don’t care about performance

HDF5: maybe

NetCDF4 definitely

More NetCDF4 information
• Release and snapshot:
  http://www.unidata.ucar.edu/software/netcdf/netcdf-4/
• Tutorial in 2007 NetCDF workshop:
  http://www.unidata.ucar.edu/software/netcdf/workshops/2
• Paper in 2006 AMS annual meeting:
  http://www.unidata.ucar.edu/software/netcdf/papers/2006-ams.pdf

Acknowledgements
• Thanks to Russ Rew and Ed Hartnett from Unidata for generously allowing me to use their slides and for sharing their compression performance results in this workshop
• Some content describing new features of NetCDF4 is copied from the 2007 Unidata NetCDF workshop
• The radar NetCDF data compression performance results were provided by Ed Hartnett at Unidata

11/6/2007

HDF and HDF-EOS Workshop XI, Landover, MD

More Related Content

PPTX
Writing a 5-Paragraph Essay
PDF
Working with HDF and netCDF Data in ArcGIS: Tools and Case Studies
PDF
Module net cdf4
PPTX
Interoperability with netCDF-4 - Experience with NPP and HDF-EOS5 products
PDF
LCI2009-Tutorial
PDF
LCI2009-Tutorial
PPT
Real IO and Parallel NetCDF4 Performance
Writing a 5-Paragraph Essay
Working with HDF and netCDF Data in ArcGIS: Tools and Case Studies
Module net cdf4
Interoperability with netCDF-4 - Experience with NPP and HDF-EOS5 products
LCI2009-Tutorial
LCI2009-Tutorial
Real IO and Parallel NetCDF4 Performance

Similar to Introduction to NetCDF-4 (20)

PPTX
MATLAB, netCDF, and OPeNDAP
PPTX
Adding CF Attributes to an HDF5 File
PPT
HDF OPeNDAP project update and demo
PPSX
NASA HDF/HDF-EOS Data for Dummies (and Developers)
PPT
Ensuring Long Term Access to Remotely Sensed HDF4 Data with Layout Maps
PPTX
Tools to improve the usability of NASA HDF Data
PPTX
PPTX
Moving form HDF4 to HDF5/netCDF-4
PDF
Plans for Enhanced NetCDF-4 Interface to HDF5 Data
MATLAB, netCDF, and OPeNDAP
Adding CF Attributes to an HDF5 File
HDF OPeNDAP project update and demo
NASA HDF/HDF-EOS Data for Dummies (and Developers)
Ensuring Long Term Access to Remotely Sensed HDF4 Data with Layout Maps
Tools to improve the usability of NASA HDF Data
Moving form HDF4 to HDF5/netCDF-4
Plans for Enhanced NetCDF-4 Interface to HDF5 Data
Ad

More from The HDF-EOS Tools and Information Center (20)

PDF
HDF5 2.0: Cloud Optimized from the Start
PDF
Using a Hierarchical Data Format v5 file as Zarr v3 Shard
PDF
Cloud-Optimized HDF5 Files - Current Status
PDF
Cloud Optimized HDF5 for the ICESat-2 mission
PPTX
Access HDF Data in the Cloud via OPeNDAP Web Service
PPTX
Upcoming New HDF5 Features: Multi-threading, sparse data storage, and encrypt...
PPTX
The State of HDF5 / Dana Robinson / The HDF Group
PDF
Cloud-Optimized HDF5 Files
PDF
Accessing HDF5 data in the cloud with HSDS
PPTX
Highly Scalable Data Service (HSDS) Performance Features
PDF
Creating Cloud-Optimized HDF5 Files
PPTX
HDF5 OPeNDAP Handler Updates, and Performance Discussion
PPTX
Hyrax: Serving Data from S3
PPSX
Accessing Cloud Data and Services Using EDL, Pydap, MATLAB
PDF
HDF - Current status and Future Directions
PPSX
HDFEOS.org User Analsys, Updates, and Future
PPTX
HDF - Current status and Future Directions
PDF
H5Coro: The Cloud-Optimized Read-Only Library
PPTX
MATLAB Modernization on HDF5 1.10
HDF5 2.0: Cloud Optimized from the Start
Using a Hierarchical Data Format v5 file as Zarr v3 Shard
Cloud-Optimized HDF5 Files - Current Status
Cloud Optimized HDF5 for the ICESat-2 mission
Access HDF Data in the Cloud via OPeNDAP Web Service
Upcoming New HDF5 Features: Multi-threading, sparse data storage, and encrypt...
The State of HDF5 / Dana Robinson / The HDF Group
Cloud-Optimized HDF5 Files
Accessing HDF5 data in the cloud with HSDS
Highly Scalable Data Service (HSDS) Performance Features
Creating Cloud-Optimized HDF5 Files
HDF5 OPeNDAP Handler Updates, and Performance Discussion
Hyrax: Serving Data from S3
Accessing Cloud Data and Services Using EDL, Pydap, MATLAB
HDF - Current status and Future Directions
HDFEOS.org User Analsys, Updates, and Future
HDF - Current status and Future Directions
H5Coro: The Cloud-Optimized Read-Only Library
MATLAB Modernization on HDF5 1.10
Ad

Recently uploaded (20)

PDF
cuic standard and advanced reporting.pdf
PPTX
Programs and apps: productivity, graphics, security and other tools
PDF
Reach Out and Touch Someone: Haptics and Empathic Computing
PDF
Machine learning based COVID-19 study performance prediction
PDF
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
PDF
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
PDF
Unlocking AI with Model Context Protocol (MCP)
PDF
The Rise and Fall of 3GPP – Time for a Sabbatical?
PDF
Per capita expenditure prediction using model stacking based on satellite ima...
PDF
Building Integrated photovoltaic BIPV_UPV.pdf
PDF
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
PPTX
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
PPTX
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
PDF
Electronic commerce courselecture one. Pdf
PDF
Approach and Philosophy of On baking technology
PDF
MIND Revenue Release Quarter 2 2025 Press Release
PDF
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
PDF
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
DOCX
The AUB Centre for AI in Media Proposal.docx
PDF
Encapsulation_ Review paper, used for researhc scholars
cuic standard and advanced reporting.pdf
Programs and apps: productivity, graphics, security and other tools
Reach Out and Touch Someone: Haptics and Empathic Computing
Machine learning based COVID-19 study performance prediction
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
Unlocking AI with Model Context Protocol (MCP)
The Rise and Fall of 3GPP – Time for a Sabbatical?
Per capita expenditure prediction using model stacking based on satellite ima...
Building Integrated photovoltaic BIPV_UPV.pdf
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
Electronic commerce courselecture one. Pdf
Approach and Philosophy of On baking technology
MIND Revenue Release Quarter 2 2025 Press Release
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
The AUB Centre for AI in Media Proposal.docx
Encapsulation_ Review paper, used for researhc scholars

Introduction to NetCDF-4

  • 1. Introduction to NetCDF4 MuQun Yang The HDF Group 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 2. Notes • Require basic knowledge of HDF5 and netCDF3 • Cover general NetCDF4 concepts - Several new features and their performances • Cover some NetCDF4 APIs but won’t review all new APIs • Is not a netCDF3 tutorial 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 3. Contents History review • Overview of NetCDF4 features, builds and etc • Performance issues • Suggestions for users • 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 4. History Review • Funded by NASA ESTO AIST Program • Joint project between Unidata and HDF Group • Used HDF5 as the storage layer of NetCDF 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 5. NetCDF-4/HDF5 Goals • Combine desirable characteristics of netCDF and HDF5, while taking advantage of their separate strengths: - Widespread use and simplicity of netCDF - Generality and performance of HDF5 • Preserve format and API compatibility for netCDF users • Demonstrate benefits of combination in advanced Earth science modeling efforts (From : Russ Rew etc’s talk at VII HDF and HDF-EOS workshop) 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 6. NetCDF-4 Architecture netCDF-3 netCDF-3 applications applications netCDF netCDF files files netCDF-4 HDF5 files HDF5 files netCDF-4 netCDF-4 applications applications netCDF-3 Interface netCDF-4 Library HDF5 Library (From : Russ Rew etc’s talk at VII HDF and HDF-EOS workshop) 11/6/2007 HDF5 HDF5 applications applications HDF and HDF-EOS Workshop XI, Landover, MD
  • 7. 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 8. Contents History review • Overview of NetCDF4 features, builds and etc • Performance issues • Suggestions for users • 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 9. Current Status • http://www.unidata.ucar.edu/software/netcdf/netcdf-4/ • 4.0 beta • 1 based on HDF5 1.8 beta 1 on April, 2007 4.0 beta 2 release is coming soon 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 10. Compilers, platforms and language supports • Platforms - Linux, IBM AIX, Sun OS, HP-UX, OSF1, IRIX, Cygwin • Programming Languages - C/C++ and fortran • Compilers - Vendor compilers on the supported platforms • Watch for Snapshot http://www.unidata.ucar.edu/software/netcdf/builds/snapshot/netcdf-4 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 11. Configuration • Only NetCDF3 will be built if you just type ./configure • Before building NetCDF4, one must - install HDF5 1.8 beta 1 or later (note: parallel HDF5 needs separate build) - install zlib library if using data compression • To build sequential version - ./configure --enable-netcdf-4 --with-hdf5=/HDF5path --with-zlib=/zlibpath • To build parallel version - ./configure --enable-netcdf-4 –enable-parallel –disable-shared --with-hdf5=/parallel HDF5path --with-zlib=/zlibpath Parallel NetCDF4 needs more work. It has been tested on IBM AIX. 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 12. API Changes • Existing APIs: Essentially no differences but with new flags NetCDF3: nc_create(FILE_NAME, NC_NOCLOBBER, &ncid); NetCDF4: nc_create(FILE_NAME, NC_NETCDF4,&ncid); • Adding new APIs for new features such as: nc_def_var_deflate(ncid, varid, shuffle, deflate, deflate level) Hereafter blue color in APIS implies this is an output parameter 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 13. Overview of NetCDF4 new features • Data Type - Compound data type - Variable length type • • • • Group Multiple Unlimited Dimension Compression Parallel IO 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 14. A compound datatype example types: compound wind_vector_t { float eastward ; float northward ; } dimensions: lat = 18 ; lon = 36 ; pres = 15 ; time = 4 ; variables: wind_vector_t gwind(time, pres, lat, lon) ; wind:long_name = "geostrophic wind vector" ; wind:standard_name = "geostrophic_wind_vector" ; data: gwind = {1, -2.5}, {-1, 2}, {20, 10}, {1.5, 1.5}, ...; 11/8/2007 HDF and HDF-EOS Workshop XI, Landover, MD 14
  • 15. Variable length type Simple example: ragged array types: float(*) row_of_floats; dimensions: m = 50; variables: row_of_floats ragged_array(m); 11/8/2007 HDF and HDF-EOS Workshop XI, Landover, MD 15
  • 16. An Example – variable length and compound datatype struct sea_sounding { int sounding_no; nc_vlen_t temp_vl; } data[DIM_LEN]; /*1. Create a netcdf-4 file. */ nc_create(FILE_NAME, NC_NETCDF4, &ncid); /* 2. Create the vlen type, with a float base type. */ nc_def_vlen(ncid, "temp_vlen", NC_FLOAT, &temp_typeid); /* 3. Create the compound type to hold a sea sounding. */ nc_def_compound(ncid, sizeof(struct sea_sounding), "sea_sounding", &sounding_typeid); nc_insert_compound(ncid, sounding_typeid, "sounding_no", NC_COMPOUND_OFFSET(struct sea_sounding, sounding_no), NC_INT); nc_insert_compound(ncid, sounding_typeid, "temp_vl", NC_COMPOUND_OFFSET(struct sea_sounding, temp_vl), temp_typeid); /* 4. Define a dimension, and a 1D var of sea sounding compound type. */ nc_def_dim(ncid, DIM_NAME, DIM_LEN, &dimid); nc_def_var(ncid, "fun_soundings", sounding_typeid, 1, &dimid, &varid); /* 5. Write our array of phone data to the file, all at once. */ nc_put_var(ncid, varid, data); /*6. Close the file*/ nc_close(ncid); 11/8/2007 HDF and HDF-EOS Workshop XI, Landover, MD 16
  • 17. Group • Use of Groups is optional, with backward compatibility maintained by putting everything in the top-level unnamed Group. • Unlike HDF5, netCDF-4 requires that Groups form a strict hierarchy. • Potential uses for Groups include o Factoring out common information o Containers for data within regions, ensembles o Organizing a large number of variables o Providing name spaces for multiple uses of same names for dimensions, variables, attributes o Modeling large hierarchies 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 18. Group APIs • APIs for creating group( define APIs) nc_def_grp(parent_group_id, group name, &group_id) Examples: nc_def_grp(ncid, HENRY_VII, &henry_vii_id) nc_def_grp(henry_vii_id, MARGARET, &margaret_id) • APIs for inquiring information from a group ( inquiry APIs) number of groups: nc_inq_grps(group_id, &num_grps, NULL); children group id list: nc_inq_grps(group_id, NULL, group_id_list); children group name: nc_inq_grpname(group_id_list[0], children_group_name); 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 19. Multiple Unlimited Dimension APIs • APIs for defining multiple unlimited dimensions Old API with the same flag: nc_def_dim(ncid, dimension name, NC_UNLIMITED, int *idp) Examples: nc_def_dim(ncid, dimname_1, NC_UNLIMITED, &dimid[0]) nc_def_dim(ncid, dimname_2,NC_UNLIMITED, &dimid[1]) • APIs for inquiring multiple dimensions Old API with the same flag: nc_inq_unlimdim(ncid,,int *idp) New API: nc_inq_unlimdims(ncid, • int nunlimdims_in, int unlimdimid[ ]) How to use the new API 1) First obtain the number of unlimited dimensions: nc_inq_unlimdims(ncid, &nunlimdims ,NULL) 2) Then obtain the unlimited dimensional list: nc_inq_unlimdims(ncid, &nunlimdims, unlimdimid) 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 20. Compression • Deflate now • Scaleoffset, N-bit and maybe szip in the future • Only need to add one routine nc_def_var_deflate( int netcdf id, int variable id, int shuffle, int deflate, int deflate_level); 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 21. Compression example code ----- Data writing -------1. Define variable nc_def_var(ncid, VAR_BYTE_NAME, NC_BYTE, 2, dimids, &byte_varid); 2. Set deflate compression nc_def_var_deflate(ncid, byte_varid, 0, 1, DEFLATE_LEVEL_3); 3. Write the data nc_put_var_schar(ncid, byte_varid, (signed char *)byte_out); ----- Data reading -------nc_get_var_schar(ncid, byte_varid, (signed char *)byte_in); 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 22. Parallel IO • Support either collective or independent • Support MPI-IO or MPI-POSIX IO via parallel HDF5 • Special functions are used to create/open a netCDF file in parallel. 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 23. New APIs to do parallel IO • nc_create_par nc_create_par (const char *path, int mode,MPI_Comm comm, MPI_Info info, int *ncidp) “mode” must be NC_NETCDF4|NC_MPIIO or NC_NETCDF4|NC_MPIPOSIX • nc_var_par_access nc_var_par_access (int ncid, int var_id, int data_access ) Data_access can be either NC_COLLECTIVE or NC_INDEPENDENT • nc_open_par nc_open_par (const char *path,int mode ,MPI_Comm comm, MPI_Info info,&ncid) “mode” must be either NC_MPIIO or NC_MPIPOSIX 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 24. Parallel IO Programming Model Data writing : /* 1. Initialize MPI. */ MPI_Init(&argc,&argv) /* 2. Create a parallel netcdf-4 file. */ nc_create_par(FILE, NC_NETCDF4|NC_MPIIO, comm, info, &ncid) nc_var_par_access(ncid, v1id, NC_COLLECTIVE) /* 3. Write data. */ nc_put_vara_int(ncid, v1id, start, count,data ) /*4. Close the file */ nc_close(ncid); /* 5. Shut down MPI. */ MPI_Finalize(); Data reading: Use nc_open_par instead of nc_create_par 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 25. Other features • Datatype - • • • • More atomic datatype: unsigned integer(1,2,4 and 8 bytes) Strings: replace character arrays Enums,Opaque types User-defined datatype Fletcher32 checksum filter UTF-8 support Reader-Makes-Right conversion Using HDF5 dimensional scale 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 26. Content History review • Overview of NetCDF4 features, builds and etc • Performance issues • Suggestions for users • 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 27. NetCDF4 Data Compression: Size <2 % 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 28. NetCDF4 Data Compression: Data Write time 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 29. NetCDF4 Data Compression: Data Read Time 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 30. WRF Output in HDF5 -File Size 2500 No Compression With szip 2000 1500 ) B M ( z S e l i F 1000 500 0 Run 1 Run 2 Run 3 Four Model Runs 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD Run 4
  • 31. WRF Output in HDF5- Data writing time 6 No Compression 5 With szip compression 4 3 2 s u n m e i r w t a D 1 0 Run 1 Run 2 Run 3 Four Model Runs 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD Run 4
  • 32. EUMETNET OPERA Report in 2006 They evaluated the following data format:  FM 92 GRIB, NORDRAD, Universal Format,  netCDF, HDF4,HDF5,  XML and Scalable Vector Graphics (SVG), and GeoTIFF Their Recommendation: • Based on the results of the detailed evaluation, HDF5 is recommended for consideration as an official European standard format for weather radar data and products. Why? • Compared to other formats, HDF5’s compression algorithm (ZLIB) is more efficient… • A file format with efficient compression and platform independence is essential PyTables One of the beauties of PyTables is that it supports compression on tables and arrays 11/6/2007 HDF and HDF-EOS Workshop XI, Landover, MD
  • 33. Evaluation of Parallel NetCDF4 Performance
    • Regional Ocean Modeling System (ROMS)
    • History file writer in parallel NetCDF4 (PnetCDF4)
    • History file writer in parallel NetCDF from Argonne (PnetCDF)
    • Data:
      - 60 1D-4D double-precision float and integer arrays
  • 34. PnetCDF4 and PnetCDF performance comparison
    [Chart: bandwidth (MB/s), axis 0 to 160, versus number of processors, axis 0 to 144; series: PnetCDF collective, NetCDF4 collective]
    • Fixed problem size = 995 MB
    • Performance of PnetCDF4 is close to PnetCDF
  • 35. ROMS Output with Parallel NetCDF4
    [Chart: bandwidth (MB/s), axis 0 to 300, versus number of processors, axis 0 to 144; series: output size 995 MB, output size 15.5 GB]
    • I/O performance improves as the output size increases.
    • Parallel NetCDF4 can provide decent I/O performance for large problem sizes.
  • 36. Chunking
    • Use chunking wisely
    • Review chunking tips for HDF5
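One reason the chunk shape matters is that HDF5 reads (and, with compression, decompresses) a whole chunk whenever any part of it is accessed. The sketch below counts how many chunks a rectangular read touches, under the assumption that chunks are aligned to the origin of the array. The function names chunks_touched_1d and chunks_touched are hypothetical helpers, not HDF5 or netCDF calls.

```c
#include <assert.h>
#include <stddef.h>

/* Number of chunks along one dimension that overlap the index range
 * [start, start + count), for chunks of length chunk_len starting at 0. */
size_t chunks_touched_1d(size_t start, size_t count, size_t chunk_len)
{
    assert(chunk_len > 0);
    if (count == 0)
        return 0;
    size_t first = start / chunk_len;                /* chunk holding the first element */
    size_t last  = (start + count - 1) / chunk_len;  /* chunk holding the last element */
    return last - first + 1;
}

/* Number of chunks touched by an n-dimensional rectangular selection:
 * the product of the per-dimension counts. */
size_t chunks_touched(size_t ndims, const size_t *start,
                      const size_t *count, const size_t *chunk)
{
    size_t total = 1;
    for (size_t d = 0; d < ndims; d++)
        total *= chunks_touched_1d(start[d], count[d], chunk[d]);
    return total;
}
```

As an illustration, reading a single column of a 1000 x 1000 array touches 1000 chunks when each chunk is one full row (1 x 1000), but only 10 chunks when the chunks are 100 x 100 squares. Comparing such counts for the access patterns an application actually uses is one way to choose a chunk shape.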
  • 37. Contents
    • History review
    • Overview of NetCDF4 features, builds, etc.
    • Performance issues
    • Suggestions for users
  • 38. NetCDF Classic Model
  • 39. Using the NetCDF Classic Model
    • NetCDF-4 files can be created with the CLASSIC_MODEL flag. This enforces the rules of the classic netCDF data model on the file:
        nc_create(FILE_NAME, NC_NETCDF4|NC_CLASSIC_MODEL, &ncid)
    • Once a classic model file, always a classic model file. The setting stays with the file, and there is no way to change it within the netCDF API.
    • Classic model files don't use any elements of the expansion of the data model in netCDF-4. They don't have groups, user-defined types, multiple unlimited dimensions, or the new atomic types.
    • Since they conform to the classic model, they can be read and understood by any existing netCDF software (as soon as that software upgrades to netCDF-4 and HDF5 1.8.0).
    • NetCDF-4 features which don't affect the data model are still available: compression, parallel I/O.
  • 40. HDF5 Features not in current NetCDF 4.0
    • No Scaleoffset, N-bit, or szip filters (planned for the 4.1 release)
    • No support for user-defined filters
    • Can only read HDF5 files that have dimension scales
    • Can only write data in chunked storage
    • No Fortran 90 APIs
    • No corresponding APIs for optimizations (cache, MPI-IO)
  • 41. NetCDF 4.1 Plan
    • http://www.unidata.ucar.edu/software/netcdf/netcdf4/req_4_1.html
  • 42. NetCDF4 or HDF5: which one should I use?
    Evaluate the following:
    • Familiarity
    • Features
    • Performance
    • Compatibility
    • Release/feature lags
  • 43. Based on stability of NetCDF4
    • Priority: high performance plus many advanced HDF5 features. Recommendation: HDF5, definitely.
    • Priority: care about performance and possibly need to use many new advanced features; NetCDF4 avoids the transition cost from NetCDF to HDF5. Recommendation: HDF5, maybe; NetCDF4, maybe.
    • Priority: (1) just need one or two HDF5 features for intensive NetCDF applications, using NetCDF4/CLASSIC_MODEL (compression, parallel IO); (2) existing NetCDF software or applications that don’t care about performance. Recommendation: NetCDF4, definitely.
  • 44. More NetCDF4 information
    • Release and snapshot: http://www.unidata.ucar.edu/software/netcdf/netcdf-4/
    • Tutorial in 2007 NetCDF workshop: http://www.unidata.ucar.edu/software/netcdf/workshops/2
    • Paper in 2006 AMS annual meeting: http://www.unidata.ucar.edu/software/netcdf/papers/2006-ams.pdf
  • 45. Acknowledgements
    • Thanks to Russ Rew and Ed Hartnett from Unidata for generously allowing me to use their slides and for sharing their compression performance results in this workshop
    • Some content describing the new features is copied from the 2007 Unidata NetCDF workshop
    • The radar NetCDF data compression performance results were provided by Ed Hartnett at Unidata

Editor's Notes

  • #13: Here is something you can do: compound datatype; multi-dimensional variable; ROMS; compression; variable length. Performance: parallel IO, chunking. What kind of HDF5 files can be read by the NetCDF interface? Tools: ncdump
  • #32: Pytables, Radar Swedde
  • #33: EUMETNET: The network of European Meteorological Services (23 countries). OPERA: Operational Programme for the Exchange of weather RAdar information