Performance Tuning in HDF5

02/18/14

HDF and HDF-EOS Workshop X, Landover, MD

Outline
• HDF5 file overhead
• Performance and options for tuning
• Chunking
• Caching

HDF5 file overhead
• Why is my HDF5 file so big?
• Each HDF5 object has an overhead (object header, B-trees, global heaps)
• Examples:
  • Empty HDF5 file (has root group): 976 bytes
  • File with one empty group: 1952 bytes
  • Dataset without data written: 1576 bytes
  • Dataset with 4MB of data written (1000x1000 int): 4002048 bytes
  • Dataset with chunked storage, 25 chunks: 4004664 bytes
  • Dataset with chunked storage, 10^4 chunks: 4497104 bytes
  • Dataset with chunked storage, 25 chunks, with compression: 34817 bytes (actual data was 8-bit integer)
HDF5 file overhead
• Overhead may come from datatypes
• Example: Compound datatype
typedef struct s1_t {
int a;
char d;
float b;
double c;
} s1_t;

• If compiler aligns data on 8 byte boundaries, we have 11 bytes
overhead for each element
• Use H5Tpack to “compress” data

HDF5 file overhead
• Example: Variable length data
typedef struct vl_t {
int len;
void *p;
} vl_t;

• ~20 bytes overhead for each element
• Raw data is stored in global heaps
• Cannot be compressed
• Opening/closing a VL dataset will increase overhead in a file due to the fragmented global heap

HDF5 file overhead
• Example: 9-bit integer
• Takes at least 16 bits; 7 bits are overhead
• Use the N-bit filter to compress (1.8.0 release)

Summary
• File
• Try to keep file open as long as possible; don’t open/close
if not necessary

• Groups
• Use compact storage (available in 1.8.0) for groups with a
few members

• Datasets
• Use compact storage for small (<64K) datasets
• If many datasets are of the same datatype, use shared
datatype
• Use appropriate chunking and compression strategy
HDF5 chunking
• Chunking refers to a storage layout where a dataset is partitioned
into fixed-size multi-dimensional chunks
• Used when dataset is
• Extendible
• Compressed
• Checksum or other filters are applied
• HDF5 chunk’s properties
• Chunk has the same number of dimensions as a dataset
• Chunks cover the entire dataset, but the dataset need not be an integral
number of chunks
• If no data is ever written to a chunk, chunk is not allocated in a
file
• Chunk is an atomic object for I/O operation (e.g. written or
read in one I/O operation)
HDF5 chunking

[Figure: a dataset covered by chunks numbered 1-12; data has been written to a region that touches every chunk except 1, 6, and 11.]

• Dataset is covered by 12 chunks
• Chunks 1, 6, and 11 are never allocated in the file
• Compression, encryption, checksum, etc. are performed on the entire chunk
HDF5 filter pipeline
Example: H5Dwrite touches only a few bytes in a chunk (chunk #12)
• Entire chunk is read from the file
• Data passes through the filter pipeline
• The few bytes are modified
• Data passes back through the filter pipeline
• Entire chunk is written back to the file
  • May be written to another location, leaving a hole in the file
  • Can increase file size
HDF5 filter pipeline and chunk cache
Example: H5Dwrite touches only a few bytes in a chunk
• Calling H5Dwrite many times would result in poor performance
• Chunk cache layer
  • Holds 521 chunks or 1MB (whichever is less)
  • Use the H5Pset_cache call to modify the cache
HDF5 chunk cache
• The preemption policy for the cache favors certain chunks and
tries not to preempt them.
• Chunks that have been accessed frequently in the near past are
favored.
• A chunk which has just entered the cache is favored.
• A chunk which has been completely read or completely written, but not
partially read or written, is penalized according to an application-specified
weighting between zero and one.
• A chunk which is larger than the maximum cache size is not eligible for caching.

HDF5 chunk overhead in a file
• B-tree maps chunk N-dimensional addresses to file addresses
• B-tree grows with the number of chunks
• File storage overhead is higher
• More disk I/O is required to traverse the tree from root to leaves
• Large # of B-tree nodes will result in higher contention for metadata
cache

• To reduce overhead
• Use bigger chunk sizes; don't use chunk size 1

HDF5 metadata cache
• Outline
  • Overview of HDF5 metadata cache
  • Implementation prior to the 1.8.0 release
  • Adaptive metadata cache in 1.8.0
  • Documentation is available in the UG and RM for the 1.8.0 release
    • http://hdf.ncsa.uiuc.edu/HDF5/doc_dev_snapshot/H5_dev/UG/UG_frame17SpecialTopics.html

Overview of HDF5 metadata cache
• Metadata – extra information about your data
• Structural metadata
  • Stores information about your data
  • Example: when you create a group, you really create:
    • Group header
    • B-tree (to index entries), and
    • Local heap (to store entry names)
• User-defined metadata (created via the H5A calls)
• Metadata is usually small – less than 1 KB
• Accessed frequently
• Small disk accesses are still expensive
Overview of HDF5 metadata cache
• Cache
  • An area of storage devoted to the high-speed retrieval of frequently used data
• HDF5 metadata cache
  • A module that tries to keep frequently used metadata in core so as to avoid file I/O
  • Exists to enhance performance
  • Limited size – can't hold all the metadata all the time
• Cache hit
  • A metadata access request that is satisfied from cache
  • Saves a file access
• Cache miss
  • A metadata access request that can't be satisfied from cache
  • Costs a file access (several milliseconds in the worst case)
Overview of HDF5 metadata cache
• Dirty metadata
  • Metadata that has been altered in cache but not written to file
• Eviction
  • The removal of a piece of metadata from the cache
• Eviction policy
  • Procedure for selecting metadata to evict
• Principle of locality
  • File access tends not to be random
  • Metadata just accessed is likely to be accessed again soon
  • This is the reason why caching is practical
• Working set
  • Subset of the metadata that is in frequent use at a given point in time
  • Size highly variable depending on file structure and access pattern
Overview of HDF5 metadata cache

Scenario                                             Working set size   # of metadata cache accesses
Create datasets A, B, C, D;                          < 1MB              < 50K
10^6 chunks under the root group
Initialize the chunks using a round robin            < 1MB              ~30M
(1 from A, 1 from B, 1 from C, 1 from D;
repeat until done)
10^6 random accesses across A, B, C and D            ~120MB             ~4M
10^6 random accesses to A only                       ~40MB              ~4M
Overview of HDF5 metadata cache
• Challenges peculiar to metadata caching in HDF5
  • Varying metadata entry sizes
    • Most entries are less than a few hundred bytes
    • Entries may be of almost any size
    • Encountered variations from a few bytes to megabytes
  • Varying working set sizes
    • < 1MB for most applications most of the time
    • ~8MB (an astrophysics simulation code)
  • Metadata cache competes with the application for core
    • Cache must be big enough to hold the working set
    • Should never be significantly bigger, lest it starve the user program for core
Metadata Cache in HDF5 1.6.3 and before
• Entries stored in a hash table
  • Fast
  • No provision for collisions: eviction on collision
• For a small hash table, performance is bad, since frequently accessed entries can hash to the same location
• Good performance requires a big hash table
  • Inefficient use of core
• Unsustainable as HDF5 file size and complexity increase
Metadata Cache in HDF5 1.6.4 and 1.6.5
• Entries are stored in a hash table as before
• Collisions handled by chaining
• Maintain an LRU list to select candidates for eviction
• Maintain a running sum of the sizes of the entries
• Entries are evicted when a predefined limit on this sum is reached
• Size of the metadata cache is bounded
  • Hard-coded 8MB
  • Doesn't work when the working set size is bigger
  • Larger variations in working set sizes are anticipated
  • Manual control over the cache size is needed!

Overview of HDF5 metadata cache

[Figure: hash table holding metadata entries 1-9, with an LRU list (9, 2, 4, 1, 8, 7, 6, 5, 3) threading through the entries to order them for eviction.]
Metadata Cache in HDF5 1.8.0
• New metadata cache APIs
• control cache size
• monitor actual cache size and current hit rate
• Adaptive cache resizing
• Enabled by default (min and max sizes are 1MB)
• Automatically detects the current working set size
• Sets max cache size to the working set size

Metadata Cache in HDF5 1.8.0
• First problem (easy)
  • Detect when the cache is too small and select a size increment (some overshoot is OK)
    • Check hit rate every n accesses
    • Increase by a defined increment if hit rate is below a threshold
    • Repeat until hit rate is above the threshold
    • User-defined via APIs
  • Works well in most cases
  • Doesn't work well when hit rate varies slowly with increasing cache size
    • Happens when the working set size is very large
Metadata Cache in HDF5 1.8.0
• Second problem (hard)
  • Detect when the cache is too big, and select a size decrement (must not overshoot)
    • Track how long it has been since each entry in the cache was accessed
    • Check hit rate every n cache accesses
    • If hit rate is above some threshold, evict entries with fewer than m accesses
    • If the cache's in-use size is significantly below the max cache size, reduce the cache size
  • Worked well for the Boeing time-segment library application
Metadata Cache in HDF5 1.8.0
• Hints:
  • If the working set size varies quickly, the correct size is never reached
    • Adaptive cache resize algorithms take time to react
    • Control the cache size directly from the application instead
  • You shouldn't see application memory growth due to the metadata cache
  • See the HDF5 documentation; collect statistics on metadata cache access
  • If performance is still poor, contact help@hdfgroup.org

Questions? Comments?

Thank you!

Acknowledgement

This report is based upon work supported in part by a Cooperative Agreement with NASA under
NASA NNG05GC60A. Any opinions, findings, and conclusions or recommendations expressed in
this material are those of the author(s) and do not necessarily reflect the views of the National Aeronautics
and Space Administration.

