ESIP-0722 JG
Hyrax: Serving Data from S3
Summer ESIP 2022
This work was supported by NASA/GSFC under Raytheon Technologies contract number 80GSFC21CA001.
This document does not contain technology or Technical Data controlled under either the U.S. International Traffic
in Arms Regulations or the U.S. Export Administration Regulations.
James Gallagher
Software Engineer/NASA EED-3 contractor
jgallagher@opendap.org

What you can do
• Serve existing files that are stored on S3*
• There’s no need to alter the files – just copy them to S3
• Works for HDF5** and netCDF4***
* Simple Storage Service
** Hierarchical Data Format, version 5
*** Network Common Data Form, version 4

What you need to know
• Where the server will run
• Where you will store the ancillary metadata files the server needs
  – These files are not the big data files
  – They provide a road map to the interior structure of those data files
• The URLs* of the data files you want to serve
* Uniform Resource Locators

How to make the ancillary files
• Use the command line tool (get_dmrpp) to build a DMR++* file
• Use get_dmrpp’s default setting and see if the way it represents your data is acceptable (try it and see)
• Customize the configuration if needed
• Write a script to process your collection of files
* Dataset Metadata Response Plus Plus
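The last step, scripting over a collection, might be sketched as follows. This is only an illustration: the two-column manifest format and the function name build_dmrpp_commands are hypothetical, and the -b, -u and -o options are the ones used in the recipe later in this deck. The sketch prints the commands for review and does not run them.

```shell
#!/bin/sh
# Sketch: print one get_dmrpp invocation per data file in a collection.
# Input (stdin): one line per file, "<s3-uri> <data-url>".
# This manifest format is a hypothetical choice made for the example.

build_dmrpp_commands() {
    while read -r s3_uri data_url; do
        # Skip blank lines and comment lines in the manifest.
        case "$s3_uri" in
            ''|'#'*) continue ;;
        esac
        # Name the DMR++ file after the object: <object name>.dmrpp
        out="${s3_uri##*/}.dmrpp"
        printf 'get_dmrpp -b . -u %s -o %s %s\n' "$data_url" "$out" "$s3_uri"
    done
}

# Example: print the commands for two placeholder files.
build_dmrpp_commands <<'EOF'
# s3-uri data-url
s3://bucket/a.h5 https://url.for.your/data/a.h5
s3://bucket/b.nc https://url.for.your/data/b.nc
EOF
```

Once the printed commands look right, they could be saved to a file in the directory shared with the container and run there with bash.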

To run get_dmrpp
• Get the Hyrax server Docker container
• Start the container – it contains both the Hyrax data server and the get_dmrpp command
• Run get_dmrpp inside the container – a recipe follows

The get_dmrpp Recipe
Go to the directory where you want to store the ancillary information
Start the Docker container that has the Hyrax server and the get_dmrpp tool
docker run --hostname hyrax \
  --publish 8080:8080 \
  --volume "$(pwd)":/usr/share/hyrax \
  --env AWS_ACCESS_KEY_ID \
  --env AWS_SECRET_ACCESS_KEY \
  --name hyrax \
  opendap/hyrax:snapshot
If your computer uses the new Apple Silicon processor (M1), add this option before the image name:
--platform linux/amd64
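Two things in this recipe depend on the host: the AWS credentials passed with --env, and the Apple Silicon platform option. A small preflight check could look like the sketch below. The function names platform_flag and aws_env_ready are hypothetical, and the sketch assumes that `uname -m` reports arm64 on Apple Silicon, which is typical for a native (non-Rosetta) shell.

```shell
#!/bin/sh
# Sketch: preflight checks before "docker run".

# Print the extra platform option for an Apple Silicon host, or nothing
# for other architectures. The argument is the output of "uname -m".
# Assumption: a native Apple Silicon shell reports "arm64".
platform_flag() {
    case "$1" in
        arm64) printf '%s\n' '--platform linux/amd64' ;;
        *) ;;
    esac
}

# Return 0 only if both AWS credential variables are set and non-empty.
# "--env NAME" without a value copies NAME from the calling shell into
# the container, so the variables must exist before "docker run".
aws_env_ready() {
    [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]
}

if ! aws_env_ready; then
    echo "warning: AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are not both set" >&2
fi
echo "extra docker run options: $(platform_flag "$(uname -m)")"
```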

get_dmrpp, continued
Run any command in the Docker container
docker exec --interactive --tty hyrax /bin/bash
Run get_dmrpp in the Docker container
docker exec --interactive --tty hyrax \
  /bin/bash -c "cd /usr/share/hyrax; get_dmrpp …"
where the get_dmrpp command has the form
get_dmrpp \
  -b . \
  -u https://url.for.your/data/file.nc \
  -o file.nc.dmrpp \
  s3://bucket/objectname

get_dmrpp, example
Run any command in the Docker container
docker exec --interactive --tty hyrax /bin/bash
Run get_dmrpp in the Docker container
get_dmrpp \
  -b . \
  -u https://cloudydap.s3.amazonaws.com/samples/1A.GPM.GMI.COUNT2014v3.20160105-S230545-E003816.010538.V03B.h5 \
  -o 1A.GPM.GMI.COUNT2014v3.20160105-S230545-E003816.010538.V03B.h5.dmrpp \
  s3://cloudydap/samples/1A.GPM.GMI.COUNT2014v3.20160105-S230545-E003816.010538.V03B.h5
This will build a DMR++ document for the named granule in the cloudydap S3 bucket. The DMR++ will use the URL https://cloudydap.s3... to read data values.
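In this example the -u URL and the -o file name both follow from the S3 URI, so they can be derived rather than typed. The sketch below shows one way to do that. The function names s3_to_https and dmrpp_name are hypothetical, and the sketch assumes the bucket is reachable at https://<bucket>.s3.amazonaws.com/<key>, the URL style used in the example; buckets served from other endpoints would need a different pattern.

```shell
#!/bin/sh
# Sketch: derive the -u and -o arguments from an S3 URI.
# Assumes the URL style https://<bucket>.s3.amazonaws.com/<key>.

s3_to_https() {
    path="${1#s3://}"        # strip the scheme, leaving bucket/key
    bucket="${path%%/*}"     # text before the first slash
    key="${path#*/}"         # text after the first slash
    printf 'https://%s.s3.amazonaws.com/%s\n' "$bucket" "$key"
}

dmrpp_name() {
    printf '%s.dmrpp\n' "${1##*/}"   # object name plus .dmrpp
}

granule='s3://cloudydap/samples/1A.GPM.GMI.COUNT2014v3.20160105-S230545-E003816.010538.V03B.h5'
s3_to_https "$granule"   # value for -u
dmrpp_name "$granule"    # value for -o
```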

How to customize the DMR++
• Look at the default output and decide how it should change
• Look at the HDF5 Handler documentation for Hyrax – the same options can be used with get_dmrpp
• Put those optional ‘keys’ in a file and pass the name of that file to get_dmrpp using the command’s -s option
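As a rough illustration of the last step, the sketch below writes one handler key to a file and prints a get_dmrpp command that passes the file with -s. The function name write_site_conf and the file name site.conf are hypothetical. H5.EnableCF is a key from the Hyrax HDF5 handler, but the value shown is only an example; choose keys and values from the handler documentation.

```shell
#!/bin/sh
# Sketch: put optional handler keys in a file for get_dmrpp's -s option.

# Write the keys to the file named by the first argument.
# H5.EnableCF=true is an illustrative setting, not a recommendation.
write_site_conf() {
    cat > "$1" <<'EOF'
H5.EnableCF=true
EOF
}

# Demo in a scratch directory; site.conf is a hypothetical file name.
conf_dir="$(mktemp -d)"
write_site_conf "$conf_dir/site.conf"

# Print, rather than run, a command that passes the file to get_dmrpp.
echo "get_dmrpp -s site.conf -b . -u https://url.for.your/data/file.nc -o file.nc.dmrpp s3://bucket/objectname"
```

In practice the file would be written to the directory mounted at /usr/share/hyrax so that get_dmrpp can read it inside the container.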

More information
• How to build & deploy DMR++ files for Hyrax
• Customization options for HDF5
• If you’d like more information or a demonstration, see me after the session
Some useful Docker commands
docker rm -f $(docker ps -aq) # remove all containers
docker rmi -f $(docker images -q) # remove all images
