Advancing Collaborative Connections for Earth System Science
ACCESS
STARE-PODS: A VERSATILE DATA STORE
LEVERAGING THE HDF VIRTUAL OBJECT
LAYER FOR COMPATIBILITY
Michael L Rilee1,2, Kwo-Sen Kuo1,3, James Gallagher4, James Frew5, Niklas Griessbaum5,
Edward Hartnett6, Robert Wolfe1, Gerd Heber7, Siri Jodha Khalsa8
1NASA Goddard Space Flight Center, Greenbelt, Maryland, USA
2Rilee Systems Technologies LLC, Derwood, Maryland, USA
3Bayesics LLC, Bowie, Maryland, USA
4OPeNDAP, Inc., Narragansett, Rhode Island, USA
5University of California, Santa Barbara, California, USA
6Ed Hartnett Consulting, Boulder, CO, USA
7The HDF Group, Champaign, IL, USA
8Coloradio Associates for Science and Technology LLC, Boulder, CO, USA
2020 ESIP Summer Meeting
2020 July 22
STARE
Proposal No. 17-ACCESS17-0039
Federal Award ID No. 80NSSC18M0118
SpatioTemporal Adaptive Resolution Encoding (STARE)
STARE-PODS for scalable Analysis Ready Data (ARD)
• Diverse low-level Earth Science data (ESD) requires special treatment to
co-align and combine for integrative analysis
• The SpatioTemporal Adaptive Resolution Encoding (STARE) provides a
unifying indexing scheme to combine geo-located ESD
• STARE-partitioned ESD enables a Parallel Optimized Data Store (PODS)
• HDF’s Virtual Object Layer (VOL) and Virtual Data Set (VDS) technologies
can provide familiar front-ends to data in STARE-PODS
• STARE-PODS unifies access to diverse data with minimal duplication
STARE-PODS is a proposal to NASA/ACCESS-19 currently under review.
STARE Basics
Existing native array & memory indexing impedes integration and processing.
STARE Basics
Two swath sections, A and B, overlap with the region of interest (ROI) outlined in black, with data on
separate computational nodes (numbered).
Parallel & distributed indexing based on native array partitioning
leads to extra data movement, breaking SCALABILITY.
Higher-res nadir Lower-res wing
Region of interest
STARE: Encoding locations in a recursive spatial quadtree
STARE Temporal indexing is similar but based on calendrical periods.
A tilted root polyhedron
0th level
First refinement level
1st level
STARE Spatial ‘Trixels’
Encoded as 64-bit integers
Figure: STARE spatial index values grouped into chunks and distributed across four worker nodes of a parallel store (e.g., SciDB).
Chunks: Chunk 1 ffc0-ffcc; Chunk 2 ffd0-ffdc; Chunk 3 ffe0-ffec; Chunk 4 fff0-fffc
Level 3: N3333, bits 1 1 11 11 11 11 -> 0xffc (right justified); ffc0000000000000 (left justified)
Level 4: N33330-N33333, ffc0000000000000 to fff0000000000000 (appended bits 00, 01, 10, 11)
Level 5: N333300-N333333, ffc0000000000000 to fffc000000000000 (appended bits 0000 to 1111)
STARE Spatial Hierarchical Triangular Mesh (HTM) Indexing: spherical triangles to integers via quadtree recursion
- aids comparison of different data sets; integer operations are much faster than geometric calculations
- bit pattern keeps co-located data together when “chunked”
STARE Temporal Hierarchical Calendrical Partitioning (HCP): similar but with branching based on calendar partitions
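The chunking idea on this slide can be sketched in a few lines of Python. This is a simplified illustration that follows only the left-justified bit patterns drawn on the slide; it is not the STARE library's encoding or API, and the names `CHUNK_BITS` and `chunk_key` are made up for the example.

```python
from collections import defaultdict

# Assumption: 64-bit, left-justified index values as drawn on the slide.
# The first 12 bits (three hex digits) hold N3333 plus one level-4 digit,
# so they identify the level-4 trixel that serves as the chunk.
CHUNK_BITS = 12


def chunk_key(index_value: int) -> int:
    """Return the leading CHUNK_BITS bits of a 64-bit index value."""
    return index_value >> (64 - CHUNK_BITS)


# The sixteen level-5 trixels N333300..N333333 from the slide:
# ten leading one-bits (N3333) followed by four bits for levels 4 and 5.
level5 = [(0b1111111111 << 54) | (suffix << 50) for suffix in range(16)]

# Group the trixels by chunk key, keeping the first four hex digits of each.
chunks = defaultdict(list)
for value in level5:
    chunks[chunk_key(value)].append(f"{value:016x}"[:4])

for key in sorted(chunks):
    print(f"{key:03x}: {chunks[key]}")
```

Because the key is just a bit prefix, assigning a value to a chunk is a single shift, and every trixel inside the same level-4 parent lands on the same worker.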
STARE vs Floating-Point Encoding
                                 Longitude    Latitude
Human readable                   +123.4°      60°
Single-precision floating-point  0x42f6cccd   0x42700000
STARE id*                        0x36ee9398f7210f34
The smallest triangle in the figure
is at quadfurcation level 6.
*The STARE id also includes resolution information. In this case, it indicates
quadfurcation level 20, i.e., ≲ 10 m.
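The values on this slide can be checked with a short Python sketch. The floating-point part uses only the standard `struct` module. The `resolution_level` helper assumes the resolution level sits in the five lowest bits of the id; that layout is an assumption for illustration, chosen because it is consistent with the id shown here, and it is not taken from the STARE API.

```python
import struct


def float32_hex(x: float) -> str:
    """Big-endian IEEE 754 single-precision bit pattern of x, as hex."""
    return "0x" + struct.pack(">f", x).hex()


def resolution_level(stare_id: int) -> int:
    """Resolution level, ASSUMING it is stored in the five lowest bits."""
    return stare_id & 0x1F


print(float32_hex(123.4), float32_hex(60.0))  # 0x42f6cccd 0x42700000
print(resolution_level(0x36ee9398f7210f34))   # 20
```

Two floats describe a point only; the single integer carries both the location and the scale at which it is meaningful.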
Supporting conventional lon-lat vs. STARE-based integration
STARE indexing adapts to the resolution of the data, which often varies.
Figure labels: NADIR, WING; MODIS; GOES pixel; MODIS pixel (nadir resolution); lon-lat search area for combining data; one “scan” with ten sensors.
2+1 Dimensions indexed with two integers
STARE SpatioTemporal Search/Index Volumes
Hurricane IRMA
Key West
“Sensor trajectory”
Cuba
STARE Volumes
(not to scale)
Parallelization
for Volume & Variety Scaling
STARE supporting a 16-way partitioning co-locating diverse data
GOES (red/brown) and MODIS (blue) granules integrated using STARE (visualized in equirectangular projection)
Using STARE to combine GOES and MODIS data
Can use key-value store to integrate
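A minimal sketch of the key-value idea, using a plain Python dictionary as the store. It assumes a 64-bit layout of 2 unused bits, 3 root bits, 2 bits per level, and 5 level bits; that layout is an assumption, although it reproduces the example ids that appear elsewhere in these slides. The `coarsen` helper is hypothetical, and the observation values are placeholders, not real GOES or MODIS data.

```python
from collections import defaultdict


def coarsen(stare_id: int, level: int) -> int:
    """Id of the trixel at `level` that contains stare_id (assumed layout)."""
    prefix_bits = 5 + 2 * level          # 2 unused + 3 root + 2 per level
    shift = 64 - prefix_bits
    return ((stare_id >> shift) << shift) | level


# (source, STARE id, placeholder value); the ids are taken from the slides.
observations = [
    ("GOES", 0x1049e6000000000a, 1.0),
    ("MODIS", 0x1049e66dab30632b, 2.0),
    ("MODIS", 0x104a000000000005, 3.0),
]

# Key every observation by its level-5 trixel, regardless of instrument.
store = defaultdict(list)
for source, sid, value in observations:
    store[coarsen(sid, 5)].append((source, value))

# Keys holding more than one source are candidates for integration.
co_located = {
    key: obs for key, obs in store.items()
    if len({src for src, _ in obs}) > 1
}
for key, obs in co_located.items():
    print(hex(key), obs)
```

The join between instruments reduces to a dictionary lookup on an integer key, with no geometric comparison between pixels.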
HDF
Virtual Object Layer and
Virtual Data Sets
Individual instrument fields of view
Scalable Homogenized Analysis Ready Data Store (STARE-SHARDS)
Actual data partitioned into
chunks for parallelism with
unified search and co-alignment.
HDF Virtual Data Set for
tailoring views into the data
Volume & variety scalability
Usability
HDF Virtual Data Set API
STARE-SHARDS
Storage Layer
Use a STARE ‘cover’ to
partition a granule
STARE partitioned swath data
looks like familiar HDF files
Using familiar HDF methods to access STARE-SHARDS
Data Source 1
Data Source 2
Data Source 3
HDF
Virtual
Granule
End users and legacy applications interact with STARE-SHARDS transparently.
Different sources and varieties of data with
different coverage, resolutions…
Data Source A
Data Source B
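The "virtual granule" idea can be illustrated with a conceptual stand-in in plain Python. HDF5's Virtual Data Set feature performs this kind of mapping inside the library, across datasets in separate files; the class below does not use the HDF5 API at all, and `VirtualGranule` is a hypothetical name. It only shows how a reader can index one logical sequence while the data stay in separate shards.

```python
from bisect import bisect_right
from itertools import accumulate


class VirtualGranule:
    """Read-only view presenting several shards as one sequence, no copying."""

    def __init__(self, shards):
        self._shards = list(shards)
        # _ends[k] is the global index just past the last element of shard k.
        self._ends = list(accumulate(len(s) for s in self._shards))

    def __len__(self):
        return self._ends[-1] if self._ends else 0

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        k = bisect_right(self._ends, i)            # shard holding element i
        start = self._ends[k - 1] if k else 0      # global index of its start
        return self._shards[k][i - start]


# Three shards of different sizes, standing in for partitioned swath data.
view = VirtualGranule([[10, 11], [20], [30, 31, 32]])
print(len(view), [view[i] for i in range(len(view))])
```

A legacy application written against the single-sequence interface does not need to know how many shards exist or where they are stored.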
The Proposed Architecture
STARE SHARDS to PODS to Integrative Analysis
Computing & Storage
Index & Organization
Query, Marshalling, “Transport”
Use & Tooling
The Architecture
STARE SHARDS to PODS to Integrative Analysis
STARE Location Service (SLS)
A ‘DNS’ for geolocated data
Conclusion: STARE-PODS for scalable integrative analysis
• STARE lays the foundation for scaling both variety and volume
• Supports lower-level (L1 & L2) data accessibility, combination, and scalability
• Features C++ and Python APIs, including a Pandas-like interface
• STARE Sidecar files limit costs of translation into STARE indices
• OPeNDAP integration is in progress
• Libraries, examples, tests, and cookbooks at https://github.com/SpatioTemporal
• STARE-PODS and STARE-SHARDS
• Organize diverse data for co-alignment and parallel/distributed storage and processing
• HDF Virtual Object Layer and Virtual Data Set support transparent legacy access
Acknowledgments
• STARE-PODS is a proposal to NASA/ACCESS-19 currently under review.
• This work is supported by NASA/ACCESS-17. Federal Award ID No. 80NSSC18M0118.
• Thanks to NASA/LaRC for interest and support.
Supplemental
NASA/ACCESS-17-39 STARE
80NSSC18M0118
M. Rilee
mike@rilee.net
Rilee Systems Technologies LLC
2019 October 21
Zooming in to the MODIS swath “bow-tie”
WING NADIR
Two “scans”
overlapping
STARE Indexing adapts to the data
0x1048000000000005
0x1049e66dab30632b
STARE Spatial IDs
Level 5, green trixels
A 0x1048000000000005
B 0x104a000000000005
C 0x104c000000000005
D 0x104e000000000005
NASA/ACCESS-17-39 STARE
80NSSC18M0118
M. Rilee
mike@rilee.net
Rilee Systems Technologies LLC
2019 October 24
ROI+GOES ROI+MODIS ROI+GOES+MODIS
ROI+GOES ROI+MODIS ROI+GOES+MODIS
A: 0x1049e6000000000a
B: 0x1049e6600000000b
C: 0x1049e66dab30632b
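The nesting of the three ids A, B, and C can be checked with integer operations alone. The sketch below assumes a 64-bit layout of 2 unused bits, 3 root bits, 2 bits per level, and the level in the 5 lowest bits; this is an assumption chosen because it is consistent with the ids listed here, and `level`, `prefix`, and `contains` are hypothetical helpers, not STARE API calls.

```python
def level(sid: int) -> int:
    """Resolution level, ASSUMED to be the five lowest bits."""
    return sid & 0x1F


def prefix(sid: int, lvl: int) -> int:
    """Leading bits that identify the level-`lvl` trixel (assumed layout)."""
    return sid >> (64 - (5 + 2 * lvl))


def contains(outer: int, inner: int) -> bool:
    """True if the trixel `outer` contains the trixel `inner`."""
    lvl = level(outer)
    return level(inner) >= lvl and prefix(inner, lvl) == prefix(outer, lvl)


A = 0x1049e6000000000a
B = 0x1049e6600000000b
C = 0x1049e66dab30632b

print(level(A), level(B), level(C))                    # 10 11 11
print(contains(A, B), contains(B, C), contains(B, A))  # True True False
```

Under this layout B and C name the same level-11 trixel; C simply carries additional location bits below its stated resolution.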
Instrument Field of View and Point Spread Function Modeling
Integration at the finest level via IFOV and PSF modeling
$s_i \approx S_j W_{ji} \oplus S_k W_{ki}$
$s = \boldsymbol{W} \boldsymbol{S}$
where S are the observation vectors (source), W are the PSF weights, and s is the “combined” signal (target).
Figure labels: i, j, k; “brown psf”, “blue psf”. Finer trixels not shown for clarity.
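A small numeric sketch of the matrix form s = W S, reading the ⊕ "combination" as an ordinary weighted sum. That reading is an assumption, since the slide does not define the operator. The weights and source values are invented for illustration, and `combine` is a hypothetical helper.

```python
def combine(weights, sources):
    """Compute s = W S: each target value is a weighted sum of the sources.

    weights[i][j] is the weight of source j in target i, matching the
    matrix form s = W S.
    """
    return [sum(w * x for w, x in zip(row, sources)) for row in weights]


# Illustrative values only: two source observations, two target trixels.
S = [10.0, 20.0]
W = [
    [0.75, 0.25],   # target 0 is mostly covered by source 0
    [0.50, 0.50],   # target 1 is covered equally by both sources
]

s = combine(W, S)
print(s)  # [12.5, 15.0]
```

In practice the weights would come from IFOV and PSF models evaluated on the fine trixels; here they are fixed by hand to keep the arithmetic visible.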

Editor's Notes

  • #5: PROJECT OVERVIEW. What we see in analysis is the native array indexing (l, m, i, j, k); the actual geometry is hidden, and so is the IFOV. The above is still not reality.
  • #6: L1+L2: If your scientific analysis requires combining level 1 and level 2 data, you have to arrange all of this yourself.
  • #10: Bottom part of a MODIS granule: zoom in to a scan line, explain the boxes on the WING, then go to nadir. Show the difference between the conventional and STARE ways. Why? Square-based vs. triangle-based integration. For infusion, people understand conventional lon-lat grid integration/comparison (L3). We hope the comparison will help convince people that the STARE way is better.
  • #13: chunkLevel=3 (nodes). TrixelGrid=0,1,2,3,4 (yellows getting darker). Swath, resolution level=5 (green). *** HTM is not new; the NEIGHBORHOOD part is new.
  • #14: Want to do this at scale, spatial and temporal. NOAA help desk, don’t use this.
  • #15: Want to do this at scale, spatial and temporal. NOAA help desk, don’t use this.
  • #27: 16 nodes. GOES northern hemisphere, MODIS, Hawaii – 2deg circle
  • #28: 20 detectors, adjacent scans, overlapping at the wings. No overlap at center.
  • #29: cover: 0x1049e66dab30632b 0 0x1010000000000005 0 0x1048000000000005 0 0x104a000000000005 0 0x104c000000000005 0 0x104e000000000005 0 0x1068000000000005 0 0x2c08000000000005 0 0x2c30000000000005 0 0x2c70000000000005