iRODS: Interoperability in Data Management
Leesa Brieger, RENCI-UNC
Mike Wan, DICE-UCSD
integrated Rule-Oriented Data System (iRODS)
• Developed by the Data Intensive Cyber Environments (DICE) group, UNC and UCSD
• Follow-on to SRB, the Storage Resource Broker from SDSC
  – decade-long development experience, community-driven
• Modular, extensible, customizable
• Open source (BSD license)
• Supported by the Renaissance Computing Institute (RENCI), UNC
  – a research unit of UNC Chapel Hill
  – state-supported
  – governed by the Triangle universities (UNC, NCSU, Duke)

HDF, HDF-EOS Workshop XV, April 17-19, 2012

iRODS
I. Data grid middleware
II. Data management infrastructure
III. Framework for implementing policy-driven data management

The extensibility and modularity of iRODS make it a customizable and resource-agnostic infrastructure.

iRODS as Data Grid
iRODS View of Distributed Data

User Client: user sees a single collection
– My Data: disk, filesystem, WOS storage unit...
– My Data: tape, database, filesystem...
– Partner's Data: remote disk, tape, filesystem...

• iRODS installs over heterogeneous data resources
• Users share & manage distributed data as a single collection
• iCAT metadata catalogue: DB that manages the logical-to-physical mappings (data objects, users, resources)
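The logical-to-physical mapping held by the iCAT can be pictured with a small sketch. This is only an illustration under stated assumptions: the `Catalog` and `Replica` classes, the resource names, and all paths are hypothetical, and a Python dictionary stands in for the catalogue database.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One physical copy of a data object (hypothetical fields)."""
    resource: str       # name of the storage resource holding the copy
    physical_path: str  # location in that resource's own namespace


@dataclass
class Catalog:
    """Toy stand-in for the iCAT: logical path -> list of replicas."""
    entries: dict = field(default_factory=dict)

    def register(self, logical_path, resource, physical_path):
        """Record one more physical copy of a logical data object."""
        self.entries.setdefault(logical_path, []).append(
            Replica(resource, physical_path))

    def resolve(self, logical_path):
        """Return every physical location recorded for a logical path."""
        return list(self.entries.get(logical_path, []))

    def list_collection(self, prefix):
        """List logical paths under a collection, wherever they are stored."""
        prefix = prefix.rstrip("/") + "/"
        return sorted(p for p in self.entries if p.startswith(prefix))


catalog = Catalog()
catalog.register("/zone/home/alice/run1.h5", "local_disk", "/data/vault/a1/run1.h5")
catalog.register("/zone/home/alice/run1.h5", "tape_archive", "/tape/0042/run1.h5")
catalog.register("/zone/home/alice/run2.nc", "partner_fs", "/export/shared/run2.nc")

listing = catalog.list_collection("/zone/home/alice")
replicas = catalog.resolve("/zone/home/alice/run1.h5")
print(listing)
print([r.resource for r in replicas])
```

The user-facing listing contains only logical paths; which disk, tape, or partner filesystem holds each copy is looked up separately, which is what lets distributed data appear as a single collection.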

Data Life Cycle
Usage evolves across stages of the data life cycle; management policy evolves along with it.

[Figure: data life cycle. Labels as extracted: Creation; Active Use; Publication & Sharing; Local Policy; Reference Collection; Service/Use Distribution; Discovery and Re-purposing; Archival Collection/Deletion; Retention/Preservation]

iRODS modularity and extensibility allow support for changing management requirements over the data life cycle.

iRODS Design Goals
• Data grid abstraction for data, users, resources
• Abstract out the data management
  – Separate data administration from storage administration
    • drivers allow iRODS to speak the local storage protocol
    • rule engine runs services and data operations
  – Policy-based data management
    • Data management: specialized modules of microservices (C code) and rules for running data-side services
    • Policy-based: event-triggered rule execution
  – Policy follows data around the grid
    • collection management independent of remote storage locations
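Event-triggered rule execution can be sketched as follows. This is a toy model and not the iRODS rule engine: the `RuleEngine` class, the `post_put` event name, and both rules are hypothetical, and plain Python callables stand in for rules.

```python
from collections import defaultdict


class RuleEngine:
    """Toy rule engine: rules are callables registered against event names."""

    def __init__(self):
        self._rules = defaultdict(list)

    def on(self, event):
        """Decorator that registers a rule for a named event."""
        def register(rule):
            self._rules[event].append(rule)
            return rule
        return register

    def fire(self, event, **context):
        """Run every rule registered for the event, in registration order."""
        return [rule(**context) for rule in self._rules[event]]


engine = RuleEngine()
audit_log = []


@engine.on("post_put")  # hypothetical event: a data object was just stored
def replicate_to_archive(path, resource):
    return f"replicate {path} from {resource} to archive"


@engine.on("post_put")
def record_audit(path, resource):
    audit_log.append((path, resource))
    return f"audit {path}"


actions = engine.fire("post_put", path="/zone/home/alice/run1.h5", resource="local_disk")
print(actions)
```

Because the policy is attached to the event rather than to a storage location, the same rules run wherever in the grid the data operation takes place.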

Interoperability
• Federation
  – Data grids with independent administration can federate and cross-communicate
• Clients
  – User-supplied or specialty client interfaces
  – Many specialized views of the collections
• iRODS core extensions for resource agnosticism and for fitting in with existing infrastructure
  – network transport (RBUDP)
  – authentication mechanisms (Kerberos, Shibboleth, GSI, etc.)
  – external databases (DataBase Resources, DBRs)
  – storage drivers (HPSS, WOS, EC2, etc.)
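One way to picture resource agnosticism through storage drivers is a common interface that every backend implements. The sketch below is an assumption-laden illustration: `StorageDriver`, `InMemoryDriver`, and `replicate` are hypothetical names, and an in-memory dictionary stands in for a real backend such as disk, tape, or an object store.

```python
from abc import ABC, abstractmethod


class StorageDriver(ABC):
    """Hypothetical driver interface: what the data layer needs from any store."""

    @abstractmethod
    def write(self, physical_path, data):
        ...

    @abstractmethod
    def read(self, physical_path):
        ...


class InMemoryDriver(StorageDriver):
    """Stand-in for a real backend; keeps objects in a dictionary."""

    def __init__(self):
        self._blobs = {}

    def write(self, physical_path, data):
        self._blobs[physical_path] = bytes(data)

    def read(self, physical_path):
        return self._blobs[physical_path]


def replicate(drivers, source, target, physical_path):
    """Copy one object between resources using only the driver interface."""
    data = drivers[source].read(physical_path)
    drivers[target].write(physical_path, data)
    return len(data)


drivers = {"disk": InMemoryDriver(), "archive": InMemoryDriver()}
drivers["disk"].write("/vault/run1.h5", b"example bytes")
copied = replicate(drivers, "disk", "archive", "/vault/run1.h5")
print(copied)
```

The `replicate` function never refers to a specific storage protocol, so adding a new kind of resource would mean adding a driver class rather than changing the data management code.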

Interoperability Through Microservices
iRODS provides a structure for implementing custom services
– Rules and microservice modules
– Can be user-defined
– Data-side services: format conversion, extraction, visualization, accounting & reporting, …
– Archival: replication, curation procedures, long-term archival procedures
– Access: access control policy
– Discoverability: metadata organization and management
– Symbolic links: integrate data from other collections into iRODS repository
  • microservice drivers
– Universal mass storage driver: plug in new protocols
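A rule that chains user-defined microservices into a data-side service might look like the sketch below. The slides state that iRODS microservices are C modules; here Python functions stand in for them, and the microservice names, the `state` dictionary, and `run_rule` are all hypothetical.

```python
import hashlib


def extract_header(state):
    """Pretend the first line of the object is its descriptive header."""
    state["header"] = state["content"].splitlines()[0]
    return state


def checksum(state):
    """Attach a SHA-256 digest of the object's content."""
    digest = hashlib.sha256(state["content"].encode("utf-8"))
    state["sha256"] = digest.hexdigest()
    return state


# Registry of available microservices, keyed by the name a rule uses.
MICROSERVICES = {
    "extract_header": extract_header,
    "checksum": checksum,
}


def run_rule(steps, state):
    """Run the named microservices in order, passing the state along."""
    for name in steps:
        state = MICROSERVICES[name](state)
    return state


result = run_rule(
    ["extract_header", "checksum"],
    {"content": "title: run1\nvalues: 1 2 3"},
)
print(result["header"], result["sha256"][:8])
```

New services are added by registering another function, and a rule is just an ordered list of registered names, which mirrors the idea of user-defined modules composed into workflows next to the data.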

Interoperability Through Integration with Existing Infrastructure
• Data management integrated with storage management: OSG, DDN
• Data management integrated with standard interfaces and services:
  – Fedora (librarians)
  – DataVerse (social scientists)
  – HDF5 (cosmologists)
  – NetCDF (NASA climate scientists, NSF earth scientists - hydrologists)

Integration with HDF5
Mike Wan and Peter Cao, 2008

Interactive access to HDF5 files on a remote iRODS server: browsing of metadata and data sharing with services
• Client access to data (subsets) and metadata in HDF5 files stored remotely; only the requested data and metadata are transferred, not full files
• iRODS microservices and APIs created to support HDF5 functionality on HDF5 objects
• islice – extracts a slice from a FLASH (cosmology) file stored on a remote iRODS server
• Remote viewing of HDF5 iRODS data
• HDFView
  – iRODS HDF5 Java objects were added to the HDF-Java products
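The benefit of extracting a slice on the server side can be illustrated with a small sketch. It does not show how islice is implemented: nested Python lists stand in for a three-dimensional HDF5 dataset, and `make_block`, `extract_slice`, and `count_values` are hypothetical helpers.

```python
def make_block(nx, ny, nz):
    """Build a small 3-D block as nested lists; each value encodes its indices."""
    return [[[100 * i + 10 * j + k for k in range(nz)]
             for j in range(ny)] for i in range(nx)]


def extract_slice(block, axis, index):
    """Return the 2-D plane of a 3-D block at a fixed index along one axis."""
    if axis == 0:
        return [row[:] for row in block[index]]
    if axis == 1:
        return [plane[index][:] for plane in block]
    if axis == 2:
        return [[row[index] for row in plane] for plane in block]
    raise ValueError("axis must be 0, 1, or 2")


def count_values(nested):
    """Count the scalar values held in a nested list."""
    if isinstance(nested, list):
        return sum(count_values(item) for item in nested)
    return 1


block = make_block(4, 3, 2)  # the "remote" dataset: 24 values
plane = extract_slice(block, axis=0, index=2)
print(plane)
print(count_values(block), count_values(plane))
```

In this toy case the client receives 6 values instead of 24; for large simulation files the gap between a slice and the full file is what makes remote subsetting worthwhile.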

Integration with NetCDF
Mike Wan, 2012
• Add NetCDF functionality to iRODS:
  – wrap NetCDF APIs into iRODS APIs and micro-services
• New iRODS APIs to wrap basic NetCDF APIs (libnetcdf) and a higher-level libcf subsetting function
  – Basic: nc_create, nc_open, nc_close
  – Inquiry functions: nc_inq_varid, nc_inq_dimid, nc_inq_dim, nc_inq_var
  – Subsetting functions: nc_get_vars_text, nc_get_vars_string, nc_get_vars_int, nc_get_vars_float, nc_get_vars_double, …
  – Higher-level subsetting function of libcf for CF data: nccf_get_vara
• New NetCDF-based iRODS micro-services
  – Allow NetCDF workflows to be performed data-side on the iRODS servers
  – One for each of the new APIs, for server-side operations
  – 5 micro-services for accessing data elements in the new data structures
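The nc_get_vars_* family selects a strided subset using a start index, a count, and a stride for each dimension. The sketch below mimics that selection in plain Python so the semantics are easy to see; it does not call libnetcdf or any iRODS API, and `strided_indices`, `get_vars`, and the `temperature` variable are hypothetical.

```python
from itertools import product


def strided_indices(start, count, stride):
    """Index tuples selected by (start, count, stride), one entry per dimension."""
    axes = [[s + n * step for n in range(c)]
            for s, c, step in zip(start, count, stride)]
    return list(product(*axes))


def get_vars(variable, start, count, stride):
    """Read the selected elements of a nested-list variable, last index fastest."""
    values = []
    for index in strided_indices(start, count, stride):
        element = variable
        for i in index:
            element = element[i]
        values.append(element)
    return values


# A 4 x 6 variable whose value encodes its (row, column) position.
temperature = [[10 * r + c for c in range(6)] for r in range(4)]

# Rows 1 and 3, columns 0, 2 and 4.
subset = get_vars(temperature, start=(1, 0), count=(2, 3), stride=(2, 2))
print(subset)
```

Running the selection data-side means only the `count[0] * count[1]` selected values need to travel to the client, rather than the whole variable.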

iRODS for Interoperability – NASA (NCCS)
• Separating metadata from the data object (from NetCDF files into the iCAT)
• Using an iRODS FUSE client to expose data to the ESG Data Node
In support of discovery, long-term curation, and reuse/repurposing of the data
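Moving descriptive metadata out of the files and into a catalogue is what makes discovery possible without opening each file. iRODS stores user metadata as attribute-value-unit (AVU) triples; the sketch below builds such triples from a plain dictionary standing in for a NetCDF file's global attributes. The attribute names, paths, and the `harvest_avus` and `find_objects` helpers are hypothetical.

```python
def harvest_avus(global_attributes):
    """Turn a flat attribute mapping into (attribute, value, unit) triples."""
    avus = []
    for name, value in sorted(global_attributes.items()):
        if isinstance(value, tuple):  # a (value, unit) pair
            avus.append((name, str(value[0]), value[1]))
        else:
            avus.append((name, str(value), ""))
    return avus


def find_objects(catalog, attribute, value):
    """Return logical paths whose catalogued metadata matches attribute == value."""
    return sorted(
        path for path, avus in catalog.items()
        if any(a == attribute and v == value for a, v, _unit in avus)
    )


# A dictionary stands in for the metadata catalogue in this sketch.
catalog = {
    "/zone/nccs/run1.nc": harvest_avus(
        {"institution": "Example Lab", "resolution": (0.5, "degrees")}),
    "/zone/nccs/run2.nc": harvest_avus({"institution": "Other Lab"}),
}

hits = find_objects(catalog, "institution", "Example Lab")
print(hits)
```

The query touches only the catalogue, so data objects on tape or at a remote site can be discovered before any of their bytes are read.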

E-iRODS from RENCI – the Red Hat Model
• Initial release based on iRODS 3.0
  – Tracks community code, with a delay
  – Download beta release binaries at http://e-irods.com
• Hardened binary release of iRODS
  – Passes continuous integration with back-ported bug fixes from community trunk
  – Packaging and signing: initially RPM and DEB
• Certification
• Documentation
• Subscription Support Contracts – leesa@renci.org for information


Editor's Notes

  • #7: This is the first mention of microservices… defined in the next slide.