Globus: Enabling the Open Storage Network
GlobusWorld 2019
Brian Mohr
Made possible by grants from the NSF and the Schmidt Foundation
NSF grants: 1747552, 1747493, 1747507, 1747490, 1747483, 1836357
The Open Storage Network: Mission Statement
The mission of OSN is to provide a low-cost, high-quality, sustainable, distributed storage cloud for the NSF research community.
Research Cyberinfrastructure Today
Computation: Shared Resource (XSEDE, PRAC) | Standardized | NSF-Funded
Networking: 200+ universities with 40/100Gb Connectivity | Standardized | NSF-Funded
Storage: Largely Balkanized | No Standards Requirement | No CI Funding
The Open Storage Network: Cyberinfrastructure Goals
• Leverage existing NSF-funded high-speed network connectivity
• Establish a standard national petascale storage infrastructure
• Promote sharing of publicly-funded research datasets
• Facilitate interdisciplinary research (searchable metadata)
The third pillar…
OSN Federation Design Objectives
• Scalable: uniform hardware architecture across all OSN sites
• Efficient: centralized remote provisioning and monitoring
• Manageable: distributed lights-out data center
• Resilient: offline site ≠ loss of access to data (cross-site replication)
• Sustainable: minimal site-local staff overhead
Keep it simple…
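The "Resilient" objective can be illustrated with a minimal sketch: if a dataset has replicas at more than one site, a reader can fall back to any site that is still online. The site names, the catalog layout, and the `pick_replica` helper are hypothetical and are not taken from the OSN design.

```python
from typing import Dict, List, Optional, Set


def pick_replica(replicas: List[str], online: Set[str]) -> Optional[str]:
    """Return the first site that holds a replica and is currently online."""
    for site in replicas:
        if site in online:
            return site
    return None  # no replica is reachable


# Hypothetical catalog: dataset name -> sites holding a copy.
catalog: Dict[str, List[str]] = {
    "dataset-a": ["site-1", "site-2"],  # replicated across two sites
    "dataset-b": ["site-1"],            # single copy only
}

online_sites = {"site-2", "site-3"}  # site-1 is offline in this example

for name, sites in catalog.items():
    print(name, "->", pick_replica(sites, online_sites))
# dataset-a -> site-2
# dataset-b -> None
```

In this sketch the replicated dataset stays reachable while the single-copy dataset does not, which is the case cross-site replication is meant to avoid.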
OSN Pod Design Objective: “Scalable Unit”
• Capacity: 1 petabyte usable object storage
• Performance: 40Gb sequential throughput
• Ease of Procurement: an OSN-optimized vendor SKU
• Economical: $140,000 hardware budget
• Ease of Adoption: plug-n-play appliance model
Keep it simple…
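A back-of-the-envelope check of the pod objectives above, as a rough sketch. It assumes decimal units (1 PB = 1000 TB) and reads "40Gb" as 40 gigabits per second sustained; neither assumption is stated on the slide.

```python
USABLE_PB = 1                  # usable capacity per pod (from the slide)
HARDWARE_BUDGET_USD = 140_000  # hardware budget per pod (from the slide)
THROUGHPUT_GBPS = 40           # assumption: 40 gigabits per second

TB_PER_PB = 1000               # assumption: decimal units

usable_tb = USABLE_PB * TB_PER_PB
cost_per_usable_tb = HARDWARE_BUDGET_USD / usable_tb

# Time to stream the full usable capacity once at the target throughput.
usable_bits = usable_tb * 1e12 * 8
seconds_to_stream = usable_bits / (THROUGHPUT_GBPS * 1e9)
hours_to_stream = seconds_to_stream / 3600

print(f"Hardware cost per usable TB: ${cost_per_usable_tb:.0f}")   # $140
print(f"Hours to stream 1 PB at 40 Gb/s: {hours_to_stream:.1f}")   # 55.6
```

Under these assumptions the hardware budget works out to about $140 per usable terabyte, and reading or writing a full pod once takes a little over two days.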
OSN Scalable Unit – Technical Spec
8 Server Nodes
Five 4U Data Nodes | Three 1U Monitor/DTN Nodes
1.44 PB Raw Storage
8 TB HDDs | 7200 RPM 12Gb SAS | 36 Disks per Data Node
High-Speed Network: 100GbE ToR Switch
40 or 100Gb I2 Uplink | 50GbE Cluster Interconnect
Remote Management: 1GbE, KVM ToR Switches
OOB IPv4 | IPMI | Console | Switched Outlets
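The raw storage figure follows from the node and disk counts in the spec. A short sketch of the arithmetic, assuming decimal units (1 PB = 1000 TB); the usable fraction at the end compares against the 1 PB usable objective stated for the pod, and the data protection scheme that accounts for the difference is not given in the slides.

```python
DATA_NODES = 5        # 4U data nodes per pod
DISKS_PER_NODE = 36   # disks per data node
DISK_TB = 8           # capacity per disk in TB

total_disks = DATA_NODES * DISKS_PER_NODE
raw_tb = total_disks * DISK_TB
raw_pb = raw_tb / 1000  # assumption: decimal units

USABLE_PB = 1  # design objective for the pod
usable_fraction = USABLE_PB / raw_pb

print(total_disks)               # 180
print(raw_tb)                    # 1440
print(raw_pb)                    # 1.44
print(f"{usable_fraction:.0%}")  # 69%
```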
OSN Pod Physical Site Requirements
• Floor space for one 30-inch wide rack
• A/B power (current configuration: 6kW max)
• Dual fiber uplink to 40/100Gb network infrastructure
• Out-of-band network access for remote “pod” provisioning
• Allocate IP address blocks (high-speed: /27; OOB: 3x IPv4)
• That’s it!
Minimize barriers to adoption…
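For sites sizing the high-speed address allocation, a /27 block contains 32 addresses, 30 of which are assignable to hosts. The sketch below uses Python's standard `ipaddress` module; the prefix `192.0.2.0/27` is a placeholder from the documentation range, not an OSN address.

```python
import ipaddress

# Placeholder block from the documentation range (RFC 5737), not a real OSN allocation.
block = ipaddress.ip_network("192.0.2.0/27")

hosts = list(block.hosts())  # excludes the network and broadcast addresses

print(block.num_addresses)   # 32
print(len(hosts))            # 30
print(hosts[0], hosts[-1])   # 192.0.2.1 192.0.2.30
print(block.netmask)         # 255.255.255.224
```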
OSN Software Stack: Globus
• Authentication infrastructure
• GridFTP file transfer software
• S3 interface to Ceph object storage
Leverage existing Globus features…
OSN Software Stack: Globus Extensions
• Dataset Ownership
• Dataset Access Authorization
• Dataset Locality
• Dataset Replication
• Dataset Aging
• Dataset Tags (Searchable Catalog!)
Develop OSN-specific metadata/policy engine…
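One way to picture the dataset attributes listed above is as fields of a single catalog record. The following is an illustrative sketch only: the `DatasetRecord` class, its field names, and the `search_by_tag` helper are hypothetical and do not describe the actual OSN metadata/policy engine or any Globus API.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Set


@dataclass
class DatasetRecord:
    """Hypothetical catalog entry; one field per extension on the slide."""
    name: str
    owner: str                 # Dataset Ownership
    readers: Set[str]          # Dataset Access Authorization
    home_site: str             # Dataset Locality
    replica_sites: List[str]   # Dataset Replication
    expires: date              # Dataset Aging
    tags: Set[str] = field(default_factory=set)  # Dataset Tags

    def can_read(self, user: str) -> bool:
        """The owner and any listed reader may read the dataset."""
        return user == self.owner or user in self.readers

    def is_expired(self, today: date) -> bool:
        """A dataset ages out once its expiry date has passed."""
        return today > self.expires


def search_by_tag(catalog: List[DatasetRecord], tag: str) -> List[str]:
    """Return the names of all datasets carrying the given tag."""
    return [record.name for record in catalog if tag in record.tags]


# Made-up example records.
catalog = [
    DatasetRecord("ocean-temps", "alice", {"bob"}, "site-1", ["site-2"],
                  date(2030, 1, 1), {"climate", "ocean"}),
    DatasetRecord("genome-set", "carol", set(), "site-2", [],
                  date(2020, 1, 1), {"biology"}),
]

print(search_by_tag(catalog, "climate"))          # ['ocean-temps']
print(catalog[0].can_read("bob"))                 # True
print(catalog[1].is_expired(date(2024, 6, 1)))    # True
```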
OSN Pod/Appliance Operations
Monitoring
Globus Engine
Provisioning
OSN Prototype Deployment Sites
Northeast Storage Exchange
San Diego Supercomputer Center
University of Illinois
Renaissance Computing Institute
Northwestern University
Johns Hopkins University
Funded by NSF
Funded by Schmidt Foundation
OSN Scaled-out Deployment (Projected)
Assumption: one or more OSN Pods per 40/100Gb NSF Campus Cyberinfrastructure Site.
Questions?
openstoragenetwork.org

