Ian Foster
The University of Chicago
Argonne National Laboratory
Talk at 1st National Research Platform Workshop
Aug 7-8, 2017
Bozeman, Montana
Software infrastructure for a National Research Platform
globus.org
Congratulations, you have a Science DMZ!
[Figure: Science DMZ architecture with 10GE links. Components shown: border router, WAN, Science DMZ switch/router, firewall, enterprise network, perfSONAR hosts, DTNs, API DTNs (data access governed by portal), filesystem (data store), and portal server (web server, search, database, authentication). Paths shown: data transfer path; portal query/browse path. Credit: Eli Dart]
What you really want is a science accelerator
Software
Infrastructure
Software transmutes silicon into discoveries
High-speed data ingest
Secure data sharing
Data publication
Smart instruments
Ultra-scale collaboration
A strong software infrastructure is…
Accessible — trivially usable by all
Ubiquitous — it goes where you need it
Performant — fast end to end
Secure — all resources are protected
Reliable — you can count on it
Programmable — you can build on it
Manageable — it supports sys admins, too
Sustainable — it will be there tomorrow
Accessible means trivially usable by all
1. Researcher initiates transfer request, or the transfer is requested automatically by a script or science gateway
2. Globus transfers files reliably, securely
3. Researcher selects files to share, selects user or group, and sets access permissions
4. Globus controls access to shared files on existing storage; no need to move files to cloud storage!
5. Collaborator logs in to Globus and accesses shared files; no local account required; download via Globus
6. Researcher assembles dataset; describes it with Dublin Core & domain-specific metadata
7. Curator reviews and approves; dataset published on campus or other system
8. Peers, collaborators search and discover datasets; transfer and share using Globus
Systems shown: Instrument, Compute Facility, Personal Computer, Publication Repository
Stages: Transfer, Share, Publish, Discover
• Access via web browser, command line, or REST API
• Use any storage
• Use existing identity
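The sharing steps on this slide (a researcher picks files, picks a user or group, and sets access permissions, after which access to the existing storage is controlled) can be pictured with a small sketch. This is an illustrative model only, not the Globus data model or API: the `AccessRule` class, the `is_allowed` function, the "r"/"rw" permission strings, and the example identities and paths are all assumptions made for this example.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class AccessRule:
    """One sharing rule: who may do what under which folder (hypothetical model)."""
    principal: str      # a user or group identifier
    path: str           # folder on the existing storage being shared
    permissions: str    # "r" for read-only, "rw" for read-write


def is_allowed(rules, principal, path, write=False):
    """Return True if any rule grants `principal` the requested access to `path`."""
    target = PurePosixPath(path)
    for rule in rules:
        if rule.principal != principal:
            continue
        shared_root = PurePosixPath(rule.path)
        # A rule covers the shared folder itself and everything beneath it.
        covered = target == shared_root or shared_root in target.parents
        if covered and (not write or "w" in rule.permissions):
            return True
    return False


# Hypothetical collaborator with read-only access to one shared folder.
rules = [AccessRule("collaborator@example.org", "/project/shared", "r")]
print(is_allowed(rules, "collaborator@example.org", "/project/shared/run1.h5"))
print(is_allowed(rules, "collaborator@example.org", "/project/private/notes.txt"))
```

The point of the sketch is that the files never move: only the rule list changes, and each request is checked against it.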
Ubiquitous means it goes where you need it
10,000+ active endpoints
Native packages
Installs in seconds
Linux, Windows, MacOS
GPFS, Lustre, OrangeFS, …
AWS S3, Ceph RadosGW
Spectra Logic BlackPearl
Google Drive, HPSS
Amazon Glacier
Performant means fast end to end
• Specialized protocols
• Auto-configuration
• Parallel DTNs
• File system optimizations
• Tape system optimizations
1 PB in 1.002 days, Argonne to NCSA
R. Kettimuthu et al.
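To put the "1 PB in 1.002 days" figure in network terms, the average end-to-end rate can be worked out directly. The sketch below assumes a decimal petabyte (10^15 bytes); the slide does not say which definition was used, and the function name is made up for this example.

```python
PETABYTE_BYTES = 10**15          # assumption: decimal petabyte
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate_gbps(num_bytes, days):
    """Average end-to-end rate in gigabits per second."""
    bits = num_bytes * 8
    seconds = days * SECONDS_PER_DAY
    return bits / seconds / 1e9


rate = average_rate_gbps(PETABYTE_BYTES, 1.002)
print(f"{rate:.1f} Gb/s")  # about 92.4 Gb/s under the decimal-petabyte assumption
```

Under that assumption the transfer averaged a little over 92 Gb/s sustained for a full day.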
Secure means all resources are protected
Globus service is itself highly secure
• Best-practice cloud security
• Third-party security reviews
Globus platform ensures your services are secure
• Accept credentials from 300+ identity providers
• Control proxy credential lifetimes
• Industry-standard OAuth 2.0 and OIDC protocols
• Data encryption
• Build secure services with controlled delegation
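As a rough illustration of what relying on industry-standard OAuth 2.0 means for a portal developer, the sketch below builds a generic authorization-code request URL as described in RFC 6749, section 4.1.1. It is not taken from the talk: the authorization URL, client ID, redirect URI, and scopes are placeholders, and a real service publishes its own values.

```python
import secrets
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical values for illustration only; a real deployment supplies its own.
AUTHORIZE_URL = "https://auth.example.org/oauth2/authorize"
CLIENT_ID = "my-portal-client-id"
REDIRECT_URI = "https://portal.example.org/callback"


def build_authorization_url(scopes, state):
    """Build an OAuth 2.0 authorization-code request URL (RFC 6749, section 4.1.1)."""
    params = {
        "response_type": "code",
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": " ".join(scopes),   # scopes are space-delimited in OAuth 2.0
        "state": state,
    }
    return f"{AUTHORIZE_URL}?{urlencode(params)}"


state = secrets.token_urlsafe(16)  # random value checked on return to resist CSRF
url = build_authorization_url(["openid", "email"], state)
print(url)
print(parse_qs(urlparse(url).query)["scope"])
```

The portal redirects the user's browser to this URL; the user signs in with their own identity provider, so the portal never handles the password.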
Reliable means you can count on it
Each transfer is monitored, retried upon failure
Protocols support restart
Failover across multiple DTNs
Service is cloud hosted, with replication, dynamic failover, monitoring
99.5% uptime over past three years
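The "monitored, retried upon failure" behaviour can be pictured with a generic retry loop. This is a minimal sketch of the general technique (retry with exponential backoff), not the service's actual implementation. The function names, the `ConnectionError` failure type, and the attempt limit are assumptions for illustration, and the flaky transfer is simulated.

```python
import time


def run_with_retries(attempt_transfer, max_attempts=5, base_delay=0.0):
    """Call `attempt_transfer` until it succeeds or attempts run out.

    Returns the number of attempts used; re-raises the last error on failure.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            attempt_transfer()
            return attempt
        except ConnectionError:
            if attempt == max_attempts:
                raise
            # Exponential backoff: wait longer after each consecutive failure.
            time.sleep(base_delay * 2 ** (attempt - 1))


def make_flaky_transfer(failures_before_success):
    """Build a stand-in transfer that fails a fixed number of times, then succeeds."""
    remaining = {"failures": failures_before_success}

    def attempt_transfer():
        if remaining["failures"] > 0:
            remaining["failures"] -= 1
            raise ConnectionError("simulated network fault")

    return attempt_transfer


attempts_used = run_with_retries(make_flaky_transfer(2))
print(attempts_used)  # 3: two simulated faults, then success
```

The default `base_delay` of zero keeps the demonstration instant; a real caller would pass a positive delay.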
Programmable means you can build on it
Globus Auth API
…
Globus Transfer API
Globus Connect
Data Publication & Discovery
File Sharing
File Transfer & Replication
Use institutional ID systems in external web applications
Integrate file transfer and sharing capabilities into scientific web apps, portals, gateways, etc.
GET /endpoint/go%23ep1
PUT /endpoint/vas#my_endpt
200 OK
X-Transfer-API-Version: 0.10
Content-Type: application/json
…
Web
Command line
REST API
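The GET request on this slide writes the endpoint name `go#ep1` as `go%23ep1`. A raw `#` in a URL begins the fragment, which clients do not send to the server, so a name containing `#` has to be percent-encoded when it is placed in the path. The PUT line shows the name unencoded, presumably for readability. The sketch below shows the encoding with the Python standard library; the helper name `endpoint_path` is made up for this example, and only the path portion shown on the slide is built.

```python
from urllib.parse import quote, unquote


def endpoint_path(endpoint_name):
    """Return the resource path for an endpoint, percent-encoding reserved characters."""
    # safe="" also encodes "/" so the name always stays a single path segment.
    return "/endpoint/" + quote(endpoint_name, safe="")


print(endpoint_path("go#ep1"))          # /endpoint/go%23ep1
print(endpoint_path("vas#my_endpt"))    # /endpoint/vas%23my_endpt
print(unquote("go%23ep1"))              # go#ep1
```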
Programmable means you can build on it
Python SDK
Jupyter Notebooks
Programmable means automation
Recurring transfers with sync option
Copy /ingest
Daily @ 3:30am
Data distribution
.../my_share
--/cohort045
--/cohort096
--/cohort127
Shared Endpoint
Staging area cleanup
Shared Endpoint
1. Check whether the transfer succeeded
2. Delete data from staging area
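The two-step staging cleanup above can be sketched as a guard followed by a delete. This version works on a local directory with the standard library only, so it runs anywhere. The function name, the `"SUCCEEDED"` and `"FAILED"` status labels, and the file name are assumptions for illustration; a real script would obtain the status from the transfer service and delete on the remote endpoint.

```python
import shutil
import tempfile
from pathlib import Path


def cleanup_staging(task_status, staging_dir):
    """Delete the staging area only when the transfer is reported as successful.

    Returns True if the directory was removed, False if it was left in place.
    """
    # Step 1: check whether the transfer succeeded ("SUCCEEDED" is an assumed label).
    if task_status != "SUCCEEDED":
        return False
    # Step 2: delete the data from the staging area.
    shutil.rmtree(staging_dir)
    return True


# Demonstration against a throwaway local directory.
staging = Path(tempfile.mkdtemp(prefix="staging_"))
(staging / "cohort045.dat").write_text("example data")

print(cleanup_staging("FAILED", staging), staging.exists())
print(cleanup_staging("SUCCEEDED", staging), staging.exists())
```

Checking the status first is the important design choice: data is removed from staging only after the copy is known to be complete.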
Programmable means automation
ARM Climate Research Facility
Manageable means it helps sys admins, too
Low admin costs
Priority support
Usage reporting
Management console
Alternative identity provider
Training materials
Constant innovation
Sustainable means it will be there tomorrow
Operated by professionals at the University of Chicago
Supported by subscriptions from >65 institutions
Raising the bar on research software quality
5 major services
13 national labs use Globus
290 PB transferred
10,000 active endpoints
50 billion files processed
70,000 registered users
99.5% uptime
65+ institutional subscribers
1 PB largest single transfer to date
3 months longest continuously managed transfer
300+ federated campus identities
12,000 active users/year
More: users, time, data, storage
Better: collaboration, ideas, innovation
Easier: authentication, transfer, sharing, publication, administration
Software infrastructure for a national research platform
Get more data to more people faster
Software transmutes hardware into discoveries
Thank you to our sponsors!
U.S. Department of Energy
Our subscribers

