Advanced Globus System Administration
Vas Vasiliadis
vas@uchicago.edu
8 July 2022
Agenda
• Comparing GCS v4 and v5
• Migrating from GCS v4 to v5
• Multi-DTN deployments
• Supporting non-POSIX storage systems
• Optimizing (or not!) file transfer performance
• Modifying the data channel interface
Adding DTNs to your endpoint
Recall: GCSv5 deployment key
Adding a node requires just two commands
$ globus-connect-server node setup $CLIENT_ID --deployment-key THE_KEY
$ systemctl restart apache2
Copy the deployment key from the first node (DTN) to every other node
Node setup pulls configuration from Globus service
Check your DTN cluster status:
globus-connect-server node list
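The join procedure above can be scripted per node. The sketch below only assembles and prints the commands shown on this slide; it does not run them. The client ID and key path are placeholder values, and the argument order simply mirrors the slide, which may differ in other GCS releases.

```python
import shlex


def node_join_commands(client_id: str, deployment_key_path: str) -> list[list[str]]:
    """Return the commands from the slide, in order, as argv lists."""
    return [
        # Join this host to the endpoint using the shared deployment key.
        ["globus-connect-server", "node", "setup", client_id,
         "--deployment-key", deployment_key_path],
        # Restart the web server, as shown on the slide.
        ["systemctl", "restart", "apache2"],
        # Confirm the node now appears in the cluster.
        ["globus-connect-server", "node", "list"],
    ]


# Placeholder values for illustration only.
plan = node_join_commands("EXAMPLE-CLIENT-ID", "/root/deployment-key.json")
for argv in plan:
    print(shlex.join(argv))
```

Printing the plan first makes it easy to review before handing the same argv lists to a configuration management tool.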
Multi-node DTN behavior
• Active nodes can receive transfer tasks
• Tasks on an inactive node will pause until the node is active again
• GCS manager assistant
– Synchronizes configuration among nodes in the endpoint
– Stores encrypted configuration values in Globus service
Migrating an endpoint to a new host (DTN)
• An endpoint is a logical construct → replace the host
system without disrupting the endpoint
– Avoid replicating configuration data (esp. for guest collections!)
– Maintain continuity for custom apps, automation scripts, etc., that
use the endpoint UUID
• Using GCS’s multi-node configuration, add new node(s)
to endpoint and then remove original node(s)
• Again, deployment key is required
– Export node configuration with node setup --export-node
– Import on new DTN using node setup --import-node
Supporting non-POSIX systems
• Update your GCS packages
• Add the appropriate storage gateway
– Non-POSIX systems require add-on connector subscription(s)
• Gateway configuration options vary by connector
– e.g., specify bucket name(s) for AWS S3
• Collection authentication options vary by connector
– e.g., provide user access key and secret key for AWS S3
– Credentials must grant appropriate permissions
– Mapped collection may not actually “map” to local user account
Supporting access to AWS S3 (and S3-compatible systems)
On performance…
Globus transfer is fast …but it depends on…
• Data Transfer Node (CPU, RAM, bus, NIC, …)
• Network (devices, path quality, latency, …)
• Storage (hardware, attach mode, …)
• Dataset make-up (file#, size, tree depth, …)
– Remember: LoSF (lots of small files) == Great sadness
• Things people do (one transfer per file …1M files)
• …?
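Before blaming the network, it can help to measure the dataset make-up listed above. The following is a minimal sketch, not a Globus tool: it walks a directory tree and reports file count, total size, mean file size, and tree depth. The function name and the returned keys are illustrative.

```python
import os
import tempfile
from pathlib import Path


def profile_dataset(root) -> dict:
    """Summarize file count, total bytes, mean file size, and max tree depth."""
    root = Path(root)
    file_count = 0
    total_bytes = 0
    max_depth = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        # Depth of this directory relative to the root (root itself is 0).
        depth = len(Path(dirpath).relative_to(root).parts)
        max_depth = max(max_depth, depth)
        for name in filenames:
            file_count += 1
            # lstat avoids errors on broken symlinks.
            total_bytes += (Path(dirpath) / name).lstat().st_size
    mean_file_bytes = total_bytes / file_count if file_count else 0.0
    return {
        "file_count": file_count,
        "total_bytes": total_bytes,
        "mean_file_bytes": mean_file_bytes,
        "max_depth": max_depth,
    }


# Small throwaway tree for demonstration.
demo_root = Path(tempfile.mkdtemp())
(demo_root / "a" / "b").mkdir(parents=True)
for i in range(3):
    (demo_root / "a" / "b" / f"f{i}.dat").write_bytes(b"x" * 10)

profile = profile_dataset(demo_root)
print(profile)
```

A very large file count combined with a small mean file size is the LoSF pattern; in that case per-file overhead, rather than bandwidth, is likely to dominate transfer time.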
You should have Great Expectations
ESnet EPOC target for all DOE labs
Requires at least a 10G connection
ESnet makes magic happen
Legacy Architecture (don’t do this)
[Diagram. Components: WAN, border router (10GE), firewall, enterprise network, perfSONAR hosts, portal server (10GE), filesystem (data store). Paths shown: browsing path, query path, data path. Portal server applications: web server, search, database, authentication, data service.]
Best practice: ScienceDMZ – you have one!
[Diagram. Components: WAN, border router, Science DMZ switch/router, firewall, enterprise network, perfSONAR hosts, DTNs, API DTNs (data access governed by portal), filesystem (data store), portal server; links are 10GE. Paths shown: data transfer path, portal query/browse path. Portal server applications: web server, search, database, authentication.]
Science DMZ configuration
[Diagram. Source and destination sides each show security filters, a router, a border router, and a Science DMZ containing a Data Transfer Node (DTN). Other labels: User, Organization. Paths shown: physical and logical control paths, physical and logical data paths. Port labels: 443* (control), 50000-51000* (data).]
* Please see TCP ports reference: https://docs.globus.org/resource-provider-guide/#open-tcp-ports_section
Globus balances performance with reliability
72.8Gbps
Performance is a pairs sport
• Network use parameters: concurrency, parallelism
• Maximum, Preferred values for each
• Transfer considers source and destination endpoint settings
min(
max(preferred src, preferred dest),
max src,
max dest
)
• Service limits, e.g. concurrent requests
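The min/max rule above can be worked through with numbers. This is a small sketch of the formula exactly as written on the slide; the `NetworkUse` class and the sample values are illustrative, not Globus defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkUse:
    """Preferred and maximum value of one parameter (concurrency or parallelism)."""
    preferred: int
    maximum: int


def effective_value(src: NetworkUse, dst: NetworkUse) -> int:
    """Apply min(max(preferred src, preferred dest), max src, max dest)."""
    return min(max(src.preferred, dst.preferred), src.maximum, dst.maximum)


# Illustrative settings: the source prefers 4, but the destination caps at 3.
src_concurrency = NetworkUse(preferred=4, maximum=8)
dst_concurrency = NetworkUse(preferred=2, maximum=3)

effective_concurrency = effective_value(src_concurrency, dst_concurrency)
print(effective_concurrency)  # prints 3
```

The example shows why raising the preferred value on one endpoint may change nothing: the other endpoint's maximum still caps the result.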
Globus network use parameters
• May only be changed on managed endpoints
• Modify via the web app: Console → Endpoints tab
• Modify via Globus Connect Server CLI
– Run globus-connect-server endpoint modify
• Strong recommendation: Do not change network use
parameters before establishing baseline performance
Modifying network use parameters
Configuring a “private” data channel
• Default: data interface is set to the DTN’s public IP
address (see data_interface in
/etc/gridftp.d/globus-connect-server)
• Create /etc/gridftp.d/STORAGE_GATEWAY_ID
• Set data_interface PRIVATE_INTERFACE_IP_ADDRESS
• Replicate on every DTN (files in /etc/gridftp.d/ are
not sync'd between nodes by Globus)
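Because the override file has to be created by hand on every DTN, a small helper can keep the copies consistent. The sketch below assumes the file is a single `data_interface <address>` line named after the storage gateway ID, as described above; the function name, the sample ID, and the sample address are hypothetical, and the demo writes to a temporary directory rather than /etc/gridftp.d/.

```python
import ipaddress
import tempfile
from pathlib import Path


def write_data_interface(config_dir: Path, storage_gateway_id: str, private_ip: str) -> Path:
    """Write a per-gateway override file containing one data_interface line."""
    # Reject typos early; raises ValueError for an invalid address.
    ipaddress.ip_address(private_ip)
    target = config_dir / storage_gateway_id
    target.write_text(f"data_interface {private_ip}\n")
    return target


# Demo against a scratch directory; on a real DTN config_dir would be /etc/gridftp.d.
demo_dir = Path(tempfile.mkdtemp())
written = write_data_interface(demo_dir, "EXAMPLE-STORAGE-GATEWAY-ID", "10.0.0.5")
print(written.read_text(), end="")
```

Each DTN normally has its own private address, so the helper should be called with that node's address on that node.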
Troubleshooting Globus Connect Server
Before asking for help…
• Running globus-connect-server self-diagnostic can identify many issues
– Are services running? GCS manager/assistant, GridFTP server
• Connectivity is a common cause
– Is the DTN control channel reachable?
– Can the DTN establish data channel connection?
docs.globus.org/globus-connect-server/v5.4/troubleshooting-guide
…and we’re always here for you: support@globus.org
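For the connectivity questions above, a basic TCP probe run from outside the site is often a useful first check. This is a generic sketch using Python's standard library, not a Globus utility; the helper name is illustrative, and the demo probes a throwaway local listener rather than a real DTN.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Demo: open a local listener, probe it, then probe again after closing it.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
open_port = listener.getsockname()[1]

local_result = tcp_port_open("127.0.0.1", open_port)
listener.close()
closed_result = tcp_port_open("127.0.0.1", open_port, timeout=1.0)
print(local_result, closed_result)
```

Against a real DTN, one would probe port 443 for the control channel. A refused connection in the 50000-51000 range does not by itself prove a firewall block, since data ports may only have a listener while a transfer is active.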
When you really need a clean slate…
• Proper clean-up—both on your system and in the
Globus service—is important!
• Execute these commands in the specified order:
o globus-connect-server node cleanup (on every DTN)
o globus-connect-server endpoint cleanup (on last DTN)
• Delete the GCS registration at developers.globus.org
• Don’t use the same Client ID for another endpoint!
Resources
• GCSv5 Guides: docs.globus.org/globus-connect-server/
• Migration: docs.globus.org/globus-connect-server/migrating-to-v5.4/
• Globus support: support@globus.org
