Automating Research Data Workflows
Rachana Ananthakrishnan
rachana@globus.org
Greg Nawrocki
greg@globus.org
Data replication
• For backup: initiated by the user or by a system backup
• Automated transfer of data from science instrument
• Replication to a data share
• Recurring transfers with sync option
– Example: copy /ingest daily @ 3:30am
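The recurring copy above could be scheduled with cron. The following is a minimal sketch, not a definitive setup: it only builds the crontab line as text. The endpoint placeholders, destination path, and label are hypothetical, and it assumes the installed CLI accepts --recursive and --sync-level on "globus transfer".

```python
import shlex


def cron_sync_entry(src, dst, hour=3, minute=30, sync_level="checksum"):
    """Return a crontab line that runs a recursive, syncing transfer daily."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour or minute out of range")
    command = [
        "globus", "transfer",
        "--recursive",                      # copy the directory and its contents
        "--sync-level", sync_level,         # only send files that differ
        "--label", "nightly-ingest-copy",   # hypothetical label
        src, dst,
    ]
    # crontab fields: minute hour day-of-month month day-of-week
    schedule = f"{minute} {hour} * * *"
    return schedule + " " + " ".join(shlex.quote(part) for part in command)


if __name__ == "__main__":
    # Placeholders: substitute real endpoint UUIDs and paths.
    print(cron_sync_entry("SOURCE_ENDPOINT_UUID:/ingest/",
                          "BACKUP_ENDPOINT_UUID:/backup/ingest/"))
```

The printed line can be pasted into the crontab of the user who ran "globus login", since the transfer is submitted with that user's stored tokens.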
Staging data with compute jobs
• Stage data in or out as part of the job
• Transfer task is submitted when the job is run
– Endpoint may not be currently activated
• Alternative approaches
1. User adds directives to job submission script
2. Application manages data staging on user’s behalf
Application driven automation
• Application (e.g. portal, science gateway) submits a
transfer of compute results as the user
• Application monitors transfer, and initiates additional
processing and/or backup of data
Relevant Platform Capabilities
Globus Auth: Native apps
• Client that cannot keep a secret, e.g.:
– Command line, desktop apps
– Mobile apps
– Jupyter notebooks
• Native app is registered with Globus Auth
– Not a confidential client like we’ll learn about later
• Native App Grant is used
– Variation on the Authorization Code Grant
• Globus SDK:
– To get tokens: NativeAppAuthClient
– To use tokens: AccessTokenAuthorizer
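In practice the SDK's NativeAppAuthClient takes care of the grant. To show what the "variation on the Authorization Code Grant" looks like, here is a sketch of the PKCE step (RFC 7636) that lets a client without a secret prove it started the flow. It assumes the native app grant uses PKCE with the S256 method; the authorize URL, the parameter set, and the helper names are assumptions for illustration, not taken from the slides.

```python
import base64
import hashlib
import secrets
from urllib.parse import urlencode

# Assumed Globus Auth authorize endpoint; check the Globus Auth docs.
AUTHORIZE_URL = "https://auth.globus.org/v2/oauth2/authorize"


def _b64url(raw):
    """Base64url-encode bytes without '=' padding, as RFC 7636 requires."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_code_verifier(num_bytes=32):
    """Random secret kept by the app until the code exchange (43 chars here)."""
    return _b64url(secrets.token_bytes(num_bytes))


def code_challenge_s256(verifier):
    """Challenge sent in the authorize URL: BASE64URL(SHA256(verifier))."""
    return _b64url(hashlib.sha256(verifier.encode("ascii")).digest())


def authorize_url(client_id, redirect_uri, scope, verifier, state):
    """Step 2 of the flow: the URL the user opens in a browser."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
        "response_type": "code",
        "code_challenge": code_challenge_s256(verifier),
        "code_challenge_method": "S256",
    }
    return AUTHORIZE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    verifier = make_code_verifier()
    # All argument values below are placeholders.
    print(authorize_url("YOUR_CLIENT_ID", "https://example.org/callback",
                        "openid", verifier, state="xyz"))
```

When the auth code is exchanged for tokens, the app sends the original verifier, and the authorization server checks it against the challenge it saw earlier.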
Native App grant
• Participants: Native App (Client), Browser, Globus Auth (Authorization Server), App/Service (Resource Server)
1. Run application
2. URL to authenticate
3. Authenticate and consent
4. Auth code
5. Register auth code
6. Exchange code
7. Access tokens
8. Authenticate with access tokens to invoke transfer service as user
Refresh tokens
• Common use cases
– Portal checking transfer status when user is not logged in
– Running command line app from script
o The CLI gets access and refresh tokens upon "globus login"
• Refresh tokens are issued to a client for a particular scope
• Client uses refresh token to get access token
– Confidential client: client_id and client_secret required
– Native app: client_secret not required
• Refresh token good for 6 months after last use
• Consent rescindment revokes resource token
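The difference between the two client types when refreshing can be sketched as follows. This is an illustration based on the generic OAuth2 refresh request (RFC 6749, section 6), not the SDK's implementation: the function names and the returned dictionary layout are hypothetical, and sending confidential-client credentials via HTTP Basic auth is an assumption.

```python
import time


def build_refresh_request(refresh_token, client_id, client_secret=None):
    """Describe the token request a client would send to refresh a token."""
    form = {"grant_type": "refresh_token", "refresh_token": refresh_token}
    if client_secret is None:
        # Native app: no secret, so the client only identifies itself.
        form["client_id"] = client_id
        basic_auth = None
    else:
        # Confidential client: authenticates with client_id and client_secret.
        basic_auth = (client_id, client_secret)
    return {"form": form, "basic_auth": basic_auth}


def needs_refresh(expires_at, now=None, margin=60):
    """True when the access token expires within `margin` seconds."""
    now = time.time() if now is None else now
    return now >= expires_at - margin


if __name__ == "__main__":
    print(build_refresh_request("REFRESH_TOKEN", "NATIVE_APP_CLIENT_ID"))
    print(needs_refresh(expires_at=time.time() + 30))
```

A script would typically call needs_refresh before each request and only contact the authorization server when the stored access token is about to expire.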
Refresh tokens
• Participants: Native App (Client), Browser, Globus Auth (Authorization Server), App/Service (Resource Server)
1. Run application
2. URL to authenticate
3. Authenticate and consent
4. Auth code
5. Register auth code
6. Exchange code, request refresh tokens
7. Access tokens and refresh tokens
8. Store refresh tokens
9. Exchange refresh token for new access tokens
10. Access tokens
11. Authenticate with access tokens to invoke service as user
Native App/Refresh Tokens Sample Code
github.com/globus/native-app-examples
• ./example_copy_paste.py
– User copies and pastes code to the app
• ./example_copy_paste_refresh_token.py
– Stores refresh token locally, uses it to get new access tokens
• See README for installation
On your EC2 instance in ~/native-app-examples
Automation via the Globus CLI
Globus CLI
• It’s a native application distributed by Globus
– https://docs.globus.org/cli/
– https://github.com/globus/globus-cli
• Easy install and updates
• Command "globus login" gets access tokens and refresh tokens
– Stores the tokens locally (~/.globus.cfg)
• All interactions with the service use the tokens
– Tokens for Globus Auth and Transfer services
– Just like we’ll do in the Platform examples with the API
• Command "globus logout" deletes those tokens
UUIDs everywhere
• UUIDs for endpoint, task, user identity, groups…
• Use search/list options
• get-identities maps an identity username to its UUID
$ globus endpoint search 'Globus Tutorial'
$ globus task list
$ globus get-identities vas@globus.org
bfc122a3-af43-43e1-8a41-d36f28a2bc0a
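Because nearly every CLI argument is a UUID, a script can catch mistakes such as passing a username or a display name before anything is submitted. The helpers below are a small sketch with hypothetical names; they only check the textual form and do not contact Globus.

```python
import uuid


def is_uuid(text):
    """True if text is a UUID in canonical hyphenated form."""
    try:
        return str(uuid.UUID(text)) == text.lower()
    except ValueError:
        return False


def split_endpoint_path(spec):
    """Split 'ENDPOINT_UUID:/path' into (endpoint, path) and validate the UUID."""
    endpoint, sep, path = spec.partition(":")
    if not sep or not is_uuid(endpoint):
        raise ValueError(f"expected ENDPOINT_UUID:PATH, got {spec!r}")
    return endpoint, path


if __name__ == "__main__":
    print(is_uuid("bfc122a3-af43-43e1-8a41-d36f28a2bc0a"))
    print(split_endpoint_path("ddb59af0-6d04-11e5-ba46-22000b92c6ec:/~/sync-demo"))
```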
The Globus CLI – Let’s do a few things…
• Find endpoints
– globus endpoint search Midway
– globus endpoint search ESNet
– globus endpoint search --filter-scope=recently-used
• Find endpoint contents
– globus ls af7bda53-6d04-11e5-ba46-22000b92c6ec
– globus ls af7bda53-6d04-11e5-ba46-22000b92c6ec:RMACC2018
• Transfer a file
– From ESnet Read-Only Test DTN at CERN to Midway
– Note the specific paths
– globus transfer d8eb36b6-6d04-11e5-ba46-22000b92c6ec:/~/data1/1M.dat af7bda53-6d04-11e5-ba46-22000b92c6ec:/~/1M.dat
• Transfer a directory
– From Globus Tutorial Endpoint 2 to Midway (create directory and contents)
– globus transfer --recursive ddb59af0-6d04-11e5-ba46-22000b92c6ec:/~/sync-demo af7bda53-6d04-11e5-ba46-22000b92c6ec:/~/syncDemo
• https://docs.globus.org/cli/examples/
Batch Transfers
• Transfer tasks have one source/destination, but can have
any number of files
• Provide input source-dest pairs via local file
• e.g. move files listed in files.txt from $ep1 to $ep2
$ ep1=ddb59aef-6d04-11e5-ba46-22000b92c6ec
$ ep2=ddb59af0-6d04-11e5-ba46-22000b92c6ec
$ globus transfer $ep1:/share/godata/ $ep2:/~/ --batch --label 'CLI Batch' < files.txt
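The files.txt input can be generated rather than written by hand. This sketch assumes each batch line is a source path followed by a destination path, both relative to the paths given on the command line, and that the CLI splits lines with shell-style quoting; the file names in the example are made up.

```python
import shlex


def batch_lines(pairs):
    """Render (source, destination) pairs as batch input, one pair per line."""
    lines = []
    for src, dst in pairs:
        # Quote each path so names containing spaces stay a single argument.
        lines.append(f"{shlex.quote(src)} {shlex.quote(dst)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    pairs = [
        ("file1.txt", "file1.txt"),
        ("my data.csv", "renamed.csv"),   # hypothetical file names
    ]
    with open("files.txt", "w", encoding="utf-8") as handle:
        handle.write(batch_lines(pairs))
```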
Useful submission commands
• Safe resubmissions
– Applies to all tasks (transfer and delete)
– Get a submission ID (a UUID) and use it in the submission
– $ globus task generate-submission-id
– --submission-id option in transfer
• Task wait
– Useful for scripts whose next step depends on the transfer task status
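Why a submission ID makes resubmission safe can be sketched as a retry loop: every attempt reuses the same ID, so a request that reached the service before the connection dropped is not queued a second time. The `submit` callable and the choice of ConnectionError as the retryable failure are hypothetical; the ID itself would come from "globus task generate-submission-id".

```python
def transfer_argv(submission_id, src, dst):
    """Command line for a transfer that carries a pre-generated submission ID."""
    return ["globus", "transfer", "--submission-id", submission_id, src, dst]


def submit_with_retries(submit, submission_id, attempts=3):
    """Call submit(submission_id) until it succeeds, always with the same ID."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return submit(submission_id)
        except ConnectionError as err:
            # The request may or may not have arrived; retrying with the
            # same ID is what keeps a duplicate task from being created.
            last_error = err
    raise last_error


if __name__ == "__main__":
    print(transfer_argv("SUBMISSION_ID", "EP1_UUID:/~/a.dat", "EP2_UUID:/~/a.dat"))
```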
Parsing CLI output
• Default output is text; for JSON output use --format json
$ globus endpoint search --filter-scope my-endpoints
$ globus endpoint search --filter-scope my-endpoints --format json
• Extract specific attributes using --jmespath <expression>
$ globus endpoint search --filter-scope my-endpoints --jmespath 'DATA[].[id, display_name]'
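A script that already has the --format json output can pull out the same fields in Python. The sketch below mirrors the expression 'DATA[].[id, display_name]'; the sample document is invented for illustration and only contains the two fields used here.

```python
import json


def id_and_name(json_text):
    """Return [[id, display_name], ...] from endpoint search JSON output."""
    document = json.loads(json_text)
    return [
        [item.get("id"), item.get("display_name")]
        for item in document.get("DATA", [])
    ]


if __name__ == "__main__":
    # Made-up sample shaped like the output the jmespath expression expects.
    sample = json.dumps({
        "DATA": [
            {"id": "00000000-0000-0000-0000-000000000001", "display_name": "My Laptop"},
            {"id": "00000000-0000-0000-0000-000000000002", "display_name": "Lab Share"},
        ]
    })
    for endpoint_id, name in id_and_name(sample):
        print(endpoint_id, name)
```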
Managing notifications
• Turn off emails sent for tasks
• Useful when an application manages tasks for a user
• Disable notifications with the --notify option
--notify off (all notifications)
--notify succeeded|failed|inactive (select notifications)
Permission management
• Set and manage permissions on shared endpoint
• Requires access manager role
$ share=<shared_endpoint_UUID>
$ globus endpoint permission create --permissions r --identity greg@nawrockinet.com $share:/nawrockipersonal/
$ globus endpoint permission list $share
$ globus endpoint permission delete $share <perm_UUID>
Automation with CLI
• A script that uses the CLI to transfer data repeatedly via
task manager/cron
– Interactions are as user: both for data access and to Globus
services
• CLI commands used in the job submission script
– CLI is installed on head node
– User runs "globus login"; the tokens are stored in the user's home directory
– Tokens are accessible when the job runs and submits stage-in or stage-out tasks
– Use the --skip-activation-check option to submit the task even if the endpoint is not activated at submit time
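The stage-in, compute, stage-out sequence inside a job could look like the sketch below. It is one possible arrangement under stated assumptions, not the tutorial's script: the function names are hypothetical, --skip-activation-check is taken from the slide and may depend on the CLI version, and capturing the task ID with --jmespath task_id --format UNIX assumes the transfer output exposes a task_id field.

```python
import subprocess


def run_cli(argv):
    """Run one command and return its standard output as text."""
    return subprocess.run(argv, check=True, capture_output=True, text=True).stdout


def run_staged_job(stage_in, compute_argv, stage_out, run=run_cli):
    """Stage data in, wait for it, run the computation, then stage results out."""
    def submit(src, dst):
        # Print only the task ID so it can be captured from stdout.
        out = run(["globus", "transfer", "--skip-activation-check",
                   "--jmespath", "task_id", "--format", "UNIX", src, dst])
        return out.strip()

    in_task = submit(*stage_in)
    run(["globus", "task", "wait", in_task])   # block until stage-in finishes
    run(compute_argv)                          # the actual compute step
    return submit(*stage_out)                  # stage-out is submitted, not awaited


if __name__ == "__main__":
    # Placeholders only; running this requires the CLI and a prior "globus login".
    print("call run_staged_job((src, dst), ['./simulate'], (src, dst))")
```

Passing a different `run` callable makes the sequence easy to dry-run or test without touching the service.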
Automation with portals
• Portal needs to act as the user
• User grants “offline” access to the portal
– Portal gets and stores refresh tokens for each user
– Uses client id/secret + refresh tokens to get new access tokens
– Portal maintains state about transfers being managed (task id)
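The state a portal keeps can be sketched with a small store: one refresh token per user and one row per managed transfer task. The class, table, and column names are hypothetical, and a real portal would encrypt the stored tokens rather than keep them in plain text.

```python
import sqlite3


class PortalState:
    """Minimal store for per-user refresh tokens and managed task IDs."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS tokens ("
            "user_id TEXT PRIMARY KEY, refresh_token TEXT NOT NULL)")
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS tasks ("
            "task_id TEXT PRIMARY KEY, user_id TEXT NOT NULL, status TEXT NOT NULL)")

    def save_refresh_token(self, user_id, refresh_token):
        # Plain text here for brevity; encrypt at rest in a real deployment.
        with self.db:
            self.db.execute("INSERT OR REPLACE INTO tokens VALUES (?, ?)",
                            (user_id, refresh_token))

    def refresh_token_for(self, user_id):
        row = self.db.execute(
            "SELECT refresh_token FROM tokens WHERE user_id = ?", (user_id,)).fetchone()
        return row[0] if row else None

    def track_task(self, user_id, task_id):
        with self.db:
            self.db.execute("INSERT INTO tasks VALUES (?, ?, 'ACTIVE')",
                            (task_id, user_id))

    def set_status(self, task_id, status):
        with self.db:
            self.db.execute("UPDATE tasks SET status = ? WHERE task_id = ?",
                            (status, task_id))

    def active_tasks(self):
        """Tasks the portal still needs to poll, as (task_id, user_id) pairs."""
        return self.db.execute(
            "SELECT task_id, user_id FROM tasks WHERE status = 'ACTIVE' "
            "ORDER BY task_id").fetchall()


if __name__ == "__main__":
    state = PortalState()
    state.save_refresh_token("user-1", "REFRESH_TOKEN")
    state.track_task("user-1", "task-a")
    print(state.active_tasks())
```

A background worker could loop over active_tasks, use each user's refresh token to obtain an access token, and update the status as transfers finish.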
Automation Examples
• Syncing a directory
– Bash script that calls the Globus CLI and a
Python module that can be run as a script or
imported as a module.
• Staging data in a shared directory
– Bash / Python
• Removing directories after files are
transferred
– Python script
• Simple code examples for various use cases
using Globus
– https://github.com/globus/automation-examples
Support resources
• Globus documentation: docs.globus.org
• Sample code: github.com/globus
• Helpdesk and issue escalation: support@globus.org
• Mailing lists
– https://www.globus.org/mailing-lists
– developer-discuss@globus.org
• Globus professional services team
– Assist with portal/gateway/app architecture and design
– Develop custom applications that leverage the Globus platform
– Advise on customized deployment and integration scenarios
