Building and provisioning
genomics platforms on the
world’s clouds
Enis Afgan
Johns Hopkins University
Galaxy Project
April 2016, University of Heidelberg
World’s clouds
AWS
AWS (coming soon)
Google Compute Engine
Chameleon
Jetstream
NeCTAR
Azure
Capacity without end-to-end
solution
How to appropriately utilize
clouds?
VM ; Platform ; Service
Standalone VM
Pre-configured server that is readily available.
Pros
Easy to build; easy to deploy
Low cloud infrastructure requirements ⟶ Transferable
Cons
Limited capacity (compute and storage)
See it in action
wiki.galaxyproject.org/Cloud/Jetstream
Scalable platform
Set up a virtual cluster across multiple VMs with app services.
Pros
Dynamically scale compute and storage
Higher-level services: persistent storage, sharing, multi-application
Cons
Complicated build; considerable infrastructure requirements
See it in action
wiki.galaxyproject.org/CloudMan
Scalable platform (cont)
Data analysis spans more than one application (even if that is
Galaxy).
Meet Genomics Virtual Lab (GVL)
Pros
Versatile platform built on
the scalable CloudMan cluster
Includes common tutorials
Cons
Demanding to build
Calls for more customization
See it in action
genome.edu.au
Ready-to-use service
Use cloud resources from an always-on, public service
Pros
Visit a URL and start computing – no setup required
Cons
User quotas still apply
It’s still a public service: no user customization
See it in action
usegalaxy.org (bwa, bowtie2 – more coming)
There are a lot of clouds out there!
AWS
AWS (coming soon)
Google Compute Engine
Chameleon
Jetstream
NeCTAR
Azure
How to appropriately utilize many
clouds?
VM ; Platform ; Service
Build system
Adjustable build system
Automate the process of building each component
Codify knowledge about the system ⟶ easier to reproduce
We use Ansible as the technology of choice
Compose systems from configurable and reusable roles
Galaxy-Kickstarter
Playbook
artbio.github.io/ansible-artimed/
Galaxy-CloudMan
Playbook
github.com/galaxyproject/galaxy-cloudman-playbook
Use-Galaxy
Playbook
github.com/galaxyproject/usegalaxy-playbook
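A rough illustration of composing systems from configurable, reusable roles. This Python sketch only models the idea; it is not Ansible and is not taken from any of the playbooks listed above. The role names, variables and the `compose` helper are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Role:
    """A reusable build step with default settings (hypothetical model)."""
    name: str
    defaults: dict = field(default_factory=dict)


def compose(roles, overrides=None):
    """Return an ordered build plan: one (role name, resolved variables) pair per role.

    Variables given in `overrides` win over a role's defaults, so a role can
    be reconfigured without being edited.
    """
    overrides = overrides or {}
    plan = []
    for role in roles:
        resolved = {**role.defaults, **overrides.get(role.name, {})}
        plan.append((role.name, resolved))
    return plan


# Hypothetical roles; real playbooks define their own.
DATABASE = Role("database", {"port": 5432})
GALAXY = Role("galaxy", {"admin_email": "admin@example.org", "workers": 2})
CLUSTER = Role("cluster_manager", {"autoscale": True})

# The same roles yield different systems depending on which are selected
# and how they are configured.
standalone_vm = compose([DATABASE, GALAXY])
scalable_platform = compose([DATABASE, GALAXY, CLUSTER], {"galaxy": {"workers": 8}})
```

Here the standalone VM and the scalable platform share the database and Galaxy roles and differ only in the extra cluster role and one overridden variable.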
Many clouds AND many solutions
!?!
launch.genome.edu.au ; use.jetstream-cloud.org ; launch.usegalaxy.org
CloudBridge (future)
A Simple Cross-Cloud Python Library
1. Offer a uniform API irrespective of the underlying provider
2. Provide a set of conformance tests for all supported clouds
3. Focus on mature clouds with a required minimal set of features
4. Be as thin as possible
Support for AWS and OpenStack exists; Google Cloud under
development
cloudbridge.readthedocs.org
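Goals 1 and 2 can be sketched in a few lines of Python. This is not CloudBridge's actual API (see cloudbridge.readthedocs.org for that); the class and method names below are hypothetical, and an in-memory backend stands in for real clouds so the sketch runs without credentials.

```python
from abc import ABC, abstractmethod


class CloudProvider(ABC):
    """Uniform interface that every provider must implement (hypothetical)."""

    @abstractmethod
    def launch_instance(self, name: str, image: str) -> str:
        """Start an instance and return its ID."""

    @abstractmethod
    def list_instances(self) -> list:
        """Return the IDs of all running instances."""

    @abstractmethod
    def terminate_instance(self, instance_id: str) -> None:
        """Stop and remove the given instance."""


class InMemoryProvider(CloudProvider):
    """Stand-in backend; a real one would call the cloud's own SDK."""

    def __init__(self, name: str):
        self.name = name
        self._instances = {}
        self._counter = 0

    def launch_instance(self, name, image):
        self._counter += 1
        instance_id = f"{self.name}-{self._counter}"
        self._instances[instance_id] = (name, image)
        return instance_id

    def list_instances(self):
        return sorted(self._instances)

    def terminate_instance(self, instance_id):
        del self._instances[instance_id]


def conformance_test(provider: CloudProvider) -> bool:
    """One test body, run unchanged against every supported provider."""
    instance_id = provider.launch_instance("test", "image-placeholder")
    launched = instance_id in provider.list_instances()
    provider.terminate_instance(instance_id)
    cleaned_up = instance_id not in provider.list_instances()
    return launched and cleaned_up


# Two differently named backends are driven through identical calls.
providers = (InMemoryProvider("cloud-a"), InMemoryProvider("cloud-b"))
results = {p.name: conformance_test(p) for p in providers}
```

Keeping the interface this small is what makes goal 4 (be as thin as possible) realistic: each backend only has to map a handful of calls.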
CloudLaunch (future)
A centralized launcher for any app and any cloud.
User-configurable applications and clouds; view and launch
shared instances; multi-cloud dashboard view
github.com/galaxyproject/cloudlaunch
github.com/galaxyproject/cloudlaunch-u
CloudMan (future)
Resource manager with configurable service layer
• Pull away from low-level application service management
• Leverage containers to supply services
• Allow runtime service and configuration changes
• Run on any infrastructure, including high-level services such as
ECS or the Docker API
Goal: Launch a (template-based) CloudMan platform and add
application services as desired from Docker Hub or similar while
resource provisioning is handled automatically.
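A toy model of that goal, assuming services are described by a container image and a CPU request. Everything here (the `Service` and `Platform` classes, the image names, the node size) is hypothetical and not CloudMan code; it only shows services changing at runtime while the node count follows demand.

```python
import math
from dataclasses import dataclass


@dataclass
class Service:
    """An application service delivered as a container image (hypothetical)."""
    name: str
    image: str   # placeholder image reference from a container registry
    cpus: int    # CPUs the service requests


class Platform:
    """Toy resource manager: services change at runtime, nodes follow demand."""

    def __init__(self, cpus_per_node: int = 4):
        self.cpus_per_node = cpus_per_node
        self.services = {}

    def add_service(self, service: Service) -> None:
        self.services[service.name] = service

    def remove_service(self, name: str) -> None:
        self.services.pop(name, None)

    @property
    def nodes_needed(self) -> int:
        """Nodes required to fit all requested CPUs, never fewer than one."""
        total = sum(s.cpus for s in self.services.values())
        return max(1, math.ceil(total / self.cpus_per_node))


platform = Platform(cpus_per_node=4)
platform.add_service(Service("galaxy", "example/galaxy:latest", cpus=4))
platform.add_service(Service("rstudio", "example/rstudio:latest", cpus=2))
nodes_with_two_services = platform.nodes_needed   # 6 CPUs on 4-CPU nodes

platform.remove_service("rstudio")
nodes_with_one_service = platform.nodes_needed    # 4 CPUs on 4-CPU nodes
```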
Galaxy ObjectStore (future)
Allow uniform any-Galaxy computing (i.e., make Galaxy instances
interchangeable and disposable)
• Galaxy implements an ObjectStore interface as an abstraction to
data
• Leverage it to expand user data storage and allow any Galaxy
to connect to a user’s bucket
• Use ObjectStore for reference data (simplify builds)
• Will still need to deal with the database dependency
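The ObjectStore idea can be sketched as below. The three-method interface is a deliberately reduced, hypothetical one (Galaxy's own ObjectStore interface is richer), and a plain dict stands in for a user's cloud bucket so the example runs offline.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class ObjectStore(ABC):
    """Minimal data abstraction (hypothetical; not Galaxy's real interface)."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...

    @abstractmethod
    def exists(self, key: str) -> bool: ...


class DiskObjectStore(ObjectStore):
    """Keeps each object as a file under a local directory."""

    def __init__(self, root: str):
        self.root = Path(root)

    def put(self, key, data):
        (self.root / key).write_bytes(data)

    def get(self, key):
        return (self.root / key).read_bytes()

    def exists(self, key):
        return (self.root / key).is_file()


class BucketObjectStore(ObjectStore):
    """Stands in for a user's cloud bucket; a dict replaces the remote service."""

    def __init__(self, bucket: dict):
        self.bucket = bucket

    def put(self, key, data):
        self.bucket[key] = data

    def get(self, key):
        return self.bucket[key]

    def exists(self, key):
        return key in self.bucket


def run_tool(store: ObjectStore, input_key: str, output_key: str) -> None:
    """Analysis code sees only the interface, never the storage backend."""
    store.put(output_key, store.get(input_key).upper())


# Two "Galaxy instances" attach to the same user bucket, so either one can
# be discarded without losing the data.
user_bucket = {"reads.txt": b"acgt"}
run_tool(BucketObjectStore(user_bucket), "reads.txt", "result.txt")
result_seen_by_second_instance = BucketObjectStore(user_bucket).get("result.txt")

# The same tool code runs unchanged against local disk.
with tempfile.TemporaryDirectory() as tmp:
    disk = DiskObjectStore(tmp)
    disk.put("reads.txt", b"acgt")
    run_tool(disk, "reads.txt", "result.txt")
    disk_result = disk.get("result.txt")
```

The sketch covers datasets only; job and history metadata would still live in the database, which is the remaining dependency noted in the last bullet.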
The endgame?
launch.usegalaxy.org
ObjectStore
CloudBridge
CloudMan
Applications
Building your own cloud?
Make it easy
For end-users to register and get onboard (very simple auth)
For deployers to interface with the cloud (adopt ‘standards’)
Develop capacity and usage plans
Go for monthly-reset, merit-based Allocation Units (AUs)
Design for flexibility
Users need more storage? Different instance types?
Create champion teams
Bring them onboard early to deploy target apps; give them $$$
Start with good documentation
Technical but not overly detailed (look at AWS)
Be open; add great, interactive support
Design a training program
For application developers and end users; build a community
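The monthly-reset Allocation Units suggestion above can be made concrete with a small accounting sketch. The slides do not define what one AU buys, so this assumes 1 AU = 1 CPU core for 1 hour; the `Allocation` class and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Allocation:
    """Per-user budget of Allocation Units (AUs) that resets each month.

    Assumption for this sketch: 1 AU = 1 CPU core for 1 hour.
    """
    monthly_aus: float          # size of the grant, decided on merit
    used: float = 0.0
    period: tuple = (1970, 1)   # (year, month) the usage counter belongs to

    def charge(self, year: int, month: int, cores: int, hours: float) -> bool:
        """Record usage if it fits in the remaining budget; return success."""
        if (year, month) != self.period:   # new month: usage starts from zero
            self.period = (year, month)
            self.used = 0.0
        cost = cores * hours
        if self.used + cost > self.monthly_aus:
            return False
        self.used += cost
        return True


alloc = Allocation(monthly_aus=100)
first = alloc.charge(2016, 4, cores=8, hours=10)       # 80 AUs, fits
second = alloc.charge(2016, 4, cores=8, hours=5)       # 40 more would exceed 100
after_reset = alloc.charge(2016, 5, cores=8, hours=5)  # new month, fits again
```

Because unused units do not carry over, capacity planning only has to cover one month of granted AUs at a time.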
Acknowledgments
Want more Galaxy?
gcc2016.iu.edu
usegalaxy.org cloud-bursting
usegalaxy.org
CVMFS
NFS
job_conf.xml

Editor's Notes

  • #21: Standards: expose API, reliable/adopted middleware,