TOWARDS A (UNITED) FEDERATION OF BIOINFORMATICS RESOURCES
Matthew Vaughn @mattdotvaughn
Director, Life Sciences Computing, TACC | Co-PI Cyverse, Araport, Jetstream
1/14/2017 1
Interoperability and Federation Across Bioinformatic Platforms and Resources
Jan 14, 2017
WHY FEDERATE?
Because you can’t do it all or be it all. And would you even want to?
WHY FEDERATE?
There’s always some existing or emergent
• Data Set
• Database
• Visualization Technology
• Software Algorithm or Library
• Physical Capacity or Capability
• Source of funding and support
that is not in scope for you to directly provide or avail yourself of.
Federated infrastructures are TEAM-BUILT
Increase the resiliency of your informatics ecosystem
Leverage all the other brains who have different views of your problem
WHY DON’T WE FEDERATE BY DEFAULT?
Federation requires three things:
• Components conforming to “standardized” schemas and protocols for interaction and usage
• Stably operated frameworks to handle the yeoman’s work of integrating components
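The first two requirements can be illustrated with a minimal Python sketch. It is not taken from CyVerse, Araport, or Agave: the `Searchable` protocol, the `GeneCatalog` component, and the `Federation` framework are all hypothetical names invented here to show components agreeing on one interface while a small framework does the integration work.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Searchable(Protocol):
    """Hypothetical shared protocol that every federated component implements."""

    def search(self, term: str) -> list:
        """Return a list of dict records matching the term."""
        ...


class GeneCatalog:
    """Toy component: an in-memory list of gene records (illustrative only)."""

    def __init__(self, records):
        self._records = records

    def search(self, term):
        # Case-insensitive substring match on the record's symbol.
        needle = term.lower()
        return [r for r in self._records if needle in r["symbol"].lower()]


class Federation:
    """Toy framework that does the integration work across components."""

    def __init__(self):
        self._components = {}

    def register(self, name, component):
        # Reject anything that does not offer the agreed-upon search() method.
        if not isinstance(component, Searchable):
            raise TypeError(f"{name} does not implement search()")
        self._components[name] = component

    def search(self, term):
        # Fan the query out to every component and tag each hit with its source.
        hits = []
        for name, component in self._components.items():
            for record in component.search(term):
                hits.append({"source": name, **record})
        return hits


federation = Federation()
federation.register("genes", GeneCatalog([{"symbol": "FLC"}, {"symbol": "FT"}]))
federation.register("orthologs", GeneCatalog([{"symbol": "BnFLC"}]))
print(federation.search("flc"))
```

A new component joins the federation by implementing `search()` and calling `register()`; nothing in the framework changes. That is the payoff of agreeing on the interface first.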
WHY DON’T WE FEDERATE BY DEFAULT?
Hey, wait... I said there were three things we needed for federation:
• Planning & Specific Effort
WHY DON’T WE FEDERATE BY DEFAULT?
Lab-Born Software
• Immediately responsive
• Limited R&D
• Resources on hand
• Sustainability? What’s that?
Centrally-Planned Software
• Mindfully built
• Better chance for R&D
• Dedicated resources
• Sustainability? What’s that?
Some of the most interesting work is done at the edges of our infrastructure. Adopting federated access patterns for that work after the fact means taking on substantial technical debt.
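[Figure: layered view of the infrastructure. Layer labels: Science applications; Domain-specific services; Established software and CI; Physical resources. Component labels: Federated Storage; National CI Virtualization; Job Scheduling; Single Sign-on. Axis labels: Ease of Use; Ease of Re-use.]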
HOW CAN WE MAKE FEDERATION EASIER?
Deeply understand the capabilities of existing integration platforms
• Avoid Not-Invented-Here by adopting the 80% rule
• Contribute enhancements, either via active feedback or by coding them
• Build on our platforms and make sure they get credit for their role
Identify and adopt existing standards. Contribute where they fall short of our needs
• OpenAPI for web service definitions
• ISA Framework for experimental metadata
• OAuth2 for authorization
Provide tooling and documentation for users with diverse technical backgrounds
• GUI, Forms, Web Services. But also language libraries and SDKs.
• Make sure we understand the motivations and constraints of those users
• Write cookbooks, not just shopping lists
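In the cookbook spirit, here is a hedged Python sketch of two of the standards named above working together: a minimal OpenAPI 3.0 description of a web service that declares OAuth2 protection, and the bearer-token header a client would send. The service title, the `/genes` path, and the `auth.example.org` token URL are placeholders invented for illustration; they do not describe any real Araport, Agave, or CyVerse endpoint.

```python
import json

# Top-level fields that an OpenAPI 3.0 document must contain.
REQUIRED_TOP_LEVEL = ("openapi", "info", "paths")


def minimal_openapi(title, version, path, summary):
    """Build a minimal OpenAPI 3.0 document for one OAuth2-protected GET operation."""
    return {
        "openapi": "3.0.3",
        "info": {"title": title, "version": version},
        "paths": {
            path: {
                "get": {
                    "summary": summary,
                    "responses": {"200": {"description": "Successful response"}},
                    # Refers to the scheme declared under components below.
                    "security": [{"oauth2": []}],
                }
            }
        },
        "components": {
            "securitySchemes": {
                "oauth2": {
                    "type": "oauth2",
                    "flows": {
                        "clientCredentials": {
                            # Placeholder token endpoint (assumption).
                            "tokenUrl": "https://auth.example.org/token",
                            "scopes": {},
                        }
                    },
                }
            }
        },
    }


def missing_fields(document):
    """Return the required top-level fields that the document lacks."""
    return [key for key in REQUIRED_TOP_LEVEL if key not in document]


def bearer_header(access_token):
    """Build the HTTP header that carries an OAuth2 bearer token."""
    return {"Authorization": f"Bearer {access_token}"}


spec = minimal_openapi("Example Gene Service", "0.1.0", "/genes", "List genes")
print(json.dumps(spec, indent=2))
print(missing_fields(spec))
print(bearer_header("example-token"))
```

The field check here is deliberately shallow; a real project would validate against the full OpenAPI schema with an established validator. The ISA Framework for experimental metadata is not covered by this sketch.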
Jetstream GUI
Agave API Developer Portal
DesignSafe.CI Workbench
Araport Web Services
MAKING FEDERATION WORK REQUIRES THAT WE INCREASE EVERYONE’S PRODUCTIVITY
@mattdotvaughn www.slideshare.net/mattdotvaughn vaughn@tacc.utexas.edu

Editor's Notes

  • #2: Me. Background in molecular genetics and physiology before moving into bioinformatics and infrastructure. Talk about the holistic approach taken by the CyVerse project over the last few years and how I think it’s been transformational.
  • #3: Walmart is an easy target, but think about other monoliths.
  • #4: We also want to make our ecosystem RESILIENT and USE OUR DIVERSITY. This is beautiful. One of the great things about scientists is that we build our own tools with the materials we have at hand.
  • #5: Let’s talk about Hipmunk. It’s actually a good analog for bioinformatics portals. Hipmunk’s value prop is predictive analytics to optimize customer purchasing decisions for travel. It takes a small slice, which is valuable to offerors because it helps them match unsold inventory with flexible, price-conscious customers. To accomplish this, Hipmunk has to CAPTURE and PRESENT diverse data: current and real-time pricing data from multiple lodging aggregators; maps and transit; review system(s); identity and access; and its own proprietary data stream. It could not exist on its own if these resources were not available, because its costs would exceed the value of its improved efficiency. It’s a PLATFORM. So are some (but not all) of its data sources.
  • #6: I don’t want to steal Chris’ thunder, so I won’t go into GREAT detail here. To accomplish this browser view, we draw on: Araport InterMine, JBrowse, and Adama services; CyVerse Auth & Data Store; GitHub & PyPI; TACC’s Agave API; JGI InterMine Phytozome; TACC OpenStack Cloud + Amazon Web Services. New services have arisen under the Araport model with very little NEW code or resource allocation.
  • #8: Research the requirements and build federation into the design. Dedicate effort to it, even if it’s cheaper in the short term NOT TO FEDERATE.
  • #9: Lab-born software: built in immediate response to research needs; limited or no research; technology and resources on hand (programming language/framework, developer skill and commitment, perspective); possibly no sustainability or maintenance plan. Centrally planned software: built mindfully, usually to fulfill a funded research mandate; design and implementation research (during the proposal, in the early stage of development); may be able to dedicate resources, adopt new tech, acquire a broader perspective; possibly still no sustainability or maintenance plan ;-) So, we need to make it EASIER to start, easier to comply, and easier to maintain federated resources.
  • #10: I want to stop and point out a specific example: it is possible to extend CyVerse at ANY level of the infrastructure. People can build against CyVerse by DEFAULT now, and it’s a net positive. This design pattern is now a standard referenced by non-BIO programs.
  • #11: Jetstream, NSF’s new cloud system, builds on CyVerse Atmosphere (Atmo). It went to production 30 days after receipt of hardware. DesignSafe.CI leveraged its own copy of the CyVerse API to onboard a 25,000-person user community in 6 months. Standards are examples only. DE and Atmo and Web Apps / Web Service Catalog. INCREASE EVERYONE’S PRODUCTIVITY.
  • #12: How do these offerings meet those previous criteria?
  • #13: MAKING FEDERATION WORK REQUIRES THAT WE INCREASE EVERYONE’S PRODUCTIVITY