Advancing Life Sciences Research
with High Performance Computing
and Cyberinfrastructure
                                 Ian Stokes-Rees
                          Harvard Medical School
          SHOW - Making Biology Binary, June 2010
Dengue Virus Movie

    animation, not simulation, informed by science
digizyme.com
Science Behind the Movie
                                      Multi-scale
                                      Data intensive
                                      Dynamic
                                      Models
                                      Simulation
                                      Analysis
Ian Stokes-Rees, NEBioGrid, Harvard Medical School     June 23rd, 2010
Water channel through aquaporin tetramer in lipid bilayer
Tajkhorshid, E., Nollert, P., Jensen, M.O., Miercke, L.J., O'Connell, J., Stroud, R.M., and Schulten, K. (2002). Science 296, 525-530
Molecular Dynamics

              Computationally intensive
              Necessarily parallel
              Nanosecond scale today
              Millisecond to second tomorrow
              Rapidly growing interest
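
To make the gap between "nanosecond scale today" and "millisecond to second tomorrow" concrete, the short sketch below counts the integration steps a simulation would need at each timescale. The 2 femtosecond timestep is an assumption chosen for illustration (a common value in atomistic molecular dynamics) and does not come from the slides; the function and constant names are likewise hypothetical.

```python
# Rough step-count arithmetic for molecular dynamics timescales.
# Assumption (not from the slides): a fixed 2 fs integration timestep.
TIMESTEP_FS = 2

# Simulated durations expressed in whole femtoseconds, so the arithmetic stays exact.
NANOSECOND_FS = 10**6
MILLISECOND_FS = 10**12
SECOND_FS = 10**15


def steps_required(duration_fs, timestep_fs=TIMESTEP_FS):
    """Return how many integration steps cover duration_fs femtoseconds."""
    return duration_fs // timestep_fs


if __name__ == "__main__":
    for label, duration in [
        ("nanosecond", NANOSECOND_FS),
        ("millisecond", MILLISECOND_FS),
        ("second", SECOND_FS),
    ]:
        print(f"1 {label}: {steps_required(duration):.1e} steps")
```

Under that assumption, moving from a nanosecond to a millisecond multiplies the step count by a factor of one million, which is why the slide pairs "computationally intensive" with "necessarily parallel".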

48 cores, single system image
GPU Computing 200-800 stream processing cores per card
NextGen Sequencing
Collaborations and
Communities
Boston Life Sciences
        Universities
        Hospitals
        Pharmaceuticals
        Research Institutes
        Tufts University School of Medicine




Washington U. School of Med.: T. Ellenberger, D. Fremont
U. Washington: T. Gonen
UC Davis: H. Stahlberg
UCSF: JJ Miranda, Y. Cheng
Stanford: A. Brunger, K. Garcia, T. Jardetzky
CalTech: P. Bjorkman, W. Clemons, G. Jensen, D. Rees
WesternU: M. Swairjo
UCSD: T. Nakagawa, H. Viadiu
Rosalind Franklin: D. Harrison
NIH: M. Mayer
U. Maryland: E. Toth
Rice University: E. Nikonowicz, Y. Shamoo, Y.J. Tao
Vanderbilt Center for Structural Biology: W. Chazin, B. Eichman, M. Egli, B. Lacy, M. Ohi, C. Sanders, B. Spiller, M. Stone, M. Waterman
Thomas Jefferson: J. Williams
Cornell U.: R. Cerione, B. Crane, S. Ealick, M. Jin, A. Ke, NE-CAT, R. Oswald, C. Parrish, H. Sondermann
UMass Medical: W. Royer
Brandeis U.: N. Grigorieff
Tufts U.: K. Heldwein
Columbia U.: Q. Fan
Rockefeller U.: R. MacKinnon
Yale U.: T. Boggon, D. Braddock, Y. Ha, E. Lolis, K. Reinisch, J. Schlessinger, F. Sigworth, F. Zhou
Harvard and Affiliates: N. Beglova, S. Blacklow, B. Chen, J. Chou, J. Clardy, M. Eck, B. Furie, R. Gaudet, M. Grant, S.C. Harrison, J. Hogle, D. Jeruzalmi, D. Kahne, T. Kirchhausen, A. Leschziner, K. Miller, A. Rao, T. Rapoport, M. Samso, P. Sliz, T. Springer, G. Verdine, G. Wagner, L. Walensky, S. Walker, T. Walz, J. Wang, S. Wong

Not Pictured:
    University of Toronto: L. Howell, E. Pai, F. Sicheri
    NHRI (Taiwan): G. Liou
    Trinity College, Dublin: Amir Khan
If the particle physicists can use it...
Open Science Grid




               opensciencegrid.org
Grid Computing
          Federated and scalable
          Secure
          Standardized
          Compute sharing & cycle scavenging
          Dynamic formation of collaborations
          Data sharing
Protein Structure Studies
Acknowledgements
Piotr Sliz
  PI and SBGrid team leader

Ian Levesque
  Systems Architect

Ben Eisenbraun
  Software Curator

Peter Doherty
  Grid Administrator

Caitlin Colgrove
  Intern Software Engineer

Steve Jahl
  System Administrator

Ian Stokes-Rees, http://sbgrid.org
Summary
          Compute power increasingly affordable
          New computational techniques
          New hardware (multi-core, GPU)
          Grid and cloud computing
          Fast networking, cheap storage
          Scientists developing necessary skills
          Be in touch - ijstokes@hkl.hms.harvard.edu
Extras
How to get a structural biologist using CI
    Ease of use
            No command line
            X.509 (initial request, VOs, proxies, roles, etc.) is really complicated
            Support infrastructure (mailing lists, tickets, phone, training)

    Killer apps
            They will use it if they see peers using it to advance scientific goals
            They will use it if some novel workflows or workflow patterns are established
            Data management is a big problem for everyone (see bonus, time permitting) -- we
            believe grid infrastructure could provide a solution

    Security
            Data needs to be secure ...
            ... but users still want to control sharing/access

    Roadblocks
            Reliability of underlying infrastructure and difficulty in debugging
            Applications tied to GUIs, rudimentary interfaces
Ian Stokes-Rees, SBGrid, Harvard Medical School                                         October 13th, 2009
Security Challenges
      Identity Management
              Mixture of .htpasswd, PAM, X.509, and application-specific IDs
              Complexity of X.509 (and associated paraphernalia) confuses users
              Account creation, use, and management

      Virtual Organization hierarchies and user-driven collaborations
              Inheritance of rights/policies
              How to allow users to easily create and manage groups

      Merging security policies
              Site/resource, VO, and user policies need to be merged

      Encryption and Privacy Preservation
              Generic mechanisms for encryption and key management
              Preserving privacy of actions and data in federated grid environment
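
The slide says that site/resource, VO, and user policies need to be merged but does not say how. As a minimal sketch of one possible rule, the Python below treats each policy layer as a mapping from principal to permitted actions and keeps an action only when every layer grants it. The intersection rule, the dictionary layout, and all names here are assumptions made for illustration, not the scheme used by SBGrid or any grid middleware.

```python
# Hypothetical policy merge: most restrictive layer wins (set intersection).
# Each policy maps a principal name to the set of actions that layer allows.

def merge_policies(*policies):
    """Return principal -> actions granted by every supplied policy layer."""
    if not policies:
        return {}
    # Only principals known to every layer can end up with any rights.
    principals = set(policies[0])
    for policy in policies[1:]:
        principals &= set(policy)
    merged = {}
    for principal in principals:
        allowed = set(policies[0][principal])
        for policy in policies[1:]:
            allowed &= set(policy[principal])
        if allowed:
            merged[principal] = allowed
    return merged


# Illustrative layers; the principals and actions are made up.
site_policy = {"alice": {"read", "write", "execute"}, "bob": {"read"}}
vo_policy = {"alice": {"read", "write"}, "bob": {"read", "write"}}
user_policy = {"alice": {"read"}, "bob": {"write"}}

effective = merge_policies(site_policy, vo_policy, user_policy)
```

With these example layers, alice ends up with read access only and bob with none, because no single action for bob is granted by all three layers. Other merge rules (for example, letting a user policy widen access to the user's own data) are equally plausible and would need a different combination step.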
Security Work
      Meta data system
              Provide more generic pointers to ACLs and encryption keys

      Extension of GACL system
              Include non-X.509 ID tokens as policy principals
              Allow GACL policies to apply to web framework objects (pyGACL)

      Simple replicated key system for file encryption
              Use of meta-data framework to point to encryption key (and replicas)
              Use GACL to control key access (regular file)
              Libraries to automatically read/write encrypted files

      Future
              VO hierarchies
              Tools for user driven ACL management
              Tools for policy management (merging site, VO and user policies)
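
The replicated key idea above (metadata points to an encryption key and its replicas, and an access-control list guards the key) can be sketched roughly as follows. This is an illustrative model only: the record layout, the `fetch_key` helper, and the in-memory `key_stores` dictionary are hypothetical, the access list is a plain set rather than real GACL, and no actual encryption is performed.

```python
from dataclasses import dataclass, field


@dataclass
class FileMetadata:
    """Hypothetical metadata record for one encrypted file."""
    key_replicas: list                               # locations holding copies of the key
    key_readers: set = field(default_factory=set)    # principals allowed to read the key


class KeyAccessDenied(Exception):
    """Raised when a principal is not on the key's access list."""


def fetch_key(principal, meta, key_stores):
    """Check the access list, then return the key from the first replica that has it."""
    if principal not in meta.key_readers:
        raise KeyAccessDenied(principal)
    for location in meta.key_replicas:
        key = key_stores.get(location)   # a missing entry models an unreachable replica
        if key is not None:
            return key
    raise LookupError("no key replica reachable")


# Illustrative use: the first replica is unavailable, the second one answers.
key_stores = {"replica-b": b"example-key-bytes"}
meta = FileMetadata(key_replicas=["replica-a", "replica-b"], key_readers={"alice"})
key = fetch_key("alice", meta, key_stores)
```

Keeping the key behind its own access list means file sharing can be changed by editing who may read the key, without re-encrypting the file itself; a library that reads and writes encrypted files would call something like `fetch_key` before decrypting.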
