World-wide in silico drug discovery against neglected and emerging
diseases on grid infrastructures

Dr Nicolas Jacq
HealthGrid association

Credit: the WISDOM collaboration
http://wisdom.healthgrid.org

International Symposium on Grids for Science and Business
12 June 2007

www.healthgrid.org
The HealthGrid association
•   The vision of HealthGrid is the deployment of e-infrastructures
    that make geographically distributed repositories of health-related
    data interoperable and integrate high-end processing services on
    top of them.

•   Some key aspects are:
     – The integration of health-related actors in grid projects
     – The integration of grid standards and medical informatics standards for
       interoperability
     – The deployment of pilots for new ways of research and new methods
     – The integration of the bioinformatics and medical informatics communities

•   The mission of HealthGrid is to foster communication among
    the key actors and to catalyse joint research actions at the
    international level
                                                            Jacq - 12 June 2007   2
Main achievements

•   Publication of the HealthGrid White Paper in 2005, outlining the
    concept, benefits and opportunities of applying grids to
    different applications in biomedicine and healthcare
     – http://whitepaper.healthgrid.org

•   Involvement as full partner in several projects
     – SHARE (SSA): http://www.eu-share.org
     – EGEE II (I3): http://www.eu-egee.org
     – ACGT (IP): http://www.eu-acgt.org

•   Organisation of the HealthGrid conference since 2003
     – HealthGrid.US Alliance will host the 6th International HealthGrid
       Conference in Chicago – Spring 2008

•   Development of the health grids knowledge base
     – http://kb.healthgrid.org
Content


• WISDOM, an initiative for grid-enabled drug discovery
  against neglected and emerging diseases

• Deployment and results of grid-enabled large scale
  virtual screening against malaria and avian influenza

• Deployment method

• Conclusion and perspectives


Goal of the WISDOM initiative
• WISDOM stands for World-wide In Silico Docking On Malaria

• Goal: contribute to the development of new drugs for neglected and
  emerging diseases, with a particular focus on malaria and avian flu

•   Distinctive approach: rely extensively on emerging information
    technologies to provide new tools and environments for drug discovery

•   Initial focus: virtual screening

•   Web site: http://wisdom.healthgrid.org




WISDOM collaboration

•   LPC Clermont-Ferrand: Biomedical grid, Web service
•   SCAI Fraunhofer: Knowledge extraction, Chemoinformatics
•   CEA, Acamba project: Malaria biology, Chemogenomics
•   Univ. Modena: Malaria biology, Molecular Dynamics
•   HealthGrid: Biomedical grid, Dissemination
•   ITB CNR: Bioinformatics, Molecular modelling
•   Academia Sinica: Grid user interface, Avian flu biology, In vitro testing
•   Univ. Los Andes: Bioinformatics, Malaria biology
•   Univ. Pretoria: Bioinformatics, Malaria biology
•   Mahidol Univ. Bangkok: In vitro testing
•   Chonnam Nat. Univ.: In vitro testing

Diagram legend: Partners, Associated labs, New

7 partners, 4 associated laboratories providing targets
and/or in vitro facilities
Benefits from using the grid (1/2)

•   World-wide distribution of malaria resistance
•   1975-2004: only 21 of the 1,556 new drugs marketed were for
    tropical diseases (Chirac P., Torreele E., Lancet, May 2006)

•   Neglected diseases continue to suffer from a lack of R&D

•   Grids make it possible to reduce costs




Benefits from using the grid (2/2)

•   The H5N1 virus has the potential to cause a large-scale pandemic
•   H5N1 may mutate and acquire drug resistance

•   Time is a critical factor in handling emerging diseases

•   Grids provide an accelerating factor




[Figure: deaths from all causes each week, expressed as an annual rate
 per 1000, plotted against months.
 Source: Ross E.G. Upshur, BA (Hons), MA, MD, MSc, CCFP, FRCPC]
In silico drug discovery

• Problem: development of a drug takes 12 to 15 years
  and costs approximately 800 million dollars

• Pipeline stages:
   – Target discovery: target identification, then target validation
   – Lead discovery: lead identification, then lead optimization
   – Clinical phases (I-III)
Grid impact on drug discovery workflow down to drug delivery (1/2)

•   Grids provide the necessary tools and data to identify new
    biological targets
     – Bioinformatics services (database replication, workflow…)
     – Resources for CPU intensive tasks (genomics comparative analysis,
       inverse docking…)

•   Grids provide the resources to speed up lead discovery
     – Large scale in silico docking to identify potentially promising
       compounds
     – Molecular dynamics computations to refine virtual screening and further
       assess selected compounds

•   Grids offer promising prospects for collaboration
    between public and private partners
     – Platform for information and knowledge sharing


Grid impact on drug discovery workflow down to drug delivery (2/2)

• Grids provide environments for epidemiology
   – Federation of databases to collect data in endemic areas, to
     study a disease and to evaluate the impact of vaccines and
     vector control measures
   – Resources for data analysis and mathematical modelling


• Grids provide the services needed for clinical trials
   – Federation of databases to collect data in the centres
     participating in the clinical trials


• Grids provide the tools to monitor drug delivery
   – Federation of databases to monitor drug delivery

Content


• WISDOM, an initiative for grid-enabled drug discovery
  against neglected and emerging diseases

• Deployment and results of grid-enabled large scale
  virtual screening against malaria and avian influenza

• Deployment method

• Conclusion and perspectives


Virtual screening by docking

• Docking: predict how small molecules bind to a receptor
  of known 3D structure

• Workflow:
   – Inputs: compound database and target structure model
   – Docking
   – Predicted binding models
   – Post-analysis
   – Compounds for assay
Grid-enabled high throughput virtual screening by docking

• Millions of potential drugs to test against interesting proteins!
   – Compounds: ZINC (4.3M), Chembridge (500,000)
   – Targets: PDB (3D structures)

• High Throughput Screening: 1-10$/compound, several hours
   – Too costly for neglected diseases!

• Molecular docking (FlexX, Autodock): ~1 to 15 minutes
• Data challenge on EGEE: ~2 to 30 days on ~5,000 computers
   – Cheap and fast!

• Downstream: selection of the best hits, then hits screening using
  assays performed on living cells, then leads, clinical testing, drug
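
As a rough back-of-the-envelope check of the docking figures on this slide, the sketch below estimates the wall-clock time needed to dock one compound library against one target. The function name and the assumption of perfect parallel efficiency are illustrative and do not come from the slides; a real data challenge covers several targets and parameter sets and loses time to scheduling and failures, so it will take longer than this idealised figure.

```python
def docking_wall_clock_days(n_compounds, minutes_per_docking, n_computers):
    """Idealised wall-clock days to dock every compound once against one target.

    Assumes perfect load balancing and no grid overhead (an illustrative
    simplification, not a measured WISDOM figure).
    """
    total_cpu_minutes = n_compounds * minutes_per_docking
    return total_cpu_minutes / n_computers / (60 * 24)


zinc = 4_300_000  # ZINC library size quoted on the slide
fast = docking_wall_clock_days(zinc, 1, 5_000)    # 1 minute per docking
slow = docking_wall_clock_days(zinc, 15, 5_000)   # 15 minutes per docking
print(f"{fast:.1f} to {slow:.1f} days per target on 5,000 computers")
```

Under these assumptions a single target takes between well under a day and about nine days, which is the same order of magnitude as the range quoted for the data challenges.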
Statistics of deployment
•   First Data Challenge: July 1st - August 15th 2005
     – Target: malaria
     – 80 CPU years, 1 TB of data produced, 1,700 CPUs used in parallel
     – 1st large scale docking deployment world-wide on an e-infrastructure

•   Second Data Challenge: April 15th - June 30th 2006
     – Target: avian flu
     – 100 CPU years, 800 GB of data produced, 1,700 CPUs used in parallel
     – Collaboration initiated on March 1st: deployment preparation achieved in 45
       days

•   Third Data Challenge: October 1st - December 15th 2006
     – Target: malaria
     – 400 CPU years, 1.6 TB of data produced, up to 5,000 CPUs used in parallel
     – Very high docking throughput: > 100,000 compounds per hour
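
To relate the CPU-year totals to the number of CPUs in use, the sketch below converts a total CPU time and a campaign window into an average number of CPUs busy at once. The function name is illustrative and the 365.25-day year is an assumption made here; the dates and the 400 CPU years are the third data challenge figures from the slide.

```python
from datetime import date


def average_cpus(cpu_years, start, end):
    """Average number of CPUs busy at once over the campaign window."""
    days = (end - start).days
    return cpu_years * 365.25 / days


# Third data challenge: 400 CPU years between 1 October and 15 December 2006
third = average_cpus(400, date(2006, 10, 1), date(2006, 12, 15))
print(f"about {third:.0f} CPUs busy on average")
```

The average comes out well below the 5,000 peak, which is consistent with the slide describing 5,000 as an upper bound ("up to") rather than a sustained level.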




A huge international effort for the third data challenge

[Pie chart: contribution by infrastructure.
 Shares shown: 1%, 2%, 2%, 3%, 3%, 3%, 3%, 5%, 6%, 7%, 12%, 15%, 38%.
 Legend: EGEE Germany Switzerland, EGEE Asia Pacific, EGEE Russia,
 Auvergrid, EuChinaGrid, EELA, EGEE South Western Europe,
 EGEE Central Europe, EGEE Northern Europe, EGEE Italy,
 EGEE South Eastern Europe, EGEE France, EGEE UKI]

• Over 420 CPU years in 10 weeks
• A record throughput of 100,000 docked compounds per hour

• WISDOM calculations used FlexX from BioSolveIT
  (6,000 free floating licenses)
Biological objectives

• Malaria
   – Plasmepsin
   – DHFR Plasmodium falciparum
   – DHFR Plasmodium vivax
   – GST
   – Tubulin

• Avian influenza
   – Neuraminidase N1

[Figure labels: N1, H5. Credit: Y-T Wu (ASGC)]
Results from avian flu data challenge (1/2)

•   5 out of 6 known effective inhibitors can be identified in the first
    15% of the ranking and in the first 5% reranked (2,250 compounds)
     – Enrichment: (5/6)/(15%x5%) = 111 (<1 in most cases)

•   Most known effective inhibitors lose their affinity in binding with a
    mutated target
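
The enrichment figure quoted on this slide can be reproduced with a few lines of arithmetic. The sketch below is only a worked version of the slide's own formula; the function and variable names are illustrative.

```python
def enrichment_factor(found, known, selected_fraction):
    """Fraction of known actives recovered divided by the fraction of the library kept."""
    return (found / known) / selected_fraction


# Two-stage selection from the slide: top 15% of the ranking,
# then top 5% of that subset after re-ranking
selected = 0.15 * 0.05
ef = enrichment_factor(found=5, known=6, selected_fraction=selected)
print(round(ef))
```

An enrichment of 1 corresponds to random selection, so a value above 100 means the known inhibitors are concentrated in a very small part of the ranked library.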


[Figure: ranking position of GNA (zanamivir) relative to the 15% cut-off.
 Original type: GNA at 2.4%. E119A mutated type: GNA at 11.5%]
Results from avian flu data challenge (2/2)

•   Experimental assay confirms 7 actives out of 123 purchased
    “potential hits” (interacting complexes with higher affinities and
    proper docked poses) = 6%
•   Average success rate of in vitro testing = 0.1%
•   To be confirmed on more hits; tests are running at Univ. of
    Chonnam (South Korea)
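
To put the two percentages above side by side, the sketch below computes the hit rate of the docking-selected compounds and its ratio to the quoted average success rate. The names are illustrative; the numbers are the ones on the slide.

```python
def hit_rate(actives, tested):
    """Share of tested compounds confirmed active in the assay."""
    return actives / tested


screened = hit_rate(7, 123)   # compounds pre-selected by docking
baseline = 0.001              # 0.1% average in vitro success rate quoted on the slide
print(f"hit rate {screened:.1%}, about {screened / baseline:.0f}x the baseline")
```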


Results from first malaria data challenge

• Screening funnel:
   – 1,000,000 chemical compounds
   – Sorting based on scoring in different parameter sets;
     consensus scoring
   – 10,000 compounds selected
   – Filtering based on key interactions, binding modes, etc.
   – 1,000 compounds
   – MD (molecular dynamics)
   – 100 compounds will be tested in July by Univ. of
     Chonnam (South Korea)

Credit: V. Kasam, Fraunhofer Institute
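
The slide does not say how the consensus scoring was computed. Purely as an illustration of the idea, the sketch below uses one common approach, a rank-sum consensus: compounds are ranked within each parameter set and those with the lowest total rank are kept. The function name, the data layout, the made-up scores and the convention that lower scores are better are all assumptions made here, not the WISDOM procedure.

```python
def consensus_select(scores_by_set, keep):
    """Rank compounds within each parameter set, sum the ranks, keep the best.

    scores_by_set maps a parameter-set name to {compound_id: score}. Lower
    scores are treated as better, and every compound is assumed to be scored
    in every set (both are assumptions of this sketch).
    """
    rank_sum = {}
    for scores in scores_by_set.values():
        ordered = sorted(scores, key=scores.get)
        for rank, compound in enumerate(ordered):
            rank_sum[compound] = rank_sum.get(compound, 0) + rank
    # Break ties on the compound id so the result is deterministic
    return sorted(rank_sum, key=lambda c: (rank_sum[c], c))[:keep]


# Made-up scores for four hypothetical compounds under two parameter sets
toy = {
    "set_a": {"c1": -30.0, "c2": -25.0, "c3": -10.0, "c4": -28.0},
    "set_b": {"c1": -26.5, "c2": -27.0, "c3": -12.0, "c4": -26.0},
}
print(consensus_select(toy, keep=2))
```

A rank-based consensus avoids comparing raw scores across parameter sets, which may not share a common scale.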
Content


• WISDOM, an initiative for grid-enabled drug discovery
  against neglected and emerging diseases

• Deployment and results of grid-enabled large scale
  virtual screening against malaria and avian influenza

• Deployment method

• Conclusion and perspectives


Requirements for a deployment on grid

• Adaptation of the application to the grid


• Access to a large infrastructure providing maintained
  resources


• Use of a production system providing automated and
  fault-tolerant job and file management
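
The third requirement, automated and fault-tolerant job management, can be illustrated with a minimal resubmission loop. This is a sketch under stated assumptions, not the WISDOM production system: `run_job` is a hypothetical callable standing in for a real grid submission, and the retry limit is arbitrary.

```python
def run_with_resubmission(jobs, run_job, max_attempts=3):
    """Run every job, resubmitting failures up to max_attempts times.

    run_job is any callable that returns a result or raises an exception;
    here it stands in for a real grid submission (hypothetical interface).
    """
    results, failed = {}, {}
    for job in jobs:
        for attempt in range(1, max_attempts + 1):
            try:
                results[job] = run_job(job)
                break
            except Exception as error:
                # Record the job as failed only once all attempts are used up
                if attempt == max_attempts:
                    failed[job] = str(error)
    return results, failed


# Toy stand-in: job "b" fails on its first attempt, then succeeds
attempts = {}


def flaky(job):
    attempts[job] = attempts.get(job, 0) + 1
    if job == "b" and attempts[job] == 1:
        raise RuntimeError("node lost")
    return f"{job}: done"


results, failed = run_with_resubmission(["a", "b", "c"], flaky)
print(results, failed)
```

Keeping the list of permanently failed jobs separate from the results lets an operator inspect or resubmit them later without rerunning the whole campaign.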




Adaptation of the application to the grid

• The application codes cannot be modified and
  are not designed for grid computing.

• A common strategy is to split the application into
  shorter tasks

• License management for commercial software is
  not adapted to large infrastructures

[Diagram: the input database is split into data subsets; each subset,
 together with the parameters, is fed to the docking software, which
 produces the output. Embarrassingly parallel application]
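
The splitting strategy described on this slide can be sketched as follows: the compound list is cut into independent subsets, and each (target, subset) pair becomes one job that can run anywhere without talking to the others. The function names, identifiers and subset size are hypothetical and chosen only for illustration.

```python
def split_into_tasks(compound_ids, compounds_per_task):
    """Cut the compound list into consecutive, independent subsets."""
    return [
        compound_ids[start:start + compounds_per_task]
        for start in range(0, len(compound_ids), compounds_per_task)
    ]


def build_jobs(compound_ids, targets, compounds_per_task):
    """One job per (target, compound subset); jobs share no state."""
    subsets = split_into_tasks(compound_ids, compounds_per_task)
    return [(target, subset) for target in targets for subset in subsets]


compounds = [f"cmpd{i}" for i in range(10)]   # hypothetical identifiers
jobs = build_jobs(compounds, ["target_A", "target_B"], compounds_per_task=4)
print(len(jobs))
```

Because the docking software itself is left untouched, the subset size is the main tuning knob: smaller subsets give shorter jobs that are cheaper to resubmit, at the cost of more scheduling overhead.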
Grid Added Value

    • Large number of CPUs available

    • Reliable and secured Data Management Services
        – Sharing of results
        – Replication of the data
        – ACLs

    • Availability of the resources

[Figure: Real Time Monitor (Imperial College London),
 http://gridportal.hep.ph.ic.ac.uk/rtm/]
Grid infrastructures and projects contributing to the data challenges

[Figure: EMBRACE, BioinfoGrid, SHARE, EGEE, Auvergrid, EUMedGrid,
 EUChinaGrid, TWGrid, EELA.
 Legend: European grid infrastructure; European grid project;
 Regional/national grid infrastructure]
WISDOM production environment

[Figure. Credit: CNRS-IN2P3]
GUI designed by biologists

[Screenshot panels: Target selection, Compound selection,
 Docking parameter setter, Energy table, Complex visualization.
 Credit: H-C Lee (ASGC)]
Content


• WISDOM, an initiative for grid-enabled drug discovery
  against neglected and emerging diseases

• Deployment and results of grid-enabled large scale
  virtual screening against malaria and avian influenza

• Deployment method

• Conclusion and perspectives


Conclusion



• WISDOM proposes a new approach to drug discovery
  thanks to the grid
   – Rapid deployment of large scale virtual screening
   – Collaborative environment for the sharing of data in the
     research community

• First biochemical results demonstrate grid relevance
  to the drug discovery community




Perspectives

• Summer 2007
  – 2nd data challenge against avian flu
  – In vitro tests of the best molecules from the data challenges


• Winter 2007
  – Discussion with WHO and Novartis
       Targets provided by the Drug Target Portfolio Network from the
       Tropical Disease Research initiative
  – Discussion with Africa@home initiative
       WISDOM deployment on a desktop grid




Thank you

• To all members of the WISDOM collaboration for their
  contribution to the project (CNRS-IN2P3, ASGC, ITB-CNR,
  SCAI Fraunhofer, Univ of Modena…)

• To all grid nodes which committed resources and allowed
  the success of the initiative

• To all projects which supported the initiative by providing
  either computing resources or manpower to develop the
  WISDOM environment (EGEE, BioinfoGRID, Embrace,
  SHARE…)

• To BioSolveIT for offering up to 6,000 free licenses of FlexX



Grid07 6 Jacq

  • 1. World-wide in silico drug discovery against neglected and emerging diseases on grid infrastructures Dr Nicolas jacq HealthGrid association Credit : the WISDOM collaboration http://wisdom.healthgrid.org International Symposium on Grids for Science and Business 12 June 2007 www.healthgrid.org
  • 2. The HealthGrid association • The vision of HealthGrid is the deployment of e-infrastructures able to interoperate geographically distributed repositories of health-related data and the integration of high-end processing services on top of them. • Some key aspects are: – The integration of health-related actors in grid projects – The integration of grid standards and medical informatics standards for interoperability – The deployment of pilots for new ways of research and new methods – The integration of bioinformatics community and medical informatics • The mission of HealthGrid is to foster the communication among the different key actors and to catalyse joint research actions at international level Jacq - 12 June 2007 2
  • 3. Main achievements • Edition of the HealthGrid Whitepaper in 2005 outlining the concept, benefits and opportunities offered by applying grids in different applications in biomedicine and healthcare – http://whitepaper.healthgrid.org • Involvement as full partner in several projects – SHARE (SSA): http://www.eu-share.org – EGEE II (I3): http://www.eu-egee.org – ACGT (IP): http://www.eu-acgt.org • Organisation of the HealthGrid conference since 2003 – HealthGrid.US Alliance will host the 6th International HealthGrid Conference in Chicago – Spring 2008 • Development of the health grids knowledge base – http://kb.healthgrid.org Jacq - 12 June 2007 3
  • 4. Content • WISDOM, an initiative for grid-enabled drug discovery against neglected and emerging diseases • Deployment and results of grid-enabled large scale virtual screening against malaria and avian influenza • Deployment method • Conclusion and perspectives Jacq - 12 June 2007 4
  • 5. Goal of the WISDOM initiative • WISDOM stands for World-wide In Silico Docking On Malaria • Goal: contribute to develop new drugs for neglected and emerging diseases with a particular focus on malaria and avian flu • Specificity: extensively rely on emerging information technologies to provide new tools and environments for drug discovery • Initial focus: virtual screening • Web site: http://wisdom.healthgrid.org Jacq - 12 June 2007 5
  • 6. WISDOM collaboration LPC Clermont-Ferrand: SCAI Fraunhofer: Biomedical grid Knowledge extraction, Web service Chemoinformatics CEA, Acamba project: Univ. Modena: Malaria biology, Malaria biology, Chemogenomics Molecular Dynamics HealthGrid: ITB CNR: Academica Sinica: Biomedical grid, Bioinformatics, Grid user interface Dissemination Molecular modelling Avian flu biology In vitro testing Univ. Los Andes: Bioinformatics, New Malaria biology Chonnam Nat. Univ.: Univ. Pretoria: Mahidol Univ. Bangkok: In vitro testing Bioinformatics, In vitro testing Malaria biology Partners Associated labs 7 partners, 4 associated laboratories providing targets and/or in vitro facilities Jacq - 12 June 2007 6
  • 7. Benefits from using the grid (1/2) • World-wide distribution of malaria resistance • 1975-2004: Only 21 new drugs for tropical diseases on 1,556 were marketed (Chirac P. Toreele. E Lancet. May 2006) • Neglected diseases keep suffering lack of R&D • Grids allow reduced costs Jacq - 12 June 2007 7
  • 8. Benefits from using the grid (2/2) • H5N1 virus has the potential to cause a large-scale pandemic • H5N1 may mutate and acquire drug resistance • Time is a critical factor for handling emerging diseases • Grids provide an accelerating factor • Figure: deaths from all causes each week, expressed as an annual rate per 1000 (axis label: months). Source: Ross E.G. Upshur BA(HONS), MA, MD, MSc, CCFP, FRCPC
  • 9. In silico drug discovery • Problem: development of a drug takes 12 to 15 years and costs approximately 800 million dollars • Pipeline: Target discovery (Target Identification → Target Validation) → Lead discovery (Lead Identification → Lead Optimization) → Clinical Phases (I-III)
  • 10. Grid impact on drug discovery workflow down to drug delivery (1/2) • Grids provide the necessary tools and data to identify new biological targets – Bioinformatics services (database replication, workflow…) – Resources for CPU intensive tasks (genomics comparative analysis, inverse docking…) • Grids provide the resources to speed up lead discovery – Large scale in silico docking to identify potentially promising compounds – Molecular dynamics computations to refine virtual screening and further assess selected compounds • Grids offer promising prospects for collaboration between public and private partners – Platform for information and knowledge sharing
  • 11. Grid impact on drug discovery workflow down to drug delivery (2/2) • Grids provide environments for epidemiology – Federation of databases to collect data in endemic areas, to study a disease and to evaluate the impact of vaccines and vector control measures – Resources for data analysis and mathematical modelling • Grids provide the services needed for clinical trials – Federation of databases to collect data in the centres participating in the clinical trials • Grids provide the tools to monitor drug delivery – Federation of databases to monitor drug delivery
  • 12. Content • WISDOM, an initiative for grid-enabled drug discovery against neglected and emerging diseases • Deployment and results of grid-enabled large scale virtual screening against malaria and avian influenza • Deployment method • Conclusion and perspectives
  • 13. Virtual screening by docking • Docking: predict how small molecules bind to a receptor of known 3D structure • Workflow: compound database + target structure model → docking → predicted binding models → post-analysis → compounds for assay
  • 14. Grid-enabled high throughput virtual screening by docking • Millions of potential drugs to test against interesting proteins! • High Throughput Screening: 1-10$/compound, several hours. Too costly for neglected disease! • Compounds – ZINC: 4.3M – Chembridge: 500,000 • Targets – PDB: 3D structures • Molecular docking (FlexX, Autodock): ~1 to 15 minutes • Data challenge on EGEE: ~2 to 30 days on ~5,000 computers. Cheap and fast! • Pipeline: selection of the best hits → hits screening using assays performed on living cells → leads → clinical testing → drug
  • 15. Statistics of deployment • First Data Challenge: July 1st - August 15th 2005 – Target: malaria – 80 CPU years, 1 TB of data produced, 1,700 CPUs used in parallel – 1st large scale docking deployment world-wide on an e-infrastructure • Second Data Challenge: April 15th - June 30th 2006 – Target: avian flu – 100 CPU years, 800 GB of data produced, 1,700 CPUs used in parallel – Collaboration initiated on March 1st: deployment preparation achieved in 45 days • Third Data Challenge: October 1st - December 15th 2006 – Target: malaria – 400 CPU years, 1.6 TB of data produced, up to 5,000 CPUs used in parallel – Very high docking throughput: > 100,000 compounds per hour
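The CPU-year totals and dates on this slide imply an average level of parallelism for each data challenge. The Python sketch below derives it under two assumptions that are not stated in the talk: each challenge ran continuously between the quoted dates, and one CPU year equals 365.25 CPU days. The function and variable names are illustrative, and the printed averages are derived estimates, not reported results.

```python
from datetime import date

# (label, start date, end date, CPU years) as quoted on the slide
CHALLENGES = [
    ("first (malaria)", date(2005, 7, 1), date(2005, 8, 15), 80),
    ("second (avian flu)", date(2006, 4, 15), date(2006, 6, 30), 100),
    ("third (malaria)", date(2006, 10, 1), date(2006, 12, 15), 400),
]

DAYS_PER_YEAR = 365.25  # assumption: length of one CPU year in days


def average_cpus(start, end, cpu_years):
    """Average number of CPUs busy at once if the work is spread evenly."""
    elapsed_days = (end - start).days
    return cpu_years * DAYS_PER_YEAR / elapsed_days


for label, start, end, cpu_years in CHALLENGES:
    avg = average_cpus(start, end, cpu_years)
    print(f"{label}: about {avg:.0f} CPUs busy on average")
```

Under these assumptions the averages come out below the peak CPU counts quoted on the slide, which is what one would expect if the quoted figures describe the busiest periods.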
  • 16. A huge international effort for the third data challenge • Pie chart of contributions (shares from 1% to 38%): EGEE Germany Switzerland, EGEE Asia Pacific, EGEE Russia, Auvergrid, EuChinaGrid, EELA, EGEE South Western Europe, EGEE Central Europe, EGEE Northern Europe, EGEE Italy, EGEE South Eastern Europe, EGEE France, EGEE UKI • Over 420 CPU years in 10 weeks • A record throughput of 100,000 docked compounds per hour • WISDOM calculations used FlexX from BioSolveIT (6k free, floating licenses)
  • 17. Biological objectives • Malaria – Plasmepsin – DHFR Plasmodium falciparum – DHFR Plasmodium vivax – GST – Tubulin • Avian influenza – Neuraminidase N1 • Figure labels: N1, H5 • Credit: Y-T Wu (ASGC)
  • 18. Results from avian flu data challenge (1/2) • 5 out of 6 known effective inhibitors can be identified in the first 15% of the ranking and in the first 5% reranked (2,250 compounds) – Enrichment: (5/6)/(15% x 5%) = 111 (<1 in most cases) • Most known effective inhibitors lose their affinity in binding with a mutated target • Figure labels: original type, GNA 2.4%; E119A mutated type, GNA 11.5%; 15% cut-off; GNA = zanamivir
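The enrichment figure on this slide can be reproduced with a few lines of Python. This is a minimal sketch: the function name is illustrative, and the library size it prints is only what the quoted numbers imply (2,250 compounds being 15% x 5% of the screened set), not a value stated on the slide.

```python
def enrichment_factor(actives_found, actives_total, fraction_selected):
    """Recall of known actives divided by the fraction of the library kept."""
    return (actives_found / actives_total) / fraction_selected


first_cut = 0.15   # first 15% of the ranking
rerank_cut = 0.05  # first 5% after reranking
fraction_selected = first_cut * rerank_cut

ef = enrichment_factor(5, 6, fraction_selected)

# Implied by the slide's numbers, not stated on it
implied_library_size = 2250 / fraction_selected

print(f"fraction of library kept: {fraction_selected:.2%}")
print(f"enrichment factor: {ef:.0f}")
print(f"implied library size: {implied_library_size:,.0f} compounds")
```

An enrichment factor of 1 corresponds to random selection, so a value near 111 means the known inhibitors are concentrated in the retained subset roughly a hundred times more densely than chance would give.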
  • 19. Results from avian flu data challenge (2/2) • Experimental assay confirms 7 actives out of 123 purchased “potential hits” (interacting complexes with higher affinities and proper docked poses) = 6% • Average success rate of in vitro testing = 0.1% • To be confirmed on more hits; tests are running at Chonnam Nat. Univ. (South Korea) • Figure label: NA
  • 20. Results from first malaria data challenge • 1,000,000 chemical compounds → sorting based on scoring in different parameter sets; consensus scoring → 10,000 compounds selected → selection based on key interactions, binding modes, etc. → 1,000 compounds → MD → 100 compounds • The 100 compounds will be tested in July by Univ. of Chonnam (South Korea) • Credit: V. Kasam, Fraunhofer Institute
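The first filtering step on this slide, sorting by score under several parameter sets and then applying consensus scoring, can be illustrated as follows. The slide does not say which consensus rule WISDOM used; this sketch assumes a simple intersection rule (keep a compound only if it ranks in the best fraction under every parameter set) and assumes that a lower score means a better predicted binding. The compound names and scores are made up for illustration.

```python
def top_fraction(scores, fraction):
    """Compounds in the best-scoring fraction (assumption: lower is better)."""
    keep = max(1, int(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get)
    return set(ranked[:keep])


def consensus_selection(score_sets, fraction):
    """Keep compounds that reach the top fraction under every parameter set."""
    selections = [top_fraction(scores, fraction) for scores in score_sets]
    return set.intersection(*selections)


# Hypothetical scores for four compounds under two parameter sets
scores_a = {"c1": -30.0, "c2": -25.0, "c3": -10.0, "c4": -5.0}
scores_b = {"c1": -28.0, "c2": -8.0, "c3": -27.0, "c4": -3.0}

selected = consensus_selection([scores_a, scores_b], fraction=0.5)
print(sorted(selected))
```

Only the compound that scores well under both parameter sets survives, which is the point of a consensus step: it discards compounds whose good rank depends on one particular parameter choice.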
  • 21. Content • WISDOM, an initiative for grid-enabled drug discovery against neglected and emerging diseases • Deployment and results of grid-enabled large scale virtual screening against malaria and avian influenza • Deployment method • Conclusion and perspectives
  • 22. Requirements for a deployment on grid • Adaptation of the application to the grid • Access to a large infrastructure providing maintained resources • Use of a production system providing automated and fault-tolerant job and file management
  • 23. Adaptation of the application to the grid • The application codes cannot be modified and are not designed for grid computing • A common strategy is to split the application into shorter tasks • License management for commercial software is not adapted for large infrastructure • Diagram (embarrassingly parallel application): DB → DB subset; input data (DB subset, parameters) → docking software → output
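The splitting strategy on this slide can be sketched in Python. This is an illustration under stated assumptions, not the WISDOM production code: `dock_subset` is a hypothetical stand-in for one grid job running the unmodified docking software on a database subset, the score it returns is a placeholder, and a local thread pool stands in for grid job submission.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_subsets(compounds, subset_size):
    """Cut the compound database into independent subsets, one per task."""
    return [compounds[i:i + subset_size]
            for i in range(0, len(compounds), subset_size)]


def dock_subset(target, subset):
    """Hypothetical stand-in for one job: dock every compound in the subset.

    The returned score is a placeholder, not a real docking energy.
    """
    return [(compound, target, float(len(compound))) for compound in subset]


def run_screening(target, compounds, subset_size, max_workers=4):
    """Run the independent tasks in parallel and merge their outputs in order."""
    subsets = split_into_subsets(compounds, subset_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partial_results = list(pool.map(lambda s: dock_subset(target, s), subsets))
    return [row for part in partial_results for row in part]


compounds = [f"compound_{i}" for i in range(10)]
output = run_screening("target_A", compounds, subset_size=3)
print(f"{len(output)} docking results from "
      f"{len(split_into_subsets(compounds, 3))} tasks")
```

Because no task depends on another, the tasks can run on any available CPU in any order, which is what makes the application embarrassingly parallel. The subset size sets the task length and is the main tuning knob.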
  • 24. Grid Added Value • Large number of CPUs available • Reliable and secured Data Management Services – Sharing of results – Replication of the data – ACLs • Availability of the resources • Figure: Real Time Monitor (Imperial College London), http://gridportal.hep.ph.ic.ac.uk/rtm/
  • 25. Grid infrastructures and projects contributing to the data challenges • Contributors shown: EMBRACE, BioinfoGrid, SHARE, EGEE, Auvergrid, EUMedGrid, EUChinaGrid, TWGrid, EELA • Legend: European grid infrastructure; European grid project; regional/national grid infrastructure
  • 26. WISDOM production environment • Credit: CNRS-IN2P3
  • 27. GUI designed by biologists • Panels: compound selection, complex visualization, target selection, energy table, docking parameter setter • Credit: H-C Lee (ASGC)
  • 28. Content • WISDOM, an initiative for grid-enabled drug discovery against neglected and emerging diseases • Deployment and results of grid-enabled large scale virtual screening against malaria and avian influenza • Deployment method • Conclusion and perspectives
  • 29. Conclusion • WISDOM proposes a new approach to drug discovery thanks to the grid – Rapid deployment of large scale virtual screening – Collaborative environment for the sharing of data in the research community • First biochemical results demonstrate grid relevance to the drug discovery community
  • 30. Perspectives • Summer 2007 – 2nd data challenge against avian flu – In vitro tests of the best molecules from the data challenges • Winter 2007 – Discussion with WHO and Novartis: targets provided by the Drug Target Portfolio Network from the Tropical Disease Research initiative – Discussion with Africa@home initiative: WISDOM deployment on a desktop grid
  • 31. Thank you • To all members of the WISDOM collaboration for their contribution to the project (CNRS-IN2P3, ASGC, ITB-CNR, SCAI Fraunhofer, Univ of Modena…) • To all grid nodes which committed resources and allowed the success of the initiative • To all projects which supported the initiative by providing either computing resources or manpower to develop the WISDOM environment (EGEE, BioinfoGRID, Embrace, SHARE…) • To BioSolveIT for offering up to 6,000 free licenses of FlexX