Understanding and Classifying Metabolite Space and Metabolite-Likeness
PLoS One (in press)


   Julio E. Peironcely  @peyron
   Juliopeironcely.com
   PhD student at Leiden University and TNO
Metabolomics

   The quantitative and qualitative analysis of all metabolites in
   samples of cells, body fluids, tissues, etc.

Metabolomics

Workflow (step -> output):
   Biological question
   Experimental design -> Protocol
   Sampling
   Sample preparation -> Samples
   Data acquisition -> Raw data
   Data pre-processing -> List of peaks/biomolecules
   Data analysis -> Relevant biomolecules/connectivities & Models
   Biological interpretation

Metabolites

What do metabolites look like?
HMDB          ZINC
 8K           21M



Metabolites vs. non-metabolites, compared on:
   Water solubility
   Molecular weight (MW)
   Number of C atoms
   Structural complexity
   Polar surface area (PSA)

PCA




Not so different
Decision Tree




Lots of candidate structures
Elemental Composition -> Structure Generation -> Molecules

We are looking for metabolites
Elemental Composition -> Structure Generation -> Molecules -> Metabolite Likeness -> Metabolites

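The flow on this slide (generate candidate molecules from an elemental composition, then keep the metabolite-like ones) can be sketched as a score-and-filter step. Everything named here is hypothetical and not taken from the slides: `filter_candidates`, the `score` callable standing in for a trained metabolite-likeness model, the 0.5 threshold, and the toy candidate names and scores.

```python
from typing import Callable, Iterable, List, Tuple


def filter_candidates(
    candidates: Iterable[str],
    score: Callable[[str], float],
    threshold: float = 0.5,  # assumed cut-off, not a value from the slides
) -> List[Tuple[str, float]]:
    """Keep candidates whose likeness score reaches the threshold, best first."""
    scored = [(c, score(c)) for c in candidates]
    kept = [(c, s) for c, s in scored if s >= threshold]
    # Sort by descending score so the most metabolite-like structures come first.
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Toy stand-in for a trained classifier: a lookup table of made-up scores.
toy_scores = {"cand_A": 0.91, "cand_B": 0.12, "cand_C": 0.67}
ranked = filter_candidates(toy_scores, toy_scores.get)
print(ranked)
```

In practice the `score` callable would wrap whichever classifier and representation were selected; the filter itself does not depend on that choice.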
Metabolite-likeness
Representation + Classification

Data:
   HMDB (8K)
   ZINC (21M)

Representation:
   Atom Counts
   Physicochemical descriptors
   MDL Public Keys
   FCFP_4
   ECFP_4

Classification:
   Support Vector Machines (SVM)
   Random Forest (RF)
   Naïve Bayes (NB)

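The simplest representation on this slide, atom counts, might look like the sketch below. It assumes the input is a flat molecular formula string without parentheses, charges, or isotopes; the slides do not say how the counts were actually computed, and the element order in `count_vector` is an arbitrary choice made for illustration.

```python
import re
from collections import Counter

# Element symbol (capital letter + optional lowercase) followed by an optional count.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def atom_counts(formula: str) -> Counter:
    """Count atoms per element in a flat formula such as 'C4H6O5'."""
    counts: Counter = Counter()
    for symbol, number in _TOKEN.findall(formula):
        counts[symbol] += int(number) if number else 1
    return counts


def count_vector(formula: str, elements=("C", "H", "N", "O", "P", "S")) -> list:
    """Fixed-length descriptor: one count per element, in the given (assumed) order."""
    counts = atom_counts(formula)
    return [counts[e] for e in elements]


# Malic acid, one of the compounds named later in the deck, is C4H6O5.
print(count_vector("C4H6O5"))  # [4, 6, 0, 5, 0, 0]
```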
Metabolite-likeness

HMDB (8K), ZINC (21M) -> Standardization -> Diversity Selection

Representations: Atom Counts, Physicochemical descriptors, MDL Public Keys, FCFP_4, ECFP_4

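The slide names a diversity selection step but not the algorithm behind it. As an illustration only, here is a greedy MaxMin picker, a common choice for this kind of step and not necessarily the one used in the study. Fingerprints are assumed to be stored as sets of on-bit indices, and `tanimoto_distance` and `maxmin_pick` are names introduced for this sketch.

```python
from typing import FrozenSet, List, Sequence


def tanimoto_distance(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """1 - |a & b| / |a | b| for fingerprints stored as sets of on-bits."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / union


def maxmin_pick(fps: Sequence[FrozenSet[int]], n_pick: int, seed_index: int = 0) -> List[int]:
    """Greedy MaxMin: repeatedly add the item farthest from those already picked."""
    if not fps or n_pick <= 0:
        return []
    picked = [seed_index]
    # Distance from every item to its nearest picked item so far.
    nearest = [tanimoto_distance(fp, fps[seed_index]) for fp in fps]
    remaining = set(range(len(fps))) - {seed_index}
    while remaining and len(picked) < n_pick:
        # sorted() makes ties resolve to the lowest index, so runs are repeatable.
        candidate = max(sorted(remaining), key=lambda i: nearest[i])
        picked.append(candidate)
        remaining.discard(candidate)
        for i in remaining:
            nearest[i] = min(nearest[i], tanimoto_distance(fps[i], fps[candidate]))
    return picked


# Toy fingerprints: items 0, 1 and 3 overlap heavily, item 2 shares nothing with them.
fps = [frozenset({1, 2, 3}), frozenset({1, 2, 4}), frozenset({7, 8, 9}), frozenset({1, 2, 3, 4})]
print(maxmin_pick(fps, 2))  # [0, 2]
```

Each round costs one pass over the remaining items, so the sketch is adequate for thousands of structures; a library of millions would need a sampled or clustered variant.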
Metabolite-likeness

HMDB (8K), ZINC (21M) -> Standardization -> Diversity Selection
   -> Training Set: 532 + 532
   -> Test Set: 6.4K + 6.4K

Representations: Atom Counts, Physicochemical descriptors, MDL Public Keys, FCFP_4, ECFP_4

Metabolite-likeness

HMDB (8K), ZINC (21M) -> Standardization -> Diversity Selection
   -> Training Set: 532 + 532 -> 5-fold CV: SVM, RF, BC (the Naïve Bayes classifier)
   -> Test Set: 6.4K + 6.4K

Representations: Atom Counts, Physicochemical descriptors, MDL Public Keys, FCFP_4, ECFP_4

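The 5-fold cross-validation on the training set can be sketched as a stratified split, so that every fold keeps the same balance between the two classes. This assumes "532 + 532" means 532 metabolites plus 532 non-metabolites; `stratified_kfold` and the fixed seed are choices made for this illustration, not details from the slides.

```python
import random
from typing import Dict, List, Sequence, Tuple


def stratified_kfold(
    labels: Sequence[int], k: int = 5, seed: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, validation_indices) pairs with balanced classes."""
    rng = random.Random(seed)
    by_class: Dict[int, List[int]] = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    folds: List[List[int]] = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin into the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)

    splits = []
    for i in range(k):
        validation = sorted(folds[i])
        train = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        splits.append((train, validation))
    return splits


# Assumed reading of the slide: label 1 = metabolite, label 0 = non-metabolite.
labels = [1] * 532 + [0] * 532
splits = stratified_kfold(labels, k=5)
for train, validation in splits:
    print(len(train), len(validation))
```

Each model is then trained on the `train` indices and scored on the `validation` indices, and the five scores are averaged.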
Metabolite-likeness

3 classifiers x 5 representations

HMDB (8K), ZINC (21M) -> Standardization -> Diversity Selection
   -> Training Set: 532 + 532 -> 5-fold CV: SVM, RF, BC
   -> Test Set: 6.4K + 6.4K -> Metabolite likeness

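Crossing the 3 classifiers with the 5 molecular representations gives the full set of models to compare. A minimal way to enumerate that grid is shown below; the strings are labels copied from the slides, not calls into any particular toolkit.

```python
from itertools import product

classifiers = ["SVM", "RF", "BC"]
representations = [
    "Atom Counts",
    "Physicochemical desc.",
    "MDL Public Keys",
    "FCFP_4",
    "ECFP_4",
]

# Every (classifier, representation) pair is one model to train and evaluate.
model_grid = list(product(classifiers, representations))
print(len(model_grid))  # 15
for classifier, representation in model_grid:
    print(f"{classifier} on {representation}")
```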
Metabolite-likeness

Model                                             Sensitivity   Specificity   AUC
Best: RF, MDL Public Keys                         99.84%        87.52%        99.20%
Bad:  BC, physicochemical descriptors (P_desc)    42.51%        86.56%        61.57%

HMDB (8K), ZINC (21M) -> Standardization -> Diversity Selection
   -> Training Set: 532 + 532 -> 5-fold CV: SVM, RF, BC
   -> Test Set: 6.4K + 6.4K -> Metabolite likeness

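For reference, the three metrics reported on this slide can be computed as in the sketch below. It assumes "metabolite" is the positive class (label 1), which the slides do not state, and the labels and scores are toy values rather than results from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: three metabolites (1) and three non-metabolites (0).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 is an assumed threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc(y_true, scores):.2f}")
```

The pairwise AUC is quadratic in the number of molecules, which is fine for a sketch; a rank-based formulation is the usual choice for larger test sets.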
Metabolite-likeness, external validation

HMDB external validation set
DrugBank, ChEMBL -> Random Selection
All sets -> Standardization -> Metabolite likeness

Met-likeness + structure generation
(methylhistamine) 260K

                                          71%
     46%




Met-likeness + structure generation
(malic acid) 8K

                                          100%

57%          77%




Conclusions

   Prediction is good; interpretation is not
   Useful in different fields
   Local models are needed

Acknowledgements

   Leiden University: Theo Reijmers, Thomas Hankemeier
   University of Cambridge: Andreas Bender
   TNO Quality of Life: Leon Coulier
   HMP, University of Alberta: David Wishart, Ying (Edison) Dong

