Donovan N. Chin & R. Aldrin Denny
 Traditional Drug Discovery (insert graph)
 In Silico Prediction of ADME (insert graph)
◦ Potency
◦ Absorption
◦ Lead
◦ Drug
◦ Toxicity
◦ Excretion
◦ Metabolism
◦ Distribution
 Target IVY(Brute force virtual screening of
very large compound libraries) Lead
Discovery IVY(Utilize predictive models
from Biogen data for more efficient virtual
screening) Lead Optimization candidate
 (insert graph)
◦ Potency
◦ Lead
◦ Drug
◦ Toxicity
◦ Excretion
◦ Metabolism
◦ Distribution
◦ Absorption
 Goal: identify the crystallographic binding mode and rank-order ligands by their binding to the protein
 (insert graph)
 Receptor Docking
 Ligand Shape
 Generate plausible trial binding modes with a docking function, then re-rank the modes with a scoring function
 (insert graph)
 341 Active
 47 Non-Active
 (insert graph)
 After filtering by Pharmacophore Feature
 (insert graph)
 (insert functions for)
◦ F_Score*
◦ D_Score
◦ G_Score
◦ PMF_Score
◦ Chem_Score
◦ ICM_Score*
 Cell Adhesion Assay (50% Serum)
◦ (insert graph)
 Biochemical Adhesion Assay
◦ (insert graph)
 Scoring Functions Are Poor More Often Than
Not
 Receptor Site View
 Library Design
 FlexX Score
 Consensus Score >= 3
 e.g. Contact Map, CLogP, MW, HBOND, Rotatable bonds
 Consensus = 5? If yes, substructure exists? If yes, Pharmacophore < 4.2 Å? If yes, publish Hit Report
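The decision chain above can be read as a simple triage function. The following is only a sketch of one plausible reading, not the authors' workflow: it assumes the consensus score is the number of scoring functions whose cutoff a compound passes, and the record type, field names and return labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


# Hypothetical record: none of these field names come from the slides.
@dataclass
class DockedCompound:
    name: str
    passes: Dict[str, bool]        # scoring function -> passed its cutoff?
    has_substructure: bool         # required substructure present?
    pharmacophore_distance: float  # distance to the pharmacophore feature, in Å


def consensus_score(compound: DockedCompound) -> int:
    """Assumed definition: count of scoring functions the compound passes."""
    return sum(compound.passes.values())


def triage(compound: DockedCompound, keep_at: int = 3, report_at: int = 5,
           max_distance: float = 4.2) -> str:
    """Apply the thresholds named on the slide (>= 3, = 5, < 4.2 Å) in order."""
    score = consensus_score(compound)
    if score < keep_at:
        return "rejected"
    if (score == report_at and compound.has_substructure
            and compound.pharmacophore_distance < max_distance):
        return "hit report"
    return "kept for filtering"


# Made-up example compounds, for illustration only.
strong = DockedCompound(
    "cmpd-1",
    {"F": True, "D": True, "G": True, "PMF": True, "Chem": True},
    has_substructure=True,
    pharmacophore_distance=3.1,
)
weak = DockedCompound(
    "cmpd-2",
    {"F": True, "D": False, "G": False, "PMF": True, "Chem": False},
    has_substructure=True,
    pharmacophore_distance=3.1,
)
print(triage(strong), "/", triage(weak))
```

Keeping the thresholds as parameters makes it easy to see how sensitive the hit list is to each cutoff.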
 (insert graph)
 Goal: Predict hit/miss class based on presence of features
(fingerprints)
 Method
◦ Given a set of N samples
◦ Given that some subset A of them are good (‘active’)
 Then we estimate for a new compound: P(good) ~ A/N
◦ Given a set of binary features F
 For a given feature F:
 It appears in N samples (here N and A count only the samples containing F)
 It appears in A good samples
 Can we estimate P(good | F) ~ A/N?
 (Problem: the error gets worse as N becomes small)
◦ P’(good | F) = (A + P(good)·k)/(N + k)
 P’(good | F) → P(good) as N → 0
 P’(good | F) → A/N as N becomes large
◦ (If k = 1/P(good), this is the Laplacian correction)
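A small numerical sketch may help show how the smoothed estimate P'(good | F) = (A + P(good)·k)/(N + k) behaves at the two limits. The function name, argument names and the counts below are illustrative assumptions, not taken from the slides.

```python
from typing import Optional


def smoothed_estimate(a_f: int, n_f: int, p_good: float,
                      k: Optional[float] = None) -> float:
    """P'(good | F) = (A + P(good) * k) / (N + k).

    a_f    : number of good samples that contain feature F (A on the slide)
    n_f    : number of samples that contain feature F (N on the slide)
    p_good : baseline probability that a sample is good
    k      : number of virtual samples; defaults to 1 / P(good), the
             choice the slide calls the Laplacian correction
    """
    if k is None:
        k = 1.0 / p_good
    return (a_f + p_good * k) / (n_f + k)


# Hypothetical baseline: 10% of all samples are good.
p_good = 0.1

# Feature never seen: the estimate falls back to the baseline.
print(smoothed_estimate(0, 0, p_good))

# Feature seen twice, both times in good samples: raw A/N would be 1.0,
# but the smoothed estimate stays much closer to the baseline.
print(smoothed_estimate(2, 2, p_good))

# Feature seen often: the estimate approaches the raw ratio A/N = 0.5.
print(smoothed_estimate(1000, 2000, p_good))
```

With k = 1/P(good) the formula simplifies to (A + 1)/(N + 1/P(good)), so a rare feature needs repeated evidence before it moves the estimate far from the baseline.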
 Descriptors (insert)
 Advantages
◦ Can describe a huge number of features (up to 4 billion; MDL: 1,024; Leadscope: 27,000)
◦ Contains tertiary and stereochemistry information
◦ Fast
 Classification Analysis
◦ Developing Non-Linear Scoring Functions to classify
actives and non-actives
◦ (insert graphs)
◦ Cost function to minimize: Gini impurity at node N, i(N) = 1 − Σ_j P²(ω_j)
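As an illustration of the cost function, the sketch below computes the Gini impurity 1 − Σ P²(ω) of a set of class labels and the impurity decrease achieved by a candidate split. The helper names and the toy labels are assumptions made for this example; CART itself also handles split search, pruning and more.

```python
from collections import Counter
from typing import Hashable, Sequence


def gini_impurity(labels: Sequence[Hashable]) -> float:
    """Return 1 - sum_j P(class_j)^2 for the labels at one node."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def impurity_decrease(labels: Sequence[Hashable],
                      goes_left: Sequence[bool]) -> float:
    """Parent impurity minus the size-weighted impurity of the two children."""
    left = [y for y, m in zip(labels, goes_left) if m]
    right = [y for y, m in zip(labels, goes_left) if not m]
    n = len(labels)
    weighted = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(labels) - weighted


# Toy node: two actives and two non-actives.
labels = ["active", "active", "inactive", "inactive"]
print(gini_impurity(labels))

# A split that separates the classes perfectly removes all impurity.
print(impurity_decrease(labels, [True, True, False, False]))

# A split that leaves both children mixed removes none.
print(impurity_decrease(labels, [True, False, True, False]))
```

A tree builder would evaluate many candidate splits this way and keep the one with the largest decrease.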
 Training Set Prediction Success
 (insert table)
 10-fold cross validation
 Randomly split training and test sets
 Significant Improvement in Separating Actives
from Non-Actives
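For readers unfamiliar with 10-fold cross validation on randomly split data, the following sketch generates the train/test index sets using only the standard library. The function name, the seed and the sample count are illustrative assumptions; the slides do not say how the splits were produced.

```python
import random
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int = 10,
                   seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k-fold cross validation.

    Indices are shuffled once, dealt into k folds, and each fold serves
    as the test set exactly once while the other folds form the training set.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for test_no, test in enumerate(folds):
        train = [j for fold_no, fold in enumerate(folds)
                 if fold_no != test_no for j in fold]
        yield train, test


# Hypothetical data set of 25 compounds.
splits = list(k_fold_indices(25, k=10, seed=42))
for train, test in splits[:2]:
    print(len(train), "training samples,", len(test), "test samples")
```

Because every sample lands in a test fold exactly once, the averaged test performance uses all of the data without ever scoring a model on samples it was trained on.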
 (insert graph)
 Significant Improvement in Finding Hits Using
New SF
 Optimal tree identified (insert graph)
 No random effects (insert graph)
 (insert cluster)
 Able to identify different molecular property
criteria that lead to hits
 (insert graph)
 (insert graph)
 Size= magnitude of OBA
 OBA values cover range of descriptor space
 (insert graph)
 Choose 1D & 2D descriptors for ease of interpretation and lower “noise”
 Build Model (insert graphs) Apply Model
 Features found in high OBA
 Features found in low OBA
 It would be nice if CART offered a similar view
 Improved scoring functions for separating
hits from non-hits in structure-based drug
design developed with CART and Bayesian
models
 Identified key differences in molecular
physical properties that led to hits
 Built a reasonably predictive OBA model (given the complexity of OBA, however, the method cannot be expected to extend to other systems)
 Biogen IDEC
 Modeling
◦ Rajiah Denny
◦ Claudio Chuaqui
◦ Juswinder Singh
◦ Herman van Vlijmen
◦ Norman Wang
◦ Anuj Patel
◦ Zhan Deng
 Chemistry
◦ Kevin Guckian
◦ Dan Scott
◦ Thomas Durand-Reville
◦ Pat Conlon
◦ Charlie Hammond
◦ Chuck Jewell
 Pharmacology
◦ Tonika Bonhert

Improved Predictions in Structure Based Drug Design Using Cart and Bayesian Models
