Protein-Ligand Docking
Outline
• Introduction to protein-ligand docking
• Practical aspects
• Searching for poses
• Scoring functions
• Assessing performance
Computer-aided drug design (CADD)
• Known protein structure, known ligand(s)
– Structure-based drug design (SBDD)
– Protein-ligand docking
• Unknown protein structure, known ligand(s)
– Ligand-based drug design (LBDD)
– 1 or more ligands: similarity searching
– Several ligands: pharmacophore searching
– Many ligands (20+): Quantitative Structure-Activity Relationships (QSAR)
• Known protein structure, no known ligands
– De novo design
• Unknown protein structure, no known ligands
– CADD of no use
– Need experimental data of some sort
Protein-ligand docking
• A Structure-Based Drug Design (SBDD) method
– “structure” means “using protein structure”
• Computational method that mimics the binding of a ligand to a
protein
• Given a protein structure and a ligand, it predicts...
– The pose of the molecule in the binding site
– The binding affinity, or a score representing the
strength of binding
Pose vs. binding site
• Binding site (or “active site”)
– the part of the protein where the ligand
binds
– generally a cavity on the protein surface
– can be identified by looking at the crystal
structure of the protein bound with a known
inhibitor
• Pose (or “binding mode”)
– The geometry of the ligand in the binding
site
– Geometry = location, orientation and
conformation
• Protein-ligand docking is not about
identifying the binding site
Uses of docking
• The main uses of protein-ligand docking are
– Virtual screening, to identify potential lead compounds from
a large dataset (see next slide)
– Pose prediction
• Pose prediction: if we know exactly where
and how a known ligand binds...
– We can see which parts are
important for binding
– We can suggest changes to
improve affinity
– We can avoid changes that will ‘clash’
with the protein
Virtual screening
• Virtual screening is the computational or in silico
analogue of biological screening
• The aim is to score, rank or filter a set of chemical
structures using one or more computational
procedures
– Docking is just one way to do this
• It can be used
– to help decide which compounds to screen (experimentally),
which libraries to synthesise, and which compounds to purchase
from an external company
– to analyse the results of an experiment, such as an HTS run
Components of docking software
• Typically, protein-ligand docking software consists of
two main components which work together:
• 1. Search algorithm
– Generates a large number of poses of a molecule in the
binding site
• 2. Scoring function
– Calculates a score or binding affinity for a particular pose
• To give:
• The pose of the molecule in
the binding site
• The binding affinity or a
score representing the
strength of binding
Final points
• Large number of docking programs available
– AutoDock, DOCK, e-Hits, FlexX, FRED, Glide, GOLD,
LigandFit, QXP, Surflex-Dock…among others
– Different scoring functions, different search algorithms,
different approaches
• Note: protein-ligand docking is not to be confused with the field
of protein-protein docking (“protein docking”)
Outline
• Introduction to protein-ligand docking
• Practical aspects
• Searching for poses
• Scoring functions
• Assessing performance
Preparing the protein structure
• PDB structures often contain water molecules
– In general, all water molecules are removed except where it is
known that they play an important role in coordinating to the ligand
• PDB crystal structures usually contain no hydrogen atoms
– Many docking programs require the protein to have explicit
hydrogens. In general these can be added unambiguously, except
in the case of acidic/basic side chains
• An incorrect assignment of protonation
states in the active site will give poor
results
• Glutamate and aspartate side chains may be COO- or
COOH
– OH is a hydrogen bond donor, O- is not
• Histidine is a base and its neutral form
has two tautomers
[Figure: histidine side chain in its protonated form and as its two neutral tautomers]
Preparing the protein structure
• For particular protein side chains, the PDB structure can
be incorrect
• Crystallography gives electron density, not molecular
structure
– In poorly resolved crystal structures of proteins, isoelectronic
groups can make it difficult to deduce the correct structure
• Affects asparagine, glutamine, histidine
• Important? Affects hydrogen bonding pattern
• May need to flip amide or imidazole
– How to decide? Look at hydrogen bonding pattern in crystal
structures containing ligands
[Figure: the two flipped orientations of an amide side chain (O and NH2 exchanged) and of an imidazole ring]
Ligand Preparation
• A reasonable 3D structure is required as starting point
– During docking, the bond lengths and angles in ligands are held
fixed; only the torsion angles are changed
• The protonation state and tautomeric form of a particular
ligand could influence its hydrogen bonding ability
– Either protonate as expected for physiological pH and use a
single tautomer
– Or generate and dock all possible protonation states and
tautomers, and retain the one with the highest score
[Figure: keto-enol tautomerism, showing the enol (OH) and ketone (O) forms]
Outline
• Introduction to protein-ligand docking
• Practical aspects
• Searching for poses
• Scoring functions
• Assessing performance
The search space
• The difficulty with protein–ligand docking is in part
due to the fact that it involves many degrees of
freedom
– The translation and rotation of one molecule relative to
another involves six degrees of freedom
– There are in addition the conformational degrees of freedom
of both the ligand and the protein
– The solvent may also play a significant role in determining
the protein–ligand geometry (often ignored though)
• The search algorithm generates poses, orientations
of particular conformations of the molecule in the
binding site
– Tries to cover the search space, if not exhaustively, then as
extensively as possible
– There is a tradeoff between time and search space
coverage
Ligand conformations
• Conformations are different three-dimensional structures of
molecules that result from rotation about single bonds
• Having too many rotatable bonds results in “combinatorial
explosion”
• Also ring conformations
[Figure: structure of Taxol]
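The “combinatorial explosion” can be made concrete with a quick count. The following is a minimal sketch, assuming every rotatable bond is sampled independently at a fixed number of torsion values (12 here, i.e. 30 degree steps, which is an illustrative choice); the function name is hypothetical and ring conformations are ignored.

```python
def count_conformations(n_rotatable_bonds: int, steps_per_bond: int = 12) -> int:
    """Number of conformations on a regular torsion grid.

    Assumes each rotatable bond is sampled independently at
    `steps_per_bond` evenly spaced torsion angles (12 -> 30 degree steps).
    """
    if n_rotatable_bonds < 0 or steps_per_bond < 1:
        raise ValueError("need n_rotatable_bonds >= 0 and steps_per_bond >= 1")
    return steps_per_bond ** n_rotatable_bonds


if __name__ == "__main__":
    for n in (1, 2, 4, 5, 10):
        print(f"{n:2d} rotatable bonds -> {count_conformations(n):,} conformations")
```

Under this assumption each additional rotatable bond multiplies the number of conformations by the number of torsion steps, so going from 4 to 5 rotatable bonds gives 12 times as many conformations to consider.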
Search Algorithms
• We can classify the various search algorithms
according to the degrees of freedom that they
consider
• Rigid docking or flexible docking
– With respect to the ligand structure
• Rigid docking
• The ligand is treated as a rigid structure during the
docking
– Only the translational and rotational degrees of freedom are
considered
• To deal with the problem of ligand conformations, a large
number of conformations of each ligand are generated in
advance and each is docked separately
• Examples: FRED (Fast Rigid Exhaustive Docking) from
OpenEye, and one of the earliest docking programs, DOCK
The DOCK algorithm – Rigid docking
AR Leach, VJ Gillet, An Introduction to Cheminformatics
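• The binding site is first filled with a set of
overlapping spheres, giving a “negative
image” of the site.
• Ligand atoms are then matched to
the sphere centres so that the
distances between the atoms
equal the distances between the
corresponding sphere centres,
within some tolerance.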
• The ligand conformation is then
oriented into the binding site. After
checking to ensure that there are
no unacceptable steric
interactions, it is then scored.
• New orientations are produced by
generating new sets of matching
ligand atoms and sphere centres.
The procedure continues until all
possible matches have been
considered.
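The distance-matching step described above can be sketched as follows. This is a brute-force illustration written for clarity, not the matching procedure used in the DOCK program itself; the function name, the choice of three matched atoms and the tolerance value are all assumptions made for the example.

```python
from itertools import combinations, permutations
from math import dist


def find_matches(ligand_atoms, sphere_centres, n_match=3, tolerance=0.5):
    """Enumerate distance-compatible atom/sphere matches.

    Each match is a tuple of (ligand_atom_index, sphere_index) pairs such
    that every ligand atom-atom distance agrees with the corresponding
    sphere-sphere distance to within `tolerance` (same units as the
    coordinates). Brute force: fine for a toy example, slow for real sites.
    """
    matches = []
    for atom_idx in combinations(range(len(ligand_atoms)), n_match):
        for sphere_idx in permutations(range(len(sphere_centres)), n_match):
            compatible = all(
                abs(
                    dist(ligand_atoms[atom_idx[i]], ligand_atoms[atom_idx[j]])
                    - dist(sphere_centres[sphere_idx[i]], sphere_centres[sphere_idx[j]])
                )
                <= tolerance
                for i, j in combinations(range(n_match), 2)
            )
            if compatible:
                matches.append(tuple(zip(atom_idx, sphere_idx)))
    return matches


if __name__ == "__main__":
    # Toy data: a 3-4-5 triangle of ligand atoms and the same triangle,
    # translated, among the sphere centres (plus one distant sphere).
    ligand = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
    spheres = [(10.0, 10.0, 10.0), (13.0, 10.0, 10.0), (10.0, 14.0, 10.0), (50.0, 50.0, 50.0)]
    print(find_matches(ligand, spheres, tolerance=0.1))
```

Each match found this way defines one candidate orientation of the ligand; in the procedure described above, that orientation would then be checked for steric clashes and scored.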
Flexible docking
• Flexible docking is the most common form of docking today
– Conformations of each molecule are generated on-the-fly by the
search algorithm during the docking process
– The algorithm can avoid considering conformations that do not fit
• Exhaustive (systematic) searching is computationally too
expensive as the search space is very large
• One common approach is to use stochastic search methods
– These don’t guarantee the optimum solution, but give a good
solution within a reasonable length of time
– Stochastic means that they incorporate a degree of randomness
– Such algorithms include genetic algorithms (GOLD) and simulated
annealing (AutoDock)
• An alternative is to use incremental construction methods
– These construct conformations of the ligand within the binding site
in a series of stages
– First one or more “base fragments” are identified which are docked
into the binding site
– The orientations of the base fragment then act as anchors for a
systematic conformational analysis of the remainder of the ligand
– Example: FlexX
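To illustrate what a stochastic search looks like, here is a minimal simulated annealing sketch over ligand torsion angles. It is a toy under stated assumptions, not the algorithm implemented in AutoDock or GOLD: `toy_score` is a hypothetical stand-in for a real scoring function, only torsions are varied (no translation or rotation), and the cooling schedule and step size are arbitrary choices.

```python
import math
import random


def toy_score(torsions):
    """Hypothetical score (lower is better); zero when every torsion is 60 degrees."""
    return sum(1.0 - math.cos(math.radians(t - 60.0)) for t in torsions)


def simulated_annealing(score, n_torsions, n_steps=5000, t_start=2.0, t_end=0.01, seed=0):
    """Minimise `score` over torsion angles (degrees) using Metropolis acceptance."""
    rng = random.Random(seed)
    current = [rng.uniform(-180.0, 180.0) for _ in range(n_torsions)]
    current_score = score(current)
    best, best_score = list(current), current_score
    for step in range(n_steps):
        # Geometric cooling from t_start down to t_end.
        temperature = t_start * (t_end / t_start) ** (step / max(n_steps - 1, 1))
        # Random move: perturb one torsion (angles are not wrapped; the toy score is periodic).
        trial = list(current)
        i = rng.randrange(n_torsions)
        trial[i] += rng.gauss(0.0, 30.0)
        trial_score = score(trial)
        delta = trial_score - current_score
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = trial, trial_score
            if current_score < best_score:
                best, best_score = list(current), current_score
    return best, best_score


if __name__ == "__main__":
    best, best_score = simulated_annealing(toy_score, n_torsions=3)
    print("best torsions:", [round(t, 1) for t in best], "score:", round(best_score, 4))
```

Accepting some worse moves at high temperature is what lets the search escape local minima; as noted above, this gives a good solution in reasonable time but no guarantee of the optimum, and a different random seed can give a different result.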
Outline
• Introduction to protein-ligand docking
• Practical aspects
• Searching for poses
• Scoring functions
• Assessing performance
The perfect scoring function will…
• Accurately calculate the binding affinity
– Will allow actives to be identified in a virtual screen
– Be able to rank actives in terms of affinity
• Score the poses of an active higher than poses of an
inactive
– Will rank actives higher than inactives in a virtual screen
• Score the correct pose of the active higher than an
incorrect pose of the active
– Will allow the correct pose of the active to be identified
• “actives” = molecules with biological activity
Classes of scoring function
• Broadly speaking, scoring functions can be
divided into the following classes:
– Forcefield-based
• Based on terms from molecular mechanics
forcefields
• GoldScore, DOCK, AutoDock
– Empirical
• Parameterised against experimental binding affinities
• ChemScore, PLP, Glide SP/XP
– Knowledge-based potentials
• Based on statistical analysis of observed pairwise
distributions
• PMF, DrugScore, ASP
Empirical scoring functions
Böhm’s empirical scoring function
• In general, scoring functions assume that the free energy of binding can be
written as a linear sum of terms that reflect the various contributions to binding
• Böhm’s scoring function includes contributions from hydrogen bonding, ionic
interactions, lipophilic interactions and the loss of internal conformational
freedom of the ligand
• The ∆G coefficients on the right of the equation are all constants (see next slide)
• ∆G0 is a contribution to the binding energy that does not directly depend on any
specific interactions with the protein
• The hydrogen bonding and ionic terms both depend on the geometry
of the interaction, with large deviations from ideal geometries (ideal distance R,
ideal angle α) being penalised
• The lipophilic term is proportional to the contact surface area (Alipo) between
protein and ligand involving non-polar atoms
• The conformational entropy term is the penalty associated with freezing
internal rotations of the ligand. It is largely entropic in nature. Here the value is
directly proportional to the number of rotatable bonds in the ligand (NROT)
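The equation itself is not reproduced in this transcript. Reconstructed from the term-by-term description here (so the exact notation, in particular the penalty function f, is an assumption), the scoring function has the form:

```latex
\Delta G_{\mathrm{bind}} = \Delta G_{0}
  + \Delta G_{\mathrm{hb}} \sum_{\text{h-bonds}} f(\Delta R, \Delta\alpha)
  + \Delta G_{\mathrm{ionic}} \sum_{\text{ionic}} f(\Delta R, \Delta\alpha)
  + \Delta G_{\mathrm{lipo}} \, A_{\mathrm{lipo}}
  + \Delta G_{\mathrm{rot}} \, N_{\mathrm{ROT}}
```

Here f(∆R, ∆α) is a function that penalises deviations ∆R and ∆α from the ideal distance and angle of each hydrogen bond or ionic interaction, and the coefficients ∆G0, ∆Ghb, ∆Gionic, ∆Glipo and ∆Grot are constants obtained by fitting to experimental binding affinities.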
Outline
• Introduction to protein-ligand docking
• Practical aspects
• Searching for poses
• Scoring functions
• Assessing performance
Pose prediction accuracy
• Given a set of actives with known crystal poses, can
they be docked accurately?
• Accuracy is measured by RMSD (root mean squared
deviation) from the known crystal pose
– RMSD = square root of the mean, over the ligand atoms, of
the squared distance between an atom’s position in the
crystal and its position in the docked pose
– A pose within 2.0 Å RMSD is generally considered accurate
– More sophisticated measures have been proposed, but are
not widely adopted
• In general, the best docking software predicts the
correct pose about 70% of the time
• Note: it’s always easier to find the correct pose when
docking back into the active’s own crystal structure
– More difficult to cross-dock
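The RMSD measure can be sketched in a few lines. This is a minimal example, assuming both poses are given as lists of (x, y, z) coordinates in the same atom order and in the same coordinate frame (no superposition is performed, and symmetry-equivalent atoms are not handled); the function name and the toy coordinates are illustrative.

```python
import math


def rmsd(crystal_coords, pose_coords):
    """RMSD between two poses, in the same units as the coordinates.

    Both arguments are equal-length sequences of (x, y, z) tuples listing
    the same atoms in the same order. The poses are compared as given,
    i.e. in the protein's frame of reference, without any fitting.
    """
    if len(crystal_coords) != len(pose_coords) or not crystal_coords:
        raise ValueError("poses must be non-empty and contain the same number of atoms")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(crystal_coords, pose_coords))
    return math.sqrt(total / len(crystal_coords))


if __name__ == "__main__":
    crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
    docked = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.5, 1.5, 1.0)]
    value = rmsd(crystal, docked)
    print(f"RMSD = {value:.2f} (accurate by the 2.0 cut-off: {value <= 2.0})")
```

Comparing the poses without superposition matters here: a docked ligand with the right conformation but in the wrong place in the binding site should count as an incorrect pose.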
Assess performance of a virtual screen
• Need a dataset containing Nact known actives together with inactives
• Dock all molecules, and rank them by score
• Ideally, all actives would be at the top of the list
– In practice, we are interested in any improvement over what
is expected by chance
• Define enrichment, E, as the number of actives found
(Nfound) in the top X% of scores (typically 1% or 5%),
divided by the number expected by chance
– E = Nfound / (Nact * X/100)
– E > 1 implies “positive enrichment”, better than random
– E < 1 implies “negative enrichment”, worse than random
• Why use a cut-off instead of looking at the mean rank
of the actives?
– Typically, researchers only have the resources to
experimentally test the top 1% or 5% of compounds
• More sophisticated approaches have been developed
(e.g. BEDROC) but enrichment is still widely used
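The enrichment calculation can be sketched as follows, using the formula E = Nfound / (Nact * X/100). The function name and the toy dataset (200 compounds, 10 actives, 2 of them ranked in the top 5%) are assumptions chosen for illustration.

```python
def enrichment_factor(ranked_is_active, top_percent):
    """Enrichment E = Nfound / (Nact * X/100).

    `ranked_is_active` is a list of booleans, one per docked molecule,
    ordered from best score to worst; True marks a known active.
    `top_percent` is X, the percentage of the ranked list to inspect.
    """
    n_total = len(ranked_is_active)
    n_act = sum(ranked_is_active)
    if n_total == 0 or n_act == 0:
        raise ValueError("need at least one molecule and at least one active")
    n_top = max(1, round(n_total * top_percent / 100))  # size of the top X%
    n_found = sum(ranked_is_active[:n_top])             # actives in the top X%
    expected = n_act * top_percent / 100                # actives expected by chance
    return n_found / expected


if __name__ == "__main__":
    # Toy ranking: 200 molecules, 10 actives, 2 of which fall in the top 5%.
    ranking = [False] * 200
    for position in (0, 3, 20, 40, 60, 80, 100, 120, 140, 160):
        ranking[position] = True
    print(enrichment_factor(ranking, top_percent=5))
```

In this toy case 0.5 actives would be expected in the top 5% by chance, so finding 2 gives an enrichment of 2 / 0.5 = 4.0.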
Final thoughts
• Protein-ligand docking is an essential tool for
computational drug design
– Widely used in pharmaceutical companies
– Many success stories (see Kolb et al. Curr. Opin. Biotech.,
2009, 20, 429)
• But it’s not a silver bullet
– The perfect scoring function has yet to be found
– The performance varies from target to target, and scoring
function to scoring function
• See for example, Plewczynski et al, “Can we trust docking results?
Evaluation of seven commonly used programs on PDBbind database”,
J. Comp. Chem., Online 1 Sep 2010.
• Care needs to be taken when preparing both the
protein and the ligands
• The more information you have (and use!), the better
your chances
– Targeted library, docking constraints, filtering poses, seeding
with known actives, comparing with known crystal poses


Molecular docking and structure based Virtual screening


Editor's Notes

  • #18: 12^5/12^4 = 12
  • #31: Expected = 5% of 10 = 0.5 actives; Enrichment = 2 / 0.5 = 4.0