Lecture 9: Molecular descriptors
BTT-516: Drug Designing and Development
Topics to be covered
1. Introduction
2. Types of molecular descriptors
3. Tools for descriptor calculations
4. Homework
Introduction
Molecular descriptors are mathematical representations of a molecule's properties,
generated by algorithms.
Their numerical values quantitatively describe the physical and chemical information
of the molecule.
One example is logP, a quantitative measure of lipophilicity. It is obtained by measuring
how a molecule partitions between an aqueous phase (water) and a lipophilic phase
(usually n-octanol), and is the base-10 logarithm of the resulting concentration ratio.
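The logP definition can be written out as a small calculation. This is a minimal sketch, assuming the two equilibrium concentrations have already been measured in the same units; the function name and the example concentrations are hypothetical and are not taken from the lecture.

```python
import math


def log_p(conc_octanol: float, conc_water: float) -> float:
    """Return log10 of the n-octanol/water concentration ratio."""
    if conc_octanol <= 0 or conc_water <= 0:
        raise ValueError("concentrations must be positive")
    return math.log10(conc_octanol / conc_water)


# Hypothetical measurement: 2.0 mM in the octanol phase, 0.02 mM in water.
example_log_p = log_p(2.0, 0.02)
print(f"logP = {example_log_p:.2f}")
```

A positive value means the molecule prefers the octanol phase (more lipophilic); a negative value means it prefers water.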
Molecular descriptors are also useful for similarity searches in molecular libraries:
molecules with similar physical or chemical properties can be found by comparing their
descriptor values.
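A descriptor-based similarity search can be sketched as a nearest-neighbour ranking. The example below assumes each molecule is already represented by a vector of scaled descriptor values and uses Euclidean distance, which is only one of several possible similarity measures. The molecule names and values are invented for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_similar(query, library, k=2):
    """Return the names of the k library molecules closest to the query."""
    ranked = sorted(library.items(), key=lambda item: euclidean(query, item[1]))
    return [name for name, _ in ranked[:k]]


# Hypothetical, pre-scaled descriptor vectors (for example logP, weight, H-bond donors).
library = {
    "mol_A": [0.10, 0.20, 0.00],
    "mol_B": [0.80, 0.90, 1.00],
    "mol_C": [0.15, 0.25, 0.00],
}
query = [0.12, 0.22, 0.00]
hits = most_similar(query, library, k=2)
print(hits)
```

In practice the descriptors should be brought onto a comparable scale first, otherwise a descriptor with large numerical values (such as molecular weight) dominates the distance.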
Molecular descriptors are used in ADMET prediction models to capture structure–property
relationships, so that the ADMET properties of a molecule can be predicted from its
descriptor values (Khan and Sylte, 2007).
The descriptors used in ADMET models can be classified by the level of molecular
representation required to calculate them:
• One-dimensional (1D)
• Two-dimensional (2D)
• Three-dimensional (3D)
One-dimensional (1D)
1D descriptors are the simplest type of molecular descriptor. They are calculated from the
molecular formula alone and include the count and type of atoms in the molecule and the
molecular weight.
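As an illustration of 1D descriptors, atom counts and molecular weight can be derived from a molecular formula string. This is a minimal sketch: the parser handles only simple formulas without parentheses or charges, and the atomic weight table is a small, rounded subset chosen for the example.

```python
import re

# Rounded standard atomic weights for a few common elements (illustrative subset).
ATOMIC_WEIGHT = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "S": 32.06}


def atom_counts(formula):
    """Count atoms per element in a simple formula such as 'C9H8O4'."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def molecular_weight(formula):
    """Sum the atomic weights of all atoms in the formula."""
    return sum(ATOMIC_WEIGHT[s] * n for s, n in atom_counts(formula).items())


counts = atom_counts("C9H8O4")  # aspirin
weight = molecular_weight("C9H8O4")
print(counts, round(weight, 3))
```

Because only the formula is used, isomers with the same formula receive identical 1D descriptor values, which is why higher-level descriptors are needed to tell them apart.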
2D descriptors
2D descriptors are more complex than 1D descriptors. They usually represent molecular
information about the size, shape, and electronic distribution of the molecule.
Calculating 2D descriptors depends mainly on the database size, and calculating them for parts
of a molecule where data are missing can give false results.
3D descriptors
3D descriptors mainly describe properties related to the 3D conformation of the molecule,
such as intramolecular hydrogen bonding.
Examples of descriptors obtained from calculations involving the 3D structure of the molecule
are the polar and nonpolar surface areas (PSA and NPSA, respectively). More advanced
calculations, such as quantum mechanical calculations, can yield 3D descriptors that
describe the valence electron distribution in the molecule (Bergström, 2005).
• 0D - bond counts, molecular weight, atom counts
• 1D - fragment counts, H-bond acceptors/donors, Crippen, PSA, SMARTS
• 2D - topological descriptors (Balaban, Randic, Wiener, BCUT, kappa, chi)
• 3D - geometrical descriptors (3D WHIM, 3D autocorrelation, 3D-MoRSE) + surface
properties + CoMFA
• 4D - 3D coordinates + conformations (JChem conformer, CORINA, gold set,
Crystaleye)
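One of the topological descriptors named above, the Wiener index, is simple enough to compute by hand: it is the sum of the shortest-path distances (in bonds) between all pairs of heavy atoms in the hydrogen-suppressed molecular graph. The sketch below assumes the molecule is supplied as an adjacency dictionary; the atom numbering is arbitrary and chosen for the example.

```python
from collections import deque


def wiener_index(adjacency):
    """Sum of shortest-path lengths over all unordered atom pairs."""
    total = 0
    for start in adjacency:
        # Breadth-first search gives bond-count distances from `start`.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total += sum(dist.values())
    return total // 2  # every pair was counted from both ends


# Carbon skeletons only (hydrogens suppressed).
n_butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
isobutane = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
print(wiener_index(n_butane), wiener_index(isobutane))
```

The two isomers share the same formula, and therefore the same 0D/1D values, but the branched skeleton gives a smaller index than the linear one.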
Tools for descriptor calculations
A selection of commercial and free descriptor calculation utilities is collected under the
molecular descriptor software collection and the CompChem list; new programs are posted
to CCL.
• alvaDesc - new visual descriptor suite from Kode solutions covering 4000 descriptors
(developed by Alvascience)
• CDK descriptor GUI (free and open source - using open-source CDK and JOELib code)
• BlueDesc - Molecular Descriptor Calculator (free and open source - using CDK and JOELib
code, requires Java 1.6)
• ChemAxon JChem - descriptor package using the Marvin Java API (free academic license)
• ISIDA/QSPR - free fragment-based QSPR descriptor package
• E-Dragon (VCCLab) - free (150 molecules), now with GSFRAG, GSFRAG-L, ETState, more than
3000 descriptors
• MOLD2 - (FDA) a free 2D molecular descriptor package
• Toxicity Estimation Software Tool (T.E.S.T.) - (EPA) contains more than 790 two-dimensional
descriptors
• Open3DQSAR - pharmacophore modelling using molecular interaction fields (MIFs)
• Dragon - 5,270 molecular descriptors for Linux and Windows (Todeschini/Talete/Kode)
• PaDEL-Descriptor - based on CDK but includes an additional 737 2D and 3D descriptors
(NUS/Singapore)
• ADMEWORKS ModelBuilder - 400 descriptors (Jurs) and MOPAC (Stewart) (Fujitsu/Poland)
• QuBiLS-MIDAS - highly parallel software for three-dimensional molecular descriptor
calculation
Concepts for descriptor calculations and QSAR/QSPR modeling
• You need a large dataset with the molecular property (logP, boiling point) to be modeled.
The larger the number of data points, the better. There are QSAR models with 20 or fewer
points; however, for broad applications one needs to cover a large diversity space.
Hundreds or thousands of such values can be collected from databases or are now
available from high-throughput (HT) screening methods.
• You need the molecular structures themselves (as SMILES, or SDF with 2D or optimized 3D
structures). Handling the molecules together with all their descriptors can be a challenging
task, so software that can do this is highly preferred.
• You need a descriptor package for descriptor calculation.
• You need to apply feature selection (a statistical process) to discard unimportant
(invariant) or sometimes highly correlated descriptors (orthogonalization).
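The feature selection step can be sketched as a simple filter. The example below drops invariant descriptors and then drops any descriptor whose absolute Pearson correlation with an already kept descriptor exceeds a threshold. This is a plain correlation filter and not a true orthogonalization; the threshold of 0.95, the function names, and the descriptor table are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def select_descriptors(table, max_corr=0.95):
    """Keep descriptors that vary and are not highly correlated with kept ones."""
    kept = []
    for name, values in table.items():
        if max(values) == min(values):
            continue  # invariant: carries no information
        if any(abs(pearson(values, table[k])) > max_corr for k in kept):
            continue  # redundant with a descriptor that is already kept
        kept.append(name)
    return kept


# Hypothetical descriptor table: one list of values per descriptor, one entry per molecule.
table = {
    "n_atoms": [5, 8, 12, 20],
    "mol_weight": [72.0, 114.0, 170.0, 282.0],  # linear in n_atoms for this set
    "n_rings": [0, 0, 0, 0],                     # invariant
    "h_donors": [1, 0, 2, 0],
}
selected = select_descriptors(table)
print(selected)
```

Which member of a correlated pair survives depends on the order of the table, so in a real study the choice is usually guided by interpretability or by correlation with the modeled property.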
• You need to divide your molecule set into three parts: a training set (70%), a validation set
(30%), and an additional external test or validation set that is not used in either step.
(Sometimes the validation set is called the testing set, or vice versa.) Cross-validation
(n-fold or v-fold) or other resampling techniques (Monte Carlo sampling, jackknifing,
bootstrapping) need to be applied, especially if not enough molecules are available.
• You need to apply regression or classification methods (including meta-learning approaches).
• You need to make sure that future predictions are not made for compound classes that were
not covered during model development, which usually results in wrong predictions. This can be
done by including error values, fingerprint or substructure matches, or a simple dimension
reduction method (PCA, PLS). For example, a logP method developed only on alkanes will fail
on complex drug molecules or molecules with multiple -OH, -NH, or -SH groups.
Furthermore, a complete statistical description of the regression or classification
performance needs to be included.
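The data-splitting and domain-checking steps above can be sketched with the standard library alone. The example holds out an external set first and then splits the remainder 70/30 into training and validation sets. The 20% external fraction, the function names, and the range-based domain check are assumptions made for this sketch; the range check is a simpler stand-in for the fingerprint, substructure, or PCA/PLS approaches named in the text.

```python
import random


def three_way_split(items, external_fraction=0.2, train_fraction=0.7, seed=0):
    """Shuffle, hold out an external set, then split the rest into train/validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_external = int(round(len(shuffled) * external_fraction))
    external, rest = shuffled[:n_external], shuffled[n_external:]
    n_train = int(round(len(rest) * train_fraction))
    return rest[:n_train], rest[n_train:], external


def k_fold(items, k):
    """Yield (training, held_out) pairs for k-fold cross-validation."""
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        training = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield training, folds[i]


def in_domain(descriptors, training_rows):
    """True if every descriptor value lies within the range seen in training."""
    for column, value in enumerate(descriptors):
        seen = [row[column] for row in training_rows]
        if not min(seen) <= value <= max(seen):
            return False
    return True


molecules = list(range(100))  # stand-ins for 100 molecule records
train, validation, external = three_way_split(molecules)
print(len(train), len(validation), len(external))
```

A new molecule that fails `in_domain` would be flagged rather than predicted, which is the situation the alkane-only logP example describes.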
Utility of molecular descriptors
• The purpose of molecular descriptors is to provide calculated properties that serve as
numerical descriptions or characterizations of molecules in other calculations, such as
QSAR models, diversity analysis, or combinatorial library design.
Thank you
Er. Rajan Rolta
Faculty of Applied Sciences and Biotechnology
Shoolini University,
Village Bhajol, Solan (H.P)
+91-7018792621 (Mob No.)
rajanrolta@shooliniuniversity.com
