Ab Initio Protein Structure
Prediction
ag1805x
Protein structure prediction
● Protein structure prediction (PSP) is the
prediction of the three-dimensional structure
of a protein from its amino acid sequence,
i.e. the prediction of its tertiary structure
from its primary structure.
Protein Structure Prediction: Methods
● Similar protein structure available:
➔ Template-based method
● Similar protein structure not available:
➔ ab initio modelling
➔ Threading
ab initio modelling
● ab initio modelling conducts a
conformational search under the guidance of
a designed energy function.
● This procedure usually generates a number of
possible conformations (structure decoys),
and final models are selected from them.
● Successful ab initio modelling depends on
three factors:
➢ an accurate energy function with which the
native structure of a protein corresponds to the
most thermodynamically stable state, compared
to all possible decoy structures;
➢ an efficient search method which can quickly
identify the low-energy states through
conformational search;
➢ selection of native-like models from a pool
of decoy structures.
Energy Functions
● Energy functions are classified into two groups:
➔ Physics-based energy functions
➔ Knowledge-based energy functions
Physics-Based Energy Functions
“In a strictly-defined physics-based ab initio method,
interactions between atoms should be based on
quantum mechanics and the coulomb potential with
only a few fundamental parameters such as the
electron charge and the Planck constant; all atoms
should be described by their atom types where only
the number of electrons is relevant.”
(Hagler et al. 1974; Weiner et al. 1984)
In practice, a compromise force field with a
large number of selected atom types is used.
Within each atom type, the atoms are similar
enough in their chemical and physical
properties to share parameters, which are
calculated from crystal packing data or
quantum mechanical theory.
● Well-known examples of such all-atom physics-based
force fields include:
✔ AMBER
✔ CHARMM
✔ OPLS
✔ GROMOS96
● These potentials contain terms associated with
bond lengths, angles, torsion angles, van der
Waals, and electrostatic interactions.
● The major differences between them lie in the
selection of atom types and the interaction
parameters.
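The terms listed above can be illustrated with a minimal Python sketch. The functional forms (harmonic bonds and angles, a cosine torsion term, a 12-6 Lennard-Jones term and a Coulomb term) are the ones commonly found in this family of force fields, but every function name and parameter value here is a hypothetical placeholder and is not taken from AMBER, CHARMM, OPLS or GROMOS96.

```python
import math

# Approximate Coulomb constant in kcal*Angstrom/(mol*e^2); the exact value
# differs slightly between packages, so treat this as an assumption.
COULOMB_K = 332.06

def bond_energy(r, r0, k):
    """Harmonic bond stretching: k * (r - r0)^2."""
    return k * (r - r0) ** 2

def angle_energy(theta, theta0, k):
    """Harmonic angle bending, angles in radians."""
    return k * (theta - theta0) ** 2

def torsion_energy(phi, v_n, n, gamma):
    """Periodic torsion term: (v_n / 2) * (1 + cos(n * phi - gamma))."""
    return 0.5 * v_n * (1.0 + math.cos(n * phi - gamma))

def lennard_jones(r, epsilon, sigma):
    """12-6 van der Waals term; the minimum -epsilon lies at r = 2^(1/6) * sigma."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def coulomb(r, q1, q2, dielectric=1.0):
    """Electrostatic interaction between two point charges (in units of e)."""
    return COULOMB_K * q1 * q2 / (dielectric * r)

# Toy total energy for one bond, one angle, one torsion and one non-bonded pair.
# All numbers are made up for illustration.
total = (
    bond_energy(1.53, r0=1.50, k=300.0)
    + angle_energy(math.radians(112.0), theta0=math.radians(109.5), k=50.0)
    + torsion_energy(math.radians(60.0), v_n=1.4, n=3, gamma=0.0)
    + lennard_jones(3.8, epsilon=0.1, sigma=3.4)
    + coulomb(3.8, q1=0.3, q2=-0.3)
)
```

In a real force field the same functional forms are summed over every bond, angle, torsion and non-bonded pair, and the choice of atom types decides which parameter set each term receives.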
Knowledge-Based Energy Function
● Refers to the empirical energy terms derived from the
statistics of solved structures deposited in the PDB.
● Can be divided into two types:
➢ generic and sequence-independent terms, such as
hydrogen bonding and the local backbone stiffness of a
polypeptide chain
➢ amino-acid or protein-sequence dependent terms, e.g.
pairwise residue contact potential, distance-dependent atomic
contact potential, and secondary structure propensities
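One common way to turn structure statistics into an energy term is the inverse Boltzmann relation, E = -kT ln(observed / expected). The Python sketch below applies it to residue-pair contact counts. The counts, the reference state (expected contacts proportional to the product of residue frequencies), the pseudocount and all names are illustrative assumptions, not data from the PDB or a specific published potential.

```python
import math
from collections import Counter

def contact_potential(contact_counts, kT=1.0, pseudocount=1.0):
    """Derive pairwise contact energies from observed contact counts.

    contact_counts maps an unordered residue pair (a, b) with a <= b
    to the number of times that pair was seen in contact.
    """
    total = sum(contact_counts.values())

    # How often each residue type appears as a contact partner.
    freq = Counter()
    for (a, b), n in contact_counts.items():
        freq[a] += n
        freq[b] += n
    n_partners = 2 * total

    potential = {}
    for (a, b), observed in contact_counts.items():
        p_a = freq[a] / n_partners
        p_b = freq[b] / n_partners
        # Expected count under random pairing; unlike pairs can form two ways.
        expected = total * p_a * p_b * (1 if a == b else 2)
        ratio = (observed + pseudocount) / (expected + pseudocount)
        potential[(a, b)] = -kT * math.log(ratio)
    return potential

# Made-up counts: Leu-Leu contacts are over-represented, Lys-Leu under-represented.
toy_counts = {("L", "L"): 60, ("K", "L"): 20, ("K", "K"): 20}
toy_potential = contact_potential(toy_counts)
```

Pairs seen more often than the reference state predicts receive a negative (favourable) energy, and pairs seen less often receive a positive one.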
Conformational Search Methods
● Successful ab initio modelling of protein structures
depends on the availability of a powerful conformational
search method which can efficiently find the global
minimum energy structure for a given energy function
with a complicated energy landscape.
● Types:
➔ Monte Carlo Simulations
➔ Molecular Dynamics
➔ Genetic Algorithm
➔ Mathematical Optimization
Monte Carlo Simulations
● Its core idea is to use random samples of
parameters or inputs to explore the behavior
of a complex system or process.
Steps in MC simulation:
➢ Start from an initial configuration of particles
in a system.
➢ Attempt a Monte Carlo move that changes the
configuration of the particles.
➢ Accept or reject the move based on an acceptance
criterion.
➢ Calculate the value of a property of interest.
➢ After many such steps, an accurate average value of
this property can be obtained.
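The Monte Carlo steps can be sketched in Python as follows. The slides do not name an acceptance criterion, so this sketch assumes the widely used Metropolis criterion. The one-dimensional harmonic energy, the step size and the function names are also illustrative assumptions, standing in for a real protein energy function and move set.

```python
import math
import random

def metropolis(energy, x0, n_steps, step_size=1.0, kT=1.0, seed=0):
    """Sample configurations of a 1D system with the Metropolis criterion."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)              # initial configuration
    samples, accepted = [], 0
    for _ in range(n_steps):
        # Attempt a move that changes the configuration.
        x_new = x + rng.uniform(-step_size, step_size)
        e_new = energy(x_new)
        delta = e_new - e
        # Always accept downhill moves; accept uphill moves with
        # probability exp(-delta / kT).
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            x, e = x_new, e_new
            accepted += 1
        samples.append(x)              # record the current configuration
    return samples, accepted / n_steps

def harmonic(x):
    """Toy energy function E(x) = x^2 / 2."""
    return 0.5 * x * x

samples, acceptance = metropolis(harmonic, x0=0.0, n_steps=50000, step_size=2.0)
# Property of interest: the average of x^2, which for this toy energy
# should approach kT as the number of samples grows.
mean_x2 = sum(x * x for x in samples) / len(samples)
```

The rejected moves still contribute the unchanged configuration to the average, which is what makes the sampled distribution follow the Boltzmann weights.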
Molecular Dynamics
● MD simulation solves Newton’s equations of motion at each step of
atom movement, which is probably the most faithful method for
depicting atomistically what is occurring in proteins.
● The method is therefore most often used for the study of protein
folding pathways.
● The long simulation time is one of the major issues of this method,
since the incremental time scale is usually on the order of
femtoseconds (10⁻¹⁵ s) while the fastest folding time of a small
protein (less than 100 residues) is in the millisecond range in nature.
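A minimal Python sketch of one way to integrate Newton's equations step by step is given below. It uses the velocity Verlet scheme on a one-dimensional harmonic spring. Both the integrator and the toy system are assumptions chosen for illustration, and the names are hypothetical. The last lines work out the time-scale gap from the slide: covering 1 ms with 1 fs steps takes 10¹² integration steps.

```python
import math

def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate Newton's equation m * a = F(x) in one dimension."""
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory

# Toy system in reduced units (all values are illustrative).
K_SPRING, MASS, DT, N_STEPS = 1.0, 1.0, 0.01, 1000

def total_energy(x, v):
    """Kinetic plus potential energy of the spring."""
    return 0.5 * MASS * v * v + 0.5 * K_SPRING * x * x

traj = velocity_verlet(1.0, 0.0, lambda x: -K_SPRING * x, MASS, DT, N_STEPS)
e_start = total_energy(*traj[0])
e_end = total_energy(*traj[-1])
x_exact = math.cos(DT * N_STEPS)   # analytic solution for x(0) = 1, v(0) = 0

# Time-scale arithmetic: 1 ms = 10^12 fs, so with a 1 fs timestep
# a millisecond of simulated time needs 10^12 steps.
TIMESTEP_FS = 1
FS_PER_MS = 10 ** 12
steps_for_one_ms = FS_PER_MS // TIMESTEP_FS
```

The near-constant total energy along the trajectory is a quick sanity check that the timestep is small enough for the system being integrated.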
Genetic Algorithm
● The genetic algorithm is a method for solving problems
that is based on natural selection, the process that drives
biological evolution.
● The genetic algorithm repeatedly modifies a population of
individual solutions.
● At each step, the genetic algorithm selects individuals at
random from the current population to be parents and uses
them to produce the children for the next generation.
● Over successive generations, the population "evolves"
toward an optimal solution.
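The loop described above can be sketched in Python as follows. Several details are assumptions made for illustration and are not taken from the slides: individuals are lists of torsion-like angles, parents are picked by a two-way tournament among randomly drawn individuals, children are made by one-point crossover plus Gaussian mutation, and the best individual is copied unchanged into the next generation (elitism). The toy energy function simply measures the distance to a hypothetical target conformation.

```python
import random

def genetic_algorithm(energy, n_genes, pop_size=40, generations=60,
                      mutation_rate=0.1, seed=0):
    """Minimise `energy` over lists of n_genes angles (n_genes >= 2)."""
    rng = random.Random(seed)
    population = [[rng.uniform(-180.0, 180.0) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    best_history = []

    def tournament():
        # Draw two individuals at random and keep the lower-energy one.
        a, b = rng.sample(population, 2)
        return a if energy(a) < energy(b) else b

    for _ in range(generations):
        elite = min(population, key=energy)
        best_history.append(energy(elite))
        children = [elite[:]]                      # elitism: keep the best
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n_genes)        # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [g + rng.gauss(0.0, 10.0) if rng.random() < mutation_rate else g
                     for g in child]               # occasional mutation
            children.append(child)
        population = children

    best = min(population, key=energy)
    best_history.append(energy(best))
    return best, best_history

# Hypothetical target angles; the energy is the squared distance to them.
TARGET = [-60.0, -45.0, -120.0, 130.0, 60.0]

def toy_energy(angles):
    return sum((a - t) ** 2 for a, t in zip(angles, TARGET))

best, history = genetic_algorithm(toy_energy, n_genes=len(TARGET))
```

Because of the elitism step, the best energy recorded in `history` can never increase from one generation to the next.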
Mathematical Optimization
● Mathematical optimization is the selection of a best
element (with regard to some criteria) from some
set of available alternatives.
Model Selection
● The selection of protein models has
emerged as a new field called Model Quality
Assessment Programs (MQAP).
● Model selection approaches can be
classified into two types:
➔ energy based
➔ free-energy based
Physics-Based Energy Function
● Selects the decoy with the lowest energy.
Knowledge-Based Energy Function
● Sippl developed a pairwise residue-distance based
potential (Sippl 1990) using the statistics of known PDB
structures; its newest version is PROSA II (Sippl 1993;
Wiederstein and Sippl 2007).
● A variety of knowledge-based potentials have been
proposed, which include atomic interaction potential,
solvation potential, hydrogen bond potential, torsion angle
potential, etc.
Sequence-Structure Compatibility Function
● The best models are not selected purely on the basis of energy functions.
● They are selected based on the compatibility of target sequences
with model structures.
● The earliest and still successful example is that by Luthy et al.
(1992), who used threading scores to evaluate structures.
● Colovos and Yeates (1993) later used a quadratic error function
to describe the non-covalently bonded interactions among CC,
CN, CO, NN, NO and OO atom pairs, where near-native structures
have fewer errors than other decoys.
Clustering of Decoy Structures
● Cluster analysis or clustering is the task of grouping a set of objects in such a
way that objects in the same group (called a cluster) are more similar (in
some sense or another) to each other than to those in other groups (clusters).
● The cluster-centre conformation of the largest cluster is considered closer to
native structures than the majority of decoys.
● In the work by Shortle et al. (1998), for all 12 cases tested, the cluster-centre
conformation of the largest cluster was closer to native structures than the
majority of decoys. Cluster-centre structures were ranked as the top 1–5%
closest to their native structures.
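The idea of picking the centre of the largest cluster can be sketched in Python as follows. This is a simple neighbour-counting scheme, not the procedure used by Shortle et al. It assumes the decoys are already superimposed, so RMSD is computed directly on the coordinates, and the toy decoys, the cutoff and the function names are all hypothetical.

```python
import math

def rmsd(a, b):
    """Root-mean-square deviation between two pre-superimposed structures.

    Each structure is a list of (x, y, z) tuples of equal length.
    """
    sq = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(sq / len(a))

def largest_cluster(decoys, cutoff):
    """Return (centre index, member indices) of the most populated cluster.

    A decoy's cluster is every decoy within `cutoff` RMSD of it; the decoy
    with the most neighbours is taken as the cluster centre.
    """
    best_centre, best_members = None, []
    for i, centre in enumerate(decoys):
        members = [j for j, other in enumerate(decoys)
                   if rmsd(centre, other) <= cutoff]
        if len(members) > len(best_members):
            best_centre, best_members = i, members
    return best_centre, best_members

# Five made-up two-atom decoys: three lie close together, two lie far away.
toy_decoys = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.1, 0.0), (1.0, 0.1, 0.0)],
    [(0.0, -0.1, 0.0), (1.0, -0.1, 0.0)],
    [(0.0, 5.0, 0.0), (1.0, 5.0, 0.0)],
    [(0.0, 5.1, 0.0), (1.0, 5.1, 0.0)],
]
centre, members = largest_cluster(toy_decoys, cutoff=0.15)
```

With real decoys each pair would first need an optimal superposition before the RMSD is measured, and the all-against-all comparison becomes the dominant cost for large decoy sets.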
Algorithms & Servers of ab initio modelling
Fig.: Flowchart of the ROSETTA protocol
Fig.: Flowchart of I-TASSER protein structure modelling
Thank You
