HOMOLOGY MODELLING
INTRODUCTION:
Homology modeling, also known as comparative modeling, is a technique for
constructing an atomic-resolution model of a "target" protein of unknown
structure from:
1. Its amino acid sequence and
2. An experimental 3D structure of a related homologous protein (the
"template").
The three-dimensional structure of the target protein is predicted from its
amino acid sequence by aligning it to one or more homologous (template)
proteins for which an X-ray or NMR structure is available.
If similarity between the target sequence and the template sequence is
detected, structural similarity can be assumed.
In general, at least 30% sequence identity is required to generate a useful model.
SEQUENCE SIMILARITY & STRUCTURAL SIMILARITY
As long as the length of the alignment between two sequences and the percentage of
identical residues fall in the region marked as “safe” (in a plot of percentage identity
against alignment length), the two sequences are practically guaranteed to adopt a
similar structure.
HOMOLOGY MODELLING CONCEPT
STRUCTURE PREDICTION BY HOMOLOGY MODELLING
AN EXAMPLE
To determine the structure of sequence A (150 amino acids long), first compare
sequence A to all the sequences of known structures stored in the PDB (using, for
example, BLAST). Suppose the search finds a sequence B (300 amino acids long)
containing a region of 150 amino acids that matches sequence A with 50% identical
residues.
As this match (alignment) clearly falls in the safe zone (50% identity), we can simply
take the known structure of sequence B (the template), cut out the fragment
corresponding to the aligned region, mutate those amino acids that differ between
sequences A and B, and finally arrive at our model for structure A. Structure A is
called the target and is of course not known at the time of modeling.
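The percentage of identical residues used in this example can be computed directly from a pairwise alignment. The following is a minimal sketch, assuming the alignment is given as two equal-length strings with "-" marking gaps; the function name and the toy sequences are illustrative only, and other definitions of percentage identity (for example, dividing by the length of the shorter sequence) are also in use.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of aligned (non-gap) columns whose residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Only columns where both sequences carry a residue count as aligned.
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    identical = sum(1 for x, y in pairs if x == y)
    return 100.0 * identical / len(pairs)


# Toy alignment (hypothetical sequences, not real proteins).
target = "MKT-AYIAKQR"
template = "MRTLAYLA-QR"
identity = percent_identity(target, template)
print(f"{identity:.1f}% identity; above 30% cutoff: {identity >= 30.0}")
```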
HISTORY
The first homology modelling studies were done using wire and plastic models of
bonds and atoms as early as the 1960s. The models were constructed by taking the
coordinates of a known protein structure and modifying by hand those amino
acids that did not match the structure.
In 1969 David Phillips, Browne and co-workers published the first paper regarding
homology modelling. They modelled α-lactalbumin based on the structure of hen
egg-white lysozyme. The sequence identity between these two proteins was 39%.
STEPS OF HOMOLOGY MODELLING
1. Template recognition and initial alignment
2. Alignment correction
3. Backbone generation
4. Loop modeling
5. Side-chain modeling
6. Model optimization
7. Model validation
[Flowchart. Boxes, in the order extracted: protein sequence; database searches; sequence alignment; good structure homologue?; secondary structure prediction; improve alignment using secondary structure prediction; homology modelling; minimisation; check model; three-dimensional structure]
1. Template recognition and initial alignment
Template recognition and selection involve searching the PDB for
homologous proteins with determined structures. The search can be
performed using simple sequence alignment programs such as BLAST or
FASTA, because a percentage identity between the target sequence and a
possible template that falls in the safe zone is high enough to be detected
by these programs.
To obtain a list of hits (the candidate modeling templates and their
corresponding alignments), the program compares the query sequence to all
the sequences of known structures in the PDB using mainly two matrices:
1. A residue exchange matrix, which scores how likely each pair of residues is to be aligned
2. An alignment matrix, which accumulates those scores along the two sequences to find the best-scoring alignment
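How the two matrices work together can be sketched with a small dynamic-programming alignment. This is a simplified global alignment under stated assumptions, not the heuristic local search that BLAST or FASTA actually perform; `toy_exchange` is a hypothetical stand-in for a real residue exchange matrix, and the gap penalty is an arbitrary illustrative value.

```python
def toy_exchange(x: str, y: str) -> int:
    """Hypothetical residue exchange score: reward identity, penalise mismatch."""
    return 2 if x == y else -1


def global_alignment_score(a: str, b: str, exchange=toy_exchange, gap: int = -2) -> int:
    """Fill the alignment matrix and return the best global alignment score."""
    rows, cols = len(a) + 1, len(b) + 1
    # Alignment matrix: cell [i][j] holds the best score for a[:i] versus b[:j].
    matrix = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        matrix[i][0] = i * gap
    for j in range(1, cols):
        matrix[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = matrix[i - 1][j - 1] + exchange(a[i - 1], b[j - 1])
            gap_in_b = matrix[i - 1][j] + gap
            gap_in_a = matrix[i][j - 1] + gap
            matrix[i][j] = max(diagonal, gap_in_b, gap_in_a)
    return matrix[-1][-1]


print(global_alignment_score("ACD", "AD"))
```

Ranking database sequences by such a score is what produces the list of hits; a traceback through the same matrix would recover the alignment itself.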
homology modellign lecture .pdf
2. Alignment correction
Sometimes it is difficult to align two sequences in a region where the
percentage sequence identity is very low. One can then use other sequences
from homologous proteins to find a solution.
For example, aligning the sequence LTLTLTLT with YAYAYAYAY directly is nearly
impossible; a third sequence, TYTYTYTYT, that aligns easily to both of them
resolves the ambiguity.
In the accompanying figure, alignment 2 is correct because it leads to a small
gap in the structure, compared to the huge hole associated with alignment 1.
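The LTLTLTLT / YAYAYAYAY example can be checked with a few lines of code. The sketch below is an assumption-laden illustration: it only considers ungapped alignments and counts identical positions, and the function name is hypothetical. It shows that the two sequences share no identical residues directly, while each shares four with the bridging sequence TYTYTYTYT.

```python
def best_ungapped_offset(a: str, b: str) -> tuple[int, int]:
    """Return (offset, matches) for the shift of b along a with most identities.

    b[i] is compared with a[i + offset]; positions falling outside a are ignored.
    """
    best_offset, best_matches = 0, -1
    for offset in range(-(len(b) - 1), len(a)):
        matches = sum(
            1 for i, residue in enumerate(b)
            if 0 <= i + offset < len(a) and a[i + offset] == residue
        )
        if matches > best_matches:
            best_offset, best_matches = offset, matches
    return best_offset, best_matches


bridge = "TYTYTYTYT"
print(best_ungapped_offset("YAYAYAYAY", "LTLTLTLT"))  # direct comparison
print(best_ungapped_offset(bridge, "LTLTLTLT"))       # via the third sequence
print(best_ungapped_offset(bridge, "YAYAYAYAY"))      # via the third sequence
```

Because both sequences sit at the same offset relative to the bridging sequence, their mutual register follows even though they have no residues in common.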
3. BACKBONE GENERATION
Once the alignment is judged correct, the backbone of the target can be created.
The coordinates of the template backbone are copied to the target.
Where aligned residues are identical, the side-chain coordinates are also copied.
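The copying rule above can be sketched as follows. This is a minimal illustration under assumed data structures: each template residue is a pair of a one-letter code and a dictionary of atom names to coordinates, the alignment is given as two gapped strings, and the function and variable names are hypothetical rather than taken from any modeling package.

```python
BACKBONE_ATOMS = ("N", "CA", "C", "O")


def copy_backbone(template_residues, target_aligned, template_aligned):
    """Build target residues by copying coordinates from the aligned template.

    template_residues: list of (one_letter_code, {atom_name: (x, y, z)}).
    Returns a list of (one_letter_code, {atom_name: (x, y, z)}) for the target.
    """
    model = []
    template_index = 0
    for target_res, template_res in zip(target_aligned, template_aligned):
        if template_res == "-":
            # Insertion in the target: no coordinates yet (left for loop modeling).
            if target_res != "-":
                model.append((target_res, {}))
            continue
        _, atoms = template_residues[template_index]
        template_index += 1
        if target_res == "-":
            continue  # Deletion in the target: skip this template residue.
        if target_res == template_res:
            coords = dict(atoms)  # Identical residue: copy side chain as well.
        else:
            coords = {name: xyz for name, xyz in atoms.items()
                      if name in BACKBONE_ATOMS}
        model.append((target_res, coords))
    return model


template = [
    ("A", {"N": (0, 0, 0), "CA": (1, 0, 0), "C": (2, 0, 0), "O": (2, 1, 0), "CB": (1, 1, 0)}),
    ("S", {"N": (3, 0, 0), "CA": (4, 0, 0), "C": (5, 0, 0), "O": (5, 1, 0), "OG": (4, 2, 0)}),
]
model = copy_backbone(template, "AT", "AS")
print([(name, sorted(atoms)) for name, atoms in model])
```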
4. LOOP MODELLING
After sequence alignment, insertions and deletions often leave gaps in the
alignment. These regions are built by loop modeling, which is less accurate
than modeling the aligned core. Currently, two main techniques are
used to approach the problem:
The database searching method - this involves finding loops from known
protein structures and superimposing them onto the two stem regions (mostly
main-chain atoms) of the target protein. Specialized programs such as
FREAD and CODA can be used.
The ab initio method - this generates many random loops and searches for
one that has reasonably low energy and φ and ψ angles in the allowed
regions of the Ramachandran plot.
[Figure: the red loop is modeled with the green residues as anchor residues.
The insertion of 2 residues results in a longer loop.]
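The geometric idea behind the database searching method can be sketched as a filter over candidate loops. This is a hypothetical illustration, not the algorithm used by FREAD or CODA: candidates are assumed to be pre-extracted fragments described only by their length and the distance between their stem residues, and the tolerance is an arbitrary value.

```python
def pick_loop_candidates(candidates, loop_length, anchor_distance, tolerance=1.0):
    """Select database loops that fit between the target's anchor residues.

    candidates: list of dicts with keys 'id', 'length' and 'stem_distance' (in Å).
    Returns matching candidates, best geometric fit first.
    """
    hits = [
        c for c in candidates
        if c["length"] == loop_length
        and abs(c["stem_distance"] - anchor_distance) <= tolerance
    ]
    # Rank by how closely the stem separation matches the gap to be bridged.
    return sorted(hits, key=lambda c: abs(c["stem_distance"] - anchor_distance))


library = [
    {"id": "a", "length": 4, "stem_distance": 9.0},
    {"id": "b", "length": 4, "stem_distance": 10.4},
    {"id": "c", "length": 5, "stem_distance": 10.0},
    {"id": "d", "length": 4, "stem_distance": 12.5},
]
print([c["id"] for c in pick_loop_candidates(library, 4, 10.0)])
```

In practice the surviving candidates would still need to be superimposed on the stems and checked for clashes with the rest of the model.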
5. Side-Chain Modeling
Side-chain modeling is important in evaluating protein–ligand interactions at
active sites and protein–protein interactions at the contact interface.
A side chain can be built by searching every possible conformation for
every torsion angle of the side chain and selecting the one that has the lowest
interaction energy with neighboring atoms.
Alternatively, a rotamer library can be used, which contains the favorable
side-chain torsion angles extracted from known protein crystal structures.
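Selecting the rotamer with the lowest interaction energy can be sketched as below. The sketch makes strong simplifying assumptions: each rotamer is reduced to a single pseudo-atom position, the "energy" is a toy quadratic clash penalty rather than a real force field, and the rotamer names, coordinates and 3.0 Å contact distance are illustrative values only.

```python
import math


def clash_energy(position, neighbours, min_distance=3.0):
    """Toy repulsion: quadratic penalty for each neighbour closer than min_distance."""
    energy = 0.0
    for neighbour in neighbours:
        distance = math.dist(position, neighbour)
        if distance < min_distance:
            energy += (min_distance - distance) ** 2
    return energy


def best_rotamer(rotamers, neighbours):
    """Return the name of the rotamer with the lowest clash energy.

    rotamers: dict mapping rotamer name to a side-chain pseudo-atom position.
    """
    return min(rotamers, key=lambda name: clash_energy(rotamers[name], neighbours))


rotamers = {"g+": (1.0, 0.0, 0.0), "t": (0.0, 4.0, 0.0), "g-": (-1.0, 0.0, 0.0)}
neighbours = [(2.0, 0.0, 0.0), (-2.0, 0.0, 0.0)]
print(best_rotamer(rotamers, neighbours))
```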
6. Model Optimization
An energy minimization procedure is applied to the entire model, adjusting the
relative positions of the atoms so that the overall conformation of the
molecule has the lowest possible potential energy. The goal is to relieve
steric collisions without altering the overall structure.
Optimization can also be done by molecular dynamics simulation, which
moves the atoms toward a global minimum by applying various simulation
conditions (heating, cooling, explicit water molecules) and thus has a
better chance of finding the true structure.
Energy = Stretching Energy + Bending Energy + Torsion Energy + Non-Bonded Interaction Energy
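Energy minimization can be illustrated on the simplest term of this sum. The following sketch applies steepest descent to a single harmonic bond-stretching term; the equilibrium length, force constant, step size and iteration count are illustrative assumptions, and a real minimizer works on all terms and all atomic coordinates at once.

```python
def bond_energy(r, r0=1.53, k=300.0):
    """Harmonic stretching energy for one bond of length r (illustrative parameters)."""
    return k * (r - r0) ** 2


def minimise_bond(r, r0=1.53, k=300.0, step=1e-3, iterations=200):
    """Steepest descent: repeatedly move the bond length against the energy gradient."""
    for _ in range(iterations):
        gradient = 2.0 * k * (r - r0)  # dE/dr for the harmonic term
        r -= step * gradient
    return r


start = 1.20  # a compressed bond, as might follow from copied coordinates
relaxed = minimise_bond(start)
print(f"energy before: {bond_energy(start):.3f}, after: {bond_energy(relaxed):.6f}")
```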
7. Model Validation
Every homology model contains errors. The two main determinants of model error are:
1. The percentage sequence identity between template and target. If it is
greater than 90%, the accuracy of the model can be comparable to that of
crystallographically determined structures; if it is less than 30%, large
errors occur.
2. The number of errors in the template.
The final model has to be evaluated by checking the φ–ψ angles, chirality,
bond lengths, close contacts and other stereochemical properties.
Modeling programs include Modeller, SWISS-MODEL, Schrödinger and 3D-JIGSAW.
A successful model depends on template selection, the algorithm used and the
validation of the model.
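One simple geometric check of this kind is the spacing of consecutive alpha carbons, which is about 3.8 Å for residues joined by a trans peptide bond. The sketch below flags pairs that deviate from that value; the function name and the 0.3 Å tolerance are assumptions chosen for illustration, and genuine cis peptide bonds (which give a shorter spacing) would also be flagged and need manual inspection.

```python
import math


def flag_ca_distances(ca_coords, expected=3.8, tolerance=0.3):
    """Return (i, i + 1, distance) for consecutive CA atoms with unusual spacing."""
    flagged = []
    for i in range(len(ca_coords) - 1):
        distance = math.dist(ca_coords[i], ca_coords[i + 1])
        if abs(distance - expected) > tolerance:
            flagged.append((i, i + 1, round(distance, 2)))
    return flagged


# Toy trace: the second pair is stretched, as after a badly closed loop.
trace = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 6.0, 0.0)]
print(flag_ca_distances(trace))
```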
Advantages
It can locate the alpha carbons of key residues inside the folded
protein.
It can help to guide mutagenesis experiments or to hypothesize
structure-function relationships.
The positions of conserved regions on the protein surface can help identify
putative active sites, binding pockets and ligands.
Disadvantages
Homology models cannot predict the conformations of insertions or
deletions, or side-chain positions, with a high level of accuracy.
Homology models are of limited use in the modeling and ligand docking studies
needed for drug design and development. They may nevertheless be helpful
for these purposes if the sequence identity with the template is
greater than 70%.
RAMACHANDRAN PLOT
In a polypeptide, the main-chain N-Calpha and Calpha-C bonds are relatively
free to rotate. These rotations are represented by the
torsion angles phi (φ) and psi (ψ), respectively.
A Ramachandran plot (or a [φ,ψ] plot),
originally developed in 1963 by G. N.
Ramachandran, C. Ramakrishnan, and V.
Sasisekharan, is a way to visualize the backbone
dihedral angles ψ against φ of amino acid
residues in a protein structure.
A Ramachandran plot can be used in two ways:
The first is to show in theory which values, or
conformations, of the ψ and φ angles are
possible for an amino acid residue in a protein.
The second is to show the empirical distribution of
data points observed in a single structure (as
used for structure validation) or in a
database of many structures.
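The φ and ψ values plotted in a Ramachandran plot are dihedral angles computed from four consecutive backbone atoms: C(i-1), N(i), Calpha(i), C(i) for φ, and N(i), Calpha(i), C(i), N(i+1) for ψ. The sketch below computes a signed dihedral angle from four points using plain Python; the helper names are illustrative, and the coordinates in the usage lines are made-up test geometry rather than real protein atoms.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points, about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def phi_psi(c_prev, n, ca, c, n_next):
    """Return (phi, psi) for one residue from its flanking backbone atoms."""
    return dihedral(c_prev, n, ca, c), dihedral(n, ca, c, n_next)


print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))  # trans arrangement
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))   # cis arrangement
```

Computing (φ, ψ) for every residue of a model and comparing the points with the allowed regions is the basis of the validation use described above.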
