Topic: Computational Study of Protein
From Scratch: From Structure to Function
Kinza Irshad
Soil and Ecosystem Ecology Lab,
COMSATS University Islamabad,
Abbottabad
5/28/2019 1
What is a Protein?
Long polypeptide chains
Made up of 20 naturally occurring amino acids
Adjacent amino acids are joined by a peptide bond, which has partial double-bond character
Functional proteins fold into a 3D structure containing helices, beta sheets, and loops
Helices and sheets are the rigid parts, while loops are flexible
In the 3D structure, hydrophobic amino acids are mostly buried in the core, while hydrophilic ones are oriented toward the surface.
Folding State of Protein
First Step: Protein Sequence
We start from the protein sequence.
We can take the protein sequence from a database or translate a gene sequence into protein.
In both cases, we should end up with the sequence in FASTA format.
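The translation route can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard genetic code, a coding sequence that is already in frame, and no introns; the names `CODON_TABLE` and `translate` are made up for this example and are not part of any tool used in the session.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, stop_at_stop=True):
    """Translate an in-frame coding sequence into one-letter protein code.

    Unknown codons (for example those containing N) become 'X'.
    Trailing bases that do not fill a codon are ignored.
    """
    dna = dna.upper().replace("U", "T")  # accept RNA input as well
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*" and stop_at_stop:
            break
        protein.append(aa)
    return "".join(protein)
```

For real work an established library or the NCBI record's own translation is usually the safer choice, since reading frame and organism-specific codon tables matter.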
What is FASTA Format?
>MH08765.1 Serine_protease, Human
ELPDFTPLVEQASPAVVNISTRQKLPDRAMARGQLSIPDLEGLPPMFRDFLERSIPQVPRNPRGQQREAQSLGSGFIISN
DGYITNNHVVADADEILVRLSDRSEHKAKLIGADPRSDVAVLKIEAKNLPTLKLGDSNKLKVGEWVLAIGSPFGFDHSVTA
GIVSAKGRSLPNESYVPFIQTDVAINPGNSGGPLLNLQGEVVGINSQIFTRSGGFMGLSFAIPIDVALNVADQLKKAGKVS
RGWLGVVIQEVNKDLAESFGLDKPSGALVAQLVEDGPAAKGGLQVGDVILSLNGQSINESADLPHLVGNMKPGDKINL
DVIRNGQRKSLSMAVGSLPDDDEEIASMGAPGAERSSNRLGVTVADLTAEQRKSLDIQGGVVIKEVQDGPAAVIGLRPG
DVITHLDNKAVTSTKVFADVAKALPKNRSVSMRVLRQGRASFITFKLA
Header region: the first line, starting with ">"
Protein sequence: the lines that follow, in one-letter code
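The two parts of a FASTA record can be separated with a short parser. The sketch below is only an illustration: `parse_fasta` is a hypothetical helper, and the sample text reuses the header and the first 40 residues of the sequence shown above.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)  # sequence lines are joined without spaces
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


sample = """>MH08765.1 Serine_protease, Human
ELPDFTPLVEQASPAVVNIS
TRQKLPDRAMARGQLSIPDL
"""
records = parse_fasta(sample)
```

Each tuple holds the header text without the leading ">" and the sequence with its line breaks removed, which is the form most of the tools in this session expect.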
How to Get a Protein Sequence from a Database?
1. We take as our example NCBI, one of the largest and most frequently used nucleotide and protein databases
https://www.ncbi.nlm.nih.gov/
2. Sequences can be downloaded in two formats
◦ GenBank
◦ FASTA
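Besides the website, NCBI records can also be requested in FASTA format through the E-utilities `efetch` service. The sketch below only builds the request URL and wraps the download in a function; the helper names are hypothetical, and the right `db` value ("protein" or "nuccore") depends on whether the accession is a protein or a nucleotide record, which the slides do not state for the practical accession.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession, db="protein"):
    """Build an efetch URL that asks for a record as plain-text FASTA."""
    params = {"db": db, "id": accession, "rettype": "fasta", "retmode": "text"}
    return EFETCH + "?" + urlencode(params)


def fetch_fasta(accession, db="protein", timeout=30):
    """Download the FASTA text for an accession (needs network access)."""
    with urlopen(efetch_fasta_url(accession, db), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

A call such as `fetch_fasta("MH045598", db="nuccore")` would return the record as FASTA text if the accession exists in that database; NCBI asks users to keep request rates low when scripting against this service.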
After Getting the Sequence, What Next?
Next, we work out the physicochemical properties of our target protein, including molecular weight, isoelectric point, and stability.
We will use a very popular tool for this, ExPASy ProtParam.
https://web.expasy.org/protparam/
The input is the protein sequence in FASTA format.
GenBank format is not accepted.
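To make one of these properties concrete, the molecular weight can be approximated by summing average residue masses and adding one water molecule for the free termini. This is a rough sketch, not ProtParam's implementation: the mass table holds commonly tabulated average values, and the function names are made up for this example, so small differences from the server's output are possible.

```python
from collections import Counter

# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # added once for the N-terminal H and C-terminal OH


def composition(seq):
    """Count how often each amino acid occurs."""
    return Counter(seq.upper())


def molecular_weight(seq):
    """Approximate average molecular weight of an unmodified peptide chain."""
    seq = seq.upper()
    unknown = set(seq) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER
```

The isoelectric point and the stability estimate need further parameter tables (side-chain pKa values and dipeptide weights), so they are left to the server here.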
Prediction of Secondary Structure Elements
Proteins have several levels of structure: primary, secondary, tertiary, and (for multi-subunit proteins) quaternary.
Secondary structure has three types of element: alpha helices, beta sheets, and loops.
Two types of algorithm are used to predict secondary structure elements
I. Ab initio based
II. Homology based
Prediction of Secondary Structure Elements
Ab initio algorithms are stand-alone algorithms that identify secondary structure elements from the intrinsic tendency of each amino acid to adopt a particular conformation. For example, glycine and proline are found mostly in loops.
Homology-based algorithms make predictions from the secondary structure of homologous proteins.
Structure is more conserved than sequence.
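The idea of intrinsic tendencies can be illustrated with a deliberately crude toy: measuring how densely glycine and proline occur around each position. This is not a real ab initio predictor (those use propensity values for all 20 amino acids and for helix and sheet as well); the function name and the window size of 7 are arbitrary choices for this sketch.

```python
LOOP_FAVOURING = {"G", "P"}  # only the two residues named in the text


def gp_density(seq, window=7):
    """Fraction of Gly/Pro in a window centred on each residue.

    Windows are truncated at the sequence ends, so every position
    gets a value between 0.0 and 1.0.
    """
    seq = seq.upper()
    half = window // 2
    densities = []
    for i in range(len(seq)):
        segment = seq[max(0, i - half): i + half + 1]
        hits = sum(aa in LOOP_FAVOURING for aa in segment)
        densities.append(hits / len(segment))
    return densities
```

Stretches with a high value are more likely to be loops than helices or sheets, but the number is only a hint and should not be read as a prediction.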
PSIPRED: A Homology-Based Tool
We will use PSIPRED, a homology-based tool, to predict secondary structure
http://bioinf.cs.ucl.ac.uk/psipred/
The submitted protein sequence is searched against a protein database with BLAST (PSI-BLAST) to find homologous sequences, selected by query coverage and E-value.
The tool aligns the homologous sequences (multiple sequence alignment, MSA) to obtain information about conservation.
Conserved regions should ideally share the same secondary structure elements.
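What "information about conservation" means can be shown with a simple per-column score on an alignment. The sketch below assumes the sequences are already aligned to equal length with "-" for gaps, and scores each column by the share of sequences carrying its most common residue. It is an illustration only; PSIPRED itself works from a position-specific profile, not from this score.

```python
from collections import Counter


def column_conservation(aligned):
    """Score each alignment column from 0.0 to 1.0.

    The score is the count of the most common residue divided by the
    number of sequences; gap-only columns score 0.0.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*aligned):
        residues = [c for c in column if c != "-"]
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


scores = column_conservation(["ACDE", "ACDF", "ACGE"])
```

Here the first two columns are fully conserved (score 1.0) and the last two are conserved in two of three sequences.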
PSI-PRED Results
Signal Peptide Prediction
Signal peptides help define the localization of a protein in the cell.
Present at the N-terminus of the newly synthesized protein
Predominantly hydrophobic amino acids
Important to predict, especially if we want to clone the gene into an expression system.
Signal Peptide Prediction
A commonly used tool for signal peptide prediction is SignalP 4.1
http://www.cbs.dtu.dk/services/SignalP/
The input sequence is in FASTA format
Is Our Protein Transmembrane?
In the cell, proteins are broadly either globular (soluble) or transmembrane.
Transmembrane proteins can be
i. Transmembrane helical
ii. Beta-barrels
To predict transmembrane helices, the TMHMM (Transmembrane Hidden Markov Model) algorithm is used.
http://www.cbs.dtu.dk/services/TMHMM/
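TMHMM itself is a hidden Markov model and is not reproduced here. As a much simpler stand-in, the classic Kyte-Doolittle sliding-window hydropathy plot gives a feel for why membrane-spanning helices can be spotted from sequence: they are long, strongly hydrophobic stretches. The window of 19 residues and the cutoff of 1.6 are the values commonly quoted for this method; the function names are made up for this sketch.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full window along the sequence.

    Returns an empty list if the sequence is shorter than the window.
    Non-standard residue codes raise KeyError.
    """
    values = [KD[aa] for aa in seq.upper()]
    if len(values) < window:
        return []
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def candidate_tm_windows(seq, window=19, threshold=1.6):
    """1-based start positions of windows at or above the threshold."""
    profile = hydropathy_profile(seq, window)
    return [i + 1 for i, score in enumerate(profile) if score >= threshold]
```

Windows reported by `candidate_tm_windows` are only candidates; signal peptides are hydrophobic too, which is one reason a dedicated predictor is used in practice.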
Prediction of Domains & Motifs in Protein Structure
The InterPro online server will be used to identify domains
https://www.ebi.ac.uk/interpro/
The input sequence is in FASTA format.
InterPro uses homology searches to identify domains and motifs.
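Short motifs, unlike whole domains, can often be written as a simple pattern. As an illustration that is separate from how InterPro works internally, the sketch below scans a sequence for the PROSITE N-glycosylation pattern N-{P}-[ST]-{P} (Asn, anything but Pro, Ser or Thr, anything but Pro) using a regular expression; `find_motif` is a hypothetical helper.

```python
import re

# PROSITE-style pattern N-{P}-[ST]-{P} as a regular expression.
# The lookahead lets overlapping matches be reported as well.
N_GLYCOSYLATION = re.compile(r"(?=(N[^P][ST][^P]))")


def find_motif(seq, pattern=N_GLYCOSYLATION):
    """Return (1-based position, matched text) for every motif hit."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(seq.upper())]
```

A hit only means the pattern is present in the sequence; such short patterns match by chance quite often, so the result is a candidate site and not evidence of function.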
3D Structure Prediction of a Protein
Protein structure prediction is the inference of the three-dimensional structure of
a protein from its amino acid sequence—that is, the prediction of its folding and its secondary
and tertiary structure from its primary structure.
Structure prediction is fundamentally different from the inverse problem of protein design.
Protein structure prediction is one of the most important goals pursued by bioinformatics
and theoretical chemistry; it is highly important in medicine (for example, in drug design)
and biotechnology (for example, in the design of novel enzymes).
3D Structure Prediction of Protein
3D structure prediction is one of the most complicated computational processes
It is an important step in protein structure analysis
There are three experimental techniques for determining the 3D structure of proteins
i. X-ray crystallography
ii. NMR
iii. Cryo-electron microscopy
5/28/2019 DEPARTMENT OF BIOCHEMISTRY & BIOTECHNOLOGY 19
Structure prediction of a protein
End of Session No. 1
Let's do some practical work
Accession # MH045598
https://www.ncbi.nlm.nih.gov/
https://web.expasy.org/protparam/
http://bioinf.cs.ucl.ac.uk/psipred/
http://www.cbs.dtu.dk/services/SignalP/
https://www.ebi.ac.uk/interpro/
https://zhanglab.ccmb.med.umich.edu/I-TASSER/
Thank You
Happy to Answer Your Questions
