Bioinformatics
Applications and Challenges
Dr. SV Singh, Assistant Professor
Bundelkhand University, Jhansi
5/21/2020 1
A
C
G
T
https://images.app.goo.gl/ezxo1TyAXXqBwiUYA
Language of a four-letter alphabet
(A, T, C, G)
Language of a twenty-letter alphabet
(A, C, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, Y)
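The link between the two alphabets can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard genetic code and a DNA string read in frame from its first base; the names `CODON_TABLE` and `translate` are hypothetical helpers, not part of any tool named in these slides.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string codon by codon, starting at its first base.

    Unknown codons (for example, ones containing N) become "X".
    A trailing partial codon is ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGCAATATGGACAA"))
```

Each group of three letters from the four-letter alphabet maps to one letter of the twenty-letter alphabet, which is why 64 codons are enough to cover 20 amino acids plus stop signals.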
Biological Databases
• Structure: PDB, SCOP, CATH, etc.
• Sequence
   • Protein: SwissProt, PIR, TrEMBL, PROSITE, Pfam, etc.
   • Nucleic acid: GenBank, EMBL, DDBJ
• Other: genome browsers, RELIBASE, OWL, FlyBase, etc.
https://images.app.goo.gl/L9VqL8rp7ekP87paA
IMPORTANT BIOINFORMATICS RESOURCES
• Sequence alignment: BLAST, FASTA, ClustalW
• Gene prediction: GENSCAN, GRAIL, GenomeScan, GeneMark
• Protein domain analysis and identification: Pfam, ProDom
• Pattern identification: Gibbs Sampler, MEME
• Protein folding: PredictProtein, SWISS-MODEL
• Protein structure prediction: MODELLER
• Protein structure analysis: PROCHECK, WHAT IF
• Phylogeny reconstruction: PHYLIP, PAUP
• Molecular docking: DOCK 6, AutoDock
and many more...
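To give a feel for what sequence alignment tools compute, here is a minimal sketch of global alignment scoring by dynamic programming (the Needleman-Wunsch recurrence). It is not how BLAST or FASTA work internally, since those rely on faster heuristics; the function name and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen only for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the best global alignment score of strings a and b.

    Uses the Needleman-Wunsch recurrence with a linear gap penalty,
    keeping only one previous row of the score matrix in memory.
    """
    # Row 0: aligning an empty prefix of a against prefixes of b costs gaps only.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a against an empty prefix of b
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            gap_in_b = prev[j] + gap      # a[i-1] aligned to a gap
            gap_in_a = curr[j - 1] + gap  # b[j-1] aligned to a gap
            curr.append(max(diagonal, gap_in_b, gap_in_a))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATTACA"))
    print(global_alignment_score("ACGT", "AGT"))
```

The running time grows with the product of the two sequence lengths, which is one reason database search tools use heuristics instead of exact dynamic programming on every pair.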
Genomics
Structural genomics
1. Construction of genomic sequence data
2. Gene discovery and localization
3. Construction of gene maps
Functional genomics
1. Biological function of genes
2. Regulation
3. Products
4. Plant development studies
Comparative genomics
1. Compare gene sequences to elucidate functional or evolutionary relationships
Genome Annotation
1. The process of identifying the locations of genes and coding regions in a genome and determining what those genes do
2. Finding structural elements and attaching their functions to each genome location
Structural annotation (identification of genomic elements)
1. Open reading frames and their localization
2. Coding regions
3. Location of regulatory motifs
4. Start/stop
5. Splice sites
6. Non-coding regions/RNAs
7. Introns
Functional annotation (attaching biological information to genomic elements)
1. Biochemical function
2. Biological function
3. Regulation and interactions involved
4. Expression
5. Using known structural annotation to predict protein sequence
DNA sequences need to be decoded
CCTGACAAATTCGACGTGCGGCATTGCATGCAGACGTGCATG
CGTGCAAATAATCAATGTGGACTTTTCTGCGATTATGGAAGAA
CTTTGTTACGCGTTTTTGTCATGGCTTTGGTCCCGCTTTGTTC
AGAATGCTTTTAATAAGCGGGGTTACCGGTTTGGTTAGCGAGA
AGAGCCAGTAAAAGACGCAGTGACGGAGATGTCTGATG CAA
TAT GGA CAA TTG GTT TCT TCT CTG AAT .................................
.............. TGAAAAACGTA
Features marked on the sequence: TF binding site, promoter, ribosome binding site, transcription start site
ORF = Open Reading Frame
CDS = Coding Sequence
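Finding open reading frames is one of the simplest decoding steps and can be sketched directly. The code below is a minimal illustration, assuming an ORF runs from an ATG to the first in-frame stop codon and scanning only the three forward reading frames; the name `find_orfs` and the `min_codons` threshold are hypothetical. Real gene finders also scan the reverse strand and use statistical models.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end, sequence) for ORFs in the three forward frames.

    An ORF here starts at the first ATG seen in a frame and ends at the
    next in-frame stop codon (stop included). Positions are 0-based and
    end is exclusive. ORFs shorter than min_codons codons are skipped.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if (end - start) // 3 >= min_codons:
                    orfs.append((start, end, dna[start:end]))
                start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    for start, end, seq in find_orfs("CCATGCAATATTGAAA"):
        print(start, end, seq)
```

An ATG with no downstream in-frame stop codon is not reported, which matches the idea that a complete ORF needs both a start and a stop.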
Transcriptomics
The study of the complete set of RNAs (transcriptome)
encoded by the genome of a specific cell or organism at a
specific time or under a specific set of conditions
Role of transcriptomics
1. Reveal the process of development
2. Determine the role of non-coding RNAs (e.g., miRNA)
3. Identify the genetic basis of disease
4. Help study drug response
Transcriptome: an evolving definition
• The population of mRNAs expressed by a genome at any given time (1999)
• The complete collection of transcribed elements of the genome (2004)
1. mRNAs
2. Non-coding RNAs
   1. tRNAs
   2. rRNAs
   3. snmRNAs (small non-messenger RNAs)
      • miRNA and siRNA
      • snoRNAs (small nucleolar RNAs)
      • snRNAs (small nuclear RNAs)
3. Pseudogenes
Protein annotation
Identify and describe all the physicochemical, functional and structural properties of a protein, including its sequence, accession number, mass, pI, absorptivity, solubility, active sites, binding sites, reactions, substrates, cellular localization, signal peptides, homologues, functions, abundance, location, secondary and 3D structure, motifs and domains, post-translational modifications, pathways and interaction patterns.
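Some of these properties follow directly from the sequence. As a minimal sketch, the average molecular mass of an unmodified protein can be estimated by summing average residue masses and adding one water molecule. The residue masses below are the commonly tabulated average values in daltons; the function name `protein_mass` is a hypothetical helper, and post-translational modifications are ignored.

```python
# Commonly tabulated average residue masses in daltons (amino acid minus water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.0153  # one water is added back for the free N- and C-termini


def protein_mass(sequence):
    """Estimate the average molecular mass (Da) of an unmodified protein.

    Raises ValueError if the sequence contains a letter outside the
    twenty standard amino acids.
    """
    sequence = sequence.upper()
    unknown = set(sequence) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unknown residues: {sorted(unknown)}")
    if not sequence:
        return 0.0
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


if __name__ == "__main__":
    print(round(protein_mass("MQYGQ"), 2))
```

Properties such as pI or active sites need more than a lookup table, which is why annotation databases combine computed values with curated experimental evidence.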
Domain organization and post-translational
modifications of p53 protein
CURRENT SCIENCE, VOL. 107, NO. 5, 10 SEPTEMBER 2014
The Process of Systems Biology Research
Metabolic pathways - Reference pathway
> 300 metabolic pathways
KEGG: https://www.genome.jp/kegg/kegg2.html
Chemo-informatics
Chemo-informatics encompasses the design, creation, organization, management, retrieval, analysis, dissemination, visualization and use of chemical information
Drug Discovery Process
Summary of bioinformatics applications in the drug discovery process (Searls, 2000)
Other applications
• Microbial genome applications
• Antibiotic resistance
• Alternative energy resources
• Crop improvement and development of resistant varieties
• Forensic analysis
• Insect resistance
• Sequence analysis
• Literature analysis
IMPORTANT BIOINFORMATICS RESOURCES
NCBI: https://www.ncbi.nlm.nih.gov/
EBI: https://www.ebi.ac.uk/
UniProt: https://www.ebi.ac.uk/uniprot/
ExPASy: https://www.expasy.org/
PDB: https://www.rcsb.org/
UCSC Genome Browser: https://genome.ucsc.edu/
KEGG: https://www.genome.jp/kegg/kegg2.html
OMIM: https://www.omim.org/
Ensembl: https://www.ensembl.org/index.html
PubMed: https://www.ncbi.nlm.nih.gov/pubmed/
Challenges in Bioinformatics
Our genetic code: DNA.
• Writing down our genetic code by hand would probably fill millions of pages and take a very long time.
• To create a database of the genetic code of every species, the first step is to extract DNA from every species and store it in a huge repository. That is only half the problem solved.
• The data must then be processed to understand how species differ and what traits they carry, so that many questions can be answered.
• Processing this data, however, requires the most powerful supercomputers.
• It takes a combination of computers running algorithms on biological data to uncover the different traits in different species.
• Genetic diversity
• Developing tools and software to resolve these problems
Accuracy?
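The "millions of pages" estimate can be checked with simple arithmetic. The sketch below assumes a haploid human genome of roughly 3.1 billion bases and a printed page holding 3,000 characters; both figures are round assumptions, not values from these slides. It also shows why sequence repositories favour compact encodings: with four possible letters, each base needs only 2 bits.

```python
GENOME_BASES = 3_100_000_000  # assumed approximate size of one human genome copy
CHARS_PER_PAGE = 3_000        # assumed capacity of one printed page

TWO_BIT_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def pages_needed(n_bases, chars_per_page=CHARS_PER_PAGE):
    """Number of printed pages needed, rounding up to whole pages."""
    return -(-n_bases // chars_per_page)


def two_bit_bytes(n_bases):
    """Bytes needed to store n_bases at 2 bits per base, rounding up."""
    return -(-n_bases * 2 // 8)


def pack_2bit(seq):
    """Pack an A/C/G/T string into bytes, four bases per byte."""
    packed = bytearray((len(seq) + 3) // 4)
    for i, base in enumerate(seq.upper()):
        packed[i // 4] |= TWO_BIT_CODE[base] << (2 * (i % 4))
    return bytes(packed)


if __name__ == "__main__":
    print(pages_needed(GENOME_BASES), "pages")
    print(two_bit_bytes(GENOME_BASES), "bytes at 2 bits per base")
```

Under these assumptions a single genome fills about a million pages yet fits in under a gigabyte, so the harder part of the problem is processing and interpreting the data across many species rather than storing one genome.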
• Software works with a fixed set of parameters, and not every sequence or structure necessarily follows these parameters.
• Study protein-protein and protein-nucleic acid recognition and assembly; investigate integral functional units (dynamic form and function of large macromolecular complexes)
• Realize interactive modeling; foster the development of biomolecular modeling and bioinformatics
• Train computational biologists in teraflop technologies, numerical algorithms, and physical concepts; bring experimental and computational groups in molecular biomedicine closer together
• Full genome-genome comparisons
• Rapid assessment of polymorphic genetic variations
• Structure determination of large macromolecular assemblies/complexes
• Rapid structural/topological clustering of proteins
• Prediction of unknown molecular structures
• Protein folding
• Computer simulation of membrane structure and dynamic function
• To cut and paste genes on demand, for modifying metabolic pathways
• To efficiently and accurately simulate a large number of reactions
• To construct, edit and maintain large metabolic knowledge bases
• Predictive models of where and when transcription will occur in a genome: transcription initiation and termination, RNA splicing, signal transduction pathways, cellular response to external stimuli
• Determining effective protein-DNA, protein-RNA and protein-protein recognition
• Accurate ab initio structure prediction
• Rational design of small-molecule inhibitors of proteins
• Mechanistic understanding of protein evolution: understanding exactly how new protein functions evolve
• Mechanistic understanding of speciation: molecular details of how speciation occurs
• Systematic ways to describe the functions of any gene or protein
Many more questions...
Thank you