Gene identification
Open Reading frame
Six ORFs of dsDNA
Complication with Introns
Genome annotation
•Annotation : Obtaining biological information from
unprocessed sequence data
•Structural annotation : Identification of genes and
other important sequence elements
•Functional annotation : The determination of the functional
roles of genes in the organism
Genome annotation
•Raw genomic sequence can be annotated by:
i. Comparison with databases of previously cloned genes and ESTs
ii. Gene prediction based on consensus features such as
- Promoters
- Splice sites
- Polyadenylation sites
- ORFs
Gene identification
Gene finding in eukaryotes is difficult because genes make up a much smaller share of the genome:
Genome              Genes (% of genome)
Bacterial genome    80-85%
Yeast               70%
Fruit fly           25%
Human genome        3-5%
In the human genome:
Typical exon = 150 bp
Intron = several kb
Complete gene = can span hundreds of kb
ORF prediction
•Three reading frames are possible on each strand of a
DNA molecule, so a “six-frame translation” is used
- The result is 6 potential protein sequences
- The longest frame uninterrupted by a stop codon is usually
taken to be the correct one
•Finding the end of an ORF is easier than finding its beginning
The beginning can be found using:
- Start codon
- Kozak sequence (CCGCCAUGG) flanking the start codon
- CpG islands
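
The six-frame translation described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular gene finder: it assumes the standard genetic code, an input containing only A, C, G and T, and the function names (`six_frame_translation`, `longest_orf`) are made up for this example. An open frame that runs off the end of the sequence is reported even though no stop codon closes it.

```python
# Standard genetic code, listed in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frame_translation(seq):
    """Translate all six frames; keys are '+1'..'+3' and '-1'..'-3'."""
    seq = seq.upper()
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            codons = (s[i:i + 3] for i in range(offset, len(s) - 2, 3))
            # Unknown codons (e.g. containing N) are written as 'X'.
            frames[f"{strand}{offset + 1}"] = "".join(
                CODON_TABLE.get(c, "X") for c in codons)
    return frames


def longest_orf(seq):
    """Return (frame, protein) for the longest stretch that starts at a
    methionine (ATG) and is not interrupted by a stop codon ('*')."""
    best = ("", "")
    for label, protein in six_frame_translation(seq).items():
        for segment in protein.split("*"):
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best[1]):
                best = (label, segment[start:])
    return best


print(longest_orf("ATGGCCAAATAG"))  # ('+1', 'MAK')
```

Picking the longest frame is only a heuristic; in intron-containing eukaryotic genes a single exon may be far shorter than a chance ORF elsewhere.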
Software programs for gene identification
•Advantage : Speed – annotation can be carried out
concurrently with sequencing itself.
•Disadvantage : Accuracy
•Two strategies used are,
- Homology searching
- ab initio prediction
Genome analysis2
ab initio prediction
Programs grouped by type of algorithm:
GRAIL – Based on neural networks
- Predicts exons, genes, promoters, polyA sites, CpG islands,
EST similarities and repetitive elements
GeneFinder – Rule-based system
GENSCAN, Genie, HMMgene, GeneMark.hmm, FGENEH
– Hidden Markov models
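
To give a feel for how a hidden Markov model labels a sequence, here is a toy two-state example decoded with the Viterbi algorithm. It is a sketch under loud assumptions: the two states (non-coding "N" and coding "C"), and every probability in the tables, are invented for illustration and do not come from GENSCAN or any other program named above, whose models are far richer (exons, introns, splice signals, codon position).

```python
import math

STATES = ("N", "C")  # N = non-coding, C = coding (toy labels)

# All numbers below are made-up illustration values.
START = {"N": 0.5, "C": 0.5}
TRANS = {"N": {"N": 0.9, "C": 0.1},
         "C": {"N": 0.1, "C": 0.9}}
EMIT = {"N": {"A": 0.35, "C": 0.15, "G": 0.15, "T": 0.35},   # AT-rich
        "C": {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15}}   # GC-rich


def viterbi(seq):
    """Return the most probable state path for a non-empty ACGT string."""
    # scores[i][s] = best log-probability of any path ending in state s at i
    scores = [{s: math.log(START[s]) + math.log(EMIT[s][seq[0]])
               for s in STATES}]
    back = []  # back[i][s] = best predecessor of state s at position i + 1
    for base in seq[1:]:
        prev, col, ptr = scores[-1], {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            col[s] = (prev[best] + math.log(TRANS[best][s])
                      + math.log(EMIT[s][base]))
            ptr[s] = best
        scores.append(col)
        back.append(ptr)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return "".join(reversed(path))


print(viterbi("ATATATATGCGCGCGCGC"))  # NNNNNNNNCCCCCCCCCC
```

Working in log space avoids numerical underflow on long sequences, which matters once the input is a real contig rather than 18 bases.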
GENSCAN
ab initio prediction
1. Feature-dependent methods
Features of eukaryotic genes that are recognized include:
- Control signals such as the TATA box, cap site, Kozak consensus
and polyadenylation sites
HEXON and MZEF are gene prediction programs that predict
only a single feature, the exon.
2. A few programs depend on differences in base composition
between coding and non-coding DNA
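
As a rough illustration of the base-composition idea, the sketch below computes GC content in sliding windows along a sequence. The window and step sizes are arbitrary assumptions, and real programs use more informative statistics than plain GC fraction; the function names are invented for this example.

```python
def gc_content(seq):
    """Fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def sliding_gc(seq, window=100, step=50):
    """Yield (start, gc_fraction) for each full window along seq."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, gc_content(seq[start:start + window])


print(list(sliding_gc("AAAAGGGG", window=4, step=4)))  # [(0, 0.0), (4, 1.0)]
```

Windows whose composition departs sharply from the genome-wide average are candidates for closer inspection, not confirmed genes.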
ab initio prediction
Accuracy problem – Algorithms are not 100% accurate
Errors include
- Incorrect calling of exon boundaries
- Missed exons
- Failure to detect entire genes
Solution:
Run several different programs on the same genome and compare their predictions
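
One simple way to combine the output of several programs is per-base voting: keep only the positions that at least a chosen number of programs call as exon. The sketch below assumes each program's exons are given as half-open (start, end) coordinate pairs on the same sequence; the function name, the input layout and the default threshold are assumptions for illustration, not a published combining scheme.

```python
from collections import Counter


def consensus_exons(predictions, min_votes=2):
    """Combine exon calls from several programs by per-base voting.

    predictions: one list per program of (start, end) half-open intervals.
    Returns the intervals covered by at least min_votes programs.
    """
    votes = Counter()
    for exons in predictions:
        covered = set()
        for start, end in exons:
            covered.update(range(start, end))
        votes.update(covered)  # each program votes at most once per base

    supported = sorted(pos for pos, n in votes.items() if n >= min_votes)

    # Merge consecutive supported positions back into intervals.
    merged = []
    for pos in supported:
        if merged and pos == merged[-1][1]:
            merged[-1][1] = pos + 1
        else:
            merged.append([pos, pos + 1])
    return [tuple(interval) for interval in merged]


# Hypothetical calls from three programs on one contig.
calls = [[(100, 200)], [(120, 210)], [(300, 350)]]
print(consensus_exons(calls))               # [(120, 200)]
print(consensus_exons(calls, min_votes=1))  # [(100, 210), (300, 350)]
```

A stricter threshold reduces false exons at the cost of missing exons that only one program detects, so the disagreements themselves are worth reviewing.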
Homology searching
•Finding genes in long sequences by looking for matches with
sequences that are known to be transcribed, e.g. cDNA, EST or
a gene
Programs used are based on BLAST (Basic Local Alignment Search Tool):
BLASTN
BLASTX
BLASTP etc.
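
After a BLAST search of a genomic sequence against cDNAs or ESTs, the hits still have to be filtered. The sketch below parses BLAST+ tabular output (the twelve default columns of `-outfmt 6`) and keeps hits below an E-value cutoff. The cutoff value, the function name and the two example rows are assumptions made up for illustration, not real search results.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]
INT_FIELDS = ("length", "mismatch", "gapopen",
              "qstart", "qend", "sstart", "send")
FLOAT_FIELDS = ("pident", "evalue", "bitscore")


def parse_blast_tabular(text, max_evalue=1e-5):
    """Return hits (as dicts) whose E-value is at most max_evalue."""
    hits = []
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(FIELDS, row))
        for key in INT_FIELDS:
            hit[key] = int(hit[key])
        for key in FLOAT_FIELDS:
            hit[key] = float(hit[key])
        if hit["evalue"] <= max_evalue:
            hits.append(hit)
    return hits


# Invented example rows: a genomic contig searched against ESTs.
example = (
    "contig1\tEST_A\t98.50\t200\t3\t0\t1001\t1200\t1\t200\t2e-100\t360\n"
    "contig1\tEST_B\t80.00\t50\t10\t0\t5000\t5049\t1\t50\t0.5\t30.1\n"
)
for hit in parse_blast_tabular(example):
    print(hit["sseqid"], hit["qstart"], hit["qend"])  # EST_A 1001 1200
```

The query coordinates of the surviving hits mark regions of the genomic sequence that match known transcribed sequences.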
Homology searching or ab initio?
•Algorithms that take similarity data into account are better at
gene prediction – Reese et al. (2000), Fortna et al. (2001)
•The latest gene prediction algorithms combine similarity data with
ab initio methods
Examples: Grail/Exp,
GenieEST,
GenomeScan
tRNAscan-SE : For tRNA identification
Advanced gene finding programs
GLIMMER
•Gene Locator and Interpolated Markov ModelER
•For finding genes in microbial DNA
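
The idea behind an interpolated Markov model can be sketched as follows: the probability of the next base is a blend of predictions from contexts of several lengths, with longer contexts trusted more when they were seen often in training. The class below is a toy version written for illustration only. Its weighting rule (`count / (count + trust)`), the pseudocounts and all names are assumptions and are not GLIMMER's actual training or scoring procedure.

```python
import math
from collections import defaultdict


class InterpolatedMarkovModel:
    """Toy interpolated Markov model over A, C, G, T (not GLIMMER's)."""

    def __init__(self, max_order=2, pseudo=1.0, trust=5.0):
        self.max_order = max_order
        self.pseudo = pseudo  # pseudocount added to every base count
        self.trust = trust    # context count at which its weight reaches 0.5
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, sequences):
        """Count each base after every context of length 0..max_order."""
        for seq in sequences:
            for i, base in enumerate(seq):
                for k in range(min(i, self.max_order) + 1):
                    self.counts[seq[i - k:i]][base] += 1

    def prob(self, context, base):
        """Blend estimates from the shortest to the longest context."""
        p = 0.25  # uniform fallback when nothing has been seen
        for k in range(min(len(context), self.max_order) + 1):
            ctx = context[len(context) - k:]
            seen = self.counts.get(ctx, {})
            total = sum(seen.values())
            p_k = (seen.get(base, 0) + self.pseudo) / (total + 4 * self.pseudo)
            weight = total / (total + self.trust)
            p = weight * p_k + (1 - weight) * p
        return p

    def log_score(self, seq):
        """Log-probability of seq under the model."""
        return sum(
            math.log(self.prob(seq[max(0, i - self.max_order):i], base))
            for i, base in enumerate(seq))


model = InterpolatedMarkovModel(max_order=2)
model.train(["ATGATGATGATG"])  # stand-in for known coding sequences
print(model.log_score("ATGATG") > model.log_score("CCCCCC"))  # True
```

In practice one such model would be trained on known genes and another on non-coding sequence, and a candidate ORF scored by the difference of the two log-scores.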
GeneMark
GENSCAN
