Submitted by-
Ishi tandon
CT-IV
Gene:
• A sequence of nucleotides coding for a protein.
Central Dogma:
• Proposed in 1958 by Francis Crick.
• He postulated that not all possible transfers of information are viable.
• He published a paper on it in 1970.
CODONS:
• Discovered by Sydney Brenner and Francis Crick in 1961.
• Each codon is a triplet of nucleotides that codes for one amino acid in a protein.
[Figure: DNA → RNA → PROTEIN → PHENOTYPE, with cDNA as a side branch; the arrows are numbered 1 to 4]
1. TRANSCRIPTION
2. TRANSLATION
3. GENE EXPRESSION
4. REVERSE TRANSCRIPTION
DEFINITION
• Gene prediction is a prerequisite for detailed functional annotation of genes and genomes.
• It can detect the location of ORFs (Open Reading Frames) and the structures of introns and exons.
• Its ultimate goal is to describe all the genes computationally with near 100% accuracy.
• It can reduce the amount of experimental verification work required.
TYPES
• Ab initio: based on gene signals (intron splice sites, transcription factor binding sites, ribosomal binding sites, polyadenylation sites), triplet codon structure and gene content.
• Homology: based on significant matches of the query sequence with sequences of known genes.
• Probabilistic models such as Markov models or Hidden Markov Models (HMMs).
[Figure: gene structure and sequence signals. Genomic DNA with exons and introns, start codon, stop codon and splice sites (donor site GT, acceptor site AG) → transcription → pre-mRNA (Cap, Poly(A)) → splicing → mRNA (Cap, Poly(A)) → translation → protein]
Exons are usually shorter than introns.
Prokaryotic gene prediction
• Gene prediction is easier in microbial genomes.
• Smaller genomes, high gene density, very few repetitive sequences, more sequenced genomes.
• The start codon is usually ATG (GTG and TTG are used occasionally).
• A ribosomal binding site (Shine-Dalgarno sequence) lies upstream of the start codon.
Open reading frames
• An ORF is a sequence defined by an in-frame start and stop codon, which in turn defines a putative amino acid sequence.
• In any one reading frame, a genome of length n comprises about n/3 codons.
• Stop codons break the genome into segments between consecutive stop codons.
• The sub-segments of these that start from the start codon (ATG) are ORFs.
• DNA is translated in all six possible frames: three forward and three reverse.
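The six-frame ORF scan described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular gene finder: the function names, the `min_codons` parameter and the choice to report coordinates on the scanned strand are assumptions made for the example, and only ATG is treated as a start codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=1):
    """Scan all six reading frames for ATG...stop open reading frames.

    Returns tuples (strand, frame, start, end, orf_sequence). Coordinates
    are 0-based, half-open, and refer to the strand that was scanned
    (hypothetical convention chosen for this sketch).
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    # count codons from the start codon up to (not including) the stop
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3, s[start:i + 3]))
                    start = None
    return orfs


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATGACC"):
        print(orf)
```

In practice a length cut-off (a larger `min_codons`) is what keeps short chance ORFs out of the result; the value to use depends on the genome and is left open here.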
[Figure: an open reading frame within a genomic sequence, running from a start codon (ATG) to a stop codon (TGA); an example sequence is shown with its complementary strand for reading in six frames]
CTGCAGACGAAACCTCTTGATGTAGTTGGCCTGACACCGACAATAATGAAGACTACCGTCTTACTAACAC
GACGTCTGCTTTGGAGAACTACATCAACCGGACTGTGGCTGTTATTACTTCTGATGGCAGAATGATTGTG
Probabilistic models
• Statistical description of a gene.
• Markov Models and Hidden Markov Models.
• Used to distinguish oligonucleotide distributions in coding regions from those in non-coding regions.
• In a Markov model of order k, the probability of a nucleotide at a given position depends on the k preceding nucleotides.
• Types of order: zero, first and second.
• The higher the order, the more accurately a gene can be predicted.
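To make the idea of an order-k Markov model concrete, the sketch below trains one model on coding sequences and one on non-coding sequences, then scores a query by the log-odds of the two. It is a toy illustration under stated assumptions: the function names, the add-one pseudocount and the tiny training strings are invented for the example and are not taken from any real gene finder.

```python
import math
from collections import defaultdict


def train_markov(seqs, k, alphabet="ACGT", pseudocount=1.0):
    """Estimate P(base | previous k bases) from training sequences.

    Returns a function prob(context, base). A pseudocount keeps unseen
    transitions from getting probability zero (assumption of this sketch).
    """
    counts = defaultdict(lambda: defaultdict(float))
    for s in seqs:
        for i in range(k, len(s)):
            counts[s[i - k:i]][s[i]] += 1

    def prob(context, base):
        row = counts.get(context, {})
        total = sum(row.values()) + pseudocount * len(alphabet)
        return (row.get(base, 0.0) + pseudocount) / total

    return prob


def log_likelihood(seq, prob, k):
    """Sum of log P(base | k preceding bases) over the sequence."""
    return sum(math.log(prob(seq[i - k:i], seq[i])) for i in range(k, len(seq)))


def log_odds(seq, coding, noncoding, k):
    """Positive values mean the coding model explains the sequence better."""
    return log_likelihood(seq, coding, k) - log_likelihood(seq, noncoding, k)


if __name__ == "__main__":
    # Toy training data, invented for illustration only.
    coding = train_markov(["GCGCGCGCGC"], k=1)
    noncoding = train_markov(["ATATATATAT"], k=1)
    print(log_odds("GCGCGC", coding, noncoding, k=1))
```

A higher order captures longer-range dependencies, but the number of contexts grows as 4^k, so more training data is needed to estimate each probability reliably.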
Gene content and length distribution of prokaryotic genes
TYPICAL: Range from 100 to 500 amino acids, with a nucleotide distribution typical of the organism.
ATYPICAL: Shorter or longer, with different nucleotide statistics. These genes tend to escape detection when a typical gene model is used.
Gene finding programs in prokaryotes
• The programs are based on HMMs/IMMs (interpolated Markov models).
o GeneMark.hmm (microbial genomes).
o Glimmer (UNIX program from TIGR). Computation involves two steps: model building and gene prediction.
o FGENESB (bacterial sequences). It uses the Viterbi algorithm and linear discriminant analysis (LDA).
o RBSfinder. It searches for the ribosomal binding site (Shine-Dalgarno sequence) to predict the translation initiation site.
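Since several of these programs rest on HMMs decoded with the Viterbi algorithm, a small worked example may help. The sketch below decodes a toy two-state HMM (C = coding, N = non-coding). All states, probabilities and names are invented for illustration; real gene finders use far richer state spaces and trained parameters.

```python
import math

# Toy parameters, assumed for this sketch only: coding regions are GC-rich,
# non-coding regions are AT-rich, and states tend to persist.
STATES = "CN"
START_P = {"C": 0.5, "N": 0.5}
TRANS_P = {"C": {"C": 0.9, "N": 0.1}, "N": {"C": 0.1, "N": 0.9}}
EMIT_P = {
    "C": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "N": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}


def viterbi(obs, states=STATES, start_p=START_P, trans_p=TRANS_P, emit_p=EMIT_P):
    """Return the most probable state path for obs (computed in log space)."""
    # V[t][s]: best log-probability of any path that ends in state s at position t
    V = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, V[t - 1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda x: x[1],
            )
            V[t][s] = score + math.log(emit_p[s][obs[t]])
            back[t][s] = prev  # remember which predecessor gave the best score
    # Trace back from the best final state.
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return "".join(reversed(path))


if __name__ == "__main__":
    print(viterbi("ATATGCGCGC"))
```

Working in log space avoids numerical underflow on long sequences, which is why the products of probabilities are written as sums of logarithms.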
Sensitivity: Ability to include correct predictions. It is the fraction of known genes correctly predicted.
Specificity: Ability to exclude incorrect predictions. It is the fraction of predicted genes that correspond to true genes.
o Both are proportions of true signals.
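The two measures defined above reduce to simple ratios of true positives (TP), false negatives (FN) and false positives (FP): sensitivity = TP / (TP + FN) and specificity = TP / (TP + FP). Note that specificity as defined here is what other fields call precision. The sketch below computes both from sets of gene identifiers; the function name and the gene labels are hypothetical.

```python
def sensitivity_specificity(known, predicted):
    """Compute (sensitivity, specificity) as defined for gene prediction.

    known     : identifiers of the true (annotated) genes
    predicted : identifiers of the genes a program reported
    """
    known, predicted = set(known), set(predicted)
    tp = len(known & predicted)   # true genes that were predicted
    fn = len(known - predicted)   # true genes that were missed
    fp = len(predicted - known)   # predictions with no matching true gene
    sn = tp / (tp + fn) if known else 0.0
    sp = tp / (tp + fp) if predicted else 0.0
    return sn, sp


if __name__ == "__main__":
    # Hypothetical labels for illustration.
    sn, sp = sensitivity_specificity(
        known={"geneA", "geneB", "geneC", "geneD"},
        predicted={"geneA", "geneB", "geneX"},
    )
    print(f"sensitivity={sn:.2f} specificity={sp:.2f}")
```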
Eukaryotic gene prediction
• Genomes are much larger than those of prokaryotes (10 Mbp to 670 Gbp).
• Low gene density.
• The space between genes is very large and rich in repetitive sequences and transposable elements.
• Genes are split by intervening noncoding sequences (introns), which are removed so that the coding sequences (exons) can be joined.
• Splice junctions follow the GT-AG rule.
• An intron has the consensus motif GTAAGT at its 5’ splice junction and NCAG at its 3’ end.
• Genes have a high density of CG dinucleotides near the transcription start site. This region is called a CpG island. It helps to identify the transcription initiation site of a eukaryotic gene.
• Some post-transcriptional modifications convert the transcript into mature mRNA, viz. capping, splicing and polyadenylation.
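One common way to quantify a CpG island is to compute, for a window of sequence, its GC content and the ratio of observed to expected CG dinucleotides, (number of CG × window length) / (number of C × number of G). The sketch below shows that calculation. The function names and the default thresholds (GC content above 0.5, ratio above 0.6) are assumptions added for this example, based on widely used criteria, and do not come from the slides.

```python
def cpg_stats(window):
    """Return (gc_content, observed/expected CpG ratio) for a DNA window."""
    w = window.upper()
    n = len(w)
    c, g = w.count("C"), w.count("G")
    cg = w.count("CG")  # CG cannot overlap itself, so count() is exact
    gc_content = (c + g) / n if n else 0.0
    obs_exp = (cg * n) / (c * g) if c and g else 0.0
    return gc_content, obs_exp


def looks_like_cpg_island(window, min_gc=0.5, min_ratio=0.6):
    """Apply illustrative thresholds (assumed, see lead-in) to one window."""
    gc_content, obs_exp = cpg_stats(window)
    return gc_content > min_gc and obs_exp > min_ratio


if __name__ == "__main__":
    print(cpg_stats("CGCGCGCG"))
    print(looks_like_cpg_island("ATATATAT"))
```

In a genome scan the function would be applied to a sliding window; real criteria also impose a minimum island length, which this toy example omits.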
[Figure: an intron between exon 1 and exon 2, with the donor site (GT) at its 5’ end and the acceptor site (AG) at its 3’ end]
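The GT-AG rule can be turned into a naive scan that lists every stretch beginning with GT and ending with AG as a candidate intron. This is only a sketch: the function name and the `min_len` / `max_len` parameters are hypothetical, and the scan deliberately ignores the fuller consensus motifs and any scoring, so most candidates it reports on real DNA would be false positives.

```python
def candidate_introns(seq, min_len=4, max_len=None):
    """List (start, end) spans that start with GT and end with AG.

    Coordinates are 0-based, half-open. min_len and max_len bound the
    candidate length in nucleotides (toy defaults; real introns are much
    longer than 4 nt).
    """
    seq = seq.upper()
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    acceptors = [j for j in range(len(seq) - 1) if seq[j:j + 2] == "AG"]
    spans = []
    for i in donors:
        for j in acceptors:
            length = j + 2 - i  # span includes both the GT and the AG
            if length >= min_len and (max_len is None or length <= max_len):
                spans.append((i, j + 2))
    return spans


if __name__ == "__main__":
    dna = "AAGTCCCAGTT"
    for start, end in candidate_introns(dna):
        print(start, end, dna[start:end])
```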
o CAPPING: Occurs at the 5’ end of the transcript. It involves methylation at the initial residue of the RNA.
o SPLICING: Process of removal of introns and joining of exons. It involves a large RNA-protein complex called the spliceosome.
o POLYADENYLATION: Addition of a stretch of As (~250) at the 3’ end of the RNA. The process is accomplished by poly-A polymerase.
Gene finding programs in eukaryotes
• Three categories of algorithms: ab initio, homology-based and consensus-based.
o Ab initio-based
It discriminates exons from noncoding sequences and joins the exons in the correct order. It relies on two types of information:
a) Gene signals: small patterns within the genomic DNA, including putative splice sites, start and stop sites of transcription or translation, branch points, transcription factor binding sites and recognizable consensus sequences.
b) Gene content: statistical properties of a region of genomic DNA, including nucleotide and amino acid distribution, synonymous codon usage and hexamer frequencies.
o Neural network-based algorithms
- Composed of a network of mathematical variables.
- Multiple layers: input, output and hidden layers.
- GRAIL (splice junctions, start and stop codons, poly-A sites, promoters and CpG islands). It scans the query sequence with windows of variable lengths and scores them.
o Discriminant analysis
- Linear Discriminant Analysis (LDA) plots coding signals against all possible 3’ splice site positions on a 2D graph and separates coding from noncoding signals with a diagonal line.
- Quadratic Discriminant Analysis (QDA) separates them with a quadratic function, i.e. a curved line.
- FGENES (LDA)
- FGENESH [Find Genes] (HMMs)
- FGENESH_C (similarity-based)
- FGENESH+ (combination of ab initio and similarity-based)
- MZEF [Michael Zhang’s Exon Finder] (QDA)
o HMMs
- GENSCAN (fifth-order HMMs); combines hexamer frequencies with coding signals; predicted exons with a probability score P > 0.5 are considered reliable
- HMMgene (Conditional Maximum Likelihood); combination of ab initio and homology-based algorithms
o Homology-based
Exon structures and sequences of related species are highly conserved.
Predictions rely on comparison with homologous sequences derived from cDNA or Expressed Sequence Tags (ESTs).
- GenomeScan (combination of GENSCAN prediction results with BLASTX similarity searches)
- EST2Genome (intron-exon boundaries); comparison of an EST sequence with a genomic DNA sequence
- SGP-1 [Syntenic Gene Prediction] (similar to EST2Genome)
- TwinScan (gene-finding server; similar to GenomeScan)
o Consensus-based
Combines the results of multiple programs based on consensus.
Advantage: improved specificity, by correcting false positives and the problem of overprediction.
Drawback: lowered sensitivity and missed predictions.
- GeneComber (combination of HMMgene and GenScan prediction results)
- DIGIT (combination of FGENESH, GENSCAN and HMMgene)
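The trade-off behind consensus-based prediction can be illustrated with a simple per-nucleotide vote: a position is kept as exonic only if at least `min_votes` programs agree on it. This is a hypothetical scheme written for illustration; it is not the actual combination method used by GeneComber or DIGIT, and the function name, parameters and intervals are invented.

```python
def consensus_exons(predictions, seq_len, min_votes):
    """Merge exon predictions from several programs by per-position voting.

    predictions : list with one entry per program, each a list of
                  (start, end) exon intervals, 0-based and half-open
    seq_len     : length of the genomic sequence
    min_votes   : how many programs must agree on a position
    """
    votes = [0] * seq_len
    for exons in predictions:
        covered = set()
        for start, end in exons:
            covered.update(range(start, min(end, seq_len)))
        for pos in covered:  # each program votes at most once per position
            votes[pos] += 1

    result, start = [], None
    for pos, v in enumerate(votes + [0]):  # trailing 0 closes a final interval
        if v >= min_votes and start is None:
            start = pos
        elif v < min_votes and start is not None:
            result.append((start, pos))
            start = None
    return result


if __name__ == "__main__":
    # Three hypothetical programs' exon calls on a 60 nt sequence.
    calls = [[(10, 20), (40, 50)], [(12, 22)], [(11, 19), (40, 45)]]
    print(consensus_exons(calls, seq_len=60, min_votes=2))
    print(consensus_exons(calls, seq_len=60, min_votes=3))
```

Raising `min_votes` removes more false positives (better specificity) but also drops exons that only some programs found (lower sensitivity), which is the trade-off stated above.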
GENE EXPRESSION
Two steps are required
1. Transcription
The synthesis of mRNA, using the gene on the DNA molecule as a template.
This happens in the nucleus of eukaryotes.
2. Translation
The synthesis of a polypeptide chain using the genetic code on the mRNA molecule as its guide.
Types of RNA
Messenger RNA (mRNA): <5%
Ribosomal RNA (rRNA): up to 80%
Transfer RNA (tRNA): about 15%
In eukaryotes: small nuclear ribonucleoproteins (snRNPs), which make up spliceosomes
Structural characteristics of RNA molecules
Single polynucleotide strand which may be looped or coiled (not a double helix)
Sugar: ribose (not deoxyribose)
Bases used: adenine, guanine, cytosine and uracil (not thymine)
Transcription: The synthesis of a strand of mRNA (and other RNAs)
Uses the enzyme RNA polymerase
Proceeds in the same direction as replication (5’ to 3’)
Forms a strand of mRNA complementary to the DNA template
It begins at a promoter site, which signals that the beginning of the gene is near (about 20 to 30 nucleotides away)
After the end of the gene is reached, there is a terminator sequence that tells RNA polymerase to stop transcribing
NB: Terminator sequence ≠ terminator (stop) codon
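The base-pairing step of transcription can be written as a one-line mapping. In this minimal sketch the template strand is assumed to be given in the 3’ to 5’ direction, so the mRNA comes out 5’ to 3’ without any reversal; the function name is hypothetical and promoters and terminators are not modelled.

```python
# Pairing used by RNA polymerase: A->U, C->G, G->C, T->A
PAIRING = str.maketrans("ACGT", "UGCA")


def transcribe(template_3_to_5):
    """Return the mRNA (5' to 3') complementary to a DNA template strand.

    The template is assumed to be written 3' to 5' (assumption of this
    sketch), so position i of the mRNA pairs with position i of the input.
    """
    return template_3_to_5.upper().translate(PAIRING)


if __name__ == "__main__":
    print(transcribe("TACGGCATT"))
```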
Editing the mRNA
In prokaryotes, transcribed mRNA goes straight to the ribosomes in the cytoplasm.
In eukaryotes, freshly transcribed mRNA in the nucleus is about 5000 nucleotides long.
When the same mRNA is used for translation at the ribosome, it is only 1000 nucleotides long.
The mRNA has been edited.
The parts which are kept for gene expression are called EXONS (exons = expressed).
The parts which are edited out (by spliceosomes) are called INTRONS.
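The editing step can be mimicked on a string: given the exon coordinates, keep the exons, join them, and discard what lies between. The sketch below assumes the exon positions are already known (finding them is the hard part of gene prediction); the function name, the coordinates and the toy sequence are invented for illustration.

```python
def splice(pre_mrna, exons):
    """Join exons and return (mature_mrna, introns).

    exons : list of (start, end) intervals, 0-based and half-open,
            assumed non-overlapping (assumption of this sketch)
    """
    exons = sorted(exons)
    mature = "".join(pre_mrna[s:e] for s, e in exons)
    # An intron is whatever lies between the end of one exon and the start of the next.
    introns = [pre_mrna[e1:s2] for (_, e1), (s2, _) in zip(exons, exons[1:])]
    return mature, introns


if __name__ == "__main__":
    # Toy transcript: exon 1 + intron (GU...AG) + exon 2
    pre = "AUGGCC" + "GUAAGUCCCAG" + "UUUUAA"
    mature, introns = splice(pre, [(0, 6), (17, 23)])
    print(mature)
    print(introns)
```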
[Figure: translation, showing ribosomes, the start codon, the stop codon, the polypeptide chain and the complete protein]
© 2016 Paul Billiet ODWS
Translation
• Location: the ribosomes in the cytoplasm, which provide the environment for translation.
• The genetic code is brought by the mRNA molecule.
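Reading the genetic code off an mRNA can be sketched as a table lookup, three bases at a time, from the first AUG to the first in-frame stop codon. The code below builds the standard genetic code from a compact 64-letter string; the function name and the choice to return one-letter amino acid codes are assumptions of this example, and tRNAs and ribosomes are not modelled.

```python
BASES = "UCAG"
# Standard genetic code, ordered by first, second, third base over U, C, A, G.
# "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon.

    Returns one-letter amino acid codes, or "" if there is no AUG.
    Input is assumed to contain only A, C, G and U.
    """
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: release the polypeptide
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("GGAUGGCCUUUUAAGG"))
```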
An important discovery
• Retroviruses (e.g. HIV) carry RNA as their genetic information.
• When they invade their host cell, they convert their RNA into a DNA copy using reverse transcriptase.
• Thus the central dogma is modified: DNA ↔ RNA → Protein (the RNA → DNA step is catalysed by reverse transcriptase).
• This has helped to explain an important paradox in the evolution of life.
The paradox of DNA
• DNA is a very stable molecule.
• It is a good medium for storing genetic material, but…
• DNA can do nothing for itself.
• It requires enzymes for replication.
• It requires enzymes for gene expression.
• The information in DNA is required to synthesise enzymes (proteins), but enzymes are required to make DNA function.
• Which came first in the origin of life: DNA or enzymes?
RIBOZYMES: Both genetic and catalytic
• Certain forms of RNA have catalytic properties: RIBOZYMES.
• Ribosomes and spliceosomes are ribozymes.
• RNA could have been the first genetic information, synthesizing proteins…
• …and at the same time a biocatalyst.
• Reverse transcriptase provides the possibility of producing DNA copies from RNA.
The ribosome: a ribozyme
Gene prediction and expression
