C value
K.Praveen
TAD/2013-18
C-value
 The C-value of an organism is the amount of DNA in its
haploid genome. The size of the genome (C-value)
differs from one organism to another.
 C. A. Thomas (1971) coined the term "C-value paradox" to denote
the unexpected lack of relationship between the presumed
complexity of an organism and its C-value.
Range of C-values in various eukaryotic taxa
_______________________________________________________________
Taxon            Genome size range (kb)        Ratio
                                               (highest/lowest)
_______________________________________________________________
Eukaryotes           2,300 - 686,000,000       298,261
Amoebae             35,300 - 686,000,000        19,433
Fungi                8,800 -   1,470,000           167
Animals             49,000 - 139,000,000         2,837
Sponges             49,000 -      53,900             1
Molluscs           421,000 -   5,290,000            13
Crustaceans        686,000 -  22,100,000            32
Insects             98,000 -   7,350,000            75
Bony fishes        340,000 - 139,000,000           409
Amphibians         931,000 -  84,300,000            91
Reptiles         1,230,000 -   5,340,000             4
Birds            1,670,000 -   2,250,000             1
Mammals          1,700,000 -   6,700,000             4
Plants              50,000 - 307,000,000         6,140
_______________________________________________________________
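The ratio column of the table can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming the ratio is simply the highest genome size divided by the lowest, rounded to a whole number; the dictionary name and the helper `fold_range` are illustrative and only a few rows are copied in.

```python
# Genome size ranges in kb, copied from selected rows of the table.
RANGES_KB = {
    "Eukaryotes": (2_300, 686_000_000),
    "Amoebae": (35_300, 686_000_000),
    "Fungi": (8_800, 1_470_000),
    "Birds": (1_670_000, 2_250_000),
    "Mammals": (1_700_000, 6_700_000),
    "Plants": (50_000, 307_000_000),
}


def fold_range(lowest_kb: int, highest_kb: int) -> int:
    """Return highest/lowest, rounded to the nearest whole number."""
    return round(highest_kb / lowest_kb)


for taxon, (low, high) in RANGES_KB.items():
    print(f"{taxon:<12}{fold_range(low, high):>10,}")
```

The spread is what matters here: genome size varies by a factor of about 4 within mammals but by several thousand within plants and by nearly 300,000 across eukaryotes.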
If the variation in C-values is attributed to
genes, it can be due to interspecific
differences in
(1) the number of protein-coding
genes
(2) the size of proteins
(3) the size of protein-coding genes
(4) the number and sizes of genes
other than protein-coding ones.
The K-, N- and C-value paradoxes
K-value paradox: Complexity
does not correlate with
chromosome number.
Homo sapiens: 46 chromosomes
Lysandra atlantica: 250 chromosomes
Ophioglossum reticulatum: ~1260 chromosomes
C-value paradox: Complexity
does not correlate with
genome size.
Homo sapiens: 3.4 × 10^9 bp
Amoeba dubia: 6.8 × 10^11 bp
Allium cepa: 1.5 × 10^10 bp
N-value paradox:
Complexity does not
correlate with gene number.
~21,000 genes ~25,000 genes ~60,000 genes
Figure 21.7: Types of DNA sequences in the human genome
 Exons (1.5%)
 Introns (about 20%)
 Regulatory sequences (5%)
 Unique noncoding DNA (15%)
 Repetitive DNA that includes transposable elements and
related sequences (44%)
    L1 sequences (17%)
    Alu elements (10%)
 Repetitive DNA unrelated to transposable elements (14%)
    Large-segment duplications (5-6%)
    Simple sequence DNA (3%)
Sequence complexity - Introns and Exons
 The DNA sequences that make up an interrupted gene are
divided into two categories:
1. Exons
2. Introns
 Exons are the sequences represented in the mature RNA.
By definition, a gene starts and ends with exons that
correspond to the 5’ and 3’ ends of the RNA.
 Introns are the intervening sequences that are removed
when the primary transcript is processed to give the
mature RNA.
 The exon sequences are in the same order in the gene and in the
RNA, but an interrupted gene is longer than its final RNA product
because of the presence of the introns.
 Introns are removed by the process of RNA splicing, which occurs
only in cis on an individual RNA molecule.
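The two points above (exon order is preserved, and the gene is longer than its RNA product) can be illustrated with a toy transcript. This is only a sketch: the sequences are invented, and the `splice` helper and the coordinate convention (0-based, half-open intervals) are assumptions made for the example.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns given as 0-based, half-open (start, end) intervals.

    The remaining exons are joined in their original order, mirroring
    splicing in cis on a single RNA molecule.
    """
    pieces = []
    position = 0
    for start, end in sorted(introns):
        pieces.append(pre_mrna[position:start])  # exon before this intron
        position = end                           # skip over the intron
    pieces.append(pre_mrna[position:])           # last exon
    return "".join(pieces)


# Hypothetical sequences: exons in upper case, introns in lower case.
exons = ["AUGGCU", "GAUUAC", "UGGUAA"]
intron_seqs = ["guaaguuucuaacuuuuucag", "guaugucuuacuuuuccag"]

# Build the primary transcript and record where each intron sits.
pre_mrna = ""
introns = []
for i, exon in enumerate(exons):
    pre_mrna += exon
    if i < len(intron_seqs):
        start = len(pre_mrna)
        pre_mrna += intron_seqs[i]
        introns.append((start, len(pre_mrna)))

mature = splice(pre_mrna, introns)
print(mature)
print(len(pre_mrna), "nt primary transcript ->", len(mature), "nt mature RNA")
```

Even in this small example the introns account for most of the primary transcript, which is the pattern the slides describe for genes of higher eukaryotes.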
Primary transcript
Sequences within the RNA Determine Where Splicing Occurs
The borders between introns and exons are marked by
specific nucleotide sequences within the pre-mRNAs.
● 5’ splice site: the exon-intron boundary at the 5’
end of the intron
● 3’ splice site: the exon-intron boundary at the 3’
end of the intron
● Branch point site: an A close to the 3’ end of the
intron, which is followed by a polypyrimidine tract
(Py tract).
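The three signals listed above can be looked for in a sequence with simple string operations. The sketch below rests on assumptions that are not in the slides: it uses the common GU...AG rule for the intron ends, treats a run of at least six C/U residues as the Py tract, and takes the nearest A upstream of that tract as the candidate branch point. Real splice-site prediction uses far richer models; the function name and threshold are illustrative.

```python
import re


def describe_intron(intron: str, min_py_tract: int = 6) -> dict:
    """Report simple splicing signals for one intron written 5'->3'."""
    intron = intron.upper().replace("T", "U")

    # Py tract: the run of pyrimidines (C/U) closest to the 3' end.
    tracts = list(re.finditer(r"[CU]{%d,}" % min_py_tract, intron))
    py_tract = tracts[-1].span() if tracts else None

    # Candidate branch point: the nearest A upstream of the Py tract.
    branch_a = None
    if py_tract is not None:
        index = intron.rfind("A", 0, py_tract[0])
        branch_a = index if index != -1 else None

    return {
        "starts_with_GU": intron.startswith("GU"),  # 5' splice site
        "ends_with_AG": intron.endswith("AG"),      # 3' splice site
        "py_tract": py_tract,                       # (start, end), 0-based
        "branch_A": branch_a,                       # 0-based index or None
    }


# Hypothetical intron sequence used only for illustration.
info = describe_intron("GUAAGUUUCUAACUUUUUCAG")
print(info)
```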
The Intron Is Removed in a Form Called a Lariat as the
Flanking Exons Are Joined
Splicing proceeds by two successive transesterification reactions:
Step 1: The 2’-OH of the conserved A at the branch site attacks
the phosphoryl group of the conserved G in the 5’ splice site.
As a result, the 5’ exon is released and the 5’ end of the
intron forms a three-way junction structure.
Step 2: The 3’-OH of the 5’ exon attacks the phosphoryl group at
the 3’ splice site. As a consequence, the 5’ and 3’ exons are
joined and the intron is liberated in the shape of a lariat.
[Figure: the structure of the three-way junction]
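Three classes of RNA splicing
_____________________________________________________________________________________________
Class              Abundance                      Mechanism                  Catalytic machinery
_____________________________________________________________________________________________
Nuclear pre-mRNA   Very common; used for most     Two transesterification    Major spliceosome
                   eukaryotic genes               reactions; branch site A

Group II introns   Rare; some eukaryotic genes    Same as pre-mRNA           RNA enzyme encoded by
                   from organelles and                                       intron (ribozyme)
                   prokaryotes

Group I introns    Rare; nuclear rRNA in some     Two transesterification    Same as group II
                   eukaryotes, organelle genes,   reactions; exogenous G     introns
                   and a few prokaryotic genes
_____________________________________________________________________________________________
Group I introns use an exogenous G instead of the branch site A and release
a linear intron; the other two classes release a lariat intron.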
 In yeast most genes are uninterrupted.
 In higher eukaryotes most genes are interrupted and the
introns are usually much longer than exons.
 When a gene is uninterrupted, the restriction map of its
DNA corresponds exactly with the map of its mRNA.
 When a gene possesses an intron, the map at each end of
the gene corresponds with the map at each end of the
message sequence, but the gene map has an additional
region in between that is absent from the message.
 Mutations that affect the splicing are usually
deleterious.
 The majority are single base substitutions at the
junctions between introns and exons.
 They may cause an exon to be left out of the product,
cause an intron to be included, or make splicing occur at
an aberrant site.
 The most common result is to introduce a termination
codon that results in truncations of the protein
sequence.
 About 15% of the point mutations that cause human
diseases are caused by disruption of splicing.
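How a single base substitution at an exon-intron junction can truncate a protein can be shown with a toy gene. The sketch below makes several assumptions that are not in the slides: the sequences are invented, an intron is treated as recognised only if it begins with GU and ends with AG, and an unrecognised intron is simply retained. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna: str) -> str:
    """Translate from the first base until the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODONS[mrna[i:i + 3]]
        if amino_acid == "*":  # termination codon
            break
        protein.append(amino_acid)
    return "".join(protein)


def splice_if_recognised(exon1: str, intron: str, exon2: str) -> str:
    """Toy rule: remove the intron only if it starts with GU and ends with AG."""
    if intron.startswith("GU") and intron.endswith("AG"):
        return exon1 + exon2
    return exon1 + intron + exon2  # intron retained in the message


# Hypothetical gene with two exons and one intron.
exon1, exon2 = "AUGGCUGAU", "UACUGGAAAGGUCUGUAA"
intron = "GUAAGUUUCUAACUUUUUCAG"
mutant_intron = "A" + intron[1:]  # G -> A at the first base of the intron

normal = translate(splice_if_recognised(exon1, intron, exon2))
mutant = translate(splice_if_recognised(exon1, mutant_intron, exon2))
print("normal:", normal)
print("mutant:", mutant)
```

In this example the retained intron is read as codons and soon supplies an in-frame termination codon, so the mutant product is shorter and ends in residues the normal protein does not contain.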
 Introns can be detected by the presence of additional
regions when genes are compared with their RNA
products by restriction mapping or electron microscopy.
 The positions of introns are usually conserved when
homologous genes are compared between different
organisms.
 The lengths of the corresponding introns may vary
greatly.
 Introns usually do not code for proteins.
 Comparisons of related genes in different species show
that the sequences of the corresponding exons are usually
conserved but the sequences of the introns are much less
well related.
 Introns evolve much more rapidly than exons because of
the lack of selective pressure to produce a protein with a
useful sequence.
 Exons are usually short, typically coding for <100 amino
acids.
 Introns are short in lower eukaryotes, but range up to
several tens of kb in length in higher eukaryotes.
 The overall length of a gene is determined largely by its
introns.
Some DNA Sequences Code for More Than One Protein
 The use of alternative initiation or termination codons allows
two proteins to be generated where one is equivalent to a
fragment of the other.
 Nonhomologous protein sequences can be produced from the
same sequence of DNA when it is read in different reading
frames by two (overlapping) genes.
 Homologous proteins that differ by the presence or absence of
certain regions can be generated by differential (alternative)
splicing when certain exons are included or excluded.
 This may take the form of including or excluding individual
exons, or of choosing between alternative exons. (Exon
shuffling, by contrast, is the recombination of exons
between genes during evolution.)
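The statement about overlapping genes can be made concrete by translating one stretch of DNA in two reading frames. This is a minimal sketch with an invented sequence; the `translate` helper is illustrative, reads the coding strand directly and does not stop at termination codons.

```python
from itertools import product

# Standard genetic code for DNA codons, bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str, frame: int) -> str:
    """Translate every complete codon starting at offset `frame` (0, 1 or 2)."""
    return "".join(
        CODONS[dna[i:i + 3]] for i in range(frame, len(dna) - 2, 3)
    )


# Hypothetical sequence shared by two overlapping genes.
shared_dna = "ATGGCTGATTACTGGAAAGG"

frame0 = translate(shared_dna, 0)
frame1 = translate(shared_dna, 1)
print("frame 0:", frame0)
print("frame 1:", frame1)
```

Shifting the frame by a single base changes every codon, which is why the two products have no sequence similarity even though they come from the same DNA.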
There are five different ways to
alternatively splice a pre-mRNA
The outcome of alternative splicing
1. Producing multiple protein products,
called isoforms.
2. Switching the expression of a given gene
on and off. In this case, one splicing
pattern produces a functional protein,
and the other splicing patterns produce
non-functional proteins.
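The first outcome, several isoforms from one gene, can be sketched by enumerating which optional exons are included. The example assumes only one of the splicing modes (whole exons that are either included or skipped); the exon names, sequences and the `isoforms` helper are hypothetical.

```python
from itertools import product


def isoforms(exons, optional):
    """Yield (exon names, mRNA sequence) for every include/skip combination.

    exons:    ordered list of (name, sequence) pairs
    optional: set of exon names that may be skipped
    """
    optional_names = [name for name, _ in exons if name in optional]
    for choice in product([True, False], repeat=len(optional_names)):
        included = dict(zip(optional_names, choice))
        # Constitutive exons are always kept; optional ones follow `choice`.
        kept = [(n, s) for n, s in exons if included.get(n, True)]
        yield "-".join(n for n, _ in kept), "".join(s for _, s in kept)


# Hypothetical gene: E1 and E4 are constitutive, E2 and E3 are optional.
gene = [("E1", "AUGGCU"), ("E2", "GAUUAC"), ("E3", "CCGGAA"), ("E4", "UGGUAA")]

results = list(isoforms(gene, optional={"E2", "E3"}))
for names, sequence in results:
    print(f"{names:<12}{sequence}")
```

With n optional exons this simple scheme gives up to 2^n transcripts, so two optional exons already yield four isoforms.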
Exons are shuffled by recombination to produce genes
encoding new proteins
All eukaryotes have introns, and yet these
elements are rare in bacteria. Two likely
explanations for this situation:
1. Introns-early model – introns existed in all
organisms but have been lost from bacteria.
2. Introns-late model – introns never existed in
bacteria but rather arose later in evolution.
Why have the introns been retained in eukaryotes?
1. The need to remove introns allows for
alternative splicing, which can generate
multiple proteins from a single gene.
2. Having the coding sequence of genes
divided into several exons allows new
genes to be created by reshuffling exons.
Three observations suggest that exon shuffling actually
occurs:
1. The borders between exons and introns within a
gene often coincide with the boundaries between
domains within the protein encoded by that gene.
2. Many genes, and proteins they encode, have
apparently arisen during evolution in part via exon
duplication and divergence.
3. Related exons are sometimes found in unrelated
genes.
Repeated sequences
 Repeated sequences (repetitive elements or repeats) are
patterns of nucleic acids (DNA or RNA) that occur in
multiple copies throughout the genome.
 Prokaryotes contain few or no repetitive sequences.
 Noncoding repetitive DNA varies from one group of
organisms to another and from individual to individual,
and is therefore used as a DNA fingerprinting tool.
 Three major categories of repeated sequence, based on
position:
1. Terminal repeats
2. Tandem repeats
3. Interspersed repeats
Based on number of repeats:
 Tandem repeats: copies that lie adjacent to each
other, either directly or inverted
 Satellite DNA: typically found in centromeres
and heterochromatin
 Minisatellite: repeat units of about 10 to 60 base pairs,
found in many places in the genome, including
the centromeres
 Microsatellite: repeat units of less than 10 base pairs; this
includes telomeres, which typically have 6 to 8 base pair
repeat units
Interspersed repeats (interspersed nuclear
elements)
 Transposable elements (transposons or retroelements)
 SINEs (Short Interspersed Nuclear Elements)
 LINEs (Long Interspersed Nuclear Elements)
 In primates, the majority of LINEs are LINE-1 and the
majority of SINEs are Alu elements.
 In prokaryotes, CRISPR loci are arrays of alternating
repeats and spacers.
a) Satellite DNA – first identified as distinct bands of DNA that
are heavier or lighter than the majority of genomic DNA by
density centrifugation.
 These are repeated sequences that have either high GC
(heavy) or high AT (light) content.
 They are fairly short sequences (2-2000 bp) repeated thousands
of times in a row. They are found in heterochromatic regions
and around centromeres.
b) Minisatellites
 sequences of 9-100 bp repeated 10-100 times.
 Found in subtelomeric regions and (rarely) dispersed
throughout chromosomes.
c) Microsatellites (SRS “short repetitive sequences”, STR
“short tandem repeats”, SSR “simple sequence repeats”)
 Very short sequences of 1-5 bp repeated 10-100 times.
 Found dispersed throughout chromosomes, often in and
around genes.
 For example, the dinucleotide repeat CA is very common
in the human genome (≈50,000 copies).
Example of a simple sequence repeat
(CCCA or GGGT) in human genomic DNA
 Microsatellites have very high mutation rates (where a
“mutation” means a change in repeat number).
 Thus they are often variable within a population and
useful for population genetics.
 This property also makes them useful for “DNA
fingerprinting”.
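A microsatellite can be located, and its repeat number read off, with a regular expression. The sketch below uses the thresholds quoted earlier in the slides (units of 1-5 bp, at least 10 copies) as defaults; the function name, the flanking sequences and the two "alleles" are invented for illustration.

```python
import re


def find_microsatellites(seq: str, max_unit: int = 5, min_copies: int = 10):
    """Return (start, unit, copies) for each tandem run of a short unit.

    The unit is matched lazily so the shortest repeating unit is reported
    (e.g. "CA" rather than "CACA").
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


# Two hypothetical alleles of one locus that differ only in repeat number.
allele_a = "GGT" + "CA" * 12 + "TTG"
allele_b = "GGT" + "CA" * 15 + "TTG"

hits_a = find_microsatellites(allele_a)
hits_b = find_microsatellites(allele_b)
print(hits_a)
print(hits_b)
```

The two alleles share their flanking sequence and differ only in copy number, which is the kind of variation that DNA fingerprinting with microsatellites relies on.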
Retroposons
 Retroposons resemble processed RNAs and transpose
passively via an RNA intermediate.
 Each element is composed of an A-rich tail at the 3' end
and short target site duplications (direct repeats of 5-21
bp) flanking the repeat.
 Two main subclasses dominate this class:
 Short Interspersed Elements (SINEs)
 Long Interspersed Elements (LINEs)
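The target site duplications that flank a retroposon can be checked by comparing the bases immediately upstream and downstream of the element. This is a minimal sketch that assumes the element boundaries are already known and that the duplication is an exact direct repeat of 5-21 bp, as stated in the slides; the sequences and the function name are hypothetical.

```python
def target_site_duplication(seq: str, start: int, end: int,
                            min_len: int = 5, max_len: int = 21):
    """Return the longest exact direct repeat flanking seq[start:end], or None."""
    for k in range(max_len, min_len - 1, -1):
        if start - k < 0 or end + k > len(seq):
            continue  # not enough flanking sequence for this length
        upstream = seq[start - k:start]
        downstream = seq[end:end + k]
        if upstream == downstream:
            return upstream
    return None


# Hypothetical insertion: an element with an A-rich 3' tail, flanked on
# both sides by the same 8 bp of duplicated target site.
tsd = "AAGTCTTA"
element = "GGCCGGGCGCGGTGGCTCA" + "A" * 12
genome = "CCGTGC" + tsd + element + tsd + "TTGACC"

start = len("CCGTGC") + len(tsd)
end = start + len(element)
found = target_site_duplication(genome, start, end)
print(found)
```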
Short Interspersed Elements (SINEs)
 These are distributed throughout the non centromeric
regions of genome (over 100,000 copies per genome)
(Weiner, 1986).
 They contain one or more RNA polymerase III promoter sites
and an A-rich region.
 Ex: the primate-specific Alu sequence (about 300 bp), a dimer
with two promoter sites.
Long Interspersed Elements (LINEs)
 LINEs are composed of open reading frames (ORFs) followed
by a 3' A-rich region, and are present in 20,000 to 50,000 copies
per genome (Hutchison et al., 1989; Weiner, 1986).
 Direct repeats of 6-15 bp flank the element.
 Ex: the L1 family (the primary LINE family) is 6 to 7 kbp long.
Single-copy genes                            Satellite DNA (highly repetitive sequences)
_____________________________________________________________________________________
A single-copy gene has one locatable         Satellite DNA consists of highly repetitive
region on a DNA molecule.                    sequences that can repeat up to 100,000
                                             times in various places on a DNA molecule.

Single-copy genes make up 1–2% of the        Satellite DNA constitutes more than
human genome.                                5% of the human genome.

A single-copy gene corresponds to a unit     Satellite DNA is not involved with
of inheritance (i.e., a protein).            inheritance.

Single-copy genes are transcribed to make    Satellite DNA is not transcribed.
RNA, which in turn is translated to make
a protein.

Single-copy genes are usually thousands      Satellite DNA is typically between 5 and
of base pairs in length.                     300 base pairs per repeat.

Single-copy genes are less useful for DNA    Satellite DNA has a high rate of mutation,
profiling.                                   making it useful for DNA profiling.
_____________________________________________________________________________________
Role of repetitive sequences:
 Tandem repeat hypervariability enables the identification of
genes, e.g. the antifreeze gene, and of several degenerative
diseases.
 Repeats may help in the stability of transcripts or proteins, but
repeat expansions and instability (particularly of
trinucleotide repeats) lead to neurological disorders and
cancer (Ashley and Warren, 1995; Mitas, 1997).
 Long stretches of CAG repeats translated into polyglutamine
tracts result in a gain of function, possibly a toxic one (Perutz
et al., 1994; Baldi et al., 1999).
 CGG, AGG and TGG repeats form quadruplex structures and GAA
repeats form triplex structures that can block or reduce
transcription and DNA replication (Sinden, 1999).
 CGG repeats also destabilize nucleosomes (Sinden, 1999)
through CpG hypermethylation, which leads to promoter
repression and lack of gene expression (Nelson, 1995;
Baldi et al., 1999). On the other hand, CTG repeats
stabilize nucleosomes and block replication forks in E. coli
(Sinden, 1999).
Functions of highly repetitive DNA sequences:
 Structural and organizational roles in chromosomes
 Involvement in chromosome pairing during meiosis
 Involvement in crossing over and recombination
 Protection of important structural genes like histone,
rRNA or ribosomal protein genes
 A repository of unessential DNA sequences for use in the
future evolution of the species
 No function at all – just junk DNA that is carried along by
the processes of replication and segregation of
chromosomes.
References:
 Brown T.A. 2002. Genomes. Wiley-Liss.
 Snustad and Simmons. Genetics.
 Tamarin. Principles of Genetics.
 Lewin B. Genes IX.
 Related articles.
C value

More Related Content

PPTX
Somatic cell cloning
PDF
Lecture_Chromatin remodelling_slideshare.pdf
PPT
Phycoremediation – a clean technology for water pollution abatement
PPTX
Bioentrepreneurship
PPTX
Hutchinsons system of classification
PPT
Water quality parameters
PDF
Sterilization and disinfection
Somatic cell cloning
Lecture_Chromatin remodelling_slideshare.pdf
Phycoremediation – a clean technology for water pollution abatement
Bioentrepreneurship
Hutchinsons system of classification
Water quality parameters
Sterilization and disinfection

What's hot (20)

PPTX
MODIFYING ENZYMES
PPTX
Second genetic code overlapping and split genes
PDF
Gene mapping
PDF
cDNA Library
PPTX
Complementation of defined mutations
PPTX
Mitochondrial genome
PPTX
Genetics ppt
PDF
Cot curve
PPTX
Cromatin Remodeling
PPTX
Restriction mapping
PPTX
Labelling of dna
PPTX
Gene Silencing
PPT
Protein and nucleic acid sequencing
PPTX
repetitive and non repetitive dna.pptx
PPT
Histone modification in living cells
PPTX
Tryptophan operon
PPTX
EXTRA CHROMOSOMAL INHERITANCE
PPTX
Complementation test; AC-DS System in Maize
DOC
STRUCTURE AND ORGANIZATION OF CHROMATIN
PPTX
Genomic and c dna library
MODIFYING ENZYMES
Second genetic code overlapping and split genes
Gene mapping
cDNA Library
Complementation of defined mutations
Mitochondrial genome
Genetics ppt
Cot curve
Cromatin Remodeling
Restriction mapping
Labelling of dna
Gene Silencing
Protein and nucleic acid sequencing
repetitive and non repetitive dna.pptx
Histone modification in living cells
Tryptophan operon
EXTRA CHROMOSOMAL INHERITANCE
Complementation test; AC-DS System in Maize
STRUCTURE AND ORGANIZATION OF CHROMATIN
Genomic and c dna library
Ad

Similar to C value (20)

PPTX
Genes in Action
PPTX
Unit 1 transcription
DOCX
Eukaryotic_Genome.docx
PPTX
Genome organization ,gene expression sand regulation
PPTX
DNA_RNA_Protein Synthesis_Mini lecture Thn 1.pptx
PPTX
ORF, Gene Clustering, Overlapping Genes and.pptx
PPTX
Microbial Genetics
PPTX
Microbial Genetics
PPTX
Microbial Genetics
PDF
Genome organization
PPTX
Regulation of eukaryotic gene expression
PDF
Genome Curation using Apollo - Workshop at UTK
PPTX
Molecular structure of genes and chromosomes
PPTX
GENETICS & periodontal disease.pptx
PPTX
Genetic fine structure
PDF
RNA Splicing
PDF
Differentiated Fern Research Paper
PPT
Introns: structure and functions
DOCX
Nuclear Genomes(Short Answers and questions)
Genes in Action
Unit 1 transcription
Eukaryotic_Genome.docx
Genome organization ,gene expression sand regulation
DNA_RNA_Protein Synthesis_Mini lecture Thn 1.pptx
ORF, Gene Clustering, Overlapping Genes and.pptx
Microbial Genetics
Microbial Genetics
Microbial Genetics
Genome organization
Regulation of eukaryotic gene expression
Genome Curation using Apollo - Workshop at UTK
Molecular structure of genes and chromosomes
GENETICS & periodontal disease.pptx
Genetic fine structure
RNA Splicing
Differentiated Fern Research Paper
Introns: structure and functions
Nuclear Genomes(Short Answers and questions)
Ad

More from Vinod Pawar (20)

PPTX
Selection: pure line, mass and pedigree breeding methods for self pollinated ...
PPTX
History of plant breeding by dr p vinod (2)
PPTX
“Genetic architecture improvement in cowpea”
PPTX
ROLE OF INHERITANCE IN CROP IMPROVEMENT
PPTX
Qtl mapping
PPTX
Forward and reverse genetics
PPT
Gene expression in bacteria and bacteriophages
PPTX
Operon
PPT
Brassinosteroids plant harmones
PPTX
Breeding for disease resistance in
PPT
Breeding for bac. wilt resi. in tomato
PPT
Classes of seeds
PPTX
Antisense RNA in crop
PPT
HISTORY, DISCRIPTION, CLASSIFICATION, ORIGIN AND PHYLOGENETIC RELATIONSHIP GE...
PPTX
Biotechnological approaches in Host Plant Resistance (HPR)
PPT
THE PROTECTION OF PLANT VARIETIES & FARMER’S ACT, 2001 And THE PPV & FR R...
PPTX
MARKER ASSISTED SELECTION IN CROP IMPROVEMENT
PPTX
TILLING & Eco-TILLING : Reverse Genetics Approaches for Crop Improvement
PPTX
APPLICATION OF MUTATION BREEDING IN FIELD CROPS
PPT
Gene silencing for crop improvement
Selection: pure line, mass and pedigree breeding methods for self pollinated ...
History of plant breeding by dr p vinod (2)
“Genetic architecture improvement in cowpea”
ROLE OF INHERITANCE IN CROP IMPROVEMENT
Qtl mapping
Forward and reverse genetics
Gene expression in bacteria and bacteriophages
Operon
Brassinosteroids plant harmones
Breeding for disease resistance in
Breeding for bac. wilt resi. in tomato
Classes of seeds
Antisense RNA in crop
HISTORY, DISCRIPTION, CLASSIFICATION, ORIGIN AND PHYLOGENETIC RELATIONSHIP GE...
Biotechnological approaches in Host Plant Resistance (HPR)
THE PROTECTION OF PLANT VARIETIES & FARMER’S ACT, 2001 And THE PPV & FR R...
MARKER ASSISTED SELECTION IN CROP IMPROVEMENT
TILLING & Eco-TILLING : Reverse Genetics Approaches for Crop Improvement
APPLICATION OF MUTATION BREEDING IN FIELD CROPS
Gene silencing for crop improvement

Recently uploaded (20)

PDF
ELS_Q1_Module-11_Formation-of-Rock-Layers_v2.pdf
PDF
VARICELLA VACCINATION: A POTENTIAL STRATEGY FOR PREVENTING MULTIPLE SCLEROSIS
PDF
An interstellar mission to test astrophysical black holes
PPTX
INTRODUCTION TO EVS | Concept of sustainability
PPTX
G5Q1W8 PPT SCIENCE.pptx 2025-2026 GRADE 5
PPTX
2. Earth - The Living Planet earth and life
PPTX
Introduction to Fisheries Biotechnology_Lesson 1.pptx
PPTX
ECG_Course_Presentation د.محمد صقران ppt
PPTX
EPIDURAL ANESTHESIA ANATOMY AND PHYSIOLOGY.pptx
PPTX
7. General Toxicologyfor clinical phrmacy.pptx
DOCX
Viruses (History, structure and composition, classification, Bacteriophage Re...
PPTX
2Systematics of Living Organisms t-.pptx
PPTX
Introduction to Cardiovascular system_structure and functions-1
PDF
The scientific heritage No 166 (166) (2025)
PPTX
Protein & Amino Acid Structures Levels of protein structure (primary, seconda...
PDF
Mastering Bioreactors and Media Sterilization: A Complete Guide to Sterile Fe...
PPT
protein biochemistry.ppt for university classes
PPTX
Cell Membrane: Structure, Composition & Functions
PPTX
neck nodes and dissection types and lymph nodes levels
PDF
bbec55_b34400a7914c42429908233dbd381773.pdf
ELS_Q1_Module-11_Formation-of-Rock-Layers_v2.pdf
VARICELLA VACCINATION: A POTENTIAL STRATEGY FOR PREVENTING MULTIPLE SCLEROSIS
An interstellar mission to test astrophysical black holes
INTRODUCTION TO EVS | Concept of sustainability
G5Q1W8 PPT SCIENCE.pptx 2025-2026 GRADE 5
2. Earth - The Living Planet earth and life
Introduction to Fisheries Biotechnology_Lesson 1.pptx
ECG_Course_Presentation د.محمد صقران ppt
EPIDURAL ANESTHESIA ANATOMY AND PHYSIOLOGY.pptx
7. General Toxicologyfor clinical phrmacy.pptx
Viruses (History, structure and composition, classification, Bacteriophage Re...
2Systematics of Living Organisms t-.pptx
Introduction to Cardiovascular system_structure and functions-1
The scientific heritage No 166 (166) (2025)
Protein & Amino Acid Structures Levels of protein structure (primary, seconda...
Mastering Bioreactors and Media Sterilization: A Complete Guide to Sterile Fe...
protein biochemistry.ppt for university classes
Cell Membrane: Structure, Composition & Functions
neck nodes and dissection types and lymph nodes levels
bbec55_b34400a7914c42429908233dbd381773.pdf

C value

  • 3. C-value  The C-value of an organism is the amount of DNA in the organism’s genome. The size of the genome (C-value) depends on the organism.  Thomas coined the term C-value Paradox to denote the unexpected lack of relationship between the presumed complexity of an organism and its C-value
  • 4. 4 Range of C-values in various eukaryotic taxa _____________________________________________________________ __ Taxon Genome size range Ratio (Kb) (highest/lowest) _____________________________________________________________ __ Eukaryotes 2,300 - 686,000,000 298,261 Amoebae 35,300 - 686,000,000 19,433 Fungi 8,800 - 1,470,000 167 Animals 49,000 - 139,000,000 2,837 Sponges 49,000 - 53,900 1 Molluscs 421,000 - 5,290,000 13 Crustaceans 686,000 - 22,100,000 32 Insects 98,000 - 7,350,000 75 Bony fishes 340,000 - 139,000,000 409 Amphibians 931,000 - 84,300,000 91 Reptiles 1,230,000 - 5,340,000 4 Birds 1,670,000 - 2,250,000 1 Mammals 1,700,000 - 6,700,000 4 Plants 50,000 - 307,000,000 6,140 _____________________________________________________________ __
  • 5. 5 If the variation in C-values is attributed to genes, it can be due to interspecific differences in (1) the number of protein-coding genes (2) the size of proteins (3) the size of protein-coding genes (4) the number and sizes of genes other than protein-coding ones.
  • 7. 7 K-value paradox: Complexity does not correlate with chromosome number. 46 250 Ophioglossum reticulatum Homo sapiens Lysandra atlantica ~1260
  • 8. 8 C-value paradox: Complexity does not correlate with genome size. 3.4  109 bp Homo sapiens 6.8  1011 bp Amoeba dubia 1.5  1010 bp Allium cepa
  • 9. 9 N-value paradox: Complexity does not correlate with gene number. ~21,000 genes ~25,000 genes ~60,000 genes
  • 10. Figure 21.7 Exons (1.5%) Introns (5%) Regulatory sequences (20%) Unique noncoding DNA (15%) Repetitive DNA unrelated to transposable elements (14%) Large-segment duplications (56%) Simple sequence DNA (3%) Alu elements (10%) L1 sequences (17%) Repetitive DNA that includes transposable elements and related sequences (44%)
  • 12.  DNA comprises an interrupted gene are divided into the two categories: 1. Exons 2. Introns  Exons: are the sequences represented in the mature RNA. By definition, a gene starts and ends with exons that correspond to the 5’ and 3’ ends of the RNA.  Introns: are the intervening sequences that are removed when the primary transcript is processed to give the mature RNA.
  • 14.  The exon sequences are in the same order in the gene and in the RNA, but an interrupted gene is longer than its final RNA product because of the presence of the introns.  Introns are removed by the process of RNA splicing, which occur only in cis on an individual RNA molecule.
  • 16. Sequences within the RNA Determine Where Splicing Occurs The borders between introns and exons are marked by specific nucleotide sequences within the pre-mRNAs.
  • 18. ●5’splice site: the exon-intron boundary at the 5’ end of the intron ●3’ splice site: the exon-intron boundary at the 3’ end of the intron ●Branch point site: an A close to the 3’ end of the intron, which is followed by a polypyrimidine tract (Py tract).
  • 19. Ⅱ The intron is removed in a Form Called a Lariat as the Flanking Exons are joined Two successive transesterification Step 1: The OH of the conserved A at the branch site attacks the phosphoryl group of the conserved G in the 5’ splice site. As a result, the 5’ exon is released and the 5’-end of the intron forms a three-way junction structure. Step 2: The OH of the 5’ exon attacks the phosphoryl group at the 3’ splice site. As a consequence, the 5’ and 3’ exons are joined and the intron is liberated in the shape of a lariat.
  • 21. The structure of three-way function
  • 23. Three class of RNA Splicing Class Abundance Mechanism Catalytic Machinery Nuclear pre- mRNA Very common; used for most eukaryotic genes Two transesterificat ion reactions; branch site A Major spliceoso me Group II introns Rare; some eu- Karyotic genes from organelles and prokaryotes Same as pre- mRNA RNA enzyme encoded by intron (ribozyme) Group I introns Rare; nuclear rRNA in some eukaryotics, organlle genes, and a few prokaryotic genes Two transesterific- ation reactions; exogenous G Same as group II introns
  • 24. G instead of A a linear intron a Lariat intron
  • 25.  In yeast most genes are uninterrupted.  In higher eukaryotes most genes are interrupted and the introns are usually much longer than exons.  When a gene is uninterrupted, the restriction map of its DNA corresponds exactly with the map of its mRNA.  When a gene possess an intron, the map at each end of the gene corresponds with the map at each end of the message sequence.
  • 27.  Mutations that affect the splicing are usually deleterious.  The majority are single base substitutions at the junctions between introns and exons.  They may cause an exon to be left out of the product, cause an intron to be included to make splicing occur at an aberrant site.  The most common result is to introduce a termination codon that results in truncations of the protein sequence.  About 15% of the point mutations that cause human diseases are caused by disruption of splicing.
  • 28.  Introns can be detected by the presence of additional regions when genes are compared with their RNA products by restriction mapping or electron microscopy.  The position of introns are usually conserved when homologous genes are compared between different organisms.  The lengths of the corresponding introns may vary greatly.  Introns usually do not code for proteins.
  • 29.  Comparisons of related genes in different species show that the sequences of the corresponding exons are usually conserved but the sequences of the introns are much less well related.  Introns evolve much more rapidly than exons because of the lack of selective pressure to produce a protein with a useful sequence.  Exons are usually short, typically coding for <100 amino acids.  Introns are short in lower eukaryotes, but range up to several 10s of kb in length in higher eukaryotes. The overall length of a gene is determined largely by its introns.
  • 30. Some DNA Sequences Code for More Than One Protein  The use of alternative initiation or termination codons allows two proteins to be generated where one is equivalent to a fragment of the other.  Nonhomologous protein sequences can be produced from the same sequence of DNA when it is read in different reading frames by two (overlapping) genes.  Homologous proteins that differ by the presence or absence of certain regions can be generated by differential (alternative) splicing when certain exons are included or excluded.  This may take the form of including or excluding individual exons or of choosing between alternative exons called Exon shuffling.
  • 31. There are five different ways to alternatively splice a pre-mRNA
  • 32. The outcome of alternative splicing 1. Producing multiple protein products, called isoforms. 2. Switching on and off the expression of a given gene. In this case, one functional protein is produced by a splicing pattern, and the non-functional proteins are resulted from other splicing patterns.
  • 33. Exons are shuffled by recombination to produce gene encoding new proteins All eukaryotes have introns, and yet these elements are rare in bacteria. Two likely explanations for these situation: 1. Introns early model – introns existed in all organisms but have been lost from bacteria. 2. Intron late model – introns never existed in bacteria but rather arose later in evolution.
  • 34. Why have the introns been retained in eukaryotes?
  • 35. 1. The need to remove introns, allows for alternative splicing which can generate multiple proteins from a single gene. 2. Having the coding sequence of genes divided into several exons allows new genes to be created by reshuffling exon.
  • 36. Three observations suggest exon shuffling actually occur: 1. The borders between exons and introns within a gene often coincide with the boundaries between domains within the protein encoded by that gene. 2. Many genes, and proteins they encode, have apparently arisen during evolution in part via exon duplication and divergence. 3. Related exons are sometimes found in unrelated genes.
  • 37. Repeated sequences  Repeated sequences (repetitive elements or repeats) are patterns of nucleic acids (DNA or RNA) that occur in multiple copies throughout the genome.  Prokaryotes contain little or no repetitive sequences.  Non coding repetitive DNA varies from one group of organisms to another; individual to individual and therefore used as DNA fingerprinting tool.
  • 38.  3 major categories of repeated sequence based on position 1. Terminal repeats 2. Tandem repeats- 3. Interspersed repeats
  • 39. Based on number of repeats:
  • 40.  Tandem repeats: copies which lie adjacent to each other, either directly or inverted  Satellite DNA - typically found in centromeres and heterochromatin  Minisatellite - repeat units from about 10 to 60 base pairs, found in many places in the genome, including the centromeres  Microsatellite- repeat units of less than 10 base pairs; this includes telomeres, which typically have 6 to 8 base pair repeat units
  • 41. Interspersed repeats (interspersed nuclear elements)  Transposable elements ( transposons or retroelements)  SINEs (Short Interspersed Nuclear Elements)  LINEs (Long Interspersed Nuclear Elements)  In primates, the majority of LINEs are LINE-1 and the majority of SINEs are Alu's.  In prokaryotes, CRISPR are arrays of alternating repeats and spacers.
  • 43. a) Satellite DNA – first identified as distinct bands of DNA that are heavier or lighter than the majority of genomic DNA by density centrifugation.  These are repeated sequences that have either high GC (heavy) or high AT (light) content.  They are fairly short sequences (2-2000 bp) repeated 1000’s of times in a row. They are found in heterochromatic regions and around centromeres.
  • 44. b) Minisatellites  sequences of 9-100 bp repeated 10-100 times.  Found in subtelomeric regions and (rarely) dispersed throughout chromosomes.
  • 45.  c) Microsatellites (SRS “short repetitive sequences”, STR “short tandem repeats”, SSR “simple sequence repeats”)  very short sequences of 1-5 bp repeated 10-100 times.  Found dispersed throughout chromosomes, often in and around genes.  For example, the dinucleotide repeat CA is very common in the human genome (≈50,000 copies)
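The microsatellite definition above (a 1-5 bp unit repeated 10 or more times in a row) can be sketched as a simple pattern search. This is a minimal illustration, not a production repeat finder: the function name `find_microsatellites`, its parameters, and the test sequence are assumptions made for this example, and the thresholds simply mirror the slide.

```python
import re

def find_microsatellites(seq, min_unit=1, max_unit=5, min_repeats=10):
    """Return (start, unit, repeat_count) for each perfect tandem run found.

    Hypothetical helper: thresholds follow the slide's 1-5 bp unit and
    10 or more copies; imperfect (interrupted) repeats are not detected.
    """
    seq = seq.upper()
    # Group 1 captures the repeat unit (lazy, so the shortest unit is tried
    # first); \1{n,} then requires min_repeats - 1 further adjacent copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits

# Made-up sequence: a (CA)12 run with short non-repetitive flanks.
example = "GGTT" + "CA" * 12 + "TTGG"
print(find_microsatellites(example))  # [(4, 'CA', 12)]
```

Because the search scans left to right, the reported unit depends on where the run starts: the same array could be reported as CA or AC if the flanking base happens to extend it.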
  • 47. Example of a simple sequence repeat (CCCA or GGGT) in human genomic DNA
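One plausible reading of "CCCA or GGGT" is that the two names describe the same tandem array read from opposite strands. The sketch below checks this with a reverse complement; the helper name `reverse_complement` and the four-copy array are assumptions chosen for illustration.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

array = "CCCA" * 4                    # (CCCA)4 on one strand
other_strand = reverse_complement(array)
print(other_strand)                   # TGGGTGGGTGGGTGGG
# The opposite strand reads (TGGG)n, which is (GGGT)n shifted by one base.
print("GGGT" * 3 in other_strand)     # True
```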
  • 48.  Microsatellites have very high mutation rates (where a “mutation” means a change in repeat number).  Thus they are often variable within a population and useful for population genetics.  This property also makes them useful for “DNA fingerprinting”.
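Since a microsatellite "mutation" here means a change in repeat number, a fingerprint can be pictured as the set of repeat counts at several loci. The following is a rough sketch under that assumption; the locus names, motifs, and sequences are invented for the example, and real profiling measures fragment lengths rather than parsing sequence text.

```python
import re

def repeat_count(allele, motif):
    """Number of units in the longest uninterrupted tandem run of motif."""
    runs = re.findall(r"(?:%s)+" % re.escape(motif), allele.upper())
    return max((len(run) // len(motif) for run in runs), default=0)

def str_profile(alleles_by_locus, motif_by_locus):
    """Map each locus to the repeat count observed in its allele."""
    return {
        locus: repeat_count(allele, motif_by_locus[locus])
        for locus, allele in alleles_by_locus.items()
    }

# Hypothetical loci and alleles, for illustration only.
motifs = {"locusA": "CA", "locusB": "GATA"}
person1 = {"locusA": "GGTT" + "CA" * 12 + "TTGG", "locusB": "CC" + "GATA" * 8 + "CC"}
person2 = {"locusA": "GGTT" + "CA" * 15 + "TTGG", "locusB": "CC" + "GATA" * 8 + "CC"}

p1 = str_profile(person1, motifs)
p2 = str_profile(person2, motifs)
print(p1)        # {'locusA': 12, 'locusB': 8}
print(p2)        # {'locusA': 15, 'locusB': 8}
print(p1 == p2)  # False: the two samples differ at locusA
```

Comparing more loci makes a chance match between unrelated individuals less likely, which is why profiles use several microsatellites rather than one.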
  • 49. Retroposons  Retroposons resemble processed RNAs and transpose passively via an RNA intermediate.  Each element has an A-rich tail at the 3' end and short target site duplications (direct repeats of 5-21 bp) flanking the repeat.  Two main subclasses dominate this class:  Short Interspersed Elements (SINEs)  Long Interspersed Elements (LINEs)
  • 52. Short Interspersed Elements (SINEs)  These are distributed throughout the non-centromeric regions of the genome (over 100,000 copies per genome) (Weiner, 1986).  They contain one or more RNA polymerase III promoter sites and an A-rich region.  Ex: the primate-specific Alu sequence (about 300 bp), which has two promoter sites and a dimeric structure.
  • 53. Long Interspersed Elements (LINEs)  LINEs are composed of open reading frames (ORFs) followed by a 3' A-rich region, and occur at 20,000 to 50,000 copies per genome (Hutchison et al., 1989; Weiner, 1986).  Direct repeats of 6-15 bp flank the element.  Ex: the L1 family (the primary LINE family) is 6 to 7 kbp long.
  • 55. Single-copy genes vs. satellite DNA (highly repetitive sequences)  A single-copy gene has one locatable region on a DNA molecule; satellite DNA consists of highly repetitive sequences that can repeat up to 100,000 times in various places on a DNA molecule.  Single-copy genes make up 1–2% of the human genome; satellite DNA constitutes more than 5% of the human genome.  A single-copy gene corresponds to a unit of inheritance (i.e., a protein); satellite DNA is not involved with inheritance.  Single-copy genes are transcribed to make RNA, which in turn is translated to make a protein; satellite DNA is not transcribed.  Single-copy genes are usually thousands of base pairs in length; satellite DNA is typically between 5 and 300 base pairs per repeat.  Single-copy genes are less useful for DNA profiling; satellite DNA has a high rate of mutation, making it useful for DNA profiling.
  • 56. Role of repetitive sequences:  Tandem repeat hypervariability enables identification of genes, e.g. the antifreeze gene and genes for several degenerative diseases.  Repeats may help stabilize transcripts or proteins, but repeat expansions and instability (particularly of trinucleotide repeats) lead to neurological disorders and cancer (Ashley and Warren, 1995; Mitas, 1997).  Long stretches of CAG repeats translated into polyglutamine tracts result in a gain of function, possibly a toxic one (Perutz et al., 1994; Baldi et al., 1999).
  • 57.  CGG, AGG and TGG repeats form quadruplex structures and GAA repeats form triplex structures that can block or reduce transcription and DNA replication (Sinden, 1999).  CGG repeats also destabilize nucleosomes (Sinden, 1999) due to CpG hypermethylation, leading to promoter repression and lack of gene expression (Nelson, 1995; Baldi et al., 1999). On the other hand, CTG repeats stabilize nucleosomes and block replication forks in E. coli (Sinden, 1999).
  • 59. Functions of highly repetitive DNA sequences:  Structural and organizational roles in chromosomes  Involvement in chromosome pairing during meiosis  Involvement in crossing over and recombination  Protection of important structural genes like histone, rRNA or ribosomal protein genes  A repository of unessential DNA sequences for use in the future evolution of the species and  No function at all – just junk DNA that is carried along by the processes of replication and segregation of chromosomes.
  • 60. References:  Brown T.A. 2002. Genomes. Wiley-Liss.  Snustad and Simmons. Genetics.  Tamarin. Principles of Genetics.  Lewin B. Genes IX.  Related articles.