Rna Methodologies Laboratory Guide For Isolation
And Characterization 5th Robert E Farrell
Preface
RNA NEVER CEASES TO AMAZE
The first edition of RNA Methodologies was published in 1993. At that time,
RNA was viewed as an “interesting” molecule and many molecular biolo-
gists were happy if they could do a decent Northern blot. Twenty-five years
later, we have at least begun to appreciate that RNA is as diverse in function
as it is in form within the society of the cell. With that in mind, a major goal
of this book is to tell the RNA story from several perspectives to ensure a
holistic understanding of this aspect of molecular biology. The recurrent
themes herein are the correct way to isolate, handle, store, and assay RNA,
and an appropriate level of background information related to the fundamen-
tals of gene expression is likewise provided.
Many roles of RNA support the widely acclaimed “RNA world hypothesis,”
in which RNA, and not DNA, protein, or anything else was the primary pri-
mordial information molecule. RNA serves as a keeper of genetic information
(RNA viral genomes), a transporter of genetic information (mRNA), a guide
that leads proteins to specific RNA sequences on other molecules for possible
modification (gRNA, siRNA), a powerful posttranscriptional regulator of
gene expression (miRNA), the scaffolding of the protein synthesis machinery
(rRNA component of ribosomes), a sustainer of translation (tRNA transport of
amino acids), and a modifier of itself and other molecules (self-splicing and
catalytic RNA). Without a doubt, there are many other functions associated
with RNA that have yet to be uncovered. It is safe to say that we are in the midst
of a revolution in terms of our understanding of the several faces of RNA.
Who would have ever imagined?
In the RNA world, it is all about quality control. It is well known that
RNA molecules are examined, repeatedly and systematically, from the onset
of transcription, posttranscriptionally, and throughout their biological lifespan,
which ends with their dismantling when they have been damaged or are otherwise
no longer needed by the cell. To put this into perspective, consider CQI, that
is, continuous quality improvement. Successful companies and other institu-
tions often embrace the philosophy of CQI in order to sustain optimized per-
formance. Some of the CQI strategies that the cell has been using from time
immemorial are just starting to be understood, and it is truly mind-boggling.
The multiple quality control checkpoints associated with the production of
mRNA ensure that only the highest fidelity, error-free template material is
available to the ribosomes for protein synthesis.
In high school in the mid-1970s, this Author learned about the three, and
only three, types of RNA known at that time, to wit, mRNA, tRNA, and of
course rRNA; at least one long, noncoding RNA (lncRNA) was known back
then! In the present day, numerous functional lncRNAs have been described,
not to mention their all-important smaller miRNA cousins. In many patho-
logical states, miRNA expression patterns are altered, leading to detrimental
changes to cellular morphology and cellular physiology. Retrospectively, it
is amazing that miRNAs remained unknown for as long as they did. Perhaps
the major reason is the fact that RNA isolation protocols, particularly in the
1980s and early 1990s, did not favor the efficient recovery of small tran-
scripts. This was also true of the first generation of molecular biology kits.
With the contemporary tools now at hand, new transcript species are being
identified continuously. The number of known miRNAs in human cells is
already in the thousands and these small, powerful transcripts are critical
modulators of the flow of genetic information.
Many functions that RNA molecules are able to perform are a direct
result of the single-stranded character of polyribonucleotides. Consider, for
example, the presence of regulatory structures formed by some RNA mole-
cules such as stems, loops, and hairpins, and compare these structures to reg-
ulatory sequences such as AUG, UAG, and AAUAAA, which influence
translation and various posttranscriptional facets of RNA biogenesis. These
complementary properties are inherent to RNA because of its amazing ability
to fold into formidable secondary and tertiary structures, thereby imparting
transcript functionalities perhaps as diverse as its very nucleotide sequence.
Regarding the business at hand, transcriptional profiling is possible only
when high-quality RNA is isolated from its biological source, such that it is
able to support reverse transcription, hybridization, and downstream applica-
tions that include variegated quantitation assays as well as the detection of
previously uncharacterized genes, differentially spliced transcripts, and tran-
scripts with multiple start sites. These abilities are of ever increasing impor-
tance because of the apparent link between an abnormal abundance of a
transcript (coding or noncoding; too high or too low) and a genetic disease.
While impressively sensitive methods now exist for measuring transcript
abundance, it is just as important to be able to identify polymorphisms within
transcripts. For example, alternative splicing imparts an added level of vul-
nerability to mutations and the disease state. Moreover, serious thought is
required “outside the box,” i.e., the cell, because circulating nucleic acids
offer enormous potential as biomarkers. Succinctly, what happens at the
RNA level often determines the fate of the cell.
There have been many wonderful technological advances in the study
of RNA since the publication of the previous edition of this book in 2010,
and many of those techniques and their applications and limitations are
discussed here. My philosophy in the preparation of the fifth edition of RNA
Methodologies has been that while technology is great, the fundamentals of
working with RNA must be understood because they are the foundation
upon which the contemporary methods to which the research community has
become accustomed have been built. It may come as something of a surprise
to learn that plenty of people continue to use comparatively roughhewn tools
such as the time-honored Northern blot, often as a means of confirming data
gleaned from more sophisticated techniques. Regardless of the method, good
laboratory practices (also a quality control system) associated with RNA
methodologies are important to know about, particularly when it becomes
necessary to troubleshoot (it always does).
In light of the many advances in the study of RNA, another goal of this
book is to unify many of the facets of RNA characterization in a coherent
start-to-finish format. One of the difficulties toward the realization of this
goal is that the rapid succession of new techniques, and variants thereof, has
resulted in confusing technical nomenclature. To make matters worse, not
everyone uses the same terminology to describe the same techniques.
Regardless of the intended experimental trajectory, the purification of high-
quality RNA, what this Author affectionately refers to as eRNA (excellent
RNA!), from the biological source is always the starting point. Whether iso-
lated from cell culture or directly from whole tissue, only the meticulous
handling of RNA will support experiments that will be used for its study. All
of the background information and the updates included herein are appropri-
ate since the RNA novice lacks the historical perspective and frame of refer-
ence that more experienced investigators often enjoy.
This laboratory guide represents a growing collection of tried, tested, and
optimized laboratory protocols for the isolation and characterization of
eukaryotic RNA, with lesser emphasis on the characterization of prokaryotic
transcripts. Another goal of this book is to help the reader develop greater
confidence in the laboratory. Consequently, this text is written for the princi-
pal investigator, bench scientist, physician, veterinarian, lab technician, grad-
uate student, undergraduate research assistant, and anyone else capable of
performing basic research techniques—there is something in it for everyone.
This resource is intended to provide a rationale to assist in the decision-
making process for individuals at all levels of experience by presenting real-
istic alternatives for achieving the same experimental goals, and demonstrat-
ing how various techniques contribute to the understanding of gene
expression and functionality. Many of the incorporated notations and hints
are based upon personal experience and pave the way for the expedient
recovery of RNA and the most judicious use of resources. It is unfortunate
that commonplace unsound tactics for RNA handling and characterization
result in wasted resources due to an obvious failure to understand the “what”
and the “why” from the onset of the study. The best advice that I can offer:
always think two steps ahead in an experiment, and reflect upon how the
method of RNA isolation and the ensuing protocols will impact the interpre-
tation of data.
While it is hoped that this text be studied from cover to cover, one may
pick and choose salient protocols without loss of continuity. Collectively, the
chapters work together to embellish the RNA story, each presenting clear
take-home lessons. The liberal incorporation of flow charts, tables, and
representative data likewise facilitates learning and assists in the planning and
implementation phases of a project. You are limited only by your own
ingenuity.
  
The Author acknowledges, with sincere thanks and appreciation, the
intellectual encouragement of the many colleagues and friends who, in some
way, supported the preparation of this manuscript. The support and patience
of the Author’s family are also gratefully acknowledged and are very much
appreciated.
Initium sapientiae timor Domini
For Catherine Ann,
Sean Patrick, Emma Catherine, Liam Michael, and Patrick Joseph
Chapter 1
RNA and the Cellular
Biochemistry Revisited
WHY STUDY RNA?
All cell and tissue functions are ultimately governed by gene expression.
Consequently, the reasons for electing to study the modulation of RNA
levels as at least one parameter of the cellular biochemistry may be as
diverse as the intracellular RNA population itself. Generally speaking, the
characterization of RNA is almost always related to transcription, i.e., gene
expression questions being asked in the context of a particular scientific
inquiry, and most often revolves around measuring the dynamic abundance
level of one or several transcripts.
The goals in any experimental design involving RNA generally revolve
around one or more fundamental themes, including but not limited to the
following:
1. Measurement of the steady-state abundance of cellular transcripts.
Steady-state RNA refers to the net accumulation of transcription products
in the cell, or in a subcellular compartment such as the nucleus or the
cytoplasm. It is the combined result of RNA synthesis, stability, and deg-
radation. This is the most common reason why RNA is isolated from
cells and tissues. Analysis may focus on one transcript, a few transcripts,
or all transcripts simultaneously; this latter approach is commonly known
as global analysis of gene expression or whole transcriptome profiling.
Given the ease with which RNA can be purified from biological sources,
the use of various sensitive, contemporary approaches is widespread for
generating quantitative and qualitative profiles of RNA populations using
any of a variety of laboratory techniques.
2. Synthesis of complementary DNA (cDNA). Unstable, single-stranded
messenger RNA (mRNA) can serve as the template for the in vitro syn-
thesis of very stable single- or double-stranded cDNA molecules. This is
the first step for subsequent amplification by the polymerase chain reac-
tion (PCR), often for some “quantitative purpose,” for transcript mapping
purposes, for direct ligation into a vector for sequencing or for expression
of the encoded protein, for the physical separation of two or more cDNA
species, or for the older strategy of synthesizing an entire cDNA library
RNA Methodologies. DOI: http://dx.doi.org/10.1016/B978-0-12-804678-4.00001-4
© 2017 Elsevier Inc. All rights reserved.
(older literature occasionally refers to a cDNA library as a “clone bank”),
which can be propagated for long-term storage and analysis. In any
event, the construction of cDNA is the creation of a permanent biochemi-
cal record of the cell at the moment of cellular disruption. Historically,
the synthesis of highly representative cDNA is one of the most important
methodologies in the molecular biology laboratory and, in some hands,
remains a significant challenge.
3. Detection of viruses which harbor an RNA genome. This proceeds via
the synthesis of cDNA, as described above, followed by PCR or another
cDNA amplification method.
4. Identification of the transcription start site (TSS). Historically, mapping
of RNA molecules, including the 5′ end, the 3′ end, and the size and
location of introns, was accomplished via nuclease protection assay, as
described in Chapter 18, Quantification of Specific mRNAs by Nuclease
Protection. Transcript mapping is now almost always performed by some
variant of 5′- or 3′-rapid amplification of cDNA ends
(RACE; see Chapter 8: RT-PCR: A Science and an Art Form). As it is
well known that a single genetic locus often has the potential to produce
multiple RNAs, each with a different TSS and often in a tissue-specific
manner, TSS mapping is an invaluable technique.
5. Measurement of the rate of transcription of gene sequences or the path-
ways of RNA processing. This may be deduced, at least in part, by the
nuclear run-on assay in which radiolabeled ribonucleotide precursors are
incorporated into nascent transcripts in direct proportion to the abundance
of each species of RNA being transcribed (see Chapter 19: Analysis of
Nuclear RNA). When used in conjunction with other methods that exam-
ine steady-state RNA levels, the regulation of genes can often be
assigned as transcriptional or due to posttranscriptional events.
6. In vitro translation of purified mRNA. The resulting polypeptide may be
further characterized by immunoprecipitation or Western analysis. Cell-
free translation represents an older method for the identification of specific
transcripts: by providing the raw materials needed to support translation,
one is able to demonstrate that a transcript of putative identity is able to
support the synthesis of the cognate peptide. For example, this approach
could be used to demonstrate that two transcripts from the same genetic
locus with alternative TSSs are, in fact, able to direct the synthesis of iden-
tical or closely related proteins. In applications such as rational drug
design, in vitro translation is helpful because understanding the three-
dimensional architecture of a protein, and its wild type or mutated function
(s), may suggest novel applications in the area of functional proteomics.
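Item 1 above describes steady-state abundance as the combined result of RNA synthesis, stability, and degradation. A minimal sketch of that idea follows, assuming (for illustration only) a constant synthesis rate and first-order decay; the function name, the parameters, and the numbers are hypothetical and are not taken from this book.

```python
import math


def steady_state_abundance(synthesis_rate: float, half_life: float) -> float:
    """Return the steady-state number of transcript copies per cell.

    Assumption (not from the text): synthesis is constant, in copies per
    unit time, and decay is first order with the given half-life expressed
    in the same time unit.
    """
    if synthesis_rate < 0 or half_life <= 0:
        raise ValueError("synthesis_rate must be >= 0 and half_life must be > 0")
    decay_constant = math.log(2) / half_life  # first-order rate constant
    # At steady state, synthesis balances degradation:
    # synthesis_rate = decay_constant * abundance
    return synthesis_rate / decay_constant


# Illustrative numbers only: identical synthesis rates, different stabilities.
stable = steady_state_abundance(synthesis_rate=2.0, half_life=8.0)
unstable = steady_state_abundance(synthesis_rate=2.0, half_life=0.5)
```

Under these assumptions, two transcripts made at the same rate can differ widely in abundance purely because of stability, which is why steady-state measurements alone may not distinguish transcriptional from posttranscriptional regulation.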
WHAT IS RNA?
RNA is a long, unbranched polymer of ribonucleoside monophosphate moie-
ties joined together by phosphodiester linkages. Both eukaryotic and
prokaryotic RNAs are single-stranded molecules. The unlinked monomer
building blocks of both RNA and DNA are known generically as nucleotides.
Each nucleotide consists of three key components: a pentose (five-carbon
sugar), at least one phosphate group (nucleotides may contain as many as
three phosphate groups), and a nitrogenous base (Fig. 1.1). A nitrogenous
base joined to a pentose sugar is known as a nucleoside. When a phosphate
group is added, the composite, a phosphate ester of the nucleoside, is known
as a nucleotide.
base + sugar = nucleoside
nucleoside + phosphate = nucleotide
The components of RNA and DNA nucleosides and nucleotides are com-
pared in Table 1.1.
The key chemical difference between RNA and DNA is the presence of the
five-carbon sugar ribose, in which a hydroxyl group (OH) is joined to the
2′ carbon of the sugar, in the case of RNA; the absence of the 2′-OH
group in DNA is the underlying basis of the name of the sugar, “deoxyribose.”
In addition, the base uracil is found in RNA, substituted in DNA by
the closely related pyrimidine thymine (chemically, thymine is 5-methyluracil),
though it is possible to find deoxynucleotides containing uracil in certain
situations. More precisely, RNA is assembled from ribonucleotide precursors
and DNA is assembled from deoxyribonucleotide precursors. Hence, RNA is
so named because of the ribose sugar it contains, just as DNA is named for
its constituent 2′-deoxyribose sugar. Essential base, nucleoside, and nucleotide
nomenclature is summarized in Table 1.2.
FIGURE 1.1 The identity of a nucleotide is defined by the base that is attached to the 1′ carbon.
In practice, the nucleotides that make up an RNA or a DNA molecule are represented by the stan-
dard one-letter abbreviation for the base each contains: adenine (A), cytosine (C), guanine (G),
thymine (T), and uracil (U).
Nitrogenous bases and the pentose sugar components of nucleosides are
both cyclic. By convention, the numbering system for the carbon and nitrogen
atoms that make up the bases is 1, 2, 3, and so forth, while the numbering
system for the constituent carbon atoms of the sugar (ribose or
deoxyribose) is 1′, 2′, 3′, 4′, and 5′. The purpose of this nomenclature is to
avoid confusion when referring to the constituent atoms of the sugar versus
those found in the base of a particular nucleotide or nucleoside.

TABLE 1.1 Comparative Nucleotide Structure
                             RNA Nucleotides       DNA Nucleotides
Key Nucleotide Components    Five-carbon ribose    Five-carbon deoxyribose
                             Phosphate group(s)    Phosphate group(s)
                             Nitrogenous base      Nitrogenous base
Common Nitrogenous Bases
  Purines                    Adenine               Adenine
                             Guanine               Guanine
  Pyrimidines                Cytosine              Cytosine
                             Uracil                Thymine

TABLE 1.2 Essential Base, Nucleoside, and Nucleotide Nomenclature
       Base            Nucleoside            Nucleotide Triphosphate Precursors
RNA    Adenine (A)     Adenosine             Adenosine-5′-triphosphate (ATP)
       Cytosine (C)    Cytidine              Cytidine-5′-triphosphate (CTP)
       Guanine (G)     Guanosine             Guanosine-5′-triphosphate (GTP)
       Uracil (U)      Uridine               Uridine-5′-triphosphate (UTP)
DNA    Adenine (A)     2′-Deoxyadenosine     2′-Deoxyadenosine-5′-triphosphate (dATP)
       Cytosine (C)    2′-Deoxycytidine      2′-Deoxycytidine-5′-triphosphate (dCTP)
       Guanine (G)     2′-Deoxyguanosine     2′-Deoxyguanosine-5′-triphosphate (dGTP)
       Thymine (T)     2′-Deoxythymidine     2′-Deoxythymidine-5′-triphosphate (dTTP)
Nucleosides consist of a base and a sugar only. A nucleoside is promoted to a nucleotide upon
addition of at least one phosphate group.
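The naming pattern summarized in Table 1.2 is regular enough to express as a small lookup. The sketch below is illustrative only: the dictionary and the function name `triphosphate_precursor` are hypothetical, and the names it produces are simply those listed in the table.

```python
# Nucleoside names as listed in Table 1.2, keyed by nucleic acid and base.
NUCLEOSIDES = {
    "RNA": {"A": "Adenosine", "C": "Cytidine", "G": "Guanosine", "U": "Uridine"},
    "DNA": {
        "A": "2′-Deoxyadenosine",
        "C": "2′-Deoxycytidine",
        "G": "2′-Deoxyguanosine",
        "T": "2′-Deoxythymidine",
    },
}


def triphosphate_precursor(base: str, nucleic_acid: str) -> str:
    """Return the triphosphate precursor name for a base in RNA or DNA."""
    kind = nucleic_acid.upper()
    letter = base.upper()
    if kind not in NUCLEOSIDES or letter not in NUCLEOSIDES[kind]:
        raise ValueError(f"{base!r} is not listed for {nucleic_acid!r} in Table 1.2")
    # The deoxy form is abbreviated with a lowercase "d" prefix (dATP, dTTP, ...).
    prefix = "d" if kind == "DNA" else ""
    return f"{NUCLEOSIDES[kind][letter]}-5′-triphosphate ({prefix}{letter}TP)"


rna_example = triphosphate_precursor("U", "RNA")
dna_example = triphosphate_precursor("T", "DNA")
```

The lookup deliberately rejects uracil for DNA and thymine for RNA, mirroring the table rather than the unusual cases mentioned in the text.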
The ribonucleoside triphosphates are collectively referred to as NTP; in
various molecular biology protocols, the symbol NTP refers to an equimolar
cocktail of ATP, CTP, GTP, and UTP. Similarly, the deoxy-form of a nucle-
otide is denoted by the placement of a lower case “d” preceding the nucleo-
tide triphosphate, as in dATP, dCTP, dGTP, and dTTP, and the symbol
dNTP (also, dXTP) refers to an equimolar cocktail of the four deoxynucleo-
side triphosphates in protocols capable of supporting the synthesis of cDNA
or PCR products. It is the triphosphate form of a nucleotide that is utilized as
a precursor during nucleic acid synthesis. The phosphate nearest to the sugar
is known as the α phosphate, followed by the β phosphate, followed by the γ
phosphate, which is furthest from the nucleoside moiety (Fig. 1.2). During
nucleic acid polymerization, the β and γ phosphates are cleaved (released)
from the nucleotide as inorganic pyrophosphate (PPi), and the resulting
single-phosphate nucleotide, a nucleoside monophosphate, is then incorporated into
the nascent polynucleotide chain.
POLYNUCLEOTIDE SYNTHESIS
Any enzyme with an associated polymerase activity is capable of synthesiz-
ing nucleic acid molecules from nucleotide precursors. The synthesis of
RNA is mediated by the activity of enzymes known as RNA polymerases
while DNA is synthesized, not unexpectedly, by DNA polymerases. A
nucleic acid molecule is the result of linking nucleotides together by
phosphodiester bonds. The formation of these bonds involves the nucleophilic
attack by the 3′ hydroxyl group of the last nucleotide added to the nascent
polynucleotide on the 5′ (α) phosphate group of the incoming nucleotide
(Fig. 1.3). For this reason nucleic acid synthesis is said to proceed 5′→3′,
and there are no known exceptions to this process.
FIGURE 1.2 Adenosine-5′-triphosphate (ATP). The three constituent phosphate groups are
designated α, β, and γ based on the proximity of each group to the nucleoside (base + sugar)
component of the molecule. The replacement of the 2′-OH with H would convert this molecule
to a deoxynucleotide.
In order for the synthesis of nucleic acids to occur in vivo or in vitro,
there are two fundamental requirements that must be fulfilled and maintained
to initiate and to support continued nucleic acid polymerization:
1. There must be a template (a strand or an oligomer) to direct the
polymerase-mediated insertion of the correct (complementary) nucleotide
into the nascent chain. This occurs predictably, according to the conventions
set down in Chargaff’s Rule (Zamenhof et al., 1952), which succinctly states
that adenine ordinarily base pairs with thymine or uracil through the formation
of two hydrogen bonds (A::T, A::U) and that guanine ordinarily base pairs with
cytosine through three hydrogen bonds (G:::C). (DNA polymerases capable of
adding nucleotides without template information are said to exhibit rare
terminal transferase activity, as in the case of the unusual enzyme terminal
deoxynucleotidyl transferase. This enzyme has broad applications in the area
of cDNA synthesis as well as certain forms of 5′-RACE. These special cases are
discussed in detail in Chapter 7, cDNA: A Permanent Biochemical Record
of the Cell, and Chapter 8, RT-PCR: A Science and an Art Form.)
2. For initiation and elongation, there must be a free 3′-OH to which the
next nucleotide in the chain can be joined via a phosphodiester linkage.
Thus, the entire process of transcription requires some type of primer
manifesting the requisite 3′-OH. This applies equally to RNA and DNA
synthesis, both in vivo and in vitro. Most of the enzymes used in molecular
biology that exhibit polymerase activity have nearly identical template
and 3′-OH primer requirements.
[Figure 1.3: structure of a dinucleotide, with the α, β, and γ phosphates, the phosphodiester linkage, the 5′ end, and the 3′ end labeled.]
FIGURE 1.3 The dinucleotide that results from the formation of the first phosphodiester linkage
has structurally different ends, namely a phosphate group at the 5′ end and a hydroxyl group
at the 3′ end. The structural differences at the 5′- and 3′-ends are maintained regardless of the
number of nucleotides that are joined together.
This results in a polynucleotide with a consistent pattern of 5′→3′
linkages between adjacent nucleotides; elongation is frequently referred to as
the 5′→3′ polymerase activity associated with the enzyme. Upon completion,
nucleic acid molecules are assembled in such a way that:
1. The ends of the molecule are structurally different from one another. The
first nucleotide of the molecule has an uninvolved 5′ (tri)phosphate, constituting
the so-called 5′ end of the molecule. The last nucleotide that
was added exhibits a free 3′ hydroxyl group, and this is known as the 3′
end of the molecule.
2. The backbone of the molecule consists of an alternating series of sugar
and phosphate groups. Known as the phosphodiester backbone, or simply
the backbone, of the molecule, it imparts a net negative charge to the
molecule by virtue of its constituent phosphate groups.
3. The base associated with each nucleotide protrudes away from the back-
bone of the molecule. This stereochemistry makes the bases very accessi-
ble for hydrogen bonding (base pairing) to a complementary
polynucleotide sequence. This proclivity is at the very heart of molecular
hybridization in the laboratory.
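The two requirements described above, a template and a primer offering a free 3′-OH, together with 5′→3′ elongation, can be sketched as a toy primer-extension routine. This is a simplified illustration, not a model of any particular enzyme: the function names, the RNA-template/DNA-product setup (as in cDNA synthesis), and the example sequences are all hypothetical.

```python
# RNA template base -> complementary DNA base added to the growing product.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}
# DNA primer base -> RNA template base to which it anneals.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def annealing_site(primer: str) -> str:
    """Return the template sequence (5'->3') that the primer base pairs with.

    Pairing is antiparallel, so the primer is read in reverse.
    """
    return "".join(DNA_TO_RNA[base] for base in reversed(primer))


def extend_primer(template: str, primer: str) -> str:
    """Extend a DNA primer along an RNA template; all sequences are 5'->3'."""
    if not primer:
        raise ValueError("a primer providing a free 3'-OH is required")
    if not template.endswith(annealing_site(primer)):
        raise ValueError("primer does not anneal to the 3' end of the template")
    product = primer
    unread = template[: len(template) - len(primer)]
    # The template is read 3'->5' while the product grows 5'->3':
    # each new base is joined to the 3' end of the product.
    for base in reversed(unread):
        product += RNA_TO_DNA[base]
    return product


# Hypothetical transcript with a short run of A residues at its 3' end.
cdna = extend_primer("AUGGCCUAAAAAA", "TTTT")
```

The product is the same length as the template and complementary to it in the antiparallel sense; the routine refuses to start without a primer, which mirrors the 3′-OH requirement in item 2.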
The nitrogenous bases found in nucleotides are categorized as either pur-
ines (adenine and guanine) or pyrimidines (cytosine, thymine, and uracil),
both of which are flat aromatic molecules. The specificity of base pairing
(purine with pyrimidine) is maintained by the stereochemical preferences of
the bases listed here. In other words, what is commonly known as
“Watson–Crick” base pairing is predicated on the bases involved being in
their preferred tautomeric forms.
               Base        Preferred     Rare
Purines        Adenine     Amino form    Imino form
               Guanine     Keto form     Enol form
Pyrimidines    Cytosine    Amino form    Imino form
               Thymine     Keto form     Enol form
               Uracil      Keto form     Enol form
Hydrogen bonds, which are highly directional, form between complementary
bases when an electropositive hydrogen atom is attracted to an electronegative
atom such as oxygen or nitrogen. Because of the manner in which bases
protrude from their respective phosphodiester backbones, antiparallel base pairing
or hybridization of complementary strands is strongly favored. Thus, the 5′
end of one strand is opposite the 3′ end of the complementary strand to which it
is base-paired, and this is often represented as shown in the following diagram:
5′ ------------------------ 3′
3′ ------------------------ 5′
This is true for all double-stranded molecules: dsDNA, dsRNA, and
DNA:RNA hybrids. The ability to promote, or to prevent, base pairing in
this manner is a central act in the molecular biology laboratory.
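As a small worked illustration of antiparallel pairing, the sketch below counts the hydrogen bonds in a fully complementary duplex, using the two-bond (A::T, A::U) and three-bond (G:::C) convention stated earlier. The function name and the example sequences are hypothetical, and only canonical pairs are handled.

```python
# Hydrogen bonds per canonical base pair, as stated in the text.
H_BONDS = {frozenset("AT"): 2, frozenset("AU"): 2, frozenset("GC"): 3}


def duplex_hydrogen_bonds(strand_a: str, strand_b: str) -> int:
    """Count hydrogen bonds between two strands, both written 5'->3'.

    The strands pair antiparallel, so the first base of strand_a is
    matched with the last base of strand_b, and so on.
    """
    if len(strand_a) != len(strand_b):
        raise ValueError("strands must be the same length")
    total = 0
    for base_a, base_b in zip(strand_a.upper(), reversed(strand_b.upper())):
        pair = frozenset((base_a, base_b))
        if pair not in H_BONDS:
            raise ValueError(f"{base_a} and {base_b} do not form a canonical pair")
        total += H_BONDS[pair]
    return total


# An RNA strand paired with a complementary DNA strand (a DNA:RNA hybrid).
hybrid_bonds = duplex_hydrogen_bonds("GAUC", "GATC")
```

Because G:::C pairs contribute three bonds each, a G+C-rich duplex of a given length accumulates more hydrogen bonds than an A+T-rich one in this simple count.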
The obvious structural differences at the 5′ and 3′ ends of a molecule
support a convention by which one may unambiguously refer to the position
of any feature of a nucleic acid molecule in relation to any other feature:
Upstream means that a structure or feature is closer to or in the direction
of the 5′ end of the molecule, relative to some other point of reference; it
can also mean in the opposite direction of gene expression.
Downstream means that a structure or feature is closer to or in the direction
of the 3′ end of the molecule, relative to some other point of reference;
it can also mean in the direction of gene expression.
For the sake of simplicity, upstream and downstream are most often used
to mean “in the opposite direction of expression” and “in the direction of
expression,” respectively. This nomenclature may be especially useful when
describing features or regions of a double-stranded nucleic acid molecule, in
discussions pertaining to either the structure or the expression of a gene and,
in particular, for the purpose of primer design to support PCR (see
Chapter 8: RT-PCR: A Science and an Art Form).
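The upstream/downstream convention amounts to comparing positions along a sequence written 5′→3′. A minimal sketch follows; the function name and the example transcript are hypothetical, and only the first occurrence of each motif is considered.

```python
def relative_position(sequence: str, feature: str, reference: str) -> str:
    """Say whether `feature` lies upstream or downstream of `reference`.

    `sequence` is written 5'->3'; only the first occurrence of each motif
    is considered.
    """
    feature_index = sequence.find(feature)
    reference_index = sequence.find(reference)
    if feature_index < 0 or reference_index < 0:
        raise ValueError("both motifs must be present in the sequence")
    if feature_index < reference_index:
        return "upstream"  # closer to the 5' end than the reference
    if feature_index > reference_index:
        return "downstream"  # closer to the 3' end than the reference
    return "same position"


# Made-up transcript containing an AUG, a UAG, and an AAUAAA motif.
transcript = "GGAUGCCCUAGGGAAUAAAGG"
start_vs_signal = relative_position(transcript, "AUG", "AAUAAA")
signal_vs_stop = relative_position(transcript, "AAUAAA", "UAG")
```

The same comparison applies when placing PCR primers relative to a feature of interest, provided all coordinates refer to the same strand.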
The actual base sequence, i.e., the linear order of ribonucleotides, is
known as the primary (1°) structure of an RNA molecule, and this order is
dictated by the order of nucleotides on the DNA template strand. A single RNA
molecule has a tremendous proclivity for intramolecular base pairing,
resulting in what is known as secondary (2°) structure.
The variety of possible interactions within the phosphodiester backbone is
often described using such colorful nomenclature as RNA hairpins, stems,
interior loops, bulge loops, multibranched loops, kissing loops, cruciform
structures, and pseudoknots (Fig. 1.4). Higher-order three-dimensional
folding, the so-called tertiary (3°) structure which RNA molecules exhibit, is
best described as the collection of 2° structural elements arranged in such
a way that an RNA molecule is able to perform its biological function.
Much has been suggested, for example, about the role of folding by careful
study of transfer RNAs, the classical example of intramolecular base
pairing par excellence. It is important to note that some of the 2° and 3°
structures of tRNA are attributed to the formation of noncanonical base
pairs. The canonical base pairs are G·C, A·T, and A·U; examples of
noncanonical base pairs include G·U, A·C, A·G, C·U, U·U, G·G, A·Ψ
(Ψ = pseudouridine), G·Ψ, A·A·U trimers, and others. An excellent
database containing known noncanonical base pairs involving RNA is maintained
by Dr. George Fox at http://prion.bchs.uh.edu/bp_type/ (Nagaswamy et al.,
2000, 2002). Contemporary studies have demonstrated that mRNA also
assumes varying degrees of transient 2° and 3° structures which, in no small
measure, influence its function in the cytoplasm. For most laboratory appli-
cations, higher-order folding must be disrupted, as described below, before
an assay with a quantitative component can be performed using an RNA
sample. Failure to do so generally has a severe negative impact on accurate
quantitative profiling of the sample.
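The distinction between canonical and noncanonical base pairs drawn above can be expressed as a simple lookup. The following is a minimal sketch, assuming that only the pairs named in the text are of interest; the function name `classify_pair` and the label "unlisted" are illustrative choices and do not come from any published tool or database.

```python
# Canonical pairs as listed in the text (A-T applies when DNA is involved).
CANONICAL = {frozenset("GC"), frozenset("AT"), frozenset("AU")}

# A subset of the noncanonical pairs named in the text. A pair of identical
# bases (e.g., U-U) collapses to a one-element set, which is intentional.
NONCANONICAL = {
    frozenset("GU"), frozenset("AC"), frozenset("AG"), frozenset("CU"),
    frozenset("U"), frozenset("G"), frozenset("AΨ"), frozenset("GΨ"),
}


def classify_pair(base1, base2):
    """Classify a two-base pairing as canonical, noncanonical, or unlisted."""
    pair = frozenset((base1.upper(), base2.upper()))
    if pair in CANONICAL:
        return "canonical"
    if pair in NONCANONICAL:
        return "noncanonical"
    return "unlisted"  # not among the pairs named in the text


if __name__ == "__main__":
    for a, b in [("G", "C"), ("G", "U"), ("U", "U"), ("C", "C")]:
        print(a, b, classify_pair(a, b))
```

Because the order of the two bases is irrelevant to the classification, each pair is stored as an unordered set; base triples such as the A·A·U trimer are deliberately left out of this sketch.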
TYPES OF RNA
Transcription results in the production of RNA molecules, generically referred
to as transcripts. In the past, cellular transcripts were broadly classified as
ribosomal RNA (rRNA), transfer RNA (tRNA), heterogeneous nuclear RNA
(hnRNA), or messenger RNA (mRNA), as well as a collection of small RNAs
of previously unknown function. Now, however, one must include the very
diverse population of noncoding RNA (ncRNA), all of which are of immense
interest in the study of the regulation of gene expression (Table 1.3). Each cat-
egory of RNA, which in eukaryotic cells is synthesized by a different type of
RNA polymerase, performs a different function in the cell. In contrast, all
transcripts in bacteria are produced by a single type of RNA polymerase. The
various types of RNA are not represented in equal amounts—the abundance
of each is directly related to the physiology of the cell.
rRNA is the most abundant RNA component in the cell. In prokaryotic
cells the major rRNA species are the 23S rRNA, 16S rRNA, and 5S rRNA.
FIGURE 1.4 Examples of secondary structure commonly observed in single-strand
RNA molecules (panels: helix, stem-loop, bulge, pseudoknot, three-stem
junction). Note how a single molecule is able to exhibit intramolecular base
pairing by the antiparallel juxtaposition of complementary regions.
Double-stranded regions may be perfectly or imperfectly base-paired. To a large
extent, the variety and locations of stems, hairpins, and loops will influence
the ensuing tertiary structure.
TABLE 1.3 RNA Types and Functions

| RNA Type | Name | Symbol | Basic Function | Prokaryotic | Eukaryotic |
|---|---|---|---|---|---|
| Coding | Messenger RNA | mRNA | Template for the synthesis of proteins | Yes | Yes |
| Coding | Heterogeneous nuclear RNA | hnRNA | Large unspliced precursor of mRNA (pre-mRNA) | No | Yes |
| Long noncoding (lnc) | Ribosomal RNA | rRNA | Forms scaffolding of the ribosomal subunits | Yes | Yes |
| Long noncoding (lnc) | Transfer RNA | tRNA | Transports amino acids to the ribosome to support translation | Yes | Yes |
| Long noncoding (lnc) | Long intergenic noncoding RNA | lincRNA | Product of transcription of intergenic regions | No | Yes |
| Long noncoding (lnc) | XIST | Xist | Sex-linked lncRNA involved in X-chromosome inactivation and Barr body formation | No | Yes |
| Small noncoding (snc) | Small nuclear RNA | snRNA | Facilitates splicing of hnRNA into mature mRNA as well as rRNA processing. snRNA molecules exist as an RNA-protein complex, referred to as a snRNP, or snurp | No | Yes |
| Small noncoding (snc) | Small nucleolar RNA | snoRNA | Processing of immature rRNA transcripts in the nucleolus; some have a role in gene silencing | No | Yes |
| Small noncoding (snc) | Small cytoplasmic RNA | scRNA | Facilitates protein trafficking and secretion; possible mRNA degradation. scRNA molecules exist as an RNA-protein complex, referred to as a scRNP, or scyrp | Yes | Yes |
| Small noncoding (snc) | microRNA | miRNA | Short antisense RNAs that participate in the regulation of gene expression by blocking mRNA and inhibiting translation | No | Yes |
| Catalytic RNA | Ribozyme | - | An RNA molecule with a catalytic function | Yes | Yes |
| | Telomerase RNA | - | RNA portion of the enzyme/RNA complex that repairs chromosome telomeres (TERC: telomerase RNA component) | No | Yes |
The eukaryotic counterparts are identified as the 28S rRNA, 18S rRNA, and
5S rRNA, as well as a fourth ribosomal transcript, the 5.8S rRNA. These
molecules form the scaffolding of ribosomes, which become translationally
competent when decorated with myriad ribosomal proteins. At present there
are 55 known prokaryotic ribosomal proteins and 82 known eukaryotic
(mammalian) ribosomal proteins. Not all ribosomes are functional at any
given time, and the existence of a pool of transiently inactive ribosomes is
itself a regulator of gene expression. The superabundance of rRNA in a
purified RNA sample is often exploited both as an RNA mass loading control
(see Chapter 9: Quantitative PCR Techniques) and as internal electrophoresis
molecular weight markers (see Chapter 13: Electrophoresis of RNA).
tRNA is responsible for the transportation of amino acids to the ribosome
to support protein synthesis. tRNA molecules are small, ordinarily
ranging from 74 to 95 nts. When shuttling an amino acid covalently linked
to its 3′ end, a tRNA is said to be “charged”. Placement of the correct amino
acid into the nascent polypeptide depends on recognition of the mRNA
codon (a group of three nucleotides) within the coding region of mRNA by a
complementary trinucleotide motif carried on one arm of the tRNA known
as the anticodon. The tRNA anticodon base pairs to the mRNA codon within
the ribosome, thereby supporting protein elongation (for review, see Krebs
et al., 2012). While neither as large nor as abundant as rRNA, the smaller
tRNA species play a central role in translation.
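Codon-anticodon recognition as described above amounts to antiparallel complementarity between two trinucleotides. The sketch below assumes strict Watson-Crick pairing only; it ignores wobble pairing and the modified bases found in real tRNAs, and the function name `anticodon_for` is hypothetical.

```python
# Watson-Crick complements for RNA bases.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon):
    """Return the anticodon (written 5'->3') that pairs with an mRNA codon (5'->3').

    The two strands pair in antiparallel fashion, so the codon is read
    backwards while each base is complemented.
    """
    codon = codon.upper()
    if len(codon) != 3 or any(base not in RNA_COMPLEMENT for base in codon):
        raise ValueError("a codon must be three RNA bases (A, C, G, U)")
    return "".join(RNA_COMPLEMENT[base] for base in reversed(codon))


if __name__ == "__main__":
    print(anticodon_for("AUG"))  # anticodon for the AUG (start) codon
```

Writing both sequences 5′ to 3′ is a presentation choice; the reversal step is what encodes the antiparallel geometry of the pairing.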
mRNA is the most diverse of all the transcripts. Ironically, even though
mRNA is by far the least abundant of all transcript types, it is the mRNA
that drives the phenotype of the cell. mRNA alone directs the synthesis of
proteins through the use of the cellular translation apparatus. There is wide
variation in the number and abundance of RNA species in the cell; the
abundance of a specific type of RNA is subject to dramatic change as the demands
on the cell change. Some mRNAs are present in hundreds of copies per cell
while others are present in only a few copies per cell; this aspect of the RNA
profile of the cell can be problematic because very low abundance transcripts
are sometimes difficult to detect even with sensitive contemporary
techniques.
TRANSCRIPTION AND THE CENTRAL DOGMA
According to the central dogma (Crick, 1957) of molecular biology, the
expression of hereditary information flows from genomic sequences (DNA),
through an mRNA intermediate, to ultimate phenotypic manifestation in the
form of a functional polypeptide (Fig. 1.5). Whereas this design mirrors
what occurs naturally in both prokaryotic and eukaryotic cells, certain “viola-
tions” have been observed in nature: (1) accompanying the discovery of the
retroviral enzyme reverse transcriptase (RNA-dependent DNA polymerase)
(Baltimore, 1970; Temin and Mizutani, 1970), by which RNA may serve as
the template for the synthesis of DNA, and (2) the discovery of RNA editing
(Benne et al., 1986; reviewed by Nishikura, 2010), in which a transcribed
sequence is subject to alteration.
Transcription is that process by which a single-stranded RNA molecule
is synthesized at a specific chromosome locus; this is the first of several
steps in what is commonly referred to as RNA biogenesis. Transcription
occurs in the nucleus (and mitochondria and chloroplasts) of eukaryotic cells,
and in the common cellular compartment in prokaryotic cells. All phases of
transcription are subject to variation and are potential control points in the
regulation of gene expression. A transcriptional unit is best thought of as a
DNA sequence that manifests appropriate signals for the initiation and termi-
nation of transcription and is capable of supporting the synthesis of a pri-
mary RNA transcript. The process of transcription is so-named because the
transfer of information from DNA to RNA is in the same language, namely
the language of nucleic acids. In contrast, the process known as translation
is so-named because nucleic acid instructions in the form of mRNA are used
to direct the assembly of a primary polypeptide from amino acid precursors:
the nucleic acid instructions are executed in (translated to) the language of
proteins. The ribosome is the organelle of polypeptide synthesis in all cells,
and each ribosome independently directs the sequential linkage of amino
acids as the associated mRNA is interpreted. Upon completion of translation
FIGURE 1.5 The central dogma of molecular biology. (Diagram, directional flow
of genetic information: DNA to mRNA by transcription; mRNA to protein by
translation; protein to phenotype as the final manifestation; DNA replication,
2n→2n; reverse transcription of mRNA creates cDNA.) The process of
transcription produces mRNA while the process of translation produces protein.
Replication, the process by which DNA is duplicated, occurs during S phase in
the eukaryotic cell cycle. cDNA, in contrast, is not found in the cell but is
synthesized in vitro and is commonly used to measure transcriptional activity
or to assay for the presence of an RNA virus.
eukaryotic proteins are typically modified, sorted, packaged, and directed to
their proper subcellular location as they move through the endomembrane
system, of which the endoplasmic reticulum and the Golgi apparatus are key
components; prokaryotic and other eukaryotic proteins are often under the
influence of various small cytoplasmic RNA (scRNA) species that guide
them to their proper destination. As with RNA, and to a lesser extent DNA,
proteins exhibit a marked capacity for higher-order folding (Table 1.4). As
with RNA, the functionality of a protein molecule is associated with its
shape. Unlike RNA, however, in which the shape of the molecule is naturally
dynamic, the distortion of the tertiary (3°) or quaternary (4°) structure of
protein is associated with immediate loss of function.
PROMOTERS, TRANSCRIPTION FACTORS,
AND REGULATORY ELEMENTS
Transcription is mediated by enzymes known as RNA polymerases. These
enzymes, in conjunction with myriad proteins known as transcription factors,
TABLE 1.4 Higher-Order Folding of Nucleic Acids and Protein

| Level | RNA | Protein | DNA |
|---|---|---|---|
| 1° Structure | Nucleotide order | Amino acid order | Nucleotide order |
| 2° Structure | Stem-loop structures and hairpins, which may include mismatches | α helices and β pleated sheets | Antiparallel base pairing between two complementary DNA strands |
| 3° Structure | Three-dimensional folding | Three-dimensional folding | Three-dimensional folding. Double helix: A-DNA, B-DNA, or Z-DNA |
| 4° Structure | Interaction of two or more folded RNA molecules, often by association with RNA binding proteins | Aggregation of two or more subunits | Interaction of double helical DNA with proteins, such as histone proteins, as in chromatin formation |
| Catalytic variants | Ribozymes | Enzymes | DNAzymes |

The primary structure of nucleic acids and proteins is the order of monomers. The secondary
structure of a molecule is the first level of folding that occurs as a consequence of its primary
structure. The tertiary structure is the three-dimensional arrangement of atoms within the
molecule. The quaternary structure of a molecule, when it forms, is higher-order folding that results
from interaction of the molecule with one or more identical or nonidentical molecules.
For DNAzyme review, see Hollenstein, M. (2015). DNA catalysis: the chemical repertoire of
DNAzymes. Molecules 20, 20777–20804.
recognize very specific and highly conserved promoter, or initiation,
sequences within the enormous complexity of genomic DNA. Promoters are
spatially associated with the structural portion (body) of a gene (Fig. 1.6)
and consist of several recognizable upstream nucleotide sequence motifs.
These sequences are known as consensus sequences, a term used to describe
the most commonly observed pattern of nucleotides at a particular location.
For example, the symbol T80 A95 T45 A60 A50 T96 (each number is a subscript
giving the percent occurrence of that base) indicates that thymine is the
first base associated with this consensus motif 80% of the time, and so forth.
The exact sequence and precise geometry of these regulatory elements can
either promote or prevent the onset of transcription, and do so with varying
degrees of efficiency.
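The idea of a consensus sequence with per-position frequencies can be made concrete with a few lines of code. The following is a minimal sketch, assuming a set of equal-length, pre-aligned sites; the five example hexamers are invented purely for illustration and are not real promoter data, and the function names are hypothetical.

```python
from collections import Counter


def consensus_with_frequencies(aligned_sites):
    """Return [(base, percent), ...], one entry per aligned column.

    For each column, the most frequent base is reported together with the
    percentage of sites in which it occurs.
    """
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("all sites must have the same length")
    result = []
    for column in zip(*aligned_sites):
        base, count = Counter(column).most_common(1)[0]
        result.append((base, round(100 * count / len(aligned_sites))))
    return result


def format_consensus(result):
    """Render the consensus in the 'T80 A95 ...' style used in the text."""
    return " ".join(f"{base}{percent}" for base, percent in result)


if __name__ == "__main__":
    # Hypothetical aligned hexamers, for illustration only.
    sites = ["TATAAT", "TATGAT", "TACAAT", "GATAAT", "TATATT"]
    print(format_consensus(consensus_with_frequencies(sites)))
```

With real data the number of aligned sites would be far larger, and a tie between two equally frequent bases would need an explicit rule; this sketch simply reports whichever base the counter returns first.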
Any promoter component that is located 5′, or upstream, from the
transcription start site (TSS) is indicated with a “minus” sign in front of the
actual nucleotide distance from the TSS. By convention, the first transcribed
nucleotide is designated as +1, and any other nucleotides or features located
3′, or downstream, from the TSS are likewise designated with a “plus” sign
placed in front of the actual nucleotide distance. Knowledge of promoter
consensus sequence function is
due largely to experiments involving standard DNA cloning techniques, site-
directed mutagenesis, DNA sequencing, and in silico analysis.
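The numbering convention described above, with its signed distances and a first transcribed nucleotide at position one, is easy to get wrong by one when working with sequence indices. The following is a minimal sketch, assuming 0-based indices into the coding strand and the customary absence of a position zero; the function name `promoter_coordinate` is hypothetical.

```python
def promoter_coordinate(index, tss_index):
    """Convert a 0-based sequence index into a TSS-relative label.

    The first transcribed nucleotide is +1 and the nucleotide immediately
    upstream of it is -1; there is no position 0 in this convention.
    """
    offset = index - tss_index
    position = offset + 1 if offset >= 0 else offset
    return f"{position:+d}"


if __name__ == "__main__":
    tss = 100  # hypothetical index of the first transcribed nucleotide
    for idx in (70, 99, 100, 129):
        print(idx, promoter_coordinate(idx, tss))
```

The asymmetric adjustment (adding one only on the downstream side) is what removes position zero from the scale.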
In prokaryotic systems, the essential elements of the promoter region
include the so-called −10 hexamer sequence, formerly known as the Pribnow
box (or the Pribnow-Schaller box), consisting of the consensus sequence
T80 A95 T45 A60 A50 T96, and another conserved region located further upstream,
known as the −35 sequence (T82 T84 G78 A65 C54 A45). In some organisms, an
AT-rich domain (the UP element) is also observed further upstream. The spacing
between the −10 sequence and the −35 sequence is tightly constrained,
with 17 base pairs being optimal, and variations in the length of the region
between these two elements can reduce the efficiency of the promoter.
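The arrangement of the two hexamers and the spacer between them suggests a naive way to look for candidate bacterial promoters in a DNA sequence. The sketch below is illustrative only: the consensus hexamers TTGACA and TATAAT come from the text, but the permitted spacer range (15 to 19 bp), the mismatch tolerance, and all function names are assumptions made for this example. Real promoter prediction relies on far more elaborate models.

```python
MINUS_35 = "TTGACA"  # consensus of the upstream hexamer
MINUS_10 = "TATAAT"  # consensus of the downstream hexamer


def hamming(a, b):
    """Count mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_promoter_candidates(dna, spacer_range=(15, 19), max_mismatches=1):
    """Return (minus35_start, minus10_start, spacer_length) for each candidate.

    Indices are 0-based positions in the coding strand, read 5'->3'.
    The spacer range and mismatch tolerance are assumed values.
    """
    dna = dna.upper()
    n = len(MINUS_35)
    hits = []
    for i in range(len(dna) - n + 1):
        if hamming(dna[i:i + n], MINUS_35) > max_mismatches:
            continue
        for spacer in range(spacer_range[0], spacer_range[1] + 1):
            j = i + n + spacer
            box = dna[j:j + len(MINUS_10)]
            if len(box) == len(MINUS_10) and hamming(box, MINUS_10) <= max_mismatches:
                hits.append((i, j, spacer))
    return hits


if __name__ == "__main__":
    # Synthetic sequence with a 17 bp spacer, for illustration only.
    seq = "GG" + MINUS_35 + "C" * 17 + MINUS_10 + "GG"
    print(find_promoter_candidates(seq))
```

A scan of this kind only reports sequence similarity; it says nothing about whether a candidate is actually used as a promoter in the cell.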
In eukaryotic cells, promoters associated with nuclear genes are variable
in structure; these variations are due to the presence of multiple nuclear
FIGURE 1.6 Genes, some of which encode mRNA which, in turn, encode proteins, are under
the direct influence of a regulatory element known as a promoter. (Diagram: promoter
and structural portion of the gene; transcription yields the mRNA transcript;
translation yields the folded, functional protein; cell function; phenotype.)
RNA polymerases as well as the requisite transcription factor initiation com-
plex that must form. Transcription factors are small proteins that are continu-
ally binding to and altering the shape of the chromatin. The remodeling of
chromatin in the promoter locale is characterized by changes in the associa-
tion between genomic DNA and the histone proteins which decorate it. Best
thought of as a type of histone displacement, the objective is to facilitate
access to the gene promoter by altering the local architecture of the chroma-
tin. This is an ATP-dependent process. Transient covalent modifications to
histone proteins include acetylation, methylation, and phosphorylation.
Generally speaking, histone acetylation is associated with the activation of
transcription, while methylation commonly correlates with gene silencing.
The net result is the activation, or silencing, of various subsets of genes in a
temporal or environmentally induced manner.
Interestingly, promoters recognized by RNA polymerase II, the enzyme
responsible for the synthesis of mRNA (discussed below), often display simi-
lar sequence homology with prokaryotic gene promoters (Fig. 1.7). The
eukaryotic promoter counterpart is known as the “TATA box,” formerly
known as the Hogness box, and so-named because of the prevalence of the
highly conserved TATAA motif. Point mutations involving any of these five
bases strongly downregulate the function of that promoter. Another promoter
component, the transcription initiation factor TFIIB-recognition element
(BRE), is directly adjacent to and upstream from the TATA box. The func-
tion of this heptanucleotide motif (often, GGGCGCC) is to attract TFIIB, a
key element in the assembly of the transcription apparatus associated with
RNA polymerase II. While at one time it was thought that all eukaryotic
promoters manifest a TATA box, this is now known to be untrue. Instead, these
rather prevalent TATA-less promoters are typically characterized by an
initiator region (INR) and a downstream promoter element (DPE), which is
observed approximately 30 base pairs downstream (+30) from the TSS. The
motif ten element (MTE) exclusively maps to +18 through +27 and is
located downstream from INR and immediately upstream from the DPE. At
least one function associated with the MTE is its ability to act in place of an
absent TATA box. In addition to the TATA, DPE, and MTE promoter struc-
tural components, another promoter motif is the “CAAT box,” found in sev-
eral but not all promoters, and so-named because of the conservation of its
sequence. When present in eukaryotic promoters, the TATA box is usually
FIGURE 1.7 Generalized structure of a eukaryotic gene promoter. See text for details.
(Diagram labels: G-box (GGGCGG)n, CAAT box, BRE, TATA box, initiator at the +1
transcription start site (TSS), MTE, DPE; marked positions −110, −75, −35, −30,
+22, +30; upstream to the 5′ side, downstream to the 3′ side; nascent transcript
(pre-mRNA).)
centered at −30 and the CAAT box appears around −75, though the CAAT
box has been shown to function quite effectively much further upstream, and
even in reverse orientation. These elements appear to control initial binding
of the RNA polymerase and promoter efficiency, respectively. Another
frequently observed promoter element is the sequence (GGGCGG)n, known as
the G-box element or simply as the GC box. Present in one or more copies,
this GC-rich region is generally observed between −90 and −120 within the
promoter region. Interestingly, it appears that there is no one component or
organization that is shared by all promoters, though the particular permuta-
tion of promoter elements and distances between them is recognizable as a
transcription initiation regulator. Succinctly, by comparison with transcrip-
tion in prokaryotic cells, the elaborate initiation of eukaryotic transcription
requires the presence of numerous transcription factors, coactivators, and
transcription activator proteins that bind to these cis-acting components
which, collectively, make up a promoter. The widely accepted role of early
transcription factor binding to gene promoters is to recruit RNA polymerase
to that site so as to ultimately initiate transcription. Rather than being
thought of as merely an on-off switch associated with a particular gene, a
promoter functions more like a thermostat that increases (upregulates) and
decreases (downregulates) the expression of a gene in response to the pre-
vailing local conditions acting upon a cell.
Eukaryotic promoters do not always function alone. Transcription in
eukaryotic cells can be influenced profoundly by the presence of a regulatory
element known as an enhancer, the function of which appears to be the stim-
ulation of transcription. First discovered in the early 1980s, the precise loca-
tion and orientation of an enhancer relative to the gene promoter varies from
one gene to the next. Some genes, including those which encode immunoglo-
bulins, carry enhancers within the structural portion of the gene itself.
Removal of enhancer sequences can reduce the transcriptional efficiency at a
locus normally under the influence of that enhancer sequence, as can the
binding of repressor proteins to functionally disparate DNA sequences
known as silencers.
In vitro transcription of genes that are not naturally associated with an
enhancer element can be increased significantly if an enhancer is ligated to
the DNA construct, usually in no particular orientation, and often hundreds,
if not thousands of base pairs away from the TSS. In vivo, a translocation
event that brings a promoter and a gene into proximity can result in inappro-
priate expression of the gene, often with potentially catastrophic conse-
quences, as in the case of Burkitt’s lymphoma (Taub et al., 1982). The
transcriptional influence of upstream and downstream enhancer sequences,
and antagonistic silencer sequences, on gene promoters is well documented.
Many enhancers, but not all, have been shown to be transcriptionally
active (Djebali et al., 2012; Andersson et al., 2014), producing enhancer
RNAs, or simply eRNAs. At present, the number of known eRNAs in human
cells is in the tens of thousands and their transcription points to enhancer
functionality in terms of promoting expression of the cognate gene (reviewed
by Li et al., 2016). These noncoding transcripts are believed to recruit com-
ponents of the transcription initiation complex; it is possible that RNA poly-
merase II may track to a transcription promoter by first identifying the
enhancer itself. It is also possible that transcription of the intergenic area
between the enhancer and the promoter may have a role in chromatin acety-
lation and ensuing remodeling in order to facilitate transcription initiation at
the promoter (Gribnau et al., 2000). The sequential binding of transcription
factors and ancillary components in the immediate vicinity of the gene locus
ultimately results in the formation of a loop and concomitant spatial juxtapo-
sition of the components of the template DNA needed to support the initia-
tion of transcription. Succinctly, enhancers perform their function by
increasing the concentration of transcription activator proteins in the vicinity
of the associated promoter.
During transcription, the two strands of the gene being transcribed have
different names and different roles. The strand that actually serves as the
template upon which RNA is polymerized is properly referred to as the template
strand. The other strand, which does not act in a template capacity, is called
the coding strand. The coding strand is also known in some circles as the
sense strand, while the template strand may be referred to as the antisense
strand. The choice of nomenclature is purely a matter of personal preference.
When publishing a gene sequence, the convention is to report the sequence
of the coding strand, written 5′ to 3′, from left to right. The implication is
that the template strand is base-paired to the coding strand and lying antipar-
allel to it and therefore does not need to be reported. The DNA template
strand is so-named because the precise sequence of nucleotides inserted into
the nascent RNA transcript is determined by, and complementary to, the
template strand nucleotide sequence. It is important to realize that the coding
strand and the template strand may switch roles depending upon the place-
ment of transcriptional promoter sequences (Fig. 1.8). One powerful example
of this phenomenon in vitro is the cloning of a double-stranded DNA
between two different transcription promoters in opposite orientations; often
the bacteriophage polymerase promoters SP6, T3, or T7 are selected because
of their high efficiency. Constructions such as these are frequently employed
FIGURE 1.8 Promoters positioned in the opposite orientation relative to a DNA sequence
allow the template and coding strands to switch roles during transcription. This
arrangement permits the synthesis of (+) RNA and (−) RNA from the same DNA construct.
(Diagram: an SP6 promoter and a T7 promoter flank the insert; each strand is labeled
as the template strand for one promoter and the coding strand for the other.)
to accommodate in vitro transcription of large amounts of sense and/or anti-
sense RNA for use as nucleic acid probes (see Chapter 16: Nucleic Acid
Probe Technology) or for RNAi applications (see Chapter 11: RNA
Interference and RNA Editing).
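The relationship between the coding strand, the template strand, and the resulting transcript can be worked through in code. The following is a minimal sketch, assuming that all sequences are written 5′ to 3′ and that a construct carries promoters on both flanks; the function names are hypothetical, and the short example sequence is invented.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the antiparallel complementary DNA strand, written 5'->3'."""
    return dna.upper().translate(DNA_COMPLEMENT)[::-1]


def transcribe_from_template(template_5to3):
    """Return the RNA (5'->3') polymerized on the given template strand.

    The transcript is complementary and antiparallel to the template, so it
    matches the opposite (coding) strand with U in place of T.
    """
    return reverse_complement(template_5to3).replace("T", "U")


def transcripts_from_flanking_promoters(coding_5to3):
    """Return (sense_rna, antisense_rna) for an insert between two promoters.

    One promoter uses the bottom strand as template (sense RNA); the opposing
    promoter uses the top strand as template (antisense RNA).
    """
    template = reverse_complement(coding_5to3)
    sense = transcribe_from_template(template)
    antisense = transcribe_from_template(coding_5to3)
    return sense, antisense


if __name__ == "__main__":
    print(transcripts_from_flanking_promoters("ATGGCC"))
```

Note that the sense transcript reproduces the reported coding-strand sequence, which is why publishing the coding strand alone is sufficient.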
GENE AND GENOME ORGANIZATION AFFECT
TRANSCRIPTION
In order to understand the significance of the products of transcription, it is
first essential to understand the organization of the genes themselves. The
typical prokaryotic genome exhibits little extraneous baggage. Frequently,
genes that encode proteins associated with a common metabolic pathway are
clustered together, as suggested by the operon model (Jacob and Monod,
1961). The lac operon, the gene products of which facilitate the metabolism
of lactose as a carbon source in bacteria, is but one extremely well-
characterized example. The RNA molecule that results from the transcription
of an operon is usually polycistronic, meaning that more than one polypep-
tide is encoded in a single RNA transcript.
5′ - [Protein 1] - Intercistronic region 1 - [Protein 2] - Intercistronic region 2 - [Protein 3] - 3′
The coding information within a polycistronic mRNA for each polypep-
tide is contiguous: there are no interruptions in the coding sequences by non-
coding information. This design favors maximum efficiency of energy
resource utilization in unicellular organisms.
In fact, the kinetics of prokaryotic gene expression are so rapid that bac-
terial mRNA is usually being transcribed, undergoing translation, and being
degraded simultaneously. The rapid turnover of RNA in this manner has, in
the past, frustrated valiant attempts to clone or otherwise characterize pro-
karyotic mRNA. While significant improvements favoring the isolation of
high quality RNA from both Gram-negative and Gram-positive bacteria have
been made, and many of these innovations are available in kit form, the iso-
lation of intact prokaryotic RNA remains something of a challenge in many
laboratories.
In contrast to prokaryotes, nearly all eukaryotic mRNAs are monocistro-
nic. Although a single-polypeptide species results from the translation of a
particular monocistronic eukaryotic mRNA molecule, that same mRNA is
subject to repeated translation as long as the transcript remains biologically
competent and chemically stable. To maximize translation potential, an
mRNA transcript is often engaged by several ribosomes that are all involved
in simultaneous, orderly translation of that transcript. Such a cluster of ribo-
somes attached to a single mRNA molecule is known as a polysome (or
polyribosome). Polysomes are observed both in prokaryotic and eukaryotic
cells, though eukaryotic polysomes, with 7–8 ribosomes per polysome
complex, tend to be smaller than their prokaryotic counterparts. Succinctly, a
large number of polypeptide molecules can be manufactured from a single
RNA molecule. Polysomes are observed free-floating in the cytoplasm, they
can be membrane-bound, and sometimes are attached to the cytoskeleton
(Lenk et al., 1977; Davies et al., 1991). The entirety of mRNAs so engaged
in a cell at any given moment is known as the polysome fraction, which can
be used to assess the translational competence of a cell under a defined set
of experimental conditions.
Close examination of eukaryotic genes reveals that for a vast majority of
genes there are considerably more nucleotides within a particular locus than
are necessary to direct the synthesis of the corresponding polypeptide, that
is, the DNA sequence and the amino acid sequence are not colinear over the
span of the locus. This size differential can also be observed at the level of
the mature mRNA in the cytoplasm, which is usually quite a bit shorter than
the DNA sequence from whence it was transcribed. Upon further scrutiny,
this discrepancy can be resolved at the level of the organization of the struc-
tural portion of the gene itself, the sequences within which fall into one of
two categories:
1. Exons are regions of DNA that are represented in the corresponding
mature mRNA. Exons may or may not have a peptide coding function.
2. Introns are regions of DNA that are transcribed but generally are not
represented in the corresponding mRNA. Introns are usually spliced out
of the primary RNA transcript (the immediate product of transcription),
accompanied by the joining of adjacent exon sequences. The majority of
introns do not direct polypeptide synthesis, though there are several note-
worthy exceptions (for review, see Farrell and Bassett, 2007).
5′ - [Exon 1] - Intron 1 - [Exon 2] - Intron 2 - [Exon 3] - 3′
The number and the length of exons and introns associated with a gene
are highly variable depending upon locus and this variability even pertains to
loci that are highly conserved across evolutionary time. By comparison with
introns which can be several thousand base pairs in length, exons tend to be
rather short, each encoding fewer than 100 amino acids in most organisms.
In some cases the high sequence conservation in one or more exons of a
gene has been directly responsible for the isolation of a related gene (an
ortholog) from a different organism. In some unusual cases, genes lack
introns altogether, of which human β-interferon and thrombomodulin are
examples.
The base sequence of the primary RNA transcript correlates precisely
with the DNA from which it is derived, meaning that it contains both exon
and intron sequences. These primary transcription products are only
precursors to functional mRNA, and are confined to the eukaryotic nucleus where,
appropriately, they are collectively known as heterogeneous nuclear RNA
(hnRNA) or simply pre-mRNA. hnRNA and specific nuclear proteins that
bind to it form rather abundant heterogeneous nuclear ribonucleoprotein
complexes (hnRNPs). Similarly, mRNAs exist in the cytoplasm, following
intron removal, as messenger ribonucleoprotein (mRNP) complexes after
having traversed a nuclear membrane channel. In order to promote unidirec-
tional movement, the combination of proteins associated with the mRNP is
changed immediately upon arrival in the cytoplasm. The ensuing remodeling
ensures that the mRNP, and the mRNA that it carries, is unable to travel
back into the nucleus.
Introns vary dramatically in number, length, and base sequence and often
exhibit multiple translation termination (stop) codons in all reading frames.
This is not entirely unexpected because the noncoding nature of introns
favors the accumulation of mutations that might otherwise be lethal if they
were to occur within an exon or other critical area. Examination of the splice
junctions of introns, however, reveals a strict conservation of two dinucleo-
tide consensus sequences (Breathnach and Chambon, 1981; Mount, 1982)
contained entirely within the intron. Proceeding from the 5′ end of the RNA,
introns are found to begin with a GU dinucleotide (known as the left or
donor site) and end with an AG dinucleotide (known as the right or acceptor
site). The so-called GU-AG rule, describing exon-intron splice sites, refers
to the RNA sequence; the corresponding DNA coding strand dinucleotide at
the 5′ end (beginning) of an intron is GT. Although the rule was once believed
to hold 100% of the time in higher eukaryotes, some exceptions have been noted
(Szafranski et al., 2007) and the consensus phenomenon does not apply to
yeast mitochondrial or tRNA genes, nor to chloroplast loci (Krebs et al.,
2012).
5′ - Exon(n) - [GU ... Intron ... AG] - Exon(n+1) - 3′
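The GU-AG rule and the joining of adjacent exons can be illustrated with a toy splicing function. This is a minimal sketch, assuming that intron coordinates are already known and supplied as 0-based, half-open intervals; the function name `splice` and the short sequences are invented for illustration, and real splice-site recognition depends on additional sequence context and on the splicing machinery itself.

```python
def splice(pre_mrna, introns):
    """Remove introns from a primary transcript and join the flanking exons.

    pre_mrna: RNA sequence written 5'->3'.
    introns:  list of (start, end) 0-based half-open intervals, assumed not
              to overlap. Each intron must begin with GU and end with AG.
    """
    pre_mrna = pre_mrna.upper()
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron at {start}-{end} violates the GU-AG rule")
        exons.append(pre_mrna[cursor:start])  # exon preceding this intron
        cursor = end
    exons.append(pre_mrna[cursor:])  # final exon
    return "".join(exons)


if __name__ == "__main__":
    exon1, intron1 = "AUGGCC", "GUAAGUCCUAG"
    exon2, intron2 = "UUUGAA", "GUCCCAG"
    exon3 = "UAA"
    transcript = exon1 + intron1 + exon2 + intron2 + exon3
    first = (len(exon1), len(exon1) + len(intron1))
    second_start = first[1] + len(exon2)
    second = (second_start, second_start + len(intron2))
    print(splice(transcript, [first, second]))
```

Raising an error for an intron that lacks the GU and AG boundaries is a deliberate simplification; as noted in the text, exceptions to the rule do exist.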
The nucleotides immediately adjacent to both sides of the GU and AG
intron boundaries are also conserved to an extent, typically 60%–80%, and a
point mutation at a splice site generally results in the inactivation of that
site. In some cases, splice-site mutations can result in the production of an
aberrant mRNA through the use of an alternative splice site, often located
within the intron (Triesman et al., 1982), as in certain β-thalassemic
individuals. Knowledge of the high conservation of splice sites is the basis of a
method for exon identification known as exon trapping (Duyk et al., 1990;
Péterfy et al., 2000) in which a putative exon-containing sequence is cloned
into a specialized vector that consists of an intron flanked by two known
exons (exon-intron-exon); if an exon is present in the experimental DNA,
it will be trapped by ligation into the vector intron and will result in a longer
transcript that can be detected electrophoretically or by melting curve
analysis. The exon trapping method has fallen out of favor due to the low
cost and ready availability of cDNA sequencing and an extensive repertoire
of tools for in silico analysis.
The mechanics of intron removal and exon ligation, which occur
cotranscriptionally, i.e., while the RNA polymerase is still active, are mediated in
part by a highly conserved family of small nuclear RNAs (snRNA; 100–300
bases). These molecules exist as RNA-protein complexes, known as U1,
U2, U4, U5, and U6, and are confined to the nucleus where they are referred
to as small nuclear ribonucleoproteins (snRNPs, or snurps). The snRNPs,
along with many other proteins that act as splicing factors, form enormous
complexes known as spliceosomes, which are known to mediate pre-mRNA
splicing.
[Figure: hnRNA, the unspliced precursor of mRNA (5′ Exon 1, Intron 1, Exon 2, Intron 2, Exon 3 3′), yields mRNA after splicing (5′ Exon 1, Exon 2, Exon 3 3′); the intervening step is labeled spliceosome formation.]
Closely associated with the capacity for transcript splicing are Cajal
bodies (CBs), small organelle-like, punctate structures in the nucleoplasm that
were first observed more than a century ago (Cajal, 1903). Unlike organelles,
these structures are not membrane bound; they are spherical in appearance,
with a typical diameter of 0.5–1.0 μm. CBs are characterized by a high
concentration of the protein coilin and are packed with RNA. They are dynamic
in that they are visible at certain times and not others, which may be related
to differentiation, development, and even progression through the cell cycle.
Found in higher eukaryotic plant and animal cells, CBs represent transcrip-
tionally active regions, particularly the histone loci, and are also linked to
ribosome biogenesis and telomere upkeep. However, one of the best known
functions of CBs is their factory-like role in the assembly of snRNPs associ-
ated with mRNA splicing.
Intron removal has also been demonstrated to have a role in nuclear
export of the spliced and matured mRNA. The proteins involved in the
splicing mechanism and exon concatenation recruit additional proteins that are
specifically required for nuclear egress. Among the proteins in the resulting
exon junction complex (EJC) are the ALY/REF export adapter, which binds
directly to the RNA, and the TAP-p15 export receptor complex, each of
which has a direct role in nuclear pore engagement (reviewed by Grünwald
et al., 2011). Once on the cytoplasmic side of a nuclear pore, the mRNA
sheds the array of proteins that facilitated its nuclear egress, which ensures
unidirectional movement. Improperly spliced or otherwise compromised
mRNAs fail to associate with the correct combination of proteins required
for nucleocytoplasmic transport, thereby promoting their retention in the
nucleus and rapid degradation. Although intron removal and the splicing
together of exons are not, in and of themselves, required for transport from
the nucleus, since intron-less transcripts move efficiently into the cytoplasm,
splicing clearly enhances transport. The export process used for rRNA and
other RNAs is less clear.
Splicing of pre-mRNA molecules also produces some unexpected results.
Once believed to be a rare consequence of a spliceosomal machinery error,
circular RNAs (circRNAs) are a well-documented consequence of a phenomenon
known as backsplicing, a process by which the 3′ end of exon “n” is joined
covalently to the 5′ end of the same exon, i.e., exon “n,” rather than to the 5′
end of exon “n + 1”. circRNAs consist of one or two exons, and the number of
known eukaryotic circRNAs is in the thousands. Nearly all circRNAs are
exon-encoded sequences; intronic circRNA sequences are almost completely unknown.
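The distinction between conventional splicing and backsplicing can be sketched in a few lines of Python. In this illustration, exons are identified by their order in the gene (a hypothetical 1-based numbering), and a junction is described by the exon that supplies the 3′ end (the donor) and the exon that supplies the 5′ end (the acceptor). The function name and the labels it returns are assumptions made for the example, not established nomenclature.

```python
def classify_junction(donor_exon: int, acceptor_exon: int) -> str:
    """Classify a splice junction by the genomic order of the exons it joins.

    donor_exon    -- exon whose 3' end is joined (1-based position in the gene)
    acceptor_exon -- exon whose 5' end is joined (1-based position in the gene)
    """
    if donor_exon < 1 or acceptor_exon < 1:
        raise ValueError("exon numbers are 1-based")
    # Linear splicing: the 3' end of exon n is joined to the 5' end of a
    # downstream exon (exon n + 1 in the simplest case).
    if acceptor_exon > donor_exon:
        return "linear"
    # Backsplicing: the 3' end is joined to the 5' end of the same exon
    # or of an upstream exon, closing the RNA into a circle.
    return "backsplice"


print(classify_junction(2, 3))  # linear: exon 2 joined to exon 3
print(classify_junction(2, 2))  # backsplice: single-exon circle
print(classify_junction(3, 2))  # backsplice: two-exon circle (exons 2 and 3)
```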
The highest incidence of circRNAs is found in the mammalian brain,
with the greatest density of these molecules observed in the synaptic region
of nervous tissue cells (Rybak-Wolf et al., 2015). circRNAs are believed to
play a role in neuronal differentiation, as their expression is upregulated
during brain development (You et al., 2015). Surprisingly, the number of
circRNAs from some genetic loci exceeds the linear counterpart by a factor
of as much as 10 (Salzman et al., 2012). There is
also speculation that circRNAs may absorb miRNAs (described below and in
Chapter 10: miRNA), sequestering them as a means of controlling the
expression of specific genes associated with a metabolic or developmental
pathway. Similarly, the formation of circRNAs may regulate the concentration
of certain types of the more than 1000 known RNA binding proteins by
transiently attracting them.
Since these molecules are the result of splicing events intended to bring
exons together, it is reasonable to assume that many gene loci are capable of
producing an even greater array of processed transcripts, further diversifying
the transcriptome. It is clear, however, that certain exons are greatly favored
in the formation of circRNA transcripts; this is probably controlled by the
presence of repetitive sequences residing in the flanking introns and
trans-acting factors associated with splicing that collaborate to control transcript
circularization (Kramer et al., 2015). circRNAs are nonpolyadenylated and
localized primarily in the cytoplasm; the lack of a poly(A) tail therefore
excludes their identification, characterization, and abundance measurement
by poly(A)-selected RNA-seq. The nonlinear shape of circRNA imparts great
stability to these molecules. The extended half-lives of these molecules
presumably allow them to perform their intended function(s), whatever they may be,
for as long as possible. Even though circRNAs are derived from genes that
encode proteins, circRNAs are a type of noncoding RNA. This is in sharp
contrast, structurally and functionally, to the transient circular shape
ordinarily assumed by mRNAs in order to stabilize them and enhance translation.
Finally, the excision of certain introns in RNA can also occur as the
result of RNA self-cleavage. Catalytic RNAs, generically referred to as ribo-
zymes, were first described by Cech et al. (1981). In particular, the group I
intron ribozyme and RNase P ribozyme are well known because they were
the first two ribozymes to be discovered (Kruger et al., 1982; Guerrier-Takada
and Altman, 1984). These discoveries led to the awarding of the Nobel Prize
in Chemistry in 1989 to Thomas Cech and Sidney Altman. Extensive infor-
mation about this fascinating aspect of RNA biology can be found elsewhere
(for reviews, see Ferré-D’Amaré and Scott, 2010).
RNA POLYMERASES AND THE PRODUCTS
OF TRANSCRIPTION
Genes are transcribed by enzymes known as RNA polymerases which pro-
duce different types of RNA in the cell, including rRNA, tRNA, and mRNA,
as well as a variety of very important noncoding (large and small) RNA spe-
cies. In the nucleus, eukaryotic genes are transcribed by one of three major
RNA polymerases; these enzymes are among the largest and most complex
proteins in the cell and consist of more subunits than their prokaryotic
counterpart. The eukaryotic enzymes are properly known as RNA polymerases I,
II, and III, each of which is responsible for transcribing a different class of
genes (Table 1.5). RNA polymerases are active only in the presence of DNA,
require nucleotide precursors (ATP, CTP, GTP, and UTP) and myriad
transcription factors, and function in a Mg²⁺-dependent manner. In addition,
eukaryotic RNA polymerases differ in the sensitivity each exhibits to the bicyclic
octapeptide fungal toxin α-amanitin (extracted from the poisonous mushroom
Amanita phalloides) (Roeder, 1976; Marzluff and Huang, 1984), applications
for which are described in Chapter 19, Analysis of Nuclear RNA. Prokaryotes,
in contrast, exhibit only one type of RNA polymerase, which transcribes all
classes of RNA.
In plants, RNA polymerase IV (RNAP IV) and RNA polymerase V
(RNAP V) have also been identified. These enzymes are homologs of the
well-characterized, mRNA-producing enzyme RNA polymerase II. RNAP IV
and RNAP V have collaborative roles in the synthesis and activation of
miRNA species involved in silencing pathways via RNA-directed DNA
methylation (RdDM) (Onodera et al., 2005; Herr et al., 2005; Kanno et al.,
2005; Zhang et al., 2007; reviewed by Haag and Pikaard, 2011; Huang et al.,
2015). Although products of RNAP IV and RNAP V have direct influence
on numerous biochemical processes, including growth and development and
defense mechanisms against viruses, these two enzymes are not essential for
cellular viability (Onodera et al., 2005; Pontier et al., 2005).
In mammals, a fourth RNA polymerase [single-polypeptide nuclear RNA
polymerase IV (spRNAP-IV)] has likewise been described (Kravchenko
et al., 2005). This polypeptide is a nuclear isoform of an RNA polymerase
localized in the mitochondria and was first observed in HeLa cells. spRNAP-IV
and the related polypeptide mitochondria-targeting RNA polymerase
(mtRNAP) are both transcribed from the same gene locus, POLRMT. As a
consequence of alternative splicing, however, spRNAP-IV is a truncated
polypeptide which, compared to mtRNAP, lacks 262 amino acids near the
amino terminus, including the mitochondrial transit sequence. Thus,
spRNAP-IV remains in the nucleus and regulates a subset of perhaps as
many as one thousand nuclear-encoded genes, though the level of
cooperation between RNA polymerase II and spRNAP-IV in gene transcription is
unclear. Suppression of the spRNAP-IV gene seriously inhibits cell growth,
gradually leading the cell down the apoptotic pathway.
TABLE 1.5 Eukaryotic RNA Polymerase Enzymes and Their Respective Products and Sensitivities to α-Amanitin
Eukaryotic Enzyme | Site of Action | Products | Sensitivity to α-Amanitin
RNA Polymerase I | Nucleolus | rRNA (28S, 18S, 5.8S) (a) | –
RNA Polymerase II | Nucleoplasm | hnRNA → mRNA; lincRNA, miRNA (b), snRNA, scRNA (c) | ++++
RNA Polymerase III | Nucleoplasm | tRNA, 5S rRNA, snRNA, snoRNA, scRNA, miRNA | +
RNA Polymerase IV: Pol IV (plants) | Nucleoplasm | miRNA involved in heterochromatin methylation and gene silencing | –
RNA Polymerase IV: spRNAP-IV (mammalian) | Mitochondria | mRNA (subset) | –
RNA Polymerase V (plants) | Nucleoplasm | miRNA involved in heterochromatin methylation and gene silencing | –
RdRPs (RNA-dependent RNA polymerases) | Cellular | Amplification of miRNA for gene silencing; viral replication | –
cp RNA polymerase (PEP) | Chloroplasts (d) | Chloroplast gene transcripts | –
mt RNA polymerase (mtRNAP) | Mitochondria, but encoded in the nucleus | Mitochondrial gene transcripts | –
(a) RNA polymerase I is considered the most specialized of the three canonical nuclear polymerases because the members of this highly repetitive gene family that it transcribes are virtually identical, as are the resulting transcripts.
(b) While most known miRNAs are transcribed by RNA polymerase II, an increasing number of these important regulatory transcripts has been shown to be transcribed by RNA polymerase III.
(c) Certain scRNAs and snRNAs are transcribed by RNA polymerase II while others are transcribed by RNA polymerase III.
(d) Chloroplast genes are also transcribed by nuclear-encoded RNA polymerases.
As with all nucleic acid molecules, RNA transcripts are assembled only
in the 5′→3′ direction. Transcription involves three distinct phases, namely,
initiation, elongation, and termination, the details of which are beyond the
scope of this volume. Briefly, initiation involves the attachment of RNA
polymerase to a DNA template promoter by association with transcription,
activation, and initiation factors, followed by the acquisition of what will be
the first ribonucleotide in the RNA molecule. Elongation involves the
sequential addition of ribonucleotides to the nascent chain, a process also
involving accessory elongation factors. Termination is the completion of
RNA synthesis and the disengagement of both the newly synthesized RNA
and the RNA polymerase which manufactured it from the DNA template.
Transcription termination, as with initiation and elongation, is
sequence-dependent and, at least in prokaryotes, is influenced by the presence of small
proteins (termination factors) and often involves the transient formation of
RNA secondary (hairpin) structures; different transcription termination
strategies are used by each of the eukaryotic RNA polymerases. Transcription
errors notwithstanding, the nucleotide sequence of the resulting RNA molecule is
identical to the coding strand of the DNA from which it is derived, the only
difference being the substitution of the base uracil for thymine.
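The relationship between the two DNA strands and the transcript can be made concrete with a short Python sketch. The function names and the example sequence are hypothetical, and the sketch models only the sequence relationship (coding strand with U in place of T; template strand read as its reverse complement), not the enzymology of transcription.

```python
# Translation table for complementing DNA bases.
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def rna_from_coding_strand(coding_dna: str) -> str:
    """The RNA matches the coding strand, with uracil substituted for thymine."""
    return coding_dna.upper().replace("T", "U")


def rna_from_template_strand(template_dna: str) -> str:
    """Derive the RNA from the template strand, given in 5'->3' orientation.

    The RNA is assembled 5'->3' while the template is read 3'->5', so the
    transcript is the reverse complement of the template, with U for T.
    """
    coding = template_dna.upper().translate(_DNA_COMPLEMENT)[::-1]
    return rna_from_coding_strand(coding)


coding = "ATGGCCTAA"    # hypothetical coding strand, 5'->3'
template = "TTAGGCCAT"  # its reverse complement (template strand, 5'->3')
print(rna_from_coding_strand(coding))      # AUGGCCUAA
print(rna_from_template_strand(template))  # AUGGCCUAA
```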
In eukaryotic cells, transcription of the genes encoding ribosomal RNA is
mediated by RNA polymerase I. The primary product of RNA polymerase I
transcription in higher eukaryotes is the very large unspliced 47S precursor
rRNA which, at approximately 14,000 bases in humans, is the largest known
precursor RNA in mammals. The 47S rRNA eventually yields the smaller
28S, 18S, and 5.8S rRNAs after processing (Fig. 1.9; reviewed by Henras
et al., 2015). Between 80% and 85% of cellular RNA is found in ribosomes
in the form of the 28S, 18S, 5.8S, and 5S rRNAs; these transcripts form
complexes with myriad ribosome-specific proteins to form the 60S and 40S
eukaryotic ribosomal subunits. As described in Chapter 13,
Electrophoresis of RNA, the abundant 28S and 18S rRNAs serve as
excellent natural size markers when either total cellular
RNA or total cytoplasmic RNA is electrophoresed. A comparison of
eukaryotic ribosomes and other aspects of transcription and translation with their
prokaryotic counterparts is presented in Table 1.6. RNA polymerase III is
responsible for transcribing the genes that encode tRNA molecules, the 5S
rRNA, certain repetitive elements, snRNA, snoRNA, scRNA, and some
miRNA transcripts described above. Approximately 10%–15% of the total
cellular RNA mass is tRNA. Thus, the vast majority of RNA in the cell
represents the transcriptional products of RNA polymerase I and RNA poly-
merase III; rRNA and tRNA are rather undiversified and generally not sub-
ject to significant alteration of their expression profile as a function of the
cell state.
In sharp contrast to RNA polymerases I and III, the transcription products
of RNA polymerase II are as diverse as the cellular biochemistry itself. This is
not at all unexpected because the mRNA in great measure drives the
phenotype of the cell. Of the 1–5 × 10⁻⁵ μg RNA harbored in a typical mammalian
cell, the relative mass contribution of RNA polymerase II transcription
products is generally between 20% and 40%, though only 1%–4% is
mature mRNA because a great deal of all transcribed hnRNA is degraded in
the nucleus, strongly suggesting a rationale for studying posttranscriptional
regulation of gene expression. Interestingly, rRNA was first believed to have a
template role in the synthesis of proteins. The first indications that a new, sep-
arate class of RNA, mRNA, acts as the template molecule emerged from stud-
ies involving T4 phage-infected Escherichia coli cells (Volkin and Astrachan,
1956; Brenner et al., 1961; Gros et al., 1961; Hall and Spiegelman, 1961).
Later, multiple 5′ ends were observed among late SV40 mRNAs (Ghosh et al.,
1978) and late polyoma mRNAs (Flavell et al., 1979), the first indications of
the heterogeneous nature of RNA polymerase II transcription initiation. The
amount of mRNA may vary depending on a number of factors, including
degree of differentiation.
FIGURE 1.9 One pathway of rRNA biogenesis in human cells. The 28S, 18S, and 5.8S rRNA
are liberated from a common 47S transcript that is the product of RNA polymerase I. The other
essential transcript, 5S rRNA, is produced independently by RNA polymerase III. Adapted, in
part, from Lewin, Genes VI (1997), by permission of Oxford University Press.
[Figure labels: 47S pre-rRNA; 45S, 41S, 32S, and 21S intermediates; 18S, 5.8S, and 28S rRNA; 5S rRNA; ribosomal proteins; functional ribosome.]
Genes whose RNA polymerase II transcripts yield functional mRNA after
processing are localized mostly within the nonrepetitive regions of
the genome, as demonstrated by R₀t kinetics studies. A typical cell is
transcribing a subset of several thousand different genes at any given moment
depending on cell type and cell state. Given the complexity of the cellular
biochemistry, this observed heterogeneity within the mRNA population is
necessary to satisfy even basal-level requirements for viability, though it is
remarkable to note that as much as 70% of all transcribed hnRNA is
degraded in the nucleus. RNA polymerase II also has major responsibility
for transcription of genes encoding miRNA.
Another category of RNA polymerase, RNA-dependent RNA polymerase
(RdRP), also known as RNA replicase, is an enzyme that can synthesize
RNA from an RNA template, rather than the requisite DNA template associ-
ated with the other RNA polymerases. RdRP is well known for its role in the
replication of poliovirus and other RNA viruses. It has come under much
closer scrutiny because of the documented role of this enzyme in the RNA
interference pathways, especially in higher plants, where it mediates the synthesis
of double-stranded RNA (dsRNA) molecules which, upon cleavage, give rise
to miRNAs/siRNAs. This aspect of RdRP function is discussed in detail in
Chapter 11, RNA Interference and RNA Editing (RNA Interference).
HALLMARKS OF A TYPICAL mRNA
Many genes are transcribed constitutively by RNA polymerase II, and this
enzyme has a role in the integration of associated nuclear events such as
splicing and polyadenylation (Hirose and Manley, 2000). Of these numerous
transcripts, large quantities of heterogeneous nuclear RNA (hnRNA) are
turned over in the nucleus. This may partially be due to errors in transcrip-
tion or posttranscriptional processing, which is yet another example of qual-
ity control on the part of the gene expression machinery. It is also possible
that not an insignificant amount of hnRNA is retained in the nucleus to
facilitate some aspect of the process of transcription. In any event, mRNAs in
eukaryotic cells emerge from precursor hnRNA through a series of modifying
reactions, which include formation of the 5′ cap, methylation of the cap,
splicing, 3′-end processing, and frequently, polyadenylation. Transcripts are
produced at different rates from different loci; therefore, each mRNA species
is classified based on its cytoplasmic prevalence or, more properly, its
abundance. There are three official categories, namely high abundance, medium
abundance, and low abundance mRNAs, and, in the mind of this author, the
unofficial very low abundance category.
Highly abundant transcripts are present in hundreds of copies per cell.
These are most often observed when a cell is producing an enormous
quantity of a particular protein or is highly specialized or differentiated to
perform a unique function. Medium abundance transcripts are best thought of as
being present in dozens of copies per cell; many genes with housekeeping
functions produce their mRNAs at this level in the cell. (A gene is said to
have a housekeeping function if the encoded gene product
plays a maintenance role or if basal expression is needed to maintain the
physiology of the cell or even viability. Since the expression of these genes
is generally constant, and not expected to change as a function of cell state,
housekeeping genes are assayed as internal controls in various quantitative
assays. See Chapter 9, Quantitative PCR Techniques, for further details.)
Low abundance
mRNAs are generally present in 10 or fewer copies per cell and often are
difficult to assay by many of the older classical techniques, such as Northern
analysis (see Chapter 15: Northern Analysis). Very low abundance mRNAs
are those present, on average, in fewer than one copy per cell, a designation
which is generally associated with heterogeneous tissue samples or, very
commonly, in cases where cancer cells manifest a high degree of aneuploidy.
In the past, low and very low abundance mRNAs were said to represent the
“hard to clone genes,” though newer methods for the assay of gene expres-
sion have revealed a plethora of previously unknown transcripts of various
persuasions. Most importantly, the prevalence or abundance of an mRNA
species in a cell is subject to change of monumental proportions. Such
changes may occur in response to natural changes in the cellular milieu or
due to experimental manipulation.
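The abundance categories described above can be summarized as a simple lookup. The Python sketch below is illustrative: the text gives only approximate ranges ("hundreds," "dozens," "10 or fewer," "fewer than one"), so the numeric cutoffs of 100, 10, and 1 copies per cell, along with the function name, are assumptions chosen for the example rather than formal definitions.

```python
def abundance_class(copies_per_cell: float) -> str:
    """Assign an mRNA species to an abundance category by mean copies per cell.

    Cutoffs are assumed for illustration: the categories in the text are
    described only approximately, so these boundaries are not definitive.
    """
    if copies_per_cell < 0:
        raise ValueError("copy number cannot be negative")
    if copies_per_cell < 1:
        return "very low abundance"   # fewer than one copy per cell, on average
    if copies_per_cell <= 10:
        return "low abundance"        # 10 or fewer copies per cell
    if copies_per_cell < 100:
        return "medium abundance"     # dozens of copies per cell
    return "high abundance"           # hundreds of copies per cell


for copies in (0.2, 5, 36, 500):
    print(copies, "->", abundance_class(copies))
```

Because abundance can change dramatically with cell state, any such classification applies only to the sample and conditions in which the copy number was measured.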
A typical human fibroblast cell contains approximately 10 picograms
(pg) of RNA, which is equivalent to about 10⁶ molecules transcribed from a
particular subset of the 20,000 or so genes estimated to make up the human
genome. While this mRNA heterogeneity reflects the diversity of proteins
that these mRNAs encode, a typical eukaryotic mRNA molecule shares sev-
eral structural features with nearly all other mRNA molecules (Fig. 1.10). As
will become evident from the descriptions which follow, producing a func-
tional mRNA molecule is amazingly complex.
FIGURE 1.10 Topology of a typical eukaryotic mRNA molecule.
[Figure labels, 5′ to 3′: 5′ cap; initiation codon (AUG); coding region; stop codon (UAA, UAG, UGA); polyadenylation signal (AAUAAA); poly(A) tail (AAAAAAAAA).]
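The landmarks of a typical mRNA (initiation codon, coding region, stop codon, polyadenylation signal, and poly(A) tail) can be located in a sequence with a short Python sketch. This is a simplified illustration, not an annotation tool: the function name, the returned field names, and the example sequence are hypothetical, and the first AUG is assumed to be the initiation codon, which ignores sequence context.

```python
import re

STOP_CODONS = {"UAA", "UAG", "UGA"}


def annotate_mrna(seq: str):
    """Locate the landmarks of a linear mRNA sequence written 5'->3'.

    Simplification: the first AUG is taken as the initiation codon.
    Returns None if no AUG followed by an in-frame stop codon is found.
    """
    seq = seq.upper()
    start = seq.find("AUG")
    if start == -1:
        return None
    stop = None
    # Step through the reading frame one codon at a time.
    for i in range(start + 3, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            stop = i
            break
    if stop is None:
        return None
    # The poly(A) tail is modeled as the run of A residues at the 3' end.
    tail = re.search(r"A+$", seq[stop + 3:])
    return {
        "five_prime_utr": seq[:start],
        "coding_region": seq[start:stop + 3],
        # 0-based index of AAUAAA downstream of the stop codon; -1 if absent.
        "polyadenylation_signal_at": seq.find("AAUAAA", stop + 3),
        "poly_a_tail_length": len(tail.group()) if tail else 0,
    }


# Hypothetical sequence: 5' UTR + coding region + 3' UTR + poly(A) tail.
mrna = "GGACUC" + "AUGGCUUGGUAA" + "CCUAAUAAAGCUUC" + "A" * 12
print(annotate_mrna(mrna))
```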
5′ Cap
A great majority of mature eukaryotic mRNA molecules are characteristi-
cally monocistronic polyribonucleotides produced by RNA polymerase II.
The immediate product of transcription, a precursor hnRNA molecule,
displays the following structure at the 5′ end of the molecule:
5′ pppRNNN.....3′
meaning that the first transcribed nucleotide contains a purine base (symbol
for a purine = R), either adenine or guanine. Not unexpectedly, the 5′
triphosphate exhibited by the nucleotide remains intact and phosphodiester
bonds join ribonucleotides sequentially in a 5′→3′ orientation. A structure
known as the 5′ cap (Reddy et al., 1974; Shatkin, 1976; Banerjee, 1980) is
assembled in a step-wise manner just after the initiation of transcription.
Capping occurs as soon as the 5′ end of the transcript emerges from the
RNA polymerase complex, when nascent polynucleotides are generally
between 20 and 30 bases long (Coppola et al., 1983). The net result of the
ensuing reactions that are required for capping is the formation of an unusual
5′–5′ triphosphate linkage between the first transcribed nucleotide (the
original RNA) and a 7-methylguanosine (m⁷G) nucleotide, the polarity of which
effectively seals the 5′ end of the transcript. In the realm of RNA biogenesis,
5′ capping is the first example of a posttranscriptional modification associated
with gene expression; it is observed in virtually all eukaryotic mRNAs and
is likewise a feature of most eRNA molecules.
The formation of the 5′ cap begins with the addition of the terminal
guanosine nucleotide (G) to the 5′ end of the nascent transcript. This occurs in
the nucleus and first requires removal of the γ-phosphate from the 5′ end of
the hnRNA by RNA triphosphatase. Subsequently, the terminal “G” is
added by the enzyme guanylyltransferase, resulting in the structure
5′ GpppRNNN.....3′
The new terminal guanosine nucleotide is joined 5′–5′ to what was the
first transcribed nucleotide via what is best thought of as an inverted linkage.
The resulting 5′ cap structure is then subjected to one or more methylation
events (the methyl group donor is S-adenosylmethionine; SAM), the first of
which is directed toward the number 7 position of the terminal guanine,
courtesy of the enzyme guanine-7-methyltransferase, and is represented as
m⁷G(5′)ppp(5′)RNNN.....3′
Most eukaryotic mRNAs subsequently experience an additional
methylation directed toward the 2′ oxygen in the ribose of the second nucleotide,
catalyzed by the enzyme 2′-O-methyltransferase. Depending on the mRNA, as
many as four methylation events can occur, involving the 5′ guanine
nucleotide and the sugars of the penultimate and subpenultimate nucleotides, and
possibly including the formation of N⁶-methyladenosine (if the second nucleotide
contains adenine). The extent of methylation is a function of mRNA species,
and higher organisms usually have more extensively methylated caps. Cap
structures with multiple methyl groups have also been observed in the small
nuclear RNA species (Furuichi and Shatkin, 1989), though they are
structurally different from the mRNA caps described here (reviewed by Matera et al.,
2007). Capping of eukaryotic mRNAs also precedes rare internal methylation
of adenine; when it does occur, this infrequent modification results in the
formation of N⁶-methyladenosine early in mRNA biogenesis, which is
conserved during RNA processing (Shatkin, 1976; Chen-Kiang et al., 1979).
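The step-wise assembly of the cap described above can be traced as a sequence of 5′-end structures. The Python sketch below is only a bookkeeping aid: the function name is hypothetical, the structures are written in plain text (m7G for m⁷G), and the trailing "m" used to mark the 2′-O-methylated nucleotide is conventional shorthand assumed here rather than notation taken from the text.

```python
def cap_assembly_states(first_base: str = "A") -> list[tuple[str, str]]:
    """List the 5'-end structures produced during cap formation, in order.

    first_base -- the purine (A or G) of the first transcribed nucleotide.
    Each entry pairs the step with the resulting 5' end, as plain text.
    """
    r = first_base.upper()
    if r not in ("A", "G"):
        raise ValueError("the first transcribed nucleotide contains a purine (A or G)")
    body = "NNN...3'"
    return [
        ("nascent hnRNA with intact 5' triphosphate", f"5' ppp{r}{body}"),
        ("RNA triphosphatase removes the gamma-phosphate", f"5' pp{r}{body}"),
        ("guanylyltransferase adds the terminal G (5'-5' linkage)", f"5' Gppp{r}{body}"),
        ("guanine-7-methyltransferase methylates position 7 of the G", f"5' m7Gppp{r}{body}"),
        ("2'-O-methyltransferase methylates the ribose of the next nucleotide",
         f"5' m7Gppp{r}m{body}"),
    ]


for step, structure in cap_assembly_states("A"):
    print(f"{structure:<22} {step}")
```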
The capping reaction associated with the 5′ terminus also has a role in
regulating transcription itself. It appears that the RNA polymerase slows down or
altogether pauses when the nascent hnRNA is no more than 30 nts in length,
facilitating the recruitment of the capping enzymes. This might be thought of
as a quality control checkpoint, as transcription resumes only when the cap-
ping process has been completed.
The presence of the 5′ cap is also required to support initiation of
translation in eukaryotes. Eukaryotic ribosomes distinguish mRNAs from
non-mRNAs by the presence of the 5′ cap. Succinctly, rRNA and tRNA are not
translated because they are not capped. When produced by in vitro transcrip-
tion, RNAs must be subjected to an in vitro capping reaction if the tran-
scripts are to be expected to support synthesis of the encoded protein.
Formation of the translation apparatus is initiated in part by cap-binding pro-
teins, particularly eIF4E (for review, see Krebs et al., 2012), followed by the
assembly of the ribosomal subunits as mediated by initiation factors.
Capping also confers transcript stability by protecting against phosphatase
attack and 5′→3′ exonucleolytic degradation (Furuichi et al., 1977; Furuichi
and Shatkin, 1989). In contrast, prokaryotic mRNAs, which naturally lack a
5′ cap structure, are degraded exonucleolytically from the 5′ end even while
translation is ongoing downstream. Mitochondrial and chloroplast mRNAs
are not capped, while most animal viruses that replicate in eukaryotic cells
manifest 5′-capped mRNAs, a noteworthy exception being poliovirus
(Hewlett et al., 1976; Nomoto et al., 1976).
Finally, the 5′ cap has a role in nuclear egress. Export of mRNA into the
cytoplasm occurs when the 5′ cap engages the heterodimeric nuclear
cap-binding complex (CBC). Once through the nuclear pore, eIF4E replaces CBC;
this substitution is needed to ensure efficient translation, as noted above.
5′ UTR (Leader Sequence)
The first nucleotides immediately 3′ to the eukaryotic cap structure constitute
the 5′ untranslated region (5′ UTR), also known more casually as the
30 RNA Methodologies
Exploring the Variety of Random
Documents with Different Content
back
Rna Methodologies Laboratory Guide For Isolation And Characterization 5th Robert E Farrell
back
Welcome to our website – the perfect destination for book lovers and
knowledge seekers. We believe that every book holds a new world,
offering opportunities for learning, discovery, and personal growth.
That’s why we are dedicated to bringing you a diverse collection of
books, ranging from classic literature and specialized publications to
self-development guides and children's books.
More than just a book-buying platform, we strive to be a bridge
connecting you with timeless cultural and intellectual values. With an
elegant, user-friendly interface and a smart search system, you can
quickly find the books that best suit your interests. Additionally,
our special promotions and home delivery services help you save time
and fully enjoy the joy of reading.
Join us on a journey of knowledge exploration, passion nurturing, and
personal growth every day!
ebookbell.com

More Related Content

PDF
Rna Methodologies Fourth Edition Laboratory Guide For Isolation And Character...
PDF
Rna Mapping Methods And Protocols 1st Edition M Lucrecia Alvarez
PDF
Back to basics: Fundamental Concepts and Special Considerations in RNA Isolation
PDF
Rna Bioinformatics 1st Edition Ernesto Picardi Eds
PPTX
Transcriptome analysis
PDF
Rna Abundance Analysis Methods And Protocols 2nd Ed Hailing Jin
PDF
Dna Vs Rna
PPTX
The-Evolution-of-Noncoding-RNA-Beyond-the-Central-Dogma.pptx
Rna Methodologies Fourth Edition Laboratory Guide For Isolation And Character...
Rna Mapping Methods And Protocols 1st Edition M Lucrecia Alvarez
Back to basics: Fundamental Concepts and Special Considerations in RNA Isolation
Rna Bioinformatics 1st Edition Ernesto Picardi Eds
Transcriptome analysis
Rna Abundance Analysis Methods And Protocols 2nd Ed Hailing Jin
Dna Vs Rna
The-Evolution-of-Noncoding-RNA-Beyond-the-Central-Dogma.pptx

Similar to Rna Methodologies Laboratory Guide For Isolation And Characterization 5th Robert E Farrell (20)

PPTX
BTC 810 Analysis of Transcriptomes.pptx
PDF
RNA sequencing: advances and opportunities
PDF
Bacterial Transcriptional Control Methods And Protocols 1st Edition Irina Art...
PDF
Differentiated Fern Research Paper
PDF
Rna Therapeutics Function Design And Delivery 1st Edition John Karijolich
PDF
Rna Polymerase And Associated Factors Part D 1st Edition Sankar Adhya Editor
PPTX
RNA structure
PDF
Regulatory Noncoding Rnas Methods And Protocols 1st Edition Gordon G Carmicha...
PPTX
Applications of transcriptomice s in modern biotechnology 2
PPTX
Chaim Lecture 2new tren in dna rna .pptx
PDF
Molecular Cell Biology 8th Edition Lodish Solutions Manual
PPTX
gene expression traditional methods
PDF
Rna Modification 1st Edition Jonatha Gott
PPTX
Dive into the fascinating world of RNA, the molecular architect of life. ppt....
PDF
Molecular Cell Biology 8th Edition Lodish Solutions Manual
PPT
Biochem synthesis of rna(june.23.2010)
PDF
Rna Methodologies Laboratory Guide For Isolation And Characterization 5th Robert E Farrell

  • 20. Preface

RNA NEVER CEASES TO AMAZE

The first edition of RNA Methodologies was published in 1993. At that time, RNA was viewed as an “interesting” molecule and many molecular biologists were happy if they could do a decent Northern blot. Twenty-five years later, we have at least begun to appreciate that RNA is as diverse in function as it is in form within the society of the cell. With that in mind, a major goal of this book is to tell the RNA story from several perspectives to ensure a holistic understanding of this aspect of molecular biology. The recurrent themes herein are the correct way to isolate, handle, store, and assay RNA, and an appropriate level of background information related to the fundamentals of gene expression is likewise provided.

Many roles of RNA support the widely acclaimed “RNA world hypothesis,” in which RNA, and not DNA, protein, or anything else, was the primary primordial information molecule. RNA serves as a keeper of genetic information (RNA viral genomes), a transporter of genetic information (mRNA), a guide that leads proteins to specific RNA sequences on other molecules for possible modification (gRNA, siRNA), a powerful posttranscriptional regulator of gene expression (miRNA), the scaffolding of the protein synthesis machinery (rRNA component of ribosomes), a sustainer of translation (tRNA transport of amino acids), and a modifier of itself and other molecules (self-splicing and catalytic RNA). Without a doubt, there are many other functions associated with RNA that have yet to be uncovered. It is safe to say that we are in the midst of a revolution in terms of our understanding of the several faces of RNA. Who would have ever imagined?

In the RNA world, it is all about quality control.
It is well known that RNA molecules are examined, repeatedly and systematically, from the onset of transcription, posttranscriptionally, and throughout their biological lifespan, which ends with their dismantling when they have been damaged or are otherwise no longer needed by the cell. To put this into perspective, consider CQI, that is, continuous quality improvement. Successful companies and other institutions often embrace the philosophy of CQI in order to sustain optimized performance. Some of the CQI strategies that the cell has been using from time immemorial are just starting to be understood, and it is truly mind-boggling.
  • 21. The multiple quality control checkpoints associated with the production of mRNA ensure that only the highest fidelity, error-free template material is available to the ribosomes for protein synthesis.

In high school in the mid-1970s, this Author learned about the three, and only three, types of RNA known at that time, to wit, mRNA, tRNA, and of course rRNA; at least one long, noncoding RNA (lncRNA) was known back then! In the present day, numerous functional lncRNAs have been described, not to mention their all-important smaller miRNA cousins. In many pathological states, miRNA expression patterns are altered, leading to detrimental changes to cellular morphology and cellular physiology. Retrospectively, it is amazing that miRNAs remained unknown for as long as they did. Perhaps the major reason is the fact that RNA isolation protocols, particularly in the 1980s and early 1990s, did not favor the efficient recovery of small transcripts. This was also true of the first generation of molecular biology kits. With the contemporary tools now at hand, new transcript species are being identified continuously. The number of known miRNAs in human cells is already in the thousands and these small, powerful transcripts are critical modulators of the flow of genetic information.

Many functions that RNA molecules are able to perform are a direct result of the single-stranded character of polyribonucleotides. Consider, for example, the presence of regulatory structures formed by some RNA molecules such as stems, loops, and hairpins, and compare these structures to regulatory sequences such as AUG, UAG, and AAUAAA, which influence translation and various posttranscriptional facets of RNA biogenesis. These complementary properties are inherent to RNA because of its amazing ability to fold into formidable secondary and tertiary structures, thereby imparting transcript functionalities perhaps as diverse as its very nucleotide sequence.
Regarding the business at hand, transcriptional profiling is possible only when high-quality RNA is isolated from its biological source, such that it is able to support reverse transcription, hybridization, and downstream applications that include variegated quantitation assays as well as the detection of previously uncharacterized genes, differentially spliced transcripts, and transcripts with multiple start sites. These abilities are of ever increasing importance because of the apparent link between an abnormal abundance of a transcript (coding or noncoding; too high or too low) and a genetic disease. While impressively sensitive methods now exist for measuring transcript abundance, it is just as important to be able to identify polymorphisms within transcripts. For example, alternative splicing imparts an added level of vulnerability to mutations and the disease state. Moreover, serious thought is required “outside the box,” i.e., the cell, because circulating nucleic acids offer enormous potential as biomarkers. Succinctly, what happens at the RNA level often determines the fate of the cell.

There have been many wonderful technological advances in the study of RNA since the publication of the previous edition of this book in 2010,
  • 22. and many of those techniques and their applications and limitations are discussed here. My philosophy in the preparation of the fifth edition of RNA Methodologies has been that while technology is great, the fundamentals of working with RNA must be understood because they are the foundation upon which the contemporary methods to which the research community has become accustomed have been built. It may come as something of a surprise to learn that plenty of people continue to use comparatively roughhewn tools such as the time-honored Northern blot, often as a means of confirming data gleaned from more sophisticated techniques. Regardless of the method, good laboratory practices (also a quality control system) associated with RNA methodologies are important to know about, particularly when it becomes necessary to troubleshoot (it always does).

In light of the many advances in the study of RNA, another goal of this book is to unify many of the facets of RNA characterization in a coherent start-to-finish format. One of the difficulties toward the realization of this goal is that the rapid succession of new techniques, and variants thereof, has resulted in confusing technical nomenclature. To make matters worse, not everyone uses the same terminology to describe the same techniques. Regardless of the intended experimental trajectory, the purification of high-quality RNA, what this Author affectionately refers to as eRNA (excellent RNA!), from the biological source is always the starting point. Whether isolated from cell culture or directly from whole tissue, only the meticulous handling of RNA will support experiments that will be used for its study. All of the background information and the updates included herein are appropriate since the RNA novice lacks the historical perspective and frame of reference that more experienced investigators often enjoy.
This laboratory guide represents a growing collection of tried, tested, and optimized laboratory protocols for the isolation and characterization of eukaryotic RNA, with lesser emphasis on the characterization of prokaryotic transcripts. Another goal of this book is to help the reader develop greater confidence in the laboratory. Consequently, this text is written for the principal investigator, bench scientist, physician, veterinarian, lab technician, graduate student, undergraduate research assistant, and anyone else capable of performing basic research techniques; there is something in it for everyone. This resource is intended to provide a rationale to assist in the decision-making process for individuals at all levels of experience by presenting realistic alternatives for achieving the same experimental goals, and demonstrating how various techniques contribute to the understanding of gene expression and functionality. Many of the incorporated notations and hints are based upon personal experience and pave the way for the expedient recovery of RNA and the most judicious use of resources.

It is unfortunate that commonplace unsound tactics for RNA handling and characterization result in wasted resources due to an obvious failure to understand the “what” and the “why” from the onset of the study. The best advice that I can offer:
  • 23. always think two steps ahead in an experiment, and reflect upon how the method of RNA isolation and the ensuing protocols will impact the interpretation of data. While it is hoped that this text be studied from cover to cover, one may pick and choose salient protocols without loss of continuity. Collectively, the chapters work together to embellish the RNA story, each presenting clear take-home lessons. The liberal incorporation of flow charts, tables, and representative data likewise facilitates learning and assists in the planning and implementation phases of a project. You are limited only by your own ingenuity.

The Author acknowledges, with sincere thanks and appreciation, the intellectual encouragement of the many colleagues and friends who, in some way, supported the preparation of this manuscript. The support and patience of the Author’s family are also gratefully acknowledged and are very much appreciated.

Initium sapientiae timor Domini
  • 24. For Catherine Ann, Sean Patrick, Emma Catherine, Liam Michael, and Patrick Joseph
  • 25. Chapter 1 RNA and the Cellular Biochemistry Revisited
RNA Methodologies. DOI: http://dx.doi.org/10.1016/B978-0-12-804678-4.00001-4 © 2017 Elsevier Inc. All rights reserved.

WHY STUDY RNA?

All cell and tissue functions are ultimately governed by gene expression. Consequently, the reasons for electing to study the modulation of RNA levels as at least one parameter of the cellular biochemistry may be as diverse as the intracellular RNA population itself. Generally speaking, the characterization of RNA is almost always related to transcription, i.e., gene expression questions being asked in the context of a particular scientific inquiry, and most often revolves around measuring the dynamic abundance level of one or several transcripts. The goals in any experimental design involving RNA generally revolve around one or more fundamental themes, including but not limited to the following:

1. Measurement of the steady-state abundance of cellular transcripts. Steady-state RNA refers to the net accumulation of transcription products in the cell, or in a subcellular compartment such as the nucleus or the cytoplasm. It is the combined result of RNA synthesis, stability, and degradation. This is the most common reason why RNA is isolated from cells and tissues. Analysis may focus on one transcript, a few transcripts, or all transcripts simultaneously; this latter approach is commonly known as global analysis of gene expression or whole transcriptome profiling. Given the ease with which RNA can be purified from biological sources, the use of various sensitive, contemporary approaches is widespread for generating quantitative and qualitative profiles of RNA populations using any of a variety of laboratory techniques.

2. Synthesis of complementary DNA (cDNA). Unstable, single-stranded messenger RNA (mRNA) can serve as the template for the in vitro synthesis of very stable single- or double-stranded cDNA molecules. This is the first step for subsequent amplification by the polymerase chain reaction (PCR), often for some “quantitative purpose,” for transcript mapping purposes, for direct ligation into a vector for sequencing or for expression of the encoded protein, for the physical separation of two or more cDNA species, or for the older strategy of synthesizing an entire cDNA library
  • 26. (older literature occasionally refers to a cDNA library as a “clone bank”), which can be propagated for long-term storage and analysis. In any event, the construction of cDNA is the creation of a permanent biochemical record of the cell at the moment of cellular disruption. Historically, the synthesis of highly representative cDNA is one of the most important methodologies in the molecular biology laboratory and, in some hands, remains a significant challenge.

3. Detection of viruses which harbor an RNA genome. This proceeds via the synthesis of cDNA, as described above, followed by PCR or another cDNA amplification method.

4. Identification of the transcription start site (TSS). Historically, mapping of RNA molecules, including the 5′ end, the 3′ end, and the size and location of introns, was accomplished via nuclease protection assay, as described in Chapter 18, Quantification of Specific mRNAs by Nuclease Protection. Now, however, transcript mapping is almost always performed by some variant of 5′- or 3′-rapid amplification of cDNA ends (RACE; see Chapter 8: RT-PCR: A Science and an Art Form). As it is well known that a single genetic locus often has the potential to produce multiple RNAs, each with a different TSS and often in a tissue-specific manner, TSS mapping is an invaluable technique.

5. Measurement of the rate of transcription of gene sequences or the pathways of RNA processing. This may be deduced, at least in part, by the nuclear run-on assay in which radiolabeled ribonucleotide precursors are incorporated into nascent transcripts in direct proportion to the abundance of each species of RNA being transcribed (see Chapter 19: Analysis of Nuclear RNA). When used in conjunction with other methods that examine steady-state RNA levels, the regulation of genes can often be assigned as transcriptional or due to posttranscriptional events.

6. In vitro translation of purified mRNA.
The resulting polypeptide may be further characterized by immunoprecipitation or Western analysis. Cell-free translation represents an older method for the identification of specific transcripts: by providing the raw materials needed to support translation, one is able to demonstrate that a transcript of putative identity is able to support the synthesis of the cognate peptide. For example, this approach could be used to demonstrate that two transcripts from the same genetic locus with alternative TSSs are, in fact, able to direct the synthesis of identical or closely related proteins. In applications such as rational drug design, in vitro translation is helpful because understanding the three-dimensional architecture of a protein, and its wild type or mutated function(s), may suggest novel applications in the area of functional proteomics.

WHAT IS RNA?

RNA is a long, unbranched polymer of ribonucleoside monophosphate moieties joined together by phosphodiester linkages. Both eukaryotic and
  • 27. prokaryotic RNAs are single-stranded molecules. The unlinked monomer building blocks of both RNA and DNA are known generically as nucleotides. Each nucleotide consists of three key components: a pentose (five-carbon sugar), at least one phosphate group (nucleotides may contain as many as three phosphate groups), and a nitrogenous base (Fig. 1.1). A nitrogenous base joined to a pentose sugar is known as a nucleoside. When a phosphate group is added, the composite, a phosphate ester of the nucleoside, is known as a nucleotide.

base + sugar = nucleoside
nucleoside + phosphate = nucleotide

The components of RNA and DNA nucleosides and nucleotides are compared in Table 1.1. The key chemical difference between RNA and DNA is the presence of the five-carbon sugar ribose, in which a hydroxyl group (OH) is joined to the 2′ carbon of the ribose sugar, in the case of RNA; the absence of the 2′ OH group in DNA is the underlying basis of the name of the sugar “deoxyribose.” In addition, the base uracil is found in RNA, substituted in DNA by the closely related pyrimidine thymine (chemically, thymine is 5-methyluracil), though it is possible to find deoxynucleotides containing uracil in certain situations. More precisely, RNA is assembled from ribonucleotide precursors and DNA is assembled from deoxyribonucleotide precursors. Hence, RNA is so-named because of the ribose sugar it contains, just as DNA is named from its constituent 2′-deoxyribose sugar. Essential base, nucleoside, and nucleotide nomenclature is summarized in Table 1.2.

FIGURE 1.1 The identity of a nucleotide is defined by the base that is attached to the 1′ carbon. In practice, the nucleotides that make up an RNA or a DNA molecule are represented by the standard one-letter abbreviation for the base each contains: adenine (A), cytosine (C), guanine (G), thymine (T), and uracil (U).
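The naming relationships described above (base plus sugar gives a nucleoside; nucleoside plus phosphate gives a nucleotide) can be sketched in a few lines of Python. This is only an illustration: the lookup table and the function names `nucleotide_name` and `abbreviation` are hypothetical helpers invented for this example, not part of any library, and the names they produce simply follow the standard nomenclature used in this chapter.

```python
# Hypothetical lookup table: one-letter base code -> nucleoside name.
NUCLEOSIDES = {
    "RNA": {"A": "adenosine", "C": "cytidine", "G": "guanosine", "U": "uridine"},
    "DNA": {"A": "2'-deoxyadenosine", "C": "2'-deoxycytidine",
            "G": "2'-deoxyguanosine", "T": "2'-deoxythymidine"},
}

# Number of phosphate groups -> name prefix and abbreviation letter.
PHOSPHATES = {1: ("mono", "M"), 2: ("di", "D"), 3: ("tri", "T")}


def nucleotide_name(base, nucleic_acid="RNA", phosphates=3):
    """Full name: nucleoside + 5'-(mono|di|tri)phosphate."""
    nucleoside = NUCLEOSIDES[nucleic_acid][base]  # KeyError for, e.g., T in RNA
    prefix, _ = PHOSPHATES[phosphates]
    return f"{nucleoside}-5'-{prefix}phosphate"


def abbreviation(base, nucleic_acid="RNA", phosphates=3):
    """Short form such as ATP, UMP, or dTTP (lowercase d marks the deoxy form)."""
    if base not in NUCLEOSIDES[nucleic_acid]:
        raise KeyError(f"{base} is not a standard {nucleic_acid} base")
    _, letter = PHOSPHATES[phosphates]
    deoxy = "d" if nucleic_acid == "DNA" else ""
    return f"{deoxy}{base}{letter}P"


if __name__ == "__main__":
    print(nucleotide_name("A"), "=", abbreviation("A"))
    print(nucleotide_name("T", "DNA"), "=", abbreviation("T", "DNA"))
```

The table deliberately omits uracil from the DNA entries and thymine from the RNA entries, so asking for a nonstandard combination raises an error rather than returning a made-up name.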
  • 28. TABLE 1.1 Comparative Nucleotide Structure

                            RNA Nucleotides       DNA Nucleotides
Key Nucleotide Components   Five-carbon ribose    Five-carbon deoxyribose
                            Phosphate group(s)    Phosphate group(s)
                            Nitrogenous base      Nitrogenous base
Common Nitrogenous Bases
  Purines                   Adenine               Adenine
                            Guanine               Guanine
  Pyrimidines               Cytosine              Cytosine
                            Uracil                Thymine

TABLE 1.2 Essential Base, Nucleoside, and Nucleotide Nomenclature

Base           Nucleoside           Nucleotide Triphosphate Precursors
RNA
Adenine (A)    Adenosine            Adenosine-5′-triphosphate (ATP)
Cytosine (C)   Cytidine             Cytidine-5′-triphosphate (CTP)
Guanine (G)    Guanosine            Guanosine-5′-triphosphate (GTP)
Uracil (U)     Uridine              Uridine-5′-triphosphate (UTP)
DNA
Adenine (A)    2′-Deoxyadenosine    2′-Deoxyadenosine-5′-triphosphate (dATP)
Cytosine (C)   2′-Deoxycytidine     2′-Deoxycytidine-5′-triphosphate (dCTP)
Guanine (G)    2′-Deoxyguanosine    2′-Deoxyguanosine-5′-triphosphate (dGTP)
Thymine (T)    2′-Deoxythymidine    2′-Deoxythymidine-5′-triphosphate (dTTP)

Nucleosides consist of a base and a sugar only. A nucleoside is promoted to a nucleotide upon addition of at least one phosphate group.

Nitrogenous bases and the pentose sugar components of nucleosides are both cyclic. By convention, the numbering system for the carbon and nitrogen atoms that make up the bases is 1, 2, 3, and so forth, while the numbering system for the constituent carbon atoms of the sugar (ribose or
  • 29. deoxyribose) is 1′, 2′, 3′, 4′, and 5′. The purpose of this nomenclature is to avoid confusion when referring to the constituent atoms of the sugar versus those found in the base of a particular nucleotide or nucleoside.

The ribonucleoside triphosphates are collectively referred to as NTP; in various molecular biology protocols, the symbol NTP refers to an equimolar cocktail of ATP, CTP, GTP, and UTP. Similarly, the deoxy-form of a nucleotide is denoted by the placement of a lower case “d” preceding the nucleotide triphosphate, as in dATP, dCTP, dGTP, and dTTP, and the symbol dNTP (also, dXTP) refers to an equimolar cocktail of the four deoxynucleoside triphosphates in protocols capable of supporting the synthesis of cDNA or PCR products. It is the triphosphate form of a nucleotide that is utilized as a precursor during nucleic acid synthesis. The phosphate nearest to the sugar is known as the α phosphate, followed by the β phosphate, followed by the γ phosphate, which is furthest from the nucleoside moiety (Fig. 1.2). During nucleic acid polymerization, the β and γ phosphates (PPi; inorganic pyrophosphate) are cleaved (released) from the nucleotide, and the resulting single-phosphate nucleotide, a nucleoside monophosphate, is then incorporated into the nascent polynucleotide chain.

FIGURE 1.2 Adenosine-5′-triphosphate (ATP). The three constituent phosphate groups are designated α, β, and γ based on the proximity of each group to the nucleoside (base + sugar) component of the molecule. The replacement of the 2′-OH with H would convert this molecule to a deoxynucleotide.

POLYNUCLEOTIDE SYNTHESIS

Any enzyme with an associated polymerase activity is capable of synthesizing nucleic acid molecules from nucleotide precursors. The synthesis of RNA is mediated by the activity of enzymes known as RNA polymerases while DNA is synthesized, not unexpectedly, by DNA polymerases. A nucleic acid molecule is the result of linking nucleotides together by phosphodiester bonds. The formation of these bonds involves the nucleophilic attack by the 3′ hydroxyl group of the last nucleotide added to the nascent
  • 30. polynucleotide on the 5′ phosphate group of the incoming nucleotide (Fig. 1.3). For this reason nucleic acid synthesis is said to proceed 5′→3′, and there are no known exceptions to this process.

FIGURE 1.3 [Structural diagram of a dinucleotide, labeled with the 5′ end, the 3′ end, the phosphodiester linkage, and the α, β, and γ phosphates.] The dinucleotide that results from the formation of the first phosphodiester linkage has structurally different ends, namely a phosphate group at the 5′ end and a hydroxyl group at the 3′ end. The structural differences at the 5′- and 3′-ends are maintained regardless of the number of nucleotides that are joined together.

In order for the synthesis of nucleic acids to occur in vivo or in vitro, there are two fundamental requirements that must be fulfilled and maintained to initiate and to support continued nucleic acid polymerization:

1. There must be a template (a strand or an oligomer) to direct the polymerase-mediated insertion of the correct (complementary) nucleotide into the nascent chain. This occurs predictably, according to the conventions set down in Chargaff’s Rule (Zamenhof et al., 1952), which succinctly states that adenine ordinarily base pairs with thymine or uracil through the formation of two hydrogen bonds (A::T, A::U) and that guanine ordinarily base pairs to cytosine through three hydrogen bonds (G:::C). (DNA polymerases capable of adding nucleotides without template information are said to exhibit rare terminal transferase activity, as in the case of the unusual enzyme terminal deoxynucleotidyl transferase. This enzyme has broad applications in the area of cDNA synthesis as well as certain forms of 5′-RACE. These special cases are discussed in detail in Chapter 7, cDNA: A Permanent Biochemical Record of the Cell, and Chapter 8, RT-PCR: A Science and an Art Form.)

2. For initiation and elongation, there must be a free 3′-OH to which the next nucleotide in the chain can be joined via a phosphodiester linkage. Thus, the entire process of transcription requires some type of primer manifesting the requisite 3′-OH. This applies equally to RNA and DNA
  • 31. synthesis, both in vivo and in vitro. Most of the enzymes used in molecular biology that exhibit polymerase activity have nearly identical template and 3′-OH primer requirements.

This results in a polynucleotide with a consistent pattern of 5′→3′ linkages between adjacent nucleotides; elongation is frequently referred to as the 5′→3′ polymerase activity associated with the enzyme. Upon completion, nucleic acid molecules are assembled in such a way that:

1. The ends of the molecule are structurally different from one another. The first nucleotide of the molecule has an uninvolved 5′ (tri)phosphate, constituting the so-called 5′ end of the molecule. The last nucleotide that was added exhibits a free 3′ hydroxyl group, and this is known as the 3′ end of the molecule.

2. The backbone of the molecule consists of an alternating series of sugar and phosphate groups. Known as the phosphodiester backbone, or simply the backbone, of the molecule, it imparts a net negative charge to the molecule by virtue of its constituent phosphate groups.

3. The base associated with each nucleotide protrudes away from the backbone of the molecule. This stereochemistry makes the bases very accessible for hydrogen bonding (base pairing) to a complementary polynucleotide sequence. This proclivity is at the very heart of molecular hybridization in the laboratory.

The nitrogenous bases found in nucleotides are categorized as either purines (adenine and guanine) or pyrimidines (cytosine, thymine, and uracil), both of which are flat aromatic molecules. The specificity of base pairing (purine with pyrimidine) is maintained by the stereochemical preferences of the bases listed here. In other words, what is commonly known as “Watson–Crick” base pairing is predicated on the bases involved being in their preferred tautomeric forms.
Base           Preferred     Rare
Purines
  Adenine      Amino form    Imino form
  Guanine      Keto form     Enol form
Pyrimidines
  Cytosine     Amino form    Imino form
  Thymine      Keto form     Enol form
  Uracil       Keto form     Enol form

Hydrogen bonds, which are highly directional, form between complementary bases when an electropositive hydrogen atom is attracted to an electronegative atom such as oxygen or nitrogen. Because of the manner in which bases protrude from their respective phosphodiester backbones, antiparallel base pairing or hybridization of complementary strands is strongly favored. Thus, the 5′ end of one strand is opposite the 3′ end of the complementary strand to which it is base-paired and often represented as shown in the following graph:
  • 32.
5′ ---------------- 3′
3′ ---------------- 5′

This is true for all double-stranded molecules: dsDNA, dsRNA, and DNA:RNA hybrids. The ability to promote, or to prevent, base pairing in this manner is a central act in the molecular biology laboratory.

The obvious structural differences at the 5′ and 3′ ends of a molecule support a convention by which one may unambiguously refer to the position of any feature of a nucleic acid molecule in relation to any other feature:

Upstream means that a structure or feature is closer to or in the direction of the 5′ end of the molecule, relative to some other point of reference; it can also mean in the opposite direction of gene expression.

Downstream means that a structure or feature is closer to or in the direction of the 3′ end of the molecule, relative to some other point of reference; it can also mean in the direction of gene expression.

For the sake of simplicity, upstream and downstream are most often used to mean “in the opposite direction of expression” and “in the direction of expression,” respectively. This nomenclature may be especially useful when describing features or regions of a double-stranded nucleic acid molecule, in discussions pertaining to either the structure or the expression of a gene and, in particular, for the purpose of primer design to support PCR (see Chapter 8: RT-PCR: A Science and an Art Form).

The actual base sequence, i.e., the linear order of ribonucleotides, is known as the primary (1°) structure of an RNA molecule, and this order is dictated by the order of nucleotides on the DNA template strand. There is a tremendous proclivity for a single RNA molecule to exhibit intramolecular base pairing, resulting in what is known as secondary (2°) structure. The variety of possible interactions within the phosphodiester backbone is often described using such colorful nomenclature as RNA hairpins, stems, interior loops, bulge loops, multibranched loops, kissing loops, cruciform structures, and pseudoknots (Fig. 1.4).
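The antiparallel pairing convention and the idea of intramolecular base pairing can be illustrated with a short Python sketch. It is a toy example under stated assumptions: only the canonical A·U and G·C pairs are considered, the stem must be perfectly paired, and the function names `reverse_complement` and `find_hairpin` are hypothetical helpers written for this illustration rather than functions from any established package. It makes no attempt at real secondary structure prediction.

```python
# Canonical RNA base pairs only (an assumption made for this sketch).
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the strand that pairs antiparallel with `rna`, written 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def find_hairpin(rna, stem=4, min_loop=3):
    """Find the first perfectly paired stem of length `stem`.

    Returns (i, j), the 0-based start positions of the 5' arm and the 3' arm,
    or None if no such stem exists. At least `min_loop` unpaired nucleotides
    must separate the two arms.
    """
    for i in range(len(rna) - stem + 1):
        five_prime_arm = rna[i:i + stem]
        partner = reverse_complement(five_prime_arm)
        # The 3' arm must begin after the 5' arm plus the minimum loop.
        for j in range(i + stem + min_loop, len(rna) - stem + 1):
            if rna[j:j + stem] == partner:
                return i, j
    return None


if __name__ == "__main__":
    print(reverse_complement("AUGC"))            # strand that pairs with AUGC
    print(find_hairpin("GGGAAAACCC", stem=3))    # arms GGG and CCC around a loop
```

Because the 3′ arm is compared against the reverse complement of the 5′ arm, the two arms are necessarily antiparallel when folded back on one another, which is the same geometry shown for two separate strands in the graph above.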
Higher-order three-dimensional folding, the so-called tertiary (3°) structure which RNA molecules exhibit, is best described as the collection of 2° structural elements arranged in such a way that an RNA molecule is able to perform its biological function. Much has been suggested, for example, about the role of folding by careful study of transfer RNAs, the classical example of intramolecular base pairing par excellence. It is important to note that some of the 2° and 3° structures of tRNA are attributed to the formation of noncanonical base pairs. The canonical base pairs are G·C, A·T, and A·U; examples of noncanonical base pairs include G·U, A·C, A·G, C·U, U·U, G·G, A·Ψ (Ψ = pseudouridine), G·Ψ, A·A·U trimers, and others. An excellent database containing known noncanonical base pairs involving RNA is maintained
  • 33. by Dr. George Fox at http://prion.bchs.uh.edu/bp_type/ (Nagaswamy et al., 2000, 2002). Contemporary studies have demonstrated that mRNA also assumes varying degrees of transient 2° and 3° structures which, in no small measure, influence its function in the cytoplasm. For most laboratory applications, higher-order folding must be disrupted, as described below, before an assay with a quantitative component can be performed using an RNA sample. Failure to do so generally has a severe negative impact on the accuracy of quantitative profiling of the sample.

TYPES OF RNA

Transcription results in the production of RNA molecules, generically referred to as transcripts. In the past, cellular transcripts were broadly classified as ribosomal RNA (rRNA), transfer RNA (tRNA), heterogeneous nuclear RNA (hnRNA), or messenger RNA (mRNA), as well as a collection of small RNAs of previously unknown function. Now, however, one must include the very diverse population of noncoding RNA (ncRNA), all of which are of immense interest in the study of the regulation of gene expression (Table 1.3). Each category of RNA, which in eukaryotic cells is synthesized by a different type of RNA polymerase, performs a different function in the cell. In contrast, all transcripts in bacteria are produced by a single type of RNA polymerase. The various types of RNA are not represented in equal amounts; the abundance of each is directly related to the physiology of the cell.

rRNA is the most abundant RNA component in the cell. In prokaryotic cells the major rRNA species are the 23S rRNA, 16S rRNA, and 5S rRNA.

[Figure 1.4 panels: helix, stem-loop, bulge, pseudoknot, three-stem junction]
FIGURE 1.4 Examples of secondary structure commonly observed in single-stranded RNA molecules. Note how a single molecule is able to exhibit intramolecular base pairing by the antiparallel juxtaposition of complementary regions. Double-stranded regions may be perfectly or imperfectly base-paired.
To a large extent, the variety and locations of stems, hairpins, and loops will influence the ensuing tertiary structure.
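The distinction between canonical and noncanonical base pairs, and the way a stem-loop arises from antiparallel pairing of complementary regions, can be illustrated with a minimal sketch. The function names and the minimum loop size of three nucleotides are assumptions made for this illustration only; the sketch pairs the two ends of a short RNA inward and is not a structure prediction method.

```python
# Illustrative sketch only: RNA alphabet, Watson-Crick pairs plus the G-U wobble.
CANONICAL = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}
WOBBLE = {("G", "U"), ("U", "G")}


def classify_pair(a, b):
    """Label a pair of RNA bases as canonical or noncanonical."""
    pair = (a.upper(), b.upper())
    if pair in CANONICAL:
        return "canonical"
    if pair in WOBBLE:
        return "noncanonical (G-U wobble)"
    return "noncanonical"


def stem_length(rna, min_loop=3):
    """Count canonical pairs formed by pairing the 5' end with the 3' end,
    moving inward, while leaving at least min_loop unpaired loop bases
    (min_loop is a hypothetical parameter chosen for this sketch)."""
    rna = rna.upper()
    i, j = 0, len(rna) - 1
    pairs = 0
    while j - i - 1 >= min_loop and (rna[i], rna[j]) in CANONICAL:
        pairs += 1
        i += 1
        j -= 1
    return pairs


if __name__ == "__main__":
    print(classify_pair("A", "U"))
    print(classify_pair("G", "U"))
    print(stem_length("GGGAAAACCC"))
```

In this toy hairpin, the three G residues at the 5′ end pair with the three C residues at the 3′ end, leaving a four-nucleotide loop.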
  • 34. TABLE 1.3 RNA Types and Functions

Name | Symbol | Basic Function | Prokaryotic | Eukaryotic

Coding
Messenger RNA | mRNA | Template for the synthesis of proteins | Yes | Yes
Heterogeneous nuclear RNA | hnRNA | Large unspliced precursor of mRNA (pre-mRNA) | No | Yes

Long noncoding (lnc)
Ribosomal RNA | rRNA | Forms scaffolding of the ribosomal subunits | Yes | Yes
Transfer RNA | tRNA | Transports amino acids to the ribosome to support translation | Yes | Yes
Long intergenic noncoding RNA | lincRNA | Product of transcription of intergenic regions | No | Yes
XIST | Xist | Sex-linked lncRNA involved in X-chromosome inactivation and Barr body formation | No | Yes

Small noncoding (snc)
Small nuclear RNA | snRNA | Facilitates splicing of hnRNA into mature mRNA as well as rRNA processing. snRNA molecules exist as an RNA–protein complex, referred to as a snRNP, or snurp | No | Yes
Small nucleolar RNA | snoRNA | Processing of immature rRNA transcripts in the nucleolus; some have a role in gene silencing | No | Yes
Small cytoplasmic RNA | scRNA | Facilitates protein trafficking and secretion; possible mRNA degradation. scRNA molecules exist as an RNA–protein complex, referred to as a scRNP, or scyrp | Yes | Yes
microRNA | miRNA | Short antisense RNAs that participate in the regulation of gene expression by blocking mRNA and inhibiting translation | No | Yes
Catalytic RNA | Ribozyme | An RNA molecule with a catalytic function | Yes | Yes
Telomerase RNA | TERC | RNA portion of the enzyme/RNA complex that repairs chromosome telomeres (TERC: telomerase RNA component) | No | Yes
  • 35. The eukaryotic counterparts are identified as the 28S rRNA, 18S rRNA, and 5S rRNA, as well as a fourth ribosomal transcript, the 5.8S rRNA. These molecules form the scaffolding of ribosomes, which become translationally competent when decorated with myriad ribosomal proteins. At present there are 55 known prokaryotic ribosomal proteins and 82 known eukaryotic (mammalian) ribosomal proteins. Not all ribosomes are functional at any given time, and the existence of a pool of transiently inactive ribosomes is itself a regulator of gene expression. The superabundance of rRNA in a purified RNA sample is often exploited in two ways: as an RNA mass loading control (see Chapter 9: Quantitative PCR Techniques) and as a set of internal electrophoresis molecular weight markers (see Chapter 13: Electrophoresis of RNA).

tRNA is responsible for the transportation of amino acids to the ribosome to support protein synthesis. tRNA molecules are small, ordinarily ranging from 74 to 95 nt. When shuttling an amino acid covalently linked to its 3′ end, a tRNA is said to be “charged”. Placement of the correct amino acid into the nascent polypeptide depends on recognition of the mRNA codon (a group of three nucleotides) within the coding region of mRNA by a complementary trinucleotide motif, known as the anticodon, carried on one arm of the tRNA. The tRNA anticodon base pairs to the mRNA codon within the ribosome, thereby supporting protein elongation (for review, see Krebs et al., 2012). While neither as large nor as abundant as rRNA, the smaller tRNA species play a central role in translation.

mRNA is the most diverse of all the transcripts. Ironically, even though mRNA is by far the least abundant of all transcript types, it is the mRNA that drives the phenotype of the cell. mRNA alone directs the synthesis of proteins through the use of the cellular translation apparatus.
There is wide variation in the number and abundance of RNA species in the cell; the abundance of a specific type of RNA is subject to dramatic change as the demands on the cell change. Some mRNAs are present in hundreds of copies per cell while others are present in only a few copies per cell; this aspect of the RNA profile of the cell can be problematic because very low abundance transcripts are sometimes difficult to detect even with sensitive contemporary techniques.

TRANSCRIPTION AND THE CENTRAL DOGMA

According to the central dogma (Crick, 1957) of molecular biology, the expression of hereditary information flows from genomic sequences (DNA), through an mRNA intermediate, to ultimate phenotypic manifestation in the form of a functional polypeptide (Fig. 1.5). Whereas this design mirrors what occurs naturally in both prokaryotic and eukaryotic cells, certain “violations” have been observed in nature: (1) accompanying the discovery of the retroviral enzyme reverse transcriptase (RNA-dependent DNA polymerase) (Baltimore, 1970; Temin and Mizutani, 1970), by which RNA may serve as
  • 36. the template for the synthesis of DNA, and (2) the discovery of RNA editing (Benne et al., 1986; reviewed by Nishikura, 2010), in which a transcribed sequence is subject to alteration.

Transcription is the process by which a single-stranded RNA molecule is synthesized at a specific chromosome locus; this is the first of several steps in what is commonly referred to as RNA biogenesis. Transcription occurs in the nucleus (and mitochondria and chloroplasts) of eukaryotic cells, and in the common cellular compartment in prokaryotic cells. All phases of transcription are subject to variation and are potential control points in the regulation of gene expression. A transcriptional unit is best thought of as a DNA sequence that manifests appropriate signals for the initiation and termination of transcription and is capable of supporting the synthesis of a primary RNA transcript.

The process of transcription is so-named because the transfer of information from DNA to RNA is in the same language, namely the language of nucleic acids. In contrast, the process known as translation is so-named because nucleic acid instructions in the form of mRNA are used to direct the assembly of a primary polypeptide from amino acid precursors: the nucleic acid instructions are executed in (translated to) the language of proteins. The ribosome is the organelle of polypeptide synthesis in all cells, and each ribosome independently directs the sequential linkage of amino acids as the associated mRNA is interpreted. Upon completion of translation

[Figure 1.5 diagram: Central dogma of molecular biology. Directional flow of genetic information: DNA → mRNA → protein → phenotype, via transcription, translation, and final manifestation; reverse transcription (creates cDNA); replication (2n→2n)]
FIGURE 1.5 The central dogma of molecular biology. The process of transcription produces mRNA while the process of translation produces protein.
Replication, the process by which DNA is duplicated, occurs during S phase in the eukaryotic cell cycle. cDNA, in contrast, is not found in the cell but is synthesized in vitro and is commonly used to measure transcriptional activity or to assay for the presence of an RNA virus.
  • 37. eukaryotic proteins are typically modified, sorted, packaged, and directed to their proper subcellular location as they move through the endomembrane system, of which the endoplasmic reticulum and the Golgi apparatus are key components; prokaryotic and other eukaryotic proteins are often under the influence of various small cytoplasmic RNA (scRNA) species that guide them to their proper destination.

As with RNA, and to a lesser extent DNA, proteins exhibit a marked capacity for higher-order folding (Table 1.4). As with RNA, the functionality of a protein molecule is associated with its shape. Unlike RNA, however, in which the shape of the molecule is naturally dynamic, the distortion of the tertiary (3°) or quaternary (4°) structure of protein is associated with immediate loss of function.

PROMOTERS, TRANSCRIPTION FACTORS, AND REGULATORY ELEMENTS

Transcription is mediated by enzymes known as RNA polymerases. These enzymes, in conjunction with myriad proteins known as transcription factors,

TABLE 1.4 Higher-Order Folding of Nucleic Acids and Protein

Level | RNA | Protein | DNA
1° structure | Nucleotide order | Amino acid order | Nucleotide order
2° structure | Stem-loop structures and hairpins, which may include mismatches | α helices and β pleated sheets | Antiparallel base pairing between two complementary DNA strands
3° structure | Three-dimensional folding | Three-dimensional folding | Three-dimensional folding. Double helix: A-DNA, B-DNA, or Z-DNA
4° structure | Interaction of two or more folded RNA molecules, often by association with RNA binding proteins | Aggregation of two or more subunits | Interaction of double helical DNA with proteins, such as histone proteins, as in chromatin formation
Catalytic variants | Ribozymes | Enzymes | DNAzymes

The primary structure of nucleic acids and proteins is the order of monomers. The secondary structure of a molecule is the first level of folding that occurs as a consequence of its primary structure.
The tertiary structure is the three-dimensional arrangement of atoms within the molecule. The quaternary structure of a molecule, when it forms, is higher-order folding that results from interaction of the molecule with one or more identical or nonidentical molecules. For DNAzyme review, see Hollenstein, M. (2015). DNA catalysis: the chemical repertoire of DNAzymes. Molecules 20, 20777–20804.
  • 38. recognize very specific and highly conserved promoter, or initiation, sequences within the enormous complexity of genomic DNA. Promoters are spatially associated with the structural portion (body) of a gene (Fig. 1.6) and consist of several recognizable upstream nucleotide sequence motifs. These sequences are known as consensus sequences, a term used to describe the most commonly observed pattern of nucleotides at a particular location. For example, the symbol T₈₀A₉₅T₄₅A₆₀A₅₀T₉₆ indicates that thymine is the first base associated with this consensus motif 80% of the time, and so forth. The exact sequence and precise geometry of these regulatory elements can either promote or prevent the onset of transcription, and do so with varying degrees of efficiency.

Any promoter component that is located 5′, or upstream, from the transcription start site (TSS) is indicated with a “minus” sign in front of the actual nucleotide distance from the TSS. By convention, the first transcribed nucleotide is designated as +1, and any other nucleotides or features located 3′, or downstream, from the TSS are likewise designated with a “plus” sign placed in front of the actual nucleotide distance. Knowledge of promoter consensus sequence function is due largely to experiments involving standard DNA cloning techniques, site-directed mutagenesis, DNA sequencing, and in silico analysis.

In prokaryotic systems, the essential elements of the promoter region include the so-called −10 hexamer sequence, formerly known as the Pribnow box (or the Pribnow-Schaller box), consisting of the consensus sequence T₈₀A₉₅T₄₅A₆₀A₅₀T₉₆, and another conserved region located further upstream, known as the −35 sequence (T₈₂T₈₄G₇₈A₆₅C₅₄A₄₅). In some organisms, an AT-rich domain (the UP element) is also observed further upstream.
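The numbering convention and the idea of a consensus sequence can be made concrete with a short sketch. It assumes a sequence stored as a 0-based string of the coding strand, with the index of the first transcribed nucleotide supplied by the user; the function names are hypothetical and chosen only for this illustration. The percentages are those quoted above for the −10 hexamer.

```python
# Illustrative sketch only; names and the 0-based indexing are assumptions.
MINUS10_CONSENSUS = "TATAAT"                 # consensus bases of the -10 hexamer
MINUS10_PERCENT = [80, 95, 45, 60, 50, 96]   # how often each consensus base is observed


def tss_relative(index, tss_index):
    """Convert a 0-based index into promoter numbering.
    The first transcribed nucleotide is +1, the base immediately
    upstream of it is -1, and there is no position 0."""
    offset = index - tss_index
    return offset + 1 if offset >= 0 else offset


def label(position):
    """Format a position with an explicit plus or minus sign."""
    return f"+{position}" if position > 0 else str(position)


def consensus_matches(hexamer, consensus=MINUS10_CONSENSUS):
    """Count positions at which a candidate hexamer agrees with the consensus."""
    return sum(1 for a, b in zip(hexamer.upper(), consensus) if a == b)


if __name__ == "__main__":
    tss = 100  # hypothetical index of the first transcribed base
    for idx in (65, 90, 99, 100, 129):
        print(idx, label(tss_relative(idx, tss)))
    print(consensus_matches("TATGAT"), "of 6 positions match the consensus")
```

A count of matching positions is the crudest possible score; weighting each position by its observed frequency would be a natural refinement, but that is beyond what the text describes.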
The spacing between the −10 sequence and the −35 sequence is tightly constrained, with 17 base pairs being optimal, and variations in the length of the region between these two elements can reduce the efficiency of the promoter.

[Figure 1.6 diagram: promoter and structural portion of the gene; transcription yields the mRNA transcript; translation yields a folded, functional protein; cell function; phenotype]
FIGURE 1.6 Genes, some of which encode mRNA which, in turn, encodes proteins, are under the direct influence of a regulatory element known as a promoter.

In eukaryotic cells, promoters associated with nuclear genes are variable in structure; these variations are due to the presence of multiple nuclear
  • 39. RNA polymerases as well as the requisite transcription factor initiation complex that must form. Transcription factors are small proteins that are continually binding to and altering the shape of the chromatin. The remodeling of chromatin in the promoter locale is characterized by changes in the association between genomic DNA and the histone proteins which decorate it. This remodeling is best thought of as a type of histone displacement, the objective of which is to facilitate access to the gene promoter by altering the local architecture of the chromatin. This is an ATP-dependent process. Transient covalent modifications to histone proteins include acetylation, methylation, and phosphorylation. Generally speaking, histone acetylation is associated with the activation of transcription, while methylation commonly correlates with gene silencing. The net result is the activation, or silencing, of various subsets of genes in a temporal or environmentally induced manner.

Interestingly, promoters recognized by RNA polymerase II, the enzyme responsible for the synthesis of mRNA (discussed below), often display sequence similarity to prokaryotic gene promoters (Fig. 1.7). The eukaryotic promoter counterpart is known as the “TATA box,” formerly known as the Hogness box, and so-named because of the prevalence of the highly conserved TATAA motif. Point mutations involving any of these five bases strongly downregulate the function of that promoter. Another promoter component, the transcription initiation factor TFIIB recognition element (BRE), is directly adjacent to and upstream from the TATA box. The function of this heptanucleotide motif (often, GGGCGCC) is to attract TFIIB, a key element in the assembly of the transcription apparatus associated with RNA polymerase II. While at one time it was thought that all eukaryotic promoters manifest a TATA box, this is now known to be untrue.
Instead, these rather prevalent TATA-less promoters are typically characterized by an initiator region (INR) and a downstream promoter element (DPE), which is observed approximately 30 base pairs downstream (+30) from the TSS. The motif ten element (MTE) exclusively maps to +18 through +27 and is located downstream from INR and immediately upstream from the DPE. At least one function associated with the MTE is its ability to act in place of an absent TATA box.

[Figure 1.7 diagram, 5′ to 3′, upstream to downstream: G-box (GGGCGG)n, CAAT box, BRE, TATA box, initiator at the transcription start site (TSS, +1), MTE, DPE, and the nascent transcript (pre-mRNA); position labels −110, −75, −35, −30, +22, +30]
FIGURE 1.7 Generalized structure of a eukaryotic gene promoter. See text for details.

In addition to the TATA, DPE, and MTE promoter structural components, another promoter motif is the “CAAT box,” found in several but not all promoters, and so-named because of the conservation of its sequence. When present in eukaryotic promoters, the TATA box is usually
  • 40. centered at −30 and the CAAT box appears around −75, though the CAAT box has been shown to function quite effectively much further upstream, and even in reverse orientation. These elements appear to control initial binding of the RNA polymerase and promoter efficiency, respectively. Another frequently observed promoter element is the sequence (GGGCGG)n, known as the G-box element or simply as the GC box. Present in one or more copies, this GC-rich region is generally observed between −90 and −120 within the promoter region. Interestingly, it appears that there is no one component or organization that is shared by all promoters, though the particular permutation of promoter elements and distances between them is recognizable as a transcription initiation regulator. Succinctly, by comparison with transcription in prokaryotic cells, the elaborate initiation of eukaryotic transcription requires the presence of numerous transcription factors, coactivators, and transcription activator proteins that bind to these cis-acting components which, collectively, make up a promoter. The widely accepted role of early transcription factor binding to gene promoters is to recruit RNA polymerase to that site so as to ultimately initiate transcription. Rather than being thought of as merely an on–off switch associated with a particular gene, a promoter functions more like a thermostat that increases (upregulates) and decreases (downregulates) the expression of a gene in response to the prevailing local conditions acting upon a cell.

Eukaryotic promoters do not always function alone. Transcription in eukaryotic cells can be influenced profoundly by the presence of a regulatory element known as an enhancer, the function of which appears to be the stimulation of transcription. Enhancers were first discovered in the early 1980s; the precise location and orientation of an enhancer relative to the gene promoter varies from one gene to the next.
Some genes, including those which encode immunoglobulins, carry enhancers within the structural portion of the gene itself. Removal of enhancer sequences can reduce the transcriptional efficiency at a locus normally under the influence of that enhancer sequence, as can the binding of repressor proteins to functionally disparate DNA sequences known as silencers. In vitro transcription of genes that are not naturally associated with an enhancer element can be increased significantly if an enhancer is ligated to the DNA construct, usually in no particular orientation, and often hundreds, if not thousands, of base pairs away from the TSS. In vivo, a translocation event that brings a promoter and a gene into proximity can result in inappropriate expression of the gene, often with potentially catastrophic consequences, as in the case of Burkitt’s lymphoma (Taub et al., 1982).

The transcriptional influence of upstream and downstream enhancer sequences, and antagonistic silencer sequences, on gene promoters is well documented. Many enhancers, but not all, have been shown to be transcriptionally active (Djebali et al., 2012; Andersson et al., 2014), producing enhancer RNAs, or simply eRNAs. At present, the number of known eRNAs in human
  • 41. cells is in the tens of thousands and their transcription points to enhancer functionality in terms of promoting expression of the cognate gene (reviewed by Li et al., 2016). These noncoding transcripts are believed to recruit components of the transcription initiation complex; it is possible that RNA polymerase II may track to a transcription promoter by first identifying the enhancer itself. It is also possible that transcription of the intergenic area between the enhancer and the promoter may have a role in chromatin acetylation and ensuing remodeling in order to facilitate transcription initiation at the promoter (Gribnau et al., 2000). The sequential binding of transcription factors and ancillary components in the immediate vicinity of the gene locus ultimately results in the formation of a loop and concomitant spatial juxtaposition of the components of the template DNA needed to support the initiation of transcription. Succinctly, enhancers perform their function by increasing the concentration of transcription activator proteins in the vicinity of the associated promoter.

During transcription, the two strands of the gene being transcribed have different names and different roles. The strand that actually serves as the template upon which RNA is polymerized is properly referred to as the template strand. The other strand, which does not act in a template capacity, is called the coding strand. The coding strand is also known in some circles as the sense strand, while the template strand may be referred to as the antisense strand. The choice of nomenclature is purely a matter of personal preference. When publishing a gene sequence, the convention is to report the sequence of the coding strand, written 5′ to 3′, from left to right. The implication is that the template strand is base-paired to the coding strand and lying antiparallel to it and therefore does not need to be reported.
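The relationship among the coding strand, the template strand, and the transcript can be sketched in a few lines. This is an illustration under simple assumptions: sequences are plain strings over A, C, G, and T, every sequence is written 5′ to 3′, and the function names are hypothetical.

```python
# Illustrative sketch only; all sequences are written 5' -> 3'.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def template_strand(coding):
    """Return the template strand, written 5' -> 3'.
    It is complementary and antiparallel to the coding strand,
    i.e., the reverse complement."""
    return coding.upper().translate(DNA_COMPLEMENT)[::-1]


def transcribe(template):
    """Return the RNA polymerized on a template strand, written 5' -> 3'.
    The transcript is complementary and antiparallel to the template,
    with uracil in place of thymine."""
    return template.upper().translate(DNA_COMPLEMENT)[::-1].replace("T", "U")


if __name__ == "__main__":
    coding = "ATGGCCTAA"  # hypothetical coding strand
    template = template_strand(coding)
    rna = transcribe(template)
    print("coding  :", coding)
    print("template:", template)
    print("RNA     :", rna)
```

The transcript carries the same sequence as the coding strand, with U substituted for T, which is why reporting the coding strand alone is sufficient.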
The DNA template strand is so-named because the precise sequence of nucleotides inserted into the nascent RNA transcript is determined by, and complementary to, the template strand nucleotide sequence. It is important to realize that the coding strand and the template strand may switch roles depending upon the placement of transcriptional promoter sequences (Fig. 1.8).

[Figure 1.8 diagram: a DNA insert flanked by an SP6 promoter and a T7 promoter in opposite orientations; the strand labeled template for one promoter is labeled coding for the other]
FIGURE 1.8 Promoters positioned in the opposite orientation relative to a DNA sequence allow the template and coding strands to switch roles during transcription. This arrangement permits the synthesis of (+) RNA and (−) RNA from the same DNA construct.

One powerful example of this phenomenon in vitro is the cloning of a double-stranded DNA between two different transcription promoters in opposite orientations; often the bacteriophage polymerase promoters SP6, T3, or T7 are selected because of their high efficiency. Constructions such as these are frequently employed
  • 42. to accommodate in vitro transcription of large amounts of sense and/or antisense RNA for use as nucleic acid probes (see Chapter 16: Nucleic Acid Probe Technology) or for RNAi applications (see Chapter 11: RNA Interference and RNA Editing).

GENE AND GENOME ORGANIZATION AFFECT TRANSCRIPTION

In order to understand the significance of the products of transcription, it is first essential to understand the organization of the genes themselves. The typical prokaryotic genome exhibits little extraneous baggage. Frequently, genes that encode proteins associated with a common metabolic pathway are clustered together, as suggested by the operon model (Jacob and Monod, 1961). The lac operon, the gene products of which facilitate the metabolism of lactose as a carbon source in bacteria, is but one extremely well-characterized example. The RNA molecule that results from the transcription of an operon is usually polycistronic, meaning that more than one polypeptide is encoded in a single RNA transcript.

[Diagram: polycistronic mRNA, 5′ to 3′: Protein 1, intercistronic region 1, Protein 2, intercistronic region 2, Protein 3]

The coding information within a polycistronic mRNA for each polypeptide is contiguous: there are no interruptions in the coding sequences by noncoding information. This design favors maximum efficiency of energy resource utilization in unicellular organisms.

In fact, the kinetics of prokaryotic gene expression are so rapid that bacterial mRNA is usually being transcribed, undergoing translation, and being degraded simultaneously. The rapid turnover of RNA in this manner has, in the past, frustrated valiant attempts to clone or otherwise characterize prokaryotic mRNA. While significant improvements favoring the isolation of high quality RNA from both Gram-negative and Gram-positive bacteria have been made, and many of these innovations are available in kit form, the isolation of intact prokaryotic RNA remains something of a challenge in many laboratories.
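The layout of a polycistronic transcript described above, with several contiguous coding regions separated by intercistronic regions, can be illustrated with a minimal open reading frame scan. This is a sketch under strong simplifying assumptions: each cistron is taken to begin at the first available AUG and to end at the first in-frame stop codon, and signals that guide real ribosome binding are ignored. The function name and the example sequence are hypothetical.

```python
# Illustrative sketch only: a naive scan for AUG...stop reading frames.
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_cistrons(mrna):
    """Return (start, end) index pairs, 0-based and end-exclusive, for
    non-overlapping reading frames found while scanning 5' -> 3'."""
    mrna = mrna.upper()
    cistrons = []
    i = 0
    while i <= len(mrna) - 3:
        if mrna[i:i + 3] == "AUG":
            # Walk codon by codon looking for the first in-frame stop.
            for j in range(i + 3, len(mrna) - 2, 3):
                if mrna[j:j + 3] in STOP_CODONS:
                    cistrons.append((i, j + 3))
                    i = j + 3  # resume scanning after this cistron
                    break
            else:
                i += 1  # no in-frame stop codon; keep scanning
            continue
        i += 1
    return cistrons


if __name__ == "__main__":
    # Hypothetical transcript: leader, cistron 1, intercistronic region, cistron 2.
    transcript = "GG" + "AUGAAAUAA" + "CCCC" + "AUGGGGUGA" + "A"
    for start, end in find_cistrons(transcript):
        print(start, end, transcript[start:end])
```

The bases lying between one reported end and the next reported start correspond to an intercistronic region in this toy transcript.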
In contrast to prokaryotes, nearly all eukaryotic mRNAs are monocistronic. Although a single polypeptide species results from the translation of a particular monocistronic eukaryotic mRNA molecule, that same mRNA is subject to repeated translation as long as the transcript remains biologically competent and chemically stable. To maximize translation potential, an mRNA transcript is often engaged by several ribosomes that are all involved in simultaneous, orderly translation of that transcript. Such a cluster of ribosomes attached to a single mRNA molecule is known as a polysome (or polyribosome). Polysomes are observed both in prokaryotic and eukaryotic cells, though eukaryotic polysomes, with 7–8 ribosomes per polysome
  • 43. complex, tend to be smaller than their prokaryotic counterparts. Succinctly, a large number of polypeptide molecules can be manufactured from a single RNA molecule. Polysomes are observed free-floating in the cytoplasm, they can be membrane-bound, and sometimes are attached to the cytoskeleton (Lenk et al., 1977; Davies et al., 1991). The entirety of mRNAs so engaged in a cell at any given moment is known as the polysome fraction, which can be used to assess the translational competence of a cell under a defined set of experimental conditions.

Close examination of eukaryotic genes reveals that for a vast majority of genes there are considerably more nucleotides within a particular locus than are necessary to direct the synthesis of the corresponding polypeptide; that is, the DNA sequence and the amino acid sequence are not colinear over the span of the locus. This size differential can also be observed at the level of the mature mRNA in the cytoplasm, which is usually quite a bit shorter than the DNA sequence from which it was transcribed. Upon further scrutiny, this discrepancy can be resolved at the level of the organization of the structural portion of the gene itself, the sequences within which fall into one of two categories:

1. Exons are regions of DNA that are represented in the corresponding mature mRNA. Exons may or may not have a peptide coding function.

2. Introns are regions of DNA that are transcribed but generally are not represented in the corresponding mRNA. Introns are usually spliced out of the primary RNA transcript (the immediate product of transcription), accompanied by the joining of adjacent exon sequences. The majority of introns do not direct polypeptide synthesis, though there are several noteworthy exceptions (for review, see Farrell and Bassett, 2007).
[Diagram: gene structure, 5′ to 3′: Exon 1, Intron 1, Exon 2, Intron 2, Exon 3]

The number and the length of exons and introns associated with a gene are highly variable depending upon locus, and this variability even pertains to loci that are highly conserved across evolutionary time. By comparison with introns, which can be several thousand base pairs in length, exons tend to be rather short, each encoding fewer than 100 amino acids in most organisms. In some cases the high sequence conservation in one or more exons of a gene has been directly responsible for the isolation of a related gene (an ortholog) from a different organism. In some unusual cases, genes lack introns altogether, of which human β-interferon and thrombomodulin are examples.

The base sequence of the primary RNA transcript correlates precisely with the DNA from which it is derived, meaning that it contains both exon and intron sequences. These primary transcription products are only precursors to functional mRNA, and are confined to the eukaryotic nucleus where,
  • 44. appropriately, they are collectively known as heterogeneous nuclear RNA (hnRNA) or simply pre-mRNA. hnRNA and specific nuclear proteins that bind to it form rather abundant heterogeneous nuclear ribonucleoprotein complexes (hnRNPs). Similarly, mRNAs exist in the cytoplasm, following intron removal, as messenger ribonucleoprotein (mRNP) complexes after having traversed a nuclear membrane channel. In order to promote unidirectional movement, the combination of proteins associated with the mRNP is changed immediately upon arrival in the cytoplasm. The ensuing remodeling ensures that the mRNP, and the mRNA that it carries, is unable to travel back into the nucleus.

Introns vary dramatically in number, length, and base sequence and often exhibit multiple translation termination (stop) codons in all reading frames. This is not entirely unexpected because the noncoding nature of introns favors the accumulation of mutations that might otherwise be lethal if they were to occur within an exon or other critical area. Examination of the splice junctions of introns, however, reveals a strict conservation of two dinucleotide consensus sequences (Breathnach and Chambon, 1981; Mount, 1982) contained entirely within the intron. Proceeding from the 5′ end of the RNA, introns are found to begin with a GU dinucleotide (known as the left or donor site) and end with an AG dinucleotide (known as the right or acceptor site). The so-called GU-AG rule, describing exon–intron splice sites, refers to the RNA sequence; the corresponding DNA coding strand dinucleotide at the 5′ end (beginning) of an intron is GT. While this pattern was once believed to occur 100% of the time in higher eukaryotes, some exceptions have been noted (Szafranski et al., 2007), and the consensus phenomenon does not apply to yeast mitochondrial or tRNA genes, nor to chloroplast loci (Krebs et al., 2012).
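Intron removal and the GU-AG rule can be illustrated together in a brief sketch. It assumes that intron coordinates are already known and are supplied as sorted, non-overlapping, 0-based, end-exclusive intervals on the primary transcript; the function names and the example sequence are hypothetical. The sketch simply rejects any supplied intron that does not begin with GU and end with AG, which a real analysis would not do given the exceptions noted above.

```python
# Illustrative sketch only; intron coordinates are assumed to be known.
def follows_gu_ag(intron):
    """True if an RNA intron begins with GU (donor) and ends with AG (acceptor)."""
    intron = intron.upper()
    return len(intron) >= 4 and intron.startswith("GU") and intron.endswith("AG")


def splice(pre_mrna, introns):
    """Remove the given introns and join the flanking exons.
    introns: sorted, non-overlapping (start, end) pairs, end-exclusive."""
    pre_mrna = pre_mrna.upper()
    exons = []
    position = 0
    for start, end in introns:
        intron = pre_mrna[start:end]
        if not follows_gu_ag(intron):
            raise ValueError(f"intron at {start}-{end} does not follow the GU-AG rule")
        exons.append(pre_mrna[position:start])  # exon upstream of this intron
        position = end
    exons.append(pre_mrna[position:])           # final exon
    return "".join(exons)


if __name__ == "__main__":
    exon1, intron1, exon2 = "AUGGCC", "GUAAGUCCAG", "UUUUAA"
    primary_transcript = exon1 + intron1 + exon2
    print(splice(primary_transcript, [(6, 16)]))
```

The spliced product is shorter than the primary transcript by exactly the combined length of the introns removed, which is the size differential between gene and mature mRNA noted earlier.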
[Diagram, 5′ to 3′: Exon(n), GU, intron, AG, Exon(n+1)]

The nucleotides immediately adjacent to both sides of the GU and AG intron boundaries are also conserved to an extent, typically 60%–80%, and a point mutation at a splice site generally results in the inactivation of that site. In some cases, splice-site mutations can result in the production of an aberrant mRNA through the use of an alternative splice site, often located within the intron (Triesman et al., 1982), as in certain β-thalassemic individuals. Knowledge of the high conservation of splice sites is the basis of a method for exon identification known as exon trapping (Duyk et al., 1990; Péterfy et al., 2000), in which a putative exon-containing sequence is cloned into a specialized vector that consists of an intron flanked by two known exons (exon–intron–exon); if an exon is present in the experimental DNA, it will be trapped by ligation into the vector intron and will result in a longer transcript that can be detected electrophoretically or by melting curve analysis. The exon trapping method has fallen out of favor due to the low
  • 45. cost and ready availability of cDNA sequencing and an extensive repertoire of tools for in silico analysis.

The mechanics of intron removal and exon ligation, which occur cotranscriptionally, i.e., while the RNA polymerase is still active, are mediated in part by a highly conserved family of small nuclear RNAs (snRNA; 100–300 bases). These molecules exist as RNA–protein complexes, known as U1, U2, U4, U5, and U6, and are confined to the nucleus where they are referred to as small nuclear ribonucleoproteins (snRNPs, or snurps). The snRNPs, along with many other proteins that act as splicing factors, form enormous complexes known as spliceosomes, which are known to mediate pre-mRNA splicing.

[Diagram: hnRNA, an unspliced precursor of mRNA (5′ Exon 1, Intron 1, Exon 2, Intron 2, Exon 3 3′), is converted through spliceosome formation to mRNA after splicing (5′ Exon 1, Exon 2, Exon 3 3′)]

Closely associated with the capacity for transcript splicing are Cajal bodies (CBs), small organelle-like, punctate structures in the nucleoplasm that were first observed more than a century ago (Cajal, 1903). Unlike organelles, these structures are nonmembrane bound; they are spherical in appearance, with a typical diameter of 0.5–1.0 μm. CBs are characterized by a high concentration of the protein coilin and are packed with RNA. They are dynamic in that they are visible at certain times and not others, which may be related to differentiation, development, and even progression through the cell cycle. Found in higher eukaryotic plant and animal cells, CBs represent transcriptionally active regions, particularly the histone loci, and are also linked to ribosome biogenesis and telomere upkeep. However, one of the best known functions of CBs is their factory-like role in the assembly of snRNPs associated with mRNA splicing.

Intron removal has also been demonstrated to have a role in nuclear export of the spliced and matured mRNA.
The proteins involved in the splicing mechanism and exon concatenation recruit additional proteins that are specifically required for nuclear egress. Among the proteins in the resulting exon junction complex (EJC) are the ALY/REF export adapter, which binds directly to the RNA, and the TAP-p15 export receptor complex, each of which has a direct role in nuclear pore engagement (reviewed by Grünwald et al., 2011). Once on the cytoplasmic side of a nuclear pore, the mRNA sheds the array of proteins that facilitated its nuclear egress. This ensures unidirectional movement. Improperly spliced or otherwise compromised mRNAs fail to associate with the correct combination of proteins required
  • 46. for nucleocytoplasmic transport, thereby promoting their retention in the nucleus and rapid degradation. Although intron removal and the splicing together of exons is not, in and of itself, required for transport from the nucleus, since intron-less transcripts move efficiently into the cytoplasm, splicing clearly enhances transport. The export process used for rRNA and other RNAs is less clear. Splicing of pre-mRNA molecules also produces some unexpected results. Once believed to be a rare consequence of a spliceosomal machinery error, circular RNAs (circRNAs) are a well-documented consequence of a phenomenon known as backsplicing, a process by which the 3′ end of exon “n” is joined covalently to the 5′ end of the same exon, i.e., exon “n,” rather than to the 5′ end of exon “n + 1”. Consisting of one or two exons, the known eukaryotic circRNAs number in the thousands. Nearly all circRNAs are exon-encoded sequences; intronic circRNA sequences are almost completely unknown. The highest incidence of circRNAs is found in the mammalian brain, with the greatest density of these molecules observed in the synaptic region of nervous tissue cells (Rybak-Wolf et al., 2015). circRNAs are believed to play a role in neuronal differentiation, as their expression is upregulated during brain development (You et al., 2015). Surprisingly, the number of circRNAs from some genetic loci exceeds that of the linear counterpart by a factor of as much as 10 (Salzman et al., 2012). There is also speculation that circRNAs may absorb miRNAs (described below and in Chapter 10: miRNA), sequestering them as a means of controlling the expression of specific genes associated with a metabolic or developmental pathway. Similarly, the formation of circRNAs may regulate the concentration of certain types of the more than 1000 known RNA binding proteins by transiently attracting them. 
Since these molecules are the result of splicing events intended to bring exons together, it is reasonable to assume that many gene loci are capable of producing an even greater array of processed transcripts, further diversifying the transcriptome. It is clear, however, that certain exons are greatly favored in the formation of circRNA transcripts; this is probably controlled by the presence of repetitive sequences residing in the flanking introns and trans-acting factors associated with splicing that collaborate to control transcript circularization (Kramer et al., 2015). circRNAs are nonpolyadenylated and localized primarily in the cytoplasm; the lack of a poly(A) tail therefore excludes their identification, characterization, and abundance measurement by poly(A)-based RNA-seq. The nonlinear shape of circRNA imparts great stability to these molecules. The extended half-lives of these molecules presumably allow them to perform their intended function(s), whatever they may be, for as long as possible. Even though circRNAs are derived from genes that encode proteins, circRNAs are a type of noncoding RNA. This is in sharp contrast, structurally and functionally, to the transient circular shape ordinarily assumed by mRNAs in order to stabilize them and enhance translation.
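The distinction between canonical splicing and backsplicing can be illustrated with a short Python sketch. The exon sequences, the function names, and the `flank` parameter are hypothetical and chosen only for illustration; the sketch simply builds the sequence that spans each kind of junction.

```python
def linear_junction(exon_n: str, exon_next: str, flank: int = 4) -> str:
    """Canonical splicing: the 3' end of exon n is joined to the 5' end of exon n+1."""
    return exon_n[-flank:] + exon_next[:flank]


def backsplice_junction(exon_n: str, flank: int = 4) -> str:
    """Backsplicing: the 3' end of exon n is joined to the 5' end of the same exon."""
    return exon_n[-flank:] + exon_n[:flank]


# Hypothetical exon sequences (not taken from any real gene).
exon_2 = "GGAUCCAAGCUU"
exon_3 = "CCCGGGAAAUUU"

print(linear_junction(exon_2, exon_3))  # junction expected in the linear mRNA
print(backsplice_junction(exon_2))      # junction expected only in the circRNA
```

A junction of the second kind places the end of an exon upstream of its own beginning, an arrangement that is absent from the linear transcript, which is why such junction sequences are commonly treated as a signature of circularization.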
  • 47. Finally, the excision of certain introns in RNA can also occur as the result of RNA self-cleavage. Catalytic RNAs, generically referred to as ribozymes, were first described by Cech et al. (1981). In particular, the group I intron ribozyme and RNase P ribozyme are well known because they were the first two ribozymes to be discovered (Kruger et al., 1982; Guerrier-Takada and Altman, 1984). These discoveries led to the awarding of the Nobel Prize in Chemistry in 1989 to Thomas Cech and Sidney Altman. Extensive information about this fascinating aspect of RNA biology can be found elsewhere (for reviews, see Ferré-D’Amaré and Scott, 2010).

RNA POLYMERASES AND THE PRODUCTS OF TRANSCRIPTION

Genes are transcribed by enzymes known as RNA polymerases, which produce different types of RNA in the cell, including rRNA, tRNA, and mRNA, as well as a variety of very important noncoding (large and small) RNA species. In the nucleus, eukaryotic genes are transcribed by one of three major RNA polymerases; these enzymes are among the largest and most complex proteins in the cell and consist of more subunits than their prokaryotic counterpart. The eukaryotic enzymes are properly known as RNA polymerases I, II, and III, each of which is responsible for transcribing a different class of genes (Table 1.5). RNA polymerases are active only in the presence of DNA, require nucleotide precursors (ATP, CTP, GTP, and UTP) and myriad transcription factors, and function in a Mg²⁺-dependent manner. In addition, eukaryotic RNA polymerases differ in the sensitivity each exhibits to the bicyclic octapeptide fungal toxin α-amanitin (extracted from the poisonous mushroom Amanita phalloides) (Roeder, 1976; Marzluff and Huang, 1984), applications for which are described in Chapter 19, Analysis of Nuclear RNA. Prokaryotes, in contrast, exhibit only one type of RNA polymerase, which transcribes all classes of RNA. 
In plants, RNA polymerase IV (RNAP IV) and RNA polymerase V (RNAP V) have also been identified. These enzymes are homologs of the well-characterized, mRNA-producing enzyme RNA polymerase II. RNAP IV and RNAP V have collaborative roles in the synthesis and activation of miRNA species involved in silencing pathways via RNA-directed DNA methylation (RdDM) (Onodera et al., 2005; Herr et al., 2005; Kanno et al., 2005; Zhang et al., 2007; reviewed by Haag and Pikaard, 2011; Huang et al., 2015). Although products of RNAP IV and RNAP V have a direct influence on numerous biochemical processes, including growth and development and defense mechanisms against viruses, these two enzymes are not essential for cellular viability (Onodera et al., 2005; Pontier et al., 2005). In mammals, a fourth RNA polymerase [single-polypeptide nuclear RNA polymerase IV (spRNAP-IV)] has likewise been described (Kravchenko et al., 2005). This polypeptide is a nuclear isoform of an RNA polymerase
  • 48. TABLE 1.5 Eukaryotic RNA Polymerase Enzymes and Their Respective Products and Sensitivities to α-Amanitin

Eukaryotic Enzyme | Site of Action | Products | Sensitivity to α-Amanitin
RNA Polymerase I | Nucleolus | rRNA (28S, 18S, 5.8S) (a) |
RNA Polymerase II | Nucleoplasm | hnRNA → mRNA; lincRNA, miRNA (b), snRNA, scRNA (c) | ++++
RNA Polymerase III | Nucleoplasm | tRNA, 5S rRNA, snRNA, snoRNA, scRNA, miRNA | +
RNA Polymerase IV: Pol IV (plants) | Nucleoplasm | miRNA involved in heterochromatin methylation and gene silencing |
RNA Polymerase IV: spRNAP-IV (mammalian) | Mitochondria | mRNA (subset) |
RNA Polymerase V (plants) | Nucleoplasm | miRNA involved in heterochromatin methylation and gene silencing |
RdRPs (RNA-dependent RNA polymerases) | Cellular | Amplification of miRNA for gene silencing; viral replication |
cp RNA polymerase (PEP) | Chloroplasts (d) | Chloroplast gene transcripts |
mt RNA polymerase (mtRNAP) | Mitochondria, but encoded in the nucleus | Mitochondrial gene transcripts |

(a) RNA polymerase I is considered the most specialized of the three canonical nuclear polymerases because the members of this highly repetitive gene family that it transcribes are virtually identical, as are the resulting transcripts.
(b) While most known miRNAs are transcribed by RNA polymerase II, an increasing number of these important regulatory transcripts has been shown to be transcribed by RNA polymerase III.
(c) Certain scRNAs and snRNAs are transcribed by RNA polymerase II while others are transcribed by RNA polymerase III.
(d) Chloroplast genes are also transcribed by nuclear-encoded RNA polymerases.
  • 49. localized in the mitochondria and was first observed in HeLa cells. spRNAP-IV and the related polypeptide mitochondria-targeting RNA polymerase (mtRNAP) are both transcribed from the same gene locus, POLRMT. As a consequence of alternative splicing, however, spRNAP-IV is a truncated polypeptide which, compared to mtRNAP, lacks 262 amino acids near the amino terminus, including the mitochondrial transit sequence. Thus, spRNAP-IV remains in the nucleus and regulates a subset of perhaps as many as one thousand nuclear-encoded genes, though the level of cooperation between RNA polymerase II and spRNAP-IV in gene transcription is unclear. Suppression of the spRNAP-IV gene seriously inhibits cell growth, gradually leading the cell down the apoptotic pathway. As with all nucleic acid molecules, RNA transcripts are assembled only in the 5′→3′ direction. Transcription involves three distinct phases, namely, initiation, elongation, and termination, the details of which are beyond the scope of this volume. Briefly, initiation involves the attachment of RNA polymerase to a DNA template promoter by association with transcription, activation, and initiation factors, followed by the acquisition of what will be the first ribonucleotide in the RNA molecule. Elongation involves the sequential addition of ribonucleotides to the nascent chain, a process also involving accessory elongation factors. Termination is the completion of RNA synthesis and the disengagement from the DNA template of both the newly synthesized RNA and the RNA polymerase that manufactured it. Transcription termination, as with initiation and elongation, is sequence-dependent and, at least in prokaryotes, is influenced by the presence of small proteins (termination factors) and often involves the transient formation of RNA secondary (hairpin) structures; different transcription termination strategies are used by each of the eukaryotic RNA polymerases. 
Transcription errors notwithstanding, the nucleotide sequence of the resulting RNA molecule is identical to that of the coding strand of the DNA from which it is derived, the only difference being the substitution of the base uracil for thymine. In eukaryotic cells, transcription of the genes encoding ribosomal RNA is mediated by RNA polymerase I. The primary product of RNA polymerase I transcription in higher eukaryotes is the very large unspliced 47S precursor rRNA which, at approximately 14,000 bases in humans, is the largest known precursor RNA in mammals. The 47S rRNA eventually yields the smaller 28S, 18S, and 5.8S rRNAs after processing (Fig. 1.9; reviewed by Henras et al., 2015). Between 80% and 85% of cellular RNA is found in ribosomes in the form of the 28S, 18S, 5.8S, and 5S rRNAs; these transcripts form complexes with myriad ribosome-specific proteins to form the 60S (28S, 5.8S, and 5S rRNAs) and 40S (18S rRNA) eukaryotic ribosomal subunits. As described in Chapter 13, Electrophoresis of RNA, the abundant 28S and 18S rRNAs serve as excellent natural molecular size markers when either total cellular RNA or total cytoplasmic RNA is electrophoresed. A comparison of eukaryotic ribosomes and other aspects of transcription and translation with their
  • 50. prokaryotic counterparts is presented in Table 1.6. RNA polymerase III is responsible for transcribing the genes that encode tRNA molecules, the 5S rRNA, certain repetitive elements, snRNA, snoRNA, scRNA, and some miRNA transcripts described above. Approximately 10%–15% of the total cellular RNA mass is tRNA. Thus, the vast majority of RNA in the cell represents the transcriptional products of RNA polymerase I and RNA polymerase III; rRNA and tRNA are rather undiversified and generally not subject to significant alteration of their expression profile as a function of the cell state. In sharp contrast to RNA polymerases I and III, the transcription products of RNA polymerase II are as diverse as the cellular biochemistry itself. This is not at all unexpected because the mRNA in great measure drives the phenotype of the cell. Of the 1–5 × 10⁻⁵ μg RNA harbored in a typical mammalian cell, the relative mass contribution of RNA polymerase II transcription products is generally between 20% and 40%, though only 1%–4% is mature mRNA because a great deal of all transcribed hnRNA is degraded in the nucleus, strongly suggesting a rationale for studying posttranscriptional regulation of gene expression. Interestingly, rRNA was first believed to have a template role in the synthesis of proteins. The first indications that a new, separate class of RNA, mRNA, acts as the template molecule emerged from studies involving T4 phage-infected Escherichia coli cells (Volkin and Astrachan, 1956; Brenner et al., 1961; Gros et al., 1961; Hall and Spiegelman, 1961). 
[Figure 1.9 schematic. Labels recoverable from the diagram: 47S pre-rRNA; 45S, 41S, 32S, and 21S intermediates; 18S, 5.8S, and 28S rRNA; 5S rRNA; ribosomal proteins; functional ribosome.]

FIGURE 1.9 One pathway of rRNA biogenesis in human cells. The 28S, 18S, and 5.8S rRNA are liberated from a common 47S transcript that is the product of RNA polymerase I. The other essential transcript, 5S rRNA, is produced independently by RNA polymerase III. Adapted, in part, from Lewin, Genes VI (1997), by permission of Oxford University Press.

Later, multiple 5′ ends were observed among late SV40 mRNAs (Ghosh et al., 1978) and late polyoma mRNAs (Flavell et al., 1979), the first indications of
  • 51. the heterogeneous nature of RNA polymerase II transcription initiation. The amount of mRNA may vary depending on a number of factors, including degree of differentiation. Genes that are transcribed by RNA polymerase II and that, after processing, yield functional mRNA are localized mostly within the nonrepetitive regions of the genome, as demonstrated by R₀t kinetics studies. A typical cell is transcribing a subset of several thousand different genes at any given moment depending on cell type and cell state. Given the complexity of the cellular biochemistry, this observed heterogeneity within the mRNA population is necessary to satisfy even basal-level requirements for viability, though it is remarkable to note that as much as 70% of all transcribed hnRNA is degraded in the nucleus. RNA polymerase II also has major responsibility for transcription of genes encoding miRNA. Another category of RNA polymerase, RNA-dependent RNA polymerase (RdRP), also known as RNA replicase, is an enzyme that can synthesize RNA from an RNA template, rather than the requisite DNA template associated with the other RNA polymerases. RdRP is well known for its role in the replication of poliovirus and other RNA viruses. It has come under much closer scrutiny because of the documented role of this enzyme in the RNA interference pathways, especially in higher plants, where it mediates the synthesis of double-stranded RNA (dsRNA) molecules which, upon cleavage, give rise to miRNAs/siRNAs. This aspect of RdRP function is discussed in detail in Chapter 11, RNA Interference and RNA Editing (RNA Interference).

HALLMARKS OF A TYPICAL mRNA

Many genes are transcribed constitutively by RNA polymerase II, and this enzyme has a role in the integration of associated nuclear events such as splicing and polyadenylation (Hirose and Manley, 2000). Of these numerous transcripts, large quantities of heterogeneous nuclear RNA (hnRNA) are turned over in the nucleus. 
This may partially be due to errors in transcription or posttranscriptional processing, which is yet another example of quality control on the part of the gene expression machinery. It is also possible that a not insignificant amount of hnRNA is retained in the nucleus to facilitate some aspect of the process of transcription. In any event, mRNAs in eukaryotic cells emerge from precursor hnRNA through a series of modifying reactions, which include formation of the 5′ cap, methylation of the cap, splicing, 3′-end processing, and frequently, polyadenylation. Transcripts are produced at different rates from different loci; therefore, each mRNA species is classified based on its cytoplasmic prevalence or, more properly, its abundance. There are three official categories (high abundance, medium abundance, and low abundance mRNAs) and, in the mind of this author, an unofficial very low abundance category.
  • 52. Highly abundant transcripts are present in hundreds of copies per cell. These are most often observed when a cell is producing an enormous quantity of a particular protein or is highly specialized or differentiated to perform a unique function. Medium abundance transcripts are best thought of as being present in dozens of copies per cell; many genes with housekeeping functions produce their mRNAs at this level in the cell. (A gene is said to have a housekeeping function if the encoded gene product plays a maintenance role or if basal expression is needed to maintain the physiology of the cell or even viability. Since the expression of these genes is generally constant, and not expected to change as a function of cell state, housekeeping genes are assayed as internal controls in various quantitative assays. See Chapter 9, Quantitative PCR Techniques, for further details.) Low abundance mRNAs are generally present in 10 or fewer copies per cell and often are difficult to assay by many of the older classical techniques, such as Northern analysis (see Chapter 15: Northern Analysis). Very low abundance mRNAs are those present, on average, in fewer than one copy per cell, a designation which is generally associated with heterogeneous tissue samples or, very commonly, with cases in which cancer cells manifest a high degree of aneuploidy. In the past, low and very low abundance mRNAs were said to represent the “hard to clone genes,” though newer methods for the assay of gene expression have revealed a plethora of previously unknown transcripts of various persuasions. Most importantly, the prevalence or abundance of an mRNA species in a cell is subject to change of monumental proportions. Such changes may occur in response to natural changes in the cellular milieu or due to experimental manipulation. 
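The abundance categories described above can be summarized as a small Python sketch. The text gives only approximate ranges ("hundreds," "dozens," "10 or fewer," "fewer than one"), so the exact cutoffs used here (1, 10, and 100 copies per cell) and the function name `abundance_class` are assumptions made for illustration, not definitions from the source.

```python
def abundance_class(copies_per_cell: float) -> str:
    """Assign an mRNA to an abundance category from its average copies per cell.

    Cutoffs are illustrative assumptions: <1 very low, 1-10 low,
    >10 and <100 medium, >=100 high.
    """
    if copies_per_cell < 0:
        raise ValueError("copies_per_cell must be non-negative")
    if copies_per_cell < 1:
        return "very low"
    if copies_per_cell <= 10:
        return "low"
    if copies_per_cell < 100:
        return "medium"
    return "high"


# Hypothetical measurements, averaged over a cell population.
for copies in (0.2, 7, 36, 500):
    print(copies, abundance_class(copies))
```

Because the input is an average over many cells, a value below one copy per cell is meaningful and corresponds to the unofficial very low abundance category.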
A typical human fibroblast cell contains approximately 10 picograms (pg) of RNA, which is equivalent to about 10⁶ molecules transcribed from a particular subset of the 20,000 or so genes estimated to make up the human genome. While this mRNA heterogeneity reflects the diversity of proteins that these mRNAs encode, a typical eukaryotic mRNA molecule shares several structural features with nearly all other mRNA molecules (Fig. 1.10). As will become evident from the descriptions which follow, producing a functional mRNA molecule is amazingly complex.

[Figure 1.10 schematic, 5′ to 3′: 5′ cap; AUG initiation codon; coding region; stop codon (UAA, UAG, or UGA); AAUAAA polyadenylation signal; poly(A) tail (AAAAAAAAA).]

FIGURE 1.10 Topology of a typical eukaryotic mRNA molecule.
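The landmarks named in Fig. 1.10 can be located in a sequence with a few lines of Python. This is a minimal sketch under stated assumptions: the function name `annotate_mrna` and the toy sequence are hypothetical, the first AUG is taken as the initiation codon, and the 5′ cap is not represented because it is not part of the transcribed sequence.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def annotate_mrna(seq: str) -> dict:
    """Locate the basic landmarks of an mRNA given as a 5'->3' RNA string."""
    seq = seq.upper()
    start = seq.find("AUG")  # assumption: the first AUG is the initiation codon
    if start == -1:
        raise ValueError("no AUG initiation codon found")
    stop = None
    # Read codon by codon, in frame with the AUG, until a stop codon appears.
    for i in range(start + 3, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            stop = i
            break
    if stop is None:
        raise ValueError("no in-frame stop codon found")
    cds_end = stop + 3
    return {
        "five_prime_utr": seq[:start],
        "coding_region": seq[start:cds_end],  # includes the stop codon
        "stop_codon": seq[stop:cds_end],
        # 0-based index of AAUAAA downstream of the stop codon; -1 if absent
        "polyadenylation_signal_at": seq.find("AAUAAA", cds_end),
        # number of consecutive A residues at the 3' end
        "poly_a_tail_length": len(seq) - len(seq.rstrip("A")),
    }


# Toy transcript: 5' UTR + coding region + 3' UTR with AAUAAA + poly(A) tail.
toy = "GGCACC" + "AUGGCUUGGAAAUAA" + "CCGUGC" + "AAUAAA" + "GUCUGC" + "A" * 12
print(annotate_mrna(toy))
```

Real transcripts are less tidy than this toy example, so the sketch should be read as a way to fix the order of the features in mind rather than as an annotation tool.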
  • 53. 5′ Cap

A great majority of mature eukaryotic mRNA molecules are characteristically monocistronic polyribonucleotides produced by RNA polymerase II. The immediate product of transcription, a precursor hnRNA molecule, displays the following structure at the 5′ end of the molecule:

5′ pppRNNN.....3′

meaning that the first transcribed nucleotide contains a purine base (symbol for a purine = R), either adenine or guanine. Not unexpectedly, the 5′ triphosphate exhibited by the nucleotide remains intact and phosphodiester bonds join ribonucleotides sequentially in a 5′→3′ orientation. A structure known as the 5′ cap (Reddy et al., 1974; Shatkin, 1976; Banerjee, 1980) is assembled in a step-wise manner just after the initiation of transcription. Capping occurs as soon as the 5′ end of the transcript emerges from the RNA polymerase complex, when nascent polynucleotides are generally between 20 and 30 bases long (Coppola et al., 1983). The net result of the ensuing reactions that are required for capping is the formation of an unusual 5′–5′ triphosphate linkage between the first transcribed nucleotide (the original RNA) and a 7-methylguanosine (m⁷G) nucleotide, the polarity of which effectively seals the 5′ end of the transcript. In the realm of RNA biogenesis, 5′ capping is the first example of a posttranscriptional modification associated with gene expression; it is observed in virtually all eukaryotic mRNAs and is likewise a feature of most eRNA molecules. The formation of the 5′ cap begins with the addition of the terminal guanosine nucleotide (G) to the 5′ end of the nascent transcript. This occurs in the nucleus and first requires removal of the γ-phosphate from the 5′ end of the hnRNA by RNA triphosphatase. 
Subsequently, the terminal “G” is added by the enzyme guanylyltransferase, resulting in the structure

5′ GpppRNNN.....3′

The new terminal guanosine nucleotide is joined 5′–5′ to what was the first transcribed nucleotide via what is best thought of as an inverted linkage. The resulting 5′ cap structure is then subjected to one or more methylation events (the methyl group donor is S-adenosylmethionine; SAM), the first of which is directed toward the number 7 position of the terminal guanine, courtesy of the enzyme guanine-7-methyltransferase, and is represented as

m⁷G(5′)ppp(5′)RNNN.....3′

Most eukaryotic mRNAs subsequently experience an additional methylation directed toward the 2′ oxygen in the ribose of the second nucleotide, catalyzed by the enzyme 2′-O-methyltransferase. Depending on the mRNA, as
  • 54. many as four methylation events can occur involving the 5′ guanine nucleotide, the sugar of the penultimate and subpenultimate nucleotides, and may also include the formation of N⁶-methyladenosine (if the second nucleotide contains adenine). The extent of methylation is a function of mRNA species, and higher organisms usually have more extensively methylated caps. Cap structures with multiple methyl groups have also been observed in the small nuclear RNA species (Furuichi and Shatkin, 1989), though they are structurally different from the mRNA caps described here (reviewed by Matera et al., 2007). Capping of eukaryotic mRNAs also precedes rare internal methylation of adenine; when it does occur, this infrequent modification results in the formation of N⁶-methyladenosine early in mRNA biogenesis, and the modification is conserved during RNA processing (Shatkin, 1976; Chen-Kiang et al., 1979). The capping reaction associated with the 5′ terminus also has a role in regulating transcription itself. It appears that the RNA polymerase slows down or altogether pauses when the nascent hnRNA is no more than 30 nt in length, facilitating the recruitment of the capping enzymes. This might be thought of as a quality control checkpoint, as transcription resumes only when the capping process has been completed. The presence of the 5′ cap is also required to support initiation of translation in eukaryotes. Eukaryotic ribosomes distinguish mRNAs from non-mRNAs by the presence of the 5′ cap. Succinctly, rRNA and tRNA are not translated because they are not capped. When produced by in vitro transcription, RNAs must be subjected to an in vitro capping reaction if the transcripts are to be expected to support synthesis of the encoded protein. Formation of the translation apparatus is initiated in part by cap-binding proteins, particularly eIF4E (for review, see Krebs et al., 2012), followed by the assembly of the ribosomal subunits as mediated by initiation factors. 
Capping also confers transcript stability by protecting against phosphatase attack and 5′→3′ exonucleolytic degradation (Furuichi et al., 1977; Furuichi and Shatkin, 1989). In contrast, prokaryotic mRNAs, which naturally lack a 5′ cap structure, are degraded exonucleolytically from the 5′ end even while translation is ongoing downstream. Mitochondrial and chloroplast mRNAs are not capped, while most animal viruses that replicate in eukaryotic cells manifest 5′ capped mRNAs, a noteworthy exception being poliovirus (Hewlett et al., 1976; Nomoto et al., 1976). Finally, the 5′ cap has a role in nuclear egress. Export of mRNA into the cytoplasm occurs when the 5′ cap engages the heterodimeric nuclear cap-binding complex (CBC). Once through the nuclear pore, eIF4E replaces CBC; this substitution is needed to ensure efficient translation, as noted above.

5′ UTR (Leader Sequence)

The first nucleotides immediately 3′ to the eukaryotic cap structure constitute the 5′ untranslated region (5′ UTR), also known more casually as the