Serial Analysis of Gene Expression (SAGE)
Technology
By: Dr. Ashish C Patel
Assistant Professor
Vet College, AAU, Anand
Serial Analysis of Gene Expression
It is believed that the majority of biological phenomena found in a
variety of organisms can be explained by the quantity of gene
products.
Cellular functions under certain conditions at a certain time can be
understood by measuring which genes' mRNAs are present and how many
copies of each are present at that point in time.
Each cell contains mRNAs of more than 10,000 different genes, with
copy numbers per gene ranging from one to more than 10,000 and a
total of up to half a million mRNA transcript copies.
It is therefore practically impossible to determine them all individually.
Large-scale random cDNA sequencing by the EST project was very
useful for identifying unknown genes expressed in given
cells or tissues (Adams et al., 1991).
Figure: EST workflow. mRNA species 1 … n are converted to cDNA, inserted into plasmids to give cDNA clones, cut with RE, and assembled as EST 1 … n (across all sequencing projects).
Hence, sequencing = n x n times.
• However, this approach was not designed to quantify expressed
genes.
• The body mapping project (Okubo et al., 1992) attempted to
construct gene expression profiles of a number of cells and tissues
by random sequencing of a 3’-directed cDNA library.
• Fragments of about 300 bp from these 3’ regions were called gene
signatures, and each represented a particular mRNA species.
• By sequencing 1000 or more cDNA clones, they could obtain a
rough pattern of gene expression and identify mRNAs of the highly
abundant class.
• However, an expected weakness of both the EST and body mapping
projects is that one sequencing process yields only one cDNA
sequence.
• Mainly because of this low throughput, the profiles obtained by
the body mapping project unavoidably fell a long way short of
what was expected and demanded.
• More recent hybridization-based methods (DNA microarrays) using
immobilized cDNAs or oligonucleotides can potentially examine the
expression patterns of a relatively large number of genes, but these
methods can only examine expressed sequences that have already
been identified.
• In contrast, the SAGE method allows quantitative and
simultaneous analysis of a large number of transcripts in any
particular cell or tissue, without prior knowledge of the genes.
• Like the body mapping procedure, this method uses the 3’ portion
of the mRNA as the gene tag, but in a much shorter form
(9–10 bp). These tags can be serially connected before cloning into
a plasmid vector.
• Since the resulting plasmid clones contain multiple tags,
sequences of several dozens of mRNAs can be obtained by a
single sequencing reaction.
• The rapid and cost-saving sequencing made possible by this design
allows quantification and identification of a large number of cellular
transcripts.
• SAGE is based mainly on two principles: representation of
mRNAs (cDNAs) by short sequence tags, and concatenation of
these tags for cloning to allow efficient sequencing analysis.
• A hypothetical eukaryotic cell that contains seven mRNA
molecules composed of four species is depicted.
• To determine the gene expression profile of this cell by conventional
means, one would have to conduct several cDNA sequencing reactions.
• However, if each mRNA species can be represented by a short
unique sequence stretch (such as a 9 bp tag), the same purpose would be
attained by sequencing the tags, because a stretch as short
as 9 bp can distinguish 4^9 (262,144) transcripts, assuming a
random nucleotide distribution throughout the genome.
• If these tags could be connected into one long DNA
molecule, only a single sequencing reaction would be needed.
Principle of SAGE
The principle of SAGE. A hypothetical eukaryotic cell that
contains seven mRNA molecules composed of four species is shown
as a model. The boxed sequences are the tags specific to each mRNA species.
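The arithmetic behind the tag idea can be checked with a short Python sketch. The tag sequences below are invented for illustration (they are not taken from the figure), and `tag_space` and `expression_profile` are hypothetical helper names.

```python
from collections import Counter

def tag_space(tag_length: int) -> int:
    """Number of distinct tags of a given length over the alphabet A, C, G, T."""
    return 4 ** tag_length

def expression_profile(tags):
    """Count how many times each tag (one tag per mRNA species) was observed."""
    return Counter(tags)

# Hypothetical cell: seven mRNA molecules belonging to four species,
# each species represented by an invented 9 bp tag.
observed_tags = [
    "AAACCCGGG", "AAACCCGGG", "AAACCCGGG",  # species 1, three copies
    "TTTGGGCCC", "TTTGGGCCC",               # species 2, two copies
    "ACGTACGTA",                            # species 3, one copy
    "GATTACAGA",                            # species 4, one copy
]

print(tag_space(9))  # 262144
for tag, copies in expression_profile(observed_tags).most_common():
    print(tag, copies)
```

Counting identical tags is all that is needed to turn a list of sequenced tags into an expression profile, which is why the method is quantitative.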
SAGE Scheme
The SAGE method allows quantitative and simultaneous analysis of
a large number of transcripts in any particular cell or tissue.
Figure: mRNA species 1, 2 and 3, each with a poly(A) tail (AAAAA) and a 9–10 bp tag. The tags are extracted, concatenated in a plasmid and cloned.
SAGE Scheme
(1) Isolate the insert sequence from the plasmid.
(2) Sequence it: TAGCGG.. ATGCGGC.. TATTTTAGC… (mRNA tags of species 1, 2 and 3).
(3) Use a BLAST service to search the human genome (both strands shown):
    ATCGCC / TAGCGG: Annotated Gene 1
    TACGCCG / ATGCGGC: Annotated Gene 12
    ATAAAATCG / TATTTTAGC: Annotated Gene 34
Result: genes 1, 12 and 34 are expressed at a certain time, say during mitosis.
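The tag-to-gene step in the scheme above can be pictured as a simple lookup. The sketch below uses a plain dictionary as a stand-in for a BLAST search or a reference tag database, which is an assumption made for illustration; `REFERENCE` and `annotate_tags` are hypothetical names, and the unmatched tag `GGGCCC` is invented.

```python
# Hypothetical reference table: tag sequence -> annotated gene.
# The three tags and gene labels are the ones shown in the scheme.
REFERENCE = {
    "TAGCGG": "Annotated Gene 1",
    "ATGCGGC": "Annotated Gene 12",
    "TATTTTAGC": "Annotated Gene 34",
}

def annotate_tags(tags, reference=REFERENCE):
    """Return (tag, gene) pairs; tags without a match are labeled 'unknown'."""
    return [(tag, reference.get(tag, "unknown")) for tag in tags]

sequenced_tags = ["TAGCGG", "ATGCGGC", "TATTTTAGC", "GGGCCC"]
for tag, gene in annotate_tags(sequenced_tags):
    print(tag, "->", gene)
```

Tags that find no match are kept rather than discarded, since they may correspond to genes that have not yet been identified.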
SAGE procedure
Figure (steps shown):
(1) mRNA with a 3’ poly(A) tail (AAAAA) is annealed to an oligo(dT) primer (TTTTT), giving an mRNA-cDNA hybrid.
(2) The RNA is removed by RNase H.
(3) Double-stranded (ds) cDNA is synthesized.
Double-stranded cDNA is synthesized from mRNA using a biotinylated
oligo(dT) primer, because it anneals with high efficiency to the 3’ poly(A)
region present in most eukaryotic mRNAs.
SAGE procedure
Figure (steps shown):
(1) The ds cDNA is cleaved, leaving a 5’ GTAC cohesive end.
(2) The cDNA is bound to streptavidin beads.
(3) The bead-bound cDNA is divided in half.
The cDNA is then cleaved with a restriction enzyme (called the anchoring
enzyme), NlaIII.
The cDNA with a cohesive end at its 5’ terminus is immobilized by
binding to streptavidin-coated beads.
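Because the biotinylated oligo(dT) end is what stays on the beads, the retained fragment begins at the NlaIII site (CATG) closest to the poly(A) tail. A minimal in-silico sketch of finding that position and reading the adjacent tag follows. It assumes a 10 bp tag and a sense-strand sequence written 5’ to 3’; `extract_tag` is a hypothetical helper and the example sequence is invented.

```python
from typing import Optional

ANCHOR_SITE = "CATG"  # NlaIII recognition sequence

def extract_tag(cdna: str, tag_length: int = 10) -> Optional[str]:
    """Return the bases immediately 3' of the 3'-most anchoring-enzyme site.

    Returns None if the sequence has no site, or if fewer than
    tag_length bases follow the 3'-most site.
    """
    cdna = cdna.upper()
    site = cdna.rfind(ANCHOR_SITE)  # position of the 3'-most CATG
    if site == -1:
        return None
    start = site + len(ANCHOR_SITE)
    tag = cdna[start:start + tag_length]
    return tag if len(tag) == tag_length else None

example = "GGCATGAAACCCGGGTTTCATGACGTACGTAATTTT"  # invented sequence
print(extract_tag(example))  # ACGTACGTAA
```

The first CATG in the example is ignored; only the site nearest the 3’ end defines the tag, so each transcript contributes one tag position.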
SAGE procedure
Figure (steps shown):
(1) Linkers A and B, each carrying a recognition site for the tagging enzyme (TE RE site), are ligated to the GTAC cohesive ends of the two cDNA portions.
(2) Cleavage with the tagging enzyme (TE), e.g. BsmFI, releases each linker with a short stretch of cDNA (NNNNN…) attached, leaving an overlapping end. The linkers have an RE site for BsmFI or FokI.
(3) T4 DNA polymerase converts the overlapping ends to blunt ends.
Two independent linkers are ligated, using the NlaIII cohesive termini, to each portion of the cDNA.
SAGE procedure
Figure (steps shown):
(1) The blunt-ended linker-tag molecules from the two portions are ligated in a tail-to-tail orientation, giving linker A, tag, tag, linker B.
(2) The ligation product is amplified with primers A and B.
The two portions are mixed again and ligated. Because the 5’ ends of the
linkers are blocked by an amino group, only the mRNA-derived
termini can be ligated, in a tail-to-tail orientation.
SAGE procedure
Figure (steps shown):
(1) After one round of amplification, each product carries an anchoring enzyme recognition site (AE RE site, CATG) on either side of the ditag.
(2) Cleavage at the AE RE sites separates the linkers from the ditag.
(3) The ditags are isolated.
The amplified product is cleaved by NlaIII, the anchoring enzyme.
Ditag fragments flanked at both ends by NlaIII cohesive
termini are isolated and ligated to obtain concatemers.
SAGE procedure
Figure (steps shown):
(1) Ditags with CATG/GTAC cohesive ends are concatenated.
(2) The concatemer is inserted into a plasmid and cloned.
Tags from any number (n) of mRNA species can be concatenated.
One mRNA species gives two ds cDNAs joined by palindromic sequences.
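Since each ditag in a concatemer is bounded by CATG, a sequenced insert can in principle be split back into tags by cutting at CATG and reading one tag from each end of every ditag. The sketch below assumes 10 bp tags and drops repeated ditags as possible amplification artifacts; `split_concatemer` and the sequences are hypothetical, and this is not the algorithm of any particular SAGE software.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written with A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]

def split_concatemer(concatemer: str, tag_length: int = 10):
    """Split a concatemer at CATG and return the tags from each unique ditag.

    The left tag is read directly; the right tag sits in the opposite
    orientation (tail-to-tail ligation), so it is reverse-complemented.
    """
    tags = []
    seen_ditags = set()
    for ditag in concatemer.upper().split("CATG"):
        if len(ditag) < 2 * tag_length or ditag in seen_ditags:
            continue  # skip empty pieces, short fragments and duplicate ditags
        seen_ditags.add(ditag)
        tags.append(ditag[:tag_length])
        tags.append(reverse_complement(ditag[-tag_length:]))
    return tags

# Invented example: two different ditags, the first one occurring twice.
ditag_1 = "AAACCCGGGT" + reverse_complement("TTGACCAGTA")
ditag_2 = "GGATTTCCAA" + reverse_complement("CCTTAAGGAC")
insert = "CATG" + ditag_1 + "CATG" + ditag_1 + "CATG" + ditag_2 + "CATG"
print(split_concatemer(insert))
# ['AAACCCGGGT', 'TTGACCAGTA', 'GGATTTCCAA', 'CCTTAAGGAC']
```

One limitation of this simple split: if the junction of two tags happens to create a CATG, the ditag is cut in the wrong place, so a real pipeline would need to check ditag lengths more carefully.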
SAGE procedure
Figure: a cloned concatemer in the plasmid, with CATG-separated ditags carrying the tags of mRNA species no. 1, no. 2, no. 3, … no. n.
• SAGE is a tool for the study of gene expression, and a variety of
biological phenomena have been analyzed with it. Close to five million
tags in total had been analyzed by this method up to the year 2000.
• Table 1 shows the highly diverse types of cells and tissues studied
under a variety of physiological and pathological conditions. The
total number of tags collected varied from study to study.
SAGE- Serial Analysis of Gene Expression
Cancer studies (Lal et al., 1999)
• By comparing the gene expression profiles derived from
cancer and normal tissue of interest, a large number of
genes were identified as tumor specific.
• Northern blot hybridization analysis was usually
performed to confirm the differential expression
of these genes in a number of independently isolated
tissue samples of a similar nature.
• About half of the overrepresented genes identified by
SAGE were reproducibly present in these samples, while
the behavior of the other half was quite different. This may
reflect the heterogeneity among tumors from different
individuals.
Immunological studies
• Few SAGE analyses have been directly applied to the study of
immunological phenomena.
• Chen et al. (1998) reported the changes in gene
expression in rat mast cells before and after they were
stimulated through high-affinity receptors for immunoglobulin E.
• Genes that had not previously been associated with mast cells included
macrophage migration inhibitory factor and the receptors for growth
hormone-releasing factor and melatonin.
• Many other differentially expressed genes were
related to cell structure and cell motility, and there were numerous
unknown genes that showed no database match.
Yeast
• Yeast is widely used to clarify the biochemical and physiologic
parameters underlying eukaryotic cellular functions.
• The entire genome sequence has been determined (Goffeau,
1997) and the number of genes has been estimated to be about
6300.
• The total number of mRNA molecules has also been estimated, at about
15,000 per cell (Hereford and Rosbash, 1977).
• So, yeast was chosen as a model organism to evaluate the power
of the SAGE technology.
Drawbacks, problems and technical modifications
• Technical problems include the need for a relatively large amount of
mRNA and the relative difficulty of constructing tag libraries, among others.
• MicroSAGE (Datson et al., 1999) requires 500–5000-fold less starting
input RNA, and is simplified by the incorporation of a ‘one-tube’
procedure for all steps from RNA isolation to tag release.
• SAGE-lite is another, similarly devised protocol that allows the global
analysis of transcription from less than 100 ng of total starting RNA
(Peters et al., 1999).
Technical difficulty of the procedure:
• In the original SAGE protocol, the major products of PCR are often
linker-dimers. To minimize contaminating linker molecules, biotinylated PCR
primers were introduced, which generate biotinylated ditag products and thus
allow removal of the unwanted linkers by binding to streptavidin
beads at a later stage.
• Simply introducing a heating step at the final ligation step
yields cloned concatemers with an average of 67 tags,
compared with 22 tags obtained by the original protocol.
• A major problem of the SAGE approach is how to further
analyze the unknown tags.
• A conventional oligonucleotide-based plaque
lift method was employed successfully for the isolation and
cloning of a number of genes.
• However, with oligonucleotides only 13–14 bp in length it is almost
impossible to discriminate a one-base mismatch by
temperature-regulated DNA–DNA hybridization, which results in
numerous false positives.
• An RT-PCR-based method was developed to analyze the
corresponding genes and this approach utilizes identified tag
sequences and oligo-dT as PCR primers.
• Matsumura et al. (1999) reported a procedure to recover a
longer cDNA fragment by PCR using the SAGE tag sequence
as a primer, thereby facilitating the analysis of unknown genes
identified by tag sequence in SAGE.
• Sequencing error: the sequencing error rate affects a SAGE
experiment; its impact can be reduced by using phred scores and
discarding ambiguous sequences.
• Short SAGE tags comprise 14 bp and long SAGE tags comprise 21 bp.
• About 12% of C. elegans tags are not unambiguously identified
using 14 bp tags (McKay et al., 2003). Empirical
data suggest that long SAGE gives far greater resolution, but
at an increased cost.
SAGE Data Analysis Strategies
• The sequence files generated by the automated sequencer are
analyzed using the SAGE2000 software (www.sagenet.org).
• The three steps involved in obtaining a differential gene
expression list are as follows:
(1) Interpret the SAGE tags from the sequence data files by using the
SAGE2000 software for extracting ditags and checking for
duplicate ditags;
(2) Download a reference sequence database from the NCBI Web
site (SAGEmap, www.ncbi.nlm.nih.gov); and
(3) Associate the tags with the expressed gene database.
The relative transcript abundance can then be calculated by dividing
the count of each unique tag by the total number of tags sequenced, and
the fold change can be determined from the ratio of tags between
libraries.
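The two calculations just described can be sketched in a few lines of Python. The libraries and counts below are invented, `relative_abundance` and `fold_change` are hypothetical helper names, and comparing normalized frequencies (rather than raw counts) is an assumption made here so that libraries of different sizes can be compared.

```python
import math

def relative_abundance(tag_counts):
    """Divide each tag's count by the total number of tags in the library."""
    total = sum(tag_counts.values())
    return {tag: count / total for tag, count in tag_counts.items()}

def fold_change(tag, library_a, library_b):
    """Ratio of a tag's relative abundance in library A to that in library B."""
    freq_a = relative_abundance(library_a).get(tag, 0.0)
    freq_b = relative_abundance(library_b).get(tag, 0.0)
    if freq_b == 0.0:
        # Tag absent from library B: the ratio is undefined or unbounded.
        return math.inf if freq_a > 0.0 else math.nan
    return freq_a / freq_b

# Invented tag counts for two libraries of different sizes.
normal = {"TAGCGG": 25, "ATGCGGC": 25, "TATTTTAGC": 50}    # 100 tags
tumor = {"TAGCGG": 100, "ATGCGGC": 50, "TATTTTAGC": 50}    # 200 tags

print(fold_change("TAGCGG", tumor, normal))   # 2.0
print(fold_change("ATGCGGC", tumor, normal))  # 1.0
```

A tag that is missing from one library gives an infinite or undefined ratio, which is one reason analyses usually combine a fold-change threshold with a significance test.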
• The initial analysis is usually limited to a predefined tag ratio of
greater than 5-fold and a value of P≤0.05.
• The rates of false positives associated with different probability
values have been computed by Monte Carlo tests to validate
confidence intervals.
• Depending on the preliminary results, the SAGE data can be
reanalyzed by varying the P values and the fold-change
thresholds.
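The slides do not say how the Monte Carlo test is set up, so the following is only one simple way such a test could look, not the method used by SAGE2000. It assumes that, if a tag is equally abundant in both libraries, each of its observed copies falls into library A with probability total_a / (total_a + total_b); `monte_carlo_p_value` and its parameters are hypothetical.

```python
import random

def monte_carlo_p_value(count_a, count_b, total_a, total_b,
                        n_sim=10000, seed=0):
    """Estimate a two-sided P value for a tag's counts in two libraries.

    Under the null hypothesis the count_a + count_b copies of the tag are
    redistributed at random between the libraries in proportion to library
    size; the P value is the share of simulations at least as extreme as
    the observed split.
    """
    rng = random.Random(seed)  # seeded so the estimate is reproducible
    copies = count_a + count_b
    if copies == 0:
        return 1.0
    p_a = total_a / (total_a + total_b)
    expected_a = copies * p_a
    observed_dev = abs(count_a - expected_a)
    hits = 0
    for _ in range(n_sim):
        sim_a = sum(1 for _ in range(copies) if rng.random() < p_a)
        if abs(sim_a - expected_a) + 1e-9 >= observed_dev:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_sim + 1)

print(monte_carlo_p_value(40, 2, 1000, 1000, n_sim=2000))   # strongly skewed tag
print(monte_carlo_p_value(10, 10, 1000, 1000, n_sim=2000))  # evenly split tag
```

Rerunning the same function with different P value and fold-change cut-offs is a cheap way to see how sensitive a candidate gene list is to the chosen thresholds.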
SAGEmap
http://www.sagenet.org/
SAGE resources
SAGE data
SAGE APPLICATION
• SAGE is useful in comparative expression studies to identify
differences in gene expression between two or more cellular
sources of RNA.
• Gene Discovery
• Determining changes in gene expression as a consequence of an
experimental treatment (e.g. carcinogen, hormone)
• Provides quantitative data on both known and unknown genes
• Analyzes all transcripts (Transcriptome) without prior selection of
known genes
• Analysis of Cardiovascular gene expression
• Gene expression in carcinogenesis
• Substance abuse studies
• Cell, tissue and developmental stage profiling
• Profiling of human diseases
SAGE – Advantages & Disadvantages
Advantages
• No hybridization is involved, so no cross-hybridization can occur.
• Can help identify new genes by using tag as a PCR primer
Disadvantages
• Cost and time required to perform so many PCR and
sequencing reactions.
• Type IIS restriction enzymes can yield fragments of the wrong
length depending on temperature.
• Multiple genes could have the same tag
• As with microarrays, mRNA levels may not represent protein
levels in a cell
Microarray Vs. SAGE


SAGE- Serial Analysis of Gene Expression

  • 1. Serial Analysis of Gene Expression (SAGE) Technology By: Dr. Ashish C Patel Assistant Professor Vet College, AAU, Anand
  • 2. Serial Analysis of Gene Expression It is believed that the majority of biological phenomena found in a variety of organisms can be explained by the quantity of gene products. To understand the cellular functions under the certain conditions at a certain time By measuring the mRNAs of different genes and respective numbers of mRNAs at a point of time. Each cell contains more than 10000 mRNAs of different genes, copies of mRNAs of each gene ranging from one to more than 10000, and, as a total, up to half a million mRNA transcript copies. It is therefore practically impossible to determine them.
  • 3. Large-scale Random cDNA sequencing by EST project was very useful for the identification of unknown genes expressed in given cells or tissues. (Adams et al., 1991) mRNA Species 1 ……………. mRNA Species n Plasmid Insertion cDNA clones RE Assemble EST1…n Hence, sequencing = n x n times cDNA Assemble EST1…n Assemble EST1…n of all seq. projects All steps
  • 4. • However, this approach was not designed to quantify expressed genes. • The body mapping project (Okubo et al., 1992) attempted to construct gene expression profiles of a number of cells and tissues by random sequencing of a 3’-directed cDNA library. • About 300 bp fragments of these 3’-region were called gene signature and each represented a particular mRNA species. • By sequencing 1000 or more cDNA clones, they could make a rough pattern of gene expression and identify mRNAs of highly abundant class. • However, an expected weakness of both EST and body mapping projects, in which one sequencing process yields only one cDNA sequence. • Mainly because of this low throughput, the profiles obtained by the body mapping project unavoidably became a long way from what is expected and demanded.
  • 5. • Although the more recent methods of hybridization-based analyses (DNA microarray) using immobilized cDNAs or oligonucleotides can potentially examine the expression patterns of a relatively large number of genes but these method can only examine expressed sequences that have already been identified. • In contrast, the SAGE method allows for a quantitative and simultaneous analysis of a large number transcripts in any particular cells or tissues, without prior knowledge of the genes. • As the body mapping procedure, this method takes advantage of the 3’-portion of mRNA as the gene tag, but of much shorter form (9–10 bp).These tags can be serially connected before cloning into a plasmid vector. • Since the resulting plasmid clones contain multiple tags, sequences of several dozens of mRNAs can be obtained by a single sequencing reaction.
  • 6. • Rapid and cost-saving sequencing by this original device allows quantification and identification of a large number of cellular transcripts.
  • 7. • SAGE is based mainly on two principles, representation of mRNAs (cDNAs) by short sequence tags and concatenation of these tags for cloning to allow the efficient sequencing analysis. • The hypothetical eukaryotic cell that contains seven mRNA molecules composed of four species is depicted. • To explain the gene expression profile of this cell, they would have to conduct several cDNA sequencing reactions. • However, if each mRNA species can be represented by a short unique sequence stretch (such as 9 bp tag), the purpose would be attained by sequencing them, because a sequence stretch as short as 9 bp can distinguish 49 (262 144) transcripts, provided a random nucleotide distribution throughout the genome. • If we could connect these tags into a long stretch of DNA molecule, sequencing reaction would be needed only once. Principle of SAGE
  • 8. The Principle of SAGE. The hypothetical eukaryotic cell that contain seven mRNA molecules composed of four species is shown as a model. Boxed are tags that are proper to mRNA species
  • 9. SAGE Scheme SAGE method allows for a quantitative and simultaneous analysis of a large number of transcripts in any particular cells or tissues mRNA species 1 mRNA species 2 mRNA species 3 9–10 bp tag AAAAA AAAAA AAAAA clone Extract tags ,concatenate in plasmid
  • 10. SAGE Scheme Isolate insertion seq from plasmid sequencing TAGCGG.. ATGCGGC.. TATTTTAGC… mRNA tag of species 1 mRNA tag of species 2 mRNA tag of species 3 Use BLAST service Human genome ATCGCC TAGCGG TACGCCG ATGCGGC ATAAAATCG TATTTTAGC Annotated Gene 1 Annotated Gene 12 Annotated Gene 34 Result: gene 1, 12, 34 are expressed during certain time say mitosis
  • 11. SAGE procedure AAAAAmRNA mRNa- cDNA hybrid TTTTT Oligo(dT)-primer AAAAA Remove RNA by RNase H TTTTT ds cDNA synthesis TTTTT AAAAA Double-stranded cDNA is synthesized from mRNA by biotinylated oligo(dT) primer. b/c high efficiency for 3 ́ pol (A) region present in most eukaryotic mRNA
  • 12. SAGE procedure AAAAA TTTTT TTTTT AAAAA 5’ GTAC Bind to streptavidin beads TTTTT5’ GTAC Divide in half TTTTT5’ GTAC AAAAA AAAAA TTTTT AAAAA 5’ GTAC The cDNA is then cleaved with a restriction enzyme (called anchoring enzyme, NlaIII The cDNA with a cohesive end at its 5’terminus is immobilize b binding to streptavidin-coated beads.
  • 13. SAGE procedure GTAC AAAAA TTTTT CATGGGGA CCCT GTAC CATGGGGA CCCT AAAAA TTTTT Linkers A Linkers B Cleave Tagging Enzyme (TE) e.g. BsmFI. Linkers have RE site for BsmFI or FokI TE RE site TE RE site GTAC CATGGGGA CCCT NNNNN NNNNNNNNNNNNN Overlapping end CATGGGGA CCCT NNNNN NNNNNNNNNNNNNGTAC T4 DNA polymerase GTAC CATGGGGA CCCT NNNNNNNNNNNNN NNNNNNNNNNNNN CATGGGGA CCCT NNNNNNNNNNNNN NNNNNNNNNNNNNGTAC Blunt end Two independent linkers are ligated using NlaIII cohesive termini to each
  • 14. SAGE procedure GTAC CATGGGGA CCCT NNNNNNNNNNNNN NNNNNNNNNNNNN CATGGGGA CCCT NNNNNNNNNNNNN NNNNNNNNNNNNNGTAC 5’ 5’ Ligate tail-to-tail orientation GTAC CATGGGGA CCCT NNNNNNNNNNNNN NNNNNNNNNNNNN CATG CCCT GGGA NNNNNNNNNNNNN NNNNNNNNNNNNN Amplify by primers A and B GTAC CATGGGGA CCCT NNNNNNNNNNNNN NNNNNNNNNNNNN NNNNNNNNNNNNN NNNNNNNNNNNNN primer A primer B GTAC CATG CCCT GGGAGTAC Two portions are mixed again and ligated. The 5’ends of the linkers are blocked by amino group, only the mRNA-derived termini are able to be ligated in a tail-to-tail orientation
  • 15. SAGE procedure After 1 round of amplification GTAC CATGGGGA CCCT NNNNNNNNNNNNN NNNNNNNNNNNNN NNNNNNNNNNNNN NNNNNNNNNNNNN GTAC CATGGGGA CCCT NNNNNNNNNNNNN NNNNNNNNNNNNN NNNNNNNNNNNNN NNNNNNNNNNNNN AE RE site AE RE site NNNNNNNNNNNNN NNNNNNNNNNNNN NNNNNNNNNNNNN NNNNNNNNNNNNNGTAC CATG CATGGGGA CCCT CATG CCCT GGGA CATG CCCT GGGA GTAC GTAC GTAC CCCT GGGAGTAC NNNNNNNNNNNNN NNNNNNNNNNNNN NNNNNNNNNNNNN NNNNNNNNNNNNNGTAC CATG Isolate ditags Amplified product cleaved by NlaIII, an anchoring enzyme Ditag fragments flanked both ends with NlaIII cohesive terminus are isolated and ligated to obtain concatemers
  • 17. SAGE procedure NNNNNNNNNNNNN NNNNNNNNNNNNN NNNNNNNNNNNNN NNNNNNNNNNNNNGTAC CATG NNNNNNNNNNNNN NNNNNNNNNNNNN NNNNNNNNNNNNN NNNNNNNNNNNNNGTAC CATG 1 mRNA species mRNA species no. 1 mRNA species no. 2 mRNA species no. 3 mRNA species no. n plasmid
  • 18. • SAGE is a tool for the study of gene expression, a variety of biological phenomena has been analyzed. Total tags analyzed by this method are close to five million up to year 2000. • Table 1 showing highly diverse types of cells and tissues under a variety of physiological and pathological conditions can be noticed. Numbers of total collected tags in each study were variable.
  • 20. Cancer studies (Lal et al., 1999) • By comparing the gene expression profiles derived from cancer and normal tissue of interest, a large number of genes were identified as tumor specific. • Usually Northern blot hybridization analysis was performed for the confirmation of differential expression of these genes against a number of independently isolated tissue samples of similar nature. • About half of the overrepresented genes identified by SAGE were reproducibly present in these samples, while the behavior of the other half was quite different. This may reflect the heterogeneity among tumors from different individuals.
  • 21. Immunological studies • A few SAGE analysis has been directly applied for the study of immunological phenomena. • Chen et al. (1998) have reported that the changes in gene expression in the rat mast cells before and after they were stimulated through high affinity receptors for immunoglobulin E. • It had not been previously associated with mast cells were macrophage migration inhibitory factor, receptors for growth hormone-releasing factor and melatonin. • Many other genes that were differentially expressed were those related to cell structure and cell motility, and numerous unknown genes that showed no database-matching.
  • 22. Yeast • Yeast is widely used to clarify the biochemical and physiologic parameters underlying eukaryotic cellular functions. • The entire genome sequence has been determined (Goffeau, 1997) and the number of genes has been estimated to be about 6300. • The total number of mRNA molecules has also been estimated to be 15 000 per cell (Hereford and Rosbash, 1977). • So, yeast was chosen as a model organism to evaluate the power of the SAGE technology.
  • 23. Drawbacks, problems and technical modifications • Technical problems include the need for a relatively large amount of mRNA and the relative difficulty of constructing tag libraries, among others. • MicroSAGE (Datson et al., 1999) requires 500–5000-fold less starting input RNA, and is simplified by the incorporation of a ‘one-tube’ procedure for all steps from RNA isolation to tag release. • SAGE-lite is another similarly devised protocol that allows the global analysis of transcription from less than 100 ng of total starting RNA (Peters et al., 1999). Technical difficulty of the procedure: • In the original SAGE protocol, the major products of PCR are often linker dimers. To minimize contaminating linker molecules, biotinylated PCR primers were introduced. These generate biotinylated ditag products, allowing removal of the unwanted linkers by binding to streptavidin beads at a later stage.
  • 24. • Simply introducing a heating step at the final ligation step yields cloned concatemers with an average of 67 tags, compared to 22 tags obtained by the original protocol. • A major problem of the SAGE approach is how to further analyze the unknown tags. • A conventional oligonucleotide-based plaque lift method was employed successfully for the isolation and cloning of a number of genes. • However, it is almost impossible to discriminate a one-base mismatch within oligonucleotides of only 13–14 bp in length by temperature-regulated DNA–DNA hybridization, which results in numerous false positives. • An RT-PCR-based method was developed to analyze the corresponding genes; this approach uses the identified tag sequences and oligo-dT as PCR primers.
  • 25. • Matsumura et al. (1999) reported a procedure to recover a longer cDNA fragment by PCR using the SAGE tag sequence as a primer, thereby facilitating the analysis of unknown genes identified by tag sequence in SAGE. • Sequencing error: the sequencing error rate affects a SAGE experiment; its impact can be reduced by using phred scores and discarding ambiguous sequences. • Short SAGE tags comprise 14 bp and Long SAGE tags comprise 21 bp. • About 12% of C. elegans tags are not unambiguously identified using 14 bp tags (McKay et al., 2003). Empirical data suggest that Long SAGE gives far greater resolution, but at an increased cost.
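The point about phred scores can be made concrete with a short sketch. A phred score Q corresponds to a base-call error probability of 10^(-Q/10). The function names and the cutoff of Q20 below are assumptions for illustration, not values taken from the slides.

```python
def error_probability(q):
    """Convert a phred quality score into a base-call error probability."""
    return 10 ** (-q / 10)


def prob_error_free(quals):
    """Probability that every base of a tag was called correctly,
    assuming independent errors at each position."""
    p = 1.0
    for q in quals:
        p *= 1.0 - error_probability(q)
    return p


def tag_is_reliable(tag, quals, min_q=20):
    """Keep a tag only if it has no ambiguous bases and every base
    reaches the (assumed) minimum phred score."""
    if len(tag) != len(quals):
        raise ValueError("tag and quality list must have the same length")
    if any(base not in "ACGT" for base in tag.upper()):
        return False  # ambiguous call such as N
    return min(quals) >= min_q
```

Under these assumptions, a 14 bp tag whose bases all score exactly Q20 is error-free with probability 0.99^14, roughly 0.87, which illustrates why low-quality tags are worth discarding before counting.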
  • 26. SAGE Data Analysis Strategies • The sequence files generated by the automated sequencer are analyzed using the SAGE2000 software (www.sagenet.org). • The three steps involved in obtaining a differential gene expression list are as follows: (1) Interpret the SAGE tags from the sequence data files by using the SAGE2000 software to extract ditags and check for duplicate ditags; (2) Download a reference sequence database from the NCBI Web site (SAGEmap, www.ncbi.nlm.nih.gov); and (3) Associate the tags with the expressed gene database. The relative transcript abundance can then be calculated by dividing the count of each unique tag by the total number of tags sequenced, and the fold change can be determined from the ratio of tag counts between libraries.
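The abundance and fold-change calculation described above can be sketched as follows. This is an illustration rather than the SAGE2000 code: the function names are hypothetical, and normalizing each library by its own total before taking the ratio is an assumption made so that libraries of different sizes can be compared.

```python
import math
from collections import Counter


def relative_abundance(tag_counts):
    """Divide each tag count by the total number of tags in the library."""
    total = sum(tag_counts.values())
    return {tag: n / total for tag, n in tag_counts.items()}


def fold_change(tag, counts_a, counts_b):
    """Ratio of a tag's relative abundance in library A to library B.

    Returns infinity when the tag is absent from B but present in A,
    and NaN when it is absent from both libraries.
    """
    freq_a = counts_a.get(tag, 0) / sum(counts_a.values())
    freq_b = counts_b.get(tag, 0) / sum(counts_b.values())
    if freq_b == 0:
        return math.inf if freq_a > 0 else math.nan
    return freq_a / freq_b


# Counting tags from a list of extracted tag sequences (toy data).
library = Counter(["AAAAAAAAAA", "AAAAAAAAAA", "CCCCCCCCCC"])
abundances = relative_abundance(library)
```

Here `abundances` maps each tag to its fraction of the library. Tags missing from one library give an infinite ratio in this sketch; a real analysis would need an explicit policy for such cases.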
  • 27. • The initial analysis is usually limited to tags with a predefined ratio of greater than 5-fold and a value of P≤0.05. • The rates of false positives associated with different probability values have been computed by a Monte Carlo test to validate confidence intervals. • Depending on the preliminary results, the SAGE data can be reanalyzed by varying the P values and the fold-change thresholds.
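The thresholds above can be illustrated with a small simulation. The sketch below is one possible Monte Carlo test, not necessarily the one used by the SAGE software: it assumes that, under the null hypothesis of equal expression, the observed tag copies are split between two libraries in proportion to library size. All function names, the number of simulations and the seed are hypothetical.

```python
import math
import random


def monte_carlo_p(count_a, count_b, total_a, total_b, n_sim=10000, seed=0):
    """Two-sided Monte Carlo p-value for a difference in tag counts."""
    rng = random.Random(seed)              # fixed seed for reproducibility
    n = count_a + count_b                  # copies of this tag overall
    p_a = total_a / (total_a + total_b)    # null chance a copy lands in A
    expected_a = n * p_a
    observed_dev = abs(count_a - expected_a)
    hits = 0
    for _ in range(n_sim):
        sim_a = sum(rng.random() < p_a for _ in range(n))
        if abs(sim_a - expected_a) >= observed_dev:
            hits += 1
    return (hits + 1) / (n_sim + 1)        # add-one estimate, never zero


def passes_initial_filter(count_a, count_b, total_a, total_b,
                          min_fold=5.0, alpha=0.05):
    """Apply the >5-fold and P<=0.05 criteria to one tag."""
    lo, hi = sorted((count_a / total_a, count_b / total_b))
    if hi == 0:
        return False                       # tag absent from both libraries
    fold = math.inf if lo == 0 else hi / lo
    p = monte_carlo_p(count_a, count_b, total_a, total_b)
    return fold > min_fold and p <= alpha
```

Changing `min_fold` and `alpha` corresponds to the reanalysis with different fold-change and P value thresholds mentioned above.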
  • 32. SAGE APPLICATION • SAGE is useful in comparative expression studies to identify differences in gene expression between two or more cellular sources of RNA. • Gene discovery • Determining changes in gene expression as a consequence of an experimental treatment (e.g. carcinogen, hormone) • Provides quantitative data on both known and unknown genes • Analyzes all transcripts (transcriptome) without prior selection of known genes • Analysis of cardiovascular gene expression • Gene expression in carcinogenesis • Substance abuse studies • Cell, tissue and developmental stage profiling • Profiling of human diseases
  • 33. SAGE – Advantages & Disadvantages Advantages • No hybridization step, so no cross-hybridization can occur. • Can help identify new genes by using a tag as a PCR primer. Disadvantages • Cost and time required to perform so many PCR and sequencing reactions. • The type IIS restriction enzyme can yield fragments of the wrong length depending on temperature. • Multiple genes could have the same tag. • As with microarrays, mRNA levels may not represent protein levels in a cell.