11/6/19, 3:57 PMNext generation sequencing
Page 1 of 15https://www.atdbio.com/content/58/Next-generation-sequencing
Next generation sequencing
CONTENTS
1 Introduction
1.1 Sanger sequencing and Next-generation sequencing
2 Library preparation
3 Amplification
3.1 Emulsion PCR
3.2 Bridge PCR
4 Sequencing
4.1 454 Pyrosequencing
4.2 Ion torrent semiconductor sequencing
4.3 Sequencing by ligation (SOLiD)
4.4 Reversible terminator sequencing (Illumina)
4.4.1 3'-O-blocked reversible terminators
4.4.2 3'-unblocked reversible terminators
5 Third generation sequencing
6 Sequencing epigenetic modifications
6.5 Bisulfite sequencing
6.5.1 Oxidative bisulfite sequencing
7 Applications of Next-generation sequencing
8 References
1. INTRODUCTION
The sequencing of the human genome was completed in 2003, after 13
years of international collaboration and investment of USD 3 billion. The
Human Genome Project used Sanger sequencing (albeit heavily
optimized), the principal method of DNA sequencing since its invention in
the 1970s.
Today, the demand for sequencing is growing exponentially, with large
amounts of genomic DNA needing to be analyzed quickly, cheaply, and
accurately. Thanks to new sequencing technologies known collectively as
Next Generation Sequencing, it is now possible to sequence an entire
human genome in a matter of hours.
1.1. Sanger sequencing and Next-generation sequencing
The principle behind Next Generation Sequencing (NGS) is similar to that
of Sanger sequencing, which relies on capillary electrophoresis. The
genomic strand is fragmented, and the bases in each fragment are
identified from the signals emitted as a complementary strand is built up
against the fragment, which acts as the template.
The Sanger method required separate steps for sequencing, separation (by
electrophoresis) and detection. This made sample preparation difficult to
automate and limited the method's throughput, scalability and
resolution. NGS uses array-based sequencing, which builds on
the techniques developed for Sanger sequencing to process millions of
reactions in parallel, giving very high speed and throughput at a
reduced cost. Genome sequencing projects that took many years with
Sanger methods can now be completed in hours with NGS, although with
shorter read lengths (the number of bases that are sequenced at a time) and
lower accuracy.
Next generation methods of DNA sequencing have three general steps:
Library preparation: libraries are created using random fragmentation
of DNA, followed by ligation with custom linkers
Amplification: the library is amplified using clonal amplification
methods and PCR
Sequencing: DNA is sequenced using one of several different
approaches
2. LIBRARY PREPARATION
Firstly, DNA is fragmented either enzymatically or by sonication
(excitation using ultrasound) to create smaller strands. Adaptors (short,
double-stranded pieces of synthetic DNA) are then ligated to these
fragments with the help of DNA ligase, an enzyme that joins DNA strands.
The adaptors allow each fragment to bind to a complementary sequence,
such as the primers used in the amplification step.
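The two library preparation steps can be pictured with a small simulation. This is only a sketch: the adaptor sequences, the helper names `fragment` and `ligate_adaptors`, and the random toy genome are all hypothetical, and real fragmentation is not uniformly random.

```python
import random

# Hypothetical adaptor sequences, chosen only for illustration.
ADAPTOR_5 = "ACACTCTTTCC"
ADAPTOR_3 = "AGATCGGAAGA"

def fragment(dna, mean_len, rng):
    """Cut dna at random positions so fragments average about mean_len bases."""
    n_cuts = max(len(dna) // mean_len - 1, 0)
    cuts = sorted(rng.sample(range(1, len(dna)), n_cuts))
    bounds = [0] + cuts + [len(dna)]
    return [dna[a:b] for a, b in zip(bounds, bounds[1:])]

def ligate_adaptors(fragments):
    """Attach one adaptor to each end of every fragment."""
    return [ADAPTOR_5 + f + ADAPTOR_3 for f in fragments]

rng = random.Random(0)  # fixed seed so the toy example is repeatable
genome = "".join(rng.choice("ACGT") for _ in range(2000))
library = ligate_adaptors(fragment(genome, 200, rng))
print(len(library), "library fragments")
```

Because every library molecule now starts and ends with a known sequence, a single pair of primers can later amplify all fragments regardless of their inserts.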
Adaptors are synthesised so that one end is 'sticky' whilst the other is
'blunt' (non-cohesive), with a view to joining the blunt end to the
blunt-ended DNA. The sticky ends could, however, base pair with one
another, leading to adaptor dimer formation. To prevent this, the
chemistry of ligation is exploited: ligation takes place between a
3′-OH and a 5′-phosphate. Removing the phosphate from the sticky end
of the adaptor leaves a 5′-OH end instead, so DNA ligase
is unable to form a bridge between the two termini (Figure 1).
Figure 1 | Library preparation of Next-generation sequencing
In order for sequencing to be successful, the library fragments need to be
spatially clustered in PCR colonies or 'polonies' as they are conventionally
known, which consist of many copies of a particular library fragment.
Since these polonies are attached in a planar fashion, the features of the
array can be manipulated enzymatically in parallel. This method of library
construction is much faster than the previous labour-intensive procedure of
colony picking and E. coli cloning used to isolate and amplify DNA for
Sanger sequencing; however, this comes at the expense of the read length
of the fragments.
3. AMPLIFICATION
Library amplification is required so that the signal received by the
sequencer is strong enough to be detected accurately. Enzymatic
amplification can introduce phenomena such as 'biasing' and 'duplication',
in which certain library fragments are amplified preferentially.
Several types of amplification process use PCR to create
large numbers of DNA clusters.
3.1. Emulsion PCR
Emulsion oil, beads, PCR mix and the library DNA are mixed to form an
emulsion which leads to the formation of micro wells (Figure 2).
Figure 2 | Emulsion PCR
In order for the sequencing process to be successful, each micro well
should contain one bead with one strand of DNA (approximately 15% of
micro wells are of this composition). The PCR then denatures the library
fragment, giving two separate strands, one of which (the reverse strand)
anneals to the bead. The annealed DNA is amplified by polymerase
starting from the bead towards the primer site. The original reverse strand
then denatures and is released from the bead, only to re-anneal to the bead
to give two separate strands. These are both amplified to give two DNA
strands attached to the bead. The process is then repeated over 30-60
cycles, leading to clusters of DNA. Despite its extensive use in many of
the NGS platforms, this technique has been criticised for being
time-consuming, since it requires many steps (forming and breaking the
emulsion, PCR amplification, enrichment, etc.). It is also relatively
inefficient, since only around two thirds of the emulsion micro reactors
will actually contain one bead. An extra step is therefore required to
separate out the empty systems, introducing more potential for inaccuracy.
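One way to see why only a minority of micro wells end up with exactly one bead and one DNA strand is to treat bead and template loading as independent random events. The sketch below assumes Poisson statistics with an average of one bead and one template per micro well; both the model and the chosen means are assumptions, not figures from the text.

```python
import math

def poisson(k, lam):
    """Probability of exactly k items when the mean per compartment is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def one_bead_one_template(mean_beads, mean_templates):
    """Chance that a compartment holds exactly one bead and one template,
    assuming the two are loaded independently."""
    return poisson(1, mean_beads) * poisson(1, mean_templates)

p = one_bead_one_template(1.0, 1.0)
print(f"{p:.3f}")
```

Under these assumptions the usable fraction comes out at roughly 13.5%, the same order as the approximately 15% quoted above, which is why an enrichment step is needed afterwards.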
3.2. Bridge PCR
The surface of the flow cell is densely coated with primers that are
complementary to the adaptor sequences attached to the DNA library
fragments (Figure 3). The DNA is then attached to the surface of the cell at random
where it is exposed to reagents for polymerase based extension. On
addition of nucleotides and enzymes, the free ends of the single strands of
DNA attach themselves to the surface of the cell via complementary
primers, creating bridged structures. Enzymes then interact with the
bridges to make them double stranded, so that when the denaturation
occurs, two single stranded DNA fragments are attached to the surface in
close proximity. Repetition of this process leads to clonal clusters of
localised identical strands. In order to optimise cluster density,
concentrations of reagents must be monitored very closely to avoid
overcrowding.
Figure 3 | Bridging PCR
4. SEQUENCING
Several competing methods of Next Generation Sequencing have been
developed by different companies.
4.1. 454 Pyrosequencing
Pyrosequencing is based on the 'sequencing by synthesis' principle, where
a complementary strand is synthesised in the presence of polymerase
enzyme (Figure 4). Rather than using dideoxynucleotides to terminate
chain extension (as in Sanger sequencing), pyrosequencing
detects the release of pyrophosphate when nucleotides are added to the
DNA chain. It initially uses the emulsion PCR technique to construct the
polonies required for sequencing and removes the complementary strand.
Next, a ssDNA sequencing primer hybridizes to the end of the strand
(the primer-binding region), and the four different dNTPs are
sequentially made to flow in and out of the wells over the polonies. When
the correct dNTP is enzymatically incorporated into the strand, it causes
release of pyrophosphate. In the presence of ATP sulfurylase and
adenosine 5′-phosphosulfate (APS), the pyrophosphate is converted into ATP.
This ATP molecule is
used for luciferase-catalysed conversion of luciferin to oxyluciferin, which
produces light that can be detected with a camera. The relative intensity of
light is proportional to the number of bases added (i.e. a peak of twice the
intensity indicates two identical bases have been added in succession).
Figure 4 | 454 Pyrosequencing
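The relationship between nucleotide flows and light intensity can be sketched as a toy "flowgram". The flow order `TACG`, the function names, and the noise-free integer signal are assumptions made for illustration; the template is written in the order the polymerase reads it.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def flowgram(template, flow_order="TACG", n_flows=12):
    """Return (nucleotide, signal) for each flow.

    The signal is the number of bases incorporated during that flow,
    standing in for the relative light intensity.
    """
    to_add = [COMPLEMENT[b] for b in template]  # bases of the new strand
    pos = 0
    flows = []
    for i in range(n_flows):
        nt = flow_order[i % len(flow_order)]
        run = 0
        # A flowed dNTP is incorporated for as long as it matches the
        # next base needed, so a homopolymer gives a proportionally larger signal.
        while pos < len(to_add) and to_add[pos] == nt:
            run += 1
            pos += 1
        flows.append((nt, run))
    return flows

def decode(flows):
    """Rebuild the synthesised strand from the flow signals."""
    return "".join(nt * signal for nt, signal in flows)

flows = flowgram("AACGT")
print(flows[0])       # first flow: T incorporated twice
print(decode(flows))
```

Flows with a signal of zero correspond to a dNTP that was washed over the wells without being incorporated.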
Pyrosequencing, developed by 454 Life Sciences, was one of the early
successes of Next-generation sequencing; indeed, 454 Life Sciences
produced the first commercially available Next-generation sequencer.
However, the method was eclipsed by other technologies and, in 2013,
new owners Roche announced the closure of 454 Life Sciences and the
discontinuation of the 454 pyrosequencing platform.
4.2. Ion torrent semiconductor sequencing
Ion torrent sequencing uses a "sequencing by synthesis" approach, in
which a new DNA strand, complementary to the target strand, is
synthesized one base at a time. A semiconductor chip detects the hydrogen
ions produced during DNA polymerization (Figure 5).
Following polony formation using emulsion PCR, the DNA library
fragment is flooded sequentially with each deoxynucleoside triphosphate
(dNTP), as in pyrosequencing. The dNTP is then incorporated into the new
strand if complementary to the nucleotide on the target strand. Each time a
nucleotide is successfully added, a hydrogen ion is released, and it is
detected by the sequencer's pH sensor. As in the pyrosequencing method,
if more than one of the same nucleotide is added, the change in pH/signal
intensity is correspondingly larger.
Figure 5 | Ion Torrent semiconductor sequencing
Ion torrent sequencing is the first commercial technique not to use
fluorescence and camera scanning; it is therefore faster and cheaper than
many of the other methods. Unfortunately, it can be difficult to count
the number of identical bases added consecutively. For example, it may be
difficult to distinguish the pH change for a homorepeat of length 9 from
that for one of length 10, making it difficult to decode repetitive sequences.
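A rough way to see why long homorepeats are hard to count: if the signal is taken to be exactly proportional to the number of bases added (an idealising assumption), the extra signal contributed by one more base shrinks relative to the total as the run gets longer. The helper name `relative_step` is hypothetical.

```python
def relative_step(n):
    """Fractional signal increase going from a run of n to n + 1 identical
    bases, assuming signal is exactly proportional to run length."""
    return 1 / n

for n in (1, 2, 5, 9):
    print(n, f"{relative_step(n):.0%}")
```

Going from one base to two doubles the signal, whereas going from 9 to 10 changes it by only about 11%, a difference that measurement noise can easily mask.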
4.3. Sequencing by ligation (SOLiD)
SOLiD is an enzymatic method of sequencing that uses DNA ligase, an
enzyme used widely in biotechnology for its ability to ligate
double-stranded DNA (Figure 6). Emulsion PCR is used to
immobilise/amplify a ssDNA primer-binding region (known as an adapter)
which has been conjugated to the target sequence (i.e. the sequence that is
to be sequenced) on a bead. These beads are then deposited onto a glass
surface − a high density of beads can be achieved, which in turn
increases the throughput of the technique.
Once bead deposition has occurred, a primer of length N is hybridized to
the adapter, then the beads are exposed to a library of 8-mer probes, each
of which has a fluorescent dye at the 5′ end and a hydroxyl group at the 3′
end. Bases 1 and 2 are complementary to the nucleotides to be sequenced,
whilst bases 3-5 are degenerate and bases 6-8 are inosine bases. Only a
complementary probe will hybridize to the target sequence, adjacent to the
primer. DNA ligase is then used to join the 8-mer probe to the primer. A
phosphorothioate linkage between bases 5 and 6 allows the fluorescent
dye to be cleaved from the fragment using silver ions. This cleavage
allows fluorescence to be measured (four different fluorescent dyes are
used, all of which have different emission spectra) and also generates a
5′-phosphate group which can undergo further ligation. Once the first round
of sequencing is completed, the extension product is melted off and then a
second round of sequencing is performed with a primer of length N−1.
Many rounds of sequencing using shorter primers each time (i.e. N−2,
N−3, etc.) and measuring the fluorescence ensure that the target is
sequenced.
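The effect of repeating the process with primers offset by one base can be checked by counting how often each template position is interrogated. The sketch below assumes five primer rounds (N to N−4) and that the ligation site advances five bases per cycle, since cleavage between bases 5 and 6 leaves five probe bases attached; both values, and the function name, are assumptions for illustration.

```python
from collections import Counter

def interrogated_positions(read_len, n_primers=5, step=5):
    """Count how many times each template position (0-based, counted from
    the first base after the full-length primer) is read by probe bases 1-2."""
    cover = Counter()
    for k in range(n_primers):        # primer N-k starts k bases earlier
        c = 0                         # ligation cycle within this round
        while step * c - k < read_len:
            first = step * c - k      # template position paired with probe base 1
            for p in (first, first + 1):
                if 0 <= p < read_len:  # positions < 0 lie in the known adapter
                    cover[p] += 1
            c += 1
    return cover

cover = interrogated_positions(25)
print(sorted(set(cover.values())))  # every position is read the same number of times
```

Under these assumptions every position in the read is interrogated exactly twice, once as the first and once as the second base of a probe.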
Due to the two-base sequencing method (since each base is effectively
sequenced twice), the SOLiD technique is highly accurate (at 99.999%
with a sixth primer, it is the most accurate of the second generation
platforms) and also inexpensive. It can complete a single run in 7 days and
in that time can produce 30 Gb of data. Unfortunately, its main
disadvantage is that read lengths are short, making it unsuitable for many
applications.
Figure 6 | Sequencing by ligation
4.4. Reversible terminator sequencing (Illumina)
Reversible terminator sequencing differs from the traditional Sanger
method in that, instead of terminating the primer extension irreversibly
using dideoxynucleotides, it uses modified nucleotides that terminate
extension reversibly. Whilst many other techniques use emulsion PCR to amplify
the DNA library fragments, reversible termination uses bridge PCR,
improving the efficiency of this stage of the process.
Reversible terminators can be grouped into two categories: 3′-O-blocked
reversible terminators and 3′-unblocked reversible terminators.
4.4.1. 3′-O-blocked reversible terminators
The mechanism uses a sequencing by synthesis approach, elongating the
primer in a stepwise manner. Firstly, the sequencing primers and templates
are fixed to a solid support. The support is exposed to the four
modified nucleotides, each of which carries a different fluorophore
(attached to the nitrogenous base) in addition to a 3′-O-azidomethyl
group (Figure 7).
Figure 7 | Structure of fluorescently labelled azidomethyl dNTP used in Illumina
sequencing
Only the correct base pairs with the target and is subsequently added to
the primer by DNA polymerase. Nucleotides that have not been incorporated
are washed away, the solid support is imaged, and the fluorescent branch
is then cleaved using TCEP (tris(2-carboxyethyl)phosphine). TCEP also
removes the 3′-O-azidomethyl group, regenerating the 3′-OH, and the cycle
can be repeated (Figure 8).
Figure 8 | Reversible terminator sequencing
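The incorporate, image and cleave cycle can be written as a short loop. This is a minimal sketch: the dye colours are hypothetical (the text only says each base carries a different fluorophore), the chemistry is assumed to work perfectly, and the template is written in the order the polymerase reads it.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
# Hypothetical dye colours, one per base.
DYE_OF_BASE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
BASE_OF_DYE = {dye: base for base, dye in DYE_OF_BASE.items()}

def run_cycles(template, n_cycles):
    """Return the bases called over n_cycles of incorporate / image / cleave."""
    calls = []
    for cycle in range(min(n_cycles, len(template))):
        # Incorporation: only the nucleotide complementary to the next
        # template base is added, and its 3'-O-azidomethyl group blocks
        # any second addition within the same cycle.
        incorporated = COMPLEMENT[template[cycle]]
        # Imaging: the colour of the attached fluorophore identifies the base.
        colour = DYE_OF_BASE[incorporated]
        calls.append(BASE_OF_DYE[colour])
        # Cleavage: TCEP removes the dye and the 3' block, so the next
        # loop iteration can extend the strand by one more base.
    return "".join(calls)

print(run_cycles("ACGTT", 5))
```

Because exactly one base is added per cycle, a run of identical bases is read as separate cycles rather than as one larger signal.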
4.4.2. 3′-unblocked reversible terminators
The reversible termination group of 3′-unblocked reversible terminators is
linked to both the base and the fluorescent group, which now acts as part
of the termination group as well as a reporter. This method differs from the
3′-O-blocked reversible terminators method in three ways: firstly, the
3′-position is not blocked (i.e. the base has a free 3′-OH); secondly, the
fluorophore is the same for all four bases; and thirdly, each modified
base is flowed in sequentially rather than at the same time.
The main disadvantage of these techniques lies in their limited read length,
which can be caused by one of two phenomena. A blocking group is used to
prevent incorporation of two nucleotides in a single step; however, if a
nucleotide lacks the block because of imperfect synthesis, strands
can fall out of phase with one another, creating noise which limits read
length. Noise can also be created if the fluorophore is not successfully
attached or removed. These problems also affect other sequencing methods
and are the main factors limiting read length.
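The way small per-cycle failures accumulate into a read-length limit can be estimated with a simple geometric model. The failure probability of 1% per cycle and the 50% in-phase threshold below are hypothetical values chosen for illustration, and the model assumes each cycle fails independently.

```python
import math

def in_phase_fraction(cycle, p_fail):
    """Fraction of strands in a cluster still in phase after `cycle` cycles,
    if each strand independently falls out of phase with probability p_fail
    per cycle."""
    return (1 - p_fail) ** cycle

def max_cycles(p_fail, min_fraction=0.5):
    """Largest number of cycles for which at least min_fraction of the
    strands remain in phase."""
    return math.floor(math.log(min_fraction) / math.log(1 - p_fail))

n = max_cycles(0.01)
print(n, f"{in_phase_fraction(n, 0.01):.3f}")
```

Even a 1% failure rate per cycle leaves only about half of the strands in phase after several dozen cycles, which is why chemistry efficiency bounds the usable read length.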
This technique was pioneered by Illumina, with their HiSeq and MiSeq
platforms. HiSeq is the cheapest of the second generation sequencers with
a cost of $0.02 per million bases. It also has a high data output of 600 Gb
per run which takes around 8 days to complete.
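The quoted figures can be combined into a rough per-run estimate. The arithmetic below assumes that "Gb" means gigabases, that 1 Gb is 1000 million bases, and that the per-base cost applies uniformly to a whole run; none of these is stated explicitly in the text.

```python
COST_PER_MILLION_BASES_USD = 0.02   # figure quoted above
OUTPUT_PER_RUN_GB = 600             # gigabases per run, as quoted
RUN_DAYS = 8                        # approximate run time, as quoted

million_bases_per_run = OUTPUT_PER_RUN_GB * 1000
cost_per_run_usd = COST_PER_MILLION_BASES_USD * million_bases_per_run
gb_per_day = OUTPUT_PER_RUN_GB / RUN_DAYS

print(f"approx. USD {cost_per_run_usd:,.0f} per run, {gb_per_day:.0f} Gb per day")
```

On these assumptions a full run costs on the order of USD 12,000 and yields about 75 Gb per day.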
5. THIRD GENERATION SEQUENCING
A new cohort of techniques has since been developed using single
molecule sequencing and single molecule real time sequencing, removing the
need for clonal amplification. This reduces errors caused by PCR, simplifies
library preparation and, most importantly, gives a much higher read length
using higher throughput platforms. Examples include Pacific Biosciences'
platform which uses SMRT (single molecule real time) sequencing to give
read lengths of around one thousand bases and Helicos Biosciences which
utilises single molecule sequencing and therefore does not require
amplification prior to sequencing. Oxford Nanopore Technologies are
currently developing silicon-based nanopores which are subjected to a
current that changes as DNA passes through the pore. This is anticipated to
be a high-throughput rapid method of DNA sequencing, although
problems such as slowing transportation through the pore must first be
addressed.
6. SEQUENCING EPIGENETIC MODIFICATIONS
Just as Next generation sequencing enabled genomic sequencing on a
massive scale, it has become clear recently that the genetic code does not
contain all the information needed by organisms. Epigenetic modifications
to DNA bases, in particular 5-methylcytosine, also convey important
information.
All of the second generation sequencing platforms depend, like Sanger
sequencing, on PCR and therefore cannot sequence modified DNA bases.
In fact, both 5-methylcytosine and 5-hydroxymethylcytosine are treated as
cytosine by the enzymes involved in PCR; therefore, epigenetic
information is lost during sequencing.
6.5. Bisulfite sequencing
Bisulfite sequencing exploits the difference in reactivity of cytosine and 5-
methylcytosine with respect to bisulfite: cytosine is deaminated by
bisulfite to form uracil (which reads as T when sequenced), whereas 5-
methylcytosine is unreactive (i.e. reads as C). If two sequencing runs are
done in parallel, one with bisulfite treatment and one without, the
differences between the outputs of the two runs indicate methylated
cytosines in the original sequence. This technique can also be used for
dsDNA, since after treatment with bisulfite, the strands are no longer
complementary and can be treated as ssDNA.
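The comparison of treated and untreated runs can be sketched in a few lines. The function names and the toy sequence are hypothetical, conversion is assumed to be complete and error-free, and 5-hydroxymethylcytosine is ignored here.

```python
def bisulfite_convert(seq, methylated):
    """Simulate the read obtained after bisulfite treatment: unmethylated C
    reads as T, while C at a methylated position still reads as C."""
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )

def call_methylation(untreated, treated):
    """Positions that read C in both runs are inferred to be methylated."""
    return [
        i for i, (u, t) in enumerate(zip(untreated, treated))
        if u == "C" and t == "C"
    ]

original = "ACGTCCGA"
treated = bisulfite_convert(original, methylated={1, 5})
print(treated)
print(call_methylation(original, treated))
```

Positions that change from C to T between the two runs are unmethylated cytosines; those that stay C are reported as methylated.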
5-Hydroxymethylcytosine, another important epigenetic modification,
reacts with bisulfite to form cytosine-5-methylsulfonate (which reads as C
when sequenced). This complicates matters somewhat, and means that
bisulfite sequencing cannot be used as a true indicator of methylation in
itself.
6.5.1. Oxidative bisulfite sequencing
Oxidative bisulfite sequencing adds a chemical oxidation step, which
converts 5-hydroxymethylcytosine to 5-formylcytosine using potassium
perruthenate, KRuO4, before bisulfite treatment. 5-Formylcytosine is
deformylated and deaminated to form uracil by bisulfite treatment. Now,
three separate sequencing runs are necessary to distinguish cytosine, 5-
methylcytosine and 5-hydroxymethylcytosine (see Figure 9).
Figure 9 | Sequencing epigenetic modifications using bisulfite
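The logic for combining the three runs at a single cytosine position can be written as a lookup. This is a sketch under the chemistry described above (cytosine reads as T after either treatment, 5-methylcytosine reads as C after both, 5-hydroxymethylcytosine reads as C after bisulfite but T after oxidative bisulfite); the function name and the "inconsistent" label are hypothetical.

```python
def classify_cytosine(untreated, bisulfite, ox_bisulfite):
    """Classify one position from the base read in each of the three runs."""
    if untreated != "C":
        raise ValueError("position is not a cytosine in the untreated run")
    calls = {
        ("T", "T"): "C",      # converted to uracil in both treated runs
        ("C", "C"): "5mC",    # protected in both treated runs
        ("C", "T"): "5hmC",   # protected from bisulfite, converted after oxidation
    }
    return calls.get((bisulfite, ox_bisulfite), "inconsistent")

for reads in (("C", "T", "T"), ("C", "C", "C"), ("C", "C", "T")):
    print(reads, "->", classify_cytosine(*reads))
```

The remaining combination (T after bisulfite, C after oxidative bisulfite) does not correspond to any of the three bases and is flagged rather than guessed.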
7. APPLICATIONS OF NEXT-GENERATION SEQUENCING
Next generation sequencing has enabled researchers to collect vast
quantities of genomic sequencing data. This technology has a plethora of
applications, such as: diagnosing and understanding complex diseases;
whole-genome sequencing; analysis of epigenetic modifications;
mitochondrial sequencing; transcriptome sequencing − understanding how
altered expression of genetic variants affects an organism; and exome
sequencing − the exome is thought to contain up to 90% of the
mutations in the human genome that lead to disease. DNA techniques
have been used to identify and isolate genes responsible for certain
diseases, and to provide a correct copy of the defective gene, an approach
known as ‘gene therapy’.
A large focus area in gene therapy is cancer treatment – one potential
method would be to introduce an antisense RNA (which specifically
prevents the synthesis of a targeted protein) against the oncogene that
triggers the formation of tumorous cells. Another method, named ‘suicide
gene therapy’, introduces genes to kill cancer cells selectively. Many
genes coding for toxic proteins and enzymes are known, and introduction
of these genes into tumor cells would result in cell death. The difficulty
with this method is ensuring a very precise delivery system to prevent
killing healthy cells.
These methods are made possible by sequencing to analyze tumor
genomes, allowing medical experts to tailor chemotherapy and other
cancer treatments more effectively to their patients’ unique genetic
composition, revolutionizing the diagnostic stages of personalized
medicine.
As the cost of DNA sequencing goes down, it will become more
widespread, which brings a number of issues. Sequencing produces huge
volumes of data, and there are many computational challenges associated
with processing and storing the data. There are also ethical issues, such as
the ownership of an individual's DNA when the DNA is sequenced. DNA
sequencing data must be stored securely, since there are concerns that
insurance groups, mortgage brokers and employers may use this data to
modify insurance quotes or distinguish between candidates. Sequencing
may also help to find out whether an individual has an increased risk of a
particular disease, but whether the patient is informed, or whether there
is a cure for the disease, is another issue altogether.
8. REFERENCES
Ahmadian, A.; Svahn, H.; Massively Parallel Sequencing Platforms using
Lab on a Chip Technologies. Lab Chip 11, 2653-2655 (2011).
Balasubramanian, S.; Decoding Genomes at High Speed: Implications for
Science and Medicine. Angew. Chem. Int. Ed. 50, 12406-12410 (2011).
Balasubramanian, S.; Sequencing Nucleic Acids: from Chemistry to
Medicine. Chem. Commun. 47, 7281-7286 (2011).
Chen, F.; Dong, M.; Ge, M.; Zhu, L.; Ren, L.; Liu, G.; Mu, R.; The
History and Advances of Reversible Terminators Used in New
Generations of Sequencing Technology. Gen. Pro. Bio. 11, 34-40 (2013).
Mardis, E.; Next-Generation DNA Sequencing Methods. Annu. Rev.
Genomics Hum. Genet. 9, 387-402 (2008).
Mardis, E.; Next-Generation DNA Sequencing Platforms. Annu. Rev.
Anal. Chem. 6, 287-303 (2013).
Shendure, J.; Ji, H.; Next-Generation DNA Sequencing. Nat. Biotech. 26
(10), 1135-1144 (2008).
SEE ALSO
Sequencing, forensic analysis and genetic analysis
(www.atdbio.com/content/20/Sequencing-forensic-analysis-and-genetic-analysis)
Our free online Nucleic Acids Book (www.atdbio.com/nucleic-acids-book)
contains information on all aspects of nucleic acids chemistry and
biology.

More Related Content

PPTX
Next generation sequencing
PDF
Introduction to next generation sequencing
PPTX
Third Generation Sequencing
DOCX
Next generation sequencing
PPTX
NEXT GENERATION SEQUENCING
PPTX
NANOPORE SEQUENCING
PPTX
NGS.pptx
PPTX
Ion torrent sequencing
Next generation sequencing
Introduction to next generation sequencing
Third Generation Sequencing
Next generation sequencing
NEXT GENERATION SEQUENCING
NANOPORE SEQUENCING
NGS.pptx
Ion torrent sequencing

What's hot (20)

PDF
The next generation sequencing platform of roche 454
PPTX
Next generation sequencing
PPTX
Illumina infinium sequencing
PPTX
Nanopore sequencing (NGS)
PPTX
THIRD GEN SEQUENCING.pptx
PDF
PacBio SMRT - THIRD GENERATION SEQUENCING TECHNIQUE
PDF
Next Generation Sequencing
PPTX
Next generation sequencing
PPTX
Ion Torrent Sequencing
PPTX
Next Gen Sequencing (NGS) Technology Overview
PPTX
Next generation sequencing
PDF
Next generation sequencing
PPTX
Ion Torrent Sequencing
PPTX
Roche Pyrosequencing 454 ; Next generation DNA Sequencing
PPTX
Next Generation Sequencing of DNA
PPT
Ion torrent semiconductor sequencing technology
PPTX
PCR,polymerase chain reaction.Basic concept of PCR.
PPTX
DNA Microarray introdution and application
PPTX
ILLUMINA SEQUENCE.pptx
PDF
Introduction to Next-Generation Sequencing (NGS) Technology
The next generation sequencing platform of roche 454
Next generation sequencing
Illumina infinium sequencing
Nanopore sequencing (NGS)
THIRD GEN SEQUENCING.pptx
PacBio SMRT - THIRD GENERATION SEQUENCING TECHNIQUE
Next Generation Sequencing
Next generation sequencing
Ion Torrent Sequencing
Next Gen Sequencing (NGS) Technology Overview
Next generation sequencing
Next generation sequencing
Ion Torrent Sequencing
Roche Pyrosequencing 454 ; Next generation DNA Sequencing
Next Generation Sequencing of DNA
Ion torrent semiconductor sequencing technology
PCR,polymerase chain reaction.Basic concept of PCR.
DNA Microarray introdution and application
ILLUMINA SEQUENCE.pptx
Introduction to Next-Generation Sequencing (NGS) Technology
Ad

Similar to Next generation sequencing (20)

PPTX
Dna sequencing and its types
PPTX
Manal- sequencing presentation-biotechpresentation-.pptx
PDF
03_Microbio590B_sequencing_2022.pdf
PDF
EVE161: Microbial Phylogenomics - Class 2 - Evolution of DNA Sequencing
PPTX
Sequence based Markers
PPTX
Conventional and next generation sequencing ppt
PPTX
Genome sequencing
PDF
DNA Sequencing Modern Approaches BHARGAV BHATT 54429.pdf
PPTX
Sequencing genes and genomes
PPTX
NGS platform.pptx
PPTX
20150601 bio sb_assembly_course
PDF
nextgenerationsequencing-170606100132.pdf
PPTX
Next generation sequencing
PPT
DNA Sequencing: History, methods and NGS
PPT
Useful.ppt
PPT
DNA Sequencing - DNA sequencing is like reading the instructions inside a cell
DOCX
Next generation sequencing
DOCX
Pcr & gel
PPTX
Lec-7 Methods of DNA sequencing.pptx amrk
PDF
DNA Libraries / Genomic DNA vs cDNA .pdf
Dna sequencing and its types
Manal- sequencing presentation-biotechpresentation-.pptx
03_Microbio590B_sequencing_2022.pdf
EVE161: Microbial Phylogenomics - Class 2 - Evolution of DNA Sequencing
Sequence based Markers
Conventional and next generation sequencing ppt
Genome sequencing
DNA Sequencing Modern Approaches BHARGAV BHATT 54429.pdf
Sequencing genes and genomes
NGS platform.pptx
20150601 bio sb_assembly_course
nextgenerationsequencing-170606100132.pdf
Next generation sequencing
DNA Sequencing: History, methods and NGS
Useful.ppt
DNA Sequencing - DNA sequencing is like reading the instructions inside a cell
Next generation sequencing
Pcr & gel
Lec-7 Methods of DNA sequencing.pptx amrk
DNA Libraries / Genomic DNA vs cDNA .pdf
Ad

Recently uploaded (20)

DOCX
search engine optimization ppt fir known well about this
PDF
WOOl fibre morphology and structure.pdf for textiles
PDF
Getting Started with Data Integration: FME Form 101
PPT
Geologic Time for studying geology for geologist
PPTX
Web Crawler for Trend Tracking Gen Z Insights.pptx
PPTX
Benefits of Physical activity for teenagers.pptx
PDF
A review of recent deep learning applications in wood surface defect identifi...
PPTX
Modernising the Digital Integration Hub
PPTX
O2C Customer Invoices to Receipt V15A.pptx
PDF
Developing a website for English-speaking practice to English as a foreign la...
PDF
Hybrid horned lizard optimization algorithm-aquila optimizer for DC motor
PDF
How ambidextrous entrepreneurial leaders react to the artificial intelligence...
PDF
Five Habits of High-Impact Board Members
PDF
Univ-Connecticut-ChatGPT-Presentaion.pdf
PDF
Taming the Chaos: How to Turn Unstructured Data into Decisions
PDF
ENT215_Completing-a-large-scale-migration-and-modernization-with-AWS.pdf
PPTX
The various Industrial Revolutions .pptx
PDF
Video forgery: An extensive analysis of inter-and intra-frame manipulation al...
PDF
Getting started with AI Agents and Multi-Agent Systems
PPTX
Group 1 Presentation -Planning and Decision Making .pptx
search engine optimization ppt fir known well about this
WOOl fibre morphology and structure.pdf for textiles
Getting Started with Data Integration: FME Form 101
Geologic Time for studying geology for geologist
Web Crawler for Trend Tracking Gen Z Insights.pptx
Benefits of Physical activity for teenagers.pptx
A review of recent deep learning applications in wood surface defect identifi...
Modernising the Digital Integration Hub
O2C Customer Invoices to Receipt V15A.pptx
Developing a website for English-speaking practice to English as a foreign la...
Hybrid horned lizard optimization algorithm-aquila optimizer for DC motor
How ambidextrous entrepreneurial leaders react to the artificial intelligence...
Five Habits of High-Impact Board Members
Univ-Connecticut-ChatGPT-Presentaion.pdf
Taming the Chaos: How to Turn Unstructured Data into Decisions
ENT215_Completing-a-large-scale-migration-and-modernization-with-AWS.pdf
The various Industrial Revolutions .pptx
Video forgery: An extensive analysis of inter-and intra-frame manipulation al...
Getting started with AI Agents and Multi-Agent Systems
Group 1 Presentation -Planning and Decision Making .pptx

Next generation sequencing

  • 1. 11/6/19, 3:57 PMNext generation sequencing Page 1 of 15https://www.atdbio.com/content/58/Next-generation-sequencing Next generation sequencing CONTENTS 1 Introduction 1.1 Sanger sequencing and Next-generation sequencing 2 Library preparation 3 Amplification 3.1 Emulsion PCR 3.2 Bridge PCR 4 Sequencing 4.1 454 Pyrosequencing 4.2 Ion torrent semiconductor sequencing 4.3 Sequencing by ligation (SOLiD) 4.4 Reversible terminator sequencing (Illumina) 4.4.1 3'-O-blocked reversible terminators 4.4.2 3'-unblocked reversible terminators 5 Third generation sequencing 6 Sequencing epigenetic modifications 6.5 Bisulfite sequencing 6.5.1 Oxidative bisulfite sequencing 7 Applications of Next-generation sequencing 8 References 1. INTRODUCTION The sequencing of the human genome was completed in 2003, after 13 years of international collaboration and investment of USD 3 billion. The Human Genome Project used Sanger sequencing (albeit heavily optimized), the principal method of DNA sequencing since its invention in the 1970s. Today, the demand for sequencing is growing exponentially, with large amounts of genomic DNA needing to be analyzed quickly, cheaply, and accurately. Thanks to new sequencing technologies known collectively as Next Generation Sequencing, it is now possible to sequence an entire NUCLEIC ACIDS BOOK FREE, EXCLUSIVE ONLINE BOOK We wrote the book. Read our Nucleic Acids Book, an exclusive online guide to the chemistry and biology of nucleic acids. CONTACTUS REQUESTAQUOTE NOW! Tell us about your problem, and we'll use all of our knowledge of nucleic acids chemistry to help you solve it.
  • 2. 11/6/19, 3:57 PMNext generation sequencing Page 2 of 15https://www.atdbio.com/content/58/Next-generation-sequencing human genome in a matter of hours. 1.1. Sanger sequencing and Next-generation sequencing The principle behind Next Generation Sequencing (NGS) is similar to that of Sanger sequencing, which relies on capillary electrophoresis. The genomic strand is fragmented, and the bases in each fragment are identified by emitted signals when the fragments are ligated against a template strand. The Sanger method required separate steps for sequencing, separation (by electrophoresis) and detection, which made it difficult to automate the sample preparation and it was limited in throughput, scalability and resolution. The NGS method uses array-based sequencing which combines the techniques developed in Sanger sequencing to process millions of reactions in parallel, resulting in very high speed and throughput at a reduced cost. The genome sequencing projects that took many years with Sanger methods can now be completed in hours with NGS, although with shorter read lengths (the number of bases that are sequenced at a time) and less accuracy. Next generation methods of DNA sequencing have three general steps: Library preparation: libraries are created using random fragmentation of DNA, followed by ligation with custom linkers Amplification: the library is amplified using clonal amplification methods and PCR Sequencing: DNA is sequenced using one of several different approaches 2. LIBRARY PREPARATION Firstly, DNA is fragmented either enzymatically or by sonication (excitation using ultrasound) to create smaller strands. Adaptors (short, double-stranded pieces of synthetic DNA) are then ligated to these fragments with the help of DNA ligase, an enzyme that joins DNA strands. The adaptors enable the sequence to become bound to a complementary counterpart. 
Adaptors are synthesised so that one end is 'sticky' whilst the other is 'blunt' (non-cohesive), with a view to joining the blunt end to the blunt-ended DNA. The sticky ends, however, could base-pair between adaptor molecules and so lead to dimer formation. To prevent this, the chemical structure of DNA is exploited: ligation takes place between 3′-OH and 5′-P ends, so removing the phosphate from the sticky end of the adaptor, and thereby creating a 5′-OH end instead, leaves DNA ligase unable to form a bridge between the two termini (Figure 1).

Figure 1 | Library preparation of Next-generation sequencing

In order for sequencing to be successful, the library fragments need to be spatially clustered in PCR colonies, or 'polonies' as they are conventionally known, each of which consists of many copies of a particular library fragment. Since these polonies are attached in a planar fashion, the features of the array can be manipulated enzymatically in parallel. This method of library construction is much faster than the previous labour-intensive procedure of colony picking and E. coli cloning used to isolate and amplify DNA for Sanger sequencing; however, this comes at the expense of the read length of the fragments.
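The dimer-prevention rule described above reduces to one condition: ligase can only join a 3′-OH to a 5′-phosphate. The following is a minimal sketch of that condition only. The names `End` and `can_ligate` are hypothetical and purely illustrative, and the model ignores whether the ends are sticky or blunt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class End:
    """One terminus of a DNA strand (toy model, not a real library API)."""
    prime: str  # "3" or "5"
    group: str  # "OH" (hydroxyl) or "P" (phosphate)


def can_ligate(a: End, b: End) -> bool:
    """Ligase joins a 3'-OH to a 5'-phosphate; any other pairing fails."""
    return {(a.prime, a.group), (b.prime, b.group)} == {("3", "OH"), ("5", "P")}


insert_5p = End("5", "P")            # 5'-phosphate on a library fragment
adaptor_3oh = End("3", "OH")         # 3'-OH on an adaptor
adaptor_sticky_5oh = End("5", "OH")  # dephosphorylated sticky end of an adaptor

print(can_ligate(adaptor_3oh, insert_5p))           # adaptor to fragment: True
print(can_ligate(adaptor_3oh, adaptor_sticky_5oh))  # adaptor to adaptor: False
```

Because the sticky end carries 5′-OH rather than 5′-P, two adaptors meeting at that end fail the check, which is the toy equivalent of blocking dimer formation.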
3. AMPLIFICATION

Library amplification is required so that the signal received from the sequencer is strong enough to be detected accurately. With enzymatic amplification, phenomena such as 'biasing' and 'duplication' can occur, leading to preferential amplification of certain library fragments. Instead, there are several types of amplification process which use PCR to create large numbers of DNA clusters.

3.1. Emulsion PCR

Emulsion oil, beads, PCR mix and the library DNA are mixed to form an emulsion, which leads to the formation of micro wells (Figure 2).

Figure 2 | Emulsion PCR

In order for the sequencing process to be successful, each micro well should contain one bead with one strand of DNA (approximately 15% of micro wells are of this composition). The PCR then denatures the library fragment, giving two separate strands, one of which (the reverse strand) anneals to the bead. The annealed DNA is amplified by polymerase, starting from the bead and moving towards the primer site. The original reverse strand then denatures and is released from the bead, only to re-anneal to the bead, giving two separate strands. These are both amplified to give two DNA strands attached to the bead. The process is then repeated over 30-60 cycles, leading to clusters of DNA.

This technique has been criticised as time-consuming, since it requires many steps (forming and breaking the emulsion, PCR amplification, enrichment etc.), despite its extensive use in many of the NGS platforms. It is also relatively inefficient, since only around two thirds of the emulsion micro reactors will actually contain one bead. An extra step is therefore required to separate empty systems, leading to more potential inaccuracies.

3.2. Bridge PCR

The surface of the flow cell is densely coated with primers that are complementary to the primers attached to the DNA library fragments (Figure 3). The DNA is then attached to the surface of the cell at random, where it is exposed to reagents for polymerase-based extension. On addition of nucleotides and enzymes, the free ends of the single strands of DNA attach themselves to the surface of the cell via complementary primers, creating bridged structures. Enzymes then interact with the bridges to make them double-stranded, so that when denaturation occurs, two single-stranded DNA fragments are attached to the surface in close proximity. Repetition of this process leads to clonal clusters of localised identical strands. In order to optimise cluster density, concentrations of reagents must be monitored very closely to avoid overcrowding.

Figure 3 | Bridging PCR

4. SEQUENCING

Several competing methods of Next Generation Sequencing have been developed by different companies.

4.1. 454 Pyrosequencing
Pyrosequencing is based on the 'sequencing by synthesis' principle, where a complementary strand is synthesised in the presence of polymerase enzyme (Figure 4). In contrast to using dideoxynucleotides to terminate chain extension (as in Sanger sequencing), pyrosequencing instead detects the release of pyrophosphate when nucleotides are added to the DNA chain. It initially uses the emulsion PCR technique to construct the polonies required for sequencing and removes the complementary strand. Next, a ssDNA sequencing primer hybridizes to the end of the strand (the primer-binding region), and the four different dNTPs are then sequentially made to flow in and out of the wells over the polonies. When the correct dNTP is enzymatically incorporated into the strand, it causes release of pyrophosphate. In the presence of ATP sulfurylase and adenosine 5′-phosphosulfate (APS), the pyrophosphate is converted into ATP. This ATP molecule is used for the luciferase-catalysed conversion of luciferin to oxyluciferin, which produces light that can be detected with a camera. The relative intensity of light is proportional to the amount of base added (i.e. a peak of twice the intensity indicates that two identical bases have been added in succession).

Figure 4 | 454 Pyrosequencing

Pyrosequencing, developed by 454 Life Sciences, was one of the early successes of Next-generation sequencing; indeed, 454 Life Sciences produced the first commercially available Next-generation sequencer. However, the method was eclipsed by other technologies and, in 2013, its owner Roche announced the closure of 454 Life Sciences and the discontinuation of the 454 pyrosequencing platform.
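The read-out logic of pyrosequencing described above, where the light intensity in each nucleotide flow is proportional to the number of bases incorporated, can be sketched as a small decoder. This is an illustrative sketch only: the function name `decode_flowgram`, the flow order and the intensity values are assumptions made for the example, and intensities are assumed to be normalised so that one incorporation gives a signal of about 1.0.

```python
def decode_flowgram(flow_order, intensities):
    """Convert per-flow light intensities into the synthesised sequence.

    flow_order:  the repeating order in which dNTPs are flowed, e.g. "TACG"
    intensities: one normalised light reading per flow (1.0 ~ one base)
    """
    bases = []
    for i, signal in enumerate(intensities):
        base = flow_order[i % len(flow_order)]
        # Round to the nearest whole number of incorporations; never negative.
        count = max(0, round(signal))
        bases.append(base * count)
    return "".join(bases)


# Two full cycles of T, A, C, G flows with made-up readings.
signals = [1.02, 0.05, 1.96, 0.0, 0.0, 0.97, 0.0, 3.1]
print(decode_flowgram("TACG", signals))  # TCCAGGG
```

The decoded string is the newly synthesised strand, so the template is its reverse complement. The reading of 1.96 in the C flow is called as two consecutive C bases, matching the "peak of twice the intensity" rule.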
4.2. Ion torrent semiconductor sequencing

Ion Torrent sequencing uses a "sequencing by synthesis" approach, in which a new DNA strand, complementary to the target strand, is synthesized one base at a time. A semiconductor chip detects the hydrogen ions produced during DNA polymerization (Figure 5).

Following polony formation using emulsion PCR, the DNA library fragment is flooded sequentially with each deoxynucleoside triphosphate (dNTP), as in pyrosequencing. The dNTP is then incorporated into the new strand if it is complementary to the nucleotide on the target strand. Each time a nucleotide is successfully added, a hydrogen ion is released, and it is detected by the sequencer's pH sensor. As in the pyrosequencing method, if more than one of the same nucleotide is added, the change in pH/signal intensity is correspondingly larger.

Figure 5 | Ion Torrent semiconductor sequencing

Ion Torrent sequencing is the first commercial technique not to use fluorescence and camera scanning; it is therefore faster and cheaper than many of the other methods. Unfortunately, it can be difficult to count the number of identical bases added consecutively. For example, it may be difficult to differentiate the pH change for a homorepeat of length 9 from that for one of length 10, making it difficult to decode repetitive sequences.
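The homorepeat problem can be made concrete with a little arithmetic. Assuming, as a simplification, that the signal is strictly proportional to the number of bases incorporated in one flow, the relative increase in signal when a repeat grows from n to n+1 bases is 1/n. The helper name `relative_step` is hypothetical.

```python
def relative_step(n):
    """Fractional signal increase when a homorepeat grows from n to n + 1 bases,
    assuming signal is proportional to the number of incorporated bases."""
    return 1.0 / n


for n in (1, 2, 5, 9):
    print(f"{n} -> {n + 1} bases: signal rises by {relative_step(n):.0%}")
```

Under this assumption, going from 1 to 2 bases doubles the signal, while going from 9 to 10 raises it by only about 11%, so any measurement noise of comparable size makes the two lengths hard to tell apart.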
4.3. Sequencing by ligation (SOLiD)

SOLiD is an enzymatic method of sequencing that uses DNA ligase, an enzyme used widely in biotechnology for its ability to ligate double-stranded DNA strands (Figure 6). Emulsion PCR is used to immobilise/amplify a ssDNA primer-binding region (known as an adapter) which has been conjugated to the target sequence (i.e. the sequence that is to be sequenced) on a bead. These beads are then deposited onto a glass surface; a high density of beads can be achieved, which in turn increases the throughput of the technique.

Once bead deposition has occurred, a primer of length N is hybridized to the adapter, then the beads are exposed to a library of 8-mer probes, each of which has one of four fluorescent dyes at the 5′ end and a hydroxyl group at the 3′ end. Bases 1 and 2 are complementary to the nucleotides to be sequenced, whilst bases 3-5 are degenerate and bases 6-8 are inosine bases. Only a complementary probe will hybridize to the target sequence, adjacent to the primer. DNA ligase is then used to join the 8-mer probe to the primer. A phosphorothioate linkage between bases 5 and 6 allows the fluorescent dye to be cleaved from the fragment using silver ions. This cleavage allows fluorescence to be measured (four different fluorescent dyes are used, all of which have different emission spectra) and also generates a 5′-phosphate group which can undergo further ligation. Once the first round of sequencing is completed, the extension product is melted off and a second round of sequencing is performed with a primer of length N−1. Many rounds of sequencing, using a shorter primer each time (i.e. N−2, N−3 etc.) and measuring the fluorescence, ensure that the target is sequenced.
Due to the two-base sequencing method (since each base is effectively sequenced twice), the SOLiD technique is highly accurate (at 99.999% with a sixth primer, it is the most accurate of the second generation platforms) and also inexpensive. It can complete a single run in 7 days and in that time can produce 30 Gb of data. Unfortunately, its main disadvantage is that read lengths are short, making it unsuitable for many applications.
Figure 6 | Sequencing by ligation
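The two-base read-out used by SOLiD, in which each dye reports on a pair of adjacent bases so that every base contributes to two colour calls, can be sketched with the commonly described di-base colour code. The text above does not spell out the mapping, so the XOR scheme below is an assumption for illustration: the numeric labels 0-3 for the four dyes are arbitrary, and `encode_colors` and `decode_colors` are hypothetical names.

```python
# 2-bit labels for the bases; the colour of a base pair is the XOR of its labels.
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def encode_colors(seq):
    """One colour (0-3) per overlapping pair of adjacent bases."""
    return [BASE_TO_BITS[a] ^ BASE_TO_BITS[b] for a, b in zip(seq, seq[1:])]


def decode_colors(first_base, colors):
    """Rebuild the base sequence from a known first base and the colour calls."""
    bases = [first_base]
    for color in colors:
        bases.append(BITS_TO_BASE[BASE_TO_BITS[bases[-1]] ^ color])
    return "".join(bases)


seq = "ATGGCA"
colors = encode_colors(seq)
print(colors)                         # [3, 1, 0, 3, 1]
print(decode_colors(seq[0], colors))  # ATGGCA
```

Because every interior base is covered by two colour calls, a genuine single-base change alters two adjacent colours, whereas an isolated measurement error alters only one; this is one way to see why the two-base scheme helps accuracy.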
4.4. Reversible terminator sequencing (Illumina)

Reversible terminator sequencing differs from the traditional Sanger method in that, instead of terminating the primer extension irreversibly using dideoxynucleotides, modified nucleotides are used in reversible termination. Whilst many other techniques use emulsion PCR to amplify the DNA library fragments, reversible termination uses bridge PCR, improving the efficiency of this stage of the process.

Reversible terminators can be grouped into two categories: 3′-O-blocked reversible terminators and 3′-unblocked reversible terminators.

4.4.1. 3′-O-blocked reversible terminators

The mechanism uses a sequencing by synthesis approach, elongating the primer in a stepwise manner. Firstly, the sequencing primers and templates are fixed to a solid support. The support is exposed to all four nucleotides, each of which has a different fluorophore attached (to the nitrogenous base) in addition to a 3′-O-azidomethyl group (Figure 7).

Figure 7 | Structure of fluorescently labelled azidomethyl dNTP used in Illumina sequencing

Only the correct base anneals to the target and is subsequently incorporated onto the primer by the polymerase. Nucleotides that have not been incorporated are washed away, the solid support is imaged, and the fluorescent branch is cleaved using TCEP (tris(2-carboxyethyl)phosphine). TCEP also removes the 3′-O-azidomethyl group, regenerating 3′-OH, and the cycle can be repeated (Figure 8).
Figure 8 | Reversible terminator sequencing

4.4.2. 3′-unblocked reversible terminators

The reversible termination group of 3′-unblocked reversible terminators is linked to both the base and the fluorescence group, which now acts as part of the termination group as well as a reporter. This method differs from the 3′-O-blocked reversible terminators method in three ways: firstly, the 3′-position is not blocked (i.e. the base has a free 3′-OH); secondly, the fluorophore is the same for all four bases; and thirdly, each modified base is flowed in sequentially rather than at the same time.

The main disadvantage of these techniques lies in their poor read length, which can be caused by one of two phenomena. In order to prevent incorporation of two nucleotides in a single step, a block is put in place; however, if the block is missing because of a poor synthesis, strands can become out of phase, creating noise which limits read length. Noise can also be created if the fluorophore is unsuccessfully attached or removed. These problems are prevalent in other sequencing methods and are the main limiting factors to read length.
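The way dephasing limits read length can be illustrated with a simple model. Assume, purely for illustration, that in every cycle each strand independently has the same small probability of falling out of phase and never recovers; the fraction of strands still in phase after n cycles is then (1 − p)^n. The function names and the numbers used below are assumptions for the sketch, not platform specifications.

```python
import math


def in_phase_fraction(p_fail, cycles):
    """Fraction of strands in a cluster still in phase after `cycles` cycles,
    assuming an independent per-cycle dephasing probability `p_fail`."""
    return (1.0 - p_fail) ** cycles


def max_cycles(p_fail, min_fraction):
    """Largest number of cycles for which the in-phase fraction stays at or
    above `min_fraction` under the same model."""
    return math.floor(math.log(min_fraction) / math.log(1.0 - p_fail))


# Illustrative values only: 1% dephasing per cycle, 50% in-phase signal needed.
p = 0.01
limit = max_cycles(p, 0.5)
print(limit, round(in_phase_fraction(p, limit), 3))
```

Even a 1% per-cycle failure rate halves the in-phase signal within a few dozen cycles in this model, which is why small chemical inefficiencies translate directly into a cap on read length.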
Reversible terminator sequencing was pioneered by Illumina, with their HiSeq and MiSeq platforms. HiSeq is the cheapest of the second generation sequencers, with a cost of $0.02 per million bases. It also has a high data output of 600 Gb per run, which takes around 8 days to complete.

5. THIRD GENERATION SEQUENCING

A new cohort of techniques has since been developed using single-molecule sequencing and single-molecule real-time sequencing, removing the need for clonal amplification. This reduces errors caused by PCR, simplifies library preparation and, most importantly, gives a much higher read length using higher throughput platforms. Examples include Pacific Biosciences' platform, which uses SMRT (single molecule real time) sequencing to give read lengths of around one thousand bases, and Helicos Biosciences, which utilises single molecule sequencing and therefore does not require amplification prior to sequencing. Oxford Nanopore Technologies are currently developing silicon-based nanopores which are subjected to a current that changes as DNA passes through the pore. This is anticipated to be a high-throughput, rapid method of DNA sequencing, although problems such as slowing transportation through the pore must first be addressed.

6. SEQUENCING EPIGENETIC MODIFICATIONS

Although Next generation sequencing has enabled genomic sequencing on a massive scale, it has recently become clear that the genetic code does not contain all the information needed by organisms. Epigenetic modifications to DNA bases, in particular 5-methylcytosine, also convey important information.

All of the second generation sequencing platforms depend, like Sanger sequencing, on PCR and therefore cannot sequence modified DNA bases. In fact, both 5-methylcytosine and 5-hydroxymethylcytosine are treated as cytosine by the enzymes involved in PCR; therefore, epigenetic information is lost during sequencing.
6.5. Bisulfite sequencing

Bisulfite sequencing exploits the difference in reactivity of cytosine and 5-methylcytosine with respect to bisulfite: cytosine is deaminated by bisulfite to form uracil (which reads as T when sequenced), whereas 5-methylcytosine is unreactive (i.e. reads as C). If two sequencing runs are done in parallel, one with bisulfite treatment and one without, comparing the outputs of the two runs reveals the methylated cytosines in the original sequence: a C that reads as T after treatment was unmethylated, whereas a C that still reads as C was methylated. This technique can also be used for dsDNA, since after treatment with bisulfite the strands are no longer complementary and can be treated as ssDNA.

5-Hydroxymethylcytosine, another important epigenetic modification, reacts with bisulfite to form cytosine-5-methylsulfonate (which reads as C when sequenced). This complicates matters somewhat, and means that bisulfite sequencing cannot be used as a true indicator of methylation in itself.

6.5.1. Oxidative bisulfite sequencing

Oxidative bisulfite sequencing adds a chemical oxidation step, which converts 5-hydroxymethylcytosine to 5-formylcytosine using potassium perruthenate, KRuO4, before bisulfite treatment. 5-Formylcytosine is deformylated and deaminated to form uracil by bisulfite treatment. Now, three separate sequencing runs are necessary to distinguish cytosine, 5-methylcytosine and 5-hydroxymethylcytosine (see Figure 9).

Figure 9 | Sequencing epigenetic modifications using bisulfite

7. APPLICATIONS OF NEXT-GENERATION SEQUENCING
Next generation sequencing has enabled researchers to collect vast quantities of genomic sequencing data. This technology has a plethora of applications, such as:
diagnosing and understanding complex diseases;
whole-genome sequencing;
analysis of epigenetic modifications;
mitochondrial sequencing;
transcriptome sequencing: understanding how altered expression of genetic variants affects an organism; and
exome sequencing: the exome is thought to contain up to 90% of the mutations in the human genome that lead to disease.

DNA techniques have been used to identify and isolate genes responsible for certain diseases, and to provide the correct copy of the defective gene, an approach known as 'gene therapy'. A large focus area in gene therapy is cancer treatment. One potential method would be to introduce an antisense RNA (which specifically prevents the synthesis of a targeted protein) against the oncogene, which is triggered to form tumorous cells. Another method is named 'suicide gene therapy', which introduces genes to kill cancer cells selectively. Many genetic codes for toxic proteins and enzymes are known, and introduction of these genes into tumor cells would result in cell death. The difficulty in this method is to ensure a very precise delivery system to prevent killing healthy cells. These methods are made possible by sequencing to analyze tumor genomes, allowing medical experts to tailor chemotherapy and other cancer treatments more effectively to their patients' unique genetic composition, revolutionizing the diagnostic stages of personalized medicine.

As the cost of DNA sequencing goes down, it will become more widespread, which brings a number of issues. Sequencing produces huge volumes of data, and there are many computational challenges associated with processing and storing the data.
There are also ethical issues, such as the ownership of an individual's DNA once it has been sequenced. DNA sequencing data must be stored securely, since there are concerns that insurance groups, mortgage brokers and employers may use this data to modify insurance quotes or distinguish between candidates. Sequencing may also help to find out whether an individual has an increased risk of a particular disease, but whether the patient is informed, or whether there is a cure for the disease, is another issue altogether.
8. REFERENCES

Ahmadian, A.; Svahn, H.; Massively Parallel Sequencing Platforms using Lab on a Chip Technologies. Lab Chip 11, 2653-2655 (2011).
Balasubramanian, S.; Decoding Genomes at High Speed: Implications for Science and Medicine. Angew. Chem. Int. Ed. 50, 12406-12410 (2011).
Balasubramanian, S.; Sequencing Nucleic Acids: from Chemistry to Medicine. Chem. Commun. 47, 7281-7286 (2011).
Chen, F.; Dong, M.; Ge, M.; Zhu, L.; Ren, L.; Liu, G.; Mu, R.; The History and Advances of Reversible Terminators Used in New Generations of Sequencing Technology. Gen. Pro. Bio. 11, 34-40 (2013).
Mardis, E.; Next-Generation DNA Sequencing Methods. Annu. Rev. Genomics Hum. Genet. 9, 387-402 (2008).
Mardis, E.; Next-Generation DNA Sequencing Platforms. Annu. Rev. Anal. Chem. 6, 287-303 (2013).
Shendure, J.; Ji, H.; Next-Generation DNA Sequencing. Nat. Biotech. 26 (10), 1135-1144 (2008).

SEE ALSO

Sequencing, forensic analysis and genetic analysis (www.atdbio.com/content/20/Sequencing-forensic-analysis-and-genetic-analysis)