NGI stockholm
NGI Sweden
Next Generation Sequencing at the National Genomics Infrastructure
Phil Ewels
phil.ewels@scilifelab.se
Introduction to Bioinformatics Using NGS Data
Linköping, 2018-05-23
Overview
National Genomics Infrastructure
Sequencing Technologies
Sequencing Applications
Bioinformatics at the NGI
The National Genomics Infrastructure
National Genomics Infrastructure
Proteomics
Metabolomics
Single-Cell Biology
Cellular & Molecular Imaging
Molecular Structure
Chemical Biology
Genome Engineering
Diagnostic Development
Drug Discovery & Development
National Bioinformatics Infrastructure
Data Office
SciLifeLab NGI
Technology Platforms
Research Programs
National Genomics Infrastructure
SciLifeLab NGI
Stockholm: Genomics Production, Genomics Applications Development
Uppsala: SNP&Seq, Uppsala Genome Center
National Genomics Infrastructure
SciLifeLab NGI
Our mission is to offer a state-of-the-art infrastructure for massively parallel DNA sequencing and SNP genotyping, available to researchers all over Sweden
SciLifeLab NGI
National resource
State-of-the-art infrastructure
Guidelines and support
We provide guidelines and support for sample collection, study design, protocol selection and bioinformatics analysis
NGI Organisation
Funding
Staff salaries
Premises and service contracts
Capital equipment
Host universities
SciLifeLab
VR
KAW
User fees
Reagent costs
NGI Stockholm NGI Uppsala
Project timeline
Sample QC
Library preparation, Sequencing, Genotyping
Data processing and primary analysis
Scientific support and project consultation
Data delivery
Methods offered at NGI
• Just sequencing
• FFPE sequencing
• 10X Genomics
• Nanopore sequencing
• ATAC-seq
• User QC (cheap preps)
• Hi-C
• (ox)Bisulphite sequencing
• RAD-seq
• RNA
• de novo
• DNA
Data analysis pipelines included
NGI Stockholm
[Bar chart: NGI Stockholm Projects in 2017. Categories: RNA-Seq, WG Re-Seq, De-Novo, Metagenomics, Targeted Re-Seq, ChIP-Seq, Epigenetics, RAD Seq. Bar values: 1, 13, 15, 31, 58, 61, 159, 177]
• RNA-seq is the most common project type
• In total, NGI Sweden processed 1068 NGS projects with almost 50 000 samples in 2017
[Bar chart: NGI Stockholm Samples in 2017. Categories: RNA-Seq, WG Re-Seq, De-Novo, Metagenomics, Targeted Re-Seq, ChIP-Seq, Epigenetics, RAD Seq. Bar values: 192, 180, 397, 5,909, 4,496, 211, 4,551, 15,022]
NGI Stockholm
• Median turnaround times from QC passed to data delivered for 2017
• Sequencing only: 11.5 days
• RNA: 6.5 weeks
• WGS: 8 weeks
https://ngisweden.scilifelab.se/file/stockholm_dashboard
Sequencing Technologies
Sequencing Types
Illumina
PacBio
Oxford Nanopore
Ion Torrent
Illumina Sequencing
• Largest provider of sequencing technology

• NGS machines use "Sequencing-by-synthesis"

• Developed at the University of Cambridge in the 1990s
• Spun into a company called Solexa in 1998
• Solexa acquired by Illumina in 2007
• Responsible for the vast majority of DNA sequencing experiments worldwide
Illumina Sequencing
https://youtu.be/fCd6B5HRaZ8
Illumina iSeq 100
Illumina MiniSeq
Illumina MiSeq
Illumina NextSeq
Illumina HiSeq 2500
Illumina HiSeq 3000
Illumina HiSeq 4000
Illumina HiSeq X
Illumina NovaSeq 6000
Illumina at NGI
• iSeq 100: Coming soon to NGI Uppsala. Small cheap runs
• MiSeq: Small runs, long reads (2x300bp)
• HiSeq 2500: Primary machine for most of NGI's history
• HiSeq X: Cheap, high throughput. Only allowed to run WGS with > 15X coverage
• NovaSeq 6000: Newest machine, both Stockholm & Uppsala. Will eventually replace HiSeq 2500
How to choose
• Number of reads required

• How many samples, how deeply sequenced?
• Type of reads required

• Single End / Paired End, length?
• Urgency and cost

• Sharing flow cells with other users
• Best price for the project
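
As a rough illustration of the "number of reads" question, the usual back-of-the-envelope relation is mean coverage = (number of reads × bases per read) / genome size. The sketch below rearranges that; the function names and the example figures (a 3 Gbp genome, 2x150 bp reads, a 400 M read-pair run) are illustrative assumptions, not NGI recommendations.

```python
import math


def reads_needed(genome_size_bp, target_coverage, read_length_bp, paired_end=True):
    """Estimate how many reads (or read pairs) give a target mean coverage.

    Uses coverage = fragments * bases_per_fragment / genome_size.
    Ignores duplicates, adapter content and mapping losses.
    """
    bases_per_fragment = read_length_bp * (2 if paired_end else 1)
    return math.ceil(genome_size_bp * target_coverage / bases_per_fragment)


def samples_per_run(run_yield_fragments, fragments_per_sample):
    """How many samples fit on one run if the yield is shared evenly."""
    return run_yield_fragments // fragments_per_sample


if __name__ == "__main__":
    pairs = reads_needed(3_000_000_000, 30, 150, paired_end=True)
    print(f"Read pairs for 30X of a 3 Gbp genome: {pairs:,}")
    print("RNA-seq samples per run:", samples_per_run(400_000_000, 25_000_000))
```

Real projects need a margin on top of this estimate, which is one reason to discuss the design with the facility first.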
Patterned flow cells
• New type of flow cell

• HiSeq 4000, HiSeq X, NovaSeq
• Single sequence per well

• Higher density, more data
• What's index-hopping?

• ExAmp can mix up index pairs in a tiny fraction of reads
• Avoided with dual unique indexes
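
A minimal sketch of why unique dual indexes help: if no i7 or i5 index is shared between samples, a read whose indexes were mixed up carries a pair that belongs to no sample and can be discarded. The sample names, index sequences and function names below are hypothetical, made up for illustration.

```python
from collections import Counter


def shared_indexes(samples):
    """Return i7/i5 sequences that are used by more than one sample.

    `samples` maps a sample name to an (i7, i5) tuple.
    """
    i7_counts = Counter(i7 for i7, _ in samples.values())
    i5_counts = Counter(i5 for _, i5 in samples.values())
    return {
        "i7": sorted(seq for seq, n in i7_counts.items() if n > 1),
        "i5": sorted(seq for seq, n in i5_counts.items() if n > 1),
    }


def is_unique_dual(samples):
    """True if every i7 and every i5 is used by exactly one sample."""
    shared = shared_indexes(samples)
    return not shared["i7"] and not shared["i5"]


def assign_read(i7, i5, samples):
    """Return the sample owning this exact index pair, or None if unexpected."""
    for name, pair in samples.items():
        if pair == (i7, i5):
            return name
    return None


if __name__ == "__main__":
    udi = {"S1": ("ATCACG", "AAGGTT"), "S2": ("CGATGT", "CCTTAA")}
    print(is_unique_dual(udi))                   # unique dual indexes
    print(assign_read("ATCACG", "CCTTAA", udi))  # hopped pair matches no sample
```

With combinatorial indexing (indexes reused across samples), the same hopped pair could match a real sample and be silently misassigned.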
Patterned flow cells
• Patterned flow cells can give "optical duplicates"
• https://sequencing.qcfail.com/articles/illumina-patterned-flow-cells-generate-duplicated-sequences/
• Can be treated like regular PCR duplicates
[Figure panels: HiSeq 2500, HiSeq 4000]
Two colour chemistry
• Older SBS used four different fluorophores

• One for each nucleotide
• New machines use two

• Faster and cheaper
• NextSeq, NovaSeq, iSeq
• No signal = G

• Can get poly-G if something goes wrong
https://sequencing.qcfail.com/articles/illumina-2-colour-chemistry-can-overcall-high-confidence-g-bases/
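
Because "no signal" is read as G on two-colour machines, a read that loses signal ends in a run of Gs. The sketch below removes such a tail; the function name and the 10-base threshold are assumptions chosen for illustration, and dedicated read trimmers offer more robust versions of the same idea.

```python
def trim_poly_g(seq, min_run=10):
    """Remove a trailing run of G bases if it is at least `min_run` long.

    Shorter G runs are kept, since they are plausible real sequence.
    Note that a genuine G directly before the artefact is trimmed too.
    """
    stripped = seq.rstrip("G")
    run_length = len(seq) - len(stripped)
    return stripped if run_length >= min_run else seq


if __name__ == "__main__":
    print(trim_poly_g("ACGTACGT" + "G" * 15))  # tail removed
    print(trim_poly_g("ACGTACGTGG"))           # short run kept
```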
PacBio
• Pacific Biosciences - specialists in long reads

• Also uses fluorescent nucleotides
• Polymerases immobilised at the bottom of tiny wells give off pulses of light as the nucleotides are incorporated
• Each well is independent, doesn't use sequencing rounds like Illumina

• Can work with much longer DNA fragments

• 250 bp – 60 kb (max ~160 kb)
PacBio
https://youtu.be/NHCJ8PtYCFc
PacBio RS II
PacBio Sequel
PacBio Sequencing
• Long reads are excellent for de-novo genome assembly and isoform detection

• Output is expensive compared to Illumina, but getting better

• Small genomes are no problem. Larger genomes are now becoming more feasible.
• New amplification-free enrichment using CRISPR-Cas9
Oxford Nanopore
• Newest contender in the sequencing world

• Lots of hype, and it has taken several years to become a reality
• Still developing very fast

• Quality, yield and cost changing almost monthly
• High error rates (but better than they used to be)

• Now 2-13% depending on sequencing type
Oxford Nanopore
MinION
GridION
PromethION
SmidgION (not yet released)
Oxford Nanopore
• The best technology available for ultra long reads

• Twitter users report getting reads over 1 Mbp long
• "Whale spotting" - finding the longest reads on the end of the distribution curve
• Price dropping rapidly, but still expensive compared to Illumina

• NGI has 2x MinIONs, hoping for PromethION soon
Ion Torrent
• Main applications

• Microbial and metagenomic sequencing
• Targeted re-sequencing (gene panels)
• Clinical sequencing
• Short, single-end reads

• Fast run times
Ion Torrent PGM
• Yield: 0.1 - 1 Gbp
• Run time: 3 hrs
• Read length: 200 - 400 bp
Ion Torrent Proton
• Yield: 10 Gbp
• Run time: 4 hrs
• Read length: 200 bp
Ion Torrent S5 XL
• Yield: 1-13 Gbp
• Run time: 3 hrs
• Read length: 200 - 600 bp
Sequencing Type
• No need to remember all of this

• Many considerations, changing all the time
• We are experts - come and speak to us!
support@ngisweden.se
https://ngisweden.scilifelab.se/
Sequencing Applications
Library Preparation
• All high throughput sequencing requires some kind of library preparation

• Add adapters for sequencing chemistry
• Adjust DNA fragment lengths
• Incorporate biological signal into sequence
• Add required enzymes
• Different library preps enable different applications
RNA Sequencing
• Choose a type of RNA

• Protein coding mRNA (poly-A)
• All RNA (rRNA depletion)
• Small RNA
• Choose your question

• Differential gene expression
• Differential isoform detection & quantification
• Fusion gene detection
• Define your limitations

• Low-input material
• Low quality material (e.g. FFPE)
RNA Sequencing
• Illumina sequencing RNA library prep kits

• Illumina TruSeq RNA: protein-coding poly-A
• Illumina RiboZero: rRNA depletion
• Illumina TruSeq RNA Exome: FFPE / low quality
• Clontech SMARTer Pico: low input
• Illumina TruSeq Small RNA: small RNA
• Oxford Nanopore, PacBio, IonTorrent
DNA Sequencing
• Choose your question

• SNP, SNV, indel calling
• Structural variant detection
• De-novo genome assembly
• Choose your priorities

• Sequencing accuracy
• Sequencing depth
• Ultra-long reads
• Define your requirements

• Low-input material
• Low quality material (e.g. FFPE)
DNA Sequencing
• Illumina sequencing DNA library prep kits

• Illumina TruSeq DNA PCR Free: best quality
• Rubicon ThruPLEX: low input
• Illumina Nextera XT: cheap (plate format)
• Illumina Nextera Flex: fast and simple
• 10X Genomics: linked reads
• Oxford Nanopore, PacBio, IonTorrent
10X Genomics
• Chromium instrument uses droplet emulsion technology for nanoliter reaction volumes

• Linked-read sequencing

• Large molecules fragmented in droplets and barcoded
• Normal short-read Illumina sequencing used
• Long fragments (20-100+ Kbp) reassembled from barcodes
• Regular Illumina sequencing libraries produced
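
The idea of reassembling long fragments from barcodes can be sketched as follows: short reads sharing a droplet barcode and mapping close together are assumed to come from the same long molecule. This is a simplified illustration, not the method used by any particular linked-read software; the function names, the (barcode, position) input format and the 50 kb gap threshold are all assumptions.

```python
from collections import defaultdict


def group_by_barcode(reads):
    """Collect mapped read positions per droplet barcode.

    `reads` is an iterable of (barcode, position) tuples on one chromosome.
    """
    groups = defaultdict(list)
    for barcode, position in reads:
        groups[barcode].append(position)
    return {barcode: sorted(positions) for barcode, positions in groups.items()}


def infer_molecules(positions, max_gap=50_000):
    """Split one barcode's positions into (start, end) spans.

    A gap larger than `max_gap` between neighbouring reads starts a new span,
    because one droplet can contain more than one long molecule.
    """
    molecules = []
    for pos in sorted(positions):
        if molecules and pos - molecules[-1][1] <= max_gap:
            molecules[-1][1] = pos  # extend the current molecule
        else:
            molecules.append([pos, pos])  # start a new molecule
    return [tuple(span) for span in molecules]


if __name__ == "__main__":
    reads = [("BC1", 1_000), ("BC1", 12_000), ("BC1", 900_000), ("BC2", 5_000)]
    for barcode, positions in group_by_barcode(reads).items():
        print(barcode, infer_molecules(positions))
```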
10X Genomics
• Single cell RNA sequencing

• Thousands of cells captured in droplets
• Each RNA molecule tagged with droplet barcode
Hi-C
• Now testing Hi-C in NGI Stockholm

• Proximity ligation assay to detect physical colocation of DNA fragments within cell nuclei
• Multiple applications for data

• Epigenetics
• De-novo genome assembly
• Structural variation detection
[Figure: Hi-C plot, axes Chr 14 vs Chr 14]
Methylation Sequencing
• Bisulphite sequencing detects cytosine methylation in genomic DNA

• Unmethylated Cs converted to uracil by bisulphite and sequenced as T
• Methylated Cs are protected and sequenced as C
• Oxidative bisulphite informs about hydroxy-methylation

• Currently under development at NGI Stockholm
• PacBio and Oxford Nanopore able to detect some native base modifications
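
The conversion logic above can be sketched in a few lines: unmethylated C reads as T, methylated C stays C, so comparing a read with the reference at each C gives a methylation call. This is a toy illustration assuming a forward-strand read aligned without gaps; the function names are made up, and real bisulphite aligners also handle the reverse strand and mismatches.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulphite treatment on the forward strand.

    Cs at `methylated_positions` (0-based) are protected; all other Cs
    are converted and end up sequenced as T.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def call_methylation(reference, read):
    """Return {position: True/False} for every reference C covered by the read.

    True means the read shows C (methylated), False means T (unmethylated).
    Positions showing any other base are skipped as uninformative.
    """
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference, read)):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[i] = True
        elif read_base == "T":
            calls[i] = False
    return calls


if __name__ == "__main__":
    reference = "ACGTCCGA"
    read = bisulfite_convert(reference, methylated_positions=[1, 5])
    print(read)
    print(call_methylation(reference, read))
```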
RAD Sequencing
• Restriction-site Associated DNA sequencing, also known as GBS (Genotyping By Sequencing)

• Genome fragmented using a restriction enzyme
• Narrow size range purified - same regions of genome for all individuals
• Allows cheap high-depth variant calling for large numbers of samples, without a reference genome

• Excellent for population genomics and ecology
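
The two RAD-seq steps (restriction digest, then size selection) can be mimicked in silico. The sketch below assumes EcoRI (recognition site GAATTC, cutting after the first G) purely as an example enzyme; the function names and the size window are illustrative and not taken from an NGI protocol.

```python
def digest(genome, site="GAATTC", cut_offset=1):
    """Cut `genome` at every occurrence of `site`.

    `cut_offset` is the cut position within the site (1 = after the first base,
    as for EcoRI: G^AATTC). Returns the fragments in genome order.
    """
    cuts = []
    start = genome.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = genome.find(site, start + 1)
    bounds = [0] + cuts + [len(genome)]
    return [genome[a:b] for a, b in zip(bounds, bounds[1:])]


def size_select(fragments, min_len, max_len):
    """Keep only fragments inside the purified size window."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


if __name__ == "__main__":
    genome = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TT"
    fragments = digest(genome)
    print([len(f) for f in fragments])
    print(size_select(fragments, 10, 20))
```

Because the cut sites sit at the same places in every individual, the same size window selects the same loci across samples.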
Bioinformatics at the NGI
Bioinformatics at NGI
• Raw sequencing data management

• Demultiplexing, data transfers, backups, delivery
• Quality control

• Every project is checked against quality criteria
• Automated analysis pipelines

• Standardised pipelines give reproducible results
• Software development
NGI Data Handling
Sequencer
Network storage
Preprocessing
Backup
UPPMAX (Irma)
UPPMAX (Grus)
SNIC Supr Authentication
Your computer
Your analysis server
Grus Deliveries
• UPPMAX tool for NGI data deliveries

• NGI creates a SNIC Supr "delivery project" for each NGI sequencing project
• Project PI and contact person given access, according to what was put on the order form
• Email sent with project ID and instructions
• Grus is for secure short term storage only

• Requires two-factor authentication
Analysis Pipelines
• Initial data analysis for major protocols

• Internal QC and standardised starting point for users

• All software open source and on GitHub

• http://opensource.scilifelab.se/
• http://github.com/SciLifeLab/
• Accredited facility
Analysis Requirements
Automated

Reliable

Easy for others to run

Reproducible results
Analysis Pipelines
NouGAT (de-novo)
Sarek
• Tumour/Normal pair WGS analysis based on GATK best practices
• SNPs, SNVs and indels
• Structural variants
• Heterogeneity, ploidy and CNVs
• Germline and/or Somatic analysis

• Formerly called Cancer Analysis Workflow
• Tools: MuTect2, Strelka, FreeBayes, GATK HaplotypeCaller, MuTect1, ASCAT, Manta
https://github.com/SciLifeLab/Sarek
Sarek
• Tool split into sub-workflows

• Bash wrapper script runs whole workflow

• Manuscript submitted this week, preprint available on bioRxiv

• https://www.biorxiv.org/content/early/2018/05/09/316976
NGI-RNAseq
NGI, April – December 2017

10,227 samples processed
131 user projects
https://github.com/SciLifeLab/NGI-RNAseq
MIT Licence
Read alignment
Gene counts
Quality Control
Reporting
Raw data
NGI-RNAseq
Quantitative Biology Center, Tübingen, Germany
https://github.com/SciLifeLab/NGI-RNAseq
Now running using
nf-core
• A community effort to collect a curated set of Nextflow analysis pipelines

• GitHub organisation to collect pipelines in one place
• No institute-specific branding
• Strict set of guideline requirements
• Automated testing for code style and function
https://nf-co.re
nf-core
• Easy to run pipelines

• Helpful community

• Super reproducible results
Quality Control
• Every project has some level of quality control checks

• Sequencing quality
• FastQC, FastQ Screen
• Analysis pipelines give application-specific QC

• Qualimap, RSeQC
• Reporting is done using MultiQC
MultiQC
• Reporting tool, parses logs from completed analysis

• Creates a single HTML report for all samples & steps in a project

• Interactive plots for data exploration

• Current version now has 61 supported tools

• Works with anything from tens → thousands of samples

• Highly customisable
Getting MultiQC
PyPI
Conclusions
If you have a project
• Visit our order portal

• Create projects
• Request meetings
• Send us an email
https://ngisweden.scilifelab.se
support@ngisweden.se
Find our tools
• View our open-source software

• All code available on GitHub
http://opensource.scilifelab.se
Acknowledgements
Thanks to:
Max Käller

Olga Vinnere Pettersson

NGI Sweden
Phil Ewels

phil.ewels@scilifelab.se

ewels

tallphil
support@ngisweden.se
http://ngisweden.scilifelab.se
http://opensource.scilifelab.se


Lecture: NGS at the National Genomics Infrastructure

  • 1. NGI stockholm NGI Sweden Next Generation Sequencing at the National Genomics Infrastructure Phil Ewels phil.ewels@scilifelab.se Introduction to Bioinformatics Using NGS Data Linköping, 2018-05-23
  • 2. NGI stockholm Overview National Genomics Infrastructure Sequencing Technologies Sequencing Applications Bioinformatics at the NGI
  • 4. NGI stockholm National Genomics Infrastructure Proteomics Metabolomics Single-Cell Biology Cellular & Molecular Imaging Molecular Structure Chemical Biology Genome Engineering Diagnostic Development Drug Discovery & Development NationalBioinformaticsInfrastructure DataOffice SciLifeLab NGI Technology Platforms Research Programs National Genomics Infrastructure
  • 5. NGI stockholm SciLifeLab NGI Stockholm Uppsala Genomics Production SNP&Seq Uppsala Genome CenterGenomics Applications DevelopmentGenomics Applications Development National Genomics Infrastructure
  • 6. NGI stockholm SciLifeLab NGI Our mission is to offer a
 state-of-the-art infrastructure
 for massively parallel DNA sequencing and SNP genotyping, available to researchers all over Sweden
  • 7. NGI stockholm SciLifeLab NGI National resource State-of-the-art infrastructure Guidelines and support We provide 
 guidelines and support
 for sample collection, study design, protocol selection and bioinformatics analysis
  • 8. NGI stockholm NGI Organisation NGI Stockholm NGI Uppsala
  • 9. NGI stockholm NGI Organisation Funding Staff salaries Premises and service contracts Capital equipment Host universities SciLifeLab VR KAW User fees Reagent costs NGI Stockholm NGI Uppsala
  • 10. NGI stockholm Project timeline Sample QC Library preparation, Sequencing, Genotyping Data processing and primary analysis Scientific support and project consultation Data delivery
  • 11. NGI stockholm Project timeline Sample QC Library preparation, Sequencing, Genotyping Data processing and primary analysis Scientific support and project consultation Data delivery
  • 12. NGI stockholm Just
 Sequencing Methods offered at NGI FFPE Sequencing 10X Genomics Nanoporesequencing ATAC-seq UserQC (cheap preps) Hi-C(ox)Bisulphite
 sequencing RAD-seq RNA de novoDNA Data analysis pipelines included
  • 13. NGI stockholm NGI Stockholm NGI Stockholm Projects in 2017 RNA-Seq WG Re-Seq De-Novo Metagenomics Targeted Re-Seq ChIP-Seq Epigenetics RAD Seq 0 45 90 135 180 1 13 15 31 58 61 159 177 • RNA-seq is the most common project type
  • 14. NGI stockholm NGI Stockholm • RNA-seq is the most common project type • In total, NGI Sweden processed 1068 NGS projects with almost 50 000 samples in 2017 NGI Stockholm Samples in 2017 RNA-Seq WG Re-Seq De-Novo Metagenomics Targeted Re-Seq ChIP-Seq Epigenetics RAD Seq 0 4000 8000 12000 16000 192 180 397 5,909 4,496 211 4,551 15,022
  • 15. NGI stockholm NGI Stockholm • Median turn around times from QC passed to data delivered for 2017 • Sequencing only: 11.5 days • RNA: 6.5 weeks • WGS: 8 weeks https://ngisweden.scilifelab.se/ file/stockholm_dashboard
  • 19. NGI stockholm Illumina Sequencing • Largest provider of sequencing technology • NGS machines use "Sequencing-by-synthesis" • Developed at the University of Cambridge in 1990s • Spun into a company called Solexa in 1998 • Solexa acquired by illumina in 2007 • Responsible for vast majority of DNA sequencing experiments worldwide
  • 30. NGI stockholm Illumina at NGI iSeq 100 Coming soon to NGI Uppsala Small cheap runs MiSeq Small runs, long reads (2x300bp) HiSeq 2500 Primary machine for most of NGI's history HiSeq X Cheap, high throughput Only allowed to run WGS with > 15X coverage NovaSeq 6000 Newest machine, both Stockholm & Uppsala Will eventually replace HiSeq 2500
  • 31. NGI stockholm How to choose • Number of reads required • How many samples, how deeply sequenced? • Type of reads required • Single End / Paired End, length? • Urgency and cost • Sharing flow cells with other users • Best price for the project
  • 32. NGI stockholm Patterned flow cells • New type of flow cell • HiSeq 4000, HiSeq X, NovaSeq • Single sequence per well • Higher density, more data • What's index-hopping? • ExAmp can mix up index pairs in tiny fraction of reads • Avoided with dual unique indexes
  • 33. Patterned flow cells • Patterned flow cells can give "optical duplicates" • https://sequencing.qcfail.com/articles/illumina-patterned-flow- cells-generate-duplicated-sequences/ • Can be treated like regular PCR duplicates HiSeq 2500 HiSeq 4000
  • 34. NGI stockholm Two colour chemistry • Older SBS used four different fluorophores • One for each nucleotide • New machines use two • Faster and cheaper • NextSeq, NovaSeq, iSeq • No signal = G • Can get poly-G if something
 goes wrong https://sequencing.qcfail.com/articles/illumina-2-colour- chemistry-can-overcall-high-confidence-g-bases/
  • 36. NGI stockholm PacBio • Pacific Biosciences - specialists in long reads • Also uses fluorescent nucleotides • Polymerases immobilised at the bottom of tiny wells give off pulses as the nucleotides are incorporated • Each well is independent, doesn't use sequencing rounds like illumina • Can work with much longer DNA fragments • 250 bp – 60 kb (max ~160 kb)
  • 40. NGI stockholm PacBio Sequencing • Long reads are excellent for de-novo genome assembly and isoform detection • Output is expensive compared to illumina, but getting better • Small genomes are no problem. Larger genomes are now becoming more feasible. • New amplification-free enrichment using CRISPR-Cas9
  • 42. NGI stockholm Oxford Nanopore • Newest contender in the sequencing world • Lots of hype and taken several years to become a reality • Still developing very fast • Quality, yield and cost changing almost monthly • High error rates (but better than they used to be) • Now 2-13% depending on sequencing type
  • 49. NGI stockholm Oxford Nanopore • The best technology available for ultra long reads • Twitter users report getting reads over 1 Mbp long • "Whale spotting" - finding the longest reads on the end of the distribution curve • Price dropping rapidly, but still expensive compared to illumina • NGI has 2x MinIONs, hoping for PromethION soon
  • 51. NGI stockholm Ion Torrent • Main application • Microbial and metagenomic sequencing • Targeted re-sequencing (gene panels) • Clinical sequencing • Short, single-end reads • Fast run times
  • 52. NGI stockholm Ion Torrent PGM • Yield • 0.1 - 1 Gbp • Run time • 3 hrs • Read length • 200 - 400 bp
  • 53. NGI stockholm Ion Torrent Proton • Yield • 10 Gbp • Run time • 4 hrs • Read length • 200 bp
  • 54. NGI stockholm Ion Torrent S5 XL • Yield • 1-13 Gbp • Run time • 3 hrs • Read length • 200 - 600 bp
  • 55. NGI stockholm Sequencing Type • No need to remember all of this • Many considerations, changing all the time • We are experts - come and speak to us! support@ngisweden.se https://ngisweden.scilifelab.se/
  • 57. NGI stockholm Library Preparation • All high throughput sequencing requires some kind of library preparation • Add adapters for sequencing chemistry • Adjust DNA fragment lengths • Incorporate biological signal into sequence • Add required enzymes • Different library preps enable different applications
  • 58. NGI stockholm RNA Sequencing • Choose a type of RNA • Protein coding mRNA (poly-A) • All RNA (rRNA depletion) • Small RNA • Choose your question • Differential gene expression • Differential isoform detection & quantification • Fusion gene detection • Define your limitations • Low-input material • Low quality material (eg. FFPE)
  • 59. NGI stockholm RNA Sequencing • Illumina sequencing RNA library prep kits • Illumina TruSeq RNA • Illumina RiboZero • Illumina TruSeq RNA Exome • Clontech SMARTER Pico • Illumina TruSeq Small RNA • Oxford Nanopore, PacBio, IonTorrent Protein-coding poly-A rRNA depletion FFPE / low quality low input small RNA
  • 60. NGI stockholm DNA Sequencing • Choose your question • SNP, SNV, indel calling • Structural variant detection • De-novo genome assembly • Choose your priorities • Sequencing accuracy • Sequencing depth • Ultra-long reads • Define your requirements • Low-input material • Low quality material (eg. FFPE)
  • 61. NGI stockholm DNA Sequencing • Illumina sequencing DNA library prep kits • Illumina TruSeq DNA PCR Free • Rubicon ThruPLEX • Illumina Nextera XT • Illumina Nextera Flex • 10X Genomics • Oxford Nanopore, PacBio, IonTorrent Best quality Low input Cheap (plate format) Fast and simple Linked reads
  • 62. NGI stockholm 10X Genomics • Chromium instrument uses droplet emulsion technology for nanoliter reaction volumes • Linked-read sequencing • Large molecules fragmented in droplets and barcoded • Normal short-read illumina sequencing used • Long fragments (20-100+ Kbp) reassembled from barcodes • Regular illumina sequencing libraries produced
  • 64. NGI stockholm 10X Genomics • Single cell RNA sequencing • Thousands of cells captured in droplets • Each RNA molecule tagged with droplet barcode
  • 65. NGI stockholm • Now testing Hi-C in NGI Stockholm • Proximity ligation assay to detect physical colocation of DNA fragments within cell nuclei • Multiple applications for data • Epigenetics • De-novo genome assembly • Structural variation detection Hi-C Chr 14 Chr14
  • 66. NGI stockholm Methylation Sequencing • Bisulphite sequencing detects Cytosine methylation in genomic DNA • Unmethylated Cs converted to Uracil by bisulfites and sequenced as T • Methylated Cs are protected and sequenced as C • Oxidative bisulphite informs about hydroxy-methylation • Current under development at NGI Stockholm • PacBio and Oxford Nanopore able to detect some native base modifications
  • 67. NGI stockholm RAD Sequencing • Restriction-site Associated DNA sequencing, also known as GBS (Genotyping By Sequencing) • Genome fragmented using a restriction enzyme • Narrow size range purified - same regions of genome for all individuals • Allows cheap high-depth variant calling for large numbers of samples, without a reference genome • Excellent for population genomics and ecology
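The two wet-lab steps on this slide, restriction digestion and size selection, can be mimicked in silico. The Python sketch below is illustrative only: the default `cut_offset=1` corresponds to an enzyme such as EcoRI, which cuts `G^AATTC` after the first base, and the function names and size limits are assumptions chosen for the example.

```python
def digest(sequence, site, cut_offset=1):
    """Cut a sequence at every occurrence of a restriction site.

    cut_offset: number of bases into the site where the enzyme cuts
    (1 for an enzyme such as EcoRI, which cuts G^AATTC).
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])
    return fragments


def size_select(fragments, min_len, max_len):
    """Keep only fragments within a narrow size range, as in gel or bead selection."""
    return [f for f in fragments if min_len <= len(f) <= max_len]
```

Because the same enzyme and size window are used for every individual, the retained fragments come from the same regions of the genome in each sample, which is what makes comparison across samples possible.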
  • 69. NGI stockholm Bioinformatics at NGI • Raw sequencing data management • Demultiplexing, data transfers, backups, delivery • Quality control • Every project is checked against quality criteria • Automated analysis pipelines • Standardised pipelines give reproducible results • Software development
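Demultiplexing, the first data management step listed above, assigns each read to a sample by its index sequence. The following Python sketch shows the principle only; it is not the software used at NGI. The `(index_seq, read_seq)` input format, the sample sheet dictionary and the one-mismatch tolerance are assumptions for illustration.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def demultiplex(reads, sample_indexes, max_mismatches=1):
    """Assign reads to samples by index sequence.

    reads: iterable of (index_seq, read_seq) tuples (hypothetical format).
    sample_indexes: {sample_name: expected_index}.
    Reads matching no sample, or more than one, go to "undetermined".
    """
    bins = defaultdict(list)
    for index_seq, read_seq in reads:
        matches = [
            sample
            for sample, expected in sample_indexes.items()
            if len(expected) == len(index_seq)
            and hamming(expected, index_seq) <= max_mismatches
        ]
        sample = matches[0] if len(matches) == 1 else "undetermined"
        bins[sample].append(read_seq)
    return dict(bins)
```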
  • 70. NGI stockholm NGI Data Handling • (Diagram labels: Sequencer • Network storage • Preprocessing • Backup • UPPMAX (Irma) • UPPMAX (Grus) • SNIC Supr Authentication • Your computer • Your analysis server)
  • 71. NGI stockholm Grus Deliveries • UPPMAX tool for NGI data deliveries • NGI creates a SNIC Supr "delivery project" for each NGI sequencing project • Project PI and contact person are given access, according to what was entered on the order form • Email sent with project ID and instructions • Grus is for secure short-term storage only • Requires two-factor authentication
  • 72. NGI stockholm Analysis Pipelines • Initial data analysis for major protocols • Internal QC and standardised starting point for users • All software open source and on GitHub • http://opensource.scilifelab.se/ • http://github.com/SciLifeLab/ • Accredited facility
  • 73. NGI stockholm Analysis Requirements • Automated • Reliable • Easy for others to run • Reproducible results
  • 75. Sarek • Tumour/Normal pair WGS analysis based on GATK best practices • SNPs, SNVs and indels • Structural variants • Heterogeneity, ploidy and CNVs • Germline and/or somatic analysis • Formerly called Cancer Analysis Workflow • Tools: MuTect2, Strelka, FreeBayes, GATK HaplotypeCaller, MuTect1, ASCAT, Manta • https://github.com/SciLifeLab/Sarek
  • 76. Sarek • Tool split into sub-workflows • Bash wrapper script runs whole workflow • Manuscript submitted this week, preprint available on bioRxiv • https://www.biorxiv.org/content/early/2018/05/09/316976
  • 77. NGI-RNAseq • NGI, April – December 2017: 10,227 samples processed, 131 user projects • Raw data • Read alignment • Gene counts • Quality Control • Reporting • MIT Licence • https://github.com/SciLifeLab/NGI-RNAseq
  • 78. NGI-RNAseq • Quantitative Biology Center Tübingen, Germany • https://github.com/SciLifeLab/NGI-RNAseq • Now running using
  • 79. NGI stockholm nf-core • A community effort to collect a curated set of Nextflow analysis pipelines • GitHub organisation to collect pipelines in one place • No institute-specific branding • Strict set of guideline requirements • Automated testing for code style and function • https://nf-co.re
  • 80. NGI stockholm nf-core • https://nf-co.re • Easy to run pipelines • Helpful community • Super reproducible results
  • 81. NGI stockholm Quality Control • Every project has some level of quality control checks • Sequencing quality • FastQC, FastQ Screen • Analysis pipelines give application-specific QC • Qualimap, RSeQC • Reporting is done using MultiQC
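Sequencing quality checks of this kind rest on the per-base Phred scores stored in FASTQ files. The Python sketch below decodes them and reports a mean quality per read. It illustrates the underlying arithmetic only and is not how FastQC is implemented; it assumes four-line FASTQ records and Phred+33 encoding, and the function names are hypothetical.

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_line]


def mean_quality_per_read(fastq_text):
    """Return {read_name: mean Phred score} for four-line FASTQ records."""
    lines = fastq_text.strip().splitlines()
    results = {}
    for i in range(0, len(lines), 4):
        name = lines[i][1:].split()[0]  # drop the leading "@" and any description
        scores = phred_scores(lines[i + 3])
        results[name] = sum(scores) / len(scores)
    return results
```

A Phred score Q corresponds to an error probability of 10^(-Q/10), so a score of 40 means roughly one expected error in 10,000 base calls.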
  • 82. MultiQC • Reporting tool, parses logs from completed analysis • Creates a single HTML report for all samples & steps in a project • Interactive plots for data exploration • Current version supports 61 tools • Works with anything from tens to thousands of samples • Highly customisable
  • 86. NGI stockholm If you have a project • Visit our order portal • Create projects • Request meetings • Send us an email • https://ngisweden.scilifelab.se • support@ngisweden.se
  • 87. NGI stockholm Find our tools • View our open-source software • All code available on GitHub • http://opensource.scilifelab.se
  • 88. Acknowledgements NGI stockholm • Thanks to: Max Käller, Olga Vinnere Pettersson, NGI Sweden • Phil Ewels • phil.ewels@scilifelab.se • ewels • tallphil • support@ngisweden.se • http://ngisweden.scilifelab.se • http://opensource.scilifelab.se