Review Paper digit
Structural Variation Detection
Table of contents
• Detection of structural DNA variation from next generation
sequencing data: a review of informatic approaches
• The software pipeline digit
Detection of structural DNA variation from next generation sequencing
data: a review of informatic approaches
Authors: Haley J. Abel (1), Eric J. Duncavage (2)
(1) Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA
(2) Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, MO, USA
Definition
Structural DNA variation is generally defined as
variation in a DNA region larger than 1 kb and
includes several classes such as translocations,
inversions, insertions/deletions and copy number
variations (CNVs).
Methods
• Cytogenetics:
unbiased
BUT limited resolution/sensitivity (350-500 band level)
• FISH - Fluorescence in situ hybridization:
increased resolution, ability to test fixed interphase cells, faster turnaround time,
greater sensitivity
BUT evaluation of multiple loci requires multiple probes/assays ⇒ increasing
complexity
• Microarrays:
especially reliable for CNV and loss of heterozygosity
BUT unable to detect balanced translocations
• Next Generation Sequencing:
ability to detect full range of genetic variation ⇒ potential to streamline testing by
using a single analysis platform
BUT dependent on coverage ⇒ susceptible to GC bias
NGS - Methods
• Depth of coverage analysis
• Discordant read pair analysis
• Split read analysis
Tools
Translocation and Inversion Detection
Discordant pair analysis:
• sensitive, but with low breakpoint resolution and low specificity
• repetitive regions are a source of false positives, but they also drive real
translocations, so true events are difficult to separate from false positives
• Many methods use heuristic cutoffs to improve specificity:
• VariationHunter and Hydra consider multiple high-scoring mappings if
available
• GASVPro tries to improve specificity by combining discordant pair
and coverage analysis
Split read analysis: excellent breakpoint resolution (up to single-base
resolution), but requires much higher coverage.
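The discordant pair idea above can be illustrated with a minimal sketch. It assumes a standard paired-end library in which concordant pairs map in forward/reverse orientation about `mean_insert` bases apart; the `ReadPair` type, the function name, and all default values are hypothetical and not taken from any of the tools named on the slide.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    chrom1: str
    pos1: int
    strand1: str  # "+" or "-"
    chrom2: str
    pos2: int
    strand2: str  # "+" or "-"


def classify_pair(pair, mean_insert=400, sd_insert=50, n_sd=3):
    """Label a read pair by the variant class its mapping pattern hints at.

    Assumption: concordant pairs map as "+" upstream of "-" on the same
    chromosome, roughly mean_insert bases apart.
    """
    if pair.chrom1 != pair.chrom2:
        return "translocation"
    # Order the two reads by mapping position.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )
    if left_strand == right_strand:
        return "inversion"
    if left_strand == "-" and right_strand == "+":
        return "tandem_duplication"  # everted pair
    span = right_pos - left_pos
    if span > mean_insert + n_sd * sd_insert:
        return "deletion"
    if span < mean_insert - n_sd * sd_insert:
        return "insertion"
    return "concordant"
```

A single pair is only a hint; callers usually require several pairs supporting the same event, which is why specificity depends so strongly on the cutoffs chosen.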
Copy Number Variation Detection
Discordant pair analysis:
• performs best on large deletions; struggles with duplications
• cannot detect large insertions with the usual strategy due to pairs not
spanning the duplication
• Pindel pieces translocation calls together via a pattern growth algorithm
to find large insertions
Copy Number Variation Detection
Depth of coverage analysis:
• DNA
• Main problem is accounting for factors that modify read depth, such as GC
bias
• Event-wise testing (EWT) algorithms rely purely on deviations in
coverage from the sample’s mean depth. GC content is addressed by
analysing the genome bin-wise.
• SegSeq, CNVnator, CNAseg and CNV-seq compare the same region across
multiple samples (control samples). These methods also make use of
bins/partitions and rely on coverage ratios, which permit finer CNV
mapping.
• Exome
• target capture data increases GC bias
• the small size of targets makes paired normals or population controls a
requirement
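A minimal sketch of the bin-wise idea above, assuming per-bin read depths and GC fractions have already been computed. The function names, the GC stratum width, and the gain/loss cutoffs are illustrative assumptions, not the procedure of SegSeq, CNVnator, CNAseg or CNV-seq.

```python
import math
from collections import defaultdict
from statistics import median


def gc_normalize(depths, gc_fractions, gc_step=0.05):
    """Scale each bin's depth by the median depth of bins with similar GC."""
    strata = defaultdict(list)
    for depth, gc in zip(depths, gc_fractions):
        strata[round(gc / gc_step)].append(depth)
    overall = median(depths)
    normalized = []
    for depth, gc in zip(depths, gc_fractions):
        stratum_median = median(strata[round(gc / gc_step)])
        # Leave the depth unchanged if the stratum has no coverage at all.
        factor = overall / stratum_median if stratum_median > 0 else 1.0
        normalized.append(depth * factor)
    return normalized


def log2_ratios(sample_depths, control_depths, pseudocount=0.5):
    """Per-bin log2 coverage ratio of sample over control."""
    return [
        math.log2((s + pseudocount) / (c + pseudocount))
        for s, c in zip(sample_depths, control_depths)
    ]


def call_bins(ratios, gain=0.4, loss=-0.6):
    """Label each bin using hypothetical log2 ratio cutoffs."""
    return [
        "gain" if r >= gain else "loss" if r <= loss else "neutral"
        for r in ratios
    ]
```

For example, `call_bins(log2_ratios([100, 50, 150], [100, 100, 100]))` labels the second bin a loss and the third a gain; real tools then segment adjacent bins rather than calling each bin on its own.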
Copy Number Variation Detection
• Exome methods first call CNVs locally and then merge the calls
using various strategies
• CONTRA: uses circular binary segmentation for merging
• CoNVEX: denoises coverage ratios with a discrete wavelet transform
and then uses a Hidden Markov Model to identify gains and losses
• ExomeCNV: models B-allele frequencies to detect loss of
heterozygosity
• Some methods try to find sporadic CNVs in population exome data by
normalizing read counts with principal component analysis
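The "call locally, then merge" pattern can be shown with a deliberately simple stand-in: adjacent targets with the same state are joined into one segment. This is not circular binary segmentation, a wavelet transform or an HMM; the function name and tuple layout are assumptions made for illustration only.

```python
from itertools import groupby


def merge_calls(targets):
    """Merge runs of adjacent targets that share a chromosome and state.

    targets: list of (chrom, start, end, state) tuples, sorted by chromosome
    and start, where state is "gain", "loss" or "neutral".
    Returns (chrom, start, end, state, n_targets) for non-neutral runs.
    """
    segments = []
    for (chrom, state), group in groupby(targets, key=lambda t: (t[0], t[3])):
        run = list(group)
        if state != "neutral":
            segments.append((chrom, run[0][1], run[-1][2], state, len(run)))
    return segments
```

Statistical segmentation methods differ mainly in tolerating noisy targets inside a run, which this naive merge does not.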
Insertion and Deletion Detection
• Alignment based:
• offered by many packages: SAMtools, GATK, VarScan
• usually rely on probabilistic models to make indel calls
• Dindel and Stampy rely on these methods but employ filters to
differentiate common errors from true indels
• all of these methods require considerable validation
• insertion detection is limited to 15% of total read length
• Split read based:
• suitable for medium-sized indels
• high false-positive rate, because no probabilistic models discriminate
between alignment errors and true events
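Alignment-based callers start from the gaps the aligner already placed in each read. As a minimal sketch, the function below extracts candidate insertions and deletions from a SAM CIGAR string; the function name and the 0-based coordinate convention are assumptions, and real callers add the probabilistic models and filters mentioned above.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def indels_from_cigar(ref_start, cigar):
    """Return (type, ref_pos, length) for each insertion or deletion.

    ref_start is the 0-based reference position of the first aligned base.
    """
    events = []
    ref_pos = ref_start
    for length_text, op in CIGAR_OP.findall(cigar):
        length = int(length_text)
        if op == "I":
            events.append(("INS", ref_pos, length))
        elif op == "D":
            events.append(("DEL", ref_pos, length))
        # Only these operations consume reference bases.
        if op in "MDN=X":
            ref_pos += length
    return events
```

Because an insertion must fit inside the read together with enough flanking sequence to anchor the alignment, this approach only sees insertions that are short relative to the read length.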
Conclusion
• There is currently no single informatic method capable of identifying
the full range of structural DNA variation.
• Multiple complementary tools are required for robust variant detection.
• Since methods can perform differently based on assay design,
extensive validation is required for clinical use.
digit - A tool for detection and identification of genomic
inter-chromosomal translocations
Authors: Richard Meier (1,4), Stefan Graw (1,4), Julian R Molina (3), Peter
Beyerlein (1), Devin Koestler (2), Jeremy Chien (4)
(1) Technical University of Applied Sciences Wildau, 15745 Wildau, Germany
(2) Department of Biostatistics, University of Kansas Medical Center, Kansas City, KS 66160
(3) Department of Medical Oncology, Mayo Clinic, Rochester, MN 55905
(4) Department of Cancer Biology, University of Kansas Medical Center, Kansas City, KS 66160
Goals of the project
• Interchromosomal translocation detection utilizing mate-pair
sequencing data
• Handle artifacts and robustly remove false positive calls
• Investigate translocation profiles of populations / trait associated
groups
Mate-pair sequencing
[Figure: mate-pair library preparation workflow. Labels: genome / chromosome, template, fragmentation, adapter ligation, circularisation, fragmentation, terminal fragment, sequencing, read1, read2]
digit overview
[Figure: digit pipeline overview, including an MVM density plot of concordant and discordant pairs with a threshold separating rejected from approved pairs]
• Start from preprocessed read pairs (read_1, read_2)
• Retain discordantly mapping read pairs (reads on chromosome_1 and chromosome_2)
• Find read pair clusters
• Calculate MVMs for each pair and filter out low value pairs
• Recluster remaining read pairs
• Compare samples and search for group associations (discordant read pair clusters across samples form group associated super clusters)
• Output: called translocations, each a pair of regions, e.g. chr5:25819112-25821940 & chr9:5151006-5154147
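The clustering step of the pipeline can be sketched as follows. This is a greedy illustration under assumed parameters, not digit's actual implementation: the function name, `max_gap`, and `min_support` are hypothetical.

```python
from collections import defaultdict


def cluster_pairs(pairs, max_gap=3000, min_support=2):
    """Group inter-chromosomal read pairs that point at the same junction.

    pairs: iterable of (chrom_a, pos_a, chrom_b, pos_b), chrom_a != chrom_b.
    Returns (chrom_a, start_a, end_a, chrom_b, start_b, end_b, n_pairs).
    """
    by_chroms = defaultdict(list)
    for chrom_a, pos_a, chrom_b, pos_b in pairs:
        # Store every pair with its chromosomes in a fixed order.
        if chrom_a > chrom_b:
            chrom_a, pos_a, chrom_b, pos_b = chrom_b, pos_b, chrom_a, pos_a
        by_chroms[(chrom_a, chrom_b)].append((pos_a, pos_b))

    calls = []
    for (chrom_a, chrom_b), points in sorted(by_chroms.items()):
        clusters = []  # each cluster is a list of (pos_a, pos_b)
        for pos_a, pos_b in sorted(points):
            for cluster in clusters:
                near_a = pos_a - max(p[0] for p in cluster) <= max_gap
                low_b = min(p[1] for p in cluster) - max_gap
                high_b = max(p[1] for p in cluster) + max_gap
                if near_a and low_b <= pos_b <= high_b:
                    cluster.append((pos_a, pos_b))
                    break
            else:
                clusters.append([(pos_a, pos_b)])
        for cluster in clusters:
            if len(cluster) >= min_support:
                a = [p[0] for p in cluster]
                b = [p[1] for p in cluster]
                calls.append(
                    (chrom_a, min(a), max(a), chrom_b, min(b), max(b), len(cluster))
                )
    return calls
```

Running the same function again after low-MVM pairs have been filtered out corresponds to the reclustering step, since clusters that lose support drop below `min_support`.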
Mapping validity measure (MVM)
[Figure: example read pair whose reads the mapper assigns to regions on chromosome A and chromosome B, shown with the read and reference sequences; labels: 2kb, "mapper assigns read to region", "but"]
• The two reads of a read pair are remapped to both regions the mapping software
originally assigned them to.
• If a read maps equally well to both regions it is impossible to resolve the read
pair’s origin and it is rejected.
• The MVM judges how ambiguous the mappability of a read pair is.
• The MVM distribution of concordant (well behaved) read pairs in a sample is
used as an internal standard to determine a filtering threshold.
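The slides do not give the MVM formula, so the following is only a hypothetical stand-in that follows the bullets above: each read is scored against both regions, a pair whose reads fit both regions equally well scores 1.0 (fully ambiguous), and the cutoff is taken from the scores of concordant pairs. The ungapped match fraction, the `floor` value, and the quantile are all assumptions.

```python
def best_match_fraction(read, region):
    """Best ungapped match fraction of the read over all windows in region."""
    best = 0
    for offset in range(len(region) - len(read) + 1):
        window = region[offset:offset + len(read)]
        matches = sum(1 for r, g in zip(read, window) if r == g)
        best = max(best, matches)
    return best / len(read)


def validity_score(read_a, read_b, region_a, region_b, floor=0.5):
    """Hypothetical validity score: fit to own region over fit to the other.

    The less distinct of the two reads determines the score of the pair.
    """
    ratios = []
    for read, own, other in (
        (read_a, region_a, region_b),
        (read_b, region_b, region_a),
    ):
        own_fit = best_match_fraction(read, own)
        other_fit = max(best_match_fraction(read, other), floor)
        ratios.append(own_fit / other_fit)
    return min(ratios)


def threshold_from_concordant(scores, quantile=0.05):
    """Take a low quantile of concordant pair scores as the filter cutoff."""
    ordered = sorted(scores)
    return ordered[int(quantile * (len(ordered) - 1))]
```

Discordant pairs scoring below the cutoff would be rejected as ambiguous; in practice the two fit values would come from a real aligner rather than this toy matcher.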
Simulated data
Real data
Across all samples, the MVM thresholds separated ambiguous read pairs well
from distinct ones.
[Figure: MVM density plots of concordant and discordant read pairs with the filtering threshold, one panel per sample (x-axis: MVM, about 1.0 to 2.5; y-axis: Density). Panels: LU526 (N = 749), LU748 (N = 461), LU271 (N = 641), LU820 (N = 534), LU1160 (N = 268), LU1184 (N = 370), LU1434 (N = 391), LU1466 (N = 585)]
Real data
• We processed 20 patient samples from a non-cancer background and
35 patient samples with a lung cancer background.
• After comparing the two populations we retrieved 218 sample-specific
events, 160 of which were from cancer samples.
• 328 translocation calls were shared between 2 or more samples.
• 16 translocations were shared between cancer samples exclusively.
• 13 translocations shared between cancer and normal samples were
labeled potentially disease relevant.
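The categories in these bullets can be derived from a table of which samples carry each call. The sketch below shows one way to do that bookkeeping; the function name, category labels, and input layout are assumptions, and it does not reproduce how the "potentially disease relevant" label was assigned.

```python
def summarize_calls(call_samples, cancer_samples):
    """Sort translocation calls into sharing categories.

    call_samples: dict mapping a call identifier to the set of samples
    in which that call was found.
    cancer_samples: set of sample names belonging to the cancer group.
    """
    summary = {
        "sample_specific": [],
        "shared_cancer_only": [],
        "shared_normal_only": [],
        "shared_mixed": [],
    }
    for call_id, samples in sorted(call_samples.items()):
        in_cancer = samples & cancer_samples
        in_normal = samples - cancer_samples
        if len(samples) == 1:
            summary["sample_specific"].append(call_id)
        elif not in_normal:
            summary["shared_cancer_only"].append(call_id)
        elif not in_cancer:
            summary["shared_normal_only"].append(call_id)
        else:
            summary["shared_mixed"].append(call_id)
    return summary
```

Calls in the mixed category could then be tested for enrichment in one group, which requires enough samples per group to be meaningful.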
Translocations exclusively found in cancer
Translocations enriched in cancer
Conclusion
• The method successfully reduces the false positive rate.
• Group comparison and population analysis are working, but will require
more samples to make reliable judgements in the future.
• Comparisons with other tools are currently in progress.
• Combining strategies from different tools might be valuable to look
into in future projects.
Questions?
