J. B. Cole
Animal Improvement Programs Laboratory
Agricultural Research Service, USDA
Beltsville, MD 20705-2350, USA
john.cole@ars.usda.gov
Use of NGS to identify the causal variant associated with a complex phenotype
Animal Sciences Group, Wageningen UR Livestock Research, The Netherlands, 29 May 2013 (2) Cole
Overview
Why are we sequencing?
How did we select the animals to sequence?
What are the steps involved in the process?
What do you do with the reads once you have them?
Where are we now?
Introduction
Several studies (Kuhn et al., 2003; Cole et al., 2007; Seidenspinner et al., 2009) have reported QTL on BTA 18 associated with dystocia
Bioinformatic analysis using SNP data has not identified the causal variant
Next-generation sequencing (NGS) has recently been used to find causal variants for novel recessive disorders
Chromosome 18 is different
Markers on chromosome 18 have large effects on several traits:
Dystocia and stillbirth: sire and daughter calving ease and sire stillbirth
Conformation: rump width, stature, strength, and body depth
Efficiency: longevity and net merit
Large calves contribute to reduced lifetimes and decreased profitability
Marker effects for dystocia complex
ARS-BFGL-NGS-109285
Cole et al., 2009 (J. Dairy Sci. 92:2931–2946)
Correlations in dystocia complex
The QTL also affects gestation length
Maltecca et al. 2011. Animal Genetics, 42:6, 585-591.
Overview of the dystocia complex
The key marker is ARS-BFGL-NGS-109285 (rs109478645) at 57,585,121 bp on BTA18
Intronic to SIGLEC12 (sialic acid binding Ig-like lectin 12)
Recent results indicate effects on gestation length (Maltecca et al., 2011) and calf birth weight (Cole et al., unpublished data)
This is a gene-rich region
http://useast.ensembl.org/Bos_taurus/Location/View?r=18%3A57583000-57587000
http://www.ncbi.nlm.nih.gov/gene?cmd=Retrieve&dopt=Graphics&list_uids=618463
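As a quick sanity check, the window shown in the Ensembl link can be compared with the key marker's position from the earlier slide (57,585,121 on BTA18, taken here to be base pairs). This is only a sketch: the function name `parse_region` is hypothetical, and the URL is written with the plain `useast.ensembl.org` host.

```python
from urllib.parse import urlparse, parse_qs

def parse_region(url):
    """Return (chromosome, start, end) from an Ensembl Location/View URL."""
    query = parse_qs(urlparse(url).query)      # parse_qs decodes %3A to ":"
    chrom, span = query["r"][0].split(":")
    start, end = (int(x) for x in span.split("-"))
    return chrom, start, end

ENSEMBL_URL = (
    "http://useast.ensembl.org/Bos_taurus/Location/View"
    "?r=18%3A57583000-57587000"
)
MARKER_CHROM = "18"
MARKER_POS = 57_585_121  # assumption: position is in base pairs

chrom, start, end = parse_region(ENSEMBL_URL)
marker_in_window = chrom == MARKER_CHROM and start <= MARKER_POS <= end
print(chrom, start, end, marker_in_window)
```

The check passes only if the position is read as base pairs, which is the reading the 4 kb Ensembl window supports.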
Copy number variants are present
ARS-BFGL-NGS-109285 is flanked by CNV
There’s a loss and a gain to the left (8 SNP region)
There’s a gain to the right (10 SNP region)
This can result in assembly problems
Hou et al. 2011. Genomic characteristics of cattle copy number variations. BMC Genomics. 12:127. http://www.biomedcentral.com/1471-2164/12/127
Where did this problem come from?
http://aipl.arsusda.gov/CF-queries/Bull_Chromosomal_EBV/bull_chromosomal_ebv.cfm?
40,803 daughters
What if we look at a different trait?
Cole et al. (2007) proposed the following mechanism:
SIGLEC12 may sequester circulating leptin
This increases gestation length
Calf birth weight (BW) is higher because of increased gestation length
Higher BW is associated with dystocia
We don’t have birth weight data
Birth weights are not routinely recorded in the US
Collaborated with Hermann Swalve’s group to develop a selection index prediction of BW PTA
Performed GWAS and gene set enrichment analysis to search for interesting associations
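To make the enrichment step concrete, here is a minimal sketch of a hypergeometric over-representation test. This is a simpler stand-in, not necessarily the gene set enrichment method used in the analysis, and every number below is a toy value invented for illustration.

```python
from math import comb

def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K belong to the pathway."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

# Toy numbers (hypothetical, not from the talk)
N = 20000  # annotated genes in the background set
K = 200    # genes assigned to one pathway
n = 100    # genes near SNP with large effects
k = 5      # of those, genes that fall in the pathway

p_value = hypergeom_upper_tail(k, K, n, N)
print(f"P(X >= {k}) = {p_value:.3g}")
```

With these toy values only one pathway gene is expected by chance, so observing five gives a small tail probability. In a real analysis the test would be repeated per pathway and corrected for multiple testing.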
GWAS for birth weight PTA
Cole et al. (2013), unpublished data
Are we measuring anything new?
Identified a SNP intronic to LHX4, which is associated with cow body weight and length (Ren et al., 2010, Mol. Biol. Rep., 37:417-422).
4 SNP in the QTL region on BTA 18 had large effects
Several other SNP with large effects are intronic or adjacent to genes with unknown functions
KEGG pathways for birth weight
What does regulation of the actin cytoskeleton have to do with birth weight in cattle? That is, do these results make sense?
Maybe… these pathways may be involved in establishment & maintenance of pregnancy, as well as coordination of growth and development.
Cole et al. (2013), unpublished data
Sequencing is becoming very affordable
Sequencing successes at AIPL/BFGL
Simple loss-of-function mutations
APAF1 – Spontaneous abortions in Holstein cattle (Adams et al., 2012)
CWC15 – Early embryonic death in Jersey cattle (Sonstegard et al., 2013)
Weaver syndrome – Neurological degeneration and death in Brown Swiss cattle (McClure et al., 2013)
Original pedigree-based design
[Pedigree diagram; "MGS" and "= 10δ" are diagram annotations whose positions are not recoverable]
Bull A (1968), AA, SCE: 8
Bull B (1962), AA, SCE: 7
Bull H (1989), Aa, SCE: 14
Bull I (1994), Aa, SCE: 18
Bull E (1982), Aa, SCE: 8
Bull F (1987), Aa, SCE: 15
Bull C (1975), AA, SCE: 8
Bull D (1968), ??, SCE: 7
Bull E (1974), Aa, SCE: 10
Modified pedigree & haplotype design
[Pedigree diagram; "MGS" and "= 10δ" are diagram annotations whose positions are not recoverable]
Bull A (1968), AA, SCE: 8
Bull B (1962), AA, SCE: 7
Bull H (1989), Aa, SCE: 14
Bull I (1994), Aa, SCE: 18
Bull E (1982), Aa, SCE: 8
Bull F (1987), Aa, SCE: 15
Bull C (1975), AA, SCE: 8
Bull E (1974), Aa, SCE: 10
These bulls carry the haplotype with the largest negative effect on SCE:
Bull J (2002), Aa, SCE: 6
Bull K (2002), Aa, SCE: 15
Bull J (2002), aa, SCE: 15
Couldn’t obtain DNA:
Bull D (1968), ??, SCE: 7
Molecular prep
Sample Collection
DNA Extraction
DNA Quality
Library Construction
Library Quality Control
Sample preparation time is substantial
DNA Extraction: ~12 hours (30 mins)
DNA QC: ~1-2 hours (1-2 hours)
Library Construction: 48 hours (12 hours)
Library QC: ~2-4 hours (1 hour)
Total: 3-4 days (15.5 hours)
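The totals can be checked with a few lines of arithmetic. This sketch assumes the first figure on each line is elapsed time and the figure in parentheses is hands-on time. The slide does not label the two columns, so that reading is an assumption.

```python
# (elapsed_low, elapsed_high, hands_on_low, hands_on_high), all in hours.
# Assumption: the parenthesised values on the slide are hands-on time.
steps = {
    "DNA extraction":       (12, 12, 0.5, 0.5),
    "DNA QC":               (1, 2, 1, 2),
    "Library construction": (48, 48, 12, 12),
    "Library QC":           (2, 4, 1, 1),
}

elapsed = (
    sum(s[0] for s in steps.values()),
    sum(s[1] for s in steps.values()),
)
hands_on = (
    sum(s[2] for s in steps.values()),
    sum(s[3] for s in steps.values()),
)

print(f"Elapsed:  {elapsed[0]}-{elapsed[1]} hours "
      f"({elapsed[0] / 24:.1f}-{elapsed[1] / 24:.1f} days of continuous time)")
print(f"Hands-on: {hands_on[0]}-{hands_on[1]} hours")
```

Under that reading the hands-on upper bound matches the 15.5 hours on the slide. The elapsed sum is just under three days of continuous time, so the quoted 3-4 days presumably allows for working-day scheduling.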
DNA quality
Library quality
Sequencing stage
• Illumina cBot:
  • Preps DNA for sequencing
  • Takes 4-5 hours
  • Must be done 48 hours before
• Illumina HiSeq 2000:
  • Does the sequencing
  • Takes ~10-14 days for 100 x 100
  • Minimal hands-on time
Anatomy of a flow cell
8 lanes per flow cell
3 columns per lane
96 tiles per column
Each tile imaged 8 times
1 from upper surface, 1 from lower
Approximately 300 Gb of sequence per flow cell
http://www.qbi.uq.edu.au/images/genomics/genomics1.jpg
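A rough sense of what one lane delivers follows from the figures above. The sketch assumes the yield is spread evenly across lanes and uses an approximate bovine genome size of 2.7 Gb, which is an outside assumption rather than a number from the talk.

```python
FLOW_CELL_YIELD_GB = 300   # gigabases per flow cell, from the slide
LANES_PER_FLOW_CELL = 8    # from the slide
GENOME_SIZE_GB = 2.7       # assumption: approximate bovine genome size

yield_per_lane_gb = FLOW_CELL_YIELD_GB / LANES_PER_FLOW_CELL
raw_coverage_per_lane = yield_per_lane_gb / GENOME_SIZE_GB

print(f"Yield per lane: {yield_per_lane_gb} Gb")
print(f"Raw coverage per lane: {raw_coverage_per_lane:.1f}x")
```

This is raw coverage only. Duplicate reads, low-quality bases, and unmapped reads would lower the usable depth.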
Sequencing by synthesis
https://www.broadinstitute.org/files/shared/illuminavids/sequencingSlides.pdf
How many scientists does it take…
Flowcell 1: Cluster densities
Cluster densities from current HiSeq run finished 30 April 2013 (unpublished data):
Flowcell 2: Cluster densities
Cluster densities from current HiSeq run started 22 May 2013 (unpublished data):
The Aftermath
Total Time (sample to sequence):
3 weeks
That’s assuming nothing went wrong!
More realistic: months
Resulting Data
Large text files
~300 gigabytes compressed
Analysis
Often underestimated
Can take months as well
Variant detection
• Alignment against a reference genome
• Analysis is very disk I/O-intensive.
Raw Sequencer Output → Alignment to the Genome → Variant Detection
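The three stages could be strung together along the following lines. The talk does not name the tools used, so `bwa`, `samtools`, and `bcftools` are assumptions here, as are the file names, the `build_commands` helper, and the default thread count. The sketch only builds and prints the command strings and runs nothing.

```python
import shlex

def build_commands(reference, reads_1, reads_2, sample, threads=16):
    """Return shell commands for: align and sort, index, call variants.

    Assumes the reference FASTA has already been indexed with `bwa index`.
    """
    ref, r1, r2 = (shlex.quote(p) for p in (reference, reads_1, reads_2))
    bam = shlex.quote(f"{sample}.sorted.bam")
    vcf = shlex.quote(f"{sample}.vcf")
    return [
        # Raw sequencer output -> alignment to the genome (sorted BAM)
        f"bwa mem -t {threads} {ref} {r1} {r2} | samtools sort -o {bam} -",
        # Index the alignments so downstream tools can seek by region
        f"samtools index {bam}",
        # Alignment -> variant detection (variant sites only, VCF output)
        f"bcftools mpileup -f {ref} {bam} | bcftools call -mv -o {vcf}",
    ]

# Hypothetical file names for one sequenced bull
commands = build_commands(
    "reference.fa", "bullA_R1.fastq.gz", "bullA_R2.fastq.gz", "bullA"
)
for command in commands:
    print(command)
```

Piping the aligner straight into the sorter avoids writing an uncompressed intermediate file, which matters when the analysis is limited by disk I/O.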
Computational Logistics
Desktop computers
Viable for single lanes
Long computation time
Servers are better
>100 GB RAM and >16 processor cores
Cloud
Amazon Web Services
iAnimal/iPlant
Storage considerations
What to save?
Raw data?
Processed results?
How much workspace?
Suggestions:
Workspace: 10x the size of the compressed files
Save alignments
Backup REGULARLY!!!
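Applied to the ~300 gigabytes of compressed output mentioned earlier in the talk, the 10x rule of thumb works out as below. The helper name `workspace_needed_gb` is hypothetical, and the conversion uses 1 TB = 1000 GB.

```python
def workspace_needed_gb(compressed_gb, factor=10):
    """Scratch space suggested by the '10x compressed files' rule of thumb."""
    return compressed_gb * factor

COMPRESSED_GB = 300  # compressed output per run, from the talk

workspace_gb = workspace_needed_gb(COMPRESSED_GB)
workspace_tb = workspace_gb / 1000  # assumption: decimal terabytes

print(f"Workspace: {workspace_gb} GB (~{workspace_tb:.0f} TB)")
```

Backups of raw data and saved alignments come on top of this working space.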
Why should you use a pipeline?
• Automates analysis
• Maximizes resource utilization
• Because post-docs aren’t cheap
Many options for analysis pipelines
Galaxy server
NextGene
Custom pipeline
Scripting languages
Open-source tools
Challenges
Annotation
This is a mess in the cow
The reference assembly may not be representative of all taurine cows
Validation
Doing functional genomics with large mammals is expensive – who pays?
When have we proven something?
Conclusions
Sequencing is powerful, but presents many challenges
Computational requirements are substantial
We’re learning how much we don’t know about functional genomics in the cow
Validation remains a problem
Acknowledgments
AIPL: Derek Bickhart, Dan Null, Paul VanRaden
BFGL: Reuben Anderson, Steve Schroeder, Tad Sonstegard, Curt Van Tassell
Questions?
http://gigaom.com/2012/05/31/t-mobile-pits-its-math-against-verizons-the-loser-common-sense/shutterstock_76826245/
