Partitioning heritability by functional 
annotation using summary statistics 
Hilary Finucane 
MIT Department of Mathematics 
HSPH Department of Epidemiology 
October 21, 2014
Acknowledgements 
• Brendan Bulik-Sullivan 
• Alkes Price 
• Ben Neale 
• Alexander Gusev 
• Nick Patterson 
• Po-Ru Loh 
• Gosia Trynka 
• Han Xu 
• Verneri Anttila 
• Yakir Reshef 
• Chongzhi Zang 
• Stephan Ripke 
• Schizophrenia Working 
Group of the PGC 
• Shaun Purcell 
• Mark Daly 
• Eli Stahl 
• Soumya Raychaudhuri 
• Sara Lindstrom
Partitioning heritability by functional 
annotation is an important goal 
• Learn about genetic architecture of disease 
– Where does the heritability lie? 
• Learn about disease biology 
– What are the relevant cell types? 
• Learn about the functional annotations 
– Which functional annotations show the highest 
enrichments? 
• Downstream applications 
– Fine mapping 
– Risk prediction 
– GWAS priors 
Maurano et al. 2012 Science 
Trynka et al. 2013 Nat Genet 
Pickrell 2014 AJHG
What is partitioned heritability? 
• Our model is $Y = \sum_j X_j \beta_j + \varepsilon$, where 
• $Y$ is an individual’s phenotype, 
• $X_j$ is an individual’s genotype at the j-th SNP 
(normalized to mean 0 and variance 1), 
• $\beta_j$ is the effect of SNP j, and 
• $\varepsilon$ is noise and random environmental effects.
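What is partitioned heritability? 
• Our model is $Y = \sum_j X_j \beta_j + \varepsilon$. 
• We define heritability as $h^2 = \sum_j \beta_j^2$ 
and the heritability of a category $C$ as $h^2(C) = \sum_{j \in C} \beta_j^2$.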
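
To make the definitions concrete, here is a minimal Python sketch. It assumes heritability is the sum of squared per-SNP effects (with genotypes normalized to variance 1 and the phenotype scaled to variance 1) and that a category's heritability is the same sum restricted to SNPs in the category. The effect sizes and the category membership are made up for illustration.

```python
def heritability(betas, snps=None):
    """Sum of squared effects over all SNPs, or over the SNP indices in `snps`."""
    indices = range(len(betas)) if snps is None else snps
    return sum(betas[j] ** 2 for j in indices)


# Hypothetical effect sizes for five SNPs (per normalized genotype).
betas = [0.3, -0.2, 0.1, 0.0, 0.4]
# Hypothetical functional category containing SNPs 0 and 4.
category = {0, 4}

h2_total = heritability(betas)               # 0.09 + 0.04 + 0.01 + 0.00 + 0.16
h2_category = heritability(betas, category)  # 0.09 + 0.16
proportion = h2_category / h2_total          # share of heritability in the category

print(f"h2 = {h2_total:.2f}, h2(C) = {h2_category:.2f}, proportion = {proportion:.3f}")
```

In this toy case two of five SNPs (40%) carry about 83% of the heritability, which is the kind of imbalance the rest of the talk tries to estimate from real data.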
Partitioning heritability 
using variance components has yielded 
many insights 
• 31% of schizophrenia SNP-heritability lies in CNS+ 
gene regions spanning 20% of the genome [1]. 
• 28% of Tourette syndrome SNP-heritability and 
29% of OCD SNP-heritability lie in parietal lobe 
eQTLs spanning 5% of the genome [2]. 
• 79% of SNP-heritability, averaged across WTCCC 
and WTCCC2 traits, lies in DHS regions spanning 
16% of the genome [3]. 
[1] Lee et al. 2012 Nat Genet 
[2] Davis et al. 2013 PLoS Genet 
[3] Gusev et al. in press AJHG
A method for partitioning heritability 
from summary statistics is needed 
• Variance components methods are intractable 
at very large sample sizes. 
• There is lots of information in large meta-analyses. 
• Lots of publicly available summary statistics 
allow us to compare many phenotypes and 
many annotations to get a big picture.
Our method partitions heritability 
from summary statistics 
• Input: 
– Sample size and p-value for every SNP tested in a 
large GWAS of a quantitative or case-control trait 
– LD information from a reference panel like 1000G 
– Genome annotation of interest 
– Other genome annotations to include in the 
model. 
• Output: 
– Estimated proportion of heritability that falls 
within the annotation of interest. 
– Enrichment = (% of heritability) / (% of SNPs)
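
As a worked example of the input and output above, here is a minimal Python sketch. It assumes each GWAS p-value comes from a two-sided test with one degree of freedom, so it can be converted back to a chi-square statistic; the function names are made up for illustration. The enrichment figures are the ones quoted for conserved regions in the Conclusions (2.6% of SNPs, 30% of heritability).

```python
from statistics import NormalDist


def p_to_chisq(p):
    """Convert a two-sided p-value to a 1-degree-of-freedom chi-square statistic.

    Assumes the p-value came from a z-test, so chi-square = z^2.
    """
    z = NormalDist().inv_cdf(p / 2.0)  # lower-tail quantile; the sign is dropped by squaring
    return z * z


def enrichment(prop_heritability, prop_snps):
    """Enrichment = (proportion of heritability) / (proportion of SNPs)."""
    return prop_heritability / prop_snps


chisq_at_005 = p_to_chisq(0.05)
conserved_enrichment = enrichment(0.30, 0.026)

print(f"chi-square at p = 0.05: {chisq_at_005:.2f}")
print(f"conserved-region enrichment: {conserved_enrichment:.1f}x")
```

An enrichment of 1 means the annotation carries exactly its share of heritability; values above 1 indicate enrichment and values below 1 indicate depletion.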
Outline 
• Description of method 
• Validation on simulated data 
• Results on real data
LD is important for summary statistics-based 
methods 
• Some SNPs have a lot of LD 
to other SNPs in the same 
category. 
• Some SNPs have a lot of LD 
to SNPs in other categories. 
• Some SNPs do not have a lot 
of LD to other SNPs. 
Our solution: LD Score Regression. 
See Bulik-Sullivan et al., bioRxiv (under revision, Nat Genet) 
and ASHG 2014 poster 1787T (Bulik-Sullivan).
LD Score Regression: basic intuition 
[Figure: chi-square statistics in a high LD region and in a low LD region.] 
• Polygenicity causes more chi-square statistic inflation 
in high LD regions than in low LD regions. 
• Mean chi-square for high LD region: high 
• Mean chi-square for low LD region: low
Multivariate LD Score Regression: basic 
intuition 
• Enriched category: high chi-square for SNPs with lots of LD to the category, low chi-square for SNPs with little LD → BIG difference between lots of LD vs little LD to the category 
• Depleted category: low chi-square in both cases → SMALL difference between lots of LD vs little LD to the category
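Multivariate LD Score regression 
allows us to partition SNP heritability 
• Multivariate LD Score of a SNP with respect to a category: 
the sum, over all SNPs in the category, of r^2 with that SNP. 
• Derivations based on a polygenic model give the expected chi-square 
statistic of a SNP as a linear function of its multivariate LD Scores. 
• Easily extends to overlapping categories.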
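
The definition of the multivariate LD Score can be sketched in a few lines of Python. This is an illustration only: the tiny reference panel and the category names are made up, r^2 is the plain squared Pearson correlation, and the sum runs over every SNP in the category. Real analyses typically restrict the sum to a window around each SNP and adjust r^2 for sampling bias, which this sketch omits.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def multivariate_ld_scores(genotypes, categories):
    """For each category, return one LD Score per SNP.

    genotypes:  list of per-SNP genotype vectors (one entry per individual).
    categories: dict mapping category name -> set of SNP indices in it.
    """
    return {
        name: [
            sum(r_squared(genotypes[j], genotypes[k]) for k in members)
            for j in range(len(genotypes))
        ]
        for name, members in categories.items()
    }


# Hypothetical reference panel: 3 SNPs genotyped in 4 individuals.
panel = [
    [0, 1, 2, 1],  # SNP 0
    [0, 1, 2, 1],  # SNP 1, in perfect LD with SNP 0
    [1, 0, 1, 0],  # SNP 2, uncorrelated with SNPs 0 and 1
]
categories = {"A": {0, 1}, "B": {2}}

scores = multivariate_ld_scores(panel, categories)
print(scores)
```

SNPs 0 and 1 each get an LD Score of 2 to category A (themselves plus a perfect proxy) and 0 to category B, while SNP 2 tags only itself.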
Multivariate LD Score regression 
allows us to partition SNP heritability 
To estimate partitioned heritability: 
• Estimate LD Scores from a reference panel. 
• Regress chi-square statistics on LD Scores. 
• The slopes give the partitioned heritability. 
• For best results, use many categories!
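
The regression step can be sketched as follows. The sketch assumes the relationship published for LD Score regression, without its confounding term: E[chi-square of SNP j] = 1 + N * sum over categories C of tau_C * l(j, C), where l(j, C) is the LD Score of SNP j to category C and tau_C is the per-SNP heritability of C. All inputs below are made up. The intercept is fixed at 1, the fit is unweighted least squares, and the two categories are disjoint; a real analysis would weight the regression, use many categories, and estimate standard errors.

```python
def fit_two_categories(chisq, ld_a, ld_b, n):
    """Least squares of (chisq - 1) / n on two LD Scores, through the origin.

    Returns (tau_a, tau_b), the per-SNP heritability of each category.
    """
    y = [(c - 1.0) / n for c in chisq]
    # Normal equations for a two-predictor regression without intercept.
    s_aa = sum(a * a for a in ld_a)
    s_ab = sum(a * b for a, b in zip(ld_a, ld_b))
    s_bb = sum(b * b for b in ld_b)
    s_ay = sum(a * v for a, v in zip(ld_a, y))
    s_by = sum(b * v for b, v in zip(ld_b, y))
    det = s_aa * s_bb - s_ab * s_ab
    tau_a = (s_bb * s_ay - s_ab * s_by) / det
    tau_b = (s_aa * s_by - s_ab * s_ay) / det
    return tau_a, tau_b


# Hypothetical inputs: four regression SNPs with LD Scores to categories A and B.
N = 1000
ld_a = [1.0, 2.0, 0.5, 3.0]
ld_b = [0.5, 1.0, 2.0, 0.2]
true_tau = (0.002, 0.0005)
# Noise-free chi-square statistics generated from the assumed model.
chisq = [1.0 + N * (true_tau[0] * a + true_tau[1] * b) for a, b in zip(ld_a, ld_b)]

tau_a, tau_b = fit_two_categories(chisq, ld_a, ld_b, N)

# For disjoint categories, h2(C) = (number of SNPs in C) * tau_C.
M_A, M_B = 100, 400  # hypothetical SNP counts per category
h2_a, h2_b = M_A * tau_a, M_B * tau_b

print(f"tau_A = {tau_a:.4f}, tau_B = {tau_b:.4f}")
print(f"h2(A) = {h2_a:.2f}, h2(B) = {h2_b:.2f}")
```

Because the toy statistics are generated without noise, the fit recovers the input tau values; with real data the estimates carry sampling error, which is one reason the slide recommends many categories and large GWAS.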
Outline 
• Description of method 
• Validation on simulated data 
• Results on real data
Multivariate LD Score regression works 
in simulations 
Scenario                                             True h2(DHS)   REML (2 cat)    LD Score 
Null simulations                                     0.092          0.089 (0.006)   0.086 (0.012), 27 cat 
DHS 3x enriched                                      0.276          0.281 (0.006)   0.278 (0.013), 27 cat 
FANTOM5 Enhancer* causal                             0.379          0.531 (0.007)   0.361 (0.015), 27 cat 
FANTOM5 Enhancer* causal, excluded from the model    0.379          0.531 (0.007)   0.318 (0.014), 26 cat 
• Standard errors are over 100 simulations. 
• Simulated quantitative phenotype with h2 = 0.5. 
• M = 110,444, N = 2,713 
* Andersson et al. 2014 Nature
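
One way to read the simulation results is to express each estimate's distance from the truth in units of its standard error. The Python sketch below does that arithmetic on the numbers from the slide. It assumes the values in parentheses are standard errors of the reported estimates, and the helper name is made up for illustration.

```python
def deviation_in_se(estimate, se, truth):
    """Distance of an estimate from the true value, in standard-error units."""
    return (estimate - truth) / se


# scenario -> (true h2(DHS), {method: (estimate, standard error)}), from the slide.
SCENARIOS = {
    "Null simulations": (0.092, {
        "REML (2 cat)": (0.089, 0.006),
        "LD Score (27 cat)": (0.086, 0.012),
    }),
    "DHS 3x enriched": (0.276, {
        "REML (2 cat)": (0.281, 0.006),
        "LD Score (27 cat)": (0.278, 0.013),
    }),
    "FANTOM5 Enhancer causal": (0.379, {
        "REML (2 cat)": (0.531, 0.007),
        "LD Score (27 cat)": (0.361, 0.015),
    }),
    "FANTOM5 Enhancer causal, excluded": (0.379, {
        "REML (2 cat)": (0.531, 0.007),
        "LD Score (26 cat)": (0.318, 0.014),
    }),
}

for scenario, (truth, methods) in SCENARIOS.items():
    for method, (estimate, se) in methods.items():
        z = deviation_in_se(estimate, se, truth)
        print(f"{scenario:36s} {method:18s} {z:+6.1f} SE")

# Sanity check on the "3x enriched" label: ratio of true h2(DHS) to the null value.
enrichment_ratio = 0.276 / 0.092
```

Under that reading, both methods stay within about one standard error of the truth in the first two scenarios. When the causal variants sit in FANTOM5 enhancers, two-category REML is more than 20 standard errors too high, LD Score regression with the enhancer category in the model is about 1 standard error low, and leaving that category out of the model costs roughly 4 standard errors.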
Outline 
• Description of method 
• Validation on simulated data 
• Results on real data
Datasets analyzed 
Phenotype                 Citation                                      Sample size 
Schizophrenia             SCZ working grp of the PGC, 2014 Nature       70,100 
Bipolar Disorder          Bip working grp of the PGC, 2011 Nat Genet    16,731 
Rheumatoid Arthritis*     Okada et al., 2014 Nature                     38,242 
Crohn’s Disease*          Jostins et al., 2012 Nature                   20,883 
Ulcerative Colitis*       Jostins et al., 2012 Nature                   27,432 
Height                    Wood et al., 2014 Nature Genetics             253,280 
BMI                       Speliotes et al., 2010 Nature Genetics        123,865 
Coronary Artery Disease   Schunkert et al., 2011 Nature Genetics        86,995 
College (yes/no)          Rietveld et al., 2013 Science                 126,559 
Type 2 Diabetes           Morris et al., 2012 Nature Genetics           69,033 
*HLA locus excluded from all analyses for autoimmune traits
Annotations used 
Mark                                                                                        Source/reference 
Coding, 3’ UTR, 5’ UTR, Promoter, Intron                                                    UCSC; Gusev et al., in press AJHG 
Digital Genomic Footprint, TFBS                                                             ENCODE; Gusev et al., in press AJHG 
CTCF binding site, Promoter Flanking, Repressed, Transcribed, TSS, Enhancer, Weak Enhancer  ENCODE; Hoffman et al., 2012 Nucleic Acids Research 
DHS, fetal DHS, H3K4me1, H3K4me3, H3K9ac                                                    Trynka et al., 2013 Nature Genetics* 
Conserved                                                                                   Lindblad-Toh et al., 2011 Nature 
FANTOM5 Enhancer                                                                            Andersson et al., 2014 Nature 
lincRNAs                                                                                    Cabili et al., 2011 Genes Dev 
DHS and DHS promoter                                                                        Maurano et al., 2012 Science 
H3K27ac                                                                                     Roadmap; PGC2 2014 Nature 
*Post-processed from ENCODE and Roadmap data by S. Raychaudhuri and X. Liu labs
[Figure: Coding, Intergenic, Enhancer, H3K4me3, and DHS enrichments in six phenotypes. Bars indicate 95% confidence intervals.] 
[Figure: Coding, Intergenic, Enhancer, H3K4me3, DHS, and Conserved enrichments in six phenotypes. Bars indicate 95% confidence intervals. *Lindblad-Toh et al., 2011 Nature] 
[Figure: Coding, Intergenic, Enhancer, H3K4me3, DHS, and FANTOM5 Enhancer enrichments in six phenotypes. Bars indicate 95% confidence intervals. *Andersson et al., 2014 Nature]
Cell-type specific H3K27ac enrichments 
inform trait biology 
• We group 56 cell types into 7 basic categories. 
• For each trait (10 traits) and each category (7 categories), 
we assess the significance of the improvement to 
the model from adding that category.
Conclusions 
• Many annotations are enriched in many 
phenotypes. 
• Conserved regions, 2.6% of SNPs, are 
estimated to explain 30% of heritability on 
average. 
• FANTOM5 Enhancers are extremely enriched 
in autoimmune traits. 
• H3K27ac cell-type enrichment matches and 
extends our understanding of disease biology.


Editor's Notes

  • #4: For a GWAS of a common complex trait, most of the heritability, and so most of the information, lies in the majority of SNPs that do not reach statistical significance. Partitioning heritability is a way to leverage all of the SNPs, instead of just the statistically significant SNPs, to answer questions about genetic architecture, about the biology of disease, and about functional annotations.
  • #7: Note: this extends to case-control traits under a liability threshold model. Note: equivalent to other definitions under certain assumptions.
  • #8: Partitioning heritability is traditionally done with a variance components method such as REML implemented in GCTA, and has yielded many insights in the past. I’d like to highlight this recent result of Gusev et al. that non-coding DHS regions comprising 16% of the genome explain an estimated 79% of heritability on average across 11 traits.
  • #9: We need a method for partitioning heritability from summary statistics not just because many of our largest datasets are meta-analyses for which no one has the genotype data required for a variance components approach, but also because even when we do have all of the genotypes, variance components methods are intractable, especially for more than a handful of components. As an added benefit of computational ease, we can examine many phenotypes and many annotations to look for higher-level patterns.