Statistical Genetics Using Sequence Data
Dajiang J. Liu, Department of Statistics

Why We Study Statistical Genetics
  • Statistics originated in genetics:
      • R.A. Fisher, “The Correlation Between Relatives on the Supposition of Mendelian Inheritance”: introduced the concept of variance in this article.
      • Francis Galton, regression of human height toward the mean: introduced correlation and regression.
      • Karl Pearson, “Mendelism and the problem of mental defect” and “Tuberculosis, heredity and environment”.
  • Why don’t we seek our roots?
  • To find disease genes in the genome, statistics is a must.

Statistical Genetics
  • Disease gene mapping: determining the order of genes and their relative distances from one another on a specific chromosome.
  • A technology-driven field.
  • Mendel’s era: segregation analysis.
      • Patience: peas, fruit flies; inbreeding is necessary.
      • Experimental design.

Statistical Genetics
  • Modern era:
      • Microsatellite markers: genetic linkage analysis, extremely successful for mapping and identifying Mendelian traits.
      • Single nucleotide polymorphism (SNP) markers: case-control studies and genome-wide association studies, used to identify common variants involved in complex traits.
      • Computational techniques for likelihood calculation in pedigrees.
  • Statistics plays a major role.
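
A case-control association test at a single SNP can be illustrated with a Pearson chi-square test on a 2x2 table of allele counts. This is a minimal sketch, not a method taken from the slides: the function name `allelic_chi_square` and the counts are hypothetical, and a real genome-wide study would also need quality control and a correction for testing many SNPs.

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on allele counts.

    Rows are cases and controls; columns are alternate and reference alleles.
    Returns the chi-square statistic and its p-value.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_sums = [case_alt + case_ref, ctrl_alt + ctrl_ref]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    total = sum(row_sums)
    if 0 in row_sums or 0 in col_sums:
        raise ValueError("every row and column needs at least one allele")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts, for illustration only.
chi2, p_value = allelic_chi_square(case_alt=30, case_ref=70,
                                   ctrl_alt=20, ctrl_ref=80)
print(f"chi-square = {chi2:.3f}, p = {p_value:.3f}")
```

The test compares observed allele counts with the counts expected if allele frequency were the same in cases and controls.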

Statistical Genetics
  • Sequencing era:
      • The study of diseases caused by rare variants is emerging.
      • ABI SOLiD sequencer.
  • For sequencing data, statistics is everything.

Statistical Genetics
  • Data we work with:
      • Human Genome Project
      • HapMap Project
      • 1000 Genomes Project

Multifactorial Disease Etiology Hypothesis
  • Common Disease Common Variants (CD/CV) hypothesis: common diseases are caused by a few common variants with moderate effects.
  • E.g. age-related macular degeneration.
  • Common variants are likely to have lower odds ratios than rare variants.
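
The odds ratio mentioned above can be computed from a 2x2 table of carrier counts. The sketch below uses the standard log odds ratio (Woolf) approximation for a 95% confidence interval; the function name and all counts are hypothetical and chosen only to contrast a common variant of modest effect with a rare variant of larger effect.

```python
import math


def odds_ratio_with_ci(case_carriers, case_noncarriers,
                       ctrl_carriers, ctrl_noncarriers, z=1.96):
    """Odds ratio with an approximate confidence interval (Woolf method).

    All four counts must be positive. z=1.96 gives a 95% interval.
    """
    odds_ratio = (case_carriers * ctrl_noncarriers) / (
        case_noncarriers * ctrl_carriers)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / case_carriers + 1 / case_noncarriers
                          + 1 / ctrl_carriers + 1 / ctrl_noncarriers)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: a common variant and a rare variant.
common_or, common_lo, common_hi = odds_ratio_with_ci(300, 700, 250, 750)
rare_or, rare_lo, rare_hi = odds_ratio_with_ci(12, 988, 3, 997)

print(f"common variant OR = {common_or:.2f} ({common_lo:.2f}, {common_hi:.2f})")
print(f"rare variant   OR = {rare_or:.2f} ({rare_lo:.2f}, {rare_hi:.2f})")
```

With few carriers, the rare-variant interval is much wider, which is one reason single-variant tests have little power for rare variants.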

Multifactorial Disease Etiology Hypothesis
  • Common Disease Rare Variants hypothesis: common diseases are caused by multiple rare variants with large effect sizes.
  • The discovery of rare variants will have a high impact on public health, since they will aid in risk prediction and treatment.
  • E.g. multiple rare alleles contribute to low plasma levels of HDL cholesterol.
  • E.g. colorectal adenomas.

Challenges for Statistical Methodologies
  • Variant misclassification:
      • Non-causal variants included: the genome carries a huge number of mutations, and most of them do not cause the disease under study.
      • Causal variants excluded: e.g. intronic mutations and variants in intergenic regions.
  • Unknown patterns of interaction:
      • Within-gene interactions, e.g. Hirschsprung’s disease (RET gene).
      • Gene x gene interactions, e.g. breast cancer genes (BRCA1, BRCA2 x CHEK2).
  • Adaptive methods are needed.

Kernel Based Adaptive Clustering
  • Combines variant classification and association testing in a single coherent framework.
  • Applicable to population-based case/control studies using unrelated individuals.
  • Robust against variant misclassification.
  • Can handle gene x gene and gene x environment interactions.
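
The slides do not spell out the Kernel Based Adaptive Clustering algorithm, so the sketch below is not that method. It shows a much simpler, non-adaptive baseline for the same setting (unrelated cases and controls, several rare variants in one gene): collapse the variants into a carrier indicator per individual and assess the excess of carriers among cases with a permutation test. The function names and the toy genotypes are hypothetical.

```python
import random


def carrier_difference(genotypes, is_case):
    """Carrier fraction among cases minus carrier fraction among controls.

    genotypes: one row per individual, one minor-allele count per variant.
    is_case:   one boolean per individual.
    """
    carriers = [any(count > 0 for count in row) for row in genotypes]
    n_case = sum(is_case)
    n_ctrl = len(is_case) - n_case
    case_carriers = sum(c for c, case in zip(carriers, is_case) if case)
    ctrl_carriers = sum(c for c, case in zip(carriers, is_case) if not case)
    return case_carriers / n_case - ctrl_carriers / n_ctrl


def permutation_p_value(genotypes, is_case, n_perm=2000, seed=0):
    """One-sided permutation p-value for an excess of carriers in cases."""
    rng = random.Random(seed)
    observed = carrier_difference(genotypes, is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break any link between genotype and status
        if carrier_difference(genotypes, labels) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data: 6 cases followed by 6 controls, 3 rare variant sites each.
genotypes = [
    [1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0], [0, 0, 0], [0, 0, 0],
    [0, 0, 0], [0, 0, 0], [0, 1, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0],
]
is_case = [True] * 6 + [False] * 6

observed, p_value = permutation_p_value(genotypes, is_case)
print(f"carrier excess in cases = {observed:.2f}, permutation p = {p_value:.3f}")
```

A collapsing test of this kind treats every variant as equally likely to be causal, so it loses power when non-causal variants are included; that weakness is what motivates adaptive approaches.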