1. Single Nucleotide Polymorphisms (SNPs), Haplotypes, Linkage Disequilibrium, and the Human Genome
Manish Anand
Nihar Sheth
Jim Costello
Univ. of Indiana
24th November, 2003
2. Biological Background
►How can researchers hope to identify and study all the changes that occur in so many different diseases?
►How can they explain why some people respond to treatment and not others?
3. ‘SNP’ is the answer to these questions…
►So what exactly are SNPs?
►How are they involved in so many different aspects of health?
4. What is a SNP?
►A SNP is defined as a single base change in a DNA sequence that occurs in a significant proportion (more than 1 percent) of a large population.
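The 1 percent criterion in the definition can be illustrated with a minimal sketch. The function names and the sample allele counts are hypothetical, chosen only for illustration; the sketch assumes each list holds one observed base per sampled chromosome at a single position.

```python
from collections import Counter


def minor_allele_frequency(alleles):
    """Frequency of the second most common allele at one position.

    Returns 0.0 when only a single allele is observed.
    """
    counts = Counter(alleles).most_common()
    if len(counts) < 2:
        return 0.0
    return counts[1][1] / len(alleles)


def is_snp(alleles, threshold=0.01):
    """True if the minor allele is seen in more than `threshold` of the sample."""
    return minor_allele_frequency(alleles) > threshold


# Hypothetical samples of 2000 chromosomes each (illustrative data only).
common_variant = ["G"] * 1900 + ["A"] * 100  # minor allele at 5 percent
rare_variant = ["G"] * 1995 + ["A"] * 5      # minor allele at 0.25 percent

print(is_snp(common_variant))
print(is_snp(rare_variant))
```

Under this sketch the first position would count as a SNP and the second would be treated as a rare variant, since its minor allele falls below the 1 percent cutoff.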
5. Some Facts
► In human beings, 99.9 percent of bases are the same.
► The remaining 0.1 percent makes a person unique.
Different attributes / characteristics / traits:
► how a person looks,
► diseases he or she develops.
► These variations can be:
Harmless (change in phenotype)
Harmful (diabetes, cancer, heart disease, Huntington's disease, and hemophilia)
Latent (variations found in coding and regulatory regions that are not harmful on their own; the change in each gene only becomes apparent under certain conditions, e.g. susceptibility to lung cancer)
6. SNP facts
► SNPs are found in coding and (mostly) noncoding regions.
► They occur with a very high frequency: estimates range from about 1 in 1000 bases to 1 in every 100 to 300 bases.
► The abundance of SNPs and the ease with which they can be measured make these genetic variations significant.
► A SNP close to a particular gene acts as a marker for that gene.
► SNPs in coding regions may alter the structure of the protein encoded by that region.
7. SNPs may / may not alter protein structure
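Whether a coding SNP changes the protein depends on whether the altered codon still encodes the same amino acid. The sketch below is illustrative only: it builds the standard genetic code table and classifies a single-base change within one codon. The function name `classify_snp` and the example codons are assumptions, not part of the original slides.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(codon, position, new_base):
    """Classify a single-base change at `position` (0, 1 or 2) of a DNA codon.

    Sketch only: a change that removes an existing stop codon is reported
    as "missense" rather than handled as a separate case.
    """
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if before == after:
        return "synonymous"  # same amino acid, protein unchanged
    if after == "*":
        return "nonsense"    # premature stop codon
    return "missense"        # different amino acid


print(classify_snp("GAA", 2, "G"))  # GAA -> GAG
print(classify_snp("GAG", 1, "T"))  # GAG -> GTG
print(classify_snp("TGG", 2, "A"))  # TGG -> TGA
```

The three calls cover the cases the slide title alludes to: a change that leaves the amino acid untouched, one that substitutes a different amino acid, and one that introduces a stop codon.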
8. SNPs act as gene markers
9. SNP maps
►Sequence genomes of a large number of people
►Compare the base sequences to discover SNPs.
►Generate a single map of the human genome containing all possible SNPs => SNP maps
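The compare-and-map steps above can be sketched in a few lines, assuming the sequences are already aligned and of equal length. The function name is hypothetical, and the four short fragments are the ones that appear on the case-control slide of this deck, reused here purely as toy input.

```python
def find_variable_sites(sequences):
    """Return {position: sorted list of bases} for every column that varies.

    Assumes pre-aligned sequences of equal length; positions are 0-based.
    """
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")
    sites = {}
    for pos in range(length):
        bases = sorted({seq[pos] for seq in sequences})
        if len(bases) > 1:
            sites[pos] = bases
    return sites


# Toy input: four aligned fragments from different individuals.
samples = ["GCCGTTGAC", "GCCATTGAC", "GCCATTGAC", "GCCATTGAC"]
print(find_variable_sites(samples))
```

A real SNP map would also need alignment against a reference genome and a frequency filter; this sketch only shows the column-by-column comparison.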
11. SNP Profiles
► The genome of each individual contains a distinct SNP pattern.
► People can be grouped based on their SNP profile.
► SNP profiles are important for identifying response to drug therapy.
► Correlations might emerge between certain SNP profiles and specific responses to treatment.
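Grouping people by SNP profile and looking for a link to treatment response can be sketched as follows. Everything here is hypothetical: the patient identifiers, the three-SNP profiles, and the response flags are made-up illustrative data, and a profile is represented simply as a string of bases at the typed positions.

```python
from collections import defaultdict


def response_rate_by_profile(profiles, responded):
    """Group individuals by SNP profile and compute the fraction who responded.

    profiles:  {person: profile string, e.g. "AGT"}
    responded: {person: True/False}
    """
    groups = defaultdict(list)
    for person, profile in profiles.items():
        groups[profile].append(person)
    return {
        profile: sum(responded[p] for p in members) / len(members)
        for profile, members in groups.items()
    }


# Hypothetical data for illustration only.
profiles = {"p1": "AGT", "p2": "AGT", "p3": "GGT", "p4": "GGT", "p5": "GGT"}
responded = {"p1": True, "p2": True, "p3": False, "p4": True, "p5": False}

for profile, rate in response_rate_by_profile(profiles, responded).items():
    print(profile, round(rate, 2))
```

With samples this small the rates mean nothing statistically; the point is only the shape of the computation, in which each profile becomes a group whose response can be compared with the others.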
13. Techniques to detect known Polymorphisms
► Hybridization Techniques
Microarrays
Real-time PCR
► Enzyme-based Techniques
Nucleotide extension
Cleavage
Ligation
Reaction product detection and display
► Comparison of Techniques used
14. Techniques to detect unknown Polymorphisms
► Direct Sequencing
► Microarray
► Cleavage / Ligation
► Electrophoretic mobility assays
► Comparison of Techniques used
15. Direct Sequencing
► Sanger dideoxy sequencing can detect any type of unknown polymorphism and its position when the majority of the DNA contains that polymorphism.
► Misses polymorphisms and mutations when the DNA is heterozygous.
► Limited utility for analysis of solid tumors or pooled samples of DNA due to low sensitivity.
► Once a sample is known to contain a polymorphism in a specific region, direct sequencing is particularly useful for identifying the polymorphism and its specific position.
► Even if the identity of the polymorphism cannot be discerned in the first pass, multiple sequencing attempts have proven quite successful in elucidating sequence and position information.
16. SIGNIFICANCE OF SNPs
IN DISEASE DIAGNOSIS
IN FINDING PREDISPOSITION TO DISEASES
IN DRUG DISCOVERY & DEVELOPMENT
IN DRUG RESPONSES
INVESTIGATION OF MIGRATION PATTERNS
ALL THESE ASPECTS WILL HELP IN THE SEARCH FOR MEDICATION & DIAGNOSIS AT THE INDIVIDUAL LEVEL
Feb. 25. 2003 SI Hung
17. SNP Screening
Two different screening strategies
- Many SNPs in a few individuals
- A few SNPs in many individuals
Different strategies will require different tools
Important in determining markers for complex genetic states
18. SNP genotyping methods for detecting genes contributing to susceptibility or resistance to multifactorial diseases and adverse drug reactions:
=> case-control association analysis
case
control
….GCCGTTGAC….
….GCCATTGAC….
….GCCATTGAC….
….GCCATTGAC….
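A case-control association analysis compares allele counts between the two groups. The sketch below is a minimal illustration and not the method used by the authors: it computes an allelic odds ratio and a Pearson chi-square statistic (no continuity correction) from a 2x2 table. The function name and all counts are hypothetical.

```python
def allelic_association(case_a, case_b, control_a, control_b):
    """Odds ratio and Pearson chi-square for a 2x2 allele count table.

    case_a / case_b:       counts of allele A / allele B among cases
    control_a / control_b: counts of allele A / allele B among controls
    Sketch only: assumes no cell or margin is zero.
    """
    n = case_a + case_b + control_a + control_b
    odds_ratio = (case_a * control_b) / (case_b * control_a)
    chi_square = (
        n * (case_a * control_b - case_b * control_a) ** 2
        / (
            (case_a + case_b)
            * (control_a + control_b)
            * (case_a + control_a)
            * (case_b + control_b)
        )
    )
    return odds_ratio, chi_square


# Hypothetical allele counts: G vs A at one SNP, 100 chromosomes per group.
odds_ratio, chi_square = allelic_association(60, 40, 40, 60)
print(odds_ratio, chi_square)

# With 1 degree of freedom, a chi-square above 3.84 corresponds to p < 0.05.
print(chi_square > 3.84)
```

In practice many SNPs are tested at once, so a multiple-testing correction would be needed before calling any single association significant.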
19. HAPLOTYPE
A set of closely linked genetic markers present on one chromosome which tend to be inherited together (not easily separable by recombination)
21. HAPLOTYPE CORRELATION WITH PHENOTYPE
Associating haplotype frequencies with the frequencies of the phenotype of interest in the population will help in utilizing the maximum potential of SNPs as markers.
The “haplotype-centric” approach combines the information of adjacent SNPs into composite multilocus haplotypes. Haplotypes are not only more informative but also capture the regional linkage disequilibrium (LD) information, which is assumed to be robust and powerful.
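The regional LD that haplotypes capture can be quantified for a pair of biallelic SNPs. The sketch below uses the standard definitions D = p(AB) - p(A)p(B) and r² = D² / (p(A)(1 - p(A))p(B)(1 - p(B))), computed from phased haplotype counts. The function name and the counts are hypothetical, and the sketch assumes both SNPs are polymorphic in the sample.

```python
def linkage_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r_squared) for two biallelic SNPs from haplotype counts.

    n_AB .. n_ab are counts of the four possible two-SNP haplotypes.
    Sketch only: assumes both SNPs vary in the sample (no zero allele frequency).
    """
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total  # frequency of allele A at the first SNP
    p_B = (n_AB + n_aB) / total  # frequency of allele B at the second SNP
    d = p_AB - p_A * p_B
    r_squared = d ** 2 / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d, r_squared


# Hypothetical counts: the two SNPs always travel together (complete LD).
print(linkage_disequilibrium(50, 0, 0, 50))

# Hypothetical counts: all four haplotypes equally common (no LD).
print(linkage_disequilibrium(25, 25, 25, 25))
```

When r² is close to 1, typing one SNP predicts the other almost perfectly, which is why a haplotype can summarise several adjacent SNPs.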
22. ADVANTAGES:
1. SNPs ARE THE MOST FREQUENT FORM OF DNA VARIATION
2. THEY ARE THE DISEASE-CAUSING MUTATIONS IN MANY GENES
3. THEY ARE ABUNDANT & HAVE LOW MUTATION RATES
4. THEY ARE EASY TO SCORE
5. THEY MAY WORK AS THE NEXT GENERATION OF GENETIC MARKERS
23. Some important SNP database resources
1. dbSNP (http://www.ncbi.nlm.nih.gov/SNP/)
LocusLink (http://www.ncbi.nlm.nih.gov/LocusLink/list.cgi)
2. TSC (http://snp.cshl.org/)
3. SNPper (http://snpper.chip.org/bio/)
4. JSNP (http://snp.ims.u-tokyo.ac.jp/search.html)
5. GeneSNPs (http://www.genome.utah.edu/genesnps/)
6. HGVbase (http://hgvbase.cgb.ki.se/)
7. PolyPhen (http://dove.embl-heidelberg.de/PolyPhen/)
OMIM (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM)
8. Human SNP database (http://www-genome.wi.mit.edu/snp/human/)
Editor's Notes
#20: SNPs are simple to measure and understand.
Haplotypes have the advantage, in the appropriate circumstances, of carrying more information about the genotype-phenotype link than do the underlying SNPs.