MAXIMUM PARSIMONY
SHRUTHI K
18308019
II M.Sc MICROBIOLOGY
• Phylogenetic trees, or evolutionary trees, are the basic structures needed to examine the relationships among organisms.
• They model evolutionary events of vertical and horizontal descent.
• The parsimony method is one approach to building them: it minimises the number of steps needed to generate the observed variation from common ancestral sequences.
• It prefers the simplest explanation over more complex ones.
• A multiple sequence alignment (MSA) is required to predict which sequence positions are likely to correspond.
• For each aligned position, the phylogenetic trees that require the smallest number of evolutionary changes to produce the observed sequence changes from ancestral sequences are identified.
• Finally, the trees that require the smallest number of changes overall, summed over all sequence positions, are identified.
McLennan, D.A. Evo Edu Outreach (2010) 3: 506. https://doi.org/10.1007/s12052-010-0273-6
• A rooted tree is used to make inferences about the most recent common ancestor of the leaves or branches of the tree. Most commonly the root is placed using an ‘outgroup’.
• An unrooted tree illustrates the relationships among the leaves or branches, but makes no assumption about a common ancestor.
Singh, V.K., Singh, Anil, Kayastha, Arvind & Singh, Brahma. (2014). Legumes in the Omic Era. 10.1007/978-1-4614-8370-0_12.
• External nodes: things under comparison; operational taxonomic units (OTUs).
• Internal nodes: ancestral units; hypothetical; the goal is to group current-day units.
• Topology: branching pattern of a tree.
• Branch length: amount of difference that occurred along a branch.
• Monophyletic group, or clade: a group of organisms that consists of all the descendants of a common ancestor.
• Entrez: www.ncbi.nlm.nih.gov/Taxonomy
• Ribosomal Database Project: rdp.cme.msu.edu/html/
• Tree of Life: phylogeny.arizona.edu/tree/phylogeny.html
• PHYLIP package:
  i. DNAPARS
  ii. DNAPENNY – branch-and-bound search, which makes more sequences tractable than an exhaustive search
  iii. DNACOMP – finds the tree that is supported by the largest number of sites
  iv. DNAMOVE – interactive analysis of parsimony
• Tree of life: analysing the changes that have occurred during the evolution of different organisms.
• Phylogenetic relationships among genes can help predict which ones might have similar functions (e.g., ortholog detection).
• Following the changes occurring in rapidly changing species (e.g., HIV).
• Maximum parsimony is an example of a character-based method.
• Character-based methods work on the sequence characters themselves rather than on pairwise distances.
• They count the mutational events accumulated in the sequences and may therefore avoid the loss of information that occurs when characters are converted to distances.
• This allows evolutionary dynamics to be studied and ancestral sequences to be inferred.
• The parsimony method chooses the tree that has the fewest evolutionary changes (mutations), i.e. the shortest overall branch length.
• It is based on the principle of Occam’s razor.
• It reduces the chances of inconsistencies, ambiguities and redundancies.
• By minimising the number of changes, the method minimises the phylogenetic noise owing to homoplasy and independent evolution.
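• The positions of the four-way multiple sequence alignment fall into two categories: informative and uninformative sites.
• At the first position all four sequences have the same character and no mutations: the site is invariant, and therefore uninformative.
• Positions 2 and 4 each require at least two mutations derived from ancestors: they are informative. In general, a site is informative when it has at least two different characters, each present in at least two sequences.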
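The classification rule can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: it uses the ten-site alignment of taxa A to D from the worked example later in this deck, so the site numbers differ from the figure referred to above, and the names `ALIGNMENT`, `classify_site` and `classify_alignment` are hypothetical.

```python
from collections import Counter

# Ten-site alignment from the worked example in these slides.
ALIGNMENT = {
    "A": "ATGGATTTCG",
    "B": "ATGGCGTTCG",
    "C": "GCGGAGTTCG",
    "D": "GCGGCGTTTG",
}

def classify_site(column):
    """Label one alignment column as invariant, informative or uninformative."""
    counts = Counter(column)
    if len(counts) == 1:
        return "invariant"
    # Informative: at least two states, each present in at least two taxa.
    shared = [state for state, n in counts.items() if n >= 2]
    return "informative" if len(shared) >= 2 else "uninformative"

def classify_alignment(alignment):
    """Classify every column of an alignment given as {taxon: sequence}."""
    sequences = list(alignment.values())
    return [classify_site(column) for column in zip(*sequences)]

labels = classify_alignment(ALIGNMENT)
for position, label in enumerate(labels, start=1):
    print(position, label)
```

For this alignment only sites 1, 2 and 5 come out as informative; sites 6 and 9 vary but have a state carried by a single taxon, so they cannot favour one tree over another.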
Maximum parsimony
1 2 3 4 5 6 7 8 9 10
A – A T G G A T T T C G
B – A T G G C G T T C G
C – G C G G A G T T C G
D – G C G G C G T T T G
Now, let's map one of these characters onto an unrooted tree.
Note that we must assign states to the ancestral nodes.
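[Figure: site 2 (A = T, B = T, C = C, D = C) mapped onto the unrooted tree ((A,B),(C,D)) with two different assignments of ancestral states. Assigning T to the ancestor of A and B and C to the ancestor of C and D requires 1 step; assigning C to the ancestor of A and B and T to the ancestor of C and D requires 5 steps.]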
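The effect of the ancestral assignment on the step count can be checked by brute force. The sketch below is an illustration under stated assumptions rather than the method used in the slides: the internal node names `X` and `Y` and the helper names are hypothetical, and it simply tries every pair of ancestral states for the T, T, C, C character on the tree ((A,B),(C,D)).

```python
from itertools import product

# Unrooted tree ((A,B),(C,D)): internal node X joins A and B,
# internal node Y joins C and D, and one branch connects X to Y.
BRANCHES = [("A", "X"), ("B", "X"), ("C", "Y"), ("D", "Y"), ("X", "Y")]

def count_steps(states):
    """Count the branches whose two ends carry different states."""
    return sum(states[u] != states[v] for u, v in BRANCHES)

def all_assignments(tips, alphabet="ACGT"):
    """Yield (x, y, steps) for every choice of the two ancestral states."""
    for x, y in product(alphabet, repeat=2):
        states = dict(tips, X=x, Y=y)
        yield x, y, count_steps(states)

# The character mapped in the slides: A = T, B = T, C = C, D = C.
site = {"A": "T", "B": "T", "C": "C", "D": "C"}
results = list(all_assignments(site))
best = min(results, key=lambda r: r[2])
worst = max(results, key=lambda r: r[2])
print("fewest steps:", best)
print("most steps  :", worst[2])
```

The single best assignment is X = T, Y = C with 1 step, and the worst assignments reach 5 steps, one change on every branch. Parsimony always scores a tree by its best assignment.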
Steps required on the tree ((A,B),(C,D)):
• Site 1 (A, A, G, G): 1 step
• Site 2 (T, T, C, C): 1 step
• Site 5 (A, C, A, C): 2 steps, with two equally parsimonious reconstructions
Mapping should also be done for all other sites:
• Sites 3, 4, 7, 8, 10: 0 steps
• Site 6 (T, G, G, G): 1 step
• Site 9 (C, C, C, T): 1 step
Mapping should also be done for all possible trees.
There are three possible unrooted trees for four taxa.
• ((A,B),(C,D))
• ((A,D),(C,B))
• ((A,C),(B,D))
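The count of three follows from the general formula for unrooted bifurcating trees, (2n - 5)!! for n taxa, i.e. the product 3 × 5 × ... × (2n - 5). A small sketch, with hypothetical function names, shows how quickly this number grows and why exhaustive evaluation is only practical for a handful of taxa:

```python
def num_unrooted_trees(n_taxa):
    """Number of unrooted bifurcating topologies: (2n - 5)!! for n >= 3."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5.
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count

def four_taxon_trees(taxa):
    """List the three unrooted topologies for exactly four taxa."""
    a, b, c, d = taxa
    return [((a, b), (c, d)), ((a, d), (c, b)), ((a, c), (b, d))]

for n in (4, 5, 10):
    print(n, num_unrooted_trees(n))
print(four_taxon_trees("ABCD"))
```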
(Continued)
• Evaluate each possible tree for all sites to determine the smallest total number of changes needed to generate each one.
• Note that sites 3, 4, 6, 7, 8, 9 and 10 require the same number of steps on every tree: they are parsimony uninformative.
                 Sites
Tree             1  2  3  4  5  6  7  8  9  10  Total
((A,B),(C,D))    1  1  0  0  2  1  0  0  1  0   6
((A,D),(C,B))    2  2  0  0  2  1  0  0  1  0   8
((A,C),(B,D))    2  2  0  0  1  1  0  0  1  0   7
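The per-site counts in this table can be reproduced with Fitch's algorithm, which finds the minimum number of steps without enumerating ancestral states. The following is a minimal sketch, assuming each unrooted tree is rooted arbitrarily on its internal branch (this does not change the count when all changes cost the same); the nested-tuple tree encoding and the function names are illustrative choices, not taken from the slides.

```python
ALIGNMENT = {
    "A": "ATGGATTTCG",
    "B": "ATGGCGTTCG",
    "C": "GCGGAGTTCG",
    "D": "GCGGCGTTTG",
}

# Each tree is a nested tuple; a string is a tip.
TREES = {
    "((A,B),(C,D))": (("A", "B"), ("C", "D")),
    "((A,D),(C,B))": (("A", "D"), ("C", "B")),
    "((A,C),(B,D))": (("A", "C"), ("B", "D")),
}

def fitch(node, tip_states):
    """Return (possible state set, steps) for the subtree below node."""
    if isinstance(node, str):            # a tip: observed state, no steps
        return {tip_states[node]}, 0
    left_set, left_steps = fitch(node[0], tip_states)
    right_set, right_steps = fitch(node[1], tip_states)
    common = left_set & right_set
    if common:                           # children can agree: no extra step
        return common, left_steps + right_steps
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_steps + right_steps + 1

def site_steps(tree, alignment):
    """Minimum number of steps at every site of the alignment on one tree."""
    n_sites = len(next(iter(alignment.values())))
    steps = []
    for i in range(n_sites):
        tips = {name: seq[i] for name, seq in alignment.items()}
        steps.append(fitch(tree, tips)[1])
    return steps

for label, tree in TREES.items():
    steps = site_steps(tree, ALIGNMENT)
    print(label, steps, "total =", sum(steps))
```

The totals come out as 6, 8 and 7, so ((A,B),(C,D)) is the most parsimonious tree for this alignment.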
WEIGHTED PARSIMONY
• Suppose we weight transversions with twice the value of transitions.
• Sites 5 (A/C) and 6 (T/G) involve transversions, so they are now weighted twice as much as sites 1, 2 and 9, which involve transitions.
                 Sites
Tree             1  2  3  4  5  6  7  8  9  10  Total
((A,B),(C,D))    1  1  0  0  4  2  0  0  1  0   9
((A,D),(C,B))    2  2  0  0  4  2  0  0  1  0   11
((A,C),(B,D))    2  2  0  0  2  2  0  0  1  0   9
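Weighted parsimony can be computed with Sankoff's dynamic-programming algorithm, which generalises the step count to an arbitrary cost per change. The sketch below is an illustration under stated assumptions: transitions cost 1 and transversions cost 2, as suggested above, and the names `cost`, `sankoff` and `weighted_steps` are hypothetical. Because the sketch weights every transversion, the T/G change at site 6 is counted twice as well as the A/C change at site 5.

```python
INF = float("inf")
BASES = "ACGT"
PURINES = {"A", "G"}

ALIGNMENT = {
    "A": "ATGGATTTCG",
    "B": "ATGGCGTTCG",
    "C": "GCGGAGTTCG",
    "D": "GCGGCGTTTG",
}

TREES = {
    "((A,B),(C,D))": (("A", "B"), ("C", "D")),
    "((A,D),(C,B))": (("A", "D"), ("C", "B")),
    "((A,C),(B,D))": (("A", "C"), ("B", "D")),
}

def cost(x, y, ts_weight=1, tv_weight=2):
    """Cost of changing x to y: transitions stay within purines or pyrimidines."""
    if x == y:
        return 0
    same_class = (x in PURINES) == (y in PURINES)
    return ts_weight if same_class else tv_weight

def sankoff(node, tip_states):
    """Return {state: minimum cost of the subtree if node has that state}."""
    if isinstance(node, str):            # a tip: only its observed state is free
        return {b: (0 if b == tip_states[node] else INF) for b in BASES}
    child_tables = [sankoff(child, tip_states) for child in node]
    return {
        s: sum(min(cost(s, t) + table[t] for t in BASES) for table in child_tables)
        for s in BASES
    }

def weighted_steps(tree, alignment):
    """Minimum weighted cost at every site of the alignment on one tree."""
    n_sites = len(next(iter(alignment.values())))
    per_site = []
    for i in range(n_sites):
        tips = {name: seq[i] for name, seq in alignment.items()}
        per_site.append(min(sankoff(tree, tips).values()))
    return per_site

for label, tree in TREES.items():
    per_site = weighted_steps(tree, ALIGNMENT)
    print(label, per_site, "total =", sum(per_site))
```

Under this weighting ((A,B),(C,D)) and ((A,C),(B,D)) tie, which shows that the choice of weights can change which trees are preferred.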
ADVANTAGES
• Easy to understand.
• Makes relatively few assumptions.
• Well studied mathematically.
• Many useful software packages.
• More theoretical arguments:
  1. Methodologically, parsimony forces us to maximise homologous similarity. This is not necessarily true for other methods.
  2. Parsimony is based on an evolutionary assumption: evolutionary change is rare. This is not the case for most distance methods.
DISADVANTAGES
• Why not use parsimony?
• It is not statistically consistent: under some scenarios it is possible (even likely) to get the wrong tree.
• Long-branch attraction: similar to the rate heterogeneity problem encountered with distance methods.
• When DNA substitution rates are high, the probability that two lineages will convergently evolve the same nucleotide at the same site increases. When this happens, parsimony erroneously interprets the similarity as a synapomorphy (i.e., as having evolved once in the common ancestor of the two lineages).
VERSIONS
• Versions of parsimony:
• Fitch parsimony – no limitations on permissible character changes; reversible, P(A->T) = P(T->A)
• Wagner parsimony – allows ordered transformations (to get from C to G, you must proceed through A); reversible
• Dollo parsimony – consider restriction site characters
  • P(0->1) ≠ P(1->0)
  • Limited non-reversibility: a derived state, once lost, cannot be regained
  • Works really well for mobile element insertion data
• Camin-Sokal parsimony – evolutionary changes are irreversible
• Transversion parsimony – ignores transitions or downweights them severely
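Several of these versions differ only in how much each change of state costs, which can be written as a cost function. The sketch below is a hedged illustration, not a definition taken from the slides: states are encoded as integers (for the ordered example, C = 0, A = 1, G = 2 is an assumed encoding), the function names are hypothetical, and Dollo parsimony is left out because its rule that a derived state arises only once is not captured by a simple per-change cost.

```python
INF = float("inf")

def fitch_cost(i, j):
    """Unordered states: any change costs one step, in either direction."""
    return 0 if i == j else 1

def wagner_cost(i, j):
    """Ordered states 0, 1, 2, ...: a change passes through every state in between."""
    return abs(i - j)

def camin_sokal_cost(i, j):
    """Irreversible change: a state may increase but never decrease."""
    return j - i if j >= i else INF

# Assumed ordering C - A - G, encoded as 0, 1, 2.
C, A, G = 0, 1, 2
print("Fitch       C -> G:", fitch_cost(C, G))
print("Wagner      C -> G:", wagner_cost(C, G))
print("Camin-Sokal 0 -> 1:", camin_sokal_cost(0, 1))
print("Camin-Sokal 1 -> 0:", camin_sokal_cost(1, 0))
```

Under the ordered scheme the change from C to G costs two steps because it must pass through A, whereas the unordered scheme charges one step for any change.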
LONG BRANCH ATTRACTION
• Long branch attraction refers to a phylogenetic artifact in which rapidly evolving taxa with long branches are placed together.
• This happens regardless of their true positions.
• It arises from the assumption that all lineages evolve at the same rate and that all mutations contribute to branch length.
[Figure: unrooted tree of taxa A, B, C and D illustrating long branches.]
• The edges leading to sequences/taxa A and C are long relative to the other branches in the tree, reflecting the relatively greater number of substitutions that have occurred along those two edges.
• Long branch attraction occurs when rates of evolution vary considerably among sequences, or when the sequences being analysed are quite divergent.
How to overcome long branch attraction?
One way to reduce the effect of long edges is to add sequences/taxa that join onto those edges, thus breaking them up.
• Krane, D.E. and Raymer, M.L. (2003). Fundamental Concepts of Bioinformatics. Pearson Education.
• Xiong, J. (2006). Essential Bioinformatics. Cambridge University Press.
• Mount, D. (2004). Bioinformatics: Sequence and Genome Analysis. Cold Spring Harbor Laboratory Press, New York.
