MULTIPLE SEQUENCE ALIGNMENT
HARSHITA AGARWAL
KUMAR SHASHANK
SHAURYA
NANCY
BACKGROUND
In bioinformatics, a sequence alignment is a way of arranging DNA, RNA
or protein sequences to identify regions of similarity that may be a
consequence of functional, structural or evolutionary relationships
between the sequences. Sequences can be aligned pairwise against each
other, searched against a whole database, or aligned together in a
multiple sequence alignment (MSA) of three or more sequences at a
time. MSA helps to find conserved domains across a whole set of
sequences that are difficult to detect by pairwise alignment alone.
Like pairwise alignment, MSA can be done locally or globally. Global
alignment spans the full length of every sequence, inserting gaps
where needed, while local alignment covers only the most similar
regions. MSA can be viewed as an extension of pairwise alignment: the
first step of many MSA methods is a pairwise alignment of all possible
pairs, followed by construction of a guide tree that directs the final
alignment. There are various ways of doing MSA: progressive, iterative
and block-based methods. These methods are explained in the following
text. There are also statistical models of an MSA, which provide a
table or file containing the probability of each amino acid or
nucleotide at each position of the alignment. These models help to
relate the aligned sequences and to study insertions and deletions
more easily and efficiently. In the present study we explain various
ways of performing MSA and name some statistical models of MSA (PSSM,
profile, and PSI-BLAST).
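The per-position table mentioned above can be illustrated with a small sketch. It only counts residue frequencies in each column of a toy DNA alignment; a real PSSM usually adds pseudocounts and converts frequencies to log-odds scores against background frequencies, which is not done here. The function name `column_frequencies` and the example alignment are assumptions made for illustration.

```python
from collections import Counter


def column_frequencies(alignment, alphabet="ACGT"):
    """Return one dict per alignment column mapping residue -> frequency.

    Gap characters ('-') are ignored when computing frequencies.
    This is a plain frequency table, not a log-odds PSSM.
    """
    table = []
    for column in zip(*alignment):  # iterate over columns, not rows
        counts = Counter(c for c in column if c != "-")
        total = sum(counts.values())
        table.append(
            {s: (counts[s] / total if total else 0.0) for s in alphabet}
        )
    return table


# Hypothetical toy alignment: three sequences, four columns.
toy = ["ACGT", "ACGA", "AC-T"]
for position, freqs in enumerate(column_frequencies(toy), start=1):
    print(position, freqs)
```

In the toy alignment the first column is fully conserved (A has frequency 1.0), while the last column is split between T and A.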
Multiple Sequence Alignment
A natural extension of pairwise alignment is multiple sequence
alignment, which aligns multiple related sequences to achieve optimal
matching of the sequences. The unique advantage of multiple sequence
alignment is that it reveals more biological information than many
pairwise alignments can.
SCORING FUNCTION
The scoring function for multiple sequence alignment is based on the
concept of sum of pairs (SP). As the name suggests, it is the sum of
the scores of all possible pairs of sequences in a multiple alignment,
based on a particular scoring matrix. To calculate the SP score, each
column is scored by summing the scores of all possible pairwise
matches, mismatches and gaps. The score of the entire alignment is the
sum of all of the column scores.
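The sum-of-pairs calculation described above can be sketched as follows. The match, mismatch and gap values are arbitrary assumptions chosen for illustration; in practice the pair scores would come from a substitution matrix such as BLOSUM or PAM. Treating a gap aligned with a gap as zero is also an assumption, although a common one.

```python
from itertools import combinations


def column_score(column, match=1, mismatch=-1, gap=-2):
    """Sum the scores of every pair of symbols in one alignment column."""
    total = 0
    for a, b in combinations(column, 2):
        if a == "-" and b == "-":
            continue  # gap against gap contributes nothing (assumption)
        if a == "-" or b == "-":
            total += gap
        elif a == b:
            total += match
        else:
            total += mismatch
    return total


def sp_score(alignment, match=1, mismatch=-1, gap=-2):
    """Sum-of-pairs score: the sum of all column scores."""
    return sum(
        column_score(col, match, mismatch, gap) for col in zip(*alignment)
    )


# Toy alignment of three sequences (hypothetical data).
print(sp_score(["ACG-", "AC-T", "A-GT"]))  # -> -6
```

Here the first column (A, A, A) contributes three matching pairs (+3), and each of the other three columns contains one match and two gap pairs (1 - 2 - 2 = -3), giving 3 - 9 = -6.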
[Figure: classification of multiple sequence alignment methods into
exhaustive and heuristic; the heuristic methods are progressive,
iterative and block-based.]
EXHAUSTIVE ALGORITHMS
The exhaustive alignment method examines all possible aligned
positions simultaneously. Dynamic programming for pairwise alignment
uses a two-dimensional matrix to search for an optimal alignment; to
use dynamic programming for multiple sequence alignment, extra
dimensions are needed to take all possible ways of sequence matching
into consideration, which means building a multidimensional search
matrix. To align N sequences, an N-dimensional matrix has to be filled
with alignment scores. Because the computational time and memory
required increase exponentially with the number of sequences, the
method is computationally prohibitive for large data sets. For this
reason, full dynamic programming is limited to small data sets of
fewer than ten short sequences.
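A quick calculation shows how fast the search matrix grows. The sketch below assumes the standard formulation in which each sequence of length n contributes an axis of n + 1 positions, and each cell is reached from up to 2^N - 1 neighbouring cells (every combination of "advance or insert a gap" per sequence, except the one where no sequence advances). The function names are illustrative.

```python
def dp_cells(lengths):
    """Number of cells in the N-dimensional dynamic programming matrix."""
    cells = 1
    for n in lengths:
        cells *= n + 1  # one extra position per axis for the empty prefix
    return cells


def predecessors_per_cell(n_sequences):
    """Maximum number of neighbouring cells examined for each cell."""
    return 2 ** n_sequences - 1


for n in (2, 3, 5, 10):
    lengths = [100] * n  # n sequences of 100 residues each
    print(n, dp_cells(lengths), predecessors_per_cell(n))
```

For two sequences of 100 residues the matrix has 10,201 cells; for three it already has 1,030,301, and for ten it has 101^10, which is on the order of 10^20.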
HEURISTIC ALGORITHMS
Because dynamic programming is not feasible for routine multiple
sequence alignment, faster heuristic algorithms have been developed.
The heuristic algorithms fall into three categories:
• progressive alignment
• iterative alignment
• block-based alignment
Progressive Alignment Method
Progressive alignment builds a multiple alignment stepwise and is
heuristic in nature. It does not separate the process of scoring an
alignment from the optimization algorithm, and it does not directly
optimize any global scoring function of alignment correctness. In
return, it is fast and efficient.
Steps:
1. Perform pairwise alignments of all pairs of sequences.
2. Use the alignment scores to build a phylogenetic (guide) tree with
the neighbor-joining (NJ) method.
3. Align the sequences in the order indicated by the phylogenetic
relationships in the tree.
CLUSTALW, a progressive alignment program, produces the best match it
can find for the sequences and arranges them so that the similarities
and differences can be seen. It works on the hypothesis that the
sequences in an alignment reflect their evolutionary history.
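Steps 1 and 2 above can be sketched in a few lines. This is a simplified illustration, not how CLUSTALW is implemented: the pairwise scores use a basic Needleman-Wunsch recurrence with arbitrary match, mismatch and gap values, the score-to-distance conversion is ad hoc, and the guide tree is built with UPGMA-style average linkage instead of neighbor-joining to keep the example short. Step 3, the profile alignment along the tree, is not shown. All function names are assumptions.

```python
from itertools import combinations


def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score (Needleman-Wunsch, linear gap penalty)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def distance(a, b):
    """Ad hoc distance: 0 for identical sequences, larger when dissimilar."""
    return 1 - nw_score(a, b) / max(len(a), len(b))


def guide_tree(names, seqs):
    """Build a nested-tuple guide tree by average-linkage clustering."""
    # Step 1: all pairwise distances between the original sequences.
    d = {
        frozenset((i, j)): distance(seqs[i], seqs[j])
        for i, j in combinations(range(len(seqs)), 2)
    }

    def avg(m1, m2):
        return sum(d[frozenset((x, y))] for x in m1 for y in m2) / (
            len(m1) * len(m2)
        )

    # Step 2: repeatedly join the two closest clusters.
    clusters = {i: (names[i], [i]) for i in range(len(seqs))}
    next_id = len(seqs)
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda p: avg(clusters[p[0]][1], clusters[p[1]][1]),
        )
        tree_a, members_a = clusters.pop(a)
        tree_b, members_b = clusters.pop(b)
        clusters[next_id] = ((tree_a, tree_b), members_a + members_b)
        next_id += 1
    return next(iter(clusters.values()))[0]


print(guide_tree(["s1", "s2", "s3"], ["ACGTACGT", "ACGTACGA", "TTTTGGGG"]))
```

The two most similar sequences (s1 and s2) are joined first, so a progressive aligner following this tree would align them to each other before adding s3.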
Iterative Alignment
The iterative approach is based on the idea that an optimal solution
can be found by repeatedly modifying existing suboptimal solutions.
The procedure starts by producing a low-quality alignment and
gradually improves it by iterative realignment through well-defined
procedures until no further improvement in the alignment score can be
achieved.
The steps are similar to progressive alignment, except that the
realignment step is repeated multiple times.
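Software used: T-Coffee (often classified as a consistency-based
progressive method; PRRN is a classic example of iterative refinement)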
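The "modify and keep if better" loop behind iterative alignment can be sketched as a simple hill climb. This is a toy illustration and not the procedure used by any particular program: the only modification tried is swapping a gap with an adjacent residue in one sequence, and the alignment is judged with a sum-of-pairs score using arbitrary match, mismatch and gap values. All names are assumptions.

```python
from itertools import combinations


def sp_score(alignment, match=1, mismatch=-1, gap=-2):
    """Sum-of-pairs score; gap-gap pairs are scored as zero."""
    total = 0
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            if a == "-" and b == "-":
                continue
            if a == "-" or b == "-":
                total += gap
            else:
                total += match if a == b else mismatch
    return total


def neighbours(alignment):
    """Yield alignments that differ by moving one gap one position."""
    for r, row in enumerate(alignment):
        for i in range(len(row) - 1):
            a, b = row[i], row[i + 1]
            if (a == "-") != (b == "-"):  # exactly one of the two is a gap
                new_row = row[:i] + b + a + row[i + 2:]
                yield alignment[:r] + [new_row] + alignment[r + 1:]


def refine(alignment):
    """Accept improving modifications until none raises the score."""
    best = list(alignment)
    best_score = sp_score(best)
    improved = True
    while improved:
        improved = False
        for candidate in neighbours(best):
            score = sp_score(candidate)
            if score > best_score:
                best, best_score = candidate, score
                improved = True
                break  # restart the search from the improved alignment
    return best, best_score


start = ["A-CGT", "AC-GT"]  # deliberately poor starting alignment
print(sp_score(start), refine(start))
```

Because every accepted change strictly increases the score and the number of possible gap placements is finite, the loop always terminates, although it may stop at a local optimum rather than the best possible alignment.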
Block-Based Alignment
The progressive and iterative alignment strategies are largely based
on global alignment and may therefore fail to recognize conserved
domains and motifs among highly divergent sequences of varying
lengths. For such divergent sequences, which share only regional
similarities, a local alignment based approach has to be used. The
strategy identifies blocks of ungapped alignment shared by all the
sequences, hence the name block-based local alignment.
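The idea of an ungapped block shared by all sequences can be shown with a deliberately simplified sketch. It only finds words of a fixed width that occur exactly in every sequence; real block-based methods also accept blocks that are similar but not identical. The function name `shared_blocks` and the example sequences are assumptions.

```python
def shared_blocks(seqs, width):
    """Map each word of the given width found in every sequence to the
    position of its first occurrence in each sequence."""
    first = seqs[0]
    blocks = {}
    for i in range(len(first) - width + 1):
        word = first[i:i + width]
        if word in blocks:
            continue  # already recorded this word
        positions = [s.find(word) for s in seqs]
        if all(p >= 0 for p in positions):
            blocks[word] = positions
    return blocks


# Hypothetical sequences of different lengths sharing one conserved word.
seqs = ["TTACGTGA", "ACGTCC", "GGGACGT"]
print(shared_blocks(seqs, 4))  # -> {'ACGT': [2, 0, 3]}
```

The reported positions show where the block starts in each sequence, so the sequences can be anchored on the block even though their flanking regions differ in length and content.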
APPLICATIONS OF MSA
• phylogenetic tree construction
• identification of functionally important regions
• prediction of protein structure
