Sequence Alignment
Lecture – 4
Nafis Neehal, Lecturer, Department of CSE, DIU
CONTENTS
1. Sequence Alignment
2. Sequence Alignment Methods
- Pairwise Alignment
- Multiple Sequence Alignment
3. Pairwise Sequence Alignment Methods
-Global Alignment (Needleman-Wunsch)
- Local Alignment (Smith-Waterman)
4. Multiple Sequence Alignment
- Progressive Method
- Iterative Method
- MSA Challenges
1. Sequence Alignment
Why and how to align sequences
Sequence Alignment
A way of arranging DNA, RNA, or protein sequences to
identify regions of similarity that may be a
consequence of functional, structural, or
evolutionary relationships between the sequences
CTGTCG-CTGCACG
-TGC-CG-TG----
2. Sequence Alignment Methods
Pairwise and Multiple
Pairwise Sequence Alignment
▹Input: a pair of sequences
▹Align them so that the chosen alignment of the assumed
region of similarity scores at least as high as every
other possible alignment
▹Methods
- Global Alignment (Needleman-Wunsch)
- Local Alignment (Smith-Waterman)
CTGTCGCTGCACG--
-------TGC-CGTG
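To make "produces a higher score" concrete, the sketch below scores one fixed gapped alignment column by column. It assumes the match/mismatch/gap values used in the worked examples later in this lecture (1, -1, -2); the function name `score_alignment` is introduced here for illustration only.

```python
def score_alignment(row1: str, row2: str,
                    match: int = 1, mismatch: int = -1, gap: int = -2) -> int:
    """Score two gapped rows of equal length, one column at a time."""
    if len(row1) != len(row2):
        raise ValueError("aligned rows must have the same length")
    total = 0
    for x, y in zip(row1, row2):
        if x == "-" and y == "-":
            raise ValueError("a column cannot hold two gaps")
        if x == "-" or y == "-":
            total += gap        # one character aligned with a gap
        elif x == y:
            total += match      # identical characters
        else:
            total += mismatch   # different characters
    return total


# The alignment shown on this slide: 5 matches and 10 gap columns.
print(score_alignment("CTGTCGCTGCACG--", "-------TGC-CGTG"))  # -15
```

A pairwise alignment method searches over all possible gapped arrangements for one that maximises this kind of score.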
Multiple Sequence Alignment
• Input: three or more sequences
• Align all the sequences together so that
the alignment produces the highest score
3. Pairwise Sequence Alignment
Global and Local methods
Global Alignment (Needleman-Wunsch)
3 Major Steps
-Create 2D Matrix
-Trace back
-Final Alignment
Create 2D Matrix
- Draw a (Row + 1) x (Col + 1) 2D matrix (Row, Col =
lengths of seq1 and seq2 respectively)
- Place the 2 seqs as Row and Column
headers
- Cell (0,0) = 0
- Cell (0,1) to Cell (0,Col) and Cell
(1,0) to Cell (Row,0): value = previous
cell value + gap value (the gap value is negative)
- For other cell values, follow
equation (1)
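Trace back
- Start from Cell (Row, Col)
- Go back to Cell (0,0), each time moving to the
neighbouring cell that produced the current value
Final Alignment
- Start from Cell (Row, Col)
- If the move is diagonal, place a
character in both seqs
- If the move is up or left, place a
character in one seq and a
gap in the other seq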
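The three steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name, the default scores (taken from the example slide that follows) and the tie-breaking rule (diagonal first, then up, then left) are assumptions made here.

```python
def needleman_wunsch(seq1, seq2, match=1, mismatch=-1, gap=-2):
    """Global alignment: returns (matrix, aligned seq1, aligned seq2)."""
    rows, cols = len(seq1), len(seq2)
    # Step 1: create the (rows + 1) x (cols + 1) matrix.
    F = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(1, rows + 1):
        F[i][0] = F[i - 1][0] + gap          # first column
    for j in range(1, cols + 1):
        F[0][j] = F[0][j - 1] + gap          # first row
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            s = match if seq1[i - 1] == seq2[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s,   # diagonal
                          F[i - 1][j] + gap,     # up
                          F[i][j - 1] + gap)     # left

    # Steps 2 and 3: trace back from Cell (rows, cols) to Cell (0, 0),
    # emitting one alignment column per move.
    out1, out2 = [], []
    i, j = rows, cols
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if seq1[i - 1] == seq2[j - 1] else mismatch
            if F[i][j] == F[i - 1][j - 1] + s:   # diagonal move
                out1.append(seq1[i - 1])
                out2.append(seq2[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and F[i][j] == F[i - 1][j] + gap:   # up move
            out1.append(seq1[i - 1])
            out2.append("-")
            i -= 1
        else:                                        # left move
            out1.append("-")
            out2.append(seq2[j - 1])
            j -= 1
    return F, "".join(reversed(out1)), "".join(reversed(out2))


F, a1, a2 = needleman_wunsch("AAAC", "AGC")
print(F[-1][-1])   # -1
print(a2)          # -AGC
print(a1)          # AAAC
```

Several alignments can share the optimal score; a different tie-breaking order in the traceback may return a different but equally good alignment.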
Global Alignment (Needleman-Wunsch) - Example
Input
- seq1 = AAAC
- seq2 = AGC
Scoring Scheme
δ(x, x) = 1 (Match)
δ(x,-) = -2 (Gap)
δ(x, y) = -1 (Mismatch)
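Eq. 1: Cell Value
$$F(i,j) = \max \begin{cases} F(i-1,j-1) + \delta(x_i, y_j) \\ F(i-1,j) + \delta(x_i, -) \\ F(i,j-1) + \delta(-, y_j) \end{cases}$$
where F(i,j) is the value of Cell (i,j), x_i is the i-th character of seq1 and y_j is the j-th character of seq2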
          A    G    C
     0   -2   -4   -6
A   -2    1   -1   -3
A   -4   -1    0   -2
A   -6   -3   -2   -1
C   -8   -5   -4   -1
Final Alignment
-AGC
AAAC
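The cell values in this example can be checked by hand. The working below is a sketch that writes F(i,j) for the value of Cell (i,j), a symbol introduced here for illustration, and applies the scoring scheme above to the first interior cell, the last cell, and the final alignment:

```latex
\begin{align*}
F(1,1) &= \max\{F(0,0)+\delta(A,A),\; F(0,1)+\delta(A,-),\; F(1,0)+\delta(-,A)\} \\
       &= \max\{0+1,\; -2-2,\; -2-2\} = 1 \\
F(4,3) &= \max\{F(3,2)+\delta(C,C),\; F(3,3)+\delta(C,-),\; F(4,2)+\delta(-,C)\} \\
       &= \max\{-2+1,\; -1-2,\; -4-2\} = -1 \\
\text{score} &= \delta(A,-)+\delta(A,A)+\delta(A,G)+\delta(C,C) = -2+1-1+1 = -1
\end{align*}
```

The score of the final alignment equals the value of Cell (4,3), which is what the traceback guarantees.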
Local Alignment (Smith-Waterman)
3 Major Steps
-Create 2D Matrix
-Trace back
-Final Alignment
Create 2D Matrix
- Draw a (Row + 1) x (Col + 1) 2D matrix (Row, Col =
lengths of seq1 and seq2 respectively)
- Place the 2 seqs as Row and Column
headers
- First Row, First Column: all values = 0
- For other cell values, follow
equation (2)
Trace back
- Start from each Cell that holds the maximum
value in the entire matrix
- Go back until the first Cell with value 0
is reached
Final Alignment
- Start from each Cell with the max value
- If the move is diagonal, place a character in both seqs
- If the move is up or left, place a character in one
seq and a gap in the other seq
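The local-alignment steps above can be sketched the same way. This is a minimal illustration under stated assumptions: the function name and default scores are chosen here (the scores match the example slide that follows), and the traceback starts from only the first cell holding the maximum value, whereas the slide starts from every such cell.

```python
def smith_waterman(seq1, seq2, match=1, mismatch=-1, gap=-2):
    """Local alignment: returns (matrix, best score, aligned1, aligned2)."""
    rows, cols = len(seq1), len(seq2)
    # First row and first column stay at 0.
    H = [[0] * (cols + 1) for _ in range(rows + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            s = match if seq1[i - 1] == seq2[j - 1] else mismatch
            H[i][j] = max(0,                     # never go below zero
                          H[i - 1][j - 1] + s,   # diagonal
                          H[i - 1][j] + gap,     # up
                          H[i][j - 1] + gap)     # left
            if H[i][j] > best:                   # remember the first maximum
                best, best_pos = H[i][j], (i, j)

    # Trace back from the maximum until a cell with value 0 is reached.
    out1, out2 = [], []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if seq1[i - 1] == seq2[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:       # diagonal move
            out1.append(seq1[i - 1])
            out2.append(seq2[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:       # up move
            out1.append(seq1[i - 1])
            out2.append("-")
            i -= 1
        else:                                    # left move
            out1.append("-")
            out2.append(seq2[j - 1])
            j -= 1
    return H, best, "".join(reversed(out1)), "".join(reversed(out2))


H, best, a1, a2 = smith_waterman("AAAC", "AAG")
print(best, a1, a2)   # 2 AA AA
```

Clamping every cell at 0 is what lets the alignment begin and end anywhere inside the two sequences.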
Local Alignment (Smith-Waterman) - Example
Input
- seq1 = AAAC
- seq2 = AAG
Scoring Scheme
δ(x, x) = 1 (Match)
δ(x,-) = -2 (Gap)
δ(x, y) = -1 (Mismatch)
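Eq. 2: Cell Value
$$H(i,j) = \max \begin{cases} 0 \\ H(i-1,j-1) + \delta(x_i, y_j) \\ H(i-1,j) + \delta(x_i, -) \\ H(i,j-1) + \delta(-, y_j) \end{cases}$$
where H(i,j) is the value of Cell (i,j), x_i is the i-th character of seq1 and y_j is the j-th character of seq2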
         A   A   G
     0   0   0   0
A    0   1   1   0
A    0   1   2   0
A    0   1   2   1
C    0   0   0   1
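Final Alignment
-AAG
AAAC
(the locally aligned region is AA with AA, score 2, traced back
from Cell (3,2) and shown here within the full sequences)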
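Two cells of this matrix are worth checking by hand: one that reaches the maximum, and one where the zero floor takes effect. The working below is a sketch that writes H(i,j) for the value of Cell (i,j), a symbol introduced here for illustration:

```latex
\begin{align*}
H(2,2) &= \max\{0,\; H(1,1)+\delta(A,A),\; H(1,2)+\delta(A,-),\; H(2,1)+\delta(-,A)\} \\
       &= \max\{0,\; 1+1,\; 1-2,\; 1-2\} = 2 \\
H(4,1) &= \max\{0,\; H(3,0)+\delta(C,A),\; H(3,1)+\delta(C,-),\; H(4,0)+\delta(-,A)\} \\
       &= \max\{0,\; 0-1,\; 1-2,\; 0-2\} = 0
\end{align*}
```

Without the 0 term, Cell (4,1) would be -1; resetting it to 0 is the only difference from the global recurrence.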
4. Multiple Sequence Alignment
Progressive, Iterative
Progressive Method
▹ Two major steps: building the guide tree and
performing a series of pairwise alignments
▹ Steps
- Take a pair of sequences and align them
- Generate the consensus of that
alignment
- Align the next sequence with the
consensus of the previous alignment
- Repeat until all sequences have been
added
▹ Examples
- Clustal Omega
- MAFFT
- KALIGN
- T-COFFEE
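The align / consensus / align-again loop can be sketched as a toy in Python. Everything here is an assumption made for illustration: sequences are taken in input order instead of following a guide tree, the pairwise step is a plain Needleman-Wunsch with scores 1, -1, -2, and the consensus rule (keep the first row's character on a mismatch) is arbitrary. The tools listed above are considerably more sophisticated than this.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch; returns the two gapped rows."""
    n, m = len(a), len(b)
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = F[i - 1][0] + gap
    for j in range(1, m + 1):
        F[0][j] = F[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s,
                          F[i - 1][j] + gap,
                          F[i][j - 1] + gap)
    row1, row2 = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if a[i - 1] == b[j - 1] else mismatch
            if F[i][j] == F[i - 1][j - 1] + s:
                row1.append(a[i - 1])
                row2.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and F[i][j] == F[i - 1][j] + gap:
            row1.append(a[i - 1])
            row2.append("-")
            i -= 1
        else:
            row1.append("-")
            row2.append(b[j - 1])
            j -= 1
    return "".join(reversed(row1)), "".join(reversed(row2))


def consensus(row1, row2):
    """Collapse a pairwise alignment into one sequence.

    Where row1 has a gap, take row2's character; otherwise keep
    row1's character (an arbitrary rule for mismatches).
    """
    return "".join(y if x == "-" else x for x, y in zip(row1, row2))


def progressive_consensus(seqs):
    """Align each new sequence against the running consensus."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    current = seqs[0]
    for nxt in seqs[1:]:
        row1, row2 = global_align(current, nxt)
        current = consensus(row1, row2)
    return current


print(progressive_consensus(["AAAC", "AGC", "AAAC"]))  # AAAC
```

Because each step only sees the consensus so far, an early alignment mistake is carried into every later step, which is one motivation for the iterative methods on the next slide.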
Iterative Method
▹ Works similarly to progressive
methods
▹ Repeatedly realigns the initial
sequences while adding new
sequences to the growing MSA
▹ Examples
- DIALIGN
- MUSCLE
- POA
MSA Challenges
▹ Computationally expensive
▹ Difficult to score: multiple comparisons are necessary in each
column of the MSA to obtain a cumulative score
▹ Placing gaps and scoring substitutions is more difficult
▹ Difficulty increases with diversity
▹ Relatively easy for a set of closely related sequences.
Identifying the correct ancestry relationships for a set of
distantly related sequences is more challenging
▹ Even more difficult if some members are more alike than
others
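One common way to obtain a cumulative column score is the sum-of-pairs scheme, sketched below. The slides do not name a scoring scheme, so this choice is an assumption, as are the function names, the scores (1, -1, -2) and the decision to score a gap aligned with a gap as 0.

```python
from itertools import combinations


def pair_score(x, y, match=1, mismatch=-1, gap=-2):
    """Score one pair of characters taken from the same column."""
    if x == "-" and y == "-":
        return 0            # assumption: gap against gap costs nothing
    if x == "-" or y == "-":
        return gap
    return match if x == y else mismatch


def sum_of_pairs(rows):
    """Add up pair_score over every pair of rows in every column."""
    if len({len(r) for r in rows}) != 1:
        raise ValueError("all rows of the MSA must have the same length")
    return sum(pair_score(x, y)
               for column in zip(*rows)
               for x, y in combinations(column, 2))


msa = ["-AGC",
       "AAAC",
       "AAAC"]
print(sum_of_pairs(msa))  # 2
```

Each column of an MSA with n sequences needs n(n-1)/2 pair comparisons, which hints at why scoring, and optimising, an MSA becomes expensive as more sequences are added.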
95%
of human DNA is identical to chimpanzee DNA
510 DNA codes
lost throughout human evolution
2 g of DNA
can hold the digital information of the whole world
1.8 meters
of DNA is squeezed into a space of 0.09 µm
Shocked?
YouTube Links
▹Global Alignment Part 1 - https://www.youtube.com/watch?v=vqxc2EfPWdk
▹Global Alignment Part 2 - https://www.youtube.com/watch?v=zwA-6_1bLgE
▹Local Alignment - https://www.youtube.com/watch?v=IatoWOsJ35Q
