Lecture Notes on Dictionary Based
Compression Techniques
for
Open Educational Resource
on
Data Compression (CA209)
by
Dr. Piyush Charan
Assistant Professor
Department of Electronics and Communication Engg.
Integral University, Lucknow
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
• Dictionary-based compression algorithms
usually build a dictionary (a collection of
character patterns) in memory as the data is
scanned for repeated information (some
implementations use a static dictionary, so it
does not have to be built dynamically).
4/22/2021 2
Dr. Piyush, Charan Dept. of ECE, Integral University, Lucknow
LZ77/LZ1/ Sliding Window
• In many applications, the output of the source consists of
recurring patterns.
• A very reasonable approach to encode such sources is to
keep a list, or dictionary, of frequently occurring patterns.
• The input is split into two classes, frequently occurring and
infrequently occurring patterns.
• There are static and adaptive dictionary techniques. Most
adaptive techniques have their roots in two papers by Ziv
and Lempel in 1977 (LZ77) and 1978 (LZ78).
LZ77 Approach
• LZ77 is an adaptive dictionary technique built around a sliding
window.
• The window consists of two parts:
– Search Buffer (SB)
– Look Ahead Buffer (LAB)
• The size of the sliding window is given as:
Window Size = SB + LAB
[Figure: a sliding window of 13 symbols, positions 1 to 13, divided into the Search Buffer followed by the Look Ahead Buffer]
• Search Buffer: contains a portion of the recently encoded
sequence.
• Look Ahead Buffer: contains the next portion of the sequence,
which is still to be encoded.
• To encode the sequence in the look-ahead buffer, the encoder moves a
search pointer back through the search buffer until it encounters a
match to the first symbol in the look-ahead buffer.
• Any two of the three sizes (window, search buffer, look-ahead buffer)
must be given in the problem to encode a given sequence of text.
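As a small illustration of the relation Window Size = SB + LAB, the sketch below fills in whichever of the three sizes is missing. The function name `buffer_sizes` and its keyword arguments are hypothetical names chosen for this example; they do not come from the slides.

```python
def buffer_sizes(window=None, search=None, lookahead=None):
    """Return (window, search, lookahead), filling in the one size left as None.

    Hypothetical helper: relies only on Window Size = SB + LAB.
    """
    given = [window, search, lookahead]
    if sum(v is None for v in given) != 1:
        raise ValueError("give exactly two of the three sizes")
    if window is None:
        window = search + lookahead
    elif search is None:
        search = window - lookahead
    else:
        lookahead = window - search
    if search <= 0 or lookahead <= 0:
        raise ValueError("both buffers must have a positive size")
    return window, search, lookahead


print(buffer_sizes(window=13, lookahead=6))  # (13, 7, 6)
```

With a window of 13 and a look-ahead buffer of 6, the search buffer therefore holds 7 symbols.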
Process of LZ77 Compression
• Let's see the process below:
• Triplets: <o, l, c>, where o is the offset, l is the length of match and c is the codeword
[Figure: a sliding window of size 13 over the sequence c a b r a c a d a b r a r r a ……, divided into the Search Buffer and the Look-Ahead Buffer]
• Offset (o): The distance between the search pointer and the
look-ahead buffer is called the offset.
• Length of match (l): The number of consecutive symbols in
the search buffer that match the consecutive symbols in the
look-ahead buffer, starting with the first symbol, is called the
length of match.
• Codeword (c): It is the codeword corresponding to the symbol
in the look-ahead buffer that follows the match.
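The three definitions above can be sketched as a single search step. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name `longest_match` and its parameters are hypothetical, positions are 0-based, ties between equally long matches keep the match nearest to the look-ahead buffer, and one symbol of the look-ahead buffer is always reserved for the codeword c.

```python
def longest_match(data, pos, search_size, lookahead_size):
    """Return (offset, length, next_symbol) for the look-ahead buffer starting at pos."""
    # Reserve one look-ahead symbol for the codeword that follows the match.
    max_len = min(lookahead_size, len(data) - pos) - 1
    best_offset, best_len = 0, 0
    # Move the search pointer back one symbol at a time, nearest candidate first.
    for offset in range(1, min(search_size, pos) + 1):
        length = 0
        # Count consecutive matching symbols; the match may run into the look-ahead buffer.
        while length < max_len and data[pos - offset + length] == data[pos + length]:
            length += 1
        if length > best_len:  # strictly longer only, so ties keep the nearer match
            best_offset, best_len = offset, length
    return best_offset, best_len, data[pos + best_len]


message = "cabracadabrarrarrad"
print(longest_match(message, 8, 7, 6))  # (7, 4, 'r')
```

Here the search buffer is "abracad" and the look-ahead buffer is "abrarr": the longest match starts 7 symbols back, runs for 4 symbols, and is followed by r.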
LZ77 Example
• Encode the message:
c a b r a c a d a b r a r r a r r a d
• Here Window Size = 13
• And Size of Look Ahead Buffer = 6
• So Size of Search Buffer = 13 - 6 = 7
LZ77 Example contd..
• Step 1: Search buffer: (empty) | Look Ahead Buffer: c a b r a c | Triplet: <0,0,c(c)>
• Step 2: Search buffer: c | Look Ahead Buffer: a b r a c a | Triplet: <0,0,c(a)>
• Step 3: Search buffer: c a | Look Ahead Buffer: b r a c a d | Triplet: <0,0,c(b)>
LZ77 Example contd..
• Step 4: Search buffer: c a b | Look Ahead Buffer: r a c a d a | Triplet: <0,0,c(r)>
• Step 5: Search buffer: c a b r | Look Ahead Buffer: a c a d a b | Triplet: <3,1,c(c)>
• Step 6: Search buffer: c a b r a c | Look Ahead Buffer: a d a b r a | Triplet: <2,1,c(d)>
LZ77 Example contd..
• Step 7: Search buffer: a b r a c a d | Look Ahead Buffer: a b r a r r | Triplet: <7,4,c(r)>
• Step 8: Search buffer: a d a b r a r | Look Ahead Buffer: r a r r a d | Triplet: <3,5,c(d)>
LZ77 Example contd..
• The encoded message, in the form of triplets, is as follows:
<0, 0, c(c)>, <0, 0, c(a)>, <0, 0, c(b)>, <0, 0, c(r)>,
<3, 1, c(c)>, <2, 1, c(d)>, <7, 4, c(r)>, <3, 5, c(d)>
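The whole example can be checked with a short script. The sketch below is one possible reading of the procedure, not a definitive implementation: the function names `lz77_encode` and `lz77_decode` are hypothetical, each triplet stores the raw symbol in place of its codeword c(.), ties between equally long matches keep the nearest one, and the decoder is not covered in the slides and is included only as a round-trip check.

```python
def lz77_encode(data, window_size, lookahead_size):
    """Encode data as a list of (offset, length, symbol) triplets."""
    search_size = window_size - lookahead_size  # Window Size = SB + LAB
    triplets, pos = [], 0
    while pos < len(data):
        # Keep one look-ahead symbol free for the symbol that follows the match.
        max_len = min(lookahead_size, len(data) - pos) - 1
        best_offset, best_len = 0, 0
        for offset in range(1, min(search_size, pos) + 1):
            length = 0
            while length < max_len and data[pos - offset + length] == data[pos + length]:
                length += 1
            if length > best_len:  # ties keep the nearer match
                best_offset, best_len = offset, length
        triplets.append((best_offset, best_len, data[pos + best_len]))
        pos += best_len + 1  # slide the window past the match and the extra symbol
    return triplets


def lz77_decode(triplets):
    """Rebuild the message; copying symbol by symbol handles overlapping matches."""
    out = []
    for offset, length, symbol in triplets:
        for _ in range(length):
            out.append(out[-offset])
        out.append(symbol)
    return "".join(out)


message = "cabracadabrarrarrad"
triplets = lz77_encode(message, window_size=13, lookahead_size=6)
print(triplets)
# [(0, 0, 'c'), (0, 0, 'a'), (0, 0, 'b'), (0, 0, 'r'),
#  (3, 1, 'c'), (2, 1, 'd'), (7, 4, 'r'), (3, 5, 'd')]
print(lz77_decode(triplets) == message)  # True
```

The last triplet has a match length of 5 with an offset of only 3, so the match extends from the search buffer into the look-ahead buffer; this is why the decoder copies one symbol at a time.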

Unit 3 Dictionary based Compression Techniques
