Lecture Notes on Introduction to
Data Compression
for
Open Educational Resource
on
Data Compression (CA209)
by
Dr. Piyush Charan
Assistant Professor
Department of Electronics and Communication Engg.
Integral University, Lucknow
Content
• UNIT-I: Introduction to Compression Techniques: Lossless
compression, Lossy Compression, Measures of performance,
Modeling and coding, Mathematical Preliminaries for Lossless
compression.
• Introduction to Information Theory and Models: Physical
models, Probability models, Markov models.
2 February 2021
Dr. Piyush Charan, Dept. of ECE, Integral University, Lucknow
What is Data Compression?
• Data Compression = Modeling + Coding
• Data compression consists of taking a stream of symbols and
transforming it into codes. If the compression is
effective, the resulting stream of codes will be smaller than
the original stream of symbols.
• The decision to output a certain code for a certain symbol or
set of symbols is based on a model.
• The model is simply a collection of data and rules used to
process input symbols and determine which code(s) to
output.
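The split into modeling and coding can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the model is a plain symbol-frequency count, and the coder uses Huffman's construction as a stand-in, since the slides do not prescribe a particular model or code at this point. The names `build_model`, `build_code`, and `encode` are hypothetical.

```python
import heapq
from collections import Counter


def build_model(symbols):
    """Model: estimate how often each symbol occurs in the input."""
    return Counter(symbols)


def build_code(model):
    """Coder: assign shorter bit strings to more frequent symbols.

    Huffman's construction is an assumption chosen for illustration.
    """
    # Heap entries are (weight, tie_breaker, {symbol: code_so_far}).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(model.items())]
    if not heap:
        return {}
    if len(heap) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {sym: "0" for sym in heap[0][2]}
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        # Prefix the two lightest subtrees with 0 and 1, then merge them.
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(symbols, code):
    """Replace each input symbol by the code the model-driven coder chose."""
    return "".join(code[s] for s in symbols)


text = "aaaaaaaabbbbccd"
model = build_model(text)
code = build_code(model)
bits = encode(text, code)
print(len(bits), "bits instead of", 8 * len(text))
```

For this input the frequent symbol `a` receives a one-bit code, so the coded stream is much shorter than the 8-bits-per-character original.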
Other Definitions
• Data compression is the process of converting an input data stream
(the source stream or the original raw data) into another data stream
(the output, the bitstream, or the compressed stream) that has a
smaller size. A stream is either a file or a buffer in memory.
• The field of data compression is often called source coding. We
imagine that the input symbols (such as bits, ASCII codes, bytes,
audio samples, or pixel values) are emitted by a certain information
source and have to be coded before being sent to their destination.
The source can be memoryless, or it can have memory.
Need for Compression
• Why Data Compression?
– There are two practical motivations for compression:
• Make optimal use of limited storage space (Reduction of storage
requirements)
• Save time and help to optimize resources
– If compression and decompression are done in the I/O processor,
less time is required to move data to or from the storage
subsystem, freeing the I/O bus for other work
– When sending data over a communication line: less time to transmit
and less storage needed at the host
Data Compression
• Data compression, source coding, or bit-rate reduction is the process of
encoding information using fewer bits than the original representation. Any
particular compression is either lossy or lossless.
• Lossless compression reduces bits by identifying and eliminating statistical
redundancy. No information is lost in lossless compression.
• Lossy compression reduces bits by removing unnecessary or less important
information.
• Typically, a device that performs data compression is referred to as an
encoder, and one that performs the reversal of the process (decompression)
as a decoder.
Data Compression contd…
• When we speak of a compression technique or compression
algorithm, we are actually referring to two algorithms.
• There is the compression algorithm that takes an input
and generates a representation that requires fewer
bits, and there is a reconstruction algorithm
(decompression algorithm) that operates on the
compressed representation to generate the
reconstruction.
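The pairing of a compression algorithm with a reconstruction algorithm can be seen with Python's standard `zlib` module. This is a minimal sketch; `zlib` (a lossless DEFLATE implementation) is an assumption chosen for illustration and is not mentioned in the slides.

```python
import zlib

# Highly repetitive input, so there is plenty of redundancy to remove.
original = b"ABABABABAB" * 100

# Compression algorithm: input -> representation that needs fewer bytes.
compressed = zlib.compress(original)

# Reconstruction algorithm: compressed representation -> reconstruction.
reconstruction = zlib.decompress(compressed)

print(len(original), "bytes ->", len(compressed), "bytes")
print("identical:", reconstruction == original)
```

Because `zlib` is lossless, the reconstruction is byte-for-byte identical to the input.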
Data Compression contd…
Fig.1. Compression and Reconstruction
Process of Data Compression
• Based on the requirements of reconstruction, data
compression schemes can be divided into two
broad classes:
• lossless compression schemes, in which the reconstruction is
identical to the original data, and
• lossy compression schemes, which generally
provide much higher compression than lossless
compression but allow the reconstruction to differ from the original.
Types of Data Compression
• Data compression is about storing and sending a smaller number of bits.
• There are two major categories of data compression methods: lossless and
lossy.
Lossless Compression Methods
• In lossless methods, original data and the data
after compression and decompression are exactly
the same.
• Redundant data is removed during compression and
restored during decompression.
• Lossless methods are used when we can’t afford
to lose any data: legal and medical documents,
computer programs.
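Removing redundancy and restoring it exactly can be illustrated with run-length encoding. This is a sketch under assumptions: run-length encoding is picked here only because it is short, the slides do not name it, and the function names `rle_encode` and `rle_decode` are hypothetical.

```python
from itertools import groupby


def rle_encode(data):
    """Replace each run of repeated characters by a (character, length) pair."""
    return [(ch, len(list(run))) for ch, run in groupby(data)]


def rle_decode(pairs):
    """Expand the (character, length) pairs back into the original string."""
    return "".join(ch * n for ch, n in pairs)


data = "AAAABBBCCD"
encoded = rle_encode(data)
restored = rle_decode(encoded)

print(encoded)
print("identical:", restored == data)
```

The decoder adds back exactly the repetitions the encoder removed, so no information is lost.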
Lossy Compression Methods
• Used for compressing images, video, and audio (our eyes and ears
cannot distinguish subtle changes, so some loss of data is
acceptable).
• These methods are cheaper: they require less time and space.
• Several methods:
– JPEG: compresses pictures and graphics
– MPEG: compresses video
– MP3: compresses audio
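The idea of discarding detail that is hard to perceive can be sketched with uniform quantization of pixel values. This is an illustrative assumption only: it is not a description of how JPEG, MPEG, or MP3 work as a whole, and the names `quantize`, `dequantize`, and `step` are hypothetical.

```python
def quantize(samples, step):
    """Keep only the index of the nearest multiple of `step` for each sample."""
    return [(s + step // 2) // step for s in samples]


def dequantize(indices, step):
    """Reconstruct approximate sample values from the stored indices."""
    return [i * step for i in indices]


pixels = [12, 13, 15, 200, 201, 203]
step = 8  # larger steps discard more detail

indices = quantize(pixels, step)
reconstruction = dequantize(indices, step)

print(indices)
print(reconstruction)
```

The indices span a much smaller range than the pixel values, so they need fewer bits, but the reconstruction only approximates the original: each value is off by at most half a step.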
Measure of Performance
• A compression algorithm can be evaluated in a
number of different ways.
• We could measure:
– the relative complexity of the algorithm,
– the memory required to implement the algorithm,
– how fast the algorithm performs on a given machine,
– the amount of compression, and
– how closely the reconstruction resembles the original.
1. Compression Ratio
• A very logical way of measuring how well a compression
algorithm compresses a given set of data is to look at the ratio
of the number of bits required to represent the data before
compression to the number of bits required to represent the
data after compression. This ratio is called the compression
ratio.
Example
• Suppose storing an image made up of a square array of
256×256 pixels requires 65,536 bytes. The image is
compressed and the compressed version requires 16,384 bytes.
• The compression ratio for the above compression is given by:
$$\text{Compression Ratio} = \frac{\text{Original Size}}{\text{Compressed Size}} = \frac{65536}{16384} = 4:1$$
• We can also represent the compression ratio by expressing the
reduction in the amount of data required as a percentage of the size of
the original data.
• Total compression in percentage:
$$\frac{\text{Original} - \text{Compressed}}{\text{Original}} \times 100\% = \frac{65536 - 16384}{65536} \times 100\% = 75\%$$
• In this particular example, the compression ratio calculated in this
manner would be 75%.
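Both ways of reporting the amount of compression can be checked with a short calculation. This is a minimal sketch using the image sizes from the example; the function names `compression_ratio` and `percent_reduction` are hypothetical.

```python
def compression_ratio(original_size, compressed_size):
    """Ratio of the size before compression to the size after compression."""
    return original_size / compressed_size


def percent_reduction(original_size, compressed_size):
    """Reduction in size, as a percentage of the original size."""
    return (original_size - compressed_size) / original_size * 100


original_bytes = 256 * 256   # 65,536 bytes for the 256x256 image
compressed_bytes = 16_384

ratio = compression_ratio(original_bytes, compressed_bytes)
reduction = percent_reduction(original_bytes, compressed_bytes)

print(f"{ratio:g}:1")
print(f"{reduction:g}%")
```

The two numbers describe the same result: a 4:1 ratio means the compressed data is one quarter of the original size, which is a 75% reduction.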
2. Rate of Compression
• Compression performance can also be reported by providing
the average number of bits required to represent a single
sample.
• This is generally referred to as the rate.
• For example, in the case of the compressed image described
previously, if the original uses 8 bits (one byte) per pixel, the
16,384 compressed bytes amount to 131,072 bits for 65,536 pixels,
so the average number of bits per pixel in the compressed
representation is 2.
• Thus, we would say that the rate is 2 bits per pixel.
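The rate for the example image can be reproduced as follows. This is a small sketch; the function name `rate_bits_per_sample` and its `bits_per_byte` parameter are hypothetical.

```python
def rate_bits_per_sample(compressed_bytes, num_samples, bits_per_byte=8):
    """Average number of bits used to represent one sample (here, one pixel)."""
    return compressed_bytes * bits_per_byte / num_samples


num_pixels = 256 * 256        # samples in the example image
compressed_bytes = 16_384     # size of the compressed version

rate = rate_bits_per_sample(compressed_bytes, num_pixels)
print(rate, "bits per pixel")
```

Since the uncompressed image uses 8 bits per pixel, a rate of 2 bits per pixel is consistent with the 4:1 compression ratio.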
3. Distortion
• In lossy compression, the reconstruction differs from the
original data.
• Therefore, to determine the efficiency of a
compression algorithm, we need some way of
quantifying the difference.
• The difference between the original and the reconstruction is
often called the distortion.
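One way to quantify the difference is the mean squared error between the original and the reconstruction. This is an assumption for illustration: the slides do not name a specific distortion measure here, and the sample values below are made up for the sketch.

```python
def mean_squared_error(original, reconstruction):
    """Average of the squared sample-by-sample differences."""
    if len(original) != len(reconstruction):
        raise ValueError("sequences must have the same length")
    return sum((x - y) ** 2 for x, y in zip(original, reconstruction)) / len(original)


# Hypothetical pixel values and a lossy reconstruction of them.
original = [12, 13, 15, 200, 201, 203]
reconstruction = [16, 16, 16, 200, 200, 200]

distortion = mean_squared_error(original, reconstruction)
print(distortion)
```

A distortion of zero would mean the reconstruction is identical to the original, which is the lossless case.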