Lecture Notes on Introduction to
Data Compression
for
Open Educational Resource
on
Data Compression (CA209)
by
Dr. Piyush Charan
Assistant Professor
Department of Electronics and Communication Engg.
Integral University, Lucknow
Content
• UNIT-I: Introduction to Compression Techniques: Lossless
compression, Lossy Compression, Measures of performance,
Modeling and coding, Mathematical Preliminaries for Lossless
compression.
• Introduction to Information Theory and Models: Physical
models, Probability models, Markov models.
2 February 2021
Dr. Piyush Charan, Dept. of ECE, Integral University, Lucknow
What is Data Compression?
• Data Compression = Modeling + Coding
• Data compression consists of taking a stream of symbols and
transforming it into codes. If the compression is
effective, the resulting stream of codes will be smaller than
the original stream of symbols.
• The decision to output a certain code for a certain symbol or
set of symbols is based on a model.
• The model is simply a collection of data and rules used to
process input symbols and determine which code(s) to
output.
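The split between modeling and coding can be illustrated with a minimal Python sketch. It is an illustration only, not a method from these notes: the model here is simply a ranking of symbols by frequency, and the coder is a hypothetical unary-style prefix code (rank r becomes r ones followed by a zero). The function names `build_model`, `encode`, and `decode` are assumptions made for this example.

```python
from collections import Counter

def build_model(symbols):
    """Model: rank symbols from most frequent to least frequent."""
    counts = Counter(symbols)
    return [sym for sym, _ in counts.most_common()]

def encode(symbols, ranking):
    """Coder: the symbol at rank r is written as r ones followed by a zero."""
    code = {sym: "1" * r + "0" for r, sym in enumerate(ranking)}
    return "".join(code[s] for s in symbols)

def decode(bits, ranking):
    """Inverse coder: count ones until a zero, then look up that rank."""
    out, run = [], 0
    for b in bits:
        if b == "1":
            run += 1
        else:
            out.append(ranking[run])
            run = 0
    return "".join(out)

message = "aaaabbc"
ranking = build_model(message)      # ['a', 'b', 'c']
bits = encode(message, ranking)     # frequent symbols get shorter codes
print(bits, len(bits), "bits vs", 8 * len(message), "bits as 8-bit characters")
```

In this toy case the 7-character message needs 11 bits instead of 56, because the model tells the coder which symbols deserve the shortest codes. A real decoder would also need the ranking, which this sketch passes in directly.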
Other Definitions
• Data compression is the process of converting an input data stream
(the source stream or the original raw data) into another data stream
(the output, the bitstream, or the compressed stream) that has a
smaller size. A stream is either a file or a buffer in memory.
• The field of data compression is often called source coding. We
imagine that the input symbols (such as bits, ASCII codes, bytes,
audio samples, or pixel values) are emitted by a certain information
source and have to be coded before being sent to their destination.
The source can be memoryless, or it can have memory.
Need for Compression
• Why Data Compression?
– There are two practical motivations for compression:
• Make better use of limited storage space (reduced storage
requirements)
• Save time and help to optimize resources
– If compression and decompression are done in the I/O processor,
less time is required to move data to or from the storage
subsystem, which frees the I/O bus for other work
– When sending data over a communication line, less time is needed
to transmit the data and less storage is needed at the host
Data Compression
• Data compression, source coding, or bit-rate reduction is the process of
encoding information using fewer bits than the original representation. Any
particular compression is either lossy or lossless.
• Lossless compression reduces bits by identifying and eliminating statistical
redundancy. No information is lost in lossless compression.
• Lossy compression reduces bits by removing unnecessary or less important
information.
• Typically, a device that performs data compression is referred to as an
encoder, and one that performs the reversal of the process (decompression)
as a decoder.
Data Compression contd…
• When we speak of a compression technique or compression
algorithm, we are actually referring to two algorithms.
• There is the compression algorithm, which takes an input
and generates a representation that requires fewer
bits, and there is a reconstruction algorithm
(decompression algorithm), which operates on the
compressed representation to generate the
reconstruction.
Data Compression contd…
Fig.1. Compression and Reconstruction
Process of Data Compression
• Based on the requirements of reconstruction, data
compression schemes can be divided into two
broad classes:
• lossless compression schemes, in which the reconstruction
is identical to the original data, and
• lossy compression schemes, which generally
provide much higher compression than lossless
compression but allow the reconstruction to differ from
the original data.
Types of Data Compression
• Data compression is about storing and sending a smaller number of bits.
• There are two major categories of methods for compressing data: lossless and lossy
methods.
Lossless Compression Methods
• In lossless methods, the original data and the data
after compression and decompression are exactly
the same.
• Redundant data is removed during compression and
restored during decompression.
• Lossless methods are used when we can’t afford
to lose any data, for example legal and medical
documents and computer programs.
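The round-trip property can be checked with a short Python sketch. It uses the standard-library `zlib` module purely as a convenient stand-in for a lossless method; these notes do not single out any particular algorithm, and the sample input is made up for illustration.

```python
import zlib

# Highly redundant sample input (800 bytes), chosen so compression is visible.
original = b"ABABABAB" * 100

compressed = zlib.compress(original)    # redundancy removed
restored = zlib.decompress(compressed)  # redundancy restored

print(len(original), "bytes before,", len(compressed), "bytes after")
print("identical after round trip:", restored == original)
```

The comparison `restored == original` is the defining test of a lossless method: every byte comes back unchanged.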
Lossy Compression Methods
• Used for compressing images and video files: our eyes
cannot distinguish subtle changes, so some loss of data is
acceptable.
• These methods are cheaper: they require less time and space.
• Several methods:
– JPEG: compresses pictures and graphics
– MPEG: compresses video
– MP3: compresses audio
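The idea of discarding detail that is hard to notice can be sketched in Python with coarse quantization. This is a generic illustration under stated assumptions, not the JPEG, MPEG, or MP3 algorithm: each hypothetical 8-bit sample is reduced to 4 bits, and the decoder can only recover the midpoint of the bin the sample fell in.

```python
def quantize(sample):
    """Lossy step: keep only the top 4 bits of an 8-bit sample."""
    return sample >> 4

def reconstruct(level):
    """Map a 4-bit level back to the midpoint of its 16-value bin."""
    return (level << 4) + 8

samples = [12, 200, 37, 255]                # made-up 8-bit values
levels = [quantize(s) for s in samples]     # 4 bits each instead of 8
approx = [reconstruct(q) for q in levels]   # close to, but not equal to, samples

print(levels)
print(approx)
```

Storage is halved, and each reconstructed value is within 8 of the original, which is the kind of small change the slide describes as acceptable.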
Measure of Performance
• A compression algorithm can be evaluated in a
number of different ways.
• We could measure:
– the relative complexity of the algorithm,
– the memory required to implement the algorithm,
– how fast the algorithm performs on a given machine,
– the amount of compression, and
– how closely the reconstruction resembles the original.
1. Compression Ratio
• A very logical way of measuring how well a compression
algorithm compresses a given set of data is to look at the ratio
of the number of bits required to represent the data before
compression to the number of bits required to represent the
data after compression. This ratio is called the compression
ratio.
Example
• Suppose storing an image made up of a square array of
256×256 pixels requires 65,536 bytes. The image is
compressed and the compressed version requires 16,384 bytes.
• The compression ratio for the above compression is given by:
$$\text{Compression Ratio} = \frac{\text{Original Size}}{\text{Compressed Size}} = \frac{65{,}536}{16{,}384} = 4, \text{ that is, } 4{:}1$$
• We can also represent the compression ratio by expressing the
reduction in the amount of data required as a percentage of the size of
the original data.
• Total compression in percentage:
$$\frac{\text{Original} - \text{Compressed}}{\text{Original}} \times 100\% = \frac{65{,}536 - 16{,}384}{65{,}536} \times 100\% = 75\%$$
• In this particular example, the compression ratio calculated in this
manner would be 75%.
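Both ways of reporting the amount of compression can be computed with a few lines of Python. The sketch below reuses the image sizes from the example; the function names `compression_ratio` and `percent_reduction` are chosen for this illustration only.

```python
def compression_ratio(original_size, compressed_size):
    """Size before compression divided by size after compression."""
    return original_size / compressed_size

def percent_reduction(original_size, compressed_size):
    """Reduction in size as a percentage of the original size."""
    return (original_size - compressed_size) / original_size * 100

original = 256 * 256    # 65,536 bytes, one byte per pixel
compressed = 16_384     # bytes after compression

ratio = compression_ratio(original, compressed)
saving = percent_reduction(original, compressed)
print(f"{ratio:.0f}:1 compression ratio, {saving:.0f}% reduction")
```

The two numbers describe the same result: a 4:1 ratio means the compressed data is one quarter of the original, which is a 75% reduction.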
2. Rate of Compression
• Compression performance can also be reported by providing
the average number of bits required to represent a single
sample.
• This is generally referred to as the rate.
• For example, in the case of the compressed image described
previously, if we assume 8 bits per byte (or pixel), the average
number of bits per pixel in the compressed representation is 2.
• Thus, we would say that the rate is 2 bits per pixel.
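The figure of 2 bits per pixel follows from dividing the compressed size in bits by the number of pixels, using the 256×256 image from the earlier example:

$$\text{rate} = \frac{16{,}384 \text{ bytes} \times 8 \text{ bits/byte}}{256 \times 256 \text{ pixels}} = \frac{131{,}072 \text{ bits}}{65{,}536 \text{ pixels}} = 2 \text{ bits per pixel}$$

For comparison, the uncompressed image uses 65,536 bytes for 65,536 pixels, that is, 8 bits per pixel.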
3. Distortion
• In lossy compression, the reconstruction differs from the
original data.
• Therefore, in order to determine the efficiency of a
compression algorithm, we have to have some way of
quantifying the difference.
• The difference between the original and the reconstruction is
often called the distortion.
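One common way to quantify this difference is the mean squared error between the original and the reconstruction. The notes do not name a specific distortion measure, so the choice of mean squared error here is an assumption, and the sample values are made up for illustration.

```python
def mean_squared_error(original, reconstruction):
    """Average of the squared sample-by-sample differences."""
    if len(original) != len(reconstruction):
        raise ValueError("sequences must have the same length")
    total = sum((x - y) ** 2 for x, y in zip(original, reconstruction))
    return total / len(original)

original = [12, 200, 37, 255]        # hypothetical original samples
reconstruction = [8, 200, 40, 248]   # hypothetical lossy reconstruction

mse = mean_squared_error(original, reconstruction)
print("distortion (MSE):", mse)
```

A value of zero would mean the reconstruction is identical to the original, as in lossless compression; larger values mean more distortion.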
