Compression of Digital Voice and Video


Overview of Data Compression

The benefits of data compression in high-speed networks are clear. The following benefits are especially important for the compressed version of data:

   •   Less transmission power is required.
   •   Less communication bandwidth is required.
   •   System efficiency is increased.

There are, however, certain trade-offs with data compression. For example, the encoding and decoding processes of data compression increase the cost, complexity, and delay of data transmission. Two categories of data compression are used in producing multimedia networking information: compression with loss and compression without loss.
In the first category, some less valuable or nearly redundant data is eliminated permanently. The most notable case of compression with loss is the process of signal sampling; voice sampling, for example, falls into this category.

The following figure shows the basic information process in high-speed communication systems. Any type of "source" data is converted to digital form in the information-source process. The outcome is a stream of digital words. The words are then encoded in the source coding system, resulting in a compressed form of the data.




Digital Voice and Compression

Signal Sampling
In the process of digitizing a signal, analog signals first go through a sampling process, as shown in the following figure. The sampling function is required for converting an analog signal to digital bits. However, acquiring samples from an analog signal and eliminating the unsampled portions of the signal may result in some permanent loss of information. In other words, sampling is in effect a lossy information-compression process.
Sampling techniques are of several types:

   •   Pulse amplitude modulation (PAM), which translates sampled values to pulses
       with corresponding amplitudes
   •   Pulse width modulation (PWM), which translates sampled values to pulses with
       corresponding widths
   •   Pulse position modulation (PPM), which translates sampled values to identical
       pulses whose positions relative to the sampling points correspond to the
       sampled values
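
As a rough illustration of the sampling idea (in the PAM style listed above), the following Python sketch keeps only the sample values of a sine wave at the sampling instants; everything between samples is discarded, which is exactly the permanent loss described above. The signal and sampling frequencies are assumed values, not taken from the text.

    import numpy as np

    f_signal = 1_000       # assumed 1 kHz analog tone
    f_sample = 8_000       # assumed 8 kHz sampling rate (telephony-style)
    duration = 0.002       # 2 ms of signal

    t = np.arange(0, duration, 1 / f_sample)        # sampling instants
    pam_values = np.sin(2 * np.pi * f_signal * t)   # pulse amplitudes = sampled values

    print(len(pam_values), "samples stand in for the whole 2 ms of analog signal")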

Quantization and Distortion

Samples are real numbers, taking both fractional and integer values, so in principle an unlimited number of bits would be required to transmit a raw sample exactly. The transmission of infinitely many bits would occupy infinite bandwidth and is not practical to implement. In practice, sampled values are therefore rounded off to the nearest available quantized levels.
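
A minimal sketch of this rounding step, assuming a 4-bit uniform quantizer over the range [-1, 1): each real-valued sample is forced onto one of 16 evenly spaced levels, and the difference between the original and the rounded value is the distortion.

    import numpy as np

    b = 4                                   # assumed 4 bits per sample -> 16 levels
    levels = 2 ** b
    step = 2.0 / levels                     # quantization step size over [-1, 1)

    samples = np.array([0.12, -0.57, 0.98, -0.33])             # raw sampled values
    quantized = np.clip(np.round(samples / step),
                        -levels // 2, levels // 2 - 1) * step   # nearest available level
    distortion = samples - quantized

    print(quantized)      # [ 0.125 -0.625  0.875 -0.375]
    print(distortion)     # the permanent quantization error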




Still Images and JPEG Compression

         This section investigates algorithms that prepare and compress still and moving images. The compression of such data substantially affects the utilization of bandwidth over multimedia and IP networking infrastructures. We begin with a single visual image, such as a photograph, and then look at video, a motion image. The Joint Photographic Experts Group (JPEG) standard is the compression standard for still images. It is used for gray-scale and high-quality color images. Like voice compression, JPEG is a lossy process; an image obtained after decompression at the receiving end may not be exactly the same as the original.




            The discrete cosine transform (DCT) process is complex; it converts a snapshot of a real image into a matrix of corresponding values. The quantization phase then converts the values generated by the DCT into simpler numbers so that they occupy less bandwidth. As with all quantizing processes, this phase is lossy.
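
A rough sketch of these two phases on a single 8 x 8 block, written in Python with numpy. The DCT matrix below is the standard orthonormal DCT-II basis; the block values and the flat quantization table q_table are made up for illustration and are not the actual JPEG tables.

    import numpy as np

    N = 8
    k = np.arange(N)
    # Orthonormal DCT-II basis matrix C, so that coeffs = C @ block @ C.T
    C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * N))
    C[0, :] /= np.sqrt(2.0)

    x = np.arange(N, dtype=float)
    block = 10.0 * x[None, :] + 5.0 * x[:, None] - 60.0   # smooth gradient standing in
                                                          # for centered pixel values
    coeffs = C @ block @ C.T               # DCT: energy gathers in the upper-left corner

    q_table = np.full((N, N), 16.0)        # hypothetical flat quantization table
    quantized = np.round(coeffs / q_table) # many small coefficients collapse to 0

    print(int((quantized == 0).sum()), "of the 64 coefficients are now zero")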

Raw-Image Sampling and DCT
As with a voice signal, we first need samples of a raw image: a picture. Pictures are of
two types: photographs, which contain no digital data, and images, which contain digital
data suitable for computer networks. An image is made up of m x n blocks of picture
units, or pixels, as shown in the following figure. For FAX transmissions, images are
made up of 0s and 1s to represent black and white pixels, respectively.
JPEG Files

Color images are based on the fact that any color can be represented to the human eye by
using a particular combination of the base colors red, green, and blue (RGB). Computer
monitor screens, digital camera images, or any other still color images are formed by
varying the intensity of the three primary colors at pixel level, resulting in the creation of
virtually any corresponding color from the real raw image. Each of the three color intensities of a pixel is represented by 8 bits.
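
The resulting 24-bit representation can be seen directly by packing the three 8-bit intensities into a single integer; the sketch below uses arbitrary example intensities.

    red, green, blue = 200, 120, 30             # arbitrary 8-bit intensities (0..255)

    pixel = (red << 16) | (green << 8) | blue   # one 24-bit color value per pixel
    print(hex(pixel))                           # 0xc8781e

    # Recover the three components again
    r, g, b = (pixel >> 16) & 0xFF, (pixel >> 8) & 0xFF, pixel & 0xFF
    assert (r, g, b) == (red, green, blue)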

GIF Files
JPEG is designed to work with full-color images of up to 2^24 colors. The graphics
interchange format (GIF) is an image file format that reduces the number of colors to
256. This reduction in the number of possible colors is a trade-off between the quality of
the image and the transmission bandwidth. GIF stores up to 2^8 = 256 colors in a table and
covers the range of colors in an image as closely as possible. Therefore, 8 bits are used to
represent a single pixel. GIF uses a variation of Lempel-Ziv encoding for compression of
an image.
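
The color-table idea can be sketched as follows, using a hypothetical four-pixel image: every distinct color is placed in a table of at most 256 entries, and each pixel is then stored as an 8-bit index into that table rather than as 24 bits of RGB. (The Lempel-Ziv step itself is illustrated later, in the section on lossless methods.)

    pixels = [(255, 0, 0), (255, 0, 0), (0, 128, 0), (255, 255, 255)]  # toy image

    palette = []                    # color table, at most 256 entries
    indices = []                    # one 8-bit index per pixel
    for color in pixels:
        if color not in palette:
            palette.append(color)
        indices.append(palette.index(color))

    assert len(palette) <= 256
    print(palette)                  # [(255, 0, 0), (0, 128, 0), (255, 255, 255)]
    print(indices)                  # [0, 0, 1, 2]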

Encoding
In the last phase of the JPEG process, encoding finally performs the task of compression. The quantization phase produces a matrix with numerous 0s. The Q matrix in this example has produced 57 zeros from the original raw image. A practical approach to compressing this matrix is to use run-length coding. If run-length coding is used, scanning matrix Q[i][j] row by row may result in several phrases.
            This method is attractive because the larger values in the matrix tend to collect in its upper-left corner, while the remaining elements are mostly zeros. Thus, we can adopt a better rule: scanning should always start from the upper-left corner element of the matrix. This way, we get much longer runs of zeros in each phrase and a much lower number of phrases in the run-length coding.
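
A small sketch of that row-by-row scan and the resulting run-length phrases, using a made-up quantized matrix whose few nonzero values sit in the upper-left corner:

    import numpy as np

    Q = np.zeros((8, 8), dtype=int)      # made-up quantized matrix, mostly zeros
    Q[0, :3] = [31, 12, -4]
    Q[1, 0] = 6

    phrases = []                         # (run of zeros before the value, value)
    zero_run = 0
    for value in Q.flatten():            # row-by-row scan from the upper-left corner
        if value == 0:
            zero_run += 1
        else:
            phrases.append((zero_run, int(value)))
            zero_run = 0
    phrases.append((zero_run, "EOB"))    # one final phrase covers the trailing zeros

    print(phrases)   # [(0, 31), (0, 12), (0, -4), (5, 6), (55, 'EOB')]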

Moving Images and MPEG Compression
A motion image, or video, is a rapid display of still images. Moving from one image to another must be fast enough to fool the human eye. Different standards specify the number of still images that make up a video clip.
The common standard that defines video compression is the Moving Picture Experts Group (MPEG) standard, which has several branch standards:

   •   MPEG-1, primarily for video on CD-ROM
   •   MPEG-2, for multimedia entertainment and high-definition television (HDTV)
       and the satellite broadcasting industry
   •   MPEG-4, for object-oriented video compression and videoconferencing over low-
       bandwidth channels
   •   MPEG-7, for a broad range of applications requiring large bandwidths and
       multimedia tools
   •   MPEG-21, for interaction among the various MPEG standards

Using JPEG compression for each still picture alone does not provide sufficient compression for video, as the result still occupies a large bandwidth. MPEG therefore deploys additional compression. Normally, the difference between two consecutive frames is small. With MPEG, a base frame is sent first, and successive frames are encoded by computing the differences.

Depending on the relative position of a frame in a sequence, it can be compressed
through one of the following types of frames:

   •   Intraimage (I) frames. An I frame is treated as a JPEG still image and compressed
       using DCT, independently of other frames.
   •   Predictive (P) frames. These frames are produced by computing differences
       between a current and a previous I or P frame.
   •   Bidirectional (B) frames. A B frame is similar to a P frame, but a B frame
       considers differences among the previous, current, and future frames.

Snapshot of moving frames for MPEG compression
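
The effect of coding only the differences can be sketched with two toy "frames" (not real MPEG, just the idea behind P frames): when little changes between consecutive frames, the difference frame is almost entirely zeros and therefore compresses very well.

    import numpy as np

    previous_frame = np.random.randint(0, 256, size=(4, 4))
    current_frame = previous_frame.copy()
    current_frame[1, 2] += 3                       # only one pixel changed

    difference = current_frame - previous_frame    # what a P-style frame would carry
    reconstructed = previous_frame + difference    # the decoder adds it back

    assert np.array_equal(reconstructed, current_frame)
    print(difference)                              # almost all zeros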
MP3 and Streaming Audio
         The MPEG-1 layer 3 (MP3) technology compresses audio for networking while producing CD-quality sound. The sampling part of PCM is performed at a rate of 44.1 kHz to cover the maximum 20 kHz of audible signals. Using the commonly used 16-bit encoding for each sample, the total bit rate for one channel is 16 x 44,100, or roughly 700 kilobits per second, and about 1.4 megabits per second for two channels if the sound is processed in stereo. For example, a 60-minute CD (3,600 seconds) requires about 1.4 x 3,600 = 5,040 megabits, or 630 megabytes. This amount may be acceptable for recording on a CD but is considered extremely large for networking, and thus a carefully designed compression technique is needed.
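
The figures above follow directly from the sampling parameters; the short calculation below reproduces them (the 5,040-megabit and 630-megabyte figures in the text come from using the rounded 1.4 Mb/s rate):

    bits_per_sample = 16
    sample_rate = 44_100                 # samples per second
    channels = 2                         # stereo

    bit_rate = bits_per_sample * sample_rate * channels   # 1,411,200 b/s, about 1.4 Mb/s
    seconds = 60 * 60                                      # a 60-minute CD
    total_bits = bit_rate * seconds                        # roughly 5,080 megabits
    total_megabytes = total_bits / 8 / 1_000_000           # roughly 635 MB

    print(round(bit_rate / 1e6, 2), "Mb/s;", round(total_megabytes), "MB per hour")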

        MP3 combines the advantages of MPEG with "three" layers of audio compression. MP3 removes from a piece of sound all portions that an average ear may not be able to hear, such as weak background sounds.

Limits of Compression with Loss
        Hartley, Nyquist, and Shannon are the founders of information theory, which has resulted in the mathematical modeling of information sources. Consider a communication system in which a source signal is processed to produce sequences of n words.



Basics of Information Theory
If ai is the most likely output and aj is the least likely output, clearly, aj conveys the most information and ai conveys the least information. This observation can be rephrased as an important conclusion: the measure of information for an output is a decreasing and continuous function of the probability of the source output. To formulate this statement, let Pk1 and Pk2 be the probabilities of an information source's outputs ak1 and ak2, respectively, and let I(Pk1) and I(Pk2) be the information content of ak1 and ak2, respectively. The following facts apply.

   1.   As discussed, I(Pk) depends on Pk.
   2.   I(Pk) is a continuous function of Pk.
   3.   I(Pk) is a decreasing function of Pk.
   4.   Pk = Pk1 . Pk2 (the probability of the two outputs occurring together).
   5.   I(Pk) = I(Pk1) + I(Pk2) (the sum of the two pieces of information).
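
The standard measure satisfying all of these facts is the logarithmic one, I(Pk) = log2(1/Pk); the quick check below, with two assumed probabilities, verifies the additivity in fact 5.

    import math

    def information(p):
        return math.log2(1 / p)          # continuous and decreasing in p

    p1, p2 = 0.5, 0.25                   # assumed probabilities of two outputs
    joint = p1 * p2                      # fact 4: probability of both occurring together

    assert math.isclose(information(joint), information(p1) + information(p2))
    print(information(p1), information(p2), information(joint))   # 1.0 2.0 3.0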
Compression Methods Without Loss
Some types of data, including text, image, and video, might contain redundant or
repeated elements. If so, those elements can be eliminated and some sort of codes
substituted for future decoding. In this section, we focus on techniques that do not incur
any loss during compression:
    • Arithmetic encoding
    • Run-length encoding
    • Huffman encoding
    • Lempel-Ziv encoding

Run-Length Encoding

One of the simplest data-compression techniques is run-length encoding. This technique is fairly effective for compression of plaintext and numbers, especially for facsimile systems. With run-length coding, a run of repeated letters can be replaced by a run length, beginning with the special character Cc to express the compressed letter count.
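
A minimal run-length encoder in this spirit, assuming Cc is the flag that introduces a compressed run (the letter followed by its count); only runs long enough to pay for the flag are compressed. The flag string and threshold are illustrative choices, not prescribed by the text.

    def run_length_encode(text, flag="Cc", min_run=4):
        out, i = [], 0
        while i < len(text):
            j = i
            while j < len(text) and text[j] == text[i]:
                j += 1                                   # extend the current run
            run = j - i
            if run >= min_run:
                out.append(f"{flag}{text[i]}{run}")      # e.g. "CcA6" for six A's
            else:
                out.append(text[i] * run)                # short runs stay literal
            i = j
        return "".join(out)

    print(run_length_encode("AAAAAABCDDDDDDDD"))         # CcA6BCCcD8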




Huffman Encoding

Huffman encoding is an efficient frequency-dependent coding technique. With this algorithm, source values with smaller probabilities are encoded into longer code words. The algorithm that implements such a technique is as follows.

Begin Huffman Encoding Algorithm

   1. Sort the outputs of the source in decreasing order of their probabilities; for
      example, 0.25, 0.20, 0.15, ..., 0.02, 0.01.
   2. Merge the two least probable outputs into a single output whose probability is
      the sum of the corresponding probabilities, such as 0.02 + 0.01 = 0.03.
   3. If the number of remaining outputs is 2, go to the next step; otherwise, go to
      step 1.
   4. Assign 0 and 1 as code bits on the diagram.
   5. If a new output is the result of merging two outputs in a preceding step, append
      0 and 1 to its code word and repeat this step; otherwise, stop.
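
A compact sketch of the same procedure in Python (the probabilities below are assumed example values): the two least probable outputs are merged repeatedly, and 0/1 bits are prepended as the merges are built up, so rarer outputs receive longer code words.

    import heapq

    def huffman(probabilities):
        # heap entries: (probability, tie-breaker, {symbol: code so far})
        heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probabilities.items())]
        heapq.heapify(heap)
        counter = len(heap)
        while len(heap) > 1:
            p1, _, c1 = heapq.heappop(heap)              # two least probable outputs
            p2, _, c2 = heapq.heappop(heap)
            merged = {s: "0" + code for s, code in c1.items()}
            merged.update({s: "1" + code for s, code in c2.items()})
            heapq.heappush(heap, (p1 + p2, counter, merged))
            counter += 1
        return heap[0][2]

    codes = huffman({"a": 0.40, "b": 0.25, "c": 0.20, "d": 0.10, "e": 0.05})
    print(codes)     # {'a': '0', 'b': '10', 'e': '1100', 'd': '1101', 'c': '111'}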




Lempel-Ziv Encoding
Lempel-Ziv codes are independent of the source statistics. This coding technique is
normally used for UNIX compressed files. The algorithm that converts a string of logical
bits into a Lempel-Ziv code is summarized as follows.
Begin Lempel-Ziv Encoding Algorithm
    1. Any sequence of source output is parsed into phrases of varying length. At each
       step, identify the smallest phrase that has not appeared so far. Note that all
       phrases are different, and the lengths of the phrases grow as the encoding
       process proceeds.
    2. Phrases are encoded using code words of equal length. If k1 is the number of
       bits needed to describe a code word and k2 is the number of phrases, we must
       have k1 = ⌈log2 k2⌉.
    3. A code word is the location (index) of the new phrase's prefix.
    4. The code word is followed by the last bit of the phrase produced by the parser,
       which distinguishes the new phrase from its prefix.
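
A small sketch of this parsing on an assumed bit string, in the style described above: each new phrase is a previously seen phrase (its prefix) plus one extra bit, and it is coded as the pair (index of the prefix, last bit). With k2 = 7 phrases in this example, k1 = ⌈log2 7⌉ = 3 bits suffice to encode each prefix index.

    def lempel_ziv_parse(bits):
        phrases = {"": 0}                    # phrase -> index; 0 is the empty prefix
        codes, current = [], ""
        for bit in bits:
            current += bit
            if current not in phrases:       # smallest phrase not seen so far
                codes.append((phrases[current[:-1]], bit))
                phrases[current] = len(phrases)
                current = ""
        return codes

    print(lempel_ziv_parse("1011010100010"))
    # [(0, '1'), (0, '0'), (1, '1'), (2, '1'), (4, '0'), (2, '0'), (1, '0')]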