VIDEO COMPRESSION BASICS MPEG-2 BY- VIJAY
AGENDA
1. Overview
2. Video Scheme
3. Video Compression
4. MPEG
5. MPEG-2 Video Compression
6. MPEG-2 Frames
7. MPEG-2 Video Encoding – Intra Frame Encoding, Non-Intra Frame Encoding
8. MPEG-2 Video Decoding – Intra Frame Decoding, Non-Intra Frame Decoding
1. OVERVIEW
A video comprises a sequence of frames. One second of video generated by a TV camera usually contains 24 or 30 frames. Each pixel in a frame is represented by three attributes (each 8 bits long) – one luminance attribute and two chrominance attributes (i.e. YCbCr).
Luminance (Y): describes the brightness of the pixel.
Chrominance (Cb, Cr): describes the color of the pixel.
For example, a single frame with a resolution of 720 x 480 (720 pixels in each horizontal line, 480 horizontal lines per frame) is described by
720 x 480 x 8 + 720 x 480 x 8 + 720 x 480 x 8 bits = 8294400 bits ~ 8.29 Mbit.
A complete video of 1 second is described by
(720 x 480 x 8 + 720 x 480 x 8 + 720 x 480 x 8) x 24 bits = 199065600 bits ~ 199 Mbit.
Thus, for an entire movie, the data would be too big to fit on DVDs or to transmit within the bandwidth of available TV channels. Uncompressed video data is big in size.
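As a sanity check, the frame-size arithmetic above can be reproduced with a short script (the dimensions, bit depth and frame rate are the slide's example values, not constants from any standard):

```python
# Bits for one uncompressed 720x480 frame: 8 bits each for the
# Y, Cb and Cr attribute of every pixel.
WIDTH, HEIGHT, BITS_PER_SAMPLE = 720, 480, 8
FPS = 24  # frames per second in the slide's example

bits_per_frame = 3 * WIDTH * HEIGHT * BITS_PER_SAMPLE  # Y + Cb + Cr planes
bits_per_second = bits_per_frame * FPS

print(bits_per_frame)   # 8294400 bits  (~8.29 Mbit)
print(bits_per_second)  # 199065600 bits (~199 Mbit)
```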
2. VIDEO SCHEME
Types of video schemes used for transmission –
Interlaced Video Scanning
A frame is divided into two separate fields – top fields (odd-numbered rows in the frame) and bottom fields (even-numbered rows in the frame). The two successive fields (field 1 & field 2) are called a frame. Both fields are sent one after another, and the display puts them back together before showing the full frame. It conserves bandwidth, with a maximum rate of 60 fields per second. Quality is degraded, as the fields sometimes come out of sync.
Progressive Video Scanning
In this, the complete frame is sent to the display. The bandwidth requirement is twice that of interlaced video scanning. Quality is good, as frames arrive in sync and the image is much sharper. Maximum frame rate is 30 frames/second.
3. VIDEO COMPRESSION
The data in frames is often redundant in space and time. The concept of video compression rests on two main factors:
Spatial redundancy: In a frame, adjacent pixels are usually correlated. e.g., the grass is green across the background of a frame.
Time-based redundancy: In a video, adjacent frames are usually correlated. e.g., the green background persists frame after frame.
In addition, the human eye resolves brightness details better than color details, so given the way the human eye works, it is also possible to delete some data from a frame with almost no noticeable degradation in image quality.
4. MPEG
MPEG stands for Moving Picture Experts Group, established in 1988 as a working group within ISO/IEC that has defined standards for digital compression of audio & video signals. Such as:
MPEG-1: The very first project of this group, published in 1993 as the ISO/IEC 11172 standard. MPEG-1 defines coding methods to compress progressively scanned video. Commonly used in CD-i and Video CD systems. It supports a coding bit rate of 1.5 Mbit/s.
MPEG-2: An extension of MPEG-1, published in 1995 as the ISO/IEC 13818 standard. MPEG-2 defines coding methods to compress progressively scanned as well as interlaced scanned video. Commonly used in broadcast formats, such as Standard Definition TV (SDTV) and High Definition TV (HDTV). It supports coding bit rates of 3 – 15 Mbit/s for SDTV and 15 – 20 Mbit/s for HDTV.
MPEG-4: Introduced in 1998 and still in development as the ISO/IEC 14496 standard. MPEG-4 defines object-based coding methods for mixed media data and provides new features, such as 3D rendering, animation graphics, DRM and various types of interactivity. Commonly used in web-based streaming media, CD, videophone, DVB, etc. It supports coding bit rates from a few kbit/s to tens of Mbit/s.
5. MPEG-2 VIDEO COMPRESSION
MPEG-2 compresses a raw frame into three different kinds of frames –
Intra coded frames (I-frames),
Predictive coded frames (P-frames), and
Bi-directionally predictive coded frames (B-frames).
Compression is based on spatial redundancy and time-based redundancy. Compressed frames (I, P & B frames) are organized in a sequence to form a Group of Pictures (GOP), e.g. I B P B P B P B (GOP 1), followed by the I-frame that starts the next GOP.
6. MPEG-2 FRAMES
I - Frame
Compressed directly from a raw (uncompressed) frame. Compression is based on spatial redundancy within the current raw frame only, and on the inability of the human eye to detect certain changes in the image. An I-frame is a reference frame and can be used to predict the P-frame immediately following it.
P - Frame
Compression is based on spatial redundancy as well as on time-based redundancy. A P-frame is predicted by referring to the I-frame or P-frame immediately preceding it (a P-frame is also a reference frame). A P-frame provides better compression than an I-frame, as it uses the data of the previous I-frame or P-frame.
B - Frame
Compression is similar to a P-frame, except that a B-frame is compressed by referring to the previous as well as the following I-frame and/or P-frame. B-frames require that the reference frames (I or P frames) be transmitted or stored out of order, so that the future frame is available for reference; this causes some delay throughout the system. A B-frame provides better compression than P-frames & I-frames, as it uses the data of the previous as well as the succeeding I-frame and/or P-frame. It requires a memory buffer twice as large, to store the data of two reference/anchor frames. A B-frame is not a reference frame. There is no defined limit to the number of consecutive B-frames within a group of pictures; most applications use two consecutive B-frames as the ideal trade-off between compression efficiency and video quality.
For example (frame numbers with their encoded frame types):
Display order:                99(B) 100(I) 101(B) 102(P) 103(B) 104(P) 105(B) 106(P) 107(I) 108(P)
Encoding/transmission order:  100(I) 102(P) 99(B) 104(P) 101(B) 106(P) 103(B) 107(I) 105(B) 108(P)
7. MPEG-2 VIDEO ENCODING
MPEG-2 video encoding can be broadly categorized into Intra Frame Encoding and Non-Intra Frame Encoding.
Intra Frame Encoding
Takes advantage of spatial redundancy. Compression techniques are applied using the data of the current frame only. It uses a combination of various lossless and lossy compression techniques, such as:
- Video filter: compresses spatial redundancy in the chrominance plane.
- Discrete cosine transform (DCT): converts spatial variation into frequency variation.
- DCT coefficient quantization: reduces higher-frequency DCT coefficients to zero.
- Run-length amplitude/variable-length encoding: compression using entropy encoding, run-length encoding & Huffman encoding.
- Bit rate control: prevents underflow/overflow of the data buffer.
Non-Intra Frame Encoding
Takes advantage of time-based redundancy as well as spatial redundancy. Compression techniques are applied using the data of the current frame as well as preceding and/or succeeding frames. It mainly uses lossy compression techniques, such as:
- Forward interpolated prediction: uses data of a previously coded frame.
- Forward & backward interpolated prediction: uses data of a previously coded frame & a future frame.
- Temporal prediction: uses motion estimation & motion vectors to generate a predicted frame.
- Residual error frame & its coding: the residual is generated by subtracting the predicted frame from its reference frame, and is further spatially coded & transmitted.
7.1 INTRA FRAME ENCODING
Intra frame encoder pipeline: Video Filter (optional) -> DCT -> Quantization -> Run-Length VLC -> Bit Stream Buffer, with Bit Rate Control feeding back into the quantization stage.
Video Filtering
Video filtering is a lossy compression technique used to compress the spatial redundancies on a macro-block basis within the current frame. It operates on the color space (i.e. YCbCr encoding & CbCr sub-sampling), as the Human Visual System is less sensitive to variations in color than to variations in brightness. Video filtering includes YCbCr encoding and chrominance sub-sampling.
Macro-block: The raw frame is divided into macro-blocks, each covering 16 x 16 pixels; for transform coding, each macro-block is further divided into 8 x 8 pixel blocks.
YCbCr Encoding: converts a block's RGB data into the YCbCr color space. A raw frame contains an image in the RGB color space. The RGB color space contains mutual redundancies, so it requires large space for storage and high bandwidth for transmission. Encoding RGB into the YCbCr color space reduces these mutual redundancies.
Conversion:
Y  =       0.299 * R + 0.587 * G + 0.114 * B
Cb = 128 - 0.168736 * R - 0.331264 * G + 0.5 * B
Cr = 128 + 0.5 * R - 0.418688 * G - 0.081312 * B
where R, G & B values are 8 bits long and lie in the range {0, 1, 2, ..., 255}.
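The conversion equations can be applied directly; as a quick illustration, a pure white pixel maps to full luminance with neutral chrominance. The helper below is just a sketch of the slide's formulas (the function name is mine):

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr using the slide's equations."""
    y  =       0.299    * r + 0.587    * g + 0.114    * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5      * b
    cr = 128 + 0.5      * r - 0.418688 * g - 0.081312 * b
    return round(y), round(cb), round(cr)

# White keeps full brightness and neutral color; black keeps neutral color too.
print(rgb_to_ycbcr(255, 255, 255))  # (255, 128, 128)
print(rgb_to_ycbcr(0, 0, 0))        # (0, 128, 128)
```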
Chrominance (CbCr) Sub-Sampling: provides further compression in the chrominance plane to reduce the number of bits needed to represent an image. A YCbCr encoded frame can be represented in the following sampling formats:
4:4:4 Sampling Format
The 4:4:4 sampling format states that for every four Y samples there are four Cb & four Cr samples (every pixel carries Y, Cb & Cr values). If an image resolution is 640 x 480 pixels, then the number of Y samples = 640 x 480, Cb samples = 640 x 480 & Cr samples = 640 x 480.
Number of bits required = 640 x 480 x 8 + 640 x 480 x 8 + 640 x 480 x 8 = 7372800 bits ~ 7.37 Mbit
4:2:2 Sampling Format
The 4:4:4 sampling format can be sub-sampled to the 4:2:2 format, where Cb & Cr are sub-sampled to half the horizontal resolution of Y. That is, in the 4:2:2 sampling format, for every four Y samples in the horizontal direction there are 2 Cb & 2 Cr samples. If an image resolution is 640 x 480 pixels, then the number of Y samples = 640 x 480, Cb samples = 320 x 480 & Cr samples = 320 x 480.
Number of bits required = 640 x 480 x 8 + 320 x 480 x 8 + 320 x 480 x 8 = 4915200 bits ~ 4.92 Mbit
4:2:0 Sampling Format
It can be further sub-sampled to the 4:2:0 format, where Cb & Cr are sub-sampled to half the horizontal and vertical resolution of Y. If an image resolution is 640 x 480 pixels, then the number of Y samples = 640 x 480, Cb samples = 320 x 240 & Cr samples = 320 x 240.
Number of bits required = 640 x 480 x 8 + 320 x 240 x 8 + 320 x 240 x 8 = 3686400 bits ~ 3.69 Mbit (which is far less than the 4:4:4 and 4:2:2 formats)
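All three bit counts above follow from one formula – a full-resolution Y plane plus two sub-sampled chrominance planes. A small sketch (the function name is mine, not from any standard):

```python
def frame_bits(width, height, chroma_w, chroma_h, bits=8):
    """Total bits for one frame: a full-resolution Y plane plus
    Cb and Cr planes of chroma_w x chroma_h samples each."""
    return width * height * bits + 2 * chroma_w * chroma_h * bits

w, h = 640, 480
print(frame_bits(w, h, w,      h))       # 4:4:4 -> 7372800 bits
print(frame_bits(w, h, w // 2, h))       # 4:2:2 -> 4915200 bits
print(frame_bits(w, h, w // 2, h // 2))  # 4:2:0 -> 3686400 bits
```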
Discrete Cosine Transformation
DCT converts the spatial variations within the macro-block into frequency variations without changing the data. The two-dimensional DCT can be represented as:
F(u,v) = (1/4) C(u) C(v) Σ(x=0..7) Σ(y=0..7) f(x,y) cos[(2x+1)uπ/16] cos[(2y+1)vπ/16]
where C(u), C(v) = 1/√2 for u, v = 0 and 1 otherwise; f(x,y) is the pixel value at position (x,y) and F(u,v) is the DCT coefficient at frequency (u,v).
The output of the DCT is a DCT coefficient matrix containing the data in the frequency domain. Data in the frequency domain can be efficiently processed and compressed.
In DCT, each 8x8 data block of the Y, Cb and Cr components is converted into the frequency domain. For example, assume an 8x8 pixel block of the Y component with the following values:

 52  55  61  66  70  61  64  73
 63  59  55  90 109  85  69  72
 62  59  68 113 144 104  66  73
 63  58  71 122 154 106  70  69
 67  61  68 104 126  88  68  70
 79  65  60  70  77  68  58  75
 85  71  64  59  55  61  65  83
 87  79  69  68  65  76  78  94

Step 1 – subtract 128 from each pixel value:

 -76 -73 -67 -62 -58 -67 -64 -55
 -65 -69 -73 -38 -19 -43 -59 -56
 -66 -69 -60 -15  16 -24 -62 -55
 -65 -70 -57  -6  26 -22 -58 -59
 -61 -67 -60 -24  -2 -40 -60 -58
 -49 -63 -68 -58 -51 -60 -70 -53
 -43 -57 -64 -69 -73 -67 -63 -45
 -41 -49 -59 -60 -63 -52 -50 -34

Step 2 – apply the two-dimensional DCT to get the 8x8 DCT coefficient matrix (values rounded):

 -415  -30  -61   27   56  -20   -2    0
    4  -22  -61   10   13   -7   -9    5
  -47    7   77  -25  -29   10    5   -6
  -49   12   34  -15  -10    6    2    2
   12   -7  -13   -4   -2    2   -3    3
   -8    3    2   -6   -2    1    4    2
   -1    0    0   -2   -1   -3    4   -1
    0    0   -1   -4   -1    0    1    2
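A straightforward (unoptimized) implementation of the 2-D DCT formula illustrates the transform. As a simple check, a flat block – all pixels equal – concentrates all of its energy in the DC coefficient, with every AC coefficient essentially zero:

```python
import math

def dct2(block):
    """2-D DCT-II of an 8x8 block:
    F(u,v) = 1/4 C(u) C(v) sum_x sum_y f(x,y)
             * cos((2x+1)u*pi/16) * cos((2y+1)v*pi/16)."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0
    out = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            s = sum(block[x][y]
                    * math.cos((2 * x + 1) * u * math.pi / 16)
                    * math.cos((2 * y + 1) * v * math.pi / 16)
                    for x in range(8) for y in range(8))
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out

# A flat block of value -128 (a black block after subtracting 128):
flat = [[-128] * 8 for _ in range(8)]
coeffs = dct2(flat)
print(round(coeffs[0][0]))  # -1024: DC = 8 * pixel value; AC terms are ~0
```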
Quantization
Quantization reduces the amount of information in the higher-frequency DCT coefficient components using a default quantization matrix defined by the MPEG-2 standard. The default quantization matrix contains constant values. It is a lossy operation that causes minor degradation in image quality due to some subtle loss in brightness and colors. Each component of the DCT coefficient matrix is divided by its corresponding constant value in the default quantization matrix, and a quantized DCT coefficient matrix is computed:
Q(u,v) = round( F(u,v) / QM(u,v) )
Default quantization matrix:
 16  11  10  16  24  40  51  61
 12  12  14  19  26  58  60  55
 14  13  16  24  40  57  69  56
 14  17  22  29  51  87  80  62
 18  22  37  56  68 109 103  77
 24  35  55  64  81 104 113  92
 49  64  78  87 103 121 120 101
 72  92  95  98 112 100 103  99
For example, dividing each component of the 8x8 DCT coefficient matrix above by the corresponding entry of the default quantization matrix and rounding (e.g. the DC term: -415 / 16 ≈ -26) gives the quantized 8x8 DCT coefficient matrix:

 -26  -3  -6   2   2  -1   0   0
   0  -2  -4   1   1   0   0   0
  -3   1   5  -1  -1   0   0   0
  -4   1   2  -1   0   0   0   0
   1   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0
   0   0   0   0   0   0   0   0
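The per-coefficient division can be sketched as follows, rounding half away from zero to match the example values (the helper name is mine):

```python
def quantize(coeff, q):
    """Quantize one DCT coefficient against its quantization-matrix entry,
    rounding half away from zero."""
    return int(coeff / q + (0.5 if coeff >= 0 else -0.5))

# Spot-check a few positions against the example matrices:
print(quantize(-415, 16))  # -26  (DC term)
print(quantize(-30, 11))   # -3
print(quantize(77, 16))    # 5
```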
Run Length Amplitude/ Variable Length Encoding
Run-length amplitude/variable-length encoding is a lossless compression stage that includes entropy encoding, run-length encoding and Huffman encoding.
Entropy Encoding: The components of the quantized DCT coefficient matrix are read in zigzag order, starting from the top-left (DC) coefficient. This represents the frequency coefficients (both higher & lower) of the quantized DCT coefficient matrix in an efficient manner, since the non-zero low-frequency coefficients come first and the zero-valued high-frequency coefficients form one long run at the end. For the example quantized matrix, the zigzag ordering gives:
-26, -3, 0, -3, -2, -6, 2, -4, 1, -4, 1, 1, 5, 1, 2, -1, 1, -1, 2, 0, 0, 0, 0, 0, -1, -1, followed by thirty-eight zeroes.
Run Length Encoding: a lossless data compression technique for sequences in which the same data value occurs in many consecutive data elements. The runs of data are stored as a single data value and a count. For example, the sequence
WWWWWWWWWWWWBWWWWWWWWWWWWBBB
can be represented as 12W1B12W3B.
Applied to the zigzag-ordered DCT coefficients, this yields the run-length encoded sequence:
-26, -3, 0, -3, -2, -6, 2, -4, 1, -4, (two 1s), 5, 1, 2, -1, 1, -1, 2, (five 0s), (two -1s), (thirty-eight zeroes)
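The W/B example above can be reproduced with a minimal run-length encoder (a sketch, not the coefficient-level run/amplitude coding MPEG-2 actually specifies):

```python
def run_length_encode(s):
    """Collapse runs of repeated symbols into count+symbol pairs."""
    out = []
    i = 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1                 # extend the current run
        out.append(f"{j - i}{s[i]}")
        i = j
    return "".join(out)

print(run_length_encode("WWWWWWWWWWWWBWWWWWWWWWWWWBBB"))  # 12W1B12W3B
```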
Huffman Encoding: a lossless data compression technique that uses a variable-length code table for encoding a source symbol, where the variable-length code table is derived from the estimated probability of occurrence of each possible value of the source symbol. An example variable-length code table:
char   freq   code
space  7      111
a      4      010
e      4      000
f      3      1101
h      2      1010
MPEG-2 has a special Huffman code word (EOB) to end the sequence prematurely when all remaining coefficients are zero, and then performs variable-length encoding for further compression:
-26, -3, 0, -3, -2, -6, 2, -4, 1, -4, (two 1s), 5, 1, 2, -1, 1, -1, 2, (five 0s), (two -1s), EOB
Variable-length encoding then produces the compressed bit stream, e.g. 0011101010011011010110001010.
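A Huffman table like the one above can be derived by repeatedly merging the two least frequent nodes. This is a generic sketch for illustration only – real MPEG-2 uses fixed variable-length code tables from the standard, and the exact bit patterns depend on tie-breaking:

```python
import heapq

def huffman_codes(freqs):
    """Build a prefix-free code table from {symbol: frequency}."""
    # Each heap entry: [weight, tie-breaker id, [(symbol, code), ...]]
    heap = [[w, i, [(sym, "")]]
            for i, (sym, w) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)
        hi = heapq.heappop(heap)
        # Prepend 0 on the lighter branch, 1 on the heavier branch.
        merged = ([(s, "0" + c) for s, c in lo[2]] +
                  [(s, "1" + c) for s, c in hi[2]])
        heapq.heappush(heap, [lo[0] + hi[0], next_id, merged])
        next_id += 1
    return dict(heap[0][2])

codes = huffman_codes({" ": 7, "a": 4, "e": 4, "f": 3, "h": 2})
# More frequent symbols get shorter codes than rarer ones:
print(len(codes[" "]), len(codes["h"]))
```

The total encoded length (sum of frequency times code length) is the same for any valid Huffman tie-breaking, even though the individual bit patterns may differ.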
Bit Rate Control
Bit rate control is a mechanism to prevent the underflow or overflow of the buffer used for temporary storage of the encoded bit stream within the encoder. Bit rate control is necessary in applications that require fixed-bit-rate transmission of the encoded bit stream. The quantization process affects relative buffer fullness, which in turn affects the output bit rate, since quantization depends on the default quantization matrix (picture basis) and the quantization scale (macro-block basis). The encoder passes these two parameters to the bit rate control mechanism in order to control relative buffer fullness and keep the bit rate constant. Buffer underflow/overflow can also be prevented by repeating or dropping entire video frames.
7.2 NON-INTRA FRAME ENCODING
Non-intra frame encoder pipeline: the motion-compensated prediction (produced by Motion Estimation and Motion Compensation from an Anchor Frame Memory holding two reference frames) is subtracted from the input frame; the difference passes through Video Filter -> DCT -> Quantization -> Run-Length VLC -> Bit Stream Buffer, with Bit Rate Control. The quantized data is also fed through Inverse Quantization and Inverse DCT and added back to the prediction, reconstructing the anchor frames used for further prediction.
Forward Interpolated Prediction
Using forward interpolated prediction, the encoder can forward predict a future frame, called a P-frame. The very first P-frame in a group of pictures is predicted from the I-frame immediately preceding it. Any other P-frame in a group of pictures is predicted from the previous I-frame or P-frame immediately preceding it. e.g. a GOP of the form I P P P P P ...
Forward & Backward Interpolated Prediction
Using forward & backward interpolated prediction, the encoder can predict a B-frame. A B-frame in a group of pictures is predicted based on a forward prediction from a previous I or P frame as well as a backward prediction from a succeeding I or P frame. e.g. a GOP of the form I B P B P B ...
Temporal Prediction
Mostly, consecutive video frames are similar except for differences induced by objects moving within the frames – e.g. between two frames, a tree moves down and to the right while people move farther to the right than the tree. Temporal prediction uses motion estimation & motion vector techniques to predict these changes in future frames. Motion estimation is applied in the luminance plane only. (It is not applied in the chrominance plane, as it is assumed that the color motion can be adequately represented with the same motion information as the luminance.)
Motion Estimation: performs a 2-dimensional spatial search for each luminance macro-block of the current frame within the reference frame to find the best match. If there is no acceptable match, the encoder codes that particular macro-block as an intra macro-block, even though it may be in a P or B frame.
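The 2-D spatial search can be sketched as an exhaustive search that minimizes the sum of absolute differences (SAD) over a small window. This is a toy illustration under assumed names – real encoders use larger blocks, bigger search ranges and fast search algorithms:

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equal-sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def best_match(ref, block, top, left, radius):
    """Full search around (top, left) in `ref` for the position whose
    block best matches `block` (lowest SAD). Returns (SAD, dy, dx)."""
    n = len(block)
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if 0 <= y <= len(ref) - n and 0 <= x <= len(ref[0]) - n:
                cand = [row[x:x + n] for row in ref[y:y + n]]
                cost = sad(cand, block)
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
    return best  # (dy, dx) is the motion vector for this block

# Toy example: a 2x2 bright patch that moved one pixel down and right.
ref = [[0] * 8 for _ in range(8)]
ref[3][4] = ref[3][5] = ref[4][4] = ref[4][5] = 200
block = [[200, 200], [200, 200]]
print(best_match(ref, block, 2, 3, 2))  # (0, 1, 1): exact match, vector (1, 1)
```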
Motion Vectors
Motion Vector: motion vectors are assigned to the resultant macro-blocks to indicate how far horizontally and vertically each macro-block must be moved so that a predicted frame can be generated. Since each forward-predicted and backward-predicted macro-block contains 2 motion vectors, a bi-directionally predicted macro-block will contain 4 motion vectors.
Residual Error Frame
The residual error frame is generated by subtracting the predicted frame from the desired frame. The residual error frame is less complicated than the original and can be encoded efficiently: the more accurately the motion is estimated & matched, the more likely the residual error approaches zero and the higher the coding efficiency. Since motion vectors tend to be highly correlated between macro-blocks, the horizontal & vertical components are compared to the previously valid horizontal & vertical motion vectors, respectively, and the differences are calculated. These differences, along with the residual error frame, are then coded, and a variable-length code is applied for maximum compression efficiency.
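The subtraction itself is elementwise; a minimal sketch shows why a good prediction yields a residual that is mostly zeros and therefore cheap to code:

```python
def residual(desired, predicted):
    """Residual error frame: desired frame minus the predicted frame,
    element by element."""
    return [[d - p for d, p in zip(d_row, p_row)]
            for d_row, p_row in zip(desired, predicted)]

predicted = [[10, 10], [10, 10]]
desired   = [[10, 12], [10, 10]]
err = residual(desired, predicted)
print(err)  # [[0, 2], [0, 0]] -- mostly zeros, so it codes efficiently
```

The decoder reverses the operation: adding the decoded residual back to the motion-compensated prediction reconstructs the desired frame.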
Residual Error Frame Coding: coding of the residual error frame is similar to I-frame coding, with some differences. Such as:
- The default quantization matrix for non-intra frames is a flat matrix with a constant value of 16 for each of the 64 locations.
- Non-intra frame quantization contains a dead-zone around zero, which helps eliminate any lone DCT coefficient quantization values that might reduce the run-length amplitude efficiency.
- Motion vectors for the residual block information are calculated as differential values and coded with a variable-length code according to their statistical likelihood of occurrence.
8. MPEG-2 VIDEO DECODING
MPEG-2 video decoding can be broadly categorized into Intra Frame Decoding and Non-Intra Frame Decoding.
Intra Frame Decoding
Intra frame decoding reverses the intra frame encoding process: Bit Stream Buffer -> Run-Length VLD -> Inverse Quantization -> Inverse DCT -> Output I/F (optional).
Buffer: contains the input bit stream. For fixed-rate applications, the constant-rate bit stream is buffered in memory and read out at a variable rate based on the coding efficiency of the macro-blocks and frames to be decoded.
VLD: reverses the run-length amplitude/variable-length encoding done in the encoding process and recovers the quantized DCT coefficient matrix. It performs bitwise decoding of the input bit stream using table look-ups, and is the most complex and computationally expensive portion of decoding.
Inverse Quantization: reverses the quantization done in the encoding process and recovers the DCT coefficient matrix. Each component of the decoded quantized DCT coefficient matrix is multiplied by the corresponding value of the default quantization matrix and the quantization scale factor. The resulting coefficient is clipped to the range -2048 to +2047, and IDCT mismatch control is performed to prevent long-term error propagation within the sequence.
Inverse DCT: reverses the DCT done in the encoding process and recovers the spatial-domain frame data. The two-dimensional inverse DCT can be represented as:
f(x,y) = (1/4) Σ(u=0..7) Σ(v=0..7) C(u) C(v) F(u,v) cos[(2x+1)uπ/16] cos[(2y+1)vπ/16]
where C(u), C(v) = 1/√2 for u, v = 0 and 1 otherwise.
Non-Intra Frame Decoding
Non-intra frame decoding is similar to intra frame decoding, with the addition of motion compensation support: the output of the inverse DCT (the decoded residual) is added to the motion-compensated prediction formed from the Anchor Frame Memory (two reference frames) to reconstruct the frame.
9. REFERENCES:
http://en.wikipedia.org/wiki/MPEG-2
http://www.john-wiseman.com/technical/MPEG_tutorial.htm
http://www.bretl.com/mpeghtml/MPEGindex.htm
THANK YOU!

More Related Content

PPT
Introduction to H.264 Advanced Video Compression
PPTX
Video compression
PPT
Iain Richardson: An Introduction to Video Compression
PPTX
MPEG video compression standard
PPT
Compression
PPTX
Audio compression
PPTX
A short history of video coding
PDF
Chapter 5 - Data Compression
Introduction to H.264 Advanced Video Compression
Video compression
Iain Richardson: An Introduction to Video Compression
MPEG video compression standard
Compression
Audio compression
A short history of video coding
Chapter 5 - Data Compression

What's hot (20)

PPTX
Video coding standards ppt
ODP
MPEG-1 Part 2 Video Encoding
PDF
Video compression
PPTX
Region based segmentation
PPTX
PPTX
Multimedia basic video compression techniques
PDF
Video Compression
PPS
MPEG/Audio Compression
PPT
Data Redundacy
PPTX
Image compression standards
PPTX
SPIHT(Set Partitioning In Hierarchical Trees)
PPTX
POTX
Presentation of Lossy compression
PDF
Compression: Video Compression (MPEG and others)
PPT
Thresholding.ppt
PPT
Interpixel redundancy
PPTX
video compression techique
PDF
Lecture 4 Relationship between pixels
PPTX
Image compression in digital image processing
PPTX
digital image processing
Video coding standards ppt
MPEG-1 Part 2 Video Encoding
Video compression
Region based segmentation
Multimedia basic video compression techniques
Video Compression
MPEG/Audio Compression
Data Redundacy
Image compression standards
SPIHT(Set Partitioning In Hierarchical Trees)
Presentation of Lossy compression
Compression: Video Compression (MPEG and others)
Thresholding.ppt
Interpixel redundancy
video compression techique
Lecture 4 Relationship between pixels
Image compression in digital image processing
digital image processing
Ad

Similar to Video Compression Basics - MPEG2 (20)

PDF
An overview Survey on Various Video compressions and its importance
PPTX
Multimedia presentation video compression
PPT
Mmclass5b
DOCX
video comparison
PPT
Introduction to Video Compression Techniques - Anurag Jain
PPT
Mpeg4copy 120428133000-phpapp01
PPT
Multimedia Object - Video
PPT
Android Media Player Development
PPT
MPEG4 vs H.264
PDF
video compression
PDF
video compression
PDF
video compression
PPTX
simple video compression
PPT
Multimedia Presentation
PPT
Digital Video 101.ppt
PPT
Video00.ppt
PPT
mpeg4copy-120428133000-phpapp01.ppt
PPT
H263.ppt
PDF
To Understand Video
PDF
Video Compression Techniques
An overview Survey on Various Video compressions and its importance
Multimedia presentation video compression
Mmclass5b
video comparison
Introduction to Video Compression Techniques - Anurag Jain
Mpeg4copy 120428133000-phpapp01
Multimedia Object - Video
Android Media Player Development
MPEG4 vs H.264
video compression
video compression
video compression
simple video compression
Multimedia Presentation
Digital Video 101.ppt
Video00.ppt
mpeg4copy-120428133000-phpapp01.ppt
H263.ppt
To Understand Video
Video Compression Techniques
Ad

Recently uploaded (20)

PDF
Machine learning based COVID-19 study performance prediction
PDF
cuic standard and advanced reporting.pdf
PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PPTX
sap open course for s4hana steps from ECC to s4
PDF
Electronic commerce courselecture one. Pdf
PPTX
Big Data Technologies - Introduction.pptx
PDF
Mobile App Security Testing_ A Comprehensive Guide.pdf
PDF
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
PDF
The Rise and Fall of 3GPP – Time for a Sabbatical?
PDF
Advanced methodologies resolving dimensionality complications for autism neur...
PDF
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
PDF
Encapsulation_ Review paper, used for researhc scholars
PPTX
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
PDF
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
PPTX
Cloud computing and distributed systems.
PDF
Profit Center Accounting in SAP S/4HANA, S4F28 Col11
PDF
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
PPTX
Spectroscopy.pptx food analysis technology
PPTX
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
PDF
Chapter 3 Spatial Domain Image Processing.pdf
Machine learning based COVID-19 study performance prediction
cuic standard and advanced reporting.pdf
Diabetes mellitus diagnosis method based random forest with bat algorithm
sap open course for s4hana steps from ECC to s4
Electronic commerce courselecture one. Pdf
Big Data Technologies - Introduction.pptx
Mobile App Security Testing_ A Comprehensive Guide.pdf
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
The Rise and Fall of 3GPP – Time for a Sabbatical?
Advanced methodologies resolving dimensionality complications for autism neur...
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
Encapsulation_ Review paper, used for researhc scholars
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
Cloud computing and distributed systems.
Profit Center Accounting in SAP S/4HANA, S4F28 Col11
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
Spectroscopy.pptx food analysis technology
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
Chapter 3 Spatial Domain Image Processing.pdf

Video Compression Basics - MPEG2

  • 1. VIDEO COMPRESSION BASICS MPEG-2 BY- VIJAY
  • 2. AGENDA OVERVIEW VIDEO SCHEME VIDEO COMPRESSION MPEG MPEG-2 VIDEO COMPRESSION MPEG-2 FRAMES INTRA FRAME ENCODING NON-INTRA FRAME ENCODING MPEG-2 VIDEO DECODING INTRA FRAME DECODING NON-INTRA FRAME DECODING MPEG-2 VIDEO ENCODING INTRA FRAME NON-INTRA FRAME
  • 3. A video comprises of a sequence of frames. A video, of the duration of 1 second , generated by a TV camera usually contains 24 frames or 30 frames. Each pixel in a frame is represented by three attributes (each 8 bits long) – One luminance attribute and two chrominance attributes. ( i.e. YCbCr ) Luminance (Y) : Describes the brightness of the pixel. Chrominance (CbCr) : Describes the color of the pixel. 1. OVERVIEW Frame { Y , Cb, Cr } Frame 1 Frame 2 Frame 3 Frame 4
  • 4. For example- A single frame having the resolution of 720 X 480 (no. of pixels in each horizontal line is 720 and total no. of horizontal lines per frame is 480) will be described by A complete video of 1 second will be described by ( 720 X 480 X 8 + 720 X 480 X 8 + 720 X 480 X 8 ) X 24 bits = 199065600 bits ~ 199 Mb. Thus, for the entire movie, the data would be too big to fit on DVDs or to transmit it using the bandwidth of available TV channels. An uncompressed video data is big in size. 720 X 480 X 8 + 720 X 480 X 8 + 720 X 480 X 8 bits = 8294400 bits. ~ 8.29 Mb. Or {Y, Cb, Cr}, {Y, Cb, Cr}, {Y, Cb, Cr}…. {Y, Cb, Cr} {Y, Cb, Cr}, {Y, Cb, Cr}, {Y, Cb, Cr}…. {Y, Cb, Cr} ………………………………………………………………………………… . ……………………………………………………… ........................ ………………………………………………………………………………… . ………………………………………………………………………………… . ………………………………………………………………………………… . ……………………………………………………… ........................ ………………………………………………………………………………… . ………………………………………………………………………………… . {Y, Cb, Cr}, {Y, Cb, Cr}, {Y, Cb, Cr}…. {Y, Cb, Cr} 720 Pixels 480 Lines Frame (720 X 480)
  • 5. 2. VIDEO SCHEME Interlaced Video Scanning Types of video schemes used for transmission - Bottom Fields Even numbered rows in a frame. Top Fields Odd numbered rows in a frame. The two successive fields (field 1 & field 2) are called a frame. Both the fields are sent one after another and display puts them back together before displaying the full frame. Quality degraded as sometimes the frames come out of sync. It conserve the bandwidth. Maximum frame rate is 60 frames/ second. In this, a frame is divided into two separate fields – Top Fields and Bottom fields. Field 1 Field 2 Frame
  • 6. Progressive Video Scanning In this, complete frame is send to display. Bandwidth requirement is twice as compared to Interlaced video scanning. Quality is good as frames come in sync and image is much sharper. Maximum frame rate is 30 frames/ second. Frame
  • 7. Spatial redundancy In a frame, adjacent pixels are usually correlated. e.g. - The grass is green in the background of a frame. 3. VIDEO COMPRESSION The data in frames is often redundant in space and time. The concept of video compression lies on two main factors- For example- The human eye better resolve the brightness details than color details. So the way human eye works, it is also possible to delete some data from the frame with almost no noticeable degradation in image quality. Time based redundancy In a video, adjacent frames are usually correlated. e.g. - The green background is persisting frame after frame. Frame 1 Frame 2 Frame 3 Frame 4
  • 8. 4. MPEG MPEG stands for Moving Picture Experts Group, established in 1988 as a working group within ISO/IEC that has defined standards for digital compression of audio & video signals. Such as- MPEG-1: the very first project of this group, published in 1993 as the ISO/IEC 11172 standard. MPEG-1 defines coding methods to compress progressively scanned video. Commonly used in CD-i and Video CD systems. It supports a coding bit rate of 1.5 Mbit/s. MPEG-2: an extension of MPEG-1, published in 1995 as the ISO/IEC 13818 standard. MPEG-2 defines coding methods to compress progressively scanned as well as interlaced scanned video. Commonly used in broadcast formats such as Standard Definition TV (SDTV) and High Definition TV (HDTV). It supports coding bit rates of 3 - 15 Mbit/s for SDTV and 15 - 20 Mbit/s for HDTV. MPEG-4: introduced in 1998 and still in development as the ISO/IEC 14496 standard. MPEG-4 defines object-based coding methods for mixed media data and provides new features such as 3D rendering, animation graphics, DRM, and various types of interactivity. Commonly used in web-based streaming media, CD, videophone, DVB, etc. It supports coding bit rates from a few Kbit/s to tens of Mbit/s.
  • 9. 5. MPEG-2 VIDEO COMPRESSION MPEG-2 compresses a raw frame into three different kinds of frames – Intra coded frames (I-frames), Predictive coded frames (P-frames), and Bi-directionally predictive coded frames (B-frames). Compression is based on spatial redundancy and temporal redundancy. Compressed frames (I, P & B frames) are organized in a sequence to form a Group of Pictures (GOP). [Figure: a raw frame is compressed into an I-, P- or B-frame; the frames are arranged as a group of pictures, e.g. I B P B P B P B B, with GOP 1 running from one I-frame up to the next.]
  • 10. 6. MPEG-2 FRAMES I-Frame: Compressed directly from a raw (uncompressed) frame. Compression is based on spatial redundancy in the current raw frame only, and on the inability of the human eye to detect certain changes in the image. An I-frame is a reference frame and can be used to predict the P-frame immediately following it. P-Frame: Compression is based on spatial redundancy as well as temporal redundancy. A P-frame can be predicted from the I-frame or P-frame immediately preceding it (a P-frame is also a reference frame). A P-frame provides better compression than an I-frame, as it uses the data of the previous I-frame or P-frame. [Figure: prediction arrows from a previous I or P reference frame to the next P-frame.]
  • 11. B-Frame Compression is similar to a P-frame, except that a B-frame is compressed by referring to the previous as well as the following I-frame and/or P-frame. B-frames require that the reference frames (I or P) be transmitted or stored out of order, so that the future frame is available for reference; this introduces some delay throughout the system. A B-frame provides better compression than P-frames & I-frames, as it uses the data of the previous as well as the succeeding I-frame and/or P-frame. It requires a memory buffer of double size, to store the data of two reference/anchor frames. A B-frame is not a reference frame. There is no defined limit to the number of consecutive B-frames within a group of pictures; most applications use two consecutive B-frames as the ideal trade-off between compression efficiency and video quality. For example, for raw frames numbered 99-108: Display order: 99(B) 100(I) 101(B) 102(P) 103(B) 104(P) 105(B) 106(P) 107(I) 108(P). Encoding/transmission order: 100(I) 102(P) 99(B) 104(P) 101(B) 106(P) 103(B) 107(I) 105(B) 108(P).
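The reordering rule can be sketched in code — a simplified model (hypothetical helper, assuming each B-frame is emitted right after the reference frame that follows it in display order; real encoders may use a slightly different but equivalent schedule):

```python
def encoding_order(display):
    """Reorder display-order frames for transmission so that every
    B-frame follows the future reference (I or P) it depends on.
    Each frame is a (number, type) tuple with type 'I', 'P' or 'B'."""
    out, pending_b = [], []
    for frame in display:
        if frame[1] == 'B':
            pending_b.append(frame)       # hold until the next reference
        else:                             # I or P: emit it, then held B-frames
            out.append(frame)
            out.extend(pending_b)
            pending_b.clear()
    out.extend(pending_b)                 # trailing B-frames, if any
    return out
```

For the slide's display order, this yields each B-frame immediately after its future reference, e.g. 100(I) 99(B) 102(P) 101(B) and so on.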
  • 12. 7. MPEG-2 VIDEO ENCODING MPEG-2 video encoding can be broadly categorized into Intra Frame Encoding and Non-Intra Frame Encoding. Intra Frame Encoding takes advantage of spatial redundancy: compression techniques are applied using the data of the current frame only, with a combination of various lossless and lossy techniques, such as – Video filter: compresses spatial redundancy in the chrominance plane. Discrete cosine transform (DCT): converts spatial variation into frequency variation. DCT coefficient quantization: reduces higher-frequency DCT coefficients to zero. Run-length amplitude/variable length encoding: compression using entropy encoding, run-length encoding & Huffman encoding. Bit rate control: prevents under/overflow of the data buffer. Non-Intra Frame Encoding takes advantage of temporal redundancy as well as spatial redundancy: compression techniques are applied using the data of the current frame as well as preceding and/or succeeding frames, mainly with lossy techniques, such as – Forward interpolated prediction: uses data of a previously coded frame. Forward & backward interpolated prediction: uses data of a previously coded frame & a future frame. Temporal prediction: uses motion estimation & motion vectors to generate a predicted frame. Residual error frame & its coding: generated by subtracting the predicted frame from its reference frame, then spatially coded & transmitted.
  • 13. 7.1 INTRA FRAME ENCODING [Block diagram: Video Filter (optional) → DCT → Quantization → Run-Length VLC → Bit Stream Buffer, with Bit Rate Control feeding back into quantization.] Video Filtering Video filtering is a lossy compression technique used to compress spatial redundancies on a macro-block basis within the current frame. It operates in color space (i.e. YCbCr encoding & CbCr sub-sampling), as the Human Visual System is less sensitive to variations in color than to variations in brightness. Macro-block: a macro-block covers a 16 x 16 pixel area of the raw frame; for transform coding, it is further divided into 8 x 8 pixel blocks (Block_1, Block_2, ..., Block_n).
  • 14. YCbCr Encoding: converts a block's RGB data into the YCbCr color space. A raw frame contains an image in RGB color space. RGB color space contains mutual redundancies, so it requires large space for storage and high bandwidth for transmission; encoding RGB into YCbCr color space reduces these mutual redundancies. Conversion: Y = + 0.299 * R + 0.587 * G + 0.114 * B; Cb = 128 - 0.168736 * R - 0.331264 * G + 0.5 * B; Cr = 128 + 0.5 * R - 0.418688 * G - 0.081312 * B. Where R, G & B values are 8 bits long and lie in the {0, 1, 2, ..., 255} range.
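The conversion formulas above translate directly into code (a minimal sketch using the slide's coefficients; the function name is hypothetical):

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr using the slide's coefficients."""
    y  =       0.299    * r + 0.587    * g + 0.114    * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5      * b
    cr = 128 + 0.5      * r - 0.418688 * g - 0.081312 * b
    return round(y), round(cb), round(cr)
```

Note that a neutral gray (R = G = B) always maps to Cb = Cr = 128, i.e. zero chrominance about the midpoint.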
  • 15. Chrominance (CbCr) Sub-Sampling: provides further compression in the chrominance plane, reducing the number of bits needed to represent an image. A YCbCr encoded frame can be represented in the 4:4:4 sampling format, which states that for every four Y samples there are four Cb & four Cr samples. If an image resolution is 640 x 480 pixels then the number of Y samples = 640 x 480, Cr samples = 640 x 480 & Cb samples = 640 x 480. Number of bits required = 640 x 480 x 8 + 640 x 480 x 8 + 640 x 480 x 8 = 7372800 bits ~ 7.3728 Mb. [Figure: a 4:4:4 sample grid in which every pixel carries Y, Cb & Cr values.]
  • 16. The 4:4:4 sampling format can be sub-sampled to the 4:2:2 format, where Cb & Cr are sub-sampled to half the horizontal resolution of Y. That is, in the 4:2:2 sampling format, for every four Y samples in the horizontal direction there are 2 Cb & 2 Cr samples. If an image resolution is 640 x 480 pixels then the number of Y samples = 640 x 480, Cr samples = 320 x 480 & Cb samples = 320 x 480. Number of bits required = 640 x 480 x 8 + 320 x 480 x 8 + 320 x 480 x 8 = 4915200 bits ~ 4.9152 Mb. [Figure: 4:4:4 vs 4:2:2 sample grids; in 4:2:2, only every other pixel in a row carries Cb & Cr values.]
  • 17. The 4:2:2 format can be further sub-sampled to the 4:2:0 format, where Cb & Cr are sub-sampled to half the horizontal and vertical resolution of Y (the chroma samples are taken at different intervals from the luma samples). If an image resolution is 640 x 480 pixels then the number of Y samples = 640 x 480, Cr samples = 320 x 240 & Cb samples = 320 x 240. Number of bits required = 640 x 480 x 8 + 320 x 240 x 8 + 320 x 240 x 8 = 3686400 bits ~ 3.6864 Mb (which is far less than the 4:4:4 and 4:2:2 formats). [Figure: 4:2:0 sample grid.]
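The three bit counts above follow the same pattern, so they can be computed by one small function (a hedged sketch; the function name is hypothetical and even frame dimensions are assumed):

```python
def sampled_bits(width, height, fmt="4:2:0", depth=8):
    """Bits for one frame under common chroma sub-sampling formats."""
    luma = width * height                       # one Y sample per pixel
    chroma = {"4:4:4": luma,                    # full-resolution Cb and Cr
              "4:2:2": luma // 2,               # half horizontal resolution
              "4:2:0": luma // 4}[fmt]          # half horizontal and vertical
    return (luma + 2 * chroma) * depth          # Y + Cb + Cr samples
```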
  • 18. Discrete Cosine Transformation DCT converts the spatial variations within the macro-block into frequency variations without discarding data. For an 8 x 8 block, the two-dimensional DCT can be represented as: F(u,v) = (1/4) C(u) C(v) Σx=0..7 Σy=0..7 f(x,y) cos[(2x+1)uπ/16] cos[(2y+1)vπ/16], where C(k) = 1/√2 for k = 0 and C(k) = 1 otherwise. The output of the DCT is a DCT coefficient matrix containing the data in the frequency domain, where it can be efficiently processed and compressed.
  • 19. In DCT, each 8x8 data block of the Y, Cb and Cr components is converted into the frequency domain. For example, assume an 8x8 pixel block of the Y component:
  52 55 61 66 70 61 64 73
  63 59 55 90 109 85 69 72
  62 59 68 113 144 104 66 73
  63 58 71 122 154 106 70 69
  67 61 68 104 126 88 68 70
  79 65 60 70 77 68 58 75
  85 71 64 59 55 61 65 83
  87 79 69 68 65 76 78 94
  Subtract 128 from each pixel value (level shift):
  -76 -73 -67 -62 -58 -67 -64 -55
  -65 -69 -73 -38 -19 -43 -59 -56
  -66 -69 -60 -15 16 -24 -62 -55
  -65 -70 -57 -6 26 -22 -58 -59
  -61 -67 -60 -24 -2 -40 -60 -58
  -49 -63 -68 -58 -51 -60 -70 -53
  -43 -57 -64 -69 -73 -67 -63 -45
  -41 -49 -59 -60 -63 -52 -50 -34
  Apply the two-dimensional DCT to get the (rounded) 8 X 8 DCT coefficient matrix:
  -415 -30 -61 27 56 -20 -2 0
  4 -22 -61 10 13 -7 -9 5
  -47 7 77 -25 -29 10 5 -6
  -49 12 34 -15 -10 6 2 2
  12 -7 -13 -4 -2 2 -3 3
  -8 3 2 -6 -2 1 4 2
  -1 0 0 -2 -1 -3 4 -1
  0 0 -1 -4 -1 0 1 2
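The DCT formula can be implemented directly as a (slow but clear) double loop — a minimal sketch; the function name is hypothetical, and a real encoder would use a fast factored DCT instead:

```python
from math import cos, pi, sqrt

def dct2(block):
    """8x8 forward DCT-II, computed straight from the formula."""
    def c(k):
        return 1 / sqrt(2) if k == 0 else 1.0
    out = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            s = sum(block[x][y]
                    * cos((2 * x + 1) * u * pi / 16)
                    * cos((2 * y + 1) * v * pi / 16)
                    for x in range(8) for y in range(8))
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out
```

Applied to the level-shifted block above, the DC term out[0][0] comes out as about -415.37, which rounds to the -415 shown in the coefficient matrix.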
  • 20. Quantization Quantization reduces the amount of information in the higher-frequency DCT coefficient components using a default quantization matrix defined by the MPEG-2 standard; the default quantization matrix contains constant values. It is a lossy operation that causes minor degradation in image quality due to some subtle loss in brightness and colors. Each component of the DCT coefficient matrix is divided by its corresponding constant value in the default quantization matrix, and a quantized DCT coefficient matrix is computed. The quantization function can be represented as: FQ(u,v) = round( F(u,v) / Q(u,v) ). Default quantization matrix:
  16 11 10 16 24 40 51 61
  12 12 14 19 26 58 60 55
  14 13 16 24 40 57 69 56
  14 17 22 29 51 87 80 62
  18 22 37 56 68 109 103 77
  24 35 55 64 81 104 113 92
  49 64 78 87 103 121 120 101
  72 92 95 98 112 100 103 99
  • 21. Quantization For example, dividing the DCT coefficient matrix from the previous slide element-wise by the default quantization matrix and rounding gives the quantized 8 X 8 DCT coefficient matrix:
  -26 -3 -6 2 2 -1 0 0
  0 -2 -4 1 1 0 0 0
  -3 1 5 -1 -1 0 0 0
  -4 1 2 -1 0 0 0 0
  1 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0
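The element-wise divide-and-round step is a one-liner (a sketch; the function name is hypothetical):

```python
def quantize(coeffs, qmatrix):
    """Divide each DCT coefficient by its quantizer step and round."""
    return [[round(c / q) for c, q in zip(rc, rq)]
            for rc, rq in zip(coeffs, qmatrix)]
```

For instance the first three coefficients of the example, -415, -30 and -61, divided by steps 16, 11 and 10, give -26, -3 and -6 — matching the quantized matrix above.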
  • 22. Run Length Amplitude/ Variable Length Encoding Run length amplitude/variable length encoding is a lossless compression technique that includes entropy encoding, run-length encoding and Huffman encoding. Entropy Encoding: the components of the quantized DCT coefficient matrix are read in zigzag order. This represents the frequency coefficients (both higher & lower) of the quantized DCT coefficient matrix in an efficient manner: the low-frequency coefficients come first and the long run of trailing zeros is grouped at the end. For example, zigzag ordering of the quantized 8 X 8 DCT coefficient matrix gives: -26, -3, 0, -3, -2, -6, 2, -4, 1, -4, 1, 1, 5, 1, 2, -1, 1, -1, 2, 0, 0, 0, 0, 0, -1, -1, followed by 38 zeros.
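The zigzag scan can be generated by walking the anti-diagonals of the block, reversing direction on alternate diagonals (a sketch; the function name is hypothetical):

```python
def zigzag(block, n=8):
    """Read an n x n block in JPEG/MPEG zigzag order."""
    seq = []
    for s in range(2 * n - 1):                     # s = i + j indexes a diagonal
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:
            diag.reverse()                         # even diagonals run bottom-left to top-right
        seq.extend(block[i][j] for i, j in diag)
    return seq
```

Applied to the quantized matrix of the example, it reproduces the sequence shown above, ending in the run of 38 zeros.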
  • 23. Run Length Encoding: a lossless data compression technique for sequences in which the same data value occurs in many consecutive data elements. The runs of data are stored as a single data value and a count. For example, the sequence WWWWWWWWWWWWBWWWWWWWWWWWWBBB can be represented as 12W1B12W3B. Applying run length encoding to the zigzag-ordered DCT coefficients gives: -26, -3, 0, -3, -2, -6, 2, -4, 1, -4, (two 1s), 5, 1, 2, -1, 1, -1, 2, (five 0s), (two -1s), (thirty-eight zeros).
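The W/B example can be reproduced with a few lines (a sketch; the function name is hypothetical and the output format is count-then-symbol, as on the slide):

```python
def rle(s):
    """Run-length encode a string: each run becomes <count><symbol>."""
    out, i = [], 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1                       # extend the current run
        out.append(f"{j - i}{s[i]}")     # emit count + symbol
        i = j
    return "".join(out)
```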
  • 24. Huffman Encoding: a lossless data compression technique that uses a variable length code table for encoding a source symbol, where the variable length code table is derived from the estimated probability of occurrence of each possible value of the source symbol. Example variable length code table — char: freq: code — space: 7: 111; a: 4: 010; e: 4: 000; f: 3: 1101; h: 2: 1010. MPEG-2 has a special Huffman code word (EOB) to end the sequence prematurely when the remaining coefficients are zero, and then performs variable length encoding for further compression: -26, -3, 0, -3, -2, -6, 2, -4, 1, -4, (two 1s), 5, 1, 2, -1, 1, -1, 2, (five 0s), (two -1s), EOB → variable length encoding → compressed bit stream, e.g. 0011101010011011010110001010.
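A Huffman code table can be built from symbol frequencies with a min-heap — a sketch of the classic construction (the function name is hypothetical; the exact bit patterns depend on tie-breaking, so they need not match the slide's illustrative table, but the prefix-free property and the shorter-codes-for-frequent-symbols property always hold):

```python
import heapq

def huffman_codes(freqs):
    """Build a prefix-free Huffman code table from {symbol: frequency}."""
    # Each heap entry: [total weight, tie-breaker, {symbol: partial code}]
    heap = [[w, i, {s: ""}] for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    n = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)      # two lightest subtrees
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, [w1 + w2, n, merged])
        n += 1
    return heap[0][2]
```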
  • 25. Bit Rate Control Bit rate control is a mechanism to prevent the underflow or overflow of the buffer used for temporary storage of the encoded bit stream within the encoder. It is necessary in applications that require fixed bit rate transmission of the encoded bit-stream. The quantization process affects relative buffer fullness, which in turn affects the output bit rate, as quantization depends on the default quantization matrix (picture basis) and the quantization scale (macro-block basis); the encoder passes these two parameters to the bit rate control mechanism in order to maintain relative buffer fullness and a constant bit rate. Buffer underflow/overflow can also be prevented by repeating or dropping entire video frames. [Block diagram: Quantization → Run-Length VLC → Bit Stream Buffer, with Bit Rate Control feedback.]
  • 26. 7.2 NON-INTRA FRAME ENCODING [Block diagram: input minus motion-compensated prediction → Video Filter → DCT → Quantization → Run-Length VLC → Bit Stream Buffer (with Bit Rate Control feedback); a local decoding loop of Inverse Quantization → Inverse DCT feeds, via an adder, the Anchor Frame Memory (2 frames) and Motion Compensation, driven by Motion Estimation.]
  • 27. Forward Interpolated Prediction Using forward interpolated prediction, the encoder can forward predict a future frame, called a P-frame. The very first P-frame in a group of pictures is predicted from the I-frame immediately preceding it; any other P-frame in a group of pictures is predicted from the I-frame or P-frame immediately preceding it. Forward & Backward Interpolated Prediction Using forward & backward interpolated prediction, the encoder can bi-directionally predict a frame, called a B-frame. A B-frame in a group of pictures is predicted using a forward prediction from a previous I or P frame as well as a backward prediction from a succeeding I or P frame. [Figure: GOP sequences I P P P P P P and I B P B P B across GOP 1 and GOP 2, with prediction arrows.]
  • 28. Temporal Prediction Mostly, consecutive video frames are similar except for the differences induced by objects moving within the frames — for example, between frame 1 (current) and frame 2 (next), the tree moved down and to the right, and the people moved farther to the right than the tree. Temporal prediction uses motion estimation & motion vector techniques to predict these changes in future frames. Motion estimation is applied in the luminance plane only (it is not applied in the chrominance plane, as it is assumed that the color motion can be adequately represented with the same motion information as the luminance).
  • 29. Motion Estimation: performs a 2-dimensional spatial search for each luminance macro-block within the frame to get the best match. If there is no acceptable match, the encoder codes that particular macro-block as an intra macro-block, even though it may be in a P or B frame. [Figure: a macro-block from frame 2 searched against candidate positions in frame 1.]
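The 2-D spatial search can be sketched as an exhaustive block-matching loop that minimizes the sum of absolute differences (SAD) — a hedged illustration with hypothetical helper names; real encoders use much faster hierarchical or diamond searches:

```python
def sad(a, b):
    """Sum of absolute differences between two equal-sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def best_match(ref, block, top, left, radius):
    """Exhaustively search `ref` around (top, left) within +/- radius
    for the position whose block minimizes SAD against `block`."""
    n = len(block)
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if 0 <= y <= len(ref) - n and 0 <= x <= len(ref[0]) - n:
                cand = [row[x:x + n] for row in ref[y:y + n]]
                cost = sad(cand, block)
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
    return best  # (SAD cost, vertical offset, horizontal offset)
```

A zero SAD at the winning offset means an exact match was found, so the motion vector fully explains the block and the residual is zero.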
  • 30. Motion Vector: motion vectors are assigned to the resultant macro-blocks to indicate how far, horizontally and vertically, each macro-block must be moved so that a predicted frame can be generated. Since each forward- and backward-predicted macro-block carries 2 motion vectors, a bi-directionally predicted macro-block carries 4 motion vectors. [Figure: motion vectors mapping macro-blocks of frame 1 to the predicted frame.]
  • 31. Residual Error Frame The residual error frame is generated by subtracting the predicted frame from the desired frame. Since motion vectors tend to be highly correlated between macro-blocks, the horizontal & vertical components of each vector are compared to the previously valid horizontal & vertical motion vectors respectively, and the differences are calculated. These differences, together with the residual error frame, are then coded, and variable length coding is applied for maximum compression efficiency. The residual error frame is less complicated than the raw frame and can be encoded efficiently. The more accurately the motion is estimated & matched, the more likely the residual error is to approach zero, and the higher the coding efficiency. [Figure: desired frame (frame 2) minus predicted frame = residual error frame.]
  • 32. Residual Error Frame Coding: coding of the residual error frame is similar to an I-frame, with some differences. The default quantization matrix for non-intra frames is a flat matrix with a constant value of 16 for each of the 64 locations. Non-intra frame quantization contains a dead-zone around zero, which helps eliminate any lone DCT coefficient quantization values that might reduce run-length amplitude efficiency. Motion vectors for the residual block information are calculated as differential values and coded with a variable length code according to their statistical likelihood of occurrence.
  • 33. 8. MPEG-2 VIDEO DECODING MPEG-2 video decoding can be broadly categorized into Intra Frame Decoding and Non-Intra Frame Decoding. Intra Frame Decoding Intra frame decoding reverses the intra frame encoding process: Bit Stream Buffer → Run-Length VLD → Inverse Quantization → Inverse DCT → Output I/F (optional). Buffer: contains the input bit-stream. For fixed rate applications, the constant bit-stream is buffered in memory and read out at a variable rate, based on the coding efficiency of the macro-blocks and frames to be decoded. VLD: reverses the run length amplitude/variable length encoding done in the encoding process and recovers the quantized DCT coefficient matrix. It is the most complex and computationally expensive portion of decoding, performing bitwise decoding of the input bit-stream using table look-ups to generate the quantized DCT coefficient matrix.
  • 34. Inverse Quantization: reverses the quantization done in the encoding process and recovers the DCT coefficient matrix. Each component of the decoded quantized DCT coefficient matrix is multiplied by the corresponding value of the default quantization matrix and the quantization scale factor; the resulting coefficient is clipped to the range -2048 to +2047. IDCT mismatch control is performed to prevent long-term error propagation within the sequence. Inverse DCT: reverses the DCT done in the encoding process and recovers the spatial-domain data. The two-dimensional inverse DCT can be represented as: f(x,y) = (1/4) Σu=0..7 Σv=0..7 C(u) C(v) F(u,v) cos[(2x+1)uπ/16] cos[(2y+1)vπ/16], where C(k) = 1/√2 for k = 0 and C(k) = 1 otherwise.
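The multiply-and-clip step of inverse quantization is simple to sketch (hypothetical function name; note the result is not the original coefficient — quantization loss remains, e.g. -26 x 16 = -416 versus the original -415):

```python
def dequantize(qcoeff, qmatrix, scale=1):
    """Inverse quantization: multiply each quantized coefficient by its
    quantizer step (and scale factor), clipping to -2048..+2047."""
    return [[max(-2048, min(2047, c * m * scale))
             for c, m in zip(rc, rm)]
            for rc, rm in zip(qcoeff, qmatrix)]
```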
  • 35. Non-Intra Frame Decoding Non-intra frame decoding is similar to intra frame decoding, with the addition of motion compensation support. [Block diagram: Bit Stream Buffer → Run-Length VLD → Inverse Quantization → Inverse DCT → adder (combining the motion-compensated prediction from the Anchor Frame Memory (2)) → Output I/F (optional).]
  • 36. 9. REFERENCES: http://en.wikipedia.org/wiki/MPEG-2 http://www.john-wiseman.com/technical/MPEG_tutorial.htm http://www.bretl.com/mpeghtml/MPEGindex.htm