Adaptive Huffman Coding
Why Adaptive Huffman Coding? Huffman coding suffers from the fact that the decompressor must have some knowledge of the probabilities of the symbols in the compressed file, which can take extra bits to encode. If this information is unavailable, compressing the file requires two passes. First pass: find the frequency of each symbol and construct the Huffman tree. Second pass: compress the file. Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2005-2006
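The two-pass scheme can be sketched in Python. This is a hypothetical minimal implementation (the names huffman_codes and encode are illustrative, and ties in the heap are broken by insertion order rather than any canonical rule):

```python
# Pass 1: count frequencies and build the Huffman tree.
# Pass 2: compress using the resulting codeword table.
import heapq
from collections import Counter

def huffman_codes(data):
    freq = Counter(data)                      # pass 1: symbol frequencies
    # Min-heap of (weight, tiebreak, node); a node is (symbol, left, right).
    heap = [(w, i, (sym, None, None)) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)     # combine the two lightest nodes
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, counter, (None, left, right)))
        counter += 1
    codes = {}
    def walk(node, prefix):                   # assign codewords by tree walk
        sym, left, right = node
        if sym is not None:
            codes[sym] = prefix or "0"        # degenerate one-symbol alphabet
            return
        walk(left, prefix + "0")
        walk(right, prefix + "1")
    walk(heap[0][2], "")
    return codes

def encode(data, codes):
    return "".join(codes[s] for s in data)    # pass 2: compress
```

Note that the decompressor needs the tree (or the table) to be transmitted alongside the output, which is exactly the overhead discussed in the next slides.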
The key idea The key idea is to build a Huffman tree that is optimal for the part of the message already seen, and to reorganize it when needed to maintain its optimality
Pro & Con - I Adaptive Huffman determines the mapping to codewords using a running estimate of the source symbols' probabilities. This allows effective exploitation of locality. For example, suppose that a file starts out with a series of occurrences of a character that is not repeated again in the file. In static Huffman coding, that character will sit low in the tree because of its low overall count, thus taking many bits to encode. In adaptive Huffman coding, the character is inserted at the highest possible leaf, and is only later pushed down the tree by higher-frequency characters
Pro & Con - II Only one pass over the data is needed. Overhead: in static Huffman we must somehow transmit the model used for compression, i.e. the tree shape; this costs about 2n bits in a clever representation. As we will see, in adaptive schemes the overhead is n log n bits. Sometimes encoding needs a few more bits than static Huffman (without overhead), but adaptive schemes generally compare well with static Huffman once overhead is taken into account
Some history Adaptive Huffman coding was first conceived independently by Faller (1973) and Gallager (1978). Knuth contributed improvements to the original algorithm (1985), and the resulting algorithm is referred to as algorithm FGK. A more recent version of adaptive Huffman coding was described by Vitter (1987) and is called algorithm V
An important question By better exploiting locality, adaptive Huffman coding is sometimes able to do better than static Huffman coding, i.e., for some messages it achieves better compression... but we have established the optimality of static Huffman coding, in the sense of minimal redundancy. Is there a contradiction?
Algorithm FGK - I The basis for algorithm FGK is the Sibling Property (Gallager 1978): a binary code tree with nonnegative weights has the sibling property if each node (except the root) has a sibling and if the nodes can be numbered in order of nondecreasing weight with each node adjacent to its sibling; moreover, the parent of a node is higher in the numbering. A binary prefix code is a Huffman code if and only if the code tree has the sibling property
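The sibling property can be checked mechanically. A minimal sketch, assuming a hypothetical array representation in which node i (1-based, index 0 unused) carries weight[i] and parent[i], with parent of the root set to 0:

```python
# Check the sibling property: weights nondecreasing in the numbering,
# nodes (1,2), (3,4), ... are sibling pairs, and every parent is
# numbered higher than its children.
def has_sibling_property(weight, parent):
    n = len(weight) - 1                      # nodes are numbered 1..n
    for i in range(1, n):
        if weight[i] > weight[i + 1]:        # order of nondecreasing weight
            return False
    for i in range(1, n, 2):                 # each node adjacent to its sibling
        if parent[i] != parent[i + 1]:
            return False
        if parent[i] <= i + 1:               # parent higher in the numbering
            return False
    return True
```

On the 11-node tree of the next slide (weights 2, 3, 5, 5, 5, 6, 10, 11, 11, 21, 32 with sibling pairs (1,2)→4, (3,4)→7, (5,6)→8, (7,8)→10, (9,10)→11) the check passes.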
Algorithm FGK - II Note that the node numbering corresponds to the order in which the nodes are combined by Huffman's algorithm: first nodes 1 and 2, then nodes 3 and 4, and so on. [Figure: Huffman tree with leaves a(2), b(3), d(5), c(5), e(6), f(11), internal nodes of weight 5, 10, 11, 21 and root 32, numbered 1-11 in nondecreasing weight order.]
Algorithm FGK - III In algorithm FGK, both encoder and decoder maintain dynamically changing Huffman code trees. For each symbol the encoder sends the codeword for that symbol in the current tree and then updates the tree. The problem is to quickly change the tree that is optimal after t symbols (not necessarily distinct) into the tree that is optimal for t+1 symbols. If we simply increment the weight of the (t+1)-th symbol and of all its ancestors, the sibling property may no longer hold, so we must rebuild the tree
Algorithm FGK - IV Suppose the next symbol is “b”: if we simply update the weights... the sibling property is violated!! This is no longer a Huffman tree. [Figure: the tree of the previous slide after naively incrementing the weights along b's path to 4, 6, 11, 22, 33; the nodes are no longer ordered by nondecreasing weight.]
Algorithm FGK - V The solution can be described as a two-phase process. First phase: the original tree is transformed into another valid Huffman tree for the first t symbols, one with the property that the simple increment process can be applied successfully. Second phase: the increment process, as described previously
Algorithm FGK - V The first phase starts at the leaf of the (t+1)-th symbol. We swap this node, together with all its subtree but not its number, with the highest-numbered node of the same weight. The new current node is the parent of this latter node. The process is repeated until we reach the root
Algorithm FGK - VI First phase: node 2, nothing to be done; node 4, to be swapped with node 5; node 8, to be swapped with node 9; root reached: stop! Second phase. [Figure: the tree after the update; the weights along b's path become 4, 6, 12 and the root 33.]
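The update traced above can be sketched in Python. This is a hypothetical array-based representation (slot i holds node number i+1, children are stored as slot indices); the code interleaves each swap with the increment, which on this example produces the same tree as the two-phase description:

```python
class Node:
    def __init__(self, weight, symbol=None, left=None, right=None):
        self.weight = weight
        self.symbol = symbol          # None for internal nodes
        self.left = left              # slot indices of the children
        self.right = right
        self.parent = None            # filled in by link_parents

def link_parents(nodes):
    for i, n in enumerate(nodes):
        if n.left is not None:
            nodes[n.left].parent = i
            nodes[n.right].parent = i

def swap(nodes, i, j):
    # Exchange the two subtrees while keeping the numbering in place:
    # the slots trade occupants, the occupants trade parents, and the
    # children of both moved nodes are re-pointed at their new slots.
    nodes[i], nodes[j] = nodes[j], nodes[i]
    nodes[i].parent, nodes[j].parent = nodes[j].parent, nodes[i].parent
    for k in (i, j):
        if nodes[k].left is not None:
            nodes[nodes[k].left].parent = k
            nodes[nodes[k].right].parent = k

def fgk_update(nodes, i):
    # Walk from the updated leaf to the root; at each node, swap with the
    # highest-numbered node of equal weight (never the parent), then
    # increment the weight and move up.
    while i is not None:
        j = max(k for k in range(len(nodes))
                if nodes[k].weight == nodes[i].weight)
        if j != i and j != nodes[i].parent:
            swap(nodes, i, j)
            i = j
        nodes[i].weight += 1
        i = nodes[i].parent
```

Running fgk_update on the slides' tree with the leaf for “b” reproduces the trace: node 4 swaps with node 5, node 8 with node 9, and the weights along the path become 4, 6, 12, 33.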
Why does FGK work? The two-phase procedure builds a valid Huffman tree for t+1 symbols, as the sibling property is satisfied. In fact, each node whose weight is to be increased is first swapped with the highest-numbered node of the same weight, so after the increment there is no higher-numbered node with the previous weight
The Not Yet Seen problem - I When the algorithm starts, and sometimes during the encoding, we encounter a symbol that has not been seen before. How do we handle this? We use a single 0-node (with weight 0) that represents all the unseen symbols. When a new symbol appears, we send the code for the 0-node followed by some bits that identify which symbol it is. As each time we send log n bits to identify the symbol, the total overhead is n log n bits. It is possible to do better by sending only the index of the symbol in the list of the currently unseen symbols; in this way we can save some bits, on average
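The saving from indexing into the shrinking list of unseen symbols can be quantified with a small sketch (fixed_cost and shrinking_cost are hypothetical helper names; both schemes are assumed to use fixed-length binary indices):

```python
# Compare the identification overhead of the two schemes over a whole
# alphabet of n symbols, assuming every symbol eventually appears.
from math import ceil, log2

def fixed_cost(n):
    # Every new symbol is identified with ceil(log2 n) bits.
    return n * ceil(log2(n))

def shrinking_cost(n):
    # The i-th new symbol is an index into the list of remaining unseen
    # symbols: n options for the first, n-1 for the next, ...; the last
    # remaining symbol needs no bits at all.
    return sum(ceil(log2(k)) for k in range(n, 1, -1))
```

For an alphabet of n = 256 symbols the fixed scheme costs 2048 bits against 1793 for the shrinking list, a saving of 255 bits.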
The Not Yet Seen problem - II Then the 0-node is split into two leaves, which are siblings: one for the new symbol, with weight 1, and a new 0-node. Then the tree is recomputed as seen before, in order to satisfy the sibling property
Algorithm FGK - summary The algorithm starts with only one leaf node, the 0-node. As the symbols arrive, new leaves are created and each time the tree is recomputed. Each symbol is coded with its codeword in the current tree, and then the tree is updated. Unseen symbols are coded with the 0-node codeword followed by some extra bits that specify the symbol
Algorithm FGK - VII Algorithm FGK compares favourably with static Huffman coding if we also consider overhead costs (it is used in the Unix utility compact). Exercise: construct the static Huffman tree and the FGK tree for the message “e eae de eabe eae dcf” and evaluate the number of bits needed for the coding with both algorithms, ignoring the overhead for Huffman. SOL. FGK: 60 bits, Huffman: 52 bits. The FGK count is obtained using the minimum number of bits for the element in the list of unseen symbols
Algorithm FGK - VIII If T = “total number of bits transmitted by algorithm FGK for a message of length t containing n distinct symbols”, then T < 2S + t, where S is the performance of static Huffman (Vitter 1987). So the performance of algorithm FGK is never much worse than twice optimal
Algorithm V - I Vitter, in his work of 1987, introduces two improvements over algorithm FGK, calling the new scheme algorithm Λ; as a tribute to his work, the algorithm has become famous under that letter, his initial “V” flipped upside-down
The key ideas - I Swapping nodes during encoding and decoding is onerous. In algorithm FGK the number of swaps per update (counting a double cost for updates that move a swapped node two levels higher) is bounded by ⌊l/2⌋, where l is the length of the codeword of the added symbol in the old tree (this bound requires some effort to prove and is due to the work of Vitter). In algorithm V, the number of swaps per update is bounded by 1
The key ideas - II Moreover algorithm V minimizes not only Σ_j w_j l_j, as Huffman and FGK do, but also max_j l_j, i.e. the height of the tree, and Σ_j l_j, i.e. it is better suited to code the next symbol, given that it could be represented by any of the leaves of the tree. These two objectives are reached through a new numbering scheme, called implicit numbering
Implicit numbering The nodes of the tree are numbered in increasing order by level: nodes on one level are numbered lower than nodes on the next higher level, and nodes on the same level are numbered in increasing order from left to right. If this numbering is satisfied (and in FGK it is not always satisfied), certain types of updates cannot occur
An invariant The key to minimizing the other kind of interchanges is to maintain the following invariant: for each weight w, all leaves of weight w precede (in the implicit numbering) all internal nodes of weight w. The interchanges in algorithm V are designed to restore the implicit numbering when a new symbol is read, and to preserve the invariant
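The invariant can be expressed as a small check over the nodes listed in implicit-numbering order (a hypothetical sketch; each node is given as a (weight, is_leaf) pair):

```python
# For each weight w, every leaf of weight w must precede every internal
# node of weight w in the implicit numbering.
def satisfies_invariant(nodes_in_implicit_order):
    seen_internal = {}   # weight -> True once an internal node of that weight appears
    for weight, is_leaf in nodes_in_implicit_order:
        if is_leaf:
            if seen_internal.get(weight):
                return False          # a leaf follows an internal node of equal weight
        else:
            seen_internal[weight] = True
    return True
```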
Algorithm V - II If T = “total number of bits transmitted by algorithm V for a message of length t containing n distinct symbols”, then T < S + t. At worst, then, Vitter's adaptive method may transmit one more bit per codeword than the static Huffman method. Empirically, algorithm V slightly outperforms algorithm FGK
