Space and Time Tradeoffs (Hashing)




Space and Time Tradeoffs
 Space and time tradeoffs are a well-known issue in
 algorithm design.
    Example: computing values of a function at many points
    in advance and storing them in a table for later lookup.


 One type of technique is to use extra space to
 facilitate faster and/or more flexible access to the
 data.
    This approach is called prestructuring.
    We illustrate this approach by Hashing.




Hashing
 A dictionary is a set that supports operations
 of searching, insertion, and deletion.
   Each element in the set contains a key and
   satellite data (the remainder of the record).
   The keys are unique, but the satellite data
   need not be.
 A hash table is an effective data structure for
 implementing dictionaries.
 Hashing is based on the idea of distributing
 keys among the slots of a one-dimensional array.

Direct-address Tables
 Suppose that an application needs a dynamic set in
 which each element has a key drawn from the
 universe U = {0, 1, …, m-1}, where m is not too
 large. Denote the direct-address table by T[0..m-1], in
 which each position, or slot, corresponds to a key in
 the universe U.
 Operations
    DIRECT-ADDRESS-SEARCH(T, k)          O(1)
     return T[k]
    DIRECT-ADDRESS-INSERT(T, x)          O(1)
     T[key[x]] ← x
    DIRECT-ADDRESS-DELETE(T, x)          O(1)
     T[key[x]] ← NIL
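
The three operations above can be sketched in Python as follows. This is only an illustrative sketch: the class name `DirectAddressTable` and its method names are assumptions made for this example, and `None` stands in for NIL.

```python
class DirectAddressTable:
    """Direct-address table over the key universe {0, 1, ..., m-1}."""

    def __init__(self, m):
        # One slot per possible key; None plays the role of NIL.
        self.slots = [None] * m

    def search(self, k):
        # DIRECT-ADDRESS-SEARCH: a single array access, O(1).
        return self.slots[k]

    def insert(self, key, value):
        # DIRECT-ADDRESS-INSERT: store the element in the slot named by its key.
        self.slots[key] = value

    def delete(self, key):
        # DIRECT-ADDRESS-DELETE: reset the slot to NIL.
        self.slots[key] = None


table = DirectAddressTable(8)
table.insert(3, "three")
found = table.search(3)      # "three"
table.delete(3)
missing = table.search(3)    # None
```

The price of the O(1) operations is space: the array has one slot for every key in U, whether or not that key is ever stored.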
Hash Tables
A hash table is used when the set K of keys stored in the
dictionary is much smaller than the universe U = {0,
1, …, n-1} of all possible keys.
  Example: the key space of character strings.
  A hash table requires much less storage than a direct-address
  table, while the average search cost is still O(1).
An example of a hash table
Direct addressing vs. Hashing
  Direct addressing: an element with key k is stored in slot k;
  Hashing: an element with key k is stored in slot h(k), where h
  is the hash function.




Hash Tables
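A hash function assigns each key an integer between 0 and m-1,
called its hash address.
   An example hash function: $h(K) = K \bmod m$
     Integer keys: apply h to the key directly.
     Character keys: use ord(K), the position of the character in the alphabet.
     Character string keys K = c_0 c_1 … c_{s-1}: either
          $h(K) = \left( \sum_{i=0}^{s-1} \mathrm{ord}(c_i) \right) \bmod m$
       or
          $h(K) = \left( \mathrm{ord}(c_{s-1}) C^{s-1} + \mathrm{ord}(c_{s-2}) C^{s-2} + \dots + \mathrm{ord}(c_0) C^{0} \right) \bmod m$
       where C is a constant larger than every ord(c_i).
  Let m = 13. Calculate the hash address of each of the following
  strings:
         A, FOOL, AND, HIS, MONEY, ARE, SOON, PARTED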
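
Both string hash functions can be tried out with a short Python sketch. The function names `ord_alpha`, `hash_sum` and `hash_poly` are made up for this example, and the sketch assumes uppercase keys, the sum-of-positions function for the exercise, c_0 as the first character of the string, and C = 27 as one arbitrary constant larger than any letter position.

```python
def ord_alpha(c):
    """Position of a letter in the alphabet: A -> 1, ..., Z -> 26."""
    return ord(c.upper()) - ord("A") + 1


def hash_sum(key, m):
    """Sum of the letter positions, reduced mod m."""
    return sum(ord_alpha(c) for c in key) % m


def hash_poly(key, m, C=27):
    """Polynomial hash (ord(c_{s-1}) C^{s-1} + ... + ord(c_0) C^0) mod m.

    Evaluated with Horner's rule, starting from the last character,
    so that c_0 (the first character) gets the power C^0.
    """
    h = 0
    for c in reversed(key):
        h = (h * C + ord_alpha(c)) % m
    return h


words = ["A", "FOOL", "AND", "HIS", "MONEY", "ARE", "SOON", "PARTED"]
addresses = {w: hash_sum(w, 13) for w in words}
# addresses == {"A": 1, "FOOL": 9, "AND": 6, "HIS": 10,
#               "MONEY": 7, "ARE": 11, "SOON": 11, "PARTED": 12}
```

Under these assumptions ARE and SOON receive the same address, 11, which is an instance of a collision.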


Hash Function
 A hash function needs to satisfy two
 requirements:
   It must distribute keys among the cells of
   the hash table as evenly as possible (for this
   reason, m is usually chosen to be prime).
   It must be easy to compute.




Collision and Resolution
 Collision: two keys hash to the same
 slot.
 Collision resolution by open hashing
 (separate chaining)
 Collision resolution by closed hashing
 (open addressing)


Open Hashing (Separate Chaining)
  Put all the elements that hash to the same
  slot in a linked list.
     Example
  Dictionary Operations
     CHAINED-HASH-SEARCH(T, k)
      search for an element with key k in list T[h(k)]
     CHAINED-HASH-INSERT(T, x)                   O(1)
      insert x at the head of list T[h(key[x])]
     CHAINED-HASH-DELETE(T, x)
      search and delete x from the list T[h(key[x])]
Exercise
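
A minimal Python sketch of separate chaining is given here for illustration. The class `ChainedHashTable` is hypothetical; it uses `collections.deque` as a stand-in for a linked list, and it assumes uppercase string keys hashed by the sum of their letter positions mod m.

```python
from collections import deque


class ChainedHashTable:
    """Hash table with separate chaining; one chain (deque) per slot."""

    def __init__(self, m=13):
        self.m = m
        self.slots = [deque() for _ in range(m)]

    def _h(self, key):
        # Assumed hash function: sum of letter positions mod m.
        return sum(ord(c) - ord("A") + 1 for c in key) % self.m

    def search(self, key):
        # CHAINED-HASH-SEARCH: scan only the chain in slot h(key).
        for k, value in self.slots[self._h(key)]:
            if k == key:
                return value
        return None

    def insert(self, key, value):
        # CHAINED-HASH-INSERT: add at the head of the chain, O(1).
        # Assumes the key is not already present.
        self.slots[self._h(key)].appendleft((key, value))

    def delete(self, key):
        # CHAINED-HASH-DELETE: find the element in its chain, then unlink it.
        chain = self.slots[self._h(key)]
        for item in chain:
            if item[0] == key:
                break
        else:
            return False
        chain.remove(item)
        return True


table = ChainedHashTable(13)
for w in ["A", "FOOL", "AND", "HIS", "MONEY", "ARE", "SOON", "PARTED"]:
    table.insert(w, len(w))
chain_11 = [k for k, _ in table.slots[11]]   # ["SOON", "ARE"]
```

With these keys, ARE and SOON share slot 11, so a search for either one scans a chain of length two.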
Cost of Search
 Load factor of the hash table
    α = n/m, where n is the number of keys and m is
    the number of slots in the hash table.
    α too small: wastes space, but searches are fast
    α too large: saves space, but searches are slow
 The worst case is O(n): all keys hash to the same slot
 The average case (assuming keys are distributed evenly among the slots)
    Average number of probes in a successful search: about 1 + α/2
    Average number of probes in an unsuccessful search: α
    If n is about equal to m, both are O(1)
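
To see how the load factor drives these costs, the small Python helper below evaluates the usual estimates for separate chaining: about 1 + α/2 probes for a successful search and α for an unsuccessful one. The function name `chaining_costs` is made up for this example, and the estimates assume that keys are spread evenly over the slots.

```python
def chaining_costs(n, m):
    """Return (alpha, successful, unsuccessful) estimates for n keys in m slots."""
    alpha = n / m                 # load factor
    successful = 1 + alpha / 2    # estimated probes, successful search
    unsuccessful = alpha          # estimated probes, unsuccessful search
    return alpha, successful, unsuccessful


half_full = chaining_costs(50, 100)    # (0.5, 1.25, 0.5)
overloaded = chaining_costs(400, 100)  # (4.0, 3.0, 4.0)
```

Both estimates grow linearly in α, which is why keeping n close to m keeps searches at roughly constant cost.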


Closed Hashing (Open Address Hashing)


 Open address hashing
    A strategy for storing all elements directly in the array of the hash
    table, rather than using linked lists to accommodate collisions.
    Assumption: m ≥ n (the table has at least as many slots as keys).
    The idea is that if the hash slot for a certain key is occupied by a
    different element, then a sequence of alternative locations for the
    current element is defined.
    For every key k, a probe sequence <h(k, 0), h(k, 1), …, h(k, m-1)>
    is generated so that when a collision occurs, we successively
    examine, or probe, the slots of the hash table until we find an
    empty slot in which to put the key.
 Probing policies
       Linear probing
       Quadratic probing
       Double hashing

Linear Probing
 Given an ordinary hash function h', called an auxiliary hash function,
 the method of linear probing uses the hash function
 $h(k, i) = (h'(k) + i) \bmod m$, for i = 0, 1, …, m-1.
 Search
    Compare the given key with the key in each probed position until
    either the key is found or an empty slot is encountered.
 An example
 The problem with deletion and the solution
    Problem: simply emptying a slot would cut the probe sequence, so a
    later search could stop early and miss a key stored beyond that slot.
    Lazy deletion: mark previously occupied locations as "obsolete"
    to distinguish them from locations that have never been occupied.
 Advantage & Disadvantage:
    Easy to implement,
    but when the load factor approaches 1, it suffers from clustering:
    long runs of occupied slots build up, increasing the average search
    time.
 Exercise
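
Linear probing with lazy deletion can be sketched in Python as follows. The class `LinearProbingTable` and the sentinels `EMPTY` and `DELETED` are hypothetical names for this example; the auxiliary hash function h' is assumed to be the sum of letter positions mod m, and `insert` assumes the key is not already in the table.

```python
EMPTY = object()     # slot that has never been occupied
DELETED = object()   # "obsolete" marker left behind by lazy deletion


class LinearProbingTable:
    def __init__(self, m=13):
        self.m = m
        self.slots = [EMPTY] * m

    def _h(self, key):
        # Assumed auxiliary hash function h': sum of letter positions mod m.
        return sum(ord(c) - ord("A") + 1 for c in key) % self.m

    def _probe(self, key):
        # Probe sequence h(k, i) = (h'(k) + i) mod m for i = 0, ..., m-1.
        start = self._h(key)
        for i in range(self.m):
            yield (start + i) % self.m

    def insert(self, key):
        # Take the first slot that is empty or marked obsolete.
        for j in self._probe(key):
            if self.slots[j] is EMPTY or self.slots[j] is DELETED:
                self.slots[j] = key
                return j
        raise OverflowError("hash table is full")

    def search(self, key):
        for j in self._probe(key):
            if self.slots[j] is EMPTY:
                return None          # a never-occupied slot ends the search
            if self.slots[j] == key:
                return j             # obsolete slots are simply skipped
        return None

    def delete(self, key):
        j = self.search(key)
        if j is not None:
            self.slots[j] = DELETED  # mark, do not empty
        return j


table = LinearProbingTable(13)
words = ["A", "FOOL", "AND", "HIS", "MONEY", "ARE", "SOON", "PARTED"]
positions = [table.insert(w) for w in words]
# positions == [1, 9, 6, 10, 7, 11, 12, 0]
```

Here SOON collides with ARE at slot 11 and moves to 12, and PARTED, displaced from 12, wraps around to slot 0. Deleting SOON leaves a marker in slot 12, so a later search for PARTED still probes past it and reaches slot 0.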
Quadratic Probing
 Given an ordinary hash function h', called an auxiliary hash
 function, the method of quadratic probing uses the
 hash function
 $h(k, i) = (h'(k) + c_1 i + c_2 i^2) \bmod m$,
 where i = 0, 1, …, m-1, and c_1 and c_2 ≠ 0 are constants.

 Advantage & Disadvantage:
    Easy to implement
    It suffers from a milder form of clustering: if two keys have the
    same initial probe position, then their probe sequences are
    the same.
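
The probe sequence can be generated with a few lines of Python. The function name `quadratic_probe_sequence` and the choice c1 = c2 = 1 are assumptions made only for this illustration.

```python
def quadratic_probe_sequence(h0, m, c1=1, c2=1):
    """Slots h(k, i) = (h'(k) + c1*i + c2*i^2) mod m, where h0 = h'(k)."""
    return [(h0 + c1 * i + c2 * i * i) % m for i in range(m)]


seq = quadratic_probe_sequence(11, 13)
first_probes = seq[:4]            # [11, 0, 4, 10]
slots_reached = len(set(seq))     # 7 of the 13 slots for these constants
```

The sequence depends on the key only through h'(k), so any two keys with h'(k) = 11 follow exactly this path. The example also suggests that, for a given choice of c1, c2 and m, the sequence need not reach every slot of the table.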


Double Hashing
 Given two auxiliary hash functions h_1 and h_2,
 double hashing uses the hash function
 $h(k, i) = (h_1(k) + i \, h_2(k)) \bmod m$,
 where i = 0, 1, …, m-1.
 An example
 One of the best methods available for open
 addressing.
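
A small Python sketch shows why double hashing spreads keys better. The specific functions h1(k) = k mod 13 and h2(k) = 1 + (k mod 11) are an assumed textbook-style choice for integer keys, not something fixed by the slides, and the name `double_hash_sequence` is hypothetical.

```python
M = 13


def h1(k):
    return k % M


def h2(k):
    # Never zero; with M prime, every value 1..11 is relatively prime to M.
    return 1 + (k % 11)


def double_hash_sequence(k, m=M):
    """Slots h(k, i) = (h1(k) + i*h2(k)) mod m for i = 0, ..., m-1."""
    return [(h1(k) + i * h2(k)) % m for i in range(m)]


seq_14 = double_hash_sequence(14)   # starts 1, 5, 9, ...
seq_27 = double_hash_sequence(27)   # starts 1, 7, 0, ...
```

Keys 14 and 27 both start at slot 1, yet their sequences diverge from the second probe on, because the step size h2(k) depends on the key. Under the assumption that h2(k) is relatively prime to m, each sequence visits all m slots.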




Algorithm chapter 7
