Maps & Hash Tables

Map ADT
• models a searchable collection of (key, value) entries.
• requires each key to be unique.
• association of keys to values defines a mapping.

Maps
• allow us to store elements so that they can be located quickly using
keys.
• store key-value pairs (k, v), which we call entries, where k is
the key and v is its corresponding value.

Conceptual illustration of Maps
• Keys (labels) are assigned to values (diskettes) by a user.
• The resulting entries (labeled diskettes) are inserted into the map
(file cabinet).
• The keys can be used later to retrieve or remove values.

Applications
• address book.
• student-record database.

Map ADT methods
• size():
• Determines the number of entries in map M.
• isEmpty():
• Tests whether M is empty.
• get(k):
• If M contains an entry e=(k,v), where k is key, then return the value v, else
return null.

Map ADT methods
• put(k,v):
• If M does not have an entry with key k, then add entry (k,v) to M and
return null; else, replace with v the existing value of the entry with key k
and return the old value.
• remove(k):
• Remove from M the entry with key equal to k, and return its value; if M has
no such entry, return null.
• keySet():
• Return an iterable collection containing all the keys stored in M.

Map ADT methods
• values():
• Return an iterable collection containing all the values associated with keys
stored in M.
• entrySet():
• Return an iterable collection containing all the key-value entries in M.

Map ADT representation
Operation     Output
put(5,A)      null
put(7,B)      null
put(2,C)      null
put(8,D)      null
put(2,E)      C
get(7)        B
get(4)        null
get(2)        E
remove(2)     E
entrySet()    (5,A), (7,B), (8,D)
keySet()      5, 7, 8

A Simple List-Based Map Implementation
• Using doubly-linked list

Performance of a List-Based Map
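In an unsorted list with n entries:
• put(k,v) → O(n) time in the worst case, because the list must first be searched for an existing entry with key k; the insertion itself takes O(1) time.
• get(k), remove(k) → O(n) time.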
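The list-based map can be sketched as follows. This is a minimal illustration, assuming a Python list stands in for the doubly-linked list; the class name `ListMap` and the snake_case method names are hypothetical and not taken from the slides. The demo at the end replays the sequence of operations from the Map ADT trace.

```python
class ListMap:
    """Unsorted list-based map (illustrative sketch, not a reference implementation)."""

    def __init__(self):
        self._entries = []              # each entry is a [key, value] pair

    def _find(self, k):
        # Linear scan for the entry with key k: O(n) in the worst case.
        for entry in self._entries:
            if entry[0] == k:
                return entry
        return None

    def size(self):
        return len(self._entries)

    def is_empty(self):
        return len(self._entries) == 0

    def get(self, k):
        entry = self._find(k)
        return entry[1] if entry is not None else None

    def put(self, k, v):
        entry = self._find(k)
        if entry is None:
            self._entries.append([k, v])   # new key: the append itself is O(1)
            return None
        old, entry[1] = entry[1], v        # existing key: replace and return old value
        return old

    def remove(self, k):
        entry = self._find(k)
        if entry is None:
            return None
        self._entries.remove(entry)
        return entry[1]

    def key_set(self):
        return [k for k, _ in self._entries]

    def entry_set(self):
        return [(k, v) for k, v in self._entries]


# Replay the operations from the Map ADT trace.
m = ListMap()
outputs = [m.put(5, "A"), m.put(7, "B"), m.put(2, "C"), m.put(8, "D"),
           m.put(2, "E"), m.get(7), m.get(4), m.get(2), m.remove(2)]
print(outputs)
print(m.entry_set(), m.key_set())
```

Every method that looks up a key goes through `_find`, which is where the O(n) cost comes from.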

Hash Table
• One of the most efficient ways to implement a map, such
that the keys serve as addresses for the associated values,
is to use a hash table.
• Recall that a map is a collection of entries (k,v), where the
keys associated with values are typically thought of as
addresses for those values.

Hash Table components
• In general, a hash table consists of two major components:
• a bucket array
• a hash function

Bucket Array
• Consider an array A of size N.
• Each cell of A is a bucket (i.e. a collection of (k,v) entries).
• If the keys are unique integers in the range [0, N − 1], each
bucket holds at most one entry.
• Search, insertion and removal in the bucket array then take
O(1) time.

Bucket Array (cont.)
• It has two drawbacks:
• The space used is proportional to N (the array size), so if N >> n, the number
of entries present in the map, space is wasted.
• Keys are required to be integers in the range [0, N − 1], which is often not the case.
• To overcome them:
• Use the bucket array in conjunction with a "good" mapping from the keys to the
integers in the range [0, N − 1], i.e. a hash function.

Hash functions (h)
• The hash function is the second component of a hash table.
• The hash function value, h(k), is used as the index into the bucket array
instead of the key k itself.
• So entry (k, v) is stored in the bucket A[h(k)].

Evaluation of a hash function, h(k)
• Consists of two steps:
• mapping the key k to an integer, called the hash code.
• mapping the hash code to an integer within the range of
indices ([0, N − 1]) of the bucket array, done by the
compression function.
• Hash codes may be generated by casting to an integer,
summing components, polynomial hash codes, etc.

One simple compression function is the division method,
which maps an integer i to
i mod N
where N, the size of the bucket array, is a fixed positive
integer.
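The two-step evaluation of h(k) can be sketched as below. This is only an illustration: the function names are hypothetical, and the polynomial base a = 33 is an assumed constant that the slides do not specify.

```python
def polynomial_hash_code(s, a=33):
    """Step 1: polynomial hash code of a string, evaluated with Horner's rule.

    The base a=33 is an assumption made for this sketch.
    """
    code = 0
    for ch in s:
        code = code * a + ord(ch)
    return code


def compress(i, N):
    """Step 2: division method, mapping an integer hash code into [0, N-1]."""
    return i % N


def h(key, N):
    """Hash function = compression function applied to the hash code."""
    return compress(polynomial_hash_code(key), N)


print(polynomial_hash_code("ab"))   # 97 * 33 + 98 = 3299
print(h("ab", 13))                  # 3299 mod 13 = 10
```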

Collision Handling Schemes
A collision occurs when different keys are mapped to the same bucket.
Some collision resolution methods:
• Separate Chaining
• Open Addressing

Separate chaining
• Each bucket A[i] stores a small list-based map holding the entries (k,v) with h(k) = i.
• With n entries in a bucket array of capacity N and a good hash function, the
expected size of each bucket is n/N, called the load factor of
the hash table.
• So the expected running time of get, put and remove is O(⌈n/N⌉).
• These operations therefore run in O(1) expected time,
provided n is O(N).
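A minimal sketch of separate chaining, assuming Python's built-in `hash()` as the hash code and the division method as the compression function. The class name `ChainHashMap` and the default capacity of 13 are hypothetical choices for illustration.

```python
class ChainHashMap:
    """Hash map with separate chaining (illustrative sketch)."""

    def __init__(self, capacity=13):
        self._buckets = [[] for _ in range(capacity)]   # one small list per bucket
        self._n = 0                                     # number of entries

    def _bucket(self, k):
        # hash(k) is the hash code; "% capacity" is the compression function.
        return self._buckets[hash(k) % len(self._buckets)]

    def get(self, k):
        for key, value in self._bucket(k):
            if key == k:
                return value
        return None

    def put(self, k, v):
        bucket = self._bucket(k)
        for i, (key, value) in enumerate(bucket):
            if key == k:
                bucket[i] = (k, v)      # existing key: replace, return old value
                return value
        bucket.append((k, v))           # new key: add to this bucket's chain
        self._n += 1
        return None

    def remove(self, k):
        bucket = self._bucket(k)
        for i, (key, value) in enumerate(bucket):
            if key == k:
                del bucket[i]
                self._n -= 1
                return value
        return None

    def load_factor(self):
        return self._n / len(self._buckets)


chained = ChainHashMap()
for key in (18, 44, 31):                # all three keys are 5 mod 13
    chained.put(key, str(key))
print(chained.get(44), chained.load_factor())
```

Colliding keys simply share a chain, so each operation costs time proportional to the length of one bucket rather than to n.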

Linear probing
• An open addressing scheme in which the distance between probes is constant (i.e. 1, so each probe
examines the next consecutive slot).
• When inserting into a bucket A[i] that is already occupied,
where i = h(k):
• Try next A[(i + 1) mod N]. If this is also occupied, then
• Try A[(i + 2) mod N], and so on,
• until we find an empty bucket that can accept the new entry.

Open Addressing with Linear Probing Strategy
Insert keys 18, 41, 22, 44, 59, 32, 31, 73 in this order into a bucket array of size 13, using
h(k) = k mod 13.
Index:  0   1   2   3   4   5   6   7   8   9  10  11  12
Key:    -   -  41   -   -  18  44  59  32  22  31  73   -
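The insertion sequence above can be reproduced with a short sketch. The function name `linear_probe_insert` is hypothetical, and the table stores bare keys (no values) to keep the example small.

```python
def linear_probe_insert(table, k):
    """Insert key k using h(k) = k mod N with linear probing; return the slot used."""
    N = len(table)
    i = k % N
    for j in range(N):
        slot = (i + j) % N              # probe A[(i + j) mod N] for j = 0, 1, 2, ...
        if table[slot] is None:
            table[slot] = k
            return slot
    raise RuntimeError("hash table is full")


linear_table = [None] * 13
for key in [18, 41, 22, 44, 59, 32, 31, 73]:
    linear_probe_insert(linear_table, key)
print(linear_table)
```

For example, 44 hashes to slot 5 (taken by 18) and lands in slot 6, while 31 also hashes to 5 and has to probe as far as slot 10.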

Quadratic probing
• Tries the buckets
A[(h(k) + j²) mod N], for j = 0, 1, ..., N − 1,
until finding an empty bucket.
• N has to be a prime number, and
• the bucket array must be less than half full; together these conditions guarantee that an empty bucket is found.
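A small sketch of quadratic probing, assuming h(k) = k mod N with N = 13. The function name is hypothetical, and only six keys are inserted so that the table stays less than half full.

```python
def quadratic_probe_insert(table, k):
    """Insert key k, probing A[(h(k) + j*j) mod N] for j = 0, 1, ..., N-1."""
    N = len(table)
    for j in range(N):
        slot = (k % N + j * j) % N
        if table[slot] is None:
            table[slot] = k
            return slot
    raise RuntimeError("no empty bucket found along the probe sequence")


quad_table = [None] * 13                # N = 13 is prime
for key in [18, 41, 22, 44, 59, 32]:    # 6 keys: less than half of 13 buckets
    quadratic_probe_insert(quad_table, key)
print(quad_table)
```

Here 32 hashes to slot 6 (taken by 44), then tries 6 + 1 = 7 (taken by 59), and finally settles in 6 + 4 = 10.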

Double Hashing
• Handles a collision by trying the buckets in the series
H(k) = (h(k) + j × h’(k)) mod N, for j = 0, 1, ..., N − 1,
where h’ is a secondary hash function.
• h’(k) cannot have zero values.
• Table size N must be a prime number.
• Common choice of secondary hash function:
h’(k) = q – (k mod q), where q < N is a prime.

Open Addressing with Double Hashing Strategy
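Insert keys 18, 41, 22, 44, 59, 32, 31, 73 in this order into a
bucket array of size 13, using double-hashing resolution where
h(k) = k mod 13 and h’(k) = 7 – (k mod 7).
Index:  0   1   2   3   4   5   6   7   8   9  10  11  12
Key:   31   -  41   -   -  18  32  59  73  22  44   -   -
Key 44 collides with 18 at h(44) = 5:
H(44) = (h(k) + j × h’(k)) mod 13
= (5 + 1 × (7 – 44 mod 7)) mod 13
= (5 + 5) mod 13 = 10
Key 31 also collides at h(31) = 5:
H(31) = (5 + 1 × (7 – 31 mod 7)) mod 13
= (5 + 4) mod 13 = 9 (occupied by 22)
H(31) = (5 + 2 × (7 – 31 mod 7)) mod 13
= 13 mod 13 = 0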
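The double-hashing example can be checked with a short sketch. The function name `double_hash_insert` is hypothetical; h(k) = k mod 13 and h’(k) = 7 – (k mod 7) are the functions given in the example, and the table stores bare keys only.

```python
def double_hash_insert(table, k, q=7):
    """Insert key k using h(k) = k mod N and secondary hash h'(k) = q - (k mod q)."""
    N = len(table)
    start = k % N
    step = q - (k % q)                  # always in 1..q, so never zero
    for j in range(N):
        slot = (start + j * step) % N   # (h(k) + j * h'(k)) mod N
        if table[slot] is None:
            table[slot] = k
            return slot
    raise RuntimeError("no empty bucket found along the probe sequence")


double_table = [None] * 13
slots = {key: double_hash_insert(double_table, key)
         for key in [18, 41, 22, 44, 59, 32, 31, 73]}
print(double_table)
print(slots[44], slots[31])
```

Only 44 and 31 collide in this sequence: 44 moves from slot 5 to slot 10, and 31 tries slot 9 before wrapping around to slot 0.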

• In the worst case, insertions, removals, and searches on a hash
table take O(n) time.
• The worst case occurs when all the keys inserted into the map collide.
• The load factor α = n/N affects the performance of a hash
table.

Ordered Maps
• Keep the entries in a map sorted according to some total order.
• Allow looking up keys and values based on this ordering.
• Perform the usual map operations, while maintaining an order relation
for the keys.
• The worst-case time for searching in a hash table is O(n).
• An array-list implementation of an ordered map (known as an ordered search
table) has O(log n) worst-case time for searching.

Searching algorithm – Binary search
Algorithm BinarySearch(S, k, low, high)
  if low > high then
    return null
  else
    mid ← ⌊(low + high)/2⌋
    e ← S.get(mid)
    if k = e.getKey() then
      return e
    else if k < e.getKey() then
      return BinarySearch(S, k, low, mid − 1)
    else
      return BinarySearch(S, k, mid + 1, high)
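A runnable version of the pseudocode, as a sketch: the ordered search table is assumed to be a Python list of (key, value) tuples sorted by key, and the sample keys and values below are made up for illustration.

```python
def binary_search(S, k, low, high):
    """Return the entry with key k in the sorted list S[low..high], or None."""
    if low > high:
        return None                      # empty range: k is not present
    mid = (low + high) // 2              # floor of the midpoint
    key, _ = S[mid]
    if k == key:
        return S[mid]
    elif k < key:
        return binary_search(S, k, low, mid - 1)
    else:
        return binary_search(S, k, mid + 1, high)


# Hypothetical ordered search table; values are just the keys as strings.
keys = [2, 4, 5, 7, 8, 9, 12, 14, 17, 19, 22, 25, 27, 28, 33, 37]
search_table = [(key, str(key)) for key in keys]
found = binary_search(search_table, 22, 0, len(search_table) - 1)
missing = binary_search(search_table, 3, 0, len(search_table) - 1)
print(found, missing)
```

Each call halves the remaining range, which is where the O(log n) worst-case search time comes from.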

Illustration on an ordered search table
Execution of binary search algorithm to perform get(22)
