Example: recursive DFS (RDFS) trace
[Figure: example graph on vertices 0-9, with its adjacency list, visited table (T/F), source and Pred array]
Recursive calls so far, in order: RDFS(2), RDFS(8), RDFS(9), RDFS(1), RDFS(3), RDFS(5), RDFS(6)
RDFS(6): mark 6 as visited, mark Pred[6], then visit 7 -> RDFS(7)
RDFS(7): mark 7 as visited, mark Pred[7]; stop, no more unvisited neighbors
The calls then stop one by one: RDFS(6) -> Stop, RDFS(5) -> Stop, RDFS(3) -> Stop, RDFS(1) -> Stop, RDFS(9) -> Stop, RDFS(8) -> Stop, RDFS(2) -> Stop
Example
[Figure: the same graph, adjacency list, visited table (T/F), source and Pred array]
Check our paths: does DFS find valid paths? Yes.
Try some examples:
Path(0) ->
Path(6) ->
Path(7) ->
Time complexity of DFS (using adjacency list)
We never visited a vertex more than once
We had to examine all edges of the vertices
We know Σ_v degree(v) = 2m, where m is the number of edges
So the running time of DFS is proportional to the number of edges and the number of vertices (same as BFS): O(n + m)
You will also see this written as O(|V| + |E|), where |V| = number of vertices (n) and |E| = number of edges (m)
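A minimal Python sketch of the recursive DFS traced above, assuming the graph is stored as an adjacency-list dictionary. The small graph and the names `rdfs` and `path` are hypothetical; this is not the graph from the slides. Each vertex is marked once and each adjacency list is scanned once, which is where the O(n + m) bound comes from.

```python
def rdfs(adj, v, visited, pred):
    """Recursive DFS from v: mark v, then recurse into unvisited neighbors."""
    visited[v] = True
    for w in adj[v]:              # each adjacency list is scanned exactly once
        if not visited[w]:
            pred[w] = v           # w becomes a child of v in the DFS tree
            rdfs(adj, w, visited, pred)


def path(pred, source, target):
    """Follow Pred links back from target to source; None if unreachable."""
    if target != source and pred[target] is None:
        return None
    route = [target]
    while route[-1] != source:
        route.append(pred[route[-1]])
    return route[::-1]


# Hypothetical undirected graph (not the one on the slides).
adj = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1], 4: []}
visited = {v: False for v in adj}
pred = {v: None for v in adj}
rdfs(adj, 0, visited, pred)

# The degrees sum to 2m, so scanning every adjacency list costs O(m).
degree_sum = sum(len(neighbors) for neighbors in adj.values())
print(path(pred, 0, 3), path(pred, 0, 4), degree_sum)
```

Vertex 4 has no edges in this toy graph, so it is never visited and has no path from the source.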
DFS tree
Resulting DFS-tree. Notice it is much "deeper" than the BFS tree.
Captures the structure of the recursive calls:
  when we visit a neighbor w of v, we add w as child of v
  whenever DFS returns from a vertex v, we climb up in the tree from v to its parent

Hashing
COMP171, Fall 2005
Hash table
Supports the following operations:
  Find
  Insert
  Delete (deletions may be unnecessary in some applications)
Unlike binary search trees, AVL trees and B+-trees, the following operations are not supported:
  Minimum and maximum
  Successor and predecessor
  Report data within a given range
  List out the data in order

Unrealistic solution
Each position (slot) corresponds to a key in the universe of keys
  T[k] corresponds to an element with key k
  If the set contains no element with key k, then T[k] = NULL
insert, delete and find all take O(1) (worst-case) time
Problem: the scheme wastes too much space if the universe is too large compared with the actual number of elements to be stored
  e.g. student IDs are 8-digit integers, so the universe size is 10^8, but we only have about 7000 students
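As a rough illustration of the direct-addressing scheme described above, the following Python sketch allocates one slot per possible key. The class name `DirectAddressTable` and the toy universe size are assumptions made for this example only.

```python
class DirectAddressTable:
    """One slot per possible key: T[k] holds the element with key k, else None."""

    def __init__(self, universe_size):
        self.slots = [None] * universe_size   # space is proportional to |U|

    def insert(self, key, value):
        self.slots[key] = value               # O(1) worst case

    def find(self, key):
        return self.slots[key]                # O(1) worst case

    def delete(self, key):
        self.slots[key] = None                # O(1) worst case


# Toy universe of 3-digit IDs; 8-digit IDs would need 10**8 slots.
table = DirectAddressTable(10**3)
table.insert(42, "Alice")
table.insert(7, "Bob")
table.delete(7)
print(table.find(42), table.find(7))

# Fraction of slots in use for about 7000 students with 8-digit IDs.
used_fraction = 7000 / 10**8
print(used_fraction)
```

With 8-digit IDs and about 7000 students, fewer than one slot in ten thousand would hold data, which is the waste the slide points at.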
Hashing
Usually, m << N (m: size of the hash table; N: size of the universe of keys)
h(K_i) = an integer in [0, …, m-1], called the hash value of K_i

Example applications
Compilers use hash tables (symbol tables) to keep track of declared variables.
On-line spell checkers: after prehashing the entire dictionary, one can check each word in constant time and print out the misspelled words in order of their appearance in the document.
Useful in applications where the input keys come in sorted order. This is a bad case for a binary search tree; AVL trees and B+-trees are harder to implement and are not necessarily more efficient.
Hashing
With hashing, an element of key k is stored in T[h(k)]
h: hash function
  maps the universe U of keys into the slots of a hash table T[0,1,...,m-1]
  an element of key k hashes to slot h(k)
  h(k) is the hash value of key k

Problem: collision
  two keys may hash to the same slot
  can we ensure that any two distinct keys get different cells?
    No, if |U| > m, where m is the size of the hash table
Design a good hash function
  that is fast to compute and
  can minimize the number of collisions
Design a method to resolve the collisions when they occur

Hash function
The division method

h(k) = k mod m
  e.g. m = 12, k = 100, h(k) = 4
Requires only a single division operation (quite fast)
Certain values of m should be avoided
  e.g. if m = 2^p, then h(k) is just the p lowest-order bits of k; the hash function does not depend on all the bits
  Similarly, if the keys are decimal numbers, m should not be set to a power of 10
It is good practice to set the table size m to be a prime number
Good values for m: primes not too close to exact powers of 2
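The effect of a power-of-two table size can be checked directly. The Python sketch below uses hypothetical keys that differ only in their high bits; the prime 13 is an arbitrary choice for comparison.

```python
def h(k, m):
    """Division-method hash: the remainder of k divided by the table size m."""
    return k % m


print(h(100, 12))    # the example above: m = 12, k = 100

# With m = 2**p the hash keeps only the p lowest-order bits of k,
# so keys that differ only in their higher bits all collide.
p = 4
keys = [0b0001_0101, 0b0110_0101, 0b1111_0101]   # same low 4 bits
power_of_two_slots = [h(k, 2**p) for k in keys]
prime_slots = [h(k, 13) for k in keys]           # 13: an arbitrary prime
print(power_of_two_slots, prime_slots)
```

Here all three keys land in the same slot when m = 16, while the prime table size spreads them over three different slots.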
For example, if the hash table is to hold 2000 numbers, and we don't mind an average of 3 numbers being hashed to the same entry, choose m = 701

Hash function: can the keys be strings?
Most hash functions assume that the keys are natural numbers
  if keys are not natural numbers, a way must be found to interpret them as natural numbers
Method 1
Add up the ASCII values of the characters in the string
Problems:
  Different permutations of the same set of characters would have the same hash value
  If the table size is large, the keys are not distributed well. e.g. suppose m = 10007 and all the keys are eight or fewer characters long. Since ASCII value <= 127, the hash function can only assume values between 0 and 127*8 = 1016

Method 2
If the first 3 characters are random and the table size is 10,007 => a reasonably equitable distribution
Problem:
  English is not random
  Only 28 percent of the table can actually be hashed to (assuming a table size of 10,007)

Method 3
Involves all characters in the key and can be expected to distribute well
[Figure labels: "a,…,z and space"; 27^2]
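The string-hashing ideas above can be sketched in Python as follows. The exact formulas are not preserved in this transcript, so the weights 1, 27, 27^2 for the first three characters and the base 37 for the all-characters polynomial are assumptions; only the sum-of-ASCII method is taken directly from the text.

```python
def hash_sum(key, m):
    """Method 1: add up the character codes, then reduce mod m."""
    return sum(ord(c) for c in key) % m


def hash_first_three(key, m):
    """First three characters only; weights 1, 27, 27**2 are assumed."""
    return (ord(key[0]) + 27 * ord(key[1]) + 27**2 * ord(key[2])) % m


def hash_polynomial(key, m, base=37):
    """All characters contribute, via Horner's rule; base 37 is assumed."""
    value = 0
    for c in key:
        value = (value * base + ord(c)) % m
    return value


m = 10007
# Permutations of the same characters collide under the sum method.
print(hash_sum("stop", m) == hash_sum("pots", m))
# Eight ASCII characters can reach at most 127 * 8 under the sum method.
print(hash_sum(chr(127) * 8, m))
# Keys sharing a three-character prefix collide under the prefix method.
print(hash_first_three("hashing", m) == hash_first_three("hash", m))
# The polynomial hash separates the anagram pair.
print(hash_polynomial("stop", m), hash_polynomial("pots", m))
```

Reducing mod m inside the loop keeps the intermediate value small; in a fixed-width language the same step guards against overflow.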
Collision handling: (1) Separate chaining
Instead of a hash table, we use a table of linked lists
  keep a linked list of keys that hash to the same value
  Example: h(K) = K mod 10
To insert a key K
  Compute h(K) to determine which list to traverse
  If T[h(K)] contains a null pointer, initialize this entry to point to a linked list that contains K alone.
  If T[h(K)] is a non-empty list, we add K at the beginning of this list.
To delete a key K
  Compute h(K), then search for K within the list at T[h(K)]. Delete K if it is found.

Separate chaining
Assume that we will be storing n keys. Then we should make m the next larger prime number. If the hash function works well, the number of keys in each linked list will be a small constant.
Therefore, we expect that each search, insertion, and deletion can be done in constant time.
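A minimal sketch of separate chaining in Python, following the insert and delete steps above. Python lists stand in for the linked lists, and the class name `ChainedHashTable` and the sample keys are assumptions for illustration.

```python
class ChainedHashTable:
    """Separate chaining; Python lists stand in for the linked lists."""

    def __init__(self, m=10):
        self.m = m
        self.table = [None] * m          # None plays the role of a null pointer

    def _h(self, key):
        return key % self.m              # h(K) = K mod m

    def find(self, key):
        chain = self.table[self._h(key)]
        return chain is not None and key in chain

    def insert(self, key):
        slot = self._h(key)
        if self.table[slot] is None:
            self.table[slot] = [key]             # list containing K alone
        else:
            self.table[slot].insert(0, key)      # add K at the beginning

    def delete(self, key):
        chain = self.table[self._h(key)]
        if chain is not None and key in chain:
            chain.remove(key)


# Hypothetical keys, hashed with h(K) = K mod 10.
t = ChainedHashTable(10)
for k in [0, 81, 64, 49, 9, 36, 25, 16, 4, 1]:
    t.insert(k)
t.delete(64)
print(t.table[1], t.table[4], t.table[9])
```

Inserting at the front of a Python list is not constant time, unlike a real linked list; the sketch trades that for brevity.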
Disadvantage: memory allocation in linked list manipulation will slow down the program.
Advantage: deletion is easy.

Collision handling: (2) Open addressing
Open addressing: relocate the key K to be inserted if it collides with an existing key. That is, we store K at an entry different from T[h(K)].
Two issues arise:
  what is the relocation scheme?
  how to search for K later?
Three common methods for resolving a collision in open addressing:
  Linear probing
  Quadratic probing
  Double hashing

Open addressing
To insert a key K, compute h_0(K). If T[h_0(K)] is empty, insert it there. If a collision occurs, probe alternative cells h_1(K), h_2(K), ... until an empty cell is found.
h_i(K) = (hash(K) + f(i)) mod m, with f(0) = 0
f: collision resolution strategy
Linear probing
f(i) = i
Cells are probed sequentially (with wraparound)
h_i(K) = (hash(K) + i) mod m
Insertion:
  Let K be the new key to be inserted. We compute hash(K)
  For i = 0 to m-1
    compute L = (hash(K) + i) mod m
    if T[L] is empty, then we put K there and stop
  If we cannot find an empty entry to put K, it means that the table is full and we should report an error.

Linear probing: example
h_i(K) = (hash(K) + i) mod m
e.g. inserting keys 89, 18, 49, 58, 69 with hash(K) = K mod 10
  To insert 58, probe T[8], T[9], T[0], T[1]
  To insert 69, probe T[9], T[0], T[1], T[2]
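The insertion loop and the example above can be replayed with a short Python sketch. The function name `linear_probe_insert` and the choice to return the list of probed slots are assumptions made for illustration.

```python
def linear_probe_insert(table, key):
    """Insert key with h_i(K) = (hash(K) + i) mod m; return the probed slots."""
    m = len(table)
    probed = []
    for i in range(m):
        slot = (key % m + i) % m         # hash(K) = K mod m, f(i) = i
        probed.append(slot)
        if table[slot] is None:          # empty entry: put K there and stop
            table[slot] = key
            return probed
    raise RuntimeError("hash table is full")


table = [None] * 10
probes = {k: linear_probe_insert(table, k) for k in [89, 18, 49, 58, 69]}
print(probes[58], probes[69])
print(table)
```

The recorded probes for 58 and 69 match the sequences listed in the example.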
Primary clustering
We call a block of contiguously occupied table entries a cluster
On average, when we insert a new key K, we may hit the middle of a cluster. Therefore, the time to insert K would be proportional to half the size of a cluster. That is, the larger the cluster, the slower the performance.
Linear probing has the following disadvantages:
  Once h(K) falls into a cluster, this cluster will definitely grow in size by one. Thus, this may worsen the performance of insertion in the future.
  If two clusters are only separated by one entry, then inserting one key into a cluster can merge the two clusters together. Thus, the cluster size can increase drastically by a single insertion. This means that the performance of insertion can deteriorate drastically after a single insertion.
  Large clusters are easy targets for collisions.
Quadratic probing
f(i) = i^2
h_i(K) = (hash(K) + i^2) mod m
e.g. inserting keys 89, 18, 49, 58, 69 with hash(K) = K mod 10
  To insert 58, probe T[8], T[9], T[(8+4) mod 10]
  To insert 69, probe T[9], T[(9+1) mod 10], T[(9+4) mod 10]
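The same keys can be replayed under quadratic probing with a small Python sketch. The function name `quadratic_probe_insert` is hypothetical, and the loop simply gives up after m attempts, since quadratic probing is not guaranteed to reach every cell.

```python
def quadratic_probe_insert(table, key):
    """Insert key with h_i(K) = (hash(K) + i*i) mod m; return the probed slots."""
    m = len(table)
    probed = []
    for i in range(m):
        slot = (key % m + i * i) % m     # hash(K) = K mod m, f(i) = i^2
        probed.append(slot)
        if table[slot] is None:
            table[slot] = key
            return probed
    raise RuntimeError("no empty cell found along the probe sequence")


table = [None] * 10
probes = {k: quadratic_probe_insert(table, k) for k in [89, 18, 49, 58, 69]}
print(probes[58], probes[69])
print(table)
```

Compared with linear probing on the same keys, 58 and 69 each need three probes instead of four.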
Quadratic probing
Two keys with different home positions will have different probe sequences
  e.g. with home positions 30 for k1 and 29 for k2:
  probe sequence for k1: 30, 30+1, 30+4, 30+9
  probe sequence for k2: 29, 29+1, 29+4, 29+9
If the table size is prime, then a new key can always be inserted if the table is at least half empty (see proof in textbook)
Secondary clustering
  Keys that hash to the same home position will probe the same alternative cells
  Simulation results suggest that it generally causes less than an extra half probe per search
  To avoid secondary clustering, the probe sequence needs to be a function of the original key value, not the home position

Double hashing
To alleviate the problem of clustering, the sequence of probes for a key should be independent of its primary position => use two hash functions: hash() and hash2()
f(i) = i * hash2(K)
e.g. hash2(K) = R - (K mod R), where R is a prime smaller than m

Double hashing: example
h_i(K) = (hash(K) + f(i)) mod m; hash(K) = K mod m
f(i) = i * hash2(K); hash2(K) = R - (K mod R)
Example: m = 10, R = 7, insert keys 89, 18, 49, 58, 69
  To insert 49, hash2(49) = 7, 2nd probe is T[(9+7) mod 10]
  To insert 58, hash2(58) = 5, 2nd probe is T[(8+5) mod 10]
  To insert 69, hash2(69) = 1, 2nd probe is T[(9+1) mod 10]
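The double-hashing example can be reproduced with the following Python sketch. The function names are hypothetical; m = 10 and R = 7 are taken from the example, even though m = 10 is not prime, so the loop may fail to reach every cell for some keys.

```python
def hash2(key, R=7):
    """Second hash function: R - (K mod R), always in 1..R, never zero."""
    return R - (key % R)


def double_hash_insert(table, key, R=7):
    """Insert key with h_i(K) = (hash(K) + i * hash2(K)) mod m."""
    m = len(table)
    step = hash2(key, R)
    probed = []
    for i in range(m):
        slot = (key % m + i * step) % m
        probed.append(slot)
        if table[slot] is None:
            table[slot] = key
            return probed
    raise RuntimeError("no empty cell found along the probe sequence")


table = [None] * 10
probes = {k: double_hash_insert(table, k) for k in [89, 18, 49, 58, 69]}
print(hash2(49), hash2(58), hash2(69))
print(probes[49], probes[58], probes[69])
print(table)
```

Keys 49 and 69 share the home position 9 but take different steps (7 and 1), so they do not follow each other's probe sequence.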
Choice of hash2()
hash2() must never evaluate to zero
For any key K, hash2(K) must be relatively prime to the table size m. Otherwise, we will only be able to examine a fraction of the table entries.
  e.g. if hash(K) = 0 and hash2(K) = m/2, then we can only examine the entries T[0], T[m/2], and nothing else!
One solution is to make m prime, choose R to be a prime smaller than m, and set hash2(K) = R - (K mod R)
Quadratic probing, however, does not require the use of a second hash function
  likely to be simpler and faster in practice

Deletion in open addressing
Actual deletion cannot be performed in open addressing hash tables
  otherwise this will isolate records further down the probe sequence
Solution: add an extra bit to each table entry, and mark a deleted slot by storing a special value DELETED (tombstone)
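A short Python sketch of tombstone deletion, using linear probing for simplicity. The sentinel objects and function names are assumptions for illustration, and `insert` does not check for duplicate keys.

```python
EMPTY = None
DELETED = object()                   # tombstone marker


def _probe(table, key):
    """Linear probe sequence with hash(K) = K mod m."""
    m = len(table)
    for i in range(m):
        yield (key % m + i) % m


def insert(table, key):
    for slot in _probe(table, key):
        if table[slot] is EMPTY or table[slot] is DELETED:
            table[slot] = key        # tombstones may be reused
            return slot
    raise RuntimeError("hash table is full")


def find(table, key):
    for slot in _probe(table, key):
        if table[slot] is EMPTY:     # a truly empty slot ends the search
            return None
        if table[slot] is not DELETED and table[slot] == key:
            return slot              # tombstones are skipped, not stopped at
    return None


def delete(table, key):
    slot = find(table, key)
    if slot is not None:
        table[slot] = DELETED        # mark the slot instead of emptying it


table = [EMPTY] * 10
for k in [89, 18, 49, 58, 69]:
    insert(table, k)
delete(table, 49)                    # 49 sat in slot 0, on the way to 58 and 69
print(find(table, 58), find(table, 69), find(table, 49))
```

Had slot 0 been reset to empty instead, the searches for 58 and 69 would stop there and wrongly report both keys as missing.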