Tries
• Standard Tries
• Compressed Tries
• Suffix Tries
Text Processing
• We have seen that preprocessing the pattern speeds up pattern
matching queries
• After preprocessing the pattern in time proportional to the pattern
length, the Boyer-Moore algorithm searches an arbitrary English text
in (average) time proportional to the text length
• If the text is large, immutable and searched for often (e.g., works by
Shakespeare), we may want to preprocess the text instead of the
pattern in order to perform pattern matching queries in time
proportional to the pattern length.
• Tradeoffs in text searching
Standard Tries
• The standard trie for a set of strings S is an ordered tree such that:
– each node but the root is labeled with a character
– the children of a node are alphabetically ordered
– the paths from the root to the external nodes yield the strings of S
• Example: standard trie for
the set of strings
S = { bear, bell, bid, bull,
buy, sell, stock, stop }
• A standard trie uses O(n) space. Operations (find, insert, remove) take time O(dm) each, where:
– n = total size of the strings in S
– m = size of the string parameter of the operation
– d = alphabet size
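A minimal Python sketch of a standard trie for the example set S may help make the structure concrete. The class and method names (TrieNode, StandardTrie, count_nodes) are assumptions of this sketch and do not come from the slides, and only insert and find are shown. Children are kept in a Python dict, so each step costs expected O(1) time rather than the O(d) scan behind the O(dm) bound.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # character -> TrieNode
        self.is_end = False   # True if a string of S ends at this node


class StandardTrie:
    def __init__(self, words=()):
        self.root = TrieNode()
        for word in words:
            self.insert(word)

    def insert(self, word):
        """Walk down from the root, creating missing nodes along the way."""
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_end = True

    def find(self, word):
        """Return True only if `word` itself was inserted (not just a prefix)."""
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_end

    def count_nodes(self):
        """Count all nodes, including the root."""
        count, stack = 0, [self.root]
        while stack:
            node = stack.pop()
            count += 1
            stack.extend(node.children.values())
        return count


S = ["bear", "bell", "bid", "bull", "buy", "sell", "stock", "stop"]
trie = StandardTrie(S)
print(trie.find("bell"), trie.find("be"), trie.count_nodes())
```

The node count never exceeds n + 1 (one node per character plus the root), which is where the O(n) space bound comes from; shared prefixes such as "be" and "sto" only reduce it.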
Applications of Tries
• A standard trie supports the following operations on a preprocessed text in time O(m), where m = |X| (treating the alphabet size d as a constant)
– word matching: find the first occurrence of word X in the text
– prefix matching: find the first occurrence of the longest prefix of word X in the text
• Each operation is performed by tracing a path in the trie starting at the
root
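The two operations can be sketched in Python as follows. This is an illustration under stated assumptions, not the slides' own code: the text is split on whitespace, the trie is a nested dict, and the hypothetical key "$first" (which can never collide with a single-character key) stores the index of a word's first occurrence. The function names are also assumptions.

```python
import re


def build_word_trie(text):
    """Preprocess `text`: insert every word, remembering its first position."""
    root = {}
    for match in re.finditer(r"\S+", text):
        node = root
        for ch in match.group():
            node = node.setdefault(ch, {})
        # setdefault keeps the earliest index if the word occurs again later
        node.setdefault("$first", match.start())
    return root


def word_match(root, x):
    """Index of the first occurrence of word x in the text, or -1."""
    node = root
    for ch in x:
        if ch not in node:
            return -1
        node = node[ch]
    return node.get("$first", -1)


def longest_prefix(root, x):
    """Longest prefix of x that is also a prefix of some word in the text."""
    node, length = root, 0
    for ch in x:
        if ch not in node:
            break
        node = node[ch]
        length += 1
    return x[:length]


text = "see a bear sell stock see a bull buy stock"
root = build_word_trie(text)
print(word_match(root, "bull"), longest_prefix(root, "stop"))
```

Both queries trace a single path from the root and touch at most m = |X| nodes, independent of the length of the text.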
Compressed Tries
• A compressed trie is a trie in which every internal node has at least 2 children
• Obtained from standard trie by compressing chains of redundant
nodes
[Figure: the standard trie and the corresponding compressed trie]
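The compression step can be sketched in Python as below. The nested-dict representation, the END marker and the function names are assumptions of this sketch. A chain is merged only while a node has exactly one child and no string of S ends at it.

```python
END = ""  # marker key (assumption of this sketch): a string of S ends here


def build_standard(words):
    """Standard trie as nested dicts, one character per edge."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = {}
    return root


def compress(node):
    """Return a copy in which every chain of single-child nodes is one edge."""
    out = {}
    for label, child in node.items():
        if label == END:
            out[END] = {}
            continue
        # Absorb redundant nodes: exactly one child and not the end of a string.
        while len(child) == 1 and END not in child:
            (next_label, next_child), = child.items()
            label += next_label
            child = next_child
        out[label] = compress(child)
    return out


S = ["bear", "bell", "bid", "bull", "buy", "sell", "stock", "stop"]
compressed = compress(build_standard(S))
print(sorted(compressed["b"]), sorted(compressed["s"]["to"]))
```

For the example set the edge labels become b, e, ar, ll, id, u, ll, y, s, ell, to, ck and p, so edges now carry substrings instead of single characters.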
Compact Storage of Compressed Tries
• A compressed trie can be stored in space O(s), where s = |S| is the number of strings in S, by labeling each node with an O(1)-space index range into the strings instead of the substring itself
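One way to picture the index ranges is a triple (i, j, k) standing for the substring S[i][j:k]. The Python sketch below builds a compressed trie whose edges are such triples. The triple layout, the function names and the simplifying assumption that no string in S is a prefix of another (true for the example set) all belong to this sketch, not to the slides.

```python
def build_compact(S):
    """Compressed trie whose edge labels are (i, j, k), meaning S[i][j:k].

    Assumes no string in S is a prefix of another string in S.
    """
    def build(ids, depth):
        groups = {}
        for i in ids:
            if len(S[i]) > depth:
                groups.setdefault(S[i][depth], []).append(i)
        node = {}
        for group in groups.values():
            first = S[group[0]]
            end = depth + 1
            # Extend the edge while every string in the group still agrees.
            while len(first) > end and all(
                len(S[i]) > end and S[i][end] == first[end] for i in group
            ):
                end += 1
            node[(group[0], depth, end)] = build(group, end)
        return node

    return build(range(len(S)), 0)


def decode(node, S):
    """Expand every index range back into its substring (for inspection only)."""
    return {S[i][j:k]: decode(child, S) for (i, j, k), child in node.items()}


S = ["bear", "bell", "bid", "bull", "buy", "sell", "stock", "stop"]
compact = build_compact(S)
print(sorted(compact))        # index ranges on the edges leaving the root
print(decode(compact, S))
```

Each edge costs three integers regardless of how long its substring is, and a compressed trie over s strings has O(s) nodes, which gives the O(s) bound on top of the array holding the strings themselves.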
Insertion and Deletion into/from a Compressed Trie
Suffix Tries
• A suffix trie is a compressed trie for all the suffixes of a text
[Figures: example suffix trie and its compact representation]
Properties of Suffix Tries
• The suffix trie for a text X of size n from an alphabet of size d
– stores all n suffixes of X (of total length n(n+1)/2) in O(n) space
– supports arbitrary pattern matching and prefix matching queries in O(dm) time, where m is the length of the pattern
– can be constructed in O(dn) time
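To illustrate how pattern matching walks a suffix trie, here is a deliberately naive Python sketch. It builds an uncompressed trie of all suffixes, which takes quadratic time and space, so it does not achieve the O(n) space or O(dn) construction bounds listed above; it only shows the query. The names Node, build_suffix_trie and find are assumptions of this sketch.

```python
class Node:
    def __init__(self, first):
        self.children = {}   # character -> Node
        self.first = first   # start index of the earliest suffix passing through


def build_suffix_trie(text):
    """Insert every suffix of `text`, one character per edge (uncompressed)."""
    root = Node(0)
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            if ch not in node.children:
                # Suffixes are inserted left to right, so the creator is earliest.
                node.children[ch] = Node(start)
            node = node.children[ch]
    return root


def find(root, pattern):
    """Index of the first occurrence of `pattern` in the text, or -1."""
    node = root
    for ch in pattern:
        node = node.children.get(ch)
        if node is None:
            return -1
    return node.first


root = build_suffix_trie("minimize")
print(find(root, "nimi"), find(root, "ize"), find(root, "zen"))
```

Every substring of the text is a prefix of some suffix, so a query only follows m edges from the root, no matter how long the text is.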
Tries and Web Search Engines
• The index of a search engine (collection of all searchable words) is stored in a compressed trie
• Each leaf of the trie is associated with a word and has a list of pages (URLs) containing that word, called its occurrence list
• The trie is kept in internal memory
• The occurrence lists are kept in external memory and are ranked by
relevance
• Boolean queries for sets of words (e.g., Java and coffee) correspond to set
operations (e.g., intersection) on the occurrence lists
• Additional information retrieval techniques are used, such as
– stopword elimination (e.g., ignore “the”, “a”, “is”)
– stemming (e.g., treat “add”, “adding”, “added” as the same word)
– link analysis (recognize authoritative pages)
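A Boolean AND query can be sketched in Python as an intersection of occurrence lists. Several things here are assumptions made for illustration: a plain dict stands in for the compressed trie, pages are hypothetical integer IDs rather than URLs, and each list is kept sorted by page ID (not by relevance) so that two lists can be intersected in one linear merge.

```python
def intersect(a, b):
    """Intersect two sorted occurrence lists with a linear merge."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def and_query(index, words):
    """Pages containing every word in `words`."""
    lists = [index.get(word, []) for word in words]
    if not lists:
        return []
    result = lists[0]
    for occurrence_list in lists[1:]:
        result = intersect(result, occurrence_list)
    return result


# Hypothetical index: word -> sorted list of page IDs.
index = {
    "java": [1, 3, 4, 7],
    "coffee": [2, 3, 7, 9],
    "trie": [3],
}
print(and_query(index, ["java", "coffee"]))
```

An OR query would use a union of the lists in the same way; ranking the merged result by relevance is a separate step that this sketch leaves out.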
Tries and Internet Routers
• Computers on the internet (hosts) are identified by a unique 32-bit IP (internet protocol) address, usually written in “dotted-quad-decimal” notation
• E.g., www.cs.brown.edu is 128.148.32.110
• Use nslookup on Unix to find out IP addresses
• An organization uses a subset of IP addresses with the same prefix, e.g.,
Brown uses 128.148.*.*, Yale uses 130.132.*.*
• Data is sent to a host by fragmenting it into packets. Each packet carries the
IP address of its destination.
• The internet is a graph whose nodes are routers and whose edges are communication links.
• A router forwards packets to its neighbors using IP prefix matching rules.
E.g., a packet with IP prefix 128.148. should be forwarded to the Brown
gateway router.
• Routers use tries on the alphabet {0, 1} to do prefix matching.
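Prefix matching on a binary trie can be sketched in Python as below. This is an illustration only: the RouteTrie class, the explicit prefix lengths (128.148.*.* written as a 16-bit prefix) and the next-hop names, including the extra 24-bit rule, are hypothetical. The key "hop" cannot collide with the bit keys "0" and "1".

```python
def ip_to_bits(ip):
    """Dotted-quad IPv4 address -> string of 32 bits."""
    return "".join(f"{int(part):08b}" for part in ip.split("."))


class RouteTrie:
    def __init__(self):
        self.root = {}

    def add(self, prefix_ip, prefix_len, next_hop):
        """Store `next_hop` at the node reached by the first `prefix_len` bits."""
        node = self.root
        for bit in ip_to_bits(prefix_ip)[:prefix_len]:
            node = node.setdefault(bit, {})
        node["hop"] = next_hop

    def lookup(self, ip):
        """Return the next hop of the longest matching prefix, or None."""
        node = self.root
        best = node.get("hop")
        for bit in ip_to_bits(ip):
            node = node.get(bit)
            if node is None:
                break
            best = node.get("hop", best)  # a deeper rule overrides a shorter one
        return best


routes = RouteTrie()
routes.add("128.148.0.0", 16, "brown-gateway")   # 128.148.*.*
routes.add("130.132.0.0", 16, "yale-gateway")    # 130.132.*.*
routes.add("128.148.32.0", 24, "brown-cs")       # hypothetical, more specific rule
print(routes.lookup("128.148.32.110"), routes.lookup("128.148.1.1"))
```

A lookup follows at most 32 edges, one per address bit, and remembers the last rule seen on the way down, which is exactly the longest matching prefix.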
