International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 04 Issue: 07 | July-2017 www.irjet.net p-ISSN: 2395-0072
© 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 2742
BLOOM FILTERS: AN INTRODUCTION
Shrivatsa D Perur1
1 Assistant Professor, Dept. of ISE, GIT, Belagavi, Karnataka, India
---------------------------------------------------------------------***------------------------------------------------------------------
Abstract: Many network solutions and overlay networks
utilize probabilistic techniques to reduce information
processing and cost of networking. This article presents a
number of frequently used and useful probabilistic
techniques. Bloom filters and their variants are of prime
importance, and they are heavily used in various distributed
systems. This has been reflected in recent research and many
new algorithms have been proposed for distributed systems
that are either directly or indirectly based on Bloom filters.
To keep false positive probabilities low, the size of the Bloom
filter must be dimensioned a priori to be linear in the
maximum number of keys inserted, with the linearity
constant typically ranging from one to a few bytes.
Key words: Bloom filters, probabilistic structures,
distributed systems
I. INTRODUCTION
The Bloom filter is a bit-vector data structure that provides
a compact representation of a set of elements (keys). It
supports insertion of elements and membership queries. A
membership answer is probabilistically correct in the sense
that it allows a small probability of a false positive (i.e., a
positive answer for a non-member element). The Bloom
filter allows tradeoffs between small size (compactness)
and low false positives (accuracy). To keep false positives
low, the size of the Bloom filter must be dimensioned a
priori to be linear in the maximum number of keys inserted,
with the linearity constant typically ranging from one to
a few bytes.
Fast matching of arbitrary identifiers to values is
a basic requirement for a large number of applications.
Data objects are typically referenced using locally or
globally unique identifiers. Recently, many distributed
systems have been developed that use probabilistically
globally unique random bit strings as node identifiers. For
example, a node may track a large number of peers that
advertise files or parts of files, so fast mapping from host
identifiers to object identifiers and vice versa is needed.
The number of these identifiers in memory may be large,
which motivates the development of fast and compact
matching algorithms.
Given that there are millions or even billions of data
elements, developing efficient solutions for storing,
updating, and querying them becomes increasingly
important. The key idea behind the data structures
discussed in this survey is that by allowing the
representation of the set of elements to lose some
information, in other words to become lossy, the storage
requirements can be significantly reduced.
The Bloom filter was introduced by Bloom in 1970 as a
compact probabilistic data structure that was used to
represent words in a dictionary.
There was little interest in using Bloom filters for
networking until 1995, after which the area gained
widespread interest both in academia and in industry. A
Bloom filter is used simply to test whether an element is
present in a set or not.
Its main properties are:
1. The amount of space needed to store the Bloom filter is
very small compared to the amount of data belonging
to the set being tested.
2. The time needed to check whether an element is a
member of a given set is independent of the number of
elements contained in the set.
3. False negatives are not possible.
4. False positives are possible, but their frequency can
be controlled. In practice, there is a trade-off between
space/time efficiency and the false positive frequency.
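The trade-off in property 4 can be made concrete with the standard approximation for the false positive probability, p ≈ (1 − e^(−kn/m))^k, which is minimized near k = (m/n) ln 2. These formulas are not stated in the text above; they are the usual textbook results, and the function names below are illustrative only. A minimal sizing sketch in Python:

```python
import math


def bloom_parameters(n: int, p: float) -> tuple[int, int]:
    """Return (m, k): bits in the filter and number of hash functions
    for n expected keys and a target false positive probability p."""
    if n <= 0 or not 0.0 < p < 1.0:
        raise ValueError("need n > 0 and 0 < p < 1")
    # m = -n ln(p) / (ln 2)^2 bits, rounded up.
    m = math.ceil(-n * math.log(p) / (math.log(2) ** 2))
    # k = (m / n) ln 2, rounded to the nearest integer (at least 1).
    k = max(1, round((m / n) * math.log(2)))
    return m, k


def false_positive_rate(m: int, n: int, k: int) -> float:
    """Standard approximation (1 - e^(-kn/m))^k."""
    return (1.0 - math.exp(-k * n / m)) ** k


n = 1_000_000
m, k = bloom_parameters(n, 0.01)
print(f"bits per key: {m / n:.2f}, hash functions: {k}")
print(f"estimated false positive rate: {false_positive_rate(m, n, k):.4f}")
```

For a 1% target the sketch yields a little under 10 bits per key, which is consistent with the "one to a few bytes" linearity constant mentioned in the introduction.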
II. BLOOM FILTER
Whenever a list or set is used and space is at a premium,
consider using a Bloom filter if the effect of false
positives can be mitigated. A Bloom filter is an array of
m bits for representing a set S = {x_1, x_2, . . . , x_n} of n
elements. Initially all the bits in the filter are set to zero.
The key idea is to use k hash functions, h_i(x), 1 ≤ i ≤ k, to
map items x ∈ S to random numbers uniformly distributed
in the range 1, . . . , m. The hash functions are assumed to
be uniform. The MD5 hash algorithm is a popular choice
for the hash functions. An element x ∈ S is inserted into the
filter by setting the bits h_i(x) to one for 1 ≤ i ≤ k.
Conversely, y is assumed to be a member of S if all the bits
h_i(y) are set, and guaranteed not to be a member if any bit
h_i(y) is not set. The weak point of Bloom filters is the
possibility of a false positive. False positives are
elements that are not part of S but are reported as being in
the set by the filter.
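The insertion and query rules above can be sketched in a few lines of Python. This is a minimal illustration, not a tuned implementation. The class name `BloomFilter` is hypothetical. The k hash functions are derived by salting SHA-256 with the index i, an assumption made here in place of the MD5 choice mentioned in the text. Positions are numbered 0 to m − 1 rather than 1 to m.

```python
import hashlib


class BloomFilter:
    """Bit array of m bits with k salted hash functions (illustrative)."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = bytearray((m + 7) // 8)  # all bits start at zero

    def _positions(self, item: str):
        # Assumption: h_i(x) is SHA-256 of "i:x" reduced modulo m.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest, "big") % self.m

    def add(self, item: str) -> None:
        # Insert: set the k bits h_i(x) to one.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # Query: member only if every bit h_i(y) is set.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


bf = BloomFilter(m=1024, k=3)
bf.add("alice")
bf.add("bob")
print("alice" in bf, "bob" in bf)
```

An inserted element is always reported as present (no false negatives), while an element that was never inserted may still be reported as present if other insertions happened to set all of its k bits.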
Fig 1. Overview of Bloom Filters
The Bloom filter relies on hashing to search for the best
matching document. When the Bloom filter receives a query
from a node, it applies multiple hash functions to the query
and, as a result, maps the query to URLs. A Bloom filter (BF)
is a lossy but succinct and efficient data structure for
representing a set S, and it can efficiently process membership
queries such as “is element x in set S?”
III. HASHING TECHNIQUES
Here I briefly present hashing techniques that are the basis
for good Bloom filter implementations. I start with perfect
hashing, which is an alternative to Bloom filters when the
set is known beforehand and it is static. Double hashing
allows reducing the number of true hash computations.
Partitioned hashing and multiple hashing deal with how
bits are allocated in a Bloom filter. Finally, the use of simple
hash functions is considered.
A. Perfect Hashing Scheme: A simple technique called
perfect hashing (or explicit hashing) can be used to store a
static set S of values in an optimal manner using a perfect
hash function.
A perfect hash function is a computable bijection from S to
an array of |S| = n hash buckets. The n-size array can be
used to store the information associated with each
element x ∈ S [1]. Bloom-filter-like functionality can be
obtained by, given a set of elements S, first finding a perfect
hash function P and then storing at each location an
f = ⌈log2(1/ε)⌉ bit fingerprint, computed using some
(pseudo-)random hash function H, where ε is the target
false positive probability.
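The fingerprint scheme can be illustrated with a small Python sketch. Constructing a real perfect hash function is beyond the scope of this example, so P is simulated with a dictionary built from the static set; a real perfect hash function would not store the keys. The class name, the salts "P" and "H", and the use of SHA-256 are all assumptions made for illustration. The fingerprint length is taken as ⌈log2(1/ε)⌉ bits.

```python
import hashlib
import math


def _hash(value: str, salt: str) -> int:
    """Deterministic integer hash; the salt selects the hash function."""
    digest = hashlib.sha256(f"{salt}:{value}".encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


class FingerprintTable:
    """Static set stored as one f-bit fingerprint per perfect-hash slot."""

    def __init__(self, items, eps: float) -> None:
        keys = sorted(set(items))
        if not keys or not 0.0 < eps < 1.0:
            raise ValueError("need a non-empty set and 0 < eps < 1")
        self.n = len(keys)
        self.f = math.ceil(math.log2(1.0 / eps))  # fingerprint bits
        # Stand-in for the perfect hash function P (assumption).
        self._slot_of = {key: slot for slot, key in enumerate(keys)}
        self.table = [self._fingerprint(key) for key in keys]

    def _fingerprint(self, x: str) -> int:
        return _hash(x, "H") % (1 << self.f)

    def _P(self, x: str) -> int:
        # Members map to their own slot; any other input maps to some slot.
        slot = self._slot_of.get(x)
        return slot if slot is not None else _hash(x, "P") % self.n

    def __contains__(self, x: str) -> bool:
        return self.table[self._P(x)] == self._fingerprint(x)


table = FingerprintTable(["a", "b", "c"], eps=1 / 256)
print(table.f, "a" in table)
```

A non-member is reported as present only when its fingerprint collides with the one stored in the slot it maps to, which happens with probability about 2^(−f) for a well-behaved H.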
B. Double Hashing: The improvement of the double
hashing technique over basic hashing is being able to
generate k hash values based on only two universal hash
functions as base generators (or “seed” hashes). As a
practical consequence, Bloom filters can be built with fewer
hashing operations without sacrificing performance. Kirsch
and Mitzenmacher have shown [2] that it requires only
two independent hash functions, h_1(x) and h_2(x), to
generate additional “pseudo” hashes defined as:
h_i(x) = h_1(x) + f(i) · h_2(x)
where i is the hash value index, f(i) can be any arbitrary
function of i (e.g., i^2), and x is the element being hashed. For
Bloom filter operations, the double hashing scheme reduces
the number of true hash computations from k down to two
without any increase in the asymptotic false positive
probability [2].
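A minimal Python sketch of the double hashing formula follows, with f(i) = i. The two base hashes are taken from two halves of a single SHA-256 digest, which is a convenience assumed here rather than two independently chosen universal hash functions, and h_2 is forced to be odd so that it is never zero. The function name is hypothetical.

```python
import hashlib


def double_hash_positions(item: str, k: int, m: int) -> list[int]:
    """Return k bit positions in [0, m) using h_i = h_1 + i * h_2 (mod m)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    h1 = int.from_bytes(digest[:8], "big")
    # Assumption: force h2 to be odd so the k positions do not all collapse
    # onto h1 when the second hash happens to be zero.
    h2 = int.from_bytes(digest[8:16], "big") | 1
    return [(h1 + i * h2) % m for i in range(k)]


print(double_hash_positions("example-key", k=5, m=1000))
```

Only one digest is computed per element regardless of k, which is the practical saving the scheme aims for.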
C. Partitioned Hashing: In this hashing technique, the k hash
functions are allocated disjoint ranges of m/k consecutive
bits instead of the full m-bit array space. The probability of a
specific bit being 0 in a partitioned Bloom filter can be
approximated as:
(1 − k/m)^n ≈ e^(−kn/m)
While the asymptotic performance remains the same, in
practice, partitioned Bloom filters exhibit a poorer false
positive performance as they tend to have larger fill
factors (more 1s) due to the m/k bit range restriction.
This can be explained by the observation that, for k > 1,
(1 − 1/m)^(kn) > (1 − k/m)^n
that is, a bit of a standard Bloom filter is more likely to
remain 0 than a bit of a partitioned one.
For the perfect hashing scheme of Section III-A, lookup
of x simply consists of computing P(x) and checking
whether the stored fingerprint matches H(x).
When x ∈ S, the correct value is always returned, and
when x does not belong to S a false positive (claiming the
element is in S) occurs with probability at most ε.
This follows from the definition of 2-universal hashing
by Carter and Wegman [3]: any element y not in S
has probability at most ε of having the same hash
function value h(y) as the element in S that maps to the
same entry of the array. While space efficient, this
approach is not considered suitable for dynamic environments,
because the perfect hash function must be recomputed
whenever the set S changes.
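The fill-factor remark for partitioned hashing can be checked numerically. The Python sketch below compares the probability that a given bit is still 0 after n insertions in a standard filter, (1 − 1/m)^(kn), with the partitioned expression (1 − k/m)^n from the text. The standard-filter expression is the usual textbook formula and the parameter values are arbitrary examples.

```python
import math


def zero_prob_standard(m: int, n: int, k: int) -> float:
    """Each of the k*n hash evaluations misses a given bit with prob. 1 - 1/m."""
    return (1.0 - 1.0 / m) ** (k * n)


def zero_prob_partitioned(m: int, n: int, k: int) -> float:
    """Each insertion misses a given bit of its m/k-bit slice with prob. 1 - k/m."""
    return (1.0 - k / m) ** n


m, n, k = 1000, 100, 7
print("standard   :", zero_prob_standard(m, n, k))
print("partitioned:", zero_prob_partitioned(m, n, k))
print("e^(-kn/m)  :", math.exp(-k * n / m))
```

The two values agree to within a fraction of a percent for these parameters, with the partitioned filter having slightly fewer zero bits, that is, a slightly larger fill factor.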
D. Multiple Hashing: Multiple hashing is a popular
technique that exploits the notion of having multiple hash
choices and having the power to choose the most
convenient candidate. When applied to hash table
constructions, multiple hashing provides a probabilistic
method to limit the effects of collisions by distributing
elements more or less evenly. The original idea
was proposed by Azar et al. in their seminal work on
balanced allocations [4]. Formulating hashing as a
balls-into-bins problem, the authors show that if n balls are
placed sequentially into m bins, for m = O(n), with each ball
being placed in the least loaded of a constant d ≥ 2 randomly
chosen bins, then the maximum load of a bin is, with high
probability, ln ln n / ln d + O(1), an exponential improvement
over placing each ball in a single randomly chosen bin.
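The balls-into-bins formulation is easy to simulate. The Python sketch below places each ball in the least loaded of d randomly chosen bins and reports the maximum load. The function name, the seed, and the problem size are arbitrary choices for illustration, and a single simulation run is of course not a proof of the bound.

```python
import random


def max_load(n_balls: int, n_bins: int, d: int, seed: int = 0) -> int:
    """Place each ball in the least loaded of d random bins; return max load."""
    rng = random.Random(seed)
    loads = [0] * n_bins
    for _ in range(n_balls):
        choices = [rng.randrange(n_bins) for _ in range(d)]
        # Pick the candidate bin that currently holds the fewest balls.
        target = min(choices, key=loads.__getitem__)
        loads[target] += 1
    return max(loads)


n = 10_000
print("one choice :", max_load(n, n, d=1))
print("two choices:", max_load(n, n, d=2))
```

With d = 1 this is plain random hashing; with d = 2 the maximum load is typically noticeably smaller for the same number of balls and bins.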
E. Simple Hash Functions: A common assumption is to
consider output hash values as truly random, that is, each
hashed element is independently mapped to a uniform
location. While this is a great aid to theoretical analyses,
hash function implementations are known to behave far
worse than truly random ones. On the other hand,
empirical works using standard universal hashing have
reported negligible differences in practical performance
compared to predictions assuming ideal hashing (see [6] for
the case of Bloom filters).
IV. BLOOM FILTER VARIANTS
A number of Bloom filter variants have been proposed
that address some of the limitations of the original
structure, including counting, deletion, multisets, and
space-efficiency. We take up a few variants here.
A. Compressed Bloom Filter:
Compressing a Bloom filter improves performance when a
Bloom filter is passed in a message between distributed
nodes. This structure is particularly useful when
information must be transmitted repeatedly and
bandwidth is a limiting factor [7]. Compressed Bloom filters
are used only for optimizing the size of the filter as
transmitted over the network. This is motivated by
applications such as Web caches and P2P information
sharing, which frequently use Bloom filters to distribute
routing tables. The key idea in compressed Bloom filters is
that by changing the way bits are distributed in the filter, it
can be compressed for transmission purposes. This is
achieved by choosing the number of hash functions k in
such a way that the entries in the m-bit vector have a
probability smaller than ½ of being set.
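Why a bit probability below ½ helps can be seen from the binary entropy of the bit vector, which bounds how far an ideal compressor could shrink it. The Python sketch below uses the standard approximation 1 − e^(−kn/m) for the probability that a bit is set. The function names and parameter values are illustrative assumptions, and no actual compressor is run.

```python
import math


def fill_probability(m: int, n: int, k: int) -> float:
    """Approximate probability that a given bit is set after n insertions."""
    return 1.0 - math.exp(-k * n / m)


def binary_entropy(p: float) -> float:
    """Entropy in bits of a single bit that is 1 with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


m, n = 10_000, 1_000
for k in (7, 3):  # 7 is close to (m/n) ln 2; 3 is a deliberately smaller choice
    p = fill_probability(m, n, k)
    print(f"k={k}: P(bit set)={p:.3f}, "
          f"ideal compressed size ~ {binary_entropy(p) * m:.0f} bits")
```

With k near (m/n) ln 2 about half of the bits are set and the vector is essentially incompressible; with a smaller k the bits are set with probability below ½ and the entropy per bit drops below one.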
B. Spectral Bloom Filters:
Spectral Bloom filters generalize Bloom filters to storing
an approximate multiset and support frequency queries
[8]. The membership query is generalized to a query on
the multiplicity of an element. The answer to any
multiplicity query is never smaller than the true
multiplicity, and greater only with probability ε. In this
sense, spectral refers to the range within which
multiplicity answers are given. The space usage is similar
to that of a Bloom filter for a set of the same size
(including the counters to store the frequency values).
The time needed to determine a multiplicity of k is
O(log k). The query time is Θ(log(1/ε)). The answer
estimate is given by returning the minimum value of the k
counters determined by the hash functions.
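The minimum-of-k-counters query can be sketched as follows in Python. Only this basic query rule is shown; the spectral Bloom filter of [8] includes further refinements that are not reproduced here. The class name and the salted SHA-256 hash functions are assumptions made for illustration.

```python
import hashlib


class MinCounterFilter:
    """Array of m counters queried by taking the minimum over k positions."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.counters = [0] * m

    def _positions(self, item: str) -> list[int]:
        return [
            int.from_bytes(
                hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest(), "big"
            ) % self.m
            for i in range(self.k)
        ]

    def add(self, item: str, count: int = 1) -> None:
        # A set avoids incrementing one counter twice if two hashes collide.
        for pos in set(self._positions(item)):
            self.counters[pos] += count

    def estimate(self, item: str) -> int:
        # Never below the true multiplicity; may exceed it after collisions.
        return min(self.counters[pos] for pos in self._positions(item))


f = MinCounterFilter(m=256, k=4)
for word in ["a", "a", "a", "b"]:
    f.add(word)
print(f.estimate("a"), f.estimate("b"))
```

Because every counter of an element is incremented on each insertion, the minimum can only overestimate the true multiplicity, never underestimate it.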
C. Space Code Bloom Filter:
Per-flow traffic measurement is crucial for usage
accounting, traffic engineering, and anomaly detection.
Previous methodologies are either based on random
sampling (e.g., Cisco’s NetFlow), which is inaccurate, or
only account for the “elephants”. A data structure called
Space Code Bloom Filter (SCBF) can be used to measure
per-flow traffic approximately at high speeds. SCBF
employs a Maximum Likelihood Estimation (MLE) method
to measure the multiplicity of an element in the multiset.
D. Decaying Bloom Filter:
One application scenario is the approximate detection of
duplicates among newly arrived data elements of a data
stream, which can be accomplished within a fixed time
window. The Decaying Bloom Filter (DBF) structure has been
proposed for this application scenario. DBF is an extension
of the counting Bloom filter and it supports the removal of
stale elements from the structure as new elements are
inserted. DBF may produce false positive errors, but not
false negatives, as is the case with the basic Bloom filter. A
variant of DBF has been applied for hint-based routing in
wireless sensor networks [9].
V. BLOOM FILTERS IN DISTRIBUTED COMPUTING
We have surveyed techniques for probabilistic
representation of sets and functions. The applications of
these structures are manifold, and they are widely used
in various networking systems, such as Web proxies and
caches, database servers, and routers.
A. Caching
Bloom filters have been applied extensively to caching in
distributed environments. To take an early example, Fan,
Cao, Almeida, and Broder proposed the Summary Cache
[10], [11] system, which uses Bloom filters for the
distribution of Web cache information. The system
consists of cooperative proxies that store and exchange
summary cache data structures, essentially Bloom filters.
When a local cache miss happens, the proxy in question
will try to find out if another proxy has a copy of the Web
resource using the summary cache. If another proxy has a
copy, then the request is forwarded there. In order for
distributed proxy-based caching to work well, the proxies
need to have a way to compactly summarize available
content. In the Summary Cache system, proxies
periodically transfer the Bloom filters that represent the
cache contents (URL lists).
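The lookup on a local cache miss can be sketched in Python as follows. This is not the Summary Cache protocol itself. It only illustrates how a proxy might consult the Bloom filters received from its peers. The filter size, the number of hash functions, the proxy names, and the URLs are hypothetical. Each filter is held as a Python integer used as a bit mask.

```python
import hashlib

M, K = 4096, 4  # hypothetical filter size (bits) and number of hash functions


def positions(url: str) -> list[int]:
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{url}".encode("utf-8")).digest(), "big") % M
        for i in range(K)
    ]


def summarize(cached_urls) -> int:
    """Build the Bloom filter (as an int bit mask) for a proxy's cached URLs."""
    bits = 0
    for url in cached_urls:
        for pos in positions(url):
            bits |= 1 << pos
    return bits


def may_contain(summary: int, url: str) -> bool:
    return all((summary >> pos) & 1 for pos in positions(url))


def candidate_peers(url: str, peer_summaries: dict[str, int]) -> list[str]:
    """Peers whose summary suggests they hold the URL (false positives possible)."""
    return [peer for peer, s in peer_summaries.items() if may_contain(s, url)]


summaries = {
    "proxy-a": summarize(["http://example.com/a", "http://example.com/b"]),
    "proxy-b": summarize([]),
}
print(candidate_peers("http://example.com/a", summaries))
```

A false positive here costs only a wasted request to a peer that does not hold the resource, which is why the approximate summary is acceptable in this setting.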
Another example is Google’s Bigtable system, which is used
by many massively popular Google services, such as Google
Maps, Google Earth, and Web indexing. Bigtable is a
distributed storage system for structured data that has been
designed with high scalability requirements in mind, for
example the capability to store and query petabytes of data
across thousands of commodity servers [12]. A Bigtable
is a sparse multidimensional sorted map. The map is
indexed by a row key, a column key, and a timestamp.
Each value in the map is an uninterpreted array of bytes.
Bigtable uses Bloom filters to reduce the disk lookups
for non-existent rows or columns [12]. As a result, the
query performance of the database relies less on
costly disk operations and thus performance increases.
B. P2P Networks
Bloom filters have been extensively applied in P2P
environments for various tasks, such as compactly storing
keyword-based searches and indices [13], synchronizing
sets over the network, and summarizing content. The
exchange of keyword lists and other metadata between
peers is crucial for P2P networks. Ideally, the state should
be such that it allows for accurate matching of queries and
takes sublinear space (or near constant space). The later
versions of the Gnutella protocol use Bloom filters [14] to
represent the keyword lists in an efficient manner. In
Gnutella, each leaf node sends its keyword Bloom filter to
an ultra-node, which can then produce a summary of all
the filters from its leaves, and then sends it to
neighbouring ultra-nodes. The ultra-nodes are hubs of
connectivity, each being connected to more than 32 other
ultra-nodes.
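Producing a summary of several leaf filters is straightforward when all filters share the same size and hash functions: the bitwise OR of the filters represents the union of the keyword sets. The Python sketch below shows only this union property and is not the Gnutella wire protocol. The filter size, the number of hash functions, and the keywords are hypothetical.

```python
import hashlib
from functools import reduce
from operator import or_

M, K = 2048, 3  # hypothetical parameters shared by all leaves


def positions(keyword: str) -> list[int]:
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{keyword}".encode("utf-8")).digest(), "big") % M
        for i in range(K)
    ]


def keyword_filter(keywords) -> int:
    """Bloom filter of a leaf's keywords, stored as an int bit mask."""
    bits = 0
    for kw in keywords:
        for pos in positions(kw):
            bits |= 1 << pos
    return bits


def merge(filters) -> int:
    """Summary built by an ultra-node: bitwise OR of its leaves' filters."""
    return reduce(or_, filters, 0)


def may_contain(bits: int, keyword: str) -> bool:
    return all((bits >> pos) & 1 for pos in positions(keyword))


leaf_1 = keyword_filter(["jazz", "blues"])
leaf_2 = keyword_filter(["rock"])
summary = merge([leaf_1, leaf_2])
print(may_contain(summary, "jazz"), may_contain(summary, "rock"))
```

The merged filter answers positively for every keyword held by any leaf, at the price of a higher false positive rate than each individual leaf filter.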
D. Monitoring and Measurement
Network monitoring and measurement are key
application areas for Bloom filters and their variants. We
briefly examine some key cases in this domain, for
example detection of heavy flows, Iceberg queries, packet
attribution, and approximate state machines.
Fig 2. Two-tier Gnutella architecture
A. Heavy Flows: Bloom filters have found many applications
in measurement of network traffic. One particular
application is the detection of heavy flows in a router.
Heavy flows can be detected with a relatively small amount
of space and a small number of operations per packet by
hashing incoming packets into a variant of the counting
Bloom filter and incrementing the counter at each hashed
position by the size of the packet. Then, if the minimum
counter exceeds some threshold value, the flow is marked
as a heavy flow [15].
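The detection rule can be sketched in Python as follows. The class name, the flow identifier format, and the threshold are hypothetical, and the hash functions are salted SHA-256 as an assumption. The sketch only mirrors the rule described above, not any particular router implementation.

```python
import hashlib


class HeavyFlowDetector:
    """Counting-filter variant: counters accumulate bytes per hashed position."""

    def __init__(self, m: int, k: int, threshold_bytes: int) -> None:
        self.m = m
        self.k = k
        self.threshold = threshold_bytes
        self.counters = [0] * m

    def _positions(self, flow_id: str) -> set[int]:
        return {
            int.from_bytes(
                hashlib.sha256(f"{i}:{flow_id}".encode("utf-8")).digest(), "big"
            ) % self.m
            for i in range(self.k)
        }

    def observe(self, flow_id: str, packet_bytes: int) -> bool:
        """Account for one packet; return True if the flow now looks heavy."""
        positions = self._positions(flow_id)
        for pos in positions:
            self.counters[pos] += packet_bytes
        # The minimum counter is an upper bound on the flow's byte count.
        return min(self.counters[pos] for pos in positions) > self.threshold


detector = HeavyFlowDetector(m=1024, k=3, threshold_bytes=3000)
flow = "10.0.0.1:1234->10.0.0.2:80"
print([detector.observe(flow, 1500) for _ in range(3)])
```

Since counters are shared between flows, a light flow can be misclassified as heavy when all of its counters are inflated by other flows, but a truly heavy flow is never missed.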
B. Iceberg Queries: An Iceberg query is one that identifies
all items with frequency above some given threshold.
Bloom filter variants that are able to count elements are
good candidate structures for supporting Iceberg queries.
In networking, low-memory approximate histogram
structures are needed for collecting network statistics at
runtime. For example, in some applications it is necessary
to track flows across domains and perform, to name a few
examples, congestion and security monitoring. Iceberg
queries can be used to detect Denial-of-Service attacks.
VI. BLOOM FILTERS FOR SIMILARITY TESTING
Observe that we can view each document as a set, in
Bloom filter parlance, whose elements are the
content-defined chunks (CDCs) that it is composed of.
Given that Bloom filters compactly
represent a set, they can also be used to approximately
match two sets. Bloom filters, however, cannot be used
for exact matching as they have a non-zero false-match
probability, but they are naturally suited for similarity
Bloom filter of one with that of the other. In case the two
documents share a large number of 1’s (bit-wise AND)
they are marked as similar. In this case, the bit-wise AND
can also be perceived as the dot product of the two bit
vectors. If the set bits in the Bloom filter of a document
are a complete subset of that of another filter then it is
highly probable that the document is included in the
other. Web pages are typically composed of fragments,
either static ones (e.g., logo images), or dynamic (e.g.,
personalized product promotions, local weather) [16].
When targeting pages for a similarity based “grouping”,
the test for similarity should be on the fragment of
interest and not the entire page.
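The bitwise-AND comparison can be sketched in Python as follows. Each document is assumed to be already split into chunks (given here simply as strings), and the filter parameters and function names are hypothetical. The containment measure is the fraction of one filter's set bits that are also set in the other filter.

```python
import hashlib

M, K = 2048, 3  # hypothetical filter size (bits) and number of hash functions


def chunk_filter(chunks) -> int:
    """Bloom filter of a document's chunks, stored as an int bit mask."""
    bits = 0
    for chunk in chunks:
        for i in range(K):
            digest = hashlib.sha256(f"{i}:{chunk}".encode("utf-8")).digest()
            bits |= 1 << (int.from_bytes(digest, "big") % M)
    return bits


def shared_bits(a: int, b: int) -> int:
    """Number of 1s in the bitwise AND (dot product of the two bit vectors)."""
    return bin(a & b).count("1")


def containment(a: int, b: int) -> float:
    """Fraction of a's set bits that are also set in b."""
    ones = bin(a).count("1")
    return shared_bits(a, b) / ones if ones else 0.0


doc_a = chunk_filter(["header", "intro", "body"])
doc_b = chunk_filter(["header", "intro", "body", "appendix"])
print(containment(doc_a, doc_b), containment(doc_b, doc_a))
```

A containment of 1.0 means the set bits of the first filter are a subset of those of the second, which makes inclusion highly probable but, given false matches, not certain.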
Bloom filters, when applied to similarity detection, have
several advantages. First, the compactness of Bloom filters
is very attractive for storage and transmission whenever
we want to minimize the meta-data overheads. Second,
Bloom filters enable fast comparison as matching is a
bitwise-AND operation. Third, since Bloom filters are a
complete representation of a set rather than a
deterministic sample (e.g., shingling), they can determine
inclusions effectively.
VII. SUMMARY
Bloom filters are a general aid for network processing and
for improving the performance and scalability of distributed
systems. The space required by a Bloom filter is very small
compared to the size of the data in the element set.
Compressed Bloom filters are used to reduce the amount of
data transmitted in a distributed system, which in turn
improves the performance of the system. Bloom filters can
be used on large data sets to test whether a particular
element is present in the set or not. Per-flow traffic can be
measured approximately under heavy traffic by using Space
Code Bloom filters. Spectral Bloom filters generalize Bloom
filters to storing an approximate multiset and support
frequency queries. Spectral refers to the range within which
multiplicity answers are given. The Decaying Bloom filter
supports the removal of stale elements from the
structure as new elements are inserted. It may produce
false positive errors but not false negatives, as in the case
of the basic Bloom filter.
REFERENCES
[1] A. Pagh, R. Pagh, and S. S. Rao, “An Optimal Bloom
Filter Replacement,” in SODA ’05: Proceedings of the
Sixteenth Annual ACM-SIAM Symposium on Discrete
Algorithms. Philadelphia, PA, USA: Society for Industrial
and Applied Mathematics, 2005, pp. 823–829.
[2] A. Kirsch and M. Mitzenmacher, “Less Hashing, Same
Performance: Building a Better Bloom Filter,” in
ESA ’06: Proceedings of the 14th Annual European
Symposium on Algorithms. London, UK: Springer-Verlag,
2006, pp. 456–467.
[3] J. L. Carter and M. N. Wegman, “Universal Classes of
Hash Functions (Extended Abstract),” in STOC ’77:
Proceedings of the Ninth Annual ACM Symposium on
Theory of Computing. New York, NY, USA: ACM, 1977, pp.
106–112.
Y. Azar, A. Z. Broder, A. R. Karlin, and E. Upfal,
“Balanced Allocations,” SIAM J. Comput., vol. 29, no. 1, pp.
180–200, 2000.
[4] B. Vöcking, “How Asymmetry Helps Load
Balancing,” J. ACM, vol. 50, no. 4, pp. 568–589, 2003.
[5] M. Mitzenmacher, “Compressed Bloom Filters,” in PODC
’01: Proceedings of the Twentieth Annual ACM Symposium
on Principles of Distributed Computing. New York, NY, USA:
ACM, 2001, pp. 144–150.
[6] L. Fan, P. Cao, J. Almeida, and A. Z. Broder, “Summary
Cache: A Scalable Wide-Area Web Cache Sharing Protocol,”
SIGCOMM Comput. Commun. Rev., vol. 28, no. 4, pp. 254–
265, 1998.
[7] “Summary Cache: A Scalable Wide-Area Web Cache
Sharing Protocol,” IEEE/ACM Trans. Netw., vol. 8, no. 3,
pp. 281–293, 2000.
[8] F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A.
Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E.
Gruber, “Bigtable: A Distributed Storage System for
Structured Data,” in OSDI ’06: Proceedings of the 7th
USENIX Symposium on Operating Systems Design and
Implementation. Berkeley, CA, USA: USENIX Association,
2006, pp. 15–15.
[9] J. Risson and T. Moors, “Survey of Research Towards
Robust Peer-to-Peer Networks: Search Methods,” Comput.
Netw., vol. 50, no. 17, pp. 3485–3521, 2006.
[10] H. Cai, P. Ge, and J. Wang, “Applications of Bloom
Filters in Peer-to-Peer Systems: Issues and Questions,” in
NAS ’08: Proceedings of the 2008 International Conference
on Networking, Architecture, and Storage. Washington, DC,
USA: IEEE Computer Society, 2008, pp. 97–103.
[11] W.-C. Feng, K. G. Shin, D. D. Kandlur, and D. Saha, “The
Blue Active Queue Management Algorithms,” IEEE/ACM
Trans. Netw., vol. 10, no. 4, pp. 513–528, 2002.
[12] A. Z. Broder and A. R. Karlin, “Multilevel Adaptive
Hashing,” in SODA ’90: Proceedings of the First Annual
ACM-SIAM Symposium on Discrete Algorithms. Philadelphia,
PA, USA: Society for Industrial and Applied Mathematics,
1990, pp. 43–53.

More Related Content

ZIP
Hashing
PPT
358 33 powerpoint-slides_15-hashing-collision_chapter-15
PPT
Extensible hashing
PPT
Concept of hashing
PDF
Bigdata analytics
PPTX
Hashing 1
PPT
4.4 external hashing
PDF
Application of hashing in better alg design tanmay
Hashing
358 33 powerpoint-slides_15-hashing-collision_chapter-15
Extensible hashing
Concept of hashing
Bigdata analytics
Hashing 1
4.4 external hashing
Application of hashing in better alg design tanmay

What's hot (20)

PPT
Ch17 Hashing
DOC
Data structure lecture 2
PDF
Hashing and Hash Tables
PDF
DBMS 9 | Extendible Hashing
PPTX
linear probing
PPS
Ds 8
PPTX
Hashing Technique In Data Structures
PPTX
Lecture 11
PDF
Hash Tables in data Structure
PPT
Hashing PPT
PPTX
Quadratic probing
PPTX
Hashing
PPTX
Hashing In Data Structure
PPT
Data Structure and Algorithms Hashing
PPT
Hashing
PDF
Array and Pointers
PDF
Text encryption
PPTX
Hashing in datastructure
PPTX
Hashing and Hashtable, application of hashing, advantages of hashing, disadva...
PPT
Introduction of data structure
Ch17 Hashing
Data structure lecture 2
Hashing and Hash Tables
DBMS 9 | Extendible Hashing
linear probing
Ds 8
Hashing Technique In Data Structures
Lecture 11
Hash Tables in data Structure
Hashing PPT
Quadratic probing
Hashing
Hashing In Data Structure
Data Structure and Algorithms Hashing
Hashing
Array and Pointers
Text encryption
Hashing in datastructure
Hashing and Hashtable, application of hashing, advantages of hashing, disadva...
Introduction of data structure
Ad

Similar to Bloom Filters: An Introduction (20)

PPT
New zealand bloom filter
PDF
On Improving the Performance of Data Leak Prevention using White-list Approach
PDF
An Examination of the Bloom Filter and its Application in Preventing Weak Pas...
PDF
Lesson11 transactions
PDF
Hashing and File Structures in Data Structure.pdf
PDF
Algorithms notes tutorials duniya
PDF
A Combined Approach for Feature Subset Selection and Size Reduction for High ...
PDF
UNIT V.pdf
PDF
Feature Subset Selection for High Dimensional Data Using Clustering Techniques
PDF
Dimensionality reduction by matrix factorization using concept lattice in dat...
PDF
A Survey on Fuzzy Association Rule Mining Methodologies
PDF
Feature Subset Selection for High Dimensional Data using Clustering Techniques
PPTX
Hashing_Unit4.pptx Data Structures and Algos
PDF
Hadoop Map-Reduce To Generate Frequent Item Set on Large Datasets Using Impro...
DOC
report2.doc
PDF
Performance Analysis of Hashing Mathods on the Employment of App
PDF
Description Of A Graph
PDF
Variations in Outcome for the Same Map Reduce Transitive Closure Algorithm Im...
PDF
VARIATIONS IN OUTCOME FOR THE SAME MAP REDUCE TRANSITIVE CLOSURE ALGORITHM IM...
PDF
VARIATIONS IN OUTCOME FOR THE SAME MAP REDUCE TRANSITIVE CLOSURE ALGORITHM IM...
New zealand bloom filter
On Improving the Performance of Data Leak Prevention using White-list Approach
An Examination of the Bloom Filter and its Application in Preventing Weak Pas...
Lesson11 transactions
Hashing and File Structures in Data Structure.pdf
Algorithms notes tutorials duniya
A Combined Approach for Feature Subset Selection and Size Reduction for High ...
UNIT V.pdf
Feature Subset Selection for High Dimensional Data Using Clustering Techniques
Dimensionality reduction by matrix factorization using concept lattice in dat...
A Survey on Fuzzy Association Rule Mining Methodologies
Feature Subset Selection for High Dimensional Data using Clustering Techniques
Hashing_Unit4.pptx Data Structures and Algos
Hadoop Map-Reduce To Generate Frequent Item Set on Large Datasets Using Impro...
report2.doc
Performance Analysis of Hashing Mathods on the Employment of App
Description Of A Graph
Variations in Outcome for the Same Map Reduce Transitive Closure Algorithm Im...
VARIATIONS IN OUTCOME FOR THE SAME MAP REDUCE TRANSITIVE CLOSURE ALGORITHM IM...
VARIATIONS IN OUTCOME FOR THE SAME MAP REDUCE TRANSITIVE CLOSURE ALGORITHM IM...
Ad

More from IRJET Journal (20)

PDF
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
PDF
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
PDF
Kiona – A Smart Society Automation Project
PDF
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
PDF
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
PDF
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
PDF
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
PDF
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
PDF
BRAIN TUMOUR DETECTION AND CLASSIFICATION
PDF
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
PDF
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
PDF
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
PDF
Breast Cancer Detection using Computer Vision
PDF
Auto-Charging E-Vehicle with its battery Management.
PDF
Analysis of high energy charge particle in the Heliosphere
PDF
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
PDF
Auto-Charging E-Vehicle with its battery Management.
PDF
Analysis of high energy charge particle in the Heliosphere
PDF
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
Enhanced heart disease prediction using SKNDGR ensemble Machine Learning Model
Utilizing Biomedical Waste for Sustainable Brick Manufacturing: A Novel Appro...
Kiona – A Smart Society Automation Project
DESIGN AND DEVELOPMENT OF BATTERY THERMAL MANAGEMENT SYSTEM USING PHASE CHANG...
Invest in Innovation: Empowering Ideas through Blockchain Based Crowdfunding
SPACE WATCH YOUR REAL-TIME SPACE INFORMATION HUB
A Review on Influence of Fluid Viscous Damper on The Behaviour of Multi-store...
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...
Explainable AI(XAI) using LIME and Disease Detection in Mango Leaf by Transfe...
BRAIN TUMOUR DETECTION AND CLASSIFICATION
The Project Manager as an ambassador of the contract. The case of NEC4 ECC co...
"Enhanced Heat Transfer Performance in Shell and Tube Heat Exchangers: A CFD ...
Advancements in CFD Analysis of Shell and Tube Heat Exchangers with Nanofluid...
Breast Cancer Detection using Computer Vision
Auto-Charging E-Vehicle with its battery Management.
Analysis of high energy charge particle in the Heliosphere
A Novel System for Recommending Agricultural Crops Using Machine Learning App...
Auto-Charging E-Vehicle with its battery Management.
Analysis of high energy charge particle in the Heliosphere
Wireless Arduino Control via Mobile: Eliminating the Need for a Dedicated Wir...

Recently uploaded (20)

PDF
The CXO Playbook 2025 – Future-Ready Strategies for C-Suite Leaders Cerebrai...
PPTX

Bloom Filters: An Introduction

International Research Journal of Engineering and Technology (IRJET) | e-ISSN: 2395-0056 | p-ISSN: 2395-0072
Volume: 04 Issue: 07 | July-2017 | www.irjet.net
© 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 2742

BLOOM FILTERS: AN INTRODUCTION
Shrivatsa D Perur
Assistant Professor, Dept. of ISE, GIT, Belagavi, Karnataka, India

Abstract: Many network solutions and overlay networks utilize probabilistic techniques to reduce information processing and the cost of networking. This article presents a number of frequently used and useful probabilistic techniques. Bloom filters and their variants are of prime importance, and they are heavily used in various distributed systems. This has been reflected in recent research, and many new algorithms have been proposed for distributed systems that are either directly or indirectly based on Bloom filters. To keep false positive probabilities low, the size of the Bloom filter must be dimensioned a priori to be linear in the maximum number of keys inserted, with the linearity constant typically ranging from one to a few bytes.

Key words: Bloom filters, probabilistic structures, distributed systems

I. INTRODUCTION
The Bloom filter is a bit-vector data structure that provides a compact representation of a set of elements (keys). It supports insertion of elements and membership queries. A membership answer is probabilistically correct in the sense that it allows a small probability of a false positive (i.e., an incorrect answer for a non-member element). The Bloom filter allows tradeoffs between small size (compactness) and low false positives (accuracy). To keep false positives low, the size of the Bloom filter must be dimensioned a priori to be linear in the maximum number of keys inserted, with the linearity constant typically ranging from one to a few bytes.
Fast matching of arbitrary identifiers to values is a basic requirement for a large number of applications. Data objects are typically referenced using locally or globally unique identifiers. Recently, many distributed systems have been developed that use probabilistically globally unique random bit strings as node identifiers. For example, a node may track a large number of peers that advertise files or parts of files, so fast mapping from host identifiers to object identifiers and vice versa is needed. The number of these identifiers in memory may be large, which motivates the development of fast and compact matching algorithms. Given that there are millions or even billions of data elements, efficient solutions for storing, updating, and querying them become increasingly important.

The key idea behind the data structures discussed in this survey is that by allowing the representation of the set of elements to lose some information, in other words to become lossy, the storage requirements can be significantly reduced. Bloom filters date back to 1970, when Bloom first described a compact probabilistic data structure that was used to represent words in a dictionary. There was little interest in using Bloom filters for networking until 1995, after which the area gained widespread interest in both academia and industry.

A Bloom filter is used to test whether an element is present in a set or not. Its main properties are:
1. The space needed to store the Bloom filter is much smaller than the amount of data belonging to the set being tested.
2. The time needed to check whether an element is a member of a given set is independent of the number of elements contained in the set.
3. False negatives are not possible.
4. False positives are possible, but their frequency can be controlled.
In practice, there is a tradeoff between space/time efficiency and the false positive frequency.

II. BLOOM FILTER
Whenever a list or set is used and space is at a premium, consider using a Bloom filter, provided the effect of false positives can be mitigated. A Bloom filter is an array of m bits representing a set S = {x_1, x_2, ..., x_n} of n elements. Initially all the bits in the filter are set to zero. The key idea is to use k hash functions, h_i(x), 1 ≤ i ≤ k, to map items x ∈ S to random numbers uniform in the range 1, ..., m. The hash functions are assumed to be uniform. The MD5 hash algorithm is a popular choice for the hash functions. An element x ∈ S is inserted into the filter by setting the bits h_i(x) to one for 1 ≤ i ≤ k. Conversely, y is assumed to be a member of S if all the bits h_i(y) are set, and is guaranteed not to be a member if any bit h_i(y) is not set. The weak point of Bloom filters is the possibility of a false positive. False positives are elements that are not part of S but are reported as being in the set by the filter.

Fig 1. Overview of Bloom Filters
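The insert and query operations described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a tuned implementation: the class name `BloomFilter`, the method names, and the way the k hash functions are derived (MD5 over the element prefixed with the hash index) are choices made for this example only, and bit positions run from 0 to m − 1 rather than 1 to m.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter: an m-bit array probed by k hash functions."""

    def __init__(self, m, k):
        self.m = m                  # number of bits in the filter
        self.k = k                  # number of hash functions
        self.bits = bytearray(m)    # one byte per bit, for readability only

    def _positions(self, item):
        # Assumption for this sketch: the i-th hash function is MD5 applied
        # to the item prefixed with the index i, reduced modulo m.
        for i in range(self.k):
            digest = hashlib.md5(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest, "big") % self.m

    def add(self, item):
        # Insertion sets every bit h_i(item) to one.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # True means "possibly in the set" (false positives can occur);
        # False means "definitely not in the set".
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter(m=1024, k=4)
for word in ("alpha", "beta", "gamma"):
    bf.add(word)
```

After these insertions, `bf.might_contain("alpha")` is always true, since false negatives cannot occur. A query for a word that was never inserted usually returns false, but it may return true if all k of its bit positions happen to have been set by other elements.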
The Bloom filter relies on hashing to search for the best document. When the Bloom filter receives a query from a node, it applies multiple hash functions to the query and, as a result, maps the query to URLs. A BF is a lossy but succinct and efficient data structure for representing a set S, which can efficiently process membership queries such as “is element x in set S?”

III. HASHING TECHNIQUES
Here I briefly present hashing techniques that are the basis for good Bloom filter implementations. I start with perfect hashing, which is an alternative to Bloom filters when the set is known beforehand and is static. Double hashing allows reducing the number of true hash computations. Partitioned hashing and multiple hashing deal with how bits are allocated in a Bloom filter. Finally, the use of simple hash functions is considered.

A. Perfect Hashing Scheme: A simple technique called perfect hashing (or explicit hashing) can be used to store a static set S of values in an optimal manner using a perfect hash function. A perfect hash function is a computable bijection from S to an array of |S| = n hash buckets. The n-size array can be used to store the information associated with each element x ∈ S [1]. Bloom-filter-like functionality can be obtained by, given a set of elements S, first finding a perfect hash function P and then storing at each location an f = ⌈log2(1/ε)⌉ bit fingerprint, computed using some (pseudo-)random hash function H.

B. Double Hashing: The improvement of the double hashing technique over basic hashing is the ability to generate k hash values from only two universal hash functions used as base generators (or “seed” hashes).
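A minimal sketch of this idea in Python follows. It assumes the simplest combination rule, in which the i-th position is the first base hash plus i times the second, taken modulo the filter size m. The function names are hypothetical, and deriving the two base hashes from two halves of one SHA-256 digest is an assumption made to keep the example short; it is not the universal hash family the text refers to.

```python
import hashlib


def base_hashes(item):
    """Return two base hash values for item (illustrative derivation)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    h1 = int.from_bytes(digest[:8], "big")
    # Force h2 to be odd so that the step between positions is never zero.
    h2 = int.from_bytes(digest[8:16], "big") | 1
    return h1, h2


def double_hash_positions(item, k, m):
    """Derive k bit positions in [0, m) from only two base hash values."""
    h1, h2 = base_hashes(item)
    return [(h1 + i * h2) % m for i in range(k)]


positions = double_hash_positions("example-key", k=5, m=1000)
```

Only one digest is computed per element, however large k is; the remaining positions cost one multiplication, one addition, and one modulo operation each.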
As a practical consequence, Bloom filters can be built with fewer hashing operations without sacrificing performance. Kirsch and Mitzenmacher have shown [2] that only two independent hash functions, $h_1(x)$ and $h_2(x)$, are required to generate additional “pseudo” hashes defined as:

$h_i(x) = h_1(x) + f(i) \cdot h_2(x)$

where i is the hash value index, f(i) can be any arbitrary function of i (e.g., $i^2$), and x is the element being hashed. For Bloom filter operations, the double hashing scheme reduces the number of true hash computations from k down to two without any increase in the asymptotic false positive probability [2].

C. Partitioned Hashing: In this hashing technique, the k hash functions are allocated disjoint ranges of m/k consecutive bits instead of the full m-bit array space. The probability of a specific bit being 0 in a partitioned Bloom filter can be approximated by:

$(1 - k/m)^n \approx e^{-kn/m}$

While the asymptotic performance remains the same, in practice partitioned Bloom filters exhibit poorer false positive performance, as they tend to have larger fill factors (more 1s) due to the m/k bit range restriction. This can be explained by the observation that:

$(1 - 1/m)^{kn} > (1 - k/m)^n$

that is, a given bit is more likely to remain 0 in a standard Bloom filter than in a partitioned one.

In the perfect hashing scheme of Section III.A, lookup of x simply consists of computing P(x) and checking whether the stored hash function value matches H(x). When x ∈ S, the correct value is always returned, and when x does not belong to S, a false positive (claiming that the element is in S) occurs with probability at most ε. This follows from the definition of 2-universal hashing by Carter and Wegman [3]: any element y not in S has probability at most ε of having the same hash function value h(y) as the element in S that maps to the same entry of the array. While space efficient, this approach is not considered suitable for dynamic environments, because the perfect hash function must be recomputed whenever S changes.

D. Multiple Hashing: Multiple hashing is a popular technique that exploits the notion of having multiple hash choices and having the power to choose the most convenient candidate.
When applied to hash table constructions, multiple hashing provides a probabilistic method to limit the effects of collisions by distributing elements more or less evenly. The original idea was proposed by Azar et al. in their seminal work on balanced allocations [4]. Formulating hashing as a balls-into-bins problem, the authors consider n balls placed sequentially into m bins, for m = O(n), with each ball being placed in one of a constant d = 2 randomly chosen bins.

E. Simple Hash Functions: A common assumption is to consider output hash values as truly random, that is, each hashed element is independently mapped to a uniform location. While this is a great aid to theoretical analyses, hash function implementations are known to behave far worse than truly random ones. On the other hand, empirical works using standard universal hashing have reported negligible differences in practical performance compared to predictions assuming ideal hashing (see [6] for the case of Bloom filters).

IV. BLOOM FILTER VARIANTS
A number of Bloom filter variants have been proposed that address some of the limitations of the original structure, including counting, deletion, multisets, and space efficiency. We take up a few variants here.

A. Compressed Bloom Filter: Compressing a Bloom filter improves performance when the filter is passed in a message between distributed nodes. This structure is particularly useful when information must be transmitted repeatedly and bandwidth is a limiting factor [7]. Compressed Bloom filters are used only for optimizing the transmission size of the filters over the network. This is motivated by applications such as Web caches and P2P information sharing, which frequently use Bloom filters to distribute routing tables. The key idea in compressed Bloom filters is that by changing the way bits are distributed in the filter, it can be compressed for transmission purposes. This is achieved by choosing the number of hash functions k in such a way that the entries in the m-bit vector have a probability smaller than 1/2 of being set.

B. Spectral Bloom Filters: Spectral Bloom filters generalize Bloom filters to storing an approximate multiset and support frequency queries [8]. The membership query is generalized to a query on the multiplicity of an element. The answer to any multiplicity query is never smaller than the true multiplicity, and greater only with probability ε. In this sense, spectral refers to the range within which multiplicity answers are given. The space usage is similar to that of a Bloom filter for a set of the same size (including the counters to store the frequency values). The time needed to determine a multiplicity of k is O(log k). The query time is Θ(log(1/ε)). The answer estimate is given by returning the minimum value of the k counters determined by the hash functions.

C. Space Code Bloom Filter: Per-flow traffic measurement is crucial for usage accounting, traffic engineering, and anomaly detection.
Previous methodologies are either based on random sampling (e.g., Cisco’s NetFlow), which is inaccurate, or only account for the “elephants”. A data structure called Space Code Bloom Filter (SCBF) can be used to measure per-flow traffic approximately at high speeds. SCBF employs a Maximum Likelihood Estimation (MLE) method to measure the multiplicity of an element in the multiset.

D. Decaying Bloom Filter: Some applications require approximate detection of duplicates among newly arrived data elements of a data stream, which can be accomplished within a fixed time window. The Decaying Bloom Filter (DBF) structure has been proposed for this application scenario. DBF is an extension of the counting Bloom filter, and it supports the removal of stale elements from the structure as new elements are inserted. DBF may produce false positive errors, but not false negatives, as is the case with the basic Bloom filter. A variant of DBF has been applied to hint-based routing in wireless sensor networks [9].

V. BLOOM FILTERS IN DISTRIBUTED COMPUTING
We have surveyed techniques for probabilistic representation of sets and functions. The applications of these structures are manifold, and they are widely used in various networking systems, such as Web proxies and caches, database servers, and routers.

A. Caching
Bloom filters have been applied extensively to caching in distributed environments. To take an early example, Fan, Cao, Almeida, and Broder proposed the Summary Cache [10], [11] system, which uses Bloom filters for the distribution of Web cache information. The system consists of cooperative proxies that store and exchange summary cache data structures, essentially Bloom filters. When a local cache miss happens, the proxy in question tries to find out whether another proxy has a copy of the Web resource by using the summary cache. If another proxy has a copy, then the request is forwarded there.
In order for distributed proxy-based caching to work well, the proxies need a way to compactly summarize available content. In the Summary Cache system, proxies periodically transfer the Bloom filters that represent the cache contents (URL lists).

Another example is Google’s Bigtable system, which is used by many massively popular Google services, such as Google Maps, Google Earth, and Web indexing. Bigtable is a distributed storage system for structured data that has been designed with high scalability requirements in mind, for example the capability to store and query petabytes of data across thousands of commodity servers [12]. A Bigtable is a sparse multidimensional sorted map. The map is indexed by a row key, a column key, and a timestamp. Each value in the map is an uninterpreted array of bytes. Bigtable uses Bloom filters to reduce the disk lookups for non-existent rows or columns [12]. As a result, queries rely less on costly disk operations, and performance increases.

B. P2P Networks
Bloom filters have been extensively applied in P2P environments for various tasks, such as compactly storing keyword-based searches and indices [13], synchronizing sets over the network, and summarizing content. The exchange of keyword lists and other metadata between peers is crucial for P2P networks. Ideally, the state should be such that it allows for accurate matching of queries and takes sublinear space (or near constant space). The later versions of the Gnutella protocol use Bloom filters [14] to represent the keyword lists in an efficient manner. In Gnutella, each leaf node sends its keyword Bloom filter to an ultra-node, which can then produce a summary of all the filters from its leaves and send it to neighbouring ultra-nodes. The ultra-nodes are hubs of connectivity, each being connected to more than 32 other ultra-nodes.

Fig 2. Two-tier Gnutella architecture

D. Monitoring and Measurement
Network monitoring and measurement are key application areas for Bloom filters and their variants. We briefly examine some key cases in this domain, for example detection of heavy flows, Iceberg queries, packet attribution, and approximate state machines.

A. Heavy Flows: Bloom filters have found many applications in the measurement of network traffic. One particular application is the detection of heavy flows in a router. Heavy flows can be detected with a relatively small amount of space and a small number of operations per packet by hashing incoming packets into a variant of the counting Bloom filter and incrementing the counter at each hashed position by the size of the packet. If the minimum counter then exceeds some threshold value, the flow is marked as a heavy flow [15].

B. Iceberg Queries: An Iceberg query is one that identifies all items with frequency above some given threshold. Bloom filter variants that are able to count elements are good candidate structures for supporting Iceberg queries. In networking, low-memory approximate histogram structures are needed for collecting network statistics at runtime. For example, in some applications it is necessary to track flows across domains and perform, to name a few examples, congestion and security monitoring. Iceberg queries can be used to detect Denial-of-Service attacks.

VI. BLOOM FILTERS FOR SIMILARITY TESTING
Observe that we can view each document as a set, in Bloom filter parlance, whose elements are the content-defined chunks (CDCs) that it is composed of.
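This set view can be illustrated with a small Python sketch: each document is reduced to a set of chunks, the chunks are inserted into a Bloom filter, and two filters are compared with a bit-wise AND. Several choices here are assumptions for illustration only: whitespace-separated words stand in for CDCs, the filter is stored as a Python integer, and the function names and the parameters m = 256 and k = 3 are hypothetical.

```python
import hashlib


def bloom_bits(chunks, m=256, k=3):
    """Return the Bloom filter of a set of chunks as an m-bit integer."""
    bits = 0
    for chunk in chunks:
        for i in range(k):
            digest = hashlib.sha256(f"{i}:{chunk}".encode("utf-8")).digest()
            bits |= 1 << (int.from_bytes(digest, "big") % m)
    return bits


def shared_ones(filter_a, filter_b):
    """Count the set bits common to both filters (bit-wise AND)."""
    return bin(filter_a & filter_b).count("1")


def probably_includes(outer, inner):
    """True if every set bit of inner is also set in outer."""
    return (inner & ~outer) == 0


# Stand-in chunking: words instead of content-defined chunks.
doc_a = set("the quick brown fox jumps over the lazy dog".split())
doc_b = set("the quick brown fox".split())

filter_a = bloom_bits(doc_a)
filter_b = bloom_bits(doc_b)
```

Because every chunk of `doc_b` also occurs in `doc_a`, `probably_includes(filter_a, filter_b)` is true. The converse does not hold in general: a subset relation between the bits only makes inclusion highly probable, since unrelated chunks can set the same bits.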
Given that Bloom filters compactly represent a set, they can also be used to approximately match two sets. Bloom filters cannot be used for exact matching, as they have a finite false-match probability, but they are naturally suited to similarity matching. To find similar documents, we compare the Bloom filter of one with that of the other. If the two filters share a large number of 1s (bit-wise AND), the documents are marked as similar. In this case, the bit-wise AND can also be perceived as the dot product of the two bit vectors. If the set bits in the Bloom filter of one document are a subset of those in the filter of another, then it is highly probable that the first document is included in the other. Web pages are typically composed of fragments, either static (e.g., logo images) or dynamic (e.g., personalized product promotions, local weather) [16]. When targeting pages for a similarity-based “grouping”, the test for similarity should be on the fragment of interest and not on the entire page.

Bloom filters, when applied to similarity detection, have several advantages. First, the compactness of Bloom filters is very attractive for storage and transmission whenever we want to minimize metadata overheads. Second, Bloom filters enable fast comparison, as matching is a bit-wise AND operation. Third, since Bloom filters are a complete representation of a set rather than a deterministic sample (e.g., shingling), they can determine inclusions effectively.

VII. SUMMARY
Bloom filters are a general aid for network processing and for improving the performance and scalability of distributed systems. The space required by a Bloom filter is much smaller than the size of the data in the element set. Compressed Bloom filters are used to reduce the amount of data transmitted in a distributed system, which in turn improves the performance of the system.
Bloom filters can be used on large data sets to determine whether a particular element is present in the set or not. Per-flow traffic can be measured under heavy load by using space code Bloom filters. Spectral Bloom filters generalize Bloom filters to storing an approximate multiset and support frequency queries; spectral refers to the range within which multiplicity answers are given. The decaying Bloom filter supports the removal of stale elements from the structure as new elements are inserted. It may produce false positive errors but not false negatives, as in the case of the basic Bloom filter.

REFERENCES
[1] A. Pagh, R. Pagh, and S. S. Rao, “An Optimal Bloom Filter Replacement,” in SODA ’05: Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms. Philadelphia, PA, USA: Society for Industrial and Applied Mathematics, 2005, pp. 823–829.
[2] A. Kirsch and M. Mitzenmacher, “Less Hashing, Same Performance: Building a Better Bloom Filter,” in ESA ’06: Proceedings of the 14th Annual European Symposium on Algorithms. London, UK: Springer-Verlag, 2006, pp. 456–467.
[3] J. L. Carter and M. N. Wegman, “Universal Classes of Hash Functions (Extended Abstract),” in STOC ’77: Proceedings of the Ninth Annual ACM Symposium on Theory of Computing. New York, NY, USA: ACM, 1977, pp. 106–112.
Y. Azar, A. Z. Broder, A. R. Karlin, and E. Upfal, “Balanced Allocations,” SIAM J. Comput., vol. 29, no. 1, pp. 180–200, 2000.
[4] B. Vöcking, “How Asymmetry Helps Load Balancing,” J. ACM, vol. 50, no. 4, pp. 568–589, 2003.
[5] M. Mitzenmacher, “Compressed Bloom Filters,” in PODC ’01: Proceedings of the Twentieth Annual ACM Symposium on Principles of Distributed Computing. New York, NY, USA: ACM, 2001, pp. 144–150.
[6] L. Fan, P. Cao, J. Almeida, and A. Z. Broder, “Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol,” SIGCOMM Comput. Commun. Rev., vol. 28, no. 4, pp. 254–265, 1998.
[7] “Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol,” IEEE/ACM Trans. Netw., vol. 8, no. 3, pp. 281–293, 2000.
[8] F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber, “Bigtable: A Distributed Storage System for Structured Data,” in OSDI ’06: Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation. Berkeley, CA, USA: USENIX Association, 2006, pp. 15–15.
[9] J. Risson and T. Moors, “Survey of Research Towards Robust Peer-to-Peer Networks: Search Methods,” Comput. Netw., vol. 50, no. 17, pp. 3485–3521, 2006.
[10] H. Cai, P. Ge, and J. Wang, “Applications of Bloom Filters in Peer-to-Peer Systems: Issues and Questions,” in NAS ’08: Proceedings of the 2008 International Conference on Networking, Architecture, and Storage. Washington, DC, USA: IEEE Computer Society, 2008, pp. 97–103.
[11] W.-C. Feng, K. G. Shin, D. D. Kandlur, and D. Saha, “The Blue Active Queue Management Algorithms,” IEEE/ACM Trans. Netw., vol. 10, no. 4, pp. 513–528, 2002.
[12] A. Z. Broder and A. R. Karlin, “Multilevel Adaptive Hashing,” in SODA ’90: Proceedings of the First Annual ACM-SIAM Symposium on Discrete Algorithms. Philadelphia, PA, USA: Society for Industrial and Applied Mathematics, 1990, pp. 43–53.