Content based filtering,
Pub/Sub
& Bloom filter
Presented By: Yara Ali
Agenda
• Introduction
• Human Networks (HUMNETs)
• Content-based Publish/Subscribe
• Pub-sub service network
• Bloom filter-based pub-sub “B-SUB”
• B-SUB Components
• Bloom Filters (BF)
• How Bloom filters work?
• Temporal counting Bloom filter (TCBF)
• Problems with TCBF
• Decaying Factor
Introduction
• Distributed system: a system consisting of several connected
computers that appear to be one computing entity.
Introduction .. Cont,
Communication mechanisms:
• Client / Server Architecture
• Remote Procedure Call (RPC)
• Message-Oriented Middleware (MOM)
• Message Queues
• Tuple Space
• Publish / Subscribe Architecture
Introduction .. Cont,
• Publish / Subscribe Architectures
1- Lists at server:
Middleware is at the servers
2- Message broker:
Middleware is in a separate unit
3- Broadcast & filter at client:
Middleware is at the clients
Human Networks (HUMNETs)
• A HUMNET is a dynamic network composed of
human-carried wireless devices.
• Applications in HUMNETs require content-based
networking services: a style of
communication that associates source
and destination pairs based on actual
content and interests, rather than letting
source nodes specify the destination.
Content-based Publish/Subscribe (CBPS)
• Content-based matching is the problem of finding all the
subscriptions that match a given notification.
• CBPS represents a compromise between the extremes
of publisher-side filtering of messages (with events
transmitted directly to interested subscribers) and
subscriber-side filtering of messages (with events
broadcast to all subscribers).
• Event delivery is the task of delivering the notification to
the set of interested subscribers selected by content-based matching.
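As a minimal sketch of content-based matching, the snippet below treats a notification as a dictionary of attributes and a subscription as a predicate over that dictionary. The names (match, subscriptions, the attribute names "topic" and "priority") are hypothetical and chosen only for illustration; the slides do not prescribe a subscription language.

```python
from typing import Callable, Dict, List

Notification = Dict[str, object]
Predicate = Callable[[Notification], bool]


def match(notification: Notification, subscriptions: Dict[str, Predicate]) -> List[str]:
    """Return the (sorted) subscribers whose subscription matches the notification."""
    return sorted(name for name, pred in subscriptions.items() if pred(notification))


# Hypothetical subscriptions: each one is a predicate on the notification content.
subscriptions: Dict[str, Predicate] = {
    "alice": lambda n: n.get("topic") == "sports",
    "bob": lambda n: n.get("topic") == "news" and n.get("priority", 0) >= 2,
    "carol": lambda n: "priority" in n,
}

event = {"topic": "news", "priority": 3}
interested = match(event, subscriptions)  # event delivery would target these subscribers
```

The publisher never names a destination here; the set of receivers follows from the content of the event alone.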
Pub-sub service network
• Two approaches:
1. Filter-based approach:
Performs content-based filtering on
intermediate routing servers to dynamically
guide routing decisions.
2. Multicast-based approach:
Delivers events through a few high-quality
multicast groups that are pre-constructed to
approximately match user interests.
Pub-sub service network…Cont,
• In the filter-based approach, routing decisions are made
via successive content-based filtering at all nodes from source
to destination: every pub-sub server along the way matches
the event with remote subscriptions from other servers and
then forwards it only toward directions that lead to matching
subscriptions.
• In the multicast-based approach, a limited number of
multicast groups are computed before event transmission
begins. For each event, the routing decision is made only once
at the publisher, mapping the event into the single appropriate
group. The event is then multicast to the group, assuming IP
multicast or application-level multicast support. Because only
a limited number of multicast groups can be built, servers with
different interests may be clustered into the same group, and
events may be sent to uninterested servers as well.
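The difference between the two approaches can be sketched as follows. This is only an illustration under simplifying assumptions: interests are plain topic strings, and the tables (remote_subs, topic_to_group, group_members) and server names are hypothetical.

```python
from typing import Dict, List, Set


def filter_based_next_hops(topic: str, remote_subs: Dict[str, Set[str]]) -> List[str]:
    """One server's decision: forward only toward directions with a matching subscription."""
    return sorted(hop for hop, topics in remote_subs.items() if topic in topics)


def multicast_receivers(topic: str,
                        topic_to_group: Dict[str, str],
                        group_members: Dict[str, Set[str]]) -> List[str]:
    """Publisher's single decision: map the event to one pre-built group."""
    return sorted(group_members[topic_to_group[topic]])


# Filter-based: subscriptions known per outgoing direction at this server.
remote_subs = {"north": {"sports", "news"}, "south": {"music"}, "east": {"news"}}

# Multicast-based: s1 wants "news", s2 wants "sports", but both share group g1.
topic_to_group = {"news": "g1", "sports": "g1"}
group_members = {"g1": {"s1", "s2"}}

hops = filter_based_next_hops("news", remote_subs)
receivers = multicast_receivers("news", topic_to_group, group_members)
```

In the multicast case s2 receives the "news" event although it only asked for "sports", which is the cost of clustering servers with different interests into one group.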
Bloom filter-based pub-sub “B-SUB”
• B-SUB is a content-based publish-subscribe
system.
• In B-SUB, messages are identified by
strings that summarize their
contents (called keys).
Bloom filter-based pub-sub “B-SUB” …Cont,
• Pub/sub paradigm is used in B-SUB
Bloom filter-based pub-sub “B-SUB” …Cont,
• Advantages:
1- Frees users from addressing & routing tasks,
which reduces the overall overhead in the system.
2- Message producers & consumers are
decoupled.
3- Messages are forwarded only by brokers,
which perform content matching for the users.
B-SUB Components
B-SUB consists of:
• Broker Allocation
• TCBF
• Pub – Sub forwarding
  – Interests propagation
  – Message forwarding
B-SUB Components … Cont,
1- Broker Allocation:
• A group of socially active nodes is selected to be brokers.
• Normal users don’t participate in interest propagation & message
forwarding.
• Brokers are responsible for collecting subscriptions and forwarding
messages.
• A broker stores a TCBF, called a relay filter, for propagating other
users’ interests.
2- Pub – Sub forwarding:
• It’s separated into 2 parts: interests propagation and message
forwarding.
Bloom Filters (BF)
• A Bloom filter is a space-efficient probabilistic data structure
for representing a set; it is used to test whether an element is a
member of the set.
• A BF maps a key through multiple hash functions into a bit
vector, setting a few bits. In B-SUB, users’ interests are
represented as keys, and the strings that identify messages by
summarizing their contents are also called keys.
Bloom Filters (BF) … Cont,
• The locations of the set bits are determined by
the hash functions.
• A query for a key checks whether all the
hashed bits of the key are set. If any of them
is not set, the key is definitely not in the BF;
if all are set, the key is probably in the BF.
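A minimal sketch of insertion and querying is shown below. The filter size (64 bits), the number of hash functions (3), and the use of salted SHA-256 to derive bit positions are assumptions made for illustration only; the slides do not specify them.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Basic Bloom filter over string keys (sizes and hashing are illustrative)."""

    def __init__(self, num_bits: int = 64, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [0] * num_bits

    def _positions(self, key: str) -> Iterator[int]:
        # Simulate several hash functions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def insert(self, key: str) -> None:
        # Set every bit the key hashes to.
        for pos in self._positions(key):
            self.bits[pos] = 1

    def query(self, key: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos] for pos in self._positions(key))


interests = BloomFilter()
interests.insert("sports")
interests.insert("news")
```

A key that was inserted always queries as present; a key that was never inserted usually queries as absent, but may occasionally match by accident.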
Bloom Filters (BF) … Cont,
• A BF for a set of keys is obtained by sequentially
inserting keys into the filter.

[Figure: inserting K0 and then K1 into the bit vector gives the filter for {K0, K1}.]
Bloom Filters (BF) … Cont,
• To merge multiple BFs we do a bit-wise
OR on them.
[Figure: the bit-wise OR of the filters for {K0} and {K1} gives the filter for {K0, K1}.]
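The merge can be sketched as follows, with a filter represented as a plain list of bits. The sizes and the salted SHA-256 hashing are illustrative assumptions, and the helper names (positions, make_filter, merge, query) are hypothetical.

```python
import hashlib
from typing import Iterable, List

NUM_BITS = 64    # assumed filter size
NUM_HASHES = 3   # assumed number of hash functions


def positions(key: str) -> List[int]:
    """Bit positions selected for a key by the salted hash functions."""
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()[:8], "big") % NUM_BITS
        for i in range(NUM_HASHES)
    ]


def make_filter(keys: Iterable[str]) -> List[int]:
    """Build a filter by inserting the keys one after another."""
    bits = [0] * NUM_BITS
    for key in keys:
        for pos in positions(key):
            bits[pos] = 1
    return bits


def merge(a: List[int], b: List[int]) -> List[int]:
    """Merge two filters of equal size with a bit-wise OR."""
    return [x | y for x, y in zip(a, b)]


def query(bits: List[int], key: str) -> bool:
    return all(bits[pos] for pos in positions(key))


merged = merge(make_filter(["K0"]), make_filter(["K1"]))
```

Merging the two single-key filters gives exactly the filter obtained by inserting both keys into one empty filter, provided both filters use the same size and hash functions.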
Bloom Filters (BF) … Cont,
• The basic BF doesn’t support deletions since we are
unable to trace the associated keys of set bits.
• The counting Bloom filter (CBF) was proposed to support
deletion.
• In a CBF each bit is associated with a counter, which
represents the number of keys that are associated with it.
• To delete a key from a CBF we decrement the counters
of the key’s hashed bits. A bit will be reset once its
counter reaches 0.
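A counting Bloom filter along these lines can be sketched as below. The sizes and hashing scheme are illustrative assumptions, and the guard in delete (refusing to delete a key that does not query as present) is a design choice of this sketch rather than something stated in the slides.

```python
import hashlib
from typing import List


class CountingBloomFilter:
    """Bloom filter whose bits are replaced by counters, so keys can be deleted."""

    def __init__(self, num_bits: int = 64, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.counters = [0] * num_bits

    def _positions(self, key: str) -> List[int]:
        return [
            int.from_bytes(hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()[:8], "big")
            % self.num_bits
            for i in range(self.num_hashes)
        ]

    def insert(self, key: str) -> None:
        for pos in self._positions(key):
            self.counters[pos] += 1

    def query(self, key: str) -> bool:
        # A bit counts as "set" while its counter is above zero.
        return all(self.counters[pos] > 0 for pos in self._positions(key))

    def delete(self, key: str) -> bool:
        # Only delete keys that appear to be present, so counters never go negative.
        if not self.query(key):
            return False
        for pos in self._positions(key):
            self.counters[pos] -= 1
        return True


cbf = CountingBloomFilter()
cbf.insert("sports")
cbf.insert("news")
cbf.delete("sports")  # "news" remains queryable afterwards
```

Deleting a key never removes another inserted key, because a counter shared by two keys only drops to zero after both have been deleted.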
Bloom Filters (BF) … Cont,
• Messages in B-SUB are assumed to be small, on the
order of hundreds of bytes. This assumption holds in
social networking applications.
• Ex: Twitter, a popular micro-blogging application,
capped each post at 140 characters. If
a message is wrongly injected into the network,
the wasted bandwidth is therefore acceptable.
How Bloom filters work?
“Message Forwarding”
• When a producer meets a consumer, the
consumer reports its interests in a BF to the
producer. The producer then queries all its
messages against the filter, and forwards all the
messages that match the filter, to the consumer.
• When a broker meets a producer, it forwards a
BF to the producer. The producer queries that
filter and determines the events that need to be
transmitted.
How Bloom filters work?
“Message Forwarding” … Cont,
• When a broker meets a consumer, the broker requests a
BF containing the consumer’s interests, then forwards
the matched messages to the consumer.
• Messages are removed from brokers’ memory after being
forwarded. This prevents excessive copies in the
network.
• A message’s lifetime is controlled by its time-to-live
(TTL) value, which is identical to its maximum
tolerable delay. The TTL is counted from the time the
message was created.
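The broker-to-consumer step can be sketched as follows. This is a simplified illustration, not B-SUB's actual implementation: the Bloom filter is represented as the set of indices of its set bits, and the names (Message, Broker, meet_consumer, interest_filter) as well as the sizes and hashing are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterable, List, Set

NUM_BITS = 64    # assumed filter size
NUM_HASHES = 3   # assumed number of hash functions


def positions(key: str) -> Set[int]:
    return {
        int.from_bytes(hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()[:8], "big") % NUM_BITS
        for i in range(NUM_HASHES)
    }


def interest_filter(keys: Iterable[str]) -> Set[int]:
    """A consumer's BF, stored as the set of indices of its set bits."""
    bits: Set[int] = set()
    for key in keys:
        bits |= positions(key)
    return bits


def matches(bf: Set[int], key: str) -> bool:
    return positions(key) <= bf


@dataclass
class Message:
    key: str
    body: str
    created_at: float
    ttl: float  # maximum tolerable delay, counted from creation

    def expired(self, now: float) -> bool:
        return now - self.created_at > self.ttl


class Broker:
    def __init__(self) -> None:
        self.buffer: List[Message] = []

    def meet_consumer(self, consumer_bf: Set[int], now: float) -> List[Message]:
        # Drop expired messages, forward the matching ones, keep the rest.
        live = [m for m in self.buffer if not m.expired(now)]
        forwarded = [m for m in live if matches(consumer_bf, m.key)]
        self.buffer = [m for m in live if not matches(consumer_bf, m.key)]
        return forwarded
```

Forwarded messages leave the broker's buffer, which mirrors the rule that a message is removed from the broker's memory once it has been forwarded.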
Temporal counting Bloom filter
(TCBF)
• An extension of the BF, proposed to perform content-based networking
tasks.
• It doesn’t support direct deletion of elements. It only supports
temporal deletion: the filter constantly decrements the
counter values of all its set bits, which is called decaying.
• B-SUB uses the TCBF to encode users’ interests & to embed the information
brokers need to make forwarding decisions.
• B-SUB makes forwarding decisions by querying the TCBFs.
(B-SUB can propagate interests by transmitting at most two TCBFs
of dozens of bytes.)
• The only operations performed are hashing and table lookup.
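The decaying behaviour can be sketched as below. The details are assumptions of this sketch rather than the authors' exact design: inserting a key sets its counters to a fixed initial value, and decay subtracts the decaying factor multiplied by the elapsed time from every counter, never going below zero. The sizes and hashing are illustrative as well.

```python
import hashlib
from typing import List


class TemporalCountingBloomFilter:
    """Counting filter whose counters decay over time instead of being deleted."""

    def __init__(self, num_bits: int = 64, num_hashes: int = 3,
                 initial_value: float = 10.0) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.initial_value = initial_value  # assumed counter value on insertion
        self.counters = [0.0] * num_bits

    def _positions(self, key: str) -> List[int]:
        return [
            int.from_bytes(hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()[:8], "big")
            % self.num_bits
            for i in range(self.num_hashes)
        ]

    def insert(self, key: str) -> None:
        # (Re)inserting a key refreshes its counters to the initial value.
        for pos in self._positions(key):
            self.counters[pos] = self.initial_value

    def decay(self, elapsed: float, decaying_factor: float) -> None:
        # Temporal deletion: every counter shrinks with time.
        amount = decaying_factor * elapsed
        self.counters = [max(0.0, c - amount) for c in self.counters]

    def query(self, key: str) -> bool:
        return all(self.counters[pos] > 0 for pos in self._positions(key))


relay = TemporalCountingBloomFilter(initial_value=10.0)
relay.insert("sports")
relay.decay(elapsed=4.0, decaying_factor=2.0)  # counters drop from 10 to 2
```

An interest that is not refreshed by another encounter eventually decays to zero and disappears from the filter, while an interest that keeps being reinserted stays.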
Problems with TCBF
• False positives (spam) occur because a
key’s hashed bits are accidentally set by
other keys that have been put into the
TCBF.
• Because of false positives, B-SUB may
wrongly inject useless messages into the
network.
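For a plain Bloom filter, the standard estimate of the false positive probability is given below. It is not taken from the slides and assumes independent, uniformly distributed hash functions; the symbols m (number of bits), k (number of hash functions), and n (number of inserted keys) are introduced here for illustration.

```latex
p \;\approx\; \left(1 - e^{-kn/m}\right)^{k}
```

For example, with m = 256 bits, k = 4 hash functions, and n = 20 inserted keys, kn/m = 0.3125 and p is roughly 0.005, so about one query in two hundred for an absent key would match by accident.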
Decaying Factor (DF)
• The DF is the key parameter for adjusting B-SUB’s behavior.
• If decaying is not used, the counters of the set
bits don’t change after being set, so no
interests are ever removed.
• An obvious consequence is that a broker will
end up carrying the interests of users
that it rarely meets.
Decaying Factor (DF)…Cont,
• Suppose that each message has a delay limit of time T.
We should set the DF so that an interest
is removed after time T has passed since a consumer
last inserted it.
• If the broker contains the interest, then the broker
has met a consumer that is interested in it
within T.
• If a message is forwarded by the broker, it’s likely that the
message will be delivered within T.
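One way to read this rule is sketched below, assuming a linear decay in which a counter starts at a fixed initial value C and loses DF units per unit of time; under that assumption DF = C / T. The function names and the linear model are hypothetical and serve only to make the arithmetic concrete.

```python
def decaying_factor(initial_value: float, delay_limit: float) -> float:
    """DF such that a counter set to initial_value reaches zero after delay_limit."""
    if delay_limit <= 0:
        raise ValueError("delay_limit must be positive")
    return initial_value / delay_limit


def remaining(initial_value: float, df: float, elapsed: float) -> float:
    """Counter value after `elapsed` time units of linear decay (never below zero)."""
    return max(0.0, initial_value - df * elapsed)


df = decaying_factor(initial_value=10.0, delay_limit=5.0)  # 2.0 units per time unit
```

With these numbers an interest inserted once is still present after 2 time units (counter 6.0) and is removed exactly at T = 5.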
Thank You !
