Tutorial: Tackling Variety in Event-Based Systems
Souleiman Hasan, Edward Curry
DEBS 2015
About Us
Souleiman Hasan
PhD Researcher
Insight @ NUI Galway
Interests:
Semantic Event Processing
Internet of Things
Semantic Web
souleiman.hasan@insight-centre.org
http://www.souleimanhasan.org/
29 June- 03 July 2015, Oslo, NorwayDEBS ’15 2
Dr. Edward Curry
Unit Leader
Insight @ NUI Galway
Interests:
Event Systems
Energy Intelligence
Internet of Things
ed.curry@insight-centre.org
http://edwardcurry.org/
Overview
13:40-14:00 Part I: Events in the IoT
14:00-14:20 Part II: Computational Paradigms
14:20-14:40 Part III: A Theory for Event Exchange
14:40-15:10 Part IV: Semantics and Approximation
15:10-15:30 Break
15:30-16:00 Part V: Approaches to Semantic Coupling
16:00-16:20 Part VI: Thematic Event Processing
16:20-16:40 Part VII: Building IoT Event Systems
16:40-17:00 Part VIII: Future Research Challenges
Events in the IoT
PART I
Current Trends
Smart Homes, Cities, Internet of Things, Big Data
By 2020, 50 billion devices connected to mobile networks (OECD, 2012)
OECD, 2012. Machine-to-Machine Communications: Connecting Billions of Devices. OECD Digital Economy Papers, No. 192.
From Internet of Things to Internet of
Everything
Web of Things
From Rigid Schemas to Schema-less
Heterogeneous, complex and large-scale data
Very-large and dynamic “schemas”
Open Environments: distributed, decoupled data sources,
anonymous users, multi-domain, lack of global order of
information flow
circa 2000: 10s-100s attributes
circa 2014: 1,000s-1,000,000s attributes
Slide Credits: Andre Freitas
Fundamental Decentralization
Multiple perspectives (conceptualizations) of the reality.
Ambiguity, vagueness, inconsistency.
Slide Credits: Andre Freitas
Shift in Data Production/Consumption
Number of Information Sources: increases
Mobile subscribers grew from 2,205 million in 2005 to 6,662 million in 2013, an increase of about 200% (ITU, 2014)
Data Heterogeneity: increases
Ontologies discovered by Falcons grew from 4,000 in 2008 to 6,400 in 2015 (Cheng et al., 2008)
Number of non-technical Users: increases
Internet users grew from 1,024 million in 2005 to 2,710 million in 2013, an increase of about 160% (ITU, 2014)
USA 3% in STEM disciplines in 2010 (NSF, 2010)
Organization of Users: distributed & decoupled
Wikipedia crowdsourcers global diversity (Ross et al., 2010)
Timeliness: required
It is important to filter important data items as early as possible (Jagadish et al., 2014)
Information Completeness: uncertain
Uncertainty, errors, and missing values are endemic, and must be managed (Jagadish et al., 2014)
Current Trends
Dimension | Small scale, controlled environments | Large scale, open environments
Information sources | 10s to 100s | 1000s to millions
Data heterogeneity | Small number of schemas | High number of schemas
Users | Small number; know the environment | Large number; do not quite know the environment
Users organization | Users know each other; top-down hierarchies (e.g. enterprises) | Decoupled and distributed
Dynamism | Low | High (sources and users join and leave often)
Domain | Domain specific | Users' interests range from domain specific to domain agnostic
INTERNET OF THINGS (IOT)
CASE STUDY
Internet of Things
Sensing and Communication Layer
(RFID, 6LoWPAN, IPv6,…)
Middleware Layer
(MoM, SOA, …)
Applications Layer
(logistics, healthcare, smart envs, analytics,
robo-taxis, virtual reality, …)
Atzori, Luigi, Antonio Iera, and Giacomo Morabito. "The Internet of Things: A Survey." Computer Networks 54.15 (2010): 2787-2805.
Smart City: How Real?
“SmartSantander proposes a unique way in the world city-scale experimental research facility in support of typical applications and services for a smart city”
http://www.smartsantander.eu/
IoT in Light of Big Data
• Significant efforts in IoT come from Sensing and
Communication communities
• Challenges of the IoT will be more prevalent at the data
level (Aggarwal et al., 2013)
IoT in Light of Big Data
• Datasets too large and complex to be processed
by current data processing applications.
• Volume
  - Terabytes, Petabytes…
• Variety
  - Many sources, syntax and semantics.
• Velocity
  - Near real-time, real-time
Stonebraker, Michael. "What Does 'Big Data' Mean?" Communications of the ACM, BLOG@CACM (2012).
Volume
• Distributed and parallel processing
• Hadoop, MapReduce, …
• SPARK
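Example: Word count in Spark
from pyspark import SparkContext

spark = SparkContext(appName="WordCount")  # `spark` is the SparkContext, as in the original snippet
file = spark.textFile("hdfs://...")  # input path elided in the original
counts = (file.flatMap(lambda line: line.split())  # split each line into words
              .map(lambda word: (word, 1))         # pair each word with a count of 1
              .reduceByKey(lambda a, b: a + b))    # sum the counts per word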
Velocity
• Stream and event processing
• Lambda architecture’s speed layer
• SPARK Streaming
• Storm
Variety
• Mainly heterogeneous data models
• Heterogeneous schema in DB terminology
• Currently targeted by
  - ETL
  - Data Integration
  - Semantic Web technologies
• Example
  - SSN Ontology
DISCUSSION POINT
• What current paradigms do you think will
struggle with IoT?
Why Do We Really Have Variety?
• Symbolization process in computing systems
  - Symbols are easier for computers to process
• We use symbols to reduce meanings
• Mostly words
• E.g. Unique Name Assumption in DB community:
  - “Different things and concepts are denoted using different names”
  - California and CA denote different things according to this assumption
  - The whole processing model is symbolic
Why Do We Really Have Variety?
• Same assumption is usually made in event systems
  - Data models in middleware are symbolic
• Decoupled, distributed parties symbolize differently!
Variety Happens!
COMPUTATIONAL PARADIGMS
PART II
Information Flow Processing (IFP)
• Users need to collect information
  - Produced by multiple distributed sources
  - For timely processing
  - To extract knowledge as soon as possible
• Example applications: financial continuous analytics, RFID inventory management, environmental monitoring
CUGOLA, G. AND MARGARA, A., 2011. Processing flows of information: From data stream to complex event processing. ACM Computing Surveys Journal.
Information Flow Processing (IFP)
• Processing information as it flows
  - No intermediate storage
  - New information produced
  - Raw information can be discarded
[Figure: an Information Flow Processing Engine connected to Producers, Consumers, and Rule managers]
Information Flow Processing (IFP)
• Requirements
  - Real-time or near real-time processing
  - Expressive language for rules
  - Scalability to large number of producers and consumers
Active Databases
• Traditional database systems
  - Passive
  - Store data and wait for user’s interaction
  - Reactive behaviour in the application layer
DAYAL, U., BLAUSTEIN, B., BUCHMANN, A., CHAKRAVARTHY, U., HSU, M., LEDIN, R., MCCARTHY,
D., ROSENTHAL, A., SARIN, S., CAREY, M. J., LIVNY, M., AND JAUHARI, R. 1988. The hipac project:
Combining active databases and timing constraints. SIGMOD Rec. 17, 1, 51–70.
LIEUWEN, D. F., GEHANI, N. H., AND ARLEIN, R. M. 1996. The ode active database: Trigger semantics
and implementation. In Proceedings of the 12th International Conference on Data Engineering
(ICDE’96). IEEE Computer Society, Los Alamitos, CA, 412–420.
GATZIU, S. AND DITTRICH, K. 1993. Events in an active object-oriented database system. In
Proceedings of the International Workshop on Rules in Database Systems (RIDS), N. Paton and H.
Williams, Eds. Workshops in Computing, Springer-Verlag, Edinburgh, U.K.
CHAKRAVARTHY, S. AND ADAIKKALAVAN, R. 2008. Events and streams: Harnessing and unleashing
their synergy! In Proceedings of the 2nd International Conference on Distributed Event-Based Systems
(DEBS’08). ACM, New York, NY, 1–12.
Active Databases
• Reactive behaviour moved to database layer
• Event-Condition-Action (ECA) rules
  - Event: the source that triggers the rule. E.g. tuple inserted
  - Condition: checked after the event occurs. E.g. inserted.value > 5
  - Action: what to do. E.g. modify the DB
• Cons
  - Persistent storage model
  - Suitable only when updates are infrequent and rules are few
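The ECA idea can be sketched in a few lines of Python. This is a toy, in-memory illustration only, not how any particular active database implements triggers; the names `EcaRule`, `ToyActiveTable`, and `alerts` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EcaRule:
    event: str                          # event type that triggers the rule
    condition: Callable[[Dict], bool]   # predicate evaluated after the event
    action: Callable[[Dict], None]      # what to do when the condition holds


class ToyActiveTable:
    """A list of rows plus ECA rules that fire on insert (hypothetical)."""

    def __init__(self) -> None:
        self.rows: List[Dict] = []
        self.rules: List[EcaRule] = []

    def insert(self, row: Dict) -> None:
        self.rows.append(row)           # the event: a tuple was inserted
        for rule in self.rules:
            if rule.event == "insert" and rule.condition(row):
                rule.action(row)


alerts: List[Dict] = []
table = ToyActiveTable()
# Rule from the slide: on insert, if inserted.value > 5, take an action.
table.rules.append(EcaRule("insert", lambda r: r["value"] > 5, alerts.append))
table.insert({"value": 3})
table.insert({"value": 7})
```

Every insert is stored first and rules are evaluated afterwards, which mirrors the persistent storage model listed under the cons.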
Data Stream Management Systems
• Streams are unbounded (unlike tables)
• No arrival order assumptions
• Typically no storage
• Use continuous, or standing, queries
• Reactive in nature
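A continuous query can be pictured as a function that stays registered and emits a new answer on every arrival. The sketch below is a plain Python generator, not the API of any DSMS; `windowed_average` and the sample readings are assumptions for illustration.

```python
from collections import deque
from typing import Iterable, Iterator


def windowed_average(stream: Iterable[float], size: int) -> Iterator[float]:
    """Standing query: emit the mean of the last `size` readings on every arrival."""
    window = deque(maxlen=size)   # only the window is kept, never the whole stream
    for reading in stream:
        window.append(reading)
        yield sum(window) / len(window)


results = list(windowed_average([10.0, 20.0, 30.0, 40.0], size=2))
```

Because the input is consumed lazily, the same function would also work over an unbounded source; the list here only makes the output easy to inspect.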
CHANDRASEKARAN, S., COOPER, O., DESHPANDE, A., FRANKLIN, M. J., HELLERSTEIN, J. M., HONG, W., KRISHNAMURTHY, S., MADDEN, S. R., REISS, F., AND SHAH, M. A. 2003. Telegraphcq: Continuous dataflow processing. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD’03). ACM, New York, NY, 668–668.
CHEN, J., DEWITT, D. J., TIAN, F., AND WANG, Y. 2000. Niagaracq: A scalable continuous query system for Internet databases. SIGMOD Rec.
29, 2, 379–390.
LIU, L., PU, C., AND TANG, W. 1999. Continual queries for internet scale event-driven information delivery. IEEE Trans. Knowl. Data Eng. 11, 4,
610–628.
ARASU, A., BABU, S., AND WIDOM, J. 2006. The CQL continuous query language: Semantic foundations and query execution. VLDB J. 15, 2,
121–142.
Data Stream Management Systems
• Continuous query semantics
  - Answer: append-only stream or update store
  - Exact or approximate answer
• Cons
  - Atomic item is the stream
  - Not possible to detect sequencing or causal patterns
Publish/Subscribe Systems
• Information items are notifications
• Indirect addressing-based communication scheme
• Ancestors
  - Message Passing
  - Remote Procedure Call (RPC)
  - Shared spaces
  - Message Queueing
Eugster, P.T., Felber, P.A., Guerraoui, R. and Kermarrec, A.M., 2003. The many faces of publish/subscribe. ACM Computing Surveys (CSUR), 35(2), pp.114–131.
MÜHL, G., FIEGE, L., AND PIETZUCH, P. 2006. Distributed Event-Based Systems. Springer.
Message Queues
Messaging Models
• Two main messaging models are commonly available
  - point-to-point
  - publish/subscribe
• Both are based on the exchange of messages through a channel (queue)
• A typical system will use a mix of these models to achieve different messaging objectives
Point-to-Point Model
• Straightforward asynchronous exchange of messages
  - message routed to consuming clients via a queue
  - no restriction on number of publishing clients
  - usually only a single consuming client
  - message is delivered only once to only one receiver
Point-to-Point Model
• Messages are always delivered and will be
stored in the queue until a consumer is ready
to retrieve them
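A minimal sketch of the point-to-point behaviour described above, assuming a single in-process queue; `PointToPointQueue` is a hypothetical name and real message-oriented middleware adds persistence, acknowledgements, and networking.

```python
from collections import deque


class PointToPointQueue:
    """Messages wait in the queue until a consumer retrieves them, once each."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)      # stored until a consumer is ready

    def receive(self):
        # Each message is handed to exactly one caller, then removed.
        return self._messages.popleft() if self._messages else None


queue = PointToPointQueue()
queue.send("m1")
queue.send("m2")
first = queue.receive()    # one consumer gets m1
second = queue.receive()   # the next receive gets m2, never m1 again
third = queue.receive()    # nothing left
```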
Publish Subscribe Models
• One-to-many and many-to-many distribution mechanism
  - allows a single producer to send a message to one consumer or potentially hundreds of thousands of consumers
• Clients "publish" to a specific topic or channel
Publish Subscribe Models
• Channels are “subscribed” to by clients to consume messages
• No restriction on the role of a client
  - may be both a producer and consumer of a channel
Publish/Subscribe Systems
• Topic-based pub/sub
  - Topics are groups or channels
  - Events of a topic are sent to the topic’s subscribers
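Topic-based routing can be sketched as a dictionary from topic name to subscriber callbacks. This is an illustrative in-process broker; `TopicBroker` and the topic names are assumptions, not part of iBus or TIB/Rendezvous.

```python
from collections import defaultdict


class TopicBroker:
    def __init__(self):
        self._subscribers = defaultdict(list)   # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        # Every subscriber of the topic receives the event; nobody else does.
        for callback in self._subscribers[topic]:
            callback(event)


broker = TopicBroker()
energy_inbox, weather_inbox = [], []
broker.subscribe("energy", energy_inbox.append)
broker.subscribe("weather", weather_inbox.append)
broker.publish("energy", {"amount_kwh": 40})
```

Note that the publisher and the subscribers must already agree on the string "energy", which is a small instance of the semantic coupling discussed later in the tutorial.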
ALTHERR, M., ERZBERGER, M., AND MAFFEIS, S. 1999. iBus—a software bus middleware for the
Java platform. In Proceedings of the International Workshop on Reliable Middleware Systems. 43–53.
TIBCO. 1999. TIB/Rendezvous. White paper. TIBCO, Palo Alto, CA.
Publish/Subscribe Systems
• Content-based pub/sub
  - Matching by message filters
  - The channels between publishers and subscribers are defined by the event content and the subscriptions
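In a content-based scheme the subscription is a filter over the event itself rather than a topic name. The following is a hedged sketch with a hypothetical `ContentBroker`; attribute names such as `amount_kwh` are invented for the example.

```python
class ContentBroker:
    def __init__(self):
        self._subscriptions = []     # (filter predicate, callback) pairs

    def subscribe(self, predicate, callback):
        self._subscriptions.append((predicate, callback))

    def publish(self, event):
        # The event content decides who receives it.
        for predicate, callback in self._subscriptions:
            if predicate(event):
                callback(event)


content_broker = ContentBroker()
high_usage = []
content_broker.subscribe(
    lambda e: e.get("type") == "energy consumption" and e.get("amount_kwh", 0) > 50,
    high_usage.append,
)
content_broker.publish({"type": "energy consumption", "amount_kwh": 40})
content_broker.publish({"type": "energy consumption", "amount_kwh": 70})
```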
David S. Rosenblum and Alexander L. Wolf. 1997. A design framework for Internet-scale event
observation and notification. SIGSOFT Softw. Eng. Notes 22, 6 (November 1997), 344-360.
DOI=10.1145/267896.267920 http://doi.acm.org/10.1145/267896.267920
Publish/Subscribe Systems
• Type-based pub/sub
  - Matching on type hierarchy
  - Type hierarchy can come from programming language inheritance hierarchies
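When the type hierarchy comes from the programming language, matching reduces to a subtype check. A minimal sketch using Python classes follows; the class names and `TypeBroker` are hypothetical.

```python
class Event: ...
class EnergyEvent(Event): ...
class ElectricityEvent(EnergyEvent): ...
class WeatherEvent(Event): ...


class TypeBroker:
    def __init__(self):
        self._subscriptions = []     # (event type, callback) pairs

    def subscribe(self, event_type, callback):
        self._subscriptions.append((event_type, callback))

    def publish(self, event):
        # A subscription to a type also matches all of its subtypes.
        for event_type, callback in self._subscriptions:
            if isinstance(event, event_type):
                callback(event)


type_broker = TypeBroker()
received = []
type_broker.subscribe(EnergyEvent, received.append)
type_broker.publish(ElectricityEvent())   # subtype of EnergyEvent: delivered
type_broker.publish(WeatherEvent())       # unrelated type: not delivered
```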
EUGSTER, P. AND GUERRAOUI, R. 2001. Content based publish/subscribe with structural reflection. In
Proceedings of the 6th Usenix Conference on Object-Oriented Technologies and Systems (COOTS’01).
Publish Subscribe Systems
• Events and matching
  - Ordered tuples
  - Attribute-values
  - XML documents
• Cons
  - Single event matching only
Complex Event Processing Systems
• Detection of complex patterns of multiple events
  - Sequencing
  - Causal
  - Ordering in general
• Generation of complex, or derived, events
LUCKHAM, D., 2002. The Power of Events: An Introduction to Complex Event Processing in Distributed
Enterprise Systems, Addison-Wesley Professional.
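A sequencing pattern over multiple events can be sketched as follows. This is a deliberately simple detector, assuming events arrive ordered by time; the event types, the `max_gap` parameter, and the shape of the derived event are all invented for illustration and do not follow any specific CEP language.

```python
def detect_sequence(events, first_type, second_type, max_gap):
    """Yield a derived event whenever `second_type` follows `first_type`
    within `max_gap` time units. Each first event is used at most once."""
    last_first = None
    for event in events:                      # assumed ordered by "time"
        if event["type"] == first_type:
            last_first = event
        elif event["type"] == second_type and last_first is not None:
            if event["time"] - last_first["time"] <= max_gap:
                yield {
                    "type": f"{first_type}->{second_type}",
                    "start": last_first["time"],
                    "end": event["time"],
                }
            last_first = None                 # consume the first event


stream = [
    {"type": "door_open", "time": 1},
    {"type": "temperature_drop", "time": 3},
    {"type": "temperature_drop", "time": 4},
    {"type": "door_open", "time": 10},
    {"type": "temperature_drop", "time": 30},
]
derived = list(detect_sequence(stream, "door_open", "temperature_drop", max_gap=5))
```

Unlike single-event pub/sub matching, the detector keeps state between events, which is what allows it to produce a derived event.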
Complex Event Processing Systems
Adapted from CUGOLA, G. AND MARGARA, A., 2011. Processing flows of information: From data stream to complex event processing. ACM Computing Surveys Journal.
Souleiman Hasan and Edward Curry. 2014. Approximate Semantic Matching of Events for the Internet of Things. ACM Trans. Internet Technol. 14, 1, Article 2 (August 2014), 23 pages.
DISCUSSION POINT
• What paradigms are the audience used to?
• What is their experience with each?
A THEORY FOR EVENT EXCHANGE
PART III
3 Traits of Large-Scale Event Processing
Systems
• 1- Distribution
• Two complementary aspects:
• The first is the placement of processing workloads on different nodes and
thus making use of parallel computing.
• The second aspect is that large-scale environments are inherently
distributed with event production and consumption happening at distributed
components.
• Thus, even when dealing with a centralized event
processing engine, considerations of the innate nature of
distribution of the environment of event producers and
consumers shall be taken into account.
3 Traits of Large-Scale Event Processing
Systems
• 2- Heterogeneity
• Differences in hardware components, protocols, operating
systems, middleware, and data
• Mühl et al.: “Syntax and semantics of notifications are
likely to vary and there are inevitably different data models
in use.”
• We deal here with semantic heterogeneity
• Semantics discussed in PART IV
Problem
• Event producers and consumers are semantically
coupled
  - Consumers need prior knowledge of event types, attributes and values.
  - Limits scalability in heterogeneous and dynamic environments due to explicit dependencies
  - Difficult development of event processing subscriptions/rules in heterogeneous and dynamic environments.
[Figure: Producer and Consumer are decoupled in space, time, and synchronization, but remain coupled in semantics]
Exact Matching Model
Events from producers (e.g. sensors):
e1: Type = Energy Consumption, Place = Room 202e, Amount = 40 kWh
e2: Type = Electricity Consumption, Location = Room 202e, Amount = 70 kWh
e3: Type = Electricity Utilized, Venue = Room 202e, Amount = 600 kWh
Consumer subscriptions (traditional event processing), one per event vocabulary:
Type =“Energy Consumption”, Place =“Room 202e”
Type =“Electricity Consumption”, Location =“Room 202e”
Type =“Electricity Utilized”, Venue =“Room 202e”
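The exact matching model on this slide can be reproduced with plain dictionaries. The sketch below uses the three events and three subscriptions shown on the slide; the function name `exact_match` and the labels e2 and e3 are assumptions.

```python
events = {
    "e1": {"Type": "Energy Consumption", "Place": "Room 202e", "Amount": "40 kWh"},
    "e2": {"Type": "Electricity Consumption", "Location": "Room 202e", "Amount": "70 kWh"},
    "e3": {"Type": "Electricity Utilized", "Venue": "Room 202e", "Amount": "600 kWh"},
}


def exact_match(subscription, event):
    """True only if every subscribed attribute exists with exactly the same value."""
    return all(event.get(attr) == value for attr, value in subscription.items())


# One rule written in the first producer's vocabulary only catches e1.
one_rule = {"Type": "Energy Consumption", "Place": "Room 202e"}
matched_by_one_rule = [n for n, e in events.items() if exact_match(one_rule, e)]

# Covering all producers requires one rule per vocabulary.
three_rules = [
    {"Type": "Energy Consumption", "Place": "Room 202e"},
    {"Type": "Electricity Consumption", "Location": "Room 202e"},
    {"Type": "Electricity Utilized", "Venue": "Room 202e"},
]
matched_by_three_rules = [
    n for n, e in events.items() if any(exact_match(s, e) for s in three_rules)
]
```

The number of rules grows with the number of vocabularies in use, which is the scalability cost the slide points at.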
Semantic Matching
Events from producers (e.g. sensors):
e1: Type = Energy Consumption, Place = Room 202e, Amount = 40 kWh
e2: Type = Electricity Consumption, Location = Room 202e, Amount = 70 kWh
e3: Type = Electricity Utilized, Venue = Room 202e, Amount = 600 kWh
Consumer subscription (semantic event processing), a single approximate rule:
Type =“Energy Consumption”~, Location =“Room 202e”
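A rough sketch of how a single approximate subscription could score all three events. The relatedness numbers are invented for illustration, and the scoring rule (product of best per-predicate scores) is an assumption rather than the matcher from the cited paper. The slide marks only the Type value with `~`; treating the Location attribute name as approximable too is a further assumption made here so that e1 and e3 can match.

```python
# Hypothetical relatedness scores in [0, 1], invented for this illustration.
RELATEDNESS = {
    frozenset(["energy consumption", "electricity consumption"]): 0.8,
    frozenset(["energy consumption", "electricity utilized"]): 0.6,
    frozenset(["location", "place"]): 0.9,
    frozenset(["location", "venue"]): 0.7,
}


def related(a, b):
    """Return 1.0 for identical terms, else the tabulated score (0.0 if unknown)."""
    a, b = a.lower(), b.lower()
    return 1.0 if a == b else RELATEDNESS.get(frozenset([a, b]), 0.0)


def score(subscription, event):
    """Product over predicates of the best-matching attribute/value pair."""
    total = 1.0
    for attr, value, approx_attr, approx_value in subscription:
        best = 0.0
        for e_attr, e_value in event.items():
            attr_score = related(attr, e_attr) if approx_attr else float(attr == e_attr)
            value_score = related(value, e_value) if approx_value else float(value == e_value)
            best = max(best, attr_score * value_score)
        total *= best
    return total


events = {
    "e1": {"Type": "Energy Consumption", "Place": "Room 202e", "Amount": "40 kWh"},
    "e2": {"Type": "Electricity Consumption", "Location": "Room 202e", "Amount": "70 kWh"},
    "e3": {"Type": "Electricity Utilized", "Venue": "Room 202e", "Amount": "600 kWh"},
}
# (attribute, value, attribute is approximable, value is approximable)
subscription = [
    ("Type", "Energy Consumption", False, True),
    ("Location", "Room 202e", True, False),
]
scores = {name: score(subscription, e) for name, e in events.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
```

One rule now yields a ranked list of candidate events instead of a yes/no answer, trading some precision for far fewer rules.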
3 Traits of Large-Scale Event Processing
Systems
• 3- Openness
• Although the term “open” has been used frequently in the literature
to describe distributed event systems at large scales, it
has not been defined precisely
• We draw upon the definition used in systems theory: a
“system that has external interactions in form of information,
energy, or matter transfer through its boundary.”
• We define an open event system, from the semantics perspective, as
an event environment where an agent can exchange
events with other agents that use different event
semantics
The Principle of Decoupling
• Eugster et al.: decoupling as “removing all explicit
dependencies between the interacting participants.”
• Implicit Interaction: the control over an event-based
system is decentralized into an autonomous version
The Principle of Decoupling
• Event Processing
  - Exchange atomic items called events
  - Scalable by decoupling (Eugster et al.)
• Decoupling dimensions between the producer and the consumer of an event
  - Space: no addresses
  - Time: active or not
  - Synchronization: no blocking
DISCUSSION POINT
• The hypothesis is that removing explicit
dependencies between event producers and
consumers leads to an increased scalability
• Where did the dependencies go?
How Good are Our Paradigms?
• Scale
 Big volume
 Big Velocity
 Big Variety
• Distributed sources and consumers
• The big challenge is now in the exchange of
knowledge at a very large-scale
Shannon-Weaver Model
Cross-Boundaries Exchange
P. R. Carlile. Transferring, translating, and transforming: An integrative framework for managing knowledge across boundaries. Organization Science, 15(5):555–568, 2004.
[Figure: syntactic, semantic, and pragmatic boundaries between Producer and Consumer, spanning known and open environments]
Syntactic Boundary
• Transfer is the most common type of
information movement across this boundary
• A common lexicon exists
  - Move and process syntax (0’s and 1’s)
  - Dominant form in Shannon and Weaver’s theory
• Examples
  - Different data models of events
  - E.g. Transfer RDF events over HTTP
Semantic Boundary
• Common lexicon doesn’t exist
• Lexicons evolve
• Ambiguities exist
• Translation is the process to cross this boundary
• Examples
  - Different ontologies for sensors
  - Ontology alignment for RDF events
Pragmatic Boundary
• Actors on the sides of the boundary have
  - Different contexts
  - Different perspectives
  - Different interests
• Transformation is the process to cross this boundary
• Example
  - A temperature sensor reading of 35 °C is acceptable from an outdoor sensor but not from an indoor one
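The temperature example can be made concrete with a context-dependent check. The ranges below are hypothetical values chosen only so that 35 °C is acceptable outdoors and not indoors; they are not taken from the tutorial.

```python
# Hypothetical acceptable ranges per context, in degrees Celsius.
ACCEPTABLE_RANGE_CELSIUS = {
    "outdoor": (-30.0, 45.0),
    "indoor": (15.0, 30.0),
}


def is_acceptable(reading_celsius, context):
    """The same value is judged differently depending on the sensor's context."""
    low, high = ACCEPTABLE_RANGE_CELSIUS[context]
    return low <= reading_celsius <= high


outdoor_ok = is_acceptable(35.0, "outdoor")
indoor_ok = is_acceptable(35.0, "indoor")
```

The event payload is identical in both calls; only the context supplied by the consumer changes the interpretation, which is what distinguishes the pragmatic boundary from the semantic one.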
Transfer-Translate-Transform
• Current approaches in event processing
• Transfer
  - Common event models
  - Common language models
• E.g. RDF over HTTP
Transfer-Translate-Transform
• Current approaches in event processing
• Translate
  - Agreements on schemas/thesauri/ontologies
• E.g. DERI Energy ontology for building energy events
Curry, Edward, et al. "Linking building data in the cloud: Integrating cross-domain building data using linked data." Advanced Engineering Informatics 27.2 (2013): 206-219.
Transfer-Translate-Transform
• Current approaches in event processing
• Transform
  - Dedicated enrichers, joins in event languages
• E.g. CQELS language for Linked Stream Data mashups
Decoupling for Scalability
Patrick Th. Eugster, Pascal A. Felber, Rachid Guerraoui, and Anne-Marie Kermarrec. 2003. The many faces of publish/subscribe. ACM Comput. Surv. 35, 2 (June 2003), 114-131.
[Figure: Event Processing decouples the event source from the event consumer in space, time, and synchronization]
A Trade-Off
• Current decoupling scales at the lower boundaries
• A human in the loop is needed to cross the higher boundaries, which introduces coupling and limits scalability
[Figure: De/Coupling between Producer and Consumer across the syntactic, semantics (meanings), and pragmatics (contexts) boundaries in heterogeneous, distributed, open environments. Labels: formats; space, time, synchronization; agreements; external data.]
A Trade-off
A. Small scale known environment: low cost to cross boundary (agreements, number of rules)
Publisher Alice → syntactic boundary → semantic boundary (semantic normalization for Bob) → Consumer Bob
e: type = energy consumption increase, location = university street
eB (normalized for Bob): type = electricity usage rise, place = university street
B. Large scale open environment (e.g. IoT): high cost to cross boundary (agreements, number of rules)
Publisher Alice → syntactic boundary → semantic boundary → Consumers Dan, Bob, Erin
e: type = energy consumption increase, location = university street
eB: type = electricity usage rise, place = university street
eD: type = energy usage rise, place = university road
eE: type = energy usage rise, location = university road
Souleiman Hasan and Edward Curry. 2015. Tackling Variety in Internet of Things Events. IEEE Internet Computing (In Press).
Semantic Coupling
[Figure: Event Processing decouples the event source from the event consumer in space, time, and synchronization, while semantic coupling remains on type, attributes, and values]
Souleiman Hasan and Edward Curry. 2014. Approximate Semantic Matching of Events for the Internet of Things. ACM Trans. Internet Technol. 14, 1, Article 2 (August 2014), 23 pages.
DISCUSSION POINT
• How significant is semantic coupling
compared to other decoupling dimensions?
• What are possible solutions?
SEMANTICS AND APPROXIMATION
PART IV
What is Semantics?
• Relationship between two spaces (or worlds or sets): the
meanings, and the symbols (Peter Gärdenfors)
• Semiotics and sign systems (Chandler, 2001)
Semantic Heterogeneity
• Semantics maps a language L to meanings M
The Meanings Space
The meanings space:
• Objects are individuals like a specific laptop used by Alice.
• Properties are a “way of abstracting away redundant
information about objects”. E.g. Alice's laptop is “black”
which is a property.
• Concepts are the most generic form of objects and
properties. A concept clusters similar properties and objects
such as the concept “Laptop”.
Three Main Levels of Semantics
1. Symbolic
2. Conceptual or Sub-Symbolic
3. Non-Symbolic
Symbolic Semantics
• Information is represented by symbols, e.g. “apple”
• Processing of information is by definition a manipulation
of symbols through Rules.
• Symbols can be gathered into sentences of a language of thought.
• What a sentence means is a belief of an agent.
• Various beliefs are connected by logical or inferential
relations such as first-order logic in Artificial Intelligence
(AI)
• Meanings are purely the result of logical, syntactic
relations of symbols, rather than the states they refer to.
Symbolism and Computationalism
“One of the fundamental contributions to knowledge of
computer science has been to explain, at a rather basic
level, what symbols are. This explanation is a scientific
proposition about Nature. It is empirically derived, with a
long and gradual development. Symbols lie at the root of
intelligent action, which is, of course, the primary topic of
artificial intelligence. For that matter, it is a primary
question for all of computer science.”
Newell and Simon
Symbolism and Computationalism
• Fundamental tenet in AI
• But also in databases and hence event systems
• E.g. Codd’s relational model
• E.g. Unique name assumption in databases
Symbolic Semantics
• Extensional Semantics where a property is defined by the
set of objects in the world that have the property.
• Intensional Semantics alters the concept of one world to
the case of multiple possible worlds, to tackle properties
like small or big
• Situation Semantics uses one world model, but instead of
truth functions from symbols or sentences to possible
worlds, it uses a polarity function from symbols or
sentences to a subset of the world, called situation.
Critiques of Symbolic Semantics
1. They do not explain how a person can perceive two properties to be similar.
2. They give a limited account of inductive reasoning.
3. The frame problem, which states that representing all necessary knowledge about the world requires a combinatorial explosion of logical axioms and inferences.
4. The symbol grounding problem, which states that in the symbolic paradigm the meanings of symbols are actually grounded in symbols themselves.
5. Symbolism does not largely separate the symbolic level from the meaning level. When event agents need to agree on meanings, they have to agree on symbols, which is a highly costly process and thus hinders the loose semantic coupling requirement.
Conceptual Sub-Symbolic Semantics
• Fundamentally leverage topological features of meanings,
e.g. Apple is closer to Orange than to Car
• Geometrical nature of the meaning space
• Distances and closeness between meanings can be
established.
• E.g. Gärdenfors conceptual spaces
  - Concepts are structured into domains, e.g. the domain of colors, the spatial domain, etc.
  - Conceptual spaces are built up from quality dimensions, which serve the purpose of building the domains. For instance, the colors domain can be built up from three dimensions: hue, chromaticness or saturation, and brightness.
  - Computationally challenging due to the need to build and agree on quality dimensions
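The geometric intuition (Apple is closer to Orange than to Car) can be sketched as a distance in a small space of quality dimensions. The dimensions and coordinates below are made up for illustration; choosing and agreeing on real quality dimensions is exactly the hard part noted above.

```python
import math

# Hypothetical coordinates on three made-up quality dimensions, each in [0, 1]:
# (sweetness, size, roundness)
POINTS = {
    "apple": (0.7, 0.1, 0.9),
    "orange": (0.6, 0.1, 1.0),
    "car": (0.0, 0.9, 0.2),
}


def distance(a, b):
    """Euclidean distance between two concepts in the quality-dimension space."""
    return math.dist(POINTS[a], POINTS[b])


apple_to_orange = distance("apple", "orange")
apple_to_car = distance("apple", "car")
```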
Distributional Semantic Model
• Distributional hypothesis: the context surrounding a given word
in a text provides relevant information about its meaning.
• Simplified semantic model.
• Associational and quantitative.
• Explicit Semantic Analysis (ESA) is the primary distributional
model used in this work.
A wife is a female partner in a marriage. The term "wife" seems to be a
close term to bride, the latter is a female participant in a wedding
ceremony, while a wife is a married woman during her marriage.
...
Slide Credits: Andre Freitas (http://andrefreitas.org/), [Freitas et al., 2013]
Distributional Semantic Model
[Figure: words such as child, husband, and spouse are vectors in a space whose axes are contexts c1, c2, ..., cn; each coordinate (e.g. 0.7, 0.5) is a function of the number of times the word occurs in that context. Commonsense is here.]
Slide Credits: Andre Freitas (http://andrefreitas.org/), [Freitas et al., 2013]
Semantic Relatedness
[Figure: the angle θ between word vectors (child, husband, spouse) in the context space c1, c2, ..., cn]
Works as a semantic ranking function
E.g. esa(room, building) = 0.099
E.g. esa(room, car) = 0.009
Slide Credits: Andre Freitas (http://andrefreitas.org/) [Freitas et al., 2013]
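Relatedness as the angle between context vectors can be sketched with cosine similarity. The three-dimensional vectors below are invented toy values, not ESA output; a real ESA model derives much larger vectors from a corpus such as Wikipedia.

```python
import math

# Hypothetical context vectors: each coordinate is a weight for one context.
VECTORS = {
    "spouse": [0.7, 0.5, 0.0],
    "husband": [0.6, 0.6, 0.1],
    "child": [0.1, 0.3, 0.9],
}


def cosine(u, v):
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank(term, candidates):
    """Order candidates by decreasing relatedness to `term`."""
    return sorted(candidates, key=lambda c: cosine(VECTORS[term], VECTORS[c]), reverse=True)


ranking = rank("spouse", ["child", "husband"])
```

As on the slide, the absolute scores matter less than the ordering they induce, which is why the measure is described as a ranking function.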
Critiques of Sub-Symbolic Semantics
• Compositionality: they mainly concern lexical meanings,
i.e. meanings of individual terms, rather than complex
sentences.
• We argue that the compositionality problem is not an issue
for event matching, because linguistic
structures and syntax are not the kind of data model used in
event processing systems to represent events and
subscriptions.
Non-Symbolic Semantics
• Artificial Neural Networks (ANNs)
• Connectionism
• Fundamentally can be abstracted into geometrical models
• Difficult to build and interpret
Semantic Models
Tagging
• Inspired by works in social tagging, i.e. folksonomies
• Folksonomies are a bottom-up approach to semantics
• Free words
• Tag events such as in IoT
• Leads to the concept of thingsonomies
Free Tagging and Thingsonomies
• Top-down taxonomies difficult to impose
• Folksonomies successful in social tagging
Approximation
• Coupling is important to cross semantic and pragmatic boundaries, but it limits scalability.
• Loosening coupling at these levels is a compromise to tackle the trade-off between decoupling for scalability and crossing the boundaries.
• The cost of this compromise is a loss in effectiveness while crossing the boundaries, i.e. loss of some precision and context when processing the events.
• In the literature, approximation has been used to address
  - Time efficiency, e.g. approximation algorithms
  - Full integration, e.g. uncertain schema matching, Gal et al.
Approximation
• Exact matching assumes full agreements
• Approximation is flexible in the face of uncertainties
APPROACHES TO SEMANTIC COUPLING
PART V
Loosening the Semantic Coupling
• Approach 1: Content-Based with Semantic
Decoupling
• Approach 2: Content-Based with Implicit Shared
Agreements
• Approach 3: Concept-Based
• Approach 4: Loose Semantic Coupling +
Approximation
Hasan, S. and Curry, E., 2014. Approximate Semantic Matching of Events for The Internet
of Things. ACM Transactions on Internet Technology (TOIT).
• Approach 5: Theme-Based
Hasan, S. and Curry, E., 2014. Thematic Event Processing. Middleware 2014.
Approach 1: Content-Based with Semantic
Decoupling
[Diagram: the producer publishes "A happened"; the consumer subscribes with "Interested in B". Producer and consumer are semantically decoupled.]
• Very low detection rate
 High false positives/negatives
 Low precision/recall
Current Approaches
[Chart: content-based, concept-based, and bottom-up semantics approaches plotted by semantic decoupling against effectiveness & efficiency.]
Approach 1: Content-Based with Semantic
Decoupling
[Diagram: the producer publishes "A happened"; the consumer subscribes with several rules: "Interested in A", "Interested in B", "Interested in C". Producer and consumer are semantically decoupled.]
• Use many rules to improve detection
 Time and effort
 Affects scalability to heterogeneous environments
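A minimal sketch of why the many-rules workaround is costly, assuming events and rules are plain attribute-value dictionaries. The vocabulary variants below are hypothetical examples chosen for illustration.

```python
from itertools import product


def matches(event: dict, rule: dict) -> bool:
    """Exact content-based match: each attribute in the rule must equal the event's value."""
    return all(event.get(attr) == value for attr, value in rule.items())


event = {"type": "increased energy consumption event", "device": "computer"}

# One rule written in the consumer's own vocabulary misses the event.
single_rule = {"type": "increased energy usage event", "device": "laptop"}

# Covering vocabulary variants means enumerating every combination by hand.
type_variants = ["increased energy usage event", "increased energy consumption event"]
device_variants = ["laptop", "computer", "pc"]
many_rules = [{"type": t, "device": d} for t, d in product(type_variants, device_variants)]

print(matches(event, single_rule))                 # False
print(len(many_rules))                             # 6
print(any(matches(event, r) for r in many_rules))  # True
```

The number of rules grows multiplicatively with the variants per attribute, which is the time-and-effort cost the slide refers to.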
Approach 2: Content-Based with Implicit
Shared Agreements
[Diagram: the producer publishes "A happened"; the consumer subscribes with "Interested in A". Semantic coupling is established via implicit agreements, made face-to-face or via documentation, e.g. "Use symbol A to describe".]
Approach 2: Content-Based with Implicit
Shared Agreements
• Implicit semantics
 Top-down approach to semantics
 Granular on the level of concepts
Approach 2: Content-Based with Implicit
Shared Agreements
• Need for shared agreements
 Time and effort
 Affects scalability to heterogeneous environments
Approach 3: Concept-Based
[Diagram: the producer publishes "A happened"; the consumer subscribes with "Interested in B". Semantic coupling is established via ontologies: a hierarchy over the concepts A to F linked by subClassOf relations.]
Approach 3: Concept-Based
• Explicit semantics
 Top-down approach to semantics
 Granular on the level of concepts
Approach 3: Concept-Based
• Need for shared agreements
 Time and effort
 Affects scalability to heterogeneous environments
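Concept-based matching can be sketched as a walk over subClassOf links. The hierarchy below is a hypothetical fragment; the exact shape of the ontology in the slide's diagram is not recoverable, so the parent links are assumptions.

```python
# Hypothetical fragment of a shared ontology: child -> parent (subClassOf).
# The hierarchy is assumed to be acyclic.
SUBCLASS_OF = {"A": "B", "B": "C", "E": "D"}


def is_subclass_of(concept: str, ancestor: str) -> bool:
    """Boolean semantic match: walk subClassOf links upward from `concept`."""
    current = concept
    while current is not None:
        if current == ancestor:
            return True
        current = SUBCLASS_OF.get(current)
    return False


# An event about A satisfies a subscription for B because A is a subclass of B.
print(is_subclass_of("A", "B"))  # True
print(is_subclass_of("A", "D"))  # False
```

The match is only possible because both parties agreed on the same ontology beforehand, which is the shared agreement the slide identifies as the cost.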
Approach 4: Loose Semantic Coupling +
Approximation
[Diagram: the producer publishes "A happened"; the consumer subscribes with "Interested in B". Loose semantic coupling is established via large text corpora: each term is represented by the documents it occurs in, and A ~ B is evaluated approximately.]
A: d1 d2 d3 d4 d5 d6 d7 d8 …
B: d1 d3 d4 d17 d25 d26 d77 d78 …
Souleiman Hasan and Edward
Curry. 2014. Approximate
Semantic Matching of Events for
the Internet of Things. ACM Trans.
Internet Technol. 14, 1, Article 2
(August 2014), 23 pages
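A minimal sketch of how relatedness between A and B could be derived from the document lists above, assuming binary weights and cosine similarity. Both choices are simplifying assumptions for illustration; the cited paper may weight documents differently.

```python
import math


def cosine(u: dict, v: dict) -> float:
    """Cosine similarity between two sparse vectors stored as {dimension: weight}."""
    dot = sum(weight * v.get(dim, 0.0) for dim, weight in u.items())
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Each term is represented by the corpus documents it occurs in (binary weights assumed).
A = {f"d{i}": 1.0 for i in range(1, 9)}
B = {doc: 1.0 for doc in ("d1", "d3", "d4", "d17", "d25", "d26", "d77", "d78")}

relatedness = cosine(A, B)
print(round(relatedness, 3))  # 0.375 (three shared documents out of eight each)
```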
Approach 4: Loose Semantic Coupling +
Approximation
• Bottom-up model of semantics
• Global semantics: distribution vs. granular
Approach 4: Loose Semantic Coupling +
Approximation
• Low cost to scale to heterogeneous
environments
• Slightly lower detection rate
Approach 5: Thematic Event Processing
• Can we exchange better approximations of
meanings rather than mere symbols to
improve the detection rate?
Approach 5: Thematic Event Processing
[Diagram: the producer publishes "(A+T1) happened"; the consumer subscribes with "Interested in (B+T2)", where T1 and T2 denote themes. Loose semantic coupling is established via large text corpora, and (A+T1) ~ (B+T2) is evaluated approximately.]
A: d1 d2 d3 d4 d5 d6 d7 d8 …
B: d1 d3 d4 d17 d25 d26 d77 d78 …
Souleiman Hasan and
Edward Curry. 2014.
Thematic event processing.
In Proceedings of the 15th
International Middleware
Conference (Middleware '14).
Summary
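Approaches compared: Simple Content-based; Content-based + Many Rules; Concept-based; Simple Distributional + Approximation; Thematic
• Matching
 Simple Content-based: exact string matching
 Content-based + Many Rules: exact string matching
 Concept-based: Boolean semantic matching
 Simple Distributional + Approximation: approximate semantic matching
 Thematic: approximate semantic matching
• Semantic Coupling
 Simple Content-based: term-level full agreement
 Content-based + Many Rules: term-level full agreement
 Concept-based: concept-level shared agreement
 Simple Distributional + Approximation: loose agreement
 Thematic: loose agreement
• Semantics
 Simple Content-based: not explicit
 Content-based + Many Rules: not explicit
 Concept-based: top-down, ontology-based
 Simple Distributional + Approximation: statistical model based on distributional semantics
 Thematic: statistical model based on distributional semantics + themes
• Effectiveness
 Simple Content-based: very low
 Content-based + Many Rules: 100%
 Concept-based: depends on the domains and number of concept models
 Simple Distributional + Approximation: depends on the corpus
 Thematic: depends on the corpus + theme representatives
• Cost
 Simple Content-based: defining a small number of rules
 Content-based + Many Rules: defining a large number of rules
 Concept-based: establishing shared agreement on ontologies
 Simple Distributional + Approximation: minimal agreement on a large textual corpus
 Thematic: minimal agreement on a large textual corpus + good theme representatives
• Efficiency
 Simple Content-based: high
 Content-based + Many Rules: high
 Concept-based: medium to high
 Simple Distributional + Approximation: medium to high
 Thematic: medium to high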
DISCUSSION POINT
• Is the audience familiar with the symbolic vs.
non-symbolic debate in AI?
• What position do they take?
• How relevant to our current discussion?
RDF EVENT PROCESSING
CASE STUDY
From the Web to the Semantic Web
• 1989: Tim Berners-Lee proposed what became the Web
• Hypertext over HTTP and URI
• 2001: Berners-Lee et al. published their Scientific American paper The Semantic Web
• A structured Web for machines
• Builds over the Web
architecture
From the Web to the Semantic Web
• Initial Semantic
Web vision
• Standards: W3C
• Research: mainly
ISWC, ESWC
• Recently attention
moved to the lower
layers of the stack
• Called: Linked Data
Linked Data is Web-based
• Leverages the architecture of the Web to
make sharing data easier
• Linked Data is a method of exposing, sharing,
and connecting data (via dereferenceable
URIs) on the Web.
• Provides a Data (RDF) and Naming (URI)
model for the Web
• W3C Web-based Standards
• Adaptive Ontologies
Linked Data is Web-based
1. Use URIs as names for things
2. Use HTTP URIs so that people
can look up those names.
3. When someone looks up
a URI, provide useful
information, using the
standards (RDF*, SPARQL)
4. Include links to other URIs, so
that they can discover more
things
http://www.w3.org/DesignIssues/LinkedData.html
Linked Data Cloud
Example publishers: US government, UK government, BBC, New York Times, LinkedGeoData, BestBuy, Overstock.com, Facebook
Domains: Media, Government, Geo, Publications, User-generated, Life sciences, Cross-domain
Over 200 open data sets with more than 25 billion facts,
interlinked by 400 million typed links, doubling every 10 months.
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch.
Two Key Ingredients
• RDF – Resource Description Framework
Graph based Data – nodes and arcs
 Identifies objects (URIs)
 Interlink information (Relationships)
• Vocabularies (Ontologies)
 provide shared understanding of a domain
 organise knowledge in a machine-comprehensible way
 give an exploitable meaning to the data
Example: Linked Building Data
http://www.deri.ie/about/team/member/edward_curry/
http://lab.linkeddata.deri.ie/2010/deri-rooms#r202e
http://vocab.deri.ie/rooms#occupant
Resource Description Framework (RDF)
subject - predicate - object
Edward Curry is the Occupant of Room 202e
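A minimal sketch of the subject-predicate-object model using plain Python tuples rather than an RDF library. The short identifiers are placeholders standing in for full HTTP URIs, and the choice of which resource is the subject and which is the object is an assumption made for illustration.

```python
# Placeholder identifiers; a real graph would use full HTTP URIs.
# Which resource plays subject and which plays object is an assumption here.
ROOM = "rooms:r202e"
OCCUPANT = "rooms:occupant"
PERSON = "people:edward_curry"

# An RDF graph is a set of (subject, predicate, object) triples.
graph = [(ROOM, OCCUPANT, PERSON)]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


# "Which resources have Edward Curry as occupant?"
print(match(graph, p=OCCUPANT, o=PERSON))
# [('rooms:r202e', 'rooms:occupant', 'people:edward_curry')]
```

Pattern matching with wildcards is the basic operation that SPARQL-like query languages build on.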
Why Linked Data for the IoT?
• Many communities struggle with closed
approaches
 E.g., pervasive computing, embedded systems, IoT, …
• Cyber-Physical Systems are inherently “open
world”
 Prof. David Karger (MIT) in his ESWC 2013 keynote:
“Semantic Web technologies support an open world
assumption where millions of unforeseeable schemas
may have to be integrated.”
Why Linked Data for the IoT?
• Simple integration with existing LOD data sets
 Geo-spatial, governmental, media, …
• Manageable integration effort with other graph
data, e.g., Google Knowledge Graph,
Facebook Graph, etc.
DISCUSSION POINT
• How to classify RDF event processing
according to the previous classification?
• What are the pros and cons?
• How to make RDF event processing more
suitable for large-scale IoT with variety?
DISCUSSION POINT
• RDF Event Processing
 Based on a top-down semantic model, i.e. ontologies
 Use RDF as data model
 Use URIs for interlinking
 SPARQL-like query languages
 Graph matching
DISCUSSION POINT
• RDF Event Processing
 Top-down semantics difficult to agree on
 Granular semantics difficult to agree on
 RDF graph model powerful
 URIs good for linking
 SPARQL-like query languages, expressive
 SPARQL-like query languages, not user friendly
DISCUSSION POINT
• RDF Event Processing
 Is an example of the Concept-Based approach to semantic
coupling
 Needs research to address the problems of this
class while retaining the power of the RDF model and URIs
 Efforts in querying RDF graphs using distributional
semantics can largely feed this research, see [Freitas et
al., 2013]
THEMATIC EVENT PROCESSING
PART VI
The Thematic Approach
• Decoupling is good for scalability
• Decoupling parties creates challenges to
cross semantic boundaries
• We can solve that by re-coupling parties at
semantic boundaries, but it is not ideal
The Thematic Approach
Problem: How can we achieve effective and
efficient event processing without causing
semantic coupling at the semantic boundaries?
The Thematic Approach
• Exchange approximations of meanings
• Instead of exchanging words, try to convey
their meaning too
• How? Use tags that can help understand their
meaning
The Thematic Approach
• Inspired by work on social tagging, i.e. folksonomies
• Folksonomies are a bottom-up approach to semantics
Souleiman Hasan and Edward Curry. 2014. Thematic event processing. In Proceedings of the 15th International
Middleware Conference (Middleware '14). ACM, Bordeaux, France, 109-120. DOI=10.1145/2663165.2663335
http://doi.acm.org/10.1145/2663165.2663335
Image credit http://lowriderlibrarian.blogspot.fr/
The Thematic Approach
[Diagram: publisher Alice sends an event consisting of a theme (th_e) and a payload; consumer Bob registers a subscription consisting of a theme (th_s) and an expression. Both themes parameterize the approximate matcher.]
Loose coupling mode: lightweight agreement on themes
No coupling mode: free use of well representative themes
Souleiman Hasan and Edward Curry. 2014. Thematic event processing. In Proceedings of the 15th International
Middleware Conference (Middleware '14). ACM, Bordeaux, France, 109-120. DOI=10.1145/2663165.2663335
http://doi.acm.org/10.1145/2663165.2663335
Event Representation
Event
energy, appliances, building
type: increased energy consumption event,
measurement unit: kilowatt per hour,
device: computer,
office: room 112
• Thematic tags added to events
Subscription Representation
Subscription
power, computers
type= increased energy usage event~,
device~= laptop~,
office= room 112
• Thematic tags added to subscriptions
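The event and subscription shown on these slides could be held in data structures like the following. This is a minimal sketch: the class and field names are hypothetical, and the tilde (~) is modelled as a flag that marks an attribute or value as open to approximate matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Predicate:
    attribute: str
    value: str
    approx_attribute: bool = False  # "~" written after the attribute
    approx_value: bool = False      # "~" written after the value


@dataclass(frozen=True)
class Event:
    themes: frozenset   # thematic tags attached by the publisher
    payload: dict       # attribute-value tuples


@dataclass(frozen=True)
class Subscription:
    themes: frozenset   # thematic tags attached by the subscriber
    predicates: tuple   # Predicate instances


event = Event(
    themes=frozenset({"energy", "appliances", "building"}),
    payload={
        "type": "increased energy consumption event",
        "measurement unit": "kilowatt per hour",
        "device": "computer",
        "office": "room 112",
    },
)

subscription = Subscription(
    themes=frozenset({"power", "computers"}),
    predicates=(
        Predicate("type", "increased energy usage event", approx_value=True),
        Predicate("device", "laptop", approx_attribute=True, approx_value=True),
        Predicate("office", "room 112"),
    ),
)

# Predicates that tolerate approximation on either side.
approximate = [p.attribute for p in subscription.predicates
               if p.approx_attribute or p.approx_value]
print(approximate)  # ['type', 'device']
```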
Matching Model
• Top-1 and Top-k most probable mappings
{type: increased energy
consumption event,
measurement unit: kilowatt per hour,
device: computer,
desk: desk 112c,
office: room 112,
floor: ground floor,
zone: building,
city: Galway,
country: Ireland,
continent: Europe}
{type = increased energy
consumption event,
device = laptop~,
room~esa = room 112}
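A brute-force sketch of finding the top-k mappings between subscription predicates and event tuples. It assumes that a mapping's score is the product of per-pair similarities and uses made-up relatedness values; the probabilistic model in the cited papers may differ, and a real matcher would avoid enumerating every assignment.

```python
from itertools import permutations
from math import prod

# Hypothetical relatedness scores; unlisted, non-identical pairs get a low default.
TOY_SCORES = {("laptop", "computer"): 0.8, ("room", "office"): 0.7}


def term_sim(a: str, b: str) -> float:
    if a == b:
        return 1.0
    return TOY_SCORES.get((a, b), TOY_SCORES.get((b, a), 0.05))


def pair_sim(predicate, event_tuple) -> float:
    """Similarity of one predicate to one event tuple: attribute score times value score."""
    return term_sim(predicate[0], event_tuple[0]) * term_sim(predicate[1], event_tuple[1])


def top_k_mappings(predicates, event_tuples, k=1):
    """Score every one-to-one assignment of predicates to event tuples, keep the k best."""
    scored = []
    for chosen in permutations(event_tuples, len(predicates)):
        score = prod(pair_sim(p, t) for p, t in zip(predicates, chosen))
        scored.append((score, list(zip(predicates, chosen))))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


predicates = [("type", "increased energy consumption event"),
              ("device", "laptop"),
              ("room", "room 112")]
event_tuples = [("type", "increased energy consumption event"),
                ("device", "computer"),
                ("office", "room 112"),
                ("city", "Galway")]

best_score, best_mapping = top_k_mappings(predicates, event_tuples, k=1)[0]
print(round(best_score, 2))  # 0.56
print(best_mapping[1])       # (('device', 'laptop'), ('device', 'computer'))
```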
Probabilistic Approximate Matcher
• Top-1 and Top-k mappings between an event
and a subscription
Parameterized Similarity
• Thematic tags used to parameterize the
semantic measure
Interpreting Terms
• Project vectors in a
distributional semantic
vector space
• Thematic projection
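One simplified reading of thematic projection is to keep only the vector dimensions (corpus documents) in which the theme tags themselves occur, and to measure relatedness inside that subspace. The sketch below follows that reading; the vectors are made up, and the filtering rule is an assumption rather than the exact procedure of the cited paper.

```python
import math


def cosine(u: dict, v: dict) -> float:
    """Cosine similarity between two sparse vectors stored as {dimension: weight}."""
    dot = sum(weight * v.get(dim, 0.0) for dim, weight in u.items())
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def thematic_projection(term_vec: dict, theme_vecs: list) -> dict:
    """Keep only the dimensions in which at least one theme tag has weight."""
    basis = set()
    for vec in theme_vecs:
        basis.update(vec)
    return {dim: weight for dim, weight in term_vec.items() if dim in basis}


# Toy vectors: "power" occurs in electricity documents (d1, d2) and politics documents (d8, d9).
power = {"d1": 1.0, "d2": 1.0, "d8": 1.0, "d9": 1.0}
energy = {"d1": 1.0, "d2": 1.0, "d3": 1.0}
theme_appliances = {"d1": 1.0, "d2": 1.0, "d3": 1.0}

plain = cosine(power, energy)
projected = cosine(thematic_projection(power, [theme_appliances]),
                   thematic_projection(energy, [theme_appliances]))
print(projected > plain)  # True: the theme filters out the politics sense of "power"
```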
Evaluation Metrics
• Number of exact rules
• Precision
• Recall
• F1Score
• Throughput
• Standard error
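Precision, recall, and F1 follow their standard definitions. A minimal sketch, assuming the matcher's output and the ground truth are both available as sets of event identifiers (the identifiers below are made up):

```python
def precision_recall_f1(delivered: set, relevant: set) -> tuple[float, float, float]:
    """Compare the events delivered by the matcher against the truly relevant ones."""
    true_positives = len(delivered & relevant)
    precision = true_positives / len(delivered) if delivered else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


delivered = {1, 2, 3, 4}    # events the engine forwarded to the subscriber
relevant = {2, 3, 4, 5, 6}  # events the subscriber actually wanted

precision, recall, f1 = precision_recall_f1(delivered, relevant)
print(precision, recall, round(f1, 3))  # 0.75 0.6 0.667
```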
Evaluation Dataset
• Seed events synthesized from IoT sensors
• SmartSantander smart city project
 Luis Sanchez, José Antonio Galache, Veronica Gutierrez, JM Hernandez, J Bernat, Alex Gluhak, and Tomás Garcia. 2011.
SmartSantander: The meeting point between Future Internet research and experimentation and the smart cities. In Future
Network & Mobile Summit (FutureNetw), 2011. IEEE, 1–8.
• Sensor Capabilities
 solar radiation, particles, speed, wind direction, wind
speed, temperature, water flow, atmospheric pressure,
noise, ozone, rainfall, parking, radiation par, co, ground
temperature, light, no2, soil moisture tension, relative
humidity, energy consumption, cpu usage, memory
usage
Souleiman Hasan and Edward Curry. 2014. Approximate Semantic Matching of Events for the
Internet of Things. ACM Trans. Internet Technol. 14, 1, Article 2 (August 2014), 23 pages.
DOI=10.1145/2633684 http://doi.acm.org/10.1145/2633684
Evaluation Dataset
• Seed events synthesized from IoT sensors
• Linked Energy Intelligence platform
 Edward Curry, Souleiman Hasan, and Sean O’Riain. 2012. Enterprise energy management using a linked dataspace for Energy
Intelligence. In Sustainable Internet and ICT for Sustainability (SustainIT), 2012. IEEE, 1–6.
• Car brands from the yahoo directory
 Yahoo! 2013. Yahoo! Directory: Automotive - Makes and Models. (2013). http://dir.yahoo.com/recreation/ automotive/makes and
models/
• Home based appliances from BLUED dataset
 Kyle Anderson, Adrian Ocneanu, Diego Benitez, Derrick Carlson, Anthony Rowe, and Mario Berges. 2012. BLUED: A Fully
Labeled Public Dataset for Event-Based Non-Intrusive Load Monitoring Research. In Proc. SustKDD.
• Rooms from DERI Building
 Richard Cyganiak. 2013. Rooms in the DERI building. (2013). http://lab.linkeddata.deri.ie/2010/deri-rooms
Souleiman Hasan and Edward Curry. 2014. Approximate Semantic Matching of Events for the
Internet of Things. ACM Trans. Internet Technol. 14, 1, Article 2 (August 2014), 23 pages.
DOI=10.1145/2633684 http://doi.acm.org/10.1145/2633684
Evaluation Methodology
Souleiman Hasan and Edward
Curry. 2014. Thematic event
processing. In Proceedings of
the 15th International
Middleware
Conference (Middleware '14).
ACM, Bordeaux, France, 109-
120.
Evaluation
• F-score up to 95% and throughput of 1000s of events/sec
Hasan, S. and Curry,
E., 2014.
Approximate
Semantic Matching
of Events for The
Internet of Things.
ACM Transactions
on Internet
Technology (TOIT).
Evaluation
Souleiman Hasan and Edward Curry. 2014. Thematic event processing. In Proceedings of the 15th International
Middleware Conference (Middleware '14). ACM, Bordeaux, France, 109-120.
Evaluation
Using fewer terms to describe events (around 2-7) and more to
describe subscriptions (around 2-15) can achieve good matching
quality and throughput together with lower error rates.
A lightweight amount of tagging is enough.
Souleiman Hasan and Edward Curry. 2014. Thematic event processing. In Proceedings of the 15th International
Middleware Conference (Middleware '14). ACM, Bordeaux, France, 109-120.
BUILDING IOT EVENT SYSTEMS
PART VII
Challenges for Building IoT Systems
• Vastly heterogeneous, decoupled, and
distributed nodes
• Lack of central coordination, reference
requirements, or data model
• High software design overhead associated with
establishing agreements between parties
Building IoT Event Systems
[Architecture diagram, with numbered steps: (1) a textual corpus is indexed into a vector space index; (2) a semantic relatedness web service receives pairs of terms and themes and returns relatedness scores; (3) publishers Alice, Carol, and Dave use collectors to gather heterogeneous IoT events from IoT sensors and publish them with thematic tags; (4) consumers Bob (user), Dan, and Erin (application developers) subscribe with thematic tags; (5) the thematic event processing engine(s) perform approximate single event matching; (6) relevant events are delivered, normalized for each consumer.]
Souleiman Hasan and
Edward Curry. 2015.
TACKLING VARIETY IN
INTERNET OF THINGS
EVENTS, IEEE Internet
Computing
• Step 1: Build the semantic model
 Enables the system to establish relationships between
various terms, such as ‘computer’ vs. ‘laptop’.
 Use distributional semantics: bottom-up, not fine-grained
model
 Revise model with software iterations
 update corpus
 Wikipedia or a subset of it is a good start
 Use enterprise Wikis if available
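Step 1 can be sketched as indexing a corpus into term vectors whose dimensions are documents. The three-document corpus below is made up, and raw term counts are used as weights for simplicity; a real deployment over Wikipedia would need a proper indexing pipeline and a weighting scheme such as TF-IDF.

```python
from collections import Counter, defaultdict


def build_index(corpus: dict) -> dict:
    """Map each term to a sparse vector {document id: term count}."""
    index = defaultdict(dict)
    for doc_id, text in corpus.items():
        for term, count in Counter(text.lower().split()).items():
            index[term][doc_id] = float(count)
    return dict(index)


# Hypothetical miniature corpus standing in for Wikipedia or an enterprise wiki.
corpus = {
    "d1": "laptop computer energy",
    "d2": "computer computer desk",
    "d3": "rainfall temperature humidity",
}

index = build_index(corpus)
print(index["computer"])  # {'d1': 1.0, 'd2': 2.0}
print(index["laptop"])    # {'d1': 1.0}
```

Updating the corpus and rebuilding the index is the revision loop the slide describes.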
• Step 2: Make the semantic measure available as a service
 REST and JSON
 Request:
http://example.com/esa?term1=energy&term2=electricity
 Response: {"relatedness": 0.154}
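A client-side sketch of this exchange using only the Python standard library. The host is the slide's placeholder rather than a live service, so the example builds the request URL and parses a canned response instead of making a network call; the function names are hypothetical.

```python
import json
from urllib.parse import urlencode

# Placeholder endpoint taken from the slide; not a live service.
BASE_URL = "http://example.com/esa"


def build_request_url(term1: str, term2: str) -> str:
    """Encode the two terms as query parameters of a GET request."""
    return f"{BASE_URL}?{urlencode({'term1': term1, 'term2': term2})}"


def parse_relatedness(body: str) -> float:
    """Extract the relatedness score from the JSON response body."""
    return float(json.loads(body)["relatedness"])


print(build_request_url("energy", "electricity"))
# http://example.com/esa?term1=energy&term2=electricity
print(parse_relatedness('{"relatedness": 0.154}'))  # 0.154
```

In a running system the URL would be fetched with an HTTP client and the response body passed to `parse_relatedness`.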
• Step 3: Publishers associate sensor events
with tags
 Thematic tags to represent domain and meaning
 Associate with attribute-value events
 Tags {energy, appliances, building} to accompany
increased energy consumption event
• Step 4: Subscribers associate subscriptions
with tags
 Thematic tags to represent domain and meaning
 Associate with attribute-value events
 Tags {power, computers} to accompany increased
energy consumption event of a device
• Step 5: Middleware matches events to
subscriptions
 Approximate, probabilistic semantic normalization
• Step 6: Middleware distributes matching
results to subscribers
 Relevant events are returned
 Uncertainty scores are associated
 Uncertainty reflects semantic normalization
FUTURE RESEARCH CHALLENGES
PART VIII
Future Research Challenges
• Investigation of user tagging behavior of
sensor events in zero-coupled open
environment
• Investigation of the limitations of evolution of
semantic agreements in open environments
Future Research Challenges
• Investigation of dynamic event enrichment for
pragmatic boundaries within loosely coupled
environments
Hasan, S., O’Riain, S. and Curry, E., 2013.
Towards Unified and Native Enrichment in
Event Processing Systems. In The 7th ACM
International Conference on Distributed
Event-Based Systems (DEBS 2013).
Arlington, Texas, USA: ACM
Future Research Challenges
• Dynamic enrichment
Hasan, S., O’Riain, S. and Curry, E., 2013.
Towards Unified and Native Enrichment in
Event Processing Systems. In The 7th ACM
International Conference on Distributed
Event-Based Systems (DEBS 2013).
Arlington, Texas, USA: ACM
Future Research Challenges
• Support complex pattern detection
• Uncertainty propagation
Hasan, S. and Curry,
E., 2014.
Approximate
Semantic Matching
of Events for The
Internet of Things.
ACM Transactions
on Internet
Technology (TOIT).
Future Research Challenges
• Support complex pattern detection
• Statistical monotonicity and top-k
Hasan, S. and Curry,
E., 2014.
Approximate
Semantic Matching
of Events for The
Internet of Things.
ACM Transactions
on Internet
Technology (TOIT).
Future Research Challenges
• Optimization of approximate matching for
higher throughputs
• Extension of the approximate thematic
matcher into parallel computation
• Investigation of integration of the vector
space model within current data management
systems paradigms
Conclusions
• Coupling necessary for crossing boundaries
• Decoupling necessary for scalable software
• Event-based systems need extension to
address the coupling/decoupling trade-off for
semantics
• Approximate and thematic event processing
exchange approximations of meaning with
loose semantic coupling
DISCUSSION POINT
• Overall questions and answers
References
• Cugola, G. and Margara, A., 2011. Processing flows of information: From data stream to complex event processing. ACM Computing Surveys Journal.
• Eugster, P.T., Felber, P.A., Guerraoui, R. and Kermarrec, A.M., 2003. The many faces of publish/subscribe. ACM Computing Surveys (CSUR), 35(2), pp.114–131.
• Carlile, Paul R. "Transferring, translating, and transforming: An integrative framework for managing knowledge across boundaries." Organization Science 15.5 (2004): 555-568.
• Souleiman Hasan and Edward Curry. 2015. Tackling Variety in Internet of Things Events. IEEE Internet Computing (In Press).
• Souleiman Hasan and Edward Curry. 2014. Approximate Semantic Matching of Events for the Internet of Things. ACM Trans. Internet Technol. 14, 1, Article 2 (August 2014), 23 pages. DOI=10.1145/2633684 http://doi.acm.org/10.1145/2633684
• Hasan, S., O’Riain, S. and Curry, E., 2013. Towards Unified and Native Enrichment in Event Processing Systems. In The 7th ACM International Conference on Distributed Event-Based Systems (DEBS 2013). Arlington, Texas, USA: ACM.
• Hasan, S., O’Riain, S. and Curry, E., 2012. Approximate Semantic Matching of Heterogeneous Events. In 6th ACM International Conference on Distributed Event-Based Systems (DEBS 2012). Berlin, Germany: ACM, pp. 252–263.
• Souleiman Hasan and Edward Curry. 2014. Thematic Event Processing. In Proceedings of the 15th International Middleware Conference (Middleware '14). ACM, Bordeaux, France, 109-120. DOI=10.1145/2663165.2663335 http://doi.acm.org/10.1145/2663165.2663335
• Hasan, S., Curry, E., Banduk, M., and O’Riain, S. Toward Situation Awareness for the Semantic Sensor Web: Complex Event Processing with Dynamic Linked Data Enrichment. The 4th International Workshop on Semantic Sensor Networks 2011 (SSN11), (2011), 60–72.
Dataset and Software
• Dataset
 Souleiman Hasan, Edward Curry, Thematic event
processing dataset, DOI: 10.13140/2.1.3342.9123
 Available at
http://www.researchgate.net/publication/263673956_Thematic_event_processing_dataset
• Collider
 Souleiman Hasan, Kalpa Gunaratna, Yongrui Qin, and Edward Curry. 2013.
Demo: approximate semantic matching in the collider event processing
engine. In Proceedings of the 7th ACM international conference on
Distributed event-based systems (DEBS '13). ACM, New York, NY, USA,
337-338. DOI=10.1145/2488222.2489277
http://doi.acm.org/10.1145/2488222.2489277
More References
• OECD, 2012. MACHINE-TO-MACHINE COMMUNICATIONS: CONNECTING BILLIONS OF DEVICES. OECD DIGITAL ECONOMY PAPERS, NO. 192.
• P. MCFEDRIES, THE COMING DATA DELUGE, IEEE SPECTRUM, 2011.
• CUGOLA, G. AND MARGARA, A., 2011. PROCESSING FLOWS OF INFORMATION: FROM DATA STREAM TO COMPLEX EVENT PROCESSING. ACM COMPUTING SURVEYS JOURNAL.
• EUGSTER, P.T., FELBER, P.A., GUERRAOUI, R. AND KERMARREC, A.M., 2003. THE MANY FACES OF PUBLISH/SUBSCRIBE. ACM COMPUTING SURVEYS (CSUR), 35(2), PP.114–131.
• LUCKHAM, D., 2002. THE POWER OF EVENTS: AN INTRODUCTION TO COMPLEX EVENT PROCESSING IN DISTRIBUTED ENTERPRISE SYSTEMS, ADDISON-WESLEY PROFESSIONAL.
• DAYAL, U., BLAUSTEIN, B., BUCHMANN, A., CHAKRAVARTHY, U., HSU, M., LEDIN, R., MCCARTHY, D., ROSENTHAL, A., SARIN, S., CAREY, M. J., LIVNY, M., AND
JAUHARI, R. 1988. THE HIPAC PROJECT: COMBINING ACTIVE DATABASES AND TIMING CONSTRAINTS. SIGMOD REC. 17, 1, 51–70.
• LIEUWEN, D. F., GEHANI, N. H., AND ARLEIN, R. M. 1996. THE ODE ACTIVE DATABASE: TRIGGER SEMANTICS AND IMPLEMENTATION. IN PROCEEDINGS OF THE 12TH INTERNATIONAL
CONFERENCE ON DATA ENGINEERING (ICDE’96). IEEE COMPUTER SOCIETY, LOS ALAMITOS, CA, 412–420.
• GATZIU, S. AND DITTRICH, K. 1993. EVENTS IN AN ACTIVE OBJECT-ORIENTED DATABASE SYSTEM. IN PROCEEDINGS OF THE INTERNATIONAL WORKSHOP ON RULES IN DATABASE
SYSTEMS (RIDS), N. PATON AND H. WILLIAMS, EDS. WORKSHOPS IN COMPUTING, SPRINGER-VERLAG, EDINBURGH, U.K.
• CHAKRAVARTHY, S. AND ADAIKKALAVAN, R. 2008. EVENTS AND STREAMS: HARNESSING AND UNLEASHING THEIR SYNERGY! IN PROCEEDINGS OF THE 2ND INTERNATIONAL
CONFERENCE ON DISTRIBUTED EVENT-BASED SYSTEMS (DEBS’08). ACM, NEW YORK, NY, 1–12.
• CHANDRASEKARAN, S., COOPER, O., DESHPANDE, A., FRANKLIN, M. J., HELLERSTEIN, J. M., HONG, W., KRISHNAMURTHY, S., MADDEN, S. R., REISS, F., AND
SHAH, M. A. 2003. TELEGRAPHCQ: CONTINUOUS DATAFLOW PROCESSING. IN PROCEEDINGS OF THE ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA
(SIGMOD’03). ACM, NEW YORK, NY, 668–668.
• CHEN, J., DEWITT, D. J., TIAN, F., AND WANG, Y. 2000. NIAGARACQ: A SCALABLE CONTINUOUS QUERY SYSTEM FOR INTERNET DATABASES. SIGMOD REC. 29, 2, 379–390.
• LIU, L., PU, C., AND TANG, W. 1999. CONTINUAL QUERIES FOR INTERNET SCALE EVENT-DRIVEN INFORMATION DELIVERY. IEEE TRANS. KNOWL. DATA ENG. 11, 4, 610–628.
• ARASU, A., BABU, S., AND WIDOM, J. 2006. THE CQL CONTINUOUS QUERY LANGUAGE: SEMANTIC FOUNDATIONS AND QUERY EXECUTION. VLDB J. 15, 2, 121–142.
• MÜHL, G., FIEGE, L., AND PIETZUCH, P. 2006. DISTRIBUTED EVENT-BASED SYSTEMS. SPRINGER
• ALTHERR, M., ERZBERGER, M., AND MAFFEIS, S. 1999. IBUS—A SOFTWARE BUS MIDDLEWARE FOR THE JAVA PLATFORM. IN PROCEEDINGS OF THE INTERNATIONAL WORKSHOP
ON RELIABLE MIDDLEWARE SYSTEMS. 43–53.
• TIBCO. 1999. TIB/RENDEZVOUS. WHITE PAPER. TIBCO, PALO ALTO, CA.
• A. FREITAS, J. G. OLIVEIRA, S. O’RIAIN, J. C. DA SILVA, AND E. CURRY. QUERYING LINKED DATA GRAPHS USING SEMANTIC RELATEDNESS: A VOCABULARY INDEPENDENT APPROACH. DATA & KNOWLEDGE ENGINEERING, 88:126–141, 2013.
More References
• DAVID S. ROSENBLUM AND ALEXANDER L. WOLF. 1997. A DESIGN FRAMEWORK FOR INTERNET-SCALE EVENT OBSERVATION AND NOTIFICATION. SIGSOFT SOFTW. ENG. NOTES 22, 6
(NOVEMBER 1997), 344-360. DOI=10.1145/267896.267920 HTTP://DOI.ACM.ORG/10.1145/267896.267920
• EUGSTER, P. AND GUERRAOUI, R. 2001. CONTENT BASED PUBLISH/SUBSCRIBE WITH STRUCTURAL REFLECTION. IN PROCEEDINGS OF THE 6TH USENIX CONFERENCE ON OBJECT-
ORIENTED TECHNOLOGIES AND SYSTEMS (COOTS’01).
• C. SHANNON AND W. WEAVER. THE MATHEMATICAL THEORY OF COMMUNICATION. UNIVERSITY OF ILLINOIS PRESS, 1949.
• P. R. CARLILE. TRANSFERRING, TRANSLATING, AND TRANSFORMING: AN INTEGRATIVE FRAMEWORK FOR MANAGING KNOWLEDGE ACROSS BOUNDARIES. ORGANIZATION SCIENCE,
15(5):555–568, 2004.
• CURRY, EDWARD, ET AL. "LINKING BUILDING DATA IN THE CLOUD: INTEGRATING CROSS-DOMAIN BUILDING DATA USING LINKED DATA." ADVANCED ENGINEERING INFORMATICS 27.2 (2013):
206-219.
• PATRICK TH. EUGSTER, PASCAL A. FELBER, RACHID GUERRAOUI, AND ANNE-MARIE KERMARREC. 2003. THE MANY FACES OF PUBLISH/SUBSCRIBE. ACM COMPUT. SURV. 35, 2 (JUNE
2003), 114-131.
• A. CARZANIGA, D. S. ROSENBLUM, AND A. L. WOLF. ACHIEVING SCALABILITY AND EXPRESSIVENESS IN AN INTERNET-SCALE EVENT NOTIFICATION SERVICE. IN PROCEEDINGS OF THE
NINETEENTH ANNUAL ACM SYMPOSIUM ON PRINCIPLES OF DISTRIBUTED COMPUTING, PAGES 219–227. ACM, 2000.
• M. PETROVIC, I. BURCEA, AND H.-A. JACOBSEN. S-TOPSS: SEMANTIC TORONTO PUBLISH/SUBSCRIBE SYSTEM. IN PROCEEDINGS OF THE 29TH INTERNATIONAL CONFERENCE ON VERY
LARGE DATA BASES - VOLUME 29, VLDB '03, PAGES 1101-1104. VLDB ENDOWMENT, 2003.
• HASAN, S. AND CURRY, E., 2014. APPROXIMATE SEMANTIC MATCHING OF EVENTS FOR THE INTERNET OF THINGS. ACM TRANSACTIONS ON INTERNET TECHNOLOGY (TOIT). IN PRESS
• HASAN, S. AND CURRY, E., 2014. THEMATIC EVENT PROCESSING. MIDDLEWARE 2014. UNDER REVIEW.
• HASAN, S. AND CURRY, E., 2014. TACKLING EVENT VARIETY IN INTERNET OF THINGS SOFTWARE. IEEE INTERNET COMPUTING 2015. UNDER REVIEW.
• LUIS SANCHEZ, JOSÉ ANTONIO GALACHE, VERONICA GUTIERREZ, JM HERNANDEZ, J BERNAT, ALEX GLUHAK, AND TOMÁS GARCIA. 2011. SMARTSANTANDER: THE MEETING POINT
BETWEEN FUTURE INTERNET RESEARCH AND EXPERIMENTATION AND THE SMART CITIES. IN FUTURE NETWORK & MOBILE SUMMIT (FUTURENETW), 2011. IEEE, 1–8.
• EDWARD CURRY, SOULEIMAN HASAN, AND SEAN O’RIAIN. 2012. ENTERPRISE ENERGY MANAGEMENT USING A LINKED DATASPACE FOR ENERGY INTELLIGENCE. IN SUSTAINABLE INTERNET
AND ICT FOR SUSTAINABILITY (SUSTAINIT), 2012. IEEE, 1–6.
• YAHOO! 2013. YAHOO! DIRECTORY: AUTOMOTIVE - MAKES AND MODELS. (2013). HTTP://DIR.YAHOO.COM/RECREATION/ AUTOMOTIVE/MAKES AND MODELS/
• KYLE ANDERSON, ADRIAN OCNEANU, DIEGO BENITEZ, DERRICK CARLSON, ANTHONY ROWE, AND MARIO BERGES. 2012. BLUED: A FULLY LABELED PUBLIC DATASET FOR EVENT-
BASED NON-INTRUSIVE LOAD MONITORING RESEARCH. IN PROC. SUSTKDD.
• RICHARD CYGANIAK. 2013. ROOMS IN THE DERI BUILDING. (2013). HTTP://LAB.LINKEDDATA.DERI.IE/2010/DERI-ROOMS

Tackling variety in event based systems

  • 1. Tutorial: Tackling Variety in Event-Based Systems. Souleiman Hasan, Edward Curry. DEBS 2015
  • 2. About Us. Souleiman Hasan: PhD Researcher, Insight @ NUI Galway. Interests: Semantic Event Processing, Internet of Things, Semantic Web. souleiman.hasan@insight-centre.org, http://www.souleimanhasan.org/ Dr. Edward Curry: Unit Leader, Insight @ NUI Galway. Interests: Event Systems, Energy Intelligence, Internet of Things. ed.curry@insight-centre.org, http://edwardcurry.org/ DEBS ’15, 29 June - 03 July 2015, Oslo, Norway
  • 3. Overview. 13:40-14:00 Part I: Events in the IoT; 14:00-14:20 Part II: Computational Paradigms; 14:20-14:40 Part III: A Theory for Event Exchange; 14:40-15:10 Part IV: Semantics and Approximation; 15:10-15:30 Break; 15:30-16:00 Part V: Approaches to Semantic Coupling; 16:00-16:20 Part VI: Thematic Event Processing; 16:20-16:40 Part VII: Building IoT Event Systems; 16:40-17:00 Part VIII: Future Research Challenges
  • 4. Part I: Events in the IoT
  • 5. Current Trends: Smart Homes, Cities, Internet of Things, Big Data. By 2020, 50 billion devices connected to mobile networks (OECD, 2012). OECD, 2012. Machine-to-Machine Communications: Connecting Billions of Devices. OECD Digital Economy Papers, No. 192.
  • 6. From Internet of Things to Internet of Everything
  • 8. From Rigid Schemas to Schema-less. Heterogeneous, complex and large-scale data. Very large and dynamic “schemas”. Open environments: distributed, decoupled data sources, anonymous users, multi-domain, lack of global order of information flow. Circa 2000: 10s-100s of attributes; circa 2014: 1,000s-1,000,000s of attributes. Slide credits: Andre Freitas
  • 9. Fundamental Decentralization. Multiple perspectives (conceptualizations) of the reality. Ambiguity, vagueness, inconsistency. Slide credits: Andre Freitas
  • 10. Shift in Data Production/Consumption. Number of information sources: increases; mobile subscribers grew from 2,205 million in 2005 to 6,662 million in 2013, 200% (ITU, 2014). Data heterogeneity: increases; Falcons discovered 4,000 ontologies in 2008, rising to 6,400 in 2015 (Cheng et al., 2008). Number of non-technical users: increases; Internet users grew from 1,024 million in 2005 to 2,710 million in 2013, 160% (ITU, 2014); USA: 3% in STEM disciplines in 2010 (NSF, 2010). Organization of users: distributed and decoupled; Wikipedia crowdsourcers show global diversity (Ross et al., 2010). Timeliness: required; it is important to filter important data items as early as possible (Jagadish et al., 2014). Information completeness: uncertain; uncertainty, errors, and missing values are endemic and must be managed (Jagadish et al., 2014).
  • 11. Current Trends (small-scale, controlled environments vs. large-scale, open environments). Information sources: 10s to 100s vs. 1000s to millions. Data heterogeneity: small number of schemas vs. high number of schemas. Users: a small number who know the environment vs. a large number who do not quite know the environment. Users organization: users know each other, top-down hierarchies (e.g. enterprises) vs. decoupled and distributed. Dynamism: low vs. high (sources and users join and leave often). Domain: domain specific vs. user interests ranging from domain specific to domain agnostic.
  • 12. Internet of Things (IoT) Case Study
  • 13. Internet of Things. Sensing and Communication Layer (RFID, 6LoWPAN, IPv6, …). Middleware Layer (MoM, SOA, …). Applications Layer (logistics, healthcare, smart envs, analytics, robo-taxis, virtual reality, …). Atzori, Luigi, Antonio Iera, and Giacomo Morabito. "The Internet of Things: A survey." Computer Networks 54.15 (2010): 2787-2805.
  • 14. Smart City: How Real? “SmartSantander proposes a unique way in the world city-scale experimental research facility in support of typical applications and services for a smart city” http://www.smartsantander.eu/
  • 15. IoT in Light of Big Data. Significant efforts in IoT come from the Sensing and Communication communities. Challenges of the IoT will be more prevalent at the data level (Aggarwal et al., 2013).
  • 16. IoT in Light of Big Data. Datasets too large and complex to be processed by current data processing applications. Volume: terabytes, petabytes, … Variety: many sources, syntax and semantics. Velocity: near real-time, real-time. Stonebraker, Michael. "What Does 'Big Data' Mean." Communications of the ACM, BLOG@ACM (2012).
  • 17. Volume. Distributed and parallel processing: Hadoop, MapReduce, …, Spark. Example, word count in Spark: file = spark.textFile("hdfs://...") then file.flatMap(lambda line: line.split()).map(lambda word: (word, 1)).reduceByKey(lambda a, b: a + b)
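As a rough illustration of what the Spark pipeline on this slide computes, the same three stages can be sketched in plain Python over an in-memory list. The function name `word_count` and the sample lines are assumptions made for illustration; nothing here uses the Spark API or runs in parallel.

```python
from collections import Counter
from itertools import chain


def word_count(lines):
    """Count words the way the flatMap -> map -> reduceByKey pipeline does."""
    # flatMap: split every line and flatten the result into one stream of words
    words = chain.from_iterable(line.split() for line in lines)
    # map + reduceByKey: Counter adds one occurrence per word and sums per key
    return dict(Counter(words))


counts = word_count(["smart city events", "smart home events"])
```

In Spark the per-key summation is distributed across nodes; this sketch only shows the logical result of the pipeline.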
  • 18. Velocity. Stream and event processing. Lambda architecture’s speed layer. Spark Streaming. Storm.
  • 19. Variety. Mainly heterogeneous data models (heterogeneous schema in DB terminology). Currently targeted by: ETL; data integration; Semantic Web technologies. Example: SSN Ontology.
  • 20. Discussion point: What current paradigms do you think will struggle with IoT?
  • 21. Why Do We Really Have Variety? Symbolization process in computing systems: it is easier for computers to process symbols. We use symbols to reduce meanings, mostly words. E.g. the Unique Name Assumption in the DB community: “Different things and concepts are denoted using different names”; California and CA denote different things according to this assumption; the whole processing model is symbolic.
  • 22. Why Do We Really Have Variety? The same assumption is usually made in event systems: data models in middleware are symbolic. Decoupled, distributed parties symbolize differently! Variety happens!
  • 23. Part II: Computational Paradigms
  • 24. Information Flow Processing (IFP). Users need to collect information: produced by multiple distributed sources; for timely processing; to extract knowledge as soon as possible. Examples: financial continuous analytics, RFID inventory management, environmental monitoring. Cugola, G. and Margara, A., 2011. Processing flows of information: From data stream to complex event processing. ACM Computing Surveys Journal.
  • 25. Information Flow Processing (IFP). Processing information as it flows: no intermediate storage; new information produced; raw information can be discarded. Figure: producers, consumers and rule managers around an Information Flow Processing Engine.
  • 26. Information Flow Processing (IFP). Requirements: real-time or near real-time processing; an expressive language for rules; scalability to a large number of producers and consumers.
  • 27. Active Databases. Traditional database systems: passive; store data and wait for the user’s interaction; reactive behaviour in the application layer. Dayal, U., Blaustein, B., Buchmann, A., Chakravarthy, U., Hsu, M., Ledin, R., McCarthy, D., Rosenthal, A., Sarin, S., Carey, M. J., Livny, M., and Jauhari, R. 1988. The HiPAC project: Combining active databases and timing constraints. SIGMOD Rec. 17, 1, 51–70. Lieuwen, D. F., Gehani, N. H., and Arlein, R. M. 1996. The Ode active database: Trigger semantics and implementation. In Proceedings of the 12th International Conference on Data Engineering (ICDE’96). IEEE Computer Society, Los Alamitos, CA, 412–420. Gatziu, S. and Dittrich, K. 1993. Events in an active object-oriented database system. In Proceedings of the International Workshop on Rules in Database Systems (RIDS), N. Paton and H. Williams, Eds. Workshops in Computing, Springer-Verlag, Edinburgh, U.K. Chakravarthy, S. and Adaikkalavan, R. 2008. Events and streams: Harnessing and unleashing their synergy! In Proceedings of the 2nd International Conference on Distributed Event-Based Systems (DEBS’08). ACM, New York, NY, 1–12.
  • 28. Active Databases. Reactive behaviour moved to the database layer. Event-Condition-Action (ECA) rules: Event: the source, e.g. tuple inserted; Condition: checked after the event, e.g. inserted.value > 5; Action: what to do, e.g. modify the DB. Cons: persistent storage model; suitable when updates are not frequent and there are few rules.
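The Event-Condition-Action structure on this slide can be sketched as follows. This is a minimal in-memory illustration, not how any particular active database implements triggers; the names `EcaRule` and `ActiveTable` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EcaRule:
    event: str                          # event type that triggers the rule
    condition: Callable[[dict], bool]   # checked after the event occurs
    action: Callable[[dict], None]      # executed when the condition holds


class ActiveTable:
    """A toy table that reacts to inserts by evaluating its ECA rules."""

    def __init__(self):
        self.rows = []
        self.rules = []

    def insert(self, row):
        self.rows.append(row)
        # the "tuple inserted" event fires every rule registered for inserts
        for rule in self.rules:
            if rule.event == "insert" and rule.condition(row):
                rule.action(row)


alerts = []
table = ActiveTable()
# Rule from the slide: on insert, if inserted.value > 5, perform an action
table.rules.append(EcaRule("insert", lambda r: r["value"] > 5, alerts.append))
table.insert({"value": 3})
table.insert({"value": 9})
```

Every insert evaluates every rule, which hints at why the slide lists frequent updates and many rules as a weakness of this model.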
  • 29. Data Stream Management Systems. Streams are unbounded (unlike tables). No arrival order assumptions. Typically no storage. Use continuous, or standing, queries. Reactive in nature. Chandrasekaran, S., Cooper, O., Deshpande, A., Franklin, M. J., Hellerstein, J. M., Hong, W., Krishnamurthy, S., Madden, S. R., Reiss, F., and Shah, M. A. 2003. TelegraphCQ: Continuous dataflow processing. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD’03). ACM, New York, NY, 668–668. Chen, J., DeWitt, D. J., Tian, F., and Wang, Y. 2000. NiagaraCQ: A scalable continuous query system for Internet databases. SIGMOD Rec. 29, 2, 379–390. Liu, L., Pu, C., and Tang, W. 1999. Continual queries for internet scale event-driven information delivery. IEEE Trans. Knowl. Data Eng. 11, 4, 610–628. Arasu, A., Babu, S., and Widom, J. 2006. The CQL continuous query language: Semantic foundations and query execution. VLDB J. 15, 2, 121–142.
  • 30. Data Stream Management Systems. Continuous query semantics: the answer is an append-only stream or an updated store; the answer is exact or approximate. Cons: the atomic item is the stream; not possible to detect sequencing or causal patterns.
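A continuous query with an append-only answer stream can be illustrated with a sliding-window average. This is a sketch under the assumption of a count-based window; the function name `windowed_average` and the sample readings are made up for illustration.

```python
from collections import deque


def windowed_average(stream, size):
    """Continuous query: emit the mean of the last `size` readings per arrival."""
    window = deque(maxlen=size)   # bounded window; the oldest item falls out
    for reading in stream:
        window.append(reading)
        # each arrival appends one new answer to the output stream
        yield sum(window) / len(window)


answers = list(windowed_average([40, 70, 600, 50], size=2))
```

The query never stores the full stream; only the current window is kept in memory.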
  • 31. Publish/Subscribe Systems. Information items are notifications. Indirect addressing-based communication scheme. Ancestors: message passing; Remote Procedure Call (RPC); shared spaces; message queueing. Eugster, P.T., Felber, P.A., Guerraoui, R. and Kermarrec, A.M., 2003. The many faces of publish/subscribe. ACM Computing Surveys (CSUR), 35(2), pp.114–131. Mühl, G., Fiege, L., and Pietzuch, P. 2006. Distributed Event-Based Systems. Springer.
  • 32. Message Queues
  • 33. Messaging Models. Two main message models are commonly available: point-to-point and publish/subscribe. Both are based on the exchange of messages through a channel (queue). A typical system will utilize a mix of these models to achieve different messaging objectives.
  • 34. Point-to-Point Model. Straightforward asynchronous exchange of messages: a message is routed to consuming clients via a queue; no restriction on the number of publishing clients; usually only a single consuming client; a message is delivered only once, to only one receiver.
  • 35. Point-to-Point Model. Messages are always delivered and will be stored in the queue until a consumer is ready to retrieve them.
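The point-to-point behaviour described on the two slides above (messages wait in the queue, and each is delivered to exactly one receiver) can be sketched with Python's standard `queue` module. The message names and the `consume` helper are assumptions for illustration; a real message-oriented middleware would add persistence and acknowledgements.

```python
import queue

channel = queue.Queue()   # the point-to-point channel

# Producers may publish while no consumer is running; messages are retained.
channel.put("reading-1")
channel.put("reading-2")


def consume(q):
    """Take the next message; once taken, no other consumer can see it."""
    return q.get_nowait()


first = consume(channel)    # delivered to one consumer
second = consume(channel)   # delivered to another consumer
```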
  • 36. Publish/Subscribe Models. One-to-many and many-to-many distribution mechanism: allows a single producer to send a message to one user or potentially hundreds of thousands of consumers. Clients "publish" to a specific topic or channel.
  • 37. Publish/Subscribe Models. Channels are “subscribed” to by clients to consume messages. No restriction on the role of a client: it may be both a producer and a consumer of a channel.
  • 38. Publish/Subscribe Systems. Topic-based pub/sub: topics are groups or channels; events of a topic are sent to the topic’s subscribers. Altherr, M., Erzberger, M., and Maffeis, S. 1999. iBus - a software bus middleware for the Java platform. In Proceedings of the International Workshop on Reliable Middleware Systems. 43–53. TIBCO. 1999. TIB/Rendezvous. White paper. TIBCO, Palo Alto, CA.
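Topic-based routing can be sketched as a broker that keeps a list of subscriber callbacks per topic. The class `TopicBroker`, the topic names, and the event fields are hypothetical and chosen only to illustrate the idea.

```python
from collections import defaultdict


class TopicBroker:
    def __init__(self):
        self.subscribers = defaultdict(list)   # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, event):
        # every subscriber of the topic receives the event
        for callback in self.subscribers[topic]:
            callback(event)


broker = TopicBroker()
energy_inbox, parking_inbox = [], []
broker.subscribe("energy", energy_inbox.append)
broker.subscribe("parking", parking_inbox.append)
broker.publish("energy", {"amount_kwh": 40})
```

The publisher addresses only the topic name, never the subscribers, which is the space decoupling the tutorial discusses later.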
  • 39. Publish/Subscribe Systems. Content-based pub/sub: matching by message filters; the channels between publishers and subscribers are defined by the content and the subscriptions. David S. Rosenblum and Alexander L. Wolf. 1997. A design framework for Internet-scale event observation and notification. SIGSOFT Softw. Eng. Notes 22, 6 (November 1997), 344-360. DOI=10.1145/267896.267920 http://doi.acm.org/10.1145/267896.267920
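In contrast to topics, content-based matching evaluates a filter against each event. A minimal sketch, assuming filters are plain Python predicates; `ContentBroker` and the event fields are hypothetical names.

```python
class ContentBroker:
    def __init__(self):
        self.subscriptions = []   # (filter predicate, callback) pairs

    def subscribe(self, predicate, callback):
        self.subscriptions.append((predicate, callback))

    def publish(self, event):
        # routing is decided by the event content, not by a named channel
        for predicate, callback in self.subscriptions:
            if predicate(event):
                callback(event)


broker = ContentBroker()
high_usage = []
broker.subscribe(
    lambda e: e.get("type") == "energy consumption" and e.get("amount_kwh", 0) > 50,
    high_usage.append,
)
broker.publish({"type": "energy consumption", "amount_kwh": 40})
broker.publish({"type": "energy consumption", "amount_kwh": 70})
```

Note that the filter still compares the `type` value as an exact string, so the subscriber must know the publisher's vocabulary.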
  • 40. Publish/Subscribe Systems. Type-based pub/sub: matching on a type hierarchy; the type hierarchy can come from programming language inheritance hierarchies. Eugster, P. and Guerraoui, R. 2001. Content based publish/subscribe with structural reflection. In Proceedings of the 6th Usenix Conference on Object-Oriented Technologies and Systems (COOTS’01).
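Type-based matching can be illustrated with an inheritance hierarchy, where a subscription to a type also covers its subtypes. The class names below are invented for the example.

```python
class Event:
    pass


class SensorEvent(Event):
    pass


class EnergyEvent(SensorEvent):
    pass


class ParkingEvent(SensorEvent):
    pass


def matches(subscribed_type, event):
    """A subscription to a type also matches all of its subtypes."""
    return isinstance(event, subscribed_type)


events = [EnergyEvent(), ParkingEvent(), Event()]
sensor_matches = [type(e).__name__ for e in events if matches(SensorEvent, e)]
```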
  • 41. Publish/Subscribe Systems. Events and matching: ordered tuples; attribute-values; XML documents. Cons: single event matching only.
  • 42. Complex Event Processing Systems. Detection of complex patterns (sequencing, causal, ordering in general) of multiple events, generating complex, or derived, events. Luckham, D., 2002. The Power of Events: An Introduction to Complex Event Processing in Distributed Enterprise Systems, Addison-Wesley Professional.
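A sequencing pattern over multiple events, which single-event pub/sub matching cannot express, can be sketched as below. The event names, the time window, and the function `detect_sequence` are assumptions for illustration; real CEP engines offer much richer pattern languages.

```python
def detect_sequence(events, first, second, within):
    """Yield a derived event when `second` follows `first` within `within` seconds."""
    last_first = None                      # most recent unmatched `first`
    for ts, kind in events:                # events arrive ordered by timestamp
        if kind == first:
            last_first = ts
        elif kind == second and last_first is not None and ts - last_first <= within:
            yield {"type": f"{first}->{second}", "start": last_first, "end": ts}
            last_first = None              # consume the match


stream = [(0, "door_open"), (4, "motion"), (30, "motion"),
          (40, "door_open"), (60, "motion")]
derived = list(detect_sequence(stream, "door_open", "motion", within=10))
```

Only the first pair falls inside the 10-second window, so a single derived event is produced.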
  • 43. Complex Event Processing Systems. Figure adapted from Cugola, G. and Margara, A., 2011. Processing flows of information: From data stream to complex event processing. ACM Computing Surveys Journal. Souleiman Hasan and Edward Curry. 2014. Approximate Semantic Matching of Events for the Internet of Things. ACM Trans. Internet Technol. 14, 1, Article 2 (August 2014), 23 pages.
  • 44. Discussion point: What paradigms are the audience used to? What is their experience with each?
  • 45. Part III: A Theory for Event Exchange
  • 46. 3 Traits of Large-Scale Event Processing Systems. 1- Distribution. Two complementary aspects: the first is the placement of processing workloads on different nodes, thus making use of parallel computing. The second is that large-scale environments are inherently distributed, with event production and consumption happening at distributed components. Thus, even when dealing with a centralized event processing engine, the inherently distributed nature of the environment of event producers and consumers must be taken into account.
  • 47. 3 Traits of Large-Scale Event Processing Systems. 2- Heterogeneity. Differences in hardware components, protocols, operating systems, middleware, and data. Mühl et al.: “Syntax and semantics of notifications are likely to vary and there are inevitably different data models in use.” We deal here with semantic heterogeneity. Semantics are discussed in Part IV.
  • 48. Problem. Event producers and consumers are semantically coupled: consumers need prior knowledge of event types, attributes and values; this limits scalability in heterogeneous and dynamic environments due to explicit dependencies; it makes developing event processing subscriptions/rules difficult in heterogeneous and dynamic environments. Figure: producer and consumer linked along the space, time, synchronization and semantic dimensions.
  • 49. Exact Matching Model (traditional event processing). Event producers (e.g. sensors) emit: e1 (Type: Energy Consumption; Place: Room 202e; Amount: 40 kWh), e2 (Type: Electricity Consumption; Location: Room 202e; Amount: 70 kWh), e3 (Type: Electricity Utilized; Venue: Room 202e; Amount: 600 kWh). The consumer needs one exact subscription per variant: Type = “Energy Consumption”, Place = “Room 202e” for e1; Type = “Electricity Consumption”, Location = “Room 202e” for e2; Type = “Electricity Utilized”, Venue = “Room 202e” for e3.
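The exact matching model on this slide can be reproduced with the three events shown. The dictionary encoding and the function `exact_match` are a sketch, not the representation used by any specific engine.

```python
events = {
    "e1": {"Type": "Energy Consumption", "Place": "Room 202e", "Amount": "40 kWh"},
    "e2": {"Type": "Electricity Consumption", "Location": "Room 202e", "Amount": "70 kWh"},
    "e3": {"Type": "Electricity Utilized", "Venue": "Room 202e", "Amount": "600 kWh"},
}


def exact_match(subscription, event):
    """Every attribute and value in the subscription must appear verbatim."""
    return all(event.get(attr) == value for attr, value in subscription.items())


subscription = {"Type": "Energy Consumption", "Place": "Room 202e"}
matched = [name for name, event in events.items() if exact_match(subscription, event)]
```

A single subscription reaches only e1, so the consumer has to write and maintain one subscription for each producer vocabulary.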
  • 50. Semantic Matching (semantic event processing). Event producers (e.g. sensors) emit: e1 (Type: Energy Consumption; Place: Room 202e; Amount: 40 kWh), e2 (Type: Electricity Consumption; Location: Room 202e; Amount: 70 kWh), e3 (Type: Electricity Utilized; Venue: Room 202e; Amount: 600 kWh). A single approximate subscription, Type = “Energy Consumption”~, Location = “Room 202e”, delivers e1, e2 and e3 to the consumer.
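To make the contrast with exact matching concrete, the sketch below replaces string equality with a relatedness test. The `RELATED` table is a hand-written, hypothetical stand-in for a semantic relatedness measure, and approximation is applied to every attribute and value for simplicity, whereas the slide marks only the Type value with `~`.

```python
# Hypothetical groups of related terms; a real system would compute
# relatedness with a semantic measure rather than a hand-written table.
RELATED = [
    {"energy consumption", "electricity consumption", "electricity utilized"},
    {"place", "location", "venue"},
]


def related(a, b):
    a, b = a.lower(), b.lower()
    return a == b or any(a in group and b in group for group in RELATED)


def semantic_match(subscription, event):
    """Each subscription pair needs a related attribute carrying a related value."""
    return all(
        any(related(attr, e_attr) and related(value, e_value)
            for e_attr, e_value in event.items())
        for attr, value in subscription.items()
    )


events = {
    "e1": {"Type": "Energy Consumption", "Place": "Room 202e", "Amount": "40 kWh"},
    "e2": {"Type": "Electricity Consumption", "Location": "Room 202e", "Amount": "70 kWh"},
    "e3": {"Type": "Electricity Utilized", "Venue": "Room 202e", "Amount": "600 kWh"},
}
subscription = {"Type": "Energy Consumption", "Location": "Room 202e"}
matched = [name for name, event in events.items() if semantic_match(subscription, event)]
```

One subscription now covers all three vocabularies, at the cost of depending on the quality of the relatedness knowledge.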
  • 51. 3 Traits of Large-Scale Event Processing Systems. 3- Openness. Although the term “open” has been used frequently in the literature to describe distributed event systems at large scales, it has not been defined precisely. We draw upon the definition used in systems theory: a “system that has external interactions in form of information, energy, or matter transfer through its boundary.” From the semantics perspective, an open event system is an event environment where an agent can exchange events with other agents that use different event semantics.
  • 52. The Principle of Decoupling. Eugster et al. define decoupling as “removing all explicit dependencies between the interacting participants.” Implicit interaction: the control over an event-based system is decentralized into an autonomous version.
  • 53. The Principle of Decoupling. Event processing: exchanges atomic items called events; scalable by decoupling (Eugster et al.). Figure: a producer sends an event to a consumer, decoupled in space (no addresses), time (active or not) and synchronization (no blocking).
  • 54. Discussion point: The hypothesis is that removing explicit dependencies between event producers and consumers leads to increased scalability. Where did the dependencies go?
  • 55. How Good are Our Paradigms? Scale: big volume, big velocity, big variety. Distributed sources and consumers. The big challenge is now in the exchange of knowledge at a very large scale.
  • 56. Shannon-Weaver Model
  • 58. Cross-Boundaries Exchange. Figure: syntactic, semantic and pragmatic boundaries between a producer and a consumer, each in a known environment, separated by an open environment. P. R. Carlile. Transferring, translating, and transforming: An integrative framework for managing knowledge across boundaries. Organization Science, 15(5):555–568, 2004.
  • 59. Syntactic Boundary. Transfer is the most common type of information movement across this boundary. A common lexicon exists: move and process syntax (0’s and 1’s); the dominant form of Shannon and Weaver’s theory. Examples: different data models of events; e.g. transfer RDF events over HTTP.
  • 60. Semantic Boundary. A common lexicon doesn’t exist. Lexicons evolve. Ambiguities exist. Translation is the process to cross this boundary. Examples: different ontologies for sensors; ontology alignment for RDF events.
  • 61. Pragmatic Boundary. Actors on the sides of the boundary have different contexts, different perspectives and different interests. Transformation is the process to cross this boundary. Example: a temperature sensor reading of 35 degrees Celsius is acceptable from outdoor sensors but not from indoor ones.
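The temperature example on this slide can be sketched as a context-dependent check. The numeric ranges in `ACCEPTABLE_RANGE` are invented for illustration and do not come from the tutorial.

```python
# Hypothetical acceptable ranges per sensor context, in degrees Celsius.
ACCEPTABLE_RANGE = {"outdoor": (-30.0, 45.0), "indoor": (10.0, 30.0)}


def is_acceptable(reading_celsius, context):
    """Interpret the same reading differently depending on the sensor's context."""
    low, high = ACCEPTABLE_RANGE[context]
    return low <= reading_celsius <= high


outdoor_ok = is_acceptable(35.0, "outdoor")
indoor_ok = is_acceptable(35.0, "indoor")
```

The event payload is identical in both calls; only the context supplied by the consumer changes the interpretation, which is what distinguishes the pragmatic boundary from the semantic one.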
  • 62. Cross-Boundaries Exchange P. R. Carlile. Transferring, translating, and transforming: An integrative framework for managing knowledge across boundaries. Organization science, 15(5):555{568, 2004. Known environmen t Syntactic Semantic Pragmatic Producer Consumer Boundaries Open environment Known environment 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 62
  • 63. Transfer-Translate-Transform • Current approaches in event processing • Transfer  Common event models  Common language models • E.g. RDF over HTTP 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 63
  • 64. Transfer-Translate-Transform • Current approaches in event processing • Translate  Agreements on schemas/thesauri/ontologies • E.g. DERI Energy ontology for building energy events Curry, Edward, et al. "Linking building data in the cloud: Integrating cross-domain building data using linked data." Advanced Engineering Informatics 27.2 (2013): 206-219. 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 64
  • 65. Transfer-Translate-Transform • Current approaches in event processing • Transform  Dedicated enrichers, joins in event languages • CQELS language for Linked Stream Data mashups 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 65
  • 66. Decoupling for Scalability [Figure labels: Event source, Event Processing, Event consumer; decoupling dimensions: Space, Time, Synchronization] Patrick Th. Eugster, Pascal A. Felber, Rachid Guerraoui, and Anne-Marie Kermarrec. 2003. The many faces of publish/subscribe. ACM Comput. Surv. 35, 2 (June 2003), 114-131.
  • 67. A Trade-Off • Current decoupling approaches scale at the lower boundaries • A human in the loop is needed to cross the higher boundaries, which introduces coupling and limits scalability [Figure labels: Producer, Consumer; Boundaries: Syntactic, Semantics (meanings), Pragmatics (contexts); De/Coupling: Formats, Space, Time, Synchronization, Agreements, Ext. data; Heterogeneous, Distributed, Open Environments]
  • 68. A Trade-off • A. Small-scale known environment: low cost to cross the boundary (agreements, number of rules) ◦ Publisher Alice, syntactic boundary, semantic boundary, Consumer Bob, with semantic normalization for Bob • B. Large-scale open environment (e.g. IoT): high cost to cross the boundary (agreements, number of rules) ◦ Publisher Alice, syntactic boundary, semantic boundary, Consumers Bob, Dan, and Erin • Events shown in the figure ◦ e = {type: energy consumption increase, location: university street} ◦ eB (semantic normalization for Bob) = {type: electricity usage rise, place: university street} ◦ eD = {type: energy usage rise, place: university road} ◦ eE = {type: energy usage rise, location: university road} Souleiman Hasan and Edward Curry. 2015. Tackling Variety in Internet of Things Events, IEEE Internet Computing (In Press)
  • 69. Semantic Coupling [Figure labels: Event source, Event Processing, Event consumer; decoupling dimensions: Space, Time, Synchronization; Semantic Coupling: type, attributes, values] Souleiman Hasan and Edward Curry. 2014. Approximate Semantic Matching of Events for the Internet of Things. ACM Trans. Internet Technol. 14, 1, Article 2 (August 2014), 23 pages
  • 70. DISCUSSION POINT • How significant is semantic coupling compared to other decoupling dimensions? • What are possible solutions? 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 70
  • 71. SEMANTICS AND APPROXIMATION PART IV 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 71
  • 72. What is Semantics? 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 72
  • 73. What is Semantics? • Relationship between two spaces (or worlds or sets): the meanings and the symbols (Peter Gardenfors) • Semiotics and sign systems
  • 74. What is Semantics? 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 74 • Semiotics (Chandlers, 2001)
  • 75. Semantic Heterogeneity • Semantics maps a language L to meanings M 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 75
  • 76. The Meanings Space The meanings space: • Objects are individuals like a specific laptop used by Alice. • Properties are a “way of abstracting away redundant information about objects”. E.g. Alice's laptop is “black”, which is a property. • Concepts are the most generic form of objects and properties. A concept clusters similar properties and objects, such as the concept “Laptop”.
  • 77. Three Main Levels of Semantics 1. Symbolic 2. Conceptual or Sub-Symbolic 3. Non-Symbolic 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 77
  • 78. Symbolic Semantics • Information is represented by symbols, e.g. “apple” • Processing of information is by definition a manipulation of symbols through rules. • Symbols can be gathered into sentences of a language of thought. • What a sentence means is a belief of an agent. • Various beliefs are connected by logical or inferential relations, such as first-order logic in Artificial Intelligence (AI) • Meanings are purely the result of logical, syntactic relations of symbols, rather than the states they refer to.
  • 79. Symbolism and Computationalism “One of the fundamental contributions to knowledge of computer science has been to explain, at a rather basic level, what symbols are. This explanation is a scientific proposition about Nature. It is empirically derived, with a long and gradual development. Symbols lie at the root of intelligent action, which is, of course, the primary topic of artificial intelligence. For that matter, it is a primary question for all of computer science.” Newell and Simon 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 79
  • 80. Symbolism and Computationalism • Fundamental tenet in AI • But also in databases and hence event systems • E.g. Codd’s relational model • E.g. Unique name assumption in databases 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 80
  • 81. Symbolic Semantics • Extensional Semantics where a property is defined by the set of objects in the world that have the property. • Intensional Semantics alters the concept of one world to the case of multiple possible worlds, to tackle properties like small or big • Situation Semantics uses one world model, but instead of truth functions from symbols or sentences to possible worlds, it uses a polarity function from symbols or sentences to a subset of the world, called situation. 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 81
  • 82. Critiques of Symbolic Semantics 1. They do not explain how a person can perceive two properties to be similar. 2. They give a limited account of inductive reasoning. 3. The frame problem, which states that representing all necessary knowledge about the world requires a combinatorial explosion of logical axioms and inferences. 4. The symbol grounding problem, which states that in the symbolic paradigm the meanings of symbols are actually grounded in symbols themselves. 5. Symbolism does not largely separate the symbolic level from the meaning level. When event agents need to agree on the meanings, they have to agree on the symbols, which is a highly costly process and thus hinders the loose semantic coupling requirement.
  • 83. Conceptual Sub-Symbolic Semantics • Fundamentally leverage topological features of meanings, e.g. Apple is closer to Orange than to Car • Geometrical nature of the meaning space • Distances and closeness between meanings can be established. • E.g. Gardenfors conceptual spaces ◦ Concepts are structured into domains, e.g. the domain of colors, the spatial domain, etc. ◦ Conceptual spaces are then built up from quality dimensions, which serve the purpose of building the domains. For instance, the colors domain can be built up from three dimensions: hue, chromaticness or saturation, and brightness.
  • 84. Conceptual Sub-Symbolic Semantics • Fundamentally leverage topological features of meanings, e.g. Apple is closer to Orange than to Car • Geometrical nature of the meaning space • Distances and closeness between meanings can be established. • E.g. Gardenfors conceptual spaces ◦ Computationally challenging due to the need to build and agree on quality dimensions
  • 85. Distributional Semantic Model • Distributional hypothesis: the context surrounding a given word in a text provides relevant information about its meaning. • Simplified semantic model. • Associational and quantitative. • Explicit Semantic Analysis (ESA) is the primary distributional model used in this work. A wife is a female partner in a marriage. The term "wife" seems to be a close term to bride, the latter is a female participant in a wedding ceremony, while a wife is a married woman during her marriage. ... Slide Credits: Andre Freitas (http://andrefreitas.org/), [Freitas et al., 2013] 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 85
  • 86. Distributional Semantic Model [Figure: the words child, husband, and spouse plotted in a vector space with context dimensions c1, c2, ..., cn; each coordinate is a function of the number of times the word occurs in that context; example weights 0.7 and 0.5; annotation: “Commonsense is here”] Slide Credits: Andre Freitas (http://andrefreitas.org/), [Freitas et al., 2013]
  • 87. Semantic Relatedness [Figure: angle θ between the word vectors for child, husband, and spouse in the space c1, c2, ..., cn] • Works as a semantic ranking function • E.g. esa(room, building) = 0.099 • E.g. esa(room, car) = 0.009 Slide Credits: Andre Freitas (http://andrefreitas.org/) [Freitas et al., 2013]
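The ranking idea on this slide can be sketched with a cosine measure over sparse context vectors. This is a minimal illustration only: the vectors and weights below are invented toy values, not ESA output, and the names `cosine`, `vectors`, and `rank` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two sparse vectors (dict: context -> weight)."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy context vectors (assumed values, for illustration only).
vectors = {
    "room": {"c1": 0.7, "c2": 0.5},
    "building": {"c1": 0.6, "c2": 0.3, "c3": 0.4},
    "car": {"c3": 0.1, "c4": 0.9},
}


def rank(term, candidates):
    """Order candidates from most to least related to `term`."""
    return sorted(candidates,
                  key=lambda t: cosine(vectors[term], vectors[t]),
                  reverse=True)


ranking = rank("room", ["car", "building"])
print(ranking)
```

Only the ordering matters for ranking, which is why small absolute scores such as 0.099 versus 0.009 are still useful.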
  • 88. Critiques of Sub-Symbolic Semantics • Compositionality: they mainly concern lexical meanings, i.e. meanings of individual terms, rather than complex sentences. • We argue that the compositionality problem is not an issue for event matching, because linguistic structures and syntax are not the kind of data model used in event processing systems to represent events and subscriptions.
  • 89. Non-Symbolic Semantics • Artificial Neural Networks (ANNs) • Connectionism • Fundamentally can be abstracted into geometrical models • Difficult to build and interpret
  • 90. Semantic Models 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 90
  • 91. Tagging • Inspired by work in social tagging, i.e. folksonomies • Folksonomies are a bottom-up approach to semantics • Free words • Tag events, such as in the IoT • Leads to the concept of thingsonomies
  • 92. Free Tagging and Thingsonomies • Top-down taxonomies difficult to impose • Folksonomies successful in social tagging 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 92
  • 93. Approximation • Coupling is important to cross semantic and pragmatic boundaries, but it limits scalability. • Loosening coupling at these levels is a compromise to tackle the trade-off between decoupling for scalability and crossing the boundaries. • The cost of this compromise is a loss in effectiveness while crossing the boundaries, i.e. loss of some precision and context when processing the events. • In the literature, approximation has been used to address ◦ Time efficiency, e.g. approximation algorithms ◦ Full integration, e.g. uncertain schema matching, Gal et al.
  • 94. Approximation • Exact matching assumes full agreements • Approximation flexible with uncertainties 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 94
  • 95. APPROACHES TO SEMANTIC COUPLING PART V 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 95
  • 96. Loosening the Semantic Coupling • Approach 1: Content-Based with Semantic Decoupling • Approach 2: Content-Based with Implicit Shared Agreements • Approach 3: Concept-Based • Approach 4: Loose Semantic Coupling + Approximation Hasan, S. and Curry, E., 2014. Approximate Semantic Matching of Events for The Internet of Things. ACM Transactions on Internet Technology (TOIT). • Approach 5: Theme-Based Hasan, S. and Curry, E., 2014. Thematic Event Processing. Middleware 2014. 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 96
  • 97. Approach 1: Content-Based with Semantic Decoupling [Figure: the producer publishes an event as “A happened”; the consumer subscribes as “interested in B”; semantic decoupling between them] • Very low detection rate ◦ High false positives/negatives ◦ Low precision/recall
  • 98. Current Approaches Semantic Decoupling Effectiveness & Efficiency Content-based Concept-based Bottom-up Semantics 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 98
  • 99. Approach 1: Content-Based with Semantic Decoupling [Figure: the producer publishes an event as “A happened”; the consumer subscribes as “interested in A”, “interested in B”, “interested in C”; semantic decoupling between them] • Use many rules to improve detection ◦ Time and effort ◦ Affects scalability to heterogeneous environments
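A minimal sketch of why Approach 1 needs many rules, using the event and Bob's vocabulary from the earlier trade-off slide. Exact string matching misses the event until hand-written rewrite rules are supplied. The function names and the rule tables are assumptions made for this illustration.

```python
def matches_exact(event, subscription):
    """True when every subscribed attribute is in the event with the same string value."""
    return all(event.get(attr) == value for attr, value in subscription.items())


def normalize(event, attr_rules, value_rules):
    """Rewrite attribute names and values using hand-written rules (assumed tables)."""
    return {attr_rules.get(a, a): value_rules.get(v, v) for a, v in event.items()}


# Event as published by Alice and subscription in Bob's own vocabulary.
event = {"type": "energy consumption increase", "location": "university street"}
bob = {"type": "electricity usage rise", "place": "university street"}

# Rules that someone has to write and maintain for each consumer vocabulary.
bob_attr_rules = {"location": "place"}
bob_value_rules = {"energy consumption increase": "electricity usage rise"}

direct = matches_exact(event, bob)
with_rules = matches_exact(normalize(event, bob_attr_rules, bob_value_rules), bob)
print(direct, with_rules)
```

Each additional consumer vocabulary needs its own rule tables, which is the time-and-effort cost named on the slide.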
  • 100. Approach 2: Content-Based with Implicit Shared Agreements [Figure: the producer publishes an event as “A happened”; the consumer subscribes as “interested in A”; semantic coupling via implicit agreements made face-to-face or via documentation, e.g. “use symbol A to describe”]
  • 101. Approach 2: Content-Based with Implicit Shared Agreements • Implicit semantics  Top-down approach to semantics  Granular on the level of concepts Producer Consumer event Semantic Coupling via Implicit Agreements Happened Publish: A Happened Interested in Subscribe: Interested in A
  • 102. Approach 2: Content-Based with Implicit Shared Agreements • Need for shared agreements  Time and effort  Affects scalability to heterogeneous environments Producer Consumer event Semantic Coupling via Implicit Agreements Happened Publish: A Happened Interested in Subscribe: Interested in A
  • 103. Approach 3: Concept-Based [Figure: the producer publishes an event as “A happened”; the consumer subscribes as “interested in B”; semantic coupling via ontologies, shown as classes A to F linked by subClassOf relations]
  • 104. Approach 3: Concept-Based • Explicit semantics  Top-down approach to semantics  Granular on the level of concepts Producer Consumer event Semantic Coupling via Ontologies Happened Publish: A Happened Interested in Subscribe: Interested in B
  • 105. Approach 3: Concept-Based • Need for shared agreements  Time and effort  Affects scalability to heterogeneous environments Producer Consumer event Semantic Coupling via Ontologies Happened Publish: A Happened Interested in Subscribe: Interested in B
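A minimal sketch of concept-based (Boolean semantic) matching: a subscription to concept B matches an event typed A when A is a subclass of B in a shared ontology. The shape of the hierarchy below is an assumption; the slide's figure names classes A to F and a subClassOf relation, but the transcript does not preserve which class is under which.

```python
# child -> parent edges of an assumed subClassOf hierarchy.
SUBCLASS_OF = {"A": "B", "E": "B", "B": "C", "D": "C", "F": "D"}


def ancestors(concept):
    """All superclasses of `concept`, nearest first."""
    found = []
    while concept in SUBCLASS_OF:
        concept = SUBCLASS_OF[concept]
        found.append(concept)
    return found


def concept_matches(event_concept, subscribed_concept):
    """Boolean match: same concept, or the event concept is subsumed by the subscribed one."""
    return (event_concept == subscribed_concept
            or subscribed_concept in ancestors(event_concept))


print(concept_matches("A", "B"), concept_matches("A", "D"))
```

The match is only as good as the shared hierarchy, which is where the agreement cost of this approach comes from.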
  • 106. Approach 4: Loose Semantic Coupling + Approximation [Figure: the producer publishes an event as “A happened”; the consumer subscribes as “interested in B”; loose semantic coupling via large text corpora, where A is represented by documents d1 d2 d3 d4 d5 d6 d7 d8 …, B by documents d1 d3 d4 d17 d25 d26 d77 d78 …, and the two are matched approximately (~)] Souleiman Hasan and Edward Curry. 2014. Approximate Semantic Matching of Events for the Internet of Things. ACM Trans. Internet Technol. 14, 1, Article 2 (August 2014), 23 pages
  • 107. Approach 4: Loose Semantic Coupling + Approximation • Bottom-up model of semantics • Global semantics: distribution vs. granular [Figure: the producer publishes an event as “A happened”; the consumer subscribes as “interested in B”; loose semantic coupling via large text corpora; approximate match (~)]
  • 108. Approach 4: Loose Semantic Coupling + Approximation • Low cost to scale to heterogeneous environments • Slightly lower detection rate [Figure: the producer publishes an event as “A happened”; the consumer subscribes as “interested in B”; loose semantic coupling via large text corpora; approximate match (~)]
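A minimal sketch of the approximate match between symbols A and B, using the document lists shown in the Approach 4 figure. Two simplifications are assumptions of this sketch: the truncated lists (the figure ends each with “…”) are treated as complete, and every document gets weight 1, whereas a real distributional model would use weighted vectors. The threshold value is also invented.

```python
import math

# Documents in which each symbol occurs, as listed in the figure.
docs_a = {"d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"}
docs_b = {"d1", "d3", "d4", "d17", "d25", "d26", "d77", "d78"}


def relatedness(x, y):
    """Cosine similarity of two binary document vectors, written with sets."""
    if not x or not y:
        return 0.0
    return len(x & y) / math.sqrt(len(x) * len(y))


THRESHOLD = 0.3  # assumed cut-off; a deployment would tune this

score = relatedness(docs_a, docs_b)
approx_match = score >= THRESHOLD
print(score, approx_match)
```

No agreement on symbols is needed here; both sides only rely on the same corpus, which is the loose coupling the slides describe.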
  • 109. Approach 5: Thematic Event Processing [Figure: the producer publishes an event as “A happened”; the consumer subscribes as “interested in B”; loose semantic coupling via large text corpora; approximate match (~)] • Can we exchange better approximations of meanings, rather than mere symbols, to improve the detection rate?
  • 110. Approach 5: Thematic Event Processing [Figure: the producer publishes (A+T1); the consumer subscribes to (B+T2); loose semantic coupling via large text corpora, where A is represented by documents d1 d2 d3 d4 d5 d6 d7 d8 …, B by documents d1 d3 d4 d17 d25 d26 d77 d78 …, and the two are matched approximately (~)] Souleiman Hasan and Edward Curry. 2014. Thematic event processing. In Proceedings of the 15th International Middleware Conference (Middleware '14).
  • 111. Summary
  | | Simple content-based | Content-based + many rules | Concept-based | Simple distributional + approximation | Thematic |
  | Matching | exact string matching | exact string matching | Boolean semantic matching | approximate semantic matching | approximate semantic matching |
  | Semantic coupling | term-level full agreement | term-level full agreement | concept-level shared agreement | loose agreement | loose agreement |
  | Semantics | not explicit | not explicit | top-down ontology-based | statistical model based on distributional semantics | statistical model based on distributional semantics + themes |
  | Effectiveness | very low | 100% | depends on the domains and number of concept models | depends on the corpus | depends on the corpus + theme representatives |
  | Cost | defining a small number of rules | defining a large number of rules | establishing shared agreement on ontologies | minimal agreement on a large textual corpus | minimal agreement on a large textual corpus + good theme representatives |
  | Efficiency | high | high | medium to high | medium to high | medium to high |
  • 112. DISCUSSION POINT • Is the audience familiar with the symbolic vs. non-symbolic debate in AI? • What position do they take? • How relevant to our current discussion? 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 112
  • 113. RDF EVENT PROCESSING CASE STUDY 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 113
  • 114. From the Web to the Semantic Web • 1989: Tim Berners-Lee proposed what became the Web • Hypertext over HTTP and URI • 2001: TBL et al. published their Scientific American paper “The Semantic Web” • A structured Web for machines • Builds over the Web architecture
  • 115. From the Web to the Semantic Web • Initial Semantic Web vision • Standards: W3C • Research: mainly ISWC, ESWC • Recently, attention moved to the lower layers of the stack • Called: Linked Data
  • 116. Linked Data is Web-based • Leverages the architecture of the Web to make sharing data easier • Linked Data is a method of exposing, sharing, and connecting data (via dereferenceable URIs) on the Web. • Provides a Data (RDF) and Naming (URI) model for the Web • W3C Web-based Standards • Adaptive Ontologies 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 116
  • 117. Linked Data is Web-based 1. Use URIs as names for things. 2. Use HTTP URIs so that people can look up those names. 3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL). 4. Include links to other URIs, so that they can discover more things. http://www.w3.org/DesignIssues/LinkedData.html
  • 118. Linked Data Cloud [Figure labels: US government, UK government, BBC, New York Times, LinkedGeoData, BestBuy, Overstock.com, Facebook; categories: Media, Government, Geo, Publications, User-generated, Life sciences, Cross-domain] Over 200 open data sets with more than 25 billion facts, interlinked by 400 million typed links, doubling every 10 months! Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch.
  • 119. Two Key Ingredients • RDF – Resource Description Framework Graph based Data – nodes and arcs  Identifies objects (URIs)  Interlink information (Relationships) • Vocabularies (Ontologies)  provide shared understanding of a domain  organise knowledge in a machine-comprehensible way  give an exploitable meaning to the data 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 119
  • 120. Example: Linked Building Data • Resource Description Framework (RDF): subject - predicate - object • Edward Curry (http://www.deri.ie/about/team/member/edward_curry/) is the Occupant of (http://vocab.deri.ie/rooms#occupant) Room 202e (http://lab.linkeddata.deri.ie/2010/deri-rooms#r202e)
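The statement on this slide can be written down as a single subject-predicate-object triple. The sketch below uses plain Python rather than an RDF library; the URI strings are copied verbatim from the transcript, the subject/object direction follows the slide's sentence, and the `Triple`, `to_ntriples`, and `match` names are hypothetical helpers for this illustration.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# "Edward Curry is the Occupant of Room 202e", URIs as they appear in the transcript.
triple = Triple(
    "http://www.deri.ie/about/team/member/edward_curry/",
    "http://vocab.deri.ie/rooms#occupant",
    "http://lab.linkeddata.deri.ie/2010/deri-rooms#r202e",
)


def to_ntriples(t: Triple) -> str:
    """Serialize one triple of URIs as an N-Triples line."""
    return f"<{t.subject}> <{t.predicate}> <{t.obj}> ."


def match(t: Triple, s: Optional[str] = None, p: Optional[str] = None,
          o: Optional[str] = None) -> bool:
    """Triple-pattern match where None acts as a wildcard."""
    return all(want is None or want == got
               for want, got in zip((s, p, o), t))


print(to_ntriples(triple))
```

Matching a subscription against RDF events then amounts to evaluating such patterns over the event's graph.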
  • 121. Example: Linked Building Data 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 121
  • 122. Why Linked Data for the IoT? • Many communities struggle with closed approaches ◦ E.g., pervasive computing, embedded systems, IoT, … • Cyber-Physical Systems are inherently “open world” ◦ Prof. David Karger (MIT) in his ESWC 2013 keynote: “Semantic Web technologies support an open world assumption where millions of unforeseeable schemas may have to be integrated.”
  • 123. Why Linked Data for the IoT? • Simple integration with existing LOD data sets  Geo-spatial, governmental, media, … • Manageable integration effort with other graph data, e.g., Google Knowledge Graph, Facebook Graph, etc. 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 123
  • 124. DISCUSSION POINT • How to classify RDF event processing according to the previous classification? • What are the pros and cons? • How to make RDF event processing more suitable for large-scale IoT with variety?
  • 125. DISCUSSION POINT • RDF Event Processing ◦ Based on a top-down semantic model, i.e. ontologies ◦ Uses RDF as data model ◦ Uses URIs for interlinking ◦ SPARQL-like query languages ◦ Graph matching
  • 126. DISCUSSION POINT • RDF Event Processing ◦ Top-down semantics difficult to agree on ◦ Granular semantics difficult to agree on ◦ RDF graph model powerful ◦ URIs good for linking ◦ SPARQL-like query languages, expressive ◦ SPARQL-like query languages, not user friendly
  • 127. DISCUSSION POINT • RDF Event Processing ◦ Is an example of the Concept-Based approach to semantic coupling ◦ Needs some research to address the problems of this class, but also to gain the power of the RDF model and URIs ◦ Efforts in querying RDF graphs using distributional semantics can largely feed this research, see [Freitas et al., 2013]
  • 128. THEMATIC EVENT PROCESSING PART VI
  • 129. The Thematic Approach • Decoupling is good for scalability • Decoupling parties creates challenges to cross semantic boundaries • We can solve that by re-coupling parties at semantic boundaries, but it is not ideal 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 129
  • 130. The Thematic Approach Problem: How can we achieve effective and efficient event processing without causing semantic coupling at the semantic boundaries? 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 130
  • 131. The Thematic Approach • Exchange approximations of meanings • Instead of exchanging words, try to convey their meaning too • How? Use tags that can help understand their meaning
  • 132. The Thematic Approach • Inspired by work in social tagging, i.e. folksonomies • Folksonomies are a bottom-up approach to semantics Souleiman Hasan and Edward Curry. 2014. Thematic event processing. In Proceedings of the 15th International Middleware Conference (Middleware '14). ACM, Bordeaux, France, 109-120. DOI=10.1145/2663165.2663335 http://doi.acm.org/10.1145/2663165.2663335 Image credit http://lowriderlibrarian.blogspot.fr/
  • 133. The Thematic Approach [Figure: publisher Alice's event consists of a theme th_e and a payload; consumer Bob's subscription consists of a theme th_s and an expression; the themes parameterize the approximate matcher] • Loose coupling mode: lightweight agreement on themes • No coupling mode: free use of well representative themes Souleiman Hasan and Edward Curry. 2014. Thematic event processing. In Proceedings of the 15th International Middleware Conference (Middleware '14). ACM, Bordeaux, France, 109-120. DOI=10.1145/2663165.2663335 http://doi.acm.org/10.1145/2663165.2663335
  • 134. Event Representation • Thematic tags added to events • Event ◦ Theme tags: {energy, appliances, building} ◦ Payload: {type: increased energy consumption event, measurement unit: kilowatt per hour, device: computer, office: room 112}
  • 135. Subscription Representation • Thematic tags added to subscriptions • Subscription ◦ Theme tags: {power, computers} ◦ Expression: {type = increased energy usage event~, device~ = laptop~, office = room 112}
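The event and subscription on the two slides above can be held in small data structures. This is a sketch under one stated assumption: the slides do not define the `~` marker, and here it is read as "approximate matching allowed" on the attribute or value it follows. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Event:
    themes: set     # thematic tags
    payload: dict   # attribute -> value


@dataclass
class Predicate:
    attribute: str
    value: str
    approx_attribute: bool = False  # attribute was written with a trailing ~
    approx_value: bool = False      # value was written with a trailing ~


@dataclass
class Subscription:
    themes: set
    predicates: list


def parse_predicate(text):
    """Parse a predicate such as 'device~= laptop~' into a Predicate."""
    attr, value = (part.strip() for part in text.split("=", 1))
    return Predicate(attr.rstrip("~").strip(), value.rstrip("~").strip(),
                     attr.endswith("~"), value.endswith("~"))


event = Event(
    themes={"energy", "appliances", "building"},
    payload={"type": "increased energy consumption event",
             "measurement unit": "kilowatt per hour",
             "device": "computer", "office": "room 112"},
)
subscription = Subscription(
    themes={"power", "computers"},
    predicates=[parse_predicate(p) for p in (
        "type= increased energy usage event~",
        "device~= laptop~",
        "office= room 112",
    )],
)
print(subscription.predicates[1])
```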
  • 136. Matching Model • Top-1 and Top-k most probable mappings {type: increased energy consumption event, measurement unit: kilowatt per hour, device: computer, desk: desk 112c, office: room 112, floor: ground floor, zone: building, city: Galway, country: Ireland, continent: Europe} {type = increased energy consumption event, device = laptop~, room~esa = room 112} 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 136
  • 137. Probabilistic Approximate Matcher • Top-1 and Top-k mappings between an event and a subscription 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 137
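To make "top-1 and top-k mappings" concrete, the sketch below enumerates every assignment of subscription predicates to distinct event tuples, scores each assignment, and keeps the best k. The scoring rule (a product of attribute and value similarities), the similarity table, and the brute-force enumeration are all assumptions of this illustration, not the matcher from the cited papers. The event is a subset of the one on the Matching Model slide.

```python
import heapq
from itertools import permutations

# Assumed pairwise relatedness scores; identical strings score 1.0, unknown pairs 0.0.
SIM = {
    ("laptop", "computer"): 0.8,
    ("room", "office"): 0.6,
    ("room", "desk"): 0.2,
    ("room 112", "desk 112c"): 0.3,
}


def sim(a, b):
    if a == b:
        return 1.0
    return SIM.get((a, b), SIM.get((b, a), 0.0))


def top_k_mappings(subscription, event, k):
    """Return the k best (score, chosen event tuples) over all injective mappings."""
    scored = []
    for chosen in permutations(event, len(subscription)):
        score = 1.0
        for (s_attr, s_val), (e_attr, e_val) in zip(subscription, chosen):
            score *= sim(s_attr, e_attr) * sim(s_val, e_val)
        scored.append((score, chosen))
    return heapq.nlargest(k, scored, key=lambda item: item[0])


event = [("type", "increased energy consumption event"),
         ("device", "computer"),
         ("desk", "desk 112c"),
         ("office", "room 112")]
subscription = [("type", "increased energy consumption event"),
                ("device", "laptop"),
                ("room", "room 112")]

top = top_k_mappings(subscription, event, k=2)
for score, mapping in top:
    print(round(score, 3), mapping)
```

The enumeration grows factorially with the number of tuples, so it is only suitable for showing the idea on small events.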
  • 138. Parameterized Similarity • Thematic tags used to parameterize the semantic measure 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 138
  • 139. Interpreting Terms • Project vectors in a distributional semantic vector space • Thematic projection 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 139
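One simple way to picture thematic projection: keep only the dimensions of a term vector that the theme tags also touch, then compare the projected vectors. This is a deliberately simplified stand-in for the projection in the cited work; the vectors, dimension names, and helper names are invented for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two sparse vectors (dict: dimension -> weight)."""
    dot = sum(w * v.get(d, 0.0) for d, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def theme_basis(theme_tags, vectors):
    """Dimensions on which at least one theme tag has a non-zero weight."""
    return {d for tag in theme_tags for d, w in vectors[tag].items() if w}


def project(vector, basis):
    """Drop every dimension outside the thematic basis."""
    return {d: w for d, w in vector.items() if d in basis}


# Toy distributional vectors over document dimensions (assumed values).
vectors = {
    "power": {"electricity_doc": 0.6, "politics_doc": 0.7},
    "energy": {"electricity_doc": 0.8, "physics_doc": 0.4},
    "appliances": {"electricity_doc": 0.5, "kitchen_doc": 0.5},
}

plain = cosine(vectors["power"], vectors["energy"])
basis = theme_basis({"appliances"}, vectors)
themed = cosine(project(vectors["power"], basis), project(vectors["energy"], basis))
print(round(plain, 3), round(themed, 3))
```

In this toy case the theme removes the political sense of "power", so the themed score is higher than the plain one.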
  • 140. Evaluation Metrics • Number of exact rules • Precision • Recall • F1 score • Throughput • Standard error
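For reference, precision, recall, and the F1 score can be computed from the set of events a matcher delivered and the set that was actually relevant. The event identifiers below are made up for the example.

```python
def precision_recall_f1(delivered, relevant):
    """Standard set-based precision, recall, and F1 (harmonic mean of the two)."""
    true_positives = len(delivered & relevant)
    precision = true_positives / len(delivered) if delivered else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


delivered = {"e1", "e2", "e3", "e4"}   # events the matcher returned
relevant = {"e1", "e2", "e5"}          # events the subscriber actually wanted

precision, recall, f1 = precision_recall_f1(delivered, relevant)
print(precision, recall, f1)
```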
  • 141. Evaluation Dataset • Seed events synthesized from IoT sensors • SmartSantander smart city project ◦ Luis Sanchez, José Antonio Galache, Veronica Gutierrez, JM Hernandez, J Bernat, Alex Gluhak, and Tomás Garcia. 2011. SmartSantander: The meeting point between Future Internet research and experimentation and the smart cities. In Future Network & Mobile Summit (FutureNetw), 2011. IEEE, 1–8. • Sensor Capabilities ◦ solar radiation, particles, speed, wind direction, wind speed, temperature, water flow, atmospheric pressure, noise, ozone, rainfall, parking, radiation par, co, ground temperature, light, no2, soil moisture tension, relative humidity, energy consumption, cpu usage, memory usage Souleiman Hasan and Edward Curry. 2014. Approximate Semantic Matching of Events for the Internet of Things. ACM Trans. Internet Technol. 14, 1, Article 2 (August 2014), 23 pages. DOI=10.1145/2633684 http://doi.acm.org/10.1145/2633684
  • 142. Evaluation Dataset • Seed events synthesized from IoT sensors • Linked Energy Intelligence platform ◦ Edward Curry, Souleiman Hasan, and Sean O’Riain. 2012. Enterprise energy management using a linked dataspace for Energy Intelligence. In Sustainable Internet and ICT for Sustainability (SustainIT), 2012. IEEE, 1–6. • Car brands from the Yahoo! Directory ◦ Yahoo! 2013. Yahoo! Directory: Automotive - Makes and Models. (2013). http://dir.yahoo.com/recreation/ automotive/makes and models/ • Home-based appliances from the BLUED dataset ◦ Kyle Anderson, Adrian Ocneanu, Diego Benitez, Derrick Carlson, Anthony Rowe, and Mario Berges. 2012. BLUED: A Fully Labeled Public Dataset for Event-Based Non-Intrusive Load Monitoring Research. In Proc. SustKDD. • Rooms from the DERI Building ◦ Richard Cyganiak. 2013. Rooms in the DERI building. (2013). http://lab.linkeddata.deri.ie/2010/deri-rooms Souleiman Hasan and Edward Curry. 2014. Approximate Semantic Matching of Events for the Internet of Things. ACM Trans. Internet Technol. 14, 1, Article 2 (August 2014), 23 pages. DOI=10.1145/2633684 http://doi.acm.org/10.1145/2633684
  • 143. Evaluation Methodology Souleiman Hasan and Edward Curry. 2014. Thematic event processing. In Proceedings of the 15th International Middleware Conference (Middleware '14). ACM, Bordeaux, France, 109- 120. 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 143
  • 144. Evaluation • FScore up to 95% and 1000s events/sec Hasan, S. and Curry, E., 2014. Approximate Semantic Matching of Events for The Internet of Things. ACM Transactions on Internet Technology (TOIT). 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 144
  • 145. Evaluation Souleiman Hasan and Edward Curry. 2014. Thematic event processing. In Proceedings of the 15th International Middleware Conference (Middleware '14). ACM, Bordeaux, France, 109-120. 29 June- 03 July 2015, Oslo, NorwayDEBS ’15 145
  • 146. Evaluation • Using fewer terms to describe events (around 2-7) and more to describe subscriptions (around 2-15) can achieve good matching quality and throughput together with lower error rates. • A lightweight amount of tags. Souleiman Hasan and Edward Curry. 2014. Thematic event processing. In Proceedings of the 15th International Middleware Conference (Middleware '14). ACM, Bordeaux, France, 109-120.
  • 147. BUILDING IOT EVENT SYSTEMS PART VII
  • 148. Challenges for Building IoT Systems • Vastly heterogeneous, decoupled, and distributed nodes • Lack of central coordination, reference requirements, or data model • High software design overhead associated with establishing agreements between parties
  • 149. Building IoT Event Systems [Architecture figure: (1) a textual corpus is indexed into a vector space index; (2) a semantic relatedness web service receives terms + themes pairs and returns a relatedness score; (3) collectors for publishers Alice, Carol, and Dave publish heterogeneous IoT sensor events + thematic tags; (4) consumers Bob (user), Dan, and Erin (application developers) subscribe + thematic tags; (5) thematic event processing engine(s) perform approximate single event matching; (6) relevant events are returned normalized for Bob, Dan, and Erin] Souleiman Hasan and Edward Curry. 2015. Tackling Variety in Internet of Things Events, IEEE Internet Computing
  • 150. Building IoT Event Systems [Architecture figure repeated from slide 149] • Step 1: Build the semantic model ◦ Enables the system to establish relationships between terms such as ‘computer’ and ‘laptop’ ◦ Use distributional semantics: a bottom-up, not fine-grained model ◦ Revise the model with software iterations: update the corpus ◦ Wikipedia or a subset of it is a good start ◦ Use enterprise wikis if available
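Step 1 can be pictured as turning a textual corpus into a term-by-document index. The sketch below builds such an index with a plain TF-IDF weighting over a three-sentence toy corpus. The corpus, the whitespace tokenizer, and the weighting formula are assumptions chosen for brevity; a real deployment would index Wikipedia or an enterprise wiki with proper text processing.

```python
import math
from collections import Counter, defaultdict


def build_index(corpus):
    """corpus: dict doc_id -> text. Returns term -> {doc_id: tf-idf weight}."""
    term_counts = {doc_id: Counter(text.lower().split())
                   for doc_id, text in corpus.items()}
    # Number of documents each term appears in.
    doc_freq = Counter(term for counts in term_counts.values() for term in counts)
    n_docs = len(corpus)
    index = defaultdict(dict)
    for doc_id, counts in term_counts.items():
        for term, tf in counts.items():
            idf = math.log(n_docs / doc_freq[term])
            if idf > 0.0:  # terms found in every document carry no signal
                index[term][doc_id] = tf * idf
    return dict(index)


corpus = {
    "d1": "a laptop is a portable computer",
    "d2": "a desktop computer sits on a desk",
    "d3": "a car drives on a road",
}
index = build_index(corpus)
print(index["computer"])
```

Each row of the index is the vector that the relatedness measure in Step 2 compares.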
  • 151. Building IoT Event Systems [Architecture figure repeated from slide 149] • Step 2: Make the semantic measure available ◦ REST and JSON ◦ Request: http://example.com/esa?term1=energy&term2=electricity ◦ Response: {"relatedness": 0.154}
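A client for the service in Step 2 only needs to build the query string and read one JSON field. The sketch below does both without touching the network. The base URL is a placeholder assumption, and the `term1`, `term2`, and `relatedness` names are taken from the request and response shown on the slide.

```python
import json
from urllib.parse import urlencode

BASE_URL = "http://example.com/esa"  # placeholder endpoint, assumed for this sketch


def build_request_url(term1, term2, base_url=BASE_URL):
    """Build the GET URL for one relatedness lookup."""
    return base_url + "?" + urlencode({"term1": term1, "term2": term2})


def parse_relatedness(body):
    """Extract the score from a JSON response body."""
    return float(json.loads(body)["relatedness"])


url = build_request_url("energy", "electricity")
score = parse_relatedness('{"relatedness" : 0.154}')
print(url, score)
```

Multi-word terms are percent-encoded by `urlencode`, so a term such as "energy consumption" is safe to pass as-is.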
  • 152. Building IoT Event Systems [Architecture figure repeated from slide 149] • Step 3: Publishers associate sensor events with tags ◦ Thematic tags to represent domain and meaning ◦ Associate with attribute-value events ◦ Tags {energy, appliances, building} to accompany increased energy consumption event
  • 153. Building IoT Event Systems
    [Diagram: architecture overview repeated from slide 150.]
    • Step 4: Subscribers associate subscriptions with tags
      ◦ Thematic tags represent domain and meaning
      ◦ Tags are associated with attribute-value subscriptions
      ◦ Example: tags {power, computers} accompany a subscription to increased energy consumption events of a device
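A subscription in Step 4 can be sketched in the same spirit: attribute-value predicates plus the subscriber's own tags. The example below also shows why exact matching is not enough when publishers and subscribers choose their terms independently. The class name, predicate names and sample events are hypothetical; only the tag set {power, computers} comes from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThematicSubscription:
    predicates: tuple   # (attribute, value) pairs the subscriber asks for
    themes: frozenset   # thematic tags chosen by the subscriber


def exact_match(subscription, event):
    """Classic content-based matching: every predicate must match literally."""
    return all(event.get(attr) == value for attr, value in subscription.predicates)


# Bob's subscription, phrased in his own vocabulary.
bob = ThematicSubscription(
    predicates=(("type", "increased energy usage"), ("device", "computer")),
    themes=frozenset({"power", "computers"}),
)

# A relevant event phrased in the publisher's vocabulary.
published = {"type": "increased energy consumption", "device": "laptop"}
# The same event rewritten in Bob's terms.
normalized = {"type": "increased energy usage", "device": "computer"}

missed = exact_match(bob, published)    # False: vocabularies differ
found = exact_match(bob, normalized)    # True: terms agree literally
```

The first event is relevant to Bob but is missed by literal comparison, which is the gap the approximate matcher in the next step is meant to close.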
  • 154. Building IoT Event Systems
    [Diagram: architecture overview repeated from slide 150.]
    • Step 5: Middleware matches events to subscriptions
      ◦ Approximate, probabilistic semantic normalization
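The matching in Step 5 can be approximated with a deliberately simplified sketch. For each subscription predicate it picks the event attribute-value pair with the highest combined relatedness and multiplies the per-predicate scores into one overall score. The score table is invented for illustration and stands in for calls to the semantic relatedness service. Thematic tags are ignored and a one-to-one mapping between predicates and attributes is not enforced, so this is not the matcher described in the tutorial.

```python
# Hypothetical relatedness scores; a real system would query the
# semantic relatedness web service instead of this lookup table.
TOY_SCORES = {
    frozenset({"energy consumption", "power usage"}): 0.8,
    frozenset({"laptop", "computer"}): 0.9,
    frozenset({"device", "appliance"}): 0.7,
}


def relatedness(term_a, term_b):
    """Return 1.0 for identical terms, a table score otherwise, else 0.0."""
    if term_a == term_b:
        return 1.0
    return TOY_SCORES.get(frozenset({term_a, term_b}), 0.0)


def match_score(subscription, event):
    """Return (overall score, mapping from predicate attribute to event attribute)."""
    score = 1.0
    mapping = {}
    for s_attr, s_val in subscription.items():
        best_attr, best = None, 0.0
        for e_attr, e_val in event.items():
            candidate = relatedness(s_attr, e_attr) * relatedness(s_val, e_val)
            if candidate > best:
                best_attr, best = e_attr, candidate
        mapping[s_attr] = best_attr
        score *= best
    return score, mapping


subscription = {"type": "power usage", "device": "computer"}
event = {"type": "energy consumption", "appliance": "laptop", "room": "office"}
score, mapping = match_score(subscription, event)
```

Here the subscription attribute `device` is mapped to the event attribute `appliance`, and the overall score of roughly 0.5 expresses how confident the matcher is in that normalization.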
  • 155. Building IoT Event Systems
    [Diagram: architecture overview repeated from slide 150.]
    • Step 6: Middleware distributes matching results to subscribers
      ◦ Relevant events are returned
      ◦ Uncertainty scores are associated with each returned event
      ◦ The uncertainty reflects the semantic normalization
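Step 6 can be sketched as a small dispatcher that groups scored matches per subscriber and attaches the score to each delivered event. The per-subscriber minimum score is an assumption of this sketch, as are the function name `distribute` and the sample data. The slides only state that relevant events are returned together with uncertainty scores.

```python
def distribute(matches, thresholds):
    """Group (subscriber, event, score) matches per subscriber.

    Matches below the subscriber's threshold, or addressed to unknown
    subscribers, are dropped. Results are sorted by descending score.
    """
    deliveries = {name: [] for name in thresholds}
    for subscriber, event, score in matches:
        if subscriber in thresholds and score >= thresholds[subscriber]:
            deliveries[subscriber].append({"event": event, "score": score})
    for results in deliveries.values():
        results.sort(key=lambda item: item["score"], reverse=True)
    return deliveries


# Hypothetical matcher output and per-subscriber minimum scores.
matches = [
    ("Bob", {"type": "energy consumption"}, 0.504),
    ("Bob", {"type": "temperature"}, 0.05),
    ("Dan", {"type": "energy consumption"}, 0.9),
    ("Dan", {"type": "power usage"}, 0.95),
]
thresholds = {"Bob": 0.3, "Dan": 0.8, "Erin": 0.5}
deliveries = distribute(matches, thresholds)
```

Passing the score along lets each consumer decide how much approximation it is willing to accept.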
  • 156. Future Research Challenges (Part VIII)
  • 157. Future Research Challenges
    • Investigation of how users tag sensor events in zero-coupled open environments
    • Investigation of the limits on the evolution of semantic agreements in open environments
  • 158. Future Research Challenges
    • Investigation of dynamic event enrichment for pragmatic boundaries within loosely coupled environments
    Hasan, S., O’Riain, S. and Curry, E., 2013. Towards Unified and Native Enrichment in Event Processing Systems. In The 7th ACM International Conference on Distributed Event-Based Systems (DEBS 2013). Arlington, Texas, USA: ACM.
  • 159. Future Research Challenges
    • Dynamic enrichment
    Hasan, S., O’Riain, S. and Curry, E., 2013. Towards Unified and Native Enrichment in Event Processing Systems. In The 7th ACM International Conference on Distributed Event-Based Systems (DEBS 2013). Arlington, Texas, USA: ACM.
  • 160. Future Research Challenges
    • Support complex pattern detection
    • Uncertainty propagation
    Hasan, S. and Curry, E., 2014. Approximate Semantic Matching of Events for the Internet of Things. ACM Transactions on Internet Technology (TOIT).
  • 161. Future Research Challenges
    • Support complex pattern detection
    • Statistical monotonicity and top-k
    Hasan, S. and Curry, E., 2014. Approximate Semantic Matching of Events for the Internet of Things. ACM Transactions on Internet Technology (TOIT).
    [Slide footer: 8-12 Dec 2014, Bordeaux, France, Middleware ’14]
  • 162. Future Research Challenges
    • Optimization of approximate matching for higher throughput
    • Extension of the approximate thematic matcher to parallel computation
    • Investigation of how to integrate the vector space model into current data management system paradigms
  • 163. Conclusions
    • Coupling is necessary for crossing boundaries
    • Decoupling is necessary for scalable software
    • Event-based systems need to be extended to address the coupling/decoupling trade-off for semantics
    • Approximate and thematic event processing exchange approximations of meaning under loose semantic coupling
  • 164. DISCUSSION POINT
    • Overall questions and answers
  • 165. References
    • Cugola, G. and Margara, A., 2011. Processing flows of information: From data stream to complex event processing. ACM Computing Surveys Journal.
    • Eugster, P.T., Felber, P.A., Guerraoui, R. and Kermarrec, A.M., 2003. The many faces of publish/subscribe. ACM Computing Surveys (CSUR), 35(2), pp.114–131.
    • Carlile, Paul R. "Transferring, translating, and transforming: An integrative framework for managing knowledge across boundaries." Organization Science 15.5 (2004): 555-568.
    • Souleiman Hasan and Edward Curry. 2015. Tackling Variety in Internet of Things Events, IEEE Internet Computing (In Press).
    • Souleiman Hasan and Edward Curry. 2014. Approximate Semantic Matching of Events for the Internet of Things. ACM Trans. Internet Technol. 14, 1, Article 2 (August 2014), 23 pages. DOI=10.1145/2633684 http://doi.acm.org/10.1145/2633684
    • Hasan, S., O’Riain, S. and Curry, E., 2013. Towards Unified and Native Enrichment in Event Processing Systems. In The 7th ACM International Conference on Distributed Event-Based Systems (DEBS 2013). Arlington, Texas, USA: ACM.
    • Hasan, S., O’Riain, S. and Curry, E., 2012. Approximate Semantic Matching of Heterogeneous Events. In 6th ACM International Conference on Distributed Event-Based Systems (DEBS 2012). Berlin, Germany: ACM, pp. 252–263.
    • Souleiman Hasan and Edward Curry. 2014. Thematic Event Processing. In Proceedings of the 15th International Middleware Conference (Middleware '14). ACM, Bordeaux, France, 109-120. DOI=10.1145/2663165.2663335 http://doi.acm.org/10.1145/2663165.2663335
    • Hasan, S., Curry, E., Banduk, M., and O’Riain, S. Toward Situation Awareness for the Semantic Sensor Web: Complex Event Processing with Dynamic Linked Data Enrichment. The 4th International Workshop on Semantic Sensor Networks 2011 (SSN11), (2011), 60–72.
  • 166. Dataset and Software
    • Dataset
      ◦ Souleiman Hasan, Edward Curry, Thematic event processing dataset, DOI: 10.13140/2.1.3342.9123
      ◦ Available at http://www.researchgate.net/publication/263673956_Thematic_event_processing_dataset
    • Collider
      ◦ Souleiman Hasan, Kalpa Gunaratna, Yongrui Qin, and Edward Curry. 2013. Demo: approximate semantic matching in the collider event processing engine. In Proceedings of the 7th ACM international conference on Distributed event-based systems (DEBS '13). ACM, New York, NY, USA, 337-338. DOI=10.1145/2488222.2489277 http://doi.acm.org/10.1145/2488222.2489277
  • 167. More References
    • OECD, 2012. Machine-to-Machine Communications: Connecting Billions of Devices. OECD Digital Economy Papers, No. 192.
    • P. McFedries, The Coming Data Deluge, IEEE Spectrum, 2011.
    • Luckham, D., 2002. The Power of Events: An Introduction to Complex Event Processing in Distributed Enterprise Systems, Addison-Wesley Professional.
    • Dayal, U., Blaustein, B., Buchmann, A., Chakravarthy, U., Hsu, M., Ledin, R., McCarthy, D., Rosenthal, A., Sarin, S., Carey, M. J., Livny, M., and Jauhari, R. 1988. The HiPAC Project: Combining Active Databases and Timing Constraints. SIGMOD Rec. 17, 1, 51–70.
    • Lieuwen, D. F., Gehani, N. H., and Arlein, R. M. 1996. The Ode Active Database: Trigger Semantics and Implementation. In Proceedings of the 12th International Conference on Data Engineering (ICDE’96). IEEE Computer Society, Los Alamitos, CA, 412–420.
    • Gatziu, S. and Dittrich, K. 1993. Events in an Active Object-Oriented Database System. In Proceedings of the International Workshop on Rules in Database Systems (RIDS), N. Paton and H. Williams, Eds. Workshops in Computing, Springer-Verlag, Edinburgh, U.K.
    • Chakravarthy, S. and Adaikkalavan, R. 2008. Events and Streams: Harnessing and Unleashing Their Synergy! In Proceedings of the 2nd International Conference on Distributed Event-Based Systems (DEBS’08). ACM, New York, NY, 1–12.
    • Chandrasekaran, S., Cooper, O., Deshpande, A., Franklin, M. J., Hellerstein, J. M., Hong, W., Krishnamurthy, S., Madden, S. R., Reiss, F., and Shah, M. A. 2003. TelegraphCQ: Continuous Dataflow Processing. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD’03). ACM, New York, NY, 668–668.
    • Chen, J., DeWitt, D. J., Tian, F., and Wang, Y. 2000. NiagaraCQ: A Scalable Continuous Query System for Internet Databases. SIGMOD Rec. 29, 2, 379–390.
    • Liu, L., Pu, C., and Tang, W. 1999. Continual Queries for Internet Scale Event-Driven Information Delivery. IEEE Trans. Knowl. Data Eng. 11, 4, 610–628.
    • Arasu, A., Babu, S., and Widom, J. 2006. The CQL Continuous Query Language: Semantic Foundations and Query Execution. VLDB J. 15, 2, 121–142.
    • Mühl, G., Fiege, L., and Pietzuch, P. 2006. Distributed Event-Based Systems. Springer.
    • Altherr, M., Erzberger, M., and Maffeis, S. 1999. iBus - A Software Bus Middleware for the Java Platform. In Proceedings of the International Workshop on Reliable Middleware Systems. 43–53.
    • TIBCO. 1999. TIB/Rendezvous. White paper. TIBCO, Palo Alto, CA.
    • A. Freitas, J. G. Oliveira, S. O’Riain, J. C. da Silva, and E. Curry. Querying Linked Data Graphs Using Semantic Relatedness: A Vocabulary Independent Approach. Data & Knowledge Engineering, 88:126–141, 2013.
  • 168. More References
    • David S. Rosenblum and Alexander L. Wolf. 1997. A Design Framework for Internet-Scale Event Observation and Notification. SIGSOFT Softw. Eng. Notes 22, 6 (November 1997), 344-360. DOI=10.1145/267896.267920 http://doi.acm.org/10.1145/267896.267920
    • Eugster, P. and Guerraoui, R. 2001. Content Based Publish/Subscribe with Structural Reflection. In Proceedings of the 6th USENIX Conference on Object-Oriented Technologies and Systems (COOTS’01).
    • C. Shannon and W. Weaver. The Mathematical Theory of Communication. University of Illinois Press, 1949.
    • Curry, Edward, et al. "Linking Building Data in the Cloud: Integrating Cross-Domain Building Data Using Linked Data." Advanced Engineering Informatics 27.2 (2013): 206-219.
    • A. Carzaniga, D. S. Rosenblum, and A. L. Wolf. Achieving Scalability and Expressiveness in an Internet-Scale Event Notification Service. In Proceedings of the Nineteenth Annual ACM Symposium on Principles of Distributed Computing, pages 219–227. ACM, 2000.
    • M. Petrovic, I. Burcea, and H.-A. Jacobsen. S-ToPSS: Semantic Toronto Publish/Subscribe System. In Proceedings of the 29th International Conference on Very Large Data Bases - Volume 29, VLDB '03, pages 1101-1104. VLDB Endowment, 2003.
    • Hasan, S. and Curry, E., 2014. Approximate Semantic Matching of Events for the Internet of Things. ACM Transactions on Internet Technology (TOIT). In press.
    • Hasan, S. and Curry, E., 2014. Thematic Event Processing. Middleware 2014. Under review.
    • Hasan, S. and Curry, E., 2014. Tackling Event Variety in Internet of Things Software. IEEE Internet Computing 2015. Under review.
    • Luis Sanchez, José Antonio Galache, Veronica Gutierrez, JM Hernandez, J Bernat, Alex Gluhak, and Tomás Garcia. 2011. SmartSantander: The Meeting Point Between Future Internet Research and Experimentation and the Smart Cities. In Future Network & Mobile Summit (FutureNetw), 2011. IEEE, 1–8.
    • Edward Curry, Souleiman Hasan, and Seán O’Riain. 2012. Enterprise Energy Management Using a Linked Dataspace for Energy Intelligence. In Sustainable Internet and ICT for Sustainability (SustainIT), 2012. IEEE, 1–6.
  • 169. More References
    • Yahoo! 2013. Yahoo! Directory: Automotive - Makes and Models. (2013). http://dir.yahoo.com/RECREATION/AUTOMOTIVE/MAKES AND MODELS/
    • Kyle Anderson, Adrian Ocneanu, Diego Benitez, Derrick Carlson, Anthony Rowe, and Mario Berges. 2012. BLUED: A Fully Labeled Public Dataset for Event-Based Non-Intrusive Load Monitoring Research. In Proc. SustKDD.
    • Richard Cyganiak. 2013. Rooms in the DERI Building. (2013). http://lab.linkeddata.deri.ie/2010/deri-rooms

Editor's Notes