S. Hallé
Sylvain Hallé
Université du Québec à Chicoutimi
CANADA
A Stream-Based Approach to Intrusion Detection
CRSNG
NSERC
CHAPTER 9
November 26th, 2023
Events
The expected behavior of an information system can be evaluated through the observation and analysis of events.
Examples of events: update of a stock price, sensor reading, link, state of a video game, user visiting a web page, parcel being delivered, move in a chess game.
Streams
A stream is an ordered sequence of events of a given type.
Formally, if Σ is a set of events, a stream is a sequence σ ∈ Σ*.
A stream processing calculation is a function:
π : (Σ1* × ... × Σm*) → (Σ1* × ... × Σn*)
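As an illustration of the definition of a stream processing calculation, here is a minimal Java sketch of two such functions over finite streams represented as lists. The class and method names (StreamFunctions, runningSum, add) are hypothetical and are not taken from BeepBeep or any other library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only.
class StreamFunctions {
    // m = 1, n = 1: maps a stream of numbers to the stream of its running totals.
    static List<Integer> runningSum(List<Integer> input) {
        List<Integer> output = new ArrayList<>();
        int total = 0;
        for (int x : input) {
            total += x;
            output.add(total);
        }
        return output;
    }

    // m = 2, n = 1: adds two streams event by event, stopping at the end of the shorter one.
    static List<Integer> add(List<Integer> left, List<Integer> right) {
        List<Integer> output = new ArrayList<>();
        int n = Math.min(left.size(), right.size());
        for (int i = 0; i < n; i++) {
            output.add(left.get(i) + right.get(i));
        }
        return output;
    }

    public static void main(String[] args) {
        System.out.println(runningSum(Arrays.asList(3, 1, 4)));                 // [3, 4, 8]
        System.out.println(add(Arrays.asList(1, 2, 3), Arrays.asList(10, 20))); // [11, 22]
    }
}
```

Both functions read whole lists for simplicity; an actual stream processor would consume events one at a time.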
Anomaly / intrusion detection
"ok": expected, normal, routine, nominal
"not ok": surprising, out of the ordinary, strange, different
Approaches
A taxonomy of existing approaches classifies detection systems into three broad categories:
1. Anomaly-based approaches: detect uncommon events, outliers, etc.
2. Multi-agent-based approaches: cooperating units observe and interact with an environment
3. Knowledge-based approaches: a priori information about expected behavior is compared with observations
Dealing with volume
Common issue with many approaches: high volume
of alarms and related information
Dealing with volume
Two possible ways to reduce volume:
1. Decrease false positive rate
2. Identify relevant information about the occurrence of an alarm (this chapter)
Enter RV
Proposed approach: intrusion/attack detection as a form of Runtime Verification (RV)
[Figure: runtime verification setup, with components labelled System, Instrumentation, Events, Trace, Monitor and Formal spec.]
Pattern monitor
The monitor is a process that updates a 3-valued verdict:
⊤ if the input stream contains the pattern
⊥ if the input stream can never contain the pattern
? otherwise
Example: "a eventually followed by b, then c"
Input:   c a b a b a c b c
Verdict: ? ? ? ? ? ? ⊤ ⊤ ⊤
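To make the verdict sequence concrete, the following is a minimal Java sketch of a hand-written monitor for the pattern "a eventually followed by b, then c". It is not BeepBeep code: the class AbcMonitor and its methods are hypothetical names, and the characters '?' and 'T' stand for the verdicts ? and ⊤. For this particular pattern the verdict ⊥ is never produced, since the pattern can always still occur later in the stream.

```java
// Hypothetical class, for illustration only; not part of BeepBeep.
class AbcMonitor {
    enum Verdict { TRUE, FALSE, INCONCLUSIVE }

    // States 0..3: number of pattern elements (a, b, c) seen so far, in order.
    private int state = 0;

    int getState() {
        return state;
    }

    // Consumes one event and returns the updated verdict.
    Verdict step(char event) {
        if (state == 0 && event == 'a') state = 1;
        else if (state == 1 && event == 'b') state = 2;
        else if (state == 2 && event == 'c') state = 3;
        // Any other event leaves the state unchanged (self-loop).
        return state == 3 ? Verdict.TRUE : Verdict.INCONCLUSIVE;
    }

    // Runs a fresh monitor over a whole stream and writes one verdict per event.
    static String run(String stream) {
        AbcMonitor m = new AbcMonitor();
        StringBuilder out = new StringBuilder();
        for (char e : stream.toCharArray()) {
            out.append(m.step(e) == Verdict.TRUE ? 'T' : '?');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(run("cababacbc")); // ??????TTT
    }
}
```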
Shortcomings
Input:   c a b a b a c b c
Verdict: ? ? ? ? ? ? ⊤ ⊤ ⊤
A monitor "stops" at the first occurrence of the pattern (i.e. it cannot detect multiple instances)
Limited feedback: the verdict only gives the location of the last element of the pattern; any of the events that precede it may be relevant
Solution
Run one instance of the monitor on every suffix of the input
     c a b a b a c b c
1    ? ? ? ? ? ? ⊤ ⊤ ⊤
2      ? ? ? ? ? ⊤ ⊤ ⊤
3        ? ? ? ? ⊤ ⊤ ⊤
4          ? ? ? ⊤ ⊤ ⊤
5            ? ? ? ? ⊤
6              ? ? ? ⊤
7                ? ? ?
8                  ? ?
9                    ?
Multiple matches for what is essentially the "same" pattern
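The effect of running one monitor per suffix can be sketched in plain Java as follows. This is an illustration under assumptions, not the author's implementation: the class SuffixMonitors and its methods are hypothetical, and the pattern is hard-coded as the state machine for "a eventually followed by b, then c".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, for illustration only.
class SuffixMonitors {
    // Transition function of the pattern's state machine (states 0..3, 3 = match).
    static int next(int state, char event) {
        if (state == 0 && event == 'a') return 1;
        if (state == 1 && event == 'b') return 2;
        if (state == 2 && event == 'c') return 3;
        return state;
    }

    // Returns the 1-based start positions of all suffixes on which the pattern is found.
    static List<Integer> matchingSuffixes(String stream) {
        List<Integer> starts = new ArrayList<>();
        for (int i = 0; i < stream.length(); i++) {
            int state = 0; // fresh monitor for the suffix starting at position i + 1
            for (int j = i; j < stream.length() && state != 3; j++) {
                state = next(state, stream.charAt(j));
            }
            if (state == 3) {
                starts.add(i + 1);
            }
        }
        return starts;
    }

    public static void main(String[] args) {
        System.out.println(matchingSuffixes("cababacbc")); // [1, 2, 3, 4, 5, 6]
    }
}
```

On the running example, six of the nine monitors report a match, which illustrates why some form of pruning is needed.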
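Monitor state
Access to the monitor's internal state opens the door to pruning tactics.
[Figure: state machine of the monitor for the pattern, with transitions labelled a, b and c, loops labelled *, and verdicts ?, ?, ?, ⊤ attached to its states.]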
First step
Pruning tactic #1: discard monitors that remain in their initial state after consuming the first event
[Figure: monitor 1 on the input c a b a b a c b c, next to the state machine of the pattern.]
Intuition: if the monitor does not change state after reading σ, then a stream σ' is a match if and only if σ · σ' is a match
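A minimal Java sketch of pruning tactic #1, assuming the same hard-coded state machine for "a eventually followed by b, then c". The class FirstStepPruning and its methods are hypothetical names used for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, for illustration only.
class FirstStepPruning {
    // Transition function of the pattern's state machine (states 0..3, 3 = match).
    static int next(int state, char event) {
        if (state == 0 && event == 'a') return 1;
        if (state == 1 && event == 'b') return 2;
        if (state == 2 && event == 'c') return 3;
        return state;
    }

    // Returns the 1-based start positions of the monitors that survive tactic #1,
    // i.e. those that leave the initial state (0) on their first event.
    static List<Integer> survivors(String stream) {
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < stream.length(); i++) {
            if (next(0, stream.charAt(i)) != 0) {
                kept.add(i + 1);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(survivors("cababacbc")); // [2, 4, 6]
    }
}
```

On the running example, only the monitors started on an a event (positions 2, 4 and 6) are kept.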
Unique state
Pruning tactic #2: discard monitors that reach the same state at the same step
[Figure: monitors 2 and 4 on the input c a b a b a c b c, next to the state machine of the pattern.]
Intuition: if monitor i is in state s after reading σ · σ', and monitor j is in state s after reading σ', then σ' is a match iff σ · σ' is a match
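The following Java sketch applies tactic #2 on top of tactic #1. It rests on assumptions that the slides do not state: when two monitors share a state, the older one is kept; and a monitor is retired as soon as it reaches ⊤, so that only inconclusive monitors are compared. The class SameStatePruning and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical class, for illustration only.
class SameStatePruning {
    // Transition function of the pattern's state machine (states 0..3, 3 = match).
    static int next(int state, char event) {
        if (state == 0 && event == 'a') return 1;
        if (state == 1 && event == 'b') return 2;
        if (state == 2 && event == 'c') return 3;
        return state;
    }

    // Returns the 1-based start positions of the monitors that end up reporting a match.
    static List<Integer> matches(String stream) {
        List<int[]> live = new ArrayList<>();   // entries {start, state}, oldest first
        List<Integer> matched = new ArrayList<>();
        for (int i = 0; i < stream.length(); i++) {
            char e = stream.charAt(i);
            for (int[] m : live) {
                m[1] = next(m[1], e);            // advance every live monitor
            }
            int first = next(0, e);
            if (first != 0) {                    // tactic #1: skip monitors stuck in state 0
                live.add(new int[] {i + 1, first});
            }
            Set<Integer> seen = new HashSet<>();
            Iterator<int[]> it = live.iterator();
            while (it.hasNext()) {
                int[] m = it.next();
                if (m[1] == 3) {                 // assumption: retire a monitor once it matches
                    matched.add(m[0]);
                    it.remove();
                } else if (!seen.add(m[1])) {    // tactic #2: state already held by an older monitor
                    it.remove();
                }
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        System.out.println(matches("cababacbc")); // [2, 6]
    }
}
```

Under these assumptions, monitor 4 is discarded at step 5, where it reaches the same state as monitor 2, and two matches remain.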
Progressing subsequence
Pruning tactic #3: retain only the progressing sub-sequence of the input
⇒ all loops in the state space are removed
[Figure: monitor 2 on the input c a b a b a c b c, next to the state machine of the pattern.]
Intuition: if σ is a match, the progressing sub-sequence is the shortest sub-sequence of σ visiting each new state in the same order
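A minimal Java sketch of how a progressing sub-sequence could be extracted, assuming it consists of exactly the events that make the monitor change state. The class ProgressingSubsequence and its methods are hypothetical names, and the pattern is again hard-coded as "a eventually followed by b, then c".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, for illustration only.
class ProgressingSubsequence {
    // Transition function of the pattern's state machine (states 0..3, 3 = match).
    static int next(int state, char event) {
        if (state == 0 && event == 'a') return 1;
        if (state == 1 && event == 'b') return 2;
        if (state == 2 && event == 'c') return 3;
        return state;
    }

    // Runs a monitor on the suffix starting at the 1-based position `start` and
    // returns the 1-based positions of the events that changed its state.
    // Events that only follow a loop are dropped; reading stops at the first match.
    static List<Integer> progressing(String stream, int start) {
        List<Integer> kept = new ArrayList<>();
        int state = 0;
        for (int i = start - 1; i < stream.length() && state != 3; i++) {
            int target = next(state, stream.charAt(i));
            if (target != state) {
                kept.add(i + 1);
            }
            state = target;
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(progressing("cababacbc", 2)); // [2, 3, 7]
        System.out.println(progressing("cababacbc", 6)); // [6, 8, 9]
    }
}
```

For monitor 2, the witness shrinks from six events (positions 2 to 7) to three.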
Comparison
[Figure: monitors 1 to 9 on the input c a b a b a c b c, shown without pruning and with the first-step, same-state and progressing tactics applied in combination.]
Pros and cons
Pros
Expects to be a (finite) state machine
Limiting in terms of possible patterns that can be
expressed
Cons
Pattern to look for can be any monitor
Drastically reduces number of matches (and events
retained in each match)
Produces results that correspond to intuition
does it?
Monitor state (revisited)
A "state" can be deduced even for processes that are not expressed as state machines.
Given a monitor π : Σ* → Σ'*, a function ι_π : Σ* → S is said to be a state function if, for every σ₁, σ₂ ∈ Σ*:
ι_π(σ₁) = ι_π(σ₂) implies π(σ₁ · σ') = π(σ₂ · σ') for every σ' ∈ Σ*.
Intuition: two instances of π are in the "same state" if they have identical behavior for any future events they read
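As a hedged illustration, consider a monitor that is not a finite state machine: it answers ⊤ whenever it has seen more a events than b events so far, which needs an unbounded counter. In the Java sketch below, the counter itself plays the role of the state function. The class CounterMonitor and its methods are hypothetical, and the comparison is made on the outputs produced while reading the continuation σ', which is one possible reading of the definition.

```java
// Hypothetical class, for illustration only.
class CounterMonitor {
    // Number of a events minus number of b events read so far (unbounded).
    private int balance = 0;

    // Consumes one event; returns true (⊤) if more a's than b's have been seen.
    boolean step(char event) {
        if (event == 'a') balance++;
        else if (event == 'b') balance--;
        return balance > 0;
    }

    // Candidate state function: the value of the counter after reading a prefix.
    static int stateAfter(String prefix) {
        CounterMonitor m = new CounterMonitor();
        for (char e : prefix.toCharArray()) m.step(e);
        return m.balance;
    }

    // Outputs produced on `continuation` by a monitor that has already read `prefix`.
    static String outputsAfter(String prefix, String continuation) {
        CounterMonitor m = new CounterMonitor();
        for (char e : prefix.toCharArray()) m.step(e);
        StringBuilder out = new StringBuilder();
        for (char e : continuation.toCharArray()) out.append(m.step(e) ? 'T' : '?');
        return out.toString();
    }

    public static void main(String[] args) {
        // Two different prefixes with the same state...
        System.out.println(stateAfter("aab") + " " + stateAfter("ca"));  // 1 1
        // ...behave identically on any common continuation.
        System.out.println(outputsAfter("aab", "bab"));                  // ?T?
        System.out.println(outputsAfter("ca", "bab"));                   // ?T?
    }
}
```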
Introducing BeepBeep
BeepBeep is an open source library written in Java and developed at UQAC since 2015.
https://liflab.github.io/beepbeep-3
Centered on the notion of processor, a computation unit that ingests input events and produces output events.
[Figure: a processor P, with events entering through its input pipe and leaving through its output pipe.]
Introducing BeepBeep
Complex calculations can be achieved through composition (i.e. passing the output of a processor to the input of another).
Doing so creates processor chains or "pipelines".
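The idea of composition can be sketched in plain Java with java.util.function.Function, without using the BeepBeep API. The class PipelineSketch and the helpers decimate and equalsTo are hypothetical names; streams are represented as strings of one-character events, and 'T' and '?' stand for the two possible outputs of the second stage.

```java
import java.util.function.Function;

// Hypothetical class, for illustration only; not BeepBeep code.
class PipelineSketch {
    // Keeps one event out of every `interval` (assumed positive), starting with the first.
    static Function<String, String> decimate(int interval) {
        return stream -> {
            StringBuilder out = new StringBuilder();
            for (int i = 0; i < stream.length(); i += interval) {
                out.append(stream.charAt(i));
            }
            return out.toString();
        };
    }

    // Replaces each event by 'T' if it equals `symbol`, and by '?' otherwise.
    static Function<String, String> equalsTo(char symbol) {
        return stream -> {
            StringBuilder out = new StringBuilder();
            for (char e : stream.toCharArray()) {
                out.append(e == symbol ? 'T' : '?');
            }
            return out.toString();
        };
    }

    // The output of the first stage is passed to the input of the second.
    static Function<String, String> pipeline() {
        return decimate(2).andThen(equalsTo('A'));
    }

    public static void main(String[] args) {
        System.out.println(decimate(2).apply("AabcAaBabbCa")); // AbABbC
        System.out.println(pipeline().apply("AabcAaBabbCa"));  // T?T???
    }
}
```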
Monitors in BeepBeep
BeepBeep has been extended with a set of processors acting as monitors: Sequence, Eventual disjunction, Eventual conjunction, Eventual occurrence, Existential window, Existential slice.
Each of these processors:
recognizes an elementary pattern of events
provides a state function
can identify its progressing sub-sequence
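Monitors in BeepBeep
Complex monitors can be obtained by freely composing these processors
Example: "a eventually followed by b, then c"
[Figure: processor chain for the example, with processors numbered 1 to 7, including three function processors testing equality with A, B and C.]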
Explainability in BeepBeep
BeepBeep allows processors to define associations between output events and input events, a feature called explainability.*
Monitors associate their verdict with the progressing sub-sequence of their input.
[Figure: two chained processors, P1 and P2.]
* S. Hallé. (2020). Explainable Queries over Event Logs. Proc. EDOC 2020. IEEE Computer Society. DOI: 10.1109/EDOC49727.2020.00029
Monitors in BeepBeep
The progressing sub-sequence of each processor can be propagated upstream
Example: "a eventually followed by b, then c"
[Figure: the processor chain run on the input AabcAaBabbCa; the stream AbABbC flows through the chain, with intermediate verdict streams ⊤⊤⊤⊤⊤⊤, ???⊤⊤⊤, ?????⊤ and ???⊤⊤⊤, and final verdict stream ?????⊤.]
Usage
Sample script in Java:
// Keep one event out of every two
CountDecimate d = new CountDecimate(2);
// Copy the decimated stream to three outputs
Fork f = new Fork(3);
// One elementary monitor per event of the pattern
SomeEventually a = new SomeEventually(
    new ApplyFunction(new Equals("a")));
SomeEventually b = new SomeEventually(
    new ApplyFunction(new Equals("b")));
SomeEventually c = new SomeEventually(
    new ApplyFunction(new Equals("c")));
// Two Sequence processors chain the three monitors: (a then b) then c
Sequence s1 = new Sequence();
Sequence s2 = new Sequence();
// The tracker records associations between output and input events
EventTracker t = new IndexEventTracker();
Connector con = new Connector(t);
con.connect(d, 0, f, 0).connect(f, 0, a, 0)
    .connect(f, 1, b, 0).connect(f, 2, c, 0)
    .connect(a, 0, s1, 0).connect(b, 0, s1, 1)
    .connect(s1, 0, s2, 0).connect(c, 0, s2, 1);
Usage
Detecting the pattern and producing the witness:
FindPattern fp = new FindPattern(g);
Connector.connect(fp, new Print());
for (char e : "AabcAaBabbCa".toCharArray())
    fp.getPushableInput().push(e);
Produces the output:
{(1,1), (1,7), (1,11)}
Examples
Pattern monitors have been implemented to detect "distilled" versions of real-world attacks and intrusions:
Linear sequence: declare a match when n successive events are observed (ptrace exploit)
Combined: declare a match when n other patterns have been observed (GandCrab attack)
Incomplete: declare a match when more than n instances of a sequential pattern are incomplete (SYN flooding)
Threshold: declare a match when more than n instances of a pattern have been observed (Remote Access Trojan)
[Figures: BeepBeep processor chains for the Linear sequence, Combined, Incomplete and Threshold patterns.]
Results
Do the pruning strategies reduce the number of matches?
[Bar chart: number of matches (axis from 0 to 300) for the Linear sequence, Combined, Incomplete and Threshold patterns, under each pruning strategy: None, First step, Distinct states, Progressing.]
Do the pruning strategies reduce the number of events included as witnesses?
[Bar chart: fraction of events included (axis from 0 to 1.2) for the same four patterns and four pruning strategies.]
What is the impact of the pruning strategies on processing time?
[Bar chart: time in ms (axis from 0 to 1400) for the same four patterns and four pruning strategies.]
Take-home points
Attack/intrusion patterns can be formalized as
runtime monitors
Stream processing pipelines allow the expression of
complex monitors
Reducing the volume of data produced by pattern
matches can be done using simple state-based
pruning strategies on these monitors
A proof-of-concept implementation of these notions
in a stream processing engine shows that intuitive
incident reports can be produced automatically
For more information
https://liflab.ca
https://liflab.github.io/beepbeep-3

More Related Content

PDF
Event Stream Processing with BeepBeep 3
PDF
AI Lesson 04
PDF
A formalization of complex event stream processing
PPTX
Compiler Design_Code Optimization tech.pptx
PPT
Chapter3 Search
PPT
m3-searchAbout AI About AI About AI1.ppt
PPTX
Complete and Interpretable Conformance Checking of Business Processes
PDF
Approximation Data Structures for Streaming Applications
Event Stream Processing with BeepBeep 3
AI Lesson 04
A formalization of complex event stream processing
Compiler Design_Code Optimization tech.pptx
Chapter3 Search
m3-searchAbout AI About AI About AI1.ppt
Complete and Interpretable Conformance Checking of Business Processes
Approximation Data Structures for Streaming Applications

Similar to A Stream-Based Approach to Intrusion Detection (20)

PDF
Ay4201347349
PDF
Mining Adaptively Frequent Closed Unlabeled Rooted Trees in Data Streams
KEY
Verification with LoLA
PDF
Basics of Dynamic programming
KEY
Defense
PDF
Actors for Behavioural Simulation
PDF
Low-Cost Approximate and Adaptive Techniques for the Internet of Things
PDF
Introduction to Artificial Intelligence with Python, CS50 Approach - GDG on C...
PPTX
Artificial intelligence searches
PDF
Mining event streams with BeepBeep 3
PDF
Efficient Offline Monitoring of LTL with Bit Vectors (Talk at SAC 2021)
PPTX
FPPM algorithm
PPTX
An efficient approach to mine flexible periodic patterns in time series datab...
PPT
03 search blind
PDF
Test Sequence Generation with Cayley Graphs (Talk @ A-MOST 2021)
PPTX
CS2303-TOC.pptx
PPTX
Lec#2
PDF
An AsmL model for an Intelligent Vehicle Control System
PPT
Different Search Techniques used in AI.ppt
Ay4201347349
Mining Adaptively Frequent Closed Unlabeled Rooted Trees in Data Streams
Verification with LoLA
Basics of Dynamic programming
Defense
Actors for Behavioural Simulation
Low-Cost Approximate and Adaptive Techniques for the Internet of Things
Introduction to Artificial Intelligence with Python, CS50 Approach - GDG on C...
Artificial intelligence searches
Mining event streams with BeepBeep 3
Efficient Offline Monitoring of LTL with Bit Vectors (Talk at SAC 2021)
FPPM algorithm
An efficient approach to mine flexible periodic patterns in time series datab...
03 search blind
Test Sequence Generation with Cayley Graphs (Talk @ A-MOST 2021)
CS2303-TOC.pptx
Lec#2
An AsmL model for an Intelligent Vehicle Control System
Different Search Techniques used in AI.ppt
Ad

More from Sylvain Hallé (20)

PDF
A Tree-Based Definition of Business Process Conformance (Talk @ EDOC 2024)
PDF
Monitoring Business Process Compliance Across Multiple Executions with Stream...
PDF
Smart Contracts-Enabled Simulation for Hyperconnected Logistics
PDF
Test Suite Generation for Boolean Conditions with Equivalence Class Partitioning
PDF
Synthia: a Generic and Flexible Data Structure Generator (Long Version)
PDF
A Generic Explainability Framework for Function Circuits
PDF
Detecting Responsive Web Design Bugs with Declarative Specifications
PDF
Streamlining the Inclusion of Computer Experiments in Research Papers
PDF
Writing Domain-Specific Languages for BeepBeep
PDF
Real-Time Data Mining for Event Streams
PDF
Technologies intelligentes d'aide au développement d'applications web (WAQ 2018)
PDF
LabPal: Repeatable Computer Experiments Made Easy (ACM Workshop Talk)
PDF
A "Do-It-Yourself" Specification Language with BeepBeep 3 (Talk @ Dagstuhl 2017)
PDF
Event Stream Processing with Multiple Threads
PDF
A Few Things We Heard About RV Tools (Position Paper)
PDF
Solving Equations on Words with Morphisms and Antimorphisms
PDF
Runtime monitoring de propriétés temporelles par (streaming) XML
PDF
La quantification du premier ordre en logique temporelle
PDF
When RV Meets CEP (RV 2016 Tutorial)
PDF
Decentralized Enforcement of Artifact Lifecycles
A Tree-Based Definition of Business Process Conformance (Talk @ EDOC 2024)
Monitoring Business Process Compliance Across Multiple Executions with Stream...
Smart Contracts-Enabled Simulation for Hyperconnected Logistics
Test Suite Generation for Boolean Conditions with Equivalence Class Partitioning
Synthia: a Generic and Flexible Data Structure Generator (Long Version)
A Generic Explainability Framework for Function Circuits
Detecting Responsive Web Design Bugs with Declarative Specifications
Streamlining the Inclusion of Computer Experiments in Research Papers
Writing Domain-Specific Languages for BeepBeep
Real-Time Data Mining for Event Streams
Technologies intelligentes d'aide au développement d'applications web (WAQ 2018)
LabPal: Repeatable Computer Experiments Made Easy (ACM Workshop Talk)
A "Do-It-Yourself" Specification Language with BeepBeep 3 (Talk @ Dagstuhl 2017)
Event Stream Processing with Multiple Threads
A Few Things We Heard About RV Tools (Position Paper)
Solving Equations on Words with Morphisms and Antimorphisms
Runtime monitoring de propriétés temporelles par (streaming) XML
La quantification du premier ordre en logique temporelle
When RV Meets CEP (RV 2016 Tutorial)
Decentralized Enforcement of Artifact Lifecycles
Ad

Recently uploaded (20)

PDF
Building Integrated photovoltaic BIPV_UPV.pdf
PDF
Network Security Unit 5.pdf for BCA BBA.
PDF
Bridging biosciences and deep learning for revolutionary discoveries: a compr...
PPT
“AI and Expert System Decision Support & Business Intelligence Systems”
PDF
Chapter 3 Spatial Domain Image Processing.pdf
PDF
Machine learning based COVID-19 study performance prediction
PDF
KodekX | Application Modernization Development
PDF
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
PDF
Encapsulation_ Review paper, used for researhc scholars
PDF
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
PDF
Agricultural_Statistics_at_a_Glance_2022_0.pdf
PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PDF
Review of recent advances in non-invasive hemoglobin estimation
PPTX
Big Data Technologies - Introduction.pptx
PDF
Advanced methodologies resolving dimensionality complications for autism neur...
PDF
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
PDF
Dropbox Q2 2025 Financial Results & Investor Presentation
PPTX
Cloud computing and distributed systems.
PDF
Mobile App Security Testing_ A Comprehensive Guide.pdf
PPT
Teaching material agriculture food technology
Building Integrated photovoltaic BIPV_UPV.pdf
Network Security Unit 5.pdf for BCA BBA.
Bridging biosciences and deep learning for revolutionary discoveries: a compr...
“AI and Expert System Decision Support & Business Intelligence Systems”
Chapter 3 Spatial Domain Image Processing.pdf
Machine learning based COVID-19 study performance prediction
KodekX | Application Modernization Development
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
Encapsulation_ Review paper, used for researhc scholars
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
Agricultural_Statistics_at_a_Glance_2022_0.pdf
Diabetes mellitus diagnosis method based random forest with bat algorithm
Review of recent advances in non-invasive hemoglobin estimation
Big Data Technologies - Introduction.pptx
Advanced methodologies resolving dimensionality complications for autism neur...
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
Dropbox Q2 2025 Financial Results & Investor Presentation
Cloud computing and distributed systems.
Mobile App Security Testing_ A Comprehensive Guide.pdf
Teaching material agriculture food technology

A Stream-Based Approach to Intrusion Detection

  • 1. S. Hallé Sylvain Hallé Université du Québec à Chicoutimi CANADA A Stream-Based Approach to Intrusion Detection CRSNG NSERC CHAPTER 9 November 26th, 2023
  • 2. S. Hallé $ $$$$$ $$$$$ $$$$$ update of a stock price sensor reading link state of a video game user visi�ng a web page parcel being delivered / move in a chess game Events The expected behavior of an information system can be evaluated through the observation and analysis of events.
  • 3. S. Hallé A stream is an ordered sequence of events of a given type. Streams ... ... Formally, if Σ is a set of events, a stream is a sequence σ ∈ Σ*. A stream processing calculation is a function: π : (Σ1* × ... ×Σm*) → (Σ1* × ... ×Σn*)
  • 4. Z Z Z S. Hallé expected normal rou�ne nominal "ok" surprising out of the ordinary strange different "not ok" Anomaly / intrusion detection
  • 5. S. Hallé Approaches A taxonomy of existing approaches classifies detection systems into three broad categories: Anomaly-based approaches: detect uncommon events, outliers, etc. Multi-agent-based approaches: cooperating units observe and interact with an environment Knowledge-based approaches: a priori information about expected behavior is compared with observations 1. 2. 3.
  • 6. S. Hallé Dealing with volume Common issue with many approaches: high volume of alarms and related information
  • 7. S. Hallé Dealing with volume Two possible ways to reduce volume: 1. Decrease false positive rate 2. Identify relevant information about the occurrence of an alarm 1 2 3 4 5 1 2 3 4 5 ... + + + +
  • 8. S. Hallé Dealing with volume Two possible ways to reduce volume: 1. Decrease false positive rate 2. Identify relevant information about the occurrence of an alarm 1 2 3 4 5 1 2 3 4 5 ... + + + + this chapter
  • 9. S. Hallé Enter RV Proposed approach: intrusion/attack detection as a form of Runtime Verification
  • 10. S. Hallé Enter RV Proposed approach: intrusion/attack detection as a form of Runtime Verification System
  • 11. S. Hallé Enter RV Proposed approach: intrusion/attack detection as a form of Runtime Verification System Instrumentation
  • 12. S. Hallé Enter RV Proposed approach: intrusion/attack detection as a form of Runtime Verification System Trace Events Instrumentation
  • 13. S. Hallé Enter RV Proposed approach: intrusion/attack detection as a form of Runtime Verification System Monitor Trace Events Instrumentation
  • 14. S. Hallé Enter RV Proposed approach: intrusion/attack detection as a form of Runtime Verification System Monitor Trace Events Instrumentation Formal spec.
  • 15. S. Hallé Pattern monitor The monitor is a process that updates a 3-valued verdict: ⊤ if the input stream contains the pattern ⊥ if the input stream can never* contain the pattern ? otherwise Example: "a eventually followed by b, then c" c a b a b a c b c ? ? ? ? ? ? ⊤ ⊤ ⊤
  • 16. S. Hallé Shortcomings c a b a b a c b c ? ? ? ? ? ? ⊤ ⊤ ⊤
  • 17. S. Hallé Shortcomings c a b a b a c b c ? ? ? ? ? ? ⊤ ⊤ ⊤ A monitor "stops" at the first occurrence of the pattern (i.e. cannot detect multiple instances) found! !
  • 18. S. Hallé Shortcomings c a b a b a c b c ? ? ? ? ? ? ⊤ ⊤ ⊤ A monitor "stops" at the first occurrence of the pattern (i.e. cannot detect multiple instances) found! ! Limited feedback: location of last element of the pattern + everything that precedes ! all this may be relevant
  • 19. S. Hallé Solution Run one instance of on every suffix of the input
  • 20. S. Hallé Solution Run one instance of on every suffix of the input c a b a b a c b c
  • 21. S. Hallé Solution Run one instance of on every suffix of the input c a b a b a c b c 1 ? ? ? ? ? ? ⊤ ⊤ ⊤
  • 22. S. Hallé Solution Run one instance of on every suffix of the input c a b a b a c b c 1 ? ? ? ? ? ? ⊤ ⊤ ⊤ 2 ? ? ? ? ? ⊤ ⊤ ⊤
  • 23. S. Hallé Solution Run one instance of on every suffix of the input c a b a b a c b c 1 ? ? ? ? ? ? ⊤ ⊤ ⊤ 2 ? ? ? ? ? ⊤ ⊤ ⊤ 3 ? ? ? ? ⊤ ⊤ ⊤
  • 24. S. Hallé Solution Run one instance of on every suffix of the input c a b a b a c b c 1 ? ? ? ? ? ? ⊤ ⊤ ⊤ 2 ? ? ? ? ? ⊤ ⊤ ⊤ 3 ? ? ? ? ⊤ ⊤ ⊤ 4 ? ? ? ⊤ ⊤ ⊤
  • 25. S. Hallé Solution Run one instance of on every suffix of the input c a b a b a c b c 1 ? ? ? ? ? ? ⊤ ⊤ ⊤ 2 ? ? ? ? ? ⊤ ⊤ ⊤ 3 ? ? ? ? ⊤ ⊤ ⊤ 4 ? ? ? ⊤ ⊤ ⊤ 5 ? ? ⊤ ? ?
  • 26. S. Hallé Solution Run one instance of on every suffix of the input c a b a b a c b c 1 ? ? ? ? ? ? ⊤ ⊤ ⊤ 2 ? ? ? ? ? ⊤ ⊤ ⊤ 3 ? ? ? ? ⊤ ⊤ ⊤ 4 ? ? ? ⊤ ⊤ ⊤ 5 ? ? ⊤ ? ? 6 7 8 9 ? ⊤ ? ? ? ? ? ? ? ?
  • 27. S. Hallé Solution Run one instance of on every suffix of the input c a b a b a c b c 1 ? ? ? ? ? ? ⊤ ⊤ ⊤ 2 ? ? ? ? ? ⊤ ⊤ ⊤ 3 ? ? ? ? ⊤ ⊤ ⊤ 4 ? ? ? ⊤ ⊤ ⊤ 5 ? ? ⊤ ? ? 6 7 8 9 ? ⊤ ? ? ? ? ? ? ? ? Multiple matches for what is essentially the "same" pattern
  • 28. S. Hallé Monitor state Access to the monitor's internal state opens the door to pruning tactics. * * * * a b c ? ? ? ⊤
  • 29. S. Hallé First step Pruning tactic #1 Discard monitors that remain in their initial state after consuming the first event c a b a b a c b c 1 * * * * a b c ? ? ? ⊤ ? ? ? ? ? ? ⊤ ⊤ ⊤
  • 30. S. Hallé First step Pruning tactic #1 Discard monitors that remain in their initial state after consuming the first event c a b a b a c b c 1 * * * * a b c ? ? ? ⊤
  • 31. S. Hallé First step Pruning tactic #1 Discard monitors that remain in their initial state after consuming the first event c a b a b a c b c 1 * * * * a b c ? ? ? ⊤
  • 32. S. Hallé First step Pruning tactic #1 Discard monitors that remain in their initial state after consuming the first event c a b a b a c b c 1 * * * * a b c ? ? ? ⊤ Intuition: if the monitor does not change state after reading σ, then a stream σ is a match if and only if σ · σ is a match
  • 33. S. Hallé Unique state Pruning tactic #2 Discard monitors that reach the same state at the same step c a b a b a c b c 2 * * * * a b c ? ? ? ⊤ 4 ? ? ? ? ? ⊤ ⊤ ⊤ ? ? ? ⊤ ⊤ ⊤
  • 34. S. Hallé Unique state Pruning tactic #2 Discard monitors that reach the same state at the same step c a b a b a c b c 2 * * * * a b c ? ? ? ⊤ 4
  • 35. S. Hallé Unique state Pruning tactic #2 Discard monitors that reach the same state at the same step c a b a b a c b c 2 * * * * a b c ? ? ? ⊤ 4
  • 36. S. Hallé Unique state Pruning tactic #2 Discard monitors that reach the same state at the same step c a b a b a c b c 2 * * * * a b c ? ? ? ⊤ 4 Intuition: if monitor i is in state s after reading σ · σ', and monitor j is in state s after reading σ', then σ' is a match iff σ · σ' is a match
  • 37. S. Hallé Progressing subsequence Pruning tactic #3 Retain only the progressing sub-sequence of the input c a b a b a c b c 2 * * * * a b c ? ? ? ⊤ ⇒ all loops in the state space are removed ? ? ? ? ? ⊤ ⊤ ⊤
  • 38. S. Hallé Progressing subsequence Pruning tactic #3 Retain only the progressing sub-sequence of the input c a b a b a c b c 2 * * * * a b c ? ? ? ⊤ ⇒ all loops in the state space are removed
  • 39. S. Hallé Progressing subsequence Pruning tactic #3 Retain only the progressing sub-sequence of the input c a b a b a c b c 2 * * * * a b c ? ? ? ⊤ ⇒ all loops in the state space are removed
  • 40. S. Hallé Progressing subsequence Pruning tactic #3 Retain only the progressing sub-sequence of the input c a b a b a c b c 2 * * * * a b c ? ? ? ⊤ ⇒ all loops in the state space are removed * ?
  • 41. S. Hallé Progressing subsequence Pruning tactic #3 Retain only the progressing sub-sequence of the input c a b a b a c b c 2 * * * * a b c ? ? ? ⊤ ⇒ all loops in the state space are removed
  • 42. S. Hallé Progressing subsequence Pruning tactic #3 Retain only the progressing sub-sequence of the input c a b a b a c b c 2 * * * * a b c ? ? ? ⊤ ⇒ all loops in the state space are removed Intuition: if σ is a match, the progressing sub- sequence is the shortest subset of σ visiting each new state in the same order
  • 43. S. Hallé Comparison c a b a b a c b c 1 2 3 4 5 6 7 8 9 ? ? ? ? ? ? ⊤ ⊤ ⊤ ? ? ? ? ? ⊤ ⊤ ⊤ ? ? ? ? ⊤ ⊤ ⊤ ? ? ? ⊤ ⊤ ⊤ ? ? ⊤ ? ? ? ⊤ ? ? ? ? ? ? ? ?
  • 44. S. Hallé Comparison c a b a b a c b c 1 2 3 4 5 6 7 8 9
  • 45. S. Hallé Comparison c a b a b a c b c 1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9 first-step
  • 46. S. Hallé Comparison c a b a b a c b c 1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9 first-step same state +
  • 47. S. Hallé Comparison c a b a b a c b c 1 2 3 4 5 6 7 8 9 1 2 3 4 5 6 7 8 9 first-step same state progressing + +
  • 48. S. Hallé Pros and cons Pros Expects to be a (finite) state machine Limiting in terms of possible patterns that can be expressed Cons Pattern to look for can be any monitor Drastically reduces number of matches (and events retained in each match) Produces results that correspond to intuition
  • 49. S. Hallé Pros and cons Pros Expects to be a (finite) state machine Limiting in terms of possible patterns that can be expressed Cons Pattern to look for can be any monitor Drastically reduces number of matches (and events retained in each match) Produces results that correspond to intuition does it?
  • 51. S. Hallé Monitor state (revisited). A "state" can be deduced even for processes that are not expressed as state machines. Given a monitor π : Σ* → Σ'*, a function ι_π : Σ* → S is said to be a state function if, for every σ1, σ2 ∈ Σ*, ι_π(σ1) = ι_π(σ2) implies π(σ1 · σ') = π(σ2 · σ') for every σ' ∈ Σ*. Intuition: two instances of π are in the "same state" if they have identical behavior for any future events they read.
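As an illustration of the definition above, here is a small Java sketch of a monitor that is not written as a state machine, together with a candidate state function. Everything in it is hypothetical (the class `StateFunctionSketch`, the threshold `K`, the counting monitor). Following the stated intuition, it compares only the outputs produced while reading the continuation σ', which is one possible reading of the equality in the definition.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of a state function; not BeepBeep code. */
class StateFunctionSketch {
    static final int K = 2; // assumed threshold

    /** Monitor: after each event, outputs true once more than K occurrences of 'a' have been read. */
    static List<Boolean> monitor(String input) {
        List<Boolean> out = new ArrayList<>();
        int count = 0;
        for (char e : input.toCharArray()) {
            if (e == 'a') {
                count++;
            }
            out.add(count > K);
        }
        return out;
    }

    /** Candidate state function: number of 'a' read so far, capped at K + 1. */
    static int state(String prefix) {
        int count = 0;
        for (char e : prefix.toCharArray()) {
            if (e == 'a') {
                count++;
            }
        }
        return Math.min(count, K + 1);
    }

    /** Outputs produced while reading the continuation, after the prefix has been read. */
    static List<Boolean> future(String prefix, String continuation) {
        List<Boolean> all = monitor(prefix + continuation);
        return new ArrayList<>(all.subList(prefix.length(), all.size()));
    }

    public static void main(String[] args) {
        // "ab" and "bba" both contain a single 'a', so they are mapped to the same state
        System.out.println(state("ab") == state("bba"));
        // ... and the monitor behaves identically on the same continuation
        System.out.println(future("ab", "aab").equals(future("bba", "aab")));
    }
}
```

Capping the count at K + 1 is what keeps the set of states finite: once the threshold is exceeded, further occurrences of 'a' no longer change the future behavior of the monitor.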
  • 52. S. Hallé Introducing BeepBeep. BeepBeep is an open source library written in Java and developed at UQAC since 2015: https://liflab.github.io/beepbeep-3. It is centered on the notion of processor, a computation unit that ingests input events and produces output events. (Figure: a processor P with its input pipe, an event, and its output pipe.)
  • 53. S. Hallé Introducing BeepBeep. Complex calculations can be achieved through composition, i.e. passing the output of a processor to the input of another. Doing so creates processor chains, or "pipelines".
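The idea of composition can be sketched in plain Java, without BeepBeep, by treating each processor as a function from an input sequence to an output sequence and chaining two of them. The class `PipelineSketch` and its helpers `keepEvery` and `equalsTo` are hypothetical names, and the sketch processes whole lists at once, whereas a real stream processor handles events one at a time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical illustration of processor composition; not the BeepBeep API. */
class PipelineSketch {

    /** Processor that keeps one event out of every n, starting with the first. */
    static <T> Function<List<T>, List<T>> keepEvery(int n) {
        return in -> {
            List<T> out = new ArrayList<>();
            for (int i = 0; i < in.size(); i += n) {
                out.add(in.get(i));
            }
            return out;
        };
    }

    /** Processor that turns each event into the Boolean "is this event equal to c?". */
    static Function<List<Character>, List<Boolean>> equalsTo(char c) {
        return in -> {
            List<Boolean> out = new ArrayList<>();
            for (char e : in) {
                out.add(e == c);
            }
            return out;
        };
    }

    /** Builds the two-stage pipeline and runs it on the characters of s. */
    static List<Boolean> run(String s) {
        List<Character> events = new ArrayList<>();
        for (char e : s.toCharArray()) {
            events.add(e);
        }
        // The output of the first processor becomes the input of the second
        Function<List<Character>, List<Boolean>> pipeline =
            PipelineSketch.<Character>keepEvery(2).andThen(equalsTo('a'));
        return pipeline.apply(events);
    }

    public static void main(String[] args) {
        System.out.println(run("abcabc")); // prints [true, false, false]
    }
}
```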
  • 54. S. Hallé Monitors in BeepBeep. BeepBeep has been extended with a set of processors acting as monitors: Sequence, Eventual disjunction, Eventual conjunction, Eventual occurrence, Existential window, Existential slice. These processors recognize an elementary pattern of events, provide a state function, and can identify their progressing sub-sequence.
  • 55. S. Hallé Monitors in BeepBeep. Complex monitors can be obtained by freely composing these processors. (Figure: processor chain for the pattern "a eventually followed by b, then c".)
  • 56. S. Hallé Explainability in BeepBeep. BeepBeep allows processors to define associations between output events and input events, a feature called explainability.* Monitors associate their verdict to the progressing sub-sequence of their input. * S. Hallé. (2020). Explainable Queries over Event Logs. Proc. EDOC 2020. IEEE Computer Society. DOI: 10.1109/EDOC49727.2020.00029
  • 57. S. Hallé Monitors in BeepBeep. The progressing sub-sequence of each processor can be propagated upstream. (Figure: the chain for "a eventually followed by b, then c", annotated with the input stream AabcAaBabbCa, the intermediate stream AbABbC, and the verdict streams ⊤⊤⊤⊤⊤⊤, ???⊤⊤⊤ and ?????⊤.)
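The upstream propagation described on this slide can be mimicked with a short plain-Java sketch. It does not use BeepBeep; the class `UpstreamExplanation` and its methods are hypothetical. It assumes that the first stage keeps every second event starting with the first, which is consistent with the streams AabcAaBabbCa and AbABbC shown in the figure, and that the monitor looks for A, then B, then C.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of upstream propagation; not the BeepBeep API. */
class UpstreamExplanation {

    /**
     * Keeps one event out of every n, starting with the first, and records in
     * origin the 1-based position each kept event had in the original stream.
     */
    static String decimate(String input, int n, List<Integer> origin) {
        StringBuilder kept = new StringBuilder();
        for (int i = 0; i < input.length(); i += n) {
            kept.append(input.charAt(i));
            origin.add(i + 1);
        }
        return kept.toString();
    }

    /**
     * 1-based positions, in the stream read by the monitor, of the events that
     * make the pattern progress. The list is shorter than the pattern if the
     * pattern is not completed.
     */
    static List<Integer> witness(String pattern, String stream) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < stream.length() && positions.size() < pattern.length(); i++) {
            if (stream.charAt(i) == pattern.charAt(positions.size())) {
                positions.add(i + 1);
            }
        }
        return positions;
    }

    /** Translates the witness found downstream into positions of the original input. */
    static List<Integer> explain(String pattern, String input, int n) {
        List<Integer> origin = new ArrayList<>();
        String decimated = decimate(input, n, origin);
        List<Integer> upstream = new ArrayList<>();
        for (int p : witness(pattern, decimated)) {
            upstream.add(origin.get(p - 1));
        }
        return upstream;
    }

    public static void main(String[] args) {
        System.out.println(explain("ABC", "AabcAaBabbCa", 2)); // prints [1, 7, 11]
    }
}
```

The monitor itself only sees positions 1, 4 and 6 of the decimated stream; mapping them back through the decimation stage yields positions 1, 7 and 11 of the original input.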
  • 58. S. Hallé Usage. Sample script in Java:
      CountDecimate d = new CountDecimate(2);
      Fork f = new Fork(3);
      SomeEventually a = new SomeEventually(
        new ApplyFunction(new Equals("a")));
      SomeEventually b = new SomeEventually(
        new ApplyFunction(new Equals("b")));
      SomeEventually c = new SomeEventually(
        new ApplyFunction(new Equals("c")));
      Sequence s1 = new Sequence();
      Sequence s2 = new Sequence();
      EventTracker t = new IndexEventTracker();
      Connector con = new Connector(t);
      con.connect(d, 0, f, 0).connect(f, 0, a, 0)
        .connect(f, 1, b, 0).connect(f, 2, c, 0)
        .connect(a, 0, s1, 0).connect(b, 0, s1, 1)
        .connect(s1, 0, s2, 0).connect(c, 0, s2, 1);
  • 59. S. Hallé Usage. Detecting the pattern and producing the witness:
      FindPattern fp = new FindPattern(g);
      Connector.connect(fp, new Print());
      for (char e : "AabcAaBabbCa".toCharArray())
        fp.getPushableInput().push(e);
    Produces the output: {(1,1), (1,7), (1,11)}
  • 61. S. Hallé Examples Pattern monitors have been implemented to detect "distilled" versions of real-world attacks and intrusions
  • 62. S. Hallé Examples. Linear sequence: declare a match when n successive events are observed. Example: ptrace exploit.
  • 63. S. Hallé Linear sequence (Figure: processor chain implementing the linear sequence pattern.)
  • 64. S. Hallé Examples. Combined: declare a match when n other patterns have been observed. Example: GandCrab attack.
  • 65. S. Hallé Combined (Figure: processor chain implementing the combined pattern.)
  • 66. S. Hallé Examples. Incomplete: declare a match when more than n instances of a sequential pattern are incomplete. Example: SYN flooding.
  • 67. S. Hallé Incomplete (Figure: processor chain implementing the incomplete pattern.)
  • 68. S. Hallé Examples. Pattern monitors have been implemented to detect "distilled" versions of real-world attacks and intrusions. Linear sequence: declare a match when n successive events are observed (ptrace exploit). Combined: declare a match when n other patterns have been observed (GandCrab attack). Incomplete: declare a match when more than n instances of a sequential pattern are incomplete (SYN flooding). Threshold: declare a match when more than n instances of a pattern have been observed (Remote Access Trojan).
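To make the "incomplete" pattern more concrete, here is a minimal Java sketch in the spirit of SYN flooding. It is not the monitor used in the slides: the class `IncompletePatternSketch`, the event kinds "SYN" and "ACK", and the connection identifiers are assumptions made for illustration. Each SYN opens an instance of the sequential pattern, the matching ACK completes it, and a match is declared when more than n instances remain incomplete.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration of the "incomplete" pattern; not BeepBeep code. */
class IncompletePatternSketch {
    private final int threshold;                     // the "n" of the pattern
    private final Set<String> open = new HashSet<>(); // instances started but not completed

    IncompletePatternSketch(int threshold) {
        this.threshold = threshold;
    }

    /**
     * Reads one event. The kind is assumed to be "SYN" (starts an instance) or
     * "ACK" (completes it); id identifies the connection. Returns true when
     * more than threshold instances are incomplete.
     */
    boolean push(String kind, String id) {
        if (kind.equals("SYN")) {
            open.add(id);
        } else if (kind.equals("ACK")) {
            open.remove(id);
        }
        return open.size() > threshold;
    }

    public static void main(String[] args) {
        IncompletePatternSketch m = new IncompletePatternSketch(2);
        System.out.println(m.push("SYN", "c1")); // false: 1 incomplete instance
        System.out.println(m.push("SYN", "c2")); // false: 2 incomplete instances
        System.out.println(m.push("ACK", "c1")); // false: c1 is now complete
        System.out.println(m.push("SYN", "c3")); // false: 2 incomplete instances
        System.out.println(m.push("SYN", "c4")); // true: 3 incomplete instances
    }
}
```

The threshold pattern can be sketched in the same way by counting completed instances instead of incomplete ones.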
  • 69. S. Hallé Threshold (Figure: processor chain implementing the threshold pattern.)
  • 71. S. Hallé Results. Do the pruning strategies reduce the number of matches? (Figure: bar chart of the number of matches, on a scale from 0 to 300, for the Linear sequence, Combined, Incomplete and Threshold patterns under each pruning strategy: None, First step, Distinct states, Progressing.)
  • 73. S. Hallé Results. Do the pruning strategies reduce the number of events included as witnesses? (Figure: bar chart of the fraction of events included, on a scale from 0 to 1.2, for the Linear sequence, Combined, Incomplete and Threshold patterns under each pruning strategy: None, First step, Distinct states, Progressing.)
  • 75. S. Hallé Results. What is the impact of the pruning strategies on processing time? (Figure: bar chart of the time in ms, on a scale from 0 to 1400, for the Linear sequence, Combined, Incomplete and Threshold patterns under each pruning strategy: None, First step, Distinct states, Progressing.)
  • 76. S. Hallé Take-home points. Attack/intrusion patterns can be formalized as runtime monitors. Stream processing pipelines allow the expression of complex monitors. Reducing the volume of data produced by pattern matches can be done using simple state-based pruning strategies on these monitors. A proof-of-concept implementation of these notions in a stream processing engine shows that intuitive incident reports can be produced automatically.
  • 77. S. Hallé For more information: https://liflab.ca and https://liflab.github.io/beepbeep-3