1/19
Digital Twins for Security Automation
IEEE/IFIP Network Operations and Management Symposium
8-12 May 2023, Miami FL USA
Kim Hammar & Rolf Stadler
2/19
Use Case: Intrusion Response
- A defender owns an infrastructure
  - Consists of connected components
  - Components run network services
  - Defender defends the infrastructure by monitoring and active defense
  - Has partial observability
- An attacker seeks to intrude on the infrastructure
  - Has a partial view of the infrastructure
  - Wants to compromise specific components
  - Attacks by reconnaissance, exploitation and pivoting
[Figure: infrastructure topology. The attacker and the clients connect through the gateway (node 1), which runs an IPS that sends alerts to the defender; nodes 2-31 are the internal components.]
3/19
Automated Intrusion Response: Current Landscape
Levels of security automation:
- 1980s: No automation. Manual detection. Manual prevention. No alerts. No automatic responses. Lack of tools.
- 1990s: Operator assistance. Manual detection. Manual prevention. Audit logs. Security tools.
- 2000s-Now: Partial automation. System has automated functions for detection/prevention but requires manual updating and configuration. Intrusion detection systems. Intrusion prevention systems.
- Research: High automation. System automatically updates itself. Automated attack detection. Automated attack mitigation.
4/19
Can we use decision theory and learning-based methods to automatically find effective security strategies? [1]
[Figure: feedback control loop with the labels: strategy π, security objective, feedback, control input, target system, security indicators, disturbance.]
[1] Kim Hammar and Rolf Stadler. “Finding Effective Security Strategies through Reinforcement Learning and Self-Play”. In: International Conference on Network and Service Management (CNSM 2020). Izmir, Turkey, 2020.
    Kim Hammar and Rolf Stadler. “Learning Intrusion Prevention Policies through Optimal Stopping”. In: International Conference on Network and Service Management (CNSM 2021). http://dl.ifip.org/db/conf/cnsm/cnsm2021/1570732932.pdf. Izmir, Turkey, 2021.
    Kim Hammar and Rolf Stadler. “Intrusion Prevention Through Optimal Stopping”. In: IEEE Transactions on Network and Service Management 19.3 (2022), pp. 2333–2348. doi: 10.1109/TNSM.2022.3176781.
    Kim Hammar and Rolf Stadler. Learning Near-Optimal Intrusion Responses Against Dynamic Attackers. 2023. doi: 10.48550/ARXIV.2301.06085. url: https://arxiv.org/abs/2301.06085.
5/19
Our Framework for Automated Network Security
[Figure: framework overview with three layers: Target Infrastructure, Digital Twin, and Simulation System (a model with states s1,1 ... s2,n). Arrow labels between the target infrastructure and the digital twin: Selective Replication, Strategy Implementation π. Arrow labels between the digital twin and the simulation system: Model Creation & System Identification, Strategy Mapping π. Layer annotations: Reinforcement Learning & Generalization; Strategy evaluation & Model estimation; Automation & Self-learning systems.]
6/19
Creating a Digital Twin of the Target Infrastructure
6/19
Theoretical Analysis and Learning of Defender Strategies
7/19
Creating a Digital Twin of the Target Infrastructure
- An infrastructure is defined by its configuration.
- The set of configurations supported by our framework can be seen as a configuration space.
- The configuration space defines the class of infrastructures for which we can create digital twins.
[Figure: a configuration space whose points (*) correspond to digital twins.]
8/19
The Target Infrastructure
- 33 components
- Topology shown in the figure
- Components run network services, e.g. IDPS, SSH, and Web
- A subset of components have vulnerabilities
  - CVE-2017-7494, CVE-2015-3306, CVE-2015-5602
  - CVE-2014-6271, CVE-2016-10033, CVE-2015-1427, etc.
- Clients and the attacker access the infrastructure through the public gateway
[Figure: infrastructure topology. The attacker and the clients connect through the gateway (node 1), which runs an IPS that sends alerts to the defender; the remaining numbered nodes are the internal components.]
9/19
Emulating Physical Components
- We emulate physical components with Docker containers
- Focus on Linux-based systems
- The containers include everything needed to emulate the host: a runtime system, code, system tools, system libraries, and configurations.
- Examples of containers: IDPS container, client container, attacker container, CVE-2015-1427 container, Open vSwitch containers, etc.
[Figure: emulation stack with layers labelled Containers, Physical server, Operating system, Docker engine, and CSLE.]
10/19
Emulating Network Connectivity
[Figure: management nodes 1, 2, ..., n, each hosting an emulated IT infrastructure, connected by VXLAN tunnels over an IP network.]
- We emulate network connectivity on the same host using network namespaces.
- Connectivity across physical hosts is achieved using VXLAN tunnels with Docker Swarm.
11/19
Emulating Network Conditions
- We do traffic shaping using NetEm in the Linux kernel
- Emulated internal connections are full-duplex and lossless, with bit capacities of 1000 Mbit/s
- Emulated external connections are full-duplex, with bit capacities of 100 Mbit/s, 0.1% packet loss in normal operation, and random bursts of 1% packet loss
[Figure: Linux network stack. Application processes in user space use sockets; in the kernel, packets pass through the TCP/IP stack (TCP/UDP, IP/Ethernet/802.11), the queueing discipline (where the NetEm configuration of latency, jitter, etc. applies), and the device driver queue (FIFO) to the NIC.]
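The link conditions on this slide can be written down as NetEm queueing-discipline settings. The sketch below is only an illustration: it builds `tc` command lines and does not run them. The rates and loss percentages are the ones stated on the slide, while the interface name `eth0`, the `LinkProfile` class, and the choice of `tc qdisc replace` are assumptions, not the commands the framework actually issues.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkProfile:
    """Link conditions to emulate on one network interface (illustrative type)."""
    rate_mbit: int       # bit capacity in Mbit/s
    loss_percent: float  # random packet loss in percent (0 means lossless)


# Values from the slide; the constant names are hypothetical.
INTERNAL = LinkProfile(rate_mbit=1000, loss_percent=0.0)
EXTERNAL = LinkProfile(rate_mbit=100, loss_percent=0.1)
EXTERNAL_BURST = LinkProfile(rate_mbit=100, loss_percent=1.0)


def netem_command(dev: str, profile: LinkProfile) -> list[str]:
    """Build (but do not execute) a tc command that installs a NetEm qdisc."""
    cmd = ["tc", "qdisc", "replace", "dev", dev, "root", "netem",
           "rate", f"{profile.rate_mbit}mbit"]
    if profile.loss_percent > 0:
        cmd += ["loss", f"{profile.loss_percent}%"]
    return cmd


if __name__ == "__main__":
    for profile in (INTERNAL, EXTERNAL, EXTERNAL_BURST):
        print(" ".join(netem_command("eth0", profile)))
```

Switching an external link between `EXTERNAL` and `EXTERNAL_BURST` at random times would be one way to approximate the bursts of 1% loss; how the bursts are scheduled is not stated on the slide.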
12/19
Emulating Actors
- We emulate client arrivals with Poisson processes
- We emulate client interactions with load generators
- Attackers are emulated by automated programs that select actions from a pre-defined set
- Defender actions are emulated through a custom gRPC API.
[Figure: three layers: IT Infrastructure, Digital Twin (virtual network, virtual devices, emulated services, emulated actors), and Markov Decision Process. Arrow labels: Configuration & change events, System traces, Verified security policy, Optimized security policy.]
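As a minimal sketch of how client arrivals can follow a Poisson process: the inter-arrival times of a homogeneous Poisson process with rate λ are independent exponential random variables with mean 1/λ, so arrival times can be sampled by accumulating exponential draws. The rate, horizon, and function name below are illustrative assumptions, not values from the slides.

```python
import random


def poisson_arrival_times(rate: float, horizon: float, rng: random.Random) -> list[float]:
    """Sample the arrival times of a homogeneous Poisson process on [0, horizon)."""
    times: list[float] = []
    t = rng.expovariate(rate)  # first inter-arrival time, Exp(rate)
    while t < horizon:
        times.append(t)
        t += rng.expovariate(rate)  # next arrival = previous + Exp(rate)
    return times


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so that a run is repeatable
    arrivals = poisson_arrival_times(rate=2.0, horizon=1000.0, rng=rng)
    # The expected number of arrivals is rate * horizon = 2000.
    print(len(arrivals), arrivals[:3])
```

Each sampled time would then trigger a load generator session for one emulated client.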
13/19
System Identification
14/19
Monitoring and Telemetry
[Figure: data flow between Devices, Event bus, Data pipelines, Storage Systems, and Security Policy. Devices emit events; the security policy issues control actions.]
- Emulated devices run monitoring agents that periodically push metrics to a Kafka event bus.
- The data in the event bus is consumed by data pipelines that process the data and write to storage systems.
- The processed data is used by an automated security policy to decide on control actions to execute in the digital twin.
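The data flow described above (agents, event bus, pipelines, storage, policy) can be sketched with the Python standard library. This is a toy stand-in under loud assumptions: `queue.Queue` replaces the Kafka event bus, a dictionary replaces the storage systems, and the metric name, the threshold rule, and the action names `stop` and `continue` are hypothetical.

```python
import queue
import statistics
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    device: str
    name: str
    value: float


def monitoring_agent(device: str, readings: list[float], bus: queue.Queue) -> None:
    """Push one metric event per reading to the event bus (stand-in for Kafka)."""
    for value in readings:
        bus.put(Metric(device=device, name="ips_alerts", value=value))


def data_pipeline(bus: queue.Queue, storage: dict[str, list[float]]) -> None:
    """Consume all pending events and append their values to per-device storage."""
    while True:
        try:
            event = bus.get_nowait()
        except queue.Empty:
            return
        storage.setdefault(event.device, []).append(event.value)


def security_policy(storage: dict[str, list[float]], threshold: float) -> dict[str, str]:
    """Pick a control action per device from the processed data (hypothetical rule)."""
    return {
        device: "stop" if statistics.mean(values) > threshold else "continue"
        for device, values in storage.items()
    }


if __name__ == "__main__":
    bus: queue.Queue = queue.Queue()
    storage: dict[str, list[float]] = {}
    monitoring_agent("gateway", [10.0, 250.0, 400.0], bus)
    monitoring_agent("node-2", [0.0, 3.0, 1.0], bus)
    data_pipeline(bus, storage)
    print(security_policy(storage, threshold=100.0))
```

Keeping the three stages decoupled through the bus mirrors the design on the slide: agents never call the policy directly, so producers and consumers can be replaced independently.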
15/19
Estimating Metric Distributions
[Figure: probability distribution of the number of IPS alerts weighted by priority, $o_t$ (x-axis from 0 to 9000). Two panels show the estimates $\hat{f}_O(o_t \mid 0)$ and $\hat{f}_O(o_t \mid 1)$. Legend: fitted model, distribution for $s_t = 0$, distribution for $s_t = 1$.]
- We use the collected data to estimate metric distributions.
- We use the estimated distributions to instantiate Markov games and Markov decision processes.
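One simple way to estimate a conditional metric distribution from collected data is a binned empirical estimate with add-one smoothing, computed separately for each state. The slides do not say which model family is fitted, so the estimator below is an assumption, and the sample values are made up for illustration.

```python
from collections import Counter


def empirical_pmf(samples: list[int], bin_width: int, num_bins: int) -> list[float]:
    """Estimate a binned probability mass function with add-one smoothing.

    Samples are assumed non-negative; values beyond the last bin are clipped into it.
    """
    counts = Counter(min(s // bin_width, num_bins - 1) for s in samples)
    total = len(samples) + num_bins  # add-one smoothing adds one pseudo-count per bin
    return [(counts[b] + 1) / total for b in range(num_bins)]


# Hypothetical alert counts o_t, grouped by the state s_t in which they were recorded.
alerts_no_intrusion = [12, 40, 75, 130, 90, 20, 310, 55]              # s_t = 0
alerts_intrusion = [2400, 5100, 3900, 7600, 4800, 6200, 1500, 8800]   # s_t = 1

f_hat = {
    0: empirical_pmf(alerts_no_intrusion, bin_width=1000, num_bins=10),
    1: empirical_pmf(alerts_intrusion, bin_width=1000, num_bins=10),
}

if __name__ == "__main__":
    for state, pmf in f_hat.items():
        print(state, [round(p, 3) for p in pmf])
```

The smoothing keeps every bin strictly positive, which avoids zero-probability observations when the estimates are later used as the observation function of a POMDP or Markov game.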
16/19
Learning Security Strategies
- We model the evolution of the system with a discrete-time dynamical system.
- We assume a Markovian system with stochastic dynamics and partial observability.
- A Partially Observed Markov Decision Process (POMDP)
  - If the attacker is static.
- A Partially Observed Stochastic Game (POSG)
  - If the attacker is dynamic.
[Figure: control loop. The stochastic (Markov) system with state $s_t$ is observed through a noisy sensor, which yields the observation $o_t$; an optimal filter computes the belief $b_t$ for the controller. Action labels: $a^{(1)}_t$ and $a^{(2)}_t$, with the attacker acting on the system.]
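The belief $b_t$ produced by the filter in the diagram is the posterior over hidden states given the observation history. For a finite state space it follows the standard Bayes filter recursion $b_{t+1}(s') \propto f_O(o_{t+1} \mid s') \sum_s P(s' \mid s)\, b_t(s)$. The sketch below implements one step of that recursion; the two-state model and all numbers are hypothetical and are not taken from the slides.

```python
def belief_update(
    belief: list[float],
    transition: list[list[float]],
    obs_likelihood: list[float],
) -> list[float]:
    """One step of the Bayes filter for a finite-state partially observed system.

    belief[s]          -- current probability b_t(s)
    transition[s][s2]  -- P(s_{t+1} = s2 | s_t = s) under the actions taken
    obs_likelihood[s2] -- f_O(o_{t+1} | s_{t+1} = s2) for the received observation
    """
    n = len(belief)
    # Prediction step: push the belief through the transition model.
    predicted = [sum(belief[s] * transition[s][s2] for s in range(n)) for s2 in range(n)]
    # Correction step: weight by the observation likelihood and normalise.
    unnormalized = [obs_likelihood[s2] * predicted[s2] for s2 in range(n)]
    norm = sum(unnormalized)
    if norm == 0:
        raise ValueError("observation has zero probability under the model")
    return [u / norm for u in unnormalized]


# Hypothetical two-state example: state 0 = no intrusion, state 1 = intrusion.
transition = [[0.9, 0.1], [0.0, 1.0]]
b0 = [1.0, 0.0]
b1 = belief_update(b0, transition, obs_likelihood=[0.05, 0.5])

if __name__ == "__main__":
    print(b1)
```

A controller can then act on the belief instead of the unobserved state, for example by taking a defensive action once the probability assigned to the intrusion state exceeds a threshold.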
17/19
Learning Security Strategies
[Figure: three learning-curve panels: reward per episode, episode length (steps), and P[intrusion stopped]. Curves: T-SPSA simulation, T-SPSA digital twin, $o_t > 0$ baseline, Snort IDPS, upper bound.]
- T-SPSA is our reinforcement learning algorithm
- T-SPSA outperforms Snort and converges to near-optimal strategies
- Performance is slightly better in simulation than in the digital twin, but the performance in the two environments is clearly correlated.
18/19
For more details about the theory
- Finding Effective Security Strategies through Reinforcement Learning and Self-Play [2]
- Learning Intrusion Prevention Policies through Optimal Stopping [3]
- A System for Interactive Examination of Learned Security Policies [4]
- Intrusion Prevention Through Optimal Stopping [5]
- Learning Security Strategies through Game Play and Optimal Stopping [6]
- An Online Framework for Adapting Security Policies in Dynamic IT Environments [7]
- Learning Near-Optimal Intrusion Responses Against Dynamic Attackers [8]
[2] Kim Hammar and Rolf Stadler. “Finding Effective Security Strategies through Reinforcement Learning and Self-Play”. In: International Conference on Network and Service Management (CNSM 2020). Izmir, Turkey, 2020.
[3] Kim Hammar and Rolf Stadler. “Learning Intrusion Prevention Policies through Optimal Stopping”. In: International Conference on Network and Service Management (CNSM 2021). http://dl.ifip.org/db/conf/cnsm/cnsm2021/1570732932.pdf. Izmir, Turkey, 2021.
[4] Kim Hammar and Rolf Stadler. “A System for Interactive Examination of Learned Security Policies”. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium. 2022, pp. 1–3. doi: 10.1109/NOMS54207.2022.9789707.
[5] Kim Hammar and Rolf Stadler. “Intrusion Prevention Through Optimal Stopping”. In: IEEE Transactions on Network and Service Management 19.3 (2022), pp. 2333–2348. doi: 10.1109/TNSM.2022.3176781.
[6] Kim Hammar and Rolf Stadler. “Learning Security Strategies through Game Play and Optimal Stopping”. In: Proceedings of the ML4Cyber workshop, ICML 2022, Baltimore, USA, July 17-23, 2022. PMLR, 2022.
[7] Kim Hammar and Rolf Stadler. “An Online Framework for Adapting Security Policies in Dynamic IT Environments”. In: International Conference on Network and Service Management (CNSM 2022). Thessaloniki, Greece, 2022.
[8] Kim Hammar and Rolf Stadler. Learning Near-Optimal Intrusion Responses Against Dynamic Attackers. 2023. doi: 10.48550/ARXIV.2301.06085. url: https://arxiv.org/abs/2301.06085.
19/19
Conclusions
- We develop a framework for automated security.
- Our framework centers around a digital twin.
- We use the digital twin to optimize security strategies through reinforcement learning, game theory, and control theory.
- Documentation of our framework: limmen.dev/csle.
[Figure: three layers: IT Infrastructure, Digital Twin (virtual network, virtual devices, emulated services, emulated actors), and Markov Decision Process. Arrow labels: Configuration & change events, System traces, Verified security policy, Optimized security policy.]

More Related Content

PDF
Learning Near-Optimal Intrusion Responses for IT Infrastructures via Decompos...
PDF
Learning Optimal Intrusion Responses via Decomposition
PDF
Learning Near-Optimal Intrusion Responses for IT Infrastructures via Decompos...
PDF
Learning Automated Intrusion Response
PDF
Automated Intrusion Response - CDIS Spring Conference 2024
PDF
CNSM 2022 - An Online Framework for Adapting Security Policies in Dynamic IT ...
PDF
Self-Learning Systems for Cyber Security
PDF
Self-learning systems for cyber security
Learning Near-Optimal Intrusion Responses for IT Infrastructures via Decompos...
Learning Optimal Intrusion Responses via Decomposition
Learning Near-Optimal Intrusion Responses for IT Infrastructures via Decompos...
Learning Automated Intrusion Response
Automated Intrusion Response - CDIS Spring Conference 2024
CNSM 2022 - An Online Framework for Adapting Security Policies in Dynamic IT ...
Self-Learning Systems for Cyber Security
Self-learning systems for cyber security

Similar to Digital Twins for Security Automation (20)

PDF
Intrusion Prevention through Optimal Stopping
PDF
Learning Near-Optimal Intrusion Response for Large-Scale IT Infrastructures v...
PDF
Self-Learning Systems for Cyber Security
PDF
Research of Intrusion Preventio System based on Snort
PDF
A SIMULATION APPROACH TO PREDICATE THE RELIABILITY OF A PERVASIVE SOFTWARE SY...
PDF
Hacking Kubernetes Threat Driven Analysis and Defense 1st Edition Andrew Martin
PDF
Chapter9 network managment-3ed
PDF
Understand, verify, and act on the security of your Kubernetes clusters - Sca...
PPTX
Lecture 1 - Introduction.pptx
PPT
Computernetworkingkurosech9 091011003335-phpapp01
PDF
IRJET- Study of Various Network Simulators
PDF
Security Center.pdf
PPT
Persentation of Cyber Security in Smart Grid
PDF
Leveraging Artificial Intelligence Processing on Edge Devices
 
PDF
Learning Security Strategies through Game Play and Optimal Stopping
PPTX
User Behavior Analytics Using Machine Learning
PDF
AI for Cybersecurity Innovation
PPTX
Cyber Security
PDF
Design and Implementation of Smart Bell Notification System using IoT
PDF
Design and Implementation of Smart Bell Notification System using IoT
Intrusion Prevention through Optimal Stopping
Learning Near-Optimal Intrusion Response for Large-Scale IT Infrastructures v...
Self-Learning Systems for Cyber Security
Research of Intrusion Preventio System based on Snort
A SIMULATION APPROACH TO PREDICATE THE RELIABILITY OF A PERVASIVE SOFTWARE SY...
Hacking Kubernetes Threat Driven Analysis and Defense 1st Edition Andrew Martin
Chapter9 network managment-3ed
Understand, verify, and act on the security of your Kubernetes clusters - Sca...
Lecture 1 - Introduction.pptx
Computernetworkingkurosech9 091011003335-phpapp01
IRJET- Study of Various Network Simulators
Security Center.pdf
Persentation of Cyber Security in Smart Grid
Leveraging Artificial Intelligence Processing on Edge Devices
 
Learning Security Strategies through Game Play and Optimal Stopping
User Behavior Analytics Using Machine Learning
AI for Cybersecurity Innovation
Cyber Security
Design and Implementation of Smart Bell Notification System using IoT
Design and Implementation of Smart Bell Notification System using IoT
Ad

More from Kim Hammar (20)

PDF
Approximation in Value Space using Aggregation, with Applications to POMDPs a...
PDF
Adaptive Security Policies via Belief Aggregation and Rollout
PDF
Optimal Security Response to Network Intrusions in IT Systems
PDF
Intrusion Tolerance as a Two-Level Game - GameSec24
PDF
Intrusion Tolerance for Networked Systems through Two-Level Feedback Control
PDF
Intrusion Tolerance as a Two-Level Game (Visit to Melbourne University)
PDF
Automated Security Response through Online Learning with Adaptive Con jectures
PDF
Självlärande System för Cybersäkerhet. KTH
PDF
Intrusion Tolerance for Networked Systems through Two-level Feedback Control
PDF
Gamesec23 - Scalable Learning of Intrusion Response through Recursive Decompo...
PDF
Självlärande system för cyberförsvar.
PDF
Intrusion Response through Optimal Stopping
PDF
Self-Learning Systems for Cyber Defense
PDF
Self-learning Intrusion Prevention Systems.
PDF
Intrusion Prevention through Optimal Stopping
PDF
Intrusion Prevention through Optimal Stopping and Self-Play
PDF
Introduktion till försvar mot nätverksintrång. 22 Feb 2022. EP1200 KTH.
PDF
Intrusion Prevention through Optimal Stopping.
PDF
A Game Theoretic Analysis of Intrusion Detection in Access Control Systems - ...
PDF
Reinforcement Learning Algorithms for Adaptive Cyber Defense against Heartbleed
Approximation in Value Space using Aggregation, with Applications to POMDPs a...
Adaptive Security Policies via Belief Aggregation and Rollout
Optimal Security Response to Network Intrusions in IT Systems
Intrusion Tolerance as a Two-Level Game - GameSec24
Intrusion Tolerance for Networked Systems through Two-Level Feedback Control
Intrusion Tolerance as a Two-Level Game (Visit to Melbourne University)
Automated Security Response through Online Learning with Adaptive Con jectures
Självlärande System för Cybersäkerhet. KTH
Intrusion Tolerance for Networked Systems through Two-level Feedback Control
Gamesec23 - Scalable Learning of Intrusion Response through Recursive Decompo...
Självlärande system för cyberförsvar.
Intrusion Response through Optimal Stopping
Self-Learning Systems for Cyber Defense
Self-learning Intrusion Prevention Systems.
Intrusion Prevention through Optimal Stopping
Intrusion Prevention through Optimal Stopping and Self-Play
Introduktion till försvar mot nätverksintrång. 22 Feb 2022. EP1200 KTH.
Intrusion Prevention through Optimal Stopping.
A Game Theoretic Analysis of Intrusion Detection in Access Control Systems - ...
Reinforcement Learning Algorithms for Adaptive Cyber Defense against Heartbleed
Ad

Recently uploaded (20)

PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PDF
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
PDF
Mobile App Security Testing_ A Comprehensive Guide.pdf
PDF
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
PDF
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
PDF
cuic standard and advanced reporting.pdf
PDF
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
PDF
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
PPTX
Spectroscopy.pptx food analysis technology
PDF
Chapter 3 Spatial Domain Image Processing.pdf
PDF
The Rise and Fall of 3GPP – Time for a Sabbatical?
PDF
NewMind AI Weekly Chronicles - August'25 Week I
PPTX
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
PDF
Agricultural_Statistics_at_a_Glance_2022_0.pdf
PDF
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
PPTX
Understanding_Digital_Forensics_Presentation.pptx
PDF
KodekX | Application Modernization Development
PPTX
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
PDF
Machine learning based COVID-19 study performance prediction
DOCX
The AUB Centre for AI in Media Proposal.docx
Diabetes mellitus diagnosis method based random forest with bat algorithm
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
Mobile App Security Testing_ A Comprehensive Guide.pdf
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
cuic standard and advanced reporting.pdf
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
Spectroscopy.pptx food analysis technology
Chapter 3 Spatial Domain Image Processing.pdf
The Rise and Fall of 3GPP – Time for a Sabbatical?
NewMind AI Weekly Chronicles - August'25 Week I
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
Agricultural_Statistics_at_a_Glance_2022_0.pdf
Architecting across the Boundaries of two Complex Domains - Healthcare & Tech...
Understanding_Digital_Forensics_Presentation.pptx
KodekX | Application Modernization Development
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
Machine learning based COVID-19 study performance prediction
The AUB Centre for AI in Media Proposal.docx

Digital Twins for Security Automation

  • 1. 1/19 Digital Twins for Security Automation IEEE/IFIP Network Operations and Management Symposium 8-12 May 2023, Miami FL USA Kim Hammar & Rolf Stadler
  • 2. 2/19 Use Case: Intrusion Response I A defender owns an infrastructure I Consists of connected components I Components run network services I Defender defends the infrastructure by monitoring and active defense I Has partial observability I An attacker seeks to intrude on the infrastructure I Has a partial view of the infrastructure I Wants to compromise specific components I Attacks by reconnaissance, exploitation and pivoting Attacker Clients . . . Defender 1 IPS 1 alerts Gateway 7 8 9 10 11 6 5 4 3 2 12 13 14 15 16 17 18 19 21 23 20 22 24 25 26 27 28 29 30 31
  • 3. 3/19 Automated Intrusion Response: Current Landscape Levels of security automation No automation. Manual detection. Manual prevention. No alerts. No automatic responses. Lack of tools. 1980s 1990s 2000s-Now Research Operator assistance. Manual detection. Manual prevention. Audit logs. Security tools. Partial automation. System has automated functions for detection/prevention but requires manual updating and configuration. Intrusion detection systems. Intrusion prevention systems. High automation. System automatically updates itself. Automated attack detection. Automated attack mitigation.
  • 4. 4/19 Can we use decision theory and learning-based methods to automatically find effective security strategies?1 π Σ Σ security objective feedback control input target system security indicators disturbance 1 Kim Hammar and Rolf Stadler. “Finding Effective Security Strategies through Reinforcement Learning and Self-Play”. In: International Conference on Network and Service Management (CNSM 2020). Izmir, Turkey, 2020, Kim Hammar and Rolf Stadler. “Learning Intrusion Prevention Policies through Optimal Stopping”. In: International Conference on Network and Service Management (CNSM 2021). http://dl.ifip.org/db/conf/cnsm/cnsm2021/1570732932.pdf. Izmir, Turkey, 2021, Kim Hammar and Rolf Stadler. “Intrusion Prevention Through Optimal Stopping”. In: IEEE Transactions on Network and Service Management 19.3 (2022), pp. 2333–2348. doi: 10.1109/TNSM.2022.3176781, Kim Hammar and Rolf Stadler. Learning Near-Optimal Intrusion Responses Against Dynamic Attackers. 2023. doi: 10.48550/ARXIV.2301.06085. url: https://arxiv.org/abs/2301.06085.
  • 5. 5/19 Our Framework for Automated Network Security s1,1 s1,2 s1,3 . . . s1,n s2,1 s2,2 s2,3 . . . s2,n . . . . . . . . . . . . . . . Digital Twin Target Infrastructure Model Creation & System Identification Strategy Mapping π Selective Replication Strategy Implementation π Simulation System Reinforcement Learning & Generalization Strategy evaluation & Model estimation Automation & Self-learning systems
  • 6. 5/19 Our Framework for Automated Network Security s1,1 s1,2 s1,3 . . . s1,n s2,1 s2,2 s2,3 . . . s2,n . . . . . . . . . . . . . . . Digital Twin Target Infrastructure Model Creation & System Identification Strategy Mapping π Selective Replication Strategy Implementation π Simulation System Reinforcement Learning & Generalization Strategy evaluation & Model estimation Automation & Self-learning systems
  • 7. 5/19 Our Framework for Automated Network Security s1,1 s1,2 s1,3 . . . s1,n s2,1 s2,2 s2,3 . . . s2,n . . . . . . . . . . . . . . . Digital Twin Target Infrastructure Model Creation & System Identification Strategy Mapping π Selective Replication Strategy Implementation π Simulation System Reinforcement Learning & Generalization Strategy evaluation & Model estimation Automation & Self-learning systems
  • 8. 5/19 Our Framework for Automated Network Security s1,1 s1,2 s1,3 . . . s1,n s2,1 s2,2 s2,3 . . . s2,n . . . . . . . . . . . . . . . Digital Twin Target Infrastructure Model Creation & System Identification Strategy Mapping π Selective Replication Strategy Implementation π Simulation System Reinforcement Learning & Generalization Strategy evaluation & Model estimation Automation & Self-learning systems
  • 9. 5/19 Our Framework for Automated Network Security s1,1 s1,2 s1,3 . . . s1,n s2,1 s2,2 s2,3 . . . s2,n . . . . . . . . . . . . . . . Digital Twin Target Infrastructure Model Creation & System Identification Strategy Mapping π Selective Replication Strategy Implementation π Simulation System Reinforcement Learning & Generalization Strategy evaluation & Model estimation Automation & Self-learning systems
  • 10. 5/19 Our Framework for Automated Network Security s1,1 s1,2 s1,3 . . . s1,n s2,1 s2,2 s2,3 . . . s2,n . . . . . . . . . . . . . . . Digital Twin Target Infrastructure Model Creation & System Identification Strategy Mapping π Selective Replication Strategy Implementation π Simulation System Reinforcement Learning & Generalization Strategy evaluation & Model estimation Automation & Self-learning systems
  • 11. 5/19 Our Framework for Automated Network Security s1,1 s1,2 s1,3 . . . s1,n s2,1 s2,2 s2,3 . . . s2,n . . . . . . . . . . . . . . . Digital Twin Target Infrastructure Model Creation & System Identification Strategy Mapping π Selective Replication Strategy Implementation π Simulation System Reinforcement Learning & Generalization Strategy evaluation & Model estimation Automation & Self-learning systems
  • 12. 5/19 Our Framework for Automated Network Security s1,1 s1,2 s1,3 . . . s1,n s2,1 s2,2 s2,3 . . . s2,n . . . . . . . . . . . . . . . Digital Twin Target Infrastructure Model Creation & System Identification Strategy Mapping π Selective Replication Strategy Implementation π Simulation System Reinforcement Learning & Generalization Strategy evaluation & Model estimation Automation & Self-learning systems
  • 13. 6/19 Creating a Digital Twin of the Target Infrastructure s1,1 s1,2 s1,3 . . . s1,n s2,1 s2,2 s2,3 . . . s2,n . . . . . . . . . . . . . . . Digital Twin Target Infrastructure Model Creation & System Identification Strategy Mapping π Selective Replication Strategy Implementation π Simulation System Reinforcement Learning & Generalization Strategy evaluation & Model estimation Automation & Self-learning systems
  • 14. 6/19 Theoretical Analysis and Learning of Defender Strategies s1,1 s1,2 s1,3 . . . s1,n s2,1 s2,2 s2,3 . . . s2,n . . . . . . . . . . . . . . . Digital Twin Target Infrastructure Model Creation & System Identification Strategy Mapping π Selective Replication Strategy Implementation π Simulation System Reinforcement Learning & Generalization Strategy evaluation & Model estimation Automation & Self-learning systems
  • 15. 7/19 Creating a Digital Twin of the Target Infrastructure I An infrastructure is defined by its configuration. I Set of configurations supported by our framework can be seen as a configuration space I The configuration space defines the class of infrastructures for which we can create digital twins. Configuration Space * * * Digital twins R1 R1 R1
  • 16. 8/19 The Target Infrastructure I 33 components I Topology shown to the right I Components run network services, e.g. IDPS, SSH, Web, etc. I A subset of components have vulnerabilities I CVE-2017-7494, CVE-2015-3306, CVE-2015-5602 I CVE-2014-6271, CVE-2016-10033, CVE-2015-1427, etc. I Clients and the attacker access the infrastructure through the public gateway Attacker Clients . . . Defender 1 IPS 1 alerts Gateway 7 8 9 10 11 6 5 4 3 2 12 13 14 15 16 17 18 19 21 23 20 22 24 25 26 27 28 29 30 31
  • 17. 9/19 Emulating Physical Components I We emulate physical components with Docker containers I Focus on linux-based systems I The containers include everything needed to emulate the host: a runtime system, code, system tools, system libraries, and configurations. I Examples of containers: IDPS container, client container, attacker container, CVE-2015-1427 container, Open vSwitch containers, etc. Containers Physical server Operating system Docker engine CSLE
  • 18. 10/19 Emulating Network Connectivity Management node 1 Emulated IT infrastructure Management node 2 Emulated IT infrastructure Management node n Emulated IT infrastructure VXLAN VXLAN . . . VXLAN IP network I We emulate network connectivity on the same host using network namespaces. I Connectivity across physical hosts is achieved using VXLAN tunnels with Docker swarm.
  • 19. 11/19 Emulating Network Conditions I We do traffic shaping using NetEm in the Linux kernel I Emulate internal connections are full-duplex & loss-less with bit capacities of 1000 Mbit/s I Emulate external connections are full-duplex with bit capacities of 100 Mbit/s & 0.1% packet loss in normal operation and random bursts of 1% packet loss User space . . . Application processes Kernel TCP/UDP IP/Ethernet/802.11 OS TCP/IP stack Queueing discipline Device driver queue (FIFO) NIC Netem config: latency, jitter, etc. Sockets
  • 20. 12/19 Emulating Actors I We emulate client arrivals with Poisson processes I We emulate client interactions with load generators I Attackers are emulated by automated programs that select actions from a pre-defined set I Defender actions are emulated through a custom gRPC API. Markov Decision Process s1,1 s1,2 s1,3 . . . s1,4 s2,1 s2,2 s2,3 . . . s2,4 Digital Twin . . . Virtual network Virtual devices Emulated services Emulated actors IT Infrastructure Configuration & change events System traces Verified security policy Optimized security policy
  • 21. 13/19 System Identification s1,1 s1,2 s1,3 . . . s1,n s2,1 s2,2 s2,3 . . . s2,n . . . . . . . . . . . . . . . Digital Twin Target Infrastructure Model Creation & System Identification Strategy Mapping π Selective Replication Strategy Implementation π Simulation System Reinforcement Learning & Generalization Strategy evaluation & Model estimation Automation & Self-learning systems
  • 22. 14/19 Monitoring and Telemetry
      • Emulated devices run monitoring agents that periodically push metrics to a Kafka event bus.
      • The data in the event bus is consumed by data pipelines that process the data and write it to storage systems.
      • The processed data is used by an automated security policy to decide on control actions to execute in the digital twin.
      [Figure: data flow with the labels Devices, Events, Event bus, Data pipelines, Storage Systems, Security Policy, and Control actions.]
  • 23. 15/19 Estimating Metric Distributions
      • We use the collected data to estimate metric distributions.
      • We use the estimated distributions to instantiate Markov games and Markov decision processes.
      [Figure: probability distributions of the number of IPS alerts weighted by priority, $o_t$ (x-axis from 0 to 9000). Shown are the empirical distributions for $s_t = 0$ and $s_t = 1$ together with the fitted models $\hat{f}_O(o_t \mid 0)$ and $\hat{f}_O(o_t \mid 1)$.]
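One simple way to estimate such conditional distributions is a smoothed histogram per state. The slides do not say which model family is fitted, so the sketch below should be read as an illustration only: the binning, the Laplace smoothing, and the alert counts are all hypothetical.

```python
from collections import Counter


def estimate_distribution(samples, bin_width, num_bins, smoothing=1.0):
    """Estimate a binned distribution of a non-negative metric.

    Values beyond the last bin are clipped into it. Additive (Laplace)
    smoothing keeps every bin probability strictly positive.
    """
    counts = Counter(min(int(o // bin_width), num_bins - 1) for o in samples)
    total = len(samples) + smoothing * num_bins
    return [(counts[b] + smoothing) / total for b in range(num_bins)]


# Hypothetical traces of priority-weighted IPS alerts, grouped by state:
# 0 = no intrusion, 1 = intrusion. These numbers are made up.
traces = {
    0: [120, 300, 80, 450, 200, 950, 60, 310],
    1: [4200, 5100, 3900, 6100, 4800, 5200, 7000, 4500],
}

# f_hat[s][b] approximates the probability that o_t falls in bin b given s_t = s.
f_hat = {s: estimate_distribution(obs, bin_width=1000, num_bins=10)
         for s, obs in traces.items()}
print(f_hat[0][0], f_hat[1][4])
```

The resulting table `f_hat` can serve as the observation function of a POMDP or stochastic game instantiated from the collected data.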
  • 24. 16/19 Learning Security Strategies
      • We model the evolution of the system with a discrete-time dynamical system.
      • We assume a Markovian system with stochastic dynamics and partial observability.
      • If the attacker is static, the model is a Partially Observed Markov Decision Process (POMDP).
      • If the attacker is dynamic, the model is a Partially Observed Stochastic Game (POSG).
      [Figure: control loop. A stochastic (Markov) system with state $s_t$ receives the controller action $a^{(1)}_t$ and the attacker action $a^{(2)}_t$. A noisy sensor produces the observation $o_t$, and an optimal filter turns the observations into the belief $b_t$ used by the controller.]
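The "optimal filter" in the control loop is a Bayesian belief update. The sketch below shows one filtering step for a two-state model (0 = no intrusion, 1 = intrusion) under a fixed defender action and a static attacker. The transition and observation probabilities are hypothetical placeholders, not estimates from the digital twin.

```python
def belief_update(belief, obs, transition, obs_model):
    """One step of the Bayes filter for a finite-state hidden Markov model.

    belief[s]         : P[s_t = s | history]
    transition[s][s2] : P[s_{t+1} = s2 | s_t = s] for the fixed actions
    obs_model[s2][o]  : P[o_{t+1} = o | s_{t+1} = s2]
    """
    n = len(belief)
    # Prediction step: propagate the belief through the dynamics.
    predicted = [sum(belief[s] * transition[s][s2] for s in range(n))
                 for s2 in range(n)]
    # Correction step: weight by the observation likelihood and normalize.
    unnormalized = [obs_model[s2][obs] * predicted[s2] for s2 in range(n)]
    z = sum(unnormalized)
    if z == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [u / z for u in unnormalized]


# Hypothetical model: an intrusion starts with probability 0.1 per step
# and persists once started; observations are binned as 0 = few alerts,
# 1 = many alerts.
transition = [[0.9, 0.1],
              [0.0, 1.0]]
obs_model = [[0.8, 0.2],
             [0.3, 0.7]]

b0 = [1.0, 0.0]
b1 = belief_update(b0, obs=1, transition=transition, obs_model=obs_model)
b2 = belief_update(b1, obs=0, transition=transition, obs_model=obs_model)
print(b1, b2)
```

With these placeholder numbers, one high-alert observation raises the intrusion belief from 0 to 0.28, and a subsequent low-alert observation lowers it again.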
  • 25. 17/19 Learning Security Strategies
      • T-SPSA is our reinforcement learning algorithm.
      • T-SPSA outperforms Snort and converges to near-optimal strategies.
      • The performance is slightly better in simulation than in the digital twin, but the performance in the two environments is clearly correlated.
      [Figure: learning curves in three panels: reward per episode, episode length (steps), and P[intrusion stopped]. Curves: T-SPSA simulation, T-SPSA digital twin, $o_t > 0$ baseline, Snort IDPS, and upper bound.]
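The slide does not describe T-SPSA itself. As background, the sketch below shows plain simultaneous perturbation stochastic approximation (SPSA), the general technique the name refers to, applied to a toy quadratic objective. It is not the authors' T-SPSA implementation: the objective, the target parameters, and the gain constants are assumptions for illustration.

```python
import random


def spsa_maximize(objective, theta, iterations, rng,
                  a=0.2, c=0.1, A=10.0, alpha=0.602, gamma=0.101):
    """Maximize `objective` with generic SPSA.

    Each iteration estimates the full gradient from only two objective
    evaluations by perturbing all coordinates at once with random signs.
    """
    theta = list(theta)
    for k in range(iterations):
        a_k = a / (k + 1 + A) ** alpha   # decaying step size
        c_k = c / (k + 1) ** gamma       # decaying perturbation size
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        diff = objective(plus) - objective(minus)
        # Gradient ascent step using the simultaneous-perturbation estimate.
        theta = [t + a_k * diff / (2.0 * c_k * d)
                 for t, d in zip(theta, delta)]
    return theta


# Toy stand-in for the expected reward of a parameterized strategy:
# a concave function with its maximum at the hypothetical point `target`.
target = [0.7, 0.3]


def toy_objective(theta):
    return -sum((t - g) ** 2 for t, g in zip(theta, target))


theta = spsa_maximize(toy_objective, [0.0, 0.0], iterations=2000,
                      rng=random.Random(0))
print([round(t, 3) for t in theta])
```

In a reinforcement learning setting, `objective` would be replaced by a noisy estimate of the expected reward obtained by running episodes with the strategy parameters `theta`.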
  • 26. 18/19 For more details about the theory
      • Finding Effective Security Strategies through Reinforcement Learning and Self-Play [2]
      • Learning Intrusion Prevention Policies through Optimal Stopping [3]
      • A System for Interactive Examination of Learned Security Policies [4]
      • Intrusion Prevention Through Optimal Stopping [5]
      • Learning Security Strategies through Game Play and Optimal Stopping [6]
      • An Online Framework for Adapting Security Policies in Dynamic IT Environments [7]
      • Learning Near-Optimal Intrusion Responses Against Dynamic Attackers [8]
      [2] Kim Hammar and Rolf Stadler. “Finding Effective Security Strategies through Reinforcement Learning and Self-Play”. In: International Conference on Network and Service Management (CNSM 2020). Izmir, Turkey, 2020.
      [3] Kim Hammar and Rolf Stadler. “Learning Intrusion Prevention Policies through Optimal Stopping”. In: International Conference on Network and Service Management (CNSM 2021). Izmir, Turkey, 2021. url: http://dl.ifip.org/db/conf/cnsm/cnsm2021/1570732932.pdf.
      [4] Kim Hammar and Rolf Stadler. “A System for Interactive Examination of Learned Security Policies”. In: NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium. 2022, pp. 1–3. doi: 10.1109/NOMS54207.2022.9789707.
      [5] Kim Hammar and Rolf Stadler. “Intrusion Prevention Through Optimal Stopping”. In: IEEE Transactions on Network and Service Management 19.3 (2022), pp. 2333–2348. doi: 10.1109/TNSM.2022.3176781.
      [6] Kim Hammar and Rolf Stadler. “Learning Security Strategies through Game Play and Optimal Stopping”. In: Proceedings of the ML4Cyber workshop, ICML 2022, Baltimore, USA, July 17-23, 2022. PMLR, 2022.
      [7] Kim Hammar and Rolf Stadler. “An Online Framework for Adapting Security Policies in Dynamic IT Environments”. In: International Conference on Network and Service Management (CNSM 2022). Thessaloniki, Greece, 2022.
      [8] Kim Hammar and Rolf Stadler. Learning Near-Optimal Intrusion Responses Against Dynamic Attackers. 2023. doi: 10.48550/ARXIV.2301.06085. url: https://arxiv.org/abs/2301.06085.
  • 27. 19/19 Conclusions
      • We develop a framework for automated security.
      • Our framework centers around a digital twin.
      • We use the digital twin to optimize security strategies through reinforcement learning, game theory, and control theory.
      • Documentation of our framework: limmen.dev/csle.
      [Figure: overview of the digital twin (virtual network, virtual devices, emulated services, emulated actors) between the IT infrastructure and a Markov decision process model, repeated from slide 12/19.]