Living Tissue over a Metal Endoskeleton: Python and
Machine Learning on the Zeek Platform
Budapest Hackersuli Meetup, May 2024
1
Szili Dávid
No, I still have not bought a Hungarian-layout
keyboard!
Even writing these two slides with accented characters
was torture!
2
Dávid, in Hungarian, please… (Disclaimer 1)
Disclaimer 2
• We are going to discuss
intermediate/advanced topics!
• The assumption is that you are
somewhat familiar with security
monitoring and/or machine
learning concepts.
• Originally, this was a workshop,
NOT a presentation
• Finally, none of this is our own
research! We are just connecting
the dots (see refs later)!
3
ChatGPT-4/DALL-E3 prompt:
This is too scary now, keep the original
image, all the details, but make it look cute,
and transform it into cartoon network art
style, like the style of Dexter's Laboratory, or
Power Puff Girls
ChatGPT-4/DALL-E3 prompt:
Make an epic portrait of a Terminator T-800
with red glowing eyes and a python wrapped
around its neck, where the python is
showing its fangs as it is about to attack.
Make it dynamic, cinematic, highly detailed,
packed with hidden details, style, high
dynamic range, hyper-realistic, realistic,
attention to detail.
Disclaimer 3
4
• Managing partner at Alzette Information Security (@AlzetteInfoSec)
• Network penetration testing, security architectures, security monitoring,
threat hunting, incident response, digital forensics
• Instructor at SANS Institute: FOR572, FOR509
• SANS Lead author: SANS DFIR NetWars
• BSides Luxembourg Organizer: https://bsideslux.lu
• Twitch: https://www.twitch.tv/alzetteinfosec
• YouTube: https://www.youtube.com/@alzetteinfosec
• Twitter (X): @DavidSzili
• E-mail: david.szili@alzetteinfosec.com
5
About David
Agenda for Today
Introduction to Zeek
Machine Learning on Zeek Logs
Snakes! (Anaconda and Python)
Zeek Broker and Machine Learning
6
Introduction to Zeek
Budapest Hackersuli Meetup 2023 December
7
About Zeek
What is Zeek?
• Passive, open-source network
traffic analyzer
• Event/data-driven NIDS/NSM
• Fully customizable and extensible
platform for traffic analysis
• Can run on commodity hardware
(up to 10GbE or even 100GbE links)
• Commercial offerings: Corelight
Why Zeek?
• Network Intrusion Detection
Systems (NIDS)
• Alert data only
• Network Security Monitoring
(NSM)
• Alert data
• Flow (or Session) data
• Transaction data
• Packet data (PCAP)
• Statistical data
• Correlated data
8
Zeek’s History
• 1995 – Initial version by Vern Paxson
• 1996 – Berkeley Lab deployment
• 2003 – National Science Foundation
(NSF) began supporting Bro R&D
• 2010 – National Center for
Supercomputing Applications (NCSA)
joined the team as a core partner
• 2013 – NSF renewed its support
• 2014 – try.bro.org (now try.zeek.org)
• 2016 – Zeek package manager
• 2018 – Changing the name of the
software from Bro to Zeek.
• 2020 – Spicy Parser Generator
9
Zeek’s Cluster Architecture (1)
• Standalone vs. cluster mode
• Network Frontend:
• hardware flow balancers
• on-host flow balancing (PF_RING)
• Manager: central log collector
• Worker: traffic inspection, stream
reassembly, protocol analysis
• Proxy: synchronizing Zeek state
• Logger (optional): receives log
messages from nodes
10
Source: https://docs.zeek.org/en/master/_images/deployment.png
Zeek’s Internal Architecture
• Event Engine: runs
protocol analyzers,
generates network events
• Policy Script Interpreter:
performs action/writes
output
11
Source: https://docs.zeek.org/en/master/_images/architecture.png
• Zeek’s Event Engine:
• Reduces the incoming packet stream into a series of
higher-level events
• Places events into an ordered "event queue"
• Events:
• State change (new_connection, signature_match)
• Protocol specific (http_response, dns_request)
• Data availability (http_entity_data, file_sniff)
• Etc.
12
Zeek Events
Zeek Logs (Just a Few Examples)
Log File          Description
conn.log          TCP/UDP/ICMP connections
dhcp.log          DHCP leases
dns.log           DNS activity
ftp.log           FTP activity
http.log          HTTP requests and replies
rdp.log           RDP
smb_cmd.log       SMB commands
smb_files.log     SMB files
ssh.log           SSH connections
ssl.log           SSL/TLS handshake info
files.log         File analysis results
x509.log          X.509 certificate info
intel.log         Intelligence data matches
notice.log        Zeek notices
signatures.log    Signature matches
known_hosts.log   Hosts seen (TCP handshakes)
software.log      Software seen on the network
weird.log         Unexpected network activity
13
Complete list: https://docs.zeek.org/en/master/script-reference/log-files.html
Details: https://docs.zeek.org/en/master/logs/index.html
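Zeek's default log output is tab-separated text with `#`-prefixed header lines; the `#fields` line names the columns. The following is a minimal sketch of reading such a log in plain Python, not the workshop's own code. The sample records, the reduced column set, and the helper name `read_zeek_tsv` are hypothetical and only there for illustration.

```python
import io

# Hypothetical, shortened conn.log excerpt in Zeek's default TSV layout.
SAMPLE_CONN_LOG = "\n".join([
    "#separator \\x09",
    "#set_separator\t,",
    "#empty_field\t(empty)",
    "#unset_field\t-",
    "#path\tconn",
    "#fields\tts\tuid\tid.orig_h\tid.resp_h\tid.resp_p\tproto\tduration",
    "#types\ttime\tstring\taddr\taddr\tport\tenum\tinterval",
    "1714550400.000000\tCabc1\t10.0.0.5\t192.0.2.10\t443\ttcp\t1.5",
    "1714550460.000000\tCabc2\t10.0.0.5\t192.0.2.10\t443\ttcp\t-",
    "#close\t2024-05-01-12-00-00",
])


def read_zeek_tsv(handle):
    """Yield one dict per record, using the '#fields' header for the keys."""
    fields = []
    unset = "-"
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith("#fields\t"):
            fields = line.split("\t")[1:]
        elif line.startswith("#unset_field\t"):
            unset = line.split("\t")[1]
        elif line and not line.startswith("#"):
            values = line.split("\t")
            # Map Zeek's "unset" marker to None so it is not mistaken for data.
            yield {k: (None if v == unset else v) for k, v in zip(fields, values)}


records = list(read_zeek_tsv(io.StringIO(SAMPLE_CONN_LOG)))
for rec in records:
    print(rec["id.orig_h"], "->", rec["id.resp_h"], rec["id.resp_p"], rec["duration"])
```

All values come back as strings; converting `ts` or `duration` to numbers is left to the caller, which keeps the sketch independent of the `#types` line.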
Framework                        Description
Broker Communication Framework   Exchange information with other Zeek processes
Cluster Framework                The basic premise of Zeek clusterization
Configuration Framework          Allows updating script options dynamically at runtime
File Analysis Framework          Generalized presentation of file-related information
Input Framework                  Allows users to import data into Zeek
Intelligence Framework           Consume data and make it available for matching
Logging Framework                Fine-grained control of what and how is logged
Management Framework             Provides a Zeek-based, service-oriented architecture
NetControl Framework             Flexible, unified interface for active response
14
Zeek Frameworks (1)
Source: https://docs.zeek.org/en/master/frameworks/index.html
Framework                      Description
Notice Framework               Detect potentially interesting situations and take action
Packet Analysis                Handles parsing of packet headers at layers
Summary Statistics Framework   Measuring aspects of network traffic
Signature Framework            Signature language for low-level pattern matching
Telemetry Framework            Can be used to record metrics
TLS Decryption                 Limited support for decrypting TLS connections
15
Zeek Frameworks (2)
Source: https://docs.zeek.org/en/master/frameworks/index.html
• Event-driven
• Domain-specific
• Turing-complete
• Based on ML (LISP-like)
• Basically, all Zeek output is generated by Zeek scripts!
16
Zeek Scripting Overview
Source: https://docs.zeek.org/en/master/script-reference/index.html
Zeek Logs DEMO
Workshop
17
Machine Learning on Zeek Logs
18
• From the Stratosphere Laboratory team:
• StratosphereLinuxIPS (SLIPS):
https://github.com/stratosphereips/StratosphereLinuxIPS
• zeek_anomaly_detector:
https://github.com/stratosphereips/zeek_anomaly_detector
• From the Active Countermeasures team:
• Real Intelligence Threat Analytics (RITA):
https://github.com/activecm/rita/
• AC-Hunter Community Edition:
https://www.activecountermeasures.com/ac-hunter-community-edition/
19
Tools to Parse and Analyze Zeek Logs
Machine Learning on Zeek Logs
DEMO (1)
20
21
Snakes! 
(Anaconda and Python)
Anaconda, Jupyter Lab, Python
• We are going to use the
following tools:
• Anaconda
• Jupyter Lab
• Python (no, not R)
• pandas
• numpy
• matplotlib
• scikit-learn
• tensorflow
22
Source: https://twitter.com/aboutsecurity/status/1094277269751762946
• None of this is our own research! We are just connecting dots!
• Largely based on David Hoelzer’s SANS “SEC595: Applied Data Science and AI/Machine Learning
for Cybersecurity Professionals” Day 2 labs:
• https://www.sans.org/cyber-security-courses/applied-data-science-machine-learning/
• Go check out David Hoelzer’s YouTube channel:
• https://www.youtube.com/@DHAtEnclaveForensics/videos
• Also, check out David Hoelzer’s other presentations on the SANS Cyber Defense YouTube channel:
• https://www.youtube.com/@SANSCyberDefense/search?query=Hoelzer
• We also used Nik Alleyne’s blog post and GitHub repository:
• https://www.securitynik.com/2023/10/beginning-fourier-transform-detecting.html
• https://showmethepackets.com/index.php/2023/10/09/beginning-fourier-transform-detecting-beaconing-in-our-networks/
• https://github.com/SecurityNik/Data-Science-and-ML/blob/main/Beginning%20Fourrier%20Transform%20for%20Beacon%20Detection%20-%20Blog.ipynb
23
Disclaimer / Credits
(Discrete) Fourier Transform and (R)FFT
24
Sources: https://www.thefouriertransform.com/
https://en.wikipedia.org/wiki/Discrete_Fourier_transform and https://en.wikipedia.org/wiki/Fast_Fourier_transform
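To make the link between the Fourier transform and beacon detection concrete, here is a small sketch that uses the textbook DFT definition in plain Python. It is an illustration under stated assumptions, not the code from the referenced blog post or labs: the per-second connection counts are made up, and `dft_magnitudes`, `dominant_period`, and the 0.5 threshold are hypothetical choices. In a notebook, `numpy.fft.rfft` computes the same non-negative-frequency bins much faster.

```python
import cmath


def dft_magnitudes(signal):
    """Return |X[k]| for k = 0 .. N//2 using the textbook DFT definition."""
    n = len(signal)
    mags = []
    for k in range(n // 2 + 1):
        total = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signal))
        mags.append(abs(total))
    return mags


def dominant_period(signal, threshold=0.5):
    """Estimate the period (in samples) of the strongest periodic component.

    A regular pulse train puts comparable energy into its fundamental
    frequency and its harmonics, so we take the lowest bin that reaches
    `threshold` times the peak instead of the peak bin itself.
    """
    mags = dft_magnitudes(signal)
    peak = max(mags[1:])  # skip k = 0, the DC (average) component
    for k in range(1, len(mags)):
        if mags[k] >= threshold * peak:
            return len(signal) / k
    return None


# Hypothetical data: connections per second over two minutes. A "beacon"
# fires every 10 seconds; three unrelated connections act as background noise.
WINDOW = 120
counts = [0] * WINDOW
for second in range(0, WINDOW, 10):
    counts[second] += 1
for second in (3, 47, 88):
    counts[second] += 1

period = dominant_period(counts)
print(f"Estimated beacon interval: {period:.1f} s")
```

Real traffic has jitter and far more noise than this toy series, so the threshold and window length would need tuning against actual conn.log data.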
Machine Learning on Zeek Logs
DEMO (2)
25
Zeek Broker and
Machine Learning
26
• Broker is a library for type-rich publish/subscribe
communication
• It is the successor of Broccoli
• It enables arbitrary applications to communicate in
Zeek’s data model
• Broker also offers distributed key-value stores to
facilitate unified data management and persistence
27
Zeek Broker
Broker Overview
• Endpoints: data senders and receivers.
• Peering: to publish or receive
messages an endpoint needs to peer
with other endpoints.
• Messages: information to send
• Topics: filters for a publish or
subscribe communication pattern.
• Subscriptions: peers only receive
messages which match one of their
subscriptions.
• Data stores: distributed key-value
stores.
28
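The concepts above (endpoints, peering, topics, subscriptions) can be sketched with a tiny in-memory model. This is deliberately NOT the Broker API: `ToyEndpoint` and its methods are hypothetical names used only to show how a prefix-based topic subscription filters what a peer receives. The real Python bindings are described in the Broker documentation.

```python
class ToyEndpoint:
    """In-memory stand-in for a publish/subscribe endpoint (not Broker itself)."""

    def __init__(self, name):
        self.name = name
        self.peers = []          # endpoints this endpoint can deliver to
        self.subscriptions = []  # topic prefixes this endpoint wants
        self.inbox = []          # (topic, message) pairs received so far

    def peer(self, other):
        # Peering is modeled as symmetric: both sides can reach each other.
        self.peers.append(other)
        other.peers.append(self)

    def subscribe(self, topic_prefix):
        self.subscriptions.append(topic_prefix)

    def publish(self, topic, message):
        # Deliver only to peers holding a subscription that prefixes the topic.
        for p in self.peers:
            if any(topic.startswith(prefix) for prefix in p.subscriptions):
                p.inbox.append((topic, message))


zeek = ToyEndpoint("zeek")
model = ToyEndpoint("ml-model")
zeek.peer(model)
model.subscribe("/demo/conn")

zeek.publish("/demo/conn/new", {"orig_h": "10.0.0.5", "resp_p": 443})
zeek.publish("/demo/dns", {"query": "example.com"})  # no matching subscription
print(model.inbox)
```

Only the first message reaches the subscriber, because `/demo/dns` does not start with the subscribed prefix `/demo/conn`.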
• Almost all functionality of Broker is also accessible
through Python bindings
• The Python API mostly mimics the C++ interface
• Installation:
• Virtual Environment (compiled Broker from source)
• Binary package: $(PREFIX)/zeek/lib/zeek/python/
• Your Broker version must match your Zeek version!
29
Broker’s Python Bindings
import sys
sys.path.append('/opt/zeek/lib/zeek/python/')
• None of this is our own research! We are just connecting dots!
• Largely based on David Hoelzer’s livestream recordings, presentations, and his SANS “SEC595:
Applied Data Science and AI/Machine Learning for Cybersecurity Professionals” Day 3 and Day 5
labs:
• https://www.sans.org/cyber-security-courses/applied-data-science-machine-learning/
• Go check out David Hoelzer’s YouTube channel:
• https://www.youtube.com/@DHAtEnclaveForensics/videos
• We used code from these videos:
• Machine Learning with Zeek and Tensorflow Part 1:
https://www.youtube.com/watch?v=5w25kEMLdQk
• Machine Learning with Zeek and Tensorflow - Part 2: Processing the Data:
https://www.youtube.com/watch?v=8qJFnX214yE
• Threat Hunting with Data Science, Machine Learning, and Artificial Intelligence:
https://www.youtube.com/watch?v=fdqFdnkf9I4
• Also, check out David Hoelzer’s other presentations on the SANS Cyber Defense YouTube channel:
• https://www.youtube.com/@SANSCyberDefense/search?query=Hoelzer
30
Disclaimer / Credits (1)
• We used the code from the Zeek Broker documentation:
• https://docs.zeek.org/projects/broker/en/master/python.html
• We also borrowed a little from Dr. Keith Jones’ “How To Easily Connect Zeek to
Python” YouTube video, blog, and GitHub repository:
• https://www.youtube.com/watch?v=iIYi17VqFkY
• https://drkeithjones.com/index.php/2023/03/11/how-to-connect-zeek-to-python/
• https://github.com/keithjjones/zeek-python-broker-demo
• Go check out Dr. Keith Jones’ YouTube channel:
• https://www.youtube.com/@dr.keithjones
31
Disclaimer / Credits (2)
Zeek Broker and Python
DEMO (1)
32
Representing Files as Images
33
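The slide itself only carries a title, so the following is an assumption about the usual technique rather than the workshop's code: each byte of a file is treated as one grayscale pixel (0 to 255) and the bytes are laid out row by row at a fixed width. The function name `bytes_to_image`, the width of 16, and the sample bytes are all hypothetical.

```python
import math


def bytes_to_image(data, width=16):
    """Arrange bytes row by row into a 2-D grid of grayscale values (0-255)."""
    height = math.ceil(len(data) / width)
    # Zero-pad the final row so every row has exactly `width` pixels.
    padded = data + bytes(height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


sample = b"MZ" + bytes(range(40))  # hypothetical 42-byte "file"
image = bytes_to_image(sample, width=16)
print(len(image), "rows x", len(image[0]), "columns")
```

A grid like this can be handed to an image classifier such as a convolutional neural network; for real files the width and any resizing step are design choices to experiment with.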
Convolutional Neural Networks
34
Source: https://saturncloud.io/blog/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way/
Decision Tree Classifier
• Non-parametric supervised
learning method
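• Gini impurity: the probability that a
randomly chosen sample in a node would
be mislabeled if it were labeled at random
according to the node's class distribution
(0.0 means the node is pure)
• The algorithm used to build
the decision tree is
attempting to maximize the
information gain (the reduction
in impurity) at every
decision node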
[Figure: decision tree fitted to the Iris dataset (150 samples; classes setosa, versicolor, virginica). Root split: petal length (cm) ≤ 2.45, gini = 0.6667, value = [50, 50, 50].]
35
Source: https://scikit-learn.org/stable/modules/tree.html
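The `gini` values in the Iris tree can be reproduced by hand. The sketch below computes Gini impurity as 1 minus the sum of squared class proportions, and the impurity reduction of the root split; the helper names `gini` and `impurity_decrease` are hypothetical, and the class counts are the ones shown on the slide.

```python
def gini(class_counts):
    """Gini impurity: 1 - sum(p_i^2) over the class proportions in a node."""
    total = sum(class_counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in class_counts)


def impurity_decrease(parent, children):
    """Parent impurity minus the sample-weighted impurity of the child nodes."""
    n = sum(parent)
    weighted = sum(sum(child) / n * gini(child) for child in children)
    return gini(parent) - weighted


root = [50, 50, 50]                     # value at the root node
left, right = [50, 0, 0], [0, 50, 50]   # children of "petal length <= 2.45"

print(round(gini(root), 4), round(gini(right), 4))
print(round(impurity_decrease(root, [left, right]), 4))
```

The tree builder evaluates candidate splits and keeps the one with the largest such decrease, which is why the first split isolates the pure setosa node.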
• Decision Trees have an issue with outliers in the data.
• Let’s build a forest! Each tree makes a determination, classifies
that sample, and gets one vote. Whichever category has the
greatest number of votes is the final classification.
36
Random Forest Classifier
Source: https://stackoverflow.com/questions/40155128/plot-trees-for-a-random-forest-in-python-with-scikit-learn
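The voting step described above can be sketched in a few lines. This is a toy illustration with made-up per-tree predictions and a hypothetical helper `forest_vote`, not a random forest implementation. Note that scikit-learn's `RandomForestClassifier` combines trees by averaging their predicted class probabilities rather than counting hard votes.

```python
from collections import Counter


def forest_vote(tree_predictions):
    """Return the class with the most votes and the per-class tally."""
    tally = Counter(tree_predictions)
    winner, _ = tally.most_common(1)[0]
    return winner, dict(tally)


# Hypothetical predictions from five trees for a single sample.
votes = ["malicious", "benign", "malicious", "malicious", "benign"]
label, tally = forest_vote(votes)
print(label, tally)
```

Because each tree sees a different bootstrap sample and feature subset, a single tree misled by an outlier is usually outvoted by the rest.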
Training ML Models
DEMO
37
Zeek Broker and Python
DEMO (2)
38
Conclusion
39
• Google Drive:
https://drive.google.com/drive/folders/14orYiq8X0MSz42X6kii6LaSHJB_8Bfj4?usp=sharing
• Shortened:
https://bit.ly/3JVy64F
E-mail: david.szili@alzetteinfosec.com
40
Slides and Contact
• Google Drive:
https://drive.google.com/drive/folders/1bMbuLpS9GVE_3d3bHhYeRHdIvFPNiaRw?usp=sharing
• Shortened:
https://bit.ly/3TIZtDk
E-mail: eva.szilagyi@alzetteinfosec.com
david.szili@alzetteinfosec.com
41
Workshop VM and Slides
• Zeek Documentation
• https://www.zeek.org/documentation/index.html
• Zeek Broker Documentation
• https://docs.zeek.org/projects/broker
• Install Zeek
• https://docs.zeek.org/en/master/install.html
• Zeek on DockerHub
• https://hub.docker.com/u/zeek
• Try Zeek Online
• http://try.zeek.org
42
References
Questions?
43

More Related Content

PDF
Zephyr-Overview-20230124.pdf
PDF
BRKSEC-3144.pdf
PPTX
SUGCON EU 2023 - Secure Composable SaaS.pptx
PPTX
FIWARE Wednesday Webinars - How to Debug IoT Agents
PDF
Kernel Recipes 2015: Kernel packet capture technologies
PPTX
What's New in Grizzly & Deploying OpenStack with Puppet
PDF
.NET Cloud-Native Bootcamp Minneapolis
PPTX
Database Firewall from Scratch
Zephyr-Overview-20230124.pdf
BRKSEC-3144.pdf
SUGCON EU 2023 - Secure Composable SaaS.pptx
FIWARE Wednesday Webinars - How to Debug IoT Agents
Kernel Recipes 2015: Kernel packet capture technologies
What's New in Grizzly & Deploying OpenStack with Puppet
.NET Cloud-Native Bootcamp Minneapolis
Database Firewall from Scratch

Similar to [Hackersuli] Élő szövet a fémvázon: Python és gépi tanulás a Zeek platformon (20)

PDF
Zephyr Introduction - Nordic Webinar - Sept. 24.pdf
PPTX
Blackhat USA 2016 - What's the DFIRence for ICS?
ODP
OWASP WTE - Now in the Cloud!
PDF
Enterprise guide to building a Data Mesh
PDF
Как разработать DBFW с нуля
PPTX
SANS_PentestHackfest_2022-PurpleTeam_Cloud_Identity.pptx
PDF
Databricks Meetup @ Los Angeles Apache Spark User Group
PPTX
The Enterprise Guide to Building a Data Mesh - Introducing SpecMesh
PDF
Zephyr: Creating a Best-of-Breed, Secure RTOS for IoT
PPTX
4055-841_Project_ShailendraSadh
PPTX
OpenTelemetry 101 FTW
PDF
Soc analyst course content
PDF
Soc analyst course content v3
PPTX
Feec telecom-nw-softwarization-aug-2015
PDF
How to over-engineer things and have fun? | Oto Brglez, OPALAB
PDF
Automating Security Response with Serverless
PDF
Securing the Container Pipeline at Salesforce by Cem Gurkok
PPTX
Scalable Open-Source IoT Solutions on Microsoft Azure
PPTX
Blue Teaming On A Budget
PDF
StarlingX - A Platform for the Distributed Edge | Ildiko Vancsa
Zephyr Introduction - Nordic Webinar - Sept. 24.pdf
Blackhat USA 2016 - What's the DFIRence for ICS?
OWASP WTE - Now in the Cloud!
Enterprise guide to building a Data Mesh
Как разработать DBFW с нуля
SANS_PentestHackfest_2022-PurpleTeam_Cloud_Identity.pptx
Databricks Meetup @ Los Angeles Apache Spark User Group
The Enterprise Guide to Building a Data Mesh - Introducing SpecMesh
Zephyr: Creating a Best-of-Breed, Secure RTOS for IoT
4055-841_Project_ShailendraSadh
OpenTelemetry 101 FTW
Soc analyst course content
Soc analyst course content v3
Feec telecom-nw-softwarization-aug-2015
How to over-engineer things and have fun? | Oto Brglez, OPALAB
Automating Security Response with Serverless
Securing the Container Pipeline at Salesforce by Cem Gurkok
Scalable Open-Source IoT Solutions on Microsoft Azure
Blue Teaming On A Budget
StarlingX - A Platform for the Distributed Edge | Ildiko Vancsa

More from hackersuli (20)

PDF
[HUN][Hackersuli] Lila köpeny, fekete kalap, fehér kesztyű – avagy threat hun...
PPTX
HUN Hackersuli 2025 Jatekok megmokolasa csalo motorral
PDF
[HUN][Hackersuli]Android intentek - ne hagyd magad intentekkel tamadni
PDF
[HUN][Hackersuli] Haunted by bugs on a cybersecurity side-quest
PDF
[HUN]2025_HackerSuli_Meetup_Mesek_a_kript(ografi)abol.pdf
PPTX
[HUN] Unity alapú mobil játékok hekkelése
PPTX
Hackersuli_2024_LLM_prompt_injection.pptx
PPTX
[HUN][Hackersuli] Abusing Active Directory Certificate Services
PDF
ITBN - LLM prompt injection with Hackersuli
PPTX
[HUN][hackersuli] Red Teaming alapok 2024
PDF
2024_hackersuli_mobil_ios_android ______
PDF
[HUN[]Hackersuli] Hornyai Alex - Elliptikus görbék kriptográfiája
PPTX
[Hackersuli]Privacy on the blockchain
PPTX
[HUN] 2023_Hacker_Suli_Meetup_Cloud_DFIR_Alapok.pptx
PPTX
[Hackersuli][HUN] GSM halozatok hackelese
PDF
Hackersuli Minecraft hackeles kezdoknek
PDF
HUN Hackersuli - How to hack an airplane
PDF
[HUN][Hackersuli] Cryptocurrency scams
PPTX
[Hackersuli] [HUN] Windows a szereloaknan
PDF
[HUN][Hackersuli] Szol a szoftveresen definialt radio - SDR alapok
[HUN][Hackersuli] Lila köpeny, fekete kalap, fehér kesztyű – avagy threat hun...
HUN Hackersuli 2025 Jatekok megmokolasa csalo motorral
[HUN][Hackersuli]Android intentek - ne hagyd magad intentekkel tamadni
[HUN][Hackersuli] Haunted by bugs on a cybersecurity side-quest
[HUN]2025_HackerSuli_Meetup_Mesek_a_kript(ografi)abol.pdf
[HUN] Unity alapú mobil játékok hekkelése
Hackersuli_2024_LLM_prompt_injection.pptx
[HUN][Hackersuli] Abusing Active Directory Certificate Services
ITBN - LLM prompt injection with Hackersuli
[HUN][hackersuli] Red Teaming alapok 2024
2024_hackersuli_mobil_ios_android ______
[HUN[]Hackersuli] Hornyai Alex - Elliptikus görbék kriptográfiája
[Hackersuli]Privacy on the blockchain
[HUN] 2023_Hacker_Suli_Meetup_Cloud_DFIR_Alapok.pptx
[Hackersuli][HUN] GSM halozatok hackelese
Hackersuli Minecraft hackeles kezdoknek
HUN Hackersuli - How to hack an airplane
[HUN][Hackersuli] Cryptocurrency scams
[Hackersuli] [HUN] Windows a szereloaknan
[HUN][Hackersuli] Szol a szoftveresen definialt radio - SDR alapok

Recently uploaded (20)

PPT
250152213-Excitation-SystemWERRT (1).ppt
DOC
Rose毕业证学历认证,利物浦约翰摩尔斯大学毕业证国外本科毕业证
PPT
FIRE PREVENTION AND CONTROL PLAN- LUS.FM.MQ.OM.UTM.PLN.00014.ppt
PDF
mera desh ae watn.(a source of motivation and patriotism to the youth of the ...
PDF
📍 LABUAN4D EXCLUSIVE SERVER STAR GAMING ASIA NO.1 TERPOPULER DI INDONESIA ! 🌟
PPT
Design_with_Watersergyerge45hrbgre4top (1).ppt
PDF
simpleintnettestmetiaerl for the simple testint
PDF
FINAL CALL-6th International Conference on Networks & IOT (NeTIOT 2025)
PDF
Uptota Investor Deck - Where Africa Meets Blockchain
PPTX
Introduction to cybersecurity and digital nettiquette
PDF
Introduction to the IoT system, how the IoT system works
PPT
Ethics in Information System - Management Information System
PPTX
artificial intelligence overview of it and more
PDF
Session 1 (Week 1)fghjmgfdsfgthyjkhfdsadfghjkhgfdsa
PPTX
SAP Ariba Sourcing PPT for learning material
PPTX
newyork.pptxirantrafgshenepalchinachinane
PDF
The New Creative Director: How AI Tools for Social Media Content Creation Are...
PPTX
t_and_OpenAI_Combined_two_pressentations
PPTX
Power Point - Lesson 3_2.pptx grad school presentation
PPTX
E -tech empowerment technologies PowerPoint
250152213-Excitation-SystemWERRT (1).ppt
Rose毕业证学历认证,利物浦约翰摩尔斯大学毕业证国外本科毕业证
FIRE PREVENTION AND CONTROL PLAN- LUS.FM.MQ.OM.UTM.PLN.00014.ppt
mera desh ae watn.(a source of motivation and patriotism to the youth of the ...
📍 LABUAN4D EXCLUSIVE SERVER STAR GAMING ASIA NO.1 TERPOPULER DI INDONESIA ! 🌟
Design_with_Watersergyerge45hrbgre4top (1).ppt
simpleintnettestmetiaerl for the simple testint
FINAL CALL-6th International Conference on Networks & IOT (NeTIOT 2025)
Uptota Investor Deck - Where Africa Meets Blockchain
Introduction to cybersecurity and digital nettiquette
Introduction to the IoT system, how the IoT system works
Ethics in Information System - Management Information System
artificial intelligence overview of it and more
Session 1 (Week 1)fghjmgfdsfgthyjkhfdsadfghjkhgfdsa
SAP Ariba Sourcing PPT for learning material
newyork.pptxirantrafgshenepalchinachinane
The New Creative Director: How AI Tools for Social Media Content Creation Are...
t_and_OpenAI_Combined_two_pressentations
Power Point - Lesson 3_2.pptx grad school presentation
E -tech empowerment technologies PowerPoint

[Hackersuli] Élő szövet a fémvázon: Python és gépi tanulás a Zeek platformon

  • 1. Élő szövet a fémvázon: Python és gépi tanulás a Zeek platformon Budapest Hackersuli Meetup 2024 Május 1 Szili Dávid
  • 2. Nem, még mindig nem vettem magyar kiosztású billentyűzetet! Ezt a két slide-ot is kínszenvedés volt ékezetekkel megírni! 2 Dávid, magyarul lécci… (Diszklémer 1)
  • 3. Diszklémer 2 • We are going to discuss intermediate/advanced topics! • The assumption is that you are somewhat familiar with security monitoring and/or machine learning concepts. • Originally, this was a workshop, NOT a presentation • Finally; none of this is our own research! We are just connecting dots (see refs later)! 3
  • 4. ChatGPT-4/DALL-E3 prompt: This is too scary now, keep the original image, all the details, but make it look cute, and transform it into cartoon network art style, like the style of Dexter's Laboratory, or Power Puff Girls ChatGPT-4/DALL-E3 prompt: Make an epic portrait of a Terminator T-800 with red glowing eyes and a python wrapped around its neck, where the python is showing its fangs as it is about to attack. Make it dynamic, cinematic, highly detailed, packed with hidden details, style, high dynamic range, hyper-realistic, realistic, attention to detail. Diszklémer 3 4
  • 5. • Managing partner at Alzette Information Security (@AlzetteInfoSec) • Network penetration testing, security architectures, security monitoring, threat hunting, incident response, digital forensics • Instructor at SANS Institute: FOR572, FOR509 • SANS Lead author: SANS DFIR NetWars • BSides Luxembourg Organizer: https://bsideslux.lu • Twitch: https://www.twitch.tv/alzetteinfosec • YouTube: https://www.youtube.com/@alzetteinfosec • Twitter (X): @DavidSzili • E-mail: david.szili@alzetteinfosec.com 5 About David
  • 6. Agenda for Today Introduction to Zeek Machine Learning on Zeek Logs Snakes!  (Anaconda and Python) Zeek Broker and Machine Learning 6
  • 7. Introduction to Zeek Budapest Hackersuli Meetup 2023 December 7
  • 8. About Zeek What is Zeek? • Passive, open-source network traffic analyzer • Event/data-driven NIDS/NSM • Fully customizable and extensible platform for traffic analysis • Can run on commodity hardware (up to 10GbE or even 100GbE links) • Commercial offerings: Corelight Why Zeek? • Network Intrusion Detection Systems (NIDS) • Alert data only • Network Security Monitoring (NSM) • Alert data • Flow (or Session) data • Transaction data • Packet data (PCAP) • Statistical data • Correlated data 8
  • 9. Zeek’s History • 1995 – Initial version by Vern Paxson • 1996 – Berkeley Lab deployment • 2003 – National Science Foundation (NSF) began supporting Bro R&D • 2010 – National Center for Supercomputing Applications (NCSA) joined the team as a core partner • 2013 – NSF renewed its support • 2014 – try.bro.org (now try.zeek.org) • 2016 – Zeek package manager • 2018 – Changing the name of the software from Bro to Zeek. • 2020 – Spicy Parser Generator 9
  • 10. Zeek’s Cluster Architecture (1) • Standalone vs. cluster mode • Network Frontend: • hardware flow balancers • on-host flow balancing (PF_RING) • Manager: central log collector • Worker: traffic inspection, stream reassembly, protocol analysis • Proxy: synchronizing Zeek state • Logger (optional): receives log messages from nodes 10 Source: https://docs.zeek.org/en/master/_images/deployment.png
  • 11. Zeek’s Internal Architecture • Event Engine: runs protocol analyzers, generates network events • Policy Script Interpreter: performs action/writes output 11 Source: https://docs.zeek.org/en/master/_images/architecture.png
  • 12. • Zeek’s Event Engine: • Reduces the incoming packet stream into a series of higher-level events • Places events into an ordered "event queue“ • Events: • State change (new_connection, signature_match) • Protocol specific (http_response, dns_request) • Data availability (http_entity_data, file_sniff) • Etc. 12 Zeek Events
  • 13. Zeek Logs (Just a Few Examples) Log File Description conn.log TCP/UDP/ICMP connections dhcp.log DHCP leases dns.log DNS activity ftp.log FTP activity http.log HTTP requests and replies rdp.log RDP smb_cmd.log SMB commands smb_files.log SMB files ssh.log SSH connections Log File Description ssl.log SSL/TLS handshake info files.log File analysis results x509.log X.509 certificate info intel.log Intelligence data matches notice.log Zeek notices signatures.log Signature matches known_hosts.log Hosts seen (TCP handshakes) software.log Software seen on the network weird.log Unexpected network activity 13 Complete list: https://docs.zeek.org/en/master/script-reference/log-files.html Details: https://docs.zeek.org/en/master/logs/index.html
  • 14. Framework Description Broker Communication Framework Exchange information with other Zeek processes Cluster Framework The basic premise of Zeek clusterization Configuration Framework Allows updating script options dynamically at runtime File Analysis Framework Generalized presentation of file-related information Input Framework Allows users to import data into Zeek Intelligence Framework Consume data and make it available for matching Logging Framework Fine-grained control of what and how is logged Management Framework Provides a Zeek-based, service-oriented architecture NetControl Framework Flexible, unified interface for active response 14 Zeek Frameworks (1) Source: https://docs.zeek.org/en/master/frameworks/index.html
  • 15. Framework Description NetControl Framework Flexible, unified interface for active response Notice Framework Detect potentially interesting situations and take action Packet Analysis Handles parsing of packet headers at layers Summary Statistics Framework Measuring aspects of network traffic Signature Framework Signature language for low-level pattern matching Telemetry Framework Can be used to record metrics TLS Decryption Limited support for decrypting TLS connections 15 Zeek Frameworks (2) Source: https://docs.zeek.org/en/master/frameworks/index.html
  • 16. • Event-driven • Domain-specific • Turing-complete • Based on ML (LISP-like) • Basically, all Zeek output is generated by Zeek scripts! 16 Zeek Scripting Overview Source: https://docs.zeek.org/en/master/script-reference/index.html
  • 18. Machine Learning on Zeek Logs Budapest Hackersuli Meetup 2023 December 18
  • 19. • From the Stratosphere Laboratory team: • StratosphereLinuxIPS (SLIPS): https://github.com/stratosphereips/StratosphereLinuxIPS • zeek_anomaly_detector: https://github.com/stratosphereips/zeek_anomaly_detector • From the Active Countermeasures team: • Real Intelligence Threat Analytics (RITA): https://github.com/activecm/rita/ • AC-Hunter Community Edition: https://www.activecountermeasures.com/ac-hunter-community- edition/ 19 Tools to Parse and Analyze Zeek Logs
  • 20. Machine Learning on Zeek Logs DEMO (1) Budapest Hackersuli Meetup 2023 December 20
  • 21. Budapest Hackersuli Meetup 2023 December 21 Snakes!  (Anaconda and Python)
  • 22. Anaconda, Jupyter Labs, Python • We are going to use the following tools: • Anaconda • Jupyter Lab • Python (no, not R ) • pandas • numpy • mathplotlib • scikit-learn • tensorflow 22 Source: https://twitter.com/aboutsecurity/status/1094277269751762946
  • 23. • None of this is our own research! We are just connecting dots! • Largely based on David Hoelzer’s SANS “SEC595: Applied Data Science and AI/Machine Learning for Cybersecurity Professionals” Day 2 labs: • https://www.sans.org/cyber-security-courses/applied-data-science-machine-learning/ • Go check out David Hoelzer’s YouTube channel: • https://www.youtube.com/@DHAtEnclaveForensics/videos • Also, check out David Hoelzer’s other presentations on the SANS Cyber Defense YouTube channel: • https://www.youtube.com/@SANSCyberDefense/search?query=Hoelzer • We also used Nik Alleyne’s blog post and GitHub repository: • https://www.securitynik.com/2023/10/beginning-fourier-transform-detecting.html • https://showmethepackets.com/index.php/2023/10/09/beginning-fourier-transform-detecting-beaconing- in-our-networks/ • https://github.com/SecurityNik/Data-Science-and- ML/blob/main/Beginning%20Fourrier%20Transform%20for%20Beacon%20Detection%20-%20Blog.ipynb 23 Disclaimer / Credits
  • 24. (Discrete) Fourier Transform and (R)FFT 24 Sources: https://www.thefouriertransform.com/ https://en.wikipedia.org/wiki/Discrete_Fourier_transform and https://en.wikipedia.org/wiki/Fast_Fourier_transform
  • 25. Machine Learning on Zeek Logs DEMO (2) Budapest Hackersuli Meetup 2023 December 25
  • 26. Zeek Broker and Machine Learning Budapest Hackersuli Meetup 2023 December 26
  • 27. • Broker is a library for type-rich publish/subscribe communication • It is the successor of Broccoli • It enables arbitrary applications to communicate in Zeek’s data model • Broker also offers distributed key-value stores to facilitate unified data management and persistence 27 Zeek Broker
  • 28. Broker Overview • Endpoints: data senders and receivers. • Peering: to publish or receive messages an endpoint needs to peer with other endpoints. • Messages: information to send • Topics: filters for a publish or subscribe communication pattern. • Subscriptions: peers only receive messages which match one of their subscriptions. • Data stores: distributed key-value stores. 28
  • 29. • Almost all functionality of Broker is also accessible through Python bindings • The Python API mostly mimics the C++ interface • Installation: • Virtual Environment (compiled Broker from source) • Binary package: $(PREFIX)/zeek/lib/zeek/python/ • Your Broker version must match your Zeek version! 29 Broker’s Python Bindings import sys sys.path.append('/opt/zeek/lib/zeek/python/')
  • 30. Disclaimer / Credits (1)
    • None of this is our own research! We are just connecting dots!
    • Largely based on David Hoelzer’s livestream recordings, presentations, and his SANS “SEC595: Applied Data Science and AI/Machine Learning for Cybersecurity Professionals” Day 3 and Day 5 labs:
      • https://www.sans.org/cyber-security-courses/applied-data-science-machine-learning/
    • Go check out David Hoelzer’s YouTube channel:
      • https://www.youtube.com/@DHAtEnclaveForensics/videos
    • We used code from these videos:
      • Machine Learning with Zeek and Tensorflow Part 1: https://www.youtube.com/watch?v=5w25kEMLdQk
      • Machine Learning with Zeek and Tensorflow - Part 2: Processing the Data: https://www.youtube.com/watch?v=8qJFnX214yE
      • Threat Hunting with Data Science, Machine Learning, and Artificial Intelligence: https://www.youtube.com/watch?v=fdqFdnkf9I4
    • Also, check out David Hoelzer’s other presentations on the SANS Cyber Defense YouTube channel:
      • https://www.youtube.com/@SANSCyberDefense/search?query=Hoelzer
  • 31. Disclaimer / Credits (2)
    • We used the code from the Zeek Broker documentation:
      • https://docs.zeek.org/projects/broker/en/master/python.html
    • We also borrowed a little from Dr. Keith Jones’ “How To Easily Connect Zeek to Python” YouTube video, blog post, and GitHub repository:
      • https://www.youtube.com/watch?v=iIYi17VqFkY
      • https://drkeithjones.com/index.php/2023/03/11/how-to-connect-zeek-to-python/
      • https://github.com/keithjjones/zeek-python-broker-demo
    • Go check out Dr. Keith Jones’ YouTube channel:
      • https://www.youtube.com/@dr.keithjones
  • 32. Zeek Broker and Python DEMO (1)
  • 34. Convolutional Neural Networks
    Source: https://saturncloud.io/blog/a-comprehensive-guide-to-convolutional-neural-networks-the-eli5-way/
  • 35. Decision Tree Classifier
    • Non-parametric supervised learning method
    • Gini impurity (the “gini” value in each node): measures how mixed the classes in a node are; 0.0 means every sample in the node belongs to a single class
    • The algorithm that builds the decision tree chooses, at every decision node, the split that maximizes the information gain (the largest drop in impurity)
    [Figure: scikit-learn decision tree for the Iris dataset (150 samples, 50 each of setosa, versicolor, virginica). The root splits on petal length (cm) ≤ 2.45 (gini = 0.6667, value = [50, 50, 50]). The True branch is a pure setosa leaf (gini = 0.0, value = [50, 0, 0]); the False branch splits on petal width (cm) ≤ 1.75 (gini = 0.5, value = [0, 50, 50]) into nodes with value = [0, 49, 5] (gini = 0.168) and value = [0, 1, 45] (gini = 0.0425). Deeper nodes split on petal length, petal width, and sepal length until every leaf is pure (gini = 0.0).]
    Source: https://scikit-learn.org/stable/modules/tree.html
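The “gini” values in the tree can be reproduced by hand. The sketch below assumes the standard Gini impurity formula, 1 − Σ pᵢ², applied to the per-class sample counts shown in the nodes of the Iris example (for instance value = [50, 50, 50] at the root and value = [0, 49, 5] further down). `gini_impurity` and `split_impurity` are hypothetical helper names, not scikit-learn functions.

```python
def gini_impurity(class_counts):
    """Gini impurity of a node: 1 - sum(p_i^2) over the class proportions."""
    total = sum(class_counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in class_counts)

def split_impurity(left_counts, right_counts):
    """Sample-weighted Gini impurity of the two children of a split."""
    n_left, n_right = sum(left_counts), sum(right_counts)
    weighted = (n_left * gini_impurity(left_counts)
                + n_right * gini_impurity(right_counts))
    return weighted / (n_left + n_right)

root = [50, 50, 50]                     # value at the root node
left, right = [50, 0, 0], [0, 50, 50]   # children of "petal length <= 2.45"

print(round(gini_impurity(root), 4))        # 0.6667
print(round(gini_impurity(right), 4))       # 0.5
print(round(gini_impurity([0, 49, 5]), 3))  # 0.168

# Impurity removed by the first split; the tree builder picks the split
# that makes this drop as large as possible.
gain = gini_impurity(root) - split_impurity(left, right)
print(round(gain, 4))                       # 0.3333
```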
  • 36. Random Forest Classifier
    • A single decision tree is sensitive to outliers in the data.
    • Let’s build a forest! Each tree classifies the sample and gets one vote. Whichever category receives the most votes is the final classification.
    Source: https://stackoverflow.com/questions/40155128/plot-trees-for-a-random-forest-in-python-with-scikit-learn
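The voting step can be sketched in a few lines. This is an illustration only: the three “trees” are hand-written rules over made-up connection features (`interval_stddev`, `conn_count`, `avg_bytes`), whereas a real random forest trains each tree on a random sample of the data and features. `forest_predict` is a hypothetical helper name.

```python
from collections import Counter

def forest_predict(trees, sample):
    """Let every tree vote on the sample and return the majority label and the votes."""
    votes = [tree(sample) for tree in trees]
    label, _count = Counter(votes).most_common(1)[0]
    return label, votes

# Hypothetical one-rule "trees"; thresholds are invented for the example.
trees = [
    lambda s: "beacon" if s["interval_stddev"] < 1.0 else "benign",
    lambda s: "beacon" if s["conn_count"] > 100 else "benign",
    lambda s: "beacon" if s["avg_bytes"] < 500 else "benign",
]

sample = {"interval_stddev": 0.2, "conn_count": 40, "avg_bytes": 300}
print(forest_predict(trees, sample))  # ('beacon', ['beacon', 'benign', 'beacon'])
```

One tree disagrees, but it is outvoted, which is how the forest dampens the effect of a single tree that latched onto an outlier.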
  • 37. Training ML Models DEMO
  • 38. Zeek Broker and Python DEMO (2)
  • 40. Slides and Contact
    • Google Drive: https://drive.google.com/drive/folders/14orYiq8X0MSz42X6kii6LaSHJB_8Bfj4?usp=sharing
    • Shortened: https://bit.ly/3JVy64F
    E-mail: david.szili@alzetteinfosec.com
  • 41. Workshop VM and Slides
    • Google Drive: https://drive.google.com/drive/folders/1bMbuLpS9GVE_3d3bHhYeRHdIvFPNiaRw?usp=sharing
    • Shortened: https://bit.ly/3TIZtDk
    E-mail: eva.szilagyi@alzetteinfosec.com, david.szili@alzetteinfosec.com
  • 42. References
    • Zeek Documentation: https://www.zeek.org/documentation/index.html
    • Zeek Broker Documentation: https://docs.zeek.org/projects/broker
    • Install Zeek: https://docs.zeek.org/en/master/install.html
    • Zeek on DockerHub: https://hub.docker.com/u/zeek
    • Try Zeek Online: http://try.zeek.org