London | 14-15 November 2019
A Kernel of Truth
Matt Carroll
@grimmware
Intrusion Detection and Attestation with eBPF
● Matt Carroll
○ @grimmware
○ github.com/oholiab
● Infrastructure Security Engineer at Yelp
● Ex-SRE (like a sysadmin but with more YAML)
● Hand-wringing Linux botherer
Who am I?
● We built a supplementary* IDS and it’s pretty cool!
● Utilizing OS features as security features
● Told in (roughly) the order it happened.
What is this about?
● How to get a greenfield security project off the ground
○ Treating defensive security like economics
○ Gluing together extant technologies to bootstrap custom security tools
○ Using your business logic to maximize signal vs noise
What is this about?
Yelp’s Mission
Connecting people with great local businesses.
● Built on Mesos + Marathon + Docker
● More recently migrating towards k8s
● Majority of our workloads run here
● What are they all doing???
PaaSTA
Network IDS: Amazon GuardDuty
Kind of unsurprising, also pretty unhelpful...
Welp...
Uuuhhh… 🤔
WHAAAAAAA
😱
● What host class connected?
● What IP/ASN did it connect to?
● What’s on the other end?
● How long was the connection?
● What direction?
● How many bytes were transferred?
● What did the pslogs say?
Attestation From Inference
lol jk
Context is lost as soon as the instantiating process ends
What if we could reduce MTTR for false positives?
● When a GuardDuty alert fires I want to be able to determine quickly whether it’s a false positive
● Only for traffic GuardDuty sees (not internal to our VPCs)
● Only for outbound TCP (i.e. to non-RFC1918 destinations)
● I want the entire calling process tree so I can see full local causality
● Include process ownership information
● Must not require workload tooling
The problem space
eBPF!
● “Berkeley Packet Filter” from BSD
● An in-kernel VM accessed as a device (/dev/bpf)
● Limited number of registers
● No loops (every program must terminate, so a filter cannot hang the kernel)
● Used for packet filtering
BPF
● An in-kernel VM in Linux (and now FreeBSD!)
● It’s “extended”!
● Moar registers than BPF
● Used for hooking syscalls, tracing, proxying sockets, and (you guessed it) in-kernel packet filtering
○ Can actually offload to some NICs!
● In our case, attaching kprobes to the tcp_v4_connect kernel function, which runs on every outbound IPv4 TCP connect
eBPF
Enjoy writing your filters as an array of BPF VM instructions...
bcc + psutil = PROFIT???
✅
How it works
for each syscall...
● In-kernel filters are generated from Jinja2 templates which iterate over the subnets in the YAML configuration
● Events that don’t get filtered out are passed to a userland Python daemon
● psutil is used to crawl the process tree up to init, and the tree is logged alongside other metadata
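The userland half of that flow can be sketched roughly as below. This is a minimal illustration, not pidtree-bcc's actual code: `crawl_process_tree`, `psutil_lookup`, and the dictionary keys are hypothetical names chosen for this example. The lookup is injectable so the walk can be exercised without psutil installed; the default uses psutil's `Process` API.

```python
from typing import Callable, Dict, List, Optional


def psutil_lookup(pid: int) -> Optional[Dict]:
    """Look up one process with psutil (third-party, imported lazily)."""
    import psutil

    try:
        proc = psutil.Process(pid)
        return {
            "pid": pid,
            "ppid": proc.ppid(),
            "cmdline": " ".join(proc.cmdline()),
            "username": proc.username(),
        }
    except (psutil.NoSuchProcess, psutil.AccessDenied):
        # The process may already have exited: context is lost quickly.
        return None


def crawl_process_tree(
    pid: int, lookup: Callable[[int], Optional[Dict]] = psutil_lookup
) -> List[Dict]:
    """Walk from `pid` up through its ancestors, stopping at init (pid 1)."""
    tree: List[Dict] = []
    seen = set()
    while pid and pid not in seen:  # pid 0 means "no parent"
        seen.add(pid)
        info = lookup(pid)
        if info is None:
            break
        tree.append(info)
        pid = info["ppid"]
    return tree
```

Calling `crawl_process_tree(event_pid)` from the event callback gives a list ordered from the connecting process up to init, which can be logged as one JSON document per connection.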
The End.
Except it was a hackathon project so all it did was print events to stdout and could only match classful networks and I developed it on my personal laptop.
The Road To Production
Don’t try to be clever with bitwise network matching
● I realised only the classful networks worked because their masks fall on byte boundaries
● Don’t try to do clever bitwise shifting with the mask length
● Endianness and byte ordering between network and host don’t work how you think they do
● No srs
Matching all CIDRs
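One way to sidestep the byte-order trap, sketched under assumptions rather than taken from pidtree-bcc: precompute the network and the netmask in userland, both laid out exactly as the kernel holds a network-order address, and leave the filter with a single AND and compare. The function names here are hypothetical. Because AND works byte by byte, the comparison holds for any prefix length and on either endianness, with no shifting by the mask length.

```python
import ipaddress
import socket
import struct
from typing import Tuple


def to_kernel_u32(dotted: str) -> int:
    """Return the integer native-endian code reads from a network-order address.

    inet_aton yields the four bytes in network order; unpacking them with the
    host's native byte order mimics a kernel-side u32 load of a __be32 field.
    """
    return struct.unpack("=I", socket.inet_aton(dotted))[0]


def compile_cidr(cidr: str) -> Tuple[int, int]:
    """Precompute (network, mask) in the same representation as the address."""
    net = ipaddress.ip_network(cidr, strict=True)
    return to_kernel_u32(str(net.network_address)), to_kernel_u32(str(net.netmask))


def matches(daddr_u32: int, compiled: Tuple[int, int]) -> bool:
    """The whole in-kernel test reduces to: (daddr & mask) == network."""
    network, mask = compiled
    return (daddr_u32 & mask) == network
```

The two integers from `compile_cidr` are what a template would write into the generated filter as constants, so a non-classful block such as 172.16.0.0/12 needs no special handling.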
● A coworker was trying to figure out which batch jobs were accessing a service for a data auth project
● He asked me if we could match ports
● I said I’d have it:
○ Matching ports
○ Dockerized for ad hoc usage
○ By the next day
● The next day he found all unauthenticated clients.
Dockerizing for debugging
● Contains Python 2.7 and dependencies (sorry)
● Needs some setup at runtime
● Volume mount /etc/passwd for uid mapping
● Not your typical flags:
○ --privileged
○ --cap-add sys_admin
○ --pid host
● Don’t worry I am a professional probably.
pidtree-bcc in Docker
● We run our own PaaS called PaaSTA which uses Docker as its containerizer
● Runs the vast majority of our workloads
● Can pull-from-registry and run in a systemd unit file without further setup
● Don’t have to install dependencies (inc. LLVM, python2)
● Get coverage quickly
● Get coverage quickly
Opportunistic deploy with Docker
● Previous projects with goaudit meant we already had a secure logging pipeline for reading a FIFO and outputting to Amazon Kinesis
○ syslog2kinesis adds other Yelpy metadata (e.g. hostname, environment, Puppet role...)
● Originally fed to our Logstash => Elasticsearch SIEM
● Migrated to Kinesis Firehose => Splunk this quarter <3
Log aggregation
● Better to ask forgiveness than permission...
● Rolled out to two security devboxes and watched the logs roll in!
● Negligible performance impact!!!
○ As postulated, cost of subnet filtering << cost of instantiating a TCP connection
● Lots of connections out to public Amazon IPs creating a lot of noise
Dip Test
If only Amazon maintained some kind of list of their public prefixes...
Surely you can’t load ~200 netblocks into the kernel and compare every non-RFC1918 tcp_v4_connect call to them in a performant manner...
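AWS does publish its address ranges as a JSON document (ip-ranges.json), so the netblock list can be produced mechanically. The sketch below assumes that published format (a `prefixes` array whose entries carry `ip_prefix` and `service`); the function names are hypothetical and this is not necessarily how pidtree-bcc builds its configuration. Adjacent blocks are collapsed to keep the list loaded into the kernel short.

```python
import ipaddress
import json
import urllib.request
from typing import Dict, List

AWS_IP_RANGES_URL = "https://ip-ranges.amazonaws.com/ip-ranges.json"


def load_aws_ranges(url: str = AWS_IP_RANGES_URL) -> Dict:
    """Download and parse the published ranges document (needs network access)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def aws_ipv4_networks(
    document: Dict, service: str = "AMAZON"
) -> List[ipaddress.IPv4Network]:
    """Extract the IPv4 prefixes for one service and merge adjacent blocks."""
    networks = [
        ipaddress.ip_network(entry["ip_prefix"])
        for entry in document.get("prefixes", [])
        if entry.get("service") == service
    ]
    # collapse_addresses merges overlapping/adjacent networks and sorts them.
    return list(ipaddress.collapse_addresses(networks))
```

The resulting networks can be written into the same subnet list the filter template iterates over, so connections to known Amazon space are dropped in-kernel before they generate any log noise.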
● ~25,000 - ~50,000 messages per hour across dev and stage
● Once accidentally load-tested at ~80,000 messages per 5 minutes from one host, for several hours
● Nobody on the host noticed
● TCP connections are way more expensive than the filters!
Load
● bpf_trace_printk() -> BPF_PERF_OUTPUT()
○ From global (i.e. per-kernel) debug output with hand-hacked JSON and string manipulation
○ To structured data in a ring buffer
○ Multi-tenancy makes it a better utility and more testable!
● Added unit tests
● Adding integration tests
● Adding infrastructure for deploying to the production environment
Undoing my nasty hacks
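The move from bpf_trace_printk() to BPF_PERF_OUTPUT() might look roughly like this with bcc. It is a hedged sketch, not the pidtree-bcc source: the struct layout, `trace_connect`, `decode_event`, and `main` are names invented for this example, all fields are 32 bits wide to avoid struct padding, and running `main()` needs root plus bcc installed. The kernel side submits a fixed-layout struct to a per-program ring buffer, and userland decodes it instead of parsing strings from the shared trace pipe.

```python
import ctypes
import socket
import struct

BPF_TEXT = r"""
#include <uapi/linux/ptrace.h>
#include <net/sock.h>
#include <linux/in.h>

struct connect_event_t {
    u32 pid;
    u32 daddr;  // destination address, network byte order
    u32 dport;  // destination port, network byte order
};
BPF_PERF_OUTPUT(events);

int trace_connect(struct pt_regs *ctx, struct sock *sk, struct sockaddr *uaddr)
{
    struct sockaddr_in addr = {};
    struct connect_event_t ev = {};

    bpf_probe_read(&addr, sizeof(addr), uaddr);
    ev.pid = bpf_get_current_pid_tgid() >> 32;
    ev.daddr = addr.sin_addr.s_addr;
    ev.dport = addr.sin_port;
    events.perf_submit(ctx, &ev, sizeof(ev));
    return 0;
}
"""


class ConnectEvent(ctypes.Structure):
    """Userland mirror of struct connect_event_t."""

    _fields_ = [
        ("pid", ctypes.c_uint32),
        ("daddr", ctypes.c_uint32),
        ("dport", ctypes.c_uint32),
    ]


def decode_event(raw: bytes) -> dict:
    """Turn the raw perf-buffer bytes into a structured record."""
    ev = ConnectEvent.from_buffer_copy(raw)
    return {
        "pid": ev.pid,
        "daddr": socket.inet_ntoa(struct.pack("=I", ev.daddr)),
        "port": socket.ntohs(ev.dport),
    }


def main() -> None:
    from bcc import BPF  # third-party; imported lazily

    bpf = BPF(text=BPF_TEXT)
    bpf.attach_kprobe(event="tcp_v4_connect", fn_name="trace_connect")

    def on_event(cpu, data, size):
        print(decode_event(ctypes.string_at(data, size)))

    bpf["events"].open_perf_buffer(on_event)
    while True:
        bpf.perf_buffer_poll()
```

Keeping `decode_event` separate from the bcc setup is what makes the userland side unit-testable: it can be fed hand-packed bytes without loading anything into a kernel.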
● De-containerize (e.g. a Debian package)
● Python 3
● Plugin for container awareness
○ Easy mapping to service and therefore owner!
● Enable immutable loginuid and add that to metadata
○ --loginuid-immutable under `man auditctl`
○ Cryptically says “but can cause some problems in certain kinds of containers”
● Threat modelling/hardening!
● Threat modelling/hardening!
Future work
● Performance improvements
○ BPF longest-match maps
○ Pre-processing masks
○ Probably totally unnecessary
● Moar syscalls!
○ TCP listens, IPv6, UDP, SUID, forwarded SSH socket reads…
● SIEM tooling
○ ASN matching, bad IP matching, GuardDuty auto-enrichment...
Future work
www.yelp.com/careers/
We're Hiring!
@YelpEngineering
fb.com/YelpEngineers
engineeringblog.yelp.com
github.com/yelp
https://github.com/Yelp/pidtree-bcc
@grimmware
Thanks for listening!
