@CxOSidekick
Opinions were shared
Mistakes were made
How to not fail at security data analytics (by CxOSidekick)
132 slides … in 15 minutes?
Theme #1: Vendors and the details they leave out
Or: How better questions avoid Pointless Offensive Concepts
More global deployments than our competitors can’t be wrong!
Hi, vaguely plausible maths guy here. We use all the machine learnings.
“The product’s good, as long as you don’t believe what the marketing tells you it’ll do; we had to invest 5 years for two analysts to build the skills to use it for real effect.”
1. What are the specific detection use cases you cover?
2. What are their limits?
3. What are your high and low bars for false positive rates in customer deployments, and why?
4. What attack techniques, over what attack surfaces, do you cover?
5. What specific data sources, and what fields within those data sources, are vital to solve the detection problem sets you focus on?
6. What is the ideal data set and coverage to deliver value, and what is the bare minimum baseline that is sufficient?
7. What are your dependencies on the config, modules and settings of the technologies that deliver input?
8. What visibility do you provide of the logic available (i.e. rule sets / algos) and the rationale for your choices in how you’ve applied that logic?
9. What is the process for tuning, and how far can the customer adapt it?
10. What is the playbook for triage used by your most advanced customers?
11. How long will it be before we see value, how do you define ‘value’, and what conditions must be true for you to stand behind delivering that in ‘n’ days?
Theme #2: Security Data Ops & why you need it
Behind the GUI, no one can hear you scream
Deliver on use cases
Analytics capability
Platform(s?)
Data Pipeline
Data
A lot of effort for a
feature, not a product
Solution to problem sets
Analytics capability
Platform(s?)
Data Pipeline
Data
8 platforms, 4 years, >20m blown,
users asking for their money back
NEVER START WITH
THE DASHBOARD
We are envoys, we take what is offered
AKA: in a mission where you have a clear target
outcome, but an unknowable path to victory in
a complex environment…
1. No plan survives 1st contact with the enemy
2. Always be adapting
Theme #3: There’s getting the data, then getting the data
Data sludge (SIEMs familiar)
We sit on
top of
your SIEM
No virtuous correlation
Act 1
Getting the data
A pre-requisite for
recognition is visibility
Visibility != Recognition
“My log has something to tell you.”
Wait, it gets worse
1. What’s the level of variation / entropy across sensors?
2. How is your sensor eco-system likely to change over time?
3. What data are sensors generating currently?
4. How consistently, with what coverage across environments?
5. What are the data generation options per sensor?
6. What are the change considerations to get maximum data?
7. What volumes of data will that generate?
8. What are the options to transport them (stream, batch)?
9. What are the costs of that to the network?
10. How does that impact your collection criteria of ‘by default’
or ‘by exception’?
Identify Protect Detect Respond Recover
Users
Devices
Apps
Data
Network
Sounil Yu
§ Revenue Engines
§ Trade Secrets
§ Executive Image
§ Compliance
§ Critical Operations
Identify Protect Detect Respond Recover
Users
Devices
Apps
Data
Network
Business level view:
“I want to detect super-users who attempt to exfiltrate trade secrets (e.g. customer lists)”
Identify Protect Detect Respond Recover
Users
Devices
Apps
Data
Network
Assuming your network isn’t…
Prod
environment?
Digital
Dev?
= risk appetite = constraints on flexible working
Back office apps?
WAN?
Trust zones
Loose Strict
How far do we promise we’ll get?
How far will our data get us?
Duplo Lego Technics
MVPD
Data sets that…
Phase 1: … triage to give broad situational awareness
Phase 2: … support infil / exfil detection across internal and external attack surfaces
Phase 3: … support IR and hunting for threat actor TTPs
Phase 4: … support proactive hunting for end-to-end adversary tradecraft
Users
§ HR database
§ Badge swipe system
§ Cloud auth / SSO
§ DC AD Kerberos tickets
Devices
§ Active directory
§ Vuln scanner
§ Endpoint software
§ CMDB
§ Domain controller events
§ Workstation events
§ Server events
§ Workstation and server processes with full command args
§ Server & Workstation config baseline
§ Server & Workstation PowerShell logs
Apps
§ Static / Dynamic scan results
§ Application access logs
§ Web server logs
Data
§ DLP alerts
§ SQL / Database logs
Network
§ Web proxy
§ DHCP
§ Netflow
§ DNS
§ Firewall
§ Custom IDS alerts
Act 2
Getting the data
Host /
Observer
Collector Platform
Security Data Operations
User Need
[Diagram: Host / Observer #1, #2 … #n; Collector #1, #2 … #n; Platform #1 … #n]
not
Class of Host /
Observer
Collector Platform
Filtered?
Aggregated?
Metricated?
Summarized?
All the
data?
Full collect?
What is centralized + available
Richness of content
Host Observer Collector Platform
Full collect
Filtered
Aggregated
Summarised
Metricated
Storage Settings
Implied Retention
Host dimensions
1. Mechanism
2. Volume
3. Collect mode
4. Format
Host
1. Native
2. Agent
3. Role based
1. Mechanism
2. Volume
3. Collect mode
4. Format
Host
1. Win logs
2. AV logs
3. DHCP logs
1. Mechanism
2. Volume
3. Collect mode
4. Format
DHCP
Server
Avg/min/max/ peak
a) EPS
b) Size / event
1. Mechanism
2. Volume
3. Collect mode
4. Format
Host
[EPS x Avg SPE] x Seconds = Log vol for time period n
What is the time
scale of interest?
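The volume formula on the slide can be sketched in a few lines of Python. This is only an illustration: the function name and the example figures (a DHCP server at 200 EPS and 300 bytes per event) are assumptions, not numbers from the talk. "SPE" is read here as size per event, in bytes.

```python
def log_volume_bytes(eps, avg_event_bytes, seconds):
    """[EPS x Avg SPE] x Seconds = log volume for the time period."""
    return eps * avg_event_bytes * seconds


SECONDS_PER_DAY = 24 * 60 * 60

# Hypothetical DHCP server: 200 events per second, 300 bytes per event on average.
daily_bytes = log_volume_bytes(eps=200, avg_event_bytes=300, seconds=SECONDS_PER_DAY)
daily_gb = daily_bytes / 1e9

print(f"{daily_gb:.2f} GB per day")
```

Changing `seconds` is how the "time scale of interest" question shows up: the same source sized per hour, per day, or per retention period gives very different answers for pipe and storage planning.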
Host /
Observer
Collector Platform
How contended is this resource, from
a) my top talker device
b) at the 95th percentile from all devices?
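A rough way to answer the contention question is to compare the top talker with the 95th percentile device against the capacity of the shared link. The sketch below assumes a nearest-rank percentile, invented per-device peak EPS figures, a 500 byte average event and a 100 Mbit/s link; all names and numbers are hypothetical.

```python
def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(values)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct * n / 100
    return ordered[rank - 1]


def link_share(eps, avg_event_bytes, link_bits_per_second):
    """Fraction of the link consumed by one device at the given event rate."""
    return eps * avg_event_bytes * 8 / link_bits_per_second


# Hypothetical peak EPS for 20 devices (illustrative values only).
peak_eps = [12, 15, 18, 20, 22, 25, 30, 35, 40, 45,
            50, 60, 75, 90, 120, 300, 900, 2500, 4000, 9500]
peak_eps_by_device = {f"host-{i:02d}": eps for i, eps in enumerate(peak_eps, start=1)}

top_talker = max(peak_eps_by_device, key=peak_eps_by_device.get)
p95_eps = percentile_nearest_rank(peak_eps_by_device.values(), 95)

AVG_EVENT_BYTES = 500          # assumption
LINK_BPS = 100_000_000         # assumption: 100 Mbit/s shared link

top_share = link_share(peak_eps_by_device[top_talker], AVG_EVENT_BYTES, LINK_BPS)
p95_share = link_share(p95_eps, AVG_EVENT_BYTES, LINK_BPS)

print(f"top talker {top_talker}: {top_share:.0%} of link")
print(f"95th percentile device: {p95_share:.0%} of link")
```

Looking at both figures matters because a single noisy device can dominate sizing decisions that the rest of the estate would never justify.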
1. Mechanism
2. Volume
3. Collect mode
4. Format
1. Push / pull
a) Directly from…
b) Indirectly from…
2. On demand only
pull
Host
Win Box
Directly from…
Win logs Collection
Server
Indirectly from…
Win Box
AV
alerts
Sophos
Server
AV
alerts
Collection
Server
1. Syslog (CEF, key-value pairs)
2. JSON
3. XML
4. Agent telemetry
5. CSV
6. … big long list
1. Mechanism
2. Volume
3. Collect mode
4. Format
Host
Collector dimensions
Collector
1. Collection
decisions
2. Fwd’r agent
decisions
3. Spec
decisions
4. Pipe
decisions
No
collect
Full
collect
Filter
Agg
Summ
Met
Collector
1. Collection
decisions
2. Fwd’r agent
decisions
3. Spec
decisions
4. Pipe
decisions
But measure (or summarize, then measure) what you filter out where possible: that tells you what it would cost to filter those event types back in later.
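One way to measure what gets filtered out is to count dropped events and bytes per event type at the point where the filter is applied. The class below is a minimal sketch under assumed names: `MeasuredFilter`, the `type` and `raw` keys, and the example event types are all hypothetical and not tied to any particular collector product.

```python
from collections import Counter


class MeasuredFilter:
    """Keep selected event types, but tally everything that is dropped."""

    def __init__(self, keep_types):
        self.keep_types = set(keep_types)
        self.dropped_events = Counter()  # event type -> number of dropped events
        self.dropped_bytes = Counter()   # event type -> bytes of dropped events

    def process(self, events):
        kept = []
        for event in events:
            if event["type"] in self.keep_types:
                kept.append(event)
            else:
                self.dropped_events[event["type"]] += 1
                self.dropped_bytes[event["type"]] += len(event["raw"].encode("utf-8"))
        return kept


# Hypothetical events; only 'logon' and 'process_start' are forwarded.
events = [
    {"type": "logon", "raw": "user=alice result=ok"},
    {"type": "dns_query", "raw": "q=example.com"},
    {"type": "dns_query", "raw": "q=example.org"},
    {"type": "process_start", "raw": "cmd=whoami"},
]

flt = MeasuredFilter(keep_types={"logon", "process_start"})
forwarded = flt.process(events)

print(len(forwarded), dict(flt.dropped_events), dict(flt.dropped_bytes))
```

The two counters are the cost estimate: multiplied out over the time scale of interest, they say what volume would arrive if a dropped event type were filtered back in.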
Collector
1. Collection
decisions
2. Fwd’r agent
decisions
3. Spec
decisions
4. Pipe
decisions
Note: this may be contingent on the
Collection Modes and Formats of
sources considered under the Host
section, and/or compatibility with
the chosen downstream Platform
Collector
1. Collection
decisions
2. Fwd’r agent
decisions
3. Spec
decisions
4. Pipe
decisions
Do you feel lucky? Well? Do ya?
1. Turn on all the logs
2. Tweak EPS input via config
3. Test spec
4. At fail, increase required resource
5. Measure output
Pick a ‘top
talker’
1000
5000
10,000
15,000
30,000
50,000
!?,000
CPU n
MEM n
NET n
DISK IOPS n
CPU n
MEM n
NET n
DISK IOPS n
x2
x2
EPS
Compression
performance
Utilization
Delay
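The test loop on this slide (step the EPS input up, find where the spec fails, add resource, repeat) can be sketched as follows. The EPS steps are taken from the slide; `find_breaking_point` and `fake_collector` are hypothetical names, and the stand-in collector simply assumes capacity scales with resource, which a real test would have to establish by measuring CPU, memory, network and disk IOPS.

```python
EPS_STEPS = [1_000, 5_000, 10_000, 15_000, 30_000, 50_000]


def find_breaking_point(run_at_eps, steps=EPS_STEPS):
    """Return (last passing EPS, first failing EPS), or (last passing, None)."""
    last_ok = None
    for eps in steps:
        if run_at_eps(eps):
            last_ok = eps
        else:
            return last_ok, eps
    return last_ok, None


def fake_collector(capacity_eps):
    """Stand-in for a real load test: passes while EPS is within capacity."""
    return lambda eps: eps <= capacity_eps


# Hypothetical collector spec that copes with 12,000 EPS.
baseline = find_breaking_point(fake_collector(12_000))

# Assumption for illustration only: doubling the resource doubles capacity.
doubled = find_breaking_point(fake_collector(24_000))

print("baseline:", baseline)
print("x2 resource:", doubled)
```

In practice `run_at_eps` would replay logs from the chosen top talker at the configured rate and report pass or fail from the measured utilization and delay.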
Level up
1. Turn on all the logs
2. Tweak EPS input via config
3. Test spec
4. At fail, increase required resource
5. Measure output
Pick a ‘top
talker’
1000
5000
10,000
15,000
30,000
50,000
!?,000
CPU n
MEM n
NET n
DISK IOPS n
CPU n
MEM n
NET n
DISK IOPS n
x2
x2
EPS
Compression
performance
Utilization
Delay
Collector
1. Collection
decisions
2. Fwd’r agent
decisions
3. Spec
decisions
4. Pipe
decisions
1. Bandwidth
2. Tolerable latency
3. Fail-over
“I cannot apologize for the cost of the logs.”
Security logs were probably not factored in
Delaying/batching by only a few seconds can
positively impact compression and/or aggregation
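The effect of batching on compression is easy to demonstrate with Python's standard `zlib` module. The sketch below compresses a set of synthetic, syslog-like lines one at a time and then as a single batch; the log lines are invented for the illustration, and real ratios depend entirely on the data.

```python
import zlib


def make_events(n):
    """Generate n synthetic, repetitive log lines (illustrative only)."""
    return [
        (
            f"<134>host{i % 5} sshd[{1000 + i}]: Accepted publickey for "
            f"user{i % 20} from 10.0.{i % 8}.{i % 250} port {40000 + i}"
        ).encode("utf-8")
        for i in range(n)
    ]


events = make_events(1000)

# Sending each event as soon as it arrives: every event is compressed alone.
per_event_bytes = sum(len(zlib.compress(e)) for e in events)

# Holding events briefly and sending one batch: shared structure compresses away.
batched_bytes = len(zlib.compress(b"\n".join(events)))

raw_bytes = sum(len(e) for e in events)
print(f"raw={raw_bytes} per_event={per_event_bytes} batched={batched_bytes}")
```

The trade-off is latency: the batch window has to stay inside whatever delay the detection use cases can tolerate.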
Throttle
Host /
Observer
Collector
Server
Platform
Muchos logs
4 hrs
of
logs
Possible rate
of send
Muchos logs
Fail. Over.
Host /
Observer
Collector
Server
Platform
Muchos logs
4 hrs
of
logs
Possible rate
of send
Muchos logs
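The throttle and fail-over picture comes down to simple arithmetic: a backlog only drains if the permitted send rate exceeds the rate at which new logs keep arriving. A minimal sketch, with a hypothetical function name and invented rates; only the four hour outage comes from the slide:

```python
def catch_up_seconds(outage_seconds, produce_eps, send_eps):
    """Time to drain the backlog after an outage, or None if it never drains.

    While the backlog is being sent, new events keep arriving at produce_eps,
    so only the headroom (send_eps - produce_eps) reduces the backlog.
    """
    if send_eps <= produce_eps:
        return None
    backlog_events = outage_seconds * produce_eps
    return backlog_events / (send_eps - produce_eps)


FOUR_HOURS = 4 * 60 * 60

# Hypothetical: hosts produce 5,000 EPS; the throttle allows 6,000 EPS to be sent.
drain = catch_up_seconds(FOUR_HOURS, produce_eps=5_000, send_eps=6_000)
print(f"{drain / 3600:.1f} hours to catch up")

# With no headroom the collector never catches up.
print(catch_up_seconds(FOUR_HOURS, produce_eps=5_000, send_eps=5_000))
```

This also indicates how much local buffering the host or collector needs, since the backlog has to be held somewhere until it is sent.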
Platform dimensions
Platform
1. Cost of data ingress /
egress
2. Cost of data
processing
3. Expertise to
operationalize
4. Integration / API
availability
“I’m technology agnostic
other than when it comes
to Splunk and Excel.”
Act 3
Getting me the data
1. What is the user need?
2. What system of processes
does it exist in? (i.e. context)
What problem
sets drive data
collection?
Scope of all
possible log
collection
Breaks into
these broad
categories
Sec Tech
Inter-networking
Devices
Apps
Enrichment
§ Telemetry
§ Alerts
§ Logs
§ Databases
§ Random excel spreadsheets
Let’s say
we’re
generating
something
like this
Sec Tech
Inter-netwk
Devices
Apps
Enrichment
Time
Modules Config Settings
Sec Tech
Inter-netwk
Devices
Apps
Enrichment
Dip test for ops
Manual xls for
4erly reporting
Time
Let’s say
we’re
generating
something
like this
Of ‘available’
we’ll be
centralizing,
(cont. or
periodically)
between
0-100%
Only
available
locally
Not available
For any
of these
Compliance driven
Last 30 days
Last 60 days
Ops / assurance driven
Only local
Local only default
Logs are available
Logs may be available
Centralised continuous and periodic
Time to forensicate
[Chart, built up across several slides. Axis labels: “Availability of historic logs” and “Need for historic logs”. Bands: Compliance driven (last 30 days, last 60 days), Ops / assurance driven, Local only default. Regions: “Logs are available” and “Logs may be available”. Data points are marked x.]
X = relevant signals or …. ?
Maybe alerts show in SIEM for relevant X?
Alerts may show in SIEM for X
This is incident / story
driven collection.
Rinse, repeat
This is why ‘hunt’ today
is a fundamentally
challenged discipline.
“We are not focused on
technical indicators. We take
adversary tradecraft and codify
that into a threat analytic.”
Collect everything by
default. Filter or
exclude by exception.
