BATTLING UNKNOWN MALWARE
WITH MACHINE LEARNING
DR. SVEN KRASSER CHIEF SCIENTIST
@SVENKRASSER
FALCON ON
VIRUSTOTAL
2016 CROWDSTRIKE, INC. ALL RIGHTS RESERVED.
SUBMITTING TO VIRUSTOTAL
SCAN RESULTS
MACHINE LEARNING
PRIMER
More on this: watch http://tinyurl.com/MLcrowdcast
Some Data to Get Started:
1988 ANTHROPOMETRIC
SURVEY OF ARMY PERSONNEL
Source: http://mreed.umtri.umich.edu/mreed/downloads.html#anthro
• Over 4000 soldiers surveyed
• Over 100 measurements
• Reported by gender
Data
FIRST LOOK
Height [mm]
Density
• Difference in
distribution
• Significant overlap
SECOND
DIMENSION
Height [mm]
Weight [10⁻¹ kg]
• Correlation
• Overlap
FEATURE
SELECTION
“Buttock Circumference” [mm]
Weight [10⁻¹ kg]
• Correlation
• Reduced overlap
• Selection of
features matters
LET’S
CLASSIFY
“Buttock Circumference” [mm]
Weight [10⁻¹ kg]
• Let’s assume we
want to detect
males (blue)
• I.e. “blue” is our
positive class
• TP: classify blue
as blue
• Note some
misclassifications
• FP: classify red as
blue
• FN: classify blue
as red
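The TP/FP/FN bookkeeping above can be sketched in a few lines of Python. This is only an illustration: the function name `confusion_counts` and the six labels are made up for the example and are not taken from the survey data.

```python
def confusion_counts(y_true, y_pred, positive="blue"):
    """Count TP, FP, FN, TN when `positive` is the class we want to detect."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # blue classified as blue
        elif pred == positive:
            fp += 1  # red classified as blue
        elif truth == positive:
            fn += 1  # blue classified as red
        else:
            tn += 1  # red classified as red
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


# Hypothetical labels, for illustration only.
y_true = ["blue", "blue", "blue", "red", "red", "red"]
y_pred = ["blue", "blue", "red", "blue", "red", "red"]
counts = confusion_counts(y_true, y_pred)
print(counts)
```

Dividing TP by the number of actual blues gives the true positive rate, and dividing FP by the number of actual reds gives the false positive rate.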
LET’S
CLASSIFY
• Get more “blue”
right (true positives)
• Get more “red”
wrong (false
positives)
RECEIVER
OPERATING
CHARACTERISTICS
CURVE
False Positive Rate
True Positive Rate
Detect more by accepting more false positives
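One way to see where such a curve comes from is to sweep a decision threshold over classifier scores and record the (false positive rate, true positive rate) pair at each step. The sketch below assumes a classifier that outputs a numeric score per sample; `roc_points`, the scores, and the labels are hypothetical.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs, from the strictest threshold to the loosest.

    scores: higher means "more likely positive".
    labels: True for the positive class, False otherwise.
    """
    pos = sum(1 for is_pos in labels if is_pos)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing is flagged
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, is_pos in zip(scores, labels) if s >= threshold and is_pos)
        fp = sum(1 for s, is_pos in zip(scores, labels) if s >= threshold and not is_pos)
        points.append((fp / neg, tp / pos))
    return points


# Hypothetical scores and labels, for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [True, True, False, True, False, False]
points = roc_points(scores, labels)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Lowering the threshold never reduces either rate, which is the trade-off the slide describes: more detections at the cost of more false positives.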
MORE
DIMENSIONS
MISSION ACCOMPLISHED:
WE JUST ADD MORE DIMENSIONS…
RIGHT?
CURSE OF DIMENSIONALITY
REDUCED
predictive performance
INCREASED
training time
SLOWER
classification
LARGER
memory footprint
Source: https://commons.wikimedia.org/w/index.php?curid=2257082
Height [mm]
Weight [10⁻¹ kg]
DIMENSIONALITY
AND SPARSENESS
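A rough way to illustrate sparseness: keep the number of samples fixed, split each dimension into 10 bins, and count how many grid cells contain at least one sample. The sketch below uses uniformly random points rather than the survey data, and `occupied_fraction` is a hypothetical helper name.

```python
import random


def occupied_fraction(n_points, n_dims, bins=10, seed=0):
    """Fraction of grid cells that hold at least one of n_points random samples."""
    rng = random.Random(seed)
    cells = set()
    for _ in range(n_points):
        # Map each coordinate in [0, 1) to one of `bins` intervals.
        cell = tuple(int(rng.random() * bins) for _ in range(n_dims))
        cells.add(cell)
    return len(cells) / bins ** n_dims


# Same 1000 samples budget, increasing number of dimensions.
fractions = {d: occupied_fraction(1000, d) for d in range(1, 5)}
for d, frac in fractions.items():
    print(f"{d} dimension(s): {frac:.1%} of cells occupied")
```

With four dimensions there are 10,000 cells, so 1000 samples can cover at most 10% of them. The same amount of data fills the space ever more thinly as dimensions are added.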
LET’S APPLY THIS TO
SECURITY
FILE
ANALYSIS
AKA Static Analysis
• THE GOOD
– Relatively fast
– Scalable
– No need to detonate
– Platform independent, can be done at gateway
• THE BAD
– Limited insight due to narrow view
– Different file types require different techniques
– Different subtypes need special consideration
    – Packed files
    – .NET
    – Installers
    – EXEs vs DLLs
– Obfuscations (yet good if detectable)
– Ineffective against exploitation and malware-less attacks
– Asymmetry: a fraction of a second to decide for the
defender, months to craft for the attacker
FILE CONTENT
EXAMPLE FEATURES
• 32/64 BIT EXECUTABLE
• GUI SUBSYSTEM
• COMMAND LINE SUBSYSTEM
• FILE SIZE
• TIMESTAMP
• DEBUG INFORMATION PRESENT
• PACKER TYPE
• FILE ENTROPY
• NUMBER OF SECTIONS
• NUMBER WRITABLE
• NUMBER READABLE
• NUMBER EXECUTABLE
• DISTRIBUTION OF SECTION ENTROPY
• IMPORTED DLL NAMES
• IMPORTED FUNCTION NAMES
• COMPILER ARTIFACTS
• LINKER ARTIFACTS
• RESOURCE DATA
• EMBEDDED PROTOCOL STRINGS
• EMBEDDED IPS/DOMAINS
• EMBEDDED PATHS
• EMBEDDED PRODUCT META DATA
• DIGITAL SIGNATURE
• ICON CONTENT
• …
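As an illustration of one feature from this list, file entropy is commonly computed as the Shannon entropy of the byte histogram, in bits per byte. The sketch below shows that common definition; whether the product computes it this way is an assumption, and `byte_entropy` is a hypothetical name.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, between 0.0 and 8.0 bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how often each byte value occurs
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Constant data carries no information; uniformly spread bytes reach the maximum.
print(byte_entropy(b"\x00" * 100))
print(byte_entropy(bytes(range(256))))
```

Compressed or encrypted content tends to score close to 8 bits per byte, which is one reason entropy is a popular hint for packed files. The same function can be applied per section to obtain a distribution of section entropy.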
String-based feature
Executable section size-based feature
COMBINING
FEATURES
Subspace Projection A
Subspace Projection B
False Positive Rate
True Positive Rate
Detect more by accepting more false positives
ARMY DATA ROC
CURVE
False Positive Rate
True Positive Rate
ML MALWARE
DETECTION ROC
CURVE
APTS & 99% OF MALWARE DETECTED…
Chance of at least one success for adversary
Number of attempts
1%
>99%
500
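The arithmetic behind this chart can be sketched as follows, assuming each attempt is an independent try that slips past detection with probability 1% (the complement of the 99% detection rate). Independence is a simplifying assumption, and the function name is hypothetical.

```python
def chance_of_at_least_one_success(miss_rate, attempts):
    """Probability that at least one of `attempts` independent tries is missed."""
    # All attempts are caught with probability (1 - miss_rate) ** attempts.
    return 1.0 - (1.0 - miss_rate) ** attempts


for attempts in (1, 10, 100, 500):
    chance = chance_of_at_least_one_success(0.01, attempts)
    print(f"{attempts:>3} attempts: {chance:.2%}")
```

Under these assumptions a single attempt succeeds 1% of the time, while 500 attempts give the adversary a better than 99% chance of at least one success.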
STOPPING MALWARE IS NOT ENOUGH
• Malware: 40%
• Axes: threat sophistication (low to high) vs. harder to prevent & detect (low to high)
YOU NEED COMPLETE BREACH PREVENTION
• Malware: 40%
• Non-malware attacks: 60%
• Adversaries: nation-states, organized criminal gangs, hacktivists/vigilantes, terrorists, cyber-criminals
• Axes: threat sophistication (low to high) vs. harder to prevent & detect (low to high)
Next-Generation Endpoint Protection
Cloud Delivered. Enriched by Threat Intelligence
MANAGED
HUNTING
ENDPOINT DETECTION
AND RESPONSE
NEXT-GEN
ANTIVIRUS
ML SETTINGS WITHIN FALCON HOST
ML PREVENTION IN ACTION
KEY
POINTS
• Machine Learning is an effective tool against
unknown malware
• Try it out on VirusTotal
• Trading off true positives and false positives
• Detecting 99% of malware still gives a persistent
adversary (APT) a >99% chance of getting malware
into your environment within 500 attempts
• The majority of intrusions are not malware-
based
• Avoid silent failure
• Use a comprehensive array of techniques
www.crowdstrike.com