Subverting Machine Learning Detections for fun and profit

“But Watson couldn’t distinguish between polite language and profanity —
which the Urban Dictionary is full of”
- Eric Brown (IBM)

Subverting Machine Learning
for Fun And Profit
Ram Shankar Siva Kumar, John Walton
Email: Ram.Shankar@Microsoft.com; JoWalt@Microsoft.com

Goals
• This talk:
• Is a primer on Adversarial Machine Learning
• Will show, through a sampling, how ML algorithms are vulnerable
• Illustrates how to defend against such attacks
• This talk IS NOT
• An exhaustive review of all algorithms
• End goal: Gain an intuitive understanding of ML algorithms and how
to attack them

Agenda
• Motivation to Attack ML systems
• Practical Attacks and Defenses
• Best Practices

ML is everywhere…
“Machine Learning is shifting from an academic discipline to an
industrial tool” – John Langford

In Security…!!
“The only effective approach to
defending against today’s ever-increasing
volume and diversity of
attacks is to shift to fully
automated systems capable of
discovering and neutralizing
attacks instantly.”
- Mike Walker (on DARPA
Cyber Grand Challenge)

Computer
System
Data
Program
Output
Computer
System
Data
Output
Program
Traditional Programming
Machine Learning
Source: Lectures by Pedro Domingos

Things to note about
• For the program to be functional, input data must be functional
• What does a program/model look like?
• Literally, bunch of numbers and data points
• The output model can be expressed in terms of parameters:
Linear Regression
y = 225x + 875
3,500
3,000
2,500
2,000
1,500
1,000
500
0 R² = 0.574
1 2 3 4 5 6 7 8
Number of Logons
Time
Non-Linear
y = 982.23e0.1305x
R² = 0.6624
3,500
3,000
2,500
2,000
1,500
1,000
500
0
1 2 3 4 5 6 7 8
Number of Logons = 225 * Time + 875 Number of Logons = 982*e
0.1305* (Time)

Malicious Mindset
• Data and parameters define the model
• By controlling the data or parameters, you can change the model
• Where do you find them?
• Data
• At the source
• Collected in a big data store
• Stored in the cloud (MLaaS)
• Parameters:
• Code repository

The mother lode
Data is collected
Data is within
anomaly detector’s
purview
Anomaly is
significant for
detector
Anomaly is
surfaced!
Source: Arun Viswanathan, Kymie Tan, and Clifford Neuman, Deconstructing the Assessment of Anomaly-based Intrusion Detectors, RAID 2013.

Putting it all together
• Opportunity = ML is/will be everywhere
• Prevalence = ML is/will be widely used in security
• Ease = (most) ML algorithms can be easily subverted by controlling
data/parameters
• High rate of return = Once subverted, you can evade or even control
the system
Opportunity * Prevalence * Ease * High Rate of Return =

Agenda
• Motivation to Attack ML systems
• Practical Attacks and Defenses
• Intuitive understanding of the algorithm
• How the system looks before the attack?
• How the system looks after the attack?
• How to defend from these attacks?
• Takeaway – From Evasion to total control of the system
• Best Practices

About the dataset
• Used Enron Spam Dataset
• Came out of the Federal investigation of Enron corporation
• Real world corpus of spam and ham messages.
• 619,446 email messages belonging to 158 users. After cleaning it up
(removing duplicate messages, discussion threads), you end up with
200,399 messages.

Word P(Word|Spam) P(Word|Ham)
Assets 0/3 2/3
Assignment 0/3 2/3
Cialis 3/3 0/3
Group 0/3 2/3
Viagra 1/3 0/3
Vallium 2/3 0/3
Naïve Bayes Algorithm
Choose whichever probability is higher:
푃 푆푝푎푚 푀 ∝ 푃 푆푝푎푚 ∗ 푃(W|Spam)
푃 퐻푎푚 푀 ∝ 푃 퐻푎푚 ∗ 푃(W|Ham)
P(Spam|M) = 0.5*(0/3)*(0/3)*(0/3) = 0
P(Ham|M) = 0.5*(2/3)*(2/3)*(2/3) = 0.14
Since 0.14 > 0 => Message is more likely to be
Ham

Before Attack
• Built a vanilla Naïve Bayes classifier on Enron email dataset (with
some normalizations)
• Goal: Given a new subject, can I predict if it is spam or ham?
• Testing on 20% of data, you get test accuracy of 62%

After the attack
• Good Word Attack: Introduce innocuous words in the message
E.g: Gas Meeting Expense Report Payroll
-> Test Accuracy dropped to 52.8%
100
80
60
40
20
0
0 10 20 30
False Positive Rate
Number of Benign words added

Takeaway
• How to use in real-world: Spear phishing
• By manipulating the input to the algorithm, we can increase the false
positive rate
• Make the system unusable!

Support Vector Machines – The Ferrari of ML
• Immensely popular
• Quite fast
• Deliver a solid performance
• Widely used in classification setting
In Security setting, beginning to gain
popularity in the Malware community.
• Goal: Given a piece of code, is it Malicious
or benign?

Intuition
Which is the right decision boundary?

SVM Intuition
Choose the hyperplane, that maximizes the
margin between the positive and negative
examples!
Those examples on the boundary are called
support vectors!

Facts about SVMs
• Output of SVM = a set of weights + Support vectors
• Once you have the support vectors {special points in the
training data}, rest of the training data can be thrown away
• Takeaway: A good part of the model, is determined by
support vectors
• Intuition: Controlling the support vectors, should help us to
control the model

Takeaway
• How it can be used in real-world: Fool the malware classifier
• Changes to support vectors, lead to changes in decision boundary

Clustering
• Widely used learning algorithm for
anomaly detection

Attack Intuition
Center
Before Attack
After Attack
Attack Point
to be included
Source:Laskov, Pavel, and Marius Kloft. "A framework for quantitative security analysis of machine learning." Proceedings of the 2nd ACM workshop on Security and artificial intelligence. ACM, 2009.

Takeaway
• In order to attack the algorithm, we don’t change the parameter
(centroid) -> Simply send in data as part of “normal” traffic
• Increased the false negative rate

Summary of Attacks
Algorithm Result of Attack What does this mean?
Naïve Bayes Increased false positive rate You can make the system unusable
K-means
clustering
Increased false negative rate You can evade detection
SVM Control of the decision boundary You have full control of what gets alerted
and what doesn’t

Ensembling – You can’t fool ‘em all
- Build separate models to detect
malicious activity
- The models are chosen so that they are
orthogonal
- Each model independently assess for
maliciousness
- Results are combining using a separate
function

• Used Gaussian Naïve Bayes, linear SVM in addition to Naïve Bayes
• Used a simple majority voting method, to combine the three outputs.

Using Robust Learning Methods
• Intuition: Treat the tainted data points
as outliers (presumably because of
noise)
Outlier?

Instead of Consider
Vanilla Naïve Bayes Multinomial Model (even better
than multivariate Bernoulli model)
SVM Robust SVM (feature noise, and
label noise)
K-means with finite window K-means with infinite window
Logistic Regression Robust Logistic Regression using
Shift Parameters
Vanilla PCA Robust PCA with Laplcian
Threshold (Antidote)

Caution!
• Pros: Well studied field with a gamut of choices
• Optimization perspective
• Game Theoretic perspective
• Statistical perspective
• Cons:
• Some of these algorithms have higher computational complexity than standard
algorithms
Standard SVM: 10 minutes Robust SVM: 1 hr and 8 mins
(Single node implementation, 50k data points, 20% test, no kernel )
• Requires a lot more tuning and babysitting

Threat Modeling
• Adversary Goal - Evasion? Poisoning? Deletion?
• Adversary’s knowledge – Perfect Knowledge? Limited Knowledge?
• Training set or part of it
• Feature representation of each sample
• Type of a learning algorithm and the form of its decision function
• Parameters and hyper-parameters of the learned model
• Feedback from the classifier; e.g., classifier labels for samples chosen by the
adversary.
• Attacker’s capability
• Ability to modify – Complete or partial?
Source:Biggio, Battista, Blaine Nelson, and Pavel Laskov. "Poisoning attacks against support vector machines." arXiv preprint arXiv:1206.6389 (2012).

Tablestakes
• Secure log sources
• Secure your storage space
• Monitor data quality
• Treat parameters and features as secrets
• Don’t use publically available datasets to train your system
• When designing the system, avoid interactive feedback

3 Key Takeaways
1) Naïve implementation of machine Learning Algorithms are
vulnerable to attacks.
2) Attackers can evade detections, cause the system to be unusable or
even control it.
3) Trustworthy results depend on trustworthy data.

Thank you!
- TwC: Tim Burell
- Azure Security: Ross Snider, Shrikant
Adhirkala, Sacha Faust Bourque,
Bryan Smith, Marcin Olszewski,
Ashish Kurmi, Lars Mohr, Ben
Ridgway
- O365 Security: Dave Hull, Chetan
Bhat, Jerry Cochran
- MSR: Jay Stokes, Gang Wang
(intern)
- LCA: Matt Sommer
Source: http://www.lecun.org/gallery/libpro/20011121-allyourbayes/dsc01228-02-h.jpg

Subverting Machine Learning Detections for fun and profit

More Related Content

What's hot (20)

Similar to Subverting Machine Learning Detections for fun and profit (20)

Recently uploaded (20)

Subverting Machine Learning Detections for fun and profit

Editor's Notes