Machine Learning in Computer
Security
Presented by :
Kishor Datta Gupta
Computer security
Tasks of cybersecurity:
• Prediction
• Prevention
• Detection
• Response
• Monitoring
Places to do the tasks:
• Network (network traffic analysis and intrusion detection)
• Endpoint (anti-malware)
• Application (WAF or database firewalls)
• User (UBA)
• Process (anti-fraud)
Time to do the tasks:
• In transit, in real time
• At rest
• Historically
What Can Machine Learning Do?
• Regression (or prediction): predicting the next value based on the previous values.
• Classification: separating things into different categories.
• Clustering: similar to classification, but the classes are unknown; things are grouped by their similarity.
• Association rule learning (or recommendation): recommending something based on previous experience.
• Dimensionality reduction (or generalization): finding the common and most important features across multiple examples.
• Generative models: creating something based on previous knowledge of the distribution.
Regression:
Knowledge about existing data is used to estimate values for new data. Example: house price prediction.
Example in cybersecurity: regression can be applied to fraud detection. The features (e.g., the total amount of suspicious transactions, location, etc.) determine a probability of fraudulent actions.
Regression
Machine learning:
• Linear regression
• Polynomial regression
• Ridge regression
• Decision trees
• SVR (Support Vector Regression)
• Random forest
Deep learning:
• Artificial Neural Network (ANN)
• Recurrent Neural Network (RNN)
• Neural Turing Machines (NTM)
• Differentiable Neural Computer (DNC)
Linear Regression:
• Linear regression predicts the value of a dependent variable (y) from a given independent variable (x).
• The technique finds a linear relationship between x (input) and y (output), hence the name linear regression.
• y = mx + c, where m is the slope and c is the intercept.
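As a minimal sketch of how the slope and intercept can be estimated, the following fits y = mx + c by ordinary least squares in plain Python. The function name `fit_line` and the sample data are illustrative assumptions, not part of the original slides.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = m*x + c for a single input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    c = mean_y - m * mean_x
    return m, c


# Hypothetical data that lies exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
m, c = fit_line(xs, ys)
predicted = m * 5.0 + c  # prediction for a new input x = 5
```

With real data the points will not lie exactly on a line, and the same formulas return the line that minimizes the sum of squared errors.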
Polynomial Regression:
Second-degree polynomial:
y = θ₀ + θ₁x + θ₂x²
The general equation of a polynomial regression of degree m is:
y = θ₀ + θ₁x + θ₂x² + … + θₘxᵐ
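One way to read the general equation is as a linear model over the expanded features 1, x, x², …, xᵐ. The sketch below only evaluates such a model for given coefficients; `poly_features`, `poly_predict`, and the coefficient values are hypothetical names and numbers chosen for illustration.

```python
def poly_features(x, degree):
    """Expand a scalar x into [1, x, x^2, ..., x^degree]."""
    return [x ** k for k in range(degree + 1)]


def poly_predict(theta, x):
    """Evaluate theta[0] + theta[1]*x + ... + theta[m]*x^m."""
    features = poly_features(x, len(theta) - 1)
    return sum(t * f for t, f in zip(theta, features))


# Hypothetical second-degree coefficients: y = 1 + 2x + 0.5x^2.
theta = [1.0, 2.0, 0.5]
y = poly_predict(theta, 2.0)  # 1 + 4 + 2
```

Because the model is still linear in the coefficients, they can be fitted with the same least-squares machinery as linear regression once the features are expanded.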
Decision Tree
• The goal of a decision tree is to build a model that predicts the class or value of the target variable by learning simple decision rules inferred from prior data (the training data).
• To predict a class label for a record, we start from the root of the tree and compare the value of the root attribute with the record’s attribute.
• Based on the comparison, we follow the branch corresponding to that value and jump to the next node, repeating until we reach a leaf.
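The traversal described above can be sketched with a hand-built tree. The tree below is not learned from data; its attributes (`failed_logins`, `new_device`), thresholds, and labels are hypothetical and exist only to show how a record walks from the root to a leaf.

```python
# Internal nodes test one attribute against a threshold; leaves carry a label.
TREE = {
    "attribute": "failed_logins",
    "threshold": 5,
    "low": {"label": "benign"},
    "high": {
        "attribute": "new_device",
        "threshold": 0.5,
        "low": {"label": "benign"},
        "high": {"label": "suspicious"},
    },
}


def classify(node, record):
    """Walk from the root to a leaf, following one branch per comparison."""
    while "label" not in node:
        value = record[node["attribute"]]
        branch = "high" if value > node["threshold"] else "low"
        node = node[branch]
    return node["label"]


result = classify(TREE, {"failed_logins": 9, "new_device": 1})
```

A training algorithm would choose the attributes and thresholds automatically; only the prediction step is shown here.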
Regression Evaluations
• MAE (Mean Absolute Error) is the average of the absolute differences between the original and predicted values over the data set.
• MSE (Mean Squared Error) is the average of the squared differences between the original and predicted values over the data set.
• RMSE (Root Mean Squared Error) is the square root of MSE, which expresses the error in the units of the target.
• R-squared (coefficient of determination) measures how well the predicted values fit the original values. It is at most 1 and usually falls between 0 and 1, where it can be read as the fraction of variance explained; a model that fits worse than always predicting the mean scores below 0. The higher the value, the better the model.
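The four metrics can be computed directly from their definitions. This is a small sketch in plain Python; the function name `regression_metrics` and the sample values are assumptions made for illustration.

```python
import math


def regression_metrics(actual, predicted):
    """Return MAE, MSE, RMSE and R-squared for paired actual/predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # R-squared compares the model's squared error with that of
    # always predicting the mean of the actual values.
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Hypothetical values: two predictions are off by one unit.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 8.0, 9.0]
metrics = regression_metrics(actual, predicted)
```

Note that the sketch divides by the total variance of the actual values, so it assumes they are not all identical.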
Classification:
Classification is a predictive modeling problem in which a class label is predicted for a given example of input data.
In cybersecurity, a spam filter that separates spam from other messages is an example.
Classification:
Machine learning:
• Logistic Regression (LR)
• K-Nearest Neighbors (K-NN)
• Support Vector Machine (SVM)
• Kernel SVM
• Naive Bayes
• Decision Tree Classification
• Random Forest Classification
Deep learning:
• Artificial Neural Network
• Convolutional Neural Networks
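Support Vector Machine (SVM):
The objective of the SVM is to find a hyperplane in an N-dimensional space (where N is the number of features) that distinctly classifies the data points. Among all separating hyperplanes, it chooses the one with the largest margin to the nearest points of each class.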
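Once a hyperplane w·x + b = 0 has been found, classifying a point only requires checking which side of it the point falls on. The sketch below shows that prediction step and the distance to the hyperplane; it does not train an SVM, and the weights `w`, bias `b`, and sample points are hypothetical.

```python
import math


def decision_value(w, b, x):
    """Signed score w.x + b; its sign says which side of the hyperplane x is on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def svm_classify(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_value(w, b, x) >= 0 else -1


def distance_to_hyperplane(w, b, x):
    """Geometric distance from x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(decision_value(w, b, x)) / norm


# Hypothetical hyperplane in a 2-feature space: 3*x1 + 4*x2 - 5 = 0.
w, b = [3.0, 4.0], -5.0
label_far = svm_classify(w, b, [3.0, 4.0])
label_origin = svm_classify(w, b, [0.0, 0.0])
```

Training consists of choosing `w` and `b` so that the smallest of these distances over the training points is as large as possible.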
Naive Bayes:
Naive Bayes is a probabilistic classifier that makes classifications using the Maximum A Posteriori (MAP) decision rule in a Bayesian setting. It is called naive because it assumes the features are conditionally independent given the class.
Naive Bayes classifiers have been especially popular for text classification and are a traditional solution for problems such as spam detection.
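A toy spam filter illustrates the MAP rule: for each class, add the log prior to the log likelihood of every word and pick the class with the highest total. This is a minimal sketch assuming a bag-of-words model with add-one (Laplace) smoothing; the four training messages and the function names are made up for illustration.

```python
import math
from collections import Counter


def train(messages):
    """messages: list of (text, label). Collect the counts the MAP rule needs."""
    class_counts = Counter()
    word_counts = {}
    vocab = set()
    for text, label in messages:
        class_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in text.lower().split():
            counts[word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def nb_classify(text, model):
    """Return the label with the highest posterior score (in log space)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        n_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # skip words never seen during training
            # Add-one smoothing keeps unseen word/class pairs from zeroing the score.
            likelihood = (word_counts[label][word] + 1) / (n_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data.
TRAINING = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project status and agenda", "ham"),
]
model = train(TRAINING)
verdict = nb_classify("free money", model)
```

Working in log space avoids numerical underflow when many small probabilities are multiplied.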
Artificial Neural Network:
The core component of an ANN is the artificial neuron. Each neuron receives inputs from several other neurons, multiplies them by assigned weights, adds them up, and passes the sum to one or more neurons. Some artificial neurons might apply an activation function to the output before passing it to the next neuron.
An artificial neural network is composed of an input layer, which receives data from outside sources (data files, images, hardware sensors, microphone…), one or more hidden layers that process the data, and an output layer that provides one or more data points based on the function of the network.
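The weighted-sum-then-activation behaviour of a neuron, and the flow from input layer through a hidden layer to the output layer, can be sketched as a forward pass. The network below has hand-picked, hypothetical weights and uses a sigmoid activation as an assumption; training (learning the weights) is not shown.

```python
import math


def sigmoid(z):
    """Activation function that squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias):
    """Multiply inputs by weights, add them up with a bias, then activate."""
    return sigmoid(sum(i * w for i, w in zip(inputs, weights)) + bias)


def layer(inputs, weight_rows, biases):
    """One layer: every neuron sees the same inputs with its own weights."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]


def forward(x, hidden_w, hidden_b, out_w, out_b):
    """Input layer -> one hidden layer -> output layer."""
    hidden = layer(x, hidden_w, hidden_b)
    return layer(hidden, out_w, out_b)


# Hypothetical weights: 2 inputs, 2 hidden neurons, 1 output neuron.
hidden_w = [[0.5, -0.2], [0.1, 0.4]]
hidden_b = [0.0, -0.1]
out_w = [[1.0, -1.0]]
out_b = [0.2]
output = forward([1.0, 2.0], hidden_w, hidden_b, out_w, out_b)
```

The output is a list with one value per output neuron, here a single score between 0 and 1.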
Classification Evaluations
TP, FP, TN, and FN are the counts of true positives, false positives, true negatives, and false negatives.
Accuracy
• The proportion of correct results among the total number of cases examined.
• Accuracy = (TP + TN) / (TP + FP + FN + TN)
Precision
• What proportion of predicted positives is truly positive?
• Precision = TP / (TP + FP)
Recall
• What proportion of actual positives is correctly classified?
• Recall = TP / (TP + FN)
F1 Score
• The harmonic mean of precision and recall.
• F1 = 2 × Precision × Recall / (Precision + Recall)
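As a worked example, the sketch below computes the four scores from confusion-matrix counts. The counts are hypothetical (a spam filter run on 1,000 messages), and the function assumes at least one predicted positive and one actual positive so that no denominator is zero.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)  # assumes tp + fp > 0
    recall = tp / (tp + fn)     # assumes tp + fn > 0
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# Hypothetical counts: 40 spam caught, 10 legitimate messages flagged,
# 20 spam missed, 930 legitimate messages passed.
accuracy, precision, recall, f1 = classification_metrics(tp=40, fp=10, fn=20, tn=930)
```

Here accuracy is high mostly because legitimate messages dominate the data, while recall shows that a third of the spam was missed; this is why accuracy alone is a weak measure on imbalanced security data.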
Clustering:
The classes of the data are unknown, and it is not known in advance whether the data can be classified at all. This is unsupervised learning.
Arguably the best fit for clustering is forensic analysis, where the causes, course, and consequences of an incident are obscure and all activities must be grouped in order to find anomalies. Malware analysis solutions (e.g., malware protection or secure email gateways) may use it to separate legitimate files from outliers.
Another interesting area for clustering is user behavior analytics. Here, application users are clustered so that it is possible to see whether each one belongs to a particular group.
Clustering is usually not applied to a particular cybersecurity task on its own; it is more often one of the subtasks in a pipeline (e.g., grouping users into separate groups to adjust risk values).
Clustering:
Machine learning:
• K-means
• Mixture model (LDA)
• DBSCAN
• Bayesian
• Gaussian Mixture Model
• Agglomerative
• Mean-shift
Deep learning:
• Self-organizing Maps (SOM)
• Kohonen Networks
K-Means Clustering
K-Means finds the best centroids by alternating between (1) assigning data points to clusters based on the current centroids and (2) choosing centroids (points at the center of a cluster) based on the current assignment of data points to clusters. The two steps repeat until the assignments stop changing.
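The two alternating steps can be sketched in a few lines for 2-D points. This is a minimal illustration assuming the starting centroids are supplied by the caller (real implementations usually pick them randomly or with a seeding heuristic); the function name and data are hypothetical.

```python
def kmeans(points, centroids, iterations=10):
    """Alternate assignment and centroid update on 2-D points."""
    for _ in range(iterations):
        # Step 1: assign each point to its nearest current centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Step 2: move each centroid to the mean of its assigned points.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                mean_x = sum(p[0] for p in cluster) / len(cluster)
                mean_y = sum(p[1] for p in cluster) / len(cluster)
                new_centroids.append((mean_x, mean_y))
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # nothing moved, so the assignments are stable
        centroids = new_centroids
    return centroids, clusters


# Hypothetical data: two well-separated groups of points.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, [(0, 0), (10, 10)])
```

The result depends on the starting centroids, which is why K-Means is commonly run several times with different initializations.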
Association Rule Learning
Netflix and SoundCloud recommend films or songs according to your movie or music preferences.
In cybersecurity, this principle can be used primarily for incident response.
If a company faces a wave of incidents and applies various types of responses, a system can learn which type of response fits a particular incident (e.g., mark it as a false positive, change a risk value, run an investigation).
Risk management solutions can also benefit if they automatically assign risk values to new vulnerabilities or misconfigurations based on their descriptions.
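The idea of learning a response for an incident type can be illustrated with the two basic measures behind association rules: support (how often the items occur together) and confidence (how often the response follows the incident type). The incident history, item names, and function name below are hypothetical.

```python
def rule_stats(transactions, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent."""
    n = len(transactions)
    with_antecedent = sum(1 for t in transactions if antecedent <= t)
    with_both = sum(1 for t in transactions if antecedent <= t and consequent <= t)
    support = with_both / n
    confidence = with_both / with_antecedent if with_antecedent else 0.0
    return support, confidence


# Hypothetical history: each set holds an incident type and the response taken.
HISTORY = [
    {"phishing", "reset_password"},
    {"phishing", "reset_password"},
    {"phishing", "false_positive"},
    {"port_scan", "false_positive"},
]
support, confidence = rule_stats(HISTORY, {"phishing"}, {"reset_password"})
```

Algorithms such as Apriori search for all rules whose support and confidence exceed chosen thresholds instead of testing one rule at a time.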
Association Rule Learning:
Machine learning:
• Apriori
• Eclat
• FP-Growth
Deep learning:
• Deep Restricted Boltzmann Machine (RBM)
• Deep Belief Network (DBN)
• Stacked Autoencoder
Generalization:
Dimensionality reduction can help handle data with many features by cutting the unnecessary ones. Like clustering, dimensionality reduction is usually one of the tasks in a more complex model.
In cybersecurity, dimensionality reduction is common in face detection solutions.
Generalization:
Machine learning:
• Principal Component Analysis (PCA)
• Singular Value Decomposition (SVD)
• t-distributed Stochastic Neighbor Embedding (t-SNE)
• Linear Discriminant Analysis (LDA)
• Latent Semantic Analysis (LSA)
• Factor Analysis (FA)
• Independent Component Analysis (ICA)
• Non-negative Matrix Factorization (NMF)
Deep learning:
• Autoencoder
Generative models:
Generative models are designed to simulate the actual data (not decisions) based on previous decisions.
A simple task in offensive cybersecurity is generating a list of input parameters to test a particular application for injection vulnerabilities.
Alternatively, consider a vulnerability scanning tool for web applications. One of its modules tests files for unauthorized access. These tests can mutate existing filenames to identify new ones.
For example, if a crawler detects a file called login.php, it is worth checking for backup or test copies by trying names like login_1.php, login_backup.php, and login.php.2017. Generative models are good at this.
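To make the filename example concrete, here is a rule-based mutation sketch. It is a hand-written stand-in rather than a trained generative model: a real generative approach would learn such patterns from observed filenames. The function name and the fixed set of three patterns are assumptions taken from the example names above.

```python
def candidate_names(filename, year="2017"):
    """Generate backup/test-copy name candidates for a discovered file."""
    stem, dot, ext = filename.rpartition(".")
    if not dot:
        # No extension: treat the whole name as the stem.
        stem, suffix = filename, ""
    else:
        suffix = "." + ext
    return [
        f"{stem}_1{suffix}",       # numbered copy
        f"{stem}_backup{suffix}",  # explicit backup copy
        f"{filename}.{year}",      # dated copy with the original name kept
    ]


candidates = candidate_names("login.php")
```

A scanner would then request each candidate and report the ones that exist and are readable without authorization.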
Generative models:
Machine learning:
• Markov Chains
• Genetic Algorithm
Deep learning:
• Variational Autoencoders
• Generative Adversarial Networks (GANs)
• Boltzmann Machines
Machine learning for Network Protection
ML in network security implies new solutions aimed at in-depth analysis of all traffic at each layer to detect attacks and anomalies.
How can ML help here?
• Regression to predict the network packet parameters and compare them with the
normal ones;
• Classification to identify different classes of network attacks such as scanning and
spoofing;
• Clustering for forensic analysis.
Machine learning for Endpoint Protection
The new generation of antivirus is Endpoint Detection and Response (EDR). It is better to learn features from executable files or from process behavior. Data may differ depending on the type of endpoint (e.g., workstation, server, container, cloud instance, mobile, PLC, IoT device), but the tasks are common.
How can ML help here?
• Regression to predict the next system call of an executable process and compare it with the real ones;
• Classification to divide programs into categories such as malware, spyware, and ransomware;
• Clustering for malware protection on secure email gateways (e.g., to separate legitimate file attachments from outliers).
Machine learning for Application Security
Application security can differ. There are web applications, databases, ERP systems, SaaS applications, microservices, etc.
How can ML help here?
• Regression to detect anomalies in HTTP requests (for example, XXE and SSRF attacks and auth bypass);
• Classification to detect known types of attacks like injections (SQLi, XSS, RCE, etc.);
• Clustering user activity to detect DDoS attacks and mass exploitation.
Machine learning for User Behavior
There are domain users, application users, SaaS users, social networks, messengers, and other accounts that should be monitored.
User behavior is one of the more complex layers and is an unsupervised learning problem. As a rule, there is no labelled dataset and no clear idea of what to look for.
How can ML help here?
• Regression to detect anomalies in user actions (e.g., login at an unusual time);
• Classification to group different users for peer-group analysis;
• Clustering to separate groups of users and detect outliers.
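The unusual-login-time case can be sketched with the simplest possible predictor: expect the user's historical mean login hour and flag a login whose deviation is large relative to the usual spread. This is a z-score baseline rather than a fitted regression model, it ignores the wrap-around at midnight, and the threshold, history, and function name are hypothetical.

```python
import statistics


def is_unusual_login(history_hours, login_hour, threshold=3.0):
    """Flag a login more than `threshold` standard deviations from the mean hour."""
    mean = statistics.mean(history_hours)
    spread = statistics.pstdev(history_hours)
    if spread == 0:
        # The user always logs in at the same hour: any other hour is unusual.
        return login_hour != mean
    return abs(login_hour - mean) / spread > threshold


# Hypothetical history: a user who normally logs in between 8:00 and 10:00.
history = [9, 9, 10, 8, 9, 10, 9]
night_login = is_unusual_login(history, 3)
morning_login = is_unusual_login(history, 10)
```

A production system would likely combine several features (location, device, day of week) and learn a per-user or per-peer-group baseline.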
Machine learning for Process Behavior
It is necessary to know a business process in order to find something anomalous in it.
Business processes can differ significantly. You can look for fraud in banking and retail systems, or on a plant floor in manufacturing.
How can ML help here?
• Regression to predict the next user action and detect outliers such as credit card fraud;
• Classification to detect known types of fraud;
• Clustering to compare business processes and detect outliers.
References
• https://towardsdatascience.com/machine-learning-for-cybersecurity-101-7822b802790b
• AI for Cybersecurity, Cylance (2017). Short but good introduction to the basics of ML for cybersecurity. Good practical examples.
• Machine Learning and Security, O’Reilly (January 2018). Best book so far on this topic, but it has very few examples of deep learning and covers mostly general machine learning.
• Machine Learning for Penetration Testers, Packt (July 2018). Less fundamental than the previous one, but has more deep learning approaches.





