Machine learning in Cyber Security

Cyber Security using Machine Learning
Rajath V
4MH15IS105

Contents
● INTRODUCTION
● LITERATURE SURVEY
● METHODOLOGY
● REAL TIME CASE STUDY
● APPLICATIONS
● ADVANTAGES AND DRAWBACKS
● RESULT ANALYSIS AND DISCUSSION
● CONCLUSION
● SCOPE FOR FUTURE WORK
● REFERENCES

INTRODUCTION
Cyber security has become a main focus for companies in recent years, but
securing infrastructure is repetitive, monotonous work. Machine learning and
other data mining techniques can be used to automate much of it.

Machine learning
A computer program is said to learn from experience E with respect to some
class of tasks T and performance measure P, if its performance at tasks in T,
as measured by P, improves with experience E. (Tom Mitchell, 1997)

Supervised Machine Learning
Supervised: using Recurrent Neural Networks to distinguish between “normal”
DNS domains and those generated by Domain Generation Algorithms (DGAs).
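As a rough illustration of the supervised idea, the sketch below trains a classifier on labelled domain names. It deliberately swaps the Recurrent Neural Network for a much simpler character-bigram Naive Bayes model so that it runs with the standard library only. The class name, helper names and the toy training domains are all hypothetical and are not taken from the slides.

```python
import math
from collections import Counter


def bigrams(domain):
    """Split a domain label into overlapping character pairs, with start/end marks."""
    padded = f"^{domain.lower()}$"
    return [padded[i:i + 2] for i in range(len(padded) - 1)]


class BigramNaiveBayes:
    """Multinomial Naive Bayes over character bigrams with add-one smoothing."""

    def fit(self, domains, labels):
        self.counts = {label: Counter() for label in set(labels)}
        self.priors = Counter(labels)
        for domain, label in zip(domains, labels):
            self.counts[label].update(bigrams(domain))
        self.vocab = set().union(*self.counts.values())
        return self

    def predict(self, domain):
        total_docs = sum(self.priors.values())
        scores = {}
        for label, counter in self.counts.items():
            denominator = sum(counter.values()) + len(self.vocab)
            score = math.log(self.priors[label] / total_docs)
            for gram in bigrams(domain):
                score += math.log((counter[gram] + 1) / denominator)
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical toy training data (assumption, for illustration only).
normal = ["google", "facebook", "wikipedia", "amazon", "github",
          "youtube", "twitter", "linkedin", "microsoft", "netflix"]
dga = ["xkqzjvwp", "qwzxvbnk", "zjxqkvhw", "vbqxzkjw", "kxzqwvjp",
       "wqxzjkvb", "jzxkqvwh", "pqzxwkvj", "xzvqkjwb", "kqjxzwvp"]
model = BigramNaiveBayes().fit(normal + dga, ["normal"] * 10 + ["dga"] * 10)

print(model.predict("instagram"))
print(model.predict("zxqvkjwq"))
```

A real detector would need far more training data, and a sequence model such as an RNN can pick up longer-range character patterns that bigrams miss.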

Unsupervised Machine Learning
Unsupervised: using Self-Organizing Maps and clustering techniques to
identify anomalous IP traffic.
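The clustering half of this idea can be sketched without labels: group traffic flows, then treat flows that land in a very small cluster as anomalous. The sketch uses plain k-means rather than a Self-Organizing Map. The flow features (packets per second, mean packet size), the data and the `min_share` cut-off are assumptions made for illustration.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point that is farthest from every centroid chosen so far.
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                centroids[i] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


def anomalous_points(points, k=3, min_share=0.2):
    """Return the points that fall into clusters holding under min_share of the data."""
    labels = kmeans(points, k)
    sizes = {label: labels.count(label) for label in set(labels)}
    return [p for p, label in zip(points, labels)
            if sizes[label] / len(points) < min_share]


# Hypothetical flows: (packets per second, mean packet size in bytes).
flows = [(10, 500), (12, 520), (11, 480), (9, 510), (10, 495),
         (200, 60), (210, 64), (190, 58), (205, 62), (195, 61),
         (5000, 1500)]
print(anomalous_points(flows))
```

In practice the features would be normalised first so that no single feature dominates the distance.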

Reinforcement Learning
Reinforcement Learning: bringing threat intelligence and the end user into
the loop.

CYBER SECURITY
Cybersecurity is the protection of internet-connected systems, including
hardware, software and data, from cyberattacks.

Elements of Cyber Security
1. Application Security
2. Information Security
3. Network Security
4. Disaster Recovery / Business Continuity
5. Operational Security
6. End-user Education

Types of Cyber Security Threats
1. Ransomware
2. Malware
3. Social Engineering
4. Phishing

Why is Machine Learning used in Cyber Security?
● DDoS attack on GitHub peaking at 1.3 terabits per second.
● 16.7 million identity theft victims, with 16.8 billion dollars stolen.
● 50 million users compromised in a breach of Facebook's developer APIs.
If these breaches had been detected and stopped automatically by machines,
huge losses could have been prevented.

Cyber security tasks can be divided into five main categories:
1. Prediction
2. Prevention
3. Detection
4. Response
5. Monitoring

Literature Survey
Title: Machine Learning Aided Android Malware Classification
Authors: Nikola Milosevic, Ali Dehghantanha, Kim-Kwang Raymond Choo
Description:
● This paper deals with static analysis of applications for malware detection.
● Combinations of the permissions granted to an application and the system calls it makes are analysed.
● The M0Droid dataset is used, which contains 200 malicious Android apps.
● Source-code-based analysis is used, giving a binary classification of whether an application is malicious or not.
Advantages & Limitations:
● High precision and recall rates in the detection of new malicious applications.
● A bigger, balanced dataset and more advanced machine learning techniques should be used.

Literature Survey
Title: Intelligent Phishing Website Detection using Random Forest Classifier
Authors: Abdulhamit Subasi, Esraa Molah, Fatin Almkallawi, Touseef J. Chaudhery
Description:
● A Random Forest classifier is used to detect phishing links.
● WEKA is the software used to apply the machine learning classifiers.
● The dataset comes from the UCI Machine Learning Repository.
● The accuracy of Random Forest was higher than that of the other classifiers compared (ANN, k-NN, C4.5, CART, SVM).
Advantages & Limitations:
● Random Forest gives higher accuracy, is robust, and executes faster than the other classifiers.
● New, more advanced phishing techniques have since come into existence.
● Phishing attacks have improved by masking the URL so that it looks like a real one.

Literature Survey
Title: Machine Learning approach to apply big data analytics in DDoS forensics
Authors: Kian Son Hoon, Kheng Cher Yeo, Sami Azam, Bharanidharan Shanmugam, Friso De Boer
Description:
● DDoS (Distributed Denial of Service) attacks can bring down servers or web services.
● WEKA and H2O are the software used to apply the machine learning classifiers.
● The NSL-KDD dataset is used.
● A simulation was used to test the model.
● The experiments suggested that supervised learning algorithms performed better than unsupervised learning in DDoS forensics.
Advantages & Limitations:
● DDoS attacks can be detected, but complete protection is not possible.
● Botnets and zombie computers are widely used in DDoS attacks, and detecting them is challenging.

Methodology
● Static malware analysis
● Machine Learning Techniques
○ Classification
○ Clustering
● Dataset: M0Droid, containing 200 malicious Android applications
● Permission-based analysis
● Source-code-based analysis
● Ensemble Learning

RANDOM FOREST ALGORITHM
1. Randomly select “k” features from the total “m” features, where k << m.
2. Among the “k” features, calculate the node “d” using the best split point.
3. Split the node into daughter nodes using the best split.
4. Repeat steps 1 to 3 until “l” number of nodes has been reached.
5. Build the forest by repeating steps 1 to 4 “n” times to create “n” trees.
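The steps above can be sketched in plain Python as follows. This is a minimal illustration, not the implementation used in the surveyed papers. It assumes Gini impurity as the split criterion, a depth limit in place of the “l” node limit, bootstrap sampling of the rows for each tree, and majority voting at prediction time. The toy feature layout (permissions requested, sensitive API calls, sends SMS) is hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, features):
    """Best (score, feature, threshold) among the sampled features, by weighted Gini."""
    best = None
    for f in features:
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, threshold)
    return best


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def build_tree(rows, labels, k, rng, depth=0, max_depth=5):
    """Sample k features, split on the best one, then recurse on the daughter nodes."""
    if len(set(labels)) == 1 or depth == max_depth:
        return majority(labels)
    features = rng.sample(range(len(rows[0])), k)      # k of the m features
    split = best_split(rows, labels, features)          # best split point
    if split is None:
        return majority(labels)
    _, f, threshold = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= threshold]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > threshold]
    return (f, threshold,
            build_tree([r for r, _ in left], [y for _, y in left], k, rng, depth + 1, max_depth),
            build_tree([r for r, _ in right], [y for _, y in right], k, rng, depth + 1, max_depth))


def predict_tree(node, row):
    """Walk from the root to a leaf; internal nodes are tuples, leaves are labels."""
    while isinstance(node, tuple):
        f, threshold, left, right = node
        node = left if row[f] <= threshold else right
    return node


def random_forest(rows, labels, n_trees=25, k=1, seed=0):
    """Grow n_trees trees, each on a bootstrap sample of the rows."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx], k, rng))
    return forest


def predict(forest, row):
    """Majority vote over all trees."""
    return majority([predict_tree(tree, row) for tree in forest])


# Hypothetical toy data: (permissions requested, sensitive API calls, sends SMS).
rows = [(3, 1, 0), (4, 0, 0), (2, 2, 0), (5, 1, 0), (3, 0, 0),
        (20, 15, 1), (25, 12, 1), (22, 18, 1), (30, 10, 1), (27, 14, 1)]
labels = ["benign"] * 5 + ["malicious"] * 5
forest = random_forest(rows, labels)
print(predict(forest, (2, 0, 0)))
print(predict(forest, (28, 16, 1)))
```

Sampling only k features at each node keeps the trees different from one another, which is what lets the vote average out individual trees' mistakes.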

Figures (not reproduced):
● Single machine learning classifiers
● Permission-based clustering
● Ensemble learning for source-code-based analysis

Real Time Case Study
Tor Traffic Detection using Deep Learning

Feed Forward Neural Network
The dataset consists of network traffic captured in a PCAP file. It is fed
into the neural network, and the hyperparameters of the network are then
optimized.
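To make "feed forward" concrete, the sketch below runs one forward pass through a network with a single hidden layer. The weights are hand-picked and untrained, and the two input features are placeholders, so the output is not a real Tor/non-Tor score. Training and hyperparameter search are left out, and all names are hypothetical.

```python
import math


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def dense(inputs, weights, biases):
    """One fully connected layer: weights[j] holds the input weights of neuron j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def forward(features, params):
    """Input -> hidden layer (ReLU) -> single output neuron (sigmoid)."""
    hidden = relu(dense(features, params["w1"], params["b1"]))
    (logit,) = dense(hidden, params["w2"], params["b2"])
    return sigmoid(logit)


# Hand-picked, untrained parameters (assumption, for illustration only).
params = {
    "w1": [[0.5, -0.25], [-1.0, 1.0]],   # 2 inputs -> 2 hidden neurons
    "b1": [0.0, 0.0],
    "w2": [[1.0, -1.0]],                 # 2 hidden neurons -> 1 output
    "b2": [0.0],
}
score = forward([2.0, 4.0], params)
print(round(score, 4))
```

Hyperparameters such as the number of hidden neurons, the learning rate and the number of epochs are the values that get tuned once a training loop is added.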

Figure (not reproduced): TensorBoard-generated statistics depicting the
network training process.
Figure (not reproduced): results from different machine learning classifiers.

APPLICATIONS OF MACHINE LEARNING IN CYBER SECURITY

Machine Learning for Network Protection
IDS [Intrusion Detection System]
NTA [Network Traffic Analytics]
Examples:
● regression to predict the network packet
parameters and compare them with the normal
ones;
● classification to identify different classes of
network attacks such as scanning and spoofing;
● clustering for forensic analysis.
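The regression example can be sketched as follows: fit a line to normal traffic, predict a packet parameter for each new flow, and flag flows whose actual value is far from the prediction. The choice of features (packets per flow against bytes per flow), the three-sigma cut-off and the data are assumptions made for illustration.

```python
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def residual_threshold(xs, ys, slope, intercept, n_sigmas=3.0):
    """Allowed deviation: n_sigmas times the spread of the training residuals."""
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    return n_sigmas * statistics.pstdev(residuals)


def is_anomalous(x, y, slope, intercept, threshold):
    """Compare the observed value with the value predicted for normal traffic."""
    return abs(y - (slope * x + intercept)) > threshold


# Hypothetical normal traffic: packets per flow and bytes per flow.
packets = [10, 20, 30, 40, 50]
byte_counts = [5100, 9900, 15200, 19800, 25000]

slope, intercept = fit_line(packets, byte_counts)
threshold = residual_threshold(packets, byte_counts, slope, intercept)

print(is_anomalous(25, 12600, slope, intercept, threshold))
print(is_anomalous(25, 60000, slope, intercept, threshold))
```

The first flow sits close to the fitted line, while the second carries far more bytes than its packet count predicts.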

Machine Learning for Application Security
Web Application Firewall [WAF] Systems
No universal ML model can be developed to deal with all threats. Some
examples are listed below.
● regression to detect anomalies in HTTP requests
(for example, XXE and SSRF attacks and auth
bypass);
● classification to detect known types of attacks like
injections (SQLi, XSS, RCE, etc.);
● clustering user activity to detect DDoS attacks
and mass exploitation.

Machine Learning for User Behavior
Unlike malware detection, which focuses on common attacks for which a
classifier can be trained, user behavior is one of the most complex layers
and is largely an unsupervised learning problem.
● regression to detect anomalies in user actions
(e.g., login at an unusual time);
● classification to group different users for
peer-group analysis.
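The unusual-login example can be approximated with a very simple statistical baseline: compare each new login hour with the user's own history. This is a stand-in for a full regression model. The function name, the three-sigma cut-off and the sample hours are hypothetical, and hours are treated as plain numbers, which ignores the wrap-around at midnight.

```python
import statistics


def unusual_logins(history_hours, new_hours, n_sigmas=3.0):
    """Return the new login hours that lie far from this user's usual hours."""
    mean = statistics.fmean(history_hours)
    spread = statistics.pstdev(history_hours)
    return [h for h in new_hours if abs(h - mean) > n_sigmas * spread]


# Hypothetical history for one user who normally logs in around 09:00.
history = [9, 9, 10, 8, 9, 10, 9, 8]
print(unusual_logins(history, [9, 10, 3, 23]))
```

Peer-group analysis extends the same idea by comparing a user with the history of similar users instead of only their own.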

Disadvantages
● Like every force of automation, AI-based cybersecurity will bring in
questions of potential job loss or reassigning existing cybersecurity
personnel; this makes it a tricky affair for enterprise IT to manage.
● There’s always a sense of discomfort with the idea of trusting a purely
automated system, more so when it’s cybersecurity automation we’re
talking about. Enterprises need to manage this perceived loss of control.
● Purchasing advanced cybersecurity technologies can be difficult
because there are not many AI-based systems for purchase yet. This
creates risks of overspending.

Conclusion
Machine learning is definitely not a silver-bullet solution if you
want to protect your systems. Undoubtedly, there are many
issues with interpretability (particularly for deep learning
algorithms).
On the other hand, with the growing amount of data and the
decreasing number of experts, ML is the only remedy. It works
now and will be mandatory soon, so it is better to start right now.
Keep in mind that hackers are also starting to use ML in their
attacks.

Scope for Future Work
Cyber security spans many fields. Every day, as technology improves, new
vulnerabilities are discovered, and machine learning is also advancing in
all verticals.
The right tools are needed for enterprises to maintain their infrastructure
security and for people to safeguard themselves from these threats.

References:
[1] Abdulhamit Subasi, Esraa Molah, Fatin Almkallawi, Touseef J. Chaudhery, “Intelligent Phishing Website Detection using Random Forest Classifier”, 2017 International Conference on Electrical and Computing Technologies and Applications (ICECTA).
[2] Kian Son Hoon, Kheng Cher Yeo, Sami Azam, Bharanidharan Shanmugam, Friso De Boer, “Machine Learning approaches to apply big data analytics in DDoS forensics”, 2018 International Conference on Computer Communication and Informatics (ICCCI-2018), Jan. 04-06, 2018, Coimbatore, India.
[3] Nikola Milosevic, Ali Dehghantanha, Kim-Kwang Raymond Choo, “Machine Learning Aided Android Malware Classification”.
[4] https://towardsdatascience.com/machine-learning-for-cybersecurity-101-7822b802790b
[5] https://www.analyticsvidhya.com/blog/2018/07/using-power-deep-learning-cyber-security/
