A TRANSDUCTIVE SCHEME BASED
INFERENCE TECHNIQUES
FOR NETWORK FORENSIC
ANALYSIS
BY: AKSHAYA ARUNAN
M1 NE [IT]
GECBH
22-Jul-16 A TRANSDUCTIVE SCHEME BASED INFERENCE TECHNIQUES FOR NFA 1
OUTLINE
Objective
Introduction
Literature Survey
Proposed System
Conclusion
References
OBJECTIVE
To develop a Network Intrusion Forensics System based on a “transductive scheme” that can
detect and analyze computer crime efficiently
extract digital evidence
INTRODUCTION
Rapid development of network connectivity
Complexity and growth
Increase in the number of crimes
Connected systems are potential targets for malicious attacks
These attacks can affect:
physical or digital assets
funds
consumer confidence
national security
human life
Network Forensics
Goal: To discover the source of security breaches or other information assurance
problems [1].
Evidence is captured from networks
Interpretation is substantially based on knowledge of network attacks
Allows us to make forensic determinations based on the observed traffic [2]
LITERATURE SURVEY
tcpdump [4], [5]
Wireshark [5]
Artificial Neural Network [1]
Support Vector Machine [5], [6]
tcpdump
A free, open-source packet analyzer that runs from the command line.
Some of its functions:
Prints the contents of network packets
Displays TCP/IP and other packets being transmitted or received
Can read packets from a network interface card
Can write packets to standard output or a file
Wireshark
Wireshark is a free and open-source packet analyzer.
Wireshark is similar to tcpdump, but has a graphical front-end, plus some integrated sorting and filtering options.
It is used for
network troubleshooting
analysis
software and communications protocol development
education
Artificial Neural Network [1]
An ANN is an interconnected group of nodes, akin to the vast network of neurons in a brain.
It can be used to infer a function from observations.
Example applications: data processing, robotics, etc.
[Figure: neural network with INPUT, HIDDEN, and OUTPUT layers]
In the figure, each node represents an artificial neuron and an arrow represents a
connection from the output of one neuron to the input of another.
Support Vector Machine [5], [6]
Constructs a hyperplane or a set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification, regression, or other tasks.
A supervised learning model
Analyzes data and recognizes patterns
Hyperplane: a subspace of one dimension less than its ambient space
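As a rough illustration of the hyperplane idea, the sketch below classifies a point by the side of the hyperplane w·x + b = 0 on which it falls. The weights `w` and bias `b` are made-up values chosen for the example, not the output of a trained SVM, and the function name is hypothetical.

```python
from typing import Sequence


def hyperplane_side(w: Sequence[float], b: float, x: Sequence[float]) -> int:
    """Return +1 or -1 depending on which side of w.x + b = 0 the point x lies."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Hypothetical separating line x1 - x2 = 0 in a two-dimensional feature space.
w = [1.0, -1.0]
b = 0.0
print(hyperplane_side(w, b, [2.0, 1.0]))
print(hyperplane_side(w, b, [1.0, 2.0]))
```

In two dimensions the hyperplane is a line, which matches the definition above: one dimension less than the ambient space.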
Disadvantages
ANN and SVM:
They were designed to find features for network forensics
These methods are effective in reducing processing time
But they are insufficient for forensic analysis
tcpdump and Wireshark:
These tools are designed to help debug network problems, not specifically for forensic analysis
PROPOSED SYSTEM
First, we propose an efficient inference technique based on TCM-KNN [3]
It is much more effective than single or multiple traffic thresholds
Second, to boost the real-time network forensic performance of TCM-KNN, we use the simulated annealing (SA) algorithm [10]
It reduces the computational cost
It makes the system more suitable for real network environments
Transductive Confidence Machines for
K-Nearest Neighbors
Commonly used machine learning and data mining method
Effective in fraud detection, pattern recognition and outlier detection
The confidence measure used in TCM is based upon universal tests for
randomness or their approximation
Transductive scheme based
network forensic
We develop a network intrusion forensics system based on a transductive scheme (NIFSTC) that can efficiently detect and analyze
network crime, and
digital evidence
NIFSTC consists of the following components:
Network Traffic Capturer
Instance Selection and Feature Extractor
TCM-KNN Based Network Forensic Analyzer
Evidence Analyzer
[Figure: NIFSTC system architecture]
Traffic capturer
The first step of the NIFSTC system
Captures network traffic
Prepares for traffic analysis
Provides the base information for the other components of the forensics system
The traditional packet capture library, libpcap [4], provides implementation-independent access to the underlying packet capture facility
Problems with libpcap:
Under heavy traffic, captured data is transferred from the kernel to user processes through system calls and memory copies.
In a high-throughput network, the total number of CPU cycles this consumes is not negligible.
System overhead: too many memory copy operations consume a large amount of CPU and memory resources.
To improve the packet capture performance of NIFSTC, it is necessary to:
reduce the intermediate steps during packet transmission
bypass the OS kernel
eliminate the kernel's memory copies
Solution: an efficient user-level packet capture mechanism based on the semi-polling driven technique [7, 8].
With the semi-polling driven mechanism:
1) the interrupt frequency is lowered
2) processing performance for short messages is significantly improved
TCM-KNN based network forensic
analyzer
TCM-KNN effectively combines the TCM [9] and KNN algorithms.
In the KNN algorithm, we denote by $D_i^y$ the sorted sequence (in ascending order) of the distances of point $i$ from the other points with the same classification $y$.
In this paper, we use the Euclidean distance to calculate the distances between points.
We assign to every point a measure called the individual strangeness measure.
This measure defines the strangeness of the point in relation to the rest of the points.
In our case, the strangeness measure for a point $i$ belonging to the normal class is defined as:
$$\alpha_i = \sum_{j=1}^{k} D_{ij}^{y} \qquad (1)$$
$\alpha_i$ is the strangeness value computed for anomaly detection
$D_{ij}^{y}$ stands for the $j$th shortest distance in this sequence
$k$ is the number of neighbors used
The strangeness from Equation (1) is used to compute the p-value as follows:
$$p(\alpha_{new}) = \frac{\#\{i : \alpha_i \geq \alpha_{new}\}}{n+1} \qquad (2)$$
$\#$ denotes the cardinality of the set
$\alpha_{new}$ is the strangeness value for the test point
$n$ is the number of training points
The event that $\alpha_{new}$ is among the $j$ largest strangeness values occurs with probability at most $j/(n+1)$
The p-value (a non-universal test, after Proedrou et al.) measures how well the data support the null hypothesis; the smaller the p-value, the stronger the evidence against it
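The strangeness and p-value computations can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: every training point is taken to belong to the normal class, the training strangeness values are computed among the training points only, and the set in Equation (2) is assumed to range over the n training points. The function names are hypothetical.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def strangeness(point: Point, others: List[Point], k: int) -> float:
    """Equation (1): sum of the k shortest distances from point to others."""
    distances = sorted(euclidean(point, other) for other in others)
    return sum(distances[:k])


def p_value(train: List[Point], new_point: Point, k: int) -> float:
    """Equation (2): share of training points at least as strange as new_point."""
    n = len(train)
    # Strangeness of each training point relative to the remaining ones
    # (assumption: the test point is not added to the training set).
    alphas = [
        strangeness(p, train[:i] + train[i + 1:], k)
        for i, p in enumerate(train)
    ]
    alpha_new = strangeness(new_point, train, k)
    count = sum(1 for alpha in alphas if alpha >= alpha_new)
    return count / (n + 1)


# Hypothetical "normal" training points on a unit square.
normal = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
print(p_value(normal, (0.5, 0.5), k=2))    # close to the training data
print(p_value(normal, (10.0, 10.0), k=2))  # far from the training data
```

A test point far from the training data gets a large strangeness value, so few training points are at least as strange and its p-value is low. In practice the p-value would be compared against a chosen significance threshold; that threshold is not given in the slides.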
Feature extractor
Extracts features from the network traffic captured by the Traffic Capturer component.
A group of features is a data structure that characterizes network traffic.
The data structure used for network event analysis is the connection log.
Some of the secondary attributes are:
1) TCP flags
2) connection duration
3) volume of data passed in each direction
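One possible shape for a connection-log entry is sketched below. The slides name only the three secondary attributes; the address and port fields, the field names, and the choice of numeric features are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class ConnectionRecord:
    """One connection-log entry (hypothetical layout)."""
    src_ip: str                 # assumed primary attribute
    dst_ip: str                 # assumed primary attribute
    src_port: int               # assumed primary attribute
    dst_port: int               # assumed primary attribute
    tcp_flags: FrozenSet[str]   # secondary attribute 1: flags seen, e.g. {"SYN", "ACK"}
    duration_s: float           # secondary attribute 2: connection duration in seconds
    bytes_out: int              # secondary attribute 3: bytes sent by the source
    bytes_in: int               # secondary attribute 3: bytes sent by the destination

    def feature_vector(self) -> List[float]:
        """Numeric features that a distance-based analyzer could consume."""
        return [
            self.duration_s,
            float(self.bytes_out),
            float(self.bytes_in),
            1.0 if "SYN" in self.tcp_flags else 0.0,
            1.0 if "FIN" in self.tcp_flags else 0.0,
            1.0 if "RST" in self.tcp_flags else 0.0,
        ]


record = ConnectionRecord(
    "10.0.0.5", "10.0.0.9", 51514, 22,
    frozenset({"SYN", "ACK"}), 1.5, 1200, 3400,
)
print(record.feature_vector())
```

Since the analyzer relies on Euclidean distance, features on very different scales (seconds versus bytes) would normally be normalized first; the slides do not say how.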
Simulated annealing based
instance selection
A local search technique that simulates the physical process of annealing [10].
Deals with highly non-linear problems.
Begins with a random solution and, at each step of the process, searches the neighborhood of the current solution for the next one.
Moves are controlled by a probability function.
The acceptance of a downhill move depends on:
the reduction in the value of the objective function
how long the search has been running
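The slides do not name the probability function. A common choice is the Metropolis criterion, sketched here as an assumption: a worsening move is accepted with probability exp(delta / T), where delta is the (negative) change in the objective and T is a temperature that is lowered as the search proceeds. The function names and the cooling factor 0.95 are hypothetical.

```python
import math
import random


def acceptance_probability(current: float, candidate: float, temperature: float) -> float:
    """Probability of moving from current to candidate when maximizing."""
    if candidate >= current:
        return 1.0  # uphill or equal moves are always accepted
    delta = candidate - current  # negative for a downhill move
    return math.exp(delta / temperature)


def accept_move(current: float, candidate: float, temperature: float, rng=random) -> bool:
    """Decide stochastically whether to accept the candidate solution."""
    return rng.random() < acceptance_probability(current, candidate, temperature)


def cool(temperature: float, factor: float = 0.95) -> float:
    """Geometric cooling schedule (an assumed, commonly used schedule)."""
    return temperature * factor


# The same downhill move becomes less likely as the temperature drops.
print(acceptance_probability(2.0, 1.0, temperature=10.0))
print(acceptance_probability(2.0, 1.0, temperature=0.1))
```

Under this rule, both factors listed above appear directly: a larger drop in the objective lowers the acceptance probability, and so does a longer search, because the temperature has cooled.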
Selects the instances that contribute most and omits the useless ones.
To apply SA, two important problems should be addressed:
Specification of the representation of the solutions
Definition of the fitness function
1) Representation:
Training dataset: TR, a set of instances.
The search space associated with instance selection on TR consists of the subsets of TR.
Chromosomes represent subsets of TR using a binary representation.
A chromosome consists of genes with two possible states: 0 and 1.
If a gene is 1, its associated instance is included in the subset of TR represented by the chromosome.
If it is 0, the instance is not included.
Result: the selected chromosome gives the reduced training dataset for TCM-KNN.
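The binary representation can be sketched as a list of 0/1 genes, one per training instance. The single-bit-flip neighborhood used here is an assumption (the slides do not specify how neighboring solutions are generated), and the function names are hypothetical.

```python
import random
from typing import List, Sequence


def decode(chromosome: Sequence[int], training_set: Sequence) -> list:
    """Return the subset of TR whose genes are set to 1."""
    return [inst for gene, inst in zip(chromosome, training_set) if gene == 1]


def neighbor(chromosome: Sequence[int], rng=random) -> List[int]:
    """Return a copy of the chromosome with one randomly chosen gene flipped."""
    flipped = list(chromosome)
    i = rng.randrange(len(flipped))
    flipped[i] = 1 - flipped[i]
    return flipped


# Hypothetical training set of three labeled instances.
TR = ["inst_a", "inst_b", "inst_c"]
print(decode([1, 0, 1], TR))
```

Flipping a gene from 1 to 0 drops an instance from the reduced training set, and flipping from 0 to 1 adds one back, so each neighbor differs from the current subset by exactly one instance.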
2) Fitness function:
Let S be a subset of instances of TR to evaluate, coded by a chromosome x.
Three measures are considered:
True positive (TP) rate
False positive (FP) rate
Percentage of training dataset reduction
Thus, the fitness function combines three values:
the detect_rate and its associated fal_rate
the reduce_rate of the instances of S with respect to TR
$$F(x) = C \cdot (\mathit{detect\_rate} - \mathit{fal\_rate}) + (1 - C) \cdot \mathit{reduce\_rate} \qquad (3)$$
$$\mathit{reduce\_rate} = \frac{|TR| - |S|}{|TR|} \times 100 \qquad (4)$$
|TR|: the number of instances in the original training dataset
|S|: the number of instances in the reduced training dataset obtained using SA
C: an adjustment constant set by experience
The objective of SA is to maximize the fitness function defined above, that is, to:
maximize the detection rate
minimize the FP rate and the number of instances retained
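Equations (3) and (4) translate directly into code. In this sketch the detection and false-alarm rates are assumed to be percentages, so that they are on the same scale as reduce_rate, and the default C = 0.5 is a placeholder, since the slides only say that C is set by experience.

```python
def reduce_rate(tr_size: int, s_size: int) -> float:
    """Equation (4): percentage of the training dataset removed."""
    return (tr_size - s_size) / tr_size * 100.0


def fitness(detect_rate: float, fal_rate: float,
            tr_size: int, s_size: int, c: float = 0.5) -> float:
    """Equation (3): weighted mix of detection quality and dataset reduction."""
    return c * (detect_rate - fal_rate) + (1.0 - c) * reduce_rate(tr_size, s_size)


# Hypothetical numbers: 98% detection, 2% false alarms, 1000 -> 250 instances.
print(reduce_rate(1000, 250))
print(fitness(98.0, 2.0, 1000, 250, c=0.5))
```

Raising C favors detection quality over dataset size; lowering it favors a smaller training set and therefore cheaper TCM-KNN distance computations.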
Evidence analyzer
Can connect distant and incomplete abnormal events
A set of evidence-analysis utilities examines different aspects of correlated events efficiently
These utilities are integrated into the NIFSTC system
Evidence analyzer uses two work modes:
1) count mode or
2) weighted analysis mode
The evidence analyzer produces an undirected evidence graph
Nodes: attribute values
Node size: the weight of the attribute value
Edges: a relationship between two attribute values
An evidence graph is shown in the figure.
[Figure: Evidence graph]
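A rough sketch of how such a graph could be built in count mode follows. The slides do not define the two work modes in detail, so the rules here are assumptions: each node is an (attribute, value) pair weighted by how many abnormal events contain it, and an undirected edge links two values that appear in the same event. The function name and the sample events are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Iterable, Tuple


def build_evidence_graph(events: Iterable[Dict[str, str]]) -> Tuple[Counter, Counter]:
    """Return (node_weight, edge_weight) counters for an undirected evidence graph."""
    node_weight: Counter = Counter()
    edge_weight: Counter = Counter()
    for event in events:
        # Sorting by attribute name gives each undirected edge one canonical key.
        values = sorted(event.items())
        node_weight.update(values)
        for a, b in combinations(values, 2):
            edge_weight[(a, b)] += 1
    return node_weight, edge_weight


# Hypothetical abnormal events flagged by the analyzer.
events = [
    {"src": "10.0.0.5", "dst": "10.0.0.9"},
    {"src": "10.0.0.5", "dst": "10.0.0.7"},
]
nodes, edges = build_evidence_graph(events)
print(nodes[("src", "10.0.0.5")])
```

A weighted analysis mode could replace the increment of 1 with a per-event weight; how NIFSTC assigns those weights is not stated in the slides.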
CONCLUSION
TCM-KNN is a modern and precise algorithm for detecting network crimes and analyzing forensic data.
The evidence analyzer reports the number of pieces of evidence and their corresponding weighted values.
REFERENCES
1) S. Mukkamala, A. Sung, "Identifying significant features for network forensic analysis using artificial intelligent techniques", Int'l Journal of Digital Evidence [2003]
2) M. I. Cohen, "PyFlag: An advanced network forensic framework", Digital Investigation (Elsevier Journal) [2008]
3) Y. Li, L. Guo, "An active learning based TCM-KNN algorithm for supervised network intrusion detection", Computers & Security (Elsevier Journal) [2007]
4) Libpcap, http://www.tcpdump.org/release/libcap-0.7.2.tar.gz [2002]
5) Wikipedia, www.wikipedia.com
6) E. Eskin, A. Arnold, M. Prerau, L. Portnoy, S. Stolfo, "A geometric framework for unsupervised anomaly detection: detecting intrusions in unlabeled data", in D. Barbara and S. Jajodia (Eds.), Applications of Data Mining in Computer Security, Kluwer [2002]
7) ZH Tian, BX Fang, XC Yun, "User-level message passing mechanism based on semi-polling driven in RTLinux", Journal of Software [2004]
8) ZH Tian, MZ Hu, B Li, "Semi-Polling Based Interrupt Mitigation for High Performance Packet Processing", High Technology Letters [2005]
9) A. Gammerman, V. Vovk, "Prediction algorithms and confidence measure based on algorithmic randomness theory", Theoretical Computer Science [2002]
10) E. Aarts and van Laarhoven, "Simulated annealing: A pedestrian review of the theory and some applications", in J. Kittler and P.A. Devijver (Eds.), Pattern Recognition and Applications, Springer-Verlag, Berlin [1987]

