Network Intrusion Detection
By: Jack Song, Julina Zhang, Kerry Jones
Advisors: Dr. Don Brown, Dr. Hyojung Kang, Dr. Malathi Veeraraghavan
Client: UVA Information Security, Policy, and Records Office (ISPRO)
Sponsors: UVA SEAS / Leidos
Agenda
● Team Members
● Project Objectives
● Progress to Date
● Deliverables
● Potential Sponsors
Team Members - Data Science Institute
Jack Song
● Majored in Computer Science at UVA
Julina Zhang
● Majored in Statistics and Economics at UVA
Kerry Jones
● Majored in Government and Geography at UMD
Team Members - Advisors
Dr. Donald E. Brown
● Director of the Data Science Institute
● Dept. of Systems and Information Engineering
Dr. Malathi Veeraraghavan
● Dept. of Electrical & Computer Engineering
Dr. Hyojung Kang
● Dept. of Systems and Information Engineering
Team Members
Jason Belford
● Chief Information Security Officer
Jeff Collyer
● Information Security Engineer
Team Members
Sourav Maji
● Third-year PhD student in Computer Engineering
Ron Hutchins
● Vice President for Information Technology
Objectives
● Detect anomalous traffic leaving the UVA network using machine learning and data mining.
● Develop a network intrusion detection prototype.
Agenda
● Team Members
● Project Objectives
● Progress to Date
● Deliverables
● Potential Sponsors
Background - Approaches
● Lancope StealthWatch
● Previous approaches
○ Density-based Spatial Clustering of Applications with Noise (Erman, Arlitt, Mahanti)
○ K-Means Clustering (Erman, Arlitt, Mahanti)
○ One-class Support Vector Machine (Locke, Wang, Paschalidis)
○ Neural Network (Locke, Wang, Paschalidis)
○ Hierarchical Clustering (Ling, Rosti, Swanson)
○ Isolation Forest (Liu, Ting, Zhou)
● Our approach
○ Isolation Forest: an unsupervised learning method that uses a tree structure to isolate anomalies.
Our progress at a glance
(Timeline figure: progress over the course of time, in three phases.)
Initial data
- ISPRO
- Preprocessing
- Wireshark
- Filtering
Filtered data
- Unsupervised methods
- Isolation Forest
- Didn’t work out well
NetFlow data
- Collection server
- Power Edge
- TShark
- Conversation data
- Better ‘Unit’
- Preliminary results
Initial data phase
Data from ISPRO
+ Data preprocessing
+ Data filtering by source IPs within the UVA network
Result: a subset of the packet capture data covering all connections initiated within the UVA network
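The source-IP filtering step can be sketched roughly as follows. This is a minimal illustration, not the team's script: the prefix in `CAMPUS_NETWORKS` is a documentation range standing in for the real campus address space, and the packet dictionaries with `src`, `dst`, and `length` keys are an assumed record layout.

```python
import ipaddress

# Hypothetical campus prefix (a documentation range), NOT the real UVA allocation.
CAMPUS_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def is_internal(ip: str) -> bool:
    """Return True if the address falls inside any campus prefix."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in CAMPUS_NETWORKS)


def filter_outbound(packets):
    """Keep only packets whose source address is inside the campus network."""
    return [p for p in packets if is_internal(p["src"])]


# Assumed record layout: one dict per packet.
packets = [
    {"src": "192.0.2.10", "dst": "198.51.100.7", "length": 120},
    {"src": "198.51.100.7", "dst": "192.0.2.10", "length": 900},
    {"src": "192.0.2.44", "dst": "203.0.113.5", "length": 60},
]
outbound = filter_outbound(packets)
print(len(outbound), "of", len(packets), "packets have an internal source")
```

Filtering on the source address keeps traffic sent from inside the network; it only approximates "connections initiated within the network", since replies from internal servers to outside clients also have an internal source.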
Initial data phase: preprocessing
ISPRO data (1 TB)
→ One pcap file (50 GB per 6 min)
→ Wireshark / TShark: .pcap → .csv (50 GB → 5 GB)
→ Python script
→ Summary statistics; algorithms
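One way the .pcap → .csv step could be driven from Python is sketched below. The `-r`, `-T fields`, `-e`, and `-E` options are standard TShark options, but the choice of fields and the file name `capture.pcap` are assumptions for illustration, not the team's documented configuration.

```python
import shlex

# Standard Wireshark field names; this particular selection is an assumption.
FIELDS = ("frame.time_epoch", "ip.src", "ip.dst", "frame.len")


def build_tshark_command(pcap_path, fields=FIELDS):
    """Build a tshark command that prints the chosen fields as CSV rows."""
    cmd = ["tshark", "-r", pcap_path, "-T", "fields"]
    for field in fields:
        cmd += ["-e", field]
    # Emit a header row and use commas between columns.
    cmd += ["-E", "header=y", "-E", "separator=,"]
    return cmd


cmd = build_tshark_command("capture.pcap")
print(shlex.join(cmd))
```

To actually run the conversion, the list can be passed to `subprocess.run` with standard output redirected to a .csv file. Exporting only a few fields per packet is what shrinks the data so sharply compared with the raw capture.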
Filtered data phase
Started from the result of the previous phase
Created source-destination IP pairs
Calculated packet frequency and mean packet length for each pair
+ Isolation Forest
Provided an initial view, but more is needed.
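The per-pair features described above can be computed along these lines. This is a sketch under assumptions: the `src`, `dst`, and `length` keys are an illustrative record layout, and "length" is taken to mean packet length in bytes.

```python
from collections import defaultdict


def pair_features(packets):
    """Return {(src, dst): (frequency, mean_length)} for a list of packets."""
    counts = defaultdict(int)
    total_len = defaultdict(int)
    for p in packets:
        key = (p["src"], p["dst"])
        counts[key] += 1
        total_len[key] += p["length"]
    return {k: (counts[k], total_len[k] / counts[k]) for k in counts}


# Made-up sample packets for illustration only.
packets = [
    {"src": "192.0.2.10", "dst": "198.51.100.7", "length": 100},
    {"src": "192.0.2.10", "dst": "198.51.100.7", "length": 300},
    {"src": "192.0.2.44", "dst": "203.0.113.5", "length": 60},
]
features = pair_features(packets)
for pair, (freq, mean_len) in features.items():
    print(pair, freq, mean_len)
```

Each pair then becomes one two-dimensional observation (frequency, mean length) that an anomaly detector such as Isolation Forest can score.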
Filtered data phase: what we’ve learned
Packet capture data captures ONLY individual packets
+ Need to capture the entire user session
Need NetFlow records
NetFlow data phase -- Now
● Setting up a collection server
○ Power Edge
● Conversation data & TShark
● Better ‘Unit of comparison’
○ include port number
● Preliminary analysis
Summary Statistics
Count: 157,313
Unique Source IPs: 11,514
Unique Destination IPs: 13,113
Unique Destination Ports: 1,631
Unique Source Ports: 48,925
Average Duration: 31 seconds
Average Packets, Source to Destination: 34
Average Packets, Destination to Source: 31
Average Bytes, Source to Destination: 10,172
Average Bytes, Destination to Source: 58,134
Top Five Most Frequently Used Destination Ports
Destination Port    Count     Number of Unique Source IP pairs
80 (HTTP)           66,390    11,238
443 (HTTPS)         38,422    954
25 (SMTP)           24,277    39
6                   20,387    1
3                   957       2
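A table like this one can be derived from conversation records roughly as follows. The sketch assumes each record is a dictionary with `src` and `dst_port` keys and that the last column counts distinct source addresses per port; both are assumptions, and the sample records are made up.

```python
from collections import Counter, defaultdict


def top_destination_ports(records, n=5):
    """Return [(port, count, unique_sources)] for the n most frequent ports."""
    counts = Counter()
    sources = defaultdict(set)
    for r in records:
        counts[r["dst_port"]] += 1
        sources[r["dst_port"]].add(r["src"])
    return [(port, c, len(sources[port])) for port, c in counts.most_common(n)]


# Made-up conversation records for illustration only.
records = [
    {"src": "192.0.2.10", "dst_port": 80},
    {"src": "192.0.2.11", "dst_port": 80},
    {"src": "192.0.2.10", "dst_port": 80},
    {"src": "192.0.2.10", "dst_port": 443},
]
top = top_destination_ports(records, n=2)
for port, count, unique_sources in top:
    print(port, count, unique_sources)
```

A port with a high count but very few distinct sources, as in the last two rows of the table, stands out immediately in this view and is a natural candidate for closer inspection.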
NetFlow data phase, next steps
● Finish setting up Power Edge
○ Shell script
○ Cron job
■ Automation of daily data collection
● Go into specifics, “symptoms”
○ DNS tunneling
○ Phishing
Identified Cyber Security Needs
● Identifying anomalous behavior in traffic leaving the UVA network
○ Source data: NetFlow records
○ Traffic from hosts with static public IP addresses
● DNS Tunneling
○ Data theft using port 53 as a pathway
● Phishing Attack
○ Attackers obtain sensitive information by posing as a trusted party and baiting users.
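One simple symptom of DNS tunneling is a host that sends an unusually large volume of data to port 53. The sketch below totals outbound bytes to port 53 per source and flags sources above a cutoff. The record keys (`src`, `dst_port`, `bytes_out`) and the value of `BYTES_THRESHOLD` are hypothetical; a real cutoff would have to be tuned on observed traffic.

```python
from collections import defaultdict

DNS_PORT = 53
# Hypothetical threshold for illustration; not derived from the project's data.
BYTES_THRESHOLD = 1_000_000


def dns_bytes_by_source(flows):
    """Total bytes each source sent to destination port 53."""
    totals = defaultdict(int)
    for f in flows:
        if f["dst_port"] == DNS_PORT:
            totals[f["src"]] += f["bytes_out"]
    return dict(totals)


def suspicious_dns_sources(flows, threshold=BYTES_THRESHOLD):
    """Sources whose total outbound DNS volume exceeds the threshold."""
    totals = dns_bytes_by_source(flows)
    return sorted(src for src, total in totals.items() if total > threshold)


# Made-up flow records for illustration only.
flows = [
    {"src": "192.0.2.10", "dst_port": 53, "bytes_out": 400},
    {"src": "192.0.2.44", "dst_port": 53, "bytes_out": 1_500_000},
    {"src": "192.0.2.44", "dst_port": 53, "bytes_out": 1_000_000},
    {"src": "192.0.2.10", "dst_port": 443, "bytes_out": 5_000_000},
]
flagged = suspicious_dns_sources(flows)
print(flagged)
```

Volume is only one possible indicator; a fixed threshold is used here purely to keep the example short.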
Challenges
1. Domain knowledge
2. Size of data
a. 36 min of data, approx. 270 GB
3. IP addresses
a. Dynamic vs. Static
b. Private vs. Public
4. Unlabeled data → unsupervised learning
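The private-versus-public distinction in item 3 can be checked with Python's standard `ipaddress` module, as in this small sketch (the sample addresses are arbitrary). Whether an address is static or dynamic cannot be read from the address itself; that would need assignment records such as DHCP logs, which are not covered here.

```python
import ipaddress


def classify(ip: str) -> str:
    """Label an address as 'private' or 'public' using the standard library."""
    return "private" if ipaddress.ip_address(ip).is_private else "public"


# Arbitrary sample addresses for illustration only.
samples = ["10.0.0.5", "172.16.3.4", "192.168.1.20", "8.8.8.8"]
labels = {ip: classify(ip) for ip in samples}
for ip, label in labels.items():
    print(ip, label)
```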
Deliverables
● Paper
● Network intrusion detection prototype
● Shell script
Potential Sponsors
● NSF Cybersecurity Innovation for Cyberinfrastructure (CICI)
● NSF Secure and Trustworthy Cyberspace (SaTC) programs
● DHS CyberSecurity Division programs
● DOE Cybersecurity for Energy program
● Industry, specifically NTT Labs and Cisco
References
1. Ashfaq, Rana Aamir Raza, et al. "Fuzziness Based Semi-Supervised Learning Approach for Intrusion Detection System." Information Sciences (2016).
2. Fung, Carol, and Raouf Boutaba. Intrusion Detection Networks. CRC Press, 2013.
3. Fung, Carol, and Raouf Boutaba. Intrusion Detection Networks: A Key to Distributed Security. CRC Press, 2013.
4. Erman, Jeffrey, Martin Arlitt, and Anirban Mahanti. "Traffic Classification using Clustering Algorithms." Proceedings of the 2006 SIGCOMM workshop on Mining network data. ACM, 2006. 281-286.
5. Farnham, Greg. "Detecting DNS Tunneling." SANS Institute InfoSec Reading Room, 2013.
6. Grimes, Robert. Detect network anomalies with StealthWatch. 2014. IDG. 2016. <http://www.infoworld.com/article/2848768/security/detect-network-anomalies-with-stealthwatch.html>.
7. Locke, R., J. Wang, and I. Paschalidis. "Anomaly Detection Techniques for Data Exfiltration Attempts." Boston University Center for Information and Systems Engineering, 2012.
8. Sommer, Robin, and Vern Paxson. "Outside the Closed World: On using Machine Learning for Network Intrusion Detection." 2010 IEEE symposium on security and privacy (2010).
9. Ling, Yuning, Marcus Rosti, and Gregory Swanson. "A Hands-off Approach to Network Intrusion Detection." IEEE Systems and Information Engineering Design Conference (SIEDS). Charlottesville: IEEE, 2016. 216-220.
10. Liu, Fei Tony, Kai Ming Ting, and Zhi-Hua Zhou. "Isolation-based anomaly detection." ACM Transactions on Knowledge Discovery from Data (TKDD) 6.1 (2012): 3.
Isolation Forest
• Unsupervised learning method
• Builds an ensemble of iTrees for a given data set.
• Anomalies are the observations with the shortest average path length from the root node.
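The path-length idea can be illustrated with a small pure-Python sketch. It is a simplified version, not the full algorithm from Liu, Ting, and Zhou: every tree is built on the whole data set (no subsampling), leaves get no size-based path adjustment, and raw average depth is compared instead of a normalized anomaly score. The data is synthetic and all function names are illustrative.

```python
import random


def build_tree(points, depth, max_depth, rng):
    """Recursively split points on a random feature at a random value."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}  # leaf
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        # All values are equal on the chosen feature; stop (a simplification).
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, point, depth=0):
    """Number of splits passed before the point reaches a leaf."""
    if "dim" not in tree:
        return depth
    branch = "left" if point[tree["dim"]] < tree["split"] else "right"
    return path_length(tree[branch], point, depth + 1)


def average_path_length(forest, point):
    """Mean depth at which the point lands across all trees."""
    return sum(path_length(t, point) for t in forest) / len(forest)


rng = random.Random(0)
# Synthetic data: 200 points clustered near the origin plus one far-away point.
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
outlier = (10.0, 10.0)
data.append(outlier)

forest = [build_tree(data, 0, max_depth=10, rng=rng) for _ in range(100)]
outlier_depth = average_path_length(forest, outlier)
inlier_depth = average_path_length(forest, (0.0, 0.0))
print("outlier:", outlier_depth, "inlier:", inlier_depth)
```

The far-away point is separated from the rest after only a few random splits, while a point in the dense cluster needs many more, which is why a short average path marks an observation as anomalous.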
Preliminary Results of iForest