Introduction Motivation & Related Work Methodology Analysis Results Discussion Final Remarks
Large Scale Studies: Malware Needles in a
Haystack
Giovanni Bertão1,3, Marcus Botacin2, André Grégio2, Paulo
Lício de Geus1
1University of Campinas (UNICAMP)
{bertao, paulo}@lasca.ic.unicamp.br
2Federal University of Parana (UFPR)
{mfbotacin, gregio}@inf.ufpr.br
3PIBIC-CNPq scholarship holder
Large Scale Studies: Malware Needles in a Haystack SBSeg’18
Topics
1 Introduction
2 Motivation & Related Work
3 Methodology
4 Analysis Results
5 Discussion
6 Final Remarks
Topics
1 Introduction
Analysis Type: Coarse-grained and Fine-grained
2 Motivation & Related Work
Motivation
Related Work
3 Methodology
Dataset
Processing the Data
Architecture
4 Analysis Results
Coarse-Grained
Fine-Grained
5 Discussion
Coarse-Grained
Fine-Grained
6 Final Remarks
Conclusion
Acknowledgments
Questions
Malware Analysis Approaches
Coarse-Grained
Highlight major aspects.
Discard sample details.
Fine-Grained
Focus on implementation details.
Do not state the risk such a sample poses in the overall scenario.
Motivation
Malware Studies
Figure: Coarse-Grained
BBC: https://www.bbc.com/news/technology-39730407
Figure: Fine-Grained
Trend Micro: https://bit.ly/2PaSPDC
Related Work
Bayer (Windows)
A view on current malware behaviors.
Lindorfer (Android)
Andrubis – 1,000,000 apps later: A view on current android
malware behaviors.
Dataset
Building the Dataset
Dataset Composition
Malware repositories and blacklists crawled daily.
135,000 unique malware samples collected from the Malshare
database.
Only Windows samples.
Samples submitted to static, dynamic and network analysis
procedures.
Processing the Data
Analysis Method
Static Analysis
Presence of packers.
Anti-analysis techniques.
Anti-virus detection.
Dynamic and Network Analysis
Logs from BehEMOT, an internal sandbox solution.
Architecture
Large Scale Support
Figure: Parallel Processing Architecture. Samples are independently
analyzed and their results are stored on a centralized database.
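The architecture in the figure can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the worker function `analyze_sample`, the `results` table, and the use of a thread pool with an in-memory SQLite database are all assumptions made for the example. Workers process samples independently, and a single writer stores every result in one central database.

```python
import hashlib
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def analyze_sample(sample: bytes) -> tuple[str, int]:
    """Hypothetical stand-in for one analysis job: returns (md5, size)."""
    return hashlib.md5(sample).hexdigest(), len(sample)


def run_pipeline(samples: list[bytes], workers: int = 4) -> sqlite3.Connection:
    """Analyze samples in parallel and store the results centrally."""
    # Assumption: an in-memory SQLite database plays the central store.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE results (md5 TEXT PRIMARY KEY, size INTEGER)")
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each sample is analyzed independently of the others; only the
        # main thread writes to the database.
        for md5, size in pool.map(analyze_sample, samples):
            db.execute(
                "INSERT OR IGNORE INTO results VALUES (?, ?)", (md5, size)
            )
    db.commit()
    return db
```

Keeping all writes in one thread avoids concurrent access to the database connection, and the primary key on the hash keeps each unique sample stored only once.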
Time Elapsed
[Plot: Tuples x Time. x-axis: Hours (0 to 36); y-axis: Tuples (0M to 60M).]
Figure: Parallel Processing Time. The complete analysis took 36 hours.
Coarse-Grained
Malicious Behavior (Static Analysis)
[Bar chart: Identified Malicious Behaviors. y-axis: Percentage (%), 0% to 100%. Bars: Termination, Timing, Performance, Process, Window, Fingerprinting, Modularization, Removal, AntiDebug.]
Figure: Identified Malicious Behaviors. We identified multiple distinct
malicious behaviors during sample execution, such as evading AV
analysis through Timing delays, measuring the Performance overhead
caused by monitoring, and the Termination of security solutions to
avoid detection.
Prevalent Signatures
Table: Most Prevalent Signatures. Compilers are more prevalent than
packers.
Signature        Type        Occurrence (%)
Microsoft        Compiler    37.95%
Nullsoft PIMP    Installer   25.51%
Borland Delphi   Compiler    15.06%
UPX              Packer      4.23%
MSLHR            Packer      2.25%
PEcompact        Packer      1.66%
Malicious Behavior (Dynamic Analysis)
Table: Dynamic Analysis. Identified Malicious Behaviors.
Subsystem   Operation                Samples (%)   Target              Samples (%)
File        Create Files             91.56%        Internet Explorer   10.14%
            Read Files               89.18%        .DLL                86.80%
                                                   Internet Explorer   7.01%
                                                   .SYS                1.26%
            Write Files              81.74%        .EXE                46.20%
                                                   .DLL                31.62%
                                                   Internet Explorer   <0.01%
                                                   Host                0.00%
            Delete Files             62.45%        Internet Explorer   0.00%
Process     Create Process           22.84%
            Delete Process           23.38%
Registry    Set Registry Values      74.73%        Proxy               68.36%
                                                   Autorun             5.66%
            Delete Registry Values   55.43%
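Percentages like those in the table can be derived from per-sample behavior logs. The sketch below assumes each sample has already been reduced to the set of operations seen in its trace; the operation names and the input layout are hypothetical and do not reflect the BehEMOT log format.

```python
from collections import Counter


def operation_prevalence(sample_ops: dict[str, set[str]]) -> dict[str, float]:
    """Return, for each operation, the percentage of samples performing it."""
    total = len(sample_ops)
    if total == 0:
        return {}
    # Each sample counts at most once per operation because its
    # operations are stored as a set.
    counts = Counter(op for ops in sample_ops.values() for op in ops)
    return {op: 100.0 * n / total for op, n in counts.items()}
```

Counting samples rather than raw events is what makes the figures comparable across samples with very different trace lengths.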
Anti-Analysis Techniques
[Bar chart: Identified Anti-Analysis Techniques. y-axis: Percentage (%), 0% to 80%. Bars: getlasterror, unhandledexceptionfilter, raiseexception, terminateprocess, isdebuggerpresent, isprocessorfeaturepresent, findwindowexa, vmcheck.dll, bochs & qemu cpuid, vmware, virtual box.]
Figure: Identified Anti-Analysis Techniques. Samples employ anti-analysis
techniques to avoid being inspected during sandbox execution. We
identified techniques aimed at detecting the presence of debuggers and
virtual machines.
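One simple way to flag such techniques statically is to look for telltale strings inside the binary. The sketch below is only an illustration under that assumption; the slides do not state which detection method was used. The indicator strings come from the chart labels, while their grouping into two categories is a choice made for this example.

```python
# Assumed indicator strings, taken from the chart labels above.
INDICATORS = {
    "debugger": [b"isdebuggerpresent"],
    "virtual machine": [b"vmcheck.dll", b"vmware"],
}


def find_indicators(data: bytes) -> dict[str, list[str]]:
    """Return the indicator strings found in a binary, grouped by category."""
    lowered = data.lower()  # match case-insensitively
    hits: dict[str, list[str]] = {}
    for category, needles in INDICATORS.items():
        found = [n.decode() for n in needles if n in lowered]
        if found:
            hits[category] = found
    return hits
```

Plain substring matching misses packed or obfuscated samples, so a result of this kind is a lower bound rather than a full picture.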
Network Analysis — Protocols Distribution
[Bar chart: Protocol distribution by samples. y-axis: Samples (%), 0% to 100%. Bars: HTTP/Non-TCP, HTTP/TCP, UDP/Non-DNS, UDP/DNS.]
Figure: Protocol usage distribution by sample. HTTP over TCP and
DNS over UDP are the prevalent communication channels.
Network Analysis — Domains Contacted
Table: Most Contacted Domains. We observe the presence of cloud
providers among the most accessed domains.
Domain               Accesses (%)
Cloudfront           4.39%
Amazonaws            3.32%
Kirov                3.20%
Kerch                3.11%
Comcast              1.75%
Akamaitechnologies   1.48%
Sbcglobal            1.10%
Broadband            1.08%
Network Analysis — TLDs
[Bar chart: Contacted TLDs. y-axis: Access (%), 0% to 35%. Bars: net, com, ru, de, sc, jp, eu, br, fr, biz.]
Figure: Most Accessed TLDs distribution. Generic domains are
prevalent and country-specific domains are well distributed, thus showing
that malware creators are ready to exploit vulnerabilities in multiple
countries.
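A TLD distribution like the one in the figure can be computed from the list of contacted host names. The following is a minimal sketch; the function name and the input format (one fully qualified name per access) are assumptions made for this example.

```python
from collections import Counter


def tld_distribution(hosts: list[str]) -> dict[str, float]:
    """Return the percentage of accesses per top-level domain."""
    tlds = []
    for host in hosts:
        name = host.rstrip(".").lower()  # drop the trailing root dot
        if "." in name:
            tlds.append(name.rsplit(".", 1)[-1])
    counts = Counter(tlds)
    total = sum(counts.values())
    # most_common() orders the result from most to least accessed.
    return {tld: 100.0 * n / total for tld, n in counts.most_common()}
```

Note that this takes only the last label, so a name under `co.uk` is counted under `uk`.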
Network Analysis — DNS Resolvers
[Bar chart: DNS queries distribution. y-axis: Access (%), logarithmic scale from 0.01% to 100%. Bars: google, opendns, d0wn, emerion, as43289, root-servers, polyram-group, jp, 114dns, cryptostorm.]
Figure: DNS Resolvers Distribution. Default sandbox DNS (google)
was the most used one.
Fine-Grained
Keys Modification
Table: Modified AutoRun Keys. Whereas coarse-grained analysis
identified that ≈ 5.66% of samples write to an Autorun key, the
fine-grained analysis specified their location.
Key                                                           Samples (%)
HKCU\ID\Software\Microsoft\Windows\CurrentVersion\Run         46.33%
HKCU\.DEFAULT\SOFTWARE\Microsoft\Windows\CurrentVersion\Run   <0.01%
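The two views of Autorun activity can be derived from the same registry-write log. The sketch below assumes a mapping from each sample to the registry keys it wrote. The input layout, the function name, and the suffix test used to recognize a Run key are all illustrative assumptions.

```python
from collections import Counter

# Assumed marker of an Autorun location: the standard Windows Run key path.
RUN_SUFFIX = r"\software\microsoft\windows\currentversion\run"


def autorun_breakdown(writes: dict[str, list[str]]) -> tuple[float, Counter]:
    """Return (share of samples writing a Run key, samples per Run key)."""
    per_key: Counter = Counter()
    autorun_samples = 0
    for keys in writes.values():
        # Normalize case and keep only keys ending in the Run path.
        run_keys = {k.lower() for k in keys if k.lower().endswith(RUN_SUFFIX)}
        if run_keys:
            autorun_samples += 1
            per_key.update(run_keys)
    share = 100.0 * autorun_samples / len(writes) if writes else 0.0
    return share, per_key
```

The first value corresponds to the coarse-grained figure (how many samples touch any Autorun key) and the counter to the fine-grained one (which keys they touch).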
Contacted IPs
Table: Contacted IPs by Sample. Coarse-grained analysis showed that,
on average, each sample contacts 2 distinct IP addresses. Fine-grained
analysis revealed that ransomware samples which spread through
scanning contact many more IPs.
MD5 Hash                           Number of Distinct IPs   Label
c1abb496deb7bd51a4ad2f8a43113b13   16386                    Ransomware.Cerber
bc88096e7cc09f02f11deec35f84d5cd   16385                    Ransomware.AWA
a801cdef09a61d3ba7969015a8bffec0   1                        Ransomware.VirLock
HTTP requests
Table: Prevalent HTTP Requests. Coarse-grained analysis shows that
HTTP payloads dominate TCP traffic. Fine-grained analysis shows that
this is due to Downloader samples.
MD5 Hash                           Distinct HTTP Requests   Label
ede13f40a96a8b6e5de1029200c0b15e   394                      Downloader
e5f4116d08c343623d5ee3af5553cbee   353                      Downloader
47a328b0b903bb68147facc3a084172c   310                      Downloader
28c4e2a48d9ddfffa01a943ca1ba1262   304                      Downloader
DNS Resolvers
Table: Distribution of DNS Queries. Coarse-grained analysis
shows that DNS queries dominate UDP traffic. Fine-grained analysis
reveals that this is due to Bot samples.
Query                          Contacted (%)   Description
bmp.pilenga.co.uk.             12.29%          Hijacked Subdomain - Andromeda Botnet
tgr.tecnoagenzia.eu.           5.96%           Hijacked Subdomain - Andromeda Botnet
tds.repack.it.                 2.48%           Andromeda Botnet
rxxl.tecnoagenzia.eu.          2.06%           Andromeda Botnet
and31.blllaaaaaazblaaa1.com.   0.92%           Andromeda Botnet
Coarse-Grained
Drawing Panoramas
Dataset Characterization
Mainly executables.
Few libraries.
Rely on system native libraries.
Few external libraries.
GUI usage.
Strong presence of system interactions (file and registry
creation/deletion).
Panorama Comparison
Brazilian Panorama
Mix of binaries and DLL.
Rely on system native libraries.
Mostly as background activity.
Fine-Grained
Identifying project decisions
Coarse-Grained
On average, each sample contacts 2 different IP addresses.
Fine-Grained
c1abb496deb7bd51a4ad2f8a43113b13 contacts 16386 IPs.
a801cdef09a61d3ba7969015a8bffec0 contacts only 1 IP.
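The contrast between the two views can be made concrete with a small sketch. The function below reports an aggregate (the coarse-grained view) and then singles out samples far above the typical value (the fine-grained view). The outlier rule, a multiple of the median, is an assumption chosen for this example and is not the authors' criterion.

```python
from statistics import mean, median


def summarize_ips(
    ips_per_sample: dict[str, int], factor: float = 10.0
) -> tuple[float, float, dict[str, int]]:
    """Return (mean, median, outliers) for distinct IPs contacted per sample."""
    values = list(ips_per_sample.values())
    typical = median(values)
    # Fine-grained view: samples contacting far more IPs than is typical.
    outliers = {
        md5: n for md5, n in ips_per_sample.items() if n > factor * typical
    }
    return mean(values), typical, outliers
```

A handful of scanning samples can inflate the mean considerably, which is why the sketch uses the median as the reference for what counts as typical.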
Conclusion
A coarse-grained analysis procedure is the only approach able
to draw threat panoramas.
A fine-grained analysis procedure is the only approach which
enables the characterization of individual samples.
Fine-grained and coarse-grained analysis approaches must be
combined for increased threat understanding.
Acknowledgments
CNPq
Institute of Computing/Unicamp
Questions?
bertao@lasca.ic.unicamp.br
Large Scale Studies: Malware Needles in a Haystack SBSeg’18

More Related Content

PPTX
Databases, Web Services and Tools For Systems Immunology
PDF
Software Analytics: Towards Software Mining that Matters
PDF
Software Analytics: Data Analytics for Software Engineering
PDF
Software Analytics: Data Analytics for Software Engineering and Security
PPTX
Findability through Traceability - A Realistic Application of Candidate Tr...
PDF
Presentation1.pdf
PPTX
Towards Automated AI-guided Drug Discovery Labs
PPTX
Automating the process of continuously prioritising data, updating and deploy...
Databases, Web Services and Tools For Systems Immunology
Software Analytics: Towards Software Mining that Matters
Software Analytics: Data Analytics for Software Engineering
Software Analytics: Data Analytics for Software Engineering and Security
Findability through Traceability - A Realistic Application of Candidate Tr...
Presentation1.pdf
Towards Automated AI-guided Drug Discovery Labs
Automating the process of continuously prioritising data, updating and deploy...

What's hot (20)

PPTX
Dice.com Bay Area Search - Beyond Learning to Rank Talk
PDF
Developing A Big Data Search Engine - Where we have gone. Where we are going:...
PDF
Towards Effective Bug Triage with Software Data Reduction Techniques
PPTX
HyQue: Evaluating scientific Hypotheses using semantic web technologies
PDF
Software bug prediction
PDF
Evidence Briefings: Towards a Medium to Transfer Knowledge from Systematic Re...
PPTX
Big(ger) Data in Software Engineering
PDF
Software Mining and Software Datasets
PDF
Stack Overflow slides Data Analytics
PPTX
HotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William Enck
PPTX
Evolving the Optimal Relevancy Ranking Model at Dice.com
PDF
Sharing is Caring: Understanding and Measuring Threat Intelligence Sharing Ef...
PDF
ICSME2014
PDF
Biting into the Jawbreaker: Pushing the Boundaries of Threat Hunting Automation
PDF
Beyond Matching: Applying Data Science Techniques to IOC-based Detection
PDF
Analyzing Stack Overflow - Problem
PDF
Runtime Behavior of JavaScript Programs
PDF
Quality Control of Sequencing Data
PDF
Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)
PDF
Applying Machine Learning to Network Security Monitoring - BayThreat 2013
Dice.com Bay Area Search - Beyond Learning to Rank Talk
Developing A Big Data Search Engine - Where we have gone. Where we are going:...
Towards Effective Bug Triage with Software Data Reduction Techniques
HyQue: Evaluating scientific Hypotheses using semantic web technologies
Software bug prediction
Evidence Briefings: Towards a Medium to Transfer Knowledge from Systematic Re...
Big(ger) Data in Software Engineering
Software Mining and Software Datasets
Stack Overflow slides Data Analytics
HotSoS16 Tutorial "Text Analytics for Security" by Tao Xie and William Enck
Evolving the Optimal Relevancy Ranking Model at Dice.com
Sharing is Caring: Understanding and Measuring Threat Intelligence Sharing Ef...
ICSME2014
Biting into the Jawbreaker: Pushing the Boundaries of Threat Hunting Automation
Beyond Matching: Applying Data Science Techniques to IOC-based Detection
Analyzing Stack Overflow - Problem
Runtime Behavior of JavaScript Programs
Quality Control of Sequencing Data
Measuring the IQ of your Threat Intelligence Feeds (#tiqtest)
Applying Machine Learning to Network Security Monitoring - BayThreat 2013
Ad

Similar to Large Scale Studies: Malware Needles in a Haystack (20)

PDF
Analysis random org nist2005
PDF
What do malware analysts want from academia? A survey on the state-of-the-pra...
PDF
Supporting image-based meta-analysis with NIDM: Standardized reporting of neu...
PDF
Malicious Linux binaries: A Landscape
PPTX
INT254_Zero Lecture Machine Learning 1st book
PDF
Sybrandt Thesis Proposal Presentation
DOC
IEEE 2014 DOTNET CLOUD COMPUTING PROJECTS A scientometric analysis of cloud c...
PPTX
Awareness Support in Global Software Development: A Systematic Review Based o...
PPTX
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
PPTX
Art into Science 2017 - Investigation Theory: A Cognitive Approach
PDF
Publish or Perish: Questioning the Impact of Our Research on the Software Dev...
PDF
A Survey And Taxonomy Of Distributed Data Mining Research Studies A Systemat...
PDF
2011 EASE - Motivation in Software Engineering: A Systematic Review Update
PDF
Transition From Mechanical Engineering to Data Science | Tutort Academy
PDF
حلقة تكنولوجية 11 بحث علمى بعنوان A Systematic Mapping Study for Big Data Str...
PPTX
Automatic Classification of Springer Nature Proceedings with Smart Topic Miner
PDF
Software Engineering Research: Leading a Double-Agent Life.
PPTX
Threshold for Size and Complexity Metrics: A Case Study from the Perspective ...
PDF
0912f50eedb48e44d7000000
PDF
From data lakes to actionable data (adventures in data curation)
Analysis random org nist2005
What do malware analysts want from academia? A survey on the state-of-the-pra...
Supporting image-based meta-analysis with NIDM: Standardized reporting of neu...
Malicious Linux binaries: A Landscape
INT254_Zero Lecture Machine Learning 1st book
Sybrandt Thesis Proposal Presentation
IEEE 2014 DOTNET CLOUD COMPUTING PROJECTS A scientometric analysis of cloud c...
Awareness Support in Global Software Development: A Systematic Review Based o...
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
Art into Science 2017 - Investigation Theory: A Cognitive Approach
Publish or Perish: Questioning the Impact of Our Research on the Software Dev...
A Survey And Taxonomy Of Distributed Data Mining Research Studies A Systemat...
2011 EASE - Motivation in Software Engineering: A Systematic Review Update
Transition From Mechanical Engineering to Data Science | Tutort Academy
حلقة تكنولوجية 11 بحث علمى بعنوان A Systematic Mapping Study for Big Data Str...
Automatic Classification of Springer Nature Proceedings with Smart Topic Miner
Software Engineering Research: Leading a Double-Agent Life.
Threshold for Size and Complexity Metrics: A Case Study from the Perspective ...
0912f50eedb48e44d7000000
From data lakes to actionable data (adventures in data curation)
Ad

More from Marcus Botacin (20)

PDF
Cross-Regional Malware Detection via Model Distilling and Federated Learning
PDF
GPThreats: Fully-automated AI-generated malware and its security risks
PDF
[Texas A&M University] Research @ Botacin's Lab
PDF
Pilares da Segurança e Chaves criptográficas
PDF
Machine Learning by Examples - Marcus Botacin - TAMU 2024
PDF
Near-memory & In-Memory Detection of Fileless Malware
PDF
GPThreats-3: Is Automated Malware Generation a Threat?
PDF
[HackInTheBOx] All You Always Wanted to Know About Antiviruses
PDF
[Usenix Enigma\ Why Is Our Security Research Failing? Five Practices to Change!
PDF
Hardware-accelerated security monitoring
PDF
How do we detect malware? A step-by-step guide
PDF
Among Viruses, Trojans, and Backdoors:Fighting Malware in 2022
PDF
Extraindo Caracterı́sticas de Arquivos Binários Executáveis
PDF
On the Malware Detection Problem: Challenges & Novel Approaches
PDF
All You Need to Know to Win a Cybersecurity Adversarial Machine Learning Comp...
PDF
Near-memory & In-Memory Detection of Fileless Malware
PDF
Does Your Threat Model Consider Country and Culture? A Case Study of Brazilia...
PDF
Integridade, confidencialidade, disponibilidade, ransomware
PDF
An Empirical Study on the Blocking of HTTP and DNS Requests at Providers Leve...
PDF
On the Security of Application Installers & Online Software Repositories
Cross-Regional Malware Detection via Model Distilling and Federated Learning
GPThreats: Fully-automated AI-generated malware and its security risks
[Texas A&M University] Research @ Botacin's Lab
Pilares da Segurança e Chaves criptográficas
Machine Learning by Examples - Marcus Botacin - TAMU 2024
Near-memory & In-Memory Detection of Fileless Malware
GPThreats-3: Is Automated Malware Generation a Threat?
[HackInTheBOx] All You Always Wanted to Know About Antiviruses
[Usenix Enigma\ Why Is Our Security Research Failing? Five Practices to Change!
Hardware-accelerated security monitoring
How do we detect malware? A step-by-step guide
Among Viruses, Trojans, and Backdoors:Fighting Malware in 2022
Extraindo Caracterı́sticas de Arquivos Binários Executáveis
On the Malware Detection Problem: Challenges & Novel Approaches
All You Need to Know to Win a Cybersecurity Adversarial Machine Learning Comp...
Near-memory & In-Memory Detection of Fileless Malware
Does Your Threat Model Consider Country and Culture? A Case Study of Brazilia...
Integridade, confidencialidade, disponibilidade, ransomware
An Empirical Study on the Blocking of HTTP and DNS Requests at Providers Leve...
On the Security of Application Installers & Online Software Repositories

Recently uploaded (20)

PDF
Warm, water-depleted rocky exoplanets with surfaceionic liquids: A proposed c...
PDF
GROUP 2 ORIGINAL PPT. pdf Hhfiwhwifhww0ojuwoadwsfjofjwsofjw
PPTX
Probability.pptx pearl lecture first year
PPTX
Microbes in human welfare class 12 .pptx
PPTX
gene cloning powerpoint for general biology 2
PPT
Presentation of a Romanian Institutee 2.
PPTX
Hypertension_Training_materials_English_2024[1] (1).pptx
PPT
THE CELL THEORY AND ITS FUNDAMENTALS AND USE
PDF
Assessment of environmental effects of quarrying in Kitengela subcountyof Kaj...
PPTX
Substance Disorders- part different drugs change body
PDF
Science Form five needed shit SCIENEce so
PDF
Communicating Health Policies to Diverse Populations (www.kiu.ac.ug)
PPTX
endocrine - management of adrenal incidentaloma.pptx
PPTX
POULTRY PRODUCTION AND MANAGEMENTNNN.pptx
PDF
Social preventive and pharmacy. Pdf
PDF
BET Eukaryotic signal Transduction BET Eukaryotic signal Transduction.pdf
PPTX
INTRODUCTION TO PAEDIATRICS AND PAEDIATRIC HISTORY TAKING-1.pptx
PDF
Unit 5 Preparations, Reactions, Properties and Isomersim of Organic Compounds...
PPTX
Presentation1 INTRODUCTION TO ENZYMES.pptx
PDF
Is Earendel a Star Cluster?: Metal-poor Globular Cluster Progenitors at z ∼ 6
Warm, water-depleted rocky exoplanets with surfaceionic liquids: A proposed c...
GROUP 2 ORIGINAL PPT. pdf Hhfiwhwifhww0ojuwoadwsfjofjwsofjw
Probability.pptx pearl lecture first year
Microbes in human welfare class 12 .pptx
gene cloning powerpoint for general biology 2
Presentation of a Romanian Institutee 2.
Hypertension_Training_materials_English_2024[1] (1).pptx
THE CELL THEORY AND ITS FUNDAMENTALS AND USE
Assessment of environmental effects of quarrying in Kitengela subcountyof Kaj...
Substance Disorders- part different drugs change body
Science Form five needed shit SCIENEce so
Communicating Health Policies to Diverse Populations (www.kiu.ac.ug)
endocrine - management of adrenal incidentaloma.pptx
POULTRY PRODUCTION AND MANAGEMENTNNN.pptx
Social preventive and pharmacy. Pdf
BET Eukaryotic signal Transduction BET Eukaryotic signal Transduction.pdf
INTRODUCTION TO PAEDIATRICS AND PAEDIATRIC HISTORY TAKING-1.pptx
Unit 5 Preparations, Reactions, Properties and Isomersim of Organic Compounds...
Presentation1 INTRODUCTION TO ENZYMES.pptx
Is Earendel a Star Cluster?: Metal-poor Globular Cluster Progenitors at z ∼ 6

Large Scale Studies: Malware Needles in a Haystack

  • 1. Introduction Motivation & Related Work Methodology Analysis Results Discussion Final Remarks Large Scale Studies: Malware Needles in a Haystack Giovanni Bert˜ao1,3, Marcus Botacin2, Andr´e Gr´egio2, Paulo L´ıcio de Geus1 1University of Campinas (UNICAMP) {bertao, paulo}@lasca.ic.unicamp.br 2Federal University of Parana (UFPR) {mfbotacin, gregio}@inf.ufpr.br 3Bolsista PIBIC-CNPq Large Scale Studies: Malware Needles in a Haystack SBSeg’18
  • 2. Introduction Motivation & Related Work Methodology Analysis Results Discussion Final Remarks Topics 1 Introduction 2 Motivation & Related Work 3 Methodology 4 Analysis Results 5 Discussion 6 Final Remarks Large Scale Studies: Malware Needles in a Haystack SBSeg’18
  • 3. Introduction Motivation & Related Work Methodology Analysis Results Discussion Final Remarks Analysis Type: Coarse-grained and Fine-grained Topics 1 Introduction Analysis Type: Coarse-grained and Fine-grained 2 Motivation & Related Work Motivation Related Work 3 Methodology Dataset Processing the Data Architecture 4 Analysis Results Coarse-Grained Fine-Grained 5 Discussion Coarse-Grained Fine-Grained 6 Final Remarks Conclusion Acknowledgments Questions Large Scale Studies: Malware Needles in a Haystack SBSeg’18
  • 4. Introduction Motivation & Related Work Methodology Analysis Results Discussion Final Remarks Analysis Type: Coarse-grained and Fine-grained Malware Analysis Approaches Coarse-Grained Highlight major aspects. Discard sample details. Fine-Grained Focus on implementation details. Don’t state the risk of such sample in the overall scenario. Large Scale Studies: Malware Needles in a Haystack SBSeg’18
  • 5. Introduction Motivation & Related Work Methodology Analysis Results Discussion Final Remarks Motivation Topics 1 Introduction Analysis Type: Coarse-grained and Fine-grained 2 Motivation & Related Work Motivation Related Work 3 Methodology Dataset Processing the Data Architecture 4 Analysis Results Coarse-Grained Fine-Grained 5 Discussion Coarse-Grained Fine-Grained 6 Final Remarks Conclusion Acknowledgments Questions Large Scale Studies: Malware Needles in a Haystack SBSeg’18
  • 6. Introduction Motivation & Related Work Methodology Analysis Results Discussion Final Remarks Motivation Malware Studies Figure: Coarse-Grained BBC: https://www.bbc.com/news/technology-39730407 Figure: Fine-Grained Trend Micro: https://bit.ly/2PaSPDC Large Scale Studies: Malware Needles in a Haystack SBSeg’18
  • 7. Introduction Motivation & Related Work Methodology Analysis Results Discussion Final Remarks Related Work Topics 1 Introduction Analysis Type: Coarse-grained and Fine-grained 2 Motivation & Related Work Motivation Related Work 3 Methodology Dataset Processing the Data Architecture 4 Analysis Results Coarse-Grained Fine-Grained 5 Discussion Coarse-Grained Fine-Grained 6 Final Remarks Conclusion Acknowledgments Questions Large Scale Studies: Malware Needles in a Haystack SBSeg’18
  • 8. Related Work. Bayer (Windows): A view on current malware behaviors. Lindorfer (Android): Andrubis – 1,000,000 apps later: A view on current Android malware behaviors.
  • 10. Dataset: Building the Dataset. Dataset Composition: Malware repositories and blacklists crawled daily. 135,000 unique malware samples collected from the Malshare database. Only Windows samples. Samples submitted to static, dynamic, and network analysis procedures.
  • 12. Processing the Data: Analysis Method. Static Analysis: presence of packers; anti-analysis techniques; anti-virus detection. Dynamic and Network Analysis: logs from BehEMOT, an internal sandbox solution.
  • 14. Architecture: Large Scale Support. Figure: Parallel Processing Architecture. Samples are independently analyzed and their results are stored in a centralized database.
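The parallel architecture on slide 14 can be illustrated with a minimal sketch. It is not the authors' implementation: `analyze_sample` is a hypothetical placeholder for the real static, dynamic, and network analyses, the `results` table layout is assumed, and an in-memory SQLite database stands in for the centralized database.

```python
import hashlib
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def analyze_sample(sample: bytes) -> tuple[str, int]:
    """Hypothetical placeholder analysis: returns (MD5 of the sample, size)."""
    return hashlib.md5(sample).hexdigest(), len(sample)


def run_pipeline(samples: list[bytes], workers: int = 4) -> sqlite3.Connection:
    """Analyze samples independently and collect results in one database."""
    db = sqlite3.connect(":memory:")  # assumed stand-in for the central DB
    db.execute("CREATE TABLE results (md5 TEXT PRIMARY KEY, size INTEGER)")
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Workers never share state; only this thread writes to the database.
        for md5, size in pool.map(analyze_sample, samples):
            db.execute("INSERT OR IGNORE INTO results VALUES (?, ?)", (md5, size))
    db.commit()
    return db
```

Keeping all writes in the coordinating thread is one simple way to avoid concurrent-write issues; a real deployment would likely use a database server and separate worker machines.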
  • 15. Architecture: Time Elapsed. [Plot: Tuples x Time; x-axis: Hours (0 to 36); y-axis: Tuples (0M to 60M).] Figure: Parallel Processing Time. The complete analysis took 36 hours.
  • 17. Coarse-Grained: Malicious Behavior (Static Analysis). [Bar chart: Identified Malicious Behaviors; x-axis: Malicious Behaviors (Termination, Timing, Performance, Process, Window, Fingerprinting, Modularization, Removal, AntiDebug); y-axis: Percentage (%), 0% to 100%.] Figure: Identified Malicious Behaviors. We identified multiple, distinct malicious behaviors during sample execution, such as evading AV analysis through Timing delays, measuring the Performance overhead caused by monitoring, and the Termination of security solutions to avoid detection.
  • 18. Coarse-Grained: Prevalent Signatures. Table: Most Prevalent Signatures. Compilers are more prevalent than packers.
      Signature | Type | Occurrence (%)
      Microsoft | Compiler | 37.95%
      Nullsoft PIMP | Installer | 25.51%
      Borland Delphi | Compiler | 15.06%
      UPX | Packer | 4.23%
      MSLHR | Packer | 2.25%
      PEcompact | Packer | 1.66%
  • 19. Coarse-Grained: Malicious Behavior (Dynamic Analysis). Table: Dynamic Analysis. Identified Malicious Behaviors.
      Subsystem | Operation | Samples (%) | Targets (Samples %)
      File | Create Files | 91.56% | Internet Explorer 10.14%
      File | Read Files | 89.18% | .DLL 86.80%; Internet Explorer 7.01%; .SYS 1.26%
      File | Write Files | 81.74% | .EXE 46.20%; .DLL 31.62%; Internet Explorer <0.01%; Host 0.00%
      File | Delete Files | 62.45% | Internet Explorer 0.00%
      Process | Create Process | 22.84% | -
      Process | Delete Process | 23.38% | -
      Registry | Set Registry Values | 74.73% | Proxy 68.36%; Autorun 5.66%
      Registry | Delete Registry Values | 55.43% | -
  • 20. Coarse-Grained: Anti-Analysis Techniques. [Bar chart: Identified Anti-Analysis Techniques; x-axis: Techniques (getlasterror, unhandledexceptionfilter, raiseexception, terminateprocess, isdebuggerpresent, isprocessorfeaturepresent, findwindowexa, vmcheck.dll, bochs & qemu cpuid, vmware, virtual box); y-axis: Percentage (%), 0% to 80%.] Figure: Identified Anti-Analysis Techniques. Samples employ anti-analysis techniques to avoid being inspected during sandbox execution. We identified techniques aimed at detecting the presence of debuggers and virtual machines.
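As a rough illustration of how such a prevalence chart could be produced, the sketch below matches each sample's imported names against a fixed indicator list and reports the share of samples per indicator. The indicator names are taken from the figure labels; the matching rule (case-insensitive exact match) and the input format (a mapping from sample ID to its imported names) are assumptions, not the tooling used in the study.

```python
from collections import Counter

# Indicator names copied from the figure labels (API and DLL names only).
INDICATORS = frozenset({
    "getlasterror", "unhandledexceptionfilter", "raiseexception",
    "terminateprocess", "isdebuggerpresent", "isprocessorfeaturepresent",
    "findwindowexa", "vmcheck.dll",
})


def find_indicators(imported_names):
    """Return the sorted indicators present among a sample's imports."""
    seen = {name.lower() for name in imported_names}
    return sorted(seen & INDICATORS)


def indicator_prevalence(samples):
    """Map each indicator to the percentage of samples that contain it.

    `samples` maps a sample ID to its list of imported names (assumed format).
    """
    counts = Counter()
    for names in samples.values():
        counts.update(find_indicators(names))
    return {ind: 100.0 * n / len(samples) for ind, n in counts.items()}
```

Note that several of these APIs (for example `GetLastError`) also appear in benign software, so a match is only a hint that a sample may try to detect an analysis environment.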
  • 21. Coarse-Grained: Network Analysis, Protocols Distribution. [Bar chart: Protocol distribution by samples; x-axis: Protocols (HTTP/Non-TCP, HTTP/TCP, UDP/Non-DNS, UDP/DNS); y-axis: Samples (%), 0% to 100%.] Figure: Protocol usage distribution by sample. HTTP over TCP and DNS over UDP are the prevalent communication channels.
  • 22. Coarse-Grained: Network Analysis, Domains Contacted. Table: Most Contacted Domains. We observe the presence of cloud providers among the most accessed domains.
      Domain | Accesses (%)
      Cloudfront | 4.39%
      Amazonaws | 3.32%
      Kirov | 3.20%
      Kerch | 3.11%
      Comcast | 1.75%
      Akamaitechnologies | 1.48%
      Sbcglobal | 1.10%
      Broadband | 1.08%
  • 23. Coarse-Grained: Network Analysis, TLDs. [Bar chart: Top-Level Domains Contacted; x-axis: TLDs (net, com, ru, de, sc, jp, eu, br, fr, biz); y-axis: Access (%), 0% to 35%.] Figure: Most Accessed TLDs distribution. Generic domains are prevalent and country-specific domains are well distributed, showing that malware creators are ready to exploit vulnerabilities in multiple countries.
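A TLD distribution like the one on slide 23 can be derived from the contacted domain names. The sketch below is an assumption about the procedure, not the authors' code: it treats the last DNS label as the TLD, so a name under `co.uk` is counted as `uk`.

```python
from collections import Counter


def tld_distribution(domains):
    """Return {tld: share of accesses in percent}, most frequent first."""
    tlds = Counter()
    for domain in domains:
        name = domain.strip().rstrip(".").lower()  # drop the trailing root dot
        if name:
            tlds[name.rsplit(".", 1)[-1]] += 1
    total = sum(tlds.values())
    return {tld: 100.0 * count / total for tld, count in tlds.most_common()}
```

For example, feeding it four of the query names listed on slide 29 (`bmp.pilenga.co.uk.`, `tgr.tecnoagenzia.eu.`, `tds.repack.it.`, `rxxl.tecnoagenzia.eu.`) yields 50% for `eu` and 25% each for `uk` and `it`.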
  • 24. Coarse-Grained: Network Analysis, DNS Resolvers. [Bar chart: DNS queries distribution; x-axis: DNS hosts (google, opendns, d0wn, emerion, as43289, root-servers, polyram-group, jp, 114dns, cryptostorm); y-axis: Access (%), logarithmic, 0.01% to 100%.] Figure: DNS Resolvers Distribution. The default sandbox DNS (google) was the most used one.
  • 26. Fine-Grained: Keys Modification. Table: Modified AutoRun Keys. Whereas coarse-grained analysis identified that ≈ 5.66% of samples write to an Autorun key, the fine-grained analysis specified their location.
      Key | Samples (%)
      HKCU\ID\Software\Microsoft\Windows\CurrentVersion\Run | 46.33%
      HKCU\.DEFAULT\SOFTWARE\Microsoft\Windows\CurrentVersion\Run | < 0.01%
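One way to move from the coarse "writes to an Autorun key" label to the specific location is to test each registry write path logged by the sandbox against the `Run` key suffix. The sketch below is illustrative only: it assumes paths are logged with backslash separators, and `is_autorun_key` is a hypothetical helper, not part of BehEMOT.

```python
# Lower-cased suffix shared by the per-user and per-machine Run keys.
RUN_SUFFIX = r"\software\microsoft\windows\currentversion\run"


def is_autorun_key(key_path: str) -> bool:
    """True if the registry path ends at a ...\\CurrentVersion\\Run key."""
    return key_path.lower().rstrip("\\").endswith(RUN_SUFFIX)
```

Because the comparison is on the full suffix, sibling keys such as `RunOnce` are not matched; a broader persistence check would need an explicit list of additional keys.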
  • 27. Fine-Grained: Contacted IPs. Table: Contacted IPs by Sample. Coarse-grained analysis showed that, on average, each sample contacts 2 distinct IP addresses. Fine-grained analysis revealed that ransomware samples which spread through scanning contact many more IPs.
      MD5 Hash | Number of Distinct IPs | Label
      c1abb496deb7bd51a4ad2f8a43113b13 | 16386 | Ransomware.Cerber
      bc88096e7cc09f02f11deec35f84d5cd | 16385 | Ransomware.AWA
      a801cdef09a61d3ba7969015a8bffec0 | 1 | Ransomware.VirLock
  • 28. Fine-Grained: HTTP Requests. Table: Prevalent HTTP Requests. Coarse-grained analysis shows that HTTP payloads dominate TCP traffic. Fine-grained analysis shows that this is due to Downloader samples.
      MD5 Hash | Distinct HTTP Requests | Label
      ede13f40a96a8b6e5de1029200c0b15e | 394 | Downloader
      e5f4116d08c343623d5ee3af5553cbee | 353 | Downloader
      47a328b0b903bb68147facc3a084172c | 310 | Downloader
      28c4e2a48d9ddfffa01a943ca1ba1262 | 304 | Downloader
  • 29. Fine-Grained: DNS Resolvers. Table: DNS queries resolvers distribution. Coarse-grained analysis shows that DNS queries dominate UDP traffic. Fine-grained analysis reveals that this is due to Bot samples.
      Query | Contacted (%) | Description
      bmp.pilenga.co.uk. | 12.29% | Hijacked Subdomain - Andromeda Botnet
      tgr.tecnoagenzia.eu. | 5.96% | Hijacked Subdomain - Andromeda Botnet
      tds.repack.it. | 2.48% | Andromeda Botnet
      rxxl.tecnoagenzia.eu. | 2.06% | Andromeda Botnet
      and31.blllaaaaaazblaaa1.com. | 0.92% | Andromeda Botnet
  • 31. Coarse-Grained: Drawing Panoramas. Dataset Characterization: Mainly executables, few libraries. Rely on system native libraries, few external libraries. GUI usage. Strong presence of system interactions (file and registry creation/deletion).
  • 32. Coarse-Grained: Panorama Comparison. Brazilian Panorama: Mix of binaries and DLLs. Rely on system native libraries. Mostly background activity.
  • 34. Fine-Grained: Identifying project decisions. Coarse-Grained: On average, each sample contacts 2 different IP addresses. Fine-Grained: c1abb496deb7bd51a4ad2f8a43113b13 contacts 16386 IPs; a801cdef09a61d3ba7969015a8bffec0 contacts only 1 IP.
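The contrast between the two views can be sketched in a few lines: the coarse view reduces per-sample counts to a single aggregate, while the fine view singles out the samples that deviate from the typical case. The threshold rule (more than `factor` times the median) and the function names are assumptions chosen for illustration, not the criterion used in the study.

```python
from statistics import mean, median


def coarse_view(ip_counts):
    """Aggregate view: average number of distinct IPs contacted per sample."""
    return mean(ip_counts.values())


def fine_view(ip_counts, factor=10.0):
    """Per-sample view: samples contacting far more IPs than the median.

    The `factor * median` cutoff is an assumed heuristic.
    """
    typical = median(ip_counts.values())
    outliers = (s for s, n in ip_counts.items() if n > factor * typical)
    return sorted(outliers, key=ip_counts.get, reverse=True)
```

With hypothetical counts such as `{"s1": 2, "s2": 1, "s3": 3, "s4": 2, "scanner": 16386}`, `fine_view` returns only `"scanner"`, the kind of sample a single aggregate figure cannot identify.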
  • 36. Conclusion. A coarse-grained analysis procedure is the only approach able to draw threat panoramas. A fine-grained analysis procedure is the only approach that enables the characterization of individual samples. Fine-grained and coarse-grained analysis approaches must be combined for increased threat understanding.
  • 38. Acknowledgments: CNPq; Institute of Computing/Unicamp.
  • 40. Questions? bertao@lasca.ic.unicamp.br