It's you on photo?
Automatic Detection of Twitter
Accounts Infected With the
Blackhole Exploit Kit
Josh White and Jeanna Matthews
Clarkson University

MALWARE 2013, October 22-24, 2013
Objective
• This work identifies indicators of possible BEK-infectious messages on Twitter.
• These indicators drive a filter that can be applied to our collection system to identify Twitter accounts that have crossed specific thresholds and can be considered compromised or purposefully infectious.

Overview
• BEK Details
• Data Collection
• Analysis Framework
• Metrics
• Results
– Infectious Message Variations
– API Usage
– Infectious Indicators

Blackhole Exploit Kit (BEK)
• Web-based application that manages the installation and command and control (C2) of malware.
• Uses a compromised server for malware and web-page hosting.
• Links luring victims to a compromised server are distributed mainly through spam, spear-phishing, and links in social network posts.

Infection
• The exploit server hosts an innocuous-looking web page.
• The page hosts a tool that scans the visiting system.
• Once a vulnerability is identified, it loads the necessary exploit tools and compromises the visiting system.
• A wide variety of malware may be loaded at this point, depending on the exact mission of the attacker.

Other BEK features
• Contains modular capabilities so that new exploits can be added rapidly and in many languages.
• Employs typical countermeasures: packing, binary obfuscation, and antivirus avoidance.

Proliferation of BEK
• In 2012, BEK's creators released version 2.0; since then it has become one of the best-known and most commonly deployed exploit kits.
• BEK enabled the majority of malware infections in 2012.
• One study found that BEK accounted for 29% of all malicious URLs in a dataset of 77,000 URLs marked harmful by the Google Safe Browsing API.

Data Collection Overview
• Over the course of 2012 we collected 165 TB of Twitter data (uncompressed)
• 175 days collected, 147 full days
• Estimated 45 billion tweets
• Recently released estimates place total Twitter traffic at 175 million tweets per day in 2012
• Our daily collection rates varied between 50% and 80% of total Twitter traffic.
• We captured complete tweet data in JSON format using Twitter's REST API.

Key Examples of Attributes in
JSON format


profile link color/background color/tile/default image/image url (http and https)/text color/default description, background image url (http and https), in reply to screen name/status id and str/user id, follow request sent, friends count, screen name, show all inline media, utc offset, url, created at, favorited, retweet count, favorites count, is translator, truncated, contributors enabled, contributors, time zone, verified, coordinates, geo, text, entities, id, id str, following, application, retweeted, place, sidebar border color/fill color, followers count, geo enabled, listed count, notifications, name, lang, location, protected, statuses count

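As a rough illustration of how a few of these attributes might be pulled out of one captured tweet, the sketch below parses a JSON record with Python's standard `json` module. The sample record, the `extract_features` helper, and the exact key names (`text`, `application`, `entities`, `user`) are assumptions based on the attribute names listed above, not a verified schema.

```python
import json

# Hypothetical, heavily trimmed tweet record; key names follow the slide's
# attribute list and are assumptions, not a verified Twitter schema.
SAMPLE_TWEET = """
{
  "id_str": "1",
  "text": "It's you on photo? http://example.com/x",
  "application": "Mobile Web",
  "entities": {"urls": [{"expanded_url": "http://example.com/x"}]},
  "user": {"screen_name": "example_user", "followers_count": 10, "friends_count": 200}
}
"""

def extract_features(raw_json):
    """Return a few fields that are useful as infection indicators."""
    tweet = json.loads(raw_json)
    user = tweet.get("user", {})
    urls = [u.get("expanded_url") for u in tweet.get("entities", {}).get("urls", [])]
    return {
        "screen_name": user.get("screen_name"),
        "text": tweet.get("text", ""),
        "application": tweet.get("application"),
        "urls": urls,
    }

features = extract_features(SAMPLE_TWEET)
```

Using `.get` with defaults keeps the helper tolerant of records where optional attributes are missing.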
Data Collection System
• Distributed data collection infrastructure
• Geographically dissimilar IPs to simulate multiple systems
• Registered application with non-authenticated API access (1 billion+ / week)

Data Storage
• Collected as a gzip-compressed stream of Python dictionaries (10:1 average compression ratio); storing 1.5 TB a week
• Converted to JSON on the fly when needed
• Initially stored in HDFS (had issues scaling); now use DDFS, the Disco Distributed Filesystem
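The on-the-fly conversion could look roughly like the sketch below. It assumes one Python dictionary literal per line inside the gzip file, which the slides do not specify; `iter_tweets` and `to_json_lines` are hypothetical helper names.

```python
import ast
import gzip
import json

def iter_tweets(path):
    """Yield one dict per non-empty line of a gzip file of Python-dict literals."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                # literal_eval parses Python literals without executing code
                yield ast.literal_eval(line)

def to_json_lines(path):
    """Convert each stored record to a JSON string only when it is requested."""
    for tweet in iter_tweets(path):
        yield json.dumps(tweet, ensure_ascii=False)
```

Both functions are generators, so a multi-gigabyte file is streamed record by record rather than loaded into memory.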

High Level Patterns
• Basic observable patterns
– Twitter has a lot of outages
– Posting rates follow predictable patterns

Analysis Framework
• Filter Analysis (on live stream)
• Experimental Analysis (after the fact)

Some Key Metrics
• Entropy
• Pearson’s Correlation

Initial Identification


• Searched for two well-publicized strings seen in the wild: “It's you on Photo?” and “It's about you?”

Other message types found
• Others found using regex, but not mentioned in any articles or blogs at the time:
– “You were nude at party) cool photo)”
– “Wow! Your photo is cool.”
– “At party you was drunken) cool photo)”
– “Your photo is amazing”
– “It's photo of you?”
– “It's all about you”
– “It's about you?”
– “Wow! You look good)”
• Because BEK allows message customization, the permutations are virtually limitless

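A regex-based search for message variants like these might be sketched as follows. The patterns are illustrative guesses that cover the strings listed above; they are not the expressions the authors used, and `LURE_PATTERNS` and `looks_like_lure` are hypothetical names.

```python
import re

# Illustrative patterns only; these are NOT the authors' expressions.
LURE_PATTERNS = [
    r"it'?s (you on|about you|all about you|photo of you)",
    r"(your|cool) photo\b",
    r"you look good",
]
LURE_RE = re.compile("|".join(LURE_PATTERNS), re.IGNORECASE)

def looks_like_lure(text):
    """True if the text contains one of the illustrative lure phrases."""
    # Normalise curly apostrophes so both apostrophe styles match alike.
    return LURE_RE.search(text.replace("\u2019", "'")) is not None
```

Patterns this loose will also match benign tweets such as a genuine compliment on a photo, which is one reason to combine string matching with the other indicators rather than rely on it alone.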
Results: Entropy
• Normal tweets: entropy of 4.6-7.5
– 197,237 manually verified messages from 100 sample accounts
• Infectious tweets: entropy of 4.3 and lower
[Figure: entropy plots, panels labeled “Normal” and “Infectious”]
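For reference, a Shannon entropy calculation over message text can be sketched as below. The slides do not say which symbols the entropy was computed over or how much text went into each value, so this character-level, per-string version is an assumption; only the 4.3 threshold comes from the slide.

```python
import math
from collections import Counter

INFECTIOUS_MAX_ENTROPY = 4.3  # threshold reported on the slide

def shannon_entropy(text):
    """Character-level Shannon entropy of `text`, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def below_infectious_threshold(text):
    """True if the text's entropy is at or below the reported threshold."""
    return shannon_entropy(text) <= INFECTIOUS_MAX_ENTROPY
```

Under this character-level reading, any very short string scores low regardless of intent, so the measure is more meaningful over longer text than over a single short tweet.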

Results: Pearson’s Correlation Coefficient
• Two infectious accounts
– Compared, they have an average PCC value of 0.927581013955
– Positive value near 1
– Indicates strong correlation (“similarity”) between accounts
• One infectious account and one non-infectious account
– Compared, they have an average PCC value of -0.0847935420003
– Slightly negative value near 0
– Indicates little or no linear correlation (“difference”) between accounts
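The PCC itself is straightforward to compute; what the slides leave open is which feature vectors were compared. The sketch below assumes character-frequency vectors built from each account's messages, purely for illustration; `account_pcc` is a hypothetical helper.

```python
import math
from collections import Counter

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / math.sqrt(var_x * var_y)

def account_pcc(messages_a, messages_b):
    """Compare two accounts via character-frequency vectors (assumed feature)."""
    freq_a = Counter("".join(messages_a).lower())
    freq_b = Counter("".join(messages_b).lower())
    alphabet = sorted(set(freq_a) | set(freq_b))
    return pearson([freq_a[c] for c in alphabet], [freq_b[c] for c in alphabet])
```

A result near 1 means the two accounts' vectors rise and fall together, while a result near 0 means there is no linear relationship between them.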

Results: Use of API/Application
• Applications must be registered for specific Twitter API function usage
• There are hundreds of registered applications
– e.g., “iPhone”, “Android”, “Official Twitter Client”
• “Mobile Web” is legacy
– Requires no registration of the application using it
– Requires no OAuth
– Less CPU/memory utilization

Results: Graph Clustering
• Visualized 729,609 suspicious accounts
– Used Gephi and the OpenOrd clustering algorithm
• Shows obvious clusters
– Edges are based on: an infectious account (Tx) directs a message to a victim; the victim is considered infectious once it starts transmitting infectious messages
• Non-connected accounts are assumed to have clicked on an infectious message that was not directed at them. We cannot currently trace which messages they clicked on
– Dense areas are the most successfully spread infection chains.
• The cores are considered infection hubs
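The edge rule described above could be turned into a graph file along these lines. The tweet structure (`sender`, `mentions`) and both function names are hypothetical; the sketch ignores time ordering and simply links an infectious sender to any mentioned account that is itself flagged infectious. The `Source,Target` header follows the edge-list CSV convention that Gephi can import.

```python
import csv
import io

def infection_edges(tweets, infectious_accounts):
    """Return sorted (sender, victim) pairs for the infection graph."""
    edges = set()
    for tweet in tweets:
        sender = tweet["sender"]
        if sender not in infectious_accounts:
            continue
        for victim in tweet["mentions"]:
            # Keep the edge only if the victim is itself flagged infectious.
            if victim in infectious_accounts and victim != sender:
                edges.add((sender, victim))
    return sorted(edges)

def edges_to_csv(edges):
    """Serialise edges as a Source,Target CSV string."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    writer.writerows(edges)
    return buffer.getvalue()
```

Accounts that never appear in any edge correspond to the non-connected accounts mentioned above.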

Results: Some Summary Stats







• Total Number of Tweets Processed: 6,531,319,202
• Total Number of Unique Accounts Processed: 265,163,290
• Total Number of Suspicious Accounts Found: 729,609
• Total Number of Suspicious Tweets Found: 8,286,480
• Calculated Percentage of BEK Infectious Accounts: 0.275%
• Calculated Percentage of BEK Infectious Tweets: 12.7%

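As a quick arithmetic check, the two percentages can be recomputed from the four counts reported on this slide. The sketch assumes each percentage is a simple ratio of suspicious to processed items, which the slides do not state explicitly.

```python
# Counts as reported on the summary slide.
tweets_processed = 6_531_319_202
accounts_processed = 265_163_290
suspicious_accounts = 729_609
suspicious_tweets = 8_286_480

def percentage(part, whole):
    """Express part/whole as a percentage."""
    return 100.0 * part / whole

account_pct = percentage(suspicious_accounts, accounts_processed)
tweet_pct = percentage(suspicious_tweets, tweets_processed)
```

With these counts the account ratio comes out near 0.275%, matching the slide, while the tweet ratio comes out near 0.127%. The 12.7% shown on the slide may rest on a different denominator or a shifted decimal point; the slides do not settle which.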
Related Work
• A lot of research has been done on social network analysis using sites such as Twitter. [21,22,24]
– This includes research that uses social network trends to track real-world contagion spread [25]
• A few studies examine BEK's malware-dropping capability [2]
– The URL identification pattern they used, “w.php?f=(.*?)&e=(.*?)”, does not pick up all of the URL patterns that we witnessed
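To see how narrow a single URL pattern is, the sketch below applies the quoted expression with Python's `re` module. The two URLs are made up for illustration and are not taken from the dataset. Note that, used verbatim as a regular expression, the unescaped `.` and `?` are operators rather than literal characters, so an escaped form is assumed to be the intent.

```python
import re

# Pattern exactly as printed on the slide; "." and "?" act as regex operators.
AS_PRINTED = re.compile(r"w.php?f=(.*?)&e=(.*?)")
# Assumed intent: the same pattern with the literal characters escaped.
ESCAPED = re.compile(r"w\.php\?f=(.*?)&e=(.*?)")

# Hypothetical URLs for illustration only.
classic = "http://example.com/w.php?f=abc12&e=4"
other = "http://example.com/links/column.php?id=abc12"

def matches(pattern, url):
    """True if the pattern occurs anywhere in the URL."""
    return pattern.search(url) is not None
```

Here `matches(ESCAPED, classic)` succeeds, while `matches(ESCAPED, other)` fails because the path and parameter names differ, which is how any single fixed pattern can miss other URL forms.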

Related Work (continued)
• Various works address determining whether a message is automated, most notably “Human, Bot or Cyborg” [26]
– Unfortunately, they relied heavily on the Google Safe Browsing API, which is only updated after someone has verified that a link is dangerous. [27]
• One work showed that up to 16% of all Twitter accounts show signs of automation. [28]
– However, they point out that only a small number of tweets use the Mobile Web API

Future Work (with edits from Malware 2013 audience!)
• Analysis of use of specific strings over time
• Studying spread of ideas in Twitter in addition to spread of malware
• Case study of top infectors
• Carrier vs. virology model of spread
• Compare to all vs. just benign
• Testing Twitter (measure how well they do in disabling infected accounts/help them get better); work with Twitter to integrate
– Include follower-to-following ratio on infected accounts
– More like antispam than antimalware
• Geographic analysis of the infected accounts

Conclusion
• We completed a large-scale analysis of the characteristics of BEK-infectious Twitter accounts
– Some accounts showed signs of existing solely for malware distribution
• We found substantial variation in infectious message structure
– We identified a large set of message types not previously published
• We identified the characteristics most strongly associated with BEK-infectious messages
– Tweets sent via the Mobile Web API, with a text entropy lower than 4.3, showing a strong PCC with known infectious messages, and additionally containing embedded URLs
• We presented our measurement techniques and how they integrate into our larger platform
• Without manual investigation of all messages that we flagged as infectious, we cannot be certain of our results

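The indicators named in the conclusion can be combined into a simple filter, sketched below. How the authors actually combine them (strict conjunction, weighting, per-account thresholds) is not stated in the slides, so the all-must-hold rule, the `TweetIndicators` structure, and the 0.9 cut-off for a "strong" PCC are assumptions; only the 4.3 entropy threshold and the “Mobile Web” application name come from the slides.

```python
from dataclasses import dataclass

@dataclass
class TweetIndicators:
    application: str       # client the tweet was sent from
    entropy: float         # text entropy of the message
    pcc_with_known: float  # PCC against known infectious messages
    has_url: bool          # whether the tweet embeds a URL

MAX_ENTROPY = 4.3  # from the slides
MIN_PCC = 0.9      # hypothetical cut-off for a "strong" PCC

def is_suspicious(ind, max_entropy=MAX_ENTROPY, min_pcc=MIN_PCC):
    """True only when all four indicators from the conclusion are present."""
    return (
        ind.application == "Mobile Web"
        and ind.entropy < max_entropy
        and ind.pcc_with_known >= min_pcc
        and ind.has_url
    )
```

For example, `is_suspicious(TweetIndicators("Mobile Web", 3.9, 0.93, True))` flags the tweet, while changing any one indicator (a different client, higher entropy, weak PCC, or no URL) clears it.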
Citations
1. J. Oliver, S. Cheng, L. Manly, J. Zhu, R. Paz, S. Sioting, J. Leopando. “Blackhole Exploit Kit: A Spam Campaign, Not a Series of Individual Spam Runs, An In-Depth Analysis,” Trend Micro Incorporated Research Paper, 2012.
2. Chris Grier, Lucas Ballard, Juan Caballero, et al. 2012. Manufacturing compromise: the emergence of exploit-as-a-service. In Proceedings of the 2012 ACM conference on Computer and communications security (CCS ’12). ACM, New York, NY, USA, 821-832.
3. Gabor Szappanos. “Inside The Blackhole,” SophosLabs, 2012.
4. Jason Jones. “The State of Web Exploit Kits,” HP DVLabs, 2012.
5. Howard, Fraser. 2013. Technical paper: Journey inside the Blackhole exploit kit. Naked Security from Sophos. November 30, 2012.
6. Chris Grier, Lucas Ballard, Juan Caballero, et al. 2012. Manufacturing compromise: the emergence of exploit-as-a-service. In Proceedings of the 2012 ACM conference on Computer and communications security (CCS ’12). ACM, New York, NY, USA, 821-832.
7. Fraser Howard. “Exploring the Blackhole Exploit Kit,” Sophos Technical Paper, March 2012.
8. Ziv Mador. “Exploiting Kits: The Underground’s Weapon of Choice,” Infosecurity Europe 2012, SpiderLabs at Trustwave, 2012.
9. Zhou Li, Kehuan Zhang, Yinglian Xie, Fang Yu, and XiaoFeng Wang. 2012. Knowing your enemy: understanding and detecting malicious web advertising. In Proceedings of the 2012 ACM conference on Computer and communications security (CCS ’12). ACM, New York, NY, USA, 674-686.
10. Shea Bennett. “Just How Big Is Twitter In 2012 [INFOGRAPHIC],” All Twitter - The Unofficial Twitter Resource, February 2013.
11. Mike Melanson, Twitter Kills the API Whitelist: What it Means for Developers and Innovation, February 11, 2011, URL = http://www.readwriteweb.com/archives/
12. Joab Jackson, Twitter Now Using OAuth Authentication for Third Party Apps, Computer World UK, September 1, 2010, URL = http://www.computerworlduk.com/news/security/3237659/twitter-now-using-oauth-authentication-for-third-party-apps/
13. Arne Roomann-Kurrik, Announcing gzip Compression for Streaming APIs, Twitter Developers Feed, Jan 20, 2012, URL = https://dev.twitter.com/blog/announcing-gzip-compression-streaming-apis
14. Prashanth Mundkur, Ville Tuulos, and Jared Flatow. 2011. Disco: a computing platform for large-scale data analytics. In Proceedings of the 10th ACM SIGPLAN workshop on Erlang (Erlang ’11). ACM, New York, NY, USA, 84-89.
15. C. E. Shannon. A Mathematical Theory of Communication, Reprinted with corrections from The Bell System Technical Journal, Vol. 27, pp. 379-423, 623-656, July, October, 1948.
16. Graham Cluley. “Outbreak: Blackhole malware attack spreading on Twitter using ‘It’s you on photo?’ disguise,” Sophos Naked Security Blog, July 27, 2012.
17. Rob Waugh. “It’s you! Blackhole virus spreading rapidly via Twitter fools users with fake photo link,” MailOnline Science and Tech News, July 2012.
18. Bastian M., Heymann S., Jacomy M. (2009). Gephi: an open source software for exploring and manipulating networks. International AAAI Conference on Weblogs and Social Media.
19. S. Martin, W. M. Brown, R. Klavans, and K. Boyack (to appear, 2011), OpenOrd: An Open-Source Toolbox for Large Graph Layout, SPIE Conference on Visualization and Data Analysis (VDA).
20. Aditya Mogadala and Vasudeva Varma. 2012. Twitter user behavior understanding with mood transition prediction. In Proceedings of the 2012 workshop on Data-driven user behavioral modelling and mining from social media (DUBMMSM ’12). ACM, New York, NY, USA, 31-34.
21. Johan Bollen and Huina Mao. 2011. Twitter Mood as a Stock Market Predictor. Computer 44, 10 (October 2011), 91-94. DOI=10.1109/MC.2011.323 http://dx.doi.org/10.1109/MC.2011.323
22. Johan Bollen, Bruno Gonçalves, Guangchen Ruan, and Huina Mao. 2011. Happiness is assortative in online social networks. Artif. Life 17, 3 (August 2011), 237-251.
23. Manuel Cebrian. 2012. Using friends as sensors to detect planetary-scale contagious outbreaks. In Proceedings of the 1st international workshop on Multimodal crowd sensing (CrowdSens ’12). ACM, New York, NY, USA, 15-16.
24. Z. Chu, S. Gianvecchio, H. Wang, and S. Jajodia. 2010. Who is tweeting on Twitter: human, bot, or cyborg? In Proceedings of the 26th Annual Computer Security Applications Conference (ACSAC ’10). ACM, New York, NY, USA, 21-30.
25. Google. Google Safe Browsing API. http://code.google.com/apis/safebrowsing/, Accessed: Feb 5, 2010.
26. Chao Michael Zhang and Vern Paxson. 2011. Detecting and analyzing automated activity on Twitter. In Proceedings of the 12th international conference on Passive and active measurement (PAM ’11), Neil Spring and George F. Riley (Eds.). Springer-Verlag, Berlin, Heidelberg, 102-111.

More Related Content

PDF
MALICIOUS URL DETECTION USING CONVOLUTIONAL NEURAL NETWORK
PDF
An Ontology-based Technique for Online Profile Resolution
PDF
Iy2515891593
PDF
A web content analytics
DOCX
Warningbird a near real time detection system for suspicious urls in twitter ...
PDF
Open Source Insight: Struts in VMware, Law Firm Cybersecurity, Hospital Data ...
PDF
State of the Art Analysis Approach for Identification of the Malignant URLs
PDF
Vulnerability Assessment and Penetration Testing using Webkill
MALICIOUS URL DETECTION USING CONVOLUTIONAL NEURAL NETWORK
An Ontology-based Technique for Online Profile Resolution
Iy2515891593
A web content analytics
Warningbird a near real time detection system for suspicious urls in twitter ...
Open Source Insight: Struts in VMware, Law Firm Cybersecurity, Hospital Data ...
State of the Art Analysis Approach for Identification of the Malignant URLs
Vulnerability Assessment and Penetration Testing using Webkill

What's hot (15)

PDF
B07040308
PDF
IEEE ANDROID APPLICATION 2016 TITLE AND ABSTRACT
PPTX
Splunk for Security Workshop
PDF
IRJET - Chrome Extension for Detecting Phishing Websites
PDF
vulnerability scanning and reporting tool
PPTX
DEVNET-1186 Harnessing the Power of the Cloud to Detect Advanced Threats: Cog...
PDF
Analysis of Malware Infected Systems & Classification with Gradient-boosted T...
PDF
PDMLP: PHISHING DETECTION USING MULTILAYER PERCEPTRON
PDF
FRAMEWORK FOR ANALYZING TWITTER TO DETECT COMMUNITY SUSPICIOUS CRIME ACTIVITY
PDF
Analyzing the effectualness of Phishing Algorithms in Web Applications Inques...
PDF
Target List of Hesper-BOT Malware
PDF
Flaws in Oauth 2.0 Can Oauth be used as a Security Server
PPTX
Technologies in Support of Big Data Ethics
PDF
Symantec Intelligence Report December 2014
B07040308
IEEE ANDROID APPLICATION 2016 TITLE AND ABSTRACT
Splunk for Security Workshop
IRJET - Chrome Extension for Detecting Phishing Websites
vulnerability scanning and reporting tool
DEVNET-1186 Harnessing the Power of the Cloud to Detect Advanced Threats: Cog...
Analysis of Malware Infected Systems & Classification with Gradient-boosted T...
PDMLP: PHISHING DETECTION USING MULTILAYER PERCEPTRON
FRAMEWORK FOR ANALYZING TWITTER TO DETECT COMMUNITY SUSPICIOUS CRIME ACTIVITY
Analyzing the effectualness of Phishing Algorithms in Web Applications Inques...
Target List of Hesper-BOT Malware
Flaws in Oauth 2.0 Can Oauth be used as a Security Server
Technologies in Support of Big Data Ethics
Symantec Intelligence Report December 2014
Ad

Viewers also liked (8)

PPT
Trading Profit And Loss CMD
PPSX
Accounts of insurance companies
PPTX
FINAL ACCOUNT CONCLUSION OF BUILDING PROJECTS
PPT
Final account
PDF
Graphical presentation of data
PPTX
Graphical Representation of data
PPTX
Project on Solar Energy
PDF
Accounting in insurance companies basic concepts
Trading Profit And Loss CMD
Accounts of insurance companies
FINAL ACCOUNT CONCLUSION OF BUILDING PROJECTS
Final account
Graphical presentation of data
Graphical Representation of data
Project on Solar Energy
Accounting in insurance companies basic concepts
Ad

Similar to Malware bek slides 20131023 final (20)

PPTX
Social Hive Index by MSLGROUP
 
PDF
Websense 2013 Threat Report
PDF
2013 Threat Report
PDF
IRJET - Detecting Spiteful Accounts in Social Network
PPTX
Open Source Insight: SCA for DevOps, DHS Security, Securing Open Source for G...
PDF
IRJET- Effective Countering of Communal Hatred During Disaster Events in Soci...
PDF
SENTIMENT ANALYSIS OF SOCIAL MEDIA DATA USING DEEP LEARNING
PDF
SECURITY ANALYSIS ON PASSWORD AUTHENTICATION SYSTEM OF WEB PORTAL
PDF
A Survey of Keylogger in Cybersecurity Education
PDF
Detection of Attacker using Honeywords
PDF
Studying user footprints in different online social networks
PDF
Learning to detect phishing ur ls
PDF
Edgescan 2022 Vulnerability Statistics Report
PDF
2022 Vulnerability Statistics Report.pdf
PDF
Briskinfosec - Threatsploit Report Augest 2021- Cyber security updates
PDF
Detecting Phishing using Machine Learning
PPTX
Beyond Interaction Networks: An Introduction to Practice Mapping
PDF
Spammer Detection and Fake User Identification on Social Networks
PDF
Privacy and Security on Online Social Media: Workshop on Data Analytics & Its...
Social Hive Index by MSLGROUP
 
Websense 2013 Threat Report
2013 Threat Report
IRJET - Detecting Spiteful Accounts in Social Network
Open Source Insight: SCA for DevOps, DHS Security, Securing Open Source for G...
IRJET- Effective Countering of Communal Hatred During Disaster Events in Soci...
SENTIMENT ANALYSIS OF SOCIAL MEDIA DATA USING DEEP LEARNING
SECURITY ANALYSIS ON PASSWORD AUTHENTICATION SYSTEM OF WEB PORTAL
A Survey of Keylogger in Cybersecurity Education
Detection of Attacker using Honeywords
Studying user footprints in different online social networks
Learning to detect phishing ur ls
Edgescan 2022 Vulnerability Statistics Report
2022 Vulnerability Statistics Report.pdf
Briskinfosec - Threatsploit Report Augest 2021- Cyber security updates
Detecting Phishing using Machine Learning
Beyond Interaction Networks: An Introduction to Practice Mapping
Spammer Detection and Fake User Identification on Social Networks
Privacy and Security on Online Social Media: Workshop on Data Analytics & Its...

More from Joshua S. White, PhD josh@securemind.org (12)

PDF
Presentation - Hybrid Sentiment Analysis Utilizing Multiple Indicators To Det...
PDF
Presentation - Social Relevance Toward Understanding the Impact of the Indivi...
PDF
Presentation - Application of Actor Level Social Characteristic Indicator Sel...
PDF
PDF
ase-social-informatics (6)
PDF
Social Network Analysis Applications and Approach
ODP
Clarkson joshua white - ids testing - spie 2013 presentation - jsw - d1
PDF
CSIAC - Social Media Analysis and Privacy
PDF
Clarkson - Joshua White - Research Proposal Presentation
PPT
Coalmine spie 2012 presentation - jsw -d3
PPT
Phishing spie 2012 presentation - jsw - d2
PPT
Physical Layer Optical Network Security Thesis Presentation To The CNY ISSA C...
Presentation - Hybrid Sentiment Analysis Utilizing Multiple Indicators To Det...
Presentation - Social Relevance Toward Understanding the Impact of the Indivi...
Presentation - Application of Actor Level Social Characteristic Indicator Sel...
ase-social-informatics (6)
Social Network Analysis Applications and Approach
Clarkson joshua white - ids testing - spie 2013 presentation - jsw - d1
CSIAC - Social Media Analysis and Privacy
Clarkson - Joshua White - Research Proposal Presentation
Coalmine spie 2012 presentation - jsw -d3
Phishing spie 2012 presentation - jsw - d2
Physical Layer Optical Network Security Thesis Presentation To The CNY ISSA C...

Recently uploaded (20)

PPTX
Strategies for Social Media App Enhancement
PDF
Why Digital Marketing Matters in Today’s World Ask ChatGPT
PDF
StarNetCafeSB2012D3POYNagaworld2-Hotel-Casino-Phnom Entertainment
PDF
Customer Churn Prediction in Digital Banking: A Comparative Study of Xai Tech...
PPTX
Types of Social Media Marketing for Business Success
PDF
The Edge You’ve Been Missing Get the Sociocosmos Edge
PPT
memimpindegra1uejehejehdksnsjsbdkdndgggwksj
DOC
ASU毕业证学历认证,圣三一拉邦音乐与舞蹈学院毕业证留学本科毕业证
PDF
Presence That Pays Off Activate My Social Growth
PDF
Live Echo Boost on TikTok_ Double Devices, Higher Ranks
PDF
Your Breakthrough Starts Here Make Me Popular
PDF
THE ULTIMATE YOUTUBE SHORTS GROWTH......
PDF
Instagram Reels Growth Guide 2025.......
PDF
Does Ownership Structure Play an Important Role in the Banking Industry?
PDF
Mastering Social Media Marketing in 2025.pdf
PDF
Regulation Study, Differences and Implementation of Bank Indonesia National C...
PDF
Subscribe This Channel Subscribe Back You
PDF
Transform Your Social Media, Grow Your Brand
PDF
Climate Risk and Credit Allocation: How Banks Are Integrating Environmental R...
PDF
25K Btc Enabled Cash App Accounts – Safe, Fast, Verified.pdf
Strategies for Social Media App Enhancement
Why Digital Marketing Matters in Today’s World Ask ChatGPT
StarNetCafeSB2012D3POYNagaworld2-Hotel-Casino-Phnom Entertainment
Customer Churn Prediction in Digital Banking: A Comparative Study of Xai Tech...
Types of Social Media Marketing for Business Success
The Edge You’ve Been Missing Get the Sociocosmos Edge
memimpindegra1uejehejehdksnsjsbdkdndgggwksj
ASU毕业证学历认证,圣三一拉邦音乐与舞蹈学院毕业证留学本科毕业证
Presence That Pays Off Activate My Social Growth
Live Echo Boost on TikTok_ Double Devices, Higher Ranks
Your Breakthrough Starts Here Make Me Popular
THE ULTIMATE YOUTUBE SHORTS GROWTH......
Instagram Reels Growth Guide 2025.......
Does Ownership Structure Play an Important Role in the Banking Industry?
Mastering Social Media Marketing in 2025.pdf
Regulation Study, Differences and Implementation of Bank Indonesia National C...
Subscribe This Channel Subscribe Back You
Transform Your Social Media, Grow Your Brand
Climate Risk and Credit Allocation: How Banks Are Integrating Environmental R...
25K Btc Enabled Cash App Accounts – Safe, Fast, Verified.pdf

Malware bek slides 20131023 final

  • 1. It's you on photo? Automatic Detection of Twitter Accounts Infected With the Blackhole Exploit Kit Josh White and Jeanna Matthews Clarkson University MALWARE 2013October 22-24, 2013
  • 2. Objective  This work identifies some indicators of possible BEK infectious messages on Twitter.  These indicators are used in the production of a filter which can be applied to our collection system to identify user accounts on Twitter which have reached specific thresholds and can be considered compromised or purposefully infectious. MALWARE 2013October 22-24, 2013
  • 3. Overview  BEK Details  Data Collection  Analysis Framework  Metrics  Results  Infectious Message Variations  API Usage  Infectious Indicators MALWARE 2013October 22-24, 2013
  • 4. Blackhole Exploit Kit (BEK) Web-based application that manages the installation and C2 of malware. Utilizes a compromised server for malware and web-page hosting. Links luring victims to a compromised server are distributed mainly through spam, spear-phishing, and links in social network posts. MALWARE 2013October 22-24, 2013
  • 5. Infection The exploit server hosts innocuous looking web-page Page hosts a tool for scanning the visiting syste Once a vulnerability is identified, it loads the necessary exploit tools and compromises the visiting system A wide variety of malware may be loaded at this point depending on the exact mission of the attacker. MALWARE 2013October 22-24, 2013
  • 6. Other BEK features Contains modular capabilities for new exploits to be added rapidly and in many languages. Employs typical countermeasures: Packing, binary obfuscation and antivirus avoidance. MALWARE 2013October 22-24, 2013
  • 7. Proliferation of BEK In 2012, BEK creators released version 2.0, since it has become the most well known/commonly deployed exploit kits BEK enabled majority of malware infections in 2012 One study found: BEK accounted for 29% of all malicious URLS, in a dataset of 77,000 URLs marked harmful by the Google Safe Browsing API MALWARE 2013October 22-24, 2013
  • 8. Data Collection Overview  Over the course of 2012 we collected 165 TB of Twitter Data (Uncompressed)  175 Days Collected, 147 Full Days  Estimated 45 Billion Tweets  Recently released estimates place total Twitter traffic at 175 million tweets per day in 2012  Our daily collection rates varied between 50% and 80% of total Twitter traffic.  We captured complete tweet data in JSON format using Twitters REST API. MALWARE 2013October 22-24, 2013
  • 9. Key Examples of Attributes in JSON format  profile link color/background color/title/default image/image url (http and https)/text color/default description, background image url (http and https), In reply to screen name/status id and str/user id, follow request sent, friends count, screen name, show all inline media, utc offset, url, created at, favorite, retweet count, favorites count, id translator, trunkated, contributors enabled, contributors, time zone, verified, coordinates, Geo, text, entities, id, id str, following, application, retweeted, place, sidebar border color/fill color, followers count, geo enabled listed count, notifications, name, lang, location, protected, statuses count MALWARE 2013October 22-24, 2013
  • 10. Data Collection System  Distributed Data Collection Infrastructure  Geographically dissimilar IP's to simulate multiple systems  Registered Application with Non-authenticated API access (1 billion+ / week) MALWARE 2013October 22-24, 2013
  • 11. Data Storage  Collection in Streaming Gzip Python Dictionary Format (10:1 Average Compression Ratio); Storing 1.5 TB a Week  Converted to JSON on the fly when needed  Initially Stored in HDFS (Had Issues Scaling); Now Use DDFS MALWARE 2013October 22-24, 2013
  • 12. High Level Patterns • Basic observable patterns – Twitter has a lot of outages – Posting rates follow predictable patterns MALWARE 2013October 22-24, 2013
  • 13. Analysis Framework  Filter Analysis (on live stream)  Experimental Analysis (after the fact) MALWARE 2013October 22-24, 2013
  • 14. Some Key Metrics  Entropy  Pearson’s Correlation MALWARE 2013October 22-24, 2013
  • 15. Initial Identification  Searched for two well publicized strings being seen in the wild: “It's you on Photo?” and “It's about you?” MALWARE 2013October 22-24, 2013
  • 16. Other message types found  Others found using REGEX, but not mentioned in any articles or blogs at the time:         “You were nude at party) cool” photo)” “Wow! Your photo is cool.” “At party you was drunken) cool photo)” “Your photo is amazing” “It's photo of you?” “It's all about you” “It's about you?” “Wow! You look good)”  Because BEK allows message customization the permutations are virtually limitless MALWARE 2013October 22-24, 2013
  • 17. Results: Entropy  Normal Tweets = 4.6-7.5  197,237 manually verified messages from 100 sample accounts Normal  Infectious = 4.3 & lower Infectious MALWARE 2013October 22-24, 2013
  • 18. Results: Pearson’s Correlation Coefficient  Two Infectious Accounts  Compared have an average PCC value of 0.927581013955  Positive value near 1  Indicates strong correlation “similarity” between accounts  One Infectious Account and One Non-Infectious Account  Compared have an average PCC value of -0.0847935420003  Negative value near 0  Indicates strong negative correlation “difference” between accounts MALWARE 2013October 22-24, 2013
  • 19. Results: Use of API/Application  Applications must be registered for specific Twitter API function usage  There hundreds of registered applications:  ie.: “Iphone”, “Android”, “Official Twitter Client”  “Mobile Web” is legacy  Requires no registration of application using it  Requires no 0Auth  Less CPU/Memory utilization MALWARE 2013October 22-24, 2013
  • 20. Results: Graph Clustering  Visualize 729,609 suspicious accounts  Utilized Gephi and the OpenOrd clustering algorithm  Shows Obvious Clusters  Based on: Tx infectious directs message to victim, victim is infectious when it starts transmitting infectious messages  Non-connected accounts are assumed to have clicked on an infectious message without it being directed at them. We can not currently trace what messages they clicked on  Dense area's are the most successfully spread infection chains.  The cores are considered Infection Hubs MALWARE 2013October 22-24, 2013
  • 21. Results: Some Summary Stats       Total Number of Tweets Processed: Total Number of Unique Accounts Processed: Total Number of Suspicious Accounts Found: Total Number of Suspicious Tweets Found: Calculated Percentage of BEK Infectious Accounts: Calculated Percentage of BEK Infectious Tweets: 12.7% MALWARE 2013October 22-24, 2013 6,531,319,202 265,163,290 729,609 8,286,480 0.275%
  • 22. Related Work  A lot of research has been done into social network analysis using sites such as Twitter. [21,22,24]  Including research that uses social network trends to track real world contagion spread [25]  A few studies exist that examine BEK's malware dropping capability [2]  The URL identification method they used “w.php?f=(.*?)&e=(.*?)” does not pick up all of the URL patterns that we witnessed MALWARE 2013October 22-24, 2013
  • 23. Related Work (continued)  Various works on determining if a message is automated or not exist, most notably “Human, Bot or Cyborg” [26]  Unfortunately they relied heavily on Google Safe Browsing API which is only updated after someone has verified the link is dangerous. [27]  One work showed that up to 16% of all Twitter accounts show signs of automation. [28]  However, they point out that only a small number of tweets use the Mobile Web API MALWARE 2013October 22-24, 2013
  • 24. Future Work (with edits from Malware 2013 audience!) Analysis of use of specific strings over time Studying spread of ideas in Twitter in addition to spread of malware Case study of top infectors Carrier vs virology model of spread Compare to all vs just benign Testing twitter ( measure how well they do in disabling infected accounts/help them get better); Work with Twitter to integrate  Include follower to following ratio on infected accounts  More like antispam than antimalware        Geographic analysis of the infected accounts MALWARE 2013October 22-24, 2013
  • 25. Conclusion  We completed a large-scale analysis of the characteristics of BEK infectious Twitter Accounts  Some accounts showed signs of being solely for malware distribution  We found substantial variation in infectious message structure  We identified a large set of message types not previously published  We identified the characteristics most strongly associated with BEK infectious messages  Tweets using the Mobile API, with a Text Entropy lower than 4.3, and showing a strong PCC with known infectious messages, and those that additionally have URL's embedded in them  We presented the integration of our measurement techniques and how they integrate into our larger platform  Without manual investigation of all messages that we flagged as infectious we can not be certain of our results MALWARE 2013October 22-24, 2013
  • 26. Citations
      1. J. Oliver, S. Cheng, L. Manly, J. Zhu, R. Paz, S. Sioting, J. Leopando. “Blackhole Exploit Kit: A Spam Campaign, Not a Series of Individual Spam Runs, An In-Depth Analysis,” Trend Micro Incorporated Research Paper, 2012.
      2. Chris Grier, Lucas Ballard, Juan Caballero, et al. 2012. Manufacturing compromise: the emergence of exploit-as-a-service. In Proceedings of the 2012 ACM conference on Computer and communications security (CCS ’12). ACM, New York, NY, USA, 821-832.
      3. Gabor Szappanos. “Inside The Blackhole,” SophosLabs, 2012.
      4. Jason Jones. “The State of Web Exploit Kits,” HP DVLabs, 2012.
      5. Howard, Fraser. 2013. Technical paper: Journey inside the Blackhole exploit kit. Naked Security from Sophos. November 30, 2012.
      6. Chris Grier, Lucas Ballard, Juan Caballero, et al. 2012. Manufacturing compromise: the emergence of exploit-as-a-service. In Proceedings of the 2012 ACM conference on Computer and communications security (CCS ’12). ACM, New York, NY, USA, 821-832.
      7. Fraser Howard. “Exploring the Blackhole Exploit Kit,” Sophos Technical Paper, March 2012.
      8. Ziv Mador. “Exploiting Kits: The Underground’s Weapon of Choice,” Infosecurity Europe 2012, SpiderLabs at Trustwave, 2012.
      9. Zhou Li, Kehuan Zhang, Yinglian Xie, Fang Yu, and XiaoFeng Wang. 2012. Knowing your enemy: understanding and detecting malicious web advertising. In Proceedings of the 2012 ACM conference on Computer and communications security (CCS ’12). ACM, New York, NY, USA, 674-686.
      10. Shea Bennett. “Just How Big Is Twitter In 2012 [INFOGRAPHIC],” All Twitter - The Unofficial Twitter Resource, February 2013.
  • 27. Citations
      11. Mike Melanson. Twitter Kills the API Whitelist: What it Means for Developers and Innovation, February 11, 2011. URL = http://www.readwriteweb.com/archives/
      12. Joab Jackson. Twitter Now Using OAuth authentication for Third Party Apps, Computer World UK, September 1, 2010. URL = http://www.computerworlduk.com/news/security/3237659/twitter-now-using-oauth-authentication-for-third-party-apps/
      13. Arne Roomann-Kurrik. Announcing gzip Compression for Streaming APIs, Twitter Developers Feed, Jan 20, 2012. URL = https://dev.twitter.com/blog/announcing-gzip-compression-streaming-apis
      14. Prashanth Mundkur, Ville Tuulos, and Jared Flatow. 2011. Disco: a computing platform for large-scale data analytics. In Proceedings of the 10th ACM SIGPLAN workshop on Erlang (Erlang ’11). ACM, New York, NY, USA, 84-89.
      15. C. E. Shannon. A Mathematical Theory of Communication. Reprinted with corrections from The Bell System Technical Journal, Vol. 27, pp. 379-423, 623-656, July, October, 1948.
      16. Graham Cluley. “Outbreak: Blackhole malware attack spreading on Twitter using ‘It’s you on photo?’ disguise,” Sophos Naked Security Blog, July 27, 2012.
      17. Rob Waugh. “It’s you! Blackhole virus spreading rapidly via Twitter fools users with fake photo link,” MailOnline Science and Tech News, July 2012.
      18. Bastian M., Heymann S., Jacomy M. (2009). Gephi: an open source software for exploring and manipulating networks. International AAAI Conference on Weblogs and Social Media.
      19. S. Martin, W. M. Brown, R. Klavans, and K. Boyack (to appear, 2011). OpenOrd: An Open-Source Toolbox for Large Graph Layout, SPIE Conference on Visualization and Data Analysis (VDA).
      20. Aditya Mogadala and Vasudeva Varma. 2012. Twitter user behavior understanding with mood transition prediction. In Proceedings of the 2012 workshop on Data-driven user behavioral modelling and mining from social media (DUBMMSM ’12). ACM, New York, NY, USA, 31-34.
  • 28. Citations
      21. Johan Bollen and Huina Mao. 2011. Twitter Mood as a Stock Market Predictor. Computer 44, 10 (October 2011), 91-94. DOI=10.1109/MC.2011.323 http://dx.doi.org/10.1109/MC.2011.323
      22. Johan Bollen, Bruno Gonçalves, Guangchen Ruan, and Huina Mao. 2011. Happiness is assortative in online social networks. Artif. Life 17, 3 (August 2011), 237-251.
      23. Manuel Cebrian. 2012. Using friends as sensors to detect planetary-scale contagious outbreaks. In Proceedings of the 1st international workshop on Multimodal crowd sensing (CrowdSens ’12). ACM, New York, NY, USA, 15-16.
      24. Z. Chu, S. Gianvecchio, H. Wang, and S. Jajodia. 2010. Who is tweeting on Twitter: human, bot, or cyborg? In Proceedings of the 26th Annual Computer Security Applications Conference (ACSAC ’10). ACM, New York, NY, USA, 21-30.
      25. Google. Google safe browsing API. http://code.google.com/apis/safebrowsing/, Accessed: Feb 5, 2010.
      26. Chao Michael Zhang and Vern Paxson. 2011. Detecting and analyzing automated activity on Twitter. In Proceedings of the 12th international conference on Passive and active measurement (PAM ’11), Neil Spring and George F. Riley (Eds.). Springer-Verlag, Berlin, Heidelberg, 102-111.
