Life Cycle And Detection Of Bot Infections Through Network Traffic Analysis

Taming Botnets

Life cycle and detection of bot infections through
network traffic analysis

agenda
● Introduction
● Bots and botnets: short walk-through
● Taming botnets: Detection and Evasion
● Our approach
● Case studies
● Conclusion
● Disclaimer: we steal our images from Google Images :)

Introduction
● Why are we doing this research?
● Objectives
● Our data sources
● Our environment

A bunch of code in Node.js and Python. Customized sandboxing platform
(Cuckoo-based). Data indexed in Solr.

Introduction: bots
● “bot”: a software program installed on target
machine(s) in order to use the machine's
computational/network resources or to
collect information
● A typical bot is controlled by an external party
and therefore needs a communication channel
to receive commands and pass information
● Bots typically are used for malicious purposes ;-)

Introduction: bots (lifecycle)
● Installation (infection) phase: often by means of
a software exploit or a social engineering
technique (fake antivirus, fake software update)
● Post-infection phase: communication (C&C,
peer etc)

Introduction
● Our basic assumption is that a bot needs to be
able to communicate back in order to be useful.
● Our analysis is primarily “black-box”: we observe
the network traffic of a large network infrastructure in
order to identify possible infections and
“communication” links
● We also utilize sandboxing techniques to
observe behavior (mainly from the network side)
● We do not attempt to reverse engineer
(manually or automatically) botnet software

Botnets
● Infection vectors → often target end-user
machines (clients), in a large number of
cases by exploiting a software
vulnerability in the browser or related components
● C&C communication:
● Remember IRC bots? :)
● over HTTP (most common)
● Proprietary protocol
● Centralized or P2P infrastructure

Botnets: lifecycle
● C&C Hosting itself is another interesting
research area ;-)

So how do you get bots on your
machine? :)

How do you get bots on your
machine? ;-)
● Compromised servers: the most widespread vector, often
through silly vulns (e.g. WordPress!), but
high-profile web sites are also affected, or domains
get taken over (DNS poisoning and more)
● Placing a JavaScript iframe on a compromised
high-traffic machine is way more profitable than
defacing (hacktivism is only for hippies? ;)

How do you get bots (pt 2)
● SEO poisoning/manipulation.

How you get bots (pt 3)
● Advertisements and malvertisements: whole
new ecosystem:

OpenX is a huge security hole ;)

Anyways
● Once infected, the bot talks back...

Let's look at some real-life cases (the data is very
recent, mostly from the past few months).

Old-school bots (still active. For real!
May/2012: IRC bots still real :-D ;-))

Carberp
● Bot Infection: Drive-By-HTTP
● Payload and intermediate malware domains: normal, just
registered/DynDNS
● Distributed via: Many many compromised web-sites, top
score > 100 compromised resources detected during 1
week.
● C&C domains usually generated, but some special cases
below ;-).
● C&C and Malware domains located on the same AS (from
bot point of view). Easy to detect.
● Typical bot activity: Mass HTTP Post

Domain                               URL                                                        Referrer           Payload                 Size
beatshine.is-saved.org               /g/18418362672595167.js                                    www.*****press.ru  javascript              9414
activatedreplacing.is-very-evil.org  /index.php?28d9000e56c2a63080ff89c6f5357591                www.*****press.ru  html                    45443
activatedreplacing.is-very-evil.org  //images/r/785cee8be7f1da9a9d60820cbf8b1840.jar            -                  application/x-jar       4135
activatedreplacing.is-very-evil.org  /server_privileges.php?91370f5f009a815950578cb539f28b58=3  -                  application/executable  155529

Another attack attempt and update
URLs
Time                  Domain                     URL                                            IP
10/Apr/2012:10:29:09  nod32-matrosov-pideri.org  //images/785cee8be7f1da9a9d60820cbf8b1840.jar  62.122.79.42
10/Apr/2012:10:29:10  nod32-matrosov-pideri.org  /expl0it/At00micArray.class                    62.122.79.42
10/Apr/2012:10:29:11  nod32-matrosov-pideri.org  /expl0it/At00micArray/class.class              62.122.79.42
02/May/2012:08:42:59  rgn7er8yafh89cehuighv.org  /bxlkizmfgtlfwcdmljmrjlunqkvsslfiru.tpl        91.228.134.210
02/May/2012:08:42:59  avast-pidersiy-gandon.com  /crypt/files/crypted/config.bin                62.122.79.52
02/May/2012:08:43:00  rgn7er8yafh89cehuighv.org  /aDHfNt8w43yYGM.tiff                           91.228.134.210

Detection during infection and by
postinfection activity
● Infection: executable transfer from just-registered
domains (example: lifenews-sport.org) or
DynDNS domains (example:
uphchtxmji.homelinux.com)
● Updates: executable transfer from a just-registered
or DynDNS domain
● Postinfection activity: Mass HTTP Post to
generated domains like
n87e0wfoghoucjfe0id.org; URLs end with
different extensions
Netprotocol.exe
● Bot Infection was: Drive-By-FTP,
now: Drive-By-FTP, Drive-By-HTTP
● Payload and intermediate malware domains: Normal, Obfuscated
● Distributed via: compromised web-sites
● C&C domains usually generated, many domains in .be zone.
● C&C and malware domains located on different ASes. Bot
updates payload via HTTP
● Typical bot activity: HTTP Post, payload updates via HTTP.

Domain      URL                Referrer       Payload                 Size
3645455029  /1/s.html          Infected site  html                    997
Java.com    /js/deployJava.js  3645455029     javascript              4923
3645455029  /1/exp.jar         -              application/x-jar       18046
3645455029  /file1.dat         -              application/executable  138352

Attack analysis
- Script from www.java.com used during attack.
- Applet exp.jar loaded by FTP
- FTP server IP address obfuscated to avoid
detection
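
The obfuscated FTP host seen above (3645455029) is the server's IPv4 address written as a single decimal integer. A minimal Python sketch of decoding such hostnames follows; the function name `decode_integer_host` is ours for illustration and is not taken from any tool used in the research.

```python
import ipaddress
from typing import Optional


def decode_integer_host(host: str) -> Optional[str]:
    """Return the dotted-quad form of an all-digit hostname, or None."""
    if not (host.isascii() and host.isdigit()):
        return None
    value = int(host)
    if value > 0xFFFFFFFF:
        return None  # does not fit in 32 bits, so not an IPv4 address
    return str(ipaddress.IPv4Address(value))


print(decode_integer_host("3645455029"))  # 217.73.58.181
print(decode_integer_host("java.com"))    # None
```

The decoded value matches the FTP address that appears in the returnPage parameter of the request on the next slide.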

Interesting modifications
GET http://java.com/ru/download/windows_ie.jsp?host=java.com%26returnPage=ftp://217.73.58.181/1/s.html%26locale=ru HTTP/1.1
Key feature example
Date/Time 2012-04-20 11:11:49 MSD
Tag Name FTP_Pass
Target IP Address 217.73.63.202
Target Object Name 21

:password Java1.6.0_30@
:user anonymous
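
An anonymous FTP login whose password looks like Java1.6.0_30@ suggests that a Java runtime, rather than a person, fetched the file. A minimal sketch of matching it, assuming login events are already available as (user, password) pairs; the regular expression is only our guess at the general shape, based on the two passwords shown in these slides.

```python
import re
from typing import Optional

# Assumed shape: "Java" + a version such as 1.6.0_30 + "@"
JAVA_FTP_PASSWORD = re.compile(r"^Java(\d+\.\d+\.\d+_\d+)@$")


def java_version_from_ftp_login(user: str, password: str) -> Optional[str]:
    """Return the Java version leaked by an anonymous FTP login, if any."""
    if user != "anonymous":
        return None
    match = JAVA_FTP_PASSWORD.match(password)
    return match.group(1) if match else None


print(java_version_from_ftp_login("anonymous", "Java1.6.0_30@"))  # 1.6.0_30
```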

Activity example
Date/Time          2012-04-29 02:05:48 MSD  2012-04-29 02:06:08 MSD
Tag Name           HTTP_Post                HTTP_Post
Target IP Address  217.73.60.107            208.73.210.29
:server            rugtif.be                eksyghskgsbakrys.com
:URL               /check_system.php        /check_system.php
Domain registered: 2012-04-21
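
A registration date only eight days before the observed POST activity is a signal that can be checked mechanically. A minimal sketch, assuming the registration date has already been obtained from WHOIS; the 30-day threshold is an arbitrary illustration, not a value from the research.

```python
from datetime import date

NEW_DOMAIN_DAYS = 30  # assumed threshold, tune for your network


def is_recently_registered(registered: date, seen: date,
                           max_age_days: int = NEW_DOMAIN_DAYS) -> bool:
    """True if the domain was registered at most max_age_days before it was seen."""
    age = (seen - registered).days
    return 0 <= age <= max_age_days


# Dates from the activity example: registered 2012-04-21, POSTs seen 2012-04-29.
print(is_recently_registered(date(2012, 4, 21), date(2012, 4, 29)))  # True
```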

On-host detection and activity
Payload: usually netprotocol.exe, located in
Users\USER_NAME\AppData\Roaming;
it periodically downloads other malware.
Further payload loaded via HTTP:
http://64.191.65.99/view_img.php?c=4&k=a4422297a462ec0f01b83bc96068e064

Detection by AV: sample from May
09 2012, detection ratio 1/42
● (demos, recorded as videos)

● Infection: .jar and .dat files downloaded by FTP; server name
= obfuscated IP address, example ftp://3645456330/6/e.jar;
Java version in the FTP password, example Java1.6.0_29@
● Updates: executable transfer from some Internet host,
example GET http://184.82.0.35/f/kwe.exe
● Postinfection activity: Mass HTTP Post to normal and
generated domains with URL: check_system.php
09:04:46 POST http://hander.be/check_system.php
09:05:06 POST http://aratecti.be/check_system.php
09:06:48 POST http://hander.be/check_system.php
09:07:11 POST http://aratecti.be/check_system.php
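
The log excerpt shows the same path being POSTed to the two .be hosts in turn. A minimal sketch of flagging that pattern, assuming proxy log records have already been parsed into (client, method, url) tuples; the record layout, the client address 10.0.0.5 and the threshold of two hosts are all assumptions for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def paths_posted_to_many_hosts(records, min_hosts=2):
    """Map (client, path) -> set of hosts for POSTs that hit min_hosts or more hosts."""
    hosts_by_key = defaultdict(set)
    for client, method, url in records:
        if method != "POST":
            continue
        parts = urlsplit(url)
        hosts_by_key[(client, parts.path)].add(parts.hostname)
    return {key: hosts for key, hosts in hosts_by_key.items()
            if len(hosts) >= min_hosts}


# Hypothetical client address; hosts and path come from the excerpt above.
records = [
    ("10.0.0.5", "POST", "http://hander.be/check_system.php"),
    ("10.0.0.5", "POST", "http://aratecti.be/check_system.php"),
    ("10.0.0.5", "GET", "http://java.com/js/deployJava.js"),
]
print(paths_posted_to_many_hosts(records))
```

Keying on the client as well as the path keeps one machine's beaconing from being diluted by unrelated traffic from other hosts.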

Noproblemslove.com,
whoismistergreen.com, etc...
● Bot Infection: Drive-By-HTTP
● Payload and intermediate malware
domains: Normal / DynDNS
● Distributed via: Compromised web-sites.
● C&C domains: normal.
● C&C and malware domains located on
different ASes. Sophisticated attack scheme.
Timeout before activity.
● Typical bot activity: Mass HTTP Post

Noproblemslove.com,

Interesting domains from range
184.82.149.178-184.82.149.180 (Feb 2012)
Domain Name IP
www.google-analylics.com 184.82.149.179
google-anatylics.com 184.82.149.178
www.google-analitycs.com 184.82.149.180
webmaster-google.ru 184.82.149.178
paged2.googlesyndlcation.com 184.82.149.179
googlefilter.ru 184.82.149.179
rambler-analytics.ru 184.82.149.179
site-yandex.net 184.82.149.180
paged2.googlesyndlcation.com 184.82.149.179
www.yandex-analytics.ru 184.82.149.178
googles.4pu.com 184.82.149.178
googleapis.www1.biz 184.82.149.178
syn1-adriver.ru 184.82.149.178

HOSTER RANGE AND AS
www.google-analylics.com looks good,
BUT
Google, Rambler and Yandex together on
184.82.149.176/29 ?

The hoster range and autonomous system (AS)
are useful when you analyze suspicious events.
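
The check above can be sketched in a few lines: group resolved domains by their enclosing network and flag any small range that hosts look-alikes of several unrelated brands. The brand keyword list and the /29 prefix length are assumptions chosen to match this slide.

```python
import ipaddress
from collections import defaultdict

BRANDS = ("google", "yandex", "rambler")  # assumed keyword list


def brands_per_network(resolutions, prefix_len=29):
    """Map each network to the set of brand keywords seen in its domain names."""
    seen = defaultdict(set)
    for domain, ip in resolutions:
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        for brand in BRANDS:
            if brand in domain:
                seen[net].add(brand)
    return seen


# Three of the domain/IP pairs listed on the slide above.
resolutions = [
    ("www.google-analylics.com", "184.82.149.179"),
    ("rambler-analytics.ru", "184.82.149.179"),
    ("site-yandex.net", "184.82.149.180"),
]
for net, brands in brands_per_network(resolutions).items():
    if len(brands) > 1:
        print(net, sorted(brands))  # 184.82.149.176/29 ['google', 'rambler', 'yandex']
```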

Other domains but owner is the
same

What's common
whoismistergreen.com
  IP address: 213.5.68.105
  Created: 2011-07-26
  Registrant Name: JOHN ABRAHAM
  Address: ul. Dubois 119
  City: Lodz
noproblemslove.com
  IP address: 213.5.68.105
  Created: 2011-12-07
  Registrant Contact: Whois Privacy Protection Service, Whois Agent, gmvjcxkxhs@whoisservices.cn
patr1ckjane.com
  IP was: 176.65.166.28
  IP now: 213.5.68.105
  Created: 2011-07-21
  Registrant Name: patrick jane
  Address: ul. Dubois 119
  City: Lodz
noproblemsbro.com
  IP address: 176.65.166.28
  Created: 2011-12-07
  Registrant Contact: Whois Privacy Protection Service, Whois Agent, gmvjcxkxhs@whoisservices.cn

● Infection: executable transfer from just-registered
or DynDNS domains, like
fx58.ddns.us
● Updates: application/octet-stream bulk data
load from C&C
● Postinfection activity: Mass HTTP Post to
seemingly normal domains, e.g.
noproblemslove.com
Detection
● What we are building ;)

Cross-correlation data sources
● WHOIS (including team cymru whois)
● Our own DNS index, also talking to ISC about
possibilities of data swaps
● Sandbox farm (mainly to detect compromised
websites automagically and study behavior)
● Public “malicious IP address” databases.
● Public reputation (i.e. ToS) databases.
● (still work in progress)

Detection
● Manual and Automated
● Automated detection is largely based on
analysis of network traffic:
● Anomaly detection
● Pattern based-analysis
● Signatures (snort!)
● Traffic profiling (DNS traffic profiling, HTTP traffic
profiling etc)

Detection
● Detecting malicious botnet activity is very
popular in academia (interesting problem).
● In our research we do not claim extreme
novelty but rather will demonstrate our
experience and a few practical solutions that
seem to work :-)

Detection: loooots of papers!~

Detection: interesting bits
● Botnet detection has evolved from a pattern-based
approach (hardcoding bot command patterns and
capturing them with Snort) into a complex field of
generic detection of automated “call-back”
communication channels.

Detection
● Different “callback” methods, as seen in the
wild, possess interesting properties, such as:
● Large number of failed DNS requests
● Large number of DNS requests for IP addresses,
which are offline
● Connection attempts to mostly dead IP addresses
● Traffic pattern (differs from regular browsing)

Cat and mouse game
● Of course, all of this is easy to evade once you
know the method. But security is always a
cat-and-mouse game ;-)

Detection
● Detecting botnet activities by analyzing DNS
traffic
● Analyzing DNS names (dictionary comparison,
alphanumeric characters, detection of “generated”
domain names by similarities/patterns)
● Analyzing failed DNS queries
● DNS “ranking” (based on whois information)
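
One simple way to approximate the "generated name" check is to score the leftmost label by length and character entropy. This is a crude heuristic sketch and not the method used in the research; the thresholds are assumptions, and entropy alone is a weak signal that a real deployment would combine with dictionary comparison.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of the character distribution, in bits per character."""
    counts = Counter(text)
    return -sum((n / len(text)) * math.log2(n / len(text)) for n in counts.values())


def looks_generated(domain: str, min_len: int = 12, min_entropy: float = 3.5) -> bool:
    """Heuristic: a long first label with high character entropy."""
    label = domain.lower().split(".")[0]
    if len(label) < min_len:
        return False
    return shannon_entropy(label) >= min_entropy


# The first two names appear in the Carberp case study earlier in the slides.
for name in ("n87e0wfoghoucjfe0id.org", "rgn7er8yafh89cehuighv.org", "java.com"):
    print(name, looks_generated(name))
```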

Detection: rcode: 3 (Non-existing
domains)

Detection: rcode: 2 (server failure)
Rcode: 2 domains (failed servers)

Detection
● WHOIS cross-correlation – easily automated.
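
A minimal sketch of that automation, assuming WHOIS records have already been fetched and parsed into dictionaries (the field names are our own choice): build an inverted index from each attribute value to the domains that share it.

```python
from collections import defaultdict


def shared_whois_attributes(records, fields=("address", "email", "ip")):
    """Map (field, value) -> set of domains, keeping values shared by 2+ domains."""
    index = defaultdict(set)
    for domain, whois in records.items():
        for field in fields:
            value = whois.get(field)
            if value:
                index[(field, value.lower())].add(domain)
    return {key: doms for key, doms in index.items() if len(doms) > 1}


# Values taken from the "What's common" slide.
records = {
    "whoismistergreen.com": {"address": "ul. Dubois 119", "ip": "213.5.68.105"},
    "patr1ckjane.com": {"address": "ul. Dubois 119", "ip": "213.5.68.105"},
    "noproblemslove.com": {"email": "gmvjcxkxhs@whoisservices.cn", "ip": "213.5.68.105"},
    "noproblemsbro.com": {"email": "gmvjcxkxhs@whoisservices.cn", "ip": "176.65.166.28"},
}
for key, domains in shared_whois_attributes(records).items():
    print(key, sorted(domains))
```

Each surviving key is a pivot: starting from one known-bad domain, the shared address, e-mail or IP leads to the others.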

Detection
● Further step: cross-correlation to domain
names which have the same WHOIS attributes
● Sandboxing (we use a modified version of
Cuckoo Sandbox with user event simulation; not
perfect, but it works)
● Challenges:
– Simulate complex user behavior (mouse movements)
– Simulate complex user browsing pattern (visiting X with
search engine (image?) as referer)

Detection
(visualization)
● Parallel coordinates (see also the recent talk by
Alexandre Dulaunoy from CIRCL.LU and
Sebastien Tricaud from Picviz Labs at
CanSecWest)

Detection
● (demos, let's look at some videos :)

Conclusions
● Detection is still trivial, but keep your methods
“private” ;-)
● Detecting 'advanced' botnets (name your
favourite traffic profiling evasion method!) is out
of the question here, unless this becomes
widespread
● Cat and mouse game is still fun! ;-)

Tips and recommendations
● For infected machines: boot from clean media
and periodically do OFFLINE AV checking
● Monitor network traffic for any unusual activity
● Default-deny firewall policies + block any active
executable content

questions
● Contact us at:
● fygrave@gmail.com
● vladimir.b.kropotov@gmail.com

http://github.com/fygrave/dnslyzer for some code
