Protecting Browsers from DNS Rebinding Attacks
Collin Jackson
Stanford University
[email protected]
Adam Barth
Stanford University
[email protected]
Andrew Bortz
Stanford University
[email protected]
Weidong Shao
Stanford University
[email protected]
Dan Boneh
Stanford University
[email protected]
ABSTRACT
DNS rebinding attacks subvert the same-origin policy of
browsers and convert them into open network proxies. We
survey new DNS rebinding attacks that exploit the inter-
action between browsers and their plug-ins, such as Flash
Player and Java. These attacks can be used to circumvent
firewalls and are highly cost-effective for sending spam e-
mail and defrauding pay-per-click advertisers, requiring less
than $100 to temporarily hijack 100,000 IP addresses. We
show that the classic defense against these attacks, called
“DNS pinning,” is ineffective in modern browsers. The pri-
mary focus of this work, however, is the design of strong
defenses against DNS rebinding attacks that protect mod-
ern browsers: we suggest easy-to-deploy patches for plug-ins
that prevent large-scale exploitation, provide a defense tool,
dnswall, that prevents firewall circumvention, and detail
two defense options, policy-based pinning and host name
authorization.
Categories and Subject Descriptors
K.6.5 [Management of Computing and Information
Systems]: Security and Protection
General Terms
Security, Design, Experimentation
Keywords
Same-Origin Policy, DNS, Firewall, Spam, Click Fraud
1. INTRODUCTION
Users who visit web pages trust their browser to prevent
malicious web sites from leveraging their machines to attack
others. Organizations that permit JavaScript and other ac-
tive content through their firewall rely on the browser to
protect internal network resources from attack. To achieve
these security goals, modern browsers implement the same-
origin policy that attempts to isolate distinct “origins,” pro-
tecting sites from each other.
DNS rebinding attacks subvert the same-origin policy by
confusing the browser into aggregating network resources
controlled by distinct entities into one origin, effectively con-
verting browsers into open proxies. Using DNS rebinding,
an attacker can circumvent firewalls to spider corporate in-
tranets, exfiltrate sensitive documents, and compromise un-
patched internal machines. An attacker can also hijack the
IP address of innocent clients to send spam e-mail, commit
click fraud, and frame clients for misdeeds. DNS rebinding
vulnerabilities permit the attacker to read and write directly
on network sockets, subsuming the attacks possible with ex-
isting JavaScript-based botnets [24], which can send HTTP
requests but cannot read back the responses.
To mount a DNS rebinding attack, the attacker need only
register a domain name, such as attacker.com, and attract
web traffic, for example by running an advertisement. In
the basic DNS rebinding attack, the attacker answers DNS
queries for attacker.com with the IP address of his or her
own server with a short time-to-live (TTL) and serves vis-
iting clients malicious JavaScript. To circumvent a firewall,
when the script issues a second request to attacker.com, the
attacker rebinds the host name to the IP address of a tar-
get server that is inaccessible from the public Internet. The
browser believes the two servers belong to the same origin
because they share a host name, and it allows the script to
read back the response. The script can easily exfiltrate the
response, enabling the attacker to read arbitrary documents
from the internal server, as shown in Figure 1.
To mount this attack, the attacker did not compromise
any DNS servers. The attacker simply provided valid, au-
thoritative responses for attacker.com, a domain owned by
the attacker. This attack is very different from “pharm-
ing” [34], where the attacker must compromise a host name
owned by the target by subverting a user’s DNS cache or
server. DNS rebinding requires no such subversion. Conse-
quently, DNSSEC provides no protection against DNS re-
binding attacks: the attacker can legitimately sign all DNS
records provided by his or her DNS server in the attack.
DNS rebinding attacks have been known for a decade [8,
36]. A common defense implemented in several browsers is
DNS pinning: once the browser resolves a host name to an
IP address, the browser caches the result for a fixed dura-
tion, regardless of TTL. As a result, when JavaScript con-
nects to attacker.com, the browser will connect back to the
attacker’s server instead of the internal server.
[Figure 1: Firewall Circumvention Using Rebinding (attacker web server, browser client, and target server behind the firewall)]
Pinning is no longer an effective defense against DNS re-
binding attacks in current browsers because of vulnerabil-
ities introduced by plug-ins. These plug-ins provide addi-
tional functionality, including socket-level network access,
to web pages. The browser and each plug-in maintain sep-
arate pin databases, creating a new class of vulnerabilities
we call multi-pin vulnerabilities that permit an attacker to
mount DNS rebinding attacks. We demonstrate, for exam-
ple, how to exploit the interaction between the browser and
Java LiveConnect to pin the browser to one IP address while
pinning Java to another IP address, permitting the attacker
to read and write data directly on sockets to a host and
port of the attacker’s choice despite strong pinning by each
component.
Our experiments show how an attacker can exploit multi-
pin vulnerabilities to cheaply and efficiently assemble a tem-
porary, large-scale bot network. Our findings suggest that
nearly 90% of web browsers are vulnerable to rebinding at-
tacks that require only a few hundred milliseconds to
conduct (see Table 1). These attacks do not require users
to click on any malicious links: users need only view an at-
tacker’s web advertisement. By spending less than $100 on
advertising, an attacker can hijack 100,000 unique IP ad-
dresses to send spam, commit click fraud, or otherwise mis-
use them as open network proxies.
The bulk of our work focuses on designing robust defenses
to DNS rebinding attacks that protect current and future
browsers and plug-ins:
1. To combat firewall circumvention, we recommend or-
ganizations deploy DNS resolvers that prevent external
names from resolving to internal addresses. We pro-
vide an open-source implementation of such a resolver
in 300 lines of C called dnswall [15].
2. For Flash Player, Java, and LiveConnect, we suggest
specific, easy-to-deploy patches to prevent multi-pin
vulnerabilities, mitigating large-scale exploitation of
DNS rebinding for firewall circumvention and IP hi-
jacking.
Technology                            Attack Time
LiveConnect (JVM loaded)              47.8 ± 10.3 ms
Flash Player 9                        192 ± 5.7 ms
Internet Explorer 6 (no plug-ins)     1000 ms
Internet Explorer 7 (no plug-ins)     1000 ms
Firefox 1.5 and 2 (no plug-ins)       1000 ms
Safari 2 (no plug-ins)                1000 ms
LiveConnect                           1294 ± 37 ms
Opera 9 (no plug-ins)                 4000 ms
Table 1: Time Required for DNS Rebinding Attack by Technology (95% Confidence)
3. We propose two options for protecting browsers from
DNS rebinding: smarter pinning that provides better
security and robustness, and a backwards-compatible
use of the DNS system that fixes rebinding vulnerabil-
ities at their root (which we implemented as a 72-line
patch to Firefox 2).
The remainder of the paper is organized as follows. Sec-
tion 2 describes existing browser policy for network access.
Section 3 details DNS rebinding vulnerabilities, including
standard DNS rebinding and current multi-pin vulnerabili-
ties. Section 4 explains two classes of attacks that use these
vulnerabilities, firewall circumvention and IP hijacking, and
contains our experimental results. Section 5 proposes de-
fenses against both classes of attacks. Section 6 describes
related work. Section 7 concludes.
2. NETWORK ACCESS IN THE BROWSER
To display web pages, browsers are instructed to make
network requests by static content such as HTML and by
active content such as JavaScript, Flash Player, Java, and
CSS. Browsers restrict this network access in order to pre-
vent web sites from making malicious network connections.
The same-origin policy provides partial resource isolation
by restricting access according to origin, specifying when
content from one origin can access a resource in another ori-
gin. The policy applies to both network access and browser
state such as the Document Object Model (DOM) interface,
cookies, cache, history, and the password database [20]. The
attacks described in this paper circumvent the same-origin
policy for network access.
Access Within Same Origin. Within the same origin,
both content and browser scripts can read and write net-
work resources using the HTTP protocol. Plug-ins, such as
Flash Player and Java, can access network sockets directly,
allowing them to make TCP connections and, in some cases,
send and receive UDP packets as well. Java does not restrict
access based on port number, but Flash Player permits ac-
cess to port numbers less than 1024 only if the machine
authorizes the connection in an XML policy served from a
port number less than 1024.
Access Between Different Origins. In general, con-
tent from one origin can make HTTP requests to servers
in another origin, but it cannot read responses, effectively
restricting access to “send-only.” Flash Player permits its
movies to read back HTTP responses from different origins,
provided the remote server responds with an XML policy
authorizing the movie’s origin. Flash Player also permits
reading and writing data on TCP connections to arbitrary
port numbers, again provided the remote server responds
with a suitable XML policy on an appropriate port.
By convention, certain types of web content are assumed
to be public libraries, such as JavaScript, CSS, Java ap-
plets, and SWF movies. These files may be included across
domains. For example, one origin can include a CSS file
from another origin and read its text. Scripts can also read
certain properties of other objects loaded across domains,
such as the height and width of an image.
Prohibited Access. Some types of network access are pro-
hibited even within the same origin. Internet Explorer 7
blocks port numbers 19 (chargen), 21 (FTP), 25 (SMTP),
110 (POP3), 119 (NNTP), and 143 (IMAP); Firefox 2 blocks
those plus 51 additional port numbers, but Safari 2 does not
block any ports. Some of these port restrictions are designed
to prevent malicious web site operators from leveraging vis-
iting browsers to launch distributed denial of service or to
send spam e-mail, whereas others prevent universal cross-
site scripting via the HTML Form Protocol Attack [41].
Origin Definition. Different definitions of “origin” are
used by different parts of the browser. For network access,
browsers enforce the same-origin policy [38] based on three
components of the Uniform Resource Locator (URL) from
which it obtained the content. A typical URL is composed
of the following components:
scheme://hostname:port/path
Current browsers treat two objects as belonging to the same
origin if, and only if, their URLs contain the same scheme,
host name, and port number (e.g., http://amazon.com/ is
a different origin than http://amazon.co.uk/, even though
the two domains are owned by the same company). Other
resources use fewer components of the URL. For example,
cookies use only the host name.
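To make the comparison concrete, the following sketch (Python, illustrative only; the default-port handling reflects our assumption about how browsers normalize omitted ports) computes the scheme/host/port triple for two URLs:

from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

def origin(url):
    # The network-access origin is the (scheme, host name, port) triple.
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)

def same_origin(url_a, url_b):
    return origin(url_a) == origin(url_b)

# http://amazon.com/ and http://amazon.co.uk/ are different origins:
assert not same_origin("http://amazon.com/", "http://amazon.co.uk/")
assert same_origin("http://example.com/a", "http://example.com:80/b")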
Objects on the Internet, however, are not accessed by host
name. To connect to a server, the browser must first trans-
late a host name into an IP address and then open a socket
to that IP address. If one host name resolves to multiple IP
addresses owned by multiple entities, the browser will treat
them as if they were the same origin even though they are,
from an ownership point-of-view, different.
3. DNS REBINDING VULNERABILITIES
The network access policy in web browsers is based on
host names, which are bound by the Domain Name Sys-
tem (DNS) to IP addresses. An attacker mounting a DNS
rebinding attack attempts to subvert this security policy by
binding his or her host name to both the attack and target
server’s IP addresses.
3.1 Standard Rebinding Vulnerabilities
A standard rebinding attack uses a single browser tech-
nology (e.g. JavaScript, Java, or Flash Player) to connect
to multiple IP addresses with the same host name.
Multiple A Records. When a client resolves a host name
using DNS, the authoritative server can respond with mul-
tiple A records indicating the IP addresses of the host. The
first attack using DNS rebinding [8] in 1996 leveraged this
property to confuse the security policy of the Java Virtual
Machine (JVM):
1. A client visits a malicious web site, attacker.com, con-
taining a Java applet. The attacker’s DNS server binds
attacker.com to two IP addresses: the attacker’s web
server and the target’s web server.
2. The client executes the attacker’s applet, which opens
a socket to the target. The JVM permits this connec-
tion, because the target’s IP address is contained in
the DNS record for attacker.com.
Current versions of the JVM are not vulnerable to this at-
tack because the Java security policy has been changed. Ap-
plets are now restricted to connecting to the IP address from
which they were loaded. (Current attacks on Java are de-
scribed in Section 3.2.)
In the JavaScript version of this attack, the attacker sends
some JavaScript to the browser that instructs the browser
to connect back to attacker.com. The attacker’s server
refuses this second TCP connection, forcing the browser to
switch over to the victim IP address [21]. By using a RST
packet to refuse the connection, the attacker can cause some
browsers to switch to the new IP address after one second.
Subsequent XMLHttpRequests issued by the attacker’s code
will connect to the new IP address.
Time-Varying DNS. In 2001, the original attack on Java
was extended [36] to use time-varying DNS:
1. A client visits a malicious web site, attacker.com,
containing JavaScript. The attacker’s DNS server is
configured to bind attacker.com to the attacker’s IP
address with a very short TTL.
2. The attacker rebinds attacker.com to the target’s IP
address.
3. The malicious script uses frames or XMLHttpRequest
to connect to attacker.com, which now resolves to the
IP address of the target’s server.
Because the connection in Step 3 has the same host name
as the original malicious script, the browser permits the at-
tacker to read the response from the target.
Pinning in Current Browsers. Current browsers defend
against the standard rebinding attack by “pinning” host
names to IP addresses, preventing host names from referring
to multiple IP addresses.
• Internet Explorer 7 pins DNS bindings for 30 minutes.1
Unfortunately, if the attacker’s domain has multiple A
records and the current server becomes unavailable,
the browser will try a different IP address within one
second.
• Internet Explorer 6 also pins DNS bindings for 30 min-
utes, but an attacker can cause the browser to release
its pin after one second by forcing a connection to the
current IP address to fail, for example by including the
element <img src="http://attacker.com:81/">.
1The duration is set by the registry keys DnsCacheTimeout
and ServerInfoTimeOut in HKEY_CURRENT_USER\SOFTWARE\
Microsoft\Windows\CurrentVersion\Internet Settings.
• Firefox 1.5 and 2 cache DNS entries for between 60 and
120 seconds. DNS entries expire when the value of the
current minute increments twice.2 Using JavaScript,
the attacker can read the user’s clock and compute
when the pin will expire. Using multiple A records, an
attacker can further reduce this time to one second.
• Opera 9 behaves similarly to Internet Explorer 6. In
our experiments, we found that it pins for approxi-
mately 12 minutes but can be tricked into releasing its
pin after 4 seconds by connecting to a closed port.
• Safari 2 pins DNS bindings for one second. Because
the pinning time is so low, the attacker may need to
send a “Connection: close” HTTP header to ensure
that the browser does not re-use the existing TCP con-
nection to the attacker.
Flash Player 9. Flash Player 9 permits SWF movies to
open TCP sockets to arbitrary hosts, provided the destina-
tion serves an XML policy authorizing the movie’s origin [2].
According to Adobe, Flash Player 9 is installed on 55.8% of
web browsers (as of December 2006) [1]; according to our
own experiments, Flash Player 9 was present in 86.9% of
browsers. Flash Player is vulnerable to the following re-
binding attack:
1. The client’s web browser visits a malicious web site
that embeds a SWF movie.
2. The SWF movie opens a socket on a port less than
1024 to attacker.com, bound to the attacker’s IP ad-
dress. Flash Player sends <policy-file-request />.
3. The attacker responds with the following XML:
<?xml version="1.0"?>
<cross-domain-policy>
<allow-access-from domain="*" to-ports="*" />
</cross-domain-policy>
4. The SWF movie opens a socket to an arbitrary port
number on attacker.com, which the attacker has re-
bound to the target’s IP address.
The policy XML provided by the attacker in step 3 in-
structs Flash Player to permit arbitrary socket access to
attacker.com. Flash Player permits the socket connections
to the target because it does not pin host names to a single
IP address. If the attacker were to serve the policy file from
a port number ≥ 1024, Flash Player would authorize only
ports ≥ 1024.
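As an illustration of step 3, a minimal socket policy server fits in a few lines; the sketch below (Python, our own scaffolding rather than any code from this paper) answers Flash Player's policy request with the permissive XML shown above. Flash Player expects the policy response to be terminated by a null byte.

import socketserver

PERMISSIVE_POLICY = (
    b'<?xml version="1.0"?>'
    b'<cross-domain-policy>'
    b'<allow-access-from domain="*" to-ports="*" />'
    b'</cross-domain-policy>\0')

class PolicyHandler(socketserver.BaseRequestHandler):
    def handle(self):
        request = self.request.recv(1024)
        if b"policy-file-request" in request:
            # Served from a port below 1024, this policy authorizes
            # sockets to every port once the host name is rebound.
            self.request.sendall(PERMISSIVE_POLICY)

if __name__ == "__main__":
    # Any port below 1024 works for the attack; binding to it needs root.
    with socketserver.TCPServer(("0.0.0.0", 843), PolicyHandler) as server:
        server.serve_forever()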
3.2 Multi-Pin Vulnerabilities
Current browsers use several plug-ins to render web pages,
many of which permit direct socket access back to their ori-
gins. Another class of rebinding attacks exploits the fact
that these multiple technologies maintain separate DNS pin
databases. If one technology pins to the attacker’s IP ad-
dress and another pins to the target’s IP address, the at-
tacker can make use of inter-technology communication to
circumvent the same-origin restrictions on network access.
Some of these attacks have been discussed previously in the
full-disclosure community [4].
Java. Java, installed on 87.6%3 of web browsers [1], can also
2The duration is set by network.dnsCacheExpiration.
3We observed 98.1% penetration in our experiment.
open TCP connections back to their origins. The Java Vir-
tual Machine (JVM) maintains DNS pins separately from
the browser, opening up the possibility of DNS rebinding
vulnerabilities. Java applets themselves are not vulnerable
because the JVM retrieves applets directly from the net-
work, permitting the JVM to pin the origin of the applet to
the correct IP address. Java is vulnerable, however, to the
following attacks.
• LiveConnect bridges JavaScript and the JVM in Fire-
fox and Opera, permitting script access to the Java
standard library, including the Socket class, without
loading an applet. The browser pins to the attacker’s
IP address, but the JVM spawned by LiveConnect
does a second DNS resolve and pins to the target’s
IP address. The attacker’s JavaScript can exploit this
pin mismatch to open and communicate on a socket
from the client machine to an arbitrary IP address on
an arbitrary destination port, including UDP sockets
with a source port number ≥ 1024.
• Applets with Proxies are also vulnerable to a multi-
pin attack, regardless of which browser the client uses.
If the client uses an HTTP proxy to access the web,
there is yet another DNS resolver involved—the proxy.
When the JVM retrieves an applet via a proxy, it re-
quests the applet by host name, not by IP address.
If the applet opens a socket, the JVM does a second
DNS resolve and pins to the target’s IP address.
• Relative Paths can cause multi-pin vulnerabilities. If
a server hosts an HTML page that embeds an applet
using a relative path with the parameter mayscript
set to true, that machine can be the target of a multi-
pin attack. The browser pins to the target, retrieves
the HTML page, and instructs the JVM to load the
applet. The JVM does a second DNS resolve, pins
to the attacker, and retrieves a malicious applet. The
applet instructs the browser, via JavaScript, to issue
XMLHttpRequests to the target’s IP address.
Flash Player. Flash Player would still be vulnerable to
multi-pin attacks even if it pinned DNS bindings. Flash
Player does not retrieve its movies directly from the net-
work. Instead, the browser downloads the movie and spawns
Flash Player, transferring the movie’s origin by host name.
When the attacker’s movie attempts to open a socket, Flash
Player does a second DNS resolution and would pin to the
target’s IP address. The URLLoader class is not vulnerable to
multi-pin attacks because it uses the browser to request the
URL and thus uses the browser’s DNS pins, but the Socket
class could still be used to read and write on arbitrary TCP
sockets.
Other Plug-ins. Other browser plug-ins permit network
access, including Adobe Acrobat and Microsoft Silverlight.
Acrobat restricts network communication to the SOAP pro-
tocol but does not restrict access by document origin. Of-
ten, the Acrobat plug-in will prompt the user before access-
ing the network. Silverlight permits network access through
BrowserHttpWebRequest, which uses the browser to make
the request (like URLLoader in Flash Player) and thus uses
the browser’s DNS pins.
4. ATTACKS USING DNS REBINDING
An attacker can exploit the DNS rebinding vulnerabilities
described in Section 3 to mount a number of attacks. For
some of these attacks, the attacker requires the direct socket
access afforded by DNS rebinding with Flash Player and
Java, whereas others require only the ability to read HTTP
responses from the target. The attacks fall into two broad
categories, according to the attacker’s goal:
• Firewall Circumvention. The attacker can use DNS re-
binding to access machines behind firewalls that he or
she cannot access directly. With direct socket access,
the attacker can interact with a number of internal
services besides HTTP.
• IP Hijacking. The attacker can also use DNS rebinding
to access publicly available servers from the client’s IP
address. This allows the attacker to take advantage of
the target’s implicit or explicit trust in the client’s IP
address.
To mount these attacks, the attacker must first induce the
client to load some active content. This can be done by a
variety of techniques discussed in Section 4.4. Once loaded
onto the client’s machine, the attacker’s code can communi-
cate with any machine reachable by the client.
4.1 Firewall Circumvention
A firewall restricts traffic between computer networks in
different zones of trust. Some examples include blocking
connections from the public Internet to internal machines
and mediating connections from internal machines to Inter-
net servers with application-level proxies. Firewall circum-
vention attacks bypass the prohibition on inbound connec-
tions, allowing the attacker to connect to internal servers
while the user is visiting the attacker’s Internet web page
(see Figure 1).
Spidering the Intranet. The attacker need not specify
the target machine by IP address. Instead, the attacker
can guess the internal host name of the target, for example
hr.corp.company.com, and rebind attacker.com to a CNAME
record pointing to that host name. The client’s own recur-
sive DNS resolver will complete the resolution and return
the IP address of the target. Intranet host names are often
guessable and occasionally disclosed publicly [30, 9]. This
technique obviates the need for the attacker to scan IP ad-
dresses to find an interesting target but does not work with
the multiple A record technique described in Section 3.1.
Having found a machine on the intranet, the attacker can
connect to the machine over HTTP and request the root
document. If the server responds with an HTML page, the
attacker can follow links and search forms on that page,
eventually spidering the entire intranet. Web servers inside
corporate firewalls often host confidential documents, rely-
ing on the firewall to prevent untrusted users from accessing
the documents. Using a DNS rebinding attack, the attacker
can leverage the client’s browser to read these documents
and exfiltrate them to the attacker, for example by submit-
ting an HTML form to the attacker’s web server.
Compromising Unpatched Machines. Network admin-
istrators often do not patch internal machines as quickly
as Internet-facing machines because the patching process is
time-consuming and expensive. The attacker can attempt
to exploit known vulnerabilities in machines on the internal
network. In particular, the attacker can attempt to exploit
the client machine itself. The attacks against the client it-
self originate from localhost and so bypass software fire-
walls and other security checks, including many designed to
protect against serious vulnerabilities. If an exploit succeeds, the
attacker can establish a presence within the firewall that
persists even after clients close their browsers.
Abusing Internal Open Services. Internal networks
contain many open services intended for internal use only.
For example, network printers often accept print jobs from
internal machines without additional authentication. The
attacker can use direct socket access to command network
printers to exhaust their toner and paper supplies.
Similarly, users inside firewalls often feel comfortable cre-
ating file shares or FTP servers accessible to anonymous
users under the assumption that the servers will be avail-
able only to clients within the network. With the ability to
read and write arbitrary sockets, the attacker can exfiltrate
the shared documents and use these servers to store illicit
information for later retrieval.
Consumer routers are often installed without changing the
default password, making them an attractive target for re-
configuration attacks by web pages [40]. Firmware patches
have attempted to secure routers against cross-site scripting
and cross-site request forgery, in an effort to prevent recon-
figuration attacks. DNS rebinding attacks allow the attacker
direct socket access to the router, bypassing these defenses.
4.2 IP Hijacking
Attackers can also use DNS rebinding attacks to target
machines on the public Internet. For these attacks, the at-
tacker is not leveraging the client’s machine to connect to
otherwise inaccessible services but instead abusing the im-
plicit or explicit trust public services have in the client’s IP
address. Once the attacker has hijacked a client’s IP ad-
dress, there are several attacks he or she can perpetrate.
Committing Click Fraud. Web publishers are often paid
by web advertisers on a per-click basis. Fraudulent publish-
ers can increase their advertising revenue by generating fake
clicks, and advertisers can drain competitors’ budgets by
clicking on their advertisements. The exact algorithms used
by advertising networks to detect these “invalid” clicks are
proprietary, but the IP address initiating the click is widely
believed to be an essential input. In fact, one common use
of bot networks is to generate clicks [7].
Click fraud would appear to require only the ability to
send HTTP requests to the advertising network, but adver-
tisers defend against the send-only attacks, permitted by the
same-origin policy, by including a unique nonce with every
advertising impression. Clicks lacking the correct nonce are
rejected as invalid, requiring the attacker to read the nonce
from an HTTP response in order to generate a click.
This attack is highly cost-effective, as the attacker can
buy advertising impressions, which cost tens of cents per
thousand, and convert them into clicks, worth tens of cents
each. The attack is sufficiently cost-effective that the at-
tacker need not convert every purchased impression into a
click. Instead, the fraudster can use most of the purchased
impressions to generate fake impressions on the site, main-
taining a believable click-through rate.
Sending Spam. Many e-mail servers blacklist IP addresses
known to send spam e-mail [39]. By hijacking a client’s IP
address, an attacker can send spam from IP addresses with
clean reputations. To send spam e-mail, the attacker need
only write content to SMTP servers on port 25, an action
blocked by most browsers but permitted by Flash Player
and Java. Additionally, an attacker will often be able to use
the client’s actual mail relay. Even service providers that
require successful authentication via POP3 before sending
e-mail are not protected, because users typically leave their
desktop mail clients open and polling their POP3 servers.
Defeating IP-based Authentication. Although discour-
aged by security professionals [10], many Internet services
still employ IP-based authentication. For example, the ACM
Digital Library makes the full text of articles available only
to subscribers, who are often authenticated by IP address.
After hijacking an authorized IP address, the attacker can
access the service, defeating the authentication mechanism.
Because the communication originates from an IP address
actually authorized to use the service, it can be difficult,
or even impossible, for the service provider to recognize the
security breach.
Framing Clients. An attacker who hijacks an IP address
can perform misdeeds and frame the client. For example,
an attacker can attempt to gain unauthorized access to a
computer system using a hijacked IP address as a proxy.
As the attack originates from the hijacked IP address, the
logs will implicate the client, not the attacker, in the crime.
Moreover, if the attacker hosts the malicious web site over
HTTPS, the browser will not cache the page and no traces
will be left on the client’s machine.
4.3 Proof-of-Concept Demonstration
We developed proof-of-concept exploits for DNS rebinding
vulnerabilities in Flash Player 9, LiveConnect, Java applets
with proxy servers, and the browser itself. Our system con-
sists of a custom DNS server authoritative for dnsrebinding.net,
a custom Flash Player policy server, and a standard Apache
web server. The various technologies issue DNS queries
that encode the attacker and target host names, together
with a nonce, in the subdomain. For each nonce, the DNS
server first responds with the attacker’s IP address (with a
zero TTL) and thereafter with the target’s IP address. Our
proof-of-concept demo, http://crypto.stanford.edu/dns,
implements wget and telnet by mounting a rebinding at-
tack against the browser.
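The per-nonce rebinding behavior reduces to a small amount of state; the sketch below (Python) models it. The subdomain encoding and function names are our own illustration, not the actual server used in the demo, and the handler would sit inside an authoritative DNS server for dnsrebinding.net.

# First A query for a nonce -> attacker's IP (zero TTL);
# every later query for the same nonce -> target's IP.
seen_nonces = set()

def rebinding_answer(qname, attacker_ip, target_ip):
    """Return (ip, ttl) for an A query such as
    <attacker>.<target>.<nonce>.dnsrebinding.net (encoding assumed)."""
    nonce = qname.rstrip(".").split(".")[-3]
    if nonce not in seen_nonces:
        seen_nonces.add(nonce)
        return attacker_ip, 0
    return target_ip, 0

print(rebinding_answer("a.t.n42.dnsrebinding.net", "203.0.113.5", "10.0.0.7"))
print(rebinding_answer("a.t.n42.dnsrebinding.net", "203.0.113.5", "10.0.0.7"))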
4.4 Experiment: Recruiting Browsers
Methodology. We tested DNS rebinding experimentally
by running a Flash Player 9 advertisement on a minor ad-
vertising network targeting the keywords “Firefox,” “game,”
“Internet Explorer,” “video,” and “YouTube.” The experi-
ment used two machines in our laboratory, an attacker and a
target. The attacker ran a custom authoritative DNS server
for dnsrebinding.net, a custom Flash Player policy server,
and an Apache web server hosting the advertisement. The
target ran an Apache web server to log successful attacks.
The Flash Player advertisement exploited the vulnerability
described in Section 3.1 to load an XML document from the
target server in our lab. The attack required only that the
client view the ad, not that the user click on the ad.
Vulnerability       Impressions
Flash Player 9      86.9%
LiveConnect         24.4%
Java+Proxy           2.2%
Total Multi-Pin     90.6%
Table 2: Percentage of Impressions by Vulnerability
[Figure 2: Duration of Successful Attacks. Cumulative plots of successful attacks (percent; log scale) against duration of attack success in seconds, for the 75% shortest-duration attacks and for all attacks.]
The experiment lasted until the user navigated away from
the advertisement, at which time we lost the ability to use
the viewer’s network connection. For privacy, we collected
only properties typically disclosed by browsers when viewing
web pages (e.g., plug-in support, user agent, and external IP
address). The experiment conformed to the terms of service
of the advertising network and to the guidelines of the in-
dependent review board at our institution. Every network
operation produced by the advertisement could have been
produced by a legitimate SWF advertisement, but we pro-
duced the operations through the Socket interface, demon-
strating the ability to make arbitrary TCP connections.
Results. We ran the ad beginning at midnight EDT on
three successive nights in late April 2007. We bid $0.50
per 1000 impressions for a variety of keywords. We spent
$10 per day, garnering approximately 20,000 impressions per
day. Due to a server misconfiguration, we disregarded ap-
proximately 10,000 impressions. We also disregarded 19 im-
pressions from our university. We received 50,951 impres-
sions from 44,924 unique IP addresses (40.2% IE7, 32.3%
IE6, 23.5% Firefox, 4% Other).
We ran the rebinding experiment on the 44,301 (86.9%)
impressions that reported Flash Player 9. We did not at-
tempt to exploit other rebinding vulnerabilities (see Ta-
ble 2). The experiment was successful on 30,636 (60.1%)
impressions and 27,480 unique IP addresses. The attack
was less successful on the 1,672 impressions served to Mac
OS, succeeding 36.4% of the time, compared to a success
rate of 70.0% on the 49,535 (97.2%) Windows impressions.4
Mac OS is more resistant to this rebinding attack due to
some caching of DNS entries despite their zero TTL.
For each successful experiment, we measured how long an
attacker could have used the client’s network access by load-
ing the target document at exponentially longer intervals, as
shown in Figure 2. The median impression duration was 32
seconds, with 25% of the impressions lasting longer than 256
seconds. We observed 9 impressions with a duration of at
least 36.4 hours, 25 at least 18.2 hours, and 81 at least 9.1
hours. In aggregate, we obtained 100.3 machine-days of net-
work access. These observations are consistent with those
of [24]. The large number of attacks ending between 4.2 and
8.5 minutes suggests that this is a common duration of time
for users to spend on a web page.
Discussion. Our experimental results show that DNS re-
binding vulnerabilities are widespread and cost-effective to
exploit on a large scale. Each impression costs $0.0005 and
54% of the impressions convert to successful attacks from
unique IP addresses. To hijack 100,000 IP addresses for a
temporary bot network, an attacker would need to spend
less than $100. This technique compares favorably to rent-
ing a traditional bot network for sending spam e-mail and
committing click fraud for two reasons. First, these applica-
tions require large numbers of “fresh” IP addresses for short
durations as compromised machines are quickly blacklisted.
Second, while estimates of the rental cost of bot networks
vary [44, 14, 7], this technique appears to be at least one or
two orders of magnitude less expensive.
5. DEFENSES AGAINST REBINDING
Defenses for DNS rebinding attacks can be implemented
in browsers, plug-ins, DNS resolvers, firewalls, and servers.
These defenses range in complexity of development, diffi-
culty of deployment, and effectiveness against firewall cir-
cumvention and IP hijacking. In addition to necessary mit-
igations for Flash Player, Java LiveConnect, and browsers,
we propose three long-term defenses. To protect against fire-
wall circumvention, we propose a solution that can be de-
ployed unilaterally by organizations at their network bound-
ary. To fully defend against rebinding attacks, we propose
two defenses: one that requires socket-level network access
be authorized explicitly by the destination server and an-
other that works even if sockets are allowed by default.
5.1 Fixing Firewall Circumvention
Networks can be protected against firewall circumvention
by forbidding external host names from resolving to internal
IP addresses, effectively preventing the attacker from nam-
ing the target server. Without the ability to name the tar-
get, the attacker is unable to aggregate the target server into
an origin under his or her control. These malicious bindings
4We succeeded in opening a socket with 2 of 11 PlayStation 3
impressions (those with Flash Player 9), but none of the 12
Nintendo Wii impressions were vulnerable.
can be blocked either by filtering packets at the firewall [5]
or by modifying the DNS resolvers used by clients on the
network.
• Enterprise. By blocking outbound traffic on port 53, a
firewall administrator for an organization can force all
internal machines, including HTTP proxies and VPN
clients, to use a DNS server that is configured not to
resolve external names to internal IP addresses. To
implement this approach, we developed a 300 line C
program, dnswall [15], that runs alongside BIND and
enforces this policy.
• Consumer. Many consumer firewalls, such as those
produced by Linksys, already expose a caching DNS
resolver and can be augmented with dnswall to block
DNS responses that contain private IP addresses. The
vendors of these devices have an incentive to patch
their firewalls because these rebinding attacks can be
used to reconfigure these routers to mount further at-
tacks on their owners.
• Software. Software firewalls, such as the Windows
Firewall, can also prevent their own circumvention by
blocking DNS resolutions to 127.*.*.*. This tech-
nique does not defend services bound to the external
network interface but does protect a large number of
services that bind only to the loopback interface.
Blocking external names from resolving to internal addresses
prevents firewall circumvention but does not defend against
IP hijacking. An attacker can still use internal machines to
attack services running on the public Internet.
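The core dnswall check is simple enough to sketch; the Python below (not the 300-line C implementation, and the definition of “internal” is our assumption: RFC 1918, loopback, and link-local space) drops A records that point into internal address space.

import ipaddress

def is_internal(ip_str):
    ip = ipaddress.ip_address(ip_str)
    # Addresses an external host name should never resolve to.
    return ip.is_private or ip.is_loopback or ip.is_link_local

def filter_external_answer(a_records):
    """Remove internal addresses from a response for an external name."""
    return [ip for ip in a_records if not is_internal(ip)]

print(filter_external_answer(["171.64.78.146", "10.0.0.7", "127.0.0.1"]))
# prints ['171.64.78.146']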
5.2 Fixing Plug-ins
Plug-ins are a particular source of complexity in defend-
ing against DNS rebinding attacks because they enable sub-
second attacks, provide socket-level network access, and op-
erate independently from browsers. In order to prevent re-
binding attacks, these plug-ins must be patched.
Flash Player. When a SWF movie opens a socket to a
new host name, it requests a policy over the socket to de-
termine whether the host accepts socket connections from
the origin of the movie. Flash Player could fix most of
its rebinding vulnerabilities by considering a policy valid
for a socket connection only if it obtained the policy from
the same IP address in addition to its current requirement
that it obtained the policy from the same host name. Us-
ing this design, when attacker.com is rebound to the tar-
get IP address, Flash Player will refuse to open a socket to
that address unless the target provides a policy authorizing
attacker.com. This simple refinement uses existing Flash
Player policy deployments and is backwards compatible, as
host names expecting Flash Player connections already serve
policy documents from all of their IP addresses.
SWF movies can also access port numbers ≥ 1024 on
their origin host name without requesting a policy. Al-
though the majority of services an attacker can profitably
target (e.g., SMTP, HTTP, HTTPS, SSH, FTP, NNTP)
are hosted on low-numbered ports, other services such as
MySQL, BitTorrent, IRC, and HTTP proxies are vulnera-
ble. To fully protect against rebinding attacks, Flash Player
could request a policy before opening sockets to any port,
even back to its origin. However, this modification breaks
backwards compatibility because those servers might not
already be serving policy files.
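The proposed rule can be stated as a small check; the sketch below (Python pseudocode with our own names for the pieces of state, not Flash Player code) captures the refinement: a socket policy is honored only if it was fetched from the IP address now being connected to, in addition to matching the host name.

def socket_allowed(origin_host, policy, connect_ip):
    # policy["host"]: host name the policy was requested for
    # policy["from_ip"]: IP address the policy was actually read from
    # policy["allows"]: predicate over the requesting movie's origin
    return (policy["host"] == origin_host and
            policy["from_ip"] == connect_ip and
            policy["allows"](origin_host))

# After attacker.com is rebound, the policy fetched from the attacker's
# server (203.0.113.5) no longer authorizes sockets to the target (10.0.0.7).
policy = {"host": "attacker.com", "from_ip": "203.0.113.5",
          "allows": lambda origin: True}
print(socket_allowed("attacker.com", policy, "203.0.113.5"))  # True
print(socket_allowed("attacker.com", policy, "10.0.0.7"))     # False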
Java. Many deployed Java applets expect sockets to be al-
lowed by default. If clients are permitted to use these applets
from behind HTTP proxies, they will remain vulnerable to
multi-pin attacks because proxy requests are made by host
name instead of by IP address. A safer approach is to use the
CONNECT method to obtain a proxied socket connection to an
external machine. Typically proxies only allow CONNECT on
port 443 (HTTPS), making this the only port available for
these applets. Alternatively, proxies can use HTTP head-
ers to communicate IP addresses of hosts between the client
and the proxy [28, 29], but this approach requires both the
client and the proxy to implement the protocol.
Java LiveConnect. LiveConnect introduces additional
vulnerabilities, but browsers can fix the LiveConnect multi-
pin vulnerability without altering the JVM by installing
their own DNS resolver into the JVM using a standard
interface. Firefox, in particular, implements LiveConnect
through the Java Native Interface (JNI). When Firefox ini-
tializes the JVM, it can install a custom InetAddress class
that will handle DNS resolution for the JVM. This custom
class should contain a native method that implements DNS
resolution using Firefox’s DNS resolver instead of the system
resolver. If the browser implements pinning, LiveConnect
and the browser will use a common pin database, removing
multi-pin vulnerabilities.
5.3 Fixing Browsers (Default-Deny Sockets)
Allowing direct socket access by default precludes many
defenses for DNS rebinding attacks. If browser plug-ins de-
faulted to denying socket access, as a patched Flash Player
and the proposed TCPConnection (specified in HTML5 [19])
would, these defenses would become viable. Java and Live-
Connect, along with any number of lesser-known plug-ins,
expect socket access to be allowed, and fixing these is a chal-
lenge.
Checking Host Header. HTTP 1.1 requires that user
agents include a Host header in HTTP requests that spec-
ifies the host name of the server [11]. This feature is used
extensively by HTTP proxies and by web servers to host
many virtual hosts on one IP address. If sockets are de-
nied by default, the Host header reliably indicates the host
name being used by the browser to contact the server be-
cause XMLHttpRequest [43] and related technologies are re-
stricted from spoofing the Host header.5 One server-side de-
fense for these attacks is therefore to reject incoming HTTP
requests with unexpected Host headers [28, 37].
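Server-side, the check is a single comparison; a minimal sketch using Python's http.server (the host names are placeholders):

from http.server import BaseHTTPRequestHandler, HTTPServer

EXPECTED_HOSTS = {"www.example.com", "www.example.com:80"}

class HostCheckingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # During a rebinding attack the browser sends the attacker's
        # host name in the Host header, not ours.
        if self.headers.get("Host") not in EXPECTED_HOSTS:
            self.send_error(400, "Unexpected Host header")
            return
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok\n")

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8000), HostCheckingHandler).serve_forever()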
Finer-grained Origins. Another defense against DNS
rebinding attacks is to refine origins to include additional
information, such as the server’s IP address [28] or public
key [27, 23], so that when the attacker rebinds attacker.com
to the target, the browser will consider the rebound host
name to be a new origin. One challenge to deploying finer-
grained origins is that every plug-in would need to revise its
security policies and interacting technologies would need to
hand-off refined origins correctly.
5Lack of integrity of the Host header has been a recur-
ring source of security vulnerabilities, most notably in Flash
Player 7.
• IP Addresses. Refining origins with IP address [28]
is more robust than pinning in that a single browsing
session can fail-over from one IP address to another.
When such a fail-over occurs, however, it will likely
break long-lived AJAX applications, such as Gmail,
because they will be prevented from making XML-
HttpRequests to the new IP address. Users can recover
from this by clicking the browser’s reload button. Un-
fortunately, browsers that use a proxy server do not
know the actual IP address of the remote server and
thus cannot properly refine origins. Also, this defense
is vulnerable to an attack using relative paths to script
files, similar to the applet relative-path vulnerability
described in Section 3.2.
• Public Keys. Augmenting origins with public keys [27,
23] prevents two HTTPS pages served from the same
domain with di↵ erent public keys from reading each
other’s state. This defense is useful when users dis-
miss HTTPS invalid certificate warnings and chiefly
protects HTTPS-only “secure” cookies from network
attackers. Many web pages, however, are not served
over HTTPS, rendering this defense more appropriate
for pharming attacks that compromise victim domains
than for rebinding attacks.
Smarter Pinning. To mitigate rebinding attacks, browsers
can implement smarter pinning policies. Pinning is a de-
fense for DNS rebinding that trades off robustness for secu-
rity. RFC 1035 [32] provides for small (and even zero) TTLs
to enable dynamic DNS and robust behavior in the case of
server failure but respecting these TTLs allows rebinding
attacks. Over the last decade, browsers have experimented
with di↵ erent pin durations and release heuristics, leading
some vendors to shorten their pin duration to improve ro-
bustness [13]. However, duration is not the only parameter
that can be varied in a pinning policy.
Browsers can vary the width of their pins by permitting
host names to be rebound within a set of IP addresses that
meet some similarity heuristic. Selecting an optimal width
as well as duration enables a better trade-off between se-
curity and robustness than optimizing duration alone. One
promising policy is to allow rebinding within a class C net-
work. For example, if a host name resolved to 171.64.78.10,
then the client would also accept any IP address beginning
with 171.64.78 for that host name. The developers of the
NoScript Firefox extension [26] have announced plans [25]
to adopt this pinning heuristic.
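The width check itself is a one-liner; a sketch (Python, our own function name):

import ipaddress

def repin_allowed(pinned_ip, new_ip):
    # Class C pinning: the host name may re-resolve only within the
    # same /24 network as the address it was originally pinned to.
    network = ipaddress.ip_network(pinned_ip + "/24", strict=False)
    return ipaddress.ip_address(new_ip) in network

assert repin_allowed("171.64.78.10", "171.64.78.99")
assert not repin_allowed("171.64.78.10", "171.64.79.10")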
• Security. When browsers use class C network pinning,
the attacker must locate the attack server on the same
class C network as the target, making the rebinding
attack much more difficult to mount. The attack is
possible only if the attacker co-locates a server at the
same hosting facility or leverages a cross-site scripting
vulnerability on a co-located server. This significantly
raises the bar for the attacker and provides better re-
course for the target.
• Robustness. To study the robustness of class C net-
work pinning, we investigated the IP addresses re-
ported by the 100 most visited English-language sites
(according to Alexa [3]). We visited the home page of
these sites and compiled a list of the 336 host names
used for embedded content (e.g., www.yahoo.com em-
beds images from us.i1.yimg.com). We then issued
DNS queries for these hosts every 10 minutes for 24
hours, recording the IP addresses reported.
In this experiment, 58% of the host names reported a single IP address
consistently across all queries. Note that geographic
load balancing is not captured in our data because we
issued our queries from a single machine, mimicking
the behavior of a real client. Averaged over the 42%
of hosts reporting multiple IP addresses, if a browser
pinned to an IP address at random, the expected frac-
tion of IP addresses available for rebinding under class
C network pinning is 81.3% compared with 16.4% un-
der strict IP address pinning, suggesting that class C
pinning is significantly more robust to server failure.
Other heuristics for pin width are possible. For example,
the browser could prevent rebinding between public IP ad-
dresses and the RFC 1918 [35] private IP addresses. This
provides greater robustness for fail-overs across data centers
and for dynamic DNS. LocalRodeo [22, 45] is a Firefox ex-
tension that implements RFC 1918 pinning for JavaScript.
As for security, RFC 1918 pinning largely prevents firewall
circumvention but does not protect against IP hijacking nor
does it prevent firewall circumvention in the case where a
firewall protects non-private IP addresses, which is the case
for many real-life protected networks and personal software
firewalls.
Even the widest possible pinning heuristic prevents some
legitimate rebinding of DNS names. For example, public
host names controlled by an organization often have two IP
addresses, a private IP address used by clients within the
firewall and a public IP address used by clients on the Inter-
net. Pinning prevents employees from properly connecting
to these servers after joining the organization’s Virtual Pri-
vate Network (VPN) as those host names appear to rebind
from public to private IP addresses.
Policy-based Pinning. Instead of using unpinning heuris-
tics, we propose browsers consult server-supplied policies to
determine when it is safe to re-pin a host name from one IP
address to another, providing robustness without degrading
security. To re-pin safely, the browser must obtain a policy
from both the old and new IP address (because some at-
tacks first bind to the attacker whereas others first bind to
the target). Servers can supply this policy at a well-known
location, such as /crossdomain.xml, or in reverse DNS (see
Section 5.4).
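The re-pin decision then reduces to requiring agreement from both endpoints. A hedged sketch (Python; fetch_policy is a placeholder for retrieving the server-supplied policy from /crossdomain.xml or reverse DNS):

def may_repin(host, old_ip, new_ip, fetch_policy):
    """Allow re-pinning host from old_ip to new_ip only if policies
    obtained from BOTH addresses authorize the host name. Both sides
    matter because some attacks bind to the attacker first and others
    bind to the target first."""
    old_policy = fetch_policy(old_ip)   # set of authorized host names, or None
    new_policy = fetch_policy(new_ip)
    if old_policy is None or new_policy is None:
        return False                    # missing policy: keep the old pin
    return host in old_policy and host in new_policy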
Pinning Pitfalls. Correctly implementing pinning has sev-
eral subtleties that are critical to its ability to defend against
DNS rebinding attacks.
• Common Pin Database. To eliminate multi-pin at-
tacks, pinning-based defenses require that all browser
technologies that access the network share a common
pin database. Many plug-ins, including Flash Player
and Silverlight, already use the browser’s pins when
issuing HTTP requests because they issue these re-
quests through the browser. To share DNS pins for
other kinds of network access, either the browser could
expose an interface to its pin database or the operating
system could pin in its DNS resolver. Unfortunately,
browser vendors appear reluctant to expose such an
interface [12, 33] and pinning in the operating system
either changes the semantics of DNS for other applica-
tions or requires that the OS treat browsers and their
plug-ins differently from other applications.
• Cache. The browser’s cache and all plug-in caches
must be modified to prevent rebinding attacks. Cur-
rently, objects stored in the cache are retrieved by
URL, irrespective of the originating IP address, creat-
ing a rebinding vulnerability: a cached script from the
attacker might run later when attacker.com is bound
to the target. To prevent this attack, objects in the
cache must be retrieved by both URL and originat-
ing IP address. This degrades performance when the
browser pins to a new IP address, which might occur
when the host at the first IP address fails, the user
starts a new browsing session, or the user’s network
connectivity changes. These events are uncommon and
are unlikely to impact performance significantly.
• document.domain. Even with the strictest pinning, a
server is vulnerable to rebinding attacks if it hosts a
web page that executes the following, seemingly in-
nocuous, JavaScript:
document.domain = document.domain;
After a page sets its domain property, the browser al-
lows cross-origin interactions with other pages that
have set their domain property to the same value [42,
17]. This idiom, used by a number of JavaScript li-
braries6, sets the domain property to a value under
the control of the attacker: the current host name.
5.4 Fixing Browsers (Default-Allow Sockets)
Instead of trying to prevent a host name from rebinding
from one IP address to another—a fairly common event—a
different approach to defending against rebinding is to pre-
vent the attacker from naming the target server, essentially
generalizing dnswall to the Internet. Without the ability to
name the target server, the attacker cannot mount a DNS
rebinding attack against the target. This approach defends
against rebinding, can allow socket access by default, and
preserves the robustness of dynamic DNS.
Host Name Authorization. On the Internet, clients re-
quire additional information to determine the set of valid
host names for a given IP address. We propose that servers
advertise the set of host names they consider valid for them-
selves and clients check these advertisements before binding
a host name to an IP address, making explicit which host
names can map to which IP addresses. Host name autho-
rization prevents rebinding attacks because honest machines
will not advertise host names controlled by attackers.
Reverse DNS already provides a mapping from IP ad-
dresses to host names. The owner of an IP address ip is
delegated naming authority for ip.in-addr.arpa and typi-
cally stores a PTR record containing the host name associ-
ated with that IP address. These records are insufficient
for host name authorization because a single IP address can
have many valid host names, and existing PTR records do
not indicate that other host names are invalid.
6For example, “Dojo” AJAX library, Struts servlet/JSP
based web application framework, jsMath AJAX Mathemat-
ics library, and Sun’s “Ultimate client-side JavaScript client
sniff” library are vulnerable in this way.
The reverse DNS system can be extended to authorize
host names without sacrificing backwards compatibility. To
authorize the host www.example.com for 171.64.78.146, the
owner of the IP address inserts the following DNS records:
auth.146.78.64.171.in-addr.arpa.
IN A 171.64.78.146
www.example.com.auth.146.78.64.171.in-addr.arpa.
IN A 171.64.78.146
To make a policy-enabled resolution for www.example.com,
first resolve the host name to a set of IP addresses normally
and then validate each IP address as follows:
1. Resolve the host name auth.ip.in-addr.arpa.
2. If the host name exists, ip is policy-enabled and ac-
cepts only authorized host names. Otherwise, ip is
not policy-enabled and accepts any host name.
3. Finally, if ip is policy-enabled, resolve the host name
www.example.com.auth.ip.in-addr.arpa
to determine if the host name is authorized.
An IP address ip implicitly authorizes every host name of
the form *.auth.ip.in-addr.arpa, preventing incorrect re-
cursive policy checks. For host names with multiple IP ad-
dresses, only authorized IP addresses should be included in
the result. If no IP addresses are authorized, the result
should be “not found.” If an IP address is not policy en-
abled, DNS rebinding attacks can be mitigated using the
techniques in Section 5.3.
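A policy-enabled resolution can be sketched with ordinary forward lookups (Python, using the system resolver; caching, CNAME handling, and error handling are omitted, and the helper names are ours):

import socket

def name_exists(name):
    try:
        socket.gethostbyname(name)
        return True
    except socket.gaierror:
        return False

def policy_enabled_resolve(hostname):
    """Resolve hostname, keeping only addresses that authorize it."""
    candidate_ips = socket.gethostbyname_ex(hostname)[2]
    authorized = []
    for ip in candidate_ips:
        rev = ".".join(reversed(ip.split(".")))
        auth_zone = "auth." + rev + ".in-addr.arpa"
        if not name_exists(auth_zone):
            authorized.append(ip)      # not policy-enabled: any name accepted
        elif hostname.endswith(auth_zone) or \
                name_exists(hostname + "." + auth_zone):
            authorized.append(ip)      # implicitly or explicitly authorized
        # otherwise: policy-enabled but unauthorized, drop this address
    return authorized                  # empty result means "not found"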
The policy check can be implemented in DNS resolvers7,
such as ones run by organizations and ISPs, transparently
protecting large groups of machines from having their IP
addresses hijacked. User agents, such as browser and plug-
ins, can easily query the policy records because they are
stored in A records and can issue policy checks in paral-
lel with HTTP requests (provided they do not process the
HTTP response before the host name is authorized). Stan-
dard DNS caching reduces much of the overhead of redun-
dant policy checks issued by DNS resolvers, browsers, and
plug-ins. As a further optimization, policy-enabled resolvers
can include policy records in the “additional” section of the
DNS response, allowing downstream resolvers to cache com-
plete policies and user-agents to get policy records without
a separate request. We have implemented host name autho-
rization as a 72-line patch to Firefox 2.
One disadvantage of this mechanism is that the owner of
an IP address, the ISP, might not be the owner of the ma-
chine at that IP address. The machine can advertise the
correct set of authorized host names only if the ISP is will-
ing to delegate the auth subdomain to the owner or insert
appropriate DNS records. Instead, machines could advertise
authorized host names over HTTP in a well-known location,
similar to crossdomain.xml, but this has several disadvan-
tages: it requires policy-enabled DNS resolvers to implement
HTTP clients, it requires all machines, such as SMTP gate-
ways, to run an HTTP server, and policy queries are not
cached, resulting in extra traffic comparable to favicon.ico.
7To prevent a subtle attack that involves poisoning DNS
caches, a policy-enabled DNS resolver must follow the same
procedure for CNAME queries as for A queries, even though
responses to the former do not directly include IP addresses.
Trusted Policy Providers. Clients and DNS resolvers
can also check policy by querying a trusted policy provider.
Much like spam black lists [39] and phishing filters [6, 31,
16], different policy providers can use different heuristics to
determine whether a host name is valid for an IP address,
but every provider should respect host names authorized
in reverse DNS. When correctly configured, host name au-
thorization in reverse DNS has no false negatives (no valid
host name is rejected) but many false positives (lack of pol-
icy is implicit authorization). Trusted policy providers can
greatly reduce the false positive rate, possibly at the cost of
increasing the false negative rate. Clients are free to select
as aggressive a policy provider as they desire.
6. RELATED WORK
Using Browsers as Bots. The technique of luring web users to an attacker's site and then distracting them while their browsers participate in a coordinated attack is described in [24]. These "puppetnets" can be used for distributed denial of service but cannot be used to mount the attacks described in Section 4 because puppetnets cannot read back responses from different origins or connect to forbidden ports such as 25.
JavaScript can also be misused to scan behind firewalls [18] and reconfigure home routers [40]. These techniques often rely on exploiting default passwords and on underlying cross-site scripting or cross-site request forgery vulnerabilities. DNS rebinding attacks can be used to exploit default passwords without the need for a cross-site scripting or cross-site request forgery hole.
Sender Policy Framework. To fight spam e-mail, the Sender Policy Framework (SPF) [46] stores policy information in DNS. SPF policies are stored as TXT records in forward DNS, where host names can advertise the set of IP addresses authorized to send e-mail on their behalf.
7. CONCLUSIONS
An attacker can exploit DNS rebinding vulnerabilities to circumvent firewalls and hijack IP addresses. Basic DNS rebinding attacks have been known for over a decade, but the classic defense, pinning, reduces robustness and fails to protect current browsers that use plug-ins. Modern multi-pin attacks defeat pinning in hundreds of milliseconds, granting the attacker direct socket access from the client's machine. These attacks are a highly cost-effective technique for hijacking hundreds of thousands of IP addresses for sending spam e-mail and committing click fraud.
For network administrators, we provide a tool to prevent DNS rebinding from being used for firewall circumvention by blocking external DNS names from resolving to internal IP addresses. For the vendors of Flash Player, Java, and LiveConnect, we suggest simple patches that mitigate large-scale exploitation by vastly reducing the cost-effectiveness of the attacks for sending spam e-mail and committing click fraud. Finally, we propose two defense options that prevent both firewall circumvention and IP hijacking: policy-based pinning and host name authorization. We hope that vendors and network administrators will deploy these defenses quickly before attackers exploit DNS rebinding on a large scale.
Acknowledgments
We thank Drew Dean, Darin Fisher, Jeremiah Grossman,
Martin Johns, Dan Kaminsky, Chris Karlof, Jim Roskind,
and Dan Wallach for their helpful suggestions and feedback.
This work is supported by grants from the National Science
Foundation and the US Department of Homeland Security.
8. REFERENCES
[1] Adobe. Flash Player Penetration. http://www.adobe.com/products/player_census/flashplayer/.
[2] Adobe. Adobe Flash Player 9 Security. http://www.adobe.com/devnet/flashplayer/articles/flash_player_9_security.pdf, July 2006.
[3] Alexa. Top sites. http://www.alexa.com/site/ds/top_sites?ts_mode=global.
[4] K. Anvil. Anti-DNS pinning + socket in flash. http://www.jumperz.net/, 2007.
[5] W. Cheswick and S. Bellovin. A DNS filter and switch for packet-filtering gateways. In Proc. USENIX, 1996.
[6] N. Chou, R. Ledesma, Y. Teraguchi, and J. Mitchell. Client-side defense against web-based identity theft. In Proc. NDSS, 2004.
[7] N. Daswani, M. Stoppelman, et al. The anatomy of Clickbot.A. In Proc. HotBots, 2007.
[8] D. Dean, E. W. Felten, and D. S. Wallach. Java security: from HotJava to Netscape and beyond. In IEEE Symposium on Security and Privacy, Oakland, California, May 1996.
[9] D. Edwards. Your MOMA knows best, December 2005. http://xooglers.blogspot.com/2005/12/your-moma-knows-best.html.
[10] K. Fenzi and D. Wreski. Linux security HOWTO, January 2004.
[11] R. Fielding et al. Hypertext Transfer Protocol—HTTP/1.1. IETF RFC 2616, June 1999.
[12] D. Fisher, 2007. Personal communication.
[13] D. Fisher et al. Problems with new DNS cache ("pinning" forever). https://bugzilla.mozilla.org/show_bug.cgi?id=162871.
[14] D. Goodin. Calif. man pleads guilty to felony hacking. Associated Press, January 2005.
[15] Google. dnswall. http://code.google.com/p/google-dnswall/.
[16] Google. Google Safe Browsing for Firefox, 2005. http://www.google.com/tools/firefox/safebrowsing/.
[17] S. Grimm et al. Setting document.domain doesn't match an implicit parent domain. https://bugzilla.mozilla.org/show_bug.cgi?id=183143.
[18] J. Grossman and T. Niedzialkowski. Hacking intranet websites from the outside: JavaScript malware just got a lot more dangerous. In Blackhat USA, August 2006. Invited talk.
[19] I. Hickson et al. HTML 5 Working Draft. http://www.whatwg.org/specs/web-apps/current-work/.
[20] C. Jackson, A. Bortz, D. Boneh, and J. Mitchell. Protecting browser state from web privacy attacks. In Proc. WWW, 2006.
[21] M. Johns. (somewhat) breaking the same-origin policy by undermining DNS pinning, August 2006. http://shampoo.antville.org/stories/1451301/.
[22] M. Johns and J. Winter. Protecting the Intranet against "JavaScript Malware" and related attacks. In Proc. DIMVA, July 2007.
[23] C. K. Karlof, U. Shankar, D. Tygar, and D. Wagner. Dynamic pharming attacks and the locked same-origin policies for web browsers. In Proc. CCS, October 2007.
[24] V. T. Lam, S. Antonatos, P. Akritidis, and K. G. Anagnostakis. Puppetnets: Misusing web browsers as a distributed attack infrastructure. In Proc. CCS, 2006.
[25] G. Maone. DNS Spoofing/Pinning. http://sla.ckers.org/forum/read.php?6,4511,14500.
[26] G. Maone. NoScript. http://noscript.net/.
[27] C. Masone, K. Baek, and S. Smith. WSKE: web server key enabled cookies. In Proc. USEC, 2007.
[28] A. Megacz. XWT Foundation Security Advisory. http://xwt.org/research/papers/sop.txt.
[29] A. Megacz and D. Meketa. X-RequestOrigin. http://www.xwt.org/x-requestorigin.txt.
[30] Microsoft. Microsoft Web Enterprise Portal, January 2004. http://www.microsoft.com/technet/itshowcase/content/MSWebTWP.mspx.
[31] Microsoft. Microsoft phishing filter: A new approach to building trust in e-commerce content, 2005.
[32] P. Mockapetris. Domain Names—Implementation and Specification. IETF RFC 1035, November 1987.
[33] C. Nuuja (Adobe), 2007. Personal communication.
[34] G. Ollmann. The pharming guide. http://www.ngssoftware.com/papers/ThePharmingGuide.pdf, August 2005.
[35] Y. Rekhter, B. Moskowitz, D. Karrenberg, G. J. de Groot, and E. Lear. Address Allocation for Private Internets. IETF RFC 1918, February 1996.
[36] J. Roskind. Attacks against the Netscape browser. In RSA Conference, April 2001. Invited talk.
[37] D. Ross. Notes on DNS pinning. http://blogs.msdn.com/dross/archive/2007/07/09/notes-on-dns-pinning.aspx, 2007.
[38] J. Ruderman. JavaScript Security: Same Origin. http://www.mozilla.org/projects/security/components/same-origin.html.
[39] Spamhaus. The Spamhaus Block List, 2007. http://www.spamhaus.org/sbl/.
[40] S. Stamm, Z. Ramzan, and M. Jakobsson. Drive-by pharming. Technical Report 641, Computer Science, Indiana University, December 2006.
[41] J. Topf. HTML Form Protocol Attack, August 2001. http://www.remote.org/jochen/sec/hfpa/hfpa.pdf.
[42] D. Veditz et al. document.domain abused to access hosts behind firewall. https://bugzilla.mozilla.org/show_bug.cgi?id=154930.
[43] W3C. The XMLHttpRequest Object, February 2007. http://www.w3.org/TR/XMLHttpRequest/.
[44] B. Warner. Home PCs rented out in sabotage-for-hire racket. Reuters, July 2004.
[45] J. Winter and M. Johns. LocalRodeo: Client-side protection against JavaScript Malware. http://databasement.net/labs/localrodeo/, 2007.
[46] M. Wong and W. Schlitt. Sender Policy Framework (SPF) for Authorizing Use of Domains in E-Mail. IETF RFC 4408, April 2006.
Enhancing Byte-Level
Network Intrusion Detection Signatures with Context
Robin Sommer
TU München
Germany
[email protected]
Vern Paxson
International Computer Science Institute and
Lawrence Berkeley National Laboratory
Berkeley, CA, USA
[email protected]
ABSTRACT
Many network intrusion detection systems (NIDS) use byte sequences as signatures to detect malicious activity. While being highly efficient, they tend to suffer from a high false-positive rate. We develop the concept of contextual signatures as an improvement of string-based signature-matching. Rather than matching fixed strings in isolation, we augment the matching process with additional context. When designing an efficient signature engine for the NIDS Bro, we provide low-level context by using regular expressions for matching, and high-level context by taking advantage of the semantic information made available by Bro's protocol analysis and scripting language. Therewith, we greatly enhance the signature's expressiveness and hence the ability to reduce false positives. We present several examples such as matching requests with replies, using knowledge of the environment, defining dependencies between signatures to model step-wise attacks, and recognizing exploit scans.
To leverage existing efforts, we convert the comprehensive signature set of the popular freeware NIDS Snort into Bro's language. While this does not provide us with improved signatures by itself, we reap an established base to build upon. Consequently, we evaluate our work by comparing to Snort, discussing in the process several general problems of comparing different NIDSs.
Categories and Subject Descriptors: C.2.0 [Computer-Communication Networks]: General - Security and protection.
General Terms: Performance, Security.
Keywords: Bro, Network Intrusion Detection, Pattern Matching, Security, Signatures, Snort, Evaluation
1. INTRODUCTION
Several different approaches are employed in attempting to detect computer attacks. Anomaly-based systems derive (usually in an automated fashion) a notion of "normal" system behavior, and report divergences from this profile, an approach premised on the notion that attacks tend to look different in some fashion from legitimate computer use. Misuse detection systems look for particular, explicit indications of attacks (host-based IDSs inspect audit logs for this, while network-based IDSs, or NIDSs, inspect the network traffic).
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
CCS'03, October 27-31, 2003, Washington, DC, USA.
Copyright 2003 ACM 1-58113-738-9/03/0010 ...$5.00.
In this paper, we concentrate on one popular form of misuse detection, network-based signature matching, in which the system inspects network traffic for matches against exact, precisely-described patterns. While NIDSs use different abstractions for defining such patterns, most of the time the term signature refers to raw byte sequences. Typically, a site deploys a NIDS where it can see network traffic between the trusted hosts it protects and the untrusted exterior world, and the signature-matching NIDS inspects the passing packets for these sequences. It generates an alert as soon as it encounters one. Most commercial NIDSs follow this approach [19], and also the most well-known freeware NIDS, Snort [29]. As an example, to detect the buffer overflow described in CAN-2002-0392 [9], Snort's signature #1808 looks for the byte pattern 0xC0505289E150515250B83B000000CD80 [2] in Web requests. Keeping in mind that there are more general forms of signatures used in intrusion detection as well—some of which we briefly discuss in §2—in this paper we adopt this common use of the term signature.
Signature-matching in this sense has several appealing properties. First, the underlying conceptual notion is simple: it is easy to explain what the matcher is looking for and why, and what sort of total coverage it provides. Second, because of this simplicity, signatures can be easy to share, and to accumulate into large "attack libraries." Third, for some signatures, the matching can be quite tight: a match indicates with high confidence that an attack occurred.
On the other hand, signature-matching also has significant limitations. In general, especially when using tight signatures, the matcher has no capability to detect attacks other than those for which it has explicit signatures; the matcher will in general completely miss novel attacks, which, unfortunately, continue to be developed at a brisk pace. In addition, often signatures are not in fact "tight." For example, the Snort signature #1042 to detect an exploit of CVE-2000-0778 [9] searches for "Translate: F" in Web requests; but it turns out that this header is regularly used by certain applications. Loose signatures immediately raise the major problem of false positives: alerts that in fact do not reflect an actual attack. A second form of false positive, which signature matchers likewise often fail to address, is that of failed attacks. Since at many sites attacks occur at nearly-continuous rates, failed attacks are often of little interest. At a minimum, it is important to distinguish between them and successful attacks.
A key point here is that the problem of false positives can potentially be greatly reduced if the matcher has additional context at its disposal: either additional particulars regarding the exact activity and its semantics, in order to weed out false positives due to overly general "loose" signatures; or the additional information of how the attacked system responded to the attack, which often indicates whether the attack succeeded.
In this paper, we develop the concept of contextual signatures, in which the traditional form of string-based signature matching is augmented by incorporating additional context on different levels when evaluating the signatures. First of all, we design and implement an efficient pattern matcher similar in spirit to traditional signature engines used in other NIDSs. But already on this low level we enable the use of additional context by (i) providing full regular expressions instead of fixed strings, and (ii) giving the signature engine a notion of full connection state, which allows it to correlate multiple interdependent matches in both directions of a user session. Then, if the signature engine reports the match of a signature, we use this event as the start of a decision process, instead of an alert by itself as is done by most signature-matching NIDSs. Again, we use additional context to judge whether something alert-worthy has indeed occurred. This time the context is located on a higher level, containing our knowledge about the network that we have either explicitly defined or already learned during operation.
In §3.5, we will show several examples to demonstrate how the concept of contextual signatures can help to eliminate most of the limitations of traditional signatures discussed above. We will see that regular expressions, interdependent signatures, and knowledge about the particular environment have significant potential to reduce the false positive rate and to identify failed attack attempts. For example, we can consider the server's response to an attack and the set of software it is actually running—its vulnerability profile—to decide whether an attack has succeeded. In addition, treating signature matches as events rather than alerts enables us to analyze them on a meta-level as well, which we demonstrate by identifying exploit scans (scanning multiple hosts for a known vulnerability).
Instrumenting signatures to consider additional context has to be performed manually. For each signature, we need to determine what context might actually help to increase its performance. While this is tedious for large sets of already-existing signatures, it is not an extra problem when developing new ones, as such signatures have to be similarly adjusted to the specifics of particular attacks anyway. Contextual signatures serve as a building block for increasing the expressiveness of signatures, not as a stand-alone solution.
We implemented the concept of contextual signatures in the framework already provided by the freeware NIDS Bro [25]. In contrast to most NIDSs, Bro is fundamentally neither an anomaly-based system nor a signature-based system. It is instead partitioned into a protocol analysis component and a policy script component. The former feeds the latter via generating a stream of events that reflect different types of activity detected by the protocol analysis; consequently, the analyzer is also referred to as the event engine. For example, when the analyzer sees the establishment of a TCP connection, it generates a connection_established event; when it sees an HTTP request it generates http_request and for the corresponding reply http_reply; and when the event engine's heuristics determine that a user has successfully authenticated during a Telnet or Rlogin session, it generates login_success (likewise, each failed attempt results in a login_failure event).
Bro's event engine is policy-neutral: it does not consider any particular events as reflecting trouble. It simply makes the events available to the policy script interpreter. The interpreter then executes scripts written in Bro's custom scripting language in order to define the response to the stream of events. Because the language includes rich data types, persistent state, and access to timers and external programs, the response can incorporate a great deal of context in addition to the event itself. The script's reaction to a particular event can range from updating arbitrary state (for example, tracking types of activity by address or address pair, or grouping related connections into higher-level "sessions") to generating alerts (e.g., via syslog) or invoking programs for a reactive response. More generally, a Bro policy script can implement signature-style matching—for example, inspecting the URIs in Web requests, the MIME-encoded contents of email (which the event engine will first unpack), the user names and keystrokes in login sessions, or the filenames in FTP sessions—but at a higher semantic level than as just individual packets or generic TCP byte streams.
Bro's layered approach is very powerful as it allows a wide range of different applications. But it has a significant shortcoming: while, as discussed above, the policy script is capable of performing traditional signature-matching, doing so can be cumbersome for large sets of signatures, because each signature has to be coded as part of a script function. This is in contrast to the concise, low-level languages used by most traditional signature-based systems. In addition, if the signatures are matched sequentially, then the overhead of the matching can become prohibitive. Finally, a great deal of community effort is already expended on developing and disseminating packet-based and byte-stream-based signatures. For example, the 1.9.0 release of Snort comes with a library of 1,715 signatures [2]. It would be a major advantage if we could leverage these efforts by incorporating such libraries.
Therefore, one motivation for this work is to combine Bro's flexibility with the capabilities of other NIDSs by implementing a signature engine. But in contrast to traditional systems, which use their signature matcher more or less on its own, we tightly integrate it into Bro's architecture in order to provide contextual signatures. As discussed above, there are two main levels on which we use additional context for signature matching. First, at a detailed level, we extend the expressiveness of signatures. Although byte-level pattern matching is a central part of NIDSs, most only allow signatures to be expressed in terms of fixed strings. Bro, on the other hand, already provides regular expressions for use in policy scripts, and we use them for signatures as well. The expressiveness of such patterns provides us with an immediate way to express syntactic context. For example, with regular expressions it is easy to express the notion "string XYZ, but only if preceded at some point earlier by string ABC". An important point to keep in mind regarding regular expression matching is that, once we have fully constructed the matcher, which is expressed as a Deterministic Finite Automaton (DFA), the matching can be done in O(n) time for n characters in the input, and also Ω(n) time. (That is, the matching always takes time linear in the size of the input, regardless of the specifics of the input.) The "parallel Boyer-Moore" approaches that have been explored in the literature for fast matching of multiple fixed strings for Snort [12, 8] have a wide range of running times—potentially sublinear in n, but also potentially superlinear in n. So, depending on the particulars of the strings we want to match and the input against which we do the matching, regular expressions might prove fundamentally more efficient, or might not; we need empirical evaluations to determine the relative performance in practice. In addition, the construction of a regular expression matcher requires time potentially exponential in the length of the expression, clearly prohibitive, a point to which we return in §3.1.
Second, on a higher level, we use Bro's rich contextual state to implement our improvements to plain matching described above. Making use of Bro's architecture, our engine sends events to the policy layer. There, the policy script can use all of Bro's already existing mechanisms to decide how to react. We show several such examples in §3.5.
Due to Snort's large user base, it enjoys a comprehensive and up-to-date set of signatures. Therefore, although for flexibility we have designed a custom signature language for Bro, we make use of the Snort libraries via a conversion program. This program takes an unmodified Snort configuration and creates a corresponding Bro signature set. Of course, by just using the same signatures in Bro as in Snort, we are not able to improve the resulting alerts in terms of quality. But even if we do not accompany them with additional context, they immediately give us a baseline of already widely-deployed signatures. Consequently, Snort serves us as a reference.
Throughout the paper we compare with Snort both in terms of quality and performance. But while doing so, we encountered several general problems for evaluating and comparing NIDSs. We believe these arise independently of our work with Bro and Snort, and therefore describe them in some detail. Keeping these limitations in mind, we then evaluate the performance of our signature engine and find that it performs well.
§2 briefly summarizes related work. In §3 we present the main design ideas behind implementing contextual signatures: regular expressions, integration into Bro's architecture, some difficulties with using Snort signatures, and examples of the power of the Bro signature language. In §4 we discuss general problems of evaluating NIDSs, and then compare Bro's signature matching with Snort's. §5 summarizes our conclusions.
2. RELATED WORK
[4] gives an introduction to intrusion detection in general, defining basic concepts and terminology.
In the context of signature-based network intrusion detection, previous work has focussed on efficiently matching hundreds of fixed strings in parallel: [12] and [8] both present implementations of set-wise pattern matching for Snort [29]. For Bro's signature engine, we make use of regular expressions [18]. They give us both flexibility and efficiency. [17] presents a method to incrementally build the underlying DFA, which we can use to avoid the potentially enormous memory and computation required to generate the complete DFA for thousands of signatures. An extended form of regular expressions has been used in intrusion detection for defining sequences of events [30], but to our knowledge no NIDS uses them for actually matching multiple byte patterns against the payload of packets.
In this paper, we concentrate on signature-based NIDSs. Snort is one of the most widely deployed systems and relies heavily on its signature set. Also, most of the commercial NIDSs are signature-based [19], although there are systems that use more powerful concepts to express signatures than just specifying byte patterns. NFR [28], for example, uses a flexible language called N-Code to declare its signatures. In this sense, Bro already provides sophisticated signatures by means of its policy language. But the goal of our work is to combine the advantages of a traditional dedicated pattern matcher with the power of an additional layer abstracting from the raw network traffic. IDSs like STAT [35] or Emerald [26] are more general in scope than purely network-based systems. They contain misuse-detection components as well, but their signatures are defined at a higher level. The STAT framework abstracts from low-level details by using transitions on a set of states as signatures. A component called NetSTAT [36] defines such state transitions based on observed network traffic. Emerald, on the other hand, utilizes P-BEST [20], a production-based expert system, to define attacks based on a set of facts and rules. Due to their general scope, both systems use a great deal of context to detect intrusions. On the other hand, our aim is to complement the most common form of signature matching—low-level string matching—with context, while still keeping its efficiency.
The huge number of generated alerts is one of the most important problems of NIDSs (see, for example, [23]). [3] discusses some statistical limits, arguing in particular that the false-alarm rate is the limiting factor for the performance of an IDS.
Most string-based NIDSs use their own signature language, and are therefore incompatible. But since most languages cover a common subset, it is generally possible to convert the signatures of one system into the syntax of another. ArachNIDS [1], for example, generates signatures dynamically for different systems based on a common database, and [32] presents a conversion of Snort signatures into STAT's language, although it does not compare the two systems in terms of performance. We take a similar approach, and convert Snort's set into Bro's new signature language.
For evaluation of the new signature engine, we take Snort as a reference. But while comparing Bro and Snort, we have encountered several difficulties which we discuss in §4. They are part of the general question of how to evaluate NIDSs. One of the most comprehensive evaluations is presented in [21, 22], while [24] offers a critique of the methodology used in these studies. [14] further extends the evaluation method by providing a user-friendly environment on the one hand, and new characterizations of attack traffic on the other hand. More recently, [10] evaluates several commercial systems, emphasizing the view of an analyst who receives the alerts, finding that these systems ignore relevant information about the context of the alerts. [15] discusses developing a benchmark for NIDSs, measuring their capacity with a representative traffic mix. (Note, in §4.2 we discuss our experiences with the difficulty of finding "representative" traces.)
3. CONTEXTUAL SIGNATURES
The heart of Bro's contextual signatures is a signature engine designed with three main goals in mind: (i) expressive power, (ii) the ability to improve alert quality by utilizing Bro's contextual state, and (iii) enabling the reuse of existing signature sets. We discuss each in turn. Afterwards, we present our experiences with Snort's signature set, and finally show examples which demonstrate applications for the described concepts.
3.1 Regular Expressions
A traditional signature usually contains a sequence of bytes that are representative of a specific attack. If this sequence is found in the payload of a packet, this is an indicator of a possible attack. Therefore, the matcher is a central part of any signature-based NIDS. While many NIDSs only allow fixed strings as search patterns, we argue for the utility of using regular expressions. Regular expressions provide several significant advantages: first, they are far more flexible than fixed strings. Their expressiveness has made them a well-known tool in many applications, and their power arises in part from providing additional syntactic context with which to sharpen textual searches. In particular, character classes, union, optional elements, and closures prove very useful for specifying attack signatures, as we see in §3.5.1.
Surprisingly, given their power, regular expressions can be matched very efficiently. This is done by compiling the expressions into DFAs whose terminating states indicate whether a match is found. A sequence of n bytes can therefore be matched with O(n) operations, and each operation is simply an array lookup—highly efficient.
The total number of patterns contained in the signature set of a NIDS can be quite large. Snort's set, for example, contains 1,715 distinct signatures, of which 1,273 are enabled by default. Matching these individually is very expensive. However, for fixed strings, there are algorithms for matching sets of strings simultaneously. Consequently, while Snort's default engine still works iteratively, there has been recent work to replace it with a "set-wise" matcher [8, 12].1 On the other hand, regular expressions give us set-wise matching for free: by using the union operator on the individual patterns, we get a new regular expression which effectively combines all of them. The result is a single DFA that again needs O(n) operations to match against an n-byte sequence. Only slight modifications have been necessary to extend the interface of Bro's already-existing regular expression matcher to explicitly allow grouping of expressions.
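As an illustration of this "union" idea (our sketch, not code from the paper), the Python fragment below folds a handful of made-up patterns into one combined expression and scans a payload once for all of them. Note that Python's re module is a backtracking matcher rather than a DFA, so this only demonstrates the grouping, not the O(n) guarantee discussed above.

import re

# Hypothetical example patterns; real signature sets contain hundreds.
patterns = [r"conf/httpd\.conf", r"cmd\.exe", r"etc/(passwd|shadow)"]

# Union the individual patterns into a single expression so that one pass
# over the payload checks all of them simultaneously.
combined = re.compile("|".join("(?:%s)" % p for p in patterns))

payload = "GET /scripts/../winnt/system32/cmd.exe?/c+dir HTTP/1.0"
match = combined.search(payload)
if match:
    print("signature group matched:", match.group(0))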
Given the expressiveness and efficiency of regular expressions, there is still a reason why a NIDS might avoid using them: the underlying DFA can grow very large. Fully compiling a regular expression into a DFA leads potentially to an exponential number of DFA states, depending on the particulars of the patterns [18]. Considering the very complex regular expression built by combining all individual patterns, this straight-forward approach could easily be intractable. Our experience with building DFAs for regular expressions matching many hundreds of signatures shows that this is indeed the case. However, it turns out that in practice it is possible to avoid the state/time explosion, as follows.
Instead of pre-computing the DFA, we build the DFA "on-the-fly" during the actual matching [17]. Each time the DFA needs to transit into a state that is not already constructed, we compute the new state and record it for future reuse. This way, we only store DFA states that are actually needed. An important observation is that for n new input characters, we will build at most n new states. Furthermore, we find in practice (§4.3) that for normal traffic the growth is much less than linear.
However, there is still a concern that given inauspicious traffic—which may actually be artificially crafted by an attacker—the state construction may eventually consume more memory than we have available. Therefore, we also implemented a memory-bounded DFA state cache. Configured with a maximum number of DFA states, it expires old states on a least-recently-used basis. In the sequel, when we mention "Bro with a limited state cache," we are referring to such a bounded set of states (which is a configuration option for our version of Bro), using the default bound of 10,000 states.
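The following toy Python sketch (ours, with made-up patterns) illustrates the flavor of such incremental construction for a set of fixed strings: states are sets of pattern positions, transitions are computed only when first needed, and a bounded least-recently-used table stands in for the state cache. The real engine operates on regular expressions and caches states rather than transitions, so this is only an approximation of the scheme described above.

from collections import OrderedDict

class LazyDFA:
    # Lazily-built matcher for a set of literal patterns with an LRU-bounded
    # transition table; a simplified stand-in for incremental DFA construction.

    def __init__(self, patterns, max_entries=10000):
        self.patterns = patterns
        self.max_entries = max_entries
        self.start = frozenset((p, 0) for p in range(len(patterns)))
        self.trans = OrderedDict()                 # (state, char) -> next state

    def _step(self, state, ch):
        key = (state, ch)
        if key in self.trans:                      # reuse an already-built transition
            self.trans.move_to_end(key)
            return self.trans[key]
        nxt = set(self.start)                      # a match may begin at any position
        for (p, off) in state:
            if off < len(self.patterns[p]) and self.patterns[p][off] == ch:
                nxt.add((p, off + 1))
        nxt = frozenset(nxt)
        if len(self.trans) >= self.max_entries:    # expire the least-recently-used entry
            self.trans.popitem(last=False)
        self.trans[key] = nxt
        return nxt

    def matches(self, data):
        state, hits = self.start, set()
        for ch in data:
            state = self._step(state, ch)
            hits.update(p for (p, off) in state if off == len(self.patterns[p]))
        return [self.patterns[p] for p in sorted(hits)]

dfa = LazyDFA(["cmd.exe", "conf/httpd.conf"])
print(dfa.matches("GET /cgi-bin/conf/httpd.conf HTTP/1.0"))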
Another important point is that it is not necessary to combine all patterns contained in the signature set into a single regular expression. Most signatures contain additional constraints like IP address ranges or port numbers that restrict their applicability to a subset of the whole traffic. Based on these constraints, we can build groups of signatures that match the same kind of traffic. By collecting only those patterns into a common regular expression for matching the group, we are able to reduce the size of the resulting DFA drastically. As we show in §4, this gives us a very powerful pattern matcher still efficient enough to cope with high-volume traffic.
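A sketch of this grouping idea (our illustration; the signature tuples and ports are invented) might look as follows: patterns are bucketed by the header constraints they share, and one combined matcher is built per bucket instead of a single matcher for the entire set.

import re
from collections import defaultdict

# Hypothetical signatures: (protocol, destination port, payload pattern).
signatures = [
    ("tcp", 80, r"conf/httpd\.conf"),
    ("tcp", 80, r"cmd\.exe"),
    ("tcp", 21, r"site exec"),
]

# Group patterns by shared header constraints, then combine each group.
groups = defaultdict(list)
for proto, port, pattern in signatures:
    groups[(proto, port)].append(pattern)
matchers = {key: re.compile("|".join(pats)) for key, pats in groups.items()}

def match_payload(proto, dst_port, payload):
    # Only the (much smaller) matcher for this traffic class is consulted.
    m = matchers.get((proto, dst_port))
    return bool(m and m.search(payload))

print(match_payload("tcp", 80, "GET /cmd.exe HTTP/1.0"))   # True
print(match_payload("tcp", 21, "USER anonymous"))          # False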
3.2 Improving Alert Quality by Using Context
Though pattern matching is a central part of any signature-based NIDS, as we discussed above there is potentially great utility in incorporating more context in the system's analysis prior to generating an alert, to ensure that there is indeed something alert-worthy occurring. We can considerably increase the quality of alerts, while simultaneously reducing their quantity, by utilizing knowledge about the current state of the network. Bro is an excellent tool for this as it already keeps a lot of easily accessible state.
The new signature engine is designed to fit nicely into Bro's layered architecture as an adjunct to the protocol analysis event engine (see Figure 1). We have implemented a custom language for defining signatures. It is mostly a superset of other, similar languages, and we describe it in more detail in §3.3. A new component placed within Bro's middle layer matches these signatures against the packet stream. Whenever it finds a match, it inserts a new event into the event stream. The policy layer can then decide how to react. Additionally, we can pass information from the policy layer back into the signature engine to control its operation. A signature can specify a script function to call whenever a particular signature matches. This function can then consult additional context and indicate whether the corresponding event should indeed be generated. We show an example of this later in §3.5.4.
Figure 1: Integrating the signature engine (adapted from [25]). [Diagram labels: Network, Packet capture, Packet filter, Packet stream, Filtered packet stream, Event Engine, Signature Engine, Signatures, Event Control, Signature Control, Event stream, Real-time notification, Policy script, Policy Layer.]
1 The code of [12] is already contained in the Snort distribution, but not compiled in by default. This is perhaps due to some subtle bugs, some of which we encountered during our testing as well.
In general, Bro's analyzers follow the communication between two endpoints and extract protocol-specific information. For example, the HTTP analyzer is able to extract URIs requested by Web clients (which includes performing general preprocessing such as expanding hex escapes) and the status code and items sent back by servers in reply, whereas the FTP analyzer follows the application dialog, matching FTP commands and arguments (such as the names of accessed files) with their corresponding replies. Clearly, this protocol-specific analysis provides significantly more context than does a simple view of the total payload as an undifferentiated byte stream.
The signature engine can take advantage of this additional information by incorporating semantic-level signature matching. For example, the signatures can include the notion of matching against HTTP URIs; the URIs to be matched are provided by Bro's HTTP analyzer. Having developed this mechanism for interfacing the signature engine with the HTTP analyzer, it is now straightforward to extend it to other analyzers and semantic elements (indeed, we timed how long it took to add and debug interfaces for FTP and Finger, and the two totalled only 20 minutes).
Central to Bro's architecture is its connection management. Each network packet is associated with exactly one connection. This notion of connections allows several powerful extensions to traditional signatures. First of all, Bro reassembles the payload stream of TCP connections. Therefore, we can perform all pattern matching on the actual stream (in contrast to individual packets). While Snort has a preprocessor for TCP session reassembling, it does so by combining several packets into a larger "virtual" packet. This packet is then passed on to the pattern matcher. Because the resulting analysis remains packet-based, it still suffers from discretization problems introduced by focusing on packets, such as missing byte sequences that cross packet boundaries. (See a related discussion in [25] of the problem of matching strings in TCP traffic in the face of possible intruder evasion [27].)
In Bro, a signature match does not necessarily correspond to an alert; as with other events, that decision is left to the policy script. Hence, it makes sense to remember which signatures have matched for a particular connection so far. Given this information, it is then possible to specify dependencies between signatures like "signature A only matches if signature B has already matched," or "if a host matches more than N signatures of type C, then generate an alert." This way, we can for example describe multiple steps of an attack. In addition, Bro notes in which direction of a connection a particular signature has matched, which gives us the notion of request/reply signatures: we can associate a client request with the corresponding server reply. A typical use is to differentiate between successful and unsuccessful attacks. We show an example in §3.5.3.
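In the abstract, the bookkeeping behind such dependencies can be pictured as in the small Python sketch below (ours; the names are illustrative and not Bro's internals): each connection remembers, per direction, which signatures have matched, and a later signature can require or exclude an earlier match on the opposite direction.

# Per-connection match state, keyed by connection id and direction.
matched = {}                      # (conn_id, direction) -> set of signature ids

def report_match(conn_id, direction, sig_id):
    matched.setdefault((conn_id, direction), set()).add(sig_id)

def requires_reverse_signature(conn_id, direction, required_sig, negate=False):
    # True if required_sig has matched on the opposite direction of the
    # connection (or has not, when the condition is negated).
    other = "responder" if direction == "originator" else "originator"
    seen = required_sig in matched.get((conn_id, other), set())
    return (not seen) if negate else seen

report_match("c1", "responder", "http-error")
print(requires_reverse_signature("c1", "originator", "http-error", negate=True))  # False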
More generally, the policy script layer can associate arbitrary kinds of data with a connection or with one of its endpoints. This means that any information we can deduce from any of Bro's other components can be used to improve the quality of alerts. We demonstrate the power of this approach in §3.5.2.
Keeping per-connection state for signature matching naturally raises the question of state management: at some point in time we have to reclaim state from older connections to prevent the system from exhausting the available memory. But again we can leverage the work already being done by Bro. Independently of our signatures, it already performs sophisticated connection tracking using various timeouts to expire connections. By attaching the matching state to the already-existing per-connection state, we ensure that the signature engine works economically even with large numbers of connections.
3.3 Signature Language
Any signature-based NIDS needs a language for actually defining signatures. For Bro, we had to choose between using an already existing language and implementing a new one. We have decided to create a new language for two reasons. First, it gives us more flexibility: we can more easily integrate the new concepts described in §3.1 and §3.2. Second, for making use of existing signature sets, it is easier to write a converter in some high-level scripting language than to implement it within Bro itself.
Snort's signatures are comprehensive, free, and frequently updated. Therefore, we are particularly interested in converting them into our signature language. We have written a corresponding Python script that takes an arbitrary Snort configuration and outputs signatures in Bro's syntax. Figure 2 shows an example of such a conversion.
Figure 2: Example of signature conversion
alert tcp any any -> [a.b.0.0/16,c.d.e.0/24] 80
  ( msg:"WEB-ATTACKS conf/httpd.conf attempt";
    nocase; sid:1373; flow:to_server,established;
    content:"conf/httpd.conf"; [...] )
(a) Snort
signature sid-1373 {
  ip-proto == tcp
  dst-ip == a.b.0.0/16,c.d.e.0/24
  dst-port == 80
  # The payload below is actually generated in a
  # case-insensitive format, which we omit here
  # for clarity.
  payload /.*conf\/httpd\.conf/
  tcp-state established,originator
  event "WEB-ATTACKS conf/httpd.conf attempt"
}
(b) Bro
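To give a flavor of such a conversion, here is a toy Python fragment (ours, far simpler than the actual conversion script) that handles just the single rule of Figure 2(a) and a small subset of Snort options; options such as nocase and flow are ignored here.

import re

RULE = ('alert tcp any any -> [a.b.0.0/16,c.d.e.0/24] 80 '
        '(msg:"WEB-ATTACKS conf/httpd.conf attempt"; nocase; sid:1373; '
        'flow:to_server,established; content:"conf/httpd.conf";)')

def convert(rule):
    # Split the rule header from its option list and pick out a few options.
    header, options = rule.split("(", 1)
    proto, _src, _sport, _arrow, dst, dport = header.split()[1:7]
    opts = dict(o.strip().split(":", 1)
                for o in options.rstrip(") ").split(";") if ":" in o)
    pattern = re.escape(opts["content"].strip().strip('"')).replace("/", r"\/")
    return "\n".join([
        "signature sid-%s {" % opts["sid"].strip(),
        "  ip-proto == %s" % proto,
        "  dst-ip == %s" % dst.strip("[]"),
        "  dst-port == %s" % dport,
        "  payload /.*%s/" % pattern,
        "  event %s" % opts["msg"].strip(),
        "}"])

print(convert(RULE))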
It turns out to be rather difficult to implement a complete parser for Snort's language. As far as we have been able to determine, its syntax and semantics are not fully documented, and in fact often only defined by the source code. In addition, due to different internals of Bro and Snort, it is sometimes not possible to keep the exact semantics of the signatures. We return to this point in §4.2.
As the example in Figure 2 shows, our signatures are defined by means of an identifier and a set of attributes. There are two main types of attributes: (i) conditions and (ii) actions. The conditions define when the signature matches, while the actions declare what to do in the case of a match. Conditions can be further divided into four types: header, content, dependency, and context.
Header conditions limit the applicability of the signature to a subset of traffic that contains matching packet headers. For TCP, this match is performed only for the first packet of a connection. For other protocols, it is done on each individual packet. In general, header conditions are defined by using a tcpdump-like [33] syntax (for example, tcp[2:2] == 80 matches TCP traffic with destination port 80). While this is very flexible, for convenience there are also some short-cuts (e.g., dst-port == 80).
Content conditions are defined by regular expressions. Again, we differentiate two kinds of conditions here: first, the expression may be declared with the payload statement, in which case it is matched against the raw packet payload (reassembled where applicable). Alternatively, it may be prefixed with an analyzer-specific label, in which case the expression is matched against the data as extracted by the corresponding analyzer. For example, the HTTP analyzer decodes requested URIs. So, http /(etc\/(passwd|shadow))/ matches any request containing either etc/passwd or etc/shadow.
Dependency conditions define dependencies between signatures. We have implemented requires-signature, which specifies another signature that has to match on the same connection first, and requires-reverse-signature, which additionally requires the match to happen for the other direction of the connection. Both conditions can be negated to match only if the other signature does not match.
Finally, context conditions allow us to pass the match decision on to various components of Bro. They are only evaluated if all other conditions have already matched. For example, we have implemented a tcp-state condition that poses restrictions on the current state of the TCP connection, and eval, which calls an arbitrary policy script function.
If all conditions are met, the actions associated with a signature are executed: event inserts a signature match event into the event stream, with the value of the event including the signature identifier, corresponding connection, and other context. The policy layer can then analyze the signature match.
3.4 Snort's Signature Set
Snort comes with a large set of signatures, with 1,273 enabled by default [2]. Unfortunately, the default configuration turns out to generate a lot of false positives. In addition, many alerts belong to failed exploit attempts executed by attackers who scan networks for vulnerable hosts. As noted above, these are general problems of signature-based systems.
The process of selectively disabling signatures that are not applicable to the local environment, or "tuning," takes time, knowledge and experience. With respect to Snort, a particular problem is that many of its signatures are too general. For example, Snort's signature #1560:
alert tcp $EXTERNAL_NET any
  -> $HTTP_SERVERS $HTTP_PORTS
  (msg:"WEB-MISC /doc/ access";
   uricontent:"/doc/"; flow:to_server,established;
   nocase; sid:1560; [...])
searches for the string /doc/ within URIs of HTTP requests. While this signature is indeed associated with a particular vulnerability (CVE-1999-0678 [9]), it only makes sense to use it if you have detailed knowledge about your site (for example, that there is no valid document whose path contains the string /doc/). Otherwise, the probability of a signature match reflecting a false alarm is much higher than that it indicates an attacker exploiting an old vulnerability.
Another problem with Snort's default set is the presence of overlapping signatures for the same exploit. For example, signatures #1536, #1537, #1455, and #1456 (the latter is disabled by default) all search for CVE-2000-0432, but their patterns differ in the amount of detail. In addition, the vulnerability IDs given in Snort's signatures are not always correct. For example, signature #884 references CVE-1999-0172 and Bugtraq [6] ID #1187, but the latter corresponds to CVE-2000-0411.
As already noted, we cannot expect to avoid these limitations of Snort's signatures by just using them semantically unmodified in Bro. For example, although we convert Snort's fixed strings into Bro's regular expressions, naturally they still represent fixed sets of characters. Only manual editing would give us the additional power of regular expressions. We give an example of such an improvement in §3.5.1.
3.5 The Power of Bro Signatures
In this section, we show several examples to convey the power provided by our signatures. First, we demonstrate how to define more "tight" signatures by using regular expressions. Then, we show how to identify failed attack attempts by considering the set of software a particular server is running (we call this its vulnerability profile and incorporate some ideas from [22] here) as well as the response of the server. We next demonstrate modelling an attack in multiple steps to avoid false positives, and finally show how to use alert-counting for identifying exploit scans. We note that none of the presented examples are supported by Snort without extending its core significantly (e.g., by writing new plug-ins).
3.5.1 Using Regular Expressions
Regular expressions allow far more flexibility than fixed strings. Figure 3 (a) shows a Snort signature for CVE-1999-0172 that generates a large number of false positives at Saarland University's border router. (See §4.1 for a description of the university.) Figure 3 (b) shows a corresponding Bro signature that uses a regular expression to identify the exploit more reliably. CVE-1999-0172 describes a vulnerability of the formmail CGI script. If an attacker constructs a string of the form "...; <shell-cmds>" (a | instead of the ; works as well), and passes it on as argument of the recipient CGI parameter, vulnerable formmails will execute the included shell commands. Because CGI parameters can be given in arbitrary order, the Snort signature has to rely on identifying the formmail access on its own. But by using a regular expression, we can explicitly define that the recipient parameter has to contain a particular character.
Figure 3: Two signatures for CVE-1999-0172
alert tcp any any -> a.b.0.0/16 80
  (msg:"WEB-CGI formmail access";
   uricontent:"/formmail";
   flow:to_server,established;
   nocase; sid:884; [...])
(a) Snort using a fixed string
signature formmail-cve-1999-0172 {
  ip-proto == tcp
  dst-ip == a.b.0.0/16
  dst-port == 80
  # Again, actually expressed in a
  # case-insensitive manner.
  http /.*formmail.*\?.*recipient=[^&]*[;|]/
  event "formmail shell command"
}
(b) Bro using a regular expression
3.5.2 Vulnerability Profiles
Most exploits are aimed at particular software, and usually only some versions of the software are actually vulnerable. Given the overwhelming number of alerts a signature-matching NIDS can generate, we may well take the view that the only attacks of interest are those that actually have a chance of succeeding. If, for example, an IIS exploit is tried on a Web server running Apache, one may not even care. [23] proposes to prioritize alerts based on this kind of vulnerability information. We call the set of software versions that a host is running its vulnerability profile. We have implemented this concept in Bro. By protocol analysis, it collects the profiles of hosts on the network, using version/implementation information that the analyzer observes. Signatures can then be restricted to certain versions of particular software.
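A minimal Python sketch of such a profile store (ours; the host addresses, version strings, and the vulnerability test are invented for illustration) might be:

# Map each host to the software versions observed by protocol analysis,
# e.g. the HTTP Server header seen in its replies.
profiles = {}

def record_http_server(host, server_header):
    profiles.setdefault(host, {})["http"] = server_header

def is_vulnerable_to_iis_exploit(host):
    # Only hosts known to run IIS are considered interesting targets;
    # for an Apache server the corresponding alert can be suppressed.
    server = profiles.get(host, {}).get("http", "")
    return server.startswith("Microsoft-IIS/")

record_http_server("192.0.2.10", "Apache/1.3.26 (Unix)")
record_http_server("192.0.2.11", "Microsoft-IIS/5.0")
print(is_vulnerable_to_iis_exploit("192.0.2.10"))   # False
print(is_vulnerable_to_iis_exploit("192.0.2.11"))   # True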
As a proof of principle, we have implemented vulnerability profiles for HTTP servers (which usually characterize themselves via the Server header), and for SSH clients and servers (which identify their specific versions in the clear during the initial protocol handshake). We intend to extend the software identification to other protocols.
We aim in future work to extend the notion of developing a profile beyond just using protocol analysis. We can passively fingerprint hosts to determine their operating system version information by observing specific idiosyncrasies of the header fields in the traffic they generate, similar to the probing techniques described in [13], or we can separately or in addition employ active techniques to explicitly map the properties of the site's hosts and servers [31]. Finally, in addition to automated techniques, we can implement a configuration mechanism for manually entering vulnerability profiles.
3.5.3 Request/Reply Signatures
Further pursuing the idea of avoiding alerts for failed attack attempts, we can define signatures that take into account both directions of a connection. Figure 4 shows an example. In operational use, we see a lot of attempts to exploit CVE-2001-0333 to execute the Windows command interpreter cmd.exe. For a failed attempt, the server typically answers with a 4xx HTTP reply code, indicating an error.2 To ignore these failed attempts, we first define one signature, http-error, that recognizes such replies. Then we define a second signature, cmdexe-success, that matches only if cmd.exe is contained in the requested URI (case-insensitive) and the server does not reply with an error. It is not possible to define this kind of signature in Snort, as it lacks the notion of associating both directions of a connection.
Figure 4: Request/reply signature
signature cmdexe-success {
  ip-proto == tcp
  dst-port == 80
  http /.*[cC][mM][dD]\.[eE][xX][eE]/
  event "WEB-IIS cmd.exe success"
  requires-signature-opposite ! http-error
  tcp-state established
}
signature http-error {
  ip-proto == tcp
  src-port == 80
  payload /.*HTTP\/1\.. *4[0-9][0-9]/
  event "HTTP error reply"
  tcp-state established
}
2 There are other reply codes that reflect additional types of errors, too, which we omit for clarity.
3.5.4 Attacks with Multiple Steps
An example of an attack executed in two steps is the infection by the Apache/mod_ssl worm [7] (also known as Slapper), released in September 2002. The worm first probes a target for its potential vulnerability by sending a simple HTTP request and inspecting the response. It turns out that the request it sends is in fact in violation of the HTTP 1.1 standard [11] (because it does not include a Host header), and this idiosyncrasy provides a somewhat "tight" signature for detecting a Slapper probe.
If the server identifies itself as Apache, the worm then tries to exploit an OpenSSL vulnerability on TCP port 443. Figure 5 shows two signatures that only report an alert if these steps are performed for a destination that runs a vulnerable OpenSSL version. The first signature, slapper-probe, checks the payload for the illegal request. If found, the script function is_vulnerable_to_slapper (omitted here due to limited space, see [2]) is called. Using the vulnerability profile described above, the function evaluates to true if the destination is known to run Apache as well as a vulnerable OpenSSL version.3 If so, the signature matches (depending on the configuration this may or may not generate an alert by itself). The header conditions of the second signature, slapper-exploit, match for any SSL connection into the specified network. For each, the signature calls the script function has_slapper_probed. This function generates a signature match if slapper-probe has already matched for the same source/destination pair. Thus, Bro alerts if the combination of probing for a vulnerable server, plus a potential follow-on exploit of the vulnerability, has been seen.
Figure 5: Signatures for the Apache/mod_ssl worm
signature slapper-probe {
  ip-proto == tcp
  dst-ip == x.y.0.0/16 # sent to local net
  dst-port == 80
  payload /.*GET \/ HTTP\/1.1\x0d\x0a\x0d\x0a/
  eval is_vulnerable_to_slapper # call policy fct.
  event "Vulner. host possibly probed by Slapper"
}
signature slapper-exploit {
  ip-proto == tcp
  dst-ip == x.y.0.0/16
  dst-port == 443 # 443/tcp = SSL/TLS
  eval has_slapper_probed # test: already probed?
  event "Slapper tried to exploit vulnerable host"
}
3.5.5 Exploit Scanning
Often attackers do not target a particular system on the Internet, but probe a large number of hosts for vulnerabilities (exploit scanning). Such a scan can be executed either horizontally (several hosts are probed for a particular exploit), vertically (one host is probed for several exploits), or both. While, on their own, most of these probes are usually low-priority failed attempts, the scan itself is an important event. By simply counting the number of signature alerts per source address (horizontal) or per source/destination pair (vertical), Bro can readily identify such scans. We have implemented this with a policy script which generates alerts like:
a.b.c.d triggered 10 signatures on host e.f.g.h
i.j.k.l triggered signature sid-1287 on 100 hosts
m.n.o.p triggered signature worm-probe on 500 hosts
q.r.s.t triggered 5 signatures on host u.v.x.y
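The counting itself is simple; a Python sketch in the same spirit (ours, with invented thresholds, not the actual policy script) is:

from collections import defaultdict

HORIZONTAL_THRESHOLD = 100     # distinct destinations per (source, signature)
VERTICAL_THRESHOLD = 5         # distinct signatures per (source, destination)

dests_per_src_sig = defaultdict(set)
sigs_per_pair = defaultdict(set)

def signature_match(src, dst, sig):
    # Called for every signature match; reports a scan once a threshold is hit.
    dests_per_src_sig[(src, sig)].add(dst)
    sigs_per_pair[(src, dst)].add(sig)
    if len(dests_per_src_sig[(src, sig)]) == HORIZONTAL_THRESHOLD:
        print("%s triggered signature %s on %d hosts" % (src, sig, HORIZONTAL_THRESHOLD))
    if len(sigs_per_pair[(src, dst)]) == VERTICAL_THRESHOLD:
        print("%s triggered %d signatures on host %s" % (src, VERTICAL_THRESHOLD, dst))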
3 Note that it could instead implement a more conservative policy, and return true unless the destination is known to not run a vulnerable version of OpenSSL/Apache.
4. EVALUATION
Our approach for evaluating the effectiveness of the signature engine is to compare it to Snort in terms of run-time performance and generated alerts, using semantically equivalent signature sets. We note that we do not evaluate the concept of contextual signatures by itself. Instead, as a first step, we validate that our implementation is capable of acting as an effective substitute for the most widely deployed NIDS even when we do not use any of the advanced features it provides. Building further on this base by thoroughly evaluating the actual power of contextual signatures when deployed operationally is part of our ongoing work.
During our comparison of Bro and Snort, we found several peculiarities that we believe are of more general interest. Our results stress that the performance of a NIDS can be very sensitive to semantics, configuration, input, and even underlying hardware. Therefore, after discussing our test data, we delve into these in some detail. Keeping these limitations in mind, we then assess the overall performance of the Bro signature engine.
4.1 Test Data
For our testing, we use two traces:
USB-Full: A 30-minute trace collected at Saarland University, Germany (USB-Full), consisting of all traffic (including packet contents) except for three high-volume peer-to-peer applications (to reduce the volume). The university has 5,500 internal hosts, and the trace was gathered on its 155 Mbps access link to the Internet. The trace totals 9.8 GB, 15.3M packets, and 220K connections. 35% of the trace packets belong to HTTP on port 80, 19% to eDonkey on port 4662, and 4% to ssh on port 22, with other individual ports being less common than these three (and the high-volume peer-to-peer that was removed).
LBL-Web: A two-hour trace of HTTP client-side traffic, including packet contents, gathered at the Lawrence Berkeley National Laboratory (LBL), Berkeley, USA (LBL-Web). The laboratory has 13,000 internal hosts, and the trace was gathered on its Gbps access link to the Internet. The trace totals 667MB, 5.5M packets, and 596K connections.
Unless stated otherwise, we performed all measurements on 550MHz Pentium-3 systems containing ample memory (512MB or more). For both Snort and Bro's signature engine, we used Snort's default signature set. We disabled Snort's "experimental" set of signatures as some of the latest signatures use new options which are not yet implemented in our conversion program. In addition, we disabled Snort signature #526, BAD TRAFFIC data in TCP SYN packet. Due to Bro matching stream-wise instead of packet-wise, it generates thousands of false positives. We discuss this in §4.2. In total, 1,118 signatures are enabled. They contain 1,107 distinct patterns and cover 89 different service ports. 60% of the signatures cover HTTP traffic. For LBL-Web, only these were activated.
For Snort, we enabled the preprocessors for IP defragmentation, TCP stream reassembling on its default ports, and HTTP decoding. For Bro, we have turned on TCP reassembling for the same ports (even if otherwise Bro would not reassemble them because none of the usual event handlers indicated interest in traffic for those ports), enabled its memory-saving configuration ("@load reduce-memory"), and used an inactivity timeout of 30 seconds (in correspondence with Snort's default session timeout). We configured both systems to consider all packets contained in the traces. We used the version 1.9 branch of Snort, and version 0.8a1 of Bro.
4.2 Difficulties of Evaluating NIDSs
The evaluation of a NIDS is a challenging undertaking, both in terms of assessing attack recognition and in terms of assessing performance. Several efforts to develop objective measures have been made in the past (e.g., [21, 22, 15]), while others stress the difficulties with such approaches [24]. During our evaluation, we encountered several additional problems that we discuss here. While these arose in the specific context of comparing Snort and Bro, their applicability is more general.
When comparing two NIDSs, differing internal semantics can present a major problem. Even if both systems basically perform the same task—capturing network packets, rebuilding payload, decoding protocols—that task is sufficiently complex that it is almost inevitable that the systems will do it somewhat differently. When coupled with the need to evaluate a NIDS over a large traffic trace (millions of packets), which presents ample opportunity for the differing semantics to manifest, the result is that understanding the significance of the disagreement between the two systems can entail significant manual effort.
One example is the particular way in which TCP streams are reassembled. Due to state-holding time-outs, ambiguities (see [27, 16] and [25] for discussion of how these occur for benign reasons in practice) and non-analyzed packets (which can be caused by packet filter drops, or by internal sanity checks), TCP stream analyzers will generally wind up with slightly differing answers for corner cases. Snort, for example, uses a preprocessor that collects a number of packets belonging to the same session until certain thresholds are reached and then combines them into "virtual" packets. The rest of Snort is not aware of the reassembling and still only sees packets.
Bro, on the other hand, has an intrinsic notion of a data stream. It collects as much payload as needed to correctly reconstruct the next in-sequence chunk of a stream and passes these data chunks on as soon as it is able to. The analyzers are aware of the fact that they get their data chunk-wise, and track their state across chunks. They are not aware of the underlying packetization that led to those chunks. While Bro's approach allows true stream-wise signatures, it also means that the signature engine loses the notion of "packet size": packets and session payload are decoupled for most of Bro's analyzers. However, Snort's signature format includes a way of specifying the packet size. Our signature engine must fake up an equivalent by using the size of the first matched payload chunk for each connection, which can lead to differing results.
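To illustrate the practical consequence of this difference, here is a small Python sketch (our own illustrative model, not code from either system; the pattern and payload are made up) showing how a string that straddles a packet boundary is missed by purely packet-wise matching but found when matching over the reassembled byte stream:

```python
import re

PATTERN = re.compile(rb"/scripts/root\.exe")  # example pattern, chosen for illustration

# Payload of one TCP connection split across packets so that the
# pattern straddles a packet boundary.
packets = [b"GET /scripts/ro", b"ot.exe?/c+dir HTTP/1.0\r\n\r\n"]

# Packet-wise matching: inspect each packet in isolation.
packet_hits = [bool(PATTERN.search(p)) for p in packets]
print("per-packet matches:", packet_hits)              # [False, False]

# Stream-wise matching: inspect the reassembled stream.
stream = b"".join(packets)
print("stream match:", bool(PATTERN.search(stream)))   # True
```

The sketch simply concatenates the packets; Bro instead carries matcher state across chunks, but the observable effect on a boundary-straddling pattern is the same.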
Another example of differing semantics comes from the behavior of protocol analyzers. Even when two NIDSs both decode the same protocol, they will differ in the level of detail and their interpretation of protocol corner cases and violations (which, as mentioned above, are in fact seen in non-attack traffic [25]). For example, both Bro and Snort extract URIs from HTTP sessions, but they do not interpret them equally in all situations. Character encodings within URIs are sometimes decoded differently, and neither contains a full Unicode decoder. The anti-IDS tool Whisker [37] can actively exploit these kinds of deficiencies. Similarly, Bro decodes pipelined HTTP sessions; Snort does not (it only processes the first URI in a series of pipelined HTTP requests).
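A toy example of this pipelining difference, again as an illustrative Python sketch rather than either system's actual parser (the payload and helper are made up): extracting only the first request line of a pipelined HTTP payload exposes one URI, while parsing every request line exposes all of them.

```python
# Two pipelined HTTP requests sent back-to-back on one connection.
payload = (b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"
           b"GET /cgi-bin/phf?Qalias=x HTTP/1.1\r\nHost: example.com\r\n\r\n")

def request_uris(data: bytes):
    """Return the URI of every request line found in the payload."""
    uris = []
    for line in data.split(b"\r\n"):
        parts = line.split(b" ")
        if len(parts) == 3 and parts[0] in (b"GET", b"POST", b"HEAD"):
            uris.append(parts[1])
    return uris

all_uris = request_uris(payload)
print("all URIs:", all_uris)             # both requests are visible
print("first URI only:", all_uris[:1])   # what a first-URI-only analyzer inspects
```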
Usually, the details of a NIDS can be controlled by a number of options. But frequently for a Bro option there is no equivalent Snort option, and vice versa. For example, the amount of memory used by Snort's TCP reassembler can be bounded to a fixed value. If this limit is reached, old data is expired aggressively. Bro relies solely on time-outs. Options like these often involve time-memory trade-offs. The more memory we have, the more we can spend for Snort's reassembler, and the larger we can make Bro's time-outs. But how to choose the values, so that both will utilize the same amount of memory? And even if we do, how to arrange that both expire the same old data? The hooks to do so simply aren't there.
The result of these differences is differing views of the same network data. If one NIDS reports an alert while the other does not, it may take a surprisingly large amount of effort to tell which one of them is indeed correct. More fundamentally, this depends on the definition of "correct," as generally both are correct within their own semantics. From a user's point of view, this leads to different alerts even when both systems seem to use the same signatures. From an evaluator's point of view, we have to (i) grit our teeth and be ready to spend substantial effort in tracking down the root cause when validating the output of one tool versus another, and (ii) be very careful in how we frame our assessment of the differences, because there is to some degree a fundamental problem of "comparing apples and oranges".
The same applies for measuring performance in terms of efficiency. If two systems do different things, it is hard to compare them fairly. Again, the HTTP analyzers of Snort and Bro illustrate this well. While Snort only extracts the first URI from each packet, Bro decodes the full HTTP session, including tracking multiple requests and replies (which entails processing the numerous ways in which HTTP delimits data entities, including "multipart MIME" and "chunking"). Similarly, Bro provides much more information at various other points than the corresponding parts of Snort.
But there are still more factors that influence performance. Even if one system seems to be significantly faster than another, this can change by modifying the input or even the underlying hardware. One of our main observations along these lines is that the performance of NIDSs can depend heavily on the particular input trace. On a Pentium-3 system, Snort needs 440 CPU seconds for the trace LBL-Web (see Figure 6). This only decreases by 6% when using the set-wise pattern matcher of [12]. In addition, we devised a small modification to Snort that, compared to the original version, speeds it up by a factor of 2.6 for this particular trace. (The modification is an enhancement to the set-wise matcher: the original implementation first performs a set-wise search for all of the possible strings, caching the results, and then iterates through the lists of signatures, looking up for each in turn whether its particular strings were matched. Our modification uses the result of the set-wise match to identify potential matching signatures directly if the corresponding list is large, avoiding the iteration.)
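The following Python sketch models the difference between the two lookup strategies. It is an illustrative reconstruction based on the description above, not Snort's actual code; the data structures, signature IDs, and payload are our own assumptions. The original approach iterates over every signature and asks whether its strings were matched; the modification inverts the lookup, going from matched strings directly to candidate signatures:

```python
# Illustrative model of the two candidate-selection strategies.
signatures = {
    "sid-1287": [b"/scripts/"],
    "sid-1002": [b"cmd.exe"],
    "sid-1113": [b"<script>"],
}

# Reverse index: string -> signatures that require it.
string_to_sigs = {}
for sid, strings in signatures.items():
    for s in strings:
        string_to_sigs.setdefault(s, set()).add(sid)

def setwise_match(payload):
    """Stand-in for the set-wise string search: which strings occur at all?"""
    return {s for s in string_to_sigs if s in payload}

def candidates_original(payload):
    # Original: iterate over *all* signatures, checking each against the
    # cached set-wise results.
    matched = setwise_match(payload)
    return {sid for sid, strings in signatures.items()
            if any(s in matched for s in strings)}

def candidates_modified(payload):
    # Modification: map matched strings directly to their signatures,
    # avoiding the per-signature iteration.
    matched = setwise_match(payload)
    out = set()
    for s in matched:
        out |= string_to_sigs[s]
    return out

payload = b"GET /scripts/..%c0%af../winnt/system32/cmd.exe?/c+dir"
assert candidates_original(payload) == candidates_modified(payload)
print(candidates_modified(payload))
```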
Figure 6: Run-times on different hardware (run-times in seconds on the Web trace for a 512 MHz Pentium-3 and a 1.5 GHz Pentium-4; shown for Snort, Snort-[FV01], Snort-Modified, Bro w/o DFA cache, and Bro w/ DFA cache)
Using the trace USB-Full, however, the improvement realized by our modified set-wise matcher for Snort is only a factor of 1.2. Even more surprisingly, on a trace from another environment (a research laboratory with 1,500 workstations and supercomputers), the original version of Snort is twice as fast as the set-wise implementation of [12] (148 CPU secs vs. 311 CPU secs), while our patched version lies in between (291 CPU secs). While the reasons remain to be discovered in Snort's internals, this demonstrates the difficulty of finding representative traffic as proposed, for example, in [15].
Furthermore, relative performance does not only depend on the input but even on the underlying hardware. As described above, the original Snort needs 440 CPU seconds for LBL-Web on a Pentium-3 based system. Using exactly the same configuration and input on a Pentium-4 based system (1.5GHz), it actually takes 29 CPU seconds more. But now the difference between stock Snort and our modified version is a factor of 5.8! On the same system, Bro's run-time decreases from 280 to 156 CPU seconds.4
4This latter figure corresponds to about 35,000 packets per second, though we strongly argue that measuring performance in PPS rates implies undue generality, since, as developed above, the specifics of the packets make a great difference in the results.
Without detailed hardware-level analysis, we can only guess why Snort suffers from the upgrade. To do so, we ran valgrind's [34] cache simulation on Snort. For the second-level data cache, it shows a miss-rate of roughly 10%. The corresponding value for Bro is below 1%. While we do not know if valgrind's values are airtight, they could at least be the start of an explanation. We have heard other anecdotal comments that the Pentium-4 performs quite poorly for applications with lots of cache-misses. On the other hand, by building Bro's regular expression matcher incrementally, as a side effect the DFA tables will wind up having memory locality that somewhat reflects the dynamic patterns of the state accesses, which will tend to decrease cache misses.
4.3 Performance Evaluation
We now present measurements of the performance of the Bro signature engine compared with Snort, keeping in mind the difficulties described above. Figure 7 shows run-times on trace subsets of different length for the USB-Full trace. We show CPU times for the original implementation of Snort, for Snort using [12] (virtually no difference in performance), for Snort modified by us as described in the previous section, for Bro with a limited DFA state cache, and for Bro without a limited DFA state cache. We see that our modified Snort runs 18% faster than the original one, while the cache-less Bro takes about the same amount of time. Bro with a limited state cache needs roughly a factor of 2.2 more time.
We might think that the discrepancy between Bro operating with a limited DFA state cache and it operating with unlimited DFA state memory is due to it having to spend considerable time recomputing states previously expired from the limited cache. This, however, turns out not to be the case. Additional experiments with essentially infinite cache sizes indicate that the performance decrease is due to the additional overhead of maintaining the cache.
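As a rough illustration of the mechanism being measured here, the sketch below builds a DFA for a set of fixed patterns lazily and keeps its transitions in a bounded LRU cache; evicted entries are simply recomputed on demand. This is our own simplified Python model (the state representation, cache policy, cache size, and patterns are all assumptions), not Bro's implementation, but it shows where the bookkeeping overhead of a limited cache comes from even when recomputation is rare.

```python
from collections import OrderedDict

PATTERNS = [b"root.exe", b"cmd.exe", b"default.ida"]

def step_one(pattern: bytes, matched: int, byte: int) -> int:
    """Longest prefix of `pattern` that is a suffix of the matched prefix
    extended by `byte` (naive KMP-style computation)."""
    text = pattern[:matched] + bytes([byte])
    for length in range(min(len(pattern), len(text)), 0, -1):
        if text.endswith(pattern[:length]):
            return length
    return 0

class LazyDFA:
    """DFA states are tuples of per-pattern matched lengths; transitions are
    computed on demand and kept in a bounded LRU cache."""
    def __init__(self, patterns, cache_size=4096):
        self.patterns = patterns
        self.cache = OrderedDict()            # (state, byte) -> next state
        self.cache_size = cache_size
        self.start = tuple(0 for _ in patterns)
        self.recomputations = 0

    def transition(self, state, byte):
        key = (state, byte)
        nxt = self.cache.get(key)
        if nxt is None:
            self.recomputations += 1
            nxt = tuple(step_one(p, m, byte)
                        for p, m in zip(self.patterns, state))
            if len(self.cache) >= self.cache_size:
                self.cache.popitem(last=False)   # evict least recently used
            self.cache[key] = nxt
        else:
            self.cache.move_to_end(key)          # LRU bookkeeping on every hit
        return nxt

    def scan(self, data: bytes):
        state = self.start
        hits = []
        for i, b in enumerate(data):
            state = self.transition(state, b)
            for p, m in zip(self.patterns, state):
                if m == len(p):
                    hits.append((i, p))
        return hits

dfa = LazyDFA(PATTERNS, cache_size=256)
print(dfa.scan(b"GET /scripts/root.exe?/c+dir HTTP/1.0"))
```

Note that the per-hit LRU bookkeeping happens on every byte regardless of whether any state is ever evicted, which is the kind of fixed maintenance cost the measurements above point to.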
While this looks like a significant impact, we note that it is not clear whether the space savings of a cache is in fact needed in operational use. For this trace, only 2,669 DFA states had to be computed, totaling roughly 10MB. When running Bro operationally for a day at the university's gateway, the number of states rapidly climbs to about 2,500 in the first hour, but then from that point on only slowly rises to a bit over 4,000 by the end of the day.
A remaining question, however, is whether an attacker could create traffic specifically tailored to enlarge the DFAs (a "state-holding" attack on the IDS), perhaps by sending a stream of packets that nearly trigger each of the different patterns. Additional research is needed to further evaluate this threat.
Comparing for USB-Full the alerts generated by Snort to the signature matches reported by Bro, all in all we find very good agreement. The main difference is the way they report a match. By design, Bro reports all matching signatures, but each one only once per connection. This is similar to the approach suggested in [10]. Snort, on the other hand, reports the first matching signature for each packet, independently of the connection it belongs to. This makes it difficult to compare the matches. We account for these differences by comparing connections for which at least one match is generated by either system. With USB-Full, we get 2,065 matches by Bro in total on 1,313 connections. Snort reports 4,147 alerts. When counting each alert only once per connection, Snort produces 1,320 on 1,305 connections.5 There are 1,296 connections for which both generate at least one alert, and 17 (9) for which Bro (Snort) reports a match but not Snort (Bro).
5Most of the duplicates are ICMP Destination Unreachable messages. Using Bro's terminology, we define all ICMP packets between two hosts as belonging to one "connection."
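The normalization step used for this comparison can be modeled in a few lines of Python (an illustrative sketch with made-up record formats, not our actual evaluation scripts): reduce each system's output to per-connection sets of signatures, count each (connection, signature) pair once, and then compare which connections each system alerted on.

```python
# Hypothetical, simplified alert records: (src, dst, src_port, dst_port, sig_id).
bro_matches = [
    ("10.0.0.1", "10.0.0.9", 3201, 80, "sid-1287"),
    ("10.0.0.1", "10.0.0.9", 3201, 80, "sid-1287"),   # duplicate within connection
    ("10.0.0.2", "10.0.0.9", 4410, 80, "sid-1002"),
]
snort_alerts = [
    ("10.0.0.1", "10.0.0.9", 3201, 80, "sid-1287"),
]

def per_connection(alerts):
    """Map each connection 4-tuple to the set of signatures reported on it."""
    conns = {}
    for src, dst, sport, dport, sid in alerts:
        conns.setdefault((src, dst, sport, dport), set()).add(sid)
    return conns

bro = per_connection(bro_matches)
snort = per_connection(snort_alerts)

both = set(bro) & set(snort)
only_bro = set(bro) - set(snort)
only_snort = set(snort) - set(bro)
print(f"{len(both)} connections alerted by both, "
      f"{len(only_bro)} only by Bro, {len(only_snort)} only by Snort")
```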
Looking at individual signatures, we see that Bro misses 10 matches of Snort. 5 of them are caused by Snort ID #1013 (WEB-IIS fpcount access). The corresponding connections contain several requests, but an idle time larger than the defined inactivity timeout of 30 seconds. Therefore, Bro flushes the state before it can encounter the match which would happen later in the session. On the other hand, Bro reports 41 signature matches for connections for which Snort does not report anything. 37 of them are Web signatures. The discrepancy is due to different TCP stream semantics. Bro and Snort have slightly different definitions of when a session is established. In addition, the semantic differences between stream-wise and packet-wise matching discussed in §4.2 cause some of the additional alerts.
Figure 7: Run-time comparison on 550MHz Pentium-3 (run-time in seconds for USB-Full as a function of trace length in minutes, 0–30; shown for Bro w/o state cache, Bro w/ state cache, Snort, Snort [FV01], and Snort patched)
We have done similar measurements with LBL-Web. Due to limited space, we omit the corresponding plot here. While the original Snort takes 440 CPU seconds for the trace, Bro without (with) a limited state cache needs 280 (328) CPU seconds, and Snort as modified by us needs only 164 CPU seconds. While this suggests room for improvement in some of Bro's internal data structures, Bro's matcher still compares quite well to the typical Snort configuration.
For this trace, Bro (Snort) reports 2,764 (2,049) matches in total. If we count Snort's alerts only once per connection, there are 1,472 of them. There are 1,395 connections for which both report at least one alert. For 133 (69) connections, Bro (Snort) reports a match but Snort (Bro) does not. Again, looking at individual signatures, Bro misses 73 of Snort's alerts. 25 of them are matches of Snort signature #1287 (WEB-IIS scripts access). These are all caused by the same host. The reason is packets missing from the trace, which, due to a lack of in-order sequencing, prevent the TCP stream from being reassembled by Bro. Another 19 are due to signature #1287 (CodeRed v2 root.exe access). The ones of these we inspected further were due to premature server-side resets, which Bro correctly identifies as the end of the corresponding connections, while Snort keeps matching on the traffic still being sent by the client. Bro reports 186 signature matches for connections for which Snort does not report a match at all. 68 of these connections simultaneously trigger three signatures (#1002, #1113, #1287). 46 are due to simultaneous matches of signatures #1087 and #1242. Looking at some of them, one reason is SYN packets missing from the trace. Their absence leads to different interpretations of established sessions by Snort and Bro, and therefore to different matches.
5. CONCLUSIONS
In this work, we develop the general notion of contextual signatures as an improvement on the traditional form of string-based signature-matching used by NIDS. Rather than matching fixed strings in isolation, contextual signatures augment the matching process with both low-level context, by using regular expressions for matching rather than simply fixed strings, and high-level context, by taking advantage of the rich, additional semantic context made available by Bro's protocol analysis and scripting language.
By tightly integrating the new signature engine into Bro's event-based architecture, we achieve several major improvements over other signature-based NIDSs such as Snort, which frequently suffer from generating a huge number of alerts. By interpreting a signature-match only as an event, rather than as an alert by itself, we are able to leverage Bro's context and state-management mechanisms to improve the quality of alerts. We showed several examples of the power of this approach: matching requests with replies, recognizing exploit scans, making use of vulnerability profiles, and defining dependencies between signatures to model attacks that span multiple connections. In addition, by converting the freely available signature set of Snort into Bro's language, we are able to build upon existing community efforts.
As a baseline, we evaluated our signature engine using Snort as a reference, comparing the two systems in terms of both run-time performance and generated alerts using the signature set archived at [2]. But in the process of doing so, we encountered several general problems when comparing NIDSs: differing internal semantics, incompatible tuning options, the difficulty of devising "representative" input, and extreme sensitivity to hardware particulars. The last two are particularly challenging, because there are no a priori indications when comparing performance on one particular trace and hardware platform that we might obtain very different results using a different trace or hardware platform. Thus, we must exercise great caution in interpreting comparisons between NIDSs.
Based on this work, we are now in the process of deploying Bro's contextual signatures operationally in several educational, research and commercial environments.
Finally, we have integrated our work into version 0.8 of the Bro distribution, freely available at [5].
6. ACKNOWLEDGMENTS
We would like to thank the Lawrence Berkeley National Laboratory (LBL), Berkeley, USA; the National Energy Research Scientific Computing Center (NERSC), Berkeley, USA; and Saarland University, Germany. We are in debt to Anja Feldmann for making this work possible. Finally, we would like to thank the anonymous reviewers for their valuable suggestions.
7. REFERENCES
[1] arachNIDS. http://whitehats.com/ids/.
[2] Web archive of versions of software and signatures used in this paper. http://www.net.in.tum.de/~robin/ccs03.
[3] S. Axelsson. The base-rate fallacy and the difficulty of intrusion detection. ACM Transactions on Information and System Security, 3(3):186–205, August 2000.
[4] R. G. Bace. Intrusion Detection. Macmillan Technical Publishing, Indianapolis, IN, USA, 2000.
[5] Bro: A System for Detecting Network Intruders in Real-Time. http://www.icir.org/vern/bro-info.html.
[6] Bugtraq. http://www.securityfocus.com/bid/1187.
[7] CERT Advisory CA-2002-27 Apache/mod_ssl Worm. http://www.cert.org/advisories/CA-2002-27.html.
[8] C. J. Coit, S. Staniford, and J. McAlerney. Towards Faster Pattern Matching for Intrusion Detection or Exceeding the Speed of Snort. In Proc. 2nd DARPA Information Survivability Conference and Exposition, June 2001.
[9] Common Vulnerabilities and Exposures. http://www.cve.mitre.org.
[10] H. Debar and B. Morin. Evaluation of the Diagnostic Capabilities of Commercial Intrusion Detection Systems. In Proc. Recent Advances in Intrusion Detection, number 2516 in Lecture Notes in Computer Science. Springer-Verlag, 2002.
[11] R. Fielding et al. Hypertext Transfer Protocol – HTTP/1.1. Request for Comments 2616, June 1999.
[12] M. Fisk and G. Varghese. Fast Content-Based Packet Handling for Intrusion Detection. Technical Report CS2001-0670, UC San Diego, May 2001.
[13] Fyodor. Remote OS detection via TCP/IP Stack Finger Printing. Phrack Magazine, 8(54), 1998.
[14] J. Haines, L. Rossey, R. Lippmann, and R. Cunningham. Extending the 1999 Evaluation. In Proc. 2nd DARPA Information Survivability Conference and Exposition, June 2001.
[15] M. Hall and K. Wiley. Capacity Verification for High Speed Network Intrusion Detection Systems. In Proc. Recent Advances in Intrusion Detection, number 2516 in Lecture Notes in Computer Science. Springer-Verlag, 2002.
[16] M. Handley, C. Kreibich, and V. Paxson. Network intrusion detection: Evasion, traffic normalization, and end-to-end protocol semantics. In Proc. 10th USENIX Security Symposium, Washington, D.C., August 2001.
[17] J. Heering, P. Klint, and J. Rekers. Incremental generation of lexical scanners. ACM Transactions on Programming Languages and Systems (TOPLAS), 14(4):490–520, 1992.
[18] J. E. Hopcroft and J. D. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison Wesley, 1979.
[19] K. Jackson. Intrusion detection system product survey. Technical Report LA-UR-99-3883, Los Alamos National Laboratory, June 1999.
[20] U. Lindqvist and P. A. Porras. Detecting computer and network misuse through the production-based expert system toolset (P-BEST). In Proc. IEEE Symposium on Security and Privacy. IEEE Computer Society Press, May 1999.
[21] R. Lippmann, R. K. Cunningham, D. J. Fried, I. Graf, K. R. Kendall, S. E. Webster, and M. A. Zissman. Results of the 1998 DARPA Offline Intrusion Detection Evaluation. In Proc. Recent Advances in Intrusion Detection, 1999.
[22] R. Lippmann, J. W. Haines, D. J. Fried, J. Korba, and K. Das. The 1999 DARPA off-line intrusion detection evaluation. Computer Networks, 34(4):579–595, October 2000.
[23] R. Lippmann, S. Webster, and D. Stetson. The Effect of Identifying Vulnerabilities and Patching Software on the Utility of Network Intrusion Detection. In Proc. Recent Advances in Intrusion Detection, number 2516 in Lecture Notes in Computer Science. Springer-Verlag, 2002.
[24] J. McHugh. Testing Intrusion detection systems: A critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory. ACM Transactions on Information and System Security, 3(4):262–294, November 2000.
[25] V. Paxson. Bro: A system for detecting network intruders in real-time. Computer Networks, 31(23–24):2435–2463, 1999.
[26] P. A. Porras and P. G. Neumann. EMERALD: Event monitoring enabling responses to anomalous live disturbances. In National Information Systems Security Conference, Baltimore, MD, October 1997.
[27] T. H. Ptacek and T. N. Newsham. Insertion, evasion, and denial of service: Eluding network intrusion detection. Technical report, Secure Networks, Inc., January 1998.
[28] M. J. Ranum, K. Landfield, M. Stolarchuk, M. Sienkiewicz, A. Lambeth, and E. Wall. Implementing a generalized tool for network monitoring. In Proc. 11th Systems Administration Conference (LISA), 1997.
[29] M. Roesch. Snort: Lightweight intrusion detection for networks. In Proc. 13th Systems Administration Conference (LISA), pages 229–238. USENIX Association, November 1999.
[30] R. Sekar and P. Uppuluri. Synthesizing fast intrusion prevention/detection systems from high-level specifications. In Proc. 8th USENIX Security Symposium. USENIX Association, August 1999.
[31] U. Shankar and V. Paxson. Active Mapping: Resisting NIDS Evasion Without Altering Traffic. In Proc. IEEE Symposium on Security and Privacy, 2003.
[32] S. T. Eckmann. Translating Snort rules to STATL scenarios. In Proc. Recent Advances in Intrusion Detection, October 2001.
[33] tcpdump. http://www.tcpdump.org.
[34] Valgrind. http://developer.kde.org/~sewardj.
[35] G. Vigna, S. Eckmann, and R. Kemmerer. The STAT Tool Suite. In Proc. 1st DARPA Information Survivability Conference and Exposition, Hilton Head, South Carolina, January 2000. IEEE Computer Society Press.
[36] G. Vigna and R. A. Kemmerer. NetSTAT: A network-based intrusion detection system. Journal of Computer Security, 7(1):37–71, 1999.
[37] Whisker. http://www.wiretrip.net/rfp.
Protecting Browsers from DNS Rebinding AttacksCollin Jacks.docx

  • 1. Protecting Browsers from DNS Rebinding Attacks Collin Jackson Stanford University [email protected] Adam Barth Stanford University [email protected] Andrew Bortz Stanford University [email protected] Weidong Shao Stanford University [email protected] Dan Boneh Stanford University [email protected] ABSTRACT DNS rebinding attacks subvert the same-origin policy of browsers and convert them into open network proxies. We survey new DNS rebinding attacks that exploit the inter- action between browsers and their plug-ins, such as Flash
  • 2. Player and Java. These attacks can be used to circumvent firewalls and are highly cost-e↵ ective for sending spam e- mail and defrauding pay-per-click advertisers, requiring less than $100 to temporarily hijack 100,000 IP addresses. We show that the classic defense against these attacks, called “DNS pinning,” is ine↵ ective in modern browsers. The pri- mary focus of this work, however, is the design of strong defenses against DNS rebinding attacks that protect mod- ern browsers: we suggest easy-to-deploy patches for plug-ins that prevent large-scale exploitation, provide a defense tool, dnswall, that prevents firewall circumvention, and detail two defense options, policy-based pinning and host name authorization. Categories and Subject Descriptors K.6.5 [Management of Computing and Information Systems]: Security and Protection General Terms Security, Design, Experimentation Keywords Same-Origin Policy, DNS, Firewall, Spam, Click Fraud 1. INTRODUCTION Users who visit web pages trust their browser to prevent malicious web sites from leveraging their machines to attack others. Organizations that permit JavaScript and other ac- tive content through their firewall rely on the browser to protect internal network resources from attack. To achieve Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are
  • 3. not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. CCS’07, October 29–November 2, 2007, Alexandria, Virginia, USA. Copyright 2007 ACM 978-1-59593-703-2/07/0011 ...$5.00. these security goals, modern browsers implement the same- origin policy that attempts to isolate distinct “origins,” pro- tecting sites from each other. DNS rebinding attacks subvert the same-origin policy by confusing the browser into aggregating network resources controlled by distinct entities into one origin, e↵ ectively con- verting browsers into open proxies. Using DNS rebinding, an attacker can circumvent firewalls to spider corporate in- tranets, exfiltrate sensitive documents, and compromise un- patched internal machines. An attacker can also hijack the IP address of innocent clients to send spam e-mail, commit click fraud, and frame clients for misdeeds. DNS rebinding vulnerabilities permit the attacker to read and write directly on network sockets, subsuming the attacks possible with ex- isting JavaScript-based botnets [24], which can send HTTP requests but cannot read back the responses. To mount a DNS rebinding attack, the attacker need only register a domain name, such as attacker.com, and attract web tra�c, for example by running an advertisement. In the basic DNS rebinding attack, the attacker answers DNS queries for attacker.com with the IP address of his or her own server with a short time-to-live (TTL) and serves vis- iting clients malicious JavaScript. To circumvent a firewall,
  • 4. when the script issues a second request to attacker.com, the attacker rebinds the host name to the IP address of a tar- get server that is inaccessible from the public Internet. The browser believes the two servers belong to the same origin because they share a host name, and it allows the script to read back the response. The script can easily exfiltrate the response, enabling the attacker to read arbitrary documents from the internal server, as shown in Figure 1. To mount this attack, the attacker did not compromise any DNS servers. The attacker simply provided valid, au- thoritative responses for attacker.com, a domain owned by the attacker. This attack is very di↵ erent from “pharm- ing” [34], where the attacker must compromise a host name owned by the target by subverting a user’s DNS cache or server. DNS rebinding requires no such subversion. Conse- quently, DNSSEC provides no protection against DNS re- binding attacks: the attacker can legitimately sign all DNS records provided by his or her DNS server in the attack. DNS rebinding attacks have been known for a decade [8, 36]. A common defense implemented in several browsers is DNS pinning: once the browser resolves a host name to an IP address, the browser caches the result for a fixed dura- tion, regardless of TTL. As a result, when JavaScript con- nects to attacker.com, the browser will connect back to the attacker’s server instead of the internal server. Attacker web server Target server
  • 5. Browser client Figure 1: Firewall Circumvention Using Rebinding Pinning is no longer an e↵ ective defense against DNS re- binding attacks in current browsers because of vulnerabil- ities introduced by plug-ins. These plug-ins provide addi- tional functionality, including socket-level network access, to web pages. The browser and each plug-in maintain sep- arate pin databases, creating a new class of vulnerabilities we call multi-pin vulnerabilities that permit an attacker to mount DNS rebinding attacks. We demonstrate, for exam- ple, how to exploit the interaction between the browser and Java LiveConnect to pin the browser to one IP address while pinning Java to another IP address, permitting the attacker to read and write data directly on sockets to a host and port of the attacker’s choice despite strong pinning by each component. Our experiments show how an attacker can exploit multi- pin vulnerabilities to cheaply and e�ciently assemble a tem- porary, large-scale bot network. Our findings suggest that nearly 90% of web browsers are vulnerable to rebinding at- tacks that only require a few hundreds of milliseconds to conduct (see Table 1). These attacks do not require users to click on any malicious links: users need only view an at- tacker’s web advertisement. By spending less than $100 on advertising, an attacker can hijack 100,000 unique IP ad- dress to send spam, commit click fraud, or otherwise misuse as open network proxies. The bulk of our work focuses on designing robust defenses to DNS rebinding attacks that protect current and future browsers and plug-ins:
  • 6. 1. To combat firewall circumvention, we recommend or- ganizations deploy DNS resolvers that prevent external names from resolving to internal addresses. We pro- vide an open-source implementation of such a resolver in 300 lines of C called dnswall [15]. 2. For Flash Player, Java, and LiveConnect, we suggest specific, easy-to-deploy patches to prevent multi-pin vulnerabilities, mitigating large-scale exploitation of DNS rebinding for firewall circumvention and IP hi- jacking. Technology Attack Time LiveConnect (JVM loaded) 47.8 ± 10.3 ms Flash Player 9 192 ± 5.7 ms Internet Explorer 6 (no plug-ins) 1000 ms Internet Explorer 7 (no plug-ins) 1000 ms Firefox 1.5 and 2 (no plug-ins) 1000 ms Safari 2 (no plug-ins) 1000 ms LiveConnect 1294 ± 37 ms Opera 9 (no plug-ins) 4000 ms Table 1: Time Required for DNS Rebinding Attack by Technology (95% Confidence) 3. We propose two options for protecting browsers from DNS rebinding: smarter pinning that provides better security and robustness, and a backwards-compatible use of the DNS system that fixes rebinding vulnerabil- ities at their root (which we implemented as a 72-line patch to Firefox 2). The remainder of the paper is organized as follows. Sec- tion 2 describes existing browser policy for network access. Section 3 details DNS rebinding vulnerabilities, including
  • 7. standard DNS rebinding and current multi-pin vulnerabili- ties. Section 4 explains two classes of attacks that use these vulnerabilities, firewall circumvention and IP hijacking, and contains our experimental results. Section 5 proposes de- fenses against both classes of attacks. Section 6 describes related work. Section 7 concludes. 2. NETWORK ACCESS IN THE BROWSER To display web pages, browsers are instructed to make network requests by static content such as HTML and by active content such as JavaScript, Flash Player, Java, and CSS. Browsers restrict this network access in order to to pre- vent web sites from making malicious network connections. The same-origin policy provides partial resource isolation by restricting access according to origin, specifying when content from one origin can access a resource in another ori- gin. The policy applies to both network access and browser state such as the Document Object Model (DOM) interface, cookies, cache, history, and the password database [20]. The attacks described in this paper circumvent the same origin- policy for network access. Access Within Same Origin. Within the same origin, both content and browser scripts can read and write net- work resources using the HTTP protocol. Plug-ins, such as Flash Player and Java, can access network sockets directly, allowing them to make TCP connections and, in some cases, send and receive UDP packets as well. Java does not restrict access based on port number, but Flash Player permits ac- cess to port numbers less than 1024 only if the machine authorizes the connection in an XML policy served from a port number less than 1024. Access Between Di↵ erent Origins. In general, con-
  • 8. tent from one origin can make HTTP requests to servers in another origin, but it cannot read responses, e↵ ectively restricting access to “send-only.” Flash Player permits its movies to read back HTTP responses from di↵ erent origins, provided the remote server responds with an XML policy authorizing the movie’s origin. Flash Player also permits reading and writing data on TCP connections to arbitrary port numbers, again provided the remote server responds with a suitable XML policy on an appropriate port. By convention, certain types of web content are assumed to be public libraries, such as JavaScript, CSS, Java ap- plets, and SWF movies. These files may be included across domains. For example, one origin can include a CSS file from another origin and read its text. Scripts can also read certain properties of other objects loaded across domains, such as the height and width of an image. Prohibited Access. Some types of network access are pro- hibited even within the same origin. Internet Explorer 7 blocks port numbers 19 (chargen), 21 (FTP), 25 (SMTP), 110 (POP3), 119 (NNTP), and 143 (IMAP), Firefox 2 blocks those plus 51 additional port numbers, but Safari 2 does not block any ports. Some of these port restrictions are designed to prevent malicious web site operators from leveraging vis- iting browsers to launch distributed denial of service or to send spam e-mail, whereas others prevent universal cross- site scripting via the HTML Form Protocol Attack [41]. Origin Definition. Di↵ erent definitions of “origin” are used by di↵ erent parts of the browser. For network access, browsers enforce the same-origin policy [38] based on three components of the Uniform Resource Locator (URL) from
  • 9. which it obtained the content. A typical URL is composed of the below components: scheme://hostname:port/path Current browsers treat two objects as belonging to the same origin if, and only if, their URLs contain the same scheme, host name, and port number (e.g., http://amazon.com/ is a di↵ erent origin than http://amazon.co.uk/, even though the two domains are owned by the same company). Other resources use fewer components of the URL. For example, cookies use only the host name. Objects on the Internet, however, are not accessed by host name. To connect to a server, the browser must first trans- late a host name into an IP address and then open a socket to that IP address. If one host name resolves to multiple IP addresses owned by multiple entities, the browser will treat them as if they were the same origin even though they are, from an ownership point-of-view, di↵ erent. 3. DNS REBINDING VULNERABILITIES The network access policy in web browsers is based on host names, which are bound by the Domain Name Sys- tem (DNS) to IP addresses. An attacker mounting a DNS rebinding attack attempts to subvert this security policy by binding his or her host name to both the attack and target server’s IP addresses. 3.1 Standard Rebinding Vulnerabilities A standard rebinding attack uses a single browser tech- nology (e.g. JavaScript, Java, or Flash Player) to connect to multiple IP addresses with the same host name.
  • 10. Multiple A Records. When a client resolves a host name using DNS, the authoritative server can respond with mul- tiple A records indicating the IP addresses of the host. The first attack using DNS rebinding [8] in 1996 leveraged this property to confuse the security policy of the Java Virtual Machine (JVM): 1. A client visits a malicious web site, attacker.com, con- taining a Java applet. The attacker’s DNS server binds attacker.com to two IP addresses: the attacker’s web server and the target’s web server. 2. The client executes the attacker’s applet, which opens a socket to the target. The JVM permits this connec- tion, because the target’s IP address is contained in the DNS record for attacker.com. Current versions of the JVM are not vulnerable to this at- tack because the Java security policy has been changed. Ap- plets are now restricted to connecting to the IP address from which they were loaded. (Current attacks on Java are de- scribed in Section 3.2.) In the JavaScript version of this attack, the attacker sends some JavaScript to the browser that instructs the browser to connect back to attacker.com. The attacker’s server refuses this second TCP connection, forcing the browser to switch over to the victim IP address [21]. By using a RST packet to refuse the connection, the attacker can cause some browsers to switch to the new IP address after one second. Subsequent XMLHttpRequests issued by the attacker’s code will connect to the new IP address. Time-Varying DNS. In 2001, the original attack on Java was extended [36] to use use time-varying DNS:
  • 11. 1. A client visits a malicious web site, attacker.com, containing JavaScript. The attacker’s DNS server is configured to bind attacker.com to the attacker’s IP address with a very short TTL. 2. The attacker rebinds attacker.com to the target’s IP address. 3. The malicious script uses frames or XMLHttpRequest to connect to attacker.com, which now resolves to the IP address of the target’s server. Because the connection in Step 3 has the same host name as the original malicious script, the browser permits the at- tacker to read the response from the target. Pinning in Current Browsers. Current browsers defend against the standard rebinding attack by “pinning” host names to IP address, preventing host names from referring to multiple IP addresses. • Internet Explorer 7 pins DNS bindings for 30 minutes.1 Unfortunately, if the attacker’s domain has multiple A records and the current server becomes unavailable, the browser will try a di↵ erent IP address within one second. • Internet Explorer 6 also pins DNS bindings for 30 min- utes, but an attacker can cause the browser to release its pin after one second by forcing a connection to the current IP address to fail, for example by including the element <img src="http://attacker.com:81/">. 1The duration is set by the registry keys DnsCacheTimeout and ServerInfoTimeOut in HKEY CURRENT USERSOFTWAREMicrosoft Windows
  • 12. CurrentVersionInternet Settings • Firefox 1.5 and 2 cache DNS entries for between 60 and 120 seconds. DNS entries expire when the value of the current minute increments twice. 2 Using JavaScript, the attacker can read the user’s clock and compute when the pin will expire. Using multiple A records, an attacker can further reduce this time to one second. • Opera 9 behaves similarly to Internet Explorer 6. In our experiments, we found that it pins for approxi- mately 12 minutes but can be tricked into releasing its pin after 4 seconds by connecting to a closed port. • Safari 2 pins DNS bindings for one second. Because the pinning time is so low, the attacker may need to send a “Connection: close” HTTP header to ensure that the browser does not re-use the existing TCP con- nection to the attacker. Flash Player 9. Flash Player 9 permits SWF movies to open TCP sockets to arbitrary hosts, provided the destina- tion serves an XML policy authorizing the movie’s origin [2]. According to Adobe, Flash Player 9 is installed on 55.8% of web browsers (as of December 2006) [1]; according to our own experiments, Flash Player 9 was present in 86.9% of browsers. Flash Player is vulnerable to the following re- binding attack: 1. The client’s web browser visits a malicious web site that embeds a SWF movie. 2. The SWF movie opens a socket on a port less than 1024 to attacker.com, bound to the attacker’s IP ad-
  • 13. dress. Flash Player sends <policy-file-request />. 3. The attacker responds with the following XML: <?xml version="1.0"?> <cross-domain-policy> <allow-access-from domain="*" to-ports="*" /> </cross-domain-policy> 4. The SWF movie opens a socket to an arbitrary port number on attacker.com, which the attacker has re- bound to the target’s IP address. The policy XML provided by the attacker in step 3 in- structs Flash Player to permit arbitrary socket access to attacker.com. Flash Player permits the socket connections to the target because it does not pin host names to a single IP address. If the attacker were to serve the policy file from a port number � 1024, Flash Player would authorize only ports � 1024. 3.2 Multi-Pin Vulnerabilities Current browsers use several plug-ins to render web pages, many of which permit direct socket access back to their ori- gins. Another class of rebinding attacks exploit the fact that these multiple technologies maintain separate DNS pin databases. If one technology pins to the attacker’s IP ad- dress and another pins to the target’s IP address, the at- tacker can make use of inter-technology communication to circumvent the same-origin restrictions on network access. Some of these attacks have been discussed previously in the full-disclosure community [4]. Java. Java, installed on 87.6%3 of web browsers [1], can also 2The duration is set by network.dnsCacheExpiration.
  • 14. 3We observed 98.1% penetration in our experiment. open TCP connections back to their origins. The Java Vir- tual Machine (JVM) maintains DNS pins separately from the browser, opening up the possibility of DNS rebinding vulnerabilities. Java applets themselves are not vulnerable because the JVM retrieves applets directly from the net- work, permitting the JVM to pin the origin of the applet to the correct IP address. Java is vulnerable, however, to the following attacks. • LiveConnect bridges JavaScript and the JVM in Fire- fox and Opera, permitting script access to the Java standard library, including the Socket class, without loading an applet. The browser pins to the attacker’s IP address, but the JVM spawned by LiveConnect does a second DNS resolve and pins to the target’s IP address. The attacker’s JavaScript can exploit this pin mismatch to open and communicate on a socket from the client machine to an arbitrary IP address on an arbitrary destination port, including UDP sockets with a source port number � 1024. • Applets with Proxies are also vulnerable to a multi- pin attack, regardless of which browser the client uses. If the client uses an HTTP proxy to access the web, there is yet another DNS resolver involved—the proxy. When the JVM retrieves an applet via a proxy, it re- quests the applet by host name, not by IP address. If the applet opens a socket, the JVM does a second DNS resolve and pins to the target’s IP address. • Relative Paths can cause multi-pin vulnerabilities. If a server hosts an HTML page that embeds an applet using a relative path with the parameter mayscript set to true, that machine can be the target of a multi-
  • 15. pin attack. The browser pins to the target, retrieves the HTML page, and instructs the JVM to load the applet. The JVM does a second DNS resolve, pins to the attacker, and retrieves a malicious applet. The applet instructs the browser, via JavaScript, to issue XMLHttpRequests to the target’s IP address. Flash Player. Flash Player would still be vulnerable to multi-pin attacks even if it pinned DNS bindings. Flash Player does not retrieve its movies directly from the net- work. Instead, the browser downloads the movie and spawns Flash Player, transferring the movie’s origin by host name. When the attacker’s movie attempts to open a socket, Flash Player does a second DNS resolution and would pin to the target’s IP address. The URLLoader class is not vulnerable to multi-pin attacks because it uses the browser to request the URL and thus uses the browser’s DNS pins, but the Socket class could still be used to read and write on arbitrary TCP sockets. Other Plug-ins. Other browser plug-ins permit network access, including Adobe Acrobat and Microsoft Silverlight. Acrobat restricts network communication to the SOAP pro- tocol but does not restrict access by document origin. Of- ten, the Acrobat plug-in will prompt the user before access- ing the network. Silverlight permits network access through BrowserHttpWebRequest, which uses the browser to make the request (like URLLoader in Flash Player) and thus uses the browser’s DNS pins. 4. ATTACKS USING DNS REBINDING An attacker can exploit the DNS rebinding vulnerabilities described in Section 3 to mount a number of attacks. For
  • 16. some of these attacks, the attacker requires the direct socket access a↵ orded by DNS rebinding with Flash Player and Java, whereas others require only the ability to read HTTP responses from the target. The attacks fall into two broad categories, according to the attacker’s goal: • Firewall Circumvention. The attacker can use DNS re- binding to access machines behind firewalls that he or she cannot access directly. With direct socket access, the attacker can interact with a number of internal services besides HTTP. • IP Hijacking. The attacker can also use DNS rebinding to access publicly available servers from the client’s IP address. This allows the attacker to take advantage of the target’s implicit or explicit trust in the client’s IP address. To mount these attacks, the attacker must first induce the client to load some active content. This can be done by a variety of techniques discussed in Section 4.4. Once loaded onto the client’s machine, the attacker’s code can communi- cate with any machine reachable by the client. 4.1 Firewall Circumvention A firewall restricts tra�c between computer networks in di↵ erent zones of trust. Some examples include blocking connections from the public Internet to internal machines and mediating connections from internal machines to Inter- net servers with application-level proxies. Firewall circum- vention attacks bypass the prohibition on inbound connec- tions, allowing the attacker to connect to internal servers while the user is visiting the attacker’s Internet web page (see Figure 1).
  • 17. Spidering the Intranet. The attacker need not specify the target machine by IP address. Instead, the attacker can guess the internal host name of the target, for example hr.corp.company.com, and rebind attacker.com to a CNAME record pointing to that host name. The client’s own recur- sive DNS resolver will complete the resolution and return the IP address of the target. Intranet host names are often guessable and occasionally disclosed publicly [30, 9]. This technique obviates the need for the attacker to scan IP ad- dresses to find an interesting target but does not work with the multiple A record technique described in Section 3.1. Having found a machine on the intranet, the attacker can connect to the machine over HTTP and request the root document. If the server responds with an HTML page, the attacker can follow links and search forms on that page, eventually spidering the entire intranet. Web servers inside corporate firewalls often host confidential documents, rely- ing on the firewall to prevent untrusted users from accessing the documents. Using a DNS rebinding attack, the attacker can leverage the client’s browser to read these documents and exfiltrate them to the attacker, for example by submit- ting an HTML form to the attacker’s web server. Compromising Unpatched Machines. Network admin- istrators often do not patch internal machines as quickly as Internet-facing machines because the patching process is time-consuming and expensive. The attacker can attempt to exploit known vulnerabilities in machines on the internal network. In particular, the attacker can attempt to exploit the client machine itself. The attacks against the client it- self originate from localhost and so bypass software fire- walls and other security checks, including many designed to protect serious vulnerabilities. If an exploit succeeds, the attacker can establish a presence within the firewall that
• 18. persists even after clients close their browsers. Abusing Internal Open Services. Internal networks contain many open services intended for internal use only. For example, network printers often accept print jobs from internal machines without additional authentication. The attacker can use direct socket access to command network printers to exhaust their toner and paper supplies. Similarly, users inside firewalls often feel comfortable creating file shares or FTP servers accessible to anonymous users under the assumption that the servers will be available only to clients within the network. With the ability to read and write arbitrary sockets, the attacker can exfiltrate the shared documents and use these servers to store illicit information for later retrieval. Consumer routers are often installed without changing the default password, making them an attractive target for reconfiguration attacks by web pages [40]. Firmware patches have attempted to secure routers against cross-site scripting and cross-site request forgery, in an effort to prevent reconfiguration attacks. DNS rebinding attacks allow the attacker direct socket access to the router, bypassing these defenses. 4.2 IP Hijacking Attackers can also use DNS rebinding attacks to target machines on the public Internet. For these attacks, the attacker is not leveraging the client's machine to connect to otherwise inaccessible services but instead abusing the implicit or explicit trust public services have in the client's IP address. Once the attacker has hijacked a client's IP address, there are several attacks he or she can perpetrate. Committing Click Fraud. Web publishers are often paid
• 19. by web advertisers on a per-click basis. Fraudulent publishers can increase their advertising revenue by generating fake clicks, and advertisers can drain competitors' budgets by clicking on their advertisements. The exact algorithms used by advertising networks to detect these "invalid" clicks are proprietary, but the IP address initiating the click is widely believed to be an essential input. In fact, one common use of bot networks is to generate clicks [7]. Click fraud would appear to require only the ability to send HTTP requests to the advertising network, but advertisers defend against the send-only attacks, permitted by the same-origin policy, by including a unique nonce with every advertising impression. Clicks lacking the correct nonce are rejected as invalid, requiring the attacker to read the nonce from an HTTP response in order to generate a click. This attack is highly cost-effective, as the attacker can buy advertising impressions, which cost tens of cents per thousand, and convert them into clicks, worth tens of cents each. The attack is sufficiently cost-effective that the attacker need not convert every purchased impression into a click. Instead, the fraudster can use most of the purchased impressions to generate fake impressions on the site, maintaining a believable click-through rate. Sending Spam. Many e-mail servers blacklist IP addresses known to send spam e-mail [39]. By hijacking a client's IP address, an attacker can send spam from IP addresses with clean reputations. To send spam e-mail, the attacker need only write content to SMTP servers on port 25, an action blocked by most browsers but permitted by Flash Player and Java. Additionally, an attacker will often be able to use the client's actual mail relay. Even service providers that
• 20. require successful authentication via POP3 before sending e-mail are not protected, because users typically leave their desktop mail clients open and polling their POP3 servers. Defeating IP-based Authentication. Although discouraged by security professionals [10], many Internet services still employ IP-based authentication. For example, the ACM Digital Library makes the full text of articles available only to subscribers, who are often authenticated by IP address. After hijacking an authorized IP address, the attacker can access the service, defeating the authentication mechanism. Because the communication originates from an IP address actually authorized to use the service, it can be difficult, or even impossible, for the service provider to recognize the security breach. Framing Clients. An attacker who hijacks an IP address can perform misdeeds and frame the client. For example, an attacker can attempt to gain unauthorized access to a computer system using a hijacked IP address as a proxy. As the attack originates from the hijacked IP address, the logs will implicate the client, not the attacker, in the crime. Moreover, if the attacker hosts the malicious web site over HTTPS, the browser will not cache the page and no traces will be left on the client's machine. 4.3 Proof-of-Concept Demonstration We developed proof-of-concept exploits for DNS rebinding vulnerabilities in Flash Player 9, LiveConnect, Java applets with proxy servers, and the browser itself. Our system consists of a custom DNS server authoritative for dnsrebinding.net, a custom Flash Player policy server, and a standard Apache web server. The various technologies issue DNS queries that encode the attacker and target host names, together with a nonce, in the subdomain. For each nonce, the DNS
• 21. server first responds with the attacker's IP address (with a zero TTL) and thereafter with the target's IP address. Our proof-of-concept demo, http://crypto.stanford.edu/dns, implements wget and telnet by mounting a rebinding attack against the browser. 4.4 Experiment: Recruiting Browsers Methodology. We tested DNS rebinding experimentally by running a Flash Player 9 advertisement on a minor advertising network targeting the keywords "Firefox," "game," "Internet Explorer," "video," and "YouTube." The experiment used two machines in our laboratory, an attacker and a target. The attacker ran a custom authoritative DNS server for dnsrebinding.net, a custom Flash Player policy server, and an Apache web server hosting the advertisement. The target ran an Apache web server to log successful attacks. The Flash Player advertisement exploited the vulnerability described in Section 3.1 to load an XML document from the target server in our lab. The attack required only that the client view the ad, not that the user click on the ad.
Table 2: Percentage of Impressions by Vulnerability
Flash Player 9: 86.9%
LiveConnect: 24.4%
Java+Proxy: 2.2%
Total Multi-Pin: 90.6%
• 22. [Figure 2: Duration of Successful Attacks. Two cumulative charts: the first plots Successful Attacks (percent) against Duration of Attack Success (secs) for the 75% shortest-duration attacks; the second plots Successful Attacks (log scale) against Duration of Attack Success (secs, log scale) for all attacks.]
• 23. The experiment lasted until the user navigated away from the advertisement, at which time we lost the ability to use the viewer's network connection. For privacy, we collected only properties typically disclosed by browsers when viewing web pages (e.g., plug-in support, user agent, and external IP
• 24. address). The experiment conformed to the terms of service of the advertising network and to the guidelines of the independent review board at our institution. Every network operation produced by the advertisement could have been produced by a legitimate SWF advertisement, but we produced the operations through the Socket interface, demonstrating the ability to make arbitrary TCP connections. Results. We ran the ad beginning at midnight EDT on three successive nights in late April 2007. We bid $0.50 per 1000 impressions for a variety of keywords. We spent $10 per day, garnering approximately 20,000 impressions per day. Due to a server misconfiguration, we disregarded approximately 10,000 impressions. We also disregarded 19 impressions from our university. We received 50,951 impressions from 44,924 unique IP addresses (40.2% IE7, 32.3% IE6, 23.5% Firefox, 4% Other). We ran the rebinding experiment on the 44,301 (86.9%) impressions that reported Flash Player 9. We did not attempt to exploit other rebinding vulnerabilities (see Table 2). The experiment was successful on 30,636 (60.1%) impressions and 27,480 unique IP addresses. The attack was less successful on the 1,672 impressions served to Mac OS, succeeding 36.4% of the time, compared to a success rate of 70.0% on the 49,535 (97.2%) Windows impressions (see footnote 4 below). Mac OS is more resistant to this rebinding attack due to some caching of DNS entries despite their zero TTL. For each successful experiment, we measured how long an attacker could have used the client's network access by loading the target document at exponentially longer intervals, as shown in Figure 2. The median impression duration was 32
• 25. seconds, with 25% of the impressions lasting longer than 256 seconds. We observed 9 impressions with a duration of at least 36.4 hours, 25 at least 18.2 hours, and 81 at least 9.1 hours. In aggregate, we obtained 100.3 machine-days of network access. These observations are consistent with those of [24]. The large number of attacks ending between 4.2 and 8.5 minutes suggests that this is a common duration of time for users to spend on a web page. Discussion. Our experimental results show that DNS rebinding vulnerabilities are widespread and cost-effective to exploit on a large scale. Each impression costs $0.0005 and 54% of the impressions convert to successful attacks from unique IP addresses. To hijack 100,000 IP addresses for a temporary bot network, an attacker would need to spend less than $100. This technique compares favorably to renting a traditional bot network for sending spam e-mail and committing click fraud for two reasons. First, these applications require large numbers of "fresh" IP addresses for short durations as compromised machines are quickly blacklisted. Second, while estimates of the rental cost of bot networks vary [44, 14, 7], this technique appears to be at least one or two orders of magnitude less expensive. 5. DEFENSES AGAINST REBINDING Defenses for DNS rebinding attacks can be implemented in browsers, plug-ins, DNS resolvers, firewalls, and servers. These defenses range in complexity of development, difficulty of deployment, and effectiveness against firewall circumvention and IP hijacking. In addition to necessary mitigations for Flash Player, Java LiveConnect, and browsers, we propose three long-term defenses. To protect against firewall circumvention, we propose a solution that can be deployed unilaterally by organizations at their network boundary. To fully defend against rebinding attacks, we propose
• 26. two defenses: one that requires socket-level network access be authorized explicitly by the destination server and another that works even if sockets are allowed by default. 5.1 Fixing Firewall Circumvention Networks can be protected against firewall circumvention by forbidding external host names from resolving to internal IP addresses, effectively preventing the attacker from naming the target server. Without the ability to name the target, the attacker is unable to aggregate the target server into an origin under his or her control. [Footnote 4: We succeeded in opening a socket with 2 of 11 PlayStation 3 impressions (those with Flash Player 9), but none of the 12 Nintendo Wii impressions were vulnerable.] These malicious bindings can be blocked either by filtering packets at the firewall [5] or by modifying the DNS resolvers used by clients on the network. • Enterprise. By blocking outbound traffic on port 53, a firewall administrator for an organization can force all internal machines, including HTTP proxies and VPN clients, to use a DNS server that is configured not to resolve external names to internal IP addresses. To implement this approach, we developed a 300-line C program, dnswall [15], that runs alongside BIND and enforces this policy. • Consumer. Many consumer firewalls, such as those produced by Linksys, already expose a caching DNS resolver and can be augmented with dnswall to block DNS responses that contain private IP addresses. The vendors of these devices have an incentive to patch their firewalls because these rebinding attacks can be
• 27. used to reconfigure these routers to mount further attacks on their owners. • Software. Software firewalls, such as the Windows Firewall, can also prevent their own circumvention by blocking DNS resolutions to 127.*.*.*. This technique does not defend services bound to the external network interface but does protect a large number of services that bind only to the loopback interface. Blocking external names from resolving to internal addresses prevents firewall circumvention but does not defend against IP hijacking. An attacker can still use internal machines to attack services running on the public Internet. 5.2 Fixing Plug-ins Plug-ins are a particular source of complexity in defending against DNS rebinding attacks because they enable sub-second attacks, provide socket-level network access, and operate independently from browsers. In order to prevent rebinding attacks, these plug-ins must be patched. Flash Player. When a SWF movie opens a socket to a new host name, it requests a policy over the socket to determine whether the host accepts socket connections from the origin of the movie. Flash Player could fix most of its rebinding vulnerabilities by considering a policy valid for a socket connection only if it obtained the policy from the same IP address in addition to its current requirement that it obtained the policy from the same host name. Using this design, when attacker.com is rebound to the target IP address, Flash Player will refuse to open a socket to that address unless the target provides a policy authorizing attacker.com. This simple refinement uses existing Flash Player policy deployments and is backwards compatible, as
• 28. host names expecting Flash Player connections already serve policy documents from all of their IP addresses. SWF movies can also access port numbers of 1024 and above on their origin host name without requesting a policy. Although the majority of services an attacker can profitably target (e.g., SMTP, HTTP, HTTPS, SSH, FTP, NNTP) are hosted on low-numbered ports, other services such as MySQL, BitTorrent, IRC, and HTTP proxies are vulnerable. To fully protect against rebinding attacks, Flash Player could request a policy before opening sockets to any port, even back to its origin. However, this modification breaks backwards compatibility because those servers might not already be serving policy files. Java. Many deployed Java applets expect sockets to be allowed by default. If clients are permitted to use these applets from behind HTTP proxies, they will remain vulnerable to multi-pin attacks because proxy requests are made by host name instead of by IP address. A safer approach is to use the CONNECT method to obtain a proxied socket connection to an external machine. Typically proxies only allow CONNECT on port 443 (HTTPS), making this the only port available for these applets. Alternatively, proxies can use HTTP headers to communicate IP addresses of hosts between the client and the proxy [28, 29], but this approach requires both the client and the proxy to implement the protocol. Java LiveConnect. LiveConnect introduces additional vulnerabilities, but browsers can fix the LiveConnect multi-pin vulnerability without altering the JVM by installing their own DNS resolver into the JVM using a standard interface. Firefox, in particular, implements LiveConnect
• 29. through the Java Native Interface (JNI). When Firefox initializes the JVM, it can install a custom InetAddress class that will handle DNS resolution for the JVM. This custom class should contain a native method that implements DNS resolution using Firefox's DNS resolver instead of the system resolver. If the browser implements pinning, LiveConnect and the browser will use a common pin database, removing multi-pin vulnerabilities. 5.3 Fixing Browsers (Default-Deny Sockets) Allowing direct socket access by default precludes many defenses for DNS rebinding attacks. If browser plug-ins defaulted to denying socket access, as a patched Flash Player and the proposed TCPConnection (specified in HTML5 [19]) would, these defenses would become viable. Java and LiveConnect, along with any number of lesser-known plug-ins, expect socket access to be allowed, and fixing these is a challenge. Checking Host Header. HTTP 1.1 requires that user agents include a Host header in HTTP requests that specifies the host name of the server [11]. This feature is used extensively by HTTP proxies and by web servers to host many virtual hosts on one IP address. If sockets are denied by default, the Host header reliably indicates the host name being used by the browser to contact the server because XMLHttpRequest [43] and related technologies are restricted from spoofing the Host header (see footnote 5 below). One server-side defense for these attacks is therefore to reject incoming HTTP requests with unexpected Host headers [28, 37]. Finer-grained Origins. Another defense against DNS rebinding attacks is to refine origins to include additional information, such as the server's IP address [28] or public key [27, 23], so that when the attacker rebinds attacker.com
• 30. to the target, the browser will consider the rebound host name to be a new origin. One challenge to deploying finer-grained origins is that every plug-in would need to revise its security policies and interacting technologies would need to hand off refined origins correctly. [Footnote 5: Lack of integrity of the Host header has been a recurring source of security vulnerabilities, most notably in Flash Player 7.] • IP Addresses. Refining origins with IP address [28] is more robust than pinning in that a single browsing session can fail over from one IP address to another. When such a fail-over occurs, however, it will likely break long-lived AJAX applications, such as Gmail, because they will be prevented from making XMLHttpRequests to the new IP address. Users can recover from this by clicking the browser's reload button. Unfortunately, browsers that use a proxy server do not know the actual IP address of the remote server and thus cannot properly refine origins. Also, this defense is vulnerable to an attack using relative paths to script files, similar to the applet relative-path vulnerability described in Section 3.2. • Public Keys. Augmenting origins with public keys [27, 23] prevents two HTTPS pages served from the same domain with different public keys from reading each other's state. This defense is useful when users dismiss HTTPS invalid certificate warnings and chiefly protects HTTPS-only "secure" cookies from network attackers. Many web pages, however, are not served over HTTPS, rendering this defense more appropriate for pharming attacks that compromise victim domains than for rebinding attacks.
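A minimal sketch of the IP-refined origin check described in the first bullet above, assuming a hypothetical browser that records the server IP address from which each document was loaded. The Origin class and may_access function are invented for illustration, not part of any browser.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Origin:
        scheme: str
        host: str
        port: int
        server_ip: str  # the refinement: the IP address the content was actually loaded from

    def may_access(script_origin: Origin, resource_origin: Origin) -> bool:
        # Under IP-refined origins, content loaded from the attacker's server cannot
        # read content fetched after the host name was rebound to the target's IP,
        # even though scheme, host, and port are identical.
        return script_origin == resource_origin

    attacker_page = Origin("http", "attacker.example", 80, "203.0.113.7")
    after_rebind = Origin("http", "attacker.example", 80, "10.0.0.5")
    print(may_access(attacker_page, after_rebind))  # False: treated as distinct origins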
• 31. Smarter Pinning. To mitigate rebinding attacks, browsers can implement smarter pinning policies. Pinning is a defense for DNS rebinding that trades off robustness for security. RFC 1035 [32] provides for small (and even zero) TTLs to enable dynamic DNS and robust behavior in the case of server failure, but respecting these TTLs allows rebinding attacks. Over the last decade, browsers have experimented with different pin durations and release heuristics, leading some vendors to shorten their pin duration to improve robustness [13]. However, duration is not the only parameter that can be varied in a pinning policy. Browsers can vary the width of their pins by permitting host names to be rebound within a set of IP addresses that meet some similarity heuristic. Selecting an optimal width as well as duration enables a better trade-off between security and robustness than optimizing duration alone. One promising policy is to allow rebinding within a class C network. For example, if a host name resolved to 171.64.78.10, then the client would also accept any IP address beginning with 171.64.78 for that host name. The developers of the NoScript Firefox extension [26] have announced plans [25] to adopt this pinning heuristic. • Security. When browsers use class C network pinning, the attacker must locate the attack server on the same class C network as the target, making the rebinding attack much more difficult to mount. The attack is possible only if the attacker co-locates a server at the same hosting facility or leverages a cross-site scripting vulnerability on a co-located server. This significantly raises the bar for the attacker and provides better recourse for the target. • Robustness. To study the robustness of class C network pinning, we investigated the IP addresses re-
  • 32. ported by the 100 most visited English-language sites (according to Alexa [3]). We visited the home page of these sites and compiled a list of the 336 host names used for embedded content (e.g., www.yahoo.com em- beds images from us.i1.yimg.com). We then issued DNS queries for these hosts every 10 minutes for 24 hours, recording the IP addresses reported. In this experiment, 58% reported a single IP address consistently across all queries. Note that geographic load balancing is not captured in our data because we issued our queries from a single machine, mimicking the behavior of a real client. Averaged over the 42% of hosts reporting multiple IP addresses, if a browser pinned to an IP address at random, the expected frac- tion of IP addresses available for rebinding under class C network pinning is 81.3% compared with 16.4% un- der strict IP address pinning, suggesting that class C pinning is significantly more robust to server failure. Other heuristics for pin width are possible. For example, the browser could prevent rebinding between public IP ad- dresses and the RFC 1918 [35] private IP addresses. This provides greater robustness for fail-overs across data centers and for dynamic DNS. LocalRodeo [22, 45] is a Firefox ex- tension that implements RFC 1918 pinning for JavaScript. As for security, RFC 1918 pinning largely prevents firewall circumvention but does not protect against IP hijacking nor does it prevent firewall circumvention in the case where a firewall protects non-private IP addresses, which is the case for many real-life protected networks and personal software firewalls.
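The pin-width heuristics discussed above can be expressed compactly with Python's standard ipaddress module. The sketch below is our own illustration, not code from any browser or extension: it allows a host name to re-pin only within the same class C (/24) network and never across the boundary between RFC 1918 private space and public space.

    import ipaddress

    def may_repin(old_ip: str, new_ip: str) -> bool:
        old = ipaddress.ip_address(old_ip)
        new = ipaddress.ip_address(new_ip)
        # Never re-pin across the RFC 1918 private/public boundary.
        if old.is_private != new.is_private:
            return False
        # Otherwise allow re-pinning only within the same class C (/24) network.
        return new in ipaddress.ip_network(f"{old_ip}/24", strict=False)

    print(may_repin("171.64.78.10", "171.64.78.23"))  # True: same /24
    print(may_repin("171.64.78.10", "171.64.79.23"))  # False: different /24
    print(may_repin("171.64.78.10", "10.0.0.5"))      # False: public name must not rebind into private space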
• 33. Even the widest possible pinning heuristic prevents some legitimate rebinding of DNS names. For example, public host names controlled by an organization often have two IP addresses, a private IP address used by clients within the firewall and a public IP address used by clients on the Internet. Pinning prevents employees from properly connecting to these servers after joining the organization's Virtual Private Network (VPN) as those host names appear to rebind from public to private IP addresses. Policy-based Pinning. Instead of using unpinning heuristics, we propose browsers consult server-supplied policies to determine when it is safe to re-pin a host name from one IP address to another, providing robustness without degrading security. To re-pin safely, the browser must obtain a policy from both the old and new IP address (because some attacks first bind to the attacker whereas others first bind to the target). Servers can supply this policy at a well-known location, such as /crossdomain.xml, or in reverse DNS (see Section 5.4). Pinning Pitfalls. Correctly implementing pinning has several subtleties that are critical to its ability to defend against DNS rebinding attacks. • Common Pin Database. To eliminate multi-pin attacks, pinning-based defenses require that all browser technologies that access the network share a common pin database. Many plug-ins, including Flash Player and Silverlight, already use the browser's pins when issuing HTTP requests because they issue these requests through the browser. To share DNS pins for other kinds of network access, either the browser could expose an interface to its pin database or the operating system could pin in its DNS resolver. Unfortunately, browser vendors appear reluctant to expose such an
• 34. interface [12, 33] and pinning in the operating system either changes the semantics of DNS for other applications or requires that the OS treat browsers and their plug-ins differently from other applications. • Cache. The browser's cache and all plug-in caches must be modified to prevent rebinding attacks. Currently, objects stored in the cache are retrieved by URL, irrespective of the originating IP address, creating a rebinding vulnerability: a cached script from the attacker might run later when attacker.com is bound to the target. To prevent this attack, objects in the cache must be retrieved by both URL and originating IP address. This degrades performance when the browser pins to a new IP address, which might occur when the host at the first IP address fails, the user starts a new browsing session, or the user's network connectivity changes. These events are uncommon and are unlikely to impact performance significantly. • document.domain. Even with the strictest pinning, a server is vulnerable to rebinding attacks if it hosts a web page that executes the following, seemingly innocuous, JavaScript:
document.domain = document.domain;
After a page sets its domain property, the browser allows cross-origin interactions with other pages that have set their domain property to the same value [42, 17]. This idiom, used by a number of JavaScript libraries (see footnote 6 below), sets the domain property to a value under the control of the attacker: the current host name. 5.4 Fixing Browsers (Default-Allow Sockets)
• 35. Instead of trying to prevent a host name from rebinding from one IP address to another (a fairly common event), a different approach to defending against rebinding is to prevent the attacker from naming the target server, essentially generalizing dnswall to the Internet. Without the ability to name the target server, the attacker cannot mount a DNS rebinding attack against the target. This approach defends against rebinding, can allow socket access by default, and preserves the robustness of dynamic DNS. Host Name Authorization. On the Internet, clients require additional information to determine the set of valid host names for a given IP address. We propose that servers advertise the set of host names they consider valid for themselves and clients check these advertisements before binding a host name to an IP address, making explicit which host names can map to which IP addresses. Host name authorization prevents rebinding attacks because honest machines will not advertise host names controlled by attackers. Reverse DNS already provides a mapping from IP addresses to host names. The owner of an IP address ip is delegated naming authority for ip.in-addr.arpa and typically stores a PTR record containing the host name associated with that IP address. These records are insufficient for host name authorization because a single IP address can have many valid host names, and existing PTR records do not indicate that other host names are invalid. [Footnote 6: For example, the "Dojo" AJAX library, the Struts servlet/JSP-based web application framework, the jsMath AJAX Mathematics library, and Sun's "Ultimate client-side JavaScript client sniff" library are vulnerable in this way.]
• 36. The reverse DNS system can be extended to authorize host names without sacrificing backwards compatibility. To authorize the host www.example.com for 171.64.78.146, the owner of the IP address inserts the following DNS records:
auth.146.78.64.171.in-addr.arpa. IN A 171.64.78.146
www.example.com.auth.146.78.64.171.in-addr.arpa. IN A 171.64.78.146
To make a policy-enabled resolution for www.example.com, first resolve the host name to a set of IP addresses normally and then validate each IP address as follows:
1. Resolve the host name auth.ip.in-addr.arpa.
2. If the host name exists, ip is policy-enabled and accepts only authorized host names. Otherwise, ip is not policy-enabled and accepts any host name.
3. Finally, if ip is policy-enabled, resolve the host name www.example.com.auth.ip.in-addr.arpa to determine if the host name is authorized.
An IP address ip implicitly authorizes every host name of the form *.auth.ip.in-addr.arpa, preventing incorrect recursive policy checks. For host names with multiple IP addresses, only authorized IP addresses should be included in the result. If no IP addresses are authorized, the result should be "not found." If an IP address is not policy-enabled, DNS rebinding attacks can be mitigated using the techniques in Section 5.3.
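The records and the three-step validation above can be sketched as follows. The resolve parameter stands in for an ordinary A-record lookup that returns a (possibly empty) list of IP addresses; it is a placeholder rather than a real DNS API, and the toy record set mirrors the www.example.com example from the text.

    def reverse_labels(ip: str) -> str:
        # "171.64.78.146" -> "146.78.64.171.in-addr.arpa"
        return ".".join(reversed(ip.split("."))) + ".in-addr.arpa"

    def authorized_ips(host: str, resolve) -> list:
        """Return the host's addresses that pass host name authorization."""
        accepted = []
        for ip in resolve(host):                   # ordinary resolution to a set of IPs
            auth_zone = "auth." + reverse_labels(ip)
            if not resolve(auth_zone):             # steps 1-2: no auth record, so this
                accepted.append(ip)                # address is not policy-enabled
            elif resolve(host + "." + auth_zone):  # step 3: is this host name authorized?
                accepted.append(ip)
        return accepted                            # an empty list means "not found"

    # Toy record set (hypothetical data, not live DNS).
    records = {
        "www.example.com": ["171.64.78.146"],
        "attacker.com": ["171.64.78.146"],  # the attacker names the target's address
        "auth.146.78.64.171.in-addr.arpa": ["171.64.78.146"],
        "www.example.com.auth.146.78.64.171.in-addr.arpa": ["171.64.78.146"],
    }
    lookup = lambda name: records.get(name, [])
    print(authorized_ips("www.example.com", lookup))  # ['171.64.78.146']
    print(authorized_ips("attacker.com", lookup))     # []: the rebinding attempt is rejected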
• 37. The policy check can be implemented in DNS resolvers (see footnote 7 below), such as ones run by organizations and ISPs, transparently protecting large groups of machines from having their IP addresses hijacked. User agents, such as browsers and plug-ins, can easily query the policy records because they are stored in A records and can issue policy checks in parallel with HTTP requests (provided they do not process the HTTP response before the host name is authorized). Standard DNS caching reduces much of the overhead of redundant policy checks issued by DNS resolvers, browsers, and plug-ins. As a further optimization, policy-enabled resolvers can include policy records in the "additional" section of the DNS response, allowing downstream resolvers to cache complete policies and user agents to get policy records without a separate request. We have implemented host name authorization as a 72-line patch to Firefox 2. One disadvantage of this mechanism is that the owner of an IP address, the ISP, might not be the owner of the machine at that IP address. The machine can advertise the correct set of authorized host names only if the ISP is willing to delegate the auth subdomain to the owner or insert appropriate DNS records. Instead, machines could advertise authorized host names over HTTP in a well-known location, similar to crossdomain.xml, but this has several disadvantages: it requires policy-enabled DNS resolvers to implement HTTP clients, it requires all machines, such as SMTP gateways, to run an HTTP server, and policy queries are not cached, resulting in extra traffic comparable to favicon.ico. [Footnote 7: To prevent a subtle attack that involves poisoning DNS caches, a policy-enabled DNS resolver must follow the same procedure for CNAME queries as for A queries, even though responses to the former do not directly include IP addresses.]
• 38. Trusted Policy Providers. Clients and DNS resolvers can also check policy by querying a trusted policy provider. Much like spam blacklists [39] and phishing filters [6, 31, 16], different policy providers can use different heuristics to determine whether a host name is valid for an IP address, but every provider should respect host names authorized in reverse DNS. When correctly configured, host name authorization in reverse DNS has no false negatives (no valid host name is rejected) but many false positives (lack of policy is implicit authorization). Trusted policy providers can greatly reduce the false positive rate, possibly at the cost of increasing the false negative rate. Clients are free to select as aggressive a policy provider as they desire. 6. RELATED WORK Using Browsers as Bots. The technique of luring web users to an attacker's site and then distracting them while their browsers participate in a coordinated attack is described in [24]. These "puppetnets" can be used for distributed denial of service but cannot be used to mount the attacks described in Section 4 because puppetnets cannot read back responses from different origins or connect to forbidden ports such as 25. JavaScript can also be misused to scan behind firewalls [18] and reconfigure home routers [40]. These techniques often rely on exploiting default passwords and on underlying cross-site scripting or cross-site request forgery vulnerabilities. DNS rebinding attacks can be used to exploit default passwords without the need for a cross-site scripting or cross-site request forgery hole. Sender Policy Framework. To fight spam e-mail, the Sender Policy Framework (SPF) [46] stores policy information in DNS. SPF policies are stored as TXT records in
• 39. forward DNS, where host names can advertise the set of IP addresses authorized to send e-mail on their behalf. 7. CONCLUSIONS An attacker can exploit DNS rebinding vulnerabilities to circumvent firewalls and hijack IP addresses. Basic DNS rebinding attacks have been known for over a decade, but the classic defense, pinning, reduces robustness and fails to protect current browsers that use plug-ins. Modern multi-pin attacks defeat pinning in hundreds of milliseconds, granting the attacker direct socket access from the client's machine. These attacks are a highly cost-effective technique for hijacking hundreds of thousands of IP addresses for sending spam e-mail and committing click fraud. For network administrators, we provide a tool to prevent DNS rebinding from being used for firewall circumvention by blocking external DNS names from resolving to internal IP addresses. For the vendors of Flash Player, Java, and LiveConnect, we suggest simple patches that mitigate large-scale exploitation by vastly reducing the cost-effectiveness of the attacks for sending spam e-mail and committing click fraud. Finally, we propose two defense options that prevent both firewall circumvention and IP hijacking: policy-based pinning and host name authorization. We hope that vendors and network administrators will deploy these defenses quickly before attackers exploit DNS rebinding on a large scale. Acknowledgments We thank Drew Dean, Darin Fisher, Jeremiah Grossman, Martin Johns, Dan Kaminsky, Chris Karlof, Jim Roskind, and Dan Wallach for their helpful suggestions and feedback.
• 40. This work is supported by grants from the National Science Foundation and the US Department of Homeland Security. 8. REFERENCES [1] Adobe. Flash Player Penetration. http://www.adobe.com/products/player_census/flashplayer/. [2] Adobe. Adobe Flash Player 9 Security. http://www.adobe.com/devnet/flashplayer/articles/flash_player_9_security.pdf, July 2006. [3] Alexa. Top sites. http://www.alexa.com/site/ds/top_sites?ts_mode=global. [4] K. Anvil. Anti-DNS pinning + socket in flash. http://www.jumperz.net/, 2007. [5] W. Cheswick and S. Bellovin. A DNS filter and switch for packet-filtering gateways. In Proc. Usenix, 1996. [6] N. Chou, R. Ledesma, Y. Teraguchi, and J. Mitchell. Client-side defense against web-based identity theft. In Proc. NDSS, 2004. [7] N. Daswani, M. Stoppelman, et al. The anatomy of Clickbot.A. In Proc. HotBots, 2007. [8] D. Dean, E. W. Felten, and D. S. Wallach. Java security: from HotJava to Netscape and beyond. In IEEE Symposium on Security and Privacy, Oakland, California, May 1996. [9] D. Edwards. Your MOMA knows best, December 2005. http://xooglers.blogspot.com/2005/12/your-moma-knows-best.html.
• 41. [10] K. Fenzi and D. Wreski. Linux security HOWTO, January 2004. [11] R. Fielding et al. Hypertext Transfer Protocol - HTTP/1.1. RFC 2616, June 1999. [12] D. Fisher, 2007. Personal communication. [13] D. Fisher et al. Problems with new DNS cache ("pinning" forever). https://bugzilla.mozilla.org/show_bug.cgi?id=162871. [14] D. Goodin. Calif. man pleads guilty to felony hacking. Associated Press, January 2005. [15] Google. dnswall. http://code.google.com/p/google-dnswall/. [16] Google. Google Safe Browsing for Firefox, 2005. http://www.google.com/tools/firefox/safebrowsing/. [17] S. Grimm et al. Setting document.domain doesn't match an implicit parent domain. https://bugzilla.mozilla.org/show_bug.cgi?id=183143. [18] J. Grossman and T. Niedzialkowski. Hacking intranet websites from the outside: JavaScript malware just got a lot more dangerous. In Blackhat USA, August 2006. Invited talk. [19] I. Hickson et al. HTML 5 Working Draft. http://www.whatwg.org/specs/web-apps/current-work/. [20] C. Jackson, A. Bortz, D. Boneh, and J. Mitchell. Protecting browser state from web privacy attacks. In
• 42. Proc. WWW, 2006. [21] M. Johns. (somewhat) breaking the same-origin policy by undermining DNS pinning, August 2006. http://shampoo.antville.org/stories/1451301/. [22] M. Johns and J. Winter. Protecting the Intranet against "JavaScript Malware" and related attacks. In Proc. DIMVA, July 2007. [23] C. K. Karlof, U. Shankar, D. Tygar, and D. Wagner. Dynamic pharming attacks and the locked same-origin policies for web browsers. In Proc. CCS, October 2007. [24] V. T. Lam, S. Antonatos, P. Akritidis, and K. G. Anagnostakis. Puppetnets: Misusing web browsers as a distributed attack infrastructure. In Proc. CCS, 2006. [25] G. Maone. DNS Spoofing/Pinning. http://sla.ckers.org/forum/read.php?6,4511,14500. [26] G. Maone. NoScript. http://noscript.net/. [27] C. Masone, K. Baek, and S. Smith. WSKE: web server key enabled cookies. In Proc. USEC, 2007. [28] A. Megacz. XWT Foundation Security Advisory. http://xwt.org/research/papers/sop.txt. [29] A. Megacz and D. Meketa. X-RequestOrigin. http://www.xwt.org/x-requestorigin.txt. [30] Microsoft. Microsoft Web Enterprise Portal, January 2004. http://www.microsoft.com/technet/itshowcase/content/MSWebTWP.mspx.
• 43. [31] Microsoft. Microsoft phishing filter: A new approach to building trust in e-commerce content, 2005. [32] P. Mockapetris. Domain Names - Implementation and Specification. IETF RFC 1035, November 1987. [33] C. Nuuja (Adobe), 2007. Personal communication. [34] G. Ollmann. The pharming guide. http://www.ngssoftware.com/papers/ThePharmingGuide.pdf, August 2005. [35] Y. Rekhter, B. Moskowitz, D. Karrenberg, G. J. de Groot, and E. Lear. Address Allocation for Private Internets. IETF RFC 1918, February 1996. [36] J. Roskind. Attacks against the Netscape browser. In RSA Conference, April 2001. Invited talk. [37] D. Ross. Notes on DNS pinning. http://blogs.msdn.com/dross/archive/2007/07/09/notes-on-dns-pinning.aspx, 2007. [38] J. Ruderman. JavaScript Security: Same Origin. http://www.mozilla.org/projects/security/components/same-origin.html. [39] Spamhaus. The Spamhaus Block List, 2007. http://www.spamhaus.org/sbl/. [40] S. Stamm, Z. Ramzan, and M. Jakobsson. Drive-by pharming. Technical Report 641, Computer Science, Indiana University, December 2006. [41] J. Topf. HTML Form Protocol Attack, August 2001.
• 44. http://www.remote.org/jochen/sec/hfpa/hfpa.pdf. [42] D. Veditz et al. document.domain abused to access hosts behind firewall. https://bugzilla.mozilla.org/show_bug.cgi?id=154930. [43] W3C. The XMLHttpRequest Object, February 2007. http://www.w3.org/TR/XMLHttpRequest/. [44] B. Warner. Home PCs rented out in sabotage-for-hire racket. Reuters, July 2004. [45] J. Winter and M. Johns. LocalRodeo: Client-side protection against JavaScript Malware. http://databasement.net/labs/localrodeo/, 2007. [46] M. Wong and W. Schlitt. Sender Policy Framework (SPF) for Authorizing Use of Domains in E-Mail. IETF RFC 4408, April 2006.
Enhancing Byte-Level Network Intrusion Detection Signatures with Context
Robin Sommer, TU München, Germany, [email protected]
Vern Paxson, International Computer Science Institute and Lawrence Berkeley National Laboratory
  • 45. Berkeley, CA, USA [email protected] ABSTRACT Many network intrusion detection systems (NIDS) use byte sequen- ces as signatures to detect malicious activity. While being highly efficient, they tend to suffer from a high false-positive rate. We develop the concept of contextual signatures as an improvement of string-based signature-matching. Rather than matching fixed strings in isolation, we augment the matching process with additional con- text. When designing an efficient signature engine for the NIDS Bro, we provide low-level context by using regular expressions for matching, and high-level context by taking advantage of the se- mantic information made available by Bro’s protocol analysis and scripting language. Therewith, we greatly enhance the signature’s expressiveness and hence the ability to reduce false positives. We present several examples such as matching requests with replies, using knowledge of the environment, defining dependencies be- tween signatures to model step-wise attacks, and recognizing ex- ploit scans. To leverage existing efforts, we convert the comprehensive sig- nature set of the popular freeware NIDS Snort into Bro’s language. While this does not provide us with improved signatures by
  • 46. itself, we reap an established base to build upon. Consequently, we evalu- ate our work by comparing to Snort, discussing in the process sev- eral general problems of comparing different NIDSs. Categories and Subject Descriptors: C.2.0 [Computer-Communi- cation Networks]: General - Security and protection. General Terms: Performance, Security. Keywords: Bro, Network Intrusion Detection, Pattern Matching, Security, Signatures, Snort, Evaluation 1. INTRODUCTION Several different approaches are employed in attempting to detect computer attacks. Anomaly-based systems derive (usually in an au- tomated fashion) a notion of “normal” system behavior, and report divergences from this profile, an approach premised on the notion that attacks tend to look different in some fashion from legitimate computer use. Misuse detection systems look for particular, explicit indications of attacks (Host-based IDSs inspect audit logs for this while network-based IDSs, or NIDSs, inspect the network traffic). Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that
  • 47. copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. CCS’03,�October�27–31,�2003,�Washington,�DC,�USA. Copyright 2003 ACM 1-58113-738-9/03/0010 ...$5.00. In this paper, we concentrate on one popular form of misuse de- tection, network-based signature matching in which the system in- spects network traffic for matches against exact, precisely- described patterns. While NIDSs use different abstractions for defining such patterns, most of the time the term signature refers to raw byte se- quences. Typically, a site deploys a NIDS where it can see network traffic between the trusted hosts it protects and the untrusted exterior world, and the signature-matching NIDS inspects the passing pack- ets for these sequences. It generates an alert as soon as it encounters one. Most commercial NIDSs follow this approach [19], and also the most well-known freeware NIDS, Snort [29]. As an example, to detect the buffer overflow described in CAN-2002-0392 [9], Snort’s signature #1808 looks for the byte pattern 0xC0505289- E150515250B83B000000CD80 [2] in Web requests. Keeping in mind that there are more general forms of signatures used in
  • 48. in- trusion detection as well—some of which we briefly discuss in §2— in this paper we adopt this common use of the term signature. Signature-matching in this sense has several appealing proper- ties. First, the underlying conceptual notion is simple: it is easy to explain what the matcher is looking for and why, and what sort of total coverage it provides. Second, because of this simplicity, signatures can be easy to share, and to accumulate into large “at- tack libraries.” Third, for some signatures, the matching can be quite tight: a match indicates with high confidence that an attack occurred. On the other hand, signature-matching also has significant lim- itations. In general, especially when using tight signatures, the matcher has no capability to detect attacks other than those for which it has explicit signatures; the matcher will in general com- pletely miss novel attacks, which, unfortunately, continue to be de- veloped at a brisk pace. In addition, often signatures are not in fact “tight.” For example, the Snort signature #1042 to detect an exploit of CVE-2000-0778 [9] searches for “Translate: F” in Web requests; but it turns out that this header is regularly used by certain applications. Loose signatures immediately raise the major problem of false positives: alerts that in fact do not reflect an actual attack. A second form of false positive, which signature matchers
• 49. likewise often fail to address, is that of failed attacks. Since at many sites attacks occur at nearly-continuous rates, failed attacks are often of little interest. At a minimum, it is important to distinguish between them and successful attacks. A key point here is that the problem of false positives can potentially be greatly reduced if the matcher has additional context at its disposal: either additional particulars regarding the exact activity and its semantics, in order to weed out false positives due to overly general "loose" signatures; or the additional information of how the attacked system responded to the attack, which often indicates whether the attack succeeded. In this paper, we develop the concept of contextual signatures, in which the traditional form of string-based signature matching is augmented by incorporating additional context on different levels when evaluating the signatures. First of all, we design and implement an efficient pattern matcher similar in spirit to traditional signature engines used in other NIDS. But already on this low-level
  • 50. we enable the use of additional context by (i) providing full regu- lar expressions instead of fixed strings, and (ii) giving the signature engine a notion of full connection state, which allows it to corre- late multiple interdependent matches in both directions of a user session. Then, if the signature engine reports the match of a sig- nature, we use this event as the start of a decision process, instead of an alert by itself as is done by most signature-matching NIDSs. Again, we use additional context to judge whether something alert- worthy has indeed occurred. This time the context is located on a higher-level, containing our knowledge about the network that we have either explicitly defined or already learned during operation. In §3.5, we will show several examples to demonstrate how the concept of contextual signatures can help to eliminate most of the limitations of traditional signatures discussed above. We will see that regular expressions, interdependent signatures, and knowledge about the particular environment have significant potential to reduce the false positive rate and to identify failed attack attempts. For example, we can consider the server’s response to an attack and the set of software it is actually running—its vulnerability profile— to decide whether an attack has succeeded. In addition, treating signature matches as events rather than alerts enables us to
• 51. analyze them on a meta-level as well, which we demonstrate by identifying exploit scans (scanning multiple hosts for a known vulnerability). Instrumenting signatures to consider additional context has to be performed manually. For each signature, we need to determine what context might actually help to increase its performance. While this is tedious for large sets of already-existing signatures, it is not an extra problem when developing new ones, as such signatures have to be similarly adjusted to the specifics of particular attacks anyway. Contextual signatures serve as a building block for increasing the expressiveness of signatures, not as a stand-alone solution. We implemented the concept of contextual signatures in the framework already provided by the freeware NIDS Bro [25]. In contrast to most NIDSs, Bro is fundamentally neither an anomaly-based system nor a signature-based system. It is instead partitioned into a protocol analysis component and a policy script component. The former feeds the latter via generating a stream of events that reflect different types of activity detected by the protocol analysis; consequently, the analyzer is also referred to as the event engine. For example, when the analyzer sees the establishment of
  • 52. a TCP connection, it generates a connection established event; when it sees an HTTP request it generates http request and for the corresponding reply http reply; and when the event engine’s heuristics determine that a user has successfully authenti- cated during a Telnet or Rlogin session, it generates login suc- cess (likewise, each failed attempt results in a login failure event). Bro’s event engine is policy-neutral: it does not consider any particular events as reflecting trouble. It simply makes the events available to the policy script interpreter. The interpreter then ex- ecutes scripts written in Bro’s custom scripting language in order to define the response to the stream of events. Because the lan- guage includes rich data types, persistent state, and access to timers and external programs, the response can incorporate a great deal of context in addition to the event itself. The script’s reaction to a par- ticular event can range from updating arbitrary state (for example, tracking types of activity by address or address pair, or grouping re- lated connections into higher-level “sessions”) to generating alerts (e.g., via syslog) or invoking programs for a reactive response. More generally, a Bro policy script can implement signature- style matching—for example, inspecting the URIs in Web requests, the
  • 53. MIME-encoded contents of email (which the event engine will first unpack), the user names and keystrokes in login sessions, or the filenames in FTP sessions—but at a higher semantic level than as just individual packets or generic TCP byte streams. Bro’s layered approach is very powerful as it allows a wide range of different applications. But it has a significant shortcoming: while, as discussed above, the policy script is capable of perform- ing traditional signature-matching, doing so can be cumbersome for large sets of signatures, because each signature has to be coded as part of a script function. This is in contrast to the concise, low- level languages used by most traditional signature-based systems. In ad- dition, if the signatures are matched sequentially, then the overhead of the matching can become prohibitive. Finally, a great deal of community effort is already expended on developing and dissemi- nating packet-based and byte-stream-based signatures. For exam- ple, the 1.9.0 release of Snort comes with a library of 1,715 signa- tures [2]. It would be a major advantage if we could leverage these efforts by incorporating such libraries. Therefore, one motivation for this work is to combine Bro’s flexi- bility with the capabilities of other NIDSs by implementing a
  • 54. signa- ture engine. But in contrast to traditional systems, which use their signature matcher more or less on its own, we tightly integrate it into Bro’s architecture in order to provide contextual signatures. As discussed above, there are two main levels on which we use addi- tional context for signature matching. First, at a detailed level, we extend the expressiveness of signatures. Although byte-level pattern matching is a central part of NIDSs, most only allow signatures to be expressed in terms of fixed strings. Bro, on the other hand, al- ready provides regular expressions for use in policy scripts, and we use them for signatures as well. The expressiveness of such patterns provides us with an immediate way to express syntactic context. For example, with regular expressions it is easy to express the no- tion “string XYZ but only if preceded at some point earlier by string ABC”. An important point to keep in mind regarding regular expres- sion matching is that, once we have fully constructed the matcher, which is expressed as a Deterministic Finite Automaton (DFA), the matching can be done in O(n) time for n characters in the input, and also Ω(n) time. (That is, the matching always takes time linear in the size of the input, regardless of the specifics of the input.)
  • 55. The “parallel Boyer-Moore” approaches that have been explored in the literature for fast matching of multiple fixed strings for Snort [12, 8] have a wide range of running times—potentially sublinear in n, but also potentially superlinear in n. So, depending on the particulars of the strings we want to match and the input against which we do the matching, regular expressions might prove fundamentally more efficient, or might not; we need empirical evaluations to determine the relative performance in practice. In addition, the construction of a regular expression matcher requires time potentially exponential in the length of the expression, clearly prohibitive, a point to which we return in §3.1. Second, on a higher level, we use Bro’s rich contextual state to implement our improvements to plain matching described above. Making use of Bro’s architecture, our engine sends events to the policy layer. There, the policy script can use all of Bro’s already existing mechanisms to decide how to react. We show several such examples in §3.5. Due to Snort’s large user base, it enjoys a comprehensive and up-to-date set of signatures. Therefore, although for flexibility we
  • 56. have designed a custom signature language for Bro, we make use 263 of the Snort libraries via a conversion program. This program takes an unmodified Snort configuration and creates a corresponding Bro signature set. Of course, by just using the same signatures in Bro as in Snort, we are not able to improve the resulting alerts in terms of quality. But even if we do not accompany them with additional context, they immediately give us a baseline of already widely- deployed signatures. Consequently, Snort serves us as a reference. Throughout the paper we compare with Snort both in terms of qual- ity and performance. But while doing so, we encountered several general problems for evaluating and comparing NIDSs. We be- lieve these arise independently of our work with Bro and Snort, and therefore describe them in some detail. Keeping these limitations in mind, we then evaluate the performance of our signature engine and find that it performs well. §2 briefly summarizes related work. In §3 we present the main design ideas behind implementing contextual signatures: regular expressions, integration into Bro’s architecture, some difficulties
  • 57. with using Snort signatures, and examples of the power of the Bro signature language. In §4 we discuss general problems of evaluating NIDSs, and then compare Bro’s signature matching with Snort’s. §5 summarizes our conclusions. 2. RELATED WORK [4] gives an introduction to intrusion detection in general, defin- ing basic concepts and terminology. In the context of signature-based network intrusion detection, previous work has focussed on efficiently matching hundreds of fixed strings in parallel: [12] and [8] both present implementations of set-wise pattern matching for Snort [29]. For Bro’s signature en- gine, we make use of regular expressions [18]. They give us both flexibility and efficiency. [17] presents a method to incrementally build the underlying DFA, which we can use to avoid the potentially enormous memory and computation required to generate the com- plete DFA for thousands of signatures. An extended form of regular expressions has been used in intrusion detection for defining se- quences of events [30], but to our knowledge no NIDS uses them for actually matching multiple byte patterns against the payload of packets.
  • 58. In this paper, we concentrate on signature-based NIDS. Snort is one of the most-widely deployed systems and relies heavily on its signature set. Also, most of the commercial NIDSs are signature- based [19], although there are systems that use more powerful con- cepts to express signatures than just specifying byte patterns. NFR [28], for example, uses a flexible language called N-Code to declare its signatures. In this sense, Bro already provides sophisti- cated signatures by means of its policy language. But the goal of our work is to combine the advantages of a traditional dedicated pattern matcher with the power of an additional layer abstracting from the raw network traffic. IDS like STAT [35] or Emerald [26] are more general in scope than purely network-based systems. They con- tain misuse-detection components as well, but their signatures are defined at a higher level. The STAT framework abstracts from low- level details by using transitions on a set of states as signatures. A component called NetSTAT [36] defines such state transitions based on observed network-traffic. Emerald, on the other hand, utilizes P-BEST [20], a production-based expert system to define attacks based on a set of facts and rules. Due to their general scope, both
  • 59. systems use a great deal of context to detect intrusions. On the other hand, our aim is to complement the most common form of signa- ture matching—low-level string matching—with context, while still keeping its efficiency. The huge number of generated alerts is one of the most impor- tant problems of NIDS (see, for example, [23]). [3] discusses some statistical limits, arguing in particular that the false-alarm rate is the limiting factor for the performance of an IDS. Most string-based NIDSs use their own signature language, and are therefore incompatible. But since most languages cover a com- mon subset, it is generally possible to convert the signatures of one system into the syntax of another. ArachNIDS [1], for example, generates signatures dynamically for different systems based on a common database, and [32] presents a conversion of Snort signa- tures into STAT’s language, although it does not compare the two systems in terms of performance. We take a similar approach, and convert Snort’s set into Bro’s new signature language. For evaluation of the new signature engine, we take Snort as a reference. But while comparing Bro and Snort, we have encoun- tered several difficulties which we discuss in §4. They are part of
  • 60. the general question of how to evaluate NIDSs. One of the most comprehensive evaluations is presented in [21, 22], while [24] of- fers a critique of the methodology used in these studies. [14] further extends the evaluation method by providing a user-friendly environ- ment on the one hand, and new characterizations of attack traffic on the other hand. More recently, [10] evaluates several commer- cial systems, emphasizing the view of an analyst who receives the alerts, finding that these systems ignore relevant information about the context of the alerts. [15] discusses developing a benchmark for NIDSs, measuring their capacity with a representative traffic mix. (Note, in §4.2 we discuss our experiences with the difficulty of find- ing “representative” traces.) 3. CONTEXTUAL SIGNATURES The heart of Bro’s contextual signatures is a signature engine de- signed with three main goals in mind: (i) expressive power, (ii) the ability to improve alert quality by utilizing Bro’s contextual state, and (iii) enabling the reuse of existing signature sets. We discuss each in turn. Afterwards, we present our experiences with Snort’s signature set, and finally show examples which demonstrate
  • 61. appli- cations for the described concepts. 3.1 Regular Expressions A traditional signature usually contains a sequence of bytes that are representative of a specific attack. If this sequence is found in the payload of a packet, this is an indicator of a possible at- tack. Therefore, the matcher is a central part of any signature- based NIDS. While many NIDSs only allow fixed strings as search pat- terns, we argue for the utility of using regular expressions. Regular expressions provide several significant advantages: first, they are far more flexible than fixed strings. Their expressiveness has made them a well-known tool in many applications, and their power arises in part from providing additional syntactic context with which to sharpen textual searches. In particular, character classes, union, optional elements, and closures prove very useful for speci- fying attack signatures, as we see in §3.5.1. Surprisingly, given their power, regular expressions can be matched very efficiently. This is done by compiling the expres- sions into DFAs whose terminating states indicate whether a match is found. A sequence of n bytes can therefore be matched with O(n) operations, and each operation is simply an array lookup— highly efficient. The total number of patterns contained in the signature set of
• 62. a NIDS can be quite large. Snort's set, for example, contains 1,715 distinct signatures, of which 1,273 are enabled by default. Matching these individually is very expensive. However, for fixed strings, there are algorithms for matching sets of strings simultaneously. Consequently, while Snort's default engine still works iteratively, there has been recent work to replace it with a "set-wise" matcher [8, 12].1 On the other hand, regular expressions give us set-wise matching for free: by using the union operator on the individual patterns, we get a new regular expression which effectively combines all of them. The result is a single DFA that again needs O(n) operations to match against an n byte sequence. Only slight modifications have been necessary to extend the interface of Bro's already-existing regular expression matcher to explicitly allow grouping of expressions. Given the expressiveness and efficiency of regular expressions, there is still a reason why a NIDS might avoid using them: the underlying DFA can grow very large. Fully compiling a regular expression into a DFA leads potentially to an exponential number
  • 63. of DFA states, depending on the particulars of the patterns [18]. Con- sidering the very complex regular expression built by combining all individual patterns, this straight-forward approach could easily be intractable. Our experience with building DFAs for regular ex- pressions matching many hundreds of signatures shows that this is indeed the case. However, it turns out that in practice it is possible to avoid the state/time explosion, as follows. Instead of pre-computing the DFA, we build the DFA “on-the- fly” during the actual matching [17]. Each time the DFA needs to transit into a state that is not already constructed, we compute the new state and record it for future reuse. This way, we only store DFA states that are actually needed. An important observation is that for n new input characters, we will build at most n new states. Furthermore, we find in practice (§4.3) that for normal traffic the growth is much less than linear. However, there is still a concern that given inauspicious traffic— which may actually be artificially crafted by an attacker—the state construction may eventually consume more memory than we have available. Therefore, we also implemented a memory-bounded DFA
  • 64. state cache. Configured with a maximum number of DFA states, it expires old states on a least-recently-used basis. In the sequel, when we mention “Bro with a limited state cache,” we are referring to such a bounded set of states (which is a configuration option for our version of Bro), using the default bound of 10,000 states. Another important point is that it’s not necessary to combine all patterns contained in the signature set into a single regular expres- sion. Most signatures contain additional constraints like IP address ranges or port numbers that restrict their applicability to a subset of the whole traffic. Based on these constraints, we can build groups of signatures that match the same kind of traffic. By collecting only those patterns into a common regular expression for matching the group, we are able to reduce the size of the resulting DFA dras- tically. As we show in §4, this gives us a very powerful pattern matcher still efficient enough to cope with high-volume traffic. 3.2 Improving Alert Quality by Using Context Though pattern matching is a central part of any signature-based NIDSs, as we discussed above there is potentially great utility in incorporating more context in the system’s analysis prior to gener- ating an alert, to ensure that there is indeed something alert- worthy occurring. We can considerably increase the quality of alerts, while
• 65. simultaneously reducing their quantity, by utilizing knowledge about the current state of the network. Bro is an excellent tool for this as it already keeps a lot of easily accessible state. The new signature engine is designed to fit nicely into Bro's layered architecture as an adjunct to the protocol analysis event engine (see Figure 1). (Footnote 1: The code of [12] is already contained in the Snort distribution, but not compiled in by default. This is perhaps due to some subtle bugs, some of which we encountered during our testing as well.) [Figure 1: Integrating the signature engine (adapted from [25]). Components shown: packet capture and packet filter over the network, the packet stream and filtered packet stream, the event engine and the new signature engine with its signatures, the event stream feeding the policy script layer, real-time notification, and event/signature control flowing back down.]
• 66. We have implemented a custom language for defining signatures. It is mostly a superset of other, similar languages, and we describe it in more detail in §3.3. A new component placed within Bro's middle layer matches these signatures against the packet stream. Whenever it finds a match, it inserts a new event into the event stream. The policy layer can then decide how to react. Additionally, we can pass information from the policy layer back into the signature engine to control its operation. A signature can specify a script function to call whenever a particular signature matches. This function can then consult additional context and indicate whether the corresponding event should indeed be generated. We show an example of this later in §3.5.4. In general, Bro's analyzers follow the communication between two endpoints and extract protocol-specific information. For example, the HTTP analyzer is able to extract URIs requested by
  • 67. Web clients (which includes performing general preprocessing such as expanding hex escapes) and the status code and items sent back by servers in reply, whereas the FTP analyzer follows the applica- tion dialog, matching FTP commands and arguments (such as the names of accessed files) with their corresponding replies. Clearly, this protocol-specific analysis provides significantly more context than does a simple view of the total payload as an undifferentiated byte stream. The signature engine can take advantage of this additional in- formation by incorporating semantic-level signature matching. For example, the signatures can include the notion of matching against HTTP URIs; the URIs to be matched are provided by Bro’s HTTP analyzer. Having developed this mechanism for interfacing the sig- nature engine with the HTTP analyzer, it is now straight forward to extend it to other analyzers and semantic elements (indeed, we timed how long it took to add and debug interfaces for FTP and Finger, and the two totalled only 20 minutes). Central to Bro’s architecture is its connection management. Each network packet is associated with exactly one connection. This no-
• 68. tion of connections allows several powerful extensions to traditional signatures. First of all, Bro reassembles the payload stream of TCP connections. Therefore, we can perform all pattern matching on the actual stream (in contrast to individual packets). While Snort has a preprocessor for TCP session reassembling, it does so by combining several packets into a larger "virtual" packet. This packet is then passed on to the pattern matcher. Because the resulting analysis remains packet-based, it still suffers from discretization problems introduced by focusing on packets, such as missing byte sequences that cross packet boundaries. (See a related discussion in [25] of the problem of matching strings in TCP traffic in the face of possible intruder evasion [27].) In Bro, a signature match does not necessarily correspond to an alert; as with other events, that decision is left to the policy script. Hence, it makes sense to remember which signatures have matched for a particular connection so far. Given this information, it is then possible to specify dependencies between signatures like "signature
  • 69. A only matches if signature B has already matched,” or “if a host matches more than N signatures of type C, then generate an alert.” This way, we can for example describe multiple steps of an attack. In addition, Bro notes in which direction of a connection a particular signature has matched, which gives us the notion of request/reply signatures: we can associate a client request with the corresponding server reply. A typical use is to differentiate between successful and unsuccessful attacks. We show an example in §3.5.3. More generally, the policy script layer can associate arbitrary kinds of data with a connection or with one of its endpoints. This means that any information we can deduce from any of Bro’s other components can be used to improve the quality of alerts. We demon- strate the power of this approach in §3.5.2. Keeping per-connection state for signature matching naturally raises the question of state management: at some point in time we have to reclaim state from older connections to prevent the system from exhausting the available memory. But again we can leverage the work already being done by Bro. Independently of our signa- tures, it already performs a sophisticated connection-tracking
  • 70. using various timeouts to expire connections. By attaching the matching state to the already-existing per-connection state, we assure that the signature engine works economically even with large numbers of connections. 3.3 Signature Language Any signature-based NIDS needs a language for actually defining signatures. For Bro, we had to choose between using an already existing language and implementing a new one. We have decided to create a new language for two reasons. First, it gives us more flexibility. We can more easily integrate the new concepts described in §3.1 and §3.2. Second, for making use of existing signature sets, it is easier to write a converter in some high-level scripting language than to implement it within Bro itself. Snort’s signatures are comprehensive, free and frequently up- dated. Therefore, we are particularly interested in converting them into our signature language. We have written a corresponding Py- thon script that takes an arbitrary Snort configuration and outputs signatures in Bro’s syntax. Figure 2 shows an example of such a conversion. Figure 2: Example of signature conversion
• 71.
alert tcp any any -> [a.b.0.0/16,c.d.e.0/24] 80 ( msg:"WEB-ATTACKS conf/httpd.conf attempt"; nocase; sid:1373; flow:to_server,established; content:"conf/httpd.conf"; [...] )
(a) Snort

signature sid-1373 {
  ip-proto == tcp
  dst-ip == a.b.0.0/16,c.d.e.0/24
  dst-port == 80
  # The payload below is actually generated in a
  # case-insensitive format, which we omit here for clarity.
  payload /.*conf\/httpd\.conf/
  tcp-state established,originator
  event "WEB-ATTACKS conf/httpd.conf attempt"
}
(b) Bro

It turns out to be rather difficult to implement a complete parser for Snort's language. As far as we have been able to determine, its syntax and semantics are not fully documented, and in fact often only defined by the source code. In addition, due to different internals of Bro and Snort, it is sometimes not possible to keep the exact semantics of the signatures. We return to this point in §4.2. As the example in Figure 2 shows, our signatures are defined by means of an identifier and a set of attributes. There are two main
• 72. types of attributes: (i) conditions and (ii) actions. The conditions define when the signature matches, while the actions declare what to do in the case of a match. Conditions can be further divided into four types: header, content, dependency, and context. Header conditions limit the applicability of the signature to a subset of traffic that contains matching packet headers. For TCP, this match is performed only for the first packet of a connection. For other protocols, it is done on each individual packet. In general, header conditions are defined by using a tcpdump-like [33] syntax (for example, tcp[2:2] == 80 matches TCP traffic with destination port 80). While this is very flexible, for convenience there are also some short-cuts (e.g., dst-port == 80). Content conditions are defined by regular expressions. Again, we differentiate two kinds of conditions here: first, the expression may be declared with the payload statement, in which case it is matched against the raw packet payload (reassembled where applicable). Alternatively, it may be prefixed with an analyzer-specific label, in which case the expression is matched against the data as extracted by the corresponding analyzer. For example, the HTTP analyzer decodes requested URIs. So, http /.*(etc\/(passwd|shadow))/ matches any request containing either etc/passwd
  • 73. or etc/shadow. Signature conditions define dependencies between signatures. We have implemented requires-signature, which specifies another signature that has to match on the same connection first, and requires-reverse-signature, which additionally re- quires the match to happen for the other direction of the connection. Both conditions can be negated to match only if another signature does not match. Finally, context conditions allow us to pass the match decision on to various components of Bro. They are only evaluated if all other conditions have already matched. For example, we have im- plemented a tcp-state condition that poses restrictions on the current state of the TCP connection, and eval, which calls an ar- bitrary script policy function. If all conditions are met, the actions associated with a signature are executed: event inserts a signature match event into the event stream, with the value of the event including the signature identifier, corresponding connection, and other context. The policy layer can then analyze the signature match. 3.4 Snort’s Signature Set Snort comes with a large set of signatures, with 1,273 enabled by default [2]. Unfortunately, the default configuration turns out to generate a lot of false positives. In addition, many alerts belong to failed exploit attempts executed by attackers who scan networks for
• 74. vulnerable hosts. As noted above, these are general problems of signature-based systems. The process of selectively disabling signatures that are not applicable to the local environment, or "tuning," takes time, knowledge and experience. With respect to Snort, a particular problem is that many of its signatures are too general. For example, Snort's signature #1560:

alert tcp $EXTERNAL_NET any -> $HTTP_SERVERS $HTTP_PORTS (msg:"WEB-MISC /doc/ access"; uricontent:"/doc/"; flow:to_server,established; nocase; sid:1560; [...])

searches for the string /doc/ within URIs of HTTP requests. While this signature is indeed associated with a particular vulnerability (CVE-1999-0678 [9]), it only makes sense to use it if you have detailed knowledge about your site (for example, that there is no valid document whose path contains the string /doc/). Otherwise, the probability of a signature match reflecting a false alarm is much higher than that it indicates an attacker exploiting an old vulnerability. Another problem with Snort's default set is the presence of overlapping signatures for the same exploit. For example, signatures #1536, #1537, #1455, and #1456 (the latter is disabled by default) all search for CVE-2000-0432, but their patterns differ in the amount of detail. In addition, the vulnerability IDs given in Snort's signatures are not always correct. For example, signature #884 references CVE-1999-0172 and Bugtraq [6] ID #1187. But the latter corresponds to CVE-2000-0411. As already noted, we cannot expect to avoid these limitations of Snort's signatures by just using them semantically unmodified in Bro. For example, although we convert Snort's fixed strings into Bro's regular expressions, naturally they still represent fixed sets of characters. Only manual editing would give us the additional power of regular expressions. We give an example for such an improvement in §3.5.1. 3.5 The Power of Bro Signatures In this section, we show several examples to convey the power provided by our signatures. First, we demonstrate how to define more "tight" signatures by using regular expressions. Then, we show how to identify failed attack attempts by considering the set of
• 76. software a particular server is running (we call this its vulnerability profile and incorporate some ideas from [22] here) as well as the response of the server. We next demonstrate modelling an attack in multiple steps to avoid false positives, and finally show how to use alert-counting for identifying exploit scans. We note that none of the presented examples are supported by Snort without extending its core significantly (e.g. by writing new plug-ins). 3.5.1 Using Regular Expressions Regular expressions allow far more flexibility than fixed strings. Figure 3 (a) shows a Snort signature for CVE-1999-0172 that generates a large number of false positives at Saarland University's border router. (See §4.1 for a description of the university.) Figure 3 (b) shows a corresponding Bro signature that uses a regular expression to identify the exploit more reliably. CVE-1999-0172 describes a vulnerability of the formmail CGI script. If an attacker constructs a string of the form "...; <shell-cmds>" (a | instead of the ; works as well), and passes it on as argument of the recipient CGI parameter, vulnerable formmails will execute the included shell commands. Because CGI parameters can be given in arbitrary order, the Snort signature has to rely on identifying the formmail access on its own. But by using a regular expression, we can explicitly define that the recipient parameter has to contain a particular character.

Figure 3: Two signatures for CVE-1999-0172

alert tcp any any -> a.b.0.0/16 80 (msg:"WEB-CGI formmail access"; uricontent:"/formmail"; flow:to_server,established; nocase; sid:884; [...])
(a) Snort using a fixed string

signature formmail-cve-1999-0172 {
  ip-proto == tcp
  dst-ip == a.b.0.0/16
  dst-port == 80
  # Again, actually expressed in a case-insensitive manner.
  http /.*formmail.*\?.*recipient=[^&]*[;|]/
  event "formmail shell command"
}
(b) Bro using a regular expression

3.5.2 Vulnerability Profiles Most exploits are aimed at particular software, and usually only some versions of the software are actually vulnerable. Given the overwhelming number of alerts a signature-matching NIDS can generate, we may well take the view that the only attacks of interest are those that actually have a chance of succeeding. If, for example,
  • 78. an IIS exploit is tried on a Web server running Apache, one may not even care. [23] proposes to prioritize alerts based on this kind of vulnerability information. We call the set of software versions that a host is running its vulnerability profile. We have implemented this concept in Bro. By protocol analysis, it collects the profiles of hosts on the network, using version/implementation information that the analyzer observes. Signatures can then be restricted to certain ver- sions of particular software. As a proof of principle, we have implemented vulnerability pro- files for HTTP servers (which usually characterize themselves via the Server header), and for SSH clients and servers (which iden- tify their specific versions in the clear during the initial protocol handshake). We intend to extend the software identification to other protocols. We aim in future work to extend the notion of developing a pro- file beyond just using protocol analysis. We can passively finger- print hosts to determine their operating system version information by observing specific idiosyncrasies of the header fields in the traffic they generate, similar to the probing techniques described in [13], or we can separately or in addition employ active techniques to explic-
• 79. itly map the properties of the site's hosts and servers [31]. Finally, in addition to automated techniques, we can implement a configuration mechanism for manually entering vulnerability profiles. 3.5.3 Request/Reply Signatures Further pursuing the idea of avoiding alerts for failed attack attempts, we can define signatures that take into account both directions of a connection. Figure 4 shows an example. In operational use, we see a lot of attempts to exploit CVE-2001-0333 to execute the Windows command interpreter cmd.exe. For a failed attempt, the server typically answers with a 4xx HTTP reply code, indicating an error.2 (Footnote 2: There are other reply codes that reflect additional types of errors, too, which we omit for clarity.) To ignore these failed attempts, we first define one signature, http-error, that recognizes such replies. Then we define a second signature, cmdexe-success, that matches only if cmd.exe is contained in the requested URI (case-insensitive) and the server does not reply with an error. It's not possible to define this kind of signature in Snort, as it lacks the notion of associating both directions of a connection.

Figure 4: Request/reply signature

signature cmdexe-success {
  ip-proto == tcp
  dst-port == 80
  http /.*[cC][mM][dD]\.[eE][xX][eE]/
  event "WEB-IIS cmd.exe success"
  requires-signature-opposite ! http-error
  tcp-state established
}

signature http-error {
  ip-proto == tcp
  src-port == 80
  payload /.*HTTP\/1\.. *4[0-9][0-9]/
  event "HTTP error reply"
  tcp-state established
}

3.5.4 Attacks with Multiple Steps An example of an attack executed in two steps is the infection by the Apache/mod_ssl worm [7] (also known as Slapper), released in September 2002. The worm first probes a target for its potential vulnerability by sending a simple HTTP request and inspecting the response. It turns out that the request it sends is in fact in violation of the HTTP 1.1 standard [11] (because it does not include a Host header), and this idiosyncrasy provides a somewhat "tight" signature for detecting a Slapper probe.
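To illustrate how distinctive that malformed request is, the following minimal Python check looks for the probe's tell-tale byte sequence (an HTTP/1.1 request line immediately followed by an empty line, i.e. no Host header). This is only an illustration; the actual detection in Bro is done by the payload regular expression of the slapper-probe signature in Figure 5 below.

import re

# The probe is literally "GET / HTTP/1.1" followed by CRLF CRLF and nothing
# else, which violates HTTP/1.1's requirement of a Host header.
SLAPPER_PROBE = re.compile(rb"GET / HTTP/1\.1\r\n\r\n")

def looks_like_slapper_probe(payload: bytes) -> bool:
    return SLAPPER_PROBE.search(payload) is not None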
• 81. If the server identifies itself as Apache, the worm then tries to exploit an OpenSSL vulnerability on TCP port 443. Figure 5 shows two signatures that only report an alert if these steps are performed for a destination that runs a vulnerable OpenSSL version. The first signature, slapper-probe, checks the payload for the illegal request. If found, the script function is_vulnerable_to_slapper (omitted here due to limited space, see [2]) is called. Using the vulnerability profile described above, the function evaluates to true if the destination is known to run Apache as well as a vulnerable OpenSSL version.3 If so, the signature matches (depending on the configuration this may or may not generate an alert by itself). The header conditions of the second signature, slapper-exploit, match for any SSL connection into the specified network. For each, the signature calls the script function has_slapper_probed. This function generates a signature match if slapper-probe has already matched for the same source/destination pair. Thus, Bro alerts if the combination of probing for a vulnerable server, plus a potential follow-on exploit of the vulnerability, has been seen.

Figure 5: Signature for Apache/mod_ssl worm

signature slapper-probe {
  ip-proto == tcp
  dst-ip == x.y.0.0/16            # sent to local net
  dst-port == 80
  payload /.*GET \/ HTTP\/1\.1\x0d\x0a\x0d\x0a/
  eval is_vulnerable_to_slapper   # call policy fct.
  event "Vulner. host possibly probed by Slapper"
}

signature slapper-exploit {
  ip-proto == tcp
  dst-ip == x.y.0.0/16
  dst-port == 443                 # 443/tcp = SSL/TLS
  eval has_slapper_probed         # test: already probed?
  event "Slapper tried to exploit vulnerable host"
}

3.5.5 Exploit Scanning Often attackers do not target a particular system on the Internet, but probe a large number of hosts for vulnerabilities (exploit scanning). Such a scan can be executed either horizontally (several hosts are probed for a particular exploit), vertically (one host is probed for several exploits), or both. While most of these probes, on their own, are usually low-priority failed attempts, the scan itself is an important event. By simply counting the number of signature alerts per source address (horizontal) or per source/destination pair (vertical), Bro can readily identify such scans. We have implemented this with a policy script which generates alerts like:

a.b.c.d triggered 10 signatures on host e.f.g.h
i.j.k.l triggered signature sid-1287 on 100 hosts
m.n.o.p triggered signature worm-probe on 500 hosts
q.r.s.t triggered 5 signatures on host u.v.x.y
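As an illustration of this per-source and per-pair counting, here is a minimal Python sketch. It is not the deployed policy script (which is written in Bro's policy language), and the threshold values are made-up placeholders.

from collections import defaultdict

# Hypothetical thresholds; the real policy script's values are site-configurable.
HORIZONTAL_THRESHOLD = 100   # one signature triggered against many hosts
VERTICAL_THRESHOLD = 10      # many signatures triggered against one host

# (source, signature) -> set of destination hosts (horizontal scan detection)
hosts_per_sig = defaultdict(set)
# (source, destination) -> set of signatures (vertical scan detection)
sigs_per_pair = defaultdict(set)

def signature_match(src, dst, sig_id):
    """Called once per signature match event; returns any scan alerts."""
    alerts = []

    hosts_per_sig[(src, sig_id)].add(dst)
    if len(hosts_per_sig[(src, sig_id)]) == HORIZONTAL_THRESHOLD:
        alerts.append(f"{src} triggered signature {sig_id} on "
                      f"{HORIZONTAL_THRESHOLD} hosts")

    sigs_per_pair[(src, dst)].add(sig_id)
    if len(sigs_per_pair[(src, dst)]) == VERTICAL_THRESHOLD:
        alerts.append(f"{src} triggered {VERTICAL_THRESHOLD} signatures "
                      f"on host {dst}")

    return alerts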
• 83. (Footnote 3: Note that it could instead implement a more conservative policy, and return true unless the destination is known to not run a vulnerable version of OpenSSL/Apache.) 4. EVALUATION Our approach for evaluating the effectiveness of the signature engine is to compare it to Snort in terms of run-time performance and generated alerts, using semantically equivalent signature sets. We note that we do not evaluate the concept of contextual signatures by itself. Instead, as a first step, we validate that our implementation is capable of acting as an effective substitute for the most widely deployed NIDS even when we do not use any of the advanced features it provides. Building further on this base by thoroughly evaluating the actual power of contextual signatures when deployed operationally is part of our ongoing work. During our comparison of Bro and Snort, we found several peculiarities that we believe are of more general interest. Our results stress that the performance of a NIDS can be very sensitive to semantics, configuration, input, and even underlying hardware. Therefore, after discussing our test data, we delve into these in some
  • 84. detail. Keeping these limitations in mind, we then assess the overall performance of the Bro signature engine. 4.1 Test Data For our testing, we use two traces: USB-Full A 30-minute trace collected at Saarland University, Germany (USB-Full), consisting of all traffic (including packet contents) except for three high-volume peer-to-peer applications (to reduce the volume). The university has 5,500 internal hosts, and the trace was gathered on its 155 Mbps access link to the Internet. The trace totals 9.8 GB, 15.3M packets, and 220K connections. 35% of the trace packets be- long to HTTP on port 80, 19% to eDonkey on port 4662, and 4% to ssh on port 22, with other individual ports being less common than these three (and the high-volume peer-to-peer that was removed). LBL-Web A two-hour trace of HTTP client-side traffic, including packet contents, gathered at the Lawrence Berkeley National Laboratory (LBL), Berkeley, USA (LBL-Web). The labora- tory has 13,000 internal hosts, and the trace was gathered on its Gbps access link to the Internet. The trace totals 667MB, 5.5M packets, and 596K connections. Unless stated otherwise, we performed all measurements on 550MHz Pentium-3 systems containing ample memory (512MB or more). For both Snort and Bro’s signature engine, we used Snort’s default signature set. We disabled Snort’s “experimental” set of sig- natures as some of the latest signatures use new options which are
• 85. not yet implemented in our conversion program. In addition, we disabled Snort signature #526, BAD TRAFFIC data in TCP SYN packet. Due to Bro matching stream-wise instead of packet-wise, it generates thousands of false positives. We discuss this in §4.2. In total, 1,118 signatures are enabled. They contain 1,107 distinct patterns and cover 89 different service ports. 60% of the signatures cover HTTP traffic. For LBL-Web, only these were activated. For Snort, we enabled the preprocessors for IP defragmentation, TCP stream reassembling on its default ports, and HTTP decoding. For Bro, we have turned on TCP reassembling for the same ports (even if otherwise Bro would not reassemble them because none of the usual event handlers indicated interest in traffic for those ports), enabled its memory-saving configuration ("@load reduce-memory"), and used an inactivity timeout of 30 seconds (in correspondence with Snort's default session timeout). We configured both systems to consider all packets contained in the traces. We used the version 1.9 branch of Snort, and version 0.8a1 of Bro. 4.2 Difficulties of Evaluating NIDSs The evaluation of a NIDS is a challenging undertaking, both in
  • 86. terms of assessing attack recognition and in terms of assessing per- formance. Several efforts to develop objective measures have been made in the past (e.g., [21, 22, 15]), while others stress the diffi- culties with such approaches [24]. During our evaluation, we en- countered several additional problems that we discuss here. While these arose in the specific context of comparing Snort and Bro, their applicability is more general. When comparing two NIDSs, differing internal semantics can present a major problem. Even if both systems basically perform the same task—capturing network packets, rebuilding payload, de- coding protocols—that task is sufficiently complex that it is almost inevitable that the systems will do it somewhat differently. When coupled with the need to evaluate a NIDS over a large traffic trace (millions of packets), which presents ample opportunity for the dif- fering semantics to manifest, the result is that understanding the significance of the disagreement between the two systems can en- tail significant manual effort. One example is the particular way in which TCP streams are re- assembled. Due to state-holding time-outs, ambiguities (see [27, 16] and [25] for discussion of how these occur for benign reasons in
  • 87. practice) and non-analyzed packets (which can be caused by packet filter drops, or by internal sanity checks), TCP stream analyzers will generally wind up with slightly differing answers for corner cases. Snort, for example, uses a preprocessor that collects a number of packets belonging to the same session until certain thresholds are reached and then combines them into “virtual” packets. The rest of Snort is not aware of the reassembling and still only sees packets. Bro, on the other hand, has an intrinsic notion of a data stream. It collects as much payload as needed to correctly reconstruct the next in-sequence chunk of a stream and passes these data chunks on as soon as it is able to. The analyzers are aware of the fact that they get their data chunk-wise, and track their state across chunks. They are not aware of the underlying packetization that lead to those chunks. While Bro’s approach allows true stream-wise signatures, it also means that the signature engine loses the notion of “packet size”: packets and session payload are decoupled for most of Bro’s analyzers. However, Snort’s signature format includes a way of specifying the packet size. Our signature engine must fake up an equivalent by using the size of the first matched payload chunk for
  • 88. each connection, which can lead to differing results. Another example of differing semantics comes from the behavior of protocol analyzers. Even when two NIDS both decode the same protocol, they will differ in the level-of-detail and their interpreta- tion of protocol corner cases and violations (which, as mentioned above, are in fact seen in non-attack traffic [25]). For example, both Bro and Snort extract URIs from HTTP sessions, but they do not interpret them equally in all situations. Character encodings within URIs are sometimes decoded differently, and neither contains a full Unicode decoder. The anti-IDS tool Whisker [37] can actively ex- ploit these kinds of deficiencies. Similarly, Bro decodes pipelined HTTP sessions; Snort does not (it only processes the first URI in a series of pipelined HTTP requests). Usually, the details of a NIDS can be controlled by a number of options. But frequently for a Bro option there is no equivalent Snort option, and vice versa. For example, the amount of memory used by Snort’s TCP reassembler can be bounded to a fixed value. If this limit is reached, old data is expired aggressively. Bro relies solely on time-outs. Options like these often involve time-memory trade-
  • 89. offs. The more memory we have, the more we can spend for Snort’s reassembler, and the larger we can make Bro’s time-outs. But how to choose the values, so that both will utilize the same amount of memory? And even if we do, how to arrange that both expire the same old data? The hooks to do so simply aren’t there. The result of these differences is differing views of the same net- work data. If one NIDS reports an alert while the other does not, it may take a surprisingly large amount of effort to tell which one of them is indeed correct. More fundamentally, this depends on the definition of “correct,” as generally both are correct within their own semantics. From a user’s point of the view, this leads to differ- ent alerts even when both systems seem to use the same signatures. From an evaluator’s point of view, we have to (i) grit our teeth and be ready to spend substantial effort in tracking down the root cause when validating the output of one tool versus another, and (ii) be very careful in how we frame our assessment of the differences, be- cause there is to some degree a fundamental problem of “comparing apples and oranges”. The same applies for measuring performance in terms of effi-
  • 90. ciency. If two systems do different things, it is hard to compare them fairly. Again, the HTTP analyzers of Snort and Bro illustrate this well. While Snort only extracts the first URI from each packet, Bro decodes the full HTTP session, including tracking multiple re- quests and replies (which entails processing the numerous ways in which HTTP delimits data entities, including “multipart MIME” and “chunking”). Similarly, Bro provides much more information at various other points than the corresponding parts of Snort. But there are still more factors that influence performance. Even if one system seems to be significantly faster than another, this can change by modifying the input or even the underlying hardware. One of our main observations along these lines is that the perfor- mance of NIDSs can depend heavily on the particular input trace. On a Pentium-3 system, Snort needs 440 CPU seconds for the trace LBL-Web (see Figure 6). This only decreases by 6% when us- ing the set-wise pattern matcher of [12]. In addition, we devised a small modification to Snort that, compared to the original ver- sion, speeds it up by factor of 2.6 for this particular trace. (The modification is an enhancement to the set-wise matcher: the orig- inal implementation first performs a set-wise search for all of the possible strings, caching the results, and then iterates through the lists of signatures, looking up for each in turn whether its
  • 91. particular strings were matched. Our modification uses the result of the set- wise match to identify potential matching signatures directly if the corresponding list is large, avoiding the iteration.) Figure 6: Run-times on different hardware Pentium−3, 512Mhz Pentium−4, 1.5Ghz Snort Snort−[FV01] Snort−Modified Bro w/o DFA cache Bro w/ DFA cache Run−times on Web trace S e co n d s 0 1 0 0 2 0
  • 92. 0 3 0 0 4 0 0 5 0 0 Using the trace USB-Full, however, the improvement realized by our modified set-wise matcher for Snort is only a factor of 1.2. Even more surprisingly, on a trace from another environment (a re- search laboratory with 1,500 workstations and supercomputers), the original version of Snort is twice as fast as the set-wise implemen- tation of [12] (148 CPU secs vs. 311 CPU secs), while our patched version lies in between (291 CPU secs). While the reasons remain to be discovered in Snort’s internals, this demonstrates the difficulty of finding representative traffic as proposed, for example, in [15]. 269
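To make the modification to Snort described above concrete: the idea is to go from the strings found by the set-wise search directly to the signatures that use them, via an inverted index, rather than iterating over every signature afterwards. The Python sketch below is only an illustration of that idea; the rule contents and the naive string search are placeholders, not Snort's actual data structures or matcher.

from collections import defaultdict

# Hypothetical rule table: each signature lists the fixed strings it requires.
signatures = {
    "sid-1373": [b"conf/httpd.conf"],
    "sid-1560": [b"/doc/"],
    "sid-884":  [b"/formmail"],
}

# Inverted index built once at start-up: string -> signatures that use it.
sigs_by_string = defaultdict(set)
for sid, strings in signatures.items():
    for s in strings:
        sigs_by_string[s].add(sid)

def candidate_signatures(payload: bytes):
    """Set-wise search (naively emulated here), then direct lookup of the
    signatures whose strings actually occurred, instead of iterating over
    the full signature list."""
    matched_strings = {s for s in sigs_by_string if s in payload}
    candidates = set()
    for s in matched_strings:
        candidates |= sigs_by_string[s]
    # Only these candidates still need their remaining conditions checked.
    return candidates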
  • 93. Furthermore, relative performance does not only depend on the input but even on the underlying hardware. As described above, the original Snort needs 440 CPU seconds for LBL-Web on a Pentium- 3 based system. Using exactly the same configuration and input on a Pentium-4 based system (1.5GHz), it actually takes 29 CPU seconds more. But now the difference between stock Snort and our modified version is a factor of 5.8! On the same system, Bro’s run- time decreases from 280 to 156 CPU seconds.4 Without detailed hardware-level analysis, we can only guess why Snort suffers from the upgrade. To do so, we ran valgrind’s [34] cache simulation on Snort. For the second-level data cache, it shows a miss-rate of roughly 10%. The corresponding value for Bro is be- low 1%. While we do not know if valgrind’s values are airtight, they could at least be the start of an explanation. We have heard other anecdotal comments that the Pentium-4 performs quite poorly for applications with lots of cache-misses. On the other hand, by building Bro’s regular expression matcher incrementally, as a side effect the DFA tables will wind up having memory locality that somewhat reflects the dynamic patterns of the state accesses, which will tend to decrease cache misses. 4.3 Performance Evaluation We now present measurements of the performance of the Bro
  • 94. sig- nature engine compared with Snort, keeping in mind the difficulties described above. Figure 7 shows run-times on trace subsets of dif- ferent length for the USB-Full trace. We show CPU times for the original implementation of Snort, for Snort using [12] (virtually no difference in performance), for Snort modified by us as described in the previous section, for Bro with a limited DFA state cache, and for Bro without a limited DFA state cache. We see that our modified Snort runs 18% faster than the original one, while the cache-less Bro takes about the same amount of time. Bro with a limited state cache needs roughly a factor of 2.2 more time. We might think that the discrepancy between Bro operating with a limited DFA state cache and it operating with unlimited DFA state memory is due to it having to spend considerable time recomputing states previously expired from the limited cache. This, however, turns out not to be the case. Additional experiments with essentially infinite cache sizes indicate that the performance decrease is due to the additional overhead of maintaining the cache. While this looks like a significant impact, we note that it is not clear whether the space savings of a cache is in fact needed in opera-
• 95. tional use. For this trace, only 2,669 DFA states had to be computed, totaling roughly 10MB. When running Bro operationally for a day at the university's gateway, the number of states rapidly climbs to about 2,500 in the first hour, but then from that point on only slowly rises to a bit over 4,000 by the end of the day. A remaining question, however, is whether an attacker could create traffic specifically tailored to enlarge the DFAs (a "state-holding" attack on the IDS), perhaps by sending a stream of packets that nearly trigger each of the different patterns. Additional research is needed to further evaluate this threat. (Footnote 4: This latter figure corresponds to about 35,000 packets per second, though we strongly argue that measuring performance in PPS rates implies undue generality, since, as developed above, the specifics of the packets make a great difference in the results.) Comparing for USB-Full the alerts generated by Snort to the signature matches reported by Bro, all in all we find very good agreement. The main difference is the way they report a match. By design, Bro reports all matching signatures, but each one only once per connection. This is similar to the approach suggested in [10]. Snort, on the other hand, reports the first matching signature for each packet, independently of the connection it belongs
  • 96. to. This makes it difficult to compare the matches. We account for these difference by comparing connections for which at least one match is generated by either system. With USB-Full, we get 2,065 matches by Bro in total on 1,313 connections. Snort reports 4,147 alerts. When counting each alert only once per connection, Snort produces 1,320 on 1,305 connections.5 There are 1,296 con- nections for which both generate at least one alert, and 17 (9) for which Bro (Snort) reports a match but not Snort (Bro). Looking at individual signatures, we see that Bro misses 10 matches of Snort. 5 of them are caused by Snort ID #1013 (WEB- IIS fpcount access). The corresponding connections con- tain several requests, but an idle time larger than the defined in- activity timeout of 30 seconds. Therefore, Bro flushes the state before it can encounter the match which would happen later in the session. On the other hand, Bro reports 41 signature matches for connections for which Snort does not report anything. 37 of them are Web signatures. The discrepancy is due to different TCP stream semantics. Bro and Snort have slightly different definitions of when a session is established. In addition, the semantic differ- ences between stream-wise and packet-wise matching discussed in §4.2 cause some of the additional alerts. Figure 7: Run-time comparison on 550MHz Pentium-3
• 97. [Plot for Figure 7: run-time in seconds (0 to 2,000) against USB-Full trace length in minutes (0 to 30), for Bro w/o state cache, Bro w/ state cache, Snort, Snort [FV01], and Snort patched.]
• 98. We have done similar measurements with LBL-Web. Due to limited space, we omit the corresponding plot here. While the original Snort takes 440 CPU seconds for the trace, Bro without (with) a limited state cache needs 280 (328) CPU seconds, and Snort as modified by us needs only 164 CPU seconds. While this suggests room for improvement in some of Bro's internal data structures, Bro's matcher still compares quite well to the typical Snort configuration. For this trace, Bro (Snort) reports 2,764 (2,049) matches in total. If we count Snort's alerts only once per connection, there are 1,472 of them. There are 1,395 connections for which both report at least one alert. For 133 (69) connections, Bro (Snort) reports a match but Snort (Bro) does not. Again, looking at individual signatures, Bro misses 73 of Snort's alerts. 25 of them are matches of Snort signature #1287 (WEB-IIS scripts access). These are all caused by the same host. The reason is packets missing from the trace, which, due to a lack of in-order sequencing, prevent the TCP stream from being reassembled by Bro. Another 19 are due to signature #1287 (CodeRed v2 root.exe access). The ones of these we inspected further were due to premature server-side
  • 99. resets, which Bro correctly identifies as the end of the corresponding con- nections, while Snort keeps matching on the traffic still being send by the client. Bro reports 186 signature matches for connections for which Snort does not report a match at all. 68 of these connections simultaneously trigger three signatures (#1002, #1113, #1287). 46 5Most of the duplicates are ICMP Destination Unreach- able messages. Using Bro’s terminology, we define all ICMP packets between two hosts as belonging to one “connection.” 270 are due to simultaneous matches of signatures #1087 and #1242. Looking at some of them, one reason is SYN-packets missing from the trace. Their absence leads to different interpretations of estab- lished sessions by Snort and Bro, and therefore to different matches. 5. CONCLUSIONS In this work, we develop the general notion of contextual sig- natures as an improvement on the traditional form of string- based signature-matching used by NIDS. Rather than matching fixed strings in isolation, contextual signatures augment the matching pro-
  • 100. cess with both low-level context, by using regular expressions for matching rather than simply fixed strings, and high-level context, by taking advantage of the rich, additional semantic context made available by Bro’s protocol analysis and scripting language. By tightly integrating the new signature engine into Bro’s event- based architecture, we achieve several major improvements over other signature-based NIDSs such as Snort, which frequently suf- fer from generating a huge number of alerts. By interpreting a signature-match only as an event, rather than as an alert by itself, we are able to leverage Bro’s context and state-management mech- anisms to improve the quality of alerts. We showed several exam- ples of the power of this approach: matching requests with replies, recognizing exploit scans, making use of vulnerabilty profiles, and defining dependencies between signatures to model attacks that span multiple connections. In addition, by converting the freely available signature set of Snort into Bro’s language, we are able to build upon existing community efforts. As a baseline, we evaluated our signature engine using Snort as a reference, comparing the two systems in terms of both run- time performance and generated alerts using the signature set
  • 101. archived at [2]. But in the process of doing so, we encountered several gen- eral problems when comparing NIDSs: differing internal semantics, incompatible tuning options, the difficulty of devising “representa- tive” input, and extreme sensitivity to hardware particulars. The last two are particularly challenging, because there are no a priori indi- cations when comparing performance on one particular trace and hardware platform that we might obtain very different results using a different trace or hardware platform. Thus, we must exercise great caution in interpreting comparisons between NIDSs. Based on this work, we are now in the process of deploying Bro’s contextual signatures operationally in several educational, research and commercial enviroments. Finally, we have integrated our work into version 0.8 of the Bro distribution, freely available at [5]. 6. ACKNOWLEDGMENTS We would like to thank the Lawrence Berkeley National Labora- tory (LBL), Berkeley, USA; the National Energy Research Scien- tific Computing Center (NERSC), Berkeley, USA; and the Saarland University, Germany. We are in debt to Anja Feldmann for
  • 102. making this work possible. Finally, we would like to thank the anonymous reviewers for their valuable suggestions. 7. REFERENCES [1] arachNIDS. http://whitehats.com/ids/. [2] Web archive of versions of software and signatures used in this paper. http://www.net.in.tum.de/˜robin/ccs03. [3] S. Axelsson. The base-rate fallacy and the difficulty of intrusion detection. ACM Transactions on Information and System Security, 3(3):186–205, August 2000. [4] R. G. Bace. Intrusion Detection. Macmillan Technical Publishing, Indianapolis, IN, USA, 2000. [5] Bro: A System for Detecting Network Intruders in Real- Time. http://www.icir.org/vern/bro-info.html. [6] Bugtraq. http://www.securityfocus.com/bid/1187. [7] CERT Advisory CA-2002-27 Apache/mod ssl Worm. http://www.cert.org/advisories/CA-2002-27.html. [8] C. J. Coit, S. Staniford, and J. McAlerney. Towards Faster Pattern Matching for Intrusion Detection or Exceeding the Speed of Snort. In Proc. 2nd DARPA Information Survivability Conference and Exposition, June
  • 103. 2001. [9] Common Vulnerabilities and Exposures. http://www.cve.mitre.org. [10] H. Debar and B. Morin. Evaluation of the Diagnostic Capabilities of Commercial Intrusion Detection Systems. In Proc. Recent Advances in Intrusion Detection, number 2516 in Lecture Notes in Computer Science. Springer-Verlag, 2002. [11] R. F. et. al. Hypertext transfer protocol – http/1.1. Request for Comments 2616, June 1999. [12] M. Fisk and G. Varghese. Fast Content-Based Packet Handling for Intrusion Detection. Technical Report CS2001-0670, UC San Diego, May 2001. [13] Fyodor. Remote OS detection via TCP/IP Stack Finger Printing. Phrack Magazine, 8(54), 1998. [14] J. Haines, L. Rossey, R. Lippmann, and R. Cunnigham. Extending the 1999 Evaluation. In Proc. 2nd DARPA Information Survivability Conference and Exposition, June 2001. [15] M. Hall and K. Wiley. Capacity Verification for High Speed Network Intrusion Detection Systems. In Proc. Recent Advances in Intrusion Detection, number
  • 104. 2516 in Lecture Notes in Computer Science. Springer-Verlag, 2002. [16] M. Handley, C. Kreibich, and V. Paxson. Network intrusion detection: Evasion, traffic normalization, and end-to-end protocol semantics. In Proc. 10th USENIX Security Symposium, Washington, D.C., August 2001. [17] J. Heering, P. Klint, and J. Rekers. Incremental generation of lexical scanners. ACM Transactions on Programming Languages and Systems (TOPLAS), 14(4):490–520, 1992. [18] J. E. Hopcroft and J. D. Ullman. Introduction to Automata Theory, Languages, and Computation. Addison Wesley, 1979. [19] K. Jackson. Intrusion detection system product survey. Technical Report LA-UR-99-3883, Los Alamos National Laboratory, June 1999. [20] U. Lindqvist and P. A. Porras. Detecting computer and network misuse through the production-based expert system toolset (P-BEST). In Proc. IEEE Symposium on Security and Privacy. IEEE Computer Society Press, May 1999. [21] R. Lippmann, R. K. Cunningham, D. J. Fried, I. Graf, K. R. Kendall, S. E. Webster, and M. A. Zissman. Results of the 1998 DARPA Offline Intrusion Detection Evaluation. In Proc. Recent Advances in Intrusion Detection, 1999.
  • 105. [22] R. Lippmann, J. W. Haines, D. J. Fried, J. Korba, and K. Das. The 1999 DARPA off-line intrusion detection evaluation. Computer Networks, 34(4):579–595, October 2000. [23] R. Lippmann, S. Webster, and D. Stetson. The Effect of Identifying Vulnerabilities and Patching Software on the Utility of Network Intrusion Detection. In Proc. Recent Advances in Intrusion Detection, number 2516 in Lecture Notes in Computer Science. Springer-Verlag, 2002. [24] J. McHugh. Testing Intrusion detection systems: A critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory. ACM Transactions on Information and System Security, 3(4):262–294, November 2000. [25] V. Paxson. Bro: A system for detecting network intruders in real-time. Computer Networks, 31(23–24):2435–2463, 1999. [26] P. A. Porras and P. G. Neumann. EMERALD: Event monitoring enabling responses to anomalous live disturbances. In National Information Systems Security Conference, Baltimore, MD, October 1997. [27] T. H. Ptacek and T. N. Newsham. Insertion, evasion, and denial of service: Eluding network intrusion detection. Technical report, Secure
  • 106. Networks, Inc., January 1998. [28] M. J. Ranum, K. Landfield, M. Stolarchuk, M. Sienkiewicz, A. Lambeth, and E. Wall. Implementing a generalized tool for network monitoring. In Proc. 11th Systems Administration Conference (LISA), 1997. [29] M. Roesch. Snort: Lightweight intrusion detection for networks. In Proc. 13th Systems Administration Conference (LISA), pages 229–238. USENIX Association, November 1999. [30] R. Sekar and P. Uppuluri. Synthesizing fast intrusion prevention/detection systems from high-level specifications. In Proc. 8th USENIX Security Symposium. USENIX Association, August 1999. [31] U. Shankar and V. Paxson. Active Mapping: Resisting NIDS Evasion Without Altering Traffic. In Proc. IEEE Symposium on Security and Privacy, 2003. [32] Steven T. Eckmann. Translating Snort rules to STATL scenarios. In Proc. Recent Advances in Intrusion Detection, October 2001. [33] tcpdump. http://www.tcpdump.org. [34] Valgrind. http://developer.kde.org/˜sewardj. [35] G. Vigna, S. Eckmann, and R. Kemmerer. The STAT Tool Suite. In Proc. 1st DARPA Information Survivability Conference and Exposition,
• 107. Hilton Head, South Carolina, January 2000. IEEE Computer Society Press. [36] G. Vigna and R. A. Kemmerer. NetSTAT: A network-based intrusion detection system. Journal of Computer Security, 7(1):37–71, 1999. [37] Whisker. http://www.wiretrip.net/rfp.