August 2011
Hacker Intelligence Initiative, Monthly Trend Report #3


Hacker Intelligence Summary Report – The Convergence of Google and Bots:
Searching for Security Vulnerabilities using Automated Botnets
In this monthly report from Imperva’s Hacker Intelligence Initiative (HII), we describe how popular search engines are used
as an attack platform to retrieve sensitive data, a technique known as “Google Hacking”. The attack is further enhanced by
deploying bots to automate the process and to evade the anti-automation detection techniques commonly deployed by
search engine providers. Although Google Hacking has been around, in name, for some time, new innovations by hackers
require another, closer look. Specifically, Google and other search engines have put anti-automation measures in place to
stop hackers from abusing search. However, by using distributed bots, hackers take advantage of the bots’ dispersed nature,
giving search engines the impression that individuals are performing routine searches. The reality? Hackers are conducting
cyber reconnaissance on a massive scale.

Imperva’s Application Defense Center (ADC) has followed a particular botnet and has witnessed its usage against a
well-known search engine provider. By tracking this botnet, the ADC found how attackers lay the groundwork to simplify and
automate the next stages of an attack campaign against web applications. In this report, we describe the steps that hackers
take to leverage the power of search engines to collect attack targets on a massive scale. Our
findings show that during an attack, hackers can generate more than 80,000 daily queries to probe the Web for vulnerable
Web applications. We provide essential advice to organizations on how to prepare against exploits tailored to these
vulnerabilities. We also propose potential solutions that leading search engines such as Google, Bing and Yahoo can employ
to address the growing problem of hackers using their platforms as an attack tool.



An Overview of Google Hacking
On the Internet, search engines have emerged as powerful tools in an attacker’s arsenal, providing a way to gather
information about a target and find potential vulnerabilities in an anonymous and risk-free fashion. This activity is typically
called “Google Hacking”. Although the name emphasizes the search-engine giant, it pertains to all search engine providers.
Collecting information about an organization can set the stage for hackers to devise an attack tailored for a known
application. The specialized exploitation of known vulnerabilities may lead to contaminated web sites, data theft, data
modification, or even a compromise of company servers.
Search engines can be directed to return results that are focused on specific potential targets by using a specific set of
query operators. For example, the attacker may focus on all potential victims in a specified geographic location (e.g. a
single country). In this case, the query includes a “location” search operator. In another scenario, an attacker may want to
target all vulnerabilities in a specific web site, and achieves this by issuing different queries containing the “site” search
operator. These particular search queries are commonly referred to as “Google Dorks”, or simply “Dorks”.
Automating the query and result parsing enables the attacker to issue a large number of queries, examine all the returned
results and get a filtered list of potentially exploitable sites in a very short time and with minimal effort.
In order to block automated search campaigns, today’s search engines deploy detection mechanisms which are based on
the IP address of the originating request.




    What is new about the attack campaign that we witnessed? Our investigation has shown that attackers are able to overcome
    these detection techniques by distributing the queries across different machines. This is achieved by employing a network
    of compromised machines, better known as a botnet.
    Hackers also gain the secondary benefit of hiding their identity behind these bots, since it is the compromised host that
    actually performs the search queries. In effect, the attacker adds a layer of indirection between herself and the automated
    search queries. This makes the task of tracing the malicious activity back to the individual attacker all the more difficult.



    The Hacker’s 4 Steps for an Industrialized Attack:
       1. Get a botnet. This is usually done by renting a botnet from a bot farmer who has a global network of compromised
          computers under his control.
       2. Obtain a tool for coordinated, distributed searching. This tool is deployed to the botnet agents and it usually
          contains a database of dorks.
       3. Launch a massive search campaign through the botnet. Our observations show that there is an automated
          infrastructure to control the distribution of dorks and the examination of the results between botnet parts.
       4. Craft a massive attack campaign based on search results. With the list of potentially vulnerable resources, the
          attacker can create, or use a ready-made, script to craft targeted attack vectors that attempt to exploit vulnerabilities in
          pages retrieved by the search campaign. Attacks include: infecting web applications, compromising corporate data or
          stealing sensitive personal information.



    Detailed Analysis
    Mining Search Engines for Attack Targets
    Search engine mining can be used by attackers in multiple ways. Exposing neglected sensitive files and folders, collecting
    network intelligence from exposed logs and detecting unprotected network attached devices are some of the perks of
    having access to this huge universal index. Our report focuses on one specific usage: massively collecting attack targets.
    Specially crafted search queries can be constructed to detect web resources that are potentially vulnerable. There is a
    wide variety of indicators, starting from distinguishable resource names through banners of specific products and up to
    specific error messages. The special search terms, commonly referred to as “Dorks” [1], combine search terms and operators
    that usually correlate the type of resource with its contents. Dorks are commonly exchanged between hackers in forums.
    Comprehensive lists of dorks are also being made available through various web sites (both public and underground).
    Examples include the legendary Google Hacking Database at http://johnny.ihackstuff.com/ghdb/ and the up-to-date sites
    http://www.1337day.com/webapps and http://www.exploit-db.com/google-dorks/. As the latter name suggests, the site
    contains an exploit database demonstrating how dorks and exploits go hand in hand.




[1] http://www.danscourses.com/Network-Security+/search-engine-hacking-471.html





Figure 1: Banner from the Google Hacking Database




Figure 2: Banners from the Exploit Database








Some resources classify dorks according to platform or usage as can be seen from the screenshot below:




Figure 3: Searching dorks by class

An attacker armed with a browser and a dork can start listing potential attack targets. By using search engine results an
attacker not only lists vulnerable servers but also gets a pretty accurate idea as to which resources within that server are
potentially vulnerable.








For example, the following query returns online shopping sites that run the osCommerce application.




Figure 4: results returned from a dork search








The following screenshot shows the results of a dork search for FTP configurations.




Figure 5: results returned from a dork search

Automating the Usage of Dorks
Tools to automate the use of dorks have been created over the years by attacker groups. Some of them are desktop tools
and some are accessible as an online service. Some automate just the collection of targets, while others also automate the
construction of exploit vectors and the attack itself.




Figure 6: Desktop tool for automated Google Hacking





Figure 7: Online service for automated search and attack campaigns

In view of this threat, most search engines have implemented anti-automation measures that rely (mainly) on the
following attributes:
  › Number of search queries from a single source (IP / session)
  › Frequency of queries from a single source
  › Massive retrieval of results for a single query
The anti-automation measures taken by search engine operators forced attackers to look for new alternatives for search
engine hacking automation. They found it in the form of botnet based search engine mining. By harnessing the power of
botnets, attackers launch distributed coordinated search campaigns that evade the standard anti-automation mechanisms.
The inherent distributed nature of the attack helps avoid the single source issue. The use of special search operators that
artificially split the search space (e.g. by country or by partial domain), overcomes the limitation enforced by search engines
over the number of results that can be retrieved per query. In addition, the attacker creates yet another layer of indirection
through the use of “search proxies”. This extra layer makes it even harder to identify the true source of the attack and the
whereabouts of the attacker.
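
The per-source attributes listed above can be illustrated with a sliding-window counter. The following is a minimal sketch, not any search engine's actual mechanism: the class name, the threshold of 30 queries and the 60-second window are all hypothetical values chosen for illustration. It shows why a check keyed on a single IP sees nothing unusual when the same workload is spread across many slow hosts.

```python
from collections import defaultdict, deque


class PerSourceRateMonitor:
    """Flags a source once it exceeds max_queries within window_seconds."""

    def __init__(self, max_queries: int, window_seconds: float) -> None:
        self.max_queries = max_queries
        self.window_seconds = window_seconds
        self._history = defaultdict(deque)  # source -> timestamps inside the window

    def record(self, source: str, timestamp: float) -> bool:
        """Record one query; return True if the source is now over the limit."""
        window = self._history[source]
        window.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        return len(window) > self.max_queries


def flagged_sources(events, max_queries=30, window_seconds=60.0):
    """events: iterable of (source, timestamp) pairs in time order.

    The default limits are illustrative assumptions, not measured values.
    """
    monitor = PerSourceRateMonitor(max_queries, window_seconds)
    flagged = set()
    for source, timestamp in events:
        if monitor.record(source, timestamp):
            flagged.add(source)
    return flagged


# One host sending two queries per second trips the check ...
single_host = [("203.0.113.5", i * 0.5) for i in range(100)]
# ... while 40 hosts that each send one query every two minutes do not.
distributed = [(f"198.51.100.{h}", t * 120.0 + h) for t in range(10) for h in range(40)]

print(flagged_sources(single_host))
print(flagged_sources(distributed))
```

Because the counter is keyed on the source, the only way for it to notice a distributed campaign is to correlate sources by what they ask for rather than by how often each one asks.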
In the following section we will show evidence of these techniques as seen in the wild.
A Typical Dork-Search Attack
We have observed a specific botnet attack on a popular search engine during May-June 2011. The attacker used dorks that
match vulnerable web applications and search operators that were tailored to the specific search engine. For each unique
search query, the botnet examined dozens and even hundreds of returned results using paging parameters in the query.
The volume of attack traffic was huge: nearly 550,000 queries (up to 81,000 daily queries, and 22,000 daily queries on
average) were requested during the observation period. It is clear that the attacker took advantage of the bandwidth
available to the dozens of controlled hosts in the botnet to seek and examine vulnerable applications.
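
As a quick sanity check on these figures, the reported total and daily average imply an observation period of roughly 25 days. The short calculation below uses only the rounded numbers quoted above, so the result is approximate.

```python
total_queries = 550_000    # "nearly 550,000 queries" (rounded figure from the text)
average_per_day = 22_000   # reported daily average
peak_per_day = 81_000      # reported daily peak

implied_days = total_queries / average_per_day
peak_to_average = peak_per_day / average_per_day

print(f"implied observation period: about {implied_days:.0f} days")
print(f"peak day vs. average day: about {peak_to_average:.1f}x")
```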








Figure 8: dork queries per hour




Figure 9: dork queries per day

Search Engine Dorks
Most of the Dorks used in the observed attack were related to Content Management Systems and e-commerce applications.
Content Management Systems manage the work flow of users in a collaborative environment and enable a large number of
people to contribute to a site and to share stored data (for example, an eCommerce system or a forum for users of a game to
share playing tips). These systems are naturally more open and allow external users to contribute content and even upload
entire files. Thus, security vulnerabilities they contain can be easily exposed and exploited. E-commerce systems, on the
other hand, manage and store financial information about their customers, and a successful attack on such a site can be
immediately monetized.







Some examples of the observed dorks used in the attack are shown below. As can be seen, the search terms include various
free text words that identify vulnerable applications, as well as search operators that focus the query to specific sites,
domains or countries.
Search Query                                                    Target application                                                          Example of vulnerabilities associated with the application [2]
“Powered By Oscommerce” ‘catalog’                               osCommerce: online shop e-commerce solution                                 SQL injection vulnerability in shopping_cart.php (CVE-2006-4297)
“powered by oscommerce” shoping                                 osCommerce                                                                  See above
“powered by e107” site:.ch                                      e107 CMS; limited to servers in Switzerland                                 Allows remote attackers to execute arbitrary PHP code (CVE-2010-2099)
“*.php?cPath=25” ranking                                        osCommerce                                                                  See above
“powered by osCommerce”                                         osCommerce                                                                  See above
“powered by zen cart” payment.php                               Zen Cart Ecommerce; e-commerce web site platform                            Allows remote attackers to execute arbitrary SQL (CVE-2009-2254)
“powered by e107” global                                        e107 CMS                                                                    See above
“fpw.php” site:.ir                                              e107 CMS - password reset page; limited to servers in Iran                  See above
Herzlich Willkommen Gast! site:.de                              osCommerce German welcome page; limited to servers in Germany               See above
“powered by e107” site:.org                                     e107 CMS; limited to domains with org suffix                                See above
“by BigCommerce” joomla.ze                                      BigCommerce e-commerce software integrated with Joomla CMS                  See above
“The Appserv Open Project” site:.th                             AppServ application development platform; limited to servers in Thailand    XSS vulnerability allows remote attackers to inject arbitrary web script (CVE-2008-2398)
“Powered by e107 Forum System” site:.com                        e107 CMS; limited to domains with com suffix                                See above
Joomla! es Software Libre distribuido bajo licencia GNU/GPL.    Joomla CMS - Spanish version                                                See above
“com_rokdownloads” site:jp                                      Joomla CMS; limited to servers in Japan                                     Directory traversal vulnerability in RokDownloads component of Joomla (CVE-2010-1056)
Table 1: Examples of observed dork queries


The additional operators (domain, language, etc.), as well as the specification of the requested page of results, are used for
several purposes:
      › Creating more focused result sets that allow construction of more accurate attack vectors
      › Artificially splitting the search space in a way that distributes the workload of exhaustively examining the entire result
        set between the bots in the net
Overall we have seen 4,719 different dork variations being used in the attack (where “powered by e107” site:.ch and “powered
by e107” site:.fr are variations on the same basic dork). The 30 most-used dorks were related to the osCommerce e-commerce
solution, and each of these variations appeared in 1,600-3,900 queries. The e107 application was the next most popular attack
target based on the number of observed dorks.
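
To make the notion of a "variation on the same basic dork" concrete, the sketch below groups logged queries by their text with the "site" operator removed. It is only an illustration of how such counting could be done when analyzing a query log, not the method used for the figures above; the function names and the normalization rules (lowercasing, whitespace collapsing) are assumptions.

```python
import re
from collections import defaultdict

# Matches a "site:" operator and its argument, e.g. site:.ch or site:jp.
SITE_OPERATOR = re.compile(r"\bsite:(\S+)", re.IGNORECASE)


def split_dork(query: str):
    """Return (base_dork, site_arguments) for one search query."""
    sites = tuple(arg.lower() for arg in SITE_OPERATOR.findall(query))
    base = SITE_OPERATOR.sub("", query)
    base = " ".join(base.split()).lower()  # collapse whitespace, ignore case
    return base, sites


def group_variations(queries):
    """Map each base dork to the set of full query variations seen for it."""
    groups = defaultdict(set)
    for query in queries:
        base, _ = split_dork(query)
        groups[base].add(query)
    return dict(groups)


observed = [
    '"powered by e107" site:.ch',
    '"powered by e107" site:.fr',
    '"powered by osCommerce"',
]
for base, variations in group_variations(observed).items():
    print(base, "->", len(variations), "variation(s)")
```

Grouping this way is also what lets an analyst see that many low-volume queries from different hosts belong to one campaign.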




[2] For the applications that the attackers sought, these are examples of publicly disclosed vulnerabilities. However, these are not
    necessarily the vulnerabilities that the attackers actually tried to exploit.





Botnet Hosts
Search engine providers identify malicious attacks based on a high volume or a high frequency of queries from the same
source. Yet we have witnessed how attackers bypass these detection mechanisms by employing a botnet.
During our observation period we identified 40 different IP addresses of hosts that participated in the attacking botnet.
The hosts are not all active at the same time. The attack is distributed and coordinated: different hosts handle different
dorks, and each host produces low-rate search activity. We found that most hosts issue no more than one request every 2
minutes. However, four hosts together issue 2-4 requests per minute. This rate does not trigger the search engine’s
anti-automation policy, as it normally cannot be considered abusive. In addition, the requests simulate genuine browser
activity rather than a script by constantly changing the user-agent field. Consequently, the attack campaign can go on for a
long time, allowing the attacker to collect a substantial number of target resources. One example of a coordinated distributed
dork search involved the dork “e107” with 99 different arguments for the site search operator: 5 different hosts issued these
queries over the entire observation period.
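
The two signals described above, one dork spread over several hosts and many "site" arguments, and a single host cycling through user-agent strings, can both be read out of a query log. The following sketch assumes a hypothetical log record with four fields and a hypothetical threshold of three user agents; it is meant to illustrate the kind of correlation involved, not to reproduce our analysis tooling.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryRecord:
    """One logged search request (hypothetical log schema)."""
    ip: str
    user_agent: str
    base_dork: str  # query text with operators stripped
    site_arg: str   # argument of the "site" operator, "" if absent


def summarize_campaigns(records):
    """Per base dork: (distinct hosts, distinct site arguments) seen."""
    hosts = defaultdict(set)
    sites = defaultdict(set)
    for r in records:
        hosts[r.base_dork].add(r.ip)
        if r.site_arg:
            sites[r.base_dork].add(r.site_arg)
    return {dork: (len(hosts[dork]), len(sites[dork])) for dork in hosts}


def rotating_user_agents(records, min_agents=3):
    """Hosts that presented at least min_agents distinct user-agent strings."""
    agents = defaultdict(set)
    for r in records:
        agents[r.ip].add(r.user_agent)
    return {ip for ip, seen in agents.items() if len(seen) >= min_agents}


# Synthetic data shaped like the "e107" example: 5 hosts, 99 site arguments.
records = [
    QueryRecord(f"192.0.2.{i % 5}", f"UA-{i % 4}", "e107", f".tld{i}")
    for i in range(99)
]
print(summarize_campaigns(records))
print(sorted(rotating_user_agents(records)))
```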




Figure 10: hosts searching for the dork “e107” with a “site” operator




Figure 11: queries for the dork “e107” with a “site” operator






The botnet hosts are distributed all over the world. This is not surprising, since the attacker does not care about the location
or ownership of the abused hosts and just needs the ability to take control of these machines and add them to her network
of compromised computers. Thus, the identities of the botnet hosts give no direct indication of the identity of the hacker
that uses them for malicious attacks. However, it is interesting to note that the observed botnet has a disproportionate
number of servers in Iran, Hungary and Germany, and a low number of servers in the United States. Also, some of the
dork queries specifically limited results to servers in Iran or Germany. This combination may be a hint to the interests of
the attacker.




Figure 12: number of hosts issuing dork queries

           Country            # dork queries       Percentage of dork queries
 Islamic Republic of Iran         227554                      41
 Hungary                          136445                      25
 Germany                           80448                      15
 United States                     19237                      3.5
 Chile                             17365                       3
 Thailand                          16717                       3
 Republic of Korea                 11872                       2
 France                            10906                       2
 Belgium                           10661                       2
 Brazil                             7559                      1.5
 Other                              8892                       2
Table 2: Countries of hosts issuing dork queries
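
The percentages in Table 2 can be recomputed from the query counts. The snippet below copies the counts from the table; the total it prints is consistent with the "nearly 550,000 queries" reported earlier, and the computed shares should land close to the rounded percentages in the table.

```python
# Query counts per country, copied from Table 2.
queries_by_country = {
    "Islamic Republic of Iran": 227554,
    "Hungary": 136445,
    "Germany": 80448,
    "United States": 19237,
    "Chile": 17365,
    "Thailand": 16717,
    "Republic of Korea": 11872,
    "France": 10906,
    "Belgium": 10661,
    "Brazil": 7559,
    "Other": 8892,
}

total = sum(queries_by_country.values())


def share(country: str) -> float:
    """Percentage of all observed dork queries issued by hosts in `country`."""
    return 100.0 * queries_by_country[country] / total


print(f"total dork queries: {total}")
for country in queries_by_country:
    print(f"{country:<26}{queries_by_country[country]:>8}{share(country):>7.1f}%")
```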








Figure 13: Countries of hosts issuing dork queries



Summary and Conclusions
We have observed a high-volume search engine mining campaign carried out by a botnet through a popular search engine.
The campaign focused on finding resources that use specific content management frameworks that can be exploited.
While none of the components of the attack (use of botnets deployed on compromised servers, exploiting search
engines using dorks) is unique, it is interesting to observe the potential for automation and flexibility of the attack. Each
component may be replaced or reconfigured easily, while the attacker and tools remain hidden from the targeted servers
and even from the abused search engine. The result is that the attacker can create a map of hackable targets
on the Web.
This type of abuse should concern both search engine providers and organizations. Search engines have
a responsibility to prevent attackers from using their platform to carry out attacks. At the same
time, search engines are in a unique position to identify botnets that abuse their services, thus shedding light on the
attackers. Organizations should protect their applications from being publicly exposed through search engines.
Recommendations to the Search Engines
Search engine providers are expected to perform a detailed analysis of network traffic, which allows suspicious
anomalies in the query traffic to be flagged. Search engines typically look for low-level anomalies such as a high frequency
or a high volume of requests from a host. As this report indicates, they should also start looking for unusual, suspicious
queries, such as those that are known to be part of public dork databases, or queries that look for known sensitive files
(/etc files or database data files).
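
A content-based check of this kind could look roughly like the sketch below. The signature list is a hypothetical sample built from a few of the dorks in Table 1, and the two sensitive-file patterns are illustrative assumptions; a real deployment would load and maintain a much larger set from public dork databases.

```python
import re

# Hypothetical sample signatures, taken from dorks shown in Table 1.
KNOWN_DORK_SIGNATURES = [
    "powered by oscommerce",
    "powered by e107",
    "powered by zen cart",
    "com_rokdownloads",
]

# Hypothetical patterns for queries that look for sensitive files.
SENSITIVE_FILE_PATTERNS = [
    re.compile(r"/etc/(passwd|shadow)\b", re.IGNORECASE),
    re.compile(r"\b(filetype|ext):(sql|mdb|bak)\b", re.IGNORECASE),
]


def normalize(query: str) -> str:
    """Lowercase the query and collapse runs of whitespace."""
    return " ".join(query.lower().split())


def classify_query(query: str) -> str:
    """Return 'known-dork', 'sensitive-file' or 'ok' for one search query."""
    text = normalize(query)
    if any(signature in text for signature in KNOWN_DORK_SIGNATURES):
        return "known-dork"
    if any(pattern.search(text) for pattern in SENSITIVE_FILE_PATTERNS):
        return "sensitive-file"
    return "ok"


for q in ['"Powered By Oscommerce" catalog', "filetype:sql password", "weather in Zurich"]:
    print(q, "->", classify_query(q))
```

A match on its own does not prove abuse, since a legitimate user may type the same words; it is the combination with the distribution and user-agent signals that makes a source worth flagging.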







A list of IPs suspected of being part of a botnet and a pattern of queries from the botnet can be extracted from the
suspicious traffic that is flagged by the analysis. Using these blacklists, search engines can then:
     › Apply strict anti-automation policies (e.g. using CAPTCHA) to IP addresses that are blacklisted. Google has been
       known [3] to use CAPTCHA in recent years when a client host exhibits suspicious behavior. However, it appears that this
       is motivated at least partly by a desire to fight Search Engine Optimization and preserve the engine’s computational
       resources, and less by security concerns. Smaller search engines rarely resort to defenses more sophisticated than
       applying timeouts between queries from the same IP, which are easily circumvented by automated botnets.
     › Identify additional hosts that exhibit the same suspicious behavior pattern in order to update the IP blacklist.
Search engines can use the IP blacklist to issue warnings to the registered owners of the IPs that their machines may have
been compromised by attackers. Such a proactive approach could help make the Internet safer, instead of just settling for
limiting the damage caused by compromised hosts.
Recommendations to the Organization
Organizations should be aware that with the efficient and thorough indexing of corporate information, including Web
applications, the exposure of vulnerable applications is bound to occur. While attackers are mapping out these targets, it
is essential that organizations prepare against exploits tailored to these vulnerabilities. This can be done by deploying
runtime application layer security controls:
     › A Web Application Firewall should detect and block attempts to exploit application vulnerabilities.
     › Reputation-based controls could block attacks originating from known malicious sources. As our 2011 H1 Web
       Application Attack Report (WAAR) has shown, attacks are automated. A request known to be generated by an
       automated process, such as one coming from a known active botnet source, should be flagged as malicious.
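
As an illustration of the reputation-based control described above, the sketch below checks a request's source address against a list of known-bad addresses and networks. The class and function names are hypothetical, the addresses come from the reserved documentation ranges, and a production control would draw on a maintained reputation feed rather than a hard-coded list.

```python
import ipaddress


class ReputationList:
    """Holds known-bad addresses and networks (illustrative only)."""

    def __init__(self, entries):
        # strict=False lets single hosts ("192.0.2.7") and CIDR blocks
        # ("198.51.100.0/24") both be parsed as networks.
        self._networks = [ipaddress.ip_network(e, strict=False) for e in entries]

    def is_listed(self, address: str) -> bool:
        ip = ipaddress.ip_address(address)
        return any(ip in network for network in self._networks)


def decide(address: str, reputation: ReputationList) -> str:
    """Return 'block' for listed sources and 'allow' otherwise."""
    return "block" if reputation.is_listed(address) else "allow"


reputation = ReputationList(["192.0.2.7", "198.51.100.0/24"])
for source in ["198.51.100.23", "192.0.2.7", "203.0.113.9"]:
    print(source, "->", decide(source, reputation))
```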




Hacker Intelligence Initiative Overview
The Imperva Hacker Intelligence Initiative goes inside the cyber-underground and provides analysis of the trending hacking
techniques and interesting attack campaigns from the past month. A part of Imperva’s Application Defense Center research
arm, the Hacker Intelligence Initiative (HII), is focused on tracking the latest trends in attacks, Web application security and
cyber-crime business models with the goal of improving security controls and risk management processes.


[3] See: http://googleonlinesecurity.blogspot.com/2007/07/reason-behind-were-sorry-message.html


Imperva                                             Tel: +1-650-345-9000
3400 Bridge Parkway, Suite 200                      Fax: +1-650-345-9004
Redwood City, CA 94065                              www.imperva.com

© Copyright 2011, Imperva
All rights reserved. Imperva, SecureSphere, and “Protecting the Data That Drives Business” are registered trademarks of Imperva.
All other brand or product names are trademarks or registered trademarks of their respective holders. #HII-AUGUST-2011-0811rev1

TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
PDF
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
PDF
Approach and Philosophy of On baking technology
PPTX
Cloud computing and distributed systems.
PDF
A comparative analysis of optical character recognition models for extracting...
PDF
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
NewMind AI Weekly Chronicles - August'25-Week II
Machine Learning_overview_presentation.pptx
Building Integrated photovoltaic BIPV_UPV.pdf
Assigned Numbers - 2025 - Bluetooth® Document
Programs and apps: productivity, graphics, security and other tools
Dropbox Q2 2025 Financial Results & Investor Presentation
A Presentation on Artificial Intelligence
Unlocking AI with Model Context Protocol (MCP)
Encapsulation theory and applications.pdf
Advanced methodologies resolving dimensionality complications for autism neur...
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
Per capita expenditure prediction using model stacking based on satellite ima...
The Rise and Fall of 3GPP – Time for a Sabbatical?
MIND Revenue Release Quarter 2 2025 Press Release
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
Approach and Philosophy of On baking technology
Cloud computing and distributed systems.
A comparative analysis of optical character recognition models for extracting...
Build a system with the filesystem maintained by OSTree @ COSCUP 2025

Hii the convergence_of_google_and_bots_-_searching_for_security_vulnerabilities_using_automated_botnets

  • 1. August 2011. Hacker Intelligence Initiative, Monthly Trend Report #3. Hacker Intelligence Summary Report – The Convergence of Google and Bots: Searching for Security Vulnerabilities using Automated Botnets. In this monthly report from Imperva’s Hacker Intelligence Initiative (HII), we describe how popular search engines are used as an attack platform to retrieve sensitive data, a.k.a. “Google Hacking”. This attack is further enhanced by deploying bots to automate the process and to evade the anti-automation detection techniques commonly deployed by search engine providers. Although Google Hacking has been around, in name, for some time, new innovations by hackers require another, closer look. Specifically, Google and other search engines put anti-automation measures in place to stop hackers from abusing search. However, by using distributed bots, hackers take advantage of the bots’ dispersed nature, giving search engines the impression that individuals are performing routine searches. The reality? Hackers are conducting cyber reconnaissance on a massive scale. Imperva’s Application Defense Center (ADC) has followed a particular botnet and witnessed its usage against a well-known search engine provider. By tracking this botnet, the ADC found how attackers lay the groundwork to simplify and automate the next stages in an attack campaign against web applications. In this report, we describe the steps that hackers take to leverage the power of search engines to collect attack targets on a massive scale. Our findings show that during an attack, hackers can generate more than 80,000 daily queries to probe the Web for vulnerable Web applications. We provide essential advice to organizations on how to prepare against exploits tailored to these vulnerabilities.
We also propose potential solutions that leading search engines such as Google, Bing and Yahoo can employ to address the growing problem of hackers using their platforms as an attack tool.
An Overview of Google Hacking
On the Internet, search engines have emerged as powerful tools in an attacker’s arsenal, providing a way to gather information about a target and find potential vulnerabilities in an anonymous and risk-free fashion. This activity is typically called “Google Hacking”. Although the name emphasizes the search-engine giant, it pertains to all search engine providers. Collecting information about an organization can set the stage for hackers to devise an attack tailored for a known application. The specialized exploitation of known vulnerabilities may lead to contaminated web sites, data theft, data modification, or even a compromise of company servers. Search engines can be directed to return results that are focused on specific potential targets by using a specific set of query operators. For example, the attacker may focus on all potential victims in a specified geographic location (e.g. a single country). In this case, the query includes a “location” search operator. In another scenario, an attacker may want to target all vulnerabilities in a specific web site, and achieves this by issuing different queries containing the “site” search operator. These particular search queries are commonly referred to as “Google Dorks”, or simply “Dorks”. Automating the querying and the parsing of results enables the attacker to issue a large number of queries, examine all the returned results and get a filtered list of potentially exploitable sites in a very short time and with minimal effort. In order to block automated search campaigns, today’s search engines deploy detection mechanisms which are based on the IP address of the originating request.
  • 2. What’s new about this attack campaign that we witnessed? Our investigation has shown that attackers are able to overcome these detection techniques by distributing the queries across different machines. This is achieved by employing a network of compromised machines, better known as a botnet. Hackers also gain the secondary benefit of hiding their identity behind these bots, since it is the compromised host which actually performs the search queries. In effect, the attacker adds a layer of indirection between herself and the automated search queries. This makes the task of tracing the malicious activity back to the individual attacker all the more difficult.
    The Hacker’s 4 Steps for an Industrialized Attack:
    1. Get a botnet. This is usually done by renting a botnet from a bot farmer who has a global network of compromised computers under his control.
    2. Obtain a tool for coordinated, distributed searching. This tool is deployed to the botnet agents and it usually contains a database of dorks.
    3. Launch a massive search campaign through the botnet. Our observations show that there is an automated infrastructure to control the distribution of dorks and the examination of the results between botnet parts.
    4. Craft a massive attack campaign based on search results. With the list of potentially vulnerable resources, the attacker can create a script, or use a ready-made one, to craft targeted attack vectors that attempt to exploit vulnerabilities in pages retrieved by the search campaign. Attacks include infecting web applications, compromising corporate data or stealing sensitive personal information.
    Detailed Analysis
    Mining Search Engines for Attack Targets
    Search engine mining can be used by attackers in multiple ways.
Exposing neglected sensitive files and folders, collecting network intelligence from exposed logs and detecting unprotected network-attached devices are some of the perks of having access to this huge universal index. Our report focuses on one specific usage: massively collecting attack targets. Specially crafted search queries can be constructed to detect web resources that are potentially vulnerable. There is a wide variety of indicators, ranging from distinguishable resource names, through banners of specific products, to specific error messages. The special search terms, commonly referred to as “Dorks” [1], combine search terms and operators that usually correlate the type of resource with its contents. Dorks are commonly exchanged between hackers in forums. Comprehensive lists of dorks are also being made available through various web sites (both public and underground). Examples include the legendary Google Hacking Database at http://johnny.ihackstuff.com/ghdb/ and the up-to-date sites http://www.1337day.com/webapps and http://www.exploit-db.com/google-dorks/. As the latter name suggests, the site contains an exploit database, demonstrating how dorks and exploits go hand in hand.
[1] http://www.danscourses.com/Network-Security+/search-engine-hacking-471.html
  • 3. Figure 1: Banner from the Google Hacking Database. Figure 2: Banners from the Exploit Database.
  • 4. Some resources classify dorks according to platform or usage, as can be seen from the screenshot below. Figure 3: Searching dorks by class. An attacker armed with a browser and a dork can start listing potential attack targets. By using search engine results an attacker not only lists vulnerable servers but also gets a pretty accurate idea as to which resources within each server are potentially vulnerable.
  • 5. For example, the following query returns results of online shopping sites containing the Oscommerce application. Figure 4: Results returned from a dork search.
  • 6. The following screenshot shows results returned by a dork search for FTP configuration information. Figure 5: Results returned from a dork search.
    Automating the Usage of Dorks
    Tools to automate the use of dorks have been created over the years by attacker groups. Some of them are desktop tools and some are accessible as an online service. Some automate just the collection of targets, and others automate the construction of the exploit vector and the attack itself. Figure 6: Desktop tool for automated Google Hacking.
  • 7. Figure 7: Online service for automated search and attack campaigns.
    In view of this threat, most search engines have implemented anti-automation measures that rely (mainly) on the following attributes:
    › Number of search queries from a single source (IP / session)
    › Frequency of queries from a single source
    › Massive retrieval of results for a single query
    The anti-automation measures taken by search engine operators forced attackers to look for new alternatives for search engine hacking automation. They found one in the form of botnet-based search engine mining. By harnessing the power of botnets, attackers launch distributed, coordinated search campaigns that evade the standard anti-automation mechanisms. The inherently distributed nature of the attack helps avoid the single-source issue. The use of special search operators that artificially split the search space (e.g. by country or by partial domain) overcomes the limitation enforced by search engines on the number of results that can be retrieved per query. In addition, the attacker creates yet another layer of indirection through the use of “search proxies”. This extra layer makes it even harder to identify the true source of the attack and the whereabouts of the attacker. In the following section we will show evidence of these techniques as seen in the wild.
    A Typical Dork-Search Attack
    We observed a specific botnet attack on a popular search engine during May-June 2011. The attacker used dorks that match vulnerable web applications and search operators that were tailored to the specific search engine. For each unique search query, the botnet examined dozens and even hundreds of returned results using paging parameters in the query. The volume of attack traffic was huge: nearly 550,000 queries (up to 81,000 daily queries, and 22,000 daily queries on average) were requested during the observation period.
It is clear that the attacker took advantage of the bandwidth available to the dozens of controlled hosts in the botnet to seek and examine vulnerable applications.
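To illustrate why checks based on the number and frequency of queries from a single source miss a distributed campaign, here is a minimal Python sketch of a sliding-window limiter keyed by source IP. The class name, the threshold of 10 queries per minute and the simulated traffic are assumptions made for illustration and do not describe any real search engine; the figures of 40 hosts and one request every 2 minutes are the ones this report gives for the observed botnet.

```python
from collections import defaultdict, deque


class PerSourceLimiter:
    """Flags a source once it exceeds max_queries within window_s seconds."""

    def __init__(self, max_queries=10, window_s=60):  # assumed thresholds
        self.max_queries = max_queries
        self.window_s = window_s
        self._seen = defaultdict(deque)  # source -> timestamps of recent queries

    def allow(self, source, now):
        recent = self._seen[source]
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window_s:
            recent.popleft()
        recent.append(now)
        return len(recent) <= self.max_queries


def simulate(hosts, interval_s, duration_s, limiter):
    """Each host sends one query every interval_s seconds; returns (sent, blocked)."""
    sent = blocked = 0
    for t in range(0, duration_s, interval_s):
        for h in range(hosts):
            sent += 1
            if not limiter.allow(f"host-{h}", t):
                blocked += 1
    return sent, blocked


# One noisy scraper: 1 query per second for an hour.
single = simulate(hosts=1, interval_s=1, duration_s=3600, limiter=PerSourceLimiter())
# Distributed campaign: 40 hosts, each sending 1 query every 120 seconds.
spread = simulate(hosts=40, interval_s=120, duration_s=3600, limiter=PerSourceLimiter())
print(single, spread)
```

In this toy model the single scraper is blocked almost immediately, while none of the 1,200 queries from the distributed hosts trips the per-source threshold, which is the evasion effect the report describes.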
  • 8. Figure 8: Dork queries per hour. Figure 9: Dork queries per day.
    Search Engine Dorks
    Most of the dorks used in the observed attack were related to Content Management Systems and e-commerce applications. Content Management Systems manage the work flow of users in a collaborative environment and enable a large number of people to contribute to a site and to share stored data (for example, an eCommerce system or a forum for users of a game to share playing tips). These systems are naturally more open and allow external users to contribute content and even upload entire files. Thus, the security vulnerabilities they contain can be easily exposed and exploited. E-commerce systems, on the other hand, manage and store financial information about their customers, and a successful attack on such a site can be immediately monetized.
  • 9. Some examples of the observed dorks used in the attack are shown below. As can be seen, the search terms include various free-text words that identify vulnerable applications, as well as search operators that focus the query on specific sites, domains or countries.
    Table 1: Examples of observed dork queries
    Search Query | Target application | Example of vulnerabilities associated with the application [2]
    “Powered By Oscommerce” ‘catalog’ | Oscommerce: online shop e-commerce solution | SQL injection vulnerability in shopping_cart.php (CVE-2006-4297)
    “powered by oscommerce” shoping | Oscommerce | See above
    “powered by e107” site:.ch | e107 CMS; limited to servers in Switzerland | Allows remote attackers to execute arbitrary PHP code (CVE-2010-2099)
    “*.php?cPath=25” ranking | Oscommerce | See above
    “powered by osCommerce” | Oscommerce | See above
    “powered by zen cart” payment.php | Zen Cart Ecommerce; e-commerce web site platform | Allows remote attackers to execute arbitrary SQL (CVE-2009-2254)
    “powered by e107” global | e107 CMS | See above
    “fpw.php” site:.ir | e107 CMS - password reset page; limited to servers in Iran | See above
    Herzlich Willkommen Gast! site:.de | Oscommerce German welcome page; limited to servers in Germany | See above
    “powered by e107” site:.org | e107 CMS; limited to domains with org suffix | See above
    “by BigCommerce” joomla.ze | BigCommerce e-commerce software integrated with Joomla CMS | See above
    “The Appserv Open Project” site:.th | AppServ application development platform; limited to servers in Thailand | XSS vulnerability allows remote attackers to inject arbitrary web script (CVE-2008-2398)
    “Powered by e107 Forum System” site:.com | e107 CMS; limited to domains with com suffix | See above
    Joomla! es Software Libre distribuido bajo licencia GNU/GPL. | Joomla CMS - Spanish version | See above
    “com_rokdownloads” site:jp | Joomla CMS; limited to servers in Japan | Directory Traversal vulnerability in RokDownloads component of Joomla (CVE-2010-1056)
    The additional operators (domain, language, etc.), as well as the specification of the wanted page of results, are used for several purposes:
    › Creating more focused result sets that allow construction of more accurate attack vectors
    › Artificially splitting the search space in a way that distributes the workload of exhaustively examining the entire result set between the bots in the net
    Overall we have seen 4719 different dork variations being used in the attack (where “powered by e107” site:.ch and “powered by e107” site:.fr are variations of the same basic dork). The 30 most-used dorks were related to the osCommerce e-commerce solution, and each of these variations appeared in 1,600-3,900 queries. The e107 application was the next most popular attack target, based on the number of observed dorks.
    [2] For the applications that the attackers sought, these are examples of publicly disclosed vulnerabilities. However, these are not necessarily the vulnerabilities that the attackers actually tried to exploit.
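The notion of “variations of the same basic dork” can be made concrete with a small analysis sketch in Python that strips the site operator from a logged query and groups queries by what remains. This is not the method used by the ADC; the regular expression, the case folding and the function names are assumptions introduced here, and the sample queries are taken from the examples in Table 1 and the text above.

```python
import re
from collections import defaultdict

# Matches a site operator such as site:.ch or site:jp (forms seen in Table 1).
SITE_OP = re.compile(r"\bsite:(\S+)", re.IGNORECASE)


def split_dork(query):
    """Return (base_dork, site_argument or None) for one search query."""
    match = SITE_OP.search(query)
    site = match.group(1).lower() if match else None
    base = SITE_OP.sub("", query)
    base = " ".join(base.split()).lower()  # assumption: ignore case and spacing
    return base, site


def group_variations(queries):
    """Map each base dork to the set of site arguments it was seen with."""
    groups = defaultdict(set)
    for q in queries:
        base, site = split_dork(q)
        groups[base].add(site)
    return dict(groups)


sample = [
    '"powered by e107" site:.ch',
    '"powered by e107" site:.fr',
    '"Powered by e107" site:.org',
    '"powered by osCommerce"',
]
groups = group_variations(sample)
for base, sites in groups.items():
    print(base, sorted(s for s in sites if s))
```

Grouping this way turns many surface variations into a short list of base dorks, each with the set of site arguments used to split its search space.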
  • 10. Botnet Hosts
    Search engine providers identify malicious attacks based on a high volume or a high frequency of queries from the same source. Yet we have witnessed how attackers bypass these detection mechanisms by employing a botnet. During our observation period we identified 40 different IP addresses of hosts that participate in the attacking botnet. The hosts are not all active at the same time. The attack is distributed and coordinated. Thus, different hosts handle different dorks and each host produces a low rate of search activity. We found that most hosts issue no more than one request every 2 minutes. However, four hosts together issue 2-4 requests per minute. This rate does not trigger the search engine’s anti-automation policy, as it normally cannot be considered abusive. In addition, the requests simulate genuine browser activity rather than a script by constantly changing the user-agent field. Consequently, the attack campaign can go on for a long time, allowing the attacker to collect a substantial number of target resources. An example of a coordinated distributed dork search was the search for the dork “e107” using 99 different arguments for the site search operator: 5 different hosts issued these queries over the entire observation period. Figure 10: Hosts searching for the dork “e107” with a “site” operator. Figure 11: Queries for the dork “e107” with a “site” operator.
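A pattern such as one dork queried with 99 site arguments from 5 hosts is invisible per source but stands out once queries are correlated across sources. The Python sketch below shows one possible way to do that. The function name, the thresholds and the log entries are hypothetical; the IP addresses come from the 192.0.2.0/24 range reserved for documentation.

```python
from collections import defaultdict


def find_distributed_dorks(log, min_sites=3, min_hosts=2):
    """log: iterable of (source_ip, base_dork, site_argument) tuples.

    Returns {base_dork: (distinct_sites, distinct_hosts)} for dorks whose
    site arguments and sources both reach the (assumed) thresholds.
    """
    sites = defaultdict(set)
    hosts = defaultdict(set)
    for source_ip, base_dork, site in log:
        if site is not None:  # only queries that use a site operator
            sites[base_dork].add(site)
            hosts[base_dork].add(source_ip)
    return {
        dork: (len(sites[dork]), len(hosts[dork]))
        for dork in sites
        if len(sites[dork]) >= min_sites and len(hosts[dork]) >= min_hosts
    }


# Hypothetical log entries for illustration only.
log = [
    ("192.0.2.1", "e107", ".ch"),
    ("192.0.2.2", "e107", ".fr"),
    ("192.0.2.3", "e107", ".de"),
    ("192.0.2.1", "e107", ".ir"),
    ("192.0.2.9", "holiday recipes", None),
    ("192.0.2.9", "weather", ".com"),
]
flagged = find_distributed_dorks(log)
print(flagged)
```

Real thresholds would need tuning against legitimate traffic, since ordinary users also repeat popular queries with different site restrictions.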
  • 11. The botnet hosts are distributed all over the world. This is not surprising, since the attacker does not care about the location or ownership of the abused hosts and just needs the ability to take control of these machines and add them to her network of compromised computers. Thus, the identities of the botnet hosts give no direct indication of the identity of the hacker that uses them for malicious attacks. However, it is interesting to note that the observed botnet has a disproportionate number of servers in Iran, Hungary and Germany, and a low number of servers in the United States. Also, some of the dork queries specifically limited results to servers in Iran or Germany. This combination may be a hint to the interests of the attacker. Figure 12: Number of hosts issuing dork queries.
    Table 2: Countries of hosts issuing dork queries
    Country                     # dork queries    Percentage of dork queries
    Islamic Republic of Iran    227554            41
    Hungary                     136445            25
    Germany                     80448             15
    United States               19237             3.5
    Chile                       17365             3
    Thailand                    16717             3
    Republic of Korea           11872             2
    France                      10906             2
    Belgium                     10661             2
    Brazil                      7559              1.5
    Other                       8892              2
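As a quick consistency check on Table 2, the per-country counts can be summed and converted to shares with a few lines of Python. The counts are copied from the table; the variable names are illustrative, and the computed shares can differ slightly from the rounded percentages printed in the report.

```python
# Query counts per country, copied from Table 2.
counts = {
    "Islamic Republic of Iran": 227554,
    "Hungary": 136445,
    "Germany": 80448,
    "United States": 19237,
    "Chile": 17365,
    "Thailand": 16717,
    "Republic of Korea": 11872,
    "France": 10906,
    "Belgium": 10661,
    "Brazil": 7559,
    "Other": 8892,
}

total = sum(counts.values())
# Share of all observed dork queries, in percent, rounded to one decimal.
shares = {country: round(100 * n / total, 1) for country, n in counts.items()}

print(total)
for country, share in shares.items():
    print(f"{country}: {share}%")
```

The total of 547,656 agrees with the “nearly 550,000 queries” reported for the observation period.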
  • 12. Figure 13: Countries of hosts issuing dork queries.
    Summary and Conclusions
    We have observed a high-volume mining campaign by a botnet through a popular search engine. The campaign was focused on finding resources that use specific content management frameworks that can be exploited. While none of the components of the attack (use of botnets deployed on compromised servers, exploiting search engines using dorks) is unique, it is interesting to observe the potential for automation and flexibility of the attack. Each component may be replaced or reconfigured easily, while the attacker and tools remain hidden from the targeted servers and even from the abused search engine. The outcome is that the attacker obtains a map of hackable targets on the Web. This type of abuse should concern both search engine providers and organizations. Search engines have a responsibility to prevent attackers from taking advantage of their platform to carry out their attacks. At the same time, search engines are in a unique position to identify botnets that abuse their services, thus shedding light on the attackers. Organizations should protect their applications from being publicly exposed through the search engines.
    Recommendations to the Search Engines
    Search engine providers are expected to perform a detailed analysis of network traffic which allows the flagging of suspicious anomalies in the query traffic. Search engines typically look for low-level anomalies like a high frequency or high volume of requests from a host. As this report indicates, they should start looking for unusual, suspicious queries, such as those that are known to be part of public dork databases, or queries that look for known sensitive files (/etc files or database data files).
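One way to picture the content-based check recommended above is to match incoming queries against a list of known dorks and count matches per source. The Python sketch below is a simplified illustration: the matching rule (case-insensitive substring match with quotes ignored), the `min_hits` threshold and the sample data are assumptions, and the IP addresses are from the 192.0.2.0/24 documentation range.

```python
def normalize(text):
    """Lower-case, drop double quotes and collapse whitespace."""
    return " ".join(text.lower().replace('"', " ").split())


def build_blacklist(query_log, known_dorks, min_hits=2):
    """query_log: iterable of (source_ip, query).

    Returns the set of IPs whose queries matched a known dork
    at least min_hits times (an assumed threshold).
    """
    patterns = [normalize(d) for d in known_dorks]
    hits = {}
    for source_ip, query in query_log:
        q = normalize(query)
        if any(p in q for p in patterns):
            hits[source_ip] = hits.get(source_ip, 0) + 1
    return {ip for ip, n in hits.items() if n >= min_hits}


# Dorks taken from Table 1; log entries are hypothetical.
known_dorks = ['"powered by e107"', '"powered by zen cart" payment.php']
query_log = [
    ("192.0.2.10", '"Powered by e107" site:.ch'),
    ("192.0.2.10", '"powered by e107" site:.fr'),
    ("192.0.2.11", "cheap flights to zurich"),
    ("192.0.2.12", '"powered by zen cart" payment.php'),
]
blacklist = build_blacklist(query_log, known_dorks)
print(blacklist)
```

Requiring more than one match before listing a source is a deliberate choice in this sketch, since a single dork-like query may come from a curious user or a site owner checking their own exposure.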
  • 13. A list of IPs suspected of being part of a botnet and a pattern of queries from the botnet can be extracted from the suspicious traffic that is flagged by the analysis. Using these blacklists, search engines can then:
    › Apply strict anti-automation policies (e.g. using CAPTCHA) to IP addresses that are blacklisted. Google has been known [3] to use CAPTCHA in recent years when a client host exhibits suspicious behavior. However, it appears that this is motivated at least partly by the desire to fight Search Engine Optimization and preserve the engine’s computational resources, and less by security concerns. Smaller search engines rarely resort to defenses more sophisticated than applying timeouts between queries from the same IP, which are easily circumvented by automated botnets.
    › Identify additional hosts which exhibit the same suspicious behavior pattern, in order to update the IP blacklist.
    Search engines can use the IP blacklist to issue warnings to the registered owners of the IPs that their machines may have been compromised by attackers. Such a proactive approach could help make the Internet safer, instead of just settling for limiting the damage caused by compromised hosts.
    Recommendations to the Organization
    Organizations should be aware that, given the efficient and thorough indexing of corporate information (including Web applications), the exposure of vulnerable applications is bound to occur. While attackers are mapping out these targets, it is essential that organizations prepare against exploits tailored to these vulnerabilities. This can be done by deploying runtime application-layer security controls:
    › A Web Application Firewall should detect and block attempts at exploiting application vulnerabilities.
    › Reputation-based controls could block attacks originating from known malicious sources. As our 2011 H1 Web Application Attack Report (WAAR) has shown, attacks are automated.
A request known to be generated by an automated process, such as one coming from a known active botnet source, should be flagged as malicious.
Hacker Intelligence Initiative Overview
The Imperva Hacker Intelligence Initiative goes inside the cyber-underground and provides analysis of the trending hacking techniques and interesting attack campaigns from the past month. A part of Imperva’s Application Defense Center research arm, the Hacker Intelligence Initiative (HII) is focused on tracking the latest trends in attacks, Web application security and cyber-crime business models, with the goal of improving security controls and risk management processes.
[3] See: http://googleonlinesecurity.blogspot.com/2007/07/reason-behind-were-sorry-message.html
Imperva, 3400 Bridge Parkway, Suite 200, Redwood City, CA 94065. Tel: +1-650-345-9000. Fax: +1-650-345-9004. www.imperva.com
© Copyright 2011, Imperva. All rights reserved. Imperva, SecureSphere, and “Protecting the Data That Drives Business” are registered trademarks of Imperva. All other brand or product names are trademarks or registered trademarks of their respective holders. #HII-AUGUST-2011-0811rev1