Introduction
This chapter is not about finding sensitive data during an assessment as much as
it is about what the “bad guys” might do to troll for the data. The examples presented
generally represent the lowest-hanging fruit on the security
tree. Hackers target this information on a daily basis. To protect against this type
of attacker, we need to be fairly candid about the worst-case possibilities. We
won’t be overly candid, however.
We start by looking at some queries that can be used to uncover usernames,
the less important half of most authentication systems. The value of a username is
often overlooked, but an entire multimillion-dollar
security system can be shattered through skillful crafting of even the
smallest, most innocuous bit of information.
Next, we take a look at queries that are designed to uncover passwords. Some
of the queries we look at reveal encrypted or encoded passwords, which will take
a bit of work on the part of an attacker to use to his or her advantage. We also
take a look at queries that can uncover cleartext passwords. These queries are some
of the most dangerous in the hands of even the most novice attacker. What could
make an attack easier than handing a username and cleartext password to an
attacker?
We wrap up by discussing the very real possibility of uncovering
highly sensitive data such as credit card information and information used to
commit identity theft, such as Social Security numbers. Our goal here is to
explore ways of protecting against this very real threat. To that end, we don’t go
into details about uncovering financial information and the like. If you’re a “dark
side” hacker, you’ll need to figure these things out on your own.
Searching for Usernames
Most authentication mechanisms use a username and password to protect information.
To get through the “front door” of this type of protection, you’ll need to
determine usernames as well as passwords. Usernames also can be used for social
engineering efforts, as we discussed earlier.
Many methods can be used to determine usernames. In Chapter 10, we
explored ways of gathering usernames via database error messages. In Chapter 8,
we explored Web server and application error messages that can reveal various
information, including usernames. These indirect methods of locating usernames
are helpful, but an attacker could also target usernames directly with a
query like “your username is”. This phrase can locate help pages that describe the
username creation process. An attacker could combine that description with
information gleaned from other sources, such as Google Groups posts or phone
listings, to guess likely usernames. The usernames could then be recycled into
various other phases of the attack, such as a worm-based spam campaign or a
social-engineering attempt. An attacker can gather usernames from a variety of
sources, as shown in the sample queries listed in the following table.
Sample Queries That Locate Usernames
| Query | Description |
| inurl:admin inurl:userlist | Generic userlist files |
| inurl:admin filetype:asp inurl:userlist | Generic userlist files |
| inurl:php inurl:hlstats intext:Server Username | Half-life statistics file, lists username and other information |
| filetype:ctl inurl:haccess.ctl Basic | Microsoft FrontPage equivalent of htaccess shows Web user credentials |
| filetype:reg reg intext:"internet account manager" | Microsoft Internet Account Manager can reveal usernames and more |
| filetype:wab wab | Microsoft Outlook Express Mail address books |
| filetype:mdb inurl:profiles | Microsoft Access databases containing (user) profiles |
| index.of perform.ini | mIRC IRC ini file can list IRC usernames and other information |
| inurl:root.asp?acs=anon | Outlook Mail Web Access directory can be used to discover usernames |
| filetype:conf inurl:proftpd.conf -sample | PROFTP FTP server configuration file reveals username and server information |
| filetype:log username putty | PUTTY SSH client logs can reveal usernames and server information |
| filetype:rdp rdp | Remote Desktop Connection files reveal user credentials |
| intitle:index.of .bash_history | UNIX bash shell history reveals commands typed at a bash command prompt; usernames are often typed as argument strings |
| intitle:index.of .sh_history | UNIX shell history reveals commands typed at a shell command prompt; usernames are often typed as argument strings |
| "index of " lck | Various lock files list the user currently using a file |
| +intext:webalizer +intext:"Total Usernames" +intext:"Usage Statistics for" | Webalizer Web statistics page lists Web usernames and statistical information |
| filetype:reg reg HKEY_CURRENT_USER username | Windows Registry exports can reveal usernames and other information |
Underground Googling
Searching for a Known Filename
Remember that there are several ways to search for a known filename.
One way relies on locating the file in a directory listing, like intitle:index.of
install.log. Another, often better, method relies on the filetype operator,
as in filetype:log inurl:install.log. Directory listings are not all that
common. Google will crawl a link to a file in a directory listing, meaning
that the filetype method will find both directory listing entries and
files crawled in other ways.
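The two query forms in this sidebar can be sketched in a few lines of Python. This is only an illustration: the helper name known_filename_queries is hypothetical, it does nothing but format strings, and it assumes the filename carries an extension that the filetype operator can use.

```python
import os


def known_filename_queries(filename):
    """Build the two query forms described above for a known filename.

    Hypothetical helper for illustration; it only formats strings.
    """
    # Form 1: look for the file inside a directory listing.
    listing_query = f"intitle:index.of {filename}"

    # Form 2: pass the extension to filetype: and the full name to inurl:.
    _, ext = os.path.splitext(filename)
    if not ext:
        raise ValueError("filename needs an extension for the filetype: form")
    filetype_query = f"filetype:{ext.lstrip('.')} inurl:{filename}"

    return listing_query, filetype_query


if __name__ == "__main__":
    for query in known_filename_queries("install.log"):
        print(query)
```

Dot-files such as .bash_history have no extension in this sense, so the sketch rejects them and only the directory-listing form applies to those names.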
In some cases, usernames can be gathered from Web-based statistical programs
that check Web activity. The Webalizer program shows all sorts of information
about a Web server’s usage. Output files for the Webalizer program can be
located with a query such as intext:webalizer intext:"Total Usernames" intext:"Usage
Statistics for". Among the information displayed is the username that was used to
connect to the Web server, as shown in Figure 9.2. In some cases, however, the
usernames displayed are not valid or current, but the “Visits” column lists the
number of times a user account was used during the capture period. This enables
an attacker to easily determine which accounts are more likely to be valid.
The Windows registry holds all sorts of authentication information, including
usernames and passwords. Though it is fairly uncommon to locate
live, exported Windows registry files on the Web, at the time of this writing
there are nearly 100 hits on the query filetype:reg HKEY_CURRENT_USER
username, which locates Windows registry files that contain the word username
and, in some cases, passwords.
As any talented attacker or security person will tell you, it’s rare to get information
served to you on a silver platter. Most decent finds take a bit of persistence,
creativity, intelligence, and just a bit of good luck. For example, consider
the Microsoft Outlook Web Access portal, which can be located with a query
like inurl:root.asp?acs=anon. At the time of this writing, fewer than 50 sites are
returned by this query, even though there are certainly more than 50 sites running
the Microsoft Web-based mail portal. Regardless of how you might locate a site
running this e-mail gateway, it’s not uncommon for the site to host a public
directory (denoted “Find Names,” by default).
The public directory allows access to a search page that can be used to find
users by name. In most cases, wildcard searching is not allowed, meaning that a
search for * will not return a list of all users, as might be expected. Entering a
search for a space is an interesting idea, since most user descriptions contain a
space, but most large directories will return the error message “This query would
return too many addresses!” Applying a bit of creativity, an attacker could begin
searching for individual common letters, such as the “Wheel of Fortune letters”
R, S, T, L, N, and E. Eventually one of these searches will most likely reveal a list
of user information.
Once a list of user information is returned, the attacker can then recycle the
search with words contained in the user list, searching for the words Voyager,
Freshmen, or Campus, for example. Those results can then be recycled, eventually
resulting in a nearly complete list of user information.
Searching for Passwords
Password data, one of the “Holy Grails” during a penetration test, should be protected.
Unfortunately, many examples of Google queries can be used to locate
passwords on the Web, as shown in Table 9.2.
Table 9.2 Queries That Locate Password Information
| Query | Description |
| inurl:/db/main.mdb | ASP-Nuke passwords |
| filetype:cfm "cfapplication name" password | ColdFusion source with potential passwords |
| filetype:pass pass intext:userid | dbman credentials |
| allinurl:auth_user_file.txt | DCForum user passwords |
| eggdrop filetype:user user | Eggdrop IRC user credentials |
| filetype:ini inurl:flashFXP.ini | FlashFXP FTP credentials |
| filetype:url +inurl:"ftp://" +inurl:"@" | FTP bookmarks cleartext passwords |
| inurl:zebra.conf intext:password -sample -test -tutorial -download | GNU Zebra passwords |
| filetype:htpasswd htpasswd | HTTP htpasswd Web user credentials |
| intitle:"Index of" ".htpasswd" "htgroup" -intitle:"dist" -apache -htpasswd.c | HTTP htpasswd Web user credentials |
| intitle:"Index of" ".htpasswd" htpasswd.bak | HTTP htpasswd Web user credentials |
| "http://*:*@www" bob:bob | HTTP passwords (bob is a sample username) |
| "sets mode: +k" | IRC channel keys (passwords) |
| "Your password is * Remember this for later use" | IRC NickServ registration passwords |
| signin filetype:url | JavaScript authentication credentials |
| LeapFTP intitle:"index.of./" sites.ini modified | LeapFTP client login credentials |
| inurl:lilo.conf filetype:conf password -tatercounter2000 -bootpwd -man | LILO passwords |
| filetype:config config intext:appSettings "User ID" | Microsoft .NET application credentials |
| filetype:pwd service | Microsoft FrontPage Service Web passwords |
| intitle:index.of administrators.pwd | Microsoft FrontPage Web credentials |
| "# -FrontPage-" inurl:service.pwd | Microsoft FrontPage Web passwords |
| ext:pwd inurl:_vti_pvt inurl:(Service | authors | administrators) | Microsoft FrontPage Web passwords |
| inurl:perform filetype:ini | mIRC nickserv credentials |
| intitle:"index of" intext:connect.inc | MySQL database credentials |
| intitle:"index of" intext:globals.inc | MySQL database credentials |
| filetype:conf oekakibbs | Oekakibbs user passwords |
| filetype:dat wand.dat | Opera “Magic Wand” Web credentials |
| inurl:ospfd.conf intext:password -sample -test -tutorial -download | OSPF Daemon passwords |
| index.of passlist | Passlist user credentials |
| inurl:passlist.txt | passlist.txt file user credentials |
| filetype:dat "password.dat" | password.dat files |
| inurl:password.log filetype:log | password.log file reveals usernames, passwords, and hostnames |
| filetype:log inurl:"password.log" | password.log files cleartext passwords |
| inurl:people.lst filetype:lst | People.lst generic password file |
| intitle:index.of config.php | PHP Configuration File database credentials |
| inurl:config.php dbuname dbpass | PHP Configuration File database credentials |
| inurl:nuke filetype:sql | PHP-Nuke credentials |
| filetype:conf inurl:psybnc.conf "USER.PASS=" | psyBNC IRC user credentials |
| filetype:ini ServUDaemon | servU FTP Daemon credentials |
| filetype:conf slapd.conf | slapd configuration files root password |
| inurl:"slapd.conf" intext:"credentials" -manpage -"Manual Page" -man: -sample | slapd LDAP credentials |
| inurl:"slapd.conf" intext:"rootpw" -manpage -"Manual Page" -man: -sample | slapd LDAP root password |
| filetype:sql "IDENTIFIED BY" -cvs | SQL passwords |
| filetype:sql password | SQL passwords |
| filetype:ini wcx_ftp | Total Commander FTP passwords |
| filetype:netrc password | UNIX .netrc user credentials |
| index.of.etc | UNIX /etc directories contain various credential files |
| intitle:"Index of..etc" passwd | UNIX /etc/passwd user credentials |
| intitle:index.of passwd passwd.bak | UNIX /etc/passwd user credentials |
| intitle:"Index of" pwd.db | UNIX /etc/pwd.db credentials |
| intitle:Index.of etc shadow | UNIX /etc/shadow user credentials |
| intitle:index.of master.passwd | UNIX master.passwd user credentials |
| intitle:"Index of" spwd.db passwd -pam.conf | UNIX spwd.db credentials |
| filetype:bak inurl:"htaccess|passwd|shadow|htusers" | UNIX various password file backups |
| filetype:inc dbconn | Various database credentials |
| filetype:inc intext:mysql_connect | Various database credentials, server names |
| filetype:properties inurl:db intext:password | Various database credentials, server names |
| inurl:vtund.conf intext:pass -cvs | Virtual Tunnel Daemon passwords |
| inurl:"wvdial.conf" intext: | wvdial dialup user credentials |
| filetype:mdb wwforum | Web Wiz Forums Web credentials |
| "AutoCreate=TRUE password=*" | Website Access Analyzer user passwords |
| filetype:pwl pwl | Windows Password List user credentials |
| filetype:reg reg +intext:"defaultusername" intext:"defaultpassword" | Windows Registry Keys containing user credentials |
| filetype:reg reg +intext:"internet account manager" | Windows Registry Keys containing user credentials |
| "index of/" "ws_ftp.ini" "parent directory" | WS_FTP FTP credentials |
| filetype:ini ws_ftp pwd | WS_FTP FTP user credentials |
| inurl:/wwwboard | wwwboard user credentials |
In most cases, passwords discovered on the Web are either encrypted or
encoded in some way. These passwords can often be fed into a password
cracker such as John the Ripper (www.openwall.com/john) to produce
plaintext passwords that can be used in an attack. Figure 9.6 shows the results of
the search ext:pwd inurl:_vti_pvt inurl:(Service | authors | administrators), which
combines a search for some common Microsoft FrontPage password files.
Exported Windows registry files often contain encrypted or encoded passwords
as well. If a user exports the Windows registry to a file and Google subsequently
crawls that file, a query like filetype:reg intext:"internet account manager"
could reveal interesting keys containing password data.
Note that live, exported Windows registry files are not very common, but it’s
not uncommon for an attacker to target a site simply because of one exceptionally
insecure file. It’s also possible for a Google query to uncover cleartext passwords.
These passwords can be used as is, without having to employ a
password-cracking utility. In these extreme cases, the only challenge is determining
the username as well as the host on which the password can be used. As
shown in Figure 9.8, certain queries will locate all the following information:
usernames, cleartext passwords, and the host that uses that authentication!
There is no magic query for locating passwords, but during an assessment,
remember that the simplest queries directed at a site can have amazing results, as
we discussed in Chapter 7, “Ten Simple Searches.” For example, a query like “Your
password” forgot would locate pages that provide a forgotten password recovery
mechanism. The information from this type of query can be used to formulate
any of a number of attacks against a password. As always, effective social engineering
is a terrific nontechnical solution to “forgotten” passwords.
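For an assessment of a site you are responsible for, directing simple queries at that site amounts to prefixing each one with the site operator. The following is a minimal sketch of that step; the helper name scope_to_site and the domain example.com are placeholders, and the code only builds query strings.

```python
def scope_to_site(queries, domain):
    """Prefix each query with site:<domain> so results stay on one site.

    Hypothetical helper for illustration; blank queries are dropped.
    """
    domain = domain.strip().lower()
    if not domain or " " in domain:
        raise ValueError("domain must be a single host or domain name")
    return [f"site:{domain} {q.strip()}" for q in queries if q.strip()]


if __name__ == "__main__":
    # Queries taken from this chapter; example.com is a placeholder domain.
    generic = [
        '"Your password" forgot',
        'filetype:log inurl:"password.log"',
    ]
    for query in scope_to_site(generic, "example.com"):
        print(query)
```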
Another generic search for password information, intext:(password | passcode |
pass) intext:(username | userid | user), combines common words for passwords and
user IDs into one query. This query returns a lot of results, but the vast majority
of the top hits refer to pages that list forgotten password information, including
either links or contact information. Using Google’s translate feature, found at
http://translate.google.com/translate_t, we could also create multilingual password
searches. Table 9.3 lists common translations for the word password.
Table 9.3 English Translations of the Word Password
| Language | Word | Translation |
| German | password | Kennwort |
| Spanish | password | contraseña |
| French | password | mot de passe |
| Italian | password | parola d’accesso |
| Portuguese | password | senha |
| Dutch | password | Paswoord |
NOTE
The terms username and userid in most languages translate to username
and userid, respectively.
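As a rough sketch of how a multilingual version of the generic query could be assembled, the snippet below joins the translations from Table 9.3 into one OR group. The function names are hypothetical, and quoting the multiword phrases inside the group is an assumption about how the search engine parses them, not something the text confirms.

```python
# Translations copied from Table 9.3.
PASSWORD_WORDS = [
    "password",
    "Kennwort",
    "contraseña",
    "mot de passe",
    "parola d'accesso",
    "senha",
    "Paswoord",
]


def or_group(operator, words):
    """Join words into operator:(a | b | ...), quoting multiword phrases."""
    terms = [f'"{w}"' if " " in w else w for w in words]
    return f"{operator}:(" + " | ".join(terms) + ")"


def multilingual_password_query(user_words=("username", "userid", "user")):
    """Combine the password translations with the usual user ID words."""
    return f"{or_group('intext', PASSWORD_WORDS)} {or_group('intext', user_words)}"


if __name__ == "__main__":
    print(multilingual_password_query())
```

The user ID words are left in English because, as the note above says, they rarely change from language to language.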
Searching for Credit Card Numbers, Social Security Numbers, and More
Most people have heard news stories about Web hackers making off with customer
credit card information. With so many fly-by-night retailers popping up
on the Internet, it’s no wonder that credit card fraud is so prolific. These
mom-and-pop retailers are not the only ones successfully compromised by hackers.
Corporate giants by the hundreds have had financial database compromises over
the years, victims of sometimes very technical, highly focused attackers. What
might surprise you is that it doesn’t take a rocket scientist to uncover live credit
card numbers on the Internet, thanks to search engines like Google. Everything
from credit information to banking data or supersensitive classified government
documents can be found on the Web. Consider the (highly edited) Web page
shown in Figure 9.9.
This document, found using Google, lists hundreds and hundreds of credit
card numbers (including expiration date and card validation numbers) as well as
the owners’ names, addresses, and phone numbers. This particular document also
included phone card (calling card) numbers. Notice the scroll bar on the right-hand
side of Figure 9.9, an indicator that the displayed page is only a small part
of this huge document, like many other documents of its kind. In most cases,
pages that contain these numbers are not “leaked” from online retailers or e-commerce
sites but rather are most likely the fruits of a scam known as phishing,
in which users are solicited via telephone or e-mail for personal information.
Several Web sites, including MillerSmiles.co.uk, document these scams and
hoaxes. Figure 9.10 shows a screen shot of a popular eBay phishing scam that
encourages users to update their eBay profile information.
Once a user fills out this form, all the information is sent via e-mail to the
attacker, who can use it for just about anything.
Tools and Traps
Catching Online Scammers
In some cases, you might be able to use Google to help nab the bad guys.
Phishing scams are effective because the fake page looks like an official
page. To create an official-looking page, the bad guys must have examples
to work from, meaning that they must have visited a few legitimate companies’
Web sites. If the phishing scam was created using text from several
companies’ existing pages, you can key in on specific phrases from the fake
page, creating Google queries designed to round up the servers that hosted
some of the original content. Once you’ve located the servers that contained
the pilfered text, you can work with the companies involved to
extract correlating connection data from their log files. If the scammer visited
each company’s Web page, collecting bits of realistic text, his IP should
appear in each of the log files. Auditors at SensePost (www.sensepost.com)
have successfully used this technique to nab online scam artists.
Unfortunately, if the scammer uses an exact copy of a page from only one
company, this task becomes much more difficult to accomplish.
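The log correlation step described in this sidebar is essentially a set intersection. The sketch below is not the SensePost method itself, only an illustration under stated assumptions: each company supplies access-log lines whose first whitespace-separated field is the client address (as in the common log format), and the function names and sample data are made up.

```python
import ipaddress


def client_ips(lines):
    """Collect the client addresses found in the first field of each line."""
    found = set()
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        try:
            found.add(str(ipaddress.ip_address(fields[0])))
        except ValueError:
            continue  # first field is a hostname or malformed; skip the line
    return found


def common_visitors(logs_by_company):
    """Return the addresses that appear in every company's log."""
    ip_sets = [client_ips(lines) for lines in logs_by_company.values()]
    if not ip_sets:
        return set()
    return set.intersection(*ip_sets)


if __name__ == "__main__":
    # Made-up sample lines using documentation-only address ranges.
    logs = {
        "company_a": [
            '192.0.2.10 - - [01/Jan/2005:10:00:00 +0000] "GET /login HTTP/1.0" 200 512',
            '198.51.100.7 - - [01/Jan/2005:10:01:00 +0000] "GET / HTTP/1.0" 200 1024',
        ],
        "company_b": [
            '198.51.100.7 - - [01/Jan/2005:11:00:00 +0000] "GET /help HTTP/1.0" 200 2048',
            '203.0.113.5 - - [01/Jan/2005:11:02:00 +0000] "GET / HTTP/1.0" 200 1024',
        ],
    }
    print(sorted(common_visitors(logs)))
```

An address shared by every log is a lead rather than proof, since proxies and shared gateways can put many unrelated visitors behind one address.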
Social Security Numbers
Social Security numbers (SSNs) and other sensitive data can be easily located
with Google via the same techniques used to locate credit card numbers.
SSNs might appear online for a variety of reasons. For example, educational
facilities are notorious for using an SSN as a student ID, then posting
grades to a public Web site with the “student ID” displayed next to the grade. A
creative attacker can do quite a bit with just an SSN, but in many cases it helps
to also have a name associated with that SSN. Again, educational facilities have
been found exposing this information via Excel spreadsheets listing students’
names, grades, and SSNs, despite the fact that the student ID number is often
used to help protect the privacy of the student! Although we don’t feel it’s right
to go into the details of how this data is located, several media outlets have irresponsibly
posted the details online. Although the blame lies with the sites that are
leaking this information, in our opinion it’s still not right to draw attention to
how exactly the information can be located.
Personal Financial Data
In some cases, phishing scams are responsible for publicizing personal information;
in other cases, hackers attacking online retailers are to blame for this breach of
privacy. Sadly, there are many instances where an individual is personally responsible
for his own lack of privacy. Such is the case with personal financial information.
With the explosion of personal computers in today’s society, users have
literally hundreds of personal finance programs to choose from. Many of these
programs create data files with specific file extensions that can be searched with
Google. It’s hard to imagine why anyone would post personal financial information
to a public Web site (which subsequently gets crawled by Google), but it
must happen quite a bit, judging by the number of hits for program files generated
by Quicken and Microsoft Money, for example. Although it would be
somewhat irresponsible to provide queries here that would unearth personal
financial data, it’s important to understand the types of data that could potentially
be uncovered by an attacker. To that end, Table 9.4 shows file extensions for various
financial, accounting, and tax return programs. Ensure that these file types
aren’t listed on a Web server you’re charged with protecting.
| File Extension | Description |
| afm | Abassis Finance Manager |
| ab4 | Accounting and Business File |
| mmw | AceMoney File |
| iqd | AmeriCalc Mutual Fund Tax Report |
| et2 | Electronic Tax Return Security File (Australia) |
| tax | Intuit TurboTax Tax Return |
| t98-t04 | Kiplinger Tax Cut File (extension based on two-digit return year) |
| mny | Microsoft Money 2004 Money Data Files |
| mbf | Microsoft Money Backup Files |
| inv | MSN Money Investor File |
| ptdb | Peachtree Accounting Database |
| qbb | QuickBooks Backup Files reveal financial data |
| qdf | Quicken personal finance data |
| soa | Sage MAS 90 accounting software |
| sdb | Simply Accounting |
| stx | Simply Tax Form |
| tmd | Time and Expense Tracking |
| tls | Timeless Time & Expense |
| fec | U.S. Federal Campaign Expense Submission |
| wow | Wings Accounting File |
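One way to check a Web server you are charged with protecting is to walk its document root and flag any file whose extension appears in Table 9.4. The following is a minimal sketch of that audit; the function name and the command-line handling are assumptions for illustration, and the Kiplinger range t98-t04 has been expanded by hand.

```python
import os

# Extensions taken from Table 9.4 (t98-t04 expanded to individual years).
FINANCIAL_EXTENSIONS = {
    "afm", "ab4", "mmw", "iqd", "et2", "tax", "mny", "mbf", "inv", "ptdb",
    "qbb", "qdf", "soa", "sdb", "stx", "tmd", "tls", "fec", "wow",
    "t98", "t99", "t00", "t01", "t02", "t03", "t04",
}


def find_financial_files(docroot):
    """Yield paths under docroot whose extension is in FINANCIAL_EXTENSIONS."""
    for dirpath, _dirnames, filenames in os.walk(docroot):
        for name in filenames:
            # Compare case-insensitively so BUDGET.QDF is caught as well.
            ext = os.path.splitext(name)[1].lstrip(".").lower()
            if ext in FINANCIAL_EXTENSIONS:
                yield os.path.join(dirpath, name)


if __name__ == "__main__":
    import sys

    # Pass the document root as the first argument; defaults to the current directory.
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for path in find_financial_files(root):
        print(path)
```

Some of these extensions are shared with unrelated formats, so a hit is a file to review rather than a confirmed leak.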
Searching for Other Juicy Info
As we’ve seen, Google can be used to locate all sorts of sensitive information. In
this section we take a look at some of the data that Google can find that’s harder
to categorize. From address books to chat log files and network vulnerability
reports, there’s no shortage of sensitive data online. Table 9.5 shows some queries
that can be used to uncover various types of sensitive data.
| Query | Description |
| intext:"Session Start * * * *:*:* *" filetype:log | AIM and IRC log files |
| filetype:blt blt +intext:screenname | AIM buddy lists |
| buddylist.blt | AIM buddy lists |
| intitle:index.of cgiirc.config | CGIIRC (Web-based IRC client) config file, shows IRC servers and user credentials |
| inurl:cgiirc.config | CGIIRC (Web-based IRC client) config file, shows IRC servers and user credentials |
| "Index of" / "chat/logs" | Chat logs |
| intitle:"Index Of" cookies.txt "size" | cookies.txt file reveals user information |
| "phone * * *" "address *" "e-mail" intitle:"curriculum vitae" | Curriculum vitae (resumes) reveal names and address information |
| ext:ini intext:env.ini | Generic environment data |
| intitle:index.of inbox | Generic mailbox files |
| "Running in Child mode" | Gnutella client data and statistics |
| ":8080" ":3128" ":80" filetype:txt | HTTP Proxy lists |
| intitle:"Index of" dbconvert.exe chats | ICQ chat logs |
| "sets mode: +p" | IRC private channel information |
| "sets mode: +s" | IRC secret channel information |
| "Host Vulnerability Summary Report" | ISS vulnerability scanner reports, reveal potential vulnerabilities on hosts and networks |
| "Network Vulnerability Assessment Report" | ISS vulnerability scanner reports, reveal potential vulnerabilities on hosts and networks |
| filetype:pot inurl:john.pot | John the Ripper password cracker results |
| intitle:"Index Of" -inurl:maillog maillog size | Maillog files reveal e-mail traffic information |
| ext:mdb inurl:*.mdb inurl: | Microsoft FrontPage database folders |
| filetype:xls inurl:contact | Microsoft Excel sheets containing contact information |
| intitle:index.of haccess.ctl | Microsoft FrontPage equivalent(?) of htaccess shows Web authentication info |
| ext:log "Software: Microsoft Internet Information Services *.*" | Microsoft Internet Information Services (IIS) log files |
| filetype:pst inurl:"outlook.pst" | Microsoft Outlook e-mail and calendar backup files |
| intitle:index.of mt-db-pass.cgi | Movable Type default file |
| filetype:ctt ctt messenger | MSN Messenger contact lists |
| "This file was generated by Nessus" | Nessus vulnerability scanner reports, reveal potential vulnerabilities on hosts and networks |
| inurl:"newsletter/admin/" | Newsletter administration information |
| inurl:"newsletter/admin/" intitle:"newsletter admin" | Newsletter administration information |
| filetype:eml eml intext:"Subject" +From | Outlook Express e-mail files |
| intitle:index.of inbox dbx | Outlook Express Mailbox files |
| filetype:mbx mbx intext:Subject | Outlook v1–v4 or Eudora mailbox files |
| inurl:/public/?Cmd=contents | Outlook Web Access public folders or appointments |
| filetype:pdb pdb backup (Pilot | Pluckerdb) | Palm Pilot Hotsync database files |
| "This is a Shareaza Node" | Shareaza client data and statistics |
| inurl:/_layouts/settings | Sharepoint configuration information |
| inurl:ssl.conf filetype:conf | SSL configuration files, reveal various configuration information |
| site:edu admin grades | Student grades |
| intitle:index.of mystuff.xml | Trillian user Web links |
| inurl:forward filetype:forward -cvs | UNIX mail forward files reveal e-mail addresses |
| intitle:index.of dead.letter | UNIX unfinished e-mails |
Summary
Make no mistake—there’s sensitive data on the Web, and Google can find it.
There’s hardly any limit to the scope of information that can be located, if only
you can figure out the right query. From usernames to passwords, credit card and
Social Security numbers, and personal financial information, it’s all out there. As a
purveyor of the “dark arts,” you can relish in the stupidity of others, but as a professional
tasked with securing a customer’s site from this dangerous form of
information leakage, you could be overwhelmed by the sheer scale of your
defensive duties.
As droll as it might sound, a solid, enforced security policy is a great way to
keep sensitive data from leaking to the Web. If users understand the risks associated
with information leakage and understand the penalties that come with violating
policy, they will be more willing to cooperate in what should be a security
partnership.
In the meantime, it certainly doesn’t hurt to understand the tactics an adversary
might employ in attacking a Web server. One thing that should become
clear as you read this book is that any attacker has an overwhelming number of
files to go after. One way to prevent dangerous Web information leakage is by
denying requests for unknown file types. Whether your Web server normally
serves up CFM, ASP, PHP, or HTML, it’s infinitely easier to manage what should
be served by the Web server instead of focusing on what should not be served.
Adjust your servers or your border protection devices to allow only specific content
or file types.
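The allow-only-what-should-be-served idea can be sketched as a small request filter. This is an illustration rather than a configuration for any particular server or border device: the extension list is a hypothetical example, and treating paths that end in a slash as allowed index requests is an assumption that presumes directory listings are disabled.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical allow-list; adjust it to what the site is actually meant to serve.
ALLOWED_EXTENSIONS = {
    ".html", ".htm", ".php", ".asp", ".cfm",
    ".css", ".js", ".png", ".jpg", ".gif",
}


def is_request_allowed(url, allowed=ALLOWED_EXTENSIONS):
    """Return True only if the requested path ends in an allowed extension.

    Paths ending in "/" are treated as directory index requests and allowed.
    Everything else, including extensionless paths, is denied by default.
    """
    path = urlsplit(url).path
    if path == "" or path.endswith("/"):
        return True
    ext = posixpath.splitext(path)[1].lower()
    return ext in allowed


if __name__ == "__main__":
    for candidate in ("/index.html", "/_vti_pvt/service.pwd", "/backup/passwd.bak"):
        verdict = "allow" if is_request_allowed(candidate) else "deny"
        print(verdict, candidate)
```

Because the check is deny-by-default, a newly dropped .bak, .pwd, or .qdf file is refused without anyone having to anticipate it, which is the point the paragraph above makes.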
Solutions Fast Track
Searching for Usernames
_ Usernames can be found in a variety of locations.
_ In some cases, digging through documents or e-mail directories might
be required.
_ A simple query such as “your username is” can be very effective in
locating usernames.
Searching for Passwords
_ Passwords can also be found in a variety of locations.
_ A query such as “Your password” forgot can locate pages that provide a
forgotten-password recovery mechanism.
_ intext:(password | passcode | pass) intext:(username | userid | user) is
another generic search for locating password information.
Searching for Credit Card Numbers, Social Security Numbers, and More
_ Documents containing credit card and Social Security number
information do exist and are relatively prolific.
_ Some irresponsible news outlets have revealed functional queries that
locate this information.
_ There are relatively few examples of personal financial data online, but
there is a great deal of variety.
_ In most cases, specific file extensions can be searched for.
Searching for Other Juicy Info
_ From address books and chat log files to network vulnerability reports,
there’s no shortage of sensitive data online.