Mangle

Statistics: August

A log is kept of the options and results for each random search. I wrote a small Visual Basic program to analyse everything, based on the data from the past 30 days.

General
Number of searches through August 31: 12985
Total number of words used in searches: 38485
Number of times safemode was used: 2294
Number of times Frames was used: 11902
Number of times JavaScript wasn't used: 227
Number of times the search failed for an unknown reason: 803
Number of times no URL was found: 218
Actual number of usable searches: 11964
Number of words in the database: 7111

The large number of failed queries is a result of the web server failing on August 26th.


Website URL hits
The root URL was counted for each site visited. For example, if the web site address 'www.news.com/paper/june/00232.html' was hit, only 'www.news.com' was counted.

Web site root URL     Hits (out of 11964)
wortschatz.uni-leipzig.de ** 94
www.geocities.com 86
english.pravda.ru 40
news.bbc.co.uk 36
www.amazon.com 35
members.aol.com 34
www.cnn.com 28
www.epa.gov 27
www.nap.edu 26
www.pbs.org 25
members.tripod.com 25
www.wired.com 23
citeseer.nj.nec.com 22
www.businessweek.com 22
www.angelfire.com 20
www.epinions.com 19
biz.yahoo.com 19
www.washingtonpost.com 18
www.fas.org 18
www.media.mit.edu 17
news.com.com 16
www.ed.gov 16
detnews.com 16
www.usdoj.gov 15
home.earthlink.net 15
Other notables:
www.guardian.co.uk 13
www.usatoday.com 12
www.law.emory.edu 11
www.bbc.co.uk 10
www.salon.com 10
www.microsoft.com 9
www.time.com 9
www.msnbc.com 6
www.canoe.ca 5
www3.sympatico.ca 5
dmoz.org 3
www.apple.com 3

** Note that the site wortschatz.uni-leipzig.de contains a 10000+ word list of the English language, so this page is generally found when the search uses an odd combination of rarely used words.
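
The root-URL counting described above can be sketched in a few lines of Python. This is only an illustration, not the Visual Basic program that produced these numbers: the function names `root_url` and `count_root_urls` are made up here, and it assumes each logged address is a plain string that may or may not start with 'http://'.

```python
from collections import Counter
from urllib.parse import urlsplit


def root_url(address: str) -> str:
    """Return only the host part of an address, e.g. 'www.news.com'."""
    # urlsplit only recognises the host when the string contains '//',
    # so add it for addresses that were logged without a scheme.
    if "//" not in address:
        address = "//" + address
    # hostname is lowercased and has any port number removed.
    return urlsplit(address).hostname or ""


def count_root_urls(addresses):
    """Tally hits per root URL for an iterable of logged addresses."""
    return Counter(root_url(a) for a in addresses)
```

Calling `count_root_urls(log).most_common(25)` on a list of addresses would then give a ranking in the same shape as the table above.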


Top-level domain hits
The top-level domains (.com, .edu, .org, etc.) in the web URLs were counted. Not surprisingly, .com has by far the largest share of hits, with .org and .edu far behind. At first I used the entire log file; however, .au was then ranked #4 with 407 hits (thanks to some media exposure from Australia!). I redid the stats using only the entries where no country restriction was used, and .au was back where it belonged.

Top-level Domain     Hits (out of 10155)
.com 3560
.org 1296
.edu 830
.uk 350
.net 300
.gov 265
.ca 179
.us 168
.au 152
.jp 90
.de 65
.mil 37
.za 31
.nl 27
.ru 25
.nz 24
.fr 24
.se 21
.it 20
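
A rough Python sketch of the top-level domain count, including the step of skipping searches that used a country restriction, might look like the following. The log layout is an assumption made for illustration: each entry is taken to be a dict with a 'url' key and an optional 'country' key, which is not necessarily how the real log is stored.

```python
from collections import Counter
from urllib.parse import urlsplit


def top_level_domain(address: str) -> str:
    """Return the last dot-separated label of the host, e.g. '.com'."""
    if "//" not in address:
        address = "//" + address
    host = urlsplit(address).hostname or ""
    # rpartition yields an empty separator when the host has no dot.
    _, dot, tld = host.rpartition(".")
    return "." + tld if dot else ""


def count_tlds(entries):
    """Count TLDs, ignoring searches restricted to a country."""
    counts = Counter()
    for entry in entries:
        if entry.get("country"):  # hypothetical field: country option used
            continue
        tld = top_level_domain(entry["url"])
        if tld:
            counts[tld] += 1
    return counts
```

Hosts given as bare IP addresses would show up under their last number in this sketch, so a real run would probably want to filter those out first.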


Domain File Extension Names
The file types that appeared in the URLs were counted. .html and .htm make up more than 70% of all the file types encountered. Many sites only had a root URL and no file type to show. '.txt' is ranked higher than usual because of the 94 hits that http://wortschatz.uni-leipzig.de/Papers/top10000en.txt tallied.

File Extension     Hits (out of 11964)
.html 4856
.htm 3681
Root (no extension) 1228
.asp 566
.shtml 351
.txt 212
.cfm 151
.php 133
.stm 42
.php3 35
.jsp 28
.cgi 26
.phtml 18
.jhtml 9
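
The extension count can be illustrated the same way. This is a minimal sketch under the assumption that the extension is simply whatever follows the last dot in the final path segment, with any query string ignored; the function name `file_extension` and the way extension-less paths are all lumped under the root label are choices made for this example.

```python
import posixpath
from urllib.parse import urlsplit

NO_EXTENSION = "Root (no extension)"


def file_extension(address: str) -> str:
    """Return the lowercased extension of the URL path, or the root label."""
    if "//" not in address:
        address = "//" + address
    # .path excludes the host and any '?query' or '#fragment' part.
    path = urlsplit(address).path
    # splitext only looks at the last path segment.
    ext = posixpath.splitext(path)[1].lower()
    return ext if ext else NO_EXTENSION
```

Feeding the results into a `collections.Counter` would give a tally like the table above.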


Word Frequencies
Although every one of the 7111 words in the database has an equal probability of being picked, some words came up more often than others. The words that were picked most often are listed below. There seems to be a bug where a word is occasionally picked twice in a row ...

Words used 39 times: dozens
Words used 34 times: both
Words used 32 times: fares, southwest
Words used 31 times: logs
Words used 29 times: ash, executing
Words used 28 times: clever, endorsed, games
Words used 26 times: argument, drag, tutorial
Words used 25 times: challenges, cholesterol, claiming, factions, grow, minister, shallow
Words used 24 times: analyses, buffers, chips, mathematical, megabyte, neighbors, resource, shocked, withdrew
Words used 23 times: analyzer, apartment, headline, protection, require, shed, tenure, tourist, tune, varying
Words used 22 times: basement, blanket, buses, commonly, currency, damaged, detect, diluted, keystroke, pointing, religion, settlement, spurred, therapy
796 words were never used.
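
As a point of comparison for the figures above, here is a small sketch of what perfectly uniform picking would predict. It assumes that all 38485 logged words were drawn independently, with equal probability, from the 7111 words in the database; that assumption may not hold for the real log (the suspected repeat bug alone would break it), so treat the output as a rough yardstick only.

```python
def uniform_expectations(picks: int, vocabulary: int):
    """Expected uses per word and expected number of never-used words."""
    # Each pick hits a given word with probability 1 / vocabulary.
    mean_uses = picks / vocabulary
    # A word is never used if every single pick misses it.
    p_never = (1 - 1 / vocabulary) ** picks
    return mean_uses, vocabulary * p_never


mean_uses, expected_unused = uniform_expectations(38485, 7111)
print(f"expected uses per word:    {mean_uses:.2f}")
print(f"expected never-used words: {expected_unused:.1f}")
```

Under these assumptions each word would be used a little over five times on average and only a few dozen words would go unused, so counts such as 39 uses for one word and 796 unused words are well away from what uniform picking alone would give.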


Frequency of option 'Number of words'
Number of times 1 word was used in search: 575 (4.4%)
Number of times 2 words were used in search: 680 (5.2%)
Number of times 3 words were used in search (default): 10972 (84%)
Number of times 4 words were used in search: 156 (1.2%)
Number of times 5 words were used in search: 238 (4.6%)
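
The percentages in this table can be rechecked from the counts. The sketch below assumes the percentages are shares of the 12985 total searches reported in the General section; that denominator is a guess, as is the one-percentage-point tolerance used to allow for rounding.

```python
TOTAL_SEARCHES = 12985  # assumed denominator, from the General section

# (number of words, times used, percentage as listed above)
ROWS = [
    (1, 575, 4.4),
    (2, 680, 5.2),
    (3, 10972, 84.0),
    (4, 156, 1.2),
    (5, 238, 4.6),
]


def recompute_shares(rows, total, tolerance=1.0):
    """Return (words, recomputed %, mismatch flag) for each row."""
    report = []
    for words, count, listed in rows:
        share = 100.0 * count / total
        mismatch = abs(share - listed) > tolerance
        report.append((words, round(share, 1), mismatch))
    return report


for words, share, mismatch in recompute_shares(ROWS, TOTAL_SEARCHES):
    note = "  <- differs from listed value" if mismatch else ""
    print(f"{words} word(s): {share}%{note}")
```

With the figures as listed, only the 5-word row is flagged: 238 out of 12985 works out to about 1.8%, so either the count or the percentage in that row may have been mistyped.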


'Country' options used:
The number of times the search was restricted to a specific country. Countries that were used fewer than 40 times are omitted.

Option: Country     Number of times used
Australia 522
United States 428
Japan 300
Russia 122
Canada 108
Great Britain 94
United Kingdom 67
Belgium 50
Netherlands 45
Denmark 41
France 41


'Language' options used:
The number of times the search was restricted to a specific language. Languages that were used fewer than 20 times are omitted.

Option: Language     Number of times used
English 3884
Japanese 310
Russian 108
Dutch 35
French 35
German 26



