# Robots.txt file from http://www.searchengineworld.com
#
# Built from text file http://info.webcrawler.com/mak/projects/robots/active/all.txt
#
# This restricts access to only known and registered robots.
#

User-agent: Mozilla/3.0 (compatible;miner;mailto:miner@miner.com.br)
Disallow:

User-agent: WebFerret
Disallow:

User-agent: Due to a deficiency in Java it's not currently possible to set the User-agent.
Disallow:

User-agent: no
Disallow:

User-agent: 'Ahoy! The Homepage Finder'
Disallow:

User-agent: Arachnophilia
Disallow:

User-agent: ArchitextSpider
Disallow:

User-agent: ASpider/0.09
Disallow:

User-agent: AURESYS/1.0
Disallow:

User-agent: BackRub/*.*
Disallow:

User-agent: Big Brother
Disallow:

User-agent: BlackWidow
Disallow:

User-agent: BSpider/1.0 libwww-perl/0.40
Disallow:

User-agent: CACTVS Chemistry Spider
Disallow:

User-agent: Digimarc CGIReader/1.0
Disallow:

User-agent: Checkbot/x.xx LWP/5.x
Disallow:

User-agent: CMC/0.01
Disallow:

User-agent: combine/0.0
Disallow:

User-agent: conceptbot/0.3
Disallow:

User-agent: Crescent Internet ToolPak HTTP OLE Control v.1.0
Disallow:

User-agent: root/0.1
Disallow:

User-agent: CS-HKUST-IndexServer/1.0
Disallow:

User-agent: CyberSpyder/2.1
Disallow:

User-agent: Deweb/1.01
Disallow:

User-agent: DragonBot/1.0 libwww/5.0
Disallow:

User-agent: EIT-Link-Verifier-Robot/0.2
Disallow:

User-agent: Emacs-w3/v[0-9\.]+
Disallow:

User-agent: EmailSiphon
Disallow:

User-agent: EMC Spider
Disallow:

User-agent: explorersearch
Disallow:

User-agent: Explorer
Disallow:

User-agent: ExtractorPro
Disallow:

User-agent: FelixIDE/1.0
Disallow:

User-agent: Hazel's Ferret Web hopper,
Disallow:

User-agent: ESIRover v1.0
Disallow:

User-agent: fido/0.9 Harvest/1.4.pl2
Disallow:

User-agent: Hämähäkki/0.2
Disallow:

User-agent: KIT-Fireball/2.0 libwww/5.0a
Disallow:

User-agent: Fish-Search-Robot
Disallow:

User-agent: Mozilla/2.0 (compatible fouineur v2.0; fouineur.9bit.qc.ca)
Disallow:

User-agent: Robot du CRIM 1.0a
Disallow:

User-agent: Freecrawl
Disallow:

User-agent: FunnelWeb-1.0
Disallow:

User-agent: gcreep/1.0
Disallow:

User-agent: ???
Disallow:

User-agent: GetURL.rexx v1.05
Disallow:

User-agent: Golem/1.1
Disallow:

User-agent: Gromit/1.0
Disallow:

User-agent: Gulliver/1.1
Disallow:

User-agent: yes
Disallow:

User-agent: AITCSRobot/1.1
Disallow:

User-agent: wired-digital-newsbot/1.5
Disallow:

User-agent: htdig/3.0b3
Disallow:

User-agent: HTMLgobble v2.2
Disallow:

User-agent: no
Disallow:

User-agent: IBM_Planetwide,
Disallow:

User-agent: gestaltIconoclast/1.0 libwww-FM/2.17
Disallow:

User-agent: INGRID/0.1
Disallow:

User-agent: IncyWincy/1.0b1
Disallow:

User-agent: Informant
Disallow:

User-agent: InfoSeek Robot 1.0
Disallow:

User-agent: Infoseek Sidewinder
Disallow:

User-agent: InfoSpiders/0.1
Disallow:

User-agent: inspectorwww/1.0 http://www.greenpac.com/inspectorwww.html
Disallow:

User-agent: 'IAGENT/1.0'
Disallow:

User-agent: IsraeliSearch/1.0
Disallow:

User-agent: JCrawler/0.2
Disallow:

User-agent: Jeeves v0.05alpha (PERL, LWP, lglb@doc.ic.ac.uk)
Disallow:

User-agent: Jobot/0.1alpha libwww-perl/4.0
Disallow:

User-agent: JoeBot,
Disallow:

User-agent: JubiiRobot
Disallow:

User-agent: jumpstation
Disallow:

User-agent: Katipo/1.0
Disallow:

User-agent: KDD-Explorer/0.1
Disallow:

User-agent: KO_Yappo_Robot/1.0.4(http://yappo.com/info/robot.html)
Disallow:

User-agent: LabelGrab/1.1
Disallow:

User-agent: LinkWalker
Disallow:

User-agent: logo.gif crawler
Disallow:

User-agent: Lycos/x.x
Disallow:

User-agent: Lycos_Spider_(T-Rex)
Disallow:

User-agent: Magpie/1.0
Disallow:

User-agent: MediaFox/x.y
Disallow:

User-agent: MerzScope
Disallow:

User-agent: NEC-MeshExplorer
Disallow:

User-agent: MOMspider/1.00 libwww-perl/0.40
Disallow:

User-agent: Monster/vX.X.X -$TYPE ($OSTYPE)
Disallow:

User-agent: Motor/0.2
Disallow:

User-agent: MuscatFerret
Disallow:

User-agent: MwdSearch/0.1
Disallow:

User-agent: NetCarta CyberPilot Pro
Disallow:

User-agent: NetMechanic
Disallow:

User-agent: NetScoop/1.0 libwww/5.0a
Disallow:
User-agent: NHSEWalker/3.0
Disallow:

User-agent: Nomad-V2.x
Disallow:

User-agent: NorthStar
Disallow:

User-agent: Occam/1.0
Disallow:

User-agent: HKU WWW Robot,
Disallow:

User-agent: Orbsearch/1.0
Disallow:

User-agent: PackRat/1.0
Disallow:

User-agent: Patric/0.01a
Disallow:

User-agent: Peregrinator-Mathematics/0.7
Disallow:

User-agent: Duppies
Disallow:

User-agent: Pioneer
Disallow:

User-agent: PGP-KA/1.2
Disallow:

User-agent: Resume Robot
Disallow:

User-agent: Road Runner: ImageScape Robot (lim@cs.leidenuniv.nl)
Disallow:

User-agent: Robbie/0.1
Disallow:

User-agent: ComputingSite Robi/1.0 (robi@computingsite.com)
Disallow:

User-agent: Roverbot
Disallow:

User-agent: SafetyNet Robot 0.1,
Disallow:

User-agent: Scooter/1.0
Disallow:

User-agent: not available
Disallow:

User-agent: Senrigan/xxxxxx
Disallow:

User-agent: SG-Scout
Disallow:

User-agent: Shai'Hulud
Disallow:

User-agent: SimBot/1.0
Disallow:

User-agent: Open Text Site Crawler V1.0
Disallow:

User-agent: SiteTech-Rover
Disallow:

User-agent: Slurp/2.0
Disallow:

User-agent: ESISmartSpider/2.0
Disallow:

User-agent: Snooper/b97_01
Disallow:

User-agent: Solbot/1.0 LWP/5.07
Disallow:

User-agent: Spanner/1.0 (Linux 2.0.27 i586)
Disallow:

User-agent: no
Disallow:

User-agent: Mozilla/3.0 (Black Widow v1.1.0; Linux 2.0.27; Dec 31 1997 12:25:00
Disallow:

User-agent: Tarantula/1.0
Disallow:

User-agent: tarspider
Disallow:

User-agent: dlw3robot/x.y (in TclX by http://hplyot.obspm.fr/~dl/)
Disallow:

User-agent: Templeton/
Disallow:

User-agent: TitIn/0.2
Disallow:

User-agent: TITAN/0.1
Disallow:

User-agent: UCSD-Crawler
Disallow:

User-agent: urlck/1.2.3
Disallow:

User-agent: Valkyrie/1.0 libwww-perl/0.40
Disallow:

User-agent: Victoria/1.0
Disallow:

User-agent: vision-search/3.0'
Disallow:

User-agent: VWbot_K/4.2
Disallow:

User-agent: w3index
Disallow:

User-agent: W3M2/x.xxx
Disallow:

User-agent: WWWWanderer v3.0
Disallow:

User-agent: WebCopy/
Disallow:

User-agent: WebCrawler/3.0 Robot libwww/5.0a
Disallow:

User-agent: WebFetcher/0.8,
Disallow:

User-agent: weblayers/0.0
Disallow:

User-agent: WebLinker/0.0 libwww-perl/0.1
Disallow:

User-agent: no
Disallow:

User-agent: WebMoose/0.0.0000
Disallow:

User-agent: Digimarc WebReader/1.2
Disallow:

User-agent: webs@recruit.co.jp
Disallow:

User-agent: webvac/1.0
Disallow:

User-agent: webwalk
Disallow:

User-agent: WebWalker/1.10
Disallow:

User-agent: WebWatch
Disallow:

User-agent: Wget/1.4.0
Disallow:

User-agent: w3mir
Disallow:

User-agent: no
Disallow:

User-agent: WWWC/0.25 (Win95)
Disallow:

User-agent: none
Disallow:

User-agent: XGET/0.7
Disallow:

User-agent: Nederland.zoek
Disallow:

User-agent: BizBot04 kirk.overleaf.com
Disallow:

User-agent: HappyBot (gserver.kw.net)
Disallow:

User-agent: CaliforniaBrownSpider
Disallow:

User-agent: EI*Net/0.1 libwww/0.1
Disallow:

User-agent: Ibot/1.0 libwww-perl/0.40
Disallow:

User-agent: Merritt/1.0
Disallow:

User-agent: StatFetcher/1.0
Disallow:

User-agent: TeacherSoft/1.0 libwww/2.17
Disallow:

User-agent: WWW Collector
Disallow:

User-agent: processor/0.0ALPHA libwww-perl/0.20
Disallow:

User-agent: wobot/1.0 from 206.214.202.45
Disallow:

User-agent: Libertech-Rover www.libertech.com?
Disallow:

User-agent: WhoWhere Robot
Disallow:

User-agent: ITI Spider
Disallow:

User-agent: w3index
Disallow:

User-agent: MyCNNSpider
Disallow:

User-agent: SummyCrawler
Disallow:

User-agent: OGspider
Disallow:

User-agent: linklooker
Disallow:

User-agent: CyberSpyder (amant@www.cyberspyder.com)
Disallow:

User-agent: SlowBot
Disallow:

User-agent: heraSpider
Disallow:

User-agent: Surfbot
Disallow:

User-agent: Bizbot003
Disallow:

User-agent: WebWalker
Disallow:

User-agent: SandBot
Disallow:

User-agent: EnigmaBot
Disallow:

User-agent: spyder3.microsys.com
Disallow:

User-agent: www.freeloader.com.
Disallow:

User-agent: Googlebot
Disallow:

User-agent: METAGOPHER
Disallow:

User-agent: *
Disallow: /

User-agent: *
Disallow: /search
Disallow: /groups
Disallow: /images
Disallow: /catalogs
Disallow: /catalogues
Disallow: /news
Allow: /news/directory
Disallow: /nwshp
Disallow: /setnewsprefs?
Disallow: /index.html?
Disallow: /?
Disallow: /addurl/image?
Disallow: /pagead/
Disallow: /relpage/
Disallow: /relcontent
Disallow: /imgres
Disallow: /imglanding
Disallow: /keyword/
Disallow: /u/
Disallow: /univ/
Disallow: /cobrand
Disallow: /custom
Disallow: /advanced_group_search
Disallow: /googlesite
Disallow: /preferences
Disallow: /setprefs
Disallow: /swr
Disallow: /url
Disallow: /default
Disallow: /m?
Disallow: /m/?
Disallow: /m/blogs?
Disallow: /m/ig
Disallow: /m/images?
Disallow: /m/local?
Disallow: /m/movies?
Disallow: /m/news?
Disallow: /m/news/i?
Disallow: /m/place?
Disallow: /m/setnewsprefs?
Disallow: /m/search?
Disallow: /m/swmloptin?
Disallow: /m/trends
Disallow: /wml?
Disallow: /wml/?
Disallow: /wml/search?
Disallow: /xhtml?
Disallow: /xhtml/?
Disallow: /xhtml/search?
Disallow: /xml?
Disallow: /imode?
Disallow: /imode/?
Disallow: /imode/search?
Disallow: /jsky?
Disallow: /jsky/?
Disallow: /jsky/search?
Disallow: /pda?
Disallow: /pda/?
Disallow: /pda/search?
Disallow: /sprint_xhtml
Disallow: /sprint_wml
Disallow: /pqa
Disallow: /palm
Disallow: /gwt/
Disallow: /purchases
Disallow: /hws
Disallow: /bsd?
Disallow: /linux?
Disallow: /mac?
Disallow: /microsoft?
Disallow: /unclesam?
Disallow: /answers/search?q=
Disallow: /local?
Disallow: /local_url
Disallow: /froogle?
Disallow: /products?
Disallow: /products/
Disallow: /froogle_
Disallow: /product_
Disallow: /products_
Disallow: /print
Disallow: /books
Disallow: /bkshp?q=
Allow: /booksrightsholders
Disallow: /patents?
Disallow: /patents/
Allow: /patents/about
Disallow: /scholar
Disallow: /complete
Disallow: /sponsoredlinks
Disallow: /videosearch?
Disallow: /videopreview?
Disallow: /videoprograminfo?
Disallow: /maps?
Disallow: /mapstt?
Disallow: /mapslt?
Disallow: /maps/stk/
Disallow: /maps/br?
Disallow: /mapabcpoi?
Disallow: /maphp?
Disallow: /places/
Disallow: /maps/place
Disallow: /help/maps/streetview/partners/welcome/
Disallow: /lochp?
Disallow: /center
Disallow: /ie?
Disallow: /sms/demo?
Disallow: /katrina?
Disallow: /blogsearch?
Disallow: /blogsearch/
Disallow: /blogsearch_feeds
Disallow: /advanced_blog_search
Disallow: /reader/
Allow: /reader/play
Disallow: /uds/
Disallow: /chart?
Disallow: /transit?
Disallow: /mbd?
Disallow: /extern_js/
Disallow: /calendar/feeds/
Disallow: /calendar/ical/
Disallow: /cl2/feeds/
Disallow: /cl2/ical/
Disallow: /coop/directory
Disallow: /coop/manage
Disallow: /trends?
Disallow: /trends/music?
Disallow: /trends/hottrends?
Disallow: /trends/viz?
Disallow: /notebook/search?
Disallow: /musica
Disallow: /musicad
Disallow: /musicas
Disallow: /musicl
Disallow: /musics
Disallow: /musicsearch
Disallow: /musicsp
Disallow: /musiclp
Disallow: /browsersync
Disallow: /call
Disallow: /archivesearch?
Disallow: /archivesearch/url
Disallow: /archivesearch/advanced_search
Disallow: /base/search?
Disallow: /base/reportbadoffer
Disallow: /base/s2
Disallow: /urchin_test/
Disallow: /movies?
Disallow: /codesearch?
Disallow: /codesearch/feeds/search?
Disallow: /wapsearch?
Disallow: /safebrowsing
Allow: /safebrowsing/diagnostic
Allow: /safebrowsing/report_error/
Allow: /safebrowsing/report_phish/
Disallow: /reviews/search?
Disallow: /orkut/albums
Disallow: /jsapi
Disallow: /views?
Disallow: /c/
Disallow: /cbk
Disallow: /recharge/dashboard/car
Disallow: /recharge/dashboard/static/
Disallow: /translate_a/
Disallow: /translate_c
Disallow: /translate_f
Disallow: /translate_static/
Disallow: /translate_suggestion
Disallow: /profiles/me
Allow: /profiles
Disallow: /s2/profiles/me
Allow: /s2/profiles
Allow: /s2/photos
Allow: /s2/static
Disallow: /s2
Disallow: /transconsole/portal/
Disallow: /gcc/
Disallow: /aclk
Disallow: /cse?
Disallow: /cse/panel
Disallow: /cse/manage
Disallow: /tbproxy/
Disallow: /comparisonads/
Disallow: /imesync/
Disallow: /shenghuo/search?
Disallow: /support/forum/search?
Disallow: /reviews/polls/
Disallow: /hosted/images/
Disallow: /ppob/?
Disallow: /ppob?
Disallow: /ig/add?
Disallow: /adwordsresellers
Disallow: /accounts/o8
Allow: /accounts/o8/id
Disallow: /topicsearch?q=
Disallow: /xfx7/
Disallow: /squared/api
Disallow: /squared/search
Disallow: /squared/table
Disallow: /toolkit/
Allow: /toolkit/*.html
Disallow: /qnasearch?
Disallow: /errors/
Disallow: /app/updates
Disallow: /sidewiki/entry/
Disallow: /quality_form?
Disallow: /labs/popgadget/search
Disallow: /buzz/post

Sitemap: http://www.gstatic.com/s2/sitemaps/profiles-sitemap.xml
Sitemap: http://www.google.com/hostednews/sitemap_index.xml
Sitemap: http://www.google.com/ventures/sitemap_ventures.xml
Sitemap: http://www.google.com/sitemaps_webmasters.xml
Sitemap: http://www.gstatic.com/trends/websites/sitemaps/sitemapindex.xml
Sitemap: http://www.gstatic.com/dictionary/static/sitemaps/sitemap_index.xml