Day: January 30, 2009

  • the reply from cuill to my question about their search engine's crawl speed

    Dear Nurettin,

    Twiceler is the crawler for our new search engine. It is important
    to us that it obey robots.txt, and that it not crawl sites that do not
    wish to be crawled.

    Recently we have seen a number of crawlers masquerading as Twiceler, so
    please check that the IP address of the crawler in question is one of ours.
    You can see our IP addresses at http://www.cuil.com/info/webmaster_info

    You may wish to add a robots.txt file to your site (I notice you don’t
    have one). That is the standard mechanism for controlling robot access and
    behavior. You can read about it at
    http://www.robotstxt.org/wc/exclusion-admin.html
    and there is a simple generator for the file here
    http://www.mcanerin.com/EN/search-engine/robots-txt.asp

    The Crawl-delay directive is what you are looking for. It tells robots
    that support it (we do) how long to wait between requests. Add the
    directive just below the ‘User-agent: *’ line. For example,

    Crawl-delay: 120

    would tell us to wait two minutes between requests.

    Also be aware that changes to robots.txt take several days to take
    effect. The industry standard is to cache robots.txt for seven days,
    but we make every effort to re-read it more frequently.

    Please feel free to contact me if you have any further questions.

    Sincerely,

    James Akers
    Operations Engineer
    Cuill, Inc.
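
    The robots.txt behavior the email describes can be tried out locally:
    Python's standard urllib.robotparser module understands the Crawl-delay
    directive. A minimal sketch, using the two-line robots.txt the email
    suggests (the 120-second value is their example, not a recommendation):

    ```python
    import urllib.robotparser

    # A minimal robots.txt as described in the email: the Crawl-delay
    # directive placed just below the wildcard User-agent line.
    robots_txt = """\
    User-agent: *
    Crawl-delay: 120
    """

    rp = urllib.robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())

    # Twiceler matches the wildcard group, so the parser reports the delay
    # it should honor, and fetching is still allowed (no Disallow lines).
    print(rp.crawl_delay("Twiceler"))            # -> 120
    print(rp.can_fetch("Twiceler", "/any/page")) # -> True
    ```

    A crawler that supports the directive, as cuill says theirs does, would
    read this value and sleep that many seconds between requests to the site.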