Day: January 30, 2009

  • the reply from cuill to my question about their search engine's crawl speed

    Dear Nurettin,

    Twiceler is the crawler for our new search engine. It is important
    to us that it obey robots.txt, and that it not crawl sites that do not
    wish to be crawled.

    Recently we have seen a number of crawlers masquerading as Twiceler, so
    please check that the IP address of the crawler in question is one of ours.
    You can see our IP addresses at http://www.cuil.com/info/webmaster_info

    You may wish to add a robots.txt file to your site (I notice you don’t
    have one). That is the standard mechanism for controlling robot access and
    behavior. You can read about it at
    http://www.robotstxt.org/wc/exclusion-admin.html
    and there is a simple generator for the file here
    http://www.mcanerin.com/EN/search-engine/robots-txt.asp

    The Crawl-delay directive is what you are looking for. It tells robots
    that support it (we do) how long to wait between requests. Add the
    directive just below the ‘User-agent: *’ line. For example,

    Crawl-delay: 120

    would tell us to wait two minutes between requests.

    Also be aware that changes to robots.txt take several days to take
    effect. The industry standard is to cache robots.txt for seven days,
    but we make every effort to re-read it more frequently.

    Please feel free to contact me if you have any further questions.

    Sincerely,

    James Akers
    Operations Engineer
    Cuill, Inc.
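
    The robots.txt behavior the email describes can be tried out locally:
    Python's standard urllib.robotparser module understands the Crawl-delay
    directive. A minimal sketch, using the two-line robots.txt the email
    suggests (the 120-second value is their example, not a recommendation):

    ```python
    import urllib.robotparser

    # A minimal robots.txt as described in the email: the Crawl-delay
    # directive placed just below the wildcard User-agent line.
    robots_txt = """\
    User-agent: *
    Crawl-delay: 120
    """

    rp = urllib.robotparser.RobotFileParser()
    rp.parse(robots_txt.splitlines())

    # Twiceler matches the wildcard group, so the parser reports the delay
    # it should honor, and fetching is still allowed (no Disallow lines).
    print(rp.crawl_delay("Twiceler"))            # -> 120
    print(rp.can_fetch("Twiceler", "/any/page")) # -> True
    ```

    A crawler that supports the directive, as cuill says theirs does, would
    read this value and sleep that many seconds between requests to the site.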