Monday, July 28, 2008

Cuil to crawl your site

Webmaster Info

Cuil is the biggest search engine on the planet. In our quest to let users search as much of the Internet as possible, Cuil has indexed more than 120 billion pages so far.

If you would like Cuil to crawl your site and have it included in our index, please let us know.

Twiceler is the name of our robot Web crawler. The user-agent is “twiceler.” We understand that many small sites are bandwidth-limited, so we support the robots.txt Crawl-delay directive. You can read about robots.txt at robotstxt.org and there is a simple generator of the file at mcanerin.com.
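As a rough illustration of the directive, the sketch below builds a small robots.txt that addresses the "twiceler" user-agent and checks it with Python's standard urllib.robotparser module. The 10-second delay and the /private/ path are made-up values for the example, not recommendations from Cuil. Python's parser is only a convenient way to sanity-check the file and may not match how Twiceler itself reads it.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content. The delay value and the blocked path are
# illustrative assumptions, not values recommended by Cuil.
ROBOTS_TXT = """\
User-agent: twiceler
Crawl-delay: 10
Disallow: /private/
"""


def load_rules(text):
    """Parse robots.txt text and return a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# Seconds the "twiceler" user-agent is asked to wait between requests.
twiceler_delay = rules.crawl_delay("twiceler")

# Whether "twiceler" may fetch a given URL under these rules.
private_allowed = rules.can_fetch("twiceler", "http://example.com/private/page.html")
public_allowed = rules.can_fetch("twiceler", "http://example.com/index.html")

print(twiceler_delay, private_allowed, public_allowed)
```

Only the three lines inside ROBOTS_TXT go into the robots.txt file at the root of your site. The Python around them is just a way to confirm the file says what you intended.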

If you have modified your robots.txt file for Twiceler, it may take several days for us to re-read the file. If you need something blocked right away, please let us know.

Got a Twiceler question or concern? Contact Jim. Jim’s the guy who keeps track of Twiceler, when he’s not busy with his horses.

If you would prefer that we not crawl your site at all, we are happy to oblige. Just drop Jim a note to that effect and he will place your site or IP address on our do-not-crawl list. Be explicit about which site to block, because the domain of your email address often differs from the site in question.

Occasionally, we have seen other Web crawling robots masquerading as Twiceler. You can be sure it’s Cuil crawling your site if the robot has one of the following IP addresses:

38.99.13.121 38.99.44.101 64.1.215.166 208.36.144.6
38.99.13.122 38.99.44.102 64.1.215.162 208.36.144.7
38.99.13.123 38.99.44.103 64.1.215.163 208.36.144.8
38.99.13.124 38.99.44.104 64.1.215.164 208.36.144.9
38.99.13.125 38.99.44.105 64.1.215.165 208.36.144.10
38.99.13.126 38.99.44.106
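One way to apply this check to your own server logs is sketched below in Python. The addresses are copied from the list above. The function name is_cuil_address is a hypothetical helper for this example, and the sketch assumes you already have the visitor's remote address as a string.

```python
import ipaddress

# Addresses copied from the list published above.
CUIL_ADDRESSES = frozenset(
    ipaddress.ip_address(addr)
    for addr in """
    38.99.13.121 38.99.44.101 64.1.215.166 208.36.144.6
    38.99.13.122 38.99.44.102 64.1.215.162 208.36.144.7
    38.99.13.123 38.99.44.103 64.1.215.163 208.36.144.8
    38.99.13.124 38.99.44.104 64.1.215.164 208.36.144.9
    38.99.13.125 38.99.44.105 64.1.215.165 208.36.144.10
    38.99.13.126 38.99.44.106
    """.split()
)


def is_cuil_address(remote_addr):
    """Return True if remote_addr is one of the listed Cuil addresses.

    Hypothetical helper: strings that are not valid IP addresses
    are treated as non-matching rather than raising an error.
    """
    try:
        return ipaddress.ip_address(remote_addr) in CUIL_ADDRESSES
    except ValueError:
        return False


# A request claiming to be "twiceler" from an unlisted address
# would be one of the masquerading robots described above.
print(is_cuil_address("38.99.13.121"))
print(is_cuil_address("10.0.0.1"))
```

A request whose user-agent says "twiceler" but whose address fails this check is not coming from Cuil.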

To all those who have contacted us to let us know that they are happy to have their site included in a Web index for the first time, thank you for being a part of the biggest search engine on the Web—Cuil!

