How to Identify ISearchHereBOT

Presumably, you arrived at this site because you noticed traffic from a User-Agent that identified itself with the string:

    Mozilla/5.0 (compatible; ISearchHereBOT 1.0 BETA; +http://www.isearchhere.com/bot.php)

You have come to the right place to find out about the ISearchHereBOT crawler.

What does ISearchHereBOT do?

ISearchHere.Com crawls are used in the live search engine available at http://www.ISearchHere.Com. This site gets queries from around the world.

How Often ISearchHereBOT Crawls a Site

ISearchHereBOT is currently run sporadically (not continuously) on a large number of machines. Each machine runs about 2-3 fetcher processes, and each fetcher has at most 100-300 connections open at any given time. In a typical situation, these connections would not all be to the same host.

How You Can Change How ISearchHereBOT Crawls Your Site

ISearchHereBOT understands robots.txt files (the file must be named robots.txt, not robot.txt). A robots.txt file must be placed in the root folder of your website for its instructions to be followed; ISearchHere.Com does not look in subfolders for robots.txt files. A simple robots.txt file that blocks ISearchHere.Com from crawling any folder other than the cool_stuff folder and its subfolders might look like:

    User-agent: ISearchHereBOT
    Disallow: /
    Allow: /cool_stuff/

ISearchHereBOT also obeys HTML ROBOTS meta tags whose content is drawn from none, noindex, nofollow, noarchive, and nosnippet. An example HTML page using the noindex and nofollow directives might look as follows:

    <!DOCTYPE html>
    <html>
    <head><title>Meta Robots Example</title>
    <meta name="ROBOTS" content="NOINDEX,NOFOLLOW" />
    <!-- The members of the content attribute must be comma separated; whitespace will be ignored -->
    </head>
    <body>
    <p>Stuff robots shouldn't put in their index.
    <a href="/somewhere">A link that nofollow will prevent from being followed</a></p>
    </body>
    </html>

ISearchHereBOT does not use Open Directory or Yahoo! Directory data, so noodp and noydir are implicitly supported. ISearchHereBOT matches these directives case-insensitively. Within HTML documents it honors anchor rel="nofollow" directives. For example, the following link would not be followed by ISearchHereBOT:

    <a href="somewhere_else" rel="nofollow">This link would not be followed by ISearchHereBOT</a>

ISearchHereBOT further understands the Crawl-delay extension to the robots.txt standard, as well as Sitemap directives. For example:

    User-agent: ISearchHereBOT
    Crawl-Delay: 10 # ISearchHereBOT will wait 10 seconds between requests
    Sitemap: http://www.domain.com/domainsitemap.xml # ISearchHereBOT will eventually download

ISearchHereBOT only supports uncompressed sitemaps.

For non-HTML pages, you can control how ISearchHereBOT indexes the page and follows its links, and how ISearchHere.Com displays results from the page on the ISearchHere.Com web site, by using an X-Robots-Tag HTTP header. For example, suppose your web server sent the following as part of its HTTP response header, before the actual data of, say, a PDF file:

    X-Robots-Tag: nosnippet

Then, if the PDF appeared in search results, there would be no snippet text under its link.

More Specifics on robots.txt and Meta Tag Handling

When processing a robots.txt file, if Disallow and Allow lines are in conflict, ISearchHereBOT gives preference to the Allow directive over the Disallow directive, because the default behavior of robots.txt is to allow everything except what is explicitly disallowed.

If a web page has a noindex meta tag, it won't show up in search results, provided that ISearchHere.Com has actually downloaded the page. If ISearchHere.Com hasn't downloaded the page, or is forbidden from downloading it by a robots.txt file, a link to the page can still show up in search results. This could happen if another page links to the given page, and ISearchHere.Com has extracted this link and its text and used them in search results.
You can check whether a URL has been downloaded by typing the query info:URL into ISearchHere.Com and looking at the results.

When processing a robots.txt file, ISearchHereBOT first looks for User-agent blocks that name ISearchHereBOT and extracts all of the Allow and Disallow paths listed in them. If it finds any, these form the set of paths that ISearchHereBOT uses to restrict its access to your site. If it cannot find any such block, it searches case-insensitively for User-agent names, which may contain the wildcard *, that match ISearchHereBOT's name, for example *Search* or *Bot*. It then parses all of these blocks and uses them to restrict its access to your site. In particular, if you have a block "User-Agent: *" followed by Allow and Disallow rules, and no blocks for ISearchHereBOT, then these paths will be what ISearchHereBOT uses and honors.

Sitemap directives, per the Sitemap specification, are not associated with any particular User-agent, so ISearchHere.Com processes, to the extent that it does, any such directive it finds.

Prior to March 2015, ISearchHereBOT did not understand * or $ in Allow and Disallow paths. "*" and "$" are extensions to the original robots.txt specification supported by Google, Yahoo, and Bing. As of March 2015, ISearchHereBOT does understand these extensions. So, for example, you can block access to pages on your site that contain a query string with a Disallow path such as:

    Disallow: /*?

ISearchHere.Com makes use of the cURL libraries to download web pages. Previously, ISearchHere.Com used cURL's automatic following of redirects. This meant that ISearchHere.Com sometimes followed shortened URLs or other redirects to a page whose robots.txt would have denied it access. Since February 2015, ISearchHere.Com does not use this feature of cURL; for a redirect response it instead extracts a link that has to go through the same queuing and robots.txt checking as all other links.
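The robots.txt handling described in this section (choosing which User-agent blocks apply, the * and $ path extensions, and the preference for Allow over Disallow) can be illustrated with a short Python sketch. This is not ISearchHere.Com's actual code. The function names and the tuple layout for a parsed block are hypothetical, and the exact-name comparison in the first pass is an assumption, since the text above only says that the fallback search is case-insensitive.

```python
import re

BOT_NAME = "ISearchHereBOT"


def wildcard_to_regex(pattern: str) -> str:
    """Escape a pattern literally, except that each * matches any run of characters."""
    return ".*".join(re.escape(part) for part in pattern.split("*"))


def path_matches(rule: str, path: str) -> bool:
    """Test a robots.txt path rule against a URL path, honoring * and $."""
    anchored = rule.endswith("$")  # a trailing $ pins the rule to the end of the path
    if anchored:
        rule = rule[:-1]
    regex = wildcard_to_regex(rule) + ("$" if anchored else "")
    # re.match anchors at the start only, so unanchored rules act as prefixes.
    return re.match(regex, path) is not None


def agent_matches(agent: str) -> bool:
    """Case-insensitive wildcard match of a User-agent value against the bot name."""
    return re.fullmatch(wildcard_to_regex(agent.lower()), BOT_NAME.lower()) is not None


def select_blocks(blocks):
    """Pick the blocks to honor from (user_agent, allow_paths, disallow_paths) tuples.

    Blocks that name the bot win; otherwise fall back to wildcard matches.
    Assumption: the first pass is an exact-name comparison.
    """
    own = [b for b in blocks if b[0] == BOT_NAME]
    return own or [b for b in blocks if agent_matches(b[0])]


def is_allowed(path: str, blocks) -> bool:
    """Allow beats Disallow; anything not explicitly disallowed is allowed."""
    chosen = select_blocks(blocks)
    allow = [p for _, paths, _ in chosen for p in paths if p]
    disallow = [p for _, _, paths in chosen for p in paths if p]
    if any(path_matches(p, path) for p in allow):
        return True
    return not any(path_matches(p, path) for p in disallow)


blocks = [
    ("*", [], ["/*?"]),                            # ignored here: a named block exists
    ("ISearchHereBOT", ["/cool_stuff/"], ["/"]),
]
print(is_allowed("/cool_stuff/page.html", blocks))  # True
print(is_allowed("/private/page.html", blocks))     # False
```

In this sketch, checking the Allow paths before the Disallow paths is what gives Allow its precedence, and an empty Disallow list leaves every path allowed, which mirrors the allow-by-default behavior of robots.txt.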
How Quickly Does ISearchHereBOT Change Its Behavior

When ISearchHere.Com machines crawl for longer than one day, they cache the robots.txt file. They use the cached directives for 24 hours before requesting the robots.txt file again. So if you change your robots.txt file, it may take up to 24 hours before the ISearchHere.Com crawler notices the changes.

Adding Your Site to ISearchHere.Com Search Engine

Currently you may go to the main page at http://www.ISearchHere.Com, select Submit URL, and add your URL. There is a limit of 30 URLs per day.

Who owns ISearchHere.Com Search?

ISearchHere.Com is owned by Advanced Sales Force, a startup company with the ambition of bringing back the way searching should be.

Where is ISearchHere.Com Search Located?

ISearchHere.Com is primarily located in Canada. We also have multiple servers up and running in other countries, all of them 100% owned by ISearchHere.Com. We are determined to have a presence in almost every country soon.

Copyright

The name ISearchHere.Com is not to be used for any other purpose. We do not allow anyone to copy our logo or use our name without written permission.

How Many URLs Does ISearchHere.Com Currently Have?

As of June 12, 2015, ISearchHere.Com sees about 3.2 billion URLs.

Contact Info

If you have any questions about the ISearchHere.Com crawler, please feel free to contact us (query at ISearchHere.Com). For advertising and business development, please contact us through the links provided on our main page. If you want ISearchHere.Com to crawl your site, please visit our main page and click Submit URL.