| Classes of Andy Pieters | > | Robots_txt |
| Detailed description | ||
| This class checks whether a page may be crawled by consulting the robots.txt file of its site. It takes the URL of a page, retrieves the robots.txt file of the same site, parses it, and looks up the rules defined there to determine whether the site allows crawling that page. The class also stores the time at which a page is crawled, so that the next time a page of the same site is about to be crawled it can check whether the request honors the site's crawl delay and request rate limits. |
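
The behaviour described above can be illustrated with a minimal sketch. It does not use this class's own API, which is not shown on this page. It is written in Python with the standard `urllib.robotparser` module, and the names `PoliteChecker`, `load_rules`, `min_interval`, `may_crawl` and `record_crawl` are hypothetical. The robots.txt content is passed in as lines so the example runs without network access, and the way the crawl delay and request rate are combined into one minimum interval is an assumption made for illustration.

```python
import time
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


class PoliteChecker:
    """Hypothetical helper: robots.txt rules plus per-site crawl timing."""

    def __init__(self, user_agent, clock=time.monotonic):
        self.user_agent = user_agent
        self.clock = clock        # injectable so timing can be simulated
        self.parsers = {}         # host -> parsed robots.txt rules
        self.last_crawl = {}      # host -> time of the last recorded crawl

    def load_rules(self, host, robots_lines):
        # A real crawler would download the site's robots.txt here;
        # this sketch takes the file content as a list of lines.
        parser = RobotFileParser()
        parser.parse(robots_lines)
        self.parsers[host] = parser

    def min_interval(self, host):
        # Assumption: honour the stricter of Crawl-delay and Request-rate.
        parser = self.parsers[host]
        delay = parser.crawl_delay(self.user_agent) or 0
        rate = parser.request_rate(self.user_agent)
        if rate:
            delay = max(delay, rate.seconds / rate.requests)
        return delay

    def may_crawl(self, url):
        host = urlsplit(url).netloc
        # First check the allow/disallow rules for this user agent.
        if not self.parsers[host].can_fetch(self.user_agent, url):
            return False
        # Then check whether enough time has passed since the last crawl.
        last = self.last_crawl.get(host)
        if last is not None and self.clock() - last < self.min_interval(host):
            return False
        return True

    def record_crawl(self, url):
        # Store the crawl time for the page's site.
        self.last_crawl[urlsplit(url).netloc] = self.clock()


# Example rules for a made-up site.
ROBOTS = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
Request-rate: 1/10
""".splitlines()

now = [0.0]  # simulated clock value, in seconds
checker = PoliteChecker("ExampleBot", clock=lambda: now[0])
checker.load_rules("example.com", ROBOTS)
```

With these rules, a URL under `/private/` is refused regardless of timing, while other pages of the same site are refused only until the minimum interval since the last recorded crawl has elapsed.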
| Groups | ||
| PHP 5 | Classes using PHP 5 specific features |
| Searching | Search engines, crawling and indexing |
| Applications | ||||||
No application links were specified for this class.