| Classes of Karol Janyst | > | Spider website |
| Detailed description | ||
| This class can be used to crawl a site and retrieve the URLs of all its links. It retrieves a page of the site and follows its links recursively to collect all of the site's URLs. The class can restrict the crawl to URLs with a given extension, and it avoids pages disallowed by the site's robots.txt file or marked with the noindex or nofollow robots meta tags. |
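
The crawling behaviour described above can be illustrated with a minimal, language-neutral sketch. It is written in Python using only the standard library and is not the class's actual PHP API: the names `crawl`, `PageParser`, and the `fetch` callback are hypothetical. It also makes assumptions the description does not confirm: the crawl stays on the start URL's host, a noindex page is fetched but left out of the results, and links on a nofollow page are not followed.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse, urldefrag
from urllib.robotparser import RobotFileParser


class PageParser(HTMLParser):
    """Collects link targets and the robots meta directives of one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.noindex = False
        self.nofollow = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = (attrs.get("content") or "").lower()
            self.noindex = "noindex" in content
            self.nofollow = "nofollow" in content


def crawl(start_url, fetch, robots_txt="", extensions=None):
    """Breadth-first crawl of a single host; returns the indexable URLs.

    fetch(url) is a hypothetical callback returning the page HTML or None.
    robots_txt is the text of the site's robots.txt (assumed already loaded).
    extensions optionally restricts followed links, e.g. [".html"].
    """
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    host = urlparse(start_url).netloc

    queue, seen, found = deque([start_url]), {start_url}, []
    while queue:
        url = queue.popleft()
        if not robots.can_fetch("*", url):
            continue  # disallowed by robots.txt
        html = fetch(url)
        if html is None:
            continue  # page could not be retrieved
        page = PageParser()
        page.feed(html)
        if not page.noindex:
            found.append(url)
        if page.nofollow:
            continue  # do not follow links from this page
        for href in page.links:
            # Resolve relative links and drop any #fragment.
            link = urldefrag(urljoin(url, href)).url
            parts = urlparse(link)
            if parts.netloc != host or link in seen:
                continue
            if extensions and not parts.path.endswith(tuple(extensions)):
                continue
            seen.add(link)
            queue.append(link)
    return found
```

Passing the fetcher in as a callback keeps the sketch independent of any particular HTTP client; a dictionary's `get` method is enough to try it against a handful of in-memory pages.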
| Groups | ||
| HTML | HTML generation and processing |
| PHP 5 | Classes using PHP 5 specific features |
| Searching | Search engines, crawling and indexing |