PHP Classes

Crawler: Extract links and images from remote Web pages

  Author  
Name: Md. Shaiful islam (available for paid consulting)
Classes: 1 package
Country: Bangladesh
Age: 31
All time rank: 4269 in Bangladesh
Week rank: 371 (up), 14 in Bangladesh (down)
Innovation award
Nominee: 1x


  Detailed description  
This class can be used to extract links and images from remote Web pages.

It can access Web pages, parse the pages' HTML, and extract the URLs of the links and the images.

If necessary, the class may access a login page and emulate the submission of a login form so that subsequent accesses can be done on behalf of the logged-in user.
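
The extraction step described above can be illustrated with a minimal sketch. It uses PHP's bundled DOM extension rather than the Crawler class itself, whose method names are not shown on this page; the function name `extractLinksAndImages` is hypothetical, and the type declarations assume PHP 7 or later rather than the PHP 4.0 the package targets.

```php
<?php
// Sketch only: standard DOM extension, not the Crawler class's own API.
// The function name is hypothetical.
function extractLinksAndImages(string $html): array
{
    $doc = new DOMDocument();

    // Collect parser warnings internally; real-world HTML is often malformed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // href values of <a> elements, exactly as written in the page.
    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        if ($a->hasAttribute('href')) {
            $links[] = $a->getAttribute('href');
        }
    }

    // src values of <img> elements, exactly as written in the page.
    $images = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        if ($img->hasAttribute('src')) {
            $images[] = $img->getAttribute('src');
        }
    }

    return ['links' => $links, 'images' => $images];
}
```

The returned URLs are not resolved against the page address, so relative values such as `logo.png` would still need to be combined with the base URL before being fetched.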

Name: Crawler
Base name: crawler
Description: Extract links and images from remote Web pages
Version: 1.1
PHP version: 4.0
License: Freely Distributable
All time users: 6056 users
All time rank: 337
Week users: 5 users
Week rank: 338 (down)
 

  Groups  
HTML: HTML generation and processing
Web services: Web data clipping, SOAP or XML-RPC clients and servers


  Innovation Award  
PHP Programming Innovation award nominee
March 2008
Number 7


Prize: One copy of Delphi for PHP
Retrieving Web pages from remote sites is a relatively easy task in PHP.

If you want to crawl a site to search for something in its pages, you only need to retrieve the site's pages, use regular expressions to extract the links, and retrieve the linked pages until every page has been visited.
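
As an illustration of that loop, here is a minimal sketch under stated assumptions: the function name `crawlSite`, its parameters, and the injected `$fetch` callback are all hypothetical and are not taken from the Crawler package. It stays on the starting host, handles only absolute and root-relative links, and caps the number of pages visited.

```php
<?php
// Sketch only: names are illustrative, not part of the Crawler class.
// The fetcher is injected so the loop can be tried without network access.
function crawlSite(string $startUrl, callable $fetch, int $maxPages = 50): array
{
    $scheme  = parse_url($startUrl, PHP_URL_SCHEME);
    $host    = parse_url($startUrl, PHP_URL_HOST);
    $queue   = [$startUrl];
    $visited = [];

    while ($queue && count($visited) < $maxPages) {
        $url = array_shift($queue);
        if (isset($visited[$url])) {
            continue; // already retrieved
        }
        $visited[$url] = true;

        $html = $fetch($url);
        if (!is_string($html)) {
            continue; // fetch failed; nothing to parse
        }

        // Naive regex for quoted href values; a DOM parser is more robust.
        preg_match_all('/<a\s[^>]*href\s*=\s*["\']([^"\'#]+)/i', $html, $m);
        foreach ($m[1] as $href) {
            if (preg_match('#^https?://#i', $href)) {
                $next = $href;                    // absolute URL
            } elseif (strpos($href, '/') === 0 && strpos($href, '//') !== 0) {
                $next = "$scheme://$host$href";   // root-relative path
            } else {
                continue; // other relative forms are skipped in this sketch
            }
            // Follow only links on the starting host that are not yet visited.
            if (parse_url($next, PHP_URL_HOST) === $host && !isset($visited[$next])) {
                $queue[] = $next;
            }
        }
    }

    return array_keys($visited);
}
```

With `allow_url_fopen` enabled, passing `'file_get_contents'` as `$fetch` would retrieve real pages; a closure that returns canned HTML is enough to try the loop offline.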

However, if some pages can only be accessed by authenticated users, the problem is no longer so simple.

This package provides a more complete solution to the problem of crawling site pages by authenticating automatically, so it can access all pages restricted to logged-in users.
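
One common way to emulate a form login in PHP is to keep the session cookie on a cURL handle and reuse that handle for later requests. The sketch below shows that general technique, not the package's actual implementation; it assumes the cURL extension is loaded, and the function names, URLs, and form field names are placeholders chosen for illustration.

```php
<?php
// Sketch only: assumes the cURL extension and a form-based login.
// Function names are hypothetical, not part of the Crawler class.
function sessionCurlOptions(string $cookieFile): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // follow the redirect many login forms issue
        CURLOPT_COOKIEJAR      => $cookieFile, // cookies are written here when the handle is freed
        CURLOPT_COOKIEFILE     => $cookieFile, // enables cookie handling and loads saved cookies
    ];
}

function loginAndFetch(string $loginUrl, array $credentials, string $pageUrl, string $cookieFile)
{
    $ch = curl_init();
    curl_setopt_array($ch, sessionCurlOptions($cookieFile));

    // Step 1: submit the login form; the session cookie stays on the handle.
    curl_setopt($ch, CURLOPT_URL, $loginUrl);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($credentials));
    if (curl_exec($ch) === false) {
        return false; // login request failed
    }

    // Step 2: request the restricted page with the same handle and cookies.
    curl_setopt($ch, CURLOPT_URL, $pageUrl);
    curl_setopt($ch, CURLOPT_HTTPGET, true);

    // Returns the page HTML, or false on failure.
    return curl_exec($ch);
}
```

A call might look like `loginAndFetch('http://example.com/login.php', ['user' => 'name', 'pass' => 'secret'], 'http://example.com/members.php', '/tmp/cookies.txt')`, where every argument is a placeholder. Sites that add hidden tokens to their login forms would need an extra step to read the token before posting.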

Manuel Lemos

  User ratings  
Ratings   | Utility          | Consistency      | Documentation | Examples   | Tests | Videos | Overall        | Rank
All time: | Sufficient (77%) | Sufficient (72%) | -             | Good (80%) | -     | -      | Not sure (54%) | 1331
Month:    | Not yet rated by the users

  Pages that reference this package  
Crawler class for PHP - Get links and images with PHP
Today I am going to post a class for identifying links and images on sites...
SEO Tool: Search Engine position checker
I’ve just finished developing a stable release of quite an advanced SEO tool, I called it “position checker”...

  Applications that use this package  
No pages of applications that use this class were specified.

  Files  
File                      | Role    | Description
Crawler.php               | Class   | The class
ExampleCrawlImage.php     | Example | Crawl images from the http://www.phpclasses.org/ site
ExampleCrawlLink.php      | Example | Crawl links from the http://www.phpclasses.org/ site
ExampleLoginCrawlLink.php | Example | Log in and crawl links from a site
