PHP Classes

File: readme.txt

Role: Documentation
Content type: text/plain
Description: help file
Class: Site Checker
Find broken links in Web site pages
Author: By
Last change: Initial commit
Date: 2 months ago
Size: 2,517 bytes


Made by: Nadir Latif (

Dependencies: Uses the HTTP protocol client by Manuel Lemos and OverallTagView by Dror Golan, both available at phpclasses.

This script lists every page of the website it runs on that contains dead links. For each dead link it displays the URL of the page and the URL of the dead link.

1) Usage:

-Place the database connection information in get_missing_pages.php.
-Comment out the line $this->CheckCMSFiles(); if you only want to check physical pages.
-Comment out the line $this->CheckPhysicalFiles(); if you only want to check database pages.
-In the same file, specify the name of the table and field where the pages in your database are stored. This information is required in IsLinkValid($url) and CheckCMSFiles().
-Copy the files to a directory on a web server and run index.php.
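
To illustrate the two "comment out" steps above, here is a minimal sketch. Only the call names CheckPhysicalFiles() and CheckCMSFiles() come from this readme. The class name SiteCheckerSketch, the Run() method and the method bodies are hypothetical placeholders, so the real get_missing_pages.php may be organized quite differently.

```php
<?php
// Hypothetical outline only: the class name, Run() and the method bodies are
// assumptions. Only the two Check*() call names are taken from the readme.
class SiteCheckerSketch
{
    // Records which page sources were scanned.
    private $sources = array();

    public function Run()
    {
        $this->CheckPhysicalFiles();   // comment out to check database pages only
        // $this->CheckCMSFiles();     // commented out here: physical pages only
        return $this->sources;
    }

    private function CheckPhysicalFiles()
    {
        $this->sources[] = 'physical';
    }

    private function CheckCMSFiles()
    {
        $this->sources[] = 'cms';
    }
}
```

With the CheckCMSFiles() call commented out as shown, only the physical pages would be scanned.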

2) What does this script do?

The script first retrieves the list of all pages on the website: physical pages as well as pages stored in a database (e.g. a CMS). It then checks each link in each page against this list. The script also displays the URLs of pages that contain windows.location and windows.navigate.

For links to external sites, the script makes an HTTP request for the page and checks the status code. In this way links to external sites can be checked as well.

The script can also consult a list of backed-up files. For example, if there is a link to a page that has been removed but that still exists in some backup folder, the script will indicate that the page is missing but can be found in the backup folder. For this to work, the list of all files in the backup folder must be placed in the backup.csv file.

Sometimes URLs do not correspond to actual pages but are rewritten in .htaccess files, so the script also checks the .htaccess file. The user may also specify a list of folders to ignore.

This script can be useful on large sites, as it can be used to find dead internal and external page links. It can easily be extended to list pages that contain invalid links to images, JavaScript, CSS files, etc.
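
The internal-link check described above can be sketched roughly as follows. This is not the code from get_missing_pages.php. The function name classify_internal_link, its parameters and the returned labels are all hypothetical, and the sketch assumes the page list, the backup.csv list and the ignored folders have already been loaded into plain arrays of paths.

```php
<?php
/**
 * Classify one internal link (hypothetical helper, not part of the package).
 *
 * $sitePages   list of every known page path (physical and database pages)
 * $backupPages paths listed in backup.csv
 * $ignoredDirs folder prefixes the user asked to skip
 */
function classify_internal_link($link, array $sitePages, array $backupPages, array $ignoredDirs)
{
    // Compare paths only: parse_url() drops the query string and fragment.
    $path = parse_url($link, PHP_URL_PATH);
    if (!is_string($path) || $path === '') {
        return 'skipped';
    }

    // Links into ignored folders are not checked at all.
    foreach ($ignoredDirs as $dir) {
        if (strpos($path, rtrim($dir, '/') . '/') === 0) {
            return 'ignored';
        }
    }

    if (in_array($path, $sitePages, true)) {
        return 'ok';
    }

    // The page is gone from the site but still listed in the backup.
    if (in_array($path, $backupPages, true)) {
        return 'missing (found in backup)';
    }

    return 'missing';
}

// Example data, invented for illustration.
$pages  = array('/index.php', '/about.php');
$backup = array('/old/news.php');
$ignore = array('/tmp');

echo classify_internal_link('/old/news.php', $pages, $backup, $ignore), "\n";
```

Matching on the path alone keeps the comparison simple. URLs that are rewritten by .htaccess rules would need an extra lookup step that this sketch leaves out.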

3) List of files:

a) index.php (initial file)
b) get_missing_pages.php (main program file)
c) backup.csv (contains the list of URLs in a backup folder)
d) mysql.php, query_builder.php (provide database functions)
e) htaccess.txt (main .htaccess file of the server)
f) get_page.php (used to retrieve the status code of a page)
g) readme.txt (help file)
h) http.php (contains the HTTP functions)
i) TagView.class.php (contains functions for extracting tags from HTML pages)
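
As a rough illustration of what a status-code helper such as get_page.php has to do, the sketch below reads the code out of an HTTP status line. It does not use the bundled http.php client. The function names parse_status_code, is_live_status and fetch_status_code are hypothetical, and treating 2xx and 3xx responses as live links is an assumption rather than the package's documented rule.

```php
<?php
// Extract the numeric code from a status line such as "HTTP/1.1 404 Not Found".
// Returns null when the line is not a valid status line.
function parse_status_code($statusLine)
{
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $statusLine, $m)) {
        return (int) $m[1];
    }
    return null;
}

// Assumption: 2xx and 3xx count as a live link, anything else as dead.
function is_live_status($code)
{
    return $code !== null && $code >= 200 && $code < 400;
}

// Hypothetical wrapper around PHP's built-in get_headers().
// It needs network access and allow_url_fopen enabled. Element 0 of the
// returned array is the status line of the first response.
function fetch_status_code($url)
{
    $headers = @get_headers($url);
    return $headers === false ? null : parse_status_code($headers[0]);
}
```

Keeping the parsing separate from the network call makes the status logic easy to check without contacting any server.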

-Feel free to contact me for any assistance regarding this script.