I'm trying to get the spiderClass class to work, but I'm having a few issues. When I spider a site, the values returned have the root slash removed:
Also, when I start with a URL that includes a page, like:
pages within that directory are returned without the original directory:
For example, site.domain.com/page1/page.htm and page2.htm are returned as:
My regular expression is the site URL: "/http\:\/\/site\.domain\.com\/page\//"
Lastly, when I looked into the second problem, I found that the term "page.htm" is being sent to the parse_url() function. Should it be sending the full page URL instead?
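
To illustrate that last point, here is a minimal sketch of how PHP's parse_url() treats a bare relative name compared with a full URL. It is not taken from spiderClass: the resolve_link() helper is hypothetical, the URLs are just the ones from this question, and it only handles simple relative links (no "../", query strings, or fragments).

```php
<?php
// A bare relative name gives parse_url() only a 'path'; there is no
// scheme or host, so the directory cannot be recovered from it alone.
$relative = parse_url('page.htm');
$absolute = parse_url('http://site.domain.com/page1/page.htm');

// Hypothetical helper (not part of spiderClass): resolve a simple
// relative link against the URL of the page it was found on.
function resolve_link(string $base, string $link): string
{
    // Links that already carry a scheme are treated as absolute.
    if (isset(parse_url($link)['scheme'])) {
        return $link;
    }

    $parts = parse_url($base);
    $root  = $parts['scheme'] . '://' . $parts['host'];
    $path  = $parts['path'] ?? '/';

    // Root-relative link: keep the leading slash, drop the base path.
    if ($link !== '' && $link[0] === '/') {
        return $root . $link;
    }

    // Otherwise keep the base directory (everything up to the last '/').
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $root . $dir . $link;
}

echo resolve_link('http://site.domain.com/page1/page.htm', 'page2.htm'), "\n";
```

If the spider passes only "page.htm" to parse_url(), the result contains no host or directory, which would be consistent with the second problem described above. Resolving each link against the page it came from before parsing is one possible way around that.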