PHP Classes

Impossible string to extract


Subject: Impossible string to extract
Summary: Some IDs are displaying 'oos' when they are in the page
Messages: 3
Author: Vector Frog
Date: 2009-03-11 03:38:20
Update: 2009-03-11 16:52:21
 

 


  1. Impossible string to extract
Vector Frog - 2009-03-11 03:38:20
There are 2 IDs that just will not extract for me. I use the extractByID call, which works for all the other IDs, but on these two it just won't work. Here is one of the strings:

<span id="ctl00_cplhMainContent_lblFeatures" class="smalltext"><table id="tblSpecs"></td></tr><tr valign="top"><td class="spec_label_td"><span class="spec_label">Product Description</span></td><td class="spec_value_td"><span class="spec_value_alt">Ilford HP5 Plus - B/W film - 135 (35 mm) - ISO 400 - 36 - 50 rolls<br /></td></tr><tr valign="top"><td class="spec_label_td_alt"><span class="spec_label">Type Packaged Quantity</span></td><td class="spec_value_td_alt"><span class="spec_value_alt">Black & white print film<br /></td></tr><tr valign="top"><td class="spec_label_td"><span class="spec_label">Format</span></td><td class="spec_value_td"><span class="spec_value_alt">135 (35 mm)<br /></td></tr><tr valign="top"><td class="spec_label_td_alt"><span class="spec_label">Speed</span></td><td class="spec_value_td_alt"><span class="spec_value_alt">ISO 400<br /></td></tr><tr valign="top"><td class="spec_label_td"><span class="spec_label">Exposures per Roll</span></td><td class="spec_value_td"><span class="spec_value_alt">36<br /></td></tr><tr valign="top"><td class="spec_label_td_alt"><span class="spec_label">Roll Qty</span></td><td class="spec_value_td_alt"><span class="spec_value_alt">50 rolls<br /></td></tr></table></span>

To add to my frustration, I've written my own function that pulls the content between 2 strings. In this case I'm using the ID and </table> as the two strings. This returns blank as well! Any ideas?
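
For readers following along, a "content between 2 strings" function like the one described above might look roughly like this. It is only a sketch: the poster's actual function is not shown in the thread, and the name extractBetween, the demo ID, and the sample markup are all made up for illustration.

```php
<?php
// Hypothetical helper (not part of the HTML extractor class): returns the
// text between the first $start marker and the next $end marker after it,
// or null when either marker cannot be found.
function extractBetween(string $haystack, string $start, string $end): ?string
{
    $from = strpos($haystack, $start);
    if ($from === false) {
        return null; // start marker not found
    }
    $from += strlen($start);

    $to = strpos($haystack, $end, $from);
    if ($to === false) {
        return null; // end marker not found after the start marker
    }
    return substr($haystack, $from, $to - $from);
}

// Made-up markup, loosely modeled on the snippet quoted above.
$html  = '<span id="demo_lblFeatures"><table id="tblSpecs"><tr><td>ISO 400</td></tr></table></span>';
$inner = extractBetween($html, 'id="demo_lblFeatures"', '</table>');

// Escape before echoing so the raw markup stays visible in a browser.
echo htmlspecialchars((string) $inner), "\n";
```

Returning null for a missing marker makes "marker not found" distinguishable from a genuinely empty result, which can help when an extraction seems to come back blank.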

  2. Re: Impossible string to extract
Dror Golan - 2009-03-11 11:17:41 - In reply to message 1 from Vector Frog
Which of the IDs in the attached code can you not extract?
Because the class uses regular expressions, the "_" character might be problematic, so I suggest not using an ID name that contains it.


Dror.
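
A quick way to check whether "_" matters in a regular expression is sketched below. It assumes the class builds a PCRE pattern (preg_*) from the ID, which the thread does not confirm; the pattern here is invented for illustration and is not the class's real one. In PCRE, "_" is an ordinary word character, so preg_quote() leaves it untouched.

```php
<?php
// ID taken from the markup quoted in message 1; the surrounding span is simplified.
$id   = 'ctl00_cplhMainContent_lblFeatures';
$html = '<span id="' . $id . '" class="smalltext">Features</span>';

// preg_quote() escapes PCRE metacharacters; "_" is not one of them,
// so the quoted ID is identical to the original.
$quoted = preg_quote($id, '/');
var_dump($quoted === $id);

// Assumed pattern shape, for illustration only: capture the span's content.
$pattern = '/<span[^>]*\bid="' . $quoted . '"[^>]*>(.*?)<\/span>/s';
$found   = preg_match($pattern, $html, $m);

if ($found === 1) {
    echo $m[1], "\n";
}
```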

  3. I don't think it's the '_'
Vector Frog - 2009-03-11 16:52:21 - In reply to message 2 from Dror Golan
Thanks for the reply, Dror. I don't think it's the '_' character, because almost all of my IDs have that character and they all seem to work fine. Instead, I think the issue has more to do with a missing </table> tag in the ID contents.

I wrote a simple function that takes all the content between 2 strings and ran into the same problem. When I added the </table> tag to the end of the returned string, it worked fine. I have no idea why this is the case, but everything is working now, so I'll take it :).

Thanks Dror, and I love this class. It's fantastic!
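
The workaround from message 3 can be sketched as follows. The markup and the demo ID are made up. A between-two-strings extraction normally excludes the end marker, so the returned fragment contains an opening <table> with no closing tag. One possible explanation for the "blank" result (a guess, not something confirmed in this thread) is that the fragment was being viewed in a browser, which may not render an unclosed table as expected.

```php
<?php
// Hypothetical markup: the span's opening tag marks the start, '</table>' the end.
$html  = '<span id="demo_lblFeatures"><table><tr><td>ISO 400</td></tr></table></span>';
$start = '<span id="demo_lblFeatures">';
$end   = '</table>';

// Both markers are known to exist in this sample, so strpos() cannot fail here.
$from = strpos($html, $start) + strlen($start);
$to   = strpos($html, $end, $from);

// The end marker itself is excluded, so the fragment has an unclosed <table>.
$fragment = substr($html, $from, $to - $from);

// Workaround described in message 3: put the closing tag back.
$fixed = $fragment . $end;

// Escaping the output shows the raw string regardless of how a browser renders it.
echo htmlspecialchars($fixed), "\n";
```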