Nice, but works only on some pdf documents, not all of them.

Subject:Nice, but works only on some pdf documents, not all of them.
Summary:Package rating comment
Messages:3
Author:Issam
Date:2010-08-18 15:30:38
Update:2013-05-05 12:26:10
 

Issam rated this package as follows:

Utility: Sufficient
Consistency: Good

  1. Nice, but works only on some pdf documents, not all of them.
Issam
2010-08-18 15:30:38
Nice, but works only on some pdf documents, not all of them.

Thanks.

  2. Re: Nice, but works only on some pdf documents, not all of them.
Juan
2010-09-12 23:52:14 - In reply to message 1 from Issam
Hi. How did you get this to work? I've tried many PDF docs but had no luck at all. Thanks.

  3. Re: Nice, but works only on some pdf documents, not all of them.
Carsten Jensen
2013-05-05 12:26:10 - In reply to message 2 from Juan
You don't mention whether the PDFs you are trying to parse actually have text in them, or whether they are only scanned images.

For scanned images you probably want to do OCR.

It's easy to check whether the docs have text in them (meaning they went through a PDF "printer" or were saved directly as PDF): just zoom in. If the text starts to pixelate, the pages are scanned images. If the text still looks sharp, it's real text.

The Select Text tool in Acrobat can't be trusted for this test, as it actually does OCR.
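
If you would rather check this programmatically than by zooming in, here is a rough sketch of the same idea, written in Python rather than PHP for brevity. It is not part of the PDF Text Extractor package; the function name `looks_like_text_pdf` is made up for illustration. It assumes the content streams are Flate-compressed (common, but not guaranteed) and simply looks for text-showing operators (`Tj` / `TJ`) inside a `BT ... ET` block. Treat it as a heuristic only: streams that are uncompressed or use other filters are skipped, and a scan that already carries an OCR text layer will also report text.

```python
import re
import zlib

# Raw bytes between the "stream" and "endstream" keywords of a PDF object.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
# A text object (BT ... ET) that contains a text-showing operator (Tj or TJ).
TEXT_OP_RE = re.compile(rb"\bBT\b.*?(?:Tj|TJ)\b.*?\bET\b", re.DOTALL)


def looks_like_text_pdf(data: bytes) -> bool:
    """Heuristic: True if any Flate-compressed stream draws text."""
    for match in STREAM_RE.finditer(data):
        raw = match.group(1)
        try:
            # decompressobj tolerates trailing bytes (e.g. the newline
            # that usually sits before "endstream").
            content = zlib.decompressobj().decompress(raw)
        except zlib.error:
            # Not Flate data (e.g. a JPEG image stream): skip it.
            continue
        if TEXT_OP_RE.search(content):
            return True
    return False
```

To try it on a real file, read the bytes and pass them in, for example `looks_like_text_pdf(open("some.pdf", "rb").read())`, where `some.pdf` is a placeholder name. If it returns False, the file is probably image-only and needs OCR before any text extractor can do something useful with it.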