Carsten Jensen - 2013-05-05 12:26:10 - In reply to message 2 from Juan
You don't mention whether the PDFs you are trying to parse actually have text in them, or whether they are only scanned images.
For scanned images you probably want to do OCR.
It's easy to check whether the docs have text in them (i.e. they have gone through a PDF "printer" or were saved directly as PDF): just zoom in. If the text starts to pixelate, the pages are scanned images; if it still looks sharp, it's real text.
The Select Text tool in Acrobat can't be trusted for this test, as it actually does OCR.
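If you have many files and zooming in by hand isn't practical, a rough automated check is possible. The sketch below is only a heuristic, not a real PDF parser: it assumes that a document with real text references a font resource (`/Font`), while a pure scan only contains image objects (`/Subtype /Image`). The function names `pdf_bodies` and `guess_pdf_kind` are made up for this example. Note that a scan which has already been OCR'd and saved usually carries an invisible text layer with a font, so it will be reported as "text".

```python
import re
import zlib


def pdf_bodies(data: bytes):
    """Yield the raw file bytes plus every stream that inflates with zlib.

    Newer PDFs may keep page dictionaries inside compressed object
    streams, so the raw bytes alone can miss the markers we look for.
    """
    yield data
    for match in re.finditer(rb"stream\r?\n(.*?)endstream", data, re.DOTALL):
        try:
            # decompressobj tolerates trailing bytes (e.g. the newline
            # that precedes "endstream").
            yield zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            # Not Flate-compressed (e.g. a JPEG image stream); skip it.
            continue


def guess_pdf_kind(data: bytes) -> str:
    """Return 'text', 'scanned' or 'unknown' for the given PDF bytes."""
    has_font = False
    has_image = False
    for body in pdf_bodies(data):
        if b"/Font" in body:
            has_font = True
        if re.search(rb"/Subtype\s*/Image", body):
            has_image = True
    if has_font:
        return "text"      # a font is referenced, so text is likely present
    if has_image:
        return "scanned"   # only images found, OCR is probably needed
    return "unknown"
```

You would call it with the file contents, e.g. `guess_pdf_kind(open("some.pdf", "rb").read())` (the file name is a placeholder). Treat the answer as a hint and spot-check a few files with the zoom test before relying on it.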