PHP Classes

What is the best PHP PDF table parser class?: Extract data from a table in a multipage PDF


by Francesco Facco de Lagarda - 4 months ago (2019-12-03)

Extract data from a table in a multipage PDF


I need to extract data from rows and columns of a table from a PDF file.

The PDF document contains a 5 column table. I need to extract the data from it.

None of the libraries I have tried could recognize the table structure, so they can't accurately extract the data contained in the individual cells.

  • 2 Clarification requests
  • 2. by Marco van Oostende - 1 month ago (2020-02-18)

    It depends very much on the quality of the PDF. It is not uncommon for cell content to be scattered around the table, or for the text to come out as gibberish. I would suggest simply copying the text you want from that table to the clipboard and pasting it into Notepad or any other plain-text tool. This should give you an indication of what is actually possible: if you can find a structure in that text, the package suggested in reply 1 may work. Chances are it won't, however.

    • 1. by Manuel Lemos - 1 month ago (2020-02-18)

      Parsing and extracting data from PDF documents is not an easy task due to the complexity of that kind of document.

      There is this PDF document parser, but I am not sure whether it handles tables in PDF documents well. Could you please try it and let us know if it works well for you?

      https://www.phpclasses.org/package/9732-PHP-Extract-text-contents-from-PDF-files.html
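
The check suggested in reply 2 (paste the copied table text somewhere plain and see whether any structure survives) can be roughed out in PHP. This is only a sketch under assumptions that do not come from the thread: the function names splitTableLines and columnMatchRatio are hypothetical, the sample text is made up, and columns are assumed to be separated by a tab or by two or more whitespace characters. It does not read the PDF itself; it only measures how many lines of already-extracted text split into the expected five cells.

```php
<?php
// Hypothetical helper: split each non-empty line of copied PDF text into cells.
// Assumption: a tab or a run of two or more whitespace characters separates columns.
function splitTableLines(string $text): array
{
    $rows = [];
    foreach (preg_split('/\R/', $text) as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // skip blank lines
        }
        $rows[] = preg_split('/\s{2,}|\t/', $line);
    }
    return $rows;
}

// Hypothetical helper: fraction of rows that split into exactly $expected cells.
function columnMatchRatio(array $rows, int $expected): float
{
    if (count($rows) === 0) {
        return 0.0;
    }
    $matching = array_filter($rows, function (array $cells) use ($expected) {
        return count($cells) === $expected;
    });
    return count($matching) / count($rows);
}

// Made-up sample standing in for text pasted from a PDF viewer.
// The last line uses single spaces only, so it will not split into cells.
$sample = "ID  Name  Qty  Price  Total\n"
        . "1  Widget  2  3.50  7.00\n"
        . "2 Gadget 1 9.99 9.99\n";

$rows  = splitTableLines($sample);
$ratio = columnMatchRatio($rows, 5);
printf("%d of %d lines split into 5 columns\n", (int) round($ratio * count($rows)), count($rows));
```

A ratio close to 1 suggests the text keeps enough structure for a text-based extractor to be worth trying; a low ratio points to the cluttered output described in reply 2, where cell boundaries are lost before any parsing starts.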
