PHP Classes

Decode PDF, ODT, Word, DOC, DOCX, RTF: Need a decoder for multiple formats

Decode PDF, ODT, Word, DOC, DOCX, RTF

by herman lapre - 1 year ago (2016-02-20)

Need a decoder for multiple formats

+7

The PHP file reader must be able to read PDF, ODT, DOC, DOCX and RTF documents.

  • 4 Clarification requests
  • 10. by Nitin Shukla - 7 months ago (2017-02-16)

    I want to convert .doc, .rtf and .docx files into an HTML page without losing any content or style (bullets, tables, text formatting, etc.). Can anyone please point me to a library or script that can handle all of my requirements?

    A script for an individual format would also work for me.

    Thanks, Nitin

    • 9. by Backiaraj - 9 months ago (2016-12-02)

      i like

      • 2. by Christian Vigh - 1 year ago (2016-02-26)

        Please clarify your request: which data do you expect from the PDF/ODT/DOC/DOCX and RTF document reader? Do you want to manipulate document elements after decoding? Do you want to be able to modify the document after decoding? Or do you simply want to display the document contents on a web page?

        • 3. by Manuel Lemos - 1 year ago (2016-02-27) in reply to comment 2 by Christian Vigh

          According to the request tags he wants a file viewer for those formats. So I suppose something that converts those formats to images will be helpful.

          It seems that OpenOffice/LibreOffice can be used for that purpose. The soffice program has options that make it open a given file, convert it to some other format, such as Web pages with pictures, and then exit without opening the GUI.

          So it can run from the console using the options --headless and --convert-to.

        • 4. by Christian Vigh - 1 year ago (2016-02-27) in reply to comment 3 by Manuel Lemos

          I have had some experience with OpenOffice/LibreOffice for converting .DOC/.DOCX to .PDF documents. I have encountered some formatting issues, especially with tables, but in general it works well.

          In addition, the unoconv script provides a command-line interface for doing the conversion.

          However, as far as I can remember, it requires the OpenOffice daemon to be up and running.

          I don't know if this could address Herman's needs?

        • 5. by Manuel Lemos - 1 year ago (2016-02-27) in reply to comment 4 by Christian Vigh

          You do not need to have the OpenOffice daemon running. You can just start OpenOffice on demand to do the format conversion using the soffice command with the options mentioned above. So you do not need the unoconv script either.

          Starting OpenOffice as a daemon has the advantage of keeping OpenOffice running in memory, just in case you need to convert many documents without delay. In that case you would use a script like unoconv to communicate with the daemon.

        • 6. by satya teja - 1 year ago (2016-07-11) in reply to comment 2 by Christian Vigh

          Hi, I have the same question. I simply want to display the document's contents in their respective fields; for example, if I upload a resume, the data must be displayed in fields like first name, last name, etc.

        • 7. by Muhammad Khalid Chaudhary - 10 months ago (2016-11-02) in reply to comment 4 by Christian Vigh

          Can you explain how to convert .DOC/.DOCX to .PDF documents using PHP and OpenOffice/LibreOffice?

        • 8. by Manuel Lemos - 10 months ago (2016-11-02) in reply to comment 7 by Muhammad Khalid Chaudhary

          I do not remember exactly. You need to check the documentation but I think it is something pretty easy. What may be hard is to have OpenOffice installed on the server. In any case maybe somebody can publish a class that can do that for you.
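
For readers with the same question, the on-demand approach described in this thread (running soffice with --headless and --convert-to) might look roughly like this from PHP. This is only a sketch under several assumptions: soffice is installed and on the PATH, PHP is allowed to call exec(), and the web server user can write to the output directory. The function names buildConvertCommand() and convertDocument() are hypothetical, and the --outdir option, which tells soffice where to write the result, is not mentioned in the thread.

```php
<?php
declare(strict_types=1);

/**
 * Build the soffice command line for a headless conversion.
 * Hypothetical helper; $format is the value given to --convert-to, e.g. "pdf" or "html".
 */
function buildConvertCommand(
    string $inputFile,
    string $outputDir,
    string $format = 'pdf',
    string $soffice = 'soffice' // assumption: the binary is on the PATH
): string {
    return implode(' ', [
        escapeshellarg($soffice),
        '--headless',
        '--convert-to', escapeshellarg($format),
        '--outdir', escapeshellarg($outputDir),
        escapeshellarg($inputFile),
    ]);
}

/**
 * Start soffice on demand, convert one file, and return the path of the result.
 */
function convertDocument(string $inputFile, string $outputDir, string $format = 'pdf'): string
{
    if (!is_readable($inputFile)) {
        throw new InvalidArgumentException("Cannot read input file: $inputFile");
    }

    $output = [];
    $exitCode = 0;
    exec(buildConvertCommand($inputFile, $outputDir, $format) . ' 2>&1', $output, $exitCode);
    if ($exitCode !== 0) {
        throw new RuntimeException("soffice failed ($exitCode): " . implode("\n", $output));
    }

    // soffice names the result after the input file, with the target extension.
    $result = rtrim($outputDir, '/\\') . DIRECTORY_SEPARATOR
        . pathinfo($inputFile, PATHINFO_FILENAME) . '.' . $format;

    // Check for the file as well, since a zero exit code alone is not proof of success.
    if (!is_file($result)) {
        throw new RuntimeException("Conversion produced no output file: $result");
    }
    return $result;
}
```

A call such as convertDocument('/path/to/resume.docx', sys_get_temp_dir()) would then return the path of the generated PDF. Each call starts a fresh soffice process, which is simple but slow for many documents; that is the case where the daemon plus unoconv setup mentioned earlier in the thread becomes attractive.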

      • 1. by Manuel Lemos - 1 year ago (2016-02-26)

        There are packages that can render some of those formats as images that you can display on a Web page.

        There are no packages for all those formats, but some of them could be added later using external programs to render the files as images.

        That could be an innovative solution.

        1 Recommendation

        ApiLayer API Encapsulation: Send requests to APILayer REST APIs

        +4

        by Christian Vigh (package author, reputation 370) - 1 year ago (2016-02-26)

        As Manuel said, there is currently no universal solution for that. The package referenced here can capture HTML content and generate either an image or a PDF document, using a third-party web service.

        • 3 Comments
        • 1. by Dave Smith - 1 year ago (2016-02-29)

          While I support apiLayer, I do not think this is what the requester is looking for. They do NOT want to convert HTML; they want to view RTF, Office DOC and DOCX, and OpenOffice ODT files.

        • 2. by Dave Smith - 1 year ago (2016-02-29) in reply to comment 1 by Dave Smith

          Forgot to mention that they also want to read Adobe PDF files, not create them.

        • 3. by herman lapre - 1 year ago (2016-03-04) in reply to comment 1 by Dave Smith

          That is exactly what I need. I have researched TET, Tika, several PDF decoders, etc., but they all cover only part of the problem. Migrating to, e.g., an Elasticsearch solution is a bit overkill for me. I need a reader that can extract the plain text from PDF, ODT, DOCX, DOC, RTF documents and the like.
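
Two of the requested formats can be read for plain text without any external program, because DOCX and ODT files are ZIP archives whose main text lives in an XML part (word/document.xml and content.xml respectively). The sketch below illustrates that idea only; it is not a package recommended in this thread. It assumes PHP's zip extension is available, the function names xmlToPlainText() and extractPlainText() are hypothetical, and it ignores tabs, line breaks, headers, footers and footnotes. PDF, DOC and RTF are not covered and would still need a dedicated decoder or an external converter.

```php
<?php
declare(strict_types=1);

/**
 * Reduce a document XML part to plain text, one line per paragraph.
 * $paragraphTags lists the element names that end a paragraph.
 */
function xmlToPlainText(string $xml, array $paragraphTags): string
{
    // Put a newline after each closing paragraph tag so paragraphs stay separated.
    foreach ($paragraphTags as $tag) {
        $xml = str_replace("</$tag>", "</$tag>\n", $xml);
    }
    // Drop all markup first, then decode XML entities such as &amp; in the remaining text.
    $text = html_entity_decode(strip_tags($xml), ENT_QUOTES | ENT_XML1, 'UTF-8');
    return trim($text);
}

/**
 * Extract plain text from a .docx or .odt file (requires the zip extension).
 */
function extractPlainText(string $file): string
{
    // Main XML part and paragraph-level elements for each supported format.
    $formats = [
        'docx' => ['word/document.xml', ['w:p']],
        'odt'  => ['content.xml', ['text:p', 'text:h']],
    ];

    $ext = strtolower(pathinfo($file, PATHINFO_EXTENSION));
    if (!isset($formats[$ext])) {
        throw new InvalidArgumentException("Unsupported format: $ext");
    }
    [$entry, $tags] = $formats[$ext];

    $zip = new ZipArchive();
    if ($zip->open($file) !== true) {
        throw new RuntimeException("Cannot open archive: $file");
    }
    $xml = $zip->getFromName($entry);
    $zip->close();

    if ($xml === false) {
        throw new RuntimeException("Missing $entry in $file");
    }
    return xmlToPlainText($xml, $tags);
}
```

Calling extractPlainText('/path/to/resume.docx') would return the body text as a single string, which is about as far as this approach goes; anything layout-dependent needs a real parser.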

