PHP PDF to HTML: Convert PDF to HTML using Poppler

Version: pdf-to-html 1.0.7
License: GNU General Publi...
PHP version: 5.4
Categories: PHP 5, Utilities and Tools, Files and..., C...

This class can convert PDF to HTML using the Poppler command-line tools.

It takes the paths of the Poppler tools and runs them to extract information from PDF documents.

Currently the class can convert whole PDF documents or individual pages to HTML, get the document information, and return the page count.

Several parameters can be configured, such as the preferred format of the pictures inside the document, the zoom scale, and whether images and CSS are inlined within the HTML or kept as external files.
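To make the document information and page count more concrete, here is a minimal sketch of how text printed by Poppler's pdfinfo tool could be turned into a PHP array. The function name parsePdfInfoOutput and the sample text are illustrative assumptions, not part of this package; the class's own getInfo() and countPages() methods may work differently.

```php
<?php
// Hypothetical helper: turn "Key: value" lines, as printed by pdfinfo,
// into an associative array. Not part of the package's API.
function parsePdfInfoOutput($output)
{
    $info = array();
    foreach (preg_split('/\R/', trim($output)) as $line) {
        // Split on the first colon only, because values such as dates may contain colons.
        $parts = explode(':', $line, 2);
        if (count($parts) === 2) {
            $info[trim($parts[0])] = trim($parts[1]);
        }
    }
    return $info;
}

// Sample text shaped like pdfinfo output (made up for illustration).
$sample = "Title:          Example\nPages:          3\nPage size:      595 x 842 pts (A4)\n";
$info = parsePdfInfoOutput($sample);

// The page count is the integer value of the "Pages" entry, or 0 if it is absent.
$pageCount = isset($info['Pages']) ? (int) $info['Pages'] : 0;
echo $pageCount, "\n";
```

In a real script the input would be the captured output of the pdfinfo binary rather than a hard-coded string.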



Name: Anton N Nikolaev <contact>
Country: Russian Federation




This PHP class converts your PDF files to HTML using poppler-utils.


Big thanks to Mochamad Gufron (mgufrone)! This package is based on his package (

Important Notes

Please see how to use below.


From your application's root directory, run this command to add the package to your app:

  composer require tonchik-tm/pdf-to-html:~1

Or add this package to the "require" section of your composer.json as "tonchik-tm/pdf-to-html": "~1".



1. Install Poppler-Utils

Debian/Ubuntu

sudo apt-get install poppler-utils

Mac OS X

brew install poppler


Windows

To use this package on Windows, first download poppler-utils for Windows here <> and get the latest binary.

After downloading it, extract it.

2. Find where the utilities are located

Debian/Ubuntu

$ whereis pdftohtml
pdftohtml: /usr/bin/pdftohtml

$ whereis pdfinfo
pdfinfo: /usr/bin/pdfinfo

Mac OS X

$ which pdfinfo

$ which pdftohtml


Windows

Go into the extracted directory. It contains a directory called bin; that is the one we need.
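The same lookup can be done from PHP, which is handy when the paths differ between machines. This is a minimal sketch: the helper name findExecutable is hypothetical and not part of the package, and it only searches the directories listed in the PATH environment variable.

```php
<?php
// Hypothetical helper: look for an executable in a list of directories,
// roughly what `which` does. Not part of the package's API.
function findExecutable($name, array $directories)
{
    foreach ($directories as $dir) {
        if ($dir === '') {
            continue; // skip empty PATH entries
        }
        $candidate = rtrim($dir, '/\\') . DIRECTORY_SEPARATOR . $name;
        if (is_file($candidate) && is_executable($candidate)) {
            return $candidate;
        }
    }
    return null; // not found in any of the directories
}

// Search the directories listed in the PATH environment variable.
$path = getenv('PATH');
$directories = $path === false ? array() : explode(PATH_SEPARATOR, $path);

// On Windows the file names would be pdftohtml.exe and pdfinfo.exe.
$pdftohtml = findExecutable('pdftohtml', $directories);
$pdfinfo = findExecutable('pdfinfo', $directories);

echo 'pdftohtml: ', ($pdftohtml === null ? 'not found' : $pdftohtml), "\n";
echo 'pdfinfo: ', ($pdfinfo === null ? 'not found' : $pdfinfo), "\n";
```

The resulting paths could then be passed as the pdftohtml_path and pdfinfo_path options.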

3. PHP Configuration with shell access enabled
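Because the class runs external programs, PHP must be allowed to start processes. The sketch below is one way to check this before using the class. Which process function the class calls internally is not stated here, so the list of functions checked is an assumption, and the helper name isFunctionUsable is hypothetical.

```php
<?php
// Hypothetical helper: report whether a PHP function can be called, given the
// value of the disable_functions ini setting. Not part of the package's API.
function isFunctionUsable($name, $disabledList)
{
    $disabled = array_map('trim', explode(',', $disabledList));
    return function_exists($name) && !in_array($name, $disabled, true);
}

// disable_functions is a comma-separated list set in php.ini.
$disabledList = (string) ini_get('disable_functions');

foreach (array('exec', 'shell_exec', 'proc_open') as $function) {
    $status = isFunctionUsable($function, $disabledList) ? 'available' : 'disabled';
    echo $function, ': ', $status, "\n";
}
```

If the functions are reported as disabled, the disable_functions setting in php.ini is the usual place to look.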



<?php
// if you are using composer, just use this
include 'vendor/autoload.php';

// initiate
$pdf = new \TonchikTm\PdfToHtml\Pdf('test.pdf', [
    'pdftohtml_path' => '/usr/bin/pdftohtml',
    'pdfinfo_path' => '/usr/bin/pdfinfo'
]);

// example for windows
// $pdf = new \TonchikTm\PdfToHtml\Pdf('test.pdf', [
//     'pdftohtml_path' => '/path/to/poppler/bin/pdftohtml.exe',
//     'pdfinfo_path' => '/path/to/poppler/bin/pdfinfo.exe'
// ]);

// get pdf info
$pdfInfo = $pdf->getInfo();

// get the number of pages
$countPages = $pdf->countPages();

// get content from one page
$contentFirstPage = $pdf->getHtml()->getPage(1);

// get content from all pages and loop over them
foreach ($pdf->getHtml()->getAllPages() as $page) {
    echo $page . '<br/>';
}
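When the goal is to embed the converted document in a single web page, the per-page strings can be wrapped and joined instead of echoed one by one. A minimal sketch, assuming each page is an HTML string as in the loop above; the helper name wrapPages, the pdf-page CSS class and the page-N ids are hypothetical and not part of the package.

```php
<?php
// Hypothetical helper: wrap each page's HTML in a numbered <div> and join them.
// $pages is assumed to be a list of HTML strings, one per PDF page.
function wrapPages($pages)
{
    $blocks = array();
    $number = 1;
    foreach ($pages as $html) {
        $blocks[] = '<div class="pdf-page" id="page-' . $number . '">' . $html . '</div>';
        $number++;
    }
    return implode("\n", $blocks);
}

// Stand-in data; with the package this could come from $pdf->getHtml()->getAllPages().
$pages = array('<p>First page</p>', '<p>Second page</p>');
echo wrapPages($pages), "\n";
```

Giving every page its own id makes it possible to link to a page with a fragment such as #page-2.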

Full list of settings:


$full_settings = [
    'pdftohtml_path' => '/usr/bin/pdftohtml', // path to pdftohtml
    'pdfinfo_path' => '/usr/bin/pdfinfo', // path to pdfinfo

    'generate' => [ // settings for generating html
        'singlePage' => false, // we want separate pages
        'imageJpeg' => false, // we want png image
        'ignoreImages' => false, // we need images
        'zoom' => 1.5, // scale pdf
        'noFrames' => false, // we want separate pages
    ],

    'clearAfter' => true, // auto clear output dir (if removeOutputDir==false then output dir will remain)
    'removeOutputDir' => true, // remove output dir
    'outputDir' => '/tmp/'.uniqid(), // output dir

    'html' => [ // settings for processing html
        'inlineCss' => true, // replaces css classes with inline css rules
        'inlineImages' => true, // looks for images in html and replaces the src attribute with base64-encoded image data
        'onlyContent' => true, // takes only the content of the html body
    ],
];
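To show how the 'generate' settings relate to Poppler, here is a rough sketch of how they might translate into a pdftohtml command line. The flags (-s, -i, -noframes, -zoom, -fmt) are real pdftohtml options, but the mapping from setting names to flags is an assumption made for illustration, and buildPdfToHtmlCommand is a hypothetical function, not the package's implementation.

```php
<?php
// Hypothetical mapping from the 'generate' settings to pdftohtml flags.
// The flags exist in pdftohtml; the mapping itself is an assumption.
function buildPdfToHtmlCommand($binary, $pdfFile, $outputBase, array $generate)
{
    $args = array();
    if (!empty($generate['singlePage'])) {
        $args[] = '-s'; // one HTML document that includes all pages
    }
    if (!empty($generate['ignoreImages'])) {
        $args[] = '-i'; // ignore images
    }
    if (!empty($generate['noFrames'])) {
        $args[] = '-noframes'; // generate no frames
    }
    if (isset($generate['zoom'])) {
        $args[] = '-zoom ' . escapeshellarg((string) (float) $generate['zoom']);
    }
    // Image format for extracted pictures: jpg or png.
    $args[] = '-fmt ' . (!empty($generate['imageJpeg']) ? 'jpg' : 'png');

    // Quote every path so spaces and special characters are safe in the shell.
    return escapeshellarg($binary) . ' ' . implode(' ', $args)
        . ' ' . escapeshellarg($pdfFile) . ' ' . escapeshellarg($outputBase);
}

$generate = array(
    'singlePage' => false,
    'imageJpeg' => false,
    'ignoreImages' => false,
    'zoom' => 1.5,
    'noFrames' => false,
);
echo buildPdfToHtmlCommand('/usr/bin/pdftohtml', 'test.pdf', '/tmp/out/test', $generate), "\n";
```

Building the command from an array of arguments keeps each setting independent, so a flag is simply left out when its setting is false.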

Feedback & Contribute

Open an issue for improvements or anything buggy. I love to help and solve other people's problems. Thanks :+1:

Files
File            Role   Description
src (3 files)
composer.json   Data   Auxiliary data
LICENSE         Lic.   License text
                Doc.   Documentation

Files / src
File       Role    Description
Base.php   Class   Class source
Html.php   Class   Class source
Pdf.php    Class   Class source
