PHP Classes
elePHPant
Icontem

File: README.md

Recommend this page to a friend!
  Classes of Lars Moelleken  >  PHP HTML to Text Conversion  >  README.md  >  Download  
File: README.md
Role: Documentation
Content type: text/markdown
Description: Documentation
Class: PHP HTML to Text Conversion
Parse HTML and extract text contained in it
Author: By
Last change: Update README.md
Date: 4 months ago
Size: 4,105 bytes
 

Contents

Class file image Download

Build Status codecov.io Coverage Status Scrutinizer Code Quality Codacy Badge SensioLabsInsight Dependency Status Reference Status Latest Stable Version Total Downloads Latest Unstable Version PHP 7 ready License

Html2Text

WARNING: this is only a Maintained-Fork of "https://github.com/mtibben/html2text/"

A PHP library for converting HTML to formatted plain text.

Installation

The recommended installation way is through Composer.

$ composer require voku/html2text

Basic Usage

$html = new \voku\Html2Text\Html2Text('Hello, &quot;<b>world</b>&quot;');

echo $html->getText();  // Hello, "WORLD"

Extended Usage

Each element (h1, li, div, etc) can have the following options:

  • 'case' => convert case (`Html2Text::OPTION_NONE, Html2Text::OPTION_UPPERCASE, Html2Text::OPTION_LOWERCASE , Html2Text::OPTION_UCFIRST, Html2Text::OPTION_TITLE`)
  • 'prepend' => prepend a string
  • 'append' => append a string

For example:

$html = '<h1>Should have "AAA" changed to BBB</h1><ul><li>• Custom bullet should be removed</li></ul><img alt="The Linux Tux" src="tux.png" />';
$expected = 'SHOULD HAVE "BBB" CHANGED TO BBB' . "\n\n" . '- Custom bullet should be removed |' . "\n\n" . '[IMAGE]: "The Linux Tux"';

$html2text = new Html2Text(
    $html,
    array(
        'width'    => 0,
        'elements' => array(
            'h1' => array(
              'case' => Html2Text::OPTION_UPPERCASE, 
              'replace' => array('AAA', 'BBB')),
            'li' => array(
              'case' => Html2Text::OPTION_NONE, 
              'replace' => array('•', ''), 
              'prepend' => "- ",
              'append' => " |",
            ),
        ),
    )
);

$html2text->setPrefixForImages('[IMAGE]: ');
$html2text->setPrefixForLinks('[LINKS]: ');
$html2text->getText(); // === $expected

Live Demo

  • HTML | TEXT
  • https://moelleken.org/url_to_text.php?url=https://ADD_YOUR_URL_HERE

History

This library started life on the blog of Jon Abernathy http://www.chuggnutt.com/html2text

A number of projects picked up the library and started using it - among those was RoundCube mail. They made a number of updates to it over time to suit their webmail client.

Now it has been extracted as a standalone library. Hopefully it can be of use to others.