Processes input HTML code.
Features:
- Parses input text and extracts HTML tag information;
- Removes unwanted HTML tags (configurable whitelist/blacklist);
- Removes unwanted HTML tag attributes depending on their name/value;
- Formats all HTML tags/attributes as lowercase/uppercase;
- Detects URLs in text and converts them to HTML links;
- Encodes HTML special characters to entities, similar to PHP's built-in functions but more configurable;
- Decodes HTML entities (named and numeric) to the corresponding Unicode characters (to avoid the problem of double entity encoding);
- Uses an additional map file to decode unknown HTML entities;
- Supports UTF-8 and ASCII/ISO-8859-1 charsets when decoding HTML entities;
- Obfuscates properly written e-mail addresses using HTML entities;
- Converts newline characters to <br />\n;
- Strips out the text inside HTML comments;
- Replaces arbitrary strings in text (array search/replace mapping).
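
The double entity encoding problem mentioned above can be illustrated with PHP's built-in functions. This is only a sketch using `html_entity_decode()` and `htmlspecialchars()`, not this class's own methods, and the sample string is made up for illustration:

```php
<?php
// Illustrative only: uses PHP built-ins, not this class's own methods.
// The input already contains some entities plus one raw ampersand.
$input = 'Fish &amp; chips &lt; 5 &euro;, caf&eacute; & bar';

// Encoding the raw input directly turns "&amp;" into "&amp;amp;".
$doubleEncoded = htmlspecialchars($input, ENT_QUOTES, 'UTF-8');

// Decode named and numeric entities to Unicode characters first...
$decoded = html_entity_decode($input, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// ...then encode the HTML special characters exactly once.
$encoded = htmlspecialchars($decoded, ENT_QUOTES, 'UTF-8');

echo $encoded, "\n";
// prints: Fish &amp; chips &lt; 5 €, café &amp; bar
```

Decoding before encoding normalizes mixed input, so text that is already partly encoded and text that is not end up in the same form.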
You can pass the input text to each command individually, or pass it once to the constructor and chain the commands.
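
The two calling styles might look roughly like the sketch below. The class name `HtmlProcessorSketch` and its methods `stripComments()`, `newlinesToBr()` and `get()` are hypothetical names invented for this illustration; they are not the package's real API, so consult the class itself for the actual names.

```php
<?php
// Hypothetical sketch: class and method names are invented for illustration
// and are not this package's real API.
class HtmlProcessorSketch
{
    private $text;

    // Optional input: when given, later calls operate on the stored text.
    public function __construct($text = null)
    {
        $this->text = $text;
    }

    // If $text is passed, return the processed string directly; otherwise
    // update the stored text and return $this so calls can be chained.
    private function apply(callable $fn, $text)
    {
        if ($text !== null) {
            return $fn($text);
        }
        $this->text = $fn((string) $this->text);
        return $this;
    }

    public function stripComments($text = null)
    {
        return $this->apply(function ($t) {
            return preg_replace('/<!--.*?-->/s', '', $t);
        }, $text);
    }

    public function newlinesToBr($text = null)
    {
        return $this->apply(function ($t) {
            return preg_replace('/\r\n|\r|\n/', "<br />\n", $t);
        }, $text);
    }

    public function get()
    {
        return $this->text;
    }
}

// Style 1: pass the text to each command individually.
$processor = new HtmlProcessorSketch();
$a = $processor->stripComments('Hello<!-- hidden --> world');

// Style 2: pass the text once to the constructor and chain the commands.
$b = (new HtmlProcessorSketch("line one<!-- x -->\nline two"))
    ->stripComments()
    ->newlinesToBr()
    ->get();
```

In this sketch `$a` holds `Hello world`, and `$b` holds the two lines joined by `<br />` followed by a newline.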
The default processing settings (the lists of tags and their attributes) should be acceptable for most sites.