
HTMLPP: Parse HTML code and manage the DOM structure

Version: htmlpp 1.0.3
License: GNU Lesser Genera...
PHP version: 4.2
Categories: HTML


HTMLPP is a PHP 4 library for parsing HTML. It parses a string of HTML code, builds the corresponding DOM structure, and lets you work on that structure with methods similar to those of the JavaScript DOM.


HTML parsing:
- Simple tags
- Tags without closing tags
- Autoclosing tags
- Doctype, text and comment parsing
- Modern browser parsing behaviour (adds the head, body and html tags if they are not present; wraps table content in a tbody if it is not present)
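
The repair behaviour listed above can be illustrated with a short sketch. It does not use HTMLPP itself, whose API is described in documentation.html and is not reproduced here. As a stand-in it uses PHP's bundled DOM extension (PHP 5 or later), which also wraps a bare fragment in html and body elements and closes an unclosed tag. Its repair rules are not necessarily the same as HTMLPP's, so the sketch only shows the html/body wrapping.

```php
<?php
// Stand-in parser: PHP's bundled DOM extension, NOT the HTMLPP API.
$fragment = '<p>Hello <b>world</b><br>second line';

$doc = new DOMDocument();
// Collect libxml warnings about the incomplete markup instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($fragment);
libxml_clear_errors();

$root = $doc->documentElement;                        // <html> wrapper added by the parser
$body = $doc->getElementsByTagName('body')->item(0);  // <body> wrapper added by the parser
$p    = $body->firstChild;                            // the <p> that had no closing tag

echo $root->nodeName, "\n";
echo $body->nodeName, "\n";
echo $p->nodeName, "\n";
```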

DOM traversal:
- Access to the parent node using the parentNode property
- Access to child nodes using the childNodes array property
- Access to sibling nodes using nextSibling and previousSibling properties
- Access to the owner document with ownerDocument property
- Document shortcuts to body, head and doctype
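
The traversal properties above share their names with the standard DOM, so they can be tried out without HTMLPP. The sketch below assumes PHP's bundled DOM extension (PHP 5 or later) rather than HTMLPP's own classes. The markup contains no whitespace between tags, so every sibling is an element rather than a text node.

```php
<?php
// Stand-in for illustration: PHP's bundled DOM extension, NOT the HTMLPP API.
$doc = new DOMDocument();
$doc->loadHTML(
    '<html><head><title>Demo</title></head>' .
    '<body><ul><li>one</li><li>two</li><li>three</li></ul></body></html>'
);

$second = $doc->getElementsByTagName('li')->item(1);

echo $second->parentNode->nodeName, "\n";            // ul
echo $second->previousSibling->textContent, "\n";    // one
echo $second->nextSibling->textContent, "\n";        // three
echo $second->parentNode->childNodes->length, "\n";  // 3

// ownerDocument leads back to the document the node belongs to.
$sameDocument = $second->ownerDocument->isSameNode($doc);
```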

DOM manipulation:
- Append nodes with appendChild, append and other methods
- Remove nodes with removeChild and remove methods
- Replace nodes with replaceChild and replace methods
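
As a rough illustration of the three standard operations named above (appendChild, removeChild, replaceChild), here is a sketch that again assumes PHP's bundled DOM extension, not HTMLPP. The shorter append, remove and replace methods mentioned in the list are specific to HTMLPP and are not shown.

```php
<?php
// Stand-in for illustration: PHP's bundled DOM extension, NOT the HTMLPP API.
$doc = new DOMDocument();
$doc->loadHTML(
    '<html><body><div id="box"><span>old</span><em>extra</em></div></body></html>'
);

$box  = $doc->getElementsByTagName('div')->item(0);
$span = $box->getElementsByTagName('span')->item(0);
$em   = $box->getElementsByTagName('em')->item(0);

// Append a new <p> as the last child of the div.
$box->appendChild($doc->createElement('p', 'appended'));

// Remove the <em> from the div.
$box->removeChild($em);

// Replace the <span> with a new <strong>; the new node comes first in the call.
$box->replaceChild($doc->createElement('strong', 'new'), $span);

// Print the resulting markup of the div only.
echo $doc->saveHTML($box), "\n";
```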

Attributes and style manipulation:
- Add, remove, set and get methods for attributes
- Add, remove, set and get methods for style properties
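
A minimal sketch of attribute and style handling, under the assumption that PHP's bundled DOM extension (PHP 7.1 or later for the syntax used) stands in for HTMLPP. The bundled extension has no style methods, so parseStyle and buildStyle are hypothetical helpers written for this example. They split naively on `;` and `:` and will mis-handle values that contain a semicolon.

```php
<?php
// Hypothetical helper: split an inline style string into a property => value map.
function parseStyle(string $style): array
{
    $props = [];
    foreach (explode(';', $style) as $declaration) {
        if (strpos($declaration, ':') === false) {
            continue; // skip empty or malformed pieces
        }
        [$name, $value] = explode(':', $declaration, 2);
        $props[strtolower(trim($name))] = trim($value);
    }
    return $props;
}

// Hypothetical helper: turn the map back into an inline style string.
function buildStyle(array $props): string
{
    $parts = [];
    foreach ($props as $name => $value) {
        $parts[] = $name . ': ' . $value;
    }
    return implode('; ', $parts);
}

// Stand-in for illustration: PHP's bundled DOM extension, NOT the HTMLPP API.
$doc = new DOMDocument();
$doc->loadHTML(
    '<html><body><a href="/old" title="tmp" ' .
    'style="color: red; font-weight: bold">link</a></body></html>'
);
$a = $doc->getElementsByTagName('a')->item(0);

// Attributes: set one, remove another.
$a->setAttribute('href', '/new');
$a->removeAttribute('title');

// Style properties: read, change, remove, add, then write back.
$style = parseStyle($a->getAttribute('style'));
$style['color'] = 'blue';
unset($style['font-weight']);
$style['margin'] = '0';
$a->setAttribute('style', buildStyle($style));

echo $a->getAttribute('href'), "\n";   // /new
echo $a->getAttribute('style'), "\n";  // color: blue; margin: 0
```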

Node searching functions on every element:
- getElementById
- getElementsByTagName
- getElementsByClassName
- getElementsBySelector (full support for CSS3 selectors, plus some non-standard selectors)
- Node iterator class for custom filter functions
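
The search functions above can be approximated as follows. This is a sketch under stated assumptions, not HTMLPP's implementation: it uses PHP's bundled DOM extension (PHP 7 or later), where a tag-name search is built in, a class-name search is expressed as an XPath query, and a custom filter is a small recursive walk. filterNodes is a hypothetical helper, not part of HTMLPP or of PHP.

```php
<?php
// Hypothetical helper: collect every descendant of $root accepted by $accept,
// in document order.
function filterNodes(DOMNode $root, callable $accept): array
{
    $matches = [];
    if (!$root->hasChildNodes()) {
        return $matches;
    }
    foreach ($root->childNodes as $child) {
        if ($accept($child)) {
            $matches[] = $child;
        }
        $matches = array_merge($matches, filterNodes($child, $accept));
    }
    return $matches;
}

// Stand-in for illustration: PHP's bundled DOM extension, NOT the HTMLPP API.
$doc = new DOMDocument();
$doc->loadHTML(
    '<html><body><div class="item first">A</div><p class="note">B</p>' .
    '<div class="item">C</div><div class="items">D</div></body></html>'
);

// Search by tag name.
$divs = $doc->getElementsByTagName('div');
echo $divs->length, "\n"; // 3

// Search by class name: match "item" as a whole word inside the class attribute.
$xpath = new DOMXPath($doc);
$items = $xpath->query(
    '//*[contains(concat(" ", normalize-space(@class), " "), " item ")]'
);
foreach ($items as $node) {
    echo $node->textContent, "\n"; // A, then C
}

// Custom filter: every element that carries a class attribute.
$withClass = filterNodes($doc, function (DOMNode $n): bool {
    return $n instanceof DOMElement && $n->hasAttribute('class');
});
echo count($withClass), "\n"; // 4
```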

DOM collections with jQuery-like methods:
- Add, remove and filter elements in the collection
- Change the current collection by searching among its elements' siblings, child nodes or parent nodes
- Manipulate elements in the collection


Version history:
- First release
- Fixed some bugs in elements parsing regexp
- Fixed a bug in doctype parsing
- Fixed some problems in the parser class
- Fixed a bug in the HTMLFilterIterator::find() function when HTML_SEARCH_DESCENDANT is passed as the iteration type
- Fixed error on selector parsing
- Now every element is closed at the end of its parent code if no closing tag is found
- Better support for textarea tag
- Fixed bug on attributes parsing (thanks Mike)
- Fixed bug in getAttribute() method
- Fixed bug in getStyle() method
- Fixed bug on attributes parsing

Author: Marco Marchiò (Italy)




Files:
- documentation.html (Doc.): Documentation and examples
- HTMLCollection.php (Class): HTML collections class
- HTMLFilterIterator.php (Class): HTML filter iterator class
- HTMLNode.php (Class): HTML nodes class
- HTMLParser.php (Class): HTML parser private class
- HTMLPP.php (Class): Main HTMLPP class
