This class can be used to parse HTML documents.
It retrieves an HTML document from a Web server. The document can be cleaned to remove whitespace, NUL and escape characters, JavaScript, and style section definitions.
The class can parse the HTML document and return a hierarchy of tag objects.
It can also traverse the parsed document hierarchy to extract the keywords contained in it and the respective keyword density values.
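The keyword density step can be illustrated with a short sketch. This is not the class's own code or API: it is a Python approximation built on the standard library's `html.parser`, and the names `TextExtractor`, `keyword_density`, and `min_length` are hypothetical. It also assumes that density means a keyword's occurrences divided by the total number of words in the visible text, which the description above does not state.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # > 0 while inside a skipped element
        self.chunks = []      # visible text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.chunks.append(data)


def keyword_density(html, min_length=3):
    """Return {keyword: occurrences / total words} for the visible text.

    Assumption: words shorter than min_length are not reported as
    keywords, but they still count toward the total.
    """
    parser = TextExtractor()
    parser.feed(html.replace("\x00", ""))  # drop NUL characters first
    parser.close()

    words = re.findall(r"[a-z0-9]+", " ".join(parser.chunks).lower())
    total = len(words)
    if total == 0:
        return {}

    counts = Counter(w for w in words if len(w) >= min_length)
    return {word: count / total for word, count in counts.items()}
```

Calling `keyword_density(page_source)` on a fetched page returns a dictionary that can be sorted by value to list the most prominent keywords first.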
| Ratings | Utility | Consistency | Documentation | Examples | Tests | Videos | Overall | Rank |
| All time: | Sufficient (66.7%) | Sufficient (66.7%) | - | Not sure (41.7%) | - | - | Not sure (44.2%) | 1591 |
| Month: | Not yet rated by the users |