Innovation award
 Nominee: 1x |
This class can be used to extract important words from HTML documents.
It can process a well-formed XHTML document and extract the words contained in the document.
The class scores each word according to conditions such as whether its first letter is upper case or whether the word appears inside strong or bold tags.
It returns an associative array of words sorted by importance score.
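The scoring described above can be sketched outside the class itself. The following Python snippet is only an illustration of the idea, not the Seltz analyzer code or its API: the names `WordScorer` and `score_words`, the weights, and the use of the standard `html.parser` module are all assumptions made for the example.

```python
import re
from html.parser import HTMLParser

# Hypothetical weights: the values used by the actual class are not stated here.
BASE_SCORE = 1
CAPITALIZED_BONUS = 1
EMPHASIS_BONUS = 2
EMPHASIS_TAGS = {"strong", "b"}


class WordScorer(HTMLParser):
    """Accumulates a score per word while walking the document's tags."""

    def __init__(self):
        super().__init__()
        self.emphasis_depth = 0  # number of currently open <strong>/<b> tags
        self.scores = {}

    def handle_starttag(self, tag, attrs):
        if tag in EMPHASIS_TAGS:
            self.emphasis_depth += 1

    def handle_endtag(self, tag):
        if tag in EMPHASIS_TAGS and self.emphasis_depth > 0:
            self.emphasis_depth -= 1

    def handle_data(self, data):
        # Simplification: only ASCII letters count as word characters.
        for word in re.findall(r"[A-Za-z]+", data):
            score = BASE_SCORE
            if word[0].isupper():
                score += CAPITALIZED_BONUS
            if self.emphasis_depth > 0:
                score += EMPHASIS_BONUS
            key = word.lower()
            self.scores[key] = self.scores.get(key, 0) + score


def score_words(html):
    """Return a dict of word -> score, highest score first (ties alphabetical)."""
    parser = WordScorer()
    parser.feed(html)
    parser.close()
    ordered = sorted(parser.scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ordered)
```

For example, `score_words("<p>Paris is a <strong>city</strong>. The city of Paris.</p>")` ranks `city` and `paris` first with a score of 4 each. The emphasized occurrence of "city" earns the tag bonus, and both occurrences of "Paris" earn the capitalization bonus.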
| Name: | Seltz analyzer |
| Base name: | seltz_analyzer |
| Description: | Extract important words from HTML documents |
| Version: | 0.4 |
| PHP version: | 5.2 |
| License: | Free for non-commercial use |
| All time users: | 588 users |
| All time rank: | 4479 |
| Week users: | 0 users |
| Week rank: | 2038 |
 March 2009
Number 8
Prize: One book of choice by Manning |
There are many solutions for determining the most important keywords of a text document.
However, when the text is part of an HTML document, the importance of each keyword may be affected by the emphasis given by the tags that enclose each keyword.
This class determines the most important keywords in an HTML document by also taking into account the tags that format them.
Manuel Lemos |
Applications that use this class |
No application links were specified for this class.
Files |
|
|