Fast Chinese Word Segmentation

  Author  
Name: Wudi <e-mail contact>
Packages: 5
Country: China
Age: 24
All time rank: 9959 in China
Week rank: 837 (up), 10 in China (down)
Innovation award nominee: 2x


  Detailed description  
This class can segment Chinese text.

It uses the RMM (reverse maximum match) approach. Because this is a dictionary-based matching method, it can make segmentation mistakes that cannot be avoided entirely.

It also handles English text, but only in a very simple way.

Name: Fast Chinese Word Segmentation
Base name: fcws
Description: Segment Chinese text using the RMM approach
Version: -
PHP version: -
License: Free for non-commercial use
All time users: 567 users
All time rank: 4579
Week users: 0 users
Week rank: 2023 Equal
Country specific: This package is intended mainly for applications used in China.
 
  Groups  
Text processing: Manipulating and validating text data

  Screenshots  
File            Role    Description
screenshot.png  Screen  Example


  Innovation Award  
PHP Programming Innovation award nominee
July 2005
Number 9
Chinese is becoming more and more relevant on the Internet due to the growth of the Chinese economy. This growth is making it possible for many Chinese-speaking people to become Internet users.

Chinese words are written with one or more characters, with no spaces between words. Certain encodings also include ASCII characters, which allows words in other languages to be mixed into Chinese documents.

This class provides a way to segment Chinese text without breaking up English words that may be mixed in with the Chinese characters.

Manuel Lemos

  User ratings  
Not yet rated by the users

  Applications that use this class  
No application links were specified for this class.
If you know an application of this package, send a message to the author to add a link here.

  Related links  
Link Description
Default dictionary The default dict for this class (@mediafire.com)
Default dictionary The default dict for this class (@box.net)

  Files  
File                   Role     Description
cwordseg_fast.lib.php  Class    Class
Readme_CN.htm          Doc.     Readme (Chinese)
Readme_EN.htm          Doc.     Readme (English)
test.php               Example  Test

Download all files: fcws.tar.gz fcws.zip