PHP Classes

Multibyte Keyword Generator: Extract meta keywords from multi-byte texts

Last updated: 2017-02-25
Version: cm-mb-keyword-gen 1.9
License: GNU General Publi...
PHP version: 4
Categories: Text processing, Content management, SEO

This class extracts meta keywords from multi-byte texts.

It is an enhanced version of the "Automatic Keyword Generator" class originally written by Ver Pangonilo.

This version provides better word segmentation, the ability to handle multi-byte strings, and texts in multiple languages.

Author: Peter Kahl, Hong Kong


Multibyte Keyword Generator

Copyright (c) 2009-2012, Peter Kahl. All rights reserved.


This PHP class is based in large part on the "Automatic Keyword Generator" class by Ver Pangonilo. Improvements include better word segmentation and the ability to handle multibyte strings.

This class automatically generates META keywords for your web pages from the contents of a text string, which removes the tedious work of choosing keywords by hand. The method is based on how often each single word or multi-word phrase occurs in the text.
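
The occurrence-counting idea can be illustrated with a minimal sketch. This is not the class's own code: the function name countWords() is hypothetical, and the sketch assumes PHP 7.3 or later with the mbstring extension and UTF-8 input.

```php
<?php
// Hypothetical helper, not part of the class: counts how often each word occurs.
// Assumes UTF-8 input and the mbstring extension.
function countWords(string $text): array
{
    // Strip HTML tags, then lowercase with multibyte awareness.
    $plain = mb_strtolower(strip_tags($text), 'UTF-8');
    // Split on any run of characters that are neither letters nor digits.
    // The /u modifier makes the pattern Unicode-aware.
    $words = preg_split('/[^\p{L}\p{N}]+/u', $plain, -1, PREG_SPLIT_NO_EMPTY);
    $counts = array_count_values($words);
    arsort($counts); // most frequent words first
    return $counts;
}

$english = countWords('<p>Keyword tools rank a keyword by how often the keyword occurs.</p>');
$czech   = countWords('Žluťoučký kůň, žluťoučký kůň');
```

Here $english['keyword'] is 3 and $czech['kůň'] is 2. Note that byte-oriented functions such as strtolower() would not lowercase "Ž" reliably, which is why the sketch uses mb_strtolower().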

The string supplied to this class may contain HTML tags and punctuation. Line breaks and punctuation are used as hints for finding the best multi-word keyphrases.

This Multibyte Keyword Generator automatically creates single-word keywords and 2-word and 3-word keyphrases. All keywords and keyphrases are filtered to remove common (useless) words. Common words are defined within the class and can be associated with a specific language.
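
A rough sketch of how 2-word and 3-word keyphrases could be built and filtered follows. It is an illustration under assumptions, not the class's implementation: the function name phraseCounts() is hypothetical, the punctuation set is a guess, and the filtering rule (discard a phrase that starts or ends with a common word) is only one possible reading of "filtered to remove common words".

```php
<?php
// Hypothetical sketch, not the class's code: counts n-word phrase candidates.
// Assumes UTF-8 input and the mbstring extension.
function phraseCounts(string $text, int $n, array $commonWords): array
{
    $plain = mb_strtolower(strip_tags($text), 'UTF-8');
    // Punctuation and line breaks end a segment, so phrases never span them.
    $segments = preg_split('/[\r\n.,;:!?()"]+/u', $plain, -1, PREG_SPLIT_NO_EMPTY);
    $counts = [];
    foreach ($segments as $segment) {
        $words = preg_split('/\s+/u', trim($segment), -1, PREG_SPLIT_NO_EMPTY);
        for ($i = 0; $i + $n <= count($words); $i++) {
            $gram = array_slice($words, $i, $n);
            // Assumed rule: skip phrases that start or end with a common word.
            if (in_array($gram[0], $commonWords, true)
                || in_array($gram[$n - 1], $commonWords, true)) {
                continue;
            }
            $key = implode(' ', $gram);
            $counts[$key] = ($counts[$key] ?? 0) + 1;
        }
    }
    arsort($counts); // most frequent phrases first
    return $counts;
}

$text = "Keyword generator for PHP.\n"
      . "The keyword generator handles multibyte text, and the keyword generator is free.";
$twoWord = phraseCounts($text, 2, ['the', 'for', 'and', 'is']);
```

With this input, "keyword generator" is counted three times, while "generator for" is discarded because it ends with a common word, and "text and" never appears because the comma ends the segment.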

This class is highly configurable. You can use minimal settings and rely on the defaults, or choose any combination of results: 1-word keywords, 2-word keyphrases, and 3-word keyphrases. Each option can be disabled, so the class can return only one of the three, all three, or any combination.

This class handles multilingual texts and multibyte strings. It handles all European languages and is likely to handle many others as well. You may need to define common (useless) words for your own language if they are not already part of this class.


This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see

Change Log

0.9 ..... 2009-11-05

Initial release

1.0 ..... 2010-01-19

Improved function removeDuplicateKw() to better handle deletion of duplicate
plural words (English), such as "class" and "classes".

1.1 ..... 2010-01-19

Changed the function array_one_dim() to array_flatten().

1.2 ..... 2010-08-14

Improved regular expressions in function html2txt().

1.3 ..... 2010-08-20

Added word segmentation for character ':' in function process_text().

1.4 ..... 2011-05-08

1.5 ..... 2012-02-26

1.6 ..... 2012-02-26

1.7 ..... 2012-11-02

Added link to repo on GitHub.
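
The 1.0 entry above mentions removing duplicate English plurals such as "class" and "classes". The sketch below shows one naive way to do that. The function dropEnglishPlurals() is hypothetical, and the actual removeDuplicateKw() may work quite differently.

```php
<?php
// Hypothetical sketch; the real removeDuplicateKw() may work differently.
// Drops a keyword when removing a trailing "es" or "s" yields another keyword
// that is already in the list.
function dropEnglishPlurals(array $keywords): array
{
    $lookup = array_flip($keywords); // keyword => index, for fast membership tests
    $result = [];
    foreach ($keywords as $kw) {
        $isPlural = false;
        foreach (['es', 's'] as $suffix) {
            $len = strlen($suffix);
            // Byte-based substr() is safe here because the suffix is plain ASCII.
            if (strlen($kw) > $len && substr($kw, -$len) === $suffix) {
                $stem = substr($kw, 0, -$len);
                if (isset($lookup[$stem])) {
                    $isPlural = true;
                    break;
                }
            }
        }
        if (!$isPlural) {
            $result[] = $kw;
        }
    }
    return $result;
}

$deduped = dropEnglishPlurals(['class', 'classes', 'keyword', 'keywords', 'status']);
```

Here $deduped contains "class", "keyword" and "status". A word such as "status" survives because its stem "statu" is not itself in the list, which is the main safeguard of this simple rule.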

Files

File                                      Role     Description
class.colossal-min...yword-generator.php  Class    the class itself
common-words-en_GB.php                    Conf.    common words ENGLISH
common-words-de_DE.php                    Conf.    common words GERMAN
common-words-cs_CZ.php                    Conf.    common words CZECH
example-BEST-en_GB.php                    Example  the BEST way to use this class
example_minimal.php                       Example  using minimal configuration
example-cs_CZ.php                         Example  using Czech text
example-de_DE.php                         Example  using German text
example-en_GB.php                         Example  using English text
example_single_words_only.php             Example  produces only single word keywords
(name missing)                            Doc.     Documentation
CHANGELOG                                 Doc.     Change log
LICENSE                                   Doc.     License
