PHP Classes

Character Set From String: Identify predominant character set in a string

Ratings: 52% (3 stars)
Unique User Downloads: Total: 114
Download Rankings: All time: 9,587, This week: 39 (up)
Version: charset-from-string 1.0.6
License: Custom (specified...
PHP version: 5
Categories: Localization, PHP 5, Text processing, L...
Description 

This class can identify the predominant character set in a string.

It takes a string of UTF-8 text and analyzes the character codes to determine which character set predominates, based on the frequency of characters that are typical of certain languages.

Currently it can identify the character sets of Latin, Greek, Cyrillic, Armenian, Hebrew, Arabic, Devanagari, Bengali, Gujarati, Tamil, Malayalam, Sinhala, Thai, Lao, Tibetan, Burmese, Georgian, Korean, Khmer, Japanese, and CJK.
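
The frequency-based idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the class's actual implementation: the function names `countScripts` and `predominantScript` are hypothetical, the script list is a small subset, the uppercase labels are assumed to follow the usage comments further down, and the code is written for PHP 7.1 or later. It relies on PCRE Unicode script properties such as `\p{Cyrillic}` rather than on hand-written code point ranges.

```php
<?php
// Illustrative sketch only: NOT the library's actual implementation.
// Counts characters per Unicode script and returns the most frequent one.

/**
 * Hypothetical helper: returns [script label => number of matching characters].
 */
function countScripts(string $utf8): array
{
    // Assumed subset of scripts; the real class recognises many more.
    $scripts = [
        'LATIN'    => '\p{Latin}',
        'GREEK'    => '\p{Greek}',
        'CYRILLIC' => '\p{Cyrillic}',
        'HEBREW'   => '\p{Hebrew}',
        'ARABIC'   => '\p{Arabic}',
        'THAI'     => '\p{Thai}',
    ];
    $counts = [];
    foreach ($scripts as $label => $class) {
        // preg_match_all() returns the number of matches, or false on error.
        $n = preg_match_all('/' . $class . '/u', $utf8);
        $counts[$label] = ($n === false) ? 0 : $n;
    }
    return $counts;
}

/**
 * Hypothetical helper: label of the script with the most characters,
 * or null when no character of a known script is present.
 */
function predominantScript(string $utf8): ?string
{
    $counts = countScripts($utf8);
    arsort($counts);   // sort by count, highest first, keeping the keys
    reset($counts);
    $top = key($counts);
    return $counts[$top] > 0 ? $top : null;
}

echo predominantScript('Lex iniusta non est lex.'), "\n"; // LATIN
echo predominantScript('Привет, мир'), "\n";              // CYRILLIC
```

Digits, spaces and punctuation belong to no script in this sketch, so they do not influence the result. A string with no letters from the listed scripts yields `null`.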

Innovation Award
PHP Programming Innovation award nominee
June 2017
Number 6
A string in Unicode may contain text of multiple character sets.

This class can identify the predominant character set in a string from among many possible character sets.

Manuel Lemos
Name: Peter Kahl <contact>
Classes: 37 packages by
Country: United Kingdom
All time rank: 41521 in United Kingdom
Week rank: 199 Up8 in United Kingdom Up
Innovation award
Nominee: 23x
Winner: 2x

Documentation

Charset From String

If this project has business value for you, then don't hesitate to support me with a small donation.

Identifies the predominant script (charset, language) in a string. This library is capable of identifying:

<pre>
Arabic
Armenian
Bengali
Burmese
CJK
Cyrillic
Devanagari
Georgian
Greek
Gujarati
Hebrew
Japanese
Khmer
Korean
Lao
Latin
Malayalam
Sinhala
Tamil
Thai
Tibetan
</pre>

Usage

use peterkahl\CharsetFromString\CharsetFromString;

echo CharsetFromString::getCharset('????? ????? ?? ???????')."\n"; # ARABIC

echo CharsetFromString::getCharset('????? ????? ?? ????-??')."\n"; # HEBREW

echo CharsetFromString::getCharset('??? ?????? ?????? ??????, ??? ??? ?? ?????.')."\n"; # CYRILLIC

echo CharsetFromString::getCharset('Lex iniusta non est lex.')."\n"; # LATIN

echo CharsetFromString::getCharset('??? ??? ?? ??? ??? ???? ??.')."\n"; # KOREAN

echo CharsetFromString::getCharset('??????????????')."\n"; # JAPANESE

echo CharsetFromString::getCharset('??????????')."\n"; # CJK

echo CharsetFromString::getCharset('??????????????? ??????????????')."\n"; # THAI

echo CharsetFromString::getCharset('????????????????????????????????????? ????')."\n"; # LAO

echo CharsetFromString::getCharset('?????????????????????????????????????????????')."\n"; # KHMER

echo CharsetFromString::getCharset('???????????????????????????????')."\n"; # TIBETAN

echo CharsetFromString::getCharset('? ????? ???????? ??? ???????? ?????? ????, ???? ???????? ???? ??? ??????.')."\n"; # GREEK
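
One way the returned label might be used is to choose a text direction for output. The sketch below assumes that `getCharset()` returns the uppercase labels shown in the comments above (`ARABIC`, `HEBREW`, and so on). The helper `textDirectionFor` is hypothetical and not part of the library, and the label is hard-coded here so that the snippet runs without the package installed.

```php
<?php
// Hypothetical helper, not part of the library: maps a charset label
// to an HTML text direction. Assumes uppercase labels as in the usage
// comments (ARABIC, HEBREW, LATIN, ...).
function textDirectionFor(string $charsetLabel): string
{
    $rightToLeft = ['ARABIC', 'HEBREW'];
    return in_array($charsetLabel, $rightToLeft, true) ? 'rtl' : 'ltr';
}

// With the library installed, the label would come from:
//   $label = CharsetFromString::getCharset($text);
$label = 'HEBREW'; // stand-in value for illustration

echo '<p dir="' . textDirectionFor($label) . '">...</p>' . "\n";
```

Passing the label in as a plain string keeps the helper independent of the class, so it can be tested on its own.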

Files (4)

File                        Role    Description
src/CharsetFromString.php   Class   Class source
composer.json               Data    Auxiliary data
LICENSE                     Lic.    License text
README.md                   Doc.    Documentation
