It looks like it cannot handle UTF-8 well, even though it claims to. The characters it fails on are not Chinese, but Latin ones such as those used in French or German. Why call it UTF-8 when everything non-ASCII becomes garbage? It is very strange to claim UTF-8 support when in fact it handles only the 1-byte sequences, which are just the ASCII characters. Accented French or German characters do not even need 4 bytes; 2 are enough. Most Chinese characters need 3, and only the rarer ones need 4. As it stands, this class cannot be considered UTF-8 compliant, since it fails on everything outside ASCII.
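To make the byte counts concrete, here is a small Python sketch that is independent of the class under discussion. The sample characters and the helper name `utf8_len` are my own choices for illustration. The last part shows one common way non-ASCII text turns into garbage (decoding UTF-8 bytes as Latin-1); that is only an assumption about the failure mode, not something I verified in this class.

```python
# Count how many bytes UTF-8 uses for a few sample characters.
# utf8_len is a hypothetical helper, not part of the class being reviewed.
def utf8_len(ch: str) -> int:
    return len(ch.encode("utf-8"))

# ASCII, French, German, German, common Chinese, rare CJK (U+20000)
samples = ["A", "é", "ü", "ß", "中", "\U00020000"]

for ch in samples:
    print(f"U+{ord(ch):04X}: {utf8_len(ch)} byte(s)")

# A compliant implementation must round-trip any of these unchanged.
text = "Grüße, café, 中文"
assert text.encode("utf-8").decode("utf-8") == text

# Assumed failure mode: treating each UTF-8 byte as its own character.
garbled = "é".encode("utf-8").decode("latin-1")
print(len(garbled))  # two characters instead of one
```

A quick check for any class that claims UTF-8 support is to feed it a string like `text` above and compare the output with the input.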