PHP Classes

Demo data $arr[6] not working?


Package:PHP UTF-8 Validation
Subject:Demo data $arr[6] not working?
Author:R. Bakker
Date:2019-02-02 15:53:31


  1. Demo data $arr[6] not working?
R. Bakker - 2019-02-02 15:53:31
The demo data from $arr[6] (chr(0xC6) . ' AE Ligature') does not produce any errors when running your demo. Neither does $arr[7] ('Accented "a" ' . chr(0xE0) . ' in this string').
I am at a loss: shouldn't these also yield errors (according to your comment in the demo file)?
René Bakker, Netherlands
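
For readers who want to check the two demo strings independently of the class, here is a minimal sketch. It assumes the mbstring extension is loaded and uses only PHP's built-in mb_check_encoding(); the $samples variable is a name made up for this example and is not taken from the demo file.

```php
<?php
// The two demo strings quoted above; each contains one raw high byte
// (0xC6 or 0xE0) that is not followed by a UTF-8 continuation byte.
$samples = [
    chr(0xC6) . ' AE Ligature',
    'Accented "a" ' . chr(0xE0) . ' in this string',
];

foreach ($samples as $s) {
    // mb_check_encoding() returns false when the bytes are not valid UTF-8.
    $valid = mb_check_encoding($s, 'UTF-8');
    printf("%d bytes, valid UTF-8: %s\n", strlen($s), $valid ? 'yes' : 'no');
}
```

Both strings should be reported as invalid UTF-8 on any platform, which gives a baseline to compare the demo's output against.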

  2. Re: Demo data $arr[6] not working?
Ray Paseur - 2019-02-02 20:25:27 - In reply to message 1 from R. Bakker
Yes, René - your understanding is correct. These should have data in the "error" property. In my tests they worked correctly. Here is the UTF8 object for $arr[6]:

UTF8 Object
(
    [error] => Array
        (
            [0] => 0
        )

    [bytes] => 13
    [chars] => 12
    [str] => � AE Ligature
)

Please see: ...

  3. Re: Demo data $arr[6] not working?
R. Bakker - 2019-02-04 00:09:17 - In reply to message 2 from Ray Paseur
Hi Ray,

After some extensive testing this weekend, I'd like to share my findings:
- the demo page works flawlessly in all tested Unix environments
- the demo page fails for the AE ligature and the accented "a" in all tested Windows environments, not reporting any errors in these strings. It turns out that the UTF8 class sees them as valid ASCII strings.
The PCRE character class [:ascii:], used to test for ASCII, seems to be at fault here ...

Can you acknowledge this and maybe repair/extend the class?
I would really like to use your class to clean up some database content as part of a migration.

Thx for any help in advance!
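
If the [:ascii:] class really is what behaves differently on Windows, one possible workaround is to test for ASCII with an explicit byte range instead. The sketch below is only an illustration of that idea: is_ascii_bytes() is a hypothetical helper, not part of the class, and it has not been verified against the failing Windows setups described above.

```php
<?php
// Hypothetical helper, NOT part of the UTF8 class discussed in this thread.
// It avoids POSIX character classes and the /u modifier entirely and simply
// requires every byte to be in the range 0x00-0x7F.
function is_ascii_bytes(string $s): bool
{
    // The D modifier makes $ match only at the very end of the subject.
    return preg_match('/^[\x00-\x7F]*$/D', $s) === 1;
}

var_dump(is_ascii_bytes('plain text'));
var_dump(is_ascii_bytes(chr(0xC6) . ' AE Ligature'));
```

Because the pattern names the byte values directly, the result does not depend on how the PCRE build defines its character tables.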


  4. Re: Demo data $arr[6] not working?
Ray Paseur - 2019-03-02 13:45:11 - In reply to message 3 from R. Bakker
Hi, René, sorry I missed your message.

Please have a look at this script. ...

It uses $repair (the second argument to the constructor) to tell the class to try to repair invalid UTF8 characters. I do not have any way to test on Windows, sorry. This script has highlight_file(__FILE__); so you can copy / paste the code.

My Twitter handle is @RayPaseur and my GMail is Ray.Paseur@

If you can post a link to the demo showing the error, I'll be glad to look into it, thanks.
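
To make the idea of repairing invalid UTF8 concrete for a database cleanup, here is a rough stand-alone sketch. It does not show how the class's $repair option works internally, which this thread does not describe; repair_as_latin1() is a hypothetical function that assumes any invalid string is really ISO-8859-1 text and that the mbstring extension is loaded.

```php
<?php
// Hypothetical stand-in for a repair step; NOT the UTF8 class's own code.
// Assumption: a string that is not valid UTF-8 is ISO-8859-1 (Latin-1).
function repair_as_latin1(string $s): string
{
    if (mb_check_encoding($s, 'UTF-8')) {
        return $s; // already valid UTF-8, leave it untouched
    }
    // Re-encode every byte from ISO-8859-1 to UTF-8.
    return mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1');
}

$fixed = repair_as_latin1(chr(0xC6) . ' AE Ligature');
var_dump(mb_check_encoding($fixed, 'UTF-8'));
```

Note that this converts the whole string, so a value that mixes valid UTF-8 sequences with stray Latin-1 bytes would have its valid sequences double-encoded; such rows would need a byte-by-byte approach instead.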