- The Ar-PHP project
- Interview with Khaled Al-Shamaa
Arabic is a language spoken in many countries. The number of PHP developers from Arabic-speaking countries has been growing quickly. They currently account for almost 2% of the users who access the PHPClasses site.
Arabic is written from right to left and requires a non-Latin character set.
These facts pose special challenges for developers who want to build sites with Arabic text.
- The Ar-PHP project
Khaled Al-Shamaa, a PHP developer from Syria, has written a library of classes named Ar-PHP. It addresses many of the challenges of building Arabic sites in PHP.
Khaled has submitted many of those classes to the PHPClasses site over the last few years.
Ten of Khaled's Arabic classes have been nominated for the PHP Programming Innovation Award. He won first place four times, in different months of 2006 and 2007. This impressive track record earned him second place in the all-time ranking of PHP Programming Innovators.
- Interview with Khaled Al-Shamaa
PC = PHP Classes (Manuel Lemos)
KA = Khaled Al-Shamaa
PC: Khaled, can you please tell us a bit about yourself: where do you live, where do you work, and what do you do?
KA: My name is Khaled Al-Shamaa, and I was born in Kuwait in 1975. I am the son of a Lebanese father and a Syrian mother. Both of my grandmothers are from Palestine.
I also worked in Jordan for the Maktoob company from 1999 to 2002 as a senior Web developer on mazadmaktoob.com (currently souq.com, one of the leading e-commerce websites in the region).
To simplify, I am an Arab citizen currently living in Syria.
I graduated as a Computing Engineer from Aleppo University (Syria) in 1998. Since I left Maktoob in 2002, I have worked at the International Center for Agricultural Research in the Dry Areas (ICARDA) as a scientific software engineer. At the same time, I am doing my MSc in bioinformatics.
In my second life I am a PHP developer. I have translated half a dozen Web technology books and published more than 8 articles in the SCS Informatics magazine. I am also the author of a book titled "PHP and the Arabic language", published in 2007.
I love to travel. I have visited more than a dozen countries so far, including Kuwait, Syria, Lebanon, Jordan, Iran, Eritrea, Egypt, Italy, Algeria, Tunisia, Libya, Bahrain, Uzbekistan, and Cyprus.
PC: Can you give an overview of the Ar-PHP project, your library of PHP classes for Arabic? When and why did you decide to develop it?
KA: Arabic belongs to the Semitic family of languages. Since morphological change in Arabic results from the addition of prefixes and infixes as well as suffixes, simple removal of suffixes is not as effective for Arabic as it is for English.
The Ar-PHP story started when I published the Arabic MySQL query class on the PHPClasses site in February 2006.
It is a simple class that builds WHERE condition clauses for SQL statements using MySQL regular expressions and Arabic lexical rules. This provides an advanced stem-based Arabic search that retrieves more accurate Arabic results than the simple matching that usually works fine for English.
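The idea of a stem-based WHERE clause can be illustrated with a small sketch. It is written in Python for brevity and is not the Ar-PHP code: the affix lists, the spelling-variant groups, and the names `strip_affixes`, `stem_pattern` and `where_clause` are all assumptions made for this example, and infixes are not handled at all.

```python
# Toy spelling variants; the real Ar-PHP lexical rules are not given in the interview.
VARIANTS = {}
for group in ("اأإآ", "ةه", "يى"):
    for letter in group:
        # Alternation instead of a bracket class, so byte-oriented regex engines cope too.
        VARIANTS[letter] = "(" + "|".join(group) + ")"


def strip_affixes(word):
    """Remove a leading definite article and one common suffix (toy rules)."""
    if word.startswith("ال") and len(word) > 4:
        word = word[2:]
    for suffix in ("ات", "ون", "ين", "ة"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stem_pattern(word):
    """Build a regular expression that matches the stem and its spelling variants."""
    # Keep Arabic letters only, which also keeps regex metacharacters out.
    letters = [ch for ch in word if "\u0621" <= ch <= "\u064a"]
    stem = strip_affixes("".join(letters))
    return "".join(VARIANTS.get(ch, ch) for ch in stem)


def where_clause(column, word):
    """Return a WHERE fragment plus its parameter list for a DB-API driver."""
    return f"{column} REGEXP %s", [stem_pattern(word)]
```

Passing the pattern as a bound parameter rather than pasting it into the SQL string keeps user input out of the statement itself. The column name is assumed to come from trusted code.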
At that time I thought it would be the end of the story, but the positive feedback I got, especially on several Arabic programmers' forums like Swalif.net, encouraged me to go further and develop other classes to handle different aspects of processing and presenting Arabic content.
Well, this project started taking most of my free time, and during 2006 and 2007 I developed more than a dozen classes to handle different issues related to the Arabic language.
In 2008 I focused more on documentation, quality assurance, and collecting all those classes into one library, in addition to moving the source code into a CVS repository hosted on SourceForge.
In 2009 we started providing some end-user solutions, like a WordPress plugin developed by Khaled Hourani from Syria and a Firefox add-on developed by Salih Al-Matrafe from Saudi Arabia and then enhanced by Djihed Afifi from Algeria.
I also presented it to a wider circle by participating in conferences held in our region, like the "Arab Techies Code Sprint" held in Egypt in May 2009, with the goal of finding solutions to known Arabic language processing problems.
There I met expert developers who work in the same domain, and we benefited from sharing information and experience. Many of them became active developers of this project, in particular Taha Zerrouki from Algeria.
PC: Why did you decide to publish your work in the PHPClasses site?
KA: Well, first of all, PHPClasses was one of the most valuable resources I used as a passive Open Source developer, so I felt that it was the right place to publish my first active contribution to the Open Source world.
It also has less restrictive acceptance rules than PEAR, for example, but at the same time submitted classes are reviewed by humans, which gives them more credibility. The monthly PHP Innovation Award competition has also given more momentum to this community.
PC: Developing Web applications in Arabic requires special care. What are the most important concerns and what components do you provide to address those concerns?
KA: Besides the search issue presented above, some Arab countries use the Hijri calendar instead of the Gregorian calendar. So I developed classes to convert dates between those two calendars, as well as Arabic versions of the date and strtotime PHP functions.
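The interview does not say which conversion method the library uses. As a rough illustration, the Python sketch below uses the tabular (arithmetic) Islamic calendar, which is an assumption: dates based on moon sighting or on official national calendars can differ from it by a day or two. The function names are invented for this example.

```python
from datetime import date

# 1 Muharram 1 AH in the proleptic Gregorian calendar, as a day ordinal.
ISLAMIC_EPOCH = date(622, 7, 19).toordinal()


def hijri_to_ordinal(year, month, day):
    """Day ordinal of a tabular Hijri date (30-year cycle with 11 leap years)."""
    return (ISLAMIC_EPOCH - 1
            + (year - 1) * 354          # twelve alternating 30/29-day months
            + (3 + 11 * year) // 30     # leap days accumulated so far
            + 29 * (month - 1) + month // 2
            + day)


def gregorian_to_hijri(gregorian):
    """Convert a datetime.date to a (year, month, day) tabular Hijri tuple."""
    n = gregorian.toordinal()
    year = (30 * (n - ISLAMIC_EPOCH) + 10646) // 10631
    prior_days = n - hijri_to_ordinal(year, 1, 1)
    month = (11 * prior_days + 330) // 325
    day = n - hijri_to_ordinal(year, month, 1) + 1
    return year, month, day


def hijri_to_gregorian(year, month, day):
    """Convert a tabular Hijri date back to a datetime.date."""
    return date.fromordinal(hijri_to_ordinal(year, month, day))
```

With these rules, `gregorian_to_hijri(date(2009, 12, 18))` gives `(1431, 1, 1)`, and `hijri_to_gregorian(1431, 1, 1)` gives back 18 December 2009.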
Another issue handled in this project is rendering Arabic text correctly in outputs like GD, PDF, SWF and even VRML.
Bear in mind that the shape of an Arabic letter changes depending on the previous and following letters. Most of the available libraries neither perform the necessary joining of Arabic glyphs nor handle right-to-left writing. So we developed the ArGlyphs class to address those issues.
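The joining behaviour can be sketched in a few lines. The Python example below is not the ArGlyphs implementation. It assumes the target renderer can draw the Unicode Arabic Presentation Forms-B characters, reads the isolated, initial, medial and final forms from the Unicode character database, and reverses the string for a naive right-to-left layout. It ignores diacritics, lam-alef ligatures and mixed-direction text, and all function names are hypothetical.

```python
import unicodedata

FORM_TAGS = {"<isolated>": "isolated", "<final>": "final",
             "<initial>": "initial", "<medial>": "medial"}


def build_form_table():
    """Map each base letter to its presentation forms, read from Unicode data."""
    table = {}
    for code_point in range(0xFE70, 0xFF00):  # Arabic Presentation Forms-B
        parts = unicodedata.decomposition(chr(code_point)).split()
        # Keep single-letter forms only; ligatures list two base code points.
        if len(parts) == 2 and parts[0] in FORM_TAGS:
            base = chr(int(parts[1], 16))
            table.setdefault(base, {})[FORM_TAGS[parts[0]]] = chr(code_point)
    return table


FORMS = build_form_table()


def connects_forward(ch):
    """True if the letter can join to the letter that follows it."""
    return "initial" in FORMS.get(ch, {})


def connects_backward(ch):
    """True if the letter can join to the letter before it."""
    return "final" in FORMS.get(ch, {})


def shape(text):
    """Replace each Arabic letter with the form its neighbours call for."""
    shaped = []
    for i, ch in enumerate(text):
        forms = FORMS.get(ch)
        if forms is None:
            shaped.append(ch)  # not an Arabic letter: leave untouched
            continue
        prev_ch = text[i - 1] if i > 0 else ""
        next_ch = text[i + 1] if i + 1 < len(text) else ""
        joined_before = connects_backward(ch) and connects_forward(prev_ch)
        joined_after = connects_forward(ch) and connects_backward(next_ch)
        if joined_before and joined_after:
            key = "medial"
        elif joined_before:
            key = "final"
        elif joined_after:
            key = "initial"
        else:
            key = "isolated"
        shaped.append(forms.get(key, ch))
    return "".join(shaped)


def to_visual_order(text):
    """Shape the text, then reverse it for a renderer that draws left to right."""
    return shape(text)[::-1]
```

Letters such as alef, dal and reh have no initial form in the table, which is how the sketch knows that they never join to the letter that follows them.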
There are also many other simple classes that handle:
- Arabic auto-summarization
- Spelling out numbers in Arabic
- An Arabic version of the soundex function
- Identifying Arabic text in multi-language documents
- Guessing the gender of Arabic names
- Calculating Muslim prayer times and the Qibla direction
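As an illustration of the last item, the Qibla direction can be approximated as the initial great-circle bearing from a location to the Kaaba. The Python sketch below is only that standard bearing formula on a spherical Earth. It is not taken from the library, and the coordinates and the function name are assumptions.

```python
import math

# Approximate coordinates of the Kaaba in degrees (an assumption for this sketch).
KAABA_LAT, KAABA_LON = 21.4225, 39.8262


def qibla_bearing(lat_deg, lon_deg):
    """Initial great-circle bearing to the Kaaba, in degrees clockwise from north."""
    lat1 = math.radians(lat_deg)
    lat2 = math.radians(KAABA_LAT)
    dlon = math.radians(KAABA_LON - lon_deg)
    y = math.sin(dlon) * math.cos(lat2)
    x = (math.cos(lat1) * math.sin(lat2)
         - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0
```

For a point due north of the Kaaba on the same meridian the function returns 180 degrees (due south), and for Damascus (about 33.51 N, 36.29 E) it returns a bearing a little east of due south.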
You can find the complete list of available classes and related examples and documentation here:
PC: What other problems does Arabic Web development pose that your library does not address but you are working on?
KA: I would like to say that Arabic customization for both the Pspell extension and the MySQL full-text search feature is at the top of my to-do list.
On the other hand, I am currently working on adding this library to PEAR. I know it will not be a trivial task, but I will put in all the effort necessary to accomplish it.
One more thing: for 2010 I plan to cooperate with other PHP developers all over the Arab world to translate the official PHP documentation into Arabic. I would like to take this opportunity to encourage Arab PHP developers who can contribute to join us.
PC: Some Arabic applications use the Windows-1256 character set, others ISO-8859-6 and others UTF-8. When do you recommend using each of these character sets?
KA: UTF-8 is always recommended. Both the Windows-1256 and ISO-8859-6 character sets are there for historical reasons.
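In practice, following this advice often means converting legacy content to UTF-8 once, at the point where it enters the application. Here is a minimal Python sketch, assuming the legacy data is known or guessed to be Windows-1256 or ISO-8859-6. The function names and the fallback order are choices made for this example.

```python
def to_utf8(raw, source_encoding):
    """Re-encode legacy bytes (for example "cp1256" or "iso8859_6") as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


def decode_arabic(raw, legacy_encoding="cp1256"):
    """Decode bytes as UTF-8 when they are valid UTF-8, else as the legacy set."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Heuristic fallback: single-byte Arabic text is rarely valid UTF-8.
        return raw.decode(legacy_encoding)
```

The fallback is only a heuristic. When the source encoding is known, for example from an HTTP header or a database setting, it is safer to pass it explicitly to `to_utf8`.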
PC: PHP 6 is supposed to provide transparent Unicode support for text manipulation in PHP. Do you think it will be useful for writing Arabic applications in PHP, or do you see any challenges that PHP 6 features may not solve on their own?
KA: The transparent Unicode support coming in PHP 6 will be a big step forward. It is supposed to solve most of the Arabic issues at the text manipulation level. We can benefit from this in PHP and Arabic language projects by simplifying the processing algorithms that are currently handled manually.
In the long term, I believe that the Ar-PHP project delivers other functionality besides the transparent Unicode support that PHP 6 provides, so I do not see any conflict.
PC: What are the most important recommendations that you can make to PHP developers who want to develop PHP applications ready to support Arabic, even when Arabic was not the application's initial language?
KA: Use the UTF-8 character set and move text messages from the PHP source code into separate files. Think about right-to-left languages when you design your view/presentation layer. If possible, use an API or hooks to implement the functionality related to dates, search, etc.
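These recommendations can be combined in a small sketch: messages live in per-language UTF-8 files, and the view layer takes the text direction from the same place. The Python example below uses in-memory JSON strings as stand-ins for such files. The file layout, keys and function names are hypothetical.

```python
import json
from html import escape

# Stand-ins for separate UTF-8 files such as messages/en.json and messages/ar.json.
CATALOG_FILES = {
    "en": '{"dir": "ltr", "greeting": "Welcome"}',
    "ar": '{"dir": "rtl", "greeting": "مرحبا"}',
}


def load_catalog(lang):
    """Load the message catalog for one language."""
    return json.loads(CATALOG_FILES[lang])


def render_page(lang, message_key):
    """Render a minimal page whose direction follows the chosen language."""
    catalog = load_catalog(lang)
    direction = catalog["dir"]
    text = escape(catalog[message_key])
    return (f'<html lang="{lang}" dir="{direction}">'
            f'<head><meta charset="utf-8"></head>'
            f'<body><p>{text}</p></body></html>')
```

Because the direction is stored with the messages, adding another right-to-left language means adding one more catalog rather than touching the template.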
Currently I plan to develop add-ons that bring much of the Ar-PHP library's functionality to a few PHP Web applications commonly used in the Arab world, like MediaWiki, vBulletin, osCommerce, Moodle, Drupal and Joomla. So I hope that I can find people from the developer communities of those applications who can assist in this effort to provide better Arabic language support.
PC: What lessons did you learn writing Arabic applications in PHP that may be useful to PHP developers who want to write PHP applications in their own languages that require non-Latin character sets or a different writing direction, for instance Asian and Eastern European languages?
KA: If you plan to be an active member of the Open Source community, but you hesitate because you are not sure if your code is good enough and if it can make a difference, the best starting point is to go ahead, break the ice, and start developing solutions for your own language.
Think about it: nobody can be more experienced than you in this domain. You can always expect that any effort you put in will be worth it and will be recognized, at least in your local region.
PC: Is there anything else you would like to say that was not asked in the questions above?
KA: As you may know, worldwide Internet usage has grown tremendously in recent years, and very rapidly in non-English speaking regions, especially in the Arab world.
For example, from 2000 to 2008, the online population in the Middle East grew about 20 times. The Arabic Web content is estimated to be doubling every year. Such growth has created demand for better Web sites developed using resources in the Arabic language.
On the other hand, Arabic, the fifth most popular language in the world, is spoken by more than 284 million people in 22 countries, yet the Arabic Web is still in its infancy, constituting less than 1% of total Web content.
So it is a fresh market. It has its own risks but also opportunities, such as the recent deal in which Yahoo acquired the Arab Internet portal Maktoob for more than 80 million dollars.
PC: Thank you for the interview, and keep up the good work supporting PHP in Arabic.
If you have further questions for Khaled, feel free to post a comment on this article and he will reply to you here.