Author: Manuel Lemos
Posted on: 2011-12-15
Categories: PHP Performance, PHP opinions
The latest benchmarks seem to indicate that PHP applications compiled by Phalanger execute noticeably faster than when they are executed by the official PHP implementation based on the Zend Engine, even when a caching extension is used.
Read this article to learn how Phalanger works and what lessons can be learned to make the official PHP implementation run at least as fast, possibly in PHP 6 based on Zend Engine 3.
What is Phalanger?
Phalanger is an Open Source PHP compiler that generates .NET assemblies from PHP code. These assemblies are very similar in purpose to Java bytecodes and Zend opcodes. This means that the resulting compiled code can run on a .NET virtual machine, just like compiled C# code, or even be turned into native machine code thanks to .NET JIT (Just In Time) compilation capabilities.
This project was created in 2004 by Tomas Matousek and Ladislav Prosek at the Charles University in Prague, Czech Republic. In 2008 they went to work for Microsoft and the project was handed over to a new team of developers composed of Jakub Misek, Miloslav Beno, Daniel Balas and Tomas Petricek. In 2009 they founded a company named DevSense with the intention of providing commercial support for Phalanger.
The Proposal of Replacing Zend Engine by Phalanger
Recently Phalanger 3.0 was released, introducing numerous improvements in terms of compatibility with PHP 5.3, interoperability with .NET platform implementations including Mono on Linux, and probably most importantly performance improvements.
But the story of this article did not exactly start with the news of the Phalanger 3.0 release. Actually what motivated this article was that a PHP developer named Rasmus Schultz went on the php.internals mailing list and proposed to replace the official PHP implementation based on Zend with another based on Phalanger.
The reactions of total refusal of the proposal were somewhat expected. Some developers presented technical arguments. Others presented more emotional arguments like the fact that core developers have been working for years on the C language code that executes PHP and its extensions.
The fact that .NET is a Microsoft thing was also brought up. Nowadays Microsoft is more friendly to the Open Source world and the PHP project in particular. That still does not make people forget the dark past of Microsoft, when they actively fought and disdained Open Source projects. So there is always some rejection due to the lack of trust in Microsoft's intentions.
Anyway, Phalanger is not a Microsoft project. It can run as well on Linux and other platforms using Mono, which is an Open Source platform implementation of the .NET specification. This is an important detail because the vast majority of the public Web servers on which PHP applications are installed run on Linux.
There were claims that Mono would be much slower than the .NET engine but the Mono project has evolved over the years and no evidence was presented to demonstrate that is still the case today.
Phalanger versus Zend Engine
All this discussion of replacing the Zend Engine based PHP implementation with Phalanger on .NET or Mono is irrelevant. Personally I do not see core PHP developers accepting that, no matter how many well justified arguments may be presented.
What may be relevant is what Phalanger did to surpass the Zend Engine performance. I think that is an opportunity for the PHP core developers to learn about different approaches to reach higher efficiency, not only in terms of execution speed but also in memory usage. Those are factors that influence the cost of scaling up a Web application.
Usually you would think that gains in execution speed of pure computational tasks may not matter so much in the real world because PHP code is most of the time waiting for I/O operations to finish, like database access, file access, receiving the HTTP requests and sending the HTTP responses.
However, when benchmarking a real world application, like for instance WordPress, the Phalanger team managed to make it take about 2 seconds less to respond to HTTP requests than with the regular PHP based on Zend Engine using FastCGI as well as an opcode cache.
It may not seem much, but a 2 second improvement in requests that take on average 7 seconds is not a small improvement. Taking 7 seconds as the original response time, that is a gain of close to 29%. In cases where the weight of computational tasks performed by PHP scripts is higher, the gain can be even more evident.
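The exact percentage depends on what the 7 seconds refers to. Here is a quick check of both readings in Python, assuming only the two figures quoted above. The variable names are mine, for illustration.

```python
# Figures quoted in the text above.
saved_seconds = 2.0
quoted_seconds = 7.0

# Reading 1: 7 seconds is the response time before the improvement.
gain_if_baseline = saved_seconds / quoted_seconds

# Reading 2: 7 seconds is the response time after the improvement,
# so the original time would have been 9 seconds.
gain_if_improved = saved_seconds / (quoted_seconds + saved_seconds)

print(f"{gain_if_baseline:.1%}")  # 28.6%
print(f"{gain_if_improved:.1%}")  # 22.2%
```

Either way the gain is somewhere between a fifth and a third of the response time.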
Even if you are not so concerned with your site response speed in order to keep the user happy, keep in mind that nowadays Google considers the response speed as one of the factors that affect the ranking of site pages. So any response speed gain is definitely not something to be ignored.
Given these facts I decided to contact the Phalanger team to learn more about why the Phalanger + .NET setup provides such a great improvement over a Zend Engine + FastCGI + opcode cache setup. Thankfully Jakub Misek replied very quickly to my inquiries.
The most relevant thing I learned from Jakub is that the main factor is the ability of the .NET engine to compile the .NET assemblies generated by Phalanger from PHP code into machine code optimized for the CPU of the current machine. This is called JIT (Just In Time) compilation.
The Zend Engine also compiles PHP into opcodes. This is equivalent to Phalanger compiling PHP into .NET assemblies, and Quercus compiling PHP code into Java bytecodes.
The difference is that the Zend Engine executes the compiled opcodes in a sort of virtual machine emulating the behavior of a CPU, while .NET and Java execution engines can compile the assemblies or bytecodes into native machine code. That code can run much faster than the current Zend Engine can interpret and execute the opcodes.
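To give an idea of what "a sort of virtual machine" means, here is a minimal sketch in Python of an opcode interpreter loop. The three opcodes and the program are made up for illustration. They are not real Zend opcodes, and the real Zend Engine is written in C and is far more elaborate.

```python
# Hypothetical three-instruction opcode format, for illustration only.
PROGRAM = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PUSH", 4), ("MUL", None)]

def interpret(program):
    """Execute opcodes one at a time on a stack, like a bytecode virtual machine."""
    stack = []
    for opcode, operand in program:
        # This dispatch is repeated for every opcode on every execution.
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()

result = interpret(PROGRAM)
print(result)  # 20, that is (2 + 3) * 4
```

The point is the dispatch inside the loop: an interpreter pays that cost again for each opcode every time the script runs, while a JIT compiler pays the translation cost once and then runs plain machine instructions.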
Another point brought up by Jakub is that a PHP Web application compiled by Phalanger and run by the .NET engine uses a single process to manage a single memory allocation heap for all the requests. As for the Zend Engine + FastCGI (or Apache pre-fork), there will be multiple processes running in parallel, each with their own memory heap with memory being reallocated in every request.
This seems to not be so much a matter of Phalanger versus Zend Engine, but rather of multi-process versus multi-threaded Web servers. It is true that if you run a multi-threaded Web server like Microsoft IIS (or Apache worker, Nginx, lighttpd, etc...) you only use a single memory heap for all requests.
This usually leads to a more efficient usage of the available RAM in a Web server machine, as it will reduce the possible memory waste after the execution of requests handled by scripts that consume much more memory than the average.
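Here is a rough back-of-the-envelope model of that waste, written in Python. All the numbers are hypothetical, and the model assumes that each worker process keeps its heap at the size of the largest request it has handled, while a shared heap only needs to cover the requests running at the same moment. Real memory allocators are more subtle than this.

```python
# Hypothetical memory used per request in MB:
# one row per worker, one column per time slot.
USAGE_MB = [
    [8, 64, 8],
    [8, 8, 64],
    [64, 8, 8],
]

def multi_process_footprint(usage):
    """Assumption: every process retains a heap as large as its biggest request."""
    return sum(max(worker) for worker in usage)

def shared_heap_footprint(usage):
    """Assumption: one heap sized for the busiest moment across all workers."""
    return max(sum(slot) for slot in zip(*usage))

separate = multi_process_footprint(USAGE_MB)
shared = shared_heap_footprint(USAGE_MB)
print(separate, shared)  # 192 80
```

In this made-up scenario a single heavy request per worker is enough to more than double the RAM held by the multi-process setup.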
This is one point that I brought up in 2008 when I was invited to attend the Microsoft Web Development Summit. There I gave a small talk about things that Microsoft can do to help PHP run better on Windows. One of those things is to help make PHP more thread-safe (see slide 12).
Zend Engine and many of the PHP extensions are already thread-safe, so in theory PHP can be used to run on a multi-threaded Web server without crashing. In practice the problem is that there are some extensions that rely on code that is not thread-safe. So crashes may still occur when that code runs on a multi-threaded Web server.
That is why it is more often recommended to run PHP as FastCGI instances, instead of running it as a module of IIS or any other multi-threaded Web server. But this has the penalty of the overhead of communicating with a FastCGI instance, as well as leading to greater RAM waste.
Improvements for PHP 6 on Zend Engine 3
PHP running on Zend Engine is already a very mature platform for Web application development, but it can go further by addressing more complex matters.
Basically I see two complex matters that should be addressed in future versions of the Zend Engine based PHP. I am not saying this is a trivial effort, much less that it is something that may be fun to do for any core developer, but it seems to be necessary to achieve progress, so PHP based on Zend Engine can match the performance of .NET, Java and other similarly optimized environments.
1. Thread-safety for running with less memory waste
As I mentioned above, most of the Zend Engine and PHP extensions code is thread-safe. What remains to be done is to identify which libraries used by PHP extensions are not thread-safe and either fix the code to make it thread-safe or replace the libraries with equivalents that are thread-safe.
2. JIT (Just in time) compiler
The current Zend Engine opcode interpreter and executor needs to be replaced by a JIT compiler that generates native machine code from Zend opcodes. The resulting machine code should be optimized and cached in memory.
Implementing a proper JIT compiler is easier said than done, but fortunately there are some Open Source projects that can be adopted, rather than writing a JIT compiler from scratch. Here is an overview of some that I know:
a) Phalanger or Quercus
There is not much more to say about Phalanger besides what was already said above.
I just suspect that it will never be considered an acceptable JIT solution, not because of technical matters, but maybe because it is built on top of a platform created by Microsoft. And you know, there are core developers that hate anything related to Microsoft with a passion. But who knows, I may be wrong.
What I said about Phalanger, I could also say about Quercus. That is a project which compiles PHP into Java bytecodes. It is basically the same approach for a different platform, which is not so different after all, as .NET is basically the Microsoft implementation of a platform like Java.
b) Facebook HipHop
Coincidentally, Facebook just released an enhanced version of their HipHop PHP compiler that does JIT. Unlike the initial release, which takes hours to compile a simple PHP application like WordPress, this release implements fast dynamic translation of PHP code to native machine code, which can compile PHP code in a few seconds.
Facebook employed Scott MacVicar, a PHP core developer, to work on the HipHop PHP compiler project. I guess this fact helps make the HipHop JIT engine a strong candidate to be used by PHP core developers as the JIT engine for the future of Zend Engine based PHP.
c) PHC PHP Compiler
PHC is a PHP compiler that compiles PHP scripts into PHP extensions that can be used with the Zend Engine based PHP implementation. To be accurate we cannot call it a JIT engine because it uses the static compilation approach, very similar to the original HipHop PHP compiler implementation, so it is very slow.
The main difference is that PHC can generate C code that can run with Zend Engine, while HipHop is a project totally independent of Zend Engine.
d) LLVM
LLVM is a set of compiler tools that among other things can be used to create JIT compilers for a variety of languages, including PHP.
In 2008 Nuno Lopes started working on a PECL LLVM extension that aimed to add JIT capabilities to the Zend Engine by compiling Zend opcodes into native machine code.
I contacted Nuno and he told me that the project is somewhat abandoned because other core developers showed little interest in helping to achieve the project goals.
Now that interest in bringing to life a PHP JIT compiler that works with the Zend Engine has sort of emerged, who knows whether Nuno or other developers will regain interest in resuming the project.
e) Zend's own JIT compiler
Zend may as well come up with their own JIT compiler. They could even build a JIT engine that generates native machine code directly, instead of generating intermediary opcodes first. That is basically what the Google V8 engine does, and it is really very fast.
Such a project would cost Zend a lot of resources, but in the end it would be something over which they would have greater control.
f) Something else
Well, I just covered several possibilities of existing JIT projects that could somehow be adopted in future PHP releases based on the Zend Engine. If you know about any other interesting JIT solutions, just post a comment telling about them.
When will a PHP JIT based engine be available?
I think there is a consensus that PHP must evolve. If it does not evolve, something else will take its place. The fact that other companies have developed engines that make PHP run faster and even with greater memory efficiency is just a symptom of the need for PHP to evolve.
PHP was created in 1994. In 2000 PHP 4 was launched introducing Zend Engine 1. It brought to PHP the power of opcode compilation. In 2004 PHP 5 was launched evolving the PHP object model. The original PHP 6 plan failed because adding built-in Unicode support was too ambitious a goal.
I think now it is more than time for a big PHP release. I think nowadays the PHP language is very much feature complete, but the PHP code execution engine needs to evolve.
Will Zend or other PHP core developers recognize this need? I don't know. Anyway, I just shared my view about it to say something that probably others also think.
But historically PHP core developers have not seemed so excited about picking up ideas provided by others. They tend to get more excited about implementing ideas that they had in order to address their own needs, but that is not necessarily a rule.
Sometimes good ideas just take time to be accepted and implemented, which is a shame, but better late than never.
I remember for instance one time in 2002 when PHP 5 was being planned. Andi Gutmans of Zend asked for suggestions of features that developers felt were important to implement in PHP 5.
I suggested adding built-in support for the SOAP protocol in PHP, so developers could make Web services calls as simply as calling class functions. This was not something new, as several other languages already had it.
What followed my suggestion was a long and pointless discussion, not about the technical merits of the proposal, but rather about the way PHP feature proposals should or should not be presented. The discussion ended up dying when everybody was fed up with it, so the idea seemed to have been forgotten.
More than 2 years later, in 2004, the proposal ended up being implemented by somebody at Zend. The implementation worked exactly the way I proposed. So the idea was not really forgotten.
Nowadays, PHP has a more formal feature proposal system based on RFCs (Request For Comments) eventually followed by a vote process.
But usually that is meant for people that are willing to actually develop the code to implement the features. Since that is not my case, I just leave the idea here for anybody interested to pick up.
The RFC process is a step forward that helps good ideas not to be forgotten, but it is still not ideal. It is not uncommon for proposals to be rejected without much feedback to the proponent on what he needs to change to make his proposals acceptable.
This often causes very thankless situations because sometimes the proponent has already gone through a lot of effort by implementing proof of concept code. In the end some proponents give up trying to contribute to PHP because it is a frustrating process. That is a shame but it has been happening for many years.
In this article I presented my point of view of where PHP is now and where I think it should go to evolve by addressing current issues. Those issues are difficult to address but possible to implement eventually with the cooperation of external project contributors.
Feel free to post a comment to this article to tell whether you agree or not, or what you think should be done to make PHP address needs that you think it does not address well.