Paul M. Jones

Don't listen to the crowd, they say "jump."

Labor Day Benchmarks

By popular request, here is an update of my web framework benchmarks report. You can see previous result sets here:

Before you comment on this post, please have the courtesy to read at least the first two articles above; I am tired of refuting the same old invalid arguments about "hello world makes no sense", "if you cache, it goes faster", "the ORM systems are different", and "speed isn't everything" with people who have no understanding of what these reports actually say.

Full disclosure: I am the lead developer on the Solar Framework for PHP 5, and I was an original contributor to the Zend framework.

In the interest of putting to rest any accusations of bias or favoritism, the entire project codebase is available for public review and criticism here.

Flattered By Imitators

They say that imitation is the sincerest form of flattery. As such, I am sincerely flattered that the following articles and authors have adopted methodologies strikingly similar to the methodology I outlined in Nov 2006.

  • SellersRank here and here.
  • AVNet Labs here.
  • Rasmus Lerdorf here. I am considering writing a separate post about this talk by Rasmus.

Methodology, Setup, and Source Code

The methodology in this report is nearly identical to that in previous reports. I won't duplicate that narrative here; please see this page for the full methodology.

The only difference from previous reports regards the server setup. Although I'm still using an Amazon EC2 instance, I now provide the full setup instructions so you can replicate the server setup as well as the framework setup. See this page for server setup instructions.

Finally, you can see all the code used for the benchmarking here.

Results, Part 1

Update: FYI, opcode caching is turned on for these results.

The "avg" column is the number of requests/second the framework itself can deliver, with no application code, averaged over 5 one-minute runs with 10 concurrent users. That is, the framework dispatch cycle of "bootstrap, front controller, page controller, action method, view" will never go any faster than this.

The "rel" column is a percentage relative to PHP itself. Thus, if you see "0.1000" that means the framework delivers 10% of the maximum requests/second that PHP itself can deliver.

framework               avg      rel
baseline-html       2309.14   1.7487
baseline-php        1320.47   1.0000
cake-1.1.19          118.30   0.0896
cake-1.2.0-rc2        46.42   0.0352
solar-1.0.0alpha1    154.29   0.1168
symfony-1.0.17        67.35   0.0510
symfony-1.1.0         67.41   0.0511
zend-1.0.1           112.36   0.0851
zend-1.5.2            86.23   0.0653
zend-1.6.0-rc1        77.85   0.0590
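
As a minimal sketch of how the "rel" column follows from the "avg" column, the snippet below divides each average by the baseline-php average. The helper name relativeToBaseline() is an assumption made for illustration; it is not part of the published benchmark code.

```php
<?php
// Average requests/second, copied from the Part 1 results table.
$avg = array(
    'baseline-html'     => 2309.14,
    'baseline-php'      => 1320.47,
    'cake-1.1.19'       => 118.30,
    'cake-1.2.0-rc2'    => 46.42,
    'solar-1.0.0alpha1' => 154.29,
    'symfony-1.0.17'    => 67.35,
    'symfony-1.1.0'     => 67.41,
    'zend-1.0.1'        => 112.36,
    'zend-1.5.2'        => 86.23,
    'zend-1.6.0-rc1'    => 77.85,
);

// Hypothetical helper: each entry's share of the baseline rate,
// rounded to four decimal places like the "rel" column.
function relativeToBaseline(array $avg, $baseline = 'baseline-php')
{
    $rel = array();
    foreach ($avg as $name => $reqPerSec) {
        $rel[$name] = round($reqPerSec / $avg[$baseline], 4);
    }
    return $rel;
}

$rel = relativeToBaseline($avg);
foreach ($rel as $name => $value) {
    printf("%-20s %8.2f %7.4f\n", $name, $avg[$name], $value);
}
```

Reading the result the same way as the table: a value of 0.0896 for cake-1.1.19 means it delivers just under 9% of what plain PHP delivers on the same server.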

We see that the Apache server can deliver 2300 static "hello world" requests/second. If you use PHP to echo "Hello World!" you get 1300 requests/second; that is the best PHP will get on this particular server setup.

Cake: After conferring with the Cake lead developers, it looks like the 1.2 release has some serious performance issues (more than 50% drop in responsiveness from the 1.1 release line). They are aware of this and are fixing the bugs for a 1.2.0-rc3 release.

Solar: The 1.0.0-alpha1 release is almost a year old, and while the unreleased Subversion code is in production use, I make it a point not to benchmark unreleased code. I might do a followup report just on Solar to show the decline in responsiveness as features have been added.

Symfony: Symfony remains the least-responsive of the tested frameworks (aside from the known-buggy Cake 1.2.0-rc2 release). No matter what they may say about Symfony being "fast at its core", it does not appear to be true, at least not in comparison to the other frameworks here. But to their credit, they are not losing performance. (Could it be there's not much left to lose? ;-) In addition, I continue to find Symfony to be the hardest to set up for these reports -- more than half my setup time was spent on Symfony alone.

Zend: The difference between the 1.0 release and the 1.5 release is quite dramatic: a drop in responsiveness of about 23%. And then another drop of roughly 10% between 1.5 and 1.6.
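
The release-to-release drops discussed above can be checked against the table with a few lines of arithmetic. This is only a sketch; percentDrop() is a hypothetical helper, not something from the benchmark project.

```php
<?php
// Hypothetical helper: percentage of requests/second lost between
// two releases, rounded to one decimal place.
function percentDrop($before, $after)
{
    return round(($before - $after) / $before * 100, 1);
}

// Averages taken from the Part 1 results table.
printf("cake 1.1.19 -> 1.2.0-rc2: %.1f%%\n", percentDrop(118.30, 46.42));
printf("zend 1.0.1  -> 1.5.2:     %.1f%%\n", percentDrop(112.36, 86.23));
printf("zend 1.5.2  -> 1.6.0-rc1: %.1f%%\n", percentDrop(86.23, 77.85));
```

With the table's averages this works out to drops of 60.8%, 23.3%, and 9.7% respectively.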

To sum up, my point from earlier posts that "every additional line of code will reduce responsiveness" is illustrated here. Each of the newer framework releases has added features, and has slowed down as a result. This is neither good nor bad in itself; it is an engineering and economic tradeoff.

Results, Part 2

I have stated before that I don't think it's fair to compare CodeIgniter and Prado to Cake, Solar, Symfony, and Zend, because they are (in my opinion) not of the same class. Prado especially is entirely unlike the others.

Even so, I keep getting requests to benchmark them, so here are the results; the testing conditions are identical to those from the main benchmarking.

framework               avg      rel
baseline-html       2318.89   1.7710
baseline-php        1309.39   1.0000
ci-1.5.4             229.29   0.1751
ci-1.6.2             189.89   0.1450
prado-3.1.0           39.86   0.0304

CodeIgniter: Even the CI folks are not immune to the rule that "there is no such thing as a free feature"; between the 1.5.4 and 1.6.2 releases they lost about 17% of their requests/second. However, they are still running at 14.5% of PHP's maximum, compared with the 11.68% of Solar-1.0.0-alpha1 (the most-responsive of the frameworks benchmarked above), so it's clearly the fastest of the bunch.

Prado: Prado works in a completely different way than the other frameworks listed here. Even though it is the slowest of the bunch, it's simply not fair to compare it in terms of requests/second. If the Prado way of working is what you need, then the requests/second comparison will be of little value to you.

This Might Be The Last Time

Although I get regular requests to update these benchmark reports, it's very time-consuming and tedious. It took five days to prepare everything, add new framework releases, make the benchmark runs, do additional research, and then write this report. As such, I don't know when (if ever) I will perform public comparative benchmarks again; my thanks to everyone who provided encouragement, appreciation, and positive feedback.


Solar System

In the spirit of some other framework projects, the Solar Framework for PHP 5 now offers a ready-to-use Solar system to get new users off to a quick start. It's not prepared as a tarball just yet, but it is available for checkout or export using Subversion from http://svn.solarphp.com/system/trunk.

For example, if you make a checkout in your document root ...

$ cd /var/www/html
$ svn checkout http://svn.solarphp.com/system/trunk solar

... and follow the README instructions, you will have a fully-operational installation in very short order, including an SQLite database, authentication, and three example applications:

http://example.com/solar/index.php
A simple "hello world"
http://example.com/solar/index.php/hello-app
A complex "hello world" with authentication and localization
http://example.com/solar/index.php/bookmarks
A "bookmarks" application.

(Note that the "index.php" is only in the evaluation deployment; when you create a virtual host and point it at the Solar system document root, a .htaccess file makes the "index.php" unnecessary.)

You can read more about the structure and principles of the Solar system here.


BREAD, not CRUD

Several developers have asked me what "BREAD" means in web applications. Most everyone knows that CRUD is "create, read, update, delete," but I think that misses an important aspect of web apps: the listing of records to select from.

I don't recall where I first heard the term BREAD; it stands for "browse, read, edit, add, delete". That covers more of what common web apps do, including the record listings. It even sounds nicer: "crud" is something icky, but "bread" is warm and fulfilling. That's why I tend to use the term BREAD instead of CRUD, especially when it comes to Solar and action-method names in the application logic.
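
To make the five actions concrete, here is a minimal in-memory sketch with one method per BREAD action. The BookmarkStore class, its method signatures, and the example.com URL are all assumptions made for illustration; this is not Solar's actual API.

```php
<?php
// Hypothetical in-memory store with one method per BREAD action.
// Names are illustrative only, not taken from Solar.
class BookmarkStore
{
    protected $rows = array();
    protected $nextId = 1;

    // Browse: list the records to select from.
    public function browse()
    {
        return $this->rows;
    }

    // Read: fetch one record by ID, or null if it does not exist.
    public function read($id)
    {
        return isset($this->rows[$id]) ? $this->rows[$id] : null;
    }

    // Edit: change an existing record; returns false if it is missing.
    public function edit($id, array $data)
    {
        if (! isset($this->rows[$id])) {
            return false;
        }
        $this->rows[$id] = array_merge($this->rows[$id], $data);
        return true;
    }

    // Add: create a new record and return its ID.
    public function add(array $data)
    {
        $id = $this->nextId++;
        $this->rows[$id] = $data;
        return $id;
    }

    // Delete: remove a record; returns false if it was not there.
    public function delete($id)
    {
        if (! isset($this->rows[$id])) {
            return false;
        }
        unset($this->rows[$id]);
        return true;
    }
}

$store = new BookmarkStore();
$id = $store->add(array('title' => 'Example', 'url' => 'http://example.com/'));
$store->edit($id, array('title' => 'Example Site'));
```

Note that browse() is the only method that does not take a record ID, which is the piece CRUD has no name for.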

Update 1 (2008-08-21): Wow, lot of traffic from Reddit and Y-Combinator on this one. Be sure to check out my post on Web Framework Benchmarking, and of course the Solar Framework for PHP 5.

I see a couple of comments saying that "browse is the same thing as read, it's just a special-case of read." I can see where that would be true, in a limited way. Using similar logic, one could argue that "add" is a special case of "edit", it just happens that the record isn't there yet; and then "delete" is another special case of "edit", you're just editing it out of existence. So that leaves you with just Read (one/many) and Edit (existing/non-existing/out-of-existence).

I think that takes things way too far. ;-) The special cases of "edit" are *so* special that they deserve their own logic. I think the same thing applies to "browse" -- it might be a special case of "read", but it's different-enough to deserve its own place.

Update 2: Matthew Weier O'Phinney refreshes my memory -- he mentioned the term to me years ago in a discussion about his PHP port of CGI::App. Thanks, Matthew!

Update 3: I said above that you could reduce all operations to "read" (with 2 cases) and "edit" (with 3 cases). It occurs to me now that those correspond to the way GET and POST are most-widely used. So maybe it wasn't such a silly argument after all. ;-)
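
The correspondence in Update 3 could be sketched as a simple lookup. This is a simplification under stated assumptions: httpMethodFor() is a hypothetical helper, and real applications often also use GET to display the form that an edit, add, or delete is later submitted from.

```php
<?php
// Hypothetical mapping of each BREAD action to the HTTP method it most
// commonly arrives on, following the read/edit split described above.
function httpMethodFor($action)
{
    $map = array(
        'browse' => 'GET',   // read, many records
        'read'   => 'GET',   // read, one record
        'edit'   => 'POST',  // edit, existing record
        'add'    => 'POST',  // edit, record not there yet
        'delete' => 'POST',  // edit, out of existence
    );
    if (! isset($map[$action])) {
        throw new InvalidArgumentException("Unknown BREAD action: $action");
    }
    return $map[$action];
}
```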



Savant Has A New Owner

As many of you know, I've been the lead of many different PHP libraries over the years: Contact_Vcard_Parse, Contact_Vcard_Build, DB_Table, Text_Wiki, and others. As each matured, I handed them over to other maintainers who continued to improve on them and take them to greater heights. Now that time has come for Savant, one of my early and favorite PHP projects.

Due to time constraints, mostly because of my Solar framework project, I haven't been able to pay as much attention to Savant as I think it deserves, so I made the hard decision to put it up for adoption. Lucky for the Savant community, Brett Bieber (aka Salty Beagle) of the PEAR Group picked up on that call right away. Brett is now the steward and lead developer of the Savant Template System for PHP.

The transfer of code, domain names, and hosting is complete, but the transition period might be a bit bumpy, so please bear with us. Brett is committed to "carrying the torch" for Savant (his words). Anyone who wants to help out the new project lead can contact him at "brett.bieber --at-- gmail --dot-- com".

Thanks, Brett, for taking over the project, and good luck!


Ledger's Joker

I plan on writing a much lengthier post about The Dark Knight, and especially about Heath Ledger's portrayal of the Joker. But I want to get this bit out first.

It took me a while to figure out what it is about the Joker in this movie that strikes me as so fascinating and familiar and yet so terrifying, but I think I have it: Ledger takes the intense psychosis of Hannibal Lecter and mixes in the coyote characteristics of Daffy Duck. There's likely a lot more to it than that, but I've seen the movie three times now -- you tell me:

[Images: Hannibal Lecter + Daffy Duck = Ledger's Joker]

(The Joker was always my favorite villain, and Daffy is one of my favorite Warner Brothers characters, so maybe I'm predisposed to pick out similar behaviors.)



Exceptional command-line PHP

(Yes, I know, I've done no blogging in far too long. I've got a stack of stuff to blog about, but it's all rather heavy. In the mean time, here's something light.)

When executing code at the command line using php -r and PHP 5.2.5, be sure not to extend the Exception class. It will cause a segmentation fault.

For example, the following causes no trouble at all:


Samurai:~ pmjones$ php -r "throw new Exception();"
PHP Fatal error:  Uncaught exception 'Exception' in Command line code:1
Stack trace:
#0 {main}
  thrown in Command line code on line 1

But the next example gives a segmentation fault following a long ... pause ... after the stack trace output:


Samurai:~ pmjones$ php -r "class Foo extends Exception {} throw new Exception();"
Fatal error: Uncaught exception 'Exception' in Command line code:1
Stack trace:
#0 {main}
  thrown in Command line code on line 1
Segmentation fault

Note that we didn't even throw the extended Foo exception; we threw the native PHP exception. The mere presence of the extended class is enough to cause the segfault.

It took me two evenings to track this down; what you see here is the simplified generic case. I've entered a bug with the PHP guys here.

Update: I thought I was running 5.2.6, but I was wrong; this was occurring on PHP 5.2.5. Note to self: check to make sure you're running the latest version. :-)

Update (2008-08-12): These guys found the problem earlier, too: https://bugs.launchpad.net/ubuntu/+source/php5/+bug/198246.


On Plumbing

Note to self: next time the bathtub won't drain, first check to make sure the plug is open **before** you assume it's clogged and pour 2 1/2 bottles of Drano down the pipe.



Line Length, Volume, and Density

Update: This entry seems to be getting a lot of new attention; welcome! The lessons of line length, volume, and density, along with lots of other good design principles, are applied to the Solar Framework for PHP 5. Be sure to give it a look if you're interested in well-designed PHP code.

When it comes to coding style, there are various ideas about how you should write the individual lines of code. The usual argument is about "how long should a line of code be"? There's more to it than that, though. Developers should also take into account line volume ("number of lines") and line density ("instructions per line").

Line Length

The PEAR style guide says lines should be no longer than 75-85 characters. Some developers think this is because we need to support terminals where lines may not wrap properly, or because some developer screens may not be big enough to show more than that without having to scroll sideways, or because it's tradition, and so on. These reasons may even be accurate in some sense. However, I see the 75-character rule as recognizing a cognitive limitation, not a requirement that can change with available technology.

How many words per line can a person scan, and still be able to grasp the content of the line in the context of the surrounding lines? Printing and publishing typographers figured out a long time ago that most people can read no more than 10 to 12 words per line before they have trouble differentiating lines from each other. (A "word" is counted as five characters on average.) Even allowing for a 25% to 50% increase, that brings us up to 15 words. Times 5 characters per word, that means 75 characters on a line.

So the style guide limitation on line length is not exactly arbitrary. It is about the developer's ability to effectively scan and comprehend strings of text, not about the technical considerations of terminals and text-editors.
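
For anyone who wants to check their own code against that limit, a rough sketch follows. The findLongLines() helper and its 75-character default are assumptions chosen to match the discussion above; it counts bytes with strlen(), so multibyte source text would need mb_strlen() instead.

```php
<?php
// Hypothetical helper: report lines longer than $limit characters.
// Returns an array of line number => length for each offending line.
function findLongLines($source, $limit = 75)
{
    $long = array();
    $lines = preg_split('/\r\n|\r|\n/', $source);
    foreach ($lines as $i => $line) {
        $length = strlen($line);
        if ($length > $limit) {
            $long[$i + 1] = $length; // line numbers start at 1
        }
    }
    return $long;
}

// Two sample lines: a short assignment and a dense one-liner.
$code = "\$foo = Zim::getVal('foo');\n"
      . "list(\$foo, \$bar, \$baz) = array(Zim::getVal('foo'), "
      . "Dib::getVal('bar'), Gir::getVal('baz', Gir::DOOM));\n";

print_r(findLongLines($code));
```

Only the second sample line is reported; the first is well under the limit.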

Line Volume and Density

Some developers believe you should put as much code as possible on a single line, to reduce line-count. They say this makes the code read more like a "sentence". In doing so, these developers trade line "volume" for line "density" (or line "complexity").

Increasing the density of a line tends to make it less readable. Lines of code are generally lists of statements, not natural-language prose. If you put a lot of instructions on a single line of code, that tends to make it harder for other developers to decipher the logical flow.

Examine the following:

list($foo, $bar, $baz) = array(Zim::getVal('foo'), Dib::getVal('bar'), Gir::getVal('baz', Gir::DOOM));

(Yes, I have actually seen code like this. Only the identifier names have been changed.)

Now compare that to the following equivalent code:

$foo = Zim::getVal('foo');
$bar = Dib::getVal('bar');
$baz = Gir::getVal('baz', Gir::DOOM);

When I showed this rewrite to the initial developer, his complaint was: "But it's more lines!".

Increasing line volume ("more lines") and reducing line density does three things:

  1. It reduces line length to make the code more readable.

  2. Making it more readable makes the intent of the code more clear. The logical flow is easier to comprehend.

  3. In this particular case, it may be faster than the original one-liner, because it drops the list() and array() constructs. True devotees of the Zend Engine will be able to say for certain if this translates into faster bytecode execution. (I am not a fan of speed for its own sake, but in this case it would be good gravy over the meat of the above two points.)
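
Whether point 3 holds on a given setup can be tested rather than guessed. The sketch below is a crude micro-benchmark under stated assumptions: Zim, Dib, and Gir are stand-in classes invented so the example runs, and the timings will vary with PHP version, opcode caching, and hardware, so treat any difference as indicative at best.

```php
<?php
// Stand-in classes so the example runs; the real Zim, Dib, and Gir
// are not shown in this post.
class Zim { public static function getVal($key) { return "zim-$key"; } }
class Dib { public static function getVal($key) { return "dib-$key"; } }
class Gir
{
    const DOOM = 1;
    public static function getVal($key, $flag = 0) { return "gir-$key-$flag"; }
}

// Time $count iterations of the list()/array() form. It is wrapped
// across lines here only to keep this listing narrow.
function timeDense($count)
{
    $start = microtime(true);
    for ($i = 0; $i < $count; $i++) {
        list($foo, $bar, $baz) = array(
            Zim::getVal('foo'),
            Dib::getVal('bar'),
            Gir::getVal('baz', Gir::DOOM)
        );
    }
    return microtime(true) - $start;
}

// Time $count iterations of the three separate assignments.
function timeSeparate($count)
{
    $start = microtime(true);
    for ($i = 0; $i < $count; $i++) {
        $foo = Zim::getVal('foo');
        $bar = Dib::getVal('bar');
        $baz = Gir::getVal('baz', Gir::DOOM);
    }
    return microtime(true) - $start;
}

printf("list/array: %.4f sec\n", timeDense(100000));
printf("separate:   %.4f sec\n", timeSeparate(100000));
```

Run it several times before drawing any conclusion; a single run of a loop this small is easily swamped by noise.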

In reducing line density, you don't have to make one line correlate with a single statement (although usually that's a good idea). Here's another way to rewrite the original example, this time as a single statement across multiple lines:

list($foo, $bar, $baz) = array(
    Zim::getVal('foo'),
    Dib::getVal('bar'),
    Gir::getVal('baz', Gir::DOOM)
);

I find this less readable than the initial rewrite, but the principle is the same: more lines, but shorter, to improve readability.

Balancing Considerations

If shorter lines are better, does that mean lines should be as short as technically possible?

$foo
=
Zim::getVal(
'foo'
);

$bar
=
Dib::getVal(
'bar'
);

$baz
=
Gir::getVal(
'baz'
,
Gir::DOOM
);

It looks like the answer is "no". The line-volume vs. line-density argument is about readability and comprehension. The above example, while absurd, helps to show that overly-short lines are as difficult to read as over-long ones.

Developers with good style balance all the considerations of line length, volume, and density. That means they write lines of code no more than about 75 characters long, but not so short as to increase line volume without need. They also show attention to line density for reasons related to cognition and comprehension, not merely technical syntax.