In my previous blog entry I questioned the value of most web server benchmarking, particularly as related to Erlang. Typical benchmarks are misleading, inaccurate, and poorly executed. Perhaps worse, the intent of publishing them seems to be to assert that the fastest web server (at least according to the tests performed) is of course also the best web server. You’d think the flaws of this fallacy would be so obvious that nobody would fall for it, but think again: watching the delicious “erlang” tag over the past few days revealed the benchmarks my blog post referred to to be one of the most bookmarked Erlang-related pages during that timeframe.
Not surprisingly, though, it looks like I’m not the only one bothered by poor benchmarking practices. Over on his blog, Mark Nottingham just published a brilliant set of rules for HTTP load testing. It’s quite instructive to take your favorite set of published web server benchmarks and see just how many of Mark’s rules they violate.
Like I hinted last time, if you want benchmarks, you are best off by far if you run them yourself. That way, their relevance to the problems you’re addressing will be much more likely, and you can run them in a similar, or even the same, environment on which you plan to deploy. You can also gear the benchmarks to much more closely resemble your applications and the loads you require them to handle. Doing the benchmarking work yourself will give you valuable hands-on experience with the servers and frameworks you’re considering, allowing you to get a feel for important factors such as feature completeness and correctness, ease of development, flexibility, and ease of deployment and runtime management/monitoring, none of which can be gauged by someone else’s performance benchmarks. Finally, by doing your own benchmarking you can also help ensure the validity and usefulness of your results by following Mark’s load testing rules.