Erlang Web Server Benchmarking :: Steve Vinoski's Blog

Erlang Web Server Benchmarking

May 9th, 2011 | Published in code, erlang, performance, web

Over on his blog, Roberto Ostinelli published “A comparison between Misultin, Mochiweb, Cowboy, NodeJS and Tornadoweb.” I was going to write a reply comment there, but it got pretty long so I decided to publish it here instead. I’m going to ignore the non-Erlang web servers it discusses and focus entirely on Erlang. I’m not trying to pick on Roberto specifically here; rather, I decided to finally write something I’ve been meaning to write for a while now about Erlang web servers and benchmarking.

First, I second the request made by one commenter for including Yaws in the measurements. Roberto, if you need help with the code or setup, just let me know. If one insists on writing these kinds of benchmarks, which as you’ll learn if you read this whole entry is something I question, the least he or she could do is include Yaws since it’s the granddaddy of all Erlang web servers.

Based on the benchmark code Roberto published, I wrote the following simple Yaws module to conform to the problem statement and registered it in my Yaws configuration as a “/” appmod:


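%% Yaws appmod for the benchmark: echoes the "value" query parameter as XML.
-module(yaws_bench).
-export([out/1]).

%% Yaws calls out/1 with the request's #arg{} record for every request
%% that matches the appmod path.
out(Arg) ->
    [{status, 200},
     {content, "text/xml",
      %% queryvar/2 returns {ok, Value} when the parameter is present
      case yaws_api:queryvar(Arg, "value") of
          {ok, Value} ->
              ["<http_test><value>", Value, "</value></http_test>"];
          _ ->
              "<http_test><error>no value specified</error></http_test>"
      end}].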

I then measured it on my Ubuntu 10.10 two-core system using Roberto’s published httperf command against the misultin and Mochiweb code he published, and found that Yaws definitely holds its own, even though it’s a full-featured web server and does not claim to be just a lightweight library offering (sometimes partial) HTTP support as some frameworks do. For some tests Yaws outperforms misultin, and for others it doesn’t. This is interesting, considering that neither Klacke nor I have made any attempts at performance improvements in Yaws recently.

Second, the benchmarks do not compare apples to apples. Both Mochiweb and Yaws, for example, produce replies that are larger in size than misultin’s replies, primarily because they both include Server and Date headers. As I’ve learned from years of helping maintain Yaws, date calculations can noticeably and surprisingly impact Erlang web server performance, yet simply leaving Date headers out isn’t an option for real-world apps since HTTP 1.1 pretty much requires them (section 13.2.3 of RFC 2616 states, “HTTP/1.1 requires origin servers to send a Date header, if possible, with every response, giving the time at which the response was generated…”). Caches use Date headers for several reasons, for example in the absence of cache-control headers to help heuristically calculate content expiration. Even ignoring the date calculation requirements, just creating and delivering larger replies due to the presence of the Server header will negatively impact any comparisons based on request/second measurements.
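One way to keep the cost of date calculation down is to format the date string at most once per second and reuse it for every reply sent within that second. The following is a minimal sketch of that idea, not code taken from Yaws or any other server; the module name date_cache and its function names are made up for illustration.

```erlang
%% Hypothetical module: caches the formatted HTTP Date value per second.
-module(date_cache).
-export([date_header/2, format_rfc1123/1]).

%% date_header(Now, Cache) -> {DateString, NewCache}
%% Now is a calendar:datetime() in UTC; Cache is {DateTime, String} or
%% undefined. If the cached entry was built for the same second, reuse it.
date_header(Now, {Now, Str} = Cache) ->
    {Str, Cache};
date_header(Now, _StaleOrEmpty) ->
    Str = format_rfc1123(Now),
    {Str, {Now, Str}}.

%% Format a UTC datetime as an RFC 1123 date suitable for a Date header.
format_rfc1123({{Year, Month, Day} = Date, {Hour, Min, Sec}}) ->
    %% calendar:day_of_the_week/1 returns 1 for Monday through 7 for Sunday.
    DayName = element(calendar:day_of_the_week(Date),
                      {"Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"}),
    MonthName = element(Month,
                        {"Jan", "Feb", "Mar", "Apr", "May", "Jun",
                         "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"}),
    lists:flatten(
      io_lib:format("~s, ~2..0B ~s ~4..0B ~2..0B:~2..0B:~2..0B GMT",
                    [DayName, Day, MonthName, Year, Hour, Min, Sec])).
```

A caller would pass calendar:universal_time() as Now and thread the returned cache through its own loop state, so the formatting work happens only when the second changes.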

Third, the benchmarking approach includes no application “think time.” How many real-world apps just blast request after request down a connection without any intervening time to handle replies? If the goal is to measure something akin to real-world apps, then the benchmarks should at least be using something like httperf’s --wsess option to simulate client think time. And unfortunately doing that is hard to get right for generic benchmarks, since different client apps will have different think times.
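To make the notion of think time concrete, here is a rough sketch of a client session that pauses between receiving a reply and sending the next request. It is an illustration only and not a substitute for httperf; the module name think_client is hypothetical, and the function that actually performs a request is passed in so the sketch does not depend on any particular HTTP client.

```erlang
%% Hypothetical module: runs one client session with think time.
-module(think_client).
-export([session/3]).

%% session(RequestFun, Requests, ThinkMs) -> [Reply]
%% Calls RequestFun on each element of Requests in order and sleeps
%% ThinkMs milliseconds after each reply, except after the last one.
session(RequestFun, Requests, ThinkMs) ->
    session(RequestFun, Requests, ThinkMs, []).

session(_RequestFun, [], _ThinkMs, Acc) ->
    lists:reverse(Acc);
session(RequestFun, [Req | Rest], ThinkMs, Acc) ->
    Reply = RequestFun(Req),
    case Rest of
        [] -> ok;                      % nothing left to think about
        _  -> timer:sleep(ThinkMs)     % simulated client think time
    end,
    session(RequestFun, Rest, ThinkMs, [Reply | Acc]).
```

Assuming the inets application has been started, something like think_client:session(fun httpc:request/1, Urls, 500) would fetch each URL with half a second of think time in between. The right value for ThinkMs depends entirely on the client app being modeled, which is exactly why this is hard to get right generically.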

On a related note, what exactly is the goal of these benchmarks? To imply that faster is better? That’s unfortunately a commonly held fallacy. Given that the blog entry states that the target is dynamic applications, consider the fact that the performance of a real-world dynamic application is often dominated by something other than the web server, perhaps some back-end service from which page data is being fetched, for example. A real-world setup greatly concerned with performance is likely to have nginx out in front, probably with a local cache, to handle fast-path requests, shunting only those requests it can’t fulfill off to the slower back-end server. Such benchmarking games are therefore often misguided as far as real-world dynamic apps are concerned because they end up measuring something that isn’t even in the critical path in a real setup.

I don’t agree with Kyle Drake’s comment on Roberto’s blog about code ugliness, since the Erlang code posted there is very clear and would look like “garbage” only to someone who doesn’t know the language. But I do agree with the sentiment, which is that for dynamic apps, what often matters is what kind of code, and how much, you have to write and maintain to support your app. Given that Erlang web servers tend to make use of the underlying Erlang/OTP facilities for HTTP parsing and socket handling, all things considered you’re just not going to get a huge variation in performance among them, assuming they’re written halfway decently. What matters for dynamic apps are the stability of the web server/library and the programming model it offers. These are what Roberto should really be benchmarking, but of course that’s basically impossible since stability would take a long time to prove, and programming model is a matter of taste that can’t be conveniently measured using artificial benchmarking tools. This reminds me of one of my old columns on this very issue as applied to enterprise middleware, entitled “The Performance Presumption” (PDF); the short version is that people often measure performance simply because performance is relatively easy to measure. The lesson is that you shouldn’t rely on generic benchmarks, but rather you should take the time to create specific benchmarks that mimic the app you want to develop, and base your decisions on the results of that exercise.

On top of all that, I don’t really understand the desire to keep writing new Erlang web frameworks for performance reasons. As I stated earlier, if a framework uses Erlang’s built-in packet decoding and socket handling, it won’t perform a great deal better than any other Erlang web framework. OTOH, if someone writes a new framework with the hope of providing a really nice new programming model — webmachine is a fantastic example of this — then they shouldn’t be “proving” how good the programming model is by trying to show how fast it is. Ever seen webmachine being advertised via performance benchmarks? Neither have I.
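The built-in packet decoding mentioned above is exposed through erlang:decode_packet/3 and the matching {packet, http} family of socket options. The sketch below, with a made-up module name, gives a rough idea of how little parsing code a server needs to write on top of it. It handles only a complete request line and headers, and it makes no claim about how any particular server actually uses these calls.

```erlang
%% Hypothetical module: parses an HTTP request head with the built-in decoder.
-module(http_decode_sketch).
-export([parse/1]).

%% parse(Binary) -> {Method, Uri, Version, Headers}
%% Incomplete or malformed input is not handled and will crash the caller.
parse(Bin) ->
    {ok, {http_request, Method, Uri, Version}, Rest} =
        erlang:decode_packet(http_bin, Bin, []),
    {Method, Uri, Version, headers(Rest, [])}.

%% Decode header lines one at a time until the end-of-headers marker.
headers(Bin, Acc) ->
    case erlang:decode_packet(httph_bin, Bin, []) of
        {ok, http_eoh, _Body} ->
            lists:reverse(Acc);
        {ok, {http_header, _, Name, _, Value}, Rest} ->
            headers(Rest, [{Name, Value} | Acc])
    end.
```

Since the heavy lifting happens inside the runtime, servers built this way share most of their parsing cost, which is the reason to expect only modest performance differences among them.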

Let’s face it, the Erlang web development community isn’t large enough to support numerous web servers and frameworks. I’m sure some will disagree, but publishing artificial benchmarks designed to “prove” which is best IMO results mostly in just fragmenting the community. If you really have an itch to write a fast Erlang web server, you’d help the community much more by contributing to an existing one, including the Erlang inets web server included in Erlang/OTP and now powering the Erlang website. For Yaws, Klacke and I often take patches and suggestions from our users, and we gladly welcome solid contributions intended to improve Yaws performance. If you’re just dying to show off your chops, note that improving performance in a long-lived and highly stable codebase like Yaws without breaking anyone’s code is far more challenging than writing another new server that basically doesn’t differ much from what already exists.

Or perhaps better yet, contribute to the Erlang core. IMO the next major performance improvements in Erlang web servers will come not from minor tweaks in handling binaries or such things, but rather via radical improvements in the Erlang TCP driver or even from developing a whole new HTTP-specific driver. Unlike a war of artificial benchmarks among Erlang web servers, these approaches have a great chance to improve the lot of all Erlang web systems.

Responses

Roberto Ostinelli says:

May 9th, 2011 at 11:24 pm (#)

Hi Steve, I can only agree to many of the sensible points you make.

Whenever I discuss benchmarks with people around me I always start by stating that speed is only one of the ‘n’ things you want from a webserver, and these would include stability, features, ease of maintenance, a low standard deviation (i.e. a known average behaviour), and such.

And that’s precisely the point. There are specific situations where speed can seriously help, and those are the kind of situations I’m often confronted with: small packets, fast application times, loads of connections. And that was the reason for me writing Misultin in the first place.

That’s trading features for speed, no secret in that. If it were to be a big application, probably the percentage of time used by any of these servers for header parsing / socket handling / querystring parsing would be only a minor part in the overall process that produces a response. Because of that, it is most probable that the different servers would flatten out to more similar speeds on a big application. That explains why I am not interested in adding application “thinking time”, because that is, by definition, application-dependent, and I test that every time I release a specific app.

As a last note, I did not include Yaws for a simple reason: it’s a full-blown server, the “old guy” is a feature-loaded, unquestioned webserver, and because of that I simply thought that it would have been unfair to add it to the comparison. This really sounds to me like apples and pears, not a minor difference like the missing headers you are referring to. IMHO, of course, and I’ll be more than pleased to add Yaws to the next run which I might be called to do.

AFA contributing to OTP itself, I would be more than delighted and there are signs to try to get along on that.. I just believe it to be out of my reach: I wrote misultin simply because I needed it.

Keep up the good work,

r.

Loïc Hoguin says:

May 10th, 2011 at 4:23 am (#)

You are generalizing the range of applications involved.

Some applications are little more than “hello worlds”, except with more data transferred. This is one application Cowboy will be used for.

Also, why cache through nginx? Bind your Erlang web server to port 80 using procket, cache yourself inside Erlang as part of your application and probably with more fine grained control over it. No proxy needed.

Finally, something that these benchmarks don’t show, is that Cowboy and misultin are already comparable or better in performance to nginx. In the upcoming FastCGI work in Cowboy, running a PHP “hello world”, we get 1700 reqs/s versus 1200 in the C server. And with the way Cowboy is designed, you can always interface handlers between the server and FastCGI to make sure you only call PHP when required, with more fine grained control over it here again.

I don’t disagree people should consider yaws, perhaps inets too, and I’d like to see the benchmarks for it, but ultimately they are completely different beasts.

By the way: point noted about Server and Date headers. Server was next on my todo list, and Date can be done at the same time. Not sure why building a date can’t be fast, though: you can always cache a good chunk of it, since what changes often are the seconds and minutes, and those are easily replaced in the string/binary.

steve says:

May 10th, 2011 at 10:50 am (#)

@Loïc: I don’t think I’m generalizing the range of applications, given that the range of applications was never stated in the original article. On the contrary, my message is that benchmarks are hard to apply and analyze generally, but instead have to be considered within the confines of specific applications.

I’ve seen too many people read benchmarking articles like Roberto’s as definitive statements of performance and ranking of web servers and frameworks, regardless of application. I know Roberto doesn’t intend his postings that way, as is clear from his comment above, and I know you and I don’t read them that way, but unfortunately, many do. I wrote this blog entry as a counterpoint and reminder that trying to benchmark performance in a general manner is full of issues and gotchas. There’s a great deal of responsibility that goes with publishing benchmarks.

In the future I’d like to see such postings contain clear explanations of exactly the types of applications the benchmarks are targeting. If Yaws or other systems are intentionally left out of the measurements, then say why they’re left out rather than just letting readers guess. Above all, I’d really like to avoid “us vs. them” benchmarking shootouts among Erlang web servers, simply because, as I already said, the Erlang web community won’t benefit from that form of fragmentation.

Interestingly, on the Yaws mailing list we sometimes get questions about whether Yaws or Mochiweb is faster, and both Bob Ippolito and I always (and independently) answer that they’re about the same. I can’t speak for Bob but I think we’re both happier to see folks using any Erlang web server rather than trying to fight over users and get them to use the server each of us respectively works on. Very useful articles and postings can be written that instruct readers on approaches and designs that work well for Erlang web serving, comparing and contrasting against other approaches, without ever turning things into battles among servers and frameworks.

As for nginx, yes, sometimes pure Erlang can beat it. For a number of apps, though, this is not the case, including apps I’ve worked on and measured. Do some searches and you’ll find a number of Erlang web developers in the past who traversed similar paths of trying to boost their server performance and then ultimately realized that having a reverse proxy out front, typically nginx, helped them get the performance they needed. I generally prefer to stay in pure Erlang and would use nginx only when measurements show that doing so is worthwhile, but again, that approach is based exactly on a core theme of my posting above, which is that you have to measure your own apps instead of relying on general benchmarks to do the work for you.

And yes, generating Date headers “intelligently” is key. The approach you mention is roughly what Yaws does, and coincidentally is pretty much what nginx does too.

I find your comment about Yaws being a different beast interesting: if you’ve never used it, how do you know it’s a different beast? It supports FastCGI and PHP just fine. So why are you writing your own instead of using and contributing to Yaws? I could be wrong but it sounds like you’re assuming things about Yaws that may not be at all true.

kenny says:

May 10th, 2011 at 11:44 am (#)

“Let’s face it, the Erlang web development community isn’t large enough to support numerous web servers and frameworks.”

This is so important. The huge success of Rails happened in large part because the community rallied behind it: support, docs, screencasts, books, tutorials, plugins, etc. It is so vital.

Loïc Hoguin says:

May 10th, 2011 at 2:14 pm (#)

We agree there, most people understand benchmarks as a pissing contest when they’re in fact a useful tool for two things: a test to showcase the inefficiency of the software and the areas that could be improved; a test to decide what’s the best software to use for your needs.

Something to keep in mind is that Cowboy isn’t an HTTP server; it really is an acceptor pool with bells and whistles that just happens to include HTTP by default. Acceptor pools can be started and stopped at will with any protocol and options. Today I am focusing on HTTP as it’s my most pressing need but tomorrow Cowboy will be used for more protocols. In fact the gen_smtp project has a branch that makes use of Cowboy for this exact feature.

I couldn’t build on yaws or mochiweb because I needed to break compatibility anyway, so I had little hope of getting my changes merged back. Plus those two projects include tons of code that I simply don’t need or want. The design is also fairly different and very lightweight, although less so than misultin in some ways (Cowboy has a dispatcher by default) and more so in other ways (Cowboy needs only one process for websockets). All those projects also include nasty or undocumented features like parameterized modules, the process dictionary, and prim_inet calls, which I do not want. Finally, Cowboy uses binary HTTP, which various people requested from me.

Cowboy is the result of much experimentation and that experimentation would have been much harder on an older and bigger code base such as Yaws. New projects are always useful for that and I hope we’ll see many more Erlang HTTP servers in the future.

About FastCGI, well yeah, Yaws includes support for it. But that’s part of the problem: it’s inside Yaws and built specifically for it. So we went and made a standalone client called ex_fcgi which is still being worked on and that can be used by any project. At the same time that client got the same amount of experimentation I was talking about above. I know Roberto also made a FastCGI enabled HTTP server although it only has one commit and wasn’t touched ever since; but it’s probably the same thing: experimenting with new ideas. The most interesting point about FastCGI in Cowboy is the ability to mix Erlang and PHP/Ruby/.. code by dispatching requests to the right handler, allowing us to port existing platforms to Erlang one step at a time.

I would probably have looked at Yaws more if it weren’t monolithic, perhaps reusing some of its components in other projects too. The same goes for mochiweb, really.

steve says:

May 10th, 2011 at 4:50 pm (#)

@Loïc: thanks for the detailed reply. In an older codebase like Yaws there’s always a tension between maintaining backward compatibility and moving forward with experimentation. What I’ve found, though, is that I’ve been able to do quite a bit of augmentation and experimentation without breaking anything. For example, I was able to add the sendfile driver as well as streaming without breaking anything, and the new streaming capabilities in turn led to comet support and websockets. I also have some ideas for optimizing the dispatch path that could help performance and latency, and again, I believe I can do it without breaking anything. I personally find it much more rewarding to fit new things into established code in this way when appropriate rather than just going off and writing yet another server or framework, since as I said in my original post augmenting existing systems helps the community more and prevents unnecessary fragmentation.

Yaws has been stable for nearly a decade now, and that’s really quite an achievement on Klacke’s part and is an important feature for people looking for something they can count on. Naturally, Klacke and I try to be pretty careful to keep it that way. But given what we’ve done with Yaws over the past few years and continue to do with it, my point is that you don’t always need new projects to do experimentation.

Unless Erlang/OTP changes in ways that provide radically new features and capabilities that make it worth the effort to develop whole new HTTP servers, I’ll continue to disagree with you that more Erlang HTTP servers are needed. Without something really new they’ll just be treading over well-known ground. As I hinted in my original post, the more solutions you have, the greater the potential for community fragmentation and resulting confusion; IMO you would do well to avoid ignoring or underestimating this effect.

Yurii Rashkovskii says:

May 10th, 2011 at 6:23 pm (#)

Steve,

While I can agree (to a certain degree) that creating new HTTP servers (or whatever it is) might not be good idea, especially given the size of the community, I still think we should understand that new things don’t get created just for the sake of their novelty. There are (almost) always reasons why something gets re-implemented again and again.

For me personally, the really important point of mochiweb, misultin & cowboy is their extreme light weight and positioning as a library, not as a monstrous solution for everything. If Yaws were perfect for everybody, nobody would create new servers. With all due respect, Yaws does have a reputation as a complex beast. On one hand, a decade means stability & maturity; on the other hand it means legacy software. Software does get old but that doesn’t always mean it gets better with time. A successor can always learn from past mistakes without making any unnecessary compromises.

I think that lightweight HTTP servers are essential for the infrastructure as they allow you to build on top of them, so that you have these extremely small & fast libraries when you need performance and you can build Yaws-like functionality on top of that “core” for the additional features when you want them.

With that in mind, I think that these experiments will eventually lead to a better web serving experience for all of us. Especially if all of you guys can analyze the status quo and make some important positive decisions that will affect the whole ecosystem :) Maybe all this experimentation combined can help us create the “only one” http server to be used by the community? Who knows.

Just my 2c.

Loïc Hoguin says:

May 10th, 2011 at 6:26 pm (#)

There’s another issue I forgot, which is one of moving forward. The more established the project, the longer it takes to get patches in, especially if they’re big changes. Not saying it’d happen with Yaws, it’s just a generality.

Speaking of fragmentation, I don’t think any language is really united to begin with. Except the ones with a central and easy to use repository of projects, maybe. And even then, I’m not even sure. It’s normal that people want to start from scratch and reinvent new things to make them “better” (this is often subjective). At worst we learn something. The important thing I think is that new people come to Erlang. The gen_server2 project doesn’t make people stop using gen_server or keep them from using Erlang. ;)

About the Rails comment: Rails people were united under Rails. Not Ruby people. The people who thought they needed Rails were pushing for it. Three years later they went their separate ways, some staying with Rails, others trying to make new Ruby frameworks, and others presumably moving to NodeJS. That’s how long you can keep people united before it completely breaks. I don’t think we can do anything about it. :)

Anyway I will probably try to share all my experiments with Cowboy in Stockholm later this year, and I’ll also be in London next month as a visitor, hope we can get in touch and share some more thoughts there.

steve says:

May 11th, 2011 at 9:42 am (#)

@Yurii: sadly, often the reasons new things get created are NIH syndrome along with unfounded assumptions and perceptions.

The problem with lightweight systems is that they gain weight quickly as they succeed. This isn’t a technical issue but rather is simply how markets work, and it’s unavoidable. If the software is successful, users demand new features as they discover that the “heavyweight” features supplied by other frameworks, unfairly viewed as “too heavy,” actually exist for good reasons and are needed for real-world solutions. I think you need only look at the recent history of misultin to see this effect (and that’s not at all a knock against misultin, but is rather simply a product of its success).

You seem to use the term “legacy” as derogatory, yet for many it means “stable,” “reliable,” “mature,” and “something that just works.” “Legacy” implies “successful” since if there are no users, there is no legacy. Anything you release that garners users who deploy your code in production is legacy. It’s a nice problem to have.

I don’t understand the Yaws reputation you claim, since considering all of what Yaws can do, it’s actually not complex at all. For example, if you look at the code in my blog entry above, it’s no more complicated than any of the examples Roberto posted for the other frameworks. Running it in a deployed server requires one extra line of configuration; alternatively, running it embedded requires a few extra lines of Erlang config code, certainly no more than Roberto’s Cowboy example and probably fewer. Have you ever actually written real applications in Yaws? If so, can you provide me details on exactly where the complexity is?

I think a critical point you’re either missing or unintentionally ignoring is one I tried to make clear in my posting: since HTTP handling and socket handling are essentially built into Erlang/OTP, there’s not a whole lot of room for drastic performance differences between different Erlang web frameworks. You say you can build Yaws-like functionality on top of small and fast libraries, which is true but guess what? The end result will naturally include extra overhead due to the additional code, thus negating any slight performance advantages, and it will likely end up reinventing much of what a more complete solution like Yaws already provides. So what would you rather do, use a proven solution that already exists, or just rewrite and then have to debug and maintain the very same code solutions like Yaws already provide? And if you’re going to do that, why not contribute your efforts to an existing project instead of writing a new one?

I’m pretty certain that most of Yaws’ detractors have never actually used it or even looked at it much. I think if they had, they would have been on the Yaws mailing list asking pertinent questions, perhaps challenging some of the current design and implementation, and supplying patches, and I don’t recall seeing that sort of traffic there in recent memory from anyone involved in these newer frameworks. Which is a real shame, because then the reasons for the new stuff end up appearing as nothing more than NIH syndrome, which, as I’ve already hinted repeatedly, leads to community fragmentation and a stagnation of progress in Erlang web development practices, ironically likely the very opposite of the effects actually being sought.

steve says:

May 11th, 2011 at 9:56 am (#)

@Loïc: I think you make a good point about drawing new users to Erlang. I’m not entirely convinced you need new projects to do that, though; projects like Yaws could certainly try to make themselves sexier and more visible and “in your face,” but a lot of that depends on the personalities involved and frankly that’s just not how Klacke is.

New users should not be afraid to try to contribute to Yaws. I was a new user a few years ago, and I certainly wasn’t afraid. I liked the stability Yaws gave my project from the start, and yet I could also see where I could add to it. I was then fortunate enough to have Klacke add me as a committer, and I’ve learned a lot from him and from all the Yaws users as a result. Had I instead gone out on my own, I’d have fewer users, assuming whatever I worked on ever saw the light of day at all, and would have missed out on working with one of the greatest Erlang programmers of all time. And that was back when Yaws was in svn — in this day and age of github, there’s really no barrier to contributing to existing projects.

Thanks again for your comments, and I hope we can meet in London and have time to discuss all this some more.

Loïc Hoguin says:

May 11th, 2011 at 7:53 pm (#)

“The problem with lightweight systems is that they gain weight quickly as they succeed.”

I know about that issue and can assure you that once my todo list is done, Cowboy will not get any new feature unless HTTP gets a new version; HTTP gets replaced by something better (Google is experimenting with that); some new kind of long-polling appears that does it better than previously. Everything else will be minor improvements, supporting r15+ Erlang versions, and fixing bugs. So unless something changes a lot in the web landscape, Cowboy will remain the same.

Once Cowboy is done it’ll be a good basis for the rest of the environment I’m building and I don’t plan to add features to Cowboy but rather build other applications taking advantage of it. To give an example, if I ever need RSS, SOAP, XML-RPC, they won’t be part of Cowboy, but rather separate reusable applications that won’t depend on it. Kind of like ex_fcgi does now. That’s the biggest difference in philosophy between Cowboy and Yaws, and I sure don’t want feature creep in anything I make, simplicity is always better IMHO. Where other lightweight servers become more like Yaws, Cowboy won’t, it’ll stay as is until a tech revolution happens.

Oh and don’t worry, if I need SOAP I’ll probably just use Yaws as I can’t care less about SOAP and wouldn’t want to reimplement it even if I was paid to do it. ;)

About the not posting on the ML, yeah I don’t do that unless I use something for a long time. Took me more than a year before talking on Erlang IRC channels or on the ML, and I gave up on Yaws long before reaching that stage. Although I did plan on using it initially before looking at WebMachine.

Great talks by the way, been enjoying the chat.

betareduction says:

May 16th, 2011 at 10:07 am (#)

I had to read this post. I always thought someone had to write it. I had been wandering among all the Erlang webservers just because of all the benchmarks that are published online, and could not decide for myself what to use for my project. But somehow, after studying the yaws source code over and over again, I think the community should stick to yaws and support it. It’s not only a masterpiece, but you can learn a lot from it.
