yaws

Yaws 1.85 Released

October 19th, 2009  |  Published in HTTP, web, yaws  |  Add to del.icio.us

Today Klacke announced Yaws 1.85, mainly a bugfix release. You can find the list of changes and fixes at that link, but one addition in this release I wanted to point out was our new streamcontent_from_pid feature, which allows your server application code to temporarily take over the client connection socket from Yaws, thus allowing you to feed data directly to the socket without first passing it back through Yaws. Could be just the ticket for long-polling (Comet) applications, for example.

Prior to this release, the closest feature Yaws provided for this sort of operation was the streamcontent capability, which is still very useful in that it allows you to have an Erlang process deliver data back into Yaws, which in turn sends it in HTTP chunked transfer mode back to the client. For large file resources like video files or install tarballs, or for data sources where content arrives at your server in batches from a separate back-end source, or generally for resources whose sizes are not known up front, streamcontent is perfect because it lets you transfer the data back into Yaws in chunks, at your leisure and without having to copy all the data at once. Still, though, the data has to be sent back through Yaws, which converts it to HTTP chunks and writes it to the socket, plus in this case your only choice is chunked transfer.

With streamcontent_from_pid, you reply to the out/1 upcall from Yaws with the HTTP reply headers and with the following special return tuple:

{streamcontent_from_pid, MimeType, StreamPid}

This tells Yaws that you wish to have process StreamPid take over the client socket in order to send data of media type MimeType directly back to the client. Yaws uses MimeType to set the HTTP Content-Type header, and then after sending that and all the other HTTP headers back to the client, it turns control of the client socket over to StreamPid. The code running within StreamPid must handle the following messages from Yaws:

  • {ok, YawsPid} tells StreamPid that it can proceed with using the socket. The socket is present in the original Arg variable passed to your out/1 function and can be retrieved via Arg#arg.clisock.
  • {discard, YawsPid} tells StreamPid that it shouldn’t send any data on the socket, for example because the client request was an HTTP HEAD request and so there is no response body.

To send data, your code can choose to send chunked data or non-chunked data. To send the latter, first make sure you set the HTTP Content-Length header in your initial reply to Yaws, and then once Yaws calls back to your StreamPid process, just call:

yaws_api:stream_process_deliver(Socket, IoList)

or for chunked data:

yaws_api:stream_process_deliver_chunk(Socket, IoList)

where for both cases Socket is the client socket from the Arg and IoList is an iolist containing the data to be sent. The first case just calls gen_tcp:send and is there primarily so we can maybe someday add SSL socket support for this feature. For the second case Yaws will format the data for you for chunked transfer. Unless a Content-Length header is set, Yaws will assume you want chunked transfer and will set the Transfer-Encoding header appropriately. If you’re sending chunked data, make sure you send your final chunk using the following function:

yaws_api:stream_process_deliver_final_chunk(Socket, IoList)

so that Yaws knows to send the termination chunk to inform the client of the end of the transfer.

You can continue to call these functions from your StreamPid as frequently as you need to in order to deliver your data to the client. Meanwhile, the Yaws process that handed you the socket will just sit back and wait for you (non-blocking, of course). When you’re completely finished sending, just call:

yaws_api:stream_process_end(Socket, YawsPid)

to end the transmission and give control of the socket back to Yaws. At that point, your StreamPid can exit if it wishes.

If you try out this feature, be sure to send feedback either to me or to the Yaws mailing list.

Yaws 1.82

May 29th, 2009  |  Published in erlang, web, yaws  |  Add to del.icio.us

Today Klacke released version 1.82 of Yaws. This is primarily a maintenance release, so nothing major has changed, except that yaws.conf and other files that used to go into etc now go into etc/yaws. For example, the default install prefix is /usr/local so for the installation on my laptop my yaws.conf file has moved from /usr/local/etc/yaws.conf to /usr/local/etc/yaws/yaws.conf. You’ll want to make sure you move any existing conf files you might have to the new directory.

Various other changes and bug fixes are described in the release notes on the main Yaws website. For the previous release Klacke moved the sources from sourceforge to github, so follow that link if you’re looking for source, and you can follow these instructions to build, install, and run. If you should run into any issues or problems, please send a message to the Yaws mailing list or create a new issue on the Yaws issue tracker.

As far as planning for the next release goes, aside from the usual little bug fixes and enhancements and adding tests, Klacke and I noticed Joe Williams’s recent Erlang Factory presentation that mentions Yaws CPU usage is sometimes higher than expected, so we’ll be looking into that and doing some general profiling and performance testing work.

Sendfile for Yaws

January 5th, 2009  |  Published in erlang, performance, scalability, web, yaws  |  Add to del.icio.us

A few months back Klacke gave me committer rights for Yaws. I’ve made a few fixes here and there, including adding support for passing “*” as the request URI for OPTIONS, enabling OPTIONS requests to be dispatched to appmods and yapps, and completing a previously-submitted patch for configuring the listen backlog. Klacke has just started putting a test framework into the codebase and build system so that contributors can include tests with any patches or new code they submit, and I’ve contributed to that as well.

The biggest feature I’ve added to date, though, is a new linked-in driver that allows Yaws to use the sendfile system call on Linux, OS X, and FreeBSD. I never wrote a linked-in driver before, so I was happy and fortunate to have an Erlang expert like Klacke providing hints and reviewing my code.

I did some preliminary testing that showed that sendfile definitely improves CPU usage across the board but depending on file size, sometimes does so at the cost of increasing request times. I used my otherwise idle 2-core 2.4GHz Ubuntu 8.04.1 Dell box with 2 GB of RAM, and ran Apache Bench (ab) from another Linux host to simulate 50 concurrent clients downloading a 64k data file a total of 100000 times. I saw that user/system CPU on the web host tended to run around 33%/28% without sendfile, while with sendfile it dropped to 22%/17%. The trade-off was request time, though, where each request for the 64k file averaged 0.928ms with sendfile but 0.567ms without. With larger files, however, sendfile is slightly faster and still has better CPU usage. For example, with a 256k file, sendfile averaged 2.251ms per request with user/system CPU at 8%/16% whereas it was 2.255ms and 16%/27% CPU without sendfile, which makes me wonder if the figures for the 64k file are outliers for some reason. I performed these measurements fairly quickly, so while I believe they’re reasonably accurate, don’t take them as formal results.

On my MacBook Pro laptop running OS X 10.5.6, CPU usage didn’t seem to differ much whether I used sendfile or not, but requests across the board tended to be slightly faster with sendfile.

I ran FreeBSD 7.0.1 in a Parallels VM on my laptop, and there I saw significantly better system CPU usage with sendfile than without, sometimes as much as a 30% improvement. Requests were also noticeably faster with sendfile than without, sometimes by as much as 17%, and again depending on file size, with higher improvements for larger files. User CPU was not a factor. All in all, though, I don’t know how much the fact that I ran all this within a VM affected these numbers.

Given that Yaws is often used for delivering mainly dynamic content, sendfile won’t affect those cases. Still, I think it’s nice to have it available for the times when you do have to deliver file-based content, especially if the files are of the larger variety. Anyway, I committed this support to the Yaws svn repository back around December 21 or so. If you’d like to do your own testing, please feel free — I’d be interested in learning your results. Also, if you have ideas for further tests I might try, please leave a comment to let me know.

Detailed RESTful Yaws Service

April 10th, 2008  |  Published in REST, erlang, yaws  |  Add to del.icio.us

Nick Gerakines has posted a detailed example of a RESTful web service implemented using Erlang and Yaws. It provides a lot of implementation information that I didn’t have room for in my InfoQ article on this topic.

Sam’s issue isn’t addressed here either, but no matter, the details of Nick’s example are worth examining nonetheless.

Erlang, Yaws, and ETags

April 3rd, 2008  |  Published in REST, erlang, yaws  |  Add to del.icio.us

Regarding my “RESTful Services with Erlang and Yaws” article on InfoQ.com, Sam Ruby said:

This otherwise excellent article fails my ETag test.

When Sam speaks, I listen, so I’ve given his feedback a lot of thought.

As I wrote in a comment on Sam’s blog, the Erlang/Yaws RESTful services I work on do indeed support conditional GETs, so at least my day-to-day work passes his ETags test. As for the article, there are two ways to think about it:

  1. If you focus on the “RESTful Design” portion of the article, then yes, I could have added a “think about where you need to support conditional GETs” item to the “key areas to pay attention to” list.

  2. If you focus on the Yaws/Erlang aspect of the article, then keep in mind that dealing with ETags requires dealing with HTTP headers such as If-none-match and the ETag header itself. The article already shows you how to read request headers and write reply headers, though, and how you actually create specific ETag values for use in the headers depends on the particulars of your resources — Leonard Richardson’s and Sam’s excellent book already covers this pretty well.

I intended the focus of the article to be more about item 2 than item 1, so I think not specifically addressing ETags is OK.

One thing I should have included, though, is how to parse POST data. You use the yaws_api:parse_post/1 function for that, passing in an arg record. For typical form data, it’ll give you back a list of key/value pairs over which you can iterate, or from which you can extract expected key/value pairs using yaws_api:postvar/2 (or even proplists:lookup/2 or lists:keysearch/3, if you like). See the documentation at the Yaws website for more details, but all in all, handling POST data in Yaws is fairly trivial.