{"id":7,"date":"2007-09-23T11:50:43","date_gmt":"2007-09-23T16:50:43","guid":{"rendered":"http:\/\/steve.vinoski.net\/blog\/2007\/09\/23\/tim-bray-and-erlang\/"},"modified":"2007-09-23T23:53:23","modified_gmt":"2007-09-24T04:53:23","slug":"tim-bray-and-erlang","status":"publish","type":"post","link":"https:\/\/steve.vinoski.net\/blog\/2007\/09\/23\/tim-bray-and-erlang\/","title":{"rendered":"Tim Bray and Erlang"},"content":{"rendered":"<p><a href=\"http:\/\/www.tbray.org\/ongoing\/When\/200x\/2007\/09\/22\/Erlang\">Tim&#8217;s playing with Erlang<\/a>, trying to rewrite a Ruby program for analyzing his website logs. His initial reaction appeared to be one of disgust, since his Erlang program was an order of magnitude slower than the Ruby version. Thankfully, though, a bunch of people have since jumped in and made the solution more palatable (check the comments on his posting).<\/p>\n<p>Reading between the lines, it seems that Tim was hoping to take advantage of Erlang&#8217;s concurrency to put his multicore machines to work analyzing his logs. Unfortunately, none of the answers posted in the blog comments seem to provide that, so I decided to take a crack at it myself.<\/p>\n<p>First I wrote a tiny little Erlang program to read in Tim&#8217;s entire <a href=\"http:\/\/www.tbray.org\/tmp\/o10k.ap\">sample logfile<\/a> using <code>file:read_file<\/code>, turn the result into a list of strings using <code>binary_to_list<\/code>, and then process the strings. Simple timings showed the <code>binary_to_list<\/code> call to be the slowest part, so I decided to throw Erlang&#8217;s multiprocess capability at it. 
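That baseline looked roughly like the following sketch (my reconstruction for illustration; the module name <code>baseline<\/code> and the <code>count\/1<\/code> function are made up, and <code>find_match\/1<\/code> anticipates the matching rule used below):<\/p>\n<p><code><\/p>\n<pre>-module(baseline).\r\n-export([count\/1]).\r\n\r\n%% Single-process version: read the whole file, convert the binary\r\n%% to one big string, split it into lines, and count the matches.\r\ncount(Input) -&gt;\r\n    {ok, Data} = file:read_file(Input),\r\n    Lines = string:tokens(binary_to_list(Data), \"\\n\"),\r\n    lists:foldl(fun(Line, Total) -&gt;\r\n                    Total + find_match(lists:nth(7, string:tokens(Line, \" \")))\r\n                end, 0, Lines).\r\n\r\nfind_match(\"\/ongoing\/When\/\" ++ Last) -&gt;\r\n    case lists:member($., Last) of\r\n        false -&gt; 1;\r\n        true -&gt; 0\r\n    end;\r\nfind_match(_) -&gt; 0.\r\n<\/pre>\n<p><\/code><\/p>\n<p>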
If you&#8217;re familiar with pthreads or Java threads, think of an Erlang &#8220;process&#8221; as a very lightweight thread.<\/p>\n<p>Here&#8217;s my solution:<\/p>\n<p><code><\/p>\n<pre>-module(tbray).\r\n-export([start\/2]).\r\n\r\n%% A line matches if its request path starts with \/ongoing\/When\/\r\n%% and the rest of the path contains no file extension.\r\nfind_match(\"\/ongoing\/When\/\" ++ Last) -&gt;\r\n    case lists:member($., Last) of\r\n        false -&gt; 1;\r\n        true -&gt; 0\r\n    end;\r\nfind_match(_) -&gt; 0.\r\n\r\n%% Spawn a worker that counts the matches in one chunk and\r\n%% sends the count back to Pid.\r\nprocess_binary(Pid, Bin) -&gt;\r\n    spawn(fun() -&gt;\r\n        L = string:tokens(binary_to_list(Bin), \"\\n\"),\r\n        V = lists:foldl(\r\n            fun(Line, Total) -&gt;\r\n                Total + find_match(lists:nth(7, string:tokens(Line, \" \"))) end,\r\n            0, L),\r\n        Pid ! V\r\n        end).\r\n\r\n%% Split Bin into chunks of roughly N bytes, extending each chunk\r\n%% to the next newline so no line is split across chunks.\r\nsplit_on_newline(Bin, N, All) when size(Bin) =&lt; N -&gt;\r\n    All ++ [Bin];\r\nsplit_on_newline(Bin, N, All) -&gt;\r\n    {_, &lt;&lt;C:8, _\/binary&gt;&gt;} = split_binary(Bin, N),\r\n    case C of\r\n        $\\n -&gt;\r\n          {B21, B22} = split_binary(Bin, N+1),\r\n          split_on_newline(B22, N, All ++ [B21]);\r\n        _ -&gt; split_on_newline(Bin, N+1, All)\r\n    end.\r\n\r\nsplit_on_newline(Bin, N) when N == size(Bin) -&gt; [Bin];\r\nsplit_on_newline(Bin, N) -&gt; split_on_newline(Bin, N, []).\r\n\r\n%% Read the whole file, fan out Num workers, and sum their counts.\r\nstart(Num, Input) -&gt;\r\n    {ok, Data} = file:read_file(Input),\r\n    Bins = split_on_newline(Data, size(Data) div Num),\r\n    Me = self(),\r\n    Pids = [process_binary(Me, B) || B &lt;- Bins],\r\n    lists:foldl(\r\n        fun(_, Total) -&gt; receive X -&gt; Total + X end end,\r\n        0, Pids).\r\n<\/pre>\n<p><\/code><\/p>\n<p>This solution uses multiple Erlang processes to convert chunks of the input file to lists of strings and scan them for matches. Begin with the <code>start\/2<\/code> function at the very bottom. 
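Before walking through the code, note the core pattern it relies on: plain scatter\/gather message passing. Spawn one process per chunk, have each send its partial result back, then sum the replies. Stripped of the log-parsing details, a minimal sketch looks like this (the <code>scatter_gather\/2<\/code> name and its arguments are illustrative, not part of the solution above):<\/p>\n<p><code><\/p>\n<pre>%% Run WorkFun on each chunk in its own process, then sum the\r\n%% one result message each worker sends back.\r\nscatter_gather(Chunks, WorkFun) -&gt;\r\n    Me = self(),\r\n    Pids = [spawn(fun() -&gt; Me ! WorkFun(C) end) || C &lt;- Chunks],\r\n    lists:foldl(fun(_, Total) -&gt; receive V -&gt; Total + V end end,\r\n                0, Pids).\r\n<\/pre>\n<p><\/code><\/p>\n<p>Because the counts are simply summed, the order in which the replies arrive doesn&#8217;t matter. 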
First, we read the file in one shot, then split it into <code>Num<\/code> chunks, with the <code>split_on_newline<\/code> function variants being mindful to end each chunk on a newline character so we don&#8217;t split lines across chunks. We then pass each chunk to the <code>process_binary\/2<\/code> function using a list comprehension. Each <code>process_binary\/2<\/code> call spawns a new process to first convert its chunk to a list of strings and then process those strings for matches.<\/p>\n<p>Now let&#8217;s time it. My MacBook Pro has two cores, so let&#8217;s enable SMP, and bump the Erlang process limit up to 60,000. First, we&#8217;ll compile the module and time it with just a single process:<\/p>\n<p><code>$ erl -smp enable +P 60000<br \/>\n1> c(tbray).<br \/>\n{ok,tbray}<br \/>\n2> timer:tc(tbray, start, [1, \"o10k.ap\"]).<br \/>\n{661587,1101}<br \/>\n<\/code><\/p>\n<p>OK, at 0.66 seconds, we&#8217;re already a lot faster than Tim&#8217;s approach (the second value, 1101, is the number of matches we found), but can it go faster? Let&#8217;s try 2 processes:<\/p>\n<p><code>3> timer:tc(tbray, start, [2, \"o10k.ap\"]).<br \/>\n{472067,1101}<br \/>\n<\/code><\/p>\n<p>That dropped us to 0.47 seconds, which is not an insignificant speedup. Do more processes help?<\/p>\n<p><code>4> timer:tc(tbray, start, [4, \"o10k.ap\"]).<br \/>\n{390786,1101}<br \/>\n<\/code><\/p>\n<p>Yes, at 4 processes we drop to 0.39 seconds. Let&#8217;s go up a few orders of magnitude:<\/p>\n<p><code>5> timer:tc(tbray, start, [40, \"o10k.ap\"]).<br \/>\n{380753,1101}<br \/>\n6> timer:tc(tbray, start, [400, \"o10k.ap\"]).<br \/>\n{322979,1101}<br \/>\n7> timer:tc(tbray, start, [4000, \"o10k.ap\"]).<br \/>\n{316857,1101}<br \/>\n8> timer:tc(tbray, start, [40000, \"o10k.ap\"]).<br \/>\n{318153,1101}<br \/>\n<\/code><\/p>\n<p>As we increase the number of Erlang processes, our performance improves, up to a point. At 40,000 processes we&#8217;re slower than we were at 4000. 
Maybe there&#8217;s a better number in between? It turns out that despite the numbers listed above, once you get above 400 processes or so, the numbers remain about the same. The best I got on my MacBook Pro after numerous runs was 0.301 seconds with 2400 processes, but the average best seems to be about 0.318 seconds. The performance of this approach comes pretty close to <a href=\"http:\/\/undefined.org\/erlang\/o10k.zip\">other solutions that rely on external non-Erlang assistance<\/a>, at least for Tim&#8217;s sample dataset on this machine.<\/p>\n<p>I also tried it on an 8-core (2 Intel Xeon E5345 CPUs) 64-bit Dell box running Linux, where it clocked in at 0.126 seconds with 2400 processes and 0.124 seconds with 1200 processes. I believe this use of multiple cores is exactly what Tim was looking for.<\/p>\n<p>If you&#8217;re a Java or C++ programmer, note the ease with which we can spawn Erlang processes and have them communicate, and note how quickly we can launch thousands of processes. Hopefully my example provides food for thought in that area. BTW, I&#8217;m no Erlang expert, so if anyone wants to suggest improvements to what I&#8217;ve written, please feel free to comment here.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Tim&#8217;s playing with Erlang, trying to rewrite a Ruby program for analyzing his website logs. His initial reaction appeared to be one of disgust, since his Erlang program was an order of magnitude slower than the Ruby version. 
Thankfully, though, a bunch of people have since jumped in and made the solution more palatable (check [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4],"tags":[],"class_list":["post-7","post","type-post","status-publish","format-standard","hentry","category-erlang"],"_links":{"self":[{"href":"https:\/\/steve.vinoski.net\/blog\/wp-json\/wp\/v2\/posts\/7","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/steve.vinoski.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/steve.vinoski.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/steve.vinoski.net\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/steve.vinoski.net\/blog\/wp-json\/wp\/v2\/comments?post=7"}],"version-history":[{"count":0,"href":"https:\/\/steve.vinoski.net\/blog\/wp-json\/wp\/v2\/posts\/7\/revisions"}],"wp:attachment":[{"href":"https:\/\/steve.vinoski.net\/blog\/wp-json\/wp\/v2\/media?parent=7"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/steve.vinoski.net\/blog\/wp-json\/wp\/v2\/categories?post=7"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/steve.vinoski.net\/blog\/wp-json\/wp\/v2\/tags?post=7"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}