<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: More File Processing with Erlang</title>
	<atom:link href="http://steve.vinoski.net/blog/2007/09/29/more-file-processing-with-erlang/feed/" rel="self" type="application/rss+xml" />
	<link>http://steve.vinoski.net/blog/2007/09/29/more-file-processing-with-erlang/</link>
	<description>Ask forgiveness, not permission.</description>
	<lastBuildDate>Mon, 15 Mar 2010 14:06:17 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Hypothetical Labs &#187; Worse Is Better Scaling</title>
		<link>http://steve.vinoski.net/blog/2007/09/29/more-file-processing-with-erlang/comment-page-1/#comment-116</link>
		<dc:creator>Hypothetical Labs &#187; Worse Is Better Scaling</dc:creator>
		<pubDate>Thu, 11 Oct 2007 14:54:19 +0000</pubDate>
		<guid isPermaLink="false">http://steve.vinoski.net/blog/2007/09/29/more-file-processing-with-erlang/#comment-116</guid>
		<description>[...] It&#8217;s spawned a couple of interesting threads on erlang-questions and several insightful blog [...]</description>
		<content:encoded><![CDATA[<p>[...] It&#8217;s spawned a couple of interesting threads on erlang-questions and several insightful blog [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Hynek (Pichi) Vychodil</title>
		<link>http://steve.vinoski.net/blog/2007/09/29/more-file-processing-with-erlang/comment-page-1/#comment-85</link>
		<dc:creator>Hynek (Pichi) Vychodil</dc:creator>
		<pubDate>Mon, 08 Oct 2007 06:38:02 +0000</pubDate>
		<guid isPermaLink="false">http://steve.vinoski.net/blog/2007/09/29/more-file-processing-with-erlang/#comment-85</guid>
		<description>Steve: why should bfile be faster than file?
http://pichis-blog.blogspot.com/2007/10/is-bfile-faster-than-old-erlang-file.html</description>
		<content:encoded><![CDATA[<p>Steve: why should bfile be faster than file?<br />
<a href="http://pichis-blog.blogspot.com/2007/10/is-bfile-faster-than-old-erlang-file.html" rel="nofollow">http://pichis-blog.blogspot.com/2007/10/is-bfile-faster-than-old-erlang-file.html</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dilip</title>
		<link>http://steve.vinoski.net/blog/2007/09/29/more-file-processing-with-erlang/comment-page-1/#comment-47</link>
		<dc:creator>Dilip</dc:creator>
		<pubDate>Thu, 04 Oct 2007 21:03:44 +0000</pubDate>
		<guid isPermaLink="false">http://steve.vinoski.net/blog/2007/09/29/more-file-processing-with-erlang/#comment-47</guid>
		<description>FWIW Joe Cheng has a C# 3.0 version[1] of this problem that seems to perform better than Ruby (even in terms of code brevity!).  Of course it still needs PLINQ to make use of multi core/CPU hardware.

[1] http://jcheng.wordpress.com/2007/10/02/wide-finder-with-linq/#more-240</description>
		<content:encoded><![CDATA[<p>FWIW Joe Cheng has a C# 3.0 version[1] of this problem that seems to perform better than Ruby (even in terms of code brevity!).  Of course it still needs PLINQ to make use of multi core/CPU hardware.</p>
<p>[1] <a href="http://jcheng.wordpress.com/2007/10/02/wide-finder-with-linq/#more-240" rel="nofollow">http://jcheng.wordpress.com/2007/10/02/wide-finder-with-linq/#more-240</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: steve</title>
		<link>http://steve.vinoski.net/blog/2007/09/29/more-file-processing-with-erlang/comment-page-1/#comment-36</link>
		<dc:creator>steve</dc:creator>
		<pubDate>Wed, 03 Oct 2007 21:09:22 +0000</pubDate>
		<guid isPermaLink="false">http://steve.vinoski.net/blog/2007/09/29/more-file-processing-with-erlang/#comment-36</guid>
		<description>Hi Pete, yes, MacBook Pro disk I/O throughput is best case about 45 MB/sec from some figures I&#039;ve seen, which I think would put us in the 4-5 sec range for reading this data, best case. If I run the Ruby solution on the full dataset, the first time it takes 7.5-8 secs, but the second time it takes 2.2-2.5 secs, which I believe shows the caching effects. But if we&#039;re getting cached data, then I think that reinforces my point about the Erlang code not being I/O bound.

Your second paragraph above has some very good insights, and your final paragraph is right on the money, IMO. I&#039;d be really interested in seeing your benchmark results (I assume they&#039;ll be on your website?), and needless to say Tim&#039;s T2 results should be quite interesting as well.</description>
		<content:encoded><![CDATA[<p>Hi Pete, yes, MacBook Pro disk I/O throughput is best case about 45 MB/sec from some figures I&#8217;ve seen, which I think would put us in the 4-5 sec range for reading this data, best case. If I run the Ruby solution on the full dataset, the first time it takes 7.5-8 secs, but the second time it takes 2.2-2.5 secs, which I believe shows the caching effects. But if we&#8217;re getting cached data, then I think that reinforces my point about the Erlang code not being I/O bound.</p>
<p>Your second paragraph above has some very good insights, and your final paragraph is right on the money, IMO. I&#8217;d be really interested in seeing your benchmark results (I assume they&#8217;ll be on your website?), and needless to say Tim&#8217;s T2 results should be quite interesting as well.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Pete Kirkham</title>
		<link>http://steve.vinoski.net/blog/2007/09/29/more-file-processing-with-erlang/comment-page-1/#comment-35</link>
		<dc:creator>Pete Kirkham</dc:creator>
		<pubDate>Wed, 03 Oct 2007 19:59:48 +0000</pubDate>
		<guid isPermaLink="false">http://steve.vinoski.net/blog/2007/09/29/more-file-processing-with-erlang/#comment-35</guid>
		<description>I do think on that particular measure you&#039;re reading from cache rather than physical disk; on my laptop it takes 8.5s if you clear the cache, 0.4s if you don&#039;t. Most laptop drives are around 30MBps; a bit more if 10,000 rpm. Flash drives are twice that, but excel at seek time.

However, that sort of number may actually be representative of the target system - if you assume a transfer rate in the 2GBps ballpark (Sun&#039;s Thumper gives 2GBps to memory), and the T2&#039;s 1.4GHz core, that gives only 0.7 CPU cycles (per hardware thread) to process each byte of data, so even the C++ code (which takes around 3 cycles per byte on my laptop) would be CPU limited rather than IO limited if you don&#039;t spread the load over the available hardware threads. There are more estimates at the link I put as my website.

If it&#039;s a language war then it&#039;s between Erlang and Ruby; I&#039;m trying to find out what the VM of either language should be doing to solve this problem, so am benchmarking to find where the costs are if you write close-to-the-metal code to solve it. I wouldn&#039;t write a log file extraction script in bit-twiddly C++; I&#039;d use Perl or XSLT. You really shouldn&#039;t have to write such code, but performing experiments to help think about where the bottlenecks may be is useful.


Pete</description>
		<content:encoded><![CDATA[<p>I do think on that particular measure you&#8217;re reading from cache rather than physical disk; on my laptop it takes 8.5s if you clear the cache, 0.4s if you don&#8217;t. Most laptop drives are around 30MBps; a bit more if 10,000 rpm. Flash drives are twice that, but excel at seek time.</p>
<p>However, that sort of number may actually be representative of the target system &#8211; if you assume a transfer rate in the 2GBps ballpark (Sun&#8217;s Thumper gives 2GBps to memory), and the T2&#8217;s 1.4GHz core, that gives only 0.7 CPU cycles (per hardware thread) to process each byte of data, so even the C++ code (which takes around 3 cycles per byte on my laptop) would be CPU limited rather than IO limited if you don&#8217;t spread the load over the available hardware threads. There are more estimates at the link I put as my website.</p>
<p>If it&#8217;s a language war then it&#8217;s between Erlang and Ruby; I&#8217;m trying to find out what the VM of either language should be doing to solve this problem, so am benchmarking to find where the costs are if you write close-to-the-metal code to solve it. I wouldn&#8217;t write a log file extraction script in bit-twiddly C++; I&#8217;d use Perl or XSLT. You really shouldn&#8217;t have to write such code, but performing experiments to help think about where the bottlenecks may be is useful.</p>
<p>Pete</p>
]]></content:encoded>
	</item>
</channel>
</rss>

