<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>BioTeam Inc.</title>
	<atom:link href="http://blog.bioteam.net/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.bioteam.net</link>
	<description>Latest news: publications, presentations, projects and training classes</description>
	<lastBuildDate>Thu, 29 Jul 2010 14:15:15 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Playing with NFS &amp; GlusterFS on Amazon cc1.4xlarge EC2 instance types</title>
		<link>http://blog.bioteam.net/2010/07/29/playing-with-nfs-glusterfs-on-amazon-cc1-4xlarge-ec2-instance-types/</link>
		<comments>http://blog.bioteam.net/2010/07/29/playing-with-nfs-glusterfs-on-amazon-cc1-4xlarge-ec2-instance-types/#comments</comments>
		<pubDate>Thu, 29 Jul 2010 14:07:42 +0000</pubDate>
		<dc:creator>chrisdag</dc:creator>
				<category><![CDATA[Employee Posts]]></category>
		<category><![CDATA[amazon]]></category>
		<category><![CDATA[amazon cloud]]></category>
		<category><![CDATA[aws]]></category>
		<category><![CDATA[cc1.4xlarge]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[ebs]]></category>
		<category><![CDATA[ebs performance]]></category>

		<guid isPermaLink="false">http://blog.bioteam.net/?p=664</guid>
		<description><![CDATA[Early single-client tests of shared ephemeral storage via NFS and parallel GlusterFS We here at BioTeam have been kicking tires and generally exploring around the edges of the new Amazon cc1.4xlarge &#8220;compute cluster&#8221; EC2 instance types. Much of our experimentation has been centered around simplistic benchmarking techniques as a way of slowly zeroing in on [...]]]></description>
			<content:encoded><![CDATA[<h3>Early single-client tests of shared ephemeral storage via NFS and parallel GlusterFS</h3>
<p>We here at BioTeam have been kicking tires and <a href="http://blog.bioteam.net/2010/07/19/exploring-the-new-aws-compute-cluster-ec2-instances/">generally exploring around the edges of the new Amazon cc1.4xlarge &#8220;compute cluster&#8221; EC2 instance types</a>. Much of our experimentation has been centered around simplistic benchmarking techniques as a way of slowly zeroing in on the methods, techniques and orchestration approaches most likely to have a significant usability, performance or <em>wallclock-time-to-scientific-results</em> outcome for the work we do professionally for ourselves and our clients.</p>
<p> </p>
<p><img style="display: block; margin-left: auto; margin-right: auto;" title="glusterFS-002.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/glusterFS-002.png" border="0" alt="glusterFS-002.png" width="500" height="203" /></p>
<p>We are asking very broad questions and testing assumptions along the lines of:</p>
<ul>
<li>Does the hot new 10 Gigabit non-blocking networking fabric backing up the new instance types really mean that &#8220;legacy&#8221; compute farm and HPC cluster architectures which make heavy use of network filesharing are now possible?
</li>
<li>How does filesharing between nodes look and feel on the new network and instance types?
</li>
<li>Are the speedy ephemeral disks on the new instance types suitable for bundling into NFS shares or aggregating into parallel or clustered distributed filesystems?
</li>
<li>Can we use the replication features in GlusterFS to mitigate some of the risks of using ephemeral disk for storage? 
</li>
<li>Should the shared storage built from ephemeral disk be assigned to &#8220;/scratch&#8221; or other non-critical duties due to the risks involved? What can we do to mitigate the risks?
</li>
<li>At what scale is NFS the easiest and most suitable sharing option? What are the best NFS server and client tuning parameters to use? 
</li>
<li>When using parallel or cluster filesystems like GlusterFS, what rough metrics can we use to figure out how many data servers to dedicate to a particular cluster size or workflow profile?</li>
</ul>
<p> </p>
<h4>GlusterFS &amp; NFS Initial Testing</h4>
<p>Over the past week we have been running tests on two types of network filesharing. We&#8217;ve only tested against a single client so obviously these results say nothing about at-scale performance or operation.</p>
<p>Types of tests:</p>
<ol>
<li>Take the pair of ~900GB ephemeral disks on the instance type, stripe them together as a RAID0 set, slap an XFS filesystem on top and export the entire volume out via NFS
</li>
<li>Take the pair of ~900GB ephemeral disks on the instance type, slap a single large partition on each drive, format each drive with an EXT3 filesystem and then use GlusterFS to create, mount and export the volume via the GlusterFS protocol</li>
</ol>
<p>For each of the above two test types we repeatedly ran (at least 4 times) our standard bonnie++ benchmark tests (methodology described in the earlier blog posts). The tests were run on a single remote client that was either NFS mounting or GlusterFS mounting the file share.</p>
<p><strong>GlusterFS parameters</strong></p>
<ul>
<li>None really. We used the standard volume creation command and mounted the file share via the GlusterFS protocol over TCP. Eventually we want to ask some of our GlusterFS expert friends for additional tuning guidance</li>
</ul>
<p><strong>NFS parameters:</strong></p>
<ul>
<li>Server export file:  &#8220;/nfs    &lt;host&gt;(rw,async)&#8221;</li>
<li>NFS Server config: boosted the number of nfsd daemons to 16 via edits to the /etc/sysconfig/nfs file</li>
<li>Client mount options:  &#8220;mount -t nfs -o rw,async,hard,intr,retrans=2,rsize=32768,wsize=32768,nfsvers=3,tcp &lt;host&gt;:/nfs /nfs-scratch&#8221;</li>
</ul>
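<p>For anyone scripting the client side, here is a minimal Python sketch that assembles the mount command from the options listed above. The function name and the example hostname are hypothetical; only the option string, the /nfs export and the /nfs-scratch mount point come from our test setup.</p>
<pre><code class="language-python">
# Client mount options exactly as listed in the post.
NFS_MOUNT_OPTIONS = [
    "rw", "async", "hard", "intr", "retrans=2",
    "rsize=32768", "wsize=32768", "nfsvers=3", "tcp",
]


def nfs_mount_command(host, export="/nfs", mount_point="/nfs-scratch"):
    """Return the mount command as an argv list (nothing is executed)."""
    return [
        "mount", "-t", "nfs",
        "-o", ",".join(NFS_MOUNT_OPTIONS),
        "{}:{}".format(host, export),
        mount_point,
    ]


if __name__ == "__main__":
    # "nfs-server.example" is a placeholder hostname, not a real machine.
    print(" ".join(nfs_mount_command("nfs-server.example")))
</code></pre>
<p>Returning an argv list rather than a single string makes it easy to hand the result to subprocess.run on the client when you actually want to mount the share (root required).</p>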
<p> </p>
<h3>Lessons Learned So Far - NFS vs GlusterFS</h3>
<ul>
<li>GlusterFS was incredibly easy to install and creating and exporting parallel filesystem shares was straightforward. The methods involved are easily scripted/automated or built into a server orchestration strategy. The process was so simple that initially we were thinking that GlusterFS would be our default sharing option for all our work on the new compute cluster instances
</li>
<li>GlusterFS has <strong><em>ONE HUGE DOWNSIDE</em></strong>. It turns out that GlusterFS recommends that the participating disk volumes be formatted with an ext3 filesystem for best results. This is &#8230; problematic &#8230; with the 900GB ephemeral disks because <strong><em>formatting a 900GB disk with ext3 takes damn near forever</em></strong>. We estimate about 15-20 minutes of wallclock time wasted while waiting for the &#8220;mkfs.ext3&#8221; command to complete.
</li>
<li>The wallclock time lost to formatting ext3 volumes for GlusterFS usage is significant enough to affect how we may or may not use GlusterFS in the future. Maybe there is a different filesystem we can use with a faster formatting profile. Using XFS and software RAID we can normally stand up and export filesystems in a matter of a few seconds or a minute or two. Sadly, XFS is not recommended at all with current versions of GlusterFS. </li>
<li>Using GlusterFS with the recommended ext3 configuration seems to mean that we have to accept a minimum delay of 15 minutes or even more when standing up and exporting new storage. This is unacceptable for small deployments or workflows where you might only be running the EC2 instances for a short time.
</li>
<li>The possibility of using the GlusterFS replication features to mitigate the risks of using ephemeral storage might be significant. We need to do more testing in this configuration.
</li>
<li>Given the extensive wallclock time delays inherent in waiting for ext3 filesystem formatting to complete in a GlusterFS scenario it seems likely that we might default to using a tuned NFS server setup for (a) small clusters &amp; compute farms or (b) systems that we plan to stand up only for a few hours.
</li>
<li>The overhead of provisioning GlusterFS becomes less significant when we have very large clusters that can benefit from the inherent scaling ability of GlusterFS or when we plan to stand up the clusters for longer periods of time</li>
</ul>
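<p>To put rough numbers on that trade-off, here is a small Python sketch of the arithmetic. It assumes the ext3 format time is the only provisioning cost and that it is paid once per cluster; the function name is hypothetical and the cluster lifetimes in the example are illustrative, not measured.</p>
<pre><code class="language-python">
def format_overhead_fraction(format_minutes, lifetime_hours):
    """Fraction of a cluster's total lifetime spent waiting on mkfs.

    lifetime_hours is the total wallclock time the cluster is up,
    including the format step itself.
    """
    return format_minutes / (lifetime_hours * 60.0)


if __name__ == "__main__":
    # 15 and 20 minutes are the ext3 format estimates from this post;
    # the lifetimes (2 hours, 72 hours) are made-up examples.
    for lifetime_hours in (2, 72):
        for format_minutes in (15, 20):
            share = format_overhead_fraction(format_minutes, lifetime_hours)
            print("{} min format, {} h lifetime: {:.1%} overhead".format(
                format_minutes, lifetime_hours, share))
</code></pre>
<p>With a 20 minute format, a cluster that only lives for 2 hours loses roughly 17% of its lifetime to mkfs, while one that runs for 3 days loses well under 1%.</p>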
<p> </p>
<h3>Benchmark Results</h3>
<p>In all the results shown below I&#8217;ve included data from a local 2-disk RAID0 ephemeral storage setup. This is so that the network filesharing data can be contrasted against the results seen from running bonnie++ locally.</p>
<p><em>Click on the images for a larger version</em>.</p>
<p><a href="http://blog.bioteam.net/wp-content/uploads/2010/07/glusterFS-004.png"><img style="display: block; margin-left: auto; margin-right: auto;" title="glusterFS-004.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/glusterFS-004.png" border="0" alt="glusterFS-004.png" width="600" /></a></p>
<p><a href="http://blog.bioteam.net/wp-content/uploads/2010/07/glusterFS-005.png"><img style="display: block; margin-left: auto; margin-right: auto;" title="glusterFS-005.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/glusterFS-005.png" border="0" alt="glusterFS-005.png" width="600" /></a></p>
<p> </p>
<p><a href="http://blog.bioteam.net/wp-content/uploads/2010/07/glusterFS-006.png"><img style="display: block; margin-left: auto; margin-right: auto;" title="glusterFS-006.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/glusterFS-006.png" border="0" alt="glusterFS-006.png" width="600" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.bioteam.net/2010/07/29/playing-with-nfs-glusterfs-on-amazon-cc1-4xlarge-ec2-instance-types/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Boot, ephemeral &amp; EBS storage performance on amazon cc1.4xlarge instance types</title>
		<link>http://blog.bioteam.net/2010/07/20/boot-ephemeral-ebs-storage-performance-on-amazon-cc1-4xlarge-instance-types/</link>
		<comments>http://blog.bioteam.net/2010/07/20/boot-ephemeral-ebs-storage-performance-on-amazon-cc1-4xlarge-instance-types/#comments</comments>
		<pubDate>Tue, 20 Jul 2010 22:51:51 +0000</pubDate>
		<dc:creator>chrisdag</dc:creator>
				<category><![CDATA[Employee Posts]]></category>
		<category><![CDATA[amazon]]></category>
		<category><![CDATA[amazon cloud]]></category>
		<category><![CDATA[aws]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[cluster compute]]></category>
		<category><![CDATA[ebs]]></category>

		<guid isPermaLink="false">http://blog.bioteam.net/?p=644</guid>
		<description><![CDATA[Backstory For background and summary writeups of all the various blog posts we have dealing with the new Amazon EC2 &#8220;compute cluster&#8221; cc1.4xlarge instance types please refer to this summary page: http://blog.bioteam.net/2010/07/19/exploring-the-new-aws-compute-cluster-ec2-instances/ Related post We talked about the performance of the boot and ephemeral storage in this post: http://blog.bioteam.net/2010/07/19/local-storage-performance-of-aws-cluster-compute-instances/ This post In this post I&#8217;ve finally collected [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Backstory</strong></p>
<p>For background and summary writeups of all the various blog posts we have dealing with the new Amazon EC2 &#8220;compute cluster&#8221; cc1.4xlarge instance types please refer to this summary page: <a href="http://blog.bioteam.net/2010/07/19/exploring-the-new-aws-compute-cluster-ec2-instances/">http://blog.bioteam.net/2010/07/19/exploring-the-new-aws-compute-cluster-ec2-instances/</a></p>
<p><strong>Related post</strong></p>
<p>We talked about the performance of the boot and ephemeral storage in this post: <a href="http://blog.bioteam.net/2010/07/19/local-storage-performance-of-aws-cluster-compute-instances/">http://blog.bioteam.net/2010/07/19/local-storage-performance-of-aws-cluster-compute-instances/</a></p>
<p><strong>This post</strong></p>
<p>In this post I&#8217;ve finally collected enough data to cover repeated bonnie++ benchmark tests against all the main types of block storage available to the new Amazon cc1.4xlarge &#8220;compute cluster&#8221; instance types:</p>
<ul>
<li>Local boot disk performance</li>
<li>Performance of a single ephemeral disk</li>
<li>Performance of the two available ephemeral disks when striped together with software RAID0</li>
<li>Performance of a single EBS volume attached to the instance</li>
<li>Performance of 4 EBS volumes striped with RAID0 and attached to the instance</li>
<li>Performance of 8 EBS volumes striped with RAID0 and attached to the instance</li>
</ul>
<p><strong>Lessons Learned</strong></p>
<p><strong>1. Don&#8217;t use the boot disk for anything other than booting the operating system</strong>. As the results show, the performance of the <em>(no PV driver support)</em> cc1.4xlarge boot disk <strong><em>is the slowest of all possible block storage options available</em></strong>. Really slow. Not worth using for anything other than OS stuff. This also includes <a href="http://blog.bioteam.net/2010/07/14/how-to-resize-an-amazon-ec2-ami-when-boot-disk-is-on-ebs/">not bothering to mess with the size of the available volume</a>.</p>
<p><strong>2. The instance ephemeral storage volumes are fast and should not be ignored</strong>. Every cc1.4xlarge EC2 instance comes with a pair of ~840GB ephemeral disk volumes. In living by our own &#8220;<em>don&#8217;t trust anything to non-persistent storage!</em>&#8221; best practice we are guilty of ignoring these drives in situations where they could have been of significant benefit. This will change. The performance of a single ephemeral volume beats the performance we see out of a single EBS volume. A striped pair of ephemeral volumes performs even better and stacks up well even against multiple EBS volumes striped together. The RAID0 pairing of the two ephemeral drives seems to consistently outperform even 8-drive EBS RAID0 volumes when you look at the bonnie++ results for random and sequential file creation and deletion tests. This has <strong><em>major implications</em></strong> for HPC and scientific pipeline processing on the cloud. In particular I can easily envision using the ephemeral drives to build a shared parallel scratch filesystem (<em>think PVFS or GlusterFS</em>) in cluster configurations. This would give you a nice shared scratch storage pool. Even in simpler cluster setups it looks like it would be a win to stage data into the ephemeral storage so it can be used as the target drive for scientific processing (where the input data is not unique and has backup copies elsewhere). We can run the IO-heavy analysis against the fast ephemeral storage and send our result data into S3 buckets or a proper EBS volume for downstream handling.</p>
<p><strong>3. Striping EBS volumes into software RAID0 sets is a valid practice</strong>. We clearly see performance gains when using more than one EBS volume, and the gain is significant enough to justify the hassles involved in backing up and protecting EBS-resident software RAID sets. We need to do more work (and really need to test 2-volume EBS stripes) but it is clear that there is a measurable performance gain to be had. Not sure if we&#8217;d use 8-disk RAID0 sets for production work, but 2-disk and 4-disk sets are something that we will be looking at seriously.</p>
<p>Obviously there is much more to be drawn from the data but benchmarking is hard (and controversial) in regular settings, let alone when trying to get repeatable and consistent results out of a virtualized multi-tenant cloud framework. For now I&#8217;d prefer to stick to broad general conclusions and &#8220;lessons learned&#8221; rather than trying to divine highly specific things out of the raw data.</p>
<p> </p>
<p><strong>Test Results</strong></p>
<p>As usual, you can find all of the raw data in <a href="https://spreadsheets.google.com/ccc?key=0AsrRXBRzWSxSdDdTZG9rZXRHUnQyU0sxak9aaGpJUlE&amp;hl=en&amp;authkey=CJmVloIK">this google spreadsheet</a>. We did not finesse the data at all. The only data munging we did was to run tests repeatedly and then average out the results in order to arrive at the numbers used in the graphs.</p>
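<p>The averaging step is nothing fancy. A minimal Python sketch, assuming each bonnie++ run has already been parsed into a dict of metric name to value (the function name, the metric names and the numbers below are placeholders, not our measurements):</p>
<pre><code class="language-python">
from statistics import mean


def average_runs(runs):
    """Average each metric across repeated benchmark runs.

    runs is a non-empty list of dicts that all share the same metric names.
    """
    metrics = runs[0].keys()
    return {name: mean(run[name] for run in runs) for name in metrics}


if __name__ == "__main__":
    # Placeholder values purely to show the shape of the data.
    example_runs = [
        {"block_write_k_per_sec": 100.0, "block_read_k_per_sec": 200.0},
        {"block_write_k_per_sec": 120.0, "block_read_k_per_sec": 220.0},
    ]
    print(average_runs(example_runs))
</code></pre>
<p>A plain mean is easy to reason about but hides run-to-run variance, which is one more reason to stick to broad conclusions when reading the graphs.</p>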
<p>Here are the numbers behind the graphs, click on the image for the full-size version:</p>
<p><a href="http://blog.bioteam.net/wp-content/uploads/2010/07/cc1.4xlarge-summary-001.png"><img style="display: block; margin-left: auto; margin-right: auto;" title="cc1.4xlarge-summary-001.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/cc1.4xlarge-summary-001.png" border="0" alt="cc1.4xlarge-summary-001.png" width="620" /></a></p>
<p>And here are the graphs. We&#8217;ve broken out the graphs to represent the results measured in &#8220;K/sec&#8221; versus just &#8220;/sec&#8221;.</p>
<p> </p>
<p><strong>Read, Write &amp; Rewrite results for all cc1.4xlarge storage types (click on image for full size):</strong></p>
<p><a href="http://blog.bioteam.net/wp-content/uploads/2010/07/cc1.4xlarge-summary-002.png"><img style="display: block; margin-left: auto; margin-right: auto;" title="cc1.4xlarge-summary-002.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/cc1.4xlarge-summary-002.png" border="0" alt="cc1.4xlarge-summary-002.png" width="620" /></a></p>
<p> </p>
<p><strong>Sequential create, delete &amp; seek results for all cc1.4xlarge storage types (click on image for full size):</strong></p>
<p> </p>
<p><a href="http://blog.bioteam.net/wp-content/uploads/2010/07/cc1.4xlarge-summary-004.png"><img style="display: block; margin-left: auto; margin-right: auto;" title="cc1.4xlarge-summary-004.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/cc1.4xlarge-summary-004.png" border="0" alt="cc1.4xlarge-summary-004.png" width="620" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.bioteam.net/2010/07/20/boot-ephemeral-ebs-storage-performance-on-amazon-cc1-4xlarge-instance-types/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Exploring the new AWS Compute Cluster EC2 Instances</title>
		<link>http://blog.bioteam.net/2010/07/19/exploring-the-new-aws-compute-cluster-ec2-instances/</link>
		<comments>http://blog.bioteam.net/2010/07/19/exploring-the-new-aws-compute-cluster-ec2-instances/#comments</comments>
		<pubDate>Mon, 19 Jul 2010 20:11:42 +0000</pubDate>
		<dc:creator>chrisdag</dc:creator>
				<category><![CDATA[Featured]]></category>
		<category><![CDATA[amazon]]></category>
		<category><![CDATA[amazon cloud]]></category>
		<category><![CDATA[aws]]></category>
		<category><![CDATA[cc1.4xlarge]]></category>
		<category><![CDATA[ec2]]></category>
		<category><![CDATA[roundup]]></category>

		<guid isPermaLink="false">http://blog.bioteam.net/?p=632</guid>
		<description><![CDATA[Note: Depending on how you found this post, it might be helpful to understand our own personal &#38; professional biases. We are bioinformatics and HPC types specializing in life sciences, not people trying to build the next twitter or facebook. What we care about when it comes to AWS performance may not be what YOU [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Note: </strong>Depending on how you found this post, it might be helpful to understand our own personal &amp; professional biases. We are bioinformatics and HPC types specializing in life sciences, not people trying to build the next twitter or facebook. What we care about when it comes to AWS performance may not be what YOU care about. In particular there is a ton of internet information concentrating on methods for speeding up random IO access patterns on AWS. In our work, however, we seem to be more bound by the speed of long sequential reads (and sometimes writes). Parallel and serial scientific/HPC computing is different from building giant websites or databases.</p>
<p>In our work with Amazon Web Services we try to spend as much time as we can &#8220;kicking the tires&#8221; so we become better at building stuff that we and our clients can actually use. We also try to share our information as much as possible in the spirit of scientific collaboration &amp; honest exchange.</p>
<p> </p>
<p>This is a &#8216;roundup&#8217; or summary blog post where we&#8217;ll list blog posts or articles that discuss the new Amazon cc1.4xlarge Compute Cluster instances.</p>
<ul>
<li><a href="http://blog.bioteam.net/2010/07/13/grid-engine-on-the-new-amazon-compute-cluster-instances/">Grid Engine on the new Amazon Compute Cluster Instances</a></li>
<li><a href="http://blog.bioteam.net/2010/07/14/how-to-resize-an-amazon-ec2-ami-when-boot-disk-is-on-ebs/">How to resize an EC2 AMI when the boot disk is on EBS</a></li>
<li><a href="http://blog.bioteam.net/2010/07/13/preliminary-ebs-performance-tests-on-amazon-compute-cluster-cc1-4xlarge-instance-types/">Preliminary EBS performance on Amazon Compute Cluster cc1.4xlarge instance types</a></li>
<li><a href="http://blog.bioteam.net/2010/07/19/local-storage-performance-of-aws-cluster-compute-instances/">Local storage performance of AWS cluster compute instances</a></li>
<li><a href="http://blog.bioteam.net/2010/07/20/boot-ephemeral-ebs-storage-performance-on-amazon-cc1-4xlarge-instance-types/">Combined summary of local, ephemeral &amp; EBS storage on cc1.4xlarge instance types</a></li>
<li><a href="http://blog.bioteam.net/2010/07/29/playing-with-nfs-glusterfs-on-amazon-cc1-4xlarge-ec2-instance-types/">NFS &amp; GlusterFS network filesharing on the cc1.4xlarge instance types</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://blog.bioteam.net/2010/07/19/exploring-the-new-aws-compute-cluster-ec2-instances/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Local storage performance of AWS cluster compute instances</title>
		<link>http://blog.bioteam.net/2010/07/19/local-storage-performance-of-aws-cluster-compute-instances/</link>
		<comments>http://blog.bioteam.net/2010/07/19/local-storage-performance-of-aws-cluster-compute-instances/#comments</comments>
		<pubDate>Mon, 19 Jul 2010 16:07:01 +0000</pubDate>
		<dc:creator>chrisdag</dc:creator>
				<category><![CDATA[Employee Posts]]></category>
		<category><![CDATA[amazon]]></category>
		<category><![CDATA[aws]]></category>
		<category><![CDATA[cc1.4xlarge]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[cluster compute]]></category>

		<guid isPermaLink="false">http://blog.bioteam.net/?p=607</guid>
		<description><![CDATA[Lots more data collected over the weekend as we were finally able to run bonnie++ against the local boot disk as well as single and striped versions of the ephemeral storage volumes that come along with every cc1.4xlarge instance type. Key Results: Performance of the root/boot disk is way slower than any other type of [...]]]></description>
			<content:encoded><![CDATA[<p>Lots more data collected over the weekend as we were finally able to run bonnie++ against the local boot disk as well as single and striped versions of the ephemeral storage volumes that come along with every cc1.4xlarge instance type.</p>
<p><strong>Key Results</strong>:</p>
<ul>
<li>Performance of the root/boot disk is way slower than any other type of block-based storage. This is to be expected as the boot disk (even though it comes via an EBS-resident AMI) does not get the benefit of paravirtualization acceleration. The take home message is that the boot/root disk volume should not really be used for anything other than the OS. This also means that this blog post showing how to increase the size of the local OS disk is useful only for playing around and not for anything serious
</li>
<li>The performance of the ephemeral disks is better and striping the two available drives together as a RAID0 volume has measurable benefits across the board</li>
</ul>
<p><strong>What this means in the real world:</strong></p>
<ol>
<li>Don&#8217;t use the boot/root disk for anything but the OS and don&#8217;t bother trying to expand its size
</li>
<li>It is reasonable to stripe the ephemeral storage together and use it for &#8220;real&#8221; work, especially as indications are that the speed may be faster than an EBS mounted volume.</li>
</ol>
<p>Other people have mentioned that this is worth doing even if one includes the time it takes to rsync or stage data into the ephemeral storage. Future BioTeam cluster building practices may use the ~800GB of ephemeral storage to service an NFS or parallel filesystem that offers input data to pipelines running on EC2 compute farms. Since we can&#8217;t trust ephemeral storage for anything unique we&#8217;d have a second shared filesystem (backed by EBS) to handle capturing pipeline results.</p>
<p>Obviously there is one other comparison to make &#8212; how do these performance numbers measure against the 1-disk, 4-disk and 8-disk EBS RAID0 stripesets that we&#8217;ve been testing all week?</p>
<p>That is a topic for the next blog posting &#8230;</p>
<p>Here are the results of our tests against local storage on a cc1.4xlarge instance. As usual <a href="https://spreadsheets.google.com/ccc?key=0AsrRXBRzWSxSdDdTZG9rZXRHUnQyU0sxak9aaGpJUlE&amp;hl=en&amp;authkey=CJmVloIK">the raw data is available in our public spreadsheet.</a></p>
<p>We ran tests multiple times and averaged the results. All file systems were XFS.</p>
<p>Summarized/averaged values used to generate the graphs:</p>
<p><a href="http://blog.bioteam.net/wp-content/uploads/2010/07/hpcLocaldisk-0011.png"><img style="display: block; margin-left: auto; margin-right: auto;" title="hpcLocaldisk-001.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/hpcLocaldisk-0011.png" border="0" alt="hpcLocaldisk-001.png" width="620" /></a></p>
<p>Here is the graph showing block and character based read &amp; write tests. We did not capture character-based test data for the local root disk because it was so slow already.</p>
<p><a href="http://blog.bioteam.net/wp-content/uploads/2010/07/hpcLocaldisk-002.png"><img style="display: block; margin-left: auto; margin-right: auto;" title="hpcLocaldisk-002.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/hpcLocaldisk-002.png" border="0" alt="hpcLocaldisk-002.png" width="620" /></a></p>
<p> </p>
<p>And here is a graph of the bonnie++ tests that deal with Seeks and Sequential/Random file creation &amp; deletion:</p>
<p><a href="http://blog.bioteam.net/wp-content/uploads/2010/07/hpcLocaldisk-004.png"><img style="display: block; margin-left: auto; margin-right: auto;" title="hpcLocaldisk-004.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/hpcLocaldisk-004.png" border="0" alt="hpcLocaldisk-004.png" width="620" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.bioteam.net/2010/07/19/local-storage-performance-of-aws-cluster-compute-instances/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How to resize an Amazon EC2 AMI when boot disk is on EBS</title>
		<link>http://blog.bioteam.net/2010/07/14/how-to-resize-an-amazon-ec2-ami-when-boot-disk-is-on-ebs/</link>
		<comments>http://blog.bioteam.net/2010/07/14/how-to-resize-an-amazon-ec2-ami-when-boot-disk-is-on-ebs/#comments</comments>
		<pubDate>Wed, 14 Jul 2010 22:12:12 +0000</pubDate>
		<dc:creator>chrisdag</dc:creator>
				<category><![CDATA[Employee Posts]]></category>
		<category><![CDATA[tech notes]]></category>
		<category><![CDATA[amazon]]></category>
		<category><![CDATA[aws]]></category>
		<category><![CDATA[ebs]]></category>

		<guid isPermaLink="false">http://blog.bioteam.net/?p=565</guid>
		<description><![CDATA[Note Screwing around with the boot volume is part of our regular &#8220;explore around the edges&#8221; work before we get serious with how we are going to configure and orchestrate the new systems. The boot volume in this scenario does not have PV driver support and thus will perform slower than the actual ephemeral storage. [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Note</strong></p>
<p><em>Screwing around with the boot volume is part of our regular &#8220;explore around the edges&#8221; work before we get serious with how we are going to configure and orchestrate the new systems. The boot volume in this scenario does not have PV driver support and thus will perform slower than the actual ephemeral storage. Our need was for the boot volume to be big enough to hammer with bonnie++ &#8211; this is not something we&#8217;d do in a production scenario</em>.</p>
<p><strong>Background</strong></p>
<p><span style="font-family: 'Lucida Grande';">All of the cloud nerds at BioTeam are thrilled now that the Amazon Compute Cluster nodes have been publicly launched. If you missed the exciting news please visit the announcement post over at the AWS blog - <a href="http://aws.typepad.com/aws/2010/07/the-new-amazon-ec2-instance-type-the-cluster-compute-instance.html">http://aws.typepad.com/aws/2010/07/the-new-amazon-ec2-instance-type-the-cluster-compute-instance.html</a></span></p>
<p><span style="font-family: 'Lucida Grande';">We&#8217;ve been madly banging on the new instance types and trying to (initially) perform some basic low level benchmarks before we go on to the cooler benchmarks where we run actual life science &amp; informatics pipelines against the hot new gear. Using our <a href="http://www.opscode.com/chef/">Chef Server</a> it&#8217;s trivial for us to orchestrate these new systems into working HPC clusters in just a matter of minutes. We plan to start blogging and demoing live deployment of elastic genome assembly pipelines and NextGen DNA sequencing instrument pipelines (like the <a href="http://www.illumina.com/">Illumina</a> software) on AWS soon. </span></p>
<p><span style="font-family: 'Lucida Grande';">Like <a href="http://perspectives.mvdirona.com/2010/07/13/HighPerformanceComputingHitsTheCloud.aspx">James Hamilton says</a>, the real value in these new EC2 server types lies in the non-blocking 10Gigabit ethernet network backing them up. All of a sudden our &#8220;legacy&#8221; cluster and compute farm practices that involved network filesharing among nodes via NFS, GlusterFS, Lustre and GPFS seem actually feasible rather than a sick masochistic exercise in cloud futility. </span></p>
<p><span style="font-family: 'Lucida Grande';">We expect to see quite a bit of news in the near future about people using NFS, pNFS and other parallel/cluster filesystems for HPC data sharing on AWS &#8211; seems like a no-brainer now that we have full bisection 10GbE bandwidth between our Compute Cluster cc1.4xlarge instance types. </span></p>
<p><span style="font-family: 'Lucida Grande';">However, despite the fact that the coolest stuff is going to involve what we can do now<strong><em> node-to-node</em></strong> over the fast new networking fabric there is still value in doing the low-level &#8220;what does this new environment look and feel like?&#8221; tests involving EBS disk volumes, S3 access and the like. </span></p>
<p><span style="font-family: 'Lucida Grande';">The Amazon AWS team did a great job preparing the way for people who want to quickly experiment with the new HPC instance types. The EC2 AMI images have to boot off of EBS and run under HVM virtualization rather than the standard paravirtualization used on the other instance types.</span></p>
<p><span style="font-family: 'Lucida Grande';">Recognizing that bootstrapping a HVM-aware EBS-booting EC2 server instance is a non-trivial exercise, AWS created a QuickStart public AMI with CentOS Linux that anyone can use right away:</span></p>
<p><span style="font-family: 'Lucida Grande';"><img style="display: block; margin-left: auto; margin-right: auto;" title="amiResize-001.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/amiResize-0011.png" border="0" alt="amiResize-001.png" width="550" height="304" /></span></p>
<p><span style="font-family: 'Lucida Grande';"><strong>The Problem &#8211; how to grow the default 20GB system disk in the QuickStart AMI? </strong></span></p>
<p><span style="font-family: 'Lucida Grande';">As beautiful as the CentOS HVM AMI is, it only gives us 20GB of disk space in its native form. This is perfectly fine for most of our use cases but presented a problem when we decided we wanted to use <a href="http://www.coker.com.au/bonnie++/">bonnie++</a> to perform some <a href="http://blog.bioteam.net/2010/07/13/preliminary-ebs-performance-tests-on-amazon-compute-cluster-cc1-4xlarge-instance-types/">disk IO benchmarks</a> on the local boot volume to complement our tests against more traditional mounted EBS volumes and RAID0 stripe sets.</span></p>
<p><span style="font-family: 'Lucida Grande';">Bonnie++ <strong><em>really</em></strong> wants to work against a filesystem that is at least twice the size of available RAM so as to mitigate any memory-related caching issues when testing actual IO performance. </span></p>
<p><span style="font-family: 'Lucida Grande';">The cc1.4xlarge &#8220;Cluster Compute&#8221; instance comes with ~23GB of physical RAM. Thus our problem &#8212; we wanted to run bonnie++ against the local system disk but the disk is actually smaller than the amount of RAM available to the instance!</span></p>
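<p>The sizing problem is simple arithmetic. Here is a minimal sketch of it, assuming the approximate 23GB RAM figure and the "twice the size of RAM" rule of thumb mentioned above; the variable names are just for illustration:</p>

```shell
# Rule of thumb from the text: the bonnie++ target filesystem should be
# at least twice the size of RAM.
ram_gb=23     # approximate RAM on a cc1.4xlarge instance
disk_gb=20    # default boot volume size of the QuickStart AMI

min_gb=$((ram_gb * 2))
echo "minimum filesystem size for bonnie++: ${min_gb}GB"

if [ "$disk_gb" -lt "$min_gb" ]; then
    echo "boot volume (${disk_gb}GB) is too small for the test"
fi
```

<p>That works out to roughly 46GB, which is why a boot volume of at least 50GB was the target.</p>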
<p><span style="font-family: 'Lucida Grande';">For this one particular IO test we really wanted a HVM-compatible AMI that had at least 50GB of </span><span style="font-family: 'Lucida Grande';">storage on the boot volume. </span></p>
<p><span style="font-family: 'Lucida Grande';"><strong>The Solution</strong></span></p>
<p><span style="font-family: 'Lucida Grande';">I was shocked and amazed to find that in about 20 minutes of screwing around with EBS snapshot sizes, instance disk partitions and LVM settings I was able to achieve the goal of converting the Amazon QuickStart 20GB AMI into a custom version where the system disk was 80GB in size. </span></p>
<p><span style="font-family: 'Lucida Grande';">The fact that this was possible and achievable out of the box without having to debug mysterious boot failures, kernel panics and all the other sorts of things I&#8217;m used to dealing with when messing with low level disk and partition issues is the ultimate testament to both Amazon&#8217;s engineering prowess (<em>how cool is it that we can launch EBS snapshots of arbitrary size</em>?) as well as the current excellent state of Linux, Grub and LVM2. </span></p>
<p><span style="font-family: 'Lucida Grande';">I took a bunch of rough notes so I&#8217;d remember how the heck I managed to do this. Then I decided to clean up the notes and really document the process in case it might help someone else. </span></p>
<p><span style="font-family: 'Lucida Grande';"><strong>The Process</strong></span></p>
<p><span style="font-family: 'Lucida Grande';">I will try to walk through step-by-step the commands and methods used to increase the system boot disk from 20GB in size to 80GB in size. </span></p>
<p><span style="font-family: 'Lucida Grande';">It boils down to two main steps:</span></p>
<ul>
<li><span style="font-family: 'Lucida Grande';">Launch the Amazon QuickStart AMI but override &amp; increase the default 20GB boot disk size</span></li>
<li><span style="font-family: 'Lucida Grande';">Get the CentOS Linux OS to recognize that it&#8217;s now running on a bigger disk</span></li>
</ul>
<p><span style="font-family: 'Lucida Grande';">You can&#8217;t do the following step using the <a href="https://console.aws.amazon.com/">AWS Web Management Console</a> as the webUI does not let you alter the parameters of the block device settings. You will need the <a href="http://developer.amazonwebservices.com/connect/entry.jspa?externalID=351">command-line EC2 utilities</a> installed and working in order to proceed.</span></p>
<p>On the command line we can easily tell Amazon that we want to start the QuickStart AMI but instead of launching it within a 20GB snapshot of the EBS boot volume we will launch it against a much larger snapshot.</p>
<p>If you look at the info for AMI ami-7ea24a17 within the web page you will see this under the details for the block devices that will be available to the system at boot:</p>
<blockquote><p><span class="label" style="width: 130px;"><span id="images_main_block_devices" class="console-tooltip">Block Devices: </span></span><span class="value">/dev/sda1=snap-1099e578:20:true</span></p></blockquote>
<p>That is basically saying that Linux device &#8220;/dev/sda1&#8221; will be built from EBS snapshot volume &#8220;snap-1099e578&#8221;. The next &#8220;:&#8221;-delimited parameter sets the size to 20GB.</p>
<p>We are going to change that from 20GB to 80GB.</p>
<p>Here is the command to launch that AMI using a disk of 80GB in size instead of the default 20GB.</p>
<p>In the following screenshot, note how we are starting the Amazon AMI ami-7ea24a17 with a block device (the &#8220;-b&#8221; switch) that is bootstrapping itself from the snapshot &#8220;snap-1099e578&#8221;.</p>
<p>All we needed to do in order to make the EC2 server have a larger boot disk is pass in &#8220;80&#8221; to override the default value of 20GB. Confused? Look at the &#8220;-b&#8221; block device argument below: the 80GB is set right after the snapshot name.</p>
<p> </p>
<div class="triple_wide_data" style="width: 97%;"><img style="display: block; margin-left: auto; margin-right: auto;" title="amiResize-002.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/amiResize-002.png" border="0" alt="amiResize-002.png" width="553" height="281" /></div>
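<p>For anyone who cannot read the screenshot, here is a rough text sketch of how such a launch command is put together. The AMI ID, snapshot ID and mapping format come from the details above; the keypair name is a made-up placeholder, and the script only prints the command rather than running it:</p>

```shell
# Build the block device mapping that overrides the default 20GB size.
ami="ami-7ea24a17"
snapshot="snap-1099e578"
size_gb=80                      # was 20 in the AMI's default mapping
mapping="/dev/sda1=${snapshot}:${size_gb}:true"

keypair="my-keypair"            # hypothetical keypair name, use your own

# Assemble the command for the EC2 command-line tools (printed, not executed).
launch_cmd="ec2-run-instances ${ami} -t cc1.4xlarge -k ${keypair} -b ${mapping}"
echo "$launch_cmd"
```

<p>The only change from the AMI default is the middle field of the mapping, 80 instead of 20.</p>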
<p>Your EBS volume might take a bit longer than normal to boot up but once it is online and available you can login normally.</p>
<p>Of course, the system will appear to have the default 20GB system disk:</p>
<p><img style="display: block; margin-left: auto; margin-right: auto;" title="amiResize-003.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/amiResize-003.png" border="0" alt="amiResize-003.png" width="500" height="346" /></p>
<p>Even the LVM2 physical disk reports show the ~20GB settings:</p>
<p><img style="display: block; margin-left: auto; margin-right: auto;" title="amiResize-004.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/amiResize-004.png" border="0" alt="amiResize-004.png" width="500" height="174" /></p>
<p>However, if we actually use the &#8216;fdisk&#8217; command to examine the disk we see that the block device is, indeed, much larger than the 20GB the Operating System thinks it has to utilize:</p>
<p><img style="display: block; margin-left: auto; margin-right: auto;" title="amiResize-005.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/amiResize-005.png" border="0" alt="amiResize-005.png" width="500" height="288" /></p>
<p>Fdisk tells us that disk device /dev/hda has 85.8 GB of physical capacity.</p>
<p>Now we need to teach the OS to make use of that space!</p>
<p>There are only two partitions on this disk. /dev/hda1 is the 100MB /boot partition common to RedHat variants. Let&#8217;s leave that alone.</p>
<p>The second partition, /dev/hda2 is already set up for logical volumes under LVM. We are going to be lazy. We are just going to use &#8216;fdisk&#8217; to delete the /dev/hda2 partition so that we can immediately recreate it so that it spans the full remaining space on the physical drive.</p>
<p>After typing &#8220;fdisk /dev/hda&#8221; we type &#8220;d&#8221; and delete partition &#8220;2&#8221;. Then we type &#8220;n&#8221; for a new partition of type &#8220;p&#8221; for primary and &#8220;2&#8221; to name it as the second partition. After that we just hit return to accept the default suggestions for the beginning and end of the recreated second partition.</p>
<p>If it all worked, we can type the &#8220;p&#8221; command to print the new partition table out.</p>
<p>Note how /dev/hda2 now has <strong><em>many</em></strong> more blocks? Cool!</p>
<p><img style="display: block; margin-left: auto; margin-right: auto;" title="amiResize-006.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/amiResize-0062.png" border="0" alt="amiResize-006.png" width="500" height="288" /></p>
<p>We are not done yet. None of our partition changes have actually been written to disk yet. We still need to type &#8220;w&#8221; to write the new partition table to disk, which also exits fdisk.</p>
<p>Obviously we can&#8217;t make live changes on a running boot disk. The new partition settings will come into effect after a system reboot.</p>
<p><img style="display: block; margin-left: auto; margin-right: auto;" title="amiResize-007.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/amiResize-007.png" border="0" alt="amiResize-007.png" width="500" height="288" /></p>
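<p>The interactive fdisk session above boils down to a short keystroke sequence. This sketch just prints that sequence so it can be reviewed; the function name is made up for illustration, and nothing here touches a disk:</p>

```shell
# Print the fdisk keystrokes described above, one per line.
fdisk_keystrokes() {
    # d, 2     : delete partition 2
    # n, p, 2  : new primary partition, numbered 2
    # two empty lines accept the default start and end of the partition
    # p        : print the new table for review
    # w        : write the table to disk and exit
    printf 'd\n2\nn\np\n2\n\n\np\nw\n'
}

fdisk_keystrokes
```

<p>In principle the output could be piped into &#8220;fdisk /dev/hda&#8221;, but that is destructive and assumes the same two-partition layout as the QuickStart AMI, so typing the commands by hand is the safer route.</p>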
<p>Now we reboot the system and wait for it to come back up.</p>
<p>When it comes back up, don&#8217;t be alarmed that both &#8216;df&#8217; and &#8216;pvscan&#8217; still show the incorrect size:</p>
<p><img style="display: block; margin-left: auto; margin-right: auto;" title="amiResize-008.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/amiResize-008.png" border="0" alt="amiResize-008.png" width="500" height="255" /></p>
<p>We can fix that! Now we are in the realm of LVM so we need to use the &#8220;pvresize &lt;device&gt;&#8221; command to rescan the physical disk. Since our LVM2 partition is still /dev/hda2 that is the physical device path we give it:</p>
<p><img style="display: block; margin-left: auto; margin-right: auto;" title="amiResize-009.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/amiResize-009.png" border="0" alt="amiResize-009.png" width="500" height="223" /></p>
<p>Success! LVM recognizes that the drive is larger than 20GB.</p>
<p>With LVM aware that the disk is larger we are pretty much done. We can resize an existing logical volume or add a new one to the default Volume Group (&#8220;VolGroup00&#8221;).</p>
<p>Since I&#8217;m lazy AND I want to mount the extra space away from the root (&#8220;/&#8221;) volume I chose to create a new logical volume that shares the same /dev/hda2 physical volume (&#8220;PV&#8221;) and Volume Group (&#8220;VG&#8221;).</p>
<p>We are going to use the command &#8220;lvcreate VolGroup00 --size 60G /dev/hda2&#8221; to make a new 60GB logical volume that is part of the existing Volume Group named &#8220;VolGroup00&#8221;:</p>
<p><img style="display: block; margin-left: auto; margin-right: auto;" title="amiResize-010.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/amiResize-0102.png" border="0" alt="amiResize-010.png" width="500" height="223" /></p>
<p>Success. Note that our new logical volume got assigned a default name of &#8220;lvol0&#8221; and it now exists in the LVM device path of &#8220;/dev/mapper/VolGroup00-lvol0&#8221;.</p>
<p>Now we need to place a Linux filesystem on our new 60GB of additional space and mount it up. Since I am a fan of XFS on EC2 I need to first install the &#8220;xfsprogs&#8221; RPM and then format the volume. A simple &#8220;yum -y install xfsprogs&#8221; does the trick and now I can make XFS filesystems on my server:</p>
<p><img style="display: block; margin-left: auto; margin-right: auto;" title="amiResize-011.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/amiResize-011.png" border="0" alt="amiResize-011.png" width="500" height="248" /></p>
<p>Success. We now have 60GB more space, visible to the OS and formatted with a filesystem. The final step is to mount it.</p>
<p><img style="display: block; margin-left: auto; margin-right: auto;" title="amiResize-012.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/amiResize-0121.png" border="0" alt="amiResize-012.png" width="500" height="276" /></p>
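<p>Since most of the LVM and filesystem steps above only appear in screenshots, here they are collected into one dry-run sketch. The device, volume group and logical volume names come from the walkthrough; the mount point &#8220;/data&#8221; is a hypothetical choice, and the &#8220;run&#8221; helper only prints each command instead of executing it:</p>

```shell
# Dry-run helper: print the command instead of executing it.
run() { echo "+ $*"; }

pv="/dev/hda2"                          # LVM physical volume (recreated partition)
vg="VolGroup00"                         # default volume group in the AMI
lv_path="/dev/mapper/${vg}-lvol0"       # default name given to the new volume
mount_point="/data"                     # hypothetical mount point

run pvresize "$pv"                      # let LVM see the larger partition
run lvcreate "$vg" --size 60G "$pv"     # carve out a new 60GB logical volume
run yum -y install xfsprogs             # XFS tools are not installed by default
run mkfs.xfs "$lv_path"                 # put an XFS filesystem on it
run mkdir -p "$mount_point"
run mount -t xfs "$lv_path" "$mount_point"
```

<p>To actually apply the steps you would run each printed command as root, checking the output of each one before moving on.</p>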
<p> </p>
<p>And we are done. We&#8217;ve successfully converted the 20GB Amazon QuickStart AMI into a version with a much larger boot volume<span style="font-family: 'Lucida Grande';">.</span></p>
<p><strong>Conclusion</strong></p>
<p>None of this is rocket science. It&#8217;s actually just Linux Systems Administration 101.</p>
<p>The real magic here is how easily this is all accomplished on our virtual cloud system using nothing but a web browser and some command-line utilities.</p>
<p>What makes this process special for me is how quick and easy it was &#8211; anyone who has spent any significant amount of time managing many physical Linux server systems knows the pain and hours lost when trying to do this stuff in the real world on real (and flaky) hardware.</p>
<p>I can&#8217;t even count how many hours of my life I&#8217;ve lost trying to debug Grub bootloader failures, mysterious kernel panics and other hard-to-troubleshoot booting and disk resizing efforts on production and development server systems when I&#8217;ve altered settings that we&#8217;ve covered in this post. In cluster environments we often have to do this debugging via a 9600 baud serial console or via flaky IPMI consoles. It&#8217;s just nasty.</p>
<p>The fact that this method worked so quickly and so smoothly is probably only amazing to people who know the real pain of having done this in the field, crouched on the floor of a freezing cold datacenter and trying not to pull your hair out as text scrolls slowly by at 9600 baud.</p>
<p>Congrats to the Amazon AWS team. Fantastic work. It&#8217;s a real win when virtual infrastructure is this easy to manipulate.</p>
<p> </p>
<p> </p>
<p> </p>
]]></content:encoded>
			<wfw:commentRss>http://blog.bioteam.net/2010/07/14/how-to-resize-an-amazon-ec2-ami-when-boot-disk-is-on-ebs/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Preliminary EBS performance on Amazon Compute Cluster cc1.4xlarge instance types</title>
		<link>http://blog.bioteam.net/2010/07/13/preliminary-ebs-performance-tests-on-amazon-compute-cluster-cc1-4xlarge-instance-types/</link>
		<comments>http://blog.bioteam.net/2010/07/13/preliminary-ebs-performance-tests-on-amazon-compute-cluster-cc1-4xlarge-instance-types/#comments</comments>
		<pubDate>Wed, 14 Jul 2010 01:31:18 +0000</pubDate>
		<dc:creator>chrisdag</dc:creator>
				<category><![CDATA[Employee Posts]]></category>
		<category><![CDATA[amazon]]></category>
		<category><![CDATA[aws]]></category>
		<category><![CDATA[cc1.4xlarge]]></category>
		<category><![CDATA[compute cluster]]></category>
		<category><![CDATA[ebs]]></category>
		<category><![CDATA[ebs performance]]></category>
		<category><![CDATA[ec2]]></category>

		<guid isPermaLink="false">http://blog.bioteam.net/?p=550</guid>
		<description><![CDATA[Post Update History: July 13th &#8211; Original post July 14th &#8211; More results from cc1.4xlarge single-disk &#38; initial results from c1.xlarge instance type, uploaded new version of the raw data spreadsheet to Google Docs. Updated all graphs. July 19th &#8211; Lots more data (including ephemeral storage) added to the raw data spreadsheet on Google Docs [...]]]></description>
			<content:encoded><![CDATA[<p><em>Post Update History:</em></p>
<ul>
<li>July 13th &#8211; Original post</li>
<li>July 14th &#8211; More results from cc1.4xlarge single-disk &amp; initial results from c1.xlarge instance type, uploaded new version of the raw data spreadsheet to Google Docs. Updated all graphs. </li>
<li>July 19th &#8211; Lots more data (including ephemeral storage) added to the <a href="https://spreadsheets.google.com/ccc?key=0AsrRXBRzWSxSdDdTZG9rZXRHUnQyU0sxak9aaGpJUlE&amp;hl=en&amp;authkey=CJmVloIK">raw data spreadsheet on Google </a><a href="https://spreadsheets.google.com/ccc?key=0AsrRXBRzWSxSdEpOclJjX1ZsVkNFSVJlZWUyR0FKWXc&amp;hl=en&amp;authkey=CJ_IkYMB">Docs</a></li>
<li>July 20th &#8211; Added a new blog post <a href="http://blog.bioteam.net/?p=644">specifically talking about local, ephemeral and EBS performance</a> on cc1.4xlarge instances</li>
</ul>
<p><strong>Background</strong></p>
<p>Now that Amazon Web Services has <a href="http://aws.typepad.com/aws/2010/07/the-new-amazon-ec2-instance-type-the-cluster-compute-instance.html">opened their new &#8220;Compute Cluster&#8221; cc1.4xlarge instance types to the public</a> we&#8217;ve spent the day running bonnie++ disk performance benchmarks against single and RAID0 striped EBS volumes.</p>
<p>This is because we are life science types who do lots of high performance computing and cluster building. The single biggest performance bottleneck for people who want to do biology &#8220;in the cloud&#8221; is the generally poor performance of disk IO and storage in general. We tend to be more bottlenecked by the speed of disk than the speed of CPU in many common informatics and genomics applications.</p>
<p>Following in the footsteps of many others before us (<a href="http://af-design.com/blog/2010/03/02/honesty-box-ebs-performance-revisited/">example 1</a>, <a href="http://orion.heroku.com/past/2009/7/29/io_performance_on_ebs/">example 2</a>) we have learned that we can tease additional performance out of Amazon EBS disks by striping together multiple drives into a software RAID0 set.</p>
<p>There is a whole body of experimentation going on right now trying to find the &#8220;optimal&#8221; combination of:</p>
<ul>
<li>EC2 instance type</li>
<li>EBS volume size</li>
<li># of EBS volumes</li>
<li>Which filesystem to put on the software RAID set</li>
<li>What software RAID settings to use when creating the RAID set</li>
<li>What linux IO scheduler to use</li>
<li>What volume mount options to use</li>
<li>What other tweaks/parameters for increasing performance</li>
</ul>
<p>Nobody has really discovered the &#8220;ultimate&#8221; solution and things are further complicated by the fact that performance &#8220;on the cloud&#8221; can vary minute by minute, hour by hour and day by day. It&#8217;s extremely difficult to get any sort of reliably repeatable data from cloud systems.</p>
<p>I also <strong><em>hate</em></strong> benchmarking because</p>
<ul>
<li>Performance &#8220;in the cloud&#8221; is insanely variable for reasons that are invisible to mortals</li>
<li> Nobody is ever satisfied with the results</li>
<li>It&#8217;s a lot of work, and even harder to do it reasonably correctly</li>
<li>Everybody has different needs, demands and requirements</li>
</ul>
<p><strong>Goals</strong></p>
<p>At this point all we really want to see is what the effect of having the new non-blocking 10 Gigabit Ethernet network operating behind the new EC2 &#8220;Cluster Compute&#8221; instance types does for performance on EBS volumes.</p>
<p>Obviously there are a lot of possibilities for non-oversubscribed 10GbE networking for people used to clusters and compute farms. We are also going to test node-to-node file transfers and even vanilla NFS between systems to see if it is now sensible to actually orchestrate actual Platform LSF, PBS and Grid Engine managed clusters on AWS.</p>
<p><strong>Methodology</strong></p>
<p>We chose a 160GB disk as our target size and made the following EBS volumes:</p>
<ul>
<li>Single 160GB EBS volume</li>
<li>Four 40GB EBS volumes (to be striped at RAID0 into 160GB disk)</li>
<li>Eight 20GB EBS volumes (to be striped at RAID0 into 160GB disk)</li>
</ul>
<p>Basically we wanted to run bonnie++ multiple times against 160GB single-disk, four-disk and eight-disk XFS volumes using different Linux IO schedulers to see what would happen.</p>
<p><strong>Filesystem</strong>. We chose XFS as the Linux filesystem to use, based largely on the work of others in this area. The filesystem was created with the standard &#8220;mkfs.xfs &lt;device&gt;&#8221; command. Nothing special.</p>
<p><strong>Software RAID</strong>. Except for choosing a 256KB chunk size (&#8220;--chunk=256&#8221;), we did nothing special with the creation of the /dev/md0 RAID0 device. An example command for our 8-disk stripe set would look like <em>&#8220;mdadm --create --verbose --level=0 --chunk=256 --raid-devices=8 /dev/md0 /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm&#8221;</em></p>
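<p>The same mdadm command shape covers the other stripe sets as well. A small sketch, assuming the devices are passed in as arguments (the helper name is made up, and the four-disk device names are an assumption, not taken from our test notes); it only prints the command:</p>

```shell
# Print an mdadm RAID0 create command for whatever devices are given.
build_mdadm_cmd() {
    count=$#    # number of member devices
    echo "mdadm --create --verbose --level=0 --chunk=256 --raid-devices=${count} /dev/md0 $*"
}

# Example: a four-disk stripe set (device names assumed for illustration).
build_mdadm_cmd /dev/sdf /dev/sdg /dev/sdh /dev/sdi
```

<p>Deriving the &#8220;--raid-devices&#8221; count from the argument list avoids the easy mistake of listing eight devices while telling mdadm to expect four.</p>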
<p><strong>Volume mounting</strong>. XFS volumes were mounted with the options <em>noatime,nodiratime,logbufs=8</em>, for example: <em>&#8220;mount -t xfs -o noatime,nodiratime,logbufs=8 /dev/md0 /eightdisk&#8221;</em></p>
<p><strong>Linux blockdev ra attribute</strong>. Following in the footsteps of others who found that the Linux &#8220;readahead&#8221; value was perhaps set too conservatively (especially on RedHat variants), we increased the readahead value for every disk via the command <em>&#8220;blockdev --setra 65536 &lt;device&gt;&#8221;</em></p>
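<p>To put that number in context: blockdev expresses readahead in 512-byte sectors, so the value converts to a size like this (a quick sketch of the arithmetic only):</p>

```shell
# blockdev --setra takes a count of 512-byte sectors.
sectors=65536
sector_bytes=512

readahead_mib=$((sectors * sector_bytes / 1024 / 1024))
echo "readahead: ${readahead_mib} MiB"
```

<p>So each device was asked to read ahead 32 MiB at a time.</p>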
<p><strong>Linux IO Scheduler</strong>. Various people have reported that the Linux IO scheduler <strong><em>matters</em></strong>. In our tests we wanted to see performance under different schedulers. We tested against the &#8220;noop&#8221;, &#8220;deadline&#8221; and &#8220;cfq&#8221; schedulers by altering the contents of &#8220;<em>/sys/block/&lt;device&gt;/queue/scheduler</em>&#8220;.</p>
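<p>The scheduler file lists every available scheduler and marks the active one with square brackets. A minimal sketch of pulling out the active name, using a sample string in place of a real device file (the function name is made up for illustration):</p>

```shell
# Extract the active scheduler from text like "noop deadline [cfq]".
active_scheduler() {
    echo "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# Sample contents of a scheduler file, not read from a real device here.
active_scheduler "noop anticipatory deadline [cfq]"
```

<p>On a live system the input would come from the scheduler file itself, and switching is done as root by echoing a scheduler name such as noop into that same file.</p>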
<p><strong>Bonnie++ Command:</strong> We ran the same bonnie++ command for each test, differing only in the name of the output log file. An example command: <em>&#8220;bonnie++ -u nobody -n cc1-eightdisk-noop -s 50000 -x 2 -d /eightdisk/ -q -f 2&gt;&amp;1 | tee /opt/results/eightdisk-noop-results.log&#8221;</em></p>
<p> </p>
<p><strong>Access Our Raw Data</strong></p>
<p>Data collected so far has been posted to a <a href="https://spreadsheets.google.com/ccc?key=0AsrRXBRzWSxSdDdTZG9rZXRHUnQyU0sxak9aaGpJUlE&amp;hl=en&amp;authkey=CJmVloIK">Google Docs spreadsheet.</a></p>
<p><strong>&#8220;Cluster compute&#8221; cc1.4xlarge Results</strong></p>
<p>We&#8217;ve created new blog posts to specifically talk about what we see just on the new EC2 instance types</p>
<ul>
<li><a href="http://blog.bioteam.net/2010/07/19/local-storage-performance-of-aws-cluster-compute-instances/">Performance of local storage only on cc1.4xlarge instance types</a></li>
<li><a href="http://blog.bioteam.net/?p=644">Combined performance of local &amp; EBS attached storage on cc1.4xlarge instance types</a> </li>
</ul>
<p><strong>Misc. Results</strong></p>
<p>Incomplete so far -</p>
<ul>
<li>Something seems &#8220;off&#8221; with our single-node 160GB EBS volume test. We might blow the volume and instance away and re-run just to see if we get any major shift in the numbers</li>
<li>Testing the single-node 160GB EBS volume is so slow that we were only able to complete the single drive tests with the noop IO scheduler. For the 4-drive and 8-drive stripe sets we were able to test with noop, cfq and deadline</li>
<li>We also have no data yet from non compute-cluster node instance types. We plan on collecting that data over the coming days so we can compare it. </li>
</ul>
<p>Along with the raw data, here are some graphs. We averaged the values of the repeated tests.</p>
<p>Interpretation and more results will be forthcoming. We&#8217;ll update this blog post as we learn and do more.</p>
<p><strong>Sequential Output</strong></p>
<p><img style="display: block; margin-left: auto; margin-right: auto;" title="sequentialOutput.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/sequentialOutput.png" border="0" alt="sequentialOutput.png" width="412" height="509" /></p>
<p><strong>Sequential Input</strong></p>
<p><img style="display: block; margin-left: auto; margin-right: auto;" title="sequentialInput.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/sequentialInput.png" border="0" alt="sequentialInput.png" width="412" height="509" /></p>
<p> </p>
<p><strong>Seek</strong></p>
<p><img style="display: block; margin-left: auto; margin-right: auto;" title="Seek.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/Seek1.png" border="0" alt="Seek.png" width="420" height="509" /></p>
<p><strong>Sequential Create &amp; Delete</strong></p>
<p><img style="display: block; margin-left: auto; margin-right: auto;" title="sequentialCreateDelete.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/sequentialCreateDelete.png" border="0" alt="sequentialCreateDelete.png" width="420" height="509" /></p>
<p> </p>
<p><strong>Random Create &amp; Delete</strong></p>
<p><img style="display: block; margin-left: auto; margin-right: auto;" title="randomCreateDelete.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/randomCreateDelete.png" border="0" alt="randomCreateDelete.png" width="420" height="509" /></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.bioteam.net/2010/07/13/preliminary-ebs-performance-tests-on-amazon-compute-cluster-cc1-4xlarge-instance-types/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Grid Engine on the new Amazon Compute Cluster Instances</title>
		<link>http://blog.bioteam.net/2010/07/13/grid-engine-on-the-new-amazon-compute-cluster-instances/</link>
		<comments>http://blog.bioteam.net/2010/07/13/grid-engine-on-the-new-amazon-compute-cluster-instances/#comments</comments>
		<pubDate>Tue, 13 Jul 2010 14:45:36 +0000</pubDate>
		<dc:creator>chrisdag</dc:creator>
				<category><![CDATA[Employee Posts]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[amazon]]></category>
		<category><![CDATA[cc1]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[gridengine]]></category>

		<guid isPermaLink="false">http://blog.bioteam.net/?p=532</guid>
		<description><![CDATA[{ crossposted to blog.bioteam.net and gridengine.info } Amazon made a very important announcement today, releasing new EC2 server types and network configurations that significantly enhance the Amazon AWS environment for people who are interested in cluster computing, compute farming and high performance computing (HPC) on the cloud. The announcement is here for those who are [...]]]></description>
			<content:encoded><![CDATA[<p>{ crossposted to blog.bioteam.net and gridengine.info }</p>
<p>Amazon made a very important announcement today, releasing new EC2 server types and network configurations that significantly enhance the Amazon AWS environment for people who are interested in cluster computing, compute farming and high performance computing (HPC) on the cloud.</p>
<p>The announcement is here for those who are interested:</p>
<p><a href="http://aws.typepad.com/aws/2010/07/the-new-amazon-ec2-instance-type-the-cluster-compute-instance.html">http://aws.typepad.com/aws/2010/07/the-new-amazon-ec2-instance-type-the-cluster-compute-instance.html</a></p>
<p>I&#8217;m thrilled that this news is now public, the service is up for use and I can finally start testing, blogging and benchmarking in the &#8220;real&#8221; production environment.</p>
<p>In the next few days I&#8217;ll be blogging over on <a href="http://blog.bioteam.net">http://blog.bioteam.net</a>, concentrating initially on seeing how storage and storage IO speeds differ on the new instance types. For life science types like myself, one of the biggest hassles in the cloud is due to the fact that we tend to be more performance bound by the speed of storage and file IO than anything else. The 10GbE networking changes and non-oversubscription of the network links along with the ability to group nodes together may mean very very interesting things are now much more feasible on the AWS platform.</p>
<p>Because I&#8217;m going to first concentrate on storage and IO stuff on the new offering I wanted to quickly show Grid Engine running on the new server types.</p>
<p>Even a single node SGE cluster can do reasonable work now as the cc1 instance type includes a pair of quad-core Nehalem CPUs along with ~23GB memory and a 10GbE ethernet backend.</p>
<p><strong><em>We will be blogging and talking much more about how to use Chef Server to orchestrate self-assembling Grid Engine clusters and compute farms on this new service</em></strong> but since that may not happen until later &#8212; I just wanted to throw up a teaser post showing SGE 6.2u5 running in single-node mode on the new HPC offerings from Amazon.</p>
<p><strong>qstat output showing 16 CPUs <em>(click for full-size)</em>:</strong></p>
<p><a href="http://blog.bioteam.net/wp-content/uploads/2010/07/sge-cc1-1.png"><img style="display: block; margin-left: auto; margin-right: auto;" title="sge-cc1-1.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/sge-cc1-1.png" border="0" alt="sge-cc1-1.png" width="500" /></a></p>
<p> </p>
<p><strong>qhost output showing system resources</strong> <em>(click for full-size)</em>:</p>
<p><a href="http://blog.bioteam.net/wp-content/uploads/2010/07/sge-cc1-2.png"><img style="display: block; margin-left: auto; margin-right: auto;" title="sge-cc1-2.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/sge-cc1-2.png" border="0" alt="sge-cc1-2.png" width="500" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.bioteam.net/2010/07/13/grid-engine-on-the-new-amazon-compute-cluster-instances/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>2010 ISMB Amazon Cloud Workshop Slides</title>
		<link>http://blog.bioteam.net/2010/07/13/2010-ismb-amazon-cloud-workshop-slides/</link>
		<comments>http://blog.bioteam.net/2010/07/13/2010-ismb-amazon-cloud-workshop-slides/#comments</comments>
		<pubDate>Tue, 13 Jul 2010 13:03:54 +0000</pubDate>
		<dc:creator>chrisdag</dc:creator>
				<category><![CDATA[Employee Posts]]></category>
		<category><![CDATA[Presentations]]></category>

		<guid isPermaLink="false">http://blog.bioteam.net/?p=524</guid>
		<description><![CDATA[Later than I&#8217;d like but as promised to attendees of the ISMB 2010 meeting in Boston, here are my presentation slides from the Amazon Cloud Computing Workshop. Link: http://blog.bioteam.net/wp-content/uploads/2010/07/2010-ISMB-Cloud-Workshop_v1.pdf]]></description>
			<content:encoded><![CDATA[<p>Later than I&#8217;d like but as promised to attendees of the ISMB 2010 meeting in Boston, here are my presentation slides from the Amazon Cloud Computing Workshop.</p>
<p>Link:</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; font: 12.0px Helvetica;"><a href="http://blog.bioteam.net/wp-content/uploads/2010/07/2010-ISMB-Cloud-Workshop_v1.pdf">http://blog.bioteam.net/wp-content/uploads/2010/07/2010-ISMB-Cloud-Workshop_v1.pdf</a></p>
<p><a href="http://blog.bioteam.net/wp-content/uploads/2010/07/2010-ISMB-Cloud-Workshop_v1.pdf"><img style="float: left;" title="ismb-cloud-iconLarge.png" src="http://blog.bioteam.net/wp-content/uploads/2010/07/ismb-cloud-iconLarge.png" border="0" alt="Presentation slide icon" width="377" height="297" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.bioteam.net/2010/07/13/2010-ismb-amazon-cloud-workshop-slides/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Cluster building</title>
		<link>http://blog.bioteam.net/2010/06/23/cluster-building/</link>
		<comments>http://blog.bioteam.net/2010/06/23/cluster-building/#comments</comments>
		<pubDate>Wed, 23 Jun 2010 22:04:03 +0000</pubDate>
		<dc:creator>cdwan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.bioteam.net/?p=487</guid>
		<description><![CDATA[With all of BioTeam&#8217;s talk about cloud computing, I was a bit surprised to find myself building an honest-to-goodness non-cloud, non-virtual compute cluster a couple of weeks ago. There were wires, blinking lights, whirring fans, circuit breakers, and all sorts of messy real-world details to contend with. The servers were heavy and unwieldy, and I [...]]]></description>
			<content:encoded><![CDATA[<p>With all of BioTeam&#8217;s talk about cloud computing, I was a bit surprised to find myself building an honest-to-goodness non-cloud, non-virtual compute cluster a couple of weeks ago.  There were wires, blinking lights, whirring fans, circuit breakers, and all sorts of messy real-world details to contend with.  The servers were heavy and unwieldy, and I made use of the roll of athletic tape in my cluster-build bag to cover the inevitable nicks and cuts in my fingers that accumulate over a couple of days of slinging metal.</p>
<p>We deployed the first incarnation of this system in 2004.  At that time, it was a homogeneous system built from single-CPU G4 Xserve machines from Apple.  I believe that it ran OS X 10.2 Server, installed via NetBoot from a portal that was configured by copying a bootable OS image from a USB drive.</p>
<p>In the ensuing six years, it has been in near-constant use as a BLAST farm for the department of environmental engineering at MIT.  We&#8217;ve upgraded it with each incarnation of the Xserve as they came out &#8211; and when we ran out of physical space in the co-location facility (hard limit of three racks), we started rolling in new machines by ousting the oldest ones.  These old servers moved to a corner of a wet lab to serve as a development cluster.</p>
<p>The system is &#8211; to some extent &#8211; a crazy quilt.  It&#8217;s cobbled together, running three different major versions of OS X.  We&#8217;ve been wanting to upgrade the Ethernet backplane for about four years now &#8211; but somehow it&#8217;s never been important enough to actually do.  There are separate NFS servers for BLAST databases and home directories.  A small pile of scripts integrates with Sun Grid Engine to ensure that data staging and software updates do not collide with running jobs.  On the other hand, this system has cranked out a ridiculous amount of scientific analysis.</p>
<p>I had a blast.  At the end of two days of work, I re-enabled the queues and had the satisfaction of watching all the little blue lights spin up almost immediately, indicating that user jobs were flowing out onto the hardware.  </p>
<p>It reminds me, to some extent, of Admiral Hyman G. Rickover&#8217;s famous 1953 quote, to a congressional hearing, about the difference between &#8220;paper&#8221; and &#8220;real&#8221; nuclear reactors:</p>
<p><em>An academic reactor or reactor plant almost always has the following basic characteristics: (1) It is simple. (2) It is small. (3) It is cheap. (4) It is light. (5) It can be built very quickly. (6) It is very flexible in purpose. (7) Very little development will be required. It will use off-the-shelf components. (8) The reactor is in the study phase. It is not being built now.</em></p>
<p><em>On the other hand a practical reactor can be distinguished by the following characteristics: (1) It is being built now. (2) It is behind schedule. (3) It requires an immense amount of development on apparently trivial items. (4) It is very expensive. (5) It takes a long time to build because of its engineering development problems. (6) It is large. (7) It is heavy. (8) It is complicated.</em></p>
<p>The systems engineering required to keep pace with biology these days still falls in the &#8220;heavy and complicated&#8221; category.  Under the hood of the virtual, there must always be the real.  It&#8217;s therefore important that someone on the team break out the athletic tape and the power driver from time to time.  It keeps us honest.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.bioteam.net/2010/06/23/cluster-building/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Slides from the Amazon Web Services Genomics meeting</title>
		<link>http://blog.bioteam.net/2010/06/08/slides-from-the-amazon-web-services-genomics-meeting/</link>
		<comments>http://blog.bioteam.net/2010/06/08/slides-from-the-amazon-web-services-genomics-meeting/#comments</comments>
		<pubDate>Tue, 08 Jun 2010 18:42:06 +0000</pubDate>
		<dc:creator>chrisdag</dc:creator>
				<category><![CDATA[Employee Posts]]></category>
		<category><![CDATA[Featured]]></category>
		<category><![CDATA[Presentations]]></category>

		<guid isPermaLink="false">http://blog.bioteam.net/?p=509</guid>
		<description><![CDATA[I had the unenviable task of giving the first talk after James Hamilton presented the opening keynote at the Amazon Web Services Genomics meeting. James is one of the very, very few people writing and talking in public about the sorts of things the mega internet-scale companies are doing to drive costs down and [...]]]></description>
			<content:encoded><![CDATA[<p><img style="float: right;" title="hamilton.jpg" src="http://blog.bioteam.net/wp-content/uploads/2010/06/hamilton.jpg" border="0" alt="hamilton.jpg" width="400" height="274" /></p>
<p>I had the unenviable task of giving the first talk after <a href="http://perspectives.mvdirona.com/">James Hamilton</a> presented the opening keynote at the <a href="http://aws.amazon.com/genomics_workshop/">Amazon Web Services Genomics meeting</a>.</p>
<p>James is one of the very, very few people writing and talking in public about the sorts of things the mega internet-scale companies are doing to drive costs down and recapture efficiencies. There are amazing lessons to be learned from what Microsoft, Amazon and Google (etc.) are doing deep within their datacenters, but that info is very hard to find because it represents a competitive advantage. This is why whenever you see something cool coming out of a Google datacenter you can be sure that the info is at least a few years old &#8230;</p>
<p>This is the first time I&#8217;ve gotten a chance to see James present in person and it was fantastic. Hopefully AWS will release the video and slides soon.</p>
<p>Many of my &#8220;cloud talks&#8221; are focused on how to map HPC workflows onto cloud platforms and the various technical challenges &amp; best practices involved. Today&#8217;s talk was far lighter and more &#8220;big picture&#8221; since I knew that the speakers coming after me were presenting deeply specific and technical works.</p>
<p>Talk slides in PDF form can be downloaded by clicking on the image below. Feedback welcome.</p>
<p>Video from the event is likely to be posted soon by Amazon.</p>
<p><strong>UPDATE</strong>:  Crap. I misspoke during my talk when I was talking about how surprising it was to hear about <a href="http://www.opscode.com">Chef</a> usage at the 2010 BioITWorld Cloud Computing Workshop. I spaced out and said that &#8220;<em>we</em>&#8221; organized the cloud meeting. Totally wrong &#8211; the meeting was organized and run by Jason and the gang over at <a href="http://www.cyclecomputing.com">Cycle Computing</a>. Really sorry about that &#8211; not intentional!</p>
<p><a href="http://blog.bioteam.net/wp-content/uploads/2010/06/2010-AWS-Genomics_cdag.pdf"><img style="float: left;" title="aws-genomics-icon.png" src="http://blog.bioteam.net/wp-content/uploads/2010/06/aws-genomics-icon1.png" border="0" alt="Download presentation slides " width="393" height="297" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.bioteam.net/2010/06/08/slides-from-the-amazon-web-services-genomics-meeting/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
