<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>lalit.org &#187; amazon</title>
	<atom:link href="http://www.lalit.org/tag/amazon/feed" rel="self" type="application/rss+xml" />
	<link>http://www.lalit.org</link>
	<description>Personal page of Lalit Patel, an engineer, entrepreneur, geek from Bhubaneswar, India.</description>
	<lastBuildDate>Thu, 03 Jun 2010 05:23:52 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Getting most out of Amazon S3</title>
		<link>http://www.lalit.org/lab/setting-cache-headers-files-in-amazon-s3</link>
		<comments>http://www.lalit.org/lab/setting-cache-headers-files-in-amazon-s3#comments</comments>
		<pubDate>Sun, 02 Nov 2008 10:19:35 +0000</pubDate>
		<dc:creator>Lalit</dc:creator>
				<category><![CDATA[lab]]></category>
		<category><![CDATA[amazon]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[php]]></category>
		<category><![CDATA[s3]]></category>

		<guid isPermaLink="false">http://www.lalit.org/?p=259</guid>
		<description><![CDATA[Amazon S3 is a very useful service. According to the official Amazon Web Services website:
Amazon S3 is storage for the Internet. It is designed to make web-scale computing easier for developers.
It's a no-frills service that does exactly what it promises: it makes things easy for developers so that they can concentrate on [...]]]></description>
			<content:encoded><![CDATA[<p><a rel="nofollow" href="http://aws.amazon.com/s3">Amazon S3</a> is a very useful service. According to the official Amazon Web Services website:</p>
<blockquote><p>Amazon S3 is storage for the Internet. It is designed to make web-scale computing easier for developers.</p></blockquote>
<p>It's a no-frills service that does exactly what it promises: it makes things easy for developers, so they can concentrate on features and leave the scaling to Amazon. If you are new to Amazon S3, here's a good <a href="http://www.labnol.org/internet/host-images-files-on-amazon-s3-storage/4923/">starting guide</a> for you.</p>
<p>S3 is like a sharp sword: you must know how to handle it, or you can hurt yourself. That's exactly what happened to me. <span id="more-259"></span>One of the many applications that we are developing on MySpace, <a href="http://profile.myspace.com/Modules/Applications/Pages/Canvas.aspx?appId=104651">Sketch Me</a>, required us to store (and serve) a huge amount of image data (user sketches). Due to the viral nature of the application, the load almost tripled every month. S3 was the clear choice: it saved us time, money and headaches. Our current stats (with caching) are:</p>
<ul>
<li>Total files stored: <strong>205GB</strong></li>
<li>Bandwidth per month: <strong>2TB</strong></li>
<li>GET Requests per month: <strong>112m</strong></li>
</ul>
<p>Clearly, I would rather not spend time setting up image servers that can handle this kind of load, and I am more than happy to outsource it to Amazon S3.</p>
<p>You may be surprised to learn that before caching, when our images totalled just 5GB, the number of requests was <strong>263m ($363.91)</strong>, more than double what it is now with 205GB of images.</p>
<p>So if we take the total number of requests to be directly proportional to the number of images, at that rate we would now be seeing approximately 4.5 billion requests, or $15,000 :O</p>
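<p>As a rough check on the dollar figure, the sketch below scales the pre-caching cost by the growth in stored data. It assumes, as the paragraph above does, that cost grows in direct proportion to the image collection, and it uses storage size (5GB to 205GB) as a stand-in for the number of images. The variable names are illustrative only.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">&lt;?php
// Figures quoted in the post.
$old_storage_gb = 5;       // image data before caching was set up
$new_storage_gb = 205;     // image data today
$old_cost_usd   = 363.91;  // cost quoted for the month with 263m requests

// Assumption: cost scales linearly with the amount of stored image data.
$growth = $new_storage_gb / $old_storage_gb;
$projected_cost_usd = $old_cost_usd * $growth;

printf("Growth factor: %dx\n", $growth);
echo 'Projected cost without caching: $' . number_format($projected_cost_usd, 2) . "\n";
</pre></div></div>

<p>With these inputs the growth factor is 41 and the projection comes to $14,920.31, which is in line with the $15,000 estimate.</p>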
<h3>How did I tame the beast?</h3>
<p>At first glance, the pricing of Amazon&#8217;s S3 service seems quite cheap. Wait until you get your first bill, and you will see how cents add up to huge $$$.</p>
<blockquote><p><strong>Storage</strong></p>
<ul>
<li>$0.150 per GB – first 50 TB / month of storage used</li>
</ul>
<p><strong>Data Transfer</strong></p>
<ul>
<li>$0.100 per GB – all data transfer in</li>
<li>$0.170 per GB – first 10 TB / month data transfer out</li>
</ul>
<p><strong>Requests</strong></p>
<ul>
<li>$0.01 per 1,000 PUT, POST, or LIST requests</li>
<li class="c_red">$0.01 per 10,000 GET and all other requests</li>
</ul>
</blockquote>
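<p>To see how these prices combine, here is a minimal sketch of a bill estimator for a read-heavy bucket. The function name and constants are hypothetical, only the first pricing tier is modelled, and uploads (PUT/POST/LIST) and inbound transfer are ignored. The inputs are the stats quoted earlier, with 2TB taken as 2048GB.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">&lt;?php
// Prices from the list quoted above (USD, first pricing tier only).
define('PRICE_STORAGE_PER_GB', 0.15);
define('PRICE_TRANSFER_OUT_PER_GB', 0.17);
define('PRICE_PER_10K_GET', 0.01);

// Hypothetical helper: estimates one month's bill for a read-heavy bucket.
// Returns array(storage, transfer, requests, total), all in USD.
function estimate_monthly_bill($storage_gb, $transfer_out_gb, $get_requests)
{
    $storage  = $storage_gb * PRICE_STORAGE_PER_GB;
    $transfer = $transfer_out_gb * PRICE_TRANSFER_OUT_PER_GB;
    $requests = ($get_requests / 10000) * PRICE_PER_10K_GET;
    return array($storage, $transfer, $requests, $storage + $transfer + $requests);
}

// Stats from the post: 205GB stored, 2TB out, 112m GET requests.
list($storage, $transfer, $requests, $total) = estimate_monthly_bill(205, 2048, 112000000);
printf("storage %.2f, transfer %.2f, requests %.2f, total %.2f\n", $storage, $transfer, $requests, $total);
</pre></div></div>

<p>Note how the request line depends only on the number of GETs, not on how many bytes were actually transferred, which is why requests that transfer almost nothing can still dominate the bill.</p>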
<p>After getting a first bill of a few hundred dollars, I sat down to work out how to bring it down. Digging through the HTTP headers, I found that by default S3 does not set any caching headers on its responses. So even when a visitor has requested a file from S3 before and still has it in the browser cache, the browser sends an HTTP <code>GET</code> request to S3 just to check whether the file has changed. S3 returns a <code>304 Not Modified</code> response if the file has not changed, and the file is not downloaded again. You might think S3 has just saved you a few GB of bandwidth cost. But each of these requests still costs you ($0.01 per 10,000 GET), and requests generally make up the bulk of the S3 bill.</p>
<p>The photos our users upload almost never change, so asking S3 every time whether a file has changed is unnecessary. You can stop the browser from sending this extra request by setting appropriate <code>Cache-Control</code> or <code>Expires</code> headers on the files. For example, <code>Cache-Control: max-age=864000</code> tells the browser not to request the same file again for the next 10 days (3600*24*10 seconds).</p>
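<p>As an illustration, the two header values for a 10-day lifetime could be built like this. This is a sketch only: <code>cache_headers()</code> is a hypothetical helper, not part of any S3 library, and how the headers are attached to an upload depends on the client you use.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">&lt;?php
$age = 3600 * 24 * 10; // 10 days, in seconds (864000)

// Hypothetical helper: builds the two caching headers for a given lifetime.
function cache_headers($age, $now)
{
    return array(
        'Cache-Control' =&gt; 'max-age=' . $age,
        // HTTP dates are always expressed in GMT.
        'Expires'       =&gt; gmdate('D, d M Y H:i:s', $now + $age) . ' GMT',
    );
}

// Print the headers as they would appear in an HTTP response.
foreach (cache_headers($age, time()) as $name =&gt; $value) {
    echo $name . ': ' . $value . "\n";
}
</pre></div></div>

<p>Browsers that understand <code>Cache-Control</code> give <code>max-age</code> priority over <code>Expires</code>, so sending both is safe; note that an <code>Expires</code> date is fixed at upload time, whereas <code>max-age</code> is counted from each download.</p>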
<p><img class="alignnone size-full wp-image-268" title="s3_304" src="http://www.lalit.org/wordpress/wp-content/uploads/2008/11/s3_304.png" alt="" width="328" height="76" /></p>
<p>Fortunately, S3 allows us to set these headers, but there is no simple way to do it. So I decided to write a small script (<a href="#domestication">see below</a>) to achieve this.</p>
<h3><a id="domestication" name="domestication"></a>After Domestication</h3>
<p>After setting the cache headers, my bill came down drastically. Within a month the traffic doubled and the amount of image data almost tripled (from 5GB to 15GB), while the number of requests dropped to a third. That is effectively a <strong>9 times reduction in cost</strong> compared with what the requests would have cost without caching. Ideally, your bandwidth cost should be more than your request cost.</p>
<p><img class="alignnone size-full wp-image-266" title="s3_bill" src="http://www.lalit.org/wordpress/wp-content/uploads/2008/11/s3_bill.png" alt="" width="497" height="403" /></p>
<p>If you own a high-traffic blog or website, you can also store your JavaScript and CSS files on S3 with far-future <code>Expires</code> headers, and use versioning (changing the file's URL when its contents change) so that the browser knows when a file has changed.</p>
<p><strong>For Example:</strong></p>
<p><code>&lt;link href="http://s3.amazonaws.com/lalit/style.css?<strong>v=3</strong>" ... /&gt;</code><br />
After changing the stylesheet, update your code to:<br />
<code>&lt;link href="http://s3.amazonaws.com/lalit/style.css?<strong>v=4</strong>" ... /&gt;</code></p>
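<p>Bumping the version by hand is easy to forget. One possible alternative, sketched below, derives the version from the file's contents so that the URL changes whenever the file does. The helper name is hypothetical, and the sketch assumes a local copy of <code>style.css</code> sits next to the script.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">&lt;?php
// Hypothetical helper: appends a version derived from the file's contents,
// so the URL changes whenever the file does.
function versioned_url($base_url, $local_path)
{
    if (!is_file($local_path)) {
        return $base_url; // no local copy to hash, leave the URL untouched
    }
    // The first 8 hex characters of the MD5 hash are plenty for cache busting.
    $version = substr(md5_file($local_path), 0, 8);
    return $base_url . '?v=' . $version;
}

// Assumes style.css is kept in the same directory as this script.
echo versioned_url('http://s3.amazonaws.com/lalit/style.css', __DIR__ . '/style.css'), "\n";
</pre></div></div>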
<h3>Domestication</h3>
<p>The popular <a rel="nofollow" href="https://addons.mozilla.org/en-US/firefox/addon/3247">Firefox extension S3Fox</a> doesn&#8217;t let us set these headers. So I decided to write a small script using the <a rel="nofollow" href="http://developer.amazonwebservices.com/connect/entry.jspa?externalID=1448&amp;categoryID=47">S3 PHP REST Library</a>.</p>
<blockquote><p>You can download the code <a href="http://www.lalit.org/wordpress/wp-content/uploads/2008/11/s3_upload.zip">here</a> (zip 6k).</p></blockquote>
<p><small><b class="red">Update:</b> Fixed a bug in the code (MIME type calculation was failing on some PHP configurations).</small></p>
<p>To use the script, upload it to your server (running PHP). You need to edit the <code>upload.php</code> file to specify your AWS access key, your AWS secret and your S3 bucket name.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">/* One time settings. */
$awsAccessKey = '---';          // your AWS Key
$awsSecretKey = '---';          // your AWS Secret
$bucket_name  = '---';          // S3 Bucket name
$age          = 3600*24*10;     // Cache age 10 days

/* File Data */
$s3_dir_name  = 'dir1/dir2/';   // Directory on S3 where you want to upload the file
                                // example http://s3.amazonaws.com/bucket_name/dir1/dir2/filename.ext

$upload_file  = 'filename.ext'; // name of the file you want to upload.
                                // keep it in the same dir as this file.</pre></div></div>
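<p>Since S3 serves back whatever <code>Content-Type</code> was stored with the file, the upload also needs a MIME type. The post does not show how the script works this out, so the following is only a sketch of one defensive approach, not the script's actual code: use the fileinfo extension when the PHP build has it, and fall back to a small extension map otherwise. The helper name and the map are assumptions.</p>

<div class="wp_syntax"><div class="code"><pre class="php" style="font-family:monospace;">&lt;?php
// Hypothetical helper: guesses a Content-Type for the file being uploaded.
function guess_mime_type($path)
{
    // Prefer the fileinfo extension when this PHP build has it.
    if (function_exists('finfo_open')) {
        if (is_file($path)) {
            $finfo = finfo_open(FILEINFO_MIME_TYPE);
            $type  = $finfo ? finfo_file($finfo, $path) : false;
            if ($type) {
                return $type;
            }
        }
    }

    // Otherwise fall back to a small extension map.
    $map = array(
        'png'  =&gt; 'image/png',
        'jpg'  =&gt; 'image/jpeg',
        'jpeg' =&gt; 'image/jpeg',
        'gif'  =&gt; 'image/gif',
        'css'  =&gt; 'text/css',
        'js'   =&gt; 'application/javascript',
    );
    $ext = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    return isset($map[$ext]) ? $map[$ext] : 'application/octet-stream';
}

echo guess_mime_type('filename.ext'), "\n";
</pre></div></div>

<p>For a file that does not exist or has an unknown extension, such as the placeholder <code>filename.ext</code>, the function falls through to the generic <code>application/octet-stream</code>.</p>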

<p>After you have saved the config info, every time you need to upload a file you have to specify the file name and the directory name in <code>upload.php</code> and call it from the browser, e.g. http://yoursite.com/s3/upload.php. (I know it sucks; I promise I will make it better soon.)</p>
<p>If you want to escape all this trouble, there is a paid tool, <a rel="external nofollow" href="http://www.bucketexplorer.com/">Bucket Explorer</a>, which helps you upload files with custom headers. I haven&#8217;t used it, but it looks great and works on Win/Linux/Mac.</p>
<p>I hope your next S3 bill will come down <img src='http://www.lalit.org/wordpress/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.lalit.org/lab/setting-cache-headers-files-in-amazon-s3/feed</wfw:commentRss>
		<slash:comments>9</slash:comments>
		</item>
	</channel>
</rss>
