<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>John Eberly &#187; aws</title>
	<atom:link href="http://blog.eberly.org/tag/aws/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.eberly.org</link>
	<description>suggest a tagline....</description>
	<lastBuildDate>Wed, 28 Jul 2010 03:05:38 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>How I automated my backups to Amazon S3 using rsync and s3fs.</title>
		<link>http://blog.eberly.org/2008/10/27/how-i-automated-my-backups-to-amazon-s3-using-rsync/</link>
		<comments>http://blog.eberly.org/2008/10/27/how-i-automated-my-backups-to-amazon-s3-using-rsync/#comments</comments>
		<pubDate>Tue, 28 Oct 2008 03:24:14 +0000</pubDate>
		<dc:creator>John Eberly</dc:creator>
				<category><![CDATA[]]></category>
		<category><![CDATA[amazon]]></category>
		<category><![CDATA[aws]]></category>
		<category><![CDATA[backups]]></category>
		<category><![CDATA[s3]]></category>

		<guid isPermaLink="false">http://blog.eberly.org/?p=93</guid>
		<description><![CDATA[The following is how I automated my backups to Amazon S3 in about 5 minutes. A lot has changed since my original post on automating my backups to s3 using s3sync. There are more mature and easier to use solutions now. I am switching because using s3fs gives you many more options for using s3, [...]]]></description>
			<content:encoded><![CDATA[<p>The following is how I automated my backups to <a href="http://aws.amazon.com/s3/">Amazon S3</a> in about 5 minutes.</p>
<p>A lot has changed since my original post on <a href="http://blog.eberly.org/2006/10/09/how-automate-your-backup-to-amazon-s3-using-s3sync/">automating my backups to s3 using s3sync</a>.  There are now more mature and easier-to-use solutions.  I am switching because s3fs gives you many more options for using S3, is easier to set up, and is faster.</p>
<p>I now use <a href="http://s3fs.googlecode.com/">s3fs</a> to mount an S3 bucket on a local directory, then use rsync to keep the bucket up to date with my files.  The following directions are geared towards Ubuntu Linux, but could be adapted for any Linux distribution or <a href="http://www.rsaccon.com/2007/10/mount-amazon-s3-on-your-mac.html">Mac OS X</a>. </p>
<p><span id="more-93"></span><br />
<strong>STEP 1: Install s3fs</strong></p>
<p>The first step is to install s3fs dependencies.  (Assuming Ubuntu)</p>
<pre>
sudo apt-get install build-essential libcurl4-openssl-dev libxml2-dev libfuse-dev
</pre>
<p>Next, install the most recent version of <a href="http://code.google.com/p/s3fs/">s3fs</a>.  At the time of writing the most recent is r177, but a quick check of <a href="http://code.google.com/p/s3fs/downloads/list">s3fs downloads</a> will show whether there is anything newer.</p>
<pre>
wget http://s3fs.googlecode.com/files/s3fs-r177-source.tar.gz
tar -xzf s3fs*
cd s3fs
make
sudo make install
sudo mkdir /mnt/s3
sudo chown yourusername:yourusername /mnt/s3
</pre>
<p><strong>STEP 2: Create script to mount your Amazon s3 bucket using s3fs and sync files.</strong></p>
<p>The following assumes you already have a bucket created on Amazon S3.  If this is not the case, you can use a tool like <a href="https://addons.mozilla.org/en-US/firefox/addon/3247">s3Fox</a> to create one.</p>
<p>Use a text editor of your choice to make a shell script that mounts your bucket, runs rsync, then unmounts.  It is not necessary to unmount your S3 directory after each rsync, but I prefer to be safe.  While the bucket is mounted, one mistake like an &#8216;rm&#8217; on your root directory could wipe all of the files on your machine and on your S3 mount.  You should probably start with a test directory to be safe.</p>
<p>Make the file s3fs.sh</p>
<pre>
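#!/bin/bash
# mount the bucket on /mnt/s3
/usr/bin/s3fs yourbucket -o accessKeyId=yourS3key -o secretAccessKey=yourS3secretkey /mnt/s3
# mirror the backup directory into the mount; --delete removes files that no longer exist in the source
/usr/bin/rsync -avz --delete /home/username/dir/you/want/to/backup /mnt/s3
# unmount so the bucket is not left mounted between runs
/bin/umount /mnt/s3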
</pre>
<p>Note the --delete option.  It deletes any files on the destination that have been removed from the &#8216;source&#8217;.<br />
Change the permissions to make the script executable.</p>
<pre>
chmod 700 s3fs.sh
</pre>
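<p>One thing the script above does not check is whether the mount actually succeeded.  If it did not, rsync would copy into the plain local directory /mnt/s3 instead of the bucket.  The following is a minimal sketch of a guard, assuming the &#8216;mountpoint&#8217; utility is installed on your system (that is an assumption, so check with &#8216;which mountpoint&#8217;).  The function name is_mounted and the variable MOUNT_DIR are hypothetical names used only for illustration.</p>
<pre>
#!/bin/bash
# Sketch only: report whether the target directory is really a mount point.
# MOUNT_DIR and is_mounted are illustrative names, not part of s3fs.
MOUNT_DIR="${MOUNT_DIR:-/mnt/s3}"

is_mounted() {
    # mountpoint -q exits 0 when the directory is a mount point, non-zero otherwise
    mountpoint -q "$1"
}

if is_mounted "$MOUNT_DIR"; then
    echo "$MOUNT_DIR is mounted, safe to rsync"
else
    echo "$MOUNT_DIR is not mounted, skipping rsync"
fi
</pre>
<p>In s3fs.sh, a check like this would sit between the s3fs line and the rsync line, with the rsync and umount steps moved inside the first branch.</p>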
<p>Before you run the entire script, you might want to run each line separately to make sure everything is working properly.  The paths to rsync and umount might be different on your system (use &#8216;which rsync&#8217; to check).  Just for fun, I did a &#8216;df -h&#8217;, which showed I now have 256 Terabytes available on the s3 mount!</p>
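<p>When testing the rsync line on its own, it can help to preview it first.  rsync has a dry-run mode (-n, or --dry-run) that lists what would be copied or deleted without changing anything.  A small sketch follows; preview_sync is a hypothetical helper name, and the paths in the example are the placeholders used in this post.</p>
<pre>
#!/bin/bash
# Sketch only: preview an rsync --delete run without touching any files.
# preview_sync is an illustrative name, not an rsync feature.
preview_sync() {
    # -n (--dry-run) prints the planned copies and deletions, then stops
    rsync -avn --delete "$1" "$2"
}

# Example with the placeholder paths from this post:
# preview_sync /home/username/dir/you/want/to/backup /mnt/s3
</pre>
<p>Lines in the output that start with &#8216;deleting&#8217; are the files the real run would remove from the destination.</p>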
<p>Next, run the script and let it do its work.  This could take a long time depending on how much data you are uploading initially.  Your internet upload speed will be the bottleneck. </p>
<pre>
sudo ./s3fs.sh
</pre>
<p>That&#8217;s it!  You are backing up to Amazon S3.  You probably want to automate this using cron once you are sure everything is running OK.  To keep this tutorial simple, let&#8217;s assume you are setting up the cron job as root, so we don&#8217;t need to worry about permissions for mounting and unmounting the directory.</p>
<p><strong>STEP 3: Automate it with cron</strong></p>
<pre>
sudo su
crontab -e
0 0 * * * /path/to/s3fs.sh # this runs it every day at midnight
</pre>
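<p>Since the first upload can take a long time, a nightly job could start while the previous run is still going.  One way to avoid overlapping runs is to wrap the command in &#8216;flock&#8217;, assuming it is installed (it is part of util-linux on most distributions).  This is a sketch rather than what I run, and the lock file path /var/lock/s3fs-backup.lock is a hypothetical choice.</p>
<pre>
# skip tonight's run if the previous one still holds the lock
0 0 * * * flock -n /var/lock/s3fs-backup.lock /path/to/s3fs.sh
</pre>
<p>With -n, flock gives up immediately instead of waiting, so a run that cannot get the lock simply does not start.</p>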
<p>p.s. I use this in combination with hourly backups to a second local machine using git to have revision history.  I only back up nightly to s3, without revision history, in case my house burns down etc.  If you would like to know how I set up my git backups locally, just leave a comment and I can make a follow-up post.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.eberly.org/2008/10/27/how-i-automated-my-backups-to-amazon-s3-using-rsync/feed/</wfw:commentRss>
		<slash:comments>53</slash:comments>
		</item>
		<item>
		<title>How I automated my backups to Amazon S3 using s3sync.</title>
		<link>http://blog.eberly.org/2006/10/09/how-automate-your-backup-to-amazon-s3-using-s3sync/</link>
		<comments>http://blog.eberly.org/2006/10/09/how-automate-your-backup-to-amazon-s3-using-s3sync/#comments</comments>
		<pubDate>Tue, 10 Oct 2006 00:45:40 +0000</pubDate>
		<dc:creator>John Eberly</dc:creator>
				<category><![CDATA[backups]]></category>
		<category><![CDATA[amazon]]></category>
		<category><![CDATA[aws]]></category>

		<guid isPermaLink="false">http://blog.eberly.org/archive/how-automate-your-backup-to-amazon-s3-using-s3sync/</guid>
		<description><![CDATA[UPDATE: See my newer article for the way I currently back up to Amazon S3. Jeremy Zawodny has an excellent article/discussion about the different tools currently available to take advantage of Amazon Simple Storage Service (S3). After testing many of the tools currently available for S3, I decided to use the ruby program s3sync to back up my data [...]]]></description>
			<content:encoded><![CDATA[<p>UPDATE:  <a href="http://blog.eberly.org/2008/10/27/how-i-automated-my-backups-to-amazon-s3-using-rsync/">See my newer article for the way I currently back up to Amazon S3.</a></p>
<p><a href="http://jeremy.zawodny.com/blog/archives/007641.html">Jeremy Zawodny</a> has an excellent article/discussion about the different tools currently available to take advantage of Amazon Simple Storage Service (S3).  After testing many of the tools currently available for S3, I decided to use the ruby program s3sync to back up my data to S3.<br />
As I explained in an earlier post, I wanted a simple low-level tool to perform automatic backups to S3.  I decided to use <a href="http://developer.amazonwebservices.com/connect/thread.jspa?threadID=11975&#038;start=0&#038;tstart=0">s3sync</a> to do the heavy lifting and use the <a href="https://jets3t.dev.java.net/cockpit.html">jets3t Cockpit</a> GUI to monitor my S3 account.  The following explains how I successfully started automating my backups to S3 using s3sync and Cockpit.</p>
<p>My server is running Ubuntu Dapper with a samba server. All the machines in my house use a &#8220;Public&#8221; drive on the samba server to store all files from Windows and Linux. All of our important files like photos, home movies, and documents are stored on this &#8220;public&#8221; drive. This simplifies the backup procedure, since I don&#8217;t have to back up multiple sources.</p>
<p>The following steps describe how I back up my &#8220;public drive&#8221; to Amazon&#8217;s awesome S3 storage service. I decided to post this because I haven&#8217;t found a fairly &#8220;simple&#8221; guide to automating backups to S3 in a way that works like rsync on Linux. This is a follow-up post to my original post on <a href="http://blog.eberly.org/2006/10/02/cheap-reliable-secure-off-site-storage-for-digital-life-backup-where-are-you/">choosing a backup solution</a>.</p>
<p><span id="more-10"></span><br />
<strong>STEP 1: Activate an Amazon s3 account.</strong></p>
<p>Go to <a href="http://www.amazon.com/s3">http://www.amazon.com/s3</a> and sign up for an S3 web service account.</p>
<p>Have your Access Key ID and your Secret Access Key handy.</p>
<p><strong>STEP 2: Install a management tool<br />
</strong></p>
<p>(Update: I no longer use Cockpit. I now use the command line tools that come with s3sync, which were not available at the time I wrote this original article; see Option 1.)</p>
<p><strong>Option 1</strong> Use the command line shell tools that are included with s3sync (my new preferred method)</p>
<p>Here is a sampling of commands from the readme file for the command line tool s3cmd.rb, which can be used to create buckets and verify upload success or failure.  If you use this option, make sure you have the correct version of ruby installed on your system and that you have downloaded the s3sync package (see Step 3).</p>
<p>List all the buckets your account owns:</p>
<pre>
s3cmd.rb listbuckets</pre>
<p>Create a new bucket:</p>
<pre>
s3cmd.rb createbucket BucketName</pre>
<p>Delete an old bucket you don&#8217;t want any more:</p>
<pre>
s3cmd.rb deletebucket BucketName</pre>
<p>Find out what&#8217;s in a bucket, 10 lines at a time:</p>
<pre>
s3cmd.rb list BucketName 10</pre>
<p>Only look in a particular prefix:</p>
<pre>
s3cmd.rb list BucketName:startsWithThis</pre>
<p>I plan to write a shell script to verify success of backup and run via cron job each night, but I haven&#8217;t done it yet.  I will update here when I do.</p>
<p><strong>Option 2</strong> (original option that I used before s3sync command line shell tools were available)<br />
UPDATE:  I have had trouble getting this (or any other GUI) to work for folders containing large numbers of files.  If you plan to have thousands of files stored at Amazon, then I suggest Option 1.</p>
<p>Download a GUI tool and make sure you can log into your S3 account, create a bucket, add files, and delete them.</p>
<p>I have tried a lot of them, but I prefer <a href="https://jets3t.dev.java.net/cockpit.html">jets3t Cockpit</a>. It is written in Java and open source, plus it is able to read objects uploaded to S3 by other tools. Some tools like Jungle Disk (JD) create buckets and objects in a proprietary format. This means you would not be able to use JD to see files uploaded to S3 by other tools.<br />
Here is a screenshot of Cockpit.</p>
<p><img id="image11" alt="Cockpit" src="http://blog.eberly.org/wp-content/uploads/2006/10/cockpit.jpg" /><br />
Create a bucket that you will store your backups in. Make sure to give your bucket a unique name, because bucket names have to be unique across all users of S3. Many recommend using your Access Key ID from S3 as a prefix. For example, fakeaccesskey1234.backups. For the rest of this article, I will assume our bucket name is &#8220;mybucket&#8221;.</p>
<p>Cockpit will be a handy tool for you to monitor your backups in S3, but the actual file uploading/downloading will be done with a shell script using s3sync.</p>
<p><strong>STEP 3: Install </strong><a href="http://developer.amazonwebservices.com/connect/thread.jspa?threadID=11975&#038;start=0&#038;tstart=0">s3sync</a><strong> (ruby)</strong></p>
<p><a href="http://developer.amazonwebservices.com/connect/thread.jspa?threadID=11975&#038;start=0&#038;tstart=0">s3sync</a> is an open source ruby script that acts much like rsync, the Linux file sync program. Remember to read the README file from s3sync. Also, all the normal warnings apply. Test this on a couple of folders and files you don&#8217;t care about and make sure you understand what you are doing. Put the source/destination in the wrong order while using the --delete option and you could blow away all of your precious data.</p>
<p>Let&#8217;s move on.</p>
<p>The following applies to a Debian/Ubuntu based distribution, but could easily be adapted to your own distro.</p>
<p>First, make sure you have ruby 1.8.4 or greater and the OpenSSL library for ruby.</p>
<pre>$ sudo apt-get install ruby libopenssl-ruby</pre>
<p>check ruby version</p>
<pre>$ ruby -v
ruby 1.8.4 (2005-12-24) [i486-linux]</pre>
<p>change into the directory where you want to install s3sync, like /home/john/s3sync</p>
<p>download and unpack s3sync</p>
<pre>$ wget http://s3.amazonaws.com/ServEdge_pub/s3sync/s3sync.tar.gz
$ tar xvzf s3sync.tar.gz</pre>
<p>clean up</p>
<pre>$ rm s3sync.tar.gz</pre>
<p>make directory for ssl certificates and download some (important, read <a href="http://s3.amazonaws.com/ServEdge_pub/s3sync/README.txt">README</a> for info about these SSL certs)</p>
<pre>$ mkdir certs
$ cd certs
$ wget http://mirbsd.mirsolutions.de/cvs.cgi/~checkout~/src/etc/ssl.certs.shar</pre>
<p>run this shell archive</p>
<pre>$ sh ssl.certs.shar</pre>
<p>get back into main s3sync dir</p>
<pre>$ cd ..</pre>
<p>create two files with your favorite editor, upload.sh and download.sh with the following contents and update to suit your needs. (Important, like rsync, slashes matter, see README for examples)</p>
<p>upload.sh &#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;-</p>
<pre>#!/bin/bash
# script to upload a local directory up to s3
cd /path/to/yourshellscript/
export AWS_ACCESS_KEY_ID=yourS3accesskey
export AWS_SECRET_ACCESS_KEY=yourS3secretkey
export SSL_CERT_DIR=/your/path/to/s3sync/certs
ruby s3sync.rb -r --ssl --delete /home/john/localuploadfolder/ mybucket:/remotefolder
# copy and modify line above for each additional folder to be synced
</pre>
<p>download.sh &#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;-</p>
<pre>#!/bin/bash
# script to download from s3 to a local directory
cd /path/to/yourshellscript/
export AWS_ACCESS_KEY_ID=yourS3accesskey
export AWS_SECRET_ACCESS_KEY=yourS3secretkey
export SSL_CERT_DIR=/your/path/to/s3sync/certs
ruby s3sync.rb -r --ssl --delete mybucket:/remotefolder/ /home/john/localdownloadfolder
# copy and modify line above for each additional folder to be synced
</pre>
<p> NOTICE: These scripts use the --delete option, which deletes any file on the destination that is not on the source. Also, these shell scripts contain your Amazon secret info, so you will want to make sure they are only readable by you (chmod 700, credit Kelvin below).  You can also add the &#8220;-v&#8221; option to get a verbose account of the changes.  I did this after my initial upload, so I can monitor activity via cron job emails.</p>
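<p>If you would rather not keep the keys inside upload.sh and download.sh at all, one option is to put the two variables in a separate file that only you can read (chmod 600) and load it from each script.  This is a sketch of that idea, not something s3sync provides.  The file location ~/.s3sync-credentials and the function name load_credentials are hypothetical, and the file is assumed to contain two plain lines of the form AWS_ACCESS_KEY_ID=yourS3accesskey and AWS_SECRET_ACCESS_KEY=yourS3secretkey.</p>
<pre>
#!/bin/bash
# Sketch only: load the S3 keys from a separate private file.
# CRED_FILE and load_credentials are illustrative names, not part of s3sync.
CRED_FILE="${CRED_FILE:-$HOME/.s3sync-credentials}"

load_credentials() {
    # stop early if the file is missing or unreadable
    if [ ! -r "$1" ]; then
        echo "cannot read $1"
        return 1
    fi
    # read the KEY=value lines, then export them for s3sync.rb
    . "$1"
    export AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY
}

# In upload.sh this would replace the two export lines for the keys:
# load_credentials "$CRED_FILE" || exit 1
</pre>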
<p>Create the local upload and download directories and put some test files in the upload folder</p>
<pre>$ mkdir localuploadfolder
$ mkdir localdownloadfolder</pre>
<p>change the permissions on the files</p>
<pre>$ chmod 700 upload.sh
$ chmod 700 download.sh</pre>
<p>Test upload.sh</p>
<pre>$ ./upload.sh</pre>
<p>Use s3cmd.rb or Cockpit to make sure you can see the files made it to Amazon.</p>
<p>Test download.sh</p>
<pre>$ ./download.sh</pre>
<p>The files you uploaded to S3 should now be in your localdownloadfolder.</p>
<p>Once you are confident everything is working fine and you understand what you are doing, change the shell scripts to back up your actual folders. Run the scripts manually first to ensure everything is working properly. Remember, the upload script will be limited by the upload speed of your ISP, which can be very slow. With a typical cable internet upload speed of 384 kbit/s, it will take approx. 6 hours to upload 1GB. Download speeds are usually much faster, approx. 1GB/20 min, but hopefully you never need it.</p>
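<p>To estimate the upload time for your own connection, the arithmetic behind the 6 hour figure can be sketched as below.  It treats 384 k as 384 kilobits per second and 1GB as 1,000,000 kilobytes, and it ignores protocol overhead, so take the result as a rough lower bound.  The function name upload_hours is hypothetical.</p>
<pre>
#!/bin/bash
# Sketch only: rough upload time in hours.
# usage: upload_hours SIZE_IN_GB UPLOAD_SPEED_IN_KBIT_PER_SEC
upload_hours() {
    # GB to kilobytes, times 8 for kilobits, divided by speed gives seconds
    awk -v gb="$1" -v kbit="$2" 'BEGIN { printf "%.1f\n", gb * 1000000 * 8 / kbit / 3600 }'
}

upload_hours 1 384    # prints 5.8
</pre>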
<p><strong>STEP 4: set up cronjob to run backup script once a week/month etc.</strong></p>
<p>Once you are sure the script is working for your uploads, you can automate the task by creating a cron job to run once a week, day or month. I have it run once a week, because I do nightly backups locally to my Desktop machine using rsync.</p>
<pre>$ crontab -e</pre>
<p>add the following line.</p>
<pre>30 2 * * sun /path/to/upload.sh</pre>
<p>save and exit.</p>
<p>Obviously, monitor to make sure everything is working.</p>
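<p>One simple way to monitor is to have cron run a small wrapper that prints a one-line result, since cron can email a job&#8217;s output to you if mail is set up on the machine (an assumption about your system).  The following is a sketch; run_and_report is a hypothetical name.</p>
<pre>
#!/bin/bash
# Sketch only: run a command and print whether it succeeded.
# run_and_report is an illustrative name, not part of s3sync.
run_and_report() {
    "$@"
    status=$?
    if [ "$status" -eq 0 ]; then
        echo "backup OK: $*"
    else
        echo "backup FAILED (exit $status): $*"
    fi
    return "$status"
}

# In a wrapper script called from cron this would be:
# run_and_report /path/to/upload.sh
</pre>
<p>The wrapper passes the exit status through, so anything else that checks the result of the job still sees a failure as a failure.</p>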
<p><strong>STEP 5: kick back and relax</strong></p>
<p>Now you can relax. If your laptop battery explodes and burns down your house, you know your data is safe, sitting on Amazon&#8217;s geo-redundant servers right between some bits describing a new book from Oprah and a bad review of the latest Ben Affleck movie!</p>
<p>Feel free to leave a comment if you find this useful, incorrect, or just plain uninteresting.</p>
<p>UPDATE 1:  One additional step I took was to create an extra bucket, where I uploaded all the code/scripts necessary to restore my files using s3sync (minus my s3 information).</p>
<p>UPDATE 2: I have changed the chmod 755 to chmod 700 so the script is not readable by everyone.  (Credit Kelvin below).  I also updated the information about the tools I use.  I no longer use Cockpit to verify success; I mostly rely on the s3sync command line tools, which were not present at the time I wrote the original article.</p>
<p>UPDATE 3:  I never gave enough credit to the <a href="http://developer.amazonwebservices.com/connect/profile.jspa?userID=18616">actual author of s3sync.</a>  Without him, this entire process would not be possible, thanks again.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.eberly.org/2006/10/09/how-automate-your-backup-to-amazon-s3-using-s3sync/feed/</wfw:commentRss>
		<slash:comments>104</slash:comments>
		</item>
	</channel>
</rss>
