Jeremy Zawodny has an excellent article/discussion about the different tools currently available to take advantage of Amazon Simple Storage Service (S3). After testing many of the tools currently available for S3, I decided to use the Ruby program s3sync to back up my data to S3.
As I explained in an earlier post, I wanted a simple, low-level tool to perform automatic backups to S3. I decided to use s3sync to do the heavy lifting and the jets3t Cockpit GUI to monitor my S3 account. The following explains how I automated my backups to S3 using s3sync and Cockpit.
My server is running Ubuntu Dapper with a Samba server. All the machines in my house use a “public” drive on the Samba server to store all files from Windows and Linux. All of our important files, like photos, home movies, and documents, are stored on this “public” drive. This simplifies the backup procedure, since I don’t have to back up multiple sources.
The following steps describe how I back up my “public” drive to Amazon’s awesome S3 storage service. I decided to post this because I haven’t found a fairly “simple” guide to automating backups to S3 in a way that functions similarly to rsync on Linux. This is a follow-up to my original post on choosing a backup solution.
STEP 1: Activate an Amazon s3 account.
Go to http://www.amazon.com/s3 and sign up for an S3 web service account.
Have your Access Key ID and your Secret Access Key handy.
STEP 2: Install a management tool
(Update: I no longer use Cockpit. I now use the command line tools that come with s3sync, which were not available when I wrote the original article; see Option 1.)
Option 1: use the command line shell tools that are included with s3sync (my new preferred method)
Here is a sampling of the commands from the README for the command line tool s3cmd.rb, which can be used to create buckets and verify upload success or failure. If you use this option, make sure you have the correct version of Ruby installed on your system and that you have downloaded the s3sync package (see Step 3).
List all the buckets your account owns:
s3cmd.rb listbuckets
Create a new bucket:
s3cmd.rb createbucket BucketName
Delete an old bucket you don't want any more:
s3cmd.rb deletebucket BucketName
Find out what's in a bucket, 10 lines at a time:
s3cmd.rb list BucketName 10
Only look in a particular prefix:
s3cmd.rb list BucketName:startsWithThis
I plan to write a shell script to verify success of backup and run via cron job each night, but I haven’t done it yet. I will update here when I do.
Option 2 (original option that I used before s3sync command line shell tools were available)
UPDATE: I have had trouble getting this (or any other GUI) to work for folders containing large numbers of files. If you plan to have thousands of files stored at Amazon, I suggest Option 1.
Download a GUI tool and make sure you can log into your S3 account, create a bucket, add files, and delete them.
I have tried a lot of them, but I prefer jets3t Cockpit. It is written in Java and open source, and it can read objects uploaded to S3 by other tools. Some tools, like Jungle Disk (JD), create buckets and objects in a proprietary format. This means you would not be able to use JD to see files uploaded to S3 by other tools.
Here is a screenshot of Cockpit.

Create a bucket to store your backups in. Make sure to give your bucket a unique name, because bucket names have to be unique across all users of S3. Many people recommend using your S3 Access Key ID as a prefix, for example fakeaccesskey1234.backups. For the rest of this article, I will assume the bucket name is “mybucket”.
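The naming convention above can be scripted so that every machine derives the same bucket name. This is only a sketch: the function name make_bucket_name is made up for illustration, and lowercasing the result is an assumption added here, not something the article or s3sync requires.

```bash
# Hypothetical helper: joins an Access Key ID and a suffix into a bucket name.
# Lowercasing is an assumption made here, not an S3 or s3sync requirement.
make_bucket_name() {
  printf '%s.%s\n' "$1" "$2" | tr '[:upper:]' '[:lower:]'
}
```

With the tools from Option 1, this could be used as: s3cmd.rb createbucket "$(make_bucket_name "$AWS_ACCESS_KEY_ID" backups)"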
Cockpit will be a handy tool for you to monitor your backups in S3, but the actual file uploading/downloading will be done with a shell script using s3sync.
STEP 3: Install s3sync (ruby)
s3sync is an open source Ruby script that acts similarly to rsync, the Linux file sync program. Remember to read the README file from s3sync. Also, all the normal warnings apply: test this on a couple of folders and files you don’t care about, and make sure you understand what you are doing. Put the source and destination in the wrong order while using the --delete option and you could blow away all of your precious data.
Let’s move on.
The following apply to a Debian/Ubuntu based distribution, but could easily be adapted to your own distro.
First, make sure you have Ruby 1.8.4 or greater and the SSL library for Ruby:
$ sudo apt-get install ruby libopenssl-ruby
check ruby version
$ ruby -v
ruby 1.8.4 (2005-12-24) [i486-linux]
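If the setup is being scripted, the version check can be automated instead of read by eye. The following is a minimal sketch, assuming plain dotted numeric versions such as 1.8.4; version_at_least is a hypothetical helper name, not something shipped with s3sync or Ruby.

```bash
# Hypothetical helper (not part of s3sync): succeeds when version $1 >= version $2.
# Compares up to three dot-separated numeric components, treating missing ones as 0.
version_at_least() {
  local have="$1" need="$2" h n i
  for i in 1 2 3; do
    h=${have%%.*}
    n=${need%%.*}
    if [ "${h:-0}" -gt "${n:-0}" ]; then return 0; fi
    if [ "${h:-0}" -lt "${n:-0}" ]; then return 1; fi
    # drop the leading component; fall back to 0 when none remain
    case $have in *.*) have=${have#*.} ;; *) have=0 ;; esac
    case $need in *.*) need=${need#*.} ;; *) need=0 ;; esac
  done
  return 0
}

# Example use before running a sync:
# version_at_least "$(ruby -e 'print RUBY_VERSION')" 1.8.4 || echo "ruby is too old for s3sync"
```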
change into the directory where you want to install s3sync, like /home/john/s3sync
download and unpack s3sync
$ wget http://s3.amazonaws.com/ServEdge_pub/s3sync/s3sync.tar.gz
$ tar xvzf s3sync.tar.gz
clean up
$ rm s3sync.tar.gz
make directory for ssl certificates and download some (important, read README for info about these SSL certs)
$ mkdir certs
$ cd certs
$ wget http://mirbsd.mirsolutions.de/cvs.cgi/~checkout~/src/etc/ssl.certs.shar
run this shell archive
$ sh ssl.certs.shar
get back into main s3sync dir
$ cd ..
create two files with your favorite editor, upload.sh and download.sh with the following contents and update to suit your needs. (Important, like rsync, slashes matter, see README for examples)
upload.sh —————————————-
#!/bin/bash
# script to upload local directory up to s3
cd /path/to/yourshellscript/
export AWS_ACCESS_KEY_ID=yourS3accesskey
export AWS_SECRET_ACCESS_KEY=yourS3secretkey
export SSL_CERT_DIR=/your/path/to/s3sync/certs
ruby s3sync.rb -r --ssl --delete /home/john/localuploadfolder/ mybucket:/remotefolder
# copy and modify line above for each additional folder to be synced
download.sh —————————————-
#!/bin/bash
# script to download remote folder from s3 to local directory
cd /path/to/yourshellscript/
export AWS_ACCESS_KEY_ID=yourS3accesskey
export AWS_SECRET_ACCESS_KEY=yourS3secretkey
export SSL_CERT_DIR=/your/path/to/s3sync/certs
ruby s3sync.rb -r --ssl --delete mybucket:/remotefolder/ /home/john/localdownloadfolder
# copy and modify line above for each additional folder to be synced
NOTICE: These scripts use the --delete option. This means s3sync will delete any file on the destination that is not on the source. Also, these shell scripts contain your Amazon secret info, so you will want to make sure they are only readable by you (chmod 700, credit Kelvin below). You can also add the “-v” option to get a verbose account of the changes. I did this after my initial upload, so I can monitor activity via cron job emails.
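Because --delete removes anything on the destination that is missing from the source, an empty or unmounted source folder could wipe the copy on S3. One possible guard is sketched below; source_is_populated is a hypothetical helper, not an s3sync feature, and it only checks that the directory exists and has at least one entry.

```bash
# Hypothetical guard (not part of s3sync): succeeds only when the directory
# exists and contains at least one entry, dotfiles included.
source_is_populated() {
  [ -d "$1" ] && [ -n "$(ls -A "$1")" ]
}

# Example use inside upload.sh, before the s3sync line:
# source_is_populated /home/john/localuploadfolder/ || { echo "source empty, skipping sync" >&2; exit 1; }
```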
Create the local upload and download directories and put some test files in the upload folder
$ mkdir localuploadfolder
$ mkdir localdownloadfolder
change the permissions on the files
$ chmod 700 upload.sh
$ chmod 700 download.sh
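Since the scripts hold the secret key, it can be worth confirming that the permissions really are owner-only. A small sketch, assuming the usual ls -l mode string layout; is_owner_only is a made-up name for illustration.

```bash
# Hypothetical check (not part of s3sync): succeeds when group and others
# have no permissions on the file, i.e. the mode string looks like -rwx------.
is_owner_only() {
  [ "$(ls -ld "$1" | cut -c5-10)" = "------" ]
}

# Example: is_owner_only upload.sh || echo "upload.sh is readable by others" >&2
```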
Test upload.sh
$ ./upload.sh
Use s3cmd.rb or Cockpit to make sure you can see the files made it to Amazon.
Test download.sh
$ ./download.sh
The files you uploaded to S3 should now be in your localdownloadfolder.
Once you are confident everything is working fine and you understand what you are doing, change the shell scripts to back up your actual folders. Run the scripts manually first to ensure everything is working properly. Remember, the upload will be limited by the upload speed of your ISP, which can be very slow. With a typical cable internet upload speed of 384 kbps, it will take approximately 6 hours to upload 1GB. Download speeds are usually much faster, approximately 20 minutes per GB, but hopefully you will never need it.
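The upload estimate is simple arithmetic and can be reproduced for other sizes and line speeds. This sketch assumes decimal units (1 GB = 8,000,000 kilobits) and ignores protocol overhead, so real transfers will take somewhat longer; transfer_hours is a hypothetical helper name.

```bash
# Hypothetical helper: hours needed to move SIZE_GB at RATE_KBPS.
# Assumes decimal units (1 GB = 8,000,000 kilobits) and no protocol overhead.
transfer_hours() {
  awk -v gb="$1" -v kbps="$2" 'BEGIN { printf "%.1f\n", gb * 8000000 / kbps / 3600 }'
}

# Example: transfer_hours 1 384   (size in GB, upload speed in kbps)
```

For 1 GB at 384 kbps this prints 5.8, in line with the roughly 6 hours quoted above once overhead is added.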
STEP 4: set up cronjob to run backup script once a week/month etc.
Once you are sure the script is working for your uploads, you can automate the task by creating a cron job to run once a week, day or month. I have it run once a week, because I do nightly backups locally to my Desktop machine using rsync.
$ crontab -e
add the following line.
30 2 * * sun /path/to/upload.sh
save and exit.
Obviously, monitor to make sure everything is working.
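One way to make that monitoring easier is to have cron call a wrapper that records when each run started, what it printed, and how it exited. This is a sketch under assumptions: run_logged and the log path are hypothetical, and nothing here is provided by s3sync.

```bash
# Hypothetical wrapper (not part of s3sync): runs a command, appends its output
# and a timestamped START/END record to a log file, and returns its exit status.
run_logged() {
  local log_file="$1" status=0
  shift
  echo "$(date '+%Y-%m-%d %H:%M:%S') START $*" >> "$log_file"
  "$@" >> "$log_file" 2>&1 || status=$?
  echo "$(date '+%Y-%m-%d %H:%M:%S') END status=$status" >> "$log_file"
  return "$status"
}

# Example use in a small script called from cron (paths are placeholders):
# run_logged /path/to/backup.log /path/to/upload.sh
```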
STEP 5: kick back and relax
Now you can relax. If your laptop battery explodes and burns down your house, you know your data is safe, sitting on Amazon’s geo-redundant servers right between some bits describing a new book from Oprah and a bad review of the latest Ben Affleck movie!
Feel free to leave a comment if you find this useful, incorrect, or just plain uninteresting.
UPDATE 1: One additional step I took was to create another bucket, where I uploaded all the code and scripts needed to restore my files using s3sync (minus my S3 account information).
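Stripping the account information before uploading the scripts can be done mechanically. The sketch below assumes the keys appear as the two export lines used in upload.sh and download.sh above; scrub_credentials and the REPLACE_ME placeholder are hypothetical names chosen for illustration.

```bash
# Hypothetical filter (not part of s3sync): reads a script on stdin and blanks
# the values of the two AWS key exports, leaving every other line untouched.
scrub_credentials() {
  sed -E 's/^(export (AWS_ACCESS_KEY_ID|AWS_SECRET_ACCESS_KEY)=).*/\1REPLACE_ME/'
}

# Example: scrub_credentials < upload.sh > upload.public.sh
```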
UPDATE 2: I have changed the chmod 755 to chmod 700 so the scripts are not readable by everyone (credit Kelvin below). I also updated the information about the tools I use. I no longer use Cockpit to verify success; I mostly rely on the s3sync command line tools that were not present when I wrote the original article.
UPDATE 3: I never gave enough credit to the actual author of s3sync. Without him, this entire process would not be possible, thanks again.



mark | 11-Oct-06 at 9:28 am | Permalink
Is there a way with s3sync to exclude certain files/folders like there is with rsync? I want to keep the cache directories on my web server from being backed up…there are thousands of files in each one that change every day.
john.eberly | 11-Oct-06 at 9:32 am | Permalink
Mark, good question, I was wondering that myself. Currently, I just specify each folder I want backed up in the parent directory, but exclude the folders I want to skip. However, this is not my preferred way either, because it means I would have to modify my script every time I add a new folder.
I will research this further and post my findings here.
Thanks, John
john.eberly | 11-Oct-06 at 10:27 am | Permalink
Mark, I tried the --exclude option with the current s3sync.rb and it does not recognize it. I also searched through the README and the Ruby code, but there is no mention of “exclude”, so I sent the developer an email asking if such a feature is planned.
john.eberly | 11-Oct-06 at 1:35 pm | Permalink
I got an email back from the author of s3sync stating that there is no --exclude option for directories yet, but he plans to implement it when he has time, unless someone else adds it first.
Clay Loveless | 13-Oct-06 at 3:11 am | Permalink
Thanks, John. Nice article — I just wish you’d posted it about 2 weeks ago before I’d spent a few frustrating hours trying to get s3sync working for me properly.
I appreciate the pointer to jets3t Cockpit; it looks like the perfect tool, and no doubt a better solution than the S3 Organizer extension for Firefox.
My question — why did you choose s3sync.rb over jets3t Synchronize?
john.eberly | 14-Oct-06 at 8:50 am | Permalink
Clay, I chose s3sync over jets3t for no really solid reason. I had started working on my solution using s3sync before I knew about jets3t. It looks like jets3t would work nicely as well, but s3sync is in Ruby. I reasoned that if I had to modify anything, I would rather try to learn more about Ruby than Java.
Chris K | 28-Oct-06 at 3:29 pm | Permalink
John,
Great article. I have been using Jungle Disk, but the performance is horrible. What kind of performance are you getting with your solution? (I realize it is somewhat directly related to your upstream bandwidth.)
John | 28-Oct-06 at 3:57 pm | Permalink
I tried Jungle Disk by mounting it over WebDAV, but I had nothing but problems: slowness, errors, crashes, etc. I wanted a pure, low-level command line tool, and s3sync has worked well for me.
It usually takes me 3 hours/GB for the initial upload, and 15 minutes/GB down (but I very rarely pull data down). I have upgraded my Comcast plan to 8 Mbps down and 768 kbps up.
gunfus | 05-Nov-06 at 3:36 pm | Permalink
John, nice write-up. I am looking for solutions to back up to S3, but I didn’t find much that was interesting. This one is particularly well detailed and uses Linux.
One question I do have to ask is about your nightly backups to your desktop. If I understood correctly:
1) You have the public folder on your Linux server, and from that you do backups to your desktop (Windows?)
2) Then weekly you do an s3sync?
What is the average monthly cost of the solution?
John Eberly | 05-Nov-06 at 4:02 pm | Permalink
gunfus,
I currently have the following at home
1. Linux Server running samba with /var/public/ shared as “public” drive (320GB)
2. Linux Desktop (250GB)
3. Windows Desktop
4. Windows Laptop
5. Dual boot Linux/Windows laptop
All machines have the “public” drive mounted or mapped and this is where all data is saved. I do not backup any local folders on the individual machines.
Nightly, I use rsync to push changes on /var/public to my linux desktop in /backups/. Weekly, I run the s3sync to copy all changes on /var/public to S3.
Currently, I have about 10GB on S3 and probably upload less than 1GB per week. It currently costs me about $2-3 per week for storage on S3.
Honestly, the next thing I would like to do is try to reduce the energy I use by leaving the desktop machine and server on all night.
gunfus | 06-Nov-06 at 5:03 am | Permalink
John,
Thanks for that info. If I may ask, why do a backup to your desktop and then to S3? Why not go directly to S3?
I am asking because I am rethinking my whole backup solution and how my computers are configured (in view of some recent loss of data).
This time I am definitely going to upload to S3, and maybe every 6 months or a year I will do a DVD backup.
John Eberly | 06-Nov-06 at 11:01 am | Permalink
gunfus,
My weekly backups do go directly to s3, I just do the nightly backups locally.
My current reasons for backing up in order of priority:
1. accidental deletion (backup locally/s3)
2. hardware failure (backup locally)
3. keep multiple versions (backup locally)
4. fire etc, destroys my house. (backup s3)
I didn’t mention this earlier to keep it simple, but I have data on the “public” drive that I don’t backup to s3 like GBs of disk images, music etc (non-critical/could be lost in fire).
The reason I back up to a local machine instead of to S3 is that I back up the entire public drive (100+ GB) locally every night, but only the data I deem critical in case of fire, etc. gets backed up to S3 weekly, directly from my server. This way, if someone accidentally deletes all their files and the daily backup job runs, I would still have a copy of their files in the weekly S3 backup. If my house burns down I would lose my disk images and music, something I can live with to avoid additional off-site storage/transfer costs.
There are thousands of ways you can choose to back up your data; just remember to take accidental deletion into account, because if you use the --delete option on s3sync it will automatically delete all files on S3 that do not exist locally.
Ideally, if S3 were free and I had a much faster upload speed on my internet connection, I would make daily, weekly, monthly, and yearly backups to S3. But currently, I choose to make my daily and monthly backups locally and send my weekly backups to S3.
For what it is worth here are my personal opinions for a home backup solution:
- Write down your plan before trying to set it up.
- Keep it simple, try to get all the data in one location/directory that needs to be backed up. This is obviously not necessary, but I have found having data spread all over quickly gets confusing and you will forget what is backed up and what is not.
- Automate it, otherwise it will never get done. This is why I wanted to avoid DVDs entirely.
- Test a restore on a separate machine.
- Keep weekly, monthly, and possibly yearly copies of your data, not just nightly copies.
- Monitor your backups to make sure your process is working.
I hope this response is not too confusing, just let me know if you still have questions. Basically, any backup solution should be tailored to your needs and risks. Like I said earlier, next I want to analyze how much electricity I am using to keep two machines running all night. If I decide that the costs of running the second machine just for local backups matches or exceeds the additional cost to backup to s3, I will consider switching all my backups to s3. Your post has prompted me to start looking at that again.
Thanks for your posts gunfus.
gunfus | 06-Nov-06 at 8:54 pm | Permalink
Hey, thanks for the wealth of advice. Do you want to look at it together? Maybe we can come up with some awesome website that talks about what we did and why we did it.
I tend to be a gregarious person. Anyway, yes, I like your advice about writing down the backup strategy and automating it. Previously I had no write-up of the backup strategy; I just kind of came up with it on the fly, had a lot of files spread around, and never really tried to restore anything. It wasn’t fully automated with the DVD backups either: I manually had to copy the files every once in a while and burn the .zip files.
This time around, with all the information and options out there, and my data getting out of hand (with me and my wife taking pictures like mad), I decided to write down a layout of the network (because that is getting complicated as well). With that, I am now deciding how the backup strategy will work, and then I will implement it. After all, this is how I learned to develop software, and I learned it the hard way.
Just today, during work hours, I downloaded jets3t Cockpit to try it out, and I am still setting up my server. The latest Ubuntu 6.10 server version seems to have some glitches that I am trying to resolve.
John Eberly | 08-Nov-06 at 9:40 pm | Permalink
gunfus, glad you found some of my advice useful.
I am also revisiting my backup strategy to possibly move more of it to S3.
As more and more people put more of their life in digital format, there will be an increasing value in backups so we don’t lose any of our history.
Keep me informed of your progress by posting here. I am always interested in other people’s solutions and by posting your results, it gives other readers multiple perspectives.
Giles Thomas | 13-Nov-06 at 6:54 pm | Permalink
John,
Many thanks for the tips - I’m currently setting up s3sync to run on my NSLU2 (a $100 NAS device from Linksys) - first successful sync today! - and this post helped a lot
(Here’s the first of the blog posts I made as I went along, in case you or your readers are interested: http://www.gilesthomas.com/?p=5)
Cheers,
Giles
gunfus | 21-Nov-06 at 5:47 am | Permalink
Hi John,
I really do appreciate your help. I have documented my solution on my website. I haven’t quite finished because I am still in testing mode, but the solution is there.
For anyone using ubuntu and wanting to use a full java solution please visit: http://larinaandangel.gunfus.com and expand the linux section.
Phil | 03-Dec-06 at 5:20 am | Permalink
Thanks for the howto. I thought I would give it a try on my MacBook running OS X 10.4.8.
Unfortunately I’m unable to install libopenssl-ruby, since it isn’t available through Fink (the package management system for OS X).
I tried to install a Debian package from packages.debian.org/…/libopenssl-ruby but got an error message using
sudo dpkg -i libopenssl-ruby1.8_1.8.2-7sarge4_i386.deb: “package architecture (i386) does not match system (darwin-i386)”
‘Doing the google’ for libopenssl-ruby doesn’t come up with a solution. Any ideas what to do to get your setup working on Mac OS X?
gunfus | 11-Dec-06 at 11:13 am | Permalink
Phil, did you try using the full Java solution instead of Ruby? Just use the sync program that comes with jets3t.
John Eberly | 11-Dec-06 at 1:02 pm | Permalink
Sorry for the late response, Phil. I am afraid I can’t be of much help with your Mac issue. But I also agree with gunfus; you might want to look at the jets3t Java solution. It seems to do the same thing, and gunfus has a good post about how he used jets3t (see comment 25 for the link to his site).
tecosystems » Friday Grab Bag From Frigid Denver | 12-Jan-07 at 5:44 pm | Permalink
[…] Speaking of personal backups, I’ve finally settled on S3 as the will-be solution to my backup issues. I’ll maintain local copies for the sake of convenience, but given the fact that my music collection is - apart from my apartment - my most valuable material possession (I’m pretty sure it’s worth more than my car), I need offsite backups and S3 is the solution of choice. The problem is a.) the size of my music collection (50 GBs+; small by some standards, but large enough to be a problem), and b.) my absurdly slow upload cap (768, I think). Forgetting the math, because of spikes in upload capacity, it’s going to take days for the collection to upload. During that time, my local bandwidth will be negatively impacted, so the current plan is to initiate the sync to S3 shortly before I head to Boston next week for Mashup Camp. The upload client I’ve selected - it’s a Windows box, so this solution would require too much overhead - is JungleDisk. Anybody happen to know how it will behave if the upload is terminated prematurely? […]
tecosystems » The RedMonk IT Report: S3/ZRM for Backup | 13-Jan-07 at 2:12 pm | Permalink
[…] Next, I needed to establish an automated backup of both the webroot and our backed up MySQL databases to our predetermined offsite provider, Amazon’s S3. To do so, I followed these simple instructions. The author, John Eberly, walks you through the installation of a Ruby based rsync clone, s3sync, the creation of a simple bash script that will execute that script, and the scheduling of that job. While the notes are excellent and quite complete, a couple of issues/clarifications: […]
Kelvin Nicholson | 23-Jan-07 at 7:18 pm | Permalink
Nice writeup — very clear. My only question: you mentioned making the upload/download scripts readable “by only you,” (since you are putting your access keys inside them). But you did a chmod 755 *.sh — should you do a chmod 700 *.sh? (or maybe a 711?)
Just my .02 — great tips either way!
John Eberly | 01-Feb-07 at 9:18 am | Permalink
Thanks Kelvin,
You are correct, I should have used chmod 700 instead of 755. I have updated this posting to reflect your advice.
Luis Villa | 17-Feb-07 at 7:48 am | Permalink
Just a minor point, John: your download.sh (as posted) won’t work because you’re missing a # in front of “script to sync local directory upto s3”.
John Eberly | 17-Feb-07 at 10:33 am | Permalink
Thanks Luis, I have updated my post to reflect the changes.
David Dorgan ’s Weblog » 26-03-2007: S3 and some tools put to the test… | 26-Mar-07 at 3:06 pm | Permalink
[…] Oh well, there is amazon and s3 python libraries. One last pain with this is that on the default debian 3.1 it doesn’t work with the version of ruby installed which is 1.8.2 but it needs 1.8.4 or greater… In case you are interested in setting it up for backups, there’s a great post on automating backups using s3 and s3sync , enjoy. […]
Jon | 13-Apr-07 at 1:49 am | Permalink
John,
I want to setup a system similar to yours. After hours of research it appears that I differ slightly in my needs.
Instead of syncing the data from the server to S3, I want to set up WebDrive to map a drive that will point to an external server as well as communicate with S3. I know Jungle Disk can accomplish this (it can map its own drive). However, I have chosen not to use that tool because of its proprietary encryption format.
So any pointers, how can I use webdrive to get data to s3?
Also what did you use to map all your local data to store it on your public drive on the samba box?
Thanks for the assistance!
Jon
John Eberly | 13-Apr-07 at 9:34 am | Permalink
Jon, If you want an alternative to JungleDisk that maps to a windows drive letter, you might want to look at http://www.s3drive.net/ I have not tried it, but it looks like it does what you are looking for.
There is also s3fox, a firefox extension for accessing your s3 data.
For my “public” drive, I have a linux server running samba where I created a folder to store all of my data. All of my windows machines then are able to map the drive like any normal windows network share.
I am not sure if I answered your questions or not, just let me know.
John
Jon | 13-Apr-07 at 2:10 pm | Permalink
John,
So to map your public drive on your local machine do you simply use the built in ability within xp, or are you using another piece of software?
Thank You
Jon
John Eberly | 13-Apr-07 at 2:23 pm | Permalink
Yes, that is the benefit to setting up a samba server on linux. It allows windows machines to share files without any special software, just use windows explorer and tools->map network drive after you have configured samba on the linux server.
You can install basically any linux distro and samba or you could use some special purpose distro specially built for setting up a file server like the following:
www.openfiler.com
clarkconnect
freenas
Jon | 13-Apr-07 at 2:32 pm | Permalink
And because you are going direct from your linux box to s3 you don’t have to deal with slow uploads (for ex comcast) that make products like Jungle Disk less appealing.
John Eberly | 13-Apr-07 at 2:43 pm | Permalink
Well actually, I still have to worry about my upload speed of my ISP (comcast) because my linux server is located at home. I upgraded my comcast plan which doubled my upload speed (to 768kbps), which helped. But it still took a couple of days to get the initial upload up to S3, but now it is much faster because I am only uploading the changes. I also tried jungle disk on both windows and linux awhile back, but had nothing but problems. I prefer the set up I have now and it has been working perfectly for me.
Kelvin Nicholson | 13-Apr-07 at 7:36 pm | Permalink
Jon and John:
J#2: Thanks again for the write-up — I’m using this on my VPS to backup; I do a sync without the delete flag every day, then weekly do a sync with the delete flag. A cheap man’s backup.
Considering grabbing the python code and playing around with it, as I know python quite well, but know no ruby.
J#1: If you haven’t already implemented a solution, the first thing that jumped into my head was using S3 with FUSE. In short, you get to have a mount that acts like a mounted drive, but the “drive” is S3. This thread on the Amazon Forum should help you out:
http://developer.amazonwebservices.com/connect/thread.jspa?threadID=10271&start=0&tstart=0
John Eberly | 13-Apr-07 at 8:53 pm | Permalink
Kelvin,
That is a good idea with the delete flag only every week. I think I will switch my backups to using the delete flag only once a month, in case I totally mess something up, I have a longer period to recognize it.
John Eberly | 16-Apr-07 at 11:16 pm | Permalink
Kelvin,
I had been following the s3 fuse forum thread for awhile, but I kind of lost track of the progress. I think the ultimate solution would be mounting s3 using fuse.
Have you successfully done this with fuse?
Avinash Meetoo: Blog » Blog Archive » Synchronizing two computers using Amazon S3 | 18-Apr-07 at 12:55 pm | Permalink
[…] This is extremely cheap for the peace of mind you can enjoy when you know “your data is safe sitting on Amazon’s geo-redundant servers right between some bits describing a new book from Oprah and a […]
Colin Nederkoorn | 25-Apr-07 at 3:08 pm | Permalink
I have been using s3sync.rb to back up a large number of files. It seems to me that for every file, on every backup, s3sync checks whether it is newer than the remote file. This is extremely expensive in time and bandwidth. I’m taking a look at duplicity now because, unfortunately, for larger backups s3sync.rb seems to create a lot of waste. Am I correct?
Kelvin Nicholson | 05-May-07 at 11:00 pm | Permalink
@John: I haven’t tried using FUSE+S3 quite yet, as the need hasn’t arisen.
For others:
S3 just got a tad cheaper:
Current bandwidth price (through May 31, 2007)
$0.20 / GB - uploaded
$0.20 / GB - downloaded
New bandwidth price (effective June 1, 2007)
$0.10 per GB - all data uploaded
$0.18 per GB - first 10 TB / month data downloaded
$0.16 per GB - next 40 TB / month data downloaded
$0.13 per GB - data downloaded / month over 50 TB
Data transferred between Amazon S3 and Amazon EC2 will remain free of charge
New request-based price (effective June 1, 2007)
$0.01 per 1,000 PUT or LIST requests
$0.01 per 10,000 GET and all other requests*
* No charge for delete requests
Kelvin Nicholson | 23-May-07 at 11:47 pm | Permalink
Colin:
Yes, I believe you are correct. I too am looking for more optimized solutions. In the end I broke up my upload.sh file into several different files, depending on how often I need to do backups. For instance, I might have upload-mysql.sh upload every day, but upload-home.sh upload once/week.
knolleary » Blog Archive » S3, EC2, SQS - The AWS Triumvirate | 05-Jun-07 at 2:33 pm | Permalink
[…] tool I looked at was the ruby-based s3sync. Following some instructions google found for me on John Eberly’s blog, I set about creating a storage bucket and began uploading the 1.5Gb of photos from my main […]
Brian | 30-Jul-07 at 5:49 am | Permalink
I’d love to hear how people compare using S3 for backups vs. Mozy. On a price point alone, the 200GB or so I would want to backup would cost me $30+ using S3 but only $6 using Mozy. Also, I don’t have to do anything myself other than tell Mozy what to backup. No scripting.
Thoughts?
John Eberly | 30-Jul-07 at 6:25 am | Permalink
Brian,
I think it kind of depends on your situation. For me, I was looking for a solution where I could back up directly from a Linux Server. Also, I only needed between 10-20GB of storage, so s3 seemed like a better solution for me.
But for the average person with a Mac or PC, mozy seems like a very good option as well. Does Mozy support networked drives yet?
Currently, I would like to switch from using s3sync to duplicity when I get a chance. As Colin said above, s3sync has to make a remote call for every file which is expensive in bandwidth and time. I plan to write a post once I have it successfully working.
Unatine :: blog : links for 2007-07-30 | 30-Jul-07 at 5:32 pm | Permalink
[…] How I automated my backups to Amazon S3 using s3sync. Tags: none July 31, 2007, at 4:30 — links — BY-NC-SA […]
Link With Reality Web Log » links for 2007-07-31 | 30-Jul-07 at 9:17 pm | Permalink
[…] How I automated my backups to Amazon S3 using s3sync Write-up of using s3sync to back up a server to Amazon S3. (tags: administration howto article aws) […]
Michael Gorsuch, Timid Iconoclast » links for 2007-07-31 | 30-Jul-07 at 9:31 pm | Permalink
[…] How I automated my backups to Amazon S3 using s3sync. Excellent step-by-step guide to ensuring that your data is safe. Considering that all of my data is on a VPS at Linode, this is something to look into. It certainly looks better than my current rsync strategy. (tags: s3 ruby sysadmin) […]
Daniel Grüßing | 11-Aug-07 at 7:24 am | Permalink
Very cool Howto! Thank you very much!
Brilliant work! =)
Amazon S3 Storage Tools | Vinod Live! | 19-Aug-07 at 12:03 pm | Permalink
[…] directory and an S3 bucket:prefix. It behaves somewhat, but not precisely, like the rsync program. John Eberly has an efficient way of using S3Sync with Jets3t. […]
Tech Messages | 2007-09-11 | Slaptijack | 11-Sep-07 at 5:32 pm | Permalink
[…] How I automated my backups to Amazon S3 using s3sync. | John Eberly - I’m seriously considering moving all my backups to Amazon S3. Do you have any experience with the service? […]
taylor@unwirednation | 08-Oct-07 at 3:34 pm | Permalink
Phil,
Try using MacPorts to get the libopenssl package installed… I would use MacPorts for the entire Ruby setup as well.
Nelson’s Backups | 13-Oct-07 at 7:27 am | Permalink
[…] storing some (if not all) of his important files out on Amazon’s S3. There is even a great little ruby app that makes this super easy. Typical Debian operating […]
links for 2007-11-13 | 13-Nov-07 at 1:33 am | Permalink
[…] How I automated my backups to Amazon S3 using s3sync. | John Eberly Jeremy Zawodny has an excellent article/discussion about the different tools currently available to take advantage of Amazon simple storage service (S3). After testing many tools available for S3 currently, I decided to use the ruby program s3sync to back (tags: article automation blog code command computer data filesystem guide hack hacks hosting howto imported linux mac macosx network online osx programming rails reference rubyonrails scripting server services software startup storage sysadmin Tech tool tools tutorial tutorials ubuntu Web2.0 webdev webservices windows work sync Ruby amazon backup s3) […]
Synchronizing two computers using Amazon S3 | RelatedSeek.com | 09-Dec-07 at 2:28 am | Permalink
[…] This is extremely cheap for the peace of mind you can enjoy when you know “your data is safe sitting on Amazon’s geo-redundant servers right between some bits describing a new book from Oprah […]
On Amazon S3 and competitive advantage « by jan | 23-Dec-07 at 4:15 am | Permalink
[…] can find many tutorials on how to use s3sync to do the backups as well. All that is very easy. Too easy actually. […]
MG | 14-Jan-08 at 2:49 am | Permalink
I wrote a rake script that uses this idea to commit/update files for a Rails project.
The certificate and library are installed automatically; you only need to specify the keys and what to sync, and you’re done :>
rake s3commit / s3update
PS: please mention it above if you like it
Rafael Lima » Configurando sistema de backup do banco de dados MySQL no Amazon S3 em 10 minutos | 19-Mar-08 at 10:24 pm | Permalink
[…] Para enviar o backup realizado para um conta no Amazon S3, que é o web service de storage da Amazon, siga as instruções abaixo que foram retiradas deste link. […]
Remi | 21-Mar-08 at 1:50 am | Permalink
Thanks for this awesome tool. It appears that Amazon is timing out the connection, and that s3sync isn’t successful in re-establishing it.
addady | 22-Mar-08 at 11:50 pm | Permalink
Every time a local file changes, s3sync uploads the whole file again, not just the part that changed. That means that if you are using s3sync for regular backups, you are wasting bandwidth.
Most of the daily change in user data comes from existing files being updated, not from new files.
You can bypass this limitation using rsync and a third-party gateway like: http://www.s3rsync.com/
Resources and Tools for Amazon Services « mindstorms | 26-Mar-08 at 5:08 pm | Permalink
[…] How I automated my backups to Amazon S3 using s3sync […]
How To: Bulletproof Server Backups with Amazon S3 - PaulStamatiou.com | 29-Mar-08 at 9:25 pm | Permalink
[…] the most part, I took the advice of John Eberly in his automated S3 backups article. However, I did several things differently so I thought I would show what I did in an […]
David Laing’s blog » Blog Archive » S3 Backup options: s3sync & duplicity | 10-May-08 at 5:02 am | Permalink
[…] HOWTO; http://blog.eberly.org/2006/10/09/how-automate-your-backup-to-amazon-s3-… […]