Author Topic: s3sync speed vs rsync  (Read 6088 times)
Posts: 1

« on: April 24, 2008, 03:32:08 PM »

when syncing an up-to-date mirror, s3sync is taking about 10 minutes to do what psync/rsync did in < 5 minutes, and uses a lot more cpu over the period (somewhat subjective, but it seems to be sitting at 30% the whole time, whereas rsync just burst for a short period at the beginning while it built the file list).

i am assuming this is because s3sync can't pull a list of files at the beginning, so it spends a lot more time traversing the directories (in XML rather than ssh, no less). is this the case? has anyone else noticed comparably poor results when dealing with a lot of files?

(i used to have my whole user directory (~120k files .. crazy, but there we go..) syncing every few hours to strongspace with rsync with no issue, but if it's going to take half an hour each time and use 30% cpu while it works then it's not really viable, not to mention my cost may be in the get/put/list requests - will see once it balances out.)

amendment: rsync does seem to be taking a comparable amount of time, just not using any cpu, and it's only about 23k files (not sure where i got the 120k, maybe that was the whole drive).
« Last Edit: April 24, 2008, 03:45:43 PM by babelian » Logged
Sr. Member
Posts: 363

(I am greg13070 on AWS forum)

« Reply #1 on: April 24, 2008, 10:39:11 PM »

I'd guess the cpu is due to s3sync md5'ing every local file to check if it needs to be re-sync'd.  But just a best guess.  Otherwise, it's not really meaningful to compare the two.  They don't share any common code or architecture.
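A minimal sketch of the kind of check guessed at above, assuming the tool hashes every local file with MD5 and compares the result with a checksum known for the remote copy. The function names and the `remote_md5` mapping are hypothetical and only illustrate the idea; this is not s3sync's actual code.

```python
import hashlib
import os


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_needing_upload(root, remote_md5):
    """Yield relative paths whose local MD5 differs from the remote one.

    remote_md5 is a hypothetical dict mapping relative path -> hex MD5,
    standing in for whatever checksums the remote listing reports.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            # Every file is read in full here, changed or not.
            if file_md5(full) != remote_md5.get(rel):
                yield rel
```

With a check like this, every file is read and hashed on every run even when nothing has changed, which would be consistent with steady CPU use across the whole sync.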
Posts: 2

« Reply #2 on: May 29, 2008, 06:10:49 AM »


First, there is a known issue in the Ruby library. Take a look at this post: http://s3sync.net/forum/index.php?topic=191.0

Second, and more important: S3 is all or nothing, i.e. if you want to change a file you have to PUT the whole thing again.
With s3sync, every time a local file changes it uploads the whole new file, not just what has changed. That wastes bandwidth and takes more time.

If you want to benefit from rsync's bandwidth-efficient algorithm, you need to use your own Amazon EC2 machine or a
3rd-party gateway like: http://www.s3rsync.com/
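
To get a rough feel for the difference between a whole-file PUT and a delta transfer, here is a small sketch that compares two versions of a file block by block. It is a deliberate simplification and not rsync's real algorithm: blocks are only compared at the same offsets, whereas rsync uses a rolling checksum to find matching blocks at any offset. The names `block_hashes` and `transfer_cost` and the 4096-byte block size are assumptions made for illustration.

```python
import hashlib


def block_hashes(data, block_size):
    """Hash each fixed-size block of a byte string."""
    return [
        hashlib.md5(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def transfer_cost(old, new, block_size=4096):
    """Compare bytes sent by a whole-file PUT with a block-level delta.

    Simplification: blocks are compared at the same offsets only, so an
    insertion near the start marks every later block as changed.
    """
    old_hashes = block_hashes(old, block_size)
    new_hashes = block_hashes(new, block_size)
    delta = 0
    for index, digest in enumerate(new_hashes):
        # A block counts as changed if it is new or its hash differs.
        if index >= len(old_hashes) or old_hashes[index] != digest:
            start = index * block_size
            delta += len(new[start:start + block_size])
    return {"whole_file": len(new), "delta": delta}
```

For example, changing a single byte in the middle of a 1,000,000-byte file gives `whole_file` = 1000000 and `delta` = 4096 under these assumptions: the whole-file upload resends everything, while the block comparison would only need to send the one block that changed.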

« Last Edit: May 29, 2008, 06:47:36 AM by danm2 » Logged