General Category => Questions => Topic started by: A.Skwar on November 02, 2007, 06:54:51 AM

Title: Parallel sync
Post by: A.Skwar on November 02, 2007, 06:54:51 AM

I noticed that uploading stuff to S3 is somewhat slow. In a self-developed Python program, I upload the contents in many parallel threads. This speeds up the whole upload process A LOT.

But uploading data with s3sync is very slow, as it uploads only one file at a time.

Would it be possible to enhance a future version of s3sync so that it uploads in many threads at once? This would make the sync process so much faster.

Thanks a lot,

Title: Re: Parallel sync
Post by: ferrix on November 02, 2007, 05:48:44 PM
I have nothing against that idea.  It should be possible to get a fixed number of worker threads going instead of having the comparator wait on one thing at a time.
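
For anyone wondering what "a fixed number of worker threads" could look like, here is a minimal Python sketch of the idea. It is not s3sync code. `upload_one` is a hypothetical placeholder for whatever uploads a single file, and the default of 8 workers is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def upload_one(path):
    """Hypothetical placeholder: replace with a real single-file upload."""
    return path


def upload_all(paths, upload=upload_one, workers=8):
    """Run `upload` on every path using a fixed-size pool of worker threads.

    Returns (succeeded, failed), where failed maps path -> exception.
    """
    succeeded, failed = [], {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Submit everything up front; the pool never runs more than
        # `workers` uploads at the same time.
        futures = {pool.submit(upload, p): p for p in paths}
        for fut in as_completed(futures):
            path = futures[fut]
            try:
                fut.result()
                succeeded.append(path)
            except Exception as exc:
                # One failed file should not abort the whole sync.
                failed[path] = exc
    return succeeded, failed
```

The comparator would then hand paths to the pool instead of waiting on each upload in turn, and anything left in `failed` can be retried afterwards.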

Title: Re: Parallel sync
Post by: grig on December 11, 2008, 11:53:52 PM
I have waited 10 days for my folder to transfer with s3sync

* S3 seems to cap the upload speed of a single file (found after googling for "s3 slow")
* A second s3sync process started on the same directory tries to sync the same file the first one is working on, so running parallel processes on a single directory doesn't work as-is
* If there were an INCLUDE option for s3sync, I could specify a few regexps and break the sync up
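
To illustrate the last point, a rough Python sketch of breaking one directory up with a few regexps, so that each group could be handed to a separate process without overlap. The function name and the example patterns are made up for illustration; nothing here is an existing s3sync option.

```python
import re


def partition(names, patterns):
    """Assign each name to the first pattern that matches it.

    Returns (buckets, leftover): one list per pattern, plus the names
    that matched no pattern, so no file is dropped silently.
    """
    compiled = [re.compile(p) for p in patterns]
    buckets = [[] for _ in compiled]
    leftover = []
    for name in names:
        for i, rx in enumerate(compiled):
            if rx.search(name):
                buckets[i].append(name)
                break  # first match wins, so buckets never overlap
        else:
            leftover.append(name)
    return buckets, leftover
```

For example, the patterns `^[A-Ma-m]` and `^[N-Zn-z]` would split the file names into two alphabetical halves, with names starting with digits landing in the leftover group.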

I am now shifting my dev time (away from my actual task at hand) to making this faster by writing a script that uses s3cmd and my own file tree compare
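
A rough idea of what such a file tree compare might look like in Python. This is only a sketch under assumptions: `remote` is a hypothetical dict mapping each key already in the bucket to its MD5 hex digest, and how that dict gets filled (for instance by parsing a bucket listing) is left out.

```python
import hashlib
import os


def local_md5s(root):
    """Map each file path relative to root ('/' separators) to its MD5 hex digest."""
    result = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            digest = hashlib.md5()
            with open(full, "rb") as fh:
                # Read in 1 MiB chunks so large FLAC files are not loaded whole.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            result[rel] = digest.hexdigest()
    return result


def to_upload(local, remote):
    """Return the sorted keys that are missing remotely or whose checksum differs."""
    return sorted(k for k, md5 in local.items() if remote.get(k) != md5)
```

Only the keys returned by `to_upload` would then need to be sent, which matters when re-running a sync over thousands of files.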

P.S. I'm transferring about 6,000 FLAC files from a single directory on a remote Fedora 4 box