S3Sync.net
Author Topic: resume upload  (Read 7714 times)
rondari (Newbie, Posts: 8)
« on: March 10, 2007, 09:12:58 PM »

S3 doesn't support resuming uploads, does it?

Bummer ... I bet lots of people want this. I guess I should post this on the S3 forums.

treed (Newbie, Posts: 3)
« Reply #1 on: March 26, 2007, 06:28:00 PM »

Resume would be VERY nice. For the last couple of weeks I have been trying to upload a series of digital video files to S3 that I really want backed up. Each is around 1 GB, and it takes about five attempts to fully upload each one. Usually something goes wrong (on the Amazon side, I presume) and the connection gets dropped. This is costing me a lot of money in bandwidth for the retries.

The only way to implement resume that I can think of (which would also solve the 5 GB problem) is for s3sync to split files bigger than, say, 100 MB into 100 MB chunks and then keep some metadata on which pieces go back together. I know this violates the goal of all data being downloadable and restorable without the aid of s3sync, but at worst you would have to cat the pieces back together, which most sysadmins can script up easily. If you name them sensibly (filename.avi-s3sync-1of5, or something like that as a wild guess) it shouldn't be that hard. It might also be a good idea to make the chunk size configurable.
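For what it's worth, a minimal sketch of that split/reassemble idea in Python (the 100 MB chunk size and the filename-s3sync-NofM naming are just the guesses above, not anything s3sync actually does):

```python
import os

CHUNK_SIZE = 100 * 1024 * 1024  # the hypothetical 100 MB chunk size from above

def split_file(path, chunk_size=CHUNK_SIZE):
    """Split `path` into chunks named like path-s3sync-1of5, path-s3sync-2of5, ..."""
    size = os.path.getsize(path)
    total = max(1, (size + chunk_size - 1) // chunk_size)
    names = []
    with open(path, "rb") as src:
        for i in range(1, total + 1):
            name = f"{path}-s3sync-{i}of{total}"
            with open(name, "wb") as dst:
                dst.write(src.read(chunk_size))
            names.append(name)
    return names  # each chunk is small enough to retry on its own

def join_chunks(names, out_path):
    """Reassemble the pieces; equivalent to catting them back together in order."""
    ordered = sorted(names, key=lambda n: int(n.rsplit("-s3sync-", 1)[1].split("of")[0]))
    with open(out_path, "wb") as dst:
        for name in ordered:
            with open(name, "rb") as src:
                dst.write(src.read())
```

The metadata mentioned above could be as simple as the NofM suffix itself, since it already encodes order and count.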
ferrix (Sr. Member, Posts: 363; greg13070 on the AWS forum)
« Reply #2 on: March 27, 2007, 09:05:23 AM »

Well ... you could always create a patch. :grin:

If you're upset about AWS charging you for failed uploads, why not complain to them? If they add the ability to resume, or some other way of modifying existing objects, I'll jump on it as fast as all the other devs, I'm sure!

The problem, as I see it, is that S3 offers this theoretical feature of large objects, but it comes with two enormous practical problems:
  • The +2GB upload bug
  • The fact that there's no reliable way to send a large object to them

I could break the 'sync' paradigm to hack an ugly patch around these problems. But I won't.
lowflyinghawk (Jr. Member, Posts: 52)
« Reply #3 on: March 28, 2007, 02:15:22 PM »

"Usually something goes wrong (on the amazon side I would presume) and the connection gets dropped. "

I don't see why you should assume it's on the aws side.  evidence?  even granting that you can reliably upload to some other site the network path could be quite different.
ferrix (Sr. Member, Posts: 363; greg13070 on the AWS forum)
« Reply #4 on: March 28, 2007, 06:02:45 PM »

Well, for example, if you get 500 responses or some other non-network-looking hiccup, it's easy to tell that it's on their side.

My larger point is that they don't provide any reliable way to do large uploads.  Resume, or some other solution, should be supported.
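As things stand, about the only client-side mitigation is to tell the two failure modes apart and retry the whole PUT, which is exactly the bandwidth cost being complained about above. A rough Python sketch of that (the URL, headers, and retry policy are illustrative, not s3sync code):

```python
import time
import urllib.error
import urllib.request

def put_with_retries(url, data, headers, attempts=5):
    """Retry a whole-object PUT on server-side (5xx) or network errors."""
    for attempt in range(1, attempts + 1):
        try:
            req = urllib.request.Request(url, data=data, headers=headers, method="PUT")
            with urllib.request.urlopen(req) as resp:
                return resp.status                 # 200: the object was stored
        except urllib.error.HTTPError as e:
            if e.code < 500:                       # 4xx: our fault, retrying won't help
                raise
            failure = f"server error {e.code}"     # 500s: their side
        except (urllib.error.URLError, ConnectionError) as e:
            failure = f"network error: {e}"        # drop/timeout: could be either side
        if attempt == attempts:
            raise RuntimeError(f"upload failed after {attempts} attempts ({failure})")
        time.sleep(2 ** attempt)                   # back off before resending everything
```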
lowflyinghawk (Jr. Member, Posts: 52)
« Reply #5 on: March 28, 2007, 07:07:02 PM »

Right, 500s are on their side, but people constantly complain about timeouts and drops, and with those it's far from clear-cut who is at fault.
ferrix (Sr. Member, Posts: 363; greg13070 on the AWS forum)
« Reply #6 on: March 29, 2007, 10:18:45 AM »

Yeah, so let's assume those kinds of things are inevitable. *Other* protocols that claim to support large uploads have some built-in way to mitigate the problem. S3, though, says "hey, we support this" but doesn't provide any tangible way to do it. That's not a feature, IMO.
lowflyinghawk (Jr. Member, Posts: 52)
« Reply #7 on: March 29, 2007, 07:38:30 PM »

Since S3 is really a name/value database, not a filesystem, the guarantee of atomicity would make resume very difficult. Think about writing over an existing key: you start transferring the bytes and lose the connection at some point. How would they know whether you just got tired of waiting or you wanted to resume? How long would they keep partials? What if you try to resume from two connections at once? The protocol would require clients to query the server and work out where to start, which doesn't fit very well with REST, hey?

I can understand how it works when I download with FTP: the file on the server end hasn't changed, so a smart client can tell the server, "hey, start at byte 233333 and give it to me again." But if the file has changed in the meantime, how does the client know that, and how would the server know it had changed since the client last asked? If the client dies, FTP doesn't keep any state, so the client and server have to be making some assumptions for the resume to happen. And rsync doesn't allow resume at all, right? It just starts in on the diffs again and resends what it has to.

In a nutshell, if they claim to support resume (I haven't seen the claim), I don't believe them. Where is the explanation of the protocol?
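For comparison, here is roughly what byte-range resume looks like over plain HTTP: Range answers "start at byte 233333", and a conditional If-Match against the object's ETag answers the "how do I know the file changed" question. A minimal Python sketch, assuming the server honors these standard headers (the URL and ETag plumbing are illustrative only, not something s3sync does):

```python
import os
import urllib.error
import urllib.request

def resume_download(url, local_path, etag):
    """Append the missing tail of `url` to `local_path`, or fail if the object changed."""
    offset = os.path.getsize(local_path) if os.path.exists(local_path) else 0
    req = urllib.request.Request(url, headers={
        "Range": f"bytes={offset}-",   # pick up where the partial file ends
        "If-Match": etag,              # ETag seen when the download started
    })
    try:
        # 206 Partial Content means the server honored the Range header
        with urllib.request.urlopen(req) as resp, open(local_path, "ab") as out:
            while chunk := resp.read(64 * 1024):
                out.write(chunk)
    except urllib.error.HTTPError as e:
        if e.code == 412:              # Precondition Failed: object changed, start over
            raise RuntimeError("object changed on the server; restart from byte 0")
        raise
```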
ferrix (Sr. Member, Posts: 363; greg13070 on the AWS forum)
« Reply #8 on: March 30, 2007, 09:24:23 PM »

It could be made to work, I think. :grin:

Um, about download resume: I don't remember for sure, but I "could have sworn" that was supported. Maybe I'm just out of my mind.