S3Sync.net
February 02, 2014, 01:33:18 PM *
Topic: How do I make it start where it left off?
tkirby
« on: September 19, 2007, 09:15:00 PM »

I'm trying to transfer a couple thousand movie files to S3 and the s3sync script was running happily for a few hours (and several hundred movies) before it ran into some kind of broken pipe error. Anyways, I went to run the script again (using --dryrun like a good little s3syncer) and it seems like it wants to start all over again. Do I need to add some flag to tell it not to retransfer stuff that already exists in the destination location?

CMD:

 ./s3sync.rb --progress /home/tkirby/html/avshare_old/media/ chickenrancher:avshare/media/ --dryrun

--------------------

I saw a snake eating itself tail first, but it disappeared before I could show anyone.

ferrix
(I am greg13070 on AWS forum)
« Reply #1 on: September 19, 2007, 09:50:50 PM »

Well if you used dry run then of course it doesn't actually sync anything so it'd *have* to start over next time wouldn't it..

But assuming you mean that the first time you didn't dry run it....
Have you considered that putting the option at the end might be a problem?  I have no idea whether that works.  Assuming your command syntax is doing what you want it to do, then transfers won't recur if the local files are the same size and md5 as the nodes out on s3.
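
A minimal sketch of the size-and-md5 rule described above, in Python rather than s3sync's own Ruby. The function name `needs_transfer` and the `(size, md5_hex)` pair are hypothetical, chosen for illustration; this is not s3sync's actual code.

```python
import hashlib
from typing import Optional, Tuple


def needs_transfer(data: bytes, remote: Optional[Tuple[int, str]]) -> bool:
    """Return True if the local bytes should be sent to S3.

    `remote` is a hypothetical (size, md5_hex) pair for the matching S3 node,
    or None when the listing did not return a node with that name.
    """
    if remote is None:
        return True  # nothing on the S3 side yet, so create the node
    remote_size, remote_md5 = remote
    if len(data) != remote_size:
        return True  # sizes differ, no need to hash
    # S3 returns ETags wrapped in double quotes, so strip them before comparing.
    return hashlib.md5(data).hexdigest() != remote_md5.strip('"').lower()


payload = b"example movie bytes"
listed = (len(payload), hashlib.md5(payload).hexdigest())
print(needs_transfer(payload, listed))  # False
print(needs_transfer(payload, None))    # True
```

If the remote node is never found (the `None` case), every file is treated as new, which is the symptom reported in this thread.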

tkirby
« Reply #2 on: September 19, 2007, 10:28:30 PM »

Yeah, umm, wet run the first time, dry run the second to make sure it was gonna pick up where it left off (which it didn't).

./s3sync.rb --dryrun --progress /home/tkirby/html/avshare_old/media/ chickenrancher:avshare/media/
Create node 100.wmv
Create node 1000.wmv
Create node 1001.wmv
Create node 1002.wmv
Create node 1003.mp3
Create node 1004.wmv
Create node 1005.wmv
Create node 1006.wmv
Create node 1008.wmv
Create node 1016.wmv
Create node 1017.wmv
Create node 1018.wmv

--dryrun before the paths, same deal. The files on the server are up to 1419 but it still wants to start at 1000 (the beginning)

Anything else I should try?

ferrix
« Reply #3 on: September 20, 2007, 10:09:59 PM »

Best guess is the original files were sent to some differing path or something that is preventing them from being found.

You could try -d to see more details about what is being done internally.  I can't help too much more without actually having access to your source and bucket.

tkirby
« Reply #4 on: September 21, 2007, 12:47:45 AM »

When I run it with debug and verbose it says this:


local item /home/webmstr/html/avshare_old/media/100.wmv
local node object init. Name:100.wmv Path:/home/webmstr/html/avshare_old/media/100.wmv Size:3227980 Tag:17a9e28de46a6a9634a1625299a8cd23
s3TreeRecurse trainorders avshare/media
Trying command list_bucket trainorders max-keys 200 prefix avshare/media delimiter / with 100 retries left
Response code: 200
prefix found: /
S3 item avshare/media_$folder$
s3 node object init. Name:_$folder$ Path:avshare/media_$folder$ Size:0 Tag:d41d8cd98f00b204e9800998ecf8427e
source: 100.wmv
dest: _$folder$
s3 node object init. Name:100.wmv Path:avshare/media/100.wmv Size: Tag:
Create node 100.wmv

Looks like it's trying to compare the local file with 'S3 item avshare/media_$folder$'
Is _$folder$ some variable that's not getting parsed correctly?

Cheers,
-Todd

ferrix
« Reply #5 on: September 21, 2007, 10:31:06 AM »

Aha.. no, not a variable.  That is a folder marker created by some other (non-s3sync) tool.  You can't sync with something else and then expect s3sync to understand it; the folder concepts of these tools are all proprietary and incompatible.

So as I suspected, your stuff was (somehow) stored to a location that s3sync doesn't see when it looks for the folder "media".  Therefore it's doing the right thing by its design and trying to store all the items over.
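
One way to read the debug output earlier in the thread is to emulate an S3 listing with a prefix and a delimiter. The sketch below is Python, not s3sync code; `list_keys` is a hypothetical helper and the bucket contents are assumed for illustration (the real bucket was never shown). Keys whose remainder after the prefix contains the delimiter are rolled up into a common prefix, and the rest come back as items.

```python
from typing import Iterable, List, Tuple


def list_keys(keys: Iterable[str], prefix: str,
              delimiter: str = "/") -> Tuple[List[str], List[str]]:
    """Emulate an S3 list with prefix and delimiter.

    Returns (contents, common_prefixes) for the given key names.
    """
    contents: List[str] = []
    common_prefixes: List[str] = []
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        cut = rest.find(delimiter)
        if cut == -1:
            contents.append(key)  # no delimiter after the prefix: a plain item
        else:
            rolled = prefix + rest[:cut + len(delimiter)]
            if rolled not in common_prefixes:
                common_prefixes.append(rolled)  # rolled up like a subfolder
    return contents, common_prefixes


# Assumed bucket contents, for illustration only.
bucket = [
    "avshare/media_$folder$",
    "avshare/media/100.wmv",
    "avshare/media/1000.wmv",
]

print(list_keys(bucket, "avshare/media"))
# (['avshare/media_$folder$'], ['avshare/media/'])
print(list_keys(bucket, "avshare/media/"))
# (['avshare/media/100.wmv', 'avshare/media/1000.wmv'], [])
```

With the prefix `avshare/media` (no trailing slash) the only item returned is the `_$folder$` marker, which matches the "S3 item avshare/media_$folder$" line in the log, so there is nothing named 100.wmv to compare against.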

tkirby
« Reply #6 on: September 21, 2007, 08:47:42 PM »

Not sure what that means, but I deleted everything from S3 and started over. Told it to copy again with:

./s3sync.rb --progress /home/tkirby/html/avshare_old/media/ chickenrancher:avshare/media/

Let it copy three or four:

Create node 100.wmv
Progress: 2764800b  839107b/s  85%
Create node 1000.wmv
Progress: 9064448b  243780b/s  95%
...

Stopped it and restarted it:

./s3sync.rb --progress /home/tkirby/html/avshare_old/media/ chickenrancher:avshare/media/

Create node 100.wmv
Progress: 3041280b  593944b/s  94%
Create node 1000.wmv
Progress: 8768512b  354275b/s  92%
...

It always starts from the beginning and never seems to report 100% copied in the progress. That might just be because it exits the progress loop before the last update, but it's still copying everything every time. Does the bucket have to be created by the s3sync.rb program in order to work? (I'm copying into an existing one.)

Where does s3sync.rb store the md5?

Cheers,
-Todd




ferrix
« Reply #7 on: September 22, 2007, 10:17:09 AM »

As before I can't really tell what's going on from the short results.. need -d at a minimum.

You are correct about the 100% thing; there's no special case to print when it's done.
MD5 is not stored locally, it is recalculated each time.  S3 uses md5 as the etag so it comes down during list operations.
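
To illustrate recalculating the MD5 on each run, here is a small Python sketch (not s3sync's Ruby code; `file_md5` and its `chunk_size` parameter are hypothetical names). It hashes the file in chunks so that large movie files do not have to fit in memory. The comparison against the etag only holds for objects stored with a single PUT; multipart uploads produce a different etag format.

```python
import hashlib
import os
import tempfile


def file_md5(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 of a file, reading it chunk_size bytes at a time."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on an empty temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    pass
try:
    print(file_md5(tmp.name))  # d41d8cd98f00b204e9800998ecf8427e
finally:
    os.unlink(tmp.name)
```

The digest of the empty file is the same value that appears as the Tag of the zero-size `_$folder$` node in the debug output earlier in this thread.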

Graham Cobb
« Reply #8 on: October 13, 2007, 11:39:47 AM »

I think I am seeing the same problem, only it occurs even if the sync completes successfully.

I created a new bucket (using s3cmd.rb) and then issued the following command:

Code:
./s3sync.rb -d --ssl --verbose --delete --progress /tmp/aa/ backup.cobb.me.uk:a

/tmp/aa contained two files: a and b.

As expected the command copied both files.  Here is the log:

Code:
s3Prefix a
localPrefix /tmp/aa/
localTreeRecurse /tmp/aa
Test /tmp/aa/a
Test /tmp/aa/b
local item /tmp/aa/a
local node object init. Name:a Path:/tmp/aa/a Size:2 Tag:60b725f10c9c85c70d97880dfe8191b3
s3TreeRecurse backup.cobb.me.uk a
Trying command list_bucket backup.cobb.me.uk max-keys 200 prefix a delimiter / with 100 retries left
Response code: 200
source: a
s3 node object init. Name:a Path:a/a Size: Tag:
Create node a
a/a
File extension: a/a
Trying command put backup.cobb.me.uk a/a #<S3::S3Object:0x2b27b4e05950> Content-Length 2 with 100 retries left
Response code: 200
local item /tmp/aa/b
local node object init. Name:b Path:/tmp/aa/b Size:2 Tag:3b5d5c3712955042212316173ccf37be
source: b
s3 node object init. Name:b Path:a/b Size: Tag:
Create node b
a/b
File extension: a/b
Trying command put backup.cobb.me.uk a/b #<S3::S3Object:0x2b27b4d57e90> Content-Length 2 with 100 retries left
Response code: 200

I then reissued exactly the same command and got almost exactly the same log:

Code:
s3Prefix a
localPrefix /tmp/aa/
localTreeRecurse /tmp/aa
Test /tmp/aa/a
Test /tmp/aa/b
local item /tmp/aa/a
local node object init. Name:a Path:/tmp/aa/a Size:2 Tag:60b725f10c9c85c70d97880dfe8191b3
s3TreeRecurse backup.cobb.me.uk a
Trying command list_bucket backup.cobb.me.uk max-keys 200 prefix a delimiter / with 100 retries left
Response code: 200
prefix found: /
source: a
s3 node object init. Name:a Path:a/a Size: Tag:
Create node a
a/a
File extension: a/a
Trying command put backup.cobb.me.uk a/a #<S3::S3Object:0x2b88a5339448> Content-Length 2 with 100 retries left
Response code: 200
local item /tmp/aa/b
local node object init. Name:b Path:/tmp/aa/b Size:2 Tag:3b5d5c3712955042212316173ccf37be
source: b
s3 node object init. Name:b Path:a/b Size: Tag:
Create node b
a/b
File extension: a/b
Trying command put backup.cobb.me.uk a/b #<S3::S3Object:0x2b88a528bd48> Content-Length 2 with 100 retries left
Response code: 200

The files were copied again.

I was hoping to use s3sync to synchronise some files which are about 1.5GB long (slices of a backup) and so copying the file unnecessarily is a big problem!

Any ideas?

Graham

ferrix
« Reply #9 on: October 13, 2007, 08:30:39 PM »

I'll try to repro this and see what's going on.  I know it at least *mostly* works because I do daily backups and it certainly doesn't keep transferring all files.  Might be some weird edge case I missed.  I'll post again when I have had time to check.

ferrix
« Reply #10 on: October 18, 2007, 09:16:23 PM »

This is incredible.. I wonder how long it's been broken.  S3 isn't returning MD5s in the list operation any more, it looks like.  I suppose there's some change that I neglected to notice.

I will investigate.  Clearly this has incredibly far reaching effects.

ETA: OK I'm wrong, something else is going on.  I can get the etags ok, but not for this example.  Some problem with prefix interpretation.
« Last Edit: October 18, 2007, 09:32:50 PM by ferrix »

ferrix
« Reply #11 on: October 18, 2007, 10:27:56 PM »

This is just YASB
(yet another slash bug).

I hate that I chose the rsync-like interface; slash accounting ALWAYS leads to bugs.  Too tired to figure this out tonight, but at least I know what's going on now.

By the way, if you use --recursive, this issue goes away.

Graham Cobb
« Reply #12 on: October 19, 2007, 12:04:18 PM »

Thanks for looking into this.  I can confirm that I have worked round the problem by using --recursive (even though there are no sub-directories).

Graham

ferrix
« Reply #13 on: October 19, 2007, 08:57:17 PM »

That's good.  I spent a little more time on this today and came to the following conclusion..

The slash accounting code is so incredibly spaghetti between trying to emulate rsync, and trying to make S3 list look like directory list..

Fixing it in place is no longer an option; I have to clean it up and make better sense of it next time.  And this is going to take more energy+time than I have to devote to it at the moment, alas.