S3Sync.net
February 02, 2014, 01:22:17 PM *
Author Topic: s3sync randomly re-updates nodes [corruption occurring]  (Read 2628 times)
falk0069
Newbie
*
Posts: 4
« on: January 07, 2010, 02:30:26 AM »

Has anyone experienced s3sync re-updating nodes when syncing even though the files have not changed?  I read in a different topic that this can happen for every file if you use trailing forward slashes on directory paths, but that is not the case for me.  I'm trying to sync a directory full of pictures, and about 1 in 100 will re-sync.  Everything still works, but I'd like to minimize my bandwidth usage if possible.

Does anyone have any suggestions?

This is the command I'm using to do the work for me:
s3sync.rb -r -v --ssl --delete /raidset/storage/Saves mybucket:Saves

This is sample output:
Update node 1998-2002/2000-11 Thanksgiving & Christmas/101-0115_IMG.JPG
Update node 1998-2002/2000-11 Thanksgiving & Christmas/101-0139_IMG.JPG
Update node 1998-2002/2002-10-31 Office Halloween/IMG_0003.JPG
Update node 1998-2002/2002-11 Caribbean Vacation/3 - Barbados/314.JPG

Thanks
« Last Edit: January 10, 2010, 12:15:43 PM by falk0069 »
falk0069
Newbie
*
Posts: 4
« Reply #1 on: January 10, 2010, 12:15:12 PM »

Well, I found out more information, and it is not good news. It turns out that a small portion of my data is getting corrupted.  Boy, am I glad I decided to investigate further.  I STRONGLY URGE EVERYONE TO VERIFY THAT THEIR TRANSFERS ARE NOT RANDOMLY RE-UPDATING BY USING THE -v FLAG.

What I did was start double-checking all of the md5sums, and some of them were indeed different.  I then started doing binary diffs, and the corrupt files differed by only a few bytes.  The file sizes were all equal, but in every case 1-4 bytes at the end of the file were zeroed out.  In about half of the cases there were also 2-10 consecutive bytes in the middle of the file that were just wrong.
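For anyone who wants to run the same kind of check, here is a minimal Python sketch of the comparison described above. It is not part of s3sync; the function names `md5_hex` and `diff_ranges` are made up for illustration, and it assumes you have the local original and a copy downloaded back from S3 side by side on disk.

```python
import hashlib
from pathlib import Path


def md5_hex(path):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def diff_ranges(path_a, path_b):
    """Return (start, end) byte offsets, end exclusive, where two files differ.

    Reads both files fully into memory, which is fine for photo-sized files.
    If the lengths differ, the extra tail is reported as a final range.
    """
    a = Path(path_a).read_bytes()
    b = Path(path_b).read_bytes()
    ranges = []
    start = None
    common = min(len(a), len(b))
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i  # a new run of differing bytes begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended at the previous byte
            start = None
    if start is not None:
        ranges.append((start, common))
    if len(a) != len(b):
        ranges.append((common, max(len(a), len(b))))
    return ranges
```

Calling `diff_ranges("local.jpg", "downloaded.jpg")` (hypothetical file names) returns an empty list for identical files; a short range at the very end plus a short range in the middle would match the pattern described in this post.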

I tried removing the --ssl option to see if that had something to do with it, but it made no difference.  I also tried adding the '--content-md5' patch from the 'http://s3sync.net/forum/index.php?topic=206.0' topic.  Using that new command line option made no difference either.

My only conclusion is that, since I'm doing MD5 checks, the file is somehow being corrupted before the actual upload.  Perhaps this is caused by a bad kernel compile or bad memory, but then I'm surprised I'm not having more serious issues.  It is also possible that Amazon made a change that s3sync is not completely compatible with.  I noticed that there hasn't been any new development since June 2008; I would think that the content-md5 patch would be included by now.
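To illustrate the reasoning: a Content-MD5 header is the base64 encoding of the binary MD5 digest of the bytes being sent, and as far as I understand it S3 rejects a PUT whose body does not match that value. The checksum is computed from whatever bytes the uploader read, so if they were already wrong in memory the check would still pass. A rough Python sketch of computing the header value (the helper name `content_md5` is made up, and this is not how the s3sync patch itself is written):

```python
import base64
import hashlib


def content_md5(data: bytes) -> str:
    """Return the Content-MD5 header value for a request body.

    This is base64 of the raw 16-byte MD5 digest, not of the hex string.
    """
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")
```

The value depends only on `data` as the uploader sees it, which is why a matching Content-MD5 says nothing about whether those bytes equal what is actually on disk.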

Well, I'm going to start trying some alternative S3 backup methods just to see if I have similar problems.  I really do enjoy s3sync, as it does exactly what I want, but I can't deal with possible corruption occurring, and I don't think I can troubleshoot this much further.
ferrix
Sr. Member
****
Posts: 363


(I am greg13070 on AWS forum)


« Reply #2 on: January 11, 2010, 01:20:32 PM »

This sounds like a serious issue, if it is caused by the sync.  I have been very busy and unable to work on this project at all for a long time (as you noted).

But if you are able to isolate some case where this corruption is occurring (say, with a specific file) and you can send me that file to test with, I'll have a look at what's going on.

My long-term "plan" for s3sync is to rewrite it in C and Lua, because the Ruby runtime is just total crap and has been (indirectly) responsible for many of the complaints about the project.