1  General Category / Questions / Re: s3sync froze on connection reset on: September 12, 2007, 02:06:33 PM
Hi Ferrix,

Thanks. I'm trying to persuade the net/http library authors to raise an exception if the file size changes during an upload. If that's done, you could catch that exception and retry the upload.
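To make the catch-and-retry idea concrete, here is a rough sketch. It assumes net/http (or a patched copy of it) raises some exception when the file changes mid-upload. FileChangedDuringUpload and with_upload_retries are hypothetical names I made up for illustration; neither exists in net/http or s3sync today.

```ruby
# Hypothetical exception: stands in for whatever net/http might raise
# if the body turned out not to match the declared Content-Length.
class FileChangedDuringUpload < StandardError; end

# Run the given block, re-running it if the file changed underneath it.
# Gives up and re-raises after max_attempts tries.
def with_upload_retries(max_attempts = 3)
  attempts = 0
  begin
    attempts += 1
    yield attempts            # the block performs one complete upload
  rescue FileChangedDuringUpload
    retry if attempts < max_attempts
    raise                     # out of attempts: let the caller report it
  end
end
```

The block would need to re-read the file size and reopen the file on every attempt, so that each retry sends a Content-Length that was correct when that attempt started.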

I don't think either file locking or guessing which files might be changing is going to work all the time, and I'd like this to be reliable.

Cheers, Chris.
2  General Category / Questions / Re: s3sync froze on connection reset on: September 11, 2007, 11:12:54 AM
Hi Ferrix,

For log files it's perfectly fine to truncate them without locking them, as they're only appended to.

I agree with the other thread that locking is a nightmare. I don't agree that the solution is to do nothing at all. Detecting the situation and failing with a polite error message, preferably with the option to override it for users who know what they're doing, seems much better to me than the behaviour I saw (which admittedly was a bug in Ruby rather than in s3sync).

What percentage of s3sync users would be able to track this down by themselves in the current state?

Cheers, Chris.
3  General Category / Questions / Re: s3sync froze on connection reset on: September 11, 2007, 10:10:57 AM
Hi Ferrix,

Thanks for the suggestion. I think you're right. Turns out that I can't reproduce the problem on a static copy of the data, but I can if the file is being modified while it's being uploaded, e.g. the live error_log on a site, or in my tests, simply appending data to the file while syncing.

The file upload appears to complete, but further attempts to access S3 all fail.

I saw another comment about this in the forums here: http://s3sync.net/forum/index.php?topic=53.0. I think that extra handling for this case would be useful, as it's not at all clear to users what's going on.

My guess is that you check the file size more than once, and if it's changing fast then you get different values, which confuses s3sync and causes it to break the HTTP protocol by sending more data than the Content-Length it announced.

Could it be that the extra data ends up locked in the HTTPStreaming output buffer, trying to be sent before any subsequent request, which causes all subsequent requests to fail?

Actually, I think the culprit (or at least one of them) may be /usr/lib/ruby/1.8/net/http.rb, around line 1523:

      write_header sock, ver, path
      if chunked?
        while s = f.read(1024)
          sock.write(sprintf("%x\r\n", s.length) << s << "\r\n")
        end
        sock.write "0\r\n\r\n"
      else
        while s = f.read(1024)
          sock.write s
        end
      end

This makes no attempt to verify that content_length() matches the current length of the file. Amazon doesn't support chunked encoding either (I tried it), so the only option seems to be to fix this method.
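As a rough idea of what a fix might look like, here is a sketch of a replacement for the non-chunked loop. It writes exactly the declared number of bytes and raises if the file turns out to be shorter or longer. write_fixed_length_body, FileChangedDuringUpload and the allow_growth option are all names I invented for this example, not anything in net/http.

```ruby
require 'stringio'

# Hypothetical exception, not part of net/http.
class FileChangedDuringUpload < StandardError; end

# Copy exactly `expected` bytes from `io` to `sock`, reading at most 1024
# bytes at a time as the original loop does. Returns the byte count.
# Raises if the source ends early, or (unless allow_growth is true) if
# more data is still readable after `expected` bytes were sent.
def write_fixed_length_body(sock, io, expected, allow_growth: false)
  remaining = expected
  while remaining > 0
    chunk = io.read([1024, remaining].min)
    if chunk.nil?
      raise FileChangedDuringUpload,
            "body ended #{remaining} bytes short of Content-Length #{expected}"
    end
    sock.write(chunk)
    remaining -= chunk.bytesize
  end
  # Any byte still readable means the file grew after its size was taken.
  if !allow_growth && !io.read(1).nil?
    raise FileChangedDuringUpload,
          "body is longer than Content-Length #{expected}"
  end
  expected
end

# Usage with in-memory streams standing in for the socket and the file:
out = StringIO.new
write_fixed_length_body(out, StringIO.new("x" * 3000), 3000)
```

Either way the request never carries more than Content-Length bytes, so the connection should stay usable for later requests. With allow_growth set to true, a file that is only ever appended to (like a log) is uploaded as it was when its size was taken, which is the override for users who know what they're doing.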

Cheers, Chris.
4  General Category / Questions / Re: s3sync froze on connection reset on: September 10, 2007, 10:31:07 AM
Hi Ferrix,

I've just started using S3 and I'm trying to upload 1.3GB of data using s3sync. At the moment it fails every time on the same file just after 1GB.

I'm running it like this:

./s3sync.rb -r /var/www/vhosts/ewb-uk.org/ 18XYRMTQ7PRP5J1B6SR2.EWB_Server_Backup:www.ewb-uk.org

and the output looks like this, after scanning through almost all my files:

local item /var/www/vhosts/ewb-uk.org/statistics/logs/access_log.processed
local node object init. Name:statistics/logs/access_log.processed Path:/var/www/vhosts/ewb-uk.org/statistics/logs/access_log.processed Size:188287179 Tag:9f1a59d3c21ef904387651e16ea4e3b1
S3 item www.ewb-uk.org/statistics/logs/access_log.processed
s3 node object init. Name:statistics/logs/access_log.processed Path:www.ewb-uk.org/statistics/logs/access_log.processed Size:188287179 Tag:9f1a59d3c21ef904387651e16ea4e3b1
source: statistics/logs/access_log.processed
dest: statistics/logs/access_log.processed
Node statistics/logs/access_log.processed unchanged
local item /var/www/vhosts/ewb-uk.org/statistics/logs/error_log
local node object init. Name:statistics/logs/error_log Path:/var/www/vhosts/ewb-uk.org/statistics/logs/error_log Size:57468411 Tag:48fd54f4ee01b9e3efb9b6fb468b69fb
S3 item www.ewb-uk.org/statistics/logs/error_log
s3 node object init. Name:statistics/logs/error_log Path:www.ewb-uk.org/statistics/logs/error_log Size:57466964 Tag:efe567d8a6a1e81d5f499b63588da973
source: statistics/logs/error_log
dest: statistics/logs/error_log
Update node statistics/logs/error_log
File extension: org/statistics/logs/error_log
Trying command put 18XYRMTQ7PRP5J1B6SR2.EWB_Server_Backup www.ewb-uk.org/statistics/logs/error_log #<S3::S3Object:0x2a973ced08> Content-Length 57468411 with 100 retries left
Broken pipe: Broken pipe
No result available
99 retries left
Trying command put 18XYRMTQ7PRP5J1B6SR2.EWB_Server_Backup www.ewb-uk.org/statistics/logs/error_log #<S3::S3Object:0x2a973ced08> Content-Length 57468411 with 99 retries left
/usr/lib/ruby/1.8/net/protocol.rb:175:in `write': Interrupt
        from /usr/lib/ruby/1.8/net/protocol.rb:175:in `write0'
        from /usr/lib/ruby/1.8/net/protocol.rb:151:in `write'
        from /usr/lib/ruby/1.8/net/protocol.rb:166:in `writing'
        from /usr/lib/ruby/1.8/net/protocol.rb:150:in `write'
        from /usr/lib/ruby/1.8/net/http.rb:1542:in `write_header'
        from /usr/lib/ruby/1.8/net/http.rb:1523:in `send_request_with_body_stream'
        from /usr/lib/ruby/1.8/net/http.rb:1498:in `exec'
        from /usr/lib/ruby/1.8/net/http.rb:1044:in `_HTTPStremaing_request'
        from ./HTTPStreaming.rb:43:in `request'
        from ./S3_s3sync_mod.rb:50:in `make_request'
        from ./S3.rb:152:in `put'
        from ./s3try.rb:57:in `S3try'
        from ./s3sync.rb:513:in `updateFrom'
        from ./s3sync.rb:395:in `main'
        from ./s3sync.rb:708

The backtrace is where I killed it after 20 minutes of doing nothing.

I know Broken Pipe is not quite the same as Connection Reset, but I've been getting those as well, with the same symptoms, so perhaps this will help.

I'd appreciate any help you can give me in fixing this. I'd be happy to debug and test patches, but I don't want to run s3sync too many times at the moment because it executes a lot of LIST commands each time it runs, and I get charged for those.

Cheers, Chris.
