S3Sync.net

General Category => Closed Bugs => Topic started by: scientastic on October 03, 2007, 11:59:29 AM



Title: Owner ID is incorrect
Post by: scientastic on October 03, 2007, 11:59:29 AM
I haven't had the time to trace this back to the source of the problem.  However, when running the following command:

Code:
s3sync -r bucket:path/to/dir/ /local/path/

I get the following error:

Code:
/usr/local/s3sync/s3sync.rb:659:in `chown': bignum too big to convert into `long' (RangeError)
        from /usr/local/s3sync/s3sync.rb:659:in `send'
        from /usr/local/s3sync/s3sync.rb:659:in `updateFrom'
        from /usr/local/s3sync/s3sync.rb:378:in `main'
        from /usr/local/s3sync/s3sync.rb:709

Now, I looked in s3sync.rb around line 659, and inserted the puts line in the code as shown below:

Code:
        # update permissions
        linkCommand = fromNode.symlink? ? 'l' : ''
        begin
          puts "chown #{fromNode.owner} #{fromNode.group} #{@path}"
          File.send(linkCommand + 'chown', fromNode.owner, fromNode.group, @path)
          File.send(linkCommand + 'chmod', fromNode.permissions, @path)
        rescue NotImplementedError
          # no one has lchmod, but who really cares
        rescue SystemCallError
          $stderr.puts "Could not change owner/permissions on #{@path}: #{$!}"
        end

The results of running the command again after cleaning out the local directory:

Code:
chown 400 401 /path/to/file1
chown 4294967295 401 /path/to/file2
/usr/local/s3sync/s3sync.rb:660:in `chown': bignum too big to convert into `long' (RangeError)
        from /usr/local/s3sync/s3sync.rb:660:in `send'
        from /usr/local/s3sync/s3sync.rb:660:in `updateFrom'
        from /usr/local/s3sync/s3sync.rb:378:in `main'
        from /usr/local/s3sync/s3sync.rb:710

Now, clearly, somewhere earlier in the code, fromNode.owner is getting assigned a bogus value.  I don't really have the time to trace this down, but could you look into it?  Thanks...
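
A note on the value itself: 4294967295 is 2**32 - 1 (0xFFFFFFFF), which is what -1 looks like when read as an unsigned 32-bit integer, so it is more likely a "no valid ID" marker than a real user ID. The trace also suggests why the existing rescue clauses do not help: RangeError is not a SystemCallError. A minimal Ruby sketch of both points follows. The raise only simulates the failure and never calls chown, and simulate_update is a name made up for this illustration:

```ruby
BOGUS_ID = 4294967295

# The value is exactly 2**32 - 1, i.e. all 32 bits set.
IS_ALL_ONES = (BOGUS_ID == 0xFFFFFFFF) && (BOGUS_ID == 2**32 - 1)

# Reinterpreting a signed 32-bit -1 as unsigned gives the same number.
MINUS_ONE_AS_UNSIGNED = [-1].pack('l').unpack('L').first

# Simulate the failure: RangeError is neither a NotImplementedError nor a
# SystemCallError, so it slips past both inner rescue clauses and only the
# method-level rescue sees it.
def simulate_update
  begin
    raise RangeError, "bignum too big to convert into `long'"
  rescue NotImplementedError
    :not_implemented
  rescue SystemCallError
    :system_call_error
  end
rescue RangeError
  :escaped_inner_rescue
end

puts IS_ALL_ONES
puts MINUS_ONE_AS_UNSIGNED
puts simulate_update
```

If this reading is right, adding RangeError to the rescue list would keep the sync running, but it would not explain where the bogus ID comes from.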


Title: Re: Owner ID is incorrect
Post by: scientastic on October 03, 2007, 12:20:47 PM
I think I may have a clue about where it is coming from.

I observe this when I try to sync from S3 to an EC2 node, but so far only when syncing files that were originally put in S3 from my local machine running Cygwin.  So s3sync on Cygwin must somehow be getting weird owner IDs... that is, in order to reproduce this you may have to run it on Cygwin.  However, when I run "echo $UID" I get 400 as the answer.

The sequence, as I remember it, is that I synced a directory from S3 to Cygwin, edited some files, and synced them back to S3 under a different path.  Then I tried to sync from S3 to an EC2 node, and that is when I started observing this error.  The file that I edited is the one giving the problem.

However, when I looked at the files I originally synced from S3 to Cygwin, all of them have the bogus large user ID as the owner.  Only the ones I edited got synced back to S3, which is why only those carry the bogus user ID there.  Perhaps s3sync is failing to set the proper ownership on the local drive in Cygwin when first syncing from S3 to local?  Or perhaps the Cygwin version of chown is messed up?

Hope this helps...


Title: Re: Owner ID is incorrect
Post by: scientastic on October 03, 2007, 12:52:26 PM
More on this.  I found out it is likely Cygwin's fault.  When copying files from S3 to Cygwin using s3sync, I get the following error:

Code:
Could not change owner/permissions on /path/to/file: Invalid argument - /path/to/file

I tried using chown directly... apparently, under Cygwin, it only likes user names, not user IDs.  At least, on my machine.

You can probably ignore this bug, unless someone knows a solution for Cygwin.


Title: Re: Owner ID is incorrect
Post by: sheltond on April 25, 2008, 06:19:39 AM
I have been having this problem for a long time, and have finally got around to looking into it. By putting logging in various places I found that my S3 content has metadata headers (which I believe were inserted by s3sync) like this:

  x-amz-meta-group: 4294967295
  x-amz-meta-owner: 11110

So the 4294967295 value (0xFFFFFFFF) is stored in S3. When I do "ls -l" on these same files, I get things like:

  drwxr-xr-x+  4 sheltond ????????     0 Aug 13  2007 Foo/

so clearly the group isn't being understood as one of the ones that Cygwin knows about. This is for files/directories created by Windows. Ones created under Cygwin are fine: they have a sensible group value. Adding some logging when uploading files shows that the "self.stat().gid" call is getting back 0xFFFFFFFF from the "stat.gid" field under Cygwin.
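
To see which local files are affected, the stat values can be inspected directly. Below is a small Ruby sketch, assuming the bogus value is always 0xFFFFFFFF as in the headers above. INVALID_ID, invalid_id? and files_with_invalid_ids are names made up for this example and are not part of s3sync:

```ruby
require 'find'

# Value reported for an unknown owner/group (assumption: always
# 0xFFFFFFFF, as seen in the metadata headers above).
INVALID_ID = 0xFFFFFFFF

# Pure check, kept separate so it can be used without touching the disk.
def invalid_id?(id)
  id == INVALID_ID
end

# Walk a directory tree and return [path, uid, gid] for every entry whose
# uid or gid is the invalid value. lstat is used so symlinks are reported
# themselves rather than followed.
def files_with_invalid_ids(root)
  hits = []
  Find.find(root) do |path|
    st = File.lstat(path)
    hits << [path, st.uid, st.gid] if invalid_id?(st.uid) || invalid_id?(st.gid)
  end
  hits
end
```

Calling files_with_invalid_ids on the synced directory should list the same entries that show ???????? in "ls -l", if the assumption about the value holds.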

In any case, I tend to use s3sync for syncing data between my laptop (Windows/Cygwin), my home PC (Linux), my media center (Vista), etc. I don't want it to set the user and group of these files to the same values on all of those machines, as the user IDs aren't the same.

The way I have got around this is to add a "--no-set-owner" option, which just avoids doing that "chown" command. This seems to make everything work nicely again.

Anyway, I don't know if anyone else wants this feature, but I have attached a diff of the changes that I have made (the diff is based on version 1.2.4). In summary:

- Added --no-set-owner option
- Added a check for self.stat().gid being 0xFFFFFFFF, returning 0 in that case
- Added printing of directories being traversed if --verbose is on (I like to see how far through it is)

Feel free to take any or all of these changes.
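
The attached diff is not reproduced in this thread, so the following is only a rough Ruby sketch of the first two changes under stated assumptions, not the actual patch. It uses the standard optparse library for brevity, which may not match how s3sync parses its options, and parse_options, normalize_id and apply_owner are hypothetical names:

```ruby
require 'optparse'

# Value seen in stat.gid under Cygwin for an unknown group.
INVALID_ID = 0xFFFFFFFF

# Ownership is restored by default; --no-set-owner switches it off.
def parse_options(argv)
  options = { set_owner: true }
  OptionParser.new do |opts|
    opts.on('--[no-]set-owner', 'Restore owner/group on download') do |v|
      options[:set_owner] = v
    end
  end.parse(argv)
  options
end

# Upload side: replace the invalid value with 0 before storing it.
def normalize_id(id)
  id == INVALID_ID ? 0 : id
end

# Download side: only call chown when ownership should be restored.
# Returns true if chown was attempted, false if it was skipped.
def apply_owner(path, owner, group, options)
  return false unless options[:set_owner]
  File.chown(owner, group, path)
  true
end
```

With this shape, running with --no-set-owner never reaches File.chown, so a bad ID already stored in S3 cannot trigger the RangeError on download.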


Title: Re: Owner ID is incorrect
Post by: ferrix on April 25, 2008, 07:52:06 AM
Just a thought... but why run s3sync in Cygwin with busted permissions when you could just run it natively in Windows, where uid/gid and such are ignored?


Title: Re: Owner ID is incorrect
Post by: sheltond on April 25, 2008, 08:37:00 AM
I have a bunch of bash wrapper scripts which are common across various different OSes, which I run under Cygwin on Windows. I could probably set it up differently, but should I need to? s3sync is generally nice and portable, and I think it makes sense to fix non-portability issues when they arise.

In any case, regardless of my reasons for using Cygwin, it is still perfectly reasonable (I think) to want file ownership information not to be duplicated across different machines where the user/group IDs may not be the same. This applies whether using MacOS, Linux, Windows, etc...


Title: Re: Owner ID is incorrect
Post by: ferrix on April 26, 2008, 01:17:33 AM
Yeah, that's a good reason; I was just wondering.  I've never run it through Cygwin, but it seems like the bug is in Cygwin or the Ruby library (because the uid/gid code in s3sync is dead simple).