lowflyinghawk
« on: February 28, 2007, 07:20:25 AM »
I store keys from Linux with names that look like filesystem paths, e.g. "data/blah/whatever.txt". When I retrieve them, the code looks like this (leaving out error handling, etc.):
    require 'fileutils'

    # create the local directory tree implied by the key, then stream into the file
    path = File.dirname(key)
    FileUtils.mkdir_p(path)
    File.open(key, "wb") do |f|
      get_response = @conn.get(bucket, key, {}, f)
    end
What happens if I use this same code on Windows to retrieve the same sample key as above? The docs imply that Ruby will do the right thing for the local filesystem, but I haven't found a definitive answer in the Ruby docs.

Of course it's easy enough to try out... if you have a Windows box ;-).
ferrix
« Reply #1 on: February 28, 2007, 03:48:45 PM »
Yes, slashes are magically handled right. In fact, if you talk to s3sync with DOS-style backslashes, it gets mightily confused.
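For what it's worth, this is easy to sanity-check in plain Ruby (a quick sketch, assuming a Windows box to run it on):

    key = "data/blah/whatever.txt"
    File.dirname(key)    # => "data/blah" on Linux and on Windows

    # FileUtils.mkdir_p("data/blah") creates data\blah on Windows, and
    # File.open(key, "wb") then opens a file inside it, since the Win32
    # file APIs accept '/' as a path separator.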
lowflyinghawk
« Reply #2 on: March 05, 2007, 08:29:06 PM »
Here is a cute thing I did run into: as everybody knows, the list of characters that are illegal in filenames differs between Windows and Linux. AFAIK only \0 and '/' are illegal in a filename on Linux. So I have a file in my test bucket named 'data/ x" y ', i.e. with a trailing space and a double-quote. The double-quote is illegal on Windows, which is reported by raising Errno::EINVAL when you try to open the file. If you don't already catch that, you might want to take a look at your code.

Symlinks are another issue: you get a NotImplementedError if you call File.symlink() on Windows. Annoyingly, File.respond_to?(:symlink) still returns true.
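If you want to guard against both of those, something like this might do; a minimal sketch, and save_object / symlinks_supported? are hypothetical helpers of mine, not part of s3sync:

    require 'fileutils'

    # Hypothetical helper: write an object to disk, tolerating names
    # that are legal on Linux but not on Windows.
    def save_object(key, data)
      FileUtils.mkdir_p(File.dirname(key))
      File.open(key, "wb") { |f| f.write(data) }
    rescue Errno::EINVAL
      # Raised on Windows when the name contains an illegal character
      # such as a double-quote or a colon.
      warn "filename illegal on this platform: #{key.inspect}"
    end

    # File.respond_to?(:symlink) is true even where the call itself just
    # raises NotImplementedError, so the only reliable test is to try it.
    def symlinks_supported?
      probe = "symlink_probe_#{$$}"
      File.symlink(__FILE__, probe)
      File.delete(probe)
      true
    rescue NotImplementedError
      false
    end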
ferrix
« Reply #3 on: March 06, 2007, 04:52:55 AM »
I don't much care about crossing platforms with a particular set of data. The important thing is that whatever platform you use, the data round-trips as expected. Once you start juggling the same data from Linux, to S3, and then to Windows, that's not my problem.
lowflyinghawk
« Reply #4 on: March 06, 2007, 06:22:26 AM »
I'm surprised you haven't run into it already. There are lots of reasons to use something like rsync, and one of them is to replicate a tree from one system to another; this comes up with trees of static web content. Of course people don't usually have perverse file names on purpose, but there are obvious ones, e.g. mm:dd:yyyy.log, that work fine on unix-like systems but not on Windows (colons are illegal there).
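One possible workaround is to sanitize each path component before writing locally; a sketch only, where the substitution scheme and the local_name_for helper are mine, not s3sync's:

    # Characters illegal in Windows filenames (trailing spaces and dots
    # are also rejected by Windows, but are not handled here).
    WINDOWS_ILLEGAL = /[<>:"\\|?*\x00-\x1f]/

    # Hypothetical helper: map each component of the key to a name that
    # is legal on Windows, preserving the directory structure.
    def local_name_for(key)
      key.split("/").map { |part| part.gsub(WINDOWS_ILLEGAL, "_") }.join("/")
    end

    puts local_name_for("logs/03:06:2007.log")   # => "logs/03_06_2007.log"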
Something similar to symlinks (called by some other, much more baroque name, in typical MS fashion) also exists in NTFS. In fact there are two different things: shell links (which take two files per link) and "reparse points" (much closer to symlinks), but AFAIK Ruby supports neither of them, nor NTFS alternate data streams.
ferrix
« Reply #5 on: March 06, 2007, 11:13:07 AM »
Sure, but your scenario still works as long as one of the systems is S3. Why would you move [platform A] => S3 => [platform B]? Why not just rsync between the two platforms? The point of s3sync is that one of your endpoints is S3... not that you are using S3 as a cache between the two endpoints.
lowflyinghawk
« Reply #6 on: March 06, 2007, 09:05:07 PM »
Heh, you are limiting yourself unnecessarily ;-).

Example: I want to share a bunch of pics and also back them up. I'm behind a NAT router and so is the sharee. Sure, I can set up ways to tunnel, mess with the routers on both ends, etc., but I'm lazy. I put the files on S3 (so the backup function is done already), and restore them on the other machine; no problem with firewalls, NAT, etc., done. The first time it's just cp, cp, cp. But after that? Maybe I don't want to be careful about what's been sent and what hasn't. Voila, no problem with s3sync... sync up to S3 and then sync down at the other end. The same goes for any set of files; it needn't be pics.

Example: I have some files I want kept very, very safe, but I want somebody else to be able to access them if needed, for instance if I die. Again I set up a script to do the sync down, and another to decrypt the files with a key provided when the time comes. The sharee has an Amazon account and I set up the ACLs accordingly. I don't worry about security because the files are encrypted, and I can change the contents of the bag any time I please. Again, the A->B->C setup is far easier for me to manage than trying to set up a direct A->C.

Now granted, the above takes some savvy to set up too, but all the smarts can be at my end: I set up the other end with a batch file to do the sync, say "double-click here", and we are done. A direct rsync A<->B would be a nightmare to set up by comparison, and there is always the possibility that the ISP's firewall rules won't let it pass. It avoids simpler failures too; S3 is up 24/7, and one disk crash isn't going to put them out of business (we now pause to say goodbye to the svn server I had on the spare box...).

In other words, I think rsync + REST opens possibilities that rsync alone doesn't.
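For the record, the round trip I'm describing would look something like this in Ruby terms; a sketch only, since this thread shows just the get side — the put signature here mirrors the old AWS sample library, so treat it as an assumption:

    require 'find'

    # Sync up: walk the local tree and push each file to S3. @conn and
    # bucket are the same objects as in the first post; S3::S3Object and
    # the four-argument put are assumptions based on the AWS sample lib.
    Find.find("pics") do |path|
      next unless File.file?(path)
      File.open(path, "rb") do |f|
        @conn.put(bucket, path, S3::S3Object.new(f.read), {})
      end
    end

    # Sync down on the other machine: mkdir_p the dirname of each key,
    # then stream into the file with @conn.get, exactly as shown above.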
ferrix
« Reply #7 on: March 07, 2007, 04:27:46 AM »
OK, so... for the sake of argument, what's the "correct" way to handle this, then?
ferrix
« Reply #8 on: March 07, 2007, 11:56:57 AM »
I split your reply and put it in feature requests, because I like your suggestions and want to make sure my stuff follows those recommendations in the future.