lowflyinghawk
« Reply #2 on: February 20, 2007, 08:15:30 AM »
yes, I agree with you about the obstacles... I've been thinking about it for a while but no finished idea has bloomed. it's possible a filesystem could make it all simple, but I'm not holding my breath... s3 definitely has some issues when you want it to pretend to be a hard drive ;-).
the "find the hardlinks" code might be useful to warn users in a log message, e.g. "files 'x' and 'y.blah' are hardlinks, data will be duplicated".
memory usage: yes, if you had a full filesystem backed up using hardlinks this would be a problem, but there is no way to find all the links without looking at all the files. I suppose one might iterate over all the files one inode at a time, spit out the result, then go again, but frankly I'm too lazy to go far with that... I have around 50,000 files in my HOME directory, and even if all of them were paired hardlinks the memory usage wouldn't be enough to get excited about. if you had a million files it would be an issue, but if you have filesystems like that you have bigger issues to think about.
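to make the memory point concrete, here is a rough sketch of the "find the hardlinks" idea. it is not the code discussed in this thread: `hardlink_groups` is a made-up name, and it assumes a unix-like filesystem where `File.lstat` reports the device, inode and link count. it still has to look at every file, but it only remembers the ones whose link count is above 1, so memory grows with the number of hardlinked files rather than with the total.

```ruby
require "find"

# Hypothetical helper (not from s3sync): group the regular files under +root+
# that share an inode. Files with a link count of 1 are never stored.
def hardlink_groups(root)
  groups = Hash.new { |hash, key| hash[key] = [] }
  Find.find(root) do |path|
    st = File.lstat(path)                 # lstat: do not follow symlinks
    next unless st.file? && st.nlink > 1  # only multiply-linked regular files
    groups[[st.dev, st.ino]] << path      # inode numbers are unique per device
  end
  # A file whose other links live outside +root+ ends up alone; drop those.
  groups.values.select { |paths| paths.size > 1 }
end
```

each group could then feed the warning, e.g. `hardlink_groups(dir).each { |g| warn "files #{g.join(' ')} are hardlinks, data will be duplicated" }`.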
me: I've been fooling around with s3 for two reasons, 1) because I need to back up my pics, and 2) because I wanted to learn ruby (or python, but ruby won out). I started out with a little app, no classes, no rubyisms, etc, and then it grew like Topsy. that left me with a big app with one big class that did most of the work, a bunch of little helper classes, and 4,000 commandline switches. I finally decided I didn't like that much, so I refactored the whole thing into a bunch of obvious classes (bucket, service, ...) and smaller focused apps, e.g. s3mkbucket, s3rm, etc. now the code uses 'yield' and blocks idiomatically, and as a bonus it doesn't fill up huge arrays with interim results, and the utilities are much more typically unix-like (small, focused scripts). I learned quite a bit about ruby vs c++ in the process, and I ended up with some useful gadgets. fooling around with it also made me relearn some css and html so I could use s3 as a webserver for pics and whatnot.
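for anyone wondering what "yield instead of huge arrays" looks like, here is a toy sketch. `FakeBucket` and `each_key` are invented for illustration and make no s3 requests; the in-memory list just stands in for a listing that would really arrive a page at a time. the point is only the shape: the caller's block sees each key as it comes, and nothing collects the full result.

```ruby
# Hypothetical stand-in for a bucket listing; no real S3 call is made here.
class FakeBucket
  def initialize(keys, page_size = 2)
    @keys = keys
    @page_size = page_size
  end

  # Hand each key to the caller's block as its page is processed,
  # rather than building one big array of results first.
  def each_key
    return enum_for(:each_key) unless block_given?
    @keys.each_slice(@page_size) do |page|  # pretend each slice is one response
      page.each { |key| yield key }
    end
  end
end
```

usage would be something like `FakeBucket.new(%w[a.jpg b.jpg c.jpg]).each_key { |k| puts k }`.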
I like participating in the s3sync discussions because I've learned a lot by doing so, even if I did occasionally broadcast my ignorance (e.g. SSL X.509 certs).
why not s3sync? as I said, one goal was to learn ruby, and you can't really do that by just looking at code. what I ended up with is not really rsync-like, although it performs many of the same functions. for example, none of my gadgets generate their own list of keys to archive, retrieve, etc. on the other hand my s3archive *does* look before leaping, i.e. it doesn't just blindly copy bits without checking what is there first. my s3get is the same way: it looks first and only retrieves if necessary.
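one way to "look before leaping" is sketched below. this is a guess at the general shape, not the actual s3archive code: `needs_upload?` is a made-up name, and it assumes the stored object's ETag is the hex MD5 of its contents, with `nil` meaning the key isn't there at all.

```ruby
require "digest/md5"

# Hypothetical check: true when the local file should be sent.
# +remote_etag+ is assumed to be the hex MD5 of the stored object
# (possibly wrapped in double quotes), or nil if the key does not exist.
def needs_upload?(local_path, remote_etag)
  return true if remote_etag.nil?
  local_md5 = Digest::MD5.file(local_path).hexdigest
  local_md5 != remote_etag.delete('"')
end
```

the same comparison works in the other direction for a get: skip the download when the local copy already matches.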
my stuff does do some things I doubt s3sync does. for example, I can look at ACLs either as xml or in summary format, and I can use canned ACLs to set permissions or selectively modify the xml (via REXML::Document) to change permissions for a single grantee on one or more keys or buckets. in other words, I wrapped the ACL-related code in S3.rb and turned it into some utilities.
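a rough idea of the "modify the xml for a single grantee" step, using REXML. this is a sketch under assumptions, not my utility or anything from S3.rb: `set_permission` is a made-up name, and the ACL document is assumed to be a simplified, namespace-free `AccessControlPolicy` where each `Grant` holds a `Grantee/ID` and a `Permission`. real responses carry namespaces that would need extra handling.

```ruby
require "rexml/document"

# Hypothetical helper: change the permission of one grantee in an ACL document
# and return the modified XML as a string. Other grants are left untouched.
def set_permission(acl_xml, grantee_id, permission)
  doc = REXML::Document.new(acl_xml)
  doc.elements.each("//Grant") do |grant|
    id = grant.elements["Grantee/ID"]
    next unless id && id.text == grantee_id  # skip grants for other grantees
    grant.elements["Permission"].text = permission
  end
  doc.to_s
end
```

the returned string would then be what gets sent back as the new ACL for the key or bucket.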