I've been trying to use s3sync to get results back from jobs run on Elastic MapReduce. I set up my bucket and key like so:
wd-emr:results using Panic Transmit.
Now I get the error reported below. Digging around, I found this discussion:
http://s3sync.net/forum/index.php?topic=190.0 However, this is *EXACTLY* what Elastic MapReduce needs to function: it asks for an output S3 bucket and key, such as wd-emr:results,
but this is exactly the kind of key that causes problems for s3sync. I look forward to your comments. The obvious (but not palatable) solution is to use independent buckets, such as wd-emr-results, wd-emr-logs and so on. I'm not happy with this idea though; it seems to needlessly multiply buckets.
s3Prefix cache
localPrefix /Users/wdavies/emr/tmp/cache
s3TreeRecurse wd-emr cache
Creating new connection
Trying command list_bucket wd-emr max-keys 200 prefix cache delimiter / with 100 retries left
Response code: 200
prefix found: /
s3TreeRecurse wd-emr cache /
Trying command list_bucket wd-emr max-keys 200 prefix cache/ delimiter / with 100 retries left
Response code: 200
S3 item cache/
s3 node object init. Name: Path:cache Size:10 Tag:7fcfa567f82df949ccc30da8c6454e72 Date:Tue Sep 22 19:02:05 UTC 2009
local node object init. Name: Path:/Users/wdavies/emr/tmp/cache/ Size:38 Tag:d66759af42f282e1ba19144df2d405d0 Date:Wed Sep 23 22:48:11 UTC 2009
source:
dest:
Update node
Trying command get_stream wd-emr cache #<File:0x61b28c> with 100 retries left
Response code: 404
S3 command failed:
get_stream cache #<File:0x61b28c>
With result 404 Not Found
#<Net::ReadAdapter:0x6168f4>
s3sync/s3sync.rb:645:in `unlink': Operation not permitted - /Users/wdavies/emr/tmp/cache/ (Errno::EPERM)
from s3sync/s3sync.rb:645:in `updateFrom'
from s3sync/s3sync.rb:393:in `main'
from s3sync/s3sync.rb:735
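
One possible reading of the trace above is that the key "cache" exists in the bucket as a folder-style placeholder object, while the local path of the same name is a directory, so s3sync tries to replace the directory with a file and the unlink fails. A minimal Ruby sketch for spotting such keys in a flat key listing follows. The helper name folder_marker_keys and the sample key names are hypothetical and not part of s3sync; the sketch only inspects key names and does not talk to S3.

```ruby
# Hypothetical helper, not part of s3sync: flag keys that look like
# "folder marker" objects in a flat list of S3 key names.
def folder_marker_keys(keys)
  keys.select do |key|
    base = key.chomp("/")
    # A key is suspicious if it ends in "/" itself, or if other keys
    # live "underneath" it (base + "/" + something).
    key.end_with?("/") ||
      keys.any? { |other| other != key && other.start_with?(base + "/") }
  end
end

# Illustrative key names only; these are assumptions, not a real listing.
sample_keys = ["cache/", "cache/part-00000", "logs/run1.txt"]
puts folder_marker_keys(sample_keys).inspect
```

Any key this reports is a candidate for the file-versus-directory clash seen in the trace, assuming that reading of the error is correct.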