How do I restore an entire S3 bucket from Glacier permanently?

A while ago, when the price difference between Standard storage and Glacier was closer to 10:1 than 3:1, I moved a couple of buckets completely to Glacier using a lifecycle policy. Admittedly, I hadn't investigated how to reverse that process permanently.
I know the documentation states that I would have to "use the copy operation to overwrite the object as a Standard or RRS object", but I guess I'm unclear what that looks like. Do I just copy and paste within that bucket?

Restoration can only be done at the Object level, not the Bucket or Folder level. When an object is restored, it is available in Amazon S3 for the period requested (e.g. two days), then reverts to only being in Glacier. You'll need to copy the objects out of the bucket during that time to keep a copy of them in S3.
The easiest method would be to use a utility to restore multiple Objects for you, e.g.:
How to restore whole directories from Glacier?
Also see Stack Overflow: How to restore folders (or entire buckets) to Amazon S3 from Glacier?
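If you'd rather script it yourself, here is a minimal sketch of that restore-then-copy pattern using boto3. The bucket name, restore window, and retrieval tier are placeholders rather than anything from the question; copying an object onto itself with a new storage class is what makes the change permanent.

import boto3

s3 = boto3.client("s3")
BUCKET = "my-archived-bucket"  # placeholder bucket name

# Step 1: request a temporary restore for every GLACIER-class object.
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET):
    for obj in page.get("Contents", []):
        if obj.get("StorageClass") == "GLACIER":
            s3.restore_object(
                Bucket=BUCKET,
                Key=obj["Key"],
                RestoreRequest={"Days": 2, "GlacierJobParameters": {"Tier": "Standard"}},
            )

# Step 2 (hours later, once a restore has finished): copy the object
# onto itself with a new storage class to move it out of Glacier for good.
def make_permanent(key):
    head = s3.head_object(Bucket=BUCKET, Key=key)
    if 'ongoing-request="false"' in head.get("Restore", ""):
        s3.copy_object(
            Bucket=BUCKET,
            Key=key,
            CopySource={"Bucket": BUCKET, "Key": key},
            StorageClass="STANDARD",
        )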

Related

How do I move an S3 "Deep Glacier Archive" object?

I have a number of "Deep Glacier Archive" class objects in the root level of my Amazon S3 bucket.
As the number of objects grows, I've added some top-level folders to the same bucket that I'd like to move the other objects into for organizational reasons. While I can add new objects to these folders, I've noticed that the "Move" action is grayed out when I have existing objects selected.
Is there a way that I can move these glacier objects into the other folders in the same bucket? (I'm using the Amazon AWS S3 web console interface.)
Objects cannot be 'moved' in Amazon S3. A 'move' actually involves performing a copy and then a delete.
The S3 Management Console is unable to move/copy an object with a Glacier storage class because the data is not immediately available. Instead, you should:
1. Restore the object (charges might apply)
2. Once restored, perform the move/copy
You have to first restore the objects and wait around 48 hours until the process completes (you can do that directly from the Management Console). Once it is done, you should see the download button enabled in the console, along with a countdown of the days you set them to be available.
Then you can move them using the AWS CLI with:
aws s3 mv "s3://SOURCE" "s3://DEST" --storage-class DEEP_ARCHIVE --force-glacier-transfer
I don't think it's possible to move them directly from the Management Console, even after the restoration.
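If you want to script the restore step rather than click through the console, a boto3 call along these lines should work (the bucket, key, and 7-day window are just illustrative):

import boto3

s3 = boto3.client("s3")

# Request a temporary restore of one Deep Archive object. The Standard
# tier typically completes within 12 hours; Bulk can take up to 48.
s3.restore_object(
    Bucket="my-bucket",
    Key="old/archive.bin",
    RestoreRequest={"Days": 7, "GlacierJobParameters": {"Tier": "Standard"}},
)

Once the restore completes, the aws s3 mv command above (with --force-glacier-transfer) performs the actual move.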

Approach to move file from s3 to s3 glacier

I need to create a Python Flask application that moves a file from S3 storage to S3 Glacier. I cannot use a lifecycle policy to do this because I need to use Glacier Vault Lock, which isn't possible with the lifecycle method, since I wouldn't be able to use any Glacier features on those files. The files will be multiple GBs in size, so I need to download these files and then upload them to Glacier. I was thinking of adding a script on EC2 that will be triggered by Flask and will start downloading and uploading files to Glacier.
This is the only solution I have come up with and it doesn't seem very efficient but I'm not sure. I am pretty new to AWS so any tips or thoughts will be appreciated.
Not posting any code as I don't really have a problem with the coding, just the approach I should take.
It appears that your requirement is to use Glacier Vault Lock on some objects to guarantee that they cannot be deleted within a certain timeframe.
Fortunately, similar capabilities have recently been added to Amazon S3, called Amazon S3 Object Lock. This works at the object or bucket level.
Therefore, you could simply use Object Lock instead of moving the objects to Glacier.
If the objects will be infrequently accessed, you might also want to change the storage class to something cheaper before locking them.
See: Introduction to Amazon S3 Object Lock - Amazon Simple Storage Service
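As a rough illustration of what that looks like in code, here is a hedged boto3 sketch; the bucket name, key, and retention period are placeholders, and the bucket must have been created with Object Lock enabled:

import boto3
from datetime import datetime, timedelta, timezone

s3 = boto3.client("s3")

# Upload an object to a bucket that was created with Object Lock enabled
# (ObjectLockEnabledForBucket=True). Names and body are placeholders.
s3.put_object(Bucket="locked-bucket", Key="records/file1.dat", Body=b"example")

# Apply a compliance-mode retention period: the object version cannot be
# deleted or overwritten until the retain-until date passes.
s3.put_object_retention(
    Bucket="locked-bucket",
    Key="records/file1.dat",
    Retention={
        "Mode": "COMPLIANCE",
        "RetainUntilDate": datetime.now(timezone.utc) + timedelta(days=365),
    },
)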

Moving files across folders in the same S3 bucket

In the same bucket I've folders named Backup1 and Backup2.
Files in Backup1 folder transition from S3 Standard storage to Glacier after 5 days.
Now from the S3 console, if I want to copy some of these files from Backup1 folder to Backup2:
Will I incur any charges if these files are less than 90 days old?
Will the copy be done from Glacier to Glacier or will it be from Glacier to S3 Standard?
Since this involves file copy from Glacier, will it be time consuming?
First, it's worth mentioning that there have been some recent updates to Amazon S3:
It is now possible to upload directly to the Glacier storage class, rather than having to specify a Lifecycle rule. This is good for files you want to archive immediately (not necessarily your use-case).
There is a new S3 Intelligent-Tiering Storage Class that will automatically move objects in/out of Glacier based on usage patterns.
Will I incur any charges if these files are less than 90 days old?
When you copy objects to Backup2, you will be storing another copy of the data. Therefore, you will be charged for this additional storage at whatever storage class it is using. When you restore objects from the Glacier storage class, they are temporarily stored as Reduced Redundancy Storage. You'll have to check what storage class is used after you copy the files to Backup2.
Will the copy be done from Glacier to Glacier or will it be from Glacier to S3 Standard?
To copy objects from Backup1, you will need to first Restore the object(s) from Glacier. Once they have been restored, they will be available in S3.
Since this involves file copy from Glacier, will it be time consuming?
The file copy itself is fast, but you will first need to restore from Glacier. You can choose how long this takes, depending on whether you wish to pay for an expedited retrieval (1-5 minutes), standard (3-5 hours), or bulk (5-12 hours).
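Put together, the whole sequence might look something like this in boto3 (a sketch only; the bucket and keys are placeholders based on your Backup1/Backup2 naming, and in practice you would poll rather than check once):

import boto3

s3 = boto3.client("s3")

# Request the restore, choosing a retrieval tier:
# "Expedited", "Standard", or "Bulk".
s3.restore_object(
    Bucket="my-bucket",
    Key="Backup1/file.dat",
    RestoreRequest={"Days": 3, "GlacierJobParameters": {"Tier": "Expedited"}},
)

# The Restore header on a HEAD request shows when the temporary copy is
# ready: ongoing-request flips from "true" to "false".
head = s3.head_object(Bucket="my-bucket", Key="Backup1/file.dat")
if 'ongoing-request="false"' in head.get("Restore", ""):
    s3.copy_object(
        Bucket="my-bucket",
        Key="Backup2/file.dat",
        CopySource={"Bucket": "my-bucket", "Key": "Backup1/file.dat"},
        StorageClass="STANDARD",
    )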

Using AWS Glacier as back-up

I have a website where I serve content that is stored on an AWS S3 bucket. As the amount of content grows, I have started thinking about back-up options. Using AWS Glacier came up as a natural route.
After reading up on it, I didn't understand whether it does what I intend to do with it. From what I have understood, using Glacier, you set lifecycle policies on objects stored in your S3 buckets. According to these policies, objects will be transferred to Glacier and deleted from your S3 bucket at a specific point in time after they have been uploaded to S3. At this point, the object's storage class changes to 'GLACIER'. Amazon explains that, once this is done, you can no longer access the objects through S3, but "their index entry will remain as is". At the same time, they say that retrieval of objects from Glacier takes 3-5 hours.
My question is: Does this mean that, once objects are transferred to Glacier, I will not be able to serve them on my website without retrieving them first? Or does it mean that they will still be served from the S3 bucket as usual but that, in case something happens with the files on S3 I will just be able to retrieve them in 3-5 hours? Glacier would only be a viable back up solution for me if users of my website would still be able to load content on the website after the correspondent objects are transferred to Glacier. Also, is it possible to have objects transferred to Glacier without them being deleted from the S3 bucket?
Thank you
To answer your question: Does this mean that, once objects are transferred to Glacier, I will not be able to serve them on my website without retrieving them first?
No, you won't be able to serve them on your website unless you first restore them from Glacier to the Standard or Standard-IA storage class, which takes 3-5 hours. Glacier is generally used to archive cold data, like old logs, that is only accessed in rare cases. So if you need real-time access to the objects, Glacier isn't a valid option for you.
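To make that concrete: a plain GET against an object whose storage class is GLACIER fails until a restore completes. A small illustration with boto3 (the bucket and key are made up):

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")

# Attempting to fetch an archived object fails with InvalidObjectState,
# which is also what your website would hit when trying to serve it.
try:
    s3.get_object(Bucket="my-site-assets", Key="images/photo.jpg")
except ClientError as err:
    if err.response["Error"]["Code"] == "InvalidObjectState":
        print("Object is archived in Glacier; restore it before serving.")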

Lifecycle policy on S3 bucket

I have an S3 bucket on which I've configured a Lifecycle policy which says to archive all objects in the bucket after 1 day(s) (since I want to keep the files in there temporarily but if there are no issues then it is fine to archive them and not have to pay for the S3 storage)
However, I have noticed there are some files in that bucket that were created in February...
So, am I right in thinking that if you select 'Archive' as the lifecycle option, that means "copy-to-glacier-and-then-delete-from-S3"? In which case the files left over from February would be a fault, since they haven't been deleted?
However, I saw there is another option - 'Archive and then Delete' - but I assume that means "copy-to-glacier-and-then-delete-from-glacier", which I don't want.
Has anyone else had issues with S3 -> Glacier?
What you describe sounds normal. Check the storage class of the objects.
The correct way to understand the S3/Glacier integration is that S3 is the "customer" of Glacier -- not you -- and Glacier is a back-end storage provider for S3. Your relationship is still with S3. (If you go into Glacier in the console, your stuff isn't visible there if S3 put it in Glacier.)
When S3 archives an object to Glacier, the object is still logically "in" the bucket and is still an S3 object, and visible in the S3 console, but can't be downloaded from S3 because S3 has migrated it to a different backing store.
The difference you should see in the console is that objects will have a storage class of Glacier instead of the usual Standard or Reduced Redundancy. They don't disappear from there.
To access the object later, you ask S3 to initiate a restore from Glacier, which S3 does... but the object is still in Glacier at that point, with S3 holding a temporary copy, which it will again purge after some number of days.
Note that your attempt at saving money may be a little off target if you do not intend to keep these files for three months: any time you delete an object from Glacier that has been stored for less than 90 days, you are billed for the remainder of that 90-day minimum.
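For reference, the 'Archive' behavior described above corresponds to a lifecycle rule with a Transition and no Expiration. A minimal boto3 sketch, assuming a placeholder bucket name:

import boto3

s3 = boto3.client("s3")

# A rule that transitions every object to the GLACIER storage class after
# 1 day, without ever expiring (deleting) it -- i.e. "Archive" only.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-temp-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-after-one-day",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},
                "Transitions": [{"Days": 1, "StorageClass": "GLACIER"}],
            }
        ]
    },
)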