Moving files across folders in the same S3 bucket - amazon-web-services

In the same bucket I have folders named Backup1 and Backup2.
Files in the Backup1 folder transition from the S3 Standard storage class to Glacier after 5 days.
Now, from the S3 console, if I want to copy some of these files from the Backup1 folder to Backup2:
Will I incur any charges if these files are less than 90 days old?
Will the copy be done from Glacier to Glacier or will it be from Glacier to S3 Standard?
Since this involves file copy from Glacier, will it be time consuming?

First, it's worth mentioning that there have been some recent updates to Amazon S3:
It is now possible to upload directly to the Glacier storage class, rather than having to specify a Lifecycle rule. This is good for files you want to archive immediately (not necessarily your use-case).
There is a new S3 Intelligent-Tiering Storage Class that will automatically move objects in/out of Glacier based on usage patterns.
Will I incur any charges if these files are less than 90 days old?
When you copy objects to Backup2, you will be storing another copy of the data. Therefore, you will be charged for this additional storage at whatever storage class it is using. When you restore objects from the Glacier storage class, they are temporarily stored as Reduced Redundancy Storage. You'll have to check what storage class is used after you copy the files to Backup2.
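If you want to verify this programmatically, here is a minimal boto3 sketch (the bucket and key names are placeholders, not from the question) that inspects the storage class of the copied object:
```python
# Sketch: check which storage class the copied object ended up in.
# "my-bucket" and the key are placeholder names for illustration only.
import boto3

s3 = boto3.client("s3")

resp = s3.head_object(Bucket="my-bucket", Key="Backup2/file.dat")
# S3 omits StorageClass from the response for plain S3 Standard objects.
print(resp.get("StorageClass", "STANDARD"))
```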
Will the copy be done from Glacier to Glacier or will it be from Glacier to S3 Standard?
To copy objects from Backup1, you will need to first Restore the object(s) from Glacier. Once they have been restored, they will be available in S3.
Since this involves file copy from Glacier, will it be time consuming?
The file copy itself is fast, but you will first need to restore from Glacier. You can choose how long this takes, depending on whether you wish to pay for an Expedited retrieval (1-5 minutes), Standard (3-5 hours), or Bulk (5-12 hours).
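For illustration, a minimal boto3 sketch of the restore-then-copy sequence (the bucket name, keys, restore window and retrieval tier are assumptions, not taken from the question):
```python
# Sketch: restore an archived object, then copy it once it is available.
import boto3

s3 = boto3.client("s3")

# Request a temporary restore of the Glacier object (available for 2 days here).
s3.restore_object(
    Bucket="my-bucket",
    Key="Backup1/file.dat",
    RestoreRequest={
        "Days": 2,
        "GlacierJobParameters": {"Tier": "Standard"},  # or "Expedited" / "Bulk"
    },
)

# ...wait for the restore to finish (poll head_object() and check "Restore")...

# Once restored, copy it to Backup2; without an explicit StorageClass the new
# object is written as S3 Standard.
s3.copy_object(
    Bucket="my-bucket",
    Key="Backup2/file.dat",
    CopySource={"Bucket": "my-bucket", "Key": "Backup1/file.dat"},
)
```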

Related

How do I only transition objects greater than 100MB to AWS Glacier from S3 Standard using AWS Lifecycle Management Policies?

I have 50TB of data in an S3 Standard bucket.
I want to transition objects that are greater than 100MB & older than 30 days to AWS Glacier using an S3 Lifecycle Policy.
How can I only transition objects that are greater than 100MB in size?
There is no way to transition items based on file size.
As the name suggests, S3 Lifecycle policies allow you to specify transition actions based on object lifetime - not file size - to move items from the S3 Standard storage class to S3 Glacier.
Now, a really inefficient & costly way that may be suggested would be to schedule a Lambda to check the S3 bucket daily, see if anything is 30 days old & then "move" items to Glacier.
However, the Glacier API does not allow you to move items from S3 Standard to Glacier unless it is through a lifecycle policy.
This means you will need to download the S3 object and then re-upload the item again to Glacier.
I would still advise having a Lambda run daily, but instead have it check the size and age of items and use another folder (prefix) called archive, for example. If there are any items older than 30 days and greater than 100MB, copy the item from the current folder to the archive folder and then delete the original item.
Set a 0-day lifecycle policy, filtered on the prefix of the other folder (archive), which then transitions the items to Glacier ASAP.
This way, you will be able to transition items larger than 100MB after 30 days, without paying the higher per-request charges associated with uploading items directly to Glacier, which might otherwise cost more than the savings you were aiming for in the first place.
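If it helps, here is a rough boto3 sketch of such a daily Lambda (the bucket name and the archive/ prefix are assumptions; error handling is omitted):
```python
# Sketch: copy objects older than 30 days and larger than 100 MB into an
# "archive/" prefix, then delete the originals.
from datetime import datetime, timedelta, timezone
import boto3

BUCKET = "my-backup-bucket"          # assumed bucket name
SIZE_THRESHOLD = 100 * 1024 * 1024   # 100 MB
AGE_THRESHOLD = timedelta(days=30)

s3 = boto3.client("s3")

def lambda_handler(event, context):
    cutoff = datetime.now(timezone.utc) - AGE_THRESHOLD
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=BUCKET):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.startswith("archive/"):
                continue  # already moved
            if obj["Size"] > SIZE_THRESHOLD and obj["LastModified"] < cutoff:
                # Copy into the archive/ prefix, then delete the original.
                # (Objects larger than 5 GB would need a multipart copy.)
                s3.copy_object(
                    Bucket=BUCKET,
                    Key="archive/" + key,
                    CopySource={"Bucket": BUCKET, "Key": key},
                )
                s3.delete_object(Bucket=BUCKET, Key=key)
```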
To later transition the object(s) back from Glacier to S3 Standard, use the RestoreObject API (or SDK equivalent) to make them temporarily available again, copy them back into the original folder, and then finally delete the archived copy with a normal S3 DELETE on the archive key.
Create a Lambda that runs every day (as a cron/scheduled job) and checks for files older than 30 days and greater than 100 MB in the bucket. You can use the S3 API and Glacier API.
In the "Lifecycle rule configuration" there is (from Nov 23, 2021 - see References 1.) a "Object size" form field on which you can specify both the minimum and the maximum object size.
For the sake of completeness, by default Amazon S3 does not transition objects that are smaller than 128 KB for the following transitions:
From the S3 Standard or S3 Standard-IA storage classes to S3 Intelligent-Tiering or S3 Glacier Instant Retrieval.
From the S3 Standard storage class to S3 Standard-IA or S3 One Zone-IA.
References:
https://aws.amazon.com/about-aws/whats-new/2021/11/amazon-s3-lifecycle-storage-cost-savings/
https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-transition-general-considerations.html

S3 move current objects to Glacier Deep Archive storage class

I have 1.5TB of data in a bucket with the Standard storage class. I want to move all objects to the Glacier Deep Archive storage class.
S3 makes a copy (version) of an object in the Standard storage class if I move it to Glacier Deep Archive via the GUI (select objects -> Actions -> Edit storage class).
Do I get charged for both objects (versions?) or just for the Glacier one? Both versions are the same size.
I could use Lifecycle rules, but not all files were created >180 days ago.
Yes, you are charged for the whole volume, including the versions.
But instead of manually updating the storage class, a lifecycle policy may be the better option: you can archive the objects and specify that the noncurrent versions be deleted after some time. There is no need to wait 180 days.
https://aws.amazon.com/blogs/aws/amazon-s3-lifecycle-management-update/
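As an illustration, a minimal boto3 sketch of such a policy (the bucket name and the 30-day noncurrent-version retention are assumptions):
```python
# Sketch: transition current objects to Deep Archive as soon as possible and
# expire noncurrent versions after 30 days.
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="my-bucket",  # assumed bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "to-deep-archive",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # whole bucket
                "Transitions": [
                    {"Days": 0, "StorageClass": "DEEP_ARCHIVE"},
                ],
                "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
            }
        ]
    },
)
```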
I could use Lifecycle rules, but not all files were created >180 days ago.
You can transition them at any time; they don't have to be 180 days old. The 180 days is the minimum storage duration for Deep Archive: if you delete (or overwrite) objects before they have been in Deep Archive for 180 days, you are still charged for the remainder of those 180 days.
You'll get charged for both objects (versions). From S3 FAQs:
Normal Amazon S3 rates apply for every version of an object stored or requested

How to delete glacier object?

I have created an S3 lifecycle policy which will expire the current version of the object in 540 days.
I am a bit confused here whether it deletes the objects from S3 or Glacier.
If not, I want to delete the objects from the bucket in 540 days and from Glacier in some 4 years! How would I set that up?
Expiring an object means "delete it", regardless of its storage class.
So, if it has moved to a Glacier storage class, it will still be deleted.
When you store data in Glacier via S3, then the object is managed by Amazon S3 (but stored in Glacier). Thus, the lifecycle rules apply.
If, however, you store data directly in Amazon Glacier (without going via Amazon S3), then the data would not be impacted by the lifecycle rules, nor would it be visible in Amazon S3.
Bottom line: Set your rules for deletion based upon the importance of the data, not its current storage class.
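For example, a minimal boto3 sketch of the 540-day expiration described in the question (the bucket name and empty prefix are assumptions):
```python
# Sketch: expire (delete) current objects 540 days after creation, regardless
# of whether they have already transitioned to a Glacier storage class.
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="my-bucket",  # assumed bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-after-540-days",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # applies to the whole bucket
                "Expiration": {"Days": 540},
            }
        ]
    },
)
```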

How do I restore an entire S3 bucket from Glacier permanently?

A while ago, when the price difference between Standard storage and Glacier was closer to 10:1 than 3:1, I moved a couple of buckets completely to Glacier using a lifecycle policy. Admittedly, I hadn't investigated how to reverse that process permanently.
I know the documentation states that I would have to "use the copy operation to overwrite the object as a Standard or RRS object", but I guess I'm unclear what that looks like. Do I just copy and paste within that bucket?
Restoration can only be done at the Object level, not Bucket or Folder. When an object is restored, it is available in Amazon S3 for the period requested (e.g. 2 days), then reverts to only being in Glacier. You'll need to copy the objects (e.g. overwrite them as Standard objects) during that time to keep a permanent copy of them in S3.
The easiest method would be to use a utility to restore multiple Objects for you, e.g.:
How to restore whole directories from Glacier?
Also, see Stack Overflow: How to restore folders (or entire buckets) to Amazon S3 from Glacier?
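If you would rather script it yourself than use a utility, here is a rough boto3 sketch of the two-step process (the bucket name, restore window and retrieval tier are assumptions):
```python
# Sketch: step 1 - request a temporary restore for every Glacier object.
import boto3

s3 = boto3.client("s3")
BUCKET = "my-archived-bucket"  # assumed bucket name

paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET):
    for obj in page.get("Contents", []):
        if obj.get("StorageClass") != "GLACIER":
            continue
        s3.restore_object(
            Bucket=BUCKET,
            Key=obj["Key"],
            RestoreRequest={"Days": 7, "GlacierJobParameters": {"Tier": "Bulk"}},
        )

# Step 2 (run later, once the restores have completed): copy each object over
# itself with a Standard storage class to make the change permanent, e.g.:
#   s3.copy_object(
#       Bucket=BUCKET, Key=key,
#       CopySource={"Bucket": BUCKET, "Key": key},
#       StorageClass="STANDARD",
#       MetadataDirective="COPY",
#   )
```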

Expiry date for Glacier backups

Is there a way to set an expiry date in Amazon Glacier? I want to copy in weekly backup files, but I don't want to hang on to more than 1 year's worth.
Can the files be set to "expire" after one year, or is this something I will have to do manually?
While not available natively within Amazon Glacier, AWS has recently enabled Archiving Amazon S3 Data to Amazon Glacier, which makes working with Glacier much easier in the first place:
[...] Amazon S3 was designed for rapid retrieval. Glacier, in contrast, trades off retrieval time for cost, providing storage for as little as $0.01 per Gigabyte per month while retrieving data within three to five hours.
How would you like to have the best of both worlds? How about rapid retrieval of fresh data stored in S3, with automatic, policy-driven archiving to lower cost Glacier storage as your data ages, along with easy, API-driven or console-powered retrieval? [emphasis mine]
[...] You can now use Amazon Glacier as a storage option for Amazon S3.
This is enabled via Amazon S3 Object Lifecycle Management, which not only drives the aforementioned Object Archival (transitioning objects to the Glacier storage class) but also includes optional Object Expiration, which allows you to achieve what you want, as outlined in the section Before You Decide to Expire Objects within Lifecycle Configuration Rules:
The Expiration action deletes objects
You might have objects in Amazon S3 or archived to Amazon Glacier. No matter where these objects are, Amazon S3 will delete them. You will no longer be able to access these objects. [emphasis mine]
So at the small price of having your objects stored in S3 for a short time (which actually eases working with Glacier a lot due to removing the need to manage archives/inventories) you gain the benefit of optional automatic expiration.
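To make that concrete, a minimal boto3 sketch of a lifecycle configuration along those lines (the bucket name, prefix and day counts are assumptions):
```python
# Sketch: archive weekly backups to Glacier shortly after upload and delete
# them after one year.
import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="my-backup-bucket",  # assumed bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "weekly-backups",
                "Status": "Enabled",
                "Filter": {"Prefix": "backups/"},  # assumed prefix
                "Transitions": [
                    {"Days": 1, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 365},
            }
        ]
    },
)
```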
You can do this in the AWS Command Line Interface.
http://docs.aws.amazon.com/AmazonS3/latest/dev/object-lifecycle-mgmt.html