AWS S3: delete all objects, or only those within a given date range - amazon-web-services

I am having a hard time deleting my bucket jananath-logs-bucket-new. It holds over 70 TB of data, with files going back to 2019, and I need to delete the entire bucket.
I tried deleting the bucket from the console, but because it contains so many small files (over 50 million), it takes a very long time and the browser UI hangs. So I thought I would let AWS do it for me.
I tried lifecycle rules and created these two:
delete-all-from-start
delete-all-from-start-2
(Screenshots of each rule's configuration and of the resulting rule list were attached here.)
But my objects are not being deleted.
I set the number of days in every field to 1, thinking it would delete everything from 2019 onward (when the first object was created).
Can someone help me with this?
How can I delete all the objects in the bucket, going back to 2019?
Is it possible to delete only the objects within a date range, say 2020 to 2021?
Thank you,
Have a great day!

According to the documentation, a lifecycle policy is a valid way to empty a bucket. Note that there may be a delay before expired objects are actually removed:
When an object reaches the end of its lifetime based on its lifecycle
policy, Amazon S3 queues it for removal and removes it asynchronously.
There might be a delay between the expiration date and the date at
which Amazon S3 removes an object.
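As a rough illustration of the lifecycle approach, here is a minimal Python sketch that builds two rules covering the whole bucket: one expiring current and noncurrent versions, and one cleaning up expired delete markers and incomplete multipart uploads. The rule IDs and function names are made up for this example, and apply_rules assumes boto3 is installed and credentials are configured. Note that put_bucket_lifecycle_configuration replaces any lifecycle configuration already on the bucket.

```python
def build_empty_bucket_rules(days=1):
    """Return two lifecycle rules that together expire everything in a bucket.

    The rule IDs are hypothetical names chosen for this sketch.
    """
    whole_bucket = {"Prefix": ""}  # an empty prefix matches every key
    return [
        {
            "ID": "expire-all-versions",
            "Status": "Enabled",
            "Filter": whole_bucket,
            # Expire current versions N days after creation.
            "Expiration": {"Days": days},
            # Permanently delete versions N days after they become noncurrent.
            "NoncurrentVersionExpiration": {"NoncurrentDays": days},
        },
        {
            "ID": "clean-up-markers-and-uploads",
            "Status": "Enabled",
            "Filter": whole_bucket,
            # Remove delete markers that have no remaining noncurrent versions.
            "Expiration": {"ExpiredObjectDeleteMarker": True},
            # Abort multipart uploads that were never completed.
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": days},
        },
    ]


def apply_rules(bucket, rules):
    """Install the rules on a bucket (overwrites the existing configuration)."""
    import boto3  # third-party; imported lazily so the builder works without it

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={"Rules": rules},
    )
```

Usage would look like apply_rules("jananath-logs-bucket-new", build_empty_bucket_rules()). Because removal is asynchronous, objects may remain visible for some time after the rules are in place.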

Related

AWS S3 Lifecycle

I've been exploring AWS S3 Lifecycle techniques and found that the best way to delete S3 files more than 60 days old is to configure a rule through the GUI.
However, I don't want to delete ALL files older than 60 days. For example, I'd like to keep all HTML files in the bucket, even those older than 60 days.
I've found that a prefix can be entered to limit the scope of the lifecycle rule to specific files; however, this would require me to enter ALL files EXCEPT the HTML ones. We have hundreds of files, so this will take forever.
I was wondering if anyone knew of an easier way? For example, I would like to just exclude all *.html files from the lifecycle rule.
There is no way to exclude objects from a rule.
You can rearrange the objects in your bucket so that the rule applies only to objects under a specific prefix ("folder").
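The rearranging idea could be sketched as follows: plan a move of every non-HTML key under a dedicated prefix, then scope the expiration rule to that prefix. The prefix "expiring/", the rule ID, and the function names are assumptions made for this illustration. One thing to keep in mind is that copying an object to a new key creates a new object, so its age for lifecycle purposes would start again from the copy date.

```python
def plan_moves(keys, keep_suffix=".html", expiring_prefix="expiring/"):
    """Map each key that should expire to its new key under expiring_prefix.

    Keys ending in keep_suffix, and keys already under the prefix, are left alone.
    """
    moves = {}
    for key in keys:
        if key.lower().endswith(keep_suffix) or key.startswith(expiring_prefix):
            continue
        moves[key] = expiring_prefix + key
    return moves


def prefix_expiry_rule(prefix="expiring/", days=60):
    """Build a lifecycle rule that expires only objects under the given prefix."""
    return {
        "ID": "expire-" + prefix.strip("/"),  # hypothetical rule name
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},
        "Expiration": {"Days": days},
    }
```

For example, plan_moves(["index.html", "logs/a.txt"]) leaves index.html where it is and maps logs/a.txt to expiring/logs/a.txt. The actual copy and delete of each object is left out of this sketch.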

Delete all version of S3 object using lifecycle rule

I have an S3 bucket with versioning enabled that contains multiple folders.
I want to completely delete one of these folders, as it contains many delete markers.
I am using a lifecycle rule to delete the objects, but I am not sure whether it will work for a specific folder.
If I specify folder_name/ as the prefix in the lifecycle rule and set expiration to 1 day after creation for both current and previous versions, will it delete all the objects and their versions?
Can someone please confirm?
The other folders are quite critical, so I can't experiment with the rule to test it.
I can confirm that you can delete at the folder level instead of the entire bucket. We have a rule that does exactly the same thing (although with 7 days instead of 1). I will echo John's point that after the initial setup, the deletion will take time. You should see progress STARTING within 1 hour, but actual completion may take a while.

Cheapest way to delete 2 billion objects from S3 IA

I have a bucket in S3 (Infrequent Access) containing 2 billion objects. It is too big to delete in the console or over the API without taking years.
I can create a lifecycle rule to expire and delete the objects but the calculator predicts this will cost me >$20,000. Is that correct? Is there a better way to delete a bucket?
I have a file effectively containing a list of all the objects in that bucket if that helps.
Update 2021:
An answer below from @MAP points out that there is now an "Empty" button. I haven't tested it yet, but it looks like the way to go (I'll accept that answer once tested).
If you have a list of all the objects, then you can certainly use the Multi-Object Delete action. Apparently this API is free. I would create an AWS Step Functions state machine to loop through the file and delete 1000 objects at a time; 1000 appears to be the limit per request.
It will take around 2M Step Functions transitions (2 billion objects / 1000 per request) to delete all the objects in the bucket. As per Step Functions pricing, that will cost around $50, plus around $1 for Lambda invocations, so roughly $51 in total.
Update
Using Lambda or Step Functions is probably not the most cost-effective option, because either way you will need to read the file (that contains the object keys) from some source such as S3. So I think running the script from a local machine, or in a screen session on any EC2 Linux instance, is the best option.
In 2021, anyone who comes across this question may benefit from knowing that the AWS console now provides an "Empty" button.
Select the bucket and click the "Empty" button, and all objects, versioned or not, will be deleted. Depending on the number of objects, this can take minutes to days.
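A script of the kind described here might look like the sketch below. It assumes the key list is a plain text file with one object key per line (the actual format of the asker's file is not stated), that boto3 is installed, and that credentials are configured; the function names are invented for this example. It groups keys into batches of 1000, the per-request limit mentioned above, and sends each batch in one delete_objects call.

```python
def read_keys(path):
    """Yield object keys from a text file, one key per line (assumed format)."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            key = line.rstrip("\n")
            if key:
                yield key


def batches(keys, size=1000):
    """Group an iterable of keys into lists of at most `size` items."""
    batch = []
    for key in keys:
        batch.append(key)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final partial batch
        yield batch


def delete_keys(bucket, keys):
    """Delete keys in batches of 1000 and return any per-key errors reported."""
    import boto3  # third-party; imported lazily so the helpers work without it

    s3 = boto3.client("s3")
    errors = []
    for batch in batches(keys):
        response = s3.delete_objects(
            Bucket=bucket,
            Delete={"Objects": [{"Key": k} for k in batch], "Quiet": True},
        )
        errors.extend(response.get("Errors", []))
    return errors
```

With 2 billion keys this comes to about 2 million requests, which matches the estimate above. In a versioned bucket, deleting by key alone only adds delete markers; removing the versions themselves would also require passing a VersionId for each object.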
Expiration lifecycle rules are free. From the original feature announcement:
As with standard delete requests, Amazon S3 doesn’t charge you for using Object Expiration.
Delete operations are free. You can create a lifecycle policy to automate a bulk delete.
I would start with a small number of objects and check the billing report to confirm 100% that the deletes are not charged, then go for the rest.

Does using tags for object-level deletion in AWS S3 work?

I need object-level automatic deletion after some time in my S3 bucket, but only for some objects. I want to accomplish this with a lifecycle rule that automatically deletes objects carrying a certain (Tag, Value) pair. For example, I add the tag pair (AutoDelete, True) to the objects I want to delete, and I have a lifecycle rule that deletes such objects after 1 day.
I ran some experiments to see whether objects get deleted using this technique, but so far my object has not been deleted. (It may get deleted soon??)
If anyone has experience with this technique, please let me know if it does not work, because my object is still there even though it is past its expiration date.
Yes, you can use S3 object lifecycle rules to delete objects having a given tag.
Based on my experience, lifecycle rules based on time are approximate, so it's possible that you need to wait longer. I have also found that complex lifecycle rules can be tricky -- my advice is to start with a simple test in a test bucket and work your way up.
There's decent documentation here on AWS.
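For reference, a tag-scoped rule like the one described in the question could be built as in this sketch; the rule ID and function names are made up for the example. The second helper reflects my reading of the S3 documentation, which says the expiration time is the creation time plus the number of days, rounded up to the next midnight UTC. That rounding, together with asynchronous removal, may explain an object that appears to be past its expiration date but is still present.

```python
from datetime import datetime, timedelta, timezone


def tag_expiry_rule(tag_key="AutoDelete", tag_value="True", days=1):
    """Build a lifecycle rule that expires only objects carrying the given tag."""
    return {
        "ID": f"expire-{tag_key}-{tag_value}",  # hypothetical rule name
        "Status": "Enabled",
        # Tag keys and values are matched exactly, including case.
        "Filter": {"Tag": {"Key": tag_key, "Value": tag_value}},
        "Expiration": {"Days": days},
    }


def earliest_expiry(created, days):
    """Estimate when an object becomes eligible for expiration.

    `created` must be a timezone-aware datetime. The result is creation time
    plus `days`, rounded up to the next midnight UTC (an interpretation of the
    documented behaviour, not a guarantee of when removal happens).
    """
    due = created.astimezone(timezone.utc) + timedelta(days=days)
    midnight = due.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight if due == midnight else midnight + timedelta(days=1)
```

Under this reading, an object created at 10:30 UTC with a 1-day rule would not become eligible until midnight UTC roughly a day and a half later, and the actual removal can lag behind that.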

Faster way to delete TB of data from GCP cloud storage

I want to delete 2 TB of files from a GCP bucket.
I have read the GCP documentation on deletion, and it says to use the gsutil -m rm command, but when I run it, it reports an estimated time of 400+ hours.
Is there any faster way to do the deletion process?
For buckets with a very large number of objects, one trick to deleting the contents is to use the Lifecycle Management feature. https://cloud.google.com/storage/docs/lifecycle
Set a lifecycle rule that triggers when the object is 0 days old and an action of "Delete", and that should cause GCS to begin deleting your objects for you. Note that this may still take a while, as lifecycle rules can take up to 24 hours to go into effect, but that's still a lot better than a couple of weeks.
You can configure the lifecycle policy on a bucket from the console:
Head to https://console.cloud.google.com/storage/browser
Find the bucket you want to configure, and click None in the Lifecycle column.
Click Add rule.
Select the condition (object is 0 days old or )
Select an action (Delete the object)
Click continue.
Click save.
See https://cloud.google.com/storage/docs/managing-lifecycles for more instructions.
N.B.: Lifecycle changes can take up to 24 hours to go into effect, so once all of your objects go away and you remove the lifecycle config setting, you should wait an additional 24 hours before putting any new files in the bucket, or else they might also get deleted.
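The same rule can also be expressed as a JSON lifecycle configuration instead of clicking through the console. The sketch below generates that JSON in Python; the function names and the output file name are choices made for this example. The resulting file could then be applied with gsutil lifecycle set lifecycle.json gs://BUCKET_NAME, where BUCKET_NAME is a placeholder.

```python
import json


def delete_all_lifecycle(age_days=0):
    """Build a GCS lifecycle config that deletes objects at least age_days old."""
    return {
        "rule": [
            {
                "action": {"type": "Delete"},
                "condition": {"age": age_days},
            }
        ]
    }


def write_lifecycle_file(path="lifecycle.json", age_days=0):
    """Write the lifecycle config to a JSON file and return the JSON text."""
    text = json.dumps(delete_all_lifecycle(age_days), indent=2)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return text
```

As with the console steps, the rule should be removed once the bucket is empty, keeping in mind the 24-hour delay before lifecycle changes take effect.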