Does using tags for object-level deletion in AWS S3 work? - amazon-web-services

I need object-level auto-deletion after some time in my S3 bucket, but only for some objects. I want to accomplish this with a lifecycle rule that auto-deletes objects carrying a certain (Tag, Value) pair. For example, I add the tag pair (AutoDelete, True) to objects I want deleted, and I have a lifecycle rule that deletes such objects after 1 day.
I ran some experiments to see if objects actually get deleted with this technique, but so far my object has not been deleted, even though it is past its expiration date. (It may get deleted soon??)
If anyone has experience with this technique, please let me know whether it works.

Yes, you can use S3 object lifecycle rules to delete objects having a given tag.
Based on my experience, time-based lifecycle rules are approximate, so it's possible that you need to wait longer. I have also found that complex lifecycle rules can be tricky -- my advice is to start with a simple test in a test bucket and work your way up.
There's decent documentation on lifecycle configuration in the AWS docs.
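For reference, here is a minimal sketch (boto3; the bucket name is a placeholder, the tag pair comes from the question) of a tag-filtered expiration rule:

```python
import boto3

s3 = boto3.client("s3")

# Expire objects tagged AutoDelete=True one day after creation.
# S3 expires objects asynchronously, so deletion may lag the date.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-bucket",  # placeholder
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-autodelete-tagged",
                "Status": "Enabled",
                "Filter": {"Tag": {"Key": "AutoDelete", "Value": "True"}},
                "Expiration": {"Days": 1},
            }
        ]
    },
)
```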

Related

AWS S3 delete all the objects or objects within a given date range

I am really having a hard time deleting my bucket jananath-logs-bucket-new. It has over 70 TB of data and I need to delete the entire bucket. It has files going back to 2019.
I tried deleting the bucket, but since it has many small files (over 50 million), it takes so much time that the UI (browser) hangs. So I thought, let AWS do it for me.
So I tried lifecycle rules and created these two rules:
delete-all-from-start
delete-all-from-start-2
(Screenshots of each rule's configuration and the resulting rule list were attached.)
But my objects are not deleted.
I set the number of days in each field to 1, thinking it would delete everything from 2019 onward (when the first object was created).
Can someone help me with this?
How can I delete all the objects in the bucket going back to 2019?
Is it possible to delete objects between a date range - say, from 2020-2021?
Thank you,
Have a great day!
According to the documentation, a lifecycle policy is a valid way to empty a bucket. Please note that there may be a delay before expiring objects are removed:
When an object reaches the end of its lifetime based on its lifecycle policy, Amazon S3 queues it for removal and removes it asynchronously. There might be a delay between the expiration date and the date at which Amazon S3 removes an object.
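For what it's worth, a minimal sketch (boto3; the bucket name comes from the question, the day counts are assumptions) of a single rule that expires everything in the bucket:

```python
import boto3

s3 = boto3.client("s3")

# One rule covering the whole bucket (an empty prefix matches everything).
# S3 removes expired objects asynchronously, so deletion is not instant.
s3.put_bucket_lifecycle_configuration(
    Bucket="jananath-logs-bucket-new",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "empty-bucket",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},
                "Expiration": {"Days": 1},
                "NoncurrentVersionExpiration": {"NoncurrentDays": 1},
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 1},
            }
        ]
    },
)
```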

How to mass / bulk PutObjectAcl in Amazon S3?

I am looking for a way to update several objects' ACLs in one (or a few) requests to the AWS API.
My web application stores several sensitive objects in AWS S3. These objects have a default ACL of "private". I sometimes need to switch several objects' ACLs to "public-read" for some time (a couple of minutes) before going back to "private".
For a couple of objects, one PutObjectAcl request per object is OK. But when dealing with several hundred objects, the operation takes too much time.
My question is: how can I "mass put object ACL" or "bulk put object ACL"? The AWS API doesn't seem to offer this, unlike DeleteObjects (which allows deleting several objects at once). But maybe I didn't look in the right place?!
Any tricks or workarounds would be of great value!
Mixing private and public objects inside a bucket is usually a bad idea. If you only need those objects to be public for a couple of minutes, you can create a pre-signed GET URL and set a desired expiration time.
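A minimal sketch (boto3; the bucket, key, and expiry are placeholders) of generating such a URL:

```python
import boto3

s3 = boto3.client("s3")

# Pre-signed GET URL, valid for 5 minutes; the object itself stays private.
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "my-bucket", "Key": "sensitive/report.pdf"},
    ExpiresIn=300,  # seconds
)
print(url)
```

Anyone holding the URL can fetch the object until it expires, so no ACL changes (and no later cleanup) are needed.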

Delete all versions of an S3 object using a lifecycle rule

I have an S3 bucket with multiple folders and versioning enabled.
Out of these folders, I want to completely delete one folder, as it has multiple delete markers.
I am using a lifecycle rule to delete the objects, but I am not sure it will work for a specific folder.
In the lifecycle rule, I specify folder_name/ as the prefix and an expiration rule of 1 day after creation for both current and noncurrent versions.
Will it delete all the objects and their versions?
Can someone please confirm?
The other folders are quite critical, so I can't mess with the rule to test.
I can confirm that you can delete at the folder level instead of the entire bucket. We have a rule that does exactly that (although with 7 days instead of 1). I will echo John's point that after the initial setup it will take time to do the deletion. You should see progress STARTING within 1 hour, but actual completion may take a while.
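For reference, a minimal sketch (boto3; the bucket name is a placeholder, the prefix comes from the question) of such a prefix-scoped rule on a versioned bucket:

```python
import boto3

s3 = boto3.client("s3")

# Scope the rule to one folder via its prefix. Expiring current versions
# adds delete markers; NoncurrentVersionExpiration then permanently
# removes the old versions after the given number of days.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-bucket",  # placeholder
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-folder",
                "Status": "Enabled",
                "Filter": {"Prefix": "folder_name/"},
                "Expiration": {"Days": 1},
                "NoncurrentVersionExpiration": {"NoncurrentDays": 1},
            }
        ]
    },
)
```

A separate rule with Expiration set to {"ExpiredObjectDeleteMarker": True} can clean up the leftover delete markers; S3 does not allow combining that flag with Days in the same Expiration element.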

Cheapest way to delete 2 billion objects from S3 IA

I have a bucket in S3 (Infrequent Access) containing 2 billion objects. It is too big to delete in the console or over the API without taking years.
I can create a lifecycle rule to expire and delete the objects but the calculator predicts this will cost me >$20,000. Is that correct? Is there a better way to delete a bucket?
I have a file effectively containing a list of all the objects in that bucket if that helps.
Update 2021:
An answer below from @MAP points out that there is now an "Empty" button. I haven't tested it yet, but it looks like the way to go (I'll accept that answer once tested):
If you have a list of all the objects available, then you can certainly use the Multi-Object Delete (DeleteObjects) action. Apparently this API is free. I would create an AWS Step Functions state machine to loop through the file and delete 1000 objects at a time; 1000 keys per request is the documented limit.
It will take around 2M Step Functions state transitions to delete all the objects in the bucket. As per Step Functions pricing, that will cost you around $50, plus around $1 for the Lambda invocations, so the total cost is roughly $51.
Update
Using Lambda or Step Functions is probably not the most cost-effective option, because either way you will need to read the file (that contains the object keys) from some source such as S3. So running the script from a local machine, or on an EC2 Linux instance inside a screen session, appears to be the best option.
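A minimal sketch of such a script (boto3; the bucket and file names are placeholders, and the file is assumed to hold one object key per line):

```python
import boto3

s3 = boto3.client("s3")
BUCKET = "my-huge-bucket"  # placeholder


def batches(iterable, size=1000):
    """Yield lists of up to `size` items (1000 is the DeleteObjects limit)."""
    batch = []
    for item in iterable:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


with open("keys.txt") as f:  # placeholder: one key per line
    keys = (line.strip() for line in f if line.strip())
    for batch in batches(keys):
        s3.delete_objects(
            Bucket=BUCKET,
            Delete={"Objects": [{"Key": k} for k in batch], "Quiet": True},
        )
```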
In 2021, anyone who comes across this question may benefit from knowing that the AWS console now provides an "Empty" button.
Select the bucket and click the "Empty" button, and all objects, versioned or not, will be deleted. Depending on the number of objects, it can take minutes to days.
Expiration lifecycle rules are free. From the original feature announcement:
As with standard delete requests, Amazon S3 doesn’t charge you for using Object Expiration.
Delete operations are free. You can create a lifecycle policy to automate a bulk delete.
I would start with a small number of objects first and check the billing report to confirm 100% that the deletes are not charged, then go for the rest.

Is AWS S3 read guaranteed to return a newly created object?

I've been reading the docs regarding read-after-write consistency with AWS S3 but I'm still unsure about this.
If I write an object to S3 and after getting a successful response from my write operation, I immediately attempt to read it, is the read operation guaranteed to return the object?
In other words, is it possible that the read operation will fail because it can't find the object? Because the read happened too soon after the write?
I'm only talking about new PUTs here, not updates to existing objects.
Yes, the read is guaranteed to return the object (for new objects only), with one caveat.
As per AWS documentation:
Amazon S3 provides read-after-write consistency for PUTS of new objects in your S3 bucket in all regions with one caveat. The caveat is that if you make a HEAD or GET request to the key name (to find if the object exists) before creating the object, Amazon S3 provides eventual consistency for read-after-write.
Amazon S3 offers eventual consistency for overwrite PUTS and DELETES in all regions.
EDIT: credits to @Michael - sqlbot; more on the HEAD (or GET) caveat:
If you send a GET or HEAD before the object exists, such as to check whether there's an object there before you upload, then the upload is not immediately consistent for read requests even after the upload is complete, because S3 has already made the only immediately consistent internal query it's going to make for that object, discovering, authoritatively, that there's no such key. The object creation becomes eventually consistent, since the creation has to "overwrite" the previous lookup that found nothing.
Based on the following table provided in the link, "consistent reads" will never be stale.
The linked page also has a nice example of how "read-after-write consistency" and "eventual consistency" work.
I would like to add this caution note to the answer to make things clearer:
Amazon S3 achieves high availability by replicating data across multiple servers within Amazon's data centers. If a PUT request is successful, your data is safely stored. However, information about the changes must replicate across Amazon S3, which can take some time, and so you might observe the following behaviors:
A process writes a new object to Amazon S3 and immediately lists keys within its bucket. Until the change is fully propagated, the object might not appear in the list.
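To make the distinction concrete, here is a minimal sketch (boto3; the bucket and key names are placeholders) contrasting the consistent read path with the listing behavior quoted above:

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "my-bucket", "new-object.txt"  # placeholders

# No HEAD/GET on this key before the PUT, so the caveat does not apply.
s3.put_object(Bucket=bucket, Key=key, Body=b"hello")

# Read-after-write: this GET of the newly created key is expected to succeed.
body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()

# Listing, by contrast, was only eventually consistent: the new key
# might not appear here right away.
listing = s3.list_objects_v2(Bucket=bucket, Prefix=key)
print(body, [o["Key"] for o in listing.get("Contents", [])])
```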