Delete all versions of S3 objects using a lifecycle rule - amazon-web-services

I have an S3 bucket with multiple folders and versioning enabled.
Out of these folders I want to completely delete one, as it has multiple delete markers.
I am using a lifecycle rule to delete the objects, but I'm not sure whether it will work for a specific folder.
In the lifecycle rule, if I specify folder_name/ as the prefix and set expiration to 1 day after creation for both current and previous versions,
will it delete all the objects and their versions?
Can someone please confirm?
The other folders are quite critical, so I can't mess with the rule to test.

I can confirm that you can delete at the folder level instead of the entire bucket. We have a rule that does exactly this (although with 7 days instead of 1). I will echo John's point that after the initial setup it will take time to do the deletion. You should see progress starting within 1 hour, but actual completion may take a while.
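For reference, a prefix-scoped rule along those lines can also be applied from the CLI. This is only a sketch: the bucket name, rule ID, and local file name are placeholders, while folder_name/ and the 1-day values mirror the question.

cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "expire-folder-name",
      "Filter": { "Prefix": "folder_name/" },
      "Status": "Enabled",
      "Expiration": { "Days": 1 },
      "NoncurrentVersionExpiration": { "NoncurrentDays": 1 },
      "AbortIncompleteMultipartUpload": { "DaysAfterInitiation": 1 }
    }
  ]
}
EOF
# Expires current versions under folder_name/ one day after creation and
# permanently removes versions one day after they become noncurrent.
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-example-bucket \
  --lifecycle-configuration file://lifecycle.json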

Related

AWS S3 Lifecycle

I've been exploring AWS S3 Lifecycle techniques and found that the best way to delete S3 files older than 60 days is to configure this through the GUI.
However, I don't want to delete ALL files older than 60 days. For example, I'd like to at least keep all HTML files in the bucket that are older than 60 days.
I've found that a prefix can be entered to limit the scope of the lifecycle rule to specific files; however, this would require me to enter ALL files EXCEPT the HTML ones. We have hundreds of files, so this would take forever.
I was wondering if anyone knew of an easier way? For example, I would like to just exclude all *.html files from the lifecycle rule.
There is no way to exclude objects from rules.
You can rearrange the objects in your bucket so that a rule applies only to the objects under a specified prefix ("folder").
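As a rough sketch of that rearranging approach (the bucket and prefix names below are made up), you could move everything except the HTML files under a dedicated prefix and then scope the 60-day expiration rule to that prefix:

# Move all non-HTML objects under an "expire-after-60-days/" prefix, skipping
# anything already under that prefix, then point the lifecycle rule's prefix
# filter at "expire-after-60-days/".
aws s3 mv s3://my-example-bucket/ s3://my-example-bucket/expire-after-60-days/ \
  --recursive --exclude "*.html" --exclude "expire-after-60-days/*"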

AWS S3 delete all the objects or within in a given date range

I am really having a hard time deleting my bucket jananath-logs-bucket-new. It has over 70 TB of data and I need to delete the entire bucket. It has files from 2019.
I tried deleting the bucket, but since it has many small files (over 50 million), it takes so much time and the UI (browser) hangs. So I thought, let AWS do it for me.
So I tried lifecycle rules and created two of them:
delete-all-from-start
delete-all-from-start-2
(Screenshots of each rule, and of how both rules look now, are not included here.)
But my objects are not being deleted.
I set the number of days for each field to 1, thinking it would delete everything from 2019 (when the first object was created).
Can someone help me with this?
How can I delete all the objects in the bucket going back to 2019?
Is it possible to delete only the objects in a given date range - say from 2020 to 2021?
Thank you,
Have a great day!
According to the documentation, a lifecycle policy is a valid way to empty a bucket. Please note that there may be a delay before expired objects are actually removed:
When an object reaches the end of its lifetime based on its lifecycle
policy, Amazon S3 queues it for removal and removes it asynchronously.
There might be a delay between the expiration date and the date at
which Amazon S3 removes an object.
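As a sketch of such a policy applied from the CLI: the rule ID and local file name are arbitrary, the empty prefix makes the rule match every object, and the NoncurrentVersionExpiration and AbortIncompleteMultipartUpload parts only matter if the bucket has versioning or in-flight multipart uploads.

cat > empty-bucket.json <<'EOF'
{
  "Rules": [
    {
      "ID": "empty-bucket",
      "Filter": { "Prefix": "" },
      "Status": "Enabled",
      "Expiration": { "Days": 1 },
      "NoncurrentVersionExpiration": { "NoncurrentDays": 1 },
      "AbortIncompleteMultipartUpload": { "DaysAfterInitiation": 1 }
    }
  ]
}
EOF
# Every object in the bucket is queued for expiration one day after creation.
aws s3api put-bucket-lifecycle-configuration \
  --bucket jananath-logs-bucket-new \
  --lifecycle-configuration file://empty-bucket.json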

Google cloud object persists after deletion

I'm using Chrome (vs the Cloud SDK / command line) to repeatedly replace a file in a bucket: dragging and dropping a file to overwrite the existing one, and/or deleting it first and putting it back (changed).
At a certain point the file stops updating and the old version persists, even if I literally rm -r its parent folder.
i.e., I could have /bucket/css/file.css, rm -r /bucket/css, and the file will still be available to the public.
From your second answer, it seems that your bucket has the option of “Object Versioning” enabled.
When Object Versioning is enabled for a bucket, Cloud Storage creates an archived version of an object each time the live version of the object is overwritten or deleted.
To verify that “Object Versioning” is enabled on your bucket you can use the following command:
gsutil versioning get gs://[BUCKET_NAME]
The response looks like the following if Object Versioning is enabled:
gs://[BUCKET_NAME]: Enabled
However, according to the official documentation there is no limit to the number of older versions of an object you will create if you continue to upload to the same object in a versioning-enabled bucket.
Having said that, I tried to reproduce your case in my own bucket. The steps I followed are:
1. Enabled Object Versioning for my bucket.
2. Uploaded a file named "example.png" to the bucket, using the GCP Console.
3. Dragged and dropped another file with the same name ("example.png"), but different content.
4. Checked the option "Replace existing object".
5. Checked whether the file had been updated. It had.
6. Repeated the process 50 times (since you said you had around 40 archived versions of your file), uploading the different files one after the other, each time overwriting the previous one. Each time I uploaded a file with different content, a new archived version of it was created, and the live file updated accordingly without any problems.
Please review the steps I followed and let me know if there is any additional action from your side.
Since you are able to delete the files via the gsutil command, deletion is working fine and you have all the required permissions. Could you clear your web browser's cookies and try deleting the file again? You can also try an incognito window to check whether it works.
Furthermore, if Object Versioning is on, you can disable it and try deleting the object again. Note that object deletion cannot be undone; once you delete the object it will be completely removed.
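If you would rather do that from the command line than from the Console, here is a minimal sketch; the object path is the one from your question and the bucket name is a placeholder:

gsutil ls -a gs://[BUCKET_NAME]/css/file.css   # list the live version and any archived versions
gsutil versioning set off gs://[BUCKET_NAME]   # stop new archived versions from being created
gsutil rm -a gs://[BUCKET_NAME]/css/file.css   # delete the object together with all of its versions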
Additionally, a good practice suggested along with Object Versioning is to create an Object Lifecycle rule for the bucket that deletes all objects stored for more than a specific amount of time. You can use this as a workaround for deleting either live or archived versions of your object (if Object Versioning is actually enabled); to set it up, you can follow this link.
Generally, you can review Deleting data best practices here.
Note that, according to Cloud Storage Object Limits a single particular object can only be updated or overwritten up to once per second. For more information, check here.
I used gsutil to delete it and it worked... temporarily. It seems there were around 40 cached versions of the file with hash-tag IDs.
At some point it stops updating / deleting the file. :(
gsutil rm -r gs://bucket/path/to/folder/

Does using tags for object-level deletion in AWS S3 work?

I need object level auto deletion after some time in my s3 bucket, but only for some objects. I want to accomplish this by having a lifecycle rule that auto deletes objects with a certain (Tag, Value) pair. For example, I am adding the tag pair (AutoDelete, True) for objects I want to delete, and I have a lifecycle rule that deletes such objects after 1 day.
I ran some experiments to see if objects are getting deleted using this technique, but so far my object has not been deleted. (It may get deleted soon??)
If anyone has experience with this technique, please let me know if this does not work, because so far my object has not been deleted, even though it is past its expiration date.
Yes, you can use S3 object lifecycle rules to delete objects having a given tag.
Based on my experience, lifecycle rules based on time are approximate, so it's possible that you need to wait longer. I have also found that complex lifecycle rules can be tricky -- my advice is to start with a simple test in a test bucket and work your way up.
There's decent documentation here on AWS.
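For comparison with the console setup, here is a sketch of the same kind of tag-filtered rule applied via the CLI; the bucket name, object key, and file name are placeholders, and the tag pair is the (AutoDelete, True) pair from the question:

cat > autodelete.json <<'EOF'
{
  "Rules": [
    {
      "ID": "autodelete-tagged-objects",
      "Filter": { "Tag": { "Key": "AutoDelete", "Value": "True" } },
      "Status": "Enabled",
      "Expiration": { "Days": 1 }
    }
  ]
}
EOF
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-example-bucket \
  --lifecycle-configuration file://autodelete.json

# Tag an existing object so the rule picks it up.
aws s3api put-object-tagging --bucket my-example-bucket --key some/object.log \
  --tagging 'TagSet=[{Key=AutoDelete,Value=True}]'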

Faster way to delete TB of data from GCP cloud storage

I want to delete 2 TB of files from a GCP bucket.
I have read the GCP documentation for deletion and it says to use the gsutil -m rm command, but when I run it, the estimated time is 400+ hours.
Is there a faster way to do the deletion?
For buckets with a very large number of objects, one trick to deleting the contents is to use the Lifecycle Management feature. https://cloud.google.com/storage/docs/lifecycle
Set a lifecycle rule with a condition that the object is 0 days old and an action of "Delete"; that should cause GCS to begin deleting your objects for you. Note that this may still take a while, as lifecycle rules can take up to 24 hours to go into effect, but that's still a lot better than a couple of weeks.
You can configure the lifecycle policy on a bucket from the console:
Head to https://console.cloud.google.com/storage/browser
Find the bucket you want to enable, and click None in the Lifecycle column.
Click Add rule.
Select the condition (object is 0 days old).
Select an action (Delete the object)
Click continue.
Click save.
See https://cloud.google.com/storage/docs/managing-lifecycles for more instructions.
N.B.: Lifecycle changes can take up to 24 hours to go into effect, so once all of your objects go away and you remove the lifecycle config setting, you should wait an additional 24 hours before putting any new files in the bucket, or else they might also get deleted.
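If you prefer the command line to the console, the same rule can be set with gsutil; a sketch, with the bucket name as a placeholder and an arbitrary local file name:

cat > delete-everything.json <<'EOF'
{
  "rule": [
    {
      "action": { "type": "Delete" },
      "condition": { "age": 0 }
    }
  ]
}
EOF
gsutil lifecycle set delete-everything.json gs://[BUCKET_NAME]
gsutil lifecycle get gs://[BUCKET_NAME]   # confirm the rule was applied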