Deleting Versions in S3 Bucket AWS

I am unable to delete a bucket with 160 million+ versions that were created by a bug that kept updating a profile picture; since we had versioning enabled, it made a huge mess. I tried to use the AWS website to delete the versions and then delete the S3 bucket, but the auth token expires before it runs through all the files and signs me out of AWS. I then tried the AWS CLI and ran into another issue. The command and error can be found below.
aws s3api delete-bucket --bucket pinch-profile-picture --region us-east-2
and received the following error:
An error occurred (BucketNotEmpty) when calling the DeleteBucket operation: The bucket you tried to delete is not empty. You must delete all versions in the bucket.
AWS does not want to touch or delete the bucket for us because they would be held liable for deleting something.
The steps I might be able to take are:
1) Create a script to delete everything.
2) Figure out how to delete all the versions from the CLI.
3) Figure out how to keep our token from expiring on the site.
4) "Insert your suggestion here"
..
99) Manually delete, but this is limited to 300 per page, which would mean deleting 300 items 666,666 times if we created 200 million versions. So that's a no-go.
But I am open to suggestions. What are your tips for getting the S3 bucket deleted along with the versions? If you've been in this situation before or have experience building scripts, please help a brotha out.
Best Regards,
Akshay Kumar

If force-deleting the bucket doesn't work, you could create a Lifecycle Policy for the S3 bucket.
Configure it to expire all objects and their versions; lifecycle rules run roughly once a day, so within a day or two the objects and versions will be gone.
You can then delete the bucket.
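If you'd rather set the lifecycle rules up programmatically, here is a minimal boto3 sketch (the bucket name comes from the question; the rule IDs are made up). It expires current versions, removes noncurrent versions a day after they become noncurrent, and cleans up the leftover delete markers and stray multipart uploads:
import boto3

s3 = boto3.client('s3')
s3.put_bucket_lifecycle_configuration(
    Bucket='pinch-profile-picture',
    LifecycleConfiguration={
        'Rules': [
            {
                'ID': 'expire-all-versions',
                'Filter': {'Prefix': ''},  # apply to every key
                'Status': 'Enabled',
                'Expiration': {'Days': 1},
                'NoncurrentVersionExpiration': {'NoncurrentDays': 1},
            },
            {
                'ID': 'clean-up-markers-and-uploads',
                'Filter': {'Prefix': ''},
                'Status': 'Enabled',
                'Expiration': {'ExpiredObjectDeleteMarker': True},
                'AbortIncompleteMultipartUpload': {'DaysAfterInitiation': 1},
            },
        ]
    },
)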

The remove bucket command with the force option, on a bucket that has versioning enabled, only adds a delete marker to the version history and retains all other versions of each object. Your bucket will not be completely empty because the object versions are still there, and the delete bucket command will fail.
To delete all object versions, you can turn the Show versions option on and delete them individually. Of course, in your case, with millions of versions in the bucket, deleting individual versions is not practical.
The following Python code will do the job:
import boto3

session = boto3.Session()
s3 = session.resource(service_name='s3')
bucket = s3.Bucket('your-bucket-name')

# deletes every object version (and delete marker) in the bucket,
# issuing batched delete requests behind the scenes
bucket.object_versions.delete()
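Once that finishes, the now-empty bucket can be removed with the same resource object:
bucket.delete()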

You can try this command:
aws s3 rb s3://mybucket --force
The --force flag removes all objects in the bucket and then removes the bucket itself.

Related

Not able to delete the S3 bucket

I am trying to delete the S3 bucket in Region 1, but I'm unable to; in the AWS console I can see the following error:
POST https://us-east-1.console.aws.amazon.com/s3/proxy 404 (Not Found)
Moreover, I was not able to view the region for that particular S3 bucket. Please post your suggestions. As a test, I created a bucket with the same name in another region and was able to delete that one.
I have resolved this issue. The bucket didn't actually exist in that location; only the bucket's name was listed there. If you delete and create the stack again, it won't throw any error.

Can't Delete Empty S3 Bucket

I have an S3 bucket that is 100% empty. Versioning was never enabled on the bucket. However, I still cannot remove the bucket. I have tried via the Console and the CLI tool. On the console it just says "Error" with no error message. From the CLI and API it tells me: "An error occurred (BucketNotEmpty) when calling the DeleteBucket operation: The bucket you tried to delete is not empty". I have tried all of the following:
aws s3 rb s3://<bucket_name> --force -> BucketNotEmpty
aws s3 rm s3://<bucket_name> --recursive -> No output (because it's already empty)
aws s3api list-object-versions --bucket <bucket_name> -> No output (because versioning was never enabled)
aws s3api list-multipart-uploads --bucket <bucket_name> -> No outputs
aws s3api list-objects --delimiter=/ --prefix= --bucket <bucket_name> -> No Output (because it's empty)
It has no dependencies (it's not used by CloudFront or anything else that I'm aware of).
The bucket has been empty for approximately 5 days.
I was able to delete another very similar bucket with the same IAM user. Additionally my IAM user has Admin access.
I was facing this same problem. I was able to fix the issue by going into the bucket and deleting the "Bucket Policy" for the bucket. After that, deleting the bucket worked correctly.
I did this through the AWS console, for an S3 bucket created by Elastic Beanstalk (i.e. elasticbeanstalk-us-west-2-861587641234). I imagine the creation script includes a policy to prevent people from accidentally deleting the bucket.
I had a similar issue and was able to delete the bucket after waiting overnight.
It's a pretty weak solution, but it may save you and others some time pounding on it.
If it's still not deleting after all the actions in the comments, there are some things that only AWS support can fix properly. Again a weak answer, but register a ticket with AWS support and then post their response here as an answer for others.
To delete an Elastic Beanstalk storage bucket (console):
1. Open the Amazon S3 Management Console.
2. Select the Elastic Beanstalk storage bucket.
3. Choose Properties.
4. Choose Permissions.
5. Choose Edit Bucket Policy; allow delete and make it public (a scripted version follows these steps).
6. Save.
7. Choose Actions and then choose Delete Bucket.
8. Type the name of the bucket and then choose Delete.
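A sketch of the same idea in boto3, under the assumption that your credentials are allowed to delete bucket policies (the bucket name is the example from above):
import boto3

s3 = boto3.client('s3')
bucket = 'elasticbeanstalk-us-west-2-861587641234'  # example name

s3.delete_bucket_policy(Bucket=bucket)  # drop the restrictive policy first
s3.delete_bucket(Bucket=bucket)         # the bucket must already be empty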
This is what worked for me. I didn't have versioning enabled on the bucket. When you delete an object from an S3 bucket, S3 puts a 'delete marker' on that object and hides it from the listing. When you click the 'Show versions' button, you will see your deleted objects with their delete markers. Select an object with its delete marker and delete it again; this is a permanent delete. Now your object is really gone and your bucket is really empty. After this I was able to delete my bucket.
I guess versioning=true only means that S3 will create versions of an object if you upload another with the same name.
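If there are too many delete markers to click through, here is a hedged boto3 sketch (bucket name is a placeholder) that removes them in batches:
import boto3

s3 = boto3.client('s3')
bucket = 'your-bucket-name'

paginator = s3.get_paginator('list_object_versions')
for page in paginator.paginate(Bucket=bucket):
    # delete markers are listed separately from real versions
    markers = [{'Key': m['Key'], 'VersionId': m['VersionId']}
               for m in page.get('DeleteMarkers', [])]
    if markers:
        # delete_objects accepts at most 1,000 keys, matching the page size
        s3.delete_objects(Bucket=bucket, Delete={'Objects': markers})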
For users who are facing a similar issue:
I tried @Federico's solution, still with no success. There was another option, "Empty", next to Delete.
So I emptied the bucket first and then tried Delete, and it worked.
I was facing an issue with deleting the Elastic Beanstalk storage bucket.
Follow these steps:
1. Select the Elastic Beanstalk storage bucket.
2. Choose Permissions.
3. Delete the bucket policy.
4. Save.
If your bucket is empty, you can now delete the bucket.
Sometimes after attempting to delete a bucket, it's not actually deleted, but the permissions are lost.
In my case, I went to the "permissions" tab, re-granted permissions to myself, and was then able to remove it
I had the same issue, and there was no policy, so I added permission for the email I was logged in with and saved. After granting myself permission, I was able to delete the bucket. I also had another bucket that did have a policy; I deleted the policy and was able to delete that bucket as well.
Using the AWS CLI:
# delete all object versions in the bucket
aws s3api delete-objects --bucket nameOfYourBucket --delete "$(aws s3api list-object-versions --bucket nameOfYourBucket --query '{Objects: Versions[].{Key:Key,VersionId:VersionId}}')"
# delete the remaining delete markers
aws s3api delete-objects --bucket nameOfYourBucket --delete "$(aws s3api list-object-versions --bucket nameOfYourBucket --query '{Objects: DeleteMarkers[].{Key:Key,VersionId:VersionId}}')"
Note that delete-objects accepts at most 1,000 keys per call, so for a larger bucket you will have to repeat both commands (or paginate) until everything is gone.
And then you can delete the bucket.
I made the S3 bucket permissions public and gave access to everyone. Then I was able to delete the bucket from the AWS console.
I was using the AWS console to perform the deletion.
I had the same problem, tried all the above solutions, and none worked for me, so I figured out another way.
My bucket was used by Elastic Beanstalk, and whenever I deleted the bucket, Elastic Beanstalk recreated it automatically. I then deleted the Elastic Beanstalk service and tried to delete the bucket again, but it still didn't work; the bucket was empty but would not delete.
I tried to change permissions, but the bucket was still there.
Finally I deleted the bucket policy, came back, deleted the bucket, and it was gone.
Problem solved.
I tried many of the solutions mentioned. The only thing that worked for me was deleting it through Cyberduck (I neither work for nor am promoting Cyberduck; I genuinely used it and it worked). Here are the steps:
1 - Download and install Cyberduck.
2 - Click on Open Connection.
3 - Select Amazon S3 from the dropdown (the default is FTP).
4 - Enter your access key ID and secret access key (if you don't have one, create one through IAM on AWS).
5 - You will see a list of your S3 buckets. Select the file, folder, or bucket you want to delete, right-click, and delete. Even files of 0 KB show up here and can be deleted.
Hope this helps.

S3 download works from console, but not from commandline

Can anyone explain this behaviour:
When I try to download a file from S3, I get the following error:
An error occurred (403) when calling the HeadObject operation: Forbidden.
Commandline used:
aws s3 cp s3://bucket/raw_logs/my_file.log .
However, when I use the S3 console website, I'm able to download the file without issues.
The access key used by the commandline is correct. I verified this, and other AWS operations via commandline work fine. The access key is tied to the same user account I use in the AWS console.
So I assume you're sure about the IAM policy of your user and that the file exists in your bucket.
If you have set a default region in your configuration, but the bucket was not created in that region (yes, S3 buckets are created in a specific region), it will not be found. Make sure to add the region flag to the CLI:
aws s3 cp s3://bucket/raw_logs/my_file.log . --region <region of the bucket>
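If you don't know which region a bucket was created in, you can look it up; a small boto3 sketch (bucket name is a placeholder):
import boto3

s3 = boto3.client('s3')
resp = s3.get_bucket_location(Bucket='bucket')
# buckets in us-east-1 report a null LocationConstraint
region = resp['LocationConstraint'] or 'us-east-1'
print(region)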
Other notes:
Make sure to upgrade to the latest version of the CLI.
A 403 can also be caused by an unsynchronized system clock: request signing includes a timestamp that S3 compares against its own clock, so if you're badly out of sync it can cause issues.
I had a similar issue due to having two-factor authentication enabled on my account. Check out how to configure 2FA for the aws cli here: https://aws.amazon.com/premiumsupport/knowledge-center/authenticate-mfa-cli/
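For reference, the underlying flow exchanges your MFA code for temporary credentials via STS; a boto3 sketch (the device ARN and token code are placeholders):
import boto3

sts = boto3.client('sts')
resp = sts.get_session_token(
    SerialNumber='arn:aws:iam::123456789012:mfa/your-user',  # your MFA device ARN (placeholder)
    TokenCode='123456',  # current code from your MFA device (placeholder)
)
creds = resp['Credentials']

# use the temporary credentials for subsequent S3 calls
s3 = boto3.client(
    's3',
    aws_access_key_id=creds['AccessKeyId'],
    aws_secret_access_key=creds['SecretAccessKey'],
    aws_session_token=creds['SessionToken'],
)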

Error: "[BucketAlreadyOwnedByYou] Your previous request to create the named bucket succeeded and you already own it" in amazon server

I checked my server log; there are many errors like:
S3::putBucket(******): [BucketAlreadyOwnedByYou] Your previous request to create the named bucket succeeded and you already own it. in /var/www/html/****/public_html/*****/common/config/S3.php on line 188
I googled it but didn't get any proper help. Can anyone please tell me what is causing this error, and how can I solve it?
http://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html
BucketAlreadyOwnedByYou
Your previous request to create the named bucket succeeded and you already own it. You get this error in all AWS Regions except US East (N. Virginia), us-east-1. In the us-east-1 Region you will get 200 OK, but it is a no-op (if the bucket exists, Amazon S3 will not do anything).
409 Conflict (in all Regions except US East (N. Virginia)).
To avoid the error, check whether the bucket already exists, and create it only if it doesn't.
For bash:
BUCKET_NAME=my-bucket-name
aws s3api get-bucket-location --bucket ${BUCKET_NAME} || aws s3 mb s3://${BUCKET_NAME}
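A rough boto3 equivalent (bucket name is a placeholder): head_bucket raises a ClientError when the bucket is missing, so create it only in that case.
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client('s3')
bucket = 'my-bucket-name'

try:
    s3.head_bucket(Bucket=bucket)  # cheap existence check
except ClientError as e:
    if e.response['Error']['Code'] == '404':
        # add a CreateBucketConfiguration for regions other than us-east-1
        s3.create_bucket(Bucket=bucket)
    else:
        raise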
All of this is related to regions. For example, if your script sets the region to us-east-1, but when creating a new bucket in Amazon's new interface you chose any other region, you will get this error when you run your script. In my case I had to delete the newly created bucket and re-create it. This takes time, because AWS reserves the bucket name after deletion, so be patient while recreating it. Of course there are other ways to handle it in code if you are not in a rush (like me). But trust me, it takes a while for that bucket name to be released...
This happens when you ask S3 to create a bucket that already exists.
Can you give us some details about how you use S3 from EC2?
Environment, SDK version, code...
We can't help without details.

How to download data from Amazon's requester pay buckets?

I have been struggling for about a week to download arXiv articles as described here: http://arxiv.org/help/bulk_data_s3#src.
I have tried lots of things: s3Browser, s3cmd. I am able to log in to my own buckets, but I am unable to download data from the arXiv bucket.
I tried:
s3cmd get s3://arxiv/pdf/arXiv_pdf_1001_001.tar
See:
$ s3cmd get s3://arxiv/pdf/arXiv_pdf_1001_001.tar
s3://arxiv/pdf/arXiv_pdf_1001_001.tar -> ./arXiv_pdf_1001_001.tar [1 of 1]
ERROR: S3 error: Unknown error
s3cmd get with x-amz-request-payer:requester
It gave me the same error again:
$ s3cmd get --add-header="x-amz-request-payer:requester" s3://arxiv/pdf/arXiv_pdf_manifest.xml
s3://arxiv/pdf/arXiv_pdf_manifest.xml -> ./arXiv_pdf_manifest.xml [1 of 1]
ERROR: S3 error: Unknown error
Copying
I have tried copying files from that folder too.
$ aws s3 cp s3://arxiv/pdf/arXiv_pdf_1001_001.tar .
A client error (403) occurred when calling the HeadObject operation: Forbidden
Completed 1 part(s) with ... file(s) remaining
This probably means that I made a mistake. The problem is that I don't know what to add to convey my willingness to pay for the download.
I am unable to figure out what I should do to download data from S3. I have been reading a lot on the AWS site, but nowhere can I find a pinpoint solution to my problem.
How can I bulk download the arXiv data?
Try downloading s3cmd version 1.6.0: http://sourceforge.net/projects/s3tools/files/s3cmd/
$ s3cmd --configure
Enter your credentials found in the account management tab of the Amazon AWS website interface.
$ s3cmd get --recursive --skip-existing s3://arxiv/src/ --requester-pays
Requester Pays is a feature on Amazon S3 buckets that requires the user of the bucket to pay Data Transfer costs associated with accessing data.
Normally, the owner of an S3 bucket pays the Data Transfer costs, but this can be expensive for free / open-source projects. Thus, the bucket owner can activate Requester Pays so that the requester is charged for that portion of the costs instead.
Therefore, when accessing a Requester Pays bucket, you will need to authenticate yourself so that S3 knows whom to charge.
I recommend using the official AWS Command-Line Interface (CLI) to access AWS services. You can provide your credentials via:
aws configure
and then view the bucket via:
aws s3 ls s3://arxiv/pdf/
and download via:
aws s3 cp s3://arxiv/pdf/arXiv_pdf_1001_001.tar .
UPDATE: I just tried the above myself, and received Access Denied error messages (both on the bucket listing and the download command). When using s3cmd, it says ERROR: S3 error: Access Denied. It would appear that the permissions on the bucket no longer permit access. You should contact the owners of the bucket to request access.
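If you'd rather use Python, the official boto3 SDK exposes the same Requester Pays flag; a minimal sketch, assuming the bucket still permits requester-pays access:
import boto3

s3 = boto3.client('s3')

# list the bucket, declaring that we accept the data transfer charges
resp = s3.list_objects_v2(Bucket='arxiv', Prefix='pdf/', RequestPayer='requester')
for obj in resp.get('Contents', []):
    print(obj['Key'])

# download a single object the same way
s3.download_file('arxiv', 'pdf/arXiv_pdf_1001_001.tar', 'arXiv_pdf_1001_001.tar',
                 ExtraArgs={'RequestPayer': 'requester'})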
At the bottom of this page, arXiv explains that s3cmd gets denied because it does not support access to Requester Pays buckets as a non-owner, and that you have to apply a patch to the s3cmd source code. However, the version of s3cmd they used is outdated and the patch does not apply to the latest version of s3cmd.
Basically, you need to allow s3cmd to add the "x-amz-request-payer" header to its HTTP requests to buckets. Here is how to fix it:
Download the source code of s3cmd.
Open S3/S3.py with a text editor.
Add these two lines of code at the bottom of the __init__ function:
if self.s3.config.extra_headers:
    self.headers.update(self.s3.config.extra_headers)
Install s3cmd as instructed.
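With the patch applied, the --add-header="x-amz-request-payer:requester" form shown earlier should actually send the header with the request instead of silently dropping it.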
For me the problem was that my IAM user didn't have enough permissions.
Setting AmazonS3FullAccess was the solution for me.
Hope it saves someone some time.
Don't want to steal the thunder, but OttoV's comment actually gave the right command that works for me.
aws s3 ls --request-payer requester s3://arxiv/src/
My EC2 instance is in Region us-east-2, but the arXiv S3 buckets are in Region us-east-1. The --request-payer requester flag is needed because the buckets are Requester Pays, and since my instance is in a different Region, the data transfer out of S3 isn't free either.
From https://aws.amazon.com/s3/pricing/?nc=sn&loc=4 :
You pay for all bandwidth into and out of Amazon S3, except for the following:
• Data transferred in from the internet.
• Data transferred out to an Amazon Elastic Compute Cloud (Amazon EC2) instance, when the instance is in the same AWS Region as the S3 bucket (including to a different account in the same AWS region).
• Data transferred out to Amazon CloudFront (CloudFront).