I am creating a CloudFormation stack for an S3 bucket (with the help of a YAML template file). Is there a way to automatically delete the created buckets? Can we configure the YAML template so that the S3 bucket gets deleted some time after its creation? If not, what is the best way to programmatically delete the S3 buckets?
Tried to add
DeletionPolicy: Delete
but that only controls what happens to the bucket when the stack itself is deleted, not deletion after a set amount of time.
The best way to achieve this is to create a CloudWatch Events rule at a specific time that triggers a Lambda function. The Lambda function can delete the files in the bucket and then delete the bucket itself.
You can build all of that with a CloudFormation template.
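A minimal sketch of that wiring, with the schedule, names and day counts as placeholders to adapt (a versioned bucket would also need its object versions removed):

```yaml
Resources:
  MyBucket:
    Type: AWS::S3::Bucket

  CleanupRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: lambda.amazonaws.com
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
      Policies:
        - PolicyName: EmptyAndDeleteBucket
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action:
                  - s3:ListBucket
                  - s3:DeleteObject
                  - s3:DeleteBucket
                Resource:
                  - !GetAtt MyBucket.Arn
                  - !Sub "${MyBucket.Arn}/*"

  EmptyAndDeleteFunction:
    Type: AWS::Lambda::Function
    Properties:
      Runtime: python3.12
      Handler: index.handler
      Timeout: 900                              # emptying a large bucket can be slow
      Role: !GetAtt CleanupRole.Arn
      Environment:
        Variables:
          BUCKET: !Ref MyBucket
      Code:
        ZipFile: |
          import os, boto3
          def handler(event, context):
              bucket = boto3.resource("s3").Bucket(os.environ["BUCKET"])
              bucket.objects.all().delete()     # empty the bucket first
              bucket.delete()                   # then delete the bucket itself

  CleanupSchedule:
    Type: AWS::Events::Rule
    Properties:
      ScheduleExpression: cron(0 3 * * ? *)     # placeholder: 03:00 UTC daily; use a one-off cron for "N days after creation"
      Targets:
        - Arn: !GetAtt EmptyAndDeleteFunction.Arn
          Id: EmptyAndDeleteTarget

  CleanupPermission:
    Type: AWS::Lambda::Permission
    Properties:
      Action: lambda:InvokeFunction
      FunctionName: !Ref EmptyAndDeleteFunction
      Principal: events.amazonaws.com
      SourceArn: !GetAtt CleanupSchedule.Arn
```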
As far as I know,
DeletionPolicy
determines what happens to the bucket when the stack (or the resource) is deleted.
You should look into a lifecycle policy instead if you want some sort of time-based automation behind it; note that lifecycle rules delete objects, not the bucket itself.
Here are some examples:
AWS DOCS
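A rough sketch of what that could look like in the template (the bucket and the 7-day window are placeholders):

```yaml
Resources:
  MyBucket:
    Type: AWS::S3::Bucket
    DeletionPolicy: Delete            # remove the (empty) bucket when the stack is deleted
    Properties:
      LifecycleConfiguration:
        Rules:
          - Id: ExpireEverything
            Status: Enabled
            ExpirationInDays: 7       # objects are deleted roughly 7 days after creation
```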
I want to delete an S3 bucket in AWS with millions of objects. Is there a quick way of doing it through an AWS CLI command or a script, without going into the console and doing it manually?
The easiest way I have found is to first edit the bucket's lifecycle policy to expire all objects. Then wait a day or two for the lifecycle policy to remove all the objects from the bucket, after which the empty bucket can be deleted.
You can use the following to delete the bucket along with its objects, but I think it can take a lot of time if there are a lot of objects. I don't think there is a really fast way to do this.
aws s3 rb --force s3://your_bucket_name
But perhaps someone has a better way.
Fairly new to CloudFormation templating, but all I am looking to do is create a template where I create an S3 bucket and import contents into that bucket from another S3 bucket in a different account (which is also mine). I realize CloudFormation does not natively support importing contents into an S3 bucket, and I have to use a custom resource. I could not find any references or resources that do such a task. Hoping someone could point out some examples, or maybe even some guidance as to how to tackle this.
Thank you very much!
Can't provide full code, but can provide some guidance. There are a few ways of doing this, but I will list one:
Create a bucket policy for the bucket in the second account. The policy should allow the first account (the one with the CloudFormation stack) to read it. There are many resources on doing this; one from AWS is here.
Create a standalone Lambda function in the first account with an execution role allowing it to read the bucket in the second account. This is not a custom resource yet. The purpose of this Lambda function is to test the cross-account permissions and the code which reads objects from the bucket. It is a test function to sort out all the permissions and polish the code that copies objects from one bucket to the other.
Once your Lambda function works as intended, modify it (or create a new one) as a custom resource in CloudFormation. As a custom resource, it will need to take your newly created bucket as one of its properties (a minimal template sketch follows below). For easier creation of custom resources, this AWS helper can be used.
Note that the Lambda execution timeout is 15 minutes. Depending on how many objects you have, it may not be enough.
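To make the shape of this concrete, here is a heavily trimmed sketch of how it could look in the first account's template. All names are placeholders, the execution role is not shown, and the handler still needs the cfn-response/crhelper plumbing to report success back to CloudFormation:

```yaml
Resources:
  DestinationBucket:
    Type: AWS::S3::Bucket

  CopyObjectsFunction:
    Type: AWS::Lambda::Function
    Properties:
      Runtime: python3.12
      Handler: index.handler
      Timeout: 900                        # Lambda's hard limit; may still not be enough for many objects
      Role: !GetAtt CopyObjectsRole.Arn   # execution role (not shown) needs read on the source bucket, write on the destination
      Code:
        ZipFile: |
          # Placeholder handler: a real one must also report success/failure back to
          # CloudFormation (cfnresponse module or crhelper library), otherwise the
          # stack will hang waiting for the custom resource.
          import boto3
          s3 = boto3.resource("s3")
          def handler(event, context):
              props = event["ResourceProperties"]
              if event["RequestType"] == "Create":
                  for obj in s3.Bucket(props["SourceBucket"]).objects.all():
                      s3.meta.client.copy(
                          {"Bucket": props["SourceBucket"], "Key": obj.key},
                          props["DestinationBucket"], obj.key)

  CopyObjects:
    Type: Custom::CopyObjects
    Properties:
      ServiceToken: !GetAtt CopyObjectsFunction.Arn
      SourceBucket: name-of-bucket-in-second-account   # placeholder
      DestinationBucket: !Ref DestinationBucket        # the newly created bucket is passed in as a property
```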
Hope this helps.
If Custom Resources scare you, then a simpler way is to launch an Amazon EC2 instance with a startup script specified via User Data.
The CloudFormation template can 'insert' the name of the new bucket into the script by referencing the bucket resource that was created. The script could then run an AWS CLI command to copy the files across.
Plus, it's not expensive. A t3.micro instance is about 1c/hour and it is charged per second, so it's pretty darn close to free.
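A rough sketch of that approach, with the AMI, instance profile and source bucket name as placeholders (the instance profile's role needs read access to the source bucket and write access to the new one):

```yaml
Resources:
  DestinationBucket:
    Type: AWS::S3::Bucket

  CopyInstance:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: t3.micro
      ImageId: ami-0123456789abcdef0       # placeholder: any recent Amazon Linux AMI
      IamInstanceProfile: my-copy-profile  # placeholder instance profile
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash
          # Copy everything from the source bucket into the bucket this stack just created,
          # then power off so the instance stops accruing cost.
          aws s3 sync s3://name-of-source-bucket s3://${DestinationBucket}
          shutdown -h now
```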
I have recently joined a company that uses S3 buckets for various projects within AWS. I want to identify and potentially delete S3 objects that are not being accessed (read and write), in an effort to reduce the cost of S3 in my AWS account.
I read this, which helped me to some extent.
Is there a way to find out which objects are being accessed and which are not?
There is no native way of doing this at the moment, so all the options are workarounds depending on your use case.
You have a few options:
Tag each S3 object with its last-accessed date (e.g. 2018-10-24). First turn on object-level logging for your S3 bucket and set up CloudWatch Events for CloudTrail. The tag can then be updated by a Lambda function that runs on a CloudWatch Event fired by each GetObject call (see the sketch after this list). Finally, create a function that runs on a scheduled CloudWatch Event and deletes all objects whose date tag is older than your chosen cut-off.
Query CloudTrail logs: write a custom function to query the last access times from the object-level CloudTrail logs. This could be done with Athena, or with a direct query against the logs in S3.
Create a separate index, in something like DynamoDB, which you update in your application on each read.
Use a lifecycle policy on the S3 bucket / key prefix to archive or delete the objects after x days. This is based on upload time rather than last access time, so you could copy an object onto itself to reset the timestamp and start the clock again.
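As a sketch of the first option, assuming CloudTrail object-level (data event) logging is already enabled for the bucket; the pattern and tag key are illustrative only, and the Lambda execution role and invoke permission are omitted:

```yaml
Resources:
  # Fires on GetObject data events recorded by CloudTrail for the bucket.
  ObjectReadRule:
    Type: AWS::Events::Rule
    Properties:
      EventPattern:
        source:
          - aws.s3
        detail-type:
          - AWS API Call via CloudTrail
        detail:
          eventSource:
            - s3.amazonaws.com
          eventName:
            - GetObject
      Targets:
        - Arn: !GetAtt TagOnReadFunction.Arn
          Id: TagOnRead

  # Stamps the object with the date it was last read. A second, scheduled function
  # (not shown) can then delete objects whose tag is older than your cut-off.
  TagOnReadFunction:
    Type: AWS::Lambda::Function
    Properties:
      Runtime: python3.12
      Handler: index.handler
      Role: !GetAtt TagOnReadRole.Arn      # role (not shown) needs s3:PutObjectTagging
      Code:
        ZipFile: |
          import datetime, boto3
          s3 = boto3.client("s3")
          def handler(event, context):
              params = event["detail"]["requestParameters"]
              # Note: put_object_tagging replaces the object's whole tag set.
              s3.put_object_tagging(
                  Bucket=params["bucketName"],
                  Key=params["key"],
                  Tagging={"TagSet": [{"Key": "last-accessed",
                                       "Value": datetime.date.today().isoformat()}]})
```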
No objects in Amazon S3 are required by other AWS services, but you might have configured services to use the files.
For example, you might be serving content through Amazon CloudFront, providing templates for AWS CloudFormation or transcoding videos that are stored in Amazon S3.
If you didn't create the files and you aren't knowingly using them, you can probably delete them. But you are the only person who would know whether they are necessary.
There is a recent AWS blog post which I found very interesting; it describes a cost-optimized approach to this problem.
Here is the description from the AWS blog:
The S3 server access logs capture S3 object requests. These are generated and stored in the target S3 bucket.
An S3 inventory report is generated for the source bucket daily. It is written to the S3 inventory target bucket.
An Amazon EventBridge rule is configured that will initiate an AWS Lambda function once a day, or as desired.
The Lambda function initiates an S3 Batch Operations job to tag the objects in the source bucket that must be expired, using the following logic:
Capture the number of days ('x') from the S3 Lifecycle configuration.
Run an Amazon Athena query that will get the list of objects from the S3 inventory report and server access logs. Create a delta list with objects that were created earlier than 'x' days, but not accessed during that time.
Write a manifest file with the list of these objects to an S3 bucket.
Create an S3 Batch operation job that will tag all objects in the manifest file with a tag of "delete=True".
The Lifecycle rule on the source S3 bucket will expire all objects that were created prior to 'x' days. They will have the tag given via the S3 batch operation of "delete=True".
Expiring Amazon S3 Objects Based on Last Accessed Date to Decrease Costs
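The final expiration step maps onto an S3 lifecycle rule with a tag filter; roughly, and with the bucket, tag and 30-day window as placeholders:

```yaml
Resources:
  SourceBucket:
    Type: AWS::S3::Bucket
    Properties:
      LifecycleConfiguration:
        Rules:
          - Id: ExpireTaggedForDeletion
            Status: Enabled
            ExpirationInDays: 30          # the 'x' days from the lifecycle configuration
            TagFilters:                   # only objects the batch job tagged are expired
              - Key: delete
                Value: "True"
```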
I'd like to write a Lambda function that is triggered when files are added or modified in an S3 bucket, processes them, and moves them elsewhere, clobbering older versions of the files.
I'm wondering if AWS Lambda can be configured to trigger when files are updated?
After reviewing the Boto3 documentation for S3, it looks like the only things that can happen in an S3 bucket are creations and deletions.
Additionally, the AWS documentation seems to indicate there is no way to trigger things on 'updates' to S3.
Am I correct in thinking there is no real concept of an 'update' to a file in S3 and that an update would actually be when something was destroyed and recreated? If I'm mistaken, how can I trigger a Lambda function when an S3 file is changed in a bucket?
No, there is no concept of updating a file on S3. A file on S3 is updated the same way it is uploaded in the first place - through a PUT object request. (Relevant answer here.) An S3 bucket notification configured to trigger on a PUT object request can execute a Lambda function.
There is now new functionality for S3 buckets: under Properties there is the option to enable versioning for the bucket. If you set a trigger for object creation on S3 assigned to your Lambda function, the function will execute every time you 'update' the same file, since each update is a new version.
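If the bucket is defined in CloudFormation, the trigger can be expressed roughly like this (the processing function is assumed to be defined elsewhere in the template); note the permission that lets S3 invoke the function:

```yaml
Resources:
  UploadBucket:
    Type: AWS::S3::Bucket
    DependsOn: InvokePermission            # the permission must exist before the notification is attached
    Properties:
      NotificationConfiguration:
        LambdaConfigurations:
          - Event: s3:ObjectCreated:*      # fires on first uploads and on overwrites (PUT, POST, copy, multipart)
            Function: !GetAtt ProcessFileFunction.Arn

  InvokePermission:
    Type: AWS::Lambda::Permission
    Properties:
      Action: lambda:InvokeFunction
      FunctionName: !Ref ProcessFileFunction   # ProcessFileFunction is assumed to be defined elsewhere
      Principal: s3.amazonaws.com
      SourceAccount: !Ref "AWS::AccountId"
```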
Brand new to AWS and trying to put together a stack with a CloudFormation template.
The stack will have two EC2 instances with a Windows service running on each. Some of the storage will be on S3 and some on AWS Glacier.
I can't find samples or instructions on how to add Glacier as a resource in the CloudFormation template.
Am I missing something and this is not possible through a CF template?
Has anyone done this before, and can someone provide a sample, if it is possible?
Thanks.
As of 2013-02-27, CloudFormation does not support Glacier.
If/when it does, you'll see Glacier show up in the CloudFormation resource types documentation here:
http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-template-resource-type-ref.html
Any support for auto-migration from S3 to Glacier should show up here:
http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-s3-bucket.html
To register your desire for Amazon to work on this feature, add a +1 comment in this forum thread:
https://forums.aws.amazon.com/thread.jspa?threadID=117947
A workaround would be the following:
Instead, you can use an S3 lifecycle rule to attach a Glacier transition policy. This rule will move objects automatically to Glacier, or even delete objects after some time.
http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-s3-bucket-lifecycleconfig-rule.html
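A sketch of such a rule in a template, with the day counts as placeholders:

```yaml
Resources:
  ArchiveBucket:
    Type: AWS::S3::Bucket
    Properties:
      LifecycleConfiguration:
        Rules:
          - Id: ArchiveThenExpire
            Status: Enabled
            Transitions:
              - StorageClass: GLACIER
                TransitionInDays: 30      # move objects to Glacier after ~30 days
            ExpirationInDays: 365         # optionally delete them entirely after a year
```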
AWS CloudFormation has added support for custom resources, where you can use AWS Lambda functions to do the job of creating resources that CloudFormation doesn't natively support. The resource type in the CloudFormation template would then be AWS::CloudFormation::CustomResource or Custom::String (where String is a name of your choosing).
For more info, check these official AWS docs:
http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-cfn-customresource.html?shortFooter=true
http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/template-custom-resources-lambda.html?shortFooter=true
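As an illustration of the syntax only (GlacierVaultFunction and VaultName are made-up names, and the Lambda behind the ServiceToken is entirely up to you):

```yaml
Resources:
  MyGlacierVault:
    Type: Custom::GlacierVault                        # or AWS::CloudFormation::CustomResource
    Properties:
      ServiceToken: !GetAtt GlacierVaultFunction.Arn  # Lambda that creates/updates/deletes the vault
      VaultName: my-archive-vault                     # arbitrary extra properties are passed to the Lambda
```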