How to update lambda@edge arn in cloudfront distribution using cli - amazon-web-services

I would like to update the CloudFront distribution with the latest Lambda@Edge function using the CLI.
I saw this documentation https://docs.aws.amazon.com/cli/latest/reference/cloudfront/update-distribution.html
but could not figure out how to update only the Lambda ARN.
Can someone help?

Here is a script that does exactly that. It is implemented based on @cloudbud's answer. There is no argument checking. It would be executed like this: ./script QF234ASD342FG my-lambda-at-edge-function us-east-1. In my case, the execution time is less than 10 seconds. See update-distribution for details.
#!/bin/bash
set -xeuo pipefail
export PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin
distribution_id="$1"
function_name="$2"
region="$3"
readonly lambda_arn=$(
aws lambda list-versions-by-function \
--function-name "$function_name" \
--region "$region" \
--query "max_by(Versions, &to_number(to_number(Version) || '0'))" \
| jq -r '.FunctionArn'
)
readonly tmp1=$(mktemp)
readonly tmp2=$(mktemp)
aws cloudfront get-distribution-config \
--id "$distribution_id" \
> "$tmp1"
readonly etag=$(jq -r '.ETag' < "$tmp1")
cat "$tmp1" \
| jq '(.DistributionConfig.CacheBehaviors.Items[] | select(.PathPattern=="dist/sxf/*") | .LambdaFunctionAssociations.Items[] | select(.EventType=="origin-request") | .LambdaFunctionARN ) |= "'"$lambda_arn"'"' \
| jq '.DistributionConfig' \
> "$tmp2"
# the distribution config has to be in a file
# and be referenced in this specific way.
aws cloudfront update-distribution \
--id "$distribution_id" \
--distribution-config "file://$tmp2" \
--if-match "$etag"
rm -f "$tmp1" "$tmp2"

could not figure out how to update the lambda arn only.
The link that you provided explains the process:
The update process includes getting the current distribution configuration, updating the XML document that is returned to make your changes, and then submitting an UpdateDistribution request to make the updates.
This means that you can't just update the Lambda ARN directly. You have to:
Call get-distribution-config to obtain the full current configuration.
Change the Lambda ARN in the configuration data you obtained.
Upload the entire new configuration using update-distribution.
The process requires extra attention, which is also explained in the docs under Warning:
You must strip out the ETag parameter that is returned.
Additional fields are required when you update a distribution.
and more.
The process is indeed complex. Thus, if you can, I would recommend trying this first on a test/dummy CloudFront distribution rather than directly on the production one.

Something like this:
#!/bin/bash
set -x
TEMPDIR=$(mktemp -d)
CONFIG=$(aws cloudfront get-distribution-config --id CGSKSKLSLSM)
ETAG=$(echo "${CONFIG}" | jq -r '.ETag')
echo "${CONFIG}" | jq '.DistributionConfig' > ${TEMPDIR}/orig.json
echo "${CONFIG}" | jq '.DistributionConfig | .DefaultCacheBehavior.LambdaFunctionAssociations.Items[0].LambdaFunctionARN= "arn:aws:lambda:us-east-1:xxxxx:function:test-func:3"' > ${TEMPDIR}/updated.json
aws cloudfront update-distribution --id CGSKSKLSLSM --distribution-config file://${TEMPDIR}/updated.json --if-match "${ETAG}"

Related

How to get the full results of a query to CSV file using AWS/Athena from CLI?

I need to download a full table content that I have in my AWS/Glue/Catalog using AWS/Athena. At the moment, what I do is run select * from my_table from the dashboard and save the result locally as CSV, also from the dashboard. Is there a way to get the same result using the AWS CLI?
From the documentation I can see https://docs.aws.amazon.com/cli/latest/reference/athena/get-query-results.html but it is not quite what I need.
You can run an Athena query with AWS CLI using the aws athena start-query-execution API call. You will then need to poll with aws athena get-query-execution until the query is finished. When that is the case the result of that call will also contain the location of the query result on S3, which you can then download with aws s3 cp.
Here's an example script:
#!/usr/bin/env bash
region=us-east-1 # change this to the region you are using
query='SELECT NOW()' # change this to your query
output_location='s3://example/location' # change this to a writable location
query_execution_id=$(aws athena start-query-execution \
--region "$region" \
--query-string "$query" \
--result-configuration "OutputLocation=$output_location" \
--query QueryExecutionId \
--output text)
while true; do
    status=$(aws athena get-query-execution \
        --region "$region" \
        --query-execution-id "$query_execution_id" \
        --query QueryExecution.Status.State \
        --output text)
    if [[ $status != 'RUNNING' ]]; then
        break
    else
        sleep 5
    fi
done
if [[ $status = 'SUCCEEDED' ]]; then
    result_location=$(aws athena get-query-execution \
        --region "$region" \
        --query-execution-id "$query_execution_id" \
        --query QueryExecution.ResultConfiguration.OutputLocation \
        --output text)
    exec aws s3 cp "$result_location" -
else
    reason=$(aws athena get-query-execution \
        --region "$region" \
        --query-execution-id "$query_execution_id" \
        --query QueryExecution.Status.StateChangeReason \
        --output text)
    echo "Query $query_execution_id failed: $reason" 1>&2
    exit 1
fi
If your primary work group has an output location, or you want to use a different work group which also has a defined output location you can modify the start-query-execution call accordingly. Otherwise you probably have an S3 bucket called aws-athena-query-results-NNNNNNN-XX-XXXX-N that has been created by Athena at some point and that is used for outputs when you use the UI.
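For example, relying on a work group's configured output location instead of passing --result-configuration could look like this (the work group name is a placeholder):
query_execution_id=$(aws athena start-query-execution \
    --region "$region" \
    --query-string "$query" \
    --work-group my-workgroup \
    --query QueryExecutionId \
    --output text)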
You cannot save results from the AWS CLI, but you can Specify a Query Result Location and Amazon Athena will automatically save a copy of the query results in an Amazon S3 location that you specify.
You could then use the AWS CLI to download that results file.
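For example, if your configured result location is s3://my-athena-results/ (a placeholder; the exact key layout depends on your settings), the saved file for a given query execution can be fetched with:
aws s3 cp "s3://my-athena-results/<query-execution-id>.csv" ./results.csv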

AWS IAM - How to show describe policy statements using the CLI?

How can I use the AWS CLI to show an IAM policy's full body including the Effect, Action and Resource statements?
"aws iam list-policies" command lists all the policies but not the actual JSON E,A,R statements contained within the policy.
I could use the "aws iam get-policy-version" command but this does not show the policy name in its output. When I am running this command via a script to obtain information for dozens of policies, there is no way to know which policy the output will belong to.
Is there another way of doing this?
The only way to do this, as you've said, is the following (a minimal sketch follows the steps below):
Get all IAM Policies via the list-policies verb.
Loop over the output, taking the "PolicyId" and "DefaultVersionId".
Pass these into the get-policy-version verb.
Map the PolicyName from the iteration to the PolicyVersion.Document value in the second request.
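A minimal sketch of that loop using only the CLI's own --query filtering (the output formatting here is illustrative; add --scope Local, as suggested in the next answer, to limit it to customer-managed policies):
aws iam list-policies --query 'Policies[].[PolicyName,Arn,DefaultVersionId]' --output text \
| while read -r name arn version_id; do
    echo "=== $name ==="
    aws iam get-policy-version --policy-arn "$arn" --version-id "$version_id" \
        --query 'PolicyVersion.Document' --output json
done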
A slight modification to @uberhumus' suggestion to reduce the number of policies that will be extracted: use the --scope Local qualifier in the query to limit it. Otherwise it will spit out hundreds of policies in the account. Limiting the scope to Local will only list policies that are user-provisioned in the account. Here's the modified version:
RAW_POLICIES=$(aws iam list-policies --scope Local --query Policies[].[Arn,PolicyName,DefaultVersionId])
POLICIES=$(echo $RAW_POLICIES | tr -d " " | sed 's/\],/\]\n/g')
for POLICY in $POLICIES
do echo $POLICY | cut -d '"' -f 4
echo -e "---------------\n"
aws iam get-policy-version --version-id $(echo $POLICY | cut -d '"' -f 6) --policy-arn $(echo $POLICY | cut -d '"' -f 2)
echo -e "\n-----------------\n"
done
As mokugo-devops said in his answer, and you stated in your question, you could only use "get-policy-version" to get the proper JSON. Here is how I would do it:
RAW_POLICIES=$(aws iam list-policies --query Policies[].[Arn,PolicyName,DefaultVersionId])
POLICIES=$(echo $RAW_POLICIES | tr -d " " | sed 's/\],/\]\n/g')
for POLICY in $POLICIES
do echo $POLICY | cut -d '"' -f 4
echo -e "---------------\n"
aws iam get-policy-version --version-id $(echo $POLICY | cut -d '"' -f 6) --policy-arn $(echo $POLICY | cut -d '"' -f 2)
echo -e "\n-----------------\n"
done
Now a bit of explanation about the script:
RAW_POLICIES will get you a giant list of arrays, each containing the policy name (as requested) plus the policy ARN and default version ID (as needed). It will, however, contain spaces that make iterating over it directly in bash less comfortable (though not impossible for the sufficiently stubborn).
To make the upcoming loop easier, we strip the spaces and then use sed to insert the line breaks we need. This is done in the second line, which defines the POLICIES variable.
This leaves us very little to do in the actual loop. There we just print the policy name and some separator lines, and invoke the command that you predicted would be used, get-policy-version.

How do I set up my API to require an API key with amazon API Gateway?

I have been following the advice on this post. I've created an API key on AWS and set my POST method to require an API key.
I have also setup a usage plan and linked that API key to it.
My API key is enabled
When I test requests with Postman, my request still goes through without any additional headers.
I was expecting no requests to go through unless I had included a header in my request like this "x-api-key":"my_api_key"
Do I need to change the endpoint I send requests to in postman for them to go through API Gateway?
If you need to require an API key for each method, then "API Key Required" needs to be set to true for each method.
Go to Resources --> select your resource and method, go to Method Request and set "API Key Required" to true.
https://docs.aws.amazon.com/apigateway/latest/developerguide/how-to-use-postman-to-call-api.html
https://docs.aws.amazon.com/apigateway/latest/developerguide/welcome.html
https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-api-key-source.html
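Also remember that the change only takes effect after the API is deployed to its stage again. Once that is done, a request without the key should get a 403, and a quick check with curl could look like this (the invoke URL and key are placeholders):
curl -H "x-api-key: my_api_key" https://abc123.execute-api.us-east-1.amazonaws.com/prod/your-resource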
If you want, I've made the following script to enable the API key on every method of a given API. It requires the jq tool for advanced JSON parsing.
You can find the script to enable the API key for all methods of an API Gateway API on this gist.
#!/bin/bash
api_gateway_method_enable_api_key() {
    local api_id=$1
    local method_id=$2
    local method=$3
    aws --profile "$profile" --region "$region" \
        apigateway update-method \
        --rest-api-id "$api_id" \
        --resource-id "$method_id" \
        --http-method "$method" \
        --patch-operations op="replace",path="/apiKeyRequired",value="true"
}
# change this to 1 in order to execute the update
do_update=0
profile=your_profile
region=us-east-1
id=your_api_id
tmp_file="/tmp/list_of_endpoint_and_methods.json"
aws --profile $profile --region $region \
apigateway get-resources \
--rest-api-id $id \
--query 'items[?resourceMethods].{p:path,id:id,m:resourceMethods}' >"$tmp_file"
while read -r line; do
    path=$(jq -r '.p' <<<"$line")
    method_id=$(jq -r '.id' <<<"$line")
    echo "$path"
    # do not update OPTIONS method
    for method in GET POST PUT DELETE; do
        has_method=$(jq -r ".m.$method" <<<"$line")
        if [ "$has_method" != "null" ]; then
            if [ $do_update -eq 1 ]; then
                api_gateway_method_enable_api_key "$id" "$method_id" "$method"
                echo " $method method changed"
            else
                echo " $method method will be changed"
            fi
        fi
    done
done <<<"$(jq -c '.[]' "$tmp_file")"

How do I delete a versioned bucket in AWS S3 using the CLI?

I have tried both s3cmd:
$ s3cmd -r -f -v del s3://my-versioned-bucket/
And the AWS CLI:
$ aws s3 rm s3://my-versioned-bucket/ --recursive
But both of these commands simply add DELETE markers to S3. The command for removing a bucket also doesn't work (from the AWS CLI):
$ aws s3 rb s3://my-versioned-bucket/ --force
Cleaning up. Please wait...
Completed 1 part(s) with ... file(s) remaining
remove_bucket failed: s3://my-versioned-bucket/ A client error (BucketNotEmpty) occurred when calling the DeleteBucket operation: The bucket you tried to delete is not empty. You must delete all versions in the bucket.
Ok... how? There's no information in their documentation for this. S3Cmd says it's a 'fully-featured' S3 command-line tool, but it makes no reference to versions other than its own. Is there any way to do this without using the web interface, which will take forever and requires me to keep my laptop on?
I ran into the same limitation of the AWS CLI. I found the easiest solution to be to use Python and boto3:
#!/usr/bin/env python
BUCKET = 'your-bucket-here'
import boto3
s3 = boto3.resource('s3')
bucket = s3.Bucket(BUCKET)
bucket.object_versions.delete()
# if you want to delete the now-empty bucket as well, uncomment this line:
#bucket.delete()
A previous version of this answer used boto but that solution had performance issues with large numbers of keys as Chuckles pointed out.
Using boto3 it's even easier than with the proposed boto solution to delete all object versions in an S3 bucket:
#!/usr/bin/env python
import boto3
s3 = boto3.resource('s3')
bucket = s3.Bucket('your-bucket-name')
bucket.object_versions.all().delete()
Works fine also for very large amounts of object versions, although it might take some time in that case.
You can delete all the objects in the versioned s3 bucket.
But I don't know how to delete specific objects.
$ aws s3api delete-objects \
--bucket <value> \
--delete "$(aws s3api list-object-versions \
--bucket <value> | \
jq '{Objects: [.Versions[] | {Key:.Key, VersionId : .VersionId}], Quiet: false}')"
Alternatively without jq:
$ aws s3api delete-objects \
--bucket ${bucket_name} \
--delete "$(aws s3api list-object-versions \
--bucket "${bucket_name}" \
--output=json \
--query='{Objects: Versions[].{Key:Key,VersionId:VersionId}}')"
These two bash lines are enough for me to enable the bucket deletion:
1: Delete objects
aws s3api delete-objects --bucket ${buckettoempty} --delete "$(aws s3api list-object-versions --bucket ${buckettoempty} --query='{Objects: Versions[].{Key:Key,VersionId:VersionId}}')"
2: Delete markers
aws s3api delete-objects --bucket ${buckettoempty} --delete "$(aws s3api list-object-versions --bucket ${buckettoempty} --query='{Objects: DeleteMarkers[].{Key:Key,VersionId:VersionId}}')"
Looks like as of now, there is an Empty button in the AWS S3 console.
Just select your bucket and click on it. It will ask you to confirm your decision by typing permanently delete.
Note, this will not delete the bucket itself.
Here is a one-liner you can just cut and paste into the command line to delete all versions and delete markers (it requires the AWS tools; set BUCKET_TO_PERGE to your bucket name first):
echo '#!/bin/bash' > deleteBucketScript.sh \
&& aws --output text s3api list-object-versions --bucket $BUCKET_TO_PERGE \
| grep -E "^VERSIONS" |\
awk '{print "aws s3api delete-object --bucket $BUCKET_TO_PERGE --key "$4" --version-id "$8";"}' >> \
deleteBucketScript.sh && . deleteBucketScript.sh; rm -f deleteBucketScript.sh; echo '#!/bin/bash' > \
deleteBucketScript.sh && aws --output text s3api list-object-versions --bucket $BUCKET_TO_PERGE \
| grep -E "^DELETEMARKERS" | grep -v "null" \
| awk '{print "aws s3api delete-object --bucket $BUCKET_TO_PERGE --key "$3" --version-id "$5";"}' >> \
deleteBucketScript.sh && . deleteBucketScript.sh; rm -f deleteBucketScript.sh;
then you could use:
aws s3 rb s3://bucket-name --force
If you have to delete/empty large S3 buckets, it becomes quite inefficient (and expensive) to delete every single object and version. It's often more convenient to let AWS expire all objects and versions.
aws s3api put-bucket-lifecycle-configuration \
--lifecycle-configuration '{"Rules":[{
"ID":"empty-bucket",
"Status":"Enabled",
"Prefix":"",
"Expiration":{"Days":1},
"NoncurrentVersionExpiration":{"NoncurrentDays":1}
}]}' \
--bucket YOUR-BUCKET
Then you just have to wait 1 day and the bucket can be deleted with:
aws s3api delete-bucket --bucket YOUR-BUCKET
For those using multiple profiles via ~/.aws/config
import boto3
PROFILE = "my_profile"
BUCKET = "my_bucket"
session = boto3.Session(profile_name = PROFILE)
s3 = session.resource('s3')
bucket = s3.Bucket(BUCKET)
bucket.object_versions.delete()
One way to do it is to iterate through the versions and delete them. That's a bit tricky on the CLI, but in Java it is more straightforward:
AmazonS3Client s3 = new AmazonS3Client();
String bucketName = "deleteversions-"+UUID.randomUUID();
//Creates Bucket
s3.createBucket(bucketName);
//Enable Versioning
BucketVersioningConfiguration configuration = new BucketVersioningConfiguration(ENABLED);
s3.setBucketVersioningConfiguration(new SetBucketVersioningConfigurationRequest(bucketName, configuration ));
//Puts versions
s3.putObject(bucketName, "some-key",new ByteArrayInputStream("some-bytes".getBytes()), null);
s3.putObject(bucketName, "some-key",new ByteArrayInputStream("other-bytes".getBytes()), null);
//Removes all versions
for ( S3VersionSummary version : S3Versions.inBucket(s3, bucketName) ) {
String key = version.getKey();
String versionId = version.getVersionId();
s3.deleteVersion(bucketName, key, versionId);
}
//Removes the bucket
s3.deleteBucket(bucketName);
System.out.println("Done!");
You can also batch delete calls for efficiency if needed.
If you want pure CLI approach (with jq):
aws s3api list-object-versions \
--bucket $bucket \
--region $region \
--query "Versions[].Key" \
--output json | jq 'unique' | jq -r '.[]' | while read key; do
echo "deleting versions of $key"
aws s3api list-object-versions \
--bucket $bucket \
--region $region \
--prefix $key \
--query "Versions[].VersionId" \
--output json | jq 'unique' | jq -r '.[]' | while read version; do
echo "deleting $version"
aws s3api delete-object \
--bucket $bucket \
--key $key \
--version-id $version \
--region $region
done
done
A simple bash loop I've found and implemented for N buckets:
for b in $(ListOfBuckets); do \
echo "Emptying $b"; \
aws s3api delete-objects --bucket $b --delete "$(aws s3api list-object-versions --bucket $b --output=json --query='{Objects: *[].{Key:Key,VersionId:VersionId}}')"; \
done
I ran into issues with Abe's solution, as the list_buckets generator is used to create a massive list called all_keys, and I spent an hour without it ever completing. This tweak seems to work better for me; I had close to a million objects in my bucket and counting!
import boto
s3 = boto.connect_s3()
bucket = s3.get_bucket("your-bucket-name-here")
chunk_counter = 0  # this is simply a nice to have
keys = []
for key in bucket.list_versions():
    keys.append(key)
    if len(keys) > 1000:
        bucket.delete_keys(keys)
        chunk_counter += 1
        keys = []
        print("Another 1000 done.... {n} chunks so far".format(n=chunk_counter))
#bucket.delete() #as per usual uncomment if you're sure!
Hopefully this helps anyone else encountering this S3 nightmare!
For deleting specific object(s), use a jq filter.
You may need to clean up the 'DeleteMarkers', not just the 'Versions'.
By using $() instead of ``, you can embed variables for the bucket name and key value.
aws s3api delete-objects --bucket bucket-name --delete "$(aws s3api list-object-versions --bucket bucket-name | jq -M '{Objects: [.["Versions","DeleteMarkers"][]|select(.Key == "key-value")| {Key:.Key, VersionId : .VersionId}], Quiet: false}')"
Even though technically it's not the AWS CLI, I'd recommend using AWS Tools for PowerShell for this task. Then you can use the simple command below:
Remove-S3Bucket -BucketName {bucket-name} -DeleteBucketContent -Force -Region {region}
As stated in the documentation, DeleteBucketContent flag does the following:
"If set, all remaining objects and/or object versions in the bucket
are deleted proir (sic) to the bucket itself being deleted"
Reference: https://docs.aws.amazon.com/powershell/latest/reference/items/Remove-S3Bucket.html
This bash script found here: https://gist.github.com/weavenet/f40b09847ac17dd99d16
worked as is for me.
I saved script as: delete_all_versions.sh and then simply ran:
./delete_all_versions.sh my_foobar_bucket
and that worked without a flaw.
Did not need python or boto or anything.
You can do this from the AWS Console using Lifecycle Rules.
Open the bucket in question. Click the Management tab at the top.
Make sure the Lifecycle Sub Tab is selected.
Click + Add lifecycle rule
On Step 1 (Name and scope) enter a rule name (e.g. removeall)
Click Next to Step 2 (Transitions)
Leave this as is and click Next.
You are now on the 3. Expiration step.
Check the checkboxes for both Current Version and Previous Versions.
Click the checkbox for "Expire current version of object" and enter the number 1 for "After _____ days from object creation
Click the checkbox for "Permanently delete previous versions" and enter the number 1 for
"After _____ days from becoming a previous version"
click the checkbox for "Clean up incomplete multipart uploads"
and enter the number 1 for "After ____ days from start of upload"
Click Next
Review what you just did.
Click Save
Come back in a day and see how it is doing.
I improved the boto3 answer with Python3 and argv.
Save the following script as something like s3_rm.py.
#!/usr/bin/env python3
import sys
import boto3
def main():
    args = sys.argv[1:]
    if (len(args) < 1):
        print("Usage: {} s3_bucket_name".format(sys.argv[0]))
        exit()
    s3 = boto3.resource('s3')
    bucket = s3.Bucket(args[0])
    bucket.object_versions.delete()
    # if you want to delete the now-empty bucket as well, uncomment this line:
    #bucket.delete()
if __name__ == "__main__":
    main()
Make it executable: chmod +x s3_rm.py.
Run it like ./s3_rm.py my_bucket_name.
In the same vein as https://stackoverflow.com/a/63613510/805031 ... this is what I use to clean up accounts before closing them:
# If the data is too large, apply LCP to remove all objects within a day
# Create lifecycle-expire.json with the LCP required to purge all objects
# Based on instructions from: https://aws.amazon.com/premiumsupport/knowledge-center/s3-empty-bucket-lifecycle-rule/
cat << JSON > lifecycle-expire.json
{
"Rules": [
{
"ID": "remove-all-objects-asap",
"Filter": {
"Prefix": ""
},
"Status": "Enabled",
"Expiration": {
"Days": 1
},
"NoncurrentVersionExpiration": {
"NoncurrentDays": 1
},
"AbortIncompleteMultipartUpload": {
"DaysAfterInitiation": 1
}
},
{
"ID": "remove-expired-delete-markers",
"Filter": {
"Prefix": ""
},
"Status": "Enabled",
"Expiration": {
"ExpiredObjectDeleteMarker": true
}
}
]
}
JSON
# Apply to ALL buckets
aws s3 ls | cut -d" " -f 3 | xargs -I{} aws s3api put-bucket-lifecycle-configuration --bucket {} --lifecycle-configuration file://lifecycle-expire.json
# Apply to a single bucket; replace $BUCKET_NAME
aws s3api put-bucket-lifecycle-configuration --bucket $BUCKET_NAME --lifecycle-configuration file://lifecycle-expire.json
...then a day later you can come back and delete the buckets using something like:
# To force empty/delete all buckets
aws s3 ls | cut -d" " -f 3 | xargs -I{} aws s3 rb s3://{} --force
# To remove only empty buckets
aws s3 ls | cut -d" " -f 3 | xargs -I{} aws s3 rb s3://{}
# To force empty/delete a single bucket; replace $BUCKET_NAME
aws s3 rb s3://$BUCKET_NAME --force
It saves a lot of time and money so worth doing when you have many TBs to delete.
I found the other answers either incomplete or requiring external dependencies to be installed (like boto), so here is one that is inspired by those but goes a little deeper.
As documented in Working with Delete Markers, before a versioned bucket can be removed, all its versions must be completely deleted, which is a 2-step process:
"delete" all version objects in the bucket, which marks them as
deleted but does not actually delete them
complete the deletion by deleting all the deletion marker objects
Here is the pure CLI solution that worked for me (inspired by the other answers):
#!/usr/bin/env bash
bucket_name=...
del_s3_bucket_obj()
{
    local bucket_name=$1
    local obj_type=$2
    local query="{Objects: $obj_type[].{Key:Key,VersionId:VersionId}}"
    local s3_objects=$(aws s3api list-object-versions --bucket ${bucket_name} --output=json --query="$query")
    if ! (echo $s3_objects | grep -q '"Objects": null'); then
        aws s3api delete-objects --bucket "${bucket_name}" --delete "$s3_objects"
    fi
}
del_s3_bucket_obj ${bucket_name} 'Versions'
del_s3_bucket_obj ${bucket_name} 'DeleteMarkers'
Once this is done, the following will work:
aws s3 rb "s3://${bucket_name}"
Not sure how it will fare with 1000+ objects though, if anyone can report that would be awesome.
By far the easiest method I've found is to use this CLI tool, s3wipe. It's provided as a docker container so you can use it like so:
$ docker run -it --rm slmingol/s3wipe --help
usage: s3wipe [-h] --path PATH [--id ID] [--key KEY] [--dryrun] [--quiet]
[--batchsize BATCHSIZE] [--maxqueue MAXQUEUE]
[--maxthreads MAXTHREADS] [--delbucket] [--region REGION]
Recursively delete all keys in an S3 path
optional arguments:
-h, --help show this help message and exit
--path PATH S3 path to delete (e.g. s3://bucket/path)
--id ID Your AWS access key ID
--key KEY Your AWS secret access key
--dryrun Don't delete. Print what we would have deleted
--quiet Suprress all non-error output
--batchsize BATCHSIZE # of keys to batch delete (default 100)
--maxqueue MAXQUEUE Max size of deletion queue (default 10k)
--maxthreads MAXTHREADS Max number of threads (default 100)
--delbucket If S3 path is a bucket path, delete the bucket also
--region REGION Region of target S3 bucket. Default value `us-east-1`
Example
Here's an example where I'm deleting all the versioned objects in a bucket and then deleting the bucket:
$ docker run -it --rm slmingol/s3wipe \
--id $(aws configure get default.aws_access_key_id) \
--key $(aws configure get default.aws_secret_access_key) \
--path s3://bw-tf-backends-aws-example-logs \
--delbucket
[2019-02-20@03:39:16] INFO: Deleting from bucket: bw-tf-backends-aws-example-logs, path: None
[2019-02-20@03:39:16] INFO: Getting subdirs to feed to list threads
[2019-02-20@03:39:18] INFO: Done deleting keys
[2019-02-20@03:39:18] INFO: Bucket is empty. Attempting to remove bucket
How it works
There's a bit to unpack here but the above is doing the following:
docker run -it --rm slmingol/s3wipe - runs the s3wipe container interactively and deletes it after each execution
--id & --key - passing our access key and access id in
aws configure get default.aws_access_key_id - retrieves our key id
aws configure get default.aws_secret_access_key - retrieves our key secret
--path s3://bw-tf-backends-aws-example-logs - bucket that we want to delete
--delbucket - deletes bucket once emptied
References
https://github.com/slmingol/s3wipe
Is there a way to export an AWS CLI Profile to Environment Variables?
https://cloud.docker.com/u/slmingol/repository/docker/slmingol/s3wipe
https://gist.github.com/wknapik/191619bfa650b8572115cd07197f3baf
#!/usr/bin/env bash
set -eEo pipefail
shopt -s inherit_errexit >/dev/null 2>&1 || true
if [[ ! "$#" -eq 2 || "$1" != --bucket ]]; then
echo -e "USAGE: $(basename "$0") --bucket <bucket>"
exit 2
fi
# $# := bucket_name
empty_bucket() {
local -r bucket="${1:?}"
for object_type in Versions DeleteMarkers; do
local opt=() next_token=""
while [[ "$next_token" != null ]]; do
page="$(aws s3api list-object-versions --bucket "$bucket" --output json --max-items 1000 "${opt[#]}" \
--query="[{Objects: ${object_type}[].{Key:Key, VersionId:VersionId}}, NextToken]")"
objects="$(jq -r '.[0]' <<<"$page")"
next_token="$(jq -r '.[1]' <<<"$page")"
case "$(jq -r .Objects <<<"$objects")" in
'[]'|null) break;;
*) opt=(--starting-token "$next_token")
aws s3api delete-objects --bucket "$bucket" --delete "$objects";;
esac
done
done
}
empty_bucket "${2#s3://}"
E.g. empty_bucket.sh --bucket foo
This will delete all object versions and delete markers in a bucket in batches of 1000. Afterwards, the bucket can be deleted with aws s3 rb s3://foo.
Requires bash, awscli and jq.
This works for me. Maybe you are running later versions of something and have more than 1,000 items. I have been running it on a couple of million files now; however, it is still not finished after half a day, and there is no way to validate progress in the AWS GUI =/
# Set bucket name to clearout
BUCKET = 'bucket-to-clear'
import boto3
s3 = boto3.resource('s3')
bucket = s3.Bucket(BUCKET)
max_len = 1000       # max 1000 items at one req
chunk_counter = 0    # just to keep track
keys = []            # collect to delete
# clear files
def clearout():
    global bucket
    global chunk_counter
    global keys
    result = bucket.delete_objects(Delete=dict(Objects=keys))
    if result["ResponseMetadata"]["HTTPStatusCode"] != 200:
        print("Issue with response")
        print(result)
    chunk_counter += 1
    keys = []
    print(". {n} chunks so far".format(n=chunk_counter))
    return
# start
for key in bucket.object_versions.all():
    item = {'Key': key.object_key, 'VersionId': key.id}
    keys.append(item)
    if len(keys) >= max_len:
        clearout()
# make sure last files are cleared as well
if len(keys) > 0:
    clearout()
print("")
print("Done, {n} items deleted".format(n=chunk_counter*max_len))
#bucket.delete() #as per usual uncomment if you're sure!
To add to the Python solutions provided here: if you are getting a boto.exception.S3ResponseError: S3ResponseError: 400 Bad Request error, try creating a ~/.boto file with the following data:
[Credentials]
aws_access_key_id = aws_access_key_id
aws_secret_access_key = aws_secret_access_key
[s3]
host=s3.eu-central-1.amazonaws.com
aws_access_key_id = aws_access_key_id
aws_secret_access_key = aws_secret_access_key
This helped me to delete a bucket in the Frankfurt region.
Original answer: https://stackoverflow.com/a/41200567/2586441
If you use the AWS SDK for JavaScript S3 Client for Node.js (@aws-sdk/client-s3), you can use the following code:
const { S3Client, ListObjectsCommand } = require('@aws-sdk/client-s3')
const endpoint = 'YOUR_END_POINT'
const region = 'YOUR_REGION'
// Create an Amazon S3 service client object.
const s3Client = new S3Client({ region, endpoint })
const deleteEverythingInBucket = async bucketName => {
  console.log('Deleting all objects in the bucket')
  const bucketParams = {
    Bucket: bucketName
  }
  try {
    const command = new ListObjectsCommand(bucketParams)
    const data = await s3Client.send(command)
    console.log('Bucket Data', JSON.stringify(data))
    if (data?.Contents?.length > 0) {
      console.log('Removing objects in the bucket', data.Contents.length)
      for (const object of data.Contents) {
        console.log('Removing object', object)
        if (object.Key) {
          try {
            // deleteFromS3 is a helper assumed to be defined elsewhere
            // (e.g. wrapping DeleteObjectCommand for a single key)
            await deleteFromS3({
              Bucket: bucketName,
              Key: object.Key
            })
          } catch (err) {
            console.log('Error on object delete', err)
          }
        }
      }
    }
  } catch (err) {
    console.log('Error creating presigned URL', err)
  }
}
For my case, I wanted to be sure that all objects for specific prefixes would be deleted. So we generate a list of all objects for each prefix, split it into chunks of 1,000 records (an AWS limitation), and delete them.
Please note that the AWS CLI and jq must be installed and configured.
A text file with the prefixes that we want to delete was created (in the example below, prefixes.txt).
The format is:
prefix1
prefix2
And this is a shell script (also please change the BUCKET_NAME with the real name):
#!/bin/sh
BUCKET="BUCKET_NAME"
PREFIXES_FILE="prefixes.txt"
if [ -f "$PREFIXES_FILE" ]; then
while read -r current_prefix
do
printf '***** PREFIX %s *****\n' "$current_prefix"
OLD_OBJECTS_FILE="$current_prefix-all.json"
if [ -f "$OLD_OBJECTS_FILE" ]; then
printf 'Deleted %s...\n' "$OLD_OBJECTS_FILE"
rm "$OLD_OBJECTS_FILE"
fi
cmd="aws s3api list-object-versions --bucket \"$BUCKET\" --prefix \"$current_prefix/\" --query \"[Versions,DeleteMarkers][].{Key: Key, VersionId: VersionId}\" >> $OLD_OBJECTS_FILE"
echo "$cmd"
eval "$cmd"
no_of_obj=$(cat "$OLD_OBJECTS_FILE" | jq 'length')
i=0
page=0
#Get old version Objects
echo "Objects versions count: $no_of_obj"
while [ $i -lt "$no_of_obj" ]
do
next=$((i+999))
old_versions=$(cat "$OLD_OBJECTS_FILE" | jq '.[] | {Key,VersionId}' | jq -s '.' | jq .[$i:$next])
paged_file_name="$current_prefix-page-$page.json"
cat << EOF > "$paged_file_name"
{"Objects":$old_versions, "Quiet":true}
EOF
echo "Deleting records from $i - $next"
cmd="aws s3api delete-objects --bucket \"$BUCKET\" --delete file://$paged_file_name"
echo "$cmd"
eval "$cmd"
i=$((i+1000))
page=$((page+1))
done
done < "$PREFIXES_FILE"
else
echo "$PREFIXES_FILE does not exist."
fi
If you just want to check the list of objects without deleting them immediately, comment out or remove the last eval "$cmd".
I needed to delete older object versions but keep the current version in the bucket. The code uses iterators and works on buckets of any size with any number of objects.
import boto3
from itertools import islice
bucket = boto3.resource('s3').Bucket('bucket_name')
all_versions = bucket.object_versions.all()
stale_versions = iter(filter(lambda x: not x.is_latest, all_versions))
pages = iter(lambda: tuple(islice(stale_versions, 1000)), ())
for page in pages:
    bucket.delete_objects(
        Delete={
            'Objects': [{
                'Key': item.key,
                'VersionId': item.version_id
            } for item in page]
        })
S3=s3://tmobi-processed/biz.db/
aws s3 rm ${S3} --recursive
BUCKET=`echo ${S3} | egrep -o 's3://[^/]*' | sed -e s/s3:\\\\/\\\\///g`
PREFIX=`echo ${S3} | sed -e s/s3:\\\\/\\\\/${BUCKET}\\\\///g`
aws s3api list-object-versions \
--bucket ${BUCKET} \
--prefix ${PREFIX} |
jq -r '.Versions[] | .Key + " " + .VersionId' |
while read key id ; do
aws s3api delete-object \
--bucket ${BUCKET} \
--key ${key} \
--version-id ${id} >> versions.txt
done
aws s3api list-object-versions \
--bucket ${BUCKET} \
--prefix ${PREFIX} |
jq -r '.DeleteMarkers[] | .Key + " " + .VersionId' |
while read key id ; do
aws s3api delete-object \
--bucket ${BUCKET} \
--key ${key} \
--version-id ${id} >> delete_markers.txt
done
You can use the AWS CLI to delete an S3 bucket:
aws s3 rb s3://your-bucket-name
If the AWS CLI is not installed on your computer, you can use the following commands.
For Linux or Ubuntu:
sudo apt-get install awscli
Then check whether it is installed:
aws --version
Now configure it by providing your AWS access credentials:
aws configure
Then give the access key, secret access key, and your region.
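If you prefer to script the configuration instead of answering the interactive prompts, aws configure set works too (the values below are placeholders):
aws configure set aws_access_key_id AKIAEXAMPLEKEY
aws configure set aws_secret_access_key exampleSecretKey123
aws configure set region us-east-1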

Exporting DNS zonefile from Amazon Route 53

I would like to export a DNS zonefile from my Amazon Route 53 setup. Is this possible, or can zonefiles only be created manually? (e.g. through http://www.zonefile.org/?lang=en)
The following script exports zone details in BIND format from Route53. Pass the domain name as a parameter to the script. (This requires awscli and jq to be installed and configured.)
#!/bin/bash
zonename=$1
hostedzoneid=$(aws route53 list-hosted-zones --output json | jq -r ".HostedZones[] | select(.Name == \"$zonename.\") | .Id" | cut -d'/' -f3)
aws route53 list-resource-record-sets --hosted-zone-id $hostedzoneid --output json | jq -jr '.ResourceRecordSets[] | "\(.Name) \t\(.TTL) \t\(.Type) \t\(.ResourceRecords[]?.Value)\n"'
It's not possible yet. You'll have to use the API's ListResourceRecordSets and build the zonefile yourself.
As stated in the comment, cli53 is a great tool for interacting with Route 53 from the command line.
First, configure your account keys in ~/.aws/config file:
[default]
aws_access_key_id = AK.....ZP
aws_secret_access_key = 8j.....M0
Then, use the export command:
$ cli53 export --full --debug example.com > example.com.zone 2> example.com.zone.log
Verify the example.com.zone file after export to make sure that everything is exported correctly.
You can import the zone later:
$ cli53 import --file ./example.com.zone example.com
And if you want to transfer a Route53 zone from one AWS account to another, you can use the profile option. Just add two named profiles to the ~/.aws/config file and reference them with the --profile option during export and import. You can even pipe these two commands.
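A sketch of such a transfer between two named profiles (the profile names are placeholders; verify the zone file before importing):
$ cli53 export --full --profile source-account example.com > example.com.zone
$ cli53 import --file ./example.com.zone --profile target-account example.com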
You can export a JSON file:
aws route53 list-resource-record-sets --hosted-zone-id <zone-id-here> --output json > route53-records.json
You can export with aws api
aws route53 list-resource-record-sets --hosted-zone-id YOUR_ZONE_ID
Exporting and importing is possible with https://github.com/RisingOak/route53-transfer
Based on @szentmarjay's answer above, except it shows usage and supports zone_id or zone_name. This is my favorite because it's the standard old-school BIND format, so other tools can do stuff with it.
#!/bin/bash
# r53_export
usage() {
    local cmd=$(basename "$0")
    echo -e >&2 "\nUsage: $cmd {--id ZONE_ID|--domain ZONE_NAME}\n"
    exit 1
}
while [[ $1 ]]; do
    if [[ $1 == --id ]]; then shift; zone_id="$1"
    elif [[ $1 == --domain ]]; then shift; zone_name="$1"
    else usage
    fi
    shift
done
if [[ $zone_name ]]; then
    zone_id=$(
        aws route53 list-hosted-zones --output json \
        | jq -r ".HostedZones[] | select(.Name == \"$zone_name.\") | .Id" \
        | head -n1 \
        | cut -d/ -f3
    )
    echo >&2 "+ Found zone id: '$zone_id'"
fi
[[ $zone_id ]] || usage
aws route53 list-resource-record-sets --hosted-zone-id $zone_id --output json \
| jq -jr '.ResourceRecordSets[] | "\(.Name) \t\(.TTL) \t\(.Type) \t\(.ResourceRecords[]?.Value)\n"'