Delete the oldest AWS EC2 snapshots - amazon-web-services

I'm trying to remove all my AWS EC2 snapshots except the last 6 with this script:
#!/bin/bash
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
# Backup script
Volume="{VOL-DATA}"
Owner="{OWNER}"
Description="{DESCRIPTION}"
Local_numbackups=6
Local_region="us-west-1"
# Remove old snapshots associated to a description, keep the last $Local_numbackups
aws ec2 describe-snapshots --filters Name=description,Values=$Description | grep "SnapshotId" | head -n -$Local_numbackups | awk '{print $2}' | sed -e 's/,//g' | xargs -n 1 -t aws ec2 delete-snapshot --snapshot-id
However, it doesn't work. It deletes snapshots, but not the oldest ones. Why?

You're trying to do something too complex to be handled (gracefully) in one line, so we'll need to break it down a bit. First, let's get the snapshots sorted by age, oldest to newest:
aws ec2 describe-snapshots --filters Name=description,Values=$Description --query 'Snapshots[*].[StartTime,SnapshotId]' --output text | sort -n
Then we can drop the StartTime field to get the snapshot ID alone:
aws ec2 describe-snapshots --filters Name=description,Values=$Description --query 'Snapshots[*].[StartTime,SnapshotId]' --output text | sort -n | sed -e 's/^.*\t//'
head (or tail) aren't really suitable for discarding the fixed number of snapshots we want to keep. We need to filter those out another way. So, putting it all together:
# Get array of snapshot IDs sorted by age (oldest to newest)
snapshots=($(aws ec2 describe-snapshots --filters Name=description,Values=$Description --query 'Snapshots[*].[StartTime,SnapshotId]' --output text | sort -n | sed -e 's/^.*\t//'))
# Get number of snapshots
count=${#snapshots[@]}
if [ "$count" -le "$Local_numbackups" ]; then
  echo "We already have $Local_numbackups or fewer snapshots"
  exit 0
else
  # Drop the last (newest) $Local_numbackups IDs from the array
  snapshots=("${snapshots[@]:0:$((count - Local_numbackups))}")
  # Loop through the remaining snapshots and delete
  for snapshot in "${snapshots[@]}"; do
    aws ec2 delete-snapshot --snapshot-id "$snapshot"
  done
fi
(While it's obviously possible to do this in bash with the AWS CLI, it's complex enough that I'd personally rather use a more robust language and the AWS SDK.)
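As a minimal sketch of the selection step alone, the function below reads "StartTime<TAB>SnapshotId" lines (the text output of the query above) and prints every ID except the newest N. The name select_old_snapshots is made up for illustration, and the sketch assumes all timestamps share one ISO 8601 format, so that a byte-wise sort orders them by age.

```shell
#!/usr/bin/env bash
# select_old_snapshots is a hypothetical helper, not part of the AWS CLI.
# stdin:  lines of "StartTime<TAB>SnapshotId", in any order
# $1:     how many of the newest snapshots to keep
# stdout: IDs of the snapshots that fall outside the newest $1
select_old_snapshots() {
  local keep="$1"
  # ISO 8601 timestamps in one format sort chronologically as plain text.
  LC_ALL=C sort | awk -F'\t' -v keep="$keep" '
    { ids[NR] = $2 }
    END { for (i = 1; i <= NR - keep; i++) print ids[i] }'
}
```

It could be fed from the describe-snapshots call above, and its output piped to xargs -n 1 aws ec2 delete-snapshot --snapshot-id once the printed list looks right.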

Here is a sample.
days2keep="30"
region="us-west-2"
name="jdoe"
# date: the -j and -v flags are the macOS (BSD) syntax
cutoffdate=`date -j -v-${days2keep}d '+%Y-%m-%d'`
echo "Finding list of snapshots before $cutoffdate "
oldsnapids=$(aws ec2 describe-snapshots --region $region --filters Name=tag:Name,Values=$name --query Snapshots[?StartTime\<=\`$cutoffdate\`].SnapshotId --output text)
for snapid in $oldsnapids
do
echo Deleting snapshot $snapid
aws ec2 delete-snapshot --snapshot-id $snapid --region $region
done
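Since the date flags above only work on macOS, here is a hedged sketch of a cutoff-date helper that tries the GNU syntax first and falls back to the BSD one. The function name cutoff_date is an assumption for illustration.

```shell
#!/usr/bin/env bash
# cutoff_date is a hypothetical helper.
# $1:     number of days to go back
# stdout: the resulting date as YYYY-MM-DD
cutoff_date() {
  local days="$1"
  # GNU date (Linux) understands relative date strings via -d.
  date -d "$days days ago" '+%Y-%m-%d' 2>/dev/null && return 0
  # BSD date (macOS) adjusts the current time with -v instead.
  date -j -v-"${days}"d '+%Y-%m-%d'
}
```

With this, the assignment in the sample would become cutoffdate=$(cutoff_date "$days2keep").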

We can delete all the old snapshots with the steps below.
List the IDs of all the old snapshots and put them in a file, for example /opt/snapshot.txt.
Then run "aws configure" to set up command-line access to the AWS account. It prompts for credentials, such as:
AWS Access Key ID [None]: XXXXXXXXXXXXXXXXXX
AWS Secret Access Key [None]: XXXXXXXXXXXXXXXXXXXXX
Default region name [None]: XXXXXXXXXXXXXXXX
After that, run the shell script below, which reads the snapshot IDs from that file.
Code:
#!/bin/bash
list=$(cat /opt/snapshot.txt)
for i in $list
do
aws ec2 delete-snapshot --snapshot-id $i
if [ $? -eq 0 ]; then
echo Going Good
else
echo FAIL
fi
done
Thanks
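A slightly more defensive variant of the loop above, as a sketch: it reads the file line by line, skips anything that does not look like a snapshot ID, and takes the delete command as arguments so it can be rehearsed with echo first. The name delete_from_file and the ID pattern (snap- followed by hex digits) are assumptions.

```shell
#!/usr/bin/env bash
# delete_from_file is a hypothetical helper.
# $1:   file with one snapshot ID per line
# $2..: command to run for each ID (the ID is appended as the last argument)
delete_from_file() {
  local file="$1"; shift
  local id failed=0
  while IFS= read -r id || [ -n "$id" ]; do
    # Skip blank lines and anything that does not look like a snapshot ID.
    [[ $id =~ ^snap-[0-9a-f]+$ ]] || continue
    # </dev/null keeps the command from consuming the rest of the file.
    if ! "$@" "$id" < /dev/null; then
      echo "FAIL $id" >&2
      failed=1
    fi
  done < "$file"
  return "$failed"
}
```

A rehearsal would be delete_from_file /opt/snapshot.txt echo would delete, and the real run delete_from_file /opt/snapshot.txt aws ec2 delete-snapshot --snapshot-id.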

Related

How to list all AWS RDS instances and their tags in CSV

I'm new to the AWS CLI and I am trying to build a CSV server inventory of my project's AWS RDS instances that includes their tags.
I have done so successfully with EC2 instances using this:
aws ec2 describe-instances \
--query 'Reservations[*].Instances[*].[PrivateIpAddress, InstanceType, [Tags[?Key==`Name`].Value][0][0], [Tags[?Key==`ENV`].Value][0][0]]' \
--output text | sed -E 's/\s+/,/g' >> ec2list.csv
The above command gives me a CSV with the IP address, instance type, as well as the values of the listed tags.
However, I am currently trying to do so unsuccessfully on RDS instances with this:
aws rds describe-db-instances \
--query 'DBInstances[*].[DBInstanceIdentifier, DBInstanceArn, [Tags[?Key==`Component`].Value][0][0], [Tags[?Key==`Engine`].Value][0][0]]' \
--output text | sed -E 's/\s+/,/g' >> rdslist.csv
The RDS command only returns the instance arn and identifier but the tag values show up as none even though they definitely do have a value.
What modifications need to be made to my RDS query to show the tag values/is this even possible? Thanks
Probably you will need one more command https://docs.aws.amazon.com/AmazonRDS/latest/APIReference//API_ListTagsForResource.html.
You can wrap the two commands in a shell script like the example below.
#!/bin/bash
ARNS=$(aws rds describe-db-instances --query "DBInstances[].DBInstanceArn" --output text)
for line in $ARNS; do
TAGS=$(aws rds list-tags-for-resource --resource-name "$line" --query "TagList[]")
echo $line $TAGS
done
I realized that tags can be displayed with my original query. RDS does not use Tags like EC2 instances but TagList. E.g.,
aws rds describe-db-instances \
--query 'DBInstances[*].[DBInstanceIdentifier, DBInstanceArn, [TagList[?Key==`Component`].Value][0][0], [TagList[?Key==`Engine`].Value][0][0]]' \
--output text | sed -E 's/\s+/,/g' >> rdslist.csv
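One caveat with the sed step: it replaces every run of whitespace, so a tag value that contains a space would be split into two columns. As a sketch under the assumption that --output text separates fields with tabs, a small awk filter can split on tabs only, blank out None, and quote fields that need it. The name tsv_to_csv is made up for illustration.

```shell
#!/usr/bin/env bash
# tsv_to_csv is a hypothetical helper.
# stdin:  tab-separated rows, as printed by --output text
# stdout: comma-separated rows; "None" becomes an empty field
tsv_to_csv() {
  awk -F'\t' -v OFS=',' '{
    for (i = 1; i <= NF; i++) {
      if ($i == "None") $i = ""
      gsub(/"/, "\"\"", $i)                  # escape embedded quotes
      if ($i ~ /[," ]/) $i = "\"" $i "\""    # quote fields that need it
    }
    $1 = $1                                  # force a rebuild with OFS
    print
  }'
}
```

It would take the place of the sed call, e.g. ... --output text | tsv_to_csv >> rdslist.csv.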

Query AWS CLI to populate Jenkins "Active Choices Reactive Parameter" (Linux)

I have a Jenkins 2.0 job where I require the user to select the list of servers to execute the job against via a Jenkins "Active Choices Reactive Parameter". These servers which the job will execute against are AWS EC2 instances. Instead of hard-coding the available servers in the "Active Choices Reactive Parameter", I'd like to query the AWS CLI to get a list of servers.
A few notes:
I've assigned the Jenkins 2.0 EC2 an IAM role which has sufficient privileges to query AWS via the CLI.
The AWS CLI is installed on the Jenkins EC2.
The "Active Choices Reactive Parameter" will return a list of checkboxes if I hardcode values in a Groovy script, such as:
return ["10.1.1.1", "10.1.1.2", "10.1.1.3"]
I know my awk commands can be improved, I'm not yet sure how, but my primary goal is to get the list of servers dynamically loaded in Jenkins.
I can run the following command directly on the EC2 instance which is hosting Jenkins:
aws ec2 describe-instances --region us-east-2 --filters "Name=tag:Env,Values=qa" --query "Reservations[*].Instances[*].PrivateIpAddress" |
grep -o '\"[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\"' |
awk {'printf $0 ", "'} |
awk {'print "[" $0'} |
awk {'gsub(/^[ \t]+|[ \t]+$/, ""); print'} |
awk {'print substr ($0, 1, length($0)-1)'} |
awk {'print $0 "]"'}
This will return the following, which is in the format expected by the "Active Choices Reactive Parameter":
["10.1.1.1", "10.1.1.2", "10.1.1.3"]
So, in the "Script" textarea of the "Active Choices Reactive Parameter", I have the following script. The problem is that my server list is never populated. I've tried numerous variations of this script without luck. Can someone please tell me where I've gone wrong and what I can do to correct this script so that my list of server IP addresses is dynamically loaded into a Jenkins job?
def standardOut = new StringBuffer(), standardErr = new StringBuffer()
def command = $/
aws ec2 describe-instances --region us-east-2 --filters "Name=tag:Env,Values=qaint" --query "Reservations[*].Instances[*].PrivateIpAddress" |
grep -o '\"[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\"' |
awk {'printf $0 ", "'} |
awk {'print "[" $0'} |
awk {'gsub(/^[ \t]+|[ \t]+$/, ""); print'} |
awk {'print substr ($0, 1, length($0)-1)'} |
awk {'print $0 "]"'}
/$
def proc = command.execute()
proc.consumeProcessOutput(standardOut, standardErr)
proc.waitForOrKill(1000)
return standardOut
I tried to execute your script and standardErr contained some errors. Groovy's String.execute() does not run the command through a shell, so the double quotes and the pipes are passed to the AWS CLI as literal arguments. Here is a cleaner way that does not need awk:
def command = 'aws ec2 describe-instances \
--filters Name=tag:Name,Values=Test \
--query Reservations[*].Instances[*].PrivateIpAddress \
--output text'
def proc = command.execute()
proc.waitFor()
def output = proc.in.text
def exitcode= proc.exitValue()
def error = proc.err.text
if (error) {
println "Std Err: ${error}"
println "Process exit code: ${exitcode}"
return exitcode
}
//println output.split()
return output.split()
This script works with Jenkins Active Choices Parameter, and returns the list of IP addresses:
def aws_cmd = 'aws ec2 describe-instances \
--filters Name=instance-state-name,Values=running \
Name=tag:env,Values=dev \
--query Reservations[].Instances[].PrivateIpAddress[] \
--region us-east-2 \
--output text'
def aws_cmd_output = aws_cmd.execute()
// probably is required if execution takes long
//aws_cmd_output.waitFor()
def ip_list = aws_cmd_output.text.tokenize()
return ip_list
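On the side question of tidying the awk chain from the question: if the bracketed list is still wanted on the shell side, the five awk stages could be collapsed into one. This is only a sketch; the name to_list_literal is hypothetical, and it assumes the IPs arrive as whitespace-separated tokens (as with --output text).

```shell
#!/usr/bin/env bash
# to_list_literal is a hypothetical helper.
# stdin:  whitespace-separated tokens, possibly spread over several lines
# stdout: one line in the form ["a", "b", "c"]
to_list_literal() {
  awk '
    { for (i = 1; i <= NF; i++) out = out (n++ ? ", " : "") "\"" $i "\"" }
    END { print "[" out "]" }'
}
```

This does not change the Jenkins side: the Groovy scripts above return a list directly, which is what the parameter expects.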

aws cli returns an extra 'None' when fetching the first element using --query parameter and with --output text

I am getting an extra None in aws-cli (version 1.11.160) with --query parameter and --output text when fetching the first element of the query output.
See the examples below.
$ aws kms list-aliases --query "Aliases[?contains(AliasName,'alias/foo')].TargetKeyId|[0]" --output text
a3a1f9d8-a4de-4d0e-803e-137d633df24a
None
$ aws kms list-aliases --query "Aliases[?contains(AliasName,'alias/foo-bar')].TargetKeyId|[0]" --output text
None
None
As far as I know this was working until yesterday, but as of today this extra None appears and is breaking our Ansible tasks.
Has anyone experienced anything similar?
Thanks
I started having this issue in the past few days too. In my case I was querying exports from a cfn stack.
My solution was (since I'll only ever get one result from the query) to change | [0].Value to .Value, which works with --output text.
Some examples:
$ aws cloudformation list-exports --query 'Exports[?Name==`kms-key-arn`] | []'
[
{
"ExportingStackId": "arn:aws:cloudformation:ap-southeast-2:111122223333:stack/stack-name/83ea7f30-ba0b-11e8-8b7d-50fae957fc4a",
"Name": "kms-key-arn",
"Value": "arn:aws:kms:ap-southeast-2:111122223333:key/a13a4bad-672e-45a3-99c2-c646a9470ffa"
}
]
$ aws cloudformation list-exports --query 'Exports[?Name==`kms-key-arn`] | [].Value'
[
"arn:aws:kms:ap-southeast-2:111122223333:key/a13a4bad-672e-45a3-99c2-c646a9470ffa"
]
$ aws cloudformation list-exports --query 'Exports[?Name==`kms-key-arn`] | [].Value' --output text
arn:aws:kms:ap-southeast-2:111122223333:key/a13a4bad-672e-45a3-99c2-c646a9470ffa
$ aws cloudformation list-exports --query 'Exports[?Name==`kms-key-arn`] | [0].Value' --output text
arn:aws:kms:ap-southeast-2:111122223333:key/a13a4bad-672e-45a3-99c2-c646a9470ffa
None
I'm no closer to finding out why it's happening, but it disproves @LHWizard's theory, or at least indicates there are conditions where that explanation isn't sufficient.
The best explanation is that not every match for your query statement has a TargetKeyId. On my account, there are several Aliases that only have AliasArn and AliasName key/value pairs. The None comes from a null value for TargetKeyId, in other words.
I came across the same issue when listing step functions. I consider it to be a bug. I don't like solutions that ignore the first or last element, expecting it will always be None at that position - at some stage the issue will get fixed and your workaround has introduced a nasty bug.
So, in my case, I did this as a safe workaround (adapt to your needs):
#!/usr/bin/env bash
state_machine_arn="<step function arn goes here>"
arns=()
for arn in $(aws stepfunctions list-executions --state-machine-arn "$state_machine_arn" --max-items 50 --query 'executions[].executionArn' --output text); do
  [[ $arn == 'None' ]] || arns+=("$arn")
done
# process execution arns
for arn in "${arns[@]}"; do
  echo "$arn" # or whatever
done
Supposing you need only the first value:
Replace --output text with --output json and parse the result with jq. You'll have something like:
aws kms list-aliases --query "Aliases[?contains(AliasName,'alias/foo')].TargetKeyId|[0]" --output json | jq -r '.'
P.S. The -r option makes jq print the raw string, without the quotes around the response.
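If jq is not available, a position-independent filter over the text output is another option. The sketch below prints the first token that is not None and returns a non-zero status when there is none, which a calling task can check. The name first_value is an assumption for illustration.

```shell
#!/usr/bin/env bash
# first_value is a hypothetical helper.
# stdin:  output of an aws command run with --output text
# stdout: the first token that is not the literal "None"
# status: 0 if such a token exists, 1 otherwise
first_value() {
  awk '
    { for (i = 1; i <= NF; i++) if ($i != "None") { print $i; found = 1; exit } }
    END { exit !found }'
}
```

Because it does not rely on where the None appears, it should keep working whether or not the extra line is present.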

AWS CLI: ECR list-images, get newest

Using AWS CLI, and jq if needed, I'm trying to get the tag of the newest image in a particular repo.
Let's call the repo foo, and say the latest image is tagged bar. What query do I use to return bar?
I got as far as
aws ecr list-images --repository-name foo
and then realized that the list-images documentation gives no reference to the date as a queryable field. Sticking the above in a terminal gives me keypairs with just the tag and digest, no date.
Is there still some way to get the "latest" image? Can I assume it'll always be the first, or the last in the returned output?
You can use describe-images instead.
aws ecr describe-images --repository-name foo
returns imagePushedAt which is a timestamp property which you can use to filter.
I don't have examples in my account to test with, but something like the following should work:
aws ecr describe-images --repository-name foo \
--query 'sort_by(imageDetails,& imagePushedAt)[*]'
If you want another flavor of using sort method, you can review this post
To add to Frederic's answer, if you want the latest, you can use [-1]:
aws ecr describe-images --repository-name foo \
--query 'sort_by(imageDetails,& imagePushedAt)[-1].imageTags[0]'
Assuming you are using a singular tag on your images... otherwise you might need to use imageTags[*] and do a little more work to grab the tag you want.
To get only the latest image tag without the surrounding quotes, add --output text to the answer above:
aws ecr describe-images --repository-name foo --query 'sort_by(imageDetails,& imagePushedAt)[-1].imageTags[0]' --output text
List latest 3 images pushed to ECR
aws ecr describe-images --repository-name gvh \
--query 'sort_by(imageDetails,& imagePushedAt)[*].imageTags[0]' --output yaml \
| tail -n 3 | awk -F'- ' '{print $2}'
List first 3 images pushed to ECR
aws ecr describe-images --repository-name gvh \
--query 'sort_by(imageDetails,& imagePushedAt)[*].imageTags[0]' --output yaml \
| head -n 3 | awk -F'- ' '{print $2}'
The number 3 can be changed in either the head or the tail command as required.
Without having to sort the results, you can filter them specifying the imageTag=latest on image-ids, like so:
aws ecr describe-images --repository-name foo --image-ids imageTag=latest --output text
This will return only one result with the newest image, which is the one tagged as latest
Some of the provided solutions will fail because:
There is no image tagged with 'latest'.
There are multiple tags available, eg. [1.0.0, 1.0.9, 1.0.11]. With a sort_by this will return 1.0.9. Which is not the latest.
Because of this it's better to check for the image digest.
You can do so with this simple bash script:
#!/bin/bash -
#===============================================================================
#
# FILE: get-latest-image-per-ecr-repo.sh
#
# USAGE: ./get-latest-image-per-ecr-repo.sh aws-account-id
#
# AUTHOR: Enri Peters (EP)
# CREATED: 04/07/2022 12:59:15
#=======================================================================
set -o nounset # Treat unset variables as an error
for repo in \
$(aws ecr describe-repositories |\
jq -r '.repositories[].repositoryArn' |\
sort -u |\
awk -F ":" '{print $6}' |\
sed 's/repository\///')
do
echo "$1.dkr.ecr.eu-west-1.amazonaws.com/${repo}@$(aws ecr describe-images \
--repository-name ${repo} \
--query 'sort_by(imageDetails,& imagePushedAt)[-1].imageDigest' |\
tr -d '"')"
done > latest-image-per-ecr-repo-${1}.list
The output will be written to a file named latest-image-per-ecr-repo-awsaccountid.list.
An example of this output could be:
123456789123.dkr.ecr.eu-west-1.amazonaws.com/your-ecr-repository-name@sha256:fb839e843b5ea1081f4bdc5e2d493bee8cf8700458ffacc67c9a1e2130a6772a
...
...
With this you can do something like below to pull all the images to your machine.
#!/bin/bash -
for image in $(cat latest-image-per-ecr-repo-353131512553.list)
do
docker pull $image
done
You will see that when you run docker images that none of the images are tagged. But you can 'fix' this by running these commands:
docker images --format "docker image tag {{.ID}} {{.Repository}}:latest" > tag-images.sh
chmod +x tag-images.sh
./tag-images.sh
Then they will all be tagged with latest on your machine.
To get the latest image tag use:-
aws ecr describe-images --repository-name foo --query 'imageDetails[*].imageTags[ * ]' --output text | sort -r | head -n 1
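Note that sort -r compares tags as plain text, which is how 1.0.9 ends up ranked above 1.0.11. If the tags are dotted version numbers, a version-aware sort avoids that. The sketch below assumes a sort that supports -V (GNU coreutils does) and uses the made-up name newest_version.

```shell
#!/usr/bin/env bash
# newest_version is a hypothetical helper.
# stdin:  whitespace-separated tags such as 1.0.0 1.0.9 1.0.11
# stdout: the highest tag according to version ordering
newest_version() {
  # Put one tag per line, then let sort -V compare numeric components.
  awk '{ for (i = 1; i <= NF; i++) print $i }' | sort -V | tail -n 1
}
```

It would replace the sort -r | head -n 1 part of the command above. It says nothing about push time, so it only helps when the tags themselves encode the ordering.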

Delete older than month AWS EC2 snapshots

Will the command below work to delete AWS EC2 snapshots that are older than a month?
aws describe-snapshots | grep -v (date +%Y-%m-) | grep snap- | awk '{print $2}' | xargs -n 1 -t aws delete-snapshot
Your command won't work mostly because of a typo: aws describe-snapshots should be aws ec2 describe-snapshots.
Anyway, you can do this without any other tools than aws:
snapshots_to_delete=$(aws ec2 describe-snapshots --owner-ids xxxxxxxxxxxx --query 'Snapshots[?StartTime<=`2017-02-15`].SnapshotId' --output text)
echo "List of snapshots to delete: $snapshots_to_delete"
# actual deletion
for snap in $snapshots_to_delete; do
aws ec2 delete-snapshot --snapshot-id $snap
done
Make sure you always know what you are deleting, for example by echoing $snap.
Also, adding --dry-run to aws ec2 delete-snapshot can show you whether the request has any errors.
Edit:
There are two things to pay attention to in the first command:
--owner-ids - your account's unique ID. It can easily be found manually in the top right corner of the AWS Console: Support->Support Center->Account Number xxxxxxxxxxxx
--query - a JMESPath query which gets only the snapshots created before the specified date (e.g. 2017-02-15): Snapshots[?StartTime<=`2017-02-15`].SnapshotId
+1 to @roman-zhuzha for getting me close. I did have trouble when $snapshots_to_delete didn't parse into a long string of snapshot IDs separated by whitespace.
The script below does parse them into a long string of snapshot IDs separated by whitespace on my Ubuntu (trusty) 14.04 in bash with awscli 1.16:
#!/usr/bin/env bash
dry_run=1
echo_progress=1
d=$(date +'%Y-%m-%d' -d '1 month ago')
if [ $echo_progress -eq 1 ]
then
echo "Date of snapshots to delete (if older than): $d"
fi
snapshots_to_delete=$(aws ec2 describe-snapshots \
--owner-ids xxxxxxxxxxxxx \
--output text \
--query "Snapshots[?StartTime<'$d'].SnapshotId" \
)
if [ $echo_progress -eq 1 ]
then
echo "List of snapshots to delete: $snapshots_to_delete"
fi
for oldsnap in $snapshots_to_delete; do
# some $oldsnaps will be in use, so you can't delete them
# for "snap-a1234xyz" currently in use by "ami-zyx4321ab"
# (and others it can't delete) add conditionals like this
if [ "$oldsnap" = "snap-a1234xyz" ] ||
[ "$oldsnap" = "snap-c1234abc" ]
then
if [ $echo_progress -eq 1 ]
then
echo "skipping $oldsnap known to be in use by an ami"
fi
continue
fi
if [ $echo_progress -eq 1 ]
then
echo "deleting $oldsnap"
fi
if [ $dry_run -eq 1 ]
then
# dryrun will not actually delete the snapshots
aws ec2 delete-snapshot --snapshot-id $oldsnap --dry-run
else
aws ec2 delete-snapshot --snapshot-id $oldsnap
fi
done
Switch these variables as necessary:
dry_run=1 # set this to 0 to actually delete
echo_progress=1 # set this to 0 to not echo statements
Change the date -d string to a human readable version of the number of days, months, or years back you want to delete "older than":
d=$(date +'%Y-%m-%d' -d '15 days ago') # half a month
Find your account id and update these XXXX's to that number:
--owner-ids xxxxxxxxxxxxx \
If running this in a cron, you only want to see errors and warnings. A frequent warning will be that there are snapshots in use. The two example snapshot id's (snap-a1234xyz, snap-c1234abc) are ignored since they would otherwise print something like:
An error occurred (InvalidSnapshot.InUse) when calling the DeleteSnapshot operation: The snapshot snap-a1234xyz is currently in use by ami-zyx4321ab
See the comments near the "snap-a1234xyz" example snapshot id for how to handle this output.
And don't forget to check on the handy examples and references in the 1.16 aws cli describe-snapshots manual.
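Instead of hard-coding the in-use snapshot IDs, the error text itself can be inspected. The sketch below classifies the stderr of one delete-snapshot call by the error codes quoted above (InvalidSnapshot.InUse) and the DryRunOperation code that the CLI reports when a --dry-run request would have succeeded. The function name and the returned labels are assumptions.

```shell
#!/usr/bin/env bash
# classify_delete_output is a hypothetical helper.
# $1:     stderr captured from one "aws ec2 delete-snapshot" call
# stdout: one of: ok, dry-run-ok, in-use, error
classify_delete_output() {
  case "$1" in
    '')                      echo ok ;;          # nothing on stderr
    *DryRunOperation*)       echo dry-run-ok ;;  # --dry-run would succeed
    *InvalidSnapshot.InUse*) echo in-use ;;      # still backing an AMI
    *)                       echo error ;;       # anything else
  esac
}
```

In the loop, the stderr could be captured with err=$(aws ec2 delete-snapshot --snapshot-id "$oldsnap" 2>&1 >/dev/null), and only the error case printed when running from cron.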
you can use 'self' in '--owner-ids' and delete the snapshots created before a specific date (e.g. 2018-01-01) with this one-liner command:
for i in $(aws ec2 describe-snapshots --owner-ids self --query 'Snapshots[?StartTime<=`2018-01-01`].SnapshotId' --output text); do echo Deleting $i; aws ec2 delete-snapshot --snapshot-id $i; sleep 1; done;
Date condition must be within Parenthesis ()
aws ec2 describe-snapshots \
--owner-ids 012345678910 \
--query "Snapshots[?(StartTime<='2020-03-31')].[SnapshotId]"