Delete AWS EC2 snapshots older than a month - amazon-web-services

Will the command below work to delete AWS EC2 snapshots older than a month?
aws describe-snapshots | grep -v (date +%Y-%m-) | grep snap- | awk '{print $2}' | xargs -n 1 -t aws delete-snapshot

Your command won't work, mostly because of a typo: aws describe-snapshots should be aws ec2 describe-snapshots.
Anyway, you can do this without any tools other than aws:
snapshots_to_delete=$(aws ec2 describe-snapshots --owner-ids xxxxxxxxxxxx --query 'Snapshots[?StartTime<=`2017-02-15`].SnapshotId' --output text)
echo "List of snapshots to delete: $snapshots_to_delete"
# actual deletion
for snap in $snapshots_to_delete; do
    aws ec2 delete-snapshot --snapshot-id $snap
done
Make sure you always know what you are deleting, e.g. by echoing $snap first.
Also, adding --dry-run to aws ec2 delete-snapshot shows you whether the request is valid without actually deleting anything.
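For example (the snapshot ID here is a placeholder), a dry run that only validates the request:
aws ec2 delete-snapshot --snapshot-id snap-0123456789abcdef0 --dry-run
If the deletion would have succeeded, the CLI reports a DryRunOperation error; otherwise it reports the actual failure.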
Edit:
There are two things to pay attention to in the first command:
--owner-ids - your account's unique ID. It can easily be found in the top right corner of the AWS Console: Support->Support Center->Account Number xxxxxxxxxxxx
--query - a JMESPath query that selects only snapshots created on or before the specified date (e.g.: 2017-02-15): Snapshots[?StartTime<=`2017-02-15`].SnapshotId

+1 to @roman-zhuzha for getting me close. I did have trouble when $snapshots_to_delete didn't parse into a long string of snapshots separated by whitespace.
The script below does parse them into a long string of snapshot IDs separated by whitespace on my Ubuntu 14.04 (trusty) in bash with awscli 1.16:
#!/usr/bin/env bash

dry_run=1
echo_progress=1

d=$(date +'%Y-%m-%d' -d '1 month ago')
if [ $echo_progress -eq 1 ]
then
    echo "Date of snapshots to delete (if older than): $d"
fi

snapshots_to_delete=$(aws ec2 describe-snapshots \
    --owner-ids xxxxxxxxxxxxx \
    --output text \
    --query "Snapshots[?StartTime<'$d'].SnapshotId" \
)
if [ $echo_progress -eq 1 ]
then
    echo "List of snapshots to delete: $snapshots_to_delete"
fi

for oldsnap in $snapshots_to_delete; do
    # some $oldsnaps will be in use, so you can't delete them
    # for "snap-a1234xyz" currently in use by "ami-zyx4321ab"
    # (and others it can't delete) add conditionals like this
    if [ "$oldsnap" = "snap-a1234xyz" ] ||
       [ "$oldsnap" = "snap-c1234abc" ]
    then
        if [ $echo_progress -eq 1 ]
        then
            echo "skipping $oldsnap known to be in use by an ami"
        fi
        continue
    fi

    if [ $echo_progress -eq 1 ]
    then
        echo "deleting $oldsnap"
    fi

    if [ $dry_run -eq 1 ]
    then
        # dry-run will not actually delete the snapshots
        aws ec2 delete-snapshot --snapshot-id $oldsnap --dry-run
    else
        aws ec2 delete-snapshot --snapshot-id $oldsnap
    fi
done
Switch these variables as necessary:
dry_run=1 # set this to 0 to actually delete
echo_progress=1 # set this to 0 to not echo statements
Change the date -d string to a human readable version of the number of days, months, or years back you want to delete "older than":
d=$(date +'%Y-%m-%d' -d '15 days ago') # half a month
Find your account id and update these XXXX's to that number:
--owner-ids xxxxxxxxxxxxx \
If running this in a cron, you only want to see errors and warnings. A frequent warning will be that there are snapshots in use. The two example snapshot IDs (snap-a1234xyz, snap-c1234abc) are ignored since they would otherwise print something like:
An error occurred (InvalidSnapshot.InUse) when calling the DeleteSnapshot operation: The snapshot snap-a1234xyz is currently in use by ami-zyx4321ab
See the comments near the "snap-a1234xyz" example snapshot ID for how to handle this output.
And don't forget to check on the handy examples and references in the 1.16 aws cli describe-snapshots manual.

You can use 'self' in '--owner-ids' and delete the snapshots created before a specific date (e.g. 2018-01-01) with this one-liner command:
for i in $(aws ec2 describe-snapshots --owner-ids self --query 'Snapshots[?StartTime<=`2018-01-01`].SnapshotId' --output text); do echo Deleting $i; aws ec2 delete-snapshot --snapshot-id $i; sleep 1; done;

The date condition must be enclosed in parentheses ():
aws ec2 describe-snapshots \
--owner-ids 012345678910 \
--query "Snapshots[?(StartTime<='2020-03-31')].[SnapshotId]"

Related

How to list all AWS RDS instances and their tags in CSV

I'm new to the AWS CLI and I am trying to build a CSV server inventory of my project's AWS RDS instances that includes their tags.
I have done so successfully with EC2 instances using this:
aws ec2 describe-instances \
    --query 'Reservations[*].Instances[*].[PrivateIpAddress, InstanceType, [Tags[?Key==`Name`].Value][0][0], [Tags[?Key==`ENV`].Value][0][0]]' \
    --output text | sed -E 's/\s+/,/g' >> ec2list.csv
The above command gives me a CSV with the IP address, instance type, as well as the values of the listed tags.
However, I am currently trying to do so unsuccessfully on RDS instances with this:
aws rds describe-db-instances \
    --query 'DBInstances[*].[DBInstanceIdentifier, DBInstanceArn, [Tags[?Key==`Component`].Value][0][0], [Tags[?Key==`Engine`].Value][0][0]]' \
    --output text | sed -E 's/\s+/,/g' >> rdslist.csv
The RDS command only returns the instance ARN and identifier; the tag values show up as None even though they definitely do have values.
What modifications need to be made to my RDS query to show the tag values, and is this even possible? Thanks
You will probably need one more command: https://docs.aws.amazon.com/AmazonRDS/latest/APIReference//API_ListTagsForResource.html.
You can wrap the 2 scripts in shell script like the below example.
#!/bin/bash
ARNS=$(aws rds describe-db-instances --query "DBInstances[].DBInstanceArn" --output text)
for line in $ARNS; do
    TAGS=$(aws rds list-tags-for-resource --resource-name "$line" --query "TagList[]")
    echo $line $TAGS
done
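If you need CSV output like in the question, here is a rough sketch along the same lines (the Component and Engine tag keys are taken from the question; adjust them to your own tags):
#!/bin/bash
for arn in $(aws rds describe-db-instances --query "DBInstances[].DBInstanceArn" --output text); do
    component=$(aws rds list-tags-for-resource --resource-name "$arn" --query "TagList[?Key=='Component'].Value | [0]" --output text)
    engine=$(aws rds list-tags-for-resource --resource-name "$arn" --query "TagList[?Key=='Engine'].Value | [0]" --output text)
    echo "$arn,$component,$engine"
done >> rdslist.csv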
I realized that tags can be displayed with my original query. RDS does not use Tags like EC2 instances do, but TagList. E.g.,
aws rds describe-db-instances \
    --query 'DBInstances[*].[DBInstanceIdentifier, DBInstanceArn, [TagList[?Key==`Component`].Value][0][0], [TagList[?Key==`Engine`].Value][0][0]]' \
    --output text | sed -E 's/\s+/,/g' >> rdslist.csv

aws cli returns an extra 'None' when fetching the first element using --query parameter and with --output text

I am getting an extra None in aws-cli (version 1.11.160) with --query parameter and --output text when fetching the first element of the query output.
See the examples below.
$ aws kms list-aliases --query "Aliases[?contains(AliasName,'alias/foo')].TargetKeyId|[0]" --output text
a3a1f9d8-a4de-4d0e-803e-137d633df24a
None
$ aws kms list-aliases --query "Aliases[?contains(AliasName,'alias/foo-bar')].TargetKeyId|[0]" --output text
None
None
As far as I know this was working until yesterday, but from today onwards this extra None comes in and kills our Ansible tasks.
Anyone experienced anything similar?
Thanks
I started having this issue in the past few days too. In my case I was querying exports from a cfn stack.
My solution was (since I'll only ever get one result from the query) to change | [0].Value to .Value, which works with --output text.
Some examples:
$ aws cloudformation list-exports --query 'Exports[?Name==`kms-key-arn`] | []'
[
{
"ExportingStackId": "arn:aws:cloudformation:ap-southeast-2:111122223333:stack/stack-name/83ea7f30-ba0b-11e8-8b7d-50fae957fc4a",
"Name": "kms-key-arn",
"Value": "arn:aws:kms:ap-southeast-2:111122223333:key/a13a4bad-672e-45a3-99c2-c646a9470ffa"
}
]
$ aws cloudformation list-exports --query 'Exports[?Name==`kms-key-arn`] | [].Value'
[
"arn:aws:kms:ap-southeast-2:111122223333:key/a13a4bad-672e-45a3-99c2-c646a9470ffa"
]
$ aws cloudformation list-exports --query 'Exports[?Name==`kms-key-arn`] | [].Value' --output text
arn:aws:kms:ap-southeast-2:111122223333:key/a13a4bad-672e-45a3-99c2-c646a9470ffa
$ aws cloudformation list-exports --query 'Exports[?Name==`kms-key-arn`] | [0].Value' --output text
arn:aws:kms:ap-southeast-2:111122223333:key/a13a4bad-672e-45a3-99c2-c646a9470ffa
None
I'm no closer to finding out why it's happening, but it disproves @LHWizard's theory, or at least indicates there are conditions where that explanation isn't sufficient.
The best explanation is that not every match for your query statement has a TargetKeyId. On my account, there are several Aliases that only have AliasArn and AliasName key/value pairs. The None comes from a null value for TargetKeyId, in other words.
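If that's the cause here too, one workaround (a sketch, untested) is to filter out entries with a null TargetKeyId before taking the first element, since a JMESPath filter only keeps elements where the expression is truthy:
aws kms list-aliases --query "Aliases[?contains(AliasName,'alias/foo') && TargetKeyId].TargetKeyId | [0]" --output text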
I came across the same issue when listing step functions. I consider it to be a bug. I don't like solutions that ignore the first or last element, expecting it will always be None at that position - at some stage the issue will get fixed and your workaround has introduced a nasty bug.
So, in my case, I did this as a safe workaround (adapt to your needs):
#!/usr/bin/env bash
arn="<step function arn goes here>"
arns=()
for arn in $(aws stepfunctions list-executions --state-machine-arn "$arn" --max-items 50 --query 'executions[].executionArn' --output text); do
    [[ $arn == 'None' ]] || arns+=("$arn")
done
# process execution arns
for arn in "${arns[@]}"; do
    echo "$arn" # or whatever
done
Supposing you need only the first value:
Replace --output text with --output json and you can parse the response with jq.
Therefore, you'll have something like:
aws kms list-aliases --query "Aliases[?contains(AliasName,'alias/foo')].TargetKeyId|[0]" --output json | jq -r '.'
P.s. the -r option of jq removes the quotes around the response.

AWS CLI: ECR list-images, get newest

Using AWS CLI, and jq if needed, I'm trying to get the tag of the newest image in a particular repo.
Let's call the repo foo, and say the latest image is tagged bar. What query do I use to return bar?
I got as far as
aws ecr list-images --repository-name foo
and then realized that the list-images documentation gives no reference to the date as a queryable field. Sticking the above in a terminal gives me key pairs with just the tag and digest, no date.
Is there still some way to get the "latest" image? Can I assume it'll always be the first, or the last in the returned output?
You can use describe-images instead.
aws ecr describe-images --repository-name foo
returns imagePushedAt, which is a timestamp property that you can use to filter.
I don't have examples in my account to test with, but something like the following should work:
aws ecr describe-images --repository-name foo \
--query 'sort_by(imageDetails,& imagePushedAt)[*]'
If you want another flavor of using sort method, you can review this post
To add to Frederic's answer, if you want the latest, you can use [-1]:
aws ecr describe-images --repository-name foo \
--query 'sort_by(imageDetails,& imagePushedAt)[-1].imageTags[0]'
Assuming you are using a singular tag on your images... otherwise you might need to use imageTags[*] and do a little more work to grab the tag you want.
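If you need a specific tag from a multi-tagged image, one possible approach (a sketch using jq; it assumes the image has at least one tag besides "latest") is to fetch the whole imageTags array and pick from it:
aws ecr describe-images --repository-name foo \
    --query 'sort_by(imageDetails,& imagePushedAt)[-1].imageTags' --output json \
    | jq -r '[.[] | select(. != "latest")][0]'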
To get only the latest image without special characters, a minor addition to the above answer is required:
aws ecr describe-images --repository-name foo --query 'sort_by(imageDetails,& imagePushedAt)[-1].imageTags[0]' --output text
List latest 3 images pushed to ECR
aws ecr describe-images --repository-name gvh \
--query 'sort_by(imageDetails,& imagePushedAt)[*].imageTags[0]' --output yaml \
| tail -n 3 | awk -F'- ' '{print $2}'
List first 3 images pushed to ECR
aws ecr describe-images --repository-name gvh \
--query 'sort_by(imageDetails,& imagePushedAt)[*].imageTags[0]' --output yaml \
| head -n 3 | awk -F'- ' '{print $2}'
The number '3' in the head or tail command can be changed based on user requirements.
Without having to sort the results, you can filter them by specifying imageTag=latest in --image-ids, like so:
aws ecr describe-images --repository-name foo --image-ids imageTag=latest --output text
This will return only one result with the newest image, which is the one tagged as latest
Some of the provided solutions will fail because:
There is no image tagged with 'latest'.
There are multiple tags available, e.g. [1.0.0, 1.0.9, 1.0.11]; with a sort_by this will return 1.0.9, which is not the latest.
Because of this it's better to check the image digest.
You can do so with this simple bash script:
#!/bin/bash -
#===============================================================================
#
# FILE: get-latest-image-per-ecr-repo.sh
#
# USAGE: ./get-latest-image-per-ecr-repo.sh aws-account-id
#
# AUTHOR: Enri Peters (EP)
# CREATED: 04/07/2022 12:59:15
#=======================================================================
set -o nounset # Treat unset variables as an error
for repo in \
    $(aws ecr describe-repositories |\
      jq -r '.repositories[].repositoryArn' |\
      sort -u |\
      awk -F ":" '{print $6}' |\
      sed 's/repository\///')
do
    echo "$1.dkr.ecr.eu-west-1.amazonaws.com/${repo}@$(aws ecr describe-images \
        --repository-name ${repo} \
        --query 'sort_by(imageDetails,& imagePushedAt)[-1].imageDigest' |\
        tr -d '"')"
done > latest-image-per-ecr-repo-${1}.list
The output will be written to a file named latest-image-per-ecr-repo-awsaccountid.list.
An example of this output could be:
123456789123.dkr.ecr.eu-west-1.amazonaws.com/your-ecr-repository-name@sha256:fb839e843b5ea1081f4bdc5e2d493bee8cf8700458ffacc67c9a1e2130a6772a
...
...
With this you can do something like below to pull all the images to your machine.
#!/bin/bash -
for image in $(cat latest-image-per-ecr-repo-353131512553.list)
do
    docker pull $image
done
You will see, when you run docker images, that none of the images are tagged. But you can 'fix' this by running these commands:
docker images --format "docker image tag {{.ID}} {{.Repository}}:latest" > tag-images.sh
chmod +x tag-images.sh
./tag-images.sh
Then they will all be tagged with latest on your machine.
To get the latest image tag use:
aws ecr describe-images --repository-name foo --query 'imageDetails[*].imageTags[*]' --output text | sort -r | head -n 1
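Note that plain sort -r compares tags lexically, so e.g. 1.0.9 sorts after 1.0.11. If your tags are version numbers and GNU coreutils is available, sort -rV (version sort) should order them correctly, for example:
aws ecr describe-images --repository-name foo --query 'imageDetails[*].imageTags[*]' --output text | tr '\t' '\n' | sort -rV | head -n 1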

Delete the oldest AWS EC2 snapshots

I'm trying to remove all my AWS EC2 snapshots except the last 6 with this script:
#!/bin/bash
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games
# Backup script
Volume="{VOL-DATA}"
Owner="{OWNER}"
Description="{DESCRIPTION}"
Local_numbackups=6
Local_region="us-west-1"
# Remove old snapshots associated to a description, keep the last $Local_numbackups
aws ec2 describe-snapshots --filters Name=description,Values=$Description | grep "SnapshotId" | head -n -$Local_numbackups | awk '{print $2}' | sed -e 's/,//g' | xargs -n 1 -t aws ec2 delete-snapshot --snapshot-id
However it doesn't work: it deletes snapshots, but not the oldest ones. Why?
You're trying to do something too complex to be handled (gracefully) in one line, so we'll need to break it down a bit. First, let's get the snapshots sorted by age, oldest to newest:
aws ec2 describe-snapshots --filters Name=description,Values=$Description --query 'Snapshots[*].[StartTime,SnapshotId]' --output text | sort -n
Then we can drop the StartTime field to get the snapshot ID alone:
aws ec2 describe-snapshots --filters Name=description,Values=$Description --query 'Snapshots[*].[StartTime,SnapshotId]' --output text | sort -n | sed -e 's/^.*\t//'
head (or tail) isn't really suitable for discarding the fixed number of snapshots we want to keep; we need to filter those out another way. So, putting it all together:
# Get array of snapshot IDs sorted by age (oldest to newest)
snapshots=($(aws ec2 describe-snapshots --filters Name=description,Values=$Description --query 'Snapshots[*].[StartTime,SnapshotId]' --output text | sort -n | sed -e 's/^.*\t//'))
# Get number of snapshots
count=${#snapshots[@]}
if [ "$count" -lt "$Local_numbackups" ]; then
    echo "We already have less than $Local_numbackups snapshots"
    exit 0
else
    # Drop the last (newest) $Local_numbackups IDs from the array
    snapshots=(${snapshots[@]:0:$((count - Local_numbackups))})
    # Loop through the remaining snapshots and delete
    for snapshot in ${snapshots[@]}; do
        aws ec2 delete-snapshot --snapshot-id $snapshot
    done
fi
(While it's obviously possible to do this in bash with the AWS CLI, it's complex enough that I'd personally rather use a more robust language and the AWS SDK.)
Here is a sample.
days2keep="30"
region="us-west-2"
name="jdoe"
# date -v is for OSX
cutoffdate=`date -j -v-${days2keep}d '+%Y-%m-%d'`
echo "Finding list of snapshots before $cutoffdate "
oldsnapids=$(aws ec2 describe-snapshots --region $region --filters Name=tag:Name,Values=$name --query Snapshots[?StartTime\<=\`$cutoffdate\`].SnapshotId --output text)
for snapid in $oldsnapids
do
    echo Deleting snapshot $snapid
    aws ec2 delete-snapshot --snapshot-id $snapid --region $region
done
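On Linux the -v flag won't work; with GNU date the equivalent cutoff calculation would be:
cutoffdate=$(date -d "-${days2keep} days" '+%Y-%m-%d')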
We can delete all old snapshots using the steps below:
List all the old snapshot IDs and put them in one file, e.g. /opt/snapshot.txt
Then use the "aws configure" command to set up command-line access to the AWS account; at this point we need to provide credentials:
Such as:
AWS Access Key ID [None]: XXXXXXXXXXXXXXXXXX
AWS Secret Access Key [None]: XXXXXXXXXXXXXXXXXXXXX
Default region name [None]: XXXXXXXXXXXXXXXX
After that we can use the shell script below; we need to give it the file with the snapshot IDs.
Code:
#!/bin/bash
list=$(cat /opt/snapshot.txt)
for i in $list
do
    aws ec2 delete-snapshot --snapshot-id $i
    if [ $? -eq 0 ]; then
        echo Going Good
    else
        echo FAIL
    fi
done
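To produce /opt/snapshot.txt in the first place, you could reuse the describe-snapshots query from the answers above (2018-01-01 is an example cutoff date):
aws ec2 describe-snapshots --owner-ids self \
    --query 'Snapshots[?StartTime<=`2018-01-01`].SnapshotId' \
    --output text | tr '\t' '\n' > /opt/snapshot.txt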
Thanks

AWS Cloudwatch Log - Is it possible to export existing log data from it?

I have managed to push my application logs to AWS Cloudwatch by using the AWS CloudWatch log agent. But the CloudWatch web console does not seem to provide a button to allow you to download/export the log data from it.
Any idea how I can achieve this goal?
The latest AWS CLI has a CloudWatch Logs CLI that allows you to download the logs as JSON, a text file, or any other output supported by the AWS CLI.
For example, to get the first 1MB (up to 10,000 log entries) from the stream a in group A into a text file, run:
aws logs get-log-events \
--log-group-name A --log-stream-name a \
--output text > a.log
The command is currently limited to a response size of maximum 1MB (up to 10,000 records per request); if you have more, you need to implement your own page-stepping mechanism using the --next-token parameter. I expect that in the future the CLI will also allow a full dump in a single command.
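In the meantime, here is a rough page-stepping sketch (using jq; get-log-events signals the end of the stream by returning the same nextForwardToken you passed in):
token=""
while true; do
    if [ -z "$token" ]; then
        resp=$(aws logs get-log-events --log-group-name A --log-stream-name a --start-from-head --output json)
    else
        resp=$(aws logs get-log-events --log-group-name A --log-stream-name a --start-from-head --next-token "$token" --output json)
    fi
    echo "$resp" | jq -r '.events[].message' >> a.log
    next=$(echo "$resp" | jq -r '.nextForwardToken')
    # the same token coming back means there are no more events
    [ "$next" = "$token" ] && break
    token="$next"
done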
Update
Here's a small Bash script to list events from all streams in a specific group, since a specified time:
#!/bin/bash
function dumpstreams() {
    aws $AWSARGS logs describe-log-streams \
        --order-by LastEventTime --log-group-name $LOGGROUP \
        --output text | while read -a st; do
            [ "${st[4]}" -lt "$starttime" ] && continue
            stname="${st[1]}"
            echo ${stname##*:}
        done | while read stream; do
            aws $AWSARGS logs get-log-events \
                --start-from-head --start-time $starttime \
                --log-group-name $LOGGROUP --log-stream-name $stream --output text
        done
}
AWSARGS="--profile myprofile --region us-east-1"
LOGGROUP="some-log-group"
TAIL=
starttime=$(date --date "-1 week" +%s)000
nexttime=$(date +%s)000
dumpstreams
if [ -n "$TAIL" ]; then
    while true; do
        starttime=$nexttime
        nexttime=$(date +%s)000
        sleep 1
        dumpstreams
    done
fi
That last part: if you set TAIL, the script will continue to fetch log events and will report newer events as they come in (with some expected delay).
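For example, to turn it into a crude follower, set TAIL to any non-empty value near the top of the script (the -n test only checks for non-emptiness):
TAIL=1   # leave empty for a one-shot dump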
There is also a python project called awslogs that allows you to get the logs: https://github.com/jorgebastida/awslogs
There are things like:
list log groups:
$ awslogs groups
list streams for given log group:
$ awslogs streams /var/log/syslog
get the log records from all streams:
$ awslogs get /var/log/syslog
get the log records from a specific stream:
$ awslogs get /var/log/syslog stream_A
and much more (filtering by time period, watching log streams, ...).
I think this tool might help you do what you want.
It seems AWS has added the ability to export an entire log group to S3.
You'll need to set up permissions on the S3 bucket to allow CloudWatch to write to the bucket by adding the following to your bucket policy, replacing the region with your region and the bucket name with your bucket name.
{
"Effect": "Allow",
"Principal": {
"Service": "logs.us-east-1.amazonaws.com"
},
"Action": "s3:GetBucketAcl",
"Resource": "arn:aws:s3:::tsf-log-data"
},
{
"Effect": "Allow",
"Principal": {
"Service": "logs.us-east-1.amazonaws.com"
},
"Action": "s3:PutObject",
"Resource": "arn:aws:s3:::tsf-log-data/*",
"Condition": {
"StringEquals": {
"s3:x-amz-acl": "bucket-owner-full-control"
}
}
}
Details can be found in Step 2 of this AWS doc
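For reference, the export can also be kicked off from the CLI with create-export-task; a sketch (the group name and prefix are placeholders, --from/--to are millisecond timestamps, and the destination is the bucket from the policy above):
aws logs create-export-task \
    --log-group-name my-log-group \
    --from 1441490400000 --to 1441494000000 \
    --destination tsf-log-data \
    --destination-prefix my-log-group-export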
The other answers were not useful with AWS Lambda logs since they create many log streams and I just wanted to dump everything in the last week. I finally found the following command to be what I needed:
aws logs tail --since 1w LOG_GROUP_NAME > output.log
Note that LOG_GROUP_NAME is the lambda function path (e.g. /aws/lambda/FUNCTION_NAME) and you can replace the since argument with a variety of times (1w = 1 week, 5m = 5 minutes, etc)
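If you want to keep streaming new events rather than take a one-off dump, aws logs tail (AWS CLI v2) also accepts --follow:
aws logs tail LOG_GROUP_NAME --since 1w --follow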
I would add this one-liner to get all logs for a stream:
aws logs get-log-events --log-group-name my-log-group --log-stream-name my-log-stream | grep '"message":' | awk -F '"' '{ print $(NF-1) }' > my-log-group_my-log-stream.txt
Or in a slightly more readable format:
aws logs get-log-events \
--log-group-name my-log-group\
--log-stream-name my-log-stream \
| grep '"message":' \
| awk -F '"' '{ print $(NF-1) }' \
> my-log-group_my-log-stream.txt
And you can make a handy script out of it that is admittedly less powerful than @Guss's but simple enough. I saved it as getLogs.sh and invoke it with ./getLogs.sh log-group log-stream
#!/bin/bash
if [[ "${#}" != 2 ]]
then
echo "This script requires two arguments!"
echo
echo "Usage :"
echo "${0} <log-group-name> <log-stream-name>"
echo
echo "Example :"
echo "${0} my-log-group my-log-stream"
exit 1
fi
OUTPUT_FILE="${1}_${2}.log"
aws logs get-log-events \
--log-group-name "${1}"\
--log-stream-name "${2}" \
| grep '"message":' \
| awk -F '"' '{ print $(NF-1) }' \
> "${OUTPUT_FILE}"
echo "Logs stored in ${OUTPUT_FILE}"
Apparently there isn't an out-of-the-box way from the AWS Console to download CloudWatch Logs. Perhaps you can write a script to perform the CloudWatch Logs fetch using the SDK / API.
The good thing about CloudWatch Logs is that you can retain the logs indefinitely (Never Expire), unlike CloudWatch metrics, which are kept for only 14 days. This means you can run the script at a monthly / quarterly frequency rather than on-demand.
More information about the CloudWatchLogs API,
http://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/Welcome.html
http://awsdocs.s3.amazonaws.com/cloudwatchlogs/latest/cwl-api.pdf
You can now perform exports via the Cloudwatch Management Console with the new Cloudwatch Logs Insights page. Full documentation here https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_ExportQueryResults.html. I had already started ingesting my Apache logs into Cloudwatch with JSON, so YMMV if you haven't set it up in advance.
Add Query to Dashboard or Export Query Results
After you run a query, you can add the query to a CloudWatch
dashboard, or copy the results to the clipboard.
Queries added to dashboards automatically re-run every time you load
the dashboard and every time that the dashboard refreshes. These
queries count toward your limit of four concurrent CloudWatch Logs
Insights queries.
To add query results to a dashboard
Open the CloudWatch console at
https://console.aws.amazon.com/cloudwatch/.
In the navigation pane, choose Insights.
Choose one or more log groups and run a query.
Choose Add to dashboard.
Select the dashboard, or choose Create new to create a new dashboard
for the query results.
Choose Add to dashboard.
To copy query results to the clipboard
Open the CloudWatch console at
https://console.aws.amazon.com/cloudwatch/.
In the navigation pane, choose Insights.
Choose one or more log groups and run a query.
Choose Actions, Copy query results.
Inspired by saputkin, I have created a python script that downloads all the logs for a log group in a given time period.
The script itself: https://github.com/slavogri/aws-logs-downloader.git
In case there are multiple log streams for that period, multiple files will be created. Downloaded files will be stored in the current directory and will be named after the log streams that have log events in the given time period. (If the group name contains forward slashes, they will be replaced by underscores. Each file will be overwritten if it already exists.)
Prerequisite: You need to be logged in to your aws profile. The script itself is going to use the AWS command line APIs on your behalf: "aws logs describe-log-streams" and "aws logs get-log-events"
Usage example: python aws-logs-downloader -g /ecs/my-cluster-test-my-app -t "2021-09-04 05:59:50 +00:00" -i 60
optional arguments:
-h, --help show this help message and exit
-v, --version show program's version number and exit
-g , --log-group (required) Log group name for which the log stream events needs to be downloaded
-t , --end-time (default: now) End date and time of the downloaded logs in format: %Y-%m-%d %H:%M:%S %z (example: 2021-09-04 05:59:50 +00:00)
-i , --interval (default: 30) Time period in minutes before the end-time. This will be used to calculate the time since which the logs will be downloaded.
-p , --profile (default: dev) The aws profile that is logged in, and on behalf of which the logs will be downloaded.
-r , --region (default: eu-central-1) The aws region from which the logs will be downloaded.
Please let me know if it was useful to you. :)
After I did it I learned that there is another option using Boto3: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/logs.html#CloudWatchLogs.Client.get_log_events
Still the command line API seems to me like a good option.
export LOGGROUPNAME=[SOME_LOG_GROUP_NAME]; for LOGSTREAM in `aws --output text logs describe-log-streams --log-group-name ${LOGGROUPNAME} |awk '{print $7}'`; do aws --output text logs get-log-events --log-group-name ${LOGGROUPNAME} --log-stream-name ${LOGSTREAM} >> ${LOGGROUPNAME}_output.txt; done
Adapted @Guss's answer to macOS. As I am not really a bash guy, I had to use python to convert dates to a human-readable form.
runawslog -1w gets the last week, and so on:
runawslog() { sh awslogs.sh $1 | grep "EVENTS" | python parselogline.py; }
awslogs.sh:
#!/bin/bash
#set -x
function dumpstreams() {
    aws $AWSARGS logs describe-log-streams \
        --order-by LastEventTime --log-group-name $LOGGROUP \
        --output text | while read -a st; do
            [ "${st[4]}" -lt "$starttime" ] && continue
            stname="${st[1]}"
            echo ${stname##*:}
        done | while read stream; do
            aws $AWSARGS logs get-log-events \
                --start-from-head --start-time $starttime \
                --log-group-name $LOGGROUP --log-stream-name $stream --output text
        done
}
AWSARGS=""
#AWSARGS="--profile myprofile --region us-east-1"
LOGGROUP="/aws/lambda/StockTrackFunc"
TAIL=
FROMDAT=$1
starttime=$(date -v ${FROMDAT} +%s)000
nexttime=$(date +%s)000
dumpstreams
if [ -n "$TAIL" ]; then
    while true; do
        starttime=$nexttime
        nexttime=$(date +%s)000
        sleep 1
        dumpstreams
    done
fi
parselogline.py:
import sys
import datetime
dat=sys.stdin.read()
for k in dat.split('\n'):
    d=k.split('\t')
    if len(d)<3:
        continue
    d[2]='\t'.join(d[2:])
    print( str(datetime.datetime.fromtimestamp(int(d[1])/1000)) + '\t' + d[2] )
I had a similar use case where I had to download all the streams for a given log group. See if this script helps.
#!/bin/bash
if [[ "${#}" != 1 ]]
then
echo "This script requires two arguments!"
echo
echo "Usage :"
echo "${0} <log-group-name>"
exit 1
fi
streams=`aws logs describe-log-streams --log-group-name "${1}"`
for stream in $(jq '.logStreams | keys | .[]' <<< "$streams"); do
    record=$(jq -r ".logStreams[$stream]" <<< "$streams")
    streamName=$(jq -r ".logStreamName" <<< "$record")
    echo "Downloading ${streamName}"
    aws logs get-log-events --log-group-name "${1}" --log-stream-name "$streamName" --output json > "${stream}.log"
    echo "Completed download: ${streamName}"
done
You have to pass the log group name as an argument.
Eg: bash <name_of_the_bash_file>.sh <group_name>
I found the AWS documentation to be complete and accurate: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/S3ExportTasks.html
It lays down the steps for exporting logs from CloudWatch to S3.