I am trying to run a Lambda function that kicks off on a schedule and copies all snapshots taken the day prior to another region for DR purposes. I have a bit of code, but it doesn't seem to work as intended.
Symptoms:
It's grabbing the same snapshots multiple times and copying them
It always errors out on 2 particular snapshots. I don't know enough coding to write a log to figure out why, and these snapshots copy fine when I do it manually.
import boto3
from datetime import date, timedelta

SOURCE_REGION = 'us-east-1'
DEST_REGION = 'us-west-2'

ec2_source = boto3.client('ec2', region_name=SOURCE_REGION)
ec2_destination = boto3.client('ec2', region_name=DEST_REGION)

snaps = ec2_source.describe_snapshots(OwnerIds=['self'])['Snapshots']
yesterday = date.today() - timedelta(days=1)
yesterday_snaps = [s for s in snaps if s['StartTime'].date() == yesterday]

for yester_snap in yesterday_snaps:
    DestinationSnapshot = ec2_destination.copy_snapshot(
        SourceSnapshotId=yester_snap['SnapshotId'],
        SourceRegion=SOURCE_REGION,
        Encrypted=True,
        KmsKeyId='REMOVED FOR SECURITY',
        DryRun=False
    )
    DestinationSnapshotID = DestinationSnapshot['SnapshotId']
    ec2_destination.create_tags(
        Resources=[DestinationSnapshotID],
        Tags=yester_snap['Tags']
    )
    waiter = ec2_destination.get_waiter('snapshot_completed')
    waiter.wait(
        SnapshotIds=[DestinationSnapshotID],
        DryRun=False,
        WaiterConfig={'Delay': 10, 'MaxAttempts': 123}
    )
Debugging
You can debug by simply putting print() statements in your code.
For example:
for yester_snap in yesterday_snaps:
    print('Copying:', yester_snap['SnapshotId'])
    DestinationSnapshot = ec2_destination.copy_snapshot(...)
The logs will appear in CloudWatch Logs. You can access them via the Monitoring tab of the Lambda function. Make sure the function's execution role includes the AWSLambdaBasicExecutionRole policy so that it can write to CloudWatch Logs.
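If you want to see exactly why those 2 snapshots fail, a minimal sketch (keeping the rest of your loop as-is) is to wrap the copy in a try/except and print the error before moving on. One common cause is a snapshot with no tags at all: describe_snapshots omits the 'Tags' key in that case, so yester_snap['Tags'] raises a KeyError.
from botocore.exceptions import ClientError

for yester_snap in yesterday_snaps:
    snapshot_id = yester_snap['SnapshotId']
    try:
        print('Copying:', snapshot_id)
        DestinationSnapshot = ec2_destination.copy_snapshot(
            SourceSnapshotId=snapshot_id,
            SourceRegion=SOURCE_REGION,
            Encrypted=True,
            KmsKeyId='REMOVED FOR SECURITY'
        )
        # Untagged snapshots have no 'Tags' key, so fall back to an empty list
        tags = yester_snap.get('Tags', [])
        if tags:
            ec2_destination.create_tags(
                Resources=[DestinationSnapshot['SnapshotId']],
                Tags=tags
            )
    except ClientError as err:
        # The full error message will show up in CloudWatch Logs
        print('Failed to copy', snapshot_id, ':', err)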
Today/Yesterday
Be careful about your definition of yesterday. AWS Lambda functions run in the UTC timezone, and snapshot StartTime values are also in UTC, so your local concept of today and yesterday might not match what is happening.
It might be better to add a tag to snapshots after they are copied (eg 'copied') rather than relying on dates to figure out which ones to copy.
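A minimal sketch of that tag-based approach, assuming an illustrative tag key of 'Copied':
snaps = ec2_source.describe_snapshots(OwnerIds=['self'])['Snapshots']

# Only copy snapshots that have not already been marked as copied
to_copy = [
    s for s in snaps
    if 'Copied' not in [t['Key'] for t in s.get('Tags', [])]
]

for snap in to_copy:
    # ... copy the snapshot to the destination region as before ...

    # Then mark the source snapshot so it is skipped on the next run
    ec2_source.create_tags(
        Resources=[snap['SnapshotId']],
        Tags=[{'Key': 'Copied', 'Value': 'true'}]
    )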
CloudWatch Events rule
Rather than running this program once per day, an alternative method would be:
Create an Amazon CloudWatch Events rule that triggers on Snapshot creation:
{
  "source": [
    "aws.ec2"
  ],
  "detail-type": [
    "EBS Snapshot Notification"
  ],
  "detail": {
    "event": [
      "createSnapshot"
    ]
  }
}
Configure the rule to trigger an AWS Lambda function
In the Lambda function, copy the snapshot that was just created (a sketch is shown below)
This way, the copies are started immediately and there is no need to search for snapshots or figure out which ones to copy
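A minimal sketch of such a Lambda function, assuming the createSnapshot event detail carries the source snapshot ARN in the snapshot_id field (the snapshot ID is the last path segment of that ARN):
import boto3

SOURCE_REGION = 'us-east-1'
DEST_REGION = 'us-west-2'

ec2_destination = boto3.client('ec2', region_name=DEST_REGION)

def lambda_handler(event, context):
    # e.g. 'arn:aws:ec2::us-east-1:snapshot/snap-0123456789abcdef0'
    snapshot_arn = event['detail']['snapshot_id']
    snapshot_id = snapshot_arn.split('/')[-1]

    copy = ec2_destination.copy_snapshot(
        SourceSnapshotId=snapshot_id,
        SourceRegion=SOURCE_REGION
    )
    print('Started copy:', copy['SnapshotId'])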
Related
How can I get an email alert when there are 1 or more EBS volumes in AWS with a state of 'Available'?
In AWS we have a team of people who manage EC2 instances. Sometimes instances are deleted and redundant volumes are left, showing as State = Available (seen here https://eu-west-1.console.aws.amazon.com/ec2/v2/home?region=eu-west-1#Volumes:sort=state).
I would like to be notified by e-mail when this happens, so I can manually review and delete them as required. A scheduled check and alert (e-mail) of once per day would be ok.
I think this should be possible via Amazon CloudWatch, but I can't see how to do it...
This is what I am using in an AWS Lambda process:
import boto3

ec2 = boto3.resource('ec2')
sns = boto3.client('sns')

def chk_vols(event, context):
    vol_array = ec2.volumes.all()
    vol_avail = []

    for v in vol_array:
        if v.state == 'available':
            vol_avail.append(v.id)

    if vol_avail:
        sns.publish(
            TopicArn='arn:aws:sns:<your region>:<your account>:<your topic>',
            Message=str(vol_avail),
            Subject='AWS Volumes Available'
        )
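To run this check once per day, a scheduled CloudWatch Events rule can invoke the function. A minimal sketch using boto3, with placeholder names for the rule and the function ARN:
import boto3

events = boto3.client('events')
lambda_client = boto3.client('lambda')

function_arn = 'arn:aws:lambda:<your region>:<your account>:function:chk_vols'

# Fire once per day
rule = events.put_rule(
    Name='daily-available-volume-check',
    ScheduleExpression='rate(1 day)'
)

# Point the rule at the Lambda function
events.put_targets(
    Rule='daily-available-volume-check',
    Targets=[{'Id': 'chk-vols-target', 'Arn': function_arn}]
)

# Allow CloudWatch Events to invoke the function
lambda_client.add_permission(
    FunctionName='chk_vols',
    StatementId='daily-available-volume-check',
    Action='lambda:InvokeFunction',
    Principal='events.amazonaws.com',
    SourceArn=rule['RuleArn']
)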
Title pretty much says it all - since the EFS metered size (usage) is not a metric that I can use in CloudWatch, I need to create a custom metric watching the last metered file size in EFS.
Is there any possibility to do so? Or is there maybe an even better way to monitor the size of my EFS?
I would recommend using a Lambda, running every hour or so and sending the data into CloudWatch.
This code gathers all the EFS file systems and sends their size (in kilobytes) to CloudWatch along with the file system name. Modify it to suit your needs:
import json
import boto3

region = "us-east-1"

def push_efs_size_metric(region):
    efs_name = []
    efs = boto3.client('efs', region_name=region)
    cw = boto3.client('cloudwatch', region_name=region)

    efs_file_systems = efs.describe_file_systems()['FileSystems']

    for fs in efs_file_systems:
        efs_name.append(fs['Name'])
        cw.put_metric_data(
            Namespace="EFS Metrics",
            MetricData=[
                {
                    'MetricName': 'EFS Size',
                    'Dimensions': [
                        {
                            'Name': 'EFS_Name',
                            'Value': fs['Name']
                        }
                    ],
                    'Value': fs['SizeInBytes']['Value'] / 1024,
                    'Unit': 'Kilobytes'
                }
            ]
        )

    return efs_name

def cloudtrail_handler(event, context):
    response = push_efs_size_metric(region)
    print({
        'EFS Names': response
    })
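To confirm the data is arriving, you can read the custom metric back. A minimal sketch, assuming the same namespace, metric name, and dimension as above and a placeholder file system name:
import boto3
from datetime import datetime, timedelta

cw = boto3.client('cloudwatch', region_name='us-east-1')

stats = cw.get_metric_statistics(
    Namespace='EFS Metrics',
    MetricName='EFS Size',
    Dimensions=[{'Name': 'EFS_Name', 'Value': 'my-file-system'}],  # placeholder
    StartTime=datetime.utcnow() - timedelta(hours=6),
    EndTime=datetime.utcnow(),
    Period=3600,
    Statistics=['Average']
)

for point in stats['Datapoints']:
    print(point['Timestamp'], point['Average'], point['Unit'])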
I'd also suggest reading up on the reference below for more details on creating custom metrics.
References
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/publishingMetrics.html
We are building an automated DR cold site in another region. Currently we are working on retrieving a list of RDS automated snapshots created today and passing them to another function that copies them to another AWS region.
The issue is with the RDS boto3 client, which returns the creation date in its own format, making filtering on creation date more difficult.
today = (datetime.today()).date()

rds_client = boto3.client('rds')

snapshots = rds_client.describe_db_snapshots(SnapshotType='automated')

harini = "datetime(" + today.strftime('%Y,%m,%d') + ")"

print harini
print snapshots

for i in snapshots['DBSnapshots']:
    if i['SnapshotCreateTime'].date() == harini:
        print(i['DBSnapshotIdentifier'])
        print(today)
Despite having already converted the date "harini" to the format 'SnapshotCreateTime': datetime(2015, 1, 1), the Lambda function is still unable to list the snapshots.
The better method is to copy the snapshots as they are created, by invoking a Lambda function from a CloudWatch event.
See the step-by-step instructions:
https://geektopia.tech/post.php?blogpost=Automating_The_Cross_Region_Copy_Of_RDS_Snapshots
Alternatively, you can issue a copy for each snapshot regardless of its date. The client will raise an exception for snapshots that have already been copied, and you can trap it like this:
# Written By GeekTopia
#
# Copy All Snapshots for an RDS Instance To a new region
# --Free to use under all conditions
# --Script is provided as is. No Warranty, Express or Implied

import json
import boto3
from botocore.exceptions import ClientError
import time

destinationRegion = "us-east-1"
sourceRegion = 'us-west-2'
rdsInstanceName = 'needbackups'

def lambda_handler(event, context):

    # We need two clients
    #   rdsDestinationClient -- Used to start the copy processes. All cross-region
    #                           copies must be started from the destination and reference the source
    #   rdsSourceClient      -- Used to list the snapshots that need to be copied
    rdsDestinationClient = boto3.client('rds', region_name=destinationRegion)
    rdsSourceClient = boto3.client('rds', region_name=sourceRegion)

    # List all automated snapshots for a single instance
    snapshots = rdsSourceClient.describe_db_snapshots(DBInstanceIdentifier=rdsInstanceName, SnapshotType='automated')

    for snapshot in snapshots['DBSnapshots']:

        # Check that the snapshot is NOT in the process of being created
        if snapshot['Status'] == 'available':

            # Get the source snapshot ARN - always use the ARN when copying snapshots across regions
            sourceSnapshotARN = snapshot['DBSnapshotArn']

            # Build a new snapshot name
            sourceSnapshotIdentifer = snapshot['DBSnapshotIdentifier']
            targetSnapshotIdentifer = "{0}-ManualCopy".format(sourceSnapshotIdentifer)
            targetSnapshotIdentifer = targetSnapshotIdentifer.replace(":", "-")

            # Add a delay to stay under the API rate limit when there is a large number of snapshots.
            # This should never occur in this use-case, but may if the script is modified to copy more than one instance.
            time.sleep(.2)

            # Execute the copy
            try:
                copy = rdsDestinationClient.copy_db_snapshot(SourceDBSnapshotIdentifier=sourceSnapshotARN, TargetDBSnapshotIdentifier=targetSnapshotIdentifer, SourceRegion=sourceRegion)
                print("Started Copy of Snapshot {0} in {2} to {1} in {3} ".format(sourceSnapshotIdentifer, targetSnapshotIdentifer, sourceRegion, destinationRegion))
            except ClientError as ex:
                if ex.response['Error']['Code'] == 'DBSnapshotAlreadyExists':
                    print("Snapshot {0} already exists".format(targetSnapshotIdentifer))
                else:
                    print("ERROR: {0}".format(ex.response['Error']['Code']))

    return {
        'statusCode': 200,
        'body': json.dumps('Operation Complete')
    }
The code below will list the automated snapshots created today.
import boto3
from datetime import date, datetime

region_src = 'us-east-1'
client_src = boto3.client('rds', region_name=region_src)
date_today = datetime.today().strftime('%Y-%m-%d')

def get_db_snapshots_src():
    response = client_src.describe_db_snapshots(
        SnapshotType='automated',
        IncludeShared=False,
        IncludePublic=False
    )

    snapshotsInDay = []

    for i in response["DBSnapshots"]:
        if i["SnapshotCreateTime"].strftime('%Y-%m-%d') == date.isoformat(date.today()):
            snapshotsInDay.append(i)

    return snapshotsInDay
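To finish the cross-region copy, the snapshots returned above can be passed to copy_db_snapshot on a client in the destination region. A minimal sketch, assuming us-west-2 as the destination (encrypted snapshots would additionally need a KmsKeyId valid in that region):
import boto3

region_dst = 'us-west-2'  # assumed destination region
client_dst = boto3.client('rds', region_name=region_dst)

for snap in get_db_snapshots_src():
    target_id = snap['DBSnapshotIdentifier'].replace(':', '-') + '-dr-copy'
    client_dst.copy_db_snapshot(
        SourceDBSnapshotIdentifier=snap['DBSnapshotArn'],  # use the ARN for cross-region copies
        TargetDBSnapshotIdentifier=target_id,
        SourceRegion=region_src
    )
    print('Started copy:', target_id)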
I need to catch the RunJobFlow event in my CloudWatch Event rule in order to tag starting AWS EMR clusters.
I'm looking for this event because I need the username and account information.
Any idea?
Thanks
Calls to the ListClusters, DescribeCluster, and RunJobFlow actions generate entries in CloudTrail log files.
Every log entry contains information about who generated the request. For example, if a request is made to create and run a new job flow (RunJobFlow), CloudTrail logs the user identity of the person or service that made the request.
https://docs.aws.amazon.com/emr/latest/ManagementGuide/logging_emr_api_calls.html#understanding_emr_log_file_entries
Here is a sample snippet to get the username using Python Boto3.
import boto3

cloudtrail = boto3.client("cloudtrail")

response = cloudtrail.lookup_events(
    LookupAttributes=[
        {
            'AttributeKey': 'EventName',
            'AttributeValue': 'RunJobFlow'
        }
    ],
)

for event in response.get("Events"):
    print(event.get("Username"))
The username and cluster details can be retrieved from the RunJobFlow event itself. An easier solution is to use a CloudWatch event rule with a Lambda function as the target to fetch this information, after which further action can be taken as required. Example below:
Event pattern to be used with the CloudWatch event rule:
{
  "source": ["aws.elasticmapreduce"],
  "detail": {
    "eventName": ["RunJobFlow"]
  }
}
Lambda code snippet
def lambda_handler(event, context):
    #print("Received event: " + json.dumps(event, indent=2))
    user = event['detail']['userIdentity']['userName']
    cluster_id = event['detail']['responseElements']['jobFlowId']
    region = event['region']
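To complete the original goal of tagging the cluster, the extracted values can be passed to the EMR add_tags call. A minimal sketch, assuming an illustrative tag key of 'owner':
import boto3

def lambda_handler(event, context):
    user = event['detail']['userIdentity']['userName']
    cluster_id = event['detail']['responseElements']['jobFlowId']
    region = event['region']

    # Tag the newly started cluster with the user who launched it
    emr = boto3.client('emr', region_name=region)
    emr.add_tags(
        ResourceId=cluster_id,
        Tags=[{'Key': 'owner', 'Value': user}]
    )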
How do I use boto3 to simulate the Add Event Source action on the AWS GUI Console in the Event Sources tab?
I want to programmatically create a trigger such that if an object is created in MyBucket, it will call MyLambda function (qualified with an alias).
The relevant api call that I see in the Boto3 documentation is create_event_source_mapping but it states explicitly that it is only for AWS Pull Model while I think that S3 belongs to the Push Model. Anyways, I tried using it but it didn't work.
Scenarios:
Passing a prefix filter would be nice too.
I was looking at the wrong side. This is configured on the S3 side:
import boto3

s3 = boto3.resource('s3')
bucket_name = 'mybucket'
bucket_notification = s3.BucketNotification(bucket_name)

response = bucket_notification.put(
    NotificationConfiguration={
        'LambdaFunctionConfigurations': [
            {
                'LambdaFunctionArn': 'arn:aws:lambda:us-east-1:033333333:function:mylambda:staging',
                'Events': [
                    's3:ObjectCreated:*'
                ],
            },
        ]
    }
)
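For the prefix filter mentioned in the question, the same configuration accepts a Filter block. A minimal sketch, assuming an illustrative prefix of 'uploads/':
response = bucket_notification.put(
    NotificationConfiguration={
        'LambdaFunctionConfigurations': [
            {
                'LambdaFunctionArn': 'arn:aws:lambda:us-east-1:033333333:function:mylambda:staging',
                'Events': ['s3:ObjectCreated:*'],
                # Only trigger for objects whose key starts with this prefix
                'Filter': {
                    'Key': {
                        'FilterRules': [
                            {'Name': 'prefix', 'Value': 'uploads/'}
                        ]
                    }
                }
            }
        ]
    }
)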