I'm creating a Lambda function to back up my EC2 instances by snapshotting them. However, reading the boto documentation, I noticed that the call to ec2.describe_instances is paginated with MaxResults/NextToken. How can I combine the two to safely iterate through the list 50 at a time? Below is my work in progress:
import boto3
import datetime
import time
ec2 = boto3.client('ec2')
def lambda_handler(event, context):
    try:
        print("Creating snapshots on " + str(datetime.datetime.today()) + ".")
        maxResults = 50
        schedulers = ec2.describe_instances(Filters=[{'Name':'tag:GL-sub-purpose', 'Values':[Schedule]}], MaxResults=maxResults)
        nextToken = schedulers['NextToken']
        totalSchedulers = len(schedulers)
        while totalSchedulers == maxResults:
            schedulers = ec2.describe_instances(Filters=[{'Name':'tag:GL-sub-purpose', 'Values':[Schedule]}], MaxResults=maxResults, NextToken=nextToken)
            nextToken = result['NextToken']
            totalSchedulers = len(schedulers)
        print("Performing backup on " + str(len(schedulers)) + " schedules.")
        successful = []
        failed = []
        for s in schedulers:
            #[...] More operations here, done 50 at a time.
I'm not really sure if I'm using the MaxResults/NextToken parameters correctly or efficiently here. Is this the best way to achieve my desired result/am I on the right track?
Just keep iterating until NextToken is no longer returned. Here is some sample code that iterates through batches of instances. Change it to suit your needs.
import boto3
ec2 = boto3.client('ec2')
insts = ec2.describe_instances(MaxResults=50)
while True:
    #
    # Process Instances (insts)
    #
    if 'NextToken' not in insts: break
    next_token = insts['NextToken']
    insts = ec2.describe_instances(MaxResults=50, NextToken=next_token)
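The same NextToken loop can be wrapped in a generator so the calling code only sees individual instances. This is a minimal sketch, not part of boto3: iter_instances is a hypothetical helper name, and it assumes the callable passed in behaves like ec2.describe_instances (keyword arguments in, a dict with Reservations and an optional NextToken out). Any extra keyword arguments, such as Filters, are repeated on every page request.

```python
def iter_instances(describe, page_size=50, **kwargs):
    """Yield every instance dict from a paginated describe_instances-style call.

    `describe` is assumed to accept MaxResults/NextToken keyword arguments and
    to return a dict shaped like the EC2 DescribeInstances response.
    """
    token = None
    while True:
        params = dict(kwargs, MaxResults=page_size)
        if token:
            # Only send NextToken once a previous page has returned one.
            params["NextToken"] = token
        page = describe(**params)
        for reservation in page.get("Reservations", []):
            for instance in reservation.get("Instances", []):
                yield instance
        token = page.get("NextToken")
        if not token:
            # No token in the response means this was the last page.
            break
```

With a real client this would be called as iter_instances(ec2.describe_instances, Filters=[...]). boto3 also ships a built-in paginator, ec2.get_paginator('describe_instances'), which handles the token for you.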
I am trying to create a Lambda script using Python 3.9 that returns all the EC2 servers in an AWS account, along with their status and details. A snippet of my code:
def lambda_handler(event, context):
    client = boto3.client("ec2")
    #s3 = boto3.client("s3")
    # fetch information about all the instances
    status = client.describe_instances()
    for i in status["Reservations"]:
        instance_details = i["Instances"][0]
        if instance_details["State"]["Name"].lower() in ["shutting-down","stopped","stopping","terminated",]:
            print("AvailabilityZone: ", instance_details['AvailabilityZone'])
            print("\nInstanceId: ", instance_details["InstanceId"])
            print("\nInstanceType: ",instance_details['InstanceType'])
On running this code I get an error -
If I comment out the AZ line, the code works fine. If I create a new function with only the AZ parameter in it, all AZs are returned. I don't understand why it fails in the code above.
In Python, it is good practice to use the dict get method to fetch values, so that a missing key returns None instead of raising an exception.
AvailabilityZone is actually in the Placement dict, not directly under the instance details. You can check the entire response structure in the boto3 documentation below.
Reference
def lambda_handler(event, context):
    client = boto3.client("ec2")
    #s3 = boto3.client("s3")
    # fetch information about all the instances
    status = client.describe_instances()
    for i in status["Reservations"]:
        instance_details = i["Instances"][0]
        if instance_details["State"]["Name"].lower() in ["shutting-down","stopped","stopping","terminated",]:
            print(f"AvailabilityZone: {instance_details.get('Placement', dict()).get('AvailabilityZone')}")
            print(f"\nInstanceId: {instance_details.get('InstanceId')}")
            print(f"\nInstanceType: {instance_details.get('InstanceType')}")
The problem is that in the describe_instances response, the availability zone is not at the first level of the instance dictionary (instance_details in your case). It is under the Placement dictionary, so what you need is
print(f"AvailabilityZone: {instance_details.get('Placement', dict()).get('AvailabilityZone')}")
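The snippets above only look at the first instance of each reservation (Instances[0]), but a reservation can contain more than one instance. Below is a minimal sketch that walks every instance and reads the zone from Placement. The function name summarize_instances and the STOPPED_STATES constant are made up for illustration, and the function works on an already-fetched response dict so it can be tried without AWS access.

```python
STOPPED_STATES = {"shutting-down", "stopped", "stopping", "terminated"}

def summarize_instances(response, states=STOPPED_STATES):
    """Return id, type and availability zone for instances in the given states.

    `response` is a dict shaped like the describe_instances response.
    Missing keys yield None instead of raising KeyError.
    """
    rows = []
    for reservation in response.get("Reservations", []):
        # Iterate over all instances, not just the first one in the reservation.
        for inst in reservation.get("Instances", []):
            state = inst.get("State", {}).get("Name", "").lower()
            if state not in states:
                continue
            rows.append({
                "InstanceId": inst.get("InstanceId"),
                "InstanceType": inst.get("InstanceType"),
                # The zone lives under Placement, not at the top level.
                "AvailabilityZone": inst.get("Placement", {}).get("AvailabilityZone"),
            })
    return rows
```

In the Lambda handler this would be called as summarize_instances(client.describe_instances()).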
In my project, some EC2 instances will be shut down. These instances will only be connected to when a user needs to work.
Users will access the instances using a clientless remote desktop gateway called Apache Guacamole.
If an instance is stopped, how can I start it through Apache Guacamole?
Guacamole is essentially an RDP/VNC/SSH client, and I don't think you can get the instances to start up by themselves, since there is no wake-on-LAN feature or anything like it out of the box.
I had a similar issue. We always kept one instance up and running and used it to run the AWS CLI to start the instances we wanted.
Alternatively, you could modify the calls from Guacamole to invoke a Lambda function that checks whether the instance you wish to connect to is running and starts it if not. You would then have to deal with the timeout for starting a session from Guacamole (I'm not sure whether this is configurable from the web admin console or from files), or set up another way of getting feedback when your instance becomes available.
There was a discussion on the Guacamole mailing list about a Wake-on-LAN feature, and one approach was proposed. It is based on a script that monitors connection attempts and launches instances when needed.
Although it is more of a workaround, it may be helpful for you. For a proper solution, it is possible to develop an extension.
You can find the discussion and a link to the script here:
http://apache-guacamole-general-user-mailing-list.2363388.n4.nabble.com/guacamole-and-wake-on-LAN-td7526.html
http://apache-guacamole-general-user-mailing-list.2363388.n4.nabble.com/Wake-on-lan-function-working-td2832.html
Unfortunately, there is no very simple solution. The Lambda approach is how we solved it.
Guacamole has a feature that logs accesses to CloudWatch Logs.
Next, we need the connection_id and the username/id as tags on the instance. Our back-end tool assigns these tags automatically when starting the instances.
Now, when a user connects to a machine, a log entry is written to CloudWatch Logs.
A filter is applied so that only login attempts trigger the Lambda.
The triggered Lambda script checks whether there is a stopped instance whose tags match the current connection attempt, plus other constraints, such as whether the instance has expired.
If so, the instance is started, and in roughly 40 seconds the user is able to connect.
The Lambda scripts look like this:
#receive information from cloudwatch event, parse it call function to start instances
import re
import boto3
import datetime
from conn_inc import *
from start_instance import *
def lambda_handler(event, context):
    # Variables
    region = "eu-central-1"
    cw_region = "eu-central-1"
    # Clients
    ec2Client = boto3.client('ec2')
    # Session
    session = boto3.Session(region_name=region)
    # Resource
    ec2 = session.resource('ec2', region)
    print(event)
    #print ("awsdata: ", event['awslogs']['data'])
    userdata ={}
    userdata = get_userdata(event['awslogs']['data'])
    print ("logDataUserName: ", userdata["logDataUserName"], "connection_ids: ", userdata["startConnectionId"])
    start_instance(ec2,ec2Client, userdata["logDataUserName"],userdata["startConnectionId"])
import boto3
import datetime
from datetime import date
import gzip
import json
import base64
from start_related_instances import *
def start_instance(ec2,ec2Client,logDataUserName,startConnectionId):
    # Boto 3
    # Use the filter() method of the instances collection to retrieve
    # all stopped EC2 instances which have the tag connection_ids.
    instances = ec2.instances.filter(
        Filters=[
            {
                'Name': 'instance-state-name',
                'Values': ['stopped'],
            },
            {
                'Name': 'tag:connection_ids',
                'Values': [f"*{startConnectionId}*"],
            }
        ]
    )
    # print ("instances: ", list(instances))
    #check if instances are found
    if len(list(instances)) == 0:
        print("No instances with connectionId ", startConnectionId, " found that is stopped.")
    else:
        for instance in instances:
            print(instance.id, instance.instance_type)
            expire = ""
            connectionName = ""
            for tag in instance.tags:
                if tag["Key"] == 'expire': #get expiration date
                    expire = tag["Value"]
            if (expire == ""):
                print ("Start instance: ", instance.id, ", no expire found")
                ec2Client.start_instances(
                    InstanceIds=[instance.id]
                )
            else:
                print("Check if instance already expired.")
                splitDate = expire.split(".")
                expire = datetime.datetime(int(splitDate[2]) , int(splitDate[1]) , int(splitDate[0]) )
                args = date.today().timetuple()[:6]
                today = datetime.datetime(*args)
                if (expire >= today):
                    print("Instance is not yet expired.")
                    print ("Start instance: ", instance.id, "expire: ", expire, ", today: ", today)
                    ec2Client.start_instances(
                        InstanceIds=[instance.id]
                    )
                else:
                    print ("Instance not started, because it already expired: ", instance.id,"expiration: ", f"{expire}", "today:", f"{today}")
def get_userdata(cw_data):
    # CloudWatch Logs delivers the payload base64-encoded and gzip-compressed
    compressed_payload = base64.b64decode(cw_data)
    uncompressed_payload = gzip.decompress(compressed_payload)
    payload = json.loads(uncompressed_payload)
    message = ""
    log_events = payload['logEvents']
    # keeps the message of the last log event in the batch
    for log_event in log_events:
        message = log_event['message']
        # print(f'LogEvent: {log_event}')
    #regex = r"\'.*?\'"
    #m = re.search(str(regex), str(message), re.DOTALL)
    logDataUserName = message.split('"')[1] #get the username from the user logged into guacamole "Adm_EKoester_1134faD"
    startConnectionId = message.split('"')[3] #get the connection Id of the connection which should be started
    # create dict (named userdata to avoid shadowing the built-in dict)
    userdata = {}
    userdata["connected"] = False
    userdata["disconnected"] = False
    userdata["error"] = True
    userdata["guacamole"] = payload["logStream"]
    userdata["logDataUserName"] = logDataUserName
    userdata["startConnectionId"] = startConnectionId
    # check for connected or disconnected
    ind_connected = message.find("connected to connection")
    ind_disconnected = message.find("disconnected from connection")
    # print ("ind_connected: ", ind_connected)
    # print ("ind_disconnected: ", ind_disconnected)
    if ind_connected > 0 and not ind_disconnected > 0:
        userdata["connected"] = True
        userdata["error"] = False
    elif ind_disconnected > 0 and not ind_connected > 0:
        userdata["disconnected"] = True
        userdata["error"] = False
    return userdata
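The expiry check in start_instance splits the tag on "." and rebuilds the date by hand. As a sketch, the same rule (start the instance when there is no expire tag or the expiry date is today or later) can be written with datetime.strptime. The function name is_expired is made up for illustration, and the "DD.MM.YYYY" tag format is inferred from the split order used in the code above.

```python
from datetime import date, datetime

def is_expired(expire_tag, today=None):
    """Return True if a 'DD.MM.YYYY' expire tag lies strictly before today.

    An empty tag means the instance has no expiry and is never expired.
    """
    if not expire_tag:
        return False
    if today is None:
        today = date.today()
    # strptime raises ValueError on a malformed tag instead of failing later.
    expire_date = datetime.strptime(expire_tag, "%d.%m.%Y").date()
    return expire_date < today
```

Passing today explicitly keeps the function easy to test. In the Lambda it would be called with just the tag value.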
The CloudWatch Logs trigger for the Lambda looks like this:
I need to analyse all CloudTrail events within an account (actually multiple accounts, but I am restricting it to one for now). However, I don't have direct access to the S3 bucket where the events are stored.
I need to find all events initiated by any role that fits a pattern. The reason for this is that I need to calculate the GuardDuty costs associated with the application that is making the API calls.
I have a script which works (it's just thrown together at the moment), but it's VERY slow, as it's analysing millions of CloudTrail events.
Is there a better way to get the data I need?
import boto3
from datetime import datetime
import json
session = boto3.Session(profile_name='<profile_name_here>')
client = session.client('cloudtrail')
total_events = 0
target_events = 0
start_time = datetime(2020, 1, 22)
guard_duty_cost = 0.0000044
paginator = client.get_paginator('lookup_events')
response_iterator = paginator.paginate(
    StartTime = start_time,
    MaxResults = 1000
)
y = 1
for response in response_iterator:
    events = response['Events']
    print('Processing response {}'.format(y))
    y += 1
    for event in events:
        total_events += 1
        cloudtrail_event = event['CloudTrailEvent']
        cloudtrail_event_json = json.loads(cloudtrail_event)
        user_identity = cloudtrail_event_json['userIdentity']
        if 'sessionContext' in user_identity:
            user_name = user_identity['sessionContext']['sessionIssuer']['userName']
            if '<target_role_pattern>' in user_name:
                target_events += 1
total_cost = guard_duty_cost * total_events
target_cost = guard_duty_cost * target_events
print('Total number of events since {} is {} - cost EUR {}'.format(start_time, total_events, total_cost))
print('Number of target events since {} is {} - cost EUR {}'.format(start_time, target_events, target_cost))
You should probably consider using Amazon Athena for this, but you will need access to the S3 bucket. I'm not sure how any solution would work without that access.
Using Athena with CloudTrail logs is a powerful way to enhance your
analysis of AWS service activity. For example, you can use queries to
identify trends and further isolate activity by attributes, such as
source IP address or user.
https://docs.aws.amazon.com/athena/latest/ug/cloudtrail-logs.html
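Whichever source the events come from, the per-event role check in the question can be isolated into a small function. This does not address the speed problem. It only makes the matching logic reusable and tolerant of records whose sessionContext carries no sessionIssuer, which is an assumption about some identity types rather than something stated in the question. The names issuer_name and count_matching are hypothetical.

```python
import json

def issuer_name(event):
    """Return the session issuer's userName for a lookup_events entry, or None."""
    record = json.loads(event["CloudTrailEvent"])
    issuer = (
        record.get("userIdentity", {})
        .get("sessionContext", {})
        .get("sessionIssuer", {})
    )
    return issuer.get("userName")

def count_matching(events, pattern):
    """Return (total, matched) where matched counts events whose issuer contains pattern."""
    total = matched = 0
    for event in events:
        total += 1
        name = issuer_name(event)
        if name and pattern in name:
            matched += 1
    return total, matched
```

Inside the paginator loop this would be applied per page, for example by summing count_matching(response['Events'], '<target_role_pattern>') over all responses.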
I am trying to deploy an existing breast cancer prediction model on Amazon SageMaker using AWS Lambda and API Gateway. I have followed the official documentation at the URL below.
https://aws.amazon.com/blogs/machine-learning/call-an-amazon-sagemaker-model-endpoint-using-amazon-api-gateway-and-aws-lambda/
I am getting a type error at "predicted_label".
result = json.loads(response['Body'].read().decode())
print(result)
pred = int(result['predictions'][0]['predicted_label'])
predicted_label = 'M' if pred == 1 else 'B'
return predicted_label
Please let me know if someone can resolve this issue. Thank you.
If you print the type of the result with print(type(result)), you can see that it is a dictionary. Printing the result itself shows that the key name is "score" rather than the "predicted_label" you are reading into pred. So replace that line with
pred = int(result['predictions'][0]['score'])
I think this solves your problem.
Here is my Lambda function:
import os
import io
import boto3
import json
import csv
# grab environment variables
ENDPOINT_NAME = os.environ['ENDPOINT_NAME']
runtime= boto3.client('runtime.sagemaker')
def lambda_handler(event, context):
    print("Received event: " + json.dumps(event, indent=2))
    data = json.loads(json.dumps(event))
    payload = data['data']
    print(payload)
    response = runtime.invoke_endpoint(EndpointName=ENDPOINT_NAME,
                                       ContentType='text/csv',
                                       Body=payload)
    #print(response)
    print(type(response))
    for key,value in response.items():
        print(key,value)
    result = json.loads(response['Body'].read().decode())
    print(type(result))
    print(result['predictions'])
    pred = int(result['predictions'][0]['score'])
    print(pred)
    predicted_label = 'M' if pred == 1 else 'B'
    return predicted_label
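One thing to watch when switching to the score key: int() truncates, so int(0.97) is 0 and the pred == 1 test would label almost everything 'B'. If score is a probability between 0 and 1 (an assumption about this model's output that the thread does not confirm), comparing against a threshold is safer. A minimal sketch, where label_from_score and the 0.5 cut-off are illustrative choices:

```python
def label_from_score(score, threshold=0.5):
    """Map a model score to 'M' (malignant) or 'B' (benign).

    Assumes `score` is a probability-like number in [0, 1]; the threshold
    of 0.5 is an arbitrary default, not something taken from the model.
    """
    return "M" if float(score) >= threshold else "B"
```

In the handler this would replace the int() conversion, for example predicted_label = label_from_score(result['predictions'][0]['score']).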
We are building an automated DR cold site in another region. We are currently working on retrieving a list of the RDS automated snapshots created today and passing them to another function that copies them to another AWS region.
The issue is with the RDS boto3 client, which returns dates in an unusual format, making it harder to filter on creation date.
today = (datetime.today()).date()
rds_client = boto3.client('rds')
snapshots = rds_client.describe_db_snapshots(SnapshotType='automated')
harini = "datetime("+ today.strftime('%Y,%m,%d') + ")"
print harini
print snapshots
for i in snapshots['DBSnapshots']:
    if i['SnapshotCreateTime'].date() == harini:
        print(i['DBSnapshotIdentifier'])
print (today)
Despite having converted the date "harini" to the format 'SnapshotCreateTime': datetime(2015, 1, 1), the Lambda function is still unable to list the snapshots.
A better method is to copy the snapshots as they are created, by invoking a Lambda function from a CloudWatch event.
See the step-by-step instructions:
https://geektopia.tech/post.php?blogpost=Automating_The_Cross_Region_Copy_Of_RDS_Snapshots
Alternatively, you can issue a copy for each snapshot regardless of the date. The client will raise an exception for snapshots that have already been copied, and you can trap it like this:
# Written By GeekTopia
#
# Copy All Snapshots for an RDS Instance To a new region
# --Free to use under all conditions
# --Script is provided as is. No Warranty, Express or Implied
import json
import boto3
from botocore.exceptions import ClientError
import time
destinationRegion = "us-east-1"
sourceRegion = 'us-west-2'
rdsInstanceName = 'needbackups'
def lambda_handler(event, context):
    #We need two clients
    # rdsDestinationClient -- Used to start the copy processes. All cross region
    #   copies must be started from the destination and reference the source
    # rdsSourceClient -- Used to list the snapshots that need to be copied.
    rdsDestinationClient = boto3.client('rds',region_name=destinationRegion)
    rdsSourceClient=boto3.client('rds',region_name=sourceRegion)
    #List All Automated for A Single Instance
    snapshots = rdsSourceClient.describe_db_snapshots(DBInstanceIdentifier=rdsInstanceName,SnapshotType='automated')
    for snapshot in snapshots['DBSnapshots']:
        #Check that the snapshot is NOT in the process of being created
        if snapshot['Status'] == 'available':
            #Get the Source Snapshot ARN. - Always use the ARN when copying snapshots across region
            sourceSnapshotARN = snapshot['DBSnapshotArn']
            #build a new snapshot name
            sourceSnapshotIdentifer = snapshot['DBSnapshotIdentifier']
            targetSnapshotIdentifer ="{0}-ManualCopy".format(sourceSnapshotIdentifer)
            targetSnapshotIdentifer = targetSnapshotIdentifer.replace(":","-")
            #Adding a delay to stop from reaching the api rate limit when there are large amount of snapshots -
            #This should never occur in this use-case, but may if the script is modified to copy more than one instance.
            time.sleep(.2)
            #Execute copy
            try:
                copy = rdsDestinationClient.copy_db_snapshot(SourceDBSnapshotIdentifier=sourceSnapshotARN,TargetDBSnapshotIdentifier=targetSnapshotIdentifer,SourceRegion=sourceRegion)
                print("Started Copy of Snapshot {0} in {2} to {1} in {3} ".format(sourceSnapshotIdentifer,targetSnapshotIdentifer,sourceRegion,destinationRegion))
            except ClientError as ex:
                if ex.response['Error']['Code'] == 'DBSnapshotAlreadyExists':
                    print("Snapshot {0} already exist".format(targetSnapshotIdentifer))
                else:
                    print("ERROR: {0}".format(ex.response['Error']['Code']))
    return {
        'statusCode': 200,
        'body': json.dumps('Operation Complete')
    }
The code below returns the automated snapshots created today.
import boto3
from datetime import date, datetime
region_src = 'us-east-1'
client_src = boto3.client('rds', region_name=region_src)
date_today = datetime.today().strftime('%Y-%m-%d')
def get_db_snapshots_src():
    response = client_src.describe_db_snapshots(
        SnapshotType = 'automated',
        IncludeShared=False,
        IncludePublic=False
    )
    snapshotsInDay = []
    for i in response["DBSnapshots"]:
        if i["SnapshotCreateTime"].strftime('%Y-%m-%d') == date.isoformat(date.today()):
            snapshotsInDay.append(i)
    return snapshotsInDay
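The comparison in the question fails because it tests a date object against a string, which is never equal. A minimal sketch of the same filter comparing date objects directly, with no string formatting involved. The function name snapshots_created_on is made up for illustration. It takes the DBSnapshots list and the target day as arguments so it can be tried without AWS access, and it assumes SnapshotCreateTime is a datetime object, as in the code above.

```python
from datetime import datetime, timezone

def snapshots_created_on(snapshots, day):
    """Return the snapshots whose SnapshotCreateTime falls on the given date.

    `snapshots` is a list shaped like response['DBSnapshots'];
    `day` is a datetime.date.
    """
    return [s for s in snapshots if s["SnapshotCreateTime"].date() == day]

# Using the current UTC date here is an assumption: it matches timestamps
# expressed in UTC, which may differ from the Lambda's local date.
today_utc = datetime.now(timezone.utc).date()
```

With a real client this would be called as snapshots_created_on(response["DBSnapshots"], today_utc).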