boto3 issue checking ec2 instance state - amazon-web-services

I have a boto3 script that starts an EC2 instance. When I run it as a Lambda function, describe_instance_status returns an empty InstanceStatuses array, so the program terminates with an "index out of range" error. Any suggestions?
import boto3
from time import sleep
region = 'your region name'
def lambda_handler(event, context):
    cye_production_web_server_2 = 'abcdefgh'
    ec2 = boto3.client('ec2', region)
    start_response = ec2.start_instances(
        InstanceIds=[cye_production_web_server_2, ],
        DryRun=False
    )
    print(
        'instance id:',
        start_response['StartingInstances'][0]['InstanceId'],
        'is',
        start_response['StartingInstances'][0]['CurrentState']['Name']
    )
    status = None
    counter = 5
    while (status != 'ok' and counter > 0):
        status_response = ec2.describe_instance_status(
            DryRun=False,
            InstanceIds=[cye_production_web_server_2, ],
        )
        status = status_response['InstanceStatuses'][0]['SystemStatus']['Status']
        sleep(5)  # 5 second throttle
        counter = counter - 1
    print(status_response)
    print('status is', status.capitalize())

By default, describe_instance_status only describes running instances.
It can take a few minutes for the instance to enter the running state, so a call made right after start_instances returns an empty InstanceStatuses list.
Your program never reaches sleep because indexing [0] into that empty list fails on the first iteration.
Set the boolean request parameter IncludeAllInstances to True to include the health status of all instances. When it is False (the default), only the health status of running instances is included.

As omuthu mentioned, by default the call only returns status information for instances in the running state. To get the status of instances in other states, pass IncludeAllInstances=True to describe_instance_status().
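A minimal sketch of the polling loop with IncludeAllInstances=True and a guard against an empty InstanceStatuses list. The helper name wait_for_system_status and its attempts/delay/sleep parameters are illustrative assumptions, not part of the original script; ec2_client is expected to be a boto3 EC2 client.

```python
import time


def wait_for_system_status(ec2_client, instance_id, attempts=5, delay=5, sleep=time.sleep):
    """Poll describe_instance_status until SystemStatus is 'ok' or attempts run out.

    Returns the last status seen, or None if no status was ever returned.
    """
    status = None
    for _ in range(attempts):
        response = ec2_client.describe_instance_status(
            InstanceIds=[instance_id],
            IncludeAllInstances=True,  # also report instances that are not running yet
        )
        statuses = response.get("InstanceStatuses", [])
        if statuses:  # guard against an empty list instead of indexing blindly
            status = statuses[0]["SystemStatus"]["Status"]
            if status == "ok":
                break
        sleep(delay)
    return status
```

Inside the handler this could be called as wait_for_system_status(boto3.client('ec2', region), instance_id). boto3 also ships EC2 waiters such as instance_running and system_status_ok, which perform similar polling.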

Related

How to start an EC2 instance through Apache Guacamole?

In my project, some EC2 instances will be shut down. These instances will only be connected when the user needs to work.
Users will access the instances using a clientless remote desktop gateway called Apache Guacamole.
If the instance is stopped, how can I start it through Apache Guacamole?
Guacamole is, essentially, an RDP/VNC/SSH client and I don't think you can get the instances to startup by themselves since there is no possibility for a wake-on-LAN feature or something like it out-of-the-box.
I used to have a similar issue and we always had one instance up and running and used it to run the AWS CLI to startup the instances we wanted.
Alternatively you could modify the calls from Guacamole to invoke a Lambda function to check if the instance you wish to connect to is running and start it up if not; but then you'd have to deal with the timeout for starting a session from Guacamole (not sure if this is a configurable value from the web admin console, or files), or set up another way of getting feedback for when your instance becomes available.
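The "check if it is running, start it if not" step could be sketched roughly as follows. This is only an illustration under assumptions: the function name ensure_instance_running is hypothetical, ec2_client is expected to be a boto3 EC2 client, and how Guacamole would invoke the function is not shown.

```python
def ensure_instance_running(ec2_client, instance_id):
    """Start the instance if it is stopped; return the state seen before any action."""
    response = ec2_client.describe_instances(InstanceIds=[instance_id])
    state = response["Reservations"][0]["Instances"][0]["State"]["Name"]
    if state == "stopped":
        # start_instances returns immediately; the instance still needs time to boot
        ec2_client.start_instances(InstanceIds=[instance_id])
    return state
```

The caller still has to deal with the boot time mentioned above, since a returned state of "stopped" only means that a start was requested.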
There was a discussion on the Guacamole mailing list regarding a Wake-on-LAN feature, and one approach was proposed. It is based on a script that monitors connection attempts and launches instances when needed.
Although it is more of a workaround, it may be helpful for you. For a proper solution, it is possible to develop an extension.
You may find the discussion and a link to the script here:
http://apache-guacamole-general-user-mailing-list.2363388.n4.nabble.com/guacamole-and-wake-on-LAN-td7526.html
http://apache-guacamole-general-user-mailing-list.2363388.n4.nabble.com/Wake-on-lan-function-working-td2832.html
There is unfortunately not a very simple solution. The Lambda approach is the way we solved it.
Guacamole has a feature that logs accesses to CloudWatch Logs.
Next, we need the connection_id and the username/ID as tags on the instance. We assign these tags automatically with our back-end tool when starting the instances.
Now, when a user connects to a machine, a log entry is written to CloudWatch Logs.
A filter is applied so that only login attempts trigger the Lambda function.
The triggered Lambda script checks whether there is an instance whose tags correspond to the current connection attempt and whether that instance is stopped, plus other constraints, for example whether the instance has expired.
If so, the instance is started, and in roughly 40 seconds the user is able to connect.
The Lambda script looks like this:
# Receive information from the CloudWatch event, parse it, and call the function to start instances
import re
import boto3
import datetime
from conn_inc import *
from start_instance import *
def lambda_handler(event, context):
    # Variables
    region = "eu-central-1"
    cw_region = "eu-central-1"
    # Clients
    ec2Client = boto3.client('ec2')
    # Session
    session = boto3.Session(region_name=region)
    # Resource
    ec2 = session.resource('ec2', region)
    print(event)
    #print ("awsdata: ", event['awslogs']['data'])
    userdata = {}
    userdata = get_userdata(event['awslogs']['data'])
    print("logDataUserName: ", userdata["logDataUserName"], "connection_ids: ", userdata["startConnectionId"])
    start_instance(ec2, ec2Client, userdata["logDataUserName"], userdata["startConnectionId"])
import boto3
import datetime
from datetime import date
import gzip
import json
import base64
from start_related_instances import *
def start_instance(ec2, ec2Client, logDataUserName, startConnectionId):
    # Boto 3
    # Use the filter() method of the instances collection to retrieve
    # all stopped EC2 instances which have the tag connection_ids.
    instances = ec2.instances.filter(
        Filters=[
            {
                'Name': 'instance-state-name',
                'Values': ['stopped'],
            },
            {
                'Name': 'tag:connection_ids',
                'Values': [f"*{startConnectionId}*"],
            }
        ]
    )
    # print ("instances: ", list(instances))
    # Check if instances are found
    if len(list(instances)) == 0:
        print("No instances with connectionId ", startConnectionId, " found that is stopped.")
    else:
        for instance in instances:
            print(instance.id, instance.instance_type)
            expire = ""
            connectionName = ""
            for tag in instance.tags:
                if tag["Key"] == 'expire':  # get expiration date
                    expire = tag["Value"]
            if (expire == ""):
                print("Start instance: ", instance.id, ", no expire found")
                ec2Client.start_instances(
                    InstanceIds=[instance.id]
                )
            else:
                print("Check if instance already expired.")
                # The expire tag is expected in the form DD.MM.YYYY
                splitDate = expire.split(".")
                expire = datetime.datetime(int(splitDate[2]), int(splitDate[1]), int(splitDate[0]))
                args = date.today().timetuple()[:6]
                today = datetime.datetime(*args)
                if (expire >= today):
                    print("Instance is not yet expired.")
                    print("Start instance: ", instance.id, "expire: ", expire, ", today: ", today)
                    ec2Client.start_instances(
                        InstanceIds=[instance.id]
                    )
                else:
                    print("Instance not started, because it already expired: ", instance.id, "expiration: ", f"{expire}", "today:", f"{today}")
def get_userdata(cw_data):
    # CloudWatch Logs delivers the payload base64-encoded and gzip-compressed
    compressed_payload = base64.b64decode(cw_data)
    uncompressed_payload = gzip.decompress(compressed_payload)
    payload = json.loads(uncompressed_payload)
    message = ""
    log_events = payload['logEvents']
    for log_event in log_events:
        message = log_event['message']
        # print(f'LogEvent: {log_event}')
    #regex = r"\'.*?\'"
    #m = re.search(str(regex), str(message), re.DOTALL)
    logDataUserName = message.split('"')[1]  # get the username of the user logged into guacamole "Adm_EKoester_1134faD"
    startConnectionId = message.split('"')[3]  # get the connection Id of the connection which should be started
    # create dict
    dict = {}
    dict["connected"] = False
    dict["disconnected"] = False
    dict["error"] = True
    dict["guacamole"] = payload["logStream"]
    dict["logDataUserName"] = logDataUserName
    dict["startConnectionId"] = startConnectionId
    # check for connected or disconnected
    ind_connected = message.find("connected to connection")
    ind_disconnected = message.find("disconnected from connection")
    # print ("ind_connected: ", ind_connected)
    # print ("ind_disconnected: ", ind_disconnected)
    if ind_connected > 0 and not ind_disconnected > 0:
        dict["connected"] = True
        dict["error"] = False
    elif ind_disconnected > 0 and not ind_connected > 0:
        dict["disconnected"] = True
        dict["error"] = False
    return dict
The CloudWatch Logs trigger for the Lambda function looks like this: (screenshot not included)

How to use boto3 waiters to wait for an RDS instance to be in the available state in order to stop it

I am confused about how to use boto3 waiters. I want to stop the RDS instances that are in the available state. Before stopping them I need to make some modifications (changing Multi-AZ instances to single-AZ), so I want to wait until each instance has been modified and is available again. How can I do this? Here is my script:
import boto3
client = boto3.client('rds')
def lambda_handler(event, context):
    db_ids = []
    response = client.describe_db_instances()
    for db in response['DBInstances']:
        if db['DBInstanceStatus'] == 'available':
            if db['MultiAZ']:
                # Convert Multi-AZ deployments to single-AZ before stopping
                client.modify_db_instance(
                    DBInstanceIdentifier=db['DBInstanceIdentifier'],
                    ApplyImmediately=True,
                    MultiAZ=False
                )
            # Collect each available instance exactly once
            db_ids.append(db['DBInstanceIdentifier'])
    # Wait for each instance to report 'available', then stop it
    waiter = client.get_waiter('db_instance_available')
    for db_id in db_ids:
        waiter.wait(DBInstanceIdentifier=db_id)
        client.stop_db_instance(DBInstanceIdentifier=db_id)
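Using waiters inside a Lambda function is an anti-pattern for serverless: the function keeps running (and is billed) while it waits, and it can hit the Lambda timeout before the instance is ready. Instead, you should use Step Functions to do the following: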
Step 1: Create RDS
Step 2: Check if finished
Step 3: Move on if finished, Recheck if not
Step 4 ....
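The "check if finished" step can be a small Lambda function that performs a single, non-blocking status check and lets the state machine decide whether to wait and recheck. A minimal sketch, assuming a hypothetical helper name check_db_instance and a boto3 RDS client passed in as rds_client; the state machine definition itself (for example a Wait state followed by a Choice state on the "finished" field) is not shown.

```python
def check_db_instance(rds_client, db_instance_id):
    """Single, non-blocking status check for one RDS instance."""
    response = rds_client.describe_db_instances(DBInstanceIdentifier=db_instance_id)
    status = response["DBInstances"][0]["DBInstanceStatus"]
    return {
        "DBInstanceIdentifier": db_instance_id,
        "status": status,
        # The state machine can branch on this flag: move on if True, recheck if False
        "finished": status == "available",
    }
```

Because each invocation returns immediately, no Lambda execution time is spent waiting between checks.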

list automated RDS snapshots created today and copy to other region using boto3

We are building an automated DR cold site in another region. We are currently working on retrieving a list of the RDS automated snapshots created today and passing them to another function that copies them to another AWS region.
The issue is that the RDS boto3 client returns dates in its own format, which makes filtering on creation date more difficult.
today = (datetime.today()).date()
rds_client = boto3.client('rds')
snapshots = rds_client.describe_db_snapshots(SnapshotType='automated')
harini = "datetime("+ today.strftime('%Y,%m,%d') + ")"
print harini
print snapshots
for i in snapshots['DBSnapshots']:
    if i['SnapshotCreateTime'].date() == harini:
        print(i['DBSnapshotIdentifier'])
        print (today)
Despite converting the date "harini" to the format 'SnapshotCreateTime': datetime(2015, 1, 1), the Lambda function is still unable to list the snapshots.
A better method is to copy the snapshots as they are created, by invoking a Lambda function from a CloudWatch event.
See step by step instruction:
https://geektopia.tech/post.php?blogpost=Automating_The_Cross_Region_Copy_Of_RDS_Snapshots
Alternatively, you can issue a copy for each snapshot regardless of the date. If the target snapshot already exists, the client raises an exception that you can trap like this:
# Written By GeekTopia
#
# Copy All Snapshots for an RDS Instance To a new region
# --Free to use under all conditions
# --Script is provided as is. No Warranty, Express or Implied
import json
import boto3
from botocore.exceptions import ClientError
import time
destinationRegion = "us-east-1"
sourceRegion = 'us-west-2'
rdsInstanceName = 'needbackups'
def lambda_handler(event, context):
    # We need two clients
    # rdsDestinationClient -- Used to start the copy processes. All cross region
    #   copies must be started from the destination and reference the source
    # rdsSourceClient -- Used to list the snapshots that need to be copied.
    rdsDestinationClient = boto3.client('rds', region_name=destinationRegion)
    rdsSourceClient = boto3.client('rds', region_name=sourceRegion)
    # List all automated snapshots for a single instance
    snapshots = rdsSourceClient.describe_db_snapshots(DBInstanceIdentifier=rdsInstanceName, SnapshotType='automated')
    for snapshot in snapshots['DBSnapshots']:
        # Check that the snapshot is NOT in the process of being created
        if snapshot['Status'] == 'available':
            # Get the source snapshot ARN. Always use the ARN when copying snapshots across regions
            sourceSnapshotARN = snapshot['DBSnapshotArn']
            # Build a new snapshot name
            sourceSnapshotIdentifer = snapshot['DBSnapshotIdentifier']
            targetSnapshotIdentifer = "{0}-ManualCopy".format(sourceSnapshotIdentifer)
            targetSnapshotIdentifer = targetSnapshotIdentifer.replace(":", "-")
            # Add a delay to avoid reaching the API rate limit when there is a large number of snapshots.
            # This should never occur in this use case, but may if the script is modified to copy more than one instance.
            time.sleep(.2)
            # Execute copy
            try:
                copy = rdsDestinationClient.copy_db_snapshot(SourceDBSnapshotIdentifier=sourceSnapshotARN, TargetDBSnapshotIdentifier=targetSnapshotIdentifer, SourceRegion=sourceRegion)
                print("Started Copy of Snapshot {0} in {2} to {1} in {3} ".format(sourceSnapshotIdentifer, targetSnapshotIdentifer, sourceRegion, destinationRegion))
            except ClientError as ex:
                if ex.response['Error']['Code'] == 'DBSnapshotAlreadyExists':
                    print("Snapshot {0} already exist".format(targetSnapshotIdentifer))
                else:
                    print("ERROR: {0}".format(ex.response['Error']['Code']))
    return {
        'statusCode': 200,
        'body': json.dumps('Operation Complete')
    }
The code below returns the automated snapshots created today.
import boto3
from datetime import date, datetime
region_src = 'us-east-1'
client_src = boto3.client('rds', region_name=region_src)
date_today = datetime.today().strftime('%Y-%m-%d')
def get_db_snapshots_src():
    response = client_src.describe_db_snapshots(
        SnapshotType='automated',
        IncludeShared=False,
        IncludePublic=False
    )
    snapshotsInDay = []
    for i in response["DBSnapshots"]:
        if i["SnapshotCreateTime"].strftime('%Y-%m-%d') == date.isoformat(date.today()):
            snapshotsInDay.append(i)
    return snapshotsInDay
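The comparison in the question fails because it compares a date object with a string, which are never equal in Python. Comparing a date with a date works without any formatting. A minimal sketch: the helper name snapshots_created_on and the sample snapshot identifiers are made up for illustration, and the sample datetimes are timezone-aware on the assumption that boto3 returns them that way.

```python
from datetime import date, datetime, timezone


def snapshots_created_on(snapshots, day):
    """Return the snapshots whose SnapshotCreateTime falls on the given date."""
    return [s for s in snapshots if s["SnapshotCreateTime"].date() == day]


# Hypothetical sample data shaped like entries of response["DBSnapshots"]
sample = [
    {"DBSnapshotIdentifier": "rds:db-2015-01-01",
     "SnapshotCreateTime": datetime(2015, 1, 1, 3, 0, tzinfo=timezone.utc)},
    {"DBSnapshotIdentifier": "rds:db-2014-12-31",
     "SnapshotCreateTime": datetime(2014, 12, 31, 3, 0, tzinfo=timezone.utc)},
]

matches = snapshots_created_on(sample, date(2015, 1, 1))
print([s["DBSnapshotIdentifier"] for s in matches])  # prints ['rds:db-2015-01-01']
```

With a real response this could be called as snapshots_created_on(response["DBSnapshots"], datetime.now(timezone.utc).date()), so that "today" is computed in the same timezone as the snapshot timestamps if they are in UTC.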

AWS: How to make sure that an instance profile is initialized and propagated when starting an EC2 server

I am starting EC2 servers with IAM instance profiles.
The problem is that the profile is sometimes not "there" right after creating it, even if I wait about 10 seconds:
# Create new Instance Profile
instanceProfile = self.iam.create_instance_profile(InstanceProfileName=instProfName)
instanceProfile.add_role(RoleName="...")
time.sleep(10)
# Create the Instance
instances = self.ec2.create_instances(
    # ...
    IamInstanceProfile={
        "Name": instanceProfile.instance_profile_name
    }
)
Is there a way to wait for it to be propagated?
My first attempt is:
error = 30
dryRun = True
while error > 0:
    try:
        # Create the Instance
        instances = self.ec2.create_instances(
            DryRun=dryRun,
            # ...
            IamInstanceProfile={
                "Name": instanceProfile.instance_profile_name
            }
        )
        if not dryRun:
            break
        dryRun = False
    except botocore.exceptions.ClientError as e:
        error = error - 1
But how do I catch only the IAM instance profile error?
Much of AWS works using "eventual consistency". That means that after you make a change, it will take some time to propagate through the system.
After you create the instance profile:
Delay by 5 or 10 seconds (or some other time you're comfortable with),
Call iam.get_instance_profile with your instance profile name.
Repeat delay and check until get_instance_profile returns your information.
Another thing you can do is catch the "instance profile not found error" during the ec2.create_instances call, and delay and repeat it if you get that error.
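The "catch the error, delay, and repeat" approach can be sketched as a small retry helper that only retries on specific AWS error codes and re-raises everything else. The names error_code and retry_on_error_codes are hypothetical, and the exact error code EC2 returns for a not-yet-propagated instance profile is an assumption you should confirm by printing e.response['Error'] from a failing call. The helper inspects the response attribute that botocore's ClientError carries; in real code you may prefer to catch botocore.exceptions.ClientError explicitly.

```python
import time


def error_code(exc):
    """Extract the AWS error code from a botocore ClientError-like exception."""
    return getattr(exc, "response", {}).get("Error", {}).get("Code")


def retry_on_error_codes(operation, retry_codes, attempts=30, delay=5, sleep=time.sleep):
    """Call operation() until it succeeds; retry only on the listed error codes."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as exc:
            # Re-raise unrelated errors immediately, and give up after the last attempt
            if error_code(exc) not in retry_codes or attempt == attempts:
                raise
            sleep(delay)
```

The create_instances call would be wrapped in a function or lambda and passed as operation, with retry_codes set to the code observed for the missing profile (for example {"InvalidParameterValue"}, if that is what your failing call reports).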

cannot make more than one spot request using boto3 in one day

Greetings,
My code works as long as it is the first spot request of the day. If I terminate the instance and make another spot request, it just gives me back my old request.
Is there something wrong with my code or with AWS? Is there a workaround?
I have tried cloning my AMI and using the clone, changing the price, and changing the number of instances in the spec,
but it is still not working.
#!/home/makayo/.virtualenvs/boto3/bin/python
"""
http://boto3.readthedocs.io/en/latest/reference/services/ec2.html#EC2.Client.describe_spot_instance_requests
"""
import boto3
import time
myid=
s = boto3.Session()
ec2 = s.resource('ec2')
client = boto3.client('ec2')
images = list(ec2.images.filter(Owners=[myid]))
def getdate(datestr):
    ix=datestr.replace('T',' ')
    ix=ix[0:len(ix)-5]
    idx=time.strptime(ix,'%Y-%m-%d %H:%M:%S')
    return(idx)
zz=sorted(images, key=lambda images: getdate(images.creation_date))
#last_ami
myAmi=zz[len(zz)-1]
#earliest
#myAmi=latestAmi=zz[0]
"""
[{u'DeviceName': '/dev/sda1', u'Ebs': {u'DeleteOnTermination': True, u'Encrypted': False, u'SnapshotId': 'snap-d8de3adb', u'VolumeSize': 50, u'VolumeType': 'gp2'}}]
"""
#myimageId='ami-42870a55'
myimageId=myAmi.id
print myimageId
mysubnetId=
myinstanceType='c4.4xlarge'
mykeyName='spot-coursera'
#make sure to adjust this, but don't do multiple in a loop as it can fail!!!
mycount=2
#make sure to adjust this, but don't do multiple in a loop as it can fail!!!
myprice='5.0'
mytype='one-time'
myipAddr=
myallocId=''
mysecurityGroups=['']
#mydisksize=70
mygroupId=
#mygroupId=
myzone='us-east-1a'
myvpcId='vpc-503dba37'
#latestAmi.block_device_mappings[0]['Ebs']['VolumeSize']=mydisksize
#diskSpec=latestAmi.block_device_mappings[0]['Ebs']['VolumeSize']
response2 = client.request_spot_instances(
    DryRun=False,
    SpotPrice=myprice,
    ClientToken='string',
    InstanceCount=1,
    Type='one-time',
    LaunchSpecification={
        'ImageId': myimageId,
        'KeyName': mykeyName,
        'SubnetId':mysubnetId,
        #'SecurityGroups': mysecurityGroups,
        'InstanceType': myinstanceType,
        'Placement': {
            'AvailabilityZone': myzone,
        }
    }
)
#print(response2)
myrequestId=response2['SpotInstanceRequests'][0]['SpotInstanceRequestId']
import time
XX=True
while XX:
    response3 = client.describe_spot_instance_requests(
        #DryRun=True,
        SpotInstanceRequestIds=[
            myrequestId,
        ]
        #Filters=[
        #    {
        #        'Name': 'string',
        #        'Values': [
        #            'string',
        #        ]
        #    },
        #]
    )
    #print(response3)
    request_status=response3['SpotInstanceRequests'][0]['Status']['Code']
    if(request_status=='fulfilled'):
        print myrequestId,request_status
        XX=False
    elif ('pending' in request_status):
        print myrequestId,request_status
        time.sleep(5)
    else:
        XX=False
        print myrequestId,request_status
"""
instances = ec2.instances.filter(Filters=[{'Name': 'instance-state-name', 'Values': ['running']}])
while( len(list(instances))==0):
    instances = ec2.instances.filter(Filters=[{'Name': 'instance-state-name', 'Values': ['running']}])
instances = ec2.instances.filter(Filters=[{'Name': 'instance-state-name', 'Values': ['running']}])
for instance in instances:
    print(instance.id, instance.instance_type);
    response = instance.modify_attribute(Groups=[mygroupId]);
    print(response);
"""
This is wrong:
ClientToken='string',
Or, at least, it's wrong most of the time, as you should now realize.
The purpose of the token is to ensure that EC2 does not process the same request twice, due to retries, bugs, or a multitude of other reasons.
It doesn't matter (within reason -- 64 characters, max, ASCII, case-sensitive) what you send here, but you need to send something different with each unique request.
A client token is a unique, case-sensitive string of up to 64 ASCII characters. It is included in the response when you describe the instance. A client token is valid for at least 24 hours after the termination of the instance. You should not reuse a client token in another call later on.
http://docs.aws.amazon.com/AWSEC2/latest/APIReference/Run_Instance_Idempotency.html
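One simple way to send a different token with each unique request is to generate a random UUID. A minimal sketch; new_client_token is a hypothetical helper name, not part of boto3.

```python
import uuid


def new_client_token():
    """Return a fresh idempotency token: 36 ASCII characters, within the 64-character limit."""
    return str(uuid.uuid4())


token = new_client_token()
print(len(token))  # prints 36
```

The value would be passed as ClientToken=new_client_token() to request_spot_instances. Reuse the same token only when retrying the very same request, which is the case the token is meant to protect.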