In my project, some EC2 instances will be shut down. These instances will only be connected when the user needs to work.
Users will access the instances using a clientless remote desktop gateway called Apache Guacamole.
If the instance is stopped, how start an EC2 instance through Apache Guacamole?
Home Screen
Guacamole is, essentially, an RDP/VNC/SSH client and I don't think you can get the instances to startup by themselves since there is no possibility for a wake-on-LAN feature or something like it out-of-the-box.
I used to have a similar issue and we always had one instance up and running and used it to run the AWS CLI to startup the instances we wanted.
Alternatively you could modify the calls from Guacamole to invoke a Lambda function to check if the instance you wish to connect to is running and start it up if not; but then you'd have to deal with the timeout for starting a session from Guacamole (not sure if this is a configurable value from the web admin console, or files), or set up another way of getting feedback for when your instance becomes available.
There was a discussion in the Guacamole mailing list regarding Wake-on-LAN feature and one approach was proposed. It is based on the script that monitors connection attempts and launches instances when needed.
Although it is more a workaround, maybe it will be helpful for you. For the proper solution, it is possible to develop an extension.
You may find the discussion and a link to the script here:
http://apache-guacamole-general-user-mailing-list.2363388.n4.nabble.com/guacamole-and-wake-on-LAN-td7526.html
http://apache-guacamole-general-user-mailing-list.2363388.n4.nabble.com/Wake-on-lan-function-working-td2832.html
There is unfortunately not a very simple solution. The Lambda approach is the way we solved it.
Guacamole has a feature that logs accesses to Cloudwatch Logs.
So next we need the the information of the connection_id and the username/id as a tag on the instance. We are automatically assigning theses tags with our back-end tool when starting the instances.
Now when a user connects to a machine, a log is written to Cloudwatch Logs.
A filter is applied to only get login attempts and trigger Lambda.
The triggered Lambda script checks if there is an instance with such tags corresponding to the current connection attempt and if the instance is stopped, plus other constraints, like if an instance is expired for example.
If yes, then the instance gets started, and in roughly 40 seconds the user is able to connect.
The lambda scripts looks like this:
#receive information from cloudwatch event, parse it call function to start instances
import re
import boto3
import datetime
from conn_inc import *
from start_instance import *
def lambda_handler(event, context):
# Variables
region = "eu-central-1"
cw_region = "eu-central-1"
# Clients
ec2Client = boto3.client('ec2')
# Session
session = boto3.Session(region_name=region)
# Resource
ec2 = session.resource('ec2', region)
print(event)
#print ("awsdata: ", event['awslogs']['data'])
userdata ={}
userdata = get_userdata(event['awslogs']['data'])
print ("logDataUserName: ", userdata["logDataUserName"], "connection_ids: ", userdata["startConnectionId"])
start_instance(ec2,ec2Client, userdata["logDataUserName"],userdata["startConnectionId"])
import boto3
import datetime
from datetime import date
import gzip
import json
import base64
from start_related_instances import *
def start_instance(ec2,ec2Client,logDataUserName,startConnectionId):
# Boto 3
# Use the filter() method of the instances collection to retrieve
# all stopped EC2 instances which have the tag connection_ids.
instances = ec2.instances.filter(
Filters=[
{
'Name': 'instance-state-name',
'Values': ['stopped'],
},
{
'Name': 'tag:connection_ids',
'Values': [f"*{startConnectionId}*"],
}
]
)
# print ("instances: ", list(instances))
#check if instances are found
if len(list(instances)) == 0:
print("No instances with connectionId ", startConnectionId, " found that is stopped.")
else:
for instance in instances:
print(instance.id, instance.instance_type)
expire = ""
connectionName = ""
for tag in instance.tags:
if tag["Key"] == 'expire': #get expiration date
expire = tag["Value"]
if (expire == ""):
print ("Start instance: ", instance.id, ", no expire found")
ec2Client.start_instances(
InstanceIds=[instance.id]
)
else:
print("Check if instance already expired.")
splitDate = expire.split(".")
expire = datetime.datetime(int(splitDate[2]) , int(splitDate[1]) , int(splitDate[0]) )
args = date.today().timetuple()[:6]
today = datetime.datetime(*args)
if (expire >= today):
print("Instance is not yet expired.")
print ("Start instance: ", instance.id, "expire: ", expire, ", today: ", today)
ec2Client.start_instances(
InstanceIds=[instance.id]
)
else:
print ("Instance not started, because it already expired: ", instance.id,"expiration: ", f"{expire}", "today:", f"{today}")
def get_userdata(cw_data):
compressed_payload = base64.b64decode(cw_data)
uncompressed_payload = gzip.decompress(compressed_payload)
payload = json.loads(uncompressed_payload)
message = ""
log_events = payload['logEvents']
for log_event in log_events:
message = log_event['message']
# print(f'LogEvent: {log_event}')
#regex = r"\'.*?\'"
#m = re.search(str(regex), str(message), re.DOTALL)
logDataUserName = message.split('"')[1] #get the username from the user logged into guacamole "Adm_EKoester_1134faD"
startConnectionId = message.split('"')[3] #get the connection Id of the connection which should be started
# create dict
dict={}
dict["connected"] = False
dict["disconnected"] = False
dict["error"] = True
dict["guacamole"] = payload["logStream"]
dict["logDataUserName"] = logDataUserName
dict["startConnectionId"] = startConnectionId
# check for connected or disconnected
ind_connected = message.find("connected to connection")
ind_disconnected = message.find("disconnected from connection")
# print ("ind_connected: ", ind_connected)
# print ("ind_disconnected: ", ind_disconnected)
if ind_connected > 0 and not ind_disconnected > 0:
dict["connected"] = True
dict["error"] = False
elif ind_disconnected > 0 and not ind_connected > 0:
dict["disconnected"] = True
dict["error"] = False
return dict
The cloudwatch logs trigger for lambda like that:
Related
I am trying to create lambda script using Python3.9 which will return total ec2 servers in AWS account, their status & details. Some of my code snippet is -
def lambda_handler(event, context):
client = boto3.client("ec2")
#s3 = boto3.client("s3")
# fetch information about all the instances
status = client.describe_instances()
for i in status["Reservations"]:
instance_details = i["Instances"][0]
if instance_details["State"]["Name"].lower() in ["shutting-down","stopped","stopping","terminated",]:
print("AvailabilityZone: ", instance_details['AvailabilityZone'])
print("\nInstanceId: ", instance_details["InstanceId"])
print("\nInstanceType: ",instance_details['InstanceType'])
On ruunning this code i get error -
If I comment AZ details, code works fine.If I create a new function with only AZ parameter in it, all AZs are returned. Not getting why it fails in above mentioned code.
In python, its always a best practice to use get method to fetch value from list or dict to handle exception.
AvailibilityZone is actually present in Placement dict and not under instance details. You can check the entire response structure from below boto 3 documentation
Reference
def lambda_handler(event, context):
client = boto3.client("ec2")
#s3 = boto3.client("s3")
# fetch information about all the instances
status = client.describe_instances()
for i in status["Reservations"]:
instance_details = i["Instances"][0]
if instance_details["State"]["Name"].lower() in ["shutting-down","stopped","stopping","terminated",]:
print(f"AvailabilityZone: {instance_details.get('Placement', dict()).get('AvailabilityZone')}")
print(f"\nInstanceId: {instance_details.get('InstanceId')}")
print(f"\nInstanceType: {instance_details.get('InstanceType')}")
The problem is that in response of describe_instances availability zone is not in first level of instance dictionary (in your case instance_details). Availability zone is under Placement dictionary, so what you need is
print(f"AvailabilityZone: {instance_details.get('Placement', dict()).get('AvailabilityZone')}")
I am very new to Google Cloud Platform. I am looking for ways to automate starting and stopping a mySQL instance at a predefined time.
I found that we could create a cloud function to start/stop an instance and then use the cloud scheduler to trigger this. However, I am not able to understand how this works.
I used the code that I found in GitHub.
https://github.com/chris32g/Google-Cloud-Support/blob/master/Cloud%20Functions/turn_on_cloudSQL_instance
https://github.com/chris32g/Google-Cloud-Support/blob/master/Cloud%20Functions/turn_off_CloudSQL_instance
However, I am not familiar with any of the programming languages like node, python or go. That was the reason for the confusion. Below is the code that I found on GitHub to Turn On a Cloud SQL instance:
# This file uses the Cloud SQL API to turn on a Cloud SQL instance.
from googleapiclient import discovery
from oauth2client.client import GoogleCredentials
credentials = GoogleCredentials.get_application_default()
service = discovery.build('sqladmin', 'v1beta4', credentials=credentials)
project = 'wave24-gonchristian' # TODO: Update placeholder value.
def hello_world(request):
instance = 'test' # TODO: Update placeholder value.
request = service.instances().get(project=project, instance=instance)
response = request.execute()
j = response["settings"]
settingsVersion = int(j["settingsVersion"])
dbinstancebody = {
"settings": {
"settingsVersion": settingsVersion,
"tier": "db-n1-standard-1",
"activationPolicy": "Always"
}
}
request = service.instances().update(
project=project,
instance=instance,
body=dbinstancebody)
response = request.execute()
# pprint(response)
request_json = request.get_json()
if request.args and 'message' in request.args:
return request.args.get('message')
elif request_json and 'message' in request_json:
return request_json['message']
else:
return f"Hello World!"
________________________
requirements.txt
google-api-python-client==1.7.8
google-auth-httplib2==0.0.3
google-auth==1.6.2
oauth2client==4.1.3
As I mentioned earlier, I am not familiar with Python. I just found this code on GitHub. I was trying to understand what this specific part does:
dbinstancebody = {
"settings": {
"settingsVersion": settingsVersion,
"tier": "db-n1-standard-1",
"activationPolicy": "Always"
}
}
dbinstancebody = {
"settings": {
"settingsVersion": settingsVersion,
"tier": "db-n1-standard-1",
"activationPolicy": "Always"
}
}
The code block above specifies sql instance properties you would like to update, amongst which the most relevant for your case is activationPolicy which allows you to stop / start sql instance.
For Second Generation instances, the activation policy is used only to start or stop the instance. You change the activation policy by starting and stopping the instance. Stopping the instance prevents further instance charges.
Activation policy can have two values Always or Never. Always will start the instance and Never will stop the instance.
You can use the API to amend the activationPolicy to "NEVER" to stop the server or "ALWAYS" to start it.
# PATCH
https://sqladmin.googleapis.com/sql/v1beta4/projects/{project}/instances/{instance}
# BODY
{
"settings": {
"activationPolicy": "NEVER"
}
}
See this article in the Cloud SQL docs for more info: Starting, stopping, and restarting instances. You can also try out the instances.patch method in the REST API reference.
please try the code below :
from pprint import pprint
from googleapiclient import discovery
from oauth2client.client import GoogleCredentials
import os
credentials = GoogleCredentials.get_application_default()
service = discovery.build("sqladmin", "v1beta4", credentials=credentials)
project_id = os.environ.get("GCP_PROJECT")
# setup this vars using terraform and assign the value via terraform
desired_policy = os.environ.get("DESIRED_POLICY") # ALWAYS or NEVER
instance_name = os.environ.get("INSTANCE_NAME")
def cloudsql(request):
request = service.instances().get(project=project_id, instance=instance_name)
response = request.execute()
state = response["state"]
instance_state = str(state)
x = response["settings"]
current_policy = str(x["activationPolicy"])
dbinstancebody = {"settings": {"activationPolicy": desired_policy}}
if instance_state != "RUNNABLE":
print("Instance is not in RUNNABLE STATE")
else:
if desired_policy != current_policy:
request = service.instances().patch(
project=project_id, instance=instance_name, body=dbinstancebody
)
response = request.execute()
pprint(response)
else:
print(f"Instance is in RUNNABLE STATE but is also already configured with the desired policy: {desired_policy}")
In my repo you can have more information on how to setup the cloud function using Terraform. This cloud function is intended to do what you want but it is using environment variables, if you dont want to use them, just change the variables values on the python code.
Here is my repository Repo
We are building an automated DR cold site on other region, currently are working on retrieving a list of RDS automated snapshots created today, and passed them to another function to copy them to another AWS region.
The issue is with RDS boto3 client where it returned a unique format of date, making filtering on creation date more difficult.
today = (datetime.today()).date()
rds_client = boto3.client('rds')
snapshots = rds_client.describe_db_snapshots(SnapshotType='automated')
harini = "datetime("+ today.strftime('%Y,%m,%d') + ")"
print harini
print snapshots
for i in snapshots['DBSnapshots']:
if i['SnapshotCreateTime'].date() == harini:
print(i['DBSnapshotIdentifier'])
print (today)
despite already converted the date "harini" to the format 'SnapshotCreateTime': datetime(2015, 1, 1), the Lambda function still unable to list out the snapshots.
The better method is to copy the files as they are created by invoking a lambda function using a cloud watch event.
See step by step instruction:
https://geektopia.tech/post.php?blogpost=Automating_The_Cross_Region_Copy_Of_RDS_Snapshots
Alternatively, you can issue a copy for each snapshot regardless of the date. The client will raise an exception and you can trap it like this
# Written By GeekTopia
#
# Copy All Snapshots for an RDS Instance To a new region
# --Free to use under all conditions
# --Script is provied as is. No Warranty, Express or Implied
import json
import boto3
from botocore.exceptions import ClientError
import time
destinationRegion = "us-east-1"
sourceRegion = 'us-west-2'
rdsInstanceName = 'needbackups'
def lambda_handler(event, context):
#We need two clients
# rdsDestinationClient -- Used to start the copy processes. All cross region
copies must be started from the destination and reference the source
# rdsSourceClient -- Used to list the snapshots that need to be copied.
rdsDestinationClient = boto3.client('rds',region_name=destinationRegion)
rdsSourceClient=boto3.client('rds',region_name=sourceRegion)
#List All Automated for A Single Instance
snapshots = rdsSourceClient.describe_db_snapshots(DBInstanceIdentifier=rdsInstanceName,SnapshotType='automated')
for snapshot in snapshots['DBSnapshots']:
#Check the the snapshot is NOT in the process of being created
if snapshot['Status'] == 'available':
#Get the Source Snapshot ARN. - Always use the ARN when copying snapshots across region
sourceSnapshotARN = snapshot['DBSnapshotArn']
#build a new snapshot name
sourceSnapshotIdentifer = snapshot['DBSnapshotIdentifier']
targetSnapshotIdentifer ="{0}-ManualCopy".format(sourceSnapshotIdentifer)
targetSnapshotIdentifer = targetSnapshotIdentifer.replace(":","-")
#Adding a delay to stop from reaching the api rate limit when there are large amount of snapshots -
#This should never occur in this use-case, but may if the script is modified to copy more than one instance.
time.sleep(.2)
#Execute copy
try:
copy = rdsDestinationClient.copy_db_snapshot(SourceDBSnapshotIdentifier=sourceSnapshotARN,TargetDBSnapshotIdentifier=targetSnapshotIdentifer,SourceRegion=sourceRegion)
print("Started Copy of Snapshot {0} in {2} to {1} in {3} ".format(sourceSnapshotIdentifer,targetSnapshotIdentifer,sourceRegion,destinationRegion))
except ClientError as ex:
if ex.response['Error']['Code'] == 'DBSnapshotAlreadyExists':
print("Snapshot {0} already exist".format(targetSnapshotIdentifer))
else:
print("ERROR: {0}".format(ex.response['Error']['Code']))
return {
'statusCode': 200,
'body': json.dumps('Opearation Complete')
}
The code below will take automated snapshots created today.
import boto3
from datetime import date, datetime
region_src = 'us-east-1'
client_src = boto3.client('rds', region_name=region_src)
date_today = datetime.today().strftime('%Y-%m-%d')
def get_db_snapshots_src():
response = client_src.describe_db_snapshots(
SnapshotType = 'automated',
IncludeShared=False,
IncludePublic=False
)
snapshotsInDay = []
for i in response["DBSnapshots"]:
if i["SnapshotCreateTime"].strftime('%Y-%m-%d') == date.isoformat(date.today()):
snapshotsInDay.append(i)
return snapshotsInDay
So i have this boto3 script that starts an ec2 instance. But when i run this lambda function, the function describe_instance_status returns blank InstanceStatus array. So the program terminates, after saying index our of range. Any suggestions?
import boto3
from time import sleep
region = 'your region name'
def lambda_handler(event, context):
cye_production_web_server_2 = 'abcdefgh'
ec2 = boto3.client('ec2',region)
start_response = ec2.start_instances(
InstanceIds=[cye_production_web_server_2, ],
DryRun=False
)
print(
'instance id:',
start_response['StartingInstances'][0]['InstanceId'],
'is',
start_response['StartingInstances'][0]['CurrentState']['Name']
)
status = None
counter = 5
while (status != 'ok' and counter > 0):
status_response = ec2.describe_instance_status(
DryRun=False,
InstanceIds=[cye_production_web_server_2, ],
)
status = status_response['InstanceStatuses'][0]['SystemStatus'] ['Status']
sleep(5) # 5 second throttle
counter=counter-1
print(status_response)
print('status is', status.capitalize())
By default, only running instances are described, unless specified otherwise.
It can take a few minutes for the instance to enter the running state.
Your program will never sleep as it fails in the prior step where the status is actually not returned in first iteration.
Use "IncludeAllInstances" which is a boolean request parameter, when true, includes the health status for all instances. When false, includes the health status for running instances only. Default is false
As omuthu mentioned, the default return type gives info only about the running state of an instance. To get the other states of the instant set the "IncludeAllInstances" argument to describe_instance_status() as True.
I'm creating a Lambda function with the intent of backing up my EC2 instances with their snapshots. However, I noticed reading the boto documentation the call to ec2.describe_instances is rate limited with MaxResults/NextToken. How can I combine the two of these to safely iterate through the list 50 at a time? Below is my work in progress:
import boto3
import datetime
import time
ec2 = boto3.client('ec2')
def lambda_handler(event, context):
try:
print("Creating snapshots on " + str(datetime.datetime.today()) + ".")
maxResults = 50
schedulers = ec2.describe_instances(Filters=[{'Name':'tag:GL-sub-purpose', 'Values':[Schedule]}], MaxResults=maxResults)
nextToken = schedulers['NextToken']
totalSchedulers = len(schedulers)
while totalSchedulers == maxResults:
schedulers = ec2.describe_instances(Filters=[{'Name':'tag:GL-sub-purpose', 'Values':[Schedule]}], MaxResults=maxResults, NextToken=nextToken)
nextToken = result['NextToken']
totalSchedulers = len(schedulers)
print("Performing backup on " + str(len(schedulers)) + " schedules.")
successful = []
failed = []
for s in schedulers:
#[...] More operations here, done 50 at a time.
I'm not really sure if I'm using the MaxResults/NextToken parameters correctly or efficiently here. Is this the best way to achieve my desired result/am I on the right track?
Just iterate through until NextToken is not returned. Here is a sample code to iterate through a batch of instances. Change it to suit your needs.
import boto3
ec2 = boto3.client('ec2')
insts = ec2.describe_instances(MaxResults=50)
while True:
#
# Process Instances (insts)
#
if 'NextToken' not in insts: break
next_token = insts['NextToken']
insts = ec2.describe_instances(MaxResults=50, NextToken=next_token)