How to check AWS EC2 instance current uptime - amazon-web-services

What is the best way to check an EC2 instance's uptime and, if possible, send an alert when the uptime exceeds N hours? How can this be set up with standard AWS tools such as CloudWatch and Lambda?

Here's another option which can be done just in CloudWatch.
Create an alarm for your EC2 instance with something like CPUUtilization - you will always get a value for this when the instance is running.
Set the alarm to >= 0; this will ensure that whenever the instance is running, it matches.
Set the period and consecutive periods to match the required alert uptime, for example for 24 hours you could set the period to 1 hour and the consecutive periods to 24.
Set an action to send a notification when the alarm is in ALARM state.
Now, when the instance has been running for less than the set time, the alarm will be in the INSUFFICIENT_DATA state. Once it has been running for the required uptime, the alarm goes to the ALARM state and the notification is sent.
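The alarm settings described in this answer can be sketched with boto3. This is only a minimal sketch under stated assumptions: the alarm name, the SNS topic ARN and both helper names are hypothetical, and create_alarm assumes boto3 is installed and AWS credentials are configured.

```python
def uptime_alarm_params(instance_id, hours, topic_arn):
    """Build put_metric_alarm arguments for an 'up for N hours' alarm."""
    if hours < 1:
        raise ValueError("hours must be at least 1")
    return {
        "AlarmName": f"uptime-over-{hours}h-{instance_id}",  # hypothetical name
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 3600,               # one-hour periods
        "EvaluationPeriods": hours,   # N consecutive periods
        "Threshold": 0.0,             # CPU >= 0 matches whenever data exists
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [topic_arn],  # SNS topic notified in the ALARM state
    }


def create_alarm(params):
    """Create the alarm; needs boto3 and AWS credentials (assumption)."""
    import boto3  # imported lazily so the builder above has no dependencies
    boto3.client("cloudwatch").put_metric_alarm(**params)
```

For the 24-hour example, uptime_alarm_params(instance_id, 24, topic_arn) yields 24 evaluation periods of 3600 seconds each.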

One option is to use the AWS CLI to get the launch time, calculate the uptime from it, and send the result to CloudWatch:
aws ec2 describe-instances --instance-ids i-00123458ca3fa2c4f --query 'Reservations[*].Instances[*].LaunchTime' --output text
Output:
2016-05-20T19:23:47.000Z
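Turning that launch time into an uptime is a small date calculation. The following is a minimal Python sketch; the function name uptime_hours is hypothetical, and it assumes the timestamp is in UTC, either with a trailing Z as in the output above or with an explicit +00:00 offset.

```python
from datetime import datetime, timezone


def uptime_hours(launch_time, now=None):
    """Return hours elapsed since an ISO 8601 launch time (UTC)."""
    # fromisoformat does not accept a trailing 'Z' on older Pythons,
    # so rewrite it as an explicit UTC offset first.
    text = launch_time.strip().replace("Z", "+00:00")
    launched = datetime.fromisoformat(text)
    if now is None:
        now = datetime.now(timezone.utc)
    return (now - launched).total_seconds() / 3600


# Example with a fixed "now" exactly one day after launch:
one_day_later = datetime(2016, 5, 21, 19, 23, 47, tzinfo=timezone.utc)
print(uptime_hours("2016-05-20T19:23:47.000Z", now=one_day_later))  # 24.0
```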
Another option is to periodically run a cron job script that:
calls the uptime -p command
converts the output to hours
sends the result to CloudWatch as a custom metric with unit Count
After adding the cron job:
add a CloudWatch alarm that sends an alert when this value exceeds a threshold or when the alarm is in the INSUFFICIENT_DATA state
INSUFFICIENT_DATA means the metric is no longer being reported, which usually means the machine is not up
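The conversion step of that cron script could look like the sketch below. It assumes uptime -p prints text such as "up 2 days, 3 hours, 30 minutes"; the function names and the metric name UptimeHours are hypothetical, and a year is counted as 365 days.

```python
import re

# Minutes per unit that may appear in the pretty uptime text.
_UNIT_MINUTES = {
    "decade": 5256000,
    "year": 525600,
    "week": 10080,
    "day": 1440,
    "hour": 60,
    "minute": 1,
}


def pretty_uptime_to_hours(text):
    """Convert 'up 2 days, 3 hours, 30 minutes' style text to hours."""
    pattern = r"(\d+)\s+(decade|year|week|day|hour|minute)s?"
    total_minutes = 0
    for amount, unit in re.findall(pattern, text):
        total_minutes += int(amount) * _UNIT_MINUTES[unit]
    return total_minutes / 60


def uptime_datum(hours):
    """Build one MetricData entry for a put_metric_data call."""
    return {"MetricName": "UptimeHours", "Value": hours, "Unit": "Count"}
```

The cron job would pass the output of uptime -p to pretty_uptime_to_hours and publish the resulting datum under a namespace of its choosing.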

I would recommend looking into an AWS-native way of doing this.
If the goal is basically to send OS-level metrics (e.g. free memory, uptime, disk usage) to CloudWatch, this can be achieved by following this guide, which installs the CloudWatch Logs agent on your EC2 instances:
http://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/QuickStartEC2Instance.html
The great thing about this is that the metrics then show up in CloudWatch Logs (see the attached picture, which shows the CloudWatch Logs interface in the AWS Console).

Related

Programmatically Stop AWS EC2 in case of inactivity

Can we stop an AWS Windows Server EC2 instance in a development environment when there is no activity on it, say after 2 hours of inactivity? I am having trouble identifying whether any user is connected to the server virtually.
I can easily start/stop the EC2 instance at a fixed time, programmatically, but in order to cut the cost of my server, I am trying to stop the instance when it is not being used.
My intent (or use case) is: if no user has used the EC2 instance for a specified amount of time, it stops automatically. Developers can restart it as and when needed.
The easiest solution would probably be to set up an alarm with CloudWatch.
Have a read of the documentation, which describes your use case almost exactly:
You can create an alarm that stops an Amazon EC2 instance when a certain threshold has been met
A condition could be the average CPU utilisation, e.g. CPU utilisation staying below a certain point (which most probably correlates with no logged-in users and no developer actually using the machine).
This is not a simple task.
The Amazon EC2 service provides a virtual computer that has RAM, CPU and Disk. It can view the amount of activity on the CPU, Network traffic and disk access but it cannot see into the Operating System.
So, the problem becomes how to detect 'inactivity'. This really comes down to the operating system and making some hard decisions. For example, your home computer screen turns off after a defined time of no mouse/keyboard input but the operating system is still doing activity in the background. If the system is running an application such as a web server, and there are no web requests, it is hard to know whether this is 'inactive' because there are no requests, or 'active' because the web server is running.
Bottom line: There is no out-of-the-box feature to do this. You would need to find your own definition of 'inactivity' and then trigger a shutdown in the Operating System.
If you wish to do it via schedule, this might help: Auto-Stop EC2 instances when they finish a task - DEV Community
UPDATE: Lambdas aren't needed anymore, see tpschmidt's answer.
Create a Lambda that turns off the EC2 instance and is triggered by a CloudWatch alarm when, for example, the average CPU goes under 20% for an hour. This is fine when you're coding, as you will be using more than 20%, and when you take a break for over an hour, that's when you want it turned off.
Be sure to enable auto-save in your IDEs.
Example Python Lambda:
import boto3
region = 'eu-west-3'
instances = ['i-05be5c0c4039881ed']
ec2 = boto3.client('ec2', region_name=region)
def lambda_handler(event, context):
    # TODO getInstanceIDFromCloudWatch = event["instanceid"]
    ec2.stop_instances(InstanceIds=instances)
    print('stopped your instances: ' + str(instances))
Ref: https://www.howtoforge.com/aws-lambda-function-to-start-and-stop-ec2-instance/
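The TODO in the handler above (reading the instance ID from the event rather than hard-coding it) could be filled in roughly as follows. This is a sketch under an assumption that differs from the console steps in this answer: it expects the Lambda to be invoked by an event rule matching "CloudWatch Alarm State Change" events, whose detail section lists the alarm's metric dimensions. The helper name is hypothetical.

```python
def instance_id_from_alarm_event(event):
    """Return the InstanceId dimension of the alarm's metric, or None."""
    configuration = event.get("detail", {}).get("configuration", {})
    for metric in configuration.get("metrics", []):
        dimensions = (
            metric.get("metricStat", {}).get("metric", {}).get("dimensions", {})
        )
        if "InstanceId" in dimensions:
            return dimensions["InstanceId"]
    return None


def lambda_handler(event, context):
    import boto3  # provided by the Lambda Python runtime

    instance_id = instance_id_from_alarm_event(event)
    if instance_id is None:
        raise ValueError("no InstanceId dimension found in the event")
    boto3.client("ec2").stop_instances(InstanceIds=[instance_id])
    return instance_id
```

With this in place one Lambda can serve alarms on several instances, since the instance to stop comes from the alarm that fired.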
In the AWS Console:
Go to EC2, select the EC2 instance and copy the Instance ID
Go to CloudWatch and select Metrics
Under AWS Namespaces click EC2
Paste the Instance ID to find it
Select EC2 > Per-Instance Metrics
Choose the first metric, CPUUtilization
Select the second tab, called Graphed metrics
Click the bell icon under Actions
Set a threshold (this is the hard part); leave the default of Statistic: Average over 1 hour
Set the condition to Lower/Equal and put the value as 20% (the average CPU over the hour must stay above 20%, otherwise the instance will be turned off)
Next, create the alarm; set up a notification if you like, or remove it
Once the alarm is created:
In CloudWatch select Events > Rules
Add a rule
Select EC2 as the Service Name and All Events
Click Target and select your Lambda
When the alarm goes off, the Lambda will turn off the instance
You can set up an AWS CloudWatch alarm that monitors activity. Parameters such as ComparisonOperator, Period, and Threshold can be adjusted according to how you want to monitor your EC2 instance.
Then, you can set up an SQS queue and set a Python Lambda function as its target. Within the Lambda function, you can use boto3 to turn off the EC2 instance. You can read more details here: https://medium.com/geekculture/automatically-turn-off-ec2-instances-upon-inactivity-31fedd363cad
Terraform setup:
https://medium.com/geekculture/terraform-setup-for-automatically-turning-off-ec2-instances-upon-inactivity-d7f414390800
You are looking to add a stop action to your EC2 instance; this can easily be achieved using CloudWatch alarms.
You can do this from the console using the following steps:
Open the Amazon EC2 console
In the navigation pane, choose Instances.
Select the instance and choose Actions, Monitor and troubleshoot, Manage CloudWatch alarms.
Alternatively, you can choose the plus sign (+) in the Alarm status column.
On the Manage CloudWatch alarms page, do the following:
Choose Create an alarm.
To receive an email when the alarm is triggered, for Alarm
notification, choose an existing Amazon SNS topic. You first need to
create an Amazon SNS topic using the Amazon SNS console. For more
information, see Using Amazon SNS for application-to-person (A2P)
messaging in the Amazon Simple Notification Service Developer Guide.
Toggle on the Alarm action, and choose Stop.
For Group samples by and Type of data to sample, choose a statistic
and a metric. In this example, choose Average and CPU utilization.
For Alarm When and Percent, specify the metric threshold. In this
example, specify <= and 10 percent.
For the Consecutive period and Period, specify the evaluation period for
the alarm. In this example, specify 1 consecutive period of 5
Minutes.
Amazon CloudWatch automatically creates an alarm name for you. To
change the name, for the Alarm name, enter a new name. Alarm names must
contain only ASCII characters.
Choose Create.
Note You can adjust the alarm configuration based on your own
requirements before creating the alarm, or you can edit them later.
This includes the metric, threshold, duration, action, and
notification settings. However, after you create an alarm, you
cannot edit its name later.
Check this link from the documentation for terminating the instance in the same way.
You are looking to add a stop action to your EC2 instance; this can easily be achieved using CloudWatch alarms.
Here is how to do that using Terraform:
resource "aws_cloudwatch_metric_alarm" "ec2_cpu" {
  alarm_name                = "StopTheInstanceAfterInactivity"
  metric_name               = "CPUUtilization"
  comparison_operator       = "LessThanOrEqualToThreshold"
  statistic                 = "Average"
  threshold                 = var.threshold
  evaluation_periods        = var.evaluation_periods # The number of periods over which data is compared to the specified threshold
  period                    = var.period # Evaluation period (seconds)
  namespace                 = "AWS/EC2"
  alarm_description         = "This metric monitors ec2 cpu utilization and stops the instance if it is inactive"
  actions_enabled           = true
  alarm_actions             = ["arn:aws:automate:${var.region}:ec2:stop"]
  ok_actions                = [] # do nothing
  insufficient_data_actions = [] # do nothing
  dimensions                = { InstanceId = aws_instance.ec2_instance.id }
}
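For readers who do not use Terraform, roughly the same alarm can be expressed as arguments to boto3's put_metric_alarm. This is a sketch rather than a drop-in replacement: the helper name is hypothetical, and the default threshold, period and evaluation periods are illustrative stand-ins for the Terraform variables.

```python
def stop_alarm_params(instance_id, region, threshold=10.0, period=300,
                      evaluation_periods=1):
    """Build put_metric_alarm arguments that stop an idle instance."""
    return {
        "AlarmName": "StopTheInstanceAfterInactivity",
        "AlarmDescription": "Stops the instance when CPU utilization is low",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "ComparisonOperator": "LessThanOrEqualToThreshold",
        "Threshold": threshold,                    # percent CPU
        "Period": period,                          # seconds per period
        "EvaluationPeriods": evaluation_periods,   # consecutive periods
        "ActionsEnabled": True,
        # Built-in EC2 stop action, same ARN pattern as the Terraform above.
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:stop"],
    }


def create_stop_alarm(instance_id, region):
    """Create the alarm; needs boto3 and AWS credentials (assumption)."""
    import boto3

    client = boto3.client("cloudwatch", region_name=region)
    client.put_metric_alarm(**stop_alarm_params(instance_id, region))
```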

How can we monitor a process with cloudwatch

I have a Java process running on EC2, and I would like to set up an alert in CloudWatch when the process goes down or is in a bad state (e.g. it has not sent a heartbeat to CloudWatch in the last 10 seconds or so).
What is the best way to do this? I think I need custom metrics, but I did not find any documentation specifically about monitoring a process.
I can use the AWS SDK if needed.
You can write a custom script using ps or jps and push that metric to CloudWatch. However, if you need 10-second granularity, standard CloudWatch metrics are not the right fit, since their minimum granularity is 60 seconds; finer granularity requires publishing high-resolution custom metrics.
From: AWS Resource and Custom Metrics Monitoring
Q: What is the minimum granularity for the data that Amazon CloudWatch
receives and aggregates?
The minimum granularity supported by CloudWatch is 1 minute data
points. Many metrics are received and aggregated at 1-minute
intervals. Some are received at 3-minute or 5-minute intervals.
Though it is possible to create an alarm using the CLI or an SDK, I suggest you use the AWS CloudWatch console. Wait for your custom metric to appear in CloudWatch. Once you see it, click Create Alarm, select your metric, and then define your alarm.
The attached image shows Applications as the metric. In your case, it will be whatever name you choose to call it. Under Actions, create a new notification and specify your email. Now if the count goes below 1 for one period, you will get an alarm.
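The custom script mentioned in this answer could be sketched in Python as below. It assumes a Linux host where ps -eo args= prints one command line per process; the function names are hypothetical, and the metric name is whatever you choose (the answer's screenshot uses Applications).

```python
import subprocess


def count_matching(ps_output, needle):
    """Count command lines in ps output that contain the given text."""
    return sum(1 for line in ps_output.splitlines() if needle in line)


def running_count(needle):
    """Count running processes whose command line contains needle."""
    result = subprocess.run(
        ["ps", "-eo", "args="], capture_output=True, text=True, check=True
    )
    return count_matching(result.stdout, needle)


def process_datum(metric_name, count):
    """Build one MetricData entry for a put_metric_data call."""
    return {"MetricName": metric_name, "Value": float(count), "Unit": "Count"}
```

Publishing process_datum("Applications", running_count("app.jar")) once a minute gives the alarm a value that drops below 1 when the process is gone.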
AWS custom metrics can be used to publish the health of the program.
The Java code below can be used to publish the heartbeat. An alarm on the custom metric can then be configured in CloudWatch.
import com.amazonaws.client.builder.AwsClientBuilder;
import com.amazonaws.services.cloudwatch.AmazonCloudWatch;
import com.amazonaws.services.cloudwatch.AmazonCloudWatchClientBuilder;
import com.amazonaws.services.cloudwatch.model.*;
// Client for the CloudWatch endpoint in us-west-1
AmazonCloudWatch amazonCloudWatch = AmazonCloudWatchClientBuilder.standard()
    .withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration(
        "monitoring.us-west-1.amazonaws.com", "us-west-1"))
    .build();
PutMetricDataRequest putMetricDataRequest = new PutMetricDataRequest();
putMetricDataRequest.setNamespace("CUSTOM/SQS");
// One data point with a single dimension
MetricDatum metricDatum1 = new MetricDatum()
    .withMetricName("MessageCount")
    .withDimensions(new Dimension().withName("Personalization").withValue("123"));
metricDatum1.setValue(-1.00);
metricDatum1.setUnit(StandardUnit.Count);
putMetricDataRequest.getMetricData().add(metricDatum1);
PutMetricDataResult result = amazonCloudWatch.putMetricData(putMetricDataRequest);
The best way to monitor a process is to use the procstat plugin of the CloudWatch agent. First create a CloudWatch agent configuration file that points to the process's PID file on the EC2 instance and monitors the memory_rss metric of the process. The idea is that the memory consumption metric never drops to zero or below for a running process.
{
  "agent": {
    "run_as_user": "cwagent"
  },
  "metrics": {
    "metrics_collected": {
      "procstat": [
        {
          "pid_file": "/var/run/sshd.pid",
          "measurement": [
            "cpu_usage",
            "memory_rss"
          ]
        }
      ]
    }
  }
}
Then start the CloudWatch agent and configure the alarm using this AWS documentation.

Alert email when worker fails on AWS EC2

I have an EC2 instance in AWS running CentOS 6, and the only thing on it is supervisor, which maintains a single PHP script. In some cases this script fails, and I see something like this:
$ sudo /usr/local/bin/supervisorctl status
my-worker EXITED Aug 19 10:19 AM
I would like to receive an alert email about this, because my script hasn't worked since Aug 19.
I tried to find something related to health checks, but health checks are available only for load balancers. I also looked in CloudWatch but couldn't find a relevant metric.
Any idea how I can receive an email when my worker goes down?
There isn't an out-of-the-box metric for something like that, as CloudWatch by default only has access to hypervisor-level metrics rather than OS-level metrics such as RAM usage or process-related statistics.
To augment the data in CloudWatch, you could write a small script that checks whether the process is running and then calls PutMetricData to upload that metric to CloudWatch.
Something like this should work:
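#!/bin/bash
# Usage: pass the name of the process to check as the first argument
process_name=$1
# Use date -u so the timestamp really is UTC, as the trailing Z claims
DATE=$(date -u +%Y-%m-%dT%H:%M:%S.000Z)
# Count the PIDs that pidof reports for the process (0 if it is not running)
processes_running=$(pidof "${process_name}" | wc -w)
aws cloudwatch put-metric-data --metric-name "${process_name}_running" --namespace "MyService" --value "${processes_running}" --timestamp "$DATE"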
Then just call that with cron or something similar every minute (or however often you want to update CloudWatch; standard-resolution metrics have 1-minute granularity, so more frequent calls will be aggregated).
Then you just need to create an alarm that performs some action (such as using SNS to send an email to all subscribed addresses, but potentially also something like rebooting the instance).
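Since the worker in the question is managed by supervisor, the check could also read the supervisorctl status output directly instead of using pidof. The following is a minimal sketch under that assumption; the function names are hypothetical, and it relies on each status line starting with the program name followed by its state.

```python
def parse_supervisor_status(text):
    """Map each program name to its state from supervisorctl status output."""
    states = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            states[parts[0]] = parts[1]
    return states


def worker_is_running(text, name):
    """Return 1 if the named program is RUNNING, otherwise 0."""
    return 1 if parse_supervisor_status(text).get(name) == "RUNNING" else 0


# The status line from the question yields 0, which the alarm can act on.
print(worker_is_running("my-worker EXITED Aug 19 10:19 AM", "my-worker"))  # 0
```

The returned 0 or 1 can be published as the metric value in place of the pidof count.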

Use cloudwatch to determine if linux service is running

Suppose I have an ec2 instance with service /etc/init/my_service.conf with contents
script
exec my_exec
end script
How can I monitor that ec2 instance such that if my_service stopped running I can act on it?
You can publish a custom metric to CloudWatch in the form of a "heart beat".
Have a small script run via cron on your server that checks the process list to see whether my_service is running and, if it is, makes a put-metric-data call to CloudWatch.
The metric could be as simple as pushing the number "1" to your custom metric in CloudWatch.
Set up a CloudWatch alarm that triggers if the average for the metric falls below 1.
Make the period of the alarm >= the interval at which cron runs; e.g. if cron runs every 5 minutes, make the alarm trigger when the average is below 1 for two 5-minute periods.
Make sure you also handle the situation in which the metric is not published (e.g. cron fails to run or the whole machine dies). You would want to set up an alert for the case where the metric is missing. (see here: AWS Cloudwatch Heartbeat Alarm)
Be aware that the custom metric adds a cost to your AWS bill (50c at the time this was written; not a big deal for one metric, but the equation changes drastically if you want to push hundreds or thousands of metrics, so it is good to know it is not free, as one might expect).
See here for how to publish a custom metric: http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/publishingMetrics.html
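The heartbeat alarm described in this answer, including the missing-metric case, might be configured along these lines with boto3's put_metric_alarm. This is only a sketch: the helper name, alarm name and defaults are hypothetical. The TreatMissingData setting of "breaching" makes absent data points count as failures, so a dead cron job or machine also raises the alarm.

```python
def heartbeat_alarm_params(metric_name, namespace, topic_arn,
                           cron_minutes=5, periods=2):
    """Build put_metric_alarm arguments for a heartbeat metric."""
    return {
        "AlarmName": f"{metric_name}-heartbeat",   # hypothetical name
        "Namespace": namespace,
        "MetricName": metric_name,
        "Statistic": "Average",
        "Period": cron_minutes * 60,      # match the cron interval
        "EvaluationPeriods": periods,     # e.g. two 5-minute periods
        "Threshold": 1.0,
        "ComparisonOperator": "LessThanThreshold",
        "TreatMissingData": "breaching",  # no data counts as a failure
        "AlarmActions": [topic_arn],      # SNS topic to notify
    }
```

The resulting dictionary is passed as keyword arguments to the CloudWatch client's put_metric_alarm call.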
I am not sure CloudWatch is the right route for checking whether the service is running; it would be easier with a Nagios-style solution.
Nevertheless, you may try the CloudWatch custom metrics approach. You add lines of code that publish, say, the integer 1 to a CloudWatch custom metric every 5 minutes. You can then configure CloudWatch alarms to send an SNS or email notification when a statistic such as Sample Count or Sum deviates from the value you expect.
script
exec my_exec
publish cloudwatch custom metrics value
end script
More Info
Publish Custom Metrics - http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/publishingMetrics.html

Attach EBS volume to EC2 instance when cloudwatch alarm triggers

I have a business case: when an EC2 instance runs out of space, we need to spawn a new EBS volume, attach it to the EC2 instance and format it.
I have created a cron job which keeps sending disk usage to CloudWatch, and I am trying to create an alarm on this custom metric.
Now I am not able to find any information on how to spawn an EBS volume when this alarm triggers.
So I would like to know whether it is possible to spawn an EBS volume when a CloudWatch alarm triggers. If yes, please give some steps or point to the document where I can find this information.
As of now, all I have found is that we can either spawn new instances or send emails whenever an alarm triggers.
You can send a notification to an SNS topic when the CloudWatch alarm fires, and have an SQS queue subscribed to that topic. Then an EC2 instance consuming that SQS queue can perform the desired change using the AWS CLI or SDKs.
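The consumer side of that SNS-to-SQS flow could be sketched as follows. Several things here are assumptions rather than part of the answer: the queue receives the standard SNS JSON envelope (raw message delivery disabled), the custom disk metric is published with an InstanceId dimension, and the volume size, type and device name are placeholders. Formatting the new volume still has to happen on the instance itself.

```python
import json


def instance_id_from_sqs_body(body):
    """Extract the instance ID from an SNS-wrapped CloudWatch alarm message."""
    envelope = json.loads(body)              # SNS envelope stored in SQS
    alarm = json.loads(envelope["Message"])  # alarm details as a JSON string
    if alarm.get("NewStateValue") != "ALARM":
        return None
    for dimension in alarm.get("Trigger", {}).get("Dimensions", []):
        if dimension.get("name") == "InstanceId":
            return dimension.get("value")
    return None


def add_volume(instance_id, size_gib=100, device="/dev/sdf"):
    """Create a volume in the instance's zone and attach it (needs boto3)."""
    import boto3

    ec2 = boto3.client("ec2")
    reservations = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"]
    zone = reservations[0]["Instances"][0]["Placement"]["AvailabilityZone"]
    volume_id = ec2.create_volume(
        AvailabilityZone=zone, Size=size_gib, VolumeType="gp3"
    )["VolumeId"]
    # Wait until the volume is available before attaching it.
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])
    ec2.attach_volume(VolumeId=volume_id, InstanceId=instance_id, Device=device)
    return volume_id
```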