Creating cloudwatch alarms for specific log groups - amazon-web-services

I want to create a cloudwatch alarm that triggers when a specific word, "exception", shows up in my log. I have four different log groups, three for lambda programs and one for an elastic beanstalk instance.
I would like an alarm that triggers only when "exception" shows up in one specific log group. Is this possible, or will my alarm trigger when "exception" shows up in any of the four log groups?

This is possible, with the caveat that your one alarm becomes four alarms.
Here is an example implementation:
You have four log groups: one for each of your three lambdas and your beanstalk instance.
Each log group has a metric filter that increments an associated metric by 1 when "exception" shows up in the logs. The 'Example: Count Log Events' link in the section below describes exactly how to do this.
Each metric has its own distinct alarm that fires when sum > 0 for X minutes.
So you'll have four log groups, four metric filters, four metrics, and four alarms. This allows each of your apps to have its own distinct alerting workflow, so that they won't step on each other.
Further Reading
AWS CloudWatch Logs Documentation - Example: Count Log Events
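As a rough illustration of the one-filter-and-one-alarm-per-log-group layout described above, here is a minimal sketch using boto3. The log group names, the `LogMetrics` namespace, and the naming scheme are assumptions made for the example, not anything from the question. The filter pattern `"exception"` is case-sensitive, and no alarm actions (such as an SNS topic) are attached here.

```python
def build_filter_and_alarm(log_group, namespace="LogMetrics", period_seconds=300):
    """Return (put_metric_filter kwargs, put_metric_alarm kwargs) for one log group."""
    # Hypothetical naming scheme: "/aws/lambda/app-a" -> "aws-lambda-app-a"
    slug = log_group.strip("/").replace("/", "-")
    metric_name = f"{slug}-exceptions"
    filter_kwargs = {
        "logGroupName": log_group,
        "filterName": f"{slug}-exception-filter",
        "filterPattern": '"exception"',  # matches log events containing this term
        "metricTransformations": [{
            "metricName": metric_name,
            "metricNamespace": namespace,
            "metricValue": "1",  # add 1 to the metric per matching event
        }],
    }
    alarm_kwargs = {
        "AlarmName": f"{slug}-exception-alarm",
        "Namespace": namespace,
        "MetricName": metric_name,
        "Statistic": "Sum",
        "Period": period_seconds,
        "EvaluationPeriods": 1,
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",  # fires when sum > 0
        "TreatMissingData": "notBreaching",  # no matching logs means no alarm
    }
    return filter_kwargs, alarm_kwargs


def apply_to_log_groups(log_groups):
    """Create one metric filter and one alarm per log group (needs AWS credentials)."""
    import boto3  # imported here so the builder above works without boto3 installed

    logs = boto3.client("logs")
    cloudwatch = boto3.client("cloudwatch")
    for group in log_groups:
        filter_kwargs, alarm_kwargs = build_filter_and_alarm(group)
        logs.put_metric_filter(**filter_kwargs)
        cloudwatch.put_metric_alarm(**alarm_kwargs)
```

Calling `apply_to_log_groups` with your four log group names would produce the four filters, four metrics, and four alarms; because each alarm watches its own metric, an exception in one log group does not trip the others.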

Related

How to Monitor EKS Node group Status in CloudWatch

I'm trying to monitor the status of my EKS node groups. Sometimes a node group shows as Degraded, and I want a CloudWatch alert whenever that happens. I checked CloudWatch Metrics and found no standard metric for it, and I can't find a corresponding event in CloudTrail either.
Is it possible to create such an alarm using CloudTrail events, EventBridge, or CloudWatch?
Any help finding a solution would be appreciated.
For CloudWatch, please take a look at this:
https://docs.aws.amazon.com/de_de/AmazonCloudWatch/latest/monitoring/deploy-container-insights-EKS.html
I think you can combine Lambda, CloudWatch, and EventBridge to implement a simple health check for one or more node groups.
For your health check Lambda function:
We create a Lambda function with Python 3 (3.9, for example).
We describe the node group using Boto3.
We put a custom metric to CloudWatch: 1 if the status is Active, otherwise 0.
When the function is ready, we schedule it to run every minute (the interval is up to you).
We create an EventBridge (EB) rule that triggers every minute.
The EB rule target is the Lambda function.
Once CloudWatch has enough data points for the metric, we can create a CloudWatch alarm that notifies us by e-mail or another channel.
References:
https://stackify.com/custom-metrics-aws-lambda/
https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-run-lambda-schedule.html
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/eks.html#EKS.Client.describe_nodegroup
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/cloudwatch.html
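The health-check function described in this answer could look roughly like the sketch below. The environment variable names, the `Custom/EKS` namespace, and the `NodegroupActive` metric name are assumptions chosen for the example. The EKS API reports the status in upper case (`ACTIVE`, `DEGRADED`, and so on).

```python
import os


def status_to_metric(status):
    """Map a node group status to the metric value: 1 if active, otherwise 0."""
    return 1 if status == "ACTIVE" else 0


def build_metric_data(cluster, nodegroup, status):
    """Build the MetricData list for put_metric_data."""
    return [{
        "MetricName": "NodegroupActive",  # hypothetical metric name
        "Dimensions": [
            {"Name": "ClusterName", "Value": cluster},
            {"Name": "NodegroupName", "Value": nodegroup},
        ],
        "Value": status_to_metric(status),
    }]


def handler(event, context):
    """Lambda entry point, meant to be invoked by a scheduled EventBridge rule."""
    import boto3  # available by default in the Lambda Python runtime

    # Hypothetical environment variables set on the function
    cluster = os.environ["CLUSTER_NAME"]
    nodegroup = os.environ["NODEGROUP_NAME"]

    response = boto3.client("eks").describe_nodegroup(
        clusterName=cluster, nodegroupName=nodegroup
    )
    status = response["nodegroup"]["status"]

    boto3.client("cloudwatch").put_metric_data(
        Namespace="Custom/EKS",  # hypothetical namespace
        MetricData=build_metric_data(cluster, nodegroup, status),
    )
    return {"status": status}
```

An alarm on `NodegroupActive` with a threshold of less than 1 would then fire whenever the node group is in any state other than active, including Degraded. The function's role needs permission for `eks:DescribeNodegroup` and `cloudwatch:PutMetricData`.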

Single lambda, multiple cloudwatch log groups

After we run an AWS lambda, a single cloudwatch log group is populated. Is there a way we can populate two (different) cloudwatch log groups from a single AWS lambda? I searched about it but couldn't find an answer. Let me know if it is possible.
It is not possible to specify two log groups for a single Lambda function.
If you need the logs in two CloudWatch log groups, you have to get creative: create a subscription filter on the first log group that streams its logs to a second Lambda function. That function receives the logs as its payload and can then write them to another CloudWatch log group.
More info about cloudwatch subscriptions: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/SubscriptionFilters.html
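A minimal sketch of the second Lambda in that setup follows. CloudWatch Logs delivers the subscription payload base64-encoded and gzip-compressed under `event["awslogs"]["data"]`. The `DEST_LOG_GROUP` and `DEST_LOG_STREAM` environment variables are assumptions for the example, and the destination group and stream are assumed to exist already.

```python
import base64
import gzip
import json
import os


def decode_subscription_event(event):
    """Decode the base64 + gzip payload that CloudWatch Logs sends to Lambda."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


def to_put_log_events(payload):
    """Convert subscription log events to the shape put_log_events expects."""
    return [
        {"timestamp": e["timestamp"], "message": e["message"]}
        for e in payload["logEvents"]
    ]


def handler(event, context):
    """Copy the received log events into a second log group."""
    import boto3  # available by default in the Lambda Python runtime

    payload = decode_subscription_event(event)
    if payload.get("messageType") != "DATA_MESSAGE":
        return  # skip control messages, which carry no application logs

    boto3.client("logs").put_log_events(
        logGroupName=os.environ["DEST_LOG_GROUP"],    # hypothetical variable
        logStreamName=os.environ["DEST_LOG_STREAM"],  # hypothetical variable
        logEvents=to_put_log_events(payload),
    )
```

Take care not to subscribe the forwarding function's own log group to itself, since that would create a loop.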

Regular expressions for CloudWatch alarms

I have a microservice that sends some custom metrics to AWS CloudWatch. Each metric name consists of a package name and some other data, for example gauge.com.example.test.time, gauge.com.example.test2.time, and so on.
Now I need to create some alarms based on these metrics. Is it possible to specify a regular expression in the metric name field when creating a CloudWatch alarm, instead of manually creating a separate alarm for each metric?
I tried things such as gauge.com.example..time, gauge.com.example.*.time, gauge.com.example.(\w).time, and many others, but without success.
It is now possible to create CloudWatch alarms based on a metric math expression.
https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/Create-alarm-on-metric-math-expression.html
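To illustrate the metric math route, here is a sketch that builds a single alarm over several explicitly listed metrics. It is not a regular expression match: each metric name still has to be listed, and as far as I know alarms cannot be based on a `SEARCH` expression. The `MyService` namespace, the alarm name, and the threshold are assumptions for the example.

```python
def build_math_alarm(metric_names, namespace, threshold, period_seconds=60):
    """Build put_metric_alarm kwargs that alarm on the maximum of several metrics."""
    queries = []
    for i, name in enumerate(metric_names):
        queries.append({
            "Id": f"m{i}",  # ids must start with a lowercase letter
            "MetricStat": {
                "Metric": {"Namespace": namespace, "MetricName": name},
                "Period": period_seconds,
                "Stat": "Average",
            },
            "ReturnData": False,  # input only; the alarm does not watch it directly
        })
    ids = ",".join(q["Id"] for q in queries)
    queries.append({
        "Id": "worst",
        "Expression": f"MAX([{ids}])",  # highest value across the listed metrics
        "Label": "Highest gauge time",
        "ReturnData": True,  # the one series the alarm evaluates
    })
    return {
        "AlarmName": "gauge-time-too-high",  # hypothetical alarm name
        "Metrics": queries,
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
    }


def create_alarm(alarm_kwargs):
    """Send the alarm definition to CloudWatch (needs AWS credentials)."""
    import boto3

    boto3.client("cloudwatch").put_metric_alarm(**alarm_kwargs)
```

For example, `build_math_alarm(["gauge.com.example.test.time", "gauge.com.example.test2.time"], "MyService", 500)` yields one alarm that fires when either gauge exceeds the threshold. The trade-off is that the alarm does not tell you which metric breached.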
Not possible. From "Creating Amazon CloudWatch Alarms":
You can create a CloudWatch alarm that watches a single metric.

How does Amazon CloudWatch batch logs when streaming to AWS Lambda?

The AWS documentation indicates that multiple log event records are provided to Lambda when streaming logs from CloudWatch.
logEvents
The actual log data, represented as an array of log event records. The "id" property is a unique identifier for every log event.
How does CloudWatch group these logs?
Time? Count? Randomly, from my perspective?
Currently you get one Lambda invocation for every PutLogEvents batch that CloudWatch Logs receives for that log group. However, you should probably not rely on that, because AWS could change it at any time (for example, by batching more events together).
You can observe this behavior by running the CWL -> Lambda example in the AWS docs.
Some AWS services, such as Elastic Load Balancing, let you configure the log interval; there you can choose between five- and sixty-minute intervals. You may not see a specific increment or parameter in the docs because the intervals are configured per service.
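One way to observe the batching yourself is to have the subscribed Lambda report how many events arrived in each invocation and how much time they span. The sketch below assumes the standard subscription payload layout (base64-encoded, gzip-compressed JSON under `event["awslogs"]["data"]`); the function names are made up for the example.

```python
import base64
import gzip
import json


def summarize_batch(event):
    """Return the stream name, event count, and time span of one invocation's batch."""
    payload = json.loads(gzip.decompress(base64.b64decode(event["awslogs"]["data"])))
    timestamps = [e["timestamp"] for e in payload["logEvents"]]
    span_ms = max(timestamps) - min(timestamps) if timestamps else 0
    return {
        "logStream": payload["logStream"],
        "count": len(timestamps),
        "span_ms": span_ms,  # milliseconds between first and last event
    }


def handler(event, context):
    """Lambda entry point: print one summary line per invocation."""
    print(json.dumps(summarize_batch(event)))
```

Comparing these summaries with the way the producer calls PutLogEvents should show whether the one-invocation-per-batch behavior still holds for your log group.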

Use cloudwatch to determine if linux service is running

Suppose I have an EC2 instance with a service defined in /etc/init/my_service.conf with these contents:
script
exec my_exec
end script
How can I monitor that ec2 instance such that if my_service stopped running I can act on it?
You can publish a custom metric to CloudWatch in the form of a "heart beat".
Have a small script running via cron on your server that checks the process list to see whether my_service is running and, if it is, makes a put-metric-data call to CloudWatch.
The metric could be as simple as pushing the number "1" to your custom metric in CloudWatch.
Set up a CloudWatch alarm that triggers if the average for the metric falls below 1.
Make the period of the alarm greater than or equal to the interval at which cron runs. For example, if cron runs every 5 minutes, make the alarm fire when the average is below 1 for two 5-minute periods.
Make sure you also handle the situation in which the metric is not published at all (e.g. cron fails to run or the whole machine dies). You will want an alert for missing data as well (see here: AWS Cloudwatch Heartbeat Alarm).
Be aware that the custom metric adds a cost of 50 cents to your AWS bill. That is not a big deal for one metric, but the equation changes drastically if you want to push hundreds or thousands of metrics, so it is good to know that it is not free.
See here for how to publish a custom metric: http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/publishingMetrics.html
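The heartbeat approach above can be sketched as follows. The `Custom/Heartbeat` namespace, the metric and alarm names, and the use of `pgrep` to check the process list are assumptions for the example. The alarm treats missing data as breaching, so it also fires when no heartbeat arrives at all.

```python
import subprocess


def is_running(process_name):
    """Return True if a process with exactly this name exists (uses pgrep -x)."""
    result = subprocess.run(["pgrep", "-x", process_name], stdout=subprocess.DEVNULL)
    return result.returncode == 0


def heartbeat_metric(instance_id):
    """Build put_metric_data kwargs for a single heartbeat data point of 1."""
    return {
        "Namespace": "Custom/Heartbeat",  # hypothetical namespace
        "MetricData": [{
            "MetricName": "MyServiceHeartbeat",  # hypothetical metric name
            "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
            "Value": 1,
        }],
    }


def heartbeat_alarm(instance_id, period_seconds=300):
    """Build put_metric_alarm kwargs; a missing heartbeat counts as a breach."""
    return {
        "AlarmName": f"my-service-heartbeat-{instance_id}",
        "Namespace": "Custom/Heartbeat",
        "MetricName": "MyServiceHeartbeat",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": period_seconds,  # at least as long as the cron interval
        "EvaluationPeriods": 2,    # two consecutive periods before alarming
        "Threshold": 1,
        "ComparisonOperator": "LessThanThreshold",
        "TreatMissingData": "breaching",  # cron failure or dead machine alarms too
    }


def main(instance_id, process_name="my_exec"):
    """Cron entry point: publish a heartbeat only while the service is running."""
    import boto3  # imported here so the builders above work without boto3

    if is_running(process_name):
        boto3.client("cloudwatch").put_metric_data(**heartbeat_metric(instance_id))
```

The cron job would call `main` with the instance's ID every 5 minutes, and the alarm built by `heartbeat_alarm` would be created once with `put_metric_alarm`.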
I am not sure CloudWatch is the right route for checking whether the service is running; it would be easier with a Nagios-style solution.
Nevertheless, you may try the CloudWatch custom metrics approach. You add lines of code that publish, say, the integer 1 to CloudWatch custom metrics every 5 minutes. You can then configure CloudWatch alarms to send an SNS or mail notification when a statistic such as Sample Count or Sum deviates from the value you expect.
script
exec my_exec
publish cloudwatch custom metrics value
end script
More Info
Publish Custom Metrics - http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/publishingMetrics.html