AWS custom logging - amazon-web-services

Environment – Two Different ec2 instances running tomcat separately.
Requirement – If there is any Error in logs – we should get an alert.
Implementation –
We implemented AWS customer logging for this which is successfully sending alerts on the Error Pattern Matching.
It automatically created a log groups – “/opt/tomcat/logs/catalina.out”.
Under this log group – there are two log streams – two instances separately showing.
Problem –
Now I want separate alarm for separate instances
Problem is when I create an alarm – it does not let me choose the instance. It takes both instance by default, which means one alarm – monitoring both instances simultaneously. And sending alert without mentioning instance name. so it is difficult to find which instance has actually sent alert.
And the second problem is - we created few log metrics for testing – like on keyword – info – which we want to delete and not able to do so.

It appears that you are using the CloudWatch Logs functionality that permits automated sending of log files from an EC2 instance (or elsewhere) to the CloudWatch service. CloudWatch Logs can then be configured to look for strings in the log files, which will trigger the recording of metrics.
To create separate alarms for separate instances, each EC2 instance should be configured to use a different CloudWatch Log stream. The CloudWatch Logs agent takes a Destination Log Group name.
See: Quick Start: Install and Configure the CloudWatch Logs Agent on an Existing EC2 Instance
As for the metrics that you wish to delete, it is not possible to delete metrics from Amazon CloudWatch. However, metrics will automatically disappear after 14 days.

Related

Confusion on AWS Cloudwatch and Application Logs

I have an on-premise app deployed in an Application Server (e.g. Tomcat) and it generates its own log file. If I decide to migrate this to an AWS EC2, including the Application Server, is it possible to port my application logs in Cloudwatch instead? or is Cloudwatch only capable of logging the runtime logs in my application server? is it a lot of work to do this or is this even possible?
Kind of confuse on Cloudwatch. Seems it can do multiple things but is it really right to make it do that? Its only supposed to log metrics right, so it can alert whatever or whoever needs to be alerted.
If you have already developed application that produces its own log files, you can use CloudWatch Logs Agent to ingest the logs into CloudWatch Logs:
After installation is complete, logs automatically flow from the instance to the log stream you create while installing the agent. The agent confirms that it has started and it stays running until you disable it.
The metrics, such as RAM usage, disk space, can also be monitored and pushed to CloudWatch through the agent.
In both cases, logs and metrics, you can setup CloudWatch Alarms to automatically detect anomalies and notify you, or perform other actions, when they are detected. For logs, this is done through metric filters:
You can search and filter the log data coming into CloudWatch Logs by creating one or more metric filters. Metric filters define the terms and patterns to look for in log data as it is sent to CloudWatch Logs. CloudWatch Logs uses these metric filters to turn log data into numerical CloudWatch metrics that you can graph or set an alarm on.
update
You can also have your application to inject logs directly to CloudWatch logs using AWS SDK. For example, in python, you can use put_log_events.

What is Log Mechanism in AWS cloud watch?

I have recently started learning about AWS cloud watch and I want to understand the concept of creating Logs so I went through a lot of links like
https://aws.amazon.com/answers/logging/centralized-logging/
I could understand that we can create log groups but and logs are basically to track activity. Is there anything more to it. When do the logs get created.
Any help would be highly appreciated!
You can get more details about Log Groups and CloudWatch Logs Concepts here
Following is the extract from that page
Log Events
A log event is a record of some activity recorded by the application or resource being monitored. The log event record that
CloudWatch Logs understands contains two properties: the timestamp of
when the event occurred, and the raw event message. Event messages
must be UTF-8 encoded.
Log Streams
A log stream is a sequence of log events that share the same source. More specifically, a log stream is generally intended to
represent the sequence of events coming from the application instance
or resource being monitored. For example, a log stream may be
associated with an Apache access log on a specific host. When you no
longer need a log stream, you can delete it using the aws logs
delete-log-stream command. In addition, AWS may delete empty log
streams that are over 2 months old.
Log Groups
Log groups define groups of log streams that share the same retention, monitoring, and access control settings. Each log stream
has to belong to one log group. For example, if you have a separate
log stream for the Apache access logs from each host, you could group
those log streams into a single log group called
MyWebsite.com/Apache/access_log.
And to answer your question "When do the logs get created.", basically that is completely dependent on your application. However, whenever they are created they get streamed to cloudwatch streams (if you have installed the cloudwatch agent and are streaming that particular log)
The advantage of using cloudwatch is that you can retain logs even after your EC2 instance is terminated and you dont need to SSH into the resource to check the logs, you can simply get that from AWS Console

How does Amazon CloudWatch batch logs when streaming to AWS Lambda?

The AWS documentation indicates that multiple log event records are provided to Lambda when streaming logs from CloudWatch.
logEvents
The actual log data, represented as an array of log event
records. The "id" property is a unique identifier for every log event.
How does CloudWatch group these logs?
Time? Count? Randomly, from my perspective?
Currently you get one Lambda invocation for every PutLogEvents batch that CloudWatch Logs had received against that log group. However you should probably not rely on that because AWS could always change it (for example batch more, etc).
You can observe this behavior by running the CWL -> Lambda example in the AWS docs.
Some aws services allow you to configure the log intervals such as elastic load balancing. There's a choice between five and sixty minute log intervals. You may not see a specific increment or parameter in the docs because they are configurable based on each service.

Stopping EC2 instance when custom cloudwatch metric passes limit

I'm trying to find a way to make an Amazon EC2 instance stop automatically when a certain custom metric on CloudWatch passes a limit. So far if I've understood correctly based on these articles:
Discussion Forum: Custom Metric EC2 Action
CloudWatch Documentation: Create Alarms to Stop, Terminate, Reboot, or Recover an Instance
This will only work if the metric is defined as follows:
Tied to certain instance
With type of System/Linux
However in my case I have a custom metric that is actually not instance-related but "global" and if a certain limit is passed, I would need to stop all instances, no matter from which instance the limiting log is received.
Does anybody know if there is way to make this work? What I'd need is some way to make CloudWatch work like this:
If arbitrary custom metric value passes a certain limit -> stop defined instances not tied to the metric itself.
The main problem is that the EC2 option is greyed out as the metric is not tied to certain EC2 instance and I'm not sure if there's any way to do this without actually making the metric itself certain instance related.
Have the custom CloudWatch metric post alerts to an SNS topic.
Have the SNS topic trigger a Lambda function that shuts down your EC2 instances via a call to the AWS API.

How do I set up CloudWatch to detect when an EC2 instance goes down?

I've got an app running on AWS. How do I set up Amazon CloudWatch to notify me when the EC2 instance fails or is no longer responsive?
I went through the CloudWatch screens, and it appears that you can monitor certain statistics, like CPU or disk utilization, but I didn't see a way to monitor an event like "the instance got an http request and took more than X seconds to respond."
Amazon's Route 53 Health Check is the right tool for the job.
Route 53 can monitor the health and performance of your application as well as your web servers and other resources.
You can set up HTTP resource checks in Route 53 that will trigger an e-mail notification if the server is down or responding with an error.
http://eladnava.com/monitoring-http-health-email-alerts-aws/
To monitor an event in CloudWatch you create an Alarm, which monitors a metric against a given threshold.
When creating an alarm you can add an "action" for sending a notification. AWS handles notifications through SNS (Simple Notification Service). You can subscribe to a notification topic and then you'll receive an email for you alarm.
For EC2 metrics like CPU or disk utilization this is the guide from the AWS docs: http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/US_AlarmAtThresholdEC2.html
As answered already, use an ELB to monitor HTTP.
This is the list of available metrics for ELB:
http://docs.aws.amazon.com/ElasticLoadBalancing/latest/DeveloperGuide/US_MonitoringLoadBalancerWithCW.html#available_metrics
To answer your specific question, for monitoring X seconds for the http response, you would set up an alarm to monitor the ELB "Latency".
CloudWatch monitoring is just like you have discovered. You will be able to infer that one of your instances is frozen by taking a look at the metrics, but CloudWatch won't e.g. send you an email when your app is down or too slow, for example.
If you are looking for some sort of notification when your app or instance is down, I suggest you to use a monitoring service. Pingdom is a good option. You can also set up a new instance on AWS and install a monitoring tool, like Nagios, which would be my preferred option.
Good practices that are always worth, in the long road: using load balancing (Amazon ELB), more than one instance running your app, Autoscaling (when an instance is down, Amazon will automatically start a new one and maintain your SLA), and custom monitoring.
My team has used a custom monitoring script for a long time, and we always knew of failures as soon as they occurred. Basically, if we had two nodes running our app, node 1 sent HTTP requests to node 2 and node 2 to 1. If any request took more than expected, or returned an unexpected HTTP status or response body, the script sent an email to the system admins. Nowadays, we rely on more robust approaches, like Nagios, which can even monitor operating system stuff (threads, etc), application servers (connection pools health, etc) and so on. It's worth every cent invested in setting it up.
CloudWatch recently added "status check" metrics that will answer one of your questions on whether an instance is down or not. It will not do a request to your Web server but rather a system check. As previous answer suggest, use ELB for HTTP health checks.
You could always have another instance for tools/testing, that instance would try the http request based on a schedule and measure the response time, then you could publish that response time with CloudWatch and set an alarm when it goes over a certain threshold.
You could even do that from the instance itself.
As Kurst Ursan mentioned above, using "Status Check" metrics is the way to go. In some cases you won't be able to browse that metrics (i.e if you;re using AWS OpsWorks), so you're going to have to report that custom metric on your own. However, you can set up an alarm built on a metric that always matches (in an OK sate) and have the alarm trigger when the state changes to "INSUFFICIENT DATA" state, this technically means CloudWatch can't tell whether the state is OK or ALARM because it can't reach your instance, AKA your instance is offline.
There are a bunch of ways to get instance health info. Here are a couple.
Watch for instance status checks and EC2 events (planned downtime) in the EC2 API. You can poll those and send to Cloudwatch to create an alarm.
Create a simple daemon on the server which writes to DynamoDB every second (has better granularity than Cloudwatch). Have a second process query the heartbeats and alert when missing.
Put all instances in a load balancer with a dummy port open that that gives a TCP response. Setup TCP health checks on the ELB, and alert on unhealthy instances.
Unless you use a product like Blue Matador (automatically notifies you of production issues), it's actually quite heinous to set something like this up - let alone maintain it. That said, if you're going down the road, and want some help getting started using Cloudwatch (terminology, alerts, logs, etc), start with this blog: How to Monitor Amazon EC2 with CloudWatch
You can use CloudWatch Event Rule to Monitor whenever any EC2 instance goes down. You can create an Event rule from CloudWatch console as following :
In the CLoudWatch Console choose Events -> rule
For Event Pattern, In service Name Choose EC2
For Event Type, Choose EC2 Instance State-change Notification
For Specific States, Choose Stopped
In targets Choose any previously created SNS topic for sending a notification!
Source : Create a Rule - https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/CloudWatch-Events-Input-Transformer-Tutorial.html#input-transformer-create-rule
This is not exactly a CloudWatch alarm, however this serves the purpose of monitoring/notification.