So we have logs (apache, tomcat, etc) stored in Amazon CloudWatch Logs.
I'm trying to use Logstash to pull these logs from AWS and send them over to Elasticsearch/Kibana.
I can't seem to find a plugin to accomplish this.
Has anyone tried this successfully?
I don't want the metrics, just the logs stored in AWS Logs.
Other posters have mentioned that CloudFormation templates are available that will stream your logs to Amazon Elasticsearch, but if you want to go through Logstash first, this logstash plugin may be of use to you:
https://github.com/lukewaite/logstash-input-cloudwatch-logs/
This plugin allows you to ingest specific CloudWatch Log Groups, or a series of groups that match a prefix into your Logstash pipeline, and work with the data as you will. It is published on RubyGems, and can be installed like a normal Logstash plugin: bin/logstash-plugin install logstash-input-cloudwatch_logs.
As already pointed out by BMW, AWS has just introduced a dedicated CloudWatch Logs Subscription Consumer, which provides one-click access to a complete CloudWatch Logs + Elasticsearch + Kibana stack by means of a corresponding AWS CloudFormation template, as further illustrated in the introductory blog post.
Given you seem to have an ELK stack readily available, it shouldn't be too complex to adjust the AWS sample template to target your own endpoints instead.
In order to use the CloudFormation template (as per BMW's answer), it needs to be customized. Part of this is providing your account ID and region as a CF Resource.
AWS::AccountId and AWS::Region are pseudo parameters that return the AWS account ID of the account in which the stack is being created, such as 123456789012, and a string representing the AWS Region in which the encompassing resource is being created, such as us-west-2. (http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/pseudo-parameter-reference.html)
I have a view on a PostgreSQL RDS instance that lists any ongoing deadlocks. Ideally, there are no deadlocks in the database, causing the view to show nothing, but on rare occasions, there are.
How would I set up an alarm in CloudWatch to query this view and raise an alarm if any records are returned?
I found a useful script on GitHub specifically for this:
A Serverless MySQL RDS Data Collection script to push Custom Metrics to CloudWatch on AWS
Basically, there are 2 main possibilities to publish any custom metrics on CloudWatch:
Via API
You can run it on a schedule on an EC2 instance (AWS example) or as a Lambda function (great manual with code examples)
With CloudWatch agent
Here is a good example: Monitor your Microsoft SQL Server using custom metrics with Amazon CloudWatch and AWS Systems Manager.
Finally, you should set up CloudWatch alarms with Metric Math and relevant thresholds.
It is not possible to configure Amazon CloudWatch to look inside an Amazon RDS database.
You will need some code running somewhere that regularly runs a query on the database and sends a custom metric to Amazon CloudWatch.
For example, you could trigger an AWS Lambda function, or use cron on an Amazon EC2 instance to trigger a script.
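As a rough illustration of that approach, here is a minimal Python sketch of a Lambda handler that counts the rows in the view and publishes the count as a custom metric. The view name `deadlocks`, the `DATABASE_URL` and `DEADLOCK_VIEW` environment variables, the `Custom/RDS` namespace and the `DeadlockCount` metric name are all assumptions made for the example, and it presumes psycopg2 is packaged with the function.

```python
def count_rows(cursor, view_name):
    """Return the number of rows currently in the given view."""
    # view_name is assumed to come from trusted configuration, not user input.
    cursor.execute(f"SELECT COUNT(*) FROM {view_name}")
    return cursor.fetchone()[0]


def publish_deadlock_count(cloudwatch, count, namespace="Custom/RDS"):
    """Send the row count to CloudWatch as a custom metric."""
    cloudwatch.put_metric_data(
        Namespace=namespace,  # hypothetical namespace
        MetricData=[
            {"MetricName": "DeadlockCount", "Value": count, "Unit": "Count"}
        ],
    )


def handler(event, context):
    """Lambda entry point, meant to be triggered on a schedule."""
    # Imported here so the helpers above can be exercised without these packages.
    import os
    import boto3
    import psycopg2

    conn = psycopg2.connect(os.environ["DATABASE_URL"])  # assumed env variable
    try:
        with conn.cursor() as cur:
            count = count_rows(cur, os.environ.get("DEADLOCK_VIEW", "deadlocks"))
    finally:
        conn.close()
    publish_deadlock_count(boto3.client("cloudwatch"), count)
    return count
```

With the metric in place, an ordinary CloudWatch alarm on `DeadlockCount` greater than 0 would cover the "raise an alarm if any records return" requirement.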
I use the CloudWatch service to monitor logs for my running EC2 instances, but the CloudWatch web console does not seem to have a button for downloading or exporting the log data.
Any ideas how I can achieve this goal through CLI or GUI?
Programmatically, using boto3 (Python),
import boto3
log_client = boto3.client('logs')
result_1 = log_client.describe_log_streams(logGroupName='<NAME>')
(I don't know what log group names for EC2 instances look like; for Lambda they are of the form '/aws/lambda/FuncName'. Try grabbing the names you see in the console).
result_1 contains two useful keys: logStreams (the result you want) and nextToken (for pagination, I'll let you look up the usage).
Now result_1['logStreams'] is a list of objects containing a logStreamName. Also useful are firstEventTimestamp and lastEventTimestamp.
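For the pagination part, a minimal sketch of following nextToken until every stream name has been collected might look like this. The helper name `list_stream_names` is made up for the example; it takes the client as an argument so it works with the `log_client` created earlier.

```python
def list_stream_names(log_client, group):
    """Collect every logStreamName in a log group, following nextToken."""
    names = []
    kwargs = {"logGroupName": group}
    while True:
        page = log_client.describe_log_streams(**kwargs)
        names.extend(s["logStreamName"] for s in page["logStreams"])
        # nextToken is only present while more pages remain.
        if "nextToken" not in page:
            break
        kwargs["nextToken"] = page["nextToken"]
    return names
```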
Now that you have log stream names, you can use
log_client.get_log_events(logGroupName='<name>', logStreamName='<name>')
The response contains nextForwardToken and nextBackwardToken for pagination, and events for the log events you want. Each event contains a timestamp and a message.
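To actually download a whole stream, the forward token has to be followed until it stops changing. Here is a hedged sketch of that loop, writing each event to a local file; the function names and the `<timestamp> <message>` line format are choices made for this example, not anything prescribed by the API.

```python
def iter_log_events(log_client, group, stream):
    """Yield every event in one log stream, oldest first."""
    kwargs = {"logGroupName": group, "logStreamName": stream, "startFromHead": True}
    while True:
        page = log_client.get_log_events(**kwargs)
        yield from page["events"]
        token = page["nextForwardToken"]
        # The end of the stream is reached when the same token comes back.
        if token == kwargs.get("nextToken"):
            break
        kwargs["nextToken"] = token


def dump_stream(log_client, group, stream, path):
    """Append each event to a text file as '<timestamp> <message>'."""
    written = 0
    with open(path, "a", encoding="utf-8") as out:
        for event in iter_log_events(log_client, group, stream):
            out.write(f"{event['timestamp']} {event['message'].rstrip()}\n")
            written += 1
    return written
```

Usage would be something like `dump_stream(boto3.client('logs'), '<group name>', '<stream name>', 'events.log')`.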
I'll leave it to you to look up the API to see what other parameters might be useful to you. By the way, the console will let you stream your logs to an S3 bucket or to AWS's Elasticsearch service. Elasticsearch is a joy to use, and Kibana's UI is intuitive enough that you can get results even without learning its query language.
You can use the console or the AWS CLI to download CloudWatch logs to Amazon S3. You do need to know the log group name, from & to timestamps in the log, destination bucket and prefix. Amazon recommends a separate S3 bucket for your logs. Once you have a bucket you create an export task, under (in the console) Navigation - Logs - select your log group - Actions - Export data to S3 - fill in the details for your export - select Export data. Amazon's documentation explains it pretty well at: http://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/S3Export.html. And CLI instructions are there also if you wanted to use that. I imagine with the CLI you could also script your export, but you would have to define the variables somehow so you don't overwrite an existing export.
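If you do script the export, one way to avoid overwriting an earlier export is to derive the destination prefix from the time range. The sketch below uses boto3 rather than the CLI and only builds the arguments for `create_export_task`; the `exports/...` prefix layout and the helper names are assumptions for illustration.

```python
from datetime import datetime, timezone


def to_millis(dt):
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return int(dt.timestamp() * 1000)


def export_task_params(group, bucket, start, end):
    """Build create_export_task arguments with a prefix unique to the time range."""
    # Hypothetical layout: exports/<group>/<start>-<end>
    prefix = f"exports/{group.strip('/')}/{start:%Y%m%dT%H%M%S}-{end:%Y%m%dT%H%M%S}"
    return {
        "logGroupName": group,
        "fromTime": to_millis(start),
        "to": to_millis(end),
        "destination": bucket,
        "destinationPrefix": prefix,
    }
```

The result can be passed straight through, for example `boto3.client('logs').create_export_task(**export_task_params(group, bucket, start, end))`, provided the bucket policy already allows CloudWatch Logs to write to it as described in the linked documentation.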
If this is part of your overall AWS disaster recovery planning, you might want to check out some tips & best practices, such as Amazon's white paper on AWS disaster recovery, and NetApp's discussion of using the cloud for disaster recovery.
We are hosting our services on AWS Beanstalk managed instances. That is forcing us to move away from file-based logging to database-based logging.
Is DynamoDB a good choice for replacing file-based logging? If so, what should the primary key be? I thought of using a timestamp, but multiple messages may be logged by the same service with the same timestamp, so that might not be reliable.
Any advice would be appreciated.
Don't use DynamoDB to store logs. You'll be paying for throughput and space needlessly.
Amazon CloudWatch has built-in logging capabilities.
http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/WhatIsCloudWatchLogs.html
Another alternative is a dedicated logging service such as Loggly which is cloud-based and can receive logs in many common formats, plus they have an API to send custom logs. In the web-based console, you can search and filter through the logs.
As an alternative, why don't you use cloudwatch? I ended up writing a whole app to consolidate logs across ec2 instances in a beanstalk app, then last year AWS opened up cloudwatch as a service, so I junked my stuff. You tell cloudwatch where your logs are on the instance, give it a log group and stream name, and all your logs are consolidated in one spot, in cloudwatch. You can also run alarms off them using the standard AWS setup. It's pretty slick, and easy - don't have to write a front end to do lookups, it's already there.
Don't know what you're using for logging - we are a node.js shop, used winston for logging, and there is a nice NPM module that works with Winston to log automatically, called winston-cloudwatch.
How do I fetch logs (AWS VPC logs) from AWS that are visible in CloudWatch? I am confused about which API to use. The CloudWatch API is about fetching metrics, not log events.
Could someone help me with a Java example that fetches logs and appends them to a file? I have my own logging infrastructure, for which I am using logstash-statsD-graphite.
You need to use the AWSLogs client, in the package com.amazonaws.services.logs : http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/logs/AWSLogs.html
You have the GetLogEventsRequest object to perform the request, and everything you need to paginate. You'll get a list of OutputLogEvent with timestamps and messages (and as far as I know, each message should be a VPC flow record).
The full API doc is here: http://docs.aws.amazon.com/AmazonCloudWatchLogs/latest/APIReference/Welcome.html
Hope this will get you started
Amazon CloudWatch provides some very useful metrics for monitoring my EC2 instances, load balancers, ElastiCache and RDS databases, etc., and allows me to set alarms for a whole range of criteria; but is there any way to configure it to monitor my S3 buckets as well? Or are there any other monitoring tools (besides simply enabling logging) that will help me monitor the numbers of POST/GET requests and data volumes for my S3 resources, and provide alarms for thresholds of activity or increased data storage?
AWS S3 is a managed storage service. The only metrics available in AWS CloudWatch for S3 are NumberOfObjects and BucketSizeBytes. In order to understand your S3 usage better you need to do some extra work.
I have recently written an AWS Lambda function to do exactly what you ask for and it's available here:
https://github.com/maginetv/s3logs-cloudwatch
It works by parsing S3 server access log files and aggregating/exporting metrics to AWS CloudWatch (CloudWatch allows you to publish custom metrics).
Example graphs that you will get in AWS CloudWatch after deploying this function on your AWS account are:
RestGetObject_RequestCount
RestPutObject_RequestCount
RestHeadObject_RequestCount
BatchDeleteObject_RequestCount
RestPostMultiObjectDelete_RequestCount
RestGetObject_HTTP_2XX_RequestCount
RestGetObject_HTTP_4XX_RequestCount
RestGetObject_HTTP_5XX_RequestCount
+ many others
Since metrics are exported to CloudWatch, you can easily set up alarms for them as well.
A CloudFormation template is included in the GitHub repo, so you can deploy this function very quickly to gain visibility into your S3 bucket usage.
EDIT 2016-12-10:
In November 2016 AWS added extra S3 request metrics in CloudWatch that can be enabled when needed. These include metrics like AllRequests, GetRequests, PutRequests, DeleteRequests, HeadRequests, etc. See the Monitoring Metrics with Amazon CloudWatch documentation for more details about this feature.
I was also unable to find any way to do this with CloudWatch. This question from April 2012 was answered by Derek@AWS, who confirmed that CloudWatch does not have S3 support. https://forums.aws.amazon.com/message.jspa?messageID=338089
The only thing I could think of would be to import the S3 access logs to a log service (like Splunk). Then create a custom CloudWatch metric where you post the data that you parse from the logs. But then you have to filter out the polling of the access logs and…
And while you were at it, you could just create the alarms in Splunk instead of in S3.
If your use case is to simply alert when you are using it too much, you could set up an account billing alert for your S3 usage.
I think this might depend on where you are looking to track the access from. I.e. if you are trying to measure/watch usage of S3 objects from outside HTTP/HTTPS requests, then Anthony's suggestion of enabling S3 logging and then importing into Splunk (or Redshift) for analysis might work. You can also watch billing status on requests every day.
If you are trying to gauge usage from within your own applications, there are some AWS SDK CloudWatch metrics:
http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/metrics/package-summary.html
and
http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/metrics/S3ServiceMetric.html
S3 is a managed service, meaning that you don't need to take action based on system events in order to keep it up and running (as long as you can afford to pay for the service's usage). The spirit of CloudWatch is to help with monitoring services that require you to take action in order to keep them running.
For example, EC2 instances (which you manage yourself) typically need monitoring to alert when they're overloaded or when they're underused or else when they crash; at some point action needs to be taken in order to spin up new instances to scale out, spin down unused instances to scale back in, or reboot instances that have crashed. CloudWatch is meant to help you do the job of managing these resources more effectively.
To enable request and data transfer metrics for your bucket, you can run the command below. Be aware that these are paid metrics.
aws s3api put-bucket-metrics-configuration \
--bucket YOUR-BUCKET-NAME \
--metrics-configuration Id=EntireBucket \
--id EntireBucket
This tutorial describes how to do it in AWS Console with point and click interface.
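If you would rather do this from Python than from the CLI or console, a roughly equivalent boto3 call is sketched below. The wrapper name `enable_request_metrics` is made up for the example; `EntireBucket` is the same configuration ID used in the CLI command, and no filter is set, so the configuration covers the whole bucket.

```python
def enable_request_metrics(s3_client, bucket, config_id="EntireBucket"):
    """Enable request metrics for the whole bucket (these are paid metrics)."""
    s3_client.put_bucket_metrics_configuration(
        Bucket=bucket,
        Id=config_id,
        # No Filter key, so the configuration applies to every object.
        MetricsConfiguration={"Id": config_id},
    )
    return config_id
```

It would be called as `enable_request_metrics(boto3.client('s3'), 'YOUR-BUCKET-NAME')`.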