I use the CloudWatch service to monitor logs for my running EC2 instances. But the CloudWatch web console does not seem to have a button for downloading or exporting the log data.
Any ideas how I can achieve this goal through CLI or GUI?
Programmatically, using boto3 (Python),
import boto3
log_client = boto3.client('logs')
result_1 = log_client.describe_log_streams(logGroupName='<NAME>')
(I don't know what log group names for EC2 instances look like; for Lambda they are of the form '/aws/lambda/FuncName'. Try grabbing the names you see in the console).
result_1 contains two useful keys: logStreams (the result you want) and nextToken (for pagination, I'll let you look up the usage).
Now result_1['logStreams'] is a list of objects containing a logStreamName. Also useful are firstEventTimestamp and lastEventTimestamp.
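Since I left the nextToken usage as an exercise, here is a minimal sketch of how the pagination loop could look. The function name is my own, and log_client is assumed to be the boto3.client('logs') object from above.

```python
def list_log_streams(log_client, log_group_name):
    """Return every stream in a log group, following nextToken until it runs out.

    log_client is assumed to be boto3.client('logs').
    """
    streams = []
    kwargs = {"logGroupName": log_group_name}
    while True:
        page = log_client.describe_log_streams(**kwargs)
        streams.extend(page.get("logStreams", []))
        token = page.get("nextToken")
        if not token:
            # No nextToken means this was the last page.
            return streams
        kwargs["nextToken"] = token
```

Calling list_log_streams(log_client, '<NAME>') should give you the full list rather than just the first page.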
Now that you have log stream names, you can use
log_client.get_log_events(logGroupName='<name>', logStreamName='<name>')
The response contains nextForwardToken and nextBackwardToken for pagination, and events for the log events you want. Each event contains a timestamp and a message.
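Putting that together, a rough sketch of downloading a whole stream to a local file might look like this. The function names and the tab-separated output format are my own choices. The loop relies on get_log_events returning the same nextForwardToken it was given once the end of the stream is reached.

```python
def iter_log_events(log_client, log_group_name, log_stream_name):
    """Yield every event in a stream, oldest first.

    log_client is assumed to be boto3.client('logs').
    """
    kwargs = {
        "logGroupName": log_group_name,
        "logStreamName": log_stream_name,
        "startFromHead": True,  # read from the oldest event forward
    }
    previous_token = None
    while True:
        page = log_client.get_log_events(**kwargs)
        yield from page.get("events", [])
        token = page.get("nextForwardToken")
        if not token or token == previous_token:
            # The same token coming back means there is nothing more to read.
            return
        previous_token = token
        kwargs["nextToken"] = token


def save_stream(log_client, log_group_name, log_stream_name, path):
    """Write one 'timestamp<TAB>message' line per event and return the count."""
    count = 0
    with open(path, "w", encoding="utf-8") as out:
        for event in iter_log_events(log_client, log_group_name, log_stream_name):
            out.write(f"{event['timestamp']}\t{event['message'].rstrip()}\n")
            count += 1
    return count
```

For large streams this makes many API calls, so the S3 export route is probably the better fit there.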
I'll leave it to you to look up the API to see what other parameters might be useful to you. By the way, the console will let you stream your logs to an S3 bucket or to AWS's ElasticSearch service. ElasticSearch is a joy to use, and Kibana's UI is intuitive enough that you can get results even without learning their query language.
You can use the console or the AWS CLI to export CloudWatch logs to Amazon S3. You need to know the log group name, the start and end timestamps of the range you want, and the destination bucket and prefix. Amazon recommends a separate S3 bucket for your logs. Once you have a bucket, create an export task. In the console: Navigation - Logs - select your log group - Actions - Export data to S3 - fill in the details for your export - select Export data. Amazon's documentation explains it pretty well at http://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/S3Export.html, and the CLI instructions are there too if you want to use that. I imagine with the CLI you could also script your export, but you would have to set the variables (such as the destination prefix) so that you don't overwrite an existing export.
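If you would rather script the export from Python than from the CLI, something along these lines might work. It is only a sketch: the function names and the prefix layout are hypothetical, log_client is assumed to be boto3.client('logs'), and the start and end datetimes are assumed to be timezone-aware UTC values. Embedding the time range in the prefix is one way to avoid overwriting an earlier export.

```python
from datetime import datetime, timezone


def to_epoch_millis(moment):
    """Convert a timezone-aware datetime to milliseconds since the Unix epoch."""
    return int(moment.timestamp() * 1000)


def export_log_group(log_client, log_group_name, start, end, bucket, prefix_root="exports"):
    """Start an export task and return its task ID.

    log_client is assumed to be boto3.client('logs'). The 'Z' in the stamp
    assumes start and end are UTC datetimes.
    """
    stamp = f"{start:%Y%m%dT%H%M%SZ}-{end:%Y%m%dT%H%M%SZ}"
    response = log_client.create_export_task(
        taskName=f"export-{stamp}",
        logGroupName=log_group_name,
        fromTime=to_epoch_millis(start),  # the API expects epoch milliseconds
        to=to_epoch_millis(end),
        destination=bucket,
        # A distinct prefix per time range keeps exports from colliding.
        destinationPrefix=f"{prefix_root}/{stamp}",
    )
    return response["taskId"]
```

The bucket still needs a policy that lets CloudWatch Logs write to it, as described in the documentation linked above.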
If this is part of your overall AWS disaster recovery planning, you might want to check out some tips & best practices, such as Amazon's white paper on AWS disaster recovery, and NetApp's discussion of using the cloud for disaster recovery.
Our CIO had a heart attack upon seeing our AWS bill.
I need to aggregate Apache and Tomcat logs from multiple EC2 (in scaling group) -- what could be the best way to initiate this without breaking the bank? The goal of the logs is to view events by IP address, account names, view the transaction flows (diagnostic/audit logging -- not so much as performance metrics).
ELK is out of the equation (political). Cloudwatch is allowed + anything else.
Depends on volume and access patterns, but pushing the logs to S3 and using Athena to query them is a good shout.
It's cheap because S3 is a really cheap datastore, and Athena is serverless, meaning you only pay for the queries you run.
Make sure you convert the logs to a compressed columnar format (like Apache Parquet) to save even more dosh.
https://aws.amazon.com/athena
https://docs.aws.amazon.com/athena/latest/ug/querying-apache-logs.html
https://aws.amazon.com/blogs/big-data/analyzing-data-in-s3-using-amazon-athena/
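To give an idea of what "view events by IP address" could look like with Athena, here is a hedged sketch. The table name and its client_ip column are hypothetical (they depend on how you define your table), and athena_client is assumed to be boto3.client('athena').

```python
def requests_per_ip_sql(table, limit=20):
    """Build a query counting requests per client IP.

    The table and its client_ip column are hypothetical; use whatever names
    your own Athena table definition declares.
    """
    return (
        f"SELECT client_ip, COUNT(*) AS requests "
        f"FROM {table} GROUP BY client_ip "
        f"ORDER BY requests DESC LIMIT {int(limit)}"
    )


def start_ip_report(athena_client, database, table, output_location):
    """Submit the query and return its execution ID.

    athena_client is assumed to be boto3.client('athena'); output_location is
    an s3:// URI where Athena writes the result file.
    """
    response = athena_client.start_query_execution(
        QueryString=requests_per_ip_sql(table),
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]
```

The query runs asynchronously, so you would poll for completion (or just read the result file from the output location) afterwards.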
My argument against S3/Athena would be that S3 may be the cheapest storage mechanism, but how will you get the logs off your box and into S3? I'm not aware of any AWS agents that do this, though there may be commercial or open source projects that do. Also, there is some setup required to get Athena to work for searching, such as defining schemas and/or setting up AWS Glue Crawlers to discover data. You'll often find that Glue Crawlers aren't great at identifying log data unless it's in a structured format like JSON.
I would highly recommend CloudWatch. AWS has created a CloudWatch agent that is available for multiple OSs that will pull and forward your logs from your EC2 instances. CloudWatch also has some free searching tools and now the more powerful CloudWatch Insights tool to help you search your data in a way similar to what other first-class log aggregators allow.
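As a rough illustration of searching with CloudWatch Insights from code, the sketch below starts a query and polls until it finishes. The function name and polling defaults are my own; log_client is assumed to be boto3.client('logs'), and the start and end times are epoch seconds.

```python
import time


def run_insights_query(log_client, log_group_name, query, start_epoch_s, end_epoch_s,
                       poll_seconds=1.0, max_polls=60):
    """Run a CloudWatch Logs Insights query and return rows as dictionaries.

    log_client is assumed to be boto3.client('logs').
    """
    query_id = log_client.start_query(
        logGroupName=log_group_name,
        startTime=start_epoch_s,
        endTime=end_epoch_s,
        queryString=query,
    )["queryId"]
    for _ in range(max_polls):
        response = log_client.get_query_results(queryId=query_id)
        status = response["status"]
        if status == "Complete":
            # Each row arrives as a list of {"field": ..., "value": ...} pairs.
            return [{cell["field"]: cell["value"] for cell in row}
                    for row in response["results"]]
        if status in ("Failed", "Cancelled", "Timeout"):
            raise RuntimeError(f"query {query_id} ended with status {status}")
        time.sleep(poll_seconds)
    raise TimeoutError(f"query {query_id} did not finish after {max_polls} polls")
```

A query string such as "fields @timestamp, @message | filter @message like /203.0.113.7/ | sort @timestamp desc | limit 50" would pull recent lines mentioning one (example) IP address.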
CloudWatch pricing is also pretty cheap. It's only $0.50/GB ingested and $0.02/GB long term storage (in us-east-1 at least). And there is no charge to use the CloudWatch agent which is the biggest advantage as you don't have to invent and test a new way to pull logs off of your boxes.
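For a back-of-the-envelope estimate using the two prices quoted above (check the current price list for your region before relying on them), the arithmetic is simply ingestion plus storage:

```python
INGEST_USD_PER_GB = 0.50   # ingestion price quoted above (us-east-1)
STORAGE_USD_PER_GB = 0.02  # monthly storage price quoted above


def monthly_cost(ingested_gb, stored_gb):
    """Estimate one month's bill: ingestion plus storage of what is retained."""
    return ingested_gb * INGEST_USD_PER_GB + stored_gb * STORAGE_USD_PER_GB
```

So ingesting 100 GB in a month while retaining 300 GB in total comes to roughly 50 + 6 = 56 USD for that month. Ingestion dominates, which means trimming noisy log lines before they are shipped is where the savings are.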
I'm hunting down a misbehaving EC2 instance, whose ID I found in my billing logs. I can't do describe-instances on it anymore since it died a few days ago. Is there a way to get its equivalent, i.e. does AWS log this kind of information anywhere? In this particular case, I needed to find out which SSH key it was tied to, but the more details, the merrier.
You can get this information from CloudTrail, as long as the event happened in the region you are querying and within the last 90 days.
While you can do it with the Console, you will probably find the CloudTrail CLI easier. Start with this:
aws cloudtrail lookup-events --lookup-attributes AttributeKey=EventName,AttributeValue=RunInstances > /tmp/$$
This dumps the RunInstances events to a file, which you can then open in your editor. By default the CLI follows pagination for you and keeps requesting pages until it has everything. Don't add --no-paginate here: that flag makes the CLI stop after the first page of results.
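If you already know the instance ID, a boto3 sketch like the following could narrow things down without a trip through the editor. The function names are hypothetical and cloudtrail_client is assumed to be boto3.client('cloudtrail'). The place where keyName lives inside the record (responseElements.instancesSet.items) is my assumption about how RunInstances records are laid out, so inspect a real record before trusting it.

```python
import json


def find_run_instances_events(cloudtrail_client, instance_id):
    """Return the parsed RunInstances records that mention instance_id.

    cloudtrail_client is assumed to be boto3.client('cloudtrail').
    """
    paginator = cloudtrail_client.get_paginator("lookup_events")
    pages = paginator.paginate(
        LookupAttributes=[{"AttributeKey": "ResourceName", "AttributeValue": instance_id}]
    )
    records = []
    for page in pages:
        for event in page.get("Events", []):
            if event.get("EventName") == "RunInstances":
                # CloudTrailEvent holds the full record as a JSON string.
                records.append(json.loads(event["CloudTrailEvent"]))
    return records


def key_names(record, instance_id):
    """Pull keyName values for instance_id out of one record (assumed layout)."""
    response = record.get("responseElements") or {}
    items = (response.get("instancesSet") or {}).get("items", [])
    return [item.get("keyName") for item in items if item.get("instanceId") == instance_id]
```

The paginator takes care of requesting additional pages, so you don't have to handle tokens yourself.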
A better long-term solution is to enable CloudTrail logging to an S3 bucket. This gathers events across all regions, and for a multi-account organization, all accounts. You can then use a bucket life-cycle policy to hold onto those events as long as you think you'll need them. It is, however, somewhat more challenging to query against events stored in S3.
It is not possible to retrieve the metadata of terminated instances.
In future, you could try an alternative approach. For example, use AWS Config with a custom rule, or write a small Lambda function that saves the metadata of your instances and trigger it periodically.
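The body of such a periodically triggered function might look roughly like this. It is a sketch only: the function names, the chosen fields, and the bucket and key are hypothetical, and the clients are assumed to be boto3.client('ec2') and boto3.client('s3').

```python
import json


def snapshot_instances(ec2_client):
    """Collect a few attributes for every instance visible in the region.

    ec2_client is assumed to be boto3.client('ec2').
    """
    snapshot = []
    paginator = ec2_client.get_paginator("describe_instances")
    for page in paginator.paginate():
        for reservation in page.get("Reservations", []):
            for inst in reservation.get("Instances", []):
                snapshot.append({
                    "InstanceId": inst["InstanceId"],
                    "KeyName": inst.get("KeyName"),
                    "InstanceType": inst.get("InstanceType"),
                    "State": inst.get("State", {}).get("Name"),
                    "LaunchTime": str(inst.get("LaunchTime")),
                })
    return snapshot


def save_snapshot(s3_client, bucket, key, snapshot):
    """Store the snapshot as JSON; s3_client is assumed to be boto3.client('s3')."""
    s3_client.put_object(Bucket=bucket, Key=key, Body=json.dumps(snapshot).encode("utf-8"))
```

Using a key that includes the date would keep one snapshot per run instead of overwriting the previous one.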
Does Amazon EC2 generate emails and PDFs of its monitoring information on a regular schedule? It provides graphs for CPU utilisation, disk reads, disk read operations, disk writes, disk write operations, network in, etc. I need to get all these graphs and data from the AWS console to my email address as a PDF. Can I get it directly, or is there another way to get a backup on a regular basis?
All the metrics from EC2 are stored in CloudWatch (along with most other AWS service metrics). Unfortunately there is no export feature built in, so you either need to make one using the CloudWatch API/CLI or use someone else's:
https://github.com/petezybrick/awscwxls
The following is a good starting point for collecting the EC2 metrics, before creating the graphs if you want to do it on your own.
http://docs.aws.amazon.com/cli/latest/reference/cloudwatch/get-metric-statistics.html
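The same call is available from boto3. As a sketch of the do-it-yourself route, the following pulls average CPU utilisation for one instance and writes it to a CSV file that a charting or reporting tool could pick up. The function names and the CSV layout are my own, and cw_client is assumed to be boto3.client('cloudwatch').

```python
import csv
from datetime import datetime, timedelta, timezone


def fetch_cpu_datapoints(cw_client, instance_id, hours=24, period_s=300):
    """Return average CPUUtilization datapoints, oldest first.

    cw_client is assumed to be boto3.client('cloudwatch').
    """
    end = datetime.now(timezone.utc)
    response = cw_client.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        StartTime=end - timedelta(hours=hours),
        EndTime=end,
        Period=period_s,
        Statistics=["Average"],
    )
    # Datapoints are not guaranteed to arrive in time order.
    return sorted(response["Datapoints"], key=lambda p: p["Timestamp"])


def write_csv(datapoints, path):
    """Write timestamp/average/unit rows that a charting tool can read."""
    with open(path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["timestamp", "average", "unit"])
        for p in datapoints:
            writer.writerow([p["Timestamp"].isoformat(), p["Average"], p.get("Unit", "")])
```

Turning the CSV into graphs, a PDF and an email is a separate step that this sketch does not cover.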
A third option is to script a login and take a screenshot of the metrics page of the AWS Console in a browser.
We are hosting our services on AWS Elastic Beanstalk managed instances. That is forcing us to move away from file-based logging to database-based logging.
Is DynamoDB a good choice for replacing file-based logging? If so, what should the primary key be? I thought of using a timestamp, but multiple messages may be logged by the same service with the same timestamp, so that might not be reliable.
Any advice would be appreciated.
Don't use DynamoDB to store logs. You'll be paying for throughput and space needlessly.
Amazon CloudWatch has built-in logging capabilities.
http://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/WhatIsCloudWatchLogs.html
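If you want to write to CloudWatch Logs directly from application code rather than through an agent, a minimal sketch could look like the following. The helper names are hypothetical, log_client is assumed to be boto3.client('logs'), and the log group and stream are assumed to exist already. Note that events carry a timestamp but do not need a unique key, so the same-timestamp worry from the question does not arise here.

```python
import time


def build_log_events(messages, now_ms=None):
    """Turn plain strings into the event dictionaries put_log_events expects."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)  # epoch milliseconds
    return [{"timestamp": now_ms, "message": m} for m in messages]


def send_messages(log_client, log_group_name, log_stream_name, messages):
    """Send a batch of messages to an existing group and stream.

    log_client is assumed to be boto3.client('logs').
    """
    events = build_log_events(messages)
    if events:
        log_client.put_log_events(
            logGroupName=log_group_name,
            logStreamName=log_stream_name,
            logEvents=events,
        )
    return len(events)
```

In practice you would batch messages rather than call this once per log line.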
Another alternative is a dedicated logging service such as Loggly which is cloud-based and can receive logs in many common formats, plus they have an API to send custom logs. In the web-based console, you can search and filter through the logs.
As an alternative, why don't you use CloudWatch? I ended up writing a whole app to consolidate logs across EC2 instances in a Beanstalk app, then last year AWS opened up CloudWatch as a service, so I junked my stuff. You tell CloudWatch where your logs are on the instance, give it a log group and stream name, and all your logs are consolidated in one spot, in CloudWatch. You can also run alarms off them using the standard AWS setup. It's pretty slick and easy: you don't have to write a front end to do lookups, it's already there.
Don't know what you're using for logging - we are a node.js shop, used winston for logging, and there is a nice NPM module that works with Winston to log automatically, called winston-cloudwatch.
Amazon CloudWatch provides some very useful metrics for monitoring my EC2 instances, load balancers, ElastiCache and RDS databases, etc., and allows me to set alarms for a whole range of criteria; but is there any way to configure it to monitor my S3 buckets as well? Or are there any other monitoring tools (besides simply enabling logging) that will help me monitor the numbers of POST/GET requests and data volumes for my S3 resources, and provide alarms for thresholds of activity or increased data storage?
AWS S3 is a managed storage service. The only metrics available in AWS CloudWatch for S3 are NumberOfObjects and BucketSizeBytes. In order to understand your S3 usage better you need to do some extra work.
I have recently written an AWS Lambda function to do exactly what you ask for and it's available here:
https://github.com/maginetv/s3logs-cloudwatch
It works by parsing S3 server-side log files, then aggregating and exporting metrics to AWS CloudWatch (CloudWatch allows you to publish custom metrics).
Example graphs that you will get in AWS CloudWatch after deploying this function on your AWS account are:
RestGetObject_RequestCount
RestPutObject_RequestCount
RestHeadObject_RequestCount
BatchDeleteObject_RequestCount
RestPostMultiObjectDelete_RequestCount
RestGetObject_HTTP_2XX_RequestCount
RestGetObject_HTTP_4XX_RequestCount
RestGetObject_HTTP_5XX_RequestCount
+ many others
Since metrics are exported to CloudWatch, you can easily set up alarms for them as well.
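To show the general idea of publishing custom metrics (this is not the code from the repository above, just a sketch), the snippet below counts operations that have already been parsed out of the log lines and publishes one metric per operation. The name mapping is my guess at how names like those listed above could be derived, the namespace is hypothetical, and cw_client is assumed to be boto3.client('cloudwatch').

```python
import re
from collections import Counter


def metric_name(operation):
    """Map an access-log operation such as 'REST.GET.OBJECT' to a metric name.

    The mapping rule is a guess that reproduces names like
    'RestGetObject_RequestCount'; it is not taken from the linked project.
    """
    parts = re.split(r"[._]", operation)
    return "".join(part.capitalize() for part in parts) + "_RequestCount"


def publish_request_counts(cw_client, operations, namespace="Custom/S3Requests"):
    """Publish one Count metric per distinct operation and return the totals.

    cw_client is assumed to be boto3.client('cloudwatch').
    """
    totals = Counter(metric_name(op) for op in operations)
    metric_data = [{"MetricName": name, "Value": float(count), "Unit": "Count"}
                   for name, count in sorted(totals.items())]
    if metric_data:
        cw_client.put_metric_data(Namespace=namespace, MetricData=metric_data)
    return dict(totals)
```

Once the metrics exist in CloudWatch, alarms can be attached to them like any other metric.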
A CloudFormation template is included in the GitHub repo, so you can deploy this function very quickly to gain visibility into your S3 bucket usage.
EDIT 2016-12-10:
In November 2016 AWS added extra S3 request metrics to CloudWatch that can be enabled when needed. These include metrics like AllRequests, GetRequests, PutRequests, DeleteRequests, HeadRequests, etc. See the "Monitoring Metrics with Amazon CloudWatch" documentation for more details about this feature.
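With request metrics enabled, an alarm on activity could be set up along these lines. This is a sketch under assumptions: the alarm name, the 5-minute period and the helper names are my own choices, filter_id is the ID of the metrics configuration on the bucket, sns_topic_arn is a topic you have already created, and cw_client is assumed to be boto3.client('cloudwatch').

```python
def build_request_alarm(bucket, filter_id, threshold, sns_topic_arn):
    """Build put_metric_alarm parameters for the AllRequests metric of one bucket."""
    return {
        "AlarmName": f"s3-{bucket}-all-requests",
        "Namespace": "AWS/S3",
        "MetricName": "AllRequests",
        # Request metrics are identified by the bucket and the filter ID.
        "Dimensions": [
            {"Name": "BucketName", "Value": bucket},
            {"Name": "FilterId", "Value": filter_id},
        ],
        "Statistic": "Sum",
        "Period": 300,
        "EvaluationPeriods": 1,
        "Threshold": float(threshold),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


def create_request_alarm(cw_client, bucket, filter_id, threshold, sns_topic_arn):
    """Create the alarm and return its name.

    cw_client is assumed to be boto3.client('cloudwatch').
    """
    params = build_request_alarm(bucket, filter_id, threshold, sns_topic_arn)
    cw_client.put_metric_alarm(**params)
    return params["AlarmName"]
```

This fires when the bucket receives more than the threshold number of requests in a 5-minute window.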
I was also unable to find any way to do this with CloudWatch. This question from April 2012 was answered by Derek@AWS, who said that CloudWatch does not have S3 support. https://forums.aws.amazon.com/message.jspa?messageID=338089
The only thing I could think of would be to import the S3 access logs to a log service (like Splunk). Then create a custom cloud watch metric where you post the data that you parse from the logs. But then you have to filter out the polling of the access logs and…
And while you were at it, you could just create the alarms in Splunk instead of in S3.
If your use case is to simply alert when you are using it too much, you could set up an account billing alert for your S3 usage.
I think this might depend on where you are looking to track the access from. I.e. if you are trying to measure/watch usage of S3 objects from outside http/https requests, then Anthony's suggestion of enabling S3 logging and then importing into Splunk (or Redshift) for analysis might work. You can also watch billing status on requests every day.
If you are trying to gauge usage from within your own applications, there are some AWS SDK CloudWatch metrics:
http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/metrics/package-summary.html
and
http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/metrics/S3ServiceMetric.html
S3 is a managed service, meaning that you don't need to take action based on system events in order to keep it up and running (as long as you can afford to pay for the service's usage). The spirit of CloudWatch is to help with monitoring services that require you to take action in order to keep them running.
For example, EC2 instances (which you manage yourself) typically need monitoring to alert when they're overloaded or when they're underused or else when they crash; at some point action needs to be taken in order to spin up new instances to scale out, spin down unused instances to scale back in, or reboot instances that have crashed. CloudWatch is meant to help you do the job of managing these resources more effectively.
To enable request and data transfer metrics on your bucket, you can run the command below. Be aware that these are paid metrics.
aws s3api put-bucket-metrics-configuration \
--bucket YOUR-BUCKET-NAME \
--metrics-configuration Id=EntireBucket \
--id EntireBucket
This tutorial describes how to do it in the AWS Console with a point-and-click interface.
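The boto3 equivalent of that command would look something like the sketch below. The function name is hypothetical and s3_client is assumed to be boto3.client('s3'); the read-back call is only there to confirm the configuration was stored.

```python
def enable_request_metrics(s3_client, bucket, config_id="EntireBucket"):
    """Enable request metrics for a whole bucket and return the stored configuration.

    s3_client is assumed to be boto3.client('s3'). A configuration with only an
    Id (no filter) covers every object in the bucket.
    """
    s3_client.put_bucket_metrics_configuration(
        Bucket=bucket,
        Id=config_id,
        MetricsConfiguration={"Id": config_id},
    )
    response = s3_client.get_bucket_metrics_configuration(Bucket=bucket, Id=config_id)
    return response["MetricsConfiguration"]
```

As with the CLI version, these are paid metrics.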