How to enable GET request metric for S3 bucket?

I want to add a CloudWatch alarm for GET requests on an S3 bucket. In the Console, I went to Management -> Metrics for the bucket and checked the box for Request metrics (10) (paid feature), which also automatically checked the box for Data transfer metrics (6) (paid feature). I expected this to enable the GET request metric. Instead, only 5 request metrics and 3 data transfer metrics have appeared, and the GET request metric is not among them. I thought 16 new metrics would appear, but only 8 have. How do I fix this?

You can alternatively use S3 access logs to get the number of GET requests for your objects: Using Amazon S3 access logs to identify requests.
Also note that some CloudWatch metrics only appear when there are actual data points. Operation-specific metrics (such as GetRequests) are reported only if there are requests of that type for your bucket or your filter.
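If you want to check whether the metric has started flowing, you can query it directly. Here is a minimal boto3 sketch, assuming the EntireBucket request-metrics filter created by the console (the bucket name is a placeholder):

import boto3
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client("cloudwatch")

# GetRequests is published under AWS/S3 with both a BucketName and a
# FilterId dimension; without the FilterId the query returns nothing.
resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/S3",
    MetricName="GetRequests",
    Dimensions=[
        {"Name": "BucketName", "Value": "YOUR-BUCKET-NAME"},
        {"Name": "FilterId", "Value": "EntireBucket"},
    ],
    StartTime=datetime.now(timezone.utc) - timedelta(hours=1),
    EndTime=datetime.now(timezone.utc),
    Period=300,
    Statistics=["Sum"],
)
print(resp["Datapoints"])  # stays empty until GET requests actually occur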

Related

Can you upload historical metrics to Cloudwatch?

We use AWS CloudWatch Metrics and the associated dashboards a lot. Sometimes we want to add a new visualisation, but the metric only exists from our first PutMetricData call onwards. However, we often have the data for earlier periods on hand; it just wasn't uploaded at the time.
Can you retrospectively upload metrics to Cloudwatch?
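For what it's worth, PutMetricData takes an optional Timestamp per data point, and CloudWatch accepts timestamps up to two weeks in the past, so recent history can be backfilled; anything older is rejected. A minimal boto3 sketch (the namespace, metric name, and values are placeholders):

import boto3
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client("cloudwatch")

now = datetime.now(timezone.utc)
# Backfill one data point per hour for the last 12 hours.
cloudwatch.put_metric_data(
    Namespace="MyApp",  # placeholder namespace
    MetricData=[
        {
            "MetricName": "OrdersProcessed",  # placeholder metric
            "Timestamp": now - timedelta(hours=h),
            "Value": 100.0,  # your historical value here
            "Unit": "Count",
        }
        for h in range(12)
    ],
)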

Alarm when object size in S3 bucket exceeds threshold

I have AWS data pipelines set up that feed into my S3 bucket. Each time the pipeline runs, a new feed file is generated and stored in the bucket. We keep at most 30 days of data in the bucket. Is it possible to configure an alarm so that I am notified via email, etc. when a generated object's size crosses a threshold (say 1 GB)? How would I go about it?
If you want granular data, some dev work is required. Below are some options and further reading.
S3 notifications - i.e. events sent by S3 in response to object create/delete etc., which can be used to fire a Lambda that performs whatever logic you need. You can base the logic on key, file size, created date, etc. You could then store that value as a CloudWatch metric and set up an alarm on your custom metric (see the sketch below).
See https://docs.aws.amazon.com/AmazonS3/latest/dev/NotificationHowTo.html
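A minimal sketch of such a Lambda, assuming the bucket's s3:ObjectCreated:* notification is wired to it (the namespace and metric name are made up for illustration):

import boto3

cloudwatch = boto3.client("cloudwatch")

def lambda_handler(event, context):
    # S3 event notifications carry the object's key and size in each record.
    for record in event["Records"]:
        size = record["s3"]["object"]["size"]
        # Publish the size as a custom metric; a CloudWatch alarm on this
        # metric (e.g. > 1 GB) can then notify you via SNS.
        cloudwatch.put_metric_data(
            Namespace="Custom/S3Feeds",  # made-up namespace
            MetricData=[{
                "MetricName": "FeedObjectSize",  # made-up metric name
                "Value": float(size),
                "Unit": "Bytes",
            }],
        )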
Or
S3 inventory (which is basically a CSV-formatted directory listing uploaded to a different bucket on a schedule).
If you go for the inventory option, you set a schedule, and you can then create a notification on the inventory destination bucket to fire a Lambda as each CSV becomes available. Also take a look at AWS Athena, which can be used to query the inventory files directly via API - no need to download/parse the CSV yourself (see the Athena sketch below)!
See https://docs.aws.amazon.com/AmazonS3/latest/dev/storage-inventory.html
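As a rough illustration of the Athena route, assuming you have already created an external table over the inventory files (the database, table name, and output location below are all hypothetical):

import boto3

athena = boto3.client("athena")

# Ask Athena for inventory rows describing objects larger than 1 GB.
resp = athena.start_query_execution(
    QueryString=(
        "SELECT key, size FROM s3_inventory "  # hypothetical table
        "WHERE size > 1073741824"
    ),
    QueryExecutionContext={"Database": "default"},  # hypothetical database
    ResultConfiguration={
        "OutputLocation": "s3://YOUR-QUERY-RESULTS-BUCKET/athena/"
    },
)
print(resp["QueryExecutionId"])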
If you're interested in a quick and easy, non-programming route, there's a total-bucket-size CloudWatch metric called BucketSizeBytes, to which you could easily add an alarm that triggers an SNS email if the total size goes above 30 GB (see the sketch below). Depending on your goals this might be useful and should take minutes to set up - but it is pretty useless for timely monitoring purposes, since the metric is only reported about once a day.
See https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/s3-metricscollected.html
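A sketch of that alarm in boto3 (the bucket name, SNS topic ARN, and threshold are placeholders; note that BucketSizeBytes also carries a StorageType dimension):

import boto3

cloudwatch = boto3.client("cloudwatch")

cloudwatch.put_metric_alarm(
    AlarmName="s3-bucket-size-over-30gb",
    Namespace="AWS/S3",
    MetricName="BucketSizeBytes",
    Dimensions=[
        {"Name": "BucketName", "Value": "YOUR-BUCKET-NAME"},
        {"Name": "StorageType", "Value": "StandardStorage"},
    ],
    Statistic="Maximum",
    Period=86400,  # the metric is only reported about once a day
    EvaluationPeriods=1,
    Threshold=30 * 1024 ** 3,  # 30 GiB, in bytes
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:YOUR-TOPIC"],
)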

Cloudwatch - Metrics are expiring

I currently have a bunch of custom metrics based in multiple regions across our AWS account.
I thought I was going crazy, but I have now confirmed that a metric I created a while ago expires when not used for a certain time period (could be 2 weeks).
Here's my setup.
I create a new metric on my log entry - which has no expiry date;
I then go to the main page on CloudWatch --> then to Metrics to view any metrics (I understand this will only display new metric hits when there are hits that match the metric rule).
About 2 weeks ago, I had 9 metrics logged under my "Custom Namespaces", and I now have 8 - as if it does not keep all the data.
As far as I'm aware, all my metrics should stay in place (unless I remove them). However, it seems as though if these are not hit consistently, the data "expires". Is that correct? If so, how are you meant to track historical data?
Thanks
CloudWatch will remove metrics from search if there was no new data published for that metric in the last 2 weeks.
This is mentioned in passing in the FAQ for EC2 metrics, but I think it applies to all metrics.
From the 'Will I lose the metrics data if I disable monitoring for an Amazon EC2 instance?' question in the FAQ:
CloudWatch console limits the search of metrics to 2 weeks after a metric is last ingested to ensure that the most up to date instances are shown in your namespace.
Your data is still there, however; it just adheres to a different retention policy.
You can still get your data if you know the metric name. If you added the metric to a dashboard, it will still be visible there. You can use the CloudWatch PutDashboard API to add the metric to a dashboard, or the GetMetricStatistics API to get the raw data.
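For example, a boto3 sketch that pulls raw data for a metric that no longer shows up in console search (the namespace and metric name are placeholders):

import boto3
from datetime import datetime, timezone

cloudwatch = boto3.client("cloudwatch")

# This works even after the metric has dropped out of console search,
# as long as the data points are still within their retention period.
resp = cloudwatch.get_metric_statistics(
    Namespace="MyCustomNamespace",  # placeholder
    MetricName="MyQuietMetric",  # placeholder
    StartTime=datetime(2017, 1, 1, tzinfo=timezone.utc),
    EndTime=datetime(2017, 2, 1, tzinfo=timezone.utc),
    Period=3600,
    Statistics=["Sum"],
)
for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Sum"])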

How does Amazon CloudWatch batch logs when streaming to AWS Lambda?

The AWS documentation indicates that multiple log event records are provided to Lambda when streaming logs from CloudWatch.
logEvents
The actual log data, represented as an array of log event records. The "id" property is a unique identifier for every log event.
How does CloudWatch group these logs?
Time? Count? Randomly, from my perspective?
Currently you get one Lambda invocation for every PutLogEvents batch that CloudWatch Logs receives against that log group. However, you should probably not rely on that, because AWS could always change the behavior (for example, batching more events per invocation).
You can observe this behavior by running the CWL -> Lambda example in the AWS docs.
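For reference, the payload each invocation receives is base64-encoded, gzip-compressed JSON; a minimal handler to unpack one batch might look like this (a sketch, not the exact example from the docs):

import base64
import gzip
import json

def lambda_handler(event, context):
    # CloudWatch Logs delivers the batch as gzipped, base64-encoded JSON.
    payload = base64.b64decode(event["awslogs"]["data"])
    data = json.loads(gzip.decompress(payload))

    # One invocation corresponds to one batch; logEvents holds the records.
    print("log group:", data["logGroup"])
    print("events in this batch:", len(data["logEvents"]))
    for log_event in data["logEvents"]:
        print(log_event["id"], log_event["timestamp"], log_event["message"])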
Some AWS services allow you to configure the log interval; Elastic Load Balancing, for example, offers a choice between five- and sixty-minute log intervals. You may not see a specific increment or parameter in the docs because it is configurable per service.

AWS Cloudwatch monitoring for S3

Amazon CloudWatch provides some very useful metrics for monitoring my EC2s, load balancers, ElastiCache and RDS databases, etc., and allows me to set alarms for a whole range of criteria; but is there any way to configure it to monitor my S3 buckets as well? Or are there any other monitoring tools (besides simply enabling logging) that will help me monitor the number of POST/GET requests and data volumes for my S3 resources, and provide alarms for thresholds of activity or increased data storage?
AWS S3 is a managed storage service. The only metrics available in AWS CloudWatch for S3 are NumberOfObjects and BucketSizeBytes. In order to understand your S3 usage better you need to do some extra work.
I have recently written an AWS Lambda function to do exactly what you ask for and it's available here:
https://github.com/maginetv/s3logs-cloudwatch
It works by parsing S3 server-side log files and aggregating/exporting metrics to AWS CloudWatch, which allows you to publish custom metrics (a rough sketch of the approach follows below).
Example graphs that you will get in AWS CloudWatch after deploying this function on your AWS account are:
RestGetObject_RequestCount
RestPutObject_RequestCount
RestHeadObject_RequestCount
BatchDeleteObject_RequestCount
RestPostMultiObjectDelete_RequestCount
RestGetObject_HTTP_2XX_RequestCount
RestGetObject_HTTP_4XX_RequestCount
RestGetObject_HTTP_5XX_RequestCount
+ many others
Since metrics are exported to CloudWatch, you can easily set up alarms for them as well.
CloudFormation template is included in GitHub repo and you can deploy this function very quickly to gain visibility into your S3 bucket usage.
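The general approach, reduced to a sketch (this is not the repo's actual code; the parsing is simplified and the namespace is made up):

import boto3
from collections import Counter

cloudwatch = boto3.client("cloudwatch")

def publish_request_counts(log_lines):
    # S3 server access log lines are space-delimited; with the timestamp
    # split across two tokens, the operation (e.g. REST.GET.OBJECT) is
    # token 7. Real log lines with quoted fields need a proper parser.
    counts = Counter(line.split()[7] for line in log_lines)
    cloudwatch.put_metric_data(
        Namespace="S3AccessLogs",  # made-up namespace
        MetricData=[
            {
                "MetricName": op.replace(".", "") + "_RequestCount",
                "Value": float(n),
                "Unit": "Count",
            }
            for op, n in counts.items()
        ],
    )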
EDIT 2016-12-10:
In November 2016 AWS added extra S3 request metrics in CloudWatch that can be enabled when needed. These include metrics like AllRequests, GetRequests, PutRequests, DeleteRequests, HeadRequests, etc. See the Monitoring Metrics with Amazon CloudWatch documentation for more details about this feature.
I was also unable to find any way to do this with CloudWatch. This question from April 2012 was answered by Derek@AWS, saying that S3 is not supported in CloudWatch. https://forums.aws.amazon.com/message.jspa?messageID=338089
The only thing I could think of would be to import the S3 access logs into a log service (like Splunk), then create a custom CloudWatch metric where you post the data you parse from the logs. But then you have to filter out the polling of the access logs and…
And while you were at it, you could just create the alarms in Splunk instead of in CloudWatch.
If your use case is simply to alert when you are using it too much, you could set up an account billing alert for your S3 usage.
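A sketch of such a billing alarm, assuming billing alerts are enabled on the account (billing metrics live only in us-east-1; the threshold and topic ARN are placeholders):

import boto3

# Billing metrics are only published to us-east-1.
cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

cloudwatch.put_metric_alarm(
    AlarmName="s3-spend-over-10-usd",
    Namespace="AWS/Billing",
    MetricName="EstimatedCharges",
    Dimensions=[
        {"Name": "ServiceName", "Value": "AmazonS3"},
        {"Name": "Currency", "Value": "USD"},
    ],
    Statistic="Maximum",
    Period=21600,  # estimated charges update a few times a day
    EvaluationPeriods=1,
    Threshold=10.0,  # placeholder dollar amount
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:YOUR-TOPIC"],
)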
I think this might depend on where you are looking to track the access from. I.e., if you are trying to measure/watch usage of S3 objects from outside HTTP/HTTPS requests, then Anthony's suggestion of enabling S3 logging and then importing into Splunk (or Redshift) for analysis might work. You can also watch the billing status of requests every day.
If trying to gauge usage from within your own applications, there are some AWS SDK CloudWatch metrics:
http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/metrics/package-summary.html
and
http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/s3/metrics/S3ServiceMetric.html
S3 is a managed service, meaning that you don't need to take action based on system events in order to keep it up and running (as long as you can afford to pay for the service's usage). The spirit of CloudWatch is to help with monitoring services that require you to take action in order to keep them running.
For example, EC2 instances (which you manage yourself) typically need monitoring to alert when they're overloaded or when they're underused or else when they crash; at some point action needs to be taken in order to spin up new instances to scale out, spin down unused instances to scale back in, or reboot instances that have crashed. CloudWatch is meant to help you do the job of managing these resources more effectively.
To enable Request and Data transfer metrics on your bucket, you can run the command below. Be aware that these are paid metrics.
aws s3api put-bucket-metrics-configuration \
  --bucket YOUR-BUCKET-NAME \
  --id EntireBucket \
  --metrics-configuration Id=EntireBucket
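The equivalent via boto3, if you would rather script it (the bucket name is a placeholder):

import boto3

s3 = boto3.client("s3")

# Same effect as the CLI command above: an "EntireBucket" metrics
# configuration with no filter, covering all objects in the bucket.
s3.put_bucket_metrics_configuration(
    Bucket="YOUR-BUCKET-NAME",
    Id="EntireBucket",
    MetricsConfiguration={"Id": "EntireBucket"},
)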
This tutorial describes how to do it in the AWS Console with a point-and-click interface.