I'd like to know if it is possible to discover which resource is behind this cost. In Cost Explorer, grouping by usage type, I can see it is Data Processing bytes, but I don't know which resource would be consuming this amount of data.
Does anyone have any idea how to discover this in CloudWatch?
This is almost certainly because something is writing more data to CloudWatch than in previous months.
As stated in this AWS Support page about unexpected CloudWatch Logs bill increases:
Sudden increases in CloudWatch Logs bills are often caused by an increase in ingested or storage data in a particular log group. Check data usage using CloudWatch Logs Metrics and review your Amazon Web Services (AWS) bill to identify the log group responsible for bill increases.
Your screenshot identifies the large usage type as APS2-DataProcessing-Bytes. I believe that the APS2 part is telling you it's about the ap-southeast-2 region, so start by looking in that region when following the instructions below.
Here's a brief summary of the steps you need to take to find out which log groups are ingesting the most data:
How to check how much data you're ingesting
The IncomingBytes metric shows you how much data is being ingested in your CloudWatch log groups in near-real time. This metric can help you to determine:
Which log group is the highest contributor towards your bill
Whether there's been a spike in the incoming data to your log groups or a gradual increase due to new applications
How much data was pushed in a particular period
To query a small set of log groups:
Open the Amazon CloudWatch console.
In the navigation pane, choose Metrics.
For each of your log groups, select the IncomingBytes metric, and then choose the Graphed metrics tab.
For Statistic, choose Sum.
For Period, choose 30 Days.
Choose the Graph options tab and choose Number.
At the top right of the graph, choose custom, and then choose Absolute. Select a start and end date that corresponds with the last 30 days.
For more details, and for instructions on how to query hundreds of log groups, read the full AWS support article linked above.
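If you'd rather check this programmatically, here is a minimal boto3 sketch of the same idea (my own example, not from the AWS article): it sums the IncomingBytes metric for every log group in the region over the last 30 days and prints the biggest ingesters first. The region is an assumption based on the APS2 usage type.

import boto3
from datetime import datetime, timedelta, timezone

# Region is an example - use the region identified from the usage type (e.g. ap-southeast-2)
logs = boto3.client('logs', region_name='ap-southeast-2')
cloudwatch = boto3.client('cloudwatch', region_name='ap-southeast-2')

end = datetime.now(timezone.utc)
start = end - timedelta(days=30)

totals = {}
for page in logs.get_paginator('describe_log_groups').paginate():
    for group in page['logGroups']:
        name = group['logGroupName']
        stats = cloudwatch.get_metric_statistics(
            Namespace='AWS/Logs',
            MetricName='IncomingBytes',
            Dimensions=[{'Name': 'LogGroupName', 'Value': name}],
            StartTime=start,
            EndTime=end,
            Period=86400,            # one datapoint per day
            Statistics=['Sum'],
        )
        totals[name] = sum(dp['Sum'] for dp in stats['Datapoints'])

# Largest ingesters first
for name, ingested in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f'{name}: {ingested / 1024**3:.2f} GiB ingested in the last 30 days')

Note that this makes one GetMetricStatistics call per log group, so it is best suited to accounts with a modest number of log groups.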
Apart from the steps Gabe mentioned, what helped me identify the resource that was creating a large number of logs was:
heading over to CloudWatch
selecting the region shown in Cost Explorer
selecting Log Groups
from the settings under Log Groups, enabling the Stored bytes column to be visible
This showed me which service was causing a lot of logs to be written to CloudWatch.
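If you prefer to pull the same information with the SDK instead of the console, a quick boto3 sketch (my own, assuming stored bytes is a reasonable proxy for the noisiest services) would be:

import boto3

# Use the region that showed up in Cost Explorer
logs = boto3.client('logs', region_name='ap-southeast-2')

groups = []
for page in logs.get_paginator('describe_log_groups').paginate():
    groups.extend(page['logGroups'])

# Largest log groups (by stored bytes) first
for group in sorted(groups, key=lambda g: g.get('storedBytes', 0), reverse=True):
    size_gib = group.get('storedBytes', 0) / 1024**3
    print(f"{group['logGroupName']}: {size_gib:.2f} GiB stored")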
I am trying to figure out how to simply view all of our custom metrics in CloudWatch.
AWS Console is far from helpful, or at least it's not well signposted. I want to try and relate our CloudWatch bill to actual metrics we have to try and determine where I can make some cuts.
For example:
Our Bill shows 1,600 Metrics charged at $0.30 a piece per month, but I see over 17,000 custom namespaces in the metrics list within the CloudWatch console.
Does anyone know how I can best find this information, or have a nice handy CLI command to view all custom metrics for a region?
I can see the custom namespaces section in CloudWatch, but these don't really marry up to the billing page; they're off by about a factor of 10.
Thank you.
UPDATE:
I think I may have identified why there is a discrepancy between the billing and the list of metrics:
We have builds that create namespaces and metrics, and these are sometimes destroyed within hours.
According to the AWS FAQ on CloudWatch metrics, the metrics that were created linger for 15 days.
The overall monthly figure appears to be a count of all metrics that existed at some point during the month, which is why it is what it is.
However, this still doesn't make the billing breakdown any easier to understand when you're trying to highlight possible outliers in costs.
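For reference, the rough way I count what the console is actually listing, per namespace, is a small boto3 sketch like the one below (the region is just an example). It counts whatever list_metrics currently returns, which still includes metrics that last received data up to roughly two weeks ago, so it won't line up exactly with the billed figure:

import boto3
from collections import Counter

cloudwatch = boto3.client('cloudwatch', region_name='eu-west-1')  # example region

counts = Counter()
for page in cloudwatch.get_paginator('list_metrics').paginate():
    for metric in page['Metrics']:
        # Skip AWS service namespaces so only custom metrics are counted
        if not metric['Namespace'].startswith('AWS/'):
            counts[metric['Namespace']] += 1

for namespace, count in counts.most_common():
    print(f'{namespace}: {count} metrics')
print(f'Total custom metrics currently listed: {sum(counts.values())}')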
I have a requirement to send an email notification whenever no data is getting inserted into my BigQuery table. For this, I am using the Logging and Alerting mechanism, but I am still not able to receive any email. Here are the steps I followed:
I wrote a query in the Logs Explorer as below:
Then I created a metric for those logs with metric type COUNTER, and in the filter section I used the above query.
Then I created a policy in Alerting under Monitoring (screenshot attached). The alerting policy I selected is based on the log-based metric I created before.
And then a trigger as below:
In the notification channel, I added my email address.
Can someone please tell me if I am missing something? My requirement is to receive an alert when no data has been inserted into a BigQuery table for more than a day.
Also, I can see in Metrics Explorer that the metric I created is not ACTIVE. Why is that?
As mentioned in GCP docs:
Metric absence conditions require at least one successful measurement — one that retrieves data — within the maximum duration window after the policy was installed or modified.
For example, suppose you set the duration window in a metric-absence policy to 30 minutes. The condition isn't met if the subsystem that writes metric data has never written a data point. The subsystem needs to output at least one data point and then fail to output additional data points for 30 minutes.
Meaning, you need at least one data point (an insert job) before an incident can be created for the metric being absent.
There are two options:
Create an artificial log entry to get the metric started, so that it has at least one time series and data point (see the sketch below).
Run an insert job that matches the log-based metric you created, to get the metric started.
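For option 1, a minimal sketch using the google-cloud-logging Python client could look like the following; the log name and payload here are placeholders and need to produce an entry that your log-based metric's filter actually matches:

from google.cloud import logging

client = logging.Client()

# Hypothetical log name - use one that your metric's filter matches
logger = client.logger('bigquery_insert_heartbeat')

# Write one artificial entry so the log-based metric gets its first data point
logger.log_struct(
    {'message': 'artificial entry to seed the log-based metric'},
    severity='INFO',
)
print('Seed log entry written')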
With regards to your last question, the metric you created is not active because no data points have been written to it within the previous 24 hours. As mentioned above, the metric must have at least one data point written to it.
Refer to custom metrics quota for more info.
We recently had a huge cost increase (about 8x) on the CloudWatch GetMetricData operation. We have a lot of log groups and different teams on the same AWS account.
Do you know how we could find out which log group the GetMetricData calls relate to?
Thanks.
Unfortunately, there's no easy answer to your question. We had the same issue, where a line on the bill called "GetMetricData API" was getting completely out of control. It's a shame AWS CloudTrail does not log such requests. To discover the root cause, we had to disable every external monitoring tool plugged into this account, one by one, and watch for a dent in the bill. See this article.
AWS does not tie the charges for GetMetricData to specific CloudWatch log groups, so sadly this is not possible to see. The only things you can see on a per-log-group basis are processing bytes and storage. If you believe those could be close proxies, you can query them directly via Cost and Usage Reports, but it may be that ingestion costs are not tied to the querying of metric data at all.
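If it helps, a hedged boto3 Cost Explorer sketch like the one below can at least confirm how much of the CloudWatch bill sits in the GetMetricData usage types versus ingestion and storage; the date range, and the assumption that the service appears as 'AmazonCloudWatch' in your Cost Explorer data, are examples to adjust:

import boto3

ce = boto3.client('ce')  # Cost Explorer

response = ce.get_cost_and_usage(
    TimePeriod={'Start': '2023-01-01', 'End': '2023-02-01'},  # example month
    Granularity='MONTHLY',
    Filter={'Dimensions': {'Key': 'SERVICE', 'Values': ['AmazonCloudWatch']}},
    Metrics=['UnblendedCost'],
    GroupBy=[{'Type': 'DIMENSION', 'Key': 'USAGE_TYPE'}],
)

for result in response['ResultsByTime']:
    for group in result['Groups']:
        usage_type = group['Keys'][0]
        cost = float(group['Metrics']['UnblendedCost']['Amount'])
        if cost > 0:
            print(f'{usage_type}: ${cost:.2f}')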
An alternative hosted solution for seeing all of this data aggregated together is https://www.vantage.sh/, which will query all CloudWatch log groups and show you all the costs it can on a per-log-group basis, but you'll need to enable "Advanced Analytics" from them.
I'm trying to estimate how much GuardDuty is going to cost me per month, and according to https://aws.amazon.com/guardduty/pricing/ I should look at how many CloudTrail logs I produce a month as well as how many GB of VPC flow logs I produce a month.
Using boto3 and S3 I can count how many logs are in my bucket, which tells me how much I am going to spend having GuardDuty read my logs. Now I want to find how many GB of data my VPC flow logs are producing, but I can't figure out where to pull that kind of information from. I want to programmatically see how many GB of VPC flow logs I produce a month to best estimate how much I would spend on GuardDuty.
This code snippet shows how to get the size of the VPC flow logs associated with each network interface in the VPC. You have to modify the script to get the logs for the entire month and sum them.
import boto3

logs = boto3.client('logs')

# List the log groups and identify the VPC flow log group
for group in logs.describe_log_groups()['logGroups']:
    print(group['logGroupName'])

# Get the log streams in 'vpc-flow-logs' and print their stored bytes
for stream in logs.describe_log_streams(logGroupName='vpc-flow-logs')['logStreams']:
    print(stream['logStreamName'], stream['storedBytes'])
describe_log_streams
Lists the log streams for the specified log group. You can list all the log streams or filter the results by prefix. You can also control how the results are ordered.
This operation has a limit of five transactions per second, after which transactions are throttled.
Request Syntax
response = client.describe_log_streams(
logGroupName='string',
logStreamNamePrefix='string',
orderBy='LogStreamName'|'LastEventTime',
descending=True|False,
nextToken='string',
limit=123
)
I currently have a bunch of custom metrics based in multiple regions across our AWS account.
I thought I was going crazy but have now confirmed that the metric I created a while ago is expiring when not used for a certain time period (could be 2 weeks).
Here's my setup.
I create a new metric from my log entries, which has no expiry date;
I then go to the main page on CloudWatch --> then to Metrics to view any metrics (I understand this will only display new metric hits when there are hits that match the metric rule).
About 2 weeks ago, I had 9 Metrics logged under my "Custom Namespaces", and I now have 8 - as if it does not keep all the data:
As far as I'm aware, all my metrics should stay in place (unless I remove them). However, it seems as though if these are not hit consistently, the data "expires". Is that correct? If so, how are you meant to track historical data?
Thanks
CloudWatch will remove metrics from search if there was no new data published for that metric in the last 2 weeks.
This is mentioned in passing in the FAQ for EC2 metrics, but I think it applies to all metrics.
From the 'Will I lose the metrics data if I disable monitoring for an Amazon EC2 instance?' question in the FAQ:
CloudWatch console limits the search of metrics to 2 weeks after a metric is last ingested to ensure that the most up to date instances are shown in your namespace.
Your data is still there, however; it adheres to a different retention policy.
You can still get your data if you know what the metric name is. If you added your metric to a dashboard, it will still be visible there. You can use the CloudWatch PutDashboard API to add the metric to a dashboard, or use the CloudWatch GetMetricStatistics API to get the raw data.
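For example, a minimal boto3 sketch along these lines pulls the raw data for a metric that no longer shows up in the console search; the namespace and metric name are placeholders, and you'll need to pass the same dimensions you originally published with, if any:

import boto3
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client('cloudwatch')

end = datetime.now(timezone.utc)
start = end - timedelta(days=60)   # look back further than the 2-week search window

response = cloudwatch.get_metric_statistics(
    Namespace='MyApp/CustomMetrics',   # placeholder namespace
    MetricName='MyExpiredMetric',      # placeholder metric name
    StartTime=start,
    EndTime=end,
    Period=86400,                      # one datapoint per day
    Statistics=['Sum'],
)

for datapoint in sorted(response['Datapoints'], key=lambda d: d['Timestamp']):
    print(datapoint['Timestamp'], datapoint['Sum'])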