Stackdriver Monitoring Costs - google-cloud-platform

Currently I'm getting charged $0.50 per day per environment. We have 4 environments, so altogether it costs $2 per day.
We don't have any traffic at all, as we are still in the development phase.
When I try to disable the Stackdriver Monitoring API, it warns that it will disable a few more APIs, which isn't expected.
I saw in the metrics that "google.monitoring.v3.MetricService.ListTimeSeries" has a request count of 1M per month, and I don't know where these calls are being triggered from.
I see the Stackdriver Monitoring API is billed per 1,000 calls, which can easily push my budget to the edge.
Is it possible to find out where these calls are coming from?

Finally found out that the issue was with New Relic monitoring.
I gave New Relic access to all my projects some time back, and once I revoked that access, the costs stopped showing up!
It would be great if GCP had an option to show which resource is consuming an API and what is triggering it.

The source can be tracked this way:
'APIs & Services' -> Library
Search for 'Stackdriver Monitoring API' and click it
Click 'Manage' on the next screen
Click 'Metrics' in the left-hand side menu
In the 'Select Graphs' dropdown, select "Traffic by Credential" and click 'OK'
You will see either newrelic, a service account email, or a unique ID. That unique ID is the unique ID of the service account making the calls.
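If the graph shows only a numeric ID, you can map it back to a service account by listing the project's service accounts together with their unique IDs. A minimal sketch using the IAM API's Python client follows; my-project is a placeholder for your project ID:

from googleapiclient import discovery

# Build an IAM API client using application-default credentials.
iam = discovery.build("iam", "v1")

# List every service account in the project with its unique ID, so the
# ID shown under "Traffic by Credential" can be matched to an email.
request = iam.projects().serviceAccounts().list(name="projects/my-project")
while request is not None:
    response = request.execute()
    for account in response.get("accounts", []):
        print(account["uniqueId"], account["email"])
    request = iam.projects().serviceAccounts().list_next(request, response)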

Related

Getting alerts for a TCP LB whenever there's an unhealthy target?

I am currently using an unmanaged TCP Load Balancer that has 3 target VMs, and to respond quickly I need an alert whenever fewer than 3 of the 3 VMs are healthy.
Is there a way to get alerts about this through email, Slack, or PagerDuty in GCP?
It's possible to create an alert that will notify you when one of the instances in your group stops working properly.
Go to your unmanaged instance group's details page and switch to the "Monitoring" tab.
Click on Create alerting policy and you will see another panel.
At the bottom of this screen, change the Condition to "is below" and the Threshold to 3.
You will find yourself at the Policy creation page.
Click Next and select the desired notification channel; if you don't see any available, click Manage notification channels and create one. It can be email, SMS, Slack, and many others.
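For reference, the same kind of policy can also be created programmatically. Below is a minimal sketch with a recent google-cloud-monitoring Python client; my-project and my-group are placeholders, and the metric filter is only an assumption, since the exact metric type the console uses depends on your group and health check setup:

from google.cloud import monitoring_v3

client = monitoring_v3.AlertPolicyServiceClient()

# Placeholder filter: substitute whatever filter the console's
# Monitoring tab generates for your instance group.
condition = monitoring_v3.AlertPolicy.Condition(
    display_name="Healthy VMs below 3",
    condition_threshold=monitoring_v3.AlertPolicy.Condition.MetricThreshold(
        filter='metric.type="compute.googleapis.com/instance_group/size" '
               'AND resource.labels.instance_group_name="my-group"',
        comparison=monitoring_v3.ComparisonType.COMPARISON_LT,
        threshold_value=3,
        duration={"seconds": 60},
    ),
)

policy = monitoring_v3.AlertPolicy(
    display_name="Instance group health",
    combiner=monitoring_v3.AlertPolicy.ConditionCombinerType.AND,
    conditions=[condition],
)

created = client.create_alert_policy(
    name="projects/my-project", alert_policy=policy
)
print(created.name)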
Another approach is to create an alert triggered by logs.
First you need to create a health check (and enable logging). Then go to your load balancer's settings and edit your backend service; there, select the health check you created.
Then go to Logs Explorer and select your instance group as the log resource. You will see something like this in the query editor:
resource.type="gce_instance_group"
resource.labels.instance_group_id="3863333883516335882"
resource.labels.instance_group_name="hc-group-1"
Then add this line at the bottom:
jsonPayload.healthCheckProbeResult.healthState="UNHEALTHY"
Click "Run Query", which should return a few log entries that can be used to trigger an alert.
When you see the logs, click on Actions and select "Create log alert".
You will see a window that lets you name the alert and select a proper channel for notifications. I've just tested it (a group of 2 VMs; switching off either one triggered an alert) in the form of an email.
Lastly, depending on the service you're running, you can monitor many different things (in my case it was an HTTP reply on port 80).
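As an aside, if you want to sanity-check the log filter outside the console before creating the alert, here's a sketch with the google-cloud-logging Python client; the project ID and group name are placeholders:

from google.cloud import logging

client = logging.Client(project="my-project")  # placeholder project ID

# The same filter as in the Logs Explorer query above.
log_filter = (
    'resource.type="gce_instance_group" '
    'resource.labels.instance_group_name="hc-group-1" '
    'jsonPayload.healthCheckProbeResult.healthState="UNHEALTHY"'
)

# Print the most recent UNHEALTHY probe results.
for entry in client.list_entries(
    filter_=log_filter, order_by=logging.DESCENDING, max_results=10
):
    print(entry.timestamp, entry.payload)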

CloudWatch Cost - Data Processing

I'd like to know if it's possible to discover which resource is behind this cost in my Cost Explorer. Grouping by usage type, I can see it is Data Processing bytes, but I don't know which resource would be consuming this amount of data.
Do you have any idea how to discover it in CloudWatch?
This is almost certainly because something is writing more data to CloudWatch than in previous months.
As stated in this AWS Support page about unexpected CloudWatch Logs bill increases:
Sudden increases in CloudWatch Logs bills are often caused by an increase in ingested or storage data in a particular log group. Check data usage using CloudWatch Logs Metrics and review your Amazon Web Services (AWS) bill to identify the log group responsible for bill increases.
Your screenshot identifies the large usage type as APS2-DataProcessing-Bytes. I believe the APS2 part tells you this is about the ap-southeast-2 region, so start by looking in that region when following the instructions below.
Here's a brief summary of the steps you need to take to find out which log groups are ingesting the most data:
How to check how much data you're ingesting
The IncomingBytes metric shows you how much data is being ingested in your CloudWatch log groups in near-real time. This metric can help you to determine:
Which log group is the highest contributor towards your bill
Whether there's been a spike in the incoming data to your log groups or a gradual increase due to new applications
How much data was pushed in a particular period
To query a small set of log groups:
Open the Amazon CloudWatch console.
In the navigation pane, choose Metrics.
For each of your log groups, select the IncomingBytes metric, and then choose the Graphed metrics tab.
For Statistic, choose Sum.
For Period, choose 30 Days.
Choose the Graph options tab and choose Number.
At the top right of the graph, choose custom, and then choose Absolute. Select a start and end date that corresponds with the last 30 days.
For more details, and for instructions on how to query hundreds of log groups, read the full AWS support article linked above.
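If you have too many log groups to click through, the same check can be scripted. Here's a rough sketch with boto3, summing IncomingBytes per log group over the last 30 days; the region is hard-coded to ap-southeast-2 to match the APS2 usage type, so adjust as needed:

import boto3
from datetime import datetime, timedelta, timezone

logs = boto3.client("logs", region_name="ap-southeast-2")
cloudwatch = boto3.client("cloudwatch", region_name="ap-southeast-2")

end = datetime.now(timezone.utc)
start = end - timedelta(days=30)

# Sum IncomingBytes for every log group over the last 30 days.
usage = []
for page in logs.get_paginator("describe_log_groups").paginate():
    for group in page["logGroups"]:
        stats = cloudwatch.get_metric_statistics(
            Namespace="AWS/Logs",
            MetricName="IncomingBytes",
            Dimensions=[{"Name": "LogGroupName", "Value": group["logGroupName"]}],
            StartTime=start,
            EndTime=end,
            Period=30 * 24 * 3600,  # one data point covering the whole window
            Statistics=["Sum"],
        )
        total = sum(dp["Sum"] for dp in stats["Datapoints"])
        usage.append((total, group["logGroupName"]))

# Print the ten biggest ingesters.
for total, name in sorted(usage, reverse=True)[:10]:
    print(f"{total / 1e9:8.2f} GB  {name}")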
Apart from the steps Gabe mentioned, what helped me identify the resource that was creating a large number of logs was:
heading over to CloudWatch
selecting the region shown in Cost Explorer
selecting Log Groups
from the settings under Log Groups, enabling the Stored bytes column
This showed me which service was causing a lot of logs to be written to CloudWatch.
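The same stored-bytes figure is available from the API, which can be handy when there are many groups. A small boto3 sketch, again assuming the ap-southeast-2 region:

import boto3

logs = boto3.client("logs", region_name="ap-southeast-2")

# describe_log_groups returns a storedBytes figure per group, the same
# number the console's "Stored bytes" column shows.
groups = []
for page in logs.get_paginator("describe_log_groups").paginate():
    groups.extend(page["logGroups"])

# Print the ten largest groups by stored bytes.
for group in sorted(groups, key=lambda g: g.get("storedBytes", 0), reverse=True)[:10]:
    print(f"{group.get('storedBytes', 0) / 1e9:8.2f} GB  {group['logGroupName']}")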

Annoying! CloudWatch Dashboards go to the home page every few minutes

We've built dashboards for service monitoring using AWS CloudWatch and Logs Insights. Everything looks great from a reporting perspective. However, something very annoying happens on the screen where we want to constantly display service performance. Our setup is:
We use AWS STS/Assume Role from Identities account to login to our Development and Production accounts
CloudWatch Dashboards are on Production accounts
We have the below problems, which we are looking to solve immediately:
The STS token expires every 12 hours (max). Is there any way we can keep the sessions running for more than 12 hours? We don't want to be logging onto every service monitoring machine every morning.
Every few minutes, CloudWatch exits the displayed dashboard and lands on the CloudWatch home page on the monitoring screen.
How do we get rid of the Alarms by Service and Recent Alarms widgets on the CloudWatch home page?
I referred to this thread on the AWS Forums, but it has had no posts or resolutions for many months :-(
Thanks in advance!!
There isn't really much you can do here until AWS decides to change how that works. I'm with you; it's really annoying and makes the dashboards far less valuable. I did learn one trick, however: when you end up back on the home page, you can use your browser's back button and you'll most often be back where you were. That at least saves you from having to re-enter things.

Stackdriver Monitoring API in a Dataflow project

I'm just starting out using Apache Beam on Google Cloud Dataflow. I have a project set up with a billing account. The only things I plan on using this project for are:
1. Dataflow - for all data processing
2. Pub/Sub - for exporting Stackdriver logs to be consumed by Datadog
Right now, as I write this, I am not currently running any dataflow jobs.
Looking at the past month, I see ~$15 in Dataflow costs and ~$18 in Stackdriver Monitoring API costs. It looks as though the Stackdriver Monitoring API cost is close to a fixed $1.46/day.
I'm curious how to mitigate this. I do not believe I want or need Stackdriver Monitoring. Is it mandatory? Further, while I believe I have nothing running, I still see API traffic over the past hour.
So I suppose the questions are these:
1. what are these calls?
2. is it possible to disable Stackdriver Monitoring for dataflow or otherwise mitigate the cost?
Per Yuri's suggestion, I found the culprit, and this is how (thanks to Google Support for walking me through this):
In GCP Cloud Console, navigate to 'APIs & Services' -> Library
Search for 'Stackdriver Monitoring API' and click it
Click 'Manage' on the next screen
Click 'Metrics' from the left-hand side menu
In the 'Select Graphs' dropdown, select "Traffic by Credential" and click 'OK'
This showed me a graph making it clear just about all of my requests were coming from a credential named datadog-metrics-collection, a service account I'd set up previously to collect GCP metrics and emit to Datadog.
Considering the answer posted and the question: if we think we do not need Stackdriver Monitoring, we can disable the Stackdriver Monitoring API using the steps below:
From the Cloud Console, go to APIs & Services.
Select Stackdriver Monitoring API.
Click Disable API.
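If you'd rather script this, the Service Usage API exposes the same disable operation. A minimal sketch with the Python API client; my-project is a placeholder, and the same dependent-services warning from the console applies:

from googleapiclient import discovery

serviceusage = discovery.build("serviceusage", "v1")

# Disable the Monitoring API for the project. Setting
# disableDependentServices to True would also disable services that
# depend on it, mirroring the warning shown in the console.
operation = serviceusage.services().disable(
    name="projects/my-project/services/monitoring.googleapis.com",
    body={"disableDependentServices": False},
).execute()
print(operation)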
In addition, you can view Stackdriver usage by billing account and estimate costs using the Stackdriver pricing calculator [a] [b].
View Stackdriver usage by billing account:
From anywhere in the Cloud Console, click Navigation menu and select Billing.
If you have more than one billing account, select Go to linked billing account to view the current project's billing account. To locate a different billing account, select Manage billing accounts and choose the account for which you'd like to get usage reports.
Select Reports.
Select Group By > SKU. This menu might be hidden; you can access it by clicking Show Filters.
From the SKUs drop-down list, make the following selections:
Log Volume (Stackdriver Logging usage)
Spans Ingested (Stackdriver Trace usage)
Metric Volume and Monitoring API Requests (Stackdriver Monitoring usage)
Your usage data, filtered by the SKUs you selected, will appear.
You can also select just one or some of these SKUs if you don't want to group your usage data.
Note: If your usage of any of these SKUs is 0, they don't appear in the Group By > SKU pull-down menu. For example, users who use only the Cloud Console might never generate API requests, so Monitoring API Requests doesn't appear in the list.
Use the Stackdriver pricing calculator [b]:
Add your current or projected Monitoring usage data to the Metrics section and click Add to estimate.
Add your current or projected Logging usage data to the Logs section and click Add to estimate.
Add your current Trace usage data to the Trace spans section and click Add to estimate.
Once you have input your usage data, click Estimate.
Estimates of your future Stackdriver bills appear. You can also Email Estimate or Save Estimate.
[a] https://cloud.google.com/stackdriver/estimating-bills#billing-acct-usage
[b] https://cloud.google.com/products/calculator/#tab=google-stackdriver

Alert to detect the available space on the hard drive

I would like to know if it is possible to create an alert on a Google Cloud Platform instance to detect that an instance's hard drive is, for example, 90% full, and have it send a notification to a user.
I await your response, and thanks in advance.
You can use Google Stackdriver to set up alerts and have an email sent.
However, disk usage percentage is not an available built-in metric. You can choose from Disk Read I/O and Disk Write I/O (bytes per second) and set a threshold on the metric.
Go to the Stackdriver section of the Google Console and click on Monitoring.
Select Alerting -> Create Policy in the left panel.
Create your alerting policy based on Conditions and Notifications.
You can also create custom metrics; this link describes how:
Creating Custom Metrics
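As a concrete illustration of the custom-metric route, here's a sketch that samples how full the root filesystem is and writes it to Cloud Monitoring as a custom metric, which an alerting policy can then watch. The metric type and project ID are made up for this example:

import shutil
import time

from google.cloud import monitoring_v3

client = monitoring_v3.MetricServiceClient()
project_name = "projects/my-project"  # placeholder project ID

# Sample how full the root filesystem is.
usage = shutil.disk_usage("/")
percent_used = usage.used / usage.total * 100

# Write the sample as a custom metric; the metric type below is
# made up for this example.
series = monitoring_v3.TimeSeries()
series.metric.type = "custom.googleapis.com/disk/percent_used"
series.resource.type = "global"
series.resource.labels["project_id"] = "my-project"
point = monitoring_v3.Point(
    {
        "interval": {"end_time": {"seconds": int(time.time())}},
        "value": {"double_value": percent_used},
    }
)
series.points = [point]

client.create_time_series(name=project_name, time_series=[series])

An alerting policy with a threshold condition on custom.googleapis.com/disk/percent_used above 90 then gives you the 90% notification described in the question.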