stackdriver monitoring api in a dataflow project

stackdriver monitoring api in a dataflow project - google-cloud-platform

I'm just starting out using Apache Beam on Google Cloud Dataflow. I have a project set up with a billing account. The only things I plan on using this project for are:
1. dataflow - for all data processing
2. pubsub - for exporting stackdriver logs to be consumed by Datadog
Right now, as I write this, I am not currently running any dataflow jobs.
Looking at the past month, I see ~$15 in dataflow costs and ~$18 in Stackdriver Monitor API costs. It looks as though Stackdriver Monitor API is close to a fixed $1.46/day.
I'm curious how to mitigate this. I do not believe I want or need Stackdriver Monitoring. Is it mandatory? Further, while I feel I have nothing running, I see this over the past hour:
So I suppose the questions are these:
1. what are these calls?
2. is it possible to disable Stackdriver Monitoring for dataflow or otherwise mitigate the cost?

Per Yuri's suggestion, I found the culprit, and this is how (thanks to Google Support for walking me through this):
In GCP Cloud Console, navigate to 'APIs & Services' -> Library
Search for 'Strackdriver Monitoring Api' and click
Click 'Manage' on the next screen
Click 'Metrics' from the left-hand side menu
In the 'Select Graphs' dropdown, select "Traffic by Credential" and click 'OK'
This showed me a graph making it clear just about all of my requests were coming from a credential named datadog-metrics-collection, a service account I'd set up previously to collect GCP metrics and emit to Datadog.

Considering the answer posted and question, If we think we do not need Stackdriver monitoring, we can disable stackdriver monitoring API using bellow steps:
From the Cloud Console,go to APIs & Services.
Select Stackdriver Monitoring API.
Click Disable API.
In addition you can view Stackdriver usage by billing account and also can estimate cost using Stackdriver pricing calculator [a] [b].
View Stackdriver usage by billing account:
From anywhere in the Cloud Console, click Navigation menu and select Billing.
If you have more than one billing account, select Go to linked billing account to
view the current project's billing account. To locate a different billing account,
select Manage billing accounts and choose the account for which you'd like to get
usage reports.
Select Reports.
4.Select Group By > SKU. This menu might be hidden; you can access it by clicking Show
Filters.
From the SKUs drop-down list, make the following selections:
Log Volume (Stackdriver Logging usage)
Spans Ingested (Stackdriver Trace usage)
Metric Volume and Monitoring API Requests (Stackdriver Monitoring usage)
Your usage data, filtered by the SKUs you selected, will appear.
You can also select just one or some of these SKUs if you don't want to group your usage data.
Note: If your usage of any of these SKUs is 0, they don't appear in the Group By > SKU pull-down menu. For example, who use only the Cloud console might never generate API requests, so Monitoring API Requests doesn't appear in the list.
Use the Stackdriver pricing calculator [b]:
Add your current or projected Monitoring usage data to the Metrics section and click Add to estimate.
Add your current or projected Logging usage data to the Logs section and click Add to estimate.
Add your current Trace usage data to the Trace spans section and click Add to estimate.
Once you have input your usage data, click Estimate.
Estimates of your future Stackdriver bills appear. You can also Email Estimate or Save Estimate.
[a] https://cloud.google.com/stackdriver/estimating-bills#billing-acct-usage
[b] https://cloud.google.com/products/calculator/#tab=google-stackdriver

Related

Creating Alert Policy in Cloud Monitoring with OpenCensus Metric

I'd like to use an OpenCensus metric in a Cloud Monitoring (Stackdriver) Alert Policy.
When I try to click the Add button, I get This query must contain a resource type. error. The odd thing, is that I can view this metric in MQL and can chart it.
According to the MQL charts that us this metric, the Resource: field is blank, and the charts work fine. The MQL charts show a resource type (on metric hover) of knative_broker, dataflow_job, aws_rds_database, k8s_control_plane_component, aws_lambda_function, and 36 more.
What Resource type should be used to alert on Open Census metrics in Cloud Monitoring alerts?

In Cloud Monitoring, application-specific metrics are typically called “custom metrics”. You can create custom metrics with opencensus. For more detailed information please follow Custom Metrics with OpenCensus. We define a Stackdriver exporter as the goal to create an alert policy on the aggregation of two metrics. Refer Alert policy metric thresholds with stackdriver and opencensus for more information.

Stack Driver Monitoring Costs

Currently I'm getting charged for $0.5 per day per environment. We have 4 environments and all per day it costs $2
We don't have any traffic at all as we are still in development phase.
When I try to disable Stack Monitoring API, it says it will disable few more api's which isn't expected.
I saw that "google.monitoring.v3.MetricService.ListTimeSeries" has a request count of 1M per month in the metrics and I don't have from where this is getting triggered.
I see stack driver monitoring api costs are per 1000 calls and it can easily push my budgets to the edge.
Is it possible to find out from where this is getting triggered?

Finally found out that the issue is with NewRelic monitoring.
I gave access to NewRelic sometime back for all my projects and once I revoke the access, it stopped showing the costs!
It would be great if there is an option on gcp which resource is consuming and what is triggering it.

Source can be tracked this way
'APIs & Services' -> Library
Search for 'Strackdriver Monitoring Api' and click
Click 'Manage' on the next screen
Click 'Metrics' from the left-hand side menu
In the 'Select Graphs' dropdown, select "Traffic by Credential" and click 'OK'
You will either see newrelic or a service account email or a unique id. That unique ID is the unique ID of the service account used.

Receiving alerts for GCP activities

Is it possible to create alerts for configuration activities?
On the dashboard of my GCP project, I'm able to see the history of activities. However, for security reasons, I would like to be able to receive notifications when certain activities happen, e.g. Set IAM policy on project, deleting instance of project, etc. Is this possible?
I have looked into "metric-based alerting policies", but I'm only able to create alerts for uptime checks. Not sure what else to look for.

You are on the right path. You need to create a log-based metric and then to create an alert when the counter cross a threshold (1 for example)

Now a more straightforward solution is available: In one step, you can use log-based alerts. It allows to set alerts on any log type and content. This new feature is on preview and was announced few days ago.

CloudWatch Cost - Data Processing

I'd like to know if possible to discover which resource is behind this cost in my Cost Explorer, grouping by usage type I can see it is Data Processing bytes, but I don't know which resource would be consuming this amount of data.
Have some any idea how to discover it on CloudWatch?

This is almost certainly because something is writing more data to CloudWatch than previous months.
As stated this AWS Support page about unexpected CloudWatch logs bill increases:
Sudden increases in CloudWatch Logs bills are often caused by an
increase in ingested or storage data in a particular log group. Check
data usage using CloudWatch Logs Metrics and review your Amazon Web
Services (AWS) bill to identify the log group responsible for bill
increases.
Your screenshot identifies the large usage type as APS2-DataProcessing-Bytes. I believe that the APS2 part is telling you it's about the ap-southeast-2 region, so start by looking in that region when following the instructions below.
Here's a brief summary of the steps you need to take to find out which log groups are ingesting the most data:
How to check how much data you're ingesting
The IncomingBytes metric shows you how much data is being ingested in your CloudWatch log groups in near-real time. This metric can help you to determine:
Which log group is the highest contributor towards your bill
Whether there's been a spike in the incoming data to your log groups or a gradual increase due to new applications
How much data was pushed in a particular period
To query a small set of log groups:
Open the Amazon CloudWatch console.
In the navigation pane, choose Metrics.
For each of your log groups, select the IncomingBytes metric, and then choose the Graphed metrics tab.
For Statistic, choose Sum.
For Period, choose 30 Days.
Choose the Graph options tab and choose Number.
At the top right of the graph, choose custom, and then choose Absolute. Select a start and end date that corresponds with the last 30 days.
For more details, and for instructions on how to query hundreds of log groups, read the full AWS support article linked above.

Apart from the steps which Gabe mentioned what helped me identify the resource which was creating large number of logs was by:
heading over to Cloudwatch
selecting the region which showed in Cost explorer
Selecting Log Groups
From settings under Log Groups, Enabling column Stored bytes to be visible
This showed me which service was causing a lot of logs to be written to Cloudwatch.

Alert to detect the available space on the hard drive

I would like to know if it is possible to create an alert in a google cloud platform instance, to identify that a hard drive of an instance is 90% busy, for example, and that this sends a notification to some user.
I await your response, and thanks in advance.

You can use Google Stackdrive to setup alerts and have an email sent.
However, disk percentage busy is not an available metric. You can chose from Disk Read I/O and Disk Write I/O bytes per second and set a threshold for the metric.
Go to the Google Console Stackdriver section. Click on Monitoring.
Select Alerting -> Create Policy in the left panel.
Create your alerting policy based upon Conditions and Notifications
You can create custom metrics. This link describes how.
Creating Custom Metrics

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js