I can find a graph of "Group size" in the page of the instance group.
However, when I try to find this metric in Stackdriver, it doesn't exist.
I tried looking in the metricDescriptors API, but it doesn't seem to be there either.
Where can I find this metric?
I'm particularly interested in sending alerts when this metric goes to 0.
There is no Stackdriver Monitoring metric for this data yet. You can fetch the size with the instanceGroups.get API call. You could build a system that polls this value and posts it back to Stackdriver Monitoring as a custom metric; you would then be able to chart it and alert on it from Stackdriver.
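As a rough sketch of the "post it back as a custom metric" half of that idea, the function below builds a request body in the shape expected by the Monitoring API v3 projects.timeSeries.create method. The metric type custom.googleapis.com/instance_group/size, the label names, and the use of the "global" monitored resource are assumptions chosen for illustration. Fetching the size (the size field of the instanceGroups.get response) and sending the authenticated HTTP request are left out.

```python
from datetime import datetime, timezone

# Hypothetical custom metric type; the name is an assumption, not an existing metric.
METRIC_TYPE = "custom.googleapis.com/instance_group/size"


def build_time_series(project_id, zone, group_name, size, now=None):
    """Build a JSON-serializable body for projects.timeSeries.create."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    end_time = now.strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "timeSeries": [
            {
                "metric": {
                    "type": METRIC_TYPE,
                    # Label names are illustrative assumptions.
                    "labels": {"instance_group": group_name, "zone": zone},
                },
                "resource": {
                    "type": "global",
                    "labels": {"project_id": project_id},
                },
                "points": [
                    {
                        # A gauge point only needs an end time.
                        "interval": {"endTime": end_time},
                        # int64 values are encoded as strings in JSON.
                        "value": {"int64Value": str(size)},
                    }
                ],
            }
        ]
    }
```

A poller would call this on a schedule (for example once a minute) with the size it just read, and an alerting policy on the custom metric could then fire when the value reaches 0.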
I keep reading that I can write a log query to sample a percentage of logs but I have found zero examples.
https://cloud.google.com/blog/products/gcp/preventing-log-waste-with-stackdriver-logging
You can also choose to sample certain messages so that only a percentage of the messages appear in Stackdriver Logs Viewer
How do I get 10% of all GCE load balancer logs with a log query? I know I can configure this on the backend, but I don't want that. I want to get 100% of logs in stackdriver and create a pub/sub log sink with a log query that only captures 10% of them and sends those sampled logs somewhere else.
I suspect you'll want to create a Pub/Sub sink in the Log Router. See "Configure and manage sinks".
In the Logging query language, you can use the sample function in the sink's inclusion filter so that only a fraction of the matching logs is exported.
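To make that concrete, here is a small sketch that builds such a filter string. The sample(field, fraction) function is part of the Logging query language and picks entries based on a hash of the given field; the helper name sampled_filter and the resource.type value used for load balancer logs are assumptions for illustration.

```python
def sampled_filter(base_filter, fraction, field="insertId"):
    """Return a Logging filter that keeps roughly `fraction` of matching entries."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    # sample() hashes `field`, so the same entry is always in or out of the sample.
    return f"{base_filter} AND sample({field}, {fraction})"


# Assumed resource type for HTTP(S) load balancer logs.
LB_FILTER = sampled_filter('resource.type="http_load_balancer"', 0.1)
```

The resulting string can be pasted into the sink's inclusion filter, while the default log bucket keeps receiving 100% of the logs.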
I have a Dataflow job with a counter metric. On every restart the metric is reset to zero, as expected. The problem is that when using the counter in GCP Metrics Explorer, I cannot get an accumulated value for the metric that disregards restarts. Prometheus has a function called increase() that does this. Is there a similar function in GCP Metrics Explorer?
One approach to tracking metrics across runs would be to use Cloud Monitoring. There is a good how-to on the features and usage of custom metrics.
If you use job names that a regexp can match, you can use filters to aggregate the jobs into a single graph.
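If the raw counter samples are retrieved through the API instead, a reset-aware total can also be computed client-side. The sketch below is a simplified version of what Prometheus's increase() does (it does no extrapolation at the window boundaries) and assumes the counter restarts from zero; the function name is illustrative.

```python
def increase(samples):
    """Total growth of a counter over `samples`, treating any drop as a restart."""
    total = 0
    prev = None
    for value in samples:
        if prev is not None:
            if value >= prev:
                total += value - prev
            else:
                # The counter was reset to zero, so everything counted
                # since the restart is new growth.
                total += value
        prev = value
    return total
```

For example, increase([0, 5, 9, 2, 4, 1]) returns 14: the counter grew by 9, restarted and grew to 4, then restarted again and reached 1.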
I have uptime checks configured for my instances in Google Cloud Stackdriver. Now I need to programmatically check the uptime percentage reported by an uptime check. Is there an API available for this?
I checked the documentation and didn't find any API to do so.
There appear to be a couple of metrics of interest:
monitoring.googleapis.com/uptime_check/check_passed - True if check passed
monitoring.googleapis.com/uptime_check/request_latency - Latency in msecs
See also: Creating Uptime Charts
One can use these metrics for charts and alerts. In addition, since all metric data is retrievable as time series through APIs (at least the REST API), you can periodically retrieve the data and perform calculations on it.
Also, distinguish these from the metric called compute.googleapis.com/instance/uptime, which measures how long a VM has been running.
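As an illustration of the "perform calculations" step, the sketch below turns a list of check_passed values into an uptime percentage. It assumes the boolean points have already been fetched from the time-series API for the window of interest; the function name is illustrative.

```python
def uptime_percentage(check_results):
    """Percentage of uptime check results that passed."""
    results = list(check_results)
    if not results:
        raise ValueError("no data points in the selected window")
    passed = sum(1 for ok in results if ok)
    return 100.0 * passed / len(results)
```

With four data points of which one failed, uptime_percentage([True, True, False, True]) gives 75.0.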
I would like to know whether it is possible to create an alert for a Google Cloud Platform instance that detects when one of the instance's hard drives is, for example, 90% busy, and sends a notification to a user.
I await your response, and thanks in advance.
You can use Google Stackdriver to set up alerts and have an email sent.
However, disk percentage busy is not an available metric. You can choose from Disk Read I/O and Disk Write I/O bytes per second and set a threshold for the metric.
Go to the Google Console Stackdriver section. Click on Monitoring.
Select Alerting -> Create Policy in the left panel.
Create your alerting policy based upon Conditions and Notifications
You can create custom metrics. This link describes how.
Creating Custom Metrics
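If "busy" is taken to mean "full" (an assumption; the question could also mean I/O utilization), the value for such a custom metric could be computed on the instance with the standard library. This is only a sketch of the measurement side; the function names are illustrative, and writing the value to a custom metric is not shown.

```python
import shutil


def used_percent(total_bytes, used_bytes):
    """Percentage of the disk that is in use."""
    return 100.0 * used_bytes / total_bytes


def disk_used_percent(path="/"):
    """Measure how full the filesystem containing `path` is."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return used_percent(usage.total, usage.used)


def is_over_threshold(percent, threshold=90.0):
    """True when the disk is at or above the alerting threshold."""
    return percent >= threshold
```

An agent or cron job on the VM could report disk_used_percent() periodically, and an alerting policy with a 90% threshold on the resulting custom metric would send the notification.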
I want to create Stackdriver metrics based on the IP and the frequency of requests that an IP makes.
Therefore I would like to group my load balancer logs by IP (the IP address of the requesting client) and, if the number of requests exceeds a threshold, send a notification.
Edit:
A workaround to achieve this.
Go to Stackdriver Logging and create a User-defined Metric that counts the total requests.
Fire an alarm when requests exceed a threshold.
The alarm calls a lambda function that creates a sink from Stackdriver to BigQuery.
Run queries in BigQuery to find the IP that is causing the trouble.
In Stackdriver Logging, create a User-defined Metric (myMetric) [1] filtered on the desired IP address.
In Stackdriver Monitoring, locate myMetric under its resource type and metric to create the chart.
[1] https://cloud.google.com/logging/docs/logs-based-metrics/
There is no out-of-the-box solution, but there is a workaround using BigQuery:
Go to Stackdriver Logging and create a User-defined Metric that counts the total requests.
Fire an alarm when requests exceed a threshold.
The alarm calls a lambda function that creates a sink from Stackdriver to BigQuery.
Run queries in BigQuery to find the IP that is causing the trouble.
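The last step amounts to grouping requests by client IP and keeping the IPs above a threshold. The sketch below shows that logic on exported log entries represented as Python dicts, assuming each entry carries the client address in httpRequest.remoteIp; the function names and the threshold are illustrative, and in BigQuery the same thing would be a GROUP BY query.

```python
from collections import Counter


def requests_per_ip(entries):
    """Count log entries per client IP (httpRequest.remoteIp)."""
    counts = Counter()
    for entry in entries:
        ip = entry.get("httpRequest", {}).get("remoteIp")
        if ip:  # skip entries without a client address
            counts[ip] += 1
    return counts


def offenders(entries, threshold):
    """Return {ip: count} for every IP with more than `threshold` requests."""
    return {ip: n for ip, n in requests_per_ip(entries).items() if n > threshold}
```

For instance, with three entries from 10.0.0.1 and one from 10.0.0.2, offenders(entries, 2) returns {"10.0.0.1": 3}.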