Having created log-based metrics in the Cloud Console, I now want to create alerts so that every time there is a new matching log entry, the alert triggers.
In trying to create a suitable condition, the most likely-looking options seem to be threshold or rate of change, but I don't think either will work for a policy of 1 log message => 1 alert.
Help appreciated.
Yes, today the only way to alert on a log message is to create a threshold condition on the log-based metric with a very small threshold (0.001) and a duration of 1 minute.
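For reference, a minimal sketch of that condition with the google-cloud-monitoring Python client might look like this; the project ID and metric name are placeholders, and the resource.type in the filter must match whatever resource your logs are written against:

```python
# Sketch: threshold alert on a log-based metric with a tiny threshold,
# so any matching entry in a minute trips the condition.
# Assumes a log-based counter metric named "my-log-metric" already exists.
from google.cloud import monitoring_v3

project_id = "my-project"  # placeholder

client = monitoring_v3.AlertPolicyServiceClient()

policy = monitoring_v3.AlertPolicy(
    display_name="Alert on matching log entries (approximation)",
    combiner=monitoring_v3.AlertPolicy.ConditionCombinerType.OR,
    conditions=[
        monitoring_v3.AlertPolicy.Condition(
            display_name="Log-based metric above ~0",
            condition_threshold=monitoring_v3.AlertPolicy.Condition.MetricThreshold(
                # Log-based metrics are exposed under logging.googleapis.com/user/.
                filter=(
                    'metric.type="logging.googleapis.com/user/my-log-metric" '
                    'AND resource.type="global"'  # adjust to your log source
                ),
                comparison=monitoring_v3.ComparisonType.COMPARISON_GT,
                threshold_value=0.001,      # the "very small threshold"
                duration={"seconds": 60},   # the 1-minute duration
            ),
        )
    ],
)

created = client.create_alert_policy(
    name=f"projects/{project_id}", alert_policy=policy
)
print("Created:", created.name)
```

Because the metric is a counter, any matching entry pushes the value above 0.001 for that minute, which approximates one notification per burst of matching entries rather than strictly one alert per entry.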
Thanks for using Stackdriver.
You can use another alert-triggering tool (like PagerDuty) that is pinged by the emails Stackdriver sends. PagerDuty can filter out all emails that have the word RESOLVE in their subject; in this case they can simply be thrown away if you'd like to avoid auto-resolving. Of course, the Stackdriver and PagerDuty alerts will diverge from each other (their states will be inconsistent), but you should consider PagerDuty the single source of truth in this setup. It could be a possible workaround.
With log-based alerts, you can create alerts directly from the logs; an incident will be created for each matching entry.
https://cloud.google.com/blog/products/operations/create-logs-alerts-preview
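If you'd rather set this up programmatically, here is a hedged sketch of such a log-match policy via the Python client; the filter, display names, and project are placeholders:

```python
# Sketch: a log-match alert policy (the "log-based alerts" feature),
# which opens an incident per matching entry without a log-based metric.
from google.cloud import monitoring_v3

client = monitoring_v3.AlertPolicyServiceClient()

policy = monitoring_v3.AlertPolicy(
    display_name="Log-based alert on matching entries",
    combiner=monitoring_v3.AlertPolicy.ConditionCombinerType.OR,
    conditions=[
        monitoring_v3.AlertPolicy.Condition(
            display_name="Matching log entry seen",
            condition_matched_log=monitoring_v3.AlertPolicy.Condition.LogMatch(
                filter='severity>=ERROR AND textPayload:"my marker string"'  # placeholder
            ),
        )
    ],
    # Log-match conditions require a notification rate limit.
    alert_strategy=monitoring_v3.AlertPolicy.AlertStrategy(
        notification_rate_limit=monitoring_v3.AlertPolicy.AlertStrategy.NotificationRateLimit(
            period={"seconds": 300}  # at most one notification per 5 minutes
        ),
        auto_close={"seconds": 1800},
    ),
)

client.create_alert_policy(name="projects/my-project", alert_policy=policy)
```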
I have a requirement to send an email notification whenever no data is getting inserted into my BigQuery table. For this, I am using the Logging and Alerting mechanism, but I am still not able to receive any email. Here are the steps I followed:
I wrote a query in Logs Explorer as below:
Then I created a metric for those logs with metric type COUNTER and, in the filter section, the query above.
Next I created a policy under ALERTING in the MONITORING section; the screenshot is attached. The alerting policy I selected is based on the log-based metric I created before.
And then a trigger as below:
And in the notification channel, I added my email ID.
Can someone please tell me if I am missing something? My requirement is to receive an alert when no data has been inserted into a BigQuery table for more than a day.
Also, I can see in Metrics Explorer that the metric I created is not ACTIVE. Why is that?
As mentioned in GCP docs:
Metric absence conditions require at least one successful measurement — one that retrieves data — within the maximum duration window after the policy was installed or modified.
For example, suppose you set the duration window in a metric-absence policy to 30 minutes. The condition isn't met if the subsystem that writes metric data has never written a data point. The subsystem needs to output at least one data point and then fail to output additional data points for 30 minutes.
Meaning, the metric needs at least one data point (one insert job) before an incident can be created for the metric going absent.
There are two options:
Create an artificial log entry that matches the filter, to get the metric started with at least one time series and data point (see the sketch after this list).
Run an insert job that matches the log-based metric you created, to get the metric started.
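A minimal sketch of the first option with the google-cloud-logging Python client; the log name and payload are hypothetical and must be adjusted so the entry satisfies the exact filter your metric uses:

```python
# Sketch: write an artificial log entry that matches the metric's filter
# so the metric has at least one data point (seeding the time series).
from google.cloud import logging

client = logging.Client()
logger = client.logger("bigquery-insert-heartbeat")  # hypothetical log name

# This entry must satisfy the same filter used by the log-based metric,
# otherwise it will not seed the metric.
logger.log_text("Artificial insert-job entry to seed the metric", severity="INFO")
```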
With regards to your last question: the metric you created is not active because no data points have been written to it within the previous 24 hours. As mentioned above, the metric must have at least one data point written to it.
Refer to custom metrics quota for more info.
I went through the docs, but I couldn't find a way of sending an alert based on my Cloud Function logs. Is this possible?
What I want to do is trigger an alert when I get a specific log entry.
Like, every time I have this:
it triggers an alert in my GCP project. Is that possible?
Create a log-based metric from an advanced filter in the Logs viewer, then create an alert from that log-based metric. To create the alert, click the three-dot menu to the right of the metric name; there you will see the Create alert option.
As an alternative to log-based metrics, log-based alerts are now also supported. They let you receive a notification whenever a specific message appears in your logs.
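For example, an advanced filter along these lines could back either approach (the function name is a placeholder):

resource.type="cloud_function"
resource.labels.function_name="my-function"
severity>=ERROR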
I'm trying to setup notifications to be sent from our AWS Lambda instance to a Slack channel. I'm following along in this guide:
https://medium.com/analytics-vidhya/generate-slack-notifications-for-aws-cloudwatch-alarms-e46b68540133
I get stuck on step 4, however, because the type of alarm I want to set up does not involve thresholds or anomalies. It involves a specific error in our code. We want to be notified when users encounter errors when attempting to log in or sign up. We have try/catch blocks in our Node.js backend that log errors to CloudWatch at various points in the login/signup flow where we think the errors are most likely happening. We would like to identify when those SPECIFIC errors occur and send a notification to a Slack channel built for this purpose.
So in step 4 of the article, what would I have to do to set this up? Or is the approach in this article simply the wrong one for my purposes?
Thanks.
The step 4 titled "Create a CloudWatch Alarm" uses the CPUUtilization metric to trigger an alarm.
In your case, since you want to use CloudWatch Logs, you would create CloudWatch metric filters based on the log entries of interest. This produces custom metrics based on your error string. Subsequently, you would create a CloudWatch alarm on this metric, as shown in the linked tutorial for CPUUtilization.
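A rough sketch of those two calls with boto3, where the log group, filter pattern, and SNS topic ARN are placeholders (the SNS topic would be the one wired to the Slack Lambda from the article):

```python
# Sketch: metric filter on a log group + alarm on the resulting custom metric.
import boto3

logs = boto3.client("logs")
cloudwatch = boto3.client("cloudwatch")

# Turn matching log lines into a custom metric.
logs.put_metric_filter(
    logGroupName="/aws/lambda/my-backend",  # placeholder
    filterName="login-signup-errors",
    filterPattern='"LoginError"',           # match your specific error string
    metricTransformations=[
        {
            "metricName": "LoginSignupErrors",
            "metricNamespace": "MyApp",
            "metricValue": "1",
        }
    ],
)

# Alarm whenever at least one matching error occurs in a 5-minute period.
cloudwatch.put_metric_alarm(
    AlarmName="login-signup-errors",
    Namespace="MyApp",
    MetricName="LoginSignupErrors",
    Statistic="Sum",
    Period=300,
    EvaluationPeriods=1,
    Threshold=1,
    ComparisonOperator="GreaterThanOrEqualToThreshold",
    TreatMissingData="notBreaching",  # no errors at all should not alarm
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:slack-alerts"],  # placeholder
)
```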
Is there a way in Stackdriver to create an alert based on the absence of a specific line in the logs for a specific timeframe (say 1 hour)?
I am trying to have a way to monitor (and be notified) whether a GKE CronJob did not run in the last hour. (I was not able to come up with any other way of achieving this.)
You can create a log-based metric for a specific log entry following the steps here. Once that is created, you can create an alert based off of that log-based metric following the instructions here.
You could configure the alert to trigger when the metric is below a certain threshold for a certain amount of time; however, you cannot define a time frame during which the alert policy runs. The alert policy will run until it is deleted.
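As a sketch of the general shape (not a definitive implementation), a metric-absence condition, like the one discussed in the BigQuery thread above, maps naturally to "no matching line for an hour"; the metric and project names here are placeholders:

```python
# Sketch: metric-absence condition that fires when the log-based metric
# "cronjob-ran" receives no data for one hour.
from google.cloud import monitoring_v3

client = monitoring_v3.AlertPolicyServiceClient()

policy = monitoring_v3.AlertPolicy(
    display_name="CronJob did not run in the last hour",
    combiner=monitoring_v3.AlertPolicy.ConditionCombinerType.OR,
    conditions=[
        monitoring_v3.AlertPolicy.Condition(
            display_name="No cronjob log entry for 1h",
            condition_absent=monitoring_v3.AlertPolicy.Condition.MetricAbsence(
                filter='metric.type="logging.googleapis.com/user/cronjob-ran"',
                duration={"seconds": 3600},  # the 1-hour window
            ),
        )
    ],
)

client.create_alert_policy(name="projects/my-project", alert_policy=policy)
```

Note the caveat from the BigQuery thread above: an absence condition only fires after the metric has produced at least one data point.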
I have scheduled an HTTP-call-type job using Google Cloud Scheduler. How do I send out an email alert if the job fails?
I have read the Cloud Scheduler documentation and Googled around, but the answer is not obvious. I also attempted a Stackdriver alerting policy, but I can't find the corresponding metric for the failed log entry.
I expect that an email notification can be configured to be sent out if the scheduled job fails.
One way to handle this is to create a new log-based metric with this filter:
resource.type="cloud_scheduler_job" severity != INFO
Then you can create an alert based on this new metric.
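A minimal sketch of creating that metric with the google-cloud-logging Python client (the metric name is a placeholder):

```python
# Sketch: log-based counter metric for failed Cloud Scheduler jobs,
# using the filter given above.
from google.cloud import logging

client = logging.Client()
metric = client.metric(
    "scheduler-job-failures",  # hypothetical metric name
    filter_='resource.type="cloud_scheduler_job" severity != INFO',
    description="Counts non-INFO (i.e. failed) Cloud Scheduler job entries",
)
metric.create()
```

The resulting metric appears under logging.googleapis.com/user/scheduler-job-failures, which is what the alerting policy's condition filter should reference.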
I used a workaround to solve my own problem.
Since my Cloud Scheduler job makes an HTTP call to my Cloud Function,
I used Stackdriver to create an alert that monitors my function executions with status code != ok. Any time the function fails, an email alert is sent to my inbox.
For the time being, this solves my problem.
Nevertheless, perhaps Cloud Scheduler could provide such an enhancement, sending alerts as part of the job configuration.
Thank you.
You can use log-based metrics in Stackdriver along with an email notification channel to get notified when your job fails.
October 2022: You no longer need to create a metric for this. You can skip that step and create an alert directly from Logs Explorer after entering the query already described:
resource.type="cloud_scheduler_job" severity != INFO