Google monitoring selected metric is invalid? - google-cloud-platform

I am literally clicking a metric on our dashboard that has worked for the last 2 months, and it shows a graph WITH data like so (you can see in red that it says the metric is invalid).
So metrics only HALF work. The real issue is that I can no longer see ANY of our company metrics that I don't already have predefined in the dashboard. Is there a workaround for this odd issue?
EDIT: Even more frustrating is how the logs find the metric just fine...
thanks,
Dean

Related

Extract scalar value from text log message as metric in Google Cloud Logging

According to the docs I can define log-based metrics, but I can't seem to find a way to do this. My application logs a message like this:
2022-03-29T10:20:30 [INFO] Some action took 0.23 seconds
What I'm trying to do is extract the 0.23 as a metric that I can put on a dashboard and monitor. How can I go about doing this?
Edit
I mostly solved this at the application level (my hope was that no changes to the code would be needed) by using a logger from the client library (Python in my case) and adding my metric to the log statement via structured logging. From there, I created a user-defined, log-based metric, which I can now monitor, set alarms on, etc. The only catch is that, as mentioned in the comments, this particular metric is a gauge metric type. I'm extracting it as a distribution, which mostly works out of the box, though I lose a negligible amount of precision.
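For anyone who wants to see the idea in code, here is a minimal sketch using only the Python standard library, not the client-library logger from the edit above. It pulls the number out of a message shaped like the example and emits a JSON log line that carries it as its own numeric field. The field name `duration_seconds`, the fixed `INFO` severity, and both helper functions are assumptions made up for illustration. How the log line reaches Cloud Logging and how the log-based metric is pointed at the field are not shown.

```python
import json
import re

# Matches the duration in messages shaped like the example in the question,
# e.g. "... Some action took 0.23 seconds".
DURATION_RE = re.compile(r"took (\d+(?:\.\d+)?) seconds")


def extract_duration(message):
    """Return the duration in seconds as a float, or None if not present."""
    match = DURATION_RE.search(message)
    return float(match.group(1)) if match else None


def to_structured_entry(message):
    """Build a JSON log line that carries the duration as a numeric field."""
    entry = {"message": message, "severity": "INFO"}  # severity is assumed
    duration = extract_duration(message)
    if duration is not None:
        entry["duration_seconds"] = duration  # hypothetical field name
    return json.dumps(entry)


line = "2022-03-29T10:20:30 [INFO] Some action took 0.23 seconds"
print(to_structured_entry(line))
```

Keeping the value as a separate numeric field means the metric does not depend on the exact wording of the message text.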

GCP Stackdriver alert on metric not absent

There are some metrics in GCP's Stackdriver, for example serviceruntime.googleapis.com/quota/exceeded, that appear when there is a problem and are absent (not 0) once the problem is gone.
The problem is I cannot set up alerting to auto-resolve, because when the problem is resolved data for this metric is absent.
How can I set up alerts to auto-resolve with these types of metrics?
Unfortunately, there's no solution at the moment. Google engineers are aware of this problem. You can +1, comment on, and follow the feature request in the Google Public Issue Tracker to be notified of any changes.

Can we configure error reporting in GCP using P99/P95 methodology

We are just configuring error reporting in GCP, which shows a histogram of errors (grouped by type) over time. But I think this data isn't that useful. For the sake of argument, having 100 errors for 100 requests means that our service is broken, but 100 errors for 1,000,000 requests is okayish.
That's why I was thinking of adding alerts/monitoring using P99/P95/P90 methods. I would also like to see if we can configure alerts based on the number of data points, e.g., alert if the error count at P90 is > 5 for 10 minutes.
Is this something that can be done in GCP? I believe this is possible in AWS but not in GCP.
You can configure Stackdriver Monitoring and Alerting for this. Here's a quickstart to help you get this set up.
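The point in the question about raw error counts versus request volume can be made concrete with a small sketch. This is plain Python arithmetic, not a Stackdriver API, and the function names, the 1% threshold, and the sample window are all assumptions chosen for illustration.

```python
def error_ratio(errors, requests):
    """Fraction of requests that failed; 0.0 when there was no traffic."""
    if requests == 0:
        return 0.0
    return errors / requests


def should_alert(ratios, threshold):
    """True only if every sample in the window is above the threshold.

    `ratios` stands in for one error ratio per minute over the alerting
    window (for example, 10 samples for a 10 minute window).
    """
    return len(ratios) > 0 and all(r > threshold for r in ratios)


# The two cases from the question: same error count, very different health.
broken = error_ratio(100, 100)
healthy = error_ratio(100, 1_000_000)
print(broken, healthy)

# Hypothetical 10 minute window with a 1% threshold.
print(should_alert([broken] * 10, threshold=0.01))
print(should_alert([healthy] * 10, threshold=0.01))
```

Alerting on a ratio that stays above a threshold for the whole window is one way to avoid paging on 100 errors out of a million requests.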

Is anybody else experiencing discrepancies between actual calls/renders and what azure wants to bill?

About a month ago we integrated a test report as a dashboard into a web application (Power BI Embedded). This dashboard is only accessible from within that application if you are logged on. We have around 20 tiles on the report and, as far as I understand it, this means 20 renders per call. Since this is just a test and is mainly used to discuss with clients how the real reports will look in the future, we had around 60 calls of that report in the last 30 days, so roughly 1200 renders. Since 1000 are free and we haven't actually rendered over 2k, it should still be more or less free, or maybe the $2.50 for 1000 renders.
But now our free trial Azure account has been deactivated, because Power BI ate up the free €170 starting budget, and the total bill would come to around €600 according to the warning in Azure... By my count that means Azure counted 300000 renders. That seems totally wrong to me.
Do I completely misunderstand the pricing model? Does anybody else have a problem like that?
Yes, I also had this problem. I was basically getting charged per render instead of per thousand renders.
As specializt mentions, these kinds of questions aren't meant for Stack Overflow.
To get the issue resolved, log a support ticket with Microsoft Billing from the Azure portal by selecting the question mark icon, and select New Support Request. Then explain your problem in detail and they should correct the problem and credit your account.
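The estimate in the question can be checked with a few lines of arithmetic. This sketch takes the figures exactly as stated in the question (about 20 tiles, about 60 report calls, 1000 free renders, 2.50 per 1000 renders after that); they are the asker's understanding and are not verified against actual Azure pricing. The function names are made up for illustration. The second function shows what the same usage would cost if the price were applied per render, as the answer above describes. It is not meant as an explanation of the actual bill.

```python
def expected_renders(tiles, report_calls):
    """One render per tile per report call, as assumed in the question."""
    return tiles * report_calls


def cost_per_thousand(renders, free_renders=1000, price=2.50):
    """Cost when the price applies to each block of 1000 billable renders."""
    billable = max(0, renders - free_renders)
    return billable * price / 1000


def cost_per_render(renders, free_renders=1000, price=2.50):
    """Cost if the same price were mistakenly applied to every render."""
    billable = max(0, renders - free_renders)
    return billable * price


renders = expected_renders(tiles=20, report_calls=60)
print(renders)
print(cost_per_thousand(renders))
print(cost_per_render(renders))
```

Having numbers like these at hand makes it easier to explain the discrepancy in the support ticket.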

PubSub service having issue?

It seems like the APIs are timing out when creating subscriptions for already existing topics. My tests that validate this were working until a few minutes ago, but now they have started failing. I am not able to create a subscription using Google's console either.
Google's dashboard shows the service as working, though. Is anyone else seeing this? I am not sure what else I can do to gather more info.
Yes, we had an issue with higher than expected error rates around that time. Users would have seen 504 errors during the outage. As of 1:30PM, the issue has been resolved, and everything should be back to normal. Sorry about the inconvenience.
We do have an SLA. The Cloud Pub/Sub SLA is described here:
https://cloud.google.com/pubsub/sla
If you think you're eligible, you can request a refund from that page.
Sorry again for any inconvenience you may have had.