Restore metrics after Google Cloud SQL (Postgres) crash

A couple of days ago our Google Cloud SQL instance "crashed", or at least stopped responding. It has since recovered and works again, and Query Insights and so on are functional.
However, most metrics, such as CPU utilization, storage usage, and memory usage, are currently not available. I thought those would recover automatically as well, but after two days I wonder whether something needs to be done manually.
Is there something I can do other than restarting the database (which would only be my last resort)?
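For context, the instance itself can be sanity-checked with the gcloud CLI; a minimal sketch, where INSTANCE_NAME is a placeholder for the real instance name:

    # Confirm the instance reports RUNNABLE after the crash;
    # INSTANCE_NAME is a placeholder.
    gcloud sql instances describe INSTANCE_NAME --format="value(state)"

    # List recent operations (restarts, failovers, backups) around the crash window.
    gcloud sql operations list --instance=INSTANCE_NAME --limit=10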

Okay, after waiting around three days, the metrics are working again.

Related

GCP Cloud SQL PITR, how long can it take?

We had sort of a disastrous event with a database at work (managed on Cloud SQL; it's a MySQL db), and thankfully we have point-in-time recovery enabled.
We went ahead and cloned the production database to a certain point in time to be able to recover the data from just before the disaster, but the operation has now been running for more than 5 hours for a 69 GB database (the size advertised in the GCP Cloud SQL panel, so the real size of the DB is probably less than that).
Does anyone have experience with this?
The status of the operation when querying it with the gcloud CLI says "RUNNING". We checked the logs of the instance to see if anything was off, but there just aren't any logs at all.
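For reference, this is roughly how the operation status can be polled with the gcloud CLI; a sketch, with TARGET_INSTANCE and OPERATION_ID as placeholders:

    # List operations on the clone target to find the operation ID;
    # TARGET_INSTANCE is a placeholder for the cloned instance name.
    gcloud sql operations list --instance=TARGET_INSTANCE --limit=5

    # Poll a specific operation until it leaves the RUNNING state.
    gcloud sql operations describe OPERATION_ID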

Google Cloud SQL - Database instance storage size increased dramatically everyday

I have a database instance (MySQL 8) on Google Cloud, and for the past 20 days the instance's storage usage has just kept increasing (approx. 2 GB every single day!).
But I couldn't find out why.
What I have done:
Took a look at the "Point-in-time recovery" option; it's already disabled.
Binary logging is not enabled.
Checked the actual database size: my database is only 10 GB in size (see the query sketch after this list).
No innodb_file_per_table flag is set, so it must use the default.
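This is roughly how the size and binlog usage can be double-checked with the mysql client; a sketch where the connection details are placeholders:

    # Logical size per schema (information_schema numbers are approximate for InnoDB).
    mysql -h 127.0.0.1 -u root -p -e "
      SELECT table_schema,
             ROUND(SUM(data_length + index_length)/1024/1024/1024, 2) AS size_gb
      FROM information_schema.tables
      GROUP BY table_schema;"

    # If binary logging were enabled, this would list the binlog files and their sizes.
    mysql -h 127.0.0.1 -u root -p -e "SHOW BINARY LOGS;"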
(Screenshots of the storage usage chart and the database flags omitted.)
The actual database size is 10 GB, but the storage usage is now up to 220 GB! That's a lot of money!
I couldn't resolve this issue; please give me some tips. Thank you!
I had the same thing happen to me about a year ago. I couldn't determine any root cause of the huge increase in storage size. I restarted the server and the problem stopped. None of my databases experienced any significant increase in size. My best guess is that some runaway process caused the binlog to blow up.
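A restart like that can be done from the gcloud CLI; a minimal sketch, with INSTANCE_NAME as a placeholder:

    # Restart the Cloud SQL instance; note this briefly drops connections.
    gcloud sql instances restart INSTANCE_NAME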
Turns out the problem was in a WordPress theme function called "related_products", which read and wrote every instance of the products a user came across (millions per day) and made the database physically blow up.

General guidance around Bigtable performance

I'm using a single-node Bigtable cluster for my sample application running on GKE. The autoscaling feature has been incorporated into the client code.
Sometimes I experience slowness (>80 ms) for GET calls. In order to investigate further, I need some clarity on the following Bigtable behaviour.
I have cached the Bigtable table object to ensure faster GET calls. Is the table object persistent on GKE? I have learned that objects are not persistent on Cloud Functions. Should we expect any similar behaviour on GKE?
I'm using service account authentication, but how frequently do auth tokens get refreshed? I have seen frequent refresh logs for the gRPC Java client. I think Bigtable won't be able to serve requests during this token-refresh period (4-5 seconds).
What if the client machine/instance doesn't scale enough? Will that cause slowness for GET calls?
Bigtable client libraries use connection pooling. How frequently do connections/channels close themselves? I have learned that connections are closed after minutes of inactivity (>15 minutes or so).
I'm planning to read only the needed columns instead of the entire row. This can be achieved by specifying the row key as well as a column qualifier filter (see the sketch below). Can I expect some performance improvement from not reading the entire row?
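For illustration, a column-restricted point read can be tried from the cbt CLI; a sketch, assuming cbt is configured for the right project and instance, where my-table, my-row, and cf:col are placeholder names:

    # Read one row but only the cf:col column instead of the whole row;
    # the table, row key, family, and qualifier below are all placeholders.
    cbt lookup my-table my-row columns=cf:col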
The official GCP docs cover the causes of slower Bigtable performance here; I would suggest going through them, as they might be helpful. Also see Troubleshooting performance issues.

Is this normal for GCP Cloud SQL disk usage

I created a Cloud SQL db for learning purposes a while ago and have basically never used it for anything. Yet the storage/disk space keeps climbing:
(Updated the screenshot to show the timescale: this climb seems to happen within just a few hours!)
Is this normal? If not, how do I troubleshoot / prevent this steady climb? The only operations against the db seem to be backup operations. I'm not doing any ops (afaik).
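Depending on the engine, transaction/binary logs retained for point-in-time recovery live on the instance disk and can drive this kind of growth, so the backup configuration is worth inspecting; a sketch with the gcloud CLI, INSTANCE_NAME being a placeholder:

    # Show the backup configuration, including whether binary logging /
    # point-in-time recovery is enabled; logs kept for PITR use instance storage.
    gcloud sql instances describe INSTANCE_NAME \
      --format="value(settings.backupConfiguration)"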

CouchDB load spike (even with low traffic)?

We've been running CouchDB v1.5.0 on AWS and it's been working fine. Recently AWS came out with new prices for their new m3 instances, so we switched our CouchDB instance to an m3.large. We have a relatively small database with < 10 GB of data in it.
Our steady-state metrics for it are a system load of 0.2 and memory usage of 5% or so. However, we noticed that every few hours (3-4 times per day) we get a huge spike that drives our load to 1.5 or so and memory usage to close to 100%.
We don't run any cron jobs that involve the database, and our traffic flows about the same over the day. We do run a continuous replication from one database on the west coast to another on the east coast.
This has been stumping me for a bit - any ideas?
Just wanted to follow up on this question in case it helps anyone.
While I didn't figure out the direct answer to my load spike question, I did discover, from inspecting the logs, another bug that I was able to solve.
In my case, running "sudo service couchdb stop" was not actually stopping CouchDB. On top of that, every couple of seconds a new couch process would try to spawn, only to be blocked by the existing couchdb process.
Ultimately, removing the respawn flag in /etc/init.d/couchdb fixed this error.
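A quick way to confirm the stop actually took effect, sketched for the sysvinit setup described above:

    # Stop CouchDB and verify no process survived or respawned;
    # the bracketed pattern keeps grep from matching itself.
    sudo service couchdb stop
    sleep 5
    ps aux | grep '[c]ouchdb'   # no output means it really stopped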