Since 15/10 I have noticed a sudden spike in Cloud SQL Admin API queries, but we didn't make any changes to our application or our environment.
Is there any place where I can see what these queries are?
In Stackdriver I only found Cloud SQL's logs, nothing about the Cloud SQL Admin API.
UPDATE:
According to https://cloud.google.com/sql/docs/mysql/sql-proxy:
"While the proxy is running, it makes two calls to the API per hour per connected instance."
And all of its requests would be registered as '[USER_NAME]'@'cloudsqlproxy~%'.
My application runs 5 pods of the SQL proxy on top, nothing that justifies this spike.
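To double-check how many connections are actually coming through the proxy, the MySQL process list can be inspected; a minimal sketch with pymysql (host and credentials are placeholders):

    # Sketch: count open connections made through the Cloud SQL proxy.
    # Host/credentials are placeholders; assumes pymysql is installed.
    import pymysql

    conn = pymysql.connect(host="127.0.0.1", port=3306, user="root", password="...")
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT user, host FROM information_schema.processlist")
            rows = cur.fetchall()
    finally:
        conn.close()

    # Proxy connections appear with a host of the form 'cloudsqlproxy~<ip>'.
    proxy_conns = [(user, host) for user, host in rows if host.startswith("cloudsqlproxy~")]
    print(f"{len(proxy_conns)} connections via the Cloud SQL proxy")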
In contact with Google Cloud Support, the agent wasn't able to tell me the origin of these queries either.
When the volume of Cloud SQL Admin API queries increases, connection latency rises to as much as 5 seconds and my production application becomes unusable.
I'm very disappointed with Google Cloud Support, and I'm thinking of moving to another provider.
I am trying to stop server-side GTM. I set it up as a test to understand the process, but I am still getting billed. What are the steps to stop this?
So far I have:
Removed the transport URL from the GA tag
Paused the GA tag in the client side GTM
Removed the 4 A and 4 AAAA records from my DNS
Deleted the mapping from the Cloud account under App Engine > Settings
Disabled the application as well
Here you can find how to stop your app from serving and incurring billing charges:
https://cloud.google.com/appengine/docs/managing-costs#understanding_billing
Even so, you may continue to incur charges from other Google Cloud products.
Google Tag Manager has a dependency on App Engine, and it requires the creation of a Google Cloud Platform project.
To stop charges from accruing to an App Engine application, you can disable the application (although some fees related to Cloud Logging, Cloud Storage, or Cloud Datastore might keep being charged), disable billing, or, my recommendation, completely shut down the project related to your tagging server. Take into consideration that when you shut down a project, after around 30 days all the resources associated with it are fully deleted and you won't be able to recover them.
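If you go the shutdown route, a minimal sketch with the Python Resource Manager client (the project ID is a placeholder, and this schedules the entire project for deletion, so double-check it first):

    # Sketch: schedule a project for deletion (recoverable for ~30 days).
    # Project ID is hypothetical; assumes google-cloud-resource-manager is installed.
    from google.cloud import resourcemanager_v3

    client = resourcemanager_v3.ProjectsClient()
    operation = client.delete_project(name="projects/my-tagging-server")
    operation.result()  # blocks until the project enters the DELETE_REQUESTED state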
Use case: Our requirement is to run a service every few minutes, continuously. This service reads a value from Datastore and hits a public URL using that value (stateful). The service doesn't have a front end; nobody accesses it publicly. A new value is stored in Datastore as a result of the response from that URL. Exactly one server is required to run.
We need to decide between the options below for our use case.
Compute Engine (IaaS -> we don't want to maintain the infra for this simple stateful application)
Kubernetes Engine (still feels like overkill)
App Engine (PaaS -> App Engine is usually used for mobile apps, gaming, and websites. App Engine provides a URL with a web address. Is it the right choice for our use case? If we choose App Engine, is it possible to disable the public App Engine URL? Also, as one instance would be running continuously in App Engine, which is more cost-effective: standard or flexible?)
Cloud Functions (event-driven; looks unsuitable for our application)
Cloud Scheduler -> We thought we could use Cloud Scheduler + Cloud Functions, but during an outage, jobs are queued up. In our case, after an outage, only one server/instance/job could be up and running.
Thanks!
after outage, only one server/instance/job could be up and running
Is limiting Cloud Functions concurrency enough? If so, you can do this:
gcloud functions deploy FUNCTION_NAME --max-instances 1 FLAGS...
https://cloud.google.com/functions/docs/max-instances
I also recommend taking a look at Google Cloud Run. It's a serverless Docker platform, and it can be limited to a maximum of 1 instance responding to a maximum of 1 concurrent request. It would require Cloud Scheduler too, making regular HTTP requests to it.
With both services configured with a max concurrency of 1, only one server/instance/job will be up and running. But after outages, jobs may be scheduled as soon as another finishes. If this is problematic, add a lastRun datetime field to the job row in Datastore and skip the run if it's too recent (see the sketch below), or disable Cloud Scheduler's retries, as explained here:
Google Cloud Tasks HTTP trigger - how to disable retry
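A minimal sketch of that lastRun guard with the Python Datastore client (the kind, key, and 5-minute threshold are just examples):

    # Sketch: skip the run if the singleton job ran too recently.
    # Kind/key names are hypothetical; assumes google-cloud-datastore is installed.
    from datetime import datetime, timedelta, timezone

    from google.cloud import datastore

    client = datastore.Client()
    key = client.key("Job", "my-singleton-job")

    job = client.get(key) or datastore.Entity(key)
    last_run = job.get("lastRun")
    now = datetime.now(timezone.utc)

    if last_run and now - last_run < timedelta(minutes=5):
        print("Ran too recently, skipping this trigger")
    else:
        # ... do the actual work here ...
        job["lastRun"] = now
        client.put(job)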
We have a Python data pipeline that runs on our server. It grabs data from various sources, aggregates it, and writes the results to SQLite databases. The daily runtime is just 1 hour and the network usage is maybe 100 MB at most. What are our options for migrating this to Google Cloud? We would like more reliable scheduling, a cloud database, better data analytics options (powerful dashboards and visualization), and easy development. Should we go serverless or with a server? Is the pricing free for such low usage?
For a lift-and-shift option, you can run your Python workload on Google Compute Engine, which is a virtual machine. But to make the best use of Google Cloud, I suggest you:
Spin up a Google Compute Engine instance
Run your Python workload
Save your data to BigQuery (see the sketch after this list)
Shut down your VM
Schedule it all using Cloud Scheduler
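For the BigQuery step, a minimal sketch with the Python client (the table ID and row shape are placeholders):

    # Sketch: stream aggregated rows into BigQuery instead of SQLite.
    # Table ID and schema are hypothetical; assumes google-cloud-bigquery is installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    table_id = "my-project.pipeline.daily_metrics"

    rows = [
        {"day": "2019-10-15", "source": "api_a", "value": 42},
        {"day": "2019-10-15", "source": "api_b", "value": 17},
    ]

    errors = client.insert_rows_json(table_id, rows)  # streaming insert
    if errors:
        raise RuntimeError(f"BigQuery insert failed: {errors}")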
Here is a tutorial from Google on how to do it:
https://cloud.google.com/scheduler/docs/start-and-stop-compute-engine-instances-on-a-schedule
GCP on a shoestring budget:
Google gives you $300 to spend during your first 12 months, and some services give you a free usage allowance per month: https://cloud.google.com/free/docs/gcp-free-tier
For example:
You can use BigQuery free of charge for 1 TB of querying per month and 10 GB of storage each month.
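To stay inside that free querying quota, you can dry-run a query first to see how many bytes it would process; a minimal sketch with the Python client:

    # Sketch: estimate a query's bytes processed without being billed.
    # Assumes google-cloud-bigquery is installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)

    job = client.query(
        "SELECT corpus, COUNT(*) FROM `bigquery-public-data.samples.shakespeare` GROUP BY corpus",
        job_config=config,
    )
    print(f"Query would process {job.total_bytes_processed / 1e9:.2f} GB")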
Here's an excellent video on making the most of the GCP free tiers: https://www.youtube.com/watch?v=N2OG1w6bGFo&t=818s
Approach to migration:
When moving to the cloud you typically choose one of the following approaches:
1) Rehost (lift-and-shift) - no modification to code or architecture
2) Replatform - minor modifications to code
3) Refactor - modifications to code and architecture
Obviously you'll get the most cloud benefits (i.e. performance and cost efficiency) with option 3, but it will take longer, whereas option 1 is quicker with the least amount of benefit.
You can use Cloud Composer for scheduling, which is effectively a managed Apache Airflow service. It will allow you to manage batch, streaming, and scheduled tasks.
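For example, a minimal Airflow DAG sketch for this pipeline (the schedule and task body are placeholders):

    # Sketch: a daily Airflow DAG for the pipeline; names are hypothetical.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python_operator import PythonOperator

    def run_pipeline():
        # ... grab data from the sources, aggregate, write to BigQuery ...
        pass

    with DAG(
        dag_id="daily_data_pipeline",
        start_date=datetime(2019, 10, 15),
        schedule_interval="0 2 * * *",  # every day at 02:00 UTC
        catchup=False,
    ) as dag:
        PythonOperator(task_id="run_pipeline", python_callable=run_pipeline)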
Visualisation can be done through Google Data Studio, which can use BigQuery as a data source. Data Studio is free, but the underlying BigQuery queries are chargeable.
BigQuery for data analytics.
For the database you can migrate to managed Cloud SQL, which supports PostgreSQL and MySQL.
Typically, serverless offerings are likely to be the most cost-effective if you can accommodate them, which will fall under option 3) refactor.
There are several requirements to take care of before migrating, for example: are all your data sources reachable from a cloud platform?
Regarding storage and analytics, BigQuery is an amazing product and works very well with denormalized data. Can your data be denormalized? Does your job require transactional capabilities?
Does your data need to be queried from a website? BigQuery is powerful for analytics, but there is about 1 s of query warm-up, which is not acceptable for a website. It's nothing like Cloud SQL (MySQL or PostgreSQL) response times, which are in milliseconds, but Cloud SQL is limited to some TB (and getting good response times with TB in Cloud SQL is a challenge!).
If it's only for dashboarding, you can use Data Studio; it's free, and you can cache your BigQuery data with BI Engine for more responsive dashboards.
If all of these requirements work for you, and if your data sources are publicly accessible on the internet (I mean no VPN required to access them), I can propose a fully serverless solution. This solution is a side use of Google Cloud services, and it works well!
I wrote an article on a similar use, and you can take inspiration from it. Cloud Build allows you to run CI pipelines, and you can use a custom builder: it's a container that you build yourself and run on Cloud Build.
In short:
Package your current workflow in a container compliant with Cloud Build, and write your Cloud Build job (don't forget to set the right timeout value).
Create a Cloud Function or a Cloud Run service (if you prefer containers) that runs Cloud Build, optionally with some substitution variables to customize your run (see the sketch after this list).
Set up Cloud Scheduler to trigger your Cloud Run service or Cloud Function every day.
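A minimal sketch of the Cloud Function step, starting the build with the Python Cloud Build client (the project ID and builder image are placeholders):

    # Sketch: an HTTP Cloud Function that kicks off the Cloud Build job.
    # Project ID and builder image are hypothetical; assumes
    # google-cloud-build is installed.
    from google.cloud.devtools import cloudbuild_v1

    def run_pipeline(request):
        client = cloudbuild_v1.CloudBuildClient()
        build = {
            "steps": [
                {
                    # Your custom builder container with the workflow inside.
                    "name": "gcr.io/my-project/pipeline-builder",
                    "args": ["python", "pipeline.py"],
                }
            ],
            "timeout": {"seconds": 3600},  # the timeout mentioned above
        }
        operation = client.create_build(project_id="my-project", build=build)
        return f"Build started: {operation.metadata.build.id}"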
Apart from the BigQuery cost, this pattern costs 0! You have 120 free build minutes per day (per billing account) with Cloud Build, Cloud Scheduler is free (up to 3 schedulers per billing account), and Cloud Functions/Cloud Run have a huge free tier (here they only run for some milliseconds).
Streaming to BigQuery is not free, but it's affordable: half a cent for 100 MB!
Note: Cloud Run will one day offer long-running jobs, so you could reuse your Cloud Build container in Cloud Run when that feature is released. Today, I'm proposing a workaround for this.
In Google Cloud Platform, BigQuery is a great serverless choice: you can start small and grow over time.
With partitioning, clustering, and other improvements, we've been successfully using it behind a UI (4-8k queries per day), with most queries completing in under a second.
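A sketch of the kind of partitioned, clustered table definition that makes this possible, using the Python client (the schema and names are examples):

    # Sketch: create a time-partitioned, clustered BigQuery table.
    # Names and schema are hypothetical; assumes google-cloud-bigquery is installed.
    from google.cloud import bigquery

    client = bigquery.Client()

    table = bigquery.Table(
        "my-project.analytics.events",
        schema=[
            bigquery.SchemaField("event_ts", "TIMESTAMP"),
            bigquery.SchemaField("user_id", "STRING"),
            bigquery.SchemaField("event_type", "STRING"),
        ],
    )
    table.time_partitioning = bigquery.TimePartitioning(field="event_ts")
    table.clustering_fields = ["user_id", "event_type"]
    client.create_table(table)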
You can also get all your data seamlessly ingested from various sources, with millions of files per day, into one or many tables with BqTail.
1. Summarize the problem
Google Cloud Run advertises that it runs "stateless containers". Is there a way to run anything at all and have it save state somewhere?
I want to run Postgres in a container, but only have it up on demand: spin up the PG container when a request is made.
The same question goes for a container that holds a REST API (web server) to connect to the PG container.
So when the web app (hosted on Firebase) makes a request to the REST API container, it would spin up, and then the PG instance queried from the REST API would spin up (or the DB and REST API could simply go in one container).
For a dev instance, I don't want something up 24x7x365 doing mostly nothing; I just want something that spins up during development hours. I have a number of these, I'm the only ops guy, and I want to automate this for the developers, including myself, and minimize billing.
Any best approach here would be appreciated.
2. Provide background including what you've already tried
I have created Docker containers and deployed them to Cloud Run.
3. Show some code
yum install buildah podman -y
4. Describe expected and actual results including any error messages
I am looking for a solution that minimizes billing for a dev environment that includes hosting and a database/REST API (the database has to be Postgres).
I'm looking for a stateful Cloud Run that will maintain the state of a database.
Cloud Run is not suitable for hosting a database. The server instances allocated for incoming requests to Cloud Run can come and go, and not all requests will go to the same instance, which means that not all clients will see the same data. That's the problem with "stateless containers".
If you want to use Cloud Run to provide database access, it is best used as a proxy to some other cloud-hosted database service. You might use it to host a REST API endpoint that accesses some other database service (for example, Cloud Firestore or Cloud SQL). But it doesn't make sense to host the database itself in your Docker image, since those server instances can come and go unpredictably, destroying any database state stored in each instance.
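For example, a minimal sketch of that proxy pattern with Flask and Cloud Firestore (the routes and collection name are hypothetical):

    # Sketch: a stateless Cloud Run service proxying reads/writes to Firestore.
    # Routes/collection are hypothetical; assumes flask and
    # google-cloud-firestore are installed.
    import os

    from flask import Flask, jsonify, request
    from google.cloud import firestore

    app = Flask(__name__)
    db = firestore.Client()

    @app.route("/items/<item_id>", methods=["GET"])
    def get_item(item_id):
        doc = db.collection("items").document(item_id).get()
        if not doc.exists:
            return jsonify(error="not found"), 404
        return jsonify(doc.to_dict())

    @app.route("/items/<item_id>", methods=["PUT"])
    def put_item(item_id):
        db.collection("items").document(item_id).set(request.get_json())
        return jsonify(status="ok")

    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))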
With Google Cloud Platform's Cloud SQL you get a lot of options to set up the infrastructure. Does this mean Cloud SQL is Infrastructure as a Service?
No. The infrastructure of Cloud SQL is managed by Google and its engineers, so Cloud SQL is PaaS (Platform as a Service).
Cloud SQL is a Docker container built on top of a GCE instance, and Google monitors everything for you and fixes the Cloud SQL instance automatically if something goes wrong (sometimes Google software engineers have to perform some actions to fix issues if the instance is stuck). So the only thing you have to take care of is storing and querying your data.
Also, Cloud SQL offers a lot of interesting features, such as failover replicas, read replicas, user and database administration, etc.
So with Cloud SQL, Google doesn't just sell the infrastructure to create databases, but also the application itself and the monitoring tools.