How to read data from Google Cloud Storage into Cloud Run dynamically? - google-cloud-platform

I have a Dash application running on Google Cloud Run. The application needs some data in order to work, which it reads from Google Cloud Storage.
The data in Google Cloud Storage is updated once a week. I am looking for a way to read the new data without having to re-deploy a new version of the application every week; otherwise, the application keeps using the old data it has already loaded into memory.
I tried calling a function that downloads the new data (on Cloud Run's server), but I couldn't load the data into the app because it is already running and reading the data loaded in memory.

First of all, stop wasting your time trying to update Cloud Run with Cloud Functions. Cloud Run containers are immutable (like any container), and the only way to change data baked into the image is to build a new container, which is the solution you don't want.
That said, you still have two options to achieve this:
Read the data from Cloud Storage when your container starts. Either:
Create a bash script that loads the data from Cloud Storage with gsutil and then starts your binary, and put that bash script in the entrypoint of your container (see the sketch after this list), or
Use the Cloud Storage client libraries in your Cloud Run service to load the data.
Use the second generation execution environment of Cloud Run and mount the bucket as a volume on Cloud Run.
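As an illustration of the entrypoint approach, a minimal sketch (the bucket name, object path, and start command are assumptions, not taken from the question):

#!/bin/sh
# entrypoint.sh - sketch only: copy the weekly data into the container at startup, then start the app.
set -e

# Pull the latest data from Cloud Storage into the local filesystem.
gsutil cp gs://my-data-bucket/latest/data.csv /app/data/data.csv

# Hand over to the real application (here a Dash app served by gunicorn, as an example).
exec gunicorn --bind ":${PORT}" app:server

Note that the first two approaches only refresh the data when a new instance starts; the volume-mount option reads through to the bucket, so it can pick up new objects without a restart (subject to any caching the application itself does).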

Related

Cloud Run: should we bundle a trained model (large, ~2 GB) in the container, or download it from Cloud Storage at container start?

My use case is:
I have a trained model which I want to use to infer on small messages.
I am not sure where I should keep my model in Cloud Run:
Inside the container
On Cloud Storage, downloaded at container start
Mount Cloud Storage as a local directory and use it
I am able to write and run code successfully for options 1 and 2.
I tried option 3 but had no luck there. I am using this link https://cloud.google.com/run/docs/tutorials/network-filesystems-fuse (see the sketch after this block).
Actually, here my entry point is a Pub/Sub event; that is where I am not able to make it work.
But before exploring it further, I would like to know which approach is better here, or whether there is another, better solution.
Thanks for the valuable comments, they helped a lot.
If the model is static, it is better to bundle it with the container; downloading it from the storage bucket or mounting a filesystem will fetch the model again whenever a new container instance is spun up.
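For completeness, a minimal sketch of option 3 (mounting the bucket with gcsfuse in the container's entrypoint, in the spirit of the tutorial linked above); the bucket name, mount path, and start command are assumptions:

#!/bin/sh
# entrypoint.sh - sketch only: assumes gcsfuse is installed in the image.
set -e

# Mount the bucket that holds the trained model at a local path.
mkdir -p /mnt/models
gcsfuse --implicit-dirs my-model-bucket /mnt/models

# Start the service that handles the Pub/Sub push requests.
exec python3 main.py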

GCP Cloud Run: send a request to all running instances

I have a REST API running on Cloud Run that implements a cache, which needs to be cleared maybe once a week when I update a certain property in the database. Is there any way to send an HTTP request to all running instances of my application? Right now my understanding is that even if I send multiple requests and there are 5 instances, they could all go to one instance. So is there a way to do this?
Let's go back to basics:
Cloud Run instances start based on a revision/image.
If you have the above use case, where you have 5 instances running and restarting them resolves your problem (such as clearing/rebuilding the cache), what you need to do is:
Trigger a change in the service configuration so that a new revision gets created.
This will automatically replace the running revision, i.e. it will stop and relaunch all your instances on the fly.
You have a couple of options here; choose whichever suits you:
If you have your service defined as a YAML file, the easiest is to run the replace service command:
gcloud beta run services replace myservice.yaml
Otherwise, add an environment variable, like a date that you increase, and this will yield a new revision (a change in the environment means a new configuration, hence a new revision), as shown below; read more.
gcloud run services update SERVICE --update-env-vars KEY1=VALUE1,KEY2=VALUE2
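For example, a minimal sketch that uses the current timestamp as the value (the service name, region, and variable name RELOAD_MARKER are illustrative; Cloud Run does not interpret the variable, it only sees a configuration change):

# Force a new revision, which replaces all running instances and their in-memory caches.
gcloud run services update my-api --region=us-central1 --update-env-vars RELOAD_MARKER=$(date +%s)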
As these operations are executed, you will see a new revision created, and your active instances will be replaced on their next request with fresh new instances that will build the new cache.
You can't directly reach all the active instances; that's the magic (and the tradeoff) of serverless: you don't really know what is running! If you implement a cache on Cloud Run, you need a way to invalidate it:
Either based on duration: when an entry expires, refresh it;
Or by explicit invalidation, which you can't broadcast to all instances on Cloud Run.
The other way to look at this use case is that you need a cache shared between all your instances, something like Memorystore. Then a single Cloud Run instance can invalidate and rebuild it, and all the other instances will use it.
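A minimal sketch of provisioning such a shared Memorystore (Redis) cache (the instance name, size, tier, and region are assumptions; Cloud Run also needs VPC connectivity, e.g. a Serverless VPC Access connector, to reach it):

# Create a small Redis instance to act as the shared, centrally invalidated cache.
gcloud redis instances create shared-cache --size=1 --region=us-central1 --tier=basic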

Datastore Emulator Query/Issue

I have installed the Google Datastore emulator on my local machine, and along with it I have written a sample Spring Boot application which performs CRUD operations on Datastore.
When I hit the REST endpoints through Postman, I can actually see the data get inserted into Datastore in the GCP console.
Can someone help me by clearing up the queries below:
1. Even though I am using an emulator locally, does the data get inserted into the actual Datastore in the cloud (GCP)?
2. What is the purpose of the emulator (if question 1 is correct)?
No data is inserted on the Datastore servers; everything is local, as mentioned here:
The emulator simulates Datastore by creating /WEB-INF/appengine-generated/local_db.bin in a specified data directory and storing data in local_db.bin. By default, the emulator uses the data directory ~/.config/gcloud/emulators/datastore/. The local_db.bin file persists between sessions of the emulator. You can set up multiple data directories and think of each as a separate, local Datastore mode instance. To clear the contents of a local_db.bin file, stop the emulator and manually delete the file.
There are multiple uses, for example:
To develop and test your application locally without writing actual data to the servers, hence avoiding charges during the development process
To help you generate indexes for your production Firestore in Datastore mode instance and delete unneeded indexes, which can then be exported into production
Edit
In order to use the emulator on the same machine, it is recommended to set the environment variables automatically, as mentioned in the documentation.
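A minimal sketch of that workflow (the data directory and the Spring Boot start command are assumptions):

# Terminal 1: start the emulator with a dedicated data directory.
gcloud beta emulators datastore start --data-dir=./datastore-data

# Terminal 2: export DATASTORE_EMULATOR_HOST and related variables, then run the app against the emulator.
$(gcloud beta emulators datastore env-init --data-dir=./datastore-data)
./mvnw spring-boot:run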

Can I tell Google Cloud SQL to restore my backup to a completely different database?

Since there is a nightly backup of SQL, we are wondering about a good way to restore this backup to a different database on the same MySQL server instance. We have prod_xxxx for all our production databases AND staging_xxxx for all our staging databases (yes, not ideal, in that they are all on the same MySQL instance right now).
Anyway, we would love to restore all tables/constraints/etc. and data from prod_incomingdb to staging_incomingdb. Is this possible in Cloud SQL?
Since this is on a production instance, I recommend you perform a backup before starting, in order to avoid any data corruption.
To clone a database within the same instance, there is no direct way to perform the task (this is a missing feature in MySQL).
I followed this path in order to successfully clone a database within the same MySQL Cloud SQL instance.
1. Create a dump of the desired database using the Google Cloud Console (web UI) by following these steps.
*It is very important to dump only the desired database, in SQL format; please do not select multiple databases for the dump.
After the process finishes, the dump will be available in a Google Cloud Storage bucket.
2. Download the dump file to a Compute Engine VM or to any local machine with Linux.
3. Replace the database name (the old one) in the USE clauses.
I used this sed command over my downloaded dump to change the name of the database:
sed -i 's/USE `employees`;/USE `emp2`;/g' employees.sql
*This can take a few seconds, depending on the size of your file.
4. Upload the updated file to the Cloud Storage bucket.
5. Create a new, empty database on your Cloud SQL instance; in this case my target database is called emp2.
6. Import the modified dump by following these steps.
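For reference, a hedged sketch of the same export/import flow with the gcloud CLI instead of the console (the instance, bucket, and file names are assumptions; the sed rename from step 3 still happens in between):

# Export only the source database to a bucket the instance's service account can write to.
gcloud sql export sql prod-instance gs://my-sql-dumps/prod_incomingdb.sql --database=prod_incomingdb

# ...download the file, run the sed rename, and re-upload it as prod_incomingdb_renamed.sql...

# Import the renamed dump; the rewritten USE clause decides the target database.
gcloud sql import sql prod-instance gs://my-sql-dumps/prod_incomingdb_renamed.sql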
I could not figure out how to use the nightly backups, as it seems they restore an entire instance, so I think the answer to the above is no. I did find out that I can export and then import (not exactly what I wanted, as I didn't want to be exporting our DB during the day, but for now we may go with that and automate a nightly export later).

Creating a duplicate of a VM

I'm preparing to get into the world of cloud computing.
My first question is:
Is it possible to programmatically create a new VM, or duplicate an existing one, from my server?
Project Background
I provide a file processing service, and as it's been growing I need to offer a better service.
Project Requirement
Machine specs:
HDD: Min 16 GB
CPU: Min 1 core
RAM: Min 2 GB
GPU: Min CUDA 10.1 compatible
What I'm thinking is the following steps:
User uploads a file
A dedicated VM is created for that specific file inside Google Cloud Compute
The file is sent to the VM
File is processed using an Anaconda environment
Results are downloaded to local server
Dedicated VM is removed
Results are served to user
How is this accomplished?
PS: I'm looking for resources and advice. Not code.
Your question is a perfect formulation of the concept of Google Cloud Run. At the highest level concept, you create a Docker image (think of it like a VM) and then register that Docker image with GCP Cloud Run. When a trigger occurs, GCP will spin up an instance of that Docker container and pass in information about the cause of that trigger (a file created in GCS or a REST request or others ...). What you do in your container is up to you. You have full power of the Linux environment (under Docker) to do as you like. When your request ends, the container is spun down. You are only billed for the compute resources you use. If your container (VM) isn't being used, you pay nothing until the next trigger.
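A hedged sketch of wiring this up with the gcloud CLI (the image, service, bucket, region, and service account names are assumptions; Eventarc is one way to deliver the "file created in GCS" trigger described above):

# Deploy the processing container to Cloud Run.
gcloud run deploy file-processor --image=gcr.io/MY_PROJECT/file-processor --region=us-central1 --no-allow-unauthenticated

# Route "object finalized" events from the upload bucket to the service.
gcloud eventarc triggers create file-uploaded \
  --location=us-central1 \
  --destination-run-service=file-processor \
  --destination-run-region=us-central1 \
  --event-filters="type=google.cloud.storage.object.v1.finalized" \
  --event-filters="bucket=MY_UPLOAD_BUCKET" \
  --service-account=eventarc-sa@MY_PROJECT.iam.gserviceaccount.com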
An alternative to Cloud Run is Cloud Functions. This is a higher level abstraction where instead of providing a Docker container, you provide the body of a function (JavaScript, Java, Python or others) and the request is passed to that function when a trigger occurs. Which you use is mostly personal choice (you didn't elaborate on "File is processed").
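And a similarly hedged sketch for the Cloud Functions route (the function name, runtime, and bucket are assumptions):

# Deploy a function that runs every time an object is finalized in the upload bucket.
gcloud functions deploy process_file --runtime=python311 --trigger-bucket=MY_UPLOAD_BUCKET --entry-point=process_file --region=us-central1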
References:
Cloud Run
Cloud Functions