How to mount persistent storage to Google Cloud Run? - google-cloud-platform

I was trying to run a Docker image with Cloud Run and realised that there is no option for adding persistent storage. I found a list of services in https://cloud.google.com/run/docs/using-gcp-services#connecting_to_services_in_code but all of them are accessed from code. I was looking to share a volume with persistent storage. Is there a way around it? Is it because persistent storage might not work when shared between multiple instances at the same time? Is there an alternative solution?

Cloud Run is serverless: it abstracts away all infrastructure management.
It is also a managed compute platform that automatically scales your stateless containers.
Filesystem access
The filesystem of your container is writable and is subject to the following behavior:
This is an in-memory filesystem, so writing to it uses the container instance's memory.
Data written to the filesystem does not persist when the container instance is stopped.
You can use Google Cloud Storage, Firestore or Cloud SQL if your application is stateful.
3 Great Options for Persistent Storage with Cloud Run
What's the default storage for Google Cloud Run?

Cloud Run (fully managed) has a list of known services that are not yet supported, including Filestore, which is also persistent storage. However, you can consider running your Docker image on Cloud Run for Anthos, which runs on GKE; there you can use persistent volumes, which are typically backed by Compute Engine persistent disks.

Having persistent storage in (fully managed) Cloud Run should be possible now.
Cloud Run's second generation execution environment (gen2) supports network mounted file systems.
Here are some alternatives:
Cloud Run + GCS: Using Cloud Storage FUSE with Cloud Run tutorial
Cloud Run + Filestore: Using Filestore with Cloud Run tutorial
If you need help deciding between those, check this:
Design an optimal storage strategy for your cloud workload
NOTE: At the time of this answer, Cloud Run gen2 is in Preview.
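As a rough sketch of what the Cloud Storage FUSE option can look like inside the container, an entrypoint script can mount the bucket before the application starts. This assumes gcsfuse is installed in the image; the BUCKET and MNT_DIR environment variable names and the mount_bucket helper are illustrative assumptions, not something Cloud Run defines.

```shell
#!/bin/sh
# Hypothetical entrypoint helper for a Cloud Run gen2 container.
# Assumes gcsfuse is installed in the image; BUCKET and MNT_DIR are
# illustrative environment variable names, not Cloud Run built-ins.

mount_bucket() {
    # Fail early with a clear message if the configuration is missing.
    if [ -z "$BUCKET" ] || [ -z "$MNT_DIR" ]; then
        echo "BUCKET and MNT_DIR must be set" >&2
        return 1
    fi
    # Create the mount point, then mount the bucket onto it.
    mkdir -p "$MNT_DIR" || return 1
    gcsfuse "$BUCKET" "$MNT_DIR"
}
```

An entrypoint would call mount_bucket and then exec the server process, so that files written under $MNT_DIR go to the bucket rather than to the in-memory filesystem.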

Related

Increasing Disk Size of provisioned GCP Cloud Shell VM

I am looking to increase the disk size for the 5GB persistent disk that Google provides for Cloud Shell.
I have read all the docs, but they only mention Compute Engine machines. Although Cloud Shell is also a Compute Engine machine, when I look at the VMs the Cloud Shell machine is not listed.
PLEASE HELP.
You cannot change the size of the Cloud Shell disk. Cloud Shell runs on a Compute Engine instance that Google controls and is not a resource in your project. Cloud Shell is a container, and you can deploy your own container on Compute Engine with Container-Optimized OS (COS) or via Docker.

Can I use Cloud Shell with more than the 5 GB persistent storage?

According to the docs:
Cloud Shell provisions 5 GB of free persistent disk storage mounted as your $HOME directory on the virtual machine instance.
I would need more (paid) storage though that I can access from the Cloud Shell environment and that is persistent across my sessions. It's mostly used to store local clones of git repositories and images. I would be the only one to access these files.
It seems that the 5 GB storage is a hard limit, so it won't expand dynamically and bill me for the exceeding amount. It is possible to use Boost Mode, but that does not affect the storage size. And I also can't provision more storage with a custom Cloud Shell environment. I couldn't figure out if I can mount another GCE persistent disk to my $HOME. I was considering gcs-fuse as suggested in this answer but I'm not sure if it is suitable for git repos.
Is there any way to have more storage available in Cloud Shell?
Google Cloud Shell is a container that runs on a hidden Compute Engine instance managed by Google. You can download, modify and redeploy this container to Cloud Shell or to your own container running in the cloud or on your desktop.
The base image of the container is available at gcr.io/cloudshell-images/cloudshell:latest, per this page.
For your use case, I would use Compute Engine with Container-Optimized OS (COS) and run the Cloud Shell container within COS. You can scale the CPUs, memory, and storage to fit your requirements.
You can also set up a Compute Engine instance, install the CLIs, SDKs, and tools and have a more powerful system.
Notes for future readers based upon the first answer:
Filestore is a great product, but pay attention to costs. The minimum deployment is 1 TB at $200+ per month. You will need to mount the NFS share each time Cloud Shell restarts - this can be put into login scripts. Note: I am not sure if you can actually mount an NFS share from Filestore in Cloud Shell. I have never tested this.
You will have the same remount problem with FUSE, plus you will have bandwidth costs to access Cloud Storage.
Cloud Shell is a great product that is well implemented, but when you need to exceed its capabilities it is better to deploy a small or medium-sized GCE instance. This enables persistence, snapshots, etc.
There is another way to have more disk space in the cloud shell. It's to create a cloud storage bucket and map the cloud storage bucket as a folder. This way you can store larger files in the cloud storage bucket and it doesn't require any compute instance.
Go to cloud storage and create a new storage bucket
Copy the storage bucket's name, eg. my_storage_bucket
Go to cloud shell and create a folder in your home folder
mkdir ~/my_bucket_folder
Mount the storage bucket to this folder
gcsfuse my_storage_bucket ~/my_bucket_folder
Change directory to your my_bucket_folder
cd ~/my_bucket_folder
Voila! you have unlimited space!
To unmount please run the following
fusermount -u ~/my_bucket_folder
I'm using gcsfuse and it works fine. You don't have to remount every time if you put the mount command in .customize_environment (which runs on boot up).
#!/bin/sh
# .customize_environment runs in the background as root; wait for your user to initialize
sleep 20
sudo -u [USER] gcsfuse -o nonempty -file-mode=777 -dir-mode=777 --uid=1000 --debug_gcs [BUCKET_NAME] /home/[USER]/[FOLDER_NAME]
You can read more at Unlimited persistent disk in google cloud shell
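Since the mount has to be redone after Cloud Shell restarts, one possible refinement is to mount only when the folder is not already a mount point, so the command is safe to run more than once. This is a minimal sketch: ensure_mounted is a hypothetical helper name, and it assumes the mountpoint utility is available alongside gcsfuse.

```shell
#!/bin/sh
# Sketch: mount a bucket only when the folder is not mounted yet.
# ensure_mounted is a hypothetical helper name.

ensure_mounted() {
    bucket=$1
    dir=$2
    mkdir -p "$dir" || return 1
    # mountpoint -q exits 0 when the directory is already a mount point.
    if mountpoint -q "$dir" 2>/dev/null; then
        echo "already mounted: $dir"
        return 0
    fi
    gcsfuse "$bucket" "$dir"
}
```

With the names used in the steps above, the call would be ensure_mounted my_storage_bucket ~/my_bucket_folder.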
There is no way of adding more storage to the Cloud Shell. You can create a VM and install the Cloud SDK and have as much storage as you'd like but it is not currently possible to add storage space to the Cloud Shell.
Depending on how you plan on using the saved repos, Cloud Storage may be ideal, as it has a storage category that is just perfect for archiving.
Filestore will be your best option as it is great for file systems and it is scalable. It fits your needs as you have described.
You can use Cloud Storage with FUSE. Keep in mind that whether this method is a good fit depends on how it will be used, as costs are based on storage category.
You can see a brief comparison of the Storage solutions the Cloud Platform has to offer here.

What's the default storage for Google Cloud Run?

There is no documentation that I can find about the storage that Google Cloud Run has. For example, does it come with a few gigabytes of storage, as when we create a VM?
If not, is there a '/tmp' folder that I can put data into temporarily during the request? What are the limits, if it is available?
If neither is available, what's the recommendation if I want to save some temporary data while running Cloud Run?
Cloud Run is a stateless service platform, and does not have any built-in storage mechanism.
Files can be temporarily stored for processing in a container instance, but this storage comes out of the available memory for the service as described in the runtime contract. Maximum memory available to a service is 8 GB.
For persistent storage the recommendation is to integrate with other GCP services that provide storage or databases.
The top services for this are Cloud Storage and Cloud Firestore.
These two are a particularly good match for Cloud Run because they have the most "serverless" compatibility: horizontal scaling to match the scaling capacity of Cloud Run, and the ability to trigger events on state changes to plug into asynchronous, serverless architectures via Cloud Pub/Sub, Cloud Storage's Registering Object Changes, and Cloud Functions with Cloud Function Events & Triggers.
The writable disk storage is an in-memory filesystem, which is limited by instance memory to a maximum of 8 GB. Anything written to the filesystem is not persisted between instances.
See:
https://cloud.google.com/run/quotas
https://cloud.google.com/run/docs/reference/container-contract#filesystem
https://cloud.google.com/run/docs/reference/container-contract#memory
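Because temporary files count against the instance's memory, it helps to delete them as soon as they are no longer needed. The following is a minimal sketch of that pattern; process_with_scratch is a hypothetical helper name, and the byte count merely stands in for whatever processing the request actually needs.

```shell
#!/bin/sh
# Sketch: use a scratch file for temporary data and remove it afterwards,
# since files on Cloud Run's in-memory filesystem consume instance memory.
# process_with_scratch is a hypothetical helper name.

process_with_scratch() {
    # mktemp creates a unique file; TMPDIR falls back to /tmp.
    scratch=$(mktemp "${TMPDIR:-/tmp}/scratch.XXXXXX") || return 1
    # Store stdin in the scratch file, then report its size in bytes.
    cat > "$scratch"
    wc -c < "$scratch" | tr -d ' '
    # Remove the file so the memory it used is released.
    rm -f "$scratch"
}
```

For example, printf 'hello' | process_with_scratch prints 5 and leaves no file behind.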

Kubernetes Persistent volumes

I have a MySQL database pod running in a Google Cloud cluster, using a persistent volume (GCEPersistentDisk) to back up my data.
Is there a way to also have an AWS persistent volume (AWSElasticBlockStore) backing up the same Pod, in case something goes wrong with Google Cloud Platform and I can't reach any of my data? That way, when I create another Kubernetes Pod in AWS, I'll be able to get my latest data (from before GCP crashes) from that AWSElasticBlockStore.
If not, what's the best way to simultaneously back up a Kubernetes database pod at two different cloud providers, so that when one crashes, you'll still be able to deploy at the other?

How to use Google Cloud Storage in a Django app hosted on Google Compute Engine?

I am using Django 1.7, Gunicorn and Nginx for my app. It is hosted on GCE VM instance.
I want to store all my user-uploaded content in Google Cloud Storage, so that it is easily accessible in case the traffic increases and I have to use multiple VM instances behind an HTTP/Network load balancer.
Given that Google does not allow attaching a storage disk to multiple VM instances in write mode, Google Cloud Storage looks like the only option. I want to use Google Cloud Storage as a file system or something similar to that.
Please let me know if there are any other options.
Sounds like you want to use the Google Cloud Storage Python Client Library from your Django app to access GCS.
See my other answer for a list of alternatives (the original question was about persistent disk, so GCS is one of the options, as you have already discovered):
If you want to share data between them, you need to use something other than Persistent Disk, e.g., Google's Cloud Datastore, Cloud Storage, or Cloud SQL, or you can run your own database (whether SQL or NoSQL), a distributed filesystem (Ceph, Gluster), or a file server (NFS, SAMBA), among other options.