Can I share persistent volumes between builds in Google Cloud Build? - google-cloud-platform

My build process is somewhat slow and it would be very handy if I could speed it up by reusing some assets from previous builds.
I've found that one can define volumes to share data between steps, but is it possible to share folders between builds?

If the assets are layers of your container image, you can use Kaniko's cache feature. Otherwise, you need to export the content somewhere (Cloud Storage is a great place for this) and then import it in the next build.
Tip: you can create custom builders. Define one that saves the assets and another that restores them, then simply add them at the end and the beginning of your pipeline. That keeps the caching easy to reuse, as in the sketch below.
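To make the save/restore idea concrete, here is a minimal sketch of what such a pair of steps could run, using the google-cloud-storage Python client (the bucket name, archive name, and cached directory are all made up; you would bake this, or an equivalent gsutil copy, into your own builder image):

    # cache_to_gcs.py - rough sketch of a save/restore cache helper, not an official builder
    import sys
    import tarfile

    from google.cloud import storage  # pip install google-cloud-storage

    BUCKET = "my-build-cache"          # hypothetical bucket for cached assets
    OBJECT = "node_modules.tar.gz"     # hypothetical archive name
    CACHE_DIR = "node_modules"         # directory to carry over between builds
    ARCHIVE = "/tmp/cache.tar.gz"

    def save_cache():
        """Archive CACHE_DIR and upload it to the cache bucket."""
        with tarfile.open(ARCHIVE, "w:gz") as tar:
            tar.add(CACHE_DIR)
        storage.Client().bucket(BUCKET).blob(OBJECT).upload_from_filename(ARCHIVE)

    def restore_cache():
        """Download the cache archive if it exists and unpack it into the workspace."""
        blob = storage.Client().bucket(BUCKET).blob(OBJECT)
        if not blob.exists():
            print("no cache found, starting cold")
            return
        blob.download_to_filename(ARCHIVE)
        with tarfile.open(ARCHIVE, "r:gz") as tar:
            tar.extractall()

    if __name__ == "__main__":
        save_cache() if sys.argv[1] == "save" else restore_cache()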

Not sure if that helps in your case, but did you consider saving the objects in Google Cloud Storage buckets?

Related

Cloud Run: should we bundle a trained model (roughly 2 GB) in the container, or download it from Cloud Storage at container start?

My use case is:
I have a trained model which I want to use to run inference on small messages.
I'm not sure where I should keep the model in Cloud Run:
1. Inside the container
2. On Cloud Storage, downloaded at container start
3. Cloud Storage mounted as a local directory
I was able to write and run code successfully for options 1 and 2.
I tried option 3 but had no luck there; I followed https://cloud.google.com/run/docs/tutorials/network-filesystems-fuse
The complication is that my entry point is a Pub/Sub event, and that's where I couldn't get it working.
But before exploring it further, I'd like to know which approach is better here, or whether there is another, better solution.
Thanks for the valuable comments, they helped a lot.
If the model is static, it's better to bundle it with the container. Downloading it from a storage bucket or mounting a filesystem means the model is downloaded again every time a new container instance spins up.
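As a rough illustration of the "bundle it with the container" option, the service can load the model once at start-up and reuse it for every Pub/Sub push request that instance handles. Everything below is hypothetical (the model path, the joblib format, and the assumption that the model pipeline accepts raw text):

    # main.py - sketch of option 1: model baked into the image
    # (Dockerfile would contain something like: COPY model.joblib /app/model.joblib)
    import base64

    import joblib
    from flask import Flask, request

    MODEL_PATH = "/app/model.joblib"   # hypothetical path inside the image

    # Loaded once when the container starts, then reused by every request
    # this instance serves -- no per-container download from Cloud Storage.
    model = joblib.load(MODEL_PATH)

    app = Flask(__name__)

    @app.route("/", methods=["POST"])
    def handle_pubsub_push():
        envelope = request.get_json()                  # Pub/Sub push wraps the message
        data = envelope["message"].get("data", "")     # base64-encoded payload
        text = base64.b64decode(data).decode("utf-8") if data else ""
        prediction = model.predict([text])             # assumes a pipeline that takes raw text
        return {"prediction": str(prediction[0])}, 200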

Managing AWS Parameter Store variables with a data migrations tool

We are increasingly using AWS Parameter Store for managing configuration.
One issue we have is managing which variables need to be set when releases occur to different environments (staging, dev, prod etc). There is always a lot of work to configure the different environments, and it is easy to overlook required settings when we release microservices.
It seems what is needed is a migration tool similar to Flyway or Liquibase, but I haven't found any products available, and it is unclear to me how secrets would be managed with such a system.
What are people doing to manage pushing new configuration into Parameter Store when new application code is deployed?
Do you know AWS AppConfig? It's a different way of managing configuration; I'm not sure it fits your requirements, but it might be worth a look.
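For what it's worth, the Flyway-style idea can be approximated with a small idempotent script that runs during deployment; a rough sketch with boto3 (the manifest format and parameter naming scheme are invented, and secret values would still need to come from somewhere safer than a plain file):

    # push_params.py - sketch of a "migration"-style script for Parameter Store
    import json
    import sys

    import boto3

    def apply_manifest(environment: str, manifest_path: str) -> None:
        """Upsert every parameter listed in a JSON manifest under /<environment>/..."""
        ssm = boto3.client("ssm")
        with open(manifest_path) as f:
            # e.g. {"service-a/db-host": {"value": "db.internal", "secure": false}}
            manifest = json.load(f)

        for name, spec in manifest.items():
            ssm.put_parameter(
                Name=f"/{environment}/{name}",
                Value=spec["value"],
                Type="SecureString" if spec.get("secure") else "String",
                Overwrite=True,   # re-running the script is an upsert, like re-applying a migration
            )
            print(f"applied /{environment}/{name}")

    if __name__ == "__main__":
        # e.g. python push_params.py staging params.json
        apply_manifest(sys.argv[1], sys.argv[2])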

What service should I use to process my files in a Cloud Storage bucket and upload the result?

I have a piece of software that processes some files. What I need is:
start a container image on Google Cloud (I think Docker should be a good solution) using an API or a run command
download files from Google Cloud Storage
process them: run my software on those downloaded files
upload the result to Cloud Storage
shut the instance down, expecting not to be billed anymore
What I do know is how to create my image. But I can't find any info telling me which Google Cloud service I should use, or even whether I can do it the way I'm imagining. I think I'm not using the right keywords to find what I need.
I was looking at Kubernetes, but I couldn't figure out how to manage those instances to run a one-time processing job.
[EDIT]
To explain the process better: I have an app that receives images and sends them to Cloud Storage. After that, I need to process those images: apply filters, georeference them, split them, etc. So I want to start a Docker container to process them and upload the results back to Cloud Storage.
If you are using any of the runtimes supported by Google Cloud Functions, they are the easiest way to do this kind of operation (i.e. fetch something from Google Cloud Storage, perform some actions on those files, and upload them again). The function will be triggered by an event of your choice and will shut down after the job finishes.
The next option in terms of complexity would be to deploy a Google App Engine application in the standard environment. It allows you to deploy your own application written in any of the languages supported by this environment. While your application receives traffic you will have instances serving, but the number of running instances can scale down to zero when there is no traffic, which means lower cost.
Another option would be Google App Engine in the flexible environment, which lets you deploy your application in a custom runtime. However, it always has at least one instance running, so it would never shut down completely.
Lastly, you can use Google Compute Engine to "create and run virtual machines on Google infrastructure". Unlike GAE, this is not managed by Google, so most of the configuration is up to you. In this case, you would need to programmatically tell your VM to shut down after your operations have finished.
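To illustrate that last point, a job running on the VM can stop its own instance once the work is done. A rough sketch, assuming the google-api-python-client library and a service account with permission to stop instances:

    # shutdown_self.py - sketch of a VM stopping itself after processing
    import requests
    import googleapiclient.discovery  # pip install google-api-python-client

    METADATA = "http://metadata.google.internal/computeMetadata/v1"
    HEADERS = {"Metadata-Flavor": "Google"}

    def metadata(path: str) -> str:
        """Read a value from the instance metadata server."""
        return requests.get(f"{METADATA}/{path}", headers=HEADERS).text

    def stop_this_instance() -> None:
        project = metadata("project/project-id")
        zone = metadata("instance/zone").rsplit("/", 1)[-1]  # value looks like projects/NNN/zones/ZONE
        name = metadata("instance/name")
        compute = googleapiclient.discovery.build("compute", "v1")
        compute.instances().stop(project=project, zone=zone, instance=name).execute()

    if __name__ == "__main__":
        # ... run the actual file processing here, then:
        stop_this_instance()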
Based on your edit where you stated that you already have an app that is inserting images into Google Cloud Storage, your easiest option would be to use Cloud Functions that are triggered by additions, changes, or deletions to objects in Cloud Storage buckets.
You can follow the Cloud Functions tutorial for Cloud Storage to get an idea of the generic process and then implement your own code that handles your specific tasks. There are other tutorials like the Imagemagick tutorial for Cloud Functions that might also be relevant to the type of processing you intend to do.
Cloud Functions is probably your lightest-weight approach. You could of course build more full-scale applications, but that is likely overkill, more expensive, and more complex. You can write your processing code in Node.js, Python, or Go.
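For reference, a minimal sketch of such a function in Python, deployed with a google.storage.object.finalize trigger on the upload bucket (the output bucket name and the process() body are placeholders for your own logic):

    # main.py - sketch of a Cloud Function triggered by new objects in a bucket
    import os
    import tempfile

    from google.cloud import storage

    OUTPUT_BUCKET = "my-processed-images"   # hypothetical destination bucket

    def process(path: str) -> None:
        """Placeholder for the real work: filters, georeferencing, splitting, ..."""
        pass

    def on_image_uploaded(event, context):
        """Background function invoked for each finalized object."""
        client = storage.Client()
        local_path = os.path.join(tempfile.gettempdir(), os.path.basename(event["name"]))

        client.bucket(event["bucket"]).blob(event["name"]).download_to_filename(local_path)  # 1. download
        process(local_path)                                                                  # 2. process
        client.bucket(OUTPUT_BUCKET).blob(event["name"]).upload_from_filename(local_path)    # 3. upload result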

Saving images as files or in a database when using Docker

I'm using Docker for a project; the main goal is to keep the application available even if one of the nodes (it's a 6-node cluster with Docker Swarm) goes down.
The application is basically a Django app that saves images from users along with other models. I'm currently saving the images as files, but since that requires specifying a local volume on a single machine, I would like to know whether it would be better to save the images in a database cluster so they remain available even if a whole node goes down. Or is there another way?
Edit: the cluster runs locally and doesn't have internet access.
The two options are to perform the file sharing via the database or via the file system.
For file-system sharing, you can use something like GlusterFS: each container appears to mount a host-local volume, but it's actually shared between the hosts via GlusterFS.
To my mind, if it's your own application (i.e. you can modify it at will), saving the files in the database would be the easier approach for most developers.
The best solution is often a hosted option (such as MongoDB Atlas). Making a database resilient and highly available is really hard, and unless you are an expert in Docker and MongoDB, I would strongly recommend going with a hosted option.
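As a rough illustration of the database approach for a Django app, one option is a model with a BinaryField holding the image bytes, so every node sees the same data through the shared database. A sketch, with invented names and the usual caveat that large blobs in a relational database have their own trade-offs:

    # models.py - sketch of storing uploaded images in the database instead of on disk
    from django.db import models

    class StoredImage(models.Model):
        filename = models.CharField(max_length=255)
        content_type = models.CharField(max_length=100)
        data = models.BinaryField()                       # raw image bytes live in the DB
        uploaded_at = models.DateTimeField(auto_now_add=True)

    def save_upload(uploaded_file) -> "StoredImage":
        """Persist an UploadedFile from a Django form or request into the database."""
        return StoredImage.objects.create(
            filename=uploaded_file.name,
            content_type=uploaded_file.content_type,
            data=uploaded_file.read(),
        )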

Google Container Builder: How to cache dependencies between two builds

We are migrating our container build process to Google Container Builder. We have multiple repos using Node or Scala.
With the current Container Builder features, is it possible to cache dependencies between two builds (e.g. node_modules, .ivy, ...)? It's really time- (and money-) consuming to download everything each time.
I know it's possible to build a custom Docker image with everything packaged inside, but we would prefer to avoid that solution.
For example, can we mount a persistent volume for that purpose, as we used to do with DroneIO? Or, even better, have it happen automatically as in Bitbucket Pipelines?
Thanks
GCB doesn't currently support mounting a persistent volume across builds.
In the meantime, the team recently published a document outlining some options for speeding up builds, which might be useful: https://cloud.google.com/container-builder/docs/speeding-up-builds
In particular, caching generated output to Google Cloud Storage and pulling it in at the beginning of your build might help in your case.