Google Container Builder: How to cache dependencies between two builds - google-cloud-platform

We are migrating our container building process to Google Container Builder. We have multiple repo using Node or Scala.
As of actual container builder features, is it possible to cache dependencies between two builds (ex: node_modules, .ivy, ...). It's really time (money) consuming to download everything each time.
I know it's possible to build a custom docker image with all packaged within, but we would prefer avoiding this solution.
For example can we mount a persistent volume for that purpose, as we used to do with DroneIO? or even better automatically like in Bitbucket Pipelines?
Thanks

GCB doesn't currently support mounting a persistent volume across builds.
In the meantime, the team recently published a document outlining some options for speeding up builds, which might be useful: https://cloud.google.com/container-builder/docs/speeding-up-builds
In particular, caching generated output to Google Cloud Storage and pulling it in at the beginning of your build might help in your case.

Related

Are incremental builds possible with copied files not built on the machine?

I'm having trouble setting up incremental builds in Azure DevOps. There are too many variables with workspace cleaning to ensure that I don't have to do a full build every time.
I had a thought that I could just always copy the built files to a location outside of the agents' purview, and then copy those files into my release directory before each build.
Would that allow for an incremental build?
You probably can 'fool' the incremental logic but you would be working against the tooling.
For an actual incremental build you need to build in the same place.
In the context of Azure DevOps, that means building the same job of the same pipeline on the same agent. You can't let the build move around between agents or even between work folders of the same agent. (It also means that your agent and the state of the agent work folder must be persistent across the builds.)
You can make the job, stage, or pipeline 'sticky' to one dedicated agent by using demands and capabilities.
Decide what will be on your dedicated agent. Will it be the entire pipeline or just a stage of the pipeline or just a job of a stage?
For the dedicated agent, create a capability that represents the build. Using the name of the pipeline (or pipeline+stage or pipeline+stage+job depending) for the name of the capability is handy and self-documenting. You can create the capability in Azure DevOps as a 'user capability' of the agent.
Change your pipeline to add a demand on the custom capability. The demand can test if the custom capability exists. In a YAML pipeline the demands are configured in the pool definition.
This is an easier and less brittle approach then trying to outsmart the incremental logic.
With this approach, all builds will be done in series on the one agent. If the build takes a long time (which may be the motivation for building incrementally) and the build is tied to one agent, the 'throughput' of builds will be limited. If a build's duration is 1 hour, there will be a maximum of 8 builds in an 8 hour work day.
Tying specific builds to specific agents is not the intent in Azure DevOps. For a monolithic legacy codebase where there is no notion of semantic versioning and immutable interfaces, you may have little choice. But a better way is to use package management. Instead of one big build, have multiple smaller builds that produce packages that are used by other builds. The challenge is that packages will not work well without some attention and discipline around versioning and keeping published interfaces and contracts unchanged.

Can I share persistent volumes between builds in Google Cloud Build?

My build process is somewhat slow and it would be very handy if I could speed it up by reusing some assets from previous builds.
I've found that one can define volumes to share volumes between steps but is it possible to share folders between builds?
If it's layer of your container, you can use Kaniko cache feature. Else, you need to export the content somewhere (on Cloud Storage, it's a great place for this) and then import it on the next Build.
Tips: You can create custom builder. Define yours which saves the assets, and another one which retrieves the assets. Like this, add them simple at the end and the beginning of your pipeline to achieve this easily and in a more reusable manner.
Not sure if that helps in your case, but did you consider saving objects in the google cloud storage buckets?

How Docker and Ansible fit together to implement Continuous Delivery/Continuous Deployment

I'm new to the configuration management and deployment tools. I have to implement a Continuous Delivery/Continuous Deployment tool for one of the most interesting projects I've ever put my hands on.
First of all, individually, I'm comfortable with AWS, I know what Ansible is, the logic behind it and its purpose. I do not have same level of understanding of Docker but I got the idea. I went through a lot of Internet resources, but I can't get the the big picture.
What I've been struggling is how they fit together. Using Ansible, I can manage my Infrastructure as Code; building EC2 instances, installing packages... I can even deploy a full application by pulling its code, modify config files and start web server. Docker is, itself, a tool that packages an application and ensures that it can be run wherever you deploy it.
My problems are:
How does Docker (or Ansible and Docker) extend the Continuous Integration process!?
Suppose we have a source code repository, the team members finish working on a feature and they push their work. Jenkins detects this, runs all the acceptance/unit/integration test suites and if they all passed, it declares it as a stable build. How Docker fits here? I mean when the team pushes their work, does Jenkins have to pull the Docker file source coded within the app, build the image of the application, start the container and run all the tests against it or it runs the tests the classic way and if all is good then it builds the Docker image from the Docker file and saves it in a private place?
Should Jenkins tag the final image using x.y.z for example!?
Docker containers configuration :
Suppose we have an image built by Jenkins stored somewhere, how to handle deploying the same image into different environments, and even, different configurations parameters ( Vhosts config, DB hosts, Queues URLs, S3 endpoints, etc...) What is the most flexible way to deal with this issue without breaking Docker principles? Are these configurations backed in the image when it gets build or when the container based on it is started, if so how are they injected?
Ansible and Docker:
Ansible provides a Docker module to manage Docker containers. Assuming I solved the problems mentioned above, when I want to deploy a new version x.t.z of my app, I tell Ansible to pull that image from where it was stored on, start the app container, so how to inject the configuration settings!? Does Ansible have to log in the Docker image, before it's running ( this sounds insane to me ) and use its Jinja2 templates the same way with a classic host!? If not, how is this handled?!
Excuse me if it was a long question or if I misspelled something, but this is my thinking out loud. I'm blocked for the past two weeks and I can't figure out the correct workflow. I want this to be a reference for future readers.
Please, it would very helpful to read your experiences and solutions because this looks like a common workflow.
I would like to answer in parts
How does Docker (or Ansible and Docker) extend the Continuous Integration process!?
Since docker images same everywhere, you use your docker images as if they are production images. Therefore, when somebody committed a code, you build your docker image. You run tests against it. When all tests pass, you tag that image accordingly. Since docker is fast, this is a feasible workflow.
Also docker changes are incremental; therefore, your images will have minimal impact on storage. Also when your tests fail, you may also choose to save that image too. In this way, developer will pull that image and investigate easily why your tests failed. Developer may choose to run tests in their machine too since docker images in jenkins and their machine are not different.
What this brings that all developers will have same environment, same version of all software since you decide which one will be used in docker images. I have come across to bugs that are due to differences between developer machines. For example in the same operating system, unicode settings may affect your code. But in docker images all developers will test against same settings, same version software.
Docker containers configuration :
If you are using a private repository, and you should use one, then configuration changes will not affect hard disk space much. Therefore except security configurations, such as db passwords, you can apply configuration changes to docker images(Baking the Configuration into the Container). Then you can use ansible to apply not-stored configurations to deployed images before/after startup using environment variables or Docker Volumes.
https://dantehranian.wordpress.com/2015/03/25/how-should-i-get-application-configuration-into-my-docker-containers/
Does Ansible have to log in the Docker image, before it's running (
this sounds insane to me ) and use its Jinja2 templates the same way
with a classic host!? If not, how is this handled?!
No, ansible will not log in the Docker image, but ansible with Jinja2 templates can be used to change dockerfile. You can change dockerfile with templates and can inject your configuration to different files. Tag your files accordingly and you have configured images to spin up.
Regarding your question about handling multiple environment configurations using the same Docker image, I have been planning on using a Service Discovery tool like Consul as a centralized config/property management tool. So, when you start your container up, you set an ENV var that tells it what application it is (appID), and what environment config it should use (ex: MyApplication:Dev) and it will pull its config from Consul at startup. I still have to investigate the security around Consul (as if we are storing DB connection credentials in there for example, how do we restrict who can query/update those values). I don't want to just use this for containers, but all apps in general. Another cool capability is to change the config value in Consul and have a hook back into your app to apply the changes immediately (maybe like a REST endpoint on your app to push changes down to and dynamically apply it). Of course your app has to be written to support this!
You might be interested in checking out Martin Fowler's blog articles on immutable infrastructure and on Phoenix servers.
Although not a complete solution, I have suggestions for two of your issues. Although they might not be perfect, these are the practices we are using in our workflow, and prove themselves so far.
Defining different environments - supposing you've written a different Ansible role for each environment you launch, we define an environment variable setting the environment we wish the container to belong to. We then download the suitable configuration file from an S3 bucket using the env variable set before into the container (which should be possible if you supply AWS creds or give your server an IAM role) and inject these parameters into the code when building it.
Ansible doesn't need to log into the docker app, but the solution is a bit tricky. I've tried two ways of tackling this problem, and both aren't ideal. The first one is to download the configuration file as part of the docker image command line, and build the app on container startup. While this solution works - it breaches the Docker philosophy and makes the image highly prone to build errors.
Another solution is pushing several images to your docker hub repo, and then pulling the appropriate image according to the environment at hand.
In a broader stroke, I've tried launching our app completely with Ansible and it was hell, many configuration steps are tricky and get trickier when you try to implement them as a playbook. When I switched to maintaining the severs alone with Ansible, and deploying the app itself with Docker things got a lot easier.

Utilizing EC2 or Azure to scale out a distributed C# build system?

Quite a few build and CI systems support steps for pushing build output to Azure, but I haven't seen any which can actually run on Azure (or EC2). Ideally I would like to be able to spin up an arbitrary number of instances (depending on the # of pending submits) to deal with the actual build + quality gates (UTs, FXCop, other static analysis tools) + source repository checkin process.
Are there existing tools which can do this, or has anyone built something which they can discuss?
Thanks!
[Edit: I found this question which is quite similar but didn't have any informative answers, so I'll keep my question alive]
If you're using Git or Mercurial for source control, AppHarbor might be what you're looking for. It's a CI build/deploy environment that runs exclusively in the cloud (EC2), and can deploy build output to Azure.
Here are some links for reference:
http://sourcecodebean.com/archives/appharbor-heroku-for-net/987
http://lostechies.com/chrismissal/2011/03/12/using-appharbor-for-continuous-integration
http://haacked.com/archive/2011/05/12/making-let-me-bing-that-for-you-open-source.aspx
http://appharbor.com/page/pricing
The open souce Jenkins CI server has an EC2 plugin that will spin up EC2 instances automatically depending on your build load. I couldn't find anything for Azure, but I highly recommend Jenkins - it's easy to configure, well maintained and has stacks of features.
Continuous Integration on Windows Azure http://code.google.com/p/cassis/ (over Mercurial)
Disclaimer: work produced by my 1st year CS students
Also Teamcity has support for this: http://www.jetbrains.com/teamcity/features/amazon_ec2.html

What do you use for Staging / Deployment Artifact Servers?

I am thinking about writing my own release storage server and before I do this, I'd like to know what people use to see integration instead of create.
So what do you use to store your builds for internal access?
I'm looking for a web app that allows me to upload artifacts and then reference them by various tags so I can group them together by component or release vehicle. I also want access controls per build by readiness or promotion.
I define staging as placing built artifacts on a server for communities of users to access. The artifacts are usually zip files containing either applications or libraries + documentation. The user communities are developers, QA, and service delivery/operations. Basically, the creators, the checkers and external-users.
We release artifacts individually and as groups in a release vehicle (e.g., release 1.1 contains foo 1.0.1 and bar 1.0.7). Depending on the artifact, we may want to restrict access. Operations shouldn't be able to access pre-released builds and we may want to track who downloads a limited availability release.
So, I'm hoping to find a tool that does most of what I want with a good extensible design so I can add in what I don't have.
Any one know of a good tool for managing the builds post-build?
Examples might be:
quickbuild/lunt build
Team forge
build forge
Jira & confluence as a set
sonatype nexus
home grown
SVN repository using branching to promote builds from dev->Qa->GA
Peter,
Since you're not getting many answers, I'll let you know about AnthillPro whose developer, Urbancode, I work for.
Ok, disclaimers out of the way, AnthillPro is designed to serve exactly the broad audience that you're discussing - dev, checkers, and operations. Compared to the tools you list, AnthillPro is something like a BuildForge (a key competitor of ours) or quick build with a tightly integrated artifact repository (like nexus). So the builds are run, and you can view the results of your builds - and the build artifacts - in a nice web ui. Users with correct permissions can run a secondary process like a deployment or test against prior builds - and the artifacts from the selected build.
The goal is to manage the entire build lifecycle from creation, through various testing tools and deployment environments out through release to production. It's not a big nasty suite, instead we integrate with tools like Subversion and Jira to make sure every release has a manifest of source and problem ticket changes.
Your release packages would map well to AnthillPro's built in dependency system. We often see customers create virtual projects that take little or no source code, but instead either relate or package components into a release bundle.
Where AnthillPro may fall short for you is that generally, we would allow operations to see pre-release builds. However, you could add rules that would immediately fail / block an attempted releaes by operations of any build not marked as "pre-release". AnthillPro's system of statuses allows the team to flag a build with custom markers like, "In QA" or "Approved for Release". Combined with rules about running workflows,that should give you the control you need. If some projects are particularly sensitive, you'd just use the role based security to lock those away.
Hope that gives you something to look into.
-- Eric
My options are
build automation systems like AntHill, QuickBuild, TeamForge, BuildForge
file server
source control server
maven repository manager (nexus, archiva)
My goals are
group build by multiple criteria (artifact type, release vehicle, stage/phase)
promote build from dev -> qa -> released
provide access control for dev builds, qa ready builds, production ready builds
I'm going to focus on either source control as file server (using svn) or maven repos manager as file server using nexus. The rational is as follows:
minimize effort
minimize cost
use something I can easily extend when needed to (because I'm certain my requirements will shift).
maven use is growing and will eventually be the dominant build technology here.
Thanks for the information.