Best approach to run AWS CodeBuild steps in parallel?

I'm new to CodeBuild and have built a CodeBuild project that builds a set of AMIs.
It takes a long time to run. In other build systems like CircleCI and Concourse I've used features that allow build steps to run in separate Docker containers in parallel; the build system waits for them all to finish and then proceeds to the next step.
Does CodeBuild support something like this? I don't see that it does...
If it doesn't, what is the best approach? Is this a use case for CodePipeline?
I could also pass in each AMI as a parameter to the build and run n copies of the build simultaneously (triggered by a script that launches one build per AMI).
Thanks for any thoughts!

Below is an existing Stack Overflow answer that might help you achieve the parallelism:
CodePipeline buildspec and multiple build actions
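If the fan-out approach mentioned in the question (a script that launches one build per AMI and waits for them all) is enough, a rough boto3 sketch could look like the following; the project name, the AMI_NAME variable, and the AMI list are assumptions and would have to match your buildspec:

import time
import boto3

# Hypothetical fan-out: start one CodeBuild build per AMI, then wait for all of them.
codebuild = boto3.client("codebuild")
amis = ["base-ami", "web-ami", "worker-ami"]  # illustrative AMI names

build_ids = []
for ami in amis:
    response = codebuild.start_build(
        projectName="ami-builder",  # assumed CodeBuild project name
        environmentVariablesOverride=[
            {"name": "AMI_NAME", "value": ami, "type": "PLAINTEXT"}
        ],
    )
    build_ids.append(response["build"]["id"])

# Poll until every build has finished, then check the statuses.
while True:
    builds = codebuild.batch_get_builds(ids=build_ids)["builds"]
    if all(b["buildComplete"] for b in builds):
        break
    time.sleep(30)

failed = [b["id"] for b in builds if b["buildStatus"] != "SUCCEEDED"]
if failed:
    raise SystemExit(f"Failed builds: {failed}")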

Related

How to implement a CI/CD pipeline for Apache Beam/Dataflow classic templates (Python) & data pipelines

What is the best way to implement a CI/CD build process for Apache Beam/Dataflow classic templates & pipelines in Python? I have only found tutorials for this in Java that involve Artifact Registry + Cloud Build, and rarely any in-depth tutorials for Python. I'd like to understand the "best-practice" way to develop pipelines in a GitHub repo and then have a CI/CD pipeline that automates staging the template and kicking off the job.
This Medium post was one of the more helpful high-level walkthroughs, but didn't go deep into getting all the tools to work together:
https://medium.com/posh-engineering/how-to-deploy-your-apache-beam-pipeline-in-google-cloud-dataflow-3b9fe431c7bb
I use a Beam/Dataflow pipeline with a CI/CD pipeline in GitLab. These are the steps my CI/CD pipeline follows:
In my .gitlab-ci.yml file I pull a google/cloud-sdk Docker image, which creates an environment with Python 3.8 and the essential gcloud tools.
After that I run the unit tests and the integration tests of my pipeline. Once they succeed, I build a flex template (in your case you want to build a classic template) with the gcloud builds submit command.
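The answer above builds a flex template; for the classic templates the question asks about, staging is done by running the pipeline code itself with a template_location option instead of gcloud builds submit. A minimal sketch, where the project, region, bucket, and transforms are placeholders:

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Running this once with the DataflowRunner stages a classic template to GCS
# instead of executing a job.
options = PipelineOptions([
    "--runner=DataflowRunner",
    "--project=my-project",                                       # placeholder
    "--region=us-central1",                                       # placeholder
    "--temp_location=gs://my-bucket/temp",                        # placeholder
    "--staging_location=gs://my-bucket/staging",                  # placeholder
    "--template_location=gs://my-bucket/templates/my_template",   # where the template is written
])

with beam.Pipeline(options=options) as p:
    (p
     | "Source" >> beam.Create(["stand-in", "for", "real", "source"])
     | "Transform" >> beam.Map(str.upper))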
Also, if you want to automatically kick off the job after all this, you have two options:
Either run the pipeline from the command line inside the Docker container of your CI pipeline,
Or, since you have already created a template for your pipeline, trigger it with an HTTP request, for example.
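As a sketch of the second option for a classic template, the job can be launched through the Dataflow REST API, for example with google-api-python-client; the project, region, bucket, job name, and parameters below are placeholders:

from googleapiclient.discovery import build

# Launch a previously staged classic template as a Dataflow job.
dataflow = build("dataflow", "v1b3")  # picks up Application Default Credentials
request = dataflow.projects().locations().templates().launch(
    projectId="my-project",                          # placeholder
    location="us-central1",                          # placeholder
    gcsPath="gs://my-bucket/templates/my_template",  # the staged classic template
    body={
        "jobName": "my-template-job",
        "parameters": {"input": "gs://my-bucket/input.txt"},  # pipeline-specific parameters
    },
)
response = request.execute()
print(response["job"]["id"])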
Yes, for me the Medium post actually covers most of it and helped me build my CI pipeline as well.
These are the stages that I have:
Infra - Terraform for the prerequisite GCP infra
Build - pip install -r requirements.txt and anything else.
Test - Unit, integration, end-to-end. I will implement performance tests with a sample of prod data later on.
Security Checks - Secrets scanning, SAST
SonarQube for SCA
Deploy Template and Metadata (both Manual) to PoC, other environments and Prod. I use standard templates.
Run Job (Manual) - an action to run the job with the DirectRunner for quick testing, and another job using the Dataflow runner via gcloud dataflow jobs run ${JOB_NAME}... (a minimal DirectRunner sketch follows this list).
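For the quick-testing action in the last item, a minimal sketch of running the pipeline locally with the DirectRunner (the transforms are stand-ins for the real pipeline):

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Quick local run with the DirectRunner; no GCP resources are needed.
def run(argv=None):
    options = PipelineOptions(argv, runner="DirectRunner")
    with beam.Pipeline(options=options) as p:
        (p
         | "Source" >> beam.Create(["a", "b", "c"])  # stand-in for the real source
         | "Transform" >> beam.Map(str.upper)
         | "Sink" >> beam.Map(print))                # stand-in for the real sink

if __name__ == "__main__":
    run()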
For most steps, I used the python:3.10 image as the default (I ran into issues with installing the apache-beam dependency using Python 3.11), and google/cloud-sdk alpine for the gcloud steps.
There are other things we need to consider, such as an action to stop a Dataflow job and to roll back to a previously working Dataflow template (which requires uploading multiple templates to GCS).
Hope this helps.

Provisioning a custom Docker image on AWS CodeBuild takes a very long time

My Dockerfile:
FROM mcr.microsoft.com/dotnet/framework/sdk:4.8-windowsservercore-ltsc2016
COPY AWSCLIV2.msi .
RUN Start-Process msiexec.exe -Wait -ArgumentList '/I AWSCLIV2.msi /quiet /qn /norestart /log awscli.log'
RUN rm AWSCLIV2.msi
My CodeBuild environment needs to be able to build a .NET Framework project as well as use the AWS CLI. Due to limitations, I can only have one CodeBuild stage. I push the Docker image created from the above Dockerfile to ECR and set my CodeBuild environment to use that image. However, it takes ~10 minutes to provision.
CodeBuild provides caching that only lasts ~15 minutes, which is not helpful for infrequent builds. I also found this solution that others have linked to, https://github.com/aws/aws-codebuild-docker-images/issues/26#issuecomment-370177343, but I'm not sure how it can be applied to Windows containers.
If anyone has any pointers on decreasing the provisioning time, I would really appreciate it.
Windows images are large, so the provisioning time, which includes the time to pull the custom image down to the CodeBuild instance, will be relatively long.
There are two approaches that can help:
Use CodeBuild-provided images for the build environment, as the latest versions of these images are pre-cached on the build servers.
Make the base layer of the custom image the same as that of the CodeBuild image, so that the base layer can be reused and won't incur download time, e.g.:
For Microsoft Windows, use a Windows container with a container OS that is version microsoft/windowsservercore:10.0.x (for example, microsoft/windowsservercore:10.0.14393.2125).

Fabric task dependencies

I am working on a Fabric file to make our code deployment process a little bit easier. Now I would like to have dependencies between certain tasks, similar to what is discussed here.
Let's simplify the problem and say I have two tasks: build and deploy. The build task should build our code and the deploy task will transfer it to a deployment server.
Now, deploy obviously depends on build, but build could also be a standalone task. So someone could just build the code with fab build or deploy the code with fab build deploy. But I also want people to be able to use fab deploy for convenience, and it should then run build first. However, build should only be executed once.
So if I call build inside the deploy task and then do fab build deploy, it will run build twice and then deploy.
I managed to do this with the runs_once decorator and execute function.
The build task is now decorated with runs_once, and every task that depends on build, e.g. deploy, calls execute(build) at the beginning. This executes the build task, or silently skips it if it was already executed (thanks to the decorator).
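A minimal Fabric 1.x sketch of that pattern; the build command, artifact name, and remote path are placeholders:

from fabric.api import task, runs_once, execute, local, put

@task
@runs_once
def build():
    # Runs at most once per fab invocation thanks to @runs_once.
    local("make build")  # placeholder build command

@task
def deploy():
    # Make sure build has run; this is a no-op if it already ran in this invocation.
    execute(build)
    put("dist/app.tar.gz", "/srv/app/app.tar.gz")  # placeholder artifact and path

With this, fab deploy runs build first, and fab build deploy still builds only once.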
This is more like a workaround than a solution but it works in my case. Regardless, thanks to everyone for their input

Maven multi-module deploy to repository only after successful unit tests

Question: What is the best solution for executing a 'mvn deploy' such that the deploy part is only run after all unit tests succeed and no processing steps are duplicated?
I was hoping the simple answer was: execute Maven command 'x' (or use a flag) such that the deploy can be run without invoking the prior goals in the default lifecycle.
Sadly this does not appear to have a simple answer. I have included the details on the path I have followed so far below.
We have the following three requirements:
Execute the maven deploy goal to deploy all multi-module artifacts to a remote repository.
Only deploy if ALL unit tests across all projects pass.
Do not repeat any processing.
We started with simply "mvn clean deploy", however we noticed a couple of issues:
The build would stop before completing all unit tests :: so we added the --fail-at-end flag
The deploy goal would execute against any modules that were successful.
This results in a "corrupted" state where the remote repository may only have a partial deployment (if there were modules with failures later in the build).
We looked at 3 different solutions:
Staging the artifacts prior to deploying :: this was determined to be too heavy for a fully automated process.
Use a profile to override the default lifecycle such that 'mvn deploy -Pci-deploy' would run without invoking any prior goals :: this worked and was fast, but is obviously an unconventional approach.
Simply running 'mvn clean package' and then, only if successful, executing 'mvn deploy' :: this appears to work and seems to only take a minor hit when the goals are invoked (though some of them are smart enough not to reprocess an unchanged workspace)
I pose this question to the community with the background details I have provided to determine if there is a better approach or a strong opinion regarding (potentially) making one of the following requests:
A new deploy goal that can run separate and apart from all other lifecycle goals with the expectation that: all prior steps have already been run and that it will execute the deploy identically to "mvn deploy"
a flag in the deploy goal which would effectively disable the previous goals.
a little more out of the box and definitely against the current convention:
a flag that would tell maven to run the [unit] test goal for all modules prior to proceeding.
Notes:
We are using Jenkins, but for the purposes of this question the CI environment is not the complication.
I tried the 'mvn deploy:deploy' goal, but it had a number of unclear errors.
I have not considered integration tests as part of the requirements.
Update 8/20/2013
I tested the deferred deploy plugin and determined that the tool worked as expected, but took way too long.
For our code base:
mvn clean deploy: for all goals executed in 2:44
mvn clean install 'deferred-deploy-plugin': for all goals executed in 15 min
mvn clean package; mvn deploy -Pci-deploy (a custom build profile that disables the earlier goals) executed:
for all goals (including deploy): 4:30
deploy only: 1:45
mvn clean package; mvn deploy -Dmaven.test.skip=true on the same workspace executed:
for all goals (including deploy): 4:40
deploy only: 1:54
The clean package followed by deploy skipping the tests runs faster than the deferred deploy and accomplished our desire to delay the deploy until after the tests succeed.
There appears to be a minor time hit when the deploy lifecycle invokes and quickly exits each of the preceding goals (process, compile, test, package, etc.). However, the only alternative is to hack a non-standard execution, which only saves 10 seconds.
There's a new answer now. Since version 2.8 of the maven-deploy-plugin there's a way to do this "natively". See the JIRA issue for details.
Basically you need to force at least v2.8 of the plugin:
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-deploy-plugin</artifactId>
  <version>2.8</version>
  <configuration>
    <deployAtEnd>true</deployAtEnd>
  </configuration>
</plugin>
and use the new parameter deployAtEnd, as shown in the configuration above. More info here. This setting usually goes along with installAtEnd of the maven-install-plugin.
As an alternative, I also found this
http://code.google.com/p/maven-deferred-deploy-plugin/
A maven plugin that iterates through all projects in a reactor and
executes a deploy on each project individually. Can be used to produce
a near-atomic build for a reactor by deferring artifact deployment
until the install phase has completed.
Sounds a lot like what you were asking for. I still think my other answer is easier to implement since you use Jenkins; just check a checkbox.
Two things.
Disabling all the previous phases: I don't see that as an option. It is a basic feature of Maven; you would be altering the standard lifecycle, so I highly doubt anyone would implement something in a plugin to allow this.
Since you said you use Jenkins, there is a setting in Jenkins specifically for the case of deploying at the end, to guarantee that the repo is not left in a corrupt/intermediate state.
In "Post-build actions"
Deploy artifacts to a Maven repository. In comparison with the
standard mvn deploy, this feature allows you to deploy artifacts after
the entire build is confirmed to be successful.
This prevents a typical problem in Maven, where some modules are deployed before a critical failure is discovered later down the road,
rendering the repository state inconsistent.
Note that regardless of this configuration, you can always manually come back to Jenkins and deploy any of the past artifacts to
any repository of your choice, after the fact.
To use this feature you shouldn't deactivate the automatic artifact archiving.
I have never used this so I can't confirm whether it works; I just know it's there for this particular use case.

How to Perform remote build in Jenkins

I am new to Jenkins. Please help me with my requirement.
I'm running Jenkins in a Windows environment. I have a development box where Jenkins is running successfully. Now I have to do a build on another Windows machine (say, a QA box) from the dev box. Can anyone please suggest how to do this?
The solution is quite simple.
Step 1: Create and configure the slave node (QA BOX) with Jenkins.
Go to Manage Jenkins
Click on Manage Nodes
New Node Configuration
Step 2: There may be several ways to complete this task.
Configure the jobs according to the new machine (IP, ports, or any other dependencies). A good practice is keeping the build scripts separate per machine, or keeping separate properties files for different machines.
Configure jobs according to the new slave configuration.
Keep in mind any dependencies on file structure, IPs, and ports.
Step 3: Run the jobs and debug for any dependencies regarding the machine.
If you encounter any trouble, go through the logs and find the related problem.
Create a test node for your QA BOX
Configure a Job to:
Update the latest code on the remote test node, for example from SVN
Configure the build settings for the remote test node build, for example using Ant
Done