I have Hosted agent in the Hosted queue and I also installed local agent in the Default queue. Can I use them for the same build configuration to have builds running in parallel (for different branches/versions/etc)?
The same build can’t be run in local and hosted agent. You can queue multiple build with different build agent. Another build can be triggered during current build by calling Queue a build rest api.
A build definition can only be configured against one Agent Queue, so it is not possible currently to use both Hosted as well as Default agent queue for a single build definition.
You can instead buy more Hosted Pipelines that will allow you to run more builds on Hosted agents in parallel or you can configure more local agents.
Related
Code agent gives us the ability to run builds against local instances or we can use the standard containers that get spun up when we run a build. I want to know if there is a way to dynamically select between these at run time. For example is it possible that I could run my build and check if my local agent is busy, if it is spin up a build container and if not run it on the local agent.
I need to run a pipeline of data transformation that is composed of several scripts in distinct projects = Python repos.
I am thinking of using Compute Engine to run these scripts in VMs when needed as I can manage resources required.
I need to be able to orchestrate these scripts in the sense that I want to run steps sequentially and sometimes asyncronously.
I see that GCP provides us with a Worflows components which seems to suit this case.
I am thinking of creating a specific project to orchestrate the executions of scripts.
However I cannot see how I can trigger the execution of my scripts which will not be in the same repo as the orchestrator project. From what I understand of GCE, VMs are only created when scripts are executed and provide no persistent HTTP endpoints to be called to trigger the execution from elsewhere.
To illustrate, let say I have two projects step_1 and step_2 which contain separate steps of my data transformation pipeline.
I would also have a project orchestrator with the only use of triggering step_1 and step_2 sequentially in VMs with GCE. This project would not have access to the code repos of these two former projects.
What would be the best practice in this case? Should I use other components than GCE and Worflows for this or there is a way to trigger scripts in GCE from an independent orchestration project?
One possible solution would be to not use GCE (Google Compute Engines) but instead create Docker containers that contain your task steps. These would then be registered with Cloud Run. Cloud Run spins up docker containers on demand and charges you only for the time you spend processing a request. When the request ends, you are no longer charged and hence you are optimally consuming resources. Various events can cause a request in Cloud Run but the most common is a REST call. With this in mind, now assume that your Python code is now packaged in a container which is triggered by a REST server (eg. Flask). Effectively you have created "microservices". These services can then be orchestrated by Cloud Workflows. The invocation of these microservices is through REST endpoints which can be Internet addresses with authorization also present. This would allow the microservices (tasks/steps) to be located in separate GCP projects and the orchestrator would see them as distinct callable endpoints.
Other potentials solutions to look at would be GKE (Kubernetes) and Cloud Composer (Apache Airflow).
If you DO wish to stay with Compute Engines, you can still do that using shared VPC. Shared VPC would allow distinct projects to have network connectivity between each other and you could use Private Catalog to have the GCE instances advertize to each other. You could then have a GCE instance choreograph or, again, choreograph through Cloud Workflows. We would have to check that Cloud Workflows supports parallel items ... I do not believe that as of the time of this post it does.
This is a common request, to organize automation into it's own project. You can setup service account that spans multiple projects.
See a tutorial here: https://gtseres.medium.com/using-service-accounts-across-projects-in-gcp-cf9473fef8f0
On top of that, you can also think to have Workflows in both orchestrator and sublevel project. This way the orchestrator Workflow can call another Workflow. So the job can be easily run, and encapsuled also under the project that has the code + workflow body, and only the triggering comes from other project.
I am trying to setup a build pipeline in azure devops. I have a cpp project. It needs a Framemaker FDKs to build. I have the FDK files on my local storage as well as on TFS. I want to link them in my pipeline so that they can be used while building and do not throw file not found errors.
Since Framemaker FDK is installed on local machines. You should create a self-hosted agent on the same machine where the FDK is installed.
And then you need configure your pipeline to run on this self-hosted agent by targeting the pipeline agent pool to your local agent pool. So that your pipeline will be able to access to the FDK installed on the local machine while running on the self-hosted agent.
See the detailed steps here to create self-hosted agent.
I'm trying to setup an Auto Scaling Group in combination with CodeDeploy. Everything works fine except for the fact that when a new instance is created CodeDeploy starts before the user data script (defined in the Launch Configuration) finishes.
The default value of this user data script downloads and install the code deploy agent and i've extended it with installation of a couple of windows features, IIS rewrite module and msdeploy.
In my appspec.yml i'm using the hook AfterInstall to deploy my IIS website and this obviously fails when msdeploy is not installed (yet).
Am i going about this the wrong way or is there a way to make CodeDeploy wait for the user data script to finish?
Unfortunately, there's no was for CodeDeploy to know anything more than the instance has loaded it's OS. The good thing is that CodeDeploy give the host agent 1 hour to start polling for commands with automatic deployments. The easiest thing to do is install the host agent after all the required dependencies are installed. The automatic deployment will be created, but can't proceed until after the host agent is started.
This is explained in detail here - https://aws.amazon.com/blogs/devops/under-the-hood-aws-codedeploy-and-auto-scaling-integration/
Ordering execution of launch scripts – The CodeDeploy agent looks for and executes deployments as soon as it starts. There is no ordering between the deployment execution and launch scripts such as user data, cfn-init, etc. We recommend you install the host agent as part of (and maybe as the last step in) the launch scripts so that you can be sure the deployment won’t be executed until the instance has installed dependencies that are not part of your CodeDeploy deployment. If you prefer baking the agent into the base AMI, we recommend that you keep the agent service in a stopped state and use the launch scripts to start the agent service.
I've been looking into Mesos, Marathon and Chronos combo to host a large number of websites. In my head I should be able to type a few commands into my laptop, and wait about 30 minutes for the thing to build and deploy.
My only issue, is that my resources are scattered across multiple data centers, numerous cloud accounts, and about 6 on premises places. I see no reason why I can't control them all from my laptop -- (I have serious power and control issues when it comes to my hardware!)
I'm thinking that my best approach is to build the brains in the cloud, (zoo keeper and at least one master), and then add on the separate data centers, but I am yet to see any examples of a distributed cluster, where not all the nodes can talk to each other.
Can anyone recommend a way of doing this?
I've got a setup like this, that i'd like to recommend:
Source code, deployment scripts and dockerfiles in GIT
Each webservice has its own directory and comes together with a dockerfile to containerize it
A build script (shell script running docker builds) builds all the docker containers, of which all images are pushed to a docker image repository
A ansible deploy deploys all the containers remotely to a set of VPSes. (You use your own deployment procedure, that fits mesos/marathon)
As part of the process, a activeMQ broker is deployed to the cloud (yep, in a container). While deploying, it supplies each node with the URL of the broker they need to connect to. In your setup you could instead use ZooKeeper or etcd for example.
I am also using jenkins to do automatic rebuilds and to run deploys whenever there has been GIT commits, but they can also be done manually.
Rebuilds are lightning fast, and deploys dont take much time either. I can replicate everything I have in my repository endlessly and have zero configuration.
To be able to do a new deploy, all I need is a set of VPSs with docker daemons, and some datastores for persistence. Im not sure if this is something that you can replace with mesos, but ansible will definitely be able to install a mesos cloud for you onto your hardware.
All logging is being done with logstash, to a central logging server.
i have setup a 3 master, 5 slave, 1 gateway mesos/marathon/docker setup and documented here
https://github.com/debianmaster/Notes/wiki/Mesos-marathon-Docker-cluster-setup-on-RHEL-7-with-three-master
this may help you in understanding the load balancing / scaling across different machines in your data center
1) masters can also be used as slaves
2) mesos haproxy bridge script can be used for service discovery of the newly created services in the cluster
3) gateway haproxy is updated every min with new services that are created
This documentation has
1) master/slave setup
2) setting up haproxy that automatically reloads
3) setting up dockers
4) example service program
You should use Terraform to orchestrate your infrastructure as code.
Terraform has a lot of providers that allows you to manage different resources accross multiples clouds services and/or bare-metal resources such as vSphere.
You can start with the Getting Started Guide.