Planning node groups for GitLab on an EKS cluster - amazon-web-services

I am in the process of building the infrastructure for my GitLab instance using AWS EKS. I have already created an EKS cluster, added a managed node group, and installed the GitLab Runner in the cluster. In this node group I can now run my pipelines as usual. In my GitLab instance I have several projects, each with an MR pipeline. In addition, I run another pipeline overnight in each project. These overnight pipelines sometimes require certain HW resources, such as an FPGA board or an SDR. To be clear: I don't want to build and deploy apps in my cluster; the cluster should be used exclusively to run the pipelines.
Currently I am trying to create the right setup for the node groups and would like to draw on community experience in this regard.
What do I want to achieve?
I want to be able to choose the HW for individual jobs, such as building the code. It should be possible to speed up the process with more nodes or a stronger instance type.
I also want a node group for external resources with special HW (FPGA boards, SDRs) to use in my tests.
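To make this concrete, here is a rough sketch of the node groups I have in mind, as an eksctl config (all names, sizes and instance types are placeholders, untested):

```yaml
# eksctl ClusterConfig fragment: one node group per job class (sketch)
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: gitlab-ci          # placeholder cluster name
  region: eu-central-1     # placeholder region
managedNodeGroups:
  - name: build-large      # fast builds: bigger instances, scale out
    instanceType: c5.2xlarge
    minSize: 0
    maxSize: 4
    desiredCapacity: 0
    labels:
      ci-role: build
  - name: hw-tests         # nightly jobs that talk to the FPGA/SDR lab
    instanceType: m5.large
    minSize: 0
    maxSize: 1
    desiredCapacity: 0
    labels:
      ci-role: hw
    taints:
      - key: ci-role       # keep ordinary jobs off this group
        value: hw
        effect: NoSchedule
```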
Questions:
What node groups and settings are suitable in your experience?
How do I run jobs on specific node groups via GitLab? Is this possible with tags? How do I address the individual groups in GitLab? (A sketch of what I'm imagining follows below.)
What is the best way to manage external HW resources, like the ones in my local lab?
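Here is roughly how I imagine routing jobs to those groups, if tags are indeed the way. First a values fragment for the gitlab-runner Helm chart (chart keys vary between versions; names are placeholders, untested):

```yaml
# values.yaml fragment for the gitlab-runner Helm chart: register a runner
# that carries the "fpga" tag and schedules its job pods onto the HW group
runners:
  tags: "fpga"
  config: |
    [[runners]]
      [runners.kubernetes]
        [runners.kubernetes.node_selector]
          "ci-role" = "hw"
        [runners.kubernetes.node_tolerations]
          "ci-role=hw" = "NoSchedule"
```

And then a job would opt into that runner via its tag:

```yaml
# .gitlab-ci.yml fragment: the nightly job asks for the tagged runner
nightly-hw-test:
  stage: test
  tags:
    - fpga                 # matches the runner registered above
  script:
    - ./run_hw_tests.sh    # placeholder test entry point
  rules:
    - if: '$CI_PIPELINE_SOURCE == "schedule"'   # only in the nightly schedule
```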
I would be very happy if you shared your experiences with me! Any help is appreciated! Thanks a lot!

Related

Running Windows containers in Kubernetes on AWS

I installed my existing Kubernetes cluster (1.8) in AWS using kops.
I would like to add Windows containers to the existing cluster, but I cannot find the right solution! :(
I thought of following the steps given in:
https://kubernetes.io/docs/getting-started-guides/windows/
I downloaded the node binaries (kubelet, kube-dns, kube-proxy, kubectl) and copied them to my Windows machine, but I got a little confused by the multiple networking options.
They also give a kubeadm option to join the node to my master, which I don't understand, since I used kops to create my cluster.
Can someone advise or help me on how I can get my windows node added?
kops is really good if the default architecture satisfies your requirements; if you need to make changes, it will give you some trouble. For example, I needed to add a GPU node: I was able to add it, but I could not make the process automatic, since I was unable to create an auto scaling group for it.
kops has a lot of pros, like creating the whole cluster in a transparent way.
Do you really need a windows node?
If yes, try launching a cluster using kubeadm, then joining the Windows node to that cluster.
kops will take some time to add this Windows node feature.
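For what it's worth, with a recent kubeadm the join step on the new node can be driven by a config file; a sketch with placeholder values (not tested against a Windows node):

```yaml
# join.yaml: kubeadm join configuration (sketch; all values are placeholders)
apiVersion: kubeadm.k8s.io/v1beta3
kind: JoinConfiguration
discovery:
  bootstrapToken:
    apiServerEndpoint: "master.example.com:6443"  # placeholder control plane endpoint
    token: "abcdef.0123456789abcdef"              # placeholder bootstrap token
    unsafeSkipCAVerification: true                # quick tests only; pin the CA hash in real use
nodeRegistration:
  name: windows-node-1                            # placeholder node name
```

Then run `kubeadm join --config join.yaml` on the node.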

Merging software deployment and vm provisioning in Ansible + Ec2

Most examples of Ansible usage I've seen seem to separate machine provisioning from software deployment.
That is, they have a dynamic inventory of hosts, add a host to that inventory, and then deploy the application on the matching host.
For certain services, it seems more logical (to me) to merge the two steps and include the instance deployment in the playbook that deploys the software.
Is this something that can be done in a practical way with Ansible? How would I go about launching my EC2 instance and, in the same playbook, deploying an application on it, without needing any entity external to the playbook to which the new host's identifiers are added?
If the requirement is specifically AWS, it makes more sense to bake the software into an AMI and then spin up nodes from that. Any fine-tuning of the VM could be done using cloud-init or similar.
Generating the AMI and spinning up the VMs are done in two stages, and Ansible can be used to automate both.
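That said, if you do want a single playbook, the usual pattern is a first play on localhost that launches the instance and registers it in an in-memory group with add_host, then a second play against that group. A sketch, assuming the amazon.aws collection and an existing key pair and AMI (all names are placeholders, untested):

```yaml
# provision-and-deploy.yml: one playbook, two plays (sketch)
- name: Provision the EC2 instance
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Launch the instance
      amazon.aws.ec2_instance:
        name: app-server                    # placeholder name tag
        instance_type: t3.micro
        image_id: ami-0123456789abcdef0     # placeholder AMI
        key_name: my-key                    # placeholder key pair
        state: running
        wait: true
      register: ec2

    - name: Add the new host to an in-memory group
      ansible.builtin.add_host:
        name: "{{ ec2.instances[0].public_ip_address }}"
        groups: just_launched

    - name: Wait until SSH is reachable
      ansible.builtin.wait_for:
        host: "{{ ec2.instances[0].public_ip_address }}"
        port: 22
        timeout: 300

- name: Deploy the application on the instance we just created
  hosts: just_launched
  become: true
  tasks:
    - name: Install the app (placeholder for your real deployment tasks)
      ansible.builtin.package:
        name: myapp
        state: present
```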

AWS CodeDeploy Instance specific configuration

I'm not a native speaker, so first of all I'm sorry for my bad English.
What is the best practice for instance specific configuration in AWS CodeDeploy?
I want to deploy a server to multiple instances, and I also want to register a cron job (say, a daily report) on just one of those instances. I'm using AWS CodeDeploy, and it looks like there's no simple option for such a thing.
I have some solutions, but none are very satisfying. One is to use a separate deployment group, which means I have to manage some additional revisions. The other is to add tags to the EC2 instances and branch on those tags, which feels too tricky. Is there any other recommended way to do it?
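For reference, the deployment-group variant would be a single revision whose hook script branches on the DEPLOYMENT_GROUP_NAME environment variable that CodeDeploy sets for hook scripts; a sketch (paths and group names are placeholders, untested):

```yaml
# appspec.yml: one revision shared by a "web" and a "web-cron" deployment group
version: 0.0
os: linux
files:
  - source: /app
    destination: /opt/myapp
hooks:
  AfterInstall:
    - location: scripts/configure.sh   # placeholder script that installs the
      timeout: 60                      # cron entry only when
      runas: root                      # $DEPLOYMENT_GROUP_NAME is "web-cron"
```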
There is no best practice for instance-specific configuration in CodeDeploy for instances in the same deployment group. I recommend creating an entirely separate application running on a different instance if you want to run jobs like the daily report, so that the job does not interfere with the normal functioning of your application (for example, if the job consumes all the CPU, your server on that same box will be impacted).

How to deploy to an autoscaling group with only one active node without downtime

There are two questions about AWS autoscaling + deployment which I cannot clearly answer:
I'm currently trying to figure out the best strategy for deploying, without downtime, to an EC2 instance behind an ELB that is the only member of an autoscaling group.
Right now the EC2 setup is done with Puppet, including the deployment of the application, triggered after a successful build by Jenkins.
The best solution I have found is to check via a script how many instances are registered with the ELB. If a single one is registered, spawn a new one that runs Puppet on startup (so the new node is up to date), then kill the old node.
How do I deploy (autoscaling EC2 behind an ELB) without serving two different versions of the application at the same time?
Possible solution: check via a script how many EC2 instances are registered with the ELB, spawn the same number of instances, register all the new ones, and deregister all the old ones.
My experience with AWS teaches me that AWS has a service for everything. So, are there any services out there that meet my requirements, making my homegrown solutions unnecessary?
You can create an entirely new environment with its own ELB, and when it's ready and checked, you switch the DNS record to the new ELB.
Either way, for a brief time (60 seconds or so, depending on the TTL of your DNS record) some users will see the old version while others see the new one.
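If the record lives in Route 53 and the stack is managed with CloudFormation, the switch can be as small as repointing an alias record at the new ELB; a sketch (zone, record and resource names are placeholders):

```yaml
# CloudFormation fragment: alias record pointing at the new load balancer
AppDnsRecord:
  Type: AWS::Route53::RecordSet
  Properties:
    HostedZoneName: example.com.      # placeholder hosted zone
    Name: app.example.com.            # placeholder record name
    Type: A
    AliasTarget:
      DNSName: !GetAtt NewLoadBalancer.DNSName
      HostedZoneId: !GetAtt NewLoadBalancer.CanonicalHostedZoneNameID
```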
In the end there were two possible solutions. Both of them would temporarily serve two versions of the app.
Use AWS CodeDeploy to perform a sequential deployment (one instance after another). This solution offers the possibility of rolling back to a previous state and visually shows the state and results of the deployment.
Create a Python script to get the registered nodes (using Boto) and run the appropriate Puppet script on them (using Fabric). This solution offers more control over the deployment, but requires some time to build the scripts. Also, there can be bugs.
For now I chose AWS CodeDeploy because it's already available and, hopefully, well tested.

Mesos, Marathon, the cloud and 10 data centers - how do they talk to each other?

I've been looking into the Mesos, Marathon and Chronos combo to host a large number of websites. In my head, I should be able to type a few commands into my laptop and wait about 30 minutes for the thing to build and deploy.
My only issue is that my resources are scattered across multiple data centers, numerous cloud accounts, and about 6 on-premises locations. I see no reason why I can't control them all from my laptop (I have serious power and control issues when it comes to my hardware!).
I'm thinking my best approach is to build the brains in the cloud (ZooKeeper and at least one master) and then add the separate data centers, but I have yet to see any example of a distributed cluster where not all the nodes can talk to each other.
Can anyone recommend a way of doing this?
I've got a setup like this that I'd like to recommend:
Source code, deployment scripts and Dockerfiles are in Git.
Each webservice has its own directory and comes with a Dockerfile to containerize it.
A build script (a shell script running docker build) builds all the Docker containers, and all the images are pushed to a Docker image repository.
An Ansible deploy pushes all the containers out to a set of VPSes. (You would use your own deployment procedure that fits Mesos/Marathon.)
As part of the process, an ActiveMQ broker is deployed to the cloud (yep, in a container). While deploying, it supplies each node with the URL of the broker it needs to connect to. In your setup you could use ZooKeeper or etcd instead, for example.
I am also using Jenkins to do automatic rebuilds and to run deploys whenever there are Git commits, but they can also be done manually.
Rebuilds are lightning fast, and deploys don't take much time either. I can replicate everything in my repository endlessly with zero configuration.
To do a new deploy, all I need is a set of VPSes with Docker daemons and some datastores for persistence. I'm not sure whether that part can be replaced with Mesos, but Ansible will definitely be able to install a Mesos cluster onto your hardware for you.
All logging is done with Logstash, to a central logging server.
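The deploy step itself can be as simple as one task per host; a sketch of what mine looks like, assuming the community.docker collection and an inventory group called webservices (registry and service names are placeholders):

```yaml
# deploy.yml fragment: pull and (re)start every service container (sketch)
- hosts: webservices
  become: true
  tasks:
    - name: Run the latest image of each service
      community.docker.docker_container:
        name: "{{ item }}"
        image: "registry.example.com/{{ item }}:latest"   # placeholder registry
        state: started
        pull: true                  # always pull the freshly pushed image
        restart_policy: always
      loop:
        - service-a                 # placeholder service names
        - service-b
```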
I have set up a 3-master, 5-slave, 1-gateway Mesos/Marathon/Docker cluster and documented it here:
https://github.com/debianmaster/Notes/wiki/Mesos-marathon-Docker-cluster-setup-on-RHEL-7-with-three-master
This may help you understand the load balancing / scaling across different machines in your data center.
1) Masters can also be used as slaves.
2) The Mesos haproxy bridge script can be used for service discovery of newly created services in the cluster.
3) The gateway haproxy is updated every minute with the new services that are created.
This documentation covers:
1) master/slave setup
2) setting up haproxy so that it reloads automatically
3) setting up Docker
4) an example service program
You should use Terraform to manage your infrastructure as code.
Terraform has a lot of providers that allow you to manage different resources across multiple cloud services and/or bare-metal platforms such as vSphere.
You can start with the Getting Started Guide.