I am planning to use Azure VMSS to deploy a set of Spring Boot apps. The plan is to create a custom Linux VM image with all the required software/utilities as well as the required directory structure, and to configure this image in the VMSS. We use Jenkins as our CI/CD tool and Git as the source code repo. What is the best way to build and deploy these Spring Boot apps on VMSS?
I think one way is to write a Custom Script Extension (CSE) script which downloads the code from the Git repo and then starts these Spring Boot apps. I believe this script will then get executed every time a new VM is provisioned.
But what about cases where multiple VMs are already running at or above the minimum scale instance count? I believe a manual restart will not trigger the CSE script to run on these already-running VMs, right?
Could anyone advise the best way to handle this?
Also, once a VM is deallocated due to auto scale-down, what is the best/most cost-effective way to back up the log files from the VM to storage (blob or file share)?
You could enable "Automatically tear down virtual machines after every use" under Organization settings/Project settings >> Agent pools >> your VMSS agent pool >> Settings. Then a new VM instance is used for every job: after running a job, the VM goes offline and is reimaged before it picks up another job. The Custom Script Extension is executed on every virtual machine in the scale set immediately after it is created or reimaged. Here is the reference document: Create the scale set agent pool.
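For reference, attaching such a script as a Custom Script Extension on the scale set could look roughly like this with the Azure CLI (the resource group, scale set name, script URL and command are placeholders, not taken from your setup):

    az vmss extension set \
      --resource-group my-rg \
      --vmss-name my-vmss \
      --name CustomScript \
      --publisher Microsoft.Azure.Extensions \
      --settings '{"fileUris":["https://example.com/scripts/bootstrap.sh"],"commandToExecute":"bash bootstrap.sh"}'

Note that, as far as I know, if the scale set's upgrade policy is Manual, instances that are already running only pick up a changed extension after you upgrade them, e.g. with az vmss update-instances --resource-group my-rg --name my-vmss --instance-ids '*', which is relevant to your question about the VMs that are already up.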
To back up the log files from the VM, you could refer to the "Troubleshoot and support" documentation for the relevant file paths on the target virtual machine.
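As a sketch (the storage account, container and paths are placeholders), a small script run on a schedule or as a scale-in/shutdown hook could copy the logs up with the Azure CLI:

    #!/bin/bash
    # --auth-mode login assumes the VM has a managed identity with access to the storage
    # account; otherwise use a SAS token or account key instead.
    az storage blob upload-batch \
      --account-name mylogstorage \
      --destination app-logs \
      --destination-path "$(hostname)/$(date +%F)" \
      --source /var/log/myapp \
      --auth-mode login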
Please let me know if this question is more appropriate for a different channel, but I was wondering what the recommended tools are for installing, configuring and deploying Hadoop/Spark across a large number of remote servers. I'm already familiar with how to set up all of the software, but I'm trying to determine what I should start using that would allow me to easily deploy across a large number of servers. I've started to look into configuration management tools (i.e. Chef, Puppet, Ansible), but was wondering what the best and most user-friendly option is to start off with. I also do not want to use spark-ec2. Should I be creating homegrown scripts to loop through a hosts file containing IPs? Should I use pssh? pscp? etc. I just want to be able to SSH into as many servers as needed and install all of the software.
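(For context, the homegrown approach I have in mind is basically a pssh/pscp run over a hosts file, something like the following, where the hosts, user and packages are just placeholders:)

    # hosts.txt has one hostname/IP per line; pssh/pscp are installed as
    # parallel-ssh/parallel-scp on some distros.
    pssh -h hosts.txt -l ubuntu -i "sudo apt-get update && sudo apt-get install -y openjdk-8-jdk"
    pscp -h hosts.txt -l ubuntu spark.tgz /tmp/spark.tgz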
If you have some experience with a scripting language, then you can go for Chef. Recipes are already available for deploying and configuring the cluster, and it's very easy to start with.
And if you want to do it on your own, you can use the sshxcute Java API, which runs scripts on remote servers. You can build up the commands there and pass them to the sshxcute API to deploy the cluster.
Check out Apache Ambari. It's a great tool for central management of configs, adding new nodes, monitoring the cluster, etc. This would be your best bet.
I'm new to configuration management and deployment tools. I have to implement Continuous Delivery/Continuous Deployment tooling for one of the most interesting projects I've ever put my hands on.
First of all, individually, I'm comfortable with AWS, and I know what Ansible is, the logic behind it and its purpose. I do not have the same level of understanding of Docker, but I get the idea. I went through a lot of Internet resources, but I can't get the big picture.
What I've been struggling with is how they fit together. Using Ansible, I can manage my infrastructure as code: building EC2 instances, installing packages... I can even deploy a full application by pulling its code, modifying config files and starting the web server. Docker is, itself, a tool that packages an application and ensures that it can be run wherever you deploy it.
My problems are:
How does Docker (or Ansible and Docker) extend the Continuous Integration process?
Suppose we have a source code repository; the team members finish working on a feature and push their work. Jenkins detects this, runs all the acceptance/unit/integration test suites, and if they all pass, it declares it a stable build. How does Docker fit in here? I mean, when the team pushes their work, does Jenkins have to pull the Dockerfile kept with the app's source code, build the application image, start the container and run all the tests against it? Or does it run the tests the classic way and, if all is good, then build the Docker image from the Dockerfile and save it somewhere private?
Should Jenkins tag the final image, using x.y.z for example?
Docker container configuration:
Suppose we have an image built by Jenkins and stored somewhere. How do we handle deploying the same image into different environments, and even with different configuration parameters (vhost config, DB hosts, queue URLs, S3 endpoints, etc.)? What is the most flexible way to deal with this without breaking Docker principles? Are these configurations baked into the image when it gets built, or when the container based on it is started? If the latter, how are they injected?
Ansible and Docker:
Ansible provides a Docker module to manage Docker containers. Assuming I've solved the problems mentioned above, when I want to deploy a new version x.y.z of my app, I tell Ansible to pull that image from wherever it is stored and start the app container. So how do I inject the configuration settings? Does Ansible have to log in to the Docker image before it's running (this sounds insane to me) and use its Jinja2 templates the same way as with a classic host? If not, how is this handled?
Excuse me if this is a long question or if I misspelled something, but this is me thinking out loud. I've been blocked for the past two weeks and I can't figure out the correct workflow. I want this to be a reference for future readers.
Please, it would be very helpful to read about your experiences and solutions, because this looks like a common workflow.
I would like to answer in parts.
How does Docker (or Ansible and Docker) extend the Continuous Integration process?
Since Docker images are the same everywhere, you use your Docker images as if they were production images. Therefore, when somebody commits code, you build your Docker image, run the tests against it, and when all the tests pass, you tag that image accordingly. Since Docker is fast, this is a feasible workflow.
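As a rough sketch (the image name, test command, version scheme and registry are placeholders, not a prescribed Jenkins setup), a build step could be:

    #!/bin/bash
    set -e
    # Build the image from the Dockerfile committed with the code
    docker build -t myapp:build-$BUILD_NUMBER .
    # Run the test suite inside a container based on that exact image
    docker run --rm myapp:build-$BUILD_NUMBER ./run-tests.sh
    # Only reached if the tests passed: tag the stable build and push it to a private registry
    docker tag myapp:build-$BUILD_NUMBER registry.example.com/myapp:1.2.$BUILD_NUMBER
    docker push registry.example.com/myapp:1.2.$BUILD_NUMBER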
Also, Docker image changes are incremental, so your images will have minimal impact on storage. When your tests fail, you may also choose to save that image; that way a developer can pull it and easily investigate why the tests failed. Developers may choose to run the tests on their own machines too, since the Docker images in Jenkins and on their machines are no different.
What this brings is that all developers have the same environment and the same versions of all software, since you decide what goes into the Docker images. I have come across bugs that were due to differences between developer machines. For example, on the same operating system, Unicode settings may affect your code; but with Docker images, all developers test against the same settings and the same software versions.
Docker container configuration:
If you are using a private registry (and you should use one), then configuration changes will not affect disk space much. Therefore, except for security-sensitive configuration such as DB passwords, you can apply configuration changes to the Docker images themselves (baking the configuration into the container). You can then use Ansible to apply the configuration that is not stored in the image to deployed containers before/after startup, using environment variables or Docker volumes.
https://dantehranian.wordpress.com/2015/03/25/how-should-i-get-application-configuration-into-my-docker-containers/
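For the non-baked part, a minimal sketch of injecting configuration at container start (the image name, variables and paths are placeholders) could be:

    # Secrets and environment-specific settings are passed in at deploy time
    # rather than baked into the image:
    docker run -d \
      -e DB_HOST=db.internal.example.com \
      -e DB_PASSWORD="$DB_PASSWORD" \
      -v /etc/myapp/production:/app/config:ro \
      registry.example.com/myapp:1.2.3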
Does Ansible have to log in to the Docker image, before it's running (this sounds insane to me), and use its Jinja2 templates the same way as with a classic host? If not, how is this handled?
No, Ansible does not have to log in to the Docker image, but Ansible with Jinja2 templates can be used to generate the Dockerfile. You can template the Dockerfile and inject your configuration into the relevant files. Tag the resulting images accordingly, and you have configured images ready to spin up.
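For instance (the file names and the variable are placeholders; this is just one way to drive it from the shell), you could render the Dockerfile per environment and build a tagged image from it:

    # Render a Jinja2-templated Dockerfile for the target environment, then build a tagged image
    mkdir -p build
    ansible localhost -m template -a "src=Dockerfile.j2 dest=build/Dockerfile" -e "env=staging"
    docker build -t myapp:staging build/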
Regarding your question about handling multiple environment configurations with the same Docker image, I have been planning to use a service discovery tool like Consul as a centralized config/property management store. When you start your container, you set an ENV var that tells it what application it is (appID) and what environment config it should use (e.g. MyApplication:Dev), and it pulls its config from Consul at startup. I still have to investigate the security around Consul (if we are storing DB connection credentials in there, for example, how do we restrict who can query/update those values?). I don't want to use this just for containers, but for all apps in general. Another nice capability is changing a config value in Consul and having a hook back into your app to apply the change immediately (maybe a REST endpoint on your app that changes are pushed down to and applied dynamically). Of course, your app has to be written to support this!
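A rough sketch of what that startup lookup could look like in a container entrypoint (the Consul address, key layout and start command here are my own assumptions, not anything Consul prescribes):

    #!/bin/sh
    # APP_ID and APP_ENV are expected to be passed in via `docker run -e ...`
    CONSUL=${CONSUL_ADDR:-http://consul.service.consul:8500}
    # Fetch a single value from Consul's KV store (?raw returns just the value)
    DB_HOST=$(curl -s "$CONSUL/v1/kv/$APP_ID/$APP_ENV/db_host?raw")
    export DB_HOST
    exec /app/start.sh    # hand off to the real application process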
You might be interested in checking out Martin Fowler's blog articles on immutable infrastructure and on Phoenix servers.
Although this is not a complete solution, I have suggestions for two of your issues. They might not be perfect, but these are the practices we are using in our workflow, and they have proven themselves so far.
Defining different environments - supposing you've written a different Ansible role for each environment you launch, we define an environment variable that sets which environment the container belongs to. We then download the suitable configuration file from an S3 bucket into the container using that env variable (which should be possible if you supply AWS creds or give your server an IAM role) and inject those parameters into the code when building it.
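As a sketch of that flow (the bucket name, paths and start command are placeholders), the container entrypoint can do something like:

    #!/bin/sh
    # APP_ENV is set when the container is started; the instance/container needs an IAM role
    # (or AWS credentials) that is allowed to read the config bucket.
    aws s3 cp "s3://my-config-bucket/$APP_ENV/app.properties" /app/config/app.properties
    exec /app/start.sh --config /app/config/app.properties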
Ansible doesn't need to log in to the Docker app, but the solution is a bit tricky. I've tried two ways of tackling this problem, and neither is ideal. The first is to download the configuration file as part of the Docker image's command line and build the app on container startup. While this solution works, it breaches the Docker philosophy and makes the image highly prone to build errors.
Another solution is to push several images to your Docker Hub repo, and then pull the appropriate image according to the environment at hand.
In broader strokes, I've tried launching our app completely with Ansible and it was hell; many configuration steps are tricky and get trickier when you try to implement them as a playbook. Once I switched to maintaining only the servers with Ansible and deploying the app itself with Docker, things got a lot easier.
I am currently using AWS Elastic Beanstalk, and I was curious as to how (as in, internally) it knows, when you fire up an instance (or it does so automatically when scaling), to unpack the zip I uploaded as a version. Is there some environment setting that looks up my zip in my S3 bucket and then unpacks it automatically for every instance running in that environment?
If so, could this be used to automate a task such as running an SQL query on boot-up (instance deployment) too? Are these automated tasks changeable or viewable at all?
Thanks
I don't know how Beanstalk knows which version to download and unpack, but running a task on start-up is trivial. Check out cloud-init, a tool originally written for Ubuntu that's now packaged in Amazon Linux. It allows you to pass arbitrary shell scripts into the UserData section of the instance configuration, and those shell scripts will run on startup.
It's a great way to bootstrap instances on startup, which avoids the soul-sucking misery of managing AMIs.
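For example, a minimal user-data script for the SQL-on-boot idea might look like this (the bucket, DB endpoint, credentials and file names are placeholders, and in practice the password would come from somewhere safer than an environment variable):

    #!/bin/bash
    # Passed as the instance's UserData; cloud-init runs it once on first boot.
    yum install -y mysql aws-cli
    aws s3 cp s3://my-bootstrap-bucket/schema.sql /tmp/schema.sql
    # One-off bootstrap query against an external database (not one living on this instance)
    mysql -h mydb.example.rds.amazonaws.com -u admin -p"$DB_PASSWORD" appdb < /tmp/schema.sql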
A quick (possibly non-applicable) warning: if you're running a SQL query against a database that lives on the Beanstalk AMI itself, you're pretty much guaranteed to lose your database at some point. Those machines are designed to be entirely transient. Do not put databases on them. See this answer for more details.
Since your goal seems to be to run custom configuration tasks, the answer is yes, there is a way to do that. You can define custom actions in .ebextensions config files packaged with your app. For example, you can configure a command to run every time a new machine is deployed:
http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/customize-containers-ec2.html#linux-commands
Heya,
Quick question.
I've got multiple instances on EC2 with a load balancer in front of them. I use an SVN-based setup that used to push to my live environment at will.
With multiple EC2 instances, how would I push a codebase to all of them at once?
Any thoughts/links would be appreciated.
There are a few different ways to do this.
If You Are Using Elastic Load Balancers
Write a script that:
Removes a machine from the pool
Updates the SVN repository
Re-adds the machine to the pool
Repeats for any additional machines
You could also get fancy: remove one machine, update it, then remove and update all of the other machines, if you're concerned about consistency. A rough sketch of the basic loop is shown below.
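Something along these lines, assuming a classic ELB and the AWS CLI (the ELB name, instance IDs, SSH user and paths are placeholders):

    #!/bin/bash
    set -e
    ELB_NAME=my-load-balancer
    for INSTANCE in i-0123456789abcdef0 i-0fedcba9876543210; do
        # Take the instance out of the load balancer pool
        aws elb deregister-instances-from-load-balancer --load-balancer-name "$ELB_NAME" --instances "$INSTANCE"
        # Update the working copy on that machine
        HOST=$(aws ec2 describe-instances --instance-ids "$INSTANCE" \
               --query 'Reservations[0].Instances[0].PrivateIpAddress' --output text)
        ssh deploy@"$HOST" 'svn update /var/www/myapp'
        # Put it back into the pool before moving on to the next one
        aws elb register-instances-with-load-balancer --load-balancer-name "$ELB_NAME" --instances "$INSTANCE"
    done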
If You Are Using a Custom Load Balancing Application
Look into Capistrano. You don't need to use it with Ruby/Rake -- you can write custom cap files that can do parallel deploys.
How about Vlad or Fabric for code deployment?
We use Scalr. It is available as a service (Scalr.net), or you can run it yourself (it is open source, though the source in the Google Code repository is sometimes a little behind the version the service uses).
Basically, Scalr has a global scripting feature whereby you can specify a script (e.g. bash, PHP, anything with a #! line) and trigger it to run on all instances of a given 'role' (e.g. web instance). In our case we have a script that just does svn checkout or svn update as appropriate. Scalr supports periodic scheduling of scripts, so in the dev environment I run it every 5 minutes to keep dev in sync with SVN, but obviously I trigger it manually for production.
(I have the script taking a param to specify the SVN branch to use)
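Roughly, the script looks like this (the repo URL and deploy path here are placeholders, not our real ones); $1 is the SVN branch path, e.g. "trunk" or "branches/1.2":

    #!/bin/bash
    BRANCH=${1:-trunk}
    REPO="https://svn.example.com/myapp/$BRANCH"
    DEPLOY_DIR=/var/www/myapp

    if [ -d "$DEPLOY_DIR/.svn" ]; then
        # Already checked out: switch/update the working copy to the requested branch
        svn switch "$REPO" "$DEPLOY_DIR"
    else
        # First run on a fresh instance: do a full checkout
        svn checkout "$REPO" "$DEPLOY_DIR"
    fi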