Build system when using auto scaling group with ELB in aws - django

I was using a free-tier AWS account with a single EC2 machine (Linux). I have a simple website with a backend server running on Django on port 8000 and a frontend server written in Angular running on HTTP (port 80). I used nginx for HTTPS and for routing calls to the backend and frontend servers.
For the backend build system, I did these 3 main steps (which I automated by running Jenkins on the same machine):
1) git pull (Pull the latest code from repo).
2) Do migrations (Updating my db with any new table).
3) Restarting the django server. (I was using gunicorn).
Now I have split my frontend and backend servers onto 2 different machines using Auto Scaling groups, and I am using ELB (AWS Elastic Load Balancer) to route the requests. The setup is done, but I am now having problems with continuous deployment. The main issue is that the ELB sits in front of Auto Scaling groups, which in turn launch instances from an AMI.
Since AMIs are created once, my first question is how to automate this process and deploy my latest code to already running AWS servers.
Second, if I want to run a few steps just once for all the servers, like my second step of updating the db with new tables, how do I achieve that?
Third, if these steps need to run on a machine, do I need another EC2 instance to automate the process of creating an AMI, updating the Auto Scaling groups with it, and then deploying the latest code to it?
So basically I want to know the best practices people follow for deploying the latest code to AWS machines that were created by Auto Scaling groups from an AMI. Also, I use Bitbucket for code management.

First Question: how to automate 'package based deployment'.
Instead of creating a new AMI for every release, create a baseline AMI that only changes when a new release requires OS changes, security patches, etc. Look into tools such as Packer to create AMIs automatically. To automate your code deployment when the code changes, you can use a package-based deployment approach: you create a package for every release (this should be part of your CI process) and store it in some repository such as Nexus, Artifactory, or even a simple S3 bucket.
When you deploy a new instance of your application, it should run some sort of script to pull and unpack/install that package on the instance. This is the basic concept; there are many tools that can help you achieve it, for example Chef or AWS CloudFormation.
So essentially, Step 1 should pull the code, create the package, and store it in some repository available to your application servers. This can be done offline.
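As a rough illustration of the packaging step, a CI job could build a versioned tarball and push it to S3. This is only a sketch: the function names, the `app-VERSION.tar.gz` naming, and the `releases/` prefix are assumptions, not something from the question, and `publish_package` needs a configured AWS CLI.

```shell
# Sketch of a CI packaging step. Names, paths, and the bucket layout
# are hypothetical.

# build_package SRC_DIR VERSION OUT_DIR
# Creates OUT_DIR/app-VERSION.tar.gz from SRC_DIR and prints its path.
# OUT_DIR should live outside SRC_DIR so the archive does not include itself.
build_package() {
    local src_dir="$1"
    local version="$2"
    local out_dir="$3"
    local package="$out_dir/app-$version.tar.gz"
    mkdir -p "$out_dir" || return 1
    # -C keeps archive paths relative to the source directory;
    # the .git directory is left out of the release package.
    tar -czf "$package" --exclude=.git -C "$src_dir" . || return 1
    printf '%s\n' "$package"
}

# publish_package PACKAGE BUCKET
# Uploads the package to s3://BUCKET/releases/ using the AWS CLI.
publish_package() {
    aws s3 cp "$1" "s3://$2/releases/$(basename "$1")"
}
```

In a Jenkins job this might be called as `pkg=$(build_package . "$BUILD_NUMBER" /tmp/build)` followed by `publish_package "$pkg" my-release-bucket`, where the bucket name is a placeholder.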
Second Question: How to run other tasks such as updating database schema.
As mentioned above, this can also be part of your 'deployment' automation. If you are using Chef or even a simple bash script, it can update the database schema before unpacking the new code. This really depends on your database, how you manage it, and who orchestrates the deployment.
For example, you could have a Jenkins job that pulls the new schema and updates your database whenever you roll out a release.
Your third question can be solved by Packer: it can spin up an instance, create an AMI from it, and terminate the instance.
Read more about CI/CD and CI/CD-related tools.
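One way to make sure a step such as the schema update runs only once per release is to run it from the CI job rather than from each instance, guarded by a marker. The sketch below assumes a single CI host and a hypothetical state directory; `run_once_per_release` is an illustrative helper, not part of Jenkins or Django.

```shell
# run_once_per_release RELEASE STATE_DIR COMMAND [ARGS...]
# Runs COMMAND only if RELEASE has no marker file in STATE_DIR yet.
# The marker is written only when COMMAND succeeds.
run_once_per_release() {
    local release="$1"
    local state_dir="$2"
    shift 2
    local marker="$state_dir/$release.migrated"
    mkdir -p "$state_dir" || return 1
    if [ -e "$marker" ]; then
        echo "release $release already migrated, skipping"
        return 0
    fi
    "$@" || return 1
    touch "$marker"
}
```

For the Django setup in the question, the Jenkins job could call something like `run_once_per_release "$BUILD_NUMBER" /var/lib/jenkins/deploy-state python manage.py migrate --noinput` before rolling out the new code (the state directory path is an assumption).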

Related

Is it possible to set up auto-scaling so that it always duplicates the most recent version of your main server?

I know that you can create an image of your server as-is and setup auto-scaling on that, but what if I then make changes to my original server? Do I have to then make another snapshot of that and setup auto-scaling again?
There are two approaches for configuring a server:
Creating an Amazon Machine Image (AMI) with all software fully configured, or
Having the instance configure itself via a startup script triggered via User Data
A fully-configured AMI is fast to start up, whereas a configuration script can take several minutes to configure the instance before it is ready to accept traffic.
In general, it is not considered good practice to "make changes to my original server" because there is no concept of an "original server". All instances are considered equal. Instead, the configuration should be created and tested on development servers separate to any production servers and then released by deploying new servers or by having an 'update' script intelligently run on existing servers. Some of these capabilities are provided by AWS CodeDeploy.

AWS EC2 instances with auto scaling staying in sync

I have a Node.js web application currently running on a single EC2 instance on AWS. I am thinking of using auto scaling with 2 or more EC2 instances since the load on the application is increasing.
I have been trying to understand something about AWS Auto Scaling for a couple of hours now, but I can't seem to find an answer anywhere.
Currently, I often SSH into my Ubuntu EC2 instance to modify some things or to run a deploy command (which grabs the latest code from GitHub). How does this work when you have, let's say, 4 instances running under Auto Scaling?
So if I SSH into a server and change the server.js file, what happens to the other 3 instances?
If that is not possible, what are my choices? I have seen many people saying that using S3 is the way to keep things in sync, but I don't fully get that. So I have to keep all my source code in S3 and do my edits from there?
You won't be able to modify files directly on the server once they are in an auto-scaling group. Changing something on one server won't be reflected on the other servers, and even if you manually updated all the currently running servers, any servers added by auto-scaling actions will not have those changes.
There are many methods to solve this, for example using AWS CodeDeploy.
You could also configure something via an EC2 User-Data script in your auto-scaling configuration which will run on each server when they are created. That script could checkout the latest code from Git, or pull the latest build artifact from S3, and then start the app. When you have an update ready to deploy, you would simply flag the current instances as "unhealthy" and wait for the Auto-Scaling group to automatically replace them with new, updated instances.
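The "flag instances as unhealthy" step could be scripted with the AWS CLI roughly as follows. This is a sketch under assumptions: `replace_instances` is a hypothetical helper, and the fixed pause between instances is a crude stand-in for actually waiting until each replacement passes its health checks.

```shell
# replace_instances ASG_NAME [PAUSE_SECONDS]
# Marks each instance in the Auto Scaling group unhealthy, one at a
# time, so the group replaces it with a freshly booted instance.
replace_instances() {
    local asg="$1"
    local pause="${2:-300}"
    local ids id
    ids=$(aws autoscaling describe-auto-scaling-groups \
        --auto-scaling-group-names "$asg" \
        --query 'AutoScalingGroups[0].Instances[].InstanceId' \
        --output text) || return 1
    for id in $ids; do
        echo "marking $id unhealthy"
        aws autoscaling set-instance-health \
            --instance-id "$id" --health-status Unhealthy || return 1
        # Give the group time to launch a replacement before moving on.
        sleep "$pause"
    done
}
```

Going one instance at a time keeps some capacity in service; marking every instance unhealthy at once would take the whole group down until the replacements boot.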
You could use AWS EFS to host your application code, so that all web servers read their content from EFS instead of from each individual server. This way you don't have to worry about modifying the content on individual servers.
One way you can do it is using GitHub. You can update your code and push it to GitHub, then terminate your existing instances and let the Auto Scaling group spin up new instances with the updated code. Here is a YouTube tutorial video that has detailed steps on how to do it: https://www.youtube.com/watch?v=lB3Ip0Yn-Zs

Deploying to EC2 instances behind a load balancer; PHPStorm + GitHub

I know this has been partially answered in a bunch of places, but the answers are all over the map, dated, and not well explained. I'm looking for the best practice as of February 2016.
The setup:
A PHP-based RESTful application service that lives in an EC2 instance. The EC2 instance uses S3 for uploaded user data (image files), and RDS MySQL for its DB (these two points aren't particularly important).
We develop in PHPStorm, and our source control is GitHub. When we deploy, we just use PHPStorm's built-in SFTP deployment to upload files directly to the EC2 instance (we have one instance for our Staging environment, and another for our Production environment). I deploy to Staging very often. Could be 20 times a day. I just click on a file in PHPStorm and say 'deploy to Staging', which does the SFTP transfer. Or, I might just click on the entire project and click 'deploy to Staging' - certain folders and files are excluded from the upload, which is part of PHPStorm's deployment configuration.
Recently, I put our EC2 instance behind a Load Balancer. I did this so that I can take advantage of Amazon's free SSL offering via the Certificate Manager, which does not support individual EC2 instances.
So, right now, there's a Load Balancer with only a single EC2 instance behind it. I maintain an Elastic IP pointing to the EC2 instance so that I can access it directly (see my current deployment method above).
Question:
I have not yet had the guts to create additional (clone) EC2 instances behind my Load Balancer, because I'm not sure how I should be deploying to them. A few ideas came to mind, but they're all pretty hacky.
Given the scenario above, what is currently the smoothest and best way to A) quickly deploy a codebase to a set of EC2 instances behind a Load Balancer, and B) actually 'clone' my current EC2 instance to create additional instances.
I haven't been able to really paint a clear picture of the above in my head yet, despite the fact that I've gone over a few (highly technical) suggestions.
Thanks!
You need to treat your EC2 instance as 100% dispensable. Meaning, that it can be terminated at any time and you should not care. A replacement EC2 instance would start and take over the work.
There are 3 ways to accomplish this:
Method 1: Each deployment creates a new AMI image.
When you deploy your app, you deploy it to a worker EC2 instance whose sole purpose is the "setup" of your app. Once the new version is deployed, you create a fresh AMI image from that EC2 instance and update your Auto Scaling launch configuration with the new AMI image. The old EC2 instances are terminated and replaced with instances running the new code.
New EC2 instances have the recent code already on them so they're ready to be added to the load balancer.
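Method 1 could be scripted with the AWS CLI along these lines. Treat it as a sketch: `bake_and_roll` and the `app-VERSION` naming are hypothetical, and it assumes the group uses a launch configuration (as described above) rather than a launch template.

```shell
# bake_and_roll INSTANCE_ID ASG_NAME INSTANCE_TYPE VERSION
# Creates an AMI from the setup instance, registers a new launch
# configuration for it, points the Auto Scaling group at that
# configuration, and prints the new AMI id.
bake_and_roll() {
    local instance_id="$1"
    local asg="$2"
    local instance_type="$3"
    local version="$4"
    local ami_id
    ami_id=$(aws ec2 create-image --instance-id "$instance_id" \
        --name "app-$version" --query ImageId --output text) || return 1
    # Block until the image can be used to launch instances.
    aws ec2 wait image-available --image-ids "$ami_id" || return 1
    aws autoscaling create-launch-configuration \
        --launch-configuration-name "app-$version" \
        --image-id "$ami_id" --instance-type "$instance_type" || return 1
    aws autoscaling update-auto-scaling-group \
        --auto-scaling-group-name "$asg" \
        --launch-configuration-name "app-$version" || return 1
    echo "$ami_id"
}
```

Updating the launch configuration only affects instances launched afterwards; instances that are already running keep the old AMI until they are terminated and replaced.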
Method 2: Each deployment is done to off-instance storage (like Amazon S3).
The EC2 instances will download the recent code from Amazon S3 and install it on boot.
So to put the new code in action, you terminate the old instances and new ones are launched to replace them which start using the new code.
This could be done in a rolling-update fashion, or as a blue/green deployment.
Method 3: Similar to method 2, but this time the instances have some smarts and can be signaled to download and install the code.
This way, you don't need to terminate instances: the existing instances are told to update from S3 and they do so on their own.
Some tools that may help include:
Chef
Ansible
CloudFormation
Update:
Methods 2 & 3 both start with a "basic" AMI which is configured to pull the webpage assets from S3. This AMI is not changed from version-to-version of your website.
For example, the AMI can have Apache and PHP already installed and on boot it pulls the .php website assets from S3 and puts them in /var/www/html.
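A boot-time pull from S3 might look roughly like the sketch below. The bucket layout (`releases/RELEASE/`) and the `sync_release` helper are assumptions for illustration; the instance would also need an IAM role allowing it to read the bucket.

```shell
# sync_release BUCKET RELEASE WEB_ROOT
# Mirrors s3://BUCKET/releases/RELEASE/ into WEB_ROOT using the AWS CLI.
sync_release() {
    local bucket="$1"
    local release="$2"
    local web_root="$3"
    mkdir -p "$web_root" || return 1
    # --delete removes local files that no longer exist in the release.
    aws s3 sync "s3://$bucket/releases/$release/" "$web_root" --delete
}
```

From User Data this could be invoked as `sync_release my-site-assets current /var/www/html`, where the bucket and release names are placeholders.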
CloudFormation works well for this. In addition, for method 3, you can use cfn-hup to wait for update signals. When signaled, it'll pull updated assets from S3.
Another possibility is using Elastic Beanstalk which could be used to manage all of this for you.
Update:
To have your AMI image pull from Git, try the following:
Set up an EC2 instance with everything installed that you need for your web app.
Install Git and set up a local repo ready for git pull.
Shut down the instance and create an AMI of it.
When you deploy, you do the following:
Git push to GitHub
Launch a new EC2 instance, based on your AMI image.
As part of the User Data (specified during the EC2 instance launch), specify something like the following:
#!/bin/sh
# User Data runs as root on first boot.
cd /git/app || exit 1
git pull
# copy files from repo to web folder
composer install
When done like this, that user data acts as a script which will run on first boot.

Mesos, Marathon, the cloud and 10 data centers - How to talk to each other?

I've been looking into Mesos, Marathon and Chronos combo to host a large number of websites. In my head I should be able to type a few commands into my laptop, and wait about 30 minutes for the thing to build and deploy.
My only issue, is that my resources are scattered across multiple data centers, numerous cloud accounts, and about 6 on premises places. I see no reason why I can't control them all from my laptop -- (I have serious power and control issues when it comes to my hardware!)
I'm thinking that my best approach is to build the brains in the cloud, (zoo keeper and at least one master), and then add on the separate data centers, but I am yet to see any examples of a distributed cluster, where not all the nodes can talk to each other.
Can anyone recommend a way of doing this?
I've got a setup like this that I'd like to recommend:
Source code, deployment scripts and Dockerfiles in Git
Each web service has its own directory and comes with a Dockerfile to containerize it
A build script (a shell script running docker builds) builds all the Docker images, which are all pushed to a Docker image repository
An Ansible deploy deploys all the containers remotely to a set of VPSes. (You would use your own deployment procedure that fits Mesos/Marathon.)
As part of the process, an ActiveMQ broker is deployed to the cloud (yep, in a container). While deploying, it supplies each node with the URL of the broker it needs to connect to. In your setup you could instead use ZooKeeper or etcd, for example.
I am also using Jenkins to do automatic rebuilds and to run deploys whenever there have been Git commits, but they can also be done manually.
Rebuilds are lightning fast, and deploys don't take much time either. I can replicate everything I have in my repository endlessly and have zero configuration.
To be able to do a new deploy, all I need is a set of VPSes with Docker daemons, and some datastores for persistence. I'm not sure if this is something that you can replace with Mesos, but Ansible will definitely be able to install a Mesos cloud for you onto your hardware.
All logging is being done with logstash, to a central logging server.
I have set up a 3 master, 5 slave, 1 gateway Mesos/Marathon/Docker cluster and documented it here:
https://github.com/debianmaster/Notes/wiki/Mesos-marathon-Docker-cluster-setup-on-RHEL-7-with-three-master
This may help you understand load balancing and scaling across different machines in your data center.
1) masters can also be used as slaves
2) mesos haproxy bridge script can be used for service discovery of the newly created services in the cluster
3) gateway haproxy is updated every min with new services that are created
This documentation has
1) master/slave setup
2) setting up haproxy that automatically reloads
3) setting up dockers
4) example service program
You should use Terraform to orchestrate your infrastructure as code.
Terraform has a lot of providers that allow you to manage different resources across multiple cloud services and/or bare-metal platforms such as vSphere.
You can start with the Getting Started Guide.

efficient way to administer or manage an auto-scaling instances in aws

As a sysadmin, I'm looking for an efficient way, or best practices, for managing EC2 instances with Auto Scaling.
How do you automate the following scenarios? (Our environment runs with Auto Scaling, Elastic Load Balancing, and CloudWatch.)
patching the server with the latest versions of the rpm packages for security reasons (e.g. yum update/upgrade)?
making a configuration change to the Apache server, such as a change to httpd.conf, and applying it to all instances in the Auto Scaling group?
how do you deploy the latest code for your app to the servers with less disruption in production?
how do you use Puppet or Chef to automate your admin tasks?
I would really appreciate anything you can share on how you automate your administration tasks on AWS.
Check out Amazon OpsWorks, the new Chef based DevOps tool for Amazon Web Services.
It gives you the ability to run custom Chef recipes on your instances in the different layers (Load Balancer, App servers, DB...), as well as to manage the deployment of your app from various source repositories (Git, Subversion..).
It supports auto-scaling based on load (like the auto-scaling that you are already using), as well as auto-scaling based on time, which is more complex to achieve with standard EC2 auto-scaling.
This is a relatively young service and not all functionality is available yet, but it might be useful for you.
patching the latest version of the rpm packages of the server for
security reasons? like (yum update/upgrade)
You can use Puppet or Chef to create a cron job that takes care of this for you (in its most basic form, the cron job would download and/or install updates via a bash script). You may want to upgrade automatically, or simply notify an admin via email so you can evaluate the updates before applying them.
making a configuration change of the Apache server like a change of
the httpd.conf and apply it to all instances in the auto-scaling
group?
I usually handle all of my configuration files through my Puppet manifest. You could set up each EC2 instance to pull updates from a Puppet server, then you can roll out changes on demand. Part of this process should be updating the AMI used by your Auto Scaling group (this can be done with the AWS command line tools).
how do you deploy the latest code to your app to the server with less
disruption in production?
Test it in staging first! Also, a neat trick is to use versioned deployments, so each time you do a deployment it gets its own folder (/var/www/v1, /var/www/v2, etc.) and, once you have verified the deployment was successful, you simply update a symlink to point to the latest version (/var/www/current points to /var/www/v2).
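The symlink switch can be sketched in a few lines of shell. `activate_release` is a hypothetical helper; it assumes every release is unpacked into its own directory under a common root and that the web server's document root points at the `current` link.

```shell
# activate_release RELEASES_DIR VERSION
# Points RELEASES_DIR/current at RELEASES_DIR/VERSION.
# Refuses to switch if the release directory does not exist.
activate_release() {
    local releases_dir="$1"
    local version="$2"
    local target="$releases_dir/$version"
    if [ ! -d "$target" ]; then
        echo "no such release: $target" >&2
        return 1
    fi
    # -n replaces the existing link itself instead of creating a new
    # link inside the directory it currently points to.
    ln -sfn "$target" "$releases_dir/current"
}
```

With this layout, `activate_release /var/www v2` switches to the new version, and rolling back is the same call with the previous version name.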
OpsWorks handles all this sort of stuff for you so you can look into that if you don't want to do it all yourself.
how do you use puppet or chef to automate your admin task?
You can use Chef or Puppet to do all sorts of things, and anything they can't (or you don't know how to) do can be done via a bash/python script that you invoke from Chef or Puppet.
I normally do things like install packages, build custom packages, set permissions, download things, start services, manage configuration files, set up cron jobs, etc.
I would really appreciate if you have anything to share on how you automate your administration task with aws
Look into CloudFormation. It can help you set up all your servers and related services (think EC2, ELB, CloudWatch) through configuration files, thus helping you automate your entire stack (not just the operating system on the EC2 instances).