Deploying to several environments on Amazon Elastic Beanstalk at the same time

I have an application that has several environments (all running on Amazon Elastic Beanstalk), namely Production, Worker and Debug. Each environment has a corresponding git branch that differs from master in some ways (for example, configuration is changed and some code is deleted).
I use eb deploy to deploy a new version of the application from its branch. It zips the current git branch and uploads the bundle to Amazon, which then deploys it to the running instances.
The problem, however, is that deploying takes some time (about 5 minutes). Thus, between deploying, say, worker and production, the two environments run different code for a while. This is bad, because my changes might have changed the queue protocol or something like that.
What I want is to upload the new version and let every environment process it without actually replacing the running code, just prepare it. Then, after doing this for all the environments, I would issue a command like "finish deploy" so that the code base is replaced on all the environments simultaneously.
Is there a way to do it?

You need to perform a "blue-green" deploy and not do this in-place. Because your deployment model requires synchronization of more than one piece, a change to the protocol those pieces use means those pieces MUST be deployed at the same time. Treat it as a single service if there's a frequently-breaking protocol that strongly binds the design.
"Deployed" means that the outermost layer of the system is exposed and usable by other systems. In this case, it sounds like you have a web server tier exposing an API to some other system, and a worker tier that reads messages produced by the web tier.
When making a breaking queue protocol change, you should deploy BOTH change-sets (web server layer and queue layer) to entirely NEW beanstalk environments, have them configured to use each other, then do a DNS swap on the exposed endpoint, from the old webserver EB environment to the new one. After swapping DNS on the webserver tier and verifying the environment works as expected, you can destroy the old webserver and queue tiers.
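One way to make that final cut-over close to atomic is Elastic Beanstalk's built-in CNAME swap rather than a DNS change you manage by hand. A minimal sketch with the AWS CLI, assuming hypothetical environment names (my-app-web-old and my-app-web-new):

    # swap the CNAMEs of the old and new webserver environments in one step
    aws elasticbeanstalk swap-environment-cnames \
        --source-environment-name my-app-web-old \
        --destination-environment-name my-app-web-new

If your exposed endpoint is a Route 53 record rather than the environment CNAME itself, you would repoint that record instead; either way, verify the new environments before destroying the old ones.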
On non-protocol-breaking updates, you can simply update one environment or the other.
It sounds complex because it is. If you are breaking the protocol frequently, then your system is not decoupled enough to version the worker and webserver tiers independently, which is why you have to go through this more involved process to version them together.
Hope this helps!

Related

Can we run an application that is configured for a multi-node AWS EC2 K8s cluster (built with kops) on a local Kubernetes cluster (built with kubeadm)?

Can we run an application that is configured to run on a multi-node AWS EC2 K8s cluster built using kops (project link) on a local Kubernetes cluster (set up using kubeadm)?
My thinking is that if the application runs in a k8s cluster backed by AWS EC2 instances, it should also run in a local k8s cluster. I am trying it locally for testing purposes.
Here's what I have tried so far, but it is not working.
First, I set up my local 2-node cluster using kubeadm.
Then I modified the project's installation script (link given above) by removing all references to EC2 (as I am using local machines) and to kops state (particularly in their create_cluster.py script).
I modified their application yaml files (app requirements) to match my local 2-node setup.
Unfortunately, although most of the application pods are created and in a running state, some other application pods fail to start, and therefore I am not able to run the whole application on my local cluster.
I appreciate your help.
That is the beauty of Docker and Kubernetes: they help keep your development environment matching production. For simple applications, written without custom resources, you can deploy the same workload to any cluster running on any cloud provider.
However, the ability to deploy the same workload to different clusters depends on several factors, such as:
How do you manage authorization and authentication in your cluster? For example, IAM, IRSA.
Are you using any cloud-native custom resources? For example, AWS ALBs used as LoadBalancer Services.
Are you using any cloud-native storage? For example, pods that rely on EFS/EBS volumes.
Is your application cloud agnostic, or does it use cloud-native technologies such as Neptune?
Can you mock cloud services locally? For example, using LocalStack to mock Kinesis and DynamoDB (see the sketch after this list).
How do you resolve DNS routes? For example, say you are using RDS on AWS: you access it via a Route 53 entry, whereas locally you might be running a MySQL instance and need a DNS mechanism to discover it.
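On the mocking point, one common local substitute is LocalStack. A rough sketch, assuming the default LocalStack image and its edge port 4566 (the table name is a placeholder):

    # run LocalStack and point the AWS CLI (or SDK) at it instead of real AWS
    docker run -d -p 4566:4566 localstack/localstack
    aws --endpoint-url=http://localhost:4566 dynamodb create-table \
        --table-name test-table \
        --attribute-definitions AttributeName=id,AttributeType=S \
        --key-schema AttributeName=id,KeyType=HASH \
        --billing-mode PAY_PER_REQUEST

Your application would then be configured (via an environment variable or config map) to use the local endpoint when running outside AWS.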
I did a Google search and looked at the kOps documentation. I could not find any information about deploying locally; it only supports public cloud providers.
IMO, you need to figure out a way to set up a local equivalent of your EKS cluster, and wherever cloud-native technologies are used, you need to work out an alternative way of doing the same thing locally.
The true answer, as Rajan Panneer Selvam said in his response, is that it depends, but I'd like to expand somewhat on that answer: your application should run on any K8S cluster, provided that the cluster supplies the services the application consumes. What you're doing is considered good practice to ensure your application is portable, and portability is always a factor in non-trivial applications, where even upgrading a downstream service can amount to a change of environment/platform that demands platform independence.
To help you achieve this, you should be developing a 12-Factor Application (12-FA) or one of its more up-to-date derivatives (12-FA is getting a little dated now and many variations have been suggested, but mostly they're all good).
For example, if your application uses a database then it should use database-independent SQL or NoSQL so that you can switch it out. In production you may run on Oracle, but in your local environment you may use MySQL: your application should not care. The credentials and connection string should be passed to the application via the usual K8S techniques of secrets and config-maps to help you achieve this. And all logging should be sent to stdout (and stderr) so that you can use a log-shipping agent to send the logs somewhere more useful than a local filesystem.
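As a minimal sketch of that pattern (the secret, config-map and key names here are hypothetical), the environment-specific connection details live in the cluster, not in the image:

    # database credentials go into a Secret, non-sensitive settings into a ConfigMap;
    # the Deployment then exposes both to the container as environment variables
    kubectl create secret generic db-credentials \
        --from-literal=DB_USER=myapp \
        --from-literal=DB_PASSWORD=change-me
    kubectl create configmap db-config \
        --from-literal=DB_URL=jdbc:mysql://mysql.default.svc.cluster.local:3306/myapp

The same manifests work on EKS and on a local kubeadm cluster; only the values differ per environment.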
If you run your app locally then you have to provide a surrogate for every 'platform' service that is provided in production, and this may mean switching out major components of what you consider to be your application, but that is OK; it is meant to happen. You provide a platform that supplies services to your application layer. Switching from EC2 to local may mean reconfiguring the ingress controller to work without the ELB, or configuring Kubernetes secrets to use local storage for dev credentials rather than AWS KMS. It may mean reconfiguring your persistent volume classes to use local storage rather than EBS. All of this is expected and right.
What you should not have to do is start editing microservices to work in the new environment. If you find yourself doing that, the application has made a factoring and layering error. Platform services should be provided to the set of microservices that use them; the microservices should not be aware of the implementation details of those services.
Of course, it is possible that you have some non-portable code in your system; for example, you may be using some Oracle-specific PL/SQL that can't be run elsewhere. This code should be extracted into config files and equivalents provided for each database you wish to run on. This isn't always possible, in which case you should abstract as much as possible into isolated services and reimplement only those services on each new platform, which could still be time-consuming but is ultimately worth the effort for most non-trivial systems.

Infrastructure and code deployment in same pipeline or different?

We are in the process of setting up a new release process in AWS. We are using Terraform with Elastic Beanstalk to spin up the hardware to deploy to (although the actual tools are irrelevant).
As Elastic Beanstalk does not support immutable deployments in Windows environments, we are debating whether to have a separate pipeline to deploy our infrastructure or to run Terraform on every code deployment.
The two things are likely to have different rates of churn, which feels like a good reason to separate them. That would also reduce risk, as there is less to deploy. But it means code could be deployed to snowflake servers, and that QA and live hardware could drift out of sync, so we would not be testing like for like.
Does anyone have experience of the two approaches and care to share which has worked better and why?
Well, we have both approaches in place. The initial AWS provisioning ends with a Terraform null resource that runs Ansible to do the initial code deployment.
Subsequent code deployments are done with standalone Jenkins + Ansible jobs.
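Not our actual pipeline, but as a rough sketch: the null resource's provisioner ends up invoking something like the command below, and the standalone Jenkins jobs can reuse the same playbook (the inventory, playbook and variable names are placeholders):

    # run the deployment playbook against the freshly provisioned hosts
    ansible-playbook -i inventories/aws_ec2.yml deploy.yml \
        --extra-vars "app_version=${BUILD_NUMBER}"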

What to bake into an AWS AMI and what to provision using cloud-init?

I'm using AWS CloudFormation to set up numerous elements of network infrastructure (VPCs, security groups, subnets, autoscaling groups, etc.) for my web application. I want the whole process to be automated: I want to click a button and be able to fire up the whole thing.
I have successfully created a CloudFormation template that sets up all this network infrastructure. However, the EC2 instances are currently launched without any of the needed software on them. Now I'm trying to figure out how best to get that software onto them.
To do this, I'm creating AMIs using Packer.io. But some people have instead urged me to use Cloud-Init. What heuristic should I use to decide what to bake into the AMIs and/or what to configure via Cloud-Init?
For example, I want to preconfigure an EC2 instance to allow me (saqib) to log in without a password from my own laptop. Thus the EC2 instance must have a user, that user must have a home directory, and in that home directory must live a .ssh/known_hosts file containing the relevant keys. Should I bake these directories into the AMI? Or should I use cloud-init to set them up? And how should I decide in this and other similar cases?
I like to separate out machine provisioning from environment provisioning.
In general, I use the following as a guide:
Build Phase
Build a Base Machine Image with something like Packer, including all software required to run your application. Create an AMI out of this.
Install the application(s) onto the Base Machine Image, creating an Application Image. Tag and version this artifact. Do not embed environment-specific things here, like database connections, as this precludes you from easily reusing the AMI across different environment runtimes.
Ensure all services are stopped
Release Phase
Spin up an environment consisting of the images and infra required, using something like CFN.
Use cloud-init user-data to configure the application environment (database connections, log forwarders, etc.) and then start the applications/services (see the user-data sketch after this list).
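A minimal user-data sketch for that release-phase step, assuming a hypothetical service name (myapp) and config path; the application itself is already baked into the AMI:

    #!/bin/bash
    # cloud-init runs this at first boot: write environment-specific settings,
    # then start the service that the Build Phase left stopped
    cat > /etc/myapp/environment.conf <<'EOF'
    DB_HOST=mydb.example.internal
    LOG_FORWARDER=logs.example.internal:514
    EOF
    systemctl start myapp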
This approach gives the greatest flexibility and cleanly separates out the various concerns of a continuous delivery pipeline.
One of the important factors in deciding how you assemble servers, AMIs, and infrastructure is the answer to the question: in production, how fast will I need a new instance launched?
The answer to this question will determine how much you bake into the AMI vs. how much you build after boot.
NOTE: My experience is with Chef Server so I will use Chef terminology but the concepts are the same for any other configuration management stack.
The general rule of thumb is to treat your infrastructure as code. This means thinking about the process of launching instances, creating users on those machines, and managing known_hosts files and SSH keys the same way you think about your application code. Being able to track infrastructure changes in source code makes management, redeployment, and even CI much easier.
This Chef Introduction covers the terminology in Chef of Cookbooks, Recipes, Resources, and more. It shows you how to build a simple LAMP stack, and how you can relaunch it just as easily with one command.
So given the example in your question, at a high level I would do the following:
Launch a base Ubuntu Linux AMI (currently 14.04) with a CloudFormation script.
In the UserData section of the instance configuration, bootstrap the Chef client install process (see the sketch after this list).
Run a recipe to create a user.
Run a recipe to create the known_hosts file for that user.
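A rough sketch of what that UserData bootstrap might look like, assuming hypothetical recipe names and omitting the Chef Server registration details (validation key, client.rb, and so on):

    #!/bin/bash
    # install the Chef client, then run the recipes that create the user
    # and lay down the known_hosts file
    curl -L https://omnitruck.chef.io/install.sh | sudo bash
    sudo chef-client -r 'recipe[users],recipe[ssh_known_hosts]'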
Tools like Chef are used because they let you break the infrastructure down into small blocks of code performing specific functions. There are numerous cookbooks already built and available that provide the basic building blocks: creating services, installing software packages, and so on.
All that being said, there are times when you have to deviate from best practice in the interest of your specific domain and requirements. There may be situations where, despite all the advantages of infrastructure-as-code management, you will still need to bake items into the AMI.
Let's pretend your application does image processing and has a requirement to use ImageMagick. Let's assume that you will need to build ImageMagick from source. If you were to do this via Chef Recipes this could add another 7 minutes of just compiling ImageMagick to the normal instance boot time. If waiting 10-12 minutes is too long for a new instance to come online then you may want to consider baking your own AMI that has ImageMagick already compiled and installed.
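For reference, the slow step you would be baking into the AMI (via a Packer provisioner, for instance) rather than running at every boot is roughly the following; the download URL and build options are illustrative:

    # build ImageMagick from source once, at image-build time
    curl -LO https://imagemagick.org/archive/ImageMagick.tar.gz
    tar xzf ImageMagick.tar.gz
    cd ImageMagick-*
    ./configure && make -j"$(nproc)" && sudo make install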
This is an acceptable solution, but keep in mind that managing your own fleet of pre-baked AMIs adds infrastructure overhead: you will need to keep your custom AMIs updated as new base AMIs are released, and as you expand to different instance types and different AWS Regions.

Mesos, Marathon, the cloud and 10 data centers - How to talk to each other?

I've been looking into the Mesos, Marathon and Chronos combo to host a large number of websites. In my head, I should be able to type a few commands into my laptop and wait about 30 minutes for the thing to build and deploy.
My only issue is that my resources are scattered across multiple data centers, numerous cloud accounts, and about 6 on-premises locations. I see no reason why I can't control them all from my laptop -- (I have serious power and control issues when it comes to my hardware!)
I'm thinking that my best approach is to build the brains in the cloud (ZooKeeper and at least one master) and then add on the separate data centers, but I have yet to see any examples of a distributed cluster where not all the nodes can talk to each other.
Can anyone recommend a way of doing this?
I've got a setup like this that I'd like to recommend:
Source code, deployment scripts and Dockerfiles live in Git.
Each webservice has its own directory and comes with a Dockerfile to containerize it.
A build script (a shell script running docker builds) builds all the Docker images, which are then pushed to a Docker image registry (see the sketch after this list).
An Ansible deploy deploys all the containers remotely to a set of VPSes. (You would use your own deployment procedure that fits Mesos/Marathon.)
As part of the process, an ActiveMQ broker is deployed to the cloud (yep, in a container). While deploying, it supplies each node with the URL of the broker it needs to connect to. In your setup you could instead use ZooKeeper or etcd, for example.
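A rough sketch of the build script mentioned above (the registry name and directory layout are placeholders):

    # build and push one image per webservice directory
    REGISTRY=registry.example.com/myproject
    for dir in services/*/; do
        name=$(basename "$dir")
        docker build -t "$REGISTRY/$name:latest" "$dir"
        docker push "$REGISTRY/$name:latest"
    done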
I am also using Jenkins to do automatic rebuilds and to run deploys whenever there have been Git commits, but they can also be done manually.
Rebuilds are lightning fast, and deploys don't take much time either. I can replicate everything I have in my repository endlessly with zero configuration.
To do a new deploy, all I need is a set of VPSes with Docker daemons, and some datastores for persistence. I'm not sure whether that part can be replaced with Mesos, but Ansible will definitely be able to install a Mesos cluster onto your hardware.
All logging is done with Logstash, to a central logging server.
I have set up a 3-master, 5-slave, 1-gateway Mesos/Marathon/Docker cluster and documented it here:
https://github.com/debianmaster/Notes/wiki/Mesos-marathon-Docker-cluster-setup-on-RHEL-7-with-three-master
This may help you understand load balancing and scaling across the different machines in your data center:
1) Masters can also be used as slaves.
2) The Mesos-HAProxy bridge script can be used for service discovery of newly created services in the cluster.
3) The gateway HAProxy is updated every minute with newly created services.
The documentation covers:
1) Master/slave setup
2) Setting up HAProxy so that it reloads automatically
3) Setting up Docker
4) An example service program
You should use Terraform to orchestrate your infrastructure as code.
Terraform has a lot of providers that allow you to manage different resources across multiple cloud services and/or bare-metal resources such as vSphere.
You can start with the Getting Started Guide.
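The workflow from your laptop is the same regardless of where the resources live; a minimal sketch, assuming one variable file per site (file names are placeholders):

    # plan and apply a single site's infrastructure from one working directory
    terraform init
    terraform plan -var-file=site-us-east.tfvars -out=site-us-east.plan
    terraform apply site-us-east.plan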

Same code for AWS and local application

I want to create a Java application that uses Amazon Web Services, and I also want to be able to run it as a local application. So it would exist in two versions: one in the Amazon cloud and one local. I don't know AWS yet, and I'm worried that some specific API or database access would prevent it from running as a local app. I simply do not want to write two separate versions of the app, or at least I want to write as little extra code as possible.
Is it possible?
In EC2, you can launch virtual servers (or instances) with root or administrator access. That means your EC2 instances are capable of running mostly everything you can run locally.
There are no specific APIs to learn in order to run Java code on EC2. Just compile and package your code, upload it to your server (using scp/rsync/anything else you might be more used to), and run it with java -jar myapp.jar after installing Java on the instance. You can also upload the source code directly to your instance and compile it there if you want. It really behaves like a "normal" server.
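A minimal sketch of that flow, assuming a hypothetical key pair, an Amazon Linux instance, and a placeholder public DNS name:

    # copy the packaged jar up and run it on the instance
    scp -i my-key.pem target/myapp.jar ec2-user@ec2-203-0-113-10.compute-1.amazonaws.com:~/
    ssh -i my-key.pem ec2-user@ec2-203-0-113-10.compute-1.amazonaws.com \
        "sudo yum install -y java-11-amazon-corretto && java -jar ~/myapp.jar"
    # the Java package name depends on the AMI; this one is the Amazon Linux 2 Corretto package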
About database access: again, it works exactly as you would expect. Just install your database server on the instance, say MySQL, and connect to it normally (using JDBC, for example). Also, note that there's a service called Relational Database Service (RDS), which simplifies the deployment and management of a database system: you don't have to install the database software, maintain it, upgrade it, or back it up; everything is done for you. You simply specify the name and password of the "master" user, and it gives you back a connection string. (There's also a "micro" RDS instance included in the free tier, so you can start exploring for free!)
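If you go the RDS route, a minimal sketch of creating a managed MySQL instance and fetching its endpoint for your JDBC URL (the identifier, credentials, and sizes are placeholders):

    # create a small managed MySQL instance
    aws rds create-db-instance \
        --db-instance-identifier myapp-db \
        --db-instance-class db.t3.micro \
        --engine mysql \
        --master-username admin \
        --master-user-password 'change-me' \
        --allocated-storage 20
    # once it is available, get the hostname to put in your JDBC URL
    aws rds describe-db-instances --db-instance-identifier myapp-db \
        --query 'DBInstances[0].Endpoint.Address' --output text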
Finally, if you don't want to launch and maintain a virtual server yourself, you could use Elastic Beanstalk, which automates a lot of this for you: using the web interface, you simply upload your ".war" file, and Elastic Beanstalk launches an instance, installs Java and Tomcat, deploys your application, and monitors it for you -- you get emails in your inbox if anything goes wrong. There are tons of other features included in Elastic Beanstalk, and it is all completely free (you just pay for the servers it launches -- and if you instruct it to launch at most a single t1.micro instance, which is included in the free tier, again, you pay nothing!)