I think I'm on the right path. I can use .ebextensions to change some of the conf files for the instance I'm running. Since I'm using Elastic Beanstalk, and that a lot of the software is shrinkwrapped (which I'm fine with), I should be using .ebextensions as a means of modifying the environment.
I want to employ some form of mod_rewrite config, but I know nothing of this Amazon Linux. I don't even know what the web server is. I've been through the console for the past few hours and see no trace of the things I want to override.
Apparently I can setup a shell to take a look around, but modifying things that way will cause things to be overridden since Beanstalk is handling config. I'm not entirely sure on that last point.
Should I just ssh and play in userland like a typical unix host?
You can definitely ssh to the instance, and see around. But remember, that your changes are not persistent. You should look at .ebextensions config files as the way to re-run your commands on the host, plus more.
It might take some time to see where ElasticBeanstalk stores configuration files and all other interesting things.
To get you started, your app files are located at: /opt/python/current/app and if you are using Python, it is located in virtual environment at: /opt/python/run/venv/bin/python27
Customizing the Software on EC2 Instances Running Linux guide contains detailed information on what you can do:
Packages - install packages
Sources - retrieve archives
Files - operations with files
Users - anything with users
Groups - anything with groups
Commands - execute instance commands
Container_commands - execute commands after the container is
extracted
Services - launch services
Option_settings - configure
container settings
See if that satisfies your requirements, if not, come back to StackOverflow and ask more questions.
Related
I'm new to the configuration management and deployment tools. I have to implement a Continuous Delivery/Continuous Deployment tool for one of the most interesting projects I've ever put my hands on.
First of all, individually, I'm comfortable with AWS, I know what Ansible is, the logic behind it and its purpose. I do not have same level of understanding of Docker but I got the idea. I went through a lot of Internet resources, but I can't get the the big picture.
What I've been struggling is how they fit together. Using Ansible, I can manage my Infrastructure as Code; building EC2 instances, installing packages... I can even deploy a full application by pulling its code, modify config files and start web server. Docker is, itself, a tool that packages an application and ensures that it can be run wherever you deploy it.
My problems are:
How does Docker (or Ansible and Docker) extend the Continuous Integration process!?
Suppose we have a source code repository, the team members finish working on a feature and they push their work. Jenkins detects this, runs all the acceptance/unit/integration test suites and if they all passed, it declares it as a stable build. How Docker fits here? I mean when the team pushes their work, does Jenkins have to pull the Docker file source coded within the app, build the image of the application, start the container and run all the tests against it or it runs the tests the classic way and if all is good then it builds the Docker image from the Docker file and saves it in a private place?
Should Jenkins tag the final image using x.y.z for example!?
Docker containers configuration :
Suppose we have an image built by Jenkins stored somewhere, how to handle deploying the same image into different environments, and even, different configurations parameters ( Vhosts config, DB hosts, Queues URLs, S3 endpoints, etc...) What is the most flexible way to deal with this issue without breaking Docker principles? Are these configurations backed in the image when it gets build or when the container based on it is started, if so how are they injected?
Ansible and Docker:
Ansible provides a Docker module to manage Docker containers. Assuming I solved the problems mentioned above, when I want to deploy a new version x.t.z of my app, I tell Ansible to pull that image from where it was stored on, start the app container, so how to inject the configuration settings!? Does Ansible have to log in the Docker image, before it's running ( this sounds insane to me ) and use its Jinja2 templates the same way with a classic host!? If not, how is this handled?!
Excuse me if it was a long question or if I misspelled something, but this is my thinking out loud. I'm blocked for the past two weeks and I can't figure out the correct workflow. I want this to be a reference for future readers.
Please, it would very helpful to read your experiences and solutions because this looks like a common workflow.
I would like to answer in parts
How does Docker (or Ansible and Docker) extend the Continuous Integration process!?
Since docker images same everywhere, you use your docker images as if they are production images. Therefore, when somebody committed a code, you build your docker image. You run tests against it. When all tests pass, you tag that image accordingly. Since docker is fast, this is a feasible workflow.
Also docker changes are incremental; therefore, your images will have minimal impact on storage. Also when your tests fail, you may also choose to save that image too. In this way, developer will pull that image and investigate easily why your tests failed. Developer may choose to run tests in their machine too since docker images in jenkins and their machine are not different.
What this brings that all developers will have same environment, same version of all software since you decide which one will be used in docker images. I have come across to bugs that are due to differences between developer machines. For example in the same operating system, unicode settings may affect your code. But in docker images all developers will test against same settings, same version software.
Docker containers configuration :
If you are using a private repository, and you should use one, then configuration changes will not affect hard disk space much. Therefore except security configurations, such as db passwords, you can apply configuration changes to docker images(Baking the Configuration into the Container). Then you can use ansible to apply not-stored configurations to deployed images before/after startup using environment variables or Docker Volumes.
https://dantehranian.wordpress.com/2015/03/25/how-should-i-get-application-configuration-into-my-docker-containers/
Does Ansible have to log in the Docker image, before it's running (
this sounds insane to me ) and use its Jinja2 templates the same way
with a classic host!? If not, how is this handled?!
No, ansible will not log in the Docker image, but ansible with Jinja2 templates can be used to change dockerfile. You can change dockerfile with templates and can inject your configuration to different files. Tag your files accordingly and you have configured images to spin up.
Regarding your question about handling multiple environment configurations using the same Docker image, I have been planning on using a Service Discovery tool like Consul as a centralized config/property management tool. So, when you start your container up, you set an ENV var that tells it what application it is (appID), and what environment config it should use (ex: MyApplication:Dev) and it will pull its config from Consul at startup. I still have to investigate the security around Consul (as if we are storing DB connection credentials in there for example, how do we restrict who can query/update those values). I don't want to just use this for containers, but all apps in general. Another cool capability is to change the config value in Consul and have a hook back into your app to apply the changes immediately (maybe like a REST endpoint on your app to push changes down to and dynamically apply it). Of course your app has to be written to support this!
You might be interested in checking out Martin Fowler's blog articles on immutable infrastructure and on Phoenix servers.
Although not a complete solution, I have suggestions for two of your issues. Although they might not be perfect, these are the practices we are using in our workflow, and prove themselves so far.
Defining different environments - supposing you've written a different Ansible role for each environment you launch, we define an environment variable setting the environment we wish the container to belong to. We then download the suitable configuration file from an S3 bucket using the env variable set before into the container (which should be possible if you supply AWS creds or give your server an IAM role) and inject these parameters into the code when building it.
Ansible doesn't need to log into the docker app, but the solution is a bit tricky. I've tried two ways of tackling this problem, and both aren't ideal. The first one is to download the configuration file as part of the docker image command line, and build the app on container startup. While this solution works - it breaches the Docker philosophy and makes the image highly prone to build errors.
Another solution is pushing several images to your docker hub repo, and then pulling the appropriate image according to the environment at hand.
In a broader stroke, I've tried launching our app completely with Ansible and it was hell, many configuration steps are tricky and get trickier when you try to implement them as a playbook. When I switched to maintaining the severs alone with Ansible, and deploying the app itself with Docker things got a lot easier.
I've got an env.config in source control but pretty much the only things I can put in it are things that relate to all my various environments (production, staging). I've got environment specific settings that I want to add to the env.config file (for instance, the DB host) that will change from environment to environment. How can I handle these differences? Right now I'm doing it from the AWS console where I can manage it in the GUI on a per-environment basis, but I'd love to be able to change a lot of this stuff from git so I don't have to be logging into the console whenever I want to change something.
Is there any way to have multiple, environment specific config files?
So this has been posted before in the AWS forums. (https://forums.aws.amazon.com/thread.jspa?messageID=529373) So far there's only workarounds! The problem is that the .config files would require some logic to figure out what environment you're attempting to target. Personally I don't think any logic is required, as you could simply namespace the config settings based on the AWS environment name you're targeting.
I think your usecase is similar to what is discussed in How to configure Elastic Beanstalk for RDS
You may want to use 'eb branch'. You can then have multiple environments with different configurations.
More documentation on eb branch here
I am currently using AWS ElasticBeanStalk and I was curious as to how (as in internally) it knows that when you fire up an instance (or it automatically does with scaling), to unpack the zip I uploaded as a version? Is there some enviroment setting that looks up my zip in my S3 bucket and then unpacks automatically for every instance running in that environment?
If so, could this be used to automate a task such as run an SQL query on boot-up (instance deployment) too? Are these automated tasks changeable or viewable at all?
Thanks
I don't know how beanstalk knows which version to download and unpack, but running a task on start-up is trivial. Check out cloud-init, a tool written by Ubuntu that's now packaged in Amazon Linux. It allows you to pass arbitrary shell scripts into the UserData section of the instance configuration, and those shell scripts will run on startup.
It's a great way to bootstrap instances on startup, which avoids the soul-sucking misery of managing AMIs.
A quick (possibly non-applicable) warning: If you're running a SQL query on a database that lives on the beanstalk AMI, you're pretty much guaranteed to lose your database at some point. Those machines are designed to be entirely transient. Do not put databases on them. See this answer for more details.
Since your goal seems to be to run custom configuration tasks, the answer is yes, there is a way to do that. You can define custom actions in an .ebextensions file packaged with your app. For example, you can configure a command to run every time a new machine is deployed:
http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/customize-containers-ec2.html#linux-commands
I'm interested in figuring out the best practice way of organising Django apps on a server.
Where do you place Django code? The (old now) Almanac says /home/django/domains/somesitename.com/ but I've also seen things placed in /opt/apps/somesitename/ . I'm thinking that the /opt/ idea sounds better as it's not global, but I've not seen opt before, and presumably it might be better for apps to go in a site specific deployer users home dir.
Would you recommend having one global deployer user, one user per site, or one per site-env (eg, sitenamelive, sitenamestaging). I'm thinking one per site.
How do you version your config files? I currently put them in an /etc/ folder at top level of source control. eg, /etc/nginc/somesite-live.conf.
How do you do provision your servers and do the deployment? I've resisted Chef and Puppet for years on the hope of something Python based. Silver Lining doesn't seem ready yet, and I have big hopes for Patchwork (https://github.com/fabric/patchwork/). Currently we're just using some custom Fabric scripts to deploy, but the "server provisioning" is handled by a bash script and some manual steps for adding keys and creating users. I'm about to investigate Silk Deployment (https://bitbucket.org/btubbs/silk-deployment) as it seems closest to our setup.
Thanks!
I think there would have to be more information on what kinds of sites you are deploying: there would be differences based on the relations between the sites, both programatically and 'legally' (as in a business relation):
Having an system account per 'site' can be handy if the sites are 'owned' by different people - if you are a web designer or programmer with a few clients, then you might benefit from separation.
If your sites are related, i.e. a forum site, a blog site etc, you might benefit from a single deployment system (like ours).
for libraries, if they're hosted on reputable sources (pypy, github etc), its probably ok to leave them there and deploy from them - if they're on dodgy hosts which are up or down, we take a copy and put them in a /thirdparty folder in our git repo.
FABRIC
Fabric is amazing - if its setup and configured right for you:
We have a policy here which means nobody ever needs to log onto a server (which is mostly true - there are occasions where we want to look at the raw nginx log file, but its a rarity).
We've got fabric configured so that there are individual functional blocks (restart_nginx, restart_uwsgi etc), but also
higher level 'business' functions which run all the little blocks in the right order - for us to update all our servers we meerly type 'fab -i secretkey live deploy' - the live sets the settings for the live servers, and deploy ldeploys (the -i is optional if you have your .ssh keys set up right)
We even have a control flag that if the live setting is used, it will ask 'are you sure' before performing the deploy.
Our code layout
So our code base layout looks a bit like this:
/ <-- folder containing readme file etc
/bin/ <-- folder containing nginx & uwsgi binaries (!)
/config/ <-- folder containing nginx config and pip list but also things like pep8 and pylint configs
/fabric/ <-- folder containing fabric deployment
/logs/ <-- holding folder that nginx logs get written into (but not committed)
/src/ <-- actual source is in here!
/thirdparty/ <-- third party libs that we didn't trust the hosting of for pip
Possibly controversial because we load our binaries into our repo, but it means that if i upgrade nginx on the boxes, and want to roll back, i just do it by manipulation of git. I know what works against what build.
How our deploy works:
All our source code is hosted on a private bitbucket repo (we have a lot of repos and a few users, thats why bitbucket is better for us then github). We have a user account for the 'servers' with its own ssh key for bitbucket.
Deploy in fabric performs the following on each server:
irc bot announce beginning into the irc channel
git pull
pip deploy (from a pip list in our repo)
syncdb
south migrate
uwsgi restart
celery restart
irc bot announce completion into the irc channel
start availability testing
announce results of availability testing (and post report into private pastebin)
The 'availability test' (think unit test, but against live server) - hits all the webpages and API's on the 'test' account to make sure it gets back sane data without affecting live stats.
We also have a backup git service so if bitbucket is down, it falls over to that gracefully, and we even have jenkins integration that on a commit to the 'deploy' branch, it causes the deployment to go through
The scary bit
Because we use cloud computing and expect a high throughput, our boxes auto spawn. Theres a default image which contains a a copy of the git repo etc, but invariably it will be out of date, so theres a startup script which does a deployment to itself, meaning new boxes added to the cluster are automatically up-to-date.
Would it be possible/safe to run two instances of VisualSVNServer pointing to the same repo?
I've searched around and not had any luck finding anything related specifically to this question. The only reason I ask is because we have a need to enable Windows Authentication/Integration over http, and svn authentication over https. It does not seem to be an option to run both within a single instance of VisualSVNServer.
If not, do you know of alternative solution that would allow for this?
Edit: Received the following answer from VisualSVN Support
Thanks to Subversion design, repositories are ready to be accessed by several server instances simultaneously. We haven't experimented a lot with such configuration, but I think it's possible.
Am I understand properly, that you are going to store your repositories on a network storage and run two VisualSVN Server instances on different machines?
Please take care about the server.pid. file. In the current release, this file is stored in the repositories folder. So there will be a collision between two instances of VisualSVN Server. We are going to fix this problem in the upcoming release.
You can easily relocate the server.pid to another destination by adding the following command to the "C:\Program Files\VisualSVN Server\conf\httpd-custom.conf" file:
[[
PidFile "C:/Tmp/server.pid"
]]"
You can point two VisualSVN Server instances to the same repository if it stored on SMB share without any problems. It's typical configuration for active/active or active/passive cluster setups.
I wouldn't do this because as far as I know, VisualSVN brings its own web server (Apache) and SVN binaries. I would expect locking issues when running two of each on the same repo, if it's possible at all. VisualSVN probably won't install twice at all.
This sounds like a case for separate installation of SVN and Apache and custom configuration. I can't say whether what you want is possible but I would expect it is. It's probably to be fiddly, though - VisualSVN takes away a lot of configuration hassle that you have when doing the setup manually. Questions about that would be appropriate to ask on Serverfault.com.
Apart from VisualSVN, there also are other, also commercial wrappers. Maybe one of them is more flexible in this respect.
Update: Also, check this out: Supporting Multiple Repository Access Methods from the SVN book