I am moving our web application to docker-compose deployment (Django, DRF, AngularJS).
Docker looks solid now and things are going well.
I want to:
confirm with you that I am following best practices regarding application configuration files
know if "volume files" are actually bind mounts, which are not recommended
I've managed to read environment variables and docker-compose secrets from the Django settings.py file, and it works. The downside is that environment variables are limited to simple strings and can pose escaping challenges when passing Python lists, dictionaries, etc. We also have to define and maintain a lot of environment variables, since our web app is installed in many places and is highly configurable.
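For context, a minimal sketch of what that setup looks like today (service, variable, and secret names are illustrative):

services:
  web:
    build: django
    environment:
      DJANGO_DEBUG: "0"
      # a Python list flattened to a comma-separated string;
      # settings.py has to split it back apart
      DJANGO_ALLOWED_HOSTS: "example.com,www.example.com"
    secrets:
      - db_password   # settings.py reads /run/secrets/db_password
secrets:
  db_password:
    file: ./secrets/db_password.txt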
On the frontend side (AngularJS) we have two constants.js files and the nginx conf.
I've used a CMD ["/start.sh"] in the Dockerfile, with some sed commands in the script.
But this looks really hackish, and it also means we have to define and maintain quite a few environment variables.
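To illustrate the hack, start.sh is along these lines (paths and placeholder names are made up):

#!/bin/sh
# rewrite placeholders baked into the image, then start nginx
sed -i "s|__API_URL__|${API_URL}|g" /usr/share/nginx/html/site/constants.js
sed -i "s|__API_URL__|${API_URL}|g" /usr/share/nginx/html/admin/constants.js
exec nginx -g 'daemon off;'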
Are Docker volumes a good idea to use for these configuration files?
Does such a thing as a "volume file" actually exist (mentioned here), or is it actually a bind mount? Bind mounts are said to be less recommended since they depend on the file system and file paths on the host.
Volumes documentation briefly mentions files: "path where the file or directory are mounted in the container", but does not go into greater detail.
Our web app has simple configuration files now:
settings.py
site\constants.js
admin\constants.js
and I want to avoid moving those files to dedicated directories just so they can be mounted.
Can you show me a sample docker-compose.yml with single-file volumes (not bind mounts)?
Thank you
If you can't use environment variables then you should use a bind mount. If you use a named volume you can't access single files and you can't directly edit the config files.
A named volume is always an entire directory, and can't be directly accessed from the host. There is no such thing as a "volume file" (your linked question is entirely about bind mounts, some using named-volume syntax) and there is no way to mount a single file out of a named volume.
Newer Docker has a couple of different syntaxes for bind mounts (in Compose, the short and long volumes: service configuration, or creating a type: bind named volume). These are all basically equivalent, and many of the answers in the question you link to involve making a named volume simulate a bind mount.
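For illustration, here is the long volumes: syntax and a bind-backed named volume side by side (paths are assumptions; note that the driver_opts form requires an absolute host path):

services:
  app:
    volumes:
      # long syntax, equivalent to "./config/settings.py:/app/settings.py"
      - type: bind
        source: ./config/settings.py
        target: /app/settings.py

volumes:
  # a "named volume" that is really a bind mount underneath
  confdir:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /srv/app/config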
Docker Compose supports relative paths, so there is much less of a concern around host paths for bind mounts being non-portable across systems. A basic fragment of a docker-compose.yml file could include:
services:
  app:
    build: django
    volumes:
      - ./config/django-settings.py:/app/settings.py
In this example I'd suggest a (deploy-time) config directory that contains the configuration files, but that's an arbitrary choice; if you want to bind-mount ./django/settings.py from the application source tree over what's in the image to be able to directly edit it, that's a valid choice too. You can check this tree into source control, and it will still work regardless of where it's checked out.
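Applied to the files you listed, that could look like the following (the container-side paths are assumptions about your image layout):

services:
  app:
    build: django
    volumes:
      - ./config/settings.py:/app/settings.py
  frontend:
    build: angular
    volumes:
      - ./config/site-constants.js:/usr/share/nginx/html/site/constants.js
      - ./config/admin-constants.js:/usr/share/nginx/html/admin/constants.js
      - ./config/nginx.conf:/etc/nginx/conf.d/default.conf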
If you're using a base image with the full GNU tool set (Ubuntu, not Alpine), then your container entrypoint script can also use envsubst as a very lightweight templating tool (it replaces $VARIABLE references with the value of the corresponding environment variable). That will help you with the "many options" case, but not the "dict-type options" case.
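A minimal entrypoint sketch using envsubst (template path and variable names are made up; listing the variables explicitly keeps envsubst from rewriting unrelated $... references):

#!/bin/sh
envsubst '$API_URL $SITE_TITLE' \
  < /templates/constants.js.tpl > /app/site/constants.js
exec "$@"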
In general I'd recommend bind mounts for two cases and maybe a third: config files (where the operator needs to directly edit them), log files (where the operator needs to directly read them), and maybe persistent data storage (where your existing backup solution will work unmodified; though not on macOS, where it's very slow). Named volumes can be a good match for the persistent-data case and better match what you would use in a clustered environment (Swarm, Kubernetes), but they can't be directly accessed.
Related
With Kubernetes, I used to mount a file containing feature-flags as key/value pairs. Our UI would then simply get the file and read the values.
Like this: What's the best way to share/mount one file into a pod?
Now I want to do the same with the manifest file for CloudFoundry. How can I mount a file so that it will be available in the /dist folder at deployment time?
To add more information: when we mount a file, the UI can later download the file and read its content. We are using React, and any call to the server has to go through an Apigee layer.
The typical approach to mounting files into a CloudFoundry application is called Volume Services. This takes a remote file system like NFS or SMB and mounts it into your application container.
I don't think that's what you want here, though; it would probably be overkill for mounting a single file. You totally could go this route.
That said, CloudFoundry does not have a built-in concept similar to the Kubernetes one, where you can take your configuration and mount it as a file. With CloudFoundry you do have a few similar options. They are not exactly the same, though, so you'll have to determine whether one of them works for your needs.
You can pass config through environment variables (or through user-provided service bindings, but those come through an environment variable, VCAP_SERVICES, as well). This won't be a file, but perhaps you can have your UI read that instead. (You didn't mention how the UI gets the file, so I can't comment further; if you elaborate on that point, e.g. whether it's over HTTP or read from disk, I could expand on this option.)
If it absolutely needs to be a file, your application could read the environment variable's contents and write them to disk when it starts. If your application isn't able to do that, for example if you're using Nginx, you could include a .profile script at the root of your application that reads the variable and generates the file: echo "$CFG_VAR" > /dist/file, or whatever you need to do to generate that file.
A couple more notes on environment variables: there are limits to how much information can go into them (sorry, I don't know the exact value off the top of my head, but I think it's around 128K), and they are not great for binary configuration, in which case you'd need to base64-encode your data first.
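A minimal .profile sketch covering both notes (variable names are placeholders; on CloudFoundry, .profile at the app root runs before the application starts):

#!/bin/sh
mkdir -p dist
echo "$FEATURE_FLAGS_JSON" > dist/feature-flags.json
# binary config has to travel base64-encoded:
echo "$CFG_B64" | base64 -d > dist/config.bin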
You can pull the config file from a config server and cache it locally. This can be pretty simple: the first thing your app does when it starts is reach out, download the file, and place it on disk, where it persists for the duration of your application's lifetime.
If you don't have a server-side application, for example if you're running Nginx, you can include a .profile script (it can be any executable script) at the root of your application which uses curl or another tool to download and set up that configuration.
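For example (the URL variable is a placeholder):

#!/bin/sh
# fetch the config once at startup; it stays on disk for the
# lifetime of this application instance
curl -fsSL "$CONFIG_SERVER_URL/feature-flags.json" -o dist/feature-flags.json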
You can replace "config server" with an HTTP server, Git repository, Vault server, CredHub, database, or really any place you can durably store your data.
Not recommended, but you can also push your configuration file with the application. This would be as simple as including it in the directory or archive that you push. This has the obvious downside of coupling your configuration to the application bits that you push. Depending on where you work, the policies you have to follow, and the tools you use this may or may not matter.
There might be other variations you could use as well. Loading the file in your application when it starts or through a .profile script is very flexible.
I have a WordPress site that is going to be hosted using ECS on AWS.
To make management even more flexible, I plan not to store service configurations (i.e. php.ini, nginx.conf) inside the Docker image itself. I found that Docker Swarm offers "docker configs" for this. Are there any equivalent tools that do the same thing? (I know AWS Secrets Manager can handle Docker secrets, though.)
Any advice or alternative approaches? thank you all.
The closest thing you could use is probably AWS SSM Parameter Store.
You will need some logic to retrieve the values when you run the image.
If you don't want the files inside the running containers either, you pull the values from Parameter Store and add them to the environment. You will then probably need some work in the application to read from the environment (the application stays decoupled from the actual source of the config). Alternatively, you can read directly from Parameter Store in the application (easier, but your image is then coupled to Parameter Store).
If your concern is only about not having the values in the image, but it's fine for them to exist inside the running container, then you can read from Parameter Store and write the values to the usual location of the files inside the container, so it's transparent to the application.
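A sketch of that second option as an entrypoint step (the parameter name and target path are illustrative):

#!/bin/sh
# pull php.ini from Parameter Store, drop it in place, then start
# the container's normal process
aws ssm get-parameter \
  --name /wordpress/prod/php-ini \
  --with-decryption \
  --query Parameter.Value \
  --output text > /usr/local/etc/php/php.ini
exec "$@"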
As additional approaches:
Especially for php.ini and nginx.conf, I like a simple approach: a separate Git repo with different config files per environment.
You have a common Docker image regardless of the environment.
At build time, you pull the proper file for the environment and either save it as environment variables or inject it into the container (see the sketch below).
And lastly, I should mention classic tools like Chef or Puppet, and also Ansible. They are more complex and maybe overkill here.
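The build-time pull could be as simple as this (repo and variable names are made up):

#!/bin/sh
# CI build step: fetch the config repo and bake the right file
# for this environment into the image
git clone https://example.com/ops/app-config.git
cp "app-config/${DEPLOY_ENV}/php.ini" docker/php.ini
docker build -t myapp:latest .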
The two ways that I store configs and secrets for most services are:
Credstash, which is a combination of KMS and DynamoDB, and
Parameter Store, which has already been mentioned.
The aws command-line tool can be used to fetch from Parameter Store and S3 (for configs), while credstash is its own utility (quite useful and easy to use) and needs to be installed separately.
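Basic credstash usage looks like this (the key name is illustrative):

credstash put myapp.prod.db_password 's3cret'
credstash get myapp.prod.db_password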
I'm new to the configuration management and deployment tools. I have to implement a Continuous Delivery/Continuous Deployment tool for one of the most interesting projects I've ever put my hands on.
First of all, individually, I'm comfortable with AWS, and I know what Ansible is, the logic behind it, and its purpose. I do not have the same level of understanding of Docker, but I get the idea. I went through a lot of Internet resources, but I can't get the big picture.
What I've been struggling with is how they fit together. Using Ansible, I can manage my infrastructure as code: building EC2 instances, installing packages... I can even deploy a full application by pulling its code, modifying config files, and starting the web server. Docker is, itself, a tool that packages an application and ensures that it can be run wherever you deploy it.
My problems are:
How does Docker (or Ansible and Docker) extend the Continuous Integration process!?
Suppose we have a source code repository; the team members finish working on a feature and push their work. Jenkins detects this, runs all the acceptance/unit/integration test suites and, if they all pass, declares it a stable build. How does Docker fit in here? Does Jenkins have to pull the Dockerfile kept in the app's source code, build the image of the application, start the container, and run all the tests against it? Or does it run the tests the classic way and, if all is good, build the Docker image from the Dockerfile and save it in a private registry?
Should Jenkins tag the final image using x.y.z for example!?
Docker containers configuration:
Suppose we have an image built by Jenkins and stored somewhere. How do we handle deploying the same image to different environments, and even with different configuration parameters (vhost config, DB hosts, queue URLs, S3 endpoints, etc.)? What is the most flexible way to deal with this issue without breaking Docker principles? Are these configurations baked into the image when it gets built, or injected when the container based on it is started, and if so, how?
Ansible and Docker:
Ansible provides a Docker module to manage Docker containers. Assuming I solved the problems mentioned above, when I want to deploy a new version x.y.z of my app, I tell Ansible to pull that image from wherever it is stored and start the app container. So how do I inject the configuration settings? Does Ansible have to log into the Docker image before it's running (this sounds insane to me) and use its Jinja2 templates the same way as with a classic host? If not, how is this handled?
Excuse me if this is a long question or if I misspelled something, but this is me thinking out loud. I've been blocked for the past two weeks and I can't figure out the correct workflow. I want this to be a reference for future readers.
Please share your experiences and solutions; it would be very helpful, because this looks like a common workflow.
I would like to answer in parts
How does Docker (or Ansible and Docker) extend the Continuous Integration process!?
Since Docker images are the same everywhere, you use your Docker images as if they were production images. Therefore, when somebody commits code, you build your Docker image and run tests against it. When all tests pass, you tag that image accordingly. Since Docker is fast, this is a feasible workflow.
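As a sketch of such a Jenkins stage (registry name, test script, and release tag are assumptions; GIT_COMMIT is set by the Jenkins Git plugin):

#!/bin/sh
docker build -t registry.example.com/myapp:"${GIT_COMMIT}" .
docker run --rm registry.example.com/myapp:"${GIT_COMMIT}" ./run-tests.sh
# only reached when the tests passed:
docker tag registry.example.com/myapp:"${GIT_COMMIT}" registry.example.com/myapp:1.4.2
docker push registry.example.com/myapp:1.4.2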
Docker image changes are also incremental, so your images will have minimal impact on storage. When your tests fail, you may also choose to save that image; that way a developer can pull it and easily investigate why the tests failed. Developers may choose to run the tests on their own machines too, since the Docker images in Jenkins and on their machines are no different.
What this brings is that all developers have the same environment and the same versions of all software, since you decide what goes into the Docker images. I have come across bugs that were due to differences between developer machines; for example, on the same operating system, Unicode settings may affect your code. But with Docker images, all developers test against the same settings and the same software versions.
Docker containers configuration:
If you are using a private repository, and you should use one, then configuration changes will not affect disk space much. Therefore, except for security-sensitive configuration such as DB passwords, you can apply configuration changes to the Docker images themselves (baking the configuration into the container). Then you can use Ansible to apply the non-stored configuration to deployed images before/after startup, using environment variables or Docker volumes.
https://dantehranian.wordpress.com/2015/03/25/how-should-i-get-application-configuration-into-my-docker-containers/
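A sketch of that deploy-time split (image name, variable names, and paths are assumptions; the module shown is community.docker.docker_container):

- name: Start app container with environment-specific settings
  community.docker.docker_container:
    name: myapp
    image: registry.example.com/myapp:1.4.2
    env:
      DB_PASSWORD: "{{ vault_db_password }}"
    volumes:
      # non-baked config injected at deploy time
      - /srv/myapp/local_settings.py:/app/local_settings.py:ro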
Does Ansible have to log into the Docker image, before it's running (this sounds insane to me) and use its Jinja2 templates the same way as with a classic host!? If not, how is this handled?!
No, Ansible will not log into the Docker image, but Ansible with Jinja2 templates can be used to change the Dockerfile. You can template the Dockerfile and inject your configuration into different files. Tag the resulting images accordingly, and you have configured images ready to spin up.
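For instance, a hypothetical playbook fragment (paths, variables, and image names are made up):

- name: Render environment-specific config next to the Dockerfile
  template:
    src: settings.py.j2
    dest: "{{ build_dir }}/settings.py"

- name: Build and tag the configured image
  community.docker.docker_image:
    name: "myapp:{{ app_version }}-{{ env_name }}"
    source: build
    build:
      path: "{{ build_dir }}"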
Regarding your question about handling multiple environment configurations with the same Docker image: I have been planning to use a service-discovery tool like Consul as a centralized config/property-management store. When you start your container, you set an ENV var that tells it what application it is (appID) and what environment config it should use (e.g. MyApplication:Dev), and it pulls its config from Consul at startup. I still have to investigate the security around Consul (if we store DB connection credentials there, for example, how do we restrict who can query/update those values?). I don't want to use this just for containers, but for all apps in general. Another cool capability is to change a config value in Consul and have a hook back into your app to apply the change immediately (maybe a REST endpoint on your app that changes are pushed down to and applied dynamically). Of course, your app has to be written to support this!
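The startup fetch could be as small as this (the Consul address and key layout are assumptions; ?raw returns the bare value from Consul's KV HTTP API):

#!/bin/sh
curl -s "http://consul.example.com:8500/v1/kv/${APP_ID}/${APP_ENV}?raw" \
  > /app/config/runtime-config.json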
You might be interested in checking out Martin Fowler's blog articles on immutable infrastructure and on Phoenix servers.
Although not a complete solution, I have suggestions for two of your issues. They might not be perfect, but these are the practices we use in our workflow, and they have proven themselves so far.
Defining different environments: supposing you've written a different Ansible role for each environment you launch, we define an environment variable setting the environment we wish the container to belong to. We then download the suitable configuration file from an S3 bucket into the container using that variable (which should be possible if you supply AWS creds or give your server an IAM role) and inject these parameters into the code when building it.
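In shell terms, the in-container step is roughly this (bucket and variable names are illustrative):

#!/bin/sh
# APP_ENV is set when the container is launched; credentials come
# from an IAM role or injected AWS keys
aws s3 cp "s3://my-config-bucket/${APP_ENV}/app-config.json" /app/config.json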
Ansible doesn't need to log into the Docker app, but the solution is a bit tricky. I've tried two ways of tackling this problem, and neither is ideal. The first is to download the configuration file as part of the Docker image's command line and build the app on container startup. While this solution works, it breaches the Docker philosophy and makes the image highly prone to build errors.
Another solution is pushing several images to your docker hub repo, and then pulling the appropriate image according to the environment at hand.
In a broader stroke, I've tried launching our app completely with Ansible and it was hell; many configuration steps are tricky, and they get trickier when you try to implement them as a playbook. When I switched to maintaining the servers alone with Ansible and deploying the app itself with Docker, things got a lot easier.
I think I'm on the right path: I can use .ebextensions to change some of the conf files for the instance I'm running. Since I'm using Elastic Beanstalk, and a lot of the software is shrink-wrapped (which I'm fine with), I should be using .ebextensions as the means of modifying the environment.
I want to employ some form of mod_rewrite config, but I know nothing of this Amazon Linux. I don't even know what the web server is. I've been through the console for the past few hours and see no trace of the things I want to override.
Apparently I can open a shell to take a look around, but modifying things that way will get overridden, since Beanstalk handles the config. I'm not entirely sure about that last point.
Should I just ssh and play in userland like a typical unix host?
You can definitely SSH to the instance and look around. But remember that your changes are not persistent. You should look at .ebextensions config files as the way to re-run your commands on the host, plus more.
It might take some time to see where Elastic Beanstalk stores configuration files and all the other interesting things.
To get you started: your app files are located at /opt/python/current/app, and if you are using Python, it runs in a virtual environment at /opt/python/run/venv/bin/python27.
Customizing the Software on EC2 Instances Running Linux guide contains detailed information on what you can do:
Packages - install packages
Sources - retrieve archives
Files - operations with files (see the sketch after this list)
Users - anything with users
Groups - anything with groups
Commands - execute instance commands
Container_commands - execute commands after the container is extracted
Services - launch services
Option_settings - configure container settings
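For illustration only (Apache paths assumed for the older Amazon Linux platform), a config file dropped into .ebextensions/ in your app source could combine files and container_commands like this:

# .ebextensions/01-rewrite.config
files:
  "/etc/httpd/conf.d/rewrite.conf":
    mode: "000644"
    owner: root
    group: root
    content: |
      # hypothetical mod_rewrite rules
      RewriteEngine On
container_commands:
  01_restart_httpd:
    command: "service httpd restart"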
See if that satisfies your requirements, if not, come back to StackOverflow and ask more questions.
Where should the configuration.yml file of Dropwizard be saved?
I'm using Dropwizard which is a Java web framework.
Dropwizard uses configuration.yml files to load environment-specific configuration.
In the examples I found online, the configuration.yml files contain the username and password of databases.
Now the question is where to save these configuration files, which contain passwords in plain text.
OPTION 1 GIT REPOSITORY
In the examples, the configuration.yml files are part of the project, so one could keep them in the Git repository with the rest of the code. This, though, is a well-known bad security practice.
If someone cracks the Git repository, they have access to the code and to the database. Also, this way every single developer has access to all the passwords of all the environments.
OPTION 2 FILE ON THE COMPUTER
Save the configuration.yml on the machine, but do not store it in the Git repository.
OPTION 3 ENVIRONMENT VARIABLES
Use a configuration.yml file which points to environment variables on the specific machine.
This is not so practical, since all these environment variables need to be set manually on all the machines. Also, what is the syntax to use environment variables in Dropwizard's configuration.yml files?
I'd go with environment variables if you cannot control read access to the config file or are concerned that your machine is owned by an untrusted third party.
Environment variables are trivial to script.
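On the syntax question: Dropwizard can resolve ${...} placeholders in configuration.yml, assuming the application enables a SubstitutingSourceProvider with an EnvironmentVariableSubstitutor in its initialize() method. The YAML side then looks like this (field names are illustrative):

database:
  user: ${DB_USER}
  password: ${DB_PASSWORD}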
You should use a file on the computer: this is how many frameworks out there work.
If you use a Unix/Linux server, you can chmod 0600 [filename] and be sure that nobody (well, almost nobody, since root can do anything) can read that file.
On the Dropwizard mailing list it was also suggested to deploy your application with software like Puppet/Chef and let those frameworks handle all the variables (e.g. different configurations for test/staging/production).
Bye
Piero