So I use Python+Django (but it does not really matter for this question).
When I write my code I simply run
./manage.py runserver
which runs the webserver, serves static files, handles automatic reload, etc.,
and to put it in production I use a series of commands like
./manage.py collectstatic
./manage.py migrate
uwsgi --http 127.0.0.1:8000 -w wsgi --processes=4
I also have a few other services like postgres and redis (which are common to both production and dev).
So I'm trying to adopt Docker (+ Compose) here and I cannot understand how to split prod/dev with it.
Basically, in docker-compose.yml you define your services and images, but in my case the image in production should run one CMD and in dev another.
What are the best practices to achieve that?
You should create additional compose files like docker-compose-dev.yml or docker-compose-pro.yml and override parts of the original docker-compose.yml configuration with the -f option:
docker-compose -f docker-compose.yml -f docker-compose-dev.yml up -d
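For example, a hypothetical docker-compose-dev.yml could override only what differs in development (the service name web, the port, and the mount path are assumptions, so adjust them to your project):

services:
  web:
    command: ./manage.py runserver 0.0.0.0:8000
    volumes:
      - .:/code        # mount the source so edits are picked up without rebuilding
    ports:
      - "8000:8000"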
Sometimes I also use a different Dockerfile per environment and specify the dockerfile parameter in the build section of docker-compose-pro.yml, but I don't recommend it because you will end up with duplicated Dockerfiles.
Update
Docker has introduced the multi-stage builds feature https://docs.docker.com/develop/develop-images/multistage-build/#use-multi-stage-builds which allows you to keep a single Dockerfile with separate stages for different environments.
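As a rough sketch (the stage names dev and prod are my own, and the commands are taken from the question), such a Dockerfile could look like this; you then pick a stage with docker build --target or the target option of a compose build section:

FROM python:3 AS base
WORKDIR /code
COPY requirements.txt /code/
RUN pip install -r requirements.txt
COPY . /code/

# dev stage: same image, but the dev server as the default command
FROM base AS dev
CMD ["python", "manage.py", "runserver", "0.0.0.0:8000"]

# prod stage: default command taken from the question's production setup
FROM base AS prod
CMD ["uwsgi", "--http", "127.0.0.1:8000", "-w", "wsgi", "--processes=4"]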
Usually, having different startup workflows for production and dev is a bad idea. You should always try to keep the dev and prod environments very similar, even in the way you launch your applications, and you should externalize any configuration that differs between environments.
Having a different startup sequence may be acceptable; however, having multiple Docker images (or Dockerfiles), one per environment, is a very bad idea. Docker images should be immutable and portable.
However, you might have some constraints. Docker Compose allows you to override the command specified in the image: the command property replaces the image's default command. I would recommend that you keep the image production-ready,
i.e. use something like CMD ./manage.py collectstatic && ./manage.py migrate && uwsgi --http 127.0.0.1:8000 -w wsgi --processes=4 in the Dockerfile.
In the compose file just override the CMD by specifying:
command: ./manage.py runserver
Having multiple compose files is usually not a big issue. You can keep them clean and manageable by using some nice compose file features such as extends, where one compose file can extend another.
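For instance (whether extends is available depends on your Compose version, and the file and service names here are assumptions), a dev compose file could reuse the base service definition and only swap the command:

services:
  web:
    extends:
      file: docker-compose.yml
      service: web
    command: ./manage.py runserver 0.0.0.0:8000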
Related
I am following the official tutorial on the Docker website
The Dockerfile is:
FROM python:3
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
WORKDIR /code
COPY requirements.txt /code/
RUN pip install -r requirements.txt
COPY . /code/
The docker-compose.yml is:
web:
  build: .
  command: python manage.py runserver 0.0.0.0:8000
  volumes:
    - .:/code
I do not understand why the Dockerfile copies the code (COPY . /code/) but the docker-compose file then mounts it again (- .:/code). Isn't it enough to either copy or mount?
Both the volumes: and command: in the docker-compose.yml file are unnecessary and should be removed. The code and the default CMD to run should be included in the Dockerfile.
When you're setting up the Docker environment, imagine that you're handed root access to a brand-new virtual machine with nothing installed on it but Docker. The ideal case is being able to docker run your-image, as a single command, pulling it from some registry, with as few additional options as possible. When you run the image you shouldn't need to separately supply its source code or the command to run; these should usually be built into the image.
In most cases you should be able to build a Compose setup with fairly few options. Each service needs an image: or build: (both, if you're planning to push the image), often environment:, ports:, and depends_on: (being aware of the limitations of the latter option), and your database container(s) will need volumes: for their persistent state. That's usually it.
The one case where you do need to override command: in Compose is when you need to run a separate command on the same image and code base. In a Python context this often comes up when running a Celery worker next to a Django application.
Here's a complete example: a database-backed web application with an async worker. The Redis cache layer has no persistence, and no files are stored locally in any container except for the database storage. The one thing missing is the database credentials, which would need additional environment: variables. Note the lack of volumes: for code, the single command: override where it's required, and the environment: variables providing host names.
version: '3.8'
services:
  app:
    build: .
    ports: ['8000:8000']
    environment:
      REDIS_HOST: redis
      PGHOST: db
  worker:
    build: .
    command: celery worker -A queue -l info
    environment:
      REDIS_HOST: redis
      PGHOST: db
  redis:
    image: redis:latest
  db:
    image: postgres:13
    volumes:
      - pgdata:/var/lib/postgresql/data
volumes:
  pgdata:
Where you do see volumes: overwriting the image's code like this, it's usually an attempt to avoid needing to rebuild the image when the code changes. In the Dockerfile you show, though, the rebuild is almost free assuming the requirements.txt file hasn't changed. It's also almost always possible to do your day-to-day development outside of Docker – for Python, in a virtual environment – and use the container setup for integration testing and deployment, and this will generally be easier than convincing your IDE that the language interpreter it needs is in a container.
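A minimal sketch of that outside-Docker loop, assuming a standard requirements.txt and that the db/redis ports from the compose file are published to (or the services run on) localhost:

python3 -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt
# point the app at the services on localhost (variable names follow the compose example above)
export PGHOST=localhost REDIS_HOST=localhost
python manage.py runserver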
Sometimes the Dockerfile does additional setup (changing line endings or permissions, rearranging files) and the volumes: mount will hide this. It means you're never actually running the code built into the image in development, so if the image setup is buggy in some way you won't see it. In short, it reintroduces the "works on my machine" problem that Docker generally tries to avoid.
The COPY is there so the code gets saved into the image.
When you use COPY, the code becomes part of the image itself.
The volume mount is only used while developing.
Ideally we use a single Dockerfile to create the image we use for both production and development. This increases the similarity of the app's runtime environment, which is a good thing.
In contrast to what @David writes: it's quite handy to do your day-to-day development in a Docker container. Your code then runs in the same environment in production and development. If you use a virtualenv in development you're not making use of that very practical property of Docker; the environments can diverge without you knowing, and prod can break while dev keeps on working.
So how do we let a single Dockerfile produce an image that we can run in production and use during development? First we should talk about the code we want to run. In production we want to have a specific collection of code to run (most likely the code at a specific commit in your repository). But during development we constantly change the code we want to run, by checking out different branches or editing files. So how do we satisfy both requirements?
We instruct the Dockerfile to copy some directory of code (in your example .) into the image, in your example /code. If we don't do anything else: that will be the code that runs. This happens in production.
But in development we can override the /code directory with a directory on the host computer using a volume. In the example the Docker Compose file sets the volume. Then we can easily change the code running in the dev container without needing to rebuild the image.
Also: even if rebuilding is fast, letting a Python process restart with new files is a lot faster.
I have a Django app that I put inside a docker container for deployment. I have some initial data that I want to load into the database via the dumpdata and loaddata commands. The initial data lives on my local hard drive. I chose a very naive approach and simply copied the data_backup.json file to the server via scp.
Now I want to load the data_backup.json file (the file sits on the server, not in the docker container) by executing:
sudo docker-compose exec restapi python manage.py loaddata --settings=rest.settings.production ./data_backup_20191004.json
But Django only searches the internal directories for fixtures.
I am looking for a way to populate the database with the data_backup.json file inside the docker container. Can someone help?
Ultimately, I am looking for a way to dump data directly to S3 and load it from there if needed (for db backups). If you have any tips on how to achieve that, this would also be super helpful - I don't seem to be able to find material on that.
Just in case someone has this question in the future: it is possible to loaddata from stdin, so you can take the backup file and pipe it into the database within the container with a command like this:
cat <<fixture_name.json>> | sudo docker exec -i <<container_name_or_id>> python manage.py loaddata --format=json -
The trailing dash tells Django that you want to load the data from stdin.
(See the Django loaddata documentation.)
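For the S3 part, one option (a sketch only; it assumes the aws CLI is configured, and the bucket/path and the placeholders are hypothetical) is to stream dumpdata straight to S3 and later stream it back into loaddata via stdin:

# back up: dump from the container and stream the JSON to S3
sudo docker exec <<container_name_or_id>> python manage.py dumpdata --format=json | aws s3 cp - s3://my-bucket/backups/data_backup.json

# restore: stream the backup from S3 into loaddata's stdin
aws s3 cp s3://my-bucket/backups/data_backup.json - | sudo docker exec -i <<container_name_or_id>> python manage.py loaddata --format=json -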
You could copy the file into the docker container before running the command with docker cp:
docker cp ./data_backup_20191004.json <container_id>:django_dir/data_backup_20191004.json
Or, if the file is located on an S3 server, you could run curl inside the docker container and load the data from there:
sudo docker-compose exec restapi curl -o data.json http://s3.example.com/path/to/data.json
sudo docker-compose exec restapi python manage.py loaddata --settings=rest.settings.production ./data.json
I am looking for a way to populate the database with the data_backup.json file inside the docker container. Can someone help?
See the answer from Xen_mar, which I think is perfect.
Ultimately, I am looking for a way to dump data directly to S3 and load it from there if needed (for db backups). If you have any tips on how to achieve that, this would also be super helpful - I don't seem to be able to find material on that.
This seems to be a completely different question. I would consider using a Django package like Django Smuggler, which allows you to load and dump fixtures from the admin, and I assume it may be possible to configure Django Smuggler's upload directory to be handled by django-storages. I'm not sure that is possible, so if you try it, let me know.
I have a Django app up and running in Google App Engine flexible. I know how to run migrations using the cloud proxy or by setting the DATABASES value but I would like to automate running migrations by doing it in the deployment step. However, there does not seem to be a way to run a custom script before or after the deployment.
The only way I've come up with is by doing it in the entrypoint command which you can set in the app.yaml:
entrypoint: bash -c 'python3 manage.py migrate --noinput && gunicorn -b :$PORT app.wsgi'
This feels a lot like doing it wrong. A lot of Googling didn't provide a better answer.
Defining the python3 manage.py migrate command in your app.yaml file will make it run every time a new instance is spawned and set up to serve traffic. Although technically this may not be an issue (no migration will happen if the database schema hasn't changed), this isn't the right place to declare it.
You'd want this command to run once per new version of the code you push. This fits perfectly into a CI/CD approach. The Google Cloud online documentation has several tutorials using Bitbucket Pipelines or Travis CI, for example, but you can use many other CI/CD solutions.
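As an illustration only (the builder images and the assumption that the build environment can reach your database, e.g. through the Cloud SQL Auth Proxy, are mine, not from the answer), a cloudbuild.yaml could run migrations once per deploy and then deploy the app:

steps:
  # run migrations once per code push (assumes the database is reachable from the build)
  - name: 'python:3'
    entrypoint: 'bash'
    args: ['-c', 'pip install -r requirements.txt && python3 manage.py migrate --noinput']
  # then deploy the new version to App Engine
  - name: 'gcr.io/cloud-builders/gcloud'
    args: ['app', 'deploy', '--quiet']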
I am building a Python+Django development environment using docker. I defined Dockerfile files and services in docker-compose.yml for the web server (nginx) and database (postgres) containers and a container that will run our app using uwsgi. Since this is a dev environment, I am mounting the app code from the host system, so I can easily edit it in my IDE.
The question I have is where/how to run the migrate command.
In case you don't know Django, the migrate command creates the database structure and later changes it as needed by the project. I have seen people run migrate as part of the compose command directive (command: python manage.py migrate && uwsgi --ini app.ini), but I do not want migrations to run on every container restart. I only want them to run once when I create the containers and never again unless I rebuild.
Where/how would I do that?
Edit: there is now an open issue with the Compose team. With any luck, one-time command containers will get supported by Compose. https://github.com/docker/compose/issues/1896
You cannot use RUN because, as you mentioned in the comments, your source is mounted while the container is running.
You cannot use CMD either, since you don't want it to run every time you restart the container.
I recommend running docker exec manually after the container is up. I do not think there is a way to automate this inside a Dockerfile or docker-compose, for the two reasons given above.
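In practice that is just one extra command after bringing the stack up (the service name app is a placeholder for whatever you called your uwsgi/Django service):

docker-compose up -d
docker-compose exec app python manage.py migrate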
It sounds like what you need is a tool for managing project tasks. dobi is a tool designed to handle these tasks (disclaimer: I am the author of this tool).
You can see an example of how to run a migration here: https://github.com/dnephin/dobi/tree/master/examples/init-db-with-rails. The example uses Rails, but it's basically the same idea as Django.
You could set up a task called migrate which would run the command in a container and write the data to a volume. Then, when you start your docker-compose containers, use that volume as the source for your database service.
https://github.com/docker/compose/issues/1896 is finally resolved now by the new service profiles introduced with docker-compose 1.28.0. With profiles you can mark services to be only started in specific profiles:
services:
  nginx:
    # ...
  postgres:
    # ...
  uwsgi:
    # ...
  migrations:
    profiles: ["cli-only"]  # profile name chosen freely
    # ...
docker-compose up # start only your app services, no migrations
docker-compose run migrations # run migrations on-demand
docker exec -it container-name bash
Then you will be inside the container and you can run any command you would normally run during development without Docker.
I'm trying to find a workflow with Docker and Django. Currently, I'm using the basic configuration from the docker documentation.
I'd like to use manage.py startapp directly from the container to start a new app using:
docker-compose run web ./manage.py startapp myapp
But all the files created in the volume are owned by the root user and not by myself, so I can't edit them from the host.
My idea is to avoid installing all the requirements on my host machine, but maybe I should not create the app from the container?
One possible solution is to create a user with the same UID/GID as my user on the host machine, but that won't work if I try to use another account on my host machine...
Any suggestion?
What worked best for me was avoiding (or minimizing) file creation inside the containers.
My Dockerfile would just copy the requirements.txt and install them;
and the container would access the app files through a mounted volume.
I pass the env var PYTHONDONTWRITEBYTECODE=1 to the containers, so python does not create *.pyc/*.pyo files.
The few times I cannot avoid it (like, ./manage.py makemigrations), I run chown afterwards.
It's not ideal, but as this happens rarely for my case, I don't bother.
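For reference, the two workarounds look roughly like this (the web service name and myapp are taken from the question; running as your host UID is only a sketch and may need a writable working directory inside the container):

# run the one-off command as your host user, so the created files are not root-owned
docker-compose run --user "$(id -u):$(id -g)" web ./manage.py startapp myapp

# or fix ownership afterwards
sudo chown -R "$(id -u):$(id -g)" myapp/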