Productionizing Docker On AWS With Django & Postgres

I have been stumped as of late trying to figure out a cloud environment setup that can support my 'dockerized' Python Django project. The architecture of my application is simple; I have a web service, a redis service and a database service (see my compose yml file below).
My confusion is in finding the right path for moving a local setup, stood up via the docker-compose yml file, to a production-like environment. I absolutely love the docker compose utility and hope to use a similar config file for my production environment; however, I am finding that most container-based approaches in the cloud are far more complex than this. This guide has been my favorite so far, but it doesn't dive into databases and the other components the app depends on (which leaves out the necessary nitty-gritty stuff!).
Questions:
Should I move away from trying to use docker-compose on a production environment?
Should I use a database that is not containerized? If I shouldn't, how do I configure the containerized web service to an Amazon RDS instance?
Should I break apart my docker-compose.yml into individual Dockerfiles? Should I write a script to start these containers and stand up the environment?
Is there a solid tutorial that I am not finding for this sort of thing? I am baffled by how overly simple a lot of the guides out there are; they don't help with applications even remotely more complex than "Hello World".
I have also found AWS and Heroku to lack common features that I would expect based on how popular Docker is. Is this a commonly held opinion? Heroku does not help with or advocate for setting up databases in containers, and AWS is still behind for users trying to use docker-compose yml files written in v3.
Any recommendations are much appreciated. I've been hacking away with no solid conclusions for the past 3 nights.
version: '3.3'
services:
  db:
    restart: always
    image: postgres
    networks:
      - webnet
  redis:
    restart: always
    image: redis:latest
    expose:
      - "6379"
    networks:
      - webnet
  web:
    restart: always
    build: .
    command: make start
    volumes:
      - .:/code
    deploy:
      replicas: 5
      resources:
        limits:
          cpus: "0.1"
          memory: 50M
      restart_policy:
        condition: on-failure
    ports:
      - "8000:8000"
    networks:
      - webnet
    depends_on:
      - db
      - redis
networks:
  webnet:

Related

Copying code using Dockerfile or mounting volume using docker-compose

I am following the official tutorial on the Docker website
The docker file is
FROM python:3
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
WORKDIR /code
COPY requirements.txt /code/
RUN pip install -r requirements.txt
COPY . /code/
The docker-compose is:
web:
  build: .
  command: python manage.py runserver 0.0.0.0:8000
  volumes:
    - .:/code
I do not understand why in the Dockerfile they are copying the code COPY . /code/ , but then again mounting it in the docker-compose - .:/code ? Is it not enough if I either copy or mount?
Both the volumes: and command: in the docker-compose.yml file are unnecessary and should be removed. The code and the default CMD to run should be included in the Dockerfile.
When you're setting up the Docker environment, imagine that you're handed root access to a brand-new virtual machine with nothing installed on it but Docker. The ideal case is being able to docker run your-image, as a single command, pulling it from some registry, with as few additional options as possible. When you run the image you shouldn't need to separately supply its source code or the command to run; these should usually be built into the image.
In most cases you should be able to build a Compose setup with fairly few options. Each service needs an image: or build: (both, if you're planning to push the image), often environment:, ports:, and depends_on: (being aware of the limitations of the latter option), and your database container(s) will need volumes: for their persistent state. That's usually it.
The one case where you do need to override command: in Compose is if you need to run a separate command on the same image and code base. In a Python context this often comes up to run a Celery worker next to a Django application.
Here's a complete example, with a database-backed Web application with an async worker. The Redis cache layer does not have persistence and no files are stored locally in any containers, except for the database storage. The one thing missing is the setup for the database credentials, which requires additional environment: variables. Note the lack of volumes: for code, the single command: override where required, and environment: variables providing host names.
version: '3.8'
services:
  app:
    build: .
    ports: ['8000:8000']
    environment:
      REDIS_HOST: redis
      PGHOST: db
  worker:
    build: .
    # Global options such as -A go before the subcommand (required by Celery 5+).
    command: celery -A queue worker -l info
    environment:
      REDIS_HOST: redis
      PGHOST: db
  redis:
    image: redis:latest
  db:
    image: postgres:13
    volumes:
      - pgdata:/var/lib/postgresql/data
volumes:
  pgdata:
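On the application side, the host names supplied through environment: still have to be read by the Django settings. The following is a minimal sketch of how that might look, not the answer's actual settings file: PGHOST and REDIS_HOST come from the Compose file above, while the helper names database_settings and redis_url, the other PG* variable names, and every default value are assumptions made for illustration.

```python
import os


def database_settings(env=os.environ):
    """Build a Django DATABASES dict from environment variables.

    PGHOST matches the Compose file above; the remaining variable names
    and all defaults are assumptions for this sketch.
    """
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "HOST": env.get("PGHOST", "localhost"),
            "PORT": env.get("PGPORT", "5432"),
            "NAME": env.get("PGDATABASE", "postgres"),
            "USER": env.get("PGUSER", "postgres"),
            "PASSWORD": env.get("PGPASSWORD", ""),
        }
    }


def redis_url(env=os.environ):
    """Return a Redis URL built from REDIS_HOST (port and db index are assumed)."""
    return "redis://{}:6379/0".format(env.get("REDIS_HOST", "localhost"))


# In settings.py this would be used as:
DATABASES = database_settings()
```

Because the host is only a value in the environment, the same image could in principle be pointed at a database running outside Compose (for example a managed database endpoint) by changing PGHOST, without rebuilding.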
Where you do see volumes: overwriting the image's code like this, it's usually an attempt to avoid needing to rebuild the image when the code changes. In the Dockerfile you show, though, the rebuild is almost free assuming the requirements.txt file hasn't changed. It's also almost always possible to do your day-to-day development outside of Docker – for Python, in a virtual environment – and use the container setup for integration testing and deployment, and this will generally be easier than convincing your IDE that the language interpreter it needs is in a container.
Sometimes the Dockerfile does additional setup (changing line endings or permissions, rearranging files) and the volumes: mount will hide this. It means you're never actually running the code built into the image in development, so if the image setup is buggy in some way you won't see it. In short, it reintroduces the "works on my machine" problem that Docker generally tries to avoid.
COPY is used so that the image is saved with the code in it.
When you use COPY, the code becomes part of the image.
Mounting a volume, on the other hand, is only for development.
Ideally we use a single Dockerfile to create the image we use for both production and development. This increases the similarity of the app's runtime environment, which is a good thing.
In contrast to what @David writes: it's quite handy to do your day-to-day development with a Docker container. Your code runs in the same environment in production and development. If you use virtualenv in development you're not making use of that very practical attribute of Docker. The environments can diverge without you knowing, and prod can break while dev keeps on working.
So how do we let a single Dockerfile produce an image that we can run in production and use during development? First we should talk about the code we want to run. In production we want to have a specific collection of code to run (most likely the code at a specific commit in your repository). But during development we constantly change the code we want to run, by checking out different branches or editing files. So how do we satisfy both requirements?
We instruct the Dockerfile to copy some directory of code (in your example .) into the image, in your example /code. If we don't do anything else: that will be the code that runs. This happens in production.
But in development we can override the /code directory with a directory on the host computer using a volume. In the example the Docker Compose file sets the volume. Then we can easily change the code running in the dev container without needing to rebuild the image.
Also: even if rebuilding is fast, letting a Python process restart with new files is a lot faster.

Disable load examples in Apache Superset

I have installed Apache Superset locally using Docker Compose. All of the pre-loaded examples were helpful at first, but I am wondering if there is a configuration setting to skip loading these examples. I know that I could delete each example chart and dashboard individually, but I'm hoping there is a configuration setting somewhere that would disable loading examples. I've reviewed the documentation, but haven't seen such a setting.
https://superset.apache.org/docs/installation/configuring-superset
You can disable loading the examples by removing SUPERSET_LOAD_EXAMPLES=yes from the env file located at docker/.env. Docker Compose loads the environment from that file:
superset:
  env_file: docker/.env
  image: *superset-image
  container_name: superset_app
  command: ["/app/docker/docker-bootstrap.sh", "app"]
  restart: unless-stopped
  ports:
    - 8088:8088

have different base image for an application in docker

I am new to the Docker world. I am trying to understand Docker concepts about parent images. Assume that I want to run my Django application on Docker. I want to use Ubuntu and Python, I want to have PostgreSQL as my database backend, and I want to run my Django application on the gunicorn web server. Can I have a different base image for each of ubuntu, python, postgres and gunicorn and create my Django container like this:
FROM ubuntu
FROM python:3.6.3
FROM postgres
FROM gunicorn
...
I am thinking about having different base images because if someday I want to update one of these images, I only have to update the base image and not go into ubuntu and update them.
You can use multiple FROM in the same Dockerfile, provided you are doing a multi-stage build.
One part of the Dockerfile would build an intermediate image used by another.
But that is generally used to cleanly separate the parents used for building your final program from the parents needed to execute your final program.
No, you cannot create your image like this. The only image that will be treated as the base image in the Dockerfile you posted is the last one, FROM gunicorn. What you need is multi-stage builds, but before that, I will clear up some concepts about such a Dockerfile.
A parent image is the image that your image is based on. It refers to
the contents of the FROM directive in the Dockerfile. Each subsequent
declaration in the Dockerfile modifies this parent image. Most
Dockerfiles start from a parent image, rather than a base image.
However, the terms are sometimes used interchangeably.
But in your case, I will not recommend putting everything in one Dockerfile. It will kill the purpose of Containerization.
Rule of Thumb
One process per container
Each container should have only one concern
Decoupling applications into multiple containers makes it much easier to scale horizontally and reuse containers. For instance, a web application stack might consist of three separate containers, each with its own unique image, to manage the web application, database, and an in-memory cache in a decoupled manner.
dockerfile_best-practices
Apart from the database, you can use multi-stage builds.
If you use Docker 17.05 or higher, you can use multi-stage builds to
drastically reduce the size of your final image, without the need to
jump through hoops to reduce the number of intermediate layers or
remove intermediate files during the build.
Images being built by the final stage only, you can most of the time
benefit both the build cache and minimize images layers.
Your build stage may contain several layers, ordered from the less
frequently changed to the more frequently changed for example:
Install tools you need to build your application
Install or update library dependencies
Generate your application
use-multi-stage-builds
With a multi-stage build, the Dockerfile can contain multiple FROM lines; each stage starts with a new FROM line and a fresh context. You can copy artifacts from stage to stage, and the artifacts not copied over are discarded. This keeps the final image smaller, with only the relevant artifacts included.
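To make the point about stages concrete, here is a small sketch that lists the base image of every stage in the Dockerfile from the question. It is a deliberately simplified reader, not a real Dockerfile parser: the helper name dockerfile_stages is hypothetical, and it assumes each FROM instruction sits on a single line. By default (with no --target given to the build), only the last stage becomes the resulting image.

```python
def dockerfile_stages(text):
    """Return the base image of each build stage, in order.

    Simplified: assumes every FROM instruction is on one line and skips
    optional flags such as --platform.
    """
    stages = []
    for line in text.splitlines():
        words = line.split()
        if not words or words[0].upper() != "FROM":
            continue
        # Drop flags like --platform=...; the first remaining word is the image.
        args = [w for w in words[1:] if not w.startswith("--")]
        if args:
            stages.append(args[0])
    return stages


question_dockerfile = """\
FROM ubuntu
FROM python:3.6.3
FROM postgres
FROM gunicorn
"""

stages = dockerfile_stages(question_dockerfile)
final_base = stages[-1]  # the stage a default build turns into the image
print(stages, "->", final_base)
```

The first three stages contribute nothing to the final image unless something is explicitly copied out of them with COPY --from, which is why stacking FROM lines does not combine the images.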
Is it possible? Yes, Technically multiple base images (FROM XXXX) can appear in single docker file. But it is not for what you are trying to do. They are used for multi-stage builds. You can read more about it here.
The answer to your question is that, if you want to achieve this type of Docker image, you should use one base image and install everything else in it with RUN commands, like this:
FROM ubuntu
# install postgresql (the package index must be refreshed first)
RUN apt-get update && apt-get install -y postgresql
...
Obviously it is not that simple. The base ubuntu image is very minimal, so you have to install all the dependencies and tools needed to install python, postgres and gunicorn yourself with RUN commands. For example, if you need to download the Python source code using
RUN wget https://www.python.org/ftp/python/3.7.4/Python-3.7.4.tgz
wget is (most probably) not preinstalled in the ubuntu image. You have to install it yourself.
Should I do it? I think you are going against the whole idea of dockerizing apps, which is not to build a monolithic giant image containing all the services, but to divide services into separate containers (generally there should be one service per container) and then make these containers talk to each other with Docker networking tools. That is, you should use one container for postgres, one for nginx and one for gunicorn, run them separately and connect them via a network. There is an awesome tool, docker-compose, that comes with Docker to automate this kind of multi-container setup. You should really use it. For a more practical example of it, please read this good article.
You can use the official Docker image for Django, https://hub.docker.com/_/django/ (note that this image has since been deprecated in favor of the standard python image).
Its Dockerfile is well documented and explained.
If you want to use different base images then you must go with docker-compose.
Your docker-compose.yml will look like this:
version: '3'
services:
  web:
    restart: always
    build: ./web
    expose:
      - "8000"
    links:
      - postgres:postgres
      - redis:redis
    volumes:
      - web-django:/usr/src/app
      - web-static:/usr/src/app/static
    env_file: .env
    environment:
      DEBUG: 'true'
    command: /usr/local/bin/gunicorn docker_django.wsgi:application -w 2 -b :8000
  nginx:
    restart: always
    build: ./nginx/
    ports:
      - "80:80"
    volumes:
      - web-static:/www/static
    links:
      - web:web
  postgres:
    restart: always
    image: postgres:latest
    ports:
      - "5432:5432"
    volumes:
      - pgdata:/var/lib/postgresql/data/
  redis:
    restart: always
    image: redis:latest
    ports:
      - "6379:6379"
    volumes:
      - redisdata:/data
volumes:
  web-django:
  web-static:
  pgdata:
  redisdata:
Follow this blog post for details: https://realpython.com/django-development-with-docker-compose-and-machine/

Celery and Django, queries cause ProgrammingError

I'm building a small Django project with cookiecutter-django and I need to run tasks in the background. Even though I set up the project with cookiecutter I'm facing some issues with Celery.
Let's say I have a model class called Job with three fields: a default primary key, a UUID and a date:
import uuid

from django.db import models

class Job(models.Model):
    access_id = models.UUIDField(default=uuid.uuid4, editable=False, unique=True)
    date = models.DateTimeField(auto_now_add=True)
Now if I do the following in a Django view everything works fine:
job1 = Job()
job1.save()
logger.info("Created job {}".format(job1.access_id))
job2 = Job.objects.get(access_id=job1.access_id)
logger.info("Retrieved job {}".format(job2.access_id))
If I create a Celery task that does exactly the same, I get an error:
django.db.utils.ProgrammingError: relation "metagrabber_job" does not exist
LINE 1: INSERT INTO "metagrabber_job" ("access_id", "date") VALUES ('e8a2...
Similarly this is what my Postgres docker container says at that moment:
postgres_1 | 2018-03-05 18:23:23.008 UTC [85] STATEMENT: INSERT INTO "metagrabber_job" ("access_id", "date") VALUES ('e8a26945-67c7-4c66-afd1-bbf77cc7ff6d'::uuid, '2018-03-05T18:23:23.008085+00:00'::timestamptz) RETURNING "metagrabber_job"."id"
Interestingly enough, if I look into my Django admin I do see that a Job object is created, but it carries a different UUID than the one in the logs.
If I then set CELERY_ALWAYS_EAGER = True to make Django execute the task in-process instead of handing it to Celery: voilà, it works again without error. But running the tasks in Django isn't the point.
I did quite a bit of searching and I only found similar issues where the solution was to run manage.py migrate. However, I did this already, and it can't be the solution; otherwise Django wouldn't be able to execute the problematic code with or without Celery.
So what's going on? I'm getting this exact same behavior for all my model objects.
edit:
Just in case, I'm using Django 2.0.2 and Celery 4.1
I found my mistake. If you are sure that your database is migrated properly and you get errors as above: it might very well be that you can't connect to the database. Your db host might be reached, but not the database itself.
That means your config is probably broken.
Why it was misconfigured: in the case of cookiecutter-django there is an issue that Celery might complain about running as root on Mac, so I set the environment variable C_FORCE_ROOT in my docker-compose file. [Only for local, you should never do this in production!] Read about this issue here https://github.com/pydanny/cookiecutter-django/issues/1304
The relevant parts of the config looked like this:
django: &django
  build:
    context: .
    dockerfile: ./compose/local/django/Dockerfile
  depends_on:
    - postgres
  volumes:
    - .:/app
  environment:
    - POSTGRES_USER=asdfg123456
    - USE_DOCKER=yes
  ports:
    - "8000:8000"
    - "3000:3000"
  command: /start.sh
celeryworker:
  <<: *django
  depends_on:
    - redis
    - postgres
  environment:
    - C_FORCE_ROOT=true
  ports: []
  command: /start-celeryworker.sh
However, setting this environment variable in the docker-compose file prevented the django environment variables from being set on the celeryworker container: the environment: key under celeryworker replaces the one merged in from *django instead of extending it. That left me with a nonexistent database configuration.
I added the POSTGRES_USER variable to that container manually and things started to work again. Stupid mistake on my end, but I hope I can save someone some time with this answer.
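The reason the variables disappeared is that the YAML merge key (<<) is a shallow merge: a key defined locally wins outright, and lists are not concatenated. The sketch below emulates that behaviour with plain Python dicts rather than an actual YAML parser, so it is an analogy under that assumption; the dictionaries copy only the relevant keys from the config above.

```python
# Mappings mirroring the two services in the Compose fragment above.
django = {
    "depends_on": ["postgres"],
    "environment": ["POSTGRES_USER=asdfg123456", "USE_DOCKER=yes"],
    "command": "/start.sh",
}
celeryworker_overrides = {
    "depends_on": ["redis", "postgres"],
    "environment": ["C_FORCE_ROOT=true"],
    "command": "/start-celeryworker.sh",
}

# "<<: *django" behaves like a shallow merge: keys set locally win outright.
celeryworker = {**django, **celeryworker_overrides}
print(celeryworker["environment"])  # ['C_FORCE_ROOT=true']

# One way to keep the inherited variables: list them explicitly again.
fixed_environment = django["environment"] + celeryworker_overrides["environment"]
print(fixed_environment)
```

In the Compose file itself, the equivalent fix is the one described above: repeat the needed variables (such as POSTGRES_USER) in the celeryworker environment: list.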

What is the correct way to set up a Django/PostgreSQL project with Docker Compose?

Can you tell me what is the best way to set up and configure a Django/PostgreSQL project with Docker Compose?
(with the latest versions of everything, python 3.4, django 1.8.1, etc.)
Have you looked at the examples on the Docker website? Here's a link describing exactly this.
Basically, you need two services, one for your Django app and one for your Postgres instance. You would probably want to build a Docker image for your Django app from your current folder, hence you'll need to define a Dockerfile:
# Dockerfile
FROM python:3.4-onbuild
That's the whole Dockerfile! Using the magic -onbuild image, files are automatically copied to the container and the requirements are installed with pip. For more info, read here.
Then, you simply need to define your docker-compose.yml file:
# docker-compose.yml
db:
  image: postgres
web:
  build: .
  command: python manage.py runserver 0.0.0.0:8000
  volumes:
    - .:/code
  ports:
    - "8000:8000"
  links:
    - db
Here, you've defined the Postgres service, which is built from the latest postgres image. Then, you've defined your Django app's service, built it from the current directory, exposed port 8000 so that you can access it from outside your container, linked to the database container (so that they can magically communicate without anything specific from your side - more details here) and started the container with the classic command you use to normally start your Django app. Also, a volume is defined in order to sync the code you're writing with the one from inside your container (so that you don't need to rebuild your image every time you change the code).
Related to having the latest Django version, you just have to specify it in your requirements.txt file.