I have a Django app that I put inside a Docker container for deployment. I have some initial data that I want to load into the database via the dumpdata and loaddata commands. The initial data lives on my local hard drive. I chose a very naive approach and simply copied the data_backup.json file to the server via scp.
Now I want to load the data_backup.json file (the file sits on the server, not in the Docker container) by executing:
sudo docker-compose exec restapi python manage.py loaddata --settings=rest.settings.production ./data_backup_20191004.json
But Django only searches the internal directories for fixtures.
I am looking for a way to populate the database with the data_backup.json file inside the docker container. Can someone help?
Ultimately, I am looking for a way to dump data directly to S3 and load it from there if needed (for db backups). If you have any tips on how to achieve that, this would also be super helpful - I don't seem to be able to find material on that.
Just in case someone has this question in the future: it is possible to run loaddata from stdin, so you can take the backup file and pipe it into the database (within the container) with a command like this:
cat <fixture_name.json> | sudo docker exec -i <container_name_or_id> python manage.py loaddata --format=json -
The last dash tells Django that you want to load the data from stdin.
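If you would rather script this than type the pipe by hand, here is a minimal Python sketch of the same idea. The function names are my own and the container name and settings module are whatever your setup uses; it only wraps the docker exec command shown above and feeds the fixture file to it on stdin.

```python
import subprocess
from pathlib import Path


def build_loaddata_command(container, settings=None):
    """Build the `docker exec` argument list for loading a JSON fixture from stdin."""
    command = [
        "docker", "exec", "-i", container,  # -i keeps stdin open for the pipe
        "python", "manage.py", "loaddata", "--format=json",
    ]
    if settings:
        command.append(f"--settings={settings}")
    command.append("-")  # "-" tells loaddata to read the fixture from stdin
    return command


def load_fixture(fixture_path, container, settings=None):
    """Stream a fixture file on the host into loaddata inside the container."""
    with Path(fixture_path).open("rb") as fixture:
        subprocess.run(
            build_loaddata_command(container, settings),
            stdin=fixture,
            check=True,  # raise CalledProcessError if loaddata fails
        )
```

Calling load_fixture("data_backup_20191004.json", "<container_name_or_id>", "rest.settings.production") would then do the same as the one-liner, assuming docker is on the PATH of the user running the script.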
You could copy the file into the docker container before running the command with docker cp:
docker cp ./data_backup_20191004.json <container_id>:django_dir/data_backup_20191004.json
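Or, if the file is stored on S3, you could run curl inside the Docker container and load it from there. Use curl's -o option so the file is written inside the container; a shell redirect (>) would be handled by the host shell and write the file on the host instead:
sudo docker-compose exec restapi curl -o data.json http://s3.example.com/path/to/data.json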
sudo docker-compose exec restapi python manage.py loaddata --settings=rest.settings.production ./data.json
"I am looking for a way to populate the database with the data_backup.json file inside the docker container. Can someone help?"
See Xen_mar's answer, which I think is perfect.
"Ultimately, I am looking for a way to dump data directly to S3 and load it from there if needed (for db backups). If you have any tips on how to achieve that, this would also be super helpful - I don't seem to be able to find material on that."
This seems to be a completely different question. I would consider using a Django package like Django Smuggler, which allows you to load and dump fixtures from the admin. I assume it may be possible to configure Django Smuggler's upload directory to be handled by django-storages, but I'm not sure that is possible, so if you try it, let me know.
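If you want something scriptable instead of an admin package, one possible approach (my assumption, not something the packages above provide) is to run dumpdata in the container, upload the result with boto3, and reverse the process with the stdin trick from the other answer. The bucket, key prefix, function names and /tmp paths below are all placeholders, and boto3 plus AWS credentials need to be available on the host that runs the script.

```python
import subprocess
from datetime import datetime, timezone


def backup_key(prefix="db-backups", now=None):
    """Return a timestamped object key, e.g. db-backups/dump_20191004T120000Z.json."""
    now = now or datetime.now(timezone.utc)
    return f"{prefix}/dump_{now:%Y%m%dT%H%M%SZ}.json"


def dump_to_s3(bucket, container, local_path="/tmp/dump.json"):
    """Run dumpdata in the container, capture its stdout, and upload it to S3."""
    import boto3  # third-party; imported lazily so backup_key works without it

    with open(local_path, "wb") as dump:
        subprocess.run(
            ["docker", "exec", container, "python", "manage.py", "dumpdata"],
            stdout=dump,
            check=True,
        )
    key = backup_key()
    boto3.client("s3").upload_file(local_path, bucket, key)
    return key


def load_from_s3(bucket, key, container, local_path="/tmp/restore.json"):
    """Download a dump from S3 and pipe it into loaddata via stdin."""
    import boto3

    boto3.client("s3").download_file(bucket, key, local_path)
    with open(local_path, "rb") as dump:
        subprocess.run(
            ["docker", "exec", "-i", container,
             "python", "manage.py", "loaddata", "--format=json", "-"],
            stdin=dump,
            check=True,
        )
```

The timestamp in the key keeps every dump as a separate object, so an older backup is never overwritten by a newer one.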
I am developing a Django Wagtail application on my local machine, connected to a local Postgres server.
I have a test server and a production server.
However, when I develop locally and try to upload my changes, there is always some issue with makemigrations and migrate, e.g. a KeyError.
What are the best practices to make sure I do not run into these issues? What files do I need to port across?
I'll tell you what I do and what most of the companies where I worked as a Django developer did; in my experience it worked pretty well.
First, containerize your application. This will make your life much easier, remove external influences on your code, and give you an easy way to reproduce your environment.
Your Dockerfile should start from a Python image and do basically three things:
Install your requirements
Run the python manage.py migrate --noinput command
Run an HTTP server such as gunicorn with gunicorn -c /gunicorn.py wsgi:application
You will run makemigrations on your local machine and make sure that everything works before committing the migrations to the repo.
In your gunicorn.py you put the settings for running the app, such as the number of CPUs, the bind address and port, and the folder your app lives in, something like:
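import multiprocessing
# Chdir to specified directory before apps loading.
# https://docs.gunicorn.org/en/stable/settings.html#chdir
chdir = '/app/'
# Bind to port 8000 on all IPv4 interfaces so the app is reachable from outside the container.
# https://docs.gunicorn.org/en/stable/settings.html#bind
bind = '0.0.0.0:8000'
# Number of worker processes, derived from the CPU count (a common starting point, tune as needed).
workers = multiprocessing.cpu_count() * 2 + 1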
Second, containerize your other services, for example the Postgres database, Redis (for caching), and a connection pooler for the database, depending on the size of your application.
It is highly recommended that you have a step in the pipeline that runs tests; they need to run before everything else, maybe just after linting.
OK, what now? Now you need a way to deploy all of this. The best option for this scenario is to push your image to the GitHub container registry, and you can add a tag to it, for example:
IMAGE_ID=ghcr.io/${{ github.repository_owner }}/$IMAGE_NAME
# Change all uppercase to lowercase
IMAGE_ID=$(echo $IMAGE_ID | tr '[A-Z]' '[a-z]')
docker tag $IMAGE_NAME $IMAGE_ID:staging
docker push $IMAGE_ID:staging
This can be added to a GitHub Action, in the build step for example.
Once your new code is in a new image on GitHub, you just need to update the running one. This can be done by creating a script on the server and running that script from a GitHub Action, something like:
docker pull ghcr.io/${{ github.repository_owner }}/$IMAGE_NAME:staging
echo 'Restarting Application...'
docker stop {YOUR_CONTAINER} && docker-compose up -d
sudo systemctl restart nginx
echo 'Cleaning old images'
sudo docker system prune -af
As you can see, I create the image with a staging tag. You can create a rule in GitHub Actions to trigger that action when you create a new release, for example, and create another action that is triggered on every new commit and builds/deploys with a dev tag.
For the migration problem, the first thing is: when your application goes live, squash all migrations into the first one (to achieve this you can drop the database and all the migration files, then recreate the database and run makemigrations again), so you have a clean migration history on the server. Never create unnecessary relations between tables, prefer cached properties over adding new columns, use UUIDs for unique IDs, and try not to make breaking changes to the database. It's hard, but if you plan the database beforehand it is not so difficult.
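To picture the cached-property and UUID advice, here is a small sketch. Order is a hypothetical plain-Python stand-in, not a real Django model; on an actual model the ID would typically be declared as models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False), and the derived value would sit on the model class in the same way.

```python
import uuid
from dataclasses import dataclass, field
from functools import cached_property


@dataclass
class Order:
    """Hypothetical stand-in for a model; not an actual Django model."""
    unit_price: float
    quantity: int
    # UUID identifier generated in Python instead of a sequential integer.
    id: uuid.UUID = field(default_factory=uuid.uuid4)

    @cached_property
    def total(self):
        # Derived on first access and then cached on the instance,
        # so no extra "total" column (and no migration) is needed.
        return self.unit_price * self.quantity
```

Because total is computed from existing fields, changing how it is calculated is a code change only and never touches the schema.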
Hit me up if you have any questions. A lot of what I said can be done on other platforms such as GitLab, Travis, or CircleCI, but I used GitHub Actions in the example because I think it is simpler to picture.
EDIT:
I forgot to tell you to have a cron job on your server that backs up your databases. The migrate command will apply the changes only after verification, but if something else breaks the database, this can save your life.
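A rough sketch of what such a cron job could run, assuming a Postgres database and pg_dump installed on the server. The function names, the backup_*.dump naming scheme and the default of keeping seven dumps are my own choices for illustration, not a standard.

```python
import subprocess
from datetime import datetime
from pathlib import Path


def backup_path(directory, now=None):
    """Return a timestamped dump path inside `directory`."""
    now = now or datetime.now()
    return Path(directory) / f"backup_{now:%Y%m%d_%H%M%S}.dump"


def prune_old_backups(directory, keep=7):
    """Delete all but the `keep` newest dumps and return the removed paths."""
    dumps = sorted(Path(directory).glob("backup_*.dump"))  # names sort by date
    stale = dumps[:-keep] if keep > 0 else dumps
    for path in stale:
        path.unlink()
    return stale


def run_backup(database_url, directory, keep=7):
    """Dump the database with pg_dump, then prune older dumps."""
    target = backup_path(directory)
    subprocess.run(
        ["pg_dump", "--format=custom", f"--file={target}", f"--dbname={database_url}"],
        check=True,  # fail loudly so cron can report a broken backup
    )
    prune_old_backups(directory, keep)
    return target
```

A crontab entry would then just call a small script that runs run_backup with your connection string and backup folder once a night.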
So I use Python+Django (but it does not really matter for this question)
When I write my code I simply run
./manage.py runserver
which does the webserver, static files, automatic reload, etc.
and to put it in production I use a series of commands like
./manage.py collectstatic
./manage.py migrate
uwsgi --http 127.0.0.1:8000 -w wsgi --processes=4
I also have a few other services like postgres and redis (which are common to both production and dev).
So I'm trying to adopt Docker (+ Compose) here, and I cannot understand how to split prod/dev with it.
Basically, in docker-compose.yml you define your services and images, but in my case the image should run one CMD in production and another in dev.
What are the best practices to achieve that?
You should create additional compose files like docker-compose-dev.yml or docker-compose-pro.yml and override parts of the original docker-compose.yml configuration using the -f option:
docker-compose -f docker-compose.yml -f docker-compose-dev.yml up -d
Sometimes I also use a different Dockerfile for different environments and specify the dockerfile parameter in the build section of docker-compose-pro.yml, but I don't recommend it because you will end up with duplicated Dockerfiles.
Update
Docker has introduced the multi-stage builds feature https://docs.docker.com/develop/develop-images/multistage-build/#use-multi-stage-builds which allows you to create a single Dockerfile for different environments.
Usually having a different production and dev starting workflow is a bad idea. You should always try to keep both dev and prod environments very similar, even in the way you launch your applications. You should always externalize the configuration that is different between the different environments.
Having a different startup sequence may be acceptable; however, having multiple Docker images (or Dockerfiles), one per environment, is a very bad idea. Docker images should be immutable and portable.
However, you might have some constraints. Docker-compose allows you to override the command that is specified in the image: the command property overrides the image's default command. I would recommend that you keep the image production ready, i.e. use something like CMD ./manage.py collectstatic && ./manage.py migrate && uwsgi --http 127.0.0.1:8000 -w wsgi --processes=4 in the Dockerfile.
In the compose file just override the CMD by specifying:
command: ./manage.py runserver
Having multiple compose files is usually not a big issue. You can keep your compose files clean and manageable by using some nice compose file features such as extends, where one compose file can extend another.
I am building a Python+Django development environment using docker. I defined Dockerfiles and services in docker-compose.yml for the web server (nginx) and database (postgres) containers, and a container that will run our app using uwsgi. Since this is a dev environment, I am mounting the app code from the host system, so I can easily edit it in my IDE.
The question I have is where/how to run migrate command.
In case you don't know Django, migrate command creates the database structure and later changes it as needed by the project. I have seen people run migrate as part of the compose command directive command: python manage.py migrate && uwsgi --ini app.ini, but I do not want migrations to run on every container restart. I only want it to run once when I create the containers and never run again unless I rebuild.
Where/how would I do that?
Edit: there is now an open issue with the compose team. With any luck, one time command containers will get supported by compose. https://github.com/docker/compose/issues/1896
You cannot use RUN because as you mentioned in the comments your source is mounted during running of the container.
You cannot use CMD either since you don't want it to run every time you restart the container.
I recommend using docker exec manually after running the container. I do not think there is a way to automate this inside a dockerfile or docker-compose because of the two reasons I gave above.
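If the manual step bothers you, one workaround people sometimes use is a small wrapper that records that the migration already ran. To be clear, this is my own sketch and an assumption on top of the answer above, not a Docker or Compose feature. The marker path is hypothetical; kept inside the container's own filesystem (not in a mounted volume) it survives a restart but disappears when the container is recreated.

```python
import subprocess
import sys
from pathlib import Path

# Hypothetical default: the migrate call from the question, run with the current interpreter.
MIGRATE = [sys.executable, "manage.py", "migrate", "--noinput"]


def run_once(marker, command=MIGRATE):
    """Run `command` unless `marker` exists; create the marker after success.

    Returns True if the command ran, False if it was skipped.
    """
    marker = Path(marker)
    if marker.exists():
        return False
    subprocess.run(command, check=True)  # raises if the command fails
    marker.touch()  # only reached when the command succeeded
    return True
```

The container's start command would call run_once with a path of your choosing and then start uwsgi as before; a failed migration leaves no marker, so it is retried on the next start.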
It sounds like what you need is a tool for managing project tasks. dobi is a tool designed to handle these tasks (disclaimer: I am the author of this tool).
You can see an example of how to run a migration here: https://github.com/dnephin/dobi/tree/master/examples/init-db-with-rails. The example uses rails, but it's basically the same idea as django.
You could set up a task called migrate which would run the command in a container and write the data to a volume. Then, when you start your docker-compose containers, use that volume as the source for your database service.
https://github.com/docker/compose/issues/1896 is finally resolved now by the new service profiles introduced with docker-compose 1.28.0. With profiles you can mark services to be only started in specific profiles:
services:
  nginx:
    # ...
  postgres:
    # ...
  uwsgi:
    # ...
  migrations:
    profiles: ["cli-only"] # profile name chosen freely
    # ...
docker-compose up # start only your app services, no migrations
docker-compose run migrations # run migrations on-demand
docker exec -it container-name bash
Then you will be inside the container and you can run any command you normally do when you develop without using docker.
I'm trying to find a workflow with Docker and Django. Currently, I'm using the basic configuration from the docker documentation.
I'd like to use manage.py startapp directly from the container to start a new app using:
docker-compose run web ./manage.py startapp myapp
But all the files created in the volume are owned by the root user and not by me, so I can't edit them from the host.
My idea is to avoid installing all the requirements on my host machine, but maybe I should not create the app from the container?
One possible solution is to create a user in the container with the same UID/GID as my user on the host machine, but that won't work if I use another account on the host machine...
Any suggestion?
What worked best for me was avoiding (or minimizing) file creation inside the containers.
My Dockerfile would just copy the requirements.txt and install them;
and the container would access the app files through a mounted volume.
I pass the env var PYTHONDONTWRITEBYTECODE=1 to the containers, so python does not create *.pyc/*.pyo files.
The few times I cannot avoid it (like, ./manage.py makemigrations), I run chown afterwards.
It's not ideal, but as this happens rarely for my case, I don't bother.
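For the chown step, a small helper along these lines does the job. It is only a sketch: chown_tree is a name I made up, and to take files over from root it has to be run with enough privileges (e.g. via sudo) on the host.

```python
import os


def chown_tree(root, uid, gid):
    """Recursively set the owner of `root` and everything below it.

    Returns the number of paths whose ownership was set.
    """
    os.chown(root, uid, gid)
    changed = 1
    for directory, subdirs, files in os.walk(root):
        for name in subdirs + files:
            # Do not follow symlinks, so targets outside the tree stay untouched.
            os.chown(os.path.join(directory, name), uid, gid, follow_symlinks=False)
            changed += 1
    return changed
```

After something like ./manage.py makemigrations has run in the container, calling chown_tree on the app directory with your own UID and GID hands the new files back to your host user.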
I successfully deployed my first Django/Heroku app, and now I only need to transfer my database. It was previously in a MySQL db on a Win7 PC. I looked around for ways to import CSV into the Heroku db but didn't find anything. They suggest using a Ruby gem to do it, or using taps and this command:
heroku db:push mysql://root:mypass@localhost/mydb
My database is pretty small, only around 1000 columns and 2 tables, so it would be pretty simple to import the CSV files, but I can't find out how to do it. Does anyone know?
Here are a few ideas that should get you going very quickly:
First, a quick and dirty approach:
Install Postgres on your local machine.
Import the CSV into a local Postgres database.
Push that database to Heroku.
Alternatively, a slightly less quick and still a little dirty approach:
Use the CSV reader Python module.
Create a task in your Django app that loads the CSV, iterates over each CSV row, creates a new corresponding model, and saves it in the server's database if the model is valid.
Run the task via the heroku CLI.
Once you're finished and you've verified the server database, you can remove the task and CSV from the repo.
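The core of that second approach could look roughly like this. It is a sketch with made-up names: save stands in for whatever creates and saves your model (for example a wrapper around MyModel(**row).save(), where MyModel is hypothetical), and the validity check here is just "required columns are not empty".

```python
import csv


def load_rows(csv_file, save, required=("name",)):
    """Read rows from `csv_file`, pass valid ones to `save`, and count the outcome.

    `csv_file` is an open text file (or any iterable of lines) with a header row.
    Returns a (saved, skipped) tuple.
    """
    saved, skipped = 0, 0
    for row in csv.DictReader(csv_file):
        # Minimal validity check: every required column must be non-empty.
        if all((row.get(column) or "").strip() for column in required):
            save(row)
            saved += 1
        else:
            skipped += 1
    return saved, skipped
```

Wrapped in a management command, this can be run once through the heroku CLI and then deleted together with the CSV.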
Otherwise, taps may suit you well too!
You can use psql's \copy command to load a CSV into Postgres from your local filesystem. You should be able to use this with your Heroku DB with something similar to:
PGPASSWORD=passwordhere psql -h hostname -U username dbname -c "\copy ..."
You can import a local CSV file as a table into your Heroku Postgres database with the command below:
PGPASSWORD=<your password> psql -h <your heroku host> -U <heroku user> <heroku postgres database name> -c "\copy bank (ifsc, bank_id, branch, address, city, district, state, bank_name) FROM '<local file path location>' CSV HEADER DELIMITER E'\t';"
Please alter the DELIMITER value according to your needs. The E before the delimiter value marks it as an escape string, so that \t is interpreted as a tab character; otherwise the command will throw an error.