I am new to deploying to DigitalOcean. I recently created a Droplet, into which I will put my Django API to serve my project. I saw that the droplet I rent has 25 GB of memory, and for the actual size of my project, it could be a good fit for the database. I run my entire code with Docker Compose, which handles Django and PostgreSQL. I was wondering if it is a good idea to store all the data in the PostgreSQL created by Docker Image, and stored in the local memory of the droplet, or do I need to set up an external Database in Digital Ocean?
Technically I think that could work, but I don't know if it is safe to do, if I will be able to scale the droplets once the size scales, or if I will be able to migrate all the data into an external database, on digital ocean, a managed PostgreSQL all my data without too many problems.
And if I save it all locally, what about redundancy? Or can I clone my database on a daily basis in some way?
If someone knows about that and can help me, please I don't know where to start
Very good question!
So your approach is perfectly fine, even though I wouldn't use docker compose for your production environment. The reason being that you will definitely need to rebuild and redeploy your application container multiple times and it's better to decouple it from your database container.
In order to do that, just run something similar to this in your server:
docker run -d \
--name my_db \
-p 5432:5432 \
-e POSTGRES_PASSWORD='mypassword' \
-e POSTGRES_USER="postgres" \
-v /var/run/postgresql:/var/run/postgresql \
--network=mynetwork \
--restart=always \
postgres
Make sure to define a custom network and to register in it your db
container as well as your web container and any other standalone
service that you need (for example Redis or Celery)
The advantage of the Digital Ocean databases is that they are fully managed. Meaning that you can use their infrastructure to handle backups, and that it would run on a separate server that you can scale up and down based on need.
If I were you I would start with a PostgreSQL container in your server, making sure to define a volume so that even if your server instance crashes the data will be stored on file and will be there waiting for you when you restart the server and db.
Then as your needs evolve you can migrate your data on a more reliable and flexible system. This can be a Digital Ocean database or AWS RDS or a number of others. DB migrations can be a bit scary but nowadays there are plenty of tools to help you with those too.
Related
I have recently deployed my Django application on Elastic Beanstalk.
I have everything working now, but I'm curious what the best way to develop locally is.
Currently, after I make a change locally, I have to commit the changes via git and then run eb deploy. This process takes 1-3 minutes which is not ideal for making changes.
The Django application will not run on my local machine, as it is configured for EB.
You are right, having to deploy remotely during development isn't best practice.
Have you considered Docker?
To run a typical Django app locally using Docker, you'll need to dockerize:
The Django app
Database eg Postgres
Worker eg Celery
Local mailer eg Mailhog
Not a very long list.
Obviously you'll add or remove from that list depending on how complex or simple your app is.
we are using superset docker compose as per the installation guide, However, once the docker is crashed or machine is restarted, we need to re-configure dashboard.. not sure how to configure with persistence volume that it can continue work after restart.
By default superset docker-compose has PostgreSQL db service and data's are mounted as volume, which is holding all of data related to superset. So instead of using default db service use extranal database instance in that case even docker-compose restart you'll not loose your data.
I am trying to configure a CI job on Bamboo for a Django app, the tests to be run rely on a database (postgres 9.5). It seems that a prudent way to go about is it run the whole test in a docker container, as I do not control the agent environment so I cannot install Postgres there.
Most guides I found recommend running postgres and django in two separate containers and using docker-compose to easily manage them. In this scenario each docker image runs just one service, started with CMD. In Bamboo I cannot use docker-compose however, I need to use just one image, so I am trying to get Postgres and Django to run nicely together in one container but with little success so far.
My problem is that I see no easy way to start Postgres as a service inside docker but NOT as a docker CMD command, official postgre image uses an entrypoint.sh approach, also described in the official docker docs
But it is not clear to me how to implement that. I would appreciate your help!
Well, basically you would start postgres as a background process in the docker-entrypoint shell script that does otherwise start your django application.
The only trick here is that you need to put a 'trap' command in it so that you can send a shutdown/kill to the background process when your master process stops.
Although I have done that a thousand times, I know that it is a good source for programming errors. In general I do just use my docker-systemctl-replacement which takes care of running multiple applications as services, just as if the container is a virtual machine hosting multiple applications.
Your only other option is to add in a startup script in your Dockerfile, or kick it off as part of your docker run ... commands. We don't generally use the "Docker" tasks, as I find them ... distasteful (also why I usually just fall back to running a "Script" task, and directly calling docker run in that script task)
Anyway, you'd have to have your Docker container execute a script that would:
Start up Postgres (like a sudo systemctl start postgresql)
Execute your tests.
Your Dockerfile will have to install Postgresql and do some minor setup work I imagine (like create relevant users and databases with the proper owner). Since we're all good citizens, we remember to never run your containers as root, right?
Note - you can always hack around getting two containers to talk to each other without using docker-compose. It's a bit less convenient, but you could do something like:
docker run --detach --cidfile=db_cidfile --name ci_db postgresql_image
...
docker run --link ci_db testing_image
Make sure that you EXPOSE the right ports on the postgresql image to the testing_image container.
EDIT: I'm looking more at my specific case - we just install Postgresql into a base CentOS host rather than use the postgresql default image (using yum install http://yum.postgresql.org/..../pgdg-centos...rpm and then just install postgresql-server and postgresql-contrib packages from there). There is a CMD [ "/usr/pgsql-ver/bin/postgres", "-D", "/var/lib/pgsql/ver/data"] in our Dockerfile, too. We don't do anything fancy with the docker container, though. NOTE: we don't use this in production at all, this is strictly for local and CI testing.
As far as my understanding goes, the Kubernetes engine is meant for deploying applications that can be load balanced, for example, having an application which unhashes a string. If pod-a is on high load, it would be offloaded to pod-b. Correct me if I am wrong here, since if this is false, my following question will not make sense.
After exploring it for few hours I can't seem to figure out how to deploy a C++ application to the Kubernetes cluster. How would I do so?
What I tried:
I tried to follow the guide: Interactive Tutorial - Deploying an App, however, I couldn't understand how I would get my C++ app as an image that could be deployed.
What the C++ application is:
At the moment it proxies TCP traffic to another HOST designated by clients' HOSTNAME. It is pretty much a reverse proxy, however, this is NOT an HTTP application.
Is Kubernetes the right choice?
-
Kubernetes is really useful to loadbalance workloads, to provide high availability in case of failure to speed up test processes, and to increase safety during production rollout through different strategies and increase security through segregation.
However, not all the kind of workloads can take advantage of all the features introduced by Kubernetes.
For example, if your application is built in such a way it needs a stable amount of RAM and CPU, the code as well is really stable and you need merely one replica, then maybe Kubernetes and containers are not the best choice (even if you can perfectly use them), and you should rather implement everything on a big monolithic server/virtual machine.
But if you need to deploy it on a different cloud provider, and it should run merely some hours every day, maybe then it can make use as well of those features. If you are willing to add a layer, make sure that you need the features it introduces, otherwise it would be merely an overhead.
Note that Kubernetes it is not capable of splitting your workload alone. Therefore I do not know if what you mean by "If pod-a is on high load, it would be offloaded to pod-b" likely yes it is possible, but you have to instruct it to do so.
Kubernetes takes care to run your POD, making sure there have been scheduled on nodes where enough memory and CPU is available according to your specification, you can set up autoscaling procedures as well to support high workload periods or to scale even the cluster itself. Your application should have been created in such a way to support a divide and conquer pattern, otherwise you will likely have three nodes, one pod running on one node, two idle and a overhead that you could have avoided.
If your C++ application POD unhashes a strings and a single request could consume all the resources of a node Kubernetes will not "spit" the initial workload and will not create for you more PODS scheduling them across the cluster! Of course you can achieve something similar, but it will not come for free and you will likely need to modify your C++ code.
For sure you can take advantage of Kubernetes, running your application on it is pretty easy, but maybe you will have to modify something in the architecture to fully make advantage of those features.
Deploy the C++ application
The process to deploy your application in Kubernetes is pretty standard. Develop it locally, create a Docker image with all the libraries and components you need, test it locally, push it to the registry, and create the deployment in Kubernetes.
Let's say that you have all the resources needed to run your application and your executable file in a local folder. Create the Docker file.
Example, modify to implement your application, I have reported it as an example to show syntax:
# Download base image, Ubuntu 16.04 (Xenial Xerus)
FROM ubuntu:16.04
# Update software repository
RUN apt-get update
# Install nginx, php-fpm and supervisord from the Ubuntu repository
RUN apt-get install -y nginx php7.0-fpm supervisor
# Define the environment variable
ENV nginx_vhost /etc/nginx/sites-available/default
[...]
# Enable php-fpm on the nginx virtualhost configuration
COPY default ${nginx_vhost}
[...]
RUN chown -R www-data:www-data /var/www/html
# Volume configuration
VOLUME ["/etc/nginx/sites-enabled", "/etc/nginx/certs", "/etc/nginx/conf.d", "/var/log/nginx", "/var/www/html"]
# Configure services and port
COPY start.sh /start.sh
CMD ["./start.sh"]
EXPOSE 80 443
Built it running:
export PROJECT_ID="$(gcloud config get-value project -q)"
docker build -t gcr.io/${PROJECT_ID}/hello-app:v1 .
gcloud docker -- push gcr.io/${PROJECT_ID}/hello-app:v1
kubectl run hello --image=gcr.io/${PROJECT_ID}/hello-app:v1 --port [port number if needed]
More information is here.
Currently I have 3 linode servers:
1: Cache server (Ubuntu, varnish)
2: App server (Ubuntu, nginx, rabbitmq-server, python, php5-fpm, memcached)
3: DB server (Ubuntu, postgresql + pg_bouncer)
On my app-server I have multiple sites (topdomains). Each site is inside a virtualenviroment created with virtualenvwrapper. Some sites are big with a lot of traffic, and some site are small with little traffic.
A typical site consist of python (django), celery (beat, flower) and gunicorn.
My current development pattern now is working inside a staging environment on the app-server and committing changes to git. Then change environment to the production environment and doing a git pull, and a ./manage.py migrate and restarting the process doing sudo supervisorctl restart sitename:, but this takes time! There must be a simpler method!
Therefore it seems like docker could help simplify everything, but I can't decide the best approach for how I could manage all my sites and containers inside each site.
I have looked at http://panamax.io and https://github.com/progrium/dokku, but not sure if one of them fit my needs.
Ideally I would run a development version of each site on my local machine (emulating cache-server, app-server and db-server), do code changes there and test them. When I would see the changes worked, I would execute a command that would do all the heavy lifting and send the changes to the linode servers (I would think mostly the app-server), do all the migration and restarting the project on the server.
Could anyone point me in the right direction as how to achieve this?
I have faced the same problem. I don't claim this is the best possible answer and am interested to see what others have come up with.
There doesn't seem to be any really turnkey solution on Docker yet.
It's also been frustrating that most of the 'Django+Docker' tutorials just focus on a single Django site, so they bundle up the webserver and everything in the same Docker container. I think if you have multiple sites on a server you want them to share a single webserver, but this quickly gets more complicated than presented in the tutorials, which are no longer much help.
Roughly what I came up with is this:
using Fig to manage containers and complicated Docker config that would be tedious to type as commandline options all the time
sites are Django, on uWSGI+Nginx (no reason you couldn't have others though)
I have a git repo per site, plus a git repo for the 'server'
separate containers for db, nginx and each site
each site container has it's own uWSGI instance... I do some config switching so I can either bring up a 'dev' container with uWSGI as acting standalone web server, or a 'live' container where the uWSGI socket is exposed to the main Nginx container, which then takes over as front-side web server.
I'm not sure yet how useful the 'dev' uWSGI servers are, I might switch to just running Django dev server and sharing my local code dir as a volume in the container, so I can edit and get live reloading.
In the 'server' repo I have all the shared Dockerfiles, for Nginx server, base uWSGI app etc.
In the 'server' repo I have made Fabric tasks to do my deployment (checkout server and site repos on the server, build docker images, run fig up etc).
Speaking of deployment, frankly I'm not much keen on the Docker Registry idea. This seems to mean you have to upload hundreds of megabytes of image file to the registry server each time you want to deploy a new container version. This sucks if you are on a limited bandwidth connection at the time and seems very inefficient.
That's why so far I decided to deploy new code via Git and build the new images on the server. I don't use a Docker Registry at all (apart from the public one for a base Ubuntu image). This seems to go against the grain of Docker practice a bit so I'm curious for feedback.
I'd strongly recommend getting stuck in and building your own solution first. If you have to spend time learning a solution like Dokku, Panamax etc that may or may not work for you (I don't think any of them are really ready yet) you may as well spend that time learning Docker directly... it will then be easier to evaluate solutions further down the line.
I tried to get on with Dokku early on in my search but had to abandon because it's not compatible with boot2docker... which means on OS X you're faced with the 'fun' of setting up your own VirtualBox vm to run the Docker daemon. It didn't seem worth the hassle of this when I wasn't certain I wanted to be stuck with how Dokku works at the end of the day.