I have five nodes behind a load balancer and I'm trying to determine the optimal configuration for a Django-based site.
Each node has access to Postgres, mod_wsgi, Apache, Lighttpd, memcached, pgpool2 (for database replication) and GlusterFS (for media file replication), and is running Ubuntu 8.04 LTS.
So far, the setup is four nodes running Apache/Lighttpd/memcached/pgpool2, all reading/writing to one master node that is running the "master" PostgreSQL. Each of the four web nodes also runs Postgres for replication from the master via pgpool.
So, my question is: How would you configure this setup and/or what would you change so that there is no single point of failure, if possible?
This sounds like a good setup, although it's hard to know exactly what your setup looks like in terms of memory, etc., and what traffic you expect to handle.
You might want to consider using Django's multi-database support and having a read-only Postgres instance (use database routing to direct reads to the read-only instance for certain apps). This can offer some quite nice speed improvements, and at the moment you could have a potential bottleneck at the single Postgres instance, depending on how heavy your database work is.
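For example, a database router along these lines could send reads to the replica and writes to the primary. This is only a sketch: the "replica" alias, the host names and the router's module path are placeholders, not something from your setup.

```python
# settings.py (sketch) - a primary alias and a read-only replica alias;
# host names and the router's module path are placeholders.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": "mysite",
        "HOST": "db-master.internal",
    },
    "replica": {
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": "mysite",
        "HOST": "db-replica.internal",
    },
}
DATABASE_ROUTERS = ["mysite.routers.ReadReplicaRouter"]


# mysite/routers.py
class ReadReplicaRouter:
    """Send reads to the replica and writes to the primary."""

    def db_for_read(self, model, **hints):
        return "replica"

    def db_for_write(self, model, **hints):
        return "default"

    def allow_relation(self, obj1, obj2, **hints):
        # Both aliases point at the same data set, so relations are fine.
        return True
```

You can also restrict this to certain apps inside db_for_read if you only want some parts of the site hitting the replica.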
As @ashwoods suggested, it might be worth looking into gunicorn and nginx. I guess at the moment you use Apache only to run mod_wsgi, and Lighttpd for the static files? With nginx, you can use it with a number of WSGI servers, and it's great for static files too.
The setup looks pretty good to me. I would consider using gunicorn/uWSGI + nginx. I would also benchmark using pgbouncer, although pgpool2 offers more out of the box.
I have a basic Django blog application that I want to dockerize. One more thing: I am writing my question here because there are live people to answer.
You could use cookiecutter-django for dockerizing your Django application. You can read the docs here: https://cookiecutter-django.readthedocs.io/en/latest/deployment-with-docker.html
You need to run pip freeze first so that you know which packages your current Django application depends on, and use them as your requirements.txt (pip freeze > requirements.txt).
Benefits of using Django on Docker:
Your code runs on any operating system that supports Docker.
You save time by not needing to configure system dependencies on your host.
Your local and production environments can be exactly the same, eliminating errors that only happen in production.
You should start by reading the Docker docs in order to understand the what and the why: https://docs.docker.com/
Long story short: containers (plural) enable you to start every service (Django, database, possible front-end, servers, etc.) in a separate container and, furthermore, to start them up on any OS of your choice.
Those containers can communicate with each other through a separate Docker network on the host.
You will need a Dockerfile to set up each service, and possibly docker-compose (if you run multiple containers) to manage all the running containers.
Here's an example Docker setup for Django: https://semaphoreci.com/community/tutorials/dockerizing-a-python-django-web-application
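As a small sketch of how the Django container typically talks to the database container: the settings read the connection details from environment variables. The variable names and the "db" default host below are assumptions matching a typical docker-compose file where the Postgres service is called "db".

```python
# settings.py (sketch) - read connection details from environment variables
# so the same image works in any container setup. Variable names and the
# "db" default host are assumptions, not something from your project.
import os

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": os.environ.get("POSTGRES_DB", "blog"),
        "USER": os.environ.get("POSTGRES_USER", "blog"),
        "PASSWORD": os.environ.get("POSTGRES_PASSWORD", ""),
        # Containers on the same docker-compose network resolve each
        # other by service name, hence "db" as the default host.
        "HOST": os.environ.get("POSTGRES_HOST", "db"),
        "PORT": os.environ.get("POSTGRES_PORT", "5432"),
    }
}
```

The values themselves then live in your compose file or an env file, not in the code.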
I'm building a web API by watching the YouTube video below, and up until the AWS S3 bucket setup I understand everything fine. But he first deploys everything locally, then after making sure everything works he transfers all static files to AWS, and for the DB he switches from SQLite3 to Postgres.
django portfolio
I still don't understand this part: why do we need to put our static files on AWS and create a PostgreSQL database even though there is a default SQLite3 database from Django? I'm thinking that if I'm the only admin, just connecting my GitHub repo from Heroku should be enough, and any time I change something in the API I just need to push those changes to the GitHub master branch and that should be it.
Why do we need to use AWS to set up a static file location and set up an RDS (relational database) and do these things from the beginning? I'm still not getting it!
Can anybody help explain this?
Thanks
Databases
There are several reasons a video guide would encourage you to switch from SQLite to a database server such as MySQL or PostgreSQL:
SQLite is great but doesn't scale well if you're expecting a lot of traffic
SQLite doesn't work if you want to distribute your app across multiple servers. Going back to Heroku, if you serve your app with multiple Dynos, you'll have a problem because each Dyno will use a distinct SQLite database. If you edit something through the admin, it will happen in one of these databases, at random, leading to inconsistencies
Some Django features aren't available on SQLite
SQLite is the default database in Django because it works out of the box, and is extremely fast and easy to use in local/development environments for prototyping.
However, it is usually not suited for production websites. Additionally, while it can be tempting to store your sqlite.db file along with your code, for instance in a git repository, it is considered a bad practice because your database can contain sensitive data (such as passwords, usernames, emails, etc.). Hence, a strict separation between your code and data is a good practice.
Another way to put it is that your code and your data have different lifecycles. You want to be able to edit data in your database without redeploying your code, and update your code without touching your database.
Even if you can remove public access to some files through GitHub, this is not a good practice, because when you work in a team with multiple developers, developers may have access to the code but not the production data, since it's usually sensitive. If you work with 5 people and each one of them has a copy of your database, the risk of losing it or having it stolen is 5x higher ;)
Static files
When you work locally, Django's built-in runserver command handles the serving of static assets such as CSS, Javascript and images for you.
However, this server is not designed for production use either. It works great in development, but will start to fail very fast on a production website, which has to handle far more requests than your local version.
Because of that, you need to host these static files somewhere else, and AWS is one place where you can do that. AWS will serve those files for you, in a very efficient way. There are other options available, for instance configuring a reverse proxy with Nginx to serve the files for you, if you're using a dedicated server.
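To make the S3 option more concrete, here is a rough sketch using the django-storages package. The bucket name and region are placeholders, and in practice you would keep the real credentials in environment variables rather than in the code.

```python
# settings.py (sketch) - serve collected static files from S3 via the
# django-storages package (pip install django-storages boto3). The bucket
# name and region are placeholders; keep real credentials out of the code.
import os

INSTALLED_APPS = [
    # ... your other apps ...
    "storages",
]

AWS_STORAGE_BUCKET_NAME = "my-portfolio-static"  # placeholder bucket name
AWS_S3_REGION_NAME = "us-east-1"
AWS_ACCESS_KEY_ID = os.environ.get("AWS_ACCESS_KEY_ID")
AWS_SECRET_ACCESS_KEY = os.environ.get("AWS_SECRET_ACCESS_KEY")

# Route Django's static file storage through S3.
STATICFILES_STORAGE = "storages.backends.s3boto3.S3Boto3Storage"
STATIC_URL = "https://%s.s3.amazonaws.com/" % AWS_STORAGE_BUCKET_NAME
```

With a setup like this, running manage.py collectstatic uploads the files to the bucket instead of a local folder, and your templates keep using the static tag as before.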
As far as I can tell, the progression you describe from the video takes you from a local development environment to a more efficient and scalable production setup. That is to be expected, because it's less daunting to start with something really simple (SQLite, Django's built-in runserver) and move on to more complex and abstract topics and tools later on.
In a couple of months, I'm receiving a single (physical) Ubuntu LTS server for the purpose of a corporate Intranet only web tools site. I've been trying to figure out a framework to go with. My preference at this point would be to go with Django. I've used RoR, CF and PHP heavily in the past.
My Django concern right now is how to have both a separate '/web/' and '/dev/' environment when I'm only getting a single server. Of course this would also include needing separate 'web' and 'dev' databases (either separated by database name or by having two different database instances running on the single server).
Option 1: I know I could set up only a 'web' (production) environment on Ubuntu and then use my corporate Windows laptop to develop Django tools. I've read this works fine, except that a lot of third-party Django packages don't work on Windows. My other concern would be making code changes and then pushing them to the Ubuntu server, where I might introduce problems that didn't show up in the local Windows development environment.
Option 2: Somehow set up separate Django 'web' and 'dev' environments on the same server. I've seen a lot of different and confusing information on this. Also adding to the complication is what I assume would be the need to have two database instances running on the same server. Or, how could you have two different Django environments for 'web' and 'dev' and have them point to different databases by name, instead of needing two different database instances running?
Thanks for any advice. I'm actually having trouble relaxing and learning Django, not knowing how bad this is going to be to deal with. I could easily just put up with the pain of developing in basic PHP if this is too over-complicated. With plain PHP it's dead simple to have a '/web/' and a '/dev/' path and separate DBs just by checking the URL or file path for '/web/' or '/dev/' (and then pointing to the right DB, for example 'mytool_dev_v1' / 'mytool_web_v1').
There are multiple ways to solve this problem:
You can run 2 separate instances of Django on the same server in different virtual environments. You can configure them in multiple ways: using environment variables, or just separate 'production' and 'dev' config files and choosing which one gets used.
You can use Docker containers to serve the different Django instances. This is the best way, I think. You can configure them in the same way: with environment variables or with multiple config files for the 'dev' and 'prod' options.
If you want to serve 2 (or more) sites on the same server, you'll probably need to configure an nginx server to redirect requests to the separate containers or Django instances depending on the domain name or something else (the URL, for example).
As far as I know, there is no problem configuring a separate database for each instance. You can also run your Postgres or MySQL instance in a container, and the same goes for nginx.
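For example, the 'web' and 'dev' settings files could simply point at differently named databases on the same Postgres instance, much like your PHP setup did. This is only a sketch, reusing the placeholder names from your question:

```python
# settings_dev.py vs settings_web.py (sketch) - both environments point at
# the same local Postgres server but use differently named databases, so a
# single database instance on the box is enough. All names are placeholders.
import os

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "mytool_dev_v1",  # "mytool_web_v1" in the 'web' settings file
        "USER": "mytool",
        "PASSWORD": os.environ.get("MYTOOL_DB_PASSWORD", ""),
        "HOST": "127.0.0.1",
        "PORT": "5432",
    }
}
```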
I can't recommend developing your app on the same server where the production app is running. I'm convinced that development should happen on the developer's computer, but yeah... Windows is not the best for Django development, although it mostly works. Otherwise I can recommend dual-booting or at least VirtualBox with Ubuntu.
I'm using Docker for a project; the main focus of its usage is to make the application available even if one of the nodes (it's a 6-node cluster with Docker Swarm) is down.
The application is basically a Django app that can save images from users, among other models. I'm currently saving the images as files, but since I need to specify a volume locally for a single machine, I would like to know if it would be better to save the images in the database cluster, so they would be available even if the whole node goes down. Or is there another way?
Edit:
Note: The cluster runs locally and doesn't have internet access
The two options are to perform the file sharing via the database or via the file system.
For file system sharing, you can use something like GlusterFS, so to each container it looks like it is mounting a host-local volume, but the volume is actually shared via GlusterFS between the hosts.
To my mind, if it's your own application (i.e. you can modify it at will), saving the files in the database would be the easier approach for most developers.
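If you go the database route, one plain way to do it in Django is to keep the raw bytes in a BinaryField instead of an ImageField pointing at a node-local volume. This is only a sketch; the model, field and view names are made up.

```python
# models.py (sketch) - keep the uploaded image bytes in the database so
# they are replicated along with it; model and field names are made up.
from django.db import models


class UserImage(models.Model):
    name = models.CharField(max_length=255)
    content_type = models.CharField(max_length=100)
    data = models.BinaryField()  # raw image bytes
    uploaded_at = models.DateTimeField(auto_now_add=True)


# views.py (sketch) - serve the stored bytes back from the database.
# Saving an upload would look like:
#   UserImage.objects.create(name=f.name, content_type=f.content_type, data=f.read())
from django.http import HttpResponse


def show_image(request, pk):
    image = UserImage.objects.get(pk=pk)
    return HttpResponse(image.data, content_type=image.content_type)
```

Be aware that large binary rows make the database heavier to back up and replicate, so this trades convenience for some extra load on the cluster.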
The best solution is often to go for a hosted option (such as MongoDB Atlas). Making a database resilient and highly available is really hard, and unless you are an expert on Docker and Mongo, I would strongly recommend a hosted option.
I have never actually worked for a company which is deploying a Django App (with a large user base), and am curious about what is the best way to do this.
Right now I am hosting a Django App on EC2. The code for the app is sitting in my github account. I have nginx serving static content, and behind it a single apache server running django + mod_wsgi.
I am trying to figure out what the best practice is for "continuous deployment". Right now, after I have added additional functionality I do the following on EC2:
1) git reset HEAD --hard
2) git pull
3) restart apache
4) restart nginx
I have custom logic in my settings.py file so that if I am running on EC2, debug gets set to False, and my databases switch from sqlite3 (development) to mysql (production).
This seems to be working for me now, but I am wondering what is wrong with this process and how could I improve it.
Thanks
I've worked with systems that use Fabric to deploy to multiple servers.
I'm the former lead developer at The Texas Tribune, which is 100% Django. We deployed to EC2 using RightScale. I didn't personally write the deployment scripts, but it allowed us to get new instances into the rotation very, very quickly and to scale on demand. It's not cheap, but it was worth every penny in my opinion.
I'd agree with John and say that Fabric is the tool to do this sort of thing comfortably. You probably don't want to configure git to automatically deploy with a post commit hook, but you might want to configure a fabric command to run your test suite locally, and then push to production if it passes.
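For example, a fabfile along these lines (classic Fabric 1.x API) could run the tests locally and only deploy if they pass. The host name, project path and service names below are placeholders for whatever your EC2 box actually uses.

```python
# fabfile.py (sketch, classic Fabric 1.x API) - run the tests locally and,
# only if they pass, pull the new code on the server and restart services.
# The host name, project path and service names are placeholders.
from fabric.api import cd, env, local, run, sudo

env.hosts = ["myapp.example.com"]
PROJECT_DIR = "/srv/myapp"


def test():
    # Fabric aborts the whole run if this local command fails.
    local("python manage.py test")


def deploy():
    test()
    with cd(PROJECT_DIR):
        run("git pull origin master")
    sudo("service apache2 restart")
    sudo("service nginx restart")
```

You would then deploy with "fab deploy" from your own machine instead of logging into the server by hand.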
Many people run separate dev and production settings files, rather than having custom logic in there to detect if it's in a production environment. You can inherit from a unified file, and then override the bits that are different between dev and production. Then you start the server using the production file, rather than relying on a single unified settings.py.
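A minimal sketch of that layout, assuming a settings package with base/dev/production modules (the module names, hosts and credentials are placeholders):

```python
# settings/base.py (sketch) - shared settings
DEBUG = False
ALLOWED_HOSTS = []
INSTALLED_APPS = [
    "django.contrib.admin",
    "django.contrib.auth",
    # ... the rest of your apps ...
]

# settings/dev.py - local development overrides
from .base import *  # noqa

DEBUG = True
DATABASES = {
    "default": {"ENGINE": "django.db.backends.sqlite3", "NAME": "dev.sqlite3"}
}

# settings/production.py - EC2 overrides
from .base import *  # noqa

ALLOWED_HOSTS = ["myapp.example.com"]
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "myapp",
        "USER": "myapp",
        "PASSWORD": "change-me",  # placeholder; read from the environment in practice
        "HOST": "127.0.0.1",
    }
}
```

You then point the production server at the production module, for example by setting DJANGO_SETTINGS_MODULE=myproject.settings.production before starting Apache/mod_wsgi, instead of branching inside a single settings.py.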
If you're just using Apache to host the application, you might benefit from a lighter-weight solution. Using FastCGI with nginx would allow you to do away with the overhead of Apache entirely. There's also a WSGI module for nginx, but I don't know if it's production-ready at this point.
There is one more good way to manage this. For Ubuntu/Debian AMIs, it works well to manage versions and do deployments by packaging your application into a .deb