Does a production Django server fork?

I have been storing some information in global vars in my Django views. This information can be accessed by every thread in the Python Django process. However, I am wondering how Django behaves in production. Does a production Django process fork() multiple times to handle requests? If so, this data would not be the same across processes. Does anyone know if Django forks?
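As a hedged illustration of the setup in the question (a hypothetical views.py, not from the original post): a module-level variable lives in the memory of one Python process, so if the server pre-forks, each worker sees its own copy, which you can observe by returning the process id alongside the value.

    # views.py - hypothetical example of per-process "global" state
    import os

    from django.http import HttpResponse

    REQUEST_COUNT = 0  # module-level variable, private to each server process


    def counter(request):
        global REQUEST_COUNT
        REQUEST_COUNT += 1  # note: also not thread-safe without a lock
        # Repeated requests may land on different worker processes and
        # therefore report different counts and pids.
        return HttpResponse("pid=%d count=%d" % (os.getpid(), REQUEST_COUNT))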

I'm sure that it depends on your deployment, but if you are running it under FastCGI or WSGI, then yes, it generally pre-forks a number of server processes to handle incoming requests.
I don't know about running under mod_python, but I think that is being discouraged these days in favour of WSGI.

I'm not an expert in this field so I'm answering based only on the grep-ing I've just done.
The fastcgi server seems to be able to fork, depending on configuration settings:
http://code.djangoproject.com/browser/django/tags/releases/1.2.3/django/core/servers/fastcgi.py#L171
http://code.djangoproject.com/browser/django/tags/releases/1.2.3/django/utils/daemonize.py
As for WSGI, I believe that on the Django side the handler goes straight into request processing:
http://code.djangoproject.com/browser/django/tags/releases/1.2.3/django/core/handlers/wsgi.py#L217
and forking is configured in mod_wsgi: http://code.google.com/p/modwsgi/ - embedded mode vs daemon mode - and/or in Apache (worker vs prefork builds).

For mod_wsgi, read:
http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading
It explains the various models and guidelines with respect to the use of common data across threads/processes. The situation isn't much different for other hosting systems.
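To make that concrete with a hedged sketch (a hypothetical helper, not taken from the mod_wsgi docs): module-level data is shared between the threads of a single process, so it needs locking, but it is never shared between processes, so anything that must be visible application-wide should live in the database or an external cache instead.

    # Hypothetical per-process cache: shared by threads in one process only.
    import threading

    _cache = {}
    _cache_lock = threading.Lock()


    def get_or_compute(key, compute):
        # The lock protects the dict from concurrent thread access; other
        # processes (daemon mode with processes=N, or prefork workers) each
        # hold their own independent _cache.
        with _cache_lock:
            if key not in _cache:
                _cache[key] = compute()
            return _cache[key]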

Setting up Nginx as a reverse proxy for Apache vs just Apache Event MPM

In the Django docs for setting up mod_wsgi, the tutorial notes:
Django doesn’t serve files itself; it leaves that job to whichever Web
server you choose.
We recommend using a separate Web server – i.e., one that’s not also
running Django – for serving media. Here are some good choices:
Nginx
A stripped-down version of Apache
I understand this might be due to wasted resources when Apache spawns new processes to serve each static file, which Nginx avoids. However, Apache's (newish?) Event MPM seems to act similarly to an Nginx instance handing off requests to an Apache worker MPM. Therefore I'd like to ask: instead of setting up Nginx as a reverse proxy for Apache, would using Apache's Event MPM be sufficient for serving static files from Apache?
Apache doesn't spawn a new process for each static file. Apache keeps persistent processes to handle concurrent and subsequent requests, just like nginx. The difference is that nginx uses a fully async model, whereas Apache relies on processes and/or threading for concurrency, although the event MPM now uses an async model for initial request acceptance and keep-alive connections. For the majority of people, Apache alone is still a more than acceptable solution, so don't get ahead of yourself if you are just starting out and think you need a Google/Facebook-scale solution from the outset.
More important than a separate web server is that, if you are using Apache/mod_wsgi, you serve the static files under a different host name. That way you avoid heavyweight cookie information being sent with all static file requests. You can do this using virtual hosts in Apache. Also ensure you are using daemon mode of mod_wsgi for running the Django application, as that is a better architecture and provides many more options for setting timeouts, so your application can recover from various situations which might otherwise cause the server to lock up when overloaded.
For a system which provides a better out-of-the-box configuration and experience than using Apache/mod_wsgi directly and configuring it yourself, look at using mod_wsgi-express.
https://pypi.python.org/pypi/mod_wsgi
http://blog.dscpl.com.au/2015/04/introducing-modwsgi-express.html
http://blog.dscpl.com.au/2015/04/using-modwsgi-express-with-django.html
http://blog.dscpl.com.au/2015/04/integrating-modwsgi-express-as-django.html
The advice about separating the web servers has two advantages. One is clearly outlined by Graham. The other is predictable resource consumption.
The number of resources per HTML page differs. Leaving one web server to serve the application and the other to serve static resources has the advantage that you know exactly how many concurrent visitors you can serve: the MaxClients setting of Apache.
If this slows down the loading of images, remember that those static-serving web servers need very few modules and no measurable amount of CPU power, so a one-core machine with SSD disks is all you need and scaling is cheap.
As Graham indicates, it starts with a STATIC_URL that uses a different hostname. Run it on the same server at the start. When scaling up, tie that hostname to a reverse proxy that serves from several image-server backend machines.
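In Django terms that usually means pointing STATIC_URL (and MEDIA_URL) at the other hostname; a minimal settings sketch, with a placeholder domain:

    # settings.py - hedged sketch; "static.example.com" is a placeholder.
    # A separate hostname keeps session/CSRF cookies off static requests and
    # can later be pointed at a reverse proxy or dedicated static server.
    STATIC_URL = "https://static.example.com/static/"
    MEDIA_URL = "https://static.example.com/media/"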

Does Django run in single thread by default?

By reading the code, it seems that Django runs in a single thread by default.
However, when I use sleep(15) in my view function and open two browser tabs to request it, they return the response almost at the same time!
So I do not know why this happens…
My Django version is 1.9.
Django itself does not determine whether it runs in one or more threads. This is the job of the server running Django.
The development server used to be single-threaded, but in recent versions it has been made multithreaded. Other servers such as Apache/mod_wsgi, gunicorn, or uwsgi, have their own defaults and can be configured in a number of ways; often they use multiple processes rather than threads.
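A simple way to see this for yourself (a hypothetical view matching the experiment in the question): request it from two browser tabs and compare the total time; roughly 15 seconds means the requests were handled concurrently, roughly 30 seconds means a single thread handled them one after the other.

    # views.py - hypothetical blocking view for the concurrency experiment
    import threading
    import time

    from django.http import HttpResponse


    def slow(request):
        time.sleep(15)  # blocks the handling thread for 15 seconds
        return HttpResponse("handled by %s" % threading.current_thread().name)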

concurrent requests on dotcloud with django

I have a django app I want to migrate to dotcloud.
Many actions in Django internals and in my app are not asynchronous, i.e. they block the thread until they finish.
When I was using Apache, that didn't pose a problem, since a different thread is opened for every request. But that doesn't seem to be the case with the nginx/uWSGI setup that dotCloud uses.
uWSGI does have --enable-threads and --threads options that can be used for multithreading, but:
It is not clear which version of uWSGI dotCloud uses, and whether it supports these features.
Since no one else seems to be asking about this, I was wondering whether this (using threads) is really the right way to get concurrent requests running.
You could run Django with Gunicorn. Gunicorn, in turn, supports multiple worker classes, and people have reported success running gunicorn+gevent+django together[1][2].
To use that on dotCloud, you will probably have to use dotCloud's custom service. If that's something that you want to try, I would personally start with dotCloud's reimplementation of python service using the custom service, and replace uwsgi with gunicorn in it.
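For reference, a hedged sketch of what such a gunicorn setup might look like; the module path "myproject.wsgi" and the numbers are placeholders, not dotCloud-specific values:

    # gunicorn.conf.py - gunicorn configuration files are plain Python
    bind = "0.0.0.0:8000"
    workers = 2                  # a couple of worker processes
    worker_class = "gevent"      # cooperative green threads for blocking I/O
    worker_connections = 1000    # concurrent connections per worker

    # Run with: gunicorn -c gunicorn.conf.py myproject.wsgi:application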
I came here looking for some leads, which I found, thanks!
There was a fair amount of leg work left to actually get stuff working, though.
Here is an example app on github that uses gunicorn, gevent, and socketio on dotcloud:
https://github.com/t1m0thy/django-tictactoe/tree/dotcloud
Threads are a problem in Python: the GIL doesn't allow them to run simultaneously.
So multiprocessing is one answer.
Or you may take a look at gevent. Gevent is something of a hack (monkey patching of the Python stack), but it lets you launch green threads.
I'm not sure whether gevent can be combined with Django, but Google knows ;)
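If you do try gevent with Django, the usual approach is to monkey-patch the standard library before anything else imports sockets; a hedged sketch of a WSGI entry point (module names are placeholders):

    # wsgi.py - hypothetical gevent + Django entry point
    from gevent import monkey
    monkey.patch_all()  # must run before socket-using modules are imported

    from django.core.wsgi import get_wsgi_application

    application = get_wsgi_application()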

Serve multiple Django and PHP projects on the same machine?

The documentation states that one should not serve static files on the same machine as the Django project, because static content will kick the Django application out of memory. Does this problem also come from having multiple Django projects on one server? Should I combine all my website projects into one very large Django project?
I'm currently serving Django along with PHP scripts from Apache with mod_wsgi. Does this also cause a loss of efficiency?
Or is the warning just meant for static content, because the problem arises when serving hundreds of files, while serving 20-30 different PHP/Django projects is OK?
I would say that this setup is completely OK. Of course it depends on the hardware, the load and the other projects, but here you can just try and monitor the usage/performance.
The suggestion to use different server(s) for static files makes sense, as it is more efficient with resources. But as long as one server performs well enough, I don't see a reason to use a second one.
Another question - which has less to do with performance than with ease of use/configuration - is whether you really want to run everything on the same server.
For one setup with a bunch of smaller sites (and as well some php-legacy) we use one machine with four virtual servers:
webhead running nginx (and varnish)
database
simple apache2/php server
django server using gunicorn + supervisord
nginx handles all the sites, either proxying to the application server or serving static content (via NAS). I like this setup, as it is very easy to install and handle, and it makes it simple to scale out one piece if needed.
If the documentation says """one should not server static files on the same machine as as the Django project, because static content will kick the Django application out of memory""" then the documentation is very misleading and arguably plain wrong.
The one suggestion I would make, if using PHP on the same system, is that you ensure you are using mod_wsgi daemon mode for running the Python web application, and even one daemon process per Python web application.
Do not run the Python web application in embedded mode, because that means you are running stuff in the same process as mod_php, and because PHP, including its extensions, is not really multithread safe, that means you have to be running the prefork MPM. Running Python web applications embedded in Apache under the prefork MPM is a bad idea unless you know very well how to set up Apache properly for it. Don't set up Apache right and you get issues like those described in:
http://blog.dscpl.com.au/2009/03/load-spikes-and-excessive-memory-usage.html
The short of it is that the Apache configurations for PHP and Python need to be quite different. You can get around that, though, by using mod_wsgi daemon mode for the Python web application.

Why is mod_python running entire django stack from beginning with each request?

My understanding is that mod_python loads the Python interpreter into Apache, avoiding the overhead of doing that on each call. My expectation was that this would mean my Django stack would also only be loaded once.
What I am observing, however, is that every request runs the entire Django stack from the beginning, as though it were the first request. The settings are re-imported. Middleware __init__'s, which are supposed to run once at Django startup, are run each time. And so forth. It is essentially behaving the way I would expect CGI to behave.
Is this expected behavior? I have mostly worked with mod_wsgi, which I believe does not work this way, but I have to use mod_python for my current client.
Thanks!
Apache on UNIX systems is a multiprocess system, as pointed out by someone else. Also make sure MaxRequestsPerChild hasn't been set to 1 in the Apache configuration for some reason. Ideally that directive should be set to 0, meaning processes are kept around and not recycled based on the number of requests they have handled.
It loads Django once per httpd process. Since multiple processes start (each child being a process), multiple instances of Django are started.
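One way to see which of the two is happening (a hypothetical old-style middleware, not from the original post): log the process id at initialisation time. A stable pid across requests means the stack really is loaded once per process; a fresh pid on every request points at Apache recycling its children, e.g. MaxRequestsPerChild set to 1.

    # middleware.py - hypothetical old-style (MIDDLEWARE_CLASSES) middleware
    import os


    class PidLoggingMiddleware(object):
        def __init__(self):
            # Runs once per Python process, not once per request, as long as
            # the process is kept alive between requests.
            print("middleware initialised in pid %d" % os.getpid())

        def process_request(self, request):
            return None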