By reading the code, I found it seems that Django run in single thread by default.
However, when I use sleep(15) in my view function and open two web to request my function. They return the response almost at the same time!
so, I do not know why does it happened……
my Django version is 1.9
Django itself does not determine whether it runs in one or more threads. This is the job of the server running Django.
The development server used to be single-threaded, but in recent versions it has been made multithreaded. Other servers such as Apache/mod_wsgi, gunicorn, or uwsgi, have their own defaults and can be configured in a number of ways; often they use multiple processes rather than threads.
Related
Introduction
I encountered this very interesting issue this week, better start with some facts:
pylibmc is not thread safe, when used as django memcached backend, starting multiple django instance directly in shell would crash when hit with concurrent requests.
if deploy with nginx + uWSGI, this problem with pylibmc magically dispear.
if you switch django cache backend to python-memcached, it too will solve this problem, but this question isn't about that.
Elaboration
start with the first fact, this is how I reproduced the pylibmc issue:
The failure of pylibmc
I have a django app which does a lot of memcached reading and writing, and there's this deployment strategy, that I start multiple django process in shell, binding to different ports (8001, 8002), and use nginx to do the balance.
I initiated two separate load test against these two django instance, using locust, and this is what happens:
In the above screenshot they both crashed and reported exactly the same issue, something like this:
Assertion "ptr->query_id == query_id +1" failed for function "memcached_get_by_key" likely for "Programmer error, the query_id was not incremented.", at libmemcached/get.cc:107
uWSGI to the rescue
So in the above case, we learned that multi-thread concurrent request towards memcached via pylibmc could cause issue, this somehow doesn't bother uWSGI with multiple worker process.
To prove that, I start uWSGI with the following settings included:
master = true
processes = 2
This tells uWSGI to start two worker process, I then tells nginx to server any django static files, and route non-static requests to uWSGI, to see what happens. With the server started, I launch the same locust test against django in localhost, and make sure there's enough requests per seconds to cause concurrent request against memcached, here's the result:
In the uWSGI console, there's no sign of dead worker processes, and no worker has been re-spawn, but looking at the upper part of the screenshot, there sure has been concurrent requests (5.6 req/s).
The question
I'm extremely curious about how uWSGI make this go away, and I couldn't learn that on their documentation, to recap, the question is:
How did uWSGI manage worker process, so that multi-thread memcached requests didn't cause django to crash?
In fact I'm not even sure that it's the way uWSGI manages worker processes that avoid this issue, or some other magic that comes with uWSGI that's doing the trick, I've seen something called a memcached router in their documentation that I didn't quite understand, does that relate?
Isn't it because you actually have two separate processes managed by uWSGI? As you are setting the processes option instead of the workers option, so you should actually have multiple uWSGI processes (I'm assuming a master + two workers because of the config you used). Each of those processes will have it's own loaded pylibmc, so there is not state sharing between threads (you haven't configured threads on uWSGI after all).
I'm running apache with django and mod_wsgi enabled in 2 different processes.
I read that the second process is a on-change listener for reloading code on change, but for some reason the ready() function of my AppConfig class is being executed twice. This function should only run once.
I understood that running django runserver with the --noreload flag will resolve the problem on development mode, but I cannot find a solution for this in production mode on my apache webserver.
I have two questions:
How can I run with only one process in production or at least make only one process run the ready() function ?
Is there a way to make the ready() function run not in a lazy mode? By this, I mean execute only on on server startup, not on first request.
For further explanation, I am experiencing a scenario as follows:
The ready() function creates a folder listener such as pyinotify. That listener will listen on a folder on my server and enqueue a task on any changes.
I am seeing this listener executed twice on any changes to a single file in the monitored directory. This leads me to believe that both processes are running my listener.
No, the second process is not an onchange listener - I don't know where you read that. That happens with the dev server, not with mod_wsgi.
You should not try to prevent Apache from serving multiple processes. If you do, the speed of your site will be massively reduced: it will only be able to serve a single request at a time, with others queued until the first finishes. That's no good for anything other than a toy site.
Instead, you should fix your AppConfig. Rather than blindly spawning a listener, you should check to see if it has already been created before starting a new one.
You shouldn't prevent spawning multiple processes, because it's good thing, especially on production environment. You should consider using some external tool, separated from django or add check if folder listening is already running (for example monitor persistence of PID file and it's content).
I have a django app I want to migrate to dotcloud.
Many actions in Django internals and in my app are not asynchronous, i.e. they block the thread until they finish.
When I was using Apache, that didn't pose a problem since a different thread is opened on every request. But it doesn't seem to be the case in nginx/uwsgi that dotcloud use.
Seemingly, uwsgi has a --enable-threads and --threads options that can be used for multithreading, but:
It is not clear what version of uwsgi dotcloud use, and if they support these features
Since I have no one else asking about this, I was wondering if this is really the right way to get the concurrent requests running (using threads)
You could run Django with Gunicorn. Gunicorn, in turn, supports multiple worker classes, and people reported success running gunicorn+gevents+django together[1][2].
To use that on dotCloud, you will probably have to use dotCloud's custom service. If that's something that you want to try, I would personally start with dotCloud's reimplementation of python service using the custom service, and replace uwsgi with gunicorn in it.
I came here looking for some leads, which I found, thanks!
There was a fair amount of leg work left to actually get stuff working, though.
Here is an example app on github that uses gunicorn, gevent, and socketio on dotcloud:
https://github.com/t1m0thy/django-tictactoe/tree/dotcloud
Threads is a problem in python - GIL doesn't allow them to run simultaneously.
So multiprocessing is an answer.
Or you may take a look at gevent. Actually gevent is a kind of a hack (monkey patching of python stack) and so on, but it allows to launch green threads.
I'm not sure if gevent can be combined with django, but google knows ;)
I have been storing some information in global vars in my DJango views. This information can be accessed by every thread in the Python Django process. However, I am wondering about how Django behaves in production. Does a production Django process fork() multiple times to handle requests? If so this data would not be the same across processes. Does anyone know if Django forks?
I'm sure that it depends on your deployment, but if you are running it under FastCGI or WSGI, then yes, it generally pre-forks a number of server processes to handle incoming requests.
I don't know about running under mod_python, but I think that is being discouraged these days in favour of WSGI.
I'm not an expert in this field so I'm answering based only on the grep-ing I've just done.
The fastcgi server seems to be able to fork, depending on configuration settings:
http://code.djangoproject.com/browser/django/tags/releases/1.2.3/django/core/servers/fastcgi.py#L171
http://code.djangoproject.com/browser/django/tags/releases/1.2.3/django/utils/daemonize.py
As for WSGI, I believe that Django side-handling is going straight to the request processing:
http://code.djangoproject.com/browser/django/tags/releases/1.2.3/django/core/handlers/wsgi.py#L217
and forking is configured in mod_wsgi: http://code.google.com/p/modwsgi/ - embedded mode vs daemon mode - and/or in Apache (worker vs prefork builds).
For mod_wsgi, read:
http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading
It explains the various models and guidelines in respect to use of common data across threads/processes. Situation isn't much different for other hosting systems.
My understanding is that mod_python loads the python process into apache, avoiding the overhead of doing that on each call. My expectation was that this would mean that my django stack would also only be loaded once.
What I am observing, however, is that every request is running the entire django stack from the beginning, as though it were the first request. The settings are re-imported. Middleware __init__'s, which are supposed to be run once at django startup, are run each time. And so forth. It seems to be essentially like I would expect CGI to be.
Is this expected behavior? I have mostly worked with mod_wsgi, which I believe does not work this way, but I have to use mod_python for my current client.
Thanks!
Apache on UNIX systems is a multiprocess system as pointed out by someone else. Also make sure the MaxRequestsPerChild hasn't been set to be 1 in Apache configuration for some reason. Ideally that directive should be set to 0, meaning keep processes around and not recycle them based on number of requests.
It loads Django once per httpd process. Since multiple processes start (each child being a process), multiple instances of Django are started.