uWSGI + nginx for django app avoids pylibmc multi-thread concurrency issue? - django

Introduction
I encountered this very interesting issue this week; it's best to start with some facts:
pylibmc is not thread safe. When used as the django memcached backend, starting multiple django instances directly in the shell leads to crashes under concurrent requests.
If deployed with nginx + uWSGI, this problem with pylibmc magically disappears.
Switching the django cache backend to python-memcached also solves the problem, but this question isn't about that.
Elaboration
Starting with the first fact, this is how I reproduced the pylibmc issue:
The failure of pylibmc
I have a django app that does a lot of memcached reading and writing. The deployment strategy was to start multiple django processes in the shell, bound to different ports (8001, 8002), and use nginx to do the load balancing.
I initiated two separate load tests against these two django instances using locust, and this is what happened:
In the above screenshot, both instances crashed and reported exactly the same issue, something like this:
Assertion "ptr->query_id == query_id +1" failed for function "memcached_get_by_key" likely for "Programmer error, the query_id was not incremented.", at libmemcached/get.cc:107
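Concurrency bugs like this usually come from sharing one client object between threads. pylibmc's own answer is its ThreadMappedPool, which hands each thread its own client; the same idea can be sketched with the standard library's threading.local. A minimal sketch, in which FakeClient is a hypothetical stand-in (not a real pylibmc class) used only so the example is self-contained:

```python
import threading

# Hypothetical stand-in for a non-thread-safe client such as pylibmc.Client.
class FakeClient:
    def __init__(self):
        self.query_id = 0

    def get(self, key):
        self.query_id += 1  # unsafe if two threads shared this instance
        return None

_local = threading.local()

def get_client():
    # Lazily create one client per thread, so no instance is ever
    # touched by two threads at once.
    if not hasattr(_local, "client"):
        _local.client = FakeClient()
    return _local.client

clients = {}

def worker(i):
    clients[i] = get_client()

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Each thread got its own, distinct client instance.
assert len({id(c) for c in clients.values()}) == 4
```

Under this pattern no client is ever shared between threads, which is the same guarantee a pool of single-threaded worker processes gives you for free.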
uWSGI to the rescue
So in the above case we learned that multi-threaded concurrent requests to memcached via pylibmc can cause this issue, yet it somehow doesn't bother uWSGI with multiple worker processes.
To prove that, I started uWSGI with the following settings included:
master = true
processes = 2
This tells uWSGI to start two worker processes. I then told nginx to serve any django static files and route non-static requests to uWSGI, to see what happens. With the server started, I launched the same locust test against django on localhost, making sure there were enough requests per second to cause concurrent requests against memcached. Here's the result:
In the uWSGI console there's no sign of dead worker processes, and no worker has been re-spawned, but looking at the upper part of the screenshot, there certainly were concurrent requests (5.6 req/s).
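For context, the two options above would typically sit in a fuller ini file. A minimal sketch, in which the module path, socket location, and permissions are assumptions for illustration:

```ini
[uwsgi]
; module path and socket location are assumptions, not from the question
module = mysite.wsgi:application
master = true
processes = 2
socket = /tmp/mysite.sock
chmod-socket = 664
vacuum = true
```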
The question
I'm extremely curious about how uWSGI makes this problem go away, and I couldn't work it out from their documentation. To recap, the question is:
How does uWSGI manage its worker processes so that multi-threaded memcached requests don't cause django to crash?
In fact, I'm not even sure it's the way uWSGI manages worker processes that avoids this issue; it could be some other magic that comes with uWSGI that's doing the trick. I've seen something called a memcached router in their documentation that I didn't quite understand. Is that related?

Isn't it because you actually have two separate processes managed by uWSGI? The processes option you set is, in uWSGI, simply an alias for the workers option, so you have multiple uWSGI worker processes (a master plus two workers, given the config you used). Each of those processes has its own loaded pylibmc, so there is no state shared between threads (you haven't configured threads on uWSGI, after all).
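The per-process isolation can be demonstrated with a plain fork, which is how uWSGI spawns its workers by default. In this standard-library sketch, a module-level dict stands in for the cache client's internal state:

```python
import multiprocessing as mp

# Module-level state, standing in for a pylibmc client created at import time.
state = {"queries": 0}

def worker(q):
    # Each worker process mutates only its own copy of `state`.
    state["queries"] += 1
    q.put(state["queries"])

if __name__ == "__main__":
    ctx = mp.get_context("fork")  # uWSGI workers are forked the same way
    q = ctx.Queue()
    procs = [ctx.Process(target=worker, args=(q,)) for _ in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    # Both workers saw queries == 1, and the parent's copy is untouched:
    # nothing was shared between processes, so nothing could race.
    assert sorted(q.get() for _ in range(2)) == [1, 1]
    assert state["queries"] == 0
```

Threads would be a different story: with threads = N in the config, all N threads inside one worker share a single interpreter, and a non-thread-safe client would again need per-thread handling.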

Related

Are Django settings shared across uwsgi workers?

I have a Django app with a setting (in my settings.py file) that's populated dynamically in my AppConfig's ready() function. I.e. in settings.py I have:
POPULATE_THIS = None
and then in apps.py in ready I have:
def ready(self):
    from django.conf import settings
    if settings.POPULATE_THIS is None:
        settings.POPULATE_THIS = ... some code which instantiates an object I need that's effectively a singleton ...
This seems to work OK. But I'm now at the point where, rather than just running the dev server locally (i.e. python manage.py runserver), I'm running the Django app through uwsgi (proxied behind nginx), and uwsgi is configured to run 10 worker processes (i.e. my uwsgi ini file has processes = 10 and threads = 1).
I'm seeing evidence that even though there are 10 uwsgi processes, ready() is still called exactly once on app startup, and the value of POPULATE_THIS is the same across all workers (calling str on it gives the same memory address).
My question: how is that value shared across the uwsgi processes, when I thought separate processes are distinct and do not share any memory? And am I correct in assuming that ready() is called once per app startup (i.e. when uwsgi itself spins up), and not once per uwsgi worker process startup?
This answer (Multiple server processes using nginx and uWSGI) on a different question seems to indicate that some data is shared across workers, but I can't seem to find any official docs that indicate what exactly is shared and how, specifically with respect to Django settings, so some explanation/details would be much appreciated.
Exactly.
By default, uwsgi loads the django application once in the master process and then forks the worker processes from it, so ready() is called only once, at startup. The workers inherit copy-on-write copies of the master's memory, which is why the object shows up at the same address in every worker even though the processes are distinct. (With lazy-apps = true, each worker would instead load the app, and run ready(), itself.)
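The mechanics can be seen with a bare os.fork, which is what uwsgi does under the hood. A Unix-only sketch:

```python
import os

# Object created once in the parent, before forking -- analogous to
# what ready() populates in the uwsgi master process.
singleton = {"created_in": os.getpid()}

r, w = os.pipe()
pid = os.fork()
if pid == 0:
    # Child: inherits a copy-on-write copy at the *same* virtual address...
    os.write(w, str(id(singleton)).encode())
    # ...but mutating it only changes the child's private copy.
    singleton["created_in"] = os.getpid()
    os._exit(0)

os.waitpid(pid, 0)
child_id = int(os.read(r, 64))

# Same address in both processes, yet the parent's copy is untouched.
assert child_id == id(singleton)
assert singleton["created_in"] == os.getpid()
```

The identical id() is just an identical virtual address in two separate address spaces; any post-fork mutation stays private to the worker that made it, which is why the object behaves like a shared singleton only as long as nobody writes to it.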

Gunicorn + Gevent : Debugging workers stuck state/ WORKER TIMEOUT cause

I'm running a very simple web server using Django on Gunicorn with gevent workers, which communicate with MySQL for simple CRUD-type operations. All of this is behind nginx and hosted on AWS. I'm running my app server using the following config:
gunicorn --logger-class=simple --timeout 30 -b :3000 -w 5 -k gevent my_app.wsgi:application
However, sometimes the workers just get stuck (sometimes when the number of requests increases, sometimes even without it) and the TPS drops, with nginx returning the 499 HTTP error code. Sometimes workers start getting killed (WORKER TIMEOUT) and the requests are dropped.
I'm unable to find a way to debug where the workers are getting stuck. I've checked the MySQL slow logs, and that is not the problem here.
In Java, I could take a jstack dump to see the thread states, or use mechanisms like Takipi which capture the state of the threads when an exception occurs.
To anyone out there who can help: I'm looking for a way to see the internal state of a hosted python web server, i.e.
workers state at a given point
threads state at a given point
which requests a particular gevent worker has started processing, and when it gets stuck/killed, where exactly it is stuck
which requests got terminated because of a worker getting killed
etc.
I've been looking into this and have found many people facing similar issues, but their solutions seem hit-and-miss, and nowhere are the steps laid out for how to dig into this.
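One standard-library option worth trying: the faulthandler module can dump the Python stack of every thread in a process, roughly a jstack equivalent. The sketch below dumps to a temp file; on a live gunicorn worker you would more likely register a signal handler (faulthandler.register(signal.SIGUSR1)) and send that signal to the stuck worker's PID. The `stuck` function is a hypothetical stand-in for a blocked request handler, and note a gevent caveat: this shows OS-thread frames, not individual greenlets.

```python
import faulthandler
import tempfile
import threading
import time

def stuck():
    # Hypothetical stand-in for a worker blocked in slow I/O.
    time.sleep(2)

t = threading.Thread(target=stuck, daemon=True)
t.start()
time.sleep(0.1)  # give the thread time to enter sleep()

# Dump the current stack of every thread to a file on demand.
with tempfile.TemporaryFile(mode="w+") as f:
    faulthandler.dump_traceback(file=f, all_threads=True)
    f.seek(0)
    dump = f.read()

# The dump names each thread and the frame it is currently in,
# including our "stuck" function.
assert "most recent call first" in dump
assert "stuck" in dump
```

For inspecting a process from the outside without any code changes, py-spy (a separate tool) can take a similar dump of a running PID.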

how to run Apache with mod_wsgi and django in one process only?

I'm running apache with django and mod_wsgi enabled, in 2 different processes.
I read that the second process is an on-change listener for reloading code on change, but for some reason the ready() function of my AppConfig class is being executed twice. This function should only run once.
I understand that running django runserver with the --noreload flag resolves the problem in development mode, but I cannot find a solution for this in production mode on my apache webserver.
I have two questions:
How can I run with only one process in production, or at least make only one process run the ready() function?
Is there a way to make the ready() function run eagerly rather than lazily? By this I mean executing it on server startup, not on the first request.
For further explanation, I am experiencing a scenario as follows:
The ready() function creates a folder listener, e.g. with pyinotify. That listener watches a folder on my server and enqueues a task on any change.
I am seeing this listener executed twice on any changes to a single file in the monitored directory. This leads me to believe that both processes are running my listener.
No, the second process is not an on-change listener - I don't know where you read that. That behaviour belongs to the dev server, not to mod_wsgi.
You should not try to prevent Apache from serving multiple processes. If you do, the speed of your site will be massively reduced: it will only be able to serve a single request at a time, with others queued until the first finishes. That's no good for anything other than a toy site.
Instead, you should fix your AppConfig. Rather than blindly spawning a listener, you should check to see if it has already been created before starting a new one.
You shouldn't prevent Apache from spawning multiple processes, because that's a good thing, especially in a production environment. You should consider using some external tool, separate from django, or add a check for whether the folder listener is already running (for example, monitor the existence of a PID file and its contents).
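A minimal sketch of the PID-file idea: os.open with O_CREAT | O_EXCL is atomic, so even if every mod_wsgi process runs ready() at the same moment, exactly one of them wins the lock and starts the listener. The lock path here uses a fresh temp dir only so the demo is self-contained; a real deployment would use a fixed path and also handle stale lock files left by crashed processes.

```python
import os
import tempfile

# Demo path in a fresh temp dir; a real app would use a fixed, well-known path.
LOCKFILE = os.path.join(tempfile.mkdtemp(), "folder_listener.lock")

def try_acquire(path=LOCKFILE):
    """Return True if this process should start the listener."""
    try:
        # O_CREAT | O_EXCL fails atomically if the file already exists,
        # so at most one process can ever succeed.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False  # another process already owns the listener
    os.write(fd, str(os.getpid()).encode())  # record the owner's PID
    os.close(fd)
    return True

first = try_acquire()   # this call creates the file and wins
second = try_acquire()  # every later call backs off
assert (first, second) == (True, False)
```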

Asynchronous celery task is blocking main application (maybe)

I have a django application running behind varnish and nginx.
There is a periodic task running every two minutes, accessing a locally running jsonrpc daemon and updating a django model with the result.
Sometimes the django app stops responding, ending in an nginx gateway failure message. Looking through the logs, it seems that when this happens, the background task accessing the jsonrpc daemon is also timing out.
The task itself is pretty simple: A value is requested from jsonrpc daemon and saved in a django model, either updating an existing entry or creating a new one. I don't think that any database deadlock is involved here.
I am a bit lost as to how to track this down. To start, I don't know whether the timeout of the task is causing the overall site timeout, OR whether some other problem is causing BOTH timeouts. After all, a timeout in the asynchronous task should not have any influence on the website's response time, should it?
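One way to rule the task in or out is to put a hard bound on the jsonrpc call, so the task can never hang indefinitely even when the daemon stops answering. A generic standard-library sketch, in which fetch_value is a hypothetical stand-in for the jsonrpc request; if the task runs under celery, its soft_time_limit option serves the same purpose.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError
import time

def fetch_value():
    # Hypothetical stand-in for the jsonrpc request; the sleep simulates
    # a daemon that has stopped answering.
    time.sleep(1.0)
    return 42

def fetch_with_timeout(timeout):
    # Run the call in a helper thread and give up waiting after `timeout`
    # seconds, instead of blocking the whole task.
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(fetch_value)
        return future.result(timeout=timeout)

try:
    fetch_with_timeout(0.1)
    timed_out = False
except TimeoutError:
    timed_out = True

assert timed_out  # the hung "daemon" could no longer stall the caller
```

If the site timeouts disappear once the task is bounded, the task was the culprit; if they persist, something else (e.g. database contention) is starving both.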

concurrent requests on dotcloud with django

I have a django app I want to migrate to dotcloud.
Many actions in Django internals and in my app are not asynchronous, i.e. they block the thread until they finish.
When I was using Apache, that didn't pose a problem, since a different thread is opened for every request. But that doesn't seem to be the case with the nginx/uwsgi stack that dotcloud uses.
Seemingly, uwsgi has a --enable-threads and --threads options that can be used for multithreading, but:
It is not clear which version of uwsgi dotcloud uses, and whether they support these features
Since I see no one else asking about this, I was wondering whether using threads is really the right way to get concurrent requests running
You could run Django with Gunicorn. Gunicorn, in turn, supports multiple worker classes, and people have reported success running gunicorn + gevent + django together [1][2].
To use that on dotCloud, you will probably have to use dotCloud's custom service. If that's something you want to try, I would personally start with dotCloud's reimplementation of the python service on top of the custom service, and replace uwsgi with gunicorn in it.
I came here looking for some leads, which I found, thanks!
There was a fair amount of legwork left to actually get things working, though.
Here is an example app on github that uses gunicorn, gevent, and socketio on dotcloud:
https://github.com/t1m0thy/django-tictactoe/tree/dotcloud
Threads are a problem in python - the GIL doesn't allow them to run simultaneously.
So multiprocessing is an answer.
Or you may take a look at gevent. Admittedly, gevent is a kind of hack (monkey-patching of the python stack), but it allows you to launch green threads.
I'm not sure if gevent can be combined with django, but google knows ;)
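To make the GIL point concrete: for CPU-bound work, multiprocessing sidesteps it because every process gets its own interpreter (and its own GIL). A small sketch, in which the prime-counting function is just illustrative CPU-bound work:

```python
from multiprocessing import Pool

def count_primes(limit):
    # Deliberately naive, CPU-bound work; threads would serialize on the
    # GIL here, but separate processes run it truly in parallel.
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

if __name__ == "__main__":
    with Pool(processes=2) as pool:
        results = pool.map(count_primes, [10_000, 10_000])
    assert results == [1229, 1229]  # 1229 primes below 10,000
```

For I/O-bound web requests, though, gevent's green threads (which the GIL doesn't hurt, since they yield on I/O) or simply multiple uwsgi/gunicorn worker processes are the usual answers.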