how to run Apache with mod_wsgi and django in one process only? - django

I'm running apache with django and mod_wsgi enabled in 2 different processes.
I read that the second process is a on-change listener for reloading code on change, but for some reason the ready() function of my AppConfig class is being executed twice. This function should only run once.
I understood that running django runserver with the --noreload flag will resolve the problem on development mode, but I cannot find a solution for this in production mode on my apache webserver.
I have two questions:
How can I run with only one process in production or at least make only one process run the ready() function ?
Is there a way to make the ready() function run not in a lazy mode? By this, I mean execute only on on server startup, not on first request.
For further explanation, I am experiencing a scenario as follows:
The ready() function creates a folder listener such as pyinotify. That listener will listen on a folder on my server and enqueue a task on any changes.
I am seeing this listener executed twice on any changes to a single file in the monitored directory. This leads me to believe that both processes are running my listener.

No, the second process is not an onchange listener - I don't know where you read that. That happens with the dev server, not with mod_wsgi.
You should not try to prevent Apache from serving multiple processes. If you do, the speed of your site will be massively reduced: it will only be able to serve a single request at a time, with others queued until the first finishes. That's no good for anything other than a toy site.
Instead, you should fix your AppConfig. Rather than blindly spawning a listener, you should check to see if it has already been created before starting a new one.

You shouldn't prevent spawning multiple processes, because it's good thing, especially on production environment. You should consider using some external tool, separated from django or add check if folder listening is already running (for example monitor persistence of PID file and it's content).

Related

How to warm up django web service before opening to public?

I'm running django web application on AWS ecs.
I'd like to warm up the server (hit the first request and it takes some time for django to load up) when deploying a new version.
Is there a way for warming up the server before registering it to the Application load balancer?
Edit
I'm using nginx + uwsgi
I asumed that you use mod_wsgi , because that is the behavior described here:
Q: Why do requests against my application seem to take forever, but then after a bit they all run much quicker?
A: This is because mod_wsgi by default performs lazy loading of any application. That is, an application is only loaded the first time
that a request arrives which targets that WSGI application. This means
that those initial requests will incur the overhead of loading all the
application code and performing any startup initialisation.
This startup overhead can appear to be quite significant, especially if using Apache prefork MPM and embedded mode. This is
because the startup cost is incurred for each process and with prefork
MPM there are typically a lot more processes that if using worker MPM
or mod_wsgi daemon mode. Thus, as many requests as there are processes
will run slowly and everything will only run full speed once code has
all been loaded.
Note that if recycling of Apache child processes or mod_wsgi daemon processes after a set number of requests is enabled, or for
embedded mode Apache decides itself to reap any of the child
processes, then you can periodically see these delayed requests
occurring.
Some number of the benchmarks for mod_wsgi which have been posted do not take into mind these start up costs and wrongly try to compare
the results to other systems such as fastcgi or proxy based systems
where the application code would be preloaded by default. As a result
mod_wsgi is painted in a worse light than is reality. If mod_wsgi is
configured correctly the results would be better than is shown by
those benchmarks.
For some cases, such as when WSGIScriptAlias is being used, it is actually possible to preload the application code when the processes
first starts, rather than when the first request arrives. To preload
an application see the WSGIImportScript directive.
I think you may try to use WSGIScriptAlias see more here
I just changed health check from nginx based ones to uwsgi related ones,
create an endpoint in django, and let the ELB use that as health check

Does Django run in single thread by default?

By reading the code, I found it seems that Django run in single thread by default.
However, when I use sleep(15) in my view function and open two web to request my function. They return the response almost at the same time!
so, I do not know why does it happened……
my Django version is 1.9
Django itself does not determine whether it runs in one or more threads. This is the job of the server running Django.
The development server used to be single-threaded, but in recent versions it has been made multithreaded. Other servers such as Apache/mod_wsgi, gunicorn, or uwsgi, have their own defaults and can be configured in a number of ways; often they use multiple processes rather than threads.

uWSGI + nginx for django app avoids pylibmc multi-thread concurrency issue?

Introduction
I encountered this very interesting issue this week, better start with some facts:
pylibmc is not thread safe, when used as django memcached backend, starting multiple django instance directly in shell would crash when hit with concurrent requests.
if deploy with nginx + uWSGI, this problem with pylibmc magically dispear.
if you switch django cache backend to python-memcached, it too will solve this problem, but this question isn't about that.
Elaboration
start with the first fact, this is how I reproduced the pylibmc issue:
The failure of pylibmc
I have a django app which does a lot of memcached reading and writing, and there's this deployment strategy, that I start multiple django process in shell, binding to different ports (8001, 8002), and use nginx to do the balance.
I initiated two separate load test against these two django instance, using locust, and this is what happens:
In the above screenshot they both crashed and reported exactly the same issue, something like this:
Assertion "ptr->query_id == query_id +1" failed for function "memcached_get_by_key" likely for "Programmer error, the query_id was not incremented.", at libmemcached/get.cc:107
uWSGI to the rescue
So in the above case, we learned that multi-thread concurrent request towards memcached via pylibmc could cause issue, this somehow doesn't bother uWSGI with multiple worker process.
To prove that, I start uWSGI with the following settings included:
master = true
processes = 2
This tells uWSGI to start two worker process, I then tells nginx to server any django static files, and route non-static requests to uWSGI, to see what happens. With the server started, I launch the same locust test against django in localhost, and make sure there's enough requests per seconds to cause concurrent request against memcached, here's the result:
In the uWSGI console, there's no sign of dead worker processes, and no worker has been re-spawn, but looking at the upper part of the screenshot, there sure has been concurrent requests (5.6 req/s).
The question
I'm extremely curious about how uWSGI make this go away, and I couldn't learn that on their documentation, to recap, the question is:
How did uWSGI manage worker process, so that multi-thread memcached requests didn't cause django to crash?
In fact I'm not even sure that it's the way uWSGI manages worker processes that avoid this issue, or some other magic that comes with uWSGI that's doing the trick, I've seen something called a memcached router in their documentation that I didn't quite understand, does that relate?
Isn't it because you actually have two separate processes managed by uWSGI? As you are setting the processes option instead of the workers option, so you should actually have multiple uWSGI processes (I'm assuming a master + two workers because of the config you used). Each of those processes will have it's own loaded pylibmc, so there is not state sharing between threads (you haven't configured threads on uWSGI after all).

Why is mod_python running entire django stack from beginning with each request?

My understanding is that mod_python loads the python process into apache, avoiding the overhead of doing that on each call. My expectation was that this would mean that my django stack would also only be loaded once.
What I am observing, however, is that every request is running the entire django stack from the beginning, as though it were the first request. The settings are re-imported. Middleware __init__'s, which are supposed to be run once at django startup, are run each time. And so forth. It seems to be essentially like I would expect CGI to be.
Is this expected behavior? I have mostly worked with mod_wsgi, which I believe does not work this way, but I have to use mod_python for my current client.
Thanks!
Apache on UNIX systems is a multiprocess system as pointed out by someone else. Also make sure the MaxRequestsPerChild hasn't been set to be 1 in Apache configuration for some reason. Ideally that directive should be set to 0, meaning keep processes around and not recycle them based on number of requests.
It loads Django once per httpd process. Since multiple processes start (each child being a process), multiple instances of Django are started.

Django + WSGI: Refreshing Issues?

I'm developing a Django site. I'm making all my changes on the live server, just because it's easier that way. The problem is, every now and then it seems to like to cache one of the *.py files I'm working on. Sometimes if I hit refresh a lot, it will switch back and forth between an older version of the page, and a newer version.
My set up is more or less like what's described in the Django tutorials: http://docs.djangoproject.com/en/dev/howto/deployment/modwsgi/#howto-deployment-modwsgi
I'm guessing it's doing this because it's firing up multiple instances of of the WSGI handler, and depending on which handler the the http request gets sent to, I may receive different versions of the page. Restarting apache seems to fix the problem, but it's annoying.
I really don't know much about WSGI or "MiddleWare" or any of that request handling stuff. I come from a PHP background, where it all just works :)
Anyway, what's a nice way of resolving this issue? Will running the WSGI handler is "daemon mode" alleviate the problem? If so, how do I get it to run in daemon mode?
Running the process in daemon mode will not help. Here's what's happening:
mod_wsgi is spawning multiple identical processes to handle incoming requests for your Django site. Each of these processes is its own Python Interpreter, and can handle an incoming web request. These processes are persistent (they are not brought up and torn down for each request), so a single process may handle thousands of requests one after the other. mod_wsgi is able to handle multiple web requests simultaneously since there are multiple processes.
Each process's Python interpreter will load your modules (your custom Python files) whenever an "import module" is executed. In the context of django, this will happen when a new view.py is needed due to a web request. Once the module is loaded, it resides in memory, and so any changes you make to the file will not be reflected in that process. As more web requests come in, the process's Python interpreter will simply use the version of the module that is already loaded in memory. You are seeing inconsistencies between refreshes since each web request you are making can be handled by different processes. Some processes may have loaded your Python modules during earlier revisions of your code, while others may have loaded them later (since those processes had not received a web request).
The simple solution: Anytime you modify your code, restart the Apache process. Most times that is as simple as running as root from the shell "/etc/init.d/apache2 restart". I believe a simple reload works as well, which is faster, "/etc/init.d/apache2 reload"
The daemon solution: If you are using mod_wsgi in daemon mode, then all you need to do is touch (unix command) or modify your wsgi script file. To clarify scrompt.com's post, modifications to your Python source code will not result in mod_wsgi reloading your code. Reloading only occurs when the wsgi script file has been modified.
Last point to note: I only spoke about wsgi as using processes for simplicity. wsgi actually uses thread pools inside each process. I did not feel this detail to be relevant to this answer, but you can find out more by reading about mod_wsgi.
Because you're using mod_wsgi in embedded mode, your changes aren't being automatically seen. You're seeing them every once in a while because Apache starts up new handler instances sometimes, which catch the updates.
You can resolve this by using daemon mode, as described here. Specifically, you'll want to add the following directives to your Apache configuration:
WSGIDaemonProcess example.com processes=2 threads=15 display-name=%{GROUP}
WSGIProcessGroup example.com
Read the mod_wsgi documentation rather than relying on the minimal information for mod_wsgi hosting contained on the Django site. In partcular, read:
http://code.google.com/p/modwsgi/wiki/ReloadingSourceCode
This tells you exactly how source code reloading works in mod_wsgi, including a monitor you can use to implement same sort of source code reloading that Django runserver does. Also see which talks about how to apply that to Django.
http://blog.dscpl.com.au/2008/12/using-modwsgi-when-developing-django.html
http://blog.dscpl.com.au/2009/02/source-code-reloading-with-modwsgi-on.html
You can resolve this problem by not editing your code on the live server. Seriously, there's no excuse for it. Develop locally using version control, and if you must, run your server from a live checkout, with a post-commit hook that checks out your latest version and restarts Apache.