Why is mod_python running entire django stack from beginning with each request?

Why is mod_python running entire django stack from beginning with each request? - django

My understanding is that mod_python loads the python process into apache, avoiding the overhead of doing that on each call. My expectation was that this would mean that my django stack would also only be loaded once.
What I am observing, however, is that every request is running the entire django stack from the beginning, as though it were the first request. The settings are re-imported. Middleware __init__'s, which are supposed to be run once at django startup, are run each time. And so forth. It seems to be essentially like I would expect CGI to be.
Is this expected behavior? I have mostly worked with mod_wsgi, which I believe does not work this way, but I have to use mod_python for my current client.
Thanks!

Apache on UNIX systems is a multiprocess system as pointed out by someone else. Also make sure the MaxRequestsPerChild hasn't been set to be 1 in Apache configuration for some reason. Ideally that directive should be set to 0, meaning keep processes around and not recycle them based on number of requests.

It loads Django once per httpd process. Since multiple processes start (each child being a process), multiple instances of Django are started.

Related

Does Django run in single thread by default？

By reading the code, I found it seems that Django run in single thread by default.
However, when I use sleep(15) in my view function and open two web to request my function. They return the response almost at the same time!
so, I do not know why does it happened……
my Django version is 1.9

Django itself does not determine whether it runs in one or more threads. This is the job of the server running Django.
The development server used to be single-threaded, but in recent versions it has been made multithreaded. Other servers such as Apache/mod_wsgi, gunicorn, or uwsgi, have their own defaults and can be configured in a number of ways; often they use multiple processes rather than threads.

Django Reloading on every request

I wanted to load some data and keep it in memory as in application scope. Basing on this and other posts in stackoverflow, I've put the required code snippet in settings.py, urls.py, models.py. I also put print statements to see when it gets executed. I see all the print statements in the server log with every request.
The following are the version details:
Linux 2.6.32-358.el6.x86_64
Apache/2.2.15 (Unix)
Django 1.4
Python 2.7.4
Looks like django is re-loading for every request. I also looked into this and confirmed with the admin that MaxRequestsPerChild is NOT 1.

If you are running in mod_wsgi embedded mode, you will have a multi process configuration, thus can take a while to warm up all processes with your code. Also, Apache will kill off idle processes and so you will see process churn. So what you may be seeing is the result of that.
Add to your debug code the printing out of the process ID to confirm this.
The easiest thing to do is use mod_wsgi daemon mode and restrict yourself to a small fixed number of persistent processes.
Also go watch my PyCon talk about this sort of stuff at:
http://lanyrd.com/2013/pycon/scdyzk/

Does a production DJango server fork?

I have been storing some information in global vars in my DJango views. This information can be accessed by every thread in the Python Django process. However, I am wondering about how Django behaves in production. Does a production Django process fork() multiple times to handle requests? If so this data would not be the same across processes. Does anyone know if Django forks?

I'm sure that it depends on your deployment, but if you are running it under FastCGI or WSGI, then yes, it generally pre-forks a number of server processes to handle incoming requests.
I don't know about running under mod_python, but I think that is being discouraged these days in favour of WSGI.

I'm not an expert in this field so I'm answering based only on the grep-ing I've just done.
The fastcgi server seems to be able to fork, depending on configuration settings:
http://code.djangoproject.com/browser/django/tags/releases/1.2.3/django/core/servers/fastcgi.py#L171
http://code.djangoproject.com/browser/django/tags/releases/1.2.3/django/utils/daemonize.py
As for WSGI, I believe that Django side-handling is going straight to the request processing:
http://code.djangoproject.com/browser/django/tags/releases/1.2.3/django/core/handlers/wsgi.py#L217
and forking is configured in mod_wsgi: http://code.google.com/p/modwsgi/ - embedded mode vs daemon mode - and/or in Apache (worker vs prefork builds).

For mod_wsgi, read:
http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading
It explains the various models and guidelines in respect to use of common data across threads/processes. Situation isn't much different for other hosting systems.

Is there an "easy" way to get mod_wsgi to reflect Django updates?

I'm reading http://code.google.com/p/modwsgi/wiki/ReloadingSourceCode but it seems like way too much work, I've been restarting my apache2 server gracefully whenever I make tweaks to Django code as it inconsistently picks up the right files and probably tries to rely on cached .pycs.

I setup Django using mod_wsgi using the steps outlined at this blog post.
It automatically reflects updates (although every now and then, there will be a delay for a few minutes - never figure out why nor is it that much of an inconvenience).

If you are having to restart your Apache server then you can't be using mod_wsgi daemon mode. Use daemon mode and then simply touching the WSGI script file when an atomic set of changes have been completed isn't that hard and certainly safer than a system which restarts arbitrarily when it detects any single change. If you do want automatic restart based on code changes, then that is described in that document as well. For a Django slant on it, read:
http://blog.dscpl.com.au/2008/12/using-modwsgi-when-developing-django.html
http://blog.dscpl.com.au/2009/02/source-code-reloading-with-modwsgi-on.html
What is it about what is documented there which is 'way too much work'?

Django + WSGI: Refreshing Issues?

I'm developing a Django site. I'm making all my changes on the live server, just because it's easier that way. The problem is, every now and then it seems to like to cache one of the *.py files I'm working on. Sometimes if I hit refresh a lot, it will switch back and forth between an older version of the page, and a newer version.
My set up is more or less like what's described in the Django tutorials: http://docs.djangoproject.com/en/dev/howto/deployment/modwsgi/#howto-deployment-modwsgi
I'm guessing it's doing this because it's firing up multiple instances of of the WSGI handler, and depending on which handler the the http request gets sent to, I may receive different versions of the page. Restarting apache seems to fix the problem, but it's annoying.
I really don't know much about WSGI or "MiddleWare" or any of that request handling stuff. I come from a PHP background, where it all just works :)
Anyway, what's a nice way of resolving this issue? Will running the WSGI handler is "daemon mode" alleviate the problem? If so, how do I get it to run in daemon mode?

Running the process in daemon mode will not help. Here's what's happening:
mod_wsgi is spawning multiple identical processes to handle incoming requests for your Django site. Each of these processes is its own Python Interpreter, and can handle an incoming web request. These processes are persistent (they are not brought up and torn down for each request), so a single process may handle thousands of requests one after the other. mod_wsgi is able to handle multiple web requests simultaneously since there are multiple processes.
Each process's Python interpreter will load your modules (your custom Python files) whenever an "import module" is executed. In the context of django, this will happen when a new view.py is needed due to a web request. Once the module is loaded, it resides in memory, and so any changes you make to the file will not be reflected in that process. As more web requests come in, the process's Python interpreter will simply use the version of the module that is already loaded in memory. You are seeing inconsistencies between refreshes since each web request you are making can be handled by different processes. Some processes may have loaded your Python modules during earlier revisions of your code, while others may have loaded them later (since those processes had not received a web request).
The simple solution: Anytime you modify your code, restart the Apache process. Most times that is as simple as running as root from the shell "/etc/init.d/apache2 restart". I believe a simple reload works as well, which is faster, "/etc/init.d/apache2 reload"
The daemon solution: If you are using mod_wsgi in daemon mode, then all you need to do is touch (unix command) or modify your wsgi script file. To clarify scrompt.com's post, modifications to your Python source code will not result in mod_wsgi reloading your code. Reloading only occurs when the wsgi script file has been modified.
Last point to note: I only spoke about wsgi as using processes for simplicity. wsgi actually uses thread pools inside each process. I did not feel this detail to be relevant to this answer, but you can find out more by reading about mod_wsgi.

Because you're using mod_wsgi in embedded mode, your changes aren't being automatically seen. You're seeing them every once in a while because Apache starts up new handler instances sometimes, which catch the updates.
You can resolve this by using daemon mode, as described here. Specifically, you'll want to add the following directives to your Apache configuration:
WSGIDaemonProcess example.com processes=2 threads=15 display-name=%{GROUP}
WSGIProcessGroup example.com

Read the mod_wsgi documentation rather than relying on the minimal information for mod_wsgi hosting contained on the Django site. In partcular, read:
http://code.google.com/p/modwsgi/wiki/ReloadingSourceCode
This tells you exactly how source code reloading works in mod_wsgi, including a monitor you can use to implement same sort of source code reloading that Django runserver does. Also see which talks about how to apply that to Django.
http://blog.dscpl.com.au/2008/12/using-modwsgi-when-developing-django.html
http://blog.dscpl.com.au/2009/02/source-code-reloading-with-modwsgi-on.html

You can resolve this problem by not editing your code on the live server. Seriously, there's no excuse for it. Develop locally using version control, and if you must, run your server from a live checkout, with a post-commit hook that checks out your latest version and restarts Apache.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js