I run all my Django sites as SCGI daemons. I won't get into the fundamentals of why I do this, but it means that when a site is running, there is a set of processes spawned by the following command:
/websites/website-name/manage.py runfcgi method=threaded host=127.0.0.1 port=3036 protocol=scgi
All is fine until I want to roll out a new release from the VCS (Bazaar in my case). I made an alias called up that does the following:
alias up='bzr up; killall manage.py'
It is this generic for one simple reason: I'm lazy. I want one command that I can use under any site to update it. I'm logged into the server most of the time anyway, so I just hop into the root of the right site and call up. The site updates from Bazaar and restarts.
The first downside of this is that it kills all the manage.py processes on the machine; there are currently 6 sites, and that number is growing rapidly. The second (and potentially worse, at least for end-users) is that it's a severely non-graceful restart. If somebody was uploading an image or doing something else with a long connection time, their request would just die on the vine.
So what I'm looking for is suggestions for a single method that:
Is generic for lazy people like me (e.g. I can run it from any site root without having to remember which command I need to call; 'up' is perfect in name).
Only kills the current site. I'm only updating the current site, so only this one should die.
Does the restart in a graceful manner. If possible, it should wait until there are no more active connections. I've no idea how feasible this is.
Instead of killing everything with manage.py in the name, could you write a script for each site that kills only that site's manage.py processes? (Edit: just write the scripts, put them in the root of each site (which you cd to anyway), and run those – still only one command to remember.)
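A minimal sketch of such a per-site script, assuming each site's runfcgi process carries the site's full manage.py path on its command line (as in the command above); the script name and the use of pgrep/SIGTERM are illustrative, not a prescribed solution:

#!/usr/bin/env python
# up.py -- hypothetical per-site update script; drop one in each site root.
import os
import signal
import subprocess

# Update the working tree first, while the old processes keep serving.
subprocess.check_call(["bzr", "up"])

# pgrep -f matches against the full command line, so only this site's
# manage.py processes are found, not every manage.py on the machine.
site_manage = os.path.join(os.getcwd(), "manage.py")
result = subprocess.run(["pgrep", "-f", site_manage],
                        capture_output=True, text=True)
for pid in result.stdout.split():
    os.kill(int(pid), signal.SIGTERM)

This addresses points 1 and 2 only; it does not wait for active connections, so the restart is no more graceful than killall.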
I don't know enough about SCGI or Bazaar to suggest much more than that... My method (I'm lazy too) uses Mercurial and Fabric for deployment: http://stevelosh.com/blog/entry/2009/1/15/deploying-site-fabric-and-mercurial/ – maybe it will give you an idea you can use?
I'm using Django 1.10 with SQLite as my database back-end.
I have a site running with nginx/uwsgi and the above configuration, which I'm constantly updating with new code.
Each time I want to update the site code, I shut down uwsgi and nginx, git pull the new version from my repo, then restart uwsgi and nginx.
While this works, in the sense that the site is updated with the new version of my code, the side-effect is that if a user is currently logged in to the site and working on something (which will generally result in modifying the db), the user's work will be interrupted.
Worse still is the case where the new version of my code contains migrations, which will change the db structure, with (unpredictable?) consequences for the user's ongoing work.
So the question is: is there a way, like a command-line script, to check db.sqlite3 and see whether a user is currently logged in, before deciding to shut down uwsgi and nginx?
I would say no. There are cases like:
a user who logged in and didn't log out for a while;
a user who is logged in but inactive.
You can find logged-in users who didn't log out in the django_session table, but that won't tell you whether they are actively working.
A better approach would be to minimize the downtime. As a start, there's no reason to restart nginx when you update your code or uwsgi, and you can git pull without affecting the currently running code, so you don't need to stop uwsgi before pulling in the new code.
When it comes to migrations, try to avoid any migrations that will break the currently running code. For example, don't delete fields/models that are still used by the current code, but wait until the next update (when those fields/models are no longer in use). That way you can run your migrations while you're still running the old code, without generating any errors.
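As an illustration, removing a model field safely could be split across two releases; the app, model, and field names here are hypothetical:

# Release N: stop referencing Article.legacy_note anywhere in the code,
# but leave the field on the model, so the still-running old code works.
# Release N+1: only now drop the field and ship the generated migration:
from django.db import migrations

class Migration(migrations.Migration):
    dependencies = [("articles", "0007_previous_migration")]
    operations = [
        migrations.RemoveField(model_name="article", name="legacy_note"),
    ]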
Next up, you should reload the uwsgi process rather than stopping and starting it manually. This will finish handling any open requests before reloading the process. It will also keep on listening for new requests, so that these can be handled immediately after the process has reloaded. Users may experience a slowdown, but this will not drop any requests unless the queue fills up before the process can be reloaded. For small-scale websites this should never happen.
So, to avoid downtime you can use this to update your website:
$ git pull # pull in new code while the old process is still running
$ python manage.py migrate # run your migrations, and possibly `collectstatic` etc.
$ kill -HUP `cat <pidfile>` # gracefully reload the uwsgi process
I was wondering why I always got the wrong (old) result right after correcting my code, but not for long: soon afterwards the program would return the right result. Is that normal?
Sounds like you are not restarting the built-in Django development server manually after making the changes:
The development server automatically reloads Python code for each request, as needed. You don't need to restart the server for code changes to take effect. However, some actions like adding files don't trigger a restart, so you'll have to restart the server in these cases.
As the documentation says, sometimes in order to see the changes you need to restart the server manually.
Also, sometimes Django dev server reloader doesn't see the changes right away and it takes some time for the server to notice changes and trigger restart. If you see this often, restart the server manually.
Also note that in Django 1.7, kernel signals are used to autoreload the server on Linux; this should make it pick up the changes and restart faster:
Changed in Django 1.7: If you are using Linux and install pyinotify, kernel signals will be used to autoreload the server (rather than polling file modification timestamps each second). This offers better scaling to large projects, reduction in response time to code modification, more robust change detection, and battery usage reduction.
My goal is to create an application that will be able to do long-lasting mainly system tasks, such as:
checking out code from the repositories,
copying directories between various locations,
etc.
The problem is that I need these tasks to run independently of the web browser. I mean that, for example, after starting the checkout/copy action, closing the web browser should not interrupt the action. So when I go back to the site, I can see that the copying is still going on, or that another action has started while the browser was closed...
I have searched through various tools, like RabbitMQ + Celery, Twisted, Pyro, and XML-RPC, but I don't know if any of these is suitable for me. Has anyone encountered similar needs when creating a Django app? Please let me know if there are any methods/packages that I should know about. Code samples will also be more than welcome!
Thank you in advance for your suggestions!
(And sorry for my bad English. I'm working on it.)
Basically you need to have a process that runs outside of the request. The absolute simplest way to do this (on a Unix-like operating system, at least) is to fork():
import os
import sys

if os.fork() == 0:
    # Child process: do the long-running work, then exit.
    do_long_thing()
    sys.exit(0)
# Parent process: continue handling the request.
This has some downsides, though (e.g., if the server crashes, the "long thing" will be lost), which is where something like Celery comes in handy. It will keep track of the jobs that need to be done and the results of jobs (success/failure/whatever), and it makes it easy to run the jobs on other machines.
Using Celery with a Redis backend (see Kombu's Redis transport) is very simple, so I would recommend looking there first.
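For a feel of what that looks like, here is a minimal sketch of a Celery task using a Redis broker (modern Celery API; the module name, broker URL, and task body are assumptions for illustration):

# tasks.py -- a hypothetical Celery app backed by Redis.
import subprocess
from celery import Celery

app = Celery("tasks", broker="redis://localhost:6379/0")

@app.task
def checkout_code(repo_url, dest):
    # The long-running work happens in the worker, outside any web request.
    subprocess.check_call(["bzr", "checkout", repo_url, dest])

A view would call checkout_code.delay(repo_url, dest) and return immediately; a separate worker process (started with something like celery -A tasks worker) picks the job up and keeps running even if the browser is closed.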
You might need to have a process outside the request / response cycle. If that is the case, Celery with a Redis backend is what I would suggest looking into, as that integrates nicely with Django (as David Wolever suggested).
Another option is to create Django management commands, and then use cron to execute them at scheduled intervals.
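A minimal sketch of that approach, with hypothetical app and command names:

# myapp/management/commands/sync_repos.py
from django.core.management.base import BaseCommand

class Command(BaseCommand):
    help = "Runs long system tasks outside the request/response cycle."

    def handle(self, *args, **options):
        # Checkout/copy work goes here; cron can run it on a schedule, e.g.:
        #   */15 * * * * /path/to/python /path/to/manage.py sync_repos
        self.stdout.write("syncing repositories...")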
I'm reading http://code.google.com/p/modwsgi/wiki/ReloadingSourceCode but it seems like way too much work. I've been restarting my apache2 server gracefully whenever I make tweaks to Django code, as it inconsistently picks up the right files and probably tries to rely on cached .pyc files.
I set up Django with mod_wsgi using the steps outlined at this blog post.
It automatically reflects updates (although every now and then there will be a delay for a few minutes; I never figured out why, nor is it that much of an inconvenience).
If you are having to restart your Apache server, then you can't be using mod_wsgi daemon mode. Use daemon mode, and then simply touching the WSGI script file when an atomic set of changes has been completed isn't that hard, and it's certainly safer than a system that restarts arbitrarily when it detects any single change. If you do want automatic restarts based on code changes, that is described in that document as well. For a Django slant on it, read:
http://blog.dscpl.com.au/2008/12/using-modwsgi-when-developing-django.html
http://blog.dscpl.com.au/2009/02/source-code-reloading-with-modwsgi-on.html
What is it about what is documented there which is 'way too much work'?
I'm developing a Django site. I'm making all my changes on the live server, just because it's easier that way. The problem is, every now and then it seems to like to cache one of the *.py files I'm working on. Sometimes if I hit refresh a lot, it will switch back and forth between an older version of the page, and a newer version.
My set up is more or less like what's described in the Django tutorials: http://docs.djangoproject.com/en/dev/howto/deployment/modwsgi/#howto-deployment-modwsgi
I'm guessing it's doing this because it's firing up multiple instances of the WSGI handler, and depending on which handler the HTTP request gets sent to, I may receive different versions of the page. Restarting Apache seems to fix the problem, but it's annoying.
I really don't know much about WSGI or "MiddleWare" or any of that request handling stuff. I come from a PHP background, where it all just works :)
Anyway, what's a nice way of resolving this issue? Will running the WSGI handler in "daemon mode" alleviate the problem? If so, how do I get it to run in daemon mode?
Running the process in daemon mode will not help. Here's what's happening:
mod_wsgi is spawning multiple identical processes to handle incoming requests for your Django site. Each of these processes is its own Python interpreter and can handle an incoming web request. These processes are persistent (they are not brought up and torn down for each request), so a single process may handle thousands of requests one after the other. mod_wsgi is able to handle multiple web requests simultaneously because there are multiple processes.
Each process's Python interpreter will load your modules (your custom Python files) whenever an "import module" is executed. In the context of Django, this will happen when a views.py is needed due to a web request. Once the module is loaded, it resides in memory, and so any changes you make to the file will not be reflected in that process. As more web requests come in, the process's Python interpreter will simply use the version of the module that is already loaded in memory. You are seeing inconsistencies between refreshes since each web request you make can be handled by a different process. Some processes may have loaded your Python modules during earlier revisions of your code, while others may have loaded them later (since those processes had not yet received a web request).
The simple solution: any time you modify your code, restart the Apache process. Most times that is as simple as running /etc/init.d/apache2 restart as root from the shell. I believe a simple reload works as well and is faster: /etc/init.d/apache2 reload
The daemon solution: if you are using mod_wsgi in daemon mode, then all you need to do is touch (the Unix command) or otherwise modify your WSGI script file. To clarify scrompt.com's post, modifications to your Python source code will not result in mod_wsgi reloading your code. Reloading occurs only when the WSGI script file has been modified.
Last point to note: I spoke about mod_wsgi as using processes only for simplicity; mod_wsgi actually uses thread pools inside each process. I did not feel this detail to be relevant to this answer, but you can find out more by reading about mod_wsgi.
Because you're using mod_wsgi in embedded mode, your changes aren't being picked up automatically. You're seeing them every once in a while because Apache sometimes starts up new handler instances, which pick up the updates.
You can resolve this by using daemon mode, as described here. Specifically, you'll want to add the following directives to your Apache configuration:
WSGIDaemonProcess example.com processes=2 threads=15 display-name=%{GROUP}
WSGIProcessGroup example.com
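With daemon mode in place, reloading after a deploy is just a matter of updating the WSGI script file's timestamp, as described above. A one-line sketch of doing that from a deploy script; the script path is hypothetical:

import os
# "Touch" the WSGI script so the daemon process reloads on the next request.
os.utime("/srv/example.com/example.wsgi", None)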
Read the mod_wsgi documentation rather than relying on the minimal information about mod_wsgi hosting contained on the Django site. In particular, read:
http://code.google.com/p/modwsgi/wiki/ReloadingSourceCode
This tells you exactly how source code reloading works in mod_wsgi, including a monitor you can use to implement the same sort of source code reloading that the Django runserver does. Also see the following posts, which talk about how to apply that to Django:
http://blog.dscpl.com.au/2008/12/using-modwsgi-when-developing-django.html
http://blog.dscpl.com.au/2009/02/source-code-reloading-with-modwsgi-on.html
You can resolve this problem by not editing your code on the live server. Seriously, there's no excuse for it. Develop locally using version control, and if you must, run your server from a live checkout, with a post-commit hook that checks out your latest version and restarts Apache.
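A minimal sketch of such a hook, assuming a git repository on the server; all paths are hypothetical, and the hook would need sufficient privileges for the Apache reload (with daemon mode, touching the WSGI script file, as described above, would do instead):

#!/usr/bin/env python
# post-receive -- hypothetical server-side hook in a bare repository.
import subprocess

# Check the newly pushed code out into the live directory.
subprocess.check_call([
    "git", "--git-dir=/srv/repos/mysite.git",
    "--work-tree=/srv/mysite", "checkout", "-f",
])
# Reload Apache so the new code is picked up.
subprocess.check_call(["/etc/init.d/apache2", "reload"])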