Does Django Channels work as intended as a WSGI app? - django

I am trying to implement Django Channels because I need to have users receive notifications when another user does something, and I am completely confused by this part:
http://channels.readthedocs.io/en/stable/deploying.html
Deploying applications using channels requires a few more steps than a
normal Django WSGI application, but you have a couple of options as to
how to deploy it and how much of your traffic you wish to route
through the channel layers.
Firstly, remember that it’s an entirely optional part of Django. If
you leave a project with the default settings (no CHANNEL_LAYERS),
it’ll just run and work like a normal WSGI app.
The problem is that I have quite limited rights on the shared hosting that I am using and therefore, I can't use the runworker command.
The quote above says that this part is "optional" and that without it, it'll work like a normal WSGI app. But can I use Django Channels with a normal WSGI app? If not, then doesn't that mean that it's not optional at all?
So my question is: if I skip this part, will the channels still work, and will I be able to use the things shown on this page (routing, sending messages, etc.): http://channels.readthedocs.io/en/stable/getting-started.html ?

From reading the docs, what I get is: first you need a backend (e.g. Redis, possibly sharded) to run the channel layer, and then you run "runworker". But since that's not an option for you, have a look at this: http://channels.readthedocs.io/en/stable/backends.html
"""The in-memory layer is only useful when running the protocol server and the worker server in a single process; the most common case of this is runserver, where a server thread, this channel layer, and worker thread all co-exist inside the same python process."""
So by avoiding a third-party backend, you can use the in-memory ASGI layer and just run 'runserver', and the channel layer is set up. Just look for the in-memory subtopic in the link.
And if you keep CHANNEL_LAYERS empty, Django will work as a WSGI app; but what we need is an ASGI app, and ASGI is required for Channels.
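For illustration, here is a minimal sketch of that in-memory setup, following the Channels 1.x docs linked above (the "myproject.routing" path is hypothetical):

    # settings.py -- minimal sketch of the in-memory channel layer (Channels 1.x)
    CHANNEL_LAYERS = {
        "default": {
            "BACKEND": "asgiref.inmemory.ChannelLayer",
            # hypothetical path to the channel_routing list from the
            # getting-started page
            "ROUTING": "myproject.routing.channel_routing",
        },
    }

With this in place, ./manage.py runserver runs the interface server, the channel layer, and the worker thread in a single process, so no separate runworker command is needed.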

Related

Django and Celery - How to Distribute?

I'm trying to distribute Django and Celery.
I've created a small project with Django and Celery. Django will request a Celery Worker to work on some data on the database. Then the data is passed back to Django.
My idea is that:
Django stack installed on one server
Message queue (RabbitMQ) on one server
Celery worker on one server
Hence 3 Servers in Total
However, the problem is that Celery has to use some code from Django, for example the models, because it accesses them. Hence, it would also need the settings.py file to know where the servers are.
Does this mean that for #3, I would need to install Django and Celery on the server, but disable Django and only run celery? For example celery -A PROJECT_NAME worker -l INFO, but without an Apache Server for Django?
If you want your celery workers to operate on a different server, you need to make sure that all the resources required by the worker are accessible from that server.
For example, if you have a simple task, you can copy only the code required for that task to the server. If your worker needs any other resources, like other code, files, or the database, you need to make sure it has access.
Really, if you want to have two servers working on the same tasks, you will have to use a simple web interface (such as Flask) to communicate between the servers (and extend the functionality of your queue). Then, you will have to ensure they are both using the same data source.
Consider hosting your database remotely, or have the remote server access the database remotely. Either way, any workers running on a server will need access to the database and all source code necessary to complete the task. Then, you must simply have the two servers share a messaging queue.
Source: how to configure and run celery worker on remote system
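As a concrete sketch of that layout (module names are illustrative, following the celery -A PROJECT_NAME convention from the question), you deploy the same project code on the worker server and point both sides at the shared broker and database through settings.py:

    # PROJECT_NAME/celery.py -- deployed identically on the Django server
    # and the worker server (standard Celery 3.1 Django integration)
    import os
    from celery import Celery
    from django.conf import settings

    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "PROJECT_NAME.settings")

    app = Celery("PROJECT_NAME")
    app.config_from_object("django.conf:settings")  # BROKER_URL etc. live here
    app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)

In settings.py, BROKER_URL points at the RabbitMQ server and DATABASES at the shared database host; on worker server #3 you then run only celery -A PROJECT_NAME worker -l INFO, with no Apache in front of it.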

concurrent requests on dotcloud with django

I have a django app I want to migrate to dotcloud.
Many actions in Django internals and in my app are not asynchronous, i.e. they block the thread until they finish.
When I was using Apache, that didn't pose a problem, since a different thread is opened on every request. But that doesn't seem to be the case with the nginx/uwsgi setup that dotCloud uses.
Seemingly, uwsgi has --enable-threads and --threads options that can be used for multithreading, but:
It is not clear what version of uwsgi dotCloud uses, and whether it supports these features.
Since I see no one else asking about this, I was wondering whether this is really the right way to get concurrent requests running (using threads).
You could run Django with Gunicorn. Gunicorn, in turn, supports multiple worker classes, and people have reported success running gunicorn+gevent+django together[1][2].
To use that on dotCloud, you will probably have to use dotCloud's custom service. If that's something you want to try, I would personally start with dotCloud's reimplementation of the python service on top of the custom service, and replace uwsgi with gunicorn in it.
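A minimal sketch of what that swap might look like (file name and counts are illustrative); Gunicorn reads a plain Python config file passed via -c:

    # gunicorn_conf.py -- gevent worker setup
    bind = "0.0.0.0:8000"
    worker_class = "gevent"  # requires the gevent package to be installed
    workers = 2              # processes; each juggles many green threads

It would then be started with something like gunicorn -c gunicorn_conf.py myproject.wsgi:application (the module path is hypothetical).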
I came here looking for some leads, which I found, thanks!
There was a fair amount of leg work left to actually get stuff working, though.
Here is an example app on github that uses gunicorn, gevent, and socketio on dotcloud:
https://github.com/t1m0thy/django-tictactoe/tree/dotcloud
Threads are a problem in Python: the GIL doesn't allow them to execute Python code simultaneously.
So multiprocessing is an answer.
Or you may take a look at gevent. Actually, gevent is a kind of hack (monkey patching of the Python stack), but it lets you launch green threads.
I'm not sure if gevent can be combined with Django, but Google knows ;)
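For reference, the monkey patching mentioned above is a one-time call made before any other imports (a minimal sketch):

    # must run before anything imports socket, ssl, time, etc.
    from gevent import monkey
    monkey.patch_all()  # swaps blocking stdlib I/O for cooperative green versions

After this, a blocking call in one request yields to other green threads instead of stalling the whole process.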

Serve multiple Django and PHP projects on the same machine?

The documentation states that one should not serve static files on the same machine as the Django project, because static content will kick the Django application out of memory. Does this problem also arise from having multiple Django projects on one server? Should I combine all my website projects into one very large Django project?
I'm currently serving Django along with PHP scripts from Apache with mod_wsgi. Does this also cause a loss of efficiency?
Or is the warning just meant for static content, because the problem arises when serving hundreds of files, while serving 20-30 different PHP/Django projects is OK?
I would say that this setup is completely OK. Of course, it depends on the hardware, the load, and the other projects. But here you can just try and monitor the usage/performance.
The suggestion to use different server(s) for static files makes sense, as it is more efficient with resources. But as long as one server performs well enough, I don't see a reason to use a second one.
Another question - which has less to do with performance than with ease of use/configuration - is the decision if you really want to run everything on the same server.
For one setup with a bunch of smaller sites (and as well some php-legacy) we use one machine with four virtual servers:
webhead running nginx (and varnish)
database
simple apache2/php server
django server using gunicorn + supervisord
nginx handles all the sites, either proxying to the application server or serving static content (via NAS). I like this setup, as it is very easy to install and handle, and it makes it simple to scale out one piece if needed.
If the documentation says """one should not serve static files on the same machine as the Django project, because static content will kick the Django application out of memory""", then the documentation is very misleading and arguably plain wrong.
The one suggestion I would make, if using PHP on the same system, is that you ensure you are using mod_wsgi daemon mode for running the Python web application, and even one daemon process group per Python web application.
Do not run the Python web application in embedded mode, because that means you are running it in the same process as mod_php; and because PHP, including its extensions, is not really multithread safe, that in turn means you have to be running the prefork MPM. Running Python web applications embedded in Apache under the prefork MPM is a bad idea unless you know very well how to set up Apache properly for it. Don't set up Apache right, and you get issues like those described in:
http://blog.dscpl.com.au/2009/03/load-spikes-and-excessive-memory-usage.html
The short of it is that Apache configuration for PHP and Python need to be quite different. You can get around that though by using mod_wsgi daemon mode for the Python web application.
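As an illustrative sketch of that advice (site names, paths, and counts are hypothetical), daemon mode puts each Python application in its own process group, away from the mod_php-laden Apache child processes:

    # httpd.conf -- one mod_wsgi daemon process group per Django site
    # (mod_wsgi 4.x style; names and paths are hypothetical)
    WSGIDaemonProcess siteA processes=2 threads=15
    WSGIScriptAlias /siteA /srv/siteA/wsgi.py process-group=siteA application-group=%{GLOBAL}
    # PHP continues to run in the normal prefork Apache child processes.

The process/thread counts are just placeholders; the point is the separate process group per application.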

Using Twisted for asynchronous file uploads from Django app

We have a Django app that needs to post messages and upload files from the web server to another server via an XML API. We need to do X asynchronous file uploads and then make another XML API request when they have finished uploading. I'd also like the files to stream from disk without having to load them completely into memory first. Finally, I need to send the files as application/octet-stream in a POST body (rather than a more typical form data MIME type) and I wasn't able to find a way to do this with urllib2 or httplib.
I ended up integrating Twisted into the app. It seemed perfect for this task, and sure enough I was able to write a beautifully clean implementation with deferreds for each upload. I use my own IBodyProducer to read the data from the file in chunks and send it to the server in a POST request body. Unfortunately, I then found out that the Twisted reactor cannot be restarted, so I can't just run it and then stop it whenever I want to upload files. Since Twisted is apparently used more for full-blown servers, I'm now wondering whether this was the right choice.
I'm not sure if I should:
a) Configure the WSGI container (currently I'm testing with manage.py) to start a Twisted thread on startup and use blockingCallFromThread to trigger my file uploads.
b) Use Twisted as the WSGI container for the Django app. I'm assuming we'll want to deploy later on Apache and I'm not sure what the implications are if we take this route.
c) Simply can Twisted and use some other approach for the file uploads. Kind of a shame since the Twisted approach with deferreds is elegant and works.
Which of these should we choose, or is there some other alternative?
Why would you want to deploy later on Apache? Twisted is rad. I would do (b) until someone presented specific, compelling reasons not to. Then I would do (a). Fortunately, your application code looks the same either way. blockingCallFromThread works fine whether Twisted is your WSGI container or not - either way, you're just dealing with running code in a separate thread than the reactor is running in.
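For option (a), a minimal sketch of that thread arrangement (function names are illustrative; do_uploads stands in for your deferred-returning upload code):

    # start the reactor once, in a daemon thread, at application startup
    import threading
    from twisted.internet import reactor
    from twisted.internet.threads import blockingCallFromThread

    def start_reactor():
        t = threading.Thread(
            target=reactor.run,
            # signal handlers can only be installed in the main thread
            kwargs={"installSignalHandlers": False},
        )
        t.daemon = True
        t.start()

    def run_uploads(do_uploads, *args):
        # Runs do_uploads(*args) in the reactor thread; it must return a
        # Deferred. Blocks this (Django) thread until the Deferred fires.
        return blockingCallFromThread(reactor, do_uploads, *args)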

Does a production DJango server fork?

I have been storing some information in global vars in my Django views. This information can be accessed by every thread in the Python Django process. However, I am wondering about how Django behaves in production. Does a production Django process fork() multiple times to handle requests? If so, this data would not be the same across processes. Does anyone know if Django forks?
I'm sure that it depends on your deployment, but if you are running it under FastCGI or WSGI, then yes, it generally pre-forks a number of server processes to handle incoming requests.
I don't know about running under mod_python, but I think that is being discouraged these days in favour of WSGI.
I'm not an expert in this field so I'm answering based only on the grep-ing I've just done.
The fastcgi server seems to be able to fork, depending on configuration settings:
http://code.djangoproject.com/browser/django/tags/releases/1.2.3/django/core/servers/fastcgi.py#L171
http://code.djangoproject.com/browser/django/tags/releases/1.2.3/django/utils/daemonize.py
As for WSGI, I believe that on the Django side, handling goes straight to request processing:
http://code.djangoproject.com/browser/django/tags/releases/1.2.3/django/core/handlers/wsgi.py#L217
and forking is configured in mod_wsgi: http://code.google.com/p/modwsgi/ - embedded mode vs daemon mode - and/or in Apache (worker vs prefork builds).
For mod_wsgi, read:
http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading
It explains the various models and gives guidelines on the use of common data across threads/processes. The situation isn't much different for other hosting systems.
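To illustrate the original concern (a hypothetical view; names are illustrative): under a pre-forking server, each worker process gets its own copy of module-level globals, so they silently diverge.

    import os
    from django.http import HttpResponse

    HITS = 0  # module-level global: one independent copy per worker process

    def counter(request):
        global HITS
        HITS += 1
        # Requests land on different workers, so HITS appears to jump
        # around; use the cache or database for state shared across processes.
        return HttpResponse("pid=%s hits=%s" % (os.getpid(), HITS))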