I am using Django Channels for a chat application. This application might scale to 100k concurrent users at some point. I am wondering how many concurrent connections Django Channels can handle. Basically, what I am comparing it with is an XMPP server: is XMPP a better choice for scalability, or should I continue with Django Channels?
Also, I am using Redis as the channel layer. I was wondering if the Redis layer can become a bottleneck at some point?
Thanks
ASGI (Asynchronous Server Gateway Interface) is intended to provide a standard interface between async-capable Python web servers, frameworks, and applications.
In Django Channels, even though Django itself runs in synchronous mode, Channels can handle connections and sockets asynchronously.
WebSockets go to a server called Daphne (an HTTP, HTTP2 and WebSocket protocol server for ASGI and ASGI-HTTP, developed to power Django Channels), which can hold hundreds or potentially thousands of simultaneous connections open at once.
Any event on a WebSocket (connect, message received) is sent onto the channel layer, which is a type of first-in-first-out queue.
Django worker processes synchronously take a message from the queue, process it, and then loop back to take another.
So the number of open connections determines how many Daphne instances you run, and the throughput of events determines how many workers you run.
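To make the moving parts concrete, here is a minimal sketch assuming the channels_redis backend; the Redis host/port and the "chat" group name are placeholders, not anything prescribed by Channels itself:

    # settings.py -- point the channel layer at Redis (host/port are placeholders)
    CHANNEL_LAYERS = {
        "default": {
            "BACKEND": "channels_redis.core.RedisChannelLayer",
            "CONFIG": {"hosts": [("127.0.0.1", 6379)]},
        },
    }

    # consumers.py -- a minimal chat consumer; the "chat" group is illustrative
    from channels.generic.websocket import AsyncJsonWebsocketConsumer

    class ChatConsumer(AsyncJsonWebsocketConsumer):
        async def connect(self):
            # Join a group so any process can push to this socket via the layer
            await self.channel_layer.group_add("chat", self.channel_name)
            await self.accept()

        async def receive_json(self, content):
            # Fan the message out through the channel layer (the FIFO queue above)
            await self.channel_layer.group_send(
                "chat", {"type": "chat.message", "text": content.get("text", "")}
            )

        async def chat_message(self, event):
            # Invoked once per group_send; deliver down this particular socket
            await self.send_json({"text": event["text"]})

        async def disconnect(self, code):
            await self.channel_layer.group_discard("chat", self.channel_name)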
Related
We use gunicorn with Django and django-telegrambot. We also have an MQTT client in its own app. When certain MQTT messages arrive we send Telegram messages, and the other way around. The problem is that when we use gunicorn with multiple workers, we have multiple MQTT clients, so when an MQTT message arrives we send the same Telegram message multiple times.
When we use gunicorn's preload with workers, we only have one MQTT client, but then all processes share the same Telegram TCP connection and we get weird SSL errors. As an alternative we could use only one process with multiple threads, but then MQTT and Telegram messages sometimes don't get processed (I don't know why).
Is there a way to get this running?
Instead of using webhooks one could use bot polling, but django-telegrambot says:
Polling mode by management command (an easy to way to run bot in local machine, not recommended in production!)
I'm not familiar with the django-telegrambot library, so I can't judge why the authors chose to make this statement (maybe ask on the GitHub repository …). However, both polling and webhooks are officially supported by Telegram (see here). IMHO both have pros and cons. Webhooks may have a slight performance benefit over polling, but also require more work to set up. Polling requires you to continuously fetch updates, which can be seen as a downside. OTOH, with webhooks you have to have a webserver running. For small to medium sized bots (in terms of user numbers), polling should be fine - I'm using polling without problems for my (rather small) bots.
Please take this with a grain of salt as I'm far from being an expert on networking topics.
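For reference, this is roughly what polling looks like with python-telegram-bot (the library django-telegrambot builds on), using its v13-style API; the token and the /start handler are placeholders:

    # A minimal long-polling bot (python-telegram-bot v13-style API).
    # No webserver needed: the process fetches updates from Telegram itself.
    from telegram.ext import Updater, CommandHandler

    def start(update, context):
        update.message.reply_text("Hello!")

    updater = Updater(token="YOUR_BOT_TOKEN")  # placeholder token
    updater.dispatcher.add_handler(CommandHandler("start", start))
    updater.start_polling()  # continuously fetch updates
    updater.idle()           # block until interrupted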
What is the main concept of Django ASGI?
Is it that, when there are multiple tasks to be done inside a view, it handles those tasks concurrently and thus reduces the view's response time?
Or that, when there are multiple requests from multiple users at the same time, it handles those requests concurrently so users wait less in the queue?
Channels? Web Socket?
I'm trying to understand and use the ASGI concept, but I'm feeling lost.
Thanks.
ASGI provides an asynchronous/synchronous interface for Python applications to interact with front-end web elements (HTML and scripts). In a sense, because the interface itself handles requests concurrently, it works to reduce response time - it is the reason that Django web servers respond notably quickly. Multiple tasks from multiple users are handled quickly and efficiently, but that is not the main concept.
Most importantly, ASGI provides a method for Python (as well as the Django library) to interact with the frontend HTML page we are showing the user. As was the original triumph of WSGI, ASGI is the upgrade that allows Python to communicate with the web client actively (listening) and then start asynchronous tasks that change values outside the direct scope of what the application is currently doing. Thus, we can start those tasks, serve different values to the user, and continue those tasks in the background uninterrupted.
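To make the interface itself concrete, here is a minimal raw ASGI application (a sketch; any ASGI server such as Daphne or Uvicorn can run it). The server calls the coroutine once per connection, passing scope (connection metadata) plus the receive/send awaitables for events:

    # A minimal raw ASGI application: one coroutine call per connection.
    # "scope" carries connection metadata; "receive"/"send" move events.
    async def app(scope, receive, send):
        assert scope["type"] == "http"
        await send({
            "type": "http.response.start",
            "status": 200,
            "headers": [(b"content-type", b"text/plain")],
        })
        await send({"type": "http.response.body", "body": b"Hello, ASGI!"})

    # Run with an ASGI server, e.g.: daphne module:app  (or: uvicorn module:app)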
I need to create a server in Qt C++ with QTcpServer which can handle many requests at the same time: more than 1000 connections, all of which constantly need to query a MariaDB database.
Before it can be deployed on the main servers, it needs to be able to handle 1000 connections, each querying data as fast as it can, on a 4-core 1 GHz CPU with 2 GB RAM on an Ubuntu virtual machine running in the cloud. The MySQL database is hosted on another, more powerful server.
So how can I implement this? After googling around, I've come up with the following options:
1. Create a new QThread for each SQL query
2. Use QThreadPool for the SQL queries
For the first one, it might create too many threads, which could slow down the system because of all the context switches.
For the second one, after the pool becomes full, other connections have to wait while MariaDB does its work. So what is the best strategy?
Sorry for my bad English.
1) Exclude.
2) Exclude.
3) Instead, let Qt's thread pool do the work. Yes, connections (tasks for connections) have to wait for available threads, but you can easily add 10,000 tasks to the Qt thread pool. If you want, configure the maximum number of threads in the pool, timeouts for tasks, and so on. Of course, you must synchronize data shared between threads with a semaphore/futex/mutex and/or atomics.
MySQL (MariaDB) is a server, and that server can accept many connections at the same time. That is exactly the behaviour you want for your Qt application; MySQL is just the backend holding the data for it.
So your application is itself a server. Put simply: listen on a socket for new connections, save the client connections in a vector/array, and work with each client connection. Whenever you need something (get data from the MySQL backend for a client (yes, with a new, lazily-initialized MySQL connection, separate for each client), read/write data from/to the client, close a connection, etc.), create a new task and add that task to the thread pool. A sketch of the pattern follows below.
This is a very simple explanation, but I hope it helps.
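Here is the pattern sketched in Python for brevity (the Qt analogue would use QTcpServer plus QRunnable/QThreadPool); the port, pool size and echo reply are placeholders:

    # Accept clients and hand each one to a pooled task; a bounded pool
    # means excess work queues up instead of spawning a thread per query.
    import socket
    from concurrent.futures import ThreadPoolExecutor

    pool = ThreadPoolExecutor(max_workers=32)  # tune to CPU/DB capacity

    def handle_client(conn):
        with conn:
            while data := conn.recv(4096):
                # ... open/use this task's own MariaDB connection here ...
                conn.sendall(data)  # echo as a stand-in for the real reply

    server = socket.create_server(("0.0.0.0", 9000))  # placeholder port
    while True:
        client, _addr = server.accept()
        pool.submit(handle_client, client)  # queued if all workers are busy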
Also consider, for the [mysqld] section of my.cnf:
thread_handling=pool-of-threads
Good luck.
I'm new to Django and Channels, and so far I couldn't find any solution to the issue that I face:
I need to communicate with an external WebSocket, process the received data, and then send it to some Channels groups, or maybe start some Celery tasks based on that output.
As I understand it, it's not good practice to put that logic inside a Consumer. What is the right way of doing this in Django?
Thanks
It's probably not best practice to do this in Django in the first place. Django is a web framework that processes individual HTTP requests. Connecting to WebSockets for a potentially long-running process should happen in another component of your architecture.
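One way to structure that separate component is a standalone process that holds the outbound WebSocket connection and forwards into your Channels groups through the shared channel layer. A sketch, assuming the third-party websockets package; the URL, the "feed" group name and the myproject.settings module are placeholders:

    import asyncio, json, os

    import django
    import websockets  # third-party "websockets" package

    os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.settings")  # placeholder
    django.setup()

    from channels.layers import get_channel_layer

    async def relay():
        layer = get_channel_layer()
        # Placeholder URL; reconnect/backoff handling omitted for brevity
        async with websockets.connect("wss://example.com/feed") as ws:
            async for raw in ws:
                data = json.loads(raw)
                # Push into the same group your consumers subscribe to
                await layer.group_send("feed", {"type": "feed.update", "data": data})

    asyncio.run(relay())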
We have a significantly complex Django application currently served by apache/mod_wsgi and deployed on multiple AWS EC2 instances behind an AWS ELB load balancer. Client applications interact with the server using AJAX. They also periodically poll the server to retrieve notifications and updates to their state. We wish to remove the polling and replace it with "push", using web sockets.
Because arbitrary instances handle web socket requests from clients and hold onto those web sockets, and because we wish to push data to clients who may not be on the same instance that provides the source data for the push, we need a way to route data to the appropriate instance and then from that instance to the appropriate client web socket.
We realize that apache/mod_wsgi do not play well with web sockets and plan to replace these components with nginx/gunicorn and use the gevent-websocket worker. However, if one of several worker processes receives requests from clients to establish a web socket, and if the lifetime of worker processes is controlled by the main gunicorn process, it isn't clear how other worker processes, or in fact non-gunicorn processes, can send data to these web sockets.
A specific case is this one: a user who issues an HTTP request is directed to one EC2 instance (host), and the desired behavior is that data is to be sent to another user who has a web socket open on a completely different instance. One can easily envision a system where a message broker (e.g. rabbitmq) running on each instance can be sent a message containing the data to be sent via web sockets to the client connected to that instance. But how can the handler of these messages access the web socket, which was received in a worker process of gunicorn?

The high-level python web socket objects created by gevent-websocket and made available to a worker cannot be pickled (they are instance methods with no support for pickling), so they cannot easily be shared by a worker process with some long-running, external process.
In fact, the root of this question comes down to this: how can web sockets which are initiated by HTTP requests from clients and handled by WSGI handlers in servers such as gunicorn be accessed by external processes? It doesn't seem right that gunicorn worker processes, which are intended to handle HTTP requests, would spawn long-running threads to hang onto web sockets and support handling messages from other processes to send messages to the web sockets attached through those worker processes.

Can anyone explain how web sockets and WSGI-based HTTP request handlers can possibly interplay in the environment I've described?
Thanks.
I think you've made the correct assessment that mod_wsgi + websockets is a nasty combination.
You would find all of your wsgi workers hogged by the web sockets, and an attempt to (massively) increase the size of the worker pool would probably choke the server because of the memory usage and context switching.
If you'd like to stick with the synchronous wsgi worker architecture (as opposed to the reactive approach implemented by gevent, twisted, tornado, etc.), I would suggest looking into uWSGI as an application server. Recent versions can handle some URLs in the old way (i.e. your existing django views would still work the same as before) and route other URLs to an async websocket handler. This might be a relatively smooth migration path for you.
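As a rough illustration of that split, here is a sketch using uWSGI's websocket API; the /ws path, the run flags and the echo reply are placeholders, and the uwsgi module is only importable inside a uWSGI process:

    # Run with something like:
    #   uwsgi --http :8080 --http-websockets --gevent 100 --wsgi-file app.py
    import uwsgi

    def application(env, start_response):
        if env["PATH_INFO"] == "/ws":
            # Upgrade this request to a websocket
            uwsgi.websocket_handshake(env["HTTP_SEC_WEBSOCKET_KEY"],
                                      env.get("HTTP_ORIGIN", ""))
            while True:
                msg = uwsgi.websocket_recv()  # cooperative blocking read
                uwsgi.websocket_send(msg)     # echo back as a stand-in
        # other URLs keep behaving like normal WSGI (e.g. Django) views
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"plain HTTP response"]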
It doesn't seem right that gunicorn worker processes, which are intended to handle HTTP requests would spawn long-running threads to hang onto web sockets and support handling messages from other processes to send messages to the web sockets that have been attached through those worker processes.
Why not? This is a long-running connection, after all. A long-running thread to take care of such a connection would seem... absolutely natural to me.
Often in these evented situations, writing is handled separately from reading.
A worker that is currently handling a websocket connection would wait for relevant messages to come down from a messaging server, and then pass them down the websocket.
You can also use gevent's async-friendly Queues to handle in-code message passing, if you like.
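For instance, a sketch of that reader/writer split with gevent, where ws stands in for the websocket object handed to the worker, and handle_incoming and subscribe_to_broker are hypothetical helpers:

    import gevent
    from gevent.queue import Queue

    outbox = Queue()  # async-friendly: blocks only the waiting greenlet

    def reader(ws):
        while True:
            msg = ws.receive()        # blocks this greenlet, not the process
            if msg is None:
                break                 # client disconnected
            handle_incoming(msg)      # hypothetical application logic

    def writer(ws):
        while True:
            ws.send(outbox.get())     # wakes whenever a message is queued

    def broker_listener():
        for msg in subscribe_to_broker():  # hypothetical broker subscription
            outbox.put(msg)

    # Inside the websocket handler, run all three cooperatively:
    # gevent.joinall([gevent.spawn(reader, ws),
    #                 gevent.spawn(writer, ws),
    #                 gevent.spawn(broker_listener)])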