We have a significantly complex Django application currently served by
apache/mod_wsgi and deployed on multiple AWS EC2 instances behind an
AWS ELB load balancer. Client applications interact with the server
using AJAX. They also periodically poll the server to retrieve notifications
and updates to their state. We wish to remove the polling and replace
it with "push", using web sockets.
Because arbitrary instances handle web socket requests from clients
and hold onto those web sockets, and because we wish to push data to
clients who may not be on the same instance that provides the source
data for the push, we need a way to route data to the appropriate
instance and then from that instance to the appropriate client web
socket.
We realize that apache/mod_wsgi do not play well with web sockets and
plan to replace these components with nginx/gunicorn and use the
gevent-websocket worker. However, if one of several worker processes
receives requests from clients to establish a web socket, and if the
lifetime of worker processes is controlled by the main gunicorn
process, it isn't clear how other worker processes, or in fact
non-gunicorn processes can send data to these web sockets.
A specific case is this one: A user who issues an HTTP request is
directed to one EC2 instance (host) and the desired behavior is that data is
to be sent to another user who has a web socket open in a completely
different instance. One can easily envision a system where a message
broker (e.g. rabbitmq) running on each instance can be sent a message
containing the data to be sent via web sockets to the client connected
to that instance. But how can the handler of these messages access
the web socket, which was received in a worker process of gunicorn?
The high-level Python web socket objects created by gevent-websocket and
made available to a worker cannot be pickled (they are instance
methods with no support for pickling), so they cannot easily be shared
by a worker process with some long-running, external process.
In fact, the root of this question comes down to this: how can web sockets
that are initiated by HTTP requests from clients and handled by WSGI
handlers in servers such as gunicorn be accessed by external
processes? It doesn't seem right that gunicorn worker processes,
which are intended to handle HTTP requests would spawn long-running
threads to hang onto web sockets and support handling messages from
other processes to send messages to the web sockets that have been
attached through those worker processes.
Can anyone explain how web sockets and WSGI-based HTTP request
handlers can possibly interplay in the environment I've described?
Thanks.
I think you've made the correct assessment that mod_wsgi + websockets is a nasty combination.
You would find all of your wsgi workers hogged by the web sockets and an attempt to (massively) increase the size of the worker pool would probably choke the server because of the memory usage and context switching.
If you would like to stick with the synchronous WSGI worker architecture (as opposed to the reactive approach implemented by gevent, twisted, tornado etc), I would suggest looking into uWSGI as an application server. Recent versions can handle some URLs in the old way (i.e. your existing Django views would still work the same as before), and route other URLs to an async websocket handler. This might be a relatively smooth migration path for you.
"It doesn't seem right that gunicorn worker processes, which are intended to handle HTTP requests would spawn long-running threads to hang onto web sockets and support handling messages from other processes to send messages to the web sockets that have been attached through those worker processes."
Why not? This is a long-running connection, after all. A long-running thread to take care of such a connection would seem... absolutely natural to me.
Often in these evented situations, writing is handled separately from reading.
A worker that is currently handling a websocket connection would wait for a relevant message to come down from a messaging server, and then pass that down the websocket.
You can also use gevent's async-friendly Queues to handle in-code message passing, if you like.
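To make the in-process message passing concrete, here is a minimal sketch of one way it could look. It uses the standard library's queue.Queue and threads as stand-ins so it runs anywhere; gevent.queue.Queue offers the same put/get interface. The names register, deliver, writer_loop and FakeSocket are hypothetical, and the only thing assumed about the websocket object is that it has a send() method. The idea is that the broker consumer runs inside the same worker process that holds the socket, so the socket itself never has to be pickled or shared.

```python
import queue
import threading

# Hypothetical per-process registry: user id -> outbound queue for that user's socket.
_outboxes = {}
_lock = threading.Lock()


class FakeSocket:
    """Stand-in for a websocket object; only send() is assumed."""

    def __init__(self):
        self.sent = []

    def send(self, data):
        self.sent.append(data)


def register(user_id):
    """Called by the worker that accepted the websocket for user_id."""
    q = queue.Queue()
    with _lock:
        _outboxes[user_id] = q
    return q


def deliver(user_id, payload):
    """Called by a broker consumer running inside the same worker process.

    Returns False when the user is not connected to this process.
    """
    with _lock:
        q = _outboxes.get(user_id)
    if q is None:
        return False
    q.put(payload)
    return True


def writer_loop(ws, q):
    """Runs for the lifetime of the connection; None is the shutdown signal."""
    while True:
        item = q.get()
        if item is None:
            break
        ws.send(item)


def demo():
    ws = FakeSocket()
    q = register("alice")
    writer = threading.Thread(target=writer_loop, args=(ws, q))
    writer.start()
    delivered = deliver("alice", "you have mail")
    missed = deliver("bob", "nobody home")  # bob has no socket in this process
    q.put(None)
    writer.join()
    return delivered, missed, ws.sent
```

With gevent the writer loop would be a greenlet rather than a thread, and a message for a user connected elsewhere would simply be ignored by every process except the one that holds that user's queue.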
I am designing a backend server for some messaging protocol. In the program, there is one master server and multiple servers (which I will call secondary servers). The master server is in charge of keeping data consistency between secondary servers so the master server needs to be connected to all the secondary servers constantly (keep-alive).
As I'm using C++, I'm using socket programming. The main role of the master server is to listen to the requests made by the secondary servers and occasionally serve information to them. I don't know which of the two methods below will fit my program more:
1. Spawn a thread for each secondary server connection in the beginning so that each thread is constantly listening to the secondary server it's connected to. Keep it alive, and whenever the master server needs to serve data to the secondary servers, spawn a new thread (or maybe use a thread pool for this?) to handle the task.
2. The main thread initially connects to all the secondary servers. Using select() (or poll()), the main thread listens to all connections simultaneously, and whenever a new request is made from a secondary server, it uses a thread pool to handle the task.
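For illustration, the second option (one thread multiplexing every connection, with a pool doing the work) might look roughly like the sketch below. It is written in Python rather than C++ to keep it short; the selectors module wraps select()/poll()/epoll, so the structure carries over. The handle function, the serve_once helper and the use of socketpair() in place of real network connections are assumptions made for the demo.

```python
import selectors
import socket
from concurrent.futures import ThreadPoolExecutor


def handle(data):
    """Hypothetical request handler; here it just upper-cases the payload."""
    return data.upper()


def serve_once(conns, pool, expected):
    """Wait on all connections at once and hand each request to the pool.

    Stops after `expected` requests so that the sketch terminates.
    """
    sel = selectors.DefaultSelector()
    for conn in conns:
        sel.register(conn, selectors.EVENT_READ)
    pending = []
    while len(pending) < expected:
        for key, _events in sel.select(timeout=1.0):
            data = key.fileobj.recv(1024)
            if not data:  # peer closed the connection
                sel.unregister(key.fileobj)
                expected -= 1
                continue
            pending.append((key.fileobj, pool.submit(handle, data)))
    for conn, future in pending:
        conn.sendall(future.result())
    sel.close()


def demo():
    # socketpair() stands in for the master <-> secondary connections.
    pairs = [socket.socketpair() for _ in range(3)]
    masters = [a for a, _b in pairs]
    secondaries = [b for _a, b in pairs]
    for i, s in enumerate(secondaries):
        s.sendall(f"req{i}".encode())
    with ThreadPoolExecutor(max_workers=2) as pool:
        serve_once(masters, pool, expected=3)
    replies = [s.recv(1024) for s in secondaries]
    for a, b in pairs:
        a.close()
        b.close()
    return replies
```

The main thread never blocks on a single peer, and the number of worker threads is independent of the number of connections, which is the main difference from the thread-per-connection option.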
I am using Django Channels for a chat application. This application might scale to 100k concurrent users in some amount of time. I am wondering how many concurrent connections Django Channels can handle. Basically, I am comparing it with an XMPP server: is XMPP a better choice for scalability, or should I continue with Django Channels?
Also, I am using a redis layer as the channel layer. I was wondering if the redis layer can become a bottleneck at some point?
Thanks
ASGI (Asynchronous Server Gateway Interface) is intended to provide a standard interface between async-capable Python web servers, frameworks, and applications.
In Django Channels, even though Django itself runs in a synchronous mode, connections and sockets can be handled asynchronously.
Websockets go into a server called Daphne (an HTTP, HTTP2 and WebSocket protocol server for ASGI and ASGI-HTTP, developed to power Django Channels), which can handle hundreds or potentially thousands of simultaneous open connections.
Any event on a websocket (connect, message received) is sent onto the channel layer, which is a type of first-in-first-out queue.
Django worker processes synchronously take a message from the queue, process it, and then loop back to take another.
So, the number of open connections affects how many Daphne instances you run. And throughput of events affects the number of workers you run.
Given an AWS Elastic-Beanstalk Worker box, is it possible to use Flask/port:80 to serve the messages coming in from the associated SQS queue?
I have seen conflicting information about what is going on, inside an ELB-worker. The ELB Worker Environment page says:
Elastic Beanstalk simplifies this process by managing the Amazon SQS queue and running a daemon process on each instance that reads from the queue for you. When the daemon pulls an item from the queue, it sends an HTTP POST request locally to http://localhost/ on port 80 with the contents of the queue message in the body. All that your application needs to do is perform the long-running task in response to the POST.
This SO question Differences in Web-server versus Worker says:
The most important difference in my opinion is that worker tier instances do not run web server processes (apache, nginx, etc).
Based on this, I would have expected that I could just run a Flask-server on port 80, and it would handle the SQS messages. However, the post appears incorrect. Even the ELB-worker boxes have Apache running on them, apparently for doing health-checks (when I stopped it, my server turned red). And of course it's using port 80...
I already have Flask/Gunicorn on an EC2 server that I was trying to move to ELB, and I would like to keep using that - is it possible? (Note: the queue-daemon only posts messages to port 80, that can't be changed...)
The docs aren't clear, but it sounds like they expect you to modify Apache to proxy to Flask, maybe? I hope that's not the only way.
Or, what is the "correct" way of setting up an ELB-worker to process the SQS messages? How are you supposed to "perform the long-running task"?
Note: now that I've used ELB more, and have a fairly good understanding of it, let me make clear that this is not the use-case that Amazon designed the ELB-workers for, and it has some glitches (which will be noted). The standard use-case, basically, is that you create a simple Flask app, and hook it into an ELB-EC2 server that is configured to make it simple to run that Flask app.
My use-case was, I already had an EC2 server with a large Flask app, running under gunicorn, as well as various other things going on. I wanted to use that server (as an image) to build the ELB server, and have it respond to SQS-queue messages. It's possible there are better solutions, like just writing a queue-polling daemon, and that no-one else will ever take this option, but there it is...
The ELB worker is connected to an SQS queue, by a daemon that listens to that queue, and (internally) posts any messages to http://localhost:80. Apache is listening on port 80. This is to handle health-checks, which are done by the ELB manager (or something in the eco-system). Apache passes non-health-check-requests, using mod_wsgi, to the Flask app that was uploaded, which is at:
/opt/python/current/app/application.py
I suspect it would be possible but difficult to remove Apache and handle the health-checks some other way (flask), thus freeing up port 80. But that's enough of a change that I decided it wasn't worth it.
So the solution I found is to change which port the local daemon posts to: by reconfiguring it via a YAML config-file, it will post to port 5001, where my Flask app was running. This means Apache can continue to handle the health-checks on port 80, and Flask can handle the SQS messages from the daemon.
You configure the daemon, and stop/start it (as root):
/etc/aws-sqsd.d/default.yaml
/opt/elasticbeanstalk/addons/sqsd/hooks/stop-sqsd.sh
/opt/elasticbeanstalk/addons/sqsd/hooks/start-sqsd.sh
/opt/elasticbeanstalk/addons/sqsd/hooks/restart-sqsd.sh
Actual daemon:
/opt/elasticbeanstalk/lib/ruby/bin/aws-sqsd
/opt/elasticbeanstalk/lib/ruby/lib/ruby/gems/2.2.0/gems/aws-sqsd-2.3/bin/aws-sqsd
Glitches:
If you ever use the ELB GUI to configure daemon options, it will over-write the config-file, and you will have to re-edit the Port (and re-start the daemon).
Note: All of the HTTP traffic is internal, either to the ELB eco-system or the worker - so it is possible to close off all external ports (I keep 22 open), such as Port 80. Otherwise your Worker has Apache responding to http://:80 posts, meaning it's open to the world. I assume the server is configured fairly securely, but Port 80 doesn't need to be open at all, for healthchecks or anything else.
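For reference, the receiving side of this setup can be very small. Below is a minimal sketch of an endpoint that accepts the daemon's POST, using only the standard library instead of Flask so it is self-contained. QueueMessageHandler, processed and demo are hypothetical names; the demo binds an ephemeral port rather than 5001 so it can run anywhere. As far as I understand the Elastic Beanstalk worker documentation, a 200 response is what tells the daemon the message was handled, so treat that part as an assumption to verify against the docs.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

processed = []  # bodies of the queue messages received so far


class QueueMessageHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The daemon puts the queue message in the request body.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        processed.append(body.decode())  # the long-running task would go here
        self.send_response(200)  # assumed to mean "handled" to the daemon
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def demo():
    # Port 0 asks the OS for a free port; the real app would listen on 5001.
    server = HTTPServer(("127.0.0.1", 0), QueueMessageHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request("POST", "/", body=b'{"job": 1}')  # what the daemon would send
    status = conn.getresponse().status
    conn.close()
    server.shutdown()
    server.server_close()
    return status, processed
```

An existing Flask route that accepts POST on the configured port and returns a 200 plays the same role.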
I've got a server running in the background and a program which should display data from the server. I want to somehow invoke a method in my program from the server. So the server should be the sender, but how do I do that?
There is no reason why a server can't also be a client, just implement the interfaces from both sides and you're good.
The main thing to worry about is deadlocking: if you have a single threaded program which is waiting for the reply of the server, then it will not handle the request that the server sends, so the server is stuck and will not send a reply to the program.
This can be solved by starting the server implementations on different threads and letting them not block on the client thread.
Even better is to avoid having a server send back requests before sending replies, but cascading requests (forward requests to more specialized servers) should be no problem.
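A minimal sketch of the threading fix described above, with queues standing in for the network connections. All names here (server, callback_listener, to_server, to_program, reply_box) are hypothetical. The server calls back into the program before it replies, and the callback is served on a separate thread, so the main thread can block waiting for the reply without deadlocking.

```python
import queue
import threading

to_server = queue.Queue()   # program -> server requests
to_program = queue.Queue()  # server -> program callbacks
reply_box = queue.Queue()   # server -> program replies
displayed = []


def server():
    """Before replying to a request, the server calls back into the program."""
    request = to_server.get()
    ack = queue.Queue()
    to_program.put(("display", f"progress for {request}", ack))
    ack.get(timeout=2)  # would time out if nobody were serving callbacks
    reply_box.put(f"done: {request}")


def callback_listener():
    """Runs in its own thread so callbacks are served while main is blocked."""
    method, arg, ack = to_program.get()
    if method == "display":
        displayed.append(arg)
    ack.put("ok")


def demo():
    threads = [
        threading.Thread(target=server),
        threading.Thread(target=callback_listener),
    ]
    for t in threads:
        t.start()
    to_server.put("load-data")
    reply = reply_box.get(timeout=2)  # main thread blocks here
    for t in threads:
        t.join()
    return reply, displayed
```

If the callback were handled on the main thread instead, the main thread would be stuck waiting on reply_box while the server waits on ack, which is exactly the deadlock described.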
I was thinking of creating a web service that does a long running process. What would be the best way to design these to work with a load balancer? I can't think of any way of doing it besides writing a custom queue.
That is exactly what you should do. You typically want your web service calls to be a quick request/response. So make a call to the web service, have the web service queue the work then have worker processes pick up the messages from the queue and process them.
This is the way to go. Queuing the long-running processes lets you add recovery logic if a process fails, lets you scale quickly by adding workers to process the queue, and, best of all, does not tie up the client waiting for a response.
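As a rough sketch of that pattern, with an in-process queue.Queue standing in for a real broker and threads standing in for worker processes (submit, worker and the results dictionary are hypothetical names, and sum() is a placeholder for the long-running work):

```python
import queue
import threading
import uuid

jobs = queue.Queue()  # stand-in for the external message queue
results = {}          # job id -> status; a real system would persist this
results_lock = threading.Lock()


def submit(payload):
    """What the web service call does: enqueue the work and return at once."""
    job_id = str(uuid.uuid4())
    with results_lock:
        results[job_id] = "queued"
    jobs.put((job_id, payload))
    return job_id


def worker():
    """Pull jobs until a None sentinel arrives; record success or failure."""
    while True:
        item = jobs.get()
        if item is None:
            jobs.task_done()
            break
        job_id, payload = item
        try:
            status = f"done: {sum(payload)}"  # placeholder for the real task
        except Exception as exc:  # recovery logic would hook in here
            status = f"failed: {exc}"
        with results_lock:
            results[job_id] = status
        jobs.task_done()


def demo(num_workers=2):
    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    good = submit([1, 2, 3])
    bad = submit(["x"])  # sum() raises on this payload, exercising the failure path
    jobs.join()          # wait until both jobs have been processed
    for _ in threads:
        jobs.put(None)
    for t in threads:
        t.join()
    return results[good], results[bad]
```

The client gets the job id back immediately and can poll for the status later; adding capacity means starting more workers against the same queue.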
Redis (http://redis.io/) has been my choice over the past few years; if you are using Azure or AWS, they have messaging services as well.
You can also use websockets to notify the client when processes are completed to keep the UI state in the loop.