I am designing a backend server for a messaging protocol. There is one master server and multiple other servers (which I will call secondary servers). The master server is in charge of keeping data consistent between the secondary servers, so it needs to stay connected to all of the secondary servers constantly (keep-alive).
I'm writing it in C++ using socket programming. The main role of the master server is to listen for requests made by the secondary servers and occasionally serve information to them. I don't know which of the two methods below fits my program better:
Option 1: Spawn a thread for each secondary-server connection at startup, so that each thread constantly listens to the secondary server it's connected to. Keep it alive, and whenever the master server needs to serve data to the secondary servers, spawn a new thread (or maybe use a thread pool for this?) to handle the task.
Option 2: The main thread initially connects to all the secondary servers. Using select() (or poll()), it listens on all connections simultaneously, and whenever a new request arrives from a secondary server, it hands the task to a thread pool.
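For reference, here is a minimal sketch of Option 2 using plain POSIX sockets; handle_request() and the commented-out pool.submit call are placeholders for whatever implementations you end up using:

#include <sys/select.h>
#include <vector>

void handle_request(int fd);                 // placeholder: read and serve one request

void serve_forever(const std::vector<int>& secondary_fds) {
    for (;;) {
        fd_set readfds;
        FD_ZERO(&readfds);
        int maxfd = -1;
        for (int fd : secondary_fds) {       // watch every keep-alive connection
            FD_SET(fd, &readfds);
            if (fd > maxfd) maxfd = fd;
        }
        if (select(maxfd + 1, &readfds, nullptr, nullptr, nullptr) < 0)
            continue;                        // interrupted; real code should check errno
        for (int fd : secondary_fds)
            if (FD_ISSET(fd, &readfds))
                handle_request(fd);          // or: pool.submit([fd]{ handle_request(fd); });
    }
}

Note that select() is limited to FD_SETSIZE descriptors; poll() or epoll avoids that cap.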
Related
I am using thrift to provide an interface between a device and a management console. It is possible for there to be up to 4 active connections to the device at one time, and I have this working using a TThreadPoolServer.
The issue arises around client disconnections: if a client disconnects correctly, there is no issue; however, if one does not (i.e. the client crashes or doesn't call client->close()), then the server seems to keep that client's thread alive. This means that when the next connection attempt is made, the client hangs, as the server has used up its allocated thread pool and so cannot service the new request.
I haven't been able to find any standard, public mechanism by which the server can stop, and hence free up, a client's thread if that client has not used the interface for a set time period.
Is there a standard way to facilitate this in thrift?
Setting the receive/send timeout on the server socket might help. The server will close the connection on timeout.
https://github.com/apache/thrift/blob/129f332d72facda5d06f87e2b4e5e08bea0b6b44/lib/cpp/src/thrift/transport/TServerSocket.h#L103
void setSendTimeout(int sendTimeout);
void setRecvTimeout(int recvTimeout);
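For example, a sketch in C++ (the port and timeout values are arbitrary; depending on your Thrift version the pointer type may be boost::shared_ptr instead of std::shared_ptr):

#include <thrift/transport/TServerSocket.h>
#include <memory>

using apache::thrift::transport::TServerSocket;

int main() {
    // Close connections that send/receive nothing for 30 seconds, so a
    // crashed client eventually frees up its thread in the pool.
    auto serverSocket = std::make_shared<TServerSocket>(9090);
    serverSocket->setSendTimeout(30 * 1000);   // milliseconds
    serverSocket->setRecvTimeout(30 * 1000);   // milliseconds
    // ...then hand serverSocket to your TThreadPoolServer as usual.
}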
I must develop a simple C++ command-line client/server chat application. The application must provide a basic implementation of multiple two-participant chat rooms. Is it possible to combine I/O multiplexing (the select() syscall) with POSIX threads?
I mean, I want to create a TCP server that handles multiple clients with select(), and when a client wants to chat with another one, the server creates a separate thread, which also uses I/O multiplexing via select(), to handle the communication between the two clients.
Is this a good idea? How else could I do it?
A crude attempt at an architecture...
Structure your application as two sets of threads (a set might be composed of just one thread).
One set minds the TCP connections. Each TCP connection is assigned to one of the threads in the set; the thread runs forever, polling the connections assigned to it (incoming messages) and polling a per-thread from-logic queue (outgoing messages).
The other set minds the logic/session. Each session is assigned to a specific thread. Each thread just runs forever polling the (per-thread) from-network queue (incoming messages).
The network thread-set receives messages and posts them to the right logic queue [this assumes there's a way of mapping connections to internal logic sessions]. It polls its from-logic queue to get the outgoing messages and sends them.
The number of network threads is bounded, and it does not depend on the number of connections.
The logic thread-set receives requests from the network in its queue, handles them within a given session's state, and (perhaps) posts back messages to be sent out (by the network threads).
The number of logic threads is bounded, and it does not depend on the number of sessions.
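A minimal sketch of the glue between the two sets, assuming a simple locked queue; the Message type and the session mapping are placeholders:

#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>

struct Message { int session_id; std::string payload; };

class MessageQueue {
public:
    void push(Message m) {
        { std::lock_guard<std::mutex> lk(mtx_); q_.push(std::move(m)); }
        cv_.notify_one();
    }
    Message pop() {                          // blocking: used by the logic threads
        std::unique_lock<std::mutex> lk(mtx_);
        cv_.wait(lk, [this] { return !q_.empty(); });
        Message m = std::move(q_.front());
        q_.pop();
        return m;
    }
    bool try_pop(Message& out) {             // non-blocking: used by the network threads
        std::lock_guard<std::mutex> lk(mtx_);
        if (q_.empty()) return false;
        out = std::move(q_.front());
        q_.pop();
        return true;
    }
private:
    std::mutex mtx_;
    std::condition_variable cv_;
    std::queue<Message> q_;
};

// Each network thread polls its sockets, pushes inbound messages to the
// from-network queue of the right logic thread, and try_pop()s its own
// from-logic queue for outbound messages. Each logic thread loops on pop().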
I am writing a Cassandra thrift client wrapper library in C++. Each of my application threads makes connections to several Cassandra nodes in one data center of a multiple-data-center setup.
I am planning to have a dedicated thread that handles on-the-fly TCP connection failures between the application and the Cassandra nodes: it retries the connections and switches to another data center when several nodes in one data center are unreachable. This thread will be informed about connection failures by the application threads.
My question is: how should this process/thread work? Is there any thrift call or CQL query that can tell me the currently up/down nodes in a data center (like nodetool does), or do I need to retry making connections and run dummy queries against every down node to check?
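For the mechanics of the dedicated thread, a rough sketch (Endpoint, try_connect(), the retry interval, and the failover threshold are all placeholders; this is not a Cassandra-specific API):

#include <chrono>
#include <mutex>
#include <set>
#include <string>
#include <thread>
#include <tuple>

struct Endpoint {
    std::string host;
    int port;
    bool operator<(const Endpoint& o) const {
        return std::tie(host, port) < std::tie(o.host, o.port);
    }
};

std::mutex failed_mtx;
std::set<Endpoint> failed;            // endpoints reported down by application threads

bool try_connect(const Endpoint&);    // placeholder: attempt a TCP/Thrift connection

void monitor_loop() {
    for (;;) {
        std::this_thread::sleep_for(std::chrono::seconds(5));   // retry interval
        std::set<Endpoint> snapshot;
        { std::lock_guard<std::mutex> lk(failed_mtx); snapshot = failed; }
        for (const auto& ep : snapshot) {
            if (try_connect(ep)) {    // node reachable again: mark it usable
                std::lock_guard<std::mutex> lk(failed_mtx);
                failed.erase(ep);
            }
        }
        // If most nodes of the local data center stay in `failed`,
        // switch the application's preferred endpoints to another DC.
    }
}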
We have a significantly complex Django application currently served by apache/mod_wsgi and deployed on multiple AWS EC2 instances behind an AWS ELB load balancer. Client applications interact with the server using AJAX. They also periodically poll the server to retrieve notifications and updates to their state. We wish to remove the polling and replace it with "push", using web sockets.

Because arbitrary instances handle web socket requests from clients and hold onto those web sockets, and because we wish to push data to clients who may not be on the same instance that provides the source data for the push, we need a way to route data to the appropriate instance and then from that instance to the appropriate client web socket.

We realize that apache/mod_wsgi does not play well with web sockets and plan to replace these components with nginx/gunicorn and use the gevent-websocket worker. However, if one of several worker processes receives requests from clients to establish a web socket, and if the lifetime of worker processes is controlled by the main gunicorn process, it isn't clear how other worker processes, or in fact non-gunicorn processes, can send data to these web sockets.

A specific case is this one: a user who issues an HTTP request is directed to one EC2 instance (host), and the desired behavior is for data to be sent to another user who has a web socket open on a completely different instance. One can easily envision a system where a message broker (e.g. rabbitmq) running on each instance can be sent a message containing the data to be sent via web sockets to the client connected to that instance. But how can the handler of these messages access the web socket, which was received in a worker process of gunicorn? The high-level Python web socket objects created by gevent-websocket and made available to a worker cannot be pickled (they are instance methods with no support for pickling), so they cannot easily be shared by a worker process with some long-running, external process.

In fact, the root of this question comes down to this: how can web sockets, which are initiated by HTTP requests from clients and handled by WSGI handlers in servers such as gunicorn, be accessed by external processes? It doesn't seem right that gunicorn worker processes, which are intended to handle HTTP requests, would spawn long-running threads to hang onto web sockets and support handling messages from other processes to send messages to the web sockets that have been attached through those worker processes.

Can anyone explain how web sockets and WSGI-based HTTP request handlers can possibly interplay in the environment I've described?
Thanks.
I think you've made the correct assessment that mod_wsgi + websockets is a nasty combination.
You would find all of your WSGI workers hogged by the web sockets, and an attempt to (massively) increase the size of the worker pool would probably choke the server because of the memory usage and context switching.
If you'd like to stick with the synchronous WSGI worker architecture (as opposed to the reactive approach implemented by gevent, twisted, tornado, etc.), I would suggest looking into uWSGI as an application server. Recent versions can handle some URLs the old way (i.e. your existing django views would still work the same as before) and route other URLs to an async websocket handler. This might be a relatively smooth migration path for you.
It doesn't seem right that gunicorn worker processes, which are intended to handle HTTP requests, would spawn long-running threads to hang onto web sockets and support handling messages from other processes to send messages to the web sockets that have been attached through those worker processes.
Why not? This is a long-running connection, after all. A long-running thread to take care of such a connection would seem... absolutely natural to me.
Often in these evented situations, writing is handled separately from reading.
A worker that is currently handling a websocket connection would wait for a relevant message to come down from a messaging server, and then pass it down the websocket.
You can also use gevent's async-friendly Queues to handle in-code message passing, if you like.
I am writing a server in linux that is supposed to serve an API.
Initially, I wanted to make it multithreaded on a single port, meaning that I'd have multiple threads working on the various requests received on a single port.
One of my friends told me that that is not the way it is supposed to work. He told me that when a request is received, I first have to follow a handshake procedure, create a thread that listens on some other port dedicated to the request, and then redirect the requesting client to the new port.
Theoretically it's very interesting, but I could not find any information on how to implement the handshake and do the redirection. Can someone help?
If I'm not wrong in interpreting your responses: once I create a multithreaded server with a main thread listening on a port, which creates a new thread to handle each request, I'm essentially making it multithreaded on a single port?
Consider the scenario where I get a large number of requests every second. Isn't it true that every request on the port would have to wait for the "current" request to complete? If not, how would the communication still be done? Say a browser sends a request; the thread handling it has to first listen on the port, block it, process the request, respond, and then unblock it.
By this reasoning, even though I have "multiple threads", I'm only using one thread at a time apart from the main thread, because the port is being blocked.
What your friend told you is similar to passive FTP: a client tells the server that it needs a connection, the server sends back a port number, and the client creates a data connection to that port.
But all you wanted is a multithreaded server. All you need is one server socket listening for and accepting connections on a given port. As soon as the automatic TCP handshake is finished, you'll get a new socket from the accept function; that socket will be used for communication with the client that has just connected. Now you only have to create a new thread, passing that client socket to the thread function. In your server thread, you then call accept again in order to accept another connection.
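In code, the pattern looks roughly like this (a sketch with error handling mostly omitted; handle_client() is a placeholder for your per-client protocol logic):

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <thread>

void handle_client(int client_fd);    // placeholder: talk to this client, then close(client_fd)

int main() {
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = INADDR_ANY;
    addr.sin_port = htons(8080);      // the single port all clients connect to
    bind(listen_fd, reinterpret_cast<sockaddr*>(&addr), sizeof addr);  // check errors in real code
    listen(listen_fd, SOMAXCONN);
    for (;;) {
        int client_fd = accept(listen_fd, nullptr, nullptr);   // TCP handshake already done
        if (client_fd < 0) continue;
        std::thread(handle_client, client_fd).detach();        // one thread per client
    }
}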
TCP/IP does the handshake for you; if you can't think of any reason to do an application-level handshake, then your application doesn't demand one.
An example of an application specific handshake could be for user authentication.
What your colleague is suggesting sounds like the way FTP works. This is not a good thing to do: the internet these days is more or less built on protocols that use a single port, and having a separate command port is bad. One of the reasons is that stateful firewalls aren't designed for multi-port applications; they have to be extended for each individual application that does things this way.
Look at ASIO's tutorial on async TCP. There, one part accepts TCP connections and spawns handlers that each communicate with a single client. That's how TCP servers usually work (including HTTP/web, the most common TCP protocol).
You may disregard the asynchronous parts of ASIO if you're set on creating a thread per connection; they don't apply to your question. (Going fully async with one worker thread per core is nice, but it might not integrate well with the rest of your environment.)
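For orientation, the accept-and-spawn-a-handler pattern from that tutorial looks roughly like this (a sketch assuming a recent Boost.Asio; the per-connection session logic is omitted):

#include <boost/asio.hpp>
#include <memory>

using boost::asio::ip::tcp;

void start_accept(tcp::acceptor& acceptor) {
    auto socket = std::make_shared<tcp::socket>(acceptor.get_executor());
    acceptor.async_accept(*socket,
        [&acceptor, socket](boost::system::error_code ec) {
            if (!ec) {
                // hand *socket to a per-connection handler (chat session, echo, ...)
            }
            start_accept(acceptor);    // keep accepting further clients
        });
}

int main() {
    boost::asio::io_context io;
    tcp::acceptor acceptor(io, tcp::endpoint(tcp::v4(), 8080));
    start_accept(acceptor);
    io.run();                          // single-threaded event loop
}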