Queueing or not queueing for low latency - C++

I'm writing a low-latency program in C++ which receives data from a source, processes the data, and sends it to a target via a TCP socket. I have a separate thread for each of these three modules: a receiver thread, a processor thread, and a sender thread. The threads communicate through lock-free queues.
Do you think that sending the message directly and not using the queue for the sender part would give lower latency? Does it affect performance stability?
Thanks

If the three threads are pinned to different physical cores, having a separate sender thread will give lower latency than having the processor thread do the send operation, especially if retries happen during the send. Even with a best-effort async send, you still save the marginal time it takes to write to the socket.
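For illustration, here is a minimal sketch of the queued hand-off between the processor and sender threads, assuming Boost.Lockfree and a POSIX socket; the Message layout, queue capacity, and spin-on-full policy are illustrative choices, not recommendations:

```cpp
#include <boost/lockfree/spsc_queue.hpp>
#include <sys/socket.h>
#include <atomic>
#include <cstddef>

struct Message { char payload[64]; std::size_t len; };

// Single-producer/single-consumer: the processor pushes, the sender pops.
boost::lockfree::spsc_queue<Message, boost::lockfree::capacity<1024>> tx_queue;
std::atomic<bool> running{true};

// Processor thread: hand the message off and return immediately.
void enqueue_for_send(const Message& m) {
    while (!tx_queue.push(m)) { /* queue full: spin, drop, or block (policy choice) */ }
}

// Sender thread: pinned to its own core, it absorbs send() latency and retries
// so the processor's critical path never stalls on the socket.
void sender_loop(int sock_fd) {
    Message m;
    while (running.load(std::memory_order_relaxed)) {
        while (tx_queue.pop(m)) {
            if (send(sock_fd, m.payload, m.len, 0) < 0) {
                /* retry/backoff here without affecting the processor thread */
            }
        }
    }
}
```

The direct-send variant simply replaces enqueue_for_send with the send() call itself; it avoids one queue hop, but any socket stall then lands on the processor's critical path.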

Related

Handling multiple clients simultaneously in C++ UDP server

I have developed a C++ UDP-based server application and I am implementing code to handle multiple clients simultaneously.
I have the following understanding of how to handle multiple clients and want to fill in the knowledge gaps.
My step-by-step understanding is as follows:
1. The UDP server listens at a specific port (say xxxx).
2. The server has a message queue. It can be an array, a linked list, a queue, or anything for that matter.
3. As soon as a request arrives at port xxxx, it is placed in the message queue.
4. After putting it in the message queue, a new thread (let us call it the worker thread) is spawned; it picks up the queued message, which is then removed from the queue.
5. The worker thread learns the client's IP:port from the message header.
6. The worker thread processes the request and sends the response to the client's IP:port.
7. The client gets the response and the worker thread terminates.
Steps 3 to 7 take care of multiple clients being handled simultaneously.
Is my understanding sufficient? Where do I need improvement?
Thanks in advance
The client gets the response and the worker thread terminates.
The worker thread should terminate when it completes processing. There is no practical way for it to wait for an acknowledgement from the client.
The worker thread processes the request and sends the response to the client's IP:port
I think it will be better to place the response on a queue. The main server thread can check the queue and send any responses found there. This prevents race conditions when two worker threads overlap in their attempts to send responses.
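For example, here is a minimal sketch of that response queue, assuming a mutex-guarded std::queue is acceptable; the Response type and function names are illustrative:

```cpp
#include <netinet/in.h>
#include <sys/socket.h>
#include <mutex>
#include <queue>
#include <vector>

struct Response { sockaddr_in client; std::vector<char> data; };

std::queue<Response> response_queue;
std::mutex rq_mutex;

// Called by worker threads: enqueue only, never touch the socket directly.
void post_response(Response r) {
    std::lock_guard<std::mutex> lock(rq_mutex);
    response_queue.push(std::move(r));
}

// Called by the main server thread between recvfrom() calls: drain and send.
void flush_responses(int sock_fd) {
    std::lock_guard<std::mutex> lock(rq_mutex);
    while (!response_queue.empty()) {
        const Response& r = response_queue.front();
        sendto(sock_fd, r.data.data(), r.data.size(), 0,
               reinterpret_cast<const sockaddr*>(&r.client), sizeof(r.client));
        response_queue.pop();
    }
}
```

Since only the main thread calls sendto, responses from overlapping workers can never interleave on the socket.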
The server has a message queue. It can be an array, a linked list, a queue, or anything for that matter.
It pretty much has to be a queue. The interesting question is what priority ordering to use. Initially FIFO will do. If your server becomes overloaded, you will need to consider alternatives: perhaps estimate the processing time each request needs and do the fast ones first, or give different clients different priorities.
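As a tiny sketch of the shortest-job-first alternative, assuming each request carries an estimated cost (the field name is illustrative):

```cpp
#include <queue>
#include <vector>

struct Request { int estimated_ms; /* ... request payload ... */ };

// Orders the queue so the cheapest request is served first.
struct ByCost {
    bool operator()(const Request& a, const Request& b) const {
        return a.estimated_ms > b.estimated_ms;   // smallest cost on top
    }
};

std::priority_queue<Request, std::vector<Request>, ByCost> pending;
```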
After putting it in the message queue, a new thread (let us call it the worker thread) is spawned
This is fine initially. However, you will want to do some time profiling and determine if a thread pool would be advantageous.
Deeper discussion of threading issues
The job processing must be done in a separate worker thread, so that a long job will not block the server from accepting connections from other clients. However, you should consider carefully whether or not you want to use multiple worker threads. Since you are placing the job requests on a queue, a single worker thread can be used to process them one by one.
PRO single thread
Simpler, more reliable code. The processing code must still be thread-safe with respect to the main thread, but there are no context switches between job-processing code paths, which makes the processing code easier to design and debug. For example, if the jobs update a database, you need no extra code to keep the database consistent mid-job; consistency only has to hold at the end of each job.
Faster response for short jobs. If there are many short jobs submitted at the same time, your CPU can spend more cycles switching between jobs than actually doing useful processing.
CON single thread
A big job will block other jobs until it completes.
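To make the single-thread option concrete, here is a minimal sketch of one worker draining the job queue, assuming a condition variable is used to avoid polling; Job and process_job are illustrative placeholders:

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>

struct Job { /* request data */ };
void process_job(const Job& j);          // your processing code

std::queue<Job> job_queue;
std::mutex jq_mutex;
std::condition_variable jq_cv;

// Main thread: enqueue and wake the worker.
void submit_job(Job j) {
    { std::lock_guard<std::mutex> lock(jq_mutex); job_queue.push(std::move(j)); }
    jq_cv.notify_one();
}

// Single worker: jobs run strictly one after another, so there are no
// context switches between job-processing code paths.
void worker_loop() {
    for (;;) {
        std::unique_lock<std::mutex> lock(jq_mutex);
        jq_cv.wait(lock, [] { return !job_queue.empty(); });
        Job j = std::move(job_queue.front());
        job_queue.pop();
        lock.unlock();                   // release before the long work
        process_job(j);
    }
}
```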

I don't understand how async operations can make an HTTP server concurrent

I am developing an HTTP server using Boost.Asio. So far I have been using async operations (async_read, async_write, etc.), but I want to make my server concurrent, that is, like a server that creates a new thread for each newly connected client.
I have read some forums and, apparently, a concurrent server can be made using only the mentioned async operations. I do not understand how this is possible.
I mean, given that the async operations' handlers are executed in the thread that called io_service.run(), suppose a client is being responded to at this moment. How can another client make a request and be answered while that thread is busy with the first client?
The meaning of the word "concurrent" is ambiguous.
You are right that an asynchronous server is not concurrent at all: it can process only one request at a time. But the key insight is that most servers actually take a request, do some light processing (parsing, serialization, validation, some light business logic, etc.), and then call external resources (e.g. a database). The server can then process other requests while waiting for the external resource. So it is only an illusion of concurrency (requests are processed one after another, but very fast), and it works as long as the processing is fast relative to the IO.
If your server is supposed to do hard CPU computations then obviously there will be no concurrency at all. In that case the only way to make it concurrent is to add threads or processes (possibly on multiple machines).
Asynchronous IO does not make the server concurrent.
In fact, asynchronous IO does not mean "multi-threaded" or "multi-processed" at all. Node.js servers are single-threaded and use asynchronous IO.
Asynchronous IO just means your thread does not wait for the IO to finish, but does other stuff meanwhile (like accepting and processing new incoming requests).
So no, the premise that asynchronous IO makes the server concurrent is wrong. It does not make it concurrent; it makes it scalable. Thread-per-request does not scale well, but a proper thread pool plus an event queue or coroutines does: the threads deal only with CPU-bound tasks, while the event queue/coroutines manage enqueuing and dequeuing started and finished IO operations.
Not sure if you're only looking for a theoretical answer or a design example, but have you seen the HTTP Server 3 example for boost.asio?
Concurrency is achieved by having a small thread pool to execute the work. When callbacks need to be handled, any of the threads calling io_service.run() can be chosen to execute the task.
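In outline, that pattern looks like the sketch below, using the (older) io_service spelling, with the connection setup elided:

```cpp
#include <boost/asio.hpp>
#include <thread>
#include <vector>

int main() {
    boost::asio::io_service io_service;
    boost::asio::io_service::work work(io_service); // keep run() from returning early

    // ... set up the acceptor and async_accept/async_read/async_write chains ...

    // Concurrency comes from the pool: any ready handler may run on any of
    // these threads, so shared mutable state must be guarded (or use strands).
    unsigned n = std::thread::hardware_concurrency();
    if (n == 0) n = 2;
    std::vector<std::thread> pool;
    for (unsigned i = 0; i < n; ++i)
        pool.emplace_back([&io_service] { io_service.run(); });

    for (auto& t : pool) t.join();
}
```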

C++ IO/Multiplexed TCP Server and POSIX Threads

I must develop a simple C++ command-line client/server chat application. It must provide a basic implementation of multiple two-participant chat rooms. Is it possible to combine IO multiplexing (the select() syscall) with POSIX threads?
I mean, I want to create a TCP server which handles multiple clients with select(), and when a client wants to chat with another one, the server creates a separate thread, which also uses IO multiplexing via select(), to handle the communication between the two clients.
Is this a good idea? How could I do otherwise?
A crude attempt at an architecture...
Structure your application as two sets of threads (a set might be composed of just one thread).
One set minds the TCP connections. Each TCP connection is assigned to one of the threads in the set, and that thread runs forever, polling the connections assigned to it (incoming messages) and polling a per-thread from-logic queue (outgoing messages).
The other set minds the logic/sessions. Each session is assigned to a specific thread, and each thread runs forever, polling its per-thread from-network queue (incoming messages).
The network thread-set receives messages and posts them to the right logic queue (this assumes there is a way of mapping connections to internal logic sessions). It polls its from-logic queue to get outgoing messages and sends them.
The number of network threads is bounded and does not depend on the number of connections.
The logic thread-set receives requests from the network in its queues, handles them within the given session state, and (perhaps) posts back messages to be sent out by the network threads.
The number of logic threads is bounded and does not depend on the number of sessions.
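A rough sketch of the queue plumbing between the two sets; every type and name here is an illustrative placeholder, not a complete design:

```cpp
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>

struct Msg { int session_id; std::string bytes; };

// One per thread, used in both directions (from-network and from-logic).
struct Mailbox {
    std::queue<Msg> q;
    std::mutex m;
    void post(Msg msg) { std::lock_guard<std::mutex> l(m); q.push(std::move(msg)); }
    bool take(Msg& msg) {
        std::lock_guard<std::mutex> l(m);
        if (q.empty()) return false;
        msg = std::move(q.front()); q.pop();
        return true;
    }
};

// The assumed mapping from sessions to logic threads: a fixed hash keeps
// each session on one thread, so session state needs no further locking.
std::size_t logic_thread_for(int session_id, std::size_t n_logic_threads) {
    return static_cast<std::size_t>(session_id) % n_logic_threads;
}
```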

Multiple threads on I/O completion ports

I'm working on a C++ socket server using I/O completion ports. I create one completion port and two worker threads per processor, as some articles recommend, in order to handle the overlapped I/O activity.
From what I've read, only one of these worker threads should wake up and handle a given I/O request at any time. So when a single client connects and sends something, I would expect only one thread to wake up and handle that receive; but when I debug it, I can see multiple threads waking up and trying to handle the same operation.
Is my assumption wrong?
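For reference, a minimal sketch of the worker loop that pattern implies; each completed operation is dequeued by exactly one GetQueuedCompletionStatus call, so only one worker handles it (the context types here are illustrative):

```cpp
#include <winsock2.h>
#include <windows.h>

DWORD WINAPI WorkerThread(LPVOID param) {
    HANDLE iocp = static_cast<HANDLE>(param);
    for (;;) {
        DWORD bytes = 0;
        ULONG_PTR key = 0;                 // per-connection context
        LPOVERLAPPED ov = nullptr;         // per-operation context
        BOOL ok = GetQueuedCompletionStatus(iocp, &bytes, &key, &ov, INFINITE);
        if (!ok && ov == nullptr) break;   // port closed or wait failed
        // Exactly one worker dequeues this (key, ov) pair; handle the
        // completed receive/send it identifies, then loop for the next one.
    }
    return 0;
}
```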

How to get a Win32 Thread to wait on a work queue and a socket?

I need a client networking thread that can respond both to new messages to be transmitted and to the receipt of new data from the network. I wish to avoid having this thread run a polling loop; it should process work only as needed.
The scenario is as follows:
A client application needs to communicate to a server via a protocol that is largely, but not entirely, synchronous. Typically, the client sends a message to the server and blocks until a response is received.
The server may process client requests asynchronously, in which case the response to the client is not a result, but a notification that processing has begun. A result message is sent to the client at some point in the future, when the server has finished processing the client request.
The asynchronous result notifications can arrive at the client at any time. These notifications need to be processed when they are received, i.e. it is not possible to defer a backlog until the client transmits again.
The client's networking thread must receive and process notifications from the server, and transmit outgoing messages from the client.
To achieve this, I need to make a thread wake to perform processing either when network data is received OR when a message to transmit is enqueued into an input queue.
How can a thread wake to perform processing of an enqueued work item OR data from a socket?
I am interested primarily in using the plain Win32 APIs.
A minimal example or relevant tutorial would be very welcome!
An alternative to I/O Completion Ports for sockets is using WSAEventSelect to associate an event with the socket. Then as others have said, you just need to use another event (or some sort of waitable handle) to signal when an item has been added to your input queue, and use WaitForMultipleObjects to wait for either kind of event.
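A minimal sketch of that arrangement, assuming producers signal queue_event (e.g. via SetEvent) whenever they enqueue a message; the handler bodies are elided:

```cpp
#include <winsock2.h>
#include <windows.h>

void network_thread(SOCKET sock, HANDLE queue_event) {
    WSAEVENT sock_event = WSACreateEvent();
    WSAEventSelect(sock, sock_event, FD_READ | FD_CLOSE);

    HANDLE handles[2] = { sock_event, queue_event };
    for (;;) {
        DWORD r = WaitForMultipleObjects(2, handles, FALSE, INFINITE);
        if (r == WAIT_OBJECT_0) {              // network data is ready
            WSANETWORKEVENTS ev;
            WSAEnumNetworkEvents(sock, sock_event, &ev); // also resets the event
            if (ev.lNetworkEvents & FD_READ)  { /* recv() and process */ }
            if (ev.lNetworkEvents & FD_CLOSE) break;
        } else if (r == WAIT_OBJECT_0 + 1) {   // a message was enqueued
            /* drain the input queue and send() each message */
        } else {
            break;                             // WAIT_FAILED etc.
        }
    }
    WSACloseEvent(sock_event);
}
```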
You can set up an I/O Completion Port for the handles and have your thread wait on the completion port:
http://technet.microsoft.com/en-us/sysinternals/bb963891.aspx
Actually, you can have multiple threads wait on the port (one thread per processor usually works well).
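To wake the port for non-IO work items, producers can post a dedicated completion and workers can tell it apart by the completion key; a minimal sketch (the key value and WorkItem type are illustrative):

```cpp
#include <winsock2.h>
#include <windows.h>

const ULONG_PTR kWorkItemKey = 1;   // sentinel key, distinct from socket keys

struct WorkItem {
    OVERLAPPED ov;                  // first member, so the pointer round-trips
    // ... message to transmit ...
};

void post_work(HANDLE iocp, WorkItem* item) {
    PostQueuedCompletionStatus(iocp, 0, kWorkItemKey, &item->ov);
}
// In the worker loop: if the dequeued key equals kWorkItemKey, recover the
// WorkItem from the OVERLAPPED pointer and transmit it; otherwise it is a
// normal socket completion.
```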
Following on from Michael's suggestion, I have some free code that provides a framework for IO Completion Port style socket stuff; and it includes an IOCP based work queue too. You should be able to grab some stuff from it to solve your problem from here.
Well, if both objects have standard Windows handles, you can have your client call WaitForMultipleObjects to wait on them.
You might want to investigate splitting the servicing of the network port off onto its own thread. That might simplify things greatly. However, it won't help if you just end up having to synchronize something else between that new thread and your main one.