Business logic in an IO completion port - C++

I have some doubts regarding I/O completion ports as well as AcceptEx in Winsock2.
Please correct me if I am wrong.
AcceptEx is an overlapped way of accepting connections. However, as pointed out by multiple posts on this site, AcceptEx is prone to a DoS attack if it is expecting data that the connected client never sends. So, can this be solved simply by passing 0 for dwReceiveDataLength?
Besides, what is the advantage of being able to receive data from the client while accepting the connection with AcceptEx, rather than receiving the data later?
After a connection from the remote endpoint is accepted and associated with the I/O completion port, requests are queued on the completion port as completion packets associated with their respective handles. The worker threads that block on the completion port are woken up, subject to NumberOfConcurrentThreads, to serve the requests. So, are the threads blocking on the completion port the I/O threads?
So, where should I implement the business logic in the socket server? For example, a client sends numbers to the server for processing and the server acts like a calculator, responding by echoing back the calculated output. Can this logic be implemented in the I/O completion port threads?
If the logic is implemented there (i.e. in the I/O threads that are active on the completion port, the same threads performing WSARecv or WSASend), would the I/O threads block while waiting for the calculation to finish, so that no new connection can be accepted once the backlog is full?
EDITED:
For example, after the client socket is accepted and associated with the I/O completion port (main_cpl_port), threads that block on main_cpl_port call GetQueuedCompletionStatus to dequeue a completion packet and subsequently read data into an allocated buffer. Before any response is written back to the client, the buffer is parsed for a "command" (e.g. GoToCalculator, GoToRecorder).
For example, GoToCalculator is responsible for calculation-related commands.
In this case, GoToCalculator is actually another I/O completion port that caters for all requests related to calculation. Let's say this completion port is named calc_completion_port.
Thus, is it possible for the completion packet from main_cpl_port to be posted to calc_completion_port for future I/O (send and recv) on the client socket, which is currently associated with main_cpl_port? Is this what PostQueuedCompletionStatus is used for?
Can a message sent from the client, once posted to calc_completion_port, be received by the threads that block on that completion port?
In other words, how can I redirect the connection from one completion port to another?

1) Avoiding the potential AcceptEx DoS attack is easy: just don't provide any space for data, and the AcceptEx will complete as soon as the connection is established.
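For example, a minimal sketch of posting an accept with no receive buffer (assumptions: listenSocket is a bound, listening overlapped socket already associated with the IOCP; the AcceptContext struct and PostAccept name are illustrative; error handling is trimmed; and AcceptEx is called through its mswsock.h declaration rather than the usually recommended WSAIoctl/SIO_GET_EXTENSION_FUNCTION_POINTER lookup):

    #include <winsock2.h>
    #include <mswsock.h>    // AcceptEx; link with ws2_32.lib and mswsock.lib

    // Illustrative per-accept context; the buffer and OVERLAPPED must stay alive
    // until the completion is dequeued from the IOCP.
    struct AcceptContext
    {
        OVERLAPPED overlapped{};
        SOCKET     acceptSocket = INVALID_SOCKET;
        // Room for the local and remote addresses only - no receive data requested.
        char       addressBuffer[2 * (sizeof(sockaddr_in) + 16)]{};
    };

    bool PostAccept(SOCKET listenSocket, AcceptContext& ctx)
    {
        ctx.acceptSocket = WSASocket(AF_INET, SOCK_STREAM, IPPROTO_TCP,
                                     nullptr, 0, WSA_FLAG_OVERLAPPED);
        DWORD bytesReceived = 0;

        BOOL ok = AcceptEx(listenSocket,
                           ctx.acceptSocket,
                           ctx.addressBuffer,
                           0,                          // dwReceiveDataLength = 0
                           sizeof(sockaddr_in) + 16,   // local address space
                           sizeof(sockaddr_in) + 16,   // remote address space
                           &bytesReceived,
                           &ctx.overlapped);

        // With a zero receive length the completion arrives as soon as the
        // connection is established, whether or not the client ever sends data.
        return ok || WSAGetLastError() == ERROR_IO_PENDING;
    }

Note that once the completion is retrieved, the accepted socket still needs SO_UPDATE_ACCEPT_CONTEXT set and its own association with the IOCP before further overlapped I/O is issued on it.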
2) Using AcceptEx means that you don't need to have a separate thread to run an accept loop. This removes one thread from your system and reduces context switching. This is especially useful if you are listening on multiple sockets (different ports/interfaces) as each listening socket would need its own accept thread.
3) Yes, the worker threads that call GetQueuedCompletionStatus on an IOCP can be thought of as I/O threads...
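As a rough sketch, such an I/O thread is just a loop around GetQueuedCompletionStatus (the Connection and IoContext types, and the convention of using the completion key for per-socket state and an embedded OVERLAPPED for per-operation state, are illustrative assumptions, not anything required by the API):

    #include <winsock2.h>
    #include <windows.h>

    struct Connection;                       // hypothetical per-socket state
    struct IoContext { OVERLAPPED overlapped; /* operation type, buffers, ... */ };

    void WorkerThread(HANDLE iocp)
    {
        for (;;)
        {
            DWORD bytesTransferred = 0;
            ULONG_PTR completionKey = 0;
            OVERLAPPED* overlapped = nullptr;

            BOOL ok = GetQueuedCompletionStatus(iocp, &bytesTransferred,
                                                &completionKey, &overlapped,
                                                INFINITE);
            if (!ok && overlapped == nullptr)
                break;                        // port closed or hard error: exit thread

            Connection* connection = reinterpret_cast<Connection*>(completionKey);
            IoContext* io = CONTAINING_RECORD(overlapped, IoContext, overlapped);

            // !ok with a non-null OVERLAPPED means the I/O itself failed and the
            // connection should be cleaned up. Otherwise dispatch on the operation
            // type in 'io' (completed AcceptEx, WSARecv or WSASend) and issue the
            // next overlapped operation. Business logic can run here as long as it
            // does not block the thread for long.
            (void)connection; (void)io; (void)bytesTransferred;
        }
    }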
4) It depends. I've built systems with a distinct, fixed-size pool of I/O threads which never perform any blocking operations, plus a separate, expanding thread pool designed to perform blocking operations. The idea is that this prevents all of the threads being blocked and preventing I/O... This requires that you pass work items off to the other thread pool, and it causes unnecessary context switching and complexity, but it means that you always have threads available to do I/O operations (such as handling new connections as AcceptEx completes)... This kind of design used to work well back when the IOCP APIs would cancel pending operations if the thread that issued them exited before the operation completed. Now that the OS has changed the rules and pending operations are not cancelled, there's no real reason not to just have an expanding/contracting pool of I/O threads and do all your work there... You just need to track how many threads are available and create/destroy threads as you need to expand/contract your pool...
5) see 4.
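Regarding the edit and PostQueuedCompletionStatus: a socket handle can only be associated with one completion port, so you don't "move" the connection's I/O completions to calc_completion_port. What PostQueuedCompletionStatus lets you do is hand an arbitrary work packet to the threads waiting on another port, which is one way to implement the "pass work items off to the other thread pool" idea from 4. A hedged sketch (CalcRequest, QueueCalculation and CalcWorker are illustrative names only):

    #include <winsock2.h>
    #include <windows.h>

    struct CalcRequest { /* parsed command, connection to reply on, ... */ };

    // Called by a main_cpl_port worker after it has parsed a "GoToCalculator" command.
    void QueueCalculation(HANDLE calcPort, CalcRequest* request)
    {
        // The byte count, key and OVERLAPPED are whatever the consumer expects;
        // here the request pointer travels in the completion key.
        PostQueuedCompletionStatus(calcPort, 0,
                                   reinterpret_cast<ULONG_PTR>(request), nullptr);
    }

    void CalcWorker(HANDLE calcPort)
    {
        for (;;)
        {
            DWORD bytes = 0;
            ULONG_PTR key = 0;
            OVERLAPPED* ov = nullptr;
            if (!GetQueuedCompletionStatus(calcPort, &bytes, &key, &ov, INFINITE)
                && ov == nullptr)
                break;

            CalcRequest* request = reinterpret_cast<CalcRequest*>(key);
            // Perform the (possibly blocking) calculation here, then issue the
            // reply WSASend on the connection's socket; that send still completes
            // on the port the socket handle is associated with (main_cpl_port).
            (void)request;
        }
    }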

Related

C++ server with recv/send commands & request/response design

I'm trying to create a server with blocking sockets (one new thread for each new client). This thread should be able to receive commands from the client (and send back the result) and periodically send commands to the client (and request back the result).
What I've thought of is creating two threads for each client, one for recv and a second for send. However:
it doubles the normal thread overhead.
due to the request/response design, the recv I do in the first thread (to wait for the client's commands) could actually pick up the response I'm looking for in the second thread (the client's reply to my send), and vice versa. Making it all properly synced is probably a nightmare. So now I'm thinking of doing it from a single thread this way:
In a loop:
setsockopt(SO_RCVTIMEO, &small_timeout); // set the timeout for the recv (like 1000 ms).
recv(); // check for the client's requests first. If it returns WSAETIMEDOUT I assume no data was requested and do nothing. If I get a normal request, I handle it.
if (clientbufferToSend != nullptr) send(clientbufferToSend); // now that the client's request has been processed, we check the list of commands we have to send to the client. If there are commands in the queue, we send them. The SO_SNDTIMEO timeout can be set to a large value so we don't deadlock if the client loses the connection.
setsockopt(SO_RCVTIMEO, &large_timeout); // set the timeout for the recv (as large as SO_SNDTIMEO, just so we don't deadlock on anything).
recv(); // now we wait for the response from the client.
Is this a legitimate way to do what I want? Or are there better alternatives (preferably with blocking sockets and threads)?
P.S. Does recv() with a timeout return WSAETIMEDOUT only if no data is available? Can it return this error if there is data, but recv() wasn't fast enough to handle it all, thus returning partial data?
One approach is to create a background thread only for reading from that socket, and to write from whatever thread your unsolicited events are raised on.
You'll need the following.
A critical section or mutex per socket to serialize writes, e.g. for when the background thread is sending a response to a client-initiated message while another thread wants to send a message to the same client.
Some other synchronization primitive, like a condition variable, for the calling thread to sleep on while waiting for responses.
The background thread which receives messages needs to distinguish client-initiated messages (which need to be responded to by the same background thread) from responses to server-initiated messages. If your network protocol doesn't carry that information, you'll have to change the protocol.
This will work OK if your server-initiated events are only happening on a single thread, e.g. they come from some serialized source like a device or OS interface.
If, however, the event source is multithreaded as well and you want good performance, you are going to need non-trivial complexity to dispatch the responses to the correct server thread: one condition variable per client thread, maybe some queues, etc.
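A minimal sketch of that synchronization, assuming one background reader thread per socket and a protocol where every message carries a request id so responses can be matched up (sendMessage and handleClientRequest are hypothetical framing/processing helpers, not library calls):

    #include <condition_variable>
    #include <map>
    #include <mutex>
    #include <string>

    // Hypothetical helpers: frame and write a message on the socket, and compute
    // the reply to a client-initiated request.
    void sendMessage(int socketFd, int requestId, const std::string& payload);
    std::string handleClientRequest(const std::string& payload);

    struct ClientChannel
    {
        std::mutex writeMutex;                    // serializes all writes on the socket
        std::mutex stateMutex;
        std::condition_variable responseArrived;
        std::map<int, std::string> responses;     // responses keyed by request id

        // Called from any server thread that wants to send a request and wait.
        std::string request(int socketFd, int requestId, const std::string& payload)
        {
            {
                std::lock_guard<std::mutex> lock(writeMutex);
                sendMessage(socketFd, requestId, payload);
            }
            std::unique_lock<std::mutex> lock(stateMutex);
            responseArrived.wait(lock, [&] { return responses.count(requestId) != 0; });
            std::string reply = responses[requestId];
            responses.erase(requestId);
            return reply;
        }

        // Called by the background reader thread for every incoming message.
        // inReplyTo == 0 means the message is client-initiated.
        void onMessage(int socketFd, int inReplyTo, const std::string& payload)
        {
            if (inReplyTo != 0)                   // response to a server-initiated request
            {
                std::lock_guard<std::mutex> lock(stateMutex);
                responses[inReplyTo] = payload;
                responseArrived.notify_all();
            }
            else                                  // client-initiated: reply from this thread
            {
                std::lock_guard<std::mutex> lock(writeMutex);
                sendMessage(socketFd, 0, handleClientRequest(payload));
            }
        }
    };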

Is QTcpSocket really full duplex?

BSD stream sockets are full duplex, meaning two connected parties can both send/receive at the same time.
A QTcpSocket (Qt's socket implementation) has asynchronous support and a non-blocking mode, but can only belong to one thread; see the Qt docs:
Event driven objects may only be used in a single thread. Specifically, this applies to the timer mechanism and the network module.
Let's say I want a transmit/tx thread and a separate receive/rx thread to use the same socket and send/receive data at the same time.
In my understanding this can be 'done' via Qt signals/slots, but the socket thread will never really perform the send() and the receive() simultaneously. It just runs the event loop, which will do this in a serial fashion and emit the signals when a send/receive is done.
Yes, my rx and tx threads can work concurrently and handle the notifications via Qt slots, but the socket itself is never really used in full-duplex mode.
Is it correct to say that: considering one endpoint only, in the socket thread, its send() and receive() calls are always serial, never simultaneous?
(because the event loop thread is one thread only)
In my understanding this can be 'done' via Qt signals/slots, but the socket thread will never really perform the send() and the receive() simultaneously. It just runs the event loop which will do this in a serial fashion and emit the signals when send/receive is done.
True, but keep in mind that the kernel buffers incoming and outgoing data, and QTCPSocket sets the socket to non-blocking, so that the send() and recv() calls always return immediately and never block the event-loop. That means that the actual processes of sending and receiving data will happen simultaneously (inside the kernel), even if the (more-or-less instantaneous) send() and recv() calls technically do not. (*)
Yes, my rx and tx threads can work concurrently and handle the notifications via Qt slots, but the socket itself is never really used in full duplex mode. Is this correct?
That is not correct -- the socket's data streams can (and do) flow both ways across the network simultaneously, so the socket really is full-duplex. The full-duplex capability is present whether you are using a single thread or multiple threads.
(*) You can test this with a single-threaded Qt program that uses a QTCPSocket to send or receive data, by simply disconnecting your computer's Ethernet cable during a large data transfer. If the QTCPSocket's send() or recv() calls are blocking until completion, that would block the GUI thread and cause your GUI to become unresponsive until you reconnect the cable (or until the TCP connection times out after several minutes).
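A tiny single-threaded sketch of that behaviour (assuming Qt 5; the peer host and port are illustrative). Both the read and the write happen from the one event loop and return immediately, while the kernel moves bytes in both directions on the wire concurrently:

    #include <QCoreApplication>
    #include <QTcpSocket>

    int main(int argc, char* argv[])
    {
        QCoreApplication app(argc, argv);

        QTcpSocket socket;
        socket.connectToHost("example.com", 7);   // illustrative peer

        // Echo everything back: readAll() only drains what the kernel has already
        // buffered, and write() only queues data for the kernel to transmit, so
        // neither call blocks the event loop.
        QObject::connect(&socket, &QTcpSocket::readyRead, [&socket] {
            socket.write(socket.readAll());
        });

        return app.exec();
    }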

Handling POSIX socket read() errors

Currently I am implementing a simple client-server program with just the basic functionalities of read/write.
However, I noticed that if, for example, my server calls write() to reply to my client and my client does not have a corresponding read() call, my server program will just hang there.
Currently I am thinking of using a simple timer to define a timeout and disconnecting the client after a certain period, but I am wondering whether there is a more elegant or standard way of handling such errors.
There are two general approaches to prevent server blocking and to handle multiple clients with a single server instance:
use POSIX threads to handle each client's connection. If one thread blocks because of an erroneous client, the other threads will still continue to run. If the remote client has just disappeared (crashed, network down, etc.), then sooner or later the TCP stack will signal a timeout and the blocked write operation will fail with an error.
use non-blocking I/O together with a polling mechanism, e.g. select(2) or poll(2). It is quite a bit harder to program using polling calls, though. Network sockets are made non-blocking using fcntl(2), and in cases where a normal write(2) or read(2) on the socket would block, an EAGAIN error is returned instead. You can use select(2) or poll(2) to wait for something to happen on the socket with an adjustable timeout period. For example, waiting for the socket to become writable means that you will be notified when there is enough socket send buffer space, i.e. previously written data has been flushed to the client machine's TCP stack.
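For illustration, a small sketch of the second approach (POSIX; makeNonBlocking and writeWithTimeout are just example names, and error handling is trimmed):

    #include <errno.h>
    #include <fcntl.h>
    #include <poll.h>
    #include <unistd.h>

    // Make the socket non-blocking so read()/write() return EAGAIN instead of hanging.
    void makeNonBlocking(int fd)
    {
        int flags = fcntl(fd, F_GETFL, 0);
        fcntl(fd, F_SETFL, flags | O_NONBLOCK);
    }

    // Write with a timeout: if the socket send buffer is full, wait up to timeoutMs
    // for the socket to become writable, then try once more.
    ssize_t writeWithTimeout(int fd, const char* buf, size_t len, int timeoutMs)
    {
        ssize_t written = write(fd, buf, len);
        if (written >= 0 || (errno != EAGAIN && errno != EWOULDBLOCK))
            return written;                   // success (possibly partial) or real error

        struct pollfd pfd;
        pfd.fd = fd;
        pfd.events = POLLOUT;

        int ready = poll(&pfd, 1, timeoutMs);
        if (ready <= 0)
            return -1;                        // timed out (0) or poll error (-1)

        return write(fd, buf, len);           // may still be a partial write
    }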
If the client side isn't going to read from the socket anymore, it should close down the socket with close. And if you don't want to do that because the client still might want to write to the socket, then you should at least close the read half with shutdown(fd, SHUT_RD).
This will set it up so the server gets an EPIPE on the write call.
If you don't control the clients... if random clients you didn't write can connect, the server should handle clients actively attempting to be malicious. One way for a client to be malicious is to attempt to force your server to hang. You should use a combination of non-blocking sockets and the timeout mechanism you describe to keep this from happening.
In general you should write the protocols for how the server and client communicate so that neither the server or client are trying to write to the socket when the other side isn't going to be reading. This doesn't mean you have to synchronize them tightly or anything. But, for example, HTTP is defined in such a way that it's quite clear for either side as to whether or not the other side is really expecting them to write anything at any given point in the protocol.

Writing multithreaded TCP server on Linux

At work I have been tasked with implementing a TCP server as part of a Modbus slave device. I have done a lot of reading, both here on Stack Exchange and on the internet in general (including the excellent http://beej.us/guide/bgnet/), but I am struggling with a design issue. In summary, my device can accept just 2 connections, and on each connection there will be incoming Modbus requests which I must process in my main controller loop and then reply to with a success or failure status. I have the following ideas of how to implement this.
Have a listener thread that creates, binds, listens and accepts connections, then spawns a new pthread to listen on the connection for incoming data and closes the connection after an idle timeout period. If the number of active threads is currently 2, new connections are immediately closed to ensure only 2 are allowed.
Do not spawn new threads from the listener thread; instead use select() to detect incoming connection requests as well as incoming Modbus requests on active connections (similar to the approach in Beej's guide).
Create 2 listener threads, each of which creates a socket (same IP and port number) that can block on accept() calls, then closes the listening socket fd and deals with the connection. Here I am (perhaps naively) assuming that this will only allow a maximum of 2 connections, which I can deal with using blocking reads.
I have been using C++ for a long time but I am fairly new to Linux development. I would really welcome any suggestions as to which of the above approaches is best (if any) and whether my inexperience with Linux means that any of them are really, really bad ideas. I am keen to avoid fork() and stick to pthreads, as incoming Modbus requests are going to be queued and read off by a main controller loop periodically. Thanks in advance for any advice.
The third alternative won't work; you can only bind to the local address once.
I would probably use your second alternative, unless you need to do a lot of processing, in which case a combination of the first two alternatives might be useful.
The combination of the first two alternatives I'm thinking of is to have the main thread (the one you always have when a program starts) create two worker threads, then do a blocking accept call to wait for a new connection. When a new connection arrives, tell one of the threads to start working on the new connection and go back to block on accept. When the second connection is accepted, tell the other thread to work on that connection. If both connections are already open, either don't accept until one connection is closed, or accept new connections but close them immediately.
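A rough sketch of that handoff (assumptions: listenFd is already bound and listening, handleConnection() is a hypothetical function that services one client until it disconnects, std::thread stands in for raw pthreads, and error handling is trimmed):

    #include <sys/socket.h>
    #include <unistd.h>
    #include <condition_variable>
    #include <mutex>
    #include <thread>

    void handleConnection(int clientFd);          // hypothetical per-connection handler

    struct Worker
    {
        std::mutex m;
        std::condition_variable cv;
        int fd = -1;                              // -1 means "idle"

        void run()
        {
            for (;;)
            {
                std::unique_lock<std::mutex> lock(m);
                cv.wait(lock, [this] { return fd != -1; });
                int clientFd = fd;
                lock.unlock();

                handleConnection(clientFd);       // blocking reads are fine in here
                close(clientFd);

                lock.lock();
                fd = -1;                          // back to idle
            }
        }

        bool tryAssign(int clientFd)
        {
            std::lock_guard<std::mutex> lock(m);
            if (fd != -1)
                return false;                     // busy with another connection
            fd = clientFd;
            cv.notify_one();
            return true;
        }
    };

    void acceptLoop(int listenFd)
    {
        static Worker workers[2];
        std::thread(&Worker::run, &workers[0]).detach();
        std::thread(&Worker::run, &workers[1]).detach();

        for (;;)
        {
            int clientFd = accept(listenFd, nullptr, nullptr);
            if (clientFd < 0)
                continue;
            if (!workers[0].tryAssign(clientFd) && !workers[1].tryAssign(clientFd))
                close(clientFd);                  // both workers busy: reject immediately
        }
    }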
None of the design options you propose are very object-oriented, and they're all geared more towards C than C++. If your work allows you to use Boost, then the Boost.Asio library is fantastic for making simple (and complex) socket servers. You could take nearly any of their examples and trivially extend it to only allow 2 active connections, closing all others as soon as they are opened.
Off the top of my head, their simple HTTP server could be modified to do this by keeping a static counter in the connection class (incremented in the constructor, decremented in the destructor), and when a new one is created, checking the count and deciding whether to close the connection. The connection class could also gain a boost::asio::deadline_timer to keep track of timeouts.
This would most closely resemble your first design choice; Boost could do this in one thread, and in the background it does something similar to select() (usually epoll()). But this is the "C++ way", and in my opinion using select() and raw pthreads is the C way.
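A minimal sketch of just the counting idea (the names are illustrative, and the Asio plumbing - acceptor, async_accept, the rest of the connection class - is omitted; this is not code taken from the Boost example itself):

    #include <atomic>

    class connection
    {
    public:
        connection()  { ++s_count; }
        ~connection() { --s_count; }

        void start()
        {
            if (s_count > 2)
            {
                stop();                  // over the limit: close this connection at once
                return;
            }
            // ... start the usual async read/write chain here ...
        }

    private:
        void stop() { /* close the underlying socket */ }

        static std::atomic<int> s_count;
    };

    std::atomic<int> connection::s_count{0};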
Since you are only dealing with 2 connections, thread-per-connection is perfect for this kind of application. Object-oriented approaches using non-blocking or asynchronous I/O would be better if you needed to scale up to thousands of connections. Two listener threads make sense, but you don't need to close the listening fd; just come back and accept on it again when the connection has completed. In fact, a variation is to have three threads blocked doing accept. If two of the threads are actively handling connections, then the third resets the newly created connection (or returns a busy response, whatever is appropriate for your device).
To have all three threads block on accept, you need to have the main thread create and bind your socket before the three threads launch to do their accept/handle processing.
The man page for pthreads on Linux indicates that accept is thread-safe. (The section under thread-safe functions lists the functions that are not thread-safe, go figure.)
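A sketch of that variation (POSIX; serveClient() and sendBusyResponse() are hypothetical, 502 is just the usual Modbus/TCP port, and error handling is trimmed). The main thread does the socket/bind/listen once, and the three workers all block in accept() on the same fd:

    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <unistd.h>
    #include <atomic>
    #include <thread>

    std::atomic<int> activeClients{0};

    void serveClient(int fd);              // hypothetical: blocking read/handle loop
    void sendBusyResponse(int fd);         // hypothetical: tell the client to go away

    void acceptWorker(int listenFd)
    {
        for (;;)
        {
            int clientFd = accept(listenFd, nullptr, nullptr);   // thread-safe
            if (clientFd < 0)
                continue;

            if (activeClients.fetch_add(1) >= 2)
            {
                sendBusyResponse(clientFd);                      // third client: reject
                close(clientFd);
                activeClients.fetch_sub(1);
                continue;
            }

            serveClient(clientFd);
            close(clientFd);
            activeClients.fetch_sub(1);
        }
    }

    int main()
    {
        int listenFd = socket(AF_INET, SOCK_STREAM, 0);
        sockaddr_in addr{};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = INADDR_ANY;
        addr.sin_port = htons(502);
        bind(listenFd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
        listen(listenFd, 3);

        std::thread a(acceptWorker, listenFd);
        std::thread b(acceptWorker, listenFd);
        std::thread c(acceptWorker, listenFd);
        a.join(); b.join(); c.join();
    }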

Some OVERLAPPEDs from WSASend not returning in a timely manner via GetQueuedCompletionStatus?

Background: I'm using CreateIoCompletionPort, WSASend/Recv, and GetQueuedCompletionStatus to do overlapped socket I/O on my server. For flow control, when sending to the client, I only allow several WSASend() calls to be made once all pending OVERLAPPEDs have popped off the IOCP.
Problem: Recently, there are occasions when the OVERLAPPEDs do not get returned to the IOCP. The thread calling GetQueuedCompletionStatus does not get them and they remain in my local pending queue. I've verified that the client DOES receive the data off the socket and that the socket is connected. No errors were returned when the WSASend() calls were made. The OVERLAPPEDs simply "never" come back without an external stimulus such as the following:
Disconnecting the socket from the client or server immediately allows the GetQueuedCompletionStatus thread to retrieve the OVERLAPPEDs.
Making additional calls to WSASend() (sometimes several are needed), after which all the OVERLAPPEDs suddenly pop off the queue.
Question: Has anyone seen this type of behavior? Any ideas on what is causing this?
Thanks,
Geoffrey
WSASend() can fail to complete in a timely manner if the TCP window is full. In this case the stack can't send any more data so your WSASend() waits and your completion doesn't occur until the TCP stack CAN send more data.
If you happen to have a protocol between your client and server that has no flow control built into the protocol itself, AND you aren't doing any flow control yourself based on write completions and are just sending data as fast as your server can send it, then you may get to a point where either the network or your client can't keep up and TCP flow control kicks in (when the TCP window gets full). If you continue to just fire off data asynchronously with additional calls to WSASend(), then eventually you'll chew your way through all of the non-paged memory on the machine, and at that point all bets are off (chances are high that a driver may cause the box to bluescreen).
So, in summary, completions from overlapped socket writes can and will sometimes take longer to come back than you may expect. In your example, I expect that the completions that you get when you close the socket are all failures?
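For reference, one way to do that flow control is to track how many bytes of overlapped sends are outstanding per connection and queue everything above a cap until completions drain it. A hedged sketch (SendQueue and issueWSASend are illustrative names; issueWSASend is assumed to wrap WSASend, take ownership of the buffer and keep it alive until the completion arrives; external locking is needed if several threads touch the same connection):

    #include <winsock2.h>
    #include <deque>
    #include <vector>

    class SendQueue
    {
    public:
        explicit SendQueue(SOCKET s) : m_socket(s) {}

        // Called by the application whenever it has data to send.
        void send(std::vector<char> data)
        {
            m_pending.push_back(std::move(data));
            trySendMore();
        }

        // Called from the IOCP worker when a WSASend completion is dequeued.
        void onSendCompleted(DWORD bytesSent)
        {
            m_outstandingBytes -= bytesSent;
            trySendMore();
        }

    private:
        void trySendMore()
        {
            while (!m_pending.empty() && m_outstandingBytes < m_maxOutstandingBytes)
            {
                std::vector<char> next = std::move(m_pending.front());
                m_pending.pop_front();
                m_outstandingBytes += next.size();
                issueWSASend(m_socket, std::move(next));   // hypothetical overlapped send
            }
        }

        // Hypothetical: wraps WSASend with a per-operation OVERLAPPED and owns the
        // buffer until the completion for it is retrieved from the IOCP.
        void issueWSASend(SOCKET s, std::vector<char> data);

        SOCKET m_socket;
        size_t m_outstandingBytes = 0;
        const size_t m_maxOutstandingBytes = 256 * 1024;   // arbitrary cap
        std::deque<std::vector<char>> m_pending;
    };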
I talk about this some more on my blog; here: http://www.lenholgate.com/blog/2008/07/write-completion-flow-control.html and here: http://www.serverframework.com/asynchronousevents/2011/06/tcp-flow-control-and-asynchronous-writes.html