Can I call WSASend() repeatedly? - c++

When using IOCP, if I call WSASend(), do I have to wait for the notification to arrive before making another call to it, or can I call it multiple times before receiving any notification? For example, is something like this allowed:
WSASend();
// Call it again without waiting for notification of previous call
WSASend();
// Call it again without waiting for notification of previous call
WSASend();

Yes, you can make multiple I/O requests without waiting for completion notifications. Alternatively, you can WSASend() multiple buffers with one call.
Either way, or both, will work fine. The OVERLAPPED block for each call essentially provides the pointers for a linked list of I/O requests, so they can all be queued up and executed by the kernel and network stack(s) as the I/O resource becomes available.
This applies to WSARecv() and other overlapped I/O too. It allows the kernel/stack to be filling buffers while user-thread code is still processing the ones notified earlier.
Note: the OVERLAPPED block and the buffers must be unique per call, and their lifetime must extend to the completion notification. You must not let them be RAII'd or delete'd away before a user thread has handled the completion notification. It's usual for the buffer and OVERLAPPED to be members of one 'IOrequest' class, with a 'SocketContext*' member to connect every IOrequest to its bound socket.
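A minimal sketch of such a per-request structure, assuming illustrative names like IOrequest and SocketContext (they are not a required API), might look like this:
#include <winsock2.h>
struct SocketContext;              // per-socket state (socket handle, etc.)
struct IOrequest
{
    WSAOVERLAPPED  ov;             // must stay valid until the completion is handled
    WSABUF         wsaBuf;         // describes 'data' below for WSASend()/WSARecv()
    char           data[4096];     // payload buffer, unique to this request
    SocketContext* socket;         // back-pointer to the socket this request belongs to

    explicit IOrequest(SocketContext* s) : socket(s)
    {
        ZeroMemory(&ov, sizeof(ov));
        wsaBuf.buf = data;
        wsaBuf.len = sizeof(data);
    }
};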

Yes, you can issue multiple overlapped operations on a single socket without needing to wait for any completions to occur.
One thing you need to be aware of with multiple outstanding WSASend() calls on a TCP socket is that you are effectively handing over resource management for the buffers used in those WSASend() calls to the peer on the other end of the socket. The reason is that if you send data faster than the peer reads it, you will eventually cause TCP's flow control to kick in. This doesn't prevent you from issuing more WSASend() calls; all you will notice is that the completions take longer and longer to arrive. See here for more details.

Related

Under what conditions does MPI_Isend/MPI_Irecv wait for its associated completion call (MPI_Wait/MPI_Test) to start data transmission?

One of the comments in this post briefly mentions
The standard allows the implementation to postpone the actual data transmission until the wait/test call.
Is it always the case that data transmission of MPI_Isend/MPI_Irecv is postponed until the associated completion call (MPI_Wait/MPI_Test or their variants) is invoked? If not, what conditions influence this?
MPI_Wait is used to wait for a single communication to complete
MPI_Waitall is used to wait for a list of communications to complete
MPI_Test and MPI_Testall are used with non-blocking communications to check whether the communications have finished, without requiring them to be finished.
With MPI_Isend, you have to keep the value of each data point in its own variable or array element.
This is because the data can be sent at any time up until MPI_Waitall is called.
This means that the data mustn’t be changed/overwritten or go out of scope within this interval.
This is different from MPI_Send, where the data to be sent is buffered and/or actually sent by the time MPI_Send completes.
The same is true of MPI_Irecv, though this is more obvious as you want to have the data.
Note
In MPI_Testall the flag will be true only if all the communications are finished.
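As a concrete illustration of that lifetime rule, here is a minimal sketch (the buffer size, destination rank and tag are arbitrary):
#include <mpi.h>
// inside an MPI program (after MPI_Init), on the sending rank:
int data[100];                     // must not be modified, reused or freed yet
MPI_Request req;
MPI_Isend(data, 100, MPI_INT, /*dest*/ 1, /*tag*/ 0, MPI_COMM_WORLD, &req);
// MPI may read 'data' at any point between here and the completion call...
MPI_Wait(&req, MPI_STATUS_IGNORE); // ...so only after this returns
data[0] = 42;                      // is it safe to touch the buffer again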

Does SleepEx guarantee that all pending completion callbacks get called before timeout?

I have a C++ program that uses overlapped IO for network communication. The main thread has a loop that calls SleepEx(5, true);. There are also two TCP sockets. I assume that the completion callbacks are called during the alertable wait. Assume also that by the time SleepEx gets called both of my TCP connections have received some data. Now the question is what happens if the first completion callback takes longer than 5ms? Does SleepEx return after calling the first callback, or does it also call the second callback? In other words, does SleepEx guarantee to call ALL of the scheduled completion callbacks? This is not clear because the documentation says it will return when at least one of the conditions is met...
Your code must not assume that both APCs will be called before SleepEx() returns. Conversely, it must not assume that a pending APC will not be called simply because the specified wait period has expired.
The only behaviour that you can rely upon is that if one or more APCs are pending, at least one will be executed.
Generally speaking, best practice is to wait for APCs in a loop that does nothing else, using an infinite timeout in the wait. If you need to do something periodically, you can use a waitable timer to generate an APC periodically.
Alternatively, you can use WaitForSingleObjectEx() or WaitForMultipleObjectsEx() to detect when a waitable timer or other synchronization object is triggered, while still handling APCs.
However, if you must perform some periodic action that cannot be handled in an APC or be triggered by a synchronization object, you can use nested loops: the inner loop does nothing but call the wait repeatedly (with a timeout period reduced by however long the loop has already been running) and the outer loop performs the periodic action.
If you must perform some periodic action that cannot be delayed by pending APCs, you will need to do it in a separate thread. Note that because Windows is not a real-time OS, you will still not be able to guarantee that any given action will take place within any particular timeframe, although you can reduce the risk by increasing the thread priority.
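A rough sketch of the waitable-timer approach (the one-second period and DoPeriodicWork() are placeholders):
#include <windows.h>
void DoPeriodicWork();             // hypothetical periodic action, defined elsewhere
void MainLoop()
{
    HANDLE hTimer = CreateWaitableTimer(NULL, FALSE, NULL);  // auto-reset timer
    LARGE_INTEGER due;
    due.QuadPart = -10000000LL;    // first fire in 1 second (relative, 100-ns units)
    SetWaitableTimer(hTimer, &due, 1000 /*ms period*/, NULL, NULL, FALSE);
    for (;;)
    {
        // Alertable wait: returns WAIT_IO_COMPLETION whenever one or more APCs
        // were executed, or WAIT_OBJECT_0 when the timer fires.
        DWORD r = WaitForSingleObjectEx(hTimer, INFINITE, TRUE);
        if (r == WAIT_OBJECT_0)
            DoPeriodicWork();
        // on WAIT_IO_COMPLETION simply loop and wait again
    }
}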

How to use Overlapped I/O with sockets?

I want to use Overlapped I/O in my server, but I am unable to find many tutorials on the subject (most of the tutorials are about Overlapped I/O with Completion Ports, and I want to use a callback function).
My server will have a maximum of 400 clients connected at one time, and it only sends and receives data at long intervals (every 30 seconds a few kilobytes of data are exchanged between the server and the clients).
The main reason I want to use Overlapped I/O is that select() can only handle a maximum of 64 sockets (and I have 400!).
So I will tell you how I understand Overlapped I/O and correct me if I'm wrong:
If I want to receive data from one of the clients, I use WSARecv() and supply the socket handle, and a buffer to be filled with the received data, and also I supply a callback function. When the data is received and filled in the buffer, the callback function will be called, and I can process the data.
When I want to send data I use WSASend(); I also supply the socket handle and the callback function, and when the data is sent (not sure if that means placed in the underlying send buffer or actually placed on the wire), the callback will also be called telling me that the data was sent, and I can send the next piece of data.
The one misunderstanding you appear to have is about when the callback runs: OVERLAPPED callbacks are actually delivered synchronously, on your own thread.
You said:
When the data is received and filled in the buffer, the callback function will be called
Reality:
When a call is made to an alertable wait function (e.g. SleepEx or MsgWaitForMultipleObjectsEx), if data has been received and filled in the buffer, the callback function will be called
As long as you are aware of that, you should be in good shape. I agree with you that overlapped I/O with callbacks is a great approach in your scenario. Because callbacks occur on the thread performing the I/O, you don't have to worry about synchronizing access from multiple threads, the way you would need to with completion ports and work items on the thread pool.
Oh, also make sure to check for WSA_IO_PENDING, because it's possible for operations to complete synchronously if there's enough data already buffered (for receive) or enough space in the buffer (for send). In this case the callback will still occur, but it is queued for the next alertable wait; it never runs immediately. Certain errors will also be reported synchronously; others will be delivered to your callback.
Also, it's guaranteed that your callback gets queued exactly once for every operation that returned 0 or WSA_IO_PENDING, whether that operation completes successfully, is cancelled, or fails with some other error. You can't reuse the buffer until that callback has happened.
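For concreteness, a hedged sketch of a callback-based WSARecv() call (error handling trimmed, names illustrative):
#include <winsock2.h>
char buffer[4096];
WSABUF wsaBuf = { sizeof(buffer), buffer };
WSAOVERLAPPED ov = {};
void CALLBACK OnRecvComplete(DWORD error, DWORD bytesTransferred,
                             LPWSAOVERLAPPED overlapped, DWORD flags)
{
    // Runs only while the issuing thread is in an alertable wait.
    // 'error' is 0 on success; bytesTransferred == 0 means the peer closed.
}
void StartRecv(SOCKET s)
{
    DWORD flags = 0;
    int rc = WSARecv(s, &wsaBuf, 1, NULL, &flags, &ov, OnRecvComplete);
    if (rc == SOCKET_ERROR && WSAGetLastError() != WSA_IO_PENDING)
    {
        // genuine failure: no completion callback will be queued
    }
    // rc == 0 (completed at once) or WSA_IO_PENDING: the callback is still
    // queued and will run at the next alertable wait, e.g. SleepEx(..., TRUE).
}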
The IO completion callback mechanism works fine; I've used it a few times, no problem. In 32-bit systems, you can put the 'this' of the socket-context instance into the hEvent field of the OVERLAPPED struct and retrieve it in the callback. Not sure how to do it in 64-bit systems :(
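For what it's worth, one pattern that works the same way in 32-bit and 64-bit builds is to make the OVERLAPPED the first member of a larger per-request struct (like the 'IOrequest' idea above) and recover it in the callback with CONTAINING_RECORD; a rough sketch with illustrative names:
#include <winsock2.h>
struct SocketContext;
struct IOrequest
{
    WSAOVERLAPPED  ov;        // passed to WSASend()/WSARecv()
    SocketContext* socket;    // the per-socket 'this' pointer
    // buffers etc.
};
void CALLBACK OnComplete(DWORD error, DWORD bytes, LPWSAOVERLAPPED lpOv, DWORD flags)
{
    // Recover the enclosing request (and its socket context) from the
    // OVERLAPPED pointer the callback receives.
    IOrequest*     req = CONTAINING_RECORD(lpOv, IOrequest, ov);
    SocketContext* ctx = req->socket;
    // ...
}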

Asynchronous Completion Handling

I have this situation:
void foo::bar()
{
RequestsManager->SendRequest(someRequest, this, &foo::someCallback);
}
where RequestsManager works in asynchronous way:
SendRequest puts the request in a queue and returns to the caller
Other thread gets the requests from the queue and process them
When one request is processed the callback is called
Is it possible to have foo::someCallback called in the same thread as SendRequest? If not, how can I avoid the following "callback limitation": callbacks should not perform time-consuming operations, to avoid blocking the requests manager.
No - calls/callbacks cannot change thread context - you have to issue some signal to communicate between threads.
Typically, 'someCallback' would either signal an event upon which the thread that originated the 'SendRequest' call is waiting (synchronous call), or push the SendRequest (and so, presumably, the results of its processing) onto a queue from which the thread that originated the 'SendRequest' call will eventually pop it (asynchronous). It just depends on how the originator wishes to be signaled.
Async example - the callback might PostMessage/Dispatcher.BeginInvoke the completed SendRequest to a GUI thread for display of the results.
I can see few ways how to achieve it:
A) Implement strategy similar to signal handling
When request processing is over, RequestManager puts the callback invocation on a waiting list. The next time SendRequest is called, right before returning, it checks whether there are any pending callbacks for the thread and executes them. This is a relatively simple approach with minimal requirements on the client. Choose it if latency is not a concern. RequestManager can also expose an API to force a check for pending callbacks (see the sketch after option B below).
B) Suspend the callback-target thread and execute the callback in a third thread
This will give you a truly asynchronous solution with all its caveats. It will look as if the target thread's execution was interrupted and execution jumped into an interrupt handler. Before the callback returns, the target thread needs to be resumed. You won't be able to access thread-local storage or the original thread's stack from inside the callback.
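For option A, a rough sketch of what that waiting-list mechanism could look like (all names and signatures here are hypothetical, not an existing API):
#include <functional>
#include <mutex>
#include <vector>
class RequestsManager
{
public:
    void SendRequest(std::function<void()> work, std::function<void()> callback)
    {
        // ... hand (work, callback) over to the worker thread's queue here ...
        DrainPendingCallbacks();    // run callbacks of already-finished requests
    }
    void CheckPendingCallbacks()    // the "forceful check" API mentioned above
    {
        DrainPendingCallbacks();
    }
private:
    // Called on the worker thread once a request has been processed.
    void OnRequestDone(std::function<void()> callback)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_pending.push_back(std::move(callback));
    }
    void DrainPendingCallbacks()
    {
        std::vector<std::function<void()>> ready;
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            ready.swap(m_pending);
        }
        for (auto& cb : ready)
            cb();                   // runs on the calling thread, not the worker
    }
    std::mutex m_mutex;
    std::vector<std::function<void()>> m_pending;
};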
It depends on the definition of "time-consuming operations".
The classic way to do this is:
when the request is processed, the RequestManager should execute that &foo::someCallback
to avoid blocking the request manager, you may just raise a flag inside this callback
check that flag periodically inside the thread which called RequestsManager->SendRequest
This flag will be just a volatile bool inside class foo
If you want to make sure that the calling thread (foo's) finds out immediately that the request has been processed, you need additional synchronization.
Implement (or use an already implemented) blocking pipe (or use signals/events) between these threads. The idea is:
foo's thread executes SendRequest
foo starts sleeping on some select (for example)
RequestManager executes the request and:
calls &foo::someCallback
"awakens" foo's thread (by writing something to the file descriptor that foo sleeps on (using select))
foo wakes up
checks the volatile bool flag for the already processed request
does what it needs to do
clears the flag
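A rough sketch of that scheme, using a std::condition_variable as a stand-in for the pipe/select pair (RequestsManager and someRequest are the ones from the question and are not defined here):
#include <condition_variable>
#include <mutex>
class foo
{
public:
    void bar()
    {
        // RequestsManager->SendRequest(someRequest, this, &foo::someCallback);
        // (issued exactly as in the question; then wait until someCallback runs)
        std::unique_lock<std::mutex> lock(m_mutex);
        m_cv.wait(lock, [this] { return m_done; });
        // ... does what it needs to do with the processed request ...
        m_done = false;                // clears the flag for the next request
    }
    // Called by the RequestManager's worker thread when the request is processed.
    void someCallback()
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_done = true;             // cheap: no time-consuming work here
        }
        m_cv.notify_one();             // "awakens" foo's thread
    }
private:
    std::mutex m_mutex;
    std::condition_variable m_cv;
    bool m_done = false;
};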

is it valid to async send data before completion handler of the previous one was invoked?

I'm sending data asynchronously to a TCP socket. Is it valid to send the next data piece before the previous one was reported as sent by the completion handler?
As far as I know, it's not allowed when sending is done from different threads. In my case all sending is done from the same thread.
Different modules of my client send data to the same socket. E.g. module1 sent some data and will continue when the corresponding completion handler is invoked. Before this happens, io_service invokes the deadline_timer handler of module2, which leads to another async_write call. Should I expect any problems here?
Is it valid to send the next data piece before the previous one was reported as sent by the completion handler?
No, it is not valid to interleave write operations. This is very clear in the documentation:
This operation is implemented in terms of zero or more calls to the stream's async_write_some function, and is known as a composed operation. The program must ensure that the stream performs no other write operations (such as async_write, the stream's async_write_some function, or any other composed operations that perform writes) until this operation completes.
emphasis added by me.
As far as I know, it's not allowed when sending is done from different threads. In my case all sending is done from the same thread.
Your problem has nothing to do with threads.
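The usual way to satisfy that requirement is to queue outgoing messages and keep at most one async_write in flight per socket, chaining the next write from the completion handler; a rough sketch (class and member names are illustrative):
#include <boost/asio.hpp>
#include <deque>
#include <string>
#include <utility>
class Connection
{
public:
    explicit Connection(boost::asio::ip::tcp::socket socket)
        : m_socket(std::move(socket)) {}
    // Safe to call at any time from the io_service thread: messages are queued
    // and only one async_write is outstanding at once.
    void Send(std::string message)
    {
        bool writeInProgress = !m_queue.empty();
        m_queue.push_back(std::move(message));
        if (!writeInProgress)
            DoWrite();
    }
private:
    void DoWrite()
    {
        boost::asio::async_write(
            m_socket,
            boost::asio::buffer(m_queue.front()),   // front() stays alive until the handler runs
            [this](const boost::system::error_code& ec, std::size_t /*bytes*/)
            {
                if (ec) return;                     // real code would handle/report the error
                m_queue.pop_front();
                if (!m_queue.empty())
                    DoWrite();                      // start the next queued write
            });
    }
    boost::asio::ip::tcp::socket m_socket;
    std::deque<std::string> m_queue;
};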
Yes, you can do that as long as the underlying memory (buffer) is not modified until the write handler is called. Calling async_write means you hand over the buffer ownership to Asio. When the write handler is called, the buffer ownership is given back to you.