Should I handle the fact that WSASend() may not send all data? - c++

I have found out that WSASend() may not send all of the data, for example if I asked it to send 800 bytes, it may only send 600 bytes.
Now my question is: are the situations where this can happen so rare that I should not bother handling this kind of event, or do I have to handle it? Can't I, for example, just show an error message to the user that not all data has been sent and abort the connection, instead of trying to recover?
Note: I am using IOCP.

When sending using overlapped I/O and IOCP it's unlikely that you'll ever see a partial send. It's possibly more likely that you'll see a send with an error where you wanted to send a multiple of the system page size and only a smaller multiple was sent, but even that's fairly unlikely (and I don't have unit tests that force that condition as I view it as theoretical).
When sending via overlapped I/O over a TCP connection, the situation you're far more likely to encounter is your peer receiving and processing more slowly than you are sending; that is, TCP flow control kicking in and your WSASend() calls taking longer and longer to complete.
It's really unlikely that you'll actually see an error either from a WSASend() call or a subsequent GetQueuedCompletionStatus() call. Things will just keep working until they don't...

It can happen any time the receiver is slower than the sender. You must handle it by rescheduling the remaining data to be written. Don't treat it as an error.
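A minimal sketch of that "reschedule the remainder" idea for the IOCP case follows; the PerIoContext struct and OnSendComplete function are illustrative assumptions, not part of the Winsock API:

#include <winsock2.h>
#include <vector>

// Hypothetical per-operation state; not a Winsock type.
struct PerIoContext {
    WSAOVERLAPPED overlapped{};
    std::vector<char> data;   // the full buffer we are trying to send
    size_t offset = 0;        // how many bytes the stack has accepted so far
};

// Called from the IOCP loop when GetQueuedCompletionStatus() reports a
// completed send. Returns false if the connection should be aborted.
bool OnSendComplete(SOCKET s, PerIoContext* ctx, DWORD bytesTransferred)
{
    ctx->offset += bytesTransferred;
    if (ctx->offset >= ctx->data.size())
        return true;                              // everything went out

    // Partial send: re-issue a WSASend() for the unsent tail.
    ctx->overlapped = WSAOVERLAPPED{};            // reset before reuse
    WSABUF buf;
    buf.buf = ctx->data.data() + ctx->offset;
    buf.len = static_cast<ULONG>(ctx->data.size() - ctx->offset);

    int rc = WSASend(s, &buf, 1, nullptr, 0, &ctx->overlapped, nullptr);
    if (rc == SOCKET_ERROR && WSAGetLastError() != WSA_IO_PENDING)
        return false;                             // hard error: abort the connection
    return true;                                  // another completion will follow
}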

Related

Any way to know how many bytes will be sent on TCP before sending?

I'm aware that the ::send within a Linux TCP server can limit the sending of the payload such that ::send needs to be called multiple times until the entire payload is sent.
i.e. Payload is 1024 bytes
sent_bytes = ::send(fd, ...) where sent_bytes is only 256 bytes so this needs to be called again.
Is there any way to know exactly how many bytes can be sent before sending? If the socket will allow for the entire message, or that the message will be fragmented and by how much?
Example Case
2 messages are sent to the same socket by different threads at the same time on the same TCP client via ::send(). In some cases, where messages are large, multiple calls to ::send() are required because not all the bytes are sent in the initial call; thus, I go with the loop solution until all the bytes are sent. The loop is mutexed, so it can be seen as thread safe, and each thread has to perform its sending after the other. But my worry is that, because TCP is a stream, the client will receive fragments of each message, and I was thinking that by adding framing to each message I could rebuild the message on the client side, if I knew how many bytes are sent at a time.
Although the call to ::send() is done sequentially, is there any chance that the byte stream is still mixed?
Effectively, could this happen:
Server Side
Message 1: "CiaoCiao"
Message 2: "HelloThere"
Client Side
Received Message: "CiaoHelloCiaoThere"
Although the call to ::send() is done sequentially, is there any chance that the byte stream is still mixed?
Of course. Not only is there a chance of that, it is pretty much going to be a certainty, at one point or another. Guaranteed.
sent to the same socket by different threads
It will be necessary to handle the synchronization at this level, by employing a mutex that each thread locks before sending its message and unlocks only after the entire message is sent.
It goes without saying that this leaves open the possibility that a blocked/hung socket will result in a single thread holding this mutex for an excessive amount of time, until the socket times out and your execution thread ends up dealing with a failed send() or write(), in whatever fashion it is already doing now (you are, of course, checking the return value from send/write, and handling the error conditions appropriately).
There is no single, cookie-cutter, paint-by-numbers solution that works for every situation, in every program that needs to do something like this. Each solution needs to be tailored to the program's unique requirements and purpose. Just one possibility would be a dedicated execution thread that handles all socket input/output, with all your other execution threads sending their messages to the socket thread instead of writing to the socket directly. This avoids having all execution threads wedged by a hung socket, at the expense of growing memory usage to hold all the unsent data.
But that's just one possible approach; the number of possible, alternative solutions has no limit. You will need to figure out which logic/algorithm-based solution will work best for your specific program. There is no operating-system or kernel-level indication that will give you any kind of guarantee as to how much data a send() or write() call on a socket will accept.
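As a rough sketch of the mutex approach described above (the send_all helper and the global mutex are illustrative assumptions; a real program would likely keep one mutex per socket):

#include <sys/socket.h>
#include <mutex>
#include <cstddef>

std::mutex g_send_mutex;    // illustrative: guard each socket with its own mutex

// Sends the whole buffer or returns false on a hard error.
bool send_all(int fd, const char* data, size_t len)
{
    std::lock_guard<std::mutex> lock(g_send_mutex);
    size_t sent = 0;
    while (sent < len) {
        ssize_t n = ::send(fd, data + sent, len - sent, 0);
        if (n < 0)
            return false;   // check errno; EINTR may simply warrant a retry
        sent += static_cast<size_t>(n);
    }
    return true;            // the entire message went out without interleaving
}

Because the lock is held until the last byte of the message has been accepted, two threads calling send_all() concurrently cannot interleave their payloads on the stream.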

UDP send() to localhost under Winsock throwing away packets?

Scenario is rather simple... not allowed to use "sendto()" so using "send()" instead...
Under winsock2.2, normal operation on a brand new i7 machine running Windows 7 Professional...
Using SOCK_DGRAM socket, Client and Server console applications connect over localhost (127.0.0.1) to test things ...
Have to use packets of constant size...
Client socket uses connect(), Server socket uses bind()...
Client sends N packets using a series of BLOCKING send() calls. Server only uses an ioctlsocket() call with FIONREAD, running in a while loop, to constantly printf() the number of bytes waiting to be received...
PACKETS GET LOST UNLESS I PUT SLEEP() WITH A CONSIDERABLE AMOUNT OF TIME... What I mean is the number of bytes on the receiving socket differs between runs if I do not use SLEEP()...
Have played with changing buffer sizes, situation did not change much, except now there is no overflow, but the problem with the delay remains the same ...
I have seen many discussions about the issue between send() and recv(), but in this scenario, recv() is not even involved...
Thoughts anyone?
(P.S. The constraints under which I am programming are required for reasons beyond my control, so no WSA, .NET, MFC, STL, BOOST, QT or other stuff)
It is NOT an issue of buffer overflow, for three reasons:
1) Both incoming and outgoing buffers are set and checked to be significantly larger than ALL of the information being sent.
2) There is no recv(), only checking of the incoming buffer via an ioctl() call; recv() is called long after, upon user input.
3) When a Sleep() of >40ms is added between send()s, the whole thing works, i.e. if there were an overflow no amount of Sleep() would have helped (again, see point (2)).
PACKETS GET LOST UNLESS I PUT SLEEP() WITH A CONSIDERABLE AMOUNT OF TIME... What I mean is the number of bytes on the receiving socket differs between runs if I do not use SLEEP()...
This is expected behavior; as others have said in the comments, UDP packets can and do get dropped for any reason. In the context of localhost-only communication, however, the reason is usually that a fixed-size packet buffer somewhere is full and can't hold the incoming UDP packet. Note that UDP has no concept of flow control, so if your receiving program can't keep up with your sending program, packet loss is definitely going to occur as soon as the buffers get full.
As for what to do about it, the insert-a-call-to-sleep() solution isn't particularly good because you have no good way of knowing what the "right" sleep duration ought to be. (Too short a sleep() and you'll still drop packets; too long a sleep() and you're transferring data more slowly than you otherwise could; and of course the "best" value will likely vary from one computer to the next, or one moment to the next, in non-obvious ways.)
One thing you could do is switch to a different transport protocol such as TCP, or (since you're only communicating within localhost), a simple pipe or socketpair. These protocols have the lossless FIFO semantics that you are looking for, so they might be the right tool for the job.
Assuming you are required to use UDP, however, UDP packet loss will be a fact of life for you, but there are some things you can do to reduce packet loss:
send() in blocking mode, or if using non-blocking send(), be sure to wait until the UDP socket select()'s as ready-for-write before calling send(). (I know you said you send() in blocking mode; I'm just including this for completeness)
Make your SO_RCVBUF setting as large as possible on the receiving UDP socket(s). The larger the buffer, the lower the chance of it filling up to capacity.
In the receiving program, be sure that the thread that calls recv() does nothing else that would ever hold it off from getting back to the next recv() call. In particular, no blocking operations (even printf() is a blocking operation that can slow your thread down, especially under Windows where the DOS prompt is infamous for slow scrolling under load)
Run your receiver's network recv() loop in a separate thread that does nothing else but call recv() and place the received data into a FIFO queue (or other shared data structure) somewhere. Then another thread can do the less time-critical work of examining and parsing the data in the FIFO, without fear of causing a dropped packet.
Run the UDP-receive thread at the highest priority you can convince the OS to let you run at. The fewer other tasks that can hold off the UDP-receive thread, the fewer opportunities for packets to get dropped during those hold-off periods.
Just keep in mind that no matter how clever you are at reducing the chances for UDP packet loss, UDP packet loss will still happen. So regardless you need to come up with a design that allows your programs to still function in a reasonably useful manner even when packets are lost. This could be done by implementing some kind of automatic-resend mechanism, or (depending on what you are trying to accomplish) by designing the protocol such that packet loss can simply be ignored.
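A minimal sketch of two of the suggestions above for the receiving side, enlarging SO_RCVBUF and running a tight recv() loop that only queues data (the queue type and buffer sizes are assumptions):

#include <winsock2.h>
#include <deque>
#include <mutex>
#include <vector>

std::mutex g_queue_mutex;
std::deque<std::vector<char>> g_packet_queue;    // drained by a separate worker thread

void ReceiveLoop(SOCKET s)
{
    // Ask for a large receive buffer; the stack may grant less than requested.
    int rcvbuf = 4 * 1024 * 1024;
    setsockopt(s, SOL_SOCKET, SO_RCVBUF,
               reinterpret_cast<const char*>(&rcvbuf), sizeof(rcvbuf));

    std::vector<char> packet(65536);             // maximum UDP datagram size
    for (;;) {
        int n = recv(s, packet.data(), static_cast<int>(packet.size()), 0);
        if (n <= 0)
            continue;                            // real code: inspect WSAGetLastError()
        std::lock_guard<std::mutex> lock(g_queue_mutex);
        g_packet_queue.emplace_back(packet.begin(), packet.begin() + n);
        // No printf() or other slow work here; parse in the consumer thread.
    }
}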

TCP WSASend Completion Criteria

I am unable to find the specification of what it means that a TCP WSASend call completes. Does the completion of a WSASend operation require that an ACK response be received?
This question is relevant for slower networks with a 200ms - 2s ping timeout. Will it take 200ms - 2s for the WSASend completion callback to be invoked (or whatever completion mechanism is used)? Or perhaps only on some packets will Windows wait for an ACK and consider the WSASend operation complete much faster for all other packets?
The exact behavior makes a big difference with regard to buffer life cycle management and in turn has a significant impact on performance (locking, allocation/deallocation, and reference counting).
WSASend does not guarantee the following:
That the data was sent (it might have been buffered)
That it was received (it might have been lost)
That the receiving application processed it (the sender cannot ever know this by principle)
It does not require a round-trip. In fact, with nagling enabled small amounts of data are always buffered for 200ms hoping that the application will send more. WSASend must return quickly so that nagling has a chance to work.
If you require confirmation, change the application protocol so that you get a confirmation back. No other way to do it.
To clarify, even without nagling (TCP_NODELAY) you do not get an ACK for your send operation. It will be sent out to the network but the remote side does not know that it should ACK. TCP has no way to say "please ACK this data immediately". Data being sent does not mean it will ever be received. The network could drop a second after the data was pushed out to a black hole.
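For reference, disabling nagling looks roughly like this; note that it only changes when data is pushed out to the network, not whether a completion implies an ACK:

#include <winsock2.h>

// Disable the Nagle algorithm so small writes are not held back waiting
// for more data; 's' is assumed to be a connected TCP socket.
void DisableNagle(SOCKET s)
{
    BOOL noDelay = TRUE;
    setsockopt(s, IPPROTO_TCP, TCP_NODELAY,
               reinterpret_cast<const char*>(&noDelay), sizeof(noDelay));
}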
It's not documented. It will likely be different depending on whether you have turned off send buffering. However, you always need to pay attention to the potential time that it will take to get a WSASend() completion, especially if you're using asynchronous calls. See this article of mine for details.
You get a WSASend() completion when the TCP stack has finished with your buffer. If you have NOT turned off send buffering by setting SO_SNDBUF to zero then it likely means you will get a completion once the stack copies your data into its buffers. If you HAVE turned off send buffering then it likely means that you will get a completion once you get an ACK (simply because the stack should need your buffer for any potential retransmissions). However, it's not documented.
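If you do want completions to (probably) track ACKs rather than a copy into the stack's buffers, turning off send buffering looks roughly like this, with the caveat above that the exact completion semantics are not documented:

#include <winsock2.h>

// With SO_SNDBUF set to zero the stack sends directly from the caller's
// buffers, so a WSASend() completion likely implies the data was ACKed.
void DisableSendBuffering(SOCKET s)
{
    int zero = 0;
    setsockopt(s, SOL_SOCKET, SO_SNDBUF,
               reinterpret_cast<const char*>(&zero), sizeof(zero));
}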

Calling WSASend() in completion port?

Many of you know that the original "send()" will not necessarily write to the wire the number of bytes you ask it to. You can easily use a pointer and a loop to make sure your data is all sent.
However, I don't see how this works with WSASend() and completion ports. It returns immediately and you have no control over how much was sent (except via the length you are given access to in the completion routine). How does this get solved?
Do you have to call WSASend() in the routine multiple times in order the get all the data out? Doesn't this seem like a great disadvantage, especially if you want your data out in a particular order and multiple threads access the routines?
When you call WSASend with a socket that is associated with an IOCP and an OVERLAPPED structure you effectively pass off your data to the network stack to send. The network stack will give you a "completion" once the data buffer that you used is no longer required by the network stack. At that point you are free to reuse or release the memory used for your data buffer.
Note that the data is unlikely to have reached the peer at the point the completion is generated and the generation of the completion means nothing more than the network stack has taken ownership of the contents of the buffer.
This is different to how send operates. With send in blocking mode the call to send will block until the network stack has used all of the data that you have supplied. For calls to send in non-blocking mode the network stack takes as much data as it can from your buffer and then returns to you with details of how much it used; this means that some of your data has been used. With WSASend, generally, all of your data is used before you are notified.
It's possible for an overlapped WSASend to fail due to resource limits or network errors. It's unusual to get a failure which indicates that some data has been sent but not all; usually it's all sent OK or none sent at all. However, it IS possible to get a completion with an error which indicates that some data has been used but not all. How you proceed from this point depends on the error (temporary resource limit or hard network fault) and how many other WSASends you have pending on that socket (zero or non-zero). You can only try to send the rest of the data if you have a temporary resource error and no other outstanding WSASend calls for this socket; and this is made more complicated by the fact that you don't know when the temporary resource limit situation will pass... If you ever have a temporary-resource-limit-induced partial send and you DO have other WSASend calls pending then you should probably abort the connection, as you may have garbled your data stream by sending part of the buffer from this WSASend call and then all (or part) of a subsequent WSASend call.
Note that it's a) useful and b) efficient to have multiple WSASend calls outstanding on a socket; it's the only way to keep the connection fully utilised. You should, however, be aware of the memory and resource usage implications of having multiple overlapped WSASend calls pending at one time (see here), as effectively you are handing control of the lifetime of your buffers (and thus the amount of memory and resources that your code uses) to the peer, due to TCP flow control issues. See SIO_IDEAL_SEND_BACKLOG_QUERY and SIO_IDEAL_SEND_BACKLOG_CHANGE if you want to get really clever...
WSASend() on a completion port does not notify you until all of the requested data has been accepted by the socket, or until an error occurs, whichever happens first. It keeps working in the background until all of the data has been accepted (or errored). Until it notifies you, that buffer has to remain active in memory, but your code is free to move on to do other things while WSASend() is busy. There is no notification when the data is actually transmitted to the peer. IF you need that, then you have to implement an ACK in your data protocol so the peer can notify you when it receives the data.
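A minimal sketch of the flow both answers describe, posting an overlapped WSASend() and reaping its completion from the port; the SendContext struct and the logging are assumptions for illustration:

#include <winsock2.h>
#include <cstdio>

struct SendContext {                 // hypothetical per-operation state
    WSAOVERLAPPED overlapped{};
    WSABUF buf{};
};

// Post the send; the buffer must stay alive until the completion arrives.
bool PostSend(SOCKET s, SendContext* ctx, char* data, ULONG len)
{
    ctx->buf.buf = data;
    ctx->buf.len = len;
    int rc = WSASend(s, &ctx->buf, 1, nullptr, 0, &ctx->overlapped, nullptr);
    return rc == 0 || WSAGetLastError() == WSA_IO_PENDING;
}

// Somewhere in the I/O thread: pump completions off the port.
void PumpCompletions(HANDLE iocp)
{
    DWORD bytes = 0;
    ULONG_PTR key = 0;
    LPOVERLAPPED ov = nullptr;
    while (GetQueuedCompletionStatus(iocp, &bytes, &key, &ov, INFINITE)) {
        // 'ov' is the first member of the SendContext we posted, so the
        // buffer it describes may now be reused or released.
        SendContext* ctx = reinterpret_cast<SendContext*>(ov);
        std::printf("send completed, %lu bytes accepted by the stack\n",
                    static_cast<unsigned long>(bytes));
        (void)ctx;
    }
}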
First, regarding send: two different things may happen, depending on how the socket is configured.
If the socket is in so-called blocking mode (the default) - the call to send will block the calling thread until all the input buffer is consumed by the underlying network driver. (Note that this doesn't mean that the data has already arrived at the peer.)
If the socket is transferred to non-blocking mode - the call to send will fail if the underlying driver can't consume all the input immediately. WSAGetLastError returns WSAEWOULDBLOCK in such a case. The application should wait until it may retry the send. Instead of calling send in a loop, the application should get a notification from the system about the socket state change. Functions such as WSAEventSelect or WSAAsyncSelect may be used for this (as well as legacy select).
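As a rough illustration of that pattern on a non-blocking Winsock socket, using legacy select() to wait for writability before retrying (the helper name is an assumption):

#include <winsock2.h>

// Keeps sending until the whole buffer is accepted; false on a hard error.
bool SendAllNonBlocking(SOCKET s, const char* data, int len)
{
    int sent = 0;
    while (sent < len) {
        int n = send(s, data + sent, len - sent, 0);
        if (n != SOCKET_ERROR) {
            sent += n;
            continue;
        }
        if (WSAGetLastError() != WSAEWOULDBLOCK)
            return false;                         // real error

        // Wait until the driver can accept more data.
        fd_set writable;
        FD_ZERO(&writable);
        FD_SET(s, &writable);
        if (select(0, nullptr, &writable, nullptr, nullptr) == SOCKET_ERROR)
            return false;
    }
    return true;
}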
Now, with I/O completion ports and WSASend the story is somewhat different. When the socket is associated with the completion port - it's automatically transferred to a non-blocking mode.
If the call to WSASend can't be completed immediately (i.e. the network driver can't consume all the input) - WSASend returns SOCKET_ERROR and WSAGetLastError returns WSA_IO_PENDING. This actually means that an asynchronous operation has started but has not finished yet.
That is, you should not call WSASend repeatedly, because the send operation is already in progress. When it's finished (either successfully or not) you'll get the notification on the I/O completion port, and meanwhile the calling thread is free to do other things.

multi socket architecture with asio

I have a Client - Server architecture with 10 Servers with permanent connections with a single Client, the software is written in C++ and uses boost asio libraries.
All the connections are created in the initialization phase, and they are always open during the execution.
When the client needs some information, it sends a request to all of the servers. Each server finds the information needed and answers the client.
In the client there is a single thread that is in charge of receiving the messages from all of the sockets; in particular, I use only one io_service and one async_read per socket.
When a message arrives on one of the sockets, the async_read reads the first N bytes, which are the header of the message, and then calls a function that uses read (synchronous) to read the rest of the message. On the server side, the header and the rest of the message are sent with a single write (synchronous).
So, the architecture works properly, but I noticed that sometimes the synchronous read takes more time (~0.24 sec) than usual.
In theory the data is ready to be read, because the synchronous read is called when the async_read has already read the header. I also saw that if I use only one server instead of 10, this problem doesn't occur. Furthermore, I noticed that this problem is not caused by the size of the message.
Is it possible that the problem occurs because the io_service is not able to handle all 10 async_reads? In particular, if all the sockets receive a message at the same time, could the io_service lose some time managing the queues and slow down my synchronous read?
I haven't posted the code, because it is difficult to extract it from the project, but if you don't understand my description I could write an example.
Thank you.
1) When the async_read completion handler gets invoked, it doesn't mean that some data is available; it means that all the data available up to that moment has already been read (unless you specified a restricting completion condition). So the subsequent synchronous read might wait until some more data arrives.
2) Blocking a completion handler is a bad idea, because you actually block all the other completion handlers and other functors posted to that io_service. Consider changing your design.
If you go for an asynchronous design, don't mix in some synchronous parts. Replace all your synchronous reads and writes with asynchronous ones. Both reads and writes will block your thread while the asynchronous variants will not.
Further, if you know the number of expected bytes exactly after reading the header you should request exactly that number of bytes.
If you don't know it, you could go for a single async_read_some with the size of the biggest message you expect. async_read_some will notify you how many bytes were actually read.
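A minimal Boost.Asio sketch of the fully asynchronous header-then-body pattern suggested above; the Connection class and the 4-byte length-prefixed wire format are assumptions for illustration:

#include <boost/asio.hpp>
#include <array>
#include <cstdint>
#include <cstring>
#include <memory>
#include <vector>

using boost::asio::ip::tcp;

// Hypothetical connection wrapper: read a fixed-size header, then exactly
// the number of body bytes it announces, all asynchronously.
class Connection : public std::enable_shared_from_this<Connection> {
public:
    explicit Connection(tcp::socket socket) : socket_(std::move(socket)) {}

    void start() { read_header(); }

private:
    void read_header()
    {
        auto self = shared_from_this();
        boost::asio::async_read(socket_, boost::asio::buffer(header_),
            [this, self](boost::system::error_code ec, std::size_t /*n*/) {
                if (ec) return;
                // Assumed wire format: 4-byte body length in native byte order.
                std::uint32_t body_len = 0;
                std::memcpy(&body_len, header_.data(), sizeof(body_len));
                body_.resize(body_len);
                read_body();
            });
    }

    void read_body()
    {
        auto self = shared_from_this();
        boost::asio::async_read(socket_, boost::asio::buffer(body_),
            [this, self](boost::system::error_code ec, std::size_t /*n*/) {
                if (ec) return;
                // Handle the complete message here, then wait for the next one.
                read_header();
            });
    }

    tcp::socket socket_;
    std::array<char, 4> header_{};
    std::vector<char> body_;
};

With this shape there is no synchronous read to stall the io_service thread, and each async_read asks for exactly the number of bytes it needs.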