How to resolve a WSAEWOULDBLOCK error - C++

I have a Windows 7 application that sends data between two clients over a TCP connection. While testing, I found that I frequently get a WSAEWOULDBLOCK error on my socket. To work around this error I put a loop around the call, for example:
do
{
size_t value = ::send(); /*with proper arguments*/
}while(GetLastError() == 10035);
So if I get error 10035, I resend the data.
But now I see that this loop sometimes runs forever and my application ends up in a kind of deadlock. I tried increasing the socket's buffer size, but that did not help.
If anybody has an idea how to resolve the WSAEWOULDBLOCK error, please let me know.

WSAEWOULDBLOCK is not really an error; it simply tells you that your send buffers are full. This can happen if you saturate the network or if the other side simply doesn't acknowledge the received data. Take a look at the select() function, which allows you to wait until buffer space is available or a timeout occurs. There is also a way to bind a Win32 event to a socket (WSAEventSelect), which then allows its use with WaitForMultipleObjects in case you want to abort waiting early.
BTW: I initially wanted to object to your use of the term "deadlock", but this is also something that could be happening: if you wait to send a response before receiving the next request, and the other side wants to send its next request instead of receiving your response, your applications are effectively deadlocked. Using select(), you can determine whether you can send data, can receive data, or whether the connection has failed, which allows you to handle each of these cases correctly as it occurs.
Note: I also assume that your code is not really a call to socket() but one to send/recv.

Related

Epoll zero recv() and negative(EAGAIN) send()

I have been struggling with epoll for the last few days and I'm in the middle of nowhere right now ;)
There's a lot of information on the Internet and, obviously, in the system man pages, but I probably took an overdose and am a bit confused.
In my server app (a backend to nginx) I'm waiting for data from clients in ET mode:
event_template.events = EPOLLIN | EPOLLRDHUP | EPOLLET
Things got curious when I noticed that nginx was responding with 502 even though I could see a successful send() on my side. I ran Wireshark to sniff the traffic and realised that my server was trying to send data to another machine on the network and getting RST back. So I decided that the socket descriptor was invalid and that this was some sort of "undefined behaviour". Finally, I found out that on the second recv() I was getting zero bytes, which means that the connection has to be closed and I'm not allowed to send data back anymore. Nevertheless, I was getting not just EPOLLIN from epoll but EPOLLRDHUP right after it.
Question: Do I have to close socket just for reading when recv() returns zero and shutdown(SHUT_WR) later on during EPOLLRDHUP processing?
Reading from socket in a nutshell:
std::array<char, BatchSize> batch;
ssize_t total_count = 0, count = 0;
do {
    count = recv(_handle, batch.begin(), batch.size(), MSG_DONTWAIT);
    if (0 == count && 0 == total_count) {
        /// #??? Do I need to wait zero just on first iteration?
        close();
        return total_count;
    } else if (count < 0) {
        if (errno == EAGAIN || errno == EWOULDBLOCK) {
            /// #??? Will be back with next EPOLLIN?!
            break;
        }
        _last_error = errno;
        /// #brief just log the error
        return 0;
    }
    if (count > 0) {
        total_count += count;
        /// DATA!
        if (count < batch.size()) {
            /// #??? Received less than requested - no sense to repeat recv, otherwise I need one more turn?!
            return total_count;
        }
    }
} while (count > 0);
Probably my main mistake was attempting to send data on an invalid socket descriptor, and everything that happened later was just a consequence. But I continued to dig ;) The second part of my question is about writing to a socket, also in MSG_DONTWAIT mode.
As far as I now know, send() may also return -1 with EAGAIN, which means that I'm supposed to subscribe to EPOLLOUT and wait until the kernel buffer has enough free space to receive some data from me. Is this right? But what if the client won't wait that long? Or may I call a blocking send() (I'm sending on a different thread anyway) and be guaranteed that everything I send to the kernel will really be sent to the peer because of setsockopt(SO_LINGER)? And a final guess that I'd like confirmed: I'm allowed to read and write simultaneously, but N>1 concurrent writes are a data race, and all I need to deal with that is a mutex.
Thanks to everyone who at least read to the end :)
Questions: Do I have to close socket just for reading when recv()
returns zero and shutdown(SHUT_WR) later on during EPOLLRDHUP
processing?
No, there is no particular reason to perform that somewhat convoluted sequence of actions.
Having received a 0 return value from recv(), you know that the connection is at least half-closed at the network layer. You will not receive anything further from it, and I would not expect EPoll operating in edge-triggered mode to further advertise its readiness for reading, but that does not in itself require any particular action. If the write side remains open (from a local perspective) then you may continue to write() or send() on it, though you will be without a mechanism for confirming receipt of what you send.
What you actually should do depends on the application-level protocol or message exchange pattern you are assuming. If you expect the remote peer to shutdown the write side of its endpoint (connected to the read side of the local endpoint) while awaiting data from you then by all means do send the data it anticipates. Otherwise, you should probably just close the whole connection and stop using it when recv() signals end-of-file by returning 0. Note well that close()ing the descriptor will remove it automatically from any Epoll interest sets in which it is enrolled, but only if there are no other open file descriptors referring to the same open file description.
Either way, until you do close() the socket, it remains valid, even if you cannot successfully communicate over it. Until then, there is no reason to expect that messages you attempt to send over it will go anywhere other than possibly to the original remote endpoint. Attempts to send may succeed, or they may appear to succeed even though the data never arrive at the far end, or they may fail with one of several different errors.
/// #??? Do I need to wait zero just on first iteration?
You should take action on a return value of 0 whether any data have already been received or not. Not necessarily identical action, but either way you should arrange one way or another to get it out of the EPoll interest set, quite possibly by closing it.
/// #??? Will be back with next EPOLLIN?!
If recv() fails with EAGAIN or EWOULDBLOCK then EPoll might very well signal read-readiness for it on a future call. Not necessarily the very next one, though.
/// #??? Received less than requested - no sense to repeat recv, otherwise I need one more turn?!
Receiving less than you requested is a possibility you should always be prepared for. It does not necessarily mean that another recv() won't return any data, and if you are using edge-triggered mode in EPoll then assuming the contrary is dangerous. In that case, you should continue to recv(), in non-blocking mode or with MSG_DONTWAIT, until the call fails with EAGAIN or EWOULDBLOCK.
As far as I now know, send() may also return -1 with EAGAIN, which means that I'm supposed to subscribe to EPOLLOUT and wait until the kernel buffer has enough free space to receive some data from me. Is this right?
send() certainly can fail with EAGAIN or EWOULDBLOCK. It can also succeed, but send fewer bytes than you requested, which you should be prepared for. Either way, it would be reasonable to respond by subscribing to EPOLLOUT events on the file descriptor, so as to resume sending later.
But what if the client won't wait that long?
That depends on what the client does in such a situation. If it closes the connection then a future attempt to send() to it would fail with a different error. If you were registered only for EPOLLOUT events on the descriptor then I suspect it would be possible, albeit unlikely, to get stuck in a condition where that attempt never happens because no further event is signaled. That likelihood could be reduced even further by registering for and correctly handling EPOLLRDHUP events, too, even though your main interest is in writing.
If the client gives up without ever closing the connection then EPOLLRDHUP probably would not be useful, and you're more likely to get the stale connection stuck indefinitely in your EPoll. It might be worthwhile to address this possibility with a per-FD timeout.
Or may I call a blocking send() (I'm sending on a different thread
anyway) and be guaranteed that everything I send to the kernel will
really be sent to the peer because of setsockopt(SO_LINGER)?
If you have a separate thread dedicated entirely to sending on that specific file descriptor then you can certainly consider blocking send()s. The only drawback is that you cannot implement a timeout on top of that, but other than that, what would such a thread do if it were not blocking either on sending data or on waiting for more data to send?
I don't see quite what SO_LINGER has to do with it, though, at least on the local side. By default, the kernel will keep trying to send data that you have already dispatched via a send() call to the remote peer, even if you close() the socket while data are still buffered. SO_LINGER changes how close() behaves in that situation: with the option enabled and a nonzero timeout, close() blocks until the queued data have been sent or the timeout expires, and with a zero timeout on a TCP socket the queued data are discarded and the connection is reset. (Receiving and dropping straggling segments after the connection is closed, so that they are not accidentally delivered to another socket, is the job of TCP's TIME_WAIT state rather than of SO_LINGER.)
None of this can guarantee that the data are successfully delivered to the remote peer, however. Nothing can guarantee that.
And a final guess that I'd like confirmed: I'm allowed to read and
write simultaneously, but N>1 concurrent writes are a data race, and
all I need to deal with that is a mutex.
Sockets are full-duplex, yes. Moreover, POSIX requires most functions, including send() and recv(), to be thread safe. Nevertheless, multiple threads writing to the same socket is asking for trouble, for the thread safety of individual calls does not guarantee coherency across multiple calls.

Boost asio synchronous write doesn't block

I've written an application that sends information through a socket using a TCP connection. For several reasons, I'm using blocking calls, but I've noticed that the boost::asio::write() method doesn't block when the other machine (the one receiving the data) disconnects. It doesn't raise an error either.
Is this the expected behavior?
A socket write blocks only when there is no room in the send buffer; otherwise it returns as soon as the data is in the buffer, not when the data has been delivered to the recipient. Also, the network stack may not detect immediately that the other side has disconnected, so you may or may not see an error code on write. So yes, it is expected behavior.

How to determine if a blocking SSL BIO connection was closed?

I have a blocking SSL BIO object which I want to send data to. The problem is that the connection was closed on the remote side and I cannot find that out until I do a read (BIO_write does NOT return an error). However, I cannot read before I send since I do not want to block. Lastly, the code responsible for sending the data and the code responsible for reading are separate meaning that the failed read cannot trigger another send. How do I fix this?
There are two kinds of "close" state, both referred to as "half-close" states. They mostly have to do with whether one side or the other side of a socket is going to be sending any more application data. When your recv call returns 0, it is actually notifying you that there is no more data to be received. However, it is still okay to send data, unless the send call signals some other kind of error, like EPIPE or ECONNRESET (I am not sure what the Windows equivalents of these are for Winsock, but I know they are there). If SSL_write is not returning an error, it is because the other side of the socket is still accepting the data.
The recv call allows a non-blocking check for the "no more data" state, and it can be done like this:
char c;
int r = recv(sock, &c, 1, MSG_DONTWAIT|MSG_PEEK);
If r is 0, the socket has received an indication that there is no more data pending from the other end. Otherwise, the call will return 1 for a byte of data (which is still in the input buffer because of MSG_PEEK), or -1. If the errno is EAGAIN or EWOULDBLOCK (which is possible because of MSG_DONTWAIT) there is no error. Any other errno value should be consulted, but is likely an indication that the socket is in an invalid state, and needs to be closed.
Before the socket gets closed, the OpenSSL application is supposed to make sure SSL_shutdown has returned 1. Then, the close on the socket occurs after the SSL object gets destroyed (with SSL_free). What this means is that, unless the application does something abnormal, both sides of the socket using OpenSSL should have seen SSL_shutdown return 1 and then both sides can safely close the connection.
If you want to check for the shutdown state of your SSL context, you can use SSL_get_shutdown, which will report whether or not the other end has started the SSL_shutdown sequence.

File transfer C++

When my client sends a file to the server, should I Sleep(100) or so before sending the next chunk to ensure the server has enough time to download + write the data?
Does that just seem completely unnecessary?
Also, I'm getting would-block errors (#10035) when sending a chunk, so I'm just looping send until it succeeds: if send == SOCKET_ERROR goto SendAgain; Is that OK?
If you're sending your file via TCP, then the protocol itself ensures that everything is received, so I wouldn't put a sleep between chunks.
The would-block error means that your local send buffer is full, either because you are sending faster than the network can carry the data or because the remote buffer is full and the receiver has stopped accepting more. Sending the chunk again is OK, because nothing from the failed call was queued. Just make sure you retry only when the error really is 10035 and not on every SOCKET_ERROR, otherwise a genuine failure turns into an endless loop.
Here is a small article about your error: Winsock error 10035
In my opinion, using a sleep function to wait for something to finish is the wrong approach 99% of the time.
You'll never know how much time a process will need to execute (it can be delayed by e.g. load spikes, other I/O problems or whatever).
If you want to make sure something important has executed completely, you should read about semaphores or something similar, where you lock/release processes on start/end.
Taken from a man-page:
When the message does not fit into the send buffer of the socket,
send() normally blocks, unless the socket has been placed in
nonblocking I/O mode. In nonblocking mode it would fail with the error
EAGAIN or EWOULDBLOCK in this case. The select(2) call may be
used to determine when it is possible to send more data.

Overlapped message named pipe, ERROR_MORE_DATA and CancelIoEx

I am using named pipes for the first time and have come across this problem. Both client and server use overlapped operations, and here is the specific situation I have a problem with.
Client
C1. Connects to the server.
C2. Sends a message larger than both the pipe buffer and the buffer passed to the overlapped read operation in the server.
C3. Successfully cancels the send operation.
Server
S1. Creates the pipe and waits for the client.
S2. When the client is connected, it reads the message.
S21. Because message doesn't fit into the buffer(ERROR_MORE_DATA), it is read part by part.
It seems to me that there is no way to tell when the whole message, as an isolated unit, has been canceled. In particular, if the client cancels the send operation, the server does not receive the whole message, just a part of it, and the subsequent read operation returns with ERROR_IO_PENDING (in my case), which means there is no data to be read and the read operation has been queued. I would expect some means of telling the reader that the message has been canceled, so that the reader can act upon it.
However, the relevant documentation is scattered over MSDN, so I may well be missing something. I would really appreciate it if anyone could shed some light on this. Thanks.
You are correct, there is no way to tell.
If you cancel the WriteFile partway through, only part of the message will be written, so only that part will be read by the server. There is no "bookkeeping" information sent about how large the message was going to be before you cancelled it; what is sent is just the raw data.
So the answer is: Don't cancel the IO, just wait for it to succeed.
If you do need to cancel IO partway through, you should probably cut the connection and start again from the beginning, just as you would for a network outage.
(You could check your OVERLAPPED structure to find out how much was actually written, and carry on from there, but if you wanted to do that you would probably just not cancel the IO in the first place.)
Why did you want to cancel the IO anyway? What set of circumstances triggers this requirement?