Handling SSL_shutdown correctly - c++

The OpenSSL documentation on SSL_shutdown states that:
It is therefore recommended, to check the return value of SSL_shutdown() and call SSL_shutdown() again, if the bidirectional shutdown is not yet complete (return value of the first call is 0).
https://www.openssl.org/docs/ssl/SSL_shutdown.html
Below is the code I have been using: it checks for a return value of 0 from SSL_shutdown and calls it again. My question is: is it okay to disregard the return value of the second SSL_shutdown call, or should we keep retrying SSL_shutdown until it returns 1 (bidirectional shutdown complete)?
int r = SSL_shutdown(ssl);
//error handling here if r < 0
if(!r)
{
    shutdown(fd,1);
    SSL_shutdown(ssl); //how should I handle return value and error handling here is it required??
}
SSL_free(ssl);
SSLMap.erase(fd);
shutdown(fd,2);
close(fd);

OpenSSL is a bit of a dark art.
Firstly the page you referenced has HTML-ified the return values badly. Here's what the man-page actually says:
RETURN VALUES
The following return values can occur:
0 The shutdown is not yet finished. Call SSL_shutdown() for a second
time, if a bidirectional shutdown shall be performed. The output
of SSL_get_error(3) may be misleading, as an erroneous
SSL_ERROR_SYSCALL may be flagged even though no error occurred.
1 The shutdown was successfully completed. The "close notify" alert
was sent and the peer's "close notify" alert was received.
-1 The shutdown was not successful because a fatal error occurred
either at the protocol level or a connection failure occurred. It
can also occur if action is need to continue the operation for non-
blocking BIOs. Call SSL_get_error(3) with the return value ret to
find out the reason.
If you have blocking BIOs, things are relatively simple. A 0 on the first call means you need to call SSL_shutdown again if you want a full bidirectional shutdown. Basically it means that you sent a close_notify alert but haven't received one back yet. A 1 would mean you had previously received a close_notify alert from the peer, and you're totally done. A -1 means an unrecoverable error. The second call (which you only make if you got a 0 back) completes the bidirectional shutdown, i.e. it now waits for the other side to send you its close_notify alert. Logic dictates you can't get a 0 back again (because it's a blocking BIO and will have completed the first step). A -1 indicates an error, and a 1 indicates successful completion.
If you have non-blocking BIOs, the same "possibly 0 then 1" return values apply, save for the fact you need to go through the whole SSL_ERROR_WANT_READ and SSL_ERROR_WANT_WRITE rigmarole as well, i.e.:
If the underlying BIO is non-blocking, SSL_shutdown() will also return
when the underlying BIO could not satisfy the needs of SSL_shutdown()
to continue the handshake. In this case a call to SSL_get_error() with
the return value of SSL_shutdown() will yield SSL_ERROR_WANT_READ or
SSL_ERROR_WANT_WRITE. The calling process then must repeat the call
after taking appropriate action to satisfy the needs of SSL_shutdown().
The action depends on the underlying BIO. When using a non-blocking
socket, nothing is to be done, but select() can be used to check for
the required condition. When using a buffering BIO, like a BIO pair,
data must be written into or retrieved out of the BIO before being able
to continue.
So you have two levels of repetition. You call SSL_shutdown the 'first' time, but repeat it whenever you get SSL_ERROR_WANT_READ or SSL_ERROR_WANT_WRITE, after going around the select() loop in the normal way. Only count the 'first' SSL_shutdown as done if you get a non-SSL_ERROR_WANT_ error code (in which case it failed), or you get a 0 or 1 return. If you get a 1 return, you're done. If you get a 0 return and you want a bidirectional shutdown, then you have to make the second call, on which you will again need to check for SSL_ERROR_WANT_READ or SSL_ERROR_WANT_WRITE and retry via select. Consistent with the blocking case above, that call should not return 0 once it completes, but may return 1 or an error.
Not simple.
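The two levels of repetition described above can be sketched as a small driver. This is a minimal model, not OpenSSL code: StepResult, Want, Outcome, run_one_phase and shutdown_nonblocking are hypothetical names, the step callable stands in for one SSL_shutdown() call followed by SSL_get_error() when the result is negative, and wait_for_io stands in for the select() wait.

```cpp
#include <functional>

// Hypothetical types for illustration; none of these come from OpenSSL.
enum class Want { None, Read, Write, Fatal };

struct StepResult {
    int  ret;   // what SSL_shutdown() returned: 1, 0, or a negative value
    Want want;  // what SSL_get_error() reported when ret < 0
};

enum class Outcome { Bidirectional, SentOnly, Failed };

// Level 1: repeat one logical shutdown call until it stops asking for I/O.
inline int run_one_phase(const std::function<StepResult()>& step,
                         const std::function<void(Want)>& wait_for_io) {
    for (;;) {
        StepResult r = step();
        if (r.ret >= 0) return r.ret;          // 0 or 1: this phase is finished
        if (r.want == Want::Read || r.want == Want::Write) {
            wait_for_io(r.want);               // e.g. select() on the socket
            continue;                          // then repeat the same call
        }
        return -1;                             // any other error is fatal
    }
}

// Level 2: the first phase sends our close_notify, the second waits for the peer's.
inline Outcome shutdown_nonblocking(const std::function<StepResult()>& step,
                                    const std::function<void(Want)>& wait_for_io,
                                    bool want_bidirectional) {
    int first = run_one_phase(step, wait_for_io);
    if (first < 0)  return Outcome::Failed;
    if (first == 1) return Outcome::Bidirectional;
    if (!want_bidirectional) return Outcome::SentOnly;

    int second = run_one_phase(step, wait_for_io);
    if (second == 1) return Outcome::Bidirectional;
    return second == 0 ? Outcome::SentOnly : Outcome::Failed;
}
```

In real code the step callable would wrap SSL_shutdown(ssl) and translate SSL_get_error() into the Want value; keeping the I/O behind callables just makes the control flow easy to exercise with a scripted sequence of results.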
Couple more notes from the docs: after calling SSL_shutdown and getting a "0" back the first time, you could optionally then call SSL_read instead of SSL_shutdown (in case the peer is still sending you any data on that SSL socket), and, I guess, "hope" that they eventually send you a close message from their side, to flush the pipes.
Also if you're planning on closing the socket after shutdown completion "anyway" you could entirely skip the second call to SSL_shutdown (the "1" of the "0 then 1") and just go ahead and close the socket, the kernel should take care of discarding the "now ignored" close_notify alert that presumably they should be about to send...

Related

SSL_shutdown() returns -1 and errno is 0

In my C++ application I use OpenSSL to connect to a server using nonblocking BIO. I am developing for mac OS X and iOS.
The first call to SSL_shutdown() returns 0, which means I have to call SSL_shutdown() again:
The following return values can occur:
0 The shutdown is not yet finished. Call SSL_shutdown() for a second time, if a bidirectional shutdown shall be performed. The output of SSL_get_error may be misleading, as an erroneous SSL_ERROR_SYSCALL may be flagged even though no error occurred.
<0
The shutdown was not successful because a fatal error occurred either at the protocol level or a connection failure occurred. It can also occur if action is need to continue the operation for non-blocking BIOs. Call SSL_get_error with the return value ret to find out the reason.
https://www.openssl.org/docs/ssl/SSL_shutdown.html
So far so good. The problem occurs on the second call to SSL_shutdown(). This returns -1, which means an error has occurred (see above). If I then check with SSL_get_error(), I get SSL_ERROR_SYSCALL, which in turn is supposed to mean a system error has occurred. But here is the catch: if I check errno, it is 0 -> unknown error. What I have read so far about the issue is that it could mean the server just "hung up", but to be honest this does not satisfy me.
Here is my implementation of the shutdown:
int result = 0;
int shutdownResult;
while ((shutdownResult = SSL_shutdown(sslHandle)) != 1) { //close connection 1 means everything is shut down ok
    if (shutdownResult == 0) { //we are supposed to call shutdown again
        continue;
    } else if (SSL_get_error(sslHandle, shutdownResult) == SSL_ERROR_WANT_READ) {
        [...] //omitted want read code, in this case the application never reaches this point
    } else if (SSL_get_error(sslHandle, shutdownResult) == SSL_ERROR_WANT_WRITE) {
        [...] //omitted want write code, in this case the application never reaches this point
    } else {
        logError("Error in ssl shutdown, ssl error: " + std::to_string(SSL_get_error(sslHandle, shutdownResult)) + ", system error: " + std::string(strerror(errno))); //something went wrong
        break;
    }
}
When run the application logs:
ERROR:: Error in ssl shutdown, ssl error: 5, system error: Undefined error: 0
So is this just the server closing the connection, or is there a more critical issue? Am I missing something really obvious?
A full SSL shutdown consists of two parts:
sending the 'close notify' alert to the peer
receiving the 'close notify' alert from the peer
The first SSL_shutdown returned 0 which means that it did send the 'close notify' to the peer but did not receive anything back yet. The second call of SSL_shutdown fails because the peer did not do a proper SSL shutdown and send a 'close notify' back, but instead just closed the underlying TCP connection.
This behavior is actually very common and you can usually just ignore the error. It does not matter much if the underlying TCP connection is going to be closed anyway. But a proper SSL shutdown is usually needed when you want to continue in plain text on the same TCP connection, as is needed for the CCC command in FTPS connections (but even there various implementations fail to handle this case properly).
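As a rough sketch of that advice, the result of the second SSL_shutdown() call can be sorted into "done", "retry", "peer just closed the TCP connection" and "real error". The function and enum names here are hypothetical, and the three constants are local copies of the SSL_ERROR_* values (5 matches the "ssl error: 5" in the log above); real code would include the OpenSSL header and use the macros directly.

```cpp
// Local copies for illustration only; in real code include <openssl/ssl.h>
// and use the SSL_ERROR_* macros instead.
constexpr int kSslErrorWantRead  = 2;  // SSL_ERROR_WANT_READ
constexpr int kSslErrorWantWrite = 3;  // SSL_ERROR_WANT_WRITE
constexpr int kSslErrorSyscall   = 5;  // SSL_ERROR_SYSCALL

// Hypothetical classification of the second SSL_shutdown() call.
enum class SecondCall { Complete, Retry, PeerClosedWithoutNotify, Error };

// ret:         return value of SSL_shutdown()
// ssl_error:   SSL_get_error(ssl, ret), only meaningful when ret != 1
// saved_errno: errno captured right after the failing call
inline SecondCall classify_second_shutdown(int ret, int ssl_error, int saved_errno) {
    if (ret == 1) {
        return SecondCall::Complete;                 // peer's close_notify arrived
    }
    if (ssl_error == kSslErrorWantRead || ssl_error == kSslErrorWantWrite) {
        return SecondCall::Retry;                    // non-blocking BIO needs more I/O
    }
    if (ssl_error == kSslErrorSyscall && saved_errno == 0) {
        // The case from the question: no system error, the peer simply
        // dropped the TCP connection without sending close_notify.
        return SecondCall::PeerClosedWithoutNotify;
    }
    return SecondCall::Error;
}
```

A caller that is about to close the socket anyway could treat PeerClosedWithoutNotify the same as Complete and only log Error.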

How to make sure that WSASend() will send the data?

WSASend() returns immediately whether or not the data has been sent. But how can I make sure that the data will be sent? For example, I have a button in my UI that sends "Hello World!" when pressed. I want to make sure that when the user clicks this button, "Hello World!" will be sent at some point, but WSASend() could return WSAEWOULDBLOCK, indicating that the data will not be sent. Should I enclose WSASend() in a loop that does not exit until WSASend() returns 0 (success)?
Note: I am using IOCP.
should I enclose WSASend() in a loop that does not exit until
WSASend() returns 0 (success)
Err.. NO!
Have the UI issue an overlapped WSASend request, complete with buffer/s and OVERLAPPED/s. If, by some miracle, it does actually return success immediately (and I've never seen it), you're good.
If (when :) it returns WSA_IO_PENDING, you can do nothing in your UI button-handler because GUI event-handlers cannot wait. Graphical UIs are state-machines: you must exit the button-handler and return to the message input queue promptly. You can do some GUI stuff, if you want. Maybe disable the 'Send' button, or add some 'Message sent' text to a memo component. That's about it; you must then exit.
Some time later, the successful completion notification, (or failure notification), will get posted to the IOCP completion queue and a handler thread will get hold of it. Use PostMessage, QueueUserAPC or similar inter-thread comms mechanism to signal 'something', (eg. the buffer object used in the original WSASend), back to the UI thread so that it can take action/s on the returned result, eg. re-enabling the 'Send' button.
Yes, it can be seen as messy, but it is the only way you can do it that will work well.
Other approaches - polling loops, Application.DoEvents, timers etc are all horrible bodges.
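The hand-off from the IOCP handler thread back to the UI thread can be sketched with a small thread-safe mailbox. This is only a model of the idea: SendCompletion and CompletionMailbox are hypothetical names, and the actual wake-up of the UI thread (PostMessage, QueueUserAPC or similar, as mentioned above) is left out.

```cpp
#include <deque>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record handed from the I/O thread to the UI thread.
struct SendCompletion {
    std::string label;  // identifies which request finished
    bool ok;            // true on success, false on failure
};

class CompletionMailbox {
public:
    // Called from the completion handler thread when a send finishes.
    void post(SendCompletion c) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.push_back(std::move(c));
    }

    // Called on the UI thread after it has been signalled.
    // Takes everything queued so far and returns at once; it never waits.
    std::vector<SendCompletion> drain() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::vector<SendCompletion> out(pending_.begin(), pending_.end());
        pending_.clear();
        return out;
    }

private:
    std::mutex mutex_;
    std::deque<SendCompletion> pending_;
};
```

The handler thread would call post() and then signal the UI thread; the UI thread's message handler would call drain() and, for example, re-enable the 'Send' button for each completed request.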
Overlapped Socket I/O
If an overlapped operation completes immediately, WSASend returns a value of zero and the lpNumberOfBytesSent parameter is updated with the number of bytes sent. If the overlapped operation is successfully initiated and will complete later, WSASend returns SOCKET_ERROR and indicates error code WSA_IO_PENDING.
...
The error code WSA_IO_PENDING indicates that the overlapped operation has been successfully initiated and that completion will be indicated at a later time. Any other error code indicates that the overlapped operation was not successfully initiated and no completion indication will occur.
...
So, as demonstrated in the docs, you don't need to enclose it in a loop. Just check for SOCKET_ERROR: if the last error is anything other than WSA_IO_PENDING, the send failed; if it is WSA_IO_PENDING (or the call returned 0), everything is fine:
rc = WSASend(AcceptSocket, &DataBuf, 1,
             &SendBytes, 0, &SendOverlapped, NULL);
if ((rc == SOCKET_ERROR) &&
    (WSA_IO_PENDING != (err = WSAGetLastError()))) {
    printf("WSASend failed with error: %d\n", err);
    break;
}

Why my else condition is never executing?

I am working on a UDP server, and the code works fine except for the else branch. Maybe I am wrong, but I have used an else branch in the same way many times to terminate a while loop. I am not sure if it's a UDP problem or something else.
    while(1)// execute three times because its getting data only three times from the client
    {
        int total_bytes = 0;
        int bytes_recv=0;
        int count = 0;
        std::vector<double> m_vector(8000);
        // Bytes are also received 3 times correctly then why else condition not executing after receiving 3 times ?
        bytes_recv = recvfrom(Socket,(char*)m_vector.data(),64000,0,(SOCKADDR*)&ClientAddr,&i);
        count++;
        if(bytes_recv > 0 )
        {
            total_bytes = total_bytes+bytes_recv;
            std::cout<<"Server: loop counter is"<<count<<std::endl;
            std::cout<<"Server: Received bytes are"<<total_bytes<<std::endl;
        }else
        {
            //why this part never executes ?
            std::cout<<"Data Receiving has finished"<<std::endl;
            break;
        }
    }
    WSACleanup();
    system("pause");
    return 0;
}
The comment in the source says that you expect only 3 datagrams from the client. Thus, do count how many datagrams you have received, and if you already have 3 of them, do not continue calling recvfrom.
You already have a variable count, but it is reset to zero every iteration and isn't used as exit condition.
Once you have count == 3, you know that there is nothing more coming, so calling recvfrom is pointless. It will only block, since that is what you're telling it to do. Making the socket non-blocking would "help" to avoid blocking, but then you would be polling, which isn't good either (and useless, since you know there is nothing to be received). It's best to operate correctly.
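A minimal sketch of that approach keeps the counters outside the loop and uses the count as the exit condition. ReceiveSummary and receive_expected are hypothetical names, and recv_one is a stand-in for the recvfrom() call in the question (it returns the number of bytes received, or a negative value on error).

```cpp
#include <functional>

// Hypothetical summary of one receive session.
struct ReceiveSummary {
    int  datagrams   = 0;      // how many datagrams arrived
    long total_bytes = 0;      // running total across all datagrams
    bool failed      = false;  // true if the receive call reported an error
};

// recv_one stands in for one blocking recvfrom() call.
inline ReceiveSummary receive_expected(const std::function<int()>& recv_one,
                                       int expected) {
    ReceiveSummary s;                   // declared once, so the values accumulate
    while (s.datagrams < expected) {    // stop calling once everything has arrived
        int n = recv_one();
        if (n < 0) {
            s.failed = true;
            break;
        }
        ++s.datagrams;
        s.total_bytes += n;
    }
    return s;
}
```

With expected set to 3, the loop makes exactly three receive calls and then returns, so there is no fourth call left to block on.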
You could also have the client send an "end of message" datagram, but of course you would have to add a timeout and a strategy for packet loss, or the server could block forever. Not only because of malicious clients, but also simply because the receive buffer was full and a packet was dropped (which is a normal thing to happen!).
Alternatively, since there is a call to WSACleanup in your code, you're using Winsock, which means you could use overlapped WSARecvFrom instead of recvfrom. Fire off one receive, and from its completion handler fire off another two, also with a callback function. After firing off the request, forget about it and let the callback handle the rest; you can now deal with another client (the thread must be in an alertable wait for the callback to run, though; alternatively, block on an IOCP, WaitForMultipleObjects, or similar).
If no second or third packet comes in after so and so long, either send a "please resend" message or consider the client dead, close the socket and move on.
recvfrom is by default a blocking call and will only return once a packet has been read. Because of this, when the client stops sending packets it just blocks in recvfrom, so the else branch is never reached.
You could change the flags or the socket mode so that recvfrom does not block, but that's likely not what you want, because then any delay between packets makes recvfrom return without data and your loop exits.
I suppose you could see how long you've gone without receiving any packets and then shut down, so in the else case you could use a timer and a running total before exiting.
What are you trying to accomplish?
I have not checked (bad me, I know, but time's short), if recvfrom follows typical behavior, then it guarantees you that:
returns value < 0 means error
returns value == 0 means that everything was OK but channel cannot receive anything more
returns value > 0 means something was received
In TCP you get 'received bytes' == 0 only when the connection is closed.
In UDP there's no such thing as a 'connection'. The channel is always ready to receive, until the socket is closed.
Hence, it probably simply waits until something arrives. It cannot detect that there is no one left to listen to. That's UDP specifics.
If you want to catch a case when nothing arrives for a long time, try to set read timeout.

boost::interprocess::timed_receive() never returns if sending process halts

I thought that calling timed_receive() would just time out in this case, but instead it gets stuck attempting to lock a mutex.
So is there a function that I can call on my queue, before attempting to receive data, which tells me if the sending process has died or is halted?
There is no generic method to know whether the counter-party has died, halted, partially locked up (e.g. one of its threads is in an infinite loop), is playing possum, or is just being adversarial. If the counter-party is cooperative, you can communicate with heartbeats and rely on them to decide if it is responsive.
The fact that your program is "stuck" attempting to lock a mutex (are you sure that that's the case?) can indicate that,
the queue is empty, and the timeout value hasn't expired
you've run into a deadlock scenario.
The timed_receive succeeds, times out, or throws as expected:
bool timed_receive(void * buffer, std::size_t buffer_size,
std::size_t & recvd_size, unsigned int & priority,
const boost::posix_time::ptime & abs_time);
Receives a message from the message queue. The message is stored in
buffer "buffer", which has size "buffer_size". The received message
has size "recvd_size" and priority "priority". If the message queue is
empty the receiver retries until time "abs_time" is reached. Returns
true if the message has been successfully sent. Returns false if
timeout is reached. Throws interprocess_error on error.
Also make sure you pass an absolute timeout value.

send and recv on same socket from different threads not working

I read that it should be safe to call send and recv on the same socket from different threads concurrently, but my program has some weird behaviour and I don't know what's wrong.
I have concurrent threads communicating with a client socket
one doing send to a socket
one doing select and then recv from the same socket
As I'm still sending, the client has already received the data and closed the socket.
At the same time, I'm doing a select and recv on that socket, which returns 0 (since it is closed) so I close this socket. However, the send has not returned yet...and since I call close on this socket the send call fails with EBADF.
I know the client has received the data correctly since I output it after I close the socket and it is right. However, on my end, my send call is still returning an error (EBADF), so I want to fix it so it doesn't fail.
This doesn't always happen. It happens maybe 40% of the time. I don't use sleep anywhere. Am I supposed to have pauses between sends or recvs or anything?
Here's some code:
Sending:
while(true)
{
    // keep sending until send returns 0
    n = send(_sfd, bytesPtr, sentSize, 0);
    if (n == 0)
    {
        break;
    }
    else if(n<0)
    {
        cerr << "ERROR: send returned an error "<<errno<< endl; // this case is triggered
        return n;
    }
    sentSize -= n;
    bytesPtr += n;
}
Receiving:
while(true)
{
    memset(bufferPointer,0,sizeLeft);
    n = recv(_sfd,bufferPointer,sizeLeft, 0);
    if (debug) cerr << "Receiving..."<<sizeLeft<<endl;
    if(n == 0)
    {
        cerr << "Connection closed"<<endl; // this case is triggered
        return n;
    }
    else if (n < 0)
    {
        cerr << "ERROR reading from socket"<<endl;
        return n;
    }
    bufferPointer += n;
    sizeLeft -= n;
    if(sizeLeft <= 0) break;
}
On the client, I use the same receive code, then I call close() on the socket.
Then on my side, I get 0 from the receive call and also call close() on the socket
Then my send fails. It still hasn't finished?! But my client already got the data!
I must admit I'm surprised you see this problem as often as you do, but it's always a possibility when you're dealing with threads. When you call send() you'll end up going into the kernel to append the data to the socket buffer in there, and it's therefore quite likely that there'll be a context switch, maybe to another process in the system. Meanwhile the kernel has probably buffered and transmitted the packet quite quickly. I'm guessing you're testing on a local network, so the other end receives the data and closes the connection and sends the appropriate FIN back to your end very quickly. This could all happen while the sending machine is still running other threads or processes because the latency on a local ethernet network is so low.
Now the FIN arrives - your receive thread hasn't done a lot lately since it's been waiting for input. Many scheduling systems will therefore raise its priority quite a bit and there's a good chance it'll be run next (you don't specify which OS you're using but this is likely to happen on at least Linux, for example). This thread closes the socket due to its zero read. At some point shortly after this the sending thread will be re-awoken, but presumably the kernel notices that the socket is closed before it returns from the blocked send() and returns EBADF.
Now this is just speculation as to the exact cause - among other things it heavily depends on your platform. But you can see how this could happen.
The easiest solution is probably to use poll() in the sending thread as well, but wait for the socket to become write-ready instead of read-ready. Obviously you also need to wait until there's any buffered data to send - how you do that depends on which thread buffers the data. The poll() call will let you detect when the connection has been closed by flagging it with POLLHUP, which you can detect before you try your send().
As a general rule you shouldn't close a socket until you're certain that the send buffer has been fully flushed - you can only be sure of this once the send() call has returned and indicates that all the remaining data has gone out. I've handled this in the past by checking the send buffer when I get a zero read and if it's not empty I set a "closing" flag. In your case the sending thread would then use this as a hint to do the close once everything is flushed. This matters because if the remote end does a half-close with shutdown() then you'll get a zero read even if it might still be reading. You might not care about half closes, however, in which case your strategy above is OK.
Finally, I personally would avoid the hassle of sending and receiving threads and just have a single thread which does both - that's more or less the point of select() and poll(), to allow a single thread of execution to deal with one or more filehandles without worrying about performing an operation which blocks and starves the other connections.
Found the problem. It's with my loop. Notice that it's an infinite loop. When I don't have any more left to send, my sentSize is 0, but I still loop and try to send more. By this time, the other thread has already closed the socket, so my send call for 0 bytes returns with an error.
I fixed it by changing the loop to stop looping when sentSize is 0 and it fixed the problem!