c++ linux accept() blocking after socket closed - c++

I have a thread that listens for new connections
new_fd = accept(Listen_fd, (struct sockaddr *) & their_addr, &sin_size);
and another thread that closes Listen_fd when when it's time to close the program. After Listen_fd is closed however, it still blocks. When I use GDB to try and debug accept() doesn't block. I thought that it could be a problem with SO_LINGER, but it shouldn't be on by default, and shouldn't change when using GDB. Any idea whats going on, or any other suggestion to closing the listing socket?

Use: sock.shutdown (socket.SHUT_RD)
Then accept will return EINVAL. No ugly cross thread signals required!
From the Python documentation:
"Note close() releases the resource associated with a connection but does not necessarily close the connection immediately. If you want to close the connection in a timely fashion, call shutdown() before close()."
http://docs.python.org/3/library/socket.html#socket.socket.close
I ran into this problem years ago, while programming in C. But I only found the solution today, after running into the same problem in Python, AND pondering using signals (yuck!), AND THEN remembering the note about shutdown!
As for the comments that say you should not close/use sockets across threads... in CPython the global interpreter lock should protect you (assuming you are using file objects rather than raw, integer file descriptors).
Here is example code:
import socket, threading, time
sock = socket.socket (socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt (socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind (('', 8000))
sock.listen (5)
def child ():
print ('child accept ...')
try: sock.accept ()
except OSError as exc : print ('child exception %s' % exc)
print ('child exit')
threading.Thread ( target = child ).start ()
time.sleep (1)
print ('main shutdown')
sock.shutdown (socket.SHUT_RD)
time.sleep (1)
print ('main close')
sock.close ()
time.sleep (1)
print ('main exit')

The behavior of accept when called on something which is not a valid socket FD is undefined. "Not a valid socket FD" includes numbers which were once valid sockets but have since been closed. You might say "but Borealid, it's supposed to return EINVAL!", but that's not guaranteed - for instance, the same FD number might be reassigned to a different socket between your close and accept calls.
So, even if you were to isolate and correct whatever makes your program fail, you could still begin to fail again in the future. Don't do it - correct the error that causes you to attempt to accept a connection on a closed socket.
If you meant that a call which was previously made to accept continues blocking after close, then what you should do is send a signal to the thread which is blocked in accept. This will give it EINTR and it can cleanly disengage - and then close the socket. Don't close it from a thread other than the one using it.

The shutdown() function may be what you are looking for. Calling shutdown(Listen_fd, SHUT_RDWR) will cause any blocked call to accept() to return EINVAL. Coupling a call to shutdown() with the use of an atomic flag can help to determine the reason for the EINVAL.
For example, if you have this flag:
std::atomic<bool> safe_shutdown(false);
Then you can instruct the other thread to stop listening via:
shutdown_handler([&]() {
safe_shutdown = true;
shutdown(Listen_fd, SHUT_RDWR);
});
For completeness, here's how your thread could call accept:
while (true) {
sockaddr_in clientAddr = {0};
socklen_t clientAddrSize = sizeof(clientAddr);
int connSd = accept(Listen_fd, (sockaddr *)&clientAddr, &clientAddrSize);
if (connSd < 0) {
// If shutdown_handler() was called, then exit gracefully
if (errno == EINVAL && safe_shutdown)
break;
// Otherwise, it's an unrecoverable error
std::terminate();
}
char clientname[1024];
std::cout << "Connected to "
<< inet_ntop(AF_INET, &clientAddr.sin_addr, clientname,
sizeof(clientname))
<< std::endl;
service_connection(connSd);
}

It's a workaround, but you could select on Listen_fd with a timeout, and if a timeout occured check that you're about to close the program. If so, exit the loop, if not, go back to step 1 and do the next select.

Are you checking the return value of close?
From linux manpages, (http://www.kernel.org/doc/man-pages/online/pages/man2/close.2.html)
"It is probably unwise to close file descriptors while they may be in use by system calls in other threads in the same process. Since a file descriptor may be reused, there are some obscure race conditions that may cause unintended side effects".
You can use a select instead of an accept and wait for some event from the other thead, then close the socket in the listener thread.

Related

UnrealEngine4: Recv function would keep blocking when TCP server shutdown

I use a blocking FSocket in client-side that connected to tcp server, if there's no message from server, socket thread would block in function FScoket::Recv(), if TCP server shutdown, socket thread is still blocking in this function. but when use blocking socket of BSD Socket API, thread would pass from recv function and return errno when TCP server shutdown, so is it the defect of FSocket?
uint32 HRecvThread::Run()
{
uint8* recv_buf = new uint8[RECV_BUF_SIZE];
uint8* const recv_buf_head = recv_buf;
int readLenSeq = 0;
while (Started)
{
//if (TcpClient->Connected() && ClientSocket->GetConnectionState() != SCS_Connected)
//{
// // server disconnected
// TcpClient->SetConnected(false);
// break;
//}
int32 bytesRead = 0;
//because use blocking socket, so thread would block in Recv function if have no message
ClientSocket->Recv(recv_buf, readLenSeq, bytesRead);
.....
//some logic of resolution for tcp msg bytes
.....
}
delete[] recv_buf;
return 0
}
As I expected, you are ignoring the return code, which presumably indicates success or failure, so you are looping indefinitely (not blocking) on an error or end of stream condition.
NB You should allocate the recv_buf on the stack, not dynamically. Don't use the heap when you don't have to.
There is a similar question on the forums in the UE4 C++ Programming section. Here is the discussion:
https://forums.unrealengine.com/showthread.php?111552-Recv-function-would-keep-blocking-when-TCP-server-shutdown
Long story short, in the UE4 Source, they ignore EWOULDBLOCK as an error. The code comments state that they do not view it as an error.
Also, there are several helper functions you should be using when opening the port and when polling the port (I assume you are polling since you are using blocking calls)
FSocket::Connect returns a bool, so make sure to check that return
value.
FSocket::GetLastError returns the UE4 Translated error code if an
error occured with the socket.
FSocket::HasPendingData will return a value that informs you if it
is safe to read from the socket.
FSocket::HasPendingConnection can check to see your connection state.
FSocket::GetConnectionState will tell you your active connection state.
Using these helper functions for error checking before making a call to FSocket::Recv will help you make sure you are in a good state before trying to read data. Also, it was noted in the forum posts that using the non-blocking code worked as expected. So, if you do not have a specific reason to use blocking code, just use the non-blocking implementation.
Also, as a final hint, using FSocket::Wait will block until your socket is in a desirable state of your choosing with a timeout, i.e. is readable or has data.

Using pselect for synchronous wait

In a server code I want to use pselect to wait for clients to connect as well monitor the standard output of the prozesses that I create and send it to the client (like a simplified remote shell).
I tried to find examples on how to use pselect but I haven't found any. The socket where the client can connect is already set up and works, as I verified that with accept(). SIGTERM is blocked.
Here is the code where I try to use pselect:
waitClient()
{
fd_set readers;
fd_set writers;
fd_set exceptions;
struct timespec ts;
// Loop until we get a sigterm to shutdown
while(getSigTERM() == false)
{
FD_ZERO(&readers);
FD_ZERO(&writers);
FD_ZERO(&exceptions);
FD_SET(fileno(stdin), &readers);
FD_SET(fileno(stdout), &writers);
FD_SET(fileno(stderr), &writers);
FD_SET(getServerSocket()->getSocketId(), &readers);
//FD_SET(getServerSocket()->getSocketId(), &writers);
memset(&ts, 0, sizeof(struct timespec));
pret = pselect(FD_SETSIZE, &readers, &writers, &exceptions, &ts, &mSignalMask);
// Here pselect always returns with 2. What does this mean?
cout << "pselect returned..." << pret << endl;
cout.flush();
}
}
So what I want to know is how to wait with pselect until an event is received, because currently pselect always returns immediately with a value 2. I tried to set the timeout to NULL but that doesn't change anything.
The returnvalue of pselect (if positive) is the filedescriptor that caused the event?
I'm using fork() to create new prozesses (not implemented yet) I know that I have to wait() on them. Can I wait on them as well? I suppose I need to chatch the signal SIGCHILD, so how would I use that? wait() on the child would also block, or can I just do a peek and then continue with pselect, otherwise I have to concurrent blocking waits.
It returns immediately because the file descriptors in the writers set are ready. The standard output streams will almost always be ready for writing.
And if you check a select manual page you will see that the return value is either -1 on error, 0 on timeout, and a positive number telling you the number of file descriptors that are ready.

send and recv on same socket from different threads not working

I read that it should be safe from different threads concurrently, but my program has some weird behaviour and I don't know what's wrong.
I have concurrent threads communicating with a client socket
one doing send to a socket
one doing select and then recv from the same socket
As I'm still sending, the client has already received the data and closed the socket.
At the same time, I'm doing a select and recv on that socket, which returns 0 (since it is closed) so I close this socket. However, the send has not returned yet...and since I call close on this socket the send call fails with EBADF.
I know the client has received the data correctly since I output it after I close the socket and it is right. However, on my end, my send call is still returning an error (EBADF), so I want to fix it so it doesn't fail.
This doesn't always happen. It happens maybe 40% of the time. I don't use sleep anywhere. Am I supposed to have pauses between sends or recvs or anything?
Here's some code:
Sending:
while(true)
{
// keep sending until send returns 0
n = send(_sfd, bytesPtr, sentSize, 0);
if (n == 0)
{
break;
}
else if(n<0)
{
cerr << "ERROR: send returned an error "<<errno<< endl; // this case is triggered
return n;
}
sentSize -= n;
bytesPtr += n;
}
Receiving:
while(true)
{
memset(bufferPointer,0,sizeLeft);
n = recv(_sfd,bufferPointer,sizeLeft, 0);
if (debug) cerr << "Receiving..."<<sizeLeft<<endl;
if(n == 0)
{
cerr << "Connection closed"<<endl; // this case is triggered
return n;
}
else if (n < 0)
{
cerr << "ERROR reading from socket"<<endl;
return n;
}
bufferPointer += n;
sizeLeft -= n;
if(sizeLeft <= 0) break;
}
On the client, I use the same receive code, then I call close() on the socket.
Then on my side, I get 0 from the receive call and also call close() on the socket
Then my send fails. It still hasn't finished?! But my client already got the data!
I must admit I'm surprised you see this problem as often as you do, but it's always a possibility when you're dealing with threads. When you call send() you'll end up going into the kernel to append the data to the socket buffer in there, and it's therefore quite likely that there'll be a context switch, maybe to another process in the system. Meanwhile the kernel has probably buffered and transmitted the packet quite quickly. I'm guessing you're testing on a local network, so the other end receives the data and closes the connection and sends the appropriate FIN back to your end very quickly. This could all happen while the sending machine is still running other threads or processes because the latency on a local ethernet network is so low.
Now the FIN arrives - your receive thread hasn't done a lot lately since it's been waiting for input. Many scheduling systems will therefore raise its priority quite a bit and there's a good chance it'll be run next (you don't specify which OS you're using but this is likely to happen on at least Linux, for example). This thread closes the socket due to its zero read. At some point shortly after this the sending thread will be re-awoken, but presumably the kernel notices that the socket is closed before it returns from the blocked send() and returns EBADF.
Now this is just speculation as to the exact cause - among other things it heavily depends on your platform. But you can see how this could happen.
The easiest solution is probably to use poll() in the sending thread as well, but wait for the socket to become write-ready instead of read-ready. Obviously you also need to wait until there's any buffered data to send - how you do that depends on which thread buffers the data. The poll() call will let you detect when the connection has been closed by flagging it with POLLHUP, which you can detect before you try your send().
As a general rule you shouldn't close a socket until you're certain that the send buffer has been fully flushed - you can only be sure of this once the send() call has returned and indicates that all the remaining data has gone out. I've handled this in the past by checking the send buffer when I get a zero read and if it's not empty I set a "closing" flag. In your case the sending thread would then use this as a hint to do the close once everything is flushed. This matters because if the remote end does a half-close with shutdown() then you'll get a zero read even if it might still be reading. You might not care about half closes, however, in which case your strategy above is OK.
Finally, I personally would avoid the hassle of sending and receiving threads and just have a single thread which does both - that's more or less the point of select() and poll(), to allow a single thread of execution to deal with one or more filehandles without worrying about performing an operation which blocks and starves the other connections.
Found the problem. It's with my loop. Notice that it's an infinite loop. When I don't have anymore left to send, my sentSize is 0, but I'll still loop to try to send more. At this time, the other thread has already closed this thread and so my send call for 0 bytes returns with an error.
I fixed it by changing the loop to stop looping when sentSize is 0 and it fixed the problem!

How to Break C++ Accept Function?

When doing socket programming, with multi-threading,
if a thread is blocked on Accept Function,
and main thread is trying to shut down the process,
how to break the accept function in order to pthread_join safely?
I have vague memory of how to do this by connection itself to its own port in order to break the accept function.
Any solution will be thankful.
Cheers
Some choices:
a) Use non-blocking
b) Use AcceptEx() to wait on an extra signal, (Windows)
c) Close the listening socket from another thread to make Accept() return with an error/exception.
d) Open a temporary local connection from another thread to make Accept() return with the temp connection
The typical approach to this is not to use accept() unless there is something to accept! The way to do this is to poll() the corresponding socket with a suitable time-out in a loop. The loop checks if it is meant to exit because a suitably synchronized flag was set.
An alternative is to send the blocked thread a signal, e.g., using pthread_kill(). This gets out of the blocked accept() with a suitable error indication. Again, the next step is to check some flag to see if the thread is meant to exit. My preference is the first approach, though.
Depending on your system, if it is available, I would use a select function to wait for the server socket to have a read, indicating a socket is trying to connect. The amount of time to time to wait for a connection can be set/adjusted to to what every time you want to wait for a client to connect(infinity, to seconds, to 0 which will just check and return). The return status needs to be checked to see if the time limit was reached (no socket is trying to connect), or if there is something waiting to be serviced (your server socket indicating there is a client which would like to connect). You can then execute the accept knowing there is a socket to connect based on the returned status.
If available I would use a select function with a timeout in a loop to achieve this functionality.
as Glenn suggested
The select function with a timeout value will wait for a socket to connect for a set period of time. If a socket attempts to connect it can be accepted during that period. By looping this select with a timeout it is possible to check for new connections until the break condition is met.
Here is an example:
std::atomic<bool> stopThread;
void theThread ( std::atomic<bool> & quit )
{
struct timeval tv;
int activity;
...
while(!quit)
{
// reset the time value for select timeout
tv.tv_sec = 0;
tv.tv_usec = 1000000;
...
//wait for an activity on one of the sockets
activity = select( max_sd + 1 , &readfds , NULL , NULL , &tv);
if ((activity < 0) && (errno!=EINTR))
{
printf("select error");
}
if (FD_ISSET(master_socket, &readfds))
{
if ((new_socket = accept(master_socket, (struct sockaddr *)&address, (socklen_t*)&addrlen))<0)
{
perror("accept");
exit(EXIT_FAILURE);
}
...
}
}
int main(int argc, char** argv)
{
...
stopThread = false;
std::thread foo(theThread, std::ref(stopThread));
...
stopThread = true;
foo.join();
return 0;
}
A more complete example of 'Select' http://www.binarytides.com
I am pretty new to C++ so I am sure my code and answer can be improved.
Sounds like what you are looking for is this: You set a special flag variable known to the listening/accepting socket, and then let the main thread open a connection to the listening/accepting socket. The listening/accepting socket/thread has to check the flag every time it accepts a connection in order to know when to shut down.
Typically if you want to do multi-threaded networking, you would spawn a thread once a connection is made (or ready to be made). If you want to lower the overhead, a thread pool isn't too hard to implement.

Socket select() Handling Abrupt Disconnections

I am currently trying to fix a bug in a proxy server I have written relating to the socket select() call. I am using the Poco C++ libraries (using SocketReactor) and the issue is actually in the Poco code which may be a bug but I have yet to receive any confirmation of this from them.
What is happening is whenever a connection abruptly terminates the socket select() call is returning immediately which is what I believe it is meant to do? Anyway, it returns all of the disconnected sockets within the readable set of file descriptors but the problem is that an exception "Socket is not connected" is thrown when Poco tries to fire the onReadable event handler which is where I would be putting the code to deal with this. Given that the exception is silently caught and the onReadable event is never fired, the select() call keeps returning immediately resulting in an infinite loop in the SocketReactor.
I was considering modifying the Poco code so that rather than catching the exception silently it fires a new event called onDisconnected or something like that so that a cleanup can be performed.
My question is, are there any elegant ways of determining whether a socket has closed abnormally using select() calls? I was thinking of using the exception message to determine when this has occured but this seems dirty to me.
I had this same problem. The only way to get around it is to control the client applications exit code. The solution that I used was to send a shutdown signal before the reactor was terminated on the client side. Then on the server you simply close the socket.
//Client:
//Handler Class: onWrite
Packet p = Packet::Shutdown();
if (p.fn == "shutdown")
{
_reactor.stop();
delete this;
}
//Server
//Accepter Class: onRead
if (p.fn == "shutdown")
{
printf("%s has disconnected", _username.c_str());
_socket.close();
delete this;
}
It appears you are correct Remy. I managed to distinguish whether the socket had disconnected using the following code (this was added to Poco/Net/src/SocketImpl.cpp):
bool SocketImpl::isConnected()
{
int bytestoread;
int rc;
fd_set fdRead;
FD_ZERO(&fdRead);
FD_SET(_sockfd, &fdRead);
struct timeval tv;
tv.tv_sec = 0;
tv.tv_usec = 250000;
rc = ::select(int(_sockfd) + 1, &fdRead, (fd_set*) 0, (fd_set*) 0, &tv);
ioctl(FIONREAD, &bytestoread);
return !((bytestoread == 0) && (rc == 1));
}
From my understanding, this checks if the socket is readable using a call to select() and then checks the actual number of bytes which are available on that socket. If the socket reports that it is readable but the bytes are 0 then the socket is not actually connected.
While this answers my question here, this unfortunately has not solved my Poco problem as I can't figure out a way to fix this in the Poco SocketReactor code. I tried making a new event called DisconnectNotification but unfortunately cannot call that as the same error gets thrown as does for a ReadNotification on a closed socket.
Just catch the ConnectionResetException in OnReadable() (processes the ReadableNotification)
Then it handles "Connection reset by peer" properly.
catch(Poco::Net::ConnectionResetException &ex)
{
_socket.shutdownSend();
delete this;
}