recvmsg in blocking mode still works after fd is invalid [duplicate] - c++

Let's say I start a thread to receive on a port. The socket call will block on recvfrom.
Then, somehow in another thread, I close the socket.
On Windows, this will unblock recvfrom and my thread execution will terminate.
On Linux, this does not unblock recvfrom, and as a result, my thread is sitting doing nothing forever, and the thread execution does not terminate.
Can anyone help me with what's happening on Linux? When the socket is closed, I want recvfrom to unblock.
I keep reading about using select(), but I don't know how to use it for my specific case.

Call shutdown(sock, SHUT_RDWR) on the socket, then wait for the thread to exit. (i.e. pthread_join).
You would think that close() would unblock the recvfrom(), but it doesn't on Linux.

Here's a sketch of a simple way to use select() to deal with this problem:
// Note: untested code, may contain typos or bugs
static volatile bool _threadGoAway = false;

void *MyThread(void *)
{
    int fd = (your socket fd);
    while(1)
    {
        struct timeval timeout = {1, 0};  // make select() return once per second
        fd_set readSet;
        FD_ZERO(&readSet);
        FD_SET(fd, &readSet);
        if (select(fd+1, &readSet, NULL, NULL, &timeout) >= 0)
        {
            if (_threadGoAway)
            {
                printf("MyThread: main thread wants me to scram, bye bye!\n");
                return NULL;
            }
            else if (FD_ISSET(fd, &readSet))
            {
                char buf[1024];
                int numBytes = recvfrom(fd, buf, sizeof(buf), 0, NULL, NULL);
                [...handle the received bytes here...]
            }
        }
        else perror("select");
    }
}

// To be called by the main thread at shutdown time
void MakeTheReadThreadGoAway()
{
    _threadGoAway = true;
    (void) pthread_join(_thread, NULL);  // may block for up to one second
}
A more elegant method would be to avoid using the timeout feature of select, and instead create a socket pair (using socketpair()) and have the main thread send a byte on its end of the socket pair when it wants the I/O thread to go away, and have the I/O thread exit when it receives a byte on its socket at the other end of the socketpair. I'll leave that as an exercise for the reader though. :)
It's also often a good idea to set the socket to non-blocking mode, to avoid the (small but non-zero) chance that the recvfrom() call might block even after select() indicated the socket is ready-to-read, as described here. But blocking mode might be "good enough" for your purpose.

Not an answer, but the Linux close man page contains the interesting quote:
It is probably unwise to close file descriptors while they may be in
use by system calls in other threads in the same process. Since a file
descriptor may be reused, there are some obscure race conditions that
may cause unintended side effects.

You are asking for the impossible. There is simply no way for the thread that calls close to know that the other thread is blocked in recvfrom. Try to write code that guarantees this happens, and you will find that it is impossible.
No matter what you do, it will always be possible for the call to close to race with the call to recvfrom. The call to close changes what the socket descriptor refers to, so it can change the semantic meaning of the call to recvfrom.
There is no way for the thread that enters recvfrom to somehow signal to the thread that calls close that it is blocked (as opposed to being about to block or just entering the system call). So there is literally no possible way to ensure the behavior of close and recvfrom are predictable.
Consider the following:
A thread is about to call recvfrom, but it gets pre-empted by other things the system needs to do.
Later, the thread calls close.
A thread started by the system's I/O library calls socket and gets the same descriptor as the one you closed.
Finally, the thread calls recvfrom, and now it's receiving from the socket the library opened.
Oops.
Don't ever do anything even remotely like this. A resource must not be released while another thread is, or might be, using it. Period.

Related

Using thread notification with Berkeley sockets

I am utilizing the Berkeley sockets select function in the following way.
/*Windows and linux typedefs/aliases/includes are made here with wsa
junk already taken care of.*/

/**Check if a socket can receive data without waiting.
\param socket The os level socket to check.
\param to The timeout value. A nullptr value will block forever, and zero
for each member of the value will cause it to return immediately.
\return True if recv can be called on the socket without blocking.*/
bool CanReceive(OSSocket& socket, const timeval* to)
{
    fd_set set;
    FD_ZERO(&set);
    FD_SET(socket, &set);
    timeval copy;
    timeval* toCopy = nullptr;
    if (to)
    {
        copy = *to;     // select() may modify the timeval, so pass a copy
        toCopy = &copy;
    }
    // Note: on POSIX the first argument must be the highest fd plus one
    // (it is ignored on Windows).
    int error = select((int)socket + 1, &set, 0, 0, toCopy);
    if (error == -1)
        throw Err(); //will auto set from errno.
    return error != 0;
}
I have written a class that will watch a container of sockets (wrapped up in another class) and add an ID to a separate container that stores info on which sockets are ready to be accessed. The map is an unordered_map.
while (m_running)
{
    for (auto& e : m_idMap)
    {
        auto id = e.first;
        auto socket = e.second;
        timeval timeout = ZeroTime; /*0sec, 0micro*/
        if (CanReceive(socket, &timeout) &&
            std::count(m_readyList.begin(), m_readyList.end(), id) == 0)
        {
            /*only add sockets that are not on the list already.*/
            m_readyList.push_back(id);
        }
    }
}
As I'm sure many have noticed, this code runs insanely fast and gobbles up CPU like there is no tomorrow (40% CPU usage with only one socket in the map). My first solution was to have a smart waiting function that keeps the iterations per second to a set value. That seemed to be fine with some people. My question is this: How can I be notified when sockets are ready without using this method? Even if it might require a bunch of macro junk to keep it portable, that's fine. I can only think there might be some way to have the operating system watch it for me and deliver some sort of notification or event when the socket is ready. Just to be clear, I have chosen not to use .NET.
The loop runs in its own thread, sends notifications to other parts of the software when sockets are ready. The entire thing is multi threaded and every part of it (except this part) uses an event based notification system that eliminates the busy waiting problem. I understand that things become OS-dependent and limited in this area.
Edit: The sockets run in BLOCKING mode (but select is given a zero timeout, and therefore will not block), and they are operated on in a dedicated thread.
Edit: The system performs great with the smart sleeping functions on it, but not as good as it could with some notification system in place (likely from the OS).
First, you must set the socket non-blocking if you don't want the sockets to block. The select function does not provide a guarantee that a subsequent operation will not block. It's just a status reporting function that tells you about the past and the present.
Second, the best way to do this varies from platform to platform. If you don't want to write lots of platform-specific code, you really should use a library like Boost.Asio or libevent.
Third, you can call select on all the sockets at the same time with a timeout. The function will return immediately if any of the sockets are (or were) readable and, if not, will wait up to the timeout. When select returns, it will report whether it timed out or, if not, which sockets were readable.
This will still perform very poorly because of the large number of wait lists the process has to be put on just to be immediately removed from all of them as soon as a single socket is readable. But it's the best you can do with reasonable portability.
How can I be notified when sockets are ready without using this
method?
That's what select() is for. The idea is that your call to select() should block until at least one of the sockets you passed in to it (via FD_SET()) is ready-for-read. After select() returns, you can find out which socket(s) are now ready-for-read (by calling FD_ISSET()) and call recv() on those sockets to get some data from them and handle it. After that you loop again, go back to sleep inside select() again, and repeat ad infinitum. In this way you handle all of your tasks as quickly as possible, while using the minimum amount of CPU cycles.
The entire thing is multi threaded and every part of it (except this
part) uses an event based notification system that eliminates the busy
waiting problem.
Note that if your thread is blocked inside of select() and you want it to wake up and do something right away (i.e. without relying on a timeout, which would be slow and inefficient), then you'll need some way to cause select() in that thread to return immediately. In my experience the most reliable way to do that is to create a pipe() or socketpair() and have the thread include one end of the file-descriptor-pair in its ready-for-read fd_set. Then when another thread wants to wake that thread up, it can do so simply by sending a byte on the other end of the pair. That will cause select() to return, the thread can then read the single byte (and throw it away), and then do whatever it is supposed to do after waking up.

How to avoid tcdrain() from blocking forever

I have this function _write_port() being called from a thread, whenever I need to send a message. To ensure the whole message is written, tcdrain() is used.
void Serial_Port::_write_port(char *buf, unsigned &len)
{
    // Lock
    pthread_mutex_lock(&lock);

    // Write packet via serial link
    write(fd, buf, len);

    // Wait until all data has been written
    tcdrain(fd);

    // Unlock
    pthread_mutex_unlock(&lock);
}
My problem is that tcdrain() blocks forever after a random number of executions of this function _write_port(). This will block the lock, resulting in blocking my other read thread, resulting in blocking everything.
What is a good approach to avoid tcdrain from blocking forever?
Note: I strangely noticed that if I use several printf() calls throughout the function, tcdrain never blocks. It does not make sense to me that there would be a relationship between printf() and write(), because they write to different output files. Since I cannot explain this behaviour, I assume it may be a coincidence that it worked like this in my experiments. If someone can explain this behaviour, please let me know.

what's the risk of closing socket which is used in another thread by accept?

I have a server application.
The server is accepting connections from clients in a thread with:
while ((client_sock = accept(socket_desc, (struct sockaddr *)&client, (socklen_t *)&c)))
{
    .....
}
I have another thread which is executed when my application exits. In this thread I close the socket socket_desc:
close(socket_desc);
What is the risk of closing a socket in one thread while another thread is blocked in accept on the same socket?
I could be wrong, but it might be for a scenario like this. Consider three threads: A, B, C.
Thread A waits on the socket in accept and goes to sleep.
Thread B closes the socket.
Thread C creates a new socket, which happens to get the same file descriptor number as the recently closed one (a socket is also a file).
Thread A wakes up (with an error code from accept) and then calls close, thinking an unexpected error occurred => this closes the completely different, valid socket created by thread C!
To the best of my knowledge, the semantics of the combination of actions you describe are not defined, and that should be enough of a concern to find an alternative. I speculate that behaviors reasonably likely to be observed include
the close() returning quickly and the accept() call failing quickly, perhaps indicating an EBADF, EINVAL, or ENOTSOCK error; or
the accept() call continuing to block until a connection request arrives; or
the accept() call blocking indefinitely; or
the close() blocking until accept() returns, if it ever does; or
close() and accept() deadlocking.
If indeed the semantics are not defined, however, then pretty much anything could happen.
If a different thread must close the socket than the one accept()ing connections on it, then you would be wise to set some sort of flag to indicate that the program is exiting, then send a signal (e.g. via pthread_kill()) to the accept()ing thread to break it out of accept(). The thread(s) receiving such a signal would know from the program-exiting flag to stop instead of trying to accept() again.
If your threads are cancelable, then the global flag could take the form of a thread-cancellation message. The accept() function is a cancellation point, so your threads will receive the cancellation message no later than the next time they call accept().
You can close the socket to abort the accept function. But you must change the code and check the return value of API functions carefully. Your while loop as written will not work, since it doesn't check accept's return value for an error (-1).

Do We Need Add Lock To the Code of Recv/Send/Close For a Same Socket Among Threads

From posts like this, I know on linux the recv/send functions are thread safe and user is allowed to operate on the same socket from different threads simultaneously.
Though this is not a good design, in the following situation I wonder what we should do in user-level code in order to keep data consistency and a healthy running state: there are three threads operating on the same socket, the first one creating and closing the socket, the second reading from it, and the last one sending on it. See the pseudo code:
struct SocketInfo
{
    int SockFd;
    int SockType;
    queue<Packet*> RecvPacks;
};

map<int, SocketInfo*> gSocketInfoMap;
pthread_mutex_t gSocketsLock;

//Thread-1
pthread_mutex_lock(&gSocketsLock);
// get information for sock
SocketInfo* info = gSocketInfoMap[sock];
pthread_mutex_unlock(&gSocketsLock);
close(sock);                              // line-1
.....

//Thread-2
pthread_mutex_lock(&gSocketsLock);
SocketInfo* info = gSocketInfoMap[sock];
pthread_mutex_unlock(&gSocketsLock);
recv(sock, buffer, sizeof(buffer), 0);    // line-2
.....

//Thread-3
pthread_mutex_lock(&gSocketsLock);
SocketInfo* info = gSocketInfoMap[sock];
pthread_mutex_unlock(&gSocketsLock);
send(sock, buffer, sizeof(buffer), 0);    // line-3
.....
I wonder if I need to move the Line-1, Line-2 and Line-3 into the protection scope of gSocketsLock? Why?
As the linked question states, socket operations are thread-safe. In any event, receiving and sending data are independent operations which do not interfere with each other.
Obviously, it is not a great idea to close a socket which is actively being read from and written to, but putting the close() inside a critical section does not do anything to prevent that bug. Whatever mechanism ensures that active sockets are not closed or that closed sockets are not accessed is at a higher level than the critical sections shown in the OP.
If one thread closes a socket that another thread is trying to use for I/O, the worst that can happen is that the recv/send call will return an error.
In short: no, it would not be a good idea to put the socket operations inside the critical section. It has no benefit, and it unnecessarily increases the likelihood of lock contention.

Proper way to close a blocking UDP socket

I have a C++ object that creates a thread to read from a blocking UDP socket:
mRunning.store(true);
while (mRunning.load(boost::memory_order_consume)) {
    ...
    int size = recvfrom(mSocket, buf, kTextBufSize, 0,
                        (struct sockaddr *)&packet->mReplyAddr.mSockAddr,
                        (socklen_t *)&packet->mReplyAddr.mSockAddrLen);
    if (size > 0) {
        //do stuff
    }
}
return 0;
(mRunning is a boost::atomic)
The object's destructor is called from another thread and does this:
mRunning.store(false);
#ifdef WIN32
if (mSocket != -1) closesocket(mSocket);
#else
if (mSocket != -1) close(mSocket);
#endif
pthread_join(mThread, NULL);
This seems to work, but one of my colleagues suggested that there might be a problem if recv is interrupted in the middle of reading something. Is this thread safe? What's the correct way of closing a blocking UDP socket? (Needs to be cross-platform OSX/Linux/Windows)
There could be a lot of different problems. Moving my application from one FreeBSD version to another, I found that such a close() worked normally on the older kernel, but on the newer one close() just hung until something was returned from recv(). And OSX is FreeBSD-based :)
Portable way of closing sockets from different thread is to create pipe and block not in recv(), but in select(). When you need to close socket, write something to pipe, select() will unblock and you can safely do close().
Well, recvfrom in itself is thread-safe. IIRC all socket functions are. The question is:
What's going to happen if you pull the descriptor out from under recvfrom while it's copying data to your buffer?
It's a good question but I doubt the standard says anything about this (I also doubt the specific manual of an implementation says anything about it). So any implementation is free to:
Complete the operation (perhaps because it doesn't need the descriptor anymore, or because it's doing some cool reference counting or something else)
Make the recvfrom fail and return a -1 (ENOTSOCK, EINVAL ?)
Crash spectacularly because buffers and internal data structures are freed by the close.
Obviously this is just speculation (I have been wrong before, many times), but unless you find something in the standard to support the idea that you can close the socket while receiving through it, you're not safe.
So, what can you do? The safest: use a synchronization mechanism to ensure that you only close the socket after the recvfrom is done (semaphores, mutexes, whatever).
Personally I would do an UP on a semaphore after the recvfrom and a DOWN before the close.
Your colleague is right: Boost sockets are not thread-safe.
Your options:
Use ASIO (do this one)
Time out the blocking call. This isn't really portable, though it might work.