epoll does not signal an event when a socket is closed - C++

I have a listener socket. I add every new connection I accept to epoll like this:
int connfd = accept(listenfd, (struct sockaddr *)&clnt_addr, &clnt_addr_len);
ev.events = EPOLLIN | EPOLLET | EPOLLONESHOT | EPOLLHUP;
ev.data.fd = connfd;
epoll_ctl(epollfd, EPOLL_CTL_ADD, connfd, &ev);
When new data is received, epoll signals an EPOLLIN event, as expected.
In that case I read all the data as follows:
long read = 0;
do {
read = recv(events[n].data.fd, buffer, sizeof (buffer), 0);
} while (read > 0);
When the client disconnects, whether abruptly or normally, epoll does not signal an event.
This code runs in each thread, which is why I'm using EPOLLET.
So my questions:
What do I need to do to get this event?
What do I need to do to close the socket so that there is no leakage of resources?

There are a few problems with your attempt.
You should not use EPOLLONESHOT unless you know what you are doing and you really need it. After one event has been delivered, the descriptor stays disabled: no further events are reported for it until you re-arm it with EPOLL_CTL_MOD.
You should not use EPOLLHUP to determine if a connection was closed. EPOLLHUP is always reported whether or not you request it, and it may be raised before all data has been read from the socket input stream, even if the client disconnects gracefully. I recommend that you use only EPOLLIN. Once no input is left after a graceful disconnect, recv() returns 0 (EOF); after a forceful disconnect it may instead return -1 with an error such as ECONNRESET. In either case you can close the socket.
Your recv() loop consumes the whole stream until EOF (connection closed) and, on a blocking socket, blocks the reading thread while doing so. The whole point of using epoll is to avoid a blocking while ( recv(...) > 0 ) loop.
Use EPOLLET because you need it, not because "the code runs multithreaded". You can write multithreaded code without the edge-triggered mode. The use of EPOLLET requires a thorough knowledge of the differences between blocking and non-blocking I/O. You can easily run into pitfalls (as mentioned in the manual), like edge-triggered starvation.
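A minimal sketch of the level-triggered approach recommended above, assuming Linux. The helper name read_until_peer_closes, the buffer size, and the 1-second timeout are illustrative choices, not part of the original code. It performs one recv() per reported event and closes the socket once recv() returns 0 or an error:

```cpp
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <string>

// Hypothetical helper: watches fd with level-triggered EPOLLIN, reads one
// chunk per event, and closes fd when the peer has gone away.
// Returns everything received before the close.
std::string read_until_peer_closes(int fd) {
    std::string received;
    int epfd = epoll_create1(0);
    if (epfd < 0) { close(fd); return received; }

    epoll_event ev{};
    ev.events = EPOLLIN;        // level-triggered: no EPOLLET, no EPOLLONESHOT
    ev.data.fd = fd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == 0) {
        for (;;) {
            epoll_event out{};
            // Assumed 1 s timeout so this sketch can never hang forever.
            int n = epoll_wait(epfd, &out, 1, 1000);
            if (n < 0 && errno == EINTR) continue;
            if (n <= 0) break;                      // timeout or error

            char buf[4096];
            ssize_t got = recv(out.data.fd, buf, sizeof buf, 0);
            if (got <= 0) break;                    // 0: EOF, -1: error (e.g. ECONNRESET)
            received.append(buf, static_cast<size_t>(got));
        }
    }
    close(fd);      // closing the descriptor also removes it from the epoll set
    close(epfd);
    return received;
}
```

Because the registration is level-triggered, any data left unread after one recv() simply causes another EPOLLIN event, so no inner drain loop is needed.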

Related

Can I write to a closed socket and forcefully correct the broken pipe error?

I have an application that runs on a large number of processors. On processor 0, I have a function that writes data to a socket if it is open. This function runs in a loop in a separate thread on processor 0, i.e. processor 0 is responsible for its own workload and has an extra thread running the communication on the socket.
//This function runs on a loop, called every 1.5 seconds
void T_main_loop(const int& client_socket_id, bool* exit_flag)
{
//Check that socket still connected.
int error_code;
socklen_t error_code_size = sizeof(error_code);
getsockopt(client_socket_id, SOL_SOCKET, SO_ERROR, &error_code, &error_code_size);
if (error_code == 0)
{
//send some data
int valsend = send(client_socket_id , data , size_of_data , 0);
}
else
{
*(exit_flag) = false; //This is used for some external logic.
//Can I fix the broken pipe here somehow?
}
}
When the client socket is closed, the program should just ignore the error, and this is standard behavior as far as I am aware.
However, I am using an external library (PETSc) that is somehow detecting the broken pipe error and closing the entire parallel (MPI) environment:
[0]PETSC ERROR: Caught signal number 13 Broken Pipe: Likely while reading or writing to a socket
I would like to leave the configuration of this library completely untouched if at all possible. Open to any robust workarounds that are possible.
By default, the OS sends the thread SIGPIPE if it tries to write into a (half) closed pipe or socket.
One option to disable the signal is to do signal(SIGPIPE, SIG_IGN);.
Another option is to use MSG_NOSIGNAL flag for send, e.g. send(..., MSG_NOSIGNAL);.
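As a sketch of the second option, assuming Linux (MSG_NOSIGNAL is not available on every platform): the wrapper name send_no_sigpipe is hypothetical. With the flag set, writing to a socket whose peer has closed fails with -1 and errno set to EPIPE instead of raising SIGPIPE, so no signal handler configuration of another library needs to be touched:

```cpp
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>
#include <cstddef>

// Hypothetical wrapper: send() that never raises SIGPIPE.
// Returns the number of bytes sent, or -1 with errno set
// (EPIPE when the peer has closed the connection).
ssize_t send_no_sigpipe(int fd, const void* data, size_t len) {
    return send(fd, data, len, MSG_NOSIGNAL);
}
```

In the loop from the question, this would replace the send(client_socket_id, data, size_of_data, 0) call; a return value of -1 with errno == EPIPE then indicates that the client has gone away.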

How do I make libpcap/pcap_loop non-blocking?

I'm currently using libpcap to sniff traffic in promiscuous mode:
int main()
{
// some stuff
printf("Opening device: %s\n", devname.c_str());
handle = pcap_open_live(devname.c_str(), 65536 , 1 , 0 , errbuf);
if (handle == NULL)
{
fprintf(stderr, "Couldn't open device %s : %s..." , devname.c_str(), errbuf);
return 1;
}
printf(" Done\n");
pcap_loop(handle , -1 , process_packet , NULL);
// here run a thread to do some stuff. however, pcap_loop is blocking
return 0;
}
I'd like to add an external thread to do some other stuff. How do I change the code above to make it non-blocking?
In non-blocking mode you have to use pcap_dispatch. Note that pcap_dispatch can work in either blocking or non-blocking mode, depending on how the capture handle is configured. To put the handle into non-blocking mode, use the function pcap_setnonblock:
int pcap_setnonblock(pcap_t *p, int nonblock, char *errbuf);
The difference between blocking and non-blocking mode is not whether a loop runs forever. In blocking mode, pcap_dispatch waits for a packet and returns only once it has been received. In non-blocking mode it returns immediately: if packets are available, the callback processes them; otherwise it returns 0.
In "non-blocking" mode, an attempt to read from the capture
descriptor with pcap_dispatch() will, if no packets are currently
available to be read, return 0 immediately rather than blocking
waiting for packets to arrive. pcap_loop() and pcap_next() will not
work in "non-blocking" mode.
http://www.tcpdump.org/manpages/pcap_setnonblock.3pcap.html
pcap_loop is meant to keep going until all input ends. If you don't want that behavior, call pcap_dispatch in a loop instead. With a count of -1, pcap_loop does not return until an error occurs or pcap_breakloop() is called; it is meant to keep waiting for more data.
I use pcap_next_ex. It returns a result indicating whether a packet was read. This way I manage the acquisition in my own thread. See an example here. The read_timeout in pcap_open also affects this function.

Recv() call hangs after remote host terminates

My problem is that I have a thread that is in a recv() call. The remote host suddenly terminates (without a close() socket call) and the recv() call continues to block. This is obviously not good because when I am joining the threads to close the process (locally) this thread will never exit because it is waiting on a recv that will never come.
So my question is what method do people generally consider to be the best way to deal with this issue? There are some additional things of note that should be known before answering:
There is no way for me to ensure that the remote host closes the socket prior to exit.
This solution cannot use external libraries (such as boost). It must use standard libraries/features of C++/C (preferably not C++0x specific).
I know this has likely been asked in the past, but I'd like to get someone's take on how to correct this issue properly (without doing something super hacky, which is what I would have done in the past).
Thanks!
Assuming you want to continue to use blocking sockets, you can use the SO_RCVTIMEO socket option:
SO_RCVTIMEO and SO_SNDTIMEO
Specify the receiving or sending timeouts until reporting an
error. The parameter is a struct timeval. If an input or
output function blocks for this period of time, and data has been
sent or received, the return value of that function will be the
amount of data transferred; if no data has been transferred and
the timeout has been reached then -1 is returned with errno set
to EAGAIN or EWOULDBLOCK just as if the socket was specified to
be nonblocking. If the timeout is set to zero (the default)
then the operation will never timeout.
So, before you begin receiving:
struct timeval timeout = { timo_sec, timo_usec };
int r = setsockopt(s, SOL_SOCKET, SO_RCVTIMEO, &timeout, sizeof(timeout));
assert(r == 0); /* or something more user friendly */
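A self-contained sketch of how the timeout shows up at the call site. The helper name recv_timed_out and the millisecond parameter are assumptions made for illustration. After the option is set, a blocking recv() with no incoming data returns -1 with errno set to EAGAIN or EWOULDBLOCK once the timeout expires:

```cpp
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>
#include <cassert>
#include <cerrno>

// Hypothetical helper: sets SO_RCVTIMEO on fd, performs one blocking recv(),
// and reports whether that recv() gave up because the timeout expired.
bool recv_timed_out(int fd, long timeout_ms) {
    timeval tv{};
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv) != 0)
        return false;                       // option could not be set

    char buf[64];
    ssize_t n = recv(fd, buf, sizeof buf, 0);
    // On timeout recv() returns -1 with EAGAIN/EWOULDBLOCK, as for a
    // non-blocking socket; 0 would mean EOF and > 0 means data arrived.
    return n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK);
}
```

In a receiving thread, the timeout case is the natural place to check a shutdown flag before calling recv() again.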
If you are willing to use non-blocking I/O, then you can use poll(), select(), epoll(), kqueue(), or whatever the appropriate event dispatching mechanism is for your system. The reason you need to use non-blocking I/O is that you need to allow the system call to recv() to return to notify you that there is no data in the socket's input queue. The example to use is a little bit more involved:
for (;;) {
ssize_t bytes = recv(s, buf, sizeof(buf), MSG_DONTWAIT);
if (bytes > 0) { /* ... */ continue; }
if (bytes < 0) {
if (errno == EWOULDBLOCK) {
struct pollfd p = { s, POLLIN, 0 };
int r = poll(&p, 1, timo_msec);
if (r == 1) continue;
if (r == 0) {
/*...handle timeout */
/* either continue or break, depending on policy */
}
}
/* ...handle errors */
break;
}
/* connection is closed */
break;
}
You can use TCP keep-alive probes to detect if the remote host is still reachable. When keep-alive is enabled, the OS will send probes if the connection has been idle for too long; if the remote host doesn't respond to the probes, then the connection is closed.
On Linux, you can enable keep-alive probes by setting the SO_KEEPALIVE socket option, and you can configure the parameters of the keep-alive with the TCP_KEEPCNT, TCP_KEEPIDLE, and TCP_KEEPINTVL socket options. See tcp(7) and socket(7) for more info on those.
Windows also uses the SO_KEEPALIVE socket option for enabling keep-alive probes, but for configuring the keep-alive parameters, use the SIO_KEEPALIVE_VALS ioctl.
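A Linux-only sketch of the options named above. The function name enable_keepalive and its parameters are illustrative, and suitable values depend on the application. Once the configured number of probes goes unanswered, a blocked recv() is expected to fail (typically with ETIMEDOUT) instead of waiting forever:

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>

// Hypothetical helper (Linux): enables TCP keep-alive on fd.
//   idle_s     - seconds of idleness before the first probe (TCP_KEEPIDLE)
//   interval_s - seconds between probes (TCP_KEEPINTVL)
//   probes     - unanswered probes before the connection is dropped (TCP_KEEPCNT)
// Returns true if every option was set successfully.
bool enable_keepalive(int fd, int idle_s, int interval_s, int probes) {
    int on = 1;
    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) == 0 &&
           setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof idle_s) == 0 &&
           setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_s, sizeof interval_s) == 0 &&
           setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probes, sizeof probes) == 0;
}
```

For example, enable_keepalive(fd, 30, 5, 3) would start probing after 30 idle seconds and give up roughly 15 seconds later if nothing answers.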
You could use select()
From http://linux.die.net/man/2/select
int select(int nfds, fd_set *readfds, fd_set *writefds,
fd_set *exceptfds, struct timeval *timeout);
select() blocks until the first event (read ready, write ready, or exception) on one or more file descriptors or a timeout occurs.
Socket options and select() are probably the ideal choices. An additional option that you should consider as a backup is to send your process a signal (for example using the alarm() call). Provided a handler for that signal is installed without SA_RESTART, this forces a blocked syscall to return -1 with errno set to EINTR.

Using pselect for synchronous wait

In my server code I want to use pselect to wait for clients to connect as well as to monitor the standard output of the processes that I create and send it to the client (like a simplified remote shell).
I tried to find examples on how to use pselect but I haven't found any. The socket where the client can connect is already set up and works, as I verified that with accept(). SIGTERM is blocked.
Here is the code where I try to use pselect:
waitClient()
{
fd_set readers;
fd_set writers;
fd_set exceptions;
struct timespec ts;
// Loop until we get a sigterm to shutdown
while(getSigTERM() == false)
{
FD_ZERO(&readers);
FD_ZERO(&writers);
FD_ZERO(&exceptions);
FD_SET(fileno(stdin), &readers);
FD_SET(fileno(stdout), &writers);
FD_SET(fileno(stderr), &writers);
FD_SET(getServerSocket()->getSocketId(), &readers);
//FD_SET(getServerSocket()->getSocketId(), &writers);
memset(&ts, 0, sizeof(struct timespec));
pret = pselect(FD_SETSIZE, &readers, &writers, &exceptions, &ts, &mSignalMask);
// Here pselect always returns with 2. What does this mean?
cout << "pselect returned..." << pret << endl;
cout.flush();
}
}
So what I want to know is how to wait with pselect until an event is received, because currently pselect always returns immediately with the value 2. I tried setting the timeout to NULL, but that doesn't change anything.
Is the return value of pselect (if positive) the file descriptor that caused the event?
I'm using fork() to create new processes (not implemented yet), and I know that I have to wait() on them. Can I wait on them as well? I suppose I need to catch the SIGCHLD signal, so how would I use that? wait() on the child would also block. Can I just do a peek and then continue with pselect? Otherwise I would have two concurrent blocking waits.
It returns immediately because the file descriptors in the writers set are ready. The standard output streams will almost always be ready for writing.
If you check the select manual page, you will see that the return value is -1 on error, 0 on timeout, or a positive number giving the number of file descriptors that are ready.
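A small sketch of both points, using a hypothetical helper named wait_readable: only the read set is passed, so an always-writable descriptor cannot wake the call, and the return value is interpreted as a count rather than a descriptor. The millisecond timeout is an assumption for the example. Note that a zeroed timespec, as in the question, makes pselect poll and return at once, while a null timeout pointer blocks indefinitely:

```cpp
#include <sys/select.h>
#include <unistd.h>
#include <cassert>
#include <ctime>

// Hypothetical helper: waits up to timeout_ms for fd to become readable.
// Returns pselect()'s own result: -1 on error, 0 on timeout, otherwise the
// number of ready descriptors (which can only be 1 here).
int wait_readable(int fd, long timeout_ms) {
    fd_set readers;
    FD_ZERO(&readers);
    FD_SET(fd, &readers);           // no writers set: stdout/stderr are not watched

    timespec ts{};
    ts.tv_sec = timeout_ms / 1000;
    ts.tv_nsec = (timeout_ms % 1000) * 1000000L;

    // First argument is the highest watched descriptor plus one.
    // The last argument (signal mask) is left null in this sketch.
    return pselect(fd + 1, &readers, nullptr, nullptr, &ts, nullptr);
}
```

To find out which descriptor is ready when several are watched, test each one with FD_ISSET on the set after the call returns.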

Problem: recvmsg(pfd[0], &message, MSG_WAITALL) always returns -1 instead of being blocked?

I'm making a server which spawns a child upon connection (using fork) and uses a pipe to send another socket to this child when another connection comes in. The idea is to let the child process manage two connections in a 2-player network game mode.
The IPC pipe between parent and child is pfd[2].
Basically, in the child process, I do recvmsg(pfd[0], &message, MSG_WAITALL) to wait for the 2nd socket to be passed from the parent.
However, recvmsg never blocks and always returns -1.
I've already set pfd[0] to blocking as follows:
// set to blocking pipe
int oldfl;
oldfl = fcntl(pfd[0], F_GETFL);
if (oldfl == -1) {
perror("fcntl F_GETFL");
exit(1);
}
fcntl(pfd[0], F_SETFL, oldfl & ~O_NONBLOCK);
How can I make the child block in recvmsg?
Thanks a million for any hint.
recvmsg() works only on sockets, not on pipes. When recvmsg() returns -1 you should check the errno value; for a pipe descriptor it is most likely ENOTSOCK.
You can use Unix domain sockets instead of a pipe to pass file descriptors between processes.
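A sketch of descriptor passing over a Unix domain socket with SCM_RIGHTS ancillary data, assuming Linux. The names send_fd and recv_fd are hypothetical helpers. In the setup from the question, the parent would create the pair with socketpair(AF_UNIX, SOCK_STREAM, 0, sv) before fork(), keep one end, and leave the other to the child:

```cpp
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>
#include <cassert>
#include <cstring>

// Hypothetical helper: sends descriptor fd_to_send over Unix socket sock.
bool send_fd(int sock, int fd_to_send) {
    char dummy = 'F';                   // one byte of ordinary data carries the message
    iovec iov{&dummy, 1};
    alignas(cmsghdr) char ctrl[CMSG_SPACE(sizeof(int))] = {};

    msghdr msg{};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl;
    msg.msg_controllen = sizeof ctrl;

    cmsghdr* c = CMSG_FIRSTHDR(&msg);
    c->cmsg_level = SOL_SOCKET;
    c->cmsg_type = SCM_RIGHTS;          // payload is an array of descriptors
    c->cmsg_len = CMSG_LEN(sizeof(int));
    std::memcpy(CMSG_DATA(c), &fd_to_send, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1;
}

// Hypothetical helper: blocks until a descriptor arrives on sock.
// Returns the new descriptor, or -1 on failure.
int recv_fd(int sock) {
    char dummy;
    iovec iov{&dummy, 1};
    alignas(cmsghdr) char ctrl[CMSG_SPACE(sizeof(int))] = {};

    msghdr msg{};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl;
    msg.msg_controllen = sizeof ctrl;

    if (recvmsg(sock, &msg, 0) <= 0) return -1;

    cmsghdr* c = CMSG_FIRSTHDR(&msg);
    if (c == nullptr || c->cmsg_level != SOL_SOCKET || c->cmsg_type != SCM_RIGHTS)
        return -1;

    int fd = -1;
    std::memcpy(&fd, CMSG_DATA(c), sizeof(int));
    return fd;
}
```

On a blocking socket, recv_fd() waits in recvmsg() until the parent calls send_fd(), which is the blocking behavior the question was after.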