C++ - AF_UNIX socket hangs

I'm currently trying to get a client/daemon communication via an AF_UNIX socket up and running.
At the moment the client successfully sends a message, the daemon receives and processes it and then should send the message back.
Well, that's where the problem is. As soon as the daemon tries to send the message back nothing happens, the client hangs, trying to read a message, and if I kill the client the daemon dies with it.
Following is the daemon code:
//successful call to accept, I have a file descriptor now...
int c = 0;
while((c = recv(fd, (char*)&buf[0], bufferSize, 0)))
{
if(c == -1 || c == 0)
break;
tmp.append(buf.begin(), buf.begin()+c);
}
writeLog(tmp);
tmp = evaluateMsg(tmp);
writeLog(tmp);
//I assume this send call is hanging
if(send(fd, tmp.c_str(), tmp.size(), 0) < 0)
writeLog("Could not write message back!");
close(fd);
And this is the client code:
//connect(); is successful
//send(); as well - the recv(); call is hanging forever
while((c = recv(sockfd, (char*)&buf[0], 1024, 0)))
{
if(c == -1)
{
cout<<"Error";
break;
}
else if(c == 0)
break;
tmp.append(buf.begin(), buf.begin()+c);
}
Please note that the code is heavily cut down for the sake of simplicity and readability (especially the code to daemonize and create the actual AF_UNIX socket (which are both successful)).
UPDATE:
I could verify that the client-side recv() call is never returning, which means that the daemon-side send() call is hanging. Why?

I don't see any reason why the daemon-side recv() loop would ever end. Why would recv() return 0 or -1 while the socket is still open?
The daemon has to recognize at the application level when the client has finished sending: the content itself should make that clear (for example through a length prefix or a terminator). Once the message is complete, leave the recv() loop and continue to the send() part of the server.
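One way to make "the client has finished sending" visible without changing the message format is for the client to half-close the connection with shutdown(fd, SHUT_WR) after its last send(). The daemon's recv() then returns 0 and it can go on to send the reply. The following is only a minimal sketch, assuming one request per connection; readUntilEof and sendRequest are hypothetical helper names, not part of the original code.

```cpp
#include <cassert>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: read until the peer closes its sending side
// (recv() returns 0) or an error occurs (recv() returns -1).
std::string readUntilEof(int fd)
{
    std::string out;
    char buf[1024];
    ssize_t n;
    while ((n = recv(fd, buf, sizeof buf, 0)) > 0)
        out.append(buf, static_cast<size_t>(n));
    return out;
}

// Hypothetical helper: send the whole request, then half-close so that
// the peer's recv() loop sees end-of-stream. The socket stays open for
// reading, so the reply can still be received afterwards.
bool sendRequest(int fd, const std::string& msg)
{
    assert(fd >= 0);
    size_t sent = 0;
    while (sent < msg.size()) {
        ssize_t n = send(fd, msg.data() + sent, msg.size() - sent, 0);
        if (n < 0)
            return false;
        sent += static_cast<size_t>(n);
    }
    return shutdown(fd, SHUT_WR) == 0;
}
```

With this, the client would call sendRequest() and then readUntilEof() for the reply, while the daemon calls readUntilEof() on the accepted descriptor, sends its answer, and closes it.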

Alright, the solution was pretty simple.
@selalerer was right about the return value of recv(), which leads to this working code snippet:
while((c = recv(fd, (char*)&buf[0], bufferSize, 0)))
{
if(c == -1)
{
/* handle error */
break;
}
tmp.append(buf.begin(), buf.begin()+c);
//heuristic: a short read is taken as the end of the message
if(c < bufferSize)
//no more to read, therefore stop reading
break;
}
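Stopping when recv() returns fewer bytes than the buffer holds is a heuristic: a stream socket may deliver a message in several short reads, or deliver exactly bufferSize bytes as the last chunk. A more robust alternative is to frame each message explicitly. The sketch below assumes a hypothetical wire format (a 4-byte big-endian length followed by the payload); the helper names recvAll, sendFramed and recvFramed are made up for illustration.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cstdint>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper: receive exactly len bytes, or fail.
bool recvAll(int fd, char* dst, size_t len)
{
    size_t got = 0;
    while (got < len) {
        ssize_t n = recv(fd, dst + got, len - got, 0);
        if (n <= 0)               // 0: peer closed early, -1: error
            return false;
        got += static_cast<size_t>(n);
    }
    return true;
}

// Assumed format: 4-byte big-endian length, then the payload bytes.
bool sendFramed(int fd, const std::string& msg)
{
    assert(fd >= 0);
    uint32_t len = htonl(static_cast<uint32_t>(msg.size()));
    std::string frame(reinterpret_cast<const char*>(&len), sizeof len);
    frame += msg;
    size_t sent = 0;
    while (sent < frame.size()) {
        ssize_t n = send(fd, frame.data() + sent, frame.size() - sent, 0);
        if (n < 0)
            return false;
        sent += static_cast<size_t>(n);
    }
    return true;
}

// Reads one framed message. A real daemon would also cap the length
// before resizing, since it comes from the peer.
bool recvFramed(int fd, std::string& msg)
{
    uint32_t len = 0;
    if (!recvAll(fd, reinterpret_cast<char*>(&len), sizeof len))
        return false;
    msg.resize(ntohl(len));
    return msg.empty() || recvAll(fd, &msg[0], msg.size());
}
```

Because the receiver knows how many bytes to expect, it never has to guess whether a short read means the message is complete.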

Related

Socket send() hangs when in CLOSE_WAIT state

I have a C++ server application, written using the POCO framework. The server application is acting as an HTTP server in this case. There is a client application, which I don't control and cannot debug, that is causing a problem in the server. The client requests a large file, which is returned as the HTTP response. During the return of the file the client closes the connection. I see the socket move to the CLOSE_WAIT state, indicating that the client has sent a FIN. The trouble is that in my application the send() function then hangs, causing one of my HTTP threads to be basically lost, and once all the threads enter this state the server is unresponsive.
The send code is inside the POCO framework, but looks like this:
do
{
if (_sockfd == POCO_INVALID_SOCKET) throw InvalidSocketException();
rc = ::send(_sockfd, reinterpret_cast<const char*>(buffer), length, flags);
}
while (_blocking && rc < 0 && lastError() == POCO_EINTR);
if (rc < 0) error();
return rc;
(flags are 0 in calls to this function). I tried to detect this state by adding the following code:
char c;
int r;
int rc;
do
{
// Check if FIN received
while ((r = recv(_sockfd, &c, 1, MSG_DONTWAIT)) == 1) {}
if (r == 0) { ::close(_sockfd); _sockfd = POCO_INVALID_SOCKET; } // FIN received
if (_sockfd == POCO_INVALID_SOCKET) throw InvalidSocketException();
rc = ::send(_sockfd, reinterpret_cast<const char*>(buffer), length, flags);
}
while (_blocking && rc < 0 && lastError() == POCO_EINTR);
if (rc < 0) error();
return rc;
This appears to make things better, but it still does not solve the problem. The server no longer hangs as quickly, but I end up with many more CLOSE_WAIT sockets, so I think I have partially solved the thread hanging issue but have still not tidied up correctly after the broken socket. With this change in place the problem happens less often, but it still happens, so I think the key to this is understanding why send() hangs.
I'm testing this code on Linux.
To cleanly close a socket:
1. Call shutdown with SHUT_WR (the POSIX name; SD_SEND is the Winsock equivalent).
2. Keep reading from the socket until read returns zero or a fatal error.
3. Close the socket.
4. Do not attempt to access the socket after you've closed it.
Your code has two major issues. It doesn't ensure that close is always called on the socket no matter what happens, and it can access the socket after it has closed it. The former is causing your CLOSE_WAIT problem. The latter is a huge security hole.
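The close sequence described above could be sketched roughly as follows on Linux. closeGracefully is a hypothetical helper name, and the sketch assumes a blocking socket; it is an illustration, not POCO's implementation.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: shut down sending, drain until EOF or a fatal
// error, close the descriptor, and set it to -1 so that any later use
// is detectable instead of touching a recycled descriptor number.
void closeGracefully(int& fd)
{
    if (fd < 0)
        return;                       // already closed
    shutdown(fd, SHUT_WR);            // tell the peer we are done sending
    char buf[4096];
    for (;;) {                        // read until EOF or a fatal error
        ssize_t n = recv(fd, buf, sizeof buf, 0);
        if (n > 0)
            continue;                 // discard late data
        if (n < 0 && errno == EINTR)
            continue;                 // interrupted by a signal, retry
        break;                        // 0 (EOF) or a fatal error
    }
    close(fd);                        // always release the descriptor
    fd = -1;                          // never reuse the old value
}
```

Since the drain loop uses a blocking recv(), a peer that never closes its side would stall it; in a server it may be worth bounding the wait, for example with a receive timeout on the socket.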

How to catch a "connection reset by peer" error in C socket?

I have a C++ and Qt application, part of which implements a C socket client. Some time ago my app crashed because something happened with the server; the only thing I got from that crash was a message in Qt Creator's Application Output stating
recv_from_client: Connection reset by peer
I did some research on the web about this "connection reset by peer" error, and while some threads here on SO and in other places did manage to explain what is going on, none of them says how to handle it, that is, how I can "catch" the error and continue my application without a crash (in particular, the method where I read from the server is inside a while loop, so I'd like to stop the while loop and enter another place in my code that will try to re-establish the connection).
So how can I catch this error to handle it appropriately? Don't forget that my code is actually C++ with Qt - the C part is a library which calls the socket methods.
EDIT
Btw, the probable method from which the crash originated (given the "recv_from_client" part of the error message above) was:
int hal_socket_read_from_client(socket_t *obj, u_int8_t *buffer, int size)
{
struct s_socket_private * const socket_obj = (struct s_socket_private *)obj;
int retval = recv(socket_obj->client_fd, buffer, size, MSG_DONTWAIT); //last = 0
if (retval < 0)
perror("recv_from_client");
return retval;
}
Note: I'm not sure if by the time this error occurred, the recv configuration was with MSG_DONTWAIT or with 0.
Just examine errno when read() returns a negative result.
There is normally no crash involved.
while (...) {
ssize_t amt = read(sock, buf, size);
if (amt > 0) {
// success
} else if (amt == 0) {
// remote shutdown (EOF)
} else {
// error
// Interrupted by signal, try again
if (errno == EINTR)
continue;
// This is fatal... you have to close the socket and reconnect
// handle errno == ECONNRESET here
// If you use non-blocking sockets, you also have to handle
// EWOULDBLOCK / EAGAIN here
return;
}
}
It isn't an exception or a signal. You can't catch it. Instead, you get an error which tells you that the connection has been reset when trying to work on that socket.
int rc = recv(fd, ..., ..., ...);
if (rc == -1)
{
if (errno == ECONNRESET)
{
/* handle it; there isn't much to do, though.*/
}
else
perror("Error while reading");
}
As I've written, there isn't much you can do. If you're using some I/O multiplexer, you may want to remove that file descriptor from further monitoring.
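To let the calling code leave its read loop and reconnect, the errno checks can be wrapped so that each outcome maps to a distinct status. The following is a sketch only; ReadStatus and readSome are hypothetical names, and it assumes a non-blocking read via MSG_DONTWAIT as in the question.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical classification of what a single read attempt produced.
enum class ReadStatus { Data, Closed, WouldBlock, Reset, Error };

// Hypothetical wrapper around recv(): retries on EINTR and reports
// every other outcome as a status instead of only printing it.
ReadStatus readSome(int fd, char* buf, size_t size, size_t& got)
{
    assert(size > 0);
    got = 0;
    for (;;) {
        ssize_t n = recv(fd, buf, size, MSG_DONTWAIT);
        if (n > 0) {
            got = static_cast<size_t>(n);
            return ReadStatus::Data;
        }
        if (n == 0)
            return ReadStatus::Closed;       // orderly shutdown by the peer
        if (errno == EINTR)
            continue;                        // interrupted, try again
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return ReadStatus::WouldBlock;   // nothing to read right now
        if (errno == ECONNRESET)
            return ReadStatus::Reset;        // close the socket, reconnect
        return ReadStatus::Error;            // any other failure
    }
}
```

A caller would typically break out of its while loop on Closed, Reset or Error, close the descriptor, and enter its reconnect path.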

clean window socket internal buffer

I am wondering if there is a way to clear a Windows socket's internal buffer. What I want to achieve is this:
while(1){
for(i=0;i<10;i++){
sendto(...); //send 10 UDP datagrams
}
for(i=0;i<10;i++){
recvfrom (Socket, RecBuf, MAX_PKT_SIZE, 0,
(SOCKADDR*) NULL, NULL);
int Status = ProcessBuffer(RecBuf);
if (Status == SomeCondition) {
MagicalSocketCleanUP(Socket); //clean up the rest of the data in the socket, so that it doesn't affect the reading in the next iteration of the outer while loop
break; //occasionally the receive loop needs to terminate before finishing all 10 iterations
}
}
}
So what I am asking is: is there a function that discards whatever remains in the socket, so that it won't affect my next read? Thank you.
The way to clean up data from the internal receive socket buffer is to read data until there is no more data to read. If you do this in a non-blocking way, you do not need to wait for more data in select(), because the EWOULDBLOCK error value means the internal receive socket buffer is empty.
int MagicalSocketCleanUP(SOCKET Socket) {
int r;
std::vector<char> buf(128*1024);
do {
r = recv(Socket, &buf[0], buf.size(), MSG_DONTWAIT);
// keep reading while data arrives; retry if interrupted by a signal
} while (r > 0 || (r < 0 && errno == EINTR));
if (r < 0 && errno != EWOULDBLOCK && errno != EAGAIN) {
perror(__func__);
//... code to handle unexpected error
}
return r;
}
But this is not exactly safe. The other end of the socket may have sent good data into the socket buffer too, so this routine may discard more than what you want to discard.
Instead, the data on the socket should be framed in such a way that you know when the data of interest arrives. So instead of a cleanup API, you could extend ProcessBuffer() to discard input until it finds data of interest.
A simpler mechanism would be a message exchange between the two sides of the socket. When the error state is entered, the sender sends a "DISCARDING UNTIL <TOKEN>" message. The receiver sends back "<TOKEN>" and knows that only the data after the "<TOKEN>" message will be processed. The "<TOKEN>" can be a random sequence.
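As an illustration of framing the data so that stale datagrams can be recognized, each datagram could carry a round identifier that the receiver compares against the round it is currently processing. This is a sketch under assumptions: the 4-byte header layout and the names makeDatagram and acceptDatagram are hypothetical, and the header is kept in host byte order for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumed datagram layout: 4-byte round id, followed by the payload.
const size_t kHeaderSize = sizeof(uint32_t);

// Builds a datagram in out; returns its length, or 0 if out is too small.
size_t makeDatagram(uint32_t round, const char* payload, size_t len,
                    char* out, size_t cap)
{
    if (cap < kHeaderSize + len)
        return 0;
    std::memcpy(out, &round, kHeaderSize);
    std::memcpy(out + kHeaderSize, payload, len);
    return kHeaderSize + len;
}

// Returns true and exposes the payload only if the datagram belongs to
// currentRound. Leftovers from an earlier round and truncated datagrams
// are rejected, so the caller can simply skip them.
bool acceptDatagram(const char* data, size_t len, uint32_t currentRound,
                    const char*& payload, size_t& payloadLen)
{
    if (len < kHeaderSize)
        return false;
    uint32_t round;
    std::memcpy(&round, data, kHeaderSize);
    if (round != currentRound)
        return false;
    payload = data + kHeaderSize;
    payloadLen = len - kHeaderSize;
    return true;
}
```

With such a header, the receive loop would increment its round counter on every pass of the outer loop and ignore any datagram for which acceptDatagram() returns false, instead of trying to empty the socket buffer.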

Socket can't accept connections when non-blocking?

EDIT: Messed up my pseudo-coding of the accept call, it now reflects what I'm actually doing.
I've got two sockets going. I'm trying to use send/recv between the two. When the listening socket is blocking, it can see the connection and receive it. When it's nonblocking, I put a busy wait in (just to debug this) and it times out, always with the error EWOULDBLOCK. Why would the listening socket not be able to see a connection that it could see when blocking?
The code is mostly separated in functions, but here's some pseudo-code of what I'm doing.
int listener = -2;
int connector = -2;
int acceptedSocket = -2;
getaddrinfo(port 27015, AI_PASSIVE) results loop for listener socket
{
if (listener socket() == 0)
{
if (listener bind() == 0)
if (listener listen() == 0)
break;
listener close(); //if unsuccessful
}
}
SetBlocking(listener, false);
getaddrinfo("localhost", port 27015) results loop for connector socket
{
if (connector socket() == 0)
{
if (connector connect() == 0)
break; //if connect successful
connector close(); //if unsuccessful
}
}
loop for 1 second
{
acceptedSocket = listener accept();
if (acceptedSocket > 0)
break; //if successful
}
This just outputs a huge list of EWOULDBLOCK errno values before ultimately ending the timeout loop. If I output the file descriptor for the accepted socket in each loop iteration, it is never assigned a file descriptor.
The code for SetBlocking is as so:
int SetBlocking(int sockfd, bool blocking)
{
int nonblock = !blocking;
return ioctl(sockfd,
FIONBIO,
reinterpret_cast<int>(&nonblock));
}
If I use a blocking socket, either by calling SetBlocking(listener, true) or removing the SetBlocking() call altogether, the connection works no problem.
Also, note that this connection with the same implementation works in Windows, Linux, and Solaris.
Because of the tight loop you are not letting the OS complete your request. That's the difference between VxWorks and others - you basically preempt your kernel.
Use select(2) or poll(2) to wait for the connection instead.
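A sketch of waiting with poll(2) instead of spinning on accept() might look like this. It is written against the POSIX API as found on Linux; makeNonBlockingListener and acceptWithTimeout are hypothetical helper names, and the listener setup is included only so the example is complete.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: non-blocking TCP listener on 127.0.0.1 with an
// OS-chosen port, reported through port. Returns -1 on failure.
int makeNonBlockingListener(uint16_t& port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;
    socklen_t len = sizeof addr;
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 ||
        bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof addr) != 0 ||
        listen(fd, 8) != 0 ||
        getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len) != 0 ||
        fcntl(fd, F_SETFL, flags | O_NONBLOCK) != 0) {
        close(fd);
        return -1;
    }
    port = ntohs(addr.sin_port);
    return fd;
}

// Hypothetical helper: sleep in poll() until a connection is pending or
// timeoutMs elapses. Returns the accepted descriptor, or -1 on timeout
// or error. An EINTR restarts the full timeout, which is acceptable for
// a sketch.
int acceptWithTimeout(int listener, int timeoutMs)
{
    assert(listener >= 0);
    struct pollfd pfd;
    pfd.fd = listener;
    pfd.events = POLLIN;
    pfd.revents = 0;
    for (;;) {
        int rc = poll(&pfd, 1, timeoutMs);
        if (rc < 0 && errno == EINTR)
            continue;
        if (rc <= 0)
            return -1;                 // timeout or poll error
        int fd = accept(listener, nullptr, nullptr);
        if (fd >= 0)
            return fd;
        if (errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
            return -1;
        // The pending connection went away before accept(); wait again.
    }
}
```

Because the calling task blocks inside poll() rather than looping, the rest of the system gets the chance to complete the connection.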

Properly writing to a nonblocking socket in C++

I'm having a strange problem while attempting to transform a blocking socket server into a nonblocking one. Though the message was only received once when being sent with blocking sockets, using nonblocking sockets the message seems to be received an infinite number of times.
Here is the code that was changed:
return ::write(client, message, size);
to
// Nonblocking socket code
int total_sent = 0, result = -1;
while( total_sent < size ) {
// Create a temporary set of flags for use with the select function
fd_set working_set;
memcpy(&working_set, &master_set, sizeof(master_set));
// Check if data is available for the socket - wait 1 second for timeout
timeout.tv_sec = 1;
timeout.tv_usec = 0;
result = select(client + 1, NULL, &working_set, NULL, &timeout);
// We are able to write - do so
result = ::write(client, &message[total_sent], (size - total_sent));
if (result == -1) {
std::cerr << "An error has occured while writing to the server."
<< std::endl;
return result;
}
total_sent += result;
}
return 0;
EDIT: The initialization of the master set looks like this:
// Private member variables in header file
fd_set master_set;
int sock;
...
// Creation of socket in class constructor
sock = ::socket(PF_INET, socket_type, 0);
// Makes the socket nonblocking
fcntl(sock,F_GETFL,0);
FD_ZERO(&master_set);
FD_SET(sock, &master_set);
...
// And then when accept is called on the socket
result = ::accept(sock, NULL, NULL);
if (result > 0) {
// A connection was made with a client - change the master file
// descriptor to note that
FD_SET(result, &master_set);
}
I have confirmed that in both cases, the code is only being called once for the offending message. Also, the client side code hasn't changed at all - does anyone have any recommendations?
fcntl(sock,F_GETFL,0);
How does that make the socket non-blocking?
fcntl(sock, F_SETFL, O_NONBLOCK);
Also, you are not checking if you can actually write to the socket non-blocking style with
FD_ISSET(client, &working_set);
I do not believe that this code is really called only once in the "non-blocking" version (in quotes because, as Maister pointed out, it is not really non-blocking yet); check again. If the blocking and non-blocking versions are to be consistent, the non-blocking version should return total_sent (or size). With return 0 instead, the caller is likely to believe nothing was sent, which would cause infinite resending. Isn't that what's happening?
Also, your "non-blocking" code is quite strange. You seem to use select to make it blocking anyway. OK, with a timeout of 1 s, but why don't you make it really non-blocking? That is, remove all the select stuff and test for the error case in write(), with errno being EWOULDBLOCK. select and poll are for multiplexing.
You should also check for errors from select and use FD_ISSET to check whether the socket is really ready. What if the 1 s timeout really expires? Or if select is interrupted by a signal? And if an error occurs in write, you should also report which error; that is much more useful than your generic message. But I guess this part of the code is still far from finished.
As far as I understand your code, it should probably look somewhat like this (whether the code runs in a single thread, is threaded, or forks when accepting a connection would change the details):
// Creation of socket in class constructor
sock = ::socket(PF_INET, socket_type, 0);
fcntl(sock, F_SETFL, O_NONBLOCK);
// And then when accept is called on the socket
result = ::accept(sock, NULL, NULL);
if (result > 0) {
// A connection was made with a client
client = result;
fcntl(client, F_SETFL, O_NONBLOCK);
}
// Nonblocking socket code
result = ::write(client, message, size);
if (result == -1) {
if (errno == EWOULDBLOCK || errno == EAGAIN){
// Nothing could be written right now; the caller should retry later
return 0;
}
std::cerr << "An error has occurred while writing to the server."
<< std::endl;
return result;
}
// Number of bytes actually written, which may be less than size
return result;
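If the caller needs the whole message to go out, the partial-write and EWOULDBLOCK cases can be combined in one loop that waits for writability only when the send buffer is full. This is a sketch under assumptions: writeAll is a hypothetical helper, and it uses poll() rather than select() for brevity.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: write size bytes to a non-blocking descriptor.
// Returns the number of bytes written (less than size if the wait for
// writability timed out or failed) or -1 on a write error.
ssize_t writeAll(int fd, const char* message, size_t size, int timeoutMs)
{
    assert(fd >= 0);
    size_t total = 0;
    while (total < size) {
        ssize_t n = ::write(fd, message + total, size - total);
        if (n > 0) {
            total += static_cast<size_t>(n);   // partial writes are normal
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;                          // interrupted, retry
        if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                         // real error
        // Send buffer is full: wait until the socket is writable again.
        struct pollfd pfd;
        pfd.fd = fd;
        pfd.events = POLLOUT;
        pfd.revents = 0;
        int rc = poll(&pfd, 1, timeoutMs);
        if (rc < 0 && errno == EINTR)
            continue;
        if (rc <= 0)
            break;                             // timeout or poll failure
    }
    return static_cast<ssize_t>(total);
}
```

Returning the byte count keeps the result comparable with the original blocking ::write() call, so the caller can tell a complete send from a partial one.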