TCP accept() fails on the first connection after 1 hour - C++

I have written a C++ client/server application, and the server is crashing.
The scenario:
Start the server.
One hour later (not before), a client connects.
The server, which is blocked in accept(), then gets -1 back with errno "Too many open files".
Nothing else special is running on the machine, which led me to believe that accept() opens many file descriptors while waiting.
Is that true?
How can I fix this so that the client can connect at any time?
The relevant server code:
int sockClient;
while (true) {
sockaddr_in* clientSockAddr = new sockaddr_in();
socklen_t clientSockAddrLen = sizeof(sockaddr_in);
sockClient = accept(sockServer, (sockaddr *) clientSockAddr,
&clientSockAddrLen);
if(sockClient == -1 ){
std::ostringstream s;
s << "TCP Server: accept connection error." << std::strerror(errno);
throw runtime_error(s.str());
}
connection->communicate(sockClient, clientSockAddr, clientSockAddrLen);
}

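No, accept() does not open descriptors while it waits; it creates exactly one, for the connection it returns. You have a file descriptor leak somewhere. Possibly you aren't closing accepted sockets when you've finished with them, or else the leak is in files opened elsewhere in the program.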
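A minimal sketch of one way to rule out the accepted-socket leak, assuming the descriptor is not closed anywhere inside communicate(). ScopedFd, handleClient and isOpen are hypothetical names, not part of the original code. The wrapper closes the descriptor on every exit path, including when an exception is thrown.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical RAII holder: closes the descriptor when it goes out of scope.
class ScopedFd {
public:
    explicit ScopedFd(int fd) : fd_(fd) {}
    ~ScopedFd() { if (fd_ >= 0) ::close(fd_); }
    ScopedFd(const ScopedFd&) = delete;
    ScopedFd& operator=(const ScopedFd&) = delete;
    int get() const { return fd_; }
private:
    int fd_;
};

// Hypothetical stand-in for connection->communicate(): takes ownership of
// the accepted socket and releases it when the function returns or throws.
void handleClient(int acceptedFd) {
    ScopedFd client(acceptedFd);
    // ... read from / write to client.get() here ...
}

// Debugging aid: true while fd still refers to an open descriptor.
bool isOpen(int fd) {
    return ::fcntl(fd, F_GETFD) != -1;
}
```

On Linux, listing /proc/<pid>/fd while the server runs shows which descriptors are piling up. Separately, the sockaddr_in allocated with new in the loop is never deleted in the code shown; that leaks memory rather than descriptors, and a stack variable would avoid it.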

Related

Can a TCP/IP client connect to an unreachable IP?

I've been searching for ages for a solution to my problem, and I must have landed on Stack Overflow 20 times without finding anything.
Here's the situation: I'm trying to develop a simple TCP/IP client in C++ (following the well-written Beej's Guide) that is supposed to communicate with a Python TCP/IP server.
My code (inside a function) is:
memset(&m_hints, 0, sizeof m_hints);
m_hints.ai_family=AF_UNSPEC;
m_hints.ai_socktype=SOCK_STREAM;
m_portnbrstring=to_string(m_portnbr);
if ((m_getaddrinfostatus=getaddrinfo(m_serverIP,(const char*) m_portnbrstring.c_str(), &m_hints, &m_servinfo))!=0)
{
char tempstrerror[100];
strcpy(tempstrerror,"getaddrinfo in TCPStartClient: ");
strcat(tempstrerror,gai_strerror(m_getaddrinfostatus));
ExitAndDisplayMessage(tempstrerror);
}
for(m_plist=m_servinfo; m_plist!=NULL; m_plist=m_plist->ai_next)
{
if ((m_sockfd=socket(m_plist->ai_family, m_plist->ai_socktype, m_plist->ai_protocol))==-1)
{
perror("Something went wrong when creating TCP socket");
continue;
}
break;
if (connect(m_sockfd, m_plist->ai_addr, m_plist->ai_addrlen)==-1)
{
close(m_sockfd);
perror("Something went wrong when connecting to TCP socket");
continue;
}
break;
}
if (m_plist==NULL) ExitAndDisplayMessage("getaddrinfo in TCPStartClient: failed to connect");
char tempaddr[INET6_ADDRSTRLEN];
inet_ntop(m_plist->ai_family,get_in_addr((struct sockaddr *)m_plist->ai_addr),tempaddr, sizeof tempaddr);
cout << "TCPClient started at IP " << tempaddr << " on port " << ntohs(get_in_port((struct sockaddr *)m_plist->ai_addr)) << endl;
The definitions are
int m_sockfd;
char *m_serverIP;
struct addrinfo m_hints;
struct addrinfo *m_servinfo;
struct addrinfo *m_plist;
Up to here everything looks fine, but connect() keeps returning 0 (no error) even if the IP I specify is unreachable. Basically, connect() succeeds even if the server is down, or if I test with a random IP that I verified with fping to be unreachable.
Does anyone have an idea of what's happening? I'd be glad if someone could help me out of this.
"Can a TCP/IP client connect to an unreachable IP?" - No. Obviously not.
If whatever you are using to establish a connection reports that a TCP socket connected successfully to an unreachable IP, then one of three things is true: the tool is broken, you are using it wrong (the most likely situation), or the IP is in fact reachable.
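In the code as posted, the unconditional break right after the socket() check leaves the for loop before connect() is ever reached, so m_plist is never NULL and the success message is printed without any connection attempt. A minimal sketch of the loop with that break removed follows; connectTo is a hypothetical helper name, and the member variables from the question are replaced by locals.

```cpp
#include <cassert>
#include <cstring>
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: returns a connected TCP socket, or -1 if name
// resolution failed or every address returned by getaddrinfo() failed.
int connectTo(const char* host, const char* port) {
    addrinfo hints;
    std::memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;

    addrinfo* list = nullptr;
    if (getaddrinfo(host, port, &hints, &list) != 0) return -1;

    int fd = -1;
    for (addrinfo* p = list; p != nullptr; p = p->ai_next) {
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd == -1) continue;                  // try the next address
        if (connect(fd, p->ai_addr, p->ai_addrlen) == 0) break;  // connected
        close(fd);                               // failed: release, try next
        fd = -1;
    }
    freeaddrinfo(list);
    return fd;
}
```

With this shape, the "failed to connect" branch is reached whenever no address accepted the connection, because the only break sits after a successful connect().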

bind returns address in use even if no connection is established

I have C++ code in which I am trying to establish a connection on a socket. But I first need to check if a connection already exists on a given port, and if it exists I need to close the connection. I have the code below, and my problem is that when I check whether the port is already connected, it reports that it is, even though connect() failed previously.
connected = false;
int sockfd;
void conn(int port) {
struct sockaddr_in addr;
addr.sin_family = AF_INET;
addr.sin_port = htons(port);
.....
int sockfd_t;
if ( (sockfd_t = socket(PF_INET, SOCK_STREAM, 0)) < 0) {
cout << "Error opening socket_test " << endl;
return;
}
// check if address already in use
if (bind(sockfd_t, (struct sockaddr *) &addr, sizeof(addr)) < 0) {
if(errno == EADDRINUSE) {
cout << "address in use: bind fail, port=" << port << endl;
}
// do something - close the connection if already connected
}
else {
cout << "bind ok, port=" << port << endl;
}
close(sockfd_t);
if ( (sockfd = socket(PF_INET, SOCK_STREAM, 0)) < 0) {
cout << "Error opening socket " << endl;
return;
}
if (connect(sockfd, (struct sockaddr *) &adresse, sizeof(adresse))) {
cout << "Error connecting" << endl;
close(sockfd);
return;
}
connected = true;
}
int main() {
int port=3590;
while (!connected) {
conn(port);
}
cout << "CONNECTED";
// ..........
}
After running the program this is the output printed:
bind ok, port=3590;
Error connecting
bind ok, port=3590;
Error connecting
address in use: bind fail, port=3590 //???
CONNECTED!
I don't understand why the fifth line of the output shows "address in use:...", given that connect() failed the first two times.
I think you have a misconception about what these socket operations do.
You wrote: "But I first need to check if a connection already exists on a given port, and if it exists I need to close the connection."
bind() gives a socket a local address; it has nothing to do with checking whether the remote address you are trying to connect to is reachable.
connect() connects the socket to a remote address.
When connecting a socket as a client (which is what I think you are trying to do), you don't need to check whether there is already a connection: the remote server can handle multiple incoming client connections on the same port. Binding is usually only important for servers.
If you don't bind before connecting, the socket is assigned an ephemeral local port automatically.
So, if you are a client, you do:
socket()
connect()
If you are a server, you do:
socket()
bind()
listen()
accept()
In your question, the output makes sense if no server is listening at first and then a server comes online.
The first two times, the bind succeeds because nobody else is using the port, and then the connect fails because nothing is listening on it: you only bound a socket (and then closed it), you did not start a server by calling listen().
Then a real server on the same host binds that port and starts listening. From then on you can no longer bind the port (bind fails with EADDRINUSE), but you can connect, because the server is listening.
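The difference can be checked on the loopback interface. The following is a sketch, assuming POSIX sockets; loopback, tryBind and tryConnect are hypothetical helper names. tryBind reports whether the local port is free, and tryConnect reports whether something is listening there.

```cpp
#include <cassert>
#include <cerrno>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Builds a 127.0.0.1 address for the given port (port in host byte order).
sockaddr_in loopback(unsigned short port) {
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    return addr;
}

// bind() asks: "is this LOCAL port free for me to own?"
// Returns 0 on success, otherwise the errno value (e.g. EADDRINUSE).
int tryBind(unsigned short port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return errno;
    sockaddr_in addr = loopback(port);
    int err = 0;
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof addr) < 0) err = errno;
    close(fd);
    return err;
}

// connect() asks: "is somebody LISTENING at this address?"
// Returns 0 on success, otherwise the errno value (e.g. ECONNREFUSED).
int tryConnect(unsigned short port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return errno;
    sockaddr_in addr = loopback(port);
    int err = 0;
    if (connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof addr) < 0) err = errno;
    close(fd);
    return err;
}
```

With no server running, tryBind succeeds and tryConnect is refused; once a server is listening on the port, the results swap, which is the pattern in the question's output.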

standard C++ TCP socket, connect fails with EINTR when using std::async

I am having trouble using std::async to run tasks in parallel when the task involves a socket.
My program is a simple TCP socket server written in standard C++ for Linux. When a client connects, a dedicated port is opened and a separate thread is started, so each client is serviced in its own thread.
The client objects are contained in a map.
I have a function to broadcast a message to all clients. I originally wrote it like below:
// ConnectedClient is an object representing a single client
// ConnectedClient::SendMessageToClient opens a socket, connects, writes, reads response and then closes socket
// broadcastMessage is the std::string to go out to all clients
// iterate through the map of clients
map<string, ConnectedClient*>::iterator nextClient;
for ( nextClient = mConnectedClients.begin(); nextClient != mConnectedClients.end(); ++nextClient )
{
printf("%s\n", nextClient->second->SendMessageToClient(broadcastMessage).c_str());
}
I have tested this and it works with 3 clients at a time. The message gets to all three clients (one at a time), and the response string is printed out three times in this loop. However, it is slow, because the message only goes out to one client at a time.
In order to make it more efficient, I was hoping to take advantage of std::async to call the SendMessageToClient function for every client asynchronously. I rewrote the code above like this:
vector<future<string>> futures;
// iterate through the map of clients
map<string, ConnectedClient*>::iterator nextClient;
for ( nextClient = mConnectedClients.begin(); nextClient != mConnectedClients.end(); ++nextClient )
{
printf("start send\n");
futures.push_back(async(launch::async, &ConnectedClient::SendMessageToClient, nextClient->second, broadcastMessage, wait));
printf("end send\n");
}
vector<future<string>>::iterator nextFuture;
for( nextFuture = futures.begin(); nextFuture != futures.end(); ++nextFuture )
{
printf("start wait\n");
nextFuture->wait();
printf("end wait\n");
printf("%s\n", nextFuture->get().c_str());
}
The code above works as expected when there is only one client in the map. You see "start send" quickly followed by "end send", quickly followed by "start wait"; then, 3 seconds later (I have a three-second sleep on the client response side to test this), the trace from the socket read function shows the response coming in, and then you see "end wait".
The problem appears when there is more than one client in the map. The part of SendMessageToClient that opens and connects the socket fails at the point marked below:
// connected client object has a pipe open back to the client for sending messages
int clientSocketFileDescriptor;
clientSocketFileDescriptor = socket(AF_INET, SOCK_STREAM, 0);
// set the socket timeouts
// this part using setsockopt is omitted for brevity
// host name
struct hostent *server;
server = gethostbyname(mIpAddressOfClient.c_str());
if (server == 0)
{
close(clientSocketFileDescriptor);
return "";
}
//
struct sockaddr_in clientsListeningServerAddress;
memset(&clientsListeningServerAddress, 0, sizeof(struct sockaddr_in));
clientsListeningServerAddress.sin_family = AF_INET;
bcopy((char*)server->h_addr, (char*)&clientsListeningServerAddress.sin_addr.s_addr, server->h_length);
clientsListeningServerAddress.sin_port = htons(mPortNumberClientIsListeningOn);
// The connect function fails !!!
if ( connect(clientSocketFileDescriptor, (struct sockaddr *)&clientsListeningServerAddress, sizeof(clientsListeningServerAddress)) < 0 )
{
// print out error code
printf("Connected client thread: fail to connect %d \n", errno);
close(clientSocketFileDescriptor);
return response;
}
The output reads: "Connected client thread: fail to connect 4".
I looked this error code up, it is explained thus:
#define EINTR 4 /* Interrupted system call */
I searched around on the internet, all I found were some references to system calls being interrupted by signals.
Does anyone know why this works when I call my send message function one at a time, but it fails when the send message function is called using async? Does anyone have a different suggestion how I should send a message to multiple clients?
First, I would deal with the EINTR issue. connect() has been interrupted (that is what EINTR means) and is not retried automatically because you are using an asynchronous descriptor.
What I usually do in such a circumstance is retry: I wrap the call (connect() in this case) in a while loop. If connect() succeeds, I break out of the loop. If it fails, I check errno, and if it is EINTR I try again.
Mind that other errno values also deserve a retry (EWOULDBLOCK is one of them).
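A sketch of EINTR handling for connect(), with one deliberate difference from a plain retry loop: POSIX specifies that an interrupted connect() is not aborted and keeps going asynchronously, so calling connect() again on the same socket can fail with EALREADY or EISCONN on some systems. This version waits with poll() and then reads the outcome from SO_ERROR instead. connectHandlingEintr is a hypothetical helper name, and a blocking socket is assumed.

```cpp
#include <cassert>
#include <cerrno>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: connect() that survives interruption by a signal.
// Returns 0 on success, otherwise an errno value.
int connectHandlingEintr(int fd, const sockaddr* addr, socklen_t len) {
    if (connect(fd, addr, len) == 0) return 0;
    if (errno != EINTR) return errno;

    // Interrupted: the connection attempt continues in the background.
    // Wait until the socket becomes writable, restarting poll() on EINTR.
    pollfd pfd{};
    pfd.fd = fd;
    pfd.events = POLLOUT;
    int rc;
    do {
        rc = poll(&pfd, 1, -1);
    } while (rc == -1 && errno == EINTR);
    if (rc == -1) return errno;

    // SO_ERROR holds the final outcome of the attempt (0 means connected).
    int err = 0;
    socklen_t errLen = sizeof err;
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &errLen) == -1) return errno;
    return err;
}
```

In SendMessageToClient, the call would replace the bare connect(), with a non-zero return value treated as the failure case.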

Winsock2's listen() function finds a connection for every port; even those that don't exist?

I'm attempting to write a method that listens for a connection request on a specific port using TCP, with no libraries other than those that ship with Windows. Creating the socket and binding to a port seem to work fine; the problem seems to be with the listen() function. Even with no connection request on any port, it always returns zero, which means, quoting Microsoft's documentation:
If no error occurs, listen returns zero.
The strange part is that this happens with all port values: it seems to find a connection request on randomly chosen ports, from 1234, to 8000, to -154326. For each of these it returns zero.
What it should be doing is blocking until a connection request is found (this is apparently what SOMAXCONN indicates); once again, quoting Microsoft's documentation:
If there are no available socket descriptors, listen attempts to continue to function.
Here is the method itself:
bool listenOnPort(SOCKET networkSocket, int portNumber) {
WSADATA wsadata;
int error = WSAStartup(0x0202, &wsadata);
if(error) {
cout << "Failed to start up Windows Sockets API." << endl;
return false;
}
if(wsadata.wVersion != 0x0202) {
WSACleanup();
cout << "Failed to find a valid Windows Sockets API." << endl;
return false;
}
SOCKADDR_IN address;
address.sin_family = AF_INET;
address.sin_port = htons(portNumber);
address.sin_addr.s_addr = htonl(INADDR_ANY);
networkSocket = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
if(networkSocket == INVALID_SOCKET) {
cout << "Failed to create a network socket." << endl;
return false;
}
if(bind(networkSocket, (LPSOCKADDR)&address, sizeof(address)) == SOCKET_ERROR) {
cout << "Failed to bind to the port." << endl;
return false;
}
cout << "Listening for a connection to port " << portNumber <<"..." << endl;
listen(networkSocket, SOMAXCONN);
cout << "Found a connection!" << endl;
}
Any explanation/word of advice is appreciated - thank you ahead of time!
You've confused listen() with accept(). listen() marks the bound socket as passive and starts queuing incoming connections; it returns immediately. accept() waits for an incoming connection (if one isn't already queued) and returns it.
listen() succeeds even when there is no incoming connection attempt.
http://linux.die.net/man/2/listen
listen() marks the socket referred to by sockfd as a passive socket, that is, as a socket that will be used to accept incoming connection requests using accept(2).
You must call "listen()" before you can call "accept()"; but "accept()" is the call that accepts new connections (and gives you a new socket for each new connection).
Here's the man page for "accept()":
http://linux.die.net/man/2/accept
Better, look at Beej's Guide for an excellent introduction to sockets programming:
http://beej.us/guide/bgnet/output/html/multipage/
PS:
And don't forget to call WSAStartup() if you're using Windows sockets :)
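A sketch of the listen()/accept() split, written against POSIX sockets rather than Winsock so that it matches the man pages linked above; on Windows the same calls exist, with SOCKET handles, closesocket() and WSAStartup() in place of the POSIX pieces. startListening and waitForClient are hypothetical names.

```cpp
#include <cassert>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Creates a TCP socket, binds it to the given port on all interfaces and
// puts it in the listening state. Returns the descriptor, or -1 on failure.
// listen() returns immediately; it does not wait for a client.
int startListening(unsigned short port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof addr) < 0 ||
        listen(fd, SOMAXCONN) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

// Blocks until a client has connected, then returns a NEW descriptor for
// that one connection. The listening socket stays open for further clients.
int waitForClient(int listenFd) {
    return accept(listenFd, nullptr, nullptr);
}
```

The "Found a connection!" message from the question belongs after the accept() call, not after listen(). SOMAXCONN only sets the length of the queue of pending connections.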

Why might bind() sometimes give EADDRINUSE when other side connects?

In my C++ application, I am using ::bind() for a UDP socket, but on rare occasions, after reconnection due to lost connection, I get errno EADDRINUSE, even after many retries. The other side of the UDP connection which will receive the data reconnected fine and is waiting for select() to indicate there is something to read.
I presume this means the local port is in use. If true, how might I be leaking the local port such that the other side connects to it fine? The real issue here is that other side connected fine and is waiting but this side is stuck on EADDRINUSE.
--Edit--
Here is a code snippet showing that I already set SO_REUSEADDR on my TCP socket, but not on the UDP socket I am having the issue with:
// According to "Linux Socket Programming by Example" p. 319, we must call
// setsockopt w/ SO_REUSEADDR option BEFORE calling bind.
// Make the address reusable so we don't get the nasty message.
int so_reuseaddr = 1; // Enabled.
int reuseAddrResult
= ::setsockopt(getTCPSocket(), SOL_SOCKET, SO_REUSEADDR, &so_reuseaddr,
sizeof(so_reuseaddr));
Here is my code to close the UDP socket when done:
void
disconnectUDP()
{
if (::shutdown(getUDPSocket(), 2) < 0) {
clog << "Warning: error during shutdown of data socket("
<< getUDPSocket() << "): " << strerror(errno) << '\n';
}
if (::close(getUDPSocket()) < 0 && !seenWarn) {
clog << "Warning: error while closing data socket("
<< getUDPSocket() << "): " << strerror(errno) << '\n';
}
}
Yes, that's normal. You need to set SO_REUSEADDR on the socket before you bind, e.g. on *nix:
int sock = socket(...);
int yes = 1;
setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof(yes));
If you have separate code that reconnects by creating a new socket, set it on that one too. This is just to do with the default behaviour of the OS -- the port on a broken socket is kept defunct for a while.
[EDIT] This shouldn't apply to UDP connections. Maybe you should post the code you use to set up the socket.
In UDP there's no such thing as a lost connection, because there is no connection. You can lose sent packets, that's all.
Don't reconnect, simply reuse the existing fd.
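A sketch of why a leaked descriptor produces this symptom, assuming POSIX sockets on the loopback interface; openUdp and localPort are hypothetical helper names. While the old UDP descriptor is still open, a second bind() to the same port fails with EADDRINUSE however many times it is retried; once the old descriptor is closed, the port is free again immediately.

```cpp
#include <cassert>
#include <cerrno>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: opens a UDP socket bound to 127.0.0.1:port.
// Returns the descriptor, or -1 with errno set by the failing call.
int openUdp(unsigned short port) {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) return -1;
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof addr) < 0) {
        int saved = errno;   // close() may overwrite errno
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

// Hypothetical helper: local port the socket is bound to, or 0 on error.
unsigned short localPort(int fd) {
    sockaddr_in addr{};
    socklen_t len = sizeof addr;
    if (getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len) < 0) return 0;
    return ntohs(addr.sin_port);
}
```

If the application keeps one bound descriptor for its whole lifetime and only uses sendto()/recvfrom() on it, there is nothing to re-bind after a peer goes away and comes back.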