Windows C++ Intermittent Socket Disconnect

I've got a server that uses a two-thread system to manage between 100 and 200 concurrent connections. It uses TCP sockets, as guaranteed packet delivery is important (it's a communication system where missed remote API calls could FUBAR a client).
I've implemented a custom protocol layer to separate incoming bytes into packets and dispatch them properly (the library is included below). I realize the issues with using MSG_PEEK, but to my knowledge it is the only approach that fulfills the needs of the library implementation. I am open to suggestions, especially if it could be part of the problem.
Basically, the problem is that, randomly, the server will drop a client's socket due to a lack of incoming packets for more than 20 seconds, despite the client successfully sending a keepalive packet every 4 seconds. I can verify that the server itself didn't go offline and that the connections of the users experiencing the problem (including myself) are stable.
The library for sending/receiving is here:
short ncsocket::send(wstring command, wstring data) {
    wstringstream ss;
    int datalen = ((int)command.length() * 2) + ((int)data.length() * 2) + 12;
    ss << zero_pad_int(datalen) << L"|" << command << L"|" << data;
    int tosend = datalen;
    short __rc = 0;
    do {
        int res = ::send(this->sock, (const char*)ss.str().c_str(), datalen, NULL);
        if (res != SOCKET_ERROR)
            tosend -= res;
        else
            return FALSE;
        __rc++;
        Sleep(10);
    } while (tosend != 0 && __rc < 10);
    if (tosend == 0)
        return TRUE;
    return FALSE;
}
short ncsocket::recv(netcommand& nc) {
    vector<wchar_t> buffer(BUFFER_SIZE);
    int recvd = ::recv(this->sock, (char*)buffer.data(), BUFFER_SIZE, MSG_PEEK);
    if (recvd > 0) {
        if (recvd > 8) {
            wchar_t* lenstr = new wchar_t[4];
            memcpy(lenstr, buffer.data(), 8);
            int fulllen = _wtoi(lenstr);
            delete lenstr;
            if (fulllen > 0) {
                if (recvd >= fulllen) {
                    buffer.resize(fulllen / 2);
                    recvd = ::recv(this->sock, (char*)buffer.data(), fulllen, NULL);
                    if (recvd >= fulllen) {
                        buffer.resize(buffer.size() + 2);
                        buffer.push_back((char)L'\0');
                        vector<wstring> data = parsewstring(L"|", buffer.data(), 2);
                        if (data.size() == 3) {
                            nc.command = data[1];
                            nc.payload = data[2];
                            return TRUE;
                        }
                        else
                            return FALSE;
                    }
                    else
                        return FALSE;
                }
                else
                    return FALSE;
            }
            else {
                ::recv(this->sock, (char*)buffer.data(), BUFFER_SIZE, NULL);
                return FALSE;
            }
        }
        else
            return FALSE;
    }
    else
        return FALSE;
}
This is the code for determining if too much time has passed:
if ((int)difftime(time(0), regusrs[i].last_recvd) > SERVER_TIMEOUT) {
    regusrs[i].sock.end();
    regusrs[i].is_valid = FALSE;
    send_to_all(L"removeuser", regusrs[i].server_user_id);
    wstringstream log_entry;
    log_entry << regusrs[i].firstname << L" " << regusrs[i].lastname << L" (suid:" << regusrs[i].server_user_id << L",p:" << regusrs[i].parent << L",pid:" << regusrs[i].parentid << L") was disconnected due to idle";
    write_to_log_file(server_log, log_entry.str());
}
The "regusrs[i]" is using the currently iterated member of a vector I use to story socket descriptors and user information. The 'is_valid' check is there to tell if the associated user is an actual user - this is done to prevent the system from having to deallocate the member of the vector - it just returns it to the pool of available slots. No thread access/out-of-range issues that way.
Anyway, I started to wonder if it was the server itself was the problem. I'm testing on another server currently, but I wanted to see if another set of eyes could stop something out of place or cue me in on a concept with sockets and extended keepalives that I'm not aware of.
Thanks in advance!

I think I see what you're doing with MSG_PEEK, where you wait until it looks like you have enough data to read a full packet. However, I would be suspicious of this. (It's hard to determine the dynamic behaviour of your system just by looking at this small part of the source and not the whole thing.)
To avoid use of MSG_PEEK, follow these two principles:
When you get a notification that data is ready (I assume you're using select), then read all the waiting data from recv(). You may need more than one recv() call, and you should be prepared to handle the incoming data in pieces.
If you read only a partial packet (length or payload), then save it somewhere for the next time you get a read notification. Put the packets and payloads back together yourself, don't leave them in the socket buffer.
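Here is a minimal sketch of that approach, kept close to your wide-character protocol. The Connection struct, drain_socket() and on_packet() are hypothetical names of mine, and I'm assuming your 8-byte (four wide-character) length prefix and the usual <vector>/<cstring> and Winsock headers; treat it as a starting point, not a drop-in replacement:
// Per-connection state: bytes received but not yet parsed into packets.
struct Connection {
    std::vector<char> pending;
};

void on_packet(const char* bytes, int len);   // hypothetical dispatch hook

// Call this whenever select() reports the socket readable.
// Returns false on error or orderly shutdown.
bool drain_socket(SOCKET sock, Connection& conn) {
    char chunk[4096];
    int n = ::recv(sock, chunk, sizeof(chunk), 0);
    if (n <= 0)
        return false;
    conn.pending.insert(conn.pending.end(), chunk, chunk + n);
    // Extract as many complete packets as the buffer now holds.
    for (;;) {
        if (conn.pending.size() < 8)
            break;                            // length prefix incomplete, wait for more
        wchar_t lenstr[5] = {};               // local variable, no heap allocation
        memcpy(lenstr, conn.pending.data(), 8);
        int fulllen = _wtoi(lenstr);
        if (fulllen <= 0)
            return false;                     // corrupt stream, drop the connection
        if ((int)conn.pending.size() < fulllen)
            break;                            // payload incomplete, wait for more
        on_packet(conn.pending.data(), fulllen);
        conn.pending.erase(conn.pending.begin(),
                           conn.pending.begin() + fulllen);
    }
    return true;
}
This way nothing is ever left sitting in the socket buffer, and a packet that arrives split across several TCP segments is reassembled by your code instead of being repeatedly re-peeked.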
As an aside, the use of new/memcpy/_wtoi/delete is woefully inefficient. You don't need to allocate memory at all; you can use a local variable. And then you don't even need the memcpy, just a cast.
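For example, in your recv() the peeked data is already sitting in a vector<wchar_t>, so the length can be parsed in place (this relies on the L'|' separator after the digits, or the vector's zero-initialised tail, stopping the parse):
// buffer is the vector<wchar_t> the bytes were peeked into;
// _wtoi() parses the leading digits and stops at the L'|' separator.
int fulllen = _wtoi(buffer.data());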
I presume you already assume that your packets can be no longer than 9999 bytes in length, since that is all a four-digit length prefix can express.

Related

libusb_get_string_descriptor_ascii() timeout error?

I'm trying to get the serial number of a USB device using libusb-1.0.
The problem I have is that sometimes the libusb_get_string_descriptor_ascii() function returns -7 (LIBUSB_ERROR_TIMEOUT) in my code, but other times the serial number is correctly written in my array and I can't figure out what is happening. Am I using libusb incorrectly? Thank you.
void EnumerateUsbDevices(uint16_t uVendorId, uint16_t uProductId) {
    libusb_context *pContext;
    libusb_device **ppDeviceList;
    libusb_device_descriptor oDeviceDescriptor;
    libusb_device_handle *hHandle;
    int iReturnValue = libusb_init(&pContext);
    if (iReturnValue != LIBUSB_SUCCESS) {
        return;
    }
    libusb_set_debug(pContext, 3);
    ssize_t nbUsbDevices = libusb_get_device_list(pContext, &ppDeviceList);
    for (ssize_t i = 0; i < nbUsbDevices; ++i) {
        libusb_device *pDevice = ppDeviceList[i];
        iReturnValue = libusb_get_device_descriptor(pDevice, &oDeviceDescriptor);
        if (iReturnValue != LIBUSB_SUCCESS) {
            continue;
        }
        if (oDeviceDescriptor.idVendor == uVendorId && oDeviceDescriptor.idProduct == uProductId) {
            iReturnValue = libusb_open(pDevice, &hHandle);
            if (iReturnValue != LIBUSB_SUCCESS) {
                continue;
            }
            unsigned char uSerialNumber[255] = {};
            int iSerialNumberSize = libusb_get_string_descriptor_ascii(hHandle, oDeviceDescriptor.iSerialNumber, uSerialNumber, sizeof(uSerialNumber));
            std::cout << iSerialNumberSize << std::endl; // Print size of serial number <--
            libusb_close(hHandle);
        }
    }
    libusb_free_device_list(ppDeviceList, 1);
    libusb_exit(pContext);
}
I see nothing wrong with your code. I would not worry too much about timeouts in the context of USB. It is a bus, after all, and can be occupied with other traffic.
As you may know, depending on the USB version, a portion of the bandwidth is reserved for control transfers. libusb_get_string_descriptor_ascii() simply sends all the required control transfers to get the string. If any of those times out, it will abort. You can try to send these control transfers yourself and use bigger timeout values, but I guess the possibility of a timeout will always be there to wait for you (pun intended).
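For reference, a sketch of doing that manually via libusb_control_transfer(); the 0x0409 language ID and the 5-second timeout are assumptions of mine (robust code would first query string descriptor 0 for the supported language IDs):
// Fetch the raw string descriptor with a longer timeout than the
// roughly one-second value libusb uses internally. The data comes
// back as UTF-16LE with a 2-byte header (length, descriptor type).
unsigned char desc[255];
int r = libusb_control_transfer(
    hHandle,
    LIBUSB_ENDPOINT_IN,                                        // device-to-host
    LIBUSB_REQUEST_GET_DESCRIPTOR,
    (LIBUSB_DT_STRING << 8) | oDeviceDescriptor.iSerialNumber, // wValue
    0x0409,                                                    // wIndex: language ID (US English, assumed)
    desc, sizeof(desc),
    5000);                                                     // timeout in ms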
So it turns out my device was getting into weird states, possibly not being closed properly or the like. Anyway, calling libusb_reset_device(hHandle); just after the libusb_open() call seems to fix my sporadic timeout issue.
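Sketched against the enumeration code above, the placement would look something like this (a sketch of the fix described, not a guaranteed cure):
iReturnValue = libusb_open(pDevice, &hHandle);
if (iReturnValue != LIBUSB_SUCCESS) {
    continue;
}
// Reset the device to clear any stale state a previous, improperly
// closed session may have left behind, before requesting descriptors.
libusb_reset_device(hHandle);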

c++ irc bot only receives data once

I'm creating a simple IRC bot in C++ but I'm coming across some issues I can't seem to resolve. My code is based on Beej's Networking Guide.
I have also used the following code to try to fix my problem.
main.cpp snippet
// as long as we don't lose connection, keep receiving data
while(true) {
    num_receives++;
    if(num_receives == 3) {
        snd("NICK q1w2\r\n");
        snd("USER q1w2 0 * :q\r\n");
    }
    if(num_receives == 4) {
        snd("JOIN q1w2\r\n");
    }
    num_bytes = recv(s, buf, MAXDATASIZE-1, 0);
    buf[num_bytes] = '\0';
    std::cout << num_receives << "\t" << buf;
    if(num_bytes == -1) {
        perror("ERROR receiving");
        break;
    }
    if(num_bytes == 0) {
        perror("DISCONNECTED");
        break;
    }
}
and the output is (the first line is the command used to run the bot):
~$ ./bot irc.snt.utwente.nl 6667
1 :irc.snt.utwente.nl 020 * :Please wait while we process your connection.
2 ERROR :Closing Link: [unknown#91.177.66.186] (Ping timeout)
DISCONNECTED: Network is unreachable
So the problem is that it only receives data once, even though recv() is placed in a while loop. I don't really see what's blocking it from looping or receiving. If I place a cout inside the while loop, you can see it runs only once, and then a second time at the end when it disconnects.
Edit: I have it working. The commands sent to the server should be put outside the while loop. Inside the loop they were only sent after several recv() calls, but recv() blocks waiting for data the server never sends, because we never registered; so the server eventually closes the link with a ping timeout.
[correct] main.cpp snippet
snd("NICK q1w2\r\n");
snd("USER q1w2 0 * :q\r\n");
snd("JOIN q1w2\r\n");
// as long we don't lose connection, keep receiving data
while(true) {
num_receives++;
num_bytes = recv(s, buf, MAXDATASIZE-1, 0);
buf[num_bytes]='\0';
std::cout << num_receives << "\t" << buf;
if(num_bytes == -1) {
perror("ERROR receiving");
break;
}
if(num_bytes == 0) {
perror("DISCONNECTED");
break;
}
}
You forgot to implement the IRC protocol.
Your code assumes that each call to the TCP connection's recv function will return exactly one IRC message. This assumption is totally and completely invalid. The TCP implementation has no idea what an IRC message is. You have to implement the IRC protocol and the IRC protocol's definition of a message if you want to count how many messages you've received.
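As a minimal sketch of such framing (one IRC message is everything up to the next "\r\n"), reusing your s, buf, and MAXDATASIZE, and assuming your snd() helper accepts a std::string; answering PING is also mandatory, or the server will ping-time you out exactly as your log shows:
std::string inbuf;                        // bytes received but not yet framed
while (true) {
    num_bytes = recv(s, buf, MAXDATASIZE-1, 0);
    if (num_bytes <= 0) break;            // error or disconnect
    inbuf.append(buf, num_bytes);
    // Slice off complete messages; keep any trailing partial message buffered.
    size_t pos;
    while ((pos = inbuf.find("\r\n")) != std::string::npos) {
        std::string msg = inbuf.substr(0, pos);
        inbuf.erase(0, pos + 2);
        if (msg.compare(0, 5, "PING ") == 0)   // required keepalive reply
            snd("PONG " + msg.substr(5) + "\r\n");
        std::cout << msg << "\n";
    }
}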

Multiple threads writing to same socket causing issues

I have written a client/server application where the server spawns multiple threads depending upon the request from the client.
These threads are expected to send some data (strings) to the client.
The problem is that the data gets overwritten on the client side. How do I tackle this issue?
I have already read some other threads on similar issues but was unable to find an exact solution.
Here is my client code to receive data.
while(1)
{
    char buff[MAX_BUFF];
    int bytes_read = read(sd, buff, MAX_BUFF);
    if(bytes_read == 0)
    {
        break;
    }
    else if(bytes_read > 0)
    {
        if(buff[bytes_read-1] == '$')
        {
            buff[bytes_read-1] = '\0';
            cout << buff;
        }
        else
        {
            cout << buff;
        }
    }
}
Server thread code:
void send_data(int sd, char *data)
{
    write(sd, data, strlen(data));
    cout << data;
}
void *calcWordCount(void *arg)
{
    tdata *tmp = (tdata *)arg;
    string line = tmp->line;
    string s = tmp->arg;
    int sd = tmp->sd_c;
    int line_no = tmp->line_no;
    int startpos = 0;
    int finds = 0;
    while ((startpos = line.find(s, startpos)) != std::string::npos)
    {
        ++finds;
        startpos += 1;
        pthread_mutex_lock(&myMux);
        tcount++;
        pthread_mutex_unlock(&myMux);
    }
    pthread_mutex_lock(&mapMux);
    int t = wcount[s];
    wcount[s] = t + finds;
    pthread_mutex_unlock(&mapMux);
    char buff[MAX_BUFF];
    sprintf(buff, "%s", s.c_str());
    sprintf(buff + strlen(buff), "%s", " occured ");
    sprintf(buff + strlen(buff), "%d", finds);
    sprintf(buff + strlen(buff), "%s", " times on line ");
    sprintf(buff + strlen(buff), "%d", line_no);
    sprintf(buff + strlen(buff), "\n", strlen("\n"));
    send_data(sd, buff);
    delete (tdata *)arg;
}
On the server side, make sure the shared resource (the socket, along with its associated internal buffer) is protected against concurrent access.
Define and implement an application-level protocol so that the client can tell apart what the different threads sent.
As an additional note: one cannot rely on read()/write() reading/writing as many bytes as they were told to. It is essential to check their return value to learn how many bytes were actually read or written, and to loop until all the data intended to be read or written actually has been.
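A minimal sketch combining those points, in the style of your server code. sock_mux is a new, hypothetical mutex of mine, and the record passed in is expected to end with the '$' delimiter your client already looks for:
pthread_mutex_t sock_mux = PTHREAD_MUTEX_INITIALIZER;

// Write one whole record, atomically with respect to the other threads,
// looping because write() may accept fewer bytes than requested.
void send_record(int sd, const char *data)
{
    size_t len = strlen(data);
    pthread_mutex_lock(&sock_mux);
    size_t off = 0;
    while (off < len)
    {
        ssize_t n = write(sd, data + off, len - off);
        if (n < 0)
            break;                        // real code should inspect errno here
        off += (size_t)n;
    }
    pthread_mutex_unlock(&sock_mux);
}
The client side then splits its input on '$' instead of assuming one read() returns exactly one message.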
You should protect the socket with a mutex: when a thread uses the socket, it should lock the mutex first.
Some mutex example.
I can't help you more without the server code, because the problem is probably in the server.

Is poll() an edge triggered function?

I am responsible for a server that exports data over a TCP connection. With each data record that the server transmits, it requires the client to send a short "\n" acknowledgement message back. I have a customer who claims that the acknowledgement he sends is not read by the server. The following is the code that I am using for I/O on the socket:
bool can_send = true;
char tx_buff[1024];
char rx_buff[1024];
struct pollfd poll_descriptor;
int rcd;
poll_descriptor.fd = socket_handle;
poll_descriptor.events = POLLIN | POLLOUT;
poll_descriptor.revents = 0;
while(!should_quit && is_connected)
{
    // if we know that data can be written, we need to do this before we poll the OS for
    // events. This will prevent the 100 msec latency that would otherwise occur
    fill_write_buffer(write_buffer);
    while(can_send && !should_quit && !write_buffer.empty())
    {
        uint4 tx_len = write_buffer.copy(tx_buff, sizeof(tx_buff));
        rcd = ::send(
            socket_handle,
            tx_buff,
            tx_len,
            0);
        if(rcd == -1 && errno != EINTR)
            throw SocketException("socket write failure");
        write_buffer.pop(rcd);
        if(rcd > 0)
            on_low_level_write(tx_buff, rcd);
        if(rcd < tx_len)
            can_send = false;
    }
    // we will use poll for up to 100 msec to determine whether the socket can be read or
    // written
    if(!can_send)
        poll_descriptor.events = POLLIN | POLLOUT;
    else
        poll_descriptor.events = POLLIN;
    poll(&poll_descriptor, 1, 100);
    // check to see if an error has occurred
    if((poll_descriptor.revents & POLLERR) != 0 ||
       (poll_descriptor.revents & POLLHUP) != 0 ||
       (poll_descriptor.revents & POLLNVAL) != 0)
        throw SocketException("socket hung up or socket error");
    // check to see if anything can be written
    if((poll_descriptor.revents & POLLOUT) != 0)
        can_send = true;
    // check to see if anything can be read
    if((poll_descriptor.revents & POLLIN) != 0)
    {
        ssize_t bytes_read;
        ssize_t total_bytes_read = 0;
        int bytes_remaining = 0;
        do
        {
            bytes_read = ::recv(
                socket_handle,
                rx_buff,
                sizeof(rx_buff),
                0);
            if(bytes_read > 0)
            {
                total_bytes_read += bytes_read;
                on_low_level_read(rx_buff, bytes_read);
            }
            else if(bytes_read == -1)
                throw SocketException("read failure");
            ioctl(
                socket_handle,
                FIONREAD,
                &bytes_remaining);
        }
        while(bytes_remaining != 0);
        // recv() will return 0 if the socket has been closed
        if(total_bytes_read > 0)
            read_event::cpost(this);
        else
        {
            is_connected = false;
            closed_event::cpost(this);
        }
    }
}
I have written this code based upon the assumption that poll() is a level-triggered function and will unblock immediately as long as there is data to be read from the socket. Everything that I have read seems to back up this assumption. Is there something I may have missed that would cause the above code to miss a read event?
It is not edge triggered. It is always level triggered. I will have to read your code to answer your actual question though. But that answers the question in the title. :-)
I can see no clear reason in your code why you might be seeing the behavior you are seeing. But the scope of your question is a lot larger than the code you're presenting, and I cannot pretend that this is a complete problem diagnosis.
It is level triggered. POLLIN fires if there is data in the socket receive buffer when you poll, and POLLOUT fires if there is room in the socket send buffer (which there almost always is).
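To illustrate the level-triggered behaviour with a toy loop (my own sketch, not your code): if bytes are left unread in the receive buffer, every subsequent poll() call reports POLLIN again immediately:
struct pollfd pfd;
pfd.fd = socket_handle;
pfd.events = POLLIN;
// Level-triggered: poll() keeps returning POLLIN as long as unread
// bytes remain buffered, even if no new data arrives on the wire.
while (poll(&pfd, 1, 100) > 0 && (pfd.revents & POLLIN)) {
    char tmp[256];
    ssize_t n = recv(socket_handle, tmp, sizeof(tmp), 0);
    if (n <= 0) break;                    // error or peer closed
}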
Based on your own assessment of the problem (that is, you are blocked on poll() when you expect to be able to read the acknowledgement), you will eventually get a timeout.
If the customer's machine is more than 50 ms away from your server, then you will always time out before receiving the acknowledgement, since you only wait 100 ms: it takes a minimum of 50 ms for the data to reach the customer, and a minimum of 50 ms for the acknowledgement to return.

Flush queued GPIB responses

Architecture: a GPIB external interface is connected to the target (Linux) system via the GPIB bus. Inside the Linux box, an Ethernet cable runs from the GPIB card to the motherboard. The PIC_GPIB card on the external interface is IEEE 488.2.
I am sending a query from the external interface to the Linux box.
A few scenarios:
1) If I send a query which does not expect a response back, then the next query sent will work.
2) If I send a query which expects a response back, and I receive and read the response before firing the next query, it works fine.
3) But if I send a query from the external interface, get a response back, and neglect to read the response, then the next query fails.
I am requesting help for scenario 3.
The coding is done on the Linux side using socket programming, with the built-in read and write functions from unistd.h.
My investigation: I have found there is internal memory on the GPIB card of the external interface which stores the previous response until we read it. Generally I use IEEE string utility software to write commands that go to the Linux box, and read the response via its read button.
Could someone please direct me how to clear the input buffer or memory which stores that value, so that writes from the external interface can continue without having to read the response first.
My code on the Linux side was developed in C++ using socket programming. I have used the built-in write and read functions to talk to the GPIB and to the JSON server.
Sample code is shown below:
bool GpibClass::ReadWriteFromGPIB()
{
    bool check = true;
    int n = 0;
    char buffer[BUFFER_SIZE];
    fd_set read_set;
    struct timeval lTimeOut;
    // Reset the read mask for the select
    FD_ZERO(&read_set);
    FD_SET(mGpibFd, &read_set);
    FD_SET(mdiffFd, &read_set);
    // Set Timeout to check the status of the connection
    // when no data is being received
    lTimeOut.tv_sec = CONNECTION_STATUS_CHECK_TIMEOUT_SECONDS;
    lTimeOut.tv_usec = 0;
    cout << "Entered into this function" << endl;
    // Look for sockets with available data
    if (-1 == select(FD_SETSIZE, &read_set, NULL, NULL, &lTimeOut))
    {
        cout << "Select failed" << endl;
        // We don't know the cause of select's failure.
        // Close everything and start from scratch:
        CloseConnection(mGpibFd);
        CloseConnection(mdifferntServer); // this is a different server
        check = false;
    }
    // Check if data is available from GPIB server,
    // and if any read and push it to gpib
    if(true == check)
    {
        cout << "Check data from GPIB after select" << endl;
        if (FD_ISSET(mGpibFd, &read_set))
        {
            n = read(mGpibFd, buffer, BUFFER_SIZE);
            cout << "Read from GPIB" << n << " bytes" << endl;
            if(0 < n)
            {
                // write it to different server and check if we get response from it
            }
            else
            {
                // Something failed on socket read - most likely
                // connection dropped. Close socket and retry later
                CloseConnection(mGpibFd);
                check = false;
            }
        }
    }
    // Check if data is available from different server,
    // and if any read and push it to gpib
    if(true == check)
    {
        cout << "Check data from diff server after select" << endl;
        if (FD_ISSET(mdiffFd, &read_set))
        {
            n = read(mdiffFd, buffer, BUFFER_SIZE);
            cout << "Read from diff servewr " << n << " bytes" << endl;
            if (0 < n)
            {
                // Append, just in case - makes sure data is sent.
                // Extra cr/lf shouldn't cause any problem if the json
                // server has already added them
                strcpy(buffer + n, "\r\n");
                write(mGpibFd, buffer, n + 2);
                std::cout << " the buffer sixze = " << buffer << std::endl;
            }
            else
            {
                // Something failed on socket read - most likely
                // connection dropped. Close socket and retry later
                CloseConnection(mdiffFd);
                check = false;
            }
        }
    }
    return check;
}
You should ordinarily be reading responses after any operation which could generate them.
If you fail to do that, an easy solution is to read responses in a loop until you have drained the queue.
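A minimal sketch of such a drain loop, in the style of your function; DrainResponses is a hypothetical member name of mine, and BUFFER_SIZE is taken from your snippet. A zero select() timeout makes it return immediately once nothing is queued:
// Discard any queued, unread responses before issuing the next query.
void GpibClass::DrainResponses(int fd)
{
    char buffer[BUFFER_SIZE];
    for (;;)
    {
        fd_set rs;
        struct timeval tv = { 0, 0 };     // poll only, never block
        FD_ZERO(&rs);
        FD_SET(fd, &rs);
        if (select(fd + 1, &rs, NULL, NULL, &tv) <= 0)
            break;                        // queue empty (or select error)
        if (read(fd, buffer, BUFFER_SIZE) <= 0)
            break;                        // connection closed or error
    }
}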
You can reset the instrument (probably *RST), but you would probably lose other state as well. You will have to check its documentation to see if there is a command to reset only the response queue. Checking the documentation is always a good idea, because the number of instruments which precisely comply with the spec is dwarfed by the number which augment or omit parts in unique ways.