UDP select timeout issues: either timing out or reading from all clients - C++

I am using select() to handle connections on a UDP server. If I do not get a packet for some period, I would like to time out. The problem is that I can apparently either time out correctly and only read from one client, or read from all clients and never time out.
The difference between the two behaviors comes down to the first argument to select, the int nfds.
Here is my code:
int TIMEOUT = 5;

for (;;) {
    FD_ZERO(&read_handles);
    FD_SET(udpFD, &read_handles);

    timeout.tv_sec = TIMEOUT;
    timeout.tv_usec = 0;

    if (select(udpFD+1, &read_handles, NULL, NULL, &timeout) == 0) {
        printf("Select has timed out...\n");
        return 1;
    } else {
        int length = 1;
        if (FD_ISSET(udpFD, &read_handles)) {
            //process read.
        }
    }
}
This version does not time out. If I change the select line to:
if(select(udpFD, &read_handles, NULL, NULL, &timeout) == 0)
It does time out, but it only receives data from one of my clients.
udpFD is the only handle I am looking at, but it has a value of 4 because it is not the first descriptor I have created. I do not know if that makes a difference, since it is still the highest-valued descriptor.
How can I both timeout and get data from both of my clients?

Using if(select(udpFD+1, &read_handles, NULL, NULL, &timeout) == 0) is the correct way to go, and it does work.
My error was later in the code: I was not resetting a length field after reading, so I was getting stuck in the recvfrom loop and only calling select once.
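
For anyone hitting the same trap, here is a minimal sketch of the pattern that avoids it (the socket setup is omitted, and the buffer size and 5-second timeout are my own choices, not the poster's): do exactly one recvfrom() per select() wakeup, and re-initialize the set, the timeout, and the address length on every iteration, so control always returns to select() and the timeout stays in effect.

#include <stdio.h>
#include <sys/types.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <netinet/in.h>

int serve(int udpFD) {
    char buf[1500];

    for (;;) {
        fd_set read_handles;
        struct timeval timeout;

        // select() may modify both the set and (on Linux) the timeout,
        // so both must be re-initialized on every iteration.
        FD_ZERO(&read_handles);
        FD_SET(udpFD, &read_handles);
        timeout.tv_sec = 5;
        timeout.tv_usec = 0;

        if (select(udpFD + 1, &read_handles, NULL, NULL, &timeout) == 0) {
            printf("Select has timed out...\n");
            return 1;
        }

        if (FD_ISSET(udpFD, &read_handles)) {
            struct sockaddr_in client;
            socklen_t client_len = sizeof(client);  // reset before every call

            // One datagram per wakeup; recvfrom() will not block here
            // because select() reported the socket as readable.
            ssize_t n = recvfrom(udpFD, buf, sizeof(buf), 0,
                                 (struct sockaddr *)&client, &client_len);
            if (n >= 0) {
                // process the datagram from whichever client sent it
            }
        }
    }
}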

Is poll() an edge triggered function?

I am responsible for a server that exports data over a TCP connection. With each data record that the server transmits, it requires the client to send a short "\n" acknowledgement message back. I have a customer who claims that the acknowledgement he sends is never read by the server. The following is the code I am using for I/O on the socket:
bool can_send = true;
char tx_buff[1024];
char rx_buff[1024];
struct pollfd poll_descriptor;
int rcd;

poll_descriptor.fd = socket_handle;
poll_descriptor.events = POLLIN | POLLOUT;
poll_descriptor.revents = 0;

while(!should_quit && is_connected)
{
    // if we know that data can be written, we need to do this before we poll the OS for
    // events. This will prevent the 100 msec latency that would otherwise occur
    fill_write_buffer(write_buffer);
    while(can_send && !should_quit && !write_buffer.empty())
    {
        uint4 tx_len = write_buffer.copy(tx_buff, sizeof(tx_buff));
        rcd = ::send(
            socket_handle,
            tx_buff,
            tx_len,
            0);
        if(rcd == -1 && errno != EINTR)
            throw SocketException("socket write failure");
        write_buffer.pop(rcd);
        if(rcd > 0)
            on_low_level_write(tx_buff, rcd);
        if(rcd < tx_len)
            can_send = false;
    }

    // we will use poll for up to 100 msec to determine whether the socket can be read or
    // written
    if(!can_send)
        poll_descriptor.events = POLLIN | POLLOUT;
    else
        poll_descriptor.events = POLLIN;
    poll(&poll_descriptor, 1, 100);

    // check to see if an error has occurred
    if((poll_descriptor.revents & POLLERR) != 0 ||
       (poll_descriptor.revents & POLLHUP) != 0 ||
       (poll_descriptor.revents & POLLNVAL) != 0)
        throw SocketException("socket hung up or socket error");

    // check to see if anything can be written
    if((poll_descriptor.revents & POLLOUT) != 0)
        can_send = true;

    // check to see if anything can be read
    if((poll_descriptor.revents & POLLIN) != 0)
    {
        ssize_t bytes_read;
        ssize_t total_bytes_read = 0;
        int bytes_remaining = 0;
        do
        {
            bytes_read = ::recv(
                socket_handle,
                rx_buff,
                sizeof(rx_buff),
                0);
            if(bytes_read > 0)
            {
                total_bytes_read += bytes_read;
                on_low_level_read(rx_buff, bytes_read);
            }
            else if(bytes_read == -1)
                throw SocketException("read failure");
            ioctl(
                socket_handle,
                FIONREAD,
                &bytes_remaining);
        }
        while(bytes_remaining != 0);

        // recv() will return 0 if the socket has been closed
        if(total_bytes_read > 0)
            read_event::cpost(this);
        else
        {
            is_connected = false;
            closed_event::cpost(this);
        }
    }
}
I have written this code based upon the assumption that poll() is a level triggered function and will unblock immediately as long as there is data to be read from the socket. Everything that I have read seems to back up this assumption. Is there a reason that I may have missed that would cause the above code to miss a read event?
It is not edge-triggered. It is always level-triggered. I will have to read your code to answer your actual question, though. But that answers the question in the title. :-)
I can see no clear reason in your code why you might be seeing the behavior you are seeing. But the scope of your question is a lot larger than the code you're presenting, and I cannot pretend that this is a complete problem diagnosis.
It is level triggered. POLLIN fires if there is data in the socket receive buffer when you poll, and POLLOUT fires if there is room in the socket send buffer (which there almost always is).
Going by your own assessment of the problem (that you are blocked in poll() when you expect to be able to read the acknowledgement), you will eventually get a timeout.
If the customer's machine is more than 50 ms away from your server, you will always time out on the connection before receiving the acknowledgement, since you only wait 100 ms: it takes a minimum of 50 ms for the data to reach the customer, and a minimum of 50 ms for the acknowledgement to come back.
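
To make the level-triggered claim concrete, here is a small self-contained test of my own (not the poster's code): it writes one byte into a pipe, then polls the read end twice without consuming the byte. A level-triggered interface reports POLLIN both times; an edge-triggered interface would report it only once.

#include <poll.h>
#include <stdio.h>
#include <unistd.h>

int main()
{
    int fds[2];
    pipe(fds);
    write(fds[1], "x", 1);   // data is now pending on fds[0]

    struct pollfd pd = { .fd = fds[0], .events = POLLIN };

    // Poll twice without reading. Because poll() reports level,
    // not edges, POLLIN is set on both iterations.
    for (int i = 0; i < 2; ++i)
    {
        pd.revents = 0;
        poll(&pd, 1, 100);
        printf("poll %d: POLLIN %s\n", i + 1,
               (pd.revents & POLLIN) ? "set" : "not set");
    }
    return 0;
}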

Why does select only show file descriptors as ready if data is already being sent?

I'm using select() in a thread to monitor a datagram socket, but unless data is being sent to the socket before the thread starts, select() will continue to return 0.
I'm mixing a little C and C++; here's the method that starts the thread:
bool RelayStart() {
    sock_recv = socket(AF_INET, SOCK_DGRAM, 0);

    memset(&addr_recv, 0, sizeof(addr_recv));
    addr_recv.sin_family = AF_INET;
    addr_recv.sin_port = htons(18902);
    addr_recv.sin_addr.s_addr = htonl(INADDR_ANY);
    bind(sock_recv, (struct sockaddr*) &addr_recv, sizeof(addr_recv));

    isRelayingPackets = true;
    NSS::Thread::start(VIDEO_SEND_THREAD_ID);
    return true;
}
The method that stops the thread:
bool RelayStop() {
    isSendingVideo = false;
    NSS::Thread::stop();
    close(sock_recv);
    return true;
}
And the method run in the thread:
void Run() {
    fd_set read_fds;
    int select_return;
    struct timeval select_timeout;

    FD_ZERO(&read_fds);
    FD_SET(sock_recv, &read_fds);

    while (isRelayingPackets) {
        select_timeout.tv_sec = 1;
        select_timeout.tv_usec = 0;
        select_return = select(sock_recv + 1, &read_fds, NULL, NULL, &select_timeout);
        if (select_return > 0 && FD_ISSET(sock_recv, &read_fds)) {
            // ...
        }
    }
}
The problem is that if there isn't a process already sending UDP packets to port 18902 before RelayStart() is called, select() will always return 0. So, for example, I can't restart the sender without also restarting the thread (in the correct order).
Everything seems to work fine as long as the sender is started first.
The Run thread only constructs read_fds once.
The select call updates read_fds: it clears the bits for all descriptors that did not have data ready, and leaves set only the bits for descriptors that were in the set and do have data ready.
Hence, if no descriptor has any data ready and the select call times out (and returns 0), all the bits in read_fds are now cleared. Further calls passing the same all-zero bit-mask will scan no file descriptors.
You can either re-construct the read set on each trip around the loop:

while (isRelayingPackets) {
    FD_ZERO(&read_fds);
    FD_SET(sock_recv, &read_fds);
    ...
}
or use an auxiliary variable holding a copy of the bit-set:

while (isRelayingPackets) {
    fd_set select_arg = read_fds;
    ... same as before but use &select_arg ...
}
(Or, of course, there are non-select interfaces that are easier to use in some ways.)
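
Folding the first fix back into the poster's Run() gives roughly this sketch (same assumed sock_recv and isRelayingPackets members as in the question):

void Run() {
    fd_set read_fds;
    struct timeval select_timeout;

    while (isRelayingPackets) {
        // Rebuild the set every iteration: select() overwrites it,
        // and a timed-out call leaves it empty.
        FD_ZERO(&read_fds);
        FD_SET(sock_recv, &read_fds);

        // Linux also modifies the timeout, so reset it as well.
        select_timeout.tv_sec = 1;
        select_timeout.tv_usec = 0;

        int select_return = select(sock_recv + 1, &read_fds, NULL, NULL,
                                   &select_timeout);
        if (select_return > 0 && FD_ISSET(sock_recv, &read_fds)) {
            // read and relay the datagram...
        }
    }
}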
How were you expecting it to behave? The point of select() is to sleep, up to the timeout, until data are available to be read; in this case, it will time out after 1 second and return 0. Perhaps you don't actually want a timeout before the start of a stream?

select() blocks instead of returning a timeout

As background, I'm writing a multithreaded Linux server application. Each child process has a connection associated with it and uses select() to see if there is data waiting to be read on the socket.
I've done some searching and for once I couldn't find any help to this problem.
First time actually posting to Stack Overflow, so I apologize if my formatting is crap.
//this first line switches my connection to non-blocking.
//select() still fails whether or not this line is in.
fcntl(ChildConnection -> newsockfd, F_SETFL, 0);

struct timeval tv;
fd_set readfds;
FD_ZERO(&readfds);
FD_SET(ChildConnection -> newsockfd, &readfds);
tv.tv_sec = 3; //3 seconds of waiting maximum. Changing this does nothing.
tv.tv_usec = 0;

printf("-DEBUG: Child, About to select() the newsockfd, which is %i. readfds is %i.\n", ChildConnection -> newsockfd, readfds);

//if I feed this a bad descriptor (-1 or something) on purpose, it DOES return -1 though.
int result = select(ChildConnection -> newsockfd + 1, &readfds, NULL, NULL, &tv);

//this commented out line below doesn't even time out.
//int result = select(0, NULL, NULL, NULL, &tv);

printf("-DEBUG: Child, Just select()ed. result is %i. Hopefully that was >= 0.", result);

if (result < 0)
{
    DisplayError("ERROR using select() on read connection in MotherShip::HandleMessagesChild: ");
}
else if (result > 0) // > 0 means there is data waiting to be read
{
    /* <--- Snipped Reading Stuff here ---> */
}

//so if the code gets here without a result that means it timed out.
Unfortunately, the second print line (saying it has selected) is never printed. Does anyone know what's going on or have advice for me to try and debug this?
You have a blocking condition somewhere else. Get your select() code working in a small test rig first and then transplant it. Your comment in the code that "this commented out line below doesn't even time out" is verifiably incorrect:
$ cat test.c
#include <stdio.h>
#include <sys/select.h>

int main()
{
    struct timeval tv;
    tv.tv_sec = 3;
    tv.tv_usec = 0;
    select(0, NULL, NULL, NULL, &tv);
    return 0;
}
$ gcc -o test test.c
$ time ./test

real    0m3.004s
user    0m0.000s
sys     0m0.000s
Alternatively, try attaching a debugger to your hanging process and see where it is blocked. Or watch it under strace, etc.
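For example, with gdb (the process name here is hypothetical):

$ gdb -p $(pidof myserver)
(gdb) info threads   # list every thread in the process
(gdb) bt             # backtrace showing where the current thread is blocked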

C++ select stops accepting connections

I'm trying to make a select-server in order to receive connection from several clients (all clients will connect to the same port).
The server accepts the first 2 clients, but unless one of them disconnects, it will not accept a new one.
I'm starting to listen on the server port like this:
listen(m_socketId, SOMAXCONN);
and using the select command like this:
int selected = select(m_maxSocketId + 1, &m_socketReadSet, NULL, NULL, 0);
I've added some code.
bool TcpServer::Start(char* ipAddress, int port)
{
    m_active = true;

    FD_ZERO(&m_socketMasterSet);

    bool listening = m_socket->Listen(ipAddress, port); // Start listening.

    m_maxSocketId = m_socket->GetId();
    FD_SET(m_maxSocketId, &m_socketMasterSet);

    if (listening == true)
    {
        StartThread(&InvokeListening);
        StartReceiving();
        return true;
    }
    else
    {
        return false;
    }
}

void TcpServer::Listen()
{
    while (m_active == true)
    {
        m_socketReadSet = m_socketMasterSet;

        int selected = select(m_maxSocketId + 1, &m_socketReadSet, NULL, NULL, 0);
        if (selected <= 0)
            continue;

        bool accepted = Accept();
        if (accepted == false)
        {
            ReceiveFromSockets();
        }
    }
}

bool TcpServer::Accept()
{
    int listenerId = m_socket->GetId();
    if (FD_ISSET(listenerId, &m_socketReadSet) == true)
    {
        struct sockaddr_in remoteAddr;
        int addrSize = sizeof(remoteAddr);

        unsigned int newSockId = accept(listenerId, (struct sockaddr *)&remoteAddr, &addrSize);
        if (newSockId == -1) // Invalid socket...
        {
            return false;
        }

        if (newSockId > m_maxSocketId)
        {
            m_maxSocketId = newSockId;
        }

        m_clientUniqueId++;

        // Remembering the new socket, so we'll be able to check its state
        // the next time.
        FD_SET(newSockId, &m_socketMasterSet);

        CommEndPoint remote(remoteAddr);
        CommEndPoint local = m_socket->GetLocalPoint();

        ClientId* client = new ClientId(m_clientUniqueId, newSockId, local, remote);
        m_clients.Add(client);

        StoreNewlyAcceptedClient(client);

        char acceptedMsg = CommInternalServerMsg::ConnectionAccepted;
        Server::Send(CommMessageType::Internal, client, &acceptedMsg, sizeof(acceptedMsg));

        return true;
    }
    return false;
}
I hope it's enough. :)
What's wrong with it?
By far the most common error with select() is not re-initializing the fd sets on every iteration. The second, third, and fourth arguments are updated by the call, so you have to populate them again every time.
Post more code, so people can actually help you.
Edit 0:
fd_set on Windows is a mess :)
It's not allowed to copy-construct fd_set objects:
m_socketReadSet = m_socketMasterSet;
This combined with Nikolai's correct statement that select changes the set passed in probably accounts for your error.
poll (On Windows, WSAPoll) is a much friendlier API.
Windows also provides WSAEventSelect and (Msg)WaitForMultipleObjects(Ex), which have no direct equivalent on Unix, but allow you to wait on sockets, files, thread synchronization events, timers, and UI messages at the same time.
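
For the Unix side, here is a sketch of the rebuild-every-iteration pattern Nikolai describes, applied to a loop like TcpServer::Listen(). The std::vector<int> of client sockets is hypothetical (the question's m_clients container is opaque), so adapt the iteration to your own bookkeeping:

void TcpServer::Listen()
{
    while (m_active)
    {
        // Rebuild the read set from scratch each iteration:
        // select() overwrites it on return.
        fd_set readSet;
        FD_ZERO(&readSet);

        int listenerId = m_socket->GetId();
        FD_SET(listenerId, &readSet);
        int maxId = listenerId;

        for (int sockId : m_clientSockets)  // hypothetical std::vector<int>
        {
            FD_SET(sockId, &readSet);
            if (sockId > maxId)
                maxId = sockId;
        }

        int selected = select(maxId + 1, &readSet, NULL, NULL, NULL);
        if (selected <= 0)
            continue;

        if (FD_ISSET(listenerId, &readSet))
            Accept();               // a new connection is pending
        else
            ReceiveFromSockets();   // otherwise service existing clients
    }
}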

select() behaviour for writeability?

I have an fd_set "write_set" which contains sockets that I want to use in a send(...) call. When I call select(maxsockfd+1, NULL, &write_set, NULL, &tv), it always returns 0 (timeout), although I haven't sent anything over the sockets in write_set yet and it should be possible to send data.
Why is this? Shouldn't select return instantly when it's possible to send data over the sockets in write_set?
Thanks!
Edit: My code:
// _read_set and _write_set are the master sets
fd_set read_set = _read_set;
fd_set write_set = _write_set;

// added this for testing; the socket is a member of RemoteChannelConnector.
std::list<RemoteChannelConnector*>::iterator iter;
for (iter = _acceptingConnectorList->begin(); iter != _acceptingConnectorList->end(); iter++) {
    if (FD_ISSET((*iter)->getSocket(), &write_set)) {
        char* buf = "a";
        int ret;
        if ((ret = send((*iter)->getSocket(), buf, 1, NULL)) == -1) {
            std::cout << "error." << std::endl;
        } else {
            std::cout << "success." << std::endl;
        }
    }
}

struct timeval tv;
tv.tv_sec = 10;
tv.tv_usec = 0;

int status;
if ((status = select(_maxsockfd, &read_set, &write_set, NULL, &tv)) == -1) {
    // Terminate process on error.
    exit(1);
} else if (status == 0) {
    // Terminate process on timeout.
    exit(1);
} else {
    // call send/receive
}
When I run it with the test code that checks whether my socket is actually in the write_set and whether it is possible to send data over the socket, I get a "success"...
I don't believe that you're allowed to copy-construct fd_set objects. The only guaranteed way is to completely rebuild the set using FD_SET before each call to select. Also, you're writing to the list of sockets to be selected on, before ever calling select. That doesn't make sense.
Can you use poll instead? It's a much friendlier API.
Your code is very confused. First, you don't seem to be setting any of the bits in the fd_set. Secondly, you test the bits before you even call select.
Here is how the flow generally works:

1. Use FD_ZERO to zero out your set.
2. Go through, and for each file descriptor you're interested in the writeable state of, use FD_SET to set it.
3. Call select, passing it the address of the fd_set you've been calling FD_SET on for the write set, and observe the return value.
4. If the return value is > 0, go through the write set and use FD_ISSET to figure out which ones are still set. Those are the ones that are writeable.
Your code does not at all appear to be following this pattern. Also, the important task of setting up the master set isn't being shown.
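
A minimal sketch of that four-step flow, using a hypothetical vector of connected sockets (nothing here comes from the poster's code):

#include <sys/select.h>
#include <vector>

// Returns the sockets that select() reports as writeable.
std::vector<int> writeable_sockets(const std::vector<int>& sockets)
{
    fd_set write_set;
    FD_ZERO(&write_set);                    // step 1: clear the set

    int maxfd = -1;
    for (int fd : sockets) {                // step 2: add each descriptor
        FD_SET(fd, &write_set);
        if (fd > maxfd)
            maxfd = fd;
    }

    struct timeval tv;
    tv.tv_sec = 10;
    tv.tv_usec = 0;

    std::vector<int> ready;
    // step 3: select on the write set; note the maxfd + 1
    if (select(maxfd + 1, NULL, &write_set, NULL, &tv) > 0) {
        for (int fd : sockets)              // step 4: see which bits survived
            if (FD_ISSET(fd, &write_set))
                ready.push_back(fd);
    }
    return ready;
}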