As background, I'm writing a multithreaded Linux server application. Each child process has a connection associated with it and uses select() to see whether there is data waiting to be read on the socket.
I've done some searching and for once I couldn't find any help with this problem.
First time actually posting to Stack Overflow, so I apologize if my formatting is crap.
//this first line was meant to switch the connection to non-blocking, but
//F_SETFL with 0 actually clears O_NONBLOCK, leaving the socket in blocking mode.
//select() still fails whether or not this line is present.
fcntl(ChildConnection -> newsockfd, F_SETFL, 0);
struct timeval tv;
fd_set readfds;
FD_ZERO(&readfds);
FD_SET(ChildConnection -> newsockfd, &readfds);
tv.tv_sec = 3; //3 seconds of waiting maximum. Changing this does nothing.
tv.tv_usec = 0;
printf("-DEBUG: Child, About to select() the newsockfd, which is %i. readfds is %i.\n", ChildConnection -> newsockfd, readfds);
//if I feed this a bad descriptor (-1 or something) on purpose, it DOES return -1 though.
int result = select(ChildConnection -> newsockfd + 1, &readfds, NULL, NULL, &tv);
//this commented out line below doesn't even time out.
//int result = select(0, NULL, NULL, NULL, &tv);
printf("-DEBUG: Child, Just select()ed. result is %i. Hopefully that was >= 0.", result);
if (result < 0)
{
DisplayError("ERROR using select() on read connection in MotherShip::HandleMessagesChild: ");
}
else if (result > 0) // > 0 means there is data waiting to be read
{
/* <--- Snipped Reading Stuff here ---> */
}
//so if the code gets here without a result that means it timed out.
Unfortunately, the second print line (saying it has just selected) is never printed. Does anyone know what's going on, or have any advice on how to debug this?
You have a blocking condition somewhere else. Get your select() code working in a small test rig first and then transplant it. Your comment in the code that "this commented out line below doesn't even time out" is verifiably incorrect:
$ cat test.c
#include <stdio.h>
#include <sys/select.h>
int main()
{
struct timeval tv;
tv.tv_sec = 3;
tv.tv_usec = 0;
select(0, NULL, NULL, NULL, &tv);
return 0;
}
$ gcc -o test test.c
$ time ./test
real 0m3.004s
user 0m0.000s
sys 0m0.000s
Alternatively, try attaching a debugger to your hanging process and see where it is blocked, or watch it under strace, etc...
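For the socket half, such a rig can stay almost as small. A sketch (my own illustration, not the asker's code): create a connected socket pair, write to one end, and check that select() reports the other end readable long before the timeout:
#include <stdio.h>
#include <unistd.h>
#include <sys/select.h>
#include <sys/socket.h>
int main(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1) {
        perror("socketpair");
        return 1;
    }
    write(sv[1], "x", 1);                 /* make sv[0] readable */
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(sv[0], &readfds);
    struct timeval tv = { 3, 0 };         /* 3-second timeout, as in the question */
    int result = select(sv[0] + 1, &readfds, NULL, NULL, &tv);
    printf("select returned %d (expected 1)\n", result);
    close(sv[0]);
    close(sv[1]);
    return 0;
}
If that rig behaves as expected on your machine, the problem is almost certainly in code outside the snippet you posted.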
I am using select() to handle connections on a UDP server. If I do not get a packet for some period, I would like to time out. The problem is, it seems I can either time out correctly and only read from one client, or read from all clients and never time out.
The difference in behavior comes down to the first argument to select(), the int nfds.
Here is my code:
int TIMEOUT = 5;
for (;;) {
FD_ZERO(&read_handles);
FD_SET(udpFD, &read_handles);
timeout.tv_sec = TIMEOUT;
timeout.tv_usec = 0;
if (select(udpFD+1, &read_handles, NULL, NULL, &timeout) == 0) {
printf("Select has timed out...\n");
return 1;
} else {
int length = 1;
if (FD_ISSET(udpFD, &read_handles)) {
//process read.
}
}
}
This version does not time out. If I change the select line to:
if(select(udpFD, &read_handles, NULL, NULL, &timeout) == 0)
It does timeout, but it only receives data from one of my clients.
udpFD is the only handle I am looking at, but it has a value of 4 because it is not the first descriptor I have made. I do not know if that makes a difference because it is the max value.
How can I both timeout and get data from both of my clients?
Using if(select(udpFD+1, &read_handles, NULL, NULL, &timeout) == 0) is the correct way to go, and it will work.
My error was later in the code: I was not resetting a length field I had read, so I was getting stuck in the recvfrom() loop and only calling select() once.
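For reference, a rough sketch of the shape that works (the buffer size and datagram handling are illustrative, not the code from the question): re-arm the set and the timeout on every iteration, and reset the address-length argument before every recvfrom() so the read path always returns to select():
#include <stdio.h>
#include <sys/types.h>
#include <sys/select.h>
#include <sys/socket.h>
void serve(int udpFD)                      /* udpFD: the bound UDP socket from the question */
{
    const int TIMEOUT = 5;
    fd_set read_handles;
    struct timeval timeout;
    char buf[1500];
    struct sockaddr_storage from;
    socklen_t fromlen;
    for (;;) {
        FD_ZERO(&read_handles);
        FD_SET(udpFD, &read_handles);
        timeout.tv_sec = TIMEOUT;          /* select() may modify the timeout, so reset it too */
        timeout.tv_usec = 0;
        if (select(udpFD + 1, &read_handles, NULL, NULL, &timeout) == 0) {
            printf("Select has timed out...\n");
            return;
        }
        if (FD_ISSET(udpFD, &read_handles)) {
            fromlen = sizeof(from);        /* reset the length before every call */
            ssize_t n = recvfrom(udpFD, buf, sizeof(buf), 0,
                                 (struct sockaddr *)&from, &fromlen);
            if (n > 0) {
                /* process exactly one datagram, then fall back to select() */
            }
        }
    }
}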
I have a problem where select() does not time out when I run the program inside a Bash script. This is my implementation:
#include <sys/select.h>
bool checkKeyPressed()
{
struct timeval tv;
tv.tv_sec = 1;
tv.tv_usec = 0;
fd_set descriptor;
const int input = 0;
FD_ZERO(&descriptor);
FD_SET(input, &descriptor);
return select(1, &descriptor, NULL, NULL, &tv) > 0;
}
// strace result after running the program directly (correct that there is a timeout)
select(1, [0], NULL, NULL, {1, 0}) = 0 (Timeout)
// strace result to run the application inside a bash script file (no timeout)
select(1, [0], NULL, NULL, {1, 0}) = 1 (in [0], left {0, 999996})
read(0, "", 1) = 0
How can I change the function so that it also works when run from the Bash script?
If you look closer at the read call in the trace, you will notice it returns zero, meaning end-of-file.
When a file descriptor is at EOF (or the remote end of a socket has been closed, etc.), the descriptor is readable, with read returning zero.
If you had pressed Ctrl+D in the interactive shell, you would have gotten the same result.
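One way to adapt the function, sketched under the assumption that you only want to report a real keypress: let read() decide, and treat a zero return (EOF, as in the script case) the same as a timeout. Note that this version consumes the byte it reads, and like the original it assumes a C++ build (add <stdbool.h> for C):
#include <unistd.h>
#include <sys/select.h>
bool checkKeyPressed()
{
    struct timeval tv;
    tv.tv_sec = 1;
    tv.tv_usec = 0;
    fd_set descriptor;
    const int input = 0;
    FD_ZERO(&descriptor);
    FD_SET(input, &descriptor);
    if (select(input + 1, &descriptor, NULL, NULL, &tv) <= 0)
        return false;                 // timed out or failed: no key
    char c;
    ssize_t n = read(input, &c, 1);   // n == 0 means EOF (e.g. stdin redirected by a script)
    return n > 0;
}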
If you just need a 1-second timeout don't pass any file descriptors to select(). In this case select() works as a portable sleep() function.
I'm using select() in a thread to monitor a datagram socket, but unless data is being sent to the socket before the thread starts, select() will continue to return 0.
I'm mixing a little C and C++; here's the method that starts the thread:
bool RelayStart() {
sock_recv = socket(AF_INET, SOCK_DGRAM, 0);
memset(&addr_recv, 0, sizeof(addr_recv));
addr_recv.sin_family = AF_INET;
addr_recv.sin_port = htons(18902);
addr_recv.sin_addr.s_addr = htonl(INADDR_ANY);
bind(sock_recv, (struct sockaddr*) &addr_recv, sizeof(addr_recv));
isRelayingPackets = true;
NSS::Thread::start(VIDEO_SEND_THREAD_ID);
return true;
}
The method that stops the thread:
bool RelayStop() {
isSendingVideo = false;
NSS::Thread::stop();
close(sock_recv);
return true;
}
And the method run in the thread:
void Run() {
fd_set read_fds;
int select_return;
struct timeval select_timeout;
FD_ZERO(&read_fds);
FD_SET(sock_recv, &read_fds);
while (isRelayingPackets) {
select_timeout.tv_sec = 1;
select_timeout.tv_usec = 0;
select_return = select(sock_recv + 1, &read_fds, NULL, NULL, &select_timeout);
if (select_return > 0 && FD_ISSET(sock_recv, &read_fds)) {
// ...
}
}
}
The problem is that if there isn't a process already sending UDP packets to port 18902 before RelayStart() is called, select() will always return 0. So, for example, I can't restart the sender without restarting the thread (in the correct order).
Everything seems to work fine as long as the sender is started first.
The Run thread only constructs read_fds once.
The select call modifies read_fds: on return, the bits for descriptors that did not have data ready are cleared, and only the bits for descriptors that were in the set and do have data ready remain set.
Hence, if no descriptor has any data ready and the select call times out (and returns 0), all the bits in read_fds are now cleared. Further calls passing the same all-zero bit mask will scan no file descriptors.
You can either re-construct the read-set on each trip inside the loop:
while (isRelayingPackets) {
FD_ZERO(&read_fds);
FD_SET(sock_recv, &read_fds);
...
}
or use an auxiliary variable with a copy of the bit-set:
while (isRelayingPackets) {
fd_set select_arg = read_fds;
... same as before but use &select_arg ...
}
(Or, of course, there are non-select interfaces that are easier to use in some ways.)
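Putting the first option together, Run() might look roughly like this (a sketch that keeps the member names sock_recv and isRelayingPackets from the question):
void Run() {
    fd_set read_fds;
    struct timeval select_timeout;
    while (isRelayingPackets) {
        // Re-arm the set and the timeout on every iteration,
        // because select() modifies both of them.
        FD_ZERO(&read_fds);
        FD_SET(sock_recv, &read_fds);
        select_timeout.tv_sec = 1;
        select_timeout.tv_usec = 0;
        int select_return = select(sock_recv + 1, &read_fds, NULL, NULL, &select_timeout);
        if (select_return > 0 && FD_ISSET(sock_recv, &read_fds)) {
            // handle the waiting datagram(s) on sock_recv
        }
    }
}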
How were you expecting it to behave? The point of select() is to sleep until data are available to be read or until the timeout expires; in this case, it will time out after 1 second and return 0. Perhaps you don't actually want a timeout before the start of a stream?
This is a followup to this question: How to wait for input from the serial port in the middle of a program
I am writing a program to control an Iridium modem that needs to wait for a response from the serial port in the middle of the program in order to verify that the correct response was given. To accomplish this, a user recommended I use select() to wait for this input.
However, I have run into some difficulty with this approach. Initially, select() returned the value indicating a timeout every time (even though the modem was sending back the correct responses, which I verified with another program running at the same time). Now, the program stops after one iteration, even with the correct response sent back from the modem.
//setting the file descriptor to the port
int fd = open(portName.c_str(), O_RDWR | O_NOCTTY | O_NDELAY);
if (fd == -1)
{
/*
* Could not open the port.
*/
perror("open_port: Unable to open /dev/ttyS0 - ");
}
else
fcntl(fd, F_SETFL, 0);
FILE *out = fopen(portName.c_str(), "w");//sets the serial port
FILE *in = fopen(portName.c_str(), "r");
fd_set fds;
FD_ZERO(&fds);
FD_SET(fd, &fds);
struct timeval timeout = { 10, 0 }; /* 10 seconds */
//int ret = select(fd+1, &fds, NULL, NULL, &timeout);
/* ret == 0 means timeout, ret == 1 means descriptor is ready for reading,
ret == -1 means error (check errno) */
char buf[100];
int i =0;
while(i<(sizeof(messageArray)/sizeof(messageArray[0])))
{
//creates a string with the AT command that writes to the module
std::string line1("AT+SBDWT=");
line1+=convertInt( messageArray[i].numChar);
line1+=" ";
line1+=convertInt(messageArray[i].packetNumber);
line1+=" ";
line1+=messageArray[i].data;
line1+=std::string("\r\n");
//creates a string with the AT command that initiates the SBD session
std::string line2("AT+SBDI");
line2+=std::string("\r\n");
fputs(line1.c_str(), out); //sends to serial port
//usleep(7000000);
int ret =select(fd+1, &fds, NULL, NULL, &timeout);
/* ret == 0 means timeout, ret == 1 means descriptor is ready for reading,
ret == -1 means error (check errno) */
if (ret ==1){
fgets (buf ,sizeof(buf), in);
//add code to check if response is correct
}
else if(ret == 0) {
perror("timeout error ");
}
else if (ret ==-1) {
perror("some other error");
}
fputs(line2.c_str(), out); //sends to serial port
//usleep(7000000); //Pauses between the addition of each packet.
int ret2 = select(fd+1, &fds, NULL, NULL, &timeout);
/* ret == 0 means timeout, ret == 1 means descriptor is ready for reading,
ret == -1 means error (check errno) */
if(ret2 == 0) {
perror("timeout error ");
}
else if (ret2 ==-1) {
perror("some other error");
}
i++;
}
You aren't using the same file handle for read/write/select, which is somewhat strange.
You are not resetting your fd_sets, which are modified by select and would have all of your fds unset in the case of a timeout, making the next call time out by default (as you are asking for no fds).
You are also using buffered I/O, which is bound to create headaches in this case. E.g. fgets waits for either EOF (which won't occur) or a newline, reading all the while. It will block until it gets its newline, so it may keep you hanging indefinitely if that never occurs.
It may also read more than it needs into the buffer, messing up your select read signal (you have data in the buffer, but select will time out, since there's nothing to read on the filehandle).
Bottom line is this (a rough sketch follows after the list):
use FD_SET in the loop to set/reset your fd sets, also reset your timeout, as select may modify it.
use a single handle for read/write/select instead of multiple handles, e.g. open the file with fopen(..., "w+") or open(..., O_RDWR)
if still using fopen, try disabling buffering using setvbuf with the _IONBF buffering option.
otherwise, use open/read/write instead of fopen etc.
I will note that part of this was mentioned in this answer to your previous question.
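As a rough sketch of what that could look like with a single descriptor and no stdio buffering, here is one possible shape. sendCommandAndWait is a hypothetical helper name of mine, and the response checking is only indicated by a comment:
#include <cstdio>
#include <string>
#include <unistd.h>
#include <fcntl.h>
#include <sys/select.h>
// Write one AT command on fd, then wait up to 10 seconds for the modem to answer.
static bool sendCommandAndWait(int fd, const std::string &cmd)
{
    write(fd, cmd.c_str(), cmd.size());      // raw write: no stdio buffer to flush
    fd_set fds;
    FD_ZERO(&fds);                           // re-arm the set on every call
    FD_SET(fd, &fds);
    struct timeval timeout = { 10, 0 };      // re-arm the timeout too; select() may modify it
    int ret = select(fd + 1, &fds, NULL, NULL, &timeout);
    if (ret == 1) {
        char buf[100];
        ssize_t n = read(fd, buf, sizeof(buf) - 1);
        if (n > 0) {
            buf[n] = '\0';
            // check here whether buf holds the expected modem response
            return true;
        }
    } else if (ret == 0) {
        fprintf(stderr, "timeout waiting for modem response\n");
    } else {
        perror("select");
    }
    return false;
}
// Usage with the strings from the question, one descriptor for everything:
//   int fd = open(portName.c_str(), O_RDWR | O_NOCTTY);
//   sendCommandAndWait(fd, line1);   // AT+SBDWT=...
//   sendCommandAndWait(fd, line2);   // AT+SBDI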
You should perhaps use fflush() on your output file stream.
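If you do keep the FILE* streams, that means flushing right after each command is queued, for example:
fputs(line1.c_str(), out);   // queued in the stdio buffer...
fflush(out);                 // ...and actually pushed out to the serial port now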
I have an fd_set "write_set" which contains sockets that I want to use in a send(...) call. When I call select(maxsockfd+1, NULL, &write_set, NULL, &tv), it always returns 0 (timeout), although I haven't sent anything over the sockets in write_set yet and it should be possible to send data.
Why is this? Shouldn't select return instantly when it's possible to send data over the sockets in write_set?
Thanks!
Edit: My code..
// _read_set and _write_set are the master sets
fd_set read_set = _read_set;
fd_set write_set = _write_set;
// added this for testing, the socket is a member of RemoteChannelConnector.
std::list<RemoteChannelConnector*>::iterator iter;
for (iter = _acceptingConnectorList->begin(); iter != _acceptingConnectorList->end(); iter++) {
if(FD_ISSET((*iter)->getSocket(), &write_set)) {
char* buf = "a";
int ret;
if ((ret = send((*iter)->getSocket(), buf, 1, NULL)) == -1) {
std::cout << "error." << std::endl;
} else {
std::cout << "success." << std::endl;
}
}
}
struct timeval tv;
tv.tv_sec = 10;
tv.tv_usec = 0;
int status;
if ((status = select(_maxsockfd, &read_set, &write_set, NULL, &tv)) == -1) {
// Terminate process on error.
exit(1);
} else if (status == 0) {
// Terminate process on timeout.
exit(1);
} else {
// call send/receive
}
When I run it with the test code that checks whether my socket is actually in the write_set and whether it is possible to send data over the socket, I get "success"...
I don't believe that you're allowed to copy-construct fd_set objects. The only guaranteed way is to completely rebuild the set using FD_SET before each call to select. Also, you're writing to the list of sockets to be selected on, before ever calling select. That doesn't make sense.
Can you use poll instead? It's a much friendlier API.
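For comparison, a rough sketch of the same check with poll() (RemoteChannelConnector and _acceptingConnectorList are the types and members from the question; poll() takes its timeout in milliseconds):
#include <poll.h>
#include <list>
#include <vector>
std::vector<struct pollfd> pfds;
std::list<RemoteChannelConnector*>::iterator iter;
for (iter = _acceptingConnectorList->begin(); iter != _acceptingConnectorList->end(); ++iter) {
    struct pollfd p;
    p.fd = (*iter)->getSocket();
    p.events = POLLOUT;                        // interested in "ready for writing"
    p.revents = 0;
    pfds.push_back(p);
}
int status = poll(&pfds[0], pfds.size(), 10 * 1000);   // assumes the list is non-empty
if (status > 0) {
    for (size_t i = 0; i < pfds.size(); ++i) {
        if (pfds[i].revents & POLLOUT) {
            // this socket can be written to without blocking
        }
    }
} else if (status == 0) {
    // timeout
} else {
    // error (check errno)
}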
Your code is very confused. First, you don't seem to be setting any of the bits in the fd_set. Secondly, you test the bits before you even call select.
Here is how the flow generally works...
Use FD_ZERO to zero out your set.
Go through, and for each file descriptor you're interested in the writeable state of, use FD_SET to set it.
Call select, passing it the address of the fd_set you've been calling the FD_SET function on for the write set and observe the return value.
If the return value is > 0, then go through the write set and use FD_ISSET to figure out which ones are still set. Those are the ones that are writeable.
Your code does not at all appear to be following this pattern. Also, the important task of setting up the master set isn't being shown.
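In code, that pattern looks roughly like this (a sketch reusing the member names from the question, and assuming _maxsockfd holds the highest descriptor value, so select() is given _maxsockfd + 1):
fd_set write_set;
FD_ZERO(&write_set);                                   // step 1: start from an empty set
std::list<RemoteChannelConnector*>::iterator iter;
for (iter = _acceptingConnectorList->begin(); iter != _acceptingConnectorList->end(); ++iter) {
    FD_SET((*iter)->getSocket(), &write_set);          // step 2: add every socket of interest
}
struct timeval tv;
tv.tv_sec = 10;
tv.tv_usec = 0;
int status = select(_maxsockfd + 1, NULL, &write_set, NULL, &tv);   // step 3
if (status > 0) {
    for (iter = _acceptingConnectorList->begin(); iter != _acceptingConnectorList->end(); ++iter) {
        if (FD_ISSET((*iter)->getSocket(), &write_set)) {           // step 4: still set, so writable
            // it is now safe to send() on this socket without blocking
        }
    }
}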