Socket select reducing the number of sockets in file descriptor set - c++

I have a piece of code that accepts 2 connections, creates a file descriptor set with their respective sockets, and passes it to select. But when select returns, the number of file descriptors in the file descriptor set was reduced to 1, and select can just detect received data for the first socket in the fd_array array.
Any ideas where I should look?
Thanks in advance,
Andre
fd_set mSockets;
/* At this point
mSockets.fd_count = 2
mSockets.fd_array[0] = 3765
mSockets.fd_array[1] = 2436
*/
select(0, & mSockets, 0, 0, 0);
/* At this point
mSockets.fd_count = 1
mSockets.fd_array[0] = 3765
mSockets.fd_array[1] = 2436
*/

That is by design: the readfds, writefds and exceptfds parameters of the select function are in/out parameters.
You should initialize the fd_set before each call to select:
SOCKET s1;
SOCKET s2;
// open sockets s1 and s2
// prepare select call
FD_ZERO(&mSockets);
FD_SET(s1, &mSockets);
FD_SET(s2, &mSockets);
select(0, &mSockets, 0, 0, 0);
// evaluate select results
if (FD_ISSET(s1, &mSockets))
{
    // process s1 traffic
}
if (FD_ISSET(s2, &mSockets))
{
    // process s2 traffic
}
Additionally, you can check the return value of select. It tells you whether any of the sockets is ready at all; for example, a return value of zero means that every FD_ISSET macro will return 0.
EDIT:
Since readfds, writefds and exceptfds are also out parameters of the select function, they are modified. The fd_count member indicates how many fd_array members are valid. You should not evaluate fd_array[1] if fd_count is less than 2.
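To make the in/out behaviour concrete, here is a minimal sketch. It uses POSIX sockets rather than Winsock (so there is no fd_count to inspect), and the names readable_mask and demo_select_in_out are made up for illustration. It rebuilds the set before every select call, so a socket that select dropped from the set last time is watched again next time.

```cpp
#include <cassert>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

// Returns a bit mask of readable sockets: bit 0 for s1, bit 1 for s2.
// The fd_set is rebuilt on every call because select() overwrites it.
int readable_mask(int s1, int s2, int timeout_ms)
{
    fd_set readable;
    FD_ZERO(&readable);
    FD_SET(s1, &readable);
    FD_SET(s2, &readable);

    timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    // POSIX wants "highest descriptor + 1" here; Winsock ignores the argument.
    int max_fd = s1 > s2 ? s1 : s2;
    int ready = select(max_fd + 1, &readable, nullptr, nullptr, &tv);
    if (ready <= 0)
        return 0;  // 0 = timeout, -1 = error: nothing to report either way

    int mask = 0;
    if (FD_ISSET(s1, &readable)) mask |= 1;
    if (FD_ISSET(s2, &readable)) mask |= 2;
    return mask;
}

// Hypothetical demo: two local socket pairs stand in for the two accepted
// connections. Returns first_mask * 10 + second_mask.
int demo_select_in_out()
{
    int a[2], b[2];
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, a);
    assert(rc == 0);
    rc = socketpair(AF_UNIX, SOCK_STREAM, 0, b);
    assert(rc == 0);
    (void)rc;

    ssize_t n = write(b[1], "x", 1);  // only the second connection has data
    assert(n == 1);
    int first = readable_mask(a[0], b[0], 100);

    n = write(a[1], "y", 1);          // now both connections have unread data
    assert(n == 1);
    (void)n;
    int second = readable_mask(a[0], b[0], 100);

    close(a[0]); close(a[1]); close(b[0]); close(b[1]);
    return first * 10 + second;
}
```

The first call reports only the second socket and the second call reports both. That works only because the set passed to the second select was built from scratch instead of reusing the one the first call had already trimmed.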

Related

What could cause a non-blocking socket to block on `recv`?

I have a TCP/IP socket set to non-blocking that is blocking anyway. The socket is only referenced in one thread. This code works on Windows (with a few call substitutions) but not on Linux. I have code that looks like this (Don't mind the C-style casts -- this was written long ago. Also, I trimmed it up a bit, so let me know if I accidentally trimmed off a step. Chances are that I'm actually doing that step. The actual code is on another computer, so I can't copy-paste.):
// In the real code, these are class members. I'm not bonkers
int mSocket;
sockaddr_in mAddress;
void CreateSocket(
unsigned int ipAddress,
unsigned short port)
{
// Omitting my error checking in this question for brevity because everything comes back valid
mSocket = socket(AF_INET, SOCK_STREAM, 0); // Not -1
int oldFlags = fcntl(mSocket, F_GETFL, 0); // Not -1
fcntl(mSocket, F_SETFL, oldFlags | O_NONBLOCK); // Not -1
mAddress.sin_family = AF_INET;
mAddress.sin_addr.s_addr = ipAddress; // address is valid
mAddress.sin_port = htons((u_short)port); // port is not 0 and allowed on firewall
memset(mAddress.sin_zero, 0, sizeof(mAddress.sin_zero));
// <Connect attempt loop starts here>
connect(mSocket, (sockaddr*)&mAddress, sizeof(mAddress)); // Not -1 to exit loop
// <Connect attempt loop ends here>
// Connection is now successful ('connect' returned a value other than -1)
}
// ... Stuff happens ...
// ... Then this is called because 'select' call shows read data available ...
void AttemptReceive(
MyReturnBufferTypeThatsNotImportant &returnedBytes)
{
// Read socket
const size_t bufferSize = 4096;
char buffer[bufferSize];
int result = 0;
do {
// Debugging code: sanity checks
int socketFlags = fcntl(mSocket, F_GETFL, 0); // Not -1
printf("result=%d\n", result);
printf("O_NONBLOCK? %d\n", socketFlags & O_NONBLOCK); // Always prints "O_NONBLOCK? 2048"
result = recv(mSocket, buffer, bufferSize, 0); // NEVER -1 or 0 after hundreds to thousands of calls, then suddenly blocks
// ... Save off and package read data into user format for output to caller ...
} while (result == bufferSize);
}
I believe, because AttemptReceive is called in response to select, that the socket just happens to contain exactly a number of bytes equal to a multiple of the buffer size (4096). I've somewhat confirmed this with the printf statements, so it never blocks on the first loop-through. Every time this bug happens, the last two lines to get printed before the thread blocks are:
result=4096
O_NONBLOCK? 2048
Changing the recv line to recv(mSocket, buffer, bufferSize, MSG_DONTWAIT); actually "fixes" the issue (suddenly, recv occasionally returns -1 with errno EWOULDBLOCK/EAGAIN (both equal to each other on my OS)), but I'm afraid I'm just putting a band-aid on a gushing wound, so to speak. Any ideas?
P.S. the address is "localhost", but I don't think it matters.
Note: I'm using an old compiler (not by choice), g++ 4.4.7-23 from 2010. That may have something to do with the issue.
socket() automatically sets O_RDWR on the socket with my operating system and compiler, but it appears that O_RDWR had accidentally gotten unset on the socket in question at the start of the program (which somehow allowed it to read fine if there was data to read, but block otherwise). Fixing that bug caused the socket to stop blocking. Apparently, both O_RDWR and O_NONBLOCK are required to avoid sockets blocking, at least on my operating system and compiler.
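Separately from the O_RDWR finding above, the receive loop in the question stops only when recv returns a short read. A common alternative, sketched here under the assumption of a POSIX system, is to read until the kernel reports EAGAIN/EWOULDBLOCK, which also covers payloads that are an exact multiple of the buffer size. The helper names (set_nonblocking, drain_socket, demo_drain_exact_multiple) are invented for this sketch, and a local socket pair stands in for the TCP connection.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Puts fd into non-blocking mode while preserving its other status flags.
bool set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return false;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Reads until the socket has nothing left. Returns the number of bytes
// drained, or -1 on a real error. peer_closed is set when recv returns 0.
long drain_socket(int fd, bool &peer_closed)
{
    char buffer[4096];
    long total = 0;
    peer_closed = false;
    for (;;) {
        ssize_t n = recv(fd, buffer, sizeof(buffer), 0);
        if (n > 0) {
            total += n;  // real code would hand buffer[0..n) to the caller here
            continue;
        }
        if (n == 0) {
            peer_closed = true;
            return total;
        }
        if (errno == EINTR)
            continue;  // interrupted by a signal: retry
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return total;  // nothing more to read right now
        return -1;
    }
}

// Hypothetical demo: send exactly one buffer's worth and drain it.
long demo_drain_exact_multiple()
{
    int sv[2];
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
    assert(rc == 0);
    (void)rc;
    bool ok = set_nonblocking(sv[0]);
    assert(ok);
    (void)ok;

    char payload[4096] = {};
    ssize_t sent = send(sv[1], payload, sizeof(payload), 0);
    assert(sent == static_cast<ssize_t>(sizeof(payload)));
    (void)sent;

    bool closed = false;
    long drained = drain_socket(sv[0], closed);
    close(sv[0]);
    close(sv[1]);
    return closed ? -1 : drained;
}
```

With this shape, a -1 return with EAGAIN is the normal way the loop ends, not an error, so the MSG_DONTWAIT result described in the question is expected behaviour for a non-blocking read.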

Pipe + select: select never woken up

I'm building a small io service that checks read and write availability of some fds.
To do that, I have a thread dedicated to the select without any timeout so that the select only wakes up when a fd becomes available.
However, I sometimes want to force select to be woken up on specific events. To do so, I simply use a pipe, watch for its read availability and write on it when I want to wake up the select call.
This works most of the time, but sometimes nothing happens when I write to the pipe, so the select call remains blocked indefinitely.
Here is a part of the code I use:
Select thread:
FD_ZERO(&rd_set);
//! set some other fds...
FD_SET(m_notif_pipe_fds[0], &rd_set);
select(max_fd + 1, &rd_set, &wr_set, nullptr, nullptr);
if (FD_ISSET(m_notif_pipe_fds[0], &rd_set)) {
char buf[1024];
read(m_notif_pipe_fds[0], buf, 1024);
}
Notify thread:
write(m_notif_pipe_fds[1], "a", 1);
The max_fd variable has effectively been set to the highest fd value (not the number of fd to watch which is a common error).
Any idea?
I'd suggest making your pipe non-blocking:
int flags = fcntl(m_notif_pipe_fds[1], F_GETFL, 0);
assert(flags != -1);
fcntl(m_notif_pipe_fds[1], F_SETFL, flags | O_NONBLOCK);
and setting the pipe buffer size to 1 (F_SETPIPE_SZ is Linux-specific, and the kernel silently rounds a request below the page size up to the page size):
int pipe_sz = fcntl(m_notif_pipe_fds[1], F_SETPIPE_SZ, 1);
See this question
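For reference, the wake-up pipe pattern discussed here can be sketched as follows. This is not a diagnosis of the missed wake-ups in the question, only one way to arrange the pieces. It assumes POSIX, makes both ends non-blocking, and uses hypothetical names (WakePipe, notify, wait_for_wakeup). The demo runs in a single thread and uses a timeout so that it always terminates.

```cpp
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

struct WakePipe { int read_fd; int write_fd; };

static bool make_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    return flags != -1 && fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

bool open_wake_pipe(WakePipe &p)
{
    int fds[2];
    if (pipe(fds) != 0)
        return false;
    if (!make_nonblocking(fds[0]) || !make_nonblocking(fds[1])) {
        close(fds[0]);
        close(fds[1]);
        return false;
    }
    p.read_fd = fds[0];
    p.write_fd = fds[1];
    return true;
}

// Called by the notifying side. A full pipe (EAGAIN) is treated as success:
// unread bytes are already queued, so a wake-up is pending anyway.
bool notify(const WakePipe &p)
{
    for (;;) {
        if (write(p.write_fd, "a", 1) == 1)
            return true;
        if (errno == EINTR)
            continue;
        return errno == EAGAIN || errno == EWOULDBLOCK;
    }
}

// Waits up to timeout_ms for a wake-up, then drains every queued byte.
bool wait_for_wakeup(const WakePipe &p, int timeout_ms)
{
    fd_set rd_set;
    FD_ZERO(&rd_set);  // rebuilt on every call, since select() modifies it
    FD_SET(p.read_fd, &rd_set);
    timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int ready = select(p.read_fd + 1, &rd_set, nullptr, nullptr, &tv);
    if (ready <= 0 || !FD_ISSET(p.read_fd, &rd_set))
        return false;

    char buf[1024];
    while (read(p.read_fd, buf, sizeof(buf)) > 0) {
        // keep reading until the non-blocking read reports an empty pipe
    }
    return true;
}

// Hypothetical demo: bit 0 = woke with nothing written, bit 1 = woke after
// three notifications, bit 2 = woke again after the pipe was drained.
int demo_wake_pipe()
{
    WakePipe p;
    bool ok = open_wake_pipe(p);
    assert(ok);
    (void)ok;
    bool idle = wait_for_wakeup(p, 50);
    notify(p);
    notify(p);
    notify(p);
    bool woken = wait_for_wakeup(p, 50);
    bool again = wait_for_wakeup(p, 50);
    close(p.read_fd);
    close(p.write_fd);
    return (idle ? 1 : 0) | (woken ? 2 : 0) | (again ? 4 : 0);
}
```

Draining the whole pipe after each wake-up means several notifications collapse into one wake-up, and stale bytes cannot trigger a spurious one later.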

Check if stdin is empty

I searched but did not find a relevant answer to this question. I am working on a Linux machine and want to check whether the standard input stream contains any characters, without removing them from the stream.
You might want to try the select() function and wait for data to arrive on the input stream.
Description:
select() and pselect() allow a program to monitor multiple file
descriptors, waiting until one or more of the file descriptors become
"ready" for some class of I/O operation (e.g., input possible). A file
descriptor is considered ready if it is possible to perform the
corresponding I/O operation (e.g., read(2)) without blocking.
In your case, the file descriptor will be stdin
void yourFunction(){
fd_set fds;
struct timeval timeout;
int selectRetVal;
/* Set time limit you want to WAIT for the fdescriptor to have data,
or not( you can set it to ZERO if you want) */
timeout.tv_sec = 0;
timeout.tv_usec = 1;
/* Create a descriptor set containing our remote socket
(the one that connects with the remote troll at the client side). */
FD_ZERO(&fds);
FD_SET(stdin, &fds);
selectRetVal = select(sizeof(fds)*8, &fds, NULL, NULL, &timeout);
if (selectRetVal == -1) {
/* error occurred in select(), */
printf("select failed()\n");
} else if (selectRetVal == 0) {
printf("Timeout occurred!!! No data to fetch().\n");
//do some other stuff
} else {
/* The descriptor has data, fetch it. */
if (FD_ISSET(stdin, &fds)) {
//do whatever you want with the data
}
}
}
Hope it helps.
cacho was on the right path; however, select is only necessary if you're dealing with more than one file descriptor, and stdin is not a POSIX file descriptor (int); it's a FILE *. You'd want to use STDIN_FILENO if you go that route.
It's not a very clean route to take, either. I'd prefer to use poll. By specifying 0 as the timeout, poll will return immediately.
If none of the defined events have occurred on any selected file
descriptor, poll() shall wait at least timeout milliseconds for an
event to occur on any of the selected file descriptors. If the value
of timeout is 0, poll() shall return immediately. If the value of
timeout is -1, poll() shall block until a requested event occurs or
until the call is interrupted.
struct pollfd stdin_poll = { .fd = STDIN_FILENO
, .events = POLLIN | POLLRDBAND | POLLRDNORM | POLLPRI };
if (poll(&stdin_poll, 1, 0) == 1) {
/* Data waiting on stdin. Process it. */
}
/* Do other processing. */
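The poll approach can be wrapped in a small helper. The sketch below takes the descriptor as a parameter so that it can be exercised with a pipe; for the case in the question you would pass STDIN_FILENO. The function names are made up for illustration, and only POLLIN is requested to keep it short.

```cpp
#include <cassert>
#include <poll.h>
#include <unistd.h>

// Returns true when poll() reports POLLIN for fd, i.e. a read would not
// block. Nothing is consumed from the descriptor.
bool has_pending_input(int fd)
{
    pollfd pfd;
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    int rc = poll(&pfd, 1, 0);  // timeout 0: return immediately
    return rc == 1 && (pfd.revents & POLLIN) != 0;
}

// Hypothetical demo using a pipe as a stand-in for stdin.
// bit 0 = input seen before writing, bit 1 = input seen after writing,
// bit 2 = the first byte was still there to be read afterwards.
int demo_pending_input()
{
    int fds[2];
    int rc = pipe(fds);
    assert(rc == 0);
    (void)rc;

    bool before = has_pending_input(fds[0]);
    ssize_t n = write(fds[1], "hi", 2);
    assert(n == 2);
    bool after = has_pending_input(fds[0]);

    char c = 0;
    n = read(fds[0], &c, 1);  // the check above did not consume anything
    bool intact = (n == 1 && c == 'h');

    close(fds[0]);
    close(fds[1]);
    return (before ? 1 : 0) | (after ? 2 : 0) | (intact ? 4 : 0);
}
```

A call such as has_pending_input(STDIN_FILENO) then answers the original question. Note that when stdin is a terminal in canonical mode, input typically becomes readable only once a full line has been entered.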

Why does select only show file descriptors as ready if data is already being sent?

I'm using select() in a thread to monitor a datagram socket, but unless data is being sent to the socket before the thread starts, select() will continue to return 0.
I'm mixing a little C and C++; here's the method that starts the thread:
bool RelayStart() {
sock_recv = socket(AF_INET, SOCK_DGRAM, 0);
memset(&addr_recv, 0, sizeof(addr_recv));
addr_recv.sin_family = AF_INET;
addr_recv.sin_port = htons(18902);
addr_recv.sin_addr.s_addr = htonl(INADDR_ANY);
bind(sock_recv, (struct sockaddr*) &addr_recv, sizeof(addr_recv));
isRelayingPackets = true;
NSS::Thread::start(VIDEO_SEND_THREAD_ID);
return true;
}
The method that stops the thread:
bool RelayStop() {
isSendingVideo = false;
NSS::Thread::stop();
close(sock_recv);
return true;
}
And the method run in the thread:
void Run() {
    fd_set read_fds;
    int select_return;
    struct timeval select_timeout;
    FD_ZERO(&read_fds);
    FD_SET(sock_recv, &read_fds);
    while (isRelayingPackets) {
        select_timeout.tv_sec = 1;
        select_timeout.tv_usec = 0;
        select_return = select(sock_recv + 1, &read_fds, NULL, NULL, &select_timeout);
        if (select_return > 0 && FD_ISSET(sock_recv, &read_fds)) {
            // ...
        }
    }
}
The problem is that if there isn't a process already sending UDP packets to port 18902 before RelayStart() is called, select() will always return 0. So, for example, I can't restart the sender without restarting the thread (in the correct order.)
Everything seems to work fine as long as the sender is started first.
The Run thread only constructs read_fds once.
The select call updates read_fds to have all its bits cleared for all descriptors that did not have data ready, and all its bits set for those that were set before and do have data ready.
Hence, if no descriptor has any data ready and the select call times out (and returns 0), all the bits in read_fds are now cleared. Further calls passing the same all-zero bit-mask will scan no file descriptors.
You can either re-construct the read-set on each trip inside the loop:
while (isRelayingPackets) {
FD_ZERO(&read_fds);
FD_SET(sock_recv, &read_fds);
...
}
or use an auxiliary variable with a copy of the bit-set:
while (isRelayingPackets) {
fd_set select_arg = read_fds;
... same as before but use &select_arg ...
}
(Or, of course, there are non-select interfaces that are easier to use in some ways.)
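As one example of such a non-select interface, poll keeps the requested events (events) separate from the reported ones (revents), so there is no set to rebuild after a timeout. The sketch below is only an illustration: a local datagram socket pair stands in for the UDP socket on port 18902, and the function names are hypothetical.

```cpp
#include <cassert>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

// Waits up to timeout_ms for a datagram to become readable on sock.
bool wait_for_datagram(int sock, int timeout_ms)
{
    pollfd pfd;
    pfd.fd = sock;
    pfd.events = POLLIN;  // the request; poll() never modifies this field
    pfd.revents = 0;      // the result; filled in by poll()
    int rc = poll(&pfd, 1, timeout_ms);
    return rc == 1 && (pfd.revents & POLLIN) != 0;
}

// Hypothetical demo: the receiver times out twice before any sender exists,
// then a late sender transmits and the next wait succeeds.
// Returns timeouts * 10 + (1 if the datagram was detected).
int demo_late_sender()
{
    int sv[2];
    int rc = socketpair(AF_UNIX, SOCK_DGRAM, 0, sv);
    assert(rc == 0);
    (void)rc;

    int timeouts = 0;
    for (int i = 0; i < 2; ++i) {
        if (!wait_for_datagram(sv[0], 20))
            ++timeouts;
    }

    ssize_t n = send(sv[1], "ping", 4, 0);  // the sender starts late
    assert(n == 4);
    (void)n;
    bool ready = wait_for_datagram(sv[0], 500);

    close(sv[0]);
    close(sv[1]);
    return timeouts * 10 + (ready ? 1 : 0);
}
```

Unlike the original Run loop, the earlier timeouts leave nothing behind that would stop the later datagram from being noticed.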
How were you expecting it to behave? The point of select() is to sleep to a timeout until data are available to be read; in this case, it will time out after 1 second and return 0. Perhaps you don't actually want a timeout before the start of a stream?

How to pass user-defined data to a worker thread using IOCP?

Hey... I created a small test server using I/O completion ports and winsock.
I can successfully connect and associate a socket handle with the completion port.
But I don't know how to pass user-defined data structures into the worker thread...
What I've tried so far was passing a user structure as (ULONG_PTR)&structure as the completion key in the association call to CreateIoCompletionPort().
But that did not work.
Now I tried defining my own OVERLAPPED-structure and using CONTAINING_RECORD() as described here http://msdn.microsoft.com/en-us/magazine/cc302334.aspx and http://msdn.microsoft.com/en-us/magazine/bb985148.aspx.
But that does not work either. (I get freaky values for the contents of pHelper.)
So my question is: How can I pass data to the worker thread using WSARecv(), GetQueuedCompletionStatus() and the completion packet or the OVERLAPPED structure?
EDIT: How can I successfully transmit "per-connection data"? It seems I got the technique (as explained in the two links above) wrong.
Here goes my code: (Yes, it's ugly and it's only TEST code)
struct helper
{
SOCKET m_sock;
unsigned int m_key;
OVERLAPPED over;
};
///////
SOCKET newSock = INVALID_SOCKET;
WSABUF wsabuffer;
char cbuf[250];
wsabuffer.buf = cbuf;
wsabuffer.len = 250;
DWORD flags, bytesrecvd;
while(true)
{
newSock = accept(AcceptorSock, NULL, NULL);
if(newSock == INVALID_SOCKET)
ErrorAbort("could not accept a connection");
//associate socket with the CP
if(CreateIoCompletionPort((HANDLE)newSock, hCompletionPort, 3,0) != hCompletionPort)
ErrorAbort("Wrong port associated with the connection");
else
cout << "New Connection made and associated\n";
helper* pHelper = new helper;
pHelper->m_key = 3;
pHelper->m_sock = newSock;
memset(&(pHelper->over), 0, sizeof(OVERLAPPED));
flags = 0;
bytesrecvd = 0;
if(WSARecv(newSock, &wsabuffer, 1, NULL, &flags, (OVERLAPPED*)pHelper, NULL) != 0)
{
if(WSAGetLastError() != WSA_IO_PENDING)
ErrorAbort("WSARecv didnt work");
}
}
//Cleanup
CloseHandle(hCompletionPort);
cin.get();
return 0;
}
DWORD WINAPI ThreadProc(HANDLE h)
{
DWORD dwNumberOfBytes = 0;
OVERLAPPED* pOver = nullptr;
helper* pHelper = nullptr;
WSABUF RecvBuf;
char cBuffer[250];
RecvBuf.buf = cBuffer;
RecvBuf.len = 250;
DWORD dwRecvBytes = 0;
DWORD dwFlags = 0;
ULONG_PTR Key = 0;
GetQueuedCompletionStatus(h, &dwNumberOfBytes, &Key, &pOver, INFINITE);
//Extract helper
pHelper = (helper*)CONTAINING_RECORD(pOver, helper, over);
cout << "Received Overlapped item" << endl;
if(WSARecv(pHelper->m_sock, &RecvBuf, 1, &dwRecvBytes, &dwFlags, pOver, NULL) != 0)
cout << "Could not receive data\n";
else
cout << "Data Received: " << RecvBuf.buf << endl;
ExitThread(0);
}
If you pass your struct like this it should work just fine:
helper* pHelper = new helper;
CreateIoCompletionPort((HANDLE)newSock, hCompletionPort, (ULONG_PTR)pHelper,0);
...
helper* pHelper=NULL;
GetQueuedCompletionStatus(h, &dwNumberOfBytes, (PULONG_PTR)&pHelper, &pOver, INFINITE);
Edit to add per IO data:
One of the frequently abused features of the asynchronous apis is they don't copy the OVERLAPPED struct, they simply use the provided one - hence the overlapped struct returned from GetQueuedCompletionStatus points to the originally provided struct. So:
struct helper {
OVERLAPPED m_over;
SOCKET m_socket;
UINT m_key;
};
if(WSARecv(newSock, &wsabuffer, 1, NULL, &flags, &pHelper->m_over, NULL) != 0)
Notice that, again, in your original sample, you were getting your casting wrong. (OVERLAPPED*)pHelper was passing a pointer to the START of the helper struct, but the OVERLAPPED part was declared last. I changed it to pass the address of the actual overlapped part, which means that the code compiles without a cast, which lets us know we are doing the correct thing. I also moved the overlapped struct to be the first member of the struct.
To catch the data on the other side:
OVERLAPPED* pOver;
ULONG_PTR key;
if(GetQueuedCompletionStatus(h,&dw,&key,&pOver,INFINITE))
{
// c cast
helper* pConnData = (helper*)pOver;
On this side it is particularly important that the overlapped struct is the first member of the helper struct, as that makes it easy to cast back from the OVERLAPPED* the api gives us, and the helper* we actually want.
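The pointer arithmetic behind both variants can be shown without any Windows headers. In the sketch below, FakeOverlapped is a stand-in for OVERLAPPED and helper_from_overlapped does by hand roughly what CONTAINING_RECORD does; all names are hypothetical. The member is deliberately not placed first, to show why a plain cast between helper* and OVERLAPPED* only works when it is.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for the Windows OVERLAPPED structure (illustration only).
struct FakeOverlapped {
    std::uintptr_t internal;
    std::uintptr_t internal_high;
    void *event;
};

struct Helper {
    int socket_id;
    unsigned key;
    FakeOverlapped over;  // deliberately NOT the first member
};

// Steps back from the embedded member to the start of the enclosing Helper,
// which is roughly what CONTAINING_RECORD(p, Helper, over) computes.
Helper *helper_from_overlapped(FakeOverlapped *p)
{
    char *base = reinterpret_cast<char *>(p) - offsetof(Helper, over);
    return reinterpret_cast<Helper *>(base);
}

// Hypothetical demo: returns the recovered socket_id, or -1 if recovery
// failed or if the naive cast would (unexpectedly) have been valid.
int demo_recover_helper()
{
    Helper h = {};
    h.socket_id = 42;
    h.key = 3;

    // This is the pointer an API such as WSARecv should be given and that
    // GetQueuedCompletionStatus would later hand back.
    FakeOverlapped *from_api = &h.over;

    // A plain cast is only right when 'over' sits at offset 0.
    bool naive_cast_ok = static_cast<void *>(from_api) == static_cast<void *>(&h);

    Helper *recovered = helper_from_overlapped(from_api);
    if (recovered != &h || naive_cast_ok)
        return -1;
    return recovered->socket_id;
}
```

Putting the overlapped member first makes the offset zero, which is why the simple C cast in the answer is then sufficient.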
You can send special-purpose data of your own to the completion port via PostQueuedCompletionStatus.
The I/O completion packet will satisfy
an outstanding call to the
GetQueuedCompletionStatus function.
This function returns with the three
values passed as the second, third,
and fourth parameters of the call to
PostQueuedCompletionStatus. The system
does not use or validate these values.
In particular, the lpOverlapped
parameter need not point to an
OVERLAPPED structure.
I use the standard socket routines (socket, closesocket, bind, accept, connect ...) for creating/destroying and ReadFile/WriteFile for I/O as they allow use of the OVERLAPPED structure.
After your socket has accepted or connected you should associate it with the session context that it services. Then you associate your socket to an IOCP and (in the third parameter) provide it with a reference to the session context. The IOCP does not know what this reference is and doesn't care either for that matter. The reference is for YOUR use so that when you get an IOC through GetQueuedCompletionStatus the variable pointed to by parameter 3 will be filled in with the reference so that you immediately find the context associated with the socket event and can begin servicing the event. I usually use an indexed structure containing (among other things) the socket declaration, the overlapped structure as well as other session-specific data. The reference I pass to CreateIoCompletionPort in parameter 3 will be the index to the structure member containing the socket.
You need to check if GetQueuedCompletionStatus returned a completion or a timeout. With a timeout you can run through your indexed structure and see (for example) if one of them has timed out or something else and take appropriate house-keeping actions.
The overlapped structure also needs to be checked to see that the I/O completed correctly.
The function servicing the IOCP should be a separate, multi-threaded entity. Use the same number of threads that you have cores in your system, or at least no more than that as it wastes system resources (you don't have more resources for servicing the event than the number of cores in your system, right?).
IOCPs really are the best of all worlds (too good to be true), and anyone who says "one thread per socket" or "wait on multiple-socket list in one function" doesn't know what they are talking about. The former stresses your scheduler and the latter is polling, and polling is ALWAYS extremely wasteful.