Callbacks and Delays in a select/poll loop - c++

One can use poll/select when writing a server that services multiple clients in a single thread. select and poll, however, need a file descriptor to work. For this reason, I am uncertain how to perform simple asynchronous operations, such as a callback that breaks up a long-running operation or a delayed callback, without exiting the select/poll loop. How does one go about doing this? Ideally, I would like to do this without resorting to spawning new threads.
In a nutshell, I am looking for a mechanism with which I can perform ALL asynchronous operations. The Windows WaitForMultipleObjects or Symbian TRequestStatus seems much better suited to generalized asynchronous operations.

For arbitrary callbacks, maintain a POSIX pipe (see pipe(2)). When you want to do a deferred call, write a struct consisting of a function pointer and optional context pointer to the write end. The read end is just another input for select. If it selects readable, read the same struct, and call the function with the context as argument.
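A minimal sketch of the pipe approach described above, assuming a POSIX system. The names DeferredCall, DeferredQueue and run_once are made up for illustration, and the loop watches only the pipe; a real server would add its sockets to the same fd_set.

```cpp
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>
#include <stdexcept>

// Hypothetical record sent through the pipe: a function pointer plus context.
struct DeferredCall {
    void (*fn)(void*);
    void* ctx;
};

// Hypothetical helper that owns the pipe and hides the read/write details.
class DeferredQueue {
public:
    DeferredQueue() {
        if (pipe(fds_) != 0) throw std::runtime_error("pipe() failed");
    }
    ~DeferredQueue() { close(fds_[0]); close(fds_[1]); }
    DeferredQueue(const DeferredQueue&) = delete;
    DeferredQueue& operator=(const DeferredQueue&) = delete;

    int read_fd() const { return fds_[0]; }

    // Queue a call. Writes of at most PIPE_BUF bytes are atomic, so the
    // struct is never interleaved with another writer's data.
    bool post(void (*fn)(void*), void* ctx) {
        DeferredCall call{fn, ctx};
        return write(fds_[1], &call, sizeof call) ==
               static_cast<ssize_t>(sizeof call);
    }

    // Call once the read end selects readable: read one record and invoke it.
    bool dispatch_one() {
        DeferredCall call;
        if (read(fds_[0], &call, sizeof call) !=
            static_cast<ssize_t>(sizeof call)) {
            return false;
        }
        call.fn(call.ctx);
        return true;
    }

private:
    int fds_[2];
};

// One iteration of a select loop that watches only the deferred-call pipe.
// Returns the number of callbacks run (0 on timeout or error).
int run_once(DeferredQueue& q, long timeout_ms) {
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(q.read_fd(), &readfds);
    timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    int n = select(q.read_fd() + 1, &readfds, nullptr, nullptr, &tv);
    if (n > 0 && FD_ISSET(q.read_fd(), &readfds)) {
        return q.dispatch_one() ? 1 : 0;
    }
    return 0;
}
```

Because the pipe is just another descriptor, a deferred call posted from inside a handler is picked up on the next pass through the loop, which is what lets a long operation be split into steps.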
For timed callbacks, maintain a list in order of due time. Entries in the list are structs of e.g. { due time (as interval since previous callback); function pointer; optional context pointer }. If this list is empty, block forever in select(). Otherwise, timeout when the first event is due. Before each call to select, recalculate the first event's due time.
Hide the details behind a reasonable interface.
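For the timed callbacks, here is a rough sketch under a few assumptions: the class name TimerList and the function run_until_idle are invented for this example, and it stores absolute due times in a std::multimap rather than the intervals suggested above, simply because that keeps the code short. No descriptors are watched; a real loop would pass its fd sets to the same select() call.

```cpp
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <chrono>
#include <functional>
#include <map>
#include <utility>

using Clock = std::chrono::steady_clock;

// Hypothetical timer list, kept sorted by absolute due time.
class TimerList {
public:
    void add(std::chrono::milliseconds delay, std::function<void()> cb) {
        timers_.emplace(Clock::now() + delay, std::move(cb));
    }
    bool empty() const { return timers_.empty(); }

    // Fill tv with the time left until the first timer is due (zero if
    // overdue). Returns nullptr when the list is empty, which makes
    // select() block until a descriptor becomes ready.
    timeval* next_timeout(timeval& tv) const {
        if (timers_.empty()) return nullptr;
        auto left = std::chrono::duration_cast<std::chrono::microseconds>(
            timers_.begin()->first - Clock::now());
        if (left.count() < 0) left = std::chrono::microseconds(0);
        tv.tv_sec = static_cast<time_t>(left.count() / 1000000);
        tv.tv_usec = static_cast<suseconds_t>(left.count() % 1000000);
        return &tv;
    }

    // Run every callback whose due time has passed; returns how many ran.
    int run_due() {
        int ran = 0;
        while (!timers_.empty() && timers_.begin()->first <= Clock::now()) {
            auto cb = std::move(timers_.begin()->second);
            timers_.erase(timers_.begin());
            cb();
            ++ran;
        }
        return ran;
    }

private:
    std::multimap<Clock::time_point, std::function<void()>> timers_;
};

// Loop until every timer has fired. The timeout is recalculated before
// each call to select(), as described above.
int run_until_idle(TimerList& timers) {
    int total = 0;
    while (!timers.empty()) {
        timeval tv;
        select(0, nullptr, nullptr, nullptr, timers.next_timeout(tv));
        total += timers.run_due();
    }
    return total;
}
```

Timers fire in due-time order regardless of the order in which they were added, since the map keeps them sorted.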

select() and poll() are system calls: your program asks the OS kernel to do something, and the calling thread can do nothing else until the kernel returns, unless you use another thread.
Although select() and poll() are used for asynchronous I/O, the calls themselves are not asynchronous: they block (unless you specify a timeout) until something happens on one of the descriptors you are watching.
The best strategy would be to check the descriptors from time to time (specifying a small timeout value); if nothing is ready, do whatever you want to do in idle time, otherwise process the I/O.

You could take advantage of the timeout of select() or poll() to do your background stuff periodically:
for ( ;; ) {
    ...
    int fds = select(<fds and timeout>);
    if (fds < 0) {
        <error occurred>
    } else if (fds == 0) {
        <handle timeout, do some background work.>
    } else {
        <handle the active file descriptors>
    }
}

For an immediate callback using the select loop, one can use one of the special files like /dev/zero that are always active. This will make select return right away, while still allowing other files to become active as well.
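A small sketch of the /dev/zero trick, assuming a Unix-like system where that device exists; select_on_dev_zero is a made-up name. It passes no timeout, so the call only returns because the descriptor is already readable.

```cpp
#include <fcntl.h>
#include <sys/select.h>
#include <unistd.h>

// Returns select()'s result when /dev/zero is the only watched descriptor,
// or -1 if the device cannot be opened. With a null timeout the call would
// block forever if the descriptor were not already readable.
int select_on_dev_zero() {
    int fd = open("/dev/zero", O_RDONLY);
    if (fd < 0) return -1;
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    int n = select(fd + 1, &readfds, nullptr, nullptr, nullptr);
    close(fd);
    return n;
}
```

Note that a loop which keeps such a descriptor in its read set never sleeps, so it should only be added while a callback is actually pending.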
For timed delays, I can only think of using the timeout on select.
Both of the above don't feel great, so please send better answers.

Related

How to stop select() immediately on closing the worker thread? [duplicate]

I have a loop which basically calls this every few seconds (after the timeout):
while(true){
    if(finished)
        return;
    switch(select(FD_SETSIZE, &readfds, 0, 0, &tv)){
        case SOCKET_ERROR : report bad stuff etc; return;
        default : break;
    }
    // do stuff with the incoming connection
}
So basically for every few seconds (which is specified by tv), it reactivates the listening.
This is run on thread B (not the main thread). There are times when I want to end this acceptor loop immediately from thread A (the main thread), but it seems I have to wait until the time interval finishes.
Is there a way to disrupt the select function from another thread so thread B can quit instantly?
The easiest way is probably to use pipe(2) to create a pipe and add the read end to readfds. When the other thread wants to interrupt the select() just write a byte to it, then consume it afterward.
Yes, you create a connected pair of sockets. The thread that wants to interrupt (thread A in the question) writes to one socket of the pair, and the thread running select (thread B) adds the other socket to its read set. Once A writes a byte, B's select returns; do not forget to read that byte from the socket.
This is the most standard and common way to interrupt selects.
Notes:
Under Unix, use socketpair to create a pair of sockets. Under Windows it is a little bit tricky, but searching for "Windows socketpair" will turn up code samples.
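Roughly, the socketpair wake-up could look like the sketch below on Unix. The Worker class is hypothetical, and its loop watches only the wake-up socket; a real acceptor would add its listening socket to the same set. Because select() is called with no timeout, the only way out of the loop is the byte written by stop().

```cpp
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>
#include <atomic>
#include <cerrno>
#include <stdexcept>
#include <thread>

// Hypothetical worker that blocks in select() until another thread wakes it.
class Worker {
public:
    Worker() {
        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv_) != 0) {
            throw std::runtime_error("socketpair() failed");
        }
        thread_ = std::thread([this] { loop(); });
    }
    ~Worker() {
        stop();
        close(sv_[0]);
        close(sv_[1]);
    }
    Worker(const Worker&) = delete;
    Worker& operator=(const Worker&) = delete;

    // Called from the controlling thread: write one byte, then join.
    void stop() {
        if (!thread_.joinable()) return;
        char byte = 1;
        ssize_t written = write(sv_[1], &byte, 1);
        (void)written;
        thread_.join();
    }
    bool finished() const { return finished_; }

private:
    void loop() {
        for (;;) {
            fd_set readfds;
            FD_ZERO(&readfds);
            FD_SET(sv_[0], &readfds);  // a real server adds its sockets too
            int n = select(sv_[0] + 1, &readfds, nullptr, nullptr, nullptr);
            if (n < 0) {
                if (errno == EINTR) continue;  // interrupted by a signal
                break;
            }
            if (FD_ISSET(sv_[0], &readfds)) {
                char byte;
                ssize_t got = read(sv_[0], &byte, 1);  // consume the wake-up byte
                (void)got;
                break;
            }
        }
        finished_ = true;
    }

    int sv_[2];
    std::thread thread_;
    std::atomic<bool> finished_{false};
};
```

With this in place the select timeout is no longer needed for shutdown, so the loop can block indefinitely instead of waking every few seconds.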
Can't you just make the timeout sufficiently short (like 10ms or so?).
These "just create a dummy connection"-type solutions seem sort of hacked. I personally think that if an application is well designed, concurrent tasks never have to be interrupted forcefully; the worker just has to check often enough (this is also a reason why boost.threads does not have a terminate function).
Edit: Made this answer CV. It is bad, but it might help others to understand why it is bad, which is explained in the comments.
You can call shutdown(Sock, SHUT_RDWR) from the main thread to make the waiting select call return, which lets the other thread exit without waiting for the timeout to expire.
cheers. :)

effect of SELECT on read() in linux

I have a legacy code which is doing this:
select(nFD + 1, &tReadFds, NULL, NULL, &timer);
.............
if (FD_ISSET(nFD, &tReadFds))
n = read(nFD,len,x);
Is the read going to read the whole receive buffer (nFD), assuming 'len' and 'x' are big enough?
I think select here is acting as just a way of blocking until data becomes available in the receive buffer.
In a nutshell, select is a function that tells you which file descriptors you can call read (or write) on without blocking. Called with a zero timeout, it returns immediately; otherwise it waits until a descriptor is ready or the timeout elapses.
Such a function is crucial if you want to provide a persistent service while processing I/O with only a single thread: You cannot afford to do nothing while you are waiting for I/O, and so you need a deterministic method to ensure that you can do non-blocking I/O.
Edit. Here's an example of a typical single-threaded select-server, in pseudo-code:
while (true)
{
    select(...);
    read_available_data();
    process_data_and_do_work(); // expensive
}
Such a server never has to be idle, and the expensive processing function can take up almost all the available computing time (it just has to make sure to return when it needs more data). I think select even allows for a context switch, so this will play nice in a multi-process environment.
The code snippet is calling select() with a non-NULL timeout parameter. The code is waiting up to some maximum amount of time for the socket to become readable. If the timeout elapses, the socket is not readable and FD_ISSET() will return false, skipping the read() call. However, if the socket becomes readable before the timeout elapses, FD_ISSET() will return true, and a call to read() is guaranteed not to block the calling thread. It will return immediately, either returning whatever data is currently in the socket's receive buffer (up to len bytes max), or returning 0 if the remote party has disconnected gracefully.
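To make that behaviour concrete, here is a sketch of the same select-then-read pattern wrapped in a helper. The name read_with_timeout and the -2 return code for a timeout are choices made for this example, not part of any API.

```cpp
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>
#include <cstddef>

// Hypothetical helper. Returns the number of bytes read (0 means end of
// file or a graceful disconnect), -1 on error, or -2 if nothing became
// readable within timeout_ms.
ssize_t read_with_timeout(int fd, void* buf, size_t len, long timeout_ms) {
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    int ready = select(fd + 1, &readfds, nullptr, nullptr, &tv);
    if (ready < 0) return -1;
    if (ready == 0 || !FD_ISSET(fd, &readfds)) return -2;
    // A single read() returns what is buffered right now, up to len bytes.
    // It does not wait for more data to arrive.
    return read(fd, buf, len);
}
```

If the buffer is smaller than the data waiting, the remainder stays queued and the descriptor is reported readable again on the next select(), so callers that need a whole message have to loop.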

Is there a way to communicate data between computers without while loops? C++

I have been struggling to find an answer to this on Google, as I don't know the exact terms to search for.
If someone were to build an MSN Messenger-like program, is it possible to have always-open connections and no while(true) loop? If so, could someone point me in the direction of how this is achieved?
Using the boost::asio library for socket handling, I think it is possible to define callbacks upon data reception.
The magic words you're looking for are asynchronous I/O. This can be achieved either by using asynchronous APIs (functions such as ReadThis() that return immediately and signal on success/failure, like but not limited to boost::asio) or by deferring blocking calls to different threads. Picking either method requires careful weighing of both the underlying implementation and the scale of your operations.
You want to use ACE. It has a Reactor pattern which will notify you when data is available to be used.
Reactor Pattern
You could have:
while(1) {
    usleep(100 * 1000); // 100 ms (POSIX sleep() takes seconds, not milliseconds)
    // check if there is a message
    // process message
    //...
}
This is OK, but there is overhead on servers running tens of thousands of threads, since each thread repeatedly comes out of sleep to check for a message, causing context switches. Instead, operating systems provide functions like select (and epoll on Linux), which allow a thread to wait on an event.
while(1) {
    // wait for message
    // process message
    //...
}
Using wait, the thread is not "woken up" unless a message is received.
You can only hide your while loop (or some kind of loop) somewhere buried in some library or restart the waiting for next IO in an event callback, but you aren't going to be able to completely avoid it.
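As an illustration of "hiding the loop in a library", here is a very small reactor sketch built on poll(). The Reactor class and its method names are invented for this example; libraries such as boost::asio and ACE are far more complete, and nothing here reflects their actual APIs.

```cpp
#include <poll.h>
#include <unistd.h>
#include <cerrno>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical minimal reactor: maps descriptors to "readable" callbacks.
class Reactor {
public:
    using Callback = std::function<void(int fd)>;

    void on_readable(int fd, Callback cb) { handlers_[fd] = std::move(cb); }
    void remove(int fd) { handlers_.erase(fd); }
    void stop() { running_ = false; }

    // The loop the application code never sees: wait, then dispatch.
    void run() {
        running_ = true;
        while (running_ && !handlers_.empty()) {
            std::vector<pollfd> fds;
            for (const auto& h : handlers_) {
                fds.push_back(pollfd{h.first, POLLIN, 0});
            }
            int n = poll(fds.data(), fds.size(), -1);  // -1: wait indefinitely
            if (n < 0) {
                if (errno == EINTR) continue;  // interrupted by a signal
                break;
            }
            for (const pollfd& p : fds) {
                if (!(p.revents & (POLLIN | POLLHUP | POLLERR))) continue;
                auto it = handlers_.find(p.fd);
                if (it == handlers_.end()) continue;  // removed by a callback
                Callback cb = it->second;  // copy: the callback may remove itself
                cb(p.fd);
                if (!running_) break;
            }
        }
    }

private:
    std::map<int, Callback> handlers_;
    bool running_ = false;
};
```

The application only registers callbacks and calls run(); the thread sleeps inside poll() and is woken only when a registered descriptor has data.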
That's a great question. Like nj said, you want to use asynchronous I/O. Too many programs use a polling strategy. It is not uncommon to have 1000 threads running on a system. If all of them were polling, you would have a slow system. Use asynchronous I/O whenever possible.
What about UDP? You don't have to wait in a while loop for every client;
just open one socket on a specified port and call the receive method.

C++ Timers in Unix

We have an API that handles event timers. This API says that it uses OS callbacks to handle timed events (using select(), apparently).
The api claims this order of execution as well:
readable events
writable events
timer events
This works by creating a pointer to a Timer object and passing the create function a callback:
Something along these lines:
Timer* theTimer = Timer::Event::create(timeInterval,&Thisclass::FunctionName);
I was wondering how this worked?
If the operating system is handling the timer itself, how does it actually invoke the callback when the timer fires? Does the callback run in a separate thread of execution?
When I put a pthread_self() call inside the callback function (Thisclass::FunctionName) it appears to have the same thread id as the thread where theTimer is created itself! (Very confused by this)
Also: What does that priority list above mean? What is a writable event vs a readable event vs a timer event?
Any explanation of the use of select() in this scenario is also appreciated.
Thanks!
This looks like a simple wrapper around select(2). The class keeps a list of callbacks, I guess separate for read, write, and timer expiration. Then there's something like a dispatch or wait call somewhere there that packs given file descriptors into sets, calculates minimum timeout, and invokes select with these arguments. When select returns, the wrapper probably goes over read set first, invoking read callback, then write set, then looks if any of the timers have expired and invokes those callbacks. This all might happen on the same thread, or on separate threads depending on the implementation of the wrapper.
You should read up on select and poll - they are very handy.
The general term is IO demultiplexing.
A readable event means that data is available for reading on a particular file descriptor without blocking, and a writable event means that you can write to a particular file descriptor without blocking. These are most often used with sockets and pipes. See the select() manual page for details on these.
A timer event means that a previously created timer has expired. If the library is using select() or poll(), the library itself has to keep track of timers since these functions accept a single timeout. The library must calculate the time remaining until the first timer expires, and use that for the timeout parameter. Another approach is to use timer_create(), or an older variant like setitimer() or alarm() to receive notification via a signal.
You can determine which mechanism is being used at the OS layer using a tool like strace (Linux) or truss (Solaris). These tools trace the actual system calls that are being made by the program.
At a guess, the call to create() stores the function pointer somewhere. Then, when the timer goes off, it calls the function you specified via that pointer. But as this is not a Standard C++ function, you should really read the docs or look at the source to find out for sure.
Regarding your other questions, I don't see mention of a priority list, and select() is a sort of general purpose event multiplexer.
Quite likely there's a framework that works with a typical main loop, the driving force of the main loop is the select call.
select allows you to wait for a file descriptor to become readable or writable (or for an "exception" on the file descriptor) or for a timeout to occur. I'd guess the library also allows you to register callbacks for doing async I/O; if it's a GUI library, it'll get the low-level GUI events via a file descriptor on Unix systems.
To implement timer callbacks in such a loop, you just keep a priority queue of timers and process them on select timeouts or filedescriptor events.
The priority order means that file I/O is processed before the timers. That processing itself takes time: it could trigger GUI updates, which eventually run GUI event handlers, or other tasks that spend time servicing I/O.
The library is more or less doing
for(;;) {
    timeout = calculate_min_timeout();
    ret = select(...,timeout); //wait for a timeout event or filedescriptor events
    if(ret > 0) {
        process_readable_descriptors();
        process_writable_descriptors();
    }
    process_timer_queue(); //scan through a timer priority queue and invoke callbacks
}
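A runnable approximation of such a loop, reduced to timers only, is sketched below. TimerLoop and timer_runs_on_loop_thread are hypothetical names, and poll() with no descriptors stands in for the select call. The point is that nothing in this design needs a second thread or a signal: the callback is an ordinary function call made by whichever thread is running the loop, which would explain seeing the same thread id.

```cpp
#include <poll.h>
#include <chrono>
#include <functional>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

using Clock = std::chrono::steady_clock;

struct Timer {
    Clock::time_point due;
    std::function<void()> callback;
    bool operator>(const Timer& other) const { return due > other.due; }
};

// Hypothetical event loop that handles timers only.
class TimerLoop {
public:
    void add_timer(std::chrono::milliseconds delay, std::function<void()> cb) {
        timers_.push(Timer{Clock::now() + delay, std::move(cb)});
    }

    // Runs until no timers remain.
    void run() {
        while (!timers_.empty()) {
            auto left = std::chrono::duration_cast<std::chrono::milliseconds>(
                timers_.top().due - Clock::now());
            // Round up by 1 ms so the wait does not end just before the due time.
            int timeout_ms =
                left.count() > 0 ? static_cast<int>(left.count()) + 1 : 0;
            poll(nullptr, 0, timeout_ms);  // no descriptors: a pure timed wait
            // Readable and writable callbacks would be dispatched here,
            // before the timers.
            while (!timers_.empty() && timers_.top().due <= Clock::now()) {
                auto cb = timers_.top().callback;
                timers_.pop();
                cb();  // plain function call on the loop's own thread
            }
        }
    }

private:
    std::priority_queue<Timer, std::vector<Timer>, std::greater<Timer>> timers_;
};

// Returns true if a timer callback ran on the thread that called run().
bool timer_runs_on_loop_thread() {
    TimerLoop loop;
    std::thread::id callback_thread;
    loop.add_timer(std::chrono::milliseconds(10),
                   [&] { callback_thread = std::this_thread::get_id(); });
    loop.run();
    return callback_thread == std::this_thread::get_id();
}
```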
Because the thread id inside the timer callback is the same as that of the creator thread, I think that it is implemented somehow using signals.
When a signal is sent to a thread, that thread's state is saved and the signal handler is called, which then calls the event callback.
So the handler is called in the creator thread which is interrupted until the signal handler returns.
Maybe another thread waits for all timers using select() and if a timer expires it sends a signal to the thread the expired timer was created in.