SIGALRM Timeout -- How does it affect existing operations? - c++

I am currently using select() to act as a timer. I have network receive operations which occur within a loop, and every few seconds a processing operation needs to occur.
As a result, the select() statement's timeout constantly changes -- decreasing to zero over time and then restarting at 3.
rv = select(selectmonitor+1, &readnet, NULL, NULL, &helper.timeout());
As things come in on the network, the statement is repeated and the value passed to it by helper.timeout() decreases. Eventually, the value will either be equal to zero or the system will timeout, which will result in the processing function executing. However, I've noticed that this is quite resource intensive -- the value for helper.timeout() must be constantly calculated. When I am trying to receive a few thousand packets a second, the time it takes for this operation to be done results in packet loss.
My new idea is using SIGALRM to resolve this. This would allow me to set a timer once and then react when it is set off. However, I'm confused as to how it will affect my program. When SIGALRM 'goes off', it will run a function which I specify. However, how will it interrupt my existing code? And once the function is done, how will my existing code (within the while statement) resume?
Also, it would appear it's impossible to set the SIGALRM signal to call a function within a class? Is this correct? I imagine I can register a free function, which can in turn call a function within a class.
Thanks for any assistance ahead of time.

Use your SIGALRM handler to set a flag variable.
Use sigaction instead of signal to set your signal handler. Do not set the SA_RESTART flag. With SA_RESTART not set your select statement will be interrupted by the signal. Your select will return -1 and errno will be EINTR.
Since the signal might happen while your other code is executing you will want to check the flag variable too, probably right before going into the select.
I was just reminded that this pattern can result in missing the signal, if it happens just after checking the flag and just before entering the select.
To avoid that, you need to use sigprocmask to block the SIGALRM signal before entering the while loop. Then you use pselect instead of select. By giving pselect a signal mask with SIGALRM unblocked, the signal will only ever interrupt you while you are inside pselect, instead of arriving at any other time.
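A minimal sketch of that flag-plus-pselect pattern (the socket descriptor sockfd, the function name, and the 3-second interval here are placeholders, not code from the question):

#include <csignal>
#include <cerrno>
#include <sys/select.h>
#include <unistd.h>

static volatile sig_atomic_t alarm_fired = 0;
static void on_alarm(int) { alarm_fired = 1; }

void run_loop(int sockfd)
{
    struct sigaction sa = {};
    sa.sa_handler = on_alarm;                 // sa_flags stays 0: no SA_RESTART
    sigemptyset(&sa.sa_mask);
    sigaction(SIGALRM, &sa, nullptr);

    sigset_t block_alrm, during_pselect;
    sigemptyset(&block_alrm);
    sigaddset(&block_alrm, SIGALRM);
    sigprocmask(SIG_BLOCK, &block_alrm, &during_pselect);  // SIGALRM blocked normally
    sigdelset(&during_pselect, SIGALRM);      // ...and unblocked only inside pselect()

    alarm(3);                                 // arm the first 3-second timer

    for (;;) {
        if (alarm_fired) {                    // safe to test: SIGALRM is blocked here
            alarm_fired = 0;
            // periodic processing goes here
            alarm(3);                         // rearm
        }

        fd_set readnet;
        FD_ZERO(&readnet);
        FD_SET(sockfd, &readnet);

        int rv = pselect(sockfd + 1, &readnet, nullptr, nullptr, nullptr, &during_pselect);
        if (rv == -1 && errno == EINTR)
            continue;                         // SIGALRM interrupted pselect; recheck the flag
        if (rv > 0) {
            // handle incoming network data here
        }
    }
}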

Related

Why a new signal for socket::readyRead() is executed, even when its earlier slot is still processing?

According to the following post, an emitted signal is served only once the currently executing slot completes.
Wait for a SLOT to finish the execution with Qt
I have a client-server communication app based on ssl socket, which is single threaded.
connect(socket, &QSslSocket::readyRead, [&]() { myObject.Read(); });
Client & server send each other some custom messages. Whenever a message is sent or received by either, they send ACK bytes (00).
Most of the time, I notice that while Read() is still in the middle of execution, the next readyRead() is served! I put debug statements at the beginning and end of myObject->Read(); they confirm that the beginning debug statement is hit again and again. The same is observed with breakpoints.
When too much data is received, a deep recursive stack of Read() calls builds up. It either slows down the app GUI or crashes it.
Typically this recursion happens when the client attempts to send an ACK as part of myObject->Read(). During that time readyRead() happens to be signalled and also gets served, even though the slot for the previous signal is still being processed.
Questions:
Is it possible for the Qt framework to service a signal while a slot is still midway through execution (single thread)?
How to fix this socket specific scenario?
Note:
- By default for single threads, the Qt::ConnectionType is DirectConnection. I have tried with QueuedConnection as well, but the result is the same.
- myObject.Read() is quite complex and has many other function calls. If this is causing the problem, then let me know what I should look for. It would be impractical to post its actual code.
The recursive calls of readyRead() were happening because the event loop was getting freed up in between. The following functions were causing the event loop to be freed:
QCoreApplication::processEvents()
QSslSocket::flush()
The first is understandable, as that is what it is meant for. But the second, flush(), was a total surprise; its documentation doesn't say so. At least in my debugging, whenever flush() was invoked, the pending readyRead() was invoked and serviced. The same observation appears in the question.
The processEvents() call was meant to make the GUI more responsive while a large amount of data is loading, but it seems we need to make another choice for that.
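Not the route taken above, but a common mitigation for this kind of re-entrancy is a guard flag, so that nested readyRead() deliveries fall through to the outermost Read(). A rough sketch, with m_socket and m_readInProgress as assumed members:

#include <QSslSocket>
#include <QByteArray>

class MyObject {
public:
    explicit MyObject(QSslSocket* s) : m_socket(s) {}
    void Read();
private:
    QSslSocket* m_socket;
    bool m_readInProgress = false;
};

void MyObject::Read()
{
    if (m_readInProgress)        // nested readyRead() while flush()/processEvents() runs:
        return;                  // bail out; the outer call below will consume the data
    m_readInProgress = true;

    while (m_socket->bytesAvailable() > 0) {
        QByteArray chunk = m_socket->readAll();
        // parse chunk, send ACKs, etc.
    }

    m_readInProgress = false;
}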

signal handling when process is waiting for another process to terminate

I am just trying to understand the concept of signal handling from the perspective of kernel mode and user mode for a running process.
PROCESS-1 ----------------------> PROCESS-3
(parent process) <----------------
      ^            process-3 sends signals to process-1
      |            (SIGPIPE for communication, or SIGSTOP, or SIGKILL)
      |
      |  process-1 waits for child process-2
      |  using waitpid()
      |
      v
PROCESS-2 (waiting for a resource, page fault happened, etc.)
(child process)
I want to know how the kernel delivers the signal from process-3 to process-1, given that process-1 is waiting for process-2 to finish. I would also like to know more about the user/kernel interaction during the signal handling scenario (PCB, resources, open file descriptors, etc.). Please explain it in this context.
Any help is appreciated!
The kernel doesn't really care that process-1 is "waiting for process-2 to finish" (in particular it's not interested in "why" it's in the state it is, merely that it is in some state: in this case, idling in the kernel waiting for some event). For typical[1] caught signals, the signal-sender essentially just sets some bit(s) in the signal-receiver's process/thread state, and then if appropriate, schedules that process/thread to run so that it can see those bits. If the receiver is idling in the kernel waiting for some event, that's one of the "schedule to run" cases. (Other common situations include: the receiver is in STOP state, where it stays stopped except for SIGCONT signals; or, the receiver is running in user mode, where it is set up to transition to kernel mode so as to notice the pending signals.)
Both SIGKILL and SIGSTOP cannot be caught or ignored, so, no, you cannot provide a handler for these. (Normally processes are put into stop state via SIGTSTP, SIGTTIN, or SIGTTOU, all of which can be caught or ignored.)
If system calls are set to restart after a user signal handler returns (via the SA_RESTART flag of sigaction()), this is achieved by setting up the "return address" for the sigreturn() operation to, in fact, make the system call over again. That is, if process-1 is in waitpid(), the sequence of operations (from process-1's point of view) from the point of the initial waitpid(), through receiving a caught signal s, and back to more waiting, is:
1. system call: waitpid()
2. put self to sleep waiting for an event
3. awakened: check for awakening event
4. event is signal and signal is caught, so:
5. set new signal mask per sigaction() settings (see sigaction())
6. push signal frame on a stack (see SA_ONSTACK and sigaltstack())
7. set up for user code (program counter) to enter at "signal trampoline"
8. return to user code (into trampoline)
(At this point process-1 is back in user mode. The remaining steps are not numbered because I can't make SO start at 9. :-) )
call user handler routine (still on stack chosen above)
when user routine returns, execute sigreturn() system call,
using the frame stored at setup time, possibly modified
by user routine
(At this point the process enters kernel mode, to execute sigreturn() system call)
system call: sigreturn(): set signal mask specified by sigreturn() argument
set other registers, including stack pointer(s) and
program counter, as specified by sigreturn() arguments
return to user code
(the program is now back in user mode, with registers set up to enter waitpid)
system call: waitpid()
At this point the process returns to the same state it had before it received the caught signal: waitpid puts it to sleep waiting for an event (step 2). Once awakened (step 3), either the event it was waiting for has occurred (e.g., the process being waitpid()-ed is done) and it can return normally, or another caught signal has occurred and it should repeat this sequence, or it is being killed and it should clean up, or whatever.
This sequence is why some system calls (such as some read()-like system calls) will "return early" if interrupted by a signal: they've done something irreversible between the "first" entry into the kernel and the time the signal handler is to be run. In this case, the signal frame pushed at step 6 must not have a program-counter value that causes the entire system call to restart. If it did, the irreversible work done before the process went to sleep would be lost. So, it is instead set up to return to the instruction that detects a successful system call, with the register values set up to return the short read() count, or whatever.
When system calls are set up not to restart (SA_RESTART is not set), the signal frame pushed in step 6 is also different. Instead of returning to the instruction that executes the system call, it returns to the instruction that detects a failed system call, with the register values set up to indicate an EINTR error.
(Often, but not always, these are the same instruction, e.g., a conditional branch to test for success/fail. In my original SPARC port, I made them different instructions in most cases. Since leaf routines return to %o6+8 with no register or stack manipulation, I just set a bit indicating that a successful return should return to the leaf routine's return address. So most system calls were just "put syscall number and ret-on-success flag into %g1, then trap to kernel, then jump-to-error-handling because the system call must have failed if we got here.")
[1] Versus queued signals.
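As a small stand-alone illustration of the SA_RESTART difference described above (a toy program, not code from the question): with sa_flags left at 0, a blocking read() fails with EINTR when SIGALRM arrives; with SA_RESTART set, the read() is transparently restarted after the handler returns.

#include <cstdio>
#include <cerrno>
#include <csignal>
#include <unistd.h>

static void handler(int) { /* nothing to do; the point is to interrupt the syscall */ }

int main()
{
    struct sigaction sa = {};
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                   // change to SA_RESTART to see the other behaviour
    sigaction(SIGALRM, &sa, nullptr);

    alarm(2);                          // deliver SIGALRM in two seconds

    char buf[64];
    ssize_t n = read(STDIN_FILENO, buf, sizeof buf);   // blocks waiting for terminal input
    if (n < 0 && errno == EINTR)
        std::printf("read() was interrupted: EINTR\n");
    else
        std::printf("read() returned %zd\n", n);
    return 0;
}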

Executing new task based on sigchld() from previous task

I'm currently in the process of building a small shell within C++.
A user may enter a job at the prompt such as exe1 && exe2 &. Similar to the BASH shell, I will only execute exe2 if exe1 exits successfully. In addition, the entire job must be performed in the background (as specified by the trailing & operator).
Right now, I have a jobManager which handles execution of jobs and a job structure which contains the job's executable and their individual arguments / conditions. A job is started by calling fork() and then calling execvp() with the proper arguments. When a job ends, I have a signal handler for SIGCHLD, in which I perform wait() to determine which process has just ended. When exe1 ends, I observe its exit code and make a determination as to whether I should proceed to launch exe2.
My concern is how do I launch exe2. I am concerned that if I use my jobManager start function from the context of my SIGCHLD handler, I could end up with too many SIGCHLD handler functions hanging out on the stack (if there were 10 conditional executions, for instance). In addition, it just doesn't seem like a good idea to be starting the next execution from the signal handler, even if it is occurring indirectly. (I tried doing something similar 1.5 years ago when I was just learning about signal handling -- I seem to recall it failing on me).
All of the above needs to be able to occur in the background and I want to avoid having the jobManager sitting in a busy wait just waiting for exe1 to return. I would also prefer to not have a separate thread sitting around just waiting to start the execution of another process. However, instructing my jobManager to begin execution of the next process from the SIGCHLD handler seems like poor code.
Any feedback is appreciated.
I see two ways:
1) Replace your signal handler with a loop that calls sigwait() (see man 3 sigwait), and handle the signal inside that loop.
2) Before you start, create a pipe, and in the main loop of your program use select() on the pipe's read end to wait for events. In the signal handler, write to the pipe, and handle the situation from the main loop.
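A minimal sketch of option 2, the pipe-based approach; jobManager.startNextJob() is a hypothetical call standing in for wherever the next execution is actually started:

#include <csignal>
#include <sys/select.h>
#include <sys/wait.h>
#include <unistd.h>

static int sigchld_pipe[2];                    // [0] read end, [1] write end

static void on_sigchld(int)
{
    char byte = 1;
    (void)write(sigchld_pipe[1], &byte, 1);    // write() is async-signal-safe
}

void main_loop()
{
    pipe(sigchld_pipe);

    struct sigaction sa = {};
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGCHLD, &sa, nullptr);

    for (;;) {
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(sigchld_pipe[0], &rfds);

        if (select(sigchld_pipe[0] + 1, &rfds, nullptr, nullptr, nullptr) <= 0)
            continue;                          // e.g. EINTR: just retry

        char byte;
        (void)read(sigchld_pipe[0], &byte, 1); // consume one notification

        int status;
        while (waitpid(-1, &status, WNOHANG) > 0) {
            if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
                // exe1 exited successfully: start exe2 here, in normal (non-handler) context
                // jobManager.startNextJob();  // hypothetical
            }
        }
    }
}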
Hmmm that's a good one.
What about forking twice, once per process? The first one runs, and the second one stops. In the parent SIGCHLD handler, send a SIGCONT to the second child, if appropriate, which then goes off and runs the job. Naturally, you SIGKILL the second one if the first one shouldn't run, which should be safe because you won't really have set anything up.
How does that sound? You'll have a process sitting around doing nothing, but it shouldn't be for very long.
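Roughly what that could look like (error handling omitted; argv1 and argv2 are assumed to hold the argument vectors for exe1 and exe2):

#include <csignal>
#include <unistd.h>

// Returns the pid of exe1; *second_out receives the pid of the parked exe2.
pid_t launch_pair(char* const argv1[], char* const argv2[], pid_t* second_out)
{
    pid_t second = fork();
    if (second == 0) {
        raise(SIGSTOP);              // park here until the parent decides
        execvp(argv2[0], argv2);     // only reached after SIGCONT
        _exit(127);
    }

    pid_t first = fork();
    if (first == 0) {
        execvp(argv1[0], argv1);
        _exit(127);
    }

    *second_out = second;
    return first;
}

// Later, once exe1 has been reaped and its exit status is known:
//   success:  kill(second, SIGCONT);   // exe2 now runs
//   failure:  kill(second, SIGKILL);   // exe2 never executed anything
// Note: unless the SIGCHLD handler is installed with SA_NOCLDSTOP, the parent
// will also receive a SIGCHLD when the second child stops itself.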

C++ Timers in Unix

We have an API that handles event timers. This API says that it uses OS callbacks to handle timed events (using select(), apparently).
The API claims this order of execution as well:
readable events
writable events
timer events
This works by creating a pointer to a Timer object, passing the create function a function callback:
Something along these lines:
Timer* theTimer = Timer::Event::create(timeInterval,&Thisclass::FunctionName);
I was wondering how this worked?
The operating system is handling the timer itself, but when it sees that the timer has fired, how does it actually invoke the callback? Does the callback run in a separate thread of execution?
When I put a pthread_self() call inside the callback function (Thisclass::FunctionName), it appears to have the same thread id as the thread where theTimer itself was created! (Very confused by this.)
Also: What does that priority list above mean? What is a writable event vs a readable event vs a timer event?
Any explanation of the use of select() in this scenario is also appreciated.
Thanks!
This looks like a simple wrapper around select(2). The class keeps a list of callbacks, I guess separate for read, write, and timer expiration. Then there's something like a dispatch or wait call somewhere there that packs given file descriptors into sets, calculates minimum timeout, and invokes select with these arguments. When select returns, the wrapper probably goes over read set first, invoking read callback, then write set, then looks if any of the timers have expired and invokes those callbacks. This all might happen on the same thread, or on separate threads depending on the implementation of the wrapper.
You should read up on select and poll - they are very handy.
The general term is IO demultiplexing.
A readable event means that data is available for reading on a particular file descriptor without blocking, and a writable event means that you can write to a particular file descriptor without blocking. These are most often used with sockets and pipes. See the select() manual page for details on these.
A timer event means that a previously created timer has expired. If the library is using select() or poll(), the library itself has to keep track of timers since these functions accept a single timeout. The library must calculate the time remaining until the first timer expires, and use that for the timeout parameter. Another approach is to use timer_create(), or an older variant like setitimer() or alarm() to receive notification via a signal.
You can determine which mechanism is being used at the OS layer using a tool like strace (Linux) or truss (Solaris). These tools trace the actual system calls that are being made by the program.
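For the signal-based alternative mentioned above, a periodic timer via setitimer() looks roughly like this (the one-second period is arbitrary):

#include <csignal>
#include <sys/time.h>
#include <unistd.h>

static void on_tick(int) { /* set a flag here; do the real work outside the handler */ }

int main()
{
    struct sigaction sa = {};
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    sigaction(SIGALRM, &sa, nullptr);

    struct itimerval period = {};
    period.it_value.tv_sec = 1;       // first expiry after one second
    period.it_interval.tv_sec = 1;    // then repeat every second
    setitimer(ITIMER_REAL, &period, nullptr);

    for (;;)
        pause();                      // returns each time the SIGALRM handler has run
}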
At a guess, the call to create() stores the function pointer somewhere. Then, when the timer goes off, it calls the function you specified via that pointer. But as this is not a Standard C++ function, you should really read the docs or look at the source to find out for sure.
Regarding your other questions, I don't see mention of a priority list, and select() is a sort of general purpose event multiplexer.
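A guess at the general shape, purely for illustration (this is not the actual library): create() stores a callable, and whatever drives the select() loop invokes it on expiry. A member function pointer like &Thisclass::FunctionName would additionally need an object bound to it, e.g. via a lambda.

#include <functional>
#include <utility>

class Timer {
public:
    using Callback = std::function<void()>;

    static Timer* create(int intervalMs, Callback cb)
    {
        Timer* t = new Timer;
        t->m_interval = intervalMs;
        t->m_callback = std::move(cb);
        // a real library would also register t with its select() loop here
        return t;
    }

    void fire() const                 // called by the event loop when the timer expires
    {
        if (m_callback)
            m_callback();
    }

private:
    int m_interval = 0;
    Callback m_callback;
};

// e.g.  Timer* theTimer = Timer::create(1000, [&]{ someObject.FunctionName(); });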
Quite likely there's a framework that works with a typical main loop; the driving force of the main loop is the select call.
select allows you to wait for a file descriptor to become readable or writable (or for an "exception" on the file descriptor), or for a timeout to occur. I'd guess the library also allows you to register callbacks for doing async I/O; if it's a GUI library, it'll get the low-level GUI events via a file descriptor on unixes.
To implement timer callbacks in such a loop, you just keep a priority queue of timers and process them on select timeouts or filedescriptor events.
The priority means it processes the file I/O before the timers; the I/O handling itself takes time and could result in GUI updates, eventually resulting in GUI event handlers being run, or in other tasks spending time servicing I/O.
The library is more or less doing
for (;;) {
    timeout = calculate_min_timeout();
    ret = select(..., timeout);  // wait for a timeout or for file descriptor events
    if (ret > 0) {
        process_readable_descriptors();
        process_writable_descriptors();
    }
    process_timer_queue();       // scan the timer priority queue and invoke due callbacks
}
Because the thread id inside the timer callback is the same as that of the creator thread, I think it is implemented somehow using signals.
When a signal is sent to a thread, that thread's state is saved and the signal handler is called, which then calls the event callback.
So the handler is called in the creator thread, which is interrupted until the signal handler returns.
Maybe another thread waits for all timers using select() and if a timer expires it sends a signal to the thread the expired timer was created in.

Callbacks and Delays in a select/poll loop

One can use poll/select when writing a server that can service multiple clients all in the same thread. select and poll, however, need a file descriptor to work. For this reason, I am uncertain how to perform simple asynchronous operations, like implementing a simple callback to break up a long-running operation, or a delayed callback, without exiting the select/poll loop. How does one go about doing this? Ideally, I would like to do this without resorting to spawning new threads.
In a nutshell, I am looking for a mechanism with which I can perform ALL asynchronous operations. The Windows WaitForMultipleObjects or Symbian TRequestStatus seem much better suited to generalized asynchronous operations.
For arbitrary callbacks, maintain a POSIX pipe (see pipe(2)). When you want to do a deferred call, write a struct consisting of a function pointer and optional context pointer to the write end. The read end is just another input for select. If it selects readable, read the same struct, and call the function with the context as argument.
For timed callbacks, maintain a list in order of due time. Entries in the list are structs of e.g. { due time (as interval since previous callback); function pointer; optional context pointer }. If this list is empty, block forever in select(). Otherwise, timeout when the first event is due. Before each call to select, recalculate the first event's due time.
Hide the details behind a reasonable interface.
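For example, the deferred-call half might look something like this; defer() and on_pipe_readable() are names made up here for the sketch:

#include <unistd.h>

struct Deferred {
    void (*fn)(void*);    // function to run later, from the select loop
    void* ctx;            // context handed back to it
};

static int defer_pipe[2];             // set up once at startup with pipe(defer_pipe)

void defer(void (*fn)(void*), void* ctx)
{
    Deferred d = { fn, ctx };
    (void)write(defer_pipe[1], &d, sizeof d);   // small pipe writes are atomic
}

// Call this when select() reports defer_pipe[0] readable.
void on_pipe_readable()
{
    Deferred d;
    if (read(defer_pipe[0], &d, sizeof d) == (ssize_t)sizeof d)
        d.fn(d.ctx);                  // the deferred call runs in the loop, not in the caller
}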
select() and poll() are syscalls - that means your program is asking the OS kernel to do something, and your program can do nothing while waiting for the kernel to return, unless you use another thread.
Although select() and poll() are used for async I/O, these functions (syscalls) are not themselves async - they will block (unless you specify a timeout) until something has happened on a descriptor you are watching.
The best strategy would be to check the descriptors from time to time (specifying a small timeout value): if there is nothing to do, do whatever you want to do in the idle time, otherwise process the I/O.
You could take advantage of the timeout of select() or poll() to do your background stuff periodically:
for ( ;; ) {
    ...
    int fds = select(<fds and timeout>);
    if (fds < 0) {
        <error occurred>
    } else if (fds == 0) {
        <handle timeout, do some background work>
    } else {
        <handle the active file descriptors>
    }
}
For an immediate callback using the select loop, one can use one of the special files like /dev/zero that are always ready. This will allow select to return right away, while still allowing other files to become active as well.
For timed delays, I can only think of using the timeout on select.
Both of the above don't feel great, so please send better answers.