Executing new task based on sigchld() from previous task - c++

I'm currently in the process of building a small shell within C++.
A user may enter a job at the prompt such as exe1 && exe2 &. Similar to the BASH shell, I will only execute exe2 if exe1 exits successfully. In addition, the entire job must be performed in the background (as specified by the trailing & operator).
Right now, I have a jobManager which handles execution of jobs and a job structure which contains the job's executable and their individual arguments / conditions. A job is started by calling fork() and then calling execvp() with the proper arguments. When a job ends, I have a signal handler for SIGCHLD, in which I perform wait() to determine which process has just ended. When exe1 ends, I observe its exit code and make a determination as to whether I should proceed to launch exe2.
My concern is how do I launch exe2. I am concerned that if I use my jobManager start function from the context of my SIGCHLD handler, I could end up with too many SIGCHLD handler functions hanging out on the stack (if there were 10 conditional executions, for instance). In addition, it just doesn't seem like a good idea to be starting the next execution from the signal handler, even if it is occurring indirectly. (I tried doing something similar 1.5 years ago when I was just learning about signal handling -- I seem to recall it failing on me).
All of the above needs to be able to occur in the background and I want to avoid having the jobManager sitting in a busy wait just waiting for exe1 to return. I would also prefer to not have a separate thread sitting around just waiting to start the execution of another process. However, instructing my jobManager to begin execution of the next process from the SIGCHLD handler seems like poor code.
Any feedback is appreciated.

I see two ways:
1) Replace your signal handler with a loop that calls sigwait() (see man 3 sigwait) on a blocked SIGCHLD, and handle the child's exit in that loop.
2) Before starting, create a pipe, and in the main loop of your program use select() on the pipe's read end to wait for events. In the signal handler, write to the pipe; in the main loop, handle the event.
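The second option (often called the self-pipe trick) could look roughly like the sketch below. All names in it (g_wakeupPipe, onSigchld, spawn, runChain) are made up for illustration, and a real shell would also watch stdin in the same select() call instead of blocking on the pipe alone.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <signal.h>
#include <sys/select.h>
#include <sys/wait.h>
#include <unistd.h>

// All names here are hypothetical; they are not from the question's code.
static int g_wakeupPipe[2] = {-1, -1};

// The handler only writes one byte; write() is async-signal-safe.
extern "C" void onSigchld(int) {
    int savedErrno = errno;
    char byte = 1;
    ssize_t ignored = write(g_wakeupPipe[1], &byte, 1);
    (void)ignored;
    errno = savedErrno;
}

// fork() + execvp(); returns the child's PID, or -1 if fork() failed.
static pid_t spawn(char* const argv[]) {
    pid_t pid = fork();
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);  // only reached if exec failed
    }
    return pid;
}

static int exitCodeOf(int status) {
    return WIFEXITED(status) ? WEXITSTATUS(status) : 128 + WTERMSIG(status);
}

// Runs "first && second". Returns the exit code of the last command that ran,
// or -1 on an internal error.
int runChain(char* const first[], char* const second[]) {
    if (pipe(g_wakeupPipe) != 0) return -1;
    fcntl(g_wakeupPipe[0], F_SETFL, O_NONBLOCK);
    fcntl(g_wakeupPipe[1], F_SETFL, O_NONBLOCK);

    struct sigaction sa = {};
    struct sigaction old = {};
    sa.sa_handler = onSigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_NOCLDSTOP;
    sigaction(SIGCHLD, &sa, &old);

    pid_t current = spawn(first);
    bool secondStarted = false;
    int result = -1;
    while (current > 0) {
        fd_set readable;
        FD_ZERO(&readable);
        FD_SET(g_wakeupPipe[0], &readable);  // a real shell would add stdin here
        if (select(g_wakeupPipe[0] + 1, &readable, nullptr, nullptr, nullptr) < 0) {
            if (errno == EINTR) continue;  // interrupted by the signal itself
            break;
        }
        char drain[64];
        while (read(g_wakeupPipe[0], drain, sizeof drain) > 0) {}

        // Reap outside the handler, in ordinary main-loop context.
        int status = 0;
        pid_t pid;
        while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
            if (pid != current) continue;
            result = exitCodeOf(status);
            if (!secondStarted && result == 0) {
                secondStarted = true;
                current = spawn(second);  // the next job starts here, not in the handler
                if (current < 0) result = -1;
            } else {
                current = -1;
            }
        }
    }
    sigaction(SIGCHLD, &old, nullptr);
    close(g_wakeupPipe[0]);
    close(g_wakeupPipe[1]);
    return result;
}
```

Because the handler never calls back into the job manager, no handler frames pile up on the stack regardless of how many conditional commands are chained.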

Hmmm that's a good one.
What about forking twice, once per process? The first one runs, and the second one stops. In the parent SIGCHLD handler, send a SIGCONT to the second child, if appropriate, which then goes off and runs the job. Naturally, you SIGKILL the second one if it shouldn't run (because the first one failed), which should be safe because you won't really have set anything up.
How does that sound? You'll have a process sitting around doing nothing, but it shouldn't be for very long.
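A rough sketch of this idea, assuming the second child parks itself with raise(SIGSTOP) before calling exec. The function name runWithParkedChild is invented here, and for brevity the sketch waits for the first child synchronously, whereas a shell would take that decision once it learns about the exit through SIGCHLD.

```cpp
#include <signal.h>
#include <sys/wait.h>
#include <unistd.h>

static int codeOf(int status) {
    return WIFEXITED(status) ? WEXITSTATUS(status) : 128 + WTERMSIG(status);
}

// Hypothetical helper: runs `first`, then lets the parked child exec `second`
// only if `first` exited with 0. Returns the exit code of the last command run.
int runWithParkedChild(char* const first[], char* const second[]) {
    pid_t parked = fork();
    if (parked < 0) return -1;
    if (parked == 0) {
        raise(SIGSTOP);             // stay stopped until SIGCONT (or SIGKILL)
        execvp(second[0], second);
        _exit(127);
    }
    // Make sure the child has really stopped, so SIGCONT cannot arrive too early.
    int status = 0;
    waitpid(parked, &status, WUNTRACED);

    pid_t running = fork();
    if (running == 0) {
        execvp(first[0], first);
        _exit(127);
    }
    if (running < 0) {
        kill(parked, SIGKILL);
        waitpid(parked, nullptr, 0);
        return -1;
    }
    waitpid(running, &status, 0);
    int firstCode = codeOf(status);

    if (firstCode != 0) {
        kill(parked, SIGKILL);      // nothing was set up yet, so this is harmless
        waitpid(parked, nullptr, 0);
        return firstCode;
    }
    kill(parked, SIGCONT);          // release the second job
    waitpid(parked, &status, 0);
    return codeOf(status);
}
```

Note that a stopped child also raises SIGCHLD in the parent unless the handler is installed with SA_NOCLDSTOP.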

Related

Linux best practice to start and watch another process

In my process I need to start/restart another process.
Currently I use a thread with a tiny stack size and the following code:
void startAndMonitorA()
{
    while(true)
    {
        system("myProcess");
        LOG("myProcess crashed");
        usleep(1000 * 1000);
    }
}
I feel like that's not best practice. I have no idea about the resources the std::system() call is blocking or wasting. I'm on an embedded Linux - so in general I try to care about resources.
One problematic piece is restarting immediately: if the child process fails to start, that is going to cause 100% CPU usage. It may be a transient error in the child process (e.g. it cannot connect to a server). It may be a good idea to add at least a one-second pause before trying to restart.
What system call does on Linux is:
1. Sets up signals SIGINT and SIGQUIT to be ignored.
2. Blocks signal SIGCHLD.
3. fork()
4. Child process calls exec() shell, passing the command line to the shell.
5. Parent process calls waitpid() that blocks the thread till the child process terminates.
6. Parent process restores its signal dispositions.
If you were to re-implement the functionality of system you would probably omit step 5 (along with steps 1, 2 and 6) to avoid blocking the thread and rely on SIGCHLD to get notified when the child process has terminated and needs to be restarted.
In other words, the bare minimum would be to set up a signal handler for SIGCHLD and call fork and exec.
The code as shown would be adequate for most circumstances. If you really care about resource usage, you should be aware that you are starting (and keeping around) a thread for each process you are monitoring. If your program has an event loop anyway, that kind of thing can be avoided at the cost of some additional effort (and an increase in complexity).
Implementing this would entail the following:
1. Instead of calling system(), use fork() and exec() to start the external program. Store its PID in a global table.
2. Set a SIGCHLD handler that notifies the event loop of the exit of a child, e.g. by writing a byte to a pipe monitored by the event loop.
3. When a child exits, run waitpid with the WNOHANG flag in a loop that runs for as long as there are children to reap. waitpid() will return the PID of the child that exited, so that you know to remove its PID from the table, and to schedule a timeout that restarts it.
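The PID table and the reaping loop from the steps above might be sketched like this. ChildTable, startChild and reapExited are hypothetical names, and the event-loop notification and the restart timer are left out; reapExited() is meant to be called from the event loop after it has been woken up by SIGCHLD.

```cpp
#include <map>
#include <string>
#include <sys/wait.h>
#include <unistd.h>
#include <vector>

// Hypothetical table: PID of each monitored child -> command used to start it.
using ChildTable = std::map<pid_t, std::string>;

// Starts `command` through /bin/sh and records its PID. Returns -1 if fork() failed.
pid_t startChild(ChildTable& table, const std::string& command) {
    pid_t pid = fork();
    if (pid == 0) {
        execl("/bin/sh", "sh", "-c", command.c_str(), static_cast<char*>(nullptr));
        _exit(127);  // only reached if exec failed
    }
    if (pid > 0) table[pid] = command;
    return pid;
}

// Reaps every child that has exited so far without blocking, removes it from
// the table, and returns the commands that should be scheduled for a restart.
std::vector<std::string> reapExited(ChildTable& table) {
    std::vector<std::string> toRestart;
    int status = 0;
    pid_t pid;
    // One SIGCHLD may stand for several exits, hence the loop.
    while ((pid = waitpid(-1, &status, WNOHANG)) > 0) {
        auto it = table.find(pid);
        if (it == table.end()) continue;  // not one of ours
        toRestart.push_back(it->second);
        table.erase(it);
    }
    return toRestart;
}
```

The caller decides when to restart, for example by arming a one-second timer per returned command, which keeps the back-off policy out of the reaping code.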

QProcess becomes defunct and unable to start again

I'm using a List of QProcess objects to keep track of some processes that need to be started/stopped at user-defined intervals.
I'm able to start and stop the processes OK. But the issue arises when I stop a process using the following methods (Pseudo code):
process->start("PathToProcess","Some Arguments");
//Do some stuff.
process->terminate();
However, if I try to start the process again at another time, I get the error:
QProcess::start: Process is already running
I can do a ps -ef|grep processName and find that it is indeed dead, but it's sitting in a defunct state which I think is preventing me from starting it again.
What do I need to do to prevent this defunct state, or remove the defunct process so I can start my process again without reconstruction?
Figured out what was causing the error.
In qprocess_unix.cpp, you'll find a class called QProcessManager. Essentially this class has signal handlers that watch for child processes that have died. When a child dies, the QProcessManager sends a message across a pipe that lets the QProcess class know that it terminated/died.
In an unrelated part of my code, I had set up some signal-catching statements that I used for various purposes. However, these signal catches were catching my SIGCHLD event and thus the QProcessManager was never being triggered to tell the QProcess through the pipe that it died.
In my case, my only options are to either watch for the death of the child manually or to remove the signal catching I'm performing in my other sections of code.
For future reference, if you have this problem, you may be better off doing POSIX calls for kills and terminates, and checking the return value of those calls manually. On success, do the following (setProcessState() is a protected member, so this has to run inside a QProcess subclass):
process->setProcessState(ProcessState::NotRunning);//Specify the process is no longer running
waitpid(process->pid(),NULL,WNOHANG); //Clear the defunct process.
Thanks all.
Call process->waitForFinished() after calling process->terminate() in order to reap the zombie process. Then you can reuse the process object.

How to get .gcda files when the process is killed?

I have a binary built with -fprofile-arcs and -ftest-coverage. The binary is run by a process monitor which spawns the process as a child process. Then, when I want the process to exit, I have to go through the process monitor. It sends a SIGKILL to the process. I found out that .gcda files are not generated in this case. What can I do?
EDIT: Actually the process monitor first tries to make the process exit. However, the ProcessMonitor library (used in each process) calls _exit instead of exit when the user issues a command to stop the process. This is the cause of all trouble.
This might work:
http://nixcraft.com/coding-general/12544-gcov-g.html
In summary: call __gcov_flush() in the program, possibly in a signal handler or periodically during execution.
If the code is C++, remember to make an extern "C" declaration of the function.
Also remember to use some kind of preprocessor ifdef so that the program does not call it when not built with profiling.
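Put together, these hints might look like the sketch below. ENABLE_COVERAGE_FLUSH is a made-up macro that you would define only in profiling builds, and the other names are invented too. The handler only sets a flag and the flush happens from normal code, since I am not sure __gcov_flush() is safe to call directly inside a signal handler. Depending on the GCC version the function may be named differently (newer releases appear to offer __gcov_dump() instead), so check your toolchain.

```cpp
#include <signal.h>

// ENABLE_COVERAGE_FLUSH is a hypothetical macro: define it only when building
// with -fprofile-arcs -ftest-coverage, so normal builds never reference gcov.
#ifdef ENABLE_COVERAGE_FLUSH
extern "C" void __gcov_flush(void);
#endif

static volatile sig_atomic_t g_stopRequested = 0;

// Handler does nothing but set a flag.
extern "C" void onStopSignal(int) { g_stopRequested = 1; }

// Catch SIGTERM (SIGKILL cannot be caught).
void installStopHandler() {
    struct sigaction sa = {};
    sa.sa_handler = onStopSignal;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGTERM, &sa, nullptr);
}

bool stopRequested() { return g_stopRequested != 0; }

// Writes out the coverage counters in profiling builds; a no-op otherwise.
void flushCoverage() {
#ifdef ENABLE_COVERAGE_FLUSH
    __gcov_flush();
#endif
}
```

The main loop would check stopRequested() periodically, call flushCoverage(), and then leave through exit() or a normal return from main().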
SIGKILL is a "hard" kill signal, that cannot be caught by the application. Therefore, the app has no chance to write out the .gcda file.
I see two options:
1. Catch signals other than SIGKILL: any sensible process monitor should send a SIGTERM first. init and the batch managers I've encountered do this. SIGKILL is a last resort, so it should be sent only after SIGTERM followed by a grace period.
2. Workaround: run the program via an intermediate program that gets the SIGKILL; have the actual program check periodically (or in a separate thread) if its parent still lives, and if not, have it exit gracefully.
Afaik compilers (IntelC too) only store profiling stats in an exit handler.
So what about somehow telling the process to quit, instead of killing it?
Like adding a SIGTERM handler maybe, with exit() in it? (SIGKILL itself cannot be caught.)

Simplest way to interrupt a thread that is blocked on running a process

I need to execute some commands via "/bin/sh" from a daemon. Sometimes these commands take too long to execute, and I need to somehow interrupt them. The daemon is written in C++, and the commands are executed with std::system(). I need the stack cleaned up so that destructors are called when the thread dies. (Catching the event in a C++ exception-handler would be perfect).
The threads are created using boost::thread. Unfortunately, neither boost::thread::interrupt() nor pthread_cancel() is useful in this case.
I can imagine several ways to do this, from writing my own version of system(), to finding the child's process-id and signal() it. But there must be a simpler way?
Any command executed using the system command is executed in a new process. Unfortunately system halts the execution of the calling thread until the new process completes. If the sub process hangs, the caller hangs as well.
The way to get round this is to use fork to create a new process and call one of the exec calls to execute the desired command. Your main process can then wait on the child process's Process Id (pid). The timeout can be achieved by generating a SIGALRM using the alarm call before the wait call.
If the sub process times out you can kill it using the kill command. Try first with SIGTERM; if that fails you can try again with SIGKILL, which will certainly kill the child process.
Some more information on fork and exec can be found here
I did not try boost::process, as it is not part of boost. I did however try ACE_Process, which showed some strange behavior (the time-outs sometimes worked and sometimes did not work). So I wrote a simple std::system replacement that polls for the status of the running process (effectively removing the problems with process-wide signals and alarms in a multithreaded process). I also use boost::this_thread::sleep(), so that boost::thread::interrupt() should work as an alternative or in addition to the time-out.
Stackoverflow.com does not work very well with my Firefox under Debian (in fact, I could not reply at all, I had to start Windows in a VM) or Opera (in my VM), so I'm unable to post the code in a readable manner. My prototype (before I moved it to the actual application) is available here: http://www.jgaa.com/files/ExternProcess.cpp
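The linked prototype is not reproduced here, but a polling replacement for system() along the lines described in this answer could be sketched as follows. runWithTimeout and its parameters are invented for this example, and std::this_thread::sleep_for stands in for boost::this_thread::sleep(), so it is not an interruption point for boost::thread::interrupt().

```cpp
#include <cerrno>
#include <chrono>
#include <signal.h>
#include <sys/wait.h>
#include <thread>
#include <unistd.h>

// Hypothetical replacement for std::system() with a timeout.
// Returns the command's exit code, 128 + signal number if it was killed,
// or -1 on error. `timedOut` reports whether the deadline was hit.
int runWithTimeout(const char* command, std::chrono::milliseconds timeout, bool& timedOut) {
    using Clock = std::chrono::steady_clock;
    timedOut = false;
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        execl("/bin/sh", "sh", "-c", command, static_cast<char*>(nullptr));
        _exit(127);  // only reached if exec failed
    }
    auto deadline = Clock::now() + timeout;
    bool termSent = false;
    int status = 0;
    for (;;) {
        pid_t r = waitpid(pid, &status, WNOHANG);  // poll, never block
        if (r == pid) break;
        if (r < 0 && errno != EINTR) return -1;
        if (Clock::now() >= deadline) {
            timedOut = true;
            // First ask politely, then force after a one-second grace period.
            kill(pid, termSent ? SIGKILL : SIGTERM);
            termSent = true;
            deadline = Clock::now() + std::chrono::seconds(1);
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : 128 + WTERMSIG(status);
}
```

Because it only polls its own child's PID, this avoids alarm() and SIGALRM entirely, which matters when other threads in the daemon are running. The signal goes to the shell, so a command that spawns further processes may leave those running.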
You can try to look at Boost.Process:
Where is Boost.Process?
I have been waiting for a long time for such a class.
If you are willing to use Qt, a nice portable solution is QProcess:
http://doc.trolltech.com/4.1/qprocess.html
Of course, you can also make your own system-specific solution like Let_Me_Be suggests.
Anyway you'd probably have to get rid of the system() function call and replace it by a more powerful alternative.

How to check if a process is running or got segfaulted or terminated in linux from its pid in my main() in c++

I am starting several processes in my main() and I can get the PIDs of those processes. Now I want to wait until all of these processes have finished and then clear the shared memory block from my parent process. Also, if any of the processes has not finished but has segfaulted, I want to kill that process. So how can I check, from the PIDs in my parent process code, whether a process finished without any error or broke down because of a runtime error or any other cause, so that I can kill that process?
Also what if I want to see the status of some other process which is not a child process but its pid is known.
Code is appreciated (I am not looking for a script but for code).
Look into waitpid(2) with WNOHANG option. Check the "fate" of the process with macros in the manual page, especially WIFSIGNALED().
Also, a segfaulted process is already dead (unless SIGSEGV is specifically handled by the process, which is usually not a good idea.)
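A small sketch of how those macros are typically used. The helper names pollChild and describeStatus are made up for this example.

```cpp
#include <string>
#include <sys/types.h>
#include <sys/wait.h>

// Non-blocking check: returns true and fills `status` once the child has
// terminated; returns false while it is still running (or on error).
bool pollChild(pid_t pid, int& status) {
    return waitpid(pid, &status, WNOHANG) == pid;
}

// Turns a wait status into a readable description of the child's fate.
std::string describeStatus(int status) {
    if (WIFEXITED(status)) {
        return "exited with code " + std::to_string(WEXITSTATUS(status));
    }
    if (WIFSIGNALED(status)) {
        // A segfault shows up here with WTERMSIG(status) == SIGSEGV.
        return "killed by signal " + std::to_string(WTERMSIG(status));
    }
    return "stopped or continued";
}
```

A parent can call pollChild() for each PID it started and only release the shared memory once every child has been reported as terminated.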
From your updates, it looks like you also want to check on other processes, which are not children of your current process.
You can look at /proc/{pid}/status to get an overview of what a process is currently doing. Its state is going to be one of:
Running
Stopped
Sleeping
Disk (D) sleep (I/O bound, uninterruptible)
Zombie
However, once a process dies (fully, unless zombied) so does its entry in /proc. There's no way to tell if it exited successfully, segfaulted, caught a signal that could not be handled, or failed to handle a signal that could be handled. Not unless its parent logs that information somewhere.
It sounds like you're writing a watchdog for other processes that you did not start, rather than keeping track of child processes.
If a program segfaults, you won't need to kill it. It's dead already.
Use the wait and waitpid calls to wait for children to finish and check the status for some idea of how they exited. See here for details on how to use these functions. Note especially the WIFSIGNALED and WTERMSIG macros.
Call waitpid() from a SIGCHLD handler to catch the moment when the application terminates. Note that if you start multiple processes you have to loop on waitpid() with WNOHANG until it returns 0 (children remain, but none has exited) or -1 (no children left).
Use kill() with signal 0 to check whether the process is still running. IIRC zombies still qualify as processes, so you have to have a proper SIGCHLD handler for that to work.