Keep forked process alive if parent/child exits abnormally (C++) - c++

I am trying to execute another command line process in parallel with the current process. However, I realize that the command line program sometimes abnormally exits, and that kills my main program as well.
// MAIN PROGRAM
pid = fork();
char *argv[] = { stuff.. };
if (pid == 0) {
int rc = execv("command line program...", argv);
}
// DO OTHER STUFF HERE.
if (pid > 0) {
waitpid(pid, 0, 0);
}
Is there any way to keep my main program running after the command line program dies abnormally? Thanks!
[UPDATE]: Yes, the main process is writing to a file that the command line program reads from, but it is a normal file, not a pipe. I receive a segfault.
It is extremely hard for me to reproduce the bug, since the child process does not crash very often. But it does happen. Randomly crashing is a known bug in the command line program, which is why I want to keep my main program alive even if the command line dies.

In your real code do you have an else here:
if (pid == 0) {
int rc = execv("command line program...", argv);
// possibly more child stuff
}
else {
// parent stuff
}
It's always a good idea to post real code when asking questions here.

Use vfork rather than fork to avoid unnecessary process cloning.
Make sure you don't crash when SIGCHLD is received by the parent process.
Use a proper if/else statement to make it clear which code executes in the parent process and which in the child process. For example, if execv fails, it is very likely that both the child and the parent will execute the code at the // DO OTHER STUFF HERE. comment.
Finally, use gdb. It will tell you where the crash occurs.
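The points about if/else and a failing execv can be sketched roughly as follows. This is a minimal sketch, not the asker's real code: runAndReport and its return convention are hypothetical names introduced for illustration. The child calls _exit if execv fails, so it can never fall through into the parent's code, and the parent inspects the wait status to tell a normal exit from death by a signal such as SIGSEGV.

```cpp
#include <cerrno>
#include <csignal>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not from the question): runs argv[0] and reports how it ended.
// Returns the exit status (0..255) after a normal exit, the negated signal number
// if the child was killed by a signal (e.g. -SIGSEGV), or -1000 on fork/wait failure.
int runAndReport(char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1000;                  // fork failed; no child was created
    }
    if (pid == 0) {
        execv(argv[0], argv);          // returns only if the exec failed
        _exit(127);                    // never fall through into the parent's code
    }
    // Only the parent reaches this point.
    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) return -1000;   // retry only when interrupted by a signal
    }
    if (WIFEXITED(status))   return WEXITSTATUS(status);
    if (WIFSIGNALED(status)) return -WTERMSIG(status);  // child crashed or was killed
    return -1000;
}
```

With this shape, a crash in the child shows up in the parent only as a negative return value; the parent itself keeps running.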

Related

Interprocess communication, reading from multiple children stdout

I'm trying to write a custom shell-like program, where multiple commands can be executed concurrently. For a single command this is not much complicated. However, when I try to concurrently execute multiple commands (each one in a separate child) and capture their stdout I'm having a problem.
What I have tried so far: in my shell application I have two functions to run the commands concurrently. execute() takes multiple commands and fork()s a child process for each of them; subprocess() takes one command and executes it.
void execute(std::vector<std::string> cmds) {
int fds[2];
pipe(fds);
std::pair<pid_t, int> sp;
for (int i = 0; i < cmds.size(); i++) {
std::pair<pid_t, int> sp = this->subprocess(cmds[i], fds);
}
// wait for all children
while (wait(NULL) > 0);
close(sp.second);
}
std::pair<pid_t, int> subprocess(std::string &cmd, int *fds) {
std::pair<pid_t, int> process = std::make_pair(fork(), fds[0]);
if (process.first == 0) {
close(fds[0]); // no reading
dup2(fds[1], STDIN_FILENO);
close(fds[1]);
char *argv[] = {"/bin/sh", "-c", cmd.data(), NULL};
execvp(argv[0], argv);
exit(0);
}
close(fds[1]); // only reading
return process;
}
The problem is that when I execute multiple commands in my custom shell (not diving into specifics here, but it will call execute() at some point) and use STDIN_FILENO as above to capture the child processes' stdout, the captured output keeps being written to the shell's stdin forever. For example,
if the input commands are
echo im done, yet?
echo nope
echo maybe
then, in the writing-to-STDIN_FILENO case, the output looks like this (where >>> is my marker for user input):
im done, yet?
nope
maybe
>>> nope
maybe
im done, yet?
>>> im done, yet?
nope
maybe
In the writing-to-STDOUT_FILENO case, it seems to ignore one of the commands (probably the first child), and I'm not sure why:
maybe
nope
>>> maybe
nope
>>> nope
maybe
>>> maybe
nope
>>> nope
Potential causes I have thought of: in my shell I'm using std::cin >> ... for user input in a while loop, which may somehow conflict with the stdin case. On the other hand, in the main process (parent) I'm waiting for all children to exit, so perhaps the children are somehow not exiting, but a child should die off after execvp, right? Moreover, I close the reading end in the main process with close(sp.second). At this point, I'm not sure why this happens.
Should I not use pipe() for something like this? If I used a temp file to redirect the child process's stdout, would everything be fine? If so, can you please explain why?
There are multiple, fundamental, conceptual problems in the shown code.
std::pair<pid_t, int> sp;
This declares a new std::pair object. So far so good.
std::pair<pid_t, int> sp = this->subprocess(cmds[i], fds);
This declares a new std::pair object inside the for loop. It just happens to have the same name as the sp object at the function scope, but it's a different object that has nothing to do with it whatsoever. That's how C++ works: when you declare an object inside an inner scope (inside an if statement, a for loop, or anything else enclosed in another pair of { ... }), you end up declaring a new object. Whether its name happens to match a name declared in an enclosing scope is immaterial. It's a new object.
// wait for all children
while (wait(NULL) > 0);
close(sp.second);
There are two separate problems here.
For starters, if we've been paying attention: this sp object was never assigned the result of subprocess(). It is still the default-constructed pair, whose members are value-initialized to 0, so close(sp.second) closes file descriptor 0 (standard input) rather than the pipe.
If the goal here is to read from the children, that part is completely missing, and it should be done before waiting for the child processes to exit. If, as the described goal suggests, the child processes are going to write to this pipe, then the pipe must be read from. Otherwise the pipe's internal buffer, which is limited, fills up, and the child processes block, waiting for the pipe to be read from. But the parent process is waiting for the child processes to exit, so everything will hang.
Finally, it is also unclear why the pipe's file descriptor is getting passed to the same function, only to return a std::pair with the same file descriptor. The std::pair serves no useful purpose in the shown code, so it's likely that there's also more code that's not shown here, where this is put to use.
At least all of the above problems must be fixed in order for the shown code to work correctly. If there's other code that's not shown, it may or may not have additional issues, as well.
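One way to arrange the fixes described above is sketched here, under assumptions: runAll is a hypothetical name, error handling is minimal, and all children share one pipe, so their output arrives in whatever order they happen to write. Each child redirects its standard output (not standard input) to the pipe's write end, and the parent closes its own copy of the write end, reads until end-of-file, and only then waits for the children.

```cpp
#include <cerrno>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs every command through /bin/sh concurrently and
// returns everything the children wrote to stdout.
std::string runAll(const std::vector<std::string> &cmds) {
    int fds[2];
    if (pipe(fds) != 0) return {};

    for (const std::string &cmd : cmds) {
        pid_t pid = fork();
        if (pid == 0) {
            close(fds[0]);                  // child never reads from the pipe
            dup2(fds[1], STDOUT_FILENO);    // child's stdout now feeds the pipe
            close(fds[1]);
            execl("/bin/sh", "sh", "-c", cmd.c_str(), static_cast<char *>(nullptr));
            _exit(127);                     // exec failed
        }
        // A failed fork (pid < 0) is simply skipped in this sketch.
    }

    close(fds[1]);  // parent must drop its write end, or read() never sees EOF

    std::string out;
    char buf[4096];
    for (;;) {
        ssize_t n = read(fds[0], buf, sizeof buf);
        if (n > 0) out.append(buf, static_cast<std::size_t>(n));
        else if (n < 0 && errno == EINTR) continue;
        else break;                         // EOF (all write ends closed) or error
    }
    close(fds[0]);

    while (wait(nullptr) > 0) {}            // reap the children after draining the pipe
    return out;
}
```

Because the pipe is drained before wait() is called, a child that produces more output than the pipe buffer holds does not block forever.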

Modern C++ way of starting and terminating a linux program

I have a C++ program that, based on user input, needs to start and stop a given Linux program.
I've simplified the logic I'm currently using in the following code:
int pid = fork();
if (pid == -1)
{
//Handle error
}
else if (pid == 0)
{
execlp("my_program", nullptr);
exit(0);
}
else
{
//Main program stuff
if(/* user selects close "my_program"*/)
{
kill(pid, SIGTERM);
}
//Other main program stuff
}
Everything is working, but I was wondering if there were any other approaches, maybe more in the modern C++ style, that could be used in this situation.
Thanks in advance.
Take a look at boost::process.
There are several ways a process can be spawned. In your example, you fork the process, but you don't close the cloned file descriptors in the child process. That is an often forgotten step for those who call fork, and it can lead to port-binding problems (port/address already in use) or other hard-to-debug issues.
Even boost didn't do that quite correctly for some time.
The modern way would be to use a library like Boost.Process https://www.boost.org/doc/libs/1_78_0/doc/html/process.html or Qt https://doc.qt.io/qt-5/qprocess.html.
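For the point about cloned file descriptors, one common technique (swapped in here as an illustration, not something the answers spell out) is to mark descriptors close-on-exec, so the kernel closes them in the child when exec succeeds. A minimal sketch, where setCloseOnExec is a hypothetical helper name:

```cpp
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: marks fd so that it is closed automatically when this
// process (or a forked child) calls one of the exec functions.
// Returns false if fd is not a valid open descriptor.
bool setCloseOnExec(int fd) {
    int flags = fcntl(fd, F_GETFD);         // read the current descriptor flags
    if (flags == -1) return false;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) != -1;
}
```

Calling this on a listening socket right after creating it means a later fork plus exec no longer leaves a stray copy of the socket open in the spawned program.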

Better way to monitor and kill other program's stalled process in linux?

I need my program to run some other program, but if the other program won't return within some time limit, I need to kill it. I came up with the following solution that seems to be working.
int main()
{
int retval, timeout=10;
pid_t proc1=fork();
if(proc1>0)
{
while(timeout)
{
waitpid(proc1, &retval, WNOHANG);
if(WIFEXITED(retval)) break; //normal termination
sleep(1);
--timeout;
if(timeout==0)
{
printf("attempt to kill process\n");
kill(proc1, SIGTERM);
break;
}
}
}
else if(proc1==0)
{
execlp("./someprogram", "./someprogram", "-a", "-b", NULL);
}
//else if fork failed etc.
return 0;
}
I need my program to be as robust as possible but I am new to programming under linux so I may not be aware of possible problems with it. My questions are:
1. Is this a proper solution to this particular problem or are there better methods?
2. Does anyone see possible problems or bugs that can lead to an unexpected behavior or a leak of system resources?
(WIFEXITED(retval)) won't return true if the program is killed by a signal (including say a crash due to segmentation violation).
Probably best to just check whether waitpid returns the child's PID (with WNOHANG it returns 0 while the child is still running). That will only happen once the program has terminated (whether voluntarily or not).
Depending on how important it is to make sure the process is gone...
After killing the process with SIGTERM, you could sleep another second or so and if it's still not gone, use SIGKILL to be sure.
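Putting these suggestions together might look like the sketch below. It is an illustration under assumptions: runWithTimeout, reapedWithin, and the 100 ms polling interval are hypothetical choices, not part of the question's code. The parent checks waitpid's return value instead of WIFEXITED, sends SIGTERM when the time limit expires, and falls back to SIGKILL if the child is still there after a grace period.

```cpp
#include <csignal>
#include <time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Polls every 100 ms for up to `seconds`; returns true once the child is reaped.
static bool reapedWithin(pid_t pid, int seconds, int &status) {
    for (int i = 0; i < seconds * 10; ++i) {
        if (waitpid(pid, &status, WNOHANG) == pid) return true;
        timespec delay{0, 100 * 1000 * 1000};
        nanosleep(&delay, nullptr);
    }
    return waitpid(pid, &status, WNOHANG) == pid;
}

// Hypothetical helper: runs argv[0] (searched in PATH) with a time limit.
// Returns the raw wait status of the child, or -1 on fork/wait failure.
int runWithTimeout(char *const argv[], int timeoutSeconds, int graceSeconds) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                          // exec failed
    }
    int status = 0;
    if (reapedWithin(pid, timeoutSeconds, status)) return status;

    kill(pid, SIGTERM);                      // polite request first
    if (reapedWithin(pid, graceSeconds, status)) return status;

    kill(pid, SIGKILL);                      // cannot be caught or ignored
    return waitpid(pid, &status, 0) == pid ? status : -1;
}
```

The caller can apply WIFEXITED, WIFSIGNALED, and WTERMSIG to the returned status to see whether the program finished on its own or had to be killed. Reaping the child in every path also avoids leaving a zombie behind.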

Is there a way for my win32 program to tell that the child process it launched has crashed (and not just exited)?

I wrote a multi-platform C++ class that launches a user-specified child process, and lets the user communicate with the child process's stdin/stdout, wait for the child process to exit, and so on.
In the Unix/POSIX implementation of this class, I've just added a feature that lets the caller find out whether the child process's exit was due to an unhandled signal (i.e. a crash):
bool ChildProcessDataIO :: WaitForChildProcessToExit(bool & retDidChildProcessCrash)
{
int status = 0;
int pid = waitpid(_childPID, &status, 0);
if (pid == _childPID)
{
retDidChildProcessCrash = WIFSIGNALED(status);
return true;
}
else return false; // error, couldn't get child process's status
}
... and now I'd like to add similar functionality to the Windows implementation, which currently looks like this:
bool ChildProcessDataIO :: WaitForChildProcessToExit(bool & retDidChildProcessCrash)
{
bool ret = (WaitForSingleObject(_childProcess, INFINITE) == WAIT_OBJECT_0);
if (ret)
{
/* TODO: somehow set (retDidChildProcessCrash) here */
}
return ret;
}
... but I haven't figured out how to set (retDidChildProcessCrash) to the appropriate value using the Win32 API.
Is there some way to do this, or do I just need to put a note in my documentation that this feature isn't currently implemented under Windows?
Arrange for the child to communicate with the parent to indicate completion. A shared event would be one way. If the process terminates and the parent has not received notification of success then it can conclude that the child failed.
Another option might be to use the process exit code. It will be zero on success, assuming the child follows the usual conventions, and a crash will lead to an error code indicating the form of the crash, according to this question: Predictable exit code of crashed process in Windows?
This is less reliable, though. A process might terminate because TerminateProcess was called on it with a zero exit code. So if you control both processes, the first approach is safer. If you don't control the child process, then the exit code might be your best shot. There's not much else you can get from a process handle of a terminated process.
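The exit-code approach could be sketched as below. This is a heuristic and an assumption, not a documented guarantee: a process that dies from an unhandled exception typically exits with the exception code, and those codes are NTSTATUS values with error severity (top two bits set), such as 0xC0000005 for an access violation. The function name exitCodeLooksLikeCrash is hypothetical, and the check itself is plain portable C++; on Windows the value would come from GetExitCodeProcess, as hinted in the comment.

```cpp
#include <cstdint>

// Heuristic: treats exit codes with NTSTATUS "error" severity (bits 31 and 30
// both set) as a crash. A child that deliberately exits with such a value,
// for example exit(-1), would be misclassified.
bool exitCodeLooksLikeCrash(std::uint32_t exitCode) {
    return (exitCode & 0xC0000000u) == 0xC0000000u;
}

// On Windows, after WaitForSingleObject has returned WAIT_OBJECT_0, the code
// could be obtained roughly like this (not compiled here):
//   DWORD code = 0;
//   if (GetExitCodeProcess(_childProcess, &code))
//       retDidChildProcessCrash = exitCodeLooksLikeCrash(code);
```

As the answer notes, this cannot distinguish a crash from a TerminateProcess call that passes an arbitrary exit code, so an explicit success notification from the child remains the safer design when both programs are under your control.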

I'm confused how this execvp() is handled in this sample function which uses fork() to clone a process

I have the following function from a book titled "Advanced Linux Programming".
int spawn (char* program, char** arg_list)
{
pid_t child_pid;
/* Duplicate this process. */
child_pid = fork ();
if (child_pid != 0)
/* This is the parent process. */
return child_pid;
else {
/* Now execute PROGRAM, searching for it in the path. */
execvp (program, arg_list);
/* The execvp function returns only if an error occurs. */
fprintf (stderr, "an error occurred in execvp\n");
abort ();
}
}
But I'm confused that, in cases where ls is executed successfully, the error is not printed, but in case it fails, it prints the error which is put in the line following it.
My Question
This line fprintf (stderr, "an error occurred in execvp\n"); is after the execvp() function, and it is expected to be executed after the execution of execvp() finishes, but it is not the case, and it is executed only if execvp() encounters an error. It seems the function spawn() finishes as soon as it executes execvp() successfully. Am I right?
You can have a look at the manpage for execvp, it says:
The exec() family of functions replaces the current process image with
a new process image.
So, what does that mean? It means that if execvp succeeds, your program won't be in memory anymore, so it will never reach the error message. Your program in memory will be replaced by the new program (in your case ls, if I understood it correctly).
So, if your program is able to reach the error message printout, then the execvp function has failed. Otherwise the other program starts executing.
The reason your program is still running is the fork call, which creates a copy of the process, so you have two identical processes running, of which only one will be replaced by the command you try to execute. This is achieved by the if clause if (child_pid != 0): fork duplicates the process and returns the new process ID (PID) in the parent and 0 in the child. If the return value is 0 (see man 3 fork), then it's the new child process; if it's != 0, it's the parent process. Your function only executes execvp if it's the child process; the parent process takes the early return.
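A small experiment makes this behaviour observable. In this sketch (lineAfterExecRan and the marker value 42 are hypothetical choices for illustration), the statement after execvp exits with a distinctive status, so the parent can tell whether that statement was ever reached.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: returns true only if the code placed after execvp ran
// in the child, which happens only when execvp failed.
bool lineAfterExecRan(const char *program) {
    pid_t pid = fork();
    if (pid < 0) return false;               // fork failed; nothing was tested
    if (pid == 0) {
        char *const args[] = {const_cast<char *>(program), nullptr};
        execvp(program, args);               // on success, this process image is replaced
        _exit(42);                           // reached only if execvp returned (an error)
    }
    int status = 0;
    waitpid(pid, &status, 0);                // parent: wait for the child to finish
    return WIFEXITED(status) && WEXITSTATUS(status) == 42;
}
```

Calling it with a program that exists, such as true, yields false because the child became that program and exited with its status; calling it with a path that does not exist yields true because execvp returned and the next line ran.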