Linux, waitpid, WNOHANG and zombies - c++

I need to be able to:
fork a process and make it execvp (I did that)
check if the child process execvp was successful (don't know how)
check if the child process finished (having problems)
I'm forking a process and I don't have any way to check whether the child's execvp worked or failed. If it failed, I need to know that it failed. Currently I'm using
-1 != waitpid( pid, &status, WNOHANG )
But it seems that if the execvp in the child process fails, waitpid does not return -1.
How could I check that? I read the waitpid man page, but it isn't clear to me; maybe my English isn't good enough.
EDIT: in order to explain more:
I'm building my own terminal for a homework assignment. I get a command string as input, let's say "ls", and then I have to execute the command.
After the parent forks, the child calls execvp to execute the command (after I parse the string), and the parent needs to check whether there was a '&' at the end of the command.
If the sign '&' does not exist at the end of the command, then the parent needs to wait for the child to execute.
So I need to know whether the execvp failed. If it didn't fail, the parent uses waitpid to wait for the child to finish its execution. If it failed, the parent will not wait for the child.

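A common solution to #2 is to open a pipe prior to the fork(), mark its write end close-on-exec, and then write to it in the child after the exec call returns (which only happens on failure). In the parent, a successful read means the exec failed; a read that returns 0 (end of file) means the exec succeeded, the write end was closed automatically, and the write never took place.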
// ignoring all errors except from execvp...
int execpipe[2];
pipe(execpipe);
fcntl(execpipe[1], F_SETFD, fcntl(execpipe[1], F_GETFD) | FD_CLOEXEC);
if(fork() == 0)
{
    close(execpipe[0]);
    execvp(...); // on success, never returns
    write(execpipe[1], &errno, sizeof(errno));
    // doesn't matter what you exit with
    _exit(0);
}
else
{
    close(execpipe[1]);
    int childErrno;
    if(read(execpipe[0], &childErrno, sizeof(childErrno)) == sizeof(childErrno))
    {
        // exec failed, now we have the child's errno value
        // e.g. ENOENT
    }
}
This lets the parent unambiguously know whether the exec was successful, and as a byproduct what the errno value was if unsuccessful.
If the exec was successful, the child process may still fail with a nonzero exit code, and examining the status with the WEXITSTATUS macro gives you that condition as well.
NOTE: Calling waitpid with the WNOHANG flag is nonblocking, and you may need to poll the process until a valid pid is returned.
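The polling mentioned in the note could look roughly like the sketch below. The helper name waitWithPolling, the 10 ms sleep interval, and the "128 + signal number" return convention are assumptions made for illustration; a real shell would do useful work between polls instead of sleeping.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cerrno>

// Hypothetical helper: polls a child with WNOHANG until it terminates.
// Returns the exit code (0-255), 128 + signal number if the child was
// killed by a signal, or -1 if waitpid() itself failed.
int waitWithPolling(pid_t pid)
{
    int status = 0;
    for (;;) {
        pid_t rc = waitpid(pid, &status, WNOHANG);
        if (rc == pid)
            break;                  // the child has terminated
        if (rc == -1 && errno != EINTR)
            return -1;              // real error, e.g. ECHILD
        // rc == 0: the child is still running. This sketch just sleeps
        // briefly (assumed interval) before asking again.
        usleep(10 * 1000);
    }
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return -1;
}
```

Note that waitpid returning 0 here is not an error; it only means the child has not changed state yet.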

An exec call shouldn't return at all if it succeeds, because it replaces the current process image with another one, so if it does it means an error has occurred:
execvp(...);
/* exec failed and you should exit the
child process here with an error */
exit(errno);
To let the parent process know if exec failed you should read the status of the child process:
waitpid(pid, &status, WNOHANG);
And then use the WEXITSTATUS(status) macro, from the man page:
WEXITSTATUS(status) returns the exit status of the child. This
consists of the least significant 8 bits of the status argument that
the child specified in a call to exit(3) or _exit(2) or as the argument for a return statement in main()
Note that the last statement means that if exec succeeds and runs the command, you will get the exit status of that command's main() function. In other words, you can't reliably tell the difference between a failed exec and a failed command this way, so it depends on whether that matters to you.
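If an exit-code convention is good enough, one option is to borrow the one POSIX shells use: 127 when the command was not found and 126 when it was found but could not be executed. The sketch below assumes that convention (it is not part of the answer above), and runAndWait is a hypothetical helper name. The ambiguity described above remains, because a command that runs successfully may itself exit with 126 or 127.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cerrno>

// Hypothetical helper: runs file with argv, waits, and returns the exit
// code. By the assumed shell convention, 127 means "command not found"
// and 126 means "found but could not be executed".
// Returns -1 on fork/wait failure or if the child was killed by a signal.
int runAndWait(const char* file, char* const argv[])
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        execvp(file, argv);
        // Reached only if execvp failed; errno says why.
        _exit(errno == ENOENT ? 127 : 126);
    }

    int status = 0;
    pid_t rc;
    do {
        rc = waitpid(pid, &status, 0);   // blocking wait, no WNOHANG
    } while (rc == -1 && errno == EINTR);

    if (rc != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The child uses _exit rather than exit so that stdio buffers inherited from the parent are not flushed a second time.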
Another issue:
if the sign '&' does not exist at the end of the command then the
parent need to wait for the child to execute.
You need to call wait() on the child process at some point in your program, regardless of the &, to avoid leaving the child process in a zombie state.
Note: WNOHANG means that waitpid() returns immediately if no child has changed state, i.e. it does not block. I assume you know that; otherwise use wait(), or call waitpid() as part of your main loop.
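For background commands, calling waitpid() as part of the main loop might be sketched like this. The helper name reapFinishedChildren is hypothetical; the idea is to call it once per prompt iteration so that finished background children do not stay around as zombies.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <cerrno>

// Hypothetical helper: collects every child that has already terminated,
// without blocking. Returns how many children were reaped.
int reapFinishedChildren()
{
    int reaped = 0;
    int status = 0;
    for (;;) {
        // -1 means "any child"; WNOHANG makes the call non-blocking.
        pid_t pid = waitpid(-1, &status, WNOHANG);
        if (pid > 0) {
            ++reaped;               // one zombie collected, look for more
            continue;
        }
        if (pid == -1 && errno == EINTR)
            continue;               // interrupted by a signal, retry
        // pid == 0: children exist but none has finished yet.
        // pid == -1 with ECHILD: there are no children at all.
        break;
    }
    return reaped;
}
```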

Related

A weird bug in my code is causing some strange results, I believe it's due to my use of fork

There are a few different strange results that result from different types of input. First off, I'm building a simple linux shell, and below I show some example i/o
$
$
$
$ ls -l /
$ $ exit
so the first thing you probably notice is the double $. This happens whenever I have entered something into the prompt and not simply left it blank. Second, it appears to have exited properly, as it returns control back to my terminal... or does it? I really don't know, but as I'm in my terminal, if I simply press enter, this pops up in my terminal.
finn-and-jake#CandyKingom:~/Desktop/OS/hw2$ terminate called after throwing an instance of 'std::out_of_range'
what(): basic_string::at
I'm not 100% sure what's causing this or how to fix it, but I have a hunch that it has something to do with fork, and I believe that's also what's causing my extra $. There's another issue: when I enter input as I did above, but with some empty input between the initial command and the exit, the program does not completely close out. An example is provided below.
$
$
$ ls -l /
$ $
$
$
$
$
$
$
$ exit
$ exit
Finally, there's another issue whose cause I'm not sure of: the program runs in an infinite loop that I can't force quit, and it crashes my operating system (Ubuntu 14.04).
In an attempt to keep the code minimal, I'm only including the method that I suspect to be the cause of this. If any more than that is requested I will include it in an edit.
void Shell::interpreteInput() {
    if (commandQueue.empty()) {
        return;
    };
    if (commandQueue.at(0) == "exit") {
        exit = true;
        return;
    };
    pid_t pid = fork();
    if (pid < 0) {
        cerr << "Fork unsuccessful\n";
        return;
    };
    if (commandQueue.size() >= 3 && commandQueue.at(commandQueue.size() - 2) == ">") {
        //commandQueue.at(commandQueue.size() - 1) is fileName open
        //commandQueue.at(0) is name of program to run exec
        //remaining substrings are args
    };
    //commandQueue.at(0) is name of program to run exec
    // remaining substrings are args
};
Edit (response to first question in comments): In the child process, execute the given program, passing it the given arguments (if any). If the program is a bare name (i.e., it does not contain any slashes), search the PATH for the executable. If the line has form 1 (my fourth if statement) —where the output is to be redirected—open (create or overwrite) a file with the given path, and redirect the program’s output to that file. (See detailed instructions below.)
• If output is to be redirected but the file cannot be opened, display an error message and return to Step 1.
• If the given program cannot be executed (exec fails), display an error message and return to Step 1.
After fork(), there is a check for a fork error, but otherwise both parent and child process do the same thing afterwards. You probably want to diverge code paths: parent does one thing and child does another.
Traditionally, a shell parent process waits for the child process to complete, unless there is an & indicating that the parent should not wait. The child then assembles the command pipeline and execs the command(s).
You need to make sure you're using waitpid() or one of the related wait functions in the parent process. When fork() returns successfully (not -1), there are two processes running. In the parent, fork() returns the PID of the child process; in the child, it returns 0. So you need code like this:
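pid_t pid = fork();
if (pid == -1) {
    // handle error
} else if (pid == 0) {
    // do child process stuff
} else {
    // do parent process stuff
    int status, rc;
    do {
        rc = waitpid(pid, &status, 0);
        // retry only if a signal interrupted the call (errno needs <cerrno>);
        // looping on !WIFEXITED(status) would never end if the child
        // was killed by a signal
    } while (rc == -1 && errno == EINTR);
    // if rc == pid, inspect status with WIFEXITED / WIFSIGNALED
}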

Is there a way for my win32 program to tell that the child process it launched has crashed (and not just exited)?

I wrote a multi-platform C++ class that launches a user-specified child process, and lets the user communicate with the child process's stdin/stdout, wait for the child process to exit, and so on.
In the Unix/POSIX implementation of this class, I've just added a feature that lets the caller find out whether the child process's exit was due to an unhandled signal (i.e. a crash):
bool ChildProcessDataIO :: WaitForChildProcessToExit(bool & retDidChildProcessCrash)
{
    int status = 0;
    int pid = waitpid(_childPID, &status, 0);
    if (pid == _childPID)
    {
        retDidChildProcessCrash = WIFSIGNALED(status);
        return true;
    }
    else return false; // error, couldn't get child process's status
}
... and now I'd like to add similar functionality to the Windows implementation, which currently looks like this:
bool ChildProcessDataIO :: WaitForChildProcessToExit(bool & retDidChildProcessCrash)
{
    bool ret = (WaitForSingleObject(_childProcess, INFINITE) == WAIT_OBJECT_0);
    if (ret)
    {
        /* TODO: somehow set (retDidChildProcessCrash) here */
    }
    return ret;
}
... but I haven't figured out how to set (retDidChildProcessCrash) to the appropriate value using the Win32 API.
Is there some way to do this, or do I just need to put a note in my documentation that this feature isn't currently implemented under Windows?
Arrange for the child to communicate with the parent to indicate completion. A shared event would be one way. If the process terminates and the parent has not received notification of success then it can conclude that the child failed.
Another option might be to use the process exit code. It will be zero on success, assuming the child follows the usual conventions, and a crash will lead to an exit code indicating the form of the crash, according to this question: Predictable exit code of crashed process in Windows?
This is less reliable, though. A process might terminate because TerminateProcess was called on it with a zero exit code. So if you control both processes, the first approach is safer. If you don't control the child process, then the exit code might be your best shot. There's not much else you can get from the process handle of a terminated process.
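The exit-code approach could be sketched as a small heuristic. It assumes the exit code has already been read with GetExitCodeProcess() after the wait succeeded, and that a crashed process reports an NTSTATUS exception code with "error" severity (top two bits set), such as 0xC0000005 for an access violation. The helper name exitCodeLooksLikeCrash is hypothetical, and the check stays a guess: a child can return such a value deliberately, or be terminated with any code.

```cpp
#include <cstdint>

// Hypothetical helper: guesses whether an exit code reported for a
// finished child looks like an unhandled-exception code.
// Assumption: crash codes are NTSTATUS values whose two top bits are set
// (severity "error"), e.g. 0xC0000005 for an access violation.
bool exitCodeLooksLikeCrash(std::uint32_t exitCode)
{
    return (exitCode & 0xC0000000u) == 0xC0000000u;
}
```

In the Windows implementation, the result of this check could be stored in retDidChildProcessCrash once the exit code is known.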

I'm confused how this execvp() is handled in this sample function which uses fork() to clone a process

I have the following function from a book titled "Advanced Linux Programming".
int spawn (char* program, char** arg_list)
{
    pid_t child_pid;
    /* Duplicate this process. */
    child_pid = fork ();
    if (child_pid != 0)
        /* This is the parent process. */
        return child_pid;
    else {
        /* Now execute PROGRAM, searching for it in the path. */
        execvp (program, arg_list);
        /* The execvp function returns only if an error occurs. */
        fprintf (stderr, "an error occurred in execvp\n");
        abort ();
    }
}
But I'm confused: when ls is executed successfully, the error is not printed, but when it fails, the error message on the following line is printed.
My Question
The line fprintf (stderr, "an error occurred in execvp\n"); comes after the call to execvp(), so I would expect it to run once execvp() finishes. That is not the case: it runs only if execvp() encounters an error. It seems the function spawn() finishes as soon as it executes execvp() successfully. Am I right?
You can have a look at the manpage for execvp, it says:
The exec() family of functions replaces the current process image with
a new process image.
So, what does that mean? It means that if execvp succeeds, your program won't be in memory anymore, so it won't ever reach the error message. Your program in memory will be replaced by the new program (in your case ls, if I understood correctly).
So, if your program is able to reach the error message printout, then the execvp function has failed. Otherwise the other program starts execution.
The reason your program is still running is the fork call, which creates a copy of the process, so you have two identical processes running, of which only one is replaced by the command you try to execute. This is achieved by the if clause if (child_pid != 0): fork duplicates the process and returns the new process ID (PID) in the parent. If the return value is 0 (see man 2 fork), this is the new child process; if it is != 0, it is the parent process. Your function only executes execvp in the child process; the parent process takes the early return.

Does pclose() return pipe's termination status shifted left by eight bits on all platforms?

I found on Centos4 that the man page for popen() states in part:
DESCRIPTION
The pclose() function shall close a stream that was opened by popen(), wait for the command to
terminate, and return the termination status of the process that was running the command language
interpreter. However, if a call caused the termination status to be unavailable to pclose(), then pclose()
shall return -1 with errno set to [ECHILD] to report this situation.
However, in my C++ application, when I actually execute the code, I see that the termination status is shifted left by 8 bits. Perhaps this is to distinguish a -1 from the pipe's termination status from pclose()'s own exit status of -1?
Is this portable behavior? Why doesn't the man page mention this? If not portable, which platforms conform to this behavior?
Just to add some code to shooper's answer above, you may want to do something along the lines of this:
#include <sys/wait.h>
//Get your exit code...
int status=pclose(pipe);
//...and ask how the process ended to clean up the exit code.
if(WIFEXITED(status)) {
    //If you need to do something when the pipe exited, this is the time.
    status=WEXITSTATUS(status);
}
else if(WIFSIGNALED(status)) {
    //If you need to add something if the pipe process was terminated, do it here.
    status=WTERMSIG(status);
}
else if(WIFSTOPPED(status)) {
    //If you need to act upon the process stopping, do it here.
    status=WSTOPSIG(status);
}
Other than that, add elegance as needed.
If you think about it, there is a "fork" in there, so you may want to "WIFEXITED" and "WEXITSTATUS".
From the man page:
The pclose() function waits for the associated process to terminate and returns the exit status of the command as returned by wait4(2).
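To see the decoding end to end, a small self-contained sketch follows. The helper name shellExitCode is hypothetical; it relies only on the wait macros rather than on any particular bit layout of the status value, which is what keeps it portable.

```cpp
#include <stdio.h>
#include <sys/wait.h>

// Hypothetical helper: runs a shell command through popen(), discards its
// output, and returns the command's exit code. Returns -1 if popen() or
// pclose() fails or if the command did not exit normally.
int shellExitCode(const char* command)
{
    FILE* pipe = popen(command, "r");
    if (pipe == nullptr)
        return -1;

    // Drain the output so the command is never blocked on a full pipe.
    char buffer[256];
    while (fgets(buffer, sizeof buffer, pipe) != nullptr) {
    }

    int status = pclose(pipe);      // raw wait status, not the exit code
    if (status == -1)
        return -1;
    if (WIFEXITED(status))
        return WEXITSTATUS(status); // decoded exit code, 0-255
    return -1;
}
```

A call such as shellExitCode("exit 3") yields 3 regardless of how the platform packs the raw status.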

execl with wget, child process, unix - why does it not work

So I am trying to execute wget in a separate child process which I am duplicating with fork as follows:
int child;
pid_t child = fork();
if ( child == 0 ) { // no errors
    bool done = false; // set to false
    while (!done) { // while not true do
        execl("wget", "someurl", NULL);
        done = true; // since dl finished
    }
    cout << "DL Finished\n"; // to see if child was successful
}
else if ( child != 0 ) { // errors
Are there any apparent errors that you can point out in this code? If it matters, this is inside a void function that I call from main. What happens is that it does not download anything: it displays "DL Finished" but does not execute wget, and then the terminal takes over.
This is executed on Ubuntu 12.04.2 LTS. Previously, inside the same void function, I used the child to execute "ls", which works properly when I give it the whole path of ls (/bin/ls). I read that not providing the full path will make it search for the command, which is what I want.
I read that not providing the full path will make it search for the
command
That happens for execlp. Also, by convention the first argument should be the name of the executable. So you could try:
execlp("wget", "wget", "someurl", NULL);
     ^         ^^^^^^
As a side note, your while (!done) is wrong. That's not how you wait for a program to finish. In fact, once you call exec successfully, the while is gone: another program "replaces" your own. So you can think of it as "exec is a function that doesn't return on success". The standard way is to wait(2) in the parent until the child dies.
As a second side note, if all you want is to wget something and wait until the wget is done, system(3) is possibly more appropriate:
system("wget someurl");
The arguments you pass to execl become the argv array of the new process's main function. And as you know, the first entry in argv is the program name itself.
So what you need to do is:
execlp("wget", "wget", "someurl", NULL);
Also, if all goes well, the exec family of functions does not return, so any code after the exec call will not run.
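Putting the pieces together, the fork, execlp, and wait sequence might be sketched as below. The helper name runAndWaitOneArg and the exit code 127 for a failed exec are assumptions for illustration. The terminating null pointer is cast to char* because execlp is variadic and needs a real pointer there.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cerrno>

// Hypothetical helper: runs "program arg", searching PATH for program,
// and waits for it. Returns the exit code, 127 (assumed convention) if
// the exec failed, or -1 on fork/wait failure or death by signal.
int runAndWaitOneArg(const char* program, const char* arg)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        // argv[0] repeats the program name; the argument list ends with
        // a null pointer of type char*.
        execlp(program, program, arg, static_cast<char*>(nullptr));
        _exit(127);                 // reached only if execlp failed
    }

    int status = 0;
    pid_t rc;
    do {
        rc = waitpid(pid, &status, 0);
    } while (rc == -1 && errno == EINTR);
    if (rc != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

With this helper, the download from the question would be runAndWaitOneArg("wget", "someurl"), and "DL Finished" would be printed by the parent after the call returns.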