I use posix_spawnp to execute different processes, and I check the status (with waitpid) to make sure the child was created properly:
int iRet = posix_spawnp(&iPID, zPath, NULL, NULL, argv, environ);
if (iRet != 0)
{
return false;
}
int iState;
waitpid(static_cast<pid_t>(iPID), &iState, WNOHANG);
cout << "Wait: PID " << iPID << " | State " << iState << endl;
if (WIFEXITED(iState)) {
printf("Child exited with RC=%d\n",WEXITSTATUS(iState));
}
else if (WIFSIGNALED(iState)) {
printf("Child exited via signal %d\n",WTERMSIG(iState));
}
else
{
printf("Child is NORMAL");
}
At first this executes properly and I get the following message:
Wait: PID 15911 | State 0
Child exited with RC=0
After executing the same process several times, the child process starts to exit with status 127.
Wait: PID 15947 | State 32512
Child exited with RC=127
After this happens, I could not get the child to spawn again. I enclosed the section of code given above in a for loop but it wouldn't spawn properly.
If I restart the parent process, it works for a while but the same problem crops up again after a while.
What am I doing wrong here?
Check this link.
For example:
EINVAL The value specified by file_actions or attrp is invalid.
The error codes for the posix_spawn and posix_spawnp subroutines are affected by the following conditions:
If this error occurs after the calling process successfully returns from the posix_spawn or posix_spawnp function, the child process might exit with exit status 127.
It looks as if it might exit with 127 for a whole host of reasons.
Check the return code from waitpid() to be sure that it isn't having problems.
The way the code reads suggests that you are only spawning one child process at a time (otherwise there'd be no need to call waitpid() within the loop). However in that case I wouldn't expect to use WNOHANG.
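A minimal sketch of what this answer suggests, assuming only one child is spawned at a time: check the return value of posix_spawnp(), then block in waitpid() (no WNOHANG) and check its return value as well before looking at the status. The helper name spawn_and_wait is hypothetical and not part of the question's code.

```c
#include <errno.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: runs argv[0] (searched in PATH) and waits for it.
   Returns the exit status (0..255), or -1 if the child could not be
   spawned, could not be waited for, or did not exit normally. */
int spawn_and_wait(char *const argv[])
{
    pid_t pid;
    int status;

    /* posix_spawnp() reports failure through its return value. */
    int rc = posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ);
    if (rc != 0)
        return -1;

    /* Blocking wait: status is only meaningful if waitpid() returns pid. */
    pid_t w;
    do {
        w = waitpid(pid, &status, 0);
    } while (w == -1 && errno == EINTR);
    if (w != pid)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With WNOHANG, waitpid() returns 0 while the child is still running and leaves the status variable untouched, so applying WIFEXITED() to it in that case reads a value that was never set.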
I am new to fork and exec, and I tried the following program.
Program 1:
int main(int argc, char *argv[]){
pid_t pid;
int status;
pid = fork();
if(pid == 0){
printf("new process");
execv("p1",argv);
}
else{
pid_t pr = wait(&status);// I am trying to get the exit value
// of the sub process.
printf("the child process exit with %d",status);
printf("father still running\n");
}
}
Program 2:
int main(){
std::cout<<"I am the new thread"<<std::endl;
sleep(1);
std::cout<<"after 1 second"<<std::endl;
exit(1);
}
I run the first program, and the output is "the child process exit with 256". Why is the result 256 instead of 1? If I change exit(1) to exit(2), the result becomes 512, why is that? It only worked if I return 0.
The status value you get back from the wait system call is not necessarily what your child process exited with.
There are a number of other pieces of information that can be returned as well, such as:
did the process terminate normally?
was it terminated by a signal?
what was the signal that terminated it?
did it dump core?
In order to extract the exit code, you use a macro:
WEXITSTATUS(status)
That, and the macros that can give you more information, should be available on the wait man-page, such as the one here.
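As a small sketch of the point above (the helper name child_exit_code is made up for illustration), the parent checks WIFEXITED() first and only then extracts the code with WEXITSTATUS():

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that exits with `code` and returns the value the parent
   recovers with WEXITSTATUS(). Returns -1 if fork() or waitpid() fails
   or if the child did not exit normally. */
int child_exit_code(int code)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(code);                 /* child: terminate immediately */

    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFEXITED(status))          /* killed by a signal, for example */
        return -1;
    return WEXITSTATUS(status);      /* only the low 8 bits of the code */
}
```

On Linux the raw status for exit(1) happens to be 256 because the exit code is stored in the second byte of the status word, but that layout is an implementation detail; the macros are the portable way to read it.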
I have a main process and some child processes spawned from it. At some point I have to send SIGINT to all the child processes but not to the main process. I am unable to store the PIDs of all the child processes, so I used SIG_IGN to ignore SIGINT in the main process and set it back to the default after sending the signal. But it is not working.
Please find my code snippet below:
/* Find group id for process */
nPgid = getpgid(parentPID);
/* Ignore SIGINT signal in parent process */
if (signal(SIGINT, SIG_IGN) == SIG_ERR)
{
cout << "Error in ignoring signal \n";
}
/* Send SIGINT signal to all process in the group */
nReturnValue = kill ( (-1 * nPgid), SIGINT);
if (nReturnValue == RETURN_SUCCESS)
{
cout << "Sent SIGINT signal to all process in group successfully \n";
}
else
{
cout << "Alert!!! Unable to send SIGINT signal to all process in the group \n";
}
/* Set SIGINT signal status to default */
signal (SIGINT, SIG_DFL);
sleep(2);
I am not getting any error, but the parent is getting killed. Am I doing anything wrong here?
nPgid = getpgid(parentPID);
What is parentPID? To get the process group of the calling process, pass either 0 or the result of getpid().
From man getpgid():
getpgid() returns the PGID of the process specified by pid. If pid
is zero, the process ID of the calling process is used. (Retrieving
the PGID of a process other than the caller is rarely necessary, and
the POSIX.1 getpgrp() is preferred for that task.)
From this text above I'd draw the conclusion to do
nPgid = getpgid(0);
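A different approach, not taken from the answer above, is to put the children into a process group of their own, so that the parent is never a member of the group being signalled and does not need to ignore anything. The sketch below is only an illustration under assumptions: the function name signal_children_only is hypothetical, the children just call pause() as a stand-in for real work, and the default action of the signal is assumed to terminate them.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks n children into one new process group, sends sig to that group,
   and returns how many children were terminated by sig (-1 on error). */
int signal_children_only(int n, int sig)
{
    pid_t pgid = 0;              /* 0 until the first child leads the group */
    int started = 0, killed = 0;

    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == -1)
            break;
        if (pid == 0) {
            setpgid(0, pgid);    /* pgid == 0 means "start a new group" */
            for (;;)
                pause();         /* stand-in for real child work */
        }
        if (pgid == 0)
            pgid = pid;          /* first child becomes the group leader */
        setpgid(pid, pgid);      /* repeated in the parent to avoid a race */
        started++;
    }
    if (started == 0 || kill(-pgid, sig) == -1)
        return -1;

    /* A negative pid makes waitpid() wait for children in that group. */
    for (int i = 0; i < started; i++) {
        int status;
        if (waitpid(-pgid, &status, 0) > 0 &&
            WIFSIGNALED(status) && WTERMSIG(status) == sig)
            killed++;
    }
    return killed;
}
```

For the situation in the question the call would be signal_children_only(n, SIGINT); the parent keeps its own process group and is not signalled.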
Extracted from Unix Network Programming Vol1 Third Edition Section 5.10 wait and waitpid functions
#include "unp.h"
void
sig_chld(int signo)
{
pid_t pid;
int stat;
while ( (pid = waitpid(-1, &stat, WNOHANG)) > 0) {
printf("child %d terminated\n", pid);
}
return;
}
...
// in server code
Signal(SIGCHLD, sig_chld); // used to prevent any zombies from being left around
...
..
// in client code
The client establishes five connections with the server and then immediately exits
...
Reference waitpid:
Return Value
waitpid(): on success, returns the process ID of the child whose state
has changed; if WNOHANG was specified and one or more child(ren)
specified by pid exist, but have not yet changed state, then 0 is
returned. On error, -1 is returned.
Based on the above document, waitpid will return 0 if no child process has terminated at that moment. If I understood correctly, this will cause the function sig_chld to break out of the while loop.
Question> How, then, can we guarantee that this signal handler collects all terminated child processes?
while ( (pid = waitpid(-1, &stat, WNOHANG)) > 0) {
printf("child %d terminated\n", pid);
You wouldn't be in the signal handler if you didn't have a child to handle. The loop is because while you are in the handler itself a 2nd or 3rd child could have changed or terminated sending SIGCHLDs that would not be queued. Thus the loop actually prevents you from missing those possible dead children. It will return 0 or error out with a -1 (ECHILD) when there are no more children to be reaped at the moment.
I have code like this:
c = fork();
if(c==0) {
close(fd[READ]);
if (dup2(fd[WRITE],STDOUT_FILENO) != -1)
execlp("ssh", "ssh", host, "ls" , NULL);
_exit(1);
}
close(fd[WRITE]);
fd[READ] and fd[WRITE] are pipe file descriptors.
When I run it continuously, ps ax shows a lot of zombie processes. How can I fix this? Is it because the parent is not waiting for the exit status of the child process?
If you have no intention to wait for your child processes, set the SIGCHLD handler to SIG_IGN to have the kernel automatically reap your children, eg.
signal(SIGCHLD, SIG_IGN);
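A rough way to observe this, assuming a system that follows the POSIX.1-2001 behaviour for an ignored SIGCHLD (Linux does): once the disposition is SIG_IGN, the terminated child leaves no zombie, and a later waitpid() for it fails with ECHILD. The function name child_is_auto_reaped is hypothetical.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the child was reaped automatically (waitpid() fails with
   ECHILD), 0 if it still had to be waited for, -1 on a setup error. */
int child_is_auto_reaped(void)
{
    void (*old_handler)(int) = signal(SIGCHLD, SIG_IGN);
    if (old_handler == SIG_ERR)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        signal(SIGCHLD, old_handler);
        return -1;
    }
    if (pid == 0)
        _exit(0);                      /* child: exit immediately */

    /* Blocks until the child is gone, then reports that there is
       nothing left to wait for. */
    errno = 0;
    pid_t w = waitpid(pid, NULL, 0);
    int auto_reaped = (w == -1 && errno == ECHILD);

    signal(SIGCHLD, old_handler);      /* restore the previous disposition */
    return auto_reaped;
}
```

The trade-off is that the exit status of such children is lost, so this only fits when the parent really does not care how the child ended.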
Yes, the parent must wait for the child's return status. You can do it asynchronously by catching SIGCHLD in the parent process and calling waitpid in the signal handler.
Yes, waitpid() should be called from parent. waitpid() will clean-up any child process of the parent process, which is currently in terminated state.
You can add below code to your program :
if(c>0)
{
while(1){
ret = waitpid(-1,&status,0);
if(ret>0){
if(WIFEXITED(status)){
if(WEXITSTATUS(status) == 0){
printf("child process terminated normally and successfully\n");
}
else{
printf("child process terminated normally and unsuccessfully\n");
}
}
else{
printf("child process terminated abnormally and unsuccessfully\n");
}
}
if(ret<0) {
break;
}
}
}
FYI : more on waitpid.
The first parameter is set to -1 so that waitpid() waits for any child of this parent and reaps it once it has terminated. It can also be a positive PID, in which case waitpid() waits only for that specific child. Passing -1 is the most common use; see the waitpid() manual page for the other values.
The second parameter receives the child's termination status: waitpid() fills in the status field when the call returns successfully.
The last parameter is the options (flags) field. In most cases it is set to 0, which selects the default blocking behaviour. If you need options such as WNOHANG, refer to the waitpid() manual page.
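To illustrate the third parameter, here is a small sketch (the function name poll_running_child is invented for this example) that polls a still-running child with WNOHANG, which returns 0 instead of blocking, and then cleans up with a normal blocking call:

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the result of one non-blocking waitpid() on a child that is
   still running (expected: 0), or -1 if fork() or the cleanup fails. */
int poll_running_child(void)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        for (;;)
            pause();                   /* child: stay alive until killed */
    }

    /* WNOHANG: return immediately; 0 means "no state change yet". */
    pid_t polled = waitpid(pid, &status, WNOHANG);

    /* Clean up: terminate the child and reap it with a blocking wait. */
    kill(pid, SIGKILL);
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return (int)polled;
}
```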
Note:
In the code you submitted, _exit(1) is reached only if dup2() or execlp() fails, because the exec functions return only when an error has occurred. You can make that explicit by testing the return value of execlp() and calling _exit() in that branch.
Modified code can be like below :
c = fork();
if(c==0) {
close(fd[READ]);
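int ret_execlp = -1; /* stays -1 if dup2() fails and execlp() is never called */
if (dup2(fd[WRITE],STDOUT_FILENO) != -1)
ret_execlp = execlp("ssh", "ssh", host, "ls" , NULL);
if(ret_execlp == -1 ) {
perror("execlp"); /* stdout may already be the pipe, so report on stderr */
_exit(1);
}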
}
close(fd[WRITE]);
I appreciate the above 2 answers. Wish this answer may give more clarity. Thank you.
I do the regular thing:
fork()
execvp(cmd, ) in child
If execvp fails because no cmd is found, how can I notice this error in parent process?
The well-known self-pipe trick can be adapted for this purpose.
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>
#include <sysexits.h>
#include <unistd.h>
int main(int argc, char **argv) {
int pipefds[2];
int count, err;
pid_t child;
if (pipe(pipefds)) {
perror("pipe");
return EX_OSERR;
}
if (fcntl(pipefds[1], F_SETFD, fcntl(pipefds[1], F_GETFD) | FD_CLOEXEC)) {
perror("fcntl");
return EX_OSERR;
}
switch (child = fork()) {
case -1:
perror("fork");
return EX_OSERR;
case 0:
close(pipefds[0]);
execvp(argv[1], argv + 1);
write(pipefds[1], &errno, sizeof(int));
_exit(0);
default:
close(pipefds[1]);
while ((count = read(pipefds[0], &err, sizeof(errno))) == -1)
if (errno != EAGAIN && errno != EINTR) break;
if (count) {
fprintf(stderr, "child's execvp: %s\n", strerror(err));
return EX_UNAVAILABLE;
}
close(pipefds[0]);
puts("waiting for child...");
while (waitpid(child, &err, 0) == -1)
if (errno != EINTR) {
perror("waitpid");
return EX_SOFTWARE;
}
if (WIFEXITED(err))
printf("child exited with %d\n", WEXITSTATUS(err));
else if (WIFSIGNALED(err))
printf("child killed by %d\n", WTERMSIG(err));
}
return err;
}
Here's a complete program.
$ ./a.out foo
child's execvp: No such file or directory
$ (sleep 1 && killall -QUIT sleep &); ./a.out sleep 60
waiting for child...
child killed by 3
$ ./a.out true
waiting for child...
child exited with 0
How this works:
Create a pipe, and make the write endpoint CLOEXEC: it auto-closes when an exec is successfully performed.
In the child, try to exec. If it succeeds, we no longer have control, but the pipe is closed. If it fails, write the failure code to the pipe and exit.
In the parent, try to read from the other pipe endpoint. If read returns zero, then the pipe was closed and the child must have exec successfully. If read returns data, it's the failure code that our child wrote.
You terminate the child (by calling _exit()) and then the parent can notice this (through e.g. waitpid()). For instance, your child could exit with an exit status of -1 to indicate failure to exec; the parent sees this as 255, because only the low 8 bits of the status are kept. One caveat with this is that it is impossible to tell from your parent whether the child in its original state (i.e. before exec) exited with that status or whether it was the newly executed process.
As suggested in the comments below, using an "unusual" return code would be appropriate to make it easier to distinguish between your specific error and one from the exec()'ed program. Common ones are 1, 2, 3 etc. while higher numbers 99, 100, etc. are more unusual. You should keep your numbers below 255 (unsigned) or 127 (signed) to increase portability.
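A minimal sketch of this idea, assuming 127 as the "unusual" code (the value shells conventionally use for "command not found"; any rarely used code would do). The names EXEC_FAILED and run_program are hypothetical.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define EXEC_FAILED 127   /* assumed marker value for "exec did not work" */

/* Runs `file` (searched in PATH) without arguments and returns its exit
   status, EXEC_FAILED if it could not be executed, or -1 on other errors. */
int run_program(const char *file)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0) {
        execlp(file, file, (char *)NULL);
        _exit(EXEC_FAILED);            /* reached only if execlp() failed */
    }

    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The caveat from the answer still applies: a program that itself exits with 127 cannot be told apart from a failed exec this way.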
Since waitpid blocks your application (or rather, the thread calling it) you will either need to put it on a background thread or use the signalling mechanism in POSIX to get information about child process termination. See the SIGCHLD signal and the sigaction function to hook up a listener.
You could also do some error checking before forking, such as making sure the executable exists.
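Such a pre-check could look roughly like the sketch below; looks_executable is a made-up name. It only tests an explicit path (it does not search PATH the way execvp does), and the file can still change between the check and the exec, so the exec result has to be handled in any case.

```c
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if path names a regular file that the caller may execute,
   0 otherwise. This is a best-effort check made before fork(). */
int looks_executable(const char *path)
{
    struct stat st;

    if (stat(path, &st) == -1 || !S_ISREG(st.st_mode))
        return 0;                      /* missing, or e.g. a directory */
    return access(path, X_OK) == 0;    /* execute permission for caller */
}
```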
If you use something like Glib, there are utility functions to do this, and they come with pretty good error reporting. Take a look at the "spawning processes" section of the manual.
1) Use _exit() not exit() - see http://opengroup.org/onlinepubs/007908775/xsh/vfork.html - NB: applies to fork() as well as vfork().
2) The problem with doing more complicated IPC than the exit status, is that you have a shared memory map, and it's possible to get some nasty state if you do anything too complicated - e.g. in multithreaded code, one of the killed threads (in the child) could have been holding a lock.
Not only should you wonder how you can notice it in the parent process, you should also keep in mind that you must notice the error in the parent process. That's especially true for multithreaded applications.
After execvp you must place a call to a function that terminates the process in any case. You should not call any complex functions that interact with the C library (such as stdio), since their effects may interfere with the pthreads or libc state of the parent process. So you can't print a message with printf() in the child process and have to inform the parent about the error instead.
The easiest way, among others, is passing a return code. Supply a nonzero argument to the _exit() function (see note below) you use to terminate the child, and then examine the return code in the parent. Here's the example:
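pid_t pid;
int stat;
pid = fork();
if (pid == 0){
// Child process
execvp(cmd, args); // args: NULL-terminated argv array for cmd (assumed name)
if (errno == ENOENT)
_exit(-1); // the parent sees exit status 255
_exit(-2); // the parent sees exit status 254
}
wait(&stat);
// The child exits normally even when exec fails, so check the exit code too
if (!WIFEXITED(stat) || WEXITSTATUS(stat) != 0) { // Error happened
...
}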
Instead of _exit(), you might think of the exit() function, but that is incorrect, since exit() performs part of the C-library cleanup (such as flushing stdio buffers and running atexit handlers) that should be done only when the parent process terminates. Use _exit() instead, which doesn't do such cleanup.
Well, you could use the wait/waitpid functions in the parent process. You can specify a status variable that holds info about the status of the process that terminated. The downside is that the parent process is blocked until the child process finishes execution.
Anytime exec fails in a subprocess, you should use kill(getpid(),SIGKILL), and the parent should always have a signal handler for SIGCHLD and tell the user of the program, in the appropriate way, that the process was not successfully started.