Best way to create a child process in Linux and handle possible failure - C++

I have a parent process that has to create a few child processes. The best way I found is using fork + execl. But then the parent process needs to know whether the execl of a particular child failed or not, and I don't know how to implement that.
int pid = fork();
if (pid < 0) {
    std::cout << "ERROR on fork." << std::endl;
} else if (pid == 0) {
    execl("/my/program/full/path", "/my/program/full/path", (char *)NULL);
    exit(1);
} else {
    if (/*child's process execl fails*/) {
        std::cout << "it failed" << std::endl;
    } else {
        std::cout << "child born" << std::endl;
    }
}
I think this idea is not good:
int status(0);
sleep(100);
int res = waitpid(pid, &status, WNOHANG);
if (res < 0 && errno == ECHILD) {
    std::cout << "it failed" << std::endl;
} else {
    std::cout << "child born" << std::endl;
}
because it's not good to hope that the child process will have died within the fixed sleep; I want to know for sure, as soon as it happens.
I also think that creating shared memory or a special pipe connection for such a check is a cannon against bees.
There has to be a simple solution for this that I just haven't found yet.
What is the best way to achieve that?

As a general solution you can register a signal handler (SIGUSR1) in the parent using sigaction().
In the child: unregister the signal handler; if the execl() call fails, send SIGUSR1 to the parent.
In the parent: store every child pid in a std::set. When all children are created, spawn a separate thread for tracking them. In the thread function just call wait() and remove the pid from the set. Another way is to listen for the SIGCHLD signal (but that leads to a more complex solution, so if spawning another thread is an option I'd use a thread).
When the set is empty we are done.
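A minimal sketch of that idea for a single child (my own illustration, not the original answer's code): the program path and handler name are placeholders, and the std::set/thread bookkeeping for several children is only hinted at in a comment.
#include <signal.h>
#include <sys/wait.h>
#include <unistd.h>
#include <iostream>

static volatile sig_atomic_t exec_failed = 0;
static void on_sigusr1(int) { exec_failed = 1; }

int main() {
    struct sigaction sa = {};
    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, nullptr);      // parent registers the handler first

    pid_t parent = getpid();
    pid_t pid = fork();
    if (pid < 0) {
        std::cout << "ERROR on fork." << std::endl;
        return 1;
    }
    if (pid == 0) {                        // child
        signal(SIGUSR1, SIG_DFL);          // unregister the handler in the child
        execl("/my/program/full/path", "/my/program/full/path", (char *)NULL);
        kill(parent, SIGUSR1);             // reached only if execl failed
        _exit(127);
    }
    // With several children, insert each pid into a std::set here and let a
    // separate thread loop on wait(), erasing pids until the set is empty.
    int status = 0;
    waitpid(pid, &status, 0);
    std::cout << (exec_failed ? "it failed" : "child born") << std::endl;
}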

Related

How to use waitpid when execv returns an error?

My program loops through a vector of strings and runs a program to do some work. Each entry in the vector has its own associated program. The child processes are created using fork() and execv() in the loop. The parent process waits until each child process has returned before continuing the loop, using waitpid(). The called child processes in my test environment (for now) each print a message, sleep(), and print another message.
The code works perfectly fine as long as execv() does not return -1 (for example because the file wasn't found).
std::vector<std::string> files{ "foo", "bar", "foobar" };
for (size_t i = 0; i < files.size(); i++)
{
    pid_t pid_fork = fork();
    if (pid_fork == -1)
    {
        std::cout << "error: could not fork process" << std::endl;
    } else if (pid_fork > 0)
    {
        std::cout << "this is the parent" << std::endl;
        int pid_status;
        pid_t child_ret = waitpid(pid_fork, &pid_status, 0);
        std::cout << "child_ret: " << child_ret << std::endl;
        if (child_ret == -1)
        {
            std::cout << "error waiting for child " << pid_fork << std::endl;
        } else
        {
            if (WIFEXITED(pid_status))
            {
                std::cout << "child process exit status: " << WEXITSTATUS(pid_status) << std::endl;
                if (WEXITSTATUS(pid_status) == 0)
                {
                    std::cout << "updating db that file has been loaded: " << files[i] << std::endl;
                    /* some code to update a DB table */
                } else
                {
                    std::cout << "exit status = FAILED" << std::endl;
                }
            }
        }
    } else
    {
        std::cout << "this is the child" << std::endl;
        std::string path = "./etl/etl_" + files[i];
        char *args[] = { const_cast<char *>(path.c_str()), NULL };
        if (execv(path.c_str(), args) == -1)
        {
            std::cout << "could not load " << path << std::endl;
            /* DB insert of failed "load" here */
            return EXIT_FAILURE;
        }
    }
}
/* some more code here writing stuff to a database before cleanup and returning from main*/
Output:
this is the parent
this is the child
hello from etl_foo
etl_foo is done
child_ret: 77388
child process exit status: 0
this is the parent
this is the child
hello from etl_bar
etl_bar is done
child_ret: 77389
child process exit status: 0
this is the parent
this is the child
hello from etl_foobar
etl_foobar is done
child_ret: 77390
child process exit status: 0
If, however, I cause execv() to return -1 because I deleted etl_foobar, the parent process seems to no longer wait for the child process to return:
this is the child
hello from etl_foo
etl_foo is done
child_ret: 77620
child process exit status: 0
this is the parent
this is the child
hello from etl_bar
etl_bar is done
child_ret: 77621
child process exit status: 0
this is the parent
this is the child
could not load ./etl_foobar
-> here the end of the parent code is reached, the DB is updated and the parent returns (?)
-> I expect the program to be done at this stage, however... this happens
child_ret: 77622
terminate called after throwing an instance of 'sql::SQLException'
what(): Lost connection to MySQL server during query
Aborted (core dumped)
It seems the code block after pid_t child_ret = waitpid(pid_fork, &pid_status, 0); is executed, which I don't understand. The parent has already returned, yet part of the parent's code is still executed and fails because the connection object for the DB connection was deleted just before the parent returned.
The desired behavior is that upon discovering that execv() == -1, the child process returns to the waiting parent, which then finishes off the remaining code and returns itself in an orderly manner, the same way it does when there is no error in execv(). Thank you!
Edit: User Sneftel pointed out that the child process in case of failure actually doesn't return, which I have changed now. The parent process is hence now waiting for all children to return, including those where execv fails.
Nevertheless, I still have the issue that whenever the child returns with EXIT_FAILURE, the following loop iteration runs up to the next DB insert, where I keep getting the "lost MySQL connection" error + core dump. I'm not sure what the origin of this is.
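One likely explanation (my assumption, not something confirmed in the thread): after a failed execv() the child is still a copy of the parent, including the open MySQL connection it inherited through fork(). When that child does its own DB insert and then returns from main, the normal C++/driver cleanup runs in the child and tears down connection state the parent is still using. The usual pattern is to report the failure only through the exit status and leave the failed child with _exit(), so no inherited cleanup runs twice. A rough sketch of that child branch, reusing the names from the code above:
// Child branch only: report the exec failure via the exit status and skip any
// inherited cleanup (destructors, atexit handlers, DB driver teardown).
std::string path = "./etl/etl_" + files[i];
char *args[] = { const_cast<char *>(path.c_str()), NULL };
execv(path.c_str(), args);                     // returns only on failure
std::cout << "could not load " << path << std::endl;
_exit(EXIT_FAILURE);                           // parent sees WEXITSTATUS != 0
The parent can then perform the "failed load" DB insert itself whenever WEXITSTATUS(pid_status) != 0, so only one process ever touches the connection.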

Child Process runs even after parent process has exited?

I was writing code for a research program. I have the following requirements:
1. Main binary execution begins at main()
2. main() calls fork()
3. The child process runs a Linpack benchmark binary using execvp()
4. The parent process runs some monitoring code and waits for the child to exit.
The code is below:
main.cpp
extern ServerUncorePowerState * BeforeStates;
extern ServerUncorePowerState * AfterStates;

int main(int argc, char *argv[]) {
    power pwr;
    procstat st;
    membandwidth_t data;
    int sec_pause = 1; // sample every 1 second
    pid_t child_pid = fork();
    if (child_pid >= 0) { // fork successful
        if (child_pid == 0) { // child process
            int exec_status = execvp(argv[1], argv+1);
            if (exec_status) {
                std::cerr << "execv failed with error "
                          << errno << " "
                          << strerror(errno) << std::endl;
            }
        } else { // parent process
            int status = 1;
            waitpid(child_pid, &status, WNOHANG);
            write_headers();
            pwr.init();
            st.init();
            init_bandwidth();
            while (status) {
                cout << " Printing status Value: " << status << endl;
                sleep(sec_pause);
                time_t now;
                time(&now);
                struct tm *tinfo;
                tinfo = localtime(&now);
                pwr.loop();
                st.loop();
                data = getbandwidth();
                write_samples(tinfo, pwr, st, data.read_bandwidth + data.write_bandwidth);
                waitpid(child_pid, &status, WNOHANG);
            }
            wait(&status); // wait for child to exit, and store its status
            //--------------------This code is not executed------------------------
            std::cout << "PARENT: Child's exit code is: "
                      << WEXITSTATUS(status)
                      << std::endl;
            delete[] BeforeStates;
            delete[] AfterStates;
        }
    } else {
        std::cerr << "fork failed" << std::endl;
        return 1;
    }
    return 0;
}
What is expected is that the child will exit and then the parent exits, but for some unknown reason the parent exits after 16 minutes while the child is still running.
Normally it is said that when the parent exits, the child dies automatically.
What could be the reason for this strange behavior?
Normally it is said that when the parent exits, the child dies automatically.
Well, this is not always true; it depends on the system. When a parent process terminates, the child process is called an orphan process. In a Unix-like OS this is handled by re-parenting the orphan process to the init process; this re-parenting is done automatically by the OS. In other types of OS, orphan processes are automatically killed by the system. You can find more details here.
From the code snippet I would think that the issue is in the wait(&status) statement. The previous loop ends (or is never entered) when the reported status is 0, which is what the earlier waitpid(child_pid, &status, WNOHANG) calls yield once the child has exited with its final return 0. This means the wait(&status) statement would then be waiting on an already-reaped process, which may cause some issues.
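A minimal sketch of a loop keyed on waitpid()'s return value instead of the status word (it reuses pwr, st and sec_pause from the question and is only an illustration):
// waitpid(..., WNOHANG) returns 0 while the child is still running,
// the child's pid once it has exited (reaping it), and -1 on error.
int status = 0;
pid_t r;
while ((r = waitpid(child_pid, &status, WNOHANG)) == 0) {
    sleep(sec_pause);   // take one monitoring sample per interval
    pwr.loop();
    st.loop();
}
if (r == child_pid && WIFEXITED(status))
    std::cout << "PARENT: Child's exit code is: " << WEXITSTATUS(status) << std::endl;
// No extra wait() afterwards: the child has already been reaped above.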

Forking and waiting in Linux (C++)

I want to fork a process and then do the following in the parent:
1. Wait until it terminates naturally or until a timeout set by the parent expires (something like WaitForSingleObject in Windows), after which I will kill the process with kill().
2. Get the exit code of the child process (assuming it exited naturally).
3. I need to have access to the std::cout of the child process from the parent.
I attempted to use waitpid(); however, while this gives me access to the return code, I cannot implement a timeout with this function.
I also looked at the following solution (https://www.linuxprogrammingblog.com/code-examples/signal-waiting-sigtimedwait) which allows me to implement a timeout, but there doesn't seem to be a way to get the return code.
I guess my question boils down to: what's the correct way of achieving this in Linux?
You can do #1 and #2 with the sigtimedwait() function, and #3 with a pipe:
#include <unistd.h>
#include <signal.h>
#include <cerrno>
#include <string>
#include <iostream>

int main() {
    // Block SIGCHLD, so that it only gets delivered while in sigtimedwait.
    sigset_t sigset;
    sigemptyset(&sigset);
    sigaddset(&sigset, SIGCHLD);
    sigprocmask(SIG_BLOCK, &sigset, nullptr);

    // Make a pipe to communicate with the child process.
    int child_stdout[2];
    if(pipe(child_stdout))
        abort();

    std::cout.flush();
    std::cerr.flush();

    auto child_pid = fork();
    if(-1 == child_pid)
        abort();

    if(!child_pid) { // In the child process.
        dup2(child_stdout[1], STDOUT_FILENO); // Redirect stdout into the pipe.
        std::cout << "Hello from the child process.\n";
        std::cout.flush();
        sleep(3);
        _exit(3);
    }

    // In the parent process.
    dup2(child_stdout[0], STDIN_FILENO); // Redirect stdin to stdout of the child.
    std::string line;
    getline(std::cin, line);
    std::cout << "Child says: " << line << '\n';

    // Wait for the child to terminate or timeout.
    timespec timeout = {1, 0};
    siginfo_t info;
    auto signo = sigtimedwait(&sigset, &info, &timeout);
    if(-1 == signo) {
        if(EAGAIN == errno) { // Timed out.
            std::cout << "Killing child.\n";
            kill(child_pid, SIGTERM);
        }
        else
            abort();
    }
    else { // The child has terminated.
        std::cout << "Child process terminated with code " << info.si_status << ".\n";
    }
}
Outputs:
Child says: Hello from the child process.
Killing child.
If sleep is commented out:
Child says: Hello from the child process.
Child process terminated with code 3.

dup2() causing child process to terminate early

So I'm writing a program that involves the creation of 2 sets of pipes so that a parent process can write to a child process and the child process can write back...
I have the following code for my child process:
if(pid == 0){ //child process
    cout << "executing child" << endl;
    close(fd1[WRITE_END]);
    close(fd2[READ_END]);
    if(dup2(fd1[READ_END], STDIN_FILENO) < 0 || dup2(fd2[WRITE_END], STDOUT_FILENO) < 0){
        cerr << "dup2 failed" << endl;
        exit(1);
    }
    cout << "test output" << endl;
    close(fd2[WRITE_END]);
    close(fd1[READ_END]);
    read(fd1[READ_END], buf, BUFFER_SIZE);
    cout << "Child process read " << buf << endl;
    execl("/bin/sort", "sort", "-nr", NULL);
} else { //... parent process
When I run my program, all I get as output from the child process is "executing child" but no "test output".
However, when I remove the if-statement handling the dup2 calls, my output does include "test output".
Any ideas as to why dup2 causes my child process to seemingly terminate early?
(And by the way, originally my two dup2's were done in separate if statements... when I put the test output below the dup2(fd1[READ_END],STDIN_FILENO) < 0 test, it outputs, but not when I put it below the other dup2 conditional test, so I'm convinced that that's where my issue is.)
Thanks in advance
The call to dup2(fd2[WRITE_END], STDOUT_FILENO) connects STDOUT (which is used by the C++ cout stream) to your fd2 pipe, so "test output" gets written into the pipe instead of to the terminal.
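For illustration, here is a small sketch of what the parent branch could do to confirm this, assuming the same fd1/fd2 pipes and READ_END/WRITE_END constants as in the question (the buffer size is arbitrary):
// Parent branch: "test output" did not disappear, it went into the fd2 pipe.
close(fd2[WRITE_END]);                        // parent only reads from fd2
char reply[256] = {0};
ssize_t n = read(fd2[READ_END], reply, sizeof(reply) - 1);
if (n > 0)
    cerr << "received from child's stdout: " << reply; // contains "test output"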

C++ Process fork and sigalarm

The goal of this program is to fork and have the child sleep while the parent loops infinitely, waiting for an interrupt. When I hit ^C, it calls the parent() handler. This part works; however, the message triggered by kill(pid, SIGALRM) never shows up. I checked, and pid is the correct process ID of the child.
I've searched for a while, but I haven't found what I'm doing wrong. I've used kill(pid, SIGALRM) before, from the child process to the parent, but I can't figure out why this isn't working.
#include <signal.h>
#include <unistd.h>
#include <iostream>
#include <sys/types.h>
#include <sys/wait.h>

using namespace std;

int pid;

void parent ( int sig )
{
    kill ( pid, SIGALRM );
    cout << "I'm a parent " << getpid() << " My child is " << pid << endl;
}

void child ( int sig )
{
    cout << "I am " << getpid() << " my parent is " << getppid() << endl;
    cout << "Use ctrl+backslash to actually end the program" << endl;
}

int main()
{
    pid = fork();
    if(pid == 0)
    { //Child process
        cout << "Child pid = " << getpid() << " Waiting for interrupt." << endl;
        (void) signal ( SIGALRM, child );
        pause();
    }
    else if(pid > 0)
    { //Parent
        sleep(2);
        cout << "child pid = " << pid << endl;
        struct sigaction act;
        act.sa_handler = parent;
        sigemptyset ( &act.sa_mask );
        sigaction (SIGINT, &act, 0);
        while(1)
        {
            sleep ( 1 );
        }
    }
    return 0;
}
OK, so I figured out the problem.
When I was pressing ^C, it would catch the interrupt in the main process but kill the child process. When I ran a system("ps") from inside the program, it showed the child a.out process as defunct.
To fix this I added the following to the child's code:
struct sigaction act;
act.sa_handler = CHILD_PRESERVER;
sigemptyset ( &act.sa_mask);
sigaction (SIGINT, &act, 0);
where CHILD_PRESERVER was a dummy function that did nothing except keep the child alive.
It doesn't seem that this solution is very elegant, so if anyone has a more correct way of doing this, please post it.
You can do the same thing as your sigaction solution by just using signal(SIGINT, SIG_IGN);
The thing that tripped you up initially (and often trips up new programmers dealing with ctrl-C and signals) is that ctrl-C sends a signal to AN ENTIRE PROCESS GROUP, rather than to a single process -- every process in the group will get the signal. The process group the signal is sent to is the foreground process group of the terminal.
So this gives you lots of ways of dealing with/controlling ctrl-C interrupts. You can have each process install its own SIGINT handler (as you have done). Or you can carefully manage your process groups, putting children into their own process group (which will generally not be the foreground process group), so they won't get the signal in the first place.
You manage process groups with the setpgrp(2)/setpgid(2) system calls.
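For the second option, a minimal sketch of moving the child out of the terminal's foreground process group right after fork() (an illustration, not the original poster's code):
if (pid == 0)
{ // Child process
    setpgid(0, 0);                    // child joins a new process group; a terminal
                                      // ^C now signals only the foreground group,
                                      // i.e. the parent, and never reaches the child
    (void) signal ( SIGALRM, child );
    pause();
}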