How to wait until all child processes called by fork() complete? - c++

I am forking a number of processes and I want to measure how long it takes to complete the whole task, that is when all processes forked are completed. Please advise how to make the parent process wait until all child processes are terminated? I want to make sure that I stop the timer at the right moment.
Here is the code I use:
#include <iostream>
#include <string>
#include <fstream>
#include <sys/time.h>
#include <sys/wait.h>
using namespace std;
struct timeval first, second, lapsed;
struct timezone tzp;
int main(int argc, char* argv[])// query, file, num. of processes.
{
int pCount = 5; // process count
gettimeofday (&first, &tzp); //start time
pid_t* pID = new pid_t[pCount];
for(int indexOfProcess=0; indexOfProcess<pCount; indexOfProcess++)
{
pID[indexOfProcess]= fork();
if (pID[indexOfProcess] == 0) // child
{
// code only executed by child process
// magic here
// The End
exit(0);
}
else if (pID[indexOfProcess] < 0) // failed to fork
{
cerr << "Failed to fork" << endl;
exit(1);
}
else // parent
{
// if(indexOfProcess==pCount-1) and a loop with waitpid??
gettimeofday (&second, &tzp); //stop time
if (first.tv_usec > second.tv_usec)
{
second.tv_usec += 1000000;
second.tv_sec--;
}
lapsed.tv_usec = second.tv_usec - first.tv_usec;
lapsed.tv_sec = second.tv_sec - first.tv_sec;
cout << "Job performed in " <<lapsed.tv_sec << " sec and " << lapsed.tv_usec << " usec"<< endl << endl;
}
}//for
}//main

I'd move everything after the line "else //parent" down, outside the for loop. After the loop of forks, do another for loop with waitpid, then stop the clock and do the rest:
for (int i = 0; i < pidCount; ++i) {
int status;
while (-1 == waitpid(pids[i], &status, 0));
if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) {
cerr << "Process " << i << " (pid " << pids[i] << ") failed" << endl;
exit(1);
}
}
gettimeofday (&second, &tzp); //stop time
I've assumed that if the child process fails to exit normally with a status of 0, then it didn't complete its work, and therefore the test has failed to produce valid timing data. Obviously if the child processes are supposed to be killed by signals, or exit non-0 return statuses, then you'll have to change the error check accordingly.
An alternative using wait:
while (true) {
int status;
pid_t done = wait(&status);
if (done == -1) {
if (errno == ECHILD) break; // no more child processes
} else {
if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) {
cerr << "pid " << done << " failed" << endl;
exit(1);
}
}
}
This one doesn't tell you which process in sequence failed, but if you care then you can add code to look it up in the pids array and get back the index.
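Putting the pieces together, here is a minimal sketch of the whole measurement: fork the children, reap them with wait(), map each returned pid back to its index, and stop the clock only after the last child is gone. The names RunResult and run_and_time are made up for illustration, the usleep() call stands in for the real work, and std::chrono::steady_clock is swapped in for gettimeofday() as an assumption on my part, not something the question requires.

```cpp
#include <algorithm>
#include <cassert>
#include <cerrno>
#include <chrono>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical result type for this sketch.
struct RunResult {
    long long elapsed_usec;  // wall-clock time from before the first fork to the last exit
    int failed_index;        // index of the first child that failed, or -1 if none did
};

// Forks `count` children, waits for all of them, then stops the clock.
RunResult run_and_time(int count) {
    assert(count > 0);
    std::vector<pid_t> pids;
    RunResult result{0, -1};
    const auto start = std::chrono::steady_clock::now();

    for (int i = 0; i < count; ++i) {
        pid_t pid = fork();
        if (pid == 0) {
            usleep(20 * 1000);  // placeholder for the child's real work
            _exit(0);           // leave the child without running parent cleanup
        }
        if (pid < 0) {          // fork failed: stop launching, still reap the rest
            result.failed_index = i;
            break;
        }
        pids.push_back(pid);
    }

    std::size_t remaining = pids.size();
    while (remaining > 0) {
        int status = 0;
        pid_t done = wait(&status);
        if (done == -1) {
            if (errno == EINTR) continue;  // interrupted by a signal: retry
            break;                         // ECHILD or an unexpected error
        }
        auto it = std::find(pids.begin(), pids.end(), done);
        if (it == pids.end()) continue;    // a child this function did not start
        --remaining;
        bool ok = WIFEXITED(status) && WEXITSTATUS(status) == 0;
        if (!ok && result.failed_index == -1)
            result.failed_index = static_cast<int>(it - pids.begin());
    }

    const auto stop = std::chrono::steady_clock::now();
    result.elapsed_usec =
        std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
    return result;
}
```

Calling run_and_time(5) and printing elapsed_usec gives the time for the whole batch; a failed_index other than -1 means the timing should probably be discarded.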

The simplest method is to do
while (wait(NULL) > 0) { /* no-op */ ; }
This loop stops early if wait() fails for some reason other than there being no children left. So with some error checking, this becomes
int status;
[...]
do {
status = wait(NULL);
if(status == -1 && errno != ECHILD) {
perror("Error during wait()");
abort();
}
} while (status > 0);
See also the manual page wait(2).
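One case the error check above treats as fatal is EINTR, which wait() reports when a signal handler runs while it is blocked. A small sketch of a variant that retries in that case follows; reap_all_children is a hypothetical helper name, and returning -1 instead of calling abort() is a choice made for this example.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Reaps every child of the calling process.
// Returns the number of children reaped, or -1 on an unexpected error.
int reap_all_children() {
    int reaped = 0;
    for (;;) {
        pid_t pid = wait(nullptr);
        if (pid > 0) {          // one child has been reaped
            ++reaped;
            continue;
        }
        assert(pid == -1);      // wait() returns either a pid or -1
        if (errno == EINTR) continue;        // interrupted by a signal: retry
        if (errno == ECHILD) return reaped;  // no children left: done
        std::perror("wait");
        return -1;
    }
}
```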

Call wait (or waitpid) in a loop until all children are accounted for.
In this case, all processes are synchronizing anyway, but in general wait is preferred when more work can be done (e.g. a worker process pool), since it returns as soon as any child changes state.
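To illustrate the worker pool case, here is a rough sketch in which the parent keeps a bounded number of children running and uses wait() to learn when a slot frees up. The function name run_pool and its parameters are invented for this example, and the usleep() call is a stand-in for a real job.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs `total_jobs` placeholder jobs, never more than `max_parallel` at once.
// Returns the number of jobs that exited with status 0, or -1 if wait() fails.
int run_pool(int total_jobs, int max_parallel) {
    assert(max_parallel > 0);
    int started = 0, running = 0, succeeded = 0;
    while (started < total_jobs || running > 0) {
        // Fill every free slot with a new child.
        while (started < total_jobs && running < max_parallel) {
            pid_t pid = fork();
            if (pid == 0) {
                usleep(10 * 1000);  // placeholder for one unit of work
                _exit(0);
            }
            if (pid < 0) {          // fork failed: launch nothing more
                total_jobs = started;
                break;
            }
            ++started;
            ++running;
        }
        if (running == 0) break;    // nothing left to wait for

        // Block until any child finishes, whichever one it is.
        int status = 0;
        pid_t done = wait(&status);
        if (done == -1) {
            if (errno == EINTR) continue;  // interrupted by a signal: retry
            return -1;
        }
        --running;
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0) ++succeeded;
    }
    return succeeded;
}
```

With waitpid on a specific pid, the parent could sit blocked on a slow child while other slots are already free; wait() avoids that because it returns for whichever child finishes first.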

I believe the wait system call will accomplish what you are looking for.

for (int i = 0; i < pidCount; i++) {
while (waitpid(pids[i], NULL, 0) > 0);
}
It won't wait in the right order, but it will stop shortly after the last child dies.

Related

Creating 3 children processes and exiting them after a specified number of seconds

My problem is that I need to write a program that accepts the names of 3 processes as command-line arguments. Each of these processes will run for (PID%10)*3+5 seconds and then terminate. After those 3 children have terminated, the parent process will reschedule each child. When all children have been rescheduled 3 times, the parent will terminate. I have used fork to create the three children, but I am struggling to get them to exit according to that criterion.
using namespace std;
int main(){
int i;
int pid;
for(i=0;i<3;i++) // loop will run n times (n=3)
{
if(fork() == 0)
{
pid = getpid();
cout << "Process p" << i+1 << " pid:" << pid << " Started..." << endl;
exit(0);
}
}
for(int i=0;i<5;i++) // loop will run n times (n=3)
wait(NULL);
}
You can use sigtimedwait to wait for SIGCHLD or timeout.
Working example:
#include <cstdio>
#include <cstdlib>
#include <signal.h>
#include <unistd.h>
template<class... Args>
void start_child(unsigned max_runtime_sec, Args... args) {
// Block SIGCHLD.
sigset_t set;
sigemptyset(&set);
sigaddset(&set, SIGCHLD);
sigprocmask(SIG_BLOCK, &set, nullptr);
// Enable SIGCHLD.
signal(SIGCHLD, [](int){});
pid_t child_pid = fork();
switch(child_pid) {
case -1:
std::abort();
case 0: {
// Child process.
execl(args..., nullptr);
abort(); // never get here.
}
default: {
// Parent process.
timespec timeout = {};
timeout.tv_sec = max_runtime_sec;
siginfo_t info = {};
// Pass &info so that si_status is filled in when SIGCHLD arrives.
int rc = sigtimedwait(&set, &info, &timeout);
if(SIGCHLD == rc) {
std::printf("child %u terminated in time with return code %d.\n", static_cast<unsigned>(child_pid), info.si_status);
}
else {
kill(child_pid, SIGTERM);
sigwaitinfo(&set, &info);
std::printf("child %u terminated on timeout with return code %d.\n", static_cast<unsigned>(child_pid), info.si_status);
}
}
}
}
int main() {
start_child(2, "/bin/sleep", "/bin/sleep", "10");
start_child(2, "/bin/sleep", "/bin/sleep", "1");
}
Output:
child 31548 terminated on timeout with return code 15.
child 31549 terminated in time with return code 0.
With these changes your program produces the desired output:
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
#include <iostream>
using namespace std;
int main()
{
for (int round = 0; ++round <= 4; )
{
int i;
cout << "*** ROUND: " << round << " ***\n";
for (i=0; i<3; i++) // loop will run n times (n=3)
{
if (fork() == 0)
{
int pid = getpid();
cout << "Process p" << i+1 << " pid:" << pid << " started...\n";
unsigned int seconds = pid%10*3+5;
cout << "Process " << pid << " exiting after "
<< seconds-sleep(seconds) << " seconds\n";
exit(0);
}
}
while (i--) // loop will run n times (n=3)
{
int status;
cout << "Process " << wait(&status);
cout << " exited with status: " << status << endl;
}
}
}
As Serge suggested, each child calls sleep() before exiting, which pauses the process for the given number of seconds.
To get the actual status information, we call wait(&status) instead of wait(NULL).
We're doing this all for the first scheduling round plus the desired 3 times of rescheduling.
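The value that wait(&status) stores is an encoded status word rather than the plain exit code, so it is usually decoded with the WIFEXITED, WEXITSTATUS, WIFSIGNALED and WTERMSIG macros before printing. A minimal sketch follows; describe_status and run_child_and_describe are hypothetical helper names, and the assert on fork() is crude error handling kept only to keep the example short.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Turns a raw wait status into a readable description.
std::string describe_status(int status) {
    if (WIFEXITED(status))
        return "exited with code " + std::to_string(WEXITSTATUS(status));
    if (WIFSIGNALED(status))
        return "killed by signal " + std::to_string(WTERMSIG(status));
    return "changed state (raw status " + std::to_string(status) + ")";
}

// Forks a child that exits with `code`, waits for it, and describes the result.
std::string run_child_and_describe(int code) {
    pid_t pid = fork();
    assert(pid >= 0);         // sketch only: real code should report the error
    if (pid == 0) _exit(code);

    int status = 0;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);  // retry if interrupted by a signal
    if (r == -1) return "wait failed";
    return describe_status(status);
}
```

In the loop above, the output line could then read cout << " " << describe_status(status) << endl; instead of printing the raw number.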

Child Process runs even after parent process has exited?

I am writing code for a research program. I have the following requirements:
1. Main binary execution begins at main()
2. main() calls fork()
3. The child process runs a linpack benchmark binary using execvp()
4. The parent process runs some monitoring process and waits for the child to exit.
The code is below:
main.cpp
extern ServerUncorePowerState * BeforeStates ;
extern ServerUncorePowerState * AfterStates;
int main(int argc, char *argv[]) {
power pwr;;
procstat st;
membandwidth_t data;
int sec_pause = 1; // sample every 1 second
pid_t child_pid = fork();
if (child_pid >= 0) { //fork successful
if (child_pid == 0) { // child process
int exec_status = execvp(argv[1], argv+1);
if (exec_status) {
std::cerr << "execv failed with error "
<< errno << " "
<< strerror(errno) << std::endl;
}
} else { // parent process
int status = 1;
waitpid(child_pid, &status, WNOHANG);
write_headers();
pwr.init();
st.init();
init_bandwidth();
while (status) {
cout << " Printing status Value: " << status << endl;
sleep (sec_pause);
time_t now;
time(&now);
struct tm *tinfo;
tinfo = localtime(&now);
pwr.loop();
st.loop();
data = getbandwidth();
write_samples(tinfo, pwr, st, data.read_bandwidth + data.write_bandwidth);
waitpid(child_pid, &status, WNOHANG);
}
wait(&status); // wait for child to exit, and store its status
//--------------------This code is not executed------------------------
std::cout << "PARENT: Child's exit code is: "
<< WEXITSTATUS(status)
<< std::endl;
delete[] BeforeStates;
delete[] AfterStates;
}
} else {
std::cerr << "fork failed" << std::endl;
return 1;
}
return 0;
}
I expected the child to exit and then the parent to exit, but for some unknown reason the parent exits after 16 minutes while the child is still running.
It is normally said that when the parent exits, the child dies automatically.
What could be the reason for this strange behavior?
"Normally It is said that when parent exits the child dies automatically."
Well, this is not always true; it depends on the system. When a parent process terminates before its child, the child becomes an orphan process. On Unix-like systems the orphan is re-parented to the init process; this is handled automatically by the OS, and the child keeps running. In other types of OS, orphan processes are automatically killed by the system. You can find more details here.
From the code snippet, I think the issue may be in the wait(&status) statement. The preceding loop ends (or is never entered) once status is 0, which is what one of the earlier waitpid(child_pid, &status, WNOHANG) calls stores when the child has exited with code 0, for example through your final return 0. At that point the child has already been reaped, so the wait(&status) statement waits on an already terminated process, which may cause problems.
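One way to sidestep this is to drive the monitoring loop from the return value of waitpid rather than from status: with WNOHANG it returns 0 while the child is still running and the child's pid once it has terminated and been reaped, so no extra wait() is needed afterwards. The sketch below assumes that structure; monitor_until_exit and its parameters are hypothetical names, and the samples counter stands in for the real sampling calls such as pwr.loop() and st.loop().

```cpp
#include <cassert>
#include <cerrno>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Polls `child` every `interval_usec` microseconds until it terminates.
// On success stores the raw wait status in `status` and returns true.
// `samples` counts how many monitoring rounds ran while the child was alive.
bool monitor_until_exit(pid_t child, unsigned int interval_usec,
                        int& status, int& samples) {
    assert(child > 0);  // 0 and -1 have special meanings for waitpid
    samples = 0;
    for (;;) {
        pid_t r = waitpid(child, &status, WNOHANG);
        if (r == child) return true;               // terminated and reaped
        if (r == -1 && errno != EINTR) return false;  // e.g. ECHILD
        // r == 0: the child is still running, so take one sample and sleep.
        ++samples;                                 // placeholder for sampling
        usleep(interval_usec);
    }
}
```

After it returns true, WIFEXITED(status) and WEXITSTATUS(status) can be used directly, because the status was filled in by the same call that reaped the child.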

Supervisor Program Forking to a Multi-threaded Child

First off, allow me to describe my scenario:
I developed a supervisory program on Linux that forks and then uses execv(), in the child process, to launch my multi-threaded application. The supervisory program is acting as a watchdog to the multi-threaded application. If the multi-threaded application does not send a SIGUSR1 signal to the supervisor after a period of time then the supervisory program will kill the child using the pid_t from the fork() call and repeat the process again.
Here is the code for the Supervisory Program:
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <iostream>
#include <cerrno>
time_t heartbeatTime;
void signalHandler(int sigNum)
{
//std::cout << "Signal (" << sigNum << ") received.\n";
time(&heartbeatTime);
}
int main(int argc, char *argv[])
{
pid_t cpid, ppid;
int result = 0;
bool programLaunched = false;
time_t now;
double timeDiff;
int error;
char ParentID[25];
char *myArgv[2];
// Get the Parent Process ID
ppid = ::getpid();
// Initialize the Child Process ID
cpid = 0;
// Copy the PID into the char array
sprintf(ParentID, "%i", ppid);
// Set up the array to pass to the Program
myArgv[0] = ParentID;
myArgv[1] = 0;
// Print out of the P PID
std::cout << "Parent ID: " << myArgv[0] << "\n";
// Register for the SIGUSR1 signal
signal(SIGUSR1, signalHandler);
// Register the SIGCHLD so the children processes exit fully
signal(SIGCHLD, SIG_IGN);
// Initialize the Heart Beat time
time(&heartbeatTime);
// Loop forever and ever, amen.
while (1)
{
// Check to see if the program has been launched
if (programLaunched == false)
{
std::cout << "Forking the process\n";
// Fork the process to launch the application
cpid = fork();
std::cout << "Child PID: " << cpid << "\n";
}
// Check if the fork was successful
if (cpid < 0)
{
std::cout << "Error in forking.\n";
// Error in forking
programLaunched = false;
}
else if (cpid == 0)
{
// Check if we need to launch the application
if (programLaunched == false)
{
// Send a message to the output
std::cout << "Launching Application...\n";
// Launch the Application
result = execv("./MyApp", myArgv);
std::cout << "execv result = " << result << "\n";
// Check if the program launched has failed
if (result != -1)
{
// Indicate the program has been launched
programLaunched = true;
// Exit the child process
return 0;
}
else
{
std::cout << "Child process terminated; bad execv\n";
// Flag that the program has not been launched
programLaunched = false;
// Exit the child process
return -1;
}
}
}
// In the Parent Process
else
{
// Get the current time
time(&now);
// Get the time difference between the program heartbeat time and current time
timeDiff = difftime(now, heartbeatTime);
// Check if we need to restart our application
if ((timeDiff > 60) && (programLaunched == true))
{
std::cout << "Killing the application\n";
// Kill the child process
kill(cpid, SIGINT);
// Indicate that the process was ended
programLaunched = false;
// Reset the Heart Beat time
time(&heartbeatTime);
return -1;
}
// Check to see if the child application is running
if (kill(cpid, 0) == -1)
{
// Get the Error
error = errno;
// Check if the process is running
if (error == ESRCH)
{
std::cout << "Process is not running; start it.\n";
// Process is not running.
programLaunched = false;
return -1;
}
}
else
{
// Child process is running
programLaunched = true;
}
}
// Give the process some time off.
sleep(5);
}
return 0;
}
This approach worked fairly well until I ran into a problem with the library I was using. It didn't like all of the killing and it basically ended up tying up my Ethernet port in an endless loop of never releasing - not good.
I then tried an alternative method. I modified the supervisory program so that it exits if it has to kill the multi-threaded application, and I created a script that launches the supervisor program from crontab. I used a shell script that I found on Stack Overflow.
#!/bin/bash
#make-run.sh
#make sure a process is always running.
export DISPLAY=:0 #needed if you are running a simple gui app.
process=YourProcessName
makerun="/usr/bin/program"
if ps ax | grep -v grep | grep $process > /dev/null
then
exit
else
$makerun &
fi
exit
I added it to crontab to run every minute. That was very helpful and it restarted the supervisory program which in turn restarted multi-threaded application but I noticed a problem of multiple instances of the multi-threaded application being launched. I'm not really sure why this was happening.
I know I'm really hacking this up but I'm backed into a corner with this implementation. I'm just trying to get it to work.
Suggestions?

How to find out whether child process still is running?

I am spawning a process in my application:
int status = posix_spawnp(&m_iProcessHandle, (char*)strProgramFilepath.c_str(), NULL, NULL, argsWrapper.m_pBuffer, NULL);
When I want to see if the process is still running, I use kill:
int iReturn = kill(m_iProcessHandle,0);
But after the spawned process has finished its work, it hangs around. The return value on the kill command is always 0. Not -1. I am calling kill from within the code, but if I call it from the command line, there is no error - the spawned process still exists.
Only when my application exits does the command-line kill return "No such process".
I can change this behavior in my code with this:
int iResult = waitpid(m_iProcessHandle, &iStatus, 0);
The call to waitpid closes down the spawned process and I can call kill and get -1 back, but by then I know the spawned process is dead.
And waitpid blocks my application!
How can I test a spawned process to see if it is running, but without blocking my application?
UPDATE
Thanks for the help! I have implemented your advice and here is the result:
// background-task.cpp
//
#include <spawn.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <signal.h>
#include "background-task.h"
CBackgroundTask::CBackgroundTask()
{
// Initialize member variables
m_iProcessHandle = 0;
}
CBackgroundTask::~CBackgroundTask()
{
// Clean up (kill first)
_lowLevel_cleanup(true);
}
bool CBackgroundTask::IsRunning()
{
// Shortcuts
if (m_iProcessHandle == 0)
return false;
// Poll without blocking: with WNOHANG, waitpid returns 0 while the child is still running
int iStatus = 0;
int iResult = waitpid(m_iProcessHandle, &iStatus, WNOHANG);
return (iResult == 0);
}
void CBackgroundTask::Wait()
{
// Wait (clean up without killing)
_lowLevel_cleanup(false);
}
void CBackgroundTask::Stop()
{
// Stop (kill and clean up)
_lowLevel_cleanup(true);
}
void CBackgroundTask::_start(const string& strProgramFilepath, const string& strArgs, int iNice /*=0*/)
{
// Call pre-start
_preStart();
// Split the args and build array of char-strings
CCharStringAarray argsWrapper(strArgs,' ');
// Run the command
int status = posix_spawnp(&m_iProcessHandle, (char*)strProgramFilepath.c_str(), NULL, NULL, argsWrapper.m_pBuffer, NULL);
if (status == 0)
{
// Process created
cout << "posix_spawn process=" << m_iProcessHandle << " status=" << status << endl;
}
else
{
// Failed
cout << "posix_spawn: error=" << status << endl;
}
// If process created...
if(m_iProcessHandle != 0)
{
// If need to adjust nice...
if (iNice != 0)
{
// Change the nice
stringstream ss;
ss << "sudo renice -n " << iNice << " -p " << m_iProcessHandle;
_runCommand(ss.str());
}
}
else
{
// Call post-stop success=false
_postStop(false);
}
}
void CBackgroundTask::_runCommand(const string& strCommand)
{
// Diagnostics
cout << "Running command: " << COUT_GREEN << strCommand << endl << COUT_RESET;
// Run command
system(strCommand.c_str());
}
void CBackgroundTask::_lowLevel_cleanup(bool bKill)
{
// Shortcuts
if (m_iProcessHandle == 0)
return;
// Diagnostics
cout << "Cleaning up process " << m_iProcessHandle << endl;
// If killing...
if (bKill)
{
// Kill the process
kill(m_iProcessHandle, SIGKILL);
}
// Diagnostics
cout << "Waiting for process " << m_iProcessHandle << " to finish" << endl;
// Wait for the process to finish
int iStatus = 0;
int iResult = waitpid(m_iProcessHandle, &iStatus, 0);
// Diagnostics
cout << "waitpid: status=" << iStatus << " result=" << iResult << endl;
// Reset the process-handle
m_iProcessHandle = 0;
// Call post-stop with success
_postStop(true);
// Diagnostics
cout << "Process cleaned" << endl;
}
Until the parent process calls one of the wait() functions to get the exit status of a child, the child stays around as a zombie process. If you run ps during this time, you'll see that the process is still there in the Z state. So kill() returns 0 because the process exists.
If you don't need to get the child's status, see How can I prevent zombie child processes? for how you can make the child disappear immediately when it exits.
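If you do want the status but cannot afford to block, waitpid with WNOHANG can act as a non-blocking "is it still running?" check that also reaps the zombie once the child has finished. Here is a minimal sketch; ChildState and poll_child are hypothetical names introduced for this example.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum class ChildState { Running, Finished, Error };

// Checks on `pid` without blocking.
// Running:  the child has not terminated yet.
// Finished: the child terminated; it has now been reaped and `status` is valid.
// Error:    waitpid failed, e.g. the child was already reaped earlier.
ChildState poll_child(pid_t pid, int& status) {
    assert(pid > 0);  // 0 and -1 would make waitpid match other children
    pid_t r = waitpid(pid, &status, WNOHANG);
    if (r == 0) return ChildState::Running;
    if (r == pid) return ChildState::Finished;
    return ChildState::Error;
}
```

Note that Finished is reported exactly once: a later call for the same pid returns Error, so the caller should forget the pid (for example by resetting it to 0) as soon as it sees Finished.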

Fork(): Dont return from child until it's terminated

I'm having some trouble with fork() and related calls.
I'm developing a shell where the user can enter commands that will be executed as in a normal shell.
I have a main function like this:
void Shell::init() {
string command;
while (1) {
cout << getPrompt() << " ";
command = readCommand();
if (command.length() > 0) handleCommand(command);
}
}
handleCommand() is the function that does pretty much everything. Somewhere in it, I have the following:
...
else {
pid_t pid;
pid = fork();
char* arg[tokens.size() + 1];
for (int i = 0; i < tokens.size(); ++i) {
arg[i] = (char*) tokens[i].c_str();
}
arg[tokens.size()] = NULL;
if (pid == 0) {
if (execvp(tokens[0].c_str(), arg) == -1) {
cout << "Command not known. " << endl;
};
} else {
wait();
}
}
What I want is that when I reach that point, the command is treated as a program invocation, so I create a child to run it. It works almost perfectly, but I get the prompt again before the program output. Example:
tronfi#orion:~/NetBeansProjects/Shell2$ whoami
tronfi#orion:~/NetBeansProjects/Shell2$ tronfi
tronfi#orion:~/NetBeansProjects/Shell2$
The child should die after the execvp, so it shouldn't be printing the prompt, and the parent should be waiting until the child dies.
So... what am I doing wrong?
Thanks!!
You are calling wait() incorrectly. It expects to be passed a pointer-to-int, in which the child's exit status will be stored:
int status;
wait(&status);
Really, though, you should be using waitpid() to check for the specific child that you're after. You also need to loop around if waitpid() is interrupted by a signal:
int r;
do {
r = waitpid(pid, &status, 0);
} while (r < 0 && errno == EINTR);
I'm not sure that this is exactly the problem, but you must ensure that the child exits even if execvp() fails:
if (pid == 0) {
if (execvp(tokens[0].c_str(), arg) == -1) {
cout << "Command not known. " << endl;
}
exit(1); // or some other error code to indicate that execvp() failed
} else {
wait(NULL);
}
If you don't do this, then if execvp() fails you will end up with two instances of your shell, which is probably not what you want.
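Combining the points made so far (wait for the specific child, retry on EINTR, and make sure the child never returns into the shell loop), a sketch of the fork/exec/wait step could look like this. run_command is a hypothetical helper name; the exit code 127 for a failed exec and 128 plus the signal number for a killed child follow common shell convention and are assumptions here, not requirements.

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs tokens[0] with the given arguments and waits for it to finish.
// Returns the exit code, 128 + signal number if the child was killed by a
// signal, or -1 if fork() or waitpid() failed.
int run_command(const std::vector<std::string>& tokens) {
    assert(!tokens.empty());

    // Build the argv array before forking.
    std::vector<char*> argv;
    for (const std::string& t : tokens)
        argv.push_back(const_cast<char*>(t.c_str()));
    argv.push_back(nullptr);

    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        execvp(argv[0], argv.data());
        _exit(127);  // only reached if execvp failed; never return to the shell loop
    }

    int status = 0;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);  // retry if interrupted by a signal
    if (r == -1) return -1;

    if (WIFEXITED(status)) return WEXITSTATUS(status);
    if (WIFSIGNALED(status)) return 128 + WTERMSIG(status);
    return -1;
}
```

The shell would print its next prompt only after run_command returns, and a return value of 127 can be reported as "Command not known." by the parent instead of the child.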
The child must be terminated with a call to exit(0) (only on success), as this helps clean up memory and flushes the buffers. The status returned by the child must be checked by the parent, and only then should the parent show the prompt.
Let me know if you need more details.