First off, allow me to describe my scenario:
I developed a supervisory program on Linux that forks and then uses execv(), in the child process, to launch my multi-threaded application. The supervisory program is acting as a watchdog to the multi-threaded application. If the multi-threaded application does not send a SIGUSR1 signal to the supervisor after a period of time then the supervisory program will kill the child using the pid_t from the fork() call and repeat the process again.
Here is the code for the Supervisory Program:
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
#include <iostream>
#include <cerrno>
#include <ctime>
time_t heartbeatTime;
void signalHandler(int sigNum)
{
//std::cout << "Signal (" << sigNum << ") received.\n";
time(&heartbeatTime);
}
int main(int argc, char *argv[])
{
pid_t cpid, ppid;
int result = 0;
bool programLaunched = false;
time_t now;
double timeDiff;
int error;
char ParentID[25];
char *myArgv[2];
// Get the Parent Process ID
ppid = ::getpid();
// Initialize the Child Process ID
cpid = 0;
// Copy the PID into the char array
sprintf(ParentID, "%i", ppid);
// Set up the array to pass to the Program
myArgv[0] = ParentID;
myArgv[1] = 0;
// Print out of the P PID
std::cout << "Parent ID: " << myArgv[0] << "\n";
// Register for the SIGUSR1 signal
signal(SIGUSR1, signalHandler);
// Register the SIGCHLD so the children processes exit fully
signal(SIGCHLD, SIG_IGN);
// Initialize the Heart Beat time
time(&heartbeatTime);
// Loop forever and ever, amen.
while (1)
{
// Check to see if the program has been launched
if (programLaunched == false)
{
std::cout << "Forking the process\n";
// Fork the process to launch the application
cpid = fork();
std::cout << "Child PID: " << cpid << "\n";
}
// Check if the fork was successful
if (cpid < 0)
{
std::cout << "Error in forking.\n";
// Error in forking
programLaunched = false;
}
else if (cpid == 0)
{
// Check if we need to launch the application
if (programLaunched == false)
{
// Send a message to the output
std::cout << "Launching Application...\n";
// Launch the Application
result = execv("./MyApp", myArgv);
std::cout << "execv result = " << result << "\n";
// Check if the program launched has failed
if (result != -1)
{
// Indicate the program has been launched
programLaunched = true;
// Exit the child process
return 0;
}
else
{
std::cout << "Child process terminated; bad execv\n";
// Flag that the program has not been launched
programLaunched = false;
// Exit the child process
return -1;
}
}
}
// In the Parent Process
else
{
// Get the current time
time(&now);
// Get the time difference between the program heartbeat time and current time
timeDiff = difftime(now, heartbeatTime);
// Check if we need to restart our application
if ((timeDiff > 60) && (programLaunched == true))
{
std::cout << "Killing the application\n";
// Kill the child process
kill(cpid, SIGINT);
// Indicate that the process was ended
programLaunched = false;
// Reset the Heart Beat time
time(&heartbeatTime);
return -1;
}
// Check to see if the child application is running
if (kill(cpid, 0) == -1)
{
// Get the Error
error = errno;
// Check if the process is running
if (error == ESRCH)
{
std::cout << "Process is not running; start it.\n";
// Process is not running.
programLaunched = false;
return -1;
}
}
else
{
// Child process is running
programLaunched = true;
}
}
// Give the process some time off.
sleep(5);
}
return 0;
}
This approach worked fairly well until I ran into a problem with the library I was using. It didn't like all of the killing and it basically ended up tying up my Ethernet port in an endless loop of never releasing - not good.
I then tried an alternative method. I modified the supervisory program so that it exits whenever it has to kill the multi-threaded application, and I created a script that launches the supervisor program from crontab. I used a shell script that I found on Stack Overflow.
#!/bin/bash
#make-run.sh
#make sure a process is always running.
export DISPLAY=:0 #needed if you are running a simple gui app.
process=YourProcessName
makerun="/usr/bin/program"
if ps ax | grep -v grep | grep $process > /dev/null
then
exit
else
$makerun &
fi
exit
I added it to crontab to run every minute. That helped: it restarted the supervisory program, which in turn restarted the multi-threaded application. However, I noticed that multiple instances of the multi-threaded application were being launched, and I'm not sure why.
I know I'm really hacking this up but I'm backed into a corner with this implementation. I'm just trying to get it to work.
Suggestions?
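One thing that might help with a library that reacts badly to being killed is to stop the child more gently and to wait until it is really gone before starting a new one. Below is a minimal sketch of that idea, not a drop-in fix: stop_child and grace_seconds are hypothetical names, it assumes the application handles SIGTERM by releasing its resources, and it assumes SIGCHLD is left at its default disposition (the supervisor above sets it to SIG_IGN, in which case waitpid() fails with ECHILD instead of returning the pid).

```cpp
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: ask the child to terminate, poll for up to
// grace_seconds, then fall back to SIGKILL. Returns true once the child
// has been reaped (or did not exist any more).
bool stop_child(pid_t pid, int grace_seconds)
{
    if (kill(pid, SIGTERM) == -1)
        return errno == ESRCH;              // already gone counts as stopped

    for (int i = 0; i < grace_seconds * 10; ++i) {
        pid_t r = waitpid(pid, nullptr, WNOHANG);
        if (r == pid)
            return true;                    // child exited and was reaped
        if (r == -1 && errno != EINTR)
            return false;                   // e.g. ECHILD: not ours to wait for
        usleep(100 * 1000);                 // poll every 100 ms
    }

    kill(pid, SIGKILL);                     // grace period is over
    pid_t r;
    do {
        r = waitpid(pid, nullptr, 0);       // blocking wait for the forced exit
    } while (r == -1 && errno == EINTR);
    return r == pid;
}
```

With something like this, the supervisor would fork the next instance only after stop_child(cpid, 10) has returned true, so two copies of the application never overlap.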
Following this documentation, I am testing how to stop and resume a process. I have basic code to test as follows:
#include <iostream>
#include <csignal>
#include <unistd.h>
int main() {
std::cout << "Hello" << std::endl;
int pid = getpid();
kill(pid, SIGSTOP);
kill(pid, SIGCONT);
std::cout << "Bye" << std::endl;
return 0;
}
The output is:
Hello
It stops the process, but it never resumes it. How should I fix it?
A solution, if a bit complicated, is to create a child process to start and stop the parent. Here is a small code example, that might help:
#include <iostream>
#include <csignal>
#include <unistd.h>
int pid; // Global for convenience; fork() copies all of the parent's memory, so a local variable would reach the child as well
int main() {
std::cout << "Hello" << std::endl;
pid = getpid();
int returned_pid = fork(); //Duplicate process into 2 identical processes
if(returned_pid) {
// If it is the parent process, then fork returns the child process pid
// This is executed by the parent process
usleep(1000); // Sleep a millisecond to allow for the stop command to run
} else {
// If fork returns 0, then it is the child process
// The else is executed by the child process
kill(pid, SIGSTOP); // Stop parent process
usleep(3000000); // Delay 3 seconds
kill(pid, SIGCONT); // Resume parent process
}
if(returned_pid) { // Only print if parent process
std::cout << "Bye" << std::endl;
}
return 0;
}
Clarification: The fork command returns 2 different values in the 2 processes: 0 in the child, and the pid of the child process in the parent.
Other note: When running this in a terminal, it will look weird, as the terminal may note that the process was stopped and give a new command line, but then the process resumes, so prints Bye over it. Just a note.
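For comparison, here is a small sketch of the opposite arrangement, where the parent stops and resumes a child and uses waitpid() to confirm each state change. The function name stop_and_resume_child is made up for illustration, and the sketch assumes Linux, where waitpid() supports WUNTRACED and WCONTINUED.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Returns true if the child was observed to stop and then to continue.
bool stop_and_resume_child()
{
    pid_t child = fork();
    if (child == -1)
        return false;
    if (child == 0) {
        for (;;)
            pause();                        // child: idle until it is killed
    }

    int status = 0;
    bool ok = true;

    kill(child, SIGSTOP);                   // stop the child
    ok = ok && waitpid(child, &status, WUNTRACED) == child && WIFSTOPPED(status);

    kill(child, SIGCONT);                   // resume it
    ok = ok && waitpid(child, &status, WCONTINUED) == child && WIFCONTINUED(status);

    kill(child, SIGKILL);                   // clean up the demo child
    waitpid(child, &status, 0);
    return ok;
}
```

The point is the same as in the answer above: a stopped process cannot send itself SIGCONT, so some other process has to do it.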
I apologize if everything is not perfect ;)
I am writing a C++ program that shows a picture full screen with feh whenever it receives sensor information.
The problem is that every time I switch from one image to another, a new feh instance opens, until the computer eventually crashes because all the memory has been used up.
How can I make opening an image close the previous one?
This is my current command line:
system("feh -F ressources/icon_communication.png&");
I should add that I also play a sound, but that causes no problem because paplay exits by itself when the sound ends:
system("paplay /home/pi/demo_ecran_interactif/ressources/swip.wav&");
I tried this as a test and it works! Thanks #paul-sanders!
#include <iostream>
#include <chrono>
#include <thread>
#include <unistd.h>
#include <signal.h>
using namespace std;
pid_t display_image_file (const char *image_file)
{
pid_t pid = fork ();
if (pid == -1)
{
std::cout << "Could not fork, error: " << errno << "\n";
return -1;
}
if (pid != 0) // parent
return pid;
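// child
execlp ("feh", "feh", "-F", image_file, (char *) NULL); // the first argument after the file name is argv[0]; only returns on failure
std::cerr << "Couldn't exec feh for image file " << image_file << ", error: " << errno << "\n";
_exit (127); // end the failed child here instead of returning into main()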
}
int main()
{
pid_t pid = display_image_file ("nav.png");
if (pid != -1)
{
std::this_thread::sleep_for (std::chrono::milliseconds (2000));
kill (pid, SIGKILL);
}
pid_t pid2 = display_image_file ("sms2.png");
}
Soooooooooo, the goal here (in terms of your test program) seems to be:
1. display nav.png in feh
2. wait 2 seconds
3. close (that instance of) feh
4. display sms2.png in feh
And if you can get the test program doing that then you will be on your way (I'm not going to worry my pretty little head about your sound issue (because it's 30+ degrees here today), but once you have your test program running right then you will probably be able to figure out how to solve that one yourself).
So, two issues that I see in your code here:
1. you're not making any effort to close the first instance of 'feh'
2. execlp() doesn't do quite what you probably think it does (specifically, it never returns, unless it fails for some reason).
So what I think you need to do is something like this (code untested, might not even compile and you need to figure out the right header files to #include, but it should at least get you going):
pid_t display_image_file (const char *image_file)
{
pid_t pid = fork ();
if (pid == -1)
{
std::cout << "Could not fork, error: " << errno << "\n";
return -1;
}
if (pid != 0) // parent
return pid;
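// child
execlp ("feh", "feh", "-F", image_file, (char *) NULL); // the first argument after the file name is argv[0]; only returns on failure
std::cerr << "Couldn't exec feh for image file " << image_file << ", error: " << errno << "\n";
_exit (127); // end the failed child here instead of returning into main()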
}
int main()
{
pid_t pid = display_image_file ("nav.png");
if (pid != -1)
{
std::this_thread::sleep_for (std::chrono::milliseconds (2000));
kill (pid, SIGKILL);
}
pid = display_image_file ("sms2.png"); // reuse pid; declaring it a second time would not compile
// ...
}
Does that help?
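If the images keep changing, the kill-then-display steps could be folded into one helper. This is only a sketch under assumptions: replace_child is a hypothetical name, the command is passed as a ready-made argv array, and SIGKILL is used as in the code above. It also calls waitpid() after the kill so that the old viewer does not linger as a zombie.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: kill and reap the previous child (if any), then start
// the program named by argv[0]. Returns the new child's pid, or -1 if fork failed.
pid_t replace_child(pid_t previous, char* const argv[])
{
    if (previous > 0) {
        kill(previous, SIGKILL);            // close the old instance
        waitpid(previous, nullptr, 0);      // reap it so no zombie is left behind
    }

    pid_t pid = fork();
    if (pid == 0) {
        execvp(argv[0], argv);              // only returns on failure
        _exit(127);                         // never fall back into the caller's code
    }
    return pid;                             // child pid in the parent, -1 on error
}
```

The caller keeps one pid_t variable, starts it at -1, and assigns the return value back to it on every image change, with argv holding "feh", "-F", the image path and a terminating null pointer.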
I am writing code for a research program. I have the following requirements:
1. Main binary execution begins at main()
2. main() calls fork()
3. the child process runs a linpack benchmark binary using execvp()
4. the parent process runs some monitoring code and waits for the child to exit.
The code is below:
main.cpp
extern ServerUncorePowerState * BeforeStates ;
extern ServerUncorePowerState * AfterStates;
int main(int argc, char *argv[]) {
power pwr;
procstat st;
membandwidth_t data;
int sec_pause = 1; // sample every 1 second
pid_t child_pid = fork();
if (child_pid >= 0) { //fork successful
if (child_pid == 0) { // child process
int exec_status = execvp(argv[1], argv+1);
if (exec_status) {
std::cerr << "execv failed with error "
<< errno << " "
<< strerror(errno) << std::endl;
}
} else { // parent process
int status = 1;
waitpid(child_pid, &status, WNOHANG);
write_headers();
pwr.init();
st.init();
init_bandwidth();
while (status) {
cout << " Printing status Value: " << status << endl;
sleep (sec_pause);
time_t now;
time(&now);
struct tm *tinfo;
tinfo = localtime(&now);
pwr.loop();
st.loop();
data = getbandwidth();
write_samples(tinfo, pwr, st, data.read_bandwidth + data.write_bandwidth);
waitpid(child_pid, &status, WNOHANG);
}
wait(&status); // wait for child to exit, and store its status
//--------------------This code is not executed------------------------
std::cout << "PARENT: Child's exit code is: "
<< WEXITSTATUS(status)
<< std::endl;
delete[] BeforeStates;
delete[] AfterStates;
}
} else {
std::cerr << "fork failed" << std::endl;
return 1;
}
return 0;
}
I expect the child to exit first and the parent after it, but for some unknown reason the parent exits after about 16 minutes while the child is still running.
It is normally said that when the parent exits, the child dies automatically.
What could be the reason for this strange behavior?
"Normally it is said that when the parent exits, the child dies automatically."
Well, this is not always true; it depends on the system. When a parent process terminates while its child is still running, the child becomes an orphan process. On a Unix-like OS the orphan is re-parented to the init process, and this is managed automatically by the OS. On other types of OS, orphan processes are automatically killed by the system. You can find more details here.
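From the code snippet I think the issue may be around the wait(&status) statement. The preceding loop ends (or is never entered) as soon as status becomes 0, which is what waitpid(child_pid, &status, WNOHANG) stores once the child has exited normally with exit code 0. By then the child has already been reaped, so wait(&status) is waiting on an already terminated process: there is no child left, and it returns -1 with errno set to ECHILD. Note also that status is a fragile loop condition: while the child is still running, waitpid with WNOHANG returns 0 and leaves status untouched, and if the child exits with a non-zero code the loop never ends. Checking the return value of waitpid is more reliable.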
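To illustrate driving the loop from the return value of waitpid() rather than from status, here is a minimal sketch. The name monitor_until_exit and the interval_ms parameter are invented for the example, and the place where the power and bandwidth samples would be taken is only marked by a comment.

```cpp
#include <cerrno>
#include <chrono>
#include <thread>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Polls child_pid until it terminates. Returns its exit code, or -1 if it
// was killed by a signal or waitpid() failed.
int monitor_until_exit(pid_t child_pid, unsigned interval_ms)
{
    for (;;) {
        int status = 0;
        pid_t r = waitpid(child_pid, &status, WNOHANG);
        if (r == 0) {
            // Child is still running: take one monitoring sample here,
            // then sleep until the next sampling point.
            std::this_thread::sleep_for(std::chrono::milliseconds(interval_ms));
            continue;
        }
        if (r == -1) {
            if (errno == EINTR)
                continue;                   // interrupted by a signal, retry
            return -1;                      // e.g. ECHILD: nothing to wait for
        }
        // r == child_pid: the child has terminated and has been reaped.
        return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    }
}
```

Because the child is reaped inside the loop, no extra wait(&status) is needed afterwards.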
I wrote a helper function to start a process using fork() and execv() inspired by this answer. It is used to start e.g. mysqldump to make a database backup.
The code works totally fine in a couple of different locations with different programs.
Now I have hit one situation where it fails:
It is a call to systemctl to stop a unit. Running systemctl works, and the unit is stopped. But in the intermediate process, when wait()ing for the child process, wait() hangs until the timeout process ends.
If I check with kill() whether the worker process has finished, I can tell that it has.
Important: The program does not misbehave or seg fault, besides that the wait() does not signal the end of the worker process!
Is there anything in my code (see below) that is incorrect that could trigger that behavior?
I've read Threads and fork(): think twice before mixing them but I cannot find anything in there that relates to my problem.
What's strange:
Deep, deep, deep in the program JSON-RPC is used. If I deactivate the code using the JSON-RPC everything works fine!?
Environment:
The program that uses the function is a multi-threaded application. Signals are blocked for all threads. The main thread handles signals via sigtimedwait().
Code (production code in which logging got traded for output via std::cout) with sample main function:
#include <cerrno>
#include <cstdlib>
#include <iostream>
#include <signal.h>
#include <unistd.h>
#include <sys/wait.h>
namespace {
bool checkStatus(const int status) {
return( WIFEXITED(status) && ( WEXITSTATUS(status) == 0 ) );
}
}
bool startProcess(const char* const path, const char* const argv[], const unsigned int timeoutInSeconds, pid_t& processId, const int* const fileDescriptor) {
auto result = true;
const pid_t intermediatePid = fork();
if(intermediatePid == 0) {
// intermediate process
std::cout << "Intermediate process: Started (" << getpid() << ")." << std::endl;
const pid_t workerPid = fork();
if(workerPid == 0) {
// worker process
if(fileDescriptor) {
std::cout << "Worker process: Redirecting file descriptor to stdin." << std::endl;
const auto dupResult = dup2(*fileDescriptor, STDIN_FILENO);
if(-1 == dupResult) {
std::cout << "Worker process: Duplication of file descriptor failed." << std::endl;
_exit(EXIT_FAILURE);
}
}
execv(path, const_cast<char**>(argv));
std::cout << "Intermediate process: Worker failed!" << std::endl;
_exit(EXIT_FAILURE);
} else if(-1 == workerPid) {
std::cout << "Intermediate process: Starting worker failed!" << std::endl;
_exit(EXIT_FAILURE);
}
const pid_t timeoutPid = fork();
if(timeoutPid == 0) {
// timeout process
std::cout << "Timeout process: Started (" << getpid() << ")." << std::endl;
sleep(timeoutInSeconds);
std::cout << "Timeout process: Finished." << std::endl;
_exit(EXIT_SUCCESS);
} else if(-1 == timeoutPid) {
std::cout << "Intermediate process: Starting timeout process failed." << std::endl;
kill(workerPid, SIGKILL);
std::cout << "Intermediate process: Finished." << std::endl;
_exit(EXIT_FAILURE);
}
// ---------------------------------------
// This code is only used for double checking if the worker is still running.
// The if condition never evaluated to true in my tests.
const auto killResult = kill(workerPid, 0);
if((-1 == killResult) && (ESRCH == errno)) {
std::cout << "Intermediate process: Worker is not running." << std::endl;
}
// ---------------------------------------
std::cout << "Intermediate process: Waiting for child processes." << std::endl;
int status = -1;
const pid_t exitedPid = wait(&status);
// ---------------------------------------
// This code is only used for double checking if the worker is still running.
// The if condition evaluates to true in the case of an error.
const auto killResult2 = kill(workerPid, 0);
if((-1 == killResult2) && (ESRCH == errno)) {
std::cout << "Intermediate process: Worker is not running." << std::endl;
}
// ---------------------------------------
std::cout << "Intermediate process: Child process finished. Status: " << status << "." << std::endl;
if(exitedPid == workerPid) {
std::cout << "Intermediate process: Killing timeout process." << std::endl;
kill(timeoutPid, SIGKILL);
} else {
std::cout << "Intermediate process: Killing worker process." << std::endl;
kill(workerPid, SIGKILL);
std::cout << "Intermediate process: Waiting for worker process to terminate." << std::endl;
wait(nullptr);
std::cout << "Intermediate process: Finished." << std::endl;
_exit(EXIT_FAILURE);
}
std::cout << "Intermediate process: Waiting for timeout process to terminate." << std::endl;
wait(nullptr);
std::cout << "Intermediate process: Finished." << std::endl;
_exit(checkStatus(status) ? EXIT_SUCCESS : EXIT_FAILURE);
} else if(-1 == intermediatePid) {
// error
std::cout << "Parent process: Error starting intermediate process!" << std::endl;
result = false;
} else {
// parent process
std::cout << "Parent process: Intermediate process started. PID: " << intermediatePid << "." << std::endl;
processId = intermediatePid;
}
return(result);
}
bool waitForProcess(const pid_t processId) {
int status = 0;
const auto waitResult = waitpid(processId, &status, 0);
auto result = false;
if(waitResult == processId) {
result = checkStatus(status);
}
return(result);
}
int main() {
pid_t pid = 0;
const char* const path = "/bin/ls";
const char* argv[] = { "/bin/ls", "--help", nullptr };
const unsigned int timeoutInS = 5;
const auto startResult = startProcess(path, argv, timeoutInS, pid, nullptr);
if(startResult) {
const auto waitResult = waitForProcess(pid);
std::cout << "waitForProcess returned " << waitResult << "." << std::endl;
} else {
std::cout << "startProcess failed!" << std::endl;
}
}
Edit
The expected output should contain
Intermediate process: Waiting for child processes.
Intermediate process: Child process finished. Status: 0.
Intermediate process: Killing timeout process.
In the case of error the output looks like this
Intermediate process: Waiting for child processes.
Intermediate process: Child process finished. Status: -1
Intermediate process: Killing worker process.
When you run the sample code you will most likely see the expected output. I cannot reproduce the incorrect result in a simple example.
I found the problem:
Within the mongoose (JSON-RPC uses mongoose) sources in the function mg_start I found the following code
#if !defined(_WIN32) && !defined(__SYMBIAN32__)
// Ignore SIGPIPE signal, so if browser cancels the request, it
// won't kill the whole process.
(void) signal(SIGPIPE, SIG_IGN);
// Also ignoring SIGCHLD to let the OS to reap zombies properly.
(void) signal(SIGCHLD, SIG_IGN);
#endif // !_WIN32
(void) signal(SIGCHLD, SIG_IGN);
has the effect that
"if the parent does a wait(), this call will return only when all children have exited, and then returns -1 with errno set to ECHILD."
as mentioned here in section 5.5, Voodoo: wait and SIGCHLD.
This is also described in the man page for WAIT(2)
ERRORS [...]
ECHILD [...] (This can happen for
one's own child if the action for SIGCHLD is set to SIG_IGN.
See also the Linux Notes section about threads.)
Stupid on my part not to check the return value correctly.
Before trying
if(exitedPid == workerPid) {
I should have checked that exitedPid is != -1.
If I do so, errno gives me ECHILD. Had I known that in the first place, I would have read the man page and probably found the problem faster...
Naughty of mongoose just to mess with signal handling no matter what an application wants to do about it. Additionally mongoose does not revert the altering of signal handling when being stopped with mg_stop.
Additional info:
The code that caused this problem was changed in mongoose in September 2013 with this commit.
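The effect is easy to reproduce in isolation. The sketch below uses a hypothetical helper, wait_result_with_sigchld, that sets the SIGCHLD disposition, forks a short-lived child and reports what wait() did. It assumes Linux behaviour as described in the man page quoted above.

```cpp
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Sets SIGCHLD to the given disposition, forks a child that exits at once,
// and waits for it. Returns 0 if wait() returned the child's pid, otherwise
// the errno value wait() failed with. The old disposition is restored.
int wait_result_with_sigchld(void (*disposition)(int))
{
    struct sigaction sa {};
    struct sigaction old {};
    sa.sa_handler = disposition;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGCHLD, &sa, &old);

    int result = 0;
    pid_t child = fork();
    if (child == 0)
        _exit(0);                           // child: exit immediately
    if (child == -1) {
        result = errno;
    } else {
        pid_t r;
        do {
            r = wait(nullptr);
        } while (r == -1 && errno == EINTR);
        result = (r == child) ? 0 : errno;  // ECHILD when SIGCHLD is ignored
    }

    sigaction(SIGCHLD, &old, nullptr);      // put the previous handling back
    return result;
}
```

Called with SIG_DFL this returns 0, and with SIG_IGN it returns ECHILD. Since an ignored disposition is inherited across fork(), one possible workaround in code like startProcess() would be to set SIGCHLD back to SIG_DFL in the intermediate process before it forks the worker and the timeout process.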
We faced a similar issue in our application: under heavy load with repeated fork() calls, the child process never returned. One option is to monitor the PID of the child process and, if it has not returned within an application-defined threshold, terminate it by sending it a SIGKILL or SIGTERM signal.
I am forking a number of processes and I want to measure how long it takes to complete the whole task, that is when all processes forked are completed. Please advise how to make the parent process wait until all child processes are terminated? I want to make sure that I stop the timer at the right moment.
Here is the code I use:
#include <cstdlib>
#include <iostream>
#include <string>
#include <fstream>
#include <sys/time.h>
#include <sys/wait.h>
#include <unistd.h>
using namespace std;
struct timeval first, second, lapsed;
struct timezone tzp;
int main(int argc, char* argv[])// query, file, num. of processes.
{
int pCount = 5; // process count
gettimeofday (&first, &tzp); //start time
pid_t* pID = new pid_t[pCount];
for(int indexOfProcess=0; indexOfProcess<pCount; indexOfProcess++)
{
pID[indexOfProcess]= fork();
if (pID[indexOfProcess] == 0) // child
{
// code only executed by child process
// magic here
// The End
exit(0);
}
else if (pID[indexOfProcess] < 0) // failed to fork
{
cerr << "Failed to fork" << endl;
exit(1);
}
else // parent
{
// if(indexOfProcess==pCount-1) and a loop with waitpid??
gettimeofday (&second, &tzp); //stop time
if (first.tv_usec > second.tv_usec)
{
second.tv_usec += 1000000;
second.tv_sec--;
}
lapsed.tv_usec = second.tv_usec - first.tv_usec;
lapsed.tv_sec = second.tv_sec - first.tv_sec;
cout << "Job performed in " <<lapsed.tv_sec << " sec and " << lapsed.tv_usec << " usec"<< endl << endl;
}
}//for
}//main
I'd move everything after the line "else //parent" down, outside the for loop. After the loop of forks, do another for loop with waitpid, then stop the clock and do the rest (pids and pidCount below correspond to pID and pCount in your code):
for (int i = 0; i < pidCount; ++i) {
int status;
while (-1 == waitpid(pids[i], &status, 0));
if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) {
cerr << "Process " << i << " (pid " << pids[i] << ") failed" << endl;
exit(1);
}
}
gettimeofday (&second, &tzp); //stop time
I've assumed that if the child process fails to exit normally with a status of 0, then it didn't complete its work, and therefore the test has failed to produce valid timing data. Obviously if the child processes are supposed to be killed by signals, or exit non-0 return statuses, then you'll have to change the error check accordingly.
An alternative using wait:
while (true) {
int status;
pid_t done = wait(&status);
if (done == -1) {
if (errno == ECHILD) break; // no more child processes
} else {
if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) {
cerr << "pid " << done << " failed" << endl;
exit(1);
}
}
}
This one doesn't tell you which process in sequence failed, but if you care then you can add code to look it up in the pids array and get back the index.
The simplest method is to do
while(wait(NULL) > 0) { /* no-op */ ; }
This will not work if wait() fails for some reason other than the fact that there are no children left. So with some error checking, this becomes
int status;
[...]
do {
status = wait(NULL);
if(status == -1 && errno != ECHILD) {
perror("Error during wait()");
abort();
}
} while (status > 0);
See also the manual page wait(2).
Call wait (or waitpid) in a loop until all children are accounted for.
In this case, all processes are synchronizing anyway, but in general wait is preferred when more work can be done (eg worker process pool), since it will return when the first available process state changes.
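Put together, the whole measurement might look like the sketch below. It is a minimal example under assumptions, not the asker's program: time_children is a made-up name, each child just sleeps for work_ms milliseconds as a placeholder for the real work, and std::chrono::steady_clock is used in place of gettimeofday.

```cpp
#include <cerrno>
#include <chrono>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks `count` children, waits until every one of them has terminated and
// returns the elapsed wall-clock time in seconds, or -1.0 on failure.
double time_children(int count, unsigned work_ms)
{
    const auto start = std::chrono::steady_clock::now();

    int started = 0;
    for (int i = 0; i < count; ++i) {
        pid_t pid = fork();
        if (pid == 0) {
            usleep(work_ms * 1000);         // placeholder for the child's work
            _exit(0);
        }
        if (pid == -1)
            break;                          // stop forking, reap what we have
        ++started;
    }

    int reaped = 0;
    while (reaped < started) {
        pid_t done = wait(nullptr);         // returns as each child finishes
        if (done > 0)
            ++reaped;
        else if (errno != EINTR)
            break;                          // e.g. ECHILD: nothing left to wait for
    }

    const auto stop = std::chrono::steady_clock::now();
    if (started != count || reaped != started)
        return -1.0;
    return std::chrono::duration<double>(stop - start).count();
}
```

The clock is stopped only after the wait loop has accounted for every child, which is the moment the question asks for.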
I believe the wait system call will accomplish what you are looking for.
for (int i = 0; i < pidCount; i++) {
while (waitpid(pids[i], NULL, 0) > 0);
}
It won't wait in the right order, but it will stop shortly after the last child dies.