I'm trying to launch a Linux service from a C++ program. The launch succeeds, but one of my processes is marked as "defunct", and I don't want my parent process to die.
My code is (testRip.cpp):
#include <stdint.h>
#include <stdlib.h>
#include <syslog.h>
#include <unistd.h>

int main()
{
    // Argument arrays passed to execv() must end with a NULL pointer.
    char* zebraArg[3];
    zebraArg[0] = (char *)"zebra";
    zebraArg[1] = (char *)"restart";
    zebraArg[2] = NULL;
    char* ripdArg[3];
    ripdArg[0] = (char *)"ripd";
    ripdArg[1] = (char *)"restart";
    ripdArg[2] = NULL;
    pid_t ripPid;
    pid_t zebraPid;
    zebraPid = fork();
    if(zebraPid == 0)
    {
        // execv() only returns if it failed, and then it returns -1.
        int32_t iExecvRes = execv("/etc/init.d/zebra", zebraArg);
        if(iExecvRes == -1)
        {
            ::syslog((LOG_LOCAL0 | LOG_ERR),
                     "zebra process failed \n");
        }
        _exit(EXIT_FAILURE);
    }
    else
    {
        while(1)
        {
            ::syslog((LOG_LOCAL0 | LOG_ERR),
                     "running\n");
            sleep(2);
        }
    }
}
The output of the ps -e command is:
9411 pts/1 00:00:00 testRip
9412 pts/1 00:00:00 testRip <defunct>
9433 ? 00:00:00 zebra
The /etc/init.d/zebra script launches the service as a daemon or something like that, so I think that is the cause, but:
Why are there 3 processes, and why is one of them marked as defunct?
What is wrong in my code?
How can I fix it?
Thanks in advance.
To remove zombies, the parent process must wait() for its child, or die. If you need a non-blocking wait(), look at waitpid() with the WNOHANG flag.
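As a minimal sketch of the non-blocking approach, assuming the parent can call a helper periodically (for example once per iteration of its main loop), the following reaps every child that has already exited without ever blocking. The name reapFinishedChildren is hypothetical and not part of the original code.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: reap every child that has already exited.
// Never blocks. Returns the number of children reaped.
int reapFinishedChildren()
{
    int reaped = 0;
    int status = 0;
    // waitpid(-1, ..., WNOHANG) returns a pid (> 0) for each reaped child,
    // 0 if children exist but none has exited yet, and -1 if there are no
    // children at all.
    while (waitpid(-1, &status, WNOHANG) > 0) {
        ++reaped;
    }
    return reaped;
}
```

In the loop from the question, calling such a helper before each sleep(2) should be enough to make the defunct entry disappear.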
Correctly forking a daemon process is hard in Unix and Linux because there are a lot of details to get right, and order is also important. In this case I would suspect a combination of open file descriptors and not detaching from the controlling terminal.
I would strongly suggest using a well-debugged implementation from another program - one of the reduced-functionality command line shells such as rsh or ksh may be a good choice, rather than trying to bake your own version.
I've been tasked to create a program that takes a text file that contains a list of programs as input. It then needs to run valgrind on the programs (one at a time) until valgrind ends or until the program hits a max allotted time. I have the program doing everything I need it to do EXCEPT it isn't waiting for valgrind to finish. The code I'm using has this format:
//code up to this point is working properly
pid_t pid = fork();
if(pid == 0){
string s = "sudo valgrind --*options omitted*" + testPath + " &>" + outPath;
system(s.c_str());
exit(0);
}
//code after here seems to also be working properly
I'm running into an issue where the child just calls the system and moves on without waiting for valgrind to finish. As such I'm guessing that system isn't the right call to use, but I don't know what call I should be making. Can anyone tell me how to get the child to wait for valgrind to finish?
I think that you are looking for fork/execv. Here is an example:
http://www.cs.ecu.edu/karl/4630/spr01/example1.html
Another alternative could be popen.
You can fork and exec your program and then wait for it to finish. See the following example.
pid_t pid = vfork();
if(pid == -1)
{
    perror("vfork() failed");
    return -1;
}
else if(pid == 0)
{
    char *args[] = {(char *)"/bin/sleep", (char *)"5", (char *)0};
    execv("/bin/sleep", args);
    _exit(127); // only reached if execv() failed; the child must not fall through
}
int child_status;
pid_t child_pid = wait(&child_status);
printf("Child %d finished with status %d\n", (int)child_pid, child_status);
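The status filled in by wait() is an encoded word, not the exit code itself, so it is usually decoded with the WIFEXITED / WEXITSTATUS macros. The sketch below shows one way to do that around a fork/exec/wait sequence. The helper names describeWaitStatus and runAndDescribe are hypothetical, and it uses fork() and execvp() rather than vfork() and execv() purely for illustration.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <string>

// Hypothetical helper: turn a wait() status word into readable text.
std::string describeWaitStatus(int status)
{
    if (WIFEXITED(status))
        return "exited with code " + std::to_string(WEXITSTATUS(status));
    if (WIFSIGNALED(status))
        return "killed by signal " + std::to_string(WTERMSIG(status));
    return "stopped or continued";
}

// Hypothetical helper: fork, exec argv[0] (looked up in PATH), block until
// the child finishes, then describe how it ended.
std::string runAndDescribe(char* const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return "fork failed";
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);               // reached only if execvp() failed
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return "waitpid failed";
    return describeWaitStatus(status);
}
```

Because waitpid() is called with flags 0, the caller does not continue until the child (for example valgrind) has really finished.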
I am launching a command using the system() API (I am OK with using this API from C/C++). The command I pass may hang at times, so I would like to kill it after a certain timeout.
Currently I am using it as:
system("COMMAND");
I want to use it something like this:
Run a command using a system-independent API (I don't want to use CreateProcess since it is Windows-only), and kill the process if it does not exit after 'X' minutes.
The system() function itself is part of standard C, but the commands it runs and the means of killing them are platform-specific, so there cannot be a platform-independent way of solving your problem. However, system() is also specified by POSIX, so on a POSIX platform the rest of the POSIX API should be available as well. So, one way to solve your problem is to use fork() and kill().
There is a complication in that system() invokes a shell, which will probably spawn other processes, and I presume you want to kill all of them, so one way to do that is to use a process group. The basic idea is to use fork() to create another process, place it in its own process group, and kill that group if it doesn't exit after a certain time.
A simple example - the program forks; the child process sets its own process group to be the same as its process ID, and uses system() to spawn an endless loop. The parent process waits 10 seconds then kills the process group, using the negative value of the child process PID. This will kill the forked process and any children of that process (unless they have changed their process group.)
Since the parent process is in a different group, the kill() has no effect on it.
#include <unistd.h>
#include <stdlib.h>
#include <signal.h>
#include <stdio.h>
int main() {
pid_t pid;
pid = fork();
if(pid == 0) { // child process
setpgid(getpid(), getpid());
system("while true ; do echo xx ; sleep 5; done");
} else { // parent process
sleep(10);
printf("Sleep returned\n");
kill(-pid, SIGKILL);
printf("killed process group %d\n", pid);
}
exit(0);
}
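The example above always sleeps for the full 10 seconds and never reaps the child. As a sketch of a variation on the same process-group idea, the parent can poll with waitpid(WNOHANG) and only kill the group once a deadline passes. The helper name runWithTimeout, the 50 ms polling interval, and the convention of returning -1 on timeout are all assumptions made for illustration.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <chrono>

// Hypothetical helper: run argv[0] (looked up in PATH) with a time limit.
// Returns the child's exit code, or -1 if it timed out, was killed by a
// signal, or could not be started.
int runWithTimeout(char* const argv[], int timeoutSeconds)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        setpgid(0, 0);                // child leads its own process group
        execvp(argv[0], argv);
        _exit(127);                   // reached only if execvp() failed
    }
    setpgid(pid, pid);                // also set from the parent to avoid a race

    const auto deadline = std::chrono::steady_clock::now()
                        + std::chrono::seconds(timeoutSeconds);
    int status = 0;
    for (;;) {
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            break;                    // child finished on its own
        if (r < 0)
            return -1;
        if (std::chrono::steady_clock::now() >= deadline) {
            kill(-pid, SIGKILL);      // negative pid: signal the whole group
            waitpid(pid, &status, 0); // reap it so no zombie is left behind
            return -1;
        }
        usleep(50 * 1000);            // poll every 50 ms
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would pass a NULL-terminated argument array, for example {"sleep", "5", NULL} with a limit of 1 second, and treat -1 as "did not finish normally".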
There is no standard, cross-platform system API. The hint is that they are system APIs! We're actually "lucky" that we get system, but we don't get anything other than that.
You could try to find some third-party abstraction.
Below is a C++ thread-based attempt for Linux (not tested).
#include <chrono>
#include <functional>
#include <iostream>
#include <string>
#include <thread>
#include <stdio.h>
using namespace std;
// execute system command and get output
// http://stackoverflow.com/questions/478898/how-to-execute-a-command-and-get-output-of-command-within-c
std::string exec(const char* cmd) {
FILE* pipe = popen(cmd, "r");
if (!pipe) return "ERROR";
char buffer[128];
std::string result = "";
while(!feof(pipe)) {
if(fgets(buffer, 128, pipe) != NULL)
result += buffer;
}
pclose(pipe);
return result;
}
void system_task(string& cmd){
exec(cmd.c_str());
}
int main(){
// system command that takes time
string command = "find /";
// run the command in a separate thread
std::thread t1(system_task, std::ref(command));
// gives some time for the system task
std::this_thread::sleep_for(chrono::milliseconds(200));
// get the process id of the system task
string query_command = "pgrep -u $LOGNAME " + command;
string process_id = exec(query_command.c_str());
// kill system task
cout << "killing process " << process_id << "..." << endl;
string kill_command = "kill " + process_id;
exec(kill_command.c_str());
if (t1.joinable())
t1.join();
cout << "continue work on main thread" << endl;
return 0;
}
I had a similar problem in Qt/QML development: I wanted to start a bash command while continuing to process events on the Qt loop, and to kill the bash command if it took too long.
I came up with the following class, which I'm sharing here (see below) in the hope that it may be of some use to people with a similar problem.
Instead of calling a 'kill' command, I call a cleanupCommand supplied by the developer. Example: if I call myScript.sh and want to make sure it does not run for more than 10 seconds, I call it the following way:
SystemWithTimeout systemWithTimeout("myScript.sh", 10, "killall myScript.sh");
systemWithTimeout.start();
Code:
class SystemWithTimeout {
private:
bool m_childFinished = false ;
QString m_childCommand ;
int m_seconds ;
QString m_cleanupCmd ;
int m_period;
void startChild(void) {
int rc = system(m_childCommand.toUtf8().data());
if (rc != 0) SYSLOG(LOG_NOTICE, "Error SystemWithTimeout startChild: system returned %d", rc);
m_childFinished = true ;
}
public:
SystemWithTimeout(QString cmd, int seconds, QString cleanupCmd)
: m_childFinished {false}, m_childCommand {cmd}, m_seconds {seconds}, m_cleanupCmd {cleanupCmd}
{ m_period = 200; }
void setPeriod(int period) {m_period = period;}
void start(void) ;
};
void SystemWithTimeout::start(void)
{
m_childFinished = false ; // re-arm the boolean for 2nd and later calls to 'start'
qDebug()<<"systemWithTimeout"<<m_childCommand<<m_seconds;
QTime dieTime= QTime::currentTime().addSecs(m_seconds);
std::thread child(&SystemWithTimeout::startChild, this);
child.detach();
while (!m_childFinished && QTime::currentTime() < dieTime)
{
QTime then = QTime::currentTime();
QCoreApplication::processEvents(QEventLoop::AllEvents, m_period); // Process events during up to m_period ms (default: 200ms)
QTime now = QTime::currentTime();
int waitTime = m_period-(then.msecsTo(now)) ;
if (waitTime > 0) QThread::msleep(waitTime); // wait for the remainder of the period (default: 200 ms) before looping again; skip the sleep if processEvents already used it up.
}
if (!m_childFinished)
{
SYSLOG(LOG_NOTICE, "Killing command <%s> after timeout reached (%d seconds)", m_childCommand.toUtf8().data(), m_seconds);
int rc = system(m_cleanupCmd.toUtf8().data());
if (rc != 0) SYSLOG(LOG_NOTICE, "Error SystemWithTimeout 164: system returned %d", rc);
m_childFinished = true ;
}
}
I do not know of any portable way to do that in C or C++. Since you ask for alternatives, I know it is possible in other languages. For example, in Python it is possible using the subprocess module.
import subprocess
cmd = subprocess.Popen("COMMAND", shell = True)
You can then test if COMMAND has ended with
if cmd.poll() is not None:
# cmd has finished
and you can kill it with :
cmd.terminate()
Even if you prefer to use C, you should read the documentation for the subprocess module, because it explains that internally it uses CreateProcess on Windows and os.execvp on POSIX systems to start the command, and TerminateProcess on Windows and SIGTERM on POSIX to stop it.
I have created a binary using C++ on Ubuntu which runs as a daemon, reads data from a database and stores it in an XML file.
To stop the daemon I am using this function, but it is not working.
void stopService()
{
int mypid;
if(((mypid = validate_pid()) > 0) || ((mypid = validate_non_pid()) > 0)) {
if(0 == kill(mypid, SIGTERM) ) {
sleep(1);
}
else {
printf("Stopping %s [FAILED]\n", Service);// this line is getting printed.
}
}
else {
printf("Stopping %s [ Failed ] Not Running....\n", Service);
}
}
The output I get is "Stopping the service [FAILED]".
'validate_pid()' returns the pid from /proc/some id/cmdline.
and
'validate_non_pid()' returns the pid using pgrep.
I am not posting the complete code since it would be lengthy.
Thanks in advance.
One more thing: I call this function based on a command-line argument, in a switch():
'case 'e':
stopService();'
So how can I kill this process using kill()?
It can be a permission issue.
Print the Process Id using
'getpid();'
Kill it manually from terminal.
'sudo kill -9 processid'
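To tell a permission problem apart from a stale or wrong pid, it can help to look at errno after kill() fails. The sketch below maps the most common error codes to a short message. The helper name trySignal is hypothetical. Passing signal 0 only checks whether the process exists and may be signalled, without actually sending anything.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>
#include <cerrno>
#include <string>

// Hypothetical helper: send `sig` to `pid` and report why it failed, if it did.
// With sig == 0 nothing is sent; kill() only performs the error checks.
std::string trySignal(pid_t pid, int sig)
{
    if (kill(pid, sig) == 0)
        return "ok";
    switch (errno) {
    case ESRCH:  return "no such process";    // pid does not exist (any more)
    case EPERM:  return "permission denied";  // process belongs to another user
    case EINVAL: return "invalid signal";
    default:     return "other error";
    }
}
```

In stopService() from the question, printing the result of such a check in the [FAILED] branch would show whether the pid returned by validate_pid() is wrong or whether the daemon runs under a different user.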
I have the helper function below, used to execute a command and get the return value on posix systems. I used to use popen, but it is impossible to get the return code of an application with popen if it runs and exits before popen/pclose gets a chance to do its work.
The following helper function creates a process fork, uses execvp to run the desired external process, and then the parent uses waitpid to get the return code. I'm seeing odd cases where it's refusing to run.
When called with wait = true, waitpid should return the exit code of the application no matter what. However, I'm seeing stdout output indicating that the return code should be non-zero, yet the return code is zero. Running the external process in a regular shell and then echoing $? gives a non-zero value, so the problem is not that the external process fails to return the right code. If it's of any help, the external process being run is mount(8) (yes, I know I can use mount(2), but that's beside the point).
I apologize in advance for a code dump. Most of it is debugging/logging:
inline int ForkAndRun(const std::string &command, const std::vector<std::string> &args, bool wait = false, std::string *output = NULL)
{
std::string debug;
std::vector<char*> argv;
for(size_t i = 0; i < args.size(); ++i)
{
argv.push_back(const_cast<char*>(args[i].c_str()));
debug += "\"";
debug += args[i];
debug += "\" ";
}
argv.push_back((char*)NULL);
neosmart::logger.Debug("Executing %s", debug.c_str());
int pipefd[2];
if (pipe(pipefd) != 0)
{
neosmart::logger.Error("Failed to create pipe descriptor when trying to launch %s", debug.c_str());
return EXIT_FAILURE;
}
pid_t pid = fork();
if (pid == 0)
{
close(pipefd[STDIN_FILENO]); //child isn't going to be reading
dup2(pipefd[STDOUT_FILENO], STDOUT_FILENO);
close(pipefd[STDOUT_FILENO]); //now that it's been dup2'd
dup2(pipefd[STDOUT_FILENO], STDERR_FILENO);
if (execvp(command.c_str(), &argv[0]) != 0)
{
exit(EXIT_FAILURE);
}
return 0;
}
else if (pid < 0)
{
neosmart::logger.Error("Failed to fork when trying to launch %s", debug.c_str());
return EXIT_FAILURE;
}
else
{
close(pipefd[STDOUT_FILENO]);
int exitCode = 0;
if (wait)
{
waitpid(pid, &exitCode, wait ? __WALL : (WNOHANG | WUNTRACED));
std::string result;
char buffer[128];
ssize_t bytesRead;
while ((bytesRead = read(pipefd[STDIN_FILENO], buffer, sizeof(buffer)-1)) != 0)
{
buffer[bytesRead] = '\0';
result += buffer;
}
if (wait)
{
if ((WIFEXITED(exitCode)) == 0)
{
neosmart::logger.Error("Failed to run command %s", debug.c_str());
neosmart::logger.Info("Output:\n%s", result.c_str());
}
else
{
neosmart::logger.Debug("Output:\n%s", result.c_str());
exitCode = WEXITSTATUS(exitCode);
if (exitCode != 0)
{
neosmart::logger.Info("Return code %d", (exitCode));
}
}
}
if (output)
{
result.swap(*output);
}
}
close(pipefd[STDIN_FILENO]);
return exitCode;
}
}
Note that the command is run OK with the correct parameters, the function proceeds without any problems, and WIFEXITED returns TRUE. However, WEXITSTATUS returns 0, when it should be returning something else.
Probably isn't your main issue, but I think I see a small problem. In your child process, you have...
dup2(pipefd[STDOUT_FILENO], STDOUT_FILENO);
close(pipefd[STDOUT_FILENO]); //now that it's been dup2'd
dup2(pipefd[STDOUT_FILENO], STDERR_FILENO); //but wait, this pipe is closed!
But I think what you want is:
dup2(pipefd[STDOUT_FILENO], STDOUT_FILENO);
dup2(pipefd[STDOUT_FILENO], STDERR_FILENO);
close(pipefd[STDOUT_FILENO]); //now that it's been dup2'd for both, can close
I don't have much experience with forks and pipes in Linux, but I did write a similar function pretty recently. You can take a look at the code to compare, if you'd like. I know that my function works.
execAndRedirect.cpp
I'm using the mongoose library, and grepping my code for SIGCHLD revealed that using mg_start from mongoose results in setting SIGCHLD to SIG_IGN.
From the waitpid man page, on Linux a SIGCHLD set to SIG_IGN will not create a zombie process, so waitpid will fail if the process has already successfully run and exited - but will run OK if it hasn't yet. This was the cause of the sporadic failure of my code.
Simply re-setting SIGCHLD after calling mg_start to a void function that does absolutely nothing was enough to keep the zombie records from being immediately erased.
Per @Geoff_Montee's advice, there was a bug in my redirect of STDERR, but this was not responsible for the problem as execvp does not store the return value in STDERR or even STDOUT, but rather in the kernel object associated with the parent process (the zombie record).
@jilles' warning about non-contiguity of vector in C++ does not apply for C++03 and up (only valid for C++98, though in practice, most C++98 compilers did use contiguous storage, anyway) and was not related to this issue. However, the advice on reading from the pipe before blocking and checking the output of waitpid is spot-on.
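The SIGCHLD fix described in this answer can be sketched roughly as follows: install a handler that does nothing, so that the disposition is no longer SIG_IGN and exited children stay around as zombies until waitpid() collects them. The function names here are hypothetical, and the use of SA_RESTART plus an EINTR retry loop is an assumption added for robustness rather than something taken from the original code.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cerrno>

// A handler that deliberately does nothing. Its only purpose is to make the
// SIGCHLD disposition something other than SIG_IGN.
extern "C" void noopSigchldHandler(int) {}

// Hypothetical helper: call after any library has set SIGCHLD to SIG_IGN.
bool keepZombiesForWaitpid()
{
    struct sigaction sa {};
    sa.sa_handler = noopSigchldHandler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;   // restart system calls interrupted by SIGCHLD
    return sigaction(SIGCHLD, &sa, nullptr) == 0;
}

// Hypothetical helper: fork/exec and return the child's exit code, or -1.
int runAndGetExitCode(char* const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);             // reached only if execvp() failed
    }
    int status = 0;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);
    } while (r < 0 && errno == EINTR);
    if (r != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Restoring SIG_DFL instead of installing an empty handler should have the same effect on zombie creation; the empty handler simply mirrors what is described above.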
I've found that pclose does NOT block and wait for the process to end, contrary to the documentation (this is on CentOS 6). I've found that I need to call pclose and then call waitpid(pid,&status,0); to get the true return value.
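For comparison, this is a sketch of how the status returned by pclose() is normally decoded, assuming pclose() behaves as documented (it waits for the command and returns a wait()-style status) and that SIGCHLD is not being ignored. The helper name runViaPopen is hypothetical.

```cpp
#include <sys/wait.h>
#include <stdio.h>
#include <string>

// Hypothetical helper: run a shell command, append its stdout to `output`,
// and return its exit code (or -1 if it could not be run or did not exit
// normally).
int runViaPopen(const std::string& command, std::string& output)
{
    FILE* pipe = popen(command.c_str(), "r");
    if (!pipe)
        return -1;
    char buffer[256];
    // Loop on the result of fgets() rather than on feof().
    while (fgets(buffer, sizeof buffer, pipe) != nullptr)
        output += buffer;
    int status = pclose(pipe);  // wait()-style status, not a plain exit code
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

If pclose() returns -1 here, checking whether SIGCHLD has been set to SIG_IGN somewhere in the process is a reasonable first step.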
I have a command-line application called xooky_nabox that was written in C++. It reads a puredata patch, processes signals from the audio-in jack of a BeagleBoard and outputs signals through the audio-out jack.
I want the application to run when the BeagleBoard starts up and stay running until the board is shut down. There is no GUI and no keyboard or monitor attached to it, just the audio in and out jacks.
If I run the application manually everything works fine:
xooky_nabox -audioindev 1 -audiooutdev 1 /var/xooky/patch.pd
And it also runs fine if I run it in the background:
xooky_nabox -audioindev 1 -audiooutdev 1 /var/xooky/patch.pd &
Now, let me show the code layout of two versions of the program (The full thing is at https://github.com/rvega/XookyNabox):
Version 1, main thread is kept alive:
void sighandler(int signum){
time_t rawtime;
time(&rawtime);
std::ofstream myfile;
myfile.open ("log.txt",std::ios::app);
myfile << ctime(&rawtime) << " Caught signal:" << signum << " " << strsignal(signum) << "\n";
myfile.close();
if(signum == 15 || signum == 2){
exit(0);
}
}
int main (int argc, char *argv[]) {
// Subscribe to all system signals for debugging purposes.
for(int i=0; i<64; i++){
signal(i, sighandler);
}
// Sanity checks, error and help messages, etc.
parseParameters(argc, argv);
//Start Signal processing and Audio
initZenGarden();
initAudioIO();
// Keep the program alive.
while(1){
sleep(10);
}
// This is obviously never reached, so far no problems with that...
stopAudioIO();
stopZengarden();
return 0;
}
static int paCallback( const void *inputBuffer, void *outputBuffer, unsigned long framesPerBuffer, const PaStreamCallbackTimeInfo* timeInfo, PaStreamCallbackFlags statusFlags, void *userData ){
// This is called by PortAudio when the output buffer is about to run dry.
}
Version 2, execution is forked and detached from the terminal that launched it:
void go_daemon(){
// Run the program as a daemon.
pid_t pid, sid;
pid = fork(); // Fork off the parent process
if (pid < 0) {
exit(EXIT_FAILURE);
}
if (pid > 0) {
exit(EXIT_SUCCESS); // If child process started ok, exit the parent process
}
umask(0); // Change file mode mask
sid = setsid(); // Create a new session ID for the child process
if (sid < 0) {
// TODO: Log failure
exit(EXIT_FAILURE);
}
if((chdir("/")) < 0){ //Change the working directory to "/"
//TODO: Log failure
exit(EXIT_FAILURE);
}
close(STDIN_FILENO);
close(STDOUT_FILENO);
close(STDERR_FILENO);
}
int main (int argc, char *argv[]) {
go_daemon();
// Subscribe to all system signals for debugging purposes.
for(int i=0; i<64; i++){
signal(i, sighandler);
}
// Sanity checks, error and help messages, etc.
parseParameters(argc, argv);
//Start Signal processing and Audio
initZenGarden();
initAudioIO();
// Keep the program alive.
while(1){
sleep(10);
}
// This is obviously never reached, so far no problems with that...
stopAudioIO();
stopZengarden();
return 0;
}
Trying to run it at startup
I've tried running both versions of the program at startup using a few methods. The outcome is always the same. When the beagle starts up, I can hear sound being output for a fraction of a second, the sound then stops and the login screen is presented (I have a serial terminal attached to the board and minicom running on my computer). The weirdest thing to me is that the xooky_nabox process actually keeps running after login, but there is no sound output...
Here's what I've tried:
Adding an @reboot entry to crontab and launching the program with a trailing ampersand (version 1 of the program):
@reboot xooky_nabox <params> &
Adding a start-stop-daemon entry to crontab (version 1):
@reboot start-stop-daemon -S -b --user daemon -n xooky_nabox -a /usr/bin/xooky_nabox -- <params>
Created a script at /etc/init.d/xooky and did
$chmod +x xooky
$update-rc.d xooky defaults
And tried different versions of the startup script: start-stop-daemon with version 1, calling the program directly with a trailing ampersand (version 1), calling the program directly with no trailing ampersand (version 2).
Also, if I run the program manually from the serial terminal or from an ssh session (USB networking) and then run top, the program runs fine for a few seconds, consuming around 15% CPU. It then stops outputting sound, and its CPU consumption rises to around 30%. My log.txt file shows no signal sent to the program by the OS in this scenario.
When version 2 of the program is run at startup, the log will show something like:
Mon Jun 6 02:44:49 2011 Caught signal:18 Continued
Mon Jun 6 02:44:49 2011 Caught signal:15 Terminated
Does anyone have any ideas on how to debug this? Suggestions on how to launch my program at startup?
In version 2,
I think you should open /dev/null and dup2 it onto STDIN/STDOUT/STDERR. Just closing the descriptors can cause problems.
Something like this:
int fd = open("/dev/null", O_RDWR);
dup2( fd, STDOUT_FILENO );
(I have no idea what start-stop-daemon does, so I can't help with version 1, sorry.)
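A slightly fuller sketch of the same idea, covering all three standard descriptors and checking for errors, could look like this. The helper names pointFdAtDevNull and redirectStdioToDevNull are hypothetical. In go_daemon() they would replace the three close() calls.

```cpp
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: make `targetFd` refer to /dev/null instead of
// whatever it referred to before.
bool pointFdAtDevNull(int targetFd)
{
    int nullFd = open("/dev/null", O_RDWR);
    if (nullFd < 0)
        return false;
    bool ok = dup2(nullFd, targetFd) >= 0;
    if (nullFd != targetFd)
        close(nullFd);          // the duplicate keeps /dev/null open
    return ok;
}

// Hypothetical helper: redirect stdin, stdout and stderr to /dev/null so that
// later reads return end-of-file and later writes are silently discarded.
bool redirectStdioToDevNull()
{
    return pointFdAtDevNull(STDIN_FILENO)
        && pointFdAtDevNull(STDOUT_FILENO)
        && pointFdAtDevNull(STDERR_FILENO);
}
```

Keeping descriptors 0 to 2 occupied also prevents a later open() or socket() call from receiving one of those numbers, in which case stray writes to stdout or stderr would end up in an unrelated file or device.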
There is a C function to create a daemon:
#include <unistd.h>
int daemon(int nochdir, int noclose);
More information can be found in the man page for daemon(3).
Maybe it will help.
And if you want to launch your daemon when Linux starts, you should find out which init system your distro uses, but usually you can just add the command that starts your daemon to /etc/init.d/rc (though this does not seem to be a good idea). This file is executed by init when Linux starts.
I ended up ditching PortAudio and implementing a JACK client which runs its own server, so this issue was not relevant for me anymore.