I'm trying to set up test software for code that is already written (and that I cannot change). The code gets hung up on certain calls, so I want to implement something that kills the process if it does not complete in x seconds.
I've tried two approaches, fork and pthread, and neither has worked so far. I'm not sure why pthread didn't work; I assume the static function I used to set up the thread had some issue with the memory needed to run the function I was calling (I kept getting a segfault while the function under test was running). Fork worked initially, but the second time I forked a process, the parent could no longer tell whether the child had finished.
In semi-pseudocode, this is what I've written:
test_runner()
{
bool result;
testClass* myTestClass = new testClass();
pid_t pID = fork();
if(pID == 0) //Child
{
myTestClass->test_function(); //function in question being tested
}
else if(pID > 0) //Parent
{
int status;
sleep(5);
if(waitpid(0,&status,WNOHANG) == 0)
{
kill(pID,SIGKILL); //If child hasn't finished, kill process and fail test
result = false;
}
else
result = true;
}
}
This method worked for the initial test, but when I went on to test a second function, if(waitpid(0,&status,WNOHANG) == 0) reported that the child had finished even when it had not.
The pthread method looked along these lines
bool result;
test_runner()
{
long thread = 1;
pthread_t* thread_handle = (pthread_t*) malloc (sizeof(pthread_t));
pthread_create(&thread_handle[thread], NULL, &funcTest, (void *)&thread); //Begin class that tests function in question
sleep(10);
if(pthread_cancel(thread_handle[thread] == 0))
//Child process got stuck, deal with accordingly
else
//Child process did not get stuck, deal with accordingly
}
static void* funcTest(void*)
{
result = false;
testClass* myTestClass = new testClass();
result = myTestClass->test_function();
}
Obviously there is a little more going on than what I've shown; I just wanted to put the general idea down. I'm looking for a better way to handle a problem like this, or for someone to point out any blatant issues with what I'm trying to do (I'm relatively new to C++). As I mentioned, I'm not allowed to touch the code I'm writing the test software for, which prevents me from putting signal handlers in the function I'm testing. I can only call the function and deal with it from there.
If C++11 is available, you could use std::future with wait_for for this purpose.
For example:
#include <chrono>
#include <future>
#include <iostream>
#include <thread>

int main() {
    // The task takes 3 seconds; waiting up to 5 seconds is enough.
    std::future<int> future = std::async(std::launch::async, []() {
        std::this_thread::sleep_for(std::chrono::seconds(3));
        return 8;
    });
    std::future_status status = future.wait_for(std::chrono::seconds(5));
    if (status == std::future_status::timeout) {
        std::cout << "Timeout" << std::endl;
    } else {
        std::cout << "Success" << std::endl;
    } // will print Success

    // Same task, but we only wait 1 second.
    std::future<int> future2 = std::async(std::launch::async, []() {
        std::this_thread::sleep_for(std::chrono::seconds(3));
        return 8;
    });
    std::future_status status2 = future2.wait_for(std::chrono::seconds(1));
    if (status2 == std::future_status::timeout) {
        std::cout << "Timeout" << std::endl;
    } else {
        std::cout << "Success" << std::endl;
    } // will print Timeout

    // Note: a future returned by std::async blocks in its destructor
    // until the task has finished.
    return 0;
}
Another thing:
As per the documentation, calling waitpid with a pid of 0 means:
meaning wait for any child process whose process group ID is equal to
that of the calling process.
So the call can report a child other than the one you just forked. Pass pID instead of 0.
Avoid using pthread_cancel; it's probably not a good idea.
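Note that wait_for only reports the timeout; it cannot stop a call that is genuinely hung. If the test has to be killed, a child process is still needed. Below is a minimal sketch of the fork approach that waits on the specific child and always reaps it. The helper name run_with_timeout and the 10 ms polling interval are assumptions made for illustration, not part of the original code. A plausible explanation for the second test misbehaving is that the killed first child was never reaped, so the next waitpid(0, ...) picked up that leftover child.

```cpp
#include <chrono>
#include <csignal>
#include <functional>
#include <thread>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs `test` in a forked child and returns true only if
// the child exits normally with status 0 before `timeout` elapses.
bool run_with_timeout(const std::function<void()>& test,
                      std::chrono::milliseconds timeout)
{
    pid_t pid = fork();
    if (pid < 0) {
        return false;                       // fork failed
    }
    if (pid == 0) {
        test();                             // function under test
        _exit(0);                           // never return into the parent's code path
    }

    const auto deadline = std::chrono::steady_clock::now() + timeout;
    int status = 0;
    for (;;) {
        // Wait for this child only, not for any child in the process group.
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid) {
            return WIFEXITED(status) && WEXITSTATUS(status) == 0;
        }
        if (r < 0) {
            return false;                   // waitpid itself failed
        }
        if (std::chrono::steady_clock::now() >= deadline) {
            kill(pid, SIGKILL);             // child is stuck: kill it...
            waitpid(pid, &status, 0);       // ...and reap it so no zombie is left behind
            return false;
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
}
```

A call could look like run_with_timeout([&]{ myTestClass->test_function(); }, std::chrono::seconds(5)). Polling lets the parent return as soon as the child finishes instead of always sleeping for the full timeout.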
I am trying to learn how signals interact, and I discovered a funny interaction I can't understand.
Here's an abstract of the program. I'm instructed to call execvp in the grandchild, while the child needs to wait for the grandchild to finish. It runs correctly without any signal interactions.
void say_Hi(int num) { printf("Finished\n"); }
int main() {
int i = 2;
char *command1[] = {"sleep", "5", NULL};
char *command2[] = {"sleep", "10", NULL};
signal(SIGCHLD, SIG_IGN);
signal(SIGUSR1, say_Hi);
while(i > 0) {
pid_t pid = fork();
if (pid == 0) {
pid_t pidChild = fork();
if (pidChild == 0) {
if (i == 2) {
execvp(command1[0], command1);
} else {
execvp(command2[0], command2);
}
} else if (pidChild > 0) {
waitpid(pidChild, 0, 0);
// kill(pid, SIGUSR1);
printf("pid finished: %d\n", pidChild);
exit(EXIT_FAILURE);
}
exit(EXIT_FAILURE);
} else {
//parent immediately goes to next loop
i--;
}
}
cin >> i; //just for me to pause and observe the output above
return 0;
}
As shown above, with kill(pid, SIGUSR1); commented out, the program runs correctly.
Output:
pid finished: 638532 //after 5 sec
pid finished: 638533 //after 10 sec
However, when it is uncommented, the output becomes:
Finished
pid finished: 638610 //after 5 sec
Finished
Finished
Finished
Finished
pid finished: 638611 //after 5 sec too, why?
Finished
I would like to ask:
The whole program finishes at once after 5 seconds, and "Finished" is printed a total of 6 times. Why is that?
Is there a way to modify it so that the say_Hi function runs only twice in total, at the correct times?
Please forgive me if my code looks stupid or bulky; I'm a newbie in programming. Any comments about my code and any help are appreciated!
void say_Hi(int num) { printf("Finished\n"); }
printf cannot be called in a signal handler. With few exceptions, none of the C or C++ library functions can be called in a signal handler, because they are not signal-safe. You can't even allocate or free memory from a signal handler (using either the C or the C++ library), except by using low-level OS calls like brk() or sbrk(). Only functions that are explicitly designated as "signal-safe" can be called from a signal handler.
The only things that can be called from a signal handler are low-level operating system calls, like read() and write(), that operate directly on file descriptors. They are, by definition, signal-safe.
For this simple reason the shown code, when it comes to signals, is undefined behavior. Trying to analyze your program's behavior in that respect, such as why you do or do not see this message, is pointless. It cannot be logically analyzed. This is undefined behavior.
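As an illustration of the point above, here is a minimal sketch of a handler that restricts itself to a flag store and write(2). The names g_got_usr1, on_usr1 and install_usr1_handler are made up for this example, and using sigaction instead of signal is a choice made here, not something taken from the question.

```cpp
#include <cerrno>
#include <csignal>
#include <cstring>
#include <unistd.h>

// Set by the handler, read by normal code.
volatile std::sig_atomic_t g_got_usr1 = 0;

// Handler limited to async-signal-safe operations: a flag store and write(2).
extern "C" void on_usr1(int)
{
    const int saved_errno = errno;          // write() may change errno
    g_got_usr1 = 1;
    static const char msg[] = "Finished\n";
    ssize_t ignored = write(STDOUT_FILENO, msg, sizeof(msg) - 1);
    (void)ignored;
    errno = saved_errno;
}

// Installs the handler for SIGUSR1; returns true on success.
bool install_usr1_handler()
{
    struct sigaction sa;
    std::memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;               // restart interrupted system calls such as waitpid
    return sigaction(SIGUSR1, &sa, nullptr) == 0;
}
```

Anything more involved than this (formatting, allocation, logging) is better done in normal code that checks the flag after the handler has returned.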
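Answer:
In the child, pid is 0 (the value fork returned), so kill(pid, SIGUSR1) signals every process in the process group. Signal only the calling process instead:
kill(getpid(), SIGUSR1);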
I am using version 0.5 of Boost.Process. Documentation can be found here. I am using Mac OS X Yosemite.
My problem: I am launching a compilation as a child process. I want to wait for the process to finish.
When my child process compiles correctly, everything is ok.
But when my child process does not compile, my code seems to crash when calling boost::process::wait_for_exit.
My user code looks like this:
EDIT: Code has been edited to match latest, more correct version (still does not work).
s::error_code ec{};
bp::child child = bp::execute(bpi::set_args(compilationCommand),
bpi::bind_stderr(outErrLog_),
bpi::bind_stdout(outErrLog_),
bpi::inherit_env(),
bpi::set_on_error(ec));
bool compilationSuccessful = true;
if (!ec) {
s::error_code ec2;
bp::wait_for_exit(child, ec2);
if (ec2)
compilationSuccessful = false;
}
The internal implementation of bp::wait_for_exit:
template <class Process>
inline int wait_for_exit(const Process &p, boost::system::error_code &ec)
{
pid_t ret;
int status;
do
{
ret = ::waitpid(p.pid, &status, 0);
} while ((ret == -1 && errno == EINTR) || (ret != -1 && !WIFEXITED(status)));
if (ret == -1) {
BOOST_PROCESS_RETURN_LAST_SYSTEM_ERROR("waitpid(2) failed");
}
else
ec.clear();
return status;
}
The code after ::waitpid is never reached when my compilation command fails. The error shown is: "child has exited; pid: xxxx; uid: yyy; exit value: 1".
Questions:
Is this a bug, or am I misusing boost::process::wait_for_exit?
Is there a portable workaround that avoids the crash I am getting?
Just looking at your code, the first thing that strikes me is that you don't actually test the "ec" variable that says whether execute() succeeded or not until after you call wait_for_exit(). If you're calling wait_for_exit() with an invalid child process, it's quite understandable that it would crash.
Start by checking "ec" before calling wait_for_exit().
So the problem was that Boost.Test modifies the signal stack in some way.
This modification interacts with Boost.Process, so the code cannot be reliably tested, at least in the default Boost.Test configuration.
I rewrote the tests with a normal main and some functions and it did the job.
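For comparison, here is a minimal sketch of waiting on a child with plain POSIX calls, outside any test framework. It is not the Boost.Process implementation; ChildResult and wait_for_child are hypothetical names. Unlike the loop quoted in the question, it retries only on EINTR and reports a child that was terminated by a signal instead of waiting again.

```cpp
#include <cerrno>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical result type, for illustration only.
struct ChildResult {
    bool waited;   // false if waitpid itself failed (for example ECHILD)
    bool exited;   // true if the child terminated via exit/_exit or return from main
    int  code;     // exit code if exited, terminating signal if killed, errno if !waited
};

ChildResult wait_for_child(pid_t pid)
{
    int status = 0;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);    // retry only when interrupted by a signal

    if (r == -1) {
        return {false, false, errno};
    }
    if (WIFEXITED(status)) {
        return {true, true, WEXITSTATUS(status)};
    }
    if (WIFSIGNALED(status)) {
        return {true, false, WTERMSIG(status)};
    }
    return {true, false, 0};
}
```

A failed compilation would then show up as exited == true with a non-zero code, which is distinct from a failure of the wait itself.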
I've written a timer using std::thread. Here is what it looks like:
TestbedTimer::TestbedTimer(char type, void* contextObject) :
Timer(type, contextObject) {
this->active = false;
}
TestbedTimer::~TestbedTimer(){
if (this->active) {
this->active = false;
if(this->timer->joinable()){
try {
this->timer->join();
} catch (const std::system_error& e) {
std::cout << "Caught system_error with code " << e.code() <<
" meaning " << e.what() << '\n';
}
}
if(timer != nullptr) {
delete timer;
}
}
}
void TestbedTimer::run(unsigned long timeoutInMicroSeconds){
this->active = true;
timer = new std::thread(&TestbedTimer::sleep, this, timeoutInMicroSeconds);
}
void TestbedTimer::sleep(unsigned long timeoutInMicroSeconds){
unsigned long interval = 500000;
if(timeoutInMicroSeconds < interval){
interval = timeoutInMicroSeconds;
}
while((timeoutInMicroSeconds > 0) && (active == true)){
if (active) {
timeoutInMicroSeconds -= interval;
/// set the sleep time
std::chrono::microseconds duration(interval);
/// set thread to sleep
std::this_thread::sleep_for(duration);
}
}
if (active) {
this->notifyAllListeners();
}
}
void TestbedTimer::interrupt(){
this->active = false;
}
I'm not really happy with that kind of implementation since I let the timer sleep for a short interval and check if the active flag has changed (but I don't know a better solution since you can't interrupt a sleep_for call). However, my program core dumps with the following message:
thread is joinable
Caught system_error with code generic:35 meaning Resource deadlock avoided
thread has rejoined main scope
terminate called without an active exception
Aborted (core dumped)
I've looked up this error, and it seems that I have a thread which waits for another thread (the reason for the resource deadlock). However, I want to find out where exactly this happens. My C++ code uses a C library (which uses pthreads) that provides, among other features, an option to run as a daemon, and I'm afraid that this interferes with my std::thread code. What's the best way to debug this?
I've tried to use helgrind, but this hasn't helped very much (it doesn't find any error).
TIA
** EDIT: The code above is actually not example code, but code I've written for a routing daemon. The routing algorithm is reactive, meaning it starts a route discovery only if it has no route to a desired destination and does not try to build up a routing table for every host in its network. Every time a route discovery is triggered, a timer is started. If the timer expires, the daemon is notified and the packet is dropped. Basically, it looks like this:
void Client::startNewRouteDiscovery(Packet* packet) {
AddressPtr destination = packet->getDestination();
...
startRouteDiscoveryTimer(packet);
...
}
void Client::startRouteDiscoveryTimer(const Packet* packet) {
RouteDiscoveryInfo* discoveryInfo = new RouteDiscoveryInfo(packet);
/// create a new timer of a certain type
Timer* timer = getNewTimer(TimerType::ROUTE_DISCOVERY_TIMER, discoveryInfo);
/// pass this object as the callback that is notified when the timer expires (the class implements an interface for that)
timer->addTimeoutListener(this);
/// start the timer
timer->run(routeDiscoveryTimeoutInMilliSeconds * 1000);
AddressPtr destination = packet->getDestination();
runningRouteDiscoveries[destination] = timer;
}
If the timer has expired the following method is called.
void Client::timerHasExpired(Timer* responsibleTimer) {
char timerType = responsibleTimer->getType();
switch (timerType) {
...
case TimerType::ROUTE_DISCOVERY_TIMER:
handleExpiredRouteDiscoveryTimer(responsibleTimer);
return;
....
default:
// if this happens it's a bug in our code
logError("Could not identify expired timer");
delete responsibleTimer;
}
}
I hope that helps to give a better understanding of what I'm doing. However, I did not intend to bloat the question with that additional code.
I have the helper function below, used to execute a command and get the return value on posix systems. I used to use popen, but it is impossible to get the return code of an application with popen if it runs and exits before popen/pclose gets a chance to do its work.
The following helper function creates a process fork, uses execvp to run the desired external process, and then the parent uses waitpid to get the return code. I'm seeing odd cases where it's refusing to run.
When called with wait = true, waitpid should return the exit code of the application no matter what. However, I'm seeing stdout output which indicates that the return code should be non-zero, yet the return code is zero. Running the external process in a regular shell and then echoing $? gives a non-zero value, so the problem is not that the external process returns the wrong code. If it's of any help, the external process being run is mount(8) (yes, I know I can use mount(2), but that's beside the point).
I apologize in advance for a code dump. Most of it is debugging/logging:
inline int ForkAndRun(const std::string &command, const std::vector<std::string> &args, bool wait = false, std::string *output = NULL)
{
std::string debug;
std::vector<char*> argv;
for(size_t i = 0; i < args.size(); ++i)
{
argv.push_back(const_cast<char*>(args[i].c_str()));
debug += "\"";
debug += args[i];
debug += "\" ";
}
argv.push_back((char*)NULL);
neosmart::logger.Debug("Executing %s", debug.c_str());
int pipefd[2];
if (pipe(pipefd) != 0)
{
neosmart::logger.Error("Failed to create pipe descriptor when trying to launch %s", debug.c_str());
return EXIT_FAILURE;
}
pid_t pid = fork();
if (pid == 0)
{
close(pipefd[STDIN_FILENO]); //child isn't going to be reading
dup2(pipefd[STDOUT_FILENO], STDOUT_FILENO);
close(pipefd[STDOUT_FILENO]); //now that it's been dup2'd
dup2(pipefd[STDOUT_FILENO], STDERR_FILENO);
if (execvp(command.c_str(), &argv[0]) != 0)
{
exit(EXIT_FAILURE);
}
return 0;
}
else if (pid < 0)
{
neosmart::logger.Error("Failed to fork when trying to launch %s", debug.c_str());
return EXIT_FAILURE;
}
else
{
close(pipefd[STDOUT_FILENO]);
int exitCode = 0;
if (wait)
{
waitpid(pid, &exitCode, wait ? __WALL : (WNOHANG | WUNTRACED));
std::string result;
char buffer[128];
ssize_t bytesRead;
while ((bytesRead = read(pipefd[STDIN_FILENO], buffer, sizeof(buffer)-1)) != 0)
{
buffer[bytesRead] = '\0';
result += buffer;
}
if (wait)
{
if ((WIFEXITED(exitCode)) == 0)
{
neosmart::logger.Error("Failed to run command %s", debug.c_str());
neosmart::logger.Info("Output:\n%s", result.c_str());
}
else
{
neosmart::logger.Debug("Output:\n%s", result.c_str());
exitCode = WEXITSTATUS(exitCode);
if (exitCode != 0)
{
neosmart::logger.Info("Return code %d", (exitCode));
}
}
}
if (output)
{
result.swap(*output);
}
}
close(pipefd[STDIN_FILENO]);
return exitCode;
}
}
Note that the command is run OK with the correct parameters, the function proceeds without any problems, and WIFEXITED returns TRUE. However, WEXITSTATUS returns 0, when it should be returning something else.
Probably isn't your main issue, but I think I see a small problem. In your child process, you have...
dup2(pipefd[STDOUT_FILENO], STDOUT_FILENO);
close(pipefd[STDOUT_FILENO]); //now that it's been dup2'd
dup2(pipefd[STDOUT_FILENO], STDERR_FILENO); //but wait, this pipe is closed!
But I think what you want is:
dup2(pipefd[STDOUT_FILENO], STDOUT_FILENO);
dup2(pipefd[STDOUT_FILENO], STDERR_FILENO);
close(pipefd[STDOUT_FILENO]); //now that it's been dup2'd for both, can close
I don't have much experience with forks and pipes in Linux, but I did write a similar function pretty recently. You can take a look at the code to compare, if you'd like. I know that my function works.
execAndRedirect.cpp
I'm using the mongoose library, and grepping my code for SIGCHLD revealed that using mg_start from mongoose results in setting SIGCHLD to SIG_IGN.
From the waitpid man page, on Linux a SIGCHLD set to SIG_IGN will not create a zombie process, so waitpid will fail if the process has already successfully run and exited - but will run OK if it hasn't yet. This was the cause of the sporadic failure of my code.
Simply re-setting SIGCHLD after calling mg_start to a void function that does absolutely nothing was enough to keep the zombie records from being immediately erased.
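Here is a minimal sketch of that idea. It restores the default disposition (SIG_DFL) rather than installing a do-nothing handler as described above; either keeps exited children around until waitpid collects them. Both function names are hypothetical, and the 200 ms pause is only there to make sure the child has already exited before the parent waits.

```cpp
#include <chrono>
#include <csignal>
#include <thread>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Restores default SIGCHLD handling so that exited children remain as
// zombies until waitpid collects them. Returns true on success.
bool restore_default_sigchld()
{
    struct sigaction sa = {};
    sa.sa_handler = SIG_DFL;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGCHLD, &sa, nullptr) == 0;
}

// Forks a child that exits immediately with `code`, lets it finish, then
// waits for it. Returns the exit code, or -1 if waitpid failed.
int exit_code_of_finished_child(int code)
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        _exit(code);
    }
    // Give the child time to exit before the parent waits for it.
    std::this_thread::sleep_for(std::chrono::milliseconds(200));

    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;                          // e.g. ECHILD when SIGCHLD is ignored
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Checking the return value of waitpid, as done here, also turns the silent zero exit code into a visible error.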
Per @Geoff_Montee's advice, there was a bug in my redirect of STDERR, but this was not responsible for the problem as execvp does not store the return value in STDERR or even STDOUT, but rather in the kernel object associated with the parent process (the zombie record).
@jilles' warning about non-contiguity of vector in C++ does not apply for C++03 and up (only valid for C++98, though in practice, most C++98 compilers did use contiguous storage, anyway) and was not related to this issue. However, the advice on reading from the pipe before blocking and checking the output of waitpid is spot-on.
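A minimal sketch of that ordering: drain the pipe until EOF first, then call waitpid and check its result. This is a simplified, hypothetical variant (run_and_capture) and not the original ForkAndRun. It assumes args[0] is the program name and that SIGCHLD is not being ignored.

```cpp
#include <cerrno>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs args[0] with the given arguments, appends its stdout and stderr to
// `output`, and returns its exit code, or -1 on any failure.
int run_and_capture(const std::vector<std::string>& args, std::string& output)
{
    if (args.empty()) {
        return -1;
    }
    std::vector<char*> argv;
    for (const std::string& a : args) {
        argv.push_back(const_cast<char*>(a.c_str()));
    }
    argv.push_back(nullptr);

    int fds[2];                             // fds[0] = read end, fds[1] = write end
    if (pipe(fds) != 0) {
        return -1;
    }
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);                      // child does not read
        dup2(fds[1], STDOUT_FILENO);
        dup2(fds[1], STDERR_FILENO);
        close(fds[1]);                      // close only after both dup2 calls
        execvp(argv[0], argv.data());
        _exit(127);                         // reached only if execvp failed
    }

    close(fds[1]);                          // otherwise read() never sees EOF
    char buf[256];
    for (;;) {
        ssize_t n = read(fds[0], buf, sizeof(buf));
        if (n > 0) {
            output.append(buf, static_cast<size_t>(n));
        } else if (n == 0 || errno != EINTR) {
            break;                          // EOF or a real error
        }
    }
    close(fds[0]);

    int status = 0;
    pid_t r;
    do {
        r = waitpid(pid, &status, 0);
    } while (r == -1 && errno == EINTR);
    if (r != pid || !WIFEXITED(status)) {
        return -1;
    }
    return WEXITSTATUS(status);
}
```

Reading before waiting matters because a child that fills the pipe buffer blocks on write and never exits while the parent is stuck in waitpid.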
I've found that pclose does NOT block and wait for the process to end, contrary to the documentation (this is on CentOS 6). I've found that I need to call pclose and then call waitpid(pid,&status,0); to get the true return value.
I have a for loop that launches processes in parallel. Every launched process sends back a response indicating that it is ready. I want to wait for that response and abort if a certain timeout is reached.
Development environment is VS2008
Here is the pseudo code:
void executeCommands(std::vector<Command*> commands)
{
#pragma omp parallel for
for (int i = 0; i < commands.size(); i++)
{
Command* cmd = commands[i];
DWORD pid = ProcessLauncher::launchProcess(cmd->getWorkingDirectory(), cmd->getCommandToExcecute(), cmd->params);
//Should I wait for process to become ready?
if (cmd->getWaitStatusTimeout() > 0)
{
ProcessStatusManager::getInstance().addListener(*this);
//TODO: emit process launching signal
//BEGINNING OF QUESTION
//I don't know how to do this part.
//I might use QT's QWaitCondition but if there is another solution in omp
//I'd like to use it
bool timedOut;
SOMEHANDLE handle = Openmp::waitWithTimeout(cmd->getWaitStatusTimeout(), &timedOut);
mWaitConditions[pid] = handle;
//END OF QUESTION
if (timedOut)
{
ProcessStatusManager::getInstance().removeListener(*this);
//TODO: kill process
//TODO: emit fail signal
}
else
{
//TODO: emit process ready signal
}
}
else
{
//TODO: emit process ready signal
}
}
}
void onProcessReady(DWORD sourceProcessPid)
{
ProcessStatusManager::getInstance().removeListener(*this);
SOMEHANDLE handle = mWaitConditions[sourceProcessPid];
if (mWaitConditions[sourceProcessPid] != 0)
{
Openmp::wakeAll(handle);
}
}
As the comment above pointed out, Michael Suess did present a paper on adding this functionality to OpenMP. He is the latest of several people who have proposed adding some type of wait function to OpenMP. The OpenMP language committee has taken the issue up several times. Each time it has been rejected because there are already other ways to achieve this. I don't know Qt, but as long as the functions it provides are thread safe, you should be able to use them.