How can multiple processes share the same ncurses screen? - c++

I'm writing a C++ multi-process program with ncurses.
Each process is required to display something on the screen.
My example code:
int main() {
    initscr();
    noecho();
    curs_set(0);
    int flag = fork();
    if (flag == -1)
        exit(1);
    else if (flag == 0) {
        WINDOW *win = newwin(4, 4, 0, 0);
        int n = 0;
        while (1) {
            mvwprintw(win, 0, 0, "%d", n % 9);
            wrefresh(win);
            n = (n + 1) % 9;
            sleep(1);
        }
    }
    else {
        WINDOW *win = newwin(4, 4, 8, 8);
        int n = 0;
        while (1) {
            mvwprintw(win, 0, 0, "%d", n % 9);
            wrefresh(win);
            n = (n + 1) % 9;
            sleep(1);
        }
    }
    endwin();
    return 0;
}
But it can only display one process's information on the screen.
How can I solve it?

I have hacked together something ugly that roughly works, but it shows what the problems are. I suspect a single window-manager process which the other processes communicate with would be better - or some horrible set of mutexes.
#include <stdlib.h>
#include <unistd.h>
#include <curses.h>

int main() {
    initscr();
    noecho();
    curs_set(0);
    WINDOW *win0 = newwin(4, 4, 0, 0);
    WINDOW *win1 = newwin(4, 4, 8, 8);
    int flag = fork();
    if (flag == -1)
        exit(1);
    else if (flag == 0) {
        int n = 0;
        while (1) {
            mvwprintw(win0, 0, 0, "%d", n % 9);
            wrefresh(win0);
            wrefresh(win1);
            n = (n + 1) % 9;
            sleep(1);
        }
    }
    else {
        int n = 0;
        while (1) {
            mvwprintw(win1, 0, 0, "%d", n % 9);
            wrefresh(win1);
            wrefresh(win0);
            n = (n + 1) % 9;
            sleep(1);
        }
    }
    endwin();
    return 0;
}

The example creates two 4x4 windows, with the second offset to 8,8. So they have no lines in common.
Since you're using fork (rather than vfork), the two processes should have separate address spaces, and there should be no way for one process to refresh a window which is modified in the other process. In some cases, developers have chosen to equate vfork and fork. With Linux, the vfork manual page comments:
Standard Description
(From POSIX.1) The vfork() function has the same effect as fork(2),
except that the behavior is undefined if the process created by vfork()
either modifies any data other than a variable of type pid_t used to
store the return value from vfork(), or returns from the function in
which vfork() was called, or calls any other function before successfully calling _exit(2) or one of the exec(3) family of functions.
but goes on to say
The requirements put on vfork() by the standards are weaker than
those put on fork(2), so an implementation where the two are
synonymous is compliant. In particular, the programmer cannot rely
on the parent remaining blocked until the child either terminates or
calls execve(2), and cannot rely on any specific behavior with
respect to shared memory.
That "weaker" and "compliant" language is a developer arguing that making the two functions the same doesn't really matter...
The fork manual page asserts that there are separate address spaces:
The child process and the parent process run in separate memory
spaces. At the time of fork() both memory spaces have the same
content. Memory writes, file mappings (mmap(2)), and unmappings
(munmap(2)) performed by one of the processes do not affect the
other.
but we're left with that ambiguity in the description of vfork. Your program may fail to update the window belonging to the parent process as part of this vfork behavior, and refreshing both windows in the suggested workaround above merely confirms that, on such a system, fork is vfork in disguise.
POSIX currently has no page for vfork. It had one here (the fork description is worth reading).
Either way, using vfork wouldn't actually improve things. If you have to work within the same address space, that's what threads are for. If you have to use separate processes, the approach people actually take is to have one process own and update the screen while the other process(es) send it updates over pipes.
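A minimal sketch of that last approach, using the counters from the question (the one-byte-per-update protocol is just for illustration): only the parent ever calls into curses, and the child sends its counter down a pipe.
#include <curses.h>
#include <unistd.h>
#include <stdlib.h>

int main() {
    int fd[2];
    if (pipe(fd) == -1)
        exit(1);
    pid_t pid = fork();
    if (pid == -1)
        exit(1);
    if (pid == 0) {                  // child: no curses calls at all
        close(fd[0]);
        for (int n = 0;; n = (n + 1) % 9) {
            char c = '0' + n;
            write(fd[1], &c, 1);     // one digit per update
            sleep(1);
        }
    }
    // parent: sole owner of the ncurses screen
    close(fd[1]);
    initscr();
    noecho();
    curs_set(0);
    WINDOW *win = newwin(4, 4, 8, 8);
    char c;
    while (read(fd[0], &c, 1) == 1) { // draw whatever the child sends
        mvwprintw(win, 0, 0, "%c", c);
        wrefresh(win);
    }
    endwin();
    return 0;
}
More writers can share the pipe (with a small framing protocol saying which window a byte belongs to), or each can get its own pipe that the parent multiplexes with select.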
A comment suggested that fork is obsolete. POSIX has something different to say on that aspect. Quoting from the rationale for posix_spawn:
The posix_spawn() function and its close relation posix_spawnp() have been introduced to overcome the following perceived difficulties with fork(): the fork() function is difficult or impossible to implement without swapping or dynamic address translation.
Swapping is generally too slow for a realtime environment.
Dynamic address translation is not available everywhere that POSIX might be useful.
Processes are too useful to simply option out of POSIX whenever it must run without address translation or other MMU services.
Thus, POSIX needs process creation and file execution primitives that can be efficiently implemented without address translation or other MMU services.
The posix_spawn() function is implementable as a library routine, but both posix_spawn() and posix_spawnp() are designed as kernel operations. Also, although they may be an efficient replacement for many fork()/exec pairs, their goal is to provide useful process creation primitives for systems that have difficulty with fork(), not to provide drop-in replacements for fork()/exec.
Further reading:
what is the difference between fork() and vfork()?
The difference between fork(), vfork(), exec() and clone()
EWONTFIX - vfork considered dangerous

Related

Two windows, one modified by a thread - random output

I'm trying to write code where the screen is divided into two windows, one of them modified by a different thread, but the output seems to be very random. Could anyone help? The upper part of the console should be modified by main, and the lower by thread k.
#include <stdio.h>
#include <ncurses.h>
#include <unistd.h>
#include <thread>

#define WIDTH 30
#define HEIGHT 10

int startx = 0;
int starty = 0;

void kupa(int score_size, int parent_x, int parent_y)
{
    int i = 0;
    WINDOW *dupa = newwin(score_size, parent_x, parent_y - score_size, 0);
    while (true)
    {
        i++;
        mvwprintw(dupa, 0, 0, "You chose choice %d with choice string", i);
        wrefresh(dupa);
        sleep(5);
        wclear(dupa);
    }
    delwin(dupa);
}

int main()
{
    int parent_x, parent_y;
    int score_size = 10;
    int counter = 0;
    initscr();
    noecho();
    curs_set(FALSE);
    getmaxyx(stdscr, parent_y, parent_x);
    WINDOW *field = newwin(parent_y - score_size, parent_x, 0, 0);
    std::thread k(kupa, score_size, parent_x, parent_y);
    while (true) {
        mvwprintw(field, 0, counter, "Field");
        wrefresh(field);
        sleep(5);
        wclear(field);
        counter++;
    }
    k.join();
    delwin(field);
}
The underlying curses/ncurses library is not thread-safe (see for example What is meant by “thread-safe” code? which discusses the term). In the case of curses, this means that the library's WINDOW structures such as stdscr are global variables which are not guarded by mutexes or other methods. The library also has internal static data which is shared across windows. You can only get reliable results for multithreaded code using one of these strategies:
do all of the window management (including input) within one thread
use mutexes, semaphores or whatever concurrency technique seems best to manage separate threads which "own" separate windows. To succeed here, a thread would have to own the whole screen from the point where the curses library blocks while waiting for input, until it updates the screen and resumes waiting for input. That is harder than it sounds.
ncurses 5.7 and up can be compiled to provide rudimentary support for reentrant code and some threaded applications. To do this, it uses mutexes wrapped around its static data, makes the global variables into "getter" functions, and adds functions which explicitly pass the SCREEN pointer that is implied in many calls. For more detail, see the manual page; a small sketch of use_window follows the list below.
Some of ncurses' test-programs illustrate the threading support (these are programs in the test subdirectory of the sources):
ditto shows use_screen.
test_opaque exercises the "getters" for WINDOW properties
rain shows use_window
worm shows use_window
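With such a build, a window update from any thread goes through use_window, which runs a callback while the library holds its lock on that window. A minimal sketch, assuming ncurses was configured with --enable-reentrant and the program is linked with -lncursest:
#include <curses.h>

/* the callback type is NCURSES_WINDOW_CB: int (*)(WINDOW *, void *) */
static int draw_cb(WINDOW *win, void *data)
{
    int n = *(int *)data;
    mvwprintw(win, 0, 0, "%d", n % 9);
    wrefresh(win);
    return OK;
}

/* from any thread, instead of touching win directly: */
/*     use_window(win, draw_cb, &n);                  */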
I am not sure what you want to do, but this behaviour is quite normal. The thread that is active writes to the window, and when the system makes a task switch the other thread writes to the window. The usual approach is to have only one thread that writes to the window; the other threads are supposed to do only some work.
Anyway, if you are using more than one thread you have to synchronize them using events, mutexes, queues, semaphores or other methods.
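Applied to the code above, the simplest version of that is a single global mutex taken around every curses call. A sketch (the names curses_mtx and draw_counter are mine):
#include <ncurses.h>
#include <mutex>

std::mutex curses_mtx;   // guards every curses call in the program

void draw_counter(WINDOW *w, int value)
{
    std::lock_guard<std::mutex> lock(curses_mtx);
    mvwprintw(w, 0, 0, "You chose choice %d with choice string", value);
    wrefresh(w);
}
Both main and the kupa thread would call something like draw_counter instead of touching their WINDOWs directly. As the first answer notes, blocking input makes this trickier, since a thread sitting in getch holds the library while it waits.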

exit() or _exit() after forking?

I am writing a program which requires communicating with an external program two-way simultaneously, i.e., reading and writing to an external program at the same time.
I create two pipes, one for sending data to the external process, one for receiving data from the external process. After forking the child process, which becomes the external program, the parent forks again. The new child now writes data into the outgoing pipe to the external program, and the parent now reads data from the incoming pipe from the external program for further processing.
I've heard that using exit(3) may cause buffers to be flushed twice, however I am also afraid that using _exit(2) may leave buffers left unflushed. In my program, there are outputs both before and after forking. Which, exit(3) or _exit(2), should I use in this case?
Below is my main function. The #includes and the auxiliary function are left out for simplicity.
int main() {
    srand(time(NULL));
    ssize_t n;
    cin >> n;
    for (double p = 0.0; p <= 1.0; p += 0.1) {
        string s = generate(n, p);
        int out_fd[2];
        int in_fd[2];
        pipe(out_fd);
        pipe(in_fd);
        pid_t child = fork();
        if (child) {
            // parent
            close(out_fd[0]);
            close(in_fd[1]);
            if (fork()) {
                close(out_fd[1]);
                ssize_t size = 0;
                const ssize_t block_size = 1048576;
                char buf[block_size];
                ssize_t n_read;
                while ((n_read = read(in_fd[0], buf, block_size)) != 0) {
                    size += n_read;
                }
                size += n_read;
                close(in_fd[0]);
                cout << "p = " << p << "; compress ratio = " << double(size) / double(n) << '\n'; // data written before forking (the loop continues to fork)
            } else {
                write(out_fd[1], s.data(), s.size()); // data written after forking
                exit(EXIT_SUCCESS); // exit(3) or _exit(2) ?
            }
        } else {
            // child
            close(in_fd[0]);
            close(out_fd[1]);
            dup2(out_fd[0], STDIN_FILENO);
            dup2(in_fd[1], STDOUT_FILENO);
            close(STDERR_FILENO);
            execlp("xz", "xz", "-9", "--format=raw", reinterpret_cast<char *>(NULL));
        }
    }
}
You need to be careful with this sort of thing. exit() does different things from _exit(), and _Exit() is different again. As the answer suggested as a duplicate explains, _Exit() (not the same as _exit(), note the upper-case E) will not call atexit() handlers, flush any output buffers, delete temporary files, etc. [which may in fact be atexit() handling, but it could also be done as a direct call, depending on how the C library code has been written].
Most of your output is done via write, which should be unbuffered from the application's perspective. But you are calling cout << ... as well, so you will need to make sure that is flushed before exiting. Right now you are using '\n' as the end-of-line marker, which may or may not flush the output. If you change that to endl instead, it will flush the stream. Now you can safely use _Exit() from an output perspective - though if your code were to set up its own atexit() handler, open temporary files, or do a bunch of other such things, this would be problematic. If you want to do more complex things in the forked process, it should be done by another exec.
In your program as it stands, there isn't any pending output to flush, so it "works" anyway; but if you add a cout << ... statement (with or without a trailing '\n') at the beginning of the code, you would see it go wrong. If you add a cout.flush();, it would "fix" the problem (based on your current code).
You should also check the return value of your execlp() call and call _Exit() if it fails (and handle that in the main process so you don't continue the loop in case of failure).
In the child branch of a fork(), it is normally incorrect to use exit(), because that can lead to stdio buffers being flushed twice, and temporary files being unexpectedly removed. In C++ code the situation is worse, because destructors for static objects may be run incorrectly. (There are some unusual cases, like daemons, where the parent should call _exit() rather than the child; the basic rule, applicable in the overwhelming majority of cases, is that exit() should be called only once for each entry into main.)
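A sketch of that rule in miniature; stdio buffering is the thing to watch:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    printf("hello");                          // may still sit in stdio's buffer
    pid_t pid = fork();
    if (pid == 0) {
        write(STDOUT_FILENO, "child\n", 6);   // write(2) bypasses stdio
        _exit(0);   // no second flush of "hello", no atexit handlers
    }
    return 0;       // returning from main == exit(): stdio flushed exactly once
}
With exit(0) in the child instead, "hello" can appear twice, once from each process's copy of the buffer.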

can run in gdb, segmentation fault when run directly

My program gets a segmentation fault when I run it normally. However, it works just fine if I run it under gdb. Moreover, the rate of segmentation faults increases when I increase the sleep time in the philo function. I am using Ubuntu 12.04. Any help or pointers are appreciated. Here is my code:
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sched.h>
#include <signal.h>
#include <sys/wait.h>
#include <time.h>
#include <semaphore.h>
#include <errno.h>

#define STACKSIZE 10000
#define NUMPROCS 5
#define ROUNDS 10

int ph[NUMPROCS];
// cs[i] is the chopstick between philosopher i and i+1
sem_t cs[NUMPROCS], dead;

int philo() {
    int i = 0;
    int cpid = getpid();
    int phno;
    for (i = 0; i < NUMPROCS; i++)
        if (ph[i] == cpid) phno = i;
    for (i = 0; i < ROUNDS; i++) {
        // Add your entry protocol here
        if (sem_wait(&dead) != 0) {
            perror(NULL);
            return 1;
        }
        if (sem_wait(&cs[phno]) != 0) {
            perror(NULL);
            return 1;
        }
        if (sem_wait(&cs[(phno - 1 + NUMPROCS) % NUMPROCS]) != 0) {
            perror(NULL);
            return 1;
        }
        // Start of critical section -- simulation of slow n++
        int sleeptime = 20000 + rand() % 50000;
        printf("philosopher %d is eating by chopsticks %d and %d\n", phno, phno, (phno - 1 + NUMPROCS) % NUMPROCS);
        usleep(sleeptime);
        // End of critical section
        // Add your exit protocol here
        if (sem_post(&dead) != 0) {
            perror(NULL);
            return 1;
        }
        if (sem_post(&cs[phno]) != 0) {
            perror(NULL);
            return 1;
        }
        if (sem_post(&cs[(phno - 1 + NUMPROCS) % NUMPROCS]) != 0) {
            perror(NULL);
            return 1;
        }
    }
    return 0;
}

int main(int argc, char **argv) {
    int i;
    void *stack[NUMPROCS];
    srand(time(NULL));
    // initialize semaphores
    for (i = 0; i < NUMPROCS; i++) {
        if (sem_init(&cs[i], 1, 1) != 0) {
            perror(NULL);
            return 1;
        }
    }
    if (sem_init(&dead, 1, 4) != 0) {
        perror(NULL);
        return 1;
    }
    for (i = 0; i < NUMPROCS; i++) {
        stack[i] = malloc(STACKSIZE);
        if (stack[i] == NULL) {
            printf("Error allocating memory\n");
            exit(1);
        }
        // create a child that shares the data segment
        ph[i] = clone(philo, stack[i] + STACKSIZE - 1, CLONE_VM | SIGCHLD, NULL);
        if (ph[i] < 0) {
            perror(NULL);
            return 1;
        }
    }
    for (i = 0; i < NUMPROCS; i++) wait(NULL);
    for (i = 0; i < NUMPROCS; i++) free(stack[i]);
    return 0;
}
A typical Heisenbug: if you look at it, it disappears. In my experience, getting a segfault only outside gdb (or vice versa) is a sign of using uninitialized memory or depending on actual pointer addresses. Normally, running valgrind is ruthlessly accurate at detecting those. Unfortunately, (my) valgrind cannot handle your clone outside the pthread context.
Visual inspection suggests it is not a memory problem. Only the stacks are allocated on the heap, and their use looks OK. Except that you treat them via a void * pointer and then add something to it, which is not allowed in standard C (it is a GNU extension). Proper would be to use a char *, but the GNU extension does what you want.
Subtracting one from the top address of the stack is probably not necessary and might cause alignment errors on simple implementations of clone, but again I don't think that is the problem, as clone will most likely align the stack top again. And admittedly, the manual page of clone is not very clear about the exact location of the address: "topmost address of the memory space".
Just waiting for a state change of a child and assuming it died is a bit sloppy, and taking away its stack at that point might lead to segmentation faults; but again, I don't think that is the problem, because you are probably not frantically sending signals to your philosophers.
If I run your application, the philosophers can finish their dinner undisturbed both inside and outside gdb, so the following is a guess. Let's call the parent process that clones philosophers "the table". Once a philosopher is cloned, the table stores the returned pid in ph; say it assigns that number to a chair. The first thing a philosopher does is look for his chair. If he doesn't find his chair, he will have an uninitialized phno, which is then used to access his semaphores. That may very well lead to segmentation faults.
The implementation is assuming that control returns to the table before the philosophers start. I can't find such a guarantee in the manual page, and I would actually expect it not to be true. Also, the clone interface has the possibility to place process ids in memory shared between the child and the parent, which suggests this is a recognized problem (see the parameters pid and ctid); if those are used, the pid will be written before either the table or the just-cloned philosopher gets control, as sketched below.
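A sketch of what that would look like, per my reading of clone(2): with CLONE_CHILD_SETTID the kernel stores the child's id through the ctid argument before the clone call returns control to user space, and with CLONE_VM that store lands in memory both sides share, so the philosopher can never look up his chair before it is filled in.
/* illustration only: the glibc wrapper takes parent_tid, tls, child_tid
 * after the argument pointer, so NULLs pad the unused slots; this fixes
 * the pid-assignment race but none of the other issues discussed here */
ph[i] = clone(philo, stack[i] + STACKSIZE - 1,
              CLONE_VM | CLONE_CHILD_SETTID | SIGCHLD,
              NULL, NULL, NULL, (pid_t *)&ph[i]);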
It is highly possible that this error explains the difference between inside and outside gdb, because gdb is well aware of the processes that are spawned under its supervision and may treat them differently than the operating system.
Alternatively, you could assign a semaphore to the table: nobody sits at the table until the table says so, obviously after it has assigned all the chairs. This would make much better use of the semaphore dead.
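In code, that gate could look something like this (a sketch reusing the globals from the question; the name start is mine):
sem_t start;   // initialise alongside the others: sem_init(&start, 1, 0)

int philo() {
    sem_wait(&start);        // block until the table seats everybody
    /* ... chair lookup and dining rounds exactly as before ... */
}

/* in main(), once the clone loop has filled every ph[i]: */
for (i = 0; i < NUMPROCS; i++)
    sem_post(&start);        // dinner is served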
BTW, you are of course fully aware that the setup of your solution allows the situation where the philosophers all end up holding one fork (er, chopstick) each and starve to death waiting for the other. Luckily the chances of that happening are very slim.
ph[i] = clone(philo, stack[i]+STACKSIZE-1, CLONE_VM|SIGCHLD, NULL) ;
This creates a thread of execution, which glibc knows nothing about. As such, glibc does not create any thread-specific internal structures that it needs for e.g. dynamic symbol resolution.
With such setup, calling into any glibc function from your philo function invokes undefined behavior, and you sometimes crash (because the dynamic loader will use main thread's private data to perform symbol resolution, and because the loader assumes that each thread has its own private area, but you've violated this assumption by creating clones which share the single private area "behind glibc's back").
If you look at a core dump, there is a high chance that the actual crash happens in ld.so, which would confirm my guess.
Don't ever use clone directly (unless you know what you are doing). Use pthread_create instead.
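A sketch of what that change looks like; passing the seat number as the thread argument also removes the getpid() scan entirely (with threads rather than processes, the semaphores would be initialised with pshared = 0):
#include <pthread.h>

void *philo_thread(void *arg) {
    int phno = *(int *)arg;   // seat number handed in directly
    /* ... same entry/exit protocol as philo(), using phno ... */
    return NULL;
}

/* in main(): */
pthread_t tid[NUMPROCS];
int seat[NUMPROCS];
for (i = 0; i < NUMPROCS; i++) {
    seat[i] = i;
    pthread_create(&tid[i], NULL, philo_thread, &seat[i]);
}
for (i = 0; i < NUMPROCS; i++)
    pthread_join(tid[i], NULL);   // no manual stacks, no wait(NULL)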
Here is what I see in the core that I just got (which is exactly the problem I described):
Program terminated with signal 4, Illegal instruction.
#0 _dl_x86_64_restore_sse () at ../sysdeps/x86_64/dl-trampoline.S:239
239 vmovdqa %fs:RTLD_SAVESPACE_SSE+0*YMM_SIZE, %ymm0
(gdb) bt
#0 _dl_x86_64_restore_sse () at ../sysdeps/x86_64/dl-trampoline.S:239
#1 0x00007fb694e1dc45 in _dl_fixup (l=<optimized out>, reloc_arg=<optimized out>) at ../elf/dl-runtime.c:127
#2 0x00007fb694e0dee5 in _dl_runtime_resolve () at ../sysdeps/x86_64/dl-trampoline.S:42
#3 0x00000000004009ec in philo ()
#4 0x00007fb69486669d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112

critical section problem in Windows 7

Why does the critical section in the code sample below cause one thread to execute far more often than the other, while a mutex does not?
#include <windows.h>
#include <conio.h>
#include <process.h>
#include <iostream>
using namespace std;

typedef struct _THREAD_INFO_ {
    COORD coord;        // a structure containing x and y coordinates
    INT threadNumber;   // each thread has its own number
    INT count;
} THREAD_INFO, *PTHREAD_INFO;

void gotoxy(int x, int y);

BOOL g_bRun;
CRITICAL_SECTION g_cs;

unsigned __stdcall ThreadFunc(void *pArguments)
{
    PTHREAD_INFO info = (PTHREAD_INFO)pArguments;
    while (g_bRun)
    {
        EnterCriticalSection(&g_cs);
        //if(TryEnterCriticalSection(&g_cs))
        //{
        gotoxy(info->coord.X, info->coord.Y);
        cout << "T" << info->threadNumber << ": " << info->count;
        info->count++;
        LeaveCriticalSection(&g_cs);
        //}
    }
    ExitThread(0);
    return 0;
}

int main(void)
{
    // OR unsigned int
    unsigned int id0, id1;   // a place to store the thread ID returned from CreateThread
    HANDLE h0, h1;           // handles to threads
    THREAD_INFO tInfo[2];    // only one of these - not optimal!
    g_bRun = TRUE;
    ZeroMemory(&tInfo, sizeof(tInfo)); // win32 function - memset(&buffer, 0, sizeof(buffer))
    InitializeCriticalSection(&g_cs);
    // set up data for the first thread
    tInfo[0].threadNumber = 1;
    tInfo[0].coord.X = 0;
    tInfo[0].coord.Y = 0;
    h0 = (HANDLE)_beginthreadex(
        NULL,          // no security attributes
        0,             // default stack size
        &ThreadFunc,   // pointer to function
        &tInfo[0],     // each thread gets its own data to output
        0,             // 0 for running or CREATE_SUSPENDED
        &id0);         // return thread id - reused here
    // set up data for the second thread
    tInfo[1].threadNumber = 2;
    tInfo[1].coord.X = 15;
    tInfo[1].coord.Y = 0;
    h1 = (HANDLE)_beginthreadex(
        NULL,          // no security attributes
        0,             // default stack size
        &ThreadFunc,   // pointer to function
        &tInfo[1],     // each thread gets its own data to output
        0,             // 0 for running or CREATE_SUSPENDED
        &id1);         // return thread id - reused here
    _getch();
    g_bRun = FALSE;
    return 0;
}

void gotoxy(int x, int y) // x = column position, y = row position
{
    HANDLE hdl;
    COORD coords;
    hdl = GetStdHandle(STD_OUTPUT_HANDLE);
    coords.X = x;
    coords.Y = y;
    SetConsoleCursorPosition(hdl, coords);
}
That may not answer your question but the behavior of critical sections changed on Windows Server 2003 SP1 and later.
If you have bugs related to critical sections on Windows 7 that you can't reproduce on an XP machine you may be affected by that change.
My understanding is that on Windows XP critical sections used a FIFO based strategy that was fair for all threads while later versions use a new strategy aimed at reducing context switching between threads.
There's a short note about this on the MSDN page about critical sections
You may also want to check this forum post
Critical sections, like mutexes are designed to protect a shared resource against conflicting access (such as concurrent modification). Critical sections are not meant to replace thread priority.
You have artificially introduced a shared resource (the screen) and made it into a bottleneck. As a result, the critical section is highly contended. Since both threads have equal priority, there is no reason for Windows to prefer one thread over another. Reduction of context switches is a reason to pick one thread over another. As a result of that reduction, the utilization of the shared resource goes up. That is a good thing; it means that one thread will be finished a lot earlier and the other thread will finish a bit earlier.
To see the effect graphically, compare
A B A B A B A B A B
to
AAAAA BBBBB
The second sequence is shorter because there's only one switch from A to B.
In hand-wavy terms:
CriticalSection is the thread saying it wants control to do some things together.
A mutex is making a marker to show 'being busy' so others can wait, and notifying of completion so somebody else can start. Somebody else already waiting for the mutex will grab it before you can loop around and get it back.
So what you are getting with CriticalSection is a failure to yield between loops. You might see a difference if you had Sleep(0); after LeaveCriticalSection, as sketched below.
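That is, something like this in ThreadFunc above; only the Sleep(0) line is new:
while (g_bRun)
{
    EnterCriticalSection(&g_cs);
    gotoxy(info->coord.X, info->coord.Y);
    cout << "T" << info->threadNumber << ": " << info->count;
    info->count++;
    LeaveCriticalSection(&g_cs);
    Sleep(0);   // offer the rest of this timeslice to another ready thread
}
Note that Sleep(0) only yields to ready threads of equal or higher priority, but here both threads qualify, so the other writer gets a chance at the critical section each time around.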
I can't say why you're observing this particular behavior, but it's probably to do with the specifics of the implementation of each mechanism. What I CAN say is that unlocking then immediately locking a mutex is a bad thing. You will observe odd behavior eventually.
From some MSDN docs (http://msdn.microsoft.com/en-us/library/ms682530.aspx):
Starting with Windows Server 2003 with Service Pack 1 (SP1), threads waiting on a critical section do not acquire the critical section on a first-come, first-serve basis. This change increases performance significantly for most code

How do I write a program that tells when my other program ends? [closed]

How do I write a program that tells when my other program ends?
The only way to do a waitpid() or waitid() on a program that you didn't spawn yourself is to become its parent by ptrace'ing it.
Here is an example of how to use ptrace on a POSIX operating system to temporarily become another process's parent, and then wait until that program exits. As a side effect, you can also get the exit code and the signal that caused that program to exit:
#include <sys/ptrace.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <unistd.h>
#include <sys/wait.h>

int main(int argc, char **argv) {
    int pid = atoi(argv[1]);
    int status;
    siginfo_t si;
    /* ptrace returns -1 and sets errno on failure */
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) != 0) {
        if (errno == ESRCH || errno == EPERM)
            return 0;
        fprintf(stderr, "Failed to attach child\n");
        return 1;
    }
    if (pid != wait(&status)) {
        fprintf(stderr, "wrong wait signal\n");
        return 1;
    }
    if (!WIFSTOPPED(status) || (WSTOPSIG(status) != SIGSTOP)) {
        /* The pid might not be running */
        if (!kill(pid, 0)) {
            fprintf(stderr, "SIGSTOP didn't stop child\n");
            return 1;
        } else {
            return 0;
        }
    }
    if (ptrace(PTRACE_CONT, pid, 0, 0)) {
        fprintf(stderr, "Failed to restart child\n");
        return 1;
    }
    while (1) {
        if (waitid(P_PID, pid, &si, WSTOPPED | WEXITED)) {
            /* an error occurred */
            if (errno == ECHILD)
                return 0;
            return 1;
        }
        errno = 0;
        /* si_code holds a single CLD_* value, so compare rather than mask */
        if (si.si_code == CLD_STOPPED || si.si_code == CLD_TRAPPED) {
            /* If the child gets stopped, we have to PTRACE_CONT it;
             * this will happen when the child has a child that exits. */
            if (ptrace(PTRACE_CONT, pid, 1, si.si_status)) {
                if (errno == ENOSYS) {
                    /* Wow, we're stuffed. Stop and return. */
                    return 0;
                }
            }
            continue;
        }
        if (si.si_code == CLD_EXITED || si.si_code == CLD_KILLED || si.si_code == CLD_DUMPED) {
            return si.si_status;
        }
        /* Fall through to exiting. */
        return 1;
    }
}
On Windows, a technique I've used is to create a global named object (such as a mutex with CreateMutex), and then have the monitoring program open that same named mutex and wait for it (with WaitForSingleObject). As soon as the first program exits, the second program obtains the mutex and knows that the first program exited.
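A sketch of the two halves, with a made-up mutex name ("MyApp.Running"):
// in the program being watched, at startup (keep the handle open for life):
//     CreateMutexA(NULL, TRUE, "MyApp.Running");

// in the monitor:
#include <windows.h>
#include <stdio.h>

int main() {
    HANDLE hMtx = OpenMutexA(SYNCHRONIZE, FALSE, "MyApp.Running");
    if (hMtx == NULL) { printf("target not running\n"); return 1; }
    // the owner never releases it, so the wait ends - typically with
    // WAIT_ABANDONED - only when the owning process exits
    DWORD r = WaitForSingleObject(hMtx, INFINITE);
    printf("target exited (wait result %lu)\n", (unsigned long)r);
    CloseHandle(hMtx);
    return 0;
}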
On Unix, a usual way to solve this is to have the first program write its pid (getpid()) to a file. A second program can monitor this pid (using kill(pid, 0)) to see whether the first program is gone yet. This method is subject to race conditions and there are undoubtedly better ways to solve it.
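A sketch of the monitoring side, with a made-up pid-file path; kill(pid, 0) delivers no signal, it only tests for existence, and the pid-reuse race lives exactly here:
#include <stdio.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

int main() {
    int pid;
    FILE *f = fopen("/tmp/myprog.pid", "r");   // hypothetical path
    if (f == NULL || fscanf(f, "%d", &pid) != 1)
        return 1;
    fclose(f);
    // EPERM means "exists but not ours", which still counts as alive
    while (kill((pid_t)pid, 0) == 0 || errno == EPERM)
        sleep(1);
    printf("process %d is gone\n", pid);
    return 0;
}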
If you want to spawn another process and then do nothing while it runs, then most higher-level languages already have built-ins for doing this. In Perl, for example, there are both system and backticks for running processes and waiting for them to finish, and modules such as IPC::System::Simple for making it easier to figure out how the program terminated, and whether you're happy or sad about that having happened. Using a language feature that handles everything for you is way easier than trying to do it yourself.
If you're on a Unix-flavoured system, then the termination of a process that you've forked will generate a SIGCHLD signal. This means your program can do other things while your child process is running.
Catching the SIGCHLD signal varies depending upon your language. In Perl, you set a signal handler like so:
use POSIX qw(:sys_wait_h);

sub child_handler {
    while ((my $child = waitpid(-1, WNOHANG)) > 0) {
        # We've caught a process dying, its PID is now in $child.
        # The exit value and other information is in $?
    }
    $SIG{CHLD} = \&child_handler;  # SysV systems clear handlers when called,
                                   # so we need to re-instate it.
}

# This establishes our handler.
$SIG{CHLD} = \&child_handler;
There are almost certainly modules on CPAN that do a better job than the sample code above. You can use waitpid with a specific process ID (rather than -1 for all), and without WNOHANG if you want your program to sleep until the other process has completed.
Be aware that while you're inside a signal handler, all sorts of weird things can happen. Another signal may come in (hence we use a while loop, to catch all dead processes), and depending upon your language, you may be part-way through another operation!
If you're using Perl on Windows, then you can use the Win32::Process module to spawn a process, and call ->Wait on the resulting object to wait for it to die. I'm not familiar with all the guts of Win32::Process, but you should be able to wait with a timeout of 0 (or 1 for a single millisecond) to check whether a process is dead yet.
In other languages and environments, your mileage may vary. Please make sure that when your other process dies you check to see how it dies. Having a sub-process die because a user killed it usually requires a different response than it exiting because it successfully finished its task.
All the best,
Paul
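For comparison, a minimal sketch of the same SIGCHLD pattern in C; sigaction avoids the SysV re-instating dance from the Perl example, and the WNOHANG loop again reaps every child that has died:
#include <signal.h>
#include <string.h>
#include <sys/wait.h>

static void child_handler(int sig) {
    (void)sig;
    int status;
    // reap everything that has exited; WNOHANG keeps the handler from blocking
    while (waitpid(-1, &status, WNOHANG) > 0) {
        /* only async-signal-safe work is allowed in here */
    }
}

int main(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = child_handler;
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    sigaction(SIGCHLD, &sa, NULL);
    /* ... fork children and get on with other work ... */
    return 0;
}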
Are you on Windows? If so, the following should solve the problem - you need to pass the process ID:
bool WaitForProcessExit(DWORD _dwPID)
{
    HANDLE hProc = NULL;
    bool bReturn = false;
    hProc = OpenProcess(SYNCHRONIZE, FALSE, _dwPID);
    if (hProc != NULL)
    {
        if (WAIT_OBJECT_0 == WaitForSingleObject(hProc, INFINITE))
        {
            bReturn = true;
        }
        CloseHandle(hProc);  // only close a handle we actually opened
    }
    return bReturn;
}
Note: This is a blocking function. If you want non-blocking then you'll need to change the INFINITE to a smaller value and call it in a loop (probably keeping the hProc handle open to avoid reopening on a different process of the same PID).
Also, I've not had time to test this piece of source code, but I lifted it from an app of mine which does work.
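A sketch of that non-blocking variant; IsProcessAlive is a name I made up, and hProc is the handle kept open from the OpenProcess call above:
// requires <windows.h>
bool IsProcessAlive(HANDLE hProc)
{
    // a zero timeout returns immediately: WAIT_TIMEOUT means still running
    return WaitForSingleObject(hProc, 0) == WAIT_TIMEOUT;
}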
On most operating systems it's generally the same kind of thing:
you record the process ID of the program in question and just monitor it by querying the active processes periodically.
In Windows, at least, you can trigger events to do it...
Umm, you can't; this is an impossible task given the nature of it.
Let's say you have a program foo that takes as input another program foo-sub.
Foo {
    func Stops(foo_sub) { run foo_sub; return 1; }
}
The problem with this admittedly simplistic design is that if foo-sub is a program that never ends, foo itself never ends. There is no way to tell from the outside whether it is foo-sub or foo that is failing to stop, and nothing rules out that your program simply takes a century to run.
Essentially this is one of the questions that a computer can't answer. For a more complete overview, Wikipedia has an article on this.
This is called the "halting problem" and is not solvable.
See http://en.wikipedia.org/wiki/Halting_problem
If you want to analyze a program without executing it, then it's an unsolvable problem.