Closing pipe does not interrupt read() in child process spawned from thread - c++

In a Linux application I'm spawning multiple programs via fork/execvp and redirecting the standard IO streams to a pipe for IPC. I spawn a child process, write some data into the child's stdin pipe, close stdin, and then read the child's response from the stdout pipe. This worked fine until I executed multiple child processes at the same time, using one independent thread per child process.
As soon as I increase the number of threads, I often find that the child processes hang while reading from stdin – although read should immediately return EOF because the stdin pipe has been closed by the parent process.
I've managed to reproduce this behaviour in the following test program. On my systems (Fedora 23, Ubuntu 14.04; g++ 4.9, 5, 6 and clang 3.7) the program often simply hangs after three or four child processes have exited. The child processes that have not exited are hanging in read(). Killing any child process that has not exited causes all the other child processes to magically wake up from read(), and the program continues normally.
#include <chrono>
#include <cstdlib> // for quick_exit
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

#include <sys/fcntl.h>
#include <sys/wait.h>
#include <unistd.h>

#define HANDLE_ERR(CODE)      \
    {                         \
        if ((CODE) < 0) {     \
            perror("error");  \
            quick_exit(1);    \
        }                     \
    }

int main()
{
    std::mutex stdout_mtx;
    std::vector<std::thread> threads;
    for (size_t i = 0; i < 8; i++) {
        threads.emplace_back([&stdout_mtx] {
            int pfd[2]; // Create the communication pipe
            HANDLE_ERR(pipe(pfd));

            pid_t pid; // Fork this process
            HANDLE_ERR(pid = fork());
            if (pid == 0) {
                HANDLE_ERR(close(pfd[1])); // Child, close write end of pipe
                for (;;) { // Read data from pfd[0] until EOF or other error
                    char buffer;
                    ssize_t bytes;
                    HANDLE_ERR(bytes = read(pfd[0], &buffer, 1));
                    if (bytes < 1) {
                        break;
                    }
                    // Allow time for thread switching
                    std::this_thread::sleep_for(std::chrono::milliseconds(
                        100)); // This sleep is crucial for the bug to occur
                }
                quick_exit(0); // Exit, do not call C++ destructors
            }
            else {
                { // Some debug info
                    std::lock_guard<std::mutex> lock(stdout_mtx);
                    std::cout << "Created child " << pid << std::endl;
                }
                // Close the read end of the pipe
                HANDLE_ERR(close(pfd[0]));
                // Send some data to the child process
                HANDLE_ERR(write(pfd[1], "abcdef\n", 7));
                // Close the write end of the pipe, wait for the process to exit
                int status;
                HANDLE_ERR(close(pfd[1]));
                HANDLE_ERR(waitpid(pid, &status, 0));
                { // Some debug info
                    std::lock_guard<std::mutex> lock(stdout_mtx);
                    std::cout << "Child " << pid << " exited with status "
                              << status << std::endl;
                }
            }
        });
    }

    // Wait for all threads to complete
    for (auto &thread : threads) {
        thread.join();
    }
    return 0;
}
Compile using
g++ test.cpp -o test -lpthread --std=c++11
Note that I'm perfectly aware that mixing fork and threads is potentially dangerous, but please keep in mind that in the original code I'm immediately calling execvp after forking, and that I don't have any shared state between the child process and the main program, except for the pipes specifically created for IPC. My original code (without the threading part) can be found here.
To me this almost seems like a bug in the Linux kernel, since the program continues correctly as soon as I kill any of the hanging child processes.

This problem is caused by two fundamental principles of how fork and pipes work in Unix: a) the pipe description is reference counted – a pipe end is only closed once all file descriptors referencing it, in any process, are closed; b) fork duplicates all open file descriptors of a process.
In the above code, the following race condition can happen: if a thread switch occurs between the pipe and fork system calls and another thread calls fork in that window, the pipe file descriptors are duplicated into that thread's child as well, leaving the write/read ends open multiple times. Remember that all duplicates must be closed for EOF to be generated – which will not happen while a stray duplicate lives on in an unrelated process.
The best solution is to use the pipe2 system call with the O_CLOEXEC flag and to immediately call exec in the child process after a controlled duplicate of the file descriptor is created using dup2:
HANDLE_ERR(pipe2(pfd, O_CLOEXEC));
HANDLE_ERR(pid = fork());
if (pid == 0) {
    HANDLE_ERR(close(pfd[1])); // Child, close write end of pipe
    HANDLE_ERR(dup2(pfd[0], STDIN_FILENO));
    HANDLE_ERR(execlp("cat", "cat", (char *)NULL)); // argument list must end with NULL
}
Note that the FD_CLOEXEC flag is not copied by the dup2 system call. This way all child processes will automatically close all the file descriptors they should not receive as soon as they reach the exec system call.
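If in doubt, this is easy to verify in the child right before the exec; a minimal sketch reusing HANDLE_ERR from above (the assert is purely illustrative):
#include <assert.h>
#include <fcntl.h>

// ... in the child, right after dup2(pfd[0], STDIN_FILENO):
int flags;
HANDLE_ERR(flags = fcntl(STDIN_FILENO, F_GETFD));
assert((flags & FD_CLOEXEC) == 0); // the duplicate's close-on-exec flag is
                                   // clear, so stdin survives the exec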
From the man-page on open on O_CLOEXEC:
O_CLOEXEC (since Linux 2.6.23)
Enable the close-on-exec flag for the new file descriptor.
Specifying this flag permits a program to avoid additional
fcntl(2) F_SETFD operations to set the FD_CLOEXEC flag.
Note that the use of this flag is essential in some
multithreaded programs, because using a separate fcntl(2)
F_SETFD operation to set the FD_CLOEXEC flag does not suffice
to avoid race conditions where one thread opens a file
descriptor and attempts to set its close-on-exec flag using
fcntl(2) at the same time as another thread does a fork(2)
plus execve(2). Depending on the order of execution, the race
may lead to the file descriptor returned by open() being
unintentionally leaked to the program executed by the child
process created by fork(2). (This kind of race is in
principle possible for any system call that creates a file
descriptor whose close-on-exec flag should be set, and various
other Linux system calls provide an equivalent of the
O_CLOEXEC flag to deal with this problem.)
The phenomenon of all child processes suddenly exiting when one child process is killed can be explained by comparing this issue to the dining philosophers problem: in the same way that killing one of the philosophers resolves the deadlock, killing one of the processes closes one of the duplicated file descriptors, triggering an EOF in another child process, which then exits and in turn frees one of the duplicated file descriptors...
Thank you to David Schwartz for pointing this out.

Related

Use system() to create independent child process

I have written a program where I create a thread in main and use system() to start another process from the thread. I also start the same process using system() directly in the main function. The process started from the thread seems to stay alive even when the parent process dies, but the one called from the main function dies with the parent. Any ideas why this is happening?
Please find the code structure below:
#include <pthread.h>
#include <cstdlib>
#include <string>

std::string command; // the command to run, set elsewhere
pthread_t thread_id;

void *thread_func(void *arg)
{
    system(command.c_str());
    return NULL;
}

int main()
{
    pthread_create(&thread_id, NULL, thread_func, NULL);
    // ....
    system(command.c_str());
    while (true)
    {
        // ....
    }
    pthread_join(thread_id, NULL);
    return 0;
}
My suggestion is: don't do what you do. If you want to create an independently running child process, research the fork and exec family of functions – which is what system uses "under the hood".
Threads aren't really independent the same way processes are. When your "main" process ends, all threads end as well. In your specific case, the thread seems to continue to run while the main process seems to end because of the pthread_join call: it simply waits for the thread to exit. If you remove the join call, the thread (and your "command") will be terminated.
There are ways to detach threads so they can run a little more independently (for example, you don't have to join a detached thread), but the main process still can't simply end; instead you have to end only the main thread, which keeps the process running for as long as there are detached threads running.
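A minimal sketch of that approach, reusing thread_func from the question; note the pthread_exit in place of returning from main:
#include <pthread.h>

void *thread_func(void *arg); // as in the question

int main()
{
    pthread_t thread_id;
    pthread_create(&thread_id, NULL, thread_func, NULL);
    pthread_detach(thread_id);  // no join needed (or allowed) anymore
    pthread_exit(NULL);         // ends only the main thread; the process
                                // lives on until thread_func returns
}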
Using fork and exec is actually quite simple:
#include <cerrno>
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <unistd.h>

pid_t pid = fork();
if (pid == 0)
{
    // We are in the child process; execute the command
    execl(command.c_str(), command.c_str(), nullptr);
    // If execl returns, there was an error
    std::cout << "Exec error: " << errno << ", " << strerror(errno) << '\n';
    // Exit child process
    exit(1);
}
else if (pid > 0)
{
    // The parent process, do whatever is needed
    // The parent process can even exit while the child process is running, since it's independent
}
else
{
    // Error forking, still in parent process (there is no child process at this point)
    std::cout << "Fork error: " << errno << ", " << strerror(errno) << '\n';
}
The exact variant of exec to use depends on command. If it's a valid path (absolute or relative) to an executable program then execl works well. If it's a "command" in the PATH then use execlp.
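For example (the program and arguments are illustrative):
execl("/bin/ls", "ls", "-l", (char *)NULL);  // explicit path, no PATH search
execlp("ls", "ls", "-l", (char *)NULL);      // searches PATH for "ls"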
There are two points here that I think you've missed:
First, system is a synchronous call. That means, your program (or, at least, the thread calling system) waits for the child to complete. So, if your command is long-running, both your main thread and your worker thread will be blocked until it completes.
Secondly, you are "joining" the worker thread at the end of main. This is the right thing to do, because unless you join or detach the thread you have undefined behaviour. However, it's not what you really intended to do. The end result is not that the child process continues after your main process ends... your main process is still alive! It is blocked on the pthread_join call, which is trying to wrap up the worker thread, which is still running command.
In general, assuming you wish to spawn a new process entirely unrelated to your main process, threads are not the way to do it. Even if you were to detach your thread, it still belongs to your process, and you are still required to let it finish before your process terminates. You can't detach from the process using threads.
Instead, you'll need OS features such as fork and exec (or a friendly C++ wrapper around this functionality, such as Boost.Subprocess). This is the only way to truly spawn a new process from within your program.
But you can cheat! If command is a shell command and your shell supports background jobs, you could put & at the end of the command (this is Bash syntax) to make the system call:
- ask the shell to spin off a new process,
- wait for it to do that;
the new process will then continue to run in the background.
For example:
const std::string command = "./myLongProgram &";
//                                           ^ note the trailing ampersand
However, again, this is kind of a hack and proper fork mechanisms that reside within your program's logic should be preferred for maximum portability and predictability.

Way to force file descriptor to close so that pclose() will not block?

I am creating a pipe using popen() and the process is invoking a third party tool which in some rare cases I need to terminate.
::popen(thirdPartyCommand.c_str(), "w");
If I just throw an exception and unwind the stack, my unwind attempts to call pclose() on the third party process whose results I no longer need. However, pclose() never returns, as it blocks with the following stack trace on CentOS 4:
#0 0xffffe410 in __kernel_vsyscall ()
#1 0x00807dc3 in __waitpid_nocancel () from /lib/libc.so.6
#2 0x007d0abe in _IO_proc_close@@GLIBC_2.1 () from /lib/libc.so.6
#3 0x007daf38 in _IO_new_file_close_it () from /lib/libc.so.6
#4 0x007cec6e in fclose@@GLIBC_2.1 () from /lib/libc.so.6
#5 0x007d6cfd in pclose@@GLIBC_2.1 () from /lib/libc.so.6
Is there any way to ensure the call to pclose() will succeed before making it, so I can programmatically avoid my process getting hung up waiting for a pclose() that will never return, given that I've stopped supplying input to the popen()ed process and wish to throw away its work?
Should I somehow write an end-of-file to the popen()ed file descriptor before trying to close it?
Note that the third party software is forking itself. At the point where pclose() has hung, there are four processes, one of which is defunct:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
abc 6870 0.0 0.0 8696 972 ? S 04:39 0:00 sh -c /usr/local/bin/third_party /home/arg1 /home/arg2 2>&1
abc 6871 0.0 0.0 10172 4296 ? S 04:39 0:00 /usr/local/bin/third_party /home/arg1 /home/arg2
abc 6874 99.8 0.0 10180 1604 ? R 04:39 141:44 /usr/local/bin/third_party /home/arg1 /home/arg2
abc 6875 0.0 0.0 0 0 ? Z 04:39 0:00 [third_party] <defunct>
I see two solutions here:
The neat one: you fork(), pipe() and execve() (or anything in the exec family, of course...) "manually"; then it is up to you to decide whether you want to let your children become zombies or not (i.e. whether to wait() for them or not).
The ugly one: if you're sure you only have one of these child processes running at any given time, you could use sysctl() to check if there is any process running with this name before you call pclose()... yuk.
I strongly advise the neat way here, or you could just ask whoever is responsible to fix that infinite loop in your third party tool, haha.
Good luck!
EDIT:
For your first question: I don't know. Doing some research on how to find processes by name using sysctl() should tell you what you need to know; I myself have never pushed it this far.
For your second and third question: popen() is basically a wrapper to fork() + pipe() + dup2() + execl().
fork() duplicates the process, execl() replaces the duplicated process' image with a new one, pipe() handles the inter-process communication and dup2() is used to redirect the output... and then pclose() will wait() for the duplicated process to die, which is why we're here.
If you want to know more, you should check this answer where I've recently explained how to perform a simple fork with standard IPC. In this case, it's just a bit more complicated as you have to use dup2() to redirect the standard output to your pipe.
You should also take a look at popen()/pclose() source codes, as they are of course open source.
Finally, here's a brief example, I cannot make it clearer than that:
int pipefd[2];
char buf;

pipe(pipefd);
if (fork() == 0) // I'm the child
{
    close(pipefd[0]);   // I'm not going to read from this pipe
    dup2(pipefd[1], 1); // redirect standard output to the pipe
    close(pipefd[1]);   // it has been duplicated, close it as we don't need it anymore
    execve()/execl()/execsomething()... // execute the program you want
}
else // I'm the parent
{
    close(pipefd[1]);                    // I'm not going to write to this pipe
    while (read(pipefd[0], &buf, 1) > 0) // read until EOF
        write(1, &buf, 1);
    close(pipefd[0]); // cleaning: close the read end once done
}
And as always, remember to read the man pages and to check all your return values.
Again, good luck!
Another solution is to kill all your children. If you know that the only child processes you have are the ones started via popen(), then it's easy enough. Otherwise you may need some more work, or you can use the fork() + execve() combo, in which case you will know the first child's PID.
Whenever you run a child process, its PPID (parent process ID) is your own PID. It is easy enough to read the list of currently running processes and gather those that have their PPID equal to getpid(). Repeat the loop looking for processes whose PPID equals one of your children's PIDs. In the end you build a whole tree of child processes.
Since your child processes may end up creating other child processes, to make it safe you will want to freeze those processes by sending a SIGSTOP. That way they will stop creating new children. As far as I know, you can't prevent SIGSTOP from doing its deed.
The process is therefore:
void kill_all_children()
{
    std::vector<pid_t> me_and_children;
    me_and_children.push_back(getpid());

    bool found_child = false;
    do
    {
        found_child = false;
        std::vector<process> processes(get_processes());
        for (auto p : processes)
        {
            // i.e. if p is the child of any one of those processes
            if (std::find(me_and_children.begin(),
                          me_and_children.end(),
                          p.ppid()) != me_and_children.end())
            {
                kill(p.pid(), SIGSTOP);
                me_and_children.push_back(p.pid());
                found_child = true;
            }
        }
    }
    while (found_child);

    for (auto c : me_and_children)
    {
        // ignore ourselves
        if (c == getpid())
        {
            continue;
        }
        kill(c, SIGTERM);
        kill(c, SIGCONT); // make sure it continues now
    }
}
This is probably not the best way to close your pipe, though, since you probably need to give the command time to handle your data. So what you want is to execute that code only after a timeout. Your regular code could look something like this:
void handle_alarm(int sig); // forward declaration

void send_data(...)
{
    signal(SIGALRM, handle_alarm);
    FILE *f = popen("command", "w");
    // do some work...
    alarm(60); // give it a minute
    pclose(f);
    alarm(0); // remove the alarm
}

void handle_alarm(int sig)
{
    kill_all_children();
}
About the alarm(60): the location is up to you; it could also be placed before the popen() if you're afraid that the popen() or the work after it could also fail (e.g. I've had problems where the pipe fills up and I don't even reach the pclose(), because the child process loops forever).
Note that alarm() may not be the best idea in the world. You may prefer using a thread whose sleep is built from a poll() or select() on an fd which you can wake up as required. That way the thread would call the kill_all_children() function after the sleep, but you can send it a message to wake it up early and let it know that the pclose() happened as expected.
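A minimal sketch of that watchdog-thread variant, assuming a pipe whose write end the main thread pokes after a successful pclose() (the fd-passing convention is made up for this example):
#include <poll.h>
#include <unistd.h>

void kill_all_children(); // from above

// Watchdog thread body: wait up to 60 s for a wake-up byte; if none arrives,
// assume pclose() is stuck and kill the runaway children.
void *watchdog(void *arg)
{
    int *wake_fds = static_cast<int *>(arg); // wake_fds[0] = read end of the pipe
    struct pollfd pfd = { wake_fds[0], POLLIN, 0 };
    if (poll(&pfd, 1, 60 * 1000) == 0) // 0 means the timeout expired
        kill_all_children();
    return NULL;
}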
Note: I left the implementation of get_processes() out of this answer. You can read that from /proc or with the libprocps library. I have such an implementation in my snapwebsites project; it's called process_list. You could just lift that class.
I'm using popen() to invoke a child process which doesn't need any stdin or stdout, it just runs for a short time to do its work, then it stops all by itself. Arguably, invoking this type of child process should rather be done with system() ? Anyway, pclose() is used afterwards to verify that the child process exited cleanly.
Under certain conditions, this child process keeps on running indefinitely. pclose() blocks forever, so then my parent process is also stuck. CPU usage runs to 100%, other executables get starved, and my whole embedded system crumbles. I came here looking for solutions.
Solution 1 by #cmc : decomposing popen() into fork(), pipe(), dup2() and execl().
It might just be a matter of personal taste, but I'm reluctant to rewrite perfectly fine system calls myself. I would just end up introducing new bugs.
Solution 2 by #cmc: verifying that the child process actually exists with sysctl(), to make sure that pclose() will return successfully. I find that this somehow sidesteps the problem from the OP #WilliamKF – there is definitely a child process, it has just become unresponsive. Forgoing the pclose() call won't solve that. [As an aside, in the 7 years since #cmc wrote this answer, sysctl() seems to have become deprecated.]
Solution 3 by #Alexis Wilke: killing the child process. I like this approach best. It basically automates what I did when I stepped in manually to resuscitate my dying embedded system. The problem with my stubborn adherence to popen() is that I get no PID for the child process. I have been trying in vain with
waitid(P_PGID, getpgrp(), &child_info, WNOHANG);
but all I get on my Debian Linux 4.19 system is EINVAL.
So here's what I cobbled together. I'm searching for the child process by name; I can afford to take a few shortcuts here, as I'm sure there will only be one process with this name. Ironically, the command-line utility ps is invoked by yet another popen(). This won't win any elegance prizes, but at least my embedded system stays afloat now.
FILE* child = popen("child", "r");
if (child)
{
    int nr_loops;
    int child_pid;

    for (nr_loops = 10; nr_loops; nr_loops--)
    {
        FILE* ps = popen("ps | grep child | grep -v grep | grep -v \"sh -c \" | sed \'s/^ *//\' | sed \'s/ .*$//\'", "r");
        child_pid = 0;
        int found = fscanf(ps, "%d", &child_pid);
        pclose(ps);
        if (found != 1)
            // The child process is no longer running, no risk of blocking pclose()
            break;
        syslog(LOG_WARNING, "child running PID %d", child_pid);
        usleep(1000000); // 1 second
    }

    if (!nr_loops)
    {
        // Time to kill this runaway child
        syslog(LOG_ERR, "killing PID %d", child_pid);
        kill(child_pid, SIGTERM);
    }

    pclose(child); // Even after it had to be killed
} /* if (child) */
I learned the hard way that I have to pair every popen() with a pclose(), otherwise I pile up zombie processes. I find it remarkable that this is needed even after a direct kill; I figure that's because, according to the manpage, popen() actually launches sh -c with the child process in it, and it's this surrounding sh that becomes a zombie.

c++ fork() & execl() dont wait, detach completely

So I have a simple fork and exec program. It works pretty well, but I want to be able to detach the process that is started. I tried a fork with no wait:
if ((pid = fork()) < 0)
    perror("Error with Fork()");
else if (pid > 0) {
    return "";
}
else {
    if (execl("/bin/bash", "/bin/bash", "-c", cmddo, (char *) 0) < 0)
        perror("execl()");
    exit(0);
}
It starts the process fine, but when my main app is closed, so is my forked process.
How do I keep the forked process running after the main process (that started it) closes?
Thanks :D
Various things to do if you want to start a detached/daemon process (see the sketch after this list):
- fork again and exit the first child (so the second child process no longer has the original process as its parent pid)
- call setsid(2) to get a new session and process group
- reopen stdin/stdout/stderr to dereference the controlling tty, if there was one. Or, for example, you might have inherited a pipe stdout that will be broken and give you SIGPIPE if you try to write to it.
- chdir to / to get away from the ancestor's current directory
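A minimal sketch putting those steps together (error handling omitted for brevity):
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

void daemonize(void)
{
    if (fork() != 0) _exit(0); // parent returns to the shell
    setsid();                  // new session, drop the controlling tty
    if (fork() != 0) _exit(0); // first child exits; the grandchild can
                               // never reacquire a controlling tty
    chdir("/");                // leave the ancestor's current directory

    int devnull = open("/dev/null", O_RDWR); // reopen stdio
    dup2(devnull, STDIN_FILENO);
    dup2(devnull, STDOUT_FILENO);
    dup2(devnull, STDERR_FILENO);
    if (devnull > STDERR_FILENO) close(devnull);
}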
Probably all you really want is to ignore SIGHUP in your fork()ed process, as this is normally the signal which brings the program down. That is, what you need to do is
signal(SIGHUP, SIG_IGN);
Using nohup arranges for a reader to be present, which avoids possibly writing to a closed pipe. To avoid this you could either arrange for the standard outputs not to be available, or also ignore SIGPIPE. There are a number of signals which terminate your program when not ignored (see man signal; some signals can't be ignored), but the one which will be sent to the child here is SIGHUP.

child waiting for another child

Is there a way for a forked child to examine another forked child, so that if the other forked child takes more time than usual to perform its chores, the first child may perform predefined steps?
If so, sample code will be greatly appreciated.
Yes. Simply fork the process to be watched from the process that watches it.
if (fork() == 0) {
    // we are the watcher
    pid_t watchee_pid = fork();
    if (watchee_pid != 0) {
        // wait and/or handle timeout
        int status;
        waitpid(watchee_pid, &status, WNOHANG);
    } else {
        // we're being watched, do stuff
    }
} else {
    // original process
}
To emphasise: There are 3 processes. The original, the watcher process (that handles timeout etc.) and the actual watched process.
To do this you'll need to use some form of IPC, and a named shared memory segment makes perfect sense here. Your first child could read a value in a named segment which the other child will set once it has completed its work. Your first child could set a timeout and, once that timeout expires, check the value – if the value is not set, then do what you need to do.
The code can vary greatly depending on whether you use C or C++; you need to pick one. If C++, you can use boost::interprocess for this – it has lots of examples of shared memory usage. If C, then you'll have to put this together using native calls for your OS – again, this should be fairly straightforward – start at shmget().
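A rough sketch of the C route, with an illustrative key, flag layout and timeout (none of which are from the question):
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <unistd.h>

int main(void)
{
    // Both children attach the same segment; the watched child writes 1 to
    // *done when it finishes, the watching child polls it with a timeout.
    int id = shmget(0x1234, sizeof(int), IPC_CREAT | 0600); // illustrative key
    volatile int *done = (volatile int *)shmat(id, NULL, 0);

    int waited = 0;
    while (!*done && waited < 60) { // give the other child 60 seconds
        sleep(1);
        waited++;
    }
    if (!*done) {
        /* timed out: perform the predefined steps here */
    }
    shmdt((void *)done);
    return 0;
}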
Here is some illustrative code that could help you solve the problem in a Linux environment.
pid_t pid = fork();
if (pid == -1) {
    printf("fork: %s", strerror(errno));
    exit(1);
} else if (pid > 0) {
    /* parent process */
    int i = 0;
    int secs = 60; /* 60 secs for the process to finish */
    while (1) {
        /* check if process with pid exists */
        if (exist(pid) && i > secs) {
            /* do something accordingly */
        }
        sleep(1);
        i++;
    }
} else {
    /* child process */
    /* child logic here */
    exit(0);
}
...those 60 seconds are not very strict; you could use a timer if you want stricter timing measurement, but if your system doesn't need hard real-time processing it should be just fine like this.
exist(pid) refers to a function whose code looks into /proc/<pid>, where pid is the process id of the child process.
Optionally, you can implement the exist(pid) function using other libraries designed to extract information from the /proc directory, like procps.
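One possible exist(pid) along those lines, checking /proc/<pid> directly (kill(pid, 0) would be a common alternative):
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

int exist(pid_t pid)
{
    char path[64];
    struct stat st;
    snprintf(path, sizeof(path), "/proc/%d", (int)pid);
    return stat(path, &st) == 0; // directory exists => process exists
}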
The only processes you can wait on are your own direct child processes - not siblings, not your parent, not grandchildren, etc. Depending on your program's needs, Matt's solution may work for you. If not, here are some other alternatives:
Forget about waiting and use another form of IPC. For robustness, it needs to be something where unexpected termination of the process you're waiting on results in your receiving an event. The best one I can think of is opening a pipe which both processes share, and giving the writing end of the pipe to the process you want to wait for (make sure no other processes keep the writing end open!). When the process holding the writing end terminates, it will be closed, and the reading end will then indicate EOF (read will block on it until the writing end is closed, then return a zero-length read).
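A minimal sketch of that pipe trick, using a direct fork for simplicity:
#include <unistd.h>

int main(void)
{
    int fds[2];
    pipe(fds);

    if (fork() == 0) {  // the process being waited on
        close(fds[0]);  // it keeps only the (unused) write end
        /* ... do work; terminate normally or abnormally ... */
        _exit(0);       // all its fds close now, including fds[1]
    }

    close(fds[1]);      // crucial: nobody else may hold the write end
    char c;
    while (read(fds[0], &c, 1) > 0)
        ;               // blocks; returns 0 (EOF) once the child is gone
    /* the watched process has terminated */
    return 0;
}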
Forget about IPC and use threads. One advantage of threads is that the atomicity of a "process" is preserved. It's impossible for individual threads to be killed or otherwise terminate outside of the control of your program, so you don't have to worry about race conditions with process ids and shared resource allocation in the system-global namespace (IPC objects, filenames, sockets, etc.). All synchronization primitives exist purely within your process's address space.

Stop and start running again processes in Linux using C++

I have two processes and a shared memory zone; my workflow is like this: process A writes some data into the shared memory, then sends a signal to process B to start running and waits. Process B reads some data from the shared memory, does some stuff, writes the result, sends a signal to process A to keep running, and then waits itself.
Can anyone please provide an example, or a place where I can find out how to stop a process and how to start it running again? I am working in Linux and C++.
I already have semaphores, but the thing I do not like is that one process is stopped for a bunch of seconds, reading from the shared memory all the time until it detects that it can run. That's why I was thinking of only sending a signal at the right moment.
Update with the solution
I selected stefan.ciobaca's answer as favourite because it is a complete solution that works and has a very good explanation. But all of the other answers contain other interesting options.
Here is a proof-of-concept of how it can be done:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <unistd.h>
#include <assert.h>

typedef void (*sighandler_t)(int);

#define SHM_SIZE 8 /* size of shared memory: enough for two 32 bit integers */

volatile int cancontinue = 0;

void halt(char *err) { perror(err); exit(1); }

void handler(int signum) { assert(signum == SIGUSR1); cancontinue = 1; }

int main(void)
{
    key_t key;
    int id;
    int *data;
    pid_t otherpid;

    printf("Hi, I am the %s process and my pid is %d\n",
#ifdef PRODUCER_MODE
           "writer"
#else
           "reader"
#endif
           , getpid());
    printf("Please give me the pid of the other process: ");
    scanf("%d", &otherpid);

    // get a pointer to the shared memory
    if ((key = ftok("test_concur.c", 'R')) == -1) halt("ftok");
    if ((id = shmget(key, SHM_SIZE, 0644 | IPC_CREAT)) == -1) halt("shmget");
    if ((data = shmat(id, (void *)0, 0)) == (int *)(-1)) halt("shmat");

    sighandler_t oldhandler = signal(SIGUSR1, handler);

    while (1) {
#ifdef PRODUCER_MODE
        printf("Enter two integers: ");
        scanf("%d %d", data, data + 1);

        printf("Sending signal to consumer process\n");
        kill(otherpid, SIGUSR1);

        printf("Waiting for consumer to allow me to continue\n");
        while (!cancontinue);
        cancontinue = 0;

        if (*data + *(data + 1) == 0) { printf("Sum was 0, exiting...\n"); break; }
#else
        printf("Waiting for producer to signal me to do my work\n");
        while (!cancontinue);
        cancontinue = 0;

        printf("Received signal\n");
        printf("Pretending to do a long calculation\n");
        sleep(1);

        int sum = *data + *(data + 1);
        printf("The sum of the ints in the shared memory is %d\n", sum);

        printf("Signaling producer I'm done\n");
        kill(otherpid, SIGUSR1);

        if (sum == 0) break;
#endif
    }

    signal(SIGUSR1, oldhandler);

    /* detach from the segment: */
    if (shmdt(data) == -1) {
        perror("shmdt");
        exit(1);
    }

    // don't forget to remove the shared segment from the command line with
    // #sudo ipcs
    // ... and look for the key of the shared memory segment
    // #ipcrm -m <key>

    return 0;
}
The above program is actually two programs, a consumer and a producer,
depending on how you compile it.
You compile the producer by making sure that the PRODUCER_MODE macro
is defined:
# gcc -Wall -DPRODUCER_MODE -o producer test_concur.c
The consumer is compiled without defining the PRODUCER_MODE macro:
# gcc -Wall -o consumer test_concur.c
The consumer and producer share some global memory (8 bytes pointed to by data); the producer's role is to read two 32-bit integers from stdin and write them to the shared
memory. The consumer reads integers from the shared memory and
computes their sum.
After writing the data to shared memory, the producer signals to the
consumer (via SIGUSR1) that it may begin the computation. After the
computation is done, the consumer signals to the producer (via SIGUSR1
again) that it may continue.
Both processes stop when the sum is 0.
Currently, each program begins by outputting its pid and reading the other program's pid from stdin. This should probably :D be replaced by something smarter, depending on exactly what you are doing.
Also, in practice, the "while (!cancontinue);"-like busy loops should be replaced by something else :D, like semaphores; at the very least you should sleep briefly inside each loop. Also, I think you do not truly need shared memory to solve this problem; it should be doable using message-passing techniques.
Here is an example session, showed in parallel:
# ./producer # ./consumer
Hi, I am the writer process and my pid is 11357 Hi, I am the reader process and my pid is 11358
Please give me the pid of the other process: 11358 Please give me the pid of the other process: 11357
Enter two integers: 2 Waiting for producer to signal me to do my work
3
Sending signal to consumer process Received signal
Waiting for consumer to allow me to continue Pretending to do a long calculation
... some times passes ...
The sum of the ints in the shared memory is 5
Signaling producer I'm done
Enter two integers: 0 Waiting for producer to signal me to do my work
0
Sending signal to consumer process Received signal
Waiting for consumer to allow me to continue Pretending to do a long calculation
... some times passes ...
The sum of the ints in the shared memory is 0
Signaling producer I'm done
Sum was 0, exiting...
I hope this helps. (When you run the programs, make sure the file test_concur.c exists – it is used to establish the shared memory key via the ftok() call.)
Not quite what you've asked for, but could you use pipes (named or otherwise) to effect the synchronization? This puts the locking burden onto the OS, which already knows how to do it.
Just a thought.
Response to comment: what I had in mind was using pipes rather than shared memory to move the data around, getting the synchronization for free.
For instance (a sketch follows the list):
1. Process A starts, sets up a bi-directional pipe and forks process B using popen(3).
2. Immediately after the fork:
- A does some work and writes to the pipe
- B attempts to read the pipe, which will block until process A writes...
3. Next:
- A attempts to read the pipe, which will block until data is available...
- B does some work and writes to the pipe
4. Go to step 2 until you reach an ending condition.
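A bare-bones sketch of that ping-pong using fork() and two pipe() calls (a popen() stream is unidirectional, so two pipes are used here; all names are illustrative):
#include <unistd.h>

int main(void)
{
    int a2b[2], b2a[2]; // one pipe per direction
    pipe(a2b);
    pipe(b2a);

    if (fork() == 0) { // process B
        close(a2b[1]); close(b2a[0]); // close the ends B doesn't use
        char buf[64];
        ssize_t n;
        while ((n = read(a2b[0], buf, sizeof(buf))) > 0) { // blocks until A writes
            /* ... work on the data ... */
            write(b2a[1], buf, n); // the reply wakes A up
        }
        _exit(0);
    }

    // process A
    close(a2b[0]); close(b2a[1]); // close the ends A doesn't use
    char buf[64] = "data";
    write(a2b[1], buf, sizeof(buf)); // step 2: A writes
    read(b2a[0], buf, sizeof(buf));  // step 3: A blocks until B replies
    return 0;
}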
This is not what you asked for. No shared memory, no signals, but it should do the trick...
What about using Unix domain sockets for the IPC instead of shared memory? That way each process can block on reading from the socket while the other does its work.
Edit: this is similar to dmckee's answer, but offers more control over the blocking and IPC. The popen approach is definitely easier to implement, however.
Do you really need to stop the process (exit it), and restart it, or do you just want it to wait until some event occurs?
If the latter you should read up on IPC and process synchronisation (e.g. semaphores, mutexes).
If the former, look at the source code for something like init in linux.
What you are looking for is called blocking. Process B should block on a call from process A, and process A should block on a call from process B. If a process is blocked (waiting for the call from the other process), it sits idly in the background and only wakes up when it receives a message.
select() is probably the function you are looking for.
I suggest using semaphores to synchronize the processes.
Reading the headline, I thought that SIGSTOP and SIGCONT might be possibilities, but that's probably not a good idea: you want the processes to stop when they're in the right (safe) place to stop. That's what semaphores are for.
Many other IPC mechanisms could also achieve similar results, but semaphores are cross-process communication mechanisms (you'd use mutexes between different threads in a single process).
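For example, a minimal sketch of the handshake with named POSIX semaphores (the semaphore names are illustrative; link with -pthread):
#include <fcntl.h>
#include <semaphore.h>

int main(void)
{
    // Process A's side; process B runs the mirror image
    // (sem_wait(a_done); work; sem_post(b_done);).
    sem_t *a_done = sem_open("/a_done", O_CREAT, 0644, 0);
    sem_t *b_done = sem_open("/b_done", O_CREAT, 0644, 0);

    for (;;) {
        /* ... write data into the shared memory ... */
        sem_post(a_done); // tell B the data is ready
        sem_wait(b_done); // sleep (no busy loop) until B has finished
        /* ... read B's result ... */
    }
}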
You may also want to look into Boost message queues, which use shared memory under the hood but hide all the hard parts. They offer both blocking and non-blocking functions (it sounds like you want blocking).
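A minimal sketch, assuming Boost.Interprocess is available (the queue name and message layout are illustrative):
#include <boost/interprocess/ipc/message_queue.hpp>

namespace bip = boost::interprocess;

int main()
{
    // Both processes open the same named queue; receive() blocks until a
    // message arrives, which gives the desired "wait until signalled" behaviour.
    bip::message_queue mq(bip::open_or_create, "demo_queue",
                          16 /* max messages */, sizeof(int) /* max msg size */);

    int value = 42;
    mq.send(&value, sizeof(value), 0); // priority 0

    int received = 0;
    bip::message_queue::size_type recvd_size = 0;
    unsigned int priority = 0;
    mq.receive(&received, sizeof(received), recvd_size, priority); // blocks
    return 0;
}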