I am writing a program which requires communicating with an external program two-way simultaneously, i.e., reading and writing to an external program at the same time.
I create two pipes, one for sending data to the external process, one for receiving data from the external process. After forking the child process, which becomes the external program, the parent forks again. The new child now writes data into the outgoing pipe to the external program, and the parent now reads data from the incoming pipe from the external program for further processing.
I've heard that using exit(3) may cause buffers to be flushed twice; however, I am also afraid that using _exit(2) may leave buffers unflushed. In my program, there is output both before and after forking. Which should I use in this case, exit(3) or _exit(2)?
Below is my main function. The #includes and the auxiliary function are left out for simplicity.
int main() {
srand(time(NULL));
ssize_t n;
cin >> n;
for (double p = 0.0; p <= 1.0; p += 0.1) {
string s = generate(n, p);
int out_fd[2];
int in_fd[2];
pipe(out_fd);
pipe(in_fd);
pid_t child = fork();
if (child) {
// parent
close(out_fd[0]);
close(in_fd[1]);
if (fork()) {
close(out_fd[1]);
ssize_t size = 0;
const ssize_t block_size = 1048576;
char buf[block_size];
ssize_t n_read;
while ((n_read = read(in_fd[0], buf, block_size)) > 0) { // stop on EOF (0) or error (-1)
size += n_read;
}
close(in_fd[0]);
cout << "p = " << p << "; compress ratio = " << double(size) / double(n) << '\n'; // data written before forking (the loop continues to fork)
} else {
write(out_fd[1], s.data(), s.size()); // data written after forking
exit(EXIT_SUCCESS); // exit(3) or _exit(2) ?
}
} else {
// child
close(in_fd[0]);
close(out_fd[1]);
dup2(out_fd[0], STDIN_FILENO);
dup2(in_fd[1], STDOUT_FILENO);
close(STDERR_FILENO);
execlp("xz", "xz", "-9", "--format=raw", reinterpret_cast<char *>(NULL));
}
}
}
You need to be careful with this sort of thing. exit() does different things from _exit() and _Exit() (POSIX specifies the latter two as functionally equivalent; _Exit(), with an upper-case E, is the ISO C spelling). As the answer suggested as a duplicate explains, _Exit() will not call atexit() handlers, flush any output buffers, delete temporary files, etc. [which may in fact be atexit() handling, but could also be done as a direct call, depending on how the C library code has been written].
Most of your output is done via write(), which is unbuffered from the application's perspective. But you are calling cout << ... as well, and you need to make sure that is flushed before exiting. Right now you are using '\n' as the end-of-line marker, which may or may not flush the output. If you change that to endl instead, it will flush the stream. You can then safely use _Exit() from an output perspective. If your code were to set up its own atexit() handler, open temporary files, or do a bunch of other such things, this would be problematic. If you want to do more complex things in the forked process, it should be done by another exec.
In your program as it stands, there isn't any pending output to flush, so it "works" anyway, but if you add a cout << ... << '\n'; (or without the newline) type statement at the beginning of the code, you would see it go wrong. If you add a cout.flush();, it would "fix" the problem (based on your current code).
You should also check the return value from your execlp() call and call _Exit() in that case (and handle it in the main process so you don't continue the loop in case of failure?)
In the child branch of a fork(), it is normally incorrect to use exit(), because that can lead to stdio buffers being flushed twice, and temporary files being unexpectedly removed. In C++ code the situation is worse, because destructors for static objects may be run incorrectly. (There are some unusual cases, like daemons, where the parent should call _exit() rather than the child; the basic rule, applicable in the overwhelming majority of cases, is that exit() should be called only once for each entry into main.)
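A minimal sketch of the rule above, assuming a helper process whose only job is to push data into a file descriptor. The name write_in_child is hypothetical and not part of the question's code. The parent flushes its buffered output before fork(), and the child leaves through _exit() so that nothing is flushed or destroyed a second time:

```cpp
#include <cassert>
#include <cstdio>
#include <iostream>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: forks a short-lived child that writes `data` to `fd`
// and terminates with _exit(). Returns the child's exit status, or -1 on error.
int write_in_child(int fd, const std::string& data) {
    assert(fd >= 0);

    // Flush everything that is still buffered, so the child does not inherit
    // a copy of pending output.
    std::cout.flush();
    std::fflush(nullptr);  // a null argument flushes all stdio output streams

    pid_t pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {
        // Child: use only unbuffered write(), then leave without running
        // atexit() handlers, static destructors or stdio flushing.
        const char* p = data.data();
        size_t left = data.size();
        while (left > 0) {
            ssize_t n = write(fd, p, left);
            if (n < 0) _exit(1);
            p += n;
            left -= static_cast<size_t>(n);
        }
        _exit(0);
    }

    // Parent: reap the child. This sketch waits immediately, which is only
    // safe while the data fits into the pipe buffer; for large data the
    // parent would have to read before waiting.
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The parent remains the only process that ever returns from main() or calls exit(), which matches the "once per entry into main" rule.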
I have been through quite a few pages and think I have an OK idea of what's happening, but I have a few questions just to be sure.
My program uses the -DTHREADSAFE=1 compile option and forks on receiving a database request (SELECT, DELETE, INSERT, UPDATE) from a user or my network; the child process then handles the various database tasks, relays messages should that be required, and so on.
At the moment my database is not set up for concurrency, which I won't lie is a major design flaw, but that's beside the point for now. Let's say I have a function that prints all the entries in my table LEDGER as follows:
void PersonalDataBase::printAllEntries()
{
//get all entries
const char query [] = "select * from LEDGER";
sqlite3_stmt *stmt = nullptr; // sqlite3_finalize() on a null pointer is a harmless no-op
int error = SQLITE_OK;
try
{
if ((error = sqlite3_prepare(publicDB, query, -1, &stmt, 0 )) == SQLITE_OK)
{
int ctotal = sqlite3_column_count(stmt);
int res = 0;
while ( 1 )
{
res = sqlite3_step(stmt);
if ( res == SQLITE_ROW )
{
Entry *temp = loadBlockRow(stmt);
string from, to;
from = getNameForHash(temp -> from);
to = getNameForHash(temp -> to);
temp -> setFromOrTo(from, 0);
temp -> setFromOrTo(to, 1);
temp -> printEntry();
printlnEnd();
delete temp;
}
else if ( res == SQLITE_DONE || res==SQLITE_ERROR)
{
if (res == SQLITE_ERROR) { throw res; }
sqlite3_finalize(stmt);
break;
}
}
}
//problems
else
{
throw error;
}
}
catch (int err)
{
sqlite3_finalize(stmt);
setupOutput();
cout << "Database Error: " << sqlite3_errmsg(publicDB) << ", Error Code: " << (int) error << endl;
cout << "Did Not Find Values Try Again After Fixing Problems Above." << endl;
printlnEnd();
}
println("Done!");
}
My setupOutput(), printlnEnd(), and println() all help with my use of 'non-blocking' keyboard I/O. They work as I want, so let's not worry about them here; think of them as just a call to cout.
OK, so at this point I figure there are 4 options:
1. Put a while loop around my try/catch, then in catch check if err == 5 (SQLITE_BUSY). If so, I need to set up a sqlite3_busy_handler and have it wait for whatever is blocking the current operation (once it returns SQLITE_OK and I have cleaned up all my old variables, I go through the while/try again). Only one of these handlers can be set up at a time. Say, for instance, Child1 is doing a large write, and Child2 and Child3 are trying to read and update concurrently on top of the first child's write. If SQLITE_BUSY is returned by this function, I print out an error and then restart my while loop (restarting the function), of course after I have finalized my old statement and cleaned up any local objects that may have been created. Is this a correct line of thinking?
2. Should I set up a recursive mutex, say screw it to SQLite's own locking mechanism, share the mutex across processes, and then only allow one operation on the database at a time? For using my app on a small scale this doesn't seem too bad an option. However, I'm reading a lot of warnings about using a recursive mutex and am wondering if this is the best option, as many posts say to handle mutual exclusion yourself. But then I cannot have concurrent reads, which is a bit of a pain.
3. Use option 1, but instead of using the SQLite busy handler, just call usleep with a random duration, clean up data, and restart the while loop?
4. Before/after any function involving my database, call sqlite3_exec() with "BEGIN IMMEDIATE"/"COMMIT" respectively, locking the database for the duration of the code between those two statements, so that nothing enclosed within can (or at least should) return SQLITE_BUSY. Then, if my "BEGIN IMMEDIATE" returns BUSY (it should be the only one, so long as everything is set up correctly), I use the sqlite3_busy_handler, which honestly seems annoying if only one process can use it at a time, or usleep() with a random duration (presumably, if this number is rather large, 1 million = 1 second, the chance of overlap between 1-20 processes is pretty slim), so each process will keep trying to re-lock the database at random intervals for its own purposes.
Is there a better way? Or which one of these is best?
SQLite's internal busy handler (installed with sqlite3_busy_timeout()) already sleeps a more-or-less random number of times; there is no need to write your own handler.
Using your own locking mechanism would be more efficient than random waiting, but only if you have reader/writer locks.
BEGIN or BEGIN IMMEDIATE ensure that no other statement in the same transaction can run into a lock, but only if IMMEDIATE is used for transactions that write.
To allow concurrent readers and writers, consider using WAL mode. (But this does not allow multiple writers either.)
I am trying to write a program using fsync() and write(), but fsync() needs time to sync the data and I don't have time to wait for it. So I created one more thread for fsync().
Here is my code:
#include <fcntl.h>
#include <unistd.h>
#include <iostream>
#include <thread>
void thread_func(int fd) {
while (1) {
if(fsync(fd) != 0)
std::cout << "ERROR fsync()\n";
usleep(100);
}
}
int main () {
int fd = open ("device", O_RDWR | O_NONBLOCK);
if (fd < 0) {
std::cout << "ERROR: open()\n";
return -1;
}
std::thread *thr = new std::thread (thread_func, fd);
if (thr == nullptr) {
std::cout << "Cannot create thread\n";
close (fd);
return -1;
}
while (1) {
if (write (fd, "x", 1) < 1)
std::cout << "ERROR write()\n";
}
close(fd);
}
The question is:
Do I need a lock when I call fsync() on the file descriptor from a thread other than the main one?
When I test my program without a mutex it has no problems, and the man page for fsync() says nothing about using it from different threads.
If the fact that fsync takes time and even sometimes blocks for a very short time is a problem, then you are most probably doing something wrong.
Normally, you do not want to call fsync at all, ever. It is a serious anti-optimization to do so, and one will only ever want to do it if it must be assured that data has been written out [1]. In this case, however, you absolutely want fsync to block; this is not only works-as-intended, but necessary.
Only when fsync has returned, you know that it has done its task. You know that the OS has done its best to assure that data has been written, and only then it is safe to proceed. If you offload this to a background thread, you can just as well not call fsync, because you don't know when it's safe to assume data has been written.
If initiating writes is your primary goal, you use sync_file_range under Linux (which runs asynchronously) followed by a call to fsync some time later. The reason for following up with fsync is both to ensure that writes are done, and the fact that sync_file_range does not update metadata, so unless you are strictly overwriting already allocated data within the file, your writes may not be visible in case of a crash even though data is on disk (I can't imagine how that might happen since allocating more sectors to a file necessarily means metadata must be modified, but the manpage explicitly warns that this can happen).
[1] The fsync function still does not (and cannot) guarantee that data is on permanent storage; it might still be somewhere in the cache hierarchy, such as a controller's or disk's write cache.
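The sync_file_range-then-fsync sequence described in this answer could look roughly like the following. This is a Linux-only sketch; the function name write_then_sync is hypothetical, and it assumes a regular file rather than the device node from the question:

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // sync_file_range() is a Linux extension (g++ normally defines this already)
#endif
#include <cassert>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: writes `len` bytes at `offset`, starts writeback
// without blocking on it, and finally blocks in fsync().
bool write_then_sync(int fd, const void* data, size_t len, off_t offset) {
    assert(fd >= 0);

    if (pwrite(fd, data, len, offset) != static_cast<ssize_t>(len))
        return false;

    // Ask the kernel to begin writing the dirty pages of this range.
    // With SYNC_FILE_RANGE_WRITE alone the call does not wait for completion.
    if (sync_file_range(fd, offset, static_cast<off_t>(len), SYNC_FILE_RANGE_WRITE) != 0)
        return false;

    // ... other useful work could happen here while writeback is in flight ...

    // Only after fsync() returns is it reasonable to assume that the data
    // and the file metadata have been handed to the device.
    return fsync(fd) == 0;
}
```

The blocking fsync() is still there; the sketch only moves it later, so that part of the writeback cost overlaps with other work.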
Unless you require the thread for something else I would suggest you use the asynchronous I/O aio library:
struct aiocb fsync_cb = {
.aio_fildes = fd
, .aio_sigevent = {
.sigev_notify = SIGEV_NONE
}
};
aio_fsync(O_SYNC, &fsync_cb);
There is also an equivalent variant for write.
struct aiocb write_cb = {
.aio_fildes = fd
, .aio_buf = buffer
, .aio_nbytes = nbytes
, .aio_offset = offset
, .aio_sigevent = {
.sigev_notify = SIGEV_NONE
}
};
aio_write(&write_cb);
If you choose not to have any notification of success then you will have to check/wait at some point for completion:
while (aio_error(&write_cb) == EINPROGRESS);
I have written a basic C++ program on Unix using the fork() and wait() system calls. I am only creating one child, and I use two pipes. After the fork, the child writes to the parent through the first pipe, and once the parent has received the data, it writes back to the child through the second pipe. After that, the parent calls wait(0). But my parent process still dies before the child process. Why?
The structure is something like this:
int main()
{
pid_t pid;
char buff[] = "Parent process kills";
char cuff[] = "before Child process";
int fd1[2];
int fd2[2];
pipe(fd1);
pipe(fd2);
if((pid = fork()) == 0)
{
close(fd1[0]);
close(fd2[1]);
write(fd1[1],buff,strlen(buff)+1);
read(fd2[0],cuff,sizeof(cuff));
}
else
{
close(fd1[1]);
close(fd2[0]);
read(fd1[0],buff,sizeof(buff));
write(fd2[1],cuff,strlen(cuff)+1);
wait((int *) 0);
}
close(fd1);
close(fd2);
}
Even though wait() is used, the parent process still dies before the child.
Thanks in advance.
Your calls to read result in undefined behavior. You try to read into string literals, not the buffers you have. In this case it probably results in a crash.
Your write calls also write a string literal and not the buffer you have.
Also, since you have character arrays initialized to strings, sizeof(buff) and strlen(buff) + 1 are equal.
Are you sure you're not dying due to a segfault? Each of these commands is trying to send more than you intend:
write(fd1[1],"buff",strlen(buff)+1);
and
write(fd2[1],"cuff",strlen(cuff)+1);
and each of these is trying to receive into read-only memory:
read(fd2[0],"cuff",sizeof(cuff));
and
read(fd1[0],"buff",sizeof(buff));
There is a subtle error in the line
if(pid == fork())
You compare the result of fork() with pid instead of assigning to it and comparing it to zero. What you wanted to write is this:
if((pid = fork()))
Note the extra set of parentheses that tells the compiler that you really want to do the assignment, and that you don't want to get a warning about it.
And with the corrected if, you have the parent executing the first case, not the second, so the correct code would be:
if((pid = fork())) {
/* parent: fork() returned the child's pid (non-zero) */
close(fd1[1]);
close(fd2[0]);
read(fd1[0],buff,sizeof(buff));
write(fd2[1],cuff,strlen(cuff)+1);
wait((int *) 0);
} else {
/* child: fork() returned 0 */
close(fd1[0]);
close(fd2[1]);
write(fd1[1],buff,strlen(buff)+1);
read(fd2[0],cuff,sizeof(cuff));
}
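Putting the points from these answers together (real buffers instead of string literals, assigning the result of fork(), and waiting in the parent), a complete version might look like the sketch below. The function name ping_pong and the message texts are made up for illustration; the child also leaves via _exit() so it cannot fall through into parent code:

```cpp
#include <cassert>
#include <cstring>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Child sends a message over fd1, the parent answers over fd2 and then waits.
// Returns the child's exit status (0 if the child got the expected reply),
// or -1 on error.
int ping_pong() {
    const char ping[] = "hello from child";
    const char pong[] = "hello from parent";
    int fd1[2];  // child  -> parent
    int fd2[2];  // parent -> child
    if (pipe(fd1) < 0) return -1;
    if (pipe(fd2) < 0) { close(fd1[0]); close(fd1[1]); return -1; }

    pid_t pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {
        // Child: close the ends it does not use.
        close(fd1[0]);
        close(fd2[1]);
        char reply[sizeof pong] = {0};
        bool ok = write(fd1[1], ping, sizeof ping) == static_cast<ssize_t>(sizeof ping)
               && read(fd2[0], reply, sizeof reply) == static_cast<ssize_t>(sizeof reply)
               && std::strcmp(reply, pong) == 0;
        _exit(ok ? 0 : 1);
    }

    assert(pid > 0);  // from here on we are the parent
    close(fd1[1]);
    close(fd2[0]);
    char msg[sizeof ping] = {0};
    bool ok = read(fd1[0], msg, sizeof msg) == static_cast<ssize_t>(sizeof msg)
           && std::strcmp(msg, ping) == 0
           && write(fd2[1], pong, sizeof pong) == static_cast<ssize_t>(sizeof pong);
    close(fd1[0]);
    close(fd2[1]);  // closing our write end also unblocks the child if we failed

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !ok) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Both messages are far smaller than the pipe buffer, so each single read() is expected to return the whole message; larger payloads would need a read loop.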
I am kind of a newbie at C++, and I am working on a simple program on Linux which is supposed to invoke another program in the same directory and capture its output without showing that output on the console. This is the code snippet that I am working on:
pid_t pid;
cout<<"General sentance:"<<endl<<sentence<<endl;
cout<<"==============================="<<endl;
//int i=system("./Satzoo");
if(pid=fork()<0)
cout<<"Process could not be created..."<<endl;
else
{
cout<<pid<<endl;
execv("./Satzoo",NULL);
}
cout<<"General sentance:"<<endl<<sentence<<endl;
cout<<"==============================="<<endl;
One of the problems I encounter is that I am able to print the first two lines on the console but I can't print the last two lines. I think the program stops working when I invoke the Satzoo program.
Another thing is that this code invokes the Satzoo program twice, and I don't know why. I can see the output on screen twice. On the other hand, if I use system() instead of execv(), then Satzoo runs only once.
I haven't figured out how to read the output of Satzoo in my program.
Any help is appreciated.
Thanks
You aren't distinguishing between the child and the parent process after the call to fork(). So both the child and the parent run execv() and thus their respective process images are replaced.
You want something more like:
pid_t pid;
printf("before fork\n");
if((pid = fork()) < 0)
{
printf("an error occurred while forking\n");
}
else if(pid == 0)
{
/* this is the child */
printf("the child's pid is: %d\n", getpid());
execv("./Satzoo",NULL);
printf("if this line is printed then execv failed\n");
}
else
{
/* this is the parent */
printf("parent continues execution\n");
}
The fork() function clones the current process and returns different values in each process. In the "parent" process, it returns the pid of the child. In the child process, it returns zero. So you would normally invoke it using a model like this:
if (fork() > 0) {
cout << "in parent" << endl;
} else {
cout << "in child" << endl;
exit(0);
}
I have omitted error handling in the above.
In your example, both of the above code paths (both parent and child) fall into the else clause of your call to fork(), causing both of them to execv("./Satzoo"). That is why your program runs twice, and why you never reach the statements beyond that.
Instead of using fork() and doing everything manually (properly managing process execution is a fair amount of work), you may be interested in using the popen() function instead:
FILE *in = popen("./Satzoo", "r");
// use "in" like a normal stdio FILE to read the output of Satzoo
pclose(in);
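To actually collect the output rather than just open the stream, the popen() approach can be wrapped like this. It is a small sketch; the helper name capture_output is an assumption, and the command string is passed to the shell exactly as popen() does:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: runs `command` through the shell and stores everything
// it wrote to stdout in `out`. Returns false if the command could not be
// started or its status could not be collected.
bool capture_output(const char* command, std::string& out) {
    assert(command != nullptr);
    out.clear();

    FILE* in = popen(command, "r");
    if (in == nullptr) return false;

    char buf[4096];
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0)
        out.append(buf, n);

    // pclose() waits for the command to finish and returns its wait status.
    return pclose(in) != -1;
}
```

For the program in the question the call would be capture_output("./Satzoo", text). Only stdout is captured; anything Satzoo writes to stderr still reaches the console.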
From the fork() manpage:
RETURN VALUE
Upon successful completion, fork() shall return 0 to the child process and shall return the process ID of the child process to the parent process. Both processes shall continue to execute from the fork() function. Otherwise, -1 shall be returned to the parent process, no child process shall be created, and errno shall be set to indicate the error.
You check to make sure it succeeds, but not whether the pid indicates we're in the child or the parent. Thus, both the child and the parent do the same thing twice, which means that your program gets executed twice and the ending text is never printed. You need to check the return value of fork() more than just once.
exec - The exec() family of functions replaces the current process image with a new process image.
system - Blocks on execution of the command. Execution of the calling program continues after the system command returns
There are three return values you want to test for with fork:
0: you are the child
-1: error
other: you are the parent
You ran the other program from both the child and the parent...
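The three cases can be folded into one small function. The sketch below is an illustration rather than a drop-in fix: run_and_wait is a hypothetical name, it uses execvp() (which searches PATH) instead of the execv() from the question, and the exit status 127 for a failed exec is a shell convention adopted here by assumption:

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs `program` without arguments in a child process
// and returns its exit status, or -1 if forking/waiting failed or the child
// was killed by a signal.
int run_and_wait(const char* program) {
    assert(program != nullptr);

    pid_t pid = fork();
    if (pid == -1) {
        return -1;                       // -1: error, no child was created
    }
    if (pid == 0) {                      // 0: we are the child
        char* const argv[] = { const_cast<char*>(program), nullptr };
        execvp(program, argv);
        _exit(127);                      // only reached if execvp() failed
    }

    // other: we are the parent, and pid is the child's process ID
    int status = 0;
    if (waitpid(pid, &status, 0) == -1) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the child either becomes the new program or calls _exit(), only the parent continues past the call, so code after it runs exactly once.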
How do I write a program that tells when my other program ends?
The only way to do a waitpid() or waitid() on a program that isn't spawned by yourself is to become its parent by ptrace'ing it.
Here is an example of how to use ptrace on a POSIX operating system to temporarily become another process's parent and then wait until that program exits. As a side effect, you can also get the exit code, or the signal that caused that program to exit:
#include <sys/ptrace.h>
#include <errno.h>
#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <sys/wait.h>
#include <stdlib.h>
int main(int argc, char** argv) {
int pid = atoi(argv[1]);
int status;
siginfo_t si;
if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1) {
/* ptrace reports failure by returning -1 and setting errno */
if (errno == ESRCH || errno == EPERM)
return 0;
fprintf(stderr, "Failed to attach child\n");
return 1;
}
if (pid != wait(&status)) {
fprintf(stderr, "wrong wait signal\n");
return 1;
}
if (!WIFSTOPPED(status) || (WSTOPSIG(status) != SIGSTOP)) {
/* The pid might not be running */
if (!kill(pid, 0)) {
fprintf(stderr, "SIGSTOP didn't stop child\n");
return 1;
} else {
return 0;
}
}
if (ptrace(PTRACE_CONT, pid, 0, 0)) {
fprintf(stderr, "Failed to restart child\n");
return 1;
}
while (1) {
if (waitid(P_PID, pid, &si, WSTOPPED | WEXITED)) {
// an error occurred.
if (errno == ECHILD)
return 0;
return 1;
}
errno = 0;
if (si.si_code == CLD_STOPPED || si.si_code == CLD_TRAPPED) { /* si_code values are not bit flags */
/* If the child gets stopped, we have to PTRACE_CONT it
* this will happen when the child has a child that exits.
**/
if (ptrace(PTRACE_CONT, pid, 1, si.si_status)) {
if (errno == ENOSYS) {
/* Wow, we're stuffed. Stop and return */
return 0;
}
}
continue;
}
if (si.si_code == CLD_EXITED || si.si_code == CLD_KILLED || si.si_code == CLD_DUMPED) {
return si.si_status;
}
// Fall through to exiting.
return 1;
}
}
On Windows, a technique I've used is to create a global named object (such as a mutex with CreateMutex), and then have the monitoring program open that same named mutex and wait for it (with WaitForSingleObject). As soon as the first program exits, the second program obtains the mutex and knows that the first program exited.
On Unix, a usual way to solve this is to have the first program write its pid (getpid()) to a file. A second program can monitor this pid (using kill(pid, 0)) to see whether the first program is gone yet. This method is subject to race conditions and there are undoubtedly better ways to solve it.
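The kill(pid, 0) polling described above might be sketched as follows. The names process_exists and wait_until_gone, the timeout and the polling interval are all assumptions made for illustration; reading the pid from the file is left out:

```cpp
#include <cassert>
#include <cerrno>
#include <chrono>
#include <csignal>
#include <sys/types.h>
#include <unistd.h>

// Returns true if a process with this pid currently exists.
// Signal 0 performs the existence and permission checks without sending anything.
bool process_exists(pid_t pid) {
    assert(pid > 0);  // 0 and negative values address process groups instead
    if (kill(pid, 0) == 0) return true;
    return errno == EPERM;  // it exists, but we may not signal it
}

// Polls until the process is gone or `timeout_ms` has elapsed.
// Returns true if the process disappeared in time.
bool wait_until_gone(pid_t pid, int timeout_ms, int interval_ms = 50) {
    using clock = std::chrono::steady_clock;
    const auto deadline = clock::now() + std::chrono::milliseconds(timeout_ms);
    while (process_exists(pid)) {
        if (clock::now() >= deadline) return false;
        usleep(static_cast<useconds_t>(interval_ms) * 1000);
    }
    return true;
}
```

The race mentioned in the answer remains: the pid may be reused by an unrelated process between two polls. A child that has exited but has not yet been reaped by its parent also still counts as existing.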
If you want to spawn another process, and then do nothing while it runs, then most higher-level languages already have built-ins for doing this. In Perl, for example, there's both system and backticks for running processes and waiting for them to finish, and modules such as IPC::System::Simple for making it easier to figure how the program terminated, and whether you're happy or sad about that having happened. Using a language feature that handles everything for you is way easier than trying to do it yourself.
If you're on a Unix-flavoured system, then the termination of a process that you've forked will generate a SIGCHLD signal. This means your program can do other things while your child process is running.
Catching the SIGCHLD signal varies depending upon your language. In Perl, you set a signal handler like so:
use POSIX qw(:sys_wait_h);
sub child_handler {
while ((my $child = waitpid(-1, WNOHANG)) > 0) {
# We've caught a process dying, its PID is now in $child.
# The exit value and other information is in $?
}
$SIG{CHLD} = \&child_handler; # SysV systems clear handlers when called,
# so we need to re-instate it.
}
# This establishes our handler.
$SIG{CHLD} = \&child_handler;
There are almost certainly modules on the CPAN that do a better job than the sample code above. You can use waitpid with a specific process ID (rather than -1 for all), and without WNOHANG if you want to have your program sleep until the other process has completed.
Be aware that while you're inside a signal handler, all sorts of weird things can happen. Another signal may come in (hence we use a while loop, to catch all dead processes), and depending upon your language, you may be part-way through another operation!
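For comparison, the same SIGCHLD pattern in C++ could be sketched as below. This is not a translation of any particular library; the names on_sigchld, install_sigchld_handler and spawn_and_await are hypothetical, and the handler restricts itself to waitpid() and writes to sig_atomic_t variables, which are safe inside a signal handler:

```cpp
#include <cassert>
#include <cerrno>
#include <csignal>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t g_reaped = 0;       // number of children collected
static volatile sig_atomic_t g_last_status = 0;  // wait status of the last one

// Signal handler: several children may have died before it runs, so loop
// with WNOHANG until nothing is left to collect.
static void on_sigchld(int) {
    const int saved_errno = errno;  // waitpid() may clobber errno
    int status = 0;
    while (waitpid(-1, &status, WNOHANG) > 0) {
        g_last_status = status;
        g_reaped = g_reaped + 1;
    }
    errno = saved_errno;
}

bool install_sigchld_handler() {
    struct sigaction sa{};
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;  // ignore stop/continue events
    return sigaction(SIGCHLD, &sa, nullptr) == 0;
}

// Demo: forks a child that exits with `exit_code` and returns the status the
// handler recorded. Assumes the handler is installed and no other children run.
int spawn_and_await(int exit_code) {
    assert(exit_code >= 0 && exit_code <= 255);
    g_reaped = 0;
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) _exit(exit_code);

    while (g_reaped == 0) {
        usleep(1000);  // stand-in for the parent's real work
    }
    const int status = g_last_status;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Unlike the SysV behaviour noted in the Perl comment, a handler installed with sigaction() stays installed, so there is no need to re-establish it inside the handler.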
If you're using Perl on Windows, then you can use the Win32::Process module to spawn a process, and call ->Wait on the resulting object to wait for it to die. I'm not familiar with all the guts of Win32::Process, but you should be able to wait for a length of 0 (or 1 for a single millisecond) to check to see if a process is dead yet.
In other languages and environments, your mileage may vary. Please make sure that when your other process dies you check to see how it dies. Having a sub-process die because a user killed it usually requires a different response than it exiting because it successfully finished its task.
All the best,
Paul
Are you on Windows? If so, the following should solve the problem. You need to pass the process ID:
bool WaitForProcessExit( DWORD _dwPID )
{
HANDLE hProc = NULL;
bool bReturn = false;
hProc = OpenProcess(SYNCHRONIZE, FALSE, _dwPID);
if(hProc != NULL)
{
if ( WAIT_OBJECT_0 == WaitForSingleObject(hProc, INFINITE) )
{
bReturn = true;
}
CloseHandle(hProc); // only close a handle that was actually opened
}
return bReturn;
}
Note: This is a blocking function. If you want non-blocking then you'll need to change the INFINITE to a smaller value and call it in a loop (probably keeping the hProc handle open to avoid reopening on a different process of the same PID).
Also, I've not had time to test this piece of source code, but I lifted it from an app of mine which does work.
On most operating systems it's generally the same kind of thing:
you record the process ID of the program in question and just monitor it by querying the active processes periodically.
On Windows, at least, you can trigger off events to do it.
Umm you can't, this is an impossible task given the nature of it.
Let's say you have a program foo that takes as input another program foo-sub.
Foo {
func Stops(foo_sub) { run foo_sub; return 1; }
}
The problem with this, albeit rather simplistic, design is that if foo-sub is a program that never ends, foo itself never ends. There is no way to tell from the outside whether foo-sub or foo is what is causing the program to stop, and what determines whether your program simply takes a century to run?
Essentially this is one of the questions that a computer can't answer. For a more complete overview, Wikipedia has an article on this.
This is called the "halting problem" and is not solvable.
See http://en.wikipedia.org/wiki/Halting_problem
If you want to analyze a program without executing it, then it's an unsolvable problem.