pipe fork and execvp analogs in windows - c++

This is a simple demonstration of the pipe/fork/exec trio as used on Unix.
#include <stdio.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
int main()
{
int outfd[2];
if(pipe(outfd)!=0)
{
exit(1);
}
pid_t pid = fork();
if(pid == 0)
{
//child
close(outfd[0]);
dup2(outfd[1], fileno(stdout));
const char *argv[]={"ls",NULL}; // string literals are const in C++
execvp(argv[0], (char *const *)argv);
_exit(127); // only reached if execvp() failed
}
if(pid < 0)
{
exit(1);
}
else
{
//parent
close(outfd[1]);
dup2(outfd[0], fileno(stdin));
FILE *fin = fdopen(outfd[0], "rt");
char buffer[2500];
while(fgets(buffer, 2500, fin)!=0)
{
//do something with buffer
}
}
return 0;
}
Now I want to write the same thing on Windows using the WinAPI. Which functions should I use? Any ideas?

fork() and execvp() have no direct equivalent in Windows. The combination of fork and exec maps to CreateProcess (or _spawnvp if you use MSVC). For the redirection you need CreatePipe and DuplicateHandle; this is covered decently in this MSDN article.
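As a rough illustration of that mapping, here is a minimal sketch of launching a child with CreateProcess and reading its standard output through a pipe made with CreatePipe. The function name runAndCapture is made up for this example. The sketch uses SetHandleInformation to keep the read end out of the child, which is an alternative to the DuplicateHandle approach mentioned above. The non-Windows branch is only a stub so that the snippet compiles elsewhere.

```cpp
#include <string>

#ifdef _WIN32
#include <windows.h>

// Hypothetical helper: runs `commandLine` and appends everything the child
// writes to its standard output to `output`. Returns false on failure.
bool runAndCapture(std::string commandLine, std::string& output)
{
    SECURITY_ATTRIBUTES sa = {};
    sa.nLength = sizeof(sa);
    sa.bInheritHandle = TRUE;              // pipe handles are created inheritable

    HANDLE readEnd = NULL, writeEnd = NULL;
    if (!CreatePipe(&readEnd, &writeEnd, &sa, 0))
        return false;
    // The parent's read end should not be inherited by the child.
    SetHandleInformation(readEnd, HANDLE_FLAG_INHERIT, 0);

    STARTUPINFOA si = {};
    si.cb = sizeof(si);
    si.dwFlags = STARTF_USESTDHANDLES;     // use the three handles below
    si.hStdInput = GetStdHandle(STD_INPUT_HANDLE);
    si.hStdOutput = writeEnd;              // child's stdout goes into the pipe
    si.hStdError = GetStdHandle(STD_ERROR_HANDLE);

    PROCESS_INFORMATION pi = {};
    // CreateProcessA takes a mutable command-line buffer, hence the copy.
    BOOL ok = CreateProcessA(NULL, &commandLine[0], NULL, NULL,
                             TRUE /* inherit handles */, 0, NULL, NULL, &si, &pi);
    CloseHandle(writeEnd);                 // otherwise ReadFile never sees end-of-pipe
    if (!ok) {
        CloseHandle(readEnd);
        return false;
    }

    char buffer[4096];
    DWORD got = 0;
    while (ReadFile(readEnd, buffer, sizeof(buffer), &got, NULL) && got > 0)
        output.append(buffer, got);

    WaitForSingleObject(pi.hProcess, INFINITE);
    CloseHandle(pi.hProcess);
    CloseHandle(pi.hThread);
    CloseHandle(readEnd);
    return true;
}
#else
// Stub so the sketch still compiles on non-Windows systems.
bool runAndCapture(std::string, std::string&) { return false; }
#endif
```

A call such as runAndCapture("cmd /c dir", text) would then play the role of the fork/dup2/execvp sequence in the question, with the ReadFile loop replacing the fgets loop.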

If you only need fork+execvp in the sense of launching another process that reads from a pipe (like in your example) then the answer given by Erik is 100% what you want (+1 on that).
Otherwise, if you need real fork behaviour, you are out of luck under Windows, as there is no such thing. With a lot of hacks it can be achieved, kind of. Cygwin has a working fork implementation that creates a suspended process and abuses setjmp and shared memory to get hold of its context and manually copy the stack and heap over in a somewhat complicated "dance" between parent and child. It's far from pretty and not overly efficient, but it kind of works, and it is probably as good as it can get under an operating system that doesn't natively support it.

Related

Running two programs concurrently

I have two C++ programs built in Ubuntu, and I want to run them concurrently. I do not want to combine them into one C++ project and run each on a different thread, as this is causing me all sorts of problems.
The solution I effectively want to emulate, is when I open two tabs in the terminal, and run each program in a separate tab. However, I also want one program (let's call this Program A) to be able to quit and rerun the other program (Program B). This cannot be achieved just in the terminal.
So what I want to do is to write some C++ code in Program A, which can run and quit Program B at any point. Both programs must run concurrently, so that Program A doesn't have to wait until Program B returns before continuing on with Program A.
Any ideas? Thanks!
In Linux you can fork the current process, which creates a new process.
Then you have to launch the new process with some exec system call.
Refer to:
http://man7.org/linux/man-pages/man2/execve.2.html
For example:
#include <unistd.h> /* for fork */
#include <sys/types.h> /* for pid_t */
#include <sys/wait.h> /* for wait */
int main(int argc,char** argv)
{
pid_t pid=fork();
if (pid==0)
{
execv("/bin/echo",argv);
_exit(127); /* only reached if execv failed */
}
}
You have multiple options here:
The traditional POSIX fork / exec (there are literally tons of examples on how to do this in SO, for example this one).
If you can use Boost then Boost process is an option.
If you can use Qt then QProcess is an option.
Boost and Qt also provide nice means of manipulating the standard input/output of the child process, if this is important. If not, the classical POSIX means should do fine.
Take a look at the Linux system calls fork() and exec(). The fork() call creates a copy of the current process; afterwards, parent and child both continue to execute concurrently.
In the parent process, fork()'s return value is the PID (process ID) of the child process.
In the child process, fork()'s return value is 0.
On error, fork()'s return value is -1.
You can use this to your advantage to control the behavior of the parent and child. As an example:
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>
int main(int argc,char** argv)
{
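const char* progB = "/bin/progB";
// argv[0] is the program name; further arguments go before the terminating NULL
char* const args[] = { const_cast<char*>(progB), const_cast<char*>("arg1"), const_cast<char*>("arg2"), /* ..., */ NULL };
char* const env[] = { NULL }; // can fill in environment here.
pid_t pid=fork();
if (pid==0)
{
// In child...
execve(progB, args, env); // execve() is the variant that takes an environment
_exit(127); // only reached if execve() failed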
}
else if (pid == -1)
{
// handle error...
}
else
{
// In parent; pid is the child process.
// can wait for child or kill child here.
}
}
To wait until your child exits (in the third case above), you can use waitpid(2), which returns the child's PID on success or -1 on error:
int status = 0;
pid_t result = waitpid(pid, &status, 0); // 0: block until the child terminates
To kill your child preemptively, you can send a kill signal as described in kill(2):
int result = kill(pid, SIGKILL); // or whatever signal you wish
This should allow you to manage your processes as described in the original question.
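Putting the pieces together, starting, stopping and restarting Program B from Program A could look roughly like the sketch below. The helper names startProgram and stopProgram are invented for this example, and it uses execvp() (which searches PATH) and SIGTERM rather than SIGKILL; both choices are assumptions, not part of the answers above.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Starts `file` with the given argument vector and returns the child's PID,
// or -1 if fork() failed. The parent returns immediately and keeps running.
pid_t startProgram(const char* file, char* const argv[])
{
    pid_t pid = fork();
    if (pid == 0) {
        execvp(file, argv);
        _exit(127);               // only reached if execvp() failed
    }
    return pid;
}

// Asks the child to terminate and reaps it.
// Returns the raw wait status, or -1 on error.
int stopProgram(pid_t pid)
{
    if (kill(pid, SIGTERM) == -1)
        return -1;
    int status = 0;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return status;
}
```

Restarting Program B is then a stopProgram() call followed by another startProgram() call. Between the two, Program A is free to do its own work.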

Using pipes to communicate with children in multithreaded programs

I am trying to use fork to execute child programs from a multithreaded parent using code similar to:
#include <string>
#include <thread>
#include <unistd.h>
#include <vector>
#include <sys/wait.h>
void printWithCat(const std::string& data) {
std::vector<char*> commandLine;
// exec won't change argument so safe cast
commandLine.push_back(const_cast<char*>("cat"));
commandLine.push_back(0);
int pipes[2];
pipe(pipes);
// Race condition here
pid_t pid = fork();
if (pid == 0) {
// Redirect pipes[0] to stdin
close(pipes[1]);
close(0);
dup(pipes[0]);
close(pipes[0]);
execvp("cat", &commandLine.front());
}
else {
close(pipes[0]);
write(pipes[1], (void*)(data.data()), data.size());
close(pipes[1]);
waitpid(pid, NULL, 0);
}
}
int main()
{
std::thread t1(printWithCat, "Hello, ");
std::thread t2(printWithCat, "World!");
t1.join();
t2.join();
}
This code contains a race condition between the call to pipe and the call to fork. If both threads create pipes and then fork, each child process holds open file descriptors to both pipes and closes only one. The result is that a pipe never gets closed and the child process never exits. I currently wrap the pipe and fork calls in a global lock, but this adds extra synchronisation. Is there a better way?
Don't think you're avoiding synchronization by avoiding a lock in your code -- the kernel is going to take locks for process creation anyway, probably on a far more global level than your lock.
So go ahead and use a lightweight mutex here.
Your problems are going to arise when different parts of the program make fork calls and don't agree on a single mutex (because some are buried in library code, etc.).
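A different technique from the mutex suggested above, available on Linux, is to create the pipe with the close-on-exec flag, so that descriptors leaked into other threads' children disappear as soon as those children call exec. The sketch below assumes pipe2() is available (it is Linux-specific) and reworks the question's printWithCat to return cat's exit code. Treat it as an illustration, not as the answer's method.

```cpp
#include <fcntl.h>      // O_CLOEXEC
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>     // pipe2 (Linux-specific), fork, dup2, execvp

// Pipes `data` into `cat`. Returns cat's exit code, or -1 on failure.
int printWithCat(const std::string& data)
{
    int pipes[2];
    // Both ends are close-on-exec: a child forked concurrently by another
    // thread loses its inherited copies when it calls exec.
    if (pipe2(pipes, O_CLOEXEC) != 0)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(pipes[0]);
        close(pipes[1]);
        return -1;
    }
    if (pid == 0) {
        // dup2() does not copy the close-on-exec flag, so stdin survives exec.
        dup2(pipes[0], STDIN_FILENO);
        char* const argv[] = { const_cast<char*>("cat"), nullptr };
        execvp(argv[0], argv);
        _exit(127);               // only reached if execvp() failed
    }

    close(pipes[0]);
    const char* p = data.data();
    size_t left = data.size();
    while (left > 0) {            // write() may accept fewer bytes than asked
        ssize_t n = write(pipes[1], p, left);
        if (n <= 0)
            break;
        p += n;
        left -= static_cast<size_t>(n);
    }
    close(pipes[1]);              // cat sees end-of-file and exits

    int status = 0;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this flag in place, the two threads in the question's main() no longer need a lock around pipe and fork for the purpose of avoiding the stuck child.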

What happens to RAII objects after a process forks?

Under Unix / Linux, what happens to my active RAII objects upon forking? Will there be double deletions?
What about copy construction and copy assignment? How can I make sure nothing bad happens?
fork(2) creates a full copy of the process, including all of its memory. Yes, destructors of automatic objects will run twice - in the parent process and in the child process, in separate virtual memory spaces. Nothing "bad" happens (unless of course, you deduct money from an account in a destructor), you just need to be aware of the fact.
Principally, it is no problem to use these functions in C++, but you have to be aware of what data is shared and how.
Consider that upon fork(), the new process gets a complete copy of the parent's memory (using copy-on-write). Memory is state, therefore you have two independent processes that must leave a clean state behind.
Now, as long as you stay within the bounds of the memory given to you, you should not have any problem at all:
#include <iostream>
#include <unistd.h>
#include <sys/wait.h>
class Foo {
public:
Foo () { std::cout << "Foo():" << this << std::endl; }
~Foo() { std::cout << "~Foo():" << this << std::endl; }
Foo (Foo const &) {
std::cout << "Foo::Foo():" << this << std::endl;
}
Foo& operator= (Foo const &) {
std::cout << "Foo::operator=():" << this<< std::endl;
return *this;
}
};
int main () {
Foo foo;
int pid = fork();
if (pid > 0) {
// We are parent.
int childExitStatus;
waitpid(pid, &childExitStatus, 0); // wait until child exits
} else if (pid == 0) {
// We are the new process.
} else {
// fork() failed.
}
}
The above program will print roughly:
Foo():0xbfb8b26f
~Foo():0xbfb8b26f
~Foo():0xbfb8b26f
No copy-construction or copy-assignment happens, the OS will make bitwise copies.
The addresses are the same because they are not physical addresses, but pointers into each process' virtual memory space.
It becomes more difficult when the two instances share information, e.g. an opened file that must be flushed and closed before exiting:
#include <cstdlib>
#include <ctime>
#include <fstream>
#include <iostream>
#include <sys/wait.h>
#include <unistd.h>
int main () {
std::ofstream of ("meh");
srand(clock());
int pid = fork();
if (pid > 0) {
// We are parent.
sleep(rand()%3);
of << "parent" << std::endl;
int childExitStatus;
waitpid(pid, &childExitStatus, 0); // wait until child exits
} else if (pid == 0) {
// We are the new process.
sleep(rand()%3);
of << "child" << std::endl;
} else {
// fork() failed.
}
}
This may print
parent
or
child
parent
or something else.
The problem is that the two instances do not do enough to coordinate their access to the same file, and you don't know the implementation details of std::ofstream.
(Possible) solutions can be found under the terms "Interprocess Communication" or "IPC"; the most obvious one would be waitpid():
#include <unistd.h>
#include <sys/wait.h>
int main () {
pid_t pid = fork();
if (pid > 0) {
int childExitStatus;
waitpid(pid, &childExitStatus, 0); // wait until child exits
} else if (pid == 0) {
...
} else {
// fork() failed.
}
}
The simplest solution would be to ensure that each process only uses its own virtual memory, and nothing else.
The other solution is a Linux specific one: Ensure that the sub-process does no clean up. The operating system will make a raw, non-RAII cleanup of all acquired memory and close all open files without flushing them.
This can be useful if you are using fork() with exec() to run another process:
#include <unistd.h>
#include <sys/wait.h>
int main () {
pid_t pid = fork();
if (pid > 0) {
// We are parent.
int childExitStatus;
waitpid(pid, &childExitStatus, 0);
} else if (pid == 0) {
// We are the new process.
execlp("echo", "echo", "hello, exec", (char*)0);
// only here if exec failed
} else {
// fork() failed.
}
}
Another way to exit without running the destructors of stack objects is the exit() function. I generally advise against using it in C++, but when forking, it has its place.
References:
http://www.yolinux.com/TUTORIALS/ForkExecProcesses.html
man pages
The currently accepted answer shows a synchronization problem which frankly has nothing to do with what problems RAII can really cause. That is, whether you use RAII or not, you will have synchronization problems between parent and child. Heck, if you run the same process in two different consoles, you have the exact same synchronization problem! (i.e. no fork() involved in your program, just your program running twice in parallel.)
To resolve synchronization problems, you may use a semaphore. See sem_open(3) and related functions. Note that a thread would generate the exact same synchronization problems; the difference is that you can use a mutex to synchronize multiple threads, and in most cases a mutex is much faster than a semaphore.
So where you do get a problem with RAII is when you use it to hold on to what I call an external resource, although not all external resources are affected the same way. I have had the problem in two circumstances and I will show both here.
Do not shutdown() a socket
Say you have your own socket class. In the destructor, you do a shutdown. After all, once you are done, you can as well send a message to the other end of the socket saying you are done with the connection:
class my_socket
{
public:
my_socket(const char * addr)
{
socket_ = socket(AF_INET, SOCK_STREAM, 0);
// ...bind, connect...
}
~my_socket()
{
if(socket_ != -1)
{
shutdown(socket_, SHUT_RDWR);
close(socket_);
}
}
private:
int socket_ = -1;
};
When you use this RAII class, the shutdown() function affects the socket in the parent AND the child. That means that neither the parent nor the child can read from or write to that socket anymore. Here I suppose that the child does not use the socket at all (and thus I have absolutely no synchronization problems), but when the child dies, the RAII class wakes up and the destructor gets called. At that point it shuts down the socket, which becomes unusable.
{
my_socket soc("127.0.0.1:1234");
// do something with soc in parent
...
pid_t const pid(fork());
if(pid > 0)
{
// parent: wait for the child to finish
int status(0);
waitpid(pid, &status, 0);
}
else if(pid == 0)
{
// the fork() "duplicated" all memory (with copy-on-write for most)
// and duplicated all descriptors (see dup(2)) which is why
// calling 'close(s)' is perfectly safe in the child process.
// child does some work
...
// here 'soc' calls my_socket::~my_socket()
return;
}
else
{
// fork did not work
...
}
// here my_socket::~my_socket() was called in child and
// the socket was shutdown -- therefore it cannot be used
// anymore!
// do more work in parent, but cannot use 'soc'
// (which is probably not the wanted behavior!)
...
}
Avoid using socket in parent and child
Another possibility, still with a socket (although you could have the same effect with a pipe or some other mechanism used to communicate externally), is to end up sending a "BYE" command twice. This is very close to being a synchronization problem, but in this case it happens in the RAII object when it gets destroyed.
Say for example that you create a socket and manage it in an object. Whenever the object gets destroyed, you want to tell the other side by sending a "BYE" command:
class communicator
{
public:
communicator()
{
socket_ = socket(AF_INET, SOCK_STREAM, 0);
// ...bind, connect...
}
~communicator()
{
write(socket_, "BYE\n", 4);
// shutdown(socket_); -- now we know not to do that!
close(socket_);
}
private:
int socket_ = -1;
};
In this case, the other end receives the "BYE" command and closes the connection. Now the parent cannot communicate using that socket since it got closed by the other end!
This is very similar to what phresnel talks about with his ofstream example. Only, this synchronization problem is not easy to fix. The order in which you write the "BYE\n" or another command to the socket won't change the fact that in the end the socket gets closed from the other side (i.e. ordinary synchronization can be achieved using an inter-process lock, whereas the "BYE" command is similar to the shutdown() call: it stops the communication in its tracks!)
A Solution
For the shutdown() it was easy enough, we just do not call the function. That being said, maybe you still wanted to have the shutdown() happen in the parent, just not in the child.
There are several ways to fix the problem. One of them is to remember the pid and use it to decide whether these destructive function calls should be made. Here is a possible fix:
class communicator
{
public:
communicator()
: pid_(getpid())
{
socket_ = socket(AF_INET, SOCK_STREAM, 0);
// ...bind, connect...
}
~communicator()
{
if(socket_ != -1)
{
if(pid_ == getpid())
{
write(socket_, "BYE\n", 4);
shutdown(socket_, SHUT_RDWR);
}
close(socket_);
}
}
private:
pid_t pid_;
int socket_;
};
Here we do the write() and shutdown() only if we are in the parent.
Notice that the child can (and is expected to) do the close() on the socket descriptor since the fork() called dup() on all the descriptors so the child has a different file descriptor to each file it holds.
Another Security Guard
Now there may be way more complicated cases where an RAII object is created way up in a parent and the child will call the destructor of that RAII object anyway. As mentioned by roemcke, calling _exit() is probably the safest thing to do (exit() works in most cases, but it can have unwanted side effects in the parent, at the same time, exit() may be required for the child to end cleanly--i.e. delete tmpfile() it created!). In other words, instead of using return, call _exit().
pid_t r(fork());
if(r == 0)
{
try
{
...child do work here...
}
catch(...)
{
// you probably want to log a message here...
}
_exit(0); // prevent stack unwinding and calls to atexit() functions
/* NOT REACHED */
}
This is much safer anyway, simply because you probably do not want the child to return into the "parent's code", where many other things could happen besides stack unwinding (e.g. continuing a for() loop that the child is not supposed to continue).
The _exit() function does not return, so destructors of objects defined on the stack do not get called. The try/catch is very important here: if the child throws an exception, _exit() is never reached. The exception would then propagate into code inherited from the parent or end in terminate(), possibly after the stack has been unwound and all your RAII destructors have run, which again is not what you would expect.
The difference between exit() and _exit() is that the former calls your atexit() functions. You relatively rarely need to do that in the child or the parent. At least, I never had any strange side effect. However, some libraries do make use of atexit() without considering the possibility that fork() gets called. One way to protect yourself in an atexit() function is to record the PID of the process which requires the atexit() function. If the PID doesn't match when the function gets called, then you just return and do nothing else.
pid_t cleanup_pid = -1;
void cleanup()
{
if(cleanup_pid != getpid())
{
return;
}
... do your clean up here ...
}
void some_function_requiring_cleanup()
{
if(cleanup_pid != getpid())
{
cleanup_pid = getpid();
atexit(cleanup);
}
... do work requiring cleanup ...
}
Obviously, the number of libraries that use atexit() and do it right is probably very close to 0. So... you should avoid such libraries.
Remember that if you call execve() or _exit(), the cleanup will not occur. So in case of a tmpfile() call in the child + _exit(), that temporary file will not get deleted automatically...
Unless you know what you are doing, the child process should always call _exit() after it has done its stuff:
pid_t pid = fork();
if (pid == 0)
{
do_some_stuff(); // Make sure this doesn't throw anything
_exit(0);
}
The underscore is important. Do not call exit() in the child process: it flushes stream buffers to disk (or wherever the file descriptor is pointing), and you will end up with things written twice.
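The double write described above can be made visible with a small experiment. The sketch below (the function name fileSizeAfterFork is made up for this example) buffers three bytes in a temporary file, forks, and then measures the file size after the child has left through either exit() or _exit(). It assumes a typical stdio implementation in which a fresh tmpfile() stream is fully buffered.

```cpp
#include <cstdio>
#include <cstdlib>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Buffers 3 bytes in a temporary file, forks, and lets the child leave with
// exit() or _exit(). Returns the file size after the parent has flushed its
// own copy of the buffer, or -1 on error.
long fileSizeAfterFork(bool childCallsExit)
{
    std::FILE* f = std::tmpfile();
    if (!f)
        return -1;
    std::fputs("hi\n", f);        // stays in the stdio buffer for now

    pid_t pid = fork();           // the child inherits a copy of that buffer
    if (pid == -1) {
        std::fclose(f);
        return -1;
    }
    if (pid == 0) {
        if (childCallsExit)
            std::exit(0);         // flushes the child's copy of the buffer
        _exit(0);                 // skips stdio flushing and atexit() handlers
    }

    waitpid(pid, nullptr, 0);
    std::fflush(f);               // the parent writes its own copy

    struct stat st;
    long size = (fstat(fileno(f), &st) == 0) ? static_cast<long>(st.st_size) : -1;
    std::fclose(f);
    return size;
}
```

Because parent and child share the file offset, the child's flush and the parent's flush end up one after the other: with exit() in the child the file should hold the three bytes twice, with _exit() only once.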

How to create a process in C++ under Linux?

I'd like to create a process by calling an executable, just as popen would allow. But I don't want to actually communicate through a pipe with it: I want to control it, like sending signals to it, finding out whether the process is running, waiting for it to finish after sending SIGINT and so on, just like multiprocessing in Python works. Like this:
pid_t A = create_process("foo");
pid_t B = create_process("bar");
join(B); // wait for B to return
send_signal(A, SIGINT);
What's the proper way to go?
Use case for example:
monitoring a bunch of processes (like restarting them when they crash)
UPDATE
I see in which direction the answers are going: fork(). Then I'd like to modify my use case: I'd like to create a class which takes a string in the constructor and is specified as follows: When an object is instantiated, a (sub)process is started (and controlled by the instance of the class), when the destructor is called, the process gets the terminate signal and the destructor returns as soon as the process returned.
Use case now: In a boost state chart, start a process when a state is entered, and send termination when the state has been left. I guess http://www.highscore.de/boost/process/process/tutorials.html#process.tutorials.start_child is the thing that comes closest to what I'm looking for, except that it seems outdated.
Isn't that possible in a non-invasive way? Maybe I have a fundamental misunderstanding and there is a better way to do this kind of work, if so I'd be glad to get some hints.
UPDATE 2
Thanks to the answers below, I think I got the idea a little bit. I thought, this example would print "This is main" three times, once for the "parent", and once for each fork() – but that's wrong. So: Thank you for the patient answers!
#include <iostream>
#include <string>
#include <unistd.h>
struct myclass
{
pid_t the_pid;
myclass(std::string the_call)
{
the_pid = fork();
if(the_pid == 0)
{
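execl(the_call.c_str(), the_call.c_str(), (char*)NULL); // argv[0] is the program name
_exit(127); // only reached if execl() failed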
}
}
};
int main( int argc, char** argv )
{
std::cout << "This is main" << std::endl;
myclass("trivial_process");
myclass("trivial_process");
}
The code below is not realistic at all, but it gives you the idea.
pid_t pid = fork();
if (pid == 0) {
// this is child process
execl("foo", "foo", (char*)NULL);
_exit(127); // only reached if execl() failed
}
// continue your code in the main process.
Using the previously posted code, try this:
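#include <signal.h>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
class MyProc
{
public:
MyProc( const std::string& cmd)
{
m_pid = fork();
if (m_pid == 0) {
execl(cmd.c_str(), cmd.c_str(), (char*)NULL);
_exit(127); // only reached if execl() failed
}
}
~MyProc()
{
// Just in case we have 0 (or -1): we do not want to kill ourselves
if( m_pid > 0 )
{
kill(m_pid, SIGKILL);
waitpid(m_pid, NULL, 0); // reap the child so it does not stay a zombie
}
}
private:
pid_t m_pid;
};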
The downside I see in this example is that you cannot be sure the process has finished (and it probably has not) right after the signal is sent, since kill() returns immediately and the other process may receive the signal with a delay.
To make sure, you may use ps ... with a grep for the pid; this should work then.
Edit: I have added the wait, which came up in a comment up there!
Have a look at fork() (man 2 fork)
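Instead of parsing ps output, the parent can also poll its own child with waitpid() and the WNOHANG flag, which covers the "find out if the process is running" part of the question. The helper name isRunning below is hypothetical, and the sketch only works for direct children of the calling process.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Returns true while the child `pid` has not terminated yet. Once it has
// terminated, the child is reaped and its wait status is stored in *status.
// Also returns false if `pid` is not a child of the caller (waitpid error).
bool isRunning(pid_t pid, int* status)
{
    // With WNOHANG, waitpid() returns 0 immediately if the child exists
    // but has not changed state, instead of blocking.
    pid_t result = waitpid(pid, status, WNOHANG);
    return result == 0;
}
```

A destructor that wants a gentle shutdown could send SIGTERM, poll isRunning() for a while, and fall back to SIGKILL only if the child is still alive after a timeout.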

Need to write a daemon in linux, not sure what to use C++ or C

I have a little problem picking the right language to write my daemon.
I am torn between C and C++. I want to use C++ because it is more extensive than C,
but I want to use C because it is the starting point of everything in Linux.
I lean towards C++ as I have many resources about it. So, does it make any difference if I pick C++ instead of C?
And what would I gain if I learned more C?
I feel that if I go into C++ I will cover C within C++...
Regards
Use whichever language you know best right now.
Use neither C nor C++, they are not really required for general programming.
Use a high-level language that makes things easy, such as Python.
Of course it depends what kind of "daemon" you're writing, but in all likelihood, you want to be focusing your development effort on the task in hand, not fixing things like memory leaks, string handling or other distractions. Using neither C nor (to a lesser extent) C++ will allow you to do this.
I can answer this one entirely in code. Writing a daemon roughly consists of doing this:
/*
* Daemon Initialisation:
* 1. Fork()
* 2. setsid()
* 3. Fork() do we need to do this twice?
* 4. Chdir /
* 5. Umask(0)
* 6. Close STDIN/OUT/ERR
* 7. Optionally re-open stuff.
*
* Refs:
* 1. http://www.faqs.org/faqs/unix-faq/programmer/faq/
* 2. http://www.netzmafia.de/skripten/unix/linux-daemon-howto.html
* 3. http://www.enderunix.org/docs/eng/daemon.php
*/
/* Variables */
/* Our process ID and Session ID */
pid_t pid, sid;
int fd = 0;
/* Fork off the parent process */
pid = fork();
if (pid < 0)
{
exit(EXIT_FAILURE);
}
/* If we got a good PID, then
* we can exit the parent process.
*/
if (pid > 0)
{
exit(EXIT_SUCCESS);
}
/* Create a new SID for the child process */
sid = setsid();
if (sid < 0)
{
/* Log the failure */
exit(EXIT_FAILURE);
}
/* Fork off the parent process, again */
pid = fork();
if (pid < 0)
{
exit(EXIT_FAILURE);
}
/* If we got a good PID, then
we can exit the parent process. */
if (pid > 0)
{
exit(EXIT_SUCCESS);
}
/* Change the current working directory */
if ((chdir("/")) < 0)
{
/* Log the failure */
exit(EXIT_FAILURE);
}
/* Change the file mode mask */
umask(0);
/* Close all file descriptors */
for (fd = getdtablesize(); fd >= 0; --fd)
{
close(fd);
}
/* Open standard file descriptors from
* elsewhere.
* e.g. /dev/null -> stdin.
* /dev/console -> stderr?
* logfile as stdout?
*/
fd = open("/dev/null", O_RDWR); /* open stdin */
dup(fd); /* stdout */
dup(fd); /* stderr */
These are all C function calls, but of course, you can call them from C++.
The reason I have this code sitting around is that it is actually a function I pass function pointers to, which executes my "daemon body". Why do I do it this way? To work out where my errors are, I run the code directly as a process (if necessary with root privileges) and only then "daemonise" it. It's quite hard to debug a daemon process otherwise...
Edit: of course, using function pointers is a C way of thinking, but there's no reason you can't implement some form of class-based mechanism.
So, honestly, it really doesn't matter. Pick whichever you prefer.
As far as I'm concerned, C++ is C with optional extras. So go with that, and use whatever bits and pieces you feel comfortable with.
There is no reason to choose C instead of C++ if you are more comfortable with C++, and vice versa. They are both equally capable languages for this task.
Unless you are looking to become more comfortable with C, just use what you know.
Unless you are planning on doing kernel development or embedded programming, learning C++ is strictly better than learning C. The only thing you need to be careful of is that C++ "mangles" its function names, so that your function void foo() in C++ will not be accessible directly by a C program. The trick is to declare it with C linkage, as in extern "C" void foo().
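As a small sketch of that trick, a declaration that has to be visible to both C and C++ code is commonly wrapped as shown below. The function name foo_add is invented for the example.

```cpp
// Declaration as it might appear in a header shared by C and C++ code.
// The __cplusplus guard hides the extern "C" syntax from a C compiler.
#ifdef __cplusplus
extern "C" {
#endif

int foo_add(int a, int b);

#ifdef __cplusplus
}
#endif

// Definition in a C++ source file. Because of the declaration above it has
// C linkage, so its symbol name is not mangled and C code can link to it.
int foo_add(int a, int b)
{
    return a + b;
}
```

The body of such a function can still use any C++ feature; only the signature is restricted to types that C understands.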
That said, C++ is a much bigger language than C, and it will definitely take more time to learn.
You should definitely use C++.
You should definitely use C, period.