I'm new to working with forking and I'm having trouble understanding how to achieve what I want. I'll try to explain as best I can.
I have Process A which is a functional Berkeley socket server running on Linux.
I need Process A to load a program from disk into a separate, non-blocking process (Process B) running in the background. Then Process A needs to pass control of its sockets to Process B. Lastly, Process A needs to end, leaving Process B running.
I'm unclear on what's needed to pass the sockets to a new process if the old one ends, and on the best way to create a non-blocking new process that allows the original process to end.
There's nothing special you need to do. Just make sure the close on exec flag is cleared for any file descriptors you want process B to inherit and set for any file descriptors you don't want process B to inherit. Then call exec to replace process A with process B. Process B will start with all inheritable file descriptors intact.
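A minimal sketch of that hand-off, assuming Process B is told the descriptor number through an environment variable (the names ./process_b and PASS_FD are only illustrative, not anything Process B already expects):

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cstdio>
#include <cstdlib>
#include <string>

// Called in Process A once the sockets are set up: clear FD_CLOEXEC so the
// listening socket survives exec(), then replace Process A with Process B.
void hand_off_to_process_b(int listen_fd) {
    int flags = fcntl(listen_fd, F_GETFD);
    fcntl(listen_fd, F_SETFD, flags & ~FD_CLOEXEC);   // make it inheritable

    // One common convention: tell Process B which descriptor number to use.
    setenv("PASS_FD", std::to_string(listen_fd).c_str(), 1);

    execl("./process_b", "process_b", (char*)nullptr);
    std::perror("execl");                             // only reached on failure
    _exit(1);
}
```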
If you need to pass an open file (such as a socket) without using inheritance-through-fork, you use ioctl with I_SENDFD. Here is a very detailed description. (There is a corresponding mechanism for receiving it.) You can do this with a named pipe which connects the processes, or via a variation, with a Unix domain socket.
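For the Unix-domain-socket variation, a minimal sketch of the sending side using sendmsg() with an SCM_RIGHTS control message (the receiving process does the mirror image with recvmsg(); unix_sock is assumed to be an already-connected AF_UNIX socket between the two processes):

```cpp
#include <sys/socket.h>
#include <sys/uio.h>
#include <cstring>

// Send an open descriptor to the process on the other end of a connected
// AF_UNIX socket. The kernel installs a duplicate of the descriptor in the
// receiving process.
int send_fd(int unix_sock, int fd_to_pass) {
    char dummy = 'x';                           // must send at least one byte
    struct iovec iov = { &dummy, 1 };

    union {
        struct cmsghdr align;                   // forces correct alignment
        char buf[CMSG_SPACE(sizeof(int))];
    } ctrl;
    std::memset(&ctrl, 0, sizeof ctrl);

    struct msghdr msg = {};
    msg.msg_iov        = &iov;
    msg.msg_iovlen     = 1;
    msg.msg_control    = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    struct cmsghdr* cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type  = SCM_RIGHTS;              // "this message carries fds"
    cmsg->cmsg_len   = CMSG_LEN(sizeof(int));
    std::memcpy(CMSG_DATA(cmsg), &fd_to_pass, sizeof(int));

    return (int)sendmsg(unix_sock, &msg, 0);    // < 0 on error
}
```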
Related
I have my C++ program that forks into two processes, 1 (the original) and 2 (the forked process).
In the forked process (2), it execs program A that does a lot of computation.
The original process (1) communicates with that program A through standard input and output redirected to pipes.
I am trying to add a websocket connection to my code in the original process (1). I would like my original process to effectively select or epoll on whether there is data to be read from the pipe to program A or there is data to be read from the websocket connection.
Given that a Beast websocket is not a file descriptor, how can I achieve the effect of select or epoll?
Which version of Boost are you using? If it is relatively recent it should include support for boost::process::async_pipe which allows you to use I/O Objects besides sockets asynchronously with Asio. Examples are provided in the tutorials for the boost::process library. Since Beast uses the Asio library to perform I/O under the hood, you can combine the two quite easily.
Given that a beast websocket is not a file descriptor...
The Beast WebSocket is not a file descriptor, but it does use TCP sockets to perform I/O (see the linked examples above), and Asio is very good at using select/epoll with TCP sockets. Just make sure you are doing the async_read, async_write and io_service::run operations as usual.
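As a rough sketch of combining the two on one io_context (assuming a reasonably recent Boost.Process with boost::process::async_pipe; the child program name program_a is a placeholder), something like this should work alongside any Beast async_read/async_write calls pending on the same io_context:

```cpp
#include <boost/asio.hpp>
#include <boost/process.hpp>
#include <iostream>

namespace asio = boost::asio;
namespace bp   = boost::process;

int main() {
    asio::io_context ioc;

    // async_pipe is an Asio I/O object, so it can share the event loop
    // (and therefore the underlying select/epoll) with Beast/Asio sockets.
    bp::async_pipe child_out(ioc);

    // Launch the child with its stdout redirected into the async pipe.
    bp::child proc("program_a", bp::std_out > child_out, ioc);

    asio::streambuf buf;
    // Read one line from the child asynchronously; a websocket's async_read
    // could be outstanding on the same io_context at the same time.
    asio::async_read_until(child_out, buf, '\n',
        [&](const boost::system::error_code& ec, std::size_t /*bytes*/) {
            if (!ec)
                std::cout << "child said: " << &buf;
        });

    ioc.run();      // the select/epoll wait happens inside run()
    proc.wait();
    return 0;
}
```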
You can make a small change in your code: replace the pipe with two message queues, for example out_q and response_q. Your child process A will continuously read out_q, and whenever your main process drops a message into out_q it does not wait for any response from the child; the child simply consumes the message. Communication through a message queue is asynchronous. If you still need some kind of reply, such as a success or failure message from the child, you can get it through response_q, which is read by your parent process. To match a response from the child to the specific message originally sent by the parent, you can use a correlation ID (read up a little on correlation IDs).
Now, in the parent process, implement two threads: one continuously reads from the websocket and the other reads from standard input. Add one method (probably static) that drops messages into out_q, and use a mutex so that only one thread at a time can call it. Your main thread or process reads response_q. In this way you can make everything parallel and asynchronous, as sketched below. If you don't want to use threads, you still have the option of fork() and creating two child processes for the same purpose. Hope this helps.
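A minimal POSIX message-queue sketch of the parent side of that idea (the queue names /out_q and /response_q and the "id|payload" message format are only illustrative; the child is assumed to read /out_q and echo the correlation ID back on /response_q; link with -lrt on Linux):

```cpp
#include <mqueue.h>
#include <fcntl.h>
#include <cstring>
#include <cstdio>

int main() {
    struct mq_attr attr = {};
    attr.mq_maxmsg  = 10;
    attr.mq_msgsize = 256;

    // Parent side only: drop a request into out_q, pick up the reply later.
    mqd_t out_q  = mq_open("/out_q",      O_CREAT | O_WRONLY, 0600, &attr);
    mqd_t resp_q = mq_open("/response_q", O_CREAT | O_RDONLY, 0600, &attr);

    // Prefix the payload with a correlation id so the reply can be matched.
    const char* request = "42|do_work";
    mq_send(out_q, request, std::strlen(request) + 1, 0);
    // ... the parent is free to do other work here; mq_send did not block
    //     waiting for the child to answer ...

    char reply[256];
    if (mq_receive(resp_q, reply, sizeof reply, nullptr) >= 0)
        std::printf("reply: %s\n", reply);   // child is assumed to send "42|done"

    mq_close(out_q);
    mq_close(resp_q);
    return 0;
}
```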
The scenario:
There are several processes running on a machine. Names and handles unknown, but they all have a piece of code running in them that's under our control.
A command line process is run. It signals to the other processes that they need to end (SetEvent), which our code picks up and handles within the other processes.
The goal:
The command line process needs to wait until the other processes have ended. How can this be achieved?
All that's coming to mind is to set up some shared memory or something and have each process write its handle into it so the command line process can wait on them, but this seems like so much effort for what it is. There must be some kernel level reference count that can be waited on?
Edit 1:
I'm thinking maybe assigning the processes to a job object, then the command line process can wait on that? Not ideal though...
Edit 2:
Can't use job objects as it would interfere with other things using jobs. So now I'm thinking that the processes would obtain a handle to some/any sync object (semaphore, event, etc.), and the command line process would poll for its existence. It would have to poll because if it waited on the object it would keep it alive. The sync object gets cleaned up by Windows when the processes die, so the next poll would indicate that there are no processes. Not the nicest, cleanest method, but simple enough for the job it needs to do. Any advance on that?
You can do it in any of the following ways.
Shared memory (memory-mapped object): CreateFileMapping, then MapViewOfFile, process the request, UnmapViewOfFile, and close the mapping handle.
Named pipe: create a named pipe for each application and keep a thread running to read from it. The other side can then send an "end" message by connecting to that named pipe. (You could implement something like a small database the same way.)
WinSock: (don't use this if you have a large number of processes, since you need to send the end request to each process; either the process must connect to your application or it must be listening on a port).
Create a file/DB: share the file between the processes (you can have multiple files if needed); take a lock before reading or writing.
I would consider a solution using two objects:
a shared semaphore object, created by the main (controller?) app with an initial count of 0, just before it requests the other processes to terminate (by calling SetEvent()) - I assume that the other processes don't create this object themselves, nor do they fail if it has not been created yet.
a mutex object, created by the other (child?) processes, used not for waiting on it but for allowing the main process to check for its existence (if all child processes terminate, it should be destroyed). Mutex objects have the distinction that they can be "created" by more than one process (according to the documentation).
Synchronization would be as follows:
The child processes on initialization should create the Mutex object (set initial ownership to FALSE).
The child processes upon receiving the termination request should increase the semaphore count by one (ReleaseSemaphore()) and then exit normally.
The main process would enter a loop calling WaitForSingleObject() on the semaphore with a reasonably small timeout (eg some 250 msec), and then check not whether the object was granted or a timeout has occurred, but whether the mutex still exists - if not, this means that all child processes terminated.
This setup avoids an interprocess communication scheme (eg having the child processes communicate their handles back - the number of which is unknown anyway), while it's not strictly speaking "polling" either. Well, there is some timeout involved (and some may argue that this alone is polling), but the check is also performed after each process reports that it's terminating (you can employ some tracing to see how many times the timeout actually elapsed). A sketch of the controller's loop is below.
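A sketch of the controller's loop under these assumptions (the object names are illustrative; the children are assumed to create the named mutex at startup and to call ReleaseSemaphore() just before exiting):

```cpp
#include <windows.h>
#include <stdio.h>

int main() {
    // Created by the controller just before it signals the termination event;
    // the children open it and call ReleaseSemaphore() right before exiting.
    HANDLE sem = CreateSemaphoreW(NULL, 0, 0x7FFFFFFF, L"Global\\ShutdownSem");

    // ... SetEvent() on the existing termination event goes here ...

    for (;;) {
        // Wake up when a child reports it is terminating, or after 250 ms.
        WaitForSingleObject(sem, 250);

        // The named mutex exists only while at least one child still holds a
        // handle to it, i.e. while at least one child process is alive.
        HANDLE mtx = OpenMutexW(SYNCHRONIZE, FALSE, L"Global\\ChildAliveMutex");
        if (mtx == NULL && GetLastError() == ERROR_FILE_NOT_FOUND) {
            printf("all child processes have exited\n");
            break;
        }
        if (mtx != NULL)
            CloseHandle(mtx);   // don't keep the mutex alive from this process
    }

    CloseHandle(sem);
    return 0;
}
```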
The simple approach: you already have an event object that every subordinate process has open, so you can use that. After setting the event in the master process, close the handle, and then poll until you discover that the event object no longer exists.
The better approach: named pipes as a synchronization object, as already suggested. That sounds complicated, but it isn't.
The idea is that each of the subordinate processes creates an instance of the named pipe (i.e., all with the same name) when starting up. There's no need for a listening thread, or indeed any I/O logic at all; you just need to create the instance using CreateNamedPipe, then throw away the handle without closing it. When the process exits, the handle is closed automatically, and that's all we need.
To see whether there are any subordinate processes, the master process would attempt to connect to that named pipe using CreateFile. If it gets a file not found error, there are no subordinate processes, so we're done.
If the connection succeeded, there's at least one subordinate process that we need to wait for. (When you attempt to connect to a named pipe with more than one available instance, Windows chooses which instance to connect you to. It doesn't matter to us which one it is.)
The master process would then call ReadFile (just a simple synchronous read, one byte will do) and wait for it to fail. Once you've confirmed that the error code is ERROR_BROKEN_PIPE (it will be, unless something has gone seriously wrong) you know that the subordinate process in question has exited. You can then loop around and attempt another connection, until no more subordinate processes remain.
(I'm assuming here that the user will have to intervene if one or more subordinates have hung. It isn't impossible to keep track of the process IDs and do something programmatically if that is desirable, but it's not entirely trivial and should probably be a separate question.)
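A sketch of the master's loop under those assumptions (the pipe name is illustrative; each subordinate is assumed to have created an instance with CreateNamedPipe and the same name at startup, then just kept the handle open):

```cpp
#include <windows.h>
#include <stdio.h>

int main() {
    // Each subordinate is assumed to have called CreateNamedPipe with this
    // name (e.g. PIPE_ACCESS_OUTBOUND, PIPE_UNLIMITED_INSTANCES) at startup
    // and to have simply kept the handle open, with no I/O logic at all.
    const wchar_t* pipeName = L"\\\\.\\pipe\\my_shutdown_pipe";

    for (;;) {
        HANDLE h = CreateFileW(pipeName, GENERIC_READ, 0, NULL,
                               OPEN_EXISTING, 0, NULL);
        if (h == INVALID_HANDLE_VALUE) {
            if (GetLastError() == ERROR_FILE_NOT_FOUND) {
                printf("no subordinate processes left\n");
                break;              // no pipe instances exist any more
            }
            if (GetLastError() == ERROR_PIPE_BUSY) {
                Sleep(100);         // instances exist but none is available yet
                continue;
            }
            break;                  // unexpected error
        }

        // The subordinates never write, so this read blocks until the owning
        // process exits and the pipe breaks.
        char byte;
        DWORD got = 0;
        while (ReadFile(h, &byte, 1, &got, NULL)) {
            // ignore any stray data
        }
        if (GetLastError() == ERROR_BROKEN_PIPE)
            printf("one subordinate has exited\n");

        CloseHandle(h);             // loop around and try the next instance
    }
    return 0;
}
```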
A common server socket pattern on Linux/UNIX systems is to listen on a socket, accept a connection, and then fork() to process the connection.
So, it seems that after you accept() and fork(), once you're inside the child process, you will have inherited the listening file descriptor of the parent process. I've read that at this point, you need to close the listening socket file descriptor from within the child process.
My question is, why? Is this simply to reduce the reference count of the listening socket? Or is it so that the child process itself will not be used by the OS as a candidate for routing incoming connections? If it's the latter, I'm a bit confused for two reasons:
(A) What tells the OS that a certain process is a candidate for accepting connections on a certain file descriptor? Is it the fact that the process has called accept()? Or is it the fact that the process has called listen()?
(B) If it's the fact that the process has called listen(), don't we have a race condition here? What if this happens:
Parent process listens on socket S.
Incoming connection goes to Parent Process.
Parent Process forks a child, child has a copy of socket S
BEFORE the child is able to call close(S), a second incoming connection goes to Child Process.
Child Process never calls accept() (because it's not supposed to), so the incoming connection gets dropped
What prevents the above condition from happening? And more generally, why should a child process close the listening socket?
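For concreteness, here is the pattern I mean, with the close() calls I've read about (a minimal sketch, error handling omitted):

```cpp
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

int main() {
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);

    struct sockaddr_in addr = {};
    addr.sin_family      = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port        = htons(12345);     // arbitrary port
    bind(listen_fd, (struct sockaddr*)&addr, sizeof addr);
    listen(listen_fd, SOMAXCONN);

    for (;;) {
        int conn_fd = accept(listen_fd, nullptr, nullptr);

        if (fork() == 0) {                   // child: handle this connection
            close(listen_fd);                // <- the close my question is about
            // ... read/write on conn_fd ...
            close(conn_fd);
            _exit(0);
        }

        close(conn_fd);                      // parent: the child owns it now
        // (a real server would also reap children with waitpid()/SIGCHLD)
    }
}
```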
Linux queues up pending connections. A call to accept, from either the parent or child process, will poll that queue.
Not closing the socket in the child process is a resource leak, but not much else. The parent will still grab all the incoming connections, because it's the only one that calls accept, but if the parent exits, the socket will still exist because it's open on the child, even if the child never uses it.
The incoming connection will be 'delivered' to whichever process calls accept(). After you fork, and before the child closes the file descriptor, either process could accept the connection.
So as long as you never accept any connections in the child process and the parent keeps accepting connections, everything will work fine.
But if you plan never to accept connections in your child process, why would you want to keep resources for the socket in that process?
The interesting question is what happens if both processes call accept() on the socket. I could not find definitive information on this at the moment. What I could find is that you can be sure every connection is delivered to only one of these processes.
In the socket() manual, a paragraph says:
SOCK_CLOEXEC
Set the close-on-exec (FD_CLOEXEC) flag on the new file descriptor. See the description of the O_CLOEXEC flag in open(2) for reasons why this may be useful.
Unfortunately, that doesn't do anything when you call fork(), it's only for when you call execv() and other similar functions. Anyway, reading the info in the open() function manual we see:
O_CLOEXEC (since Linux 2.6.23)
Enable the close-on-exec flag for the new file descriptor. Specifying this flag permits a program to avoid additional fcntl(2) F_SETFD operations to set the FD_CLOEXEC flag.
Note that the use of this flag is essential in some multithreaded programs, because using a separate fcntl(2) F_SETFD operation to set the FD_CLOEXEC flag does not suffice to avoid race conditions where one thread opens a file descriptor and attempts to set its close-on-exec flag using fcntl(2) at the same time as another thread does a fork(2) plus execve(2). Depending on the order of execution, the race may lead to the file descriptor returned by open() being unintentionally leaked to the program executed by the child process created by fork(2). (This kind of race is in principle possible for any system call that creates a file descriptor whose close-on-exec flag should be set, and various other Linux system calls provide an equivalent of the O_CLOEXEC flag to deal with this problem.)
Okay so what does all of that mean?
The idea is very simple. If you leave a file descriptor open when you call execve(), the executed program gets access to that file descriptor and thus may gain access to data that it should not have access to.
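For illustration, here are the two ways of getting the flag set (a minimal sketch; the atomic form needs Linux 2.6.27 or later for SOCK_CLOEXEC on socket()):

```cpp
#include <sys/socket.h>
#include <fcntl.h>

// Two ways to end up with close-on-exec set on a socket. SOCK_CLOEXEC sets
// the flag atomically at creation; the fcntl() form is the non-atomic
// alternative the man page warns about in multithreaded programs.
int make_cloexec_socket(void) {
    int fd = socket(AF_INET, SOCK_STREAM | SOCK_CLOEXEC, 0);

    // Equivalent, but racy if another thread fork()+exec()s in between:
    // int fd = socket(AF_INET, SOCK_STREAM, 0);
    // fcntl(fd, F_SETFD, fcntl(fd, F_GETFD) | FD_CLOEXEC);

    return fd;
}
```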
When you create a service which fork()s and then executes code, that code often starts by dropping privileges (i.e. the main apache2 service runs as root, but all the fork()ed children actually run as the httpd or www user; it is important for the main process to be root in order to open ports 80 and 443, or actually any port below 1024). Now, if a hacker is somehow able to gain control of such a child process, they at least won't have access to that file descriptor if it was closed very early on. This is much safer.
On the other hand, my apache2 example works differently: it first opens a socket and binds it to port 80, 443, etc., then creates children with fork(), and each child calls accept() (which blocks by default). The first incoming connection wakes up one of the children by returning from the accept() call. So I guess that one is not that risky after all. The child will even keep that connection open and call accept() again, up to the maximum defined in your settings (something like 100 by default, depending on the OS you use). After that many accept() calls, the child process exits and the server creates a new instance. This is to make sure that the memory footprint doesn't grow too much.
So in your case, it may not be that important. However, if a hacker takes over your process, they could accept other connections and handle them with their own canny version of your server... something to think about. If your service is internal (it only runs on your intranet), the danger is smaller (although from what I've read, most thieves in companies are employees working there...)
The child process won't be listening on the socket unless accept() is called, in which case incoming connections can go to either process.
A child process inherits all file descriptors from its parent. A child process should close all listening sockets to avoid conflicts with its parent.
I am writing a program in openFrameworks, a C++ framework. I want to start another app and communicate with it over stdin and stdout. I can start a new thread conveniently using the ofThread class. I had planned on creating two pipes and redirecting the stdin and stdout of the thread to the pipes (using dup2), but unfortunately this redirects them for the whole app, not just the thread.
Is there a way I can start another app and be able to reads its output and provide it input?
Instead of another thread, you'll need to create a child process using the fork() function (which might involve another thread intrinsically).
The difference is that fork() creates a complete copy of the parent process's environment, which is then what an exec() call sees within the child process, while just calling exec() from a thread tries to share all the resources of its parent process (thread) and thus might lead to unexpected concurrency (race condition) problems.
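A minimal sketch of that approach: fork(), redirect only the child's stdin/stdout into pipes with dup2(), then exec() the other app (./other_app is a placeholder; most error handling omitted):

```cpp
#include <unistd.h>
#include <cstdio>

int main() {
    int to_child[2], from_child[2];
    if (pipe(to_child) < 0 || pipe(from_child) < 0) { perror("pipe"); return 1; }

    pid_t pid = fork();
    if (pid == 0) {                            // child
        dup2(to_child[0], STDIN_FILENO);       // child's stdin  <- parent
        dup2(from_child[1], STDOUT_FILENO);    // child's stdout -> parent
        close(to_child[0]);   close(to_child[1]);
        close(from_child[0]); close(from_child[1]);
        execl("./other_app", "other_app", (char*)nullptr);
        perror("execl");                       // only reached if exec fails
        _exit(1);
    }

    // Parent: keep only its ends and wrap them in FILE* for convenience.
    close(to_child[0]);
    close(from_child[1]);
    FILE* to   = fdopen(to_child[1], "w");     // write -> child's stdin
    FILE* from = fdopen(from_child[0], "r");   // read  <- child's stdout

    fprintf(to, "hello\n");
    fflush(to);
    char line[256];
    if (fgets(line, sizeof line, from))
        printf("child replied: %s", line);

    fclose(to);
    fclose(from);
    return 0;
}
```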
If your "another app" is implemented as a subthread within your existing program, you don't need to redirect stdin and stdout to communicate with it over pipes. Just pass the pipe file descriptors to the subthread when you start it up. (You can use fdopen to wrap file descriptors in FILE objects. If you have dup2 and pipe, you have fdopen as well.)
In my C/C++ server application, which runs on Mac (Darwin Kernel Version 10.4.0), I'm forking child processes and want these children to not inherit the file handles (files, sockets, pipes, ...) of the server. It seems that by default all handles are inherited; even more, netstat shows that the child processes are listening on the server's port. How can I do this kind of fork?
Normally, after fork() but before exec(), one calls getrlimit(RLIMIT_NOFILE, &rl); and then closes all file descriptors below that limit (except the ones that should stay open).
Also, close-on-exec can be set on file descriptors using fcntl(), so that they get closed automatically on exec(). This, however, is not thread-safe because another thread can fork() after this thread opens a new file descriptor but before it sets close-on-exec flag.
On Linux this problem has been solved by adding O_CLOEXEC flag to functions like open() so that no extra call is required to set close-on-exec flag.
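A sketch of that close-everything loop for the child, keeping only stdin/stdout/stderr (POSIX; errors from close() on unused descriptor numbers are simply ignored):

```cpp
#include <sys/resource.h>
#include <unistd.h>

// Close every descriptor above stderr in the child after fork(), so the
// exec()ed program inherits nothing but stdin/stdout/stderr.
static void close_inherited_fds(void) {
    struct rlimit rl;
    long max_fd = 1024;                        // conservative fallback
    if (getrlimit(RLIMIT_NOFILE, &rl) == 0 && rl.rlim_cur != RLIM_INFINITY)
        max_fd = (long)rl.rlim_cur;

    for (long fd = 3; fd < max_fd; ++fd)
        close((int)fd);                        // EBADF on unused slots is fine
}
```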
Nope, you need to close them yourself since only you know which ones you need to keep open or not.
Basically no. You have to do that yourself. Maybe pthread_atfork() can help, but it would still be tedious.