I'm looking at the code for a c++ program which pipes the contents of a file to more. I don't quite understand it, so I was wondering if someone could write pseudocode for a c++ program that pipes something to something else? Why is it necessary to use fork?
create pipe
fork process
if child:
    connect read end of pipe to stdin
    exec more
else (parent):
    close read end of pipe
    write file contents to write end of pipe
    close write end (so more sees EOF), wait for child
You need fork() so that you can replace the child's stdin before calling exec, and so that your own program keeps running (writing to the pipe) instead of being replaced by more.
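A minimal C++ sketch of that pseudocode (the file name "textfile" is just a placeholder, and more is assumed to be on the PATH):

#include <unistd.h>
#include <sys/wait.h>
#include <fcntl.h>
#include <cstdio>

int main() {
    int fds[2];
    if (pipe(fds) < 0) { perror("pipe"); return 1; }

    pid_t pid = fork();
    if (pid < 0) { perror("fork"); return 1; }

    if (pid == 0) {                  // child: read end of the pipe becomes stdin
        dup2(fds[0], STDIN_FILENO);
        close(fds[0]);
        close(fds[1]);               // close the unused write end
        execlp("more", "more", (char*)nullptr);
        perror("execlp");            // only reached if exec fails
        _exit(127);
    }

    // parent: copy the file into the write end of the pipe
    close(fds[0]);                   // close the unused read end
    int in = open("textfile", O_RDONLY);
    if (in >= 0) {
        char buf[4096];
        ssize_t n;
        while ((n = read(in, buf, sizeof buf)) > 0)
            if (write(fds[1], buf, (size_t)n) < 0) break;
        close(in);
    }
    close(fds[1]);                   // closing the write end gives more its EOF
    waitpid(pid, nullptr, 0);        // wait for more to finish paging
    return 0;
}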
Why is it necessary to use fork?
When you run a pipeline from the shell, e.g.
$ ls | more
what happens? The shell runs two processes (one for ls, one for more). Additionally, the output (STDOUT) of ls is connected to the input (STDIN) of more, by a pipe.
Note that ls and more don't need to know anything about pipes, they just write to (and read from) their STDOUT (and STDIN) respectively. Further, because they're likely to do normal blocking reads and writes, it's essential that they can run concurrently. Otherwise ls could just fill the pipe buffer and block forever before more gets a chance to consume anything.
... pipes something to something else ...
Note also that aside from the concurrency argument, if your something else is another program (like more), it must run in another process. You create this process using fork. If you just ran more in the current process (using exec), it would replace your program.
In general, you can use a pipe without fork, but you'll just be communicating within your own process. This means you're either doing non-blocking operations (perhaps in a synchronous co-routine setup), or using multiple threads.
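For what it's worth, here is a tiny sketch of that last case - a pipe used entirely within one process, no fork. It only works because the message is much smaller than the pipe buffer; a large write with nobody reading would block forever:

#include <unistd.h>
#include <cstdio>
#include <cstring>

int main() {
    int fds[2];
    if (pipe(fds) < 0) { perror("pipe"); return 1; }

    const char* msg = "hello through a pipe\n";
    write(fds[1], msg, strlen(msg));                 // write end

    char buf[64];
    ssize_t n = read(fds[0], buf, sizeof buf - 1);   // read end
    if (n > 0) { buf[n] = '\0'; fputs(buf, stdout); }

    close(fds[0]);
    close(fds[1]);
    return 0;
}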
In my C++ program I need to execute a bash script. I then need to get the result produced by the script back into my C++ program.
I have two possibilities:
1. use system("script.sh"). In script.sh I redirect the output to a file, which is processed after I return to the C++ program.
2. use popen
I am interested in which of these methods is preferred, considering that the output returned from script.sh could be big (100 MB). Thanks.
When using system the parent process is blocked until the child process terminates. The child process will run with full performance.
popen will start the child process, but not wait until it has ended. So the parent process can continue to do whatever it wants while the child is running; it can, for example, read the output of the child process. The parent process can decide whether it wants to read blocking or non-blocking from the child's output pipe, depending on how much else the parent process has to do. The child will run in parallel and write its output to the pipe. It may block when writing if the parent process is not reading from the pipe and the pipe's buffer fills up, so the parent process should keep reading the output.
The system approach is a bit simpler. But popen gives you the possibility to read the process's output while it is still running. And you don't need the extra file (space). So I'd use popen.
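A minimal sketch of the popen approach, assuming the script lives at ./script.sh (the path is hypothetical); the output is consumed incrementally, so even a large result never needs a temporary file:

#include <cstdio>
#include <string>

int main() {
    FILE* p = popen("./script.sh", "r");
    if (!p) { perror("popen"); return 1; }

    std::string output;
    char buf[4096];
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, p)) > 0)
        output.append(buf, n);        // or process each chunk as it arrives

    int status = pclose(p);           // waits for the script and returns its exit status
    printf("read %zu bytes, exit status %d\n", output.size(), status);
    return 0;
}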
I have two processes written in C++, piped one after the other. One gives some information to the other's stdin, then they both go on to do something else.
The problem is that the second process hangs inside cin.getline(), even though there's no more data to be exchanged. The solution was for the first process to fclose(stdout), and that works - except when the first process is wrapped up in a script. So apparently the script's stdout is still open even after the process has closed its own - which seems fair, but in my case, can I close it too? Thanks
Since your program doesn't terminate, you can exec your-program in the script instead of just your-program; because the shell process is replaced rather than kept around, this saves an open file descriptor at the writing end of the pipe (and a bunch of other things).
Alternatively, start your program in the background and exit the script.
You can also close the standard output, but if you do that before you start your program, it won't be able to use the closed file descriptor. So you have to close it while the program is running. This is not exactly trivial. I can think of starting the program in the background, closing the standard output (use exec 1>&- for that) and bringing the program back to the foreground.
I need to learn how to create a pipe and use fork, and also how to write to a pipe and read, in VC++ 2010.
Are there any tutorials on how to do that?
This question is already answered in detail here.
Quoting verbatim from the same answer:
A pipe is a mechanism for interprocess communication. Data written to the pipe by one process can be read by another process. The primitive for creating a pipe is the pipe function. This creates both the reading and writing ends of the pipe. It is not very useful for a single process to use a pipe to talk to itself. In typical use, a process creates a pipe just before it forks one or more child processes. The pipe is then used for communication either between the parent or child processes, or between two sibling processes. A familiar example of this kind of communication can be seen in all operating system shells. When you type a command at the shell, it will spawn the executable represented by that command with a call to fork. A pipe is opened to the new child process and its output is read and printed by the shell. This page has a full example of the fork and pipe functions...
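As a rough illustration of the scheme the quote describes (POSIX only - there is no fork() in VC++/Windows, where CreatePipe and CreateProcess play roughly the same role), here the parent forks a child, the child's output goes into the pipe, and the parent reads it back and prints it, much as a shell would:

#include <unistd.h>
#include <sys/wait.h>
#include <cstdio>

int main() {
    int fds[2];
    if (pipe(fds) < 0) { perror("pipe"); return 1; }

    pid_t pid = fork();
    if (pid == 0) {                      // child: keep only the write end, make it stdout
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        execlp("ls", "ls", "-l", (char*)nullptr);
        _exit(127);
    }

    close(fds[1]);                       // parent: keep only the read end
    char buf[4096];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0)
        fwrite(buf, 1, n, stdout);       // print the child's output, like a shell would
    close(fds[0]);
    waitpid(pid, nullptr, 0);
    return 0;
}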
I am programming a shell in C++. It needs to be able to pipe the output from one thing to another. For example, on Linux, you can pipe a text file to more by doing cat textfile | more.
My function to pipe one thing to another is declared like this:
void pipeinput(string input, string output);
I send "cat textfile" as the input, and "more" as the output.
In C++ examples that show how to make pipes, fopen() is used. What do I send as my input to fopen()? I have seen C++ examples of piping using dup2 and without using dup2. What's dup2 used for? How do you know if you need to use it or not?
Take a look at popen(3), which is a way to avoid execvp.
For a simple, two-command pipeline, the function interface you propose may be sufficient. For the general case of an N-stage pipeline, I don't think it is flexible enough.
The pipe() system call is used to create a pipe. In context, you will be creating one pipe before forking. One of the two processes will arrange for the write end of the pipe to become its standard output (probably using dup2()), and will then close both of the file descriptors originally returned by pipe(). It will then execute the command that writes to the pipe (cat textfile in your example). The other process will arrange for the read end of the pipe to become its standard input (probably using dup2() again), and will then close both of the file descriptors originally returned by pipe(). It will then execute the command that reads from the pipe (more in your example).
Of course, there will be still a third process around - the parent shell process - which forked off a child to run the entire pipeline. You might decide you want to refine the mechanisms a bit if you want to track the statuses of each process in the pipeline; the process organization is then a bit different.
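Here is a rough sketch of that arrangement for the two-command case, cat textfile | more ("textfile" is just a placeholder name). Each child splices one end of the pipe onto its stdin or stdout with dup2() and closes both original descriptors before exec'ing; the parent plays the role of the shell and simply waits:

#include <unistd.h>
#include <sys/wait.h>
#include <cstdio>

int main() {
    int fds[2];
    if (pipe(fds) < 0) { perror("pipe"); return 1; }

    pid_t writer = fork();
    if (writer == 0) {                   // first child: stdout -> write end of the pipe
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]);
        close(fds[1]);
        execlp("cat", "cat", "textfile", (char*)nullptr);
        _exit(127);
    }

    pid_t reader = fork();
    if (reader == 0) {                   // second child: stdin <- read end of the pipe
        dup2(fds[0], STDIN_FILENO);
        close(fds[0]);
        close(fds[1]);
        execlp("more", "more", (char*)nullptr);
        _exit(127);
    }

    close(fds[0]);                       // the parent keeps neither end open
    close(fds[1]);
    waitpid(writer, nullptr, 0);
    waitpid(reader, nullptr, 0);
    return 0;
}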
fopen() is not used to create pipes. (fdopen() can wrap an already-created pipe file descriptor in a FILE* stream, but that is not necessary.)
Pipes are created with the pipe(2) call, before forking off the process. The subprocess has a little bit of file descriptor management to do before execing the command. See the example in pipe's documentation.
I am creating a parent-child pair with fork() so that I can communicate with a shell (/bin/sh) from the parent through a pipe.
The problem is:
In the parent I set up a select() on the child's output, but it unblocks only when the process has finished! So when I run, say, ps it's okay, but when I run /bin/sh it produces no output until the shell exits. But I want to read its output!
for (;;) {
    FD_ZERO(&sh);                                    // re-arm the set before every select()
    FD_SET(PARENT_READ, &sh);
    select(PARENT_READ + 1, &sh, NULL, NULL, NULL);  // This unblocks only when shell exits!
    if (FD_ISSET(PARENT_READ, &sh)) {
        while ((n = read(PARENT_READ, buf, 30)) > 0) {
            buf[n] = '\0';                           // terminate at the bytes actually read
            printf("C: %s\n", buf);
        }
    }
}
Does the answer lie somewhere in the area of disabling buffering on the pipe?
A lot of programs change their behavior depending on whether or not they think they're talking to a terminal (tty), and the shell definitely does this. Also, the C stream stdout is typically line-buffered (or unbuffered) when it refers to a tty and fully buffered otherwise (stderr is normally unbuffered either way) - fully buffered means nothing is flushed until the internal buffer fills up, the program explicitly flushes it, or the program ends.
To work around this problem, you have to make your program pretend to be a terminal. To do this, you can use your system's pseudo-terminal APIs (try man 7 pty). The end result is a pair of file descriptors that sort-of work like a pipe.
Also, as an aside, when select unblocks, you should read exactly once from the triggered file descriptor. If you read more than once, which is possible with the loop you've got there, you risk blocking again on subsequent reads, unless you've got your FD in non-blocking mode.
However, I have to ask: why do you need to interact with the shell in this way? Is it possible to, say, just run a shell script, or use "/bin/sh -c your_command_here" instead? There are relatively few programs that actually need a real terminal to work correctly - the main ones are programs that prompt for a password, like ssh, su or sudo.
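If you really do need the tty behaviour, a minimal pseudo-terminal sketch might look something like this (Linux-specific: forkpty() comes from <pty.h> and needs -lutil; the echo command is just a stand-in for whatever you want the shell to run):

#include <pty.h>
#include <unistd.h>
#include <sys/select.h>
#include <sys/wait.h>
#include <cstdio>

int main() {
    int master;
    pid_t pid = forkpty(&master, nullptr, nullptr, nullptr);
    if (pid < 0) { perror("forkpty"); return 1; }

    if (pid == 0) {                       // child: stdin/stdout/stderr are the slave pty
        execl("/bin/sh", "sh", "-c", "echo hello from a tty", (char*)nullptr);
        _exit(127);
    }

    char buf[256];
    for (;;) {
        fd_set rfds;
        FD_ZERO(&rfds);                   // re-arm the set before every select()
        FD_SET(master, &rfds);
        if (select(master + 1, &rfds, nullptr, nullptr, nullptr) < 0) break;
        ssize_t n = read(master, buf, sizeof buf - 1);
        if (n <= 0) break;                // EOF/EIO once the child has exited
        buf[n] = '\0';
        printf("C: %s", buf);
    }
    waitpid(pid, nullptr, 0);
    return 0;
}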