I was reading about pipes in my operating system course and writing some code to understand them better. I have a doubt regarding the following code:
int fd[2]; // CREATING PIPE
pipe(fd);
int status;
int pid=fork();
if(pid==0)
{
// WRITER PROCESS
srand(123);
int arr[3]={1,2,3};
close(fd[0]); // CLOSE UNUSED(READING END)
for(int i=0;i<3;i++)
write(fd[1],&arr[i],sizeof(int));
close(fd[1]); // CLOSE WRITING END AFTER WRITING SO AS READ GETS THE EOF
}
else
{
// READER PROCESS
int arr[10];
int i=0;
int n_bytes;
//close(fd[1]); // CLOSE UNUSED(WRITING END)
while((n_bytes=read(fd[0],&arr[i],sizeof(int)))>0) // READIN IN A LOOP UNTIL END
i++;
close(fd[0]); // CLOSE READING END after reading
for(int j=0;j<i;j++)
cout<<arr[j]<<endl;
while(wait(&status)>0)
;
}
If I run this, the read blocks; if I uncomment the close(fd[1]) call in the reader process, the code runs fine.
That means close(fd[1]) closes the write end and read can proceed.
My doubt is: even if I don't close the write end in the reader process, it is closed at the end of the writer process. So why is the read system call still blocking?
Initially, both processes have open file descriptors to both the read and write ends of the pipe.
The OS will only close an end of the pipe when all open file descriptors to it have been closed, so if you don't call close(fd[1]) in the reader process (the parent here), one file descriptor to the write end remains open, the write end of the pipe is not closed, and read blocks waiting for input that will never come.
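As a minimal sketch of that rule (the helper name read_ints_from_child is hypothetical and not from the question), the reader below closes its own copy of the write end before it starts reading, so read() can return 0 once the writer has closed its copy too:

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <vector>

// Hypothetical helper: forks a writer child and returns every int the
// parent reads from the pipe before end-of-file.
std::vector<int> read_ints_from_child()
{
    std::vector<int> received;
    int fd[2];
    if (pipe(fd) == -1)
        return received;

    pid_t pid = fork();
    if (pid == -1) {
        close(fd[0]);
        close(fd[1]);
        return received;
    }

    if (pid == 0) {
        // Writer (child): close the unused read end, write, then exit.
        close(fd[0]);
        const int arr[3] = {1, 2, 3};
        for (int i = 0; i < 3; i++)
            if (write(fd[1], &arr[i], sizeof(int)) != (ssize_t)sizeof(int))
                _exit(1);
        close(fd[1]);
        _exit(0);
    }

    // Reader (parent): without this close, the parent itself keeps the
    // write end open and read() below never reports end-of-file.
    close(fd[1]);

    int value;
    while (read(fd[0], &value, sizeof value) == (ssize_t)sizeof value)
        received.push_back(value);
    close(fd[0]);

    int status;
    waitpid(pid, &status, 0);   // only the parent waits for the child
    return received;
}
```

Commenting out the close(fd[1]) in the parent branch should reproduce the hang described in the question.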
Two problems:
The first is that due to operator precedence the loop condition n_bytes=read(fd[0],&arr[i],sizeof(int))>0 is really equal to n_bytes = (read(fd[0],&arr[i],sizeof(int)) > 0). That is, you assign the value of the comparison to the variable n_bytes. To correct this, add extra parentheses around the assignment, as in (n_bytes=read(fd[0],&arr[i],sizeof(int)))>0.
The second problem is that both the parent and the child process will call wait in a loop. You should only do that in the parent process to wait for the child.
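To make the precedence point concrete, here is a small sketch with two hypothetical helpers (their names are mine, not from the question) that differ only in the parentheses and return what ends up stored in n_bytes:

```cpp
#include <unistd.h>

// Without parentheses: parsed as n_bytes = (read(...) > 0), so n_bytes
// holds the result of the comparison (1 or 0), not a byte count.
long stored_without_parens(int fd, char* buf, size_t len)
{
    long n_bytes;
    n_bytes = read(fd, buf, len) > 0;
    return n_bytes;
}

// With parentheses: the assignment happens first, so n_bytes holds the
// value returned by read(), and the comparison is applied afterwards.
long stored_with_parens(int fd, char* buf, size_t len)
{
    long n_bytes;
    bool got_data = (n_bytes = read(fd, buf, len)) > 0;
    (void)got_data;  // only n_bytes matters for this demonstration
    return n_bytes;
}
```

With four bytes waiting in a pipe, the first helper would report 1 and the second would report 4.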
I'm trying to create a function that returns true if execvp is successful and false if it is not. Initially, I didn't use a pipe and the problem was that whenever execvp failed, I get 2 returns, one false and one true (from the parent). Now that I'm piping, I'm never getting a false returned when execvp fails.
I know there are a lot related questions and answers on this topic, but I can't seem to narrow down where my particular error is. What I want is for my variables return_type_child, return_type_parent, and this->return_type to all contain the same value. I expected that in the child process, execvp would fail so the next lines would execute. As a result, I thought that the 3 variables mentioned would all be false, but instead when I print out the value in this->return_type, 1 is displayed.
bool Command::execute() {
this->fork_helper();
return return_type;
}
void Command::fork_helper() {
bool return_type_child = true;
int fd[2];
pipe(fd);
pid_t child;
char *const argv[] = {"zf","-la", nullptr};
child = fork();
if (child > 0) {
wait(NULL);
close(0);
close(fd[1]);
dup(fd[0]);
bool return_type_parent = read(fd[0], &return_type_child, sizeof(return_
this->return_type = return_type_parent;
}
else if (child == 0) {
close(fd[0]);
close(1);
dup(fd[1]);
execvp(argv[0], argv);
this->return_type = false;
return_type_child = false;
write(1,&return_type_child,sizeof(return_type_child));
}
return;
}
I've also tried putting a cout statement after execvp(argv[0], argv), which never ran. Any help is greatly appreciated!
From the code, it seems to be an XY problem (edit: moved this section to the front due to a comment that confirms this). If the goal is to get the exit status of the child, then for that there is the value that wait returns, and no pipes are required:
int stat;
wait(&stat);
Read the manual of wait to figure out how to read it. The value of stat can be tested as follows:
WEXITSTATUS(stat) - If WIFEXITED(stat) != 0, then this is the lower 8 bits of the value the child passed to exit(N) or returned from main. It might work correctly without checking WIFEXITED, but the standard does not specify that.
WTERMSIG(stat) - If WIFSIGNALED(stat) != 0, then this is the signal number that caused the process to exit (e.g. 11 is segmentation fault). It might work correctly without checking WIFSIGNALED, but the standard does not specify that.
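A minimal sketch of that approach, assuming a hypothetical helper run_and_check and an exit code of 127 for a failed exec (a shell convention chosen here, not something the question specifies):

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <string>
#include <vector>

// Hypothetical helper: runs args[0] with the given arguments and returns
// true only if the program could be started and exited with status 0.
bool run_and_check(const std::vector<std::string>& args)
{
    if (args.empty())
        return false;

    // execvp expects a null-terminated array of char*.
    std::vector<char*> argv;
    for (const std::string& a : args)
        argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);

    pid_t child = fork();
    if (child == -1)
        return false;

    if (child == 0) {
        execvp(argv[0], argv.data());
        // Only reached if execvp itself failed. Exit instead of returning,
        // so the child never continues running the caller's code.
        _exit(127);
    }

    int stat;
    if (waitpid(child, &stat, 0) == -1)
        return false;

    // Check WIFEXITED before trusting WEXITSTATUS.
    return WIFEXITED(stat) && WEXITSTATUS(stat) == 0;
}
```

Because the child always ends in either execvp or _exit, only the parent returns from the function, which avoids the "two returns" effect described in the question.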
There are several errors in the code. See the added comments:
void Command::fork_helper() {
// File descriptors here: 0=stdin, 1=stdout, 2=stderr
//(and 3..N opened in the program, could also be none).
bool return_type_child = true;
int fd[2];
pipe(fd);
// File descriptors here: 0=stdin, 1=stdout, 2=stderr
//(and 3..N opened in the program, could also be none).
// N+1=fd[0] data exhaust of the pipe
// N+2=fd[1] data intake of the pipe
pid_t child;
char *const argv[] = {"zf","-la", nullptr};
child = fork();
if (child > 0) {
// This code is executed in the parent.
wait(NULL); // wait for the child to complete.
This wait is a potential deadlock: if the child writes enough data to the pipe (usually in the kilobytes), the write blocks and waits for the parent to read the pipe. The parent's wait(NULL) waits for the child to complete, which in turn waits for the parent to read the pipe. This is likely not affecting the code in question, but it is problematic.
close(0);
close(fd[1]);
dup(fd[0]);
// File descriptors here: 0=new stdin=data exhaust of the pipe
// 1=stdout, 2=stderr
// (and 3..N opened in the program, could also be none).
// N+1=fd[0] data exhaust of the pipe (stdin is now a duplicate)
This is problematic since:
the code just lost the original stdin.
The pipe is never closed. You should close fd[0] explicitly, don't close(0),
and don't duplicate fd[0].
It is good idea to avoid having duplicate descriptors, except for having stderr duplicate stdout.
bool return_type_parent = read(fd[0], &return_type_child, sizeof(return_
this->return_type = return_type_parent;
}
else if (child == 0) {
// this code runs in the child.
close(fd[0]);
close(1);
dup(fd[1]);
// File descriptors here: 0=stdin, 1=new stdout=pipe intake, 2=stderr
//(and 3..N opened in the program, could also be none).
// N+2=fd[1] pipe intake (new stdout is a duplicate)
This is problematic, since there are two duplicate data intakes to the pipe. In this case it is not critical since they are both closed automatically when the process ends, but it is a bad practice. It is a bad practice, since only closing all the pipe intakes signals END-OF-FILE to the exhaust. Closing one intake but not the other, does not signal END-OF-FILE. Again, in your case it is not causing trouble since the child's exit closes all the intakes.
execvp(argv[0], argv);
The code below the above line is never reached unless execvp itself fails. execvp typically fails when the file does not exist or the caller has no permission to execute it. If the executable starts to execute and fails later (possibly even if it fails to read a shared library), then execvp itself still succeeds and never returns. This is because execvp replaces the executable, and the following code is no longer in memory when execvp starts to run the other program.
this->return_type = false;
return_type_child = false;
write(1,&return_type_child,sizeof(return_type_child));
}
return;
}
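If the pipe is kept, one common way to learn whether execvp itself failed is to mark the write end close-on-exec. The following is only a sketch of that technique under my own assumptions (the helper name spawn_and_report_exec_error and the exit code 127 are hypothetical), not the original poster's code: a successful execvp closes the write end, so the parent reads end-of-file, while a failed execvp lets the child send its errno.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: returns 0 if execvp started the program, otherwise
// an errno value describing why it (or pipe/fcntl/fork) failed.
int spawn_and_report_exec_error(char* const argv[])
{
    int fd[2];
    if (pipe(fd) == -1)
        return errno;

    // Close-on-exec: a successful execvp closes this descriptor for us.
    if (fcntl(fd[1], F_SETFD, FD_CLOEXEC) == -1) {
        int saved = errno;
        close(fd[0]);
        close(fd[1]);
        return saved;
    }

    pid_t child = fork();
    if (child == -1) {
        int saved = errno;
        close(fd[0]);
        close(fd[1]);
        return saved;
    }

    if (child == 0) {
        close(fd[0]);
        execvp(argv[0], argv);
        // Only reached when execvp failed: report errno to the parent.
        int err = errno;
        ssize_t ignored = write(fd[1], &err, sizeof err);
        (void)ignored;
        _exit(127);
    }

    // Parent: close the write end first, then read BEFORE waiting.
    close(fd[1]);
    int err = 0;
    ssize_t n;
    do {
        n = read(fd[0], &err, sizeof err);
    } while (n == -1 && errno == EINTR);
    close(fd[0]);

    int stat;
    waitpid(child, &stat, 0);

    if (n == 0)
        return 0;                       // end-of-file: exec succeeded
    if (n == (ssize_t)sizeof err)
        return err;                     // child reported exec failure
    return EIO;                         // unexpected short read or error
}
```

Note that this reports only whether the program could be started; the program's own exit status still comes from the value filled in by waitpid.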
pclose()'s man page says:
The pclose() function waits for the associated process to terminate and returns the exit status of the command as returned by wait4(2).
I feel like this means if the associated FILE* created by popen() was opened with type "r" in order to read the command's output, then you're not really sure the output has completed until after the call to pclose(). But after pclose(), the closed FILE* must surely be invalid, so how can you ever be certain you've read the entire output of command?
To illustrate my question by example, consider the following code:
// main.cpp
#include <iostream>
#include <cstdio>
#include <cerrno>
#include <cstring>
#include <sys/types.h>
#include <sys/wait.h>
int main( int argc, char* argv[] )
{
FILE* fp = popen( "someExecutableThatTakesALongTime", "r" );
if ( ! fp )
{
std::cout << "popen failed: " << errno << " " << strerror( errno )
<< std::endl;
return 1;
}
char buf[512] = { 0 };
fread( buf, sizeof buf, 1, fp );
std::cout << buf << std::endl;
// If we're only certain the output-producing process has terminated after the
// following pclose(), how do we know the content retrieved above with fread()
// is complete?
int r = pclose( fp );
// But if we wait until after the above pclose(), fp is invalid, so
// there's nowhere from which we could retrieve the command's output anymore,
// right?
std::cout << "exit status: " << WEXITSTATUS( r ) << std::endl;
return 0;
}
My questions, as inline above: if we're only certain the output-producing child process has terminated after the pclose(), how do we know the content retrieved with the fread() is complete? But if we wait until after the pclose(), fp is invalid, so there's nowhere from which we could retrieve the command's output anymore, right?
This feels like a chicken-and-egg problem, but I've seen code similar to the above all over, so I'm probably misunderstanding something. I'm grateful for an explanation on this.
TL;DR executive summary: how do we know the content retrieved with the fread() is complete? — we've got an EOF.
You get an EOF when the child process closes its end of the pipe. This can happen when it calls close explicitly or exits. Nothing can come out of your end of the pipe after that. After getting an EOF you don't know whether the process has terminated, but you do know for sure that it will never write anything to the pipe.
By calling pclose you close your end of the pipe and wait for termination of the child. When pclose returns, you know that the child has terminated.
If you call pclose without getting an EOF, and the child then tries to write to its end of the pipe, the write will fail (in fact the child will get a SIGPIPE and probably die).
There is absolutely no room for any chicken-and-egg situation here.
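The order of operations described above (read until EOF, then pclose) can be sketched as follows. The helper name capture_command is hypothetical, and returning -1 for anything other than a normal exit is a simplification chosen for this sketch:

```cpp
#include <cstdio>
#include <string>
#include <sys/wait.h>

// Hypothetical helper: appends everything the command prints to `output`
// and returns its exit status, or -1 if it could not be run or did not
// exit normally.
int capture_command(const char* cmd, std::string& output)
{
    FILE* fp = popen(cmd, "r");
    if (!fp)
        return -1;

    char buf[512];
    size_t n;
    // fread returns 0 once end-of-file (or an error) is reached with no
    // further data, i.e. every write end of the pipe has been closed.
    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        output.append(buf, n);

    // All output has been collected; now wait for the child and fetch
    // its termination status.
    int status = pclose(fp);
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Reading in a loop also removes the need to guess a buffer size large enough for the whole output.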
Read the documentation for popen more carefully:
The pclose() function shall close a stream that was opened by popen(), wait for the command to terminate, and return the termination status of the process that was running the command language interpreter.
It blocks and waits.
I learned a couple things while researching this issue further, which I think answer my question:
Essentially: yes it is safe to fread from the FILE* returned by popen prior to pclose. Assuming the buffer given to fread is large enough, you will not "miss" output generated by the command given to popen.
Going back and carefully considering what fread does: it effectively blocks until (size * nmemb) bytes have been read or end-of-file (or error) is encountered.
Thanks to C - pipe without using popen, I understand better what popen does under the hood: it does a dup2 to redirect its stdout to the write-end of the pipe it uses. Importantly: it performs some form of exec to execute the specified command in the forked process, and after this child process terminates, its open file descriptors, including 1 (stdout) are closed. I.e. termination of the specified command is the condition by which the child process' stdout is closed.
Next, I went back and thought more carefully about what EOF really was in this context. At first, I was under the loosey-goosey and mistaken impression that "fread tries to read from a FILE* as fast as it can and returns/unblocks after the last byte is read". That's not quite true: as noted above: fread will read/block until its target number of bytes is read or EOF or error are encountered. The FILE* returned by popen comes from a fdopen of the read-end of the pipe used by popen, so its EOF occurs when the child process' stdout - which was dup2ed with the write-end of the pipe - is closed.
So, in the end what we have is: popen creating a pipe whose write end gets the output of a child process running the specified command, and whose read end is fdopened to a FILE* passed to fread. (Assuming fread's buffer is big enough), fread will block until EOF occurs, which corresponds to closure of the write end of popen's pipe resulting from termination of the executing command. I.e. because fread is blocking until EOF is encountered, and EOF occurs after command - running in popen's child process - terminates, it's safe to use fread (with a sufficiently large buffer) to capture the complete output of the command given to popen.
Grateful if anyone can verify my inferences and conclusions.
popen() is just a shortcut for a series of fork, dup2, execv, fdopen, etc. calls. It gives us easy access to the child's STDOUT or STDIN through file stream operations.
After popen(), the parent and the child process execute independently.
pclose() is not a 'kill' function; it just waits for the child process to terminate. Since it closes the stream and then blocks, any output the child generates while pclose() is executing could be lost.
To avoid this data loss, we call pclose() only when we know the child has finished writing (typically because it has terminated): fgets() returns NULL or fread() returns from blocking with a short count, the stream has reached its end, and feof() returns true.
Here is an example of using popen() with fread(). The function returns -1 if the command could not be started and 0 otherwise. The child's output is returned in szResult.
int exec_command( const char * szCmd, std::string & szResult ){
printf("Execute commande : [%s]\n", szCmd );
FILE * pFile = popen( szCmd, "r");
if(!pFile){
printf("Execute commande : [%s] FAILED !\n", szCmd );
return -1;
}
char buf[256];
//check if the output stream is ended.
while( !feof(pFile) ){
//try to read 255 bytes from the stream, this operation is BLOCKING ...
int nRead = fread(buf, 1, 255, pFile);
//nRead is 0 when there is nothing left to read: the stream reached end-of-file or a read error occurred
if( nRead > 0 ){
buf[nRead] = '\0';
szResult += buf;
}
}
//the child process has finished writing. Reap it with pclose(), or we leave another zombie in the process table.
pclose(pFile);
printf("Exec command [%s] return : \n[%s]\n", szCmd, szResult.c_str() );
return 0;
}
Note that all file operations on the returned stream work in BLOCKING mode; the stream is opened without the O_NONBLOCK flag. fread() can block forever if the child process hangs and never terminates, so use popen() only with trusted programs.
To take more control over the child process and avoid blocking file operations, we should use fork/vfork/exec* etc. ourselves, set the O_NONBLOCK flag on the pipes we open, and use poll() or select() from time to time to determine whether data is available before calling read() on the pipe.
Use waitpid() with WNOHANG periodically to see whether the child process has terminated.
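A rough sketch of that manual approach follows. The helper name run_with_timeout, running the command through /bin/sh, and killing the child with SIGKILL after a silent period are all assumptions made for illustration; a real program would pick its own policy.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs `cmd` through /bin/sh and collects its stdout.
// If the pipe stays silent for `timeout_ms`, the child is killed.
// Returns the exit status, or -1 on error or timeout.
int run_with_timeout(const char* cmd, std::string& output, int timeout_ms)
{
    int fd[2];
    if (pipe(fd) == -1)
        return -1;

    pid_t child = fork();
    if (child == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (child == 0) {
        close(fd[0]);
        dup2(fd[1], STDOUT_FILENO);   // child's stdout now feeds the pipe
        close(fd[1]);
        execl("/bin/sh", "sh", "-c", cmd, (char*)nullptr);
        _exit(127);                   // only reached if execl failed
    }

    close(fd[1]);
    int flags = fcntl(fd[0], F_GETFL);
    if (flags != -1)
        fcntl(fd[0], F_SETFL, flags | O_NONBLOCK);

    bool timed_out = false;
    for (;;) {
        struct pollfd pfd = { fd[0], POLLIN, 0 };
        int ready = poll(&pfd, 1, timeout_ms);
        if (ready == -1 && errno == EINTR)
            continue;
        if (ready <= 0) {             // 0: timeout, -1: poll error
            timed_out = true;
            break;
        }
        char buf[256];
        ssize_t n = read(fd[0], buf, sizeof buf);
        if (n > 0)
            output.append(buf, static_cast<size_t>(n));
        else if (n == 0)
            break;                    // EOF: every write end is closed
        else if (errno != EAGAIN && errno != EINTR)
            break;                    // real read error
    }
    close(fd[0]);

    int status = 0;
    if (timed_out) {
        // WNOHANG returns 0 immediately if the child is still running.
        if (waitpid(child, &status, WNOHANG) == 0) {
            kill(child, SIGKILL);
            waitpid(child, &status, 0);
        }
        return -1;
    }
    if (waitpid(child, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The timeout here restarts after every chunk of output, so it bounds silence rather than total run time.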
My C++ program is just a very simple while loop in which I grab user commands from the console (standard input, stdin) using the blocking getline() function. Every now and then I must call an external bash script for other purposes. The script is not directly related to what the user does; it just does some work on the filesystem, but it has to print text lines to standard output (stdout) to inform the user about the outcome of its computations.
What I get is that as soon as the script starts and prints to stdout, getline() behaves as if it were non-blocking (it is supposed to block until the user inputs some text). As a consequence, the while(1) loop starts spinning at full speed and CPU usage skyrockets to nearly 100%.
I narrowed down the problem to a single C++ source file which reproduces the problem in the same exact way, here it is:
#include<iostream>
#include<string>
#include<sstream>
#include<iostream>
#include<stdlib.h>
#include<stdio.h>
int main(void)
{
int pid = fork(); // spawn
if(pid > 0)
{
// child thread
system("sleep 5; echo \"you're screwed up!!!\"");
}
else
{
// main thread
std::string input;
while(1)
{
std::cout << std::endl << "command:";
getline(std::cin, input);
}
}
}
In this particular case after 5 seconds the program starts spamming "\ncommand:" on stdout and the only way to stop it is sending a SIGKILL signal. Sometimes you have to press some keys on the keyboard before the program starts spamming text lines.
Let this code run for 10 seconds, then press any key on the keyboard. Be ready to send SIGKILL to the process from another console; you can use the command killall -9 progname.
Did you check if failbit or eof is set?
Try changing the following line of your code
if (pid > 0)
to
if (pid == 0)
fork() returns 0 to the child and the child's pid to the parent. In your example, you are running system() in the parent and then exiting the parent. The child then becomes an orphan process running the while(1) loop, which I guess is what messes up stdin and stdout.
I have modified your program to run system() in child process.
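The modified program is not reproduced here. As a rough sketch of the idea only (run_script_in_child is a hypothetical helper, not the answerer's code), the script runs in the branch where fork() returned 0 and the caller keeps the terminal:

```cpp
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs `cmd` via system() in a forked child.
// Returns the child's pid to the caller, or -1 if fork failed.
pid_t run_script_in_child(const char* cmd)
{
    pid_t pid = fork();
    if (pid == 0) {
        // fork() returned 0: this is the child.
        int rc = std::system(cmd);
        // Leave immediately so the child never falls into the caller's
        // input loop.
        _exit(rc == 0 ? 0 : 1);
    }
    // Parent: pid is the child's pid (or -1 on failure).
    return pid;
}
```

The parent can then continue its getline() loop and reap the child later with waitpid() on the returned pid.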
The basic problem:
if(pid > 0)
{
// child thread
system("sleep 5; echo \"you're screwed up!!!\"");
}
this is the PARENT. ;) The child gets pid : 0.
A C++ program of mine calls fork() and the child immediately executes another program. I have to interact with the child, but terminate its parent simultaneously because its executable will be replaced. I somehow need to get the orphan back into the foreground so that I may interact with it via the bash - I am currently only getting its output. So I either need to send the parent to the background, the child to the foreground and then terminate the parent, or send the child to the background immediately when the parent terminates.
To my knowledge, I must set the child to be process group leader before its parent terminates.
With generous borrowing from this thread, I arrived at the following testing ground (note, this is not the full program - it just outlines the procedure):
int main(int argc, char *argcv[])
printf("%i\n", argc);
printf("\nhello, I am %i\n", getpid());
printf("parent is %i\n", getppid());
printf("process leader is %i\n", getsid(getpid()));
int pgrp;
std::stringstream pidstream;
pidstream << tcgetpgrp(STDIN_FILENO);
pidstream >> pgrp;
printf("foreground process group ID %i\n", pgrp);
if(argc==1)
{
int child = fork();
if(!child) {execl("./nameofthisprogram","nameofthisprogram", "foo", NULL);}
else
{
signal(SIGTTOU, SIG_IGN);
usleep(1000*1000*1);
tcsetpgrp(0, child);
tcsetpgrp(1, child);
std::stringstream pidstream2;
pidstream2 << tcgetpgrp(STDIN_FILENO);
pidstream2 >> pgrp;
printf("foreground process group ID %i\n", pgrp);
usleep(1000*1000*3);
return 0;
}
}
// signal(SIGTTOU, SIG_IGN); unnecessary
int input;
int input2;
printf("write something\n");
std::cin >> input;
printf("%i\n", input);
usleep(1000*1000*3);
printf("%i\n", input);
printf("write something else\n");
std::cin >> input2;
usleep(1000*1000*3);
printf("%i\n", input2);
return 0;
With the above code, the parent dies after I get prompted for the first input. If I then delay my answer beyond the parent's death, it picks up the first input character and prints it again. For input2, the program does not wait for my input.
So it seems that after the first character, input is entirely terminated.
Am I approaching this fundamentally wrong, or is it simply a matter of reassigning a few more ids and altering some signals?
I see a few things wrong here.
1. You're never putting the child process in its own process group; therefore, it remains in the original one and is therefore in the foreground along with the parent.
2. You're calling tcsetpgrp() twice; it only needs to be called once. Assuming no redirection, stdin and stdout both refer to the terminal and therefore either call would do.
"With the above code, the parent dies after I get prompted for the first input. If I then delay my answer beyond the parent's death, it picks up the first input character and prints it again. For input2, the program does not wait for my input. So it seems that after the first character, input is entirely terminated."
What you're observing here is a direct consequence of 1.: since both processes are in the foreground, they're both racing to read from stdin and the outcome is undefined.
"I somehow need to get the orphan back into the foreground so that I may interact with it via the bash - I am currently only getting its output."
From what I understand, you would expect to be interacting with the exec'ed child after the fork()/exec(). For that to happen, the child needs to be in its own process group, and needs to be put in the foreground.
int child = fork();
signal(SIGTTOU, SIG_IGN);
if (!child) {
setpgid(0, 0); // Put in its own process group
tcsetpgrp(0, getpgrp()); // Avoid race condition where exec'd program would still be in the background and would try to read from the terminal
execl("./nameofthisprogram","nameofthisprogram", "foo", NULL);
} else {
setpgid(child, child); // Either setpgid call will succeed, depending on how the processes are scheduled.
tcsetpgrp(0, child); // Move child to foreground
}
Notice that we call the setpgid()/tcsetpgrp() pair in both the parent and the child. We do so because we don't know which will be scheduled first, and we want to avoid the race condition where the exec'ed program would attempt to read from stdin (and therefore receive a SIGTTIN which would stop the process) before the parent has had time to put it in the foreground. We also ignore SIGTTOU because we know that either the child or the parent will receive one with the calls to tcsetpgrp().
I have to modify a simple shell I wrote for a previous homework assignment to handle I/O redirection and I'm having trouble getting the pipes to work. It seems that when I write and read to stdout and from stdin after duplicating the file descriptors in the separates processes, the pipe works, but if I use anything like printf, fprintf, gets, fgets, etc to try and see if the output is showing up in the pipe, it goes to the console even though the file descriptor for stdin and stdout clearly is a copy of the pipe (I don't know if that's the correct way to phrase that, but the point is clear I think).
I am 99.9% sure that I am doing everything as it should be at least in plain C -- such as closing all the file descriptors appropriately after the dup() -- and file I/O works fine, so this seems like an issue of a detail that I am not aware of and cannot find any information on. I've spent most of the day trying different things and the past few hours googling trying to figure out if I could redirect cin and cout to the pipe to see if that would fix it, but it seems like it's more trouble than it's worth at this point.
Should this work just by redirecting stdin and stdout since cin and cout are supposed to be sync'd with stdio? I thought it should, especially since the commands are probably written in C so they would use stdio, I would think. However, if I try a command like "cat [file1] [file2] | sort", it prints the result of cat [file1] [file2] to the command line, and the sort doesn't get any input so it has no output. It's also clear that cout and cin are not affected by the dup() either, so I put two and two together and came to this conclusion
Here is a somewhat shortened version of my code minus all the error checking and things like that, which I am confident I am handling well. I can post the full code if it comes to it, but it's a lot so I'll start with this.
I rewrote the function so that the parent forks off a child for each command and connects them with pipes as necessary and then waits for the child processes to die. Again, write and read on the file descriptors 0 and 1 work (i.e. write to and reads from the pipe), stdio on the FILE pointers stdin and stdout do not work (do not write to pipe).
Thanks a lot, this has been killing me...
UPDATE: I wasn't changing the string cmd for each of the different commands so it didn't appear to work because the pipe just went to the same command so the final output was the same... Sorry for the dumbness, but thanks because I found the problem with strace.
int call_execv( string cmd, vector<string> &argv, int argc,
vector<int> &redirect)
{
int result = 0, pid, /* some other declarations */;
bool file_in, file_out, pipe_in, pipe_out;
queue<int*> pipes; // never has more than 2 pipes
// parse, fork, exec, & loop if there's a pipe until no more pipes
do
{
/* some declarations for variables used in parsing */
file_in = file_out = pipe_in = pipe_out = false;
// parse the next command and set some flags
while( /* there's more redirection */ )
{
string symbol = /* next redirection symbol */
if( symbol == ">" )
{
/* set flags, get filename, etc */
}
else if( symbol == "<" )
{
/* set flags, get filename, etc */
}
else if( pipe_out = (symbol == "|") )
{
/* set flags, and... */
int tempPipes[2];
pipes.push( pipe(tempPipes) );
break;
}
}
/* ... set some more flags ... */
// fork child
pid = fork();
if( pid == 0 ) // child
{
/* if pipe_in and pipe_out set, there are two pipes in queue.
the old pipes read is dup'd to stdin, and the new pipes
write is dup'd to stdout, other two FD's are closed */
/* if only pipe_in or pipe_out, there is one pipe in queue.
the unused end is closed in whichever if statement evaluates */
/* if neither pipe_in or pipe_out is set, no pipe in queue */
// redirect stdout
if( pipe_out ){
// close newest pipes read end
close( pipes.back()[P_READ] );
// dup the newest pipes write end
dup2( pipes.back()[P_WRITE], STDOUT_FILENO );
// close newest pipes write end
close( pipes.back()[P_WRITE] );
}
else if( file_out )
freopen(outfile.c_str(), "w", stdout);
// redirect stdin
if( pipe_in ){
close( pipes.front()[P_WRITE] );
dup2( pipes.front()[P_READ], STDIN_FILENO );
close( pipes.front()[P_READ] );
}
else if ( file_in )
freopen(infile.c_str(), "r", stdin);
// create argument list and exec
char **arglist = make_arglist( argv, start, end );
execv( cmd.c_str(), arglist );
cout << "Execution failed." << endl;
exit(-1); // this only executes if execv fails
} // end child
/* close the newest pipes write end because child is writing to it.
the older pipes write end is closed already */
if( pipe_out )
close( pipes.back()[P_WRITE] );
// remove pipes that have been read from front of queue
if( init_count > 0 )
{
close( pipes.front()[P_READ] ); // close FD first
pipes.pop(); // pop from queue
}
} while ( pipe_out );
// wait for each child process to die
return result;
}
Whatever the problem, you are not checking any return values. How do you know if the pipe() or the dup2() command succeeded? Have you verified that stdout and stdin really point to the pipe right before execv? Does execv keep the filedescriptors you give it? Not sure, here is the corresponding paragraph from the execve documentation:
By default, file descriptors remain open across an execve(). File descriptors that are marked close-on-exec are closed; see the description of FD_CLOEXEC in fcntl(2). (If a file descriptor is closed, this will cause the release of all record locks obtained on the underlying file by this process. See fcntl(2) for details.) POSIX.1-2001 says that if file descriptors 0, 1, and 2 would otherwise be closed after a successful execve(), and the process would gain privilege because the set-user_ID or set-group_ID permission bit was set on the executed file, then the system may open an unspecified file for each of these file descriptors. As a general principle, no portable program, whether privileged or not, can assume that these three file descriptors will remain closed across an execve().
You should add more debug output and see what really happens. Did you use strace -f (to follow children) on your program?
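As one such check, the sketch below (printf_through_pipe is a hypothetical helper) verifies that stdio output does reach a pipe after dup2, with every return value tested. It assumes the child flushes stdout itself, because stdio is normally fully buffered when stdout is not a terminal:

```cpp
#include <cstdio>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: the child redirects fd 1 to a pipe with dup2 and
// prints with printf; the parent returns whatever arrived in the pipe.
std::string printf_through_pipe(const char* text)
{
    std::string received;
    int fd[2];
    if (pipe(fd) == -1) {
        std::perror("pipe");
        return received;
    }

    // Flush first so the child does not inherit pending buffered output.
    std::fflush(stdout);

    pid_t pid = fork();
    if (pid == -1) {
        std::perror("fork");
        close(fd[0]);
        close(fd[1]);
        return received;
    }

    if (pid == 0) {
        close(fd[0]);
        if (dup2(fd[1], STDOUT_FILENO) == -1) {
            std::perror("dup2");
            _exit(1);
        }
        close(fd[1]);
        std::printf("%s", text);
        std::fflush(stdout);   // push the stdio buffer into the pipe
        _exit(0);
    }

    close(fd[1]);              // parent keeps only the read end
    char buf[256];
    ssize_t n;
    while ((n = read(fd[0], buf, sizeof buf)) > 0)
        received.append(buf, static_cast<size_t>(n));
    close(fd[0]);
    waitpid(pid, nullptr, 0);
    return received;
}
```

If the returned string matches what was printed, the redirection itself works and the problem lies elsewhere, for example in which command is actually executed.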
The following:
queue<int*> pipes; // never has more than 2 pipes
// ...
int tempPipes[2];
pipes.push( pipe(tempPipes) );
is not supposed to work. I'm not sure how it compiles, since the result of pipe() is an int. Not only that, tempPipes goes out of scope and its contents are lost.
Should be something like that:
struct PipeFds
{
int fds[2];
};
std::queue<PipeFds> pipes;
PipeFds p;
pipe(p.fds); // check the return value
pipes.push(p);
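Wrapped up with the return value checked, this might look like the following sketch (push_new_pipe is a hypothetical helper name):

```cpp
#include <cstdio>
#include <queue>
#include <unistd.h>

struct PipeFds
{
    int fds[2];   // fds[0] is the read end, fds[1] is the write end
};

// Hypothetical helper: creates a pipe and stores both descriptors by
// value in the queue. Returns false (and pushes nothing) if pipe() fails.
bool push_new_pipe(std::queue<PipeFds>& pipes)
{
    PipeFds p;
    if (pipe(p.fds) == -1) {
        std::perror("pipe");
        return false;
    }
    pipes.push(p);   // copies the struct, so nothing dangles afterwards
    return true;
}
```

Because the queue holds copies of the struct, pipes.back().fds and pipes.front().fds stay valid after the local variable goes out of scope.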