read() on a pipe is not blocking - c++

I have the following piece of code where I am using a pipe for two-way read and write between a parent and a child process.
From what I have read, if I don't use O_NONBLOCK, read() should block until data is written to the pipe from the other side.
However, I noticed that the read() on the parent side did not block. I know the child had not written anything yet, because I put a sleep as the first statement inside the child (I am debugging in gdb).
Why did the parent's read() not block here? Also, is there anything else I need to do to synchronize the reads/writes between the two processes as below?
typedef struct
{
    int x;
    int y;
} PayLoad;

PayLoad pl;
bool b = false;
int pipe_fds[2];

void p(int i, int j)
{
    pl.x = i;
    pl.y = j;
    pipe(pipe_fds);
    pid_t cpid = fork();
    if (cpid == 0) // child process
    {
        std::this_thread::sleep_for(std::chrono::seconds(100)); // just for debugging
        close(pipe_fds[1]);
        read(pipe_fds[0], &pl, sizeof(PayLoad));
        //... do some processing on read data
        close(pipe_fds[0]);
        write(pipe_fds[1], &b, sizeof(bool));
        close(pipe_fds[1]);
    }
    else if (cpid > 0) // parent process
    {
        close(pipe_fds[0]);
        write(pipe_fds[1], &pl, sizeof(PayLoad));
        close(pipe_fds[1]);
        read(pipe_fds[0], &b, sizeof(bool)); // <------ did not block!
        close(pipe_fds[0]);
    }
}

If O_NONBLOCK is set, read() will return -1 and set errno to [EAGAIN] when no data is available. That is not what is happening here, though.
The real problem is that you are closing the file descriptors before using them. In the child process you close pipe_fds[1] and then try to write to it; in the parent process you close pipe_fds[0] and then try to read from it. A read() or write() on a closed descriptor does not block: it fails immediately with -1 and errno set to EBADF, which is why the parent's read() returned right away. Once a process has closed a file descriptor, it must not use it for reading or writing any more. The usual pattern with a single pipe is one-directional: one process (parent or child) writes through one of the descriptors created by pipe() and the other process reads through the other descriptor, and each side closes the end it does not use.
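If you really want two-way traffic, create two pipes, one per direction. Below is a minimal sketch of that layout (my illustration, not the original poster's code; error checking omitted):

// Hypothetical two-pipe layout: the parent sends the payload on one pipe,
// the child sends the result back on the other.
int toChild[2], toParent[2];
pipe(toChild);
pipe(toParent);

if (fork() == 0) // child
{
    close(toChild[1]);  // child only reads requests
    close(toParent[0]); // child only writes results
    read(toChild[0], &pl, sizeof(PayLoad));  // blocks until the parent writes
    // ... do some processing on read data ...
    write(toParent[1], &b, sizeof(bool));
    close(toChild[0]);
    close(toParent[1]);
}
else // parent
{
    close(toChild[0]);  // parent only writes requests
    close(toParent[1]); // parent only reads results
    write(toChild[1], &pl, sizeof(PayLoad));
    close(toChild[1]);
    read(toParent[0], &b, sizeof(bool));     // blocks until the child answers
    close(toParent[0]);
}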

Related

Do input redirection and capture command output (Custom shell-like program)

I'm writing a custom shell where I am trying to add support for input/output redirection and pipes, just like a standard shell. I am stuck at the point where I cannot get input redirection to work, although output redirection works fine. My implementation looks roughly like this (only the related part); you can assume that (string) input is non-empty:
void execute() {
    ... // stuff before execution and initialization of variables
    int *fds;
    std::string content;
    std::string input = readFromAFile(in_file); // for input redirection
    for (int i = 0; i < commands.size(); i++) {
        fds = subprocess(commands[i]);
        dprintf(fds[1], "%s", input.data()); // write to write-end of pipe
        close(fds[1]);
        content += readFromFD(fds[0]); // read from read-end of pipe
        close(fds[0]);
    }
    ... // stuff after execution
}
int *subprocess(std::string &cmd) {
    std::string s;
    int *fds = new int[2];
    pipe(fds);
    pid_t pid = fork();
    if (pid == -1) {
        std::cerr << "Fork failed.";
    }
    if (pid == 0) {
        dup2(fds[1], STDOUT_FILENO);
        dup2(fds[0], STDIN_FILENO);
        close(fds[1]);
        close(fds[0]);
        system(cmd.data());
        exit(0); // child terminates
    }
    return fds;
}
My idea is that subprocess returns a pipe (fd_in, fd_out) and the parent can afterwards write to the write end and read from the read end. However, when I try an input redirection such as sort < in.txt, the program just hangs. I think there is a deadlock, because each side is waiting for the other one to read or write; but the parent writes to the write end, closes it, and only then reads from the read end. How should I handle this case?
When I did a bit of searching I saw this answer, which matches my original thinking, except that the answer mentions creating two pipes. I did not quite understand that part. Why do we need two separate pipes?

Creating a dispatch queue / thread handler in C++ with pipes: FIFOs overfilling

Threads are resource-heavy to create and use, so often a pool of threads will be reused for asynchronous tasks. A task is packaged up, and then "posted" to a broker that will enqueue the task on the next available thread.
This is the idea behind dispatch queues (i.e. Apple's Grand Central Dispatch), and thread handlers (Android's Looper mechanism).
Right now, I'm trying to roll my own. In fact, I'm plugging a gap in Android whereby there is an API for posting tasks in Java, but not in the native NDK. However, I'm keeping this question platform independent where I can.
Pipes are the ideal choice for my scenario. I can easily poll the file descriptor of the read-end of a pipe(2) on my worker thread, and enqueue tasks from any other thread by writing to the write-end. Here's what that looks like:
int taskRead, taskWrite;

void setup() {
    // Create the pipe
    int taskPipe[2];
    ::pipe(taskPipe);
    taskRead = taskPipe[0];
    taskWrite = taskPipe[1];

    // Set up a routine that is called when task_r reports new data
    function_that_polls_file_descriptor(taskRead, []() {
        // Read the callback data
        std::function<void(void)>* taskPtr;
        ::read(taskRead, &taskPtr, sizeof(taskPtr));

        // Run the task - this is unsafe! See below.
        (*taskPtr)();

        // Clean up
        delete taskPtr;
    });
}

void post(const std::function<void(void)>& task) {
    // Copy the function onto the heap
    auto* taskPtr = new std::function<void(void)>(task);

    // Write the pointer to the pipe - this may block if the FIFO is full!
    ::write(taskWrite, &taskPtr, sizeof(taskPtr));
}
This code puts a std::function on the heap, and passes the pointer to the pipe. The function_that_polls_file_descriptor then calls the provided expression to read the pipe and execute the function. Note that there are no safety checks in this example.
This works great 99% of the time, but there is one major drawback. Pipes have a limited size, and if the pipe is filled, then calls to post() will hang. This in itself is not unsafe, until a call to post() is made within a task.
auto evil = []() {
    // Post a new task back onto the queue
    post({});

    // Not enough new tasks, let's make more!
    for (int i = 0; i < 3; i++) {
        post({});
    }

    // Now for each time this task is posted, 4 more tasks will be added to the queue.
};

post(evil);
post(evil);
...
If this happens, then the worker thread will be blocked, waiting to write to the pipe. But the pipe's FIFO is full, and the worker thread is not reading anything from it, so the entire system is in deadlock.
What can be done to ensure that calls to post() emanating from the worker thread always succeed, allowing the worker to continue processing the queue even when the pipe is full?
Thanks to all the comments and other answers in this post, I now have a working solution to this problem.
The trick I've employed is to prioritise worker threads by checking which thread is calling post(). Here is the rough algorithm:
pipe ← NON-BLOCKING-PIPE()
overflow ← Ø

POST(task)
    success ← WRITE(task, pipe)
    IF NOT success THEN
        IF THREAD-IS-WORKER() THEN
            overflow ← overflow ∪ {task}
        ELSE
            WAIT(pipe)
            POST(task)
Then on the worker thread:
LOOP FOREVER
    task ← READ(pipe)
    RUN(task)
    FOR EACH overtask ∈ overflow
        RUN(overtask)
    overflow ← Ø
The wait is performed with pselect(2), adapted from the answer by #Sigismondo.
Here's the algorithm implemented in my original code example that will work for a single worker thread (although I haven't tested it after copy-paste). It can be extended to work for a thread pool by having a separate overflow queue for each thread.
int taskRead, taskWrite;

// These variables are only allowed to be modified by the worker thread
std::thread::id workerId;
std::queue<std::function<void(void)>*> overflow;
bool overflowInUse;

void setup() {
    int taskPipe[2];
    ::pipe(taskPipe);
    taskRead = taskPipe[0];
    taskWrite = taskPipe[1];

    // Make the pipe non-blocking so pipe overflows can be detected manually
    ::fcntl(taskWrite, F_SETFL, ::fcntl(taskWrite, F_GETFL, 0) | O_NONBLOCK);

    // Save the ID of this worker thread to compare later
    workerId = std::this_thread::get_id();
    overflowInUse = false;

    function_that_polls_file_descriptor(taskRead, []() {
        // Read the callback data
        std::function<void(void)>* taskPtr;
        ::read(taskRead, &taskPtr, sizeof(taskPtr));

        // Run the task
        (*taskPtr)();
        delete taskPtr;

        // Run any tasks that were posted to the overflow
        while (!overflow.empty()) {
            taskPtr = overflow.front();
            overflow.pop();
            (*taskPtr)();
            delete taskPtr;
        }

        // Release the overflow mechanism if applicable
        overflowInUse = false;
    });
}

bool write(std::function<void(void)>* taskPtr, bool blocking = true) {
    ssize_t rc = ::write(taskWrite, &taskPtr, sizeof(taskPtr));

    // Failure handling
    if (rc < 0) {
        // If blocking is allowed, wait for the pipe to become writable
        if ((errno == EAGAIN || errno == EWOULDBLOCK) && blocking) {
            fd_set fds;
            FD_ZERO(&fds);
            FD_SET(taskWrite, &fds);
            ::pselect(taskWrite + 1, nullptr, &fds, nullptr, nullptr, nullptr);

            // Try again
            return write(taskPtr);
        }
        // Otherwise return false
        return false;
    }
    return true;
}

void post(const std::function<void(void)>& task) {
    auto* taskPtr = new std::function<void(void)>(task);
    if (std::this_thread::get_id() == workerId) {
        // The worker thread gets 1st-class treatment.
        // It won't be blocked if the pipe is full, instead
        // using an overflow queue until the overflow has been cleared.
        if (!overflowInUse) {
            bool success = write(taskPtr, false);
            if (!success) {
                overflow.push(taskPtr);
                overflowInUse = true;
            }
        } else {
            overflow.push(taskPtr);
        }
    } else {
        write(taskPtr);
    }
}
Make the pipe write file descriptor non-blocking, so that write fails with EAGAIN when the pipe is full.
One improvement is to increase the pipe buffer size.
Another is to use a UNIX socket/socketpair and increase the socket buffer size.
Yet another solution is to use a UNIX datagram socket which many worker threads can read from, but only one gets the next datagram. In other words, you can use a datagram socket as a thread dispatcher.
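For the buffer-size suggestions above, a rough sketch of what they might look like (my illustration; F_SETPIPE_SZ is Linux-specific, and the 1 MiB values are arbitrary examples):

#include <fcntl.h>      // F_SETPIPE_SZ (Linux; may require _GNU_SOURCE)
#include <sys/socket.h> // socketpair, setsockopt

// Ask the kernel for a larger pipe buffer.
::fcntl(taskWrite, F_SETPIPE_SZ, 1024 * 1024);

// Or replace the pipe with a socketpair and enlarge its send buffer.
int sv[2];
::socketpair(AF_UNIX, SOCK_DGRAM, 0, sv);   // SOCK_DGRAM keeps each posted pointer framed
int sndbuf = 1024 * 1024;
::setsockopt(sv[1], SOL_SOCKET, SO_SNDBUF, &sndbuf, sizeof(sndbuf));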
You can use the good old select to determine whether the file descriptors are ready to be used for writing:
The file descriptors in writefds will be watched to see if space is available for write (though a large write may still block).
Since you are writing a pointer, your write() cannot be classified as large at all.
Clearly you must be ready to handle the fact that a post may fail, and be ready to retry it later... otherwise you will be facing an indefinitely growing backlog of tasks, until your system breaks again.
More or less (not tested):
bool post(const std::function<void(void)>& task) {
    bool post_res = false;

    // Copy the function onto the heap
    auto* taskPtr = new std::function<void(void)>(task);

    fd_set wfds;
    struct timeval tv;
    int retval;

    FD_ZERO(&wfds);
    FD_SET(taskWrite, &wfds);

    // Don't wait at all
    tv.tv_sec = 0;
    tv.tv_usec = 0;

    retval = select(taskWrite + 1, NULL, &wfds, NULL, &tv);

    // select() returns 0 when no FDs are ready
    if (retval == -1) {
        // handle error condition
    } else if (retval > 0) {
        // Write the pointer to the pipe. This write will succeed
        ::write(taskWrite, &taskPtr, sizeof(taskPtr));
        post_res = true;
    }
    return post_res;
}
If you only care about Android/Linux, using a pipe is not state of the art; an event file descriptor (eventfd) together with epoll is the way to go.
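For reference, a bare-bones sketch of that eventfd/epoll combination (my own illustration, not part of the answer above; the task queue itself is assumed to live in user space, e.g. a mutex-protected std::queue, so it can never fill a kernel buffer):

#include <sys/eventfd.h>
#include <sys/epoll.h>
#include <unistd.h>
#include <cstdint>

// One wakeup per posted task: EFD_SEMAPHORE makes each read() decrement the counter by one.
int efd = ::eventfd(0, EFD_SEMAPHORE | EFD_CLOEXEC);

void setupPolling() {                       // hypothetical helper name
    int epfd = ::epoll_create1(EPOLL_CLOEXEC);
    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = efd;
    ::epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev);

    // Worker loop: block until at least one task has been posted.
    // epoll_event out;
    // ::epoll_wait(epfd, &out, 1, -1);
    // uint64_t one;
    // ::read(efd, &one, sizeof(one));      // consume one "ticket"
    // ... pop one task from the mutex-protected queue and run it ...
}

void signalNewTask() {                      // hypothetical helper name
    // post(): push the task onto the user-space queue first, then wake the worker.
    uint64_t one = 1;
    ::write(efd, &one, sizeof(one));
}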

C++ both input and output pipe to the external program

I am trying to invoke an external program with some input and retrieve its output from within my program.
It will be look like;
(some input) | (external program) | (retrieve output)
I first thought about using popen(), but it seems that is not possible because the pipe is not bidirectional.
Is there any easy way to handle this kind of thing in Linux?
I could try making a temp file, but it would be great if this could be handled cleanly without touching the disk.
Any Solution? Thanks.
On Linux you can use the pipe function: open two new pipes, one for each direction, then create a child process using fork. Afterwards you typically close the file descriptors that are not in use (the read end in the parent and the write end in the child for the pipe the parent uses to send to the child, and vice versa for the other pipe), and then start your application using execve or one of its front ends.
If you dup2 the pipes' file descriptors onto the standard file handles (STDIN_FILENO/STDOUT_FILENO; each process separately), you should even be able to use std::cin/std::cout to communicate with the other process (you might want to do so only in the child, as you probably want to keep the console in the parent). I have not tested this, though, so that's left to you.
When done, you still wait or waitpid for your child process to terminate. It might look similar to the following piece of code:
int pipeP2C[2], pipeC2P[2];
// (names: short for pipe for X (writing) to Y with P == parent, C == child)
if (pipe(pipeP2C) != 0 || pipe(pipeC2P) != 0)
{
    // error
    // TODO: appropriate handling
}
else
{
    int pid = fork();
    if (pid < 0)
    {
        // error
        // TODO: appropriate handling
    }
    else if (pid > 0)
    {
        // parent
        // close unused ends:
        close(pipeP2C[0]); // read end
        close(pipeC2P[1]); // write end

        // use pipes to communicate with child...

        int status;
        waitpid(pid, &status, 0);

        // cleanup or do whatever you want to do afterwards...
    }
    else
    {
        // child
        close(pipeP2C[1]); // write end
        close(pipeC2P[0]); // read end
        dup2(pipeP2C[0], STDIN_FILENO);
        dup2(pipeC2P[1], STDOUT_FILENO);

        // you should now be able to close the two remaining
        // pipe file descriptors as well, since you dup'ed them already
        // (confirmed that this is working)
        close(pipeP2C[0]);
        close(pipeC2P[1]);

        execve(/*...*/); // won't return - but you should now be able to
                         // use stdin/stdout to communicate with the parent
    }
}
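As a usage illustration, the "use pipes to communicate with child..." step in the parent might look roughly like this, assuming the child exec's a filter that reads stdin until EOF and writes its result to stdout (my sketch, not part of the answer above; needs <unistd.h> and <cstring>):

// Parent: send the input, close the write end so the child sees EOF,
// then read the child's output until the child closes its stdout.
const char* input = "some input\n";   // example data
write(pipeP2C[1], input, strlen(input));
close(pipeP2C[1]);                    // child's stdin now reaches EOF

char buf[4096];
ssize_t n;
while ((n = read(pipeC2P[0], buf, sizeof(buf))) > 0)
{
    // ... consume n bytes of the child's output ...
}
close(pipeC2P[0]);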

Linux - child reading from pipe receives debug messages sent to standard output

I'm trying to create a parent and a child processes that would communicate through a pipe.
I've setup the child to listen to its parent through a pipe, with a read command running in a while loop.
In order to debug my program I print debug messages to the standard output (note that my read command is set to the pipe with a file descriptor different than 0 or 1).
For some reason these debug messages are being received by the read command in my child process. I can't understand why this is happening. What could be causing it? What elegant solution do I have for it (apart from writing to standard error instead of standard output)?
This code causes an endless loop, because the cout message just triggers another read. Why? Notice that the child process exits upon receiving a CHILD_EXIT_CODE value from the parent.
int myPipe[2];
pipe(myPipe);
if (fork() == 0)
{
    int readPipe = myPipe[0];
    while (true)
    {
        size_t nBytes = read(readPipe, readBuffer, sizeof(readBuffer));
        std::cout << readBuffer << "\n";
        int newPosition = atoi(readBuffer);
        if (newPosition == CHILD_EXIT_CODE)
        {
            exit(0);
        }
    }
}
Edit: Code creating the pipe and fork
I do not know what your parent process is doing (you did not post its code), but from your description it seems that your parent and child processes are sharing the same stdout stream (the child inherits copies of the parent's set of open file descriptors; see man fork).
I guess what you should do is attach the stdout and stderr streams in your parent process to the write ends of your pipes (you need one more pipe for the stderr stream).
This is what I would try if I were in your situation (in my opinion you are missing dup2):
pid_t pid;          /* Child or parent PID. */
int out[2], err[2]; /* Store pipe file descriptors. Write ends attached to the stdout */
                    /* and stderr streams. */

// Init value as error.
out[0] = out[1] = err[0] = err[1] = -1;

/* Creating pipes, they will be attached to the stderr and stdout streams */
if (pipe(out) < 0 || pipe(err) < 0) {
    /* Error: you should log it */
    exit(EXIT_FAILURE);
}

if ((pid = fork()) == -1) {
    /* Error: you should log it */
    exit(EXIT_FAILURE);
}

if (pid != 0) {
    /* Parent process */
    /* Attach stderr and stdout streams to your pipes (their write ends) */
    if ((dup2(out[1], 1) < 0) || (dup2(err[1], 2) < 0)) {
        /* Error: you should log it */
        /* The child is going to be an orphan process, you should kill it before calling exit. */
        exit(EXIT_FAILURE);
    }

    /* WHATEVER YOU DO WITH YOUR PARENT PROCESS */

    /* The child is going to be an orphan process, you should kill it before calling exit. */
    exit(EXIT_SUCCESS);
}
else {
    /* Child process */
}
You should not forget a couple of things:
Call wait or waitpid to release the resources associated with the child process when it dies. wait or waitpid must be called from the parent process.
If you use wait or waitpid you might have to think about blocking SIGCHLD before calling fork, and in that case you should unblock SIGCHLD in your child process right after fork, at the beginning of the child's code (a child created via fork(2) inherits a copy of its parent's signal mask; see sigprocmask).
Something that is often forgotten: be aware of the EINTR error. dup2, waitpid/wait, read and many others are affected by it; a small retry wrapper is sketched below.
If your parent process dies before your child process, you should try to kill the child process if you do not want it to become an orphan.
Take a look at _exit. Perhaps you should use it in your child process instead of exit.
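To illustrate the EINTR point, here is a small retry wrapper for read (my own sketch; the same pattern applies to waitpid and friends):

#include <cerrno>
#include <unistd.h>

// Retry read() when it is interrupted by a signal (e.g. SIGCHLD) before any data arrived.
ssize_t read_retry(int fd, void* buf, size_t count)
{
    ssize_t n;
    do {
        n = read(fd, buf, count);
    } while (n == -1 && errno == EINTR);
    return n;
}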

Child process is blocked by full pipe, cannot read in parent process

I have roughly created the following code to call a child process:
// pipe meanings
const int READ = 0;
const int WRITE = 1;

int fd[2];
// Create pipes
if (pipe(fd))
{
    throw ...
}
p_pid = fork();
if (p_pid == 0) // in the child
{
    close(fd[READ]);
    if (dup2(fd[WRITE], fileno(stdout)) == -1)
    {
        throw ...
    }
    close(fd[WRITE]);
    // Call exec
    execv(argv[0], const_cast<char* const*>(&argv[0]));
    _exit(-1);
}
else if (p_pid < 0) // fork has failed
{
    throw ...
}
else // in the parent
{
    close(fd[WRITE]);
    p_stdout = new std::ifstream(fd[READ]);
}
Now, if the subprocess does not write too much to stdout, I can wait for it to finish and then read the stdout from p_stdout. If it writes too much, the write blocks and the parent waits for it forever.
To fix this, I tried to wait with WNOHANG in the parent; if the child is not finished, I read all available output from p_stdout using readsome, sleep a bit, and try again. Unfortunately, readsome never reads anything:
while (true)
{
    if (waitid(P_PID, p_pid, &info, WEXITED | WNOHANG) != 0)
        throw ...;
    else if (info.si_pid != 0) // waiting has succeeded
        break;

    char tmp[1024];
    size_t sizeRead;
    sizeRead = p_stdout->readsome(tmp, 1024);
    if (sizeRead > 0)
        s_stdout.write(tmp, sizeRead);
    sleep(1);
}
The question is: Why does this not work and how can I fix it?
Edit: If there were only one child, simply using read instead of readsome would probably work, but the process has multiple children and needs to react as soon as one of them terminates.
As sarnold suggested, you need to change the order of your calls: read first, wait last. Even if your method worked, you might miss the last read, i.e. you would exit the loop before reading the last set of bytes that was written.
The problem might be that readsome only returns characters already in the stream's internal buffer, so it can return 0 without ever pulling data from the pipe. I've never liked iostreams; even in my C++ projects I prefer the simplicity of C's stdio functions (i.e. FILE*, fprintf, etc.). One way to get around this is to read only when the descriptor is readable. You can use select to determine whether there is data waiting on that pipe. You're going to need select anyway if you are going to read from multiple children, so you might as well learn it now.
As for a quick isreadable function, try something like this (please note I haven't tried compiling this):
bool isreadable(int fd, int timeoutSecs)
{
    struct timeval tv = { timeoutSecs, 0 };
    fd_set readSet;
    FD_ZERO(&readSet);
    FD_SET(fd, &readSet);
    return select(fd + 1, &readSet, NULL, NULL, &tv) == 1;
}
Then in your parent code, do something like:
while (true) {
    if (isreadable(fd[READ], 1)) {
        // read fd[READ] into a buffer, keeping the byte count in `bytes`
        if (bytes <= 0)
            break;
    }
}
waitpid(pid, NULL, 0);
I'd suggest re-writing the code so that it doesn't call waitpid(2) until after read(2) calls on the pipe return 0 to signify end-of-file. Once you get the end-of-file return from your read calls, you know the child is dead, and you can finally waitpid(2) for it.
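In terms of the question's variables, that ordering might look roughly like this (a sketch reading the raw descriptor instead of the ifstream):

// Drain the pipe until the child closes its end (read returns 0), then reap it.
char buf[4096];
ssize_t n;
while ((n = read(fd[READ], buf, sizeof(buf))) > 0)
    s_stdout.write(buf, n);

close(fd[READ]);
int status;
waitpid(p_pid, &status, 0); // the child has already exited or will shortly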
Another option is to de-couple the reading from the reaping even further and perform the wait calls in a SIGCHLD signal handler asynchronously to the reading operations.
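A minimal sketch of that decoupling (my illustration): install a SIGCHLD handler that only reaps, and keep all reading in the main loop.

#include <csignal>
#include <cerrno>
#include <sys/wait.h>

// Reap every finished child; uses only async-signal-safe calls.
void on_sigchld(int)
{
    int saved_errno = errno;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        ;
    errno = saved_errno;
}

// Installed once, before forking:
// struct sigaction sa = {};
// sa.sa_handler = on_sigchld;
// sigemptyset(&sa.sa_mask);
// sa.sa_flags = SA_RESTART;  // reduce spurious EINTR on the read loop
// sigaction(SIGCHLD, &sa, NULL);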