What does this C code do? - c++

I'm really new to C programming, although I have done quite a bit of other types of programming.
I was wondering if someone could explain to me why this program outputs 10.
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/wait.h>
#include <stdlib.h>
int value = 10;
int main()
{
    pid_t pid;
    pid = fork();
    if(pid == 0){
        value += 10;
    }
    else if(pid > 0){
        wait(NULL);
        printf("parent: value = %d\n", value); //Line A
        exit(0);
    }
}
I know the output is "parent: value = 10". Anyone know why?
Thanks!

fork creates a new process, so after the call there are two: the "parent" and the "child". Each process sees a different value of pid in your example. In the child, pid is 0. In the parent, pid is the child's process ID, as assigned by the OS.
In your example, each process has its own copy of value in its own memory. They do not share memory, as your question seems to assume. If one process changes value (the first branch of the if), the change is not reflected in the other process (the second branch of the if).
Edit: Explained the value of pid.

About fork():
If fork() returns a negative value, the creation of a child process was unsuccessful.
fork() returns zero to the newly created child process.
fork() returns a positive value, the process ID of the child process, to the parent.
So in your case it returns a number greater than 0 in the parent, and thus the parent's value remains 10 and is printed.

Well, fork spawns a new process. It more or less copies the current process, and both the new one (the child) and the old one (the parent) go on at the same point in the code. But there is one significant difference (that interests us) here: for the child, fork returns 0. For the parent, it returns the process ID of the child.
So the if(pid == 0) part is true for the child. The child simply adds 10 to its copy of value, and then exits since there is no further code.
The else if part is true for the parent (in the rare case where fork fails and returns -1, neither branch runs). The parent simply waits for the child to exit. But the child has modified its own copy of value; the parent's copy is still untouched, and that is why you get the output of "10". Then the parent also exits.

fork() creates a new process: it has two return values in two different contexts, so both paths run in your if statement. The conditional is mostly used to determine which process you run in after the fork.

When you call fork, it creates a copy of the process in such a way that both copies' program counters are at the same position in their code sections. Hence, when either copy resumes execution, it will just be finishing the call to fork.
So both of them should execute identically.
BUT, fork returns 0 in the child process, and the pid of the child process in the parent process.
That explains the mojo behind the if( pid==0 ) part.
So when the child process changes the value of value, it actually changes that in its own copy (remember: the process got copied, so the data sections got copied too).
Meanwhile, the parent process executes with its old value of value, which is 10.
Even after the child changes its copy of value and dies, the parent's copy is still 10.

The fork system call creates a new process as a child of the existing (parent) process. Both the parent and the child continue execution at the line following the fork statement; the child process is given a copy of the parent's address space.
The fork system call returns the process id of the newly created process to the parent and zero to the child, therefore within this code the child will increment its own copy of the value variable and the parent will print out its own copy.
You will often see fork followed by an exec within the child so that it replaces itself with another program.

Related

Interprocess communication, reading from multiple children stdout

I'm trying to write a custom shell-like program, where multiple commands can be executed concurrently. For a single command this is not much complicated. However, when I try to concurrently execute multiple commands (each one in a separate child) and capture their stdout I'm having a problem.
What I have tried so far: in my shell application I have two functions to run the commands concurrently. execute() takes multiple commands and, for each one, fork()s a child process to execute it; subprocess() takes one command and executes it.
void execute(std::vector<std::string> cmds) {
    int fds[2];
    pipe(fds);
    std::pair<pid_t, int> sp;
    for (int i = 0; i < cmds.size(); i++) {
        std::pair<pid_t, int> sp = this->subprocess(cmds[i], fds);
    }
    // wait for all children
    while (wait(NULL) > 0);
    close(sp.second);
}
std::pair<pid_t, int> subprocess(std::string &cmd, int *fds) {
    std::pair<pid_t, int> process = std::make_pair(fork(), fds[0]);
    if (process.first == 0) {
        close(fds[0]); // no reading
        dup2(fds[1], STDIN_FILENO);
        close(fds[1]);
        char *argv[] = {"/bin/sh", "-c", cmd.data(), NULL};
        execvp(argv[0], argv);
        exit(0);
    }
    close(fds[1]); // only reading
    return process;
}
The problem is that when I execute multiple commands in my custom shell (not diving into specifics here, but it will call execute() at some point), if I use STDIN_FILENO as above to capture the child processes' stdout, the captured output keeps being written to the shell's stdin forever. For example,
if the input commands are
echo im done, yet?
echo nope
echo maybe
then, in the STDIN_FILENO case, the output looks like this (where >>> is my marker for user input):
im done, yet?
nope
maybe
>>> nope
maybe
im done, yet?
>>> im done, yet?
nope
maybe
In the STDOUT_FILENO case, it seems to ignore one of the commands (probably the first child), and I'm not sure why:
maybe
nope
>>> maybe
nope
>>> nope
maybe
>>> maybe
nope
>>> nope
So, the potential causes I thought of: in my shell I'm using std::cin >> ... for user input in a while loop, of course, and this may somehow conflict in the stdin case. On the other hand, in the main process (parent) I'm waiting for all children to exit, so maybe the children are somehow not exiting; but a child should die off after execvp, right? Moreover, I close the reading end in the main process with close(sp.second). At this point, I'm not sure why this happens.
Should I not use pipe() for something like this? If I use a temp file to redirect the stdout of each child process, would everything be fine? And if so, can you please explain why?
There are multiple, fundamental, conceptual problems in the shown code.
std::pair<pid_t, int> sp;
This declares a new std::pair object. So far so good.
std::pair<pid_t, int> sp = this->subprocess(cmds[i], fds);
This declares a new std::pair object inside the for loop. It just happens to have the same name as the sp object at the function scope. But it's a different object that has nothing to do, whatsoever, with it. That's how C++ works: when you declare an object inside an inner scope (inside an if statement, a for loop, or anything that's stuffed inside another pair of { ... }), you end up declaring a new object. Whether its name happens to be the same as another name declared in an enclosing scope is immaterial. It's a new object.
// wait for all children
while (wait(NULL) > 0);
close(sp.second);
There are two separate problems here.
For starters, if we've been paying attention: this sp object was never assigned the result of subprocess(). It still holds its default-constructed value (both members zero), so close(sp.second) closes descriptor 0, not the pipe's read end.
If the goal here is to read from the children, that part is completely missing, and it should be done before waiting for the child processes to exit. If, as described, the child processes are going to write to this pipe, then the pipe must be read from. A pipe's internal buffer is limited: if nothing reads from it and the child processes fill it up, they will block, waiting for the pipe to be read. But the parent process is waiting for the child processes to exit, so everything will hang.
Finally, it is also unclear why the pipe's file descriptor is getting passed to the same function, only to return a std::pair with the same file descriptor. The std::pair serves no useful purpose in the shown code, so it's likely that there's also more code that's not shown here, where this is put to use.
At least all of the above problems must be fixed in order for the shown code to work correctly. If there's other code that's not shown, it may or may not have additional issues, as well.

Why this line wouldn't be printed? (C++ threads)

I found this example on the net and can't figure out why the last line is not printed:
#include<stdio.h>
#include<stdlib.h>
#include<sys/types.h>
#include<unistd.h>
int main()
{
    pid_t return_value;
    printf("Forking process\n");
    return_value=fork();
    printf("The process id is %d and return value is %d\n",
           getpid(), return_value);
    execl("/bin/ls","ls","-l",(char *)NULL);
    printf("This line is not printed\n");
}
A successful execl never returns, see the man page:
The exec() functions only return if an error has occurred.
Instead, the host process is replaced by what you are execing, in this case, the ls process image:
The exec() family of functions replaces the current process image with a new process image.
This way, your program will be replaced in memory before reaching the last printf statement, causing it to never execute.
exec*() functions are special in the sense that they do not return on success. A typical implementation replaces the image of the current process, which is effectively the same as starting a new program inside the current process. In your case the new program is /bin/ls. During execl(), the previous image is unloaded from the process, then /bin/ls and all its dependencies are loaded, and control passes to the entry point of /bin/ls, which calls its main() function, and so on.
Thus there is nowhere to return control to after execl(), since the code that called it no longer exists in the address space of the current process.

Is there a way for my win32 program to tell that the child process it launched has crashed (and not just exited)?

I wrote a multi-platform C++ class that launches a user-specified child process, and lets the user communicate with the child process's stdin/stdout, wait for the child process to exit, and so on.
In the Unix/POSIX implementation of this class, I've just added a feature that lets the caller find out whether the child process's exit was due to an unhandled signal (i.e. a crash):
bool ChildProcessDataIO :: WaitForChildProcessToExit(bool & retDidChildProcessCrash)
{
    int status = 0;
    int pid = waitpid(_childPID, &status, 0);
    if (pid == _childPID)
    {
        retDidChildProcessCrash = WIFSIGNALED(status);
        return true;
    }
    else return false; // error, couldn't get child process's status
}
... and now I'd like to add similar functionality to the Windows implementation, which currently looks like this:
bool ChildProcessDataIO :: WaitForChildProcessToExit(bool & retDidChildProcessCrash)
{
    bool ret = (WaitForSingleObject(_childProcess, INFINITE) == WAIT_OBJECT_0);
    if (ret)
    {
        /* TODO: somehow set (retDidChildProcessCrash) here */
    }
    return ret;
}
... but I haven't figured out how to set (retDidChildProcessCrash) to the appropriate value using the Win32 API.
Is there some way to do this, or do I just need to put a note in my documentation that this feature isn't currently implemented under Windows?
Arrange for the child to communicate with the parent to indicate completion. A shared event would be one way. If the process terminates and the parent has not received notification of success, then it can conclude that the child failed.
Another option might be to use the process exit code. It will be zero on success, assuming the child follows the usual conventions. A crash will lead to an exit code indicating the form of the crash, according to this question: Predictable exit code of crashed process in Windows?
This is less reliable, though. A process might terminate because TerminateProcess was called on it with a zero exit code. So if you control both processes, the first approach is safer. If you don't control the child process, then the exit code might be your best shot. There's not much else you can get from the process handle of a terminated process.

Fork() printing multiple case in switch-case

I just started to learn Linux programming. My doubt may seem very silly to you, but I am really very confused, so please help me get through this.
Here is the code:
#include <string>
#include <iostream>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include "err.h"
using namespace std;
int main(){
    int a=-5;
    switch(a=fork()){
        case -1:
            cout<<"error\n";
            break;
        case 0:
            cout<<"here comes the child\n";
            break;
        default:
            cout<<"a is "<<a<<endl;
            // break;
    }
    return 0;
}
output:
a is 28866
here comes the child
Question1:I don't understand why both case 0: and default: gets executed !
Question2:According to me value of a should be 0 if child process is created successfully!
Question1:I don't understand why both case 0: and default: gets executed !
The case 0 is executed by the child process, where fork returns 0. The default case is executed in the parent process, where the return value of fork is the pid of the new child process.
Fork, as the documentation says, creates an exact duplicate of the calling process, including the current instruction pointer. I.e. both, the parent, and the child process will execute the switch statement.
Question2:According to me value of a should be 0 if child process is created successfully!
In the child process, yes. In the parent it's the child process's pid.
On successful execution, fork returns the process ID of the child process to the parent process, and it returns 0 to the child process. After fork returns, both the parent and the child continue executing the same code. In this case, both processes execute the switch statement. "a is 28866" is printed by the parent process (the default case), and "here comes the child" is printed by the child process (case 0). This is how checking the return value of fork makes the parent and the child execute different instructions.
See the fork(2) documentation:
On success, the PID of the child process is returned in the parent,
and 0 is returned in the child.
So in your example, you get both 28866 and 0 as return values in two separate processes (parent process and child process), which explains the output. Note that the output order could vary.
This is what fork was made for: you want to execute your program, or parts of your program, simultaneously. The return value allows you to detect which process you are in.

GDB/DDD: Debug shared library with multi-process application C/C++

I am trying to debug a server application but I am running into some difficulties breaking where I need to. The application is broken up into two parts:
A server application, which spawns worker processes (not threads) to handle incoming requests. The server basically spawns off processes which will process incoming requests first-come first-served.
The server also loads plugins in the form of shared libraries. The shared library defines most of the services the server is able to process, so most of the actual processing is done here.
As an added nugget of joy, the worker processes "respawn" (i.e. exit and a new worker process is spawned) so the PIDs of the children change periodically. -_-'
Basically I need to debug a service that's called within the shared library but I don't know which process to attach to ahead of time since they grab requests ad-hoc. Attaching to the main process and setting a breakpoint hasn't seemed to work so far.
Is there a way to debug this shared library code without having to attach to a process in advance? Basically I'd want to debug the first process that called the function in question.
For the time being I'll probably try limiting the number of worker processes to 1 with no respawn, but it'd be good to know how to handle a scenario like this in the future, especially if I'd like to make sure it still works in the "release" configuration.
I'm running on a Linux platform attempting to debug this with DDD and GDB.
Edit: To help illustrate what I'm trying to accomplish, let me provide a brief proof of concept.
#include <iostream>
#include <stdlib.h>
#include <unistd.h>
using namespace std;
void important_function( const int child_id )
{
    cout << "IMPORTANT(" << child_id << ")" << endl;
}
void child_task( const int child_id )
{
    const int delay = 10 - child_id;
    cout << "Child " << child_id << " started. Waiting " << delay << " seconds..." << endl;
    sleep(delay);
    important_function(child_id);
    exit(0);
}
int main( void )
{
    const int children = 10;
    for (int i = 0; i < children; ++i)
    {
        pid_t pid = fork();
        if (pid < 0) cout << "Fork " << i << " failed." << endl;
        else if (pid == 0) child_task(i);
    }
    sleep(10);
    return 0;
}
This program will fork off 10 processes, each of which sleeps 10 - id seconds before calling important_function. That is the function I want to debug, in the first child process that calls it (which should, here, be the last one I fork).
Setting the follow-fork-mode to child will let me follow through to the first child forked, which is not what I'm looking for. I'm looking for the first child that calls the important function.
Setting detach-on-fork off doesn't help, because it halts the parent process until the child process forked exits before continuing to fork the other processes (one at a time, after the last has exited).
In the real scenario, it is also important that I be able to attach to an already running server application that has already spawned its worker processes, and halt on the first of those that calls the function.
I'm not sure if any of this is possible since I've not seen much documentation on it. Basically I want to debug the first application to call this line of code, no matter what process it's coming from. (While it's only my application processes that'll call the code, it seems like my problem may be more general: attaching to the first process that calls the code, no matter what its origin).
You can set a breakpoint at fork(), and then issue "continue" commands until the main process's next step is to spawn the child process you want to debug. At that point, set a breakpoint at the function you want to debug, and then issue a "set follow-fork-mode child" command to gdb. When you continue, gdb should hook you into the child process at the function where the breakpoint is.
If you issue the command "set detach-on-fork off", gdb will continue debugging the child processes. The process that hits the breakpoint in the library should halt when it reaches that breakpoint. The problem is that when detach-on-fork is off, gdb halts all the child processes that are forked when they start. I don't know of a way to tell it to keep executing these processes after forking.
A solution to this I believe would be to write a gdb script to switch to each process and issue a continue command. The process that hits the function with the breakpoint should stop.
A colleague offered another solution to the problem of getting each child to continue. You can leave "detach-on-fork" on, insert a print statement in each child process's entry point that prints out its process id, and then give it a statement telling it to wait for the change in a variable, like so:
{
    volatile int foo = 1;
    printf("execute \"gdb -p %u\" in a new terminal\n", (unsigned)getpid());
    printf("once GDB is loaded, give it the following commands:\n");
    printf(" set variable foo = 0\n");
    printf(" c\n");
    while (foo == 1) __asm__ __volatile__ ("":::"memory");
}
Then, start up gdb, start the main process, and pipe the output to a file. With a bash script, you can read in the process IDs of the children, start up multiple instances of gdb, attach each instance to one of the different child processes, and signal each to continue by clearing the variable "foo".