using exec to execute a system command in a new process

I am trying to spawn a process that executes a system command while my own program continues, so that the two processes run in parallel. I am working on Linux.
I looked online, and it sounds like I should use the exec() family. But it doesn't work quite as I expected. For example, in the following code, I only see "before" printed, but not "done".
Am I missing anything?
#include <unistd.h>
#include <iostream>
using namespace std;
int main()
{
cout << "before" << endl;
execl("/bin/ls", "/bin/ls", "-r", "-t", "-l", (char *) 0);
cout << "done" << endl;
}
[UPDATE]
Thank you for your comments. Now my program looks like this. Everything works fine except that at the end I have to press Enter to finish the program. Why do I have to press that last Enter?
#include <unistd.h>
#include <iostream>
using namespace std;
int main()
{
cout << "before" << endl;
pid_t pid = fork();
cout << pid << endl;
if (pid==0) {
execl("/bin/ls", "ls", "-r", "-t", "-l", (char *) 0);
}
cout << "done" << endl;
}

You're missing a call to fork. All exec does is replace the current process image with that of the new program. Use fork to spawn a copy of your current process. Its return value will tell you whether it's the child or the original parent that's running. If it's the child, call exec.
Once you've made that change, it only appears that you need to press Enter for the programs to finish. What's actually happening is this: The parent process forks and executes the child process. Both processes run, and both processes print to stdout at the same time. Their output is garbled. The parent process has less to do than the child, so it terminates first. When it terminates, your shell, which was waiting for it, wakes and prints the usual prompt. Meanwhile, the child process is still running. It prints more file entries. Finally, it terminates. The shell isn't paying attention to the child process (its grandchild), so the shell has no reason to re-print the prompt. Look more carefully at the output you get, and you should be able to find your usual command prompt buried in the ls output above.
The cursor appears to be waiting for you to press a key. When you do, the shell prints a prompt, and all looks normal. But as far as the shell was concerned, all was already normal. You could have typed another command before. It would have looked a little strange, but the shell would have executed it normally because it only receives input from the keyboard, not from the child process printing additional characters to the screen.
If you use a program like top in a separate console window, you can watch and confirm that both programs have already finished running before you have to press Enter.
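If the goal is simply to have "done" (and the shell prompt) appear after the listing, the parent can wait for the child instead of racing it. The following is a minimal sketch of that idea; the helper name run_ls_and_wait is hypothetical and not part of the original program.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <iostream>

// Hypothetical helper: fork, exec "ls -r -t -l" in the child, and wait for
// it in the parent. Returns the child's exit status, or -1 on error.
int run_ls_and_wait()
{
    std::cout.flush();   // make sure buffered output is not duplicated in the child
    pid_t pid = fork();
    if (pid < 0)
        return -1;       // fork failed
    if (pid == 0) {
        // Child: replace this process image with /bin/ls.
        execl("/bin/ls", "ls", "-r", "-t", "-l", (char *) 0);
        _exit(127);      // reached only if execl failed
    }
    // Parent: block until the child has finished.
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_ls_and_wait() between the two cout statements should make "done" print after the file listing, so the prompt is no longer buried. If the two processes really must run in parallel, the waitpid call can be moved to the point where the parent is about to exit.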

The exec family of functions replaces the current process with the new executable.
To do what you need, call fork() and have the child process exec the new image.
[response to update]
It is doing exactly what you told it: You don't have to press "enter" to finish the program: It has already exited. The shell has already given a prompt:
[wally@zenetfedora ~]$ ./z
before
22397
done
[wally@zenetfedora ~]$ 0 << here is the prompt (as well as the pid)
total 5102364
drwxr-xr-x. 2 wally wally 4096 2011-01-31 16:22 Templates
...
The output from ls takes a while, so it buries the prompt. If you want output to appear in a more logical order, add sleep(1) (or maybe longer) before the "done".

You're missing the part where execl() replaces your current program in memory with /bin/ls.
I would suggest looking at popen(), which will fork and exec a new process, then let you read from or write to it via a pipe. (Or, if you need both read and write, fork() yourself, then exec().)
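As a rough illustration of the popen() route, here is a small sketch that runs a command through the shell and collects its standard output. The helper name read_command_output is an assumption made for this example.

```cpp
#include <stdio.h>
#include <string>

// Hypothetical helper: run a shell command with popen() and return
// everything it wrote to standard output. Returns an empty string if the
// command could not be started.
std::string read_command_output(const std::string& command)
{
    FILE* pipe = popen(command.c_str(), "r");   // "r": we read the command's stdout
    if (pipe == nullptr)
        return "";
    std::string output;
    char buffer[4096];
    size_t n;
    while ((n = fread(buffer, 1, sizeof buffer, pipe)) > 0)
        output.append(buffer, n);
    pclose(pipe);   // waits for the command to terminate
    return output;
}
```

For example, read_command_output("ls -r -t -l") would return the listing as a string. Note that this variant blocks until the command finishes, so it fits cases where the parent needs the output rather than true parallel execution.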

Related

ubuntu server pipeline stop process termination when the first exit

The situation is: I have an external application, so I don't have the source code and I can't change it. While running, the application writes logs to stderr. The task is to write a program that checks its output and separates part of the output into another file. My solution is to start the app like this:
./externalApp 2>&1 | myApp
where myApp is a C++ app with the following source:
#include <fstream>
#include <iostream>
#include <string>
using namespace std;
int main ()
{
string str;
ofstream A;
A.open("A.log");
ofstream B;
B.open("B.log");
A << "test start" << endl;
int i = 0;
while (getline(cin,str))
{
if(str.find("asdasd") != string::npos)
{
A << str << endl;
}
else
{
B << str << endl;
}
++i;
}
A << "test end: " << i << " lines" << endl;
A.close();
B.close();
return 0;
}
The externalApp can crash or be terminated. At that moment myApp gets terminated too, so it doesn't write the last lines and doesn't close the files. The file can be 60 GB or larger, so saving it and processing it afterwards is not an option.
Correction: My problem is that when the externalApp crashes, it terminates myApp. That means any code after the while block never runs. So the question is: is there a way to keep myApp running even after the externalApp has exited?
How can I do this task correctly? I am interested in any other ideas for doing it.

There's nothing wrong with the shown code, and nothing in your question offers any evidence of anything being wrong with the shown code. No evidence was shown that your logging application actually received "the last lines" to be written from that external application. Most likely that external application simply failed to write them to standard output or error, before crashing.
The most likely explanation is that your external application checks if its standard output or error is connected to an interactive terminal; if so each line of its log message is followed by an explicit buffer flush. When the external application's standard output is a pipe, no such flushing takes place, so the log messages get buffered up, and are flushed only when the application's internal output buffer is full. This is a fairly common behavior. But because of that, when the external application crashes its last logged lines are lost forever. Because your logger never received them. Your logger can't do anything about log lines it never read.
In your situation, the only available option is to set up and connect a pseudo-tty device to the external application's standard output and error, making it think that's connected to an interactive terminal, while its output is actually captured by your application.
You can't do this from the shell. You need to write some code to set this up. You can start by reading the pty(7) manual page which explains the procedure to follow, at which point you will end up with file descriptors that you can take, and attach to your external application.

If you want your program to cleanly deal with the external program crashing you will probably need to handle SIGPIPE. The default behaviour of this signal is to terminate the process.

So the problem was not that the second element of the pipe was terminated when the first one ended. The real problem was that the two piped apps were launched from a bash script, and when the bash script ended, all of its child processes were terminated. I solved it using
signal(SIGHUP,SIG_IGN);
That way my app ran to the end.
Thank you for all the answers; at least I learned a lot about signals and pipes.
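The same effect as the signal() call above can be sketched with sigaction(), which is generally the more portable interface. The helper name ignore_sighup is hypothetical; it would be called once near the start of main() in myApp.

```cpp
#include <signal.h>

// Hypothetical helper: ignore SIGHUP so the process is not killed when the
// launching script or terminal session goes away. Returns true on success.
bool ignore_sighup()
{
    struct sigaction action {};
    action.sa_handler = SIG_IGN;      // discard the signal instead of terminating
    sigemptyset(&action.sa_mask);
    return sigaction(SIGHUP, &action, nullptr) == 0;
}
```

With SIGHUP ignored, the getline loop should keep running until it sees end-of-file on the pipe, after which the summary line is written and both log files are closed normally.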

GDB/DDD: Debug shared library with multi-process application C/C++

I am trying to debug a server application but I am running into some difficulties breaking where I need to. The application is broken up into two parts:
A server application, which spawns worker processes (not threads) to handle incoming requests. The server basically spawns off processes which will process incoming requests first-come first-served.
The server also loads plugins in the form of shared libraries. The shared library defines most of the services the server is able to process, so most of the actual processing is done here.
As an added nugget of joy, the worker processes "respawn" (i.e. exit and a new worker process is spawned) so the PIDs of the children change periodically. -_-'
Basically I need to debug a service that's called within the shared library but I don't know which process to attach to ahead of time since they grab requests ad-hoc. Attaching to the main process and setting a breakpoint hasn't seemed to work so far.
Is there a way to debug this shared library code without having to attach to a process in advance? Basically I'd want to debug the first process that called the function in question.
For the time being I'll probably try limiting the number of worker processes to 1 with no respawn, but it'd be good to know how to handle a scenario like this in the future, especially if I'd like to make sure it still works in the "release" configuration.
I'm running on a Linux platform attempting to debug this with DDD and GDB.
Edit: To help illustrate what I'm trying to accomplish, let me provide a brief proof of concept.
#include <iostream>
#include <stdlib.h>
#include <unistd.h>
using namespace std;
void important_function( const int child_id )
{
cout << "IMPORTANT(" << child_id << ")" << endl;
}
void child_task( const int child_id )
{
const int delay = 10 - child_id;
cout << "Child " << child_id << " started. Waiting " << delay << " seconds..." << endl;
sleep(delay);
important_function(child_id);
exit(0);
}
int main( void )
{
const int children = 10;
for (int i = 0; i < children; ++i)
{
pid_t pid = fork();
if (pid < 0) cout << "Fork " << i << " failed." << endl;
else if (pid == 0) child_task(i);
}
sleep(10);
return 0;
}
This program forks off 10 processes, each of which sleeps 10 - id seconds before calling important_function, the function I want to debug in the first child process to call it (which should, here, be the last one I fork).
Setting the follow-fork-mode to child will let me follow through to the first child forked, which is not what I'm looking for. I'm looking for the first child that calls the important function.
Setting detach-on-fork off doesn't help, because it halts the parent process until the child process forked exits before continuing to fork the other processes (one at a time, after the last has exited).
In the real scenario, it is also important that I be able to attach to an already running server application that has already spawned its workers, and halt on the first of those that calls the function.
I'm not sure if any of this is possible since I've not seen much documentation on it. Basically I want to debug the first application to call this line of code, no matter what process it's coming from. (While it's only my application processes that'll call the code, it seems like my problem may be more general: attaching to the first process that calls the code, no matter what its origin).

You can set a breakpoint at fork(), and then issue "continue" commands until the main process's next step is to spawn the child process you want to debug. At that point, set a breakpoint at the function you want to debug, and then issue a "set follow-fork-mode child" command to gdb. When you continue, gdb should hook you into the child process at the function where the breakpoint is.
If you issue the command "set detach-on-fork off", gdb will continue debugging the child processes. The process that hits the breakpoint in the library should halt when it reaches that breakpoint. The problem is that when detach-on-fork is off, gdb halts all the child processes that are forked when they start. I don't know of a way to tell it to keep executing these processes after forking.
A solution to this I believe would be to write a gdb script to switch to each process and issue a continue command. The process that hits the function with the breakpoint should stop.
A colleague offered another solution to the problem of getting each child to continue. You can leave "detach-on-fork" on, insert a print statement in each child process's entry point that prints out its process id, and then give it a statement telling it to wait for the change in a variable, like so:
{
volatile int foo = 1;
printf("execute \"gdb -p %u\" in a new terminal\n", (unsigned)getpid());
printf("once GDB is loaded, give it the following commands:\n");
printf(" set variable foo = 0\n");
printf(" c\n");
while (foo == 1) __asm__ __volatile__ ("":::"memory");
}
Then, start up gdb, start the main process, and pipe the output to a file. With a bash script, you can read in the process IDs of the children, start up multiple instances of gdb, attach each instance to one of the different child processes, and signal each to continue by clearing the variable "foo".

Strange behavior with boost file_sink when forking

I'm observing some strange behavior when I use a file_sink (in boost::iostreams) and then fork() a child process.
The child continues the same codebase, i.e., no exec() call, because this is done as part of daemonizing the process. My full code fully daemonizes the process, of course, but I have omitted those steps that are unnecessary for reproducing the behavior.
The following code is a simplified example that demonstrates the behavior:
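#include <boost/iostreams/device/file.hpp>
#include <boost/iostreams/stream_buffer.hpp>
#include <cassert>
#include <cstdlib>
#include <iostream>
#include <unistd.h>
using namespace std;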
namespace io = boost::iostreams;
void daemonize(std::ostream& log);
int main (int argc, char** argv)
{
io::stream_buffer<io::file_sink> logbuf;
std::ostream filelog(&logbuf);
//std::ofstream filelog;
// Step 1: open log
if (argc > 1)
{
//filelog.open(argv[1]);
logbuf.open(io::file_sink(argv[1]));
daemonize(filelog);
}
else
daemonize(std::cerr);
return EXIT_SUCCESS;
}
void daemonize(std::ostream& log)
{
log << "Log opened." << endl;
// Step 2: fork - parent stops, child continues
log.flush();
pid_t pid = fork(); // error checking omitted
if (pid > 0)
{
log << "Parent exiting." << endl;
exit(EXIT_SUCCESS);
}
assert(0 == pid); // child continues
// Step 3: write to log
sleep(1); // give parent process time to exit
log << "Hello World!" << endl;
}
If I run this with no argument (e.g., ./a.out), so that it logs to stderr, then I get the expected output:
Log opened.
Parent exiting.
Hello World!
However, if I do something like ./a.out temp; sleep 2; cat temp then I get:
Log opened.
Hello World!
So the parent is somehow no longer writing to the file after the fork. That's puzzle #1.
Now suppose I just move io::stream_buffer<io::file_sink> logbuf; outside of main so that it's a global variable. Doing that and simply running ./a.out gives the same expected output as in the previous case, but writing to a file (e.g., temp) now gives a new puzzling behavior:
Log opened.
Parent exiting.
Log opened.
Hello World!
The line that writes "Log opened." is before the fork(), so I don't see why it should appear twice in the output. (I even put an explicit flush() immediately before the fork() to make sure that line of output wasn't simply buffered, with the buffer then copied during the fork() and both copies eventually flushed to the stream...) So that's puzzle #2.
Of course, if I comment out the whole fork() process (the entire section labeled as "Step 2") then it behaves as expected for both file and stderr output, and regardless of whether logbuf is global or local to main().
Also, if I switch filelog to be an ofstream instead of stream_buffer<file_sink> (see commented out lines in main()) then it also behaves as expected for both file and stderr output, and regardless of whether filelog/logbuf are global or local to main().
So it really seems that it's an interaction between file_sink and fork() producing these strange behaviors... If anyone has ideas on what may be causing these, I'd appreciate the help!

I think I got it figured out... creating this answer for posterity / anyone who stumbles on this question looking for an answer.
I observed this behavior in boost 1.40, but when I tried it using boost 1.46 everything behaved in the expected manner in all cases, i.e.:
Log opened.
Parent exiting.
Hello World!
So my assumption right now is that this was actually a bug in boost that was fixed sometime between versions 1.41 and 1.46. I didn't see anything in the release notes that made it obvious that they found and fixed the bug, but it's possible the release notes discussed fixing some underlying cause of this bug and I wasn't able to make the connection between that underlying cause and this scenario.
In any case, the solution seems to be to install boost version >= 1.46

program stops after execvp( command.argv[0], command.argv)

I am writing a small shell program that takes a command and executes it. If the user enters an invalid command, the if statement returns -1. If the command is valid, it executes the command; however, once the command has run, the program ends. What am I doing wrong that it does not execute the lines of code after it? I have tested execvp( command.argv[0], command.argv) with the ls and cat commands, so I am pretty sure it works. Here is my code.
int shell(char *cmd_str ){
int commandLength=0;
cmd_t command;
commandLength=make_cmd(cmd_str, command);
cout<< commandLength<<endl;
cout << command.argv[0]<< endl;
if( execvp( command.argv[0], command.argv)==-1)
//if the command is executed, nothing runs after this line
{
commandLength=-1;
}else
{
cout<<"work"<<endl;
}
cout<< commandLength<<endl;
return commandLength;
}

From the man page of execvp(3):
The exec() family of functions replaces the current process image with a new process image.
So your current process image is overwritten with the image of your command! Hence you need to use a fork+exec combination, so that your command executes in the child process and your current process continues safely as the parent.
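For a small shell, the fork+exec combination could look roughly like the sketch below. The helper name run_command and its vector-of-strings interface are assumptions for illustration; the original code uses its own cmd_t type, which is not shown here. The exit code 127 for "could not exec" follows the usual shell convention.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <string>
#include <vector>

// Hypothetical helper for a small shell: run one command in a child process
// and return its exit status, or -1 if it could not be run.
int run_command(const std::vector<std::string>& args)
{
    if (args.empty())
        return -1;
    // execvp expects a null-terminated array of char*.
    std::vector<char*> argv;
    for (const std::string& arg : args)
        argv.push_back(const_cast<char*>(arg.c_str()));
    argv.push_back(nullptr);

    pid_t pid = fork();
    if (pid < 0)
        return -1;                 // fork failed
    if (pid == 0) {
        execvp(argv[0], argv.data());
        _exit(127);                // reached only if execvp failed
    }
    // Parent: the shell itself keeps running and waits for the command.
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) == 127)
        return -1;                 // killed by a signal, or exec failed
    return WEXITSTATUS(status);
}
```

Because execvp runs only in the child, the code after the call in the parent (printing the result, reading the next command) still executes. One limitation of this sketch is that a command which itself exits with status 127 is indistinguishable from a failed exec.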

From the documentation on exec:
The exec() family of functions replaces the current process image with a new process image. The functions described in this manual page are front-ends for execve(2). (See the manual page for execve(2) for further details about the replacement of the current process image.)
If you want your process to continue, this is not the function you want to use.

@Pavan - Just for nit-pickers like myself, technically the statement "current process is gone" is not true. It's still the same process, with the same pid, just overwritten with a different image (code, data etc).

How to prevent a Linux program from running more than once?

What is the best way to prevent a Linux program/daemon from being executed more than once at a given time?

The most common way is to create a PID file: define a location where the file will go (inside /var/run is common). On successful startup, you'll write your PID to this file. When deciding whether to start up, read the file and check to make sure that the referenced process doesn't exist (or if it does, that it's not an instance of your daemon: on Linux, you can look at /proc/$PID/exe). On shutdown, you may remove the file but it's not strictly necessary.
There are scripts to help you do this, you may find start-stop-daemon to be useful: it can use PID files or even just check globally for the existence of an executable. It's designed precisely for this task and was written to help people get it right.

Use the boost interprocess library to create a memory block that will be created by the process. If it already exists, it means that there is another instance of the process. Exit.
The more precise link to what you need would be this one.
#include <boost/interprocess/shared_memory_object.hpp>
#include <boost/scoped_ptr.hpp>
int main()
{
using namespace boost::interprocess;
boost::scoped_ptr<shared_memory_object> createSharedMemoryOrDie;
try
{
createSharedMemoryOrDie.reset(
new shared_memory_object(create_only, "shared_memory", read_write));
} catch(...)
{
// executable is already running
return 1;
}
// do your thing here
}

If you have access to the code (i.e. are writing it):
create a temporary file and lock it, removing it when done; return 1 if the file already exists, or
list the running processes and return 1 if the process name is in the list.
If you don't:
create a launcher wrapper for the program that does one of the above.
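The lock-file variant could be sketched with flock() as follows. The helper name acquire_instance_lock is hypothetical, and the lock file path is whatever the program chooses. This sketch relies on the lock itself rather than on the file's existence, since the kernel drops the lock when the process exits, even after a crash.

```cpp
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

// Hypothetical helper: try to take an exclusive, non-blocking lock on a
// lock file. Returns the open descriptor on success (keep it open for the
// life of the process), or -1 if another holder has the lock or the file
// cannot be opened.
int acquire_instance_lock(const char* path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
        close(fd);       // someone else already holds the lock
        return -1;
    }
    return fd;
}
```

A program would call this once at startup and return 1 from main() if the result is -1. The file does not need to be removed at exit, which avoids the stale-file problem of a plain existence check.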

I do not know what your exact requirement is, but I had a similar one. In that case I started my daemon from a shell script (it was an HP-UX machine), and before starting the daemon I checked whether an executable with the same name was already running. If it was, I didn't start a new one.
This way I was also able to control the number of instances of a process.

I think this scheme should work (and is also robust against crashes):
Precondition: There is a PID file for your application (typically in /var/run/)
1. Try to open the PID file
2. If it does not exist, create it and write your PID to it. Continue with the rest of the program
3. If it exists, read the PID
4. If the PID is still running and is an instance of your program, then exit
5. If the PID does not exist or is used by another program, remove the PID file and go to step 2.
6. At program termination, remove the PID file.
The loop in step 5 ensures that, if two instances are started at the same time, only one will be running in the end.
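Steps 3 and 4 of this scheme can be sketched as below. The helper name pid_file_process_alive is hypothetical, and the sketch only checks that some process with the stored PID exists; verifying that it is an instance of your program (for example via /proc/<pid>/exe on Linux) is left out.

```cpp
#include <cerrno>
#include <fstream>
#include <signal.h>
#include <sys/types.h>

// Hypothetical helper: returns true if the PID stored in the given file
// refers to a process that currently exists.
bool pid_file_process_alive(const char* path)
{
    std::ifstream in(path);
    long pid = 0;
    if (!(in >> pid) || pid <= 0)
        return false;        // missing, empty, or malformed PID file
    // Signal 0 performs only the existence and permission checks.
    if (kill(static_cast<pid_t>(pid), 0) == 0)
        return true;
    return errno == EPERM;   // the process exists but belongs to another user
}
```

If this returns false, the PID file is stale and can be replaced with the current PID as in step 5.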

Have a PID file and, on startup, do a 'kill -0 <pid>', where <pid> is the value read from the file. If the exit status is != 0, then the daemon is not alive and you may restart it.
Another approach would be to bind to a port and handle the bind exception on the second attempt to start the daemon. If the port is in use then exit otherwise continue running the daemon.
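The port-binding approach might look like the following sketch. The helper name bind_instance_port is hypothetical, and the port number is an arbitrary choice the program must reserve for itself. The socket is bound to the loopback address only and never listens, since it serves purely as a marker.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdint>

// Hypothetical helper: bind a TCP socket to 127.0.0.1:port. Returns the
// socket descriptor on success (keep it open while the daemon runs), or -1
// if the port is already taken or the socket cannot be created.
int bind_instance_port(std::uint16_t port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    sockaddr_in addr {};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    // SO_REUSEADDR is deliberately not set, so a second bind fails.
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof addr) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

As with a file lock, the operating system releases the port when the process dies, so there is nothing to clean up after a crash. The drawback is that an unrelated program using the same port would be mistaken for a running instance.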

I believe my solution is the simplest:
(Don't use it if a race condition is a possible scenario, but in any other case this is a simple and satisfying solution.)
#include <sys/types.h>
#include <unistd.h>
#include <cstdlib>
#include <iostream>
#include <sstream>
int main()
{
    // get this process's pid
    pid_t pid = getpid();
    // compose a shell command that checks whether another process
    // with the same name as yours, but a different pid, is running
    std::stringstream command;
    command << "ps -eo pid,comm | grep <process name> | grep -v " << pid;
    // system() returns 0 when the last grep in the pipeline matched a line
    int isRunning = system(command.str().c_str());
    if (isRunning == 0) {
        std::cout << "Another process already running. Exiting." << std::endl;
        return 1;
    }
    return 0;
}
