to system() or fork()/exec()? - c++

There appear to be two common ways of running an external executable from C in unix, the
system()
call and
pid = fork();
switch (pid)
// switch statement based on the return value of fork(),
// one branch of which will include an exec() call
Is there any reason to prefer a fork/exec over system in the case where they are functionally equivalent (parent process waits for child to finish, no complex information is returned from child)?

system executes a command-interpreter, i.e. a shell, which (a) is slower than a direct fork/exec, (b) may behave differently on different systems and (c) is a potential security hazard if you pass it a string from an untrusted source. Also, system waits for the child process to exit, while you might want it to run concurrently with the parent process.
More generally, the low-level fork/exec gives you additional control: before or in between the two operations, you might want to chdir, open pipes, close file descriptors, set up shared memory, etc.
(By different systems, I don't mean Windows vs. Unix (as Windows doesn't even have fork): I'm talking Red Hat Linux vs. Ubuntu. The former uses Bash to execute what is passed to system, the latter a lightweight POSIX-compatible shell.)
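To make the comparison concrete, here is a minimal sketch of the fork/exec/wait pattern doing roughly what system() does, minus the shell. The helper name run_without_shell is made up for illustration, and the error handling is deliberately bare:

```cpp
#include <cassert>
#include <cerrno>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs args[0] with the given arguments, without a shell.
// Returns the child's exit code, or -1 if the child could not be forked,
// could not be waited for, or was killed by a signal.
int run_without_shell(const std::vector<std::string>& args) {
    assert(!args.empty());

    // execvp() expects a null-terminated array of char*.
    std::vector<char*> argv;
    for (const std::string& a : args) {
        argv.push_back(const_cast<char*>(a.c_str()));
    }
    argv.push_back(nullptr);

    pid_t pid = fork();
    if (pid < 0) {
        return -1;                      // fork failed
    }
    if (pid == 0) {
        // Child: this is the place for chdir(), dup2(), close(), etc.
        execvp(argv[0], argv.data());
        _exit(127);                     // only reached if execvp() failed
    }

    // Parent: wait for the child, as system() would.
    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            return -1;
        }
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called as run_without_shell({"ls", "-l"}), the arguments reach the program exactly as given. Nothing is re-parsed by a shell, which is what removes the injection risk mentioned in (c).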

fork() creates a new process. If you don't need to do that, just use system() (or popen()). You might want a second process to achieve parallelism, or for finer-grained control over the job, but often you just don't care for that if the job is meant to be synchronous.
On the other hand, I find that 95% of uses of system() are unnecessary or would somehow be better off done another way (e.g. using zlib instead of system("gzip")). So maybe the best answer is to use neither!

Going via system() additionally invokes a shell process, which might not be what you want.
Also, the calling process is notified only when that shell exits, not when the actual process run by the shell exits.
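As a rough illustration of that point, the status returned by system() is the wait status of the shell. A small sketch of decoding it on a POSIX system follows; the helper name shell_exit_code is invented for this example:

```cpp
#include <cassert>
#include <cstdlib>
#include <sys/wait.h>

// Hypothetical helper: runs `command` through system() and returns the exit
// code reported for the shell, or -1 if the shell could not be started or
// did not exit normally.
int shell_exit_code(const char* command) {
    assert(command != nullptr);
    int status = std::system(command);
    if (status == -1) {
        return -1;                      // no child process could be created
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For a command such as "sleep 5 &", the shell backgrounds the job and exits right away, so the value you get back describes the shell and tells you nothing about how sleep eventually finishes.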

system() runs the command as if a user had typed it at the command line.
I have mostly seen it used like system("pause"); system("cls");
But if you need to control the child process, you want to fork.

Related

What is the difference between using a win32 function and passing a cmd command through system()?

Should I use Win32 functions if a terminal command can do the same thing through system()? Like, what is the difference between using a Win32 function to, for example, create a file and using system("echo >> file.txt")?
should i use win32 functions if a terminal command can do the same thing through system();?
This asks for opinions, but I'll go out on a limb and say: almost never use system().
One category of reasons to actually use it is when you need a program that doesn't come with a library/API but has an understandable command-line interface. I place all usage of system() in the "last resort" bin. It's almost never what you should aim for.
Like, what is the difference between using a win32 function to, for example, create a file and using system("echo >> file.txt");.
The system() call on Windows and POSIX platforms creates a child process (cmd.exe, /bin/sh or whatever shell is the default) to execute the command. The main process then just waits for the child process to finish.
This sub shell has to interpret the command, validate it and print all sorts of meaningless messages to stdout when you as a programmer are usually interested in if creating the file succeeded or not. All this usually takes a lot more system resources than just calling the API at hand.
The shells are also different beasts and come in different versions so your program may behave differently if deployed on different versions of your target platform. This isn't detected at program startup as it would be if your program linked with the proper libraries instead. Different shells have different security vulnerabilities and programs using system() have historically been successfully attacked. Since you probably don't know the version of the shell you use, you also have no control over this part.
That said, I don't recommend calling the WinAPI directly to accomplish what your example does. Using standard C++ should do quite well:
// requires #include <fstream>
if (std::ofstream os("file.txt", std::ios::app); os) { // C++17 init-statement
    os << '\n';
} else {
    // failed to open "file.txt" for writing
}
Edit: dxiv pointed out that my suggestion doesn't do what echo >> file.txt does when using cmd.exe as a shell. This would apparently be the equivalent:
os << "ECHO is on.\n";
As dxiv also pointed out, this reinforces both "may behave differently" and "print all sorts of meaningless messages".
system() creates a new instance of the command interpreter (cmd.exe) and runs the specified command in it. There's a substantial overhead to creating a new process, so you should only do so if you absolutely need to. This is a dumb analogy, but it would be like saying, hmm, I need a refrigerator, I'll build a whole house so I can use the fridge.
Furthermore, you'll find that if you use system(), the created process's stdin and stdout are not accessible to your program without further work. So you can't tell much about what it did other than by reading the return code.
Compare that to calling Windows API functions to accomplish your task, which are much faster since they don't have the process creation overhead, provide rich error codes, and on top of that also provide tons and tons more functionality than does cmd.exe.

Is using popen() in C/C++ a bad coding practice?

I want to change the timezone for a Linux system. I know there are many ways.
One way is to use the tzset() function and another is to call the 'timedatectl' command through popen().
I am using the second approach, i.e. popen().
I just want to ask: is it good programming practice to use popen() in your code?
Also, I am careful to call pclose() for every popen().
There is nothing wrong about popen in general, if you really need a child process to do a specific job for you.
popen creates a pipe allowing you to either read the output (what it wrote to stdout) of the child process or write input to its stdin - but not both at the same time.
If you are not interested in either option, you might prefer calling system instead (note, however, that system waits for the process to terminate, whereas popen returns immediately and it is pclose that waits).
But why would you want to create a separate process, if you can do the same job by simply calling an ordinary function (system call or not)? You are creating a lot of overhead using a process then (process must be initialised and hooked into OS, it needs its own memory for executable code, heap and stack, ...)!
It gets a little more complicated, if the job in question requires a considerable amount of time and you cannot afford to wait for the function to complete, but need to do some other stuff. However, in such a case, I'd rather create a thread only and again call the function from there...
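For the case where a child process really is the right tool, a minimal sketch of reading a command's output through popen() could look like this. The helper name read_command_output is an assumption of this example, and a POSIX system is assumed:

```cpp
#include <cassert>
#include <stdio.h>   // popen(), pclose(), fgets() (POSIX)
#include <string>

// Hypothetical helper: runs `command` through popen() and returns everything
// the command wrote to stdout. Returns an empty string if popen() fails.
std::string read_command_output(const char* command) {
    assert(command != nullptr);
    std::string output;

    FILE* pipe = popen(command, "r");   // "r": we read the child's stdout
    if (pipe == nullptr) {
        return output;
    }

    char buffer[256];
    while (fgets(buffer, sizeof buffer, pipe) != nullptr) {
        output += buffer;
    }

    pclose(pipe);                       // this is the call that waits for the child
    return output;
}
```

Opening the pipe with "w" instead would let you write to the child's stdin, but as noted above you only get one direction per popen() call.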
popen() invokes a shell to run the command which is an extra unnecessary layer of indirection. Plus there are all sorts of security pitfalls, for instance, you don't have control over the environment - or which shell actually gets invoked.
I'd say it's fine for prototypes and proofs of concept, but for production code you should use fork(), one of the execs and pipes for IO.
EDIT
If there is a function equivalent to what you would do by invoking a command, always use that first. For example, if you can achieve what you want with tzset(), always use that in preference to spawning a new process.
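The fork(), exec and pipe combination suggested above might be sketched roughly as follows. The name capture_stdout is hypothetical, error handling is minimal, and the sketch assumes the caller's stdout is open, so the new pipe descriptors do not land on descriptor 1:

```cpp
#include <cassert>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs args[0] directly (no shell) and returns what it
// wrote to stdout. Returns an empty string if pipe() or fork() fails.
std::string capture_stdout(const std::vector<std::string>& args) {
    assert(!args.empty());

    std::vector<char*> argv;
    for (const std::string& a : args) {
        argv.push_back(const_cast<char*>(a.c_str()));
    }
    argv.push_back(nullptr);

    int fds[2];                         // fds[0]: read end, fds[1]: write end
    if (pipe(fds) != 0) {
        return "";
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return "";
    }
    if (pid == 0) {
        // Child: make the pipe's write end its stdout, then exec.
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        execvp(argv[0], argv.data());
        _exit(127);                     // only reached if execvp() failed
    }

    // Parent: close the write end so read() sees EOF when the child is done.
    close(fds[1]);
    std::string output;
    char buffer[256];
    ssize_t n;
    while ((n = read(fds[0], buffer, sizeof buffer)) > 0) {
        output.append(buffer, static_cast<std::size_t>(n));
    }
    close(fds[0]);

    int status = 0;
    waitpid(pid, &status, 0);           // reap the child
    return output;
}
```

Unlike popen(), the arguments are passed as a vector and never go through a shell, and the child's environment and descriptors are yours to set up before the exec.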

Multithreaded program and fork(): alternative or safe implementation

In a multithreaded Linux/C++-program, I want to use fork() with a signal handler for SIGCHLD.
In the child process I use open() to create two new file descriptors, sendfile() and close(), then the child exits.
I planned to use fork() to implement the following requirements:
The threads in the parent process shall be able to
1. detect the normal termination of the child process, and in that case be able to create another fork() doing the open()/sendfile()/close() for a range of files
2. kill the sendfile() child process in case of a specific event and detect the intentional termination to clean up
For requirement 1 I could just wait for the result of sendfile().
Requirement 2 is why I think I need to use fork() in the first place.
After reading the following posts
Threads and fork(). How can I deal with that?
fork in multi-threaded program
I think that my solution might not be a good one.
My questions are:
Is there any other solution to implement requirement 2 ?
Or how can I make sure that the library calls open(), close() and sendfile() will be okay?
Update:
The program will run on a Busybox Linux / ARM
I've assumed that I should use sendfile() for having the most efficient file transfer due to several posts I've read regarding this topic.
A safe way to implement my requirement could be using fork() and exec*() with cp, with the disadvantage that the file transfer might be less efficient
Update 2:
It's sufficient to fork() once in case of a specific event (instead of once per file), since I switched to exec*() with rsync in the child process. However, the program needs to invoke that rsync every time the specific event occurs.
You can use threads, but forcefully terminating threads typically leads to memory leaks and other problems.
My Linux experience is somewhat limited, but I would probably try to fork the program early, before it becomes multithreaded. Now that you have two instances, the single-threaded instance can safely be used to manage the starting and stopping of additional instances.
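Whichever process ends up doing the fork(), requirement 2 (kill the child and detect that the termination was intentional) can be sketched along these lines. The names start_worker and stop_worker are invented for illustration, and /bin/sleep is only a stand-in for the real transfer command (cp or rsync in the question); its path is an assumption about the target system:

```cpp
#include <cassert>
#include <signal.h>      // kill(), SIGTERM
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: starts a long-running child and returns its pid
// (or -1 if fork() failed). Only fork(), execl() and _exit() are used
// between fork and exec.
pid_t start_worker() {
    pid_t pid = fork();
    if (pid == 0) {
        execl("/bin/sleep", "sleep", "30", static_cast<char*>(nullptr));
        _exit(127);                     // only reached if execl() failed
    }
    return pid;
}

// Hypothetical helper: terminates the child and reports how it ended.
// Returns the terminating signal number, 0 if the child had already exited
// on its own, or -1 on error.
int stop_worker(pid_t pid) {
    assert(pid > 0);
    if (kill(pid, SIGTERM) != 0) {
        return -1;
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

A return value of SIGTERM from stop_worker() means the child was ended by your own signal, so the clean-up path can run. A return of 0 means it had finished by itself before the signal took effect.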

Linux - Is there a way to call to system call (bash scripts) without forking a new process?

In my C++ apps, I need to run several bash scripts (and sometimes regular system commands, e.g. "shutdown", "rm").
When using the system() call, it forks a new process.
Is there a way to call system() without forking a new process?
Not easily. But if you're willing to add complexity to your setup, you could:
make your program itself do the system calls that rm, shutdown, etc would do (you'd be reinventing the wheel, though)
write a script that listens to commands given on a port and executes the commands (and keep the script running -- perhaps as a daemon). Complex and fragile...
Then you wouldn't need to fork() or call exec... But it's better to just fork a new process, or use exec -- I see no advantage in doing things differently in this situation.
Well, actually, there are analogs of these bash commands in the standard C library (libc).
But nevertheless, if you want to try... what about exploits?
I mean, in that case there would be no syscalls (excluding the exploit's).
No, because running a script means running an instance of bash and/or other executable binary.
That said, for most of these regular commands there is a corresponding C function, which of course doesn't fork a process (for example, unlink() for rm).
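As a small sketch of the unlink()-for-rm idea (the wrapper name remove_file is made up here), the direct call also hands you a precise error code instead of a shell's exit status:

```cpp
#include <cassert>
#include <cerrno>
#include <unistd.h>

// Hypothetical helper: removes a file with unlink() instead of system("rm ...").
// Returns 0 on success, or the errno value describing why it failed.
int remove_file(const char* path) {
    assert(path != nullptr);
    return unlink(path) == 0 ? 0 : errno;
}
```

For a path that doesn't exist this returns ENOENT, which the caller can handle directly. No extra process is created at any point.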
EDIT: I know about exec* functions, but if OP wants to run several commands they are useless, maybe except for running the last one.

System() calls in C++ and their roles in programming

I've often heard that using system("PAUSE") is bad practice and to use std::cin.get() instead. Now my understanding of system calls is that they take a string which they enter into a system command line and talk with the OS, so PAUSE is a DOS command that pauses the output in the command window. I assume this works similarly with Mac and unix with different keywords, and using system calls is discouraged because of a lack of cross OS compatibility. (If I'm wrong with any of this, please correct me)
My question is this: When is it appropriate to use system() calls? How should they be applied? When should they NOT be applied?
system("PAUSE") is certainly less than ideal. Using a call to system creates a subprocess, which on Windows is fairly expensive and in any case not terribly cheap on any operating system. On embedded systems the memory overhead is significant.
If there is any way to do it natively without much pain, then do it. In the case of waiting for the user to press a single button, cin.get() will be very hard to beat. In this case, your application's process will just block on stdin, setting only a few flags visible to the kernel and, most importantly, allocating no new memory and creating no new scheduling entities, not even an interrupt handler.
Additionally, it will work the same on all operating systems with all c++ compilers, since it uses only a very basic feature of a very standard part of the language, rather than depend on anything the OS provides.
EDIT: predicting your concern that it doesn't matter if it's expensive because the whole idea is to pause. Well, first off, if it's expensive, then it's going to hurt performance for anything else that might be going on. Ever notice (on Windows) that when one application is launching, other, already open apps become less responsive too? Additionally, your user might not be a live human, but rather another program working on behalf of a human user (say, a shell script). The script already knows what to do next and can pre-fill stdin with a character to skip over the wait. If you have used a subprocess here, the script will experience a delay (one noticeable to a human). If the script is doing this hundreds (or hundreds of millions!) of times, a script that could take seconds to run now takes days or years.
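A rough sketch of the cin.get() approach, written against generic streams so that the "script pre-fills stdin" scenario can be shown without a keyboard. Both function names are invented for this example:

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: prints `prompt`, then blocks until a newline arrives
// on `in`. Returns false if the input ended before a newline was seen.
bool wait_for_enter(std::istream& in, std::ostream& out, const char* prompt) {
    assert(prompt != nullptr);
    out << prompt << std::flush;
    int c;
    do {
        c = in.get();                   // same call as std::cin.get()
    } while (c != '\n' && c != std::istream::traits_type::eof());
    return c == '\n';
}

// Stand-in for a script that pre-fills the input: the "pause" returns
// immediately because the newline is already waiting in the stream.
bool scripted_pause(const std::string& prefilled_input) {
    std::istringstream in(prefilled_input);
    std::ostringstream discarded_prompt;
    return wait_for_enter(in, discarded_prompt, "Press Enter to continue...");
}
```

In an interactive program the call would simply be wait_for_enter(std::cin, std::cout, "Press Enter to continue...").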
EDIT2: when to use system(): when you need to do something that another process does, that you can't do easily. system() isn't always the best candidate because it does two things that are somewhat limiting. First, the only way to communicate with the subprocess is by command line arguments as input and return value as output. The second is that the parent process blocks until the child process has completed. These two factors limit the cases in which system is useable.
On Unix-like systems, most subprocesses are created with fork, because it allows the same program to continue from the same point as two separate processes, one a child of the other (which is hardly noticeable unless you ask the OS about it). On Linux, this is especially well optimized, and about as cheap as creating a pthread. Even on systems where this is not as fast, it is still very useful (as demonstrated by the Apache process-pool methodology). fork is unavailable on Windows.
Other cases (on Windows too!) are often handled by popen or the exec family of functions. popen creates a subprocess and a brand new pipe connecting to the subprocess's stdin or stdout. Both parent and child processes can then run concurrently and communicate quite easily.
The exec* family of functions (there are several: execl, execv and so on), on the other hand, causes the current program to be replaced by the new program. The original program vanishes invisibly and the new program takes over within the same process. When the new program exits, it returns to whatever called the original process, as if that process had returned at that point instead of vanishing. The advantage of this over exit(system("command")) is that no new process is created, saving time and memory (though not always terribly much).
system could plausibly be used by some scripted tool to invoke several steps in some recipe action. For example, at a certain point, a program could use system to invoke a text editor to edit some configuration file. It need not concern itself too much with what happens, but it should certainly wait until the user has saved and closed the editor before continuing. It can then use the return value to find out if the editing session was successful, in the sense that the editor actually opened the requested file (and that the editor itself existed at all!), but will read the actual results of the session from the edited file directly, rather than communicating with the subprocess.
system() calls are sent to the shell or command-line interpreter of the OS (DOS, bash, etc.), and it's up to the shell to do what it wants with the command.
You would avoid this kind of call, as it reduces your program's portability to other operating systems. I would think you should use such calls only when you are absolutely sure that your code is targeting a specific OS.
But my question is this: When is it appropriate to use system() calls? How should they be applied?
When you can't do the thing you're trying to do with your own code or a library (or the cost of implementing it outweighs the cost of launching a new process to do so). system() is pretty costly in terms of system resources compared to cin.get(), and as such it should only be used when absolutely necessary. Remember that system() typically launches both an entire new shell and whatever program you asked it to run, so that's two new executables being launched.
By the way, system() should never be used in binaries with the SUID or SGID bit set. Quoting from the man page:
Do not use system() from a program with set-user-ID or set-group-ID
privileges, because strange values for some environment variables
might be used to subvert system integrity. Use the exec(3) family of
functions instead, but not execlp(3) or execvp(3). system() will not,
in fact, work properly from programs with set-user-ID or set-group-ID
privileges on systems on which /bin/sh is bash version 2, since bash 2
drops privileges on startup.
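As a sketch of what the man page is steering you towards, the following runs a program by absolute path with execve() and a fixed, minimal environment, so neither PATH lookup nor inherited environment variables influence what gets executed. The name run_with_clean_env and the chosen PATH value are assumptions of this example, and a real set-user-ID program would need further care that is not shown here:

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs the program at absolute `path` with no arguments
// and a fixed environment. Returns its exit code, or -1 on fork/wait failure
// or if the child was killed by a signal.
int run_with_clean_env(const char* path) {
    assert(path != nullptr && path[0] == '/');   // absolute paths only

    char* const argv[] = {const_cast<char*>(path), nullptr};
    char env_path[] = "PATH=/usr/bin:/bin";      // assumed safe default
    char* const envp[] = {env_path, nullptr};

    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        execve(path, argv, envp);       // no shell, no PATH search
        _exit(127);                     // only reached if execve() failed
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_with_clean_env("/usr/bin/env") on a typical Linux system prints only the single PATH entry, which shows that nothing from the caller's environment leaked through.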
system() is used to ask the operating system to run a program.
Why would your program want the operating system to run a program? Well, there are cases. Sometimes an external program or operating system command can perform a task that is hard to do in your own program. For example, an external program may operate with elevated privileges or access proprietary data formats.
The system() function, itself, is fairly portable but the command string you pass it is likely to be very platform-specific -- though the command string can be pulled from local configuration data to make it more platform-agnostic.
Other functions like fork(), exec*(), spawn*() and CreateProcess() will give you much more control over the way you run the external program, but are platform-specific and may not be available on your platform of choice.
system("PAUSE") is an old DOS trick and is generally considered to be fairly grotty style these days.
As far as I know, system("PAUSE") is a Windows-only thing, and that is why it is frowned upon.