Can I use boost library for crossplatform application executing? - c++

Is there any WinAPI WinExec analog in boost (c++) libraries? I need to run executable from my program, and pass parameters to it. Should I use any other cross-platform libraries for this, or handle myself what OS my program is compiled for?

Important: see update at the end for POSIX systems.
My opinion is that you should use the APIs/syscalls provided by the various platforms you wish to support, or use some kind of abstraction layer (the Boost.Process library, mentioned by Noah Roberts, may be an idea) to avoid dealing with platform-specific details.
I strongly disagree with using the system function because it isn't intended to start a process you specify, but instead it's supposed to pass the string you specified to the "system default shell" or "command processor" (if any). This has several drawbacks:
resource wastage; instead of a process now (usually) you are spawning two, one of which (the shell) is useless for your final objective (starting the process you want). This is usually negligible, but may be noticeable on systems where processes aren't lightweight objects (Windows) if they are running low on resources.
useless confusion; several security suites I've dealt with warn every time an unknown/untrusted process starts a new process; instead of just displaying a warning, now the security suite will display two of them (and you're making the first one quite unclear);
unpredictability of the result; the platform-agnostic system's documentation could be replaced without much loss with "undefined behavior" - and actually it is quite like that. Why do I say this? Because:
first of all, there's not even a guarantee that system has some meaning on the current platform, as there could be no "default shell" at all. But this is an extreme case that isn't usually a problem - and that can be also caught quite easily (if(system(NULL)==0) there's no shell); the real problem is that
in general, you have no idea which shell is the "default shell" or how it parses its input; on Linux it will usually be /bin/sh (update: actually, this is mandated by POSIX, see below), on Windows it may be command.com as well as cmd.exe, and on another OS it will be yet another thing. So, you aren't sure about, e.g., how to escape spaces in the path, or if you should quote the path; heck, you don't even know if such a shell requires some special command to start executables!
More fun: you don't even know if the call is actually blocking: you know that by the time system returns the shell will have terminated, but you don't know if the shell will wait for the spawned process to end; concrete example: cmd.exe doesn't wait for GUI executables to end before returning, while on Linux GUI executables are executables like all the others and don't have such special treatment. In this case you'll have to create a special case for Windows, and make a command string like start /wait yourexecutable.exe - hoping that the version of the interpreter still (or yet, depending on the version of Windows) supports that syntax. And IIRC start has different options on Windows 9x and the Windows NT family, so you won't even be sure with that.
It's not enough: you aren't even sure if the application has been started: the system return value is relative to the command interpreter return code. As far as system is concerned, if a shell is started the call succeeded, and there ends what system considers an error.
Then you're left with the error code of the shell - about which, again, we don't know anything. Maybe it's a carbon-copy of the error code of the last executed command; maybe it is an error code relative just to the shell (e.g., 1 = last command executed, 0 = last command was invalid), maybe it's 42. Who knows?
Since in a good application you'll want, at least, to know if the call is blocking/nonblocking, to get a meaningful exit code (the one actually returned by the application you started), to be sure if the application has been started, to have meaningful error codes in case things went wrong, system most probably doesn't suit your needs; to earn any of these guarantees, you have to go with platform-specific hacks or with non guaranteed assumptions, wasting all the cross-platform "compatibility" of system.
So, I'll state it again: use the system calls provided by the various platforms (e.g., fork+exec on POSIX, CreateProcess on Windows), which specify exactly what they are guaranteed to do, or go with third party abstraction code; the system way is definitely not good.
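For the POSIX half of that advice, here is a minimal sketch of fork+exec that reports the exit code of the program itself rather than of a shell. The name run_and_wait is made up for illustration, and the 127 value for a failed exec is just the usual shell convention, not something the APIs mandate. Windows would need a separate CreateProcess-based implementation.

```cpp
#include <cerrno>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (POSIX only): starts args[0] with the given arguments,
// blocks until it terminates, and returns its exit code.
// Returns -1 if the process could not be created or was killed by a signal.
int run_and_wait(const std::vector<std::string>& args) {
    if (args.empty()) return -1;

    // exec* wants a NULL-terminated array of char*; no shell parses it.
    std::vector<char*> argv;
    for (const std::string& a : args) argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);

    pid_t pid = fork();
    if (pid < 0) return -1;              // fork failed
    if (pid == 0) {
        execvp(argv[0], argv.data());    // only returns on failure
        _exit(127);                      // convention: the program could not be executed
    }

    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) return -1;   // retry only when interrupted by a signal
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Called as run_and_wait({"sh", "-c", "exit 7"}), this should hand back 7, i.e. the code of the program that was actually started.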
Update: since when I wrote this answer, I learned that on POSIX systems system is specified way better - in particular, it's mandated that it will execute the commands with /bin/sh -c command, blocking until the termination of the shell process.
sh behavior, in turn, is mandated in several ways by POSIX; thus, on POSIX systems, some of the disadvantages listed under "unpredictability of the result" no longer apply:
the default shell is specified, so, as long as you use just sh stuff guaranteed by POSIX (e.g. no bashisms), you are safe;
the call is blocking;
if your command is well-formed, the shell itself doesn't encounter problems, waitpid succeeds, ..., you should get a copy of the exit code of the executed program.
So, if you run on POSIX, the situation is way less tragic; if, instead, you have to be portable, keep avoiding system.
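On POSIX only, the value returned by system can be decoded with the same macros used for waitpid. A small sketch, assuming a POSIX system; system_exit_code is a hypothetical name:

```cpp
#include <cstdlib>
#include <sys/wait.h>

// Hypothetical helper (POSIX only): runs `command` through /bin/sh -c and
// returns the exit code reported by the shell, or -1 if that is unavailable.
int system_exit_code(const char* command) {
    if (std::system(nullptr) == 0) return -1;  // no command processor at all
    int status = std::system(command);
    if (status == -1) return -1;               // child could not be created or waited for
    if (WIFEXITED(status)) return WEXITSTATUS(status);
    return -1;                                 // the shell was terminated by a signal
}
```

Note that the result is still the shell's exit code: it matches the program's only when the shell passes it through, and 127 means the shell could not execute the command.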

What is wrong with system(), which is part of standard C++? See http://www.cplusplus.com/reference/clibrary/cstdlib/system.

There's a library that I believe is trying to get into boost called Boost.Process. You'll have to find a download for it, probably in sandbox or whatnot.

You might want to take a look at this question regarding popen() on win32: popen

Related

Sending arguments to executables from another program

I know this can easily be done using the platform's system() implementation. However, from what I have read using system is often not the best approach and can lead to security drawbacks. Is there a different industry standard approach to this type of problem? What are the options available to the user to do this sort of thing?
I am specifically interested in the implementation in C/C++, but I do not think this type of thing will be language dependent; I suspect it shall be platform specific.
You might be looking for the standard POSIX functions fork and exec*. This works for Unix-like platforms (Linux and Mac).
On Windows, there's the CreateProcess API.
fork and exec are a little odd, because fork duplicates your current process entirely and returns different results to each copy. The new copy of the program should then set up any needed settings (closing files that shouldn't be open in both programs, changing environment variables, etc.) and finally call one of the exec functions, which replaces that process with the specified program (while maintaining the currently open file descriptors and such).
The security issue which you alluded to with system is that system uses the system's shell to execute the program and parse its arguments, and if you're not careful, the shell can do things you don't want. (For example, "ls " + argument seems innocuous, but it can delete data if argument is "; rm -rf /*").
If you control the arguments, or if you're careful to escape any shell metacharacters in your parameters to system, you should be okay, although it's most reliable to avoid it.
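If you do end up passing user-supplied text to system on a POSIX shell, one common way to escape it is to wrap each argument in single quotes. This is only a sketch for sh-compatible shells (it does nothing useful for cmd.exe), and shell_quote is a name invented here:

```cpp
#include <string>

// Hypothetical helper: quotes one argument for a POSIX sh command line.
// Everything inside single quotes is literal; an embedded single quote is
// written as '\'' (close the quotes, escaped quote, reopen the quotes).
std::string shell_quote(const std::string& arg) {
    std::string out = "'";
    for (char c : arg) {
        if (c == '\'') out += "'\\''";
        else out += c;
    }
    out += "'";
    return out;
}
```

With this, "ls " + shell_quote(argument) passes "; rm -rf /*" to ls as a single (nonexistent) file name instead of letting the shell run it.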
To avoid the security issue, use a method of spawning a program that lets you specify a list of arguments, already parsed, instead of specifying a string that has to be parsed to extract arguments:
Using POSIX, fork then call one of the exec functions.
On Windows, use CreateProcess.
Use a cross-platform library function like the Apache Portable Runtime's apr_proc_create.
These don't exactly match system()'s behavior (system, for example, does a bit with signal handling and return values), but they're close.
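To illustrate the "already parsed argument list" point on POSIX, here is a rough sketch that also redirects the child's stdout into a pipe between fork and exec. capture_stdout is a hypothetical helper, and error handling is kept to a minimum:

```cpp
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (POSIX only): runs args[0] with an already-split
// argument list and returns whatever it wrote to stdout.
std::string capture_stdout(const std::vector<std::string>& args) {
    if (args.empty()) return "";
    int fds[2];
    if (pipe(fds) != 0) return "";

    std::vector<char*> argv;
    for (const std::string& a : args) argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);

    pid_t pid = fork();
    if (pid < 0) { close(fds[0]); close(fds[1]); return ""; }
    if (pid == 0) {
        // Child: set things up before exec. The read end is not needed,
        // and the write end becomes the child's stdout.
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);
        close(fds[1]);
        execvp(argv[0], argv.data());
        _exit(127);                      // exec failed
    }

    // Parent: close the write end so read() sees EOF when the child exits.
    close(fds[1]);
    std::string out;
    char buf[4096];
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0) out.append(buf, static_cast<size_t>(n));
    close(fds[0]);

    int status = 0;
    waitpid(pid, &status, 0);            // reap the child
    return out;
}
```

Since no shell is involved, capture_stdout({"echo", "; echo injected"}) just echoes the text "; echo injected" back; the semicolon is never interpreted.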
You've likely already seen them mentioned, but fork() and exec are typically the choices to go with in Linux programming; for Windows, you'd have to use the OS API to create a new process. system() is still a good choice for smaller projects because they typically don't run into the same malicious problems that big-name software can. It also natively waits for the child application to return before continuing on in the parent program, which can be a nice trait if you're using an external binary to run calculations or something else and you'll be getting the return value.
A lot of people will tell you that using system() is wrong, but it's really not. It's frowned upon in the professional market because of its inherent problems, but otherwise it works.

Debugging: Tracing (and diff-ing) function call tree of two version of the same program

I'm working on rewriting some code in a C++ command-line program. I changed the low-level data structure that it uses, and the new version passes all the tests (quite a lot) without any problem, and I get the correct output from both the new and the old version... Still, when given certain input, they give different behaviour.
Getting to the point: this being somewhat of a big project, I don't have a clue about how to track down where the execution flow diverges, so... is there a way to trace the function call tree (possibly excluding std calls) along with, I don't know, the line number in the source file and the source name?
Maybe some gcc or macro kung fu?
I would need a Linux solution since that's where the program runs.
Still, when given certain input, they give different behaviour
I would expand the logging in your old and new versions in order to better understand how your algorithms work for that input. When it becomes clearer you can, for example, use gdb if you still need it.
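As a rough idea of what such logging could look like, here is a tiny tracing macro that records the file, line and function name at each call site. TRACE_CALL, trace_log, helper and compute are all names invented for this sketch; a real program would more likely write the entries to stderr or a file so that the logs of the two versions can be diffed.

```cpp
#include <sstream>
#include <string>
#include <vector>

// In-memory trace buffer (hypothetical; could just as well be a log file).
inline std::vector<std::string>& trace_log() {
    static std::vector<std::string> log;
    return log;
}

// Records "file:line function" for the place where the macro is expanded.
#define TRACE_CALL()                                                      \
    do {                                                                  \
        std::ostringstream trace_os_;                                     \
        trace_os_ << __FILE__ << ':' << __LINE__ << ' ' << __func__;      \
        trace_log().push_back(trace_os_.str());                           \
    } while (0)

// Two made-up functions standing in for the code being rewritten.
int helper(int x) { TRACE_CALL(); return x * 2; }
int compute(int x) { TRACE_CALL(); return helper(x) + 1; }
```

Line numbers will differ between the old and the new sources, so when comparing the two traces it may be easier to diff on the function names only.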
Update
OK, as for me, logging is fine, but you do not want to add it.
Another method is tracing. Actually I used it only on Solaris but I see that it exists also on Linux. I have not used it on Linux so it is just an idea that you can test.
You can use SystemTap
User-Space Probing: SystemTap initially focused on kernel-space probing. However, there are many instances where user-space probing can help diagnose a problem. SystemTap 0.6 added support to allow probing user-space processes. SystemTap includes support for probing the entry into and return from a function in user-space processes, probing predefined markers in user-space code, and monitoring user-process events.
I can't guarantee that it will work, but why not give it a try?
There is even an example in the doc:
If you want to see how the xmalloc function is being called by the command ls, you could use the user-space backtrace functions to provide that information.
stap -d /bin/ls --ldd \
-e 'probe process("ls").function("xmalloc") {print_ustack(ubacktrace())}' \
-c "ls /"

Alternative to System() for running a batch file in a program

I want to make a system where I can run a makefile and several other gcc-related things within a program, basically to use gcc and compile stuff within the program. If I wrote up all the stuff I want to do in a batch file, then I'd need to run that batch file from within the program.
Now everyone says System() calls are extremely bad because of security and various other things. So considering I am using C++, what would be a good alternative to System() to run batch files? If possible I would like the alternative to be cross-platform.
Thanks
You could look to use the fork and execl family of calls although these are tied down to Unix / Linux and, depending on how you use them, are arguably no safer than system.
I doubt very much that you'll find a common, cross platform way of doing this if only because all platforms will have different and unique ways of doing this. Also, the scripts you're trying to run will no doubt have to be different on different platforms and there may be different ways of specifying things such as directory paths etc.
My suggestion would be to first ask yourself how you'll answer the following questions - which would be my main concerns:
How am I going to prevent accidental / intentional misuse?
How am I going to detect errors or success status within the scripts I'm running?
How am I going to provide for dependencies? E.g. script A must run completely and correctly before script B runs.
How am I going to report the success and failure state?
My final question would be why do you want to do this in C++? Is there a specific reason? Naturally I'm a C++ evangelist although I would have thought this would be better tackled by a scripting language such as Perl, Python or possibly Bash unless you're embarking on something far more radical.

Print shared library dependencies from C++

I need to determine the exact set of shared library dependencies of a binary program. I'm working on Linux and the project is written in C++. Thus, I need recursive ldd-like functionality in C++. How can I do it?
To quote Han Solo, "I got a bad feeling about this". Setting up a chroot for a child process from within a C++ program sounds like some architectural misconception / screwup further up the line. Sorry, no ready-made C++ solution that springs to mind. You could, of course, run ltrace / strace / recursive-ldd and parse their output...
...but generally speaking, the idea is to set up the chroot environment statically (i.e. before any processes are started), not dynamically. With a dynamic approach, an attacker could fool the main process into believing it should give the child process things it shouldn't have in the chroot. That defeats the whole purpose.
Tools for statically setting up chroot environments for a given executable are plenty, tools for doing so dynamically I couldn't find any. This is a hint in itself.
In the meantime I've found the following:
linux/gcc: ldd functionality from inside a C/C++ program
where the accepted answer suggests to use:
setenv("LD_TRACE_LOADED_OBJECTS", "1", 1);
FILE *ldd = popen("/lib/libz.so", "r");
I tried it out and worked both from bash and from C++ (ofc in this case I think of an equivalent version). However if I ran either versions for a SUID binary (what I actually have) then I got exit code 5 (i guess permission problems).
Then I traced what ldd exactly does and the following seems fine (at least in command line):
LD_TRACE_LOADED_OBJECTS=1 /lib64/ld-linux-x86-64.so.2 binary_name
The (dummy) question is: what is the equivalent implementation of this in C++?
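One possible way to express that command line in C++ is sketched below, assuming Linux with the glibc loader at the path used above. The variable is put into the command string rather than set with setenv, so that it applies only to the loader and not to the shell popen starts. list_dependencies, parse_dependency_line and quote_arg are made-up names, and I have not tried this against a SUID binary.

```cpp
#include <stdio.h>
#include <string>
#include <vector>

// Pulls the resolved path out of one line of loader output, e.g.
// "\tlibz.so.1 => /lib/libz.so.1 (0x...)". Returns "" when the line has no
// absolute path (linux-vdso, "not found", ...).
std::string parse_dependency_line(const std::string& line) {
    const std::string arrow = " => ";
    std::string::size_type start = line.find(arrow);
    if (start == std::string::npos) {
        // Lines such as "\t/lib64/ld-linux-x86-64.so.2 (0x...)" have no arrow.
        start = line.find_first_not_of(" \t");
        if (start == std::string::npos || line[start] != '/') return "";
    } else {
        start += arrow.size();
    }
    std::string::size_type end = line.find(" (", start);
    std::string path = line.substr(start, end == std::string::npos ? end : end - start);
    while (!path.empty() && (path.back() == '\n' || path.back() == '\r')) path.pop_back();
    return (!path.empty() && path[0] == '/') ? path : "";
}

// Single-quotes a string for /bin/sh, which popen uses to run the command.
static std::string quote_arg(const std::string& s) {
    std::string out = "'";
    for (char c : s) { if (c == '\'') out += "'\\''"; else out += c; }
    return out + "'";
}

// Hypothetical helper: asks the dynamic loader to list what `binary` would load.
std::vector<std::string> list_dependencies(
        const std::string& binary,
        const std::string& loader = "/lib64/ld-linux-x86-64.so.2") {
    std::vector<std::string> deps;
    std::string cmd = "LD_TRACE_LOADED_OBJECTS=1 " + quote_arg(loader) + " " + quote_arg(binary);
    FILE* pipe = popen(cmd.c_str(), "r");
    if (!pipe) return deps;
    char buf[4096];
    while (fgets(buf, sizeof buf, pipe)) {
        std::string path = parse_dependency_line(buf);
        if (!path.empty()) deps.push_back(path);
    }
    pclose(pipe);
    return deps;
}
```

The loader already lists transitive dependencies, so no explicit recursion should be needed.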

When writing a portable c/c++ program, what is the best way to consume external files?

I'm pretty new to the c/c++ scene, I've been spoon fed on virtual machines for too long.
I'm modifying an existing C++ tool that we use across the company. The tool is being used on all the major operating systems (Windows, Mac, Ubuntu, Solaris, etc). I'm attempting to bridge the tool with another tool written in Java. Basically I just need to call java -jar from the C++ tool.
The problem is, how do I know where the jar is located on the user's computer? The c++ executables are currently checked into Perforce, and users sync and then call the exe, presumably leaving the exe in place (although they could copy it somewhere else). My current solution checks in the jar file beside the exe.
I've looked at multiple ways to calculate the location of the exe from C++, but none of them seem to be portable. On Windows there is a 'GetModuleLocation' and on POSIX you can look at the procs/process.exe info to figure out the location of the process. And on most systems you can look at argv[0] to figure out where the exe is. But most of these techniques aren't 100% guaranteed due to users using $PATH, symlinks, etc. to call the exe.
So, any guidance on the right way to do this that will always work? I guess I have no problem ifdef'ing multiple solutions, but it seems like there should be a more elegant way to do this.
I don't believe there is a portable way of doing this. The C++ standard itself does not define anything about the execution environment. The best you get is the std::system call, and that can fail for things like Unicode characters in path names.
The issue here is that C and C++ are both used on systems where there's no such thing as an operating system. No such thing as $PATH. Therefore, it would be nonsensical for the standards committee to require a conforming implementation provide such features.
I would just write one implementation for POSIX, one for Mac (if it differs significantly from the POSIX one... never used it so I'm not sure), and one for Windows (Select which one at compilation time with the preprocessor). It's maybe 3 function calls for each one; not a lot of code, and you'll be sure you're following the conventions of your target platform.
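Roughly, the per-platform version selected by the preprocessor could look like the sketch below. executable_path is a made-up name. The Win32 call for this is GetModuleFileName (used here in its ANSI form), the macOS branch uses _NSGetExecutablePath, and the fallback branch assumes a Linux-style /proc, so other Unix systems such as Solaris would need their own branch.

```cpp
#include <cstdint>
#include <string>
#if defined(_WIN32)
#include <windows.h>
#elif defined(__APPLE__)
#include <mach-o/dyld.h>
#include <vector>
#else
#include <limits.h>
#include <unistd.h>
#endif

// Hypothetical helper: returns the path of the running executable,
// or an empty string if it could not be determined.
std::string executable_path() {
#if defined(_WIN32)
    char buf[MAX_PATH];
    DWORD n = GetModuleFileNameA(nullptr, buf, MAX_PATH);
    if (n == 0 || n == MAX_PATH) return "";   // failure or truncated path
    return std::string(buf, n);
#elif defined(__APPLE__)
    uint32_t size = 0;
    _NSGetExecutablePath(nullptr, &size);     // first call reports the needed size
    std::vector<char> buf(size);
    if (_NSGetExecutablePath(buf.data(), &size) != 0) return "";
    return std::string(buf.data());
#else
    // Assumption: Linux, where /proc/self/exe is a symlink to the executable.
    char buf[PATH_MAX];
    ssize_t n = readlink("/proc/self/exe", buf, sizeof buf);
    if (n <= 0) return "";
    return std::string(buf, static_cast<size_t>(n));
#endif
}
```

The jar could then be looked up relative to the directory part of the returned path.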
I'd like to point you to a few URLs which might help you find where the current executable was located. It does not appear as if there is one method for all (aside from the ARGV[0] + path search method which as you note is spoofable, but…are you really in a threat environment such that this is likely to happen?).
How to get the application executable name in WindowsC++/CLI?
https://superuser.com/questions/49104/handy-tool-to-find-executable-program-location
Finding current executable's path without /proc/self/exe
How do I find the location of the executable in C?
There are several solutions, none of them perfect. Under Windows, as you have said, you can use GetModuleLocation, but that's not available under Unix. You can try to simulate how the shell works, using argv[0] and getenv("PATH"), but that's not easy, and it's not 100% reliable either. (Under Unix, and I think under Windows as well, the spawning application can hoodwink you, and put any sort of junk in argv[0].) The usual solution under Unix is to require an environment variable, e.g. MYAPPLICATION_HOME, which should contain the root directory where your application is installed; the application won't start without it. Or you can ask the user to specify the root path with a command line option.
In practice, I usually use all three: the command line option has precedence, and is very useful when testing; the environment variable works well in the Unix world, since it's what people are used to; and if neither is present, I'll try to work out the location from where I was started, using system dependent code: GetModuleLocation under Windows, and getenv("PATH") and all the rest under Unix. (The Unix solution isn't that hard if you already have code for breaking a string into fields, and are using boost::filesystem.)
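The precedence described above (command line option, then environment variable, then system dependent detection) could be sketched like this. The --home= option name and resolve_home are invented for the example, and the detected path is simply passed in so the platform-specific part stays out of the picture:

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical helper: picks the application's root directory.
// Order: --home=<dir> on the command line, then MYAPPLICATION_HOME,
// then whatever the platform-specific detection came up with.
std::string resolve_home(int argc, const char* const* argv, const std::string& detected) {
    const char* prefix = "--home=";           // made-up option name
    const std::size_t len = std::strlen(prefix);
    for (int i = 1; i < argc; ++i) {
        if (std::strncmp(argv[i], prefix, len) == 0) return argv[i] + len;
    }
    if (const char* env = std::getenv("MYAPPLICATION_HOME")) {
        if (*env != '\0') return env;         // ignore an empty variable
    }
    return detected;
}
```

Keeping the command line option first is what makes it handy for testing: it overrides whatever the environment says.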
A good solution would be to write your own function that is guaranteed to work on every platform you use. Preferably it should use runtime checks to see whether a method worked, and fall back to ifdefs only if some way of detecting the path is not available on all platforms. But it might not be easy to detect whether code that executes correctly, for example reading argv[0], actually returned the correct path...