Print shared library dependencies from C++


I need to determine the exact set of shared library dependencies of a binary program. I'm working on Linux and the project is written in C++, so I need recursive ldd-like functionality in C++. How can I do it?

To quote Han Solo, "I got a bad feeling about this". Setting up a chroot for a child process from within a C++ program sounds like some architectural misconception / screwup further up the line. Sorry, no ready-made C++ solution that springs to mind. You could, of course, run ltrace / strace / recursive-ldd and parse their output...
...but generally speaking, the idea is to set up the chroot environment statically (i.e. before any processes are started), not dynamically. With a dynamic approach, an attacker could fool the main process into believing it should give the child process things it shouldn't have in the chroot. That defeats the whole purpose.
Tools for statically setting up chroot environments for a given executable are plentiful; I couldn't find any for doing so dynamically. This is a hint in itself.

In the meantime I've found the following:
linux/gcc: ldd functionality from inside a C/C++ program
where the accepted answer suggests using:
setenv("LD_TRACE_LOADED_OBJECTS", "1", 1);
FILE *ldd = popen("/lib/libz.so", "r");
I tried it out and it worked both from bash and from C++ (for bash I mean the equivalent command line, of course). However, when I ran either version on a SUID binary (which is what I actually have), I got exit code 5 (I guess a permission problem).
Then I traced what ldd does exactly, and the following seems to work fine (at least on the command line):
LD_TRACE_LOADED_OBJECTS=1 /lib64/ld-linux-x86-64.so.2 binary_name
The (dummy) question is: what is the equivalent implementation of this in C++?
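One possible C++ equivalent, as a minimal sketch rather than a definitive implementation: run the loader through popen with LD_TRACE_LOADED_OBJECTS set in the command string, then parse each output line. The names Dependency, parse_trace_line and list_dependencies are made up for this illustration, the default loader path is simply the x86-64 one from the command above, and the shell quoting is deliberately naive.

```cpp
#include <stdio.h>   // popen, pclose (POSIX)
#include <string>
#include <vector>

// One line of loader trace output, e.g.
//   "libz.so.1 => /usr/lib/libz.so.1 (0x40174000)"
struct Dependency {
    std::string name;  // name as requested by the binary
    std::string path;  // resolved path; empty if none was reported
};

static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos) return std::string();
    const std::size_t e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Parse a single trace line. Returns false for lines that do not name a
// library (blank lines, "statically linked", and similar messages).
bool parse_trace_line(const std::string& raw, Dependency& out) {
    const std::size_t paren = raw.rfind(" (0x");        // trailing load address
    const std::string body = trim(raw.substr(0, paren));
    if (body.empty()) return false;
    const std::size_t arrow = body.find("=>");
    if (arrow == std::string::npos) {
        if (paren == std::string::npos) return false;
        out.name = body;                                 // e.g. the loader itself
        out.path = (body[0] == '/') ? body : std::string();
        return true;
    }
    out.name = trim(body.substr(0, arrow));
    out.path = trim(body.substr(arrow + 2));
    if (out.path == "not found") out.path.clear();
    return !out.name.empty();
}

// Assumption: `loader` is the dynamic loader matching the binary's architecture.
// Naive quoting: paths containing a single quote are not handled.
std::vector<Dependency> list_dependencies(
        const std::string& binary,
        const std::string& loader = "/lib64/ld-linux-x86-64.so.2") {
    const std::string cmd = "LD_TRACE_LOADED_OBJECTS=1 '" + loader + "' '" +
                            binary + "' 2>/dev/null";
    std::vector<Dependency> deps;
    FILE* pipe = popen(cmd.c_str(), "r");
    if (!pipe) return deps;
    char buf[4096];
    while (fgets(buf, sizeof buf, pipe)) {
        Dependency d;
        if (parse_trace_line(buf, d)) deps.push_back(d);
    }
    pclose(pipe);
    return deps;
}
```

Calling list_dependencies("binary_name") would then return one entry per line the loader prints. The loader's trace already includes indirect dependencies, so no explicit recursion should be needed.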

Related

Is there a way to figure out what environment variables are needed/used by an executable?

I've got a C++ program that will run certain very specific commands as root. This is needed because another program running under Node.js needs to do things like set the system time, set the time zone, etc., which require root privileges. I'm using the execve function in C++ to run the command with root privileges after calling setuid. I specifically chose execve because I want to wall off the environment so that I don't create an environment variable vulnerability.
setuid(0);
execve(acExeName, pArgsForExec2, pcEnv);
What I want to find out is exactly what pcEnv, the environment variable list the executed program runs with, needs to contain. For example, if I want to run the tool time-admin as if I were running it from the console, how can I figure out which environment variables it needs? I know I can print the environment variables with the printenv command, but that gives me all of them. I'm quite sure I don't need them all and want as small a subset as possible.
I know I can use them all and then slowly comment each one out and see if it keeps working, but I'd really rather not go that far.
Anyone got a clever way to figure out what environment variables are used by a program? I should add I'm doing this on an Ubuntu 12.04 LTS install.
Thanks for any help.

There is no general way of figuring out the environment variables used by a program. For example, one could imagine a program with configuration files that give the names of environment variables.
In fact, many shell-like programs (and script interpreters) do just that.
More generally, the argument to getenv(3) could be computed, so in theory you cannot guess its possible values. (I might be wrong, but some very old versions of libc and of bash used to play such tricks; unfortunately, I forgot the details, but sometimes an environment variable with some pid number in its name was used.)
And, as others commented, you might want to use ltrace (or play LD_PRELOAD tricks), or use gdb, to find out how getenv is called.
The application might also use the environ variable (see environ(7)) or the third argument to main.
In practice, however, a reasonably written program should clearly document all the environment variables it uses.
If you have access to the source code of the program, you could, if it is compiled by GCC, use (the just released version 1.0 of) the MELT plugin. MELT is a domain specific language to extend GCC and can be used to explore the internal Gimple representations handled by GCC while compiling your program. In particular with its new findgimple mode you could find in one command all the calls to getenv with a constant string.
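Once a candidate list of variables is known (from documentation, ltrace, or trial and error), the environment handed to execve can be built from an explicit allowlist instead of being trimmed down from the full environment. The following is only a sketch of that idea; build_minimal_env and to_envp are hypothetical helper names, and which variables belong on the allowlist still has to be found by the means described above.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Copy only the allowlisted variables from the current environment, in
// "NAME=value" form. Variables that are not set are skipped.
std::vector<std::string> build_minimal_env(const std::vector<std::string>& allowlist) {
    std::vector<std::string> env;
    for (const std::string& name : allowlist) {
        if (const char* value = std::getenv(name.c_str())) {
            env.push_back(name + "=" + value);
        }
    }
    return env;
}

// Convert to the NULL-terminated char* array that execve expects.
// The pointers refer into `env`, which must outlive the returned array.
std::vector<char*> to_envp(std::vector<std::string>& env) {
    std::vector<char*> envp;
    for (std::string& entry : env) envp.push_back(&entry[0]);
    envp.push_back(nullptr);
    return envp;
}
```

With the names from the question, the call would become execve(acExeName, pArgsForExec2, envp.data()), where envp is the result of to_envp. For a program running as root it may be safer still to set values such as PATH to fixed strings rather than copy them from the caller.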

Alternative to system() for running a batch file in a program

I want to make a system where I can run a makefile and several other gcc-related things from within a program, basically to use gcc and compile stuff within the program. If I wrote everything I want to do into a batch file, I'd need to run that batch file from within the program.
Now everyone says system() calls are extremely bad because of security and various other things. So, considering I am using C++, what would be a good alternative to system() for running batch files? Preferably the alternative would be cross-platform.
Thanks

You could use fork and the exec family of calls (execl and friends), although these are tied to Unix / Linux and, depending on how you use them, are arguably no safer than system.
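As a rough illustration of the fork/exec route on POSIX systems, the sketch below starts a program without going through a shell and returns its exit status. The function name run_and_wait and the choice of 127 for "exec failed" are assumptions made for this example, not part of any library.

```cpp
#include <cerrno>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Run args[0] with the given arguments, searching PATH, without a shell.
// Returns the child's exit status (0-255), or -1 if the child could not be
// forked or waited for, or was killed by a signal. By the convention chosen
// here, 127 means exec itself failed in the child.
int run_and_wait(const std::vector<std::string>& args) {
    if (args.empty()) return -1;

    // Build argv before forking so the child does no allocation.
    std::vector<char*> argv;
    for (const std::string& a : args) argv.push_back(const_cast<char*>(a.c_str()));
    argv.push_back(nullptr);

    const pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        execvp(argv[0], argv.data());
        _exit(127);  // only reached if execvp failed
    }

    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) return -1;  // retry only when interrupted
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_and_wait({"make", "-C", "build"}) would run make in a hypothetical build directory and report whether it succeeded. Because each argument is passed as a separate string, no shell quoting is involved.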
I doubt very much that you'll find a common, cross platform way of doing this if only because all platforms will have different and unique ways of doing this. Also, the scripts you're trying to run will no doubt have to be different on different platforms and there may be different ways of specifying things such as directory paths etc.
My suggestion would be to first ask yourself how you'll answer the following questions, which would be my main concerns:
How am I going to prevent accidental / intentional misuse?
How am I going to detect errors or success status within the scripts I'm running?
How am I going to provide for dependencies? E.g. script A must run completely and correctly before script B runs.
How am I going to report the success and failure state?
My final question would be why do you want to do this in C++? Is there a specific reason? Naturally I'm a C++ evangelist although I would have thought this would be better tackled by a scripting language such as Perl, Python or possibly Bash unless you're embarking on something far more radical.

When writing a portable c/c++ program, what is the best way to consume external files?

I'm pretty new to the c/c++ scene, I've been spoon fed on virtual machines for too long.
I'm modifying an existing C++ tool that we use across the company. The tool is being used on all the major operating systems (Windows, Mac, Ubuntu, Solaris, etc). I'm attempting to bridge the tool with another tool written in Java. Basically I just need to call java -jar from the C++ tool.
The problem is, how do I know where the jar is located on the user's computer? The c++ executables are currently checked into Perforce, and users sync and then call the exe, presumably leaving the exe in place (although they could copy it somewhere else). My current solution checks in the jar file beside the exe.
I've looked at multiple ways to calculate the location of the exe from C++, but none of them seem to be portable. On Windows there is GetModuleFileName, and on Linux you can read the /proc/self/exe link to find the location of the process. And on most systems you can look at argv[0] to figure out where the exe is. But none of these techniques is 100% guaranteed, because users may invoke the exe via $PATH, symlinks, etc.
So, any guidance on the right way to do this that will always work? I guess I have no problem ifdef'ing multiple solutions, but it seems like there should be a more elegant way to do this.

I don't believe there is a portable way of doing this. The C++ standard itself does not define anything about the execution environment. The best you get is the std::system call, and that can fail for things like Unicode characters in path names.
The issue here is that C and C++ are both used on systems where there's no such thing as an operating system. No such thing as $PATH. Therefore, it would be nonsensical for the standards committee to require a conforming implementation provide such features.
I would just write one implementation for POSIX, one for Mac (if it differs significantly from the POSIX one... never used it so I'm not sure), and one for Windows (Select which one at compilation time with the preprocessor). It's maybe 3 function calls for each one; not a lot of code, and you'll be sure you're following the conventions of your target platform.
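A per-platform implementation along those lines might look like the following sketch. The function names executable_path and directory_of are made up for this example; the platform calls used are GetModuleFileNameA on Windows, _NSGetExecutablePath on macOS, and reading the /proc/self/exe link on Linux. Other Unix systems would need their own branch.

```cpp
#include <string>

#if defined(_WIN32)
#  include <windows.h>
#elif defined(__APPLE__)
#  include <mach-o/dyld.h>
#  include <cstdint>
#  include <vector>
#else
#  include <limits.h>
#  include <unistd.h>
#endif

// Best-effort path of the running executable; empty string on failure.
std::string executable_path() {
#if defined(_WIN32)
    char buf[MAX_PATH];
    const DWORD n = GetModuleFileNameA(nullptr, buf, MAX_PATH);
    // n == MAX_PATH means the path was truncated; treat that as failure.
    return (n == 0 || n == MAX_PATH) ? std::string() : std::string(buf, n);
#elif defined(__APPLE__)
    std::uint32_t size = 0;
    _NSGetExecutablePath(nullptr, &size);  // first call only reports the size
    std::vector<char> buf(size);
    if (_NSGetExecutablePath(buf.data(), &size) != 0) return std::string();
    return std::string(buf.data());
#else
    // Linux: /proc/self/exe is a symbolic link to the running binary.
    char buf[PATH_MAX];
    const ssize_t n = readlink("/proc/self/exe", buf, sizeof buf);
    return n <= 0 ? std::string() : std::string(buf, static_cast<std::size_t>(n));
#endif
}

// Directory part of a path, without the trailing separator.
std::string directory_of(const std::string& path) {
    const std::size_t slash = path.find_last_of("/\\");
    return slash == std::string::npos ? std::string() : path.substr(0, slash);
}
```

A jar checked in beside the exe could then be located as directory_of(executable_path()) + "/tool.jar", where tool.jar is a placeholder name.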

I'd like to point you to a few URLs which might help you find where the current executable is located. It does not appear as if there is one method for all (aside from the argv[0] + path search method which as you note is spoofable, but…are you really in a threat environment such that this is likely to happen?).
How to get the application executable name in Windows C++/CLI?
https://superuser.com/questions/49104/handy-tool-to-find-executable-program-location
Finding current executable's path without /proc/self/exe
How do I find the location of the executable in C?

There are several solutions, none of them perfect. Under Windows, as
you have said, you can use GetModuleFileName, but that's not available
under Unix. You can try to simulate how the shell works, using
argv[0] and getenv("PATH"), but that's not easy, and it's not 100%
reliable either. (Under Unix, and I think under Windows as well, the
spawning application can hoodwink you, and put any sort of junk in
argv[0].) The usual solution under Unix is to require an environment
variable, e.g. MYAPPLICATION_HOME, which should contain the root
directory where your application is installed; the application won't
start without it. Or you can ask the user to specify the root path with
a command line option.
In practice, I usually use all three: the command line option has
precedence, and is very useful when testing; the environment variable
works well in the Unix world, since it's what people are used to; and if
neither is present, I'll try to work out the location from where I was
started, using system dependent code: GetModuleFileName under Windows,
and getenv("PATH") and all the rest under Unix. (The Unix solution
isn't that hard if you already have code for breaking a string into
fields, and are using boost::filesystem.)
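The precedence described above, plus a shell-like PATH search for the Unix fallback, could be sketched as follows. Both function names are hypothetical, the code is Unix-only, and it uses plain POSIX calls rather than boost::filesystem to stay self-contained.

```cpp
#include <cstdlib>
#include <string>
#include <sys/stat.h>
#include <unistd.h>

// Precedence: 1. command line option, 2. environment variable, 3. guess.
// `cli_root` is empty when the option was not given; `guessed_root` is
// whatever the system-dependent code worked out (possibly empty).
std::string resolve_app_home(const std::string& cli_root,
                             const char* env_name,
                             const std::string& guessed_root) {
    if (!cli_root.empty()) return cli_root;
    if (const char* env = std::getenv(env_name)) {
        if (*env != '\0') return env;
    }
    return guessed_root;
}

// Mimic the shell's lookup: a name containing a slash is used as is;
// otherwise each PATH entry is tried in order. Returns an empty string if
// no executable regular file is found. As noted above, argv[0] can be
// spoofed by the spawning application, so this is best-effort only.
std::string find_via_path(const std::string& argv0, const std::string& path_var) {
    if (argv0.find('/') != std::string::npos) return argv0;
    std::size_t start = 0;
    while (start <= path_var.size()) {
        std::size_t end = path_var.find(':', start);
        if (end == std::string::npos) end = path_var.size();
        std::string dir = path_var.substr(start, end - start);
        if (dir.empty()) dir = ".";  // an empty entry means the current directory
        const std::string candidate = dir + "/" + argv0;
        struct stat st;
        if (stat(candidate.c_str(), &st) == 0 && S_ISREG(st.st_mode) &&
            access(candidate.c_str(), X_OK) == 0) {
            return candidate;
        }
        start = end + 1;
    }
    return std::string();
}
```

The directory containing the result of find_via_path(argv[0], getenv("PATH")) would serve as the guessed root passed to resolve_app_home.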

A good solution would be to write your own function that is guaranteed to work on every platform you use. Preferably it should check at runtime whether a method worked, and fall back to ifdefs only where some way of detecting the location is not available on all platforms. But it might not be easy to detect whether a method that runs without error, for example reading argv[0], actually returned the correct path...

What does a C++ program require to run?

This question has been bothering me for a while now. Let's consider the two following programs:
#include <iostream>
int main()
{
std::cout << "Hello, World!";
}
and
int main()
{
int x = 5;
int y = x*x;
}
Windows:
The first example, naturally, requires some system .dll's for the console. I understand that. What about the second? Does it need anything to run? Some runtime libraries? By the way, what do runtime libraries actually do?
Linux:
No idea, can you enlighten me?
I know it depends on the compiler and OS, but I need either a general answer or particular examples. TIA.

As a general answer, the first will require the C++ runtime libraries (the stuff you need to support the standard library calls). These form an interface of sorts between the language and the support libraries, which in turn know how to achieve what they do in the given environment.
The second makes no use of the runtime libraries. It will use the C startup and termination code (that initialises and tears down the C environment) but it's a discussion point as to whether or not these are considered part of the runtime libraries. If you consider them a part, then, yes, they will be used. It will probably be a very small part, since there's usually a big difference in size between the startup code and the streams stuff.
You can link your code statically (binding at link time) with runtime libraries or dynamically (so that the actual binding is done at load time). That's true for both Windows and Linux.

For Windows applications you can use the Dependency Walker to see all dependencies.

The first program performs stream I/O, which means it has to interact with resources (console, GUI) managed by the OS. So, ultimately, the OS has to be invoked via an API implemented in a system DLL.
On Windows the second program requires no libraries. I'm fairly sure the same is true on Linux.

Compile it with GCC to get an executable named 'hi', then in a console run:
ldd hi
This lists the shared objects (dynamic libraries) your program is linked against.
As a quick illustration, here is sample output:
ldd tifftest
libtiff.so.3 => /usr/lib/libtiff.so.3 (0x4001d000)
libc.so.6 => /lib/libc.so.6 (0x40060000)
libjpeg.so.62 => /usr/lib/libjpeg.so.62 (0x40155000)
libz.so.1 => /usr/lib/libz.so.1 (0x40174000)
/lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)

Well, let's look at this from a more general point of view:
To start with, you'll need a computer with a CPU compatible with the compiler's target architecture. You might think this is obvious, but if the code compiles to x86 machine code, it won't run on an Alpha CPU, which uses different instructions. Likewise, if you compile to x64 machine code, it won't run on an x86-only CPU. So the correct hardware is necessary to run the C++ program, in contrast to virtual-machine based languages like Java, which abstract that away.
You will also need the correct operating system. I'm not an expert on porting programs but I don't think it's possible to build a single executable that runs on multiple operating systems in C++. For example, compiling even your second example for Windows will pull in a lot of runtime-library code that runs before and after the actual call to your main() function. This will do things like prepare the heap and initialise the CRT library. The CRT for Windows is implemented via the Windows API. You can statically link the library so no CRT DLL is required, but the code in your program still makes calls to the Windows API, so it is still platform dependent. As an experiment I compiled an empty program with static linking on Windows with Visual Studio, and according to Dependency Walker it still references KERNEL32.DLL for functions like HeapCreate and ExitProcess. So the 'empty' program still does a whole bunch of operating system stuff for you, in preparation for doing something useful (regardless of whether or not your program does anything useful).
Also note there may well be a minimum operating system version: Visual Studio 2010 requires Windows XP SP2 or above for even an empty program, due to calls made to EncodePointer and DecodePointer. See this question.
The system will have to have the memory available to launch your program. You may think it does nothing, but as the above demonstrates, before main() is called a whole load of OS initialization calls are made by your program's library. These probably require some memory, and the processing time necessary to execute them.
Depending on the configuration of the operating system, you may need sufficient security privileges to launch executable programs.
So, in short, to run an empty C++ program even with static linking, you need the right CPU, operating system, permission to run the executable, and memory/processing time to complete the program. With VM technologies like Java or .NET, by comparison, the requirements reduce to probably just the correct virtual machine, the necessary privileges, and the necessary memory/CPU time to run the program. This may not be as simple as it sounds: you might need the correct version of a virtual machine, such as .NET framework 4.0. This can complicate your distribution process since updating the entire JVM or .NET framework for a machine can be a time consuming process requiring administrator privileges and maybe an internet connection. In some narrow cases, this could be a deal breaker, since on rare occasions it may be easier to be able to say "it will run on any x86-compatible Windows operating system since XP" as opposed to "any machine with the latest virtual machine that was only released yesterday". For most purposes, though, the fact the virtual machine allows you to (in theory) forget about the CPU and operating system makes the distribution of the program easier; with C++, you are required to at least compile separate executables for each combination of platform and CPU you want to support, regardless of the additional requirements of the libraries you're using.

C programs on Windows require CRT libraries that come with Windows. C++ programs sometimes require the so-called "C++ redistributable". These libraries can be linked statically into the application, but this makes the EXE bigger.

The first part of your question has already been answered by several members.
What I am adding is general and applies to both cases (in case you are not aware of it).
For any program to run, it has to be provided with the resources it needs. The answers to the first part have already listed several of them.
In general, what it needs is a well-defined address space (in main memory) with its properties, and CPU time. The operating system ensures you get these when you execute your program. Unless there is some ridiculous conflict, your program will get them (and that's why I guess Chubsdad commented "you need luck").
OS scheduling, the CPU fetching instructions and data from memory and then executing them: together these form a "machine" that executes your program.
The entry point (the first point in your program to execute) is decided either at compile time (the main function, for example) or when the program is loaded by a system call such as exec() (on Unix) or CreateProcess() (on Windows).

On Linux, any C program is statically linked to some CRT libraries. The true entry point of the program is the _start() function defined in /usr/lib/crt1.o. This function calls some libc functions like __libc_start_main(). Thus you still need the libc library...
You could do without libc, but it's tricky. You would need to rename your entry point _start(), or instruct the linker to start at main(). And you would also need some inline assembly to issue the _exit() system call when the program is done, otherwise it would just crash. And of course, do the link explicitly with the ld command instead of through the gcc frontend.

Can I use boost library for crossplatform application executing?

Is there an analog of WinAPI's WinExec in the Boost (C++) libraries? I need to run an executable from my program and pass parameters to it. Should I use some other cross-platform library for this, or handle myself which OS my program is compiled for?

Important: see update at the end for POSIX systems.
My opinion is that you should use the APIs/syscalls provided by the various platforms you wish to support, or use some kind of abstraction layer (the Boost.Process library, mentioned by Noah Roberts, may be an idea) to avoid dealing with platform-specific details.
I strongly disagree with using the system function because it isn't intended to start a process you specify, but instead it's supposed to pass the string you specified to the "system default shell" or "command processor" (if any). This has several drawbacks:
resource wastage; instead of a process now (usually) you are spawning two, one of which (the shell) is useless for your final objective (starting the process you want). This is usually negligible, but may be noticeable on systems where processes aren't lightweight objects (Windows) if they are running low on resources.
useless confusion; several security suites I've dealt with warn every time an unknown/untrusted process starts a new process; instead of just displaying a warning, now the security suite will display two of them (and you're making the first one quite unclear);
unpredictability of the result; the platform-agnostic system's documentation could be replaced without much loss with "undefined behavior" - and actually it is quite like that. Why do I say this? Because:
first of all, there's not even a guarantee that system has some meaning on the current platform, as there could be no "default shell" at all. But this is an extreme case that isn't usually a problem - and that can be also caught quite easily (if(system(NULL)==0) there's no shell); the real problem is that
in general, you have no idea what shell is the "default shell", and how it parses its input; on Linux it will usually be /bin/sh (update: actually, this is mandated by POSIX, see below), on Windows it may be command.com as well as cmd.exe, on another OS it will be yet another thing. So, you aren't sure about, e.g., how to escape spaces in the path, or if you should quote the path; heck, you don't even know if such a shell requires some special command to start executables!
More fun: you don't even know if the call is actually blocking: you know that by the time system returns the shell will have terminated, but you don't know if the shell will wait for the spawned process to end; concrete example: cmd.exe doesn't wait for GUI executables to end before returning, while on Linux GUI executables are executables like all the others and don't have such special treatment. In this case you'll have to create a special case for Windows, and make a command string like start /wait yourexecutable.exe - hoping that the version of the interpreter still (or yet, depending on the version of Windows) supports that syntax. And IIRC start has different options on Windows 9x and the Windows NT family, so you won't even be sure with that.
It's not enough: you aren't even sure if the application has been started: the system return value is relative to the command interpreter return code. As far as system is concerned, if a shell is started the call succeeded, and there ends what system considers an error.
Then you're left with the error code of the shell - about which, again, we don't know anything. Maybe it's a carbon-copy of the error code of the last executed command; maybe it is an error code relative just to the shell (e.g., 1 = last command executed, 0 = last command was invalid), maybe it's 42. Who knows?
Since in a good application you'll want, at least, to know if the call is blocking/nonblocking, to get a meaningful exit code (the one actually returned by the application you started), to be sure if the application has been started, to have meaningful error codes in case things went wrong, system most probably doesn't suit your needs; to earn any of these guarantees, you have to go with platform-specific hacks or with non guaranteed assumptions, wasting all the cross-platform "compatibility" of system.
So, I'll state it again: use the system calls provided by the various platforms (e.g., fork+exec on POSIX, CreateProcess on Windows), which specify exactly what they are guaranteed to do, or go with third party abstraction code; the system way is definitely not good.
Update: since I wrote this answer, I have learned that on POSIX systems system is specified much better; in particular, it's mandated that it will execute the commands with /bin/sh -c command, blocking until the termination of the shell process.
sh behavior, in turn, is mandated in several ways by POSIX; thus, on POSIX systems, some of the disadvantages listed under "unpredictability of the result" no longer apply:
the default shell is specified, so, as long as you use just sh stuff guaranteed by POSIX (e.g. no bashisms), you are safe;
the call is blocking;
if your command is well-formed, the shell itself doesn't encounter problems, waitpid succeeds, ..., you should get a copy of the error code of the executed program
So, if you run on POSIX, the situation is way less tragic; if, instead, you have to be portable, keep avoiding system.
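On a POSIX system, the two checks mentioned in this answer (whether a shell exists at all, and what the executed command returned) can be sketched as below. The function names are invented for the example, and the status decoding with WIFEXITED / WEXITSTATUS is POSIX-specific, so it does not carry over to other platforms.

```cpp
#include <cstdlib>
#include <sys/wait.h>

// True if a command processor is available to std::system.
bool shell_available() {
    return std::system(nullptr) != 0;
}

// POSIX only: run `command` (must not be null) through /bin/sh -c and return
// the shell's exit status, which is normally that of the last command it ran.
// Returns -1 if the shell could not be started or did not exit normally.
int shell_exit_status(const char* command) {
    const int status = std::system(command);
    if (status == -1) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Note that a status of 127 is what sh conventionally reports when the command was not found, so it cannot always be told apart from a program that itself exits with 127.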

What is wrong with system(), which is part of standard C++? See http://www.cplusplus.com/reference/clibrary/cstdlib/system.

There's a library that I believe is trying to get into boost called Boost.Process. You'll have to find a download for it, probably in sandbox or whatnot.

You might want to take a look at this question regarding popen() on win32: popen
