How to use dlopen() to get the executable's path - C++

I am trying to use dlopen() and dlinfo() to get the path of my executable. I am able to get the path to a .so by using the handle returned by dlopen(), but when I use the handle returned by dlopen(NULL, RTLD_LAZY), the path I get back is empty.
void* executable_handle = dlopen(nullptr, RTLD_LAZY);
if (nullptr != executable_handle)
{
    link_map* plink = nullptr;
    int r = dlinfo(executable_handle, RTLD_DI_LINKMAP, &plink);
    if (0 == r)
    {
        printf("path: %s\n", plink->l_name);
    }
}
Am I wrong in my assumption that the handle for the executable can be used in the dlinfo functions the same way a .so handle can be used?

Am I wrong in my assumption that the handle for the executable can be used in the dlinfo functions the same way a .so handle can be used?
Yes, you are.
The dynamic linker has no idea which file the main executable was loaded from. That's because the kernel performs all the mmaps for the main executable and only passes a file descriptor to the dynamic loader (whose job it is to load the other required libraries and start the executable running).
I'm trying to replicate some of the functionality of GetModuleFileName() on linux
There is no reliable way to do that. In fact the executable may no longer exist anywhere on disk at all -- it's perfectly fine to run the executable and remove the executable file while the program is still running.
Also hard links mean that there could be multiple correct answers -- if a.out and b.out are hard linked, there isn't an easy way to tell whether a.out or b.out was used to start the program running.
Your best options probably are reading /proc/self/exe, or parsing /proc/self/cmdline and/or /proc/self/maps.
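A minimal sketch of the /proc/self/exe route (Linux-specific; note the result will have " (deleted)" appended if the binary has been removed):

#include <unistd.h>   // readlink
#include <limits.h>   // PATH_MAX
#include <cstdio>

int main()
{
    char path[PATH_MAX];
    // readlink does not NUL-terminate, so leave room for the terminator.
    ssize_t len = readlink("/proc/self/exe", path, sizeof(path) - 1);
    if (len == -1)
    {
        perror("readlink(/proc/self/exe)");
        return 1;
    }
    path[len] = '\0';
    printf("executable: %s\n", path);
    return 0;
}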

The BSD utility library has a function getprogname(3) that does exactly what you want. I'd suggest that is more portable and easier to use than procfs in this case.
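For reference, a minimal sketch, assuming a BSD-style libc where getprogname(3) is declared in <stdlib.h> (macOS and the BSDs; on glibc-based Linux it is available via libbsd). Note it returns the program name, not a full path:

#include <stdlib.h>  // getprogname() on BSD-style libcs
#include <cstdio>

int main()
{
    // Returns the name the program was invoked as (no directory part),
    // unlike the full path that /proc/self/exe gives you.
    printf("program name: %s\n", getprogname());
    return 0;
}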

Related

Cannot dlopen a shared library from a shared library, only from executables

I have a C++ CMake project that has multiple sub-projects that I package into shared libraries. Then the project itself, which is an executable, links with all these shared libraries. This is a project that is being ported from Windows to Ubuntu. What I do is have the executable, EXE, use one subproject, Core, to open all the other libraries. The problem is that this isn't working on Linux.
This is EXE:
int main(int argc, char *argv[])
{
    core::plugin::PluginManager& wPluginManager = core::plugin::PluginManagerSingleton::Instance();
    wPluginManager.loadPlugin("libcore.so");
    wPluginManager.loadPlugin("libcontroller.so");
    wPluginManager.loadPlugin("libos.so");
    wPluginManager.loadPlugin("libnetwork.so");
    wPluginManager.loadPlugin("liblogger.so");
}
This is core::plugin::PluginManager::loadPlugin():
bool PluginManager::loadPlugin(const boost::filesystem::path &iPlugin) {
    void* plugin_file = dlopen(iPlugin.string().c_str(), RTLD_LAZY);
    std::cout << (plugin_file ? "success" : "failed") << std::endl;
    return true;
}
What happens is that libcore gets loaded properly, but all the other libraries fail with no error message. I cannot find out why it's not working. However, when I do the same thing but have main load the libraries instead of Core, it works.
Basically, I can load libraries from an exe, but I can't from other shared libraries. What gives and how can I fix this?
The most likely reason for dlopen from the main executable to succeed and for the exact same dlopen from libcore.so to fail is that the main executable has correct RUNPATH to find all the libraries, but libcore.so does not.
You can verify this with:
readelf -d main-exe   | grep 'R.*PATH'
readelf -d libcore.so | grep 'R.*PATH'
If (as I suspect) main-exe has RUNPATH, and libcore.so doesn't, the right fix is to add -rpath=.... to the link line for libcore.so.
You can also gain a lot of insight into dynamic loader operation by using the LD_DEBUG environment variable:
LD_DEBUG=libs ./main-exe
will tell you which directories the loader is searching for which libraries, and why.
I cannot find out why it's not working
Yes, you can. You haven't spent nearly enough effort trying.
Your very first step should be to print the value of dlerror() when dlopen fails. The next step is to use LD_DEBUG. And if all that fails, you can actually debug the runtime loader itself -- it's open-source.
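A minimal sketch of that first step (load_or_report is an illustrative name, not the poster's code):

#include <dlfcn.h>
#include <cstdio>

// Hypothetical helper: try to load one plugin and report why it failed.
void* load_or_report(const char* path)
{
    void* handle = dlopen(path, RTLD_LAZY);
    if (!handle)
    {
        // dlerror() returns a human-readable string describing the most
        // recent dlopen/dlsym failure, or NULL if there was none.
        fprintf(stderr, "dlopen(%s) failed: %s\n", path, dlerror());
    }
    return handle;
}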
I managed to find a fix for this issue. I don't quite understand the inner workings nor the explanation of my solution, but it works. If someone who has a better understanding than my very limited experience with shared libraries could comment on my answer with the real explanation, I'm sure it could help future viewers of this question.
What I was doing was dlopen("libcore.so"). I simply changed it to an absolute path, dlopen("/home/user/project/libcore.so"), and it now works. I have not yet tried relative paths, but it appears we should always pass a relative or absolute path to dlopen instead of just the filename.
If an absolute path helped, the problem may be with your shared libraries' own dependencies. In other words, libcontroller.so may depend on libos.so or another of your libraries but cannot find it. By default the Linux loader assumes shared libraries are placed in /lib, /usr/lib, etc. You need to specify the search path for dynamic libraries with the LD_LIBRARY_PATH environment variable.
Try to run your app this way:
LD_LIBRARY_PATH=/path/to/your/executable/and/modules ./yourapp
bool PluginManager::loadPlugin(const boost::filesystem::path &iPlugin) {
    void* plugin_file = dlopen(iPlugin.string().c_str(), RTLD_LAZY);
    std::cout << (plugin_file ? "success" : "failed") << std::endl;
    return true;
}
The flags to use with dlopen depend upon the distro. I think Debian and derivatives use RTLD_GLOBAL | RTLD_LAZY, while Red Hat and derivatives use RTLD_GLOBAL. Or maybe it is vice-versa. And I seem to recall Android uses RTLD_LOCAL, too.
You should just try both to simplify loading on different platforms:
bool PluginManager::loadPlugin(const boost::filesystem::path &iPlugin) {
    void* plugin_file = dlopen(iPlugin.string().c_str(), RTLD_GLOBAL);
    if (!plugin_file) {
        plugin_file = dlopen(iPlugin.string().c_str(), RTLD_GLOBAL | RTLD_LAZY);
    }
    const bool success = plugin_file != NULL;
    std::cout << (success ? "success" : "failed") << std::endl;
    return success;
}
What happens is that libcore gets loaded properly, but all the other libraries fail with no error message
This sounds a bit unusual. It sounds like the additional libraries from the sub-projects are not in the linker path.
You should ensure the additional libraries are in the linker path. Put them next to libcore.so in the filesystem since loading libcore.so seems to work as expected.
If they are already next to libcore.so, then you need to provide more information, like the failure from loadPlugin, the RUNPATH used (if present) and the output of ldd.
but all the other libraries fail with no error message. I cannot find out why it's not working.
As #Paul stated in the comments, the way to check for a dlopen error is with dlerror. It is kind of a crappy way to do it since you can only get a text string and not an error code.
The dlopen man page is at http://man7.org/linux/man-pages/man3/dlopen.3.html, and it says:
RETURN VALUE
On success, dlopen() and dlmopen() return a non-NULL handle for the loaded library. On error (file could not be found, was not readable, had the wrong format, or caused errors during loading), these functions return NULL.
On success, dlclose() returns 0; on error, it returns a nonzero value.
Errors from these functions can be diagnosed using dlerror(3).

Is using an absolute path to load system DLLs with LoadLibrary() and GetModuleHandle() better to prevent DLL hijacking?

In many cases to load some newer API one would use a construct as such:
(FARPROC&)pfnZwQueryVirtualMemory = ::GetProcAddress(::GetModuleHandle(L"ntdll.dll"), "ZwQueryVirtualMemory");
But then, considering the chance of DLL hijacking, is it better to specify the DLL's absolute path, as shown below? Or is it just overkill?
WCHAR buff[MAX_PATH];
buff[0] = 0;
if (::GetSystemDirectory(buff, MAX_PATH) &&
    ::PathAddBackslash(buff) &&
    SUCCEEDED(::StringCchCat(buff, MAX_PATH, L"ntdll.dll")))
{
    (FARPROC&)pfnZwQueryVirtualMemory = ::GetProcAddress(::GetModuleHandle(buff), "ZwQueryVirtualMemory");
}
else
{
    // Something went wrong
    pfnZwQueryVirtualMemory = NULL;
}
The problem with the latter method is that it doesn't always work (for instance with Comctl32.dll.)
You don't have to do anything special for ntdll.dll and kernel32.dll because they are going to be loaded before you get the chance to do anything; they are also on the known-DLLs list.
DLL hijacking issues often involve the auxiliary libraries. Take version.dll for example: it is no longer on the known-DLLs list, so explicitly linking to it is problematic; it needs to be loaded dynamically.
The best solution is a combination of 3 things:
Call SetDefaultDllDirectories(LOAD_LIBRARY_SEARCH_SYSTEM32) if it is available (Win8+ and updated Win7).
Call LoadLibrary with the full path (GetSystemDirectory) before calling GetModuleHandle.
Don't explicitly link to anything other than kernel32, user32, gdi32, shlwapi, shell32, ole32, comdlg32 and comctl32.
If SetDefaultDllDirectories is not available, it is really hard to protect yourself if you don't control the application directory, because various Windows functions will delay-load DLLs like shcore.dll without full paths (especially the shell APIs). SetDllDirectory("") helps against the current/working directory, but there is no good application-directory workaround for unpatched pre-Win8 systems; you just have to test with Process Monitor and manually load the problematic libraries early in WinMain.
The application directory is a problem because some users just put everything in the downloads folder and run it from there. This means you might end up with a malicious dll in your application directory.
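A minimal sketch of points 1 and 2 above (Win32 C++; LoadSystemDll is a hypothetical helper name, not a Windows API):

#include <windows.h>
#include <strsafe.h>

#ifndef LOAD_LIBRARY_SEARCH_SYSTEM32
#define LOAD_LIBRARY_SEARCH_SYSTEM32 0x00000800
#endif

HMODULE LoadSystemDll(const wchar_t* name)
{
    // Restrict the default DLL search path to %windir%\System32 where the
    // API exists (Windows 8+ or an updated Windows 7); skip it elsewhere.
    typedef BOOL (WINAPI *SetDefDllDirs)(DWORD);
    SetDefDllDirs pSetDefaultDllDirectories = reinterpret_cast<SetDefDllDirs>(
        ::GetProcAddress(::GetModuleHandleW(L"kernel32.dll"),
                         "SetDefaultDllDirectories"));
    if (pSetDefaultDllDirectories)
        pSetDefaultDllDirectories(LOAD_LIBRARY_SEARCH_SYSTEM32);

    // Load the DLL by its full path in the system directory; after this,
    // GetModuleHandle/GetProcAddress can safely use the loaded module.
    WCHAR path[MAX_PATH];
    if (!::GetSystemDirectoryW(path, MAX_PATH))
        return NULL;
    if (FAILED(::StringCchCatW(path, MAX_PATH, L"\\")) ||
        FAILED(::StringCchCatW(path, MAX_PATH, name)))
        return NULL;
    return ::LoadLibraryW(path);
}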

Determine the path of a *.dll or *.so at runtime

A C++ library must read a file from disk that is located in the same directory as the library. I don't know if the question is trivial or if it is impossible: How can the library determine its own path on disk? The solution should work for Linux (.so) and Windows (.dll).
Although the task you describe is usually solved in a different fashion, here's the solution for Linux:
#define _GNU_SOURCE
#include <stdio.h>
#include <dlfcn.h>

void foo() {
    Dl_info dlInfo;
    dladdr((void *)puts, &dlInfo);
    if (dlInfo.dli_sname != NULL && dlInfo.dli_saddr != NULL)
        printf("puts is loaded from %s\n", dlInfo.dli_fname);
    else
        printf("It's strange but puts is not found\n");
    puts("Hello, world! I'm foo!");
}
Here the file that the puts function was loaded from is determined using dladdr.
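To get the path of the library itself rather than of libc, you can pass dladdr the address of a symbol defined in your own library. A minimal sketch, assuming this code is compiled into the .so in question:

#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>

// Any function or object defined in this shared library will do as an anchor.
static void anchor() {}

const char* my_library_path(void)
{
    Dl_info info;
    // dladdr fills info.dli_fname with the path of the object that
    // contains the given address -- here, this library itself.
    if (dladdr((void *)&anchor, &info) != 0 && info.dli_fname != NULL)
        return info.dli_fname;
    return NULL;
}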
For Windows see this recipe
When I said the task is usually solved differently, I meant the following. On Linux (and other typical Unix-like systems) shared libraries are kept separate from the rest of the "application data" (the former reside in /lib, /lib64, /usr/lib or /usr/lib64, depending on the target platform, the "importance" of a particular library and other factors; the latter typically go to /usr/share/<appname>). That is, the path to the library won't help you. The path to the application's data files is usually configured at compile time, sometimes via a dedicated setting in the application configuration.
For those "third-party" apps which are intended to be installed into /opt or other "special locations" like /usr/local, the path to the application data is derived from the path of the application, which in turn is often determined from the location of the application binary, which is known right from the main() arguments. Certainly your case may require a special layout, but it's useful to keep these considerations in mind.
Thank you for the answers. As I understand it now, getting the path is overly complicated and not portable across platforms. It seems better to embed the file in the executable.
Edit: I have found
http://www.fourmilab.ch/xd/
which is really easy to use.

Substitute for call to "system" in my C++ program

I'm trying to find a substitute for a call to "system" (from stdlib.h) in my C++ program.
So far I've been using it to call g++ in my program to compile and then link a variable number of source files in a directory chosen by the user.
Here is an example of roughly what the command could look like: "C:/mingw32/bin/g++.exe -L"C:\mingw32\lib" [...]"
However, I have the problem that (at least with the MinGW compiler I'm using) I get the error "Command line is too long" when the command string gets too long.
In my case it was about 12000 characters long. So I probably need another way to call g++.
Additionally, I've read that you generally shouldn't use "system" anyway: http://www.cplusplus.com/forum/articles/11153/
So I'm in need for some substitute (that should also be as platform independent as possible, because I want the program to run on Windows and Linux).
I've found one candidates that would generally look quite well suited:
_execv / execv:
Platform independent, but:
a) http://linux.die.net/man/3/exec says "The exec() family of functions replaces the current process image with a new process image". So do I need to call "fork" first so that the C++ program isn't terminated? Is fork also available on Windows/MSVC?
b) Using "system", I've tested whether the return value was 0 to see if the source file could be compiled. How would this work with exec? If I understand the manpage correctly, will it only return the success of creating the new process and not the status of g++? And with which function could I suspend my program to wait for g++ to finish and get the return value?
All in all, I'm not quite sure how I should handle this. What are your suggestions? How do multiplatform programs like Java (Runtime.getRuntime().exec(command)) or the Eclipse C++ IDE internally solve this? What would you suggest me to do to call g++ in an system independent way - with as many arguments as I want?
EDIT:
Now I'm using the following code - I've only tested it on Windows yet, but at least there it seems to work as expected. Thanks for your idea, jxh!
Maybe I'll look into shortening the commands by using relative paths in the future. Then I would have to find a platform independent way of changing the working directory of the new process.
#ifdef WIN32
    // Needs <process.h>; spawnv with P_WAIT blocks until g++ exits and returns its status.
    int success = spawnv(P_WAIT, sCompiler.c_str(), argv);
#else
    // Needs <unistd.h> and <sys/wait.h>.
    pid_t pid;
    switch (pid = fork()) {
    case -1:
        cerr << "Error using fork()" << endl;
        return -1;
    case 0:
        execv(sCompiler.c_str(), argv);
        _exit(127); // only reached if execv itself failed
    default:
        int status;
        if (wait(&status) != pid) {
            cerr << "Error using wait()" << endl;
            return -1;
        }
        int success = WEXITSTATUS(status);
    }
#endif
You might get some traction with some of these command line options if all your files are in (or could be moved to) one (or a small number) of directories. Given your sample path to audio.o, this would reduce your command line by about 90%.
-Ldir
Add directory dir to the list of directories to be searched for `-l'.
From: https://gcc.gnu.org/onlinedocs/gcc-3.0/gcc_3.html#SEC17
-llibrary
Search the library named library when linking.
It makes a difference where in the command you write this option; the linker searches and processes libraries and object files in the order they are specified. Thus, `foo.o -lz bar.o' searches library `z' after file `foo.o' but before `bar.o'. If `bar.o' refers to functions in `z', those functions may not be loaded.
The linker searches a standard list of directories for the library, which is actually a file named `liblibrary.a'. The linker then uses this file as if it had been specified precisely by name.
The directories searched include several standard system directories plus any that you specify with `-L'.
Normally the files found this way are library files--archive files whose members are object files. The linker handles an archive file by scanning through it for members which define symbols that have so far been referenced but not defined. But if the file that is found is an ordinary object file, it is linked in the usual fashion. The only difference between using an `-l' option and specifying a file name is that `-l' surrounds library with `lib' and `.a' and searches several directories.
From: http://gcc.gnu.org/onlinedocs/gcc-3.0/gcc_3.html
Here's another option, perhaps closer to what you need. Try changing the directory before calling system(). For example, here's what happens in Ruby...I'm guessing it would act the same in C++.
> system('pwd')
/Users/dhempy/cm/acts_rateable
=> true
> Dir.chdir('..')
=> 0
> system('pwd')
/Users/dhempy/cm
=> true
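A rough C++ equivalent of the idea above, using POSIX chdir (on Windows the counterpart is _chdir from <direct.h>); note that it changes the working directory of your own process as a side effect:

#include <cstdlib>   // std::system
#include <unistd.h>  // chdir (POSIX)

// Hypothetical helper: run a command from a given directory so that the
// command line can use short relative paths instead of long absolute ones.
int run_in_dir(const char* dir, const char* command)
{
    if (chdir(dir) != 0)
        return -1;
    return std::system(command);
}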
If none of the other answers pan out, here's another. You could set an environment variable to the path of the directory, then use that variable before each file that you link in.
I don't like this approach much, as you have to tinker with the environment, and I don't know whether it would actually help with the command-line limit; it may be that the limit applies after the command is interpolated. But it's something to think about, regardless.

How can I load an external file/program in memory and then execute it (C++ and Unix)?

Let's say I have an executable file called "execfile". I want to read that file using a C++ program (on Unix) and then execute it from memory. I know this is possible on Windows but I was not able to find a solution for Unix.
In pseudo-code it would be something like this:
declare buffer (char *)
readfile "execfile" in buffer
execute buffer
Just to make it clear: obviously I could just execute the file using system("execfile"), but, as I said, this is not what I intend to do.
Thank you.
EDIT 1: To make it even more clear (and the reason why I can't use dlopen): the reason I need this functionality is that the executable files are going to be generated dynamically, so I cannot just build all of them at once into a single library. To be more precise, I'm building a tool that will first encrypt an executable file with a key and then be able to execute that encrypted file, first decrypting it and then executing it (without leaving a copy of the decrypted file on the file system).
You cannot do this without writing a mountain of code. Loading and linking an a.out is a kernel facility, not a user-mode facility, on Linux.
You'd be better off making a shared library and loading it with dlopen.
The solution to load-and-run -- not necessarily in C++ -- is to use dlopen+dlsym to load a dynamic library and obtain a pointer to a function defined in the library.
See the C++ dlopen mini HOWTO for a description of how to deal with C++ symbols in dynamic libraries.
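A minimal sketch of the dlopen+dlsym approach (the library name libplugin.so and the entry point plugin_main are made up for illustration):

#include <dlfcn.h>
#include <cstdio>

int main()
{
    // Load the plugin; RTLD_NOW resolves all symbols up front.
    void* handle = dlopen("./libplugin.so", RTLD_NOW);
    if (!handle) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }

    // Look up a C-linkage entry point exported by the library.
    typedef int (*entry_fn)(void);
    entry_fn entry = reinterpret_cast<entry_fn>(dlsym(handle, "plugin_main"));
    if (!entry) {
        fprintf(stderr, "dlsym failed: %s\n", dlerror());
        dlclose(handle);
        return 1;
    }

    int rc = entry();
    dlclose(handle);
    return rc;
}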