C++ - Does LoadLibrary() actually link to the library?

I'm using Code::Blocks and hate manually linking DLLs. I found the LoadLibrary() function, and am wondering if it works like a .a or .lib file would. Does this function work like that? If not, what can I do programming-wise (if anything) to link a DLL without having to go through the Project > Build options > Linker settings > Add... method?

LoadLibrary loads the requested library (and all the libraries it needs) into your process's address space. In order to access any of the code or data in that library, you need to find its address in the newly loaded region of memory, which is what GetProcAddress is for.
The difference between this process and adding a library at build time is that for a build-time library the compiler prepares a list of locations that refer to a given function, the linker puts that list into the .exe, and the run-time loader loads the library, does the equivalent of GetProcAddress for each function name, and places the address into all the locations the compiler marked.
When you don't have this automated support, you have to declare a pointer to function, call GetProcAddress yourself, and assign the returned value to your pointer. You can then call the function like any other C function. (Note the "C" part: the above process is complicated by name mangling when you use C++, so make use of extern "C".)
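For illustration, here is a minimal sketch of that manual process (the DLL name "plugin.dll" and the function "compute" are hypothetical, and the DLL is assumed to export compute with C linkage):
#include <windows.h>
// Matches the signature the (hypothetical) DLL exports with extern "C".
typedef int (*ComputeFn)(int);
int call_compute(int x)
{
    HMODULE dll = LoadLibraryW(L"plugin.dll");                      // load at runtime
    if (!dll) return -1;                                            // DLL not found
    ComputeFn compute = (ComputeFn)GetProcAddress(dll, "compute");  // look up by name
    int result = compute ? compute(x) : -1;                         // call through the pointer
    FreeLibrary(dll);
    return result;
}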

LoadLibrary() loads a DLL at runtime. Normally you link when you compile the EXE, and the DLLs' import libraries are linked in like static libraries at that time. If you need to load libraries dynamically at runtime, you use LoadLibrary().
This is useful, for example, when you implement a plugin system, because you don't know the libraries beforehand.

That's not how it works at all. LoadLibrary is used to load a DLL that is unknown at compile time, such as program extensions/plugins, or "this DLL for SSE, that DLL for non-SSE" based on what the hardware can do. One could also consider having a DLL per connection type to email servers or something like that, so that the email program doesn't have to carry all the different variants when only one is used for any particular email address.
Further, to use a DLL that has been loaded this way, you need to use GetProcAddress to get the addresses of the functions in the DLL. This is very different from linking the DLL into the project at build time, where the functions simply appear "automagically" with the help of the system loader resolving the DLLs that are added to the project at build time.
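A minimal sketch of the "one DLL per hardware capability" idea from above (the DLL names are hypothetical; IsProcessorFeaturePresent is the real Win32 call for the CPU check):
#include <windows.h>
HMODULE load_math_dll()
{
    // Pick the SSE2 build only if the CPU supports it; otherwise fall back.
    const wchar_t *name = IsProcessorFeaturePresent(PF_XMMI64_INSTRUCTIONS_AVAILABLE)
                              ? L"math_sse2.dll"
                              : L"math_generic.dll";
    return LoadLibraryW(name);  // functions are then fetched with GetProcAddress
}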

Unlike static linking or implicit (load-time) dynamic linking, LoadLibrary doesn't make the library's symbols directly available to your program. You need to call GetProcAddress at runtime to get a pointer to the functions you want to call.
As Devolus mentioned, this is a good way to implement plugin systems and/or to access optional components. However, since the symbols are not available to your program in a transparent way, this is not really practical for ordinary usage.

Related

Why do we need a .lib file when importing functions from a .dll?

Can you help me understand why we need .lib files when importing functions and data from a DLL?
I've heard that it contains a list of the exported functions and data elements from the corresponding DLL, but when I used CFF Explorer to explore my DLL, I found out that the DLL already has the addresses of the exported functions, so theoretically I could link my program against the .dll without any additional files.
Can you please explain in more detail what kind of data is stored in .lib files?
And also, yes, I know that Visual Studio forces us to add .lib files to the additional dependencies section, but why does it really need them?
When your source code statically calls exported DLL functions, or statically accesses exported DLL variables, those references are compiled into your executable's intermediate object files as pointers, whose values get populated at run-time.
When the linker is combining the compiler-generated object files to make the final executable, it has to figure out what all of the compiler-generated references actually refer to. If it can't match a given reference to some piece of code in your executable, it needs to match it to an external DLL instead. So it needs to know which DLLs to even look at, and how those DLLs export things. A DLL may export a given function/variable by name OR by ordinal number, so the linker needs a way to map the identifiers used by your code references to specific entries in the EXPORTS tables of specific .dll files (especially in the case where things are exported by ordinals). Static-link .lib files provide the linker with that mapping information (i.e., FunctionA maps to ordinal 123 in DLL XYZ.dll, FunctionB maps to the name _FunctionB@4 in DLL ABC.dll, etc.).
The linker can then populate the IMPORTS table of your executable with information about the appropriate EXPORTS entries needed, and then make the DLL references in your code point to the correct IMPORTS entries (if the linker can't resolve a compiler-generated reference to a piece of code in your executable, or to a specific DLL export, it aborts with an "unresolved external" error).
Then, when your executable is loaded at run-time, the OS Loader looks at the IMPORTS table to know which DLL exports are needed, so it can then load the appropriate DLLs into memory and update the entries in the IMPORTS table with real memory addresses that are based on each DLL's EXPORTS table (if a referenced DLL fails to load, or if a referenced export fails to be found, the OS Loader aborts loading your executable). That way, when your code calls DLL functions or accesses DLL variables, those accesses go to the right places.
Things are very different if your source code dynamically accesses DLL functions/variables via explicit calls to GetProcAddress() at run-time. In that case, static-link .lib files are not needed for those accesses, since your own code is handling the loading of DLLs into memory and locating the exports that it wants to use.
However, there is a 3rd option that blends the above scenarios together: you can write your code to access the DLL functions/variables statically but use your linker's delay-load feature (if it has one). In that case, you still need static-link .lib files for each delay-loaded DLL you access, but the linker populates a separate DELAYLOAD table in your executable with references to the DLL exports, instead of populating the IMPORTS table. It points the compiler-generated DLL references to stubs in your compiler's RTL that will replace the references with addresses from GetProcAddress() when the stubs are accessed for the first time at run-time, thus avoiding the need for the references to be populated by the OS Loader at load-time. This allows your executable to run normally even if the DLL exports are not present at load-time, and may not even need to load the DLLs at all if they are never used (of course, if your executable does try to access a DLL export dynamically and it fails to load, your code is likely to crash, but that is a separate issue).
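As a rough sketch of the options above (all names are hypothetical; the static and delay-load variants additionally need the DLL's .lib passed to the linker):
#include <windows.h>
// 1) Static or delay-loaded import: declare the export and link against foo.lib.
//    With /DELAYLOAD:foo.dll the run-time stub resolves it on first call instead of at load time.
extern "C" __declspec(dllimport) int foo_add(int a, int b);
// 2) Fully dynamic: no .lib involved, you do the lookup yourself.
typedef int (*FooAddFn)(int, int);
int dynamic_add(int a, int b)
{
    HMODULE dll = LoadLibraryW(L"foo.dll");
    FooAddFn fn = dll ? (FooAddFn)GetProcAddress(dll, "foo_add") : nullptr;
    return fn ? fn(a, b) : 0;
}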
I've heard that it contains a list of the exported functions and data elements from the corresponding DLL, but when I used CFF Explorer to explore my DLL, I found out that the DLL already has the addresses of the exported functions, so theoretically I could link my program against the .dll without any additional files.
As a trivial example of why this can't always work, consider an executable that accesses two DLLs, one for a Winsock filter and the other for an allocator. And say that on this particular machine, the Winsock filter DLL happens to also implement an allocator with the same API and the allocator DLL happens to also implement a Winsock filter with the same API. How could the compiler know which API functions to access from which DLL? The library file contains the intent in accessing the DLL, that is, the API and functions you want to access.
Importantly, there is no such thing as "The corresponding DLL". There might be different DLL files on different systems. What the linker needs to know is what the DLL is supposed to look like that it can rely on, not what the DLL that you might happen to use on some particular system might happen to be.
For example, suppose the DLL file contains an allocator. You might have one DLL file for an allocator with debugging, one for an allocator with optimizations for specific CPU versions, and one for an allocator that uses a new, experimental algorithm. What the linker needs to know is the API that all these DLL files implement, not the specific implementation in any one file.
You can produce a LIB file from a DLL file but you might wind up building an executable that doesn't work when using some other version of the DLL file. You would have to assume that whatever this particular DLL happens to do is precisely what every other DLL that implements the same API will happen to do.
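For what it's worth, a sketch of how that can be done with Microsoft's tools (the DLL name is hypothetical, and the .def file has to be written by hand from the export list, so decorated/stdcall names can make this fiddly):
rem List the exported names, then hand-write mylib.def with an EXPORTS section
dumpbin /exports mylib.dll
rem Generate an import library from the hand-written .def file
lib /def:mylib.def /out:mylib.lib /machine:x64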

Does DLL linking on Windows result in GetProcAddress at runtime?

I'm curious about how dynamic linking works on Windows. Since we cannot link to a DLL directly, Windows usually links your executable against a LIB file which contains stubs for the functions exported by the DLL. Does this type of linking result in LoadLibrary and GetProcAddress at runtime? If not, how does the linking work internally?
The answer is maybe.
The default method is to create an Import Table, which lists all required DLLs and the functions used from them. This table is parsed directly by the OS. It will probably reuse some of the same code behind LoadLibrary for that. It most likely will not use the code from GetProcAddress but prefer to do a single bulk lookup of all necessary functions.
However, there's an MSVC feature called delay-loading. With this feature, MSVC++ will not build such an import table, but will insert actual LoadLibrary and GetProcAddress calls. The benefit is that these calls are made at the latest possible moment: as long as you don't need a particular DLL, it's not loaded. This can accelerate program start-up.

Static link an existing windows binary

I was wondering if I can take an existing windows DLL and static link the dynamically-linked files?
I saw a number of projects to do this with Linux/elf
http://magicermine.com/
http://statifier.sourceforge.net/
http://bitwagon.com/jumpstart/jumpstart.html
I imagine this is most likely not possible, but I am running into some issues in WinPE where when I statically linked the DLLs everything started working great.
I don't have the source to the existing DLL.
I guess I could make a pass-through DLL that exposed all of the same functions and static linked?
There is no tool support for linking in the code of a DLL statically.
The problem is that a DLL is a full Windows PE executable, not a C or C++ “library” in any sense. The C++ standard has only one statement that is vaguely in support of DLL-like things (in the paragraph about dynamic initialization after the first statement of main). You’re out of luck.
But if you had the source code (as e.g. with MFC), which you say you don’t, then you could just have created static libraries.
Do note that there already is a meaning for “linking statically” a DLL, namely to have it loaded and have its functions resolved automatically.
Which is the usual way of using a DLL.
And which is in contrast to explicitly loading it dynamically and using GetProcAddress to resolve its functions.
Regarding "when I statically linked the DLLs everything started working great": presumably earlier you explicitly loaded the DLLs dynamically and used GetProcAddress, and presumably something about that did not work perfectly.
One main problem with GetProcAddress is that it assumes that the provided function name is encoded as Windows ANSI (the machine-dependent encoding reported by GetACP), and then (apparently) translates that to UTF-8 for the function lookup.
One workaround could be to access the function by ordinal rather than name.
One way to find the ordinal with Microsoft's tools, is to use dumpbin /exports.
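For illustration, a minimal sketch of calling an export by ordinal (the DLL name, the signature, and ordinal 12 are all hypothetical; read the real ordinal from dumpbin /exports):
#include <windows.h>
typedef int (__cdecl *SomeFn)(int);
int call_by_ordinal(int arg)
{
    HMODULE h = LoadLibraryW(L"vendor.dll");
    if (!h) return -1;
    // MAKEINTRESOURCEA packs the ordinal into the LPCSTR parameter of GetProcAddress.
    SomeFn fn = (SomeFn)GetProcAddress(h, MAKEINTRESOURCEA(12));
    return fn ? fn(arg) : -1;
}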

C/C++ How Does Dynamic Linking Work On Different Platforms?

How does dynamic linking work generally?
On Windows (LoadLibrary), you need a .dll to call at runtime, but at link time, you need to provide a corresponding .lib file or the program won't link... What does the .lib file contain? A description of the .dll methods? Isn't that what the headers contain?
Relatedly, on *nix, you don't need a lib file... How does the compiler know that the methods described in the header will be available at runtime?
As a newbie, when you think about either one of the two schemes, then the other, neither of them makes sense...
To answer your questions one by one:
Dynamic linking defers part of the linking process to runtime. It can be used in two ways: implicitly and explicitly. Implicitly, the static linker will insert information into the executable which will cause the library to load and resolve the necessary symbols. Explicitly, you must call LoadLibrary or dlopen manually, and then GetProcAddress/dlsym for each symbol you need to use. Implicit loading is used for things like the system library, where the implementation will depend on the version of the system, but the interface is guaranteed. Explicit loading is used for things like plug-ins, where the library to be loaded will be determined at runtime.
The .lib file is only necessary for implicit loading. It contains the information that the library actually provides this symbol, so the linker won't complain that the symbol is undefined, and it tells the linker in what library the symbols are located, so it can insert the necessary information to cause this library to automatically be loaded. All the header files tell the compiler is that the symbols will exist, somewhere; the linker needs the .lib to know where.
Under Unix, all of the information is extracted from the .so. Why Windows requires two separate files, rather than putting all of the information in one file, I don't know; it's actually duplicating most of the information, since the information needed in the .lib is also needed in the .dll. (Perhaps licensing issues. You can distribute your program with the .dll, but no one can link against the libraries unless they have a .lib.)
The main thing to retain is that if you want implicit loading, you have to provide the linker with the appropriate information, either with a .lib or a .so file, so that it can insert that information into the executable. And that if you want explicit loading, you can't refer to any of the symbols in the library directly; you have to call GetProcAddress/dlsym to get their addresses yourself (and do some funny casting to use them).
The .lib file on Windows is not required for loading a dynamic library, it merely offers a convenient way of doing so.
In principle, you can use LoadLibrary for loading the DLL and then use GetProcAddress for accessing functions provided by that DLL. The compilation of the enclosing program does not need to access the DLL in that case; it is only needed at runtime (i.e. when LoadLibrary actually executes). MSDN has a code example.
The disadvantage here is that you need to manually write code for loading the functions from the dll. In case you compiled the dll yourself in the first place, this code simply duplicates knowledge that the compiler could have extracted from the dll source code automatically (like the names and signatures of exported functions).
This is what the .lib file does: it contains import stubs for the DLL's exported functions, generated for you so you don't have to worry about the lookup yourself. In Windows terms, this is called load-time dynamic linking, since the DLL is loaded automatically when your enclosing program is loaded (as opposed to the manual approach, referred to as run-time dynamic linking).
How does dynamic linking work generally?
The dynamic link library (aka shared object) file contains machine code instructions and data, along with a table of metadata saying which offsets in that code/data relate to which "symbols", the type of the symbol (e.g. function vs data), the number of bytes or words in the data, and a few other things. Different OS will tend to have different shared object file formats, and indeed the same OS may support several, but that's the gist of it.
So, imagine the shared library's a big chunk of bytes with an index like this:
SYMBOL        ADDRESS   TYPE       SIZE
my_function   1000      function   2893
my_number     4800      variable   4
In general, the exact type of the symbols need not be captured in the metadata table - it's expected that declarations in the library's header files contain all the missing information. C++ is a bit special - compared to say C - because overloading can mean there are several functions with the same name, and namespaces allow for further symbols that would otherwise be ambiguously named - for that reason name mangling is typically used to concatenate some representation of the namespace and function arguments to the function name, forming something that can be unique in the library object file.
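As a small illustration of that (the exact encoding is compiler-specific; this is roughly what GCC/Clang produce under the Itanium C++ ABI):
int add(int a, int b);               // mangles to something like _Z3addii
extern "C" int add_c(int a, int b);  // exported as plain "add_c", easy to look up by name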
A program wanting to use the shared object can generally do one of two things:
have the OS load both itself and the shared object around the same time (before executing main()), with the OS Loader responsible for finding the symbols and examining metadata in the program file image about the use of those symbols, then patching in symbol addresses in the memory the program uses, such that the program can then just run and work functionally as if it'd known about the symbol addresses when it was first compiled (but perhaps a little slower)
or, explicitly in its own source code, call dlopen sometime after main runs, then use dlsym or similar to get the symbol addresses, save them into (function/data) pointers based on the programmer's knowledge of the expected data types, then call them explicitly using the pointers (see the sketch after this list).
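A minimal sketch of that second, explicit route on a POSIX system (the library name "libplugin.so" and the symbol "plugin_init" are hypothetical):
#include <dlfcn.h>
typedef int (*PluginInitFn)(void);
int run_plugin()
{
    void *lib = dlopen("libplugin.so", RTLD_NOW);  // load at runtime
    if (!lib) return -1;
    // dlsym returns void*, hence the usual "funny casting" to a function pointer:
    PluginInitFn init = (PluginInitFn)dlsym(lib, "plugin_init");
    int rc = init ? init() : -1;
    dlclose(lib);
    return rc;
}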
On Windows (LoadLibrary), you need a .dll to call at runtime, but at link time, you need to provide a corresponding .lib file or the program won't link...
That doesn't sound right. Should be one or the other I'd think.
What does the .lib file contain? A description of the .dll methods? Isn't that what the headers contain?
A lib file is - at this level of description - pretty much the same as a shared object file... the main difference is that the compiler's finding the symbol addresses before the program's shipped and run.
Modern *nix systems derive their dynamic-linking process from Solaris. Linux, in particular, doesn't need a separate .lib file because all external dependencies are described in the ELF format itself. The .interp section of the ELF file indicates that there are external symbols inside this executable that need to be resolved dynamically. That is how dynamic linking works there.
There is also a way to handle dynamic linking in user space. This method is called dynamic loading. This is when you use library calls (dlopen and friends) to get function pointers to methods from an external *.so.
More information can be found in this article: http://www.ibm.com/developerworks/library/l-dynamic-libraries/.
Relatedly, on OS X (and I assume *nix... dlopen), you don't need a lib file... How does the compiler know that the methods described in the header will be available at runtime?
The compiler and linker do not need such information. You, the programmer, need to handle the situation where the shared libraries you try to open with dlopen() may not exist.
You can use a DLL file in Windows in two ways: Either you link with it, and you're done, nothing more to do. Or you load it dynamically during run-time.
If you link with it, then the DLL's link library file is used. The link library contains information that the linker uses to actually know which DLL to load and where in the DLL the functions are, so it can call them. When your program is loaded, the operating system also loads the DLL for you; basically it calls LoadLibrary for you.
In other operating systems (like OS X and Linux) it works in a similar way. The difference is that on these systems the linker can look directly at the dynamic library (the .so/.dylib file) and figure out what's needed without a separate static library like on Windows.
To load a library dynamically, you don't need to link with anything related to the library you want to load.
Like others already said: what is included in a .lib file on Windows is included directly in the .so/.dylib on Linux/OS X. But the main question is... why?
Isn't *nix solution better?
I think it is, but the .lib has one advantage. The developer linking to the DLL doesn't actually need to have access to the DLL file itself.
Does a scenario like that happen often in the real world? Is it worth the effort of maintaining two files per DLL file? I don't know.
Edit: Ok, guys, let's make things even more confusing! You can link directly to a DLL on Windows, using MinGW. So the whole import library problem is not directly related to Windows itself. Taken from the sampleDLL article on the MinGW wiki:
The import library created by the "--out-implib" linker option is required iff (==if and only if) the DLL shall be interfaced from some C/C++ compiler other than the MinGW toolchain. The MinGW toolchain is perfectly happy to directly link against the created DLL. More details can be found in the ld.exe info files that are part of the binutils package (which is a part of the toolchain).
Linux also requires linking, but instead of linking against a .lib file it needs to link against the dynamic linker /lib/ld-linux.so.2; this usually happens behind the scenes when using GCC (however, if you are using an assembler directly, you do need to specify it manually).
Both approaches, the Windows .lib approach and the Linux dynamic-linker approach, are in reality considered static linking. There is, however, a difference: on Windows part of the work is done at link time, although it still has work left at load time (I am not sure, but I think the .lib file is merely for the linker to know the physical library name; the symbols are only resolved at load time), while on Linux everything besides linking to the dynamic linker happens at load time.
Dynamic linking, in contrast, generally refers to opening the DLL file manually at runtime (such as by using LoadLibrary()), in which case the burden is entirely on the programmer.
In a shared library, such as a .dll, .dylib, or .so, there is some information about symbols' names and addresses, like this:
------------------------------------
| symbol's name | symbol's address |
|----------------------------------|
| Foo | 0x12341234 |
| Bar | 0xabcdabcd |
------------------------------------
And the load functions, such as LoadLibrary and dlopen, load the shared library and make it available to use.
GetProcAddress and dlsym find a symbol's address for you. For example:
HMODULE shared_lib = LoadLibraryA("asdf.dll");
void *symbol = (void *)GetProcAddress(shared_lib, "Foo");
// symbol is 0x12341234
On Windows, there is a .lib file to go with a .dll. When you link against this .lib file, you don't need to call LoadLibrary and GetProcAddress; you just use the shared library's functions as if they were "normal" functions. How can that work?
In fact, the .lib contains import information. It's something like this:
void *Foo; // please put the address of Foo there
void *Bar; // please put the address of Bar there
When the operating system loads your program (strictly speaking, your module), it performs the equivalent of LoadLibrary and GetProcAddress automatically.
And if you write code such as Foo();, the compiler converts it into (*Foo)(); automatically, so you can use them as if they were "normal" functions.
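To make that concrete, a rough sketch of what the import machinery amounts to (Foo is hypothetical; __imp_ is the prefix MSVC uses for import-address-table slots):
// What you write, after linking against the DLL's import library:
__declspec(dllimport) void Foo();
void caller()
{
    Foo();               // compiled roughly as a call through an IAT slot:
    // (*__imp_Foo)();   // __imp_Foo is the pointer the OS loader fills in at load time
}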

Loosen DLL dependencies at start of application

in my Windows C++ program I have a few dependencies on DLLs (which come with the drivers of input devices). I don't actually load the DLLs myself, but the drivers provide (small) .lib libraries that I statically link against (and I assume it is those libraries that make sure the DLLs are present on the system and load them). I'm writing an application that can take input from a series of video cameras. At run-time, the user chooses which one to use. Currently my problem is that my routines that query whether a camera is connected already require the functionality of that camera to be present on the system. I.e. let's say there are camera models A and B: the user has to install the drivers for both A and B, even if he knows he just owns model B. The user has to do this, as otherwise my program won't even start (once started, it will of course tell the user which of the two cameras are actually connected).
I'd like to know whether there is any possibility, at run-time, to determine which of the DLLs are present, and for those that aren't, somehow disable loading even the static (and, thus, dynamic) component.
So basically my problem is that you cannot do if(DLL was found){ #include "source that includes header using functions defined in lib which loads DLL"}
I think using the DELAYLOAD linker flag may provide the functionality required. It would allow linking with the .lib files but would only attempt to load the DLL if it is used:
link.exe ... /DELAYLOAD:cameraA.dll /DELAYLOAD:cameraB.dll Delayimp.lib
The code would be structured something similar to:
if (/* user selected A */)
{
    // Use camera A functions, resulting in load of cameraA's DLL.
}
else
{
    // Use camera B functions, resulting in load of cameraB's DLL.
}
From Linker Support for Delay-Loaded DLLs:
Beginning with Visual C++ 6.0, when statically linking with a DLL, the linker provides options to delay load the DLL until the program calls a function in that DLL.
An application can delay load a DLL using the /DELAYLOAD (Delay Load Import) linker option with a helper function (default implementation provided by Visual C++). The helper function will load the DLL at run time by calling LoadLibrary and GetProcAddress for you.
You should consider delay loading a DLL if:
- Your program may not call a function in the DLL.
- A function in the DLL may not get called until late in your program's execution.
You need to load the libs at run-time. Take a look at LoadLibrary.
This is an MSDN article about this: DLLs the Dynamic Way. I just glanced over it; it's very old.
This one shows the usage of LoadLibrary: Using Run-Time Dynamic Linking