On Linux, when I load a shared library with dlopen() from an executable (or from another shared library), I expect undefined symbols in that library to be resolved automatically, as long as the executable (or the loading shared library) defines them.
For example, I have library A with these header and source files:
#pragma once
int funcA();
#include "A/header.h"
int funcA() { return 0; }
I also have library B with this source file:
#include "A/header.h"
void funcB() {
    funcA();
}
For library B I specify the path to library A's header, but I don't link library A into library B.
In this case, if I load library B from library A by calling dlopen(), the undefined symbol funcA in library B is resolved, so library B can call funcA.
Is this also true on Windows, or do I have to look up the addresses of all the symbols I need manually?
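A minimal sketch of the Linux side of this, assuming glibc and a stand-in funcA that returns 42 (the helper name find_funcA_like_the_loader is hypothetical). It asks the dynamic loader for funcA in the global scope, the scope that is searched when a dlopen()ed library has undefined symbols. One caveat it makes visible: a symbol defined in an executable is only found if it was placed in the dynamic symbol table, for example by linking with GCC's -rdynamic option, whereas symbols defined in a shared library are exported by default.

```cpp
#include <dlfcn.h>  // dlopen, dlsym (POSIX)

// Stand-in for funcA from the question; extern "C" keeps the name unmangled.
extern "C" int funcA() { return 42; }

using FuncA = int (*)();

// Look up "funcA" the way the loader would when resolving an undefined
// symbol of a freshly dlopen()ed library. Returns nullptr if the symbol
// is not present in the dynamic symbol table.
FuncA find_funcA_like_the_loader() {
    // A null filename yields a handle for the main program and the
    // libraries loaded with it.
    void* global_scope = dlopen(nullptr, RTLD_NOW);
    if (global_scope == nullptr) {
        return nullptr;
    }
    void* address = dlsym(global_scope, "funcA");
    return reinterpret_cast<FuncA>(address);
}
```

If this is compiled into an executable without -rdynamic, the lookup is expected to return nullptr, and a plugin that needs funcA would fail to load for the same reason. On older glibc versions the program also has to be linked with -ldl.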
After researching already answered questions on Stack Overflow:
External symbol resolving in a dll
Compile to .dll with some undefined references with MinGW
I realized that to make something similar work on Windows, I have to create an import library for my shared library A.
At first I thought this was needed only for MSVC, but it looks like MinGW needs an import library too, because that is how things work on Windows. Correct me if I'm missing something.
For me that's a big no-no, so I will probably change how I work with shared libraries and explicitly retrieve every symbol I need through an additional interface. Fortunately, there aren't many of them.
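One way such an "additional interface" could look is a table of function pointers that library A hands to library B after loading it. This is only a sketch: HostApi, plugin_init and funcB_via_table are hypothetical names, and both "sides" are shown in one file for brevity.

```cpp
// Table of entry points that library A passes to library B.
// All names here are illustrative, not taken from a real library.
struct HostApi {
    int (*funcA)();
};

// --- library A side ---
int funcA() { return 7; }

// --- library B side ---
static const HostApi* g_host = nullptr;

// A calls this once after loading B; A would find it with dlsym()
// on Linux or GetProcAddress() on Windows.
extern "C" void plugin_init(const HostApi* api) { g_host = api; }

// B reaches funcA through the table, so B contains no undefined
// reference to funcA and needs no import library for A.
extern "C" int funcB_via_table() {
    return g_host != nullptr ? g_host->funcA() : -1;
}
```

With this layout the only symbol that has to be looked up by name is plugin_init; everything else travels through the table, which behaves the same on Linux and Windows.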
I'm building a shared library using cmake. Here are the steps that I take, starting with building a shared library A_shared using an object library A_obj.
add_library(A_obj OBJECT ${A_SRCS})
add_library(A_shared SHARED)
target_link_libraries(A_shared PUBLIC A_obj)
This process works. Now I wish to build another shared library that uses A_shared and its own sources. So I have:
add_library(B_obj OBJECT ${B_SRCS})
target_link_libraries(B_obj PUBLIC A_shared)
add_library(B_shared SHARED)
target_link_libraries(B_shared PUBLIC B_obj)
It seems to me that this should be valid, because I'm building the objects B_obj, which depend on the shared library A_shared, and then using these objects to construct the shared library B_shared, while transitively passing on the dependence on A_shared through the call to target_link_libraries.
However, this results in undefined symbols when building with MSVC. When linking B_shared.dll, I get unresolved externals for global variables that are defined in ${A_SRCS} and used in ${B_SRCS}, and for nothing else (such as functions). Strangely, the object files of B_obj compile fine.
If I instead link B_shared to A_obj, it works fine. But this gives me the impression that B_shared will then actually contain the object files from A_obj, whereas all I want is for it to link to A_shared.
If I link B_obj to A_obj, nothing changes and I still get unresolved dependencies.
With gcc, B_shared is successfully linked.
Therefore, my question is: am I doing the correct thing in cmake to achieve what I want? I'm wondering what I'm misunderstanding, because I've researched this extensively and I can't find the fault in my process, so I would greatly appreciate any clarification.
This comes from the behaviour of shared libraries on Windows in general (see this question): I have to export the symbols manually. The interaction with the object library wasn't causing the problem.
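A sketch of what exporting manually usually means for a global variable. A_API, A_BUILDING, a_global_counter and a_increment are hypothetical names, and the header and source parts are shown in one file for brevity. With MSVC, a function can be called through the import library without any annotation on the consumer side, but a global variable has to be declared __declspec(dllimport) where it is used, which would match the "globals only" symptom above. The same holds if the build relies on CMake's WINDOWS_EXPORT_ALL_SYMBOLS property.

```cpp
// Normally set by the build system only while compiling library A,
// e.g. via target_compile_definitions; defined here so the sketch
// plays the role of library A.
#define A_BUILDING

#if defined(_WIN32)
#  if defined(A_BUILDING)
#    define A_API __declspec(dllexport)  // compiling A_shared itself
#  else
#    define A_API __declspec(dllimport)  // compiling a consumer such as B
#  endif
#else
#  define A_API __attribute__((visibility("default")))  // GCC/Clang
#endif

// Header part: declarations seen by both A and B.
extern A_API int a_global_counter;
A_API int a_increment();

// Source part: definitions compiled only into A.
int a_global_counter = 0;
int a_increment() { return ++a_global_counter; }
```

Because the sources in the question are compiled in the object library, the A_BUILDING definition would have to be attached to A_obj rather than to A_shared.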
I have a very simple piece of code which uses Boost.Filesystem to check whether a file exists. Additionally, I want to use Boost as a shared (DLL) library, not a static one. Here is what I wrote a few minutes ago:
void Hex2bin::convert(string filename, vector<uint8_t>* decodedBytes) {
    const path fname(filename); // from boost::filesystem
    if (exists(fname)) {
        ;
    }
    else {
        throw new EFileDoesntExist;
    }
}
Unfortunately, when I remove -lboost_filesystem from the linker settings and define the macro BOOST_FILESYSTEM_DYN_LINK globally in the Eclipse configuration, I get the linker error below:
/usr/include/boost/filesystem/operations.hpp:446: undefined reference to `boost::filesystem::detail::status(boost::filesystem::path const&, boost::system::error_code*)'
The source file compiles without any warnings. When I revert to -lboost_filesystem everything works, but I assume the library is then statically linked into the executable. Does anybody have an idea what is going wrong? Or do I misunderstand how Boost can be linked?
No. You still need to specify -lboost_filesystem even if the library is a shared object rather than a static library. In fact, most linkers will prefer to link against a shared object rather than a .a if both are present (there are ways to change this if necessary).
Use ldd to see the shared libraries an executable is linked against.
Only Windows (specifically MSVC++) supports "auto-linking" with Boost. On Linux, you'd either link against libboost_filesystem.so or libboost_filesystem.a, but in either case you need to link explicitly.
What happens if an executable and a shared library contain functions with the same name? For example, the EXE has a definition like this:
extern int fund()
{
return 0;
}
and the shared library has a definition with the same name:
extern int fund()
{
return 1;
}
Which function will be called from the executable and from the shared library:
1 - on Windows?
2 - on Unix-based systems?
PS: When I define AfxWinMain in my MFC application, it is called on startup instead of the AfxWinMain in the MFC DLL. I'd like some theory on why that is so.
You have already answered the question in the heading yourself.
Non-shared library dependencies are resolved at link time, not at load time. Once the linker has satisfied that external reference towards a static library, it will stay that way and neither the Windows nor the Unix loader will try to resolve it anymore (the symbol is normally not even "visible" in the binary after the link stage).
When linking against libraries (regardless of static or dynamic), the linker stops searching for a symbol to resolve as soon as it has found a reference that satisfies the requirement and will not look any further in any other (or the same) library for that symbol. That is why you can supply multiple definitions for the same function in libraries (as opposed to object files, those are guaranteed to be searched exhaustively and thus will be checked for duplicate symbols).
Only symbols that need to be resolved at load time are marked as "external shared" and are resolved by the loader at runtime.
I don't see a fundamental difference in this respect between unixoid OSs and Windows.
I have library A, where some of its functionality requires library B. Library A has two independent classes F and G (i.e. F and G do not know about each other). G includes headers from library B in its cpp file, so library A depends on library B because of class G. F does not use any functionality from library B.
I now have an executable E that uses F but not G. Am I required to link against library B even though I am not using any functionality from library A that uses library B? If yes, is there any way to avoid that without splitting up library A into two libraries?
I was under the assumption that you don't have to link against the external library unless you are using its functionality somehow.
No, this is not necessary. A static library is a very simple file format; it is just a bag of .obj files. The linker only pulls in the .obj files it needs to resolve a dependency in your main program, or a dependency of an .obj file it has already pulled in. You only get a linker error when symbols are still unresolved after it has looked at the available .obj files.
To double-check this, I tried a sample implementation of the G class, in g.cpp:
#include "stdafx.h"
#include "a.h"
void foo(); // Defined in b.lib
G::G() {
foo();
}
And tested in a program that looked like this:
#include "stdafx.h"
#include "..\a\a.h"
int main()
{
F obj;
//G obj2;
return 0;
}
It linked just fine without linking b.lib. Removing the comment from obj2 produced the expected LNK2019 error for foo().
There are plenty of ways this might not pan out in practice. Dependencies are pretty hard to see with an unaided eye, and the unit of linkage is an object file, g.obj in the above example. So it is important that your F class members are defined in a different source code file than your G class members. In other words, you need separate f.obj and g.obj files in a.lib.
This can be tinkered with: the /Gy compile option packages each function in its own section, which lets the /OPT:REF linker option work at the level of an individual function instead of an object file. But that's a pretty high price to pay: it only reduces the final size of the executable and doesn't remove the need to link b.lib. It also disables incremental linking and may require tinkering with the original library projects. If you have to do that anyway, it is easier to just keep the code for F and G in separate source files.
The linker's /VERBOSE option can provide insight, it shows you what is being pulled in and which .obj file caused a dependency to be linked. Enter it in the Linker + Command Line, Additional Options box.
When you link to a static library, the linker picks out of the library the object files that define symbols used by the program. So if you don't reference G, it won't be included in the program. That holds whether or not you add the respective static library when linking.
There is an implicit dependency between library A and library B. You may not have to link explicitly, depending on the compiler you're using. But there is a fair chance that during its execution the program is going to look for library B. On Windows, there is an option to delay-load a DLL, which means no attempt to load library B is made unless the code path is hit.
This is my 2nd post on this site in my effort to understand the compilation/linking process with gcc. When I try to make an executable, symbols need to be resolved at link time, but when I try to make a shared library, symbols are not resolved at link time of this library. They will perhaps be resolved when I am trying to make an executable using this shared library. Hands-on:
bash$ cat printhello.c
#include <stdio.h>
//#include "look.h"
void PrintHello()
{
look();
printf("Hello World\n");
}
bash$ cat printbye.c
#include <stdio.h>
//#include "look.h"
void PrintBye()
{
look();
printf("Bye bye\n");
}
bash$ cat look.h
void look();
bash$ cat look.c
#include <stdio.h>
void look()
{
printf("Looking\n");
}
bash$ gcc printhello.c printbye.c
/usr/lib/gcc/i386-redhat-linux/4.1.2/../../../crt1.o: In function `_start':
(.text+0x18): undefined reference to `main'
/tmp/cck21S0u.o: In function `PrintHello':
printhello.c:(.text+0x7): undefined reference to `look'
/tmp/ccNWbCnd.o: In function `PrintBye':
printbye.c:(.text+0x7): undefined reference to `look'
collect2: ld returned 1 exit status
bash$ gcc -Wall -shared -o libgreet printhello.c printbye.c
printhello.c: In function 'PrintHello':
printhello.c:6: warning: implicit declaration of function 'look'
printbye.c: In function 'PrintBye':
printbye.c:5: warning: implicit declaration of function 'look'
So my question is: why are symbols not resolved when I link a shared library? This work (resolving the symbols the library depends on) will have to be done when I use this library to build an executable, but that means we need to know what the library depends on when using it. Isn't that undesirable?
Thanks,
Jagrati
Does adding -z defs when building the library do what you want? If not, check the ld man pages, there are quite a few options on the handling of undefined symbols.
Since you didn't give the -c (compile only) option, you requested gcc to compile the two source files and link them with the standard library (libc) and the c run-time startup (crt0, typically) to produce a running program. crt0 tries to enter your program by calling main(), which is the undefined symbol the linker can't find. It can't find it because you don't have a main() in either of your .c files, right?
So, on to your actual question, "Why are symbols of a shared library not resolved at link time?" The answer is, what do you mean by "link time"? By definition, a dynamically linked program isn't "linked" until it starts (or maybe not even then, depending on your system).
On a Linux system, you can see which dynamic libraries a program depends on with the ldd command (on Mac OS use 'otool -L'). The output of ldd will tell you which dynamic libraries a program depends on, which were found in the library search path, and which ones cannot be found (if any).
When your dynamically linked program starts, the dynamic linker locates and loads the dynamic libraries the program depends on, and "fixes" the references to the external symbols. If any of these steps fail, your program will fail to start. Once all of the formerly unresolved symbols have been resolved, the dynamic linker returns and the C runtime calls your main() function. (It's somewhat different on Mac OS, but similar in effect: the linking happens after your program is started.)
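As a rough in-process counterpart to ldd, the following sketch lists the shared objects the dynamic linker has actually loaded into the running program. It assumes Linux with glibc or musl, where dl_iterate_phdr is declared in <link.h>; loaded_objects and collect_name are hypothetical helper names.

```cpp
#include <link.h>  // dl_iterate_phdr, dl_phdr_info (Linux)

#include <cstddef>
#include <string>
#include <vector>

// Called once per loaded object; appends the object's path to the vector
// passed through the data pointer.
static int collect_name(dl_phdr_info* info, std::size_t, void* data) {
    auto* names = static_cast<std::vector<std::string>*>(data);
    // The main program is usually reported with an empty name.
    names->emplace_back(info->dlpi_name != nullptr ? info->dlpi_name : "");
    return 0;  // zero tells dl_iterate_phdr to keep iterating
}

// Returns the paths of all objects currently mapped by the dynamic linker.
std::vector<std::string> loaded_objects() {
    std::vector<std::string> names;
    dl_iterate_phdr(collect_name, &names);
    return names;
}
```

Printing the result at the start of main() would show the libraries that were loaded before main() ran; ldd, by contrast, works from the executable file without running your code.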
I think the linker option -Bsymbolic is what you're looking for.
The linker has no way of knowing, in ELF at least, where the symbols are (i.e. in which libraries). On OS X, on the other hand, you need to link libraries the way you described. In the end, it is a question of design. One is more flexible, the other more rigorous.
Even when you build a shared library it must resolve all the dependencies.
Thus when you link against a shared library at build time, it already records which other shared libraries to load at runtime so that it can resolve its own dependencies.
1) Build a shared (look.<sharedLib>) library with look()
2) Build a shared (hg.<sharedLib>) library with hello() bye() link against look.<sharedLib>
3) Build the application with main() and link it against hg.<sharedLib>
At runtime the application will then load hg.<sharedLib>, which will in turn load the shared library look.<sharedLib>
An executable requires an entry point. A shared library, however, can be built without an entry point, and the executable can later be linked against this shared library.