I have a large Qt project under Ubuntu. I just found that G++ lets me compile AND link code in which I call a declared but undefined method. It crashes at runtime at that call.
I couldn't reproduce this behavior with a test project, although I enforced the same g++ command line.
The questions are:
why does it let me do that?
How can I make the linker generate an error?
Edits (based on the comments):
I know it's not optimized away, as it crashes at runtime when I call that method.
I declared and called another identical method with a dummy name - I think something along the lines of gfdsgfdhgasfdhgfa() will do :) - same thing.
The app crashes when the undefined method is called. Sorry for missing this important detail.
The undefined method is not a slot.
Yes, I'm clearing the build dir. I'm using qmake.
I just found that there's a utility called nm. If I run it with the -u (show undefined only) option on the output .so, I can see this method in the list. Why is GCC assuming it's external?
It looks like by default GCC (not only G++) assumes all undefined symbols are externals. Visual Studio doesn't.
Relevant question: Force GCC to notify about undefined references in shared libraries
--allow-shlib-undefined
--no-allow-shlib-undefined
Allows (the default) or disallows undefined symbols in shared
libraries (It is meant, in shared libraries _linked_against_, not the
one we're creating!--Pavel Shved). This switch is similar to
--no-undefined except that it determines the behaviour when the undefined
symbols are in a shared library rather than a regular object file. It
does not affect how undefined symbols in regular object files are
handled.
The reason that --allow-shlib-undefined is the default is that the
shared library being specified at link time may not be the same as
the one that is available at load time, so the symbols might actually
be resolvable at load time. Plus there are some systems, (eg BeOS)
where undefined symbols in shared libraries is normal. (The kernel
patches them at load time to select which function is most
appropriate for the current architecture. This is used for example to
dynamically select an appropriate memset function). Apparently it is also
normal for HPPA shared libraries to have undefined symbols.
I'm struggling with a linking error.
I have 3 modules:
static library A which defines function ole::compound_document::find_storage(const std::string&);
shared library B which is linked to A and uses the function;
executable C which is linked to B and uses functions from B (but does not call directly functions of library A).
During the linking of the executable C, I receive the following error message:
../../bin/B.so: undefined reference to `ole::compound_document::find_storage(std::string const&)'
The function is defined in library A.
If I run utility nm on shared library B, I receive the following output:
0000000001841c70 T ole::compound_document::find_storage(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
U ole::compound_document::find_storage(std::string const&)
It shows two find_storage functions. One of them is defined, the other is not.
I'm trying to understand how this can happen, so far without success.
The problem appears under Linux (Ubuntu), compiler: clang-9. On Windows, I can build the libraries and the executable without any problem.
I've tried to create a minimal example, just putting 3 simple modules together with only a few functions. Everything works. The compiler uses only the first definition of the function. I don't understand where the second definition comes from. I suspected some mix of c++ standards but cannot find anything.
Any suggestions will be highly appreciated.
You will need to look for problems in library B's code. std::string, verbatim, is not something that's expected to be an actual type referenced from an exported symbol, since std::string is just an alias for a std::basic_string instance. Narrow this down by looking at symbols of all modules that were linked into library B, and once you identified the module that does, you'll need to figure out why.
Your question does not provide sufficient data to decisively identify the linking problem because, of course, that can only be done by inspecting all the object modules involved in the linking, and inspecting all the source code for the likely violations of the One Definition Rule, or ill-formed code that did not produce a diagnostic, but manifested itself as a link failure.
Therefore, the following answer is meant as a general guide to isolating this kind of linking failure. I don't think this question qualifies to be hammered by the canonical answer.
You have a symbol in the linked shared library that's not getting resolved when linked to an executable, specifically:
ole::compound_document::find_storage(std::string const&)
Just like you ran nm on the shared library, you can use it on every module that went into the shared library, individually. This will find which object module the unresolved reference came from. If you don't find it, it must've come from the static library you linked with, so repeat your search there.
That reference came from one of the object modules that you used to build the shared library. You will find it this way; it's unlikely to have popped into existence on its own.
Once you find the relevant module, you're on your own: look at the actual code that was compiled and figure out what's up. If you can't figure it out, divide and conquer. Take the object module, split its source file into two files with half of the functions in each one, compile them separately, and then see which half the unresolved reference comes from.
Finally: before doing all that, try the low-hanging fruit: make clean, then recompile everything. This has all the hallmarks of a compiler switch: some of the object modules were compiled by a different compiler. If that static library was provided by a third-party vendor as a binary blob, it must've been compiled by a different compiler or a different version of your compiler. C++ does not guarantee binary ABI compatibility.
As mentioned by n. 'pronouns' m., the problem was an inconsistent use of the -D_GLIBCXX_USE_CXX11_ABI flag. I used the C++14 standard, but some projects were compiled with the flag -D_GLIBCXX_USE_CXX11_ABI=1.
It was not easy to find, because I use a lot of third-party libraries through the conan package manager.
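One way to narrow down such a mismatch is to have each suspect translation unit report which std::string ABI it was built with. The sketch below assumes libstdc++; the function name string_abi_of_this_tu is made up for illustration. In the nm output above, the symbol mangled with std::__cxx11::basic_string comes from code built with _GLIBCXX_USE_CXX11_ABI=1, while the plain std::string one comes from code built with the macro set to 0.

```cpp
#include <string>  // any libstdc++ header defines _GLIBCXX_USE_CXX11_ABI

// Hypothetical helper: reports the std::string ABI this translation unit
// was compiled against. Compile it with the same flags as the suspect module.
const char* string_abi_of_this_tu() {
#if defined(_GLIBCXX_USE_CXX11_ABI)
#  if _GLIBCXX_USE_CXX11_ABI
    return "libstdc++ new ABI (std::__cxx11::basic_string)";
#  else
    return "libstdc++ old ABI (std::basic_string)";
#  endif
#else
    return "not libstdc++";  // e.g. libc++ or another standard library
#endif
}
```

If two modules that are linked together report different values, any function that takes or returns std::string will be mangled differently on each side, which matches the symptom described in the question.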
This code:
void undefined_fcn();
void defined_fcn() {}
struct api_t {
    void (*first)();
    void (*second)();
};
api_t api = {undefined_fcn, defined_fcn};
defines a global variable api with a pointer to a non-existent function. However, it compiles, and to my surprise, links with absolutely no complaints from GCC, even with all those -Wall -Wextra -Werror -pedantic flags.
This code is part of a shared library. Only when I load the library, at run-time, does it finally fail. How do I check, at library link-time, that I didn't forget to define any function?
Update: this question mentions the same problem, and the answer is the same: -Wl,--no-undefined. (by the way, I guess this could even be marked as duplicate). However, according to the accepted answer below, you should be careful when using -Wl,--no-undefined.
This code is part of a shared library.
That's the key. The whole purpose of having a shared library is to have an "incomplete" shared object, with undefined symbols that must be resolved when the main executable loads it and all other shared libraries it gets linked with. At that time, the runtime loader attempts to resolve all undefined symbols; and all undefined symbols must be resolved, otherwise the executable will not start.
You stated you're using gcc, so you are likely using GNU ld. For the reason stated above, ld will link a shared library with undefined symbols, but will fail to link an executable unless all undefined symbols are resolved against the shared libraries the executable gets linked with. At runtime, the loader is therefore expected to resolve all symbols successfully too; if it fails to start the executable, that indicates a fatal runtime environment failure (such as a shared library getting replaced with an incompatible version).
There are some options that can be used to override this behavior. The --no-undefined option instructs ld to report a link failure for undefined symbols when linking a shared library, just as it does for executables. When invoking ld indirectly via gcc, this becomes -Wl,--no-undefined.
However, you are likely to discover that this is going to be a losing proposition. You better hope that none of the code in your shared library uses any class in the standard C++ or C library. Because, guess what? -- those references will be undefined symbols, and you will fail to link your shared library!
In other words, this is a necessary evil that you need to deal with.
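To try the --no-undefined option on the code from the question, something like the following sketch could be used. The file name api.cpp and the library name libapi.so are assumptions, and the build commands in the comments are what one would typically pass to g++ with GNU ld; they are not taken from the original post.

```cpp
// api.cpp (hypothetical file name): the question's example, with the
// missing definition supplied so that this file links in every mode.
//
// Assumed build commands (g++ driving GNU ld):
//   g++ -shared -fPIC api.cpp -o libapi.so
//       links even when undefined_fcn has no body
//   g++ -shared -fPIC -Wl,--no-undefined api.cpp -o libapi.so
//       reports an undefined reference when undefined_fcn has no body

void undefined_fcn();   // declaration only, as in the question
void defined_fcn() {}

struct api_t {
    void (*first)();
    void (*second)();
};

// Delete this definition to reproduce the situation from the question.
void undefined_fcn() {}

api_t api = {undefined_fcn, defined_fcn};
```

Whether the stricter link then trips over other unresolved references depends on what the library uses and which libraries are named on the link line, so it is worth trying on the real project before adopting it.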
You can't have the compiler tell you whether you forgot to define the function in that implementation file. The reason is that a function you declare has external linkage by default in C++ (it is implicitly extern), and the compiler cannot know whether the definition lives in another object file or shared library; that only becomes known at link time.
In case you are not familiar with what extern means: things marked extern have external linkage, so if a variable is extern, the compiler doesn't require its definition to be in the translation unit that uses it. The definition can be in another implementation file, and the reference is resolved at link time (when you link with a translation unit that defines the variable). The same applies to functions.
To get the behavior you want, make the function static. This tells the compiler that the function has internal linkage and belongs to the current translation unit, in which case it must be defined there. Clang reports a missing definition through -Wundefined-internal, which is enabled by default, and GCC prints a similar "used but never defined" warning; compile with -Werror to turn the warning into an error.
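A minimal sketch of the static approach, with made-up function names, might look like this. It only applies to helpers that are private to one translation unit; anything that must be visible to other files cannot be made static.

```cpp
// Internal linkage: the definition must appear in this translation unit.
static int helper();

// Uses the helper before its definition has been seen.
int call_helper() {
    return helper();
}

// Remove this definition and the compiler itself complains that a static
// function is used but never defined, without waiting for the linker.
static int helper() {
    return 42;
}
```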
I have the following problem. I have a shared library, which is just a bunch of translation units linked together, so when I build that shared library I don't get any linker errors, even if it contains undefined references.
The shared library gets loaded dynamically from an executable which also contains the exports which my shared library is using (The references used in my library are resolved at runtime).
The main problem is that I want the undefined reference warnings so I can fix them statically instead of waiting for the application to crash.
I read somewhere that I can pass "-Wl,--no-undefined" to gcc so I can get these errors back, indeed it worked but it also gave me all the undefined references of the executable's exports. I want to filter these warnings just to the scope of my translation units.
Is this possible? If not, how can I tell the linker that these references are satisfied by an executable that exports symbols for the shared library?
you can try linking the library & main program with -Wl,-z,now. that should make the runtime ldso resolve all references immediately and throw an error if none are found.
otherwise, i'm not seeing an option off hand in the linker manual to say "allow this ELF to satisfy symbols, but don't actually list it as a DT_NEEDED".
you could try using -Wl,--no-undefined and parsing the output with a script so you can filter out symbols you know will be satisfied by the main program.
another option might be to label all the symbols you know the main program provides with __attribute__((weak)) and then still use -Wl,--no-undefined. the weak symbols won't be reported as an error.
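a rough sketch of the weak-symbol idea, assuming GCC or Clang targeting ELF. the name host_provided_hook is hypothetical and stands for a function the main program is expected to export. if nothing defines it, its address is null, so the library can test for it instead of crashing:

```cpp
// Assumption: GCC/Clang on an ELF platform (e.g. Linux).
// Weak declaration: an unresolved weak symbol is not a link error,
// even with -Wl,--no-undefined; its address is simply null.
extern "C" __attribute__((weak)) int host_provided_hook();

// Calls the hook if some module defines it, otherwise returns the fallback.
int call_hook_or(int fallback) {
    return host_provided_hook ? host_provided_hook() : fallback;
}
```

the downside is that a symbol you really did forget to export from the main program is no longer reported either, so this trades one kind of check for another.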
Description :
a. Class X contains a static private data member ptr and static public function member getptr()/setptr().
In X.cpp, the ptr is set to NULL.
b. libXYZ.so (shared object) contains the object of class X (i.e libXYZ.so contains X.o).
c. libVWX.so (shared object) contains the object of class X (i.e libVWX.so contains X.o).
d. Executable a.exe contains X.cpp as part of translation units and finally is linked to libXYZ.so, libVWX.so
PS:
1. There are no user namespaces involved in any of the classes.
2. The libraries and executable contain many other classes also.
3. no dlopen() has been done. All libraries are linked during compile time using -L and -l flags.
Problem Statement:
When compiling and linking a.exe with other libraries (i.e. libXYZ.so and libVWX.so), I expected a linker error (conflict/occurrence of the same symbol multiple times) but did not get one.
When the program was executed - the behavior was strange in SUSE 10 Linux and HP-UX 11 IA64.
In Linux, when execution flow was pushed across all the objects in different libraries, the effect was registered in only one copy of X.
In HPUX, when execution flow was pushed across all the objects in different libraries, the effect was registered in 3 different copies of X (1 belonging to each of the 2 libraries and 1 to the executable).
PS: I mean that while running the program, the flow passed through multiple objects belonging to a.exe, libXYZ.so and libVWX.so, which interacted with the static pointer belonging to X.
Question:
Is expecting a linker error incorrect? Since two compilers passed through compilation silently, maybe there is a standard rule for this type of scenario that I am missing. If so, please let me know.
How does the compiler (gcc in Linux and aCC in HPUX) decide how many copies of X to keep in the final executable, and how to refer to them in such scenarios?
Is there any flag supported by gcc and aCC which will warn users or stop compilation in these kinds of scenarios?
Thanks for your help in advance.
I'm not too sure that I've completely understood the scenario. However,
the default behavior on loading dynamic objects under Linux (and other
Unices) is to make all symbols in the library available, and to only use
the first encountered. Thus, if both libXYZ.so and libVWX.so
contain a symbol X::ourData, it is not an error; if you load them in
that order, libVWX.so will use the X::ourData from libXYZ.so,
instead of its own. Logically, this is a lot like a template definition
in a header: the compiler chooses one, more or less by chance, and if
any of the definitions is not the same as all of the others, it's
undefined behavior. This behavior can be
overridden by passing the flag RTLD_LOCAL to dlopen.
With regards to your questions:
The linker is simply implementing the default behavior of dlopen (that which you get when the system loads the library implicitly). Thus, no error (but the logical equivalent of undefined behavior if any of the definitions isn't the same).
The compiler doesn't decide. The decision is made when the .so is loaded, depending on whether you specify RTLD_GLOBAL or RTLD_LOCAL when calling dlopen. When the runtime calls dlopen implicitly, to resolve a dependency, it will use RTLD_GLOBAL if this occurs when loading the main executable, and what ever was used to load the library when the dependency comes from a library. (This means, of course, that RTLD_GLOBAL will propagate until you invoke dlopen explicitly.)
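For the explicit case, loading a library with RTLD_LOCAL could be sketched as follows. The function name lookup_private is made up for illustration, and on glibc versions older than 2.34 the program has to be linked with -ldl.

```cpp
#include <dlfcn.h>

// Hypothetical helper: loads `library_path` so that its symbols are NOT added
// to the global lookup scope (RTLD_LOCAL), then looks up `symbol_name` in that
// library only. Returns nullptr if the library or the symbol cannot be found.
// The handle is deliberately kept open on success so that the returned
// address stays valid for the lifetime of the process.
void* lookup_private(const char* library_path, const char* symbol_name) {
    void* handle = dlopen(library_path, RTLD_NOW | RTLD_LOCAL);
    if (handle == nullptr) {
        return nullptr;
    }
    void* address = dlsym(handle, symbol_name);
    if (address == nullptr) {
        dlclose(handle);  // nothing found: release the library again
    }
    return address;
}
```

This only changes which symbols the loaded library contributes to later lookups; it does not help for libraries that are listed on the link line with -l, since those are loaded implicitly.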
The function is "public static", so I assume it's OOP-meaning of "static" (does not need instance), not C meaning of static (file-static; local to compilation unit). Therefore the functions are extern.
Now, in Linux you have the explicit right to override library symbols, either from another library or from the executable. All extern symbols in libraries are resolved using the global offset table, even the ones the library actually defines itself. Functions defined in the executable are normally not resolved like this, but the linker notices that the symbols will get into the symbol table from the libraries and puts the reference to the executable-defined one there. So the libraries will see the symbol defined in the executable, if you generated it.
This is an explicit feature, designed so you can do things like replace memory allocation functions or wrap filesystem operations. HP-UX probably does not have the feature, so each library ends up calling its own implementation, while any other object that would have the symbol undefined will see one of them.
There is a difference between "extern" symbols (which is the default in C++) and "shared library extern". By default symbols are only "extern", which means their scope is one "link unit", e.g. an executable or a library.
So the expected behaviour would be: no compiler error and every module works with its own copy.
That leads to problems of course in case of inline compiling etc...etc...
To declare a symbol "shared library extern" you have to use a ".def" file or a compiler declaration.
e.g. in Visual C++ that would be "__declspec(dllexport)" and "__declspec(dllimport)".
I do not know the declarations for gcc at the moment but I am sure someone does :-)
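For reference, GCC and Clang spell this with a visibility attribute, usually combined with the -fvisibility=hidden compiler flag so that only marked symbols are exported. A small sketch follows; the macro names MYLIB_EXPORT and MYLIB_LOCAL and both functions are made up for illustration. Note that on ELF platforms the default visibility is already "default", which is why unmarked extern symbols are shared across libraries there.

```cpp
// Hypothetical export macros for a library called "mylib".
#if defined(_WIN32)
#  define MYLIB_EXPORT __declspec(dllexport)
#  define MYLIB_LOCAL
#else
#  define MYLIB_EXPORT __attribute__((visibility("default")))
#  define MYLIB_LOCAL  __attribute__((visibility("hidden")))
#endif

// Not visible outside the shared library it is linked into.
MYLIB_LOCAL int internal_counter() {
    static int count = 0;
    return ++count;
}

// Part of the library's public interface.
MYLIB_EXPORT int next_id() {
    return internal_counter();
}
```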
Under Solaris 10, I'm creating a library A.so that calls a function f() which is defined in library B.so. To compile the library A.so, I declare in my code f() as extern.
Unfortunately, I "forgot" to declare in A's makefile that it has to link with B.
However, "make A" causes no warning, no error, and the library A.so is created.
Of course, when executing A's code, the call of f() crashes because it is undefined.
Is there a way (linker option, code trick...) to make the compilation of library A fail?
How can I be sure that all symbols referred to in library A are defined at compile time?
Thanks for any suggestions.
Simplest way: add a "test_lib" target in the makefile that will produce a binary using all the symbols exported from library A. (It doesn't have to be anything meaningful... just take the address, no need to call the function or anything, it just needs to be referenced.)
Thanks
I think I found something interesting and even simpler in the linker's manual (d'oh)
The -z defs option and the --no-undefined option force a fatal error if any undefined symbols remain at the end of the link. This mode is the default when an executable is built. For historic reasons, this mode is not the default when building a shared object. Use of the -z defs option is recommended, as this mode assures the object being built is self-contained. A self-contained object has all symbolic references resolved internally, or to the object's immediate dependencies.