This code:
void undefined_fcn();
void defined_fcn() {}
struct api_t {
void (*first)();
void (*second)();
};
api_t api = {undefined_fcn, defined_fcn};
defines a global variable api with a pointer to a non-existent function. However, it compiles, and to my surprise, links with absolutely no complaints from GCC, even with all those -Wall -Wextra -Werror -pedantic flags.
This code is part of a shared library. Only when I load the library, at run-time, it finally fails. How do I check, at library link-time, that I did't forget to define any function?
Update: this question mentions the same problem, and the answer is the same: -Wl,--no-undefined. (by the way, I guess this could even be marked as duplicate). However, according to the accepted answer below, you should be careful when using -Wl,--no-undefined.
This code is part of a shared library.
That's the key. The whole purpose of having a shared library is to have an "incomplete" shared object, with undefined symbols that must be resolved when the main executable loads it and all other shared libraries it gets linked with. At that time, the runtime loader attempts to resolve all undefined symbols; and all undefined symbols must be resolved, otherwise the executable will not start.
You stated you're using gcc, so you are likely using GNU ld. For the reason stated above, ld will link a shared library with undefined symbols, but will fail to link an executable unless all undefined symbols are resolved against the shared libraries the executable gets linked with. So, at runtime, the expected behavior is that the runtime loader is expected to successfully resolve all symbols too; so the only situation when the runtime loader fails to start the executable will indicate a fatal runtime environment failure (such as a shared library getting replaced with an incompatible version).
There are some options that can be used to override this behavior. The --no-undefined option instructs ld to report a link failure for undefined symbols when linking a shared libraries, just like executables. When invoking ld indirectly via gcc this becomes -Wl,--no-undefined.
However, you are likely to discover that this is going to be a losing proposition. You better hope that none of the code in your shared library uses any class in the standard C++ or C library. Because, guess what? -- those references will be undefined symbols, and you will fail to link your shared library!
In other words, this is a necessary evil that you need to deal with.
You can't have the compiler tell you whether you forgot to define the function in that implementation file. And the reason is when you define a function it is implicitly marked extern in C++. And you cannot tell what is in a shared library until after it is linked (the compiler's linker does not know if the reference is defined)
If you are not familiar with what extern means. Things marked extern signal external linkage, so if you have a variable that is extern the compiler doesn't require a definition for that variable to be in the translation unit that uses it. The definition can be in another implementation file and the reference is resolved at link time (when you link with a translation unit that defines the variable). The same applies for functions, which are essentially variables of a function type.
To get the behavior you want make the function static which tells the compiler that the function is not extern and is a part of the current translation unit, in which case it must be defined -Wundefined-internal picks up on this (-Wundefined-internal is a part of -Werror so just compile with that)
Related
I have the following problem. I have a shared library, which is just a bunch of translation units linked together so when I compile that shared library I won't get any linker error (undefined references, even though I might have).
The shared library gets loaded dynamically from an executable which also contains the exports which my shared library is using (The references used in my library are resolved at runtime).
The main problem is that I want the undefined reference warnings so I can fix them statically instead of waiting the application to crash.
I read somewhere that I can pass "-Wl,--no-undefined" to gcc so I can get these errors back, indeed it worked but it also gave me all the undefined references of the executable's exports. I want to filter these warnings just to the scope of my translation units.
Is this possible? If not, how can I define reference to a executable which has exports for a shared library.
you can try linking the library & main program with -Wl,-z,now. that should make the runtime ldso resolve all references immediately and throw an error if none are found.
otherwise, i'm not seeing an option off hand in the linker manual to say "allow this ELF to satisfy symbols, but don't actually list it as a DT_NEEDED".
you could try using -Wl,--no-undefined and parsing the output with a script so you can filter out symbols you know will be satisfied by the main program.
another option might be to label all the symbols you know the main program provides with __attribute__((weak)) and then still use -Wl,--no-undefined. the weak symbols won't be reported as an error.
I have a large Qt project under Ubuntu. Just found that G++ lets me compile AND link code where I'm calling an declared but undefined method. It crashes at runtime at that call.
I couldn't reproduce this behavior with a test project, although I enforced the same g++ command line.
The questions are:
why does it let me do that?
How can I make the linker generate an error?
Edits (based on the comments):
I know it's not optimized away, as it crashes at runtime when I call that method.
I declared and called another identical method with a dummy name - I think something along the lines of gfdsgfdhgasfdhgfa() will do :) - same thing.
The app crashes when the undefined method is called. Sorry for missing this important detail.
The undefined method is not a slot.
Yes, I'm clearing the build dir. I'm using qmake.
Just found there's an utility called nm. If I'm running it with the -u (show undefined only) option on the output .so I can see this method in the list. Why is GCC assuming it's external?
It looks like by default GCC (not only G++) assumes all undefined symbols are externals. Visual Studio doesn't.
Relevant question: Force GCC to notify about undefined references in shared libraries
--allow-shlib-undefined
--no-allow-shlib-undefined
Allows (the default) or disallows undefined symbols in shared
libraries (It is meant, in shared libraries _linked_against_, not the
one we're creating!--Pavel Shved). This switch is similar to --no-un-
defined except that it determines the behaviour when the undefined
symbols are in a shared library rather than a regular object file. It
does not affect how undefined symbols in regular object files are
handled.
The reason that --allow-shlib-undefined is the default is that the
shared library being specified at link time may not be the same as
the one that is available at load time, so the symbols might actually
be resolvable at load time. Plus there are some systems, (eg BeOS)
where undefined symbols in shared libraries is normal. (The kernel
patches them at load time to select which function is most appropri-
ate for the current architecture. This is used for example to dynam-
ically select an appropriate memset function). Apparently it is also
normal for HPPA shared libraries to have undefined symbols.
For Example
#include <iostream>
int add(int x, int y);
int main()
{
cout << add(5, 5) << endl;
}
This would compile but not link. I understand the problem, I just don't understand why it compiles fine but doesn't link.
Because the compiler doesn't know whether that function is provided by a library (or other translation unit). There's nothing in your prototype that tells the compiler the function is defined locally (you could use static for that).
The input to a C or C++ compiler is one translation unit - more or less one source code file. Once the compiler is finished with that one source code, it has done its job.
If you call/use a symbol, such as a function, which is not part of that translation unit, the compiler assumes it's defined somewhere else.
Later on, you link together all the object files and possibly the libraries you want to use, all references are tied together - it's only at this point, when pulling together everything that's supposed to create an executable, one can know that something is missing.
When a compiler compiles, it generates the output (object file) with the table of defined symbols (T) and undefined symbols (U) (see man page of nm). Hence there is no requirement that all the references are defined in every translation unit. When all the object files are linked (with any libraries etc), the final binary should have all the symbols defined (unless the target in itself is a library). This is the job of the linker. Hence based on the requested target type (library or not), the linker might not or might give an error for undefined functions. Even if the target is a non-library, if it is not statically linked, it still might refer to shared libraries (.so or .dll), hence if on the target machine while the binary is run, if the shared libraries are missing or if any symbols missing, you might even get a linker error. Hence between compiler, linker and loader, every one is trying to best provide you with the definition of every symbol needed. Here by giving declaring add, you are pacifying the compiler, which hopes that the linker or loader would do the required job. Since you didnt pacify the linker (by say providing it with a shared library reference), it stops and cribs. If you have even pacified the linker, you would have got the error in the loader.
Description :
a. Class X contains a static private data member ptr and static public function member getptr()/setptr().
In X.cpp, the ptr is set to NULL.
b. libXYZ.so (shared object) contains the object of class X (i.e libXYZ.so contains X.o).
c. libVWX.so (shared object) contains the object of class X (i.e libVWX.so contains X.o).
d. Executable a.exe contains X.cpp as part of translation units and finally is linked to libXYZ.so, libVWX.so
PS:
1. There are no user namespaces involved in any of the classes.
2. The libraries and executable contain many other classes also.
3. no dlopen() has been done. All libraries are linked during compile time using -L and -l flags.
Problem Statement:
When compiling and linking a.exe with other libraries (i.e libXYZ.so and libVWX.so), I expected a linker error (conflict/occurance of same symbol multiple times) but did not get one.
When the program was executed - the behavior was strange in SUSE 10 Linux and HP-UX 11 IA64.
In Linux, when execution flow was pushed across all the objects in different libraries, the effect was registered in only one copy of X.
In HPUX, when execution flow was pushed across all the objects in different libraries, the effect was registered in 3 differnt copies of X (2 belonging to each libraries and 1 for executable)
PS : I mean during running the program, the flow did passed thourgh multiple objects belonging to a.exe, libXYZ.so and libVWX.so) which interacted with static pointer belonging to X.
Question:
Is Expecting linker error not correct? Since two compilers passed through compilation silently, May be there is a standard rule in case of this type of scenario which I am missing. If so, Please let me know the same.
How does the compiler (gcc in Linux and aCC in HPUX) decide how many copies of X to keep in the final executable and refer them in such scenarios.
Is there any flag supported by gcc and aCC which will warn/stop compilation to the users in these kind of scenarios?
Thanks for your help in advance.
I'm not too sure that I've completely understood the scenario. However,
the default behavior on loading dynamic objects under Linux (and other
Unices) is to make all symbols in the library available, and to only use
the first encountered. Thus, if you both libXYZ.so and libVWX.so
contain a symbol X::ourData, it is not an error; if you load them in
that order, libVWX.so will use the X::ourData from libXYZ.so,
instead of its own. Logically, this is a lot like a template definition
in a header: the compiler chooses one, more or less by chance, and if
any of the definitions is not the same as all of the others, it's
undefined behavior. This behavior can be
overridden by passing the flag RTLD_LOCAL to dlopen.
With regards to your questions:
The linker is simply implementing the default behavior of dlopen (that which you get when the system loads the library implicitely). Thus, no error (but the logical equivalent of undefined behavior if any of the definitions isn't the same).
The compiler doesn't decide. The decision is made when the .so is loaded, depending on whether you specify RTLD_GLOBAL or RTLD_LOCAL when calling dlopen. When the runtime calls dlopen implicitly, to resolve a dependency, it will use RTLD_GLOBAL if this occurs when loading the main executable, and what ever was used to load the library when the dependency comes from a library. (This means, of course, that RTLD_GLOBAL will propagate until you invoke dlopen explicitly.)
The function is "public static", so I assume it's OOP-meaning of "static" (does not need instance), not C meaning of static (file-static; local to compilation unit). Therefore the functions are extern.
Now in Linux you have explicit right to override library symbols, both using another library or in the executable. All extern symbols in libraries are resolved using the global offset table, even the one the library actually defines itself. And while functions defined in the executable are normally not resolved like this, but the linker notices the symbols will get to the symbol table from the libraries and puts the reference to the executable-defined one there. So the libraries will see the symbol defined in the executable, if you generated it.
This is explicit feature, designed so you can do things like replace memory allocation functions or wrap filesystem operations. HP-UX probably does not have the feature, so each library ends up calling it's own implementation, while any other object that would have the symbol undefined will see one of them.
There is a difference beetween "extern" symbols (which is the default in c++) and "shared libary extern". By default symbols are only "extern" which means the scope of one "link unit" e.g. an executable or a library.
So the expected behaviour would be: no compiler error and every module works with its own copy.
That leads to problems of course in case of inline compiling etc...etc...
To declare a symbol "shared library extern" you have to use a ".def" file or an compiler declaration.
e.g. in visual c++ that would be "_declspec(dllexport)" and "_declspec(dllimport)".
I do not know the declarations for gcc at the moment but I am sure someone does :-)
It appears GCC linker doesn't care for one variable being defined in two files. I suspect this is the cause of trouble a 3rd party library is causing us.
Take this:
File a.cpp contains:
int foo;
//do things with it.
File b.cpp contains:
int foo;
//do other things with it.
File c.cpp contains:
extern int foo;
//do other things with it.
They are all compiled by gcc to .o files, then linked as shared object.
gcc -fPIC -c a.cpp
gcc -fPIC -c b.cpp
gcc -fPIC -c c.cpp
ld *.o -shared -soname,mylib -o mylib
The linker doesn't complain at all, but the resulting binary misbehaves. We suspect there are at least a few conflicts of this kind and would like to locate them. What kind of linker options would let us detect them?
(interestingly, if the variables are initialized (int foo=0) in both files, it produces an error).
Hold on now -- are you using foo for two different purposes in the two files? That would certainly lead to run-time errors. If foo needs to be global, then it should be defined in just one module -- the linker may accept it, but you will still only get one copy of foo. If it doesn't need to be global, it should be declared 'static int foo;'
This is a serious design bug in gcc/ld, it does not occur using MSVC. It won't happen linking programs, only shared libraries. When you link a program, the linker ensures all external references are satisfied, at least at link time. When you link a shared library it does not. Instead, external references for which there are no definitions are just left to dangle, the argument (given in ld man page) being that the symbol has to be resolved at load-time dynamically anyhow, so there's no point checking it. It's also hard if you use the stupid feature of shared libraries grabbing symbols from executables.
Your program will not misbehave. If you specify that symbols in a library must be satisfied on loading, you will get a load-time error, if you specify lazy linkage, the error will still occur, but only on the first use of the symbol (AFAIK!)
Some older OS, I believe BSD for example, allowed unsatisfied external pointers to be left as NULL so that you could write an "in program" check to see if the symbol was linked or not. Linux ld at least does not support this AFAIK.
There is a linker switch to force satisfaction of external references for shared libraries, but it is hard to use correctly in portable builds because it requires you to explicitly link the startup library for your processor.
I consider this a very serious design bug, and tried to file a bug report. In my own product we were happy to be able to build under Cygwin because underneath it uses MSVC linker which does not permit this behaviour, quite a lot of bugs were found in my code this way.
It seems compiler option -fno-common forces all the variables to be initialized, so it triggers errors upon linking.