extern variable defined inside .so and executable leads to undefined behavior? - c++

I am working on legacy code, and I am facing a strange problem. I have an executable file, and it uses a .so library called dbaccess.so.
I also have a lib called "lib_base", and this lib is statically linked to both projects (dbaccess.so and the executable).
            __________
           | lib_base |
           |__________|
             /      \
            / statically \
           /    linked    \
  ________/                \_____________
 | my_app |                | dbaccess.so |
 |________| <-dynamically- |_____________|
The problem is that, inside the "lib_base", I have in a .cpp (misc.cpp) file a variable defined as
char apName[_MAX_FNAME];
And inside a .cpp (clientconn.cpp) in "dbaccess.so" I have:
extern char apName[_MAX_FNAME];
And I've noticed some strange behaviors in the code. It looks like the "extern" variable is confusing the definition of "apName" inside the my_app's lib_base and the one inside dbaccess' lib_base.
When debugging a dbaccess' function with gdb, the following happens:
strcpy( apName, "test" );
printf( apName );
In the console, "test" is printed, but if I type the following line in gdb's console after the strcpy:
print apName
It prints "apfile.ini".
Does anyone know whether this problem is really related to the fact that "lib_base" is linked into both projects? Is there any compilation flag or something similar that could be passed when compiling dbaccess.so to avoid this?
I am using Linux, with gcc as the compiler.

If you define a variable in a .cpp file that should only be visible within that file you should use static.
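As a minimal sketch of that first point (not the asker's actual code): giving the buffer internal linkage keeps other translation units from binding to it, and accessor functions become the only way in. The names kMaxName, set_app_name and get_app_name are hypothetical, and kMaxName merely stands in for _MAX_FNAME.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

namespace {
// Stand-in for _MAX_FNAME; the real value is platform-specific.
constexpr std::size_t kMaxName = 256;

// Internal linkage: no other translation unit can bind to this with "extern".
char apName[kMaxName] = "";
}  // namespace

// Copies at most kMaxName - 1 characters and always null-terminates.
void set_app_name(const char* name) {
    assert(name != nullptr);
    std::strncpy(apName, name, kMaxName - 1);
    apName[kMaxName - 1] = '\0';
}

// Read-only view of the buffer for callers outside this file.
const char* get_app_name() {
    return apName;
}
```

With this layout, an extern char apName[] declaration in another file has nothing in this file to bind to, so a stray reference shows up as a link error unless some other definition exists.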
If you want to share the value between the two lib_base then you shouldn't statically link it twice - you're getting duplicates of both the code and the data, which is both inefficient and confusing.
C++ has this thing called the One Definition Rule (ODR), which says that certain things (inline functions, classes, templates and so on) may be defined in as many translation units as you like (so you can include the same header file in multiple .cpp files), and as long as all the definitions are the same, it will Just Work. Basically this means that the linker is allowed to throw away the duplicates and keep just one, chosen arbitrarily. If you break the ODR, the compiler will not know, and the linker will probably not know either, and then you get undefined behaviour.
In this case you've not broken the ODR, but you've linked the same thing into two different objects (your app, and the shared library), which is a different problem. The dynamic linker (that loads shared libraries at run time) doesn't worry about any of that nonsense - all it does is connect undefined symbols in the app to their definitions in the library. Your apName in the main app is clearly not undefined, so the dynamic linker has nothing to do there, so there's no undefined behaviour here.
Assuming that you didn't want both instances of lib_base to share the definition of apName, your application appears to be linked just fine (the printf proves that), but GDB doesn't cope well with ambiguous symbol names. When GDB looks up a symbol name it doesn't necessarily know where to look first, so you don't always get the one you expected.
Sometimes GDB can sort itself out if you first do list main (or whatever) to set the context you want. Basically though, don't duplicate code - the debugger won't like it.
If you must do this, check out GDB's symbol-file and add-symbol-file commands - you can choose to load only the symbols from one file or the other, and debug the bit you need.

How do I rectify inconsistent Unresolved External errors? [duplicate]

I'm trying to wrap my head around C++ development using the SFML library. I'm following a tutorial (http://www.gamefromscratch.com/page/Game-From-Scratch-CPP-Edition-Part-7.aspx), and using Visual Studio 2010.
A problem I keep running into involves unresolved externals. I'm really struggling with this because, unlike most errors I run into, it a) doesn't seem to have anything to do with the code, and b) doesn't behave consistently. Rather than give y'all a specific example and ask for help solving that one example, I'm hoping to develop a more reliable way of attacking these problems. I'll give you an outline of a common occurrence though.
I have a solution with 8 header files and 8 cpp files that correspond to them. The solution is stable: It compiles and runs with no errors or warnings.
I'll go into a header file and add this line:
virtual void DoNothing();
I'll then go into the matching cpp file and write the method:
void DoNothing(){};
I compile and run, and get 5 unresolved external errors. They don't point to any line of code, so I don't really know how to fix them, but I obviously did something wrong. Fair enough. Trying to get back to a stable state, I delete the two lines of code I had inserted, and compile. Even though the code is identical to the last stable state, I get the same unresolved external errors.
Trying random things, I go into another cpp file and reverse the order of two included header files. The game compiles now. If I switch the order of the included header files back, it still compiles.
What the hell are unresolved external errors? Why don't they seem to behave consistently with the code I've entered? How do I read them to find out what the problem is, and how do I avoid them in the first place?
Thank you.
ps: If there are more specific details I should provide, please just let me know.
"Unresolved External" errors mean that your code is referring to something (usually a function or method, but can be a variable too) that does not exist. These are link errors, and not compile errors; that's why you don't get a line number and more helpful error messages.
Let me give you a little background on how C++ code is turned into an executable (and keep in mind that I'm simplifying things a bit.)
Each C++ source file (and not header file) in your project is compiled separately. A ".cpp" file and all the headers it includes are compiled into what is called an object file or object code. (These files have a ".obj" or ".o" extension.) You can also think of library files (that is ".lib" files on Windows and ".a" files on Linux) as a collection of these object files, stored for later use.
To produce the executable program (e.g. the EXE or DLL file on Windows), all these object files are linked together, and voilà!
Now, the important thing here is that each source file is compiled in isolation, independently of the other source files. So, if the code in one file calls a function that is implemented in another file, the compiler won't see the actual body of that function. As long as the declaration of the called function is visible (i.e. the prototype, the line you write in headers), it simply assumes that these files are going to be linked together eventually, and leaves the task of actually connecting the call to the linker. This usually means that as long as you include the right headers, your compiler is going to be happy.
But the linker is going to be more tenacious and pedantic. At link time, you really really need to provide the body (i.e. the implementation) of all the functions that you use all over the project. It is your task to make sure that all the right object files and libraries are linked together and the implementation of each used function exists somewhere among them exactly once (no more, no less.)
This brings us to your problem. When you get an "unresolved external" linker error, this means that the body of a function you've called does not exist anywhere in object files and libraries that you are linking together.
Obviously, one of two things is happening. Either you have included the header for an external library, but have forgotten to link in the library file itself (which is not your problem here) or you've declared (i.e. written the prototype for) a function but have forgotten to implement its body.
Keep in mind that the linker is really strict here. If you declare something like this in your class:
class Foo {
void bar (int x);
};
and then in your ".cpp" file, implement this function:
void bar (int x)
{
// Do nothing
}
then you'll get an unresolved external error if you actually call Foo::bar() anywhere in your program, because the implemented bar() is not a method of Foo (you should have implemented void Foo::bar (int x) {}.) Similar things happen if you slightly misspell or get the type of the arguments wrong or whatnot.
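For comparison, here is a sketch of the version that links. It follows the Foo and bar names from the example; the public access, the last() accessor and the last_ member are additions made only so that the effect of the call can be observed.

```cpp
class Foo {
public:
    void bar(int x);
    int last() const { return last_; }  // illustrative helper, not in the original

private:
    int last_ = 0;
};

// The Foo:: qualifier makes this the definition of the member declared above.
// Without it, this would define an unrelated free function named bar.
void Foo::bar(int x) {
    last_ = x;
}
```

The same reasoning applies to the DoNothing method in the question: the definition in the .cpp file needs the class name in front of it.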
Reading linker errors and making sense of them can be hard. Sometimes, the name that the linker is complaining about (the "symbol" it says it can't find) is mangled beyond recognition. This has to do with Application Binary Interfaces (ABIs) and several decades of history and precedent. Anyway, most of the time, if you look closely at the link error message, you can see what the function name was, check your code (or libraries), and try again.
Also, though it's rare, it sometimes happens that in order to solve some link issues, you have to resort to completely rebuilding your project.
Every time I've seen behavior like this, it has been because of a circular reference between projects. For example, project A has a reference to an object/symbol implemented in project B while at the same time project B has a reference to an object/symbol from project A. Every time you build your solution, the tools have to compile one project first, then the other. If you make a change to the second project to be compiled, the first one cannot see the change on the first round of compilations and the build fails. If you manage to manually build project B (against a now obsolete copy of library A), then the solution starts to build correctly. More complex cycles are possible (e.g. A depends on B, which depends on C, which depends on A). You don't mention multiple projects explicitly, but I bet you have them.
These circular references are common in large solutions that have been around for a long time and have grown slowly. People get into the habit of adding links from everything to everything because they need one function from here, a struct from there...
Hunt down these dependencies. You should be able to do a full clean rebuild from nothing but the source code. Your dependency tree should look like... well, a tree, not a graph with cycles.
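If the project references can be written down as a list of "X depends on Y" pairs, hunting for cycles can be automated. The following is only a sketch of a depth-first search over such a list; DependencyGraph, visit and has_dependency_cycle are hypothetical names, and how the list is extracted from a solution is left open.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Project name -> names of the projects it references.
using DependencyGraph = std::map<std::string, std::vector<std::string>>;

namespace {
// Depth-first search; "active" holds the projects on the current path.
bool visit(const DependencyGraph& graph, const std::string& project,
           std::set<std::string>& active, std::set<std::string>& done) {
    if (done.count(project)) return false;            // already fully explored
    if (!active.insert(project).second) return true;  // back on our own path
    auto it = graph.find(project);
    if (it != graph.end()) {
        for (const std::string& dep : it->second) {
            if (visit(graph, dep, active, done)) return true;
        }
    }
    active.erase(project);
    done.insert(project);
    return false;
}
}  // namespace

// Returns true if any project depends, directly or indirectly, on itself.
bool has_dependency_cycle(const DependencyGraph& graph) {
    std::set<std::string> active, done;
    for (const auto& entry : graph) {
        if (visit(graph, entry.first, active, done)) return true;
    }
    return false;
}
```

A graph such as A -> B -> C -> A is reported as cyclic, while A -> {B, C} with B -> C is not, even though C is reachable along two paths.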

Is it possible to write COM code in a static library and then link it to a DLL?

I am currently working on a project that has a number of COM objects written in C++ with ATL.
Currently, they are all defined in .cpp and .idl files that are directly compiled into the COM DLL.
To allow unit tests to be written easier, I am planning on moving the implementation of the COM objects out into a separate static library. That library can then be linked in to the main DLL, and the separate unit test project.
I am assuming that there's nothing particularly special about the code generated by ATL, and that this will work much like all other C++ code when it comes to linking with static libraries. However, I don't have too much actual knowledge of ATL myself so don't know if this is really the case.
Will this work as I'm expecting? Or are there pitfalls that I should look out for?
There are gotchas since LIBs are pulled in only if they are referenced, as opposed to OBJs which are explicitly included.
Larry Osterman discussed some of the subtleties a few years ago:
When I moved my code into a library, what happened to my ATL COM
objects?
A caveat: This post discusses details of how ATL7 works. For other
versions of ATL, YMMV. The general principles apply for all
versions, but the details are likely to be different.
My group’s recently been working on reducing the number of DLLs
that make up the feature we’re working on (going from somewhere
around 8 to 4). As a part of this, I’ve spent the past couple of
weeks consolidating a bunch of ATL COM DLL’s.
To do this, I first changed the DLLs to build libraries, and then
linked the libraries together with a dummy DllInit routine (which
basically just called CComDllModule::DllInit()) to make the DLL.
So far so good. Everything linked, and I got ready to test the new
DLL.
For some reason, when I attempted to register the DLL, the
registration didn’t actually register the COM objects. At that
point, I started kicking myself for forgetting one of the
fundamental differences between linking objects together to make an
executable and linking libraries together to make an executable.
To explain, I’ve got to go into a bit of how the linker works. When
you link an executable (of any kind), the linker loads all the
sections in the object files that make up the executable. For each
extdef symbol in the object files, it starts looking for a public
symbol that matches the symbol.
Once all of the symbols are matched, the linker then makes a second
pass combining all the .code sections that have identical contents
(this has the effect of collapsing template methods that expand into
the same code (this happens a lot with CComPtr)).
Then a third pass is run. The third pass discards all of the
sections that have not yet been referenced. Since the sections
aren’t referenced, they’re not going to be used in the resulting
executable, so to include them would just bloat the executable.
Ok, so why didn’t my ATL based COM objects get registered? Well,
it’s time to play detective.
Well, it turns out that you’ve got to dig a bit into the ATL code to
figure it out.
The ATL COM registration logic gets picked in the CComModule
object. Within that object, there’s a method
RegisterClassObjects, which redirects to
AtlComModuleRegisterClassObjects. This function walks a list of
_ATL_OBJMAP_ENTRY structures and calls the RegisterClassObject
on each structure. The list is retrieved from the
m_ppAutoObjMapFirst member of the CComModule (ok, it’s really a
member of the _ATL_COM_MODULE70, which is a base class for the
CComModule). So where did that field come from?
It’s initialized in the constructor of the CAtlComModule, which
gets it from the __pobjMapEntryFirst global variable. So where’s
__pobjMapEntryFirst field come from?
Well, there are actually two fields of relevance,
__pobjMapEntryFirst and __pobjMapEntryLast.
Here’s the definition for the __pobjMapEntryFirst:
__declspec(selectany) __declspec(allocate("ATL$__a")) _ATL_OBJMAP_ENTRY* __pobjMapEntryFirst = NULL;
And here’s the definition for __pobjMapEntryLast:
__declspec(selectany) __declspec(allocate("ATL$__z")) _ATL_OBJMAP_ENTRY* __pobjMapEntryLast = NULL;
Let’s break this one down:
__declspec(selectany): __declspec(selectany) is a directive to
the linker to pick any of the similarly named items from the section
– in other words, if a __declspec(selectany) item is found
in multiple object files, just pick one, don’t complain about it
being multiply defined.
__declspec(allocate("ATL$__a")): This one’s the one that makes
the magic work. This is a declaration to the compiler, it tells the
compiler to put the variable in a section named "ATL$__a" (or
"ATL$__z").
Ok, that’s nice, but how does it work?
Well, to get my ATL based COM object declared, I included the
following line in my header file:
OBJECT_ENTRY_AUTO(<my classid>, <my class>)
OBJECT_ENTRY_AUTO expands into:
#define OBJECT_ENTRY_AUTO(clsid, class) \
__declspec(selectany) ATL::_ATL_OBJMAP_ENTRY __objMap_##class = {&clsid, class::UpdateRegistry, class::_ClassFactoryCreatorClass::CreateInstance, class::_CreatorClass::CreateInstance, NULL, 0, class::GetObjectDescription, class::GetCategoryMap, class::ObjectMain }; \
extern "C" __declspec(allocate("ATL$__m")) __declspec(selectany) ATL::_ATL_OBJMAP_ENTRY* const __pobjMap_##class = &__objMap_##class; \
OBJECT_ENTRY_PRAGMA(class)
Notice the declaration of __pobjMap_##class above – there’s
that declspec(allocate("ATL$__m")) thingy again. And that’s where
the magic lies. When the linker’s laying out the code, it sorts
these sections alphabetically – so variables in the ATL$__a
section will occur before the variables in the ATL$__z section.
So what’s happening under the covers is that ATL’s asking the linker
to place all the __pobjMap_<class name> variables in the
executable between __pobjMapEntryFirst and __pobjMapEntryLast.
And that’s the crux of the problem. Remember my comment above about
how the linker works resolving symbols? It first loads all the items
(code and data) from the OBJ files passed in, and resolves all the
external definitions for them. But none of the files in the wrapper
directory (which are the ones that are explicitly linked) reference
any of the code in the DLL (remember, the wrapper doesn’t do much more
than simply calling into ATL’s wrapper functions – it doesn’t
reference any of the code in the other files).
So how did I fix the problem? Simple. I knew that as soon as the
linker pulled in the module that contained my COM class definition,
it'd start resolving all the items in that module. Including the
__objMap_<class>, which would then be added in the right location so
that ATL would be able to pick it up. I put a dummy function call
called ForceLoad<MyClass> inside the module in the library, and
then added a function called CallForceLoad<MyClass> to my DLL
entry point file (note: I just added the function – I didn’t
call it from any code).
And voila, the code was loaded, and the class factories for my COM
objects were now auto-registered.
What was even cooler about this was that since no live code called
the two dummy functions that were used to pull in the library, pass
three of the linker discarded the code!
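The pattern in the post can be sketched in portable C++, without ATL. This is an analogy rather than Larry's code: registry() and Registrar are hypothetical stand-ins for the ATL object map, and the sections marked in the comments would live in separate files (the static library and the DLL entry-point file) in a real build.

```cpp
#include <string>
#include <vector>

// Hypothetical registry standing in for ATL's object map.
std::vector<std::string>& registry() {
    static std::vector<std::string> entries;
    return entries;
}

// --- would live in an object file inside the static library ---
struct Registrar {
    explicit Registrar(const char* name) { registry().push_back(name); }
};

// Registers itself during static initialization, but only if the linker
// actually pulls this object file into the final binary.
static Registrar g_register_my_class("MyClass");

// Dummy symbol whose only job is to be referenced from outside.
void ForceLoadMyClass() {}

// --- would live in the DLL entry-point file ---
// Never called; the reference alone makes the linker load the object
// file above, and with it the self-registering variable.
void CallForceLoadMyClass() {
    ForceLoadMyClass();
}
```

In a single-file build like this one the registration always runs. The dummy functions only matter once the registrar sits in a library member that nothing else refers to.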

Templated C++ Object Files

Let's say I have two .cpp files, file1.cpp and file2.cpp, which both use std::vector<int>. Suppose that file1.cpp has an int main(void). I compile both into file1.o and file2.o, and link the two object files into an ELF binary which I can execute. I am compiling on a 32-bit Ubuntu Linux machine.
My question regards how the compiler and linker put together the symbols for the std::vector:
When the linker makes my final binary, is there code duplication? Does the linker have one set of "templated" code for the code in file1.o that uses std::vector and another set of std::vector code for the code in file2.o?
I tried this for myself (I used g++ -g) and looked at the disassembly of my final executable. The labels generated for the vector constructor and other methods looked random, although the code from file1.o appeared to call the same constructor as the code from file2.o. I could not be sure, however.
If the linker does prevent the code duplication, how does it do it? Must it "know" what templates are? Does it always prevent code duplication regarding multiple uses of the same templated code across multiple object files?
It knows what the templates are through name mangling. The type of the object is encoded by the compiler in its name, and that allows the linker to filter out the duplicate implementations of the same template.
This is done during linking, not compilation, because each .o file can be linked with anything, and thus cannot be stripped of something that may later be needed. Only the linker can decide which code is unused, which template instantiation is a duplicate, etc. This is done by using "weak symbols" in the object's symbol table: symbols that the linker may discard if they appear multiple times (as opposed to other symbols, like user-defined functions, which cause a link error if defined more than once).
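The "apparently random" labels in the disassembly are mangled names, and they can be decoded. A minimal sketch, assuming GCC or Clang (the abi::__cxa_demangle function comes from their C++ ABI support header and is not available in MSVC); demangle and vector_of_int_name are hypothetical helper names.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC/Clang only)

#include <cstdlib>
#include <memory>
#include <string>
#include <typeinfo>
#include <vector>

// Returns the human-readable form of a mangled name, or the input on failure.
std::string demangle(const char* mangled) {
    int status = 0;
    // __cxa_demangle allocates the result with malloc, so release it with free.
    std::unique_ptr<char, void (*)(void*)> text(
        abi::__cxa_demangle(mangled, nullptr, nullptr, &status), std::free);
    return (status == 0 && text) ? std::string(text.get())
                                 : std::string(mangled);
}

// On GCC and Clang, typeid(T).name() yields the mangled type name.
std::string vector_of_int_name() {
    return demangle(typeid(std::vector<int>).name());
}
```

Because both object files encode std::vector<int> identically, their copies of each member function end up under the same symbol name, which is what lets the linker recognize them as duplicates. Running nm -C on the object files shows the same names already demangled.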
Your question is stated verbatim in the opening section of this documentation:
http://gcc.gnu.org/onlinedocs/gcc/Template-Instantiation.html
Technically due to the "one definition rule" there is only one std::vector<int> and therefore the code should be linked together. What may happen is that some code is inlined which would speed up execution time but could produce more code.
If you had one file using std::vector<int> and another using std::vector<unsigned int> then you would have 2 classes and potentially lots of duplicate code.
Of course, the writers of vector might share some common code for certain situations, e.g. for POD types, which removes some of the duplication.
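The distinction between "same instantiation" and "different instantiation" can be checked at compile time. A small sketch; FirstUse, SecondUse and same_instantiation are hypothetical names standing in for the uses in two separate files.

```cpp
#include <type_traits>
#include <vector>

// Different template arguments give unrelated types, each with its own
// instantiated member functions.
static_assert(
    !std::is_same<std::vector<int>, std::vector<unsigned int>>::value,
    "vector<int> and vector<unsigned int> are distinct types");

// The same arguments always name the same type, whichever file uses them.
using FirstUse = std::vector<int>;
using SecondUse = std::vector<int>;

bool same_instantiation() {
    return std::is_same<FirstUse, SecondUse>::value;
}
```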

How does function-level linking deal with variables declared at file level?

As I understand it, function-level linking builds (explicitly or not) a graph of all possible calls and only includes the reachable functions' code in the produced binary. But how does it deal with variables declared at file level?
Say I have
MyClass GlobalVariable;
static MyClass StaticGlobalVariable;
in some file that contains only these two variables and a set of functions not actually called from any of the remaining code.
Will the code for allocating and initializing these variables be included in the output?
From experience (rather than quoting the standard):
If the initialization has visible side effects, like calls into external libraries or file I/O, the initialization will always happen.
boost::singleton_default provides an interesting solution that enforces the initialization to be done only when the object is referenced elsewhere, i.e. when all other references to the object are removed by the linker, the initialization is removed, too.
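A related, standard-library-only way to tie initialization to use is a function-local static, which is constructed the first time control reaches it. This is a different technique from the boost singleton mentioned here and is shown only as a sketch. MyClass is taken from the question, while instance() and construction_count() are hypothetical names.

```cpp
namespace {
int g_constructions = 0;  // counts how often MyClass has been constructed
}  // namespace

class MyClass {
public:
    MyClass() { ++g_constructions; }
};

// Constructed on the first call only; never constructed if never called.
MyClass& instance() {
    static MyClass object;
    return object;
}

int construction_count() {
    return g_constructions;
}
```

If nothing calls instance(), the object is never constructed, and function-level linking is free to drop the function altogether.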
Edit: Yes. g++ optimization flags try to figure out which functions are called and prune away .o files, resulting in linker errors. I'm not sure if this happens only with certain optimization flags, but it does happen.
A bad habit in our company is the presence of a lot of 'extern g_GlobalFunction()' declarations in different files. As their calls depended on conditional code, the .o files were often thrown away, resulting in link errors.
We fixed that with g_InitModule() and g_InitFileName() calls that are called hierarchically starting from main(). Mostly, these are empty functions just meant to dissuade g++ from discarding the .o file.

Function declarations and an unresolved external

I am looking after a huge old C program and converting it to C++ (which I'm new to).
There are a great many complicated preprocessor hacks going on connected to the fact that the program must run on many different platforms in many different configurations.
In one file (call it file1.c) I am calling functionA().
And in another file (call it file2.c) I have a definition of functionA().
Unfortunately the exact type of the function is specified by a collection of macros created in a bewildering number of ways.
Now the linker is complaining that:
functionA is an unresolved external symbol.
I suspect that the problem is that the prototype as seen in file1.c is slightly different from the true definition of the function as seen in file2.c.
There is a lot of scope for subtle differences due to mismatches between __cdecl and __fastcall, and between with and without __forceinline.
Is there some way to show exactly what the compiler thinks is the type of functionA() as seen by file1.c as opposed to file2.c?
You can pass a flag to the compiler (/P, I think) that causes it to output the complete preprocessed output that is passed to the compiler - you can then open this (huge) file, and search through it and the information you need will be in there, somewhere.
Must you actually convert all the existing C code to C++? This is likely to be a lot of work, especially given what you've described so far.
Instead, you can write new code in C++ and call into the C code using extern "C". For example, in a C++ source file you can:
extern "C" {
#include "old_c_header.h"
}
This changes the linkage so the C++ compiler generates external references to the C code without name mangling, allowing the linker to match everything up.
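A header that has to serve both languages often carries the guard itself. The following is a sketch with a hypothetical function legacy_add; in a real project the definition would sit in a .c file compiled as C, and it is placed here only to keep the example self-contained.

```cpp
// --- what old_c_header.h might contain ---
#ifdef __cplusplus
extern "C" {
#endif

int legacy_add(int a, int b);  // hypothetical legacy function

#ifdef __cplusplus
}
#endif

// --- stand-in for the C implementation ---
// It inherits C linkage from the declaration above.
int legacy_add(int a, int b) {
    return a + b;
}

// --- new C++ code calling into it ---
int add_three(int a, int b, int c) {
    return legacy_add(legacy_add(a, b), c);
}
```

With the guard inside the header, C++ files can include it directly and C files see plain C declarations.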
Normally you should have the expected and the actual signature in the output.
Otherwise, you can instruct the compiler to write the result of preprocessing to a separate file: cl.exe /P for MSVC, and gcc -E for gcc (which prints to standard output unless you add -o).
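Once the code compiles as C++, another option is to let the compiler compare the two signatures. A sketch, assuming the hypothetical macros FUNCTION_A_RETURN and FUNCTION_A_ARG stand in for the real preprocessor machinery and that ExpectedFunctionA spells out what the calling file believes:

```cpp
#include <type_traits>

// Hypothetical stand-ins for the configuration macros.
#define FUNCTION_A_RETURN int
#define FUNCTION_A_ARG long

// What the defining file ends up declaring after macro expansion.
FUNCTION_A_RETURN functionA(FUNCTION_A_ARG value);

// What the calling file believes the signature to be.
using ExpectedFunctionA = int(long);

// Compilation fails here, with a readable message, if the two disagree.
static_assert(std::is_same<decltype(functionA), ExpectedFunctionA>::value,
              "functionA does not have the signature the caller expects");

FUNCTION_A_RETURN functionA(FUNCTION_A_ARG value) {
    return static_cast<int>(value) + 1;
}
```

On compilers where the calling convention is part of the function type, a mismatch there should trip the same check, which turns a late linker error into an early compile error.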