Linker error with Duplicated Symbols, SWIG and C++ Vectors

I came across this error while trying to build a shared object from two sets of objects. The first set contains one .os object compiled from a single cpp file generated by SWIG. The second set contains all of the .os files built from the individual files that make up the interface to be wrapped.
$g++ -shared *.os -o Mathlibmodule.so
ld: duplicate symbol std::vector<int, std::allocator<int> >::size() const in Mathlib_wrap.o and Capsule.o
The SWIG C++ wrapper (Mathlib_wrap.o's source file) is machine generated and nasty to look at, with lots of #defines that make it extra hard to trace. It looks like the redefinition is present in all of the object files in the second set. I've traced through the headers included in all those files, and they all seem to be #pragma once'd.
What advice do people have for tracking down what/where the problem is?

I'm going to assume that you've properly #ifndef/#define guarded all of the header files in your C++ library. After that, I'd check your .i file to make sure you aren't duplicating some declaration there. Maybe try importing a small piece of the library first.
I have run into issues like this before, but it has always turned out to be something silly I'd done. Nothing specific, I'm afraid.
Posting the .i file might help.

When in doubt, assume that the error means what it says: Actual code was generated for vector<T>::size within each of those object files. This of course seems very unusual because you would expect the function to be expanded inline in each file it was being used in.
If it weren't std::vector the first thing I would say is that a function defined in a header wasn't marked inline correctly. The compiler would generate the code in each source file that included that header. What version of g++ are you using, and are you using a custom standard library/vector implementation?
One thing to check is to compile with optimization on (-O2) and see whether that causes the calls to be inlined without creating an actual function.
Another possibility is that you're including two different versions of the vector include, and violating the one definition rule. At that point I wouldn't rule out a linker error such as you're seeing.
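The "function defined in a header without inline" case described above can be sketched as follows. The header name mathutil.h and the function count_items are hypothetical and are not taken from the asker's project; this only illustrates the general rule.

```cpp
// Sketch of a hypothetical header, mathutil.h, included by several .cpp files.
#ifndef MATHUTIL_H
#define MATHUTIL_H

#include <cstddef>
#include <vector>

// Without `inline`, every .cpp file that includes this header would emit its
// own external definition of count_items, and the linker would report a
// duplicate symbol. With `inline`, the copies are merged at link time.
inline std::size_t count_items(const std::vector<int>& values) {
    return values.size();
}

#endif  // MATHUTIL_H
```

Member functions defined inside a class template, such as std::vector<int>::size(), are implicitly inline, which is why a duplicate definition of one of them is unusual and points at the build setup rather than at the header guards.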

Related

What is __aeabi_unwind_cpp_pr1' and how can I avoid it?

I have a bunch of arm assembly, C and C++ files. gcc is trying to link them, but these are for an embedded project.
I am not using any external libraries, all code that is being used was written by me. An error seems to happen because I have a function called int kernel_main(void) defined in main.c that is trying to call set_LED(int value) defined in mailbox.cpp which includes the header mailbox.h (I did include the header in the main.c file).
The exact error is:
undefined reference to `__aeabi_unwind_cpp_pr1'
The way I am making my project is:
-compile all source files (.s, .c, .cpp) into object files (.o) without linking (-c), then link them all together with the use of a custom linker script.
Edit: I am going to add some information to make things more clear.
First changing all files so that all of them are C files (no cpp extensions) yields:
undefined reference to `set_LED'
It is unlikely that the issue itself is name mangling, and it probably has nothing to do with the differences between C and C++.
The problem is very likely to be a linker issue.
This is the build process:
Compile c files, Example:
arm-none-eabi-g++ -O0 -march=armv8-a source/MainFiles/mailbox.cpp -nostartfiles -c -o objects/MainFiles/mailbox.o
(Compiling a C++ file would be identical except for the use of g++ instead of gcc)
Link everything:
arm-none-eabi-ld object1 object2... -o build/kernel.elf -T ./source/kernel.ld -I include_directory_1 -I include_directory_2 -L include_directory_1 -L include_directory_2
Include directories are all directories under the current one
Edit:
The error came back. Ignore the parts of this question relevant to name mangling. The error I need to fix is:
./objects/Hardware/mailbox.o:(.ARM.exidx+0x18): undefined reference to `__aeabi_unwind_cpp_pr1'
So far all I know is that this has something to do with unwinding the stack and exceptions. It seems the function is defined in libgcc. However I have used -nostdlib, I have omitted it, and in both cases the error persists. I have tried changing file extensions to .c whenever possible and to .cpp whenever possible, alas the error is always there.
It got fixed only as long as I had exactly 1 cpp file and the rest of my files were C files (this is no longer true, I tried). What triggered the error again was that I was refactoring the code and I wanted to move a couple of functions to new files.
In other words, without deleting a single file, declaring a function named wait(uint32_t time) in mailbox.cpp works, whereas declaring it in a file called time.c (or .cpp) with its respective header declaration and including that header in mailbox.cpp breaks things. Note that I don't delete the files when moving the function; I simply delete the function declaration inside each file.
Adding a stub like this:
void __aeabi_unwind_cpp_pr1()
{
}
Fixes the problem and the code works. But I don't like this solution. I don't want a useless stub being called mysteriously in my code. I don't need nor want this function in my current implementation, how can I tell the compiler or the linker that they are to omit whatever they are doing that requires this function?

The solution is very simple. As it turns out, exceptions are enabled by default, which is what generates the code that references __aeabi_unwind_cpp_pr1. To disable them, pass -fno-exceptions as an argument to the gcc/g++ compiler and the problem is solved.

You have a reference to this function, which belongs to GCC's C++ runtime. It's part of the exception handling. Whatever you are doing sounds a little crazy, but you can do this if you really know what you are doing. You must link against the C++ runtime libraries. That's it. Link against "libstdc++".
About the set_LED I also believe it's just about the C++ mangling, just as Justin J mentioned in the other answer.

I have seen this when mixing C and C++. Because of name mangling, the symbols will have different names internally depending on the type of the source file.
If the source for 'set_LED' is a C file, use the following in the header around the prototype and see if it helps.
#ifdef __cplusplus
extern "C" {
#endif
// function prototypes here
#ifdef __cplusplus
}
#endif
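Here is a minimal sketch of how that guard might look when filled in for the set_LED function from the question. The void return type and the led_state variable are assumptions made for illustration; the real function presumably writes to hardware.

```cpp
// mailbox.h (sketch): one declaration shared by C and C++ callers.
#ifdef __cplusplus
extern "C" {
#endif

void set_LED(int value);  // return type assumed to be void

#ifdef __cplusplus
}
#endif

// mailbox.cpp (sketch): the definition picks up C linkage from the declaration
// above, so the symbol is emitted as plain set_LED, not a mangled C++ name.
static int led_state = 0;  // hypothetical stand-in for the real hardware write

void set_LED(int value) {
    led_state = value;
}
```

With this in place, a C file such as main.c that includes the header refers to the same unmangled symbol that the C++ file defines.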

Please also add prefix "-shared" without quotes to -fno-exceptions. I am using ARM GCC version

How does a function caller use a header file to determine what to do with a compiled binary?

My understanding is that C++ (and C, I guess) header files are never compiled, and simply act as an explanation of the interface of the C++ file they describe.
So if my header file describes a hello() function, some program that includes the header will know about hello() and how to call it and what arguments to give it, etc.
However, after compilation (and before linking, I guess? I'm not sure), when the hello.c file is binary machine code, and hello.h is still C++, how does the compiler/linker know how to call a function in the binary blob based on the presence of its declaration in the header file?
I understand concepts such as symbol tables, abstract syntax trees, etc. (i.e., I have taken a compiler class in the past), but this is a gap in my knowledge.

The implementation of hello() assumes a certain calling convention (where the parameters are on the stack, whether the caller or the callee cleans up the stack, etc.).
The compiler generates code with the correct calling convention. It may use information from the header file to do this (e.g. the function is marked __stdcall in a Windows program) or it may use its default calling convention. The compiler will also use the header file to make sure you are calling the routine with the right number and types of parameters. Once the code is generated by the compiler, the header file is not used again.
The linker is not concerned with calling convention. Its primary responsibility is to patch together the binaries you've compiled by fixing up references among your modules and any libraries they call.

A C/C++ compilation unit (cpp file / c file) includes all the header files (as text) and the code.
The header file helps explain how to produce the call instruction
push arg1
push arg2
call _some_function
If the compilation unit includes _some_function then this will be resolved at compile time.
Otherwise it becomes an undefined symbol. If so, when the linker comes along, it looks through all the object files and libraries to resolve all the undefined symbols.
So the header file helps code the assembly correctly.
Object and library files provide implementations.
The library files are optional. When a linker looks in a library file, it only gets added if it satisfies some symbol, otherwise it is not added to the binary.
Object files (ignoring optimization) will get added to the binary completely.
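As a rough illustration of the split described above, the three pieces below would normally live in separate files; they are shown in one listing only to keep the sketch self-contained. The int parameter, the return value, and the call_hello helper are assumptions added for the example.

```cpp
// hello.h (sketch): the declaration is all the compiler sees when it compiles
// the caller. It gives the name, the parameter types, and the return type.
int hello(int times);

// caller.cpp (sketch): compiled using only the declaration. If hello were
// defined in another file, the object file would record it as an undefined
// symbol for the linker to resolve.
int call_hello() {
    return hello(2);
}

// hello.cpp (sketch): the definition that supplies the symbol.
int hello(int times) {
    return times * 21;
}
```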

Building a C++ program is a two-step process: compile and link.
The header is for compilation of the module you are writing. The binary is for linking: it contains the compiled code for the method declared in the corresponding header. The header has to match what's already been compiled. At link time you will learn whether your header has a method signature that matches what was compiled into the binary.

How can I find out the references to a C++ symbol

I am working on a big existing C++ code base (more than 1 million lines of code). I need to remove some parts of the code deemed not useful. However, when I simply exclude those parts from the build process (i.e. do not compile them), I eventually get "undefined reference" errors at link time for some of the symbols (class function names) I removed.
A problem arose when I tried to find where the rest of the code refers to these symbols. Using Cscope or OpenGrok, I can find a few explicit references, but removing those does not really help. There are lots of other cases that refer indirectly to the symbols I removed, for example:
virtual functions overridden in child class
"typedef" defined other symbol to refer to this missing symbol.
My question is: is there any gcc/g++ option I can turn on to get an output of all references (that gcc/g++ is aware of), direct or indirect, to the symbol I removed?
If there is no such gcc/g++ option, is there any other tool that can produce such output?
Thanks.

Removing the compilation units (c or cpp files) from your project does not completely remove them. Those are typically just the definitions of functions and classes. The declarations of those functions and classes still exist in headers which are likely still being included in other compilation units.
Track down where these things are declared (typically in header files) and either comment them out in the headers or stop including the headers entirely if you don't need anything within them for your project.
For example:
If you are removing foo.c from a project, make sure any instance of #include "foo.h" has been removed from all other c/cpp files

You can instruct LD to emit a linker map containing a cross reference table using the flags -Map=path/to/my_mapfile.map and --cref. More info here:
https://sourceware.org/binutils/docs/ld/Options.html
The map file is very long and terse, but it usually has enough information to help you pinpoint exactly why a given symbol is still being referenced.

How do C++ header files work?

When I include some function from a header file in a C++ program, does the entire header file code get copied to the final executable or only the machine code for the specific function is generated. For example, if I call std::sort from the <algorithm> header in C++, is the machine code generated only for the sort() function or for the entire <algorithm> header file.
I think that a similar question exists somewhere on Stack Overflow, but I have tried my best to find it (I glanced over it once, but lost the link). If you can point me to that, it would be wonderful.

You're mixing two distinct issues here:
Header files, handled by the preprocessor
Selective linking of code by the C++ linker
Header files
These are simply copied verbatim by the preprocessor into the place that includes them. All the code of algorithm is copied into the .cpp file when you #include <algorithm>.
Selective linking
Most modern linkers won't link in functions that aren't getting called in your application. I.e. write a function foo and never call it - its code won't get into the executable. So if you #include <algorithm> and only use sort here's what happens:
The preprocessor shoves the whole algorithm file into your source file
You call only sort
The linker analyzes this and only adds the code of sort (and functions it calls, if any) to the executable. The other algorithms' code isn't getting added
That said, C++ templates complicate the matter a bit further. It's a complex issue to explain here, but in a nutshell, templates get expanded by the compiler for all the types that you're actually using. So if you have a vector of int and a vector of string, the compiler will generate two copies of the vector code that you use, one for each type. Since you are using it (otherwise the compiler wouldn't generate it), the linker also places it into the executable.
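A small sketch of the template point above: the function template largest and the two wrapper functions are hypothetical names chosen for illustration. Code is generated only for the element types the template is actually used with.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// No machine code exists for this template until it is used with a concrete
// type. The vector is assumed to be non-empty.
template <typename T>
T largest(const std::vector<T>& values) {
    return *std::max_element(values.begin(), values.end());
}

// Using the template with int and with std::string makes the compiler
// generate two separate functions, one per type.
int largest_int() {
    return largest(std::vector<int>{3, 9, 4});
}

std::string largest_word() {
    return largest(std::vector<std::string>{"pear", "apple", "zebra"});
}
```

Nothing is generated for, say, largest<double>, because no code in the program asks for it.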

In fact, the entire file is copied into .cpp file, and it depends on compiler/linker, if it picks up only 'needed' functions, or all of them.
In general, simplified summary:
debug configuration means compiling in all of non-template functions,
release configuration strips all unneeded functions.
Plus it depends on attributes: a function declared for export will never be stripped.
On the other side, template function variants are 'generated' when used, so only the ones you explicitly use are compiled in.
EDIT: header file code isn't generated, but in most cases hand-written.

If you #include a header file in your source code, it acts as if the text in that header was written in place of the #include preprocessor directive.
Generally headers contain declarations, i.e. information about what's inside a library. This way the compiler allows you to call things for which the code exists outside the current compilation unit (e.g. the .cpp file you are including the header from). When the program is linked into an executable that you can run, the linker decides what to include, usually based on what your program actually uses. Libraries may also be linked dynamically, meaning that the executable file does not actually include the library code but the library is linked at runtime.

It depends on the compiler. Most compilers today do flow analysis to prune out uncalled functions. http://en.wikipedia.org/wiki/Data-flow_analysis

Preventing objects from being linked if they are not needed?

I have an ARM project that I'm building with make. I'm creating the list of object files to link based on the names of all of the .c and .cpp files in my source directory. However, I would like to exclude objects from being linked if they are never used. Will the linker exclude these objects from the .elf file automatically even if I include them in the list of objects to link? If not, is there a way to generate a list of only the objects that need to be linked?

You have to compile your code differently to strip out functions and data that aren't used. Usually all the functions in a file are compiled into the same section, so they can't be individually omitted if they're not used.
Add the two following switches to your compiler line:
-ffunction-sections -fdata-sections
When you compile, the compiler will now put individual functions and data into their own sections instead of lumping them all in one module section.
Then, in your linker, specify the following:
--gc-sections
This instructs the linker to remove unused sections ("gc" is for garbage collection). It will garbage collect parts of files and entire files. For example, if you're compiling an object, but only use 1 function of 100 in the object, it will toss out the other 99 you're not using.
If you run into issues with functions not being found (this happens for various reasons, such as externs between libraries), you can use KEEP directives in your linker script (*.ld) to prevent garbage collection of those individual sections.

If you are using RealView, it seems that it is possible. This section discusses it:
3.3.3 Unused section elimination
Unused section elimination removes code that is never executed, or data that is not
referred to by the code, from the final image. This optimization can be controlled by the
--remove, --no_remove, --first, --last, and --keep linker options. Use the --info unused
linker option to instruct the linker to generate a list of the unused sections that have been
eliminated.

Like many people said, the answer is "depends". In my experience, RVCT is very good about dead code stripping. Unused code and data will almost always be removed in the final link stage. GCC, on the other hand (at least without the LLVM back end), is rather poor at whole image static analysis and will not do a very good job at removing unused code (and woe be it to you if your code is in different sections requiring long jumps). You can take some steps to mitigate it, such as using function-sections, which creates a separate section for each function and enables some better dead code stripping.
Have your linker generate a map file of your binary so you can see what made it in there and what got stripped out.

Depending on the sophistication of the compiler/linker and optimization level, the linker will not link in code that isn't being called.

What compiler/linker are you using? Some linkers do this automatically, and some provide the feature as a command-line option.

In my experience, many compilers will not include unused code on an object file basis. Some may not have this resolution and will include entire libraries ("because this makes the build process faster").
For example, given a file junk.c and it has three functions: Func1, Func2 and Func3. The build process creates an object file, junk.o, which has all three functions in it. If function Func2 is not used, it will be included anyway because the linker can't exclude one function out of an object file.
On the other hand, given files: Func1.c, Func2.c, and Func3.c, with the functions above, one per file. If Func2 in Func2.c is not used, the linker will not include it.
Some linkers are intelligent enough to exclude files out of libraries. However, each linker is different on its granularity of file inclusion (and thus file exclusion). Read your linker's manual or contact their customer support for exact information.
I suggest moving the suspect functions into a separate file (one function per file) and rebuild. Measure the code size before and after. Also, there may be a difference between Debug and Release linking. The Debug linking could be lazy and just throw everything in while the Release linking puts more effort into removing unused code.
Just my thoughts and experience, Your Mileage May Vary (YMMV).

Traditionally, linkers link in all object files that are explicitly specified on the command line, even if they could be left out without leaving the program with any unresolved symbols. This means that you can deliberately change the behaviour of a program by including an object file that does something triggered from static initialization but is not called directly or indirectly from main.
Typically if you place most of your object files in a static library and link this library with a single object file containing your entry point the linker will only pick out members of the library (iteratively) that help resolve an unresolved symbol reference in the original object file or one included subsequently because it resolved a previous unresolved symbol.
In short, place most of your object files in a library and just link this with one object containing your entry point.
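The static-initialization effect mentioned above can be sketched like this. The Registrar type and the counter are hypothetical; in a real project they would sit in their own object file that nothing calls directly.

```cpp
// Zero-initialized before any constructor in the program runs.
static int registered_count = 0;

// A type whose constructor has a side effect.
struct Registrar {
    Registrar() { ++registered_count; }
};

// Constructed during static initialization, before main() starts, even though
// no function ever refers to this object.
static Registrar auto_registrar;

// Accessor so other code can observe the side effect.
int get_registered_count() {
    return registered_count;
}
```

If the object file containing auto_registrar is named explicitly on the link line, the constructor runs at startup. If that object file is only a member of a static library and nothing refers to its symbols, the linker will typically leave it out and the side effect never happens.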
