Bazel linker not finding function defines - c++

I've been able to make a fair amount of progress in trying to add Bazel build files to enable the building of the gennorm2 tool in ICU. Here is my work-in-progress PR using the Bazel target //icu4c/source/tools/gennorm2.
I'm currently getting stuck when running bazelisk build //icu4c/source/tools/gennorm2 --verbose_failures --sandbox_debug with these errors.
They reference functions defined in urename.h. As I understand it, urename.h is also used to rename certain functions by appending a suffix with the version number (_68), but I defined a preprocessor constant U_DISABLE_RENAMING to disable that specific behavior. This only had the effect of changing the names of the undefined function names in the error output, but otherwise not changing it (ex: errors now complain of u_errorName instead of u_errorName_68).
The part that puzzles me is why the error output claims that these symbols are not found. As you can see, the target //icu4c/source/tools/gennorm2 depends on //icu4c/source/common:platform, which in turn depends on //icu4c/source/common:headers, which includes the field hdrs = glob(["unicode/*.h", "unicode/*.h",]), which should be matching
/icu4c/source/common/unicode/urename.h.
In case it helps, this is the verbose log output when running make VERBOSE=1 using the current autotools-based configure + make build on a fresh checkout of ICU.

A teammate was able to take a look and help me reason through the errors and ultimately fix them.
The first thing is to acknowledge that it is indeed a linker error, which can see by noticing the error message references the linker program ld.
This is important because we previously spent time in the wrong place by debugging the compile configs as if the problem happened during the compile phase before the linker phase. (But I learned about one way to debug compile problems is taking the raw GCC command given by --verbose_failures --sandbox_debug and replacing -c with -E and changing the argument of -o to a .txt file in /tmp to save the output what the compiler sees for that file after all the includes are recursively inlined). This means that my attempts to solve the problem by specifying preprocessor defines for the compile-phase were misguided.
The project's documentation on dependencies revealed that I had mis-specified a dependency on one of the targets to specify only the headers (//icu4c/source/common:headers) instead of the relevant definitions and headers (//icu4c/source/common:platform).
After doing that, we solved another problem that was small and interesting. The gennorm2 target depends on code to get the current year (ex: for printing out help messages that include the copyright statement with the year range). As an i18n library, ICU has code to get that, somewhere in //icu4c/source/i18n:icu4ci18n. This creates an excessive amount of code dependencies for an isolated use case (and will cause problems for follow-on work), so we replaced the block of code in gennorm2 calling those calendar year fns (ucal_open, ucal_getNow, ucal_setMillis, ucal_get, ucal_close) with the libc date library function to give us the year as a number, and added linkopts = ["-ldl"] to link in the dl date library.

Related

How can I link a library that contains conditional types/variables definition based on global variables defined through CMake?

Introduction
I am trying to use Toulbar2 as a C++ library in my CMake project, however I am having much trouble linking it to my main executable.
I found many similar questions on this topic, both here and on other similar website, but none of them helped me with my specific issue. I tried literally everything and I did not menage to make it work, I was hoping that some of you may help me with that.
I am running Ubuntu 18.04, CMake version 3.23 and in my project I am using the standard C++11. I am a proficient programmer, but I am just an beginner/intermediate user of both C++ and CMake.
What I've already tried to do
I cannot list all my attempts, so I will only mention those I think are my best ones, to give you an idea of what I may be doing wrong.
1) In my first attempt, I tried to use the same approach I used for any non-standard library I imported, i.e. using find_package() in CMakeLists.txt to then link the found LIBRARIES and include the found INCLUDE_DIRS. However, I soon realised that Toulbar2 provides neither a Find<package>.cmake or <name>Config.cmake file. So, this approach could not work.
2) My second attempt is the one that in my opinion brought me the closest to the solution I hoped for. You can easily compile Toulbar2 as a dynamic library using the command: cmake -DLIBTB2=ON .. in an hypothetical build directory you previously created. After compiling with make you have your .so file in build/lib/Linux. After installation, you can make CMake find this library by itself using the command find_library. So, my CMakeLists.txt ended up looking like this:
[...]
find_library(TB2_LIBRARIES tb2)
if(TB2_LIBRARIES)
set(all_depends ${all_depends} ${TB2_LIBRARIES})
else(TB2_LIBRARIES)
add_compile_definitions("-DNO_TB2")
message("Compiling without Toulbar2, if you want to use it, please install it first")
endif(TB2_LIBRARIES)
[...]
target_link_libraries(main ${all_depends})
[...]
This code works to some extent, meaning that CMake correctly finds the library and runs the linking command, however if I try to #include <toulbar2lib.hpp> the header is not found. So I figured out I should have told CMake where to find that header, so I ended up adding a
include_directories(/path/to/header/file's/directory)
However, I still have another problem. The header is found, but a lot of names used in the header are not found at compilation time. The reason is that in Toulbar2 some variables/types are defined conditionally by using preprocessing directives like #ifdef or #ifndef, and in turn the global variables used in these conditions are defined through CMake at compilation time. If you are interested in an example, I can mention the Cost type that is used in the mentioned header file. I see that there's a piece missing in the puzzle here, but I cannot figure out which one. Since I pre-compiled the library those definitions should exist when I include the header file, because I am correctly linking the correspondent library that contains those definitions.
3) My third attempt is less elegant than the the other two I mentioned, but I was desperately trying to find a solution. So, I copied the whole toulbar2 cloned folder inside my project and I tried to add it as a subdirectory, meaning that my main CMakeLists.txt contains the line:
add_subdirectory(toulbar2)
It provides a CMakeLists.txt too, there should be no problem in doing it. Then I include the src directory of toulbar2, that contains the header file I need, and I should be okay. Right? Wrong. I got the same problem that I had before with (2), i.e. some variables/types conditionally defined were not actually defined when I tried to compile my project, even though the subproject toulbar2 was correctly (no errors) compiled.
I just wanted to mention that any answer is welcome, however if you could help me figure out an elegant solution (see 1 or 2) for this problem it would be way better, as this code is intended to be published soon or later. Thank you in advance for your help.
Solution 2) looks fine. You just need to add the following compilation flags -DNDEBUG -DBOOST -DLONGDOUBLE_PROB -DLONGLONG_COST when compiling your project with toulbar2lib.hpp. See github/toulbar2 README.md how to compile without cmake for those flags (except WCSPFORMATONLY that should not by used in this context).

Checking value of preprocessor symbol(#define)

i'm trying to cross -ompile an application to another system. I created all dependencies and started compiling. This then stops with one of my dependency libraries, namely Qt3, causing compiler errors:
Error: expected class-name before "{" token
and
Error: "QMutex" does not name a type
I'm suspecting the Q_EXPORT symbol to be defined wrong because i forgot to simulate some environment settings. But because it's definition depends on symbols which depend on symbols which depend on symbols, and so on, it's hard to check.
Just outputing it in an test program isn't working either because the value of Q_EXPORT is not always convertable to string.
My question is:
How do i check the value of a preprocessor symbol (while compiling/preprocessing) with GNU Compiler.
I thought there would be an option for this but i havn't found anything while searching on the web.
Debugging of macro symbols can be tricky, because it happens before the actual compilation [1]. Running the build system in such a way that you print the actual compilation command is a good starting point.
Then you can grab the actual compile command, and substitute the -c with -E or something similar, to inspect the actual generated preprocessor output. Then locate the actual place in the source that you are compiling - expect the output from -E to be HUGE - a million lines of output is not unusual. Use the #file and #line preprocessor symbols to track which file you are in, and what line you're at.
[1] Not strictly true in all compilers, as to help with precisely the problem that macros are making it hard to follow what the code is actually doing, modern compilers expand macros during the proper parsing of the code. However, that's not helping in this particular case, apparently.

CPPCheck returns inconsistent results

I've set up CPPCheck (v1.6.1) for a large project containing a bunch of libraries.
When I check a library then I get some check failures which I'm interested in and all is well. However at this point I just have a text file list of all the *.cpp and *.h in that library which I'm passing by '--file-list=...'
Of course, I do get some errors about missing includes, because this library (say MyLibA) includes files from another one of my libraries (MyLibB).
So I now construct a text file that has all the include paths from MyLibB and pass it to cppcheck via '--includes-file=...'.
At this point I get some cpp failures about headers within MyLibB, which is not unexpected, however all the errors that were reported about MyLibA are no longer reported.
Is this a bug or am I doing something wrong?
If cppcheck runs into a #error then it aborts the check. So you can end up in the situation whereby including headers triggers a #error (if for example you haven't correctly set up your -D preprocessor defines for cppcheck on the command line).
This means that files that were checked previously will no longer get checked because the tests were aborted in the header, i.e. before the offending lines of code were reached

C++: How to add external libraries

I'm trying to add SVL to my project.
Without it I'm getting hundreds of errors (undefined reference...). After adding -lSVL all errors are gone, but gcc says: "cannot find -lSVL". Everything else (SDL, SDL_TTF, SDL_Mixer...) works fine.
You should inform gcc of the path where libsvl.a is installed, for example:
gcc -lsvl -L/my/path/
Also, pay attention to the case if you work under Linux ("SVL" is different from "svl").
There are two parts to adding an external library; you need to tell the compiler[1] where to find the description of the API (i.e., the header files) and you need to tell the linker where to find the implementation of the API (i.e., the library file(s)).
The list of possible locations of the headers is given by the include path, which with a traditional compiler is added to with the -I option. It takes a directory name to add; the directory is one extra place the compiler will look for header files.
The list of possible locations of the library is given by the link path. It's just like the include path, but is added to with -L. Note that you can also (at least normally) give the full path to the library directly on the command line, but this isn't particularly recommended because it tends to embed more information in the executable than is really needed.
The syntax for MSVC is recognizably similar IIRC.
If you're using an IDE, you'll probably have to set these things in the project options, but as long as you remember that you need to set both include and library paths, you'll be able to find your way through.
[1]Strictly, you're telling the preprocessor, but the preprocessor's output is virtually always directed straight into the compiler proper.

Preventing objects from being linked if they are not needed?

I have an ARM project that I'm building with make. I'm creating the list of object files to link based on the names of all of the .c and .cpp files in my source directory. However, I would like to exclude objects from being linked if they are never used. Will the linker exclude these objects from the .elf file automatically even if I include them in the list of objects to link? If not, is there a way to generate a list of only the objects that need to be linked?
You have to compile your code differently to strip out function and data that isn't used. Usually all the objects are compiled into the same symbol, so they can't be individually omitted if they're not used.
Add the two following switches to your compiler line:
-ffunction-sections -fdata-sections
When you compile, the compiler will now put individual functions and data into their own sections instead of lumping them all in one module section.
Then, in your linker, specify the following:
--gc-sections
This instructs the linker to remove unused sections ("gc" is for garbage collection). It will garbage collect parts of files and entire files. For example, if you're compiling an object, but only use 1 function of 100 in the object, it will toss out the other 99 you're not using.
If you run into issues with functions not found (it happens due to various reasons like externs between libraries), you can use .keep directives in your linker file (*.ld) in order to prevent garbage collection on those individual functions.
If you are using RealView, it seems that it is possible. This section discusses it:
3.3.3 Unused section elimination
Unused section elimination removes code that is never executed, or data that is not
referred to by the code, from the final image. This optimization can be controlled by the
--remove, --no_remove, --first, --last, and --keep linker options. Use the --info unused
linker option to instruct the linker to generate a list of the unused sections that have been
eliminated.
Like many people said, the answer is "depends". In my experience, RVCT is very good about dead code stripping. Unused code and data will almost always be removed in the final link stage. GCC, on the other hand (at least without the LLVM back end), is rather poor at whole image static analysis and will not do a very good job at removing unused code (and woe be it to you if your code is in different sections requiring long jumps). You can take some steps to mitigate it, such as using function-sections, which creates a separate section for each function and enables some better dead code stripping.
Have your linker generate a map file of your binary so you can see what made it in there and what got stripped out.
Depending on the sophistication of the compiler/linker and optimization level, the linker will not link in code that isn't being called.
What compiler/linker are you using? Some linkers do this automatically, and some provide the feature as a command-line option.
In my experience, many compilers will not include unused code on an object file basis. Some may not have this resolution and will include entire libraries ("because this makes the build process faster").
For example, given a file junk.c and it has three functions: Func1, Func2 and Func3. The build process creates an object file, junk.o, which has all three functions in it. If function Func2 is not used, it will be included anyway because the linker can't exclude one function out of an object file.
On the other hand, given files: Func1.c, Func2.c, and Func3.c, with the functions above, one per file. If Func2 in Func2.c is not used, the linker will not include it.
Some linkers are intelligent enough to exclude files out of libraries. However, each linker is different on its granularity of file inclusion (and thus file exclusion). Read your linker's manual or contact their customer support for exact information.
I suggest moving the suspect functions into a separate file (one function per file) and rebuild. Measure the code size before and after. Also, there may be a difference between Debug and Release linking. The Debug linking could be lazy and just throw everything in while the Release linking puts more effort into removing unused code.
Just my thoughts and experience, Your Mileage May Vary (YMMV).
Traditionally linkers link in all object files that are explicity specified in the command line, even if they could be left out and the program would not have any unresolved symbols. This means that you can deliberately change the behaviour of a program by including an object file that does something triggered from static initialization but is not called directly or indirectly from main.
Typically if you place most of your object files in a static library and link this library with a single object file containing your entry point the linker will only pick out members of the library (iteratively) that help resolve an unresolved symbol reference in the original object file or one included subsequently because it resolved a previous unresolved symbol.
In short, place most of your object files in a library and just link this with one object containing your entry point.