What does bazel alwayslink = true mean? - c++

I'm new to bazel. Here's the explanation in bazel doc:
https://docs.bazel.build/versions/master/be/c-cpp.html#cc_library.alwayslink
alwayslink
Boolean; optional; nonconfigurable; default is 0
If 1, any binary that depends (directly or indirectly) on this C++
library will link in all the object files for the files listed in
srcs, even if some contain no symbols referenced by the binary. This
is useful if your code isn't explicitly called by code in the binary,
e.g., if your code registers to receive some callback provided by some
service.
I don't quite understand the last sentence: e.g., if your code registers to receive some callback provided by some service. Can anyone give and example? Thanks!

e.g., if your code registers to receive some callback provided by some service.
AIUI, this is the case when the cc_binary builds a shared library / a DLL. You need the linker to keep all symbols, even if unused, because another binary who loads the .so/.dll at runtime might need those symbols.

Most linkers will remove object files if they are unreferenced by the rest of the program, even if they contain dynamic initialization.
For example:
struct S { S(); }
S s;
S::S() {
std::cout << "Hello, World!";
}
This code prints Hello, World! during dynamic initialization. The constructor of the global variable s prints it, and that is called during dynamic initialization.
If you link this code with a program, it might be removed, so Hello, World! will not be printed by the program.
The alwayslink flag prevents this, and forces the program to include the dynamic initialization.
The usual use case for this is when the dynamic initialization causes registration of stuff. For example it might be a "plugin" for some application, and its dynamic initialization adds it to the global list of plugins.
It has nothing to do with shared vs static libraries.

Related

if we had a single file project that contained all the code can we not use the linker?

Linker question:
if I had a file. c that has no includes at all, would we still need a linker?
Although the linker is so-named because it links together multiple object files, it performs other functions as well. It may resolve addresses that were left incomplete by the compiler. It produces a program in an executable file format that the system’s program loader can read and load, and that format may differ from that of object modules. Specifics depend on the operating system and build tools.
Further, to have a complete program in one source file, you must provide not just the main routine you are familiar with from C and C++ but also the true start of the program, the entry point that the program loader starts execution at, and you must provide implementations for all functions you use in the program, such as invocations of system services via special trap or system-call instructions to read and write data.
You can create a project, which has no typical C startup code, in which case, you may not even have a main(). However, you still need a linker, because the linker creates the required executable file format for the given architecture.
It also will set the entrypoint, where the actual execution starts.
So you can omit the standard libraries, and create a binary, which is completly void of any C functions, but you still need the linker to actually make a runable binary.
The object file format, generated by the compiler, is very different to the executable file format, because it only provides all information, that is required for the linker.
Yes. The linker does more than merely link the files. Check out this resource for more info: https://en.wikibooks.org/wiki/C%2B%2B_Programming/Programming_Languages/C%2B%2B/Code/Compiler/Linker#:~:text=The%20linker%20is%20a%20program,translation%20unit%20have%20external%20linkage.
Believe it or not, multiple libraries can be referenced by default. So, even if you don't #includea resource, the compiler may have to internally link or reference something outside of the translation unit. There are also redundancies and other considerations that are "eliminated" by the compiler.
Despite its name the linker is properly a "linker/locater". It performs two functions - 1) linking object code, 2) determining where in memory the data and code elements exist.
The object code out of the compiler is not "located" even if it has no unresolved links.
Also even if you have the simplest possible valid code:
int main(){ return 0; }
with no includes, the linker will normally implicitly link the C runtime start-up, which is required to do everything necessary before running main(). That may be very little. On some target such as ARM Cortex-M you can in fact run C code directly from the reset vector so long as you don't assume static initialisation or complete library support. So it is possible to write the reset code entirely in C, but you probably still need code to initialise the vector table with the reset handler (your C start-up function) and the initial stack pointer. On Cortex-M that can be done using in-line assembler perhaps, but it is all rather cumbersome and unnecessary and does not forgo the linker.

MSVCR and CRT Initialization

Out of curiosity, what exactly happens when an application compiled with the MSVCR is loaded, resp. how does the loader of Windows actually initialize the CRT? For what I have gathered so far, when the program as well as all the imported libraries are loaded into memory and all relocations are done, the CRT startup code (_CRT_INIT()?) initializes all global initializers in the .CRT$XC* sections and calls the user defined main() function. I hope this is correct so far.
But let's assume, for the sake of explanation, a program that is not using the MSVCR (e.g. an application built with Cygwin GCC or other compilers) tries to load a library at runtime, requiring the CRT, using a custom loader/runtime linker, so no LoadLibrary() involved. How would the loader/linker has to handle CRT initialization? Would it have to manually initialize all "objects" in said sections, does it have to do something else to make the internal wiring of the library work properly, or would it have to just call _CRT_INIT() (which unpractically is defined in the runtime itself and not exported anywhere as far as I am aware). Would this mix-up even work in any way, assuming that the non-CRT application and the CRT-library would not pass any objects, exceptions and things the like between them?
I would be very interested in finding out, because I can't quite make out what the CRT has an effect on the actual loading process...
Any information is very appreciated, thanks!
The entrypoint for an executable image is selected with the /ENTRY linker option. The defaults it uses are documented well in the MSDN Library article. They are the CRT entrypoint.
If you want to replace the CRT then either pick the same name or use the /ENTRY option explicitly when you link. You'll also need /NODEFAULTLIB to prevent it from linking the regular .lib
Each library compiled against the C++ runtime is calling _DllMainCRTStartup when it's loaded. _DllMainCRTStartup calls _CRT_INIT, which initializes the C/C++ run-time library and invokes C++ constructors on static, non-local variables.
The PE format contains an optional header that has a slot called 'addressofentrypoint', this slot calls a function that will call _DllMainCRTStartup which fires the initialization chain.
after _DllMainCRTStartup finishes the initialization phase it will call your own implemented DllMain() function.
When you learn about programming, someone will tell you that "the first thing that happens is that the code runs in main. But that's a bit like when you learn about atoms in School, they are fairly well organized and operate acorrding to strict rules. If you later go to a Nuclear/Particle Physics class at university, those simple/strict rules are much more detailed and don't always apply, etc.
When you link a C or C++ program the CRT contains some code something like this:
start()
{
CRT_init();
...
Global_Object_Constructors();
...
exit(main());
}
So the initialization is done by the C runtime library itself, BEFORE it calls your main.
A DLL has a DllMain that is executed by LoadLibrary() - this is responsible for initializing/creating global objects in the DLL, and if you don't use LoadLibrary() [e.g. loading the DLL into memory yourself] then you would have to ensure that objects are created and initialized.

Static linking of C++ code

I have the following problem: I use some classes like the following to initialize C libraries:
class Hello
{
public:
Hello()
{
cout << "Hello world" << endl;
}
~Hello()
{
cout << "Goodbye cruel world" << endl;
}
} hello_inst;
If I include this code in a hello.cc file and compile it together with another file containing my main(), then the hello_inst is created before and destroyed after the call
to main(). In this case it just prints some lines, in my project I initialize libxml via
LIBXML_TEST_VERSION.
I am creating multiple executables which share a lot of the same code in a cmake project.
According to this thread: Adding multiple executables in CMake I created a static library containing the code shown above and then linked the executables against that library. Unfortunately in that case the hello_inst is never created (and libxml2 is never initialized). How can I fix this problem?
I had a similar problem and solved it by defining my libraries as static. Therefore I used the following code:
add_library( MyLib SHARED ${LBMLIB_SRCS} ${LBMLIB_HEADER})
Maybe this fixes your problem
There is no official way of forcing shared libraries global variables to be initialised by the standard and is compiler dependent.
Usually this is done either the first time something in that library is actually used (A class, function, or variable) or when the variable itself is actually used.
If you want to force hello_inst to be used, call a function on it, then see if and when the constructor and destructors are called.
Read this thread for more information:
http://www.gamedev.net/topic/622861-how-to-force-global-variable-which-define-in-a-static-library-to-initialize/
As far as I'm aware, statics defined in library should be constructed before main is called and destroyed after main, in the manner you describe. Indeed I have used shared libraries in many projects, and have never encountered the problems you describe.
I understand a library file, to be little more than a container of object files.
However, that said.....
If your code does nothing with the object that is created, the linker is free to remove it (dead code removal). I would suggest making sure the static object is referenced. Call a member function, perhaps?

How do I make an unreferenced object load in C++?

I have a .cpp file (let's call it statinit.cpp) compiled and linked into my executable using gcc.
My main() function is not in statinit.cpp.
statinit.cpp has some static initializations that I need running.
However, I never explicitly reference anything from statinit.cpp in my main(), or in anything referenced by it.
What happens (I suppose) is that the linked object created from statinit.cpp is never loaded on runtime, so my static initializations are never run, causing a problem elsewhere in the code (that was very difficult to debug, but I traced it eventually).
Is there a standard library function, linker option, compiler option, or something that can let me force that object to load on runtime without referencing one of its elements?
What I thought to do is to define a dummy function in statinit.cpp, declare it in a header file that main() sees, and call that dummy function from main(). However, this is a very ugly solution and I'd very much like to avoid making changes in statinit.cpp itself.
Thanks,
Daniel
It is not exactly clear what the problem is:
C++ does not have the concept of static initializers.
So one presume you have an object in "File Scope".
If this object is in the global namespace then it will be constructed before main() is called and destroyed after main() exits (assuming it is in the application).
If this object is in a namespace then optionally the implementation can opt to lazy initialize the variable. This just means that it will be fully initialized before first use. So if you are relying on a side affect from construction then put the object in the global namespace.
Now a reason you may not be seeing the constructor to this object execute is that it was not linked into the application. This is a linker issue and not a language issue. This happens when the object is compiled into a static library and your application is then linked against the static library. The linker will only load into the application functions/objects that are explicitly referenced from the application (ie things that resolve undefined things in the symbol table).
To solve this problem you have a couple of options.
Don't use static libraries.
Compile into dynamic libraries (the norm nowadays).
Compile all the source directly into the application.
Make an explicit reference to the object from within main.
I ran into the same problem.
Write a file, DoNotOptimizeAway.cpp:
void NoDeadcodeElimination()
{
// Here use at least once each of the variables that you'll need.
}
Then call NoDeadcodeElimination() from main.
EDIT: alternatively you can edit your linker options and tell it to always link everything, even if it's not used. I don't like this approach though since executables will get much bigger.
These problems, and the problems with these potential solutions all revolve around the fact that you can't guarantee much about static initialization. So since it's not dependable, don't depend on it!
Explicitly initialize data with a static "InititalizeLibrary" type static function. Now you guarantee it happens, and you guarantee when it happens in relation to other code based on when you make the call.
One C++'ish way to do this is with Singletons.
Essentially, write a function to return a reference to the object. To force it to initialize, make it a static object inside the function.
Make a class static function that is vaguely like this:
class MyClass {
static MyClass& getObject()
{
static MyObject obj;
return obj;
}
};
Since you are using C++, you could always declare a global object (ie a global variable that references a class in statinit.cpp. As always, the constructor will be called on initialization and since the object is global, this will be called before main() is run.
There is one very important caveat though. There is no guarantee as to when the constructor will be called and there is no way to explicitly order when each of these constructors is called. This will also probably defeat any attempt to check for memory leaks since you can no longer guarantee that all the memory allocated while running main has been deallocated.
Is the problem that the static items were never initialized, or is the problem that the static items weren't initialized when you needed to use them?
All static initialization is supposed to be finished before your main() runs. However, you can run into issues if you initialize on static object with another static object. (Note: this doesn't apply if you are using primitives such as int)
For example, if you have in file x.cpp:
static myClass x(someVals);
And in y.cpp:
static myClass y = x * 2;
It's possible that the system will try to instantiate y before x is created. In that case, the "y" variable will likely be 0, as x is likely 0 before it is initialized.
In general, the best solution for this is to instantiate the object when it is first used (if possible). However, I noticed above you weren't allowed to modify that file. Are the values from that file being used elsewhere, and maybe you can change how those values are accessed?
Read the manual page for the ld command and look at the -u option. If statinit.cpp defines anything that looks like a function then try naming it in -u. Otherwise choose a data object that's defined in statinit.cpp and name that in -u, and hope it works. I think it's best if you write the command line so the -u option comes immediately before your -l option for your library that has statinit's object code in it.
Of course the dynamic library solution is the best, but I've also been told it's possible to link the whole static library with the linker option:
-Wl,-whole-archive
before the library's -l option, and
-Wl,-no-whole-archive
after it (to avoid including other libraries as a whole, too).

Compiling a DLL with gcc

Sooooo I'm writing a script interpreter. And basically, I want some classes and functions stored in a DLL, but I want the DLL to look for functions within the programs that are linking to it, like,
program dll
----------------------------------------------------
send code to dll-----> parse code
|
v
code contains a function,
that isn't contained in the DLL
|
list of functions in <------/
program
|
v
corresponding function,
user-defined in the
program--process the
passed argument here
|
\--------------> return value sent back
to the parsing function
I was wondering basically, how do I compile a DLL with gcc? Well, I'm using a windows port of gcc. Once I compile a .dll containing my classes and functions, how do I link to it with my program? How do I use the classes and functions in the DLL? Can the DLL call functions from the program linking to it? If I make a class { ... } object; in the DLL, then when the DLL is loaded by the program, will object be available to the program? Thanks in advance, I really need to know how to work with DLLs in C++ before I can continue with this project.
"Can you add more detail as to why you want the DLL to call functions in the main program?"
I thought the diagram sort of explained it... the program using the DLL passes a piece of code to the DLL, which parses the code, and if function calls are found in said code then corresponding functions within the DLL are called... for example, if I passed "a = sqrt(100)" then the DLL parser function would find the function call to sqrt(), and within the DLL would be a corresponding sqrt() function which would calculate the square root of the argument passed to it, and then it would take the return value from that function and put it into variable a... just like any other program, but if a corresponding handler for the sqrt() function isn't found within the DLL (there would be a list of natively supported functions) then it would call a similar function which would reside within the program using the DLL to see if there are any user-defined functions by that name.
So, say you loaded the DLL into the program giving your program the ability to interpret scripts of this particular language, the program could call the DLLs to process single lines of code or hand it filenames of scripts to process... but if you want to add a command into the script which suits the purpose of your program, you could say set a boolean value in the DLL telling it that you are adding functions to its language and then create a function in your code which would list the functions you are adding (the DLL would call it with the name of the function it wants, if that function is a user-defined one contained within your code, the function would call the corresponding function with the argument passed to it by the DLL, the return the return value of the user-defined function back to the DLL, and if it didn't exist, it would return an error code or NULL or something). I'm starting to see that I'll have to find another way around this to make the function calls go one way only
This link explains how to do it in a basic way.
In a big picture view, when you make a dll, you are making a library which is loaded at runtime. It contains a number of symbols which are exported. These symbols are typically references to methods or functions, plus compiler/linker goo.
When you normally build a static library, there is a minimum of goo and the linker pulls in the code it needs and repackages it for you in your executable.
In a dll, you actually get two end products (three really- just wait): a dll and a stub library. The stub is a static library that looks exactly like your regular static library, except that instead of executing your code, each stub is typically a jump instruction to a common routine. The common routine loads your dll, gets the address of the routine that you want to call, then patches up the original jump instruction to go there so when you call it again, you end up in your dll.
The third end product is usually a header file that tells you all about the data types in your library.
So your steps are: create your headers and code, build a dll, build a stub library from the headers/code/some list of exported functions. End code will link to the stub library which will load up the dll and fix up the jump table.
Compiler/linker goo includes things like making sure the runtime libraries are where they're needed, making sure that static constructors are executed, making sure that static destructors are registered for later execution, etc, etc, etc.
Now as to your main problem: how do I write extensible code in a dll? There are a number of possible ways - a typical way is to define a pure abstract class (aka interface) that defines a behavior and either pass that in to a processing routine or to create a routine for registering interfaces to do work, then the processing routine asks the registrar for an object to handle a piece of work for it.
On the detail of what you plan to solve, perhaps you should look at an extendible parser like lua instead of building your own.
To your more specific focus.
A DLL is (typically?) meant to be complete in and of itself, or explicitly know what other libraries to use to complete itself.
What I mean by that is, you cannot have a method implicitly provided by the calling application to complete the DLLs functionality.
You could however make part of your API the provision of methods from a calling app, thus making the DLL fully contained and the passing of knowledge explicit.
How do I use the classes and functions in the DLL?
Include the headers in your code, when the module (exe or another dll) is linked the dlls are checked for completness.
Can the DLL call functions from the program linking to it?
Yes, but it has to be told about them at run time.
If I make a class { ... } object; in the DLL, then when the DLL is loaded by the program, will object be available to the program?
Yes it will be available, however there are some restrictions you need to be aware about. Such as in the area of memory management it is important to either:
Link all modules sharing memory with the same memory management dll (typically c runtime)
Ensure that the memory is allocated and dealloccated only in the same module.
allocate on the stack
Examples!
Here is a basic idea of passing functions to the dll, however in your case may not be most helpfull as you need to know up front what other functions you want provided.
// parser.h
struct functions {
void *fred (int );
};
parse( string, functions );
// program.cpp
parse( "a = sqrt(); fred(a);", functions );
What you need is a way of registering functions(and their details with the dll.)
The bigger problem here is the details bit. But skipping over that you might do something like wxWidgets does with class registration. When method_fred is contructed by your app it will call the constructor and register with the dll through usage off methodInfo. Parser can lookup methodInfo for methods available.
// parser.h
class method_base { };
class methodInfo {
static void register(factory);
static map<string,factory> m_methods;
}
// program.cpp
class method_fred : public method_base {
static method* factory(string args);
static methodInfo _methoinfo;
}
methodInfo method_fred::_methoinfo("fred",method_fred::factory);
This sounds like a job for data structures.
Create a struct containing your keywords and the function associated with each one.
struct keyword {
const char *keyword;
int (*f)(int arg);
};
struct keyword keywords[max_keywords] = {
"db_connect", &db_connect,
}
Then write a function in your DLL that you pass the address of this array to:
plugin_register(keywords);
Then inside the DLL it can do:
keywords[0].f = &plugin_db_connect;
With this method, the code to handle script keywords remains in the main program while the DLL manipulates the data structures to get its own functions called.
Taking it to C++, make the struct a class instead that contains a std::vector or std::map or whatever of keywords and some functions to manipulate them.
Winrawr, before you go on, read this first:
Any improvements on the GCC/Windows DLLs/C++ STL front?
Basically, you may run into problems when passing STL strings around your DLLs, and you may also have trouble with exceptions flying across DLL boundaries, although it's not something I have experienced (yet).
You could always load the dll at runtime with load library