Iam looking at a piece of code that creates global class variables. The constructors of these classes calls a symbol table singleton and adds the this pointers in it..
In a Keywords.cpp file
class A : class KeyWord
{
A() { add(); }
} A def;
similarly for keywords B,C etc
void KeyWord::add()
{
CSymbolCtrl& c = CSymbolCtrl::GetInstance();
c.addToTable(this);
}
These translation units are compiled to form a library. When i "dumpbin" the library, i see the dynamic initializers for ADef, BDef etc.
No in the exe, when i call the CSymbolCtrl instance, i didnt find the ADef, BDef.. stored in its map. When i set a breakpoint in add(), its not getting hit. Is there a way that the linker is ignoring ADef, BDef because they are not referenced anywhere?
}
From Standard docs 1.9 Program execution,
4) This provision is sometimes called the “as-if” rule, because an implementation is free to disregard any requirement of this International Standard
as long as the result is as if the requirement had been obeyed, as far as can be determined from the observable behavior of the program. For instance,
an actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no side effects affecting the
observable behavior of the program are produced.
So, it might, yes.
The short answer is yes. A pretty common way to force registration is to do something like:
static bool foo = register_type();
It's not too clear from your question, but are you actually
including the compiled object files in your link or not? Just
putting a file in a library doesn't cause it to be included in
the final program. By definition, a file from a library will
only be included in the executable if it resolves an unresolved
external symbol. If you want an object file to be part of the
final executable, and it doesn't contain any globals which would
resolve an undefined external, then you have several choices:
-- Link the object file directly, rather than putting it in
a library. (This is the "standard" or "canonical" way of
doing it.)
-- Use a DLL. Dispite the name, DLL's are not libraries, but
object files, and are linked in an all or nothing way.
-- Create a dummy global symbol, and reference it somewhere.
(This can often be automated, and might be the preferred
solution if you're delivering a library as a third party
supplier.)
Related
I am working on a factory that will have types added to them, however, if the class is not explicitly instiated in the .exe that is exectured (compile-time), then the type is not added to the factory. This is due to the fact that the static call is some how not being made. Does anyone have any suggestions on how to fix this? Below is five very small files that I am putting into a lib, then an .exe will call this lib. If there is any suggestions on how I can get this to work, or maybe a better design pattern, please let me know. Here is basically what I am looking for
1) A factory that can take in types
2) Auto registration to go in the classes .cpp file, any and all registration code should go in the class .cpp (for the example below, RandomClass.cpp) and no other files.
BaseClass.h : http://codepad.org/zGRZvIZf
RandomClass.h : http://codepad.org/rqIZ1atp
RandomClass.cpp : http://codepad.org/WqnQDWQd
TemplateFactory.h : http://codepad.org/94YfusgC
TemplateFactory.cpp : http://codepad.org/Hc2tSfzZ
When you are linking with a static library, you are in fact extracting from it the object files which provide symbols which are currently used but not defined. In the pattern that you are using, there is probably no undefined symbols provided by the object file which contains the static variable which triggers registration.
Solutions:
use explicit registration
have somehow an undefined symbol provided by the compilation unit
use the linker arguments to add your static variables as a undefined symbols
something useful, but this is often not natural
a dummy one, well it is not natural if it is provided by the main program, as a linker argument it main be easier than using the mangled name of the static variable
use a linker argument stating that all the objects of a library have to be included
dynamic libraries are fully imported, thus don't have that problem
As a general rule of thumb, an application do not include static or global variables from a library unless they are implicitly or explicitly used by the application.
There are hundred different ways this can be refactored. One method could be to place the static variable inside function, and make sure the function is called.
To expand on one of #AProgrammer's excellent suggestions, here is a portable way to guarantee the calling program will reference at least one symbol from the library.
In the library code declare a global function that returns an int.
int make_sure_compilation_unit_referenced() { return 0; }
Then in the header for the library declare a static variable that is initialized by calling the global function:
extern int make_sure_compilation_unit_referenced();
static int never_actually_used = make_sure_compilation_unit_referenced();
Every compilation unit that includes the header will have a static variable that needs to be initialized by calling a (useless) function in the library.
This is made a little cleaner if your library has its own namespace encapsulating both of the definitions, then there's less chance of name collisions between the bogus function in your library with other libraries, or of the static variable with other variables in the compilation unit(s) that include the header.
I have some code I want to execute at global scope. So, I can use a global variable in a compilation unit like this:
int execute_global_code();
namespace {
int dummy = execute_global_code();
}
The thing is that if this compilation unit ends up in a static library (or a shared one with -fvisibility=hidden), the linker may decide to eliminate dummy, as it isn't used, and with it my global code execution.
So, I know that I can use concrete solutions based on the specific context: specific compiler (pragma include), compilation unit location (attribute visibility default), surrounding code (say, make an dummy use of dummy in my code).
The question is, is there a standard way to ensure execute_global_code will be executed that can fit in a single macro which will work regardless of the compilation unit placement (executable or lib)? ie: only standard c++ and no user code outside of that macro (like a dummy use of dummy in main())
The issue is that the linker will use all object files for linking a binary given to it directly, but for static libraries it will only pull those object files which define a symbol that is currently undefined.
That means that if all the object files in a static library contain only such self-registering code (or other code that is not referenced from the binary being linked) - nothing from the entire static library shall be used!
This is true for all modern compilers. There is no platform-independent solution.
A non-intrusive to the source code way to circumvent this using CMake can be found here - read more about it here - it will work if no precompiled headers are used. Example usage:
doctest_force_link_static_lib_in_target(exe_name lib_name)
There are some compiler-specific ways to do this as grek40 has already pointed out in a comment.
Background
First of all, I think this question goes beyond the C++ standard. The standard deals with multiple translation units (instantiation units) and thus multiple object modules, but does not seem to acknowledge the possibility of having multiple independently compiled and linked binary modules (i.e., .so files on Linux and .dll files on Windows). After all, the latter more of less enters into the world of application binary interface (ABI) that the standard leaves to implementations to consider at present.
When only a single binary module is involved, the following code snippet illustrates an elegant and portable (standard-compliant) solution to make singletons.
inline T& get() {
static T var{};
return var;
}
There are two things to note about this solution. First, the inline specifier makes the function a candidate to be included in multiple translation units, which is very convenient. Note that, the standard guarantees there is only a single instance of get() and the local static variable var in the final binary module (see here).
The second thing to note is that since C++11, initialization of static local variables is properly synchronized (see the Static local variables section here). So, concurrent invocations of get() is fine.
Now, I try to extend this solution to the case when multiple binary modules are involved. I find the following variant works with VC++ on Windows.
// dllexport is used in building the library module, and
// dllimport is used in using the library in an application module.
// Usually controlled by a macro switch.
__declspec(dllexport/dllimport) inline T& get() {
static T var{};
return var;
}
Note for non-Windows users: __declspec(dllexport) specifies that an entity (i.e., a function, a class or an object) is implemented (defined) in this module and is to be referenced by other modules. __declspec(dllimport), on the other hand, specifies that an entity is not implemented in this module and is to be found in some other module.
Since VC++ supports exporting and importing template instantiations (see here), the above solution can even be templated. For example:
template <typename T> inline
T& get() {
static T var{};
return var;
}
// EXTERN is defined to be empty in building the library module, and
// to `extern` in using the library module in an application module.
// Again, this is usually controlled by a macro switch.
EXTERN template __declspec(dllexport/dllimport) int& get<int>();
As a side note, the inline specifier is not mandatory here. See this S.O. question.
The Question
Since there is no __declspec(dllexport/import) equivalents in GCC and clang, is there a way to make a variant of the above solution that works on these two compilers?
Also, in Boost.Log, I noticed the BOOST_LOG_INLINE_GLOBAL_LOGGER_DEFAULT macro (see the Global logger objects section here). It is claimed to create singletons even if the application consists of multiple modules. If someone knows about the inner workings of this macro, explanations are welcome here.
Finally, if you know about any better solutions for making singletons, feel free to post it as an answer.
Since there is no __declspec(dllexport/import) equivalents in GCC and clang, is there a way to make a variant of the above solution that works on these two compilers?
First, this is not as much a compiler-related question but rather an underlying operating system one. GCC (and supposedly clang) do support __declspec(dllexport/import) on Windows and basically do the same as what MSVC does with the functions and object marked this way. Basically, the marked symbol is placed in a table of exported symbols from the dll (an export table). This table can be used, for instance, when you query for a symbol in a dll in run time (see GetProcAddress).
Along with the dll there comes an associated lib file that contains auxiliary data for linking your application with the dll. When you link your application with the library, the linker uses the lib file to resolve references to the dll symbols and compose the import table in your application binary. When the application starts, the OS (or rather the runtime loader component of the OS) uses the import table to find out what dlls your application depends on and what symbols it imports from those dlls. It then uses export tables in the dlls to resolve addresses of the referenced symbols in the dlls and complete the linking process.
The important side effect of this process is that only imported symbols are dynamically resolved and every symbol you dynamically link to is associated with a particular dll. You can have same-named symbols in multiple dlls and the application itself, and these symbols will refer to distinct entities as long as they are not exported. If they are exported, linking process will fail because of ambiguity. This makes process-wide singletons difficult on Windows. This also breaks some C/C++ language rules, because taking address of an object or function with external linkage (in the language terms) can produce different addresses in different parts of the program. On the other hand, the dlls are more self-contained and depend on the loading context in a lesser degree.
Things are significantly different on Linux and other POSIX-like OSs. When linked, for each shared object (which can be an so library or the application executable) a table of symbols is compiled. It lists both the symbols this shared object implements and the symbols it is missing. Additionally, the linker may embed into the shared object a list of other shared objects (optionally, with search paths) that could be used to resolve the missing symbols. The runtime loader includes a linker which loads the shared objects sequentially and constructs a global table of symbols comprising symbols from all shared objects. As that table is constructed, the duplicate symbols from multiple shared objects are resolved to a single implementation (as all implementations are considered equivalent, the first shared object in the load list that implements the symbol is used). Any missing symbols are also resolved as the subsequent shared objects in the link order are loaded.
The effect of this process is that each symbol with external linkage resolve to a single implementation in one of the shared objects, even if multiple shared objects implement it. This is more in line with the C/C++ language rules and makes it simpler to implement process-wide singletons. A simple function-local static variable, not marked in any special way, is enough.
Now, there are ways to influence the linking process, and in particular there are ways to limit the symbols that are exported from a shared object. The most common ways to do that are using symbol visibility and linker scripts. With these tools it is possible to achieve linking behavior very close to Windows, with all its pros and cons. Note that when you limit symbol visibility you do have to mark the symbols you intend to export from the shared object with the visibility attribute or pragma. There's no need to mark symbols for import though.
Also, in Boost.Log, I noticed the BOOST_LOG_INLINE_GLOBAL_LOGGER_DEFAULT macro (see the Global logger objects section here). It is claimed to create singletons even if the application consists of multiple modules. If someone knows about the inner workings of this macro, explanations are welcome here.
Boost.Log requires to be built as a shared library when it is used from a multi-module application. This makes it possible for it to have a process-wide storage of references to global loggers declared throughout the application (the storage is implemented within the Boost.Log dll/so). When you obtain a logger declared with the BOOST_LOG_INLINE_GLOBAL_LOGGER_DEFAULT or similar macro, the storage is first looked up for the reference to the logger. If it is not found, the logger is created and a reference to it is stored back to the internal storage. Otherwise the existing reference is used. Along with reference caching, this provides performance very close to a function-local static variable.
Finally, if you know about any better solutions for making singletons, feel free to post it as an answer.
Although this is not really an answer, you should generally avoid singletons. They are difficult to implement correctly and in a way that does not hamper performance. If you really do have to implement one then the solution similar to Boost.Log looks generic enough. Note however that with this solution it is generally not known which module created (and as such, 'owns') the singleton, so you can't unload any modules dynamically. There may be simpler ways that are case-specific, like exporting a function returning a reference to the local static object. If you want portability and support non-default symbol visibility by default, always explicitly export your symbols.
GCC has the ability to make a symbol link weakly via __attribute__((weak)). I want to use the a weak symbol in a static library that users can override in their application. A GCC style weak symbol would let me do that, but I don't know if it can be done with visual studio.
Does Visual Studio offer a similar feature?
You can do it, here is an example in C:
/*
* pWeakValue MUST be an extern const variable, which will be aliased to
* pDefaultWeakValue if no real user definition is present, thanks to the
* alternatename directive.
*/
extern const char * pWeakValue;
extern const char * pDefaultWeakValue = NULL;
#pragma comment(linker, "/alternatename:_pWeakValue=_pDefaultWeakValue")
MSVC++ has __declspec(selectany) which covers part of the functionality of weak symbols: it allows you to define multiple identical symbols with external linkage, directing the compiler to choose any one of several available. However, I don't think MSVC++ has anything that would cover the other part of weak symbol functionality: the possibility to provide "replaceable" definitions in a library.
This, BTW, makes one wonder how the support for standard replaceable ::operator new and ::operator delete functions works in MSVC++.
MSVC used to behave such that if a symbol is defined in a .obj file and a .lib it would use the one on the .obj file without warning. I recall that it would also handle the situation where the symbol is defined in multiple libs it would use the one in the library named first in the list.
I can't say I've tried this in a while, but I'd be surprised if they changed this behavior (especially that .obj defined symbols override symbols in .lib files).
The only way i know. Place each symbol in a separate library. User objects with overrides also must be combined to library. Then link all together to an application. User library must be specified as input file, your lib's must be transfered to linker using /DEFAULTLIB: option.
From here:
... if a needed symbol can be satisfied without consulting a library,
then the OBJ in the library will not be used. This lets you override a
symbol in a library by explicitly placing it an OBJ. You can also
override a symbol in a library to putting it in another library that
gets searched ahead of the one you want to override. But you can’t
override a symbol in an explicit OBJ, because those are part of the
initial conditions.
This behavior results from the algorithm adopted by the linker.
So for short, to override a function,
With GCC, you must use the __attribute__((weak)). You cannot rely on the input order of object files into the linker to decide which function implementation is used.
With VS toolchain, you can rely on the order of the object/lib files and you must place your implementation of the function before the LIB that implements the original function. And you can put your implementation as an OBJ or LIB.
There isn't an MS-VC equivalent to this attribute. See http://connect.microsoft.com/VisualStudio/feedback/details/505028/add-weak-function-references-for-visual-c-c. I'm going to suggest something horrible: reading the purpose of it here: http://www.kolpackov.net/pipermail/notes/2004-March/000006.html it is essentially to define functions that, if their symbols exist, are used, otherwise, are not, so...
Why not use pre-processor for this purpose, with the huge caveat of "if you need to do this at all"? (I'm not a fan of recommending pre-processor).
Example:
#ifdef USE_MY_FUNCTION
extern void function();
#endif
then call appropriately in the application logic, surrounded by #ifdef statements. If your static library is linked in, as part of the linking in process, tweak the defines to define USE_MY_FUNCTION.
Not quite a direct equivalent and very ugly but it's the best I can think of.
I am working on a factory that will have types added to them, however, if the class is not explicitly instiated in the .exe that is exectured (compile-time), then the type is not added to the factory. This is due to the fact that the static call is some how not being made. Does anyone have any suggestions on how to fix this? Below is five very small files that I am putting into a lib, then an .exe will call this lib. If there is any suggestions on how I can get this to work, or maybe a better design pattern, please let me know. Here is basically what I am looking for
1) A factory that can take in types
2) Auto registration to go in the classes .cpp file, any and all registration code should go in the class .cpp (for the example below, RandomClass.cpp) and no other files.
BaseClass.h : http://codepad.org/zGRZvIZf
RandomClass.h : http://codepad.org/rqIZ1atp
RandomClass.cpp : http://codepad.org/WqnQDWQd
TemplateFactory.h : http://codepad.org/94YfusgC
TemplateFactory.cpp : http://codepad.org/Hc2tSfzZ
When you are linking with a static library, you are in fact extracting from it the object files which provide symbols which are currently used but not defined. In the pattern that you are using, there is probably no undefined symbols provided by the object file which contains the static variable which triggers registration.
Solutions:
use explicit registration
have somehow an undefined symbol provided by the compilation unit
use the linker arguments to add your static variables as a undefined symbols
something useful, but this is often not natural
a dummy one, well it is not natural if it is provided by the main program, as a linker argument it main be easier than using the mangled name of the static variable
use a linker argument stating that all the objects of a library have to be included
dynamic libraries are fully imported, thus don't have that problem
As a general rule of thumb, an application do not include static or global variables from a library unless they are implicitly or explicitly used by the application.
There are hundred different ways this can be refactored. One method could be to place the static variable inside function, and make sure the function is called.
To expand on one of #AProgrammer's excellent suggestions, here is a portable way to guarantee the calling program will reference at least one symbol from the library.
In the library code declare a global function that returns an int.
int make_sure_compilation_unit_referenced() { return 0; }
Then in the header for the library declare a static variable that is initialized by calling the global function:
extern int make_sure_compilation_unit_referenced();
static int never_actually_used = make_sure_compilation_unit_referenced();
Every compilation unit that includes the header will have a static variable that needs to be initialized by calling a (useless) function in the library.
This is made a little cleaner if your library has its own namespace encapsulating both of the definitions, then there's less chance of name collisions between the bogus function in your library with other libraries, or of the static variable with other variables in the compilation unit(s) that include the header.