Template class specialized inside and outside lib - c++

Consider this synthetic example. I have two native C++ projects in my Visual Studio 2010 solution. One is console exe and another is lib.
There are two files in lib:
// TImage.h
template<class T> class TImage
{
public:
TImage()
{
#ifndef _LIB
std::cout << "Created (main), ";
#else
std::cout << "Created (lib), ";
#endif
std::cout << sizeof(TImage<T>) << std::endl;
}
#ifdef _LIB
T c[10];
#endif
};
void CreateImageChar();
void CreateImageInt();
and
// TImage.cpp
void CreateImageChar()
{
TImage<char> image;
std::cout << sizeof(TImage<char>) << std::endl;
}
void CreateImageInt()
{
TImage<int> image;
std::cout << sizeof(TImage<int>) << std::endl;
}
And one file in exe:
// main.cpp
int _tmain(int argc, _TCHAR* argv[])
{
TImage<char> image;
std::cout << sizeof(TImage<char>) << std::endl;
CreateImageChar();
CreateImageInt();
return 0;
}
I know, I shouldn't actually do like this, but this is just for understanding what is happening. And that's, what happens:
// cout:
Created (main), 1
1
Created (main), 1
10
Created (lib), 40
40
So how exactly this happened, that linker overrides lib's version of TImage<char> with exe's version of TImage<char>? But since there is no exe's version of TImage<int>, it preserves lib's version of TImage<int>?.. Is this behavior standardized, and if so, where can I found the description?
Update: Explanations of the effect given below are correct, thanks. But the question was "how exactly this happened"?.. I expected to get some linker error like "multiply defined symbols". So the most suitable answer is from Antonio Pérez's reply.

Template code creates duplicated object code.
The compiler copies the template code for the type you provide when you instance the template. So when TImage.cpp is compiled you get object code for two versions of your template, one for char and one for int in TImage.o. Then main.cpp is compiled and you get a new version of your template for char in main.o. Then the linker happens to use the one in main.o always.
This explains why your output yields the 'Created' lines. But it was a little bit puzzling to see the mismatch in lines 3, 4 regarding object's size:
Created (main), 1
10
This is due to the compiler resolving the sizeof operator during compile-time.

I am assuming here that you are building a static library, because you do not have any __decelspec(dllexport) or extern "C" in the code. What happens here is the following. The compiler create an instance of TImage<char> and TImage<int> for your lib. It also creates an instance for the your executable. When the linker joins the static library and the objects of the executable together duplicate code gets removed. Note here that static libraries are linked in like object code, so it does not make any difference if you create one big executable or multiple static libraries and an executable. If you would build one executable the result would be dependent on the order the objects are linked in; aka "not defined".
If you change the library to a DLL the behavior changes. Since you are calling over the boundary of a DLL, each needs their copy of TImage<char>. In most cases DLLs behave more as you would expect a library to work. Static libraries are normally just a convenience, so you need not put the code into your project.
Note: This only applies on Windows. On POSIX systems *.a files behave like *.so file, which creates quite some head aches for compiler developers.
Edit: Just never ever pass the TImage class over a DLL boundary. That will ensure a crash. That is the same reason why Microsoft's std::string implementation crashes when mixing debug and release builds. They do exactly what you did only with the NDEBUG macro.

The compiler will always instantiate your template when you use it - if the definition is available.
This means, it generates the required functions, methods etc. for the desired specialization and places them in the object file. This is the reason why you either need to have the definition available (usually in a header file) or an existing instantiation (e.g. in another object file or library) for the specific specialization that you are using.
Now, when linking, a situation might occur which is usually not allowed: more than one definition per class/function/method. For templates, this is specifically allowed and the compiler will choose one definition for you. That is what is happening in your case and what you call "overriding".

Memory layout is a compile-time concept; it has nothing to do with the linker. The main function thinks TImage is smaller than the CreateImage... functions do, because it was compiled with a different version of TImage.
If the CreateImage... function were defined in the header as inline functions, they would become part of main.cpp's compilation unit, and would thus report the same size characteristics as main reports.
This also has nothing to do with templates and when they get instantiated. You'd observe the same behaviour if TImage was an ordinary class.
EDIT: I just noticed that third line of cout doesn't contain "Created (lib), 10". Assuming it's not a typo, I suspect what's happening is that CreateImageChar is not inlining the call to the constructor, and is thus using main.cpp's version.

Template creates duplicates of classes (ie space) during the compilation itself. So when you are using too much of templates, the smart compilers try to optimize them based on parametrization of templates.

Within your library, you have a TImage that prints "lib" on construction, and contains an array of T. There are two of these, one for int and one for char.
In main, you have a TImage that prints "main" on construction, and does not contain an array of T. There is only the char version in this case; because you never ask for the int version to be created.
When you go to link, the linker choses one of the two TImage<char> constructors as the official one; it happens to chose main's version. This is why your third line prints "main" instead of "lib"; because you are calling that version of the constructor. Generally, you don't care which version of the constructor gets called... they are required to all be the same, but you have violated that requirement.
That's the important part: your code is now broken. Within your library functions, you expect to see that array of char within TImage<char> but the constructor never creates it. Further, imagine if you say new TImage<char> within main, and pass the pointer to a function within your library, where it gets deleted. main allocates one byte of space, the library function tries to release ten. Or if your CreateImageChar method returned the TImage<char> it creates... main will allocate one byte on the stack for the return value, and CreateImageChar will fill it with ten bytes of data. And so on.

Related

Static linking of C++ code

I have the following problem: I use some classes like the following to initialize C libraries:
class Hello
{
public:
Hello()
{
cout << "Hello world" << endl;
}
~Hello()
{
cout << "Goodbye cruel world" << endl;
}
} hello_inst;
If I include this code in a hello.cc file and compile it together with another file containing my main(), then the hello_inst is created before and destroyed after the call
to main(). In this case it just prints some lines, in my project I initialize libxml via
LIBXML_TEST_VERSION.
I am creating multiple executables which share a lot of the same code in a cmake project.
According to this thread: Adding multiple executables in CMake I created a static library containing the code shown above and then linked the executables against that library. Unfortunately in that case the hello_inst is never created (and libxml2 is never initialized). How can I fix this problem?
I had a similar problem and solved it by defining my libraries as static. Therefore I used the following code:
add_library( MyLib SHARED ${LBMLIB_SRCS} ${LBMLIB_HEADER})
Maybe this fixes your problem
There is no official way of forcing shared libraries global variables to be initialised by the standard and is compiler dependent.
Usually this is done either the first time something in that library is actually used (A class, function, or variable) or when the variable itself is actually used.
If you want to force hello_inst to be used, call a function on it, then see if and when the constructor and destructors are called.
Read this thread for more information:
http://www.gamedev.net/topic/622861-how-to-force-global-variable-which-define-in-a-static-library-to-initialize/
As far as I'm aware, statics defined in library should be constructed before main is called and destroyed after main, in the manner you describe. Indeed I have used shared libraries in many projects, and have never encountered the problems you describe.
I understand a library file, to be little more than a container of object files.
However, that said.....
If your code does nothing with the object that is created, the linker is free to remove it (dead code removal). I would suggest making sure the static object is referenced. Call a member function, perhaps?

C++: Virtual functions in dynamic shared library produce segfault

in an application I am writing, I am dynamically loading objects from a shared library I wrote.
This fine up to a point where virtual functions come into play. This means that I can easily call getters and setters, but I immediately get a segmentation fault when trying to call a function that overwrites a virtual base class functions. I am running out of ideas currently, as this happens to every class (hierarchy) in my project.
The functions can be successfully called when creating the objects inside the application, not using dynamic loading or shared libraries at all.
I suspect either a conceptual mistake or some kind of compilation/linking error on my side.
The class hierarchy looks something like this:
BaseClass.h:
class BaseClass : public EvenMoreBaseClass {
public:
virtual bool enroll(const shared_ptr<string> filepath) = 0;
}
Derived.h:
class Derived : public BaseClass {
bool enroll(const shared_ptr<string> filepath);
}
Derived.cpp:
bool Derived::enroll(const shared_ptr<string> filepath) {
cout << "enroll" << endl;
}
(includes and namespace left out here)
The application loads the library and gets a (shared) pointer to an object of BaseClass (the application includes BaseClass.h). All functions can be executed, except the virtual ones.
Development is done in Eclipse CDT. Everything is currently in one project with different build configurations (the application's .cpp is disabled in the shared library's configuration and vice versa). The compiler is g++ 4.4. All .o files of the hierarchy are linked with the library, -shared is set.
I would really appreciate any help.
Maybe this it a very late answer, but I found same behaviour, it happens if you use dlclose(handle) before accessing a virtual method (though static methods worked correctly). So you shouldn't close library handle until you are done with objects from it.
Make sure that the dll and the executable are both being compiled with the exact same compiler settings. If they are different the compiler might be generating code in the executable that doesn't match what the dll expects and vice versa which is a ODR violation.
Note that standard library types such as std::string and std::shared_ptr might have different layouts in debug configurations for example and can also change between compiler versions (even from the same vendor).
In general you need to be careful using classes across dll boundaries. It is usually recommended to deal with only built-in types and stateless interfaces.
Could there be any difference in sizes of the types between the projects set in the build configurations?
[update]
I have encountered similar problems when one project has types such as short and long defined to be different sizes to the other project, and classes containing these types are referenced through the same header file. One project puts the second short at byte 2, the other project thinks it is at byte 4.
I have also had problems with different padding values. One project pads odd characters to 4 byte boundaries, the other thinks it is padded to 8 byte boundaries.
The same might hold true for virtual function pointers, maybe one project fits them in 32 bits, the other expects them in 64 bits.

in c++ main function is the entry point to program how i can change it to an other function?

I was asked an interview question to change the entry point of a C or C++ program from main() to any other function. How is it possible?
In standard C (and, I believe, C++ as well), you can't, at least not for a hosted environment (but see below). The standard specifies that the starting point for the C code is main. The standard (c99) doesn't leave much scope for argument:
5.1.2.2.1 Program startup: (1) The function called at program startup is named main.
That's it. It then waffles on a bit about parameters and return values but there's really no leeway there for changing the name.
That's for a hosted environment. The standard also allows for a freestanding environment (i.e., no OS, for things like embedded systems). For a freestanding environment:
In a freestanding environment (in which C program execution may take place without any benefit of an operating system), the name and type of the function called at program startup are implementation-defined. Any library facilities available to a freestanding program, other than the minimal set required by clause 4, are implementation-defined.
You can use "trickery" in C implementations so that you can make it look like main isn't the entry point. This is in fact what early Windows compliers did to mark WinMain as the start point.
First way: a linker may include some pre-main startup code in a file like start.o and it is this piece of code which runs to set up the C environment then call main. There's nothing to stop you replacing that with something that calls bob instead.
Second way: some linkers provide that very option with a command-line switch so that you can change it without recompiling the startup code.
Third way: you can link with this piece of code:
int main (int c, char *v[]) { return bob (c, v); }
and then your entry point for your code is seemingly bob rather than main.
However, all this, while of possibly academic interest, doesn't change the fact that I can't think of one single solitary situation in my many decades of cutting code, where this would be either necessary or desirable.
I would be asking the interviewer: why would you want to do this?
The entry point is actually the _start function (implemented in crt1.o) .
The _start function prepares the command line arguments and then calls main(int argc,char* argv[], char* env[]),
you can change the entry point from _start to mystart by setting a linker parameter:
g++ file.o -Wl,-emystart -o runme
Of course, this is a replacement for the entry point _start so you won't get the command line arguments:
void mystart(){
}
Note that global/static variables that have constructors or destructors must be initialized at the beginning of the application and destroyed at the end. Keep that in mind if you are planning on bypassing the default entry point which does it automatically.
From C++ standard docs 3.6.1 Main Function,
A program shall contain a global function called main, which is the designated start of the program. It is implementation-defined
whether a program in a freestanding environment is required to define a main function.
So, it does depend on your compiler/linker...
If you are on VS2010, this could give you some idea
As it is easy to understand, this is not mandated by the C++ standard and falls in the domain of 'implemenation specific behavior'.
This is highly speculative, but you might have a static initializer instead of main:
#include <iostream>
int mymain()
{
std::cout << "mymain";
exit(0);
}
static int sRetVal = mymain();
int main()
{
std::cout << "never get here";
}
You might even make it 'Java-like', by putting the stuff in a constructor:
#include <iostream>
class MyApplication
{
public:
MyApplication()
{
std::cout << "mymain";
exit(0);
}
};
static MyApplication sMyApplication;
int main()
{
std::cout << "never get here";
}
Now. The interviewer might have thought about these, but I'd personally never use them. The reasons are:
It's non-conventional. People won't understand it, it's nontrivial to find the entry point.
Static initialization order is nondeterministic. Put in another static variable and you'll never now if it gets initialized.
That said, I've seen it being used in production instead of init() for library initializers. The caveat is, on windows, (from experience) your statics in a DLL might or might not get initialized based on usage.
Modify the crt object that actually calls the main() function, or provide your own (don't forget to disable linking of the normal one).
With gcc, declare the function with attribute((constructor)) and gcc will execute this function before any other code including main.
For Solaris Based Systems I have found this. You can use the .init section for every platforms I guess:
pragma init (function [, function]...)
Source:
This pragma causes each listed function to be called during initialization (before main) or during shared module loading, by adding a call to the .init section.
It's very simple:
As you should know when you use constants in c, the compiler execute a kind of 'macro' changing the name of the constant for the respective value.
just include a #define argument in the beginning of your code with the name of start-up function followed by the name main:
Example:
#define my_start-up_function (main)
I think it is easy to remove the undesired main() symbol from the object before linking.
Unfortunately the entry point option for g++ is not working for me(the binary crashes before entering the entry point). So I strip undesired entry-point from object file.
Suppose we have two sources that contain entry point function.
target.c contains the main() we do not want.
our_code.c contains the testmain() we want to be the entry point.
After compiling(g++ -c option) we can get the following object files.
target.o, that contains the main() we do not want.
our_code.o that contains the testmain() we want to be the entry point.
So we can use the objcopy to strip undesired main() function.
objcopy --strip-symbol=main target.o
We can redefine testmain() to main() using objcopy too.
objcopy --redefine-sym testmain=main our_code.o
And then we can link both of them into binary.
g++ target.o our_code.o -o our_binary.bin
This works for me. Now when we run our_binary.bin the entry point is our_code.o:main() symbol which refers to our_code.c::testmain() function.
On windows there is another (rather unorthodox) way to change the entry point of a program: TLS. See this for more explanations: http://isc.sans.edu/diary.html?storyid=6655
Yes,
We can change the main function name to any other name for eg. Start, bob, rem etc.
How does the compiler knows that it has to search for the main() in the entire code ?
Nothing is automatic in programming.
somebody has done some work to make it looks automatic for us.
so it has been defined in the start up file that the compiler should search for main().
we can change the name main to anything else eg. Bob and then the compiler will be searching for Bob() only.
Changing a value in Linker Settings will override the entry point. i.e., MFC applications use a value of 'Windows (/SUBSYSTEM:WINDOWS)' to change entry point from main() to CWinApp::WinMain().
Right clicking on solution > Properties > Linker > System > Subsystem > Windows (/SUBSYSTEM:WINDOWS)
...
Very practical benefit to modifying entry point:
MFC is a framework we take advantage of to write Windows applications in C++. I know it's ancient, but my company maintains one for legacy reasons! You will not find a main() in MFC code. MSDN says the entry point is WinMain(), instead. Thus, you can override the WinMain() of your base CWinApp object. Or, most people override CWinApp::InitInstance() because the base WinMain() will call it.
Disclaimer: I use empty parentheses to denote a method, without caring how many arguments.

Explicit Loading of DLL

I'm trying to explicitly link with a DLL. No other resources is available except the DLL file itself and some documentation about the classes and its member functions.
From the documentation, each class comes with its own
member typedef
example: typedef std::map<std::string,std::string> Server::KeyValueMap, typedef std::vector<std::string> Server::String Array
member enumeration
example: enum Server::Role {NONE,HIGH,LOW}
member function
example: void Server::connect(const StringArray,const KeyValueMap), void Server::disconnect()
Implementing the codes from google search, i manage to load the dll can call the disconnect function..
dir.h
LPCSTR disconnect = "_Java_mas_com_oa_rollings_as_apiJNI_Server_1disconnect#20";
LPCSTR connect =
"_Java_mas_com_oa_rollings_as_apiJNI_Server_1connect#20";
I got the function name above from depends.exe. Is this what is called decorated/mangled function names in C++?
main.cpp
#include <iostream>
#include <windows.h>
#include <tchar.h>
#include "dir.h"
typedef void (*pdisconnect)();
int main()
{
HMODULE DLL = LoadLibrary(_T("server.dll"));
pdisconnect _pdisconnect;`
if(DLL)
{
std::cout<< "DLL loaded!" << std::endl;
_disconnect = (pdisconnect)GetProcAddress(DLL,disconnect);
if(_disconnect)
{
std::cout << "Successful link to function in DLL!" << std::endl;
}
else
{
std::cout<< "Unable to link to function in DLL!" << std::endl;
}
}
else
{
std::cout<< "DLL failed to load!" << std::endl;
}
FreeLibrary (DLL);
return 0;}
How do i call (for example) the connect member function which has the parameter datatype declared in the dll itself?
Edit
more info:
The DLL comes with an example implementation using Java. The Java example contains a Java wrapper generated using SWIG and a source code.
The documentation lists all the class, their member functions and also their datatypes. According to the doc, the list was generated from the C++ source codes.(??)
No other info was given (no info on what compiler was used to generate the DLL)
My colleague is implementing the interface using Java based on the Java example given, while I was asked to implement using C++. The DLL is from a third party company.
I'll ask them about the compiler. Any other info that i should get from them?
I had a quick read through about JNI but i dont understand how it's implemented in this case.
Update
i'm a little confused... (ok, ok... very confused)
Do i call(GetProcAddress) each public member function separately only when i want to use them?
Do i create a dummy class that imitates the class in the dll. Then inside the class definition, i call the equivalent function from the DLL? (Am i making sense here?) fnieto, is this what you're showing me at the end of your post?
Is it possible to instantiate the whole class from the DLL?
I was trying to use the connect function described in my first post. From the Depends.exe DLL output,
std::map // KeyValueMap has the following member functions: del, empty, get, has_1key,set
std::vector // StringArray has the following member functions: add, capacity, clear, get, isEMPTY, reserve, set, size
which is different from the member functions of map and vector in my compiler (VS 2005)...
Any idea? or am i getting the wrong picture here...
Unless you use a disassembler and try to figure out the paramater types from assemly code, you can't. These kind of information is not stored in the DLL but in a header file coming with the DLL. If you don't have it, the DLL is propably not meant to be used by you.
I would be very careful if I were you: the STL library was not designed to be used across compilation boundaries like that.
Not that it cannot be done, but you need to know what you are getting into.
This means that using STL classes across DLL boundaries can safely work only if you compile your EXE with the same exact compiler and version, and the same settings (especially DEBUG vs. RELEASE) as the original DLL. And I do mean "exact" match.
The C++ standard STL library is a specification of behavior, not implementation. Different compilers and even different revisions of the same compiler can, and will, differ on the code and data implementations. When your library returns you an std::map, it's giving you back the bits that work with the DLL's version of the STL, not necessarily the STL code compiled in your EXE.
(and I'm not even touching on the fact that name mangling can also differ from compiler to compiler)
Without more details on your circumstances, I can't be sure; but this can be a can of worms.
In order to link with a DLL, you need:
an import library (.LIB file), this describes the relation between C/C++ names and DLL exports.
the C/C++ signatures of the exported items (usually functions), describing the calling convention, arguments and return value. This usually comes in a header file (.H).
From your question it looks like you can guess the signatures (#2), but you really need the LIB file (#1).
The linker can help you generate a LIB from a DLL using an intermediate DEF.
Refer to this question for more details: How to generate an import library from a DLL?
Then you need to pass the .lib as an "additional library" to the linker. The DLL must be available on the PATH or in the target folder.

Finding C++ static initialization order problems

We've run into some problems with the static initialization order fiasco, and I'm looking for ways to comb through a whole lot of code to find possible occurrences. Any suggestions on how to do this efficiently?
Edit: I'm getting some good answers on how to SOLVE the static initialization order problem, but that's not really my question. I'd like to know how to FIND objects that are subject to this problem. Evan's answer seems to be the best so far in this regard; I don't think we can use valgrind, but we may have memory analysis tools that could perform a similar function. That would catch problems only where the initialization order is wrong for a given build, and the order can change with each build. Perhaps there's a static analysis tool that would catch this. Our platform is IBM XLC/C++ compiler running on AIX.
Solving order of initialization:
First off, this is just a temporary work-around because you have global variables that you are trying to get rid of but just have not had time yet (you are going to get rid of them eventually aren't you? :-)
class A
{
public:
// Get the global instance abc
static A& getInstance_abc() // return a reference
{
static A instance_abc;
return instance_abc;
}
};
This will guarantee that it is initialised on first use and destroyed when the application terminates.
Multi-Threaded Problem:
C++11 does guarantee that this is thread-safe:
§6.7 [stmt.dcl] p4
If control enters the declaration concurrently while the variable is being initialized, the concurrent execution shall wait for completion of the initialization.
However, C++03 does not officially guarantee that the construction of static function objects is thread safe. So technically the getInstance_XXX() method must be guarded with a critical section. On the bright side, gcc has an explicit patch as part of the compiler that guarantees that each static function object will only be initialized once even in the presence of threads.
Please note: Do not use the double checked locking pattern to try and avoid the cost of the locking. This will not work in C++03.
Creation Problems:
On creation, there are no problems because we guarantee that it is created before it can be used.
Destruction Problems:
There is a potential problem of accessing the object after it has been destroyed. This only happens if you access the object from the destructor of another global variable (by global, I am referring to any non-local static variable).
The solution is to make sure that you force the order of destruction.
Remember the order of destruction is the exact inverse of the order of construction. So if you access the object in your destructor, you must guarantee that the object has not been destroyed. To do this, you must just guarantee that the object is fully constructed before the calling object is constructed.
class B
{
public:
static B& getInstance_Bglob;
{
static B instance_Bglob;
return instance_Bglob;;
}
~B()
{
A::getInstance_abc().doSomthing();
// The object abc is accessed from the destructor.
// Potential problem.
// You must guarantee that abc is destroyed after this object.
// To guarantee this you must make sure it is constructed first.
// To do this just access the object from the constructor.
}
B()
{
A::getInstance_abc();
// abc is now fully constructed.
// This means it was constructed before this object.
// This means it will be destroyed after this object.
// This means it is safe to use from the destructor.
}
};
I just wrote a bit of code to track down this problem. We have a good size code base (1000+ files) that was working fine on Windows/VC++ 2005, but crashing on startup on Solaris/gcc.
I wrote the following .h file:
#ifndef FIASCO_H
#define FIASCO_H
/////////////////////////////////////////////////////////////////////////////////////////////////////
// [WS 2010-07-30] Detect the infamous "Static initialization order fiasco"
// email warrenstevens --> [initials]#[firstnamelastname].com
// read --> http://www.parashift.com/c++-faq-lite/ctors.html#faq-10.12 if you haven't suffered
// To enable this feature --> define E-N-A-B-L-E-_-F-I-A-S-C-O-_-F-I-N-D-E-R, rebuild, and run
#define ENABLE_FIASCO_FINDER
/////////////////////////////////////////////////////////////////////////////////////////////////////
#ifdef ENABLE_FIASCO_FINDER
#include <iostream>
#include <fstream>
inline bool WriteFiasco(const std::string& fileName)
{
static int counter = 0;
++counter;
std::ofstream file;
file.open("FiascoFinder.txt", std::ios::out | std::ios::app);
file << "Starting to initialize file - number: [" << counter << "] filename: [" << fileName.c_str() << "]" << std::endl;
file.flush();
file.close();
return true;
}
// [WS 2010-07-30] If you get a name collision on the following line, your usage is likely incorrect
#define FIASCO_FINDER static const bool g_psuedoUniqueName = WriteFiasco(__FILE__);
#else // ENABLE_FIASCO_FINDER
// do nothing
#define FIASCO_FINDER
#endif // ENABLE_FIASCO_FINDER
#endif //FIASCO_H
and within every .cpp file in the solution, I added this:
#include "PreCompiledHeader.h" // (which #include's the above file)
FIASCO_FINDER
#include "RegularIncludeOne.h"
#include "RegularIncludeTwo.h"
When you run your application, you will get an output file like so:
Starting to initialize file - number: [1] filename: [p:\\OneFile.cpp]
Starting to initialize file - number: [2] filename: [p:\\SecondFile.cpp]
Starting to initialize file - number: [3] filename: [p:\\ThirdFile.cpp]
If you experience a crash, the culprit should be in the last .cpp file listed. And at the very least, this will give you a good place to set breakpoints, as this code should be the absolute first of your code to execute (after which you can step through your code and see all of the globals that are being initialized).
Notes:
It's important that you put the "FIASCO_FINDER" macro as close to the top of your file as possible. If you put it below some other #includes you run the risk of it crashing before identifying the file that you're in.
If you're using Visual Studio, and pre-compiled headers, adding this extra macro line to all of your .cpp files can be done quickly using the Find-and-replace dialog to replace your existing #include "precompiledheader.h" with the same text plus the FIASCO_FINDER line (if you check off "regular expressions, you can use "\n" to insert multi-line replacement text)
Depending on your compiler, you can place a breakpoint at the constructor initialization code. In Visual C++, this is the _initterm function, which is given a start and end pointer of a list of the functions to call.
Then step into each function to get the file and function name (assuming you've compiled with debugging info on). Once you have the names, step out of the function (back up to _initterm) and continue until _initterm exits.
That gives you all the static initializers, not just the ones in your code - it's the easiest way to get an exhaustive list. You can filter out the ones you have no control over (such as those in third-party libraries).
The theory holds for other compilers but the name of the function and the capability of the debugger may change.
perhaps use valgrind to find usage of uninitialized memory. The nicest solution to the "static initialization order fiasco" is to use a static function which returns an instance of the object like this:
class A {
public:
static X &getStatic() { static X my_static; return my_static; }
};
This way you access your static object is by calling getStatic, this will guarantee it is initialized on first use.
If you need to worry about order of de-initialization, return a new'd object instead of a statically allocated object.
EDIT: removed the redundant static object, i dunno why but i mixed and matched two methods of having a static together in my original example.
There is code that essentially "initializes" C++ that is generated by the compiler. An easy way to find this code / the call stack at the time is to create a static object with something that dereferences NULL in the constructor - break in the debugger and explore a bit. The MSVC compiler sets up a table of function pointers that is iterated over for static initialization. You should be able to access this table and determine all static initialization taking place in your program.
We've run into some problems with the
static initialization order fiasco,
and I'm looking for ways to comb
through a whole lot of code to find
possible occurrences. Any suggestions
on how to do this efficiently?
It's not a trivial problem but at least it can done following fairly simple steps if you have an easy-to-parse intermediate-format representation of your code.
1) Find all the globals that have non-trivial constructors and put them in a list.
2) For each of these non-trivially-constructed objects, generate the entire potential-function-tree called by their constructors.
3) Walk through the non-trivially-constructor function tree and if the code references any other non-trivially constructed globals (which are quite handily in the list you generated in step one), you have a potential early-static-initialization-order issue.
4) Repeat steps 2 & 3 until you have exhausted the list generated in step 1.
Note: you may be able to optimize this by only visiting the potential-function-tree once per object class rather than once per global instance if you have multiple globals of a single class.
Replace all the global objects with global functions that return a reference to an object declared static in the function. This isn't thread-safe, so if your app is multi-threaded you might need some tricks like pthread_once or a global lock. This will ensure that everything is initialized before it is used.
Now, either your program works (hurrah!) or else it sits in an infinite loop because you have a circular dependency (redesign needed), or else you move on to the next bug.
The first thing you need to do is make a list of all static objects that have non-trivial constructors.
Given that, you either need to plug through them one at a time, or simply replace them all with singleton-pattern objects.
The singleton pattern comes in for a lot of criticism, but the lazy "as-required" construction is a fairly easy way to fix the majority of the problems now and in the future.
old...
MyObject myObject
new...
MyObject &myObject()
{
static MyObject myActualObject;
return myActualObject;
}
Of course, if your application is multi-threaded, this can cause you more problems than you had in the first place...
Gimpel Software (www.gimpel.com) claims that their PC-Lint/FlexeLint static analysis tools will detect such problems.
I have had good experience with their tools, but not with this specific issue so I can't vouch for how much they would help.
Some of these answers are now out of date. For the sake of people coming from search engines, like myself:
On Linux and elsewhere, finding instances of this problem is possible through Google's AddressSanitizer.
AddressSanitizer is a part of LLVM starting with version 3.1 and a
part of GCC starting with version 4.8
You would then do something like the following:
$ g++ -fsanitize=address -g staticA.C staticB.C staticC.C -o static
$ ASAN_OPTIONS=check_initialization_order=true:strict_init_order=true ./static
=================================================================
==32208==ERROR: AddressSanitizer: initialization-order-fiasco on address ... at ...
#0 0x400f96 in firstClass::getValue() staticC.C:13
#1 0x400de1 in secondClass::secondClass() staticB.C:7
...
See here for more details:
https://github.com/google/sanitizers/wiki/AddressSanitizerInitializationOrderFiasco
Other answers are correct, I just wanted to add that the object's getter should be implemented in a .cpp file and it should not be static. If you implement it in a header file, the object will be created in each library / framework you call it from....
If your project is in Visual Studio (I've tried this with VC++ Express 2005, and with Visual Studio 2008 Pro):
Open Class View (Main menu->View->Class View)
Expand each project in your solution and Click on "Global Functions and Variables"
This should give you a decent list of all of the globals that are subject to the fiasco.
In the end, a better approach is to try to remove these objects from your project (easier said than done, sometimes).