First of all, apologies in advance for the vague presentation of this question: if I knew what to ask for, I would probably know how to fix it.
... But I do not have even a faint theory, seven years of C++ experience notwithstanding. Any helpful pointers (heh) will be most appreciated.
Possibly related to this question. (same symptoms)
Not related to that one. (no explicit function pointers here)
Problem: A count of objects with certain properties, which themselves are checked by a member function of their class shows incorrect results. The source is that the check happens with gibberish values. The source of that is that the pointer "this" changes between calling the function and entering its body. Once the body is left, the pointer is again correct.
Sadly, the solution for the related question does not work here.
I am not able to produce a minimal example of the problem.
Furthermore, literally hundreds of member functions are being called correctly in the same program, as far as I know without exhibiting this behaviour.
What can I do?
As a bandaid, replacing the function call with a copy of its body works, but this is no proper solution.
I am at a complete loss as to how to proceed with my diagnosis.
Concise: What steps can I follow to attain greater insight into the nature of the problem?
A short checklist of things already taken care of:
The objects in question are properly initialised at the time of the call.
All optimisations are off. No inlining. This is a debug build with the appropriate settings in effect.
Cleaning and rebuilding the project has not yielded a different result.
Recompiling with the original (but retyped) function call after the bandaid solution had been tested successfully led to a return of the problem.
There are no compiler warnings in the compilation unit involved (warning level 3), specifically disabled project-wide are:
C4005 (macro redefinition, due to using a custom/hacked Windows SDK for compatibility reasons - this was originally a Win95 program)
C4244 (implicit cast to smaller type, due to legacy code waiting to be refactored - those are all float-to-int conversions that lack an explicit cast, all 800+ instances of them)
C4995 (calling function marked with #pragma deprecated, due to C-lib functions being called without preceding underscore - hoping to eventually switch to GCC)
"control flow guard" and "basic runtime checks" are enabled, but do not trigger.
And a hint that may or may not be relevant, but which I cannot interpret at the moment:
For the very first hex, IsSea is called normally, that is: "this" inside is identical to "this" outside
Only in all hexes that follow does the bug happen.
The altered "this" pointer does not point to the first hex though, it seems to hit unallocated space.
Here is an extract of what it looks like:
Header:
// These three are actually included from other files.
constexpr unsigned int AREA_none = 0u;
constexpr unsigned int AREA_Lake = 1u;
enum Terrain
{
OCEAN = 8,
TERRA_INCOGNITA = 10
};
class CHex
{
public:
CHex(); // initialises ALL members with defined error values
bool ReadHex(); // Init-type function calling the problematic one
bool IsSea() const // problematic function
{ return this->_Area != AREA_none && this->_Area != AREA_Lake && this->_nTerrain == Terrain::OCEAN; }
// The body does the right thing - just WITH the wrong thing.
protected:
unsigned int _Area;
int _nNavalIndex;
Terrain _nTerrain;
static int _nNavalCount; // initial value 0; a non-const static member must be defined out of class in a source file
// There are a lot more functions in here, both public and protected.
// The class also inherits a bunch from three other classes, but no virtual functions and no overlaps are involved.
};
Source:
CHex::CHex() : _Area{0u}, _nNavalIndex{0}, _nTerrain{Terrain::TERRA_INCOGNITA}
{}
bool CHex::ReadHex()
{
// Calls a lexer/parser pair to procure values from several files.
// _Area and _nTerrain are being initialised in this process.
// All functions called here work as expected and produce data matching the source files.
// _Area and _nTerrain have the correct values seen in the source files at this point.
if(this->IsSea()) // but inside that it looks as if they were uninitialised
// This ALWAYS happens because the function always returns true.
_nNavalIndex = _nNavalCount++;
// Stopping at the next instruction, all values are again correct
// - with the notable exception of the two modified by the instruction that should not have happened.
// If I replace that with the following, I receive the correct result:
/*
// direct copy of the function's body
if(this->_Area != AREA_none && this->_Area != AREA_Lake && this->_nTerrain == Terrain::OCEAN)
_nNavalIndex = _nNavalCount++; // only happens when it should; at the end, the count is correct
*/
// Sanity checks follow here.
// They too work correctly and produce results appropriate for the data.
return true; // earlier returns exist in the commented-out parts
}
Sorry again for this big mess, but well, right now I am a mess. It's like seeing fundamental laws of physics change.
--
On advice from @Ben Voigt I hacked in a diagnostic that dumps the pointers into a file. Behold:
Before ReadHex: 20A30050 (direct array access) On ReadHex: 20A30050 On IsSea: 20A30050 (with members: 0, 8) After ReadHex: 20A30050
Before ReadHex: 20A33EAC (direct array access) On ReadHex: 20A33EAC On IsSea: 20A33EAC (with members: 2, 0) After ReadHex: 20A33EAC
Before ReadHex: 20A37D08 (direct array access) On ReadHex: 20A37D08 On IsSea: 20A37D08 (with members: 2, 0) After ReadHex: 20A37D08
Before ReadHex: 20A3BB64 (direct array access) On ReadHex: 20A3BB64 On IsSea: 20A3BB64 (with members: 3, 0) After ReadHex: 20A3BB64
Before ReadHex: 20A3F9C0 (direct array access) On ReadHex: 20A3F9C0 On IsSea: 20A3F9C0 (with members: 4, 3) After ReadHex: 20A3F9C0
Before ReadHex: 20A4381C (direct array access) On ReadHex: 20A4381C On IsSea: 20A4381C (with members: 3, 0) After ReadHex: 20A4381C
[...]
They are all correct. Every single one of them. And even better: The function now evaluates correctly!
Here is the changed source (I am omitting the comments this time):
Header:
// These three are actually included from other files.
constexpr unsigned int AREA_none = 0u;
constexpr unsigned int AREA_Lake = 1u;
enum Terrain
{
OCEAN = 8,
TERRA_INCOGNITA = 10
};
extern FILE * dump;
class CHex
{
public:
CHex();
bool ReadHex();
bool IsSea() const {
fprintf(dump, "\tOn IsSea:\t%p (with members: %u, %i) ", (void*)this, this->_Area, this->_nTerrain);
return this->_Area != AREA_none && this->_Area != AREA_Lake && this->_nTerrain == Terrain::OCEAN; }
protected:
unsigned int _Area;
int _nNavalIndex;
Terrain _nTerrain;
static int _nNavalCount; // initial value 0; a non-const static member must be defined out of class in a source file
// lots more functions and attributes
};
Source:
CHex::CHex() : _Area{0u}, _nNavalIndex{0}, _nTerrain{Terrain::TERRA_INCOGNITA}
{}
bool CHex::ReadHex()
{
fprintf(dump, "On ReadHex:\t%p ", (void*)this);
// Calls a lexer/parser pair to procure values from several files.
// _Area and _nTerrain are being initialised in this process.
if(this->IsSea()) // Suddenly works!?
_nNavalIndex = _nNavalCount++;
// Sanity checks follow here.
fprintf(dump, "After ReadHex:\t%p ", (void*)this);
return true;
}
The additional outputs (as well as the initialisation and closing of dump) come from the next higher level in the control flow, another function in another class, where the loop over all hexes resides. I omitted that for now, but will add it if someone thinks it's important.
And apropos that. It now looks to me as if this fault were a result of bugs in the tools, not in the code. As a matter of fact, even though the function now evaluates correctly, the debugger still shows the wrong pointer from before and its nonsensical members.
EDIT for OP edit:
This now smells even more like an ODR violation. Changing an inline function and having that change alter program behavior is exactly what could happen with the undefined behavior induced by ODR violations. Do you use templates anywhere? Also, try de-inlining IsSea() in the original version to see if that helps.
(original answer):
This smells like one of three things to me.
First, it could be a one-definition-rule violation for the function in question. Make absolutely certain there aren't multiple versions in different translation units, or different compilation settings in different units.
Secondly, the compiler could be doing something because of your use of the reserved name _Area. Regardless of anything else, you should fix this problem.
Thirdly, VC++ can utilize different mechanisms for member function pointers, and possibly one of those is affecting your case here (even given that you don't show use of member function pointers). See https://msdn.microsoft.com/en-us/library/83cch5a6.aspx?f=255&MSPPError=-2147217396 for some information. Another possibility is that the compiler options for such pointers are different across translation units.
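For the second point above, a minimal sketch of what the renaming could look like. Identifiers that begin with an underscore followed by an uppercase letter (such as _Area) are reserved for the implementation, so the sketch uses an m_ prefix instead. The names m_area and m_terrain and the two-argument constructor are assumptions made for illustration, not part of the original class.

```cpp
// Hypothetical cut-down version of CHex using non-reserved member names.
enum Terrain { OCEAN = 8, TERRA_INCOGNITA = 10 };

constexpr unsigned int AREA_none = 0u;
constexpr unsigned int AREA_Lake = 1u;

class CHex
{
public:
    // Illustrative constructor; the real class is filled in by ReadHex().
    CHex(unsigned int area, Terrain terrain) : m_area{area}, m_terrain{terrain} {}

    // Same check as the original IsSea(), only the member names differ.
    bool IsSea() const
    {
        return m_area != AREA_none && m_area != AREA_Lake && m_terrain == OCEAN;
    }

private:
    unsigned int m_area;    // was _Area (reserved: underscore + uppercase letter)
    Terrain      m_terrain; // was _nTerrain (legal at class scope, renamed for consistency)
};
```

Whether the reserved name actually plays a role in the misbehaviour is unknown; the renaming only removes one possible source of trouble.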
This is a very sad answer, but alas, sometimes a sad answer is nevertheless correct.
Let us start with the smaller but at least somewhat useful part:
The values VS2015's debugger displays for the this pointer - and by extension all members of the object pointed to - are indeed incorrect, and in a very reproducible way:
If a breakpoint is set on a member function defined in a header file, "this" displayed in the debugger will display - the entry point of this function. It will still have the type of the object in question and display all the members... but as those values are populated as offsets from the entry point of said function, their displayed contents are of course nonsensical.
All of this is purely a UI issue and does not affect the actual program execution.
And now the useless and depressing part:
The fault which originally prompted me to open this topic, which persisted through several compilations with different build settings - is gone and can no longer be reproduced after I put in the fprintf commands to dump the pointer addresses to a file in order to discover the bug described above.
Even though the code is letter by letter identical to the formerly faulty code, it now works flawlessly. Further alterations have done nothing to change this. I cannot bring it back, try as I might.
This whole dastard thing - was a fluke. Somehow. Which means that it may happen again, at any time, for no apparent reason, without any means of prevention whatsoever. Great, is it not?
...
In any case, heartfelt thanks to @Ben Voigt for raising the notion that mayhap those debugger outputs may not be related to reality in the first place.
Similarly, thanks to @dyp for pointing out and explaining an unrelated potential issue (names prefixed by '_' followed by a capital letter being reserved identifiers) I was not aware of before.
Thanks to @Mark-B for actually providing hypotheses about the problem, even if none of them proved to be correct. At least there were some and one might have been.
I created a structure like that:
struct Options {
double bindableKeys = 567;
double graphicLocation = 150;
double textures = 300;
};
Options options;
Right after this declaration, in another process, I open the process which contains the structure and search for a byte array with the struct's doubles but nothing gets found.
To obtain a result, I need to add something like std::cout << options.bindableKeys; after the declaration. Then I get a result from my pattern search.
Why is this behaving like that? Is there any fix?
Minimal reproducible example:
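#include <iostream>

struct Options {
    double bindableKeys = 567;
    double graphicLocation = 150;
    double textures = 300;
};

Options options;

int main() {
    // Endless loop so that another process has time to scan this one's memory.
    while (true) {
        double val = options.bindableKeys;
        if (val > 10)
            std::cout << "test" << std::endl;
    }
}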
You can search the array with CheatEngine or another pattern finder
Contrary to popular belief, C++ source code is not a sequence of instructions provided to the executing computer. It is not a list of things that the executable will contain.
It is merely a description of a program.
Your compiler is responsible for creating an executable program, that follows the same semantics and logical narrative as you've described in your source code.
Creating an Options instance is all well and good, but if creating it does not do anything (has no side effects) and you never use any of its data, then it may as well not exist, and therefore is not a part of the logical narrative of your program.
Consequently, there is no reason for the compiler to put it into the executable program. So, it doesn't.
Some people call this "optimisation". That the instance is "optimised away". I prefer to call it common sense: the instance was never truly a part of your program.
And even if you do use the data in the instance, it may be possible for an executable program to be created that more directly uses that data. In your case, nothing changes the default values of the members of Options, so there is no reason to include them in the program: the if statement can just have 567 baked into it. Then, since it's baked in, the whole condition becomes the constant expression 567 > 10, which must always be true; you'll likely find that the resulting executable program consequently contains no branching logic at all. It just starts up, then outputs "test" over and over again until you force-terminate it.
That all being said, because we live in a world governed by physical laws, and because compilers are imperfect, there is always going to be some slight leakage of this abstraction. For this reason, you can trick the compiler into thinking that the instance is "used" in a way that requires its presence to be represented more formally in the executable, even if this isn't necessary to implement the described program. This is common in benchmarking code.
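One common way to "trick" the compiler in the sense described above is to declare the instance volatile. The following is a minimal sketch under that assumption, not the only approach: every read of a volatile object counts as an observable side effect, so the compiler should keep the object and its three doubles in the executable's data instead of folding 567 into the code. The helper readBindableKeys is a hypothetical name introduced for illustration.

```cpp
struct Options {
    double bindableKeys = 567;
    double graphicLocation = 150;
    double textures = 300;
};

// volatile: accesses may not be optimised away or replaced by constants,
// so the values are expected to be present in the process's memory.
volatile Options options;

// Hypothetical helper: performs a real load from the volatile object.
double readBindableKeys()
{
    return options.bindableKeys;
}
```

A loop such as while (readBindableKeys() > 10) { ... } then has to re-read the value on every iteration, which also means that a change made by an external tool would be picked up.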
I'm getting an ICE on Visual Studio 2015 CTP 6. Unfortunately, this is happening in a large project, and I can't post the whole code here, and I have been unable to reproduce the problem on a minimal sample. What I'm hoping to get is help in constructing such a sample (to submit to Microsoft) or possibly illumination regarding what's happening and/or what I'm doing wrong.
This is a mock-up of what I'm doing. (Note that the code I'm presenting here does NOT generate an ICE; I'm merely using this simple example to explain the situation.)
I have a class A which is not copyable (it has a couple of "reference" members) and doesn't have a default constructor. Another class, B holds an array of As (plain C array of A values, no references/pointers) and I'm initializing this array in the constructor of B using uniform initialization syntax. See the sample code below.
struct B;
struct A
{
int & x;
B * b;
A (B * b_, int & x_) : x (x_), b (b_) {}
A (A const &) = delete;
A & operator = (A const &) = delete;
};
struct B
{
A a [3];
int foo;
B ()
: a {{this,foo},{this,foo},{nullptr,foo}} // <-- THE CULPRIT!
, foo (2)
{ // <-- This is where the compiler says the error occurs
}
};
int main ()
{
B b;
return 0;
}
I can't use std::array because I need to construct the elements in their final place (can't copy.) I can't use std::vector because I need B to contain the As.
Note that if I don't use an array and use individual variables (e.g. A a0, a1, a2;, which I can do because the array is small and fixed in size) the ICE goes away. But this is not what I want since I'll lose ability to get to them by index, which I need. I can use a union of the loose variables over the array to solve my ICE problem and get indexing (construct using the variables, access using the array,) but I think that would result in "undefined behavior" and seems convoluted.
The obvious differences between the above sample and my actual code (aside from the scale) are that A and B are classes instead of structs, each is declared/defined in its own source/header file pair, and none of the constructors is inline. (I duplicated these and still couldn't reproduce the ICE.)
For my actual project, I've tried cleaning the built files and rebuild, to no avail. Any suggestions, etc.?
P.S. I'm not sure if my title is suitable. Any suggestions on that?!?!
UPDATE 1: This is the compiler file referenced in the C1001 fatal error message: (compiler file 'f:\dd\vctools\compiler\utc\src\p2\main.c', line 230).
UPDATE 2: Since I had forgotten to mention, the codebase compiles cleanly (and correctly) under GCC 4.9.2 in C++14 mode.
Also, I'm compiling with all optimizations disabled.
UPDATE 3: I have found out that if I rearrange the member data in B and put the array at the very end, the code compiles. I've tried several other permutations and it sometimes does compile and sometimes doesn't. I can't see any patterns regarding what other members coming before the array make the compiler go full ICE! (being UDTs or primitives, having constructors or not, POD or not, reference or pointer or value type, ...)
This means that I have sort of a solution for my problem, although my internal class layout is important to me and this application, I can tolerate the performance hit (due to cache misses resulting from putting some hot data apart from the rest) in order to get past this thing.
However, I would still really like a minimal repro of the ICE to be able to submit to Microsoft. I don't want to be stuck with this for the next two years (at least!)
UPDATE 4: I have tried VS2015 RC and the ICE is still there (although the error message refers to a different internal line of code, line 247 in the same "main.c" file.)
And I have opened a bug report on Microsoft Connect.
I did report this to Microsoft, and after sharing some of my project code with them, it seems that the problem has been tracked down and fixed. They said that the fix will be included in the final VC14 release.
Thanks for the comments and pointers.
I have a piece of templated code that is never run, but is compiled. When I remove it, another part of my program breaks.
First off, I'm a bit at a loss as to how to ask this question. So I'm going to try throwing lots of information at the problem.
Ok, so, I went to completely redesign my test project for my experimental core library thingy. I use a lot of template shenanigans in the library. When I removed the "user" code, the tests gave me a memory allocation error. After quite a bit of experimenting, I narrowed it down to this bit of code (out of a couple hundred lines):
void VOODOO(components::switchBoard &board) {
board.addComponent<using_allegro::keyInputs<'w'> >();
}
Fundamentally, what's weirding me out is that it appears that the act of compiling this function (and the template function it then uses, and the template functions those then use...) makes this bug not appear. This code is not being run. Similar code (the same, but for different key vals) occurs elsewhere, but is within Boost TDD code.
I realize I certainly haven't given enough information for you to solve it for me; I tried, but it more-or-less spirals into most of the code base. I think I'm most looking for "here's what the problem could be", "here's where to look", etc. There's something that's happening during compile because of this line, but I don't know enough about that step to begin looking.
Sooo, how can a (presumably) compiled, but never actually run, bit of templated code, when removed, cause another part of the code to fail?
Error:
Unhandled exception at 0x6fe731ea (msvcr90d.dll) in Switchboard.exe:
0xC0000005: Access violation reading location 0xcdcdcdc1.
Callstack:
operator delete(void * pUserData)
allocator< class name related to key inputs callbacks >::deallocate
vector< same class >::_Insert_n(...)
vector< " " >::insert(...)
vector<" ">::push_back(...)
It looks like maybe the vector isn't valid, because _MyFirst and similar data members are showing values of 0xcdcdcdcd in the debugger. But the vector is a member variable...
Update: The vector isn't valid because it's never made. I'm getting a channel ID value stomp, which is making me treat one type of channel as another.
Update:
Searching through with the debugger again, it appears that my method for giving each "channel" its own unique ID isn't giving me a unique ID:
inline static const char channel<template args>::idFunction() {
return reinterpret_cast<char>(&channel<CHANNEL_IDENTIFY>::idFunction);
};
Update2: These two are giving the same:
slaveChannel<switchboard, ALLEGRO_BITMAP*, entityInfo<ALLEGRO_BITMAP*>
slaveChannel<key<c>, char, push<char>
Sooo, having another compiled channel type changing things makes sense, because it shifts around the values of the idFunctions? But why are there two idFunctions with the same value?
You seem to be returning the address of the function as a character? That looks weird. A char has far fewer bits than a pointer, so it's quite possible that you get the same values. That could be the reason why changing the code layout fixes/breaks your program.
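To make the truncation concrete, here is a small sketch. truncatedId models what the reinterpret_cast to char effectively keeps (the low 8 bits of an address), and ChannelId shows one possible alternative that keeps the full pointer. Both names are hypothetical and not taken from the original code base.

```cpp
#include <cstdint>

// Models the original idFunction: only the low 8 bits of the address survive.
inline unsigned char truncatedId(std::uintptr_t address)
{
    return static_cast<unsigned char>(address);
}

// Possible alternative: use the full address of a per-instantiation tag object.
// Distinct objects have distinct addresses, so each Channel type gets its own ID.
template <typename Channel>
struct ChannelId
{
    static const void* get()
    {
        static const char tag = 0; // one tag object per template instantiation
        return &tag;
    }
};
```

With only 256 possible values, two functions whose addresses differ by a multiple of 256 collide, and any change in code layout (such as compiling one more template instantiation) reshuffles which ones do.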
As a general answer (though aaa's comment alludes to this): When something like this affects whether a bug occurs, it's either because (a) you're wrong and it is being run, or (b) the way that the inclusion of that code happens to affect your code, data, and memory layout in the compiled program causes a heisenbug to change from visible to hidden.
The latter generally occurs when something involves undefined behavior. Sometimes a bogus pointer value will cause you to stomp on a bit of your code (which might or might not be important depending on the code layout), or sometimes a bogus write will stomp on a value in your data stack that might or might not be a pointer that's used later, or so forth.
As a simple example, supposing you have a stack that looks like:
float data[10];
int never_used;
int *important_pointer;
And then you erroneously write
data[10] = 0;
Then, assuming that stack got allocated in linear order, you'll stomp on never_used, and the bug will be harmless. However, if you remove never_used (or change something so the compiler knows it can remove it for you -- maybe you remove a never-called function call that would use it), then it will stomp on important_pointer instead, and you'll now get a segfault when you dereference it.
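The effect can be modelled without actually invoking undefined behavior by looking at offsets. The sketch below puts the three variables into structs (Frame and FrameWithoutUnused are hypothetical names) as a stand-in for a linearly allocated stack frame; real stack layout is entirely up to the compiler, and the comments assume a mainstream platform with a 4-byte float.

```cpp
#include <cstddef>

// Stand-in for the stack layout from the example above.
struct Frame
{
    float data[10];
    int   never_used;
    int  *important_pointer;
};

// The same layout after never_used has been removed.
struct FrameWithoutUnused
{
    float data[10];
    int  *important_pointer;
};

// data[10] would start exactly one whole array past the beginning of data.
constexpr std::size_t onePastData = sizeof(float) * 10;

// True if the stray write to data[10] starts at never_used (harmless case).
inline bool overflowHitsNeverUsed()
{
    return offsetof(Frame, never_used) == onePastData;
}

// True if, without never_used, the same write starts at important_pointer.
inline bool overflowHitsPointer()
{
    return offsetof(FrameWithoutUnused, important_pointer) == onePastData;
}
```

On such a platform both functions return true: the same erroneous write lands on a harmless variable in one layout and on the pointer in the other.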
I've been trying to run Insure++ with some scientific code and it reports many errors, although to be fair it officially does not support K&R C and I don't know what having a lot of K&R functions has done to its evaluation process. The C and C++ code it is testing is being run in a DLL invoked from a WPF application.
One error report that puzzles me is the following. I'm confident the code is safe, but I am trying to work out why the tool thinks it is an error (it does work). I'd be interested if anyone has an insight into why this might be an error condition.
[MacImagePlot.c:984] **READ_OVERFLOW**
SetCursorQD(*GetCursorQD(watchCursor));
Reading overflows memory: GetCursorQD(watchCursor)
bbbbb
| 4 | 4 |
rrrrr
Reading (r) : 0x5639d164 thru 0x5639d167 (4 bytes)
From block (b) : 0x5639d160 thru 0x5639d163 (4 bytes)
gWatchCursor, declared at WPFMacGraphics.cpp, 418
for some very simple code.
typedef int Cursor;
typedef Cursor* CursPtr;
typedef CursPtr* CursHandle;
CursHandle GetCursorQD (short cursorID);
void SetCursorQD (const Cursor *crsr);
enum {
....
watchCursor = 4
};
// file globals
Cursor gWatchCursor=watchCursor;
CursPtr gWatchCursorPtr = &gWatchCursor;
CursHandle GetCursorQD (short cursorID)
{
if (cursorID==watchCursor) // this is actually the only case ever called
return &gWatchCursorPtr;
return 0;
}
I'm not familiar at all with the tools you're talking about, but have you verified that your GetCursorQD function is returning the pointer you expect and not NULL/0?
Perhaps something wonky happened with your enum definition for watchCursor (such as it being declared differently elsewhere, or it picking up a local variable instead of the enum).
I hate to say it but I suspect your problem is going to be the lack of some arcane function modifiers needed to ensure that data on the stack isn't getting munged when crossing the DLL boundary. I'd suggest writing a simple app that replicates the code but does it all in one module and see if Insure++ still detects an error. If it doesn't, get ready to wade through __declspec documentation.
I assume that the following line is the problem:
if (cursorID==watchCursor)
cursorID is defined as short (usually 2 Bytes)
watchCursor is part of an enum and thus of type int (4 bytes on a 32-bit OS)
This is actually not a problem. The compiler converts both operands to a common type (the short is promoted to int), and since the enum value does not exceed the 2-byte range, the comparison behaves as expected.
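A short sketch of that promotion, written in C++ rather than the C of the original file; isWatchCursor is a hypothetical helper that isolates the comparison from GetCursorQD.

```cpp
#include <type_traits>

enum { watchCursor = 4 };

// Integral promotion: a short used in an arithmetic expression becomes an int.
static_assert(std::is_same<decltype(short(0) + 0), int>::value,
              "short is promoted to int");

// Same comparison as in GetCursorQD: both operands are compared as int.
inline bool isWatchCursor(short cursorID)
{
    return cursorID == watchCursor;
}
```

The comparison is well defined for every short value, so this line on its own is unlikely to be what the tool is complaining about.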
In my experience, all static (as well as runtime) code analysis tools report many false positives (I have tried some of them). They do help, of course, but it takes quite a while to separate the false positives from the real bugs.
Like Soapbox, I am not familiar with Insure++.
But looking at the code, it is admittedly a bit confusing...so
That typedef makes CursHandle effectively a pointer to a pointer to int:
CursHandle is a pointer to CursPtr
CursPtr is a pointer to Cursor
Cursor is typedef'd to int
Yet in GetCursorQD you are using a 'double address of' an int? I say 'double address' because the function returns the address of gWatchCursorPtr (&gWatchCursorPtr), of type CursHandle, and gWatchCursorPtr is a global variable that in turn holds the address of gWatchCursor (&gWatchCursor), which is of type Cursor.
My thinking is that the declared return type of the function does not match up with the type of the global variable, despite the typedefs...
Hope this helps,
Best regards,
Tom.
We've run into some problems with the static initialization order fiasco, and I'm looking for ways to comb through a whole lot of code to find possible occurrences. Any suggestions on how to do this efficiently?
Edit: I'm getting some good answers on how to SOLVE the static initialization order problem, but that's not really my question. I'd like to know how to FIND objects that are subject to this problem. Evan's answer seems to be the best so far in this regard; I don't think we can use valgrind, but we may have memory analysis tools that could perform a similar function. That would catch problems only where the initialization order is wrong for a given build, and the order can change with each build. Perhaps there's a static analysis tool that would catch this. Our platform is IBM XLC/C++ compiler running on AIX.
Solving order of initialization:
First off, this is just a temporary work-around because you have global variables that you are trying to get rid of but just have not had time yet (you are going to get rid of them eventually aren't you? :-)
class A
{
public:
// Get the global instance abc
static A& getInstance_abc() // return a reference
{
static A instance_abc;
return instance_abc;
}
};
This will guarantee that it is initialised on first use and destroyed when the application terminates.
Multi-Threaded Problem:
C++11 does guarantee that this is thread-safe:
§6.7 [stmt.dcl] p4
If control enters the declaration concurrently while the variable is being initialized, the concurrent execution shall wait for completion of the initialization.
However, C++03 does not officially guarantee that the construction of static function objects is thread safe. So technically the getInstance_XXX() method must be guarded with a critical section. On the bright side, gcc has an explicit patch as part of the compiler that guarantees that each static function object will only be initialized once even in the presence of threads.
Please note: Do not use the double checked locking pattern to try and avoid the cost of the locking. This will not work in C++03.
Creation Problems:
On creation, there are no problems because we guarantee that it is created before it can be used.
Destruction Problems:
There is a potential problem of accessing the object after it has been destroyed. This only happens if you access the object from the destructor of another global variable (by global, I am referring to any non-local static variable).
The solution is to make sure that you force the order of destruction.
Remember the order of destruction is the exact inverse of the order of construction. So if you access the object in your destructor, you must guarantee that the object has not been destroyed. To do this, you must just guarantee that the object is fully constructed before the calling object is constructed.
class B
{
public:
static B& getInstance_Bglob()
{
static B instance_Bglob;
return instance_Bglob;
}
~B()
{
A::getInstance_abc().doSomthing();
// The object abc is accessed from the destructor.
// Potential problem.
// You must guarantee that abc is destroyed after this object.
// To guarantee this you must make sure it is constructed first.
// To do this just access the object from the constructor.
}
B()
{
A::getInstance_abc();
// abc is now fully constructed.
// This means it was constructed before this object.
// This means it will be destroyed after this object.
// This means it is safe to use from the destructor.
}
};
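The ordering argument above can be checked with a small self-contained sketch. The class and function names (eventLog, A::instance, B::instance, doSomething) are hypothetical; the log is itself a function-local static that finishes construction before A does, so it is destroyed after both A and B and can safely be written to from their destructors.

```cpp
#include <string>
#include <vector>

// Hypothetical log, constructed on first use (before A finishes constructing).
inline std::vector<std::string>& eventLog()
{
    static std::vector<std::string> log;
    return log;
}

class A
{
public:
    static A& instance() { static A a; return a; }
    void doSomething() { eventLog().push_back("A used"); }
    A()  { eventLog().push_back("A constructed"); }
    ~A() { eventLog().push_back("A destroyed"); }
};

class B
{
public:
    static B& instance() { static B b; return b; }
    B()
    {
        A::instance(); // forces A to be fully constructed before B
        eventLog().push_back("B constructed");
    }
    ~B()
    {
        A::instance().doSomething(); // safe: A is destroyed after B
        eventLog().push_back("B destroyed");
    }
};
```

After a first call to B::instance(), the log holds "A constructed" followed by "B constructed"; at program exit the destructors run in the reverse order, B first and then A.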
I just wrote a bit of code to track down this problem. We have a good size code base (1000+ files) that was working fine on Windows/VC++ 2005, but crashing on startup on Solaris/gcc.
I wrote the following .h file:
#ifndef FIASCO_H
#define FIASCO_H
/////////////////////////////////////////////////////////////////////////////////////////////////////
// [WS 2010-07-30] Detect the infamous "Static initialization order fiasco"
// email warrenstevens --> [initials]#[firstnamelastname].com
// read --> http://www.parashift.com/c++-faq-lite/ctors.html#faq-10.12 if you haven't suffered
// To enable this feature --> define E-N-A-B-L-E-_-F-I-A-S-C-O-_-F-I-N-D-E-R, rebuild, and run
#define ENABLE_FIASCO_FINDER
/////////////////////////////////////////////////////////////////////////////////////////////////////
#ifdef ENABLE_FIASCO_FINDER
#include <iostream>
#include <fstream>
#include <string>
inline bool WriteFiasco(const std::string& fileName)
{
static int counter = 0;
++counter;
std::ofstream file;
file.open("FiascoFinder.txt", std::ios::out | std::ios::app);
file << "Starting to initialize file - number: [" << counter << "] filename: [" << fileName.c_str() << "]" << std::endl;
file.flush();
file.close();
return true;
}
// [WS 2010-07-30] If you get a name collision on the following line, your usage is likely incorrect
#define FIASCO_FINDER static const bool g_psuedoUniqueName = WriteFiasco(__FILE__);
#else // ENABLE_FIASCO_FINDER
// do nothing
#define FIASCO_FINDER
#endif // ENABLE_FIASCO_FINDER
#endif //FIASCO_H
and within every .cpp file in the solution, I added this:
#include "PreCompiledHeader.h" // (which #include's the above file)
FIASCO_FINDER
#include "RegularIncludeOne.h"
#include "RegularIncludeTwo.h"
When you run your application, you will get an output file like so:
Starting to initialize file - number: [1] filename: [p:\\OneFile.cpp]
Starting to initialize file - number: [2] filename: [p:\\SecondFile.cpp]
Starting to initialize file - number: [3] filename: [p:\\ThirdFile.cpp]
If you experience a crash, the culprit should be in the last .cpp file listed. And at the very least, this will give you a good place to set breakpoints, as this code should be the absolute first of your code to execute (after which you can step through your code and see all of the globals that are being initialized).
Notes:
It's important that you put the "FIASCO_FINDER" macro as close to the top of your file as possible. If you put it below some other #includes you run the risk of it crashing before identifying the file that you're in.
If you're using Visual Studio and pre-compiled headers, adding this extra macro line to all of your .cpp files can be done quickly using the Find-and-replace dialog to replace your existing #include "precompiledheader.h" with the same text plus the FIASCO_FINDER line (if you check off "regular expressions", you can use "\n" to insert multi-line replacement text).
Depending on your compiler, you can place a breakpoint at the constructor initialization code. In Visual C++, this is the _initterm function, which is given a start and end pointer of a list of the functions to call.
Then step into each function to get the file and function name (assuming you've compiled with debugging info on). Once you have the names, step out of the function (back up to _initterm) and continue until _initterm exits.
That gives you all the static initializers, not just the ones in your code - it's the easiest way to get an exhaustive list. You can filter out the ones you have no control over (such as those in third-party libraries).
The theory holds for other compilers but the name of the function and the capability of the debugger may change.
Perhaps use valgrind to find usage of uninitialized memory. The nicest solution to the "static initialization order fiasco" is to use a static function which returns an instance of the object, like this:
class A {
public:
static X &getStatic() { static X my_static; return my_static; }
};
This way you access your static object by calling getStatic, which guarantees it is initialized on first use.
If you need to worry about order of de-initialization, return a new'd object instead of a statically allocated object.
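A minimal sketch of that last suggestion, assuming a hypothetical type X with a non-trivial constructor (X is not defined in the example above, so the one here is only a stand-in). The object is allocated once and intentionally never deleted, so its destructor never runs:

```cpp
#include <string>

// Hypothetical stand-in for X; any type with a non-trivial constructor works.
struct X {
    std::string name;
    X() : name("constructed") {}
};

class A {
public:
    // The pointer is a function-local static, so it is set on the first call.
    // The X it points to is never deleted: its destructor never runs, so other
    // statics can still use it while the program is shutting down.
    static X &getStatic() {
        static X *my_static = new X();
        return *my_static;
    }
};
```

The trade-off is a deliberate leak of one object, which leak checkers may report at exit.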
EDIT: Removed the redundant static object. I don't know why, but I mixed and matched two methods of having a static together in my original example.
There is code that essentially "initializes" C++ that is generated by the compiler. An easy way to find this code / the call stack at the time is to create a static object with something that dereferences NULL in the constructor - break in the debugger and explore a bit. The MSVC compiler sets up a table of function pointers that is iterated over for static initialization. You should be able to access this table and determine all static initialization taking place in your program.
We've run into some problems with the static initialization order fiasco, and I'm looking for ways to comb through a whole lot of code to find possible occurrences. Any suggestions on how to do this efficiently?
It's not a trivial problem, but at least it can be done by following fairly simple steps if you have an easy-to-parse intermediate-format representation of your code.
1) Find all the globals that have non-trivial constructors and put them in a list.
2) For each of these non-trivially-constructed objects, generate the entire potential-function-tree called by their constructors.
3) Walk through each constructor's function tree, and if the code references any other non-trivially constructed globals (which are handily in the list you generated in step one), you have a potential early-static-initialization-order issue.
4) Repeat steps 2 & 3 until you have exhausted the list generated in step 1.
Note: you may be able to optimize this by only visiting the potential-function-tree once per object class rather than once per global instance if you have multiple globals of a single class.
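The steps above can be sketched roughly as follows. This is only an illustration under heavy assumptions: ProgramModel, suspectsFor and findPotentialFiascos are hypothetical names, and the model is filled in by hand, whereas a real tool would extract the call graph and global references from the compiler's intermediate representation.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, hand-built model of a program.
struct ProgramModel {
    std::map<std::string, std::vector<std::string>> calls;        // function -> callees
    std::map<std::string, std::vector<std::string>> globalsUsed;  // function -> globals it references
    std::map<std::string, std::string> constructorOf;             // step 1: non-trivial global -> its constructor
};

// Steps 2 and 3: walk everything reachable from one global's constructor and
// collect the other non-trivially constructed globals that code touches.
std::set<std::string> suspectsFor(const ProgramModel &m, const std::string &global) {
    std::set<std::string> suspects, visited;
    std::vector<std::string> pending{m.constructorOf.at(global)};
    while (!pending.empty()) {
        std::string fn = pending.back();
        pending.pop_back();
        if (!visited.insert(fn).second) continue;  // already walked this function
        auto used = m.globalsUsed.find(fn);
        if (used != m.globalsUsed.end())
            for (const auto &g : used->second)
                if (g != global && m.constructorOf.count(g)) suspects.insert(g);
        auto callees = m.calls.find(fn);
        if (callees != m.calls.end())
            pending.insert(pending.end(), callees->second.begin(), callees->second.end());
    }
    return suspects;
}

// Step 4: repeat for every global collected in step 1.
std::map<std::string, std::set<std::string>> findPotentialFiascos(const ProgramModel &m) {
    std::map<std::string, std::set<std::string>> report;
    for (const auto &entry : m.constructorOf) {
        std::set<std::string> s = suspectsFor(m, entry.first);
        if (!s.empty()) report[entry.first] = s;
    }
    return report;
}
```

The result maps each global to the other non-trivially constructed globals its constructor might touch. Every entry is only a potential problem, since the walk ignores whether a given call path is actually taken.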
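Replace all the global objects with global functions that return a reference to an object declared static in the function. Before C++11 this isn't thread-safe, so if your app is multi-threaded and built with an older compiler you might need some tricks like pthread_once or a global lock (C++11 and later guarantee that a function-local static is initialized exactly once, even with concurrent callers). This will ensure that everything is initialized before it is used.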
Now, either your program works (hurrah!) or else it sits in an infinite loop because you have a circular dependency (redesign needed), or else you move on to the next bug.
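For the "global lock" variant, a minimal sketch might look like the following. Settings, settings() and the two variables in the anonymous namespace are hypothetical names. The sketch relies on std::mutex having a constexpr constructor, so on a conforming implementation the lock is ready before any dynamic initialization runs. On a C++11 or later compiler a plain function-local static already gives the same guarantee, so this is mainly of interest when you want the locking to be explicit.

```cpp
#include <mutex>
#include <string>

// Hypothetical type standing in for one of the former global objects.
struct Settings {
    std::string mode;
    Settings() : mode("default") {}
};

namespace {
std::mutex g_settingsLock;       // constexpr constructor: constant-initialized
Settings *g_settings = nullptr;  // zero-initialized before any dynamic init
}

// Replaces a former global "Settings settings;".
Settings &settings() {
    std::lock_guard<std::mutex> guard(g_settingsLock);
    // Construct on first use; the object is intentionally never deleted.
    if (g_settings == nullptr) g_settings = new Settings();
    return *g_settings;
}
```

Every call takes the lock, which is simple but not free. That cost is the reason a one-time mechanism such as pthread_once is often preferred.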
The first thing you need to do is make a list of all static objects that have non-trivial constructors.
Given that, you either need to plug through them one at a time, or simply replace them all with singleton-pattern objects.
The singleton pattern comes in for a lot of criticism, but the lazy "as-required" construction is a fairly easy way to fix the majority of the problems now and in the future.
old...
MyObject myObject;
new...
MyObject &myObject()
{
static MyObject myActualObject;
return myActualObject;
}
Of course, if your application is multi-threaded, this can cause you more problems than you had in the first place...
Gimpel Software (www.gimpel.com) claims that their PC-Lint/FlexeLint static analysis tools will detect such problems.
I have had good experience with their tools, but not with this specific issue so I can't vouch for how much they would help.
Some of these answers are now out of date. For the sake of people coming from search engines, like myself:
On Linux and elsewhere, finding instances of this problem is possible through Google's AddressSanitizer.
AddressSanitizer is a part of LLVM starting with version 3.1 and a part of GCC starting with version 4.8
You would then do something like the following:
$ g++ -fsanitize=address -g staticA.C staticB.C staticC.C -o static
$ ASAN_OPTIONS=check_initialization_order=true:strict_init_order=true ./static
=================================================================
==32208==ERROR: AddressSanitizer: initialization-order-fiasco on address ... at ...
#0 0x400f96 in firstClass::getValue() staticC.C:13
#1 0x400de1 in secondClass::secondClass() staticB.C:7
...
See here for more details:
https://github.com/google/sanitizers/wiki/AddressSanitizerInitializationOrderFiasco
Other answers are correct; I just wanted to add that the object's getter should be implemented in a .cpp file and it should not be static. If you implement it in a header file, the object will be created in each library / framework you call it from.
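A sketch of that layout, collapsed into one listing for brevity. Registry, registry() and the two file names are hypothetical. The header only declares the getter, and the single definition lives in one .cpp file:

```cpp
#include <string>

// ---- registry.h (hypothetical header): type and declaration only ----
struct Registry {
    std::string owner;
    Registry() : owner("main-module") {}
};
Registry &registry();  // declared here, not marked static, no body

// ---- registry.cpp (hypothetical source file): the one definition ----
Registry &registry() {
    static Registry instance;  // constructed on the first call
    return instance;
}
```

Every caller that links against the module containing registry.cpp then goes through the same function and sees the same instance.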
If your project is in Visual Studio (I've tried this with VC++ Express 2005, and with Visual Studio 2008 Pro):
Open Class View (Main menu->View->Class View)
Expand each project in your solution and Click on "Global Functions and Variables"
This should give you a decent list of all of the globals that are subject to the fiasco.
In the end, a better approach is to try to remove these objects from your project (easier said than done, sometimes).