C++ static unordered_map used in static method is not initialized

I have some code where a static method is called, and the static std::unordered_map within the same file is not initialized. I understand that the order of static initialization between two compile units is "undefined", and there are many SO questions on the topic; however, when I use a std::vector the issue does not occur. Also, the code can execute, but I am confused as to why these specific compile orders do not work. So, my questions are:
There is another SO question (which I've been unable to find!) about static initialization versus dynamic initialization of static variables. Is this error due to std::unordered_map actually being dynamically initialized?
Is there a way to get this code to initialize the std::unordered_map as I expected? I'm actually trying to create a static library (.lib or .a). When I link the static library, it generally needs to come last, and so the error occurs.
Are there any workarounds for this? One option I've thought of is to create both a std::vector and a std::unordered_map, and use the std::vector while the std::unordered_map is uninitialized (tracked via bool _map_is_initialized). The initialization of the std::unordered_map would then be made explicitly dynamic by calling a function which iterates over the values in the std::vector to produce the std::unordered_map.
Linux
g++ -std=c++1y -g -c thing.cpp
g++ -std=c++1y -g -c main.cpp
g++ -g main.o thing.o -o main
./main
This results in a Floating point exception (core dumped) error. Through gdb, I was able to figure out that hashtable_policy.h tries __num % __den, where __den == 0. Also using gdb, it appears as though Thing::Things is uninitialized.
(gdb) break thing.cpp:12
(gdb) run
(gdb) print Thing::Things
No symbol "Things" in specified context.
(gdb) print thing
$1 = (Thing *) 0x618c20
Windows
cl /EHsc /Zi /c main.cpp
cl /EHsc /Zi /c thing.cpp
link /debug main.obj thing.obj
main
In my actual code, this resulted in a very clear segmentation fault; however, this example just opens a popup that says the application failed. ... I have not done better diagnostics.
Code
thing.cpp
#include<iostream>
#include "thing.hpp"
std::vector<Thing*> Thing::Before; // EDIT: added
std::unordered_map<std::string, Thing*> Thing::Things;
std::vector<Thing*> Thing::After; // EDIT: added
Thing::Thing(std::string name) : name(name) {
}
bool Thing::Register(Thing *thing) {
std::cout << "no worries, vectors initialized..." << std::endl;
Thing::Before.push_back(thing); // EDIT: added
Thing::After.push_back(thing); // EDIT: added
std::cout << "added to vectors, about to fail..." << std::endl;
Thing::Things[thing->name] = thing;
return true;
}
thing.hpp
#pragma once
#include <string>
#include <unordered_map>
#include <vector>
class Thing {
public:
static std::vector<Thing*> Before; // EDIT: added
static std::unordered_map<std::string, Thing*> Things;
static std::vector<Thing*> After; // EDIT: added
static bool Register(Thing* thing);
std::string name;
Thing(std::string name);
};
#define ADD_THING(thing_name) \
static bool thing_name## _is_defined = Thing::Register(new Thing( #thing_name ));
main.cpp
#include "thing.hpp"
#include <iostream>
ADD_THING(obligatory);
ADD_THING(foo);
ADD_THING(bar);
int main(int argc, char* argv[]) {
std::cout << "before loop" << std::endl;
for (auto thing : Thing::Things) {
std::cout << "thing.name: " << thing.first << std::endl;
}
return 0;
}
EDIT
If the order within a given compile unit is guaranteed, why do static std::vector<Thing*> Thing::Before and static std::vector<Thing*> Thing::After get initialized, but static std::unordered_map<std::string, Thing*> Thing::Things does not?

As noted in the comments, the order of static initialization across translation units is not defined. Why a vector appears to work while a map does not is an accident of the implementation. Maybe your compiler initializes classes with an even number of characters in their name first.
If you're using C++11 or later, initialization of function-local statics is guaranteed to be thread-safe. They are initialized the first time control passes through the declaration statement.
// Header
class Thing {
public:
static std::unordered_map<std::string, Thing*>& Things();
static bool Register(Thing* thing);
// ... remaining members unchanged ...
};
// CPP
std::unordered_map<std::string, Thing*>& Thing::Things()
{
static std::unordered_map<std::string, Thing*> things;
return things;
}
This will initialize the first time you ask for the Things, and avoids all the potential randomness of static initialization.
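The "first time control passes through the declaration" rule can be observed directly. The following is only a sketch: Tracker, instance(), and same_object_each_time() are hypothetical names that do not appear in the question, and the counter exists purely to make the construction visible.

```cpp
#include <cassert>

// Hypothetical type that records how many times it has been constructed.
struct Tracker {
    static int constructions;
    Tracker() { ++constructions; }
};
int Tracker::constructions = 0;  // constant-initialized, so safe to read at any time

// Construct-on-first-use accessor: the local static is built when control
// first passes through its declaration, and only then.
Tracker& instance() {
    static Tracker tracker;
    return tracker;
}

// Returns true if repeated calls hand back the same, single object.
bool same_object_each_time() {
    Tracker* first = &instance();
    Tracker* second = &instance();
    assert(Tracker::constructions == 1);
    return first == second;
}
```

Until instance() is called for the first time, no Tracker exists at all; afterwards every caller sees the same object, regardless of which translation unit asked first.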

Static initialization is tricky. As this answer states, the standard provides some guarantees as to the order of initialization within a single "translation unit" (normally a .cpp source file), but none whatsoever concerning what order initializations in different translation units will follow.
When you added the Before and After vectors to the code, you observed that, unlike the calls to unordered_map::operator[], the calls to vector::push_back() did not crash the process, and you concluded that the objects were being initialized out of order within a single translation unit, contrary to the standard's guarantees. There is a hidden assumption there: that because push_back() did not cause a crash, the vector must have been initialized. This turns out not to be the case. Calling that method on an uninitialized object is undefined behavior; it is almost certainly corrupting memory somewhere, but it won't necessarily cause a crash. A better way of checking whether the constructor is being called is to run the code in a debugger and set breakpoints on the lines that contain the objects' definitions, for instance std::vector<Thing*> Before in thing.cpp. This will show that initialization occurs as the standard predicts.
The best option for avoiding the "fiasco", as described here, is "construct on first use". In the case of your example code, this would involve changing any direct use of Thing::Things, such as this line:
Thing::Things[thing->name] = thing;
To a method, say Thing::GetThings(), which initializes the object and returns a reference to it. lcs' answer provides an example of this, but beware: although it solves the static initialization problem, using a scoped static object may introduce an even more pernicious problem: crashes on program exit due to static deinitialization order. For that reason, allocating the object with the new keyword is preferred:
std::unordered_map<std::string, Thing*>& Thing::GetThings()
{
static std::unordered_map<std::string, Thing*>* pThings =
new std::unordered_map<std::string, Thing*>();
return *pThings;
}
That instance will of course never be delete'd, which feels an awful lot like a memory leak. But even if it weren't a pointer, de-initialization would only occur at program shutdown. So, unless the object's destructor performs some important function like flushing a file's contents to disk, the only difference that matters is the fact that using a pointer avoids the possibility of a crash on exit.
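Putting the pieces together, a single-file sketch of the question's registry using the never-deleted map might look like the following. It is an illustration rather than a drop-in replacement: the class is condensed into one file, the assert and the constructor details are additions, and in the real project the ADD_THING lines would live in other translation units.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

class Thing {
public:
    explicit Thing(std::string name) : name(std::move(name)) {}

    // Construct-on-first-use: the map is created the first time it is needed
    // and is deliberately never deleted, so no destructor runs at exit.
    static std::unordered_map<std::string, Thing*>& GetThings() {
        static auto* things = new std::unordered_map<std::string, Thing*>();
        return *things;
    }

    static bool Register(Thing* thing) {
        assert(thing != nullptr);
        GetThings()[thing->name] = thing;
        return true;
    }

    std::string name;
};

#define ADD_THING(thing_name) \
    static bool thing_name##_is_defined = Thing::Register(new Thing(#thing_name))

// These registrations run during static initialization, before main().
ADD_THING(obligatory);
ADD_THING(foo);
ADD_THING(bar);
```

Because every access goes through GetThings(), it no longer matters whether the registering object file or the one defining the registry comes first on the link line.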

Related

Constructor is not called with /NODEFAULTLIB

I'm using /NODEFAULTLIB to disable the CRT (C Runtime); however, my constructor is not called, which ends up causing an access violation in std::map because the map is never initialized properly: its constructor is not called.
The code was compiled with LLVM 8.0.0 in debug mode for x86.
class c_test
{
public:
c_test( int a ) // Constructor not called
{
printf( "Test: %i\n", a ); // Doesn't appear and breakpoint is not reached
}
void add( const std::string& key, const std::string& val )
{
_data[ key ] = val;
}
private:
std::map< std::string, std::string > _data;
};
c_test test{ 1337 };
int main()
{
test.add( "qwrqrqr", "23142421" );
test.add( "awrqw", "12asa1faf" );
return 1;
}
I've implemented my own versions of new (HeapAlloc), delete (HeapFree), printf, memcpy, memmove, etc., and all of them work perfectly. I have no idea why this is happening.
Disabling the CRT is madness.
This performs crucial functions, such as static initialisation. Lack of static initialisation is why your map is in a crippled state. I would also wholly expect various parts of the standard library to just stop working; you're really creating a massive problem for yourself.
Don't reinvent little pieces of critical machinery — turn the CRT back on and use the code the experts wrote. There is really nothing of relative value to gain by turning it off.
I discovered the problem and solved it. Someone on another forum said that I needed to manually call the constructors whose pointers are stored in the .CRT section. I did that and it worked perfectly.
I just called the _GLOBAL__sub_I_main_cpp function, which calls my constructor, and that solved all my problems. Thanks for the answers.
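For readers wondering what "calling the constructors stored as pointers" amounts to: at startup the runtime essentially walks a table of function pointers and calls each non-null entry. The sketch below only simulates that walk with an ordinary array. It does not read the real .CRT section, and InitFn, run_initializers, kInitTable, and the two init functions are made-up names for illustration.

```cpp
#include <cassert>
#include <cstddef>

using InitFn = void (*)();  // hypothetical alias for one initializer entry

int g_first = 0;   // set by init_first
int g_second = 0;  // set by init_second

void init_first() { g_first = 1; }
void init_second() { g_second = g_first + 1; }  // relies on init_first having run

// Calls every non-null entry in [first, last), in table order, and reports
// how many initializers were actually invoked.
std::size_t run_initializers(const InitFn* first, const InitFn* last) {
    assert(first != nullptr && last != nullptr);
    std::size_t called = 0;
    for (; first != last; ++first) {
        if (*first != nullptr) {
            (*first)();
            ++called;
        }
    }
    return called;
}

// Stand-in for the table a linker would assemble; null entries are skipped.
const InitFn kInitTable[] = {nullptr, init_first, init_second, nullptr};
```

If nothing performs this walk, as happens when the runtime startup code is left out, none of the global constructors run, which matches the symptom described in the question.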

Segmentation fault when initializing a file scoped variable

The situation I have is that I am trying to initialize a file scoped variable, std::string, in a shared object constructor. It will probably be more clear in code:
#include <string>
#include <dlfcn.h>
#include <cstring>
#include <iostream>
static std::string pathToDaemon; // daemon should always be in the same dir as my *.so
__attribute__((constructor))
static void SetPath()
{
int lastSlash(0);
Dl_info dl_info;
memset(&dl_info, 0, sizeof(dl_info));
if((dladdr((void*)SetPath, &dl_info)) == 0)
throw up;
pathToDaemon = dl_info.dli_fname; // **whoops, segfault here**
lastSlash = pathToDaemon.find_last_of('/');
if(std::string::npos == lastSlash)
{
// no slash, but in this dir
pathToDaemon = "progd";
}
else
{
pathToDaemon.erase(pathToDaemon.begin() + (lastSlash+1), pathToDaemon.end());
pathToDaemon.append("progd");
}
std::cout << "DEBUG: path to daemon is: " << pathToDaemon << std::endl;
}
I have a very simple program that does this same thing: a proof-of-concept test driver, if you will. The code in it looks just like this: a "shared object ctor" which uses dladdr() to store the path of the *.so file when the file is loaded.
Modifications I've tried:
namespace {
std::string pathToDaemon;
__attribute__((constructor))
void SetPath() {
// function def
}
}
or
static std::string pathToDaemon;
__attribute__((constructor))
void SetPath() { // this function not static
// function def
}
and
std::string pathToDaemon; // variable not static
__attribute__((constructor))
void SetPath() { // this function not static
// function def
}
The example you see above sits in a file that is compiled into both a static object library and a DLL. The compilation process:
options for static.a: --std=C++0x -c -Os.
options for shared.so: -Wl,--whole-archive /path/to/static.a -Wl,--no-whole-archive -lz -lrt -ldl -Wl,-Bstatic -lboost_python -lboost_thread -lboost_regex -lboost_system -Wl,-Bdynamic -fPIC -shared -o mymodule.so [a plethora of more objects which wrap into python the static stuff]
The hoops I have to jump through in the bigger project make a much more complicated build process than my little test driver program requires. This makes me think that the problem lies there. Can anyone please shed some light on what I'm missing?
Thanks,
Andy
I think it's worth giving the answer that I've found. The problem was due to the complex nature of the shared library loading. I discovered after some digging that I could reproduce the problem in my test bed program when compiling the code with optimizations enabled. That confirmed the hypothesis that the variable truly didn't exist when it was being accessed by the constructor function.
GCC includes some extra tools for C++ which allow for developers to force certain things to happen at particular times during code initialization. More precisely, it allows for certain things to take place in particular order rather than particular times.
For example:
int someVar(55) __attribute__((init_priority(101)));
// This function is a lower priority than the initialization above
// so, this will happen *after*
__attribute__((constructor(102)))
void SomeFunc() {
// do important stuff
if(someVar == 55) {
// do something here that important too
someVar = 44;
}
}
I was able to use these tools successfully in the test bed program, even with optimizations enabled. The happiness which ensued was short-lived when I applied them to my much larger library. Ultimately, the problem was due to the nature of such a large amount of code and the problematic way in which the variables are brought into existence. It just wasn't reliable to use these mechanisms.
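For reference, the GCC documentation quoted further down this page describes init_priority for objects defined at namespace scope, and GCC accepts it on objects of class type. A minimal sketch for GCC-compatible compilers follows; Recorder, init_log(), and priority_order_respected() are hypothetical names used only to make the order visible.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Construct-on-first-use log, so it is valid no matter who touches it first.
std::vector<std::string>& init_log() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical class that records when its constructor runs.
struct Recorder {
    explicit Recorder(const char* label) { init_log().push_back(label); }
};

// Defined first, but given the larger number (lower priority).
Recorder later __attribute__((init_priority(300))) = Recorder("later");
// Defined second, but given the smaller number (higher priority).
Recorder sooner __attribute__((init_priority(200))) = Recorder("sooner");

// True if the priority, not the definition order, decided who ran first.
bool priority_order_respected() {
    const std::vector<std::string>& log = init_log();
    assert(log.size() == 2);
    return log[0] == "sooner" && log[1] == "later";
}
```

This is a compiler extension, not standard C++, so it will not build with every toolchain, and as the answer notes it did not prove dependable in a large library.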
Since I wanted to avoid repeated calls for evaluating the path, i.e.
std::string GetPath() {
Dl_info dl_info;
dladdr((void*)GetPath, &dl_info);
// do wonderful stuff to find the path
return dl_info.dli_fname;
}
The solution turned out to be much simpler than I was trying to make it:
namespace {
std::string PathToProgram() {
Dl_info dl_info;
dladdr((void*)PathToProgram, &dl_info);
std::string pathVar(dl_info.dli_fname);
// do amazing things to find the last slash and remove the shared object
// from that path and append the name of the external daemon
return pathVar;
}
std::string DaemonPath() {
// I'd forgotten that static variables, like this, are initialized
// only once due to compiler magic.
static const std::string pathToDaemon(PathToProgram());
return pathToDaemon;
}
}
As you can see, exactly what I wanted with less confusion. Everything happens only once, except calls to DaemonPath(), and everything remains within the translation unit.
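The part elided as "do amazing things" follows the same logic as the SetPath() function in the question: cut the path after the last slash and append the daemon's name. A sketch of that step as a standalone helper, where the name SiblingPath and its signature are assumptions made for illustration:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: given the full path of a loaded module, return the path
// of a file named `sibling` in the same directory.
std::string SiblingPath(const std::string& modulePath, const std::string& sibling) {
    assert(!sibling.empty());
    const std::string::size_type lastSlash = modulePath.find_last_of('/');
    if (lastSlash == std::string::npos) {
        // No directory component: the sibling is in the current directory.
        return sibling;
    }
    // Keep everything up to and including the slash, then append the name.
    return modulePath.substr(0, lastSlash + 1) + sibling;
}
```

For example, SiblingPath("/opt/app/lib/mymodule.so", "progd") yields "/opt/app/lib/progd". Keeping the position in a std::string::size_type also avoids squeezing the result of find_last_of() into an int.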
I hope this helps someone who runs into this in the future.
Andy
Maybe you could try running valgrind on your program
In your self-posted solution above, you have changed your »interface« (for the code that reads your pathToDaemon / DaemonPath()) from »accessing a file-scoped variable« to »calling a function in an anonymous namespace«. So far, OK.
But the implementation of DaemonPath() is not thread-safe unless your compiler implements the C++11 guarantee for function-local statics. I thought that thread safety matters because you wrote »-lboost_thread« in your question, so you may want to make the implementation thread-safe. There are many discussions of and solutions for the singleton pattern and thread safety available, e.g.:
Article from Scott Meyers
Stack Overflow
The fact is that your DaemonPath() will be invoked (maybe long) after loading of the library is done. Note that only the first call to the singleton is critical in a multithreaded environment.
As an alternative, you may add a simple »early« call to your DaemonPath() function like this:
namespace {
std::string PathToProgram() {
... your code from above ...
}
std::string DaemonPath() {
... your code from above ...
}
__attribute__((constructor)) void MyPathInit() {
DaemonPath();
}
}
or in a more portable way like this:
namespace {
std::string PathToProgram() {
... your code from above ...
}
std::string DaemonPath() {
... your code from above ...
}
class MyPathInit {
public:
MyPathInit() {
DaemonPath();
}
} myPathInit;
}
Of course, this approach doesn't make your singleton pattern thread-safe. But sometimes there are situations where we can be sure that there are no concurrent thread accesses (e.g. at initialization time, when the shared library is loading). If these conditions hold for you, this approach could be a way to bypass the thread-safety problem without using thread locking (mutex...).

static variable destructor invoked before library destructor

Consider the following code for a dynamically loaded library built with g++-4.7 on Linux with -fPIC and linked with the -rdynamic option:
struct Wrapper
{
libraryUnregisterCbMap_t instance;
Wrapper() : instance() { HDebugLog("Wrapper CTOR!");}
~Wrapper() { HDebugLog("Wrapper DESTRUCTOR!"); }
};
inline libraryUnregisterCbMap_t& getLibraryUnregisterMap()
{
static Wrapper unregisterLibraryMap;
HDebugLog("getLibraryUnregisterMap: we have " <<unregisterLibraryMap.instance.size() << " elements. the address of the map is " << &unregisterLibraryMap.instance);
return unregisterLibraryMap.instance;
}
void registerLibrary(callbackContainer_t* p)
{
auto& map = getLibraryUnregisterMap();
}
void unregisterLibrary()
{
auto& map = getLibraryUnregisterMap();
}
void __attribute__ ((constructor)) library_init()
{
static callbackContainer_t cbContainer;
HDebugLog("Library constructor: address of static cbContainer is: " << &cbContainer );
registerLibrary( &cbContainer);
}
void __attribute__ ((destructor)) library_fini()
{ unregisterLibrary(); }
The interesting/annoying part for me is that library_fini() is not called when I call lt_dlclose, so it seems rather useless for finalisation: when I load this module during a run, the destructor of the Wrapper instance runs before the call to library_fini(). Needless to say, this default behaviour does not make any sense to me.
How do I change this behaviour? I need to finalise my static data in my library finalisation routine. Why is lt_dlclose not invoking library_fini()?
Let me first admit that I'm out of my depth here. That said, googling turned up a thread that, at least to my limited knowledge, appears to address a similar problem to yours:
http://lists.apple.com/archives/xcode-users/2005/Aug/msg00133.html
Do you happen to be doing whatever you're doing on OSX? There's something in the thread (maybe the second follow-up) about OSX behaving differently, i.e. not calling destructors but just setting memory to be free.
Apologies if the link isn't useful. Just thought I'd have a go since no one else has answered at this point.
Edit:
Again, out of my depth - but I found two more links that might be relevant:
http://phoxis.org/2011/04/27/c-language-constructors-and-destructors-with-gcc/
In the comments, people mention having problems with destructors when they use exit, and having to use the atexit function to overcome these problems.
http://clang-developers.42468.n3.nabble.com/Priority-settings-for-static-variables-and-attribute-destructor-td4030466.html
A global resource is destructed before the __attribute__((destructor)) function is called. The suggested solution is to use priorities with the destructor.

c++ static-init-fiasco example

I'm learning C++ with the help of "Thinking in C++" by Bruce Eckel, and I'm stuck on exercise 32 in chapter 10.
The question is how to change the link order so that Mirror::test() called on object m5 returns false.
Here is my code.
mirror.h:
#ifndef MIRROR_H_
#define MIRROR_H_
class Mirror {
public:
Mirror() {logic_ = true; self_ = 0;};
Mirror(Mirror *ptr) {self_ = ptr; logic_ = false;};
bool test() {
if (self_ != 0) {
return self_->test();
} else {
return logic_;
}
};
private:
bool logic_;
Mirror *self_;
};
#endif // MIRROR_H_
task
one.cpp
#include "mirror.h"
Mirror m1;
two.cpp
#include "mirror.h"
extern Mirror m1;
Mirror m2 (&m1);
three.cpp
#include "mirror.h"
extern Mirror m2;
Mirror m3 (&m2);
and so on. Finally,
five.cpp
#include "mirror.h"
#include <iostream>
extern Mirror m4;
Mirror m5 (&m4);
int main(int argc, char* argv[]) {
std::cout << m5.test() << std::endl;
}
m5.test() returns true. The task says that I should change the linking order so that m5.test() returns false. I have tried to use:
init_priority (priority)
In Standard C++, objects defined at namespace scope are guaranteed to be initialized in an order in strict accordance with that of their
definitions in a given translation unit. No guarantee is made for
initializations across translation units. However, GNU C++ allows
users to control the order of initialization of objects defined at
namespace scope with the init_priority attribute by specifying a
relative priority, a constant integral expression currently bounded
between 101 and 65535 inclusive. Lower numbers indicate a higher
priority.
but no luck.
Full exercise text:
In a header file, create a class Mirror that contains two data
members: a pointer to a Mirror object and a bool. Give it two
constructors: the default constructor initializes the bool to true and
the Mirror pointer to zero. The second constructor takes as an
argument a pointer to a Mirror object, which it assigns to the
object’s internal pointer; it sets the bool to false. Add a member
function test( ): if the object’s pointer is nonzero, it returns the
value of test( ) called through the pointer. If the pointer is zero,
it returns the bool. Now create five cpp files, each of which includes
the Mirror header. The first cpp file defines a global Mirror object
using the default constructor. The second file declares the object in
the first file as extern, and defines a global Mirror object using the
second constructor, with a pointer to the first object. Keep doing
this until you reach the last file, which will also contain a global
object definition. In that file, main( ) should call the test( )
function and report the result. If the result is true, find out how to
change the linking order for your linker and change it until the
result is false.
You'll need to change the order of the object files when passing them to the linker. This works reasonably well for the top-level code, although different compilers use different approaches, i.e., it isn't portable. Also, for libraries you generally can't control the order in which the objects are included. For example, if you have
// file1.cpp
int main() {
}
// file2.cpp
#include <iostream>
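// the explicit cast is needed from C++11 on, where the stream-to-bool conversion is explicit
static bool value = static_cast<bool>(std::cout << "file2.cpp\n");
// file3.cpp
#include <iostream>
static bool value = static_cast<bool>(std::cout << "file3.cpp\n");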
... and you link two programs like this:
g++ -o tst1 file1.cpp file2.cpp file3.cpp
g++ -o tst2 file1.cpp file3.cpp file2.cpp
you will get different output for tst1 and tst2, e.g.:
$ ./tst1
file2.cpp
file3.cpp
$ ./tst2
file3.cpp
file2.cpp
The overall moral is: don't do it. That is: don't use global objects. If you feel you absolutely need to use global objects, encapsulate them into functions, e.g.:
Type& global_value() {
static Type value; // possibly with constructor arguments
return value;
}
This way, value is initialized the first time it is accessed and there is no way to access it before it has been constructed. If you encapsulate all objects like this, you can guarantee that they are constructed in an appropriate order (unless you have a cyclic dependency, in which case it can't be made to work and you should seriously rethink your design). The above approach of encapsulating objects in functions is, unfortunately, not thread-safe in C++ 2003. It is thread-safe in C++ 2011, however. Still, the use of global variables is generally problematic and you definitely want to minimize it.
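Applied to the Mirror exercise, the encapsulation could be sketched as follows. The accessor names m1(), m2(), m3() and the helper chain_is_intact() are illustrative, the chain is shortened to three links, and everything is condensed into one file, whereas the exercise spreads the objects over five.

```cpp
#include <cassert>

class Mirror {
public:
    Mirror() : logic_(true), self_(nullptr) {}
    explicit Mirror(Mirror* ptr) : logic_(false), self_(ptr) {}
    bool test() const { return self_ != nullptr ? self_->test() : logic_; }
private:
    bool logic_;
    Mirror* self_;
};

// Each accessor constructs its object on first use, and building one link
// forces the previous link to be built first.
Mirror& m1() { static Mirror m; return m; }
Mirror& m2() { static Mirror m(&m1()); return m; }
Mirror& m3() { static Mirror m(&m2()); return m; }

// Evaluated during static initialization, before main() starts.
const bool result_before_main = m3().test();

// True if the chain still resolves to the first object's flag.
bool chain_is_intact() {
    assert(result_before_main);
    return m3().test();
}
```

With the objects behind accessors, the result is true even when it is computed before main(), because each constructor pulls in the object it depends on instead of relying on link order.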
I was struggling with this exercise too.
I managed to write a small Python script to prepare makefile entries that link and test final executable using all possible permutation of object files:
import itertools
for perm in itertools.permutations([1, 2, 3, 4, 5]):
    print('\tg++ u0{0}.o u0{1}.o u0{2}.o u0{3}.o u0{4}.o -o $# && ./main.exe'.format(*perm))
After executing my make process, it turned out that every possible ordering yielded true.
This is because all global (i.e. static) variables are guaranteed to be initialized before the main function is entered.
I defined a global bool variable that holds result from a test() function before main, something like this:
#include "mirror.h"
#include <iostream>
extern Mirror m4;
Mirror m5 (&m4);
bool result = m5.test();
int main(int argc, char* argv[]) {
std::cout << result << std::endl;
}
Bingo! Some of the object permutations yielded false in the program's output.
All static variables are initialized with zeroes before any of their constructors are called. In this exercise, the order in which the constructors are called is the clue.
If any object in the dependency chain has not yet been initialized by its constructor when the value of the result variable is established, the result is false (self_ is 0 and logic_ is false, so the test function returns false).
Because the result variable is evaluated before the main function is entered, that situation is possible, and the order of the object files in the linker command influences the result.
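The zero-then-construct sequence can be seen within a single translation unit using plain integers. This is a sketch with made-up names (seed, compute, observed_early, late_value). The volatile read is there because the standard allows a compiler to turn some dynamic initializations into static ones, and a volatile read cannot be folded away.

```cpp
#include <cassert>

// Reading a volatile cannot be done at compile time, so anything computed
// from it needs dynamic initialization.
volatile int seed = 42;
int compute() { return seed; }

extern int late_value;  // declared here, defined further down

// Dynamic initialization runs in definition order within a translation unit,
// so this copy is taken before late_value's initializer has run.
int observed_early = late_value;

int late_value = compute();

// True if the early copy saw the zero-initialized state.
bool saw_zero_then_real_value() {
    assert(late_value == 42);
    return observed_early == 0;
}
```

The early copy sees 0 rather than 42 for the same reason a Mirror whose constructor has not yet run reports self_ == 0 and logic_ == false.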

Problems with Static Initialization

I'm having some weird issues with static initialization. I'm using a code generator to generate structs and serialization code for a message passing system I wrote. In order to have an easy way of allocating a message based on its message ID, I have my code generator output something similar to the following for each message type:
MessageAllocator s_InputPushUserControllerMessageAlloc(INPUT_PUSH_USER_CONTROLLER_MESSAGE_ID, (AllocateMessageFunc)Create_InputPushUserControllerMessage);
The MessageAllocator class basically looks like this:
MessageAllocator::MessageAllocator( uint32_t messageTypeID, AllocateMessageFunc func )
{
if (!s_map) s_map = new std::map<uint32_t, AllocateMessageFunc>();
if (s_map->insert(std::make_pair(messageTypeID, func)).second == false)
{
//duplicate key!
ASSERT(false, L"Nooooo!");
}
s_count++;
}
MessageAllocator::~MessageAllocator()
{
s_count--;
if (s_count == 0) delete s_map;
}
where s_map and s_count are static members of MessageAllocator. This works most of the time, but sometimes messages are not added to the map. For example, this particular message is not added unless I call Create_InputPushUserControllerMessage() somewhere in my startup code, whereas other messages work fine. I thought this might have something to do with the linker incorrectly thinking the type is unreferenced and removing it, so I disabled that using the /OPT:NOREF switch (I'm using Visual Studio 2008 SP1), but that had no effect.
I'm aware of the problem of the "static initialization order fiasco" but as far as I know the order in which these objects are created shouldn't alter the result so this seems ok to me.
Any insight here would be appreciated.
Put the static into a class so that it is a static member of that class:
struct InputPushUserControllerMessageAlloc { static MessageAllocator s_obj; };
MessageAllocator InputPushUserControllerMessageAlloc::s_obj(
INPUT_PUSH_USER_CONTROLLER_MESSAGE_ID,
(AllocateMessageFunc)Create_InputPushUserControllerMessage);
The Standard allows the implementation to delay initialization of objects having namespace scope until any function or object from their translation unit is first used. If the initialization has side effects, it can't be optimized out, but that doesn't forbid delaying it.
That is not so for objects having class scope, so that might forbid the implementation from optimizing something there.
I would change s_map from a static class member into a static method member:
std::map<uint32_t,AllocateMessageFunc>& MessageAllocator::getMap()
{
// Initialized on first use and destroyed correctly on program termination.
static std::map<uint32_t,AllocateMessageFunc> s_map;
return s_map;
}
MessageAllocator::MessageAllocator( uint32_t messageTypeID, AllocateMessageFunc func )
{
if (getMap().insert(std::make_pair(messageTypeID, func)).second == false)
{
//duplicate key!
ASSERT(false, L"Nooooo!");
}
}
No need for destructor or a count.
If your global objects are in separate DLLs (or shared libs) that are lazily loaded, this may cause a problem similar to the one you describe.
You are not setting the pointer back to null.
MessageAllocator::~MessageAllocator()
{
s_count--;
if (s_count == 0)
{
delete s_map;
s_map = 0;
}
}
It turns out that the object files containing the static initializers were not included by the linker because nothing referenced any functions in them. To work around this I extern "C"-ed one of the generated functions so that it would have a predictable non-mangled name, and then forced a reference to it using a pragma like this for each message:
#pragma comment(linker, "/include:Create_GraphicsDynamicMeshCreationMessage")
which I put in the generated header file that is later included in all the other non-generated files. It's MSVC-only and kind of a hack, but I assume I can do something similar on GCC once I eventually port it.