Double initialization of a static STL container in a C++ library

Double initialization of a static STL container in a C++ library - c++

There are a few good questions and answers here around the "static initialization order fiasco", but I seem to have hit against yet another expression of it, specially ugly because it does not crash but looses and leaks data.
I have a custom C++ library and an application that links against it. There is an static STL container in the library that registers all instances of a class. Those instances happen to be static variables in the application.
As a result of the "fiasco" (I believe), we get the container filled with the application instances during application initialization, then the library gets to initialize and the container is reset (probably leaking memory), ending up only with the instances from the library.
This is how I reproduced it with simplified code:
mylib.hpp:
#include <iostream>
#include <string>
#include <vector>
using namespace std;
class MyLibClass {
static vector<string> registry;
string myname;
public:
MyLibClass(string name);
};
mylib.cpp:
#include "mylib.hpp"
vector<string> MyLibClass::registry;
MyLibClass::MyLibClass(string name)
: myname(name)
{
registry.push_back(name);
for(unsigned i=0; i<registry.size(); i++)
cout << " ["<< i <<"]=" << registry[i];
cout << endl;
}
MyLibClass l1("mylib1");
MyLibClass l2("mylib2");
MyLibClass l3("mylib3");
myapp.cpp:
#include "mylib.hpp"
MyLibClass a1("app1");
MyLibClass a2("app2");
MyLibClass a3("app3");
int main() {
cout << "main():" << endl;
MyLibClass m("main");
}
Compile the objects with:
g++ -Wall -c myapp.cpp mylib.cpp
g++ myapp.o mylib.o -o myapp1
g++ mylib.o myapp.o -o myapp2
Run myapp1:
$ ./myapp1
[0]=mylib1
[0]=mylib1 [1]=mylib2
[0]=mylib1 [1]=mylib2 [2]=mylib3
[0]=mylib1 [1]=mylib2 [2]=mylib3 [3]=app1
[0]=mylib1 [1]=mylib2 [2]=mylib3 [3]=app1 [4]=app2
[0]=mylib1 [1]=mylib2 [2]=mylib3 [3]=app1 [4]=app2 [5]=app3
main():
[0]=mylib1 [1]=mylib2 [2]=mylib3 [3]=app1 [4]=app2 [5]=app3 [6]=main
Run myapp2:
$ ./myapp2
[0]=app1
[0]=app1 [1]=app2
[0]=app1 [1]=app2 [2]=app3
[0]=mylib1
[0]=mylib1 [1]=mylib2
[0]=mylib1 [1]=mylib2 [2]=mylib3
main():
[0]=mylib1 [1]=mylib2 [2]=mylib3 [3]=main
Here comes the question, the static vector was re-initialized, or used before initialization? Is this an expected behavior?
If I 'ar' the library as 'mylib.a' (ar rcs mylib.a mylib.o), the problem does not happen, but probably because there is only one valid order to link to the .a and it is by having the library in the last place, as for myapp1 here.
But in our real application, a more complex one with many object files and a few static (.a) libraries sharing a few static registries, the problem is happening and the only way we managed to solve it so far is by applying '[10.15] How do I prevent the "static initialization order fiasco"?'.
(I am still researching in our somewhat complex build system to see if we are linking correctly).

One way to work around initialization order problems is to move the static variables from global scope to local scope.
Instead of having a registry variable within the class, put it into a function:
vector<string> & MyLibClass::GetRegistry()
{
static vector<string> registry;
return registry;
}
In the places where you would have used registry directly, have it call GetRegistry.

If you give vector<string> a custom constructor you will see, that it is indeed called only once, but in myapp2 you are using registry uninitialized first, then it gets initialized ("removing" everything that's inside) and then filled again. That it doesn't segfault is just luck :)
I can't tell which part of the standard says something about this behaviour, but IMHO you should /never/ let static variables depend on each other. You might use a Meyers singleton for example for registry.

You are using 2 known techiques.
(1) The "module/library/namespace" as a "device" pattern
(2) Custom type registration, with a static class.
Done something similar with "Object Pascal" and "Plain C". I have several files, each file working as a module / namespace, with typedefs, classes, functions.
Additionally, each "namespace" had 2 special methods (same signature or prototype), that simulate connecting a device, and disconnecting a device. Already tryed to call those methods automatically, but executing order also went wrong.
Static, Singleton classes can become a mess. I suggest, forget using macros or preprocessor/compiler and call your initialization / finalization methods yourself.
----
mylib.hpp
----
class MyLibClass {
public:
Register(string libraryName);
UnRegister(string libraryName);
};
// don't execute the "custom type registration here"
-----
mynamespace01.cpp
-----
#include "mylib.hpp"
void mynamespace01_otherstuff() { ... }
// don't execute registration
void mynamespace01_start() {
if not (MyLibClass::IsUnRegistered("mynamespace01")) MyLibClass::Register("mynamespace01");
}
void mynamespace01_finish()
{
if not (MyLibClass::IsRegistered("mynamespace01")) MyLibClass::UnRegister("mynamespace01");
}
-----
mynamespace02.cpp
-----
#include "mylib.hpp"
// check, "2" uses "1" !!!
#include "mynamespace01.hpp"
void mynamespace02_otherstuff() { ... }
// don't execute registration !!!
void mynamespace02_start() {
// check, "2" uses "1" !!!
void mynamespace01_start();
if not (MyLibClass::IsUnRegistered("mynamespace01")) MyLibClass::Register("mynamespace02");
void mynamespace02_start();
}
void mynamespace02_finish(){
void mynamespace02_finish();
if not (MyLibClass::IsRegistered("mynamespace02")) MyLibClass::UnRegister("mynamespace02");
// check, "2" uses "1" !!!
void mynamespace02_start();
}
-----
myprogram.cpp
-----
#include "mynamespace01.hpp"
#include "mynamespace02.hpp"
void myprogram_otherstuff() { ... }
// don't execute registration !!!
void myprogram_start() {
// check, "2" uses "1" !!!
mynamespace01_start();
mynamespace02_start();
if not (MyLibClass::IsUnRegistered("myprogram")) MyLibClass::Register("myprogram");
}
void myprogram_finish() {
if not (MyLibClass::IsRegistered("myprogram")) MyLibClass::UnRegister("myprogram");
// check, "2" uses "1" !!!
mynamespace01_finish();
mynamespace02_finish();
}
void main () {
// all registration goes here !!!:
// "device" initializers order coded by hand:
myprogram_start();
// other code;
// "device" finalizers order inverse coded by hand:
myprogram_finish();
}
-----
Check that this code is more complex and verbose that yours,
but, in my experience, is more stable.
I also add "finalizer" to "initializer", and replace identifier for "Register".
Good Luck.

Related

Registering file handlers in code running prior to main()

I'm looking into ways to prevent unnecessary clutter in setup code in main() as well as various other places. I often have tons of setup code that registers itself with some factory. A standard example is e.g. handlers for various file types.
To avoid having to write this code and instead just make handlers magically work if linked into the application, I figured I could replace the code by something like the following:
test.cc:
int main() {
return 0;
}
loader.h:
#ifndef LOADER_H_
#define LOADER_H_
#include <functional>
namespace loader {
class Loader {
public:
Loader(std::function<void()> f);
};
} // namespace loader
#define REGISTER_HANDLER(name, f) \
namespace { \
::loader::Loader _macro_internal_ ## name(f); \
}
#endif // LOADER_H_
loader.cc:
#include "loader.h"
#include <iostream>
namespace loader {
Loader::Loader(std::function<void()> f) { f(); }
} // namespace loader
a.cc:
#include <iostream>
#include "loader.h"
REGISTER_HANDLER(a, []() {
std::cout << "hello from a" << std::endl;
})
The idea here is that a.cc would in a real application e.g. call some method where it registers it self as a handler for a certain file type. Compiling the code with c++ -std=c++11 test.cc loader.cc a.cc creates a binary that prints "hello from a" while c++ -std=c++11 test.cc loader.cc stays silent.
I'm wondering if there's something subtle that I might need to be careful with? For example, if someone creates complex objects in the lambda that is run here, I assume weird things can happen during cleanup for example in a multithreaded application?

You wrote:
... unnecessary clutter in setup code in main() ...
int main() {
return 0;
}
This is not preventing unnecessary clutter. This is hiding your initializations. They still occur, but now you have to chase after them. That's really not the way to do it. Also, it will force the use of a lot of global state - in many independent global variables, most probably - which is also a bad thing. Instead, consider writing something like:
class my_app_state { /* ... */ };
my_app_state initialize(/* perhaps with argc and argv here? */) {
//
// Your "unnecessary" clutter goes here...
//
return whatever;
}
int main() {
auto app_state = initialize();
//
// do stuff involving the app_state...
//
}
and don't try to "game" the program loader.

This approach is not guaranteed to work:
[basic.start.dynamic]/4 It is implementation-defined whether the dynamic initialization of a non-local non-inline variable with static storage duration is sequenced before the first statement of main or is deferred. If it is deferred, it strongly happens before any non-initialization odr-use of any non-inline function or non-inline variable defined in the same translation unit as the variable to be initialized.
Thus, the initialization of _macro_internal_a may be deferred until something in a.cc is used. And since nothing in a.cc is in fact used, the initialization may not be performed at all.
In practice, linkers tend to discard object files that do not appear to be referenced by anything in the program (especially when those files come from libraries).

Using a variable from a different file before main

I'm having a little trouble understanding why my code works the way it does (or doesn't work the way it ought to).
I'm trying to write (in C++) an interface that allows to use some functions operating on unordered_map from the standard template library in C. However, I'd also like to write a namespace that allows to use them in C++ as well.
What I'm asking is not how this can be thone in a different way, but why it works the way it does;
Let's say for a while that I need only two functions: to add elements and write the size of the map. The header is the following:
//project.h
#ifdef __cplusplus
extern "C" {
#endif
void add(int, int);
void give_size();
#ifdef __cplusplus
}
#endif
The source code:
//project.cc
#include <unordered_map>
#include <iostream>
#include "project.h"
using namespace std;
unordered_map<int, int> my_map;
void add(int arg, int val) {
my_map.insert ({{arg, val}});
}
void give_size() {
cout << my_map.size() << endl;
}
The interface for C++:
//cproject
namespace pro {
#include "project.h"
}
and a test:
//test.cc
#include "cproject"
namespace {
unsigned long test() {
::pro::add(1,2);
::pro::add(3,4);
return 0;
}
unsigned long dummy = test();
}
int main() {
::pro::give_size();
return 0;
}
And, for completeness, the Makefile:
g++ -Wall -std=c++11 -c -o project.o project.cc
g++ -Wall -std=c++11 -c -o test.o test.cc
g++ test.o project.o -o test
The problem is, of course, that running test outputs 0 instead of 2 - which means that the map disappears somewhere before the test's main.
I was thinking it might be some sort of static initialization order fiasco, however I don't find the attached solution very helpful, since I don't explicitly call objects from the project.cc file in test.cc.
I would appreciate any help with that issue.

Yes, it's the poorly named static initialization order fiasco. Poorly named because the C++ standard calls it "dynamic initialization"; "static initialization" is something different.
which means that the map disappears somewhere before the test's main
Not quite. The problem is that you use the map before it's there, adding values to it. Now it happens that for some map implementations, a zero-initialized state (and that is what is done to all global variables before any dynamic initializers run) is the same as what the default constructor does. So the code in test is executed first and tries to add things to the map, and the map's insert function works just fine, creating nodes, setting internal pointers to the nodes, etc.
Then the actual default constructor of the map runs, resetting those pointers to null, leaking and forgetting all the nodes you created. Your previous insertions are undone, and the map is empty again.
The solutions provided in your link would work; you implicitly call objects through the free functions, even if you don't do it explicitly. There's no real distinction. You still replace the global my_map in project.cc with a function that returns a reference to a function-level static (or pointer, depending on which exact solution you choose). The only difference is that you call this function not from within test.cc, but from within add and give_size.
As a side note, this whole global state thing is generally rather suspect. It's not thread-safe, and it makes it harder to understand what the program is doing. Consider not doing it this way at all.

How to trigger c'tors of globals in executable shared library (.so)?

I have a shared library that I would like to make executable, similar to libc. When the library executes, I would like it to dump a list of the names of classes that are registered with a particular abstract factory (this is C++). I use the standard technique of registering classes with the factory through the initialization/construction of global variables.
There are several tutorials on how to make shared libraries executable (here and here, for example). It's relatively straight forward. However, when I try it out, I find that the entry point is executed before any constructors of globals are called. In my case, this means that no classes have registered with my factory, so I have no information to print out.
I would like to either execute the entry point after the constructors have been called, or learn how to trigger construction myself from my entry-point function. Is this possible? Does anyone know how to go about doing this? Perhaps there is an internal libc function that I can extern and call?

I believe that I have come across a workable solution. It is based upon techniques for creating -nostdlib executables (such as OS kernels). However, our shared library still links the standard libraries in this case. I found this RaspberryPi forum thread especially useful.
The solution is to manually execute the function pointers stored in the shared library's embedded init_array. The trick is to use a linker script to define pointers for accessing this array. We then extern these pointers in the program code. We can repeat the process for executing destructors as well.
In test.cpp, we have the following:
#include <cstdio>
#include <unistd.h>
class Test
{
public:
Test() { printf("Hello world!\n"); }
~Test() { printf("Goodbye world!\n"); }
};
Test myGlobal; // a global class instance
extern "C"
{
// ensures linker generates executable .so (assuming x86_64)
extern const char test_interp[] __attribute__((section(".interp"))) =
"/lib64/ld-linux-x86-64.so.2";
// function walks the init_array and executes constructors
void manual_init(void)
{
typedef void (*constructor_t)(void);
// _manual_init_array_start and _manual_init_array_end
// are created by linker script.
extern constructor_t _manual_init_array_start[];
extern constructor_t _manual_init_array_end[];
constructor_t *fn = _manual_init_array_start;
while(fn < _manual_init_array_end)
{
(*fn++)();
}
}
// function walks the fini_array and executes destructors
void manual_fini(void)
{
typedef void (*destructor_t)(void);
// _manual_init_array_start and _manual_init_array_end
// are created by linker script.
extern destructor_t _manual_fini_array_start[];
extern destructor_t _manual_fini_array_end[];
destructor_t *fn = _manual_fini_array_start;
while(fn < _manual_fini_array_end)
{
(*fn++)();
}
}
// entry point for libtest.so when it is executed
void lib_entry(void)
{
manual_init(); // call ctors
printf("Entry point of the library!\n");
manual_fini(); // call dtors
_exit(0);
}
We need to manually define the _manual* pointers through a linker script. We must use the INSERT keyword so that don't entirely replace ld's default linker script. In a file test.ld, we have:
SECTIONS
{
.init_array : ALIGN(4)
{
_manual_init_array_start = .;
*(.init_array)
*(SORT_BY_INIT_PRIORITY(.init_array.*))
_manual_init_array_end = .;
}
}
INSERT AFTER .init; /* use INSERT so we don't override default linker script */
SECTIONS
{
.fini_array : ALIGN(4)
{
_manual_fini_array_start = .;
*(.fini_array)
*(SORT_BY_INIT_PRIORITY(.fini_array.*))
_manual_fini_array_end = .;
}
}
INSERT AFTER .fini; /* use INSERT so we don't override default linker script */
We must provide two parameters to the linker: (1) our linker script and (2) the entry point to our library. To compile, we do the following:
g++ test.cpp -fPIC -Wl,-T,test.ld -Wl,-e,lib_entry -shared -o libtest.so
We get the following output when libtest.so is executed at the commandline:
$ ./libtest.so
Hello world!
Entry point of the library!
Goodbye world!
I've also tried defining globals in .cpp files other than test.cpp that are also compiled into the shared library. The ctor calls for these globals are included in init_array, so they are called by manual_init(). The library works "like normal" when it is loaded as a regular shared library.

Calling function from different files in C++ error

I would like to use one function from Stats.cpp in Application.cpp. Here are my code snippets:
In Stats.h:
#ifndef STATS_H
#define STATS_H
class Stats
{
public:
void generateStat(int i);
};
#endif
In Stats.cpp:
#include Stats.h
void generateStat(int i)
{
//some process code here
}
In Application.cpp:
int main()
{
generateStat(10);
}
I get an "unresolved external symbol" error however I don't know what I else I would need to include in order for Application.cpp. Any thoughts?

In Stats.cpp
you need to define generateStat like following :
#include Stats.h
void Stats:: generateStat(int i) // Notice the syntax, use of :: operator
{
//some process code here
}
Then create object of class Stats, use it to call the public member function generateStat
Stats s;
s.generateStat( 10 ) ;
Build the application using :
g++ -o stats Stats.cpp Application.cpp -I.

generateStat is part of your Stats class. You need to instantiate a Stats object (along with the necessary includes for Stats.h in your main class)
For example,
Stats stat;
stat.generateStat(i);
Also, your function definition needs to include the class name Stats::generateStat.

The same error msg occured 2 weeks ago (at work).
At first glance --- Try:
void Stats::generateStat(int i) {
//some process code here }
The class name was missing. Hence, unresolved.
btw Concerning your header --- another issue, this #ifndef directive should not be necessary cause you should declare Stats only once in a namespace.
#ifndef CLASS_H
#define CLASS_H
#include "Class.h"
#endif
This is a generic example - Usable in cpp files.
EDIT: Now, I saw your invocation (main method in your case). You need an object instance to invoke your method.
Stats* stats = new Stats(); //add a default constructor if not done
stats->generateStat(77);
// any other stats stuff ......
// in posterior to the last use
delete(stats);
In your header:
Stats::Stats(){}; //with an empty body - no need to write it again in the cpp file

How to auto-include all headers in directory

I'm going through exercises of a C++ book. For each exercise I want to minimize the boilerplate code I have to write. I've set up my project a certain way but it doesn't seem right, and requires too many changes.
Right now I have a single main.cpp file with the following:
#include "e0614.h"
int main()
{
E0614 ex;
ex.solve();
}
Each time I create a new class from an exercise, I have to come and modify this file to change the name of the included header as well as the class i'm instantiating.
So my questions are:
Can I include all headers in the directory so at least I don't have to change the #include line?
Better yet, can I rewrite my solution so that I don't even have to touch main.cpp, without having one file with all the code for every exercise in it?
Update:
I ended up following Poita_'s advice to generate main.cpp via a script.
Since I'm using an IDE (Visual Studio), I wanted this integrated with it, so did a bit of research on how. For those interested in how, read on (it was fairly, but not entirely, straightforward).
Visual Studio lets you use an external tool via the Tools -> External Tools menu, and contains a bunch of pre-defined variables, such as $(ItemFileName), which can be passed on to the tool. So in this instance I used a simple batch file, and it gets passed the name of the currently selected file in Visual Studio.
To add that tool to the toolbar, right click on the toolbar, select Customize -> Commands -> Tools, and select the "External Command X" and drag it to the toolbar. Substitute X with the number corresponding to the tool you created. My installation contained 5 default pre-existing tools listed in Tools -> External Tools, so the one I created was tool number 6. You have to figure out this number as it is not shown. You can then assign an icon to the shortcut (it's the BuildMain command shown below):

No. You have to include them all if that's what you want to do.
No. At least, not in a way that's actually going to save typing.
Of course, you could write a script to create main.cpp for you...

If you build your code using make, you should be able to do this.
Can I include all headers in the directory so at least I don't have to change the #include line?
Change your include line to something like #include <all_headers.h>. Now, you can let your Makefile auto-generate all_headers.h with a target like:
all_headers.h:
for i in `ls *.h`; do echo "#include <$i>" >>all_headers.h; done
Make sure that all_headers.h is getting deleted when you 'make clean'.
Better yet, can I rewrite my solution so that I don't even have to touch main.cpp,
without having one file with all the code for every exercise in it?
You can do this if you abstract away your class with a typedef. In your example, change your class name from E0614 to myClass (or something). Now, add a line to your Makefile underneath the for loop above that says echo "typedef "$MY_TYPE" myClass;" >>all_headers.h. When you build your program, invoke 'make' with something like make MY_TYPE=E0614 and your typedef will be automatically filled in with the class you are wanting to test.

If you're on Unix system, you can have a softlink that points to the latest excercise.
ln -s e0615.h latest.h
and name your class E instead of E0614, of course
P.S. To the best of my knowledge, you can't do #include xxx*

Don't use one main.cpp which you modify for each exercise. This solution makes use of make's builtin rules, so you only have to type make e0614 and it will generate e0614.cpp, compile, and link it. You can customize each .cpp file (they won't be regenerated as written below) and maintain all of that history to refer to as you complete exercises, rather than erasing it as you move from one to the next. (You should also use source control, such as Mercurial.)
Makefile
e%.cpp:
./gen_ex_cpp $# > $#
You can generate boilerplate code with scripts, because you don't want it to be tedious either. There are several options for these scripts—and I use a variety of languages including C++, Python, and shell—but the Python below is short and should be simple and clear enough here.
Sample generate script
#!/usr/bin/python
import sys
args = sys.argv[1:]
if not args:
sys.exit("expected filename")
name = args.pop(0).partition(".")[0]
if args:
sys.exit("unexpected args")
upper_name = name.upper()
print """
#include "%(name)s.hpp"
int main() {
%(upper_name)s ex;
ex.solve();
return 0;
}
""" % locals()

Make a master include file containing the names of all the headers you want.
It's a really bad idea to include *, even if you could.

You could use conditional compilation for the class name by using concatenation.
// Somewhere in your other files
define CLASS_NUMBER E0614
// in main.cpp
#define ENTERCLASSNUMBER(num) \
##num## ex;
// in main()
ENTERCLASSNUMBER(CLASS_NUMBER)
Don't know about the includes though. As suggested above, a script might be the best option.

writing a makefile rule to pass the name of the executable as a -DHEADERFILE=something parameter to the compiler shouldn't be difficult at all. Something like:
%.exe : %.h %.cpp main.cpp
gcc -o $< -DHEADER_FILE=$<F $>
OTOH, I don't know if #include does macro expansion on the filename.

sed -i 's/\<\\([eE]\\)[0-9]+\\>/\19999/' main.cpp
Replace 9999 with the required number. There might be better ways.

Why not using object mechanisms ?
You can use an Exemplar strategy for this.
class BaseExercise
{
public:
static bool Add(BaseExercise* b) { Collection().push_back(b); return true; }
static size_t Solve() {
size_t nbErrors = 0;
for(collections_type::const_iterator it = Collection().begin(), end = Collection().end(); it != end; ++it)
nbErrors += it->solve();
return nbErrors;
}
size_t solve() const
{
try {
this->solveImpl();
return 0;
} catch(std::exception& e) {
std::cout << mName << " - end - " << e.what() << std::endl;
return 1;
}
}
protected:
explicit BaseExercise(const char* name): mName(name)
{
}
private:
typedef std::vector<BaseExercise*> collection_type;
static collection_type& Collection() { collection_type MCollection; return MCollection; }
virtual void solveImpl() const = 0;
const char* mName;
}; // class BaseExercise
template <class T>
class BaseExerciseT: public BaseExercise
{
protected:
explicit BaseExerciseT(const char* b): BaseExercise(b) {
static bool MRegistered = BaseExercise::Add(this);
}
};
Okay, that's the base.
// Exercise007.h
#include "baseExercise.h"
class Exercise007: public BaseExerciseT<Exercise007>
{
public:
Exercise007(): BaseExerciseT<Exercise007>("Exercise007") {}
private:
virtual void solveImpl() const { ... }
};
// Exercise007.cpp
Exercise007 gExemplar007;
And for main
// main.cpp
#include "baseExercise.h"
int main(int argc, char* argv[])
{
size_t nbErrors = BaseExercise::Solve();
if (nbErrors) std::cout << nbErrors << " errors" << std::endl;
return nbErrors;
}
And here, you don't need any script ;)

try this:-
#ifndef a_h
#define a_h
#include <iostream>
#include <conio.h>
#incl....as many u like
class a{
f1();//leave it blank
int d;
}
#endif //save this as a.h
later
include this in ur main program that is cpp file
#include "a.h"
...your program

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Double initialization of a static STL container in a C++ library - c++

Related

Registering file handlers in code running prior to main()

Using a variable from a different file before main

How to trigger c'tors of globals in executable shared library (.so)?

Calling function from different files in C++ error

How to auto-include all headers in directory

Categories

Resources