Purpose of class ExplicitInit in objc runtime source code

Purpose of class ExplicitInit in objc runtime source code - c++

I'm studying class ExplicitInit in objc runtime source code. I know that class ExplicitInit is used in static objc::ExplicitInit<StripedMap<SideTable>> SideTablesMap;. But why ExplicitInit is necessary? StripedMap is good enough to store SideTable.
I think the comments for class ExplicitInit in DenseMapExtras.h is the key to understand why ExplicitInit is necessary. But I can't understand the comments because of my poor C++ knowledge.
the comments show below:
// We cannot use a C++ static initializer to initialize certain globals because
// libc calls us before our C++ initializers run. We also don't want a global
// pointer to some globals because of the extra indirection.
//
// ExplicitInit / LazyInit wrap doing it the hard way.
there are three sentences in comments above, but I can't understand them all. Can anybody help me explain them?
And don't forget the first question: why ExplicitInit is necessary? 😀😂

We cannot use a C++ static initializer to initialize certain globals because libc calls us before our C++ initializers run.
There is some deep dyld and Mach-O voodoo that I don't understand when it comes to the Objective-C runtime! This was a fun question to try dig into.
The function _objc_init() (in objc-os.mm) performs:
Bootstrap initialization. Registers our image notifier with dyld.
Called by libSystem BEFORE library initialization time
_objc_init() calls runtime_init(), a portion of which is objc::allocatedClasses.init();. allocatedClasses is declared as static ExplicitInitDenseSet<Class> allocatedClasses; in objc-runtime-new.mm.
So it appears libSystem calls _objc_init() which in turns calls runtime_init() which in turn uses a C++ static variable that hasn't yet been constructed because the library itself hasn't been initialized, hence the need for ExplicitInit.
We also don't want a global pointer to some globals because of the extra indirection.
I don't know why this design wasn't chosen. Perhaps for cache locality?
ExplicitInit / LazyInit wrap doing it the hard way.
The init() methods in these classes use placement new to explicitly initialize the underlying type into the space reserved for it in _storage. This would normally happen automatically during static initialization if the type was instantiated directly. In this case space is reserved for the underlying object so it may be explicitly initialized by calling init() which constructs the object.

Related

Static variable from a static linked lib used before its construction [duplicate]

When I use static variables in C++, I often end up wanting to initialize one variable passing another to its constructor. In other words, I want to create static instances that depend on each other.
Within a single .cpp or .h file this is not a problem: the instances will be created in the order they are declared. However, when you want to initialize a static instance with an instance in another compilation unit, the order seems impossible to specify. The result is that, depending on the weather, it can happen that the instance that depends on another is constructed, and only afterwards the other instance is constructed. The result is that the first instance is initialized incorrectly.
Does anyone know how to ensure that static objects are created in the correct order? I have searched a long time for a solution, trying all of them (including the Schwarz Counter solution), but I begin to doubt there is one that really works.
One possibility is the trick with the static function member:
Type& globalObject()
{
static Type theOneAndOnlyInstance;
return theOneAndOnlyInstance;
}
Indeed, this does work. Regrettably, you have to write globalObject().MemberFunction(), instead of globalObject.MemberFunction(), resulting in somewhat confusing and inelegant client code.
Update: Thank you for your reactions. Regrettably, it indeed seems like I have answered my own question. I guess I'll have to learn to live with it...

You have answered your own question. Static initialization order is undefined, and the most elegant way around it (while still doing static initialization i.e. not refactoring it away completely) is to wrap the initialization in a function.
Read the C++ FAQ items starting from https://isocpp.org/wiki/faq/ctors#static-init-order

Maybe you should reconsider whether you need so many global static variables. While they can sometimes be useful, often it's much simpler to refactor them to a smaller local scope, especially if you find that some static variables depend on others.
But you're right, there's no way to ensure a particular order of initialization, and so if your heart is set on it, keeping the initialization in a function, like you mentioned, is probably the simplest way.

Most compilers (linkers) actually do support a (non-portable) way of specifying the order. For example, with visual studio you can use the init_seg pragma to arrange the initialization into several different groups. AFAIK there is no way to guarantee order WITHIN each group. Since this is non-portable you may want to consider if you can fix your design to not require it, but the option is out there.

Indeed, this does work. Regrettably, you have to write globalObject().MemberFunction(), instead of globalObject.MemberFunction(), resulting in somewhat confusing and inelegant client code.
But the most important thing is that it works, and that it is failure proof, ie. it is not easy to bypass the correct usage.
Program correctness should be your first priority. Also, IMHO, the () above is purely stylistic - ie. completely unimportant.
Depending on your platform, be careful of too much dynamic initialization. There is a relatively small amount of clean up that can take place for dynamic initializers (see here). You can solve this problem using a global object container that contains members different global objects. You therefore have:
Globals & getGlobals ()
{
static Globals cache;
return cache;
}
There is only one call to ~Globals() in order to clean up for all global objects in your program. In order to access a global you still have something like:
getGlobals().configuration.memberFunction ();
If you really wanted you could wrap this in a macro to save a tiny bit of typing using a macro:
#define GLOBAL(X) getGlobals().#X
GLOBAL(object).memberFunction ();
Although, this is just syntactic sugar on your initial solution.

dispite the age of this thread, I would like to propose the solution I've found.
As many have pointed out before of me, C++ doesn't provide any mechanism for static initialization ordering. What I propose is to encapsule each static member inside a static method of the class that in turn initialize the member and provide an access in an object-oriented fashion.
Let me give you an example, supposing we want to define the class named "Math" which, among the other members, contains "PI":
class Math {
public:
static const float Pi() {
static const float s_PI = 3.14f;
return s_PI;
}
}
s_PI will be initialized the first time Pi() method is invoked (in GCC). Be aware: the local objects with static storage have an implementation dependent lifecyle, for further detail check 6.7.4 in 2.
Static keyword, C++ Standard

Wrapping the static in a method will fix the order problem, but it isn't thread safe as others have pointed out but you can do this to also make it thread if that is a concern.
// File scope static pointer is thread safe and is initialized first.
static Type * theOneAndOnlyInstance = 0;
Type& globalObject()
{
if(theOneAndOnlyInstance == 0)
{
// Put mutex lock here for thread safety
theOneAndOnlyInstance = new Type();
}
return *theOneAndOnlyInstance;
}

Delay the construction of a static member object

The project is compiled into a dll to be injected into an executable
The project relies on an API which is initialized at the very beginning in main() like so:
int DLL_main()
{
TheApi::Initialize();
AnObject anObjectInstance;
//..
}
There is an object that is constructed with a class definition similar to this:
class AnObject()
{
AnObject();
~AnObject();
static ApiHelper apiHelperObject; //This object assists in making certain api features easier to use
}
//Inside AnObject.cpp
ApiHelper AnObject::apiHelperObject;
In the constructor of apiHelperObject, there are some API function calls
Upon injection of the dll, nothing happens (no error message as well) however,
when the static keyword is removed from apiHelperObject all works fine
The issue seems to be that the static member is being constructed before the API is initialized
It is not possible to call TheApi::Initialize() in apiHelperObject's constructor because there are multiple different api helper objects, and that would cause TheApi::Initialize() to be called more than once
And so the question is:
What is the best way of initializing the api before the static member object is constructed? Or, what is the best way to delay the construction of the static member?
Preferably, a pointer is not used as the syntax is not especially favored
Thank you

In ordinary standard C++ you can always delay the initialization of a static object by making it local to an accessor function.
Essentially that's a Meyers' singleton:
auto helper_object()
-> ApiHelper&
{
static ApiHelper the_object;
return the_object;
}
Here, in standard C++, the object is initialized the first time execution passes through the declaration.
But the C++ standard does not actively support dynamic libraries, much less DLL injection. So it's difficult to say how this is going to play out. Beware of threading issues.

How to make an old C codebase with many globals reentrant

I'm working with a large, old C codebase (an interpreter) that uses global variables a great deal, with the result that I cannot have two instances of it at once. Is there a straightforward (ideally automated) approach to convert this code to something reentrant? i.e. some refactor tool that would make all globals part of a struct and prepend the pointer to all variables?
Could I convert to C++ and wrap the entire thing in a class definition?

I would recommend to convert your project into C++11 project and change all your static vars into threadlocal.
This can be up to several days of work depending on the size of your project. In certain cases this will work.

I'm not aware of any "ready made" solution for this type of problem.
As a general rule, global variables are going to make it hard to make the code reentrant.
If you can remove all the global variables [simply delete the globals and see where you get compiler errors]. Replace the globals with a structure, and then use a structure per instance that is passed along, you'd be pretty much done (as long as the state of the interpreter instances is independent, and the instances don't need to know about each other). [Of course, you may need to have more than a single structure to solve the problem, but your global variables should be possible to "stick in a structure"].
Of course, making the structure and the code go together as a C++ class (which may have smaller classes as part of the solution) would be the "next step", but it's not entirely straight forward to do this, if you are not familiar with C++ and class designs.

Are you trying to make it reentrant in order to be able to make it multi-thread, and divide the work between threads?
If so, I would consider making it multi process, instead of multy-thread,

What I usually do with interpreters is go straight to a class with instance vars rather than globals. Not sure what you are interpreting, but it could be possible to pass in a file path or string container that the class interprets with an internal thread, so encapsulating the entire interpretation run.

It is possible to wrap the whole thing in a class definition, but it will not work for code that takes addresses of functions and passes them to C code. Also, converting a large legacy code base to be compilable by a C++ compiler is tedious enough that it probably outweighs the effort of removing the global variables by hand.
Barring that one, you have two options:
Since you need reentrancy to implement threading, it might be easiest to declare all global variables thread-local. If you have a C compiler that supports thread-locals, this is typically as easy as slapping a __thread (or other compiler-specific keyword) before every declaration. Then you create a new interpreter simply by creating a new thread and initializing the interpreter in the normal way.
If you cannot use __thread or equivalent, then you have to do a bit more footwork. You need to place all global variables in a structure, and replace every access to global variable foo with get_ctx()->foo. This is tedious, but straightforward. Once you are done, get_ctx() can allocate and return a thread-local interpreter state, using an API of your choosing.

A program transformation tool that can handle the C language would be able to automate such changes.
It needs to be able to resolve each symbol to its declaration, and preprocessor conditionals are likely to be trouble. I'd name one, but SO zealots object when I do that.
Your solution of a struct containing the globals is the right idea; you need rewrite that replaces each global declaration with a slot member, and each access to a global with access to the corresponding struct member.
The remaining question is, where does the struct pointer come from? One answer is a global variable that is multiplexed when threads are switched; a better answer if available under your OS is the get the struct pointer from thread local variables.

C++ static initialization order

When I use static variables in C++, I often end up wanting to initialize one variable passing another to its constructor. In other words, I want to create static instances that depend on each other.
Within a single .cpp or .h file this is not a problem: the instances will be created in the order they are declared. However, when you want to initialize a static instance with an instance in another compilation unit, the order seems impossible to specify. The result is that, depending on the weather, it can happen that the instance that depends on another is constructed, and only afterwards the other instance is constructed. The result is that the first instance is initialized incorrectly.
Does anyone know how to ensure that static objects are created in the correct order? I have searched a long time for a solution, trying all of them (including the Schwarz Counter solution), but I begin to doubt there is one that really works.
One possibility is the trick with the static function member:
Type& globalObject()
{
static Type theOneAndOnlyInstance;
return theOneAndOnlyInstance;
}
Indeed, this does work. Regrettably, you have to write globalObject().MemberFunction(), instead of globalObject.MemberFunction(), resulting in somewhat confusing and inelegant client code.
Update: Thank you for your reactions. Regrettably, it indeed seems like I have answered my own question. I guess I'll have to learn to live with it...

You have answered your own question. Static initialization order is undefined, and the most elegant way around it (while still doing static initialization i.e. not refactoring it away completely) is to wrap the initialization in a function.
Read the C++ FAQ items starting from https://isocpp.org/wiki/faq/ctors#static-init-order

Maybe you should reconsider whether you need so many global static variables. While they can sometimes be useful, often it's much simpler to refactor them to a smaller local scope, especially if you find that some static variables depend on others.
But you're right, there's no way to ensure a particular order of initialization, and so if your heart is set on it, keeping the initialization in a function, like you mentioned, is probably the simplest way.

Most compilers (linkers) actually do support a (non-portable) way of specifying the order. For example, with visual studio you can use the init_seg pragma to arrange the initialization into several different groups. AFAIK there is no way to guarantee order WITHIN each group. Since this is non-portable you may want to consider if you can fix your design to not require it, but the option is out there.

Indeed, this does work. Regrettably, you have to write globalObject().MemberFunction(), instead of globalObject.MemberFunction(), resulting in somewhat confusing and inelegant client code.
But the most important thing is that it works, and that it is failure proof, ie. it is not easy to bypass the correct usage.
Program correctness should be your first priority. Also, IMHO, the () above is purely stylistic - ie. completely unimportant.
Depending on your platform, be careful of too much dynamic initialization. There is a relatively small amount of clean up that can take place for dynamic initializers (see here). You can solve this problem using a global object container that contains members different global objects. You therefore have:
Globals & getGlobals ()
{
static Globals cache;
return cache;
}
There is only one call to ~Globals() in order to clean up for all global objects in your program. In order to access a global you still have something like:
getGlobals().configuration.memberFunction ();
If you really wanted you could wrap this in a macro to save a tiny bit of typing using a macro:
#define GLOBAL(X) getGlobals().#X
GLOBAL(object).memberFunction ();
Although, this is just syntactic sugar on your initial solution.

dispite the age of this thread, I would like to propose the solution I've found.
As many have pointed out before of me, C++ doesn't provide any mechanism for static initialization ordering. What I propose is to encapsule each static member inside a static method of the class that in turn initialize the member and provide an access in an object-oriented fashion.
Let me give you an example, supposing we want to define the class named "Math" which, among the other members, contains "PI":
class Math {
public:
static const float Pi() {
static const float s_PI = 3.14f;
return s_PI;
}
}
s_PI will be initialized the first time Pi() method is invoked (in GCC). Be aware: the local objects with static storage have an implementation dependent lifecyle, for further detail check 6.7.4 in 2.
Static keyword, C++ Standard

Wrapping the static in a method will fix the order problem, but it isn't thread safe as others have pointed out but you can do this to also make it thread if that is a concern.
// File scope static pointer is thread safe and is initialized first.
static Type * theOneAndOnlyInstance = 0;
Type& globalObject()
{
if(theOneAndOnlyInstance == 0)
{
// Put mutex lock here for thread safety
theOneAndOnlyInstance = new Type();
}
return *theOneAndOnlyInstance;
}

Is using too much static bad or good?

I like to use static functions in C++ as a way to categorize them, like C# does.
Console::WriteLine("hello")
Is this good or bad? If the functions are used often I guess it doesn't matter, but if not do they put pressure on memory?
What about static const?

but is it good or bad
The first adjective that comes to mind is "unnecessary". C++ has free functions and namespaces, so why would you need to make them static functions in a class?
The use of static methods in uninstantiable classes in C# and Java is a workaround because those languages don't have free functions (that is, functions that reside directly in the namespace, rather than as part of a class). C++ doesn't have that flaw. Just use a namespace.

I'm all for using static functions. These just make sense especially when organized into modules (static class in C#).
However, the moment those functions need some kind of external (non compile-time const) data, then that function should be made an instance method and encapsulated along with its data into a class.
In a nutshell: static functions ok, static data bad.

Those who say static functions can be replaced by namespaces are wrong, here is a simple example:
class X
{
public:
static void f1 ()
{
...
f2 ();
}
private:
static void f2 () {}
};
As you can see, public static function f1 calls another static, but private function f2.
This is not just a collection of functions, but a smart collection with its own encapsulated methods. Namespaces would not give us this functionality.
Many people use the "singleton" pattern, just because it is a common practice, but in many cases you need a class with several static methods and just one static data member. In this case there is no need for a singleton at all. Also calling the method instance() is slower than just accessing the static functions/members directly.

Use namespaces to make a collection of functions:
namespace Console {
void WriteLine(...) // ...
}
As for memory, functions use the same amount outside a function, as a static member function or in a namespace. That is: no memory other that the code itself.

One specific reason static data is bad, is that C++ makes no guarantees about initialization order of static objects in different translation units. In practice this can cause problems when one object depends on another in a different translation unit. Scott Meyers discusses this in Item 26 of his book More Effective C++.

Agree with Frank here, there's not a problem with static (global) functions (of course providing they are organised).. The problems only start to really creep in when people think "oh I will just make the scope on this bit of data a little wider".. Slippery slope :)
To put it really into perspective.. Functional Programming ;)

The problem with static functions is that they can lead to a design that breaks encapsulation. For example, if you find yourself writing something like:
public class TotalManager
{
public double getTotal(Hamburger burger)
{
return burger.getPrice() + burget.getTax();
}
}
...then you might need to rethink your design. Static functions often require you to use setters and getters which clutter a Class's API and makes things more complicated in general. In my example, it might be better to remove Hamburger's getters and just move the getTotal() class into Hamburger itself.

I tend to make classes that consist of static functions, but some say the "right way" to do this is usually to use namespaces instead. (I developed my habits before C++ had namespaces.)
BTW, if you have a class that consists only of static data and functions, you should declare the constructor to be private, so nobody tries to instantiate it. (This is one of the reasons some argue to use namespaces rather than classes.)

For organization, use namespaces as already stated.
For global data I like to use the singleton pattern because it helps with the problem of the unknown initialization order of static objects. In other words, if you use the object as a singleton it is guaranteed to be initialized when its used.
Also be sure that your static functions are stateless so that they are thread safe.

I usually only use statics in conjunction with the friend system.
For example, I have a class which uses a lot of (inlined) internal helper functions to calculate stuff, including operations on private data.
This, of course, increases the number of functions the class interface has.
To get rid of that, I declare a helper class in the original classes .cpp file (and thus unseen to the outside world), make it a friend of the original class, and then move the old helper functions into static (inline) member functions of the helper class, passing the old class per reference in addition to the old parameters.
This keeps the interface slim and doesn't require a big listing of free friend functions.
Inlining also works nicely, so I'm not completely against static.
(I avoid it as much as I can, but using it like this, I like to do.)

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Purpose of class ExplicitInit in objc runtime source code - c++

Related

Static variable from a static linked lib used before its construction [duplicate]

Delay the construction of a static member object

How to make an old C codebase with many globals reentrant

C++ static initialization order

Is using too much static bad or good?

Categories

Resources