Is there a way I can identify whether a function is being called for the first time by inspecting the internal (hidden?) variables that GCC uses to implement static variables (static variables that already exist in my function)?
I hope to get at these variables from the C++ code.
There's no way to rely on internals of the compiler, and even if you tried there's no guarantee it wouldn't change with the next version.
Use this common idiom:
static bool firsttime = true;
if (firsttime)
{
firsttime = false;
// other stuff here
}
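A variant of the same idea, as a minimal sketch assuming C++11 or later: the initializer of a function-local static runs exactly once, so the one-time work can be placed in it. The names func, setup_runs and first_call_done are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Hypothetical counter, used only to observe how often the setup ran.
int setup_runs = 0;

void func()
{
    // The initializer of a function-local static runs once, on the first
    // call that reaches this line; later calls skip it.
    static const bool first_call_done = [] {
        ++setup_runs;          // one-time work goes here
        return true;
    }();
    (void)first_call_done;     // silence unused-variable warnings

    // per-call work goes here
}
```

Calling func() three times and then checking assert(setup_runs == 1) confirms that the setup ran once. This still relies only on documented language behaviour, not on the compiler's hidden guard variable.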
GCC uses a hidden flag indicating if a variable has been initialized. There is no way to access this. Even so, these don't actually track the "first time" but rather whether the variable has been initialized. Consider the following:
void func()
{
static T a;
static T b;
}
But the standard would actually allow these to be initialized by different threads. If that happens, which one got there first? Checking the GCC disassembly, each variable does seem to be handled with its own lock, at least in non-optimized builds; the code changes a lot in optimized builds (so, even more so, "first time" is not clearly defined).
Additionally, as Mark points out in his comment, when the initialization happens depends greatly on the type you are using. Simple types might (this is not guaranteed) be initialized globally; others will indeed wait until the function is first called.
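The difference between the two kinds of type can be observed with a small sketch. Tracker and call_me are hypothetical names; the counter is there only to make the constructor call visible.

```cpp
#include <cassert>

// Hypothetical type whose constructor has an observable side effect.
struct Tracker {
    static int constructed;
    Tracker() { ++constructed; }
};
int Tracker::constructed = 0;

int call_me()
{
    static int plain = 42;   // constant initializer: typically set up before any code runs
    static Tracker t;        // dynamic initialization: runs when control first passes here
    (void)t;
    return plain;
}
```

Before the first call, Tracker::constructed is still 0; after any number of calls it is 1. Nothing comparable can be observed for plain, which is part of why "first time" is hard to pin down through the static variables themselves.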
Why do you need to know first time entry anyway?
Having recently got feedback from Code Review stating the impropriety of non-initialized variables, my class variable initialization now seems very ugly:
class MyClass
{
private:
int variable_one;
int variable_two;
int variable_three;
MyClass():variable_one(0),variable_two(0),variable_three(0){};
//...
};
Previously, I wouldn't define my variables until they are needed:
class MyClass
{
private:
int variable_one;
void MyFunction(int x)
{
variable_one = x;
}
};
Why is my second example frowned upon? What are the risks involved in not initializing variables?
The risk with leaving variables uninitialized is that you might read them before they've been set up. That can lead to extremely hard-to-diagnose bugs and erratic behavior. You can also initialize them to sentinel values to make it easier to detect when they haven't been set up.
As a note, since C++11 (which is now supported by most compilers), you can just do this:
class MyClass
{
private:
int variable_one = 0;
int variable_two = 0;
int variable_three = 0;
};
Now there's little code overhead and it makes clear that they get default values unless you specifically set them to something else.
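As a small sketch of how those defaults interact with constructors (the accessors and the second constructor are additions made for this illustration only): a constructor that mentions a member in its initializer list overrides the in-class default for that member, and every other member keeps its default.

```cpp
#include <cassert>

class MyClass
{
public:
    MyClass() = default;                               // uses the in-class defaults
    explicit MyClass(int one) : variable_one(one) {}   // overrides only variable_one

    int one() const { return variable_one; }
    int two() const { return variable_two; }
    int three() const { return variable_three; }

private:
    int variable_one = 0;
    int variable_two = 0;
    int variable_three = 0;
};
```

With this, MyClass a; leaves all three members at 0, while MyClass b(7); sets variable_one to 7 and leaves the other two at 0.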
It is frowned upon because someone, someday, is going to use your class and assume that its internal state has been set up during initialization. Are you perhaps worried about wasting time on initialization that isn't needed? Consider the alternative: depending on how many methods you have, you will be repeating the initialization code in every one of them, until you forget it in just one. It will still appear to work, because you ran in debug mode where memory happened to be cleared. Then a month later, someone compiles in release mode and all of a sudden they think their code is broken, because you didn't initialize your members in a central location.
RAII (Resource Acquisition Is Initialization) whenever possible. When someone creates an instance of your class, make sure it is initialized.
The answers provided so far are all correct, but no one mentioned there's a more general OO principle at work: An object's methods transform it from one internally consistent state to another. It should never be possible to use an object where it does something silly (unless it's a silly object).
For example, if an object has an array and a count of currently active elements in the array, it should never be true that there are active elements when count is zero. The method that updates the array also updates the count, keeping the state consistent with itself. The momentary inconsistency -- after the element is added and before the count is updated -- is not visible to the user of the object.
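A minimal sketch of that array-plus-count example, with hypothetical names (SmallStack, push, top): the only way to change the array is through push, which updates the count in the same call, so a caller can never observe the two out of step.

```cpp
#include <cassert>
#include <cstddef>

class SmallStack {
public:
    // Adds an element and updates the count together; returns false when full.
    bool push(int value) {
        if (count_ == kCapacity) return false;
        items_[count_] = value;   // momentary inconsistency starts here...
        ++count_;                 // ...and ends here, before the caller regains control
        return true;
    }
    std::size_t size() const { return count_; }
    int top() const { assert(count_ > 0); return items_[count_ - 1]; }

private:
    static constexpr std::size_t kCapacity = 8;
    int items_[kCapacity] = {};   // deterministic initial state
    std::size_t count_ = 0;       // deterministic initial state
};
```

Because both members have initializers, a freshly created SmallStack already satisfies the invariant: size() is 0 and no element is considered active.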
In your example, MyClass gets off on the wrong foot by creating a nondeterministic initial state. Whatever relationship the member variables have to each other, their values are determined by compiler happenstance. The more the class is used, the closer the probability that those values are what you want gets to zero.
The first method you've specified is called a member initializer list and, apart from the C++11 default member initializers, it's the only way to initialize const or reference data members of your class.
If you don't initialize data members that are C++ objects, they will still be initialized to a default value by calling the corresponding default constructor, and then the same work will be repeated when you assign to them at a later time (as you're doing inside a function, when you deem it necessary).
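The repeated work can be made visible with a sketch like the following. Noisy, AssignLater and InitList are hypothetical types whose only purpose is to count which special member functions run.

```cpp
#include <cassert>

// Hypothetical member type that counts how it gets constructed and assigned.
struct Noisy {
    static int default_ctors, value_ctors, assignments;
    Noisy() { ++default_ctors; }
    explicit Noisy(int) { ++value_ctors; }
    Noisy(const Noisy&) = default;
    Noisy& operator=(const Noisy&) { ++assignments; return *this; }
};
int Noisy::default_ctors = 0;
int Noisy::value_ctors = 0;
int Noisy::assignments = 0;

struct AssignLater {
    Noisy member;
    // member is default-constructed first, then a temporary is built and assigned.
    AssignLater() { member = Noisy(1); }
};

struct InitList {
    Noisy member;
    // member is constructed once, directly with the wanted value.
    InitList() : member(1) {}
};
```

Creating an AssignLater costs one default construction, one value construction and one assignment; creating an InitList costs a single value construction.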
I recently came across the Nifty Counter Idiom. My understanding is that this is used to implement globals in the standard library like cout, cerr, etc. Since the experts have chosen it, I assume that it's a very strong technique.
I'm trying to understand what the advantage is over using something more like a Meyer Singleton.
For example, one could just have, in a header file:
inline Stream& getStream() { static Stream s; return s; }
static Stream& stream = getStream();
The advantage is you don't have to worry about reference counting, or placement new, or having two classes, i.e. the code is much simpler. Since it's not done this way, I'm sure there's a reason:
Is this not guaranteed to have a single global object across shared and static libraries? It seems like the ODR should guarantee that there can only be one static variable.
Is there some kind of performance cost? It seems like in both my code and the Nifty Counter, you are following one reference to get to the object.
Are there situations where the reference counting is actually useful? It seems like it will still just lead to the object being constructed if the header is included, and destroyed at program end, like the Meyer Singleton.
Does the answer involve dlopen'ing something manually? I don't have too much experience with that.
Edit: I was prompted to write the following bit of code while reading Yakk's answer, I add it to the original question as a quick demo. It's a very minimal example that shows how using the Meyer Singleton + a global reference leads to initialization before main: http://coliru.stacked-crooked.com/a/a7f0c8f33ba42b7f.
The static local/Meyer's singleton + static global reference (your solution) is nearly equivalent to the nifty counter.
The differences are as follows:
No .cpp file is required in your solution.
Technically the static Stream& exists in every compilation unit; the object being referred to does not. As there is no way to detect this in the current version of C++, under the as-if rule this goes away. But some implementations might actually create that reference instead of eliding it.
Someone could call getStream() prior to the static Stream& being created; this would cause difficulty in destruction order (with the stream being destroyed later than expected). This can be avoided by making that against the rules.
The standard mandates that creating the static Stream local in the inline getStream be thread safe. Detecting that no concurrent call is going to happen is challenging for the compiler, so some redundant thread-safety overhead may exist in your solution. The nifty counter does not support thread safety explicitly; this is considered safe as it runs at static initialization time, prior to when threads are expected.
The call to getStream() must occur in each and every compilation unit. Only if it is proven that it cannot do anything may it be optimized out, which is difficult. The nifty counter has a similar cost, but the operation may or may not be simpler to optimize out, or cheaper at run time. (Determining this will require inspecting the resulting assembly output on a variety of compilers.)
"Magic statics" (static locals without race conditions) were introduced in C++11. There could be other issues with your code prior to C++11 magic statics; the only one I can think of is someone calling getStream() directly in another thread during static initialization, which (as mentioned above) should be banned in general.
Outside the realm of the standard, your version will automatically and magically create a new singleton in each dynamically linked chunk of code (DLL, .so, etc). The nifty counter will only create the singleton in the cpp file. This may give the library writer tighter control over accidentally spawning new singletons; they can stick it into the dynamic library, instead of spawning duplicates.
Avoiding having more than one singleton is sometimes important.
Summarizing the answers and comments:
Let's compare 3 different options for a library, wishing to present a global Singleton, as a variable or via a getter function:
Option 1 - the nifty counter pattern, allowing the use of a global variable that is:
assured to be created once
assured to be created before the first usage
assured to be created once across all shared objects that are dynamically linked with the library creating this global variable.
Option 2 - the Meyers singleton pattern with a reference variable (as presented in the question):
assured to be created once
assured to be created before the first usage
However, it will create a copy of the singleton object in shared objects, even if all shared objects and the main program are linked dynamically with the library. This is because the Singleton reference variable is declared static in a header file, so its initialization must be available at compile time wherever it is used, including in shared objects, which are compiled before they meet the program they will be loaded into.
Option 3 - the Meyers singleton pattern without a reference variable (calling a getter for retrieving the Singleton object):
assured to be created once
assured to be created before the first usage
assured to be created once across all shared objects that are dynamically linked with the library creating this Singleton.
However, in this option there is neither a global variable nor an inline call; each retrieval of the Singleton is a function call (whose result can be cached on the caller side).
This option would look like:
// libA .h
struct A {
A();
};
A& getA();
// some other header
A global_a2 = getA();
// main
int main() {
std::cerr << "main\n";
}
// libA .cpp - need to be dynamically linked! (same as libstdc++ is...)
// thus the below shall be created only once in the process
A& getA() {
static A a;
return a;
}
A::A() { std::cerr << "construct A\n"; }
All of your questions about the utility and performance of the Nifty Counter, aka Schwartz Counter, were basically answered by Maxim Egorushkin in his answer to "Global variables in modern C++" (but see also the comment threads).
The main issue is that there is a trade-off taking place. When you use Nifty Counter your program startup time is a bit slower (in large projects), since all these counters have to run before anything can happen. That doesn't happen in Meyer's singleton.
However, in the Meyer's singleton, every time you want to access the global object, you have to check if it's null, or, the compiler emits code that checks if the static variable was already constructed before any access is attempted. In the Nifty Counter, you have your pointer already and you just fire away, since you can assume the init happened at startup time.
So, Nifty Counter vs. Meyer's singleton is basically a trade-off between program startup time and run-time.
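To make the trade-off concrete, here is a single-file sketch of the Nifty Counter mechanics. It is an illustration under assumptions, not how any particular standard library implements cout: the names Stream, detail, StreamInitializer and stream() are hypothetical, the real idiom splits the pieces across a header and a .cpp file, and it usually exposes a reference rather than the accessor used here.

```cpp
#include <cassert>
#include <new>

struct Stream {
    static int constructions;   // only here to observe the behaviour
    Stream() { ++constructions; }
};
int Stream::constructions = 0;

// Would live in Stream.cpp: zero/constant-initialized before any dynamic initialization.
namespace detail {
    int nifty_counter = 0;
    alignas(Stream) unsigned char storage[sizeof(Stream)];   // raw memory for the object
    Stream* instance = nullptr;
}

// Would live in Stream.h.
struct StreamInitializer {
    StreamInitializer() {
        // The first initializer to run constructs the object in place.
        if (detail::nifty_counter++ == 0)
            detail::instance = new (detail::storage) Stream();
    }
    ~StreamInitializer() {
        // The last initializer to be destroyed tears the object down.
        if (--detail::nifty_counter == 0)
            detail::instance->~Stream();
    }
};

// Access is a plain pointer dereference: no "already constructed?" check.
inline Stream& stream() { return *detail::instance; }

// Every translation unit that includes the header gets one of these;
// two are written out here to simulate two translation units.
static StreamInitializer init_from_tu_a;
static StreamInitializer init_from_tu_b;
```

Both initializers run at startup (the startup cost), but only the first constructs the Stream, and later accesses through stream() carry no guard check (the run-time saving).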
With the solution you have here, the global stream variable gets assigned at some point during static initialization, but it is unspecified when. Therefore the use of stream from other compilation units during static initialization may not work. Nifty counter is a way to guarantee that a global (e.g. std::cout) is usable even during static initialization.
#include <iostream>
struct use_std_out_in_ctor
{
use_std_out_in_ctor()
{
// std::cout guaranteed to be initialized even if this
// ctor runs during static initialization
std::cout << "Hello world" << std::endl;
}
};
use_std_out_in_ctor global; // causes ctor to run during static initialization
int main()
{
std::cout << "Did it print Hello world?" << std::endl;
}
Recently a coworker showed me code like this:
void SomeClass::function()
{
static bool init = false;
if (!init)
{
// hundreds of lines of ugly code
}
init = true;
}
He wants to check whether SomeClass is initialized in order to execute some piece of code once per SomeClass instance, but the fact is that only one instance of SomeClass will exist during the whole lifetime of the program.
His question was about the init static variable, and about when it's initialized. I answered that the initialization occurs once, so the value will be false at the first call and true for the rest of its lifetime. After answering, I added that such use of static variables is bad practice, but I wasn't able to explain why.
The reasons that I've been thinking so far are the following:
The behaviour of static bool init in SomeClass::function could be achieved with a non-static member variable.
Other functions in SomeClass couldn't check the static bool init value because its visibility is limited to the void SomeClass::function() scope.
The static variables aren't OOPish because they define a global state instead of a object state.
These reasons seem weak, unconvincing and not very concrete to me, so I'm asking for more reasons to explain why the use of static variables in functions and member functions is bad practice.
Thanks!
This is certainly a rare occurrence, at least, in good quality code, because of the narrow case for which it's appropriate. What this basically does is a just-in-time initialization of a global state (to deliver some global functionality). A typical example of this is having a random number generator function that seeds the generator at the first call to it. Another typical use of this is a function that returns the instance of a singleton, initialized on the first call. But other use-case examples are few and far between.
In general terms, global state is not desirable, and having objects that contain self-sufficient states is preferred (for modularity, etc.). But if you need global state (and sometimes you do), you have to implement it somehow. If you need any kind of non-trivial global state, then you should probably go with a singleton class, and one of the preferred ways to deliver that application-wide single instance is through a function that delivers a reference to a local static instance initialized on the first call. If the global state needed is a bit more trivial, then doing the scheme with the local static bool flag is certainly an acceptable way to do it. In other words, I see no fundamental problem with employing that method, but I would naturally question its motivations (requiring a global state) if presented with such code.
As is always the case for global data, multi-threading will cause some problems with a simplistic implementation like this one. Naive introductions of global state are never going to be inherently thread-safe, and this case is no exception; you'd have to take measures to address that specific problem. That is part of the reason why global state is not desirable.
The behaviour of static bool init in SomeClass::function could be achieved with a non-static member variable.
If there is an alternative to achieve the same behavior, then the two alternatives have to be judged on the technical issues (like thread-safety). But in this case, the required behavior is the questionable thing, more so than the implementation details, and the existence of alternative implementations doesn't change that.
Second, I don't see how you can replace a just-in-time initialization of a global state by anything that is based on a non-static data member (a static data member, maybe). And even if you can, it would be wasteful (require per-object storage for a one-time-per-program-execution thing), and on that ground alone, wouldn't make it a better alternative.
Other functions in SomeClass couldn't check the static bool init value because its visibility is limited to the void SomeClass::function() scope.
I would generally put that in the "Pro" column (as in Pro/Con). This is a good thing. This is information hiding or encapsulation. If you can hide away things that shouldn't be a concern to others, then great! But if there are other functions that would need to know that the global state has already been initialized or not, then you probably need something more along the lines of a singleton class.
The static variables aren't OOPish because they define a global state instead of a object state.
OOPish or not, who cares? But yes, the global state is the concern here, not so much the use of a local static variable to implement its initialization. Global states, especially mutable global states, are bad in general and should never be abused. They hinder modularity (modules are less self-sufficient if they rely on global states), they introduce multi-threading concerns since they are inherently shared data, they make any function that uses them non-reentrant (non-pure), they make debugging difficult, etc... the list goes on. But most of these issues are not tied to how you implement it. On the other hand, using a local static variable is a good way to solve the static-initialization-order-fiasco, so it is good for that reason: one less problem to worry about when introducing a (well-justified) global state into your code.
Think multi-threading. This type of code is problematic when function() can be called concurrently by multiple threads. Without locking, you're open to race conditions; with locking, concurrency can suffer for no real gain.
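The "with locking" variant can be sketched as follows, with hypothetical names (guarded_function, init_runs). It closes the race, but every call now takes the mutex even though the one-time work finished long ago, which is the cost described above.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical counter, used only to observe how often the one-time code ran.
int init_runs = 0;

void guarded_function()
{
    static std::mutex init_mutex;   // shared by all callers
    static bool init = false;

    // Every call pays for this lock, not only the first one.
    std::lock_guard<std::mutex> lock(init_mutex);
    if (!init) {
        ++init_runs;                // stands in for the one-time code
        init = true;
    }
}
```

The standard library's std::call_once with a std::once_flag (both in <mutex>) is designed for this situation and may be a better fit than a hand-written flag.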
Global state is probably the worst problem here. The limited visibility is not an issue: other functions don't have to be concerned with the flag. The fact that it can be achieved without a static variable essentially means you have made some form of singleton, which of course introduces all the problems a singleton has, such as being poorly suited to a multithreaded environment.
Adding to what others said, you can't have multiple objects of this class at the same time, or at least they would not behave as expected. The first instance would set the static variable and do the initialization. The ones created later, though, would not have their own version of init but would share it with all other instances. Since the first instance set it to true, none of the following ones will do any initialization, which is most probably not what you want.
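A short sketch of that difference, using two hypothetical classes that each record whether their own initialization branch ran:

```cpp
#include <cassert>

struct WithStaticFlag {
    int init_runs = 0;               // per-object count, for observation only
    void function() {
        static bool init = false;    // one flag shared by every instance
        if (!init) { ++init_runs; init = true; }
    }
};

struct WithMemberFlag {
    int init_runs = 0;
    bool init = false;               // one flag per instance
    void function() {
        if (!init) { ++init_runs; init = true; }
    }
};
```

With two WithStaticFlag objects, only the first one to call function() runs the initialization branch; with two WithMemberFlag objects, each runs it once.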
I've just read that if I want to be sure about the initialization order, it is better to use a function that turns the global variable into a local (but still static) one. My question: do I need to keep some identifier that tells me my static object has already been created (an identifier inside the function that prevents the static object from being initialized a second time), or not? I ask because I may call this function from several different places. Thanks in advance for any help.
The first question is do your static lifetime objects care about the order they are initialized?
If so, the second question is: why?
The initialization order is only a problem if a global object uses another global object during its initialization (i.e. when the constructor is running). Note: This is horrible practice and should be avoided (globals should not be used, and if they are, they should be independent).
If they must be linked then they should be related (in which case you could potentially make a new object that includes the two old ones so that you can control their creation more precisely). If that is not possible you just put them in the same compilation unit (read *.cpp file).
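If the dependency really cannot be avoided, the function-local static approach from the question can be sketched like this. The names registry, Plugin, plugin_a and plugin_b are hypothetical. No extra "already created" flag is needed, because the language runs the static's initializer only once.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical registry that other globals use while they are being constructed.
std::vector<std::string>& registry()
{
    static std::vector<std::string> names;   // constructed on the first call, whenever that is
    return names;
}

struct Plugin {
    explicit Plugin(const char* name) { registry().push_back(name); }
};

// Globals whose constructors depend on the registry. Whichever runs first
// triggers the construction of names; the other simply reuses it.
Plugin plugin_a("a");
Plugin plugin_b("b");
```

By the time main starts, registry() holds both names, regardless of which translation unit each Plugin was defined in.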
As far as the standard is concerned, initialization of a function-scope static variable only happens once:
int *gettheint(bool set_to_four) {
static int foo = 3; // only happens once, ever
if (set_to_four) {
foo = 4; // happens as many times as the function is called with true
}
return &foo;
}
So there's no need for gettheint to check whether foo has already been initialized - the value won't be overwritten with 3 on the second and subsequent calls.
Threads used to throw a spanner in the works, being outside the scope of the standard before C++11. Since C++11, the standard guarantees that the initialization of a function-scope static is thread-safe. On an older implementation, check the documentation for your threading implementation; chances are that the once-onliness of the initialization is not thread-safe there. That's what pthread_once, or an equivalent, is for. Alternatively, in a multi-threaded program, you could call the function before creating any extra threads.
I have a loop as follows
while(1)
{
int i;
}
Does i get destroyed and recreated on the stack each time the loop occurs?
Theoretically, it gets recreated. In practice, it might be kept alive and reinitialized for optimization reasons.
But from your point of view, it gets recreated, and the compiler handles the optimization (i.e., keep it at its innermost scope, as long as it's a POD type).
Not necessarily. Your compiler could choose to change it into
int i;
while(1) {
...
i = 0;
}
It may not be literally created and destroyed on the stack every time. However, semantically, that is what occurs, and when you use more complex types in C++ that have custom destruction behaviour, then that is exactly what happens, although the compiler may still choose to hold the stack memory separately.
Conceptually, yes. But since nothing is being done to the value, the compiler is very likely to generate code that does nothing with the variable on each iteration of the loop. It can, for instance, allocate it in advance (when the function is entered), since it's going to be used later.
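For a type with a constructor and destructor, the per-iteration behaviour is easy to observe. This is a small sketch with hypothetical names (Counted, run_loop); a bounded loop stands in for while(1) so that the function returns.

```cpp
#include <cassert>

// Hypothetical type that counts its constructions and destructions.
struct Counted {
    static int constructed, destroyed;
    Counted() { ++constructed; }
    ~Counted() { ++destroyed; }
};
int Counted::constructed = 0;
int Counted::destroyed = 0;

void run_loop(int iterations)
{
    for (int n = 0; n < iterations; ++n) {
        Counted c;    // constructed here on every iteration
        (void)c;
    }                 // destroyed here on every iteration
}
```

After run_loop(3), both counters are 3: the object is constructed and destroyed once per iteration, even if the compiler reuses the same stack slot for it.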
Since you can't reference the variable outside the defining scope, that doesn't change the semantics.
In C you have to look at the assembly generated to know that (the compiler might have chosen to put it in a register).
What you know is that outside the loop you cannot access that particular object by any means (by name, by pointer, by hack, ...)