I have a code which has been in use for a long time which has a scenario like following.
There are two classes, A and B.
A class has a public static variable static B* pB;
B class has a static object of itself (static B instance;).
In B constructor I set A::pB = this;
My question is, since static variable initialization order is undefined, if b::instance got initialized before A::pB, making B constructor to get called first and it tries to set A::pB which is uninitialized yet, could it lead to a problem?
My current code runs without any unexpected behavior. Wanted to find out whether it is just my luck or not
(Initialization of A::pB and B::instance happens in different translation units)
There are three parts to initialization, zero initialization,
static initialization and dynamic initialization. (Zero
initialization is actually part of static initialization, but
it's often convenient to keep them separate.) They occur in
that order.
If you don't specify any initialization for static B* A::pb;,
it will be zero initialized (before anything else), and nothing
else. If you specify a constant initializer, e.g.
B* A::pb = nullptr;
, that will also occur before any dynamic initialization.
What happens in a constructor is dynamic initialization, so
there should be no problem with your code unless A::pb also
has dynamic initialization, something like:
B* A::pb = someFunctionReturningABStar();
And finally: the default constructor for pointer is trivial; the
pointer is effectively "constructed" when the program is loaded,
before any code is executed. So there could never be a problem
due to assigning to pB in a constructor. The only problem
could occur if there was dynamic initialization of the pointer,
which might occur after the assignment in the constructor of
B::instance, and overwrite it.
And of course, until B::instance is constructed, any other
code will see a null pointer in A::pb.
Related
I'm new to c++ so i don't know much about it yet
so basically i have this code
header file
class Application{
public:
static Application& getInstance()
{
return *mInstance;
}
Application();
void run();
protected:
static Application* mInstance;
source file
Application* Application::mInstance;
Application::Application()
{
mInstance = this;
}
then i do
Application::getInstance().run();
When does the constructor for Application class runs?
It seems to work in visual studio.
So my question is why does this work?
Why does getInstance does not return a null pointer? since i have never instantiated the class.
Is this code standard?
will this work on any modern c++ compiler?
The constructor of a class belongs to an object, i.e. if it is called explicitly by the code somewhere it creates an object in the memory (on the stack or on the heap). So if you do not call the constructor somewhere it is never executed.
I see only the point why this can run is that you did not specify the initial value of the mInstance pointer. This means that its value is undefined and accidentally can have some valid address. If the run() method does not touch the mInstance object itself your code can run sometimes but sometimes not. This is the problem of uninitialized variables in C++.
But I suggest to follow the right singleton pattern:
1. Initialize mInstance as a nullptr.
2. The getInstance() function should call the constructor if mInstance is nullptr.
(Just a general hint: Avoid passing this in constructor to somewhere else because at that point the object is not fully constructed and in case of multi-thread application it can cause issues.)
The constructor is executed whenever an object is constructed.
Static objects at file scope are constructed before calling main(). However, if you are relying on order of construction of two such static objects (e.g. construction of one relies on the other already being constructed) you potentially have a problem. The order of construction of statics defined in two distinct compilation units (aka source files) is unspecified.
Static objects that are local (e.g. to a function) are constructed as control passes through their declaration .... typically as their function is called for the first time.
Consider following sample code:
class C
{
public:
int* x;
};
void f()
{
C* c = static_cast<C*>(malloc(sizeof(C)));
c->x = nullptr; // <-- here
}
If I had to live with the uninitialized memory for any reason (of course, if possible, I'd call new C() instead), I still could call the placement constructor. But if I omit this, as above, and initialize every member variable manually, does it result in undefined behaviour? I.e. is circumventing the constructor per se undefined behaviour or is it legal to replace calling it with some equivalent code outside the class?
(Came across this via another question on a completely different matter; asking for curiosity...)
It is legal now, and retroactively since C++98!
Indeed the C++ specification wording till C++20 was defining an object as (e.g. C++17 wording, [intro.object]):
The constructs in a C++ program create, destroy, refer to, access, and
manipulate objects. An object is created by a definition (6.1), by a
new-expression (8.5.2.4), when implicitly changing the active member
of a union (12.3), or when a temporary object is created (7.4, 15.2).
The possibility of creating an object using malloc allocation was not mentioned. Making it a de-facto undefined behavior.
It was then viewed as a problem, and this issue was addressed later by https://wg21.link/P0593R6 and accepted as a DR against all C++ versions since C++98 inclusive, then added into the C++20 spec, with the new wording:
[intro.object]
The constructs in a C++ program create, destroy, refer to, access, and manipulate objects. An object is created by a definition, by a new-expression, by an operation that implicitly creates objects (see below)...
...
Further, after implicitly creating objects within a specified region of
storage, some operations are described as producing a pointer to a
suitable created object. These operations select one of the
implicitly-created objects whose address is the address of the start
of the region of storage, and produce a pointer value that points to
that object, if that value would result in the program having defined
behavior. If no such pointer value would give the program defined
behavior, the behavior of the program is undefined. If multiple such
pointer values would give the program defined behavior, it is
unspecified which such pointer value is produced.
The example given in C++20 spec is:
#include <cstdlib>
struct X { int a, b; };
X *make_x() {
// The call to std::malloc implicitly creates an object of type X
// and its subobjects a and b, and returns a pointer to that X object
// (or an object that is pointer-interconvertible ([basic.compound]) with it),
// in order to give the subsequent class member access operations
// defined behavior.
X *p = (X*)std::malloc(sizeof(struct X));
p->a = 1;
p->b = 2;
return p;
}
There is no living C object, so pretending that there is one results in undefined behavior.
P0137R1, adopted at the committee's Oulu meeting, makes this clear by defining object as follows ([intro.object]/1):
An object is created by a definition ([basic.def]), by a new-expression ([expr.new]), when implicitly changing the active member of a union ([class.union]), or when a temporary object is created ([conv.rval], [class.temporary]).
reinterpret_cast<C*>(malloc(sizeof(C))) is none of these.
Also see this std-proposals thread, with a very similar example from Richard Smith (with a typo fixed):
struct TrivialThing { int a, b, c; };
TrivialThing *p = reinterpret_cast<TrivialThing*>(malloc(sizeof(TrivialThing)));
p->a = 0; // UB, no object of type TrivialThing here
The [basic.life]/1 quote applies only when an object is created in the first place. Note that "trivial" or "vacuous" (after the terminology change done by CWG1751) initialization, as that term is used in [basic.life]/1, is a property of an object, not a type, so "there is an object because its initialization is vacuous/trivial" is backwards.
I think the code is ok, as long as the type has a trivial constructor, as yours. Using the object cast from malloc without calling the placement new is just using the object before calling its constructor. From C++ standard 12.7 [class.dctor]:
For an object with a non-trivial constructor, referring to any non-static member or base class of the object before the constructor begins execution results in undefined behavior.
Since the exception proves the rule, referrint to a non-static member of an object with a trivial constructor before the constructor begins execution is not UB.
Further down in the same paragraphs there is this example:
extern X xobj;
int* p = &xobj.i;
X xobj;
This code is labelled as UB when X is non-trivial, but as not UB when X is trivial.
For the most part, circumventing the constructor generally results in undefined behavior.
There are some, arguably, corner cases for plain old data types, but you don't win anything avoiding them in the first place anyway, the constructor is trivial. Is the code as simple as presented?
[basic.life]/1
The lifetime of an object or reference is a runtime property of the object or reference. An object is said to have non-vacuous initialization if it is of a class or aggregate type and it or one of its subobjects is initialized by a constructor other than a trivial default constructor. [ Note: initialization by a trivial copy/move constructor is non-vacuous initialization. — end note ] The lifetime of an object of type T begins when:
storage with the proper alignment and size for type T is obtained, and
if the object has non-vacuous initialization, its initialization is complete.
The lifetime of an object of type T ends when:
if T is a class type with a non-trivial destructor ([class.dtor]), the destructor call starts, or
the storage which the object occupies is reused or released.
Aside from code being harder to read and reason about, you will either not win anything, or land up with undefined behavior. Just use the constructor, it is idiomatic C++.
This particular code is fine, because C is a POD. As long as C is a POD, it can be initialized that way as well.
Your code is equivalent to this:
struct C
{
int *x;
};
C* c = (C*)malloc(sizeof(C));
c->x = NULL;
Does it not look like familiar? It is all good. There is no problem with this code.
While you can initialize all explicit members that way, you cannot initialize everything a class may contain:
references cannot be set outside an initializer list
vtable pointers cannot be manipulated by code at all
That is, the moment that you have a single virtual member, or virtual base class, or reference member, there is no way to correctly initialize your object except by calling its constructor.
I think it shouldn't be UB. You make your pointer point to some raw memory and are treating its data in a particular way, there's nothing bad here.
If the constructor of this class does something (initializes variables, etc), you'll end up with, again, a pointer to raw, uninitialized object, using which without knowing what the (default) constructor was supposed to be doing (and repeating its behavior) will be UB.
I'm having some problems initializing static string members in c++. I have several classes and each one is holding several static string members that represent an id. When I am initializing the variables by calling a static function everything is fine. However, when I'd like to assign one variable with the value of another it still holds empty string. What's problem with this code?
std::string A::id()
{
std::stringstream sst;
sst << "id" << i;
i++;
return sst.str();
}
std::string B::str = A::id(); //prints "id0";
std::string C::str = "str"; //prints str
std::string D::str = B::str; //prints "" <-- what's wrong here?
std::string D::str2 = C::str; //prints ""
It appears as if the variables I am referring to (B::str and C::str) havent been initialized yet. But I assume when D::str = B::str is executed C::str is initialized at the latest and therefore D::str should also hold the string "id0".
This is Static Initialization Fiasco.
As per the C++ Standard the initialization order of objects with static storage duration is unspecified if they are declared in different Translation units.
Hence any code that relies on the order of initialization of such objects is bound to fail, and this problem is famously known as The Static Initialization Fiasco in C++.
Your code relies on the condition that Initialization of B::str and C::str happens before D::str, which is not guaranteed by the Standard. Since these 3 static storage duration objects reside in different translation units they might be initialized in Any order.
How to avoid it?
The solution is to use Construct On First Use Idiom,in short it means replacing the global object, with a global function, that returns the object by reference. The object returned by reference should be local static, Since static local objects are constructed the first time control flows over their declaration, the object will be created only on the first call and on every subsequent call the same object will be returned, thus simulating the behavior you need.
This should be an Interesting read:
How do I prevent the "static initialization order fiasco"?
There is no guaranteed order of initialization for static variables. So don't rely on it.
Instead init them with actual literal's, or better yet, init them at run-time when they are really needed.
Note: I am using the g++ compiler (which is I hear is pretty good and supposed to be pretty close to the standard).
I have the simplest class I could think of:
class BaseClass {
public:
int pub;
};
Then I have three equally simple programs to create BaseClass object(s) and print out the [uninitialized] value of their data.
Case 1
BaseClass B1;
cout<<"B1.pub = "<<B1.pub<<endl;
This prints out:
B1.pub = 1629556548
Which is fine. I actually thought it would get initialized to zero because it is a POD or Plain Old Datatype or something like that, but I guess not? So far so good.
Case 2
BaseClass B1;
cout<<"B1.pub = "<<B1.pub<<endl;
BaseClass B2;
cout<<"B2.pub = "<<B2.pub<<endl;
This prints out:
B1.pub = 1629556548
B2.pub = 0
This is definitely weird. I created two of the same objects the same exact way. One got initialized and the other did not.
Case 3
BaseClass B1;
cout<<"B1.pub = "<<B1.pub<<endl;
BaseClass B2;
cout<<"B2.pub = "<<B2.pub<<endl;
BaseClass* pB3 = new BaseClass;
cout<<"B3.pub = "<<pB3->pub<<endl;
This prints out:
B1.pub = 0
B2.pub = 0
B3.pub = 0
This is the most weird yet. They all get initialized to zero. All I did was add two lines of code and it changed the previous behavior.
So is this just a case of 'uninitialized data leads to unspecified behavior' or is there something more logical going on 'under the hood'?
I really want to understand the default constructor/destructor behavior because I have a feeling that it will be very important for completely understanding the inheritance stuff..
So is this just a case of 'uninitialized data leads to unspecified behavior'
Yes...
Sometimes if you call malloc (or new, which calls malloc) you will get data that is filled with zeroes because it is in a fresh page from the kernel. Other times it will be full of junk. If you put something on the stack (i.e., auto storage), you will almost certainly get garbage — but it can be hard to debug, because on your system that garbage might happen to be somewhat predictable. And with objects on the stack, you'll find that changing code in a completely different source file can change the values you see in an uninitialized data structure.
About POD: Whether or not something is POD is really a red herring here. I only explained it because the question mentioned POD, and the conversation derailed from there. The two relevant concepts are storage duration and constructors. POD objects don't have constructors, but not everything without a constructor is POD. (Technically, POD objects don't have non-trivial constructors nor members with non-trivial constructors.)
Storage duration: There are three kinds. Static duration is for globals, automatic is for local variables, and dynamic is for objects on the heap. (This is a simplification and not exactly correct, but you can read the C++ standard yourself if you need something exactly correct.)
Anything with static storage duration gets initialized to zero. So if you make a global instance of BaseClass, then its pub member will be zero (at first). Since you put it on the stack and the heap, this rule does not apply — and you don't do anything else to initialize it, so it is uninitialized. It happens to contain whatever junk was left in memory by the last piece of code to use it.
As a rule, any POD on the heap or the stack will be uninitialized unless you initialize it yourself, and the value will be undefined, possible changing when you recompile or run the program again. As a rule, any global POD will get initialized to zero unless you initialize it to something else.
Detecting uninitialized values: Try using Valgrind's memcheck tool, it will help you find where you use uninitialized values — these are usually errors.
It depends how you declare them:
// Assuming the POD type's
class BaseClass
{
public:
int pub;
};
Static Storage Duration Objects
These objects are always zero initialized.
// Static storage duration objects:
// PODS are zero initialized.
BaseClass global; // Zero initialized pub = 0
void plop()
{
static BaseClass functionStatic; // Zero initialized.
}
Automatic/Dynamic Storage Duration Objects
These objects may by default or zero initialized depending on how you declare them
void plop1()
{
// Dynamic
BaseClass* dynaObj1 = new BaseClass; // Default initialized (does nothing)
BaseClass* dynaObj2 = new BaseClass(); // Zero Initialized
// Automatic
BaseClass autoObj1; // Default initialized (does nothing)
BaseClass autoObj2 = BaseClass(); // Zero Initialized
// Notice that zero initialization of an automatic object is not the same
// as the zero initialization of a dynamic object this is because of the most
// vexing parse problem
BaseClass autoObj3(); // Unfortunately not a zero initialized object.
// Its a forward declaration of a function.
}
I use the term 'Zero-Initialized'/'Default-Initialization' but technically slightly more complex. 'Default-Initialization' will becomes 'no-initialization' of the pub member. While the () invokes 'Value-Initialization' that becomes 'Zero-Initialization' of the pub member.
Note: As BaseClass is a POD this class behaves just like the builtin types. If you swap BaseClass for any of the standard type the behavior is the same.
In all three cases, those POD objects could have indeterminate values.
POD objects without any initializer will NOT be initialized by default value. They just contain garbage..
From Standard 8.5 Initializers,
"If no initializer is specified for an object, and the object is of
(possibly cv-qualified) non-POD class type (or array thereof), the
object shall be default-initialized; if the object is of
const-qualified type, the underlying class type shall have a
user-declared default constructor. Otherwise, if no initializer is
specified for a nonstatic object, the object and its subobjects, if
any, have an indeterminate initial value; if the object or any of
its subobjects are of const-qualified type, the program is
ill-formed."
You can zero initialize all the members of a POD struct like this,
BaseClass object={0};
The way you wrote your class, the value of pub is undefined and can be anything. If you create a default constructor that will call the default constructor of pub - it will be guaranteed to be zero:
class BaseClass {
public:
BaseClass() : pub() {}; // calling pub() guarantees it to be zero.
int pub;
};
That would be a much better practice.
As a general rule, yes, uninitialized data leads to unspecified behavior. That's why other languages like C# take steps to insure that you don't use uninitialized data.
This is why you always, always, always, initialize a classes (or ANY variable) to a stable state instead of relying on the compiler to do it for you; especially since some compilers deliberately fill them with garbage. In fact, the only POD MSVC++ doesn't fill with garbage is bool, they get initialized to true. One would think it'd be safer to init it to false, but that's Microsoft for ya.
Case 1
Regardless of whether the encapsulating type is POD, data members of a built-in type are not default initialised by an encapsulating default constructor.
Case 2
No, neither got initialised. The underlying bytes at the memory position of one of them just happened to be 0.
Case 3
Same.
You seem to be expecting some guarantees about the "value" of uninitialised objects, whilst simultaneously professing the understanding that no such value exists.
I have a question about the default initialization in C++. I was told the non-POD object will be initialized automatically. But I am confused by the code below.
Why when I use a pointer, the variable i is initialized to 0, however, when I declare a local variable, it's not. I am using g++ as the compiler.
class INT {
public: int i;
};
int main () {
INT* myint1 = new INT;
INT myint2;
cout<<"myint1.i is "<<myint1->i<<endl;
cout<<"myint2.i is "<<myint2.i<<endl;
return 0;
}
The output is
myint1.i is 0
myint2.i is -1078649848
You need to declare a c'tor in INT and force 'i' to a well-defined value.
class INT {
public:
INT() : i(0) {}
...
};
i is still a POD, and is thus not initialized by default. It doesn't make a difference whether you allocate on the stack or from the heap - in both cases, the value if i is undefined.
in both cases its not initialized, you were just lucky to get 0 in the first one
It depends on the compiler. The big difference here is that a pointer set to new Something refers to some area of memory in heap, while local variables are stored on the stack. Perhaps your compiler zeroes heap memory but doesn't bother zeroing stack memory; either way, you can't count on either method zeroing your memory. You should use something like ZeroMemory in Win32 or memset from the C standard libraries to zero your memory, or set i = 0 in the constructor of INT.
For a class or struct-type, if you don't tell it which constructor to use when defining a variable, then the default constructor is called. If you didn't define a default constructor, then the compiler creates one for you. If the type is not a class (or struct) type, then it is not initialized since it won't have a constructor, let alone a default constructor (so no built-in types like int will ever be default-initialized).
So, in your example, both myint1 and myint2 are default constructed with the default constuctor that the compiler declared for INT. But since that won't initialize any non-class/struct variables in an INT, the i member variable of INT is not initialized.
If you want i to be initialized, you need to write a default constructor for INT which initializes it.
Firstly, your class INT is POD.
Secondly, when something is "initialized automatically" (for automatic or dynamic objects), it means that the constructor is called automatically. There is no other "automatic" initialization scenario (for automatic or dynamic objects); all other scenarios require a "manually" specified initializer. However, if the constructor does not do anything to perform the desired initialization, then that initialization will not take place. It is your responsibility to write that constructor, when necessary.
In both cases in your example you should get garbage in your objects. The 0 that you observe in case of new-ed object is there purely by accident.
That's true for non-POD object's but your object is POD since it has no user defined constructor and only contains PODs itself.
We just had this topic, you can find some explinations here.
If you add more variables to the class, you will end up with non-initialized member variables.