C++ initialize differently in container vs as local variable

This question came up while I was trying to write an RAII wrapper class for an OpenGL buffer object. The way that OpenGL creates buffer objects is by a call to void glGenBuffers(GLsizei n, GLuint* buffers) which "returns n buffer object names in buffers." Also, note that the object name 0 is special to OpenGL, being a default value "like a NULL pointer."
So my first idea is to create a class buffer_object like
class buffer_object {
public:
// constructors, destructor
private:
unsigned int const name;
};
I want two different initialization behaviors:
If I create an array std::array<buffer_object, N> buffer_obj_array {};, I want the buffer_object array elements to have their names initialized to 0. Ditto for other containers.
If I create a local variable, don't initialize to 0, but actually give it a name by calling glGenBuffers(1, &name).
The problem is I don't fully understand list initialization, value initialization, and default initialization.
About std::array's default constructor, cppreference.com says it uses aggregate initialization, which is a type of list initialization, which value initializes the elements in an array that aren't given in an initialization list. About value initialization, cppreference.com says,
if T is a class type with at least one user-provided constructor of
any kind, the default constructor is called
So, does that mean there's no way to do the following?
{
buffer_object obj; // has some name given by glGenBuffers; ready to use
int const n = 3;
std::array<buffer_object, n> foo; // each element has name 0
glGenBuffers(n, foo.data()); // gives each element its own name
}
// everything destructed using buffer_object::~buffer_object
— because I'm asking for two different behaviors from the default constructor?
Maybe I'm taking the wrong approach.

C++ object initialization is not, and cannot be, predicated on where that object is created (with the exception of default-initializing objects of trivial types, which zero-initialize when used for global variables, but not for automatics).
If you give a type a non-trivial default constructor, you are making a strong statement that objects of this type need to have some real code executed before they can be considered valid objects of this type. C++ will therefore assume you are serious about this statement and ensure that said "real code" is executed in all cases when objects of that type are created.
If you give buffer_object a default constructor that generates a buffer object, you are saying that objects of this type should always own a valid OpenGL buffer object (except maybe when they are moved-from). If you give buffer_object a default constructor that zeros out the internal buffer object handle, you are saying that having no OpenGL buffer object handle is a normal, valid state for a buffer_object to be in.
There's no half-way with these two statements. Either the default is empty or it isn't.
Now, you can have alternate constructors. The default constructor could make the handle zero, with another constructor that takes a simple tag type that explicitly says that it will generate a handle. Or better yet, how about a factory function that makes buffer_objects that have generated handles:
buffer_object gen_buffer()
{
GLuint handle = 0;
glGenBuffers(1, &handle);
return buffer_object(handle);
}
auto obj = gen_buffer(); //Clearly states what is going on.
Now there is no question what is in obj: it has a generated buffer object handle. Because that's what the code says is going on.
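To make the "empty default plus explicit factory" idea concrete, here is a minimal sketch. It assumes a movable wrapper with a non-const name member, and it replaces the real OpenGL calls with hypothetical stand-ins (fake_gen_buffers, fake_delete_buffers, deleted_count) so that it runs without a GL context. In real code those would be glGenBuffers and glDeleteBuffers.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for glGenBuffers: hands out increasing non-zero names.
inline void fake_gen_buffers(int n, unsigned int* names) {
    static unsigned int next = 1;
    for (int i = 0; i < n; ++i) names[i] = next++;
}

// Hypothetical bookkeeping so the sketch can show that names are released.
inline int& deleted_count() {
    static int count = 0;
    return count;
}

// Hypothetical stand-in for glDeleteBuffers: counts released non-zero names.
inline void fake_delete_buffers(int n, const unsigned int* names) {
    for (int i = 0; i < n; ++i)
        if (names[i] != 0) ++deleted_count();
}

class buffer_object {
public:
    buffer_object() = default;                                  // empty state: name 0
    explicit buffer_object(unsigned int name) : name_(name) {}  // adopt an existing name

    buffer_object(buffer_object&& other) noexcept
        : name_(std::exchange(other.name_, 0u)) {}
    buffer_object& operator=(buffer_object&& other) noexcept {
        if (this != &other) {
            release();
            name_ = std::exchange(other.name_, 0u);
        }
        return *this;
    }
    buffer_object(const buffer_object&) = delete;
    buffer_object& operator=(const buffer_object&) = delete;

    ~buffer_object() { release(); }

    unsigned int name() const { return name_; }

private:
    void release() {
        if (name_ != 0) fake_delete_buffers(1, &name_);
        name_ = 0;
    }
    unsigned int name_ = 0;
};

// Factory: the call site states explicitly that a name is generated.
inline buffer_object gen_buffer() {
    unsigned int handle = 0;
    fake_gen_buffers(1, &handle);
    return buffer_object(handle);
}

inline void example_usage() {
    buffer_object empty;               // what a container element would look like
    assert(empty.name() == 0);
    buffer_object obj = gen_buffer();  // explicitly generated, ready to use
    assert(obj.name() != 0);
}
```

With this shape, std::array<buffer_object, N> arr{}; gives N empty elements, and each one can later be move-assigned from gen_buffer() or from a constructor that adopts a name produced by a batched generate call.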

Related

Function Pointer Array Initialization

Consider an array of function pointers within a class in c++14:
class myClass {
myClass();
void (* myFunctions [5])();
};
This array will be populated in other parts of the code, and it is not guaranteed that the array will actually hold as many function pointers as it can (5 in the example). So before calling any of the functions referred to by the array, I'd need to check whether the pointer I'm trying to use actually stores something. My idea was to compare the pointer against nullptr, and assume that it points to a valid function if the test fails, i.e. in some function of that class:
if(myFunctions[x] == nullptr) {
// Handle error
return;
}
// Use myFunctions[x]
I was wondering if I need to initialize the array before making that kind of speculation. How is this array initialized if I only write the above two lines of code? Can I assume something like myFunctions[x] == nullptr is true in that case? Or do I need to do an explicit initialization like the following one?
for(int i = 0; i < dimension; i++)
myFunctions[i] = nullptr;
I've found the zero initialization page on cppreference.com, but as I'm learning C++ and I'm still very inexperienced I wasn't able to understand whether this applies in my case. I've also tried to test the pointers by printing their value without assigning any function address and got 0 0 0 0 0, but I don't think the result is reliable: this program is for a microcontroller, and it's possible that the "garbage" values on the stack are all 0s since it is emptied during each code upload.
You do have to initialize the array. The array is default-initialized in your case because you provide a constructor and do not mention the data member in its initializer list (cf. cppreference/default initialization):
Default initialization is performed in three situations:
...
3) when a base class or a non-static data member is not mentioned in a
constructor initializer list and that constructor is called.
Hence, default initialization will take place, and default-initializing an array of pointers leaves its elements with indeterminate values (unless the object has static storage duration, in which case the storage is zero-initialized first).
Anyway, in order to express that you rely on a data member to be "zero" without enforcing in other ways that non-initialized entries will not be accessed, I'd make the initialization explicit, e.g.
class myClass {
myClass();
void (* myFunctions [5])() = { };
};
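As a rough sketch of how the explicit member initializer plays together with the nullptr check from the question: the class and function names below (FunctionTable, set, call, on_event, hit_count) are made up for illustration and are not from the original code.

```cpp
#include <cassert>
#include <cstddef>

class FunctionTable {
public:
    using Handler = void (*)();
    static constexpr std::size_t capacity = 5;

    // Stores a handler in a slot; returns false if the index is out of range.
    bool set(std::size_t index, Handler h) {
        if (index >= capacity) return false;
        handlers_[index] = h;
        return true;
    }

    // Calls the handler only if the slot is populated; reports whether it did.
    bool call(std::size_t index) const {
        if (index >= capacity || handlers_[index] == nullptr) return false;
        handlers_[index]();
        return true;
    }

private:
    Handler handlers_[capacity] = {};  // every slot starts out as nullptr
};

// Hypothetical handler that just counts its invocations.
inline int& hit_count() { static int n = 0; return n; }
inline void on_event() { ++hit_count(); }

inline void example_table() {
    FunctionTable table;
    assert(!table.call(0));   // empty slot is detected, nothing is called
    table.set(0, on_event);
    assert(table.call(0));    // populated slot is invoked
}
```

Because the initializer sits on the member itself, it applies no matter which constructor runs, so the nullptr comparison is reliable even for objects on the stack.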

Is circumventing a class' constructor legal or does it result in undefined behaviour?

Consider following sample code:
class C
{
public:
int* x;
};
void f()
{
C* c = static_cast<C*>(malloc(sizeof(C)));
c->x = nullptr; // <-- here
}
If I had to live with the uninitialized memory for any reason (of course, if possible, I'd call new C() instead), I still could call the placement constructor. But if I omit this, as above, and initialize every member variable manually, does it result in undefined behaviour? I.e. is circumventing the constructor per se undefined behaviour or is it legal to replace calling it with some equivalent code outside the class?
(Came across this via another question on a completely different matter; asking for curiosity...)
It is legal now, and retroactively since C++98!
Indeed the C++ specification wording till C++20 was defining an object as (e.g. C++17 wording, [intro.object]):
The constructs in a C++ program create, destroy, refer to, access, and
manipulate objects. An object is created by a definition (6.1), by a
new-expression (8.5.2.4), when implicitly changing the active member
of a union (12.3), or when a temporary object is created (7.4, 15.2).
The possibility of creating an object using malloc allocation was not mentioned, making it de facto undefined behavior.
It was then viewed as a problem, and this issue was addressed later by https://wg21.link/P0593R6 and accepted as a DR against all C++ versions since C++98 inclusive, then added into the C++20 spec, with the new wording:
[intro.object]
The constructs in a C++ program create, destroy, refer to, access, and manipulate objects. An object is created by a definition, by a new-expression, by an operation that implicitly creates objects (see below)...
...
Further, after implicitly creating objects within a specified region of
storage, some operations are described as producing a pointer to a
suitable created object. These operations select one of the
implicitly-created objects whose address is the address of the start
of the region of storage, and produce a pointer value that points to
that object, if that value would result in the program having defined
behavior. If no such pointer value would give the program defined
behavior, the behavior of the program is undefined. If multiple such
pointer values would give the program defined behavior, it is
unspecified which such pointer value is produced.
The example given in C++20 spec is:
#include <cstdlib>
struct X { int a, b; };
X *make_x() {
// The call to std::malloc implicitly creates an object of type X
// and its subobjects a and b, and returns a pointer to that X object
// (or an object that is pointer-interconvertible ([basic.compound]) with it),
// in order to give the subsequent class member access operations
// defined behavior.
X *p = (X*)std::malloc(sizeof(struct X));
p->a = 1;
p->b = 2;
return p;
}
There is no living C object, so pretending that there is one results in undefined behavior.
P0137R1, adopted at the committee's Oulu meeting, makes this clear by defining object as follows ([intro.object]/1):
An object is created by a definition ([basic.def]), by a new-expression ([expr.new]), when implicitly changing the active member of a union ([class.union]), or when a temporary object is created ([conv.rval], [class.temporary]).
reinterpret_cast<C*>(malloc(sizeof(C))) is none of these.
Also see this std-proposals thread, with a very similar example from Richard Smith (with a typo fixed):
struct TrivialThing { int a, b, c; };
TrivialThing *p = reinterpret_cast<TrivialThing*>(malloc(sizeof(TrivialThing)));
p->a = 0; // UB, no object of type TrivialThing here
The [basic.life]/1 quote applies only when an object is created in the first place. Note that "trivial" or "vacuous" (after the terminology change done by CWG1751) initialization, as that term is used in [basic.life]/1, is a property of an object, not a type, so "there is an object because its initialization is vacuous/trivial" is backwards.
I think the code is ok, as long as the type has a trivial constructor, as yours does. Using the object cast from malloc without calling placement new is just using the object before calling its constructor. From C++ standard 12.7 [class.cdtor]:
For an object with a non-trivial constructor, referring to any non-static member or base class of the object before the constructor begins execution results in undefined behavior.
Since the exception proves the rule, referring to a non-static member of an object with a trivial constructor before the constructor begins execution is not UB.
Further down in the same paragraphs there is this example:
extern X xobj;
int* p = &xobj.i;
X xobj;
This code is labelled as UB when X is non-trivial, but as not UB when X is trivial.
For the most part, circumventing the constructor results in undefined behavior.
There are, arguably, some corner cases for plain old data types, but you don't gain anything by bypassing the constructor there anyway, since it is trivial. Is the code as simple as presented?
[basic.life]/1
The lifetime of an object or reference is a runtime property of the object or reference. An object is said to have non-vacuous initialization if it is of a class or aggregate type and it or one of its subobjects is initialized by a constructor other than a trivial default constructor. [ Note: initialization by a trivial copy/move constructor is non-vacuous initialization. — end note ] The lifetime of an object of type T begins when:
storage with the proper alignment and size for type T is obtained, and
if the object has non-vacuous initialization, its initialization is complete.
The lifetime of an object of type T ends when:
if T is a class type with a non-trivial destructor ([class.dtor]), the destructor call starts, or
the storage which the object occupies is reused or released.
Aside from the code being harder to read and reason about, you will either not win anything, or end up with undefined behavior. Just use the constructor, it is idiomatic C++.
This particular code is fine, because C is a POD. As long as C is a POD, it can be initialized that way as well.
Your code is equivalent to this:
struct C
{
int *x;
};
C* c = (C*)malloc(sizeof(C));
c->x = NULL;
Does it not look familiar? It is all good. There is no problem with this code.
While you can initialize all explicit members that way, you cannot initialize everything a class may contain:
references cannot be set outside an initializer list
vtable pointers cannot be manipulated by code at all
That is, the moment that you have a single virtual member, or virtual base class, or reference member, there is no way to correctly initialize your object except by calling its constructor.
I think it shouldn't be UB. You make your pointer point to some raw memory and are treating its data in a particular way, there's nothing bad here.
If the constructor of this class does something (initializes variables, etc), you'll end up with, again, a pointer to raw, uninitialized object, using which without knowing what the (default) constructor was supposed to be doing (and repeating its behavior) will be UB.
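For the case where the constructor does real work, the usual way to keep malloc'ed storage and still get a properly constructed object is placement new. The following is only a sketch: Node, make_node and destroy_node are hypothetical names, and Node is given a user-provided constructor on purpose so that skipping it would matter.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

struct Node {
    int* x;
    Node() : x(nullptr) {}   // non-trivial constructor: it must actually run
};

// Allocates raw storage with malloc and starts the object's lifetime
// explicitly with placement new. Returns nullptr if the allocation fails.
inline Node* make_node() {
    void* raw = std::malloc(sizeof(Node));
    if (raw == nullptr) return nullptr;
    return new (raw) Node();   // runs the constructor inside the malloc'ed storage
}

// Ends the lifetime explicitly, then releases the storage.
inline void destroy_node(Node* n) {
    if (n == nullptr) return;
    n->~Node();
    std::free(n);
}

inline void example_node() {
    Node* n = make_node();
    assert(n != nullptr);
    assert(n->x == nullptr);   // set by the constructor, not by hand
    destroy_node(n);
}
```

This form does not depend on which standard revision or defect report applies, and it keeps working if Node later gains a reference member or a virtual function.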

Is uninitialized data behavior well specified?

Note: I am using the g++ compiler (which I hear is pretty good and supposed to be pretty close to the standard).
I have the simplest class I could think of:
class BaseClass {
public:
int pub;
};
Then I have three equally simple programs to create BaseClass object(s) and print out the [uninitialized] value of their data.
Case 1
BaseClass B1;
cout<<"B1.pub = "<<B1.pub<<endl;
This prints out:
B1.pub = 1629556548
Which is fine. I actually thought it would get initialized to zero because it is a POD or Plain Old Datatype or something like that, but I guess not? So far so good.
Case 2
BaseClass B1;
cout<<"B1.pub = "<<B1.pub<<endl;
BaseClass B2;
cout<<"B2.pub = "<<B2.pub<<endl;
This prints out:
B1.pub = 1629556548
B2.pub = 0
This is definitely weird. I created two of the same objects the same exact way. One got initialized and the other did not.
Case 3
BaseClass B1;
cout<<"B1.pub = "<<B1.pub<<endl;
BaseClass B2;
cout<<"B2.pub = "<<B2.pub<<endl;
BaseClass* pB3 = new BaseClass;
cout<<"B3.pub = "<<pB3->pub<<endl;
This prints out:
B1.pub = 0
B2.pub = 0
B3.pub = 0
This is the most weird yet. They all get initialized to zero. All I did was add two lines of code and it changed the previous behavior.
So is this just a case of 'uninitialized data leads to unspecified behavior' or is there something more logical going on 'under the hood'?
I really want to understand the default constructor/destructor behavior because I have a feeling that it will be very important for completely understanding the inheritance stuff..
So is this just a case of 'uninitialized data leads to unspecified behavior'
Yes...
Sometimes if you call malloc (or new, which calls malloc) you will get data that is filled with zeroes because it is in a fresh page from the kernel. Other times it will be full of junk. If you put something on the stack (i.e., auto storage), you will almost certainly get garbage — but it can be hard to debug, because on your system that garbage might happen to be somewhat predictable. And with objects on the stack, you'll find that changing code in a completely different source file can change the values you see in an uninitialized data structure.
About POD: Whether or not something is POD is really a red herring here. I only explained it because the question mentioned POD, and the conversation derailed from there. The two relevant concepts are storage duration and constructors. POD objects don't have constructors, but not everything without a constructor is POD. (Technically, POD objects don't have non-trivial constructors nor members with non-trivial constructors.)
Storage duration: There are three kinds. Static duration is for globals, automatic is for local variables, and dynamic is for objects on the heap. (This is a simplification and not exactly correct, but you can read the C++ standard yourself if you need something exactly correct.)
Anything with static storage duration gets initialized to zero. So if you make a global instance of BaseClass, then its pub member will be zero (at first). Since you put it on the stack and the heap, this rule does not apply — and you don't do anything else to initialize it, so it is uninitialized. It happens to contain whatever junk was left in memory by the last piece of code to use it.
As a rule, any POD on the heap or the stack will be uninitialized unless you initialize it yourself, and the value will be undefined, possibly changing when you recompile or run the program again. As a rule, any global POD will get initialized to zero unless you initialize it to something else.
Detecting uninitialized values: Try using Valgrind's memcheck tool, it will help you find where you use uninitialized values — these are usually errors.
It depends how you declare them:
// Assuming the POD type's
class BaseClass
{
public:
int pub;
};
Static Storage Duration Objects
These objects are always zero initialized.
// Static storage duration objects:
// PODS are zero initialized.
BaseClass global; // Zero initialized pub = 0
void plop()
{
static BaseClass functionStatic; // Zero initialized.
}
Automatic/Dynamic Storage Duration Objects
These objects may be default- or zero-initialized depending on how you declare them
void plop1()
{
// Dynamic
BaseClass* dynaObj1 = new BaseClass; // Default initialized (does nothing)
BaseClass* dynaObj2 = new BaseClass(); // Zero Initialized
// Automatic
BaseClass autoObj1; // Default initialized (does nothing)
BaseClass autoObj2 = BaseClass(); // Zero Initialized
// Notice that zero initialization of an automatic object is not the same
// as the zero initialization of a dynamic object this is because of the most
// vexing parse problem
BaseClass autoObj3(); // Unfortunately not a zero initialized object.
// It's a declaration of a function.
}
I use the terms 'Zero-Initialized'/'Default-Initialization', but technically it is slightly more complex. 'Default-Initialization' becomes 'no-initialization' of the pub member, while the () invokes 'Value-Initialization', which becomes 'Zero-Initialization' of the pub member.
Note: As BaseClass is a POD this class behaves just like the builtin types. If you swap BaseClass for any of the standard types the behavior is the same.
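The forms that do guarantee a zero can be checked directly. This is a small sketch using the BaseClass from the question. The helper function names are invented for illustration, and the default-initialized forms are deliberately not read, since reading an indeterminate value is not something a program may rely on.

```cpp
#include <cassert>
#include <memory>

struct BaseClass {
    int pub;   // no initializer: default-initialization leaves it indeterminate
};

BaseClass global_object;   // static storage duration: zero-initialized

inline int braced_local() {
    BaseClass obj{};               // empty braces: pub is zero
    return obj.pub;
}

inline int copied_from_temporary() {
    BaseClass obj = BaseClass();   // value-initialized temporary: pub is zero
    return obj.pub;
}

inline int value_initialized_dynamic() {
    std::unique_ptr<BaseClass> p(new BaseClass());   // note the (): pub is zero
    return p->pub;
}

inline void check_zero_forms() {
    assert(global_object.pub == 0);
    assert(braced_local() == 0);
    assert(copied_from_temporary() == 0);
    assert(value_initialized_dynamic() == 0);
}
```

The braced form BaseClass obj{}; also sidesteps the vexing parse, because it cannot be read as a function declaration.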
In all three cases, those POD objects could have indeterminate values.
POD objects without any initializer will NOT be initialized to a default value. They just contain garbage.
From Standard 8.5 Initializers,
"If no initializer is specified for an object, and the object is of
(possibly cv-qualified) non-POD class type (or array thereof), the
object shall be default-initialized; if the object is of
const-qualified type, the underlying class type shall have a
user-declared default constructor. Otherwise, if no initializer is
specified for a nonstatic object, the object and its subobjects, if
any, have an indeterminate initial value; if the object or any of
its subobjects are of const-qualified type, the program is
ill-formed."
You can zero initialize all the members of a POD struct like this,
BaseClass object={0};
The way you wrote your class, the value of pub is indeterminate and can be anything. If you create a default constructor that value-initializes pub, it is guaranteed to be zero:
class BaseClass {
public:
BaseClass() : pub() {} // pub() value-initializes the member, which guarantees it to be zero.
int pub;
};
That would be a much better practice.
As a general rule, yes: reading uninitialized data leads to undefined behavior. That's why other languages like C# take steps to ensure that you don't use uninitialized data.
This is why you always, always, always initialize a class (or ANY variable) to a stable state instead of relying on the compiler to do it for you; especially since some compilers deliberately fill them with garbage. In fact, the only POD MSVC++ doesn't fill with garbage is bool, they get initialized to true. One would think it'd be safer to init it to false, but that's Microsoft for ya.
Case 1
Regardless of whether the encapsulating type is POD, data members of a built-in type are not default initialised by an encapsulating default constructor.
Case 2
No, neither got initialised. The underlying bytes at the memory position of one of them just happened to be 0.
Case 3
Same.
You seem to be expecting some guarantees about the "value" of uninitialised objects, whilst simultaneously professing the understanding that no such value exists.

Can I specify default value?

Why is it that for user defined types when creating an array of objects every element of this array is initialized with the default constructor, but when I create an array of a built-in type that isn't the case?
And second question: Is it possible to specify default value to be used while initializing elements in the array? Something like this (not valid):
char* p = new char[size]('\0');
And another question in this topic while I'm with arrays. I suppose that when creating an array of user defined type, every element of this array will be initialized with default value. Why is this?
If arrays for built in types do not initialize their elements with their defaults, why do they do it for User Defined Types?
Is there a way to avoid/circumvent this default construction somehow? It seems like bit of a waste if I for example have created an array with size 10000, which forces 10000 default constructor calls, initializing data which I will (later on) overwrite anyway.
I think that behaviour should be consistent, so either every type of array should be initialized or none. And I think that the behaviour for built-in arrays is more appropriate.
That's how built-in types work in C++. In order to initialize them you have to supply an explicit initializer. If you don't, then the object will remain uninitialized. This behavior is in no way specific to arrays. Standalone objects behave in exactly the same way.
One problem here is that when you are creating an array using new[], your options for supplying an initializer (in the current version of the language) are very limited. In fact, the only initializer you can supply is the empty ()
char* p = new char[size]();
// The array is filled with zeroes
In case of char type (or any other scalar type), the () initializer will result in zero-initialization, which is incidentally what you tried to do.
Of course, if your desired default value is not zero, you are out of luck, meaning that you have to explicitly assign the default values to the elements of the new[]-ed array afterwards.
As for disabling the default constructor call for arrays of types with a user-defined default constructor... well, there's no way to achieve that with ordinary new[]. However, you can do it by implementing your own array construction process (which is what std::vector does, for one example). You first allocate raw memory for the entire array, and then manually construct the elements one-by-one in any way you see fit. The standard library provides a number of primitives intended to be used specifically for that purpose. That includes std::allocator and functions like uninitialized_copy, uninitialized_fill and so on.
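A rough sketch of that "allocate raw memory, then construct in place" process, using std::allocator and std::uninitialized_fill_n. Widget, default_constructions and build_and_inspect are hypothetical names for illustration, and std::destroy_n assumes C++17. The sketch ignores exception safety: if a copy constructor threw, the storage would leak.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Counts how often Widget's default constructor runs.
inline int& default_constructions() { static int n = 0; return n; }

struct Widget {
    std::string label;
    Widget() : label("default") { ++default_constructions(); }
    explicit Widget(std::string text) : label(std::move(text)) {}
};

// Builds n copies of `prototype` in raw storage without ever running the
// default constructor, and returns the label of the last element.
inline std::string build_and_inspect(std::size_t n, const Widget& prototype) {
    if (n == 0) return std::string();
    std::allocator<Widget> alloc;
    Widget* data = alloc.allocate(n);                // raw, unconstructed storage
    std::uninitialized_fill_n(data, n, prototype);   // copy-construct each element
    std::string last = data[n - 1].label;
    std::destroy_n(data, n);                         // C++17: run the destructors
    alloc.deallocate(data, n);
    return last;
}

inline void example_fill() {
    Widget prototype("custom");
    assert(build_and_inspect(3, prototype) == "custom");
    assert(default_constructions() == 0);   // the default constructor never ran
}
```

In practice std::vector<Widget>(n, prototype) does the same thing with the cleanup handled for you, so hand-written code like this is mostly useful for understanding what the container does internally.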
Something like this (not valid):
As far as I know that is perfectly valid.
Well not completely, but you can get a zero initialized character array:
#include <iostream>
#include <cstdlib>
int main(int argc, char* argv[])
{
//The extra parentheses on the end call the "default constructor"
//of char, which initializes it with zero.
char * myCharacters = new char[100]();
for(size_t idx = 0; idx != 100; idx++) {
if (!myCharacters[idx])
continue;
std::cout << "Error at " << idx << std::endl;
std::system("pause");
}
delete [] myCharacters;
return 0;
}
This program produces no output.
And another question in this topic while I'm with arrays. I suppose that when creating an array of user defined type and knowing the fact that every elem. of this array will be initialized with default value firstly why?
Because there's no good syntactic way to specialize each element allocated with new. You can avoid this problem by using a vector instead, and calling reserve() in advance. The vector will allocate the memory but the constructors will not be called until you push_back into the vector. You should be using vectors instead of user managed arrays anyway because new'd memory handling is almost always not exception safe.
I think that behaviour should be consistent, so either every type of array should be initialized or none. And I think that the behaviour for built-in arrays is more appropriate.
Well if you can think of a good syntax for this you can write up a proposal for the standard -- not sure how far you'll get with that.
Why is it that for user defined types when creating an array of objects every element of this array is initialized with the default constructor, but when I create an array of a built-in type that isn't the case? and
And another question in this topic while I'm with arrays. I suppose that when creating an array of user defined type, every element of this array will be initialized with default value. Why is this? and
If arrays for built in types do not initialize their elements with their defaults, why do they do it for User Defined Types?
Because a user defined type is never ever valid until its constructor is called. Built in types are always valid even if a constructor has not been called.
And second question: Is it possible to specify default value to be used while initializing elements in the array? Something like this (not valid):
Answered this above.
Is there a way to avoid/circumvent this default construction somehow? It seems like bit of a waste if I for example have created an array with size 10000, which forces 10000 default constructor calls, initializing data which I will (later on) overwrite anyway.
Yes, you can use a vector as I described above.
Currently (unless you're using the new C++0x), C++ will call the constructor that takes no arguments e.g. myClass::myClass(). If you want to initialise it to something, implement a constructor like this that initialises your variables. e.g.
class myChar {
public:
myChar();
char myCharVal;
};
myChar::myChar(): myCharVal('\0') {
}
The C++ philosophy is - don't pay for something you don't need.
And I think the behaviour is pretty unified. If your UDT didn't have a default constructor, nothing would be run anyway and the behaviour would be the same as for built-in types (which don't have a default constructor).
Why is it that for user defined types when creating an array of objects every element of this array is initialized with the default constructor, but when I create an array of a built-in type that isn't the case?
But if the default constructor of an object does nothing then it is still not initialized.
class X
{
public:
char y;
};
X* data = new X[size]; // default constructor called. But y is still undefined.
And second question: Is it possible to specify default value to be used while initializing elements in the array? Something like this (not valid):
Yes:
char data1[size] = { 0 };
std::vector<char> data2(size,0);
char* data3 = new char[size];
memset(data3,0,size);
Is there a way to avoid/circumvent this default construction somehow? It seems like bit of a waste if I for example have created an array with size 10000, which forces 10000 default constructor calls, initializing data which I will (later on) overwrite anyway.
Yes. Use a std::vector.
You can reserve the space for all the elements you need without calling the constructor.
std::vector<char> data4;
data4.reserve(size);
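To see that reserve() really allocates without constructing, here is a small sketch with a hypothetical Tracked type that counts its constructor calls (the type and demo_reserve are made up for illustration).

```cpp
#include <cassert>
#include <vector>

struct Tracked {
    // Number of times the int constructor has run.
    static int& constructed() { static int n = 0; return n; }
    int value;
    explicit Tracked(int v) : value(v) { ++constructed(); }
};

inline void demo_reserve() {
    std::vector<Tracked> v;
    v.reserve(10000);                      // capacity only, no elements yet
    assert(v.empty());
    assert(Tracked::constructed() == 0);   // no constructor has run
    v.emplace_back(42);                    // constructs exactly one element in place
    assert(Tracked::constructed() == 1);
    assert(v.front().value == 42);
}
```

Note that after reserve() the vector is still empty: elements have to be added with push_back or emplace_back, and indexing past size() is not allowed even though the memory has been allocated.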

default initialization in C++

I have a question about the default initialization in C++. I was told the non-POD object will be initialized automatically. But I am confused by the code below.
Why when I use a pointer, the variable i is initialized to 0, however, when I declare a local variable, it's not. I am using g++ as the compiler.
class INT {
public: int i;
};
int main () {
INT* myint1 = new INT;
INT myint2;
cout<<"myint1.i is "<<myint1->i<<endl;
cout<<"myint2.i is "<<myint2.i<<endl;
return 0;
}
The output is
myint1.i is 0
myint2.i is -1078649848
You need to declare a c'tor in INT and force 'i' to a well-defined value.
class INT {
public:
INT() : i(0) {}
...
};
i is still a POD, and is thus not initialized by default. It doesn't make a difference whether you allocate on the stack or from the heap: in both cases, the value of i is undefined.
In both cases it's not initialized, you were just lucky to get 0 in the first one.
It depends on the compiler. The big difference here is that a pointer set to new Something refers to some area of memory in heap, while local variables are stored on the stack. Perhaps your compiler zeroes heap memory but doesn't bother zeroing stack memory; either way, you can't count on either method zeroing your memory. You should use something like ZeroMemory in Win32 or memset from the C standard libraries to zero your memory, or set i = 0 in the constructor of INT.
For a class or struct-type, if you don't tell it which constructor to use when defining a variable, then the default constructor is called. If you didn't define a default constructor, then the compiler creates one for you. If the type is not a class (or struct) type, then it is not initialized since it won't have a constructor, let alone a default constructor (so no built-in types like int will ever be default-initialized).
So, in your example, both myint1 and myint2 are default constructed with the default constructor that the compiler declared for INT. But since that won't initialize any non-class/struct variables in an INT, the i member variable of INT is not initialized.
If you want i to be initialized, you need to write a default constructor for INT which initializes it.
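As an alternative to writing the constructor by hand, a default member initializer (C++11 and later) has the same effect. The sketch below uses a hypothetical class name Int and helper functions invented for illustration.

```cpp
#include <cassert>
#include <memory>

class Int {
public:
    int i = 0;   // default member initializer: used by the implicit default constructor
};

// Automatic object, declared exactly like myint2 in the question.
inline int local_value() {
    Int x;
    return x.i;
}

// Dynamic object, created exactly like myint1 in the question (no parentheses).
inline int heap_value() {
    std::unique_ptr<Int> p(new Int);
    return p->i;
}

inline void check_both() {
    assert(local_value() == 0);
    assert(heap_value() == 0);
}
```

With the initializer in place, both the stack and the heap object start at 0 regardless of what happened to be in memory before.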
Firstly, your class INT is POD.
Secondly, when something is "initialized automatically" (for automatic or dynamic objects), it means that the constructor is called automatically. There is no other "automatic" initialization scenario (for automatic or dynamic objects); all other scenarios require a "manually" specified initializer. However, if the constructor does not do anything to perform the desired initialization, then that initialization will not take place. It is your responsibility to write that constructor, when necessary.
In both cases in your example you should get garbage in your objects. The 0 that you observe in case of new-ed object is there purely by accident.
That's true for non-POD objects, but your object is POD since it has no user defined constructor and only contains PODs itself.
We just had this topic, you can find some explanations here.
If you add more variables to the class, you will end up with non-initialized member variables.