C++: member pointer initialised? - c++

Code sample should explain things:
class A
{
B* pB;
C* pC;
D d;
public :
A(int i, int j) : d(j)
{
pC = new C(i, "abc");
} // note pB is not initialised, e.g. pB(NULL)
...
};
Obviously pB should be initialised to NULL explicitly to be safe (and clear), but, as it stands, what is the value of pB after construction of A? Is it default initialised (which is zero?) or not (i.e. indeterminate and whatever was in memory). I realise initialisation in C++ has a fair few rules.
I think it isn't default initialised; as running in debug mode in Visual Studio it has set pB pointing to 0xcdcdcdcd - which means the memory has been new'd (on the heap) but not initialised. However in release mode, pB always points to NULL. Is this just by chance, and therefore not to be relied upon; or are these compilers initialising it for me (even if it's not in the standard)? It also seems to be NULL when compiled with Sun's compiler on Solaris.
I'm really looking for a specific reference to the standard to say one way or the other.
Thanks.

Here is the relevant passage fromt he standard:
12.6.2 Initializing bases and members [class.base.init]
4 If a given nonstatic data member or
base class is not named by a mem-
initializer-id in the
mem-initializer-list, then
--If the entity is a nonstatic data
member of (possibly cv-qualified)
class type (or array thereof) or a base class, and the entity class
is a non-POD class, the entity is default-initialized (dcl.init).
If the entity is a nonstatic data member of a const-qualified type,
the entity class shall have a user-declared default constructor.
--Otherwise, the entity is not
initialized. If the entity is of
const-qualified type or reference type, or of a (possibly cv-quali-
fied) POD class type (or array thereof) containing (directly or
indirectly) a member of a const-qualified type, the program is
ill-
formed.
After the call to a constructor for
class X has completed, if a member
of X is neither specified in the
constructor's mem-initializers, nor
default-initialized, nor initialized
during execution of the body of
the constructor, the member has
indeterminate value.

According to the C++0x standard section 12.6.2.4, in the case of your pointer variable, if you don't include it in the initializer list and you don't set it in the body of the constructor, then it has indeterminate value. 0xCDCDCDCD and 0 are two possible such values, as is anything else. :-)

I believe this is a artifact from the good old C days when you could not have expectations on what alloc'd memory contains. As the standards progressed to C++ this "convention" was maintained. As the C++ compilers developed the individual authors took it upon themselves to "fix" this problem. Therefore your mileage may vary depending on your compiler of choice.
The "0xcdcdcdcd" looks to be a readily identifiable pattern that "helps" in debugging you code. That is why it doesn't show in release mode.
I hope this helped in a little way and good luck.

Uninitialised pointers are allow to basically contain a random value, although some compilers tend to fill them with 0 or some other recognisable value, especially in debug mode.
IMHO this is due to C++'s "don't pay for what you don't use" design. If you don't consider it important, the compiler does not need to go through the expense of initialising the variable for you. Of course, once you've chased a random pointer you might find it prudent to initialise it the next time around...

The value of pB is undefined. It may or may not be consistently the same value - usually depends on what was previously at the same place in memory prior to the allocation of a particular instance of A.

Uninitialised pointers can point to anything. Some compiler vendors will help you out and make them point to 0 or 0xcdcdcdcd or whatever.
To make sure your code is safe and portable you should always initialise your pointers. either to 0 or to a valid value.
e.g.
C* pc = 0;
or
C* pc = new C(...);
If you always initialise pointers to 0 then this is safe :
if (!pc)
pc = new C(...);
If you don't initialise then you've got no way of telling initialised and uninitialised pointers apart.
As an aside, there's no such keyword in C++ as NULL. Most compilers define NULL as 0, but it's not considered portable to use it. The new c++0x standard will introduce a new keyword, nullptr, so when that comes out we'll finally have a portable null pointer constant.

It's rare that I'll recommend not learning something about the language you're using, but in this case, whether or not pB is initialized isn't useful information. Just initialize it. If it's automatically initialized, the compiler will optimize out the extra initialization. If it isn't, you've added one extra processor instruction and prevented a whole slew of potential bugs.

Related

Does a member have to be initialized to take its address?

Can I initialize a pointer to a data member before initializing the member? In other words, is this valid C++?
#include <string>
class Klass {
public:
Klass()
: ptr_str{&str}
, str{}
{}
private:
std::string *ptr_str;
std::string str;
};
this question is similar to mine, but the order is correct there, and the answer says
I'd advise against coding like this in case someone changes the order of the members in your class.
Which seems to mean reversing the order would be illegal but I couldn't be sure.
Does a member have to be initialized to take its address?
No.
Can I initialize a pointer to a data member before initializing the member? In other words, is this valid C++?
Yes. Yes.
There is no restriction that operand of unary & need to be initialised. There is an example in the standard in specification of unary & operator:
int a;
int* p1 = &a;
Here, the value of a is indeterminate and it is OK to point to it.
What that example doesn't demonstrate is pointing to an object before its lifetime has begun, which is what happens in your example. Using a pointer to an object before and after its lifetime is explicitly allowed if the storage is occupied. Standard draft says:
[basic.life] Before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any pointer that represents the address of the storage location where the object will be or was located may be used but only in limited ways ...
The rule goes on to list how the usage is restricted. You can get by with common sense. In short, you can treat it as you could treat a void*, except violating these restrictions is UB rather than ill-formed. Similar rule exists for references.
There are also restrictions on computing the address of non-static members specifically. Standard draft says:
[class.cdtor] ... To form a pointer to (or access the value of) a direct non-static member of an object obj, the construction of obj shall have started and its destruction shall not have completed, otherwise the computation of the pointer value (or accessing the member value) results in undefined behavior.
In the constructor of Klass, the construction of Klass has started and destruction hasn't completed, so the above rule is satisfied.
P.S. Your class is copyable, but the copy will have a pointer to the member of another instance. Consider whether that makes sense for your class. If not, you will need to implement custom copy and move constructors and assignment operators. A self-reference like this is a rare case where you may need custom definitions for those, but not a custom destructor, so it is an exception to the rule of five (or three).
P.P.S If your intention is to point to one of the members, and no object other than a member, then you might want to use a pointer to member instead of pointer to object.
Funny question.
It is legitimate and will "work", though barely. There is a little "but" related to types which makes the whole thing a bit awkward with a bad taste (but not illegitimate), and which might make it illegal some border cases involving inheritance.
You can, of course, take the address of any object whether it's initialized or not, as long as it exists in the scope and has a name which you can prepend operator& to. Dereferencing the pointer is a different thing, but that wasn't the question.
Now, the subtle problem is that the standard defines the result of operator& for non-static struct members as "“pointer to member of class C of type T” and is a prvalue designating C::m".
Which basically means that ptr_str{&str} will take the address of str, but the type is not pointer-to, but pointer-to-member-of. It is then implicitly and silently cast to pointer-to.
In other words, although you do not need to explicitly write &this->str, that's nevertheless what its type is -- it's what it is and what it means [1].
Is this valid, and is it safe to use it within the initializer list? Well yes, just... barely. It's safe to use it as long as it's not being used to access uninitialized members or virtual functions, directly or indirectly. Which, as it happens, is the case here (it might not be the case in a different, arguably contrived case).
[1] Funnily, paragraph 4 starts with a clause that says that no member pointer is formed when you put stuff in parentheses. That's remarkable because most people would probably do that just to be 100% sure they got operator precedence right. But if I read correctly, then &this->foo and &(this->foo) are not in any way the same!

PODs and inheritance in C++11. Does the address of the struct == address of the first member?

(I've edited this question to avoid distractions. There is one core question which would need to be cleared up before any other question would make sense. Apologies to anybody whose answer now seems less relevant.)
Let's set up a specific example:
struct Base {
int i;
};
There are no virtual method, and there is no inheritance, and is generally a very dumb and simple object. Hence it's Plain Old Data (POD) and it falls back on a predictable layout. In particular:
Base b;
&b == reinterpret_cast<B*>&(b.i);
This is according to Wikipedia (which itself claims to reference the C++03 standard):
A pointer to a POD-struct object, suitably converted using a reinterpret cast, points to its initial member and vice versa, implying that there is no padding at the beginning of a POD-struct.[8]
Now let's consider inheritance:
struct Derived : public Base {
};
Again, there are no virtual methods, no virtual inheritance, and no multiple inheritance. Therefore this is POD also.
Question: Does this fact (Derived is POD in C++11) allow us to say that:
Derived d;
&d == reinterpret_cast<D*>&(d.i); // true on g++-4.6
If this is true, then the following would be well-defined:
Base *b = reinterpret_cast<Base*>(malloc(sizeof(Derived)));
free(b); // It will be freeing the same address, so this is OK
I'm not asking about new and delete here - it's easier to consider malloc and free. I'm just curious about the regulations about the layout of derived objects in simple cases like this, and where the initial non-static member of the base class is in a predictable location.
Is a Derived object supposed to be equivalent to:
struct Derived { // no inheritance
Base b; // it just contains it instead
};
with no padding beforehand?
You don't care about POD-ness, you care about standard-layout. Here's the definition, from the standard section 9 [class]:
A standard-layout class is a class that:
has no non-static data members of type non-standard-layout class (or array of such types) or reference,
has no virtual functions (10.3) and no virtual base classes (10.1),
has the same access control (Clause 11) for all non-static data members,
has no non-standard-layout base classes,
either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and
has no base classes of the same type as the first non-static data member.
And the property you want is then guaranteed (section 9.2 [class.mem]):
A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa.
This is actually better than the old requirement, because the ability to reinterpret_cast isn't lost by adding non-trivial constructors and/or destructor.
Now let's move to your second question. The answer is not what you were hoping for.
Base *b = new Derived;
delete b;
is undefined behavior unless Base has a virtual destructor. See section 5.3.5 ([expr.delete])
In the first alternative (delete object), if the static type of the object to be deleted is different from its dynamic type, the static type shall be a base class of the dynamic type of the object to be deleted and the static type shall have a virtual destructor or the behavior is undefined.
Your earlier snippet using malloc and free is mostly correct. This will work:
Base *b = new (malloc(sizeof(Derived))) Derived;
free(b);
because the value of pointer b is the same as the address returned from placement new, which is in turn the same address returned from malloc.
Presumably your last bit of code is intended to say:
Base *b = new Derived;
delete b; // delete b, not d.
In that case, the short answer is that it remains undefined behavior. The fact that the class or struct in question is POD, standard layout or trivially copyable doesn't really change anything.
Yes, you're passing the right address, and yes, you and I know that in this case the dtor is pretty much a nop -- nonetheless, the pointer you're passing to delete has a different static type than dynamic type, and the static type does not have a virtual dtor. The standard is quite clear that this gives undefined behavior.
From a practical viewpoint, you can probably get away with the UB if you really insist -- chances are pretty good that there won't be any harmful side effects from what you're doing, at least with most typical compilers. Beware, however, that even at best the code is extremely fragile so seemingly trivial changes could break everything -- and even switching to a compiler with really heavy type checking and such could do so as well.
As far as your argument goes, the situation's pretty simple: it basically means the committee probably could make this defined behavior if they wanted to. As far as I know, however, it's never been proposed, and even if it had it would probably be a very low priority item -- it doesn't really add much, enable new styles of programming, etc.
This is meant as a supplement to Ben Voigt's answer', not a replacement.
You might think that this is all just a technicality. That the standard calling it 'undefined' is just a bit of semantic twaddle that has no real-world effects beyond allowing compiler writers to do silly things for no good reason. But this is not the case.
I could see desirable implementations in which:
Base *b = new Derived;
delete b;
Resulted in behavior that was quite bizarre. This is because storing the size of your allocated chunk of memory when it is known statically by the compiler is kind of silly. For example:
struct Base {
};
struct Derived {
int an_int;
};
In this case, when delete Base is called, the compiler has every reason (because of the rule you quoted at the beginning of your question) to believe that the size of the data pointed at is 1, not 4. If it, for example, implements a version of operator new that has a separate array in which 1 byte entities are all densely packed, and a different array in which 4 byte entities are all densely packed, it will end up assuming the Base * points to somewhere in the 1-byte entity array when in fact it points somewhere in the 4-byte entity array, and making all kinds of interesting errors for this reason.
I really wish operator delete had been defined to also take a size, and the compiler passed in either the statically known size if operator delete was called on an object with a non-virtual destructor, or the known size of the actual object being pointed at if it were being called as a result of a virtual destructor. Though this would likely have other ill effects and maybe isn't such a good idea (like if there are cases in which operator delete is called without a destructor having been called). But it would make the problem painfully obvious.
There is lots of discussion on irrelevant issues above. Yes, mainly for C compatibility there are a number of guarantees you can rely as long as you know what you are doing. All this is, however, irrelevant to your main question. The main question is: Is there any situation where an object can be deleted using a pointer type which doesn't match the dynamic type of the object and where the pointed to type doesn't have a virtual destructor. The answer is: no, there is not.
The logic for this can be derived from what the run-time system is supposed to do: it gets a pointer to an object and is asked to delete it. It would need to store information on how to call derived class destructors or about the amount of memory the object actually takes if this were to be defined. However, this would imply a possibly quite substantial cost in terms of used memory. For example, if the first member requires very strict alignment, e.g. to be aligned at an 8 byte boundary as is the case for double, adding a size would add an overhead of at least 8 bytes to allocate memory. Even though this might not sound too bad, it may mean that only one object instead of two or four fits into a cache line, reducing performance substantially.

Built-in datatypes versus User defined datatypes in C++

Constructors build objects from dust.
This is a statement which I have been coming across many times,recently.
While initializing a built-in datatype variable, the variable also HAS to be "built from dust" . So, are there also constructors for built in types?
Also, how does the compiler treat a BUILT IN DATATYPE and a USER DEFINED CLASS differently, while creating instances for each?
I mean details regarding constructors, destructors etc.
This query on stack overflow is regarding the same and it has some pretty intresting details , most intresting one being what Bjarne said ... !
Do built-in types have default constructors?
Simply put, according to the C++ standard:
12.1 Constructors [class.ctor]
2. A constructor is used to initialize objects of its class type...
so no, built-in datatypes (assuming you're talking about things like ints and floats) do not have constructors because they are not class types. Class types are are specified as such:
9 Classes [class]
1. A class is a type. Its name becomes a class-name (9.1) within
its scope.
class-name:
identifier
template-id
Class-specifiers and elaborated-type-specifiers (7.1.5.3) are used
to make class-names. An object of a class consists of a (possibly
empty) sequence of members and base class objects.
class-specifier:
class-head { member-specification (opt) }
class-head:
class-key identifieropt base-clauseopt
class-key nested-name-specifier identifier base-clauseopt
class-key nested-name-specifieropt template-id base-clauseopt
class-key:
class
struct
union
And since the built-in types are not declared like that, they cannot be class types.
So how are instances of built-in types created? The general process of bringing built-in and class instances into existance is called initialization, for which there's a huge 8-page section in the C++ standard (8.5) that lays out in excruciating detail about it. Here's some of the rules you can find in section 8.5.
As already mentioned, built-in data types don't have constructors.
But you still can use construction-like initialization syntax, like in int i(3), or int i = int(). As far as I know that was introduced to language to better support generic programming, i.e. to be able to write
template <class T>
T f() { T t = T(); }
f(42);
While initializing a built-in datatype variable, the variable also HAS to be "built from dust" . So, are there also constructors for built in types?
Per request, I am rebuilding my answer from dust.
I'm not particularly fond of that "Constructors build objects from dust" phrase. It is a bit misleading.
An object, be it a primitive type, a pointer, or a instance of a big class, occupies a certain known amount of memory. That memory must somehow be set aside for the object. In some circumstances, that set-aside memory is initialized. That initialization is what constructors do. They do not set aside (or allocate) the memory needed to store the object. That step is performed before the constructor is called.
There are times when a variable does not have to be initialized. For example,
int some_function (int some argument) {
int index;
...
}
Note that index was not assigned a value. On entry to some_function, a chunk of memory is set aside for the variable index. This memory already exists somewhere; it is just set aside, or allocated. Since the memory already exists somewhere, each bit will have some pre-existing value. If a variable is not initialized, it will have an initial value. The initial value of the variable index might be 42, or 1404197501, or something entirely different.
Some languages provide a default initialization in case the programmer did not specify one. (C and C++ do not.) Sometimes there is nothing wrong with not initializing a variable to a known value. The very next statement might be an assignment statement, for example. The upside of providing a default initialization is that failing to initialize variables is a typical programming mistake. The downside is that this initialization has a cost, albeit typically tiny. That tiny cost can be significant when it occurs in a time-critical, multiply-nested loop. Not providing a default initial value fits the C and C++ philosophy of not providing something the programmer did not ask for.
Some variables, even non-class variables, absolutely do need to be given an initial value. For example, there is no way to assign a value to a variable that is of a reference type except in the declaration statement. The same goes for variables that are declared to be constant.
Some classes have hidden data that absolutely do need to be initialized. Some classes have const or reference data members that absolutely do need to be initialized. These classes need to be initialized, or constructed. Not all classes do need to be initialized. A class or structure that doesn't have any virtual functions, doesn't have an explicitly-provided constructor or destructor, and whose member data are all primitive data types, is called plain old data, or POD. POD classes do not need to be constructed.
Bottom line:
An object, whether it is a primitive type or an instance of a very complex class, is not "built from dust". Dust is, after all, very harmful to computers. They are built from bits.
Setting aside, or allocating, memory for some object and initializing that set-aside memory are two different things.
The memory need to store an object is allocated, not created. The memory already exists. Because that memory already exists, the bits that comprise the object will have some pre-existing values. You should of course never rely on those preexisting values, but they are there.
The reason for initializing variables, or data members, is to give them a reliable, known value. Sometimes that initialization is just a waste of CPU time. If you didn't ask the compiler to provide such a value, C and C++ assume the omission is intentional.
The constructor for some object does not allocate the memory needed to store the object itself. That step has already been done by the time the constructor is called. What a constructor does do is to initialize that already allocated memory.
The initial response:
A variable of a primitive type does not have to be "built from dust". The memory to store the variable needs to be allocated, but the variable can be left uninitialized. A constructor does not build the object from dust. A constructor does not allocate the memory needed to store the to-be constructed object. That memory has already been allocated by the time the constructor is called. (A constructor might initialize some pointer data member to memory allocated by the constructor, but the bits occupied by that pointer must already exist.)
Some objects such as primitive types and POD classes do not necessarily need to be initialized. Declare a non-static primitive type variable without an initial value and that variable will be uninitialized. The same goes for POD classes. Suppose you know you are going to assign a value to some variable before the value of the variable is accessed. Do you need to provide an initial value? No.
Some languages do give an initial value to every variable. C and C++ do not. If you didn't ask for an initial value, C and C++ are not going to force an initial value on the variable. That initialization has a cost, typically tiny, but it exists.
Built In data types(fundamental types, arrays,references, pointers, and enums) do not have constructors.
A constructor is a member function. A member function can only be defined for a class type
C++03 9.3/1:
"Functions declared in the definition of a class, excluding those declared with a friend specifier, are called member functions of that class".
Many a times usage of an POD type in certain syntax's(given below) might give an impression that they are constructed using constructors or copy constructors but it just Initialization without any of the two.
int x(5);

Do classes with uninitialized pointers have undefined behavior?

class someClass
{
public:
int* ptr2Int;
};
Is this a valid class (yes it compiles)? Provided one assigns a value to ptr2Int before dereferencing it, is the class guaranteed to work as one would expect?
An uninitialized pointer inside a class is in no way different from a standalone uninitailized pointer. As long as you are not using the pointer in any dangerous way, you are fine.
Keep in mind though that "dangerous ways" of using an uninitialized pointer include a mere attempt to read its value (no dereference necessary). The implicit compiler-provided copy-constructor and copy-assignment operators present in your class might perform such an attempt if you use these implicit member functions before the pointer gets assigned a valid value.
Actually, if I'm not mistaken, this issue was a matter of some discussion at the level of the standardization committee. Are the implicitly generated member functions allowed to trip over trap representations possibly present in the non-initialized members of the class? I don't remember what was the verdict. (Or maybe I saw that discussion in the context of C99?)
Yes, it's fine. The pointer itself exists, just its value is just unknown, so dereferencing it is unsafe. Having an uninitialized variable is perfectly fine, and pointers aren't any different
Yes, this is exactly the same as a struct with a single uninitialized pointer, and both are guaranteed to work just fine (as long as you set the pointer before any use of it, of course).
Until you dereference the pointer it's all good, then it's undefined territory.
Some compilers will set pointers to default values (like null) depending if you compile in debug or release mode. So things could work in one mode and suddenly everything falls apart in another.

Array of pointers member, is it initialized?

If I have a
class A
{
private:
Widget* widgets[5];
};
Is it guaranteed that all pointers are NULL, or do I need to initialize them in the constructor? Is it true for all compilers?
Thanks.
The array is not initialized unless you do it. The standard does not require the array to be initialized.
It is not initialized if it is on the stack or using the default heap allocator (although you can write your own to do so).
If it is a global variable it is zero filled.
This is true for all conformant compilers.
It depends on the platform and how you allocate or declare instances of A. If it's on the stack or heap, you need to explicitly initialize it. If it's with placement new and a custom allocator that initializes memory to zero or you declare an instance at file scope AND the platform has the null pointer constant be bitwise zero, you don't. Otherwise, you should.
EDIT: I suppose I should have stated the obvious which was "don't assume that this happens".
Although in reality the answer is "it depends on the platform". The standard only tells you what happens when you initialize explicitly or at file scope. Otherwise, it is easiest to assume that you are in an environment that will do the exact opposite of what you want it to do.
And if you really need to know (for educational or optimizational purposes), consult the documentation and figure out what you can rely on for that platform.
In general case the array will not be initialized. However, keep in mind that the initial value of the object of class type depends not only on how the class itself is defined (constructor, etc), but might also depend on the initializer used when creating the object.
So, in some particular cases the answer to your question might depend on the initializer you supply when creating the object and on the version of C++ language your compiler implements.
If you supply no initializer, the array will contain garbage.
A* pa = new A;
// Garbage in the array
A a;
// Garbage in the array
If supply the () initializer, in C++98 the array will still contain garbage. In C++03 however the object will be value-initialized and the array will contain null-pointers
A* pa = new A();
// Null-pointers in the array in C++03, still garbage in C++98
A a = A();
// Null-pointers in the array in C++03, still garbage in C++98
Also, objects with static storage duration are always zero-initialized before any other initialization takes place. So, if you define an object of A class with static storage duration, the array will initially contain null-pointers.