Behaviour of uninitialized C++ structs

Say we declare an array of structs in a local scope:
int main()
{
    RandomStruct array[1000];
}
Currently the structs in the array are uninitialized. While this means that the struct variables are also uninitialized, does this also mean anything else? Like if I, for example, set all the variables from an uninitialized struct to the desired value, and then use functions of this struct, or if I use functions of the struct that don't use uninitialized variables before I set them. Am I correct in thinking that only the variables will be uninitialized and that what the array does is just assign random memory to each of the struct's variables?

Currently the structs in the array are uninitialized
No, they are default-initialized.
While this means that the struct variables are also uninitialized
The effect on the members depends on the definition of RandomStruct. Depending on that definition, default-initialization of RandomStruct may have the effect of default-initialization of some or all of the non-static data members of RandomStruct. It may have the effect eventually of default-initializing a variable of non-class type, as a member of RandomStruct, or a member of a member, etc. That variable of non-class type will have an indeterminate value.
Like if I, for example, set all the variables from an uninitialized struct to the desired value, and then use functions of this struct, or if I use functions of the struct that don't use uninitialized variables before I set them
If all members are initialized to determinate values before being used, everything is OK. Member function calls that don't "observe" the indeterminate values are OK.
Am I correct in thinking that only the variables will be uninitialized and that what the array does is just assign random memory to each of the struct's variables?
That's not quite true. That would imply that observing the indeterminate values is OK but their value is just unknown. It is not. But so long as you don't observe the values, this is a valid intuition.
It is OK to leave them indeterminate so long as they are not observed. But, it is undefined behavior to "observe" the indeterminate value by producing it in any evaluation, except in very limited, enumerated conditions.
This means that a correct program is not allowed to observe the value, but the compiler is also not required to diagnose it. However, the compiler can assume it is never done (because a correct program cannot do it) and C++ places no requirements on an invalid program.
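As a rough illustration of the rules above, here is a minimal sketch. The definition of RandomStruct is an assumption made up for this example (the question never shows it), as are the function names. It shows a member function that never reads the indeterminate member, assignment before the first read, and value-initialization with = {} as the alternative that leaves nothing indeterminate.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the question's RandomStruct.
struct RandomStruct {
    int value;                          // no initializer: indeterminate after default-initialization
    int fixed() const { return 42; }    // never reads 'value'
    int get() const { return value; }   // reads 'value'
};

// Default-initialized array: members are indeterminate until assigned.
int sum_after_assignment() {
    RandomStruct array[4];              // each element is default-initialized
    int total = array[0].fixed();       // fine: the indeterminate member is not observed
    for (std::size_t i = 0; i < 4; ++i)
        array[i].value = static_cast<int>(i);   // write before any read
    for (const RandomStruct& s : array)
        total += s.get();               // fine now: every member holds a determinate value
    assert(total == 42 + 0 + 1 + 2 + 3);
    return total;
}

// Value-initialized array: every member starts out as zero.
int sum_value_initialized() {
    RandomStruct array[4] = {};
    int total = 0;
    for (const RandomStruct& s : array)
        total += s.get();
    return total;
}
```

Calling get() on an element before its value has been assigned would be the "observing" that the answer warns about.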

Related

int x; int y; int *ptr; is not initialization, right?

I'm reading 'C++ All-in-One for Dummies' by J. P. Mueller and J. Cogswell and stumbled onto this:
#include <iostream>
using namespace std;
int main()
{
int ExpensiveComputer;
int CheapComputer;
int *ptrToComp;
...
This code starts out by initializing all the goodies involved — two integers
and a pointer to an integer.
Just to confirm, this is a mistake and should read '... by declaring', right? It's just strange to me that such basic mistakes still make their way to books.
From the point of view of the language, this is default initialization. The problem is, they are initialized to indeterminate values.
otherwise, nothing is done: the objects with automatic storage duration (and their subobjects) are initialized to indeterminate values.
Default initialization of non-class variables with automatic and dynamic storage duration produces objects with indeterminate values (static and thread-local objects get zero initialized)
Note that any attempt to read these indeterminate values leads to UB.
From the standard, [dcl.init]/7
To default-initialize an object of type T means:
If T is a (possibly cv-qualified) class type ([class]), constructors are considered. The applicable constructors are enumerated
([over.match.ctor]), and the best one for the initializer () is chosen
through overload resolution ([over.match]). The constructor thus
selected is called, with an empty argument list, to initialize the
object.
If T is an array type, each element is default-initialized.
Otherwise, no initialization is performed.
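The three cases in that quote can be seen side by side in a small sketch. The type Counted and the function name are hypothetical, introduced only for this illustration.

```cpp
#include <cassert>

// Hypothetical class type with a user-provided default constructor.
struct Counted {
    static int constructed;             // counts constructor calls
    int id;
    Counted() : id(++constructed) {}
};
int Counted::constructed = 0;

int default_initialize_array() {
    Counted::constructed = 0;
    Counted items[3];   // array type: each element is default-initialized,
                        // and for a class type that means the constructor runs
    int n;              // "otherwise": no initialization is performed, n is indeterminate
    n = items[2].id;    // assign before reading
    assert(Counted::constructed == 3);
    return n;           // elements are constructed in index order, so this is 3
}
```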
Yes you are correct.
You declared and defined these variables, you did not initialize them!
PS: What is the difference between a definition and a declaration?
This code both declares and defines three variables but does not initialize them (their values are said to be indeterminate).
A variable declaration that is not also a definition must include the extern keyword and no initializer.
Right. Hence, "dummies". :)
We can't even blame this on legacy; historically C programmers would declare* a variable and then "initialize" it later with its first assignment.
But it was never the case that simply declaring a variable, without an initializer, was deemed to be "initializing" it.**
So the wording is just wrong.
* Technically we're talking about definitions, but when we say "declare a variable" we almost always mean defining declarations.
** Though objects with static storage duration do undergo their own zero-initialisation phase before anything else happens, so forgoing initialisation yourself is not a catastrophe in that case. Still, we cannot claim that we have initialised that object.
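The distinctions drawn in these answers (declaration, definition, initializer, first assignment, and the zero-initialisation of static objects) can be sketched as follows. All names here are hypothetical and exist only for the illustration.

```cpp
#include <cassert>

extern int declared_only;   // declaration only: extern, no initializer
int declared_only;          // definition; static storage duration, so it is zero-initialized
int with_initializer = 7;   // definition with an initializer

int read_both() {
    int local;                                   // definition without an initializer: indeterminate
    local = declared_only + with_initializer;    // first assignment, not an initialization
    assert(local == 7);
    return local;
}
```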

Why compilers put zeros into arrays while they do not have to?

I'm trying to understand when compilers should value initialize arrays and when they should default initialize it. I'm trying two options: one raw array, another array aggregated in a struct:
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <string>

const int N = 1000;
struct A
{
    uint32_t arr[N];
    A() = default;
};
// Reports whether all N elements are zero.
void print(uint32_t* arr, const std::string& message)
{
    std::cout << message << ": " <<
        (std::count(arr, arr + N, 0) == N ? "all zeros" : "garbage") << std::endl;
}
int main()
{
    uint32_t arrDefault[N];
    print(arrDefault, "Automatic array, default initialization");
    uint32_t arrValue[N] = {};
    print(arrValue, "Automatic array, value initialization");
    uint32_t* parrDefault = new uint32_t[N];
    print(parrDefault, " Dynamic array, default initialization");
    uint32_t* parrValue = new uint32_t[N]();
    print(parrValue, " Dynamic array, value initialization");
    A structDefault;
    print(structDefault.arr, "Automatic struct, default initialization");
    A structValue{};
    print(structValue.arr, "Automatic struct, value initialization");
    A* pstructDefault = new A;
    print(pstructDefault->arr, " Dynamic struct, default initialization");
    A* psstructValue = new A();
    print(psstructValue->arr, " Dynamic struct, value initialization");
}
Here is what I see for clang and VC++:
Automatic array, default initialization: garbage
Automatic array, value initialization: all zeros
Dynamic array, default initialization: garbage
Dynamic array, value initialization: all zeros
Automatic struct, default initialization: all zeros
Automatic struct, value initialization: all zeros
Dynamic struct, default initialization: garbage
Dynamic struct, value initialization: all zeros
Output for gcc is different only in the first line, where it also puts "all zeros".
From my point of view they are all wrong, and what I expect is:
Automatic array, default initialization: garbage
Automatic array, value initialization: all zeros
Dynamic array, default initialization: garbage
Dynamic array, value initialization: all zeros
Automatic struct, default initialization: garbage
Automatic struct, value initialization: garbage
Dynamic struct, default initialization: garbage
Dynamic struct, value initialization: garbage
I.e. output is ok for raw arrays (except for gcc): we have garbage for default and zeros for value. Great. But for a struct I would expect to have garbage all the time. From default initialization:
Default initialization is performed in three situations:
...
...
when a base class or a non-static data member is not mentioned in a constructor initializer list and that constructor is called.
The effects of default initialization are:
if T is a non-POD (until C++11) class type, ...
if T is an array type, every element of the array is
default-initialized;
otherwise, nothing is done: the objects with automatic storage duration (and their subobjects) are initialized to indeterminate values.
In my example I have non-static data member that is not mentioned in a constructor initializer list, which is an array of POD type. I expect it to be left with indeterminate values, no matter how my struct is constructed.
My questions are:
Why do compilers violate that? I mean, why do they put zeros when they do not have to, wasting my runtime? Am I wrong in my reading?
How can I enforce such behavior to make sure I do not waste my runtime populating arrays with zeros?
Why does gcc perform value initialization for an automatic array?
A structValue{}; is aggregate initialization (through C++17; since C++20 a class with a user-declared constructor is no longer an aggregate, and this becomes value-initialization), so zeros are guaranteed either way.
As A has no user-provided constructor, because explicitly defaulted constructors do not count as such, value-initialization as in A* psstructValue = new A(); zero-initializes the object first, so zeros are guaranteed there as well.
For the default-initialization cases: reading uninitialized variables is UB, and undefined behavior is undefined. The compiler can do whatever it wants with that. Showing you 0 is just as legal as crashing. Maybe there were zeros in the memory you read by chance. Maybe the compiler felt like zero-initializing. Both are equally fine from the standard's point of view.
That being said, you have a better chance of seeing garbage when testing with Release / optimized builds. Debug builds tend to do extra stuff to help diagnosing problems, including doing some extra initialization.
(For the record: gcc and clang with -O3 appear to do no unnecessary initialization on my Linux system at first glance. Nevertheless, I got "all zeroes" for every case. That appears to be by chance.)
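On the second question (how to avoid the zeroing), one approach that the answers above do not spell out is to give the struct a user-provided constructor that leaves the array alone. The sketch below is only an illustration under that assumption: B and the two function names are hypothetical, and whether a particular compiler really skips the zero fill is something to check in the generated code rather than take from the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

constexpr std::size_t N = 1000;

struct A {                  // as in the question
    std::uint32_t arr[N];
    A() = default;          // user-declared, but not user-provided
};

struct B {                  // hypothetical variant
    std::uint32_t arr[N];
    B() {}                  // user-provided: value-initialization only calls this constructor
};

static_assert(std::is_trivially_default_constructible<A>::value,
              "A's defaulted constructor is trivial");
static_assert(!std::is_trivially_default_constructible<B>::value,
              "B's constructor is user-provided, hence not trivial");

bool value_initialized_A_is_zero() {
    A a{};                  // zero-initialized, so reading arr is fine
    for (std::uint32_t v : a.arr)
        if (v != 0) return false;
    return true;
}

std::uint32_t fill_then_sum() {
    B b{};                  // calls B(); arr is left indeterminate, no zeroing is required
    for (std::size_t i = 0; i < N; ++i)
        b.arr[i] = 1;       // write every element before reading any
    std::uint32_t sum = 0;
    for (std::uint32_t v : b.arr)
        sum += v;
    assert(sum == N);
    return sum;
}
```

With B, both B b; and B b{}; leave arr indeterminate, so every element has to be written before it is read.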
The other answer doesn't really address the reason; it mostly works through the language specification.
The actual reason is due to how the initialization process works.
Ask yourself the question: how do I know if something is initialized?
That is why static data does need to be zero-initialized, while other data does not. If all of the static data were not zeroed out first, the dynamic initialization of statics (look it up) would be basically impossible.
You would constantly run into issues like two statics that obliquely reference each other in their initialization, and everything falls apart.
So without this rule it would be basically impossible to write a C++ compiler. Other initialization schemes don't have this requirement, but implementing them would require a big overhaul of the language.

Default value of enum declared in class

I have a class whose member is an enum declared inside this class:
#include <iostream>
class test
{
public:
    enum TYPE{MAN, WOMAN};
    TYPE type;
};
int main()
{
    test x;
    if(x.type == test::MAN) std::cout<<"MAN"<<std::endl;
    if(x.type == test::WOMAN) std::cout<<"WOMAN"<<std::endl;
    std::cout<<"ok"<<std::endl;
    return 0;
}
I know that if an enum is declared at namespace scope, it has a default value 0 and when it's declared locally, it doesn't have any default values, which leads to undefined behavior.
My question is: what if I have an enum which belongs to a class? Is it undefined behavior as well?
I tested the above code and x.type is neither MAN nor WOMAN. However, I've done it for only one compiler and one operating system. I'm interested in a more general answer. I haven't found any information regarding this issue anywhere else.
Edit1: Can referring to this indeterminate value cause segmentation fault?
Edit2: I know this is not a well-designed class; it's not mine and I'm trying to debug it. So telling me that I can default-initialize the object doesn't solve my problem. Please treat it as a theoretical question.
The default value of the first name in an enum is 0, regardless of the scope of the enum.
There is no guaranteed default value of an automatic local variable like test x; in main here. It has an indeterminate value. And it's Undefined Behavior to use that value.
You can ¹default-initialize it like this:
test x{};
¹ A subtle point is that at top level this gives a “value-initialization”.
If your object doesn't have any constructors, then it depends on where you create your object. If it's created globally, then all variables are zero-initialized. If not, they are not initialized properly and reading from them results in UB.
You can force zero-initialization of a non-global variable with test x{}; syntax.
First: "Testing" for undefined behavior is almost never going to give you the right answer.
This is undefined behavior because you are reading from an uninitialized variable with automatic storage duration. Such a variable has an indeterminate value and must not be read from. Every non-static function scope variable has automatic storage duration.
I think you are confusing the definition of the enum type (which happens inside the class definition) with the declaration of a variable of this type (at function scope). In your example x is a variable with automatic storage duration no matter where the type TYPE has been defined.
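To make the test x{}; suggestion from the answers concrete, here is a minimal sketch. The class test_with_default and both function names are hypothetical additions. It shows that value-initialization gives the member the value 0 (which is MAN), and that a C++11 default member initializer is another way to avoid an indeterminate member.

```cpp
#include <cassert>

class test {                        // as in the question
public:
    enum TYPE { MAN, WOMAN };
    TYPE type;
};

class test_with_default {           // hypothetical variant
public:
    enum TYPE { MAN, WOMAN };
    TYPE type = WOMAN;              // default member initializer (C++11)
};

test::TYPE value_initialized_type() {
    test x{};                       // type is zero-initialized, and 0 is MAN
    assert(x.type == test::MAN);
    return x.type;
}

test_with_default::TYPE defaulted_type() {
    test_with_default x;            // default-initialization applies the member initializer
    return x.type;
}
```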

Local Variables Being Passed ( C++)

I have encountered a problem in my learning of C++, where a local variable in a function is being passed to the local variable with the same name in another function, both of these functions run in main().
When this is run,
#include <iostream>
using namespace std;
void next();
void again();
int main()
{
    int a = 2;
    cout << a << endl;
    next();
    again();
    return 0;
}
void next()
{
    int a = 5;
    cout << a << endl;
}
void again()
{
    int a;
    cout << a << endl;
}
it outputs:
2
5
5
I expected that again() would say null or 0 since 'a' is declared again there, and yet it seems to use the value that 'a' was assigned in next().
Why does next() pass the value of local variable 'a' to again() if 'a' is declared another time in again()?
http://en.cppreference.com/w/cpp/language/ub
You're correct, an uninitialized variable is a no-no. However, you are allowed to declare a variable and not initialize it until later. Memory is set aside to hold the integer, but what value happens to be in that memory until you do so can be anything at all. Some compilers will auto-initialize variables to junk values (to help you catch bugs), some will auto-initialize to default values, and some do nothing at all. C++ itself promises nothing, hence it's undefined behavior. In your case, with your simple program, it's easy enough to imagine how the compiler created assembly code that reused that exact same piece of memory without altering it. However, that's blind luck, and even in your simple program isn't guaranteed to happen. These types of bugs can actually be fairly insidious, so make it a rule: Be vigilant about uninitialized variables.
An uninitialized non-static local variable of *built-in type (phew! that was a mouthful) has an indeterminate value. Except for the char types, using that value yields formally Undefined Behavior, a.k.a. UB. Anything can happen, including the behavior that you see.
Apparently with your compiler and options, the stack area that was used for a in the call of next, was not used for something else until the call of again, where it was reused for the a in again, now with the same value as before.
But you cannot rely on that. With UB anything, or nothing, can happen.
* Or more generally of POD type, Plain Old Data. The standard's specification of this is somewhat complicated. In C++11 it starts with §8.5/11, “If no initializer is specified for an object, the object is default-initialized; if no initialization is performed, an object with automatic or dynamic storage duration has indeterminate value.”. Where “automatic … storage duration” includes the case of local non-static variable. And where the “no initialization” can occur in two ways via §8.5/6 that defines default initialization, namely either via a do-nothing default constructor, or via the object not being of class or array type.
This is completely coincidental and undefined behavior.
What's happened is that you have two functions called immediately after one another. Both will have more or less identical function prologs and both reserve a variable of exactly the same size on the stack.
Since there are no other variables in play and the stack is not modified between the calls, you just happen to end up with the local variable in the second function "landing" in the same place as the previous function's local variable.
Clearly, this is not good to rely upon. In fact, it's a perfect example of why you should always initialize variables!
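A minimal sketch of the fix the answers recommend: give the local an initializer, so the result no longer depends on what the previous call left on the stack. The functions are renamed and return their value instead of printing it, purely so the result can be checked. Those names are assumptions of this sketch, not part of the question.

```cpp
#include <cassert>

int next_value() {
    int a = 5;          // local to this call only
    return a;
}

int again_value() {
    int a = 0;          // explicit initializer (int a{}; would also give 0)
    return a;
}

int call_in_sequence() {
    int first = next_value();
    int second = again_value();   // 0, regardless of what next_value left behind
    assert(first == 5 && second == 0);
    return second;
}
```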

Default Values C++11, compiler to compiler

Question 1:
How can you tell the default value of a variable? That is (if my vocabulary is wrong) the value of a variable before it is assigned?
Question 2:
How does this differ between compilers?
Question 3:
Is there a better way to default values?
Question 4:
And finally, are there other exceptions to this rule?
Example code:
bool foolean;
int fintoo;
double fooble;
char charafoo;
What would these be by default compiler to compiler?
In all versions of C++, all the variables in your question will be zero-initialized (statically) if they're declared at namespace scope, or anywhere else with static storage duration. In all other cases, they will have garbage values if left uninitialized.
Note that a garbage value is anything which is at the memory location where the variable is defined — it is just a pattern of 0s and 1s. Such values shouldn't be read by your program, else your code will invoke undefined behaviour.
In C++11, if you write these as local variables (or namespace variables):
bool foolean {};
int fintoo {};
double fooble {};
char charafoo {};
They're value-initialized, which means zero-initialized in this case (as they are built-in types).
If a variable is automatic (that is, a non-static variable local to a function, or a member thereof), there is no default. From a practical perspective, the variable is allocated on the stack, and what's there on the stack (probably leftover from a previous function call) will become the value of the variable.
Also, some compilers add code to initialize the stack frame to a well-known value in debug mode. That lets you easily see that a variable hasn't been initialized while debugging.
If a variable is static (declared in namespace scope, or with the static keyword in a function), the default is zero.
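The cases described in these answers can be summarised in one sketch. The namespace-scope variables are the ones from the question. The struct Locals and the two functions are hypothetical helpers added so the values can be inspected without reading anything indeterminate.

```cpp
#include <cassert>

// Namespace scope: static storage duration, so all four are zero-initialized.
bool foolean;
int fintoo;
double fooble;
char charafoo;

// Hypothetical helper type used to return the local values.
struct Locals { bool b; int i; double d; char c; };

Locals value_initialized_locals() {
    bool b{};           // empty braces value-initialize: false
    int i{};            // 0
    double d{};         // 0.0
    char c{};           // '\0'
    return Locals{b, i, d, c};
}

int static_local_counter() {
    static int calls;   // static local: zero-initialized, keeps its value between calls
    return ++calls;
}
```

A plain int fintoo; inside a function, without static and without braces, would be the one case here with no guaranteed value.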