Why compilers put zeros into arrays while they do not have to?

Why compilers put zeros into arrays while they do not have to? - c++

I'm trying to understand when compilers should value initialize arrays and when they should default initialize it. I'm trying two options: one raw array, another array aggregated in a struct:
const int N = 1000;
struct A
{
uint32_t arr[N];
A() = default;
};
void print(uint32_t* arr, const std::string& message)
{
std::cout << message << ": " <<
(std::count(arr, arr + N, 0) == N ? "all zeros" : "garbage") << std::endl;
}
int main()
{
uint32_t arrDefault[N];
print(arrDefault, "Automatic array, default initialization");
uint32_t arrValue[N] = {};
print(arrValue, "Automatic array, value initialization");
uint32_t* parrDefault = new uint32_t[N];
print(parrDefault, " Dynamic array, default initialization");
uint32_t* parrValue = new uint32_t[N]();
print(parrValue, " Dynamic array, value initialization");
A structDefault;
print(structDefault.arr, "Automatic struct, default initialization");
A structValue{};
print(structValue.arr, "Automatic struct, value initialization");
A* pstructDefault = new A;
print(pstructDefault->arr, " Dynamic struct, default initialization");
A* psstructValue = new A();
print(psstructValue->arr, " Dynamic struct, value initialization");
}
Here is what I see for clang and VC++:
Automatic array, default initialization: garbage
Automatic array, value initialization: all zeros
Dynamic array, default initialization: garbage
Dynamic array, value initialization: all zeros
Automatic struct, default initialization: all zeros
Automatic struct, value initialization: all zeros
Dynamic struct, default initialization: garbage
Dynamic struct, value initialization: all zeros
Output for gcc is different only in the first line, where it also puts "all zeros".
From my point of view they are all wrong, and what I expect is:
Automatic array, default initialization: garbage
Automatic array, value initialization: all zeros
Dynamic array, default initialization: garbage
Dynamic array, value initialization: all zeros
Automatic struct, default initialization: garbage
Automatic struct, value initialization: garbage
Dynamic struct, default initialization: garbage
Dynamic struct, value initialization: garbage
I.e. output is ok for raw arrays (except for gcc): we have garbage for default and zeros for value. Great. But for a struct I would expect to have garbage all the time. From default initialization:
Default initialization is performed in three situations:
...
...
when a base class or a non-static data member is not mentioned in a constructor initializer list and that constructor is called.
The effects of default initialization are:
if T is a non-POD (until C++11) class type, ...
if T is an array type, every element of the array is
default-initialized;
otherwise, nothing is done: the objects with automatic storage duration (and their subobjects) are initialized to indeterminate values.
In my example I have non-static data member that is not mentioned in a constructor initializer list, which is an array of POD type. I expect it to be left with indeterminate values, no matter how my struct is constructed.
My questions are:
Why does compilers violate that? I mean, why they put zeros when they do not have to, wasting my runtime? Am I wrong in my readings?
How can I enforce such behavior to make sure I do not waste my runtime populating arrays with zeros?
Why gcc performs value initialization for an automatic array?

A structValue{}; is aggregate initialization, so 0 are guaranteed.
As A has no user provided constructor because explicitly defaulted constructors do not count as such, the same applies for value initialization as in A* psstructValue = new A();.
For the default initialization cases: Reading uninitialized variables is UB, and Undefined behavior is undefined. The compiler can do with that whatever it wants. Showing you 0 is just as legal as crashing. Maybe there even were 0 in the memory you read by chance. Maybe the compilers felt like 0 initializing. Both equally fine from the standard's point of view.
That being said, you have a better chance of seeing garbage when testing with Release / optimized builds. Debug builds tend to do extra stuff to help diagnosing problems, including doing some extra initialization.
(For the record: gcc and clang with -O3 appear to do no unnecessary initialization on my Linux system at first glance. Nevertheless, I got "all zeroes" for every case. That appears to be by chance.)

The other answer doesn't really address the REASON just kind of dances around with the language specification.
The actual reason is due to how the initialization process works.
Ask yourself the question how do I know if something is initialized.
That is why static data DOES need to be initialized, while data that is not, does not. If you didn't go through first and zero out all of the static data then the static dynamic initialization process (look it up) would be basically impossible.
You would constantly run into issues like two statics that obliquely reference each other in their initialization and everything falls apart.
So without this rule C++ basically is impossible to write a compiler for. Though there's other initialization schemes that don't have this requirement it would require a big overhaul of the language to implement them.

Related

C++ Pointers and C-style array initialization

Long time java programmer here, new to C++. I have been working with C-style "traditional" arrays (similar to arrays in java). I understand in C++ we can create a simple array as follows:
Person people[3];
The contents of this array is essentially uninitialized junk values. When I print out the contents of each element in the array, I (believe) am getting the memory address of each element.
for(int i = 0; i < 3; i++){std::cout << &person[i] << std::endl;}
Results:
//Note, I get different results here when I use an enhanced for loop vs an iterator for loop. Weird.
00EFFB6C
00EFFBA8
00EFFBE4
Now, here is the part I have failed to get a clear explanation on. I create a pointer to one of the elements in the array. I then ask for some value back from that pointer. In java, I would expect to get a null pointer, but in C++, that is not happening.
Instead, I get the default value, as though this element is initialized:
Person* person1Ptr = &people[0];//Points to an uninitialized value
std::cout << person1Ptr->getFirstName() << std::endl;//Output: "Default First Name", expected nullptr
When I try to get the first name of an element using a reference, this doesn't work (presumably because the value doesn't exist).
Full paste of code: https://pastebin.com/cEadfJhr
From my research, C++ does NOT fill arrays with objects of the specified type automagically.
How is my person1Ptr returning a value?

I believe the problem stems from this misconception :
From my research, C++ does NOT fill arrays with objects of the specified type automagically.
C++ objects have value semantics. Defining a local variable of type T concretely creates a unique instance of that type, it is not a handle to a potential T. The expression Foo f; is conceptually equivalent to the Java expression Foo f = new Foo();. Additionally, value semantics means assignment usually implies a copy. The C++ expression Foo f; Foo g = f; is conceptually equivalent to the Java expression Foo f = new Foo(); Foo g = f.Clone();.
In the case of an array, defining a local variable Foo f[3]; immediately creates three instances of Foo as elements of the f array. Your misconception may come from the fact that creating an object in C++ does not imply that it has been initialized. An object can exist in an uninitialized state. For example int i; create an int object identified by i but its value is indeterminate. In the case of int i[3]; you would have an array of three int each with indeterminate values.
The rules for initialization are very complicated in C++. In the case of Person people[3]; you have an array that is default initialized.
You are initializing an object of type Person[3]. According to default initialization rules :
if T is an array type, every element of the array is default-initialized;
That means each Person gets its own default initialization. To see what happens, consider that T is Person :
if T is a class type, the constructors are considered and subjected to overload resolution against the empty argument list. The constructor selected (which is one of the default constructors) is called to provide the initial value for the new object;
So each Person's default constructor will be called to initialize that element. You end up with Person people[3]; defining three default constructed Person objects with default initial values.

How to Initialize all Variables in the program to zero without doing it explicitly in c++ [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 2 years ago.
Improve this question
I want to make all the variables which I create in my program to have initialization to zero without doing it explicitly.
For example suppose if I create a variable like below:
int i;
We know that it must contain any garbage value and to make it have value 0 by default we need to declare it like
int i=0;
but I want my all variables in the program to contain garbage value as 0 not any other value.
In my code I want my every variables (which include class variables, global variables, local variables) to automatically initialize to 0 without doing it explicitly.
So what logic I tried is that in C, I have written the "Hello World" program without using main() function. So what I have done in that time is to override the _start function which is the first function called by compiler to set up everything and then call to main(). So I think there must be a function which compiler calls during the creation of a variables and I thought we can set the value 0 to the all variables there. Please help me with this problem. If there exist some other logic to solve this problem you can share with me I am open to all solutions but please don't say to explicitly declare them with the value 0.

As a person who spends much of his working life looking at other people's broken code, I really have to say this, although I doubt it will be popular...
If it matters what the initial value of a variable is, then you should initialize it explicitly.
That's true even if you initialize it to what is, in fact, the default value. When I look at a statement like this:
int i = 0;
I immediately know (or think I know) that the programmer really thought about the value, and set it. If I read this:
int i;
then I assume that the programmer does not care about the value -- presumably because it will be assigned a value later.
So far as automatic variables are concerned, it would be easy enough for the compiler to generate code that zero'd the relevant part of the stack frame on entry to a function. I suspect it would be hard to do in application code; but why would you want to? Not only would it make the program behave in a way that appears to violate the language specifications, it would encourage slopping, unreadable programming practices.
Just initialize your variables, and have done with it. Or, if you don't want to initialize it because you know that the compiler will initialize it in the way you want, insert a comment to that effect. You'll thank yourself when you have to fix a bug five years later.

Default initialization and some words regarding the complexity of initialization in C++
To limit the scope of this discussion, let T be any kind of type (fundamental such as int, class types, aggregate as well as non-aggregate), and the t be a variable of automatic storage duration:
int main() {
T t; // default initialization
}
The declaration of t means t will be initialized by means of default initialization. Default initialization acts differently for different kind of types:
For fundamental types such as int, bool, float and so on, the effect is that t is left in an uninitialized state, and reading from it before explicitly initializing it (later) is undefined behavior
For class types, overload resolution resolve to a default constructor (which may be implicitly or implicitly generated), which will initialize the object but where its data member object could end up in an uninitialized state, depending on the definition of the default constructor selected
For array types, every element of the array is default-initialize, following the rules above
C++ initialization is complex, and there are many special rules and gotchas that can end up with uninitialized variable or data members of variables whence read results in UB.
Hence a long-standing recommendation is to always explicitly initialized you variables (with automatic storage duration), and not rely on the fact that default initialization may or may not result in a fully initialized variable. A common approach is (attempting) to use value initialization, by means of initialization of variable with empty braces:
int main() {
int i{}; // value-initialization -> zero-initialization
SomeAggregateClass ac{}; // aggregate initialization
}
However even this approach can fail for class types if one is not careful whether the class is an aggregate or not:
struct A {
A() = default; // not user-provided.
int a;
};
struct B {
B(); // user-provided.
int b;
};
// Out of line definition: a user-provided
// explicitly-defaulted constructor.
B::B() = default;
In this example (in C++11 through C++17), A is an aggregate, whereas B is not. This, in turn, means that initialization of B by means of an empty direct-list-init will result in its data member b being left in an uninitialized state. For A, however, the same initialization syntax will result in (via aggregate initialization of the A object and subsequent value initalization of its data member a) zero-initialization of its data member a:
A a{};
// Empty brace direct-list-init:
// -> A has no user-provided constructor
// -> aggregate initialization
// -> data member 'a' is value-initialized
// -> data member 'a' is zero-initialized
B b{};
// Empty brace direct-list-init:
// -> B has a user-provided constructor
// -> value-initialization
// -> default-initialization
// -> the explicitly-defaulted constructor will
// not initialize the data member 'b'
// -> data member 'b' is left in an unititialized state
This may come as a surprise, and with the obvious risk of reading the uninitialized data member b with the result of undefined behaviour:
A a{};
B b{}; // may appear as a sound and complete initialization of 'b'.
a.a = b.b; // reading uninitialized 'b.b': undefined behaviour.

Difference between various initializers in C++

I've only recently started learning C++ as part of my 10th Grade syllabus, and am only aware of the basics, thus simple answers (if possible) will be appreciated.
I'm rather confused between initialization and assignment.
//Case 1
int a=5; //This is initialization
a=6; //This is assignment
From what I've understood, a variable is initialized when you give it a value to hold while declaring it. Changing this later in the code will be an assignment. Right?
What about :
//Case 2
int b;
{
//Block of code which does not call variable b
.
.
.
//End of block
}
b=6; // Is this initialization as well?
While 'b' is uninitialized when we declare, we later assign the value '6'. Can we say the 'b' is initialized now? Or are the terms initialized and uninitialized not applicable to 'b' anymore?
I read the an uninitialized variable holds "garbage values" till it isn't initialized. What exactly are "garbage values"?
What is the difference between the following initializers : '()', '{}', and '='?

Okay, once you declare a variable without assigning any value, like
int b;
that means that the compiler reserves some space in the memory to hold the value (to be exact, in this case the memory is reserved on the stack). But since you didn't assign any value to the variable, it still holds the value, that the assigned space in memory had before. And that can be anything. Those are garbage values.
Initializers:
int b(1);
assigns the value 1 to be (in general, it calls a constructor of the type)
The brackets can be used to initialize arrays like this (edit):
int b[] = {1, 3, 5, 7};
And the = just assigns a value. The difference between this and the first will only become interesting when dealing with more complex types (classes), where you have constructors

Easilly spoken:
Uninitialize variable:
int a;
You are declare a variable that means you allocate memory but dont assign a value to it. So its compiler dependend if the value is set to 0 or not. So there could be anything in. Thats waht you called garbage values.
Initialized variable:
int a = 0;
You are declare a variable that means you allocate memory and assigne a value to it.
Assigne Values:
a = 10;
You assigne a rvalue (in this case 10) to a lvalue ( a). So you dont allocate new memory.

You're basically right.
Some older texts call the first assignment to an uninitialised variable an "initialisation", although this is not strictly accurate.
"Garbage values" are arbitrary values. They could look meaningful or could be totally random.

Initialization serves to initialize an uninitialized value.
It can be done my means of copy constructor, i.e. int a = 1; or int a(1);, it can be done by means of assignment, i.e. int a; a = 1;, it can be done via a function, i.e. int a; init(a);. Initialization is not a "language thing", it is just the act of specifying an unspecified value.
A "garbage value" is an arbitrary value. Some storage will be given to the uninitialized object, and attempting to read it will produce a value of whatever happened to be in that memory.

Does int() return 0 or an arbitrary value?

Consider this code:
template <typename T>
void f()
{T x = T();}
When T = int, is x equal to 0 or to an arbitrary value?
Bonus question: and consequently, are arrays (both T[N] and std::array<T, N>) the only types where such a syntax may leave contents with arbitrary values.

T() gives value initialization, which gives zero-initialization for a type other than a class, union or array. (§8.5/7 bullet 3): "otherwise, the object is zero-initialized." For an array, each element of the array is value initialized.
For an array, the content will be of an arbitrary value if it's auto storage class, but zero initialized if it's static storage class -- i.e., global (assuming, of course, that you don't specify any initialization).

Regarding your first question, it is called value-initialization, and is well-covered in the standard (C++11 § 8.5 covers initialization in detail, with the specifics on () initialization and how it eventually leads to zero-initialization, starting with 8.5/16 (covers ()), which leads to 8.5/7 (value-initialization), and finally 8.5/5 (zero-initialization).
Regarding std::array<T,N>, if T is a class-type the constructors will fire for each element (user-provided or default if none are provided by the user). If default construction happens default-initialization will fire (which isn't very exciting). By the standard (8.5/6), each element is default-initialized. For T that is not a class type this is effectively the same as T ar[N]; which as you have pointed out, is indeterminate as well (because it is default initialized, which by the standard "no initialization is performed.".
Finally, if static storage is declared for your fixed array of non-class-type, it resides in zero-filled memory on inception. For automatic storage, its back to "no initialization is performed." as your end-game.
I hope i didn't miss something (and I know i'll hear it if I did). There are lots of interesting questions on SO that cover areas like this. If I get a chance I'll link a few.

C++ - value of uninitialized vector<int>

I understand from the answer to this question that values of global/static uninitialized int will be 0. The answer to this one says that for vectors, the default constructor for the object type will be called.
I am unable to figure out - what happens when I have vector<int> v(10) in a local function. What is the default constructor for int? What if I have vector<int> v(10) declared globally?
What I am seeing is that vector<int> v(10) in a local function is resulting in variables being 0 - but I am not sure if that is just because of my compiler or is the fixed expected behaviour.

The zero initialization is specified in the standard as default zero initialization/value initialization for builtin types, primarily to support just this type of case in template use.
Note that this behavior is different from a local variable such as int x; which leaves the value uninitialized (as in the C language that behavior is inherited from).

It is not undefined behaviour, a vector automatically initialises all its elements. You can select a different default if you want.
The constructor is:
vector( size_type, T t = T() )
and for int, the default type (returned by int()) is 0.
In a local function this:
int x;
is not guaranteed to initialise the variable to 0.
int x = int();
would do so.
int x();
sadly does neither but declares a function.

The constructor you are using actually takes two arguments, the second of which is optional. Its declaration looks like this:
explicit vector(size_type n, const T& value = T())
The first argument is the number of elements to create in the vector initially; the second argument is the value to copy into each of those elements.
For any object type T, T() is called "value initialization." For numeric types, it gives you 0. For a class type with a default constructor, it gives you an object that has been default constructed using that constructor.
For more details on the "magic parentheses," I'd recommend reading Michael Burr's excellent answer to the question "Do the parentheses after the type name make a difference with new?" It discusses value initialization when used with new specifically, but for the most part is applicable to value initialization wherever else it can be used.

By default, vector elements are zero-initialized and not default-initialized. Those are two different but related concepts:
zero-initialization is what is done for static objects not having an explicit initialization and what is done for a member given in the initialized list with an initializer of (). For basic types, the value used is 0 converted to the type.
default-initialization is what is done for not explicitly initialized non static variables and members. For basic types it stay uninitialized.
(And C++0X introduces value-initialization which is still different).

As mentioned by others, what happens is the zero initialization kicks in. I actually use that a lot in my code (outside of vectors and other classes):
some_type my_var = some_type();
This allows me to make sure that my variables are always properly initialized since by default C/C++ do not initialize basic types (char, short, int, long, float, double, etc.)
Since C++11, you also can do so in your class definitions:
class MyClass
{
...
int my_field_ = 123; // explicit initialization
int your_field_ = int(); // zero initialization
};
For vectors, the std library uses T(). Whatever T() is, it will use that default initialization. For a class, it calls the default constructor. For a basic type, it uses zero ('\0', 0, 0.0f, 0.0, nullptr`).
As mentioned by James McNellis and Nawaz, it is possible to set the value used to initialize the vector as in:
std::vector<int> foo(100, 1234);
That feature is also available when you resize your vector (if the vector shrinks, the default value is ignored):
foo.resize(200, 1234);
So that way you can have a default initialization value. However, it's a be tricky since you have to make sure that all your definitions and resize() calls use that default value. That's when you want to write your own class which ensures that the default value is always passed to the vector functions.
However, if you want to have a way to auto-initialize to a specific value, you can mix both features this way:
struct my_value {
int v = 123;
};
std::vector<my_value> foo(100);
// here foo[n].v == 123 for n in [0, 100)
This is my preferred way of dealing with this issue (i.e. if I don't want zero by default). It's an extra .v, but much less prone to mistakes and you don't need to know of the default value when you create a vector of my_value.
Also, for those who think this will be slow, it won't. The struct is like syntactic sugar as far as C++ is concerned. One optimized, it will be exactly the same as a simple std::vector<int> foo(100, 123).

The default initialization for an int type is to initialize it to 0.
This is true of most (if not all) primitive types: char will initialize to (char)0 (or '\0' if you prefer), float will initialize to 0.0f, and any pointer initializes to NULL. For other types, the parameterless constructor is invoked.
In general, the default initialization should happen pretty much whenever you aren't able to specify a constructor (or choose not to).

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Why compilers put zeros into arrays while they do not have to? - c++

Related

C++ Pointers and C-style array initialization

How to Initialize all Variables in the program to zero without doing it explicitly in c++ [closed]

Difference between various initializers in C++

Does int() return 0 or an arbitrary value?

C++ - value of uninitialized vector<int>

Categories

Resources