In the following class:
struct S {
S() : B{} {}
const uint8_t B[32];
};
Are all 32 bytes of the B array guaranteed to be initialized to zero by the default constructor?
Is there any way to create an object of type S such that any element of the B array is not zero (without const casting or reinterpreting memory)? Do all forms of initialization of S lead to a zeroed B array?
Are all 32 bytes of the B array guaranteed to be initialized to zero by the default constructor?
Yes. B{} value-initializes every element of the array, and value-initializing a scalar type such as uint8_t sets it to 0.
Is there any way to create an object of type S such that any element of the B array is not zero?
Not as far as I know, although S still has the default copy constructor so if somehow you got an S with non-zero B, you can clone those objects.
The const member guarantees the values cannot be changed during the object's lifetime, so any non-zero value would have to be set at initialization, which leads to the third question...
Do all forms of initialization of S lead to a zeroed B array?
Yes. S is not an aggregate (because it has a user-provided constructor), so there is no way to initialize its members directly.
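As a rough check of the answer above, here is a minimal sketch that creates S in several ways and verifies that B is zeroed each time. The helper names all_zero and check_all_forms are made up for illustration, and std::is_aggregate_v assumes C++17.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <memory>
#include <type_traits>

struct S {
    S() : B{} {}                 // B{} zeroes every element
    const std::uint8_t B[32];
};

// The user-provided constructor makes S a non-aggregate,
// so something like S{1, 2, 3} cannot reach B directly.
static_assert(!std::is_aggregate_v<S>, "S is expected not to be an aggregate");

// Hypothetical helper: true when every byte of s.B is zero.
bool all_zero(const S& s) {
    return std::all_of(std::begin(s.B), std::end(s.B),
                       [](std::uint8_t b) { return b == 0; });
}

// Hypothetical helper: every form below ends up in S::S() or in a copy of
// an object that S::S() produced.
void check_all_forms() {
    S a;                              // default-initialization
    S b{};                            // value-initialization with braces
    S c = S();                        // value-initialization with parentheses
    S d(a);                           // implicit copy constructor
    auto e = std::make_unique<S>();   // new S()
    assert(all_zero(a) && all_zero(b) && all_zero(c));
    assert(all_zero(d) && all_zero(*e));
}
```

Calling check_all_forms() from main should pass silently under these assumptions.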
According to https://en.cppreference.com/w/cpp/language/zero_initialization
In the example provided by the documentation:
std::string s; // is first zero-initialized to indeterminate value
// then default-initialized to ""
Why does zero initialization occur to string s; if the syntax is for static T object;?
Why does zero initialization happen before default initialization and why are both allowed to happen?
The effects of zero initialization are:
If T is a scalar type, the object's initial value is the integral
constant zero explicitly converted to T.
If T is a non-union class
type, all base classes and non-static data members are
zero-initialized, and all padding is initialized to zero bits. The
constructors, if any, are ignored.
If T is a union type, the first
non-static named data member is zero-initialized and all padding is
initialized to zero bits.
If T is array type, each element is
zero-initialized
If T is reference type, nothing is done.
What if I initialize string array[2] = {"Test1"};? I know that the array will contain "Test1" and empty string "".
But according to the above documentation,
If T is array type, each element is
zero-initialized
The data type is string which is an object / reference type?
If T is reference type, nothing is done.
Nothing is done? I thought maybe a constructor would have been called. Surely an empty string is something?
(Unless otherwise specified, all declarations in this answer are assumed to be in namespace scope.)
Why does zero initialization occur to string s; if the syntax is for
static T object;?
Why does zero initialization happen before
default initialization and why are both allowed to happen?
Variables with static storage duration are first zero-initialized at compile time, and then optionally dynamically initialized at runtime. static T object; declares an object of static storage duration. For a simple declaration like
int x;
The dynamic initialization is not performed. For a more sophisticated declaration like
std::string s;
Zero-initializing a string may result in an invalid string with a broken class invariant. Therefore, the dynamic initialization calls the default constructor to ensure that the object is valid.
What if I initialize string array[2] = {"Test1"};? I know that the
array will contain "Test1" and empty string "".
First, at compile time, the two objects are zero-initialized, resulting in possible invalid state. Then, at runtime, the constructors are called (const char* constructor for the first object and default constructor for the second object), and the valid objects are constructed.
The data type is string which is an object / reference type?
std::string is an object type instead of a reference type.
[For a reference type] Nothing is done? I thought maybe a constructor
would have been called. Surely an empty string is something?
A reference type is not considered an actual "object", so there is no point in specifying its zero-initialization semantics.
Why does zero initialization occur to string s; if the syntax is for static T object;?
Why does zero initialization happen before default initialization and why are both allowed to happen?
On the page you linked to, that example defines a non-local variable.
Non-local variables are initialized in two phases.
Static initialization.
Dynamic initialization, if it applies.
In the static initialization phase, a variable is initialized using constant initialization or zero initialization.
Dynamic initialization is used, if it applies, such as for objects that have an appropriate constructor or for objects whose initializer is an expression that can only be evaluated at run time.
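The two phases can be observed with a small sketch like the following. All names here (runtime_seed, compute, Probe, probe, late, check_two_phases) are hypothetical. The volatile read is only there so that the initializer of late cannot be treated as a constant, which forces dynamic initialization.

```cpp
#include <cassert>

// Hypothetical run-time input: a volatile read cannot be folded into a
// constant, so anything computed from it needs dynamic initialization.
volatile int runtime_seed = 42;

int compute() { return runtime_seed; }

extern int late;   // defined after `probe` below

// Records the value `late` holds at the moment the constructor runs.
struct Probe {
    int seen;
    Probe() : seen(late) {}
};

// Within one translation unit, dynamic initialization follows definition order.
Probe probe;            // runs first: `late` has only been zero-initialized
int late = compute();   // runs second: now `late` becomes 42

void check_two_phases() {
    assert(probe.seen == 0);   // the static phase had already zeroed `late`
    assert(late == 42);        // the dynamic phase ran afterwards
}
```

The constructor of probe reads late before its dynamic initializer has run, yet it sees a well-defined 0 because of the static phase.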
You can read more on the topic at https://en.cppreference.com.
Nothing is done? I thought maybe a constructor would have been called. Surely an empty string is something?
A reference cannot be zero-initialized. It can only be initialized using an object that it will be a reference to.
I'm trying to understand when compilers should value initialize arrays and when they should default initialize it. I'm trying two options: one raw array, another array aggregated in a struct:
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <string>
const int N = 1000;
struct A
{
uint32_t arr[N];
A() = default;
};
void print(uint32_t* arr, const std::string& message)
{
std::cout << message << ": " <<
(std::count(arr, arr + N, 0) == N ? "all zeros" : "garbage") << std::endl;
}
int main()
{
uint32_t arrDefault[N];
print(arrDefault, "Automatic array, default initialization");
uint32_t arrValue[N] = {};
print(arrValue, "Automatic array, value initialization");
uint32_t* parrDefault = new uint32_t[N];
print(parrDefault, " Dynamic array, default initialization");
uint32_t* parrValue = new uint32_t[N]();
print(parrValue, " Dynamic array, value initialization");
A structDefault;
print(structDefault.arr, "Automatic struct, default initialization");
A structValue{};
print(structValue.arr, "Automatic struct, value initialization");
A* pstructDefault = new A;
print(pstructDefault->arr, " Dynamic struct, default initialization");
A* psstructValue = new A();
print(psstructValue->arr, " Dynamic struct, value initialization");
}
Here is what I see for clang and VC++:
Automatic array, default initialization: garbage
Automatic array, value initialization: all zeros
Dynamic array, default initialization: garbage
Dynamic array, value initialization: all zeros
Automatic struct, default initialization: all zeros
Automatic struct, value initialization: all zeros
Dynamic struct, default initialization: garbage
Dynamic struct, value initialization: all zeros
Output for gcc is different only in the first line, where it also puts "all zeros".
From my point of view they are all wrong, and what I expect is:
Automatic array, default initialization: garbage
Automatic array, value initialization: all zeros
Dynamic array, default initialization: garbage
Dynamic array, value initialization: all zeros
Automatic struct, default initialization: garbage
Automatic struct, value initialization: garbage
Dynamic struct, default initialization: garbage
Dynamic struct, value initialization: garbage
I.e. output is ok for raw arrays (except for gcc): we have garbage for default and zeros for value. Great. But for a struct I would expect to have garbage all the time. From default initialization:
Default initialization is performed in three situations:
...
...
when a base class or a non-static data member is not mentioned in a constructor initializer list and that constructor is called.
The effects of default initialization are:
if T is a non-POD (until C++11) class type, ...
if T is an array type, every element of the array is
default-initialized;
otherwise, nothing is done: the objects with automatic storage duration (and their subobjects) are initialized to indeterminate values.
In my example I have non-static data member that is not mentioned in a constructor initializer list, which is an array of POD type. I expect it to be left with indeterminate values, no matter how my struct is constructed.
My questions are:
Why do compilers violate that? I mean, why do they write zeros when they do not have to, wasting my runtime? Am I misreading the rules?
How can I enforce this behavior to make sure I do not waste runtime populating arrays with zeros?
Why does gcc perform value initialization for an automatic array?
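A structValue{}; is aggregate initialization (before C++20, A is an aggregate because an explicitly defaulted constructor is not user-provided), so zeros are guaranteed. In C++20 a class with any user-declared constructor is no longer an aggregate, but the result is the same: A structValue{}; is then value initialization, which zero-initializes the object first.
Because A has no user-provided constructor, value initialization as in A* psstructValue = new A(); likewise zero-initializes the object, so zeros are guaranteed there too.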
For the default initialization cases: Reading uninitialized variables is UB, and Undefined behavior is undefined. The compiler can do with that whatever it wants. Showing you 0 is just as legal as crashing. Maybe there even were 0 in the memory you read by chance. Maybe the compilers felt like 0 initializing. Both equally fine from the standard's point of view.
That being said, you have a better chance of seeing garbage when testing with Release / optimized builds. Debug builds tend to do extra work to help diagnose problems, including some extra initialization.
(For the record: gcc and clang with -O3 appear to do no unnecessary initialization on my Linux system at first glance. Nevertheless, I got "all zeroes" for every case. That appears to be by chance.)
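To see that the zeros in the value-initialization cases are guaranteed rather than accidental, one option is to construct A on top of storage that has deliberately been filled with non-zero bytes. This is a sketch only: the helper names and the NoZero variant are hypothetical, and the type-trait variables assume C++17. NoZero shows the usual way to opt out of the zeroing (a user-provided constructor), but its array is never read here, because reading indeterminate values would be undefined behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <new>
#include <type_traits>

constexpr int N = 1000;

struct A {
    std::uint32_t arr[N];
    A() = default;       // explicitly defaulted: not user-provided
};

// Hypothetical variant: a user-provided constructor means value-initialization
// only calls the constructor and does not zero arr first.
struct NoZero {
    std::uint32_t arr[N];
    NoZero() {}
};

static_assert(std::is_trivially_default_constructible_v<A>, "A() is trivial");
static_assert(!std::is_trivially_default_constructible_v<NoZero>,
              "NoZero() is user-provided");

// Value-initialization must overwrite the 0xFF pattern with zeros.
bool value_init_clears_dirty_storage() {
    alignas(A) unsigned char storage[sizeof(A)];
    std::memset(storage, 0xFF, sizeof storage);
    A* p = new (storage) A();   // value-initialization
    return std::count(p->arr, p->arr + N, 0u) == N;
}

// Empty braces give zeros as well.
bool braces_give_zeros() {
    A a{};
    return std::count(a.arr, a.arr + N, 0u) == N;
}

void check_value_initialization() {
    assert(value_init_clears_dirty_storage());
    assert(braces_give_zeros());
}
```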
The other answer doesn't really address the REASON just kind of dances around with the language specification.
The actual reason is due to how the initialization process works.
Ask yourself the question how do I know if something is initialized.
That is why static data DOES need to be initialized, while data that is not static does not. If you didn't first go through and zero out all of the static data, then the dynamic initialization of static objects would be basically impossible.
You would constantly run into issues like two statics that obliquely reference each other in their initialization and everything falls apart.
So without this rule it would be basically impossible to write a C++ compiler. Although there are other initialization schemes that don't have this requirement, adopting them would require a big overhaul of the language.
What is the difference between the two array assignments, one inside a struct and one outside a struct?
struct A
{
char s[4];
};
int main(int argc, char *argv[])
{
char s[4];
char d[4];
d = s; // 'invalid array assignment'
A a, b;
b = a; // compiles without problems
return 0;
}
The default operator = is supposed to invoke member-by-member assignment operators. If so, then there should exist an array assignment operator, but the compiler does not want to invoke it explicitly. Why?
I think this is why: a struct is a class object, and there is a special rule for assigning a member that is an array, but not for assigning an array itself (C++14 draft):
12.8 Copying and moving class objects
12.8.28. The implicitly-defined copy/move assignment operator for a non-union class X performs memberwise copy-/move assignment of its subobjects. The direct base classes of X are assigned first, in the order of their declaration in the base-specifier-list, and then the immediate non-static data members of X are assigned, in the order in which they were declared in the class definition. Let x be either the parameter of the function or, for the move operator, an xvalue referring to the parameter. Each subobject is assigned in the manner appropriate to its type:
- if the subobject is an array, each element is assigned, in the manner appropriate to the element type;
So the copy procedure is defined not for arrays as such (an array is a non-modifiable lvalue), but for class members that are arrays. There is no implicit operator= for an array.
An array name (e.g., s) used in an expression decays to the address of the array's first element, and that address is fixed once the array exists. d = s would therefore amount to assigning the address of s to d, which can't be done.
A simple structure like yours is just a chunk of bits. In your case, an instance of A occupies 4 bytes. When you do b = a, it copies the bits of a into b.
To illustrate the difference, I don't think you can do a.s = b.s. You can try.
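The points made in these answers can be checked with a short sketch. It shows that a raw array type is not assignable while the wrapping struct is, and lists two common ways to copy array contents anyway: std::copy and std::array. The function name check_array_copies is made up for illustration, and the _v type traits assume C++17.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstring>
#include <iterator>
#include <type_traits>

struct A {
    char s[4];
};

// A raw array cannot appear on the left of '=', but the struct can.
static_assert(!std::is_assignable_v<char (&)[4], char (&)[4]>,
              "arrays are not assignable");
static_assert(std::is_copy_assignable_v<A>, "the wrapping struct is assignable");

void check_array_copies() {
    // Implicit operator= of A assigns the member array element by element.
    A a = {"abc"};
    A b = {};
    b = a;
    assert(std::strcmp(b.s, "abc") == 0);

    // Raw arrays: copy the elements explicitly.
    char s[4] = "abc";
    char d[4] = {};
    std::copy(std::begin(s), std::end(s), std::begin(d));
    assert(std::strcmp(d, "abc") == 0);

    // std::array is a class type wrapping an array, so '=' works.
    std::array<char, 4> x = {'a', 'b', 'c', '\0'};
    std::array<char, 4> y = {};
    y = x;
    assert(y == x);
}
```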
I have recently fixed a bug in an application of mine: the problem was that an object that resides on the stack had a field left uninitialized.
The object had a class declaration of this type:
struct A{
int somefield, someotherfield;
A(): someotherfield(0) {}
};
and when declaring a local variable (like A var; in a function), somefield was left uninitialized, and so a read of it would return a randomish value.
I was certain that fields of a class, which don't appear in the constructor initialization list, would always get initialized by a synthesized trivial constructor (in the case of an int, a zero value). Evidently I am wrong.
So what are the general rules about implicit field initialization?
Classes and structs are initialized by their constructor.
Basic types (int, double, char, short, ...) are not initialized and hold indeterminate values.
Pointers are not initialized and point to arbitrary locations.
Arrays of classes or structs have each element initialized by its constructor.
Arrays of basic types or pointers are left uninitialized.
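Given the rules listed above, one way to avoid this kind of bug is to give every field a default member initializer (available since C++11). The sketch below is only an illustration: Risky mirrors the struct from the question, while Fixed and the helper functions are hypothetical names.

```cpp
#include <cassert>

// Mirrors the struct from the question: somefield is never initialized.
struct Risky {
    int somefield, someotherfield;
    Risky() : someotherfield(0) {}
};

// Hypothetical fix: default member initializers cover every field,
// whichever constructor ends up being used.
struct Fixed {
    int somefield = 0;
    int someotherfield = 0;
};

// Reads both fields of a default-initialized local Fixed object.
int sum_of_fixed_fields() {
    Fixed var;
    return var.somefield + var.someotherfield;
}

// Only the field named in the initializer list is safe to read.
int initialized_field_of_risky() {
    Risky var;
    return var.someotherfield;   // var.somefield must not be read here
}

void check_member_initialization() {
    assert(sum_of_fixed_fields() == 0);
    assert(initialized_field_of_risky() == 0);
}
```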
I understand from the answer to this question that values of global/static uninitialized int will be 0. The answer to this one says that for vectors, the default constructor for the object type will be called.
I am unable to figure out - what happens when I have vector<int> v(10) in a local function. What is the default constructor for int? What if I have vector<int> v(10) declared globally?
What I am seeing is that vector<int> v(10) in a local function is resulting in variables being 0 - but I am not sure if that is just because of my compiler or is the fixed expected behaviour.
The zero initialization is specified in the standard as default zero initialization/value initialization for builtin types, primarily to support just this type of case in template use.
Note that this behavior is different from a local variable such as int x; which leaves the value uninitialized (as in the C language that behavior is inherited from).
It is not undefined behaviour, a vector automatically initialises all its elements. You can select a different default if you want.
The constructor is:
vector( size_type, T t = T() )
and for int, the default value (returned by int()) is 0.
In a local function this:
int x;
is not guaranteed to initialise the variable to 0.
int x = int();
would do so.
int x();
sadly does neither but declares a function.
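A minimal sketch of the cases just described, with made-up function names. The brace form int x{} is added here as an assumption (C++11 or later); it is not mentioned in the answer, but it avoids the function-declaration trap of int x();.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// vector<int> v(10) value-initializes its ten elements, so all are 0,
// whether v is local or global.
bool vector_elements_are_zero() {
    std::vector<int> v(10);
    return v.size() == 10 &&
           std::all_of(v.begin(), v.end(), [](int e) { return e == 0; });
}

// int() is a value-initialized int, which is 0.
int value_initialized_int() {
    int x = int();
    return x;
}

// Empty braces also value-initialize (C++11 and later).
int brace_initialized_int() {
    int x{};
    return x;
}

void check_int_initialization() {
    assert(vector_elements_are_zero());
    assert(value_initialized_int() == 0);
    assert(brace_initialized_int() == 0);
}
```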
The constructor you are using actually takes two arguments, the second of which is optional. Its declaration looks like this:
explicit vector(size_type n, const T& value = T())
The first argument is the number of elements to create in the vector initially; the second argument is the value to copy into each of those elements.
For any object type T, T() is called "value initialization." For numeric types, it gives you 0. For a class type with a default constructor, it gives you an object that has been default constructed using that constructor.
For more details on the "magic parentheses," I'd recommend reading Michael Burr's excellent answer to the question "Do the parentheses after the type name make a difference with new?" It discusses value initialization when used with new specifically, but for the most part is applicable to value initialization wherever else it can be used.
By default, vector elements are value-initialized and not default-initialized. These are related but different concepts:
zero-initialization is what is done for static objects that have no explicit initializer. For basic types, the value used is 0 converted to the type.
default-initialization is what is done for non-static variables and members that are not explicitly initialized. For basic types, the object stays uninitialized.
value-initialization (introduced in C++03) is what is done for an initializer of (), including a member given () in the constructor initializer list. For basic types it has the same effect as zero-initialization.
As mentioned by others, what happens is the zero initialization kicks in. I actually use that a lot in my code (outside of vectors and other classes):
some_type my_var = some_type();
This allows me to make sure that my variables are always properly initialized since by default C/C++ do not initialize basic types (char, short, int, long, float, double, etc.)
Since C++11, you also can do so in your class definitions:
class MyClass
{
...
int my_field_ = 123; // explicit initialization
int your_field_ = int(); // zero initialization
};
For vectors, the std library uses T(), which is value initialization. For a class, it calls the default constructor. For a basic type, it uses zero ('\0', 0, 0.0f, 0.0, nullptr).
As mentioned by James McNellis and Nawaz, it is possible to set the value used to initialize the vector as in:
std::vector<int> foo(100, 1234);
That feature is also available when you resize your vector (if the vector shrinks, the default value is ignored):
foo.resize(200, 1234);
So that way you can have a default initialization value. However, it's a bit tricky since you have to make sure that all your definitions and resize() calls use that default value. That's when you want to write your own class which ensures that the default value is always passed to the vector functions.
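The pitfall is easy to reproduce. In the sketch below (grow_twice and check_resize_pitfall are hypothetical names), the second resize() call forgets the fill value, so the newly added elements are value-initialized to 0 instead of 1234.

```cpp
#include <cassert>
#include <vector>

// Grows a vector twice: once with the intended fill value, once without it.
std::vector<int> grow_twice() {
    std::vector<int> foo(100, 1234);
    foo.resize(200, 1234);   // new elements [100, 200) are 1234
    foo.resize(250);         // fill value forgotten: [200, 250) are 0
    return foo;
}

void check_resize_pitfall() {
    const std::vector<int> foo = grow_twice();
    assert(foo.size() == 250);
    assert(foo[0] == 1234 && foo[199] == 1234);
    assert(foo[200] == 0 && foo[249] == 0);
}
```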
However, if you want to have a way to auto-initialize to a specific value, you can mix both features this way:
struct my_value {
int v = 123;
};
std::vector<my_value> foo(100);
// here foo[n].v == 123 for n in [0, 100)
This is my preferred way of dealing with this issue (i.e. if I don't want zero by default). It's an extra .v, but much less prone to mistakes and you don't need to know of the default value when you create a vector of my_value.
Also, for those who think this will be slow, it won't. The struct is like syntactic sugar as far as C++ is concerned. Once optimized, it will be exactly the same as a simple std::vector<int> foo(100, 123).
Value initialization of an int sets it to 0 (default initialization, as in a local int x;, leaves it uninitialized).
This is true of all scalar types: char is initialized to (char)0 (or '\0' if you prefer), float to 0.0f, and any pointer to a null pointer. For class types with a default constructor, that constructor is invoked.
In general, standard containers value-initialize their elements whenever you do not supply a value yourself.