Avoid field zero initialization in C++14 - c++

Since C++14 (it might be C++11, i'm not sure), zero initialization happens on class construction on certain conditions, depending on the way the constructor is called.
Is there a way to ensure a raw value field (let's say a pointer) is never zero initialized ?
I guess not because it seems that zero initialization happens at the class level and not at the field level (surely a kind of memset(this, 0, sizeof(TheClass))), but I'm still hoping there is way, a hack, something...
The idea is to be able to initialize a field before a placement new is called so that that member is available during construction time.

According to cppreference's take on zero initialization:
If T is an non-union class type, all base classes and non-static data
members are zero-initialized, and all padding is initialized to zero
bits. The constructors, if any, are ignored.
If your object is a victim of zero initialization you're out of luck. And every object with static or thread local storage duration will always be zero initialized (except for constant initialization).

Is there a way to ensure a raw value field (let's say a pointer) is never zero initialized ?
It is possible, yes.
If the class is trivially default constructible, then simply default initialise the object, and its members will be default initialised as well:
int main() {
T t; // default initialised
If the class is not trivially default constructible, you can write a user defined constructor that leaves the member uninitialised (note that doing this is generally discouraged):
struct S {
int member;
S(){} // member is left default initialised
Warning: With a user defined constructor, the member would be left default initialised even if the object is value initialised.
Objects with static storage duration are always zero or constant initialised. Neither of those leave members uninitialised. Only way to avoid that initialisation is to not create objects with static storage duration.
The idea is to be able to initialize a field before a placement new is called so that that member is available during construction time.
If you mean your idea is to initialise a field of an object that hasn't yet been created with placement new, then the idea is not possible to implement.
Members are available during construction time. You don't need any tricks to achieve this. All members that can be used by their siblings in order of their initialisation. And the constructor body is run after all members have been initialised.

As #eerorika's answer suggests:
struct S {
int member;
S(){} // member is left default initialised
...note that doing this is generally discouraged
I would say that the clearest way to achieve this without confusing the user of the class would be not to use a int, but a std::aligned_storage_t as it is clear that this member is used just for storage of objects that might be created later rather than a value that can be used on its own.
Depending how much code you would like to write, this might be tedious, because you have to write placement news and use std::launder (since c++17) to be compliant with the standard. However, this is IMO the best way to express your intent.
cppreference gives a good example of how to use std::aligned_storage (with some modification, see comments):
#include <iostream>
#include <type_traits>
#include <string>
template<class T, std::size_t N>
class static_vector
{
// properly aligned uninitialized storage for N T's
typename std::aligned_storage<sizeof(T), alignof(T)>::type data[N];
std::size_t m_size = 0;
public:
// My modification here suppresses zero-initialization
// if initialized with empty braces
static_vector() {};
// Create an object in aligned storage
template<typename ...Args> void emplace_back(Args&&... args)
{
if( m_size >= N ) // possible error handling
throw std::bad_alloc{};
// construct value in memory of aligned storage
// using inplace operator new
new(&data[m_size]) T(std::forward<Args>(args)...);
++m_size;
}
// Access an object in aligned storage
const T& operator[](std::size_t pos) const
{
// note: needs std::launder as of C++17
return *reinterpret_cast<const T*>(&data[pos]);
}
// Delete objects from aligned storage
~static_vector()
{
for(std::size_t pos = 0; pos < m_size; ++pos) {
// note: needs std::launder as of C++17
reinterpret_cast<T*>(&data[pos])->~T();
}
}
};
int main()
{
static_vector<std::string, 10> v1;
v1.emplace_back(5, '*');
v1.emplace_back(10, '*');
std::cout << v1[0] << '\n' << v1[1] << '\n';
static_vector<std::size_t, 10> v2{};
// This is undefined behavior.
// Here it's just used to demonstrate that
// the memory is not initialized.
std::cout << v2[0] << "\n";
}
Run it on compiler explorer
However, since the c++ standard only mandates when zero-initialization must happen, but not when it must not happen, sometimes you really have to wrestle with your compiler in order to suppress zero initialization.
See this question and this question for more details.

Related

Apparent bug in clang when assigning a r value containing a `std::string` from a constructor

While testing handling of const object members I ran into this apparent bug in clang. The code works in msvc and gcc. However, the bug only appears with non-consts which is certainly the most common use. Am I doing something wrong or is this a real bug?
https://godbolt.org/z/Gbxjo19Ez
#include <string>
#include <memory>
struct A
{
// const std::string s; // Oddly, declaring s const compiles
std::string s;
constexpr A() = default;
constexpr A(A&& rh) = default;
constexpr A& operator=(A&& rh) noexcept
{
std::destroy_at(this);
std::construct_at(this, std::move(rh));
return *this;
}
};
constexpr int foo()
{
A i0{}; // call ctor
// Fails with clang. OK msvc, gcc
// construction of subobject of member '_M_local_buf' of union with no active member is not allowed in a constant expression { return ::new((void*)__location) _Tp(std::forward<_Args>(__args)...); }
i0 = A{}; // call assign rctor
return 42;
}
int main() {
constexpr int i = foo();
return i;
}
For those interested, here's the full version that turns const objects into first class citizens (usable in vectors, sorting, and such). I really dislike adding getters to maintain immutability.
https://godbolt.org/z/hx7f9Krn8
Yes this is a libstdc++ or clang issue: std::string's move constructor cannot be used in a constant expression. The following gives the same error:
#include <string>
constexpr int f() {
std::string a;
std::string b(std::move(a));
return 42;
}
static_assert(f() == 42);
https://godbolt.org/z/3xWxYW717
https://en.cppreference.com/w/cpp/compiler_support does not show that clang supports constexpr std::string yet.
Your game of "construct a new object in place of the old one" is the problem.
It is completely forbidden if the object is const or contains any const member subobjects.
due to the following rule in [basic.life] (note that a rewrite1 of this rule is proposed in post-C++17 drafts)
If, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, a new object is created at the storage location which the original object occupied, a pointer that pointed to the original object, a reference that referred to the original object, or the name of the original object will automatically refer to the new object and, once the lifetime of the new object has started, can be used to manipulate the new object, if:
the storage for the new object exactly overlays the storage location which the original object occupied,
and
the new object is of the same type as the original object (ignoring the top-level cv-qualifiers)
and
the type of the original object is not const-qualified, and, if a class type, does not contain any non-static data member whose type is const-qualified or a reference type
and
the original object was a most derived object of type T and the new object is a most derived object of type T (that is, they are not base class subobjects).
You have to abide by this rule for the purposes both of return *this; and also the implicit destructor call.
It also doesn't work during constexpr evaluation.
... this one is specific to the fact that std::string small-string optimization may be implemented using a union, and changing an active union member has been forbidden during constexpr evaluation, although this rule too seems to have changed post-C++17.
1 I consider said change to be misguided (it doesn't even permit the pattern it was supposed to fix) and break legitimate coding patterns. While it's true that a pointer to const-qualified object only made my view readonly and did not let me assume that the object wasn't being changed by someone else holding a pointer/reference that wasn't so qualified, in the past if I was given a pointer (qualified or not) to an object with a const member, I was assured that no one was changing that member and I (or my optimizing compiler) could safely use a cached copy of that member (or data derived from that member value, such as a hash or a comparison result).
Apparently this is no longer true.
While changing the language rule may automatically remove all compiler optimizations that would have assumed immutability of a const member, there's no automatic patch to user-written code which was correct and bug-free under the old rules, for example std::map and std::unordered_map code using std::pair<const Key, Value>. Yet the DR doesn't appear to have considered this as a breaking change...
I was asked for a code snippet that illustrates a behavior change of existing valid code, here it is. This code was formerly illegal, under the new rules it's legal, and the map will fail to maintain its invariants.
std::map<int, T> m{data_source()};
/* new code, now legal */
for( auto& keyvalue : m ) {
int newkey = -keyvalue.first;
std::construct_at(&keyvalue.first, newkey);
// or new (&keyvalue.first) int(newkey);
}
/* existing valid code that breaks */
std::cout << m[some_key()];
Consider the new relaxed wording of the restriction
the original object is neither a complete object that is const-qualified nor a subobject of such an object
keyvalue.first is const-qualified, but it is not a complete object, and it is a subobject of a complete object (std::pair<const Key, Value>) that is not const-qualified. This code is now legal. It's not even against the spirit of the rule, the DR explicitly mentioned the intent to perform in-place replacement of container elements with const subobjects.
It's the implementation of std::map that breaks, along with all existing code that uses the map instance, an unfortunate action-at-a-distance resulting from addition of now-legal code.
Please note that the actual replacement of the key could take place in code that merely has the pointer &keyvalue and needn't know that std::pair instance is actually inside a std::map), so the stupidity of what's being done won't be so obvious.

C++11: Does an assignment operator prevent a type from being POD, and thus being global-initialized?

Background: I'm in a large-code environment where the undefined order in which global constructors are run is problematic. So I have a custom class that is designed to delay initialization until first-use. All its magic occurs inside its operator* and operator-> functions; they're the only thing defined. It also stores some state within itself, to make available to the automatic-initialization function. That state must, of course, be POD, so that the whole class is POD, so that it can be completely set up before anyone's code starts running, so that all code everywhere can use all globals everywhere, without fear that the globals haven't been set up yet.
A while back someone added a private, never-defined assignment operator, so that the type would never be assigned to (it's not designed to ever change anyway). Now someone else is saying the class is broken because it's not POD. If instead of it being declared-but-not-defined, I declare it as "=delete", I'm thinking that's somehow better. And indeed, with that change, std::is_pod<>::value returns true for the type.
But does as assignment operator prevent a type from being POD? I thought the requirements were just that it had to have only public data members, no virtual methods, and no constructor or destructor.
And more to the point for my situation: does the presence of a never-defined assignment operator prevent the class from being able to be initialized at global initialization time, along with all the other global PODs?
Reduced example:
struct LazyString {
const char *c_str;
bool has_been_inited;
string *lazy_str_do_not_use_directly;
string &operator*() { return *get(); }
string *operator->() { return get(); }
private:
string *get() {
// The real code uses a mutex, of course, to be thread-safe.
if (!has_been_inited) {
lazy_str_do_not_use_directly = new string(c_str);
has_been_inited = true;
}
return lazy_str_do_not_use_directly;
}
// Does this make the class non-POD?
// If so, does that mean that global variables of this type
// will not be initialized at global-initialization time, that wonderful
// moment in time where no code has yet been run?
void operator=(const LazyString&);
// If I do this instead, it breaks C++03 compatibility, but is this somehow better?
void operator=(const LazyString&) = delete;
};
LazyString lazy = { "lazy" };
int main(int argc, char *argv[]) {
std::cout << *lazy;
}
Does an assignment operator prevent a type from being POD
Yes. A POD type must be trivial; and so must be trivially copyable; and so must have no non-trivial copy or move assignment operators.
Any user-provided operator is non-trivial, so declaring the copy constructor makes the class non-trivial, and hence non-POD.
and thus being global-initialized?
No. Any instantiable type, POD or not, can be a global variable.
UPDATE: From the comment, you meant to ask:
Can it be statically, rather than dynamically, initialised?
Yes, since it has a trivial constructor; as long as the initialiser is a constant expression. In your example, { "lazy" } is a constant expression, so LazyString can be statically initialised.
The important feature here is that it has a trivial constructor, not that it's POD. POD means it also meets various other requirements, not relevant to initialisation.
C++ has several stages of initialization for non-local variables with static storage duration (global variable fall into this category) - regardless of whether the type is a POD or not.
zero initialization takes place before any other initialization
constant initialization takes place
dynamic initialization
The first two types of initialization must take place before dynamic initialization. Dynamic initialization is the situation where the order of initialization can be difficult to set. Some dynamic initialization is unordered, some is ordered within a translation unit, etc.
Even if you global variable is not POD, you can be sure that zero initialization will have taken place before any dynamic init.
See C++11 3.6.2 "Initialization of non-local variables" for details.

Unrestricted union in practice

I have some questions about unrestricted unions and their application in practice.
Let's suppose I have the following code :
struct MyStruct
{
MyStruct(const std::vector<int>& a) : array(a), type(ARRAY)
{}
MyStruct(bool b) : boolean(b), type(BOOL)
{}
MyStruct(const MyStruct& ms) : type(ms.type)
{
if (type == ARRAY)
new (&array) std::vector<int>(ms.array);
else
boolean = ms.boolean;
}
MyStruct& operator=(const MyStruct& ms)
{
if (&ms != this) {
if (type == ARRAY)
array.~vector<int>(); // EDIT(2)
if (ms.type == ARRAY)
new (&array) std::vector<int>(ms.array);
else
boolean = ms.boolean;
type = ms.type;
}
return *this;
}
~MyStruct()
{
if (type == ARRAY)
array.~vector<int>();
}
union {
std::vector<int> array;
bool boolean;
};
enum {ARRAY, BOOL} type;
};
Is this code valid :) ?
Is it necessary to explicitly call the vector destructor each time we are using the boolean (as stated here http://cpp11standard.blogspot.com/2012/11/c11-standard-explained-1-unrestricted.html)
Why a placement new is required instead of just doing something like 'array = ms.array' ?
EDIT:
Yes, it compiles
"Members declared inside anonymous unions are actually members of the containing class, and can be initialized in the containing class's constructor." (C++11 anonymous union with non-trivial members)
Adding explicit destructors as suggested, leads to SIGSEV with g++ 4.8 / clang 4.2
The code's buggy: change array.clear(); to array.~vector<int>();
Explanation: operator= is using placement new over an object that hasn't been destructed, which could do anything but practically you can expect it to leak the dynamic memory the previous array had been using (clear() doesn't release memory / change capacity, it just destructs elements and changes size).
From 9.5/2:
If any non-static data member of a union has a non-trivial default
constructor (12.1), copy constructor (12.8), move constructor (12.8), copy assignment operator (12.8), move
assignment operator (12.8), or destructor (12.4), the corresponding member function of the union must be
user-provided or it will be implicitly deleted (8.4.3) for the union.
So, the vector constructor, destructor etc never kicks in by themselves: you must call them explicitly when wanted.
In 9.5/3 there's an example:
Consider the following union:
union U {
int i;
float f;
std::string s;
};
Since std::string (21.3) declares non-trivial versions of all of the special member functions, U will have
an implicitly deleted default constructor, copy/move constructor, copy/move assignment operator, and destructor.
To use U, some or all of these member functions must be user-provided.
That last bit - "To use U, some or all of these member functions must be user-provided." - seems to presume that U needs to coordinate its own vaguely value-semantic behaviour, but in your case the surrouding struct is doing that so you don't need to define any of these union member functions.
2: we must call the array destructor whenever an array value is being replaced by a boolean value. If in operator= a new array value is being placement-newed instead of assigned, then the old array must also have its destructor called, but using operator= would be more efficient when the existing memory is sufficient for all the elements being copied. Basically, you must match constructions and destructions. UPDATE: the example code has a bug as per your comment below.
3: Why a placement new is required instead of just doing something like 'array = ms.array' ?
array = ms.array invokes std::vector<int>::operator= which always assumes the this pointer addresses an already properly constructed object. Inside that object you can expect there to be a pointer which will either be NULL or refer to some internal short-string buffer, or refer to heap. If your object hasn't been destructed, then operator= may well call a memory deallocation function on the bogus pointer. Placement new says "ignore the current content of the memory that this object will occupy, and construct a new object with valid members from scratch.
The union does not declare a default constructor, copy constructor, copy assignment operator, or destructor.
If std::string declares at least one non-trivial version of a special member function (which is the case), the forementioned ones are all implicitly deleted, and you must declare (and define) them (... if they're used, which is the case).
Insofar, that code isn't correct and should not successfully compile (this is almost-to-the-letter identical to the example in 9.5 par 3 of the standard, except there it's std::string, not std::vector).
(Does not apply for an anon union, as correctly pointed out)
About question (2): In order to safely switch the union, this is necessary, yes. The Standard explicitly says that in 9.5 par 4 [Note].
It makes sense too, if you think about it. At most one data member can be active in a union at any time, and they're not magically default constructed/destroyed, which means you need to properly construct/destruct things. It is not meaningful (or even defined) to use the union as something else otherwise (not that you couldn't do that anyway, but it's undefined).
The object is not a pointer, and you don't know whether it's allocated on the heap either (even if it is allocated on the heap, then it's inside another object, so it's still not allowable to delete it). How do you destroy an object if you can't call delete? How do you allocate an object -- possibly several times -- without leaking if you can't delete it? This doesn't leave many choices. Insofar, the [Note] makes perfect sense.

Does a c++ struct have a default constructor?

I wrote the following code snippet:
void foo()
{
struct _bar_
{
int a;
} bar;
cout << "Value of a is " << bar.a;
}
and compiled it with g++ 4.2.1 (Mac). The output is "Value of a is 0".
Is it true to say that data members of a struct in c++ are always initialized by default (compared to c)? Or is the observed result just coincidence?
I can imagine that structs in c++ have a default constructor (since a struct and a class is almost the same in c++), which would explain why the data member a of bar is initialized to zero.
The simple answer is yes.
It has a default constructor.
Note: struct and class are identical (apart from the default state of the accesses specifiers).
But whether it initializes the members will depends on how the actual object is declared. In your example no the member is not initialized and a has indeterminate value.
void func()
{
_bar_ a; // Members are NOT initialized.
_bar_ b = _bar_(); // Members are zero-initialized
// From C++14
_bar_ c{}; // New Brace initializer (Members are zero-initialized)
_bar_* aP = new _bar_; // Members are NOT initialized.
_bar_* bP = new _bar_(); // Members are zero-initialized
// From C++14
_bar_ cP = new _bar_{}; // New Brace initializer (Members are zero-initialized)
}
// static storage duration objects
// i.e. objects at the global scope.
_bar_ c; // Members are zero-initialized.
The exact details are explained in the standard at 8.5 Initializers [dcl.init] paragraphs 4-10. But the following is a simplistic summary for this situation.
A structure without a user defined constructor has a compiler generated constructor. But what it does depends on how it is used and it will either default initialize its members (which for POD types is usually nothing) or it may zero initialize its members (which for POD usually means set its members to zero).
PS. Don't use a _ as the first character in a type name. You will bump into problems.
Is it true to say that data members of a struct in c++ are always initialized by default (compared to c)? Or is the observed result just coincidence?
It is a coincidence.
Your code invokes Undefined Behavior; unless you explicitly set the members to 0 they can be anything.
Not an answer, but you might take it to be... if you want to try it:
void foo() {
struct test {
int value;
} x;
std::cout << x.value << std::endl;
x.value = 1000;
}
int main() {
foo();
foo();
}
In your example, the memory already had the 0 value before the variable was created, so you can call it a lucky coincidence (in fact, some OS will zero out all memory before starting a process, which means that 0 is quite a likely value to find in a small short program...), the previous code will call the function twice, and the memory from the first call will be reused in the second one, chances are that the second time around it will print 1000. Note however that the value is still undefined, and that this test might or not show the expected result (i.e. there are many things that the compiler can do and would generate a different result...)
Member variables of a struct are not initialized by default. Just like a class (because a struct is exactly the same thing as a class, only in struct the members are public by default).
Do not rely on this functionality it is non-standard
just add
foo() : a() {}
I can't remember the exact state of gcc 4.2 (i think it is too old) but if you were using C++11 you can do the following
foo()=default;

Initialisation and assignment

What EXACTLY is the difference between INITIALIZATION and ASSIGNMENT ?
PS : If possible please give examples in C and C++ , specifically .
Actually , I was confused by these statements ...
C++ provides another way of initializing member variables that allows us to initialize member variables when they are created rather than afterwards. This is done through use of an initialization list.
Using an initialization list is very similar to doing implicit assignments.
Oh my. Initialization and assignment. Well, that's confusion for sure!
To initialize is to make ready for use. And when we're talking about a variable, that means giving the variable a first, useful value. And one way to do that is by using an assignment.
So it's pretty subtle: assignment is one way to do initialization.
Assignment works well for initializing e.g. an int, but it doesn't work well for initializing e.g. a std::string. Why? Because the std::string object contains at least one pointer to dynamically allocated memory, and
if the object has not yet been initialized, that pointer needs to be set to point at a properly allocated buffer (block of memory to hold the string contents), but
if the object has already been initialized, then an assignment may have to deallocate the old buffer and allocate a new one.
So the std::string object's assignment operator evidently has to behave in two different ways, depending on whether the object has already been initialized or not!
Of course it doesn't behave in two different ways. Instead, for a std::string object the initialization is taken care of by a constructor. You can say that a constructor's job is to take the area of memory that will represent the object, and change the arbitrary bits there to something suitable for the object type, something that represents a valid object state.
That initialization from raw memory should ideally be done once for each object, before any other operations on the object.
And the C++ rules effectively guarantee that. At least as long as you don't use very low level facilities. One might call that the C++ construction guarantee.
So, this means that when you do
std::string s( "one" );
then you're doing simple construction from raw memory, but when you do
std::string s;
s = "two";
then you're first constructing s (with an object state representing an empty string), and then assigning to this already initialized s.
And that, finally, allows me to answer your question. From the point of view of language independent programming the first useful value is presumably the one that's assigned, and so in this view one thinks of the assignment as initialization. Yet, at the C++ technical level initialization has already been done, by a call of std::string's default constructor, so at this level one thinks of the declaration as initialization, and the assignment as just a later change of value.
So, especially the term "initialization" depends on the context!
Simply apply some common sense to sort out what Someone Else probably means.
Cheers & hth.,
In the simplest of terms:
int a = 0; // initialization of a to 0
a = 1; // assignment of a to 1
For built in types its relatively straight forward. For user defined types it can get more complex. Have a look at this article.
For instance:
class A
{
public:
A() : val_(0) // initializer list, initializes val_
{}
A(const int v) : val_(v) // initializes val_
{}
A(const A& rhs) : val_(rhs.val_) // still initialization of val_
{}
private:
int val_;
};
// all initialization:
A a;
A a2(4);
A a3(a2);
a = a3; // assignment
Initialization is creating an instance(of type) with certain value.
int i = 0;
Assignment is to give value to an already created instance(of type).
int i;
i = 0
To Answer your edited Question:
What is the difference between Initializing And Assignment inside constructor? &
What is the advantage?
There is a difference between Initializing a member using initializer list and assigning it an value inside the constructor body.
When you initialize fields via initializer list the constructors will be called once.
If you use the assignment then the fields will be first initialized with default constructors and then reassigned (via assignment operator) with actual values.
As you see there is an additional overhead of creation & assignment in the latter, which might be considerable for user defined classes.
For an integer data type or POD class members there is no practical overhead.
An Code Example:
class Myclass
{
public:
Myclass (unsigned int param) : param_ (param)
{
}
unsigned int param () const
{
return param_;
}
private:
unsigned int param_;
};
In the above example:
Myclass (unsigned int param) : param_ (param)
This construct is called a Member Initializer List in C++.
It initializes a member param_ to a value param.
When do you HAVE TO use member Initializer list?
You will have(rather forced) to use a Member Initializer list if:
Your class has a reference member
Your class has a const member or
Your class doesn't have a default constructor
Initialisation: giving an object an initial value:
int a(0);
int b = 2;
int c = a;
int d(c);
std::vector<int> e;
Assignment: assigning a new value to an object:
a = b;
b = 5;
c = a;
d = 2;
In C the general syntax for initialization is with {}:
struct toto { unsigned a; double c[2] };
struct toto T = { 3, { 4.5, 3.1 } };
struct toto S = { .c = { [1] = 7.0 }, .a = 32 };
The one for S is called "designated initializers" and is only available from C99 onward.
Fields that are omitted are automatically initialized with the
correct 0 for the corresponding type.
this syntax applies even to basic data types like double r = { 1.0
};
There is a catchall initializer that sets all fields to 0, namely { 0 }.
if the variable is of static linkage all expressions of the
initializer must be constant expressions
This {} syntax can not be used directly for assignment, but in C99 you can use compound literals instead like
S = (struct toto){ .c = { [1] = 5.0 } };
So by first creating a temporary object on the RHS and assigning this to your object.
One thing that nobody has yet mentioned is the difference between initialisation and assignment of class fields in the constructor.
Let us consider the class:
class Thing
{
int num;
char c;
public:
Thing();
};
Thing::Thing()
: num(5)
{
c = 'a';
}
What we have here is a constructor that initialises Thing::num to the value of 5, and assigns 'a' to Thing::c. In this case the difference is minor, but as was said before if you were to substitute int and char in this example for some arbitrary classes, we would be talking about the difference between calling a parameterised constructor versus a default constructor followed by operator= function.