C++11 anonymous union with non-trivial members - c++

I'm updating a struct of mine and I was wanting to add a std::string member to it. The original struct looks like this:
struct Value {
uint64_t lastUpdated;
union {
uint64_t ui;
int64_t i;
float f;
bool b;
};
};
Just adding a std::string member to the union, of course, causes a compile error, because one would normally need to add the non-trivial constructors of the object. In the case of std::string (text from informit.com)
Since std::string defines all of the six special member functions, U will have an implicitly deleted default constructor, copy constructor, copy assignment operator, move constructor, move assignment operator and destructor. Effectively, this means that you can't create instances of U unless you define some, or all of the special member functions explicitly.
Then the website goes on to give the following sample code:
union U
{
int a;
int b;
string s;
U();
~U();
};
However, I'm using an anonymous union within a struct. I asked ##C++ on freenode and they told me the correct way to do that was to put the constructor in the struct instead and gave me this example code:
#include <new>
struct Point {
Point() {}
Point(int x, int y): x_(x), y_(y) {}
int x_, y_;
};
struct Foo
{
Foo() { new(&p) Point(); }
union {
int z;
double w;
Point p;
};
};
int main(void)
{
}
But from there I can't figure how to make the rest of the special functions that std::string needs defined, and moreover, I'm not entirely clear on how the ctor in that example is working.
Can I get someone to explain this to me a bit clearer?

There is no need for placement new here.
Variant members won't be initialized by the compiler-generated constructor, but there should be no trouble picking one and initializing it using the normal ctor-initializer-list. Members declared inside anonymous unions are actually members of the containing class, and can be initialized in the containing class's constructor.
This behavior is described in section 9.5. [class.union]:
A union-like class is a union or a class that has an anonymous union as a direct member. A union-like class X has a set of variant members. If X is a union its variant members are the non-static data members; otherwise, its variant members are the non-static data members of all anonymous unions that are members of X.
and in section 12.6.2 [class.base.init]:
A ctor-initializer may initialize a variant member of the constructor’s class. If a ctor-initializer specifies more than one mem-initializer for the same member or for the same base class, the ctor-initializer is ill-formed.
So the code can be simply:
#include <new>
struct Point {
Point() {}
Point(int x, int y): x_(x), y_(y) {}
int x_, y_;
};
struct Foo
{
Foo() : p() {} // usual everyday initialization in the ctor-initializer
union {
int z;
double w;
Point p;
};
};
int main(void)
{
}
Of course, placement new should still be used when vivifying a variant member other than the other initialized in the constructor.

That new (&p) Point() example is a call to the Standard placement new operator (via a placement new expression), hence why you need to include <new>. That particular operator is special in that it does not allocate memory, it only returns what you passed to it (in this case it's the &p parameter). The net result of the expression is that an object has been constructed.
If you combine this syntax with explicit destructor calls then you can achieve complete control over the lifetime of an object:
// Let's assume storage_type is a type
// that is appropriate for our purposes
storage_type storage;
std::string* p = new (&storage) std::string;
// p now points to an std::string that resides in our storage
// it was default constructed
// *p can now be used like any other string
*p = "foo";
// Needed to get around a quirk of the language
using string_type = std::string;
// We now explicitly destroy it:
p->~string_type();
// Not possible:
// p->~std::string();
// This did nothing to our storage however
// We can even reuse it
p = new (&storage) std::string("foo");
// Let's not forget to destroy our newest object
p->~string_type();
When and where you should construct and destroy the std::string member (let's call it s) in your Value class depends on your usage pattern for s. In this minimal example you never construct (and hence destruct) it in the special members:
struct Value {
Value() {}
Value(Value const&) = delete;
Value& operator=(Value const&) = delete;
Value(Value&&) = delete;
Value& operator=(Value&&) = delete;
~Value() {}
uint64_t lastUpdated;
union {
uint64_t ui;
int64_t i;
float f;
bool b;
std::string s;
};
};
The following is thus a valid use of Value:
Value v;
new (&v.s) std::string("foo");
something_taking_a_string(v.s);
using string_type = std::string;
v.s.~string_type();
As you may have noticed, I disabled copying and moving Value. The reason for that is that we can't copy or move the appropriate active member of the union without knowing which one it is that is active, if any.

Related

C++ initialization list vs assigning values

in C++, what is the difference between initialization list and assigning values in a constructer rather than the way each method looks?
I mean what's the advantage of using one rather than the other and why in the given example in the slide (below) only works with initialization? (I hope if you could add some resources to it since I didn't find)
Click here to view slide: uploaded on imgur
Using initialization list in constructor is the one step process i.e. it initializes objects at the moment it’s declared. It calls copy constructor.
Whereas using assignment is the two-step process i.e. define the object and then assign it. Defining objects calls default constructor and then assignment calls assignment operator. Hence, expensive operations.
In C++, constant or reference data member variables of a class can only be initialized in the initialization list, not using assignment in constructor body.
Both constant and reference data member variables have property that they both must be initialized at the moment of declaration. So, there is only way to use initialization list in constructor, as initialization list initializes class member variables at the time of declaration whereas assignment if constructor body initializes data members after declaration.
There are situations where initialization of data members inside constructor doesn’t work and Initializer List must be used. Following are such cases.
For initialization of non-static const data members.
#include<iostream>
using namespace std;
class Test {
const int t;
public:
Test(int t):t(t) {} //Initializer list must be used
int getT() { return t; }
};
int main() {
Test t1(10);
cout<<t1.getT();
return 0;
}
For initialization of reference members.
#include<iostream>
using namespace std;
class Test {
int &t;
public:
Test(int &t):t(t) {} //Initializer list must be used
int getT() { return t; }
};
int main() {
int x = 20;
Test t1(x);
cout<<t1.getT()<<endl;
x = 30;
cout<<t1.getT()<<endl;
return 0;
}
For initialization of member objects which do not have default constructor. (In your case Array digits does not have default constructor)
#include <iostream>
using namespace std;
class A {
int i;
public:
A(int );
};
A::A(int arg) {
i = arg;
cout << "A's Constructor called: Value of i: " << i << endl;
}
// Class B contains object of A
class B {
A a;
public:
B(int );
};
B::B(int x):a(x) { //Initializer list must be used
cout << "B's Constructor called";
}
int main() {
B obj(10);
return 0;
}
For initialization of base class members.
When constructor’s parameter name is same as data member.
For Performance reasons.
If you don't use an initialization list, the data members of the class will be default constructed before the body of the constructor is reached:
class Foo{
private:
int bar;
public:
Foo(int _bar){//bar is default constructed here
bar = _bar; //bar is assigned a new value here
}
};
This isn't a big issue for a fundamental-type, like int, as the default constructor is not expensive. However, it can become an issue if the data member does not have a default constructor, or default construction followed by assignment is more expensive than direct construction:
//Bar does not have a default constructor, only a copy constructor
class Bar{
public:
Bar() = delete; //no default constructor
Bar(const Bar& bar); //copy constructor only
};
class Foo{
private:
Bar bar;
public:
Foo(const Bar& _bar){
//this will not compile, bar does not have a default constructor
// or a copy assignment operator
bar = _bar;
}
Foo(const Bar& _bar) : bar(_bar){//this will compile, copy constructor for bar called
}
};
Generally speaking, use the initialization list for more efficient code.

initialization of class member in anonymous union

I have observed that the following code segfaults at the line ar.p():
#include <iostream>
class A
{
public:
virtual void p() { std::cout<<"A!\n"; }
};
class B : public A
{
public:
void p() { std::cout<<"B!\n"; }
};
struct Param
{
enum {AA, BB} tag;
union {
A a;
B b;
};
Param(const A &p)
: tag(AA) {a = p;}
A &get() {
switch(tag) {
case AA: return a;
case BB: return b;
}
}
};
int main() {
A a;
a.p();
Param u(a);
A &ar = u.get();
ar.p();
}
However, when I change the Param constructor to:
Param(const A &p)
: tag(AA), a(p) {}
it does not segfault anymore.
I think it has something to do with the way the vtable ptr for union member a is initialized, but I'd like to understand this bug better.
On coliru: http://coliru.stacked-crooked.com/a/85182239c9f033c1
The union doesn't have an implicit constructor, you have to add your own constructor to the union which initializes one of the members of the union. I think this is because the compiler can't know whether you want to initialize a or b. You may also need an assignment operator and destructor. See also this question: Why does union has deleted default constructor if one of its member doesn't have one whatsoever?
The constructor should use placement new or it can use a member initializer to construct one of the union members, as you do in your alternative constructor.
If you want to assign something to b afterwards, you have to destruct a using a.~A(), and then initialize b with placement new.
If you have members in the union with a non-trivial destructor then the union must have a destructor which calls the destructor on the member which is used at that point.
In your original code the assignment operator and the p() method are called without running a constructor first, leading to the crash.

Why an union wont allow its elements to have an user-defined constructor?

How can I make this second struct work, when the first struct has a constructor?
I get error:
error C2620: member 'test::teststruct::pos' of union 'test::teststruct::<unnamed-tag>' has user-defined constructor or non-trivial default constructor
The code:
struct xyz {
Uint8 x, y, z, w;
xyz(Uint8 x, Uint8 y, Uint8 z) : x(x), y(y), z(z) {}
};
struct teststruct {
union {
Uint32 value;
xyz pos; // error at this line.
};
};
I could use a function to initialize the xyz struct, but wouldn't it be a lot slower? Not to mention: I have tons of structs I need to create own function with a prefix like init_xyz() or etc, which isn't nice. Is there any other way going around this problem?
Probably to avoid this:
struct A {
Uint8 a;
A() : a(111) {}
};
struct B {
Uint8 b;
B() : b(2222) {}
};
struct teststruct {
union {
A aValue;
B bValue;
};
};
What should happen, A and B constructors will both try to initialize the same memory in different ways. Instead of having some rule saying which will win, it was probably easier to say user defined constructors aren't allowed.
From C++03, 9.5 Unions, pg 162
A union can have member functions (including constructors and destructors), but not virtual (10.3) functions. A union shall not have base classes. A union shall not be used as a base class.An object of a class with a non-trivial constructor (12.1), a non-trivial copy constructor (12.8), a non-trivial destructor (12.4), or a non-trivial copy assignment operator (13.5.3, 12.8) cannot be a member of a union, nor can an array of such objects

In this specific case, is there a difference between using a member initializer list and assigning values in a constructor?

Internally and about the generated code, is there a really difference between :
MyClass::MyClass(): _capacity(15), _data(NULL), _len(0)
{
}
and
MyClass::MyClass()
{
_capacity=15;
_data=NULL;
_len=0
}
thanks...
You need to use initialization list to initialize constant members,references and base class
When you need to initialize constant member, references and pass parameters to base class constructors, as mentioned in comments, you need to use initialization list.
struct aa
{
int i;
const int ci; // constant member
aa() : i(0) {} // will fail, constant member not initialized
};
struct aa
{
int i;
const int ci;
aa() : i(0) { ci = 3;} // will fail, ci is constant
};
struct aa
{
int i;
const int ci;
aa() : i(0), ci(3) {} // works
};
Example (non exhaustive) class/struct contains reference:
struct bb {};
struct aa
{
bb& rb;
aa(bb& b ) : rb(b) {}
};
// usage:
bb b;
aa a(b);
And example of initializing base class that requires a parameter (e.g. no default constructor):
struct bb {};
struct dd
{
char c;
dd(char x) : c(x) {}
};
struct aa : dd
{
bb& rb;
aa(bb& b ) : dd('a'), rb(b) {}
};
Assuming that those values are primitive types, then no, there's no difference. Initialization lists only make a difference when you have objects as members, since instead of using default initialization followed by assignment, the initialization list lets you initialize the object to its final value. This can actually be noticeably faster.
Yes. In the first case you can declare _capacity, _data and _len as constants:
class MyClass
{
private:
const int _capacity;
const void *_data;
const int _len;
// ...
};
This would be important if you want to ensure const-ness of these instance variables while computing their values at runtime, for example:
MyClass::MyClass() :
_capacity(someMethod()),
_data(someOtherMethod()),
_len(yetAnotherMethod())
{
}
const instances must be initialized in the initializer list or the underlying types must provide public parameterless constructors (which primitive types do).
I think this link http://www.cplusplus.com/forum/articles/17820/ gives an excellent explanation - especially for those new to C++.
The reason why intialiser lists are more efficient is that within the constructor body, only assignments take place, not initialisation. So if you are dealing with a non-built-in type, the default constructor for that object has already been called before the body of the constructor has been entered. Inside the constructor body, you are assigning a value to that object.
In effect, this is a call to the default constructor followed by a call to the copy-assignment operator. The initialiser list allows you to call the copy constructor directly, and this can sometimes be significantly faster (recall that the initialiser list is before the body of the constructor)
I'll add that if you have members of class type with no default constructor available, initialization is the only way to construct your class.
A big difference is that the assignment can initialize members of a parent class; the initializer only works on members declared at the current class scope.
Depends on the types involved. The difference is similar between
std::string a;
a = "hai";
and
std::string a("hai");
where the second form is initialization list- that is, it makes a difference if the type requires constructor arguments or is more efficient with constructor arguments.
The real difference boils down to how the gcc compiler generate machine code and lay down the memory. Explain:
(phase1) Before the init body (including the init list): the compiler allocate required memory for the class. The class is alive already!
(phase2) In the init body: since the memory is allocated, every assignment now indicates an operation on the already exiting/'initialized' variable.
There are certainly other ways to handle const type members. But to ease their life, the gcc compiler writers decide to set up some rules
const type members must be initialized before the init body.
After phase1, any write operation only valid for non-constant members.
There is only one way to initialize base class instances and non-static member variables and that is using the initializer list.
If you don't specify a base or non-static member variable in your constructor's initializer list then that member or base will either be default-initialized (if the member/base is a non-POD class type or array of non-POD class types) or left uninitialized otherwise.
Once the constructor body is entered, all bases or members will have been initialized or left uninitialized (i.e. they will have an indeterminate value). There is no opportunity in the constructor body to influence how they should be initialized.
You may be able to assign new values to members in the constructor body but it is not possible to assign to const members or members of class type which have been made non-assignable and it is not possible to rebind references.
For built in types and some user-defined types, assigning in the constructor body may have exactly the same effect as initializing with the same value in the initializer list.
If you fail to name a member or base in an initializer list and that entity is a reference, has class type with no accessible user-declared default constructor, is const qualified and has POD type or is a POD class type or array of POD class type containing a const qualified member (directly or indirectly) then the program is ill-formed.
If you write an initializer list, you do all in one step; if you don't write an initilizer list, you'll do 2 steps: one for declaration and one for asign the value.
There is a difference between initialization list and initialization statement in a constructor.
Let's consider below code:
#include <initializer_list>
#include <iostream>
#include <algorithm>
#include <numeric>
class MyBase {
public:
MyBase() {
std::cout << __FUNCTION__ << std::endl;
}
};
class MyClass : public MyBase {
public:
MyClass::MyClass() : _capacity( 15 ), _data( NULL ), _len( 0 ) {
std::cout << __FUNCTION__ << std::endl;
}
private:
int _capacity;
int* _data;
int _len;
};
class MyClass2 : public MyBase {
public:
MyClass2::MyClass2() {
std::cout << __FUNCTION__ << std::endl;
_capacity = 15;
_data = NULL;
_len = 0;
}
private:
int _capacity;
int* _data;
int _len;
};
int main() {
MyClass c;
MyClass2 d;
return 0;
}
When MyClass is used, all the members will be initialized before the first statement in a constructor executed.
But, when MyClass2 is used, all the members are not initialized when the first statement in a constructor executed.
In later case, there may be regression problem when someone added some code in a constructor before a certain member is initialized.
Here is a point that I did not see others refer to it:
class temp{
public:
temp(int var);
};
The temp class does not have a default ctor. When we use it in another class as follow:
class mainClass{
public:
mainClass(){}
private:
int a;
temp obj;
};
the code will not compile, cause the compiler does not know how to initialize obj, cause it has just an explicit ctor which receives an int value, so we have to change the ctor as follow:
mainClass(int sth):obj(sth){}
So, it is not just about const and references!

Any workarounds for non-static member array initialization?

In C++, it's not possible to initialize array members in the initialization list, thus member objects should have default constructors and they should be properly initialized in the constructor. Is there any (reasonable) workaround for this apart from not using arrays?
[Anything that can be initialized using only the initialization list is in our application far preferable to using the constructor, as that data can be allocated and initialized by the compiler and linker, and every CPU clock cycle counts, even before main. However, it is not always possible to have a default constructor for every class, and besides, reinitializing the data again in the constructor rather defeats the purpose anyway.]
E.g. I'd like to have something like this (but this one doesn't work):
class OtherClass {
private:
int data;
public:
OtherClass(int i) : data(i) {}; // No default constructor!
};
class Foo {
private:
OtherClass inst[3]; // Array size fixed and known ahead of time.
public:
Foo(...)
: inst[0](0), inst[1](1), inst[2](2)
{};
};
The only workaround I'm aware of is the non-array one:
class Foo {
private:
OtherClass inst0;
OtherClass inst1;
OtherClass inst2;
OtherClass *inst[3];
public:
Foo(...)
: inst0(0), inst1(1), inst2(2) {
inst[0]=&inst0;
inst[1]=&inst1;
inst[2]=&inst2;
};
};
Edit: It should be stressed that OtherClass has no default constructor, and that it is very desirable to have the linker be able to allocate any memory needed (one or more static instances of Foo will be created), using the heap is essentially verboten. I've updated the examples above to highlight the first point.
One possible workaround is to avoid the compiler calling the OtherClass constructor at all, and to call it on your own using placement new to initialize it whichever way you need. Example:
class Foo
{
private:
char inst[3*sizeof(OtherClass)]; // Array size fixed. OtherClass has no default ctor.
// use Inst to access, not inst
OtherClass &Inst(int i) {return (OtherClass *)inst+i;}
const OtherClass &Inst(int i) const {return (const OtherClass *)inst+i;}
public:
Foo(...)
{
new (Inst(0)) OtherClass(...);
new (Inst(1)) OtherClass(...);
new (Inst(2)) OtherClass(...);
}
~Foo()
{
Inst(0)->~OtherClass();
Inst(1)->~OtherClass();
Inst(2)->~OtherClass();
}
};
To cater for possible alignment requirements of the OtherClass, you may need to use __declspec(align(x)) if working in VisualC++, or to use a type other than char like:
Type inst[3*(sizeof(OtherClass)+sizeof(Type)-1)/sizeof(Type)];
... where Type is int, double, long long, or whatever describes the alignment requirements.
What data members are in OtherClass? Will value-initialization be enough for that class?
If value-initialization is enough, then you can value-initialize an array in the member initialization list:
class A {
public:
A ()
: m_a() // All elements are value-initialized (which for int means zero'd)
{
}
private:
int m_a[3];
};
If your array element types are class types, then the default constructor will be called.
EDIT: Just to clarify the comment from Drealmer.
Where the element type is non-POD, then it should have an "accessible default constructor" (as was stated above). If the compiler cannot call the default constructor, then this solution will not work.
The following example, would not work with this approach:
class Elem {
public:
Elem (int); // User declared ctor stops generation of implicit default ctor
};
class A {
public:
A ()
: m_a () // Compile error: No default constructor
{}
private:
Elem m_a[10];
};
One method I typically use to make a class member "appear" to be on the stack (although actually stored on the heap):
class Foo {
private:
int const (&array)[3];
int const (&InitArray() const)[3] {
int (*const rval)[3] = new int[1][3];
(*rval)[0] = 2;
(*rval)[1] = 3;
(*rval)[2] = 5;
return *rval;
}
public:
explicit Foo() : array(InitArray()) { }
virtual ~Foo() { delete[] &array[0]; }
};To clients of your class, array appears to be of type "int const [3]". Combine this code with placement new and you can also truly initialize the values at your discretion using any constructor you desire. Hope this helps.
Array members are not initialized by default. So you could use a static helper function that does the initialization, and store the result of the helper function in a member.
#include "stdafx.h"
#include <algorithm>
#include <cassert>
class C {
public: // for the sake of demonstration...
typedef int t_is[4] ;
t_is is;
bool initialized;
C() : initialized( false )
{
}
C( int deflt )
: initialized( sf_bInit( is, deflt ) )
{}
static bool sf_bInit( t_is& av_is, const int i ){
std::fill( av_is, av_is + sizeof( av_is )/sizeof( av_is[0] ), i );
return true;
}
};
int _tmain(int argc, _TCHAR* argv[])
{
C c(1), d;
assert( c.is[0] == 1 );
return 0;
}
Worth noting is that in the next standard, they're going to support array initializers.
Use inheritance for creating proxy object
class ProxyOtherClass : public OtherClass {
public:
ProxyOtherClass() : OtherClass(0) {}
};
class Foo {
private:
ProxyOtherClass inst[3]; // Array size fixed and known ahead of time.
public:
Foo(...) {}
};
And what about using array of pointers instead of array of objects?
For example:
class Foo {
private:
OtherClass *inst[3];
public:
Foo(...) {
inst[0]=new OtherClass(1);
inst[1]=new OtherClass(2);
inst[2]=new OtherClass(3);
};
~Foo() {
delete [] inst;
}
};
You say "Anything that can be initialized using only the initialization list is in our application far preferable to using the constructor, as that data can be allocated and initialized by the compiler and linker, and every CPU clock cycle counts".
So, don't use constructors. That is, don't use conventional "instances". Declare everything statically. When you need a new "instance", create a new static declaration, potentially outside of any classes. Use structs with public members if you have to. Use C if you have to.
You answered your own question. Constructors and destructors are only useful in environments with a lot of allocation and deallocation. What good is destruction if the goal is for as much data as possible to be allocated statically, and so what good is construction without destruction? To hell with both of them.