Why is std::string incompatible with C unions? [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Why compiler doesn't allow std::string inside union ?
I knew that I had this problem when I started with C++: The compiler wouldn't allow me to put a variable of the type std::string into unions.
That was years ago, but actually I still don't know the exact answer. I read something related to a copy function with the string that the union didn't like, but that's pretty much all.
Why are C++ STL strings incompatible with unions?

From Wikipedia:
C++ does not allow for a data member to be any type that has a full fledged constructor/destructor
and/or copy constructor, or a non-trivial copy assignment operator. In particular, it is impossible to
have the standard C++ string as a member of a union.
Think about it this way: If you have a union of a class type like std::string and a primitive type (let's say a long), how would the compiler know when you are using the class type (in which case the constructor/destructor will need to be called) and when you are using the simple type? That's why full-fledged class types are not allowed as members of a union.
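To make that bookkeeping problem concrete, here is a minimal sketch. It assumes a C++11 or later compiler, where the restriction on union members was relaxed. The wrapper name StringOrLong and its members are made up for illustration. The point is that the programmer, not the compiler, has to record which member is alive and has to construct and destroy the string by hand.

```cpp
#include <cassert>
#include <new>      // placement new
#include <string>

// Hypothetical wrapper: the tag records which union member is alive,
// which is exactly the information the compiler does not have.
class StringOrLong {
public:
    explicit StringOrLong(long v) : is_string_(false) { value_.l = v; }
    explicit StringOrLong(const std::string& s) : is_string_(true) {
        new (&value_.s) std::string(s);  // start the string's lifetime by hand
    }
    // Copying is disabled only to keep the sketch short.
    StringOrLong(const StringOrLong&) = delete;
    StringOrLong& operator=(const StringOrLong&) = delete;
    ~StringOrLong() {
        if (is_string_) value_.s.~basic_string();  // end its lifetime by hand
    }

    bool holds_string() const { return is_string_; }
    long as_long() const {
        assert(!is_string_);
        return value_.l;
    }
    const std::string& as_string() const {
        assert(is_string_);
        return value_.s;
    }

private:
    union Storage {
        long l;
        std::string s;
        Storage() {}   // user-provided: constructs no member
        ~Storage() {}  // user-provided: destroys no member
    } value_;
    bool is_string_;
};
```

With something like this, `StringOrLong a(42L);` and `StringOrLong b(std::string("hello"));` both work, and the destructor only touches the string when one was actually constructed.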

A class that has a user-defined constructor or a user-defined destructor is not allowed as a union member (under the C++03 rules).
You can have a pointer to such a class as a member of a union, though.
struct X
{
    X() {}
    ~X() {}
};
union A
{
    X x;    // not allowed: X has a constructor (and a destructor too)
    X *px;  // allowed!
};
Or you can use boost::variant which is a safe, generic, stack-based discriminated union container.
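As a rough illustration of the discriminated-union idea, here is a sketch that uses std::variant from the C++17 standard library rather than boost::variant (a deliberate swap, so that the example needs no third-party headers). The alias Value and the function describe are names invented for this example.

```cpp
#include <cassert>
#include <string>
#include <variant>

// A type-safe union of long and std::string; the variant itself tracks
// which alternative is active and destroys it correctly.
using Value = std::variant<long, std::string>;

// Returns a description of whichever alternative is currently held.
inline std::string describe(const Value& v) {
    if (std::holds_alternative<std::string>(v)) {
        return "string: " + std::get<std::string>(v);
    }
    assert(std::holds_alternative<long>(v));  // the only other alternative
    return "long: " + std::to_string(std::get<long>(v));
}
```

For instance, `describe(Value(7L))` and `describe(Value(std::string("hi")))` take different branches without any manual tag handling.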
§9.5/1 says (formatting and emphasis mine):
A union can have member functions (including constructors and destructors), but not virtual (10.3) functions.
A union shall not have base classes.
A union shall not be used as a base class.
An object of a class with a non-trivial constructor (12.1), a non-trivial copy constructor (12.8), a non-trivial destructor (12.4), or a non-trivial copy assignment operator (13.5.3, 12.8) cannot be a member of a union, nor can an array of such objects.
If a union contains a static data member, or a member of reference type, the program is ill-formed.
Interesting!

Related

Understanding constructor concept

I don't understand what constructor means in C++ formally. I was reading clause 3.8 (Object lifetime, N3797) and came across the following:
An object is said to have non-trivial initialization if it is of a
class or aggregate type and it or one of its members is initialized by
a constructor other than a trivial default constructor.
I would like to understand an initialization in general. I've read section 8.5, N3797. Is it true that if some object is initialized, a constructor (possibly trivial default) will be called? I mean that every initialization process (even zero-initialization) means constructor calling. It would be good if you provide corresponding references to the Standard.

I don't understand what constructor means in C++ formally.
As far as I know, the standard does not explicitly contain a definition of the term "constructor". However, §12.1/1 says
Constructors do not have names. A special declarator syntax is used to declare or define the constructor.
The syntax uses:
an optional decl-specifier-seq in which each decl-specifier is either a function-specifier or constexpr,
the constructor's class name, and
a parameter list
in that order. In such a declaration, optional parentheses around the constructor class name are ignored.
Thus, if you declare a member function according to this syntax, the function you are declaring is a constructor. In addition,
The default constructor (12.1), copy constructor and copy assignment operator (12.8), move constructor
and move assignment operator (12.8), and destructor (12.4) are special member functions. [ Note: The
implementation will implicitly declare these member functions for some class types when the program does
not explicitly declare them. The implementation will implicitly define them if they are odr-used (3.2).
See 12.1, 12.4 and 12.8. — end note ] Programs shall not define implicitly-declared special member functions.
(§12/1)
So there you go: every class has at least one constructor declared, whether implicitly or explicitly (a copy constructor is always declared, while the default and move constructors are implicitly declared only under certain conditions), and you can also declare other constructors using the syntax in §12.1/1. The entire set of functions thus declared forms the set of constructors.
Is it true that if some object is initialized, a constructor (possibly trivial default) will be called? I mean that every initialization process (even zero-initialization) means constructor calling.
No, this is false. For example, int has no constructors, even though you can initialize an int with syntax similar to that used for objects of class type.
struct Foo {};
Foo f {}; // value-initializes f; Foo's default constructor is trivial
int i {}; // sets the value of i to 0
Also, zero-initialization never invokes a constructor, but zero-initialization is also never the only step in the initialization of an object.
If by "object" you meant "object of class type", it is still not true that a constructor is always called, even though every class has at least one declared constructor, as stated above. See §8.5/7 on value initialization:
To value-initialize an object of type T means:
if T is a (possibly cv-qualified) class type (Clause 9) with a user-provided constructor (12.1), then the
default constructor for T is called (and the initialization is ill-formed if T has no accessible default
constructor);
if T is a (possibly cv-qualified) non-union class type without a user-provided constructor, then the object
is zero-initialized and, if T's implicitly-declared default constructor is non-trivial, that constructor is
called.
if T is an array type, then each element is value-initialized;
otherwise, the object is zero-initialized.
Therefore, when the default constructor for a non-union class type is trivial and you are value-initializing an object of that type, the constructor really is not called.
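A small sketch of that distinction, assuming a C++11 compiler. The struct names Trivial and UserProvided are invented for illustration; the static_asserts only report what the type traits say about the two default constructors.

```cpp
#include <type_traits>

struct Trivial {       // implicitly declared, trivial default constructor
    int x;
};

struct UserProvided {  // user-provided default constructor
    int x;
    constexpr UserProvided() : x(7) {}
};

static_assert(std::is_trivially_default_constructible<Trivial>::value,
              "value-initializing Trivial only zero-initializes it");
static_assert(!std::is_trivially_default_constructible<UserProvided>::value,
              "value-initializing UserProvided calls its constructor");

// T() value-initializes a temporary of type T.
constexpr int value_initialized_trivial() { return Trivial().x; }
constexpr int value_initialized_user()    { return UserProvided().x; }
```

The first function yields 0 because of zero-initialization alone; the second yields 7 because the user-provided constructor runs.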

Classes are types:
A class is a type.
§9 [class]
However, not all types are classes. The standard refers to types which are not class types (scalar types, for example) in §3.9.
Only class types, however, can have member functions:
Functions declared in the definition of a class, excluding those declared with a friend specifier, are called member functions of that class.
§9.3 [class.mfct]
Constructors are member functions. Therefore types without constructors can exist (i.e. types that are not class types). Therefore initialization does not necessarily involve calling a constructor, since non-class types (int for example) may be initialized.
Note that something does not have to be of class type to be an "object":
An object is a region of storage.
§1.8 [intro.object]
Therefore an int, while not being of class type, would be an "object".
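The chain of reasoning above can be checked with the standard type traits. This is only a sketch, and Widget is a hypothetical class introduced for contrast.

```cpp
#include <type_traits>

struct Widget {};  // hypothetical class type

static_assert(std::is_class<Widget>::value,
              "Widget is a class type, so it can have constructors");
static_assert(!std::is_class<int>::value,
              "int is not a class type, so it has no constructors");
static_assert(std::is_scalar<int>::value,
              "int is a scalar type");
static_assert(std::is_object<int>::value && std::is_object<Widget>::value,
              "both are object types nonetheless");
```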

I think your answer can be found in 3.6.2 and 8.5 of N3797.
A constructor is always called on object creation. But initialization of an object is a
multi-step process.
As I understand it, zero-initialization is a separate step performed first during object initialization; the constructor (possibly a trivial default one) is called later.

A constructor is a special member function. It looks like a normal function, but it has no return type such as void, int, char or double (though some describe it as returning the object of the class). It has the same name as the class, and it runs automatically when an object is created.
In C++, you don't need the new operator to create an object. Just declare it and set its attributes, e.g. ClassA object1, object2;
For example
#include <iostream>
class Player {
    int jerseyNo;
public:
    // constructor (public, so that objects can be created outside the class)
    Player() {
        std::cout << "HELLO";
    }
};
In main you can do the following:
Player nabin;
And you can declare a destructor as well. A destructor is a special function similar to a constructor, but it runs when the object is destroyed, e.g. when it goes out of scope.
Example:
class Player {
    ..
    ~Player() {
        std::cout << "Destructor ran";
    }
    ..
};
P.S. Destructors run in the reverse order of construction.
Example:
Player p1, p2, p3;
The order of their execution:
1. p1's constructor runs
2. p2's constructor runs
3. p3's constructor runs
4. p3's destructor runs
5. p2's destructor runs
6. p1's destructor runs
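The ordering listed above can be observed with a small sketch. This Player is a hypothetical variant that records events in a log instead of printing them, and run_scope is a helper invented for the example.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical Player that appends to a log on construction and destruction.
class Player {
public:
    Player(std::string name, std::vector<std::string>& log)
        : name_(std::move(name)), log_(log) {
        log_.push_back("ctor " + name_);
    }
    ~Player() { log_.push_back("dtor " + name_); }

private:
    std::string name_;
    std::vector<std::string>& log_;
};

// Creates three players in an inner scope and returns the recorded events.
inline std::vector<std::string> run_scope() {
    std::vector<std::string> log;
    {
        Player p1("p1", log), p2("p2", log), p3("p3", log);
    }  // p3, p2, p1 are destroyed here, in that order
    return log;
}
```

The returned log holds the three "ctor" entries in declaration order followed by the three "dtor" entries in reverse.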

You asked:
Is it true that if some object is initialized, a constructor (possibly trivial default) will be called?
The short answer: No.
A longer answer: A constructor is called only for objects of class types. For objects of other types, there are no constructors, and hence constructors cannot be called.
You said:
I mean that every initialization process (even zero-initialization) means constructor calling.
Objects of class types can be initialized by calling constructors, which can be explicit or implicit. They can also be initialized by other initialization methods. Objects of other types are initialized by directly setting the initial values of memory occupied by them.
You have already seen section 8.5 of the draft standard on initialization.
Details about class constructors can be found in section 12.1 Constructors.
Details about class initialization can be found in section 12.6 Initialization.

Basically, whenever you create an object in C++, it can happen in one of two ways. You can either be calling a constructor, or you can essentially be copying chunks of memory around. So constructors are not always called.
In C++, there is a notion of primitives. Integers, doubles, characters, and pointers (to anything) are all primitives. Primitives are data types, but they are not classes. All primitives are safe to copy bitwise. When you create or assign primitives, no constructor is called. All that happens is that some assembly code is generated that copies around some bits.
With classes it's a little more complicated; the answer is that usually a constructor is called, but not always. In particular, C++11 has the concept of trivial classes. Trivial classes are classes that satisfy several conditions:
They use the defaults for all the 'special' functions: constructor, destructor, move, copy, assignment.
They don't have virtual functions.
All non static members of the class are trivial classes or primitives.
As far as C++11 is concerned, objects of any class that satisfies these requirements can be treated like chunks of data. When you create such an object, its constructor will not be called. If you create such an object on the stack (without new), its destructor will not be called when the object is destroyed either, as it normally would be.
Don't take my word for it though. Check out the assembly generated for the code I wrote. You can see that there are two classes, one is trivial and one is not. The non-trivial class has a call to its constructor in the assembly, the trivial one does not.
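If inspecting assembly is inconvenient, the std::is_trivial type trait gives a compile-time check of the same property. A minimal sketch, with three struct names made up for illustration:

```cpp
#include <string>
#include <type_traits>

struct PlainData   { int id; double weight; };       // defaults only, primitive members
struct WithVirtual { virtual void f() {} int id; };  // has a virtual function
struct WithString  { std::string name; };            // member is not trivial

static_assert(std::is_trivial<PlainData>::value,
              "PlainData meets the conditions listed above");
static_assert(!std::is_trivial<WithVirtual>::value,
              "virtual functions rule a class out");
static_assert(!std::is_trivial<WithString>::value,
              "a non-trivial member rules a class out");
```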

Unions used like Classes/Structs

I was trying to learn more about unions and their usefulness, when I was surprised that the following code is perfectly valid and works exactly as expected:
template <class T>
union Foo
{
    T a;
    float b;
    Foo(const T& value)
        : a(value)
    {
    }
    Foo(float f)
        : b(f)
    {
    }
    void bar()
    {
    }
    ~Foo()
    {
    }
};
int main(int argc, char* argv[])
{
    Foo<int> foo1(12.0f);
    Foo<int> foo2((int) 12);
    foo1.bar();
    foo2.bar();
    int s = sizeof(foo1); // s = 4, correct
    return 0;
}
Until now, I had no idea that it is legal to declare unions with templates, constructors, destructor, and even member functions. In case it's relevant, I'm using Visual Studio 2012.
When I searched the internet to find more about using unions in this manner, I found nothing. Is this a new feature of C++, or something specific to MSVC? If not, I'd like to learn more about unions, specifically examples of them used like classes (above). If someone could point me to a more detailed explanation of unions and their usage as data structures, it'd be much appreciated.

Is this a new feature of C++, or something specific to MSVC?
No, as BoBtFish said, the 2003 C++ standard section 9.5 Unions paragraph 1 says:
[...] A union can have member functions (including constructors and destructors), but not virtual (10.3) functions. A union shall not have base classes. A union shall not be used as a base class. An object of a class with a non-trivial constructor (12.1), a non-trivial copy constructor (12.8), a non-trivial destructor (12.4), or a non-trivial copy assignment operator (13.5.3, 12.8) cannot be a member of a union, nor can an array of such objects. If a union contains a static data member, or a member of reference type, the program is ill-formed.
Unions do come under section 9 Classes, and the grammar for class-key is as follows:
class-key:
class
struct
union
So a union acts like a class but has many more restrictions. The key restriction is that a union can have only one active non-static data member at a time, which is also covered in paragraph 1:
In a union, at most one of the non-static data members can be active at any time, that is, the value of at most one of the non-static data members can be stored in a union at any time. [...]
The wording in the C++11 draft standard is similar so it has not changed too much since 2003.
As for the use of a union, there are two common reasons, which are covered from different angles in this previous thread: C/C++: When would anyone use a union? Is it basically a remnant from the C only days? To summarize:
To implement your own Variant type, a union gives you the ability to represent all the varying types without wasting memory. This answer to the thread gives a good example.
Type punning, but I would read Understanding Strict Aliasing as well, since there are many cases where type punning is undefined behavior.
This answer to Unions cannot be used as Base class gives some really great insight into why unions are implemented as they are in C++.
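On the type-punning point: reading a union member other than the one last written is undefined behavior in C++, and copying the bytes with std::memcpy is the usual well-defined alternative. The sketch below assumes a 32-bit IEEE 754 float, which the static_asserts check; float_bits is a name invented for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");
static_assert(std::numeric_limits<float>::is_iec559,
              "this sketch assumes IEEE 754 floats");

// Returns the object representation of f as an unsigned 32-bit integer,
// without going through a union or a pointer cast.
inline std::uint32_t float_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```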

Memset POD struct with this pointer

I have many POD structs with a lot of member variables. Instead of initializing each member in the constructor, I simply use memset. Is this valid in C++?
#include <cstring> // std::memset
struct foo
{
    foo() { std::memset(this, 0, sizeof (foo)); }
    int var1;
    float var2;
    double var3;
    // more variables..
};

It's not guaranteed to work, since the C++ standard permits implementations in which all-bits-zero is a trap representation of float or double. So reading those members on such an implementation would have undefined behavior.
The same applies to any padding bytes that the implementation might put between the data members -- modifying them is either undefined behavior or else puts the object into an undefined state, that has undefined behavior when used. I forget which.
In practice it will work on all implementations I know, though.
Other answers make valid points about your class being non-POD (C++03) and non-trivial (C++11). Thing is, even if you removed the constructor and called memset from somewhere else it would still not be guaranteed to work by the standard. But if you did remove the constructor you could use aggregate initialization:
foo f = {0};
and that would initialize all members to zero values (whether or not that is represented by all-bits-zero), guaranteed.
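A minimal sketch of that alternative, assuming the constructor has been removed so that foo is an aggregate. The two helper functions are hypothetical and exist only to make the result checkable at compile time (C++11).

```cpp
// foo without the memset constructor: an aggregate.
struct foo {
    int var1;
    float var2;
    double var3;
};

// Empty braces: every member gets a zero value.
constexpr foo make_zeroed() { return foo{}; }

// The form used above: var1 is set to 0 explicitly and the
// remaining members are initialized to zero values as well.
constexpr foo make_zeroed_explicit() { return foo{0}; }
```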

According to the standard, your struct is not a POD type, and thus using memset on it is not allowed.
9 Classes
A trivial class is a class that has a default constructor (12.1), has no non-trivial default constructors, and is trivially copyable.
A POD struct is a non-union class that is both a trivial class and a standard-layout class, and has no non-static data members of type non-POD struct, non-POD union (or array of such types).
Since your class has a non-trivial default constructor, it is no longer trivial and, as a result, not a POD type.
Most likely it will work on most compilers, but there is no guarantee.

what is variant member in c++?

I am new to C++. On some sites I frequently come across the term "variant member".
class School
{
    int x; // data member
};
I know what a data member is, but what is a variant member?
NOTE:
From the C++ specification, under the section on constructors:
X is a union-like class that has a variant member with a non-trivial default constructor.

"Variant member" is defined in 9.5/8 of C++11:
A union-like class is a union or a class that has an anonymous union
as a direct member. A union-like class X has a set of variant members.
If X is a union its variant members are the non-static data members;
otherwise, its variant members are the non-static data members of all
anonymous unions that are members of X.
In other words, all the non-static data members of a union are "variant members", and for a class containing any anonymous unions, their non-static data members are "variant members" of the class.
The context that you quoted is 12.1/5, saying that if a union-like class has a variant member with a non-trivial default constructor, then the default constructor of the class itself is deleted. The problem is which variant member should be constructed by the default constructor of the class, and the solution is not to have a default constructor. If all variant members have trivial default constructors there's no problem, since by doing nothing the default constructor of the class is constructing all/none of them equally.
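The deleted default constructor can be observed with std::is_default_constructible. This is a sketch assuming C++11; the three type names are invented for illustration.

```cpp
#include <string>
#include <type_traits>

union OnlyTrivial {  // every variant member has a trivial default constructor
    int i;
    float f;
};

union HasString {    // std::string has a non-trivial default constructor
    int i;
    std::string s;
};

struct Wrapper {     // union-like class: it has an anonymous union member
    union {
        int i;
        std::string s;
    };
};

static_assert(std::is_default_constructible<OnlyTrivial>::value,
              "no variant member needs construction");
static_assert(!std::is_default_constructible<HasString>::value,
              "the default constructor is defined as deleted");
static_assert(!std::is_default_constructible<Wrapper>::value,
              "the same rule applies to a union-like class");
```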
boost::variant is a separate thing. I wouldn't be too surprised if "some sites" say "variant members" when they mean "the possible types that a given boost::variant can hold", that is to say the "members" of that variant. But that's not the meaning newly-defined in the C++11 standard.

The term variant is usually employed to identify a member that can hold a value of a set of different types. Similar to a union in the language, the term variant is usually reserved for types that allow the storage of the different options in a type safe way.
You might want to read over the documentation of the boost variant library for one such example, and if that does not clear up the concept, drop a comment/create a question with your doubts.
Boost Variant

A variant is a structure containing a union member and an unsigned integer member that describes which member of the union is currently being used. If you don't know what a union is, read about it first, and then come back.
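A bare-bones sketch of that layout, restricted to trivial member types so that no manual construction or destruction is needed. SimpleVariant and its helper functions are hypothetical names.

```cpp
#include <cassert>

struct SimpleVariant {
    enum Kind : unsigned { kInt, kFloat };
    Kind kind;  // describes which union member is currently in use
    union {     // storage shared by all alternatives
        int i;
        float f;
    };
};

inline SimpleVariant make_int(int v) {
    SimpleVariant r;
    r.kind = SimpleVariant::kInt;
    r.i = v;
    return r;
}

inline SimpleVariant make_float(float v) {
    SimpleVariant r;
    r.kind = SimpleVariant::kFloat;
    r.f = v;
    return r;
}

// Reads only the member that the tag says is in use.
inline double to_double(const SimpleVariant& v) {
    switch (v.kind) {
        case SimpleVariant::kInt:   return v.i;
        case SimpleVariant::kFloat: return v.f;
    }
    assert(false && "unknown kind");
    return 0.0;
}
```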

Can't C++ POD type have any constructor?

I have a class and a const variable.
struct A
{
int b;
};
A const a;
The class A is POD and can be initialized like this.
A const a = { 3 };
IMHO, it looks fine to have a constructor like this.
struct A
{
int b;
A(int newB) : b(newB)
{
}
};
But Clang treats A as a non-aggregate type. Why can't I have a constructor like that? Or should I do something else?
I modified the question to present my original meaning. I had written the struct as a class by mistake; sorry, @Johannes, for the confusion :)

POD means Plain Old Data type, which by definition cannot have a user-defined constructor.
POD is actually an aggregate type (see the next quotation). So what is an aggregate? The C++ Standard says in section §8.5.1/1,
An aggregate is an array or a class
(clause 9) with no user-declared
constructors (12.1), no private or
protected nonstatic data members
(clause 11), no base classes (clause
10), and no virtual functions (10.3).
And section §9/4 from the C++ Standard says,
[....] A POD-struct is an aggregate class that has no non-static data
members of type non-POD-struct,
non-POD-union (or array of such types)
or reference, and has no user-defined
copy assignment operator and no
user-defined destructor. Similarly, a
POD-union is an aggregate union that
has no non-static data members of type
non-POD-struct, non-POD-union (or
array of such types) or reference, and
has no user-defined copy assignment
operator and no user-defined
destructor. A POD class is a class
that is either a POD-struct or a
POD-union.
From this, it's also clear that a POD class/struct/union cannot have a user-defined copy assignment operator or a user-defined destructor either.
There are however other types of POD. The section §3.9/10 says,
Arithmetic types (3.9.1),
enumeration types, pointer types, and
pointer to member types (3.9.2), and
cv-qualified versions of these types
(3.9.3) are collectively called scalar
types. Scalar types, POD-struct types,
POD-union types (clause 9), arrays of
such types and cv-qualified versions
of these types (3.9.3) are
collectively called POD types.
Read this FAQ: What is a "POD type"?

The class A is POD and can be initialized like this
Sorry, that is wrong. Because b is private, the class is not a POD.
But Clang assumes A as non-aggregate type. Why I can't have constructor like that? Or should I do something else?
This is a limitation of C++ as it exists currently. C++0x will not have this limitation anymore. While in C++0x your type is not a POD either, your initialization will work (assuming that you make that constructor public).
(Also, I think a better term for you to use here is "aggregate". The requirement for using { ... } is that your class is an aggregate. It doesn't have to be a POD).
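A sketch of the C++11 behavior described above: with list-initialization, the braces select the constructor, even though A is neither an aggregate nor a POD. The constructor is marked constexpr here only so that the result can be checked at compile time; that is an addition for the example, not part of the question's code.

```cpp
#include <type_traits>

struct A {
    int b;
    constexpr A(int newB) : b(newB) {}  // user-provided constructor
};

static_assert(!std::is_trivial<A>::value,
              "A has no trivial default constructor, so it is not a POD");

// C++11 list-initialization: the braces call A(int).
constexpr A a = { 3 };
```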

The other answers describe the POD rules pretty well. If you want to get a similar initialization style to a constructor for a POD you can use a make_-style function, for example:
struct A
{
    int i_;
};
A make_A(int i = 0)
{
    A a = { i };
    return a;
}
Now you can get initialized POD instances like:
A a = make_A();
