I was trying to learn more about unions and their usefulness, when I was surprised that the following code is perfectly valid and works exactly as expected:
template <class T>
union Foo
{
T a;
float b;
Foo(const T& value)
: a(value)
{
}
Foo(float f)
: b(f)
{
}
void bar()
{
}
~Foo()
{
}
};
int main(int argc, char* argv[])
{
Foo<int> foo1(12.0f);
Foo<int> foo2((int) 12);
foo1.bar();
foo2.bar();
int s = sizeof(foo1); // s = 4, correct
return 0;
}
Until now, I had no idea that it is legal to declare unions with templates, constructors, destructors, and even member functions. In case it's relevant, I'm using Visual Studio 2012.
When I searched the internet to find more about using unions in this manner, I found nothing. Is this a new feature of C++, or something specific to MSVC? If not, I'd like to learn more about unions, specifically examples of them used like classes (above). If someone could point me to a more detailed explanation of unions and their usage as data structures, it'd be much appreciated.
Is this a new feature of C++, or something specific to MSVC?
No, as BoBtFish said, the 2003 C++ standard section 9.5 Unions paragraph 1 says:
[...] A union can have member functions (including constructors and destructors), but not virtual (10.3) functions. A union shall not have base classes. A union shall not be used as a base class. An object of a class with a non-trivial constructor (12.1), a non-trivial copy constructor (12.8), a non-trivial destructor (12.4), or a non-trivial copy assignment operator (13.5.3, 12.8) cannot be a member of a union, nor can an array of such objects. If a union contains a static data member, or a member of reference type, the program is ill-formed.
Unions do come under section 9 Classes, and the grammar for class-key is as follows:
class-key:
class
struct
union
So a union acts like a class but has many more restrictions. The key restriction is that a union can have only one active non-static data member at a time, which is also covered in paragraph 1:
In a union, at most one of the non-static data members can be active at any time, that is, the value of at most one of the non-static data members can be stored in a union at any time. [...]
The wording in the C++11 draft standard is similar so it has not changed too much since 2003.
As for the use of a union, there are two common reasons, which are covered from different angles in the previous thread C/C++: When would anyone use a union? Is it basically a remnant from the C only days? To summarize:
To implement your own Variant type, a union gives you the ability to represent all the varying types without wasting memory. This answer to the thread gives a good example.
Type punning, but I would read Understanding Strict Aliasing as well, since there are many cases where type punning is undefined behavior.
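To make the Variant use case concrete, here is a minimal sketch of a tagged union. The class name IntOrFloat and its members are hypothetical and chosen only for illustration; a real variant would support more types and non-trivial members.

```cpp
#include <cassert>

// Hypothetical minimal variant: holds either an int or a float.
// The tag records which union member is currently active.
class IntOrFloat {
public:
    enum class Kind { Int, Float };

    explicit IntOrFloat(int v) : kind_(Kind::Int) { storage_.i = v; }
    explicit IntOrFloat(float v) : kind_(Kind::Float) { storage_.f = v; }

    Kind kind() const { return kind_; }

    // Only the active member may be read; the asserts guard against misuse.
    int as_int() const { assert(kind_ == Kind::Int); return storage_.i; }
    float as_float() const { assert(kind_ == Kind::Float); return storage_.f; }

    // Assigning switches the active member and updates the tag.
    void set(int v) { storage_.i = v; kind_ = Kind::Int; }
    void set(float v) { storage_.f = v; kind_ = Kind::Float; }

private:
    union Storage {
        int i;
        float f;
    };
    Storage storage_;  // both alternatives share the same bytes
    Kind kind_;
};
```

The union keeps the storage as small as the largest alternative, while the tag is what makes reading it safe.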
This answer to Unions cannot be used as Base class gives some really great insight into why unions are implemented as they are in C++.
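For the type-punning use case mentioned above, a commonly recommended alternative to reading an inactive union member is to copy the bytes with std::memcpy. The sketch below assumes a 32-bit IEEE-754 float (checked with static_assert); the helper names float_bits and bits_to_float are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");
static_assert(std::numeric_limits<float>::is_iec559,
              "sketch assumes IEEE-754 floats");

// Hypothetical helper: returns the object representation of a float.
inline std::uint32_t float_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // well-defined byte copy
    return bits;
}

// Hypothetical helper: the inverse conversion.
inline float bits_to_float(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Round-trips a (non-NaN) value through its bit pattern as a sanity check.
inline void check_round_trip(float f) {
    assert(bits_to_float(float_bits(f)) == f);
}
```

Compilers typically optimize the memcpy calls away, so this usually costs nothing compared with the union version.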
I'm migrating a C++ Visual Studio Project from VS2017 to VS2019.
I'm getting an error now, that didn't occur before, that can be reproduced with these few lines of code:
struct Foo
{
Foo() = default;
int bar;
};
auto test = Foo { 0 };
The error is
(6): error C2440: 'initializing': cannot convert from
'initializer list' to 'Foo'
(6): note: No constructor could take the source type, or
constructor overload resolution was ambiguous
The project is compiled with /std:c++latest flag. I reproduced it on godbolt. If I switch it to /std:c++17, it compiles fine as before.
I tried to compile the same code with clang with -std=c++2a and got a similar error. Also, defaulting or deleting other constructors generates this error.
Apparently, some new C++20 features were added in VS2019 and I'm assuming the origin of this issue is described in https://en.cppreference.com/w/cpp/language/aggregate_initialization.
There it says that an aggregate can be a struct that (among other criteria) has
no user-provided, inherited, or explicit constructors (explicitly defaulted or deleted constructors are allowed) (since C++17) (until C++20)
no user-declared or inherited constructors (since C++20)
Note that the part in parentheses "explicitly defaulted or deleted constructors are allowed" was dropped and that "user-provided" changed to "user-declared".
So my first question is, am I right assuming that this change in the standard is the reason why my code compiled before but does not anymore?
Of course, it's easy to fix this: Just remove the explicitly defaulted constructors.
However, I have explicitly defaulted and deleted very many constructors in all of my projects because I found it was a good habit to make code much more expressive this way because it simply results in fewer surprises than with implicitly defaulted or deleted constructors. With this change however, this doesn't seem like such a good habit anymore...
So my actual question is:
What is the reasoning behind this change from C++17 to C++20? Was this break of backwards compatibility made on purpose? Was there some trade off like "Ok, we're breaking backwards compatibility here, but it's for the greater good."? What is this greater good?
The abstract from P1008, the proposal that led to the change:
C++ currently allows some types with user-declared constructors to be initialized via aggregate initialization, bypassing those constructors. The result is code that is surprising, confusing, and buggy. This paper proposes a fix that makes initialization semantics in C++ safer, more uniform, and easier to teach. We also discuss the breaking changes that this fix introduces.
One of the examples they give is the following.
struct X {
int i{4};
X() = default;
};
int main() {
X x1(3); // ill-formed - no matching c’tor
X x2{3}; // compiles!
}
To me, it's quite clear that the proposed changes are worth the backwards-incompatibility they bear. And indeed, it doesn't seem to be good practice anymore to = default aggregate default constructors.
The reasoning from P1008 (PDF) can be best understood from two directions:
If you sat a relatively new C++ programmer down in front of a class definition and asked "is this an aggregate", would they be correct?
The common conception of an aggregate is "a class with no constructors". If Typename() = default; is in a class definition, most people will see that as having a constructor. It will behave like the standard default constructor, but the type still has one. That is the broad conception of the idea from many users.
An aggregate is supposed to be a class of pure data, able to have any member assume any value it is given. From that perspective, you have no business giving it constructors of any kind, even if you defaulted them. Which brings us to the next reasoning:
If my class fulfills the requirements of an aggregate, but I don't want it to be an aggregate, how do I do that?
The most obvious answer would be to = default the default constructor, because I'm probably someone from group #1. Obviously, that doesn't work.
Pre-C++20, your options are to give the class some other constructor or to implement one of the special member functions. Neither of these options is palatable, because (by definition) it's not something you actually need to implement; you're just doing it to make some side effect happen.
Post-C++20, the obvious answer works.
By changing the rules in such a way, it makes the difference between an aggregate and non-aggregate visible. Aggregates have no constructors; so if you want a type to be an aggregate, you don't give it constructors.
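A quick way to see which side of the line a type falls on is std::is_aggregate (available since C++17). The sketch below uses hypothetical type names; only the Defaulted case depends on the language standard used to compile it, so it is stored in a constant rather than asserted.

```cpp
#include <cassert>
#include <type_traits>

struct Plain     { int bar; };                        // no constructors declared
struct Defaulted { Defaulted() = default; int bar; }; // user-declared, not user-provided
struct Provided  { Provided() {} int bar; };          // user-provided

static_assert(std::is_aggregate_v<Plain>,
              "no declared constructors: an aggregate in every standard");
static_assert(!std::is_aggregate_v<Provided>,
              "user-provided constructor: never an aggregate");

// True under the C++17 rules, false under the C++20 rules.
constexpr bool defaulted_is_aggregate = std::is_aggregate_v<Defaulted>;

// Aggregate initialization of Plain is valid from C++11 through C++20.
inline Plain make_plain(int v) {
    return Plain{v};
}
```

Compiling the same file once as C++17 and once as C++20 and inspecting defaulted_is_aggregate shows the change directly.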
Oh, and here's a fun fact: pre-C++20, this is an aggregate:
class Agg
{
Agg() = default;
};
Note that the defaulted constructor is private, so only people with private access to Agg can call it... unless they use Agg{}, which bypasses the constructor and is perfectly legal.
The clear intent of this class is to create a class which can be copied around, but can only get its initial construction from those with private access. This allows forwarding of access controls, as only code which was given an Agg can call functions that take Agg as a parameter. And only code with access to Agg can create one.
Or at least, that's how it is supposed to be.
Now you could fix this in a more targeted way by saying that it's an aggregate if the defaulted/deleted constructors are not publicly declared. But that feels even more incongruent; sometimes, a class with a visibly declared constructor is an aggregate and sometimes it isn't, depending on where that visibly declared constructor is.
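If the access-forwarding idea behind Agg has to work before C++20 as well, one option is to make the private constructor user-provided instead of defaulted, which disqualifies the class from being an aggregate in every standard. This is only a sketch; Key, Registry, and restricted are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>

class Registry;  // hypothetical class that is allowed to create keys

class Key {
    // User-provided (not defaulted), so Key is a non-aggregate in every
    // standard and Key{} has to go through this private constructor.
    Key() {}
    friend class Registry;
};

static_assert(!std::is_aggregate_v<Key>,
              "Key{} cannot bypass the constructor");
static_assert(!std::is_default_constructible_v<Key>,
              "outside code cannot create a Key");

// Hypothetical function that only code holding a Key can call.
inline int restricted(Key) { return 42; }

class Registry {
public:
    // Registry is a friend, so it may construct a Key and pass it along.
    static int call_restricted() { return restricted(Key{}); }
};
```

Code outside Registry can still copy a Key it was handed, but it can no longer conjure one with Key{}.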
Towards a less surprising aggregate in C++20
To be on the same page with all readers, let's start by mentioning that aggregate class types make up a special family of class types that can, in particular, be initialized by means of aggregate initialization, using direct-list-init or copy-list-init: T aggr_obj{arg1, arg2, ...} and T aggr_obj = {arg1, arg2, ...}, respectively.
The rules governing whether a class is an aggregate are not entirely straightforward, particularly as they have changed between different releases of the C++ standard. In this post we’ll go over these rules and how they have changed from C++11 through C++20.
Before we visit the relevant standard passages, consider the implementation of the following contrived class type:
namespace detail {
template <int N>
struct NumberImpl final {
const int value{N};
// Factory method for NumberImpl<N> wrapping non-type
// template parameter 'N' as data member 'value'.
static const NumberImpl& get() {
static constexpr NumberImpl number{};
return number;
}
private:
NumberImpl() = default;
NumberImpl(int) = delete;
NumberImpl(const NumberImpl&) = delete;
NumberImpl(NumberImpl&&) = delete;
NumberImpl& operator=(const NumberImpl&) = delete;
NumberImpl& operator=(NumberImpl&&) = delete;
};
} // namespace detail
// Intended public API.
template <int N>
using Number = detail::NumberImpl<N>;
where the design intent has been to create a non-copyable, non-movable singleton class template which wraps its single non-type template parameter into a public constant data member, and where the singleton object for each instantiation is the only one that can ever be created for this particular class specialization. The author has defined an alias template Number solely to prohibit users of the API from explicitly specializing the underlying detail::NumberImpl class template.
Ignoring the actual usefulness (or, rather, uselessness) of this class template, has the author correctly implemented its design intent? Or, in other words, given the function wrappedValueIsN below, used as an acceptance test for the design of the publicly intended Number alias template, will the function always return true?
template <int N>
bool wrappedValueIsN(const Number<N>& num) {
// Always 'true', by design of the 'NumberImpl' class?
return N == num.value;
}
We will answer this question assuming that no user abuses the interface by specializing the semantically hidden detail::NumberImpl, in which case the answer is:
C++11: Yes
C++14: No
C++17: No
C++20: Yes
The key difference is that the class template detail::NumberImpl (for any non-explicit specialization of it) is an aggregate in C++14 and C++17, whereas it is not an aggregate in C++11 and C++20. As covered above, initialization of an object using direct-list-init or copy-list-init will result in aggregate initialization if the object is of an aggregate type. Thus, what may look like value-initialization (e.g. Number<1> n{} here), which we may expect to have the effect of zero-initialization followed by default-initialization because a user-declared but not user-provided default constructor exists, or like direct-initialization (e.g. Number<1> n{2} here) of a class type object will actually bypass any constructors, even deleted ones, if the class type is an aggregate.
struct NonConstructible {
NonConstructible() = delete;
NonConstructible(const NonConstructible&) = delete;
NonConstructible(NonConstructible&&) = delete;
};
int main() {
//NonConstructible nc; // error: call to deleted constructor
// Aggregate initialization (and thus accepted) in
// C++11, C++14 and C++17.
// Rejected in C++20 (error: call to deleted constructor).
NonConstructible nc{};
}
Thus, we can fail the wrappedValueIsN acceptance test in C++14 and C++17 by bypassing the private and deleted user-declared constructors of detail::NumberImpl by means of aggregate initialization, specifically by explicitly providing a value for the single value member, thus overriding the default member initializer (... value{N};) that otherwise sets its value to N.
constexpr bool expected_result{true};
const bool actual_result =
wrappedValueIsN(Number<42>{41}); // false
// ^^^^ aggr. init. in C++14 and C++17.
Note that even if detail::NumberImpl were to declare a private and explicitly defaulted destructor (~NumberImpl() = default; with private access specifier) we could still, at the cost of a memory leak, break the acceptance test by e.g. dynamically allocating (and never deleting) a detail::NumberImpl object using aggregate initialization (wrappedValueIsN(*(new Number<42>{41}))).
But why is detail::NumberImpl an aggregate in C++14 and C++17, and why is it not an aggregate in C++11 and C++20? We shall turn to the relevant standard passages for the different standard versions for an answer.
Aggregates in C++11
The rules governing whether a class is an aggregate or not are covered by [dcl.init.aggr]/1, where we refer to N3337 (C++11 + editorial fixes) for C++11 [emphasis mine]:
An aggregate is an array or a class (Clause [class]) with no
user-provided constructors ([class.ctor]), no
brace-or-equal-initializers for non-static data members
([class.mem]), no private or protected non-static data members (Clause
[class.access]), no base classes (Clause [class.derived]), and no
virtual functions ([class.virtual]).
The emphasized segments are the most relevant ones for the context of this answer.
User-provided functions
The detail::NumberImpl class does declare four constructors, such that it has four user-declared constructors, but it does not provide definitions for any of these constructors; it makes use of explicitly-defaulted and explicitly-deleted function definitions at the constructors’ first declarations, using the default and delete keywords, respectively.
As governed by [dcl.fct.def.default]/4, defining an explicitly-defaulted or explicitly-deleted function at its first declaration does not count as the function being user-provided [extract, emphasis mine]:
[…] A special member function is user-provided if it is user-declared and not explicitly defaulted or deleted on its first declaration. […]
Thus, the detail::NumberImpl fulfills the aggregate class requirement regarding having no user-provided constructors.
For some additional aggregate confusion (which applies in C++11 through C++17), where the explicitly-defaulted definition is provided out-of-line, refer to my other answer here.
Default member initializers
Albeit the detail::NumberImpl class has no user-provided constructors, it does use a brace-or-equal-initializer (commonly referred to as a default member initializer) for the single non-static data member value. This is the sole reason why the detail::NumberImpl class is not an aggregate in C++11.
Aggregates in C++14
For C++14, we once again turn to [dcl.init.aggr]/1, now referring to N4140 (C++14 + editorial fixes), which is nearly identical to the corresponding paragraph in C++11, except that the segment regarding brace-or-equal-initializers has been removed [emphasis mine]:
An aggregate is an array or a class (Clause [class]) with no
user-provided constructors ([class.ctor]), no private or protected
non-static data members (Clause [class.access]), no base classes
(Clause [class.derived]), and no virtual functions ([class.virtual]).
Thus, the detail::NumberImpl class fulfills the rules for it to be an aggregate in C++14, thus allowing circumventing all private, defaulted or deleted user-declared constructors by means of aggregate initialization.
We will get back to the consistently emphasized segment regarding user-provided constructors once we reach C++20 in a minute, but we shall first visit some explicit puzzlement in C++17.
Aggregates in C++17
True to its form, the aggregate once again changed in C++17, now allowing an aggregate to derive publicly from a base class, with some restrictions, as well as prohibiting explicit constructors for aggregates. [dcl.init.aggr]/1 from N4659 (March 2017 post-Kona working draft/C++17 DIS) states [emphasis mine]:
An aggregate is an array or a class with
(1.1) no user-provided, explicit, or inherited constructors ([class.ctor]),
(1.2) no private or protected non-static data members (Clause [class.access]),
(1.3) no virtual functions, and
(1.4) no virtual, private, or protected base classes ([class.mi]).
The segment about explicit is interesting in the context of this post, as we may further increase the aggregate cross-standard-releases volatility by changing the declaration of the private user-declared explicitly-defaulted default constructor of detail::NumberImpl from:
template <int N>
struct NumberImpl final {
// ...
private:
NumberImpl() = default;
// ...
};
to
template <int N>
struct NumberImpl final {
// ...
private:
explicit NumberImpl() = default;
// ...
};
with the effect that detail::NumberImpl is no longer an aggregate in C++17, whilst still being an aggregate in C++14. Denote this example as (*). Apart from copy-list-initialization with an empty braced-init-list (see more details in my other answer here):
struct Foo {
virtual void fooIsNeverAnAggregate() const {};
explicit Foo() {}
};
void foo(Foo) {}
int main() {
Foo f1{}; // OK: direct-list-initialization
// Error: converting to 'Foo' from initializer
// list would use explicit constructor 'Foo::Foo()'
Foo f2 = {};
foo({});
}
the case shown in (*) is the only situation where explicit actually has an effect on a default constructor with no parameters.
Aggregates in C++20
As of C++20, particularly due to the implementation of P1008R1 (Prohibit aggregates with user-declared constructors) most of the frequently surprising aggregate behaviour covered above has been addressed, specifically by no longer allowing aggregates to have user-declared constructors, a stricter requirement for a class to be an aggregate than just prohibiting user-provided constructors. We once again turn to [dcl.init.aggr]/1, now referring to N4861 (March 2020 post-Prague working draft/C++20 DIS), which states [emphasis mine]:
An aggregate is an array or a class ([class]) with
(1.1) no user-declared or inherited constructors ([class.ctor]),
(1.2) no private or protected non-static data members ([class.access]),
(1.3) no virtual functions ([class.virtual]), and
(1.4) no virtual, private, or protected base classes ([class.mi]).
We may also note that the segment about explicit constructors has been removed, now redundant as we cannot mark a constructor as explicit if we may not even declare it.
Avoiding aggregate surprises
All the examples above relied on class types with public non-static data members, which is commonly considered an anti-pattern for the design of “non-POD-like” classes. As a rule of thumb, if you’d like to avoid designing a class that is unintentionally an aggregate, simply make sure that at least one (typically even all) of its non-static data members is private (/protected). For cases where this for some reason cannot be applied, and where you still don’t want the class to be an aggregate, make sure to turn to the relevant rules for the respective standard (as listed above) to avoid writing a class that is not portable w.r.t. being an aggregate or not over different C++ standard versions.
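Applied to the NumberImpl example, the rule of thumb above could look like the following sketch. It is a variation on the original class, not the author's code: the namespace detail_sketch, the private member value_, and the accessor value() are assumptions introduced for illustration.

```cpp
#include <cassert>
#include <type_traits>

namespace detail_sketch {

template <int N>
class NumberImpl final {
public:
    // Factory method returning the single instance for this N.
    static const NumberImpl& get() {
        static const NumberImpl number{};
        return number;
    }

    int value() const { return value_; }

    NumberImpl(const NumberImpl&) = delete;
    NumberImpl& operator=(const NumberImpl&) = delete;

private:
    NumberImpl() = default;
    // A private non-static data member rules out aggregate status in
    // C++11, C++14, C++17 and C++20 alike.
    const int value_{N};
};

}  // namespace detail_sketch

static_assert(!std::is_aggregate_v<detail_sketch::NumberImpl<42>>,
              "braces can no longer bypass the private constructor");

// Acceptance test adapted to the accessor.
template <int N>
bool wrappedValueIsN(const detail_sketch::NumberImpl<N>& num) {
    return N == num.value();
}
```

With this layout, an expression such as NumberImpl<42>{41} is rejected regardless of the standard version, because it would need a constructor taking an int.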
Actually, MSDN addressed your concern in the below document:
Modified specification of aggregate type
In Visual Studio 2019, under /std:c++latest, a class with any user-declared constructor (for example, including a constructor declared = default or = delete) isn't an aggregate. Previously, only user-provided constructors would disqualify a class from being an aggregate. This change puts additional restrictions on how such types can be initialized.
If I have a union with two data members of the same type, differing only by CV-qualification:
template<typename T>
union A
{
private:
T x_priv;
public:
const T x_publ;
public:
// Accept-all constructor
template<typename... Args>
A(Args&&... args) : x_priv(args...) {}
// Destructor
~A() { x_priv.~T(); }
};
And I have a function f that declares a union A, thus making x_priv the active member and then reads x_publ from that union:
int f()
{
A<int> a {7};
return a.x_publ;
}
In every compiler I tested, there were no errors at compile time or at runtime, both for int and for other, more complex types such as std::string and std::thread.
I checked the standard to see whether this was legal behavior, starting with the difference between T and const T:
6.7.3.1 [basic.type.qualifier]
The cv-qualified or cv-unqualified versions of a type are distinct types; however, they shall have the same representation and alignment requirements ([basic.align]).
This means that when declaring a const T it has the exact same representation in memory as a T. But then I found that the standard actually disallows this for some types, which I found weird, as I see no reason for it.
I started my search on accessing non-active members.
It is only legal to access the common initial sequence of T and const T if both are standard-layout types.
10.4.1[class.union]
At most one of the non-static data members of an object of union type can be active at any time [...] [ Note: One special guarantee is made in order to simplify the use of unions: If a standard-layout union contains several standard-layout structs that share a common initial sequence ([class.mem]), and if a non-static data member of an object of this standard-layout union type is active and is one of the standard-layout structs, it is permitted to inspect the common initial sequence of any of the standard-layout struct members; see [class.mem]. — end note ]
The initial sequence is basically the order of the non-static data members with a few exceptions, but since T and const T have the exact same members in the same layout, this means that the common initial sequence of T and const T is all of the members of T.
10.3.22 [class.mem]
The common initial sequence of two standard-layout struct ([class.prop]) types is the longest sequence of non-static data members and bit-fields in declaration order, starting with the first such entity in each of the structs, such that corresponding entities have layout-compatible types, either both entities are declared with the no_unique_address attribute ([dcl.attr.nouniqueaddr]) or neither is, and either both entities are bit-fields with the same width or neither is a bit-field. [ Example:
And here is where the restrictions come in: the standard prevents some types from being accessed this way, even though they have the exact same representation in memory:
10.1.3 [class.prop]
A class S is a standard-layout class if it:
(3.1) has no non-static data members of type non-standard-layout class (or array of such types) or reference,
(3.2) has no virtual functions and no virtual base classes,
(3.3) has the same access control for all non-static data members,
(3.4) has no non-standard-layout base classes,
(3.5) has at most one base class subobject of any given type,
(3.6) has all non-static data members and bit-fields in the class and its base classes first declared in the same class, and
(3.7) has no element of the set M(S) of types as a base class, where for any type X, M(X) is defined as follows. [ Note: M(X) is the set of the types of all non-base-class subobjects that may be at a zero offset in X. — end note ]
(3.7.1) If X is a non-union class type with no (possibly inherited) non-static data members, the set M(X) is empty.
(3.7.2) If X is a non-union class type with a non-static data member of type X_0 that is either of zero size or is the first non-static data member of X (where said member may be an anonymous union), the set M(X) consists of X_0 and the elements of M(X_0).
(3.7.3) If X is a union type, the set M(X) is the union of all M(U_i) and the set containing all U_i, where each U_i is the type of the ith non-static data member of X.
(3.7.4) If X is an array type with element type X_e, the set M(X) consists of X_e and the elements of M(X_e).
(3.7.5) If X is a non-class, non-array type, the set M(X) is empty.
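Whether a given type meets the requirements listed above can be checked at compile time with std::is_standard_layout. The sketch below uses hypothetical types to show two of the disqualifying conditions, mixed access control (3.3) and virtual functions (3.2); it assumes C++17 for the _v variable templates.

```cpp
#include <cassert>
#include <type_traits>

struct SameAccess { int a; int b; };                  // all members public
class MixedAccess { int a; public: int b; };          // private and public members
struct HasVirtual { virtual void f() {} int a; };     // has a virtual function

static_assert(std::is_standard_layout_v<SameAccess>,
              "same access control for all members");
static_assert(!std::is_standard_layout_v<MixedAccess>,
              "violates (3.3): mixed access control");
static_assert(!std::is_standard_layout_v<HasVirtual>,
              "violates (3.2): virtual function");

// Hypothetical helper: the common-initial-sequence guarantee only
// applies when the types involved are standard-layout.
template <typename T>
constexpr bool may_rely_on_common_initial_sequence() {
    return std::is_standard_layout_v<T>;
}
```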
My question is: is there any reason for this not to be valid behavior?
Essentially is it that:
The standard makers forgot to account for this particular case?
I haven't read some part of the standard that allows this behavior?
There's some more specific reason for this not to be valid behavior?
A reason for this to be valid syntax is, for example, having a 'readonly' variable in a class, as such:
struct B;
struct A
{
... // Everything that struct A had before
friend B;
};
struct B
{
A member;
void f() { member.x_priv = 100; }
};
int main()
{
B b;
b.f(); // Modifies the value of member.x_priv
//b.member.x_priv = 100; // Invalid, x_priv is private
int x = b.member.x_publ; // Fine, x_publ is public
}
This way you don't need a getter function, which can cause performance overhead (although most compilers would optimize that away), still adds to your class, and forces you to write int x = b.get_x() to read the variable.
Nor would you need a const reference to that variable (as described in this question), which while it works great, it adds size to your class, which can be bad for sufficiently big classes or classes that need to be as small as possible.
It is also awkward to have to write b.member.x_priv instead of b.x_priv, but this would be fixable if anonymous unions could have private members. Then we could rewrite it like this:
struct B
{
union
{
private:
int x_priv;
public:
int x_publ;
friend B;
};
void f() { x_priv = 100; }
};
int main()
{
B b;
b.f(); // Modifies the value of x_priv
//b.x_priv = 100; // Invalid, x_priv is private
int x = b.x_publ; // Fine, x_publ is public
}
Another use case might be to give various names to the same data member. For example, in a Shape, the user might want to refer to the position as either shape.pos, shape.position, shape.cur_pos or shape.shape_pos.
Although this would probably create more problems than it is worth, such a use case might be favorable when, for example, a name should be deprecated.
Code like this:
#include <stdio.h>
struct A { int i; };
struct B { int j; };
union U {
struct A a;
struct B b;
};
int main() {
union U u;
u.a.i = 1;
printf("%d\n", u.b.j);
}
is valid in C. For the sake of backward compatibility, it was considered desirable to ensure that it is also valid in C++. The special rules about common initial sequences of standard-layout structs ensure this backward compatibility. Extending the rule to allow more cases to be well-defined—ones involving non-standard-layout structs—is not necessary for C compatibility, since all structs that can be defined in the common subset of C and C++ are automatically standard-layout structs in C++.
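The same pattern can be written in C++ with the standard-layout precondition made explicit. This is a sketch under the rule described above; the function name read_through_other_member is hypothetical, and the _v traits assume C++17.

```cpp
#include <cassert>
#include <type_traits>

struct A { int i; };
struct B { int j; };

union U {
    A a;
    B b;
};

// The special guarantee only covers standard-layout structs
// inside a standard-layout union.
static_assert(std::is_standard_layout_v<A> && std::is_standard_layout_v<B>,
              "A and B must be standard-layout structs");
static_assert(std::is_standard_layout_v<U>,
              "U must be a standard-layout union");

// Hypothetical helper: writes through one member, reads through the other.
inline int read_through_other_member(int value) {
    U u;
    u.a.i = value;  // a becomes the active member
    return u.b.j;   // inspects the common initial sequence of A and B
}
```

If A or B gained a virtual function or mixed access control, the static_asserts would fail and the read through u.b.j would no longer be covered by the guarantee.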
Actually, the C++ rules are a little bit more permissive than required for C compatibility. They allow some cases involving base classes too:
struct A { int i; };
struct B { int j; };
struct C : A { };
struct D : B { };
// C and D have a common initial sequence consisting of C::i and D::j
But in general, structs in C++ can be much more complicated than their C counterparts. They can, for example, have virtual functions and virtual base classes, and those can affect their layout in an implementation-defined manner. For this reason, it's not so easy to make more cases of type punning through unions well-defined in C++. You would really have to sit down with implementers and discuss what the conditions would be such that the committee should mandate that two classes have the same layout for their common initial sequence and not leave it up to the implementation. Currently, that mandate applies only to standard-layout classes.
There are various rules in the standard that are strong enough to imply that T and const T always have the exact same layout even if T is not a standard-layout class. For this reason, it would be possible to make certain forms of type punning between a T member and a const T member of a union well-defined even if T is not standard-layout. However, adding only this very special case to the language is of dubious value and I think it's unlikely that the committee would accept such a proposal unless you have a really compelling use case. Not wanting to provide a getter that returns a const reference, simply because you don't want to write the () to call the getter each time you need access, is unlikely to convince the committee.
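For comparison, here is a sketch of the two conventional alternatives discussed in this thread: a getter returning a const reference, and a public const reference member. The class and member names are hypothetical. The size comments describe what typical implementations do and are not guaranteed by the standard.

```cpp
#include <cassert>

// Alternative 1: a getter. Adds no per-object storage.
class WithGetter {
public:
    const int& x() const { return x_; }
    void set_from_inside(int v) { x_ = v; }  // stands in for privileged code
private:
    int x_ = 0;
};

// Alternative 2: a public const reference to the private member.
// Typically stored as a pointer, so each object grows. Note that a
// copied object's reference would still refer to the original.
class WithRefMember {
    int x_priv_ = 0;
public:
    const int& x_publ = x_priv_;
    void set_from_inside(int v) { x_priv_ = v; }
};
```

Callers write obj.x() in the first case and obj.x_publ in the second; in both cases outside code can read but not modify the value.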
From the C++ standard:
A standard-layout class is a class that:
— has no non-static data members of type non-standard-layout class (or
array of such types) or reference,
— has no virtual functions (10.3) and no virtual base classes (10.1),
— has the same access control (Clause 11) for all non-static data members,
— has no non-standard-layout base classes,
— either has no non-static data members in the most derived class and
at most one base class with non-static data members, or has no base
classes with non-static data members, and
— has no base classes of the same type as the first non-static data
member
The macro offsetof(type, member-designator) accepts a restricted set
of type arguments in this International Standard. If type is not a
standard-layout class (Clause 9), the results are undefined
Considering these statements, is there any safe way of using offsetof for members that depend on template parameters? If not, how may I get the offset of a member in template classes? What might be unsafe when using something like:
//MS Visual Studio 2013 definition
#define offsetof(s,m) (size_t)&reinterpret_cast<const volatile char&>((((s *)0)->m))
on non-standard-layout classes?
The following sample is NOT SAFE according to the standard:
#include <cstddef>
#include <iostream>
template<typename T>
struct Test
{
int a;
T b;
};
struct NonStdLayout
{
virtual void f(){};
};
int main()
{
std::cout << offsetof(Test<int>, b) << std::endl;
std::cout << offsetof(Test<NonStdLayout>, b) << std::endl;
return 0;
}
offsetof cannot be used on non-standard-layout classes simply because their layout in memory is unknown. For example, the standard does not specify how virtual member functions are implemented. One common way of doing so is to add a pointer to the vtable as the first data member of a class, but it's not the only way.
As to your definition of offsetof: there is no guarantee that a null pointer converts to a 0 via reinterpret_cast (or via a C-style cast), neither is there any semantics specified for other values of pointers cast to integers.
So if you know that your definition makes sense in the underlying addressing scheme used by your compiler for your platform, it can work. But it's an if you have to be aware of.
The answer is that it is perfectly safe to use offsetof in templates. No harm will be done thereby. However, if you choose to do so then you impose a restriction on the type of parameter for the template. It will work correctly for standard layout classes, and in principle at least the compiler should tell you when the parameter is of a type for which it won't work.
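One way to make that restriction on the template parameter explicit is a static_assert on std::is_standard_layout before using offsetof. This is a sketch reusing the Test and NonStdLayout types from the question; the function name offset_of_b is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

template <typename T>
struct Test {
    int a;
    T b;
};

struct NonStdLayout {
    virtual void f() {}
};

// Hypothetical helper: returns the offset of member b, but refuses to
// compile when Test<T> is not a standard-layout class.
template <typename T>
std::size_t offset_of_b() {
    using Holder = Test<T>;  // alias keeps the macro argument simple
    static_assert(std::is_standard_layout_v<Holder>,
                  "offsetof requires a standard-layout class");
    return offsetof(Holder, b);
}

// offset_of_b<NonStdLayout>() would be rejected by the static_assert.
```

This turns a silently unsupported use into a compile-time error instead of relying on the compiler to warn.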
There is no way under the standard to obtain the offset for a member of a non-standard-layout class, regardless of whether any template is involved. It will probably work in individual compilers, but it may not. It is likely to work on all non-virtual classes (although this is not a requirement of the standard). Maybe you just have to experiment.
We are frequently forced to write non-standards compliant code to solve problems like this, so we test it carefully on individual compilers. It just means more hard work in research and testing.
I have many POD structs with a lot of member variables. Instead of initializing each member in the constructor, I simply use memset. Is this valid in C++?
#include <cstring>
struct foo
{
foo() { std::memset(this, 0, sizeof (foo)); }
int var1;
float var2;
double var3;
// more variables..
};
It's not guaranteed to work, since the C++ standard permits implementations in which all-bits-zero is a trap representation of float or double. So reading those members on such an implementation would have undefined behavior.
The same applies to any padding bytes that the implementation might put between the data members -- modifying them is either undefined behavior or else puts the object into an undefined state, that has undefined behavior when used. I forget which.
In practice it will work on all implementations I know, though.
Other answers make valid points about your class being non-POD (C++03) and non-trivial (C++11). Thing is, even if you removed the constructor and called memset from somewhere else it would still not be guaranteed to work by the standard. But if you did remove the constructor you could use aggregate initialization:
foo f = {0};
and that would initialize all members to zero values (whether or not that is represented by all-bits-zero), guaranteed.
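For illustration, here is a small sketch of ways to get zero values without memset. The names foo_with_ctor and zero_init_examples are hypothetical, and the three members simply mirror the ones shown in the question.

```cpp
#include <cassert>

// The struct from the question, with the constructor removed.
struct foo
{
    int var1;
    float var2;
    double var3;
};

// Hypothetical variant that keeps a constructor but value-initializes
// each member instead of calling memset.
struct foo_with_ctor
{
    foo_with_ctor() : var1(), var2(), var3() {}
    int var1;
    float var2;
    double var3;
};

inline void zero_init_examples()
{
    foo f = {0};     // aggregate initialization: remaining members are zeroed too
    foo g = foo();   // value-initialization of a class without a user constructor
    foo_with_ctor h; // the constructor zeroes every member it lists

    assert(f.var1 == 0 && f.var2 == 0.0f && f.var3 == 0.0);
    assert(g.var1 == 0 && g.var2 == 0.0f && g.var3 == 0.0);
    assert(h.var1 == 0 && h.var2 == 0.0f && h.var3 == 0.0);
}
```

The constructor variant has to be kept in sync with the member list by hand, which is the price for not relying on the byte representation.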
According to the standard, your struct is not a POD type, and thus memset may not be used on it.
9 Classes
A trivial class is a class that has a default constructor (12.1), has no non-trivial default constructors, and is trivially copyable.
10 A POD struct is a non-union class that is both a trivial class and a standard-layout class, and has no non-static data members of type non-POD struct, non-POD union (or array of such types).
Since your class has a non-trivial default constructor, it is no longer trivial and, as a result, not a POD type.
Most likely it will work on most compilers, but there is no guarantee.
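The classification can be checked at compile time with the traits from <type_traits>. This is a sketch assuming C++11 or later; foo_plain and foo_with_ctor are hypothetical stand-ins for the struct in the question without and with a user-provided constructor.

```cpp
#include <type_traits>

// Hypothetical stand-in: no user-provided constructor.
struct foo_plain
{
    int var1;
    float var2;
    double var3;
};

// Hypothetical stand-in: user-provided default constructor, as in the question.
struct foo_with_ctor
{
    foo_with_ctor() : var1(0), var2(0), var3(0) {}
    int var1;
    float var2;
    double var3;
};

// Without the constructor the struct is trivial and standard-layout (a POD).
static_assert(std::is_trivial<foo_plain>::value, "foo_plain is trivial");
static_assert(std::is_standard_layout<foo_plain>::value, "foo_plain is standard-layout");

// The user-provided default constructor makes the class non-trivial,
// even though its layout is unchanged.
static_assert(!std::is_trivial<foo_with_ctor>::value, "foo_with_ctor is not trivial");
static_assert(std::is_standard_layout<foo_with_ctor>::value, "layout is unaffected");
```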
I remember having this problem when I started with C++: the compiler wouldn't allow me to put a variable of type std::string into a union.
That was years ago, but I still don't know the exact answer. I read something about the union not liking the string's copy function, but that's pretty much all.
Why are C++ STL strings incompatible with unions?
From Wikipedia:
C++ does not allow for a data member to be any type that has a full fledged constructor/destructor
and/or copy constructor, or a non-trivial copy assignment operator. In particular, it is impossible to
have the standard C++ string as a member of a union.
Think about it this way: If you have a union of a class type like std::string and a primitive type (let's say a long), how would the compiler know when you are using the class type (in which case the constructor/destructor will need to be called) and when you are using the simple type? That's why full-fledged class types are not allowed as members of a union.
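To make that bookkeeping problem concrete, here is a minimal sketch of doing the tracking by hand. It assumes a C++11 or later compiler, where a union may contain a member such as std::string as long as the program itself constructs and destroys it; under the C++03 rules discussed in this thread it does not compile. The class name StringOrLong and all of its members are hypothetical.

```cpp
#include <cassert>
#include <new>
#include <string>

// Hypothetical tagged union: a flag records which member is active, so the
// owning class knows when the string's constructor and destructor must run.
class StringOrLong
{
public:
    explicit StringOrLong(long v) : is_string_(false) { data_.l = v; }

    explicit StringOrLong(const std::string& s) : is_string_(true)
    {
        new (&data_.s) std::string(s); // construct the string in place
    }

    ~StringOrLong()
    {
        if (is_string_)
            data_.s.~basic_string(); // destroy only if the string is active
    }

    // Copying would need the same bookkeeping; it is disabled to keep this short.
    StringOrLong(const StringOrLong&) = delete;
    StringOrLong& operator=(const StringOrLong&) = delete;

    bool is_string() const { return is_string_; }
    long as_long() const { assert(!is_string_); return data_.l; }
    const std::string& as_string() const { assert(is_string_); return data_.s; }

private:
    union Data
    {
        long l;
        std::string s;
        Data() : l(0) {} // starts with the trivial member active
        ~Data() {}       // the owner decides which member to destroy
    };

    Data data_;
    bool is_string_;
};
```

The flag is exactly the information the compiler lacks when it sees a bare union, which is why it cannot generate these constructor and destructor calls on its own.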
In C++03, a class that has a user-defined constructor or user-defined destructor is not allowed in a union.
You can have a pointer to such a class as a member of a union, though.
struct X
{
X() {}
~X() {}
};
union A
{
X x; // not allowed - X has constructor (and destructor too)
X *px; //allowed!
};
Or you can use boost::variant which is a safe, generic, stack-based discriminated union container.
§9.5/1 says (formatting and emphasis are mine):
A union can have member functions (including constructors and destructors), but not virtual (10.3) functions.
A union shall not have base classes.
A union shall not be used as a base class.
An object of a class with a non-trivial constructor (12.1), a non-trivial copy constructor (12.8), a non-trivial destructor (12.4), or a non-trivial copy assignment operator (13.5.3, 12.8) cannot be a member of a union, nor can an array of such objects.
If a union contains a static data member, or a member of reference type, the program is ill-formed.
Interesting!