C++ optimise away private variable - c++

Does ISO C++ (11) permit a private non-static class member variable to be optimised away?
This could be detected:
class X { int x; };
assert (sizeof(X) >= sizeof(int));
but I am not aware of a clause that demands the assertion above.
To clarify: (a) Is there a clause in the C++ Standard that ensure the assertion above.
(b) Can anyone think of any other way to detect the elision of x?
[offsetof?]
(c) Is the optimisation permitted anyhow, despite (a) and (b)?
I have a feeling the optimisation could be possible if the class is local to a function but not otherwise (but I'd like to have a definitive citation).

I do not think it is forbidden, but I think it is impractical.
§9 Classes [class]
7/ A standard-layout class is a class that:
has no non-static data members of type non-standard-layout class (or array of such types) or reference,
has no virtual functions (10.3) and no virtual base classes (10.1),
has the same access control (Clause 11) for all non-static data members,
has no non-standard-layout base classes,
either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and
has no base classes of the same type as the first non-static data member.107
8/ A standard-layout struct is a standard-layout class defined with the class-key struct or the class-key class.
... thus class X { int x; }; is a standard-layout struct.
§9.2 Class members [class.mem]
16/ Two standard-layout struct (Clause 9) types are layout-compatible if they have the same number of non-static data members and corresponding non-static data members (in declaration order) have layout-compatible types (3.9).
... thus class X { int x; }; is layout-compatible with struct Y { int y; };.
The unfortunate thing is that layout-compatible is not formally defined in the Standard. However given the use of the word layout it seems the intent is to declare that two layout-compatible types should have the same underlying representation.
Therefore, to be able to remove the x in X one would have to prove that all structures that are layout-compatible (such as Y) are amenable to the same optimization (to keep the layout compatibility). It seems quite... improbable... in any non-trivial program.

Related

Confused about std::is_standard_layout type trait

Standard-layout class describes a standard layout class that:
has no non-static data members of type non-standard-layout class (or array of such types) or reference,
has no virtual functions and no virtual base classes,
has the same access control for all non-static data members,
has no non-standard-layout base classes,
has all non-static data members and bit-fields in the class and its base classes first declared in the same class, and
given the class as S, has no element of the set M(S) of types as a base class, where M(X) for a type X is defined as:
If X is a non-union class type with no (possibly inherited) non-static data members, the set M(X) is empty.
If X is a non-union class type whose first non-static data member has type X0 (where said member may be an anonymous union), the set M(X) consists of X0 and the elements of M(X0).
If X is a union type, the set M(X) is the union of all M(Ui) and the set containing all Ui, where each Ui is the type of the ith non-static data member of X.
If X is an array type with element type Xe, the set M(X) consists of Xe and the elements of M(Xe).
If X is a non-class, non-array type, the set M(X) is empty.
(numbered for convivence)
However, the following fails:
#include <type_traits>
struct A {
int a;
};
struct B : A {
int b;
};
static_assert(std::is_standard_layout<A>::value, "not standard layout"); // succeeds
static_assert(std::is_standard_layout<B>::value, "not standard layout"); // fails
Demo
I see that 1-4 are true, so is one of points under 5 false? I find the points under 5 a little confusing.
The cppreference description is a little confusing.
The standard (search "standard-layout class is") says:
A standard-layout class is a class that:
has no non-static data members of type non-standard-layout class (or array of such types) or reference,
has no virtual functions (10.3) and no virtual base classes (10.1),
has the same access control (Clause 11) for all non-static data members,
has no non-standard-layout base classes,
either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and
has no base classes of the same type as the first non-static data member.
You can see from the part I emphasized that a class with a non-static data member and a base class that itself has a non-static data member is not standard-layout.
I suspect this point is described as follows in cppreference:
has all non-static data members and bit-fields in the class and its base classes first declared in the same class
The point is that you must be able to cast the address of a standard-layout class object to a pointer to its first member, and back. It boils down to compatibility between C++ and C.
See also: Why is C++11's POD "standard layout" definition the way it is?.

C++ Union, two active members only differing by CV-qualification

If I have a union with two data members of the same type, differing only by CV-qualification:
template<typename T>
union A
{
private:
T x_priv;
public:
const T x_publ;
public:
// Accept-all constructor
template<typename... Args>
A(Args&&... args) : x_priv(args...) {}
// Destructor
~A() { x_priv.~T(); }
};
And I have a function f that declares a union A, thus making x_priv the active member and then reads x_publ from that union:
int f()
{
A<int> a {7};
return a.x_publ;
}
In every compiler I tested there were no errors compiling nor at runtime for both int types and other, more complex, types such as std::string and std::thread.
I went to see on the standard if this was legal behavior and I started on looking at the difference of T and const T:
6.7.3.1 [basic.type.qualifier]
The cv-qualified or cv-unqualified versions of a type are distinct types; however, they shall have the same representation and alignment requirements ([basic.align]).
This means that when declaring a const T it has the exact same representation in memory as a T. But then I found that the standard actually disallows this for some types, which I found weird, as I see no reason for it.
I started my search on accessing non-active members.
It is only legal to access the common initial sequence of T and const T if both are standard-layout types.
10.4.1[class.union]
At most one of the non-static data members of an object of union type can be active at any time [...] [ Note: One special guarantee is made in order to simplify the use of unions: If a standard-layout union contains several standard-layout structs that share a common initial sequence ([class.mem]), and if a non-static data member of an object of this standard-layout union type is active and is one of the standard-layout structs, it is permitted to inspect the common initial sequence of any of the standard-layout struct members; see [class.mem]. — end note ]
The initial sequence is basically the order of the non-static data members with a few exceptions, but since T and const T have the exact same members in the same layout, this means that the common initial sequence of T and const T is all of the members of T.
10.3.22 [class.mem]
The common initial sequence of two standard-layout struct ([class.prop]) types is the longest sequence of non-static data members and bit-fields in declaration order, starting with the first such entity in each of the structs, such that corresponding entities have layout-compatible types, either both entities are declared with the no_­unique_­address attribute ([dcl.attr.nouniqueaddr]) or neither is, and either both entities are bit-fields with the same width or neither is a bit-field. [ Example:
And here is where the restrictions come in, it restricts some types from being accessed, even though they have the exact same representation in memory:
10.1.3 [class.prop]
A class S is a standard-layout class if it:
(3.1) has no non-static data members of type non-standard-layout class (or array of such types) or reference,
(3.2) has no virtual functions and no virtual base classes,
(3.3) has the same access control for all non-static data members,
(3.4) has no non-standard-layout base classes,
(3.5) has at most one base class subobject of any given type,
(3.6) has all non-static data members and bit-fields in the class and its base classes first declared in the same class, and
(3.7) has no element of the set M(S) of types as a base class, where for any type X, M(X) is defined as follows.108 [ Note: M(X) is the set of the types of all non-base-class subobjects that may be at a zero offset in X. — end note ]
(3.7.1) If X is a non-union class type with no (possibly inherited) non-static data members, the set M(X) is empty.
(3.7.2) If X is a non-union class type with a non-static data member of type X_0 that is either of zero size or is the first non-static data member of X (where said member may be an anonymous union), the set M(X) consists of X_0 and the elements of M(X_0).
(3.7.3) If X is a union type, the set M(X) is the union of all M(U_i) and the set containing all U_i, where each U_i is the type of the ith non-static data member of X.
(3.7.4) If X is an array type with element type X_e , the set M(X) consists of X e and the elements of M (X_e).
(3.7.5) If X is a non-class, non-array type, the set M(X) is empty.
My questions is is there any reason for this to not be valid behavior?.
Essentially is it that:
The standard makers forgot to account for this particular case?
I haven't read some part of the standard that allows this behavior?
There's some more specific reason for this not to be valid behavior?
A reason for this to be valid syntax is, for example, having a 'readonly' variable in a class, as such:
struct B;
struct A
{
... // Everything that struct A had before
friend B;
}
struct B
{
A member;
void f() { member.x_priv = 100; }
}
int main()
{
B b;
b.f(); // Modifies the value of member.x_priv
//b.member.x_priv = 100; // Invalid, x_priv is private
int x = b.member.x_publ; // Fine, x_publ is public
}
This way you don't need a getter function, which can cause performance overhead and although most compiler would optimize that away it still increases your class, and to get the variable you'd have to write int x = b.get_x().
Nor would you need a const reference to that variable (as described in this question), which while it works great, it adds size to your class, which can be bad for sufficiently big classes or classes that need to be as small as possible.
And it is weird having to write b.member.x_priv instead of b.x_priv but this would be fixable if we could have private members in anonymous unions then we could rewrite it like this:
struct B
{
union
{
private:
int x_priv;
public:
int x_publ;
friend B;
};
void f() { x_priv = 100; }
}
int main()
{
B b;
b.f(); // Modifies the value of member.x_priv
//b.x_priv = 100; // Invalid, x_priv is private
int x = b.x_publ; // Fine, x_publ is public
}
Another use case might be to give various names to the same data member, lie for example in a Shape, the user might want to refer to the position as either shape.pos, shape.position, shape.cur_pos or shape.shape_pos.
Although this would probably create more problems than it is worth, such a use case might be favorable when for example a name should be deprecated .
Code like this:
struct A { int i; };
struct B { int j; };
union U {
struct A a;
struct B b;
};
int main() {
union U u;
u.a.i = 1;
printf("%d\n", u.b.j);
}
is valid in C. For the sake of backward compatibility, it was considered desirable to ensure that it is also valid in C++. The special rules about common initial sequences of standard-layout structs ensure this backward compatibility. Extending the rule to allow more cases to be well-defined—ones involving non-standard-layout structs—is not necessary for C compatibility, since all structs that can be defined in the common subset of C and C++ are automatically standard-layout structs in C++.
Actually, the C++ rules are a little bit more permissive than required for C compatibility. They allow some cases involving base classes too:
struct A { int i; };
struct B { int j; };
struct C : A { };
struct D : B { };
// C and D have a common initial sequence consisting of C::i and D::j
But in general, structs in C++ can be much more complicated than their C counterparts. They can, for example, have virtual functions and virtual base classes, and those can affect their layout in an implementation-defined manner. For this reason, it's not so easy to make more cases of type punning through unions well-defined in C++. You would really have to sit down with implementers and discuss what the conditions would be such that the committee should mandate that two classes have the same layout for their common initial sequence and not leave it up to the implementation. Currently, that mandate applies only to standard-layout classes.
There are various rules in the standard that are strong enough to imply that T and const T always have the exact same layout even if T is not a standard-layout class. For this reason, it would be possible to make certain forms of type punning between a T member and a const T member of a union well-defined even if T is not standard-layout. However, adding only this very special case to the language is of dubious value and I think it's unlikely that the committee would accept such a proposal unless you have a really compelling use case. Not wanting to provide a getter that returns a const reference, simply because you don't want to write the () to call the getter each time you need access, is unlikely to convince the committee.

What is the purpose of bullet point (7.5) in [class]/7, in C++14?

This is basically a continuation of my prior question.
This is [class]/7 in C++14:
A standard-layout class is a class that:
(7.1) — has no non-static data members of type non-standard-layout class (or array of such types) or reference,
(7.2) — has no virtual functions (10.3) and no virtual base classes (10.1),
(7.3) — has the same access control (Clause 11) for all non-static data members,
(7.4) — has no non-standard-layout base classes,
(7.5) — either has no non-static data members in the most derived class and at most one base class with
non-static data members, or has no base classes with non-static data members, and
(7.6) — has no base classes of the same type as the first non-static data member.
Consider the following snippet:
struct B{ int i; };
struct A : B{ int j; };
A satisfies bullet points (7.1) thru (7.4), but doesn't satisfy (7.5), as A has a non-static data member and has a base class with a non-static data member.
What is the problem with A being a standard-layout class?
Edit
As far as I can understand the accepted answer to the question of which this is being considered a dupe, the snippet above would have undefined behavior, if I tried to cast a pointer to A to the first data member of the base class B and back, because of this sentence written by the OP:
Within a class, members are allocated in increasing addresses according to the declaration order. However C++ doesn't dictate the order of allocation for data members across classes.
But that doesn't seem to answer my question. Suppose for example that in a certain compiler implementation, base B would follow struct A in memory, instead of preceding it. But this would contradict the fact that there is an implicit conversion, from a pointer to a derived class, to a pointer to a base class, according to [conv.ptr]/3:
A prvalue of type “pointer to cv D”, where D is a class type, can be
converted to a prvalue of type “pointer to cv B”, where B is a base
class (Clause 10) of D.
That is, if the base B followed struct A in memory, the above implicit conversion would be invalid.
Directly answering the question as phrased:
The purpose of this bullet is to allow very simple cases of inheritance where only one of the classes has data members.
Data layout for inheritance is unspecified, so the standard could just disallow inheritance altogether, but the standard makes an exception if one class has no data to treat the result still as Standard Layout.

Does a nested enum inside POD class makes it not POD?

By definition here, POD is a simple class with no user-defined constructors, non static members, and containing simple data types only.
The question is, will these 2 classes below be equivalent as POD types (in terms of memory footprint):
class pod
{
public:
int x;
double y;
};
class pod1
{
public:
int x;
double y;
enum POD_TYPE
{
POD1 = 0,
POD = 1
};
};
In other words does adding enum to the class only affects scope resolution of enum and does not affect properties of the class itself? By observation, it seems that class is still pod, but I would like to confirm based on the standard.
Yes, that type is still POD. The definition is given by C++11 9/10:
A POD struct is a non-union class that is both a trivial class and a standard-layout class, and has no non-static data members of type non-POD struct, non-POD union (or array of such types).
Trivial means that it doesn't do any funny business when creating, destroying or copying objects. Standard-layout means that it doesn't do any funny business with the layout of data members: no polymorphism, and restrictions on what you can do with access specifiers and inheritance. These terms are fully defined in C++11 9/6 and 9/7, if you want more detail.
Nested types (such as your enumeration), static data members and non-virtual member functions (apart from constructors etc. which would make it non-trivial) will not effect any of those things, so it is still POD.
UPDATE: Since you say you're interested in historic definitions, C++03 defined:
9/4 A POD-struct is an aggregate class that has no non-static members of type non-POD-struct, non-POD-union (or array of such types), and has no user-defined copy assignment operator and no user-defined destructor"
8.5.1/1 An aggregate is an array or class with no user-declared constructors, no private or protected non-static data members, no base classes and no virtual functions.
So there were more restrictions; but nested types were still allowed. I don't have a copy of C++98, but I'm sure that would be identical to C++03.
It doesn't make any difference regarding POD status because defining a nested enum does not add data members to the class. In fact, you could also define a nested class that is not POD inside pod1 and it would still not make a difference regarding the PODness of pod1.
The enum POD_TYPE is a type and won't effect the layout, we can see this from the draft C++ standard section 9.2 Class members paragraph 1 which says:
[...]Members of a class are data members, member functions (9.3), nested types, and
enumerators. Data members and member functions are static or non-static; see 9.4. Nested types are classes (9.1, 9.7) and enumerations (7.2) [...]
as opposed to data members and we can further see that the definition of a standard layout class depends only on data member from paragraph 16 which says:
Two standard-layout struct (Clause 9) types are layout-compatible if they have the same number of non-static data members and corresponding non-static data members (in declaration order) have layout-compatible types (3.9).
and we further see going back to section 9 Classes paragraph 10 which says(emphasis mine):
A POD struct108 is a non-union class that is both a trivial class and a standard-layout class, and has no non-static data members of type non-POD struct, non-POD union (or array of such types). Similarly, a POD union is a union that is both a trivial class and a standard layout class, and has no non-static data members of type non-POD struct, non-POD union (or array of such types). A POD class is a class that is either a POD struct or a POD union.
As far as I can tell the pre C++11 standard does not diff much in the above items.

struct alignment C/C++

In c/c++ (I am assuming they are the same in this regard), if I have the following:
struct S {
T a;
.
.
.
} s;
Is the following guaranteed to be true?
(void*)&s == (void*)&s.a;
Or in other words, is there any kind of guarantee that there will be no padding before the first member?
In C, yes, they're the same address. Simple, and straightforward.
In C++, no, they're not the same address. Base classes can (and I would suspect, do) come before all members, and virtual member functions usually add hidden data to the struct somewhere. Even more confusing, a C++ compiler may also rearrange members at will, unless the class is a standard layout type (though I don't know that any compiler does so)
Finally, if the C++ struct is composed of standard layout types, contains no base classes nor virtual functions and all members have the same visibility, and possibly other limitations I forgot, then it falls back on the C rules, and requires the first member to be at the same address as the object itself.
§ 9.2/7
A standard-layout class is a class that:
— has no non-static data members of type non-standard-layout class (or array of such types) or reference,
— has no virtual functions (10.3) and no virtual base classes (10.1),
— has the same access control (Clause 11) for all non-static data members,
— has no non-standard-layout base classes,
— either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and
— has no base classes of the same type as the first non-static data member.
§ 9.2/20
A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa. [ Note: There might therefore be unnamed padding within a standard-layout struct object, but not at its beginning, as necessary to achieve appropriate alignment. —end note ]
Yes, it is.
It is guaranteed there is no padding before the first struct member in C and in C++ (if it is a POD).
C quote:
(C11, 6.7.2.1p15) "There may be unnamed padding within a structure object, but not at its beginning."
C++ quote:
(C++11, 9.2p20) "There might therefore be unnamed padding within a standard-layout struct object, but not at its beginning, as necessary to achieve appropriate alignment"