Can you use the address of a constexpr variable? - c++

I have a variable whose address is passed as the fourth parameter to setsocketopt. Notice that this parameter is declared as a constant pointer (const void *optval).
In a patch I put up for review I changed the declaration of that variable to static constexpr. A reviewer of this change had concerns: he felt that it was questionable whether you can always take the address of a constexpr. He suggests that I make it const. I couldn't find much about addresses of constexpr variables and concerns about it after googling around. Can someone explain what the guarantees are concerning the address of a constexpr variable and what the caveats are about using it (if any)?
In case it's helpful, here's the code (I added static constexpr, it was just an int before):
static constexpr int ONE = 1;
setsockopt(socket_fd, IPPROTO_TCP, TCP_NODELAY, &ONE, sizeof(ONE));
Thanks!

Every object is guaranteed to have an address per [intro.object]/8
An object has nonzero size if it
is not a potentially-overlapping subobject, or
is not of class type, or
is of a class type with virtual member functions or virtual base classes, or
has subobjects of nonzero size or bit-fields of nonzero length.
Otherwise, if the object is a base class subobject of a standard-layout class type with no non-static data members, it has zero size. Otherwise, the circumstances under which the object has zero size are implementation-defined. Unless it is a bit-field, an object with nonzero size shall occupy one or more bytes of storage, including every byte that is occupied in full or in part by any of its subobjects. An object of trivially copyable or standard-layout type ([basic.types]) shall occupy contiguous bytes of storage.
emphasis mine
Since ONE is a non-class-non-bit-field type, it's in storage and you can take its address.

Yes, you can take the address of any object which is not a non-type non-reference template parameter or a bit-field. A variable declared constexpr (or one which is implicitly a constant expression) can be used in more contexts, not fewer.
It's true that the compiler can often avoid using any memory storage for a constexpr variable if it's not needed. But if the address is taken, that will probably require the memory storage, so the compiler will need to treat it more like an ordinary object in that case.

Related

type-casting abstract class type to struct type with a integer member

I am studing Blender's GHOST code and found the following statement in GHOST_CreateSystem C-API function
GHOST_ISystem::createSystem();
GHOST_ISystem *system = GHOST_ISystem::getSystem();
return (GHOST_SystemHandle)system;
GHOST_ISystem is an abstract class for a specific operating system such as win32, linux.
GHOST_SystemHandle is defined as structure type as follows:
typedef struct GHOST_SystemHandle__ {
int unused;
} * GHOST_SystemHandle
I am not familiar with this kind of practice type-casting a pointer to class into a pointer to struct type with an integer member.
It would be appreciated to enlighten me the background about it.
Abstract class CANNOT be instantiated, to wit it cannot be converted. What happens here is a conversion of pointer values.
The calls GHOST_ISystem::createSystem(); and GHOST_ISystem::getSystem(); are calls to static functions. The latter returns a pointer to GHOST_ISystem. Pointer is a separate compound type and pointer to any declared type may materialized. A pointer to an abstract class-type can hold value of pointer to any derived type. GHOST_ISystem* is its static type while its dynamic type can vary.
The C-style cast syntax (GHOST_SystemHandle)system; is considered faux pas in C++ as it is too permitting, unsafe conversion. In this particular case it is treated as reinterpret_cast to a pointer GHOST_SystemHandle__ *.
The content of pointer system is bytewise copied to an instance of GHOST_SystemHandle, which is also a pointer. This would be hinged on assumption that all pointers in implementation have similar content and same size and that access by dereferencing the result would be legal.
Likely, the underlying reason was that derived types would have an integer field or an subobject with same alignment. Strictly speaking, result is unspecified. Memory model of original class and GHOST_SystemHandle__ should match. This assumes too many things:
The abstract base class-type have no non-static member variable or it contains an equivalent member variable.
The implementation is conforming to late-enough standard to ensure that
base type subobject got zero size, if it doesn't have non-static member variables.
The alignment requirement of
derived type in question is same as alignment requirement of
GHOST_SystemHandle__. It it's not, result of *system->unused is
undefined.
There is a number of ways it could be avoided in C++ and none was used, which suggests that code was converted from C without much of refactoring done while keeping interface type formats same for compatibility.

Address of an array declared inside a new struct

inside a definition like this
typedef struct
{
myType array[N];
} myStruct;
myStruct obj;
can I always assume that ([edit] assuming proper casting will happen which is not the focus of the question here [/edit])
(&obj == &obj.array[0])
will return TRUE or the should I worry about the compiler introducing extra padding to accomodate the myType alignment requisites? In theory this shouldn't happen as the struct has a single field but I'm not entirely sure about this.
With a proper cast, this will always return true.
From section 6.7.2.1 of the C standard:
13. Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase
in the order in which they are declared. A pointer to a structure
object, suitably converted, points to its initial member (or
if that member is a bit-field, then to the unit in which it
resides), and vice versa. There may be unnamed padding within a
structure object, but not at its beginning.
According to the current C++ standard draft [class.mem] §20 (N4527), emphasis added:
If a standard-layout class object has any non-static data members, its address is the same as the address
of its first non-static data member. Otherwise, its address is the same as the address of its first base class
subobject (if any). [ Note: There might therefore be unnamed padding within a standard-layout struct
object, but not at its beginning, as necessary to achieve appropriate alignment. — end note ]
Whether myStruct is standard-layout, depends on whether myType is standard-layout. If it is, then what you're asking is guaranteed by the c++ standard.
Note that &obj and &obj.array[0] have unrelated pointer types, so the expression is not legal in c++.

Pointer arithmetics on non-array types

Let's consider following piece of code:
struct Blob {
double x, y, z;
} blob;
char* s = reinterpret_cast<char*>(&blob);
s[2] = 'A';
Assuming that sizeof(double) is 8, does this code trigger undefined behaviour?
Quoting from N4140 (roughly C++14):
3.9 Types [basic.types]
2 For any object (other than a base-class subobject) of trivially copyable type T, whether or not the object holds a valid value of type T, the underlying bytes (1.7) making up the object can be copied into an array of char or unsigned char.42 If the content of the array of char or unsigned char is copied back into the object, the object shall subsequently hold its original value.
42) By using, for example, the library functions (17.6.1.2) std::memcpy or std::memmove.
3 For any trivially copyable type T, if two pointers to T point to distinct T objects obj1 and obj2, where neither obj1 nor obj2 is a base-class subobject, if the underlying bytes (1.7) making up obj1 are copied
into obj2,43 obj2 shall subsequently hold the same value as obj1. [ Example: ... ]
43) By using, for example, the library functions (17.6.1.2) std::memcpy or std::memmove.
This does, in principle, allow assignment directly to s[2] if you take the position that assignment to s[2] is indirectly required to be equivalent to copying all of some other Blob into an array that just happens to be bytewise identical except for the third byte, and copying it into your Blob: you're not assigning to s[0], s[1], etc. For trivially copyable types including char, that is equivalent to setting them to the exact value they already have, which also has no observable effect.
However, if the only way to get s[2] == 'A' is by memory manipulation, then a valid argument could also be made that what you're copying back into your Blob isn't the underlying bytes that made up any previous Blob. In that case, technically, the behaviour would be undefined by omission.
I do strongly suspect, especially given the "whether or not the object holds a valid value of type T" comment, that it's intended to be allowed.
Chapter 3.10 of the standard seems to allow for that specific case, assuming that "access the stored value" means "read or write", which is unclear.
3.10-10
If a program attempts to access the stored value of an object through
a glvalue of other than one of the following types the behavior is
undefined:
—(10.1) the dynamic type of the object,
—(10.2) a cv-qualified version of the dynamic type of the object,
—(10.3) a type similar (as defined in 4.4) to the dynamic type of the
object,
—(10.4) a type that is the signed or unsigned type corresponding to
the dynamic type of the object,
—(10.5) a type that is the signed or unsigned type corresponding to a
cv-qualified version of the dynamic type of the object,
—(10.6) an aggregate or union type that includes one of the
aforementioned types among its elements or nonstatic data members
(including, recursively, an element or non-static data member of a
subaggregate or contained union),
—(10.7) a type that is a (possibly cv-qualified) base class type of the
dynamic type of the object,
—(10.8) a char or unsigned char type.

Are members of a POD-struct or standard layout type guaranteed to be aligned according to their alignment requirements?

Given a POD-struct (in C++03) or a standard layout type (in C++11), with all members having a fundamental alignment requirement, is it true that every member is guaranteed to be aligned according to its alignment requirement?
In other words, for all members m_k in { m0 ... mn } of standard layout type S,
struct S {
T0 m0;
T1 m1;
...
TN mn;
};
is the following expression guaranteed to evaluate to true?
(offsetof(S,m_k) % alignof(decltype(S::m_k))) == 0
Please give answers for both C++03 and C++11 and cite the relevant parts of the standard. Supporting evidence from C standards would also be helpful.
My reading of the C++03 standard (ISO/IEC 14882:2003(E)) is that it is silent regarding the alignment of members within a POD-struct, except for the first member. The relevant paragraphs are:
In the language of the specification, an object is a "region of storage":
1.8 The C + + object model [intro.object]
1.8/1 The constructs in a C + + program create, destroy, refer to, access, and manipulate objects. An object is a region of storage. ...
Objects are allocated according to their alignment requirement:
3.9 Types [basic.types]
3.9/5 Object types have alignment requirements (3.9.1, 3.9.2). The alignment of a complete object type is an implementation-defined integer value representing a number of bytes; an object is allocated at an address that meets the alignment requirements of its object type.
Fundamental types have alignment requirements:
3.9.1 Fundamental types [basic.fundamental]
3.9.1/3 For each of the signed integer types, there exists a corresponding (but different) unsigned integer type: "unsigned char", "unsigned short int", "unsigned int", and "unsigned long int," each of which occupies the same amount of storage and has the same alignment requirements (3.9) as the corresponding signed integer type;...
Padding may occur due to "implementation alignment requirements":
9.2 Class members [class.mem]
9.2/12 Nonstatic data members of a (non-union) class declared without an intervening access-specifier are allocated so that later members have higher addresses within a class object. The order of allocation of nonstatic
data members separated by an access-specifier is unspecified (11.1). Implementation alignment requirements might cause two adjacent members not to be allocated immediately after each other; so might requirements for space for managing virtual functions (10.3) and virtual base classes (10.1).
Does the word "allocated" in 9.2/12 have the same meaning as "allocated" in 3.9/5? Most uses of "allocated" in the spec refer to dynamic storage allocation, not struct-internal layout. The use of may in 9.2/12 seems to imply that the alignment requirements of 3.9/5 and 3.9.1/3 might not be strictly required for struct members.
The first member of a POD-struct will be aligned according to the alignment requirement of the struct:
9.2/17 A pointer to a POD-struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa. [Note: There might therefore be unnamed padding within a POD-struct object, but not at its beginning, as necessary to achieve appropriate alignment. ]
[Emphasis added in all of the above quotes.]
Each element of a POD struct is itself an object, and objects can only be allocated in accordance with the alignment requirements for those objects. The alignment requirements may change, though, due to the fact that something is a sub-object of another object ([basic.align]/1, 2:
1 Object types have alignment requirements (3.9.1, 3.9.2) which place restrictions on the addresses at which an object of that type may be allocated. An alignment is an implementation-defined integer value representing the number of bytes between successive addresses at which a given object can be allocated. An object type imposes an alignment requirement on every object of that type; stricter alignment can be requested using the alignment specifier (7.6.2).
2 A fundamental alignment is represented by an alignment less than or equal to the greatest alignment supported by the implementation in all contexts, which is equal to alignof(std::max_align_t) (18.2). The alignment required for a type might be different when it is used as the type of a complete object and when it is used as the type of a subobject. [Example:
struct B { long double d; };
struct D : virtual B { char c; }
When D is the type of a complete object, it will have a subobject of type B, so it must be aligned appropriately for a long double. If D appears as a subobject of another object that also has B as a virtual base class, the
B subobject might be part of a different subobject, reducing the alignment requirements on the D subobject.—end example ] The result of the alignof operator reflects the alignment requirement of the type in the
complete-object case.
[emphasis added]
Although the examples refer to a sub-object via inheritance, the normative wording just refers to sub-objects in general, so I believe the same rules apply, so on one hand you can assume that each sub-object is aligned so that it can be accessed. On the other hand, no you can't necessarily assume that will be the same alignment that alignof gives you.
[The reference is from N4296, but I believe the same applies to all recent versions. C++98/03, of course, didn't have alignof at all, but I believe the same basic principle applies--members will be aligned so they can be used, but that alignment requirement isn't necessarily the same as when they're used as independent objects.]
There are two distinct alignment requirements for each type. One corresponds to complete objects, the other one to subobjects. [basic.align]/2:
The alignment required for a type might be different when it is used
as the type of a complete object and when it is used as the type of a
subobject. [ Example:
struct B { long double d; };
struct D : virtual B { char c; };
When D is the type of a complete object, it will have a subobject of
type B, so it must be aligned appropriately for a long double. If
D appears as a subobject of another object that also has B as a
virtual base class, the B subobject might be part of a different
subobject, reducing the alignment requirements on the D subobject.
—end example ] The result of the alignof operator reflects the alignment requirement of the type in the complete-object case.
I can't think of any specific example for member subobjects - and there assuredly is none! -, but the above paragraph concedes the possibility that a member subobject has weaker alignment than alignof will yield, which would fail your condition.

Aliasing T* with char* is allowed. Is it also allowed the other way around?

Note: This question has been renamed and reduced to make it more focused and readable. Most of the comments refer to the old text.
According to the standard, objects of different type may not share the same memory location. So this would not be legal:
std::array<short, 4> shorts;
int* i = reinterpret_cast<int*>(shorts.data()); // Not OK
The standard, however, allows an exception to this rule: any object may be accessed through a pointer to char or unsigned char:
int i = 0;
char * c = reinterpret_cast<char*>(&i); // OK
However, it is not clear to me whether this is also allowed the other way around. For example:
char * c = read_socket(...);
unsigned * u = reinterpret_cast<unsigned*>(c); // huh?
Some of your code is questionable due to the pointer conversions involved. Keep in mind that in those instances reinterpret_cast<T*>(e) has the semantics of static_cast<T*>(static_cast<void*>(e)) because the types that are involved are standard-layout. (I would in fact recommend that you always use static_cast via cv void* when dealing with storage.)
A close reading of the Standard suggests that during a pointer conversion to or from T* it is assumed that there really is an actual object T* involved -- which is hard to fulfill in some of your snippet, even when 'cheating' thanks to the triviality of types involved (more on this later). That would be besides the point however because...
Aliasing is not about pointer conversions. This is the C++11 text that outlines the rules that are commonly referred to as 'strict aliasing' rules, from 3.10 Lvalues and rvalues [basic.lval]:
10 If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:
the dynamic type of the object,
a cv-qualified version of the dynamic type of the object,
a type similar (as defined in 4.4) to the dynamic type of the object,
a type that is the signed or unsigned type corresponding to the dynamic type of the object,
a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
an aggregate or union type that includes one of the aforementioned types among its elements or non-static data members (including, recursively, an element or non-static data member of a subaggregate or contained union),
a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
a char or unsigned char type.
(This is paragraph 15 of the same clause and subclause in C++03, with some minor changes in the text with e.g. 'lvalue' being used instead of 'glvalue' since the latter is a C++11 notion.)
In the light of those rules, let's assume that an implementation provides us with magic_cast<T*>(p) which 'somehow' converts a pointer to another pointer type. Normally this would be reinterpret_cast, which yields unspecified results in some cases, but as I've explained before this is not so for pointers to standard-layout types. Then it's plainly true that all of your snippets are correct (substituting reinterpret_cast with magic_cast), because no glvalues are involved whatsoever with the results of magic_cast.
Here is a snippet that appears to incorrectly use magic_cast, but which I will argue is correct:
// assume constexpr max
constexpr auto alignment = max(alignof(int), alignof(short));
alignas(alignment) char c[sizeof(int)];
// I'm assuming here that the OP really meant to use &c and not c
// this is, however, inconsequential
auto p = magic_cast<int*>(&c);
*p = 42;
*magic_cast<short*>(p) = 42;
To justify my reasoning, assume this superficially different snippet:
// alignment same as before
alignas(alignment) char c[sizeof(int)];
auto p = magic_cast<int*>(&c);
// end lifetime of c
c.~decltype(c)();
// reuse storage to construct new int object
new (&c) int;
*p = 42;
auto q = magic_cast<short*>(p);
// end lifetime of int object
p->~decltype(0)();
// reuse storage again
new (p) short;
*q = 42;
This snippet is carefully constructed. In particular, in new (&c) int; I'm allowed to use &c even though c was destroyed due to the rules laid out in paragraph 5 of 3.8 Object lifetime [basic.life]. Paragraph 6 of same gives very similar rules to references to storage, and paragraph 7 explains what happens to variables, pointers and references that used to refer to an object once its storage is reused -- I will refer collectively to those as 3.8/5-7.
In this instance &c is (implicitly) converted to void*, which is one of the correct use of a pointer to storage that has not been yet reused. Similarly p is obtained from &c before the new int is constructed. Its definition could perhaps be moved to after the destruction of c, depending on how deep the implementation magic is, but certainly not after the int construction: paragraph 7 would apply and this is not one of the allowed situations. The construction of the short object also relies on p becoming a pointer to storage.
Now, because int and short are trivial types, I don't have to use the explicit calls to destructors. I don't need the explicit calls to the constructors, either (that is to say, the calls to the usual, Standard placement new declared in <new>). From 3.8 Object lifetime [basic.life]:
1 [...] The lifetime of an object of type T begins when:
storage with the proper alignment and size for type T is obtained, and
if the object has non-trivial initialization, its initialization is complete.
The lifetime of an object of type T ends when:
if T is a class type with a non-trivial destructor (12.4), the destructor call starts, or
the storage which the object occupies is reused or released.
This means that I can rewrite the code such that, after folding the intermediate variable q, I end up with the original snippet.
Do note that p cannot be folded away. That is to say, the following is defintively incorrect:
alignas(alignment) char c[sizeof(int)];
*magic_cast<int*>(&c) = 42;
*magic_cast<short*>(&c) = 42;
If we assume that an int object is (trivially) constructed with the second line, then that must mean &c becomes a pointer to storage that has been reused. Thus the third line is incorrect -- although due to 3.8/5-7 and not due to aliasing rules strictly speaking.
If we don't assume that, then the second line is a violation of aliasing rules: we're reading what is actually a char c[sizeof(int)] object through a glvalue of type int, which is not one of the allowed exception. By comparison, *magic_cast<unsigned char>(&c) = 42; would be fine (we would assume a short object is trivially constructed on the third line).
Just like Alf, I would also recommend that you explicitly make use of the Standard placement new when using storage. Skipping destruction for trivial types is fine, but when encountering *some_magic_pointer = foo; you're very much likely facing either a violation of 3.8/5-7 (no matter how magically that pointer was obtained) or of the aliasing rules. This means storing the result of the new expression, too, since you most likely can't reuse the magic pointer once your object is constructed -- due to 3.8/5-7 again.
Reading the bytes of an object (this means using char or unsigned char) is fine however, and you don't even to use reinterpret_cast or anything magic at all. static_cast via cv void* is arguably fine for the job (although I do feel like the Standard could use some better wording there).
This too:
// valid: char -> type
alignas(int) char c[sizeof(int)];
int * i = reinterpret_cast<int*>(c);
That is not correct. The aliasing rules state under which circumstances it is legal/illegal to access an object through an lvalue of a different type. There is an specific rule that says that you can access any object through a pointer of type char or unsigned char, so the first case is correct. That is, A => B does not necessarily mean B => A. You can access an int through a pointer to char, but you cannot access a char through a pointer to int.
For the benefit of Alf:
If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:
the dynamic type of the object,
a cv-qualified version of the dynamic type of the object,
a type similar (as defined in 4.4) to the dynamic type of the object,
a type that is the signed or unsigned type corresponding to the dynamic type of the object,
a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
an aggregate or union type that includes one of the aforementioned types among its elements or non- static data members (including, recursively, an element or non-static data member of a subaggregate or contained union),
a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
a char or unsigned char type.
Regarding the validity of …
alignas(int) char c[sizeof(int)];
int * i = reinterpret_cast<int*>(c);
The reinterpret_cast itself is OK or not, in the sense of producing a useful pointer value, depending on the compiler. And in this example the result isn't used, in particular, the character array isn't accessed. So there is not much more that can be said about the example as-is: it just depends.
But let's consider an extended version that does touch on the aliasing rules:
void foo( char* );
alignas(int) char c[sizeof( int )];
foo( c );
int* p = reinterpret_cast<int*>( c );
cout << *p << endl;
And let's only consider the case where the compiler guarantees a useful pointer value, one that would place the pointee in the same bytes of memory (the reason that this depends on the compiler is that the standard, in §5.2.10/7, only guarantees it for pointer conversions where the types are alignment-compatible, and otherwise leave it as "unspecified" (but then, the whole of §5.2.10 is somewhat inconsistent with §9.2/18).
Now, one interpretation of the standard's §3.10/10, the so called "strict aliasing" clause (but note that the standard does not ever use the term "strict aliasing"),
If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:
the dynamic type of the object,
a cv-qualified version of the dynamic type of the object,
a type similar (as defined in 4.4) to the dynamic type of the object,
a type that is the signed or unsigned type corresponding to the dynamic type of the object,
a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
an aggregate or union type that includes one of the aforementioned types among its elements or non- static data members (including, recursively, an element or non-static data member of a subaggregate or contained union),
a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
a char or unsigned char type.
is that, as it itself says, concerns the dynamic type of the object residing in the c bytes.
With that interpretation, the read operation on *p is OK if foo has placed an int object there, and otherwise not. So in this case, a char array is accessed via an int* pointer. And nobody is in any doubt that the other way is valid: even though foo may have placed an int object in those bytes, you can freely access that object as a sequence of char values, by the last dash of §3.10/10.
So with this (usual) interpretation, after foo has placed an int there, we can access it as char objects, so at least one char object exists within the memory region named c; and we can access it as int, so at least that one int exists there also; and so David’s assertion in another answer that char objects cannot be accessed as int, is incompatible with this usual interpretation.
David's assertion is also incompatible with the most common use of placement new.
Regarding what other possible interpretations there are, that perhaps could be compatible with David's assertion, well, I can't think of any that make sense.
So in conclusion, as far as the Holy Standard is concerned, merely casting oneself a T* pointer to the array is practically useful or not depending on the compiler, and accessing the pointed to could-be-value is valid or not depending on what's present. In particular, think of a trap representation of int: you would not want that blowing up on you, if the bitpattern happened to be that. So to be safe you have to know what's in there, the bits, and as the call to foo above illustrates the compiler can in general not know that, like, the g++ compiler's strict alignment-based optimizer can in general not know that…