Is a struct of one element compatible with the element itself? - c++

If I have the following struct:
struct Foo { int a; };
Does the code below conform to the C++ Standard? I mean, can any of it result in undefined behavior?
Foo foo;
int ifoo;
foo = *reinterpret_cast<Foo*>(&ifoo);
void bar(int value);
bar(*reinterpret_cast<int*>(&foo));
auto fptr = static_cast<void(*)(...)>(&bar);
fptr(foo);

9.2/20 in N3290 says
A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa.
And your Foo is a standard-layout class.
So your second cast is correct.
I see no guarantee that the first one is correct (and I've used architectures where a char had a weaker alignment restriction than a struct containing just a char; on such an architecture, it would be problematic). What the standard guarantees is that if you have a pointer to int which really points to the first element of a struct, you can reinterpret_cast it back to a pointer to the struct.
Likewise, I see nothing which would make your third one defined if it were a reinterpret_cast (I'm pretty sure that some ABIs use different conventions to pass structs and basic types, so it is highly suspicious and I'd need a clear mention in the standard to accept it), and I'm quite sure that nothing allows a static_cast between pointers to functions.
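To make the direction covered by 9.2/20 concrete, here is a minimal sketch, assuming the Foo from the question. The helper names first_member, owner_of and round_trip are made up for illustration and are not part of the question.

```cpp
#include <cassert>
#include <type_traits>

struct Foo { int a; };

// The quoted rule applies only to standard-layout classes.
static_assert(std::is_standard_layout<Foo>::value, "Foo must be standard-layout");

// Struct pointer -> pointer to its initial member (the case 9.2/20 covers).
int* first_member(Foo* f) { return reinterpret_cast<int*>(f); }

// Only valid when p really points at the 'a' member of a live Foo.
Foo* owner_of(int* p) { return reinterpret_cast<Foo*>(p); }

// Goes through both conversions and returns the value read back via the struct pointer.
int round_trip(int value) {
    Foo foo{value};
    int* p = first_member(&foo);
    assert(p == &foo.a);   // same address as the member itself
    *p += 1;               // writes foo.a
    return owner_of(p)->a;
}
```

Note that owner_of is never applied to a standalone int such as ifoo; that is the case for which I see no guarantee.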

As long as you access only the first element of a struct, it's considered to be safe, since there's no padding before the first member of a struct. In fact, this trick is used, for example, in the Objective-C runtime, where a generic pointer type is defined as:
typedef struct objc_object {
Class isa;
} *id;
and at runtime, real objects (which are still bare struct pointers) have memory layouts like this:
struct {
Class isa;
int x; // random other data as instance variables
} *CustomObject;
and the runtime accesses the class of an actual object using this method.

Foo is a plain-old-data structure, which means it contains nothing but the data you explicitly store in it. In this case: an int.
Thus the memory layout for an int and Foo are the same.
You can typecast from one to the other without problems. Whether it's a clever idea to use this kind of stuff is a different question.
PS:
This usually works, but not necessarily due to different alignment restrictions. See AProgrammer's answer.
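If all you need is to move the value between an int and a Foo, copying the bytes sidesteps the alignment concern mentioned above. This is only a sketch: it assumes Foo has no padding (checked by the static_assert), and to_foo / to_int are hypothetical helper names.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

struct Foo { int a; };

static_assert(std::is_trivially_copyable<Foo>::value, "memcpy needs a trivially copyable type");
static_assert(sizeof(Foo) == sizeof(int), "this sketch assumes Foo has no padding");

// Copies the bytes of an int into a fresh Foo; no pointer is ever reinterpreted.
Foo to_foo(int value) {
    Foo foo;
    std::memcpy(&foo, &value, sizeof value);
    assert(foo.a == value);   // sanity check for the no-padding assumption
    return foo;
}

// Copies the bytes of a Foo back out into an int.
int to_int(const Foo& foo) {
    int value;
    std::memcpy(&value, &foo, sizeof value);
    return value;
}
```

For a one-member struct this is of course no better than writing foo.a directly; it is shown only as the cast-free counterpart of the code in the question.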

Related

What type of C++ cast should be used to cast a pointer to a struct to its first member?

X is defined as the following:
struct X
{
Y y;
// more fields...
int a;
};
I have a variable of type X. However, I would like to cast it to the type of its first member, in order to pass that into a function. I know that the C Standard permits it (and I suppose the C++ one does so as well).
In C I would do it like so:
X x;
Y* y = (Y*) &x;
doStuff(y);
What type of cast is the right one in C++ for this? static_cast or reinterpret_cast?
None.
You can't mess around with objects using pointers like that. C++ is not C, and these are not "just bytes" (contrary to popular belief).
And you don't need to!
Pass &x.y instead; it's already the Y* you want.
I'd always recommend using static_cast instead of reinterpret_cast in any situation where the static_cast isn't rejected by the compiler. If possible try to avoid doing any casting at all - in this case you probably want: Y* y = &x.y.
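A small sketch of the cast-free call, assuming simple stand-ins for X and Y (the question does not show their real definitions) and a hypothetical doStuff that just increments a field:

```cpp
#include <cassert>

struct Y { int id; };
struct X {
    Y y;     // first member
    int a;
};

// Hypothetical stand-in for the question's doStuff.
int doStuff(Y* y) { return ++y->id; }

int call_without_cast() {
    X x{{1}, 0};
    int result = doStuff(&x.y);   // &x.y is already a Y*, no cast involved
    assert(x.y.id == result);     // the callee modified x through the member pointer
    return result;
}
```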
To answer the comment:
In this case, I have a PROCESS_MEMORY_COUNTERS_EX
variable. However the WinAPI function GetProcessMemoryInfo takes a
PROCESS_MEMORY_COUNTERS*. The former type starts with the exact same
fields as the latter, and adds a few at the end. The intended usage is
to pass into the function a pointer to the latter type, even if we
hold a pointer to the former (larger) type.
The documentation for GetProcessMemoryInfo() states that the second parameter is:
A pointer to the PROCESS_MEMORY_COUNTERS or PROCESS_MEMORY_COUNTERS_EX
structure that receives information about the memory usage of the
process.
The Win32 API is a C API, and not a C++ one, so you can just use a C style cast here, or preferably a reinterpret_cast to make your intention clearer. I'd expect static_cast to be rejected by the compiler in this case. Note that the third cb parameter is there to tell the function which type of structure you actually provided - it should be set to either sizeof(PROCESS_MEMORY_COUNTERS) or sizeof(PROCESS_MEMORY_COUNTERS_EX).
@FredLarson In this case, I have a PROCESS_MEMORY_COUNTERS_EX variable. However the WinAPI function GetProcessMemoryInfo takes a PROCESS_MEMORY_COUNTERS*. The former type starts with the exact same fields as the latter, and adds a few at the end. The intended usage is to pass into the function a pointer to the latter type, even if we hold a pointer to the former (larger) type.
The cleanest way to accomplish that would probably be (ab)using inheritance. This way, you can have your _EX type share members with the base type while actually being an instance of it.
struct X {
int a;
int b;
};
struct X_EX : public X {
int other_member;
};
void doStuff(X*);
void foo(X_EX* ptr) {
doStuff(ptr);
}
However, do note that "Inherit-to-extend" is seen as a code smell, and something to avoid if possible nowadays. I'd make sure to put a comment explaining why it's necessary here.

Odd usage of special pointer values

I am using a C++ implementation of an algorithm which makes odd use of special pointer values, and I would like to know how safe and portable this is.
First, there is some structure containing a pointer field. It initializes an array of such structures by zeroing the array with memset(). Later on, the code relies on the pointer fields initialized that way to compare equal to NULL; wouldn't that fail on a machine whose internal representation of the NULL pointer is not all-bits-zero?
Subsequently, the code sets some pointers to, and later compares some pointers against, specific pointer values, namely ((type*) 1) and ((type*) 2). Clearly, these pointers are meant to be flags and are not supposed to be dereferenced. But can I be sure that no genuine valid pointer would compare equal to one of these? Is there any better (safe, portable) way to do that (i.e. use specific pointer values that can be taken by pointer variables only through explicit assignment, in order to flag specific situations)?
Any comment is welcome.
To sum up the comments I received, both issues raised in the question are indeed expected to work on a "usual" setup, but come with no guarantee.
Now if I want absolute guarantees, it seems my best option is, for the NULL pointers, to set them either manually or with a proper constructor, and for the special pointer values, to manually create sentinel pointer values.
For the latter, in a C++ class I guess the most elegant solution is to use static members
class The_class
{
static const type reserved;
static const type* const sentinel;
};
provided that they can be initialized somewhere:
const type The_class::reserved = foo; // 'foo' is a constant expression of type 'type'
const type* const The_class::sentinel = &The_class::reserved;
If type is templated, either the above initialization must be instantiated for each type intended, or one must resort to non-static (less elegant but still useful) "reserved" and "sentinel" members.
template <typename type>
class The_class
{
type reserved; // cannot be static anymore, nor const for complicated 'type' without adapted constructor
const type* const sentinel;
public:
The_class() : sentinel(&reserved) {}
};
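Putting both ideas together, here is a minimal sketch. It assumes a simple linked Node type, which the original algorithm does not show; visited_marker, deleted_marker, VISITED, DELETED and the helper functions are names invented for illustration.

```cpp
#include <cassert>

struct Node {
    int value;
    Node* next;

    // The addresses of these two objects serve as flags.
    // No genuine node can share an address with them.
    static Node visited_marker;
    static Node deleted_marker;
};
Node Node::visited_marker = {0, nullptr};
Node Node::deleted_marker = {0, nullptr};

Node* const VISITED = &Node::visited_marker;
Node* const DELETED = &Node::deleted_marker;

bool is_flag(const Node* p) { return p == VISITED || p == DELETED; }

void mark_visited(Node& n) {
    assert(!is_flag(&n));   // never mark a sentinel object itself
    n.next = VISITED;
}

// Value-initialization gives every pointer member a null pointer value,
// whatever its bit pattern is; memset() does not guarantee that.
bool fresh_nodes_are_null() {
    Node nodes[4] = {};
    for (const Node& n : nodes)
        if (n.next != nullptr) return false;
    return true;
}
```

Compared with ((type*) 1) and ((type*) 2), the cost is two extra objects of the type, but the flag values are then ordinary valid pointers.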

Does the "cast to first member of standard layout" type punning rule extend to arrays?

Specifically, I am wrapping a C API in a friendly C++ wrapper. The C API has this fairly standard shape:
struct foo {...};
void get_foos(size_t* count, foo* dst);
And what I'd like to do is save myself an extra copy by passing a type-punned wrapper array directly to the C API, with a bunch of sanity-checking static_assert()s.
class fooWrapper {
foo raw_;
public:
[...]
};
std::vector<fooWrapper> get_foo_vector() {
size_t count = 0;
get_foos(&count, nullptr);
std::vector<fooWrapper> result(count);
// Is this OK?
static_assert(sizeof(foo) == sizeof(fooWrapper), "");
static_assert(std::is_standard_layout<fooWrapper>::value, "");
get_foos(&count, reinterpret_cast<foo*>(result.data()));
return result;
}
My understanding is that it is valid code, since all accessed memory locations individually qualify under the rule, but I'd like confirmation on that.
Edit: Obviously, as long as reinterpret_cast<char*>(result.data() + n) == reinterpret_cast<char*>(result.data()) + n*sizeof(foo) is true, it'll work under all major compilers today. But I'm wondering if the standard agrees.
First, this is not type punning. The reinterpret_cast you're doing is just an overly verbose way of writing &result.data()->raw_. Type punning is accessing an object of one type through a pointer/reference to another type. You're accessing a subobject of the other type.
Second, this doesn't work. Pointer arithmetic is based on having an array (a single object acts as an array of 1 element for the purposes of pointer arithmetic). And vector<T> is defined by fiat to produce an array of Ts. But an array of T is not equivalent to an array of some subobject of T, even if that subobject is the same size as T and T is standard layout.
Therefore, if get_foos performs pointer arithmetic on its given array of foos, that's UB. Oh sure, it will almost certainly work. But the language's answer is UB.
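For comparison, a sketch of the version that stays within the rules by letting the C API fill a genuine array of foo, at the price of the extra copy the question wanted to avoid. The body of get_foos here is a hypothetical stand-in (the real API is not shown), and the fooWrapper constructor and id() accessor are assumptions as well.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct foo { int id; };

// Hypothetical stand-in for the C API: with dst == nullptr it only reports
// how many items exist; otherwise it fills at most *count items.
void get_foos(std::size_t* count, foo* dst) {
    const std::size_t available = 3;
    if (dst == nullptr) {
        *count = available;
        return;
    }
    std::size_t n = *count < available ? *count : available;
    for (std::size_t i = 0; i < n; ++i) dst[i].id = static_cast<int>(i) * 10;
    *count = n;
}

class fooWrapper {
    foo raw_;
public:
    explicit fooWrapper(const foo& raw) : raw_(raw) {}
    int id() const { return raw_.id; }
};

std::vector<fooWrapper> get_foo_vector() {
    std::size_t count = 0;
    get_foos(&count, nullptr);
    std::vector<foo> raw(count);      // a real array of foo, so pointer arithmetic is fine
    get_foos(&count, raw.data());
    assert(count <= raw.size());
    raw.resize(count);
    // One extra copy: each wrapper is constructed from the matching foo.
    return std::vector<fooWrapper>(raw.begin(), raw.end());
}
```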

Are void* pointer and pointer to some structure (layout-) compatible?

In other words, may I reinterpret (not convert!) a void* pointer as a pointer to some structure type (assuming that the void* pointer really holds a properly converted, valid structure address)?
Actually I'm interested in the following scenario:
typedef struct void_struct void_struct_t;
typedef struct somestruct
{
int member;
// ... other members ...
}somestruct_t;
union
{
void* pv;
void_struct_t* pvs;
somestruct_t* ps;
}u;
somestruct_t s={};
u.pv= &s;
u.ps->member=1; // (Case 1) Ok? unspecified? UB?
u.pvs=(void_struct_t*)&s;
u.ps->member=1; // (Case 2) Ok?
What I found in the C11 standard is rather disappointing for Case 1:
§6.2.5
28 A pointer to void shall have the same representation and alignment requirements as a
pointer to a character type.[footnote: The same representation and alignment requirements
are meant to imply interchangeability as arguments to functions, return values from
functions, and members of unions.] Similarly, pointers to qualified or unqualified
versions of compatible types shall have the same representation and alignment
requirements. All pointers to structure types shall have the same representation and
alignment requirements as each other. All pointers to union types shall have the same
representation and alignment requirements as each other. Pointers to other types need not
have the same representation or alignment requirements.
It seems, though, that Case 2 is valid, but I'm not 100% sure...
The question is mostly C-oriented, but I'm interested in C++ too (I want the code to be valid when compiled by a C++ compiler). Honestly, I found even less in the C++11 standard, so even Case 2 seems questionable to me... however, maybe I'm missing something.
[edit]
What is the real problem behind this question?
I have a (potentially large) set of types defined as structs.
For each type I need to define a companion type:
typedef struct companion_for_sometype
{
sometype* p_object;
// there are also other members
}companion_for_sometype;
Obviously, the companion type would be a template in C++, but I need a solution for C
(more exactly, for "clean C", i.e for intersection of C89 and C++ as I want my code to be also valid C++ code).
Fortunately, it is not a problem even in C, since I can define a macro
#define DECLARE_COMPANION(type_name) typedef struct companion_for_##type_name \
{ \
type_name* p_object; \
/* there are also other members */ \
}companion_for_##type_name;
and just invoke it for every type that needs a companion.
There is also a set of generic operations on companion types.
These operations are also defined by macros (since there are no overloads in pure C).
One of these operations, say
#define op(companion_type_object) blablabla
should assign a void* pointer to p_object field of the companion object,
i.e. should do something like this:
(companion_type_object).p_object= (type_name*) some_function_returning_pvoid(..)
But the macro doesn't know type_name (only an object of companion type is passed to the macro)
so the macro can't do the appropriate pointer cast.
The question is actually inspired by this problem.
To solve it, I decided to reinterpret the target pointer in the assignment as void* and then assign to it.
It may be done by replacing the pointer in the companion declaration with a union of pointers
(the question is about this case), or one may reinterpret target pointer directly, say:
*(void**) &(companion_type_object).p_object= some_function_returning_pvoid(..)
But I can't find any solution without reinterpreting pointers (maybe I'm missing some possibilities though)
void * is a pointer that can hold any object pointer type, that includes all pointers to structure type. So you can assign any pointer to a structure type to a void *.
But void * and pointers to structure types are not guaranteed to have the same representation so your case 1 is undefined behavior.
(C11, 6.2.5p28) "[...] Pointers to other types need not have the same
representation or alignment requirements."
In C, void * converts implicitly to any object pointer type, so this will work:
(companion_type_object).p_object = some_function_returning_pvoid(..)
In C++, you need to use static_cast, but you can find out the required type using decltype (applied to the member itself, since decltype of the dereferenced pointer would be a reference type):
(companion_type_object).p_object =
static_cast<decltype((companion_type_object).p_object)>(
some_function_returning_pvoid(..))
In C++03 you should be able to use some compiler extension equivalent to decltype. Alternatively, you could provide a macro-generated method on companion_type_object to cast a void * to the appropriate type:
static type_name *void_p_to_object_p(void *p) { return static_cast<type_name *>(p); }
...
(companion_type_object).p_object = companion_type_object.void_p_to_object_p(
some_function_returning_pvoid(..))
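A self-contained C++11 sketch of the decltype approach, to show that the macro does not need to know type_name. ASSIGN_OBJECT, widget and lookup_object are hypothetical names standing in for the question's op macro, one of its struct types, and some_function_returning_pvoid.

```cpp
#include <cassert>

// Companion declaration in the spirit of the question's macro.
#define DECLARE_COMPANION(type_name)            \
    typedef struct companion_for_##type_name {  \
        type_name* p_object;                    \
    } companion_for_##type_name;

// decltype names the declared type of the member (a pointer type),
// so the macro only needs the companion object itself.
#define ASSIGN_OBJECT(companion, pvoid) \
    ((companion).p_object = static_cast<decltype((companion).p_object)>(pvoid))

struct widget { int id; };
DECLARE_COMPANION(widget)

// Hypothetical stand-in for some_function_returning_pvoid.
void* lookup_object(widget* w) { return w; }

int demo() {
    widget w = {5};
    companion_for_widget c = {nullptr};
    ASSIGN_OBJECT(c, lookup_object(&w));
    assert(c.p_object == &w);   // the void* was converted back to widget*
    return c.p_object->id;
}
```

This variant is C++ only; for the "clean C" build the macro body would have to be the plain assignment, selected with #ifdef __cplusplus.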

Container covariance in C++

I know that C++ doesn't support covariance for container elements, as in Java or C#. So the following code is probably undefined behavior:
#include <vector>
struct A {};
struct B : A {};
std::vector<B*> test;
std::vector<A*>* foo = reinterpret_cast<std::vector<A*>*>(&test);
Not surprisingly, I received downvotes when suggesting this as a solution to another question.
But what part of the C++ standard exactly tells me that this will result in undefined behavior? It's guaranteed that both std::vector<A*> and std::vector<B*> store their pointers in a contiguous block of memory. It's also guaranteed that sizeof(A*) == sizeof(B*). Finally, A* a = new B is perfectly legal.
So what bad spirits in the standard did I conjure (except style)?
The rule violated here is documented in C++03 3.10/15 [basic.lval], which specifies what is referred to informally as the "strict aliasing rule"
If a program attempts to access the stored value of an object through an lvalue of other than one of the following types the behavior is undefined:
the dynamic type of the object,
a cv-qualified version of the dynamic type of the object,
a type that is the signed or unsigned type corresponding to the dynamic type of the object,
a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union),
a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
a char or unsigned char type.
In short, given an object, you are only allowed to access that object via an expression that has one of the types in the list. For a class-type object that has no base classes, like std::vector<T>, basically you are limited to the types named in the first, second, and last bullets.
std::vector<Base*> and std::vector<Derived*> are entirely unrelated types and you can't use an object of type std::vector<Base*> as if it were a std::vector<Derived*>. The compiler could do all sorts of things if you violate this rule, including:
perform different optimizations on one than on the other, or
lay out the internal members of one differently, or
perform optimizations assuming that a std::vector<Base*>* can never refer to the same object as a std::vector<Derived*>*
use runtime checks to ensure that you aren't violating the strict aliasing rule
[It might also do none of these things and it might "work," but there's no guarantee that it will "work" and if you change compilers or compiler versions or compilation settings, it might all stop "working." I use the scare-quotes for a reason here. :-)]
Even if you just had a Base*[N] you could not use that array as if it were a Derived*[N] (though in that case, the use would probably be safer, where "safer" means "still undefined but less likely to get you into trouble").
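If what you actually need is a std::vector<A*> view of the same objects, the well-defined route is to build one by converting each pointer. A minimal sketch, assuming A has a virtual function so the result is observable; as_base, id and demo are names made up for this example.

```cpp
#include <cassert>
#include <vector>

struct A {
    virtual ~A() {}
    virtual int id() const { return 0; }
};
struct B : A {
    int id() const override { return 1; }
};

// Each B* converts implicitly to A*, so the range constructor
// produces a genuine std::vector<A*> with no reinterpret_cast.
std::vector<A*> as_base(const std::vector<B*>& derived) {
    return std::vector<A*>(derived.begin(), derived.end());
}

int demo() {
    B b1, b2;
    std::vector<B*> test = {&b1, &b2};
    std::vector<A*> bases = as_base(test);
    assert(bases.size() == test.size());
    return bases[0]->id() + bases[1]->id();   // virtual dispatch reaches B::id
}
```

The copy costs one pass over the pointers, and later changes to one vector are not reflected in the other.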
You are invoking the bad spirit of reinterpret_cast<>.
Unless you really know what you are doing (and I don't mean that proudly or pedantically), reinterpret_cast is one of the gates of evil.
The only safe use I know of is managing classes and structures between C++ and C function calls.
There may be some others, however.
The general problem with covariance in containers is the following:
Let's say your cast would work and be legal (it isn't but let's assume it is for the following example):
#include <vector>
struct A {};
struct B : A { public: int Method(int x, int z); };
struct C : A { public: bool Method(char y); };
std::vector<B*> test;
std::vector<A*>* foo = reinterpret_cast<std::vector<A*>*>(&test);
foo->push_back(new C);
test[0]->Method(7, 99); // What should happen here???
So you have also reinterpret-casted a C* to a B*...
Actually I don't know how .NET and Java manage this (I think they throw an exception when trying to insert a C).
I think it'll be easier to show than tell:
#include <cassert>
struct A { int a; };
struct Stranger { int a; };
struct B: Stranger, A {};
int main(int argc, char* argv[])
{
B someObject;
B* b = &someObject;
A* correct = b;
A* incorrect = reinterpret_cast<A*>(b);
assert(correct != incorrect); // troubling, isn't it ?
return 0;
}
The (specific) issue shown here is that when doing a "proper" conversion, the compiler adds some pointer adjustment depending on the memory layout of the objects. With a reinterpret_cast, no adjustment is performed.
I suppose you'll understand why the use of reinterpret_cast should normally be banned from the code...