Suppose I have two classes with identical members from two different libraries:
namespace A {
struct Point3D {
float x,y,z;
};
}
namespace B {
struct Point3D {
float x,y,z;
};
}
When I try cross-casting, it worked:
A::Point3D pa = {3,4,5};
B::Point3D* pb = (B::Point3D*)&pa;
cout << pb->x << " " << pb->y << " " << pb->z << endl;
Under which circumstances is this guaranteed to work? Always? Please note that it would be highly undesirable to edit an external library to add an alignment pragma or something like that. I'm using g++ 4.3.2 on Ubuntu 8.10.
If the structs you are using are just data and no inheritance is used I think it should always work.
As long as they are POD it should be ok.
http://en.wikipedia.org/wiki/Plain_old_data_structures
According to the standard(1.8.5)
"Unless it is a bit-field (9.6), a most derived object shall have a non-zero size and shall occupy one or more bytes of
storage. Base class subobjects may have zero size. An object of POD5)
type (3.9) shall occupy contiguous bytes of
storage."
If they occupy contiguous bytes of storage and they are the same struct with different name, a cast should succeed
If two POD structs start with the same sequence of members, the standard guarantees that you'll be able to access them freely through a union. You can store an A::Point3D in a union, and then read from the B::Point3D member, as long as you're only touching the members that are part of the initial common sequence. (so if one struct contained int, int, int, float, and the other contained int, int, int, int, you'd only be allowed to access the three first ints).
So that seems like one guaranteed way in which your code should work.
It also means the cast should work, but I'm not sure if this is stated explicitly in the standard.
Of course all this assumes that both structs are compiled with the same compiler to ensure identical ABI.
This line should be :
B::Point3D* pb = (B::Point3D*)&pa;
Note the &. I think what you are doing is a reinterpret_cast between two pointers. In fact you can reinterpret_cast any pointer type to another one, regardless of the type of the two pointers. But this unsafe, and not portable.
For example,
int x = 5;
double* y = reinterpret_cast<double*>(&x);
You are just going with the C-Style, So the second line is actually equal to:
double* z = (double*)&x;
I just hate the C-Style when casting because you can't tell the purpose of the cast from one look :)
Under which circumstances is this
guaranteed to work?
This is not real casting between types. For example,
int i = 5;
float* f = reinterpret_cast<float*>(&i);
Now f points to the same place that i points to. So, no conversion is done. When you dereference f, you will get the a float with the same binary representation of the integer i. It is four bytes on my machine.
The following is pretty safe:
namespace A {
struct Point3D {
float x,y,z;
};
}
namespace B {
typedef A::Point3D Point3D;
}
int main() {
A::Point3D a;
B::Point3D* b = &a;
return 0;
}
I know exactly it wouldn't work:
both struct has differen alignment;
compiled with different RTTI options
may be some else...
Related
Consider following code:
union U
{
int a;
float b;
};
int main()
{
U u;
int *p = &u.a;
*(float *)p = 1.0f; // <-- this line
}
We all know that addresses of union fields are usually same, but I'm not sure is it well-defined behavior to do something like this.
So, question is: Is it legal and well-defined behavior to cast and dereference a pointer to union field like in the code above?
P.S. I know that it's more C than C++, but I'm trying to understand if it's legal in C++, not C.
All members of a union must reside at the same address, that is guaranteed by the standard. What you are doing is indeed well-defined behavior, but it shall be noted that you cannot read from an inactive member of a union using the same approach.
Accessing inactive union member - undefined behavior?
Note: Do not use c-style casts, prefer reinterpret_cast in this case.
As long as all you do is write to the other data-member of the union, the behavior is well-defined; but as stated this changes which is considered to be the active member of the union; meaning that you can later only read from that you just wrote to.
union U {
int a;
float b;
};
int main () {
U u;
int *p = &u.a;
reinterpret_cast<float*> (p) = 1.0f; // ok, well-defined
}
Note: There is an exception to the above rule when it comes to layout-compatible types.
The question can be rephrased into the following snippet which is semantically equivalent to a boiled down version of the "problem".
#include <type_traits>
#include <algorithm>
#include <cassert>
int main () {
using union_storage_t = std::aligned_storage<
std::max ( sizeof(int), sizeof(float)),
std::max (alignof(int), alignof(float))
>::type;
union_storage_t u;
int * p1 = reinterpret_cast< int*> (&u);
float * p2 = reinterpret_cast<float*> (p1);
float * p3 = reinterpret_cast<float*> (&u);
assert (p2 == p3); // will never fire
}
What does the Standard (n3797) say?
9.5/1 Unions [class.union]
In a union, at most one of the non-static data members can be
active at any time, that is, the value of at most one of the
non-static dat amembers ca nbe stored in a union at any time.
[...] The size of a union is sufficient to contain the largest of
its non-static data members. Each non-static data member is
allocated as if it were the sole member of a struct. All non-static data members of a union object have the same address.
Note: The wording in C++11 (n3337) was underspecified, even though the intent has always been that of C++14.
Yes, it is legal. Using explicit casts, you can do almost anything.
As other comments have stated, all members in a union start at the same address / location so casting a pointer to a different member is pointless.
The assembly language will be the same. You want to make the code easy to read so I don't recommend the practice. It is confusing and there is no benefit.
Also, I recommend a "type" field so that you know when the data is in float format versus int format.
Is it OK to cast a double array to a struct made of doubles?
struct A
{
double x;
double y;
double z;
};
int main (int argc , char ** argv)
{
double arr[3] = {1.0,2.0,3.0};
A* a = static_cast<A*>(static_cast<void*>(arr));
std::cout << a->x << " " << a->y << " " << a->z << "\n";
}
This prints 1 2 3. But is it guaranteed to work every time with any compiler?
EDIT: According to
9.2.21: A pointer to a standard-layout struct object, suitably converted ? using a reinterpret_cast, points to its initial member (...) and vice versa.
if I replace my code with
struct A
{
double & x() { return data[0]; }
double & y() { return data[1]; }
double & z() { return data[2]; }
private:
double data[3];
};
int main (int, char **)
{
double arr[3] = {1.0,2.0,3.0};
A* a = reinterpret_cast<A*>(arr);
std::cout << a->x() << " " << a->y() << " " << a->z() << "\n";
}
then it is guaranteed to work. Correct? I understand that many people would not find this aesteticaly pleasing but there are advantages in working with a struct and not having to copy the input array data. I can define member functions in that struct to compute scalar and vector products, distances etc, that will make my code much easier to understand than if I work with arrays.
How about
int main (int, char **)
{
double arr[6] = {1.0,2.0,3.0,4.0,5.0,6.0};
A* a = reinterpret_cast<A*>(arr);
std::cout << a[0].x() << " " << a[0].y() << " " << a[0].z() << "\n";
std::cout << a[1].x() << " " << a[1].y() << " " << a[1].z() << "\n";
}
Is this also guaranteed to work or the compiler could put something AFTER the data members so that sizeof(A) > 3*sizeof(double)? And is there any portable way to prevent the compiler from doing so?
No, it's not guaranteed.
The only thing prohibiting any compiler from inserting padding between x and y, or between y and z is common sense. There is no rule in any language standard that would disallow it.
Even if there is no padding, even if the representation of A is exactly the same as that of double[3], then it's still not valid. The language doesn't allow you to pretend one type is really another type. You're not even allowed to treat an instance of struct A { int i; }; as if it's a struct B { int i; };.
The standard gives little guarantees about memory layout of objects.
For classes/structs:
9.2./15: Nonstatic data members of a class with the same access control are allocated so that later members have higher addresses
within a class object. The order of allocation of non-static data
members with different access control is unspecified. Implementation
alignment requirements might cause two adjacent members not to be allocated immediately after each other; so might requirements for
space for managing virtual functions and virtual base classes.
For arrays, the elements are contiguous. Nothing is said about alignment, so it may or may not use same alignment rules than in struct :
8.3.4: An object of array type contains a contiguously allocated non-empty set of N subobjects of type T.
The only thing you can be sure of in your specific example is that a.x corresponds to arr[0], if using a reinterpret_cast:
9.2.21: A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its initial member (...)
and vice versa. [
>
No it is not guaranteed, even if it should work with all compilers I know on common architectures, because C language specification says :
6.2.6 Representations of types
6.2.6.1 General1 The representations of all types are unspecified except as stated in this subclause. And it says nothing on the default padding in a struct.
Of course, common architectures use at most 64bits which is the size of a double on those architecture, so there should be no padding and your conversion should work.
But beware : you are explicitely invoking Undefined Behaviour, and next generation of compilers could do anything when compiling such a cast.
From all I know the answer is: yes.
The only thing that could throw you off is a #pragma directive with some very unusual alignment setting for the struct. If for example a double takes 8 bytes on your machine and the #pragma directive tells to align every member on 16-byte boundaries that could cause problems. Other than that you are fine.
std::complex implementation of msvc use the array solution, and llvm libc++ use the former form.
I think, just check the implementation of the std::complex of your libc++, and use the same solution with it.
I disagree with the consensus here. A struct with three doubles in it, is exactly the same as an array with 3 doubles in it. Unless you specifically pack the struct differently and are on a weird processor that has an odd number of bytes for doubles.
It's not built into the language, but I would feel safe in doing it. Style wise I wouldn't do it, because it's just confusing.
Was just reading about some anonymous structures and how it is isn't standard and some general use case for it is undefined behaviour...
This is the basic case:
struct Point {
union {
struct {
float x, y;
};
float v[2];
};
};
So writing to x and then reading from v[0] would be undefined in that you would expect them to be the same but it may not be so.
Not sure if this is in the standard but unions of the same type...
union{ float a; float b; };
Is it undefined to write to a and then read from b ?
That is to say does the standard say anything about binary representation of arrays and sequential variables of the same type.
The standard says that reading from any element in a union other
than the last one written is undefined behavior. In theory, the
compiler could generate code which somehow kept track of the
reads and writes, and triggered a signal if you violated the
rule (even if the two are the same type). A compiler could also
use the fact for some sort of optimization: if you write to a
(or x), it can assume that you do not read b (or v[0])
when optimizing.
In practice, every compiler I know supports this, if the union
is clearly visible, and there are cases in many (most?, all?)
where even legal use will fail if the union is not visible
(e.g.:
union U { int i; float f; };
int f( int* pi, int* pf ) { int r = *pi; *pf = 3.14159; return r; }
// ...
U u;
u.i = 1;
std::cout << f( &u.i, &u.f );
I've actually seen this fail with g++, although according to the
standard, it is perfectly legal.)
Also, even if the compiler supports writing to Point::x and
reading from Point::v[0], there's no guarantee that Point::y
and Point::v[1] even have the same physical address.
The standard requires that in a union "[e]ach data member is allocated as if it were the sole member of a struct." (9.5)
It also requires that struct { float x, y; } and float v[2] must have the same internal representation (9.2) and thus you could safely reinterpret cast one as the other
Taken together these two rules guarantee that the union you describe will function provided that it is genuinely written to memory. However, because the standard only requires that the last data member written be valid it's theoretically possible to have an implementation that fails if the union is only used as a local variable. I'd be amazed if that ever actually happens, however.
I did not get why you have used float v[2];
The simple union for a point structure can be defined as:
union{
struct {
float a;
float b;
};
} Point;
You can access the values in unioin as:
Point.a = 10.5;
point.b = 12.2; //example
I'm trying to cast a pointer to another type using reinterpret_cast
class MyClassA
{
int x;
int y;
public:
MyClassA();
~MyClassA();
};
class MyClassB
{
int x;
int y;
public:
MyClassB();
~MyClassB();
};
For example, if I cast a pointer to MyClassA to MyClassB, using reinterpret_cast would this conversion work? what about code portability?
And also, as I noted:
(5.2.10/4) A pointer can be explicitly converted to any integral type
large enough to hold it.
Does it mean any pointer e.g MyClassA* can only be converted to int* pointer? if i'm correct?
(5.2.10/4) A pointer can be explicitly converted to any integral type large enough to hold it.
This means int* to int. Pointers aren't integral types.
Casting pointers of unrelated types is bad news and shouldn't be done. It may work, but is not portable.
Inheritance will work though. Have the types inherit from a base class!
As to whether this works or not I do not know, that would be implementation dependant.
However there are no guarantee by the standard that this would work (so don't do it), MyClassA and MyClassB are two separate types that are non compatible even if they are structurally the same.
Same applies to int*.
if you need conversion between then then you can make an A to B assignment operator
class MyClassA{
...
operator=(const MyClassB& mcb)
{
this.x=mcb.x;
this.y=mcb.y;
}
};
And if you need access to an integer element inside of MyClassA.
//in MyClassA
int& getX(){ return x; }
const int& getX() const { return x; }
int& x=mca.getX();
That doesn't mean you can cast unrelated types. It means you can cast a pointer to a 64bit integer (for example) if on your platform 64bit integer is sufficiently big enough to contain a pointer value.
PS: I would like to mention that what you have done is totally deviating from the standard but you might get away with it on many platforms / compilers :)
My answer to the first is yes. A pointer is just a variable (integral) whose value is an address to another memory location. Which is why you are able to do the following.
(*(void(*)())0)(); /*Ref: Andrew Koenig C Traps and Pitfals*/
or
#define CARD_INTERFACE_ADDRESS 0x4801C
struct A * = new (CARD_INTERFACE_ADDRESS) struct A
MyClassA * ptr_a cannot (logically) be converted to int* ptr_i because what you are saying is that the the address held by the variable ptr_a is the address of MyClassA in the former while in the latter you are saying that saying that ptr_i is the address of an int.
I have a testing struct definition as follows:
struct test{
int a, b, c;
bool d, e;
int f;
long g, h;
};
And somewhere I use it this way:
test* t = new test; // create the testing struct
int* ptr = (int*) t;
ptr[2] = 15; // directly manipulate the third word
cout << t->c; // look if it really affected the third integer
This works correctly on my Windows - it prints 15 as expected, but is it safe? Can I be really sure the variable is on the spot in memory I want it to be - expecially in case of such combined structs (for example f is on my compiler the fifth word, but it is a sixth variable)?
If not, is there any other way to manipulate struct members directly without actually having struct->member construct in the code?
It looks like you are asking two questions
Is it safe to treat &test as a 3 length int arrray?
It's probably best to avoid this. This may be a defined action in the C++ standard but even if it is, it's unlikely that everyone you work with will understand what you are doing here. I believe this is not supported if you read the standard because of the potential to pad structs but I am not sure.
Is there a better way to access a member without it's name?
Yes. Try using the offsetof macro/operator. This will provide the memory offset of a particular member within a structure and will allow you to correctly position a point to that member.
size_t offset = offsetof(mystruct,c);
int* pointerToC = (int*)((char*)&someTest + offset);
Another way though would be to just take the address of c directly
int* pointerToC = &(someTest->c);
No you can't be sure. The compiler is free to introduce padding between structure members.
To add to JaredPar's answer, another option in C++ only (not in plain C) is to create a pointer-to-member object:
struct test
{
int a, b, c;
bool d, e;
int f;
long g, h;
};
int main(void)
{
test t1, t2;
int test::*p; // declare p as pointing to an int member of test
p = &test::c; // p now points to 'c', but it's not associating with an object
t1->*p = 3; // sets t1.c to 3
t2->*p = 4; // sets t2.c to 4
p = &test::f;
t1->*p = 5; // sets t1.f to 5
t2->*p = 6; // sets t2.f to 6
}
You are probably looking for the offsetof macro. This will get you the byte offset of the member. You can then manipulate the member at that offset. Note though, this macro is implementation specific. Include stddef.h to get it to work.
It's probably not safe and is 100% un-readable; thus making that kind of code unacceptable in real life production code.
use set methods and boost::bind for create functor which will change this variable.
Aside from padding/alignment issues other answers have brought up, your code violates strict aliasing rules, which means it may break for optimized builds (not sure how MSVC does this, but GCC -O3 will break on this type of behavior). Essentially, because test *t and int *ptr are of different types, the compiler may assume they point to different parts of memory, and it may reorder operations.
Consider this minor modification:
test* t = new test;
int* ptr = (int*) t;
t->c = 13;
ptr[2] = 15;
cout << t->c;
The output at the end could be either 13 or 15, depending on the order of operations the compiler uses.
According to paragraph 9.2.17 of the standard, it is in fact legal to cast a pointer-to-struct to a pointer to its first member, providing the struct is POD:
A pointer to a POD-struct object,
suitably converted using a
reinterpret_cast, points to its
initial member (or if that member is a
bit-field, then to the unit in which
it resides) and vice versa. [Note:
There might therefore be unnamed
padding within a POD-struct object,
but not at its beginning, as necessary
to achieve appropriate alignment. ]
However, the standard makes no guarantees about the layout of structs -- even POD structs -- other than that the address of a later member will be greater than the address of an earlier member, provided there are no access specifiers (private:, protected: or public:) between them. So, treating the initial part of your struct test as an array of 3 integers is technically undefined behaviour.