C++ Object, Member's Memory Position Offset - c++

Is there a better method to establish the positional offset of an object's data member than the following?
class object
{
int a;
char b;
int c;
};
object * o = new object();
int offset = (unsigned char *)&(o->c) - (unsigned char *)o;
delete o;

In this case, your class is POD, so you can use the offsetof macro from <cstddef>.
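As a minimal sketch of the offsetof approach, assuming the members are made public so that the type is standard-layout (the struct name `record` and the constant names are made up for illustration):

```cpp
#include <cstddef>      // offsetof, std::size_t
#include <type_traits>  // std::is_standard_layout

// Hypothetical stand-in for the question's class, with public members so the
// type is standard-layout.
struct record {
    int a;
    char b;
    int c;
};

static_assert(std::is_standard_layout<record>::value,
              "offsetof is only guaranteed for standard-layout types");

// Byte offsets computed at compile time; no object is ever created.
constexpr std::size_t offset_of_b = offsetof(record, b);
constexpr std::size_t offset_of_c = offsetof(record, c);
```

The exact values depend on the platform's padding rules, so the sketch deliberately does not hard-code them.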
In practice, in most implementations, for most classes, you can use the same trick which offsetof typically uses:
int offset = (char *)&(((object *)0)->c) - (char *)0;
No need to actually create an object, although you may have to fight off some compiler warnings because this is not guaranteed to work.
Beware also that if your class has any multiple inheritance, then for (at least) all but one base, (void*)(base*)some_object != (void*)(derived*)some_object. So you have to be careful what you apply the offset to. As long as you calculate and apply it relative to a pointer to the class that actually defines the field (that is, don't try to work out the offset of a base class field from a derived class pointer) you'll almost certainly be fine. There are no guarantees about object layout, but most implementations do it the obvious way.
Technically for any non-POD class, it does not make sense to talk about "the offset" of a field from the base pointer. The difference between the pointers is not required to be the same for all objects of that class.
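The multiple-inheritance caveat can be observed with a small sketch. The type and function names here are hypothetical, and the result reflects what common ABIs do, not something the standard promises:

```cpp
#include <cassert>  // assert, for callers that want to check the result

struct left_base  { int l; };
struct right_base { int r; };
struct both : left_base, right_base { int b; };

// Returns true when the right_base subobject does not start at the same
// address as the complete object.
bool second_base_is_shifted() {
    both obj{};
    const void* as_derived = &obj;
    const void* as_right   = static_cast<right_base*>(&obj);
    return as_derived != as_right;
}
```

On mainstream compilers the second base is placed after the first, so an offset computed against a `right_base*` must not be applied to a `both*` without converting the pointer first.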

There's no need to actually create an object:
(size_t)&(((object*)0)->c)
That is, the address of a member in an object at address zero is the offset of that member into the object.
Of course, you will need access to the member, so you need to either make it a struct or add a public: access modifier.
This is how offsetof is implemented, at least for most compilers.

Rather than a raw pointer, you can use a pointer to member.
class X;
typedef int X::* PtrToMemberInt; // Declare a pointer to member.
class X
{
int a;
char b;
float c;
public:
static PtrToMemberInt getAPtr()
{
return &X::a;
}
int d;
};
int main()
{
// For private members you need to use a public method to get the address.
PtrToMemberInt aPtr = X::getAPtr();
// For public members you can take the address directly;
PtrToMemberInt dPtr = &X::d;
// Use the pointer like this:
X a;
a.*aPtr = 5;
a.*dPtr = 6;
}
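One place where a pointer to member is handy is selecting a field at run time without knowing its offset. A small sketch, assuming C++14 or later and using a hypothetical `point` type and `sum_field` helper:

```cpp
#include <cstddef>  // std::size_t

struct point {
    int x;
    int y;
};

// Sums whichever int member `field` designates across an array of points.
template <std::size_t N>
constexpr int sum_field(const point (&pts)[N], int point::* field) {
    int total = 0;
    for (std::size_t i = 0; i < N; ++i) {
        total += pts[i].*field;  // dereference the pointer to member
    }
    return total;
}
```

Calling `sum_field(pts, &point::x)` and `sum_field(pts, &point::y)` reuses the same loop for either member, with no casts and no byte arithmetic.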

To add to Martin's answer, as stated in "Design and Evolution of C++" by Stroustrup (section [13.11]):
pointers to data members have proven a
useful way of expressing the layout of
a C++ class in an implementation
[Hübel, 1992]
Sandeep has been so kind as to convert the original paper and make it available on http://sandeep.files.wordpress.com/2008/07/ps.pdf
Note that the implementation described predated C++ RTTI, but I occasionally still use the pointer-to-member stuff.

Related

Is the pointer from casting to base guaranteed to be a pointer into the memory region of the derived object?

Given this code:
#include <cassert>
#include <cstring>
struct base{
virtual ~base() = default;
};
class derived: public base{
public:
int x;
};
using byte = unsigned char;
int main() {
byte data[sizeof(derived)];
derived d;
memcpy(data, &d, sizeof(derived));
base* p = static_cast<base*>(reinterpret_cast<derived*>(data));
const auto offset = (long)data - (long)p;
assert(offset < sizeof(derived)); // <-- Is this defined?
}
As my comment asks, is this defined behavior by the standard? i.e does casting to base guarantee a pointer to the region occupied by the derived being cast? From my testing it works on gcc and clang, but I am wondering if it works cross platform too (obviously this version assumes 64bit pointers)
Is the pointer from casting to base guaranteed to be a pointer into the memory region of the derived object
Not necessarily in general. If the base is virtual, and the derived object in question isn't the most derived object, then in such case the virtual base may be outside the memory of the derived object.
But outside of that corner case, for example such as in the example code, the base sub object is indeed guaranteed to be within the derived object. That's what "sub object" implies.
potentially wrong alignment
your data array is a char array, so its alignment will be 1 byte.
your class however contains an int member, so its alignment will be at least 4 bytes.
So your data array is not sufficiently aligned to even contain a derived object.
You can easily fix this by providing an alignment of your data array that is at least that of derived or greater, e.g.:
alignas(alignof(derived)) byte data[sizeof(derived)];
(godbolt demonstrating the problem)
you can also use std::aligned_storage for this if you want.
using memcpy on classes is not always safe
Using memcpy on classes will only work if they're trivially copyable (so a byte-wise copy would be identical to calling the copy constructor). Your class isn't trivially copyable due to the virtual destructor, so memcpy isn't allowed to copy the class.
You can easily check this with std::is_trivially_copyable_v:
std::cout << std::is_trivially_copyable_v<derived> << std::endl;
You can fix this easily by calling the copy-constructor instead of using memcpy:
alignas(alignof(derived)) char data[sizeof(derived)];
derived d;
derived* derivedInData = new (data) derived(d);
virtual inheritance, multiple inheritance and other shenanigans
How classes will be laid out in memory is implementation-defined, so you basically have no guarantees how the compiler will lay out your class hierarchy.
However there are a few things you can count on:
sizeof(cls) will always return the size cls needs, including all its base classes, even when it uses virtual and / or multiple inheritance. (sizeof)
When applied to a class type, the result is the number of bytes occupied by a complete object of that class, including any additional padding required to place such object in an array.
placement new will construct an object and return a pointer to it that is within the given buffer.
static_cast<> to baseclass is always defined
the actual answer
Yes, the base class pointer must always point to somewhere within your buffer, since it's a part of the derived class. However where exactly it'll be in the buffer is implementation-defined, so you should not rely on it.
The same thing is true about the pointer returned from placement new - it might be to the beginning of the array or somewhere else (e.g. array allocation), but it'll always be within the data array.
So as long as you stick to one of those patterns:
struct base { int i; };
struct derived : base { int j; };
alignas(alignof(derived)) char data[sizeof(derived)];
derived d;
memcpy(data, &d, sizeof(derived)); // trivially copyable
derived* ptrD = reinterpret_cast<derived*>(data);
base* ptrB = static_cast<base*>(ptrD);
or
struct base { int i; virtual ~base() = default; };
struct derived : base { int j; };
alignas(alignof(derived)) char data[sizeof(derived)];
derived d;
derived* ptrD = new(data) derived(d); // not trivially copyable
base* ptrB = static_cast<base*>(ptrD);
ptrD->~derived(); // remember to call destructor
your assertions should hold and the code should be portable.
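The placement-new pattern can be turned into a small self-check. This is only a sketch: the function name is made up, and comparing addresses through `std::uintptr_t` relies on the usual flat-address mapping, which is implementation-defined.

```cpp
#include <cassert>      // assert
#include <cstdint>      // std::uintptr_t
#include <new>          // placement new
#include <type_traits>  // std::is_trivially_copyable

struct base { virtual ~base() = default; };
struct derived : base { int x = 0; };

static_assert(!std::is_trivially_copyable<derived>::value,
              "the virtual destructor rules out memcpy");

// Returns true when the base subobject's address falls inside the buffer.
bool base_lies_within_buffer() {
    alignas(derived) unsigned char data[sizeof(derived)];
    derived source;
    derived* d = new (data) derived(source);  // copy-construct in place
    // Non-array placement new returns the address it was given.
    assert(static_cast<void*>(d) == static_cast<void*>(data));

    base* b = static_cast<base*>(d);
    const auto begin = reinterpret_cast<std::uintptr_t>(data);
    const auto addr  = reinterpret_cast<std::uintptr_t>(b);
    const bool inside = addr >= begin && addr < begin + sizeof(derived);

    d->~derived();  // placement new needs an explicit destructor call
    return inside;
}
```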

do operator methods occupy memory in c++ objects?

Suppose I have some simple classes/structs without anything but data and a select few operators. If I understand, a basic struct with only data in C++, just like C, occupies as much memory as the members. For example,
struct SomeStruct { float data; };
sizeof(SomeStruct) == sizeof(float); // this should evaluate to true
What I'm wondering is if adding operators to the class will make the object larger in memory. For example
struct SomeStruct
{
public:
SomeStruct & operator=(const float f) { data = f; return *this; }
private:
float data;
};
will it still be true that sizeof(SomeStruct) == sizeof(float) evaluates to true? Are there any operators/methods which will not increase the size of the objects in memory?
The structure may not necessarily be only as large as its members (consider padding and alignment), but you are basically correct, in that:
Functions are not data, and are not "stored" inside the object type.
That said, watch out for the addition of virtual table pointers in the case where you add a virtual function to your type. This is a one-time size increase for the type, and does not re-apply when you add more virtual functions.
What I'm wondering is if adding operators to the class will make the object larger in memory.
The answer is "it depends".
If the class wasn't polymorphic prior to adding the function and this new function keeps the class non-polymorphic, then adding this non-polymorphic function does nothing to the size of your class instances.
On the other hand, if adding this new function does make your class polymorphic, this addition will make instances of your class bigger. Most C++ implementations use a virtual table, or vtable for short. Each instance of a polymorphic class contains a pointer to the vtable for that class. Instances of non-polymorphic classes don't need and thus don't contain a vtable pointer.
Finally, adding yet another virtual function to a class that is already polymorphic does not make the class instances bigger. This addition does make the vtable for that class bigger, but the vtable itself isn't a part of the instance. A vtable pointer is a part of the instance, and that pointer is already a part of the class layout because the class is already polymorphic.
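The three cases can be checked at compile time. The struct names below are hypothetical, and the assertions describe what vtable-based implementations do in practice; the standard itself does not mandate these sizes.

```cpp
struct plain {
    float data;
};

// A non-virtual operator: no effect on the object's size.
struct with_operator {
    with_operator& operator=(float f) { data = f; return *this; }
    float data;
};

// The first virtual function makes the class polymorphic (adds a vptr).
struct one_virtual {
    virtual void f() {}
    float data;
};

// A second virtual function only grows the vtable, not the object.
struct two_virtuals {
    virtual void f() {}
    virtual void g() {}
    float data;
};

static_assert(sizeof(with_operator) == sizeof(plain),
              "non-virtual functions take no space in the object");
static_assert(sizeof(one_virtual) > sizeof(plain),
              "becoming polymorphic adds a hidden pointer");
static_assert(sizeof(two_virtuals) == sizeof(one_virtual),
              "further virtual functions add nothing per object");
```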
When I was learning about C++ and OOP, I read somewhere (some bad source) that objects in C++ are essentially the same thing as C structs with function pointers inside of them.
They may be like that functionally, but if they were really implemented like that, it would have been a huge waste of space since all object instances would have to store the same pointers.
Method code is stored in one central location, and C++ just makes it conveniently look as if each instance had its methods inside of it.
(Operators are essentially functions with different syntax).
Methods and operators defined inside classes do not increase the size of instantiated objects. You can test it out for yourself:
#include <iostream>
using namespace std;
struct A {
int a;
};
struct B {
int a;
//SOME RANDOM METHODS AND OPERATORS
B() : a(1) {cout<<"I'm the constructor and I set 'a' to 1"<<endl;}
void some_method() const { for(int i=0;i<40;i++) cout<<"loop";}
B& operator+=(const B& b){
a+=b.a;
return *this;
}
size_t my_size() const { return sizeof(*this);}
};
int main(){
cout<<sizeof(A)<<endl;
cout<<B().my_size()<<endl;
}
Output on a 64 bit system:
4
I'm the constructor and I set 'a' to 1
4
==> No change in size.

Can GCC compile classes to work as structs?

Is there any way to force a compiler (well GCC specifically) to make a class compile to object oriented C? Specifically what I want to achieve is to write this:
class Foo {
public:
float x, y, z;
float bar();
int other();
...etc
};
Foo f;
float result = f.bar();
int obSize = sizeof(Foo);
Yet compile to exactly the same as:
struct Foo { float x, y, z; };
float Foo_bar(Foo *this);
Foo f;
float result = Foo_bar(&f);
int obSize = sizeof(Foo);
My motivation is to increase readability, yet not suffer a memory penalty for each object of Foo. I'd imagine the class implementation normally obSize would be
obSize = sizeof(float)*3 + sizeof(void*)*number_of_class_methods
Mostly to use c++ classes in memory constrained microcontrollers. However I'd imagine if I got it to work I'd use it for network serialization as well (on same endian machines of course).
Your compiler actually does exactly that for you. It might even be able to optimize optimistically by putting the this pointer in a register instead of pushing it onto the stack (this is at least what MSVC does on Windows), which you wouldn't be able to do with standard C calling convention.
As for:
obSize = sizeof(float)*3 + sizeof(void*)*number_of_class_methods
It is plain false. Did you even try it?
Even if you had virtual functions, only one pointer to a table of functions would be added to each object (one table per class). With no virtual functions, nothing is added to an object beyond its members (and no function table is generated).
void* represents a pointer to data, not a pointer to code (they need not have the same size)
There is no guarantee that the size of the equivalent C struct is 3 * sizeof(float).
C++ already does what you're talking about for non-polymorphic classes (classes without a virtual method).
Generally speaking, a C++ class will have the same size as a C struct, unless the class contains one or more virtual methods, in which case the overhead will be a single pointer (often called the vptr) for each class instance.
There will also be a single instance of a 'vtbl' that has a set of pointers for each virtual function - but that vtbl will be shared among all objects of that class type (ie., there's a single vtbl per-class type, and the various vptrs for objects of that class will point to the same vtbl instance).
In summary, if your class has no virtual methods, it will be no larger than the same C struct. This fits with the C++ philosophy of not paying for what you don't use.
However, note that non-static member functions in a C++ class do take an extra parameter (the this pointer) that isn't explicitly mentioned in the parameter list - that is essentially what you discuss in your question.
footnote: in C++ classes and structs are the same except for the minor difference of default member accessibility. In the above answer, when I use the term 'class', the behavior applies just as well to structs in C++. When I use the term 'struct' I'm talking about C structs.
Also note that if your classes use inheritance, the 'overhead' of that inheritance depends on the exact variety of inheritance. But as in the difference between polymorphic and non-polymorphic classes, whatever that cost might be, it's only brought in if you use it.
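To make the comparison concrete, here is a sketch that puts the class and its hand-written C-style counterpart side by side. The body of `bar` (summing the three floats) and the name `Foo_c` are assumptions made purely for illustration.

```cpp
// The C++ class from the question, with an assumed body for bar().
struct Foo {
    float x, y, z;
    constexpr float bar() const { return x + y + z; }  // implicit `this`
};

// The C-style counterpart: same data, explicit pointer parameter.
struct Foo_c {
    float x, y, z;
};

constexpr float Foo_bar(const Foo_c* self) {
    return self->x + self->y + self->z;
}

static_assert(sizeof(Foo) == sizeof(Foo_c),
              "member functions add no per-object storage");
```

Both calls, `f.bar()` and `Foo_bar(&f)`, pass one pointer and read the same three floats, so there is nothing left for the compiler to strip out.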
No, your imagination is wrong. Class methods take up no space at all in an object. Why not write a class and take its sizeof, then add a few more methods and print the sizeof again? You will see that it hasn't changed. Something like this:
First program
class X
{
public:
int y;
void method1() {}
};
int main()
{
cout << sizeof(X) << '\n'; // prints 4
}
Second program
class X
{
public:
int y;
void method1() {}
void method2() {}
void method3() {}
void method4() {}
void method5() {}
void method6() {}
};
int main()
{
cout << sizeof(X) << '\n'; // also prints 4
}
Actually, I believe there is no specific memory penalty with using classes, since member functions are stored once per class rather than once per instance. So your memory footprint would be more like 1*sizeof(void*)*number_of_class_methods + N*sizeof(float)*3 where you have N instances of Foo.
The only time you get an additional penalty is when using virtual functions in which case each object carries around a pointer to a vtable with it.
You need to test, but as far as I know a class instance only stores pointers to its methods if said methods are virtual; otherwise, a struct and a class will take roughly the same amount of memory (bar different alignment done by different compilers etc).

Can a static member of a class have the same type as the class it is a member of in C++?

Let's say I have
class foo
{
public:
static const foo Invalidfoo;
foo();
foo(int, string);
private:
int number;
std::string name;
};
Is it safe or prone to any problem?
EDIT :
I want to use this to have an invalid object to return as a reference to launch errors.
It is perfectly legal, but the following is better:
class foo
{
public:
static const foo& Invalidfoo()
{
static foo Invalidfoo_;
return Invalidfoo_;
}
private:
foo();
};
This way you are guaranteed that the object is initialized the first time it is used.
Edit: But no matter how you do it, you still have a global object, and that can be a source of problems. The best solution may be to call the default constructor each time you need a default constructed object. In terms of efficiency, the difference is probably negligible.
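A fuller sketch of the function-local static approach, filled in so that it compiles. The accessor names, the `-1` placeholder value, and the identity-based `is_invalid` check are assumptions for illustration, not part of the question's class.

```cpp
#include <cassert>  // assert, for callers that want to check the result
#include <string>
#include <utility>  // std::move

class foo {
public:
    foo(int number, std::string name)
        : number_(number), name_(std::move(name)) {}

    // Constructed on the first call; since C++11 this initialization is
    // thread-safe.
    static const foo& invalid() {
        static const foo instance;
        return instance;
    }

    // Assumed convention: an object is "invalid" only if it is the sentinel.
    bool is_invalid() const { return this == &invalid(); }

    int number() const { return number_; }
    const std::string& name() const { return name_; }

private:
    foo() : number_(-1) {}  // only the sentinel is default-constructed
    int number_;
    std::string name_;
};
```

A function that returns `const foo&` can then return `foo::invalid()` on failure, and the caller tests the result with `is_invalid()`.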
It's legal.
It's actually widely used in the singleton pattern, although singletons have problems with multithreaded access and creation.
A nice article about this:
C++ and the Perils of Double-Checked Locking
It is just acting like a global variable or singleton. It's prone to the problems relating to those.
That is perfectly valid code. It doesn't have any reason to cause any problem, because static data members don't contribute to the size of the class. No matter how many static data members you define in a class, it doesn't change its size by even one byte!
struct A
{
int i;
char c;
};
struct B
{
int i;
char c;
static A a;
static B b;
};
In above code, sizeof(A) == sizeof(B) will always be true. See this demo:
http://www.ideone.com/nsiNL
It's supported by section §9.4.2/1 of the C++ Standard (2003):
A static data member is not part of
the subobjects of a class. There is
only one copy of a static data member
shared by all the objects of the
class.
You cannot define a non-static data member of the enclosing class type, because non-static members are part of the object and so do contribute to the size of the class. Computing the size of the class would then be impossible, because each object would recursively contain another object of the same type.
See this topic:
How do static member variables affect object size?
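The contrast between static and non-static members of the enclosing type can be sketched as follows, with hypothetical `node` types standing in for any class:

```cpp
struct node {
    int value;
    static node sentinel;  // fine: a declaration only, stored outside objects
    // node self;          // would not compile: node is incomplete here
    node* next;            // a pointer is fine, its size is already known
};

// The same data members without the static one.
struct node_without_static {
    int value;
    node_without_static* next;
};

static_assert(sizeof(node) == sizeof(node_without_static),
              "a static data member adds nothing to the object size");
```

Note that `node::sentinel` would still need a definition in one translation unit before it could actually be used.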
It's legal. Terrible code from a practical/style point of view, but it is legal, and technically, it can be made to work. Better than Singleton because it's immutable.
This is actually how a singleton is implemented, except your static member would be a pointer. So yes, you're safe.

When can I access an object member directly in memory? No getters called

It is usually allowed to do something like that (no comments on the code please :-))
class SimpleClass {
int member;
};
SimpleClass instance;
int* pointer = (int*) &instance;
However, if I define my class as:
class SimpleClass {
virtual void foo();
int member;
};
I can't do that anymore. Well, I guess I can, of course; it's just more complex.
So I was wondering: is it somewhere specified, or the only way to know when I can do that is just to use some common sense? Not counting alignment issues, which can usually be solved
Generally you want to keep the innards of a class as closed off from the outside world as you can, but if you do need to access a member directly, simply specify it as public and take a pointer directly to it.
class SimpleClass {
public:
int member;
};
SimpleClass instance;
int* pointer = &instance.member;
I would avoid accessing the members in the way you describe because as you noted small changes in the class can mess it up, which may be fine whilst you are writing the code but when you come back to it much later you will probably overlook such subtleties. Also unless the class is constructed entirely of native data types, I believe the compiler's implementation will affect the required offset as well.
You can only do this safely if your class is plain old data (POD).
From Wikipedia:
A POD type in C++ is defined as either a scalar type or a POD class. A POD class has no user-defined copy assignment operator, no user-defined destructor, and no non-static data members that are not themselves PODs. Moreover, a POD class must be an aggregate, meaning it has no user-declared constructors, no private nor protected non-static data, no base classes and no virtual functions. The standard includes statements about how PODs must behave in C++.
See this page for many details on POD types. In particular,
"A pointer to a POD-struct object, suitably converted using a reinterpret_cast, points to its initial member ... and vice versa" [§9.2, ¶17].
class SimpleClass {
int member;
};
SimpleClass instance;
int* pointer = (int*) &instance;
In the above code pointer points to SimpleClass::member and hence you can access SimpleClass::member in this way.
Once you add an virtual method to the class SimpleClass, the memory map of the object changes.
class SimpleClass {
virtual void foo();
int member;
};
Every object of SimpleClass now contains a hidden pointer, the vptr, which points to a vtable: a table of the addresses of all virtual functions in SimpleClass. On typical implementations the first bytes of every SimpleClass object (4 on a 32-bit target) now hold the vptr rather than SimpleClass::member, which is why you cannot access member the same way as in the first case.
Of course, how virtual dispatch works is an implementation detail of the compiler, and neither the vptr nor the vtable is specified in the standard, but what I described is how most compilers implement it.
Since the implementation details may differ between compilers, you should not rely on accessing class members through a cast pointer to the class (especially for polymorphic classes). Doing so also defeats the purpose of abstraction through access specifiers.
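In C++11 and later the compiler can be asked whether the first-member cast is backed by the standard. A sketch, with hypothetical type names and public members for brevity:

```cpp
#include <cassert>      // assert, for callers that want to check the result
#include <type_traits>  // std::is_standard_layout

struct simple {
    int member;
    int other;
};

struct polymorphic {
    virtual void foo() {}
    int member;
};

static_assert(std::is_standard_layout<simple>::value,
              "the first-member cast is valid for this type");
static_assert(!std::is_standard_layout<polymorphic>::value,
              "virtual functions remove the layout guarantee");

// For a standard-layout class, a pointer to the object converts to a
// pointer to its first non-static data member.
int* first_member(simple* s) {
    return reinterpret_cast<int*>(s);
}
```

Writing `&instance.member` remains the simpler choice whenever the member is accessible; the cast is only worth considering when it is not.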
All right, several things.
Your first code example won't compile because you can't have a pointer to a class. I think you meant this:
SimpleClass instance;
int* pointer = (int*) &instance;
(Please don't post code you haven't tried compiling.)
This kind of casting is powerful and dangerous. You could just as well cast to a pointer to some type (say, char*) and as long as it was a type no larger than int, the code would compile and run. This kind of reinterpretation is like a fire axe, to be used only when you have no better choice.
As it is, pointer points to the beginning of 'instance', and instance begins with instance.member (like every instance of SimpleClass), so you really can manipulate that member with it. As soon as you add another field to SimpleClass before member, you mess up this scheme.
In your second code example, you can still use pointer to store and retrieve an int, but you're doing so across boundaries within the memory structure of instance, which will damage instance beyond repair. I can't think of a good metaphor for how bad this is.
In answer to your questions: yes, this is specified somewhere, and no, common sense isn't good enough, you'll need insight into how the language works (which comes from playing around with questions like this).