Are classes larger in memory than their members in C++? - c++

Let's say I have some class whose only member is an int. If it wasn't in a class, the int alone would be 4 bytes. Does the class take more than 4 bytes of memory (in C++)?

The decision about how big a class ends up being is implementation-specific and depends on a lot of different factors. Sometimes, due to structure and class padding, a class might end up bigger than the sum of the sizes of its members. If you have any virtual functions in your class, then you'll typically end up with a virtual function table pointer (vtable pointer) at the front of the class that adds a bit of space. And it's entirely possible that the compiler might make your class bigger than its members just for the heck of it, if it thinks that will help out in some way (or if you have a lazy compiler!).
In your case, with a single 32-bit integer, I'd be surprised if the class ended up being any larger than the integer itself, since you aren't using any virtual functions and there aren't any members to insert padding bytes between. However, you cannot necessarily rely on this across systems.
If you're working on an application where it's absolutely essential that your class be the same size as the fields - perhaps, for example, if you're reading raw bytes and want to reinterpret them as class objects - you could use a static_assert to check for this:
class MyClass {
...
};
static_assert(sizeof(MyClass) == sizeof(int), "MyClass must have the same size as an integer.");
Many compilers have custom options (often through #pragma directives) that you can tune to ensure that classes get sized in a way that you'd like, so you could also consider reading up on that.
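To make the padding and vtable-pointer points concrete, here is a minimal sketch. The type names (OnlyInt, CharThenInt, WithVirtual) and the overhead helper are invented for this illustration, and the sizes mentioned in the comments are what mainstream compilers commonly produce, not something the standard promises.

```cpp
#include <cstddef>

// Hypothetical types, invented for this illustration.
struct OnlyInt {
    int value;                 // a single member, so nothing to pad between
};

struct CharThenInt {
    char tag;                  // compilers usually pad after this so value is aligned
    int value;
};

struct WithVirtual {
    virtual ~WithVirtual() {}  // a virtual function usually adds a hidden vtable pointer
    int value;
};

// The one relation that always holds: an object is at least as big as its member.
static_assert(sizeof(OnlyInt) >= sizeof(int), "a class cannot be smaller than its member");

// Bytes an object occupies beyond the plain sum of its members' sizes.
constexpr std::size_t overhead(std::size_t object_size, std::size_t member_sum) {
    return object_size - member_sum;
}
```

On a typical desktop compiler, overhead(sizeof(OnlyInt), sizeof(int)) is 0, while the other two types report a non-zero overhead.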

The actual size is implementation-dependent, so it can change across different compilers and architectures due to padding and other implementation details. Never trust a simple sum like in the following pseudocode:
size = sizeof(member1) + ... + sizeof(memberN)
Also if the class has virtual functions, yes, it can be more than 4 bytes.
Moreover, with virtual functions and class inheritance the size can be hard to work out at first sight:
For each class that includes virtual functions, the compiler typically generates a vtable holding pointers to those functions, and every object of the class stores a hidden pointer to that table.
A class A with virtual functions that inherits from another class B, which also has virtual functions, may need more than one table (and, with multiple inheritance, more than one hidden pointer per object) to reach both A's and B's functions.
See this answer for more details: how to determine sizeof class with virtual functions?
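As a rough sketch of the inheritance case, the hypothetical hierarchy below (Shape, Named and NamedSquare are made-up names) derives from two polymorphic bases. On the common ABIs each base subobject keeps its own hidden pointer, so the derived object is at least as large as both bases plus its own member; that is an observation about typical compilers, not a rule from the standard.

```cpp
// Hypothetical hierarchy, invented for this illustration.
struct Shape {
    virtual ~Shape() {}
    virtual double area() const { return 0.0; }
};

struct Named {
    virtual ~Named() {}
    virtual const char* name() const { return "unnamed"; }
};

// Two polymorphic bases: on common ABIs the object holds two hidden
// vtable pointers (one per base subobject) in addition to side.
struct NamedSquare : Shape, Named {
    double side = 1.0;
    double area() const override { return side * side; }
    const char* name() const override { return "square"; }
};

// A polymorphic class is, in practice, at least one pointer in size.
static_assert(sizeof(Shape) >= sizeof(void*), "expected room for a vtable pointer");
```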

Related

Cast a simple (c++) struct to another derived (c++) struct containing same datatypes [duplicate]

If I have a class as follows
class Example_Class
{
private:
int x;
int y;
public:
Example_Class()
{
x = 8;
y = 9;
}
~Example_Class()
{ }
};
And a struct as follows
struct
{
int x;
int y;
} example_struct;
Is the layout in memory of example_struct similar to that of Example_Class?
For example, if I do the following
struct example_struct foo_struct;
Example_Class foo_class = Example_Class();
memcpy(&foo_struct, &foo_class, sizeof(foo_struct));
will foo_struct.x == 8 and foo_struct.y == 9 (i.e. the same values as x and y in foo_class)?
The reason I ask is I have a C++ library (don't want to change it) that is sharing an object with C code and I want to use a struct to represent the object coming from the C++ library. I'm only interested in the attributes of the object.
I know the ideal situation would be to have Example_Class wrap around a common structure shared between the C and C++ code, but it is not going to be easy to change the C++ library in use.
The C++ standard guarantees that memory layouts of a C struct and a C++ class (or struct -- same thing) will be identical, provided that the C++ class/struct fits the criteria of being POD ("Plain Old Data"). So what does POD mean?
A class or struct is POD if:
All data members are public and themselves POD or fundamental types (but not reference or pointer-to-member types), or arrays of such
It has no user-defined constructors, assignment operators or destructors
It has no virtual functions
It has no base classes
About the only "C++-isms" allowed are non-virtual member functions, static data members and static member functions.
Since your class has both a constructor and a destructor, it is formally speaking not of POD type, so the guarantee does not hold. (Although, as others have mentioned, in practice the two layouts are likely to be identical on any compiler that you try, so long as there are no virtual functions).
See section [26.7] of the C++ FAQ Lite for more details.
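As an aside that goes beyond the FAQ entry: since C++11 the POD idea is split into two properties, "trivial" and "standard-layout", and both can be queried at compile time through <type_traits>. The sketch below reproduces the class from the question and a named version of the struct (example_struct_t is a name made up here). Under the C++11 rules the class counts as standard-layout, because all its data members share the same access level, but it is not trivial because of the user-provided constructor and destructor.

```cpp
#include <type_traits>

// The class from the question, reproduced so the traits can be evaluated.
class Example_Class {
private:
    int x;
    int y;
public:
    Example_Class() { x = 8; y = 9; }
    ~Example_Class() {}
};

// A plain C-style struct with the same members (the name is hypothetical).
struct example_struct_t {
    int x;
    int y;
};

static_assert(std::is_standard_layout<example_struct_t>::value, "C struct is standard-layout");
static_assert(std::is_trivial<example_struct_t>::value, "C struct is trivial");

// User-provided constructor and destructor: no longer trivial ...
static_assert(!std::is_trivial<Example_Class>::value, "class is not trivial");
// ... but the member layout rules still apply.
static_assert(std::is_standard_layout<Example_Class>::value, "class is standard-layout");
```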
Is the layout in memory of example_struct similar to that of Example_Class?
The behaviour isn't guaranteed, and is compiler-dependent.
Having said that, the answer is "yes, on my machine", provided that the Example_Class contains no virtual method (and doesn't inherit from a base class).
In the case you describe, the answer is "probably yes". However, if the class has any virtual functions (including virtual destructor, which could be inherited from a base class), or uses multiple inheritance then the class layout may be different.
To add to what other people have said (eg: compiler-specific, will likely work as long as you don't have virtual functions):
I would highly suggest a static assert (compile-time check) that sizeof(Example_Class) == sizeof(example_struct) if you are doing this. See BOOST_STATIC_ASSERT, or the equivalent compiler-specific or custom construction. This is a good first line of defense if someone (or something, such as a compiler change) modifies the class and invalidates the match. If you want extra checking, you can also check at run time that the offsets of the members are the same, which (together with the static size assert) gives much stronger assurance that the layouts match.
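Here is one way the size and offset checks could look with C++11's built-in static_assert. LibPoint and CPoint are hypothetical stand-ins for the library class and the C-side struct; the members are public here so that offsetof can name them, which is an assumption that may not hold for a real library class. For standard-layout types offsetof can be evaluated at compile time, so the offset check does not have to wait until run time.

```cpp
#include <cassert>
#include <cstddef>      // offsetof
#include <cstring>      // std::memcpy
#include <type_traits>

// Hypothetical stand-ins: LibPoint plays the C++ library class,
// CPoint the struct that the C code sees.
class LibPoint {
public:
    int x;
    int y;
    LibPoint() : x(8), y(9) {}
};

struct CPoint {
    int x;
    int y;
};

static_assert(sizeof(LibPoint) == sizeof(CPoint), "size mismatch");
static_assert(std::is_standard_layout<LibPoint>::value, "offsetof needs standard layout");
static_assert(offsetof(LibPoint, x) == offsetof(CPoint, x), "x offset mismatch");
static_assert(offsetof(LibPoint, y) == offsetof(CPoint, y), "y offset mismatch");
static_assert(std::is_trivially_copyable<LibPoint>::value, "memcpy needs a trivially copyable type");

// Copies the bytes of the class into the C view; only compiled if every check above holds.
CPoint to_c_view(const LibPoint& p) {
    CPoint out;
    std::memcpy(&out, &p, sizeof(out));
    return out;
}
```

If someone later adds a member or a virtual function to the class, the build fails at the matching static_assert instead of silently copying garbage.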
In the early days of C++, there were compilers that first replaced the struct keyword with class and then compiled. So much for the similarities.
Differences come from class inheritance and, especially, virtual functions. If a class contains virtual functions, then it typically has a pointer to its type descriptor (the vtable) at the beginning of its layout. Also, if class B inherits from class A, then class A's layout typically comes first, followed by class B's own layout.
So the precise answer to your question about just casting a class instance to a structure instance is: it depends on the class contents. For this particular class, which has only non-virtual methods (a constructor and a non-virtual destructor), the layout is probably going to be the same. Should the destructor be declared virtual, the layout would almost certainly differ between the structure and the class.
Here is an article which shows that there is not much needed to do to step from C structures to C++ classes: Lesson 1 - From Structure to Class
And here is the article which explains how virtual functions table is introduced to classes that have virtual functions: Lesson 4 - Polymorphism
Classes & structs in C++ are equivalent, except that members and base classes of a struct are public by default (for a class they are private by default). This ensures that compiling legacy C code in a C++ compiler will work as expected.
There is nothing stopping you from using all the fancy C++ features in a struct:
struct ReallyAClass
{
ReallyAClass();
virtual ~ReallyAClass();
/// etc etc etc
};
Why not explicitly assign the class's members to the struct's when you want to pass the data to C? That way you know your code will work anywhere.
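A minimal sketch of that explicit copy might look like this. The accessors get_x and get_y are hypothetical, since the class in the question does not expose its members, and example_struct_t is a name invented here for the C-side struct.

```cpp
#include <cassert>

// C-side view of the data (the name is hypothetical).
struct example_struct_t {
    int x;
    int y;
};

class Example_Class {
    int x;
    int y;
public:
    Example_Class() : x(8), y(9) {}
    int get_x() const { return x; }   // hypothetical accessor, not in the question
    int get_y() const { return y; }   // hypothetical accessor, not in the question
};

// Field-by-field copy: makes no assumption about how either type is laid out.
example_struct_t to_struct(const Example_Class& obj) {
    example_struct_t out;
    out.x = obj.get_x();
    out.y = obj.get_y();
    return out;
}
```

The cost is one small function per type, but nothing here depends on padding, member order or the presence of virtual functions.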
You could probably just derive the class from the struct, either publicly or privately. Then casting it would resolve correctly in the C++ code.
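A sketch of the derivation idea, assuming you are able to change the class (which the question says is hard for the library at hand). The names example_data, sum_fields and sum_of are invented for the illustration; sum_fields stands in for a C function that only knows the struct.

```cpp
#include <cassert>

// C-compatible struct shared with the C code (the name is hypothetical).
struct example_data {
    int x;
    int y;
};

// The C++ class adds behaviour but no data members, virtual functions or other bases.
class Example_Class : public example_data {
public:
    Example_Class() { x = 8; y = 9; }
};

// Stand-in for a C function that only knows about the struct.
int sum_fields(const example_data* d) {
    return d->x + d->y;
}

int sum_of(const Example_Class& obj) {
    // Implicit derived-to-base pointer conversion; the compiler
    // locates the base subobject, so no layout guessing is needed.
    return sum_fields(&obj);
}
```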

Checking if a class has expected attributes

I am going to teach some classes on C++ and data structures, and to check students' progress I'd like them to implement the structures I talk about. This is the common approach for data structures classes, I guess. But I want more: I want the students to get quick feedback on what they are missing, so I developed several unit tests that check the behavior of their classes and give them instant results on what is wrong.
This has been working properly for the past two semesters, but I want to take the automated grading a step further. I've been studying how to check the internal components of a class, so I can know whether someone has correctly implemented a tree with a node* root and a size_t size and hasn't used additional unnecessary attributes, for instance.
I know that I can get a rough approximation of an object's size with sizeof, but the results are not that precise. It is frequently different from what I expect. For example, I tested a class with a pointer (8 bytes) and an int (4 bytes), but sizeof gave 28. From what I have learnt, this probably has something to do with the virtual function table and alignment.
So, how far can I go in analyzing whether someone has coded a data structure in the proper and expected manner? How can I check that someone didn't just #include <list> and create an adaptor (for this I know I can just strip the includes, but anyway)?
Let's break this answer into two parts, splitting on the result of is_standard_layout.
1. Virtual Classes
is_standard_layout returns false for any class that has virtual functions (it is also false for some non-virtual classes, for example ones that mix access levels among their data members). A class with virtual functions contains all the members of its parents plus, typically, a hidden virtual table pointer. You can find more info here. Basically, your best bet for finding the size of the members is to take sizeof of the class in question and subtract sizeof(void*); the result is approximately the size of your virtual class's members, padding included.
2. Non-Virtual Classes
If is_standard_layout returns true, this is not a virtual class. In this case we can use offsetof to find the first member variable past any header information. Finding the end of the object from the pointer to the object and sizeof allows you to measure the distance to the point returned by offsetof.
Both of these methods should yield the size of the members in the classes. Determining an allowable range for class size is a matter of preference. But placing the evaluation in a static_assert will also allow you to provide a compile-time message indicating the reason for the failure.
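One possible shape for such a compile-time check is sketched below. Everything in it is hypothetical: the node and tree types stand in for a student submission, expected_tree_layout is a reference struct the instructor would write, and same_footprint is a helper invented here. It compares size and alignment only, so it cannot tell which members are present, and a small extra member that happens to fit into existing padding would go unnoticed.

```cpp
#include <cstddef>
#include <type_traits>

struct node {
    int key;
    node* left;
    node* right;
};

// Hypothetical student submission.
class tree {
    node* root = nullptr;
    std::size_t size = 0;
public:
    bool empty() const { return root == nullptr; }
    std::size_t count() const { return size; }
};

// Hypothetical reference layout: exactly the attributes the assignment asks for.
struct expected_tree_layout {
    node* root;
    std::size_t size;
};

// Hypothetical submission that carries an extra attribute.
struct oversized_tree {
    node* root;
    std::size_t size;
    std::size_t cached_height;
};

// True when Submitted is standard-layout and occupies the same space as Expected.
template <typename Submitted, typename Expected>
constexpr bool same_footprint() {
    return std::is_standard_layout<Submitted>::value
        && sizeof(Submitted) == sizeof(Expected)
        && alignof(Submitted) == alignof(Expected);
}

static_assert(same_footprint<tree, expected_tree_layout>(),
              "tree should hold exactly a node* root and a size_t size");
```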

Packing bitfields even more tightly

I have a problem with bitfields in derived classes.
With the g++ compiler, you can assign __attribute__((packed)) to a class and it will pack bitfields. So
class A
{
public:
int one:10;
int two:10;
int three:10;
} __attribute__ ((__packed__));
takes up only 4 bytes. So far, so good.
However, if you inherit a class, like this
class B
{
public:
int one:10;
int two:10;
} __attribute__ ((__packed__));
class C : public B
{
public:
int three:10;
} __attribute__ ((__packed__));
I would expect class C, which has the same content as class A above, to have the same layout as well, i.e. take up 4 bytes. However, C turns out to occupy 5 bytes.
So my question is, am I doing something wrong, and if so, what? Or is this a problem with the compiler? An oversight, a real bug?
I tried googling, but haven't really come up with anything, apart from a difference between Linux and Windows (where the compiler tries to emulate MSVC), which I'm not interested in. This is just on Linux.
I believe the problem is with B, which cannot easily be 2.5 bytes. It has to be at least 3 bytes.
Theoretically, the derived class might be allowed to reuse padding from the base class, but I have never seen that happen.
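To see the rounding directly, the classes from the question can be given compile-time size checks. The names PackedA, PackedB and PackedC are renamed copies made for this sketch; the attribute is GCC/Clang-specific, and the sizes for A and C are the ones reported in the question for g++ on Linux, so treat them as observations about that toolchain rather than guarantees.

```cpp
// GCC/Clang-specific packing attribute, as used in the question.
class PackedA {
public:
    int one : 10;
    int two : 10;
    int three : 10;
} __attribute__((__packed__));

class PackedB {
public:
    int one : 10;
    int two : 10;
} __attribute__((__packed__));

class PackedC : public PackedB {
public:
    int three : 10;
} __attribute__((__packed__));

// 30 bits fit into 4 bytes when they live in one class.
static_assert(sizeof(PackedA) == 4, "A: 30 bits packed into 4 bytes");
// 20 bits cannot occupy 2.5 bytes, so the base rounds up to whole bytes.
static_assert(sizeof(PackedB) == 3, "B: 20 bits rounded up to 3 bytes");
// The derived member starts after the complete base subobject.
static_assert(sizeof(PackedC) > sizeof(PackedA), "C is larger than A despite equal content");
```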
Imagine for a second that what you are asking for is possible. What would the side effects or issues be? Let's look at your particular example, assuming a 32-bit architecture with 1-byte memory alignment.
There are 20 consecutive bits in class B that you can address via the members one and two. That addressing is very convenient for you, the human. But what does the compiler do to make it happen? It uses masks and bit shifts to move those bits into the correct places.
So far so good, seems simple and safe enough.
Now class C adds 10 more bits. Let's say some amazingly smart compiler squeezed those extra 10 bits into the already used 32-bit word (they fit nicely, don't they?).
Here comes trouble:
B* base = new C; // upcast to base class
base->one = 1;
base->two = 2;
// what happens to the value of three in the C object?
// Especially taking into account that a compiler is free to do all sorts
// of optimizations when generating code for class B
Because of the above, the compiler has to use different, separately addressable memory locations for the members of class B and the members of class C, causing those 10 bits to "spill" into the next addressable memory location: the next byte.
Even more trouble comes when you consider multiple inheritance - what is the one true way of arranging the bits in a derived class?
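If you really need all 30 bits in one 32-bit word, you can do by hand what the compiler does for bit-fields. The following is a minimal sketch of the mask-and-shift technique mentioned above; the names kFieldBits, kFieldMask, set_field and get_field are invented here, and the fields are treated as unsigned for simplicity (the question's fields are signed int, which would need sign extension on read).

```cpp
#include <cstdint>

constexpr std::uint32_t kFieldBits = 10;
constexpr std::uint32_t kFieldMask = (1u << kFieldBits) - 1u;   // 0x3FF, ten low bits set

// Returns word with field number index (0, 1 or 2) replaced by value.
// Bits of value above the low ten are discarded.
constexpr std::uint32_t set_field(std::uint32_t word, unsigned index, std::uint32_t value) {
    return (word & ~(kFieldMask << (index * kFieldBits)))
         | ((value & kFieldMask) << (index * kFieldBits));
}

// Extracts field number index (0, 1 or 2) from word.
constexpr std::uint32_t get_field(std::uint32_t word, unsigned index) {
    return (word >> (index * kFieldBits)) & kFieldMask;
}
```

A base class and a derived class could then share a single std::uint32_t member and agree on field indices, at the price of giving up the bit-field syntax.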

writing structs and classes to disk

The following function writes a struct to a file.
#define PAGESIZE sizeof(BTPAGE)
#define HEADERSIZE 2L
int btwrite(short rrn, BTPAGE *page_ptr)
{
long addr;
addr = (long) rrn * (long) PAGESIZE + HEADERSIZE;
lseek(btfd, addr, 0);
return (write(btfd, page_ptr, PAGESIZE));
}
The following is the struct.
typedef struct {
short keycount; /* number of keys in page */
int key[MAXKEYS]; /* the actual keys */
int value[MAXKEYS]; /* the actual values */
short child[MAXKEYS+1]; /* ptrs to rrns of descendants */
} BTPAGE;
What would happen if I changed the struct to a class, would it still work the same?
If I added class functions, would the size it takes up on disk increase?
There's a lot you need to learn here.
First of all, you're treating a structure as an array of bytes. Inspecting an object's bytes through a char pointer is allowed, but the bytes are only a faithful copy of the object when the type is trivially copyable, and their layout (padding, member sizes, byte order) is whatever your compiler happened to choose. For anything you need to rely on, use proper serialization (for example via Boost) instead. Yes, it's tedious. Yes, it's necessary.
Even if you accept depending on one particular compiler implementation (which may change even in the next compiler version), there are still reasons not to do it.
If you save a file on one machine, then load it on another, you may get garbage, because the second machine uses a different float representation, or a different endianness, or has different alignment rules, etc.
If your struct contains any pointers, it's very likely that saving them verbatim and then loading them back will result in an address that does not point to any meaningful place.
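For a feel of what hand-written serialization involves, here is a small sketch that stores a 32-bit value in a fixed byte order, so the file looks the same whatever the machine's endianness. The helper names put_u32_le and get_u32_le are invented for this example; a real format would also need to cover the other field types and some versioning.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Appends v as four bytes, least significant byte first,
// regardless of the byte order of the machine running the code.
void put_u32_le(std::vector<unsigned char>& out, std::uint32_t v) {
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<unsigned char>((v >> shift) & 0xFFu));
    }
}

// Reads four bytes starting at pos and reassembles the value.
std::uint32_t get_u32_le(const std::vector<unsigned char>& in, std::size_t pos) {
    assert(pos + 4 <= in.size());   // caller must supply four readable bytes
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i) {
        v |= static_cast<std::uint32_t>(in[pos + i]) << (8 * i);
    }
    return v;
}
```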
Typically when you add a member function, this happens:
the function's machine code is stored in a place shared by all the class instances (it wouldn't make sense to duplicate it, since it's logically immutable)
a hidden "this" pointer is passed to the function when it's called, so it knows which object it's been called on.
none of this requires any storage space in the instances.
However, when you add at least one virtual function, the compiler typically needs to add a hidden pointer to every object, referring to a per-class data chunk called a vtable (read up on it). This makes it possible to call different code depending on the current runtime type of the object (aka polymorphism). So the first virtual function you add to the class likely does increase the object size.
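The effect is easy to observe with sizeof. In this sketch the three Counter types are made up for the illustration; the equalities are what common compilers produce and are checked here rather than promised by the standard.

```cpp
// Hypothetical types: same data, different kinds of member functions.
struct Counter {
    int n;
};

struct CounterWithMethods {
    int n;
    void increment() { ++n; }        // code lives once, outside the objects
    int get() const { return n; }
};

struct CounterWithVirtual {
    int n;
    virtual ~CounterWithVirtual() {}
    virtual void increment() { ++n; } // first virtual function: hidden pointer per object
};

// Non-virtual member functions leave the object size unchanged on common compilers.
static_assert(sizeof(CounterWithMethods) == sizeof(Counter),
              "non-virtual methods should not add storage");
```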
In C++, the difference between a struct and a class is simply that the members and base classes of a struct are public by default, whereas for a class they are private by default.
The technique of simply writing the bytes of the struct to a file and then reading them back in again only works if the struct is a plain old data, or POD, type. If you modify your struct such that it is no longer POD, this technique is not guaranteed to work (the rules describing what makes a POD struct are listed in answers to the linked question).
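If you do keep the raw-bytes approach, you can at least have the compiler refuse types for which it is not meaningful. Below is a hedged sketch: raw_bytes is a helper invented here, it copies into a byte buffer instead of calling write so that it stays self-contained, and the value of MAXKEYS is made up because the question does not give it. It uses the C++11 traits that replaced the single POD check.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>
#include <vector>

// Copies the object representation of value into a byte buffer.
// Refuses to compile for types where a raw byte copy is not meaningful.
template <typename T>
std::vector<unsigned char> raw_bytes(const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw byte copies need a trivially copyable type");
    static_assert(std::is_standard_layout<T>::value,
                  "a predictable member layout needs a standard-layout type");
    std::vector<unsigned char> bytes(sizeof(T));
    std::memcpy(bytes.data(), &value, sizeof(T));
    return bytes;
}

#define MAXKEYS 4   /* hypothetical value; the question does not state it */

typedef struct {
    short keycount;            /* number of keys in page */
    int key[MAXKEYS];          /* the actual keys */
    int value[MAXKEYS];        /* the actual values */
    short child[MAXKEYS + 1];  /* ptrs to rrns of descendants */
} BTPAGE;
```

Adding a virtual function or a std::string member to BTPAGE would then stop the build at the static_assert. The guard says nothing about endianness or padding differences between machines, so the portability caveats from the other answers still apply.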
If the class has any virtual function, then you're in trouble; if no virtual functions, you should still be OK (the same applies to a struct, of course, since it, too, could have virtual functions: the difference between struct and class is just that the default visibility in struct is public, in class it's private).
If you are doing more serialisation of classes, consider using Google Protocol Buffers or something similar; see this question.
