Suppose I have some simple classes/structs without anything but data and a select few operators. If I understand, a basic struct with only data in C++, just like C, occupies as much memory as the members. For example,
struct SomeStruct { float data; };
sizeof(SomeStruct) == sizeof(float); // this should evaluate to true
What I'm wondering is if adding operators to the class will make the object larger in memory. For example
struct SomeStruct
{
public:
SomeStruct & operator=(const float f) { data = f; return *this; }
private:
float data;
};
will it still be true that sizeof(SomeStruct) == sizeof(float) evaluates to true? Are there any operators/methods which will not increase the size of the objects in memory?
The structure may not necessarily be only as large as its members (consider padding and alignment), but you are basically correct, in that:
Functions are not data, and are not "stored" inside the object type.
That said, watch out for the addition of virtual table pointers in the case where you add a virtual function to your type. This is a one-time size increase for the type, and does not re-apply when you add more virtual functions.
What I'm wondering is if adding operators to the class will make the object larger in memory.
The answer is "it depends".
If the class wasn't polymorphic prior to adding the function and this new function keeps the class non-polymorphic, then adding this non-polymorphic function does nothing to the size of your class instances.
On the other hand, if adding this new function does make your class polymorphic, this addition will make instances of your class bigger. Most C++ implementations use a virtual table, or vtable for short. Each instance of a polymorphic class contains a pointer to the vtable for that class. Instances of non-polymorphic classes don't need and thus don't contain a vtable pointer.
Finally, adding yet another virtual function to a class that is already polymorphic does not make the class instances bigger. This addition does make the vtable for that class bigger, but the vtable itself isn't a part of the instance. A vtable pointer is a part of the instance, and that pointer is already a part of the class layout because the class is already polymorphic.
When I was learning about C++ and OOP, I read somewhere (some bad source) that objects in C++ are essentially the same thing as C structs with function pointers inside of them.
They may be like that functionally, but if they were really implemented like that, it would have been a huge waste of space since all object instances would have to store the same pointers.
Method code is stored in one central location, and C++ just makes it conveniently look as if each instance had its methods inside of it.
(Operators are essentially functions with different syntax).
Methods and operators defined inside classes do not increase the size of instantiated objects. You can test it out for yourself:
#include <iostream>
using namespace std;
struct A {
int a;
};
struct B {
int a;
//SOME RANDOM METHODS AND OPERATORS
B() : a(1) {cout<<"I'm the constructor and I set 'a' to 1"<<endl;}
void some_method() const { for(int i=0;i<40;i++) cout<<"loop";}
B operator+=(const B& b){
a+=b.a;
return *this;
}
size_t my_size() const { return sizeof(*this);}
};
int main(){
cout<<sizeof(A)<<endl;
cout<<B().my_size()<<endl;
}
Output on a 64 bit system:
4
I'm the constructor and I set 'a' to 1
4
==> No change in size.
Whenever an object of a class is created, memory is allocated for it. So my question is: is memory allocated only for the member variables, or for the member functions as well? If memory is allocated for member functions, where are they stored?
Traditionally executable files had three sections. One for initialized data, one for uninitialized data, and one for code. This traditional partitioning is still very much in use, and code, no matter where it comes from, is placed in a separate section.
When an operating system loads an executable file into memory, it puts the code in a separate place in memory that it marks as executable (on modern memory-protected systems), and all code is stored there, separate from the objects themselves.
Member functions are just code located in the code segment. They are present exactly once, no matter how many objects you have.
They are nearly the same as ordinary functions, except that their first parameter is the this pointer, which is hidden in the language but present as a parameter in the executable code.
But there are two kinds of member functions:
"normal"
virtual
There is no difference between them in terms of code size, but they are called differently. Calls to normal functions can be resolved at compile time, while calls to virtual functions are indirect calls via function pointers.
If a class has virtual member functions (the class is "polymorphic"), the compiler needs to create a "vtable" for this class (not for each object).
Each object contains a pointer to the vtable of its class. This is necessary to reach the correct virtual function when the object is accessed through a pointer of a base class type.
Example:
class A{
public: bool doSomething();
int i;
};
class B:public A {
public: bool doSomething();
int j;
};
//
B b;
A* a = &b;
a->doSomething(); // <- A::doSomething() is called;
//
Neither of these classes needs a vtable.
Example 2:
class A{
public: virtual bool doSomething();
int i;
};
class B:public A {
public: bool doSomething();
int j;
};
//
B b;
A* a = &b;
a->doSomething(); // <- B::doSomething() is called;
//
A (and all its children) get a vtable. When an object is created, the object's vtable pointer is set to the correct table, so that the correct function is called regardless of the type of the pointer.
Only the member variables (plus padding between and after them) contribute to the sizeof of a class.
So in that sense regular functions do not take up space as far as an object is concerned. Member functions are little more than regular static functions with a implicit this pointer as a hidden argument.
Saying that though, a virtual function table might be the way an implemention deals with polymorphic types, and that will take up some space, but will probably only be a pointer to a table used by all objects of that particular class.
I have a base class called Effect that is defined like this:
class Effect
{
public:
virtual void apply(int a, int b);
};
void Effect::apply(int a, int b)
{
}
And some subclasses of Effect, all defined like this:
#pragma once
#include "Effect.h"
class SomeSubclassOfEffect: public Effect
{
public:
void apply(int a, int b);
};
void SomeSubclassOfEffect::apply(int a, int b)
{
//Magic
}
Now, in some other part of my application I have this:
Effect effects[6]...
effects[0] = SomeSubclassOfEffect();
What I'm trying to do is to call the corresponding overridden version of apply(int a, int b) by means of effects[whatever].apply(x, y), but I'm getting the parent class' one instead. Why am I getting this result?
C++ uses value semantics by default, not reference semantics, so effects[0] = SomeSubclassOfEffect(); slices off all the child-specific information, leaving only the parent information.
The way c++ places classes in memory is to put the base class first then derived class information after that.
|base class|derived class|
^ ^ ^
a b c
If you took the address of one of these objects, that would be memory address a; the end of the object is at address c in this diagram. The information in your derived class is in the memory segment from b to c. There are two versions of apply in the program, one for the base class and one for the derived class, and which one gets called depends on the type of the object.
When you create an array with Effect effects[3] for example you get memory that looks like this:
|effect|effect|effect|
C++ allocates just enough space to fit an Effect object per array index and no more.
Now the problem is that you try to place an object that looks like this into the first element of the array:
|effect|somesubclass|
This can't fit into one array element so c++ slices off the end of the class, which includes the information about the subclass to put it in the array which loses information. Any members of the derived class are now gone! The array is of type effect so anything placed in the array will be treated as though it were that type and methods will be called accordingly. This then causes problems like the one you encountered.
This is known as the slicing problem: What is object slicing?
To deal with this you should never use a polymorphic array of objects, instead use pointers to the objects because the pointers are all the same size and hence will properly work when used inside the array. A pointer to the base class will be the same size as a pointer to the derived class, it's just the pointer to the start of the memory block after all. The pointers will now point to the correct type and hence the correct versions of functions will be called.
Is there any way to force a compiler (well GCC specifically) to make a class compile to object oriented C? Specifically what I want to achieve is to write this:
class Foo {
public:
float x, y, z;
float bar();
int other();
...etc
};
Foo f;
float result = f.bar();
int obSize = sizeof(Foo);
Yet compile to exactly the same as:
struct Foo { float x, y, z; };
float Foo_bar(Foo *this);
Foo f;
float result = Foo_bar(&f);
int obSize = sizeof(Foo);
My motivation is to increase readability, yet not suffer a memory penalty for each object of Foo. I'd imagine the class implementation normally obSize would be
obSize = sizeof(float)*3 + sizeof(void*)*number_of_class_methods
Mostly to use c++ classes in memory constrained microcontrollers. However I'd imagine if I got it to work I'd use it for network serialization as well (on same endian machines of course).
Your compiler actually does exactly that for you. It might even be able to optimize optimistically by putting the this pointer in a register instead of pushing it onto the stack (this is at least what MSVC does on Windows), which you wouldn't be able to do with standard C calling convention.
As for:
obSize = sizeof(float)*3 + sizeof(void*)*number_of_class_methods
It is plain false. Did you even try it ?
Even if you had virtual functions, only one pointer to a table of functions would be added to each object (one table per class). With no virtual functions, nothing is added to an object beyond its members (and no function table is generated).
void* represents a pointer to data, not a pointer to code (they need not have the same size)
There is no guarantee that the size of the equivalent C struct is 3 * sizeof(float).
C++ already does what you're talking about for non-polymorphic classes (classes without a virtual method).
Generally speaking, a C++ class will have the same size as a C struct, unless the class contains one or more virtual methods, in which case the overhead will be a single pointer (often called the vptr) for each class instance.
There will also be a single instance of a 'vtbl' that has a set of pointers for each virtual function - but that vtbl will be shared among all objects of that class type (ie., there's a single vtbl per-class type, and the various vptrs for objects of that class will point to the same vtbl instance).
In summary, if your class has no virtual methods, it will be no larger than the same C struct. This fits with the C++ philosophy of not paying for what you don't use.
However, note that non-static member functions in a C++ class do take an extra parameter (the this pointer) that isn't explicitly mentioned in the parameter list - that is essentially what you discuss in your question.
footnote: in C++ classes and structs are the same except for the minor difference of default member accessibility. In the above answer, when I use the term 'class', the behavior applies just as well to structs in C++. When I use the term 'struct' I'm talking about C structs.
Also note that if your classes use inheritance, the 'overhead' of that inheritance depends on the exact variety of inheritance. But as in the difference between polymorphic and non-polymorphic classes, whatever that cost might be, it's only brought in if you use it.
No, your imagination is wrong. Class methods take up no space at all in an object. Why not write a class, and take the sizeof. Then add a few more methods and print the sizeof again. You will see that it hasn't changed. Something like this
First program
#include <iostream>
using std::cout;
class X
{
public:
int y;
void method1() {}
};
int main()
{
cout << sizeof(X) << '\n'; // prints 4 (where int is 4 bytes)
}
Second program
#include <iostream>
using std::cout;
class X
{
public:
int y;
void method1() {}
void method2() {}
void method3() {}
void method4() {}
void method5() {}
void method6() {}
};
int main()
{
cout << sizeof(X) << '\n'; // also prints 4
}
Actually, I believe there is no specific memory penalty for using classes, since member functions are stored once per class, not once per instance. So your total memory footprint would be more like (code for the member functions, stored once) + N*sizeof(float)*3, where you have N instances of Foo.
The only time you get an additional penalty is when using virtual functions in which case each object carries around a pointer to a vtable with it.
You need to test, but as far as I know a class instance only stores a pointer related to its methods if said methods are virtual; otherwise, a struct and a class will take roughly the same amount of memory (barring different alignment done by different compilers etc.).
In Why is there no base class in C++?, I quoted Stroustrup on why a common Object class for all classes is problematic in c++. In that quote there is the statement:
Using a universal base class implies cost: Objects must be heap-allocated to be polymorphic;
I really didn't look twice at it, and since it's on Bjarne's home page I would suppose a lot of eyes have scanned that sentence and reported any misstatements.
A commenter however pointed out that this is probably not the case, and in retrospect I can't find any good reason why this should be true. A short test case yields the expected result of VDerived::f().
#include <iostream>
struct VBase {
virtual void f() { std::cout <<"VBase::f()\n"; }
};
struct VDerived: VBase {
void f() { std::cout << "VDerived::f()\n"; }
};
void test(VBase& obj) {
obj.f();
}
int main() {
VDerived obj;
test(obj);
}
Of course if the formal argument to test was test(VBase obj) the case would be totally different, but that would not be a stack vs. heap argument but rather copy semantics.
Is Bjarne flat out wrong or am I missing something here?
Addendum:
I should point out that Bjarne has added to the original FAQ that
Yes. I have simplified the arguments; this is an FAQ, not an academic paper.
I understand and sympathize with Bjarne's point. Also I suppose my eyes were one of the pairs scanning that sentence.
Looks like polymorphism to me.
Polymorphism in C++ works when you have indirection; that is, either a pointer-to-T or a reference-to-T. Where T is stored is completely irrelevant.
Bjarne also makes the mistake of saying "heap-allocated" which is technically inaccurate.
(Note: this doesn't mean that a universal base class is "good"!)
I think Bjarne means that obj, or more precisely the object it points to, can't easily be stack-based in this code:
int f(int arg)
{
std::unique_ptr<Base> obj;
switch (arg)
{
case 1: obj = std::make_unique<Derived1 >(); break;
case 2: obj = std::make_unique<Derived2 >(); break;
default: obj = std::make_unique<DerivedDefault>(); break;
}
return obj->GetValue();
}
You can't have an object on the stack which changes its class, or is initially unsure what exact class it belongs to.
(Of course, to be really pedantic, one could allocate the object on the stack by using placement-new on an alloca-allocated space. The fact that there are complicated workarounds is beside the point here, though.)
The following code also doesn't work as might be expected:
int f(int arg)
{
Base obj = DerivedFactory(arg); // copy (return by value)
return obj.GetValue();
}
This code contains an object slicing error: The stack space for obj is only as large as an instance of class Base; when DerivedFactory returns an object of a derived class which has some additional members, they will not be copied into obj which renders obj invalid and unusable as a derived object (and quite possibly even unusable as a base object.)
Summing up, there is a class of polymorphic behaviour that cannot be achieved with stack objects in any straightforward way.
Of course any completely constructed derived object, wherever it is stored, can act as a base object, and therefore act polymorphically. This simply follows from the is-a relationship that objects of inherited classes have with their base class.
Having read it I think the point is (especially given the second sentence about copy-semantics) that universal base class is useless for objects handled by value, so it would naturally lead to more handling via reference and thus more memory allocation overhead (think template vector vs. vector of pointers).
So I think he meant that the objects would have to be allocated separately from any structure containing them, and that this would have led to many more allocations on the heap. As written, the statement is indeed false.
PS (ad Captain Giraffe's comment): It would indeed be useless to have function
f(object o)
which means that generic function would have to be
f(object &o)
And that would mean the object would have to be polymorphic which in turn means it would have to be allocated separately, which would often mean on heap, though it can be on stack. On the other hand now you have:
template <typename T>
f(T o) // see, no reference
which ends up being more efficient for most cases. This is especially the case for collections, where if all you had was a vector of such base objects (as Java does), you'd have to allocate all the objects separately. That would be a big overhead, especially given the poor allocator performance at the time C++ was created (Java still has an advantage here because copying garbage collectors are more efficient and C++ can't use one).
Bjarne's statement is not correct.
Objects, that is instances of a class, become potentially polymorphic by adding at least one virtual method to their class declaration. Virtual methods add one level of indirection, allowing a call to be redirected to the actual implementation which might not be known to the caller.
For this it does not matter whether the instance is heap- or stack-allocated, as long as it is accessed through a reference or pointer (T& instance or T* instance).
One possible reason why this general assertion slipped onto Bjarne's web page might be that it is nonetheless extremely common to heap-allocate instances with polymorphic behavior. This is mainly because the actual implementation is indeed not known to the caller who obtained it through a factory function of some sort.
I think he was going along the lines of not being able to store it in a base-typed variable. You're right in saying that you can store it on the stack if it's of the derived type because there's nothing special about that; conceptually, it's just storing the data of the class and its derivatives + a vtable.
edit: Okay, now I'm confused, re-looking at the example. It looks like you may be right now...
I think the point is that this is not "really" polymorphic (whatever that means :-).
You could write your test function like this
template<class T>
void test(T& obj)
{
obj.f();
}
and it would still work, whether the classes have virtual functions or not.
Polymorphism without heap allocation is not only possible but also relevant and useful in some real life cases.
This is quite an old question with already many good answers. Most answers indicate, correctly of course, that Polymorphism can be achieved without heap allocation. Some answers try to explain that in most relevant usages Polymorphism needs heap allocation.
However, an example of a viable usage of Polymorphism without heap allocation seems to be required (i.e. not just purely syntax examples showing it to be merely possible).
Here is a simple Strategy-Pattern example using Polymorphism without heap allocation:
Strategies Hierarchy
#include <iostream>
#include <vector>
class StrategyBase {
public:
virtual ~StrategyBase() {}
virtual void doSomething() const = 0;
};
class Strategy1 : public StrategyBase {
public:
void doSomething() const override { std::cout << "Strategy1" << std::endl; }
};
class Strategy2 : public StrategyBase {
public:
void doSomething() const override { std::cout << "Strategy2" << std::endl; }
};
A non-polymorphic type, holding inner polymorphic strategy
class A {
const StrategyBase* strategy;
public:
// just for the example, could be implemented in other ways
const static Strategy1 Strategy_1;
const static Strategy2 Strategy_2;
A(const StrategyBase& s): strategy(&s) {}
void doSomething() const { strategy->doSomething(); }
};
const Strategy1 A::Strategy_1 {};
const Strategy2 A::Strategy_2 {};
Usage Example
int main() {
// vector of non-polymorphic types, holding inner polymorphic strategy
std::vector<A> vec { A::Strategy_1, A::Strategy_2 };
// may also add strategy created on stack
// using unnamed struct just for the example
struct : StrategyBase {
void doSomething() const override {
std::cout << "Strategy3" << std::endl;
}
} strategy3;
vec.push_back(strategy3);
for(auto a: vec) {
a.doSomething();
}
}
Output:
Strategy1
Strategy2
Strategy3
Code: http://coliru.stacked-crooked.com/a/21527e4a27d316b0
Let's assume we have 2 classes
class Base
{
public:
int x = 1;
};
class Derived
: public Base
{
public:
int y = 5;
};
int main()
{
Base o = Derived{ 50, 50 };
std::cout << Derived{ o }.y;
return 0;
}
The output will be 5 and not 50. The y is cut off. If the member variables and the virtual functions are the same, there is the illusion that polymorphism works on the stack as a different VTable is used. The example below illustrates that the copy constructor is called. The variable x is copied in the derived class, but the y is set by the initialization list of a temporary object.
The stack pointer has increased by 4 as the class Base holds an integer. The y will just be cut off in the assignment.
When you use polymorphism on the heap, you tell new which type you are allocating and therefore how much heap memory you need. On the stack this does not work, because the size of the variable is fixed by its declared type. Memory on the heap does not shrink or grow afterwards either: at the time of initialization you know what you are initializing, and exactly that amount of memory is allocated.
I was reading about Empty Base Optimization(EBO). While reading, the following questions popped up in my mind:
What is the point of using Empty class as base class when it contributes nothing to the derived classes (neither functionality-wise, nor data-wise)?
In this article, I read this:
//S is empty
struct T : S
{
int x;
};
[...]
Notice that we didn’t lose any data or code accuracy: when you create a standalone object of type S, the object’s size is still 1 (or more) as before; only when S is used as base class of another class does its memory footprint shrink to zero. To realize the impact of this saving, imagine a vector that contains 125,000 objects. The EBO alone saves half a megabyte of memory!
Does it mean that if we don't use "S" as base class of "T", we would necessarily consume twice as much memory? I think the article compares two different scenarios, which I don't think is correct.
I would like to know a real scenario where EBO can be proven useful (meaning that, in the same scenario, we would necessarily be at a loss IF we don't use EBO!).
Please note that if your answer contains explanations like this :
The whole point is that an empty class has non-zero size, but when derived or deriving it can have zero size, then I'm NOT asking that, as I know that already. My question is, why would anyone derive his class from an empty class in the first place? Even if he doesn't derive and simply writes his class (without any empty base), is he at loss in ANY way?
EBO is important in the context of policy based design, where you generally inherit privately from multiple policy classes. If we take the example of a thread safety policy, one could imagine the pseudo-code :
class MTSafePolicy
{
public:
void lock() { mutex_.lock(); }
void unlock() { mutex_.unlock(); }
private:
Mutex mutex_;
};
class MTUnsafePolicy
{
public:
void lock() { /* no-op */ }
void unlock() { /* no-op */ }
};
Given a policy based-design class such as :
template<class ThreadSafetyPolicy>
class Test : ThreadSafetyPolicy
{
/* ... */
};
Using the class with MTUnsafePolicy simply adds no size overhead to the class Test: it's a perfect example of "don't pay for what you don't use".
EBO isn't really an optimization (at least not one that you do in the code). The whole point is that an empty class has non-zero size, but when derived or deriving it can have zero size.
This is the most usual result:
class A { };
class B { };
class C { };
class D : C { };
#include <iostream>
using namespace std;
int main()
{
cout << "sizeof(A) + sizeof(B) == " << sizeof(A)+sizeof(B) << endl;
cout << "sizeof(D) == " << sizeof(D) << endl;
return 0;
}
Output:
sizeof(A) + sizeof(B) == 2
sizeof(D) == 1
To the edit:
The optimization is, that if you actually do derive (for example from a functor, or from a class that has only static members), the size of your class (that is deriving) won't increase by 1 (or more likely 4 or 8 due to padding bytes).
The "Optimization" in the EBO refers to the fact that using a base class can take less memory than using a member of the same type. I.e. you compare
struct T : S
{
int x;
};
with
struct T
{
S s;
int x;
};
not with
struct T
{
int x;
};
If your question is why would you have an empty class at all (either as a member, or as a base), it is because you use its member functions. Empty means it has no data member, not that it does not have any members at all. Things like this are often done when programming with templates, where the base class is sometimes "empty" (no data members) and sometimes not.
It's used when programmers want to expose some data to the client without increasing the client class size. The empty class can contain enums and typedefs or some defines which the client can use. The most judicious way to use such a class is to inherit from it privately. This will hide the data from the outside and will not increase your class size.
There can be empty classes which do not have any member variables, but member functions (static or non static) which can act as utility classes, lets call this EmptyClass. Now we can have a case where we want to create a class (let's call it SomeClass) which have a containment kind of relation with EmptyClass, but not 'is-a' relation. One way is to create a member object of type EmptyClass in SomeClass as follows:
class EmptyClass
{
public:
void someFun1();
static int someUtilityFun2();
};
//sizeof(EmptyClass) = 1
class SomeClass
{
private:
EmptyClass e;
int x;
};
//sizeof(SomeClass) = 8
Now due to some alignment requirements compilers may add padding to SomeClass and its size is now 8 bytes. The better solution is to have a SomeClass derive privately from EmptyClass and in this way SomeClass will have access to all member functions of EmptyClass and won't increase the extra size by padding.
class SomeClass : private EmptyClass
{
private:
int x;
};
//sizeof(SomeClass) = 4
Most of the time, an empty base class is either used polymorphically (which the article mentions), as "tag" classes, or as exception classes (although those are usually derived from std::exception, which is not empty). Sometimes there is a good reason to develop a class hierarchy which begins with an empty base class.
Boost.CompressedPair uses the EBO to shrink the size of objects in the event that one of the elements is empty.
EASTL has a good explanation as to why they needed EBO; it's also explained in depth in the paper they link to/credit.
EBO is not something the programmer controls, and a programmer is not penalized for choosing not to derive from an empty base class.
The compiler controls whether for:
class X : emptyBase { int x; };
class Y { int x; };
you get sizeof(X) == sizeof(Y) or not. If you do, the compiler implements EBO, if not, it doesn't.
There never is any situation where sizeof(Y) > sizeof(X) would occur.
The primary benefit I can think of is dynamic_cast. You can take a pointer to S and attempt to dynamic_cast it to anything that inherits from S- assuming that S offers a virtual function like a virtual destructor, which it pretty much must do as a base class. If you were, say, implementing a dynamically typed language, you may well wish or need for every type to derive from a base class purely for the purposes of type-erased storage, and type checking through dynamic_cast.