Just when I thought I was getting a good grasp of pointers, I'm confused again. Your insights will probably be helpful.
I guess I could state what confuses me in very general terms, like:
a) If I write A* p = new A(); (where A is some class), and then do stuff like (*p).do_stuff(), then the object pointed to by p might move in memory, so why would p still point to my object?
b) How are classes and member variables of classes stored in memory?
But maybe it is more useful that I tell you the problem that I have a little bit more specifically. Say I have a class Car that has a member variable Engine engine_; (where Engine is some other class). Fine. Now suppose that for some reason I want to create a class that has a member variable that is a pointer to an Engine, like:
class Car
{
friend class Repair;
public:
Car() {engine_ = Engine();}
private:
Engine engine_;
};
class Repair
{
public:
Repair(Car &car) : engine_(&(car.engine_)) {} // non-const Car, so &car.engine_ converts to Engine*
private:
Engine *engine_;
};
There's no chance that repair.engine_ will always point to my car's engine, is there? But even in this second version:
class Car
{
friend class Repair;
public:
Car() {engine_ = new Engine();}
~Car() {delete engine_;}
private:
Engine *engine_;
};
// Maybe I need/should write consts somewhere, not sure
class Repair
{
public:
Repair(const Car &car) : engine_(car.engine_) {}
private:
Engine *engine_;
};
although it seems there's more chance this will work, I don't see how / understand whether it will...
Thanks in advance for your answers!
If I write A* p = new A(); (where A is some class), and then do stuff like (*p).do_stuff(), then the object pointed to by p might move in memory
No, it won't. (At least, *p will stay where it is; if it has pointer members itself, then those may get reset to point elsewhere.)
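To make that concrete, here is a minimal sketch. Widget is a hypothetical stand-in for A from the question; its member function returns this so the address can be compared before and after the calls.

```cpp
// Hypothetical class standing in for A from the question.
class Widget {
public:
    // Mutates the object and reports the address it currently lives at.
    const Widget* do_stuff() { ++calls_; return this; }
    int calls() const { return calls_; }
private:
    int calls_ = 0;
};

// Returns true if the object stays at the same address across member calls.
bool address_is_stable() {
    Widget* p = new Widget();
    const Widget* first = (*p).do_stuff();   // spelling used in the question
    const Widget* second = p->do_stuff();    // equivalent, more common spelling
    const bool same = (first == p) && (second == p) && (p->calls() == 2);
    delete p;
    return same;
}
```

The object's state changes (the call counter goes up), but the storage it occupies does not.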
How are classes and member variable of classes stored in memory
As bits.
class Foo {
int i;
char *p;
public:
void bla();
};
will be represented as the bits of an int (probably 32) followed by those of a pointer (32 or 64), with perhaps some padding in between. The method will not take up space in your instances, it's stored separately.
As for your example, I don't exactly understand the problem. It should work as long as the Car stays alive and does not reset its Engine* for as long as the Repair object lives. (It doesn't look particularly robust, though.)
In both case 1) and case 2) there is no guarantee that repair.engine_ will always point to your car's engine, because Repair is a friend class and not a member of class 'Car'.
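A small sketch of the first version may help. It assumes Repair takes the Car by non-const reference (so that the address of the engine converts to Engine*), and the power member and the accessor functions are made up for the demonstration.

```cpp
class Engine {
public:
    int power = 100;   // hypothetical state, only here so there is something to change
};

class Car {
    friend class Repair;
public:
    const Engine* engine_address() const { return &engine_; }
    int power() const { return engine_.power; }
private:
    Engine engine_;    // stored inside the Car object itself
};

class Repair {
public:
    explicit Repair(Car& car) : engine_(&car.engine_) {}
    void tune() { engine_->power += 50; }
    const Engine* target() const { return engine_; }
private:
    Engine* engine_;   // non-owning: valid only while the Car is alive
};

// The pointer keeps referring to the engine inside this particular Car.
bool repair_tracks_engine() {
    Car car;
    Repair repair(car);
    repair.tune();
    return repair.target() == car.engine_address() && car.power() == 150;
}

// A copy of the Car has its own Engine at a different address.
bool copy_has_own_engine() {
    Car car;
    Repair repair(car);
    Car copy = car;
    return repair.target() != copy.engine_address();
}
```

The fragile part is lifetime and copying: if the Car is destroyed, or if you start working with a copy of it, the Repair object still points at the original engine.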
As others have said, the object does not move in memory when you do stuff like (*p).do_stuff();. You must have misunderstood something that you learned at some point.
For your second question, member functions and member variables are stored in different places in memory. The code for member functions is only generated once for each class, not once for each instance of the class. This code is stored at some location in memory.
As for member variables, this is what people are talking about when they mention your object's location in memory. For example, if you have a class like
class MyClass{
private:
int a;
int b;
double c;
public:
void fun();
};
and we assume that an instance of it is stored at memory location 0x0000, this means that a is at location 0x0000, b is at 0x0004, and c would be at 0x0008 (or something like this depending on how memory is laid out). The function fun() is stored somewhere else entirely.
Now if we make another instance of MyClass, its a variable might be at 0x0010, its b might be at 0x0014, and its c at 0x0018. Finally, its fun() is in the exact same location as fun() from the first instance.
Objects allocated with new in C++ don't move. You might be thinking of malloc'd memory, where a block can be passed to realloc and possibly end up at a new location as a result.
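You can check this kind of layout yourself with offsetof and sizeof. The sketch below uses Layout, a hypothetical stand-in for MyClass with public members (offsetof cannot reach private ones from outside the class); the exact numbers are implementation-dependent, so the functions only report relationships that hold on typical compilers.

```cpp
#include <cstddef>

struct Layout {          // same members as MyClass, but public
    int a;
    int b;
    double c;
    void fun() {}        // non-virtual member function: takes no space per object
};

struct LayoutNoFun {     // same data, no member function
    int a;
    int b;
    double c;
};

std::size_t offset_of_a() { return offsetof(Layout, a); }
std::size_t offset_of_b() { return offsetof(Layout, b); }
std::size_t offset_of_c() { return offsetof(Layout, c); }

// The member function does not change the object size.
bool function_adds_no_size() { return sizeof(Layout) == sizeof(LayoutNoFun); }

// In an array, the second instance starts exactly sizeof(Layout) bytes later.
bool second_instance_follows_first() {
    Layout pair[2] = {};
    const char* first = reinterpret_cast<const char*>(&pair[0]);
    const char* second = reinterpret_cast<const char*>(&pair[1]);
    return static_cast<std::size_t>(second - first) == sizeof(Layout);
}
```

On a common 64-bit platform the offsets come out as 0, 4 and 8 with a total size of 16, but padding rules can differ between platforms.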
Bjarne Stroustrup felt that C++ containers generally provided a better way to deal with the wish to have dynamically sized memory:
http://www2.research.att.com/~bs/bs_faq2.html#renew
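One caveat when using containers this way: a std::vector may move its elements to a new buffer when it grows, so a raw pointer into it can go stale. A minimal sketch of the usual workaround, holding an index instead of a pointer (the function name is made up):

```cpp
#include <cstddef>
#include <vector>

// Reads an element through its index after the vector has grown a lot.
int read_after_growth() {
    std::vector<int> values;
    values.push_back(42);
    const std::size_t index = 0;        // remember the position, not the address
    for (int i = 0; i < 1000; ++i) {
        values.push_back(i);            // growth may reallocate and move the elements
    }
    return values[index];               // the index is still valid after any reallocation
}
```

A single object created with new has no such issue, since nothing ever relocates it.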
In order to allow for movement and reorganization of memory, some systems use abstract handles that need to be locked into pointers before you can use them...such as Windows:
http://msdn.microsoft.com/en-us/library/windows/desktop/aa366584(v=vs.85).aspx
Using something abstract which you lock into pointers might make sense in a system that needs to do some kind of periodic memory defragmentation. But C++ doesn't pay the cost for that indirection by default, you'd implement it only in cases where it makes sense to do so.
(On the downside, with non-movable pointers if you allocate a million objects and then delete 999,999 of them... that one object which is left may stay sitting way up at the top of the address space. The OS/paging system is supposed to be smarter than to let this be a problem, but if your allocator is custom this might leave your heap at a big size. For instance: if you're using a memory-mapped file as a backing store for your objects...you'll be stuck with a large and mostly empty disk file.)
The problem is quite simple. I have a class that can be structured as the following:
class MyClass {
struct A {
// ...
};
struct B {
// ...
};
};
The problem is: the information in MyClass::B is useless after some time of precomputations, whereas MyClass::A must never be deleted (the program may be running for days). MyClass::B holds quite a large amount of information. I want to get rid of MyClass::B while keeping MyClass::A in the same memory position.
Is it possible to do this without modifying the data structure too much and without having to add anything else to MyClass::A (in particular, a pointer to MyClass::B)? If so, what would be the right way to implement it? Take into account that the program must be as memory-efficient as possible (and let us take that to the extreme). I use C++14 BTW.
(And an extra question: Is it possible to delete the chunk corresponding to MyClass::B from MyClass?)
First of all, that's a class declaration. You can't delete a part of the class declaration. You probably have something more like this:
// The exact place of the declaration doesn't matter actually
class A {...};
class B {...};
class C {
A a;
B b;
};
Just change it to use a pointer (in this day and age a smart pointer, like std::unique_ptr):
class C {
A a;
std::unique_ptr<B> b;
void FinishTaskThatRequiredB() {
b.reset(); // calls B::~B() and frees the memory.
}
};
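Here is a fuller sketch of the same idea that compiles on its own. The contents of A and B and the accessor functions are invented for the demonstration; only the unique_ptr member and the reset() call come from the snippet above.

```cpp
#include <memory>
#include <vector>

struct A { int result = 0; };                 // small, long-lived data (hypothetical)

struct B {                                    // large scratch data (hypothetical)
    std::vector<int> scratch;
    B() : scratch(1000, 1) {}
};

class C {
public:
    C() : b_(new B()) {}

    void FinishTaskThatRequiredB() {
        if (!b_) return;                      // already released
        for (int v : b_->scratch) a_.result += v;
        b_.reset();                           // calls B::~B() and frees the memory
    }

    bool has_b() const { return b_ != nullptr; }
    const A* a_address() const { return &a_; }
    int result() const { return a_.result; }

private:
    A a_;                                     // stays where it is inside C
    std::unique_ptr<B> b_;                    // costs one pointer once B is gone
};

// After the task, B is gone while A is unchanged and at the same address.
bool a_survives_b() {
    C c;
    const A* before = c.a_address();
    c.FinishTaskThatRequiredB();
    return !c.has_b() && c.a_address() == before && c.result() == 1000;
}
```

What remains in C after the reset is a single null pointer, which is the memory cost of this approach.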
Okay, let's elaborate a bit on what I wrote: “either the life of B is tied to that of MyClass, or it is independent. In that latter case, you must keep track of it in some way”.
Putting the context back from the question:
MyClass::B is useless after some time of precomputations.
MyClass::A must never be deleted.
It follows that you want to keep track of it in some way. How? Well, that depends on the rules for its lifetime.
If B could sometimes exist and sometimes not depending on hard to control (or plain unknowable) circumstances, then having a pointer to it, set to nullptr when it is useless, and dynamically allocated when it is useful is pretty much the only solution.
But here we have more knowledge: B exists first, then becomes useless and remains so forever. In other terms, you don't need B after some initial building steps used to create A.
There is a pattern that does exactly this: the builder pattern. Its purpose is to encapsulate a complex building operation, maybe tracking some state, until it has built some object, at which point it becomes useless and can be destroyed.
class ABuilder
{
public:
void setSomeInfo(int);
void doSomeComputation(......);
// etc
A get(); // finalize building of A
private:
int someInfo_ = 0;
};
// somewhere else
auto b = ABuilder();
b.setSomeInfo(42);
b.doSomeComputation(......);
auto a = b.get();
// b is no longer used past that point
// delete it if it was allocated dynamically
// or let it go out of scope if it was automatic
From your example, it would map somewhat like this:
A is still A.
B is ABuilder.
MyClass is not needed.
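As a compilable sketch of that mapping: the members of A, the scratch vector and the arithmetic below are all invented, since the question does not say what the precomputation actually is.

```cpp
#include <vector>

struct A { long total = 0; };                 // the object that must live on (hypothetical content)

class ABuilder {
public:
    void setSomeInfo(int info) { someInfo_ = info; }

    // Stand-in for the expensive precomputation: fills a large scratch buffer.
    void doSomeComputation(int n) {
        for (int i = 1; i <= n; ++i) scratch_.push_back(i * someInfo_);
    }

    // Finalize building of A from the scratch data.
    A get() const {
        A a;
        for (int v : scratch_) a.total += v;
        return a;
    }

private:
    int someInfo_ = 0;
    std::vector<int> scratch_;                // plays the role of MyClass::B
};

A build_a() {
    ABuilder b;
    b.setSomeInfo(2);
    b.doSomeComputation(3);                   // scratch holds 2, 4, 6
    return b.get();                           // the builder and its scratch data die here
}
```

The scratch data is released when the builder goes out of scope, and A never needs to know it existed.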
If you had provided actual class names and purpose, it would have been easier to make the examples meaningful ;)
In any case, pointers are most likely the key to what you want, and you need to separate B from MyClass. You can avoid a pointer within MyClass (see comments to question and other answer) if you store B separately and invert the direction of the pointer:
class MyClass
{
struct A { };
A a;
};
class Wrapper
{
struct B { };
MyClass* mc;
B b;
};
Now during initialisation, you'd create a Wrapper for each MyClass, most likely contained in two different arrays:
MyClass items[NUMBER_OF_ITEMS]; // global array?
Wrapper* wrappers = new Wrapper[NUMBER_OF_ITEMS];
// assign each item to its corresponding wrapper
// use the wrappers for initialisation
delete[] wrappers;
Now all that remains is one single pointer, of which you even might get rid if it is a local variable in a separate initialisation routine...
I have a question about good C++ style:
I would like to write a class "MyClass" which has one or more pointers as members, and MyClass is able to allocate memory for these pointers. I would like to use the implicitly generated default copy constructor (as well as the default assignment operator) to copy an instance of MyClass, so that only the pointers are copied and the new object shares the data which the initial object has allocated.
My idea was to prohibit copied objects (created with the copy constructor or assignment operator) from releasing memory (as well as from allocating memory for member pointers). In order to distinguish between copied objects and original objects (created by the constructor), I want to use the following code:
class MyClass
{
public:
MyClass(): originalPtr(this) { data = new char[100000]; }
~MyClass() { if(originalPtr == this) delete[] data; }
private:
MyClass *originalPtr;
char *data; // shared data (not copiable)
char otherFeatures[10]; // individual data (copiable)
};
Would this solution (using the comparison with the this pointer) be good style for such a purpose (e.g. passing an object by value), or is it risky? Of course, I assume that the original object always lives longer than the copied objects.
Thank you!
No, this is a bad idea. If the pointers are shared by several instances, then the one to deallocate should be the last one to die, not the original one. The difference matters because the original might not be the last one to die, which would leave all the others pointing at garbage. Even if you assume that it's the last one to die, you need to realise that the inner workings of a class should not rely on external assumptions. That is, the class has no guarantees on how its life span is managed by the rest of the implementation, so it shouldn't make assumptions.
In this situation you should track references to your data. The basic idea is to keep track of how many copies of the class you have. As soon as that count reaches zero, you are free to release that memory; the last copy has just died. Fortunately for you, the standard library already provides such an implementation: std::shared_ptr, one of the smart pointers. There are others, such as std::unique_ptr, which does the opposite by ensuring that the data is owned by only a single instance.
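A minimal sketch of what that could look like with std::shared_ptr. Holding the buffer as a std::vector<char> and the accessor names are choices made for this example, not something from the question.

```cpp
#include <memory>
#include <vector>

class MyClass {
public:
    MyClass() : data_(std::make_shared<std::vector<char>>(100000)) {}

    // The implicitly generated copy constructor and assignment operator
    // copy the shared_ptr, which bumps the reference count automatically.

    long owners() const { return data_.use_count(); }
    char* data() { return data_->data(); }
    char first_feature() const { return otherFeatures_[0]; }

private:
    std::shared_ptr<std::vector<char>> data_;   // shared data
    char otherFeatures_[10] = {};               // individual data, copied per object
};

long owners_while_copy_alive() {
    MyClass original;
    MyClass copy = original;                    // shares the buffer
    return original.owners();
}

// The buffer survives even when the original dies before the copy.
bool copy_outlives_original() {
    std::unique_ptr<MyClass> original(new MyClass());
    MyClass copy = *original;
    original.reset();                           // the original is destroyed first
    copy.data()[0] = 'x';                       // still safe: copy keeps the buffer alive
    return copy.owners() == 1 && copy.data()[0] == 'x';
}
```

No destructor is needed at all: the buffer is freed when the last MyClass that refers to it goes away, whichever one that happens to be.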
OK, assuming the general case, where the original object is not the last to die, I like the idea of just counting the instances. For example, one could use a concept like this:
class MyClass
{
public:
MyClass(): countOfInstances(new int())
{
++*countOfInstances;
data = new char[100000];
}
~MyClass()
{
--*countOfInstances;
if(*countOfInstances == 0) // test the count itself, not the pointer to it
{
delete[] data;
delete countOfInstances;
}
}
MyClass(const MyClass &other) // analogous for the assignment operator
{
countOfInstances = other.countOfInstances;
data = other.data;
otherFeatures = other.otherFeatures;
++*countOfInstances;
}
private:
int *countOfInstances;
char *data; // shared data (not copiable)
char otherFeatures; // individual data (copiable)
};
Here, one should also make sure that the shared memory is completely allocated before allowing copies to be made.
Let's say I have a class "ClassA". Is it possible to assign a pointer to another instance of the class to a non-pointer variable? For example:
ClassA pineapple; // note: "ClassA pineapple();" would declare a function instead
ClassA* replacementPineapple = new ClassA();
pineapple.refersto = replacementPineapple; // <- something like that
The reason I'm asking is that I have a class where I need to move a lot of the class variables to be physically located in a memory-mapped file. I could of course just have them as pointers and dereference them every time I need to use them, but that's a lot of dereferencing, and all the brackets and other stuff make the code really hard to read. If there is any way around that, I'll take it.
Yes, it is. You can allocate the memory flat and then do the copying (from an existing object) by hand. Or you can use the placement new operator to construct an object at a given location.
see: using placement new
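A minimal sketch of placement new. The local buffer stands in for the memory-mapped region (an assumption for the demo; real mapped memory would come from the OS), and the members of ClassA are made up.

```cpp
#include <new>

struct ClassA {
    int a;
    double d;
    ClassA() : a(1), d(2.5) {}
};

// Constructs a ClassA inside storage we provide instead of on the heap.
bool construct_in_buffer() {
    // Suitably sized and aligned raw storage standing in for the mapped file.
    alignas(ClassA) unsigned char buffer[sizeof(ClassA)];

    ClassA* obj = new (buffer) ClassA();       // placement new: no allocation happens

    const bool in_place =
        static_cast<void*>(obj) == static_cast<void*>(buffer);
    const bool ok = in_place && obj->a == 1 && obj->d == 2.5;

    obj->~ClassA();                            // destroy by hand; never call delete on it
    return ok;
}
```

Note that objects created this way must be destroyed with an explicit destructor call, since delete would try to free memory that was never allocated with new.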
I need to move a lot of the class variables to be physically located in a memory-mapped file. I could of course just have them as pointers and dereference them every time I need to use them, but that's a lot of dereferencing, and all the brackets and other stuff make the code really hard to read. If there is any way around that, I'll take it.
class ClassA
{
public:
ClassA(int* p_a, double* p_d) : a_(*p_a), d_(*p_d) { }
int& a_;
double& d_;
};
Then you can create an instance like this:
ClassA my_a(ptr_to_a_in_shmem, ptr_to_d_in_shmem);
And write code (member functions or otherwise) that operates on a_ and d_, thereby accessing/modifying shared memory.
If I understand your question correctly, you just want to use pineapple as an alias for *replacementPineapple. You can do this by first creating the latter, and defining pineapple to be of reference type (marked by &):
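To see the reference members in action without any real shared memory, here is a sketch in which two local variables stand in for the locations in the mapped region (that substitution is an assumption for the demo):

```cpp
class ClassA {
public:
    ClassA(int* p_a, double* p_d) : a_(*p_a), d_(*p_d) { }
    int& a_;       // alias for storage that lives elsewhere
    double& d_;
};

// Writes through the reference members land in the external storage.
bool writes_reach_storage() {
    int a_storage = 1;          // stand-in for an int in the mapped region
    double d_storage = 2.0;     // stand-in for a double in the mapped region

    ClassA my_a(&a_storage, &d_storage);
    my_a.a_ = 7;                // no explicit dereferencing needed
    my_a.d_ += 0.5;

    return a_storage == 7 && d_storage == 2.5;
}
```

One design consequence to be aware of: a class with reference members cannot be copy-assigned by default, because references cannot be rebound after construction.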
ClassA* replacementPineapple = new ClassA();
ClassA& pineapple = *replacementPineapple;
Now every (read or write) access to a member of pineapple will actually access the corresponding member of *replacementPineapple. Note that if you later modify the pointer replacementPineapple to point elsewhere (or become null), pineapple will still refer to the object it was originally bound to; a reference cannot be reseated, so there is no way you can make it follow.
I have a class that looks like:
class A
{
public:
A();
void MethodThatWillNotBeInOtherStructure();
inline int getData() { return data; }
inline void setData(int data_) { data = data_; }
private:
int data;
};
and a structure like:
struct B
{
public:
inline int getData() { return data; }
inline void setData(int data_) { data = data_; }
private:
int data;
};
How can I copy an instance of A to B without individually setting the fields? I know I can, as I have seen code that would take a void* of, say, A and pass it to a function expecting B, and it worked. My big question is also: how does this work? I suppose it has something to do with memcpy, but I don't know what the memory layout for the structure and the class will be. For example, how do the functions that are in one but not the other not get in the way of the memcpy? Could someone explain this to me?
Update
Ok, let me explain. I am not saying I would ever do this in reusable code or that I would ever use it period. I still want to know how it works. Does a class have a different memory layout than a structure? How are the methods stored? Where is the data stored?
Thanks!
Copying two unrelated structures into each other through void * works properly only if the two structures have the same memory layout. Otherwise the copy produces garbage.
Note that objects of A and B in your code above will have the same memory layout, since their member variables are identical.
Copying through void * works because one is just copying the raw memory occupied by one object into the memory occupied by another.
It is basically a bad idea to copy two unrelated structures in this way.
Consider the situation where you have pointer members inside your structure: a memcpy would make only a shallow copy of the pointer members, and if one of the objects ends its lifetime and frees what they point to, the other object is left with dangling pointer members. Using them is undefined behavior (most likely a crash).
How are the methods stored? Where is the data stored?
A normal (non-virtual) function is stored somewhere in the code section of the program. This location is the same for all instances of the class/structure, and hence it is not part of the memory of each object.
In the case of virtual member functions, the size of a class/structure does get affected: each object then carries a hidden pointer, usually called the vptr. Note that this is an implementation detail, and compilers may choose to implement it differently.
Knowing the memory layout is quite important when you're using unsafe mechanisms like memcpy to copy a structure. If the structure is modified later, your entire logic may get screwed up.
Objects of a class don't contain the functions. The memory of an object contains only its attributes (plus any padding). Functions, on the other hand, are executable pieces of code shared across the program and do not influence the structure's memory layout.
I'd suggest defining operator= and constructors to convert one object to another properly.
An additional glitch with memcpy is that the object may contain a virtual table pointer if the class has virtual functions. That pointer would also be copied to the destination memory, which is not good!
You can copy the A to B using memcpy since they have the same member variables. The functions are not part the instances, so they don't matter.
I would recommend against this approach. If either A or B changes, then your copy will fail at run time. You can make a constructor of B which takes an A, a conversion function, or something similar. Though it will require a little more code, it will allow for changes to the structures.
Your best bet is probably to implement constructor of struct B that takes a const reference to a struct A. Use that directly or use the assignment operator (which you probably will need to implement for non-trivial cases).
A a;
// ... (populate a)
B b(a); //If you want to set b to a at instantiation.
b = B(a); //If you want to overwrite an existing instance of struct B with a.
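Spelled out for the A and B from the question, that could look roughly like the sketch below. The getters are made const here so they can be called through a const reference, and the convert helper is only there to exercise the code.

```cpp
class A {
public:
    A() : data(0) {}
    int getData() const { return data; }
    void setData(int data_) { data = data_; }
private:
    int data;
};

struct B {
public:
    B() : data(0) {}
    // Converting constructor: copies field by field through A's public interface.
    explicit B(const A& a) : data(a.getData()) {}
    int getData() const { return data; }
    void setData(int data_) { data = data_; }
private:
    int data;
};

// Round-trips a value from an A into a B.
int convert(int value) {
    A a;
    a.setData(value);
    B b(a);              // set b from a at instantiation
    b = B(a);            // or overwrite an existing B
    return b.getData();
}
```

Unlike memcpy, this keeps working if a field is added to or reordered in either type, because the compiler checks every access.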
EDIT:
To respond to your edit, it is entirely compiler dependent. The fact that it works at all for a class is reliant on compiler details, since AFAIK it's not supported by the standard. The compiler devs could decide to mess with people by making it work for CLASS but not STRUCT (or vice versa).
That said, I've never seen any difference in any compiler I've used. I would expect them to map memory identically.
Consider:
class A
{
public:
virtual void update() = 0;
};
class B : public A
{
public:
void update() { /* stuff goes in here... */ }
private:
double a, b, c;
};
class C : public A {
// Same kind of thing as B, but with different update function/data members
};
I'm now doing:
A * array = new A[1000];
array[0] = new B();
array[1] = new C();
//etc., etc.
If I call sizeof(B), the size returned is the size required by the 3 double members, plus some overhead required for the virtual function pointer table. Now, back to my code, it turns out that 'sizeof(myclass)' is 32; that is, I am using 24 bytes for my data members, and 8 bytes for the virtual function table (4 virtual functions). My question is: is there any way I can streamline this? My program will eventually use a heck of a lot of memory, and I don't like the sound of 25% of it being eaten by virtual function pointers.
The v-table is per class and not per object. Each object contains just a pointer to its v-table. So the overhead per instance is sizeof(pointer) (usually 4 or 8 bytes). It doesn't matter how many virtual functions you have for the sizeof the class object. Considering this, I think you shouldn't worry too much about it.
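This is easy to check with sizeof. The three structs below are made up for the comparison, and the results describe typical implementations (one hidden vptr per object), which the standard does not actually mandate.

```cpp
#include <cstddef>

struct Plain {                       // no virtual functions
    double a, b, c;
};

struct OneVirtual {                  // one virtual function
    virtual ~OneVirtual() {}
    double a, b, c;
};

struct FourVirtual {                 // four virtual functions
    virtual ~FourVirtual() {}
    virtual void f() {}
    virtual void g() {}
    virtual void h() {}
    double a, b, c;
};

std::size_t plain_size() { return sizeof(Plain); }
std::size_t one_virtual_size() { return sizeof(OneVirtual); }
std::size_t four_virtual_size() { return sizeof(FourVirtual); }
```

On a common 64-bit compiler these report 24, 32 and 32: going from zero to one virtual function costs one pointer, and every further virtual function is free per object.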
Typically, every instance of a class with at least one virtual function will have an extra pointer stored with its explicit data members.
There's no way round this, but remember that (again typically) each virtual function table is shared between all instances of the class, so there is no great overhead to having multiple virtual functions or extra levels of inheritance once you've paid the "vptr tax" (small cost of vtable pointer).
For larger classes the overhead becomes much smaller as a percentage.
If you want functionality that does something like what virtual functions do, you are going to have to pay for it in some way. Actually using native virtual functions may well be the cheapest option.
The space cost of a vtable is one pointer (modulo alignment). The table itself is not placed into each instance of the class.
You have two options.
1) Don't worry about it.
2) Don't use virtual functions. However, not using virtual functions can just move the size into your code, as your code gets more complex.
Moving away from the non issue of the vtable pointer in your object:
Your code has other problems:
A * array = new A[1000];
array[0] = new B();
array[1] = new C();
The problem you are having is the slicing problem.
You cannot put an object of class B into a space sized for an object of class A.
You will just slice the B (or C) part of the object clean off, leaving you with just the A part.
What you want is an array of A pointers, so that each item is held by pointer.
A** array = new A*[1000];
array[0] = new B();
array[1] = new C();
Now you have another problem of destruction. Ok. This could go on for ages.
Short answer: use boost::ptr_vector<>
boost::ptr_vector<A> array;
array.push_back(new B()); // the container takes ownership of the pointer
array.push_back(new C());
Never allocate arrays by hand like that unless you have to (it's too Java-like to be useful).
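If Boost is not available, a standard-library alternative (a different technique from the Boost container named above) is a std::vector of std::unique_ptr. In this sketch update() returns an int purely so the result can be checked; the question's version returns void.

```cpp
#include <memory>
#include <vector>

class A {
public:
    virtual ~A() {}                  // virtual destructor: safe deletion through A*
    virtual int update() = 0;
};

class B : public A {
public:
    int update() override { return 1; }
};

class C : public A {
public:
    int update() override { return 2; }
};

// Stores the objects by owning pointer, so nothing is sliced and nothing leaks.
int update_all() {
    std::vector<std::unique_ptr<A>> items;
    items.push_back(std::unique_ptr<A>(new B()));
    items.push_back(std::unique_ptr<A>(new C()));

    int sum = 0;
    for (const auto& item : items) {
        sum += item->update();       // dispatches to B::update or C::update
    }
    return sum;                      // every element is deleted when items goes out of scope
}
```

This also takes care of the destruction problem mentioned above, since each unique_ptr deletes its object automatically.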
How many instances of A-derived classes do you expect?
How many distinct A-derived classes do you expect?
Note that even with a million of instances, we are talking about a total of 32MB. Up to 10 millions, don't sweat it.
Generally you need an extra pointer per instance (if you are running on a 32-bit platform, the last 4 bytes are due to alignment). Each class consumes an additional (number of virtual functions * sizeof(virtual function pointer) + fixed size) bytes for its VMT.
Note that, considering alignment for the doubles, even a single byte as type identifier will bring up the array element size to 32. So Stjepan Rajko's solution is helpful in some cases, but not in yours.
Also, don't forget the overhead of a general heap for so many small objects. You may have another 8 bytes per object. With a custom heap manager - such as an object/size specific pool allocator - you can save more here and employ a standard solution.
If you're going to have millions of these things, and memory is a serious concern for you, then you probably ought not to make them objects. Just declare them as a struct or an array of 3 doubles (or whatever), and put the functions to manipulate the data somewhere else.
If you really need the polymorphic behavior, you probably can't win, since the type information you'd have to store in your struct will end up taking up a similar amount of space...
Is it likely that you'll have large groups of objects all of the same type? In that case, you could put the type information one level "up" from the individual "A" classes...
Something like:
class A_collection
{
public:
virtual void update() = 0;
};
class B_collection : public A_collection
{
public:
void update() { /* stuff goes in here... */ }
private:
std::vector<std::array<double, 3>> points; // needs <vector> and <array>; vector<double[3]> is not valid
};
class C_collection : public A_collection { /* Same kind of thing as B_collection, but with different update function/data members */ };
As others already said, in a typical popular implementation approach, once a class becomes polymorphic, each instance grows by a size of an ordinary data pointer. It doesn't matter how many virtual functions you have in your class. On a 64-bit platform the size would increase by 8 bytes. If you observed 8-byte growth on a 32-bit platform, it could have been caused by padding added to 4-byte pointer for alignment (if your class has 8-byte alignment requirement).
Additionally, it is probably worth noting that virtual inheritance can inject extra data pointers into class instances (virtual base pointers). I'm only familiar with a few implementations and in at least one the number of virtual base pointers was the same as the number of virtual bases in the class, meaning that virtual inheritance can potentially add multiple internal data pointers to each instance.
If you know all of the derived types and their respective update functions in advance, you could store the derived type in A, and implement manual dispatch for the update method.
However, as others are pointing out, you are really not paying that much for the vtable, and the tradeoff is code complexity (and depending on alignment, you might not be saving any memory at all!). Also, if any of your data members have a destructor, then you also have to worry about manually dispatching the destructor.
If you still want to go this route, it might look like this:
class A;
void dispatch_update(A &);
class A
{
public:
A(char derived_type)
: m_derived_type(derived_type)
{}
void update()
{
dispatch_update(*this);
}
friend void dispatch_update(A &);
private:
char m_derived_type;
};
class B : public A
{
public:
B()
: A('B')
{}
void update() { /* stuff goes in here... */ }
private:
double a, b, c;
};
void dispatch_update(A &a)
{
switch (a.m_derived_type)
{
case 'B':
static_cast<B &> (a).update();
break;
// ...
}
}
You're adding a single pointer to a vtable to each object - if you add several new virtual functions the size of each object will not increase. Note that even if you're on a 32-bit platform where pointers are 4 bytes, you're seeing the size of the object increase by 8 probably due to the overall alignment requirements of the structure (ie., you're getting 4 bytes of padding).
So even if you made the class non-virtual, adding a single char member would likely add a full 8 bytes to the size of each object.
I think that the only ways you'll be able to reduce the size of your objects would be to:
make them non-virtual (do you really need polymorphic behavior?)
use floats instead of double for one or more data members if you don't need the precision
if you're likely to see many objects with the same values for the data members, you might be able to save on memory space in exchange for some complexity in managing the objects by using the Flyweight design pattern
Not an answer to the question directly, but also consider that the declaration order of your data members can increase or decrease your real memory consumption per class object. This is because most compilers can't (read: don't) optimize the order in which class members are laid out in memory to decrease internal fragmentation due to alignment woes.
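A quick way to see the effect of declaration order is to compare two structs with the same members in a different order. The struct names are made up, and the exact sizes are platform-dependent.

```cpp
#include <cstddef>

struct Scattered {       // small members separated by a large one
    char flag;           // typically followed by padding so that value is aligned
    double value;
    char tag;            // typically followed by tail padding
};

struct Grouped {         // same members, largest first
    double value;
    char flag;
    char tag;            // the two chars can share one padded slot
};

std::size_t scattered_size() { return sizeof(Scattered); }
std::size_t grouped_size() { return sizeof(Grouped); }
```

On a typical 64-bit platform this gives 24 bytes versus 16, so ordering members from largest to smallest alignment is a cheap first step before reaching for anything more invasive.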
Given all the answers that are already here, I think I must be crazy, but this seems right to me so I'm posting it anyways. When I first saw your code example, I thought you were slicing the instances of B and C, but then I looked a little closer. I'm now reasonably sure your example won't compile at all, but I don't have a compiler on this box to test.
A * array = new A[1000];
array[0] = new B();
array[1] = new C();
To me, this looks like the first line allocates an array of 1000 A. The subsequent two lines operate on the first and second elements of that array, respectively, which are instances of A, not pointers to A. Thus you cannot assign a pointer to A to those elements (and new B() returns such a pointer). The types are not the same, thus it should fail at compile time (unless A has an assignment operator that takes an A*, in which case it will do whatever you told it to do).
So, am I entirely off base? I look forward to finding out what I missed.
If you really want to save memory of virtual table pointer in each object then you can implement code in C-style...
E.g.
struct Point2D {
int x,y;
};
struct Point3D {
int x,y,z;
};
void Draw2D(void *pThis)
{
Point2D *p = (Point2D *) pThis;
//do something
}
void Draw3D(void *pThis)
{
Point3D *p = (Point3D *) pThis;
//do something
}
int main()
{
typedef void (*pDrawFunct[2])(void *);
pDrawFunct p;
Point2D pt2D;
Point3D pt3D;
p[0] = &Draw2D;
p[1] = &Draw3D;
p[0](&pt2D); //it will call Draw2D function
p[1](&pt3D); //it will call Draw3D function
return 0;
}