Heap allocation of abstract class - c++

I am wondering why I cannot initialize an array of abstract objects to the stack but not the heap.
Here is some C++ code similar to mine that fails on the last line. I'm mainly just curious with the reasons behind this problem dealing with the heap vs. the stack. Thanks in advance!
#define ARRAY_SIZE 10
class Obj {
public:
virtual void fn() =0;
};
class Sub : public Obj {
public:
void fn() {
// ...
}
};
Obj * o1_array[ARRAY_SIZE];
Obj * o2_array = new Obj[ARRAY_SIZE]; // Compiler Error

You are doing two very different things. In the "stack" case,
Obj * o1_array[ARRAY_SIZE];
you are initializing an array of pointers. You don't need a complete or non-abstract type for this. In the "heap" case,
Obj * o2_array = new Obj[ARRAY_SIZE];
you are initializing an array of objects on the RHS and assigning the address of the first element to a pointer in the LHS. You need a complete, non-abstract type to build the array. If you want a dynamically-sized collection of pointers, a good bet is to use a container of pointers (assuming something else takes care of the management of the pointees' lifetime) or a container of smart pointers (where the smartness of the pointer decides who manages the pointee).
Non-managing:
std::vector<Obj*> o1_array(ARRAY_SIZE);
Managing (unique ownership):
std::vector<std::unique_ptr<Obj>> o1_array(ARRAY_SIZE);
For more options on containers managing "pointers", see the boost pointer container library.

Related

C++ pointer vs object

Could you please clear up a question for me regarding pointer vs object in C++. I have the below code that has a class called "person" and a list that allows for 100 objects of that class type.
class person {...}
int main {
person* mylist;
mylist = new person[100];
mylist[0].set_name("John")
// ...
}
In this code I can call a method of the class by mylist[0].set_name() meaning (by my understanding) that mylist[0] is an object (hence the . operator to call a method). The code works fine.
I have another project where the "person" class is used as a base class to derive classes "carpenter" and "welder". The derived classes simply overwrite a virtual function called salary in the base "person" class to allow for a different calculation of salary.
person* mylist[100];
mylist[0] = new carpenter;
mylist[0]->set_name("John");
This code works fine as well. My question is - why in the first code I can call the set_name method using the . (meaning mylist[0] is an object) and in the second code I have to use the -> operator (meaning mylist[0] is a pointer to the object)?
T* represents a pointer type, which represents a variable that contains a "reference" (usually a memory address) to some instance of type T. Using a real world comparison, a T* pointer stands to T like a street address stands to a building.
Pointers allow you to refer to some instance owned by some other variable, and you can use a valid, non null instance of T* to read and write on a T. In this, they are similar to another C++ concept, references (written as T&), which allow you to alias variables, but differ significantly from pointers by not being objects in their own regard.
A pointer is, in fact, an object itself, with each pointer variable having its own unique address and being thus storable and referenceable. For instance, you can have pointers to pointers (T**) and references to pointers (T*&), but not pointers to references - pointers exist, while references may not (they are usually implemented as pointers underneath though).
To reflect the this "indirect" nature of pointers, C and C++ provide you with two different operators which allow you to dereference a pointer (* and ->), and to reference a variable (&).
For instance, you may do the following:
struct A { int x; };
// ...
A a {};
A *aptr { &a }; // `&` takes the address of `a` and stores it into the `aptr` variable of type `A*`
aptr->x = 33; // `->` is equivalent here to `(*aptr).x`, a.x is now 33
A a2 {};
A **aptrptr { &aptr }; // pointer to pointer
*aptrptr = &a2; // `aptr` now points to `a2`
operator-> is basically syntactic sugar that avoids you some cumbersome expressions like (*aptr).x.
References, being basically just aliases to something else, do not need any special syntax at all, and are always converted transparently when neeeded:
int x { 33 };
int &xref { x }; // `xref` refers to `x`
xref = 12; // `x` is now 33
int y = xref; // copies `x` into `y`, no special syntax needed
Pointers are also used in the C language to access arrays, which always decay to a pointer as soon as they are referred to in expressions. This is messy and it's one of the reasons std::vector and std::array should always be used in their place when feasible.
int x[33];
x[3] = 44; // equivalent to `*(&x[0] + 3) = 44`
Finally, the "indirect" nature of pointers and references allow C++ to convert a Derived* to a Base*, given that a derived class contains a full instance of its base (it gets more complicated with multiple inheritance though).
Every class that inherits or contains from another class containing virtual methods will include a hidden pointer to a _Virtual Method Table`, a list of pointers to functions which will be used to dispatch the virtual methods to the correct implementation.
PS: in modern C++, you should rarely need raw pointers. The correct approach is to use containers such as std::vector, std::array and std::string, and special pointer-like wrappers called smart pointers (like std::unique_ptr) every time you need a pointer for some reason. These will handle the lifetime of a pointer for you, avoiding memory leaks and vastly simplifying memory handling. For the same reason, the new operator should be mostly considered as being deprecated and should not be used at all (unless in placement new expressions, are potentially dangerous if used improperly).
basically the first case works like this: you have an array of objects. To access the object fields and methods you use . operator.
In the second case you have an array of pointers to an object. Pointer is just a memory address, that points to an object of some type. In your case, this is an array of pointers to class person. By default, these pointers are invalid; you have to set them to the address of some existing object. new creates an object on the heap, and returns you an address of that object. In order to access the value behind the pointer, you have to de-reference it. The syntax is this:
some_type obj;
some_type* ptr = &obj; // get the address of the object
(*ptr).some_method(); // de-reference the pointer and call it
ptr->some_method(); // same

c++ array of interfaces that contains class instances

How is an array that stores multiple IClass objects, into which we add instances of Class objects that implements IClass gets stored in memory? I am trying to confirm if my assumption is right (in case of 32 bit app)
we have an array of 32 bit pointers to the IClass objects that have a 32 bit pointer to the actual instance object of Class that takes up sizeof(Class) in memory? Also, if an interface doesnt have any virtual/abstract methods it only has a 32 bit pointer at its root and thats it?
Is this right?
as always, any input appreciated
edit:
say we have the following definitons:
class Class : IClass
{
int foo();
int bar;
}
class IClass
{
virtual int vfunc();
}
// array def:
IClass arr[1337];
I begin storing Class in the array arr, how does the runtime store sizeof(Class) into something that have sizeof(IClass) allocated for it?
how does the runtime store sizeof(Class) into something that have sizeof(IClass) allocated for it?
It does not. By assigning a Class instance to an element of the IClass array you are effectively invoking the IClass copy-constructor and copying just the parent part of the object. This is called "slicing", as explained in this answer.
In order to avoid it, you need to hold pointers (or references) to IClass. Three problems arise then:
You cannot copy the actual values, only move them or reference them, unless you implement a virtual "clone" method for your class hierarchy.
You need to handle ownership: an array of class C owns its instances, but an array of C* or C& does not. Normally you would want to use unique_ptr<T> or shared_ptr<T>, depending on the lifetime that is expected for the instances and who will see them.
You need a way to polymorphically construct and destroy the elements. The latter is easy, you just declare the destructor virtual at the base class. The former usually involves either somebody else passing you the instances they want saved (e.g. using an unique_ptr<IClass> that gives you the ownership) or creating some sort of factory infrastructure.

C++: Inside a method can one create an uninitialised object from the class?

Inside a method can one create an uninitialised object from the class?
Here's some context: imagine a class where the constructors all allocate memory:
class NumberArray
{
size_t m_Size;
int *m_Numbers;
public:
NumberArray() { m_Size = 1; m_Numbers = new int[1]; m_Numbers[0] = 0; }
// . . . other methods for manipulating or constructing . . .
~NumberArray() { delete[] m_Numbers; }
// What if I had a method that concatenates two arrays?
NumberArray ConcatenateWith(const NumberArray &) const;
};
Inside such a method one would desire to create an uninitialised object of class NumberArray, and then 'construct' a new object based on this and the object in the parameter? AKA:
NumberArray NumberArray::ConcatenateWith(const NumberArray &other) const
{
// Mystery manner of creating an uninitialised NumberArray 'returnObject'.
returnObject.m_Size = m_Size + other.m_Size;
returnObject.m_Numbers = new int[returnObject.m_Size];
std::copy(m_Numbers, m_Numbers + m_Size, returnObject.m_Numbers);
std::copy(other.m_Numbers, other.m_Numbers + other.m_Size, returnObject.m_Numbers + m_Size);
return returnObject;
}
What's the best way of doing this? Basically, I don't want the default constructor to create a size 1 array that I will just delete and then allocate a new array for again anyway.
It's not entirely clear what you are trying to do, but if all you want is to create a new instance of the class and not have a constructor other than the default constructor called then do just that.
All you have to do is create a private constructor, that has a different signature from the default constructor and which does not allocate memory (or differs in whatever way you need it to differ from the default constructor); then simply have your class invoke that constructor internally, when necessary.
What you're asking for is placement new. This looks something like this:
#include <cstdlib>
#include <new>
void* mem = std::malloc(sizeof(T)); // memory for a T (properly aligned per malloc)
T* x = new (mem) T; // construct a T in that memory location
x->~T(); // destruct that T
std::free(mem); // and free the memory
Doing this correctly (in an exception-safe manner with properly managed and aligned memory) is not a trivial task. You need to be careful about the lifetime of your objects.
For your question, you are describing exactly what std::vector does. It allocates raw uninitialized memory and constructs inserted elements directly into that memory. And lots of its code is dedicated to just getting the lifetime and memory management correct and exception safe!
You should strongly prefer to use std::vector instead of writing it yourself.
There is no well-defined way, as far as I'm aware, to create an object without invoking it's constructor. This is regardless of whether you have access to its public interface or not, though you could implement a private or protected constructor if you want to restrict who can invoke it. There is otehrwise no restrictions on creating new instances of a class from its own internal methods, in fact it is quite common to define a private constructor and a static public method that create instances of said object if you want to restrict under which conditions said object can be created.
If you want to, you can allocated sufficient memory for an object and reinterpret_cast a pointer to that memory to a pointer of the type you want. This usually works for POD's, but since many implementations (if not all) of polymorphic inheritance in c++ adds a pointer to a vtable to polymorphic instances this approach will usually, if not always, fail for those.
In short, create a private constructor and have a static method invoke it and then do any other work that you need is my recommendation.
I think this may be similar to what you want, an 'anonymous' class of sorts:
struct test {
virtual void doSomething() {
puts("test");
}
};
struct a {
test *t() {
struct b : test {
void doSomething() {
puts("b");
};
};
return new b;
};
};
int main()
{
a a;
a.t()->doSomething(); // outputs 'b'
}
However, due to slicing and how new works on C++, you must return a pointer and the 'anonymous' type must have a name, even if it's restricted only to the function.
If you could edit the OP and clarify exactly what you wish to accomplish by this, maybe we could help you more.

Storing pointer to heap objects in an STL container for later deallocation

How can one store an arbitrary number of dynamically created instances (of different types) in an STL container so that the memory can be freed later only having the container?
It should work like this:
std::vector< void * > vec;
vec.push_back( new int(10) );
vec.push_back( new float(1.) );
Now, if vec goes out of scope the pointers to the instances are destructed, but the memory for int and float are not freed. And obviously I can't do:
for( auto i : vec )
delete *i;
because void* is not a pointer-to-object type.
You could object and argue that this isn't a good idea because one can not access the elements of the vector. That is right, and I don't access them myself. The NVIDIA driver will access them as it just needs addresses (void* is fine) for it parameters to a kernel call.
I guess the problem here is that it can be different types that are stored. Wondering if a union can do the trick in case one wants to pass this as arguments to a cuda kernel.
The kernel takes parameters of different types and are collected by traversing an expression tree (expression templates) where you don't know the type beforehand. So upon visiting the leaf you store the parameter. it can only be void*, and built-in types int, float, etc.
The vector can be deleted right after the kernel launch (the launch is async but the driver copies the parameters first then continues host thread). 2nd question: Each argument is passed a void* to the driver. Regardless if its an int, float or even void*. So I guess one can allocate more memory than needed. I think the union thingy might be worth looking at.
You can use one vector of each type you want to support.
But while that's a great improvement on the idea of a vector of void*, it still quite smelly.
This does sound like an XY-problem: you have a problem X, you envision a solution Y, but Y obviously doesn't work without some kind of ingenious adaption, so ask about Y. When instead, should be asking about the real problem X. Which is?
Ok, FWIW
I would recomend using an in-place new combined with malloc. what this would do is allow you store the pointers created as void* in your vector. Then when the vector is finished with it can simply be iterated over and free() called.
I.E.
void* ptr = malloc(sizeof(int));
int* myNiceInt = new (ptr) int(myNiceValue);
vec.push_back(ptr);
//at some point later iterate over vec
free( *iter );
I believe that this will be the simplest solution to the problem in this case but do accept that this is a "C" like answer.
Just sayin' ;)
"NVIDIA driver" sounds like a C interface anyway, so malloc is not a crazy suggestion.
Another alternative, as you suggest, is to use a union... But you will also need to store "tags" in a parallel vector to record the actual type of the element, so that you can cast to the appropriate type on deletion.
In short, you must cast void * to an appropriate type before you can delete it. The "C++ way" would be to have a base class with a virtual destructor; you can call delete on that when it points to an instance of any sub-class. But if the library you are using has already determined the types, then that is not an option.
If you have control over the types you can create an abstract base class for them. Give that class a virtual destructor. Then you can have your std::vector<Object*> and iterate over it to delete anything which inherits from Object.
You probably need to have a second std::vector<void*> with pointers to the actual values, since the Object* probably hits the vtable first. A second virtual function like virtual void* ptr() { return &value; } would be useful here. And if it needs the size of the object you can add that too.
You could use the template pattern like this:
template<typename T>
class ObjVal : public Object {
public:
T val;
virtual void* ptr() { return &this->val; }
virtual size_t size() { return sizeof(this->val); }
};
Then you only have to type it once.
This is not particularly memory efficient because every Object picks up at least one extra pointer for the vtable.
However, new int(3) is not very memory efficient either because your allocator probably uses more than 4 bytes for it. Adding that vtable pointer may be essentially free.
Use more than 1 vector. Keep the vector<void*> around to talk to the API (which I'm guessing requires a contiguous block of void*s of non-uniform types?), but also have a vector<std::unique_ptr<int>> and vector<std::unique_ptr<float>> which owns the data. When you create a new int, push a unique_ptr that owns the memory into your vector of ints, and then stick it on the API-compatible vector as a void*. Bundle the three vectors into one struct so that their lifetimes are tied together if possible (and it probably is).
You can also do this with a single vector that stores the ownership of the variables. A vector of roll-your-own RAII pseudo-unique_ptr, or shared_ptr with custom destroyers, or a vector of std::function<void()> that your "Bundle"ing struct's destroyer invokes, or what have you. But I wouldn't recommend these options.

Accessing Members of Containing Objects from Contained Objects

If I have several levels of object containment (one object defines and instantiates another object which define and instantiate another object..), is it possible to get access to upper, containing - object variables and functions, please?
Example:
class CObjectOne
{
public:
CObjectOne::CObjectOne() { Create(); };
void Create();
std::vector<ObjectTwo>vObejctsTwo;
int nVariableOne;
}
bool CObjectOne::Create()
{
CObjectTwo ObjectTwo(this);
vObjectsTwo.push_back(ObjectTwo);
}
class CObjectTwo
{
public:
CObjectTwo::CObjectTwo(CObjectOne* pObject)
{
pObjectOne = pObject;
Create();
};
void Create();
CObjectOne* GetObjectOne(){return pObjectOne;};
std::vector<CObjectTrhee>vObjectsTrhee;
CObjectOne* pObjectOne;
int nVariableTwo;
}
bool CObjectTwo::Create()
{
CObjectThree ObjectThree(this);
vObjectsThree.push_back(ObjectThree);
}
class CObjectThree
{
public:
CObjectThree::CObjectThree(CObjectTwo* pObject)
{
pObjectTwo = pObject;
Create();
};
void Create();
CObjectTwo* GetObjectTwo(){return pObjectTwo;};
std::vector<CObjectsFour>vObjectsFour;
CObjectTwo* pObjectTwo;
int nVariableThree;
}
bool CObjectThree::Create()
{
CObjectFour ObjectFour(this);
vObjectsFour.push_back(ObjectFour);
}
main()
{
CObjectOne myObject1;
}
Say, that from within CObjectThree I need to access nVariableOne in CObjectOne. I would like to do it as follows:
int nValue = vObjectThree[index].GetObjectTwo()->GetObjectOne()->nVariable1;
However, after compiling and running my application, I get Memory Access Violation error.
What is wrong with the code above(it is example, and might contain spelling mistakes)?
Do I have to create the objects dynamically instead of statically?
Is there any other way how to achieve variables stored in containing objects from withing contained objects?
When you pass a pointer that points back to the container object, this pointer is sometimes called a back pointer. I see this technique being used all the time in GUI libraries where a widget might want access to its parent widget.
That being said, you should ask yourself if there's a better design that doesn't involve circular dependencies (circular in the sense that the container depends on the containee and the containee depends on the container).
You don't strictly have to create the objects dynamically for the back pointer technique to work. You can always take the address of a stack-allocated (or statically-allocated) object. As long as the life of that object persists while others are using pointers to it. But in practice, this technique is usually used with dynamically-created objects.
Note that you might also be able to use a back-reference instead of a back-pointer.
I think I know what's causing your segmentation faults. When your vectors reallocate their memory (as the result of growing to a larger size), the addresses of the old vector elements become invalid. But the children (and grand-children) of these objects still hold the old addresses in their back-pointers!
For the back-pointer thing to work, you'll have to allocate each object dynamically and store their pointers in the vectors. This will make memory management a lot more messy, so you might want to use smart pointers or boost::ptr_containers.
After seeing the comment you made in another answer, I now have a better idea of what you're trying to accomplish. You should research generic tree structures and the composite pattern. The composite pattern is usually what's used in the widget example I cited previously.
Maybe all your object can inherit from a common interface like :
class MyObject
{
public:
virtual int getData() = 0;
}
And after you can use a std::tree from the stl library to build your structure.
As Emile said, segmentation fault is caused by reallocation. Exactly speaking -- when the local stack objects' 'this' pointer was passed to create another object, which is then copied to the vector container. Then the 'Create()' function exits, the stack frame object ceases to exist and the pointer in the container gets invalid.