using *void as a buffer for static_cast - c++

So I go this:
class A;
class B : public A;
class C : public B;
vector<A*> *vecA;
vector<C*> *vecC;
And I want to cast a vectC into a vecA.
vector<A*> *_newA = static_cast< vector<A*>* >(vecC); //gives an error
So I used void pointer as a buffer and the cast:
void *buffer = vecC;
vector<A*> *_newA = static_cast< vector<A*>* >(buffer); //works
Is this valid? Is there another way to do it?

You should just have a std::vector<A*> and then when you wish to put a B or C inside, you can do this:
std::vector<A*> vec;
vec.push_back(b);
This will work much better than anything you're currently doing.
Also, you should use std::unique_ptr or std::shared_ptr, depending on ownership semantics of your pointers. Owning raw pointers are a no-no!!

Is this valid?
No it's not. It is completely undefined what happens if you access it.
Is there another way to do it?
Yes. It depends on what you want to do:
copy the content: std::copy(vecC.begin(), vecC.end(), std::back_inserter(vecA));
create an adaptor which behaves as random access container of A* if given a container or iterator of C* or any other derived type
I am sure there are other solutions, just tell us what you want to do

The magic words you probably needed to know are "covariance" and "contravariance". Have a look here for a very closely related issue.
Short answer: there's no neat, sensible way to do the conversion you want to do merely with casts. You will need to copy to contents of your old vector the long way into your new vector.
But more importantly: 1. Don't ever use new on a standard container type. 2. Don't pass around pointers to standard container types. 3. Don't pass around raw pointers to your own objects, use std::shared_ptr instead.

Instead of detouring via void*, with C++11 just use reinterpret_cast.
However, the conversion that you desire permits very ungood things:
#include <iostream>
#include <vector>
using namespace std;
class Base {};
class Derived: public Base
{ public: int x; Derived(): x(42) {} };
int main()
{
vector< Derived* > v;
Derived d;
v.push_back( &d );
cout << v.back()->x << endl; // 42
vector< Base* >* pbv =
reinterpret_cast< vector< Base* >* >( &v );
Base b;
pbv->push_back( &b );
cout << v.back()->x << endl; // indeterminate value
}
It's roughly the same issue that you have with converting T** to T const** (not permitted, could allow ungood things). The short of it is, don't do this unless you know exactly what you're doing. The longer of it, hm, well, there's not room to discuss it here, but it involves differentiating between mutable and immutable collections, and being very very careful.
edit: as others (who did not address the conversion question) have already stated, a practical solution is a vector of A pointers, or smart pointers, and then using C++ polymorphism (i.e. virtual member functions).

It looks to me as if you're looking for covariant support for generic (template) types. This is not supported at all by C++. It is supported by the CLR - and C#/VB - though only in the latest versions.
In spite of this, to echo others' responses, you're likely barking up the wrong tree. My hunch is that you want to have a vector of pointers to A-typed objects... and this type should include virtual methods and a virtual destructor - as necessary. It's impossible to be more specific about a suitable alternative approach without a better understanding of the high-level problem you're trying to solve - which is not evident from your question.

Related

Storing an inherited class in a vector [duplicate]

I have a tricky situation. Its simplified form is something like this
class Instruction
{
public:
virtual void execute() { }
};
class Add: public Instruction
{
private:
int a;
int b;
int c;
public:
Add(int x, int y, int z) {a=x;b=y;c=z;}
void execute() { a = b + c; }
};
And then in one class I do something like...
void some_method()
{
vector<Instruction> v;
Instruction* i = new Add(1,2,3)
v.push_back(*i);
}
And in yet another class...
void some_other_method()
{
Instruction ins = v.back();
ins.execute();
}
And they share this Instruction vector somehow. My concern is the part where I do "execute" function. Will it work? Will it retain its Add type?
No, it won't.
vector<Instruction> ins;
stores values, not references. This means that no matter how you but that Instruction object in there, it'll be copied at some point in the future.
Furthermore, since you're allocating with new, the above code leaks that object. If you want to do this properly, you'll have to do
vector<Instruction*> ins
Or, better yet:
vector< std::reference_wrapper<Instruction> > ins
I like this this blog post to explain reference_wrapper
This behavior is called object slicing.
So you will need some kind of pointer. A std::shared_ptr works well:
typedef shared_ptr<Instruction> PInstruction;
vector<PInstruction> v;
v.emplace_back(make_shared<Add>());
PInstruction i = v[0];
Keep in mind that PInstruction is reference-counted, so that the copy constructor of PInstruction will create a new "reference" to the same object.
If you want to make a copy of the referenced object you will have to implement a clone method:
struct Instruction
{
virtual PInstruction clone() = 0;
...
}
struct Add
{
PInstruction clone() { return make_shared<Add>(*this); }
...
}
PInstruction x = ...;
PInstruction y = x->clone();
If performance is an issue than you can look at std::unique_ptr, this is a little trickier to manage as move semantics are always required, but it avoids the cost of some atomic operations.
You can also use raw pointers and manage the memory manually with some sort of memory pool architecture.
The underlying problem is that to have a polymorphic type the compiler doesn't know how big the subclasses are going to be, so you can't just have a vector of the base type, as it won't have the extra space needed by subclasses. For this reason you will need to use pass-by-reference semantics as described above. This stores a pointer to the object in the vector and then stores the object on the heap in blocks of different sizes depending on what the subclass needs.
No, that will not work; you are "slicing" the Add object, and only inserting its Instruction part into the array. I would recommend that you make the base class abstract (e.g. by making execute pure virtual), so that slicing gives a compile error rather than unexpected behaviour.
To get polymorphic behaviour, the vector needs to contain pointers to the base class.
You will then need to be careful how you manage the objects themselves, since they are no longer contained in the vector. Smart pointers may be useful for this; and since you're likely to be dynamically allocating these objects, you should also give the base class a virtual destructor to make sure you can delete them correctly.
You may want to do a couple things, A: change the type of "v" to "vector", B: managed your memory with the "delete" operator. To answer your question, with this approach, yes, but you will only be able to access the interface from "Instruction", if you KNOW the type of something an "Instruction" pointer is pointing to I would suggest using dynamic_cast if you need to access the interface from, say, "Add".

Call derived class method when upcasted [duplicate]

I did find some questions already on StackOverflow with similar title, but when I read the answers, they were focusing on different parts of the question, which were really specific (e.g. STL/containers).
Could someone please show me, why you must use pointers/references for implementing polymorphism? I can understand pointers may help, but surely references only differentiate between pass-by-value and pass-by-reference?
Surely so long as you allocate memory on the heap, so that you can have dynamic binding, then this would have been enough. Obviously not.
"Surely so long as you allocate memory on the heap" - where the memory is allocated has nothing to do with it. It's all about the semantics. Take, for instance:
Derived d;
Base* b = &d;
d is on the stack (automatic memory), but polymorphism will still work on b.
If you don't have a base class pointer or reference to a derived class, polymorphism doesn't work because you no longer have a derived class. Take
Base c = Derived();
The c object isn't a Derived, but a Base, because of slicing. So, technically, polymorphism still works, it's just that you no longer have a Derived object to talk about.
Now take
Base* c = new Derived();
c just points to some place in memory, and you don't really care whether that's actually a Base or a Derived, but the call to a virtual method will be resolved dynamically.
In C++, an object always has a fixed type and size known at compile-time and (if it can and does have its address taken) always exists at a fixed address for the duration of its lifetime. These are features inherited from C which help make both languages suitable for low-level systems programming. (All of this is subject to the as-if, rule, though: a conforming compiler is free to do whatever it pleases with code as long as it can be proven to have no detectable effect on any behavior of a conforming program that is guaranteed by the standard.)
A virtual function in C++ is defined (more or less, no need for extreme language lawyering) as executing based on the run-time type of an object; when called directly on an object this will always be the compile-time type of the object, so there is no polymorphism when a virtual function is called this way.
Note that this didn't necessarily have to be the case: object types with virtual functions are usually implemented in C++ with a per-object pointer to a table of virtual functions which is unique to each type. If so inclined, a compiler for some hypothetical variant of C++ could implement assignment on objects (such as Base b; b = Derived()) as copying both the contents of the object and the virtual table pointer along with it, which would easily work if both Base and Derived were the same size. In the case that the two were not the same size, the compiler could even insert code that pauses the program for an arbitrary amount of time in order to rearrange memory in the program and update all possible references to that memory in a way that could be proven to have no detectable effect on the semantics of the program, terminating the program if no such rearrangement could be found: this would be very inefficient, though, and could not be guaranteed to ever halt, obviously not desirable features for an assignment operator to have.
So in lieu of the above, polymorphism in C++ is accomplished by allowing references and pointers to objects to reference and point to objects of their declared compile-time types and any subtypes thereof. When a virtual function is called through a reference or pointer, and the compiler cannot prove that the object referenced or pointed to is of a run-time type with a specific known implementation of that virtual function, the compiler inserts code which looks up the correct virtual function to call a run-time. It did not have to be this way, either: references and pointers could have been defined as being non-polymorphic (disallowing them to reference or point to subtypes of their declared types) and forcing the programmer to come up with alternative ways of implementing polymorphism. The latter is clearly possible since it's done all the time in C, but at that point there's not much reason to have a new language at all.
In sum, the semantics of C++ are designed in such a way to allow the high-level abstraction and encapsulation of object-oriented polymorphism while still retaining features (like low-level access and explicit management of memory) which allow it to be suitable for low-level development. You could easily design a language that had some other semantics, but it would not be C++ and would have different benefits and drawbacks.
I found it helpful to understand that a copy constructor is invoked when assigning like this:
class Base { };
class Derived : public Base { };
Derived x; /* Derived type object created */
Base y = x; /* Copy is made (using Base's copy constructor), so y really is of type Base. Copy can cause "slicing" btw. */
Since y is an actual object of class Base, rather than the original one, functions called on this are Base's functions.
Consider little endian architectures: values are stored low-order-bytes first. So, for any given unsigned integer, the values 0-255 are stored in the first byte of the value. Accessing the low 8-bits of any value simply requires a pointer to it's address.
So we could implement uint8 as a class. We know that an instance of uint8 is ... one byte. If we derive from it and produce uint16, uint32, etc, the interface remains the same for purposes of abstraction, but the one most important change is size of the concrete instances of the object.
Of course, if we implemented uint8 and char, the sizes may be the same, likewise sint8.
However, operator= of uint8 and uint16 are going to move different quantities of data.
In order to create a Polymorphic function we must either be able to:
a/ receive the argument by value by copying the data into a new location of the correct size and layout,
b/ take a pointer to the object's location,
c/ take a reference to the object instance,
We can use templates to achieve a, so polymorphism can work without pointers and references, but if we are not counting templates, then lets consider what happens if we implement uint128 and pass it to a function expecting uint8? Answer: 8 bits get copied instead of 128.
So what if we made our polymorphic function accept uint128 and we passed it a uint8. If our uint8 we were copying was unfortunately located, our function would attempt to copy 128 bytes of which 127 were outside of our accessible memory -> crash.
Consider the following:
class A { int x; };
A fn(A a)
{
return a;
}
class B : public A {
uint64_t a, b, c;
B(int x_, uint64_t a_, uint64_t b_, uint64_t c_)
: A(x_), a(a_), b(b_), c(c_) {}
};
B b1 { 10, 1, 2, 3 };
B b2 = fn(b1);
// b2.x == 10, but a, b and c?
At the time fn was compiled, there was no knowledge of B. However, B is derived from A so polymorphism should allow that we can call fn with a B. However, the object it returns should be an A comprising a single int.
If we pass an instance of B to this function, what we get back should be just a { int x; } with no a, b, c.
This is "slicing".
Even with pointers and references we don't avoid this for free. Consider:
std::vector<A*> vec;
Elements of this vector could be pointers to A or something derived from A. The language generally solves this through the use of the "vtable", a small addition to the object's instance which identifies the type and provides function pointers for virtual functions. You can think of it as something like:
template<class T>
struct PolymorphicObject {
T::vtable* __vtptr;
T __instance;
};
Rather than every object having its own distinct vtable, classes have them, and object instances merely point to the relevant vtable.
The problem now is not slicing but type correctness:
struct A { virtual const char* fn() { return "A"; } };
struct B : public A { virtual const char* fn() { return "B"; } };
#include <iostream>
#include <cstring>
int main()
{
A* a = new A();
B* b = new B();
memcpy(a, b, sizeof(A));
std::cout << "sizeof A = " << sizeof(A)
<< " a->fn(): " << a->fn() << '\n';
}
http://ideone.com/G62Cn0
sizeof A = 4 a->fn(): B
What we should have done is use a->operator=(b)
http://ideone.com/Vym3Lp
but again, this is copying an A to an A and so slicing would occur:
struct A { int i; A(int i_) : i(i_) {} virtual const char* fn() { return "A"; } };
struct B : public A {
int j;
B(int i_) : A(i_), j(i_ + 10) {}
virtual const char* fn() { return "B"; }
};
#include <iostream>
#include <cstring>
int main()
{
A* a = new A(1);
B* b = new B(2);
*a = *b; // aka a->operator=(static_cast<A*>(*b));
std::cout << "sizeof A = " << sizeof(A)
<< ", a->i = " << a->i << ", a->fn(): " << a->fn() << '\n';
}
http://ideone.com/DHGwun
(i is copied, but B's j is lost)
The conclusion here is that pointers/references are required because the original instance carries membership information with it that copying may interact with.
But also, that polymorphism is not perfectly solved within C++ and one must be cognizant of their obligation to provide/block actions which could produce slicing.
You need pointers or reference because for the kind of polymorphism you are interested in (*), you need that the dynamic type could be different from the static type, in other words that the true type of the object is different than the declared type. In C++ that happens only with pointers or references.
(*) Genericity, the type of polymorphism provided by templates, doesn't need pointers nor references.
When an object is passed by value, it's typically put on the stack. Putting something on the stack requires knowledge of just how big it is. When using polymorphism, you know that the incoming object implements a particular set of features, but you usually have no idea the size of the object (nor should you, necessarily, that's part of the benefit). Thus, you can't put it on the stack. You do, however, always know the size of a pointer.
Now, not everything goes on the stack, and there are other extenuating circumstances. In the case of virtual methods, the pointer to the object is also a pointer to the object's vtable(s), which indicate where the methods are. This allows the compiler to find and call the functions, regardless of what object it's working with.
Another cause is that very often the object is implemented outside of the calling library, and allocated with a completely different (and possibly incompatible) memory manager. It could also have members that can't be copied, or would cause problems if they were copied with a different manager. There could be side-effects to copying and all sorts of other complications.
The result is that the pointer is the only bit of information on the object that you really properly understand, and provides enough information to figure out where the other bits you need are.

How do I make a container that holds pointers of any type

This may seem silly, but I'd like to make a container that holds pointers of any type, so that I can store every single pointer in there and then easily delete them later. I tried:
vector<void*> v;
v.push_back(new Dog());
v.push_back(new Cat());
cout << v[0]; // prints mem address
cout << v[1]; // prints another mem address
cout << *v[0]; // compiler yells at me
But apparently you can't dereference void pointers. Is there a way to make a generic container of pointers of any type, without having to make every single class extend a superclass called "Object" or something?
You can implement pointer wrapper a template class which inherits from a common base class and place those to the container instead. Something along the lines:
class pointer_wrapper_base
{
public:
virtual void delete_pointee()=0;
protected:
void *m_ptr;
};
template<class T>
class pointer_wrapper: public pointer_wrapper_base
{
public:
pointer_wrapper(T *ptr_) {m_ptr=ptr_;}
virtual void delete_pointee()
{
delete (T*)m_ptr;
}
};
Once you have this class, you can use poly-variant class for example, which is like variant class, but all the different variations have common base class. I have an implementation here if you want to have a look: http://sourceforge.net/p/spinxengine/code/HEAD/tree/sxp_src/core/utils.h (search for poly_pod_variant):
std::vector<poly_pod_variant<pointer_wrapper_base> > x;
x.push_back(pointer_wrapper<Cat>(new(Cat)));
x[0]->delete_pointee();
Or if you are ok with dynamic allocation for the wrappers, then you can of course just store pointers to pointer_wrap_base to the vector, e.g.
std::vector<std::unique_ptr<pointer_wrapper_base> > x;
x.push_back(std::unique_ptr<pointer_wrapper_base>(new(pointer_wrapper<Cat>)(new Cat)));
x[0]->delete_pointee();
Look into using some of Boost's classes, such as boost::any, http://www.boost.org/doc/libs/1_55_0/doc/html/any.html and their example code, http://www.boost.org/doc/libs/1_55_0/doc/html/any/s02.html
Alternately, look at Boost's variant as well.
In general, learn Boost. It will blow you away and turbo charge your C++ development.
C++ has a static type system. This means that types of all expressions must be known at compile time. The solution to your problem depends on what are you going to do with the objects.
Option 1: Have Cat and Dog derive from a class
This makes sense if all the objects have a common interface, and if you can make them to derive from a class.
std::vector<std::unique_ptr<Animal>> vec; // good practice - automatically manage
// dynamically allocated elements with
// std::unique_ptr
vec.push_back(std::make_unique<Dog>()); // or vec.emplace_back(new Dog());
vec.push_back(std::make_unique<Cat>()); // or vec.emplace_back(new Cat());
std::cout << *v[0];
Option 2: boost::any
This makes sense if the types are unrelated. For example, you storing ints and objects of your class. Obviously you can't make int derived from your class. So you use boost::any to store, and then cast it back to the type of the object. Exception of type boost::bad_any_cast is thrown if you cast to unrelated type.
std::vector<boost::any> vec;
vec.push_back(Dog());
vec.push_back(25);
std::cout << boost::any_cast<int>(vec[1]);
Also, pointers. Solution to "I want manage my memory properly" is "Don't use new and delete" These are the tools to help you doing this, in no particular order:
std::string instead of null-terminated strings
std::vector<T> instead of new T[]
std::unique_ptr<T> instead of raw pointers to polymorphic objects
...or std::shared_ptr<T> if you share them
cout << * ((Dog *)v[0]);
I'm assuming here that your Dog class is stringifiable, otherwise you'll get a different kind of error, but your type conversion problem should be solved by (Dog *) type cast.

Why doesn't polymorphism work without pointers/references?

I did find some questions already on StackOverflow with similar title, but when I read the answers, they were focusing on different parts of the question, which were really specific (e.g. STL/containers).
Could someone please show me, why you must use pointers/references for implementing polymorphism? I can understand pointers may help, but surely references only differentiate between pass-by-value and pass-by-reference?
Surely so long as you allocate memory on the heap, so that you can have dynamic binding, then this would have been enough. Obviously not.
"Surely so long as you allocate memory on the heap" - where the memory is allocated has nothing to do with it. It's all about the semantics. Take, for instance:
Derived d;
Base* b = &d;
d is on the stack (automatic memory), but polymorphism will still work on b.
If you don't have a base class pointer or reference to a derived class, polymorphism doesn't work because you no longer have a derived class. Take
Base c = Derived();
The c object isn't a Derived, but a Base, because of slicing. So, technically, polymorphism still works, it's just that you no longer have a Derived object to talk about.
Now take
Base* c = new Derived();
c just points to some place in memory, and you don't really care whether that's actually a Base or a Derived, but the call to a virtual method will be resolved dynamically.
In C++, an object always has a fixed type and size known at compile-time and (if it can and does have its address taken) always exists at a fixed address for the duration of its lifetime. These are features inherited from C which help make both languages suitable for low-level systems programming. (All of this is subject to the as-if, rule, though: a conforming compiler is free to do whatever it pleases with code as long as it can be proven to have no detectable effect on any behavior of a conforming program that is guaranteed by the standard.)
A virtual function in C++ is defined (more or less, no need for extreme language lawyering) as executing based on the run-time type of an object; when called directly on an object this will always be the compile-time type of the object, so there is no polymorphism when a virtual function is called this way.
Note that this didn't necessarily have to be the case: object types with virtual functions are usually implemented in C++ with a per-object pointer to a table of virtual functions which is unique to each type. If so inclined, a compiler for some hypothetical variant of C++ could implement assignment on objects (such as Base b; b = Derived()) as copying both the contents of the object and the virtual table pointer along with it, which would easily work if both Base and Derived were the same size. In the case that the two were not the same size, the compiler could even insert code that pauses the program for an arbitrary amount of time in order to rearrange memory in the program and update all possible references to that memory in a way that could be proven to have no detectable effect on the semantics of the program, terminating the program if no such rearrangement could be found: this would be very inefficient, though, and could not be guaranteed to ever halt, obviously not desirable features for an assignment operator to have.
So in lieu of the above, polymorphism in C++ is accomplished by allowing references and pointers to objects to reference and point to objects of their declared compile-time types and any subtypes thereof. When a virtual function is called through a reference or pointer, and the compiler cannot prove that the object referenced or pointed to is of a run-time type with a specific known implementation of that virtual function, the compiler inserts code which looks up the correct virtual function to call a run-time. It did not have to be this way, either: references and pointers could have been defined as being non-polymorphic (disallowing them to reference or point to subtypes of their declared types) and forcing the programmer to come up with alternative ways of implementing polymorphism. The latter is clearly possible since it's done all the time in C, but at that point there's not much reason to have a new language at all.
In sum, the semantics of C++ are designed in such a way to allow the high-level abstraction and encapsulation of object-oriented polymorphism while still retaining features (like low-level access and explicit management of memory) which allow it to be suitable for low-level development. You could easily design a language that had some other semantics, but it would not be C++ and would have different benefits and drawbacks.
I found it helpful to understand that a copy constructor is invoked when assigning like this:
class Base { };
class Derived : public Base { };
Derived x; /* Derived type object created */
Base y = x; /* Copy is made (using Base's copy constructor), so y really is of type Base. Copy can cause "slicing" btw. */
Since y is an actual object of class Base, rather than the original one, functions called on this are Base's functions.
Consider little endian architectures: values are stored low-order-bytes first. So, for any given unsigned integer, the values 0-255 are stored in the first byte of the value. Accessing the low 8-bits of any value simply requires a pointer to it's address.
So we could implement uint8 as a class. We know that an instance of uint8 is ... one byte. If we derive from it and produce uint16, uint32, etc, the interface remains the same for purposes of abstraction, but the one most important change is size of the concrete instances of the object.
Of course, if we implemented uint8 and char, the sizes may be the same, likewise sint8.
However, operator= of uint8 and uint16 are going to move different quantities of data.
In order to create a Polymorphic function we must either be able to:
a/ receive the argument by value by copying the data into a new location of the correct size and layout,
b/ take a pointer to the object's location,
c/ take a reference to the object instance,
We can use templates to achieve a, so polymorphism can work without pointers and references, but if we are not counting templates, then lets consider what happens if we implement uint128 and pass it to a function expecting uint8? Answer: 8 bits get copied instead of 128.
So what if we made our polymorphic function accept uint128 and we passed it a uint8. If our uint8 we were copying was unfortunately located, our function would attempt to copy 128 bytes of which 127 were outside of our accessible memory -> crash.
Consider the following:
class A { int x; };
A fn(A a)
{
return a;
}
class B : public A {
uint64_t a, b, c;
B(int x_, uint64_t a_, uint64_t b_, uint64_t c_)
: A(x_), a(a_), b(b_), c(c_) {}
};
B b1 { 10, 1, 2, 3 };
B b2 = fn(b1);
// b2.x == 10, but a, b and c?
At the time fn was compiled, there was no knowledge of B. However, B is derived from A so polymorphism should allow that we can call fn with a B. However, the object it returns should be an A comprising a single int.
If we pass an instance of B to this function, what we get back should be just a { int x; } with no a, b, c.
This is "slicing".
Even with pointers and references we don't avoid this for free. Consider:
std::vector<A*> vec;
Elements of this vector could be pointers to A or something derived from A. The language generally solves this through the use of the "vtable", a small addition to the object's instance which identifies the type and provides function pointers for virtual functions. You can think of it as something like:
template<class T>
struct PolymorphicObject {
T::vtable* __vtptr;
T __instance;
};
Rather than every object having its own distinct vtable, classes have them, and object instances merely point to the relevant vtable.
The problem now is not slicing but type correctness:
struct A { virtual const char* fn() { return "A"; } };
struct B : public A { virtual const char* fn() { return "B"; } };
#include <iostream>
#include <cstring>
int main()
{
A* a = new A();
B* b = new B();
memcpy(a, b, sizeof(A));
std::cout << "sizeof A = " << sizeof(A)
<< " a->fn(): " << a->fn() << '\n';
}
http://ideone.com/G62Cn0
sizeof A = 4 a->fn(): B
What we should have done is use a->operator=(b)
http://ideone.com/Vym3Lp
but again, this is copying an A to an A and so slicing would occur:
struct A { int i; A(int i_) : i(i_) {} virtual const char* fn() { return "A"; } };
struct B : public A {
int j;
B(int i_) : A(i_), j(i_ + 10) {}
virtual const char* fn() { return "B"; }
};
#include <iostream>
#include <cstring>
int main()
{
A* a = new A(1);
B* b = new B(2);
*a = *b; // aka a->operator=(static_cast<A*>(*b));
std::cout << "sizeof A = " << sizeof(A)
<< ", a->i = " << a->i << ", a->fn(): " << a->fn() << '\n';
}
http://ideone.com/DHGwun
(i is copied, but B's j is lost)
The conclusion here is that pointers/references are required because the original instance carries membership information with it that copying may interact with.
But also, that polymorphism is not perfectly solved within C++ and one must be cognizant of their obligation to provide/block actions which could produce slicing.
You need pointers or reference because for the kind of polymorphism you are interested in (*), you need that the dynamic type could be different from the static type, in other words that the true type of the object is different than the declared type. In C++ that happens only with pointers or references.
(*) Genericity, the type of polymorphism provided by templates, doesn't need pointers nor references.
When an object is passed by value, it's typically put on the stack. Putting something on the stack requires knowledge of just how big it is. When using polymorphism, you know that the incoming object implements a particular set of features, but you usually have no idea the size of the object (nor should you, necessarily, that's part of the benefit). Thus, you can't put it on the stack. You do, however, always know the size of a pointer.
Now, not everything goes on the stack, and there are other extenuating circumstances. In the case of virtual methods, the pointer to the object is also a pointer to the object's vtable(s), which indicate where the methods are. This allows the compiler to find and call the functions, regardless of what object it's working with.
Another cause is that very often the object is implemented outside of the calling library, and allocated with a completely different (and possibly incompatible) memory manager. It could also have members that can't be copied, or would cause problems if they were copied with a different manager. There could be side-effects to copying and all sorts of other complications.
The result is that the pointer is the only bit of information on the object that you really properly understand, and provides enough information to figure out where the other bits you need are.

Howto avoid copying when casting a vector<Derived*> to a vector<Base*>

Is there a way to avoid copying large vectors, when a function expects a vector with (pointer to) baseclass objects as input but I only have a vector of (pointers to) derived objects?
class Base {};
class Derived : public Base {};
void doStuff(vector<Base*> &vec)
{
//do stuff with vec objects
}
int main()
{
vector<Derived*> fooDerived(1000000);
vector<Base*> fooBase(fooDerived.begin(), fooDerived.end()); // how to avoid copying here?
doStuff(fooBase);
}
If you could use a vector<Derived*> as if it where a vector<Base*>, you could add a pointer to a class OtherDerived : public Base to that vector. This would be dangerous.
You can't cast the vectors. However, you shouldn't really need to either.
If you really want, you could use Boost Iterator (documentation)
static Derived* ToDerived(Base* b)
{
return dynamic_cast<Derived*>(b); // return null for incompatible subtypes
}
static void DoSomething(Derived* d)
{
if (!d)
return; // incompatible type or null entry
// do work
}
// somewhere:
{
std::vector<Base*> bases;
std::for_each(
boost::make_transform_iterator(bases.begin(), &ToDerived),
boost::make_transform_iterator(bases.end(), &ToDerived),
DoSomething);
}
Note: a particularly handy effect of using dynamic_cast<Derived*> is that if the runtime type of the object cannot be casted to Derived* (e.g. because it is actually an OtherDerived*, it will simply return a null pointer.
If your definition of doStuff() is absolutely mandatory, then you won't get around the copy. Containers of pointers aren't "covariant" with respect to the pointee's class hierarchy, for a whole host of reasons. (For example, if you could treat vector<Derived*> like a vector<Base*>, you could insert Base-pointers into it which wouldn't behave like Derived-pointers. In any event, the parameter type of a container is fixed and part of the container's type.)
If you do have some leeway with the function, you could restructure the code a bit: You could make it a template parametrized on the container, or a template on an iterator range, and/or you could split the actual workload into a separate function. For example:
void doStuffImpl(Base *);
template <typename Iter>
void doStuff(Iter begin, Iter end)
{
for (Iter it = begin; it != end; ++it)
{
doStuffImpl(*it); // conversion happens here
}
}
Since your template for the vector is a pointer, your fooBase local variable is going to be using reference. Maybe the question is not very clear, if you could specify clearly we may try to address the problem!
One way is to store only pointers to the base class like this:
vector<Base*> v(100);
v.push_back(new Derived());
doStuff(v);
I found this topic interesting where they mention you can safely cast Derived* to Base* (but not the other side) while this is not possible from vector<Derived*> to vector<Base*> but you kind of should be able to... (Ok, the address of the vector changes, thus needing a copy)
One dirty solution they mention is using reinterpret_cast<vector<Base*> >(vector<Derived*>) but not sure if it's better then using the copy constructor... probably the same thing happens.