Weird behaviour of C++ pure virtual classes
Please help a bloody C++-beginner understand pure virtual classes better.
I tried a simple example with C++ virtuals and am not sure about the result.
If I tried the same in another programming language, such as Java, the output would be
Desired/Expected Output
1 -> Tweet
2 -> Tweet
However, here the output is
Actual Output
1 -> Meow
2 -> Tweet
Why is that?
It seems as if the operator= of the class animal had no effect.
Is that because a standard operator= of the class animal is called which just does nothing?
How would I achieve a behaviour similar to Java without having to use pointers?
Is that even possible?
Below is an example, simplified as far as possible:
Code
#include <string>
#include <iostream>
using namespace std;
class Animal
{
public:
virtual string test() const = 0;
};
class Cat : public Animal
{
public:
virtual string test() const
{
return "Meow";
}
};
class Bird : public Animal
{
public:
virtual string test() const
{
return "Tweet";
}
};
void test_method(Animal &a)
{
Bird b;
a = b;
cout << "1 -> " << a.test() << endl;
cout << "2 -> " << b.test() << endl;
}
int main(int args, char** argv)
{
Cat c;
Animal &a = c;
test_method(a);
return 0;
}
Apparently, you have some unrealistic expectations about the behavior of a = b assignment.
Your a = b assignment simply copies data from the Animal subobject of the Bird object b to the Animal subobject of the Cat object referred to by a. Since Animal has no data in it at all, your a = b is an empty operation, a no-op. It does not do anything at all.
But even if it did do something, it would simply copy the data fields. It cannot copy the polymorphic identity of the object. There is no way to change the polymorphic identity of an object in C++. There's no way to change an object's type. Regardless of what you do, a Cat will always remain a Cat.
The code you wrote can be shortened to
Cat a;
Bird b;
(Animal &) a = b; // Same as `(Animal &) a = (Animal &) b`
a.test(); // still a `Cat`
b.test(); // still a `Bird`
You did the same thing in a more obfuscated way.
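To make the "copies only the data fields" point visible, here is a minimal sketch in which Animal does carry data. The name member and the describe() helper are assumptions added for illustration; they are not part of the question's code.

```cpp
#include <string>
#include <utility>

// Hypothetical variant of the question's classes: Animal now has a data member.
class Animal {
public:
    explicit Animal(std::string n) : name(std::move(n)) {}
    virtual ~Animal() = default;
    virtual std::string test() const = 0;
    std::string name;
};

class Cat : public Animal {
public:
    Cat() : Animal("cat") {}
    std::string test() const override { return "Meow"; }
};

class Bird : public Animal {
public:
    Bird() : Animal("bird") {}
    std::string test() const override { return "Tweet"; }
};

// Assigns a Bird to a Cat through an Animal reference and reports the result.
std::string describe() {
    Cat c;
    Bird b;
    Animal& a = c;
    a = b;  // calls Animal::operator=, which copies only Animal::name
    return a.name + " says " + a.test();
}
```

Calling describe() yields "bird says Meow": the data was copied, but the object is still a Cat.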
In C++ references can't be rebound; they always refer to the variable you created them against. In this case a is a reference to c, and thus it is and will always be a Cat.
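When you assign to a reference, you aren't rebinding the reference; you're assigning to the underlying object that the reference refers to. Because a has the static type Animal&, the assignment a = b invokes Animal::operator= on the Animal part of c, with the Animal part of b as its argument.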
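The rule is easiest to see with plain ints. A small sketch; the function name assignmentWritesThrough is made up for this example:

```cpp
// Returns true when assigning through a reference changed the referent
// and left the reference bound to the same variable.
bool assignmentWritesThrough() {
    int x = 1;
    int y = 2;
    int& r = x;  // r is bound to x for its whole lifetime
    r = y;       // copies the value of y into x; r still refers to x
    return x == 2 && &r == &x && &r != &y;
}
```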
You are a victim of what is known as object slicing. C++ differs from Java and C# in the way it handles objects. In C#, for example, every class type is managed through a reference when you create a new instance, and only primitive types and structs are handled as values. This means that in Java or C#, when you assign an object, you are only assigning references. For instance:
Object a = new Object();
Object b = a;
Will result in both a and b pointing to the same object (The one created when we assigned a).
In C++ the story is different. You can create an instance of an object on the heap or on the stack, and you can pass objects by reference, by pointer or by value.
If you assign pointers, it will behave similarly to C# and Java. But if you assign an object by value, that is, you assign the actual object and not a pointer to it, the contents of the object are copied. Every user-defined type in C++ is copyable by default.
When you have inheritance and polymorphism involved, this copy behaviour creates an issue: when you copy a child type into a parent type, the copy will only contain the parent's portion of the child, so the child-specific behaviour is lost.
In your example, when you assign a Bird to a Cat through an Animal reference, only the Animal part is copied, and the target remains a Cat. That's why you lose the polymorphism you expected. If your base class is abstract, creating a standalone copy of just the base part won't even be possible.
The solution, if you want to retain polymorphism, is to pass the object by pointer or reference instead of by value. You can create the object on the heap and assign that pointer, you can take the address of an object on the stack and assign that pointer, or you can bind a reference to the object and pass that instead.
The lesson to be learned here is to NEVER pass or assign by value objects with any sort of polymorphism, or you will end up slicing them.
Try this.
#include <string>
#include <iostream>
using namespace std;
class Animal
{
public:
virtual string test() const = 0;
};
class Cat : public Animal
{
public:
virtual string test() const
{
return "Meow";
}
};
class Bird : public Animal
{
public:
virtual string test() const
{
return "Tweet";
}
};
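void test_method(Animal *a)
{
Bird *b = new Bird();
a = b; // rebinds the local pointer only; the caller's Cat is untouched
cout << "1 -> " << a->test() << endl;
cout << "2 -> " << b->test() << endl;
delete b; // memory from new must be released with delete, not free()
}
int main(int args, char** argv)
{
Cat *c = new Cat();
Animal *a = c;
test_method(a);
delete c;
return 0;
}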
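The pointer approach can also be written with std::unique_ptr, so that no manual delete is needed. This is a minimal sketch, assuming C++14 for std::make_unique; the virtual destructor and the rebindAndTest() helper are additions for this example and are not in the code above.

```cpp
#include <memory>
#include <string>

class Animal {
public:
    virtual ~Animal() = default;  // lets a unique_ptr<Animal> destroy derived objects
    virtual std::string test() const = 0;
};

class Cat : public Animal {
public:
    std::string test() const override { return "Meow"; }
};

class Bird : public Animal {
public:
    std::string test() const override { return "Tweet"; }
};

// Points an owning Animal pointer first at a Cat, then at a Bird.
std::string rebindAndTest() {
    std::unique_ptr<Animal> a = std::make_unique<Cat>();
    std::string before = a->test();
    a = std::make_unique<Bird>();  // the Cat is destroyed; a now owns a Bird
    return before + " -> " + a->test();
}
```

rebindAndTest() returns "Meow -> Tweet". Unlike assigning objects, assigning pointers really does switch which object you are talking to.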
Remove the a = b, and you'll get "Meow" followed by "Tweet"...
And in a bit more generic attitude:
First, define a generic interface for your generic class (Animal in this specific example).
You can then use this interface with any sub-class instance (Cat and Bird in this specific example).
Each instance will "act" according to your specific implementation.
Your mistake in function test_method was using an instance of the sub-class without referring to it through the generic class (with a reference or a pointer).
In order to change it into a generic function for Animal instances, you could do something like:
void test_method(Animal &a)
{
cout << a.test() << endl;
}
Related
I made a toy example of a problem I'm facing with my code:
I have an animal which I don't know what will be after a later stage, so I initialize it to a generic animal.
But later on, I want to make it a cat, so I'm assigning myAnimal to be a Cat
#include <iostream>
class Animal {
public:
int weight;
virtual void Sound() {
// To be implemented by child class
}
};
class Cat : public Animal {
public:
void Sound() {
std::cout << "Miau" << std::endl;
}
// Only cats purr
void Purr() {
std::cout << "Purr" << std::endl;
}
};
int main() {
// At this point I don't know which animal I'll have, so I initialize it
// to a generic Animal
Animal* myAnimal;
animal->weight = 10;
// At this point of the code, I know what animal I want, so I assign animal
// to be a Cat
double selectedAnimal = 0;
if (selectedAnimal == 0) {
myAnimal = &Cat();
// myAnimal = new Cat(); // this will just create a new Cat, losing
// the already assigned weight.
// I want to "upgrade" my generic animal, keeping its properties and adding
// new ones specific to cats
}
myAnimal->Sound();
myAnimal->Purr(); // ERROR: Class Animal has no member Purr
return 0;
}
I think I'm not assigning myAnimal correctly to be a Cat, so it is still an Animal. However, the compiler doesn't complain when I do myAnimal = &Cat();.
So I don't understand if the compiler allows me to assign Animal to the class Cat myAnimal = &Cat(); why it complains when I try to use a method specific of the class Cat.
How should I reassign my generic animal in such a way that is now a full Cat with all its methods?
EDIT:
Answering some comments:
-Animal should not have a Purr method, only cats purr.
-I don't know at compile time what Animal will I have, that's why I assign it to be generic at the beginning.
I can reassign myAnimal to be a new Cat, but then any variables already set to the generic animal will be lost (eg: Animal might have a weight variable already set before knowing it's a Cat)
I'll try the down-casting suggested by @Some programmer dude
Firstly, when creating the Cat ensure that you allocate the memory correctly. Currently you are taking the address of a temporary object i.e. &Cat() which is not valid C++.
You can do this in two different ways:
// On the stack
Cat cat;
myAnimal = &cat;
// OR
myAnimal = new Cat(); // On the heap (remember to delete the cat)
Then, when you want to use the animal as a cat, you can use a downcast e.g.:
auto myCatPtr = dynamic_cast<Cat*>(myAnimal);
if (myCatPtr) {
// This means the pointer is valid
myCatPtr->Purr();
}
Here is a working example.
The answer from Matthias Grün shows how to manipulate C++ to do what you want.
However, my advice is to stop making C++ do what you think is right, and do it the way C++ wants to do it.
C++ wants you to never throw the type of an object away. It is a strongly-typed language. It is almost always an "anti-pattern" to throw away the type of an object.
One common technique for avoiding this anti-pattern, is to separate "ownership" from "use". You can use a pointer to unknown-type, easily. Owning an object by a pointer to unknown type is really hard.
int main()
{
Cat my_cat;
Animal* any_animal = &my_cat; // non-owning pointer.
any_animal->Sound();
my_cat.Purr();
}
All that myAnimal = &Cat(); does is create a temporary of type Cat and assign the address of it to myAnimal, leaving you with a dangling pointer. myAnimal will point to an invalid address afterwards.
Also, even if it had been correctly assigned, for example by writing
myAnimal = new Cat{};
it would still require a cast so the compiler knows that it's dealing with a Cat, for example like so:
auto pCatInstance = dynamic_cast<Cat*>(myAnimal);
if (pCatInstance != nullptr)
pCatInstance->Purr();
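The checked downcast can be wrapped in a small helper. This is a sketch under some assumptions: Sound() and Purr() return strings here instead of printing, and the Dog class and the purrIfCat() function are hypothetical names used only to show the failing case.

```cpp
#include <string>

class Animal {
public:
    int weight = 0;
    virtual ~Animal() = default;  // also makes Animal polymorphic for dynamic_cast
    virtual std::string Sound() const { return ""; }
};

class Cat : public Animal {
public:
    std::string Sound() const override { return "Miau"; }
    std::string Purr() const { return "Purr"; }  // only cats purr
};

class Dog : public Animal {  // hypothetical second subclass
public:
    std::string Sound() const override { return "Wuff"; }
};

// Returns the purr if the animal really is a Cat, otherwise an empty string.
std::string purrIfCat(Animal* animal) {
    if (auto* cat = dynamic_cast<Cat*>(animal)) {
        return cat->Purr();
    }
    return "";  // not a Cat (or a null pointer): dynamic_cast produced nullptr
}
```

Note that the cast never turns an existing Animal into a Cat; it only reveals whether the object was created as one.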
If I remember correctly, in Java, we can pass a subclass to a function with a superclass. The code would look like this.
// Assume the classes were already defined, and Apple
// and Pineapple are derived from Fruit.
Fruit apple = new Apple();
Fruit pineapple = new Pineapple();
public void iHaveAPenIHaveAn(Fruit fruit) { ... } // :)
...
public static void main(String[] arg)
{
iHaveAPenIHaveAn(apple); // Uh! Apple-pen.
iHaveAPenIHaveAn(pineapple); // Uh! Pineapple-pen.
}
However, in C++, I noticed from here that you need to use a reference variable of the base class (Is that the proper term?) instead of a regular variable of the base class.
Assuming you have two classes: a base class A, and an A-derived class B.
class A { ... };
class B : public A { ... };
If we have a neverGonna() function that takes in a class A argument, then why should the function look like this:
void neverGonna(A& a) { ... }
...
B giveYouUp;
neverGonna(giveYouUp);
instead of this?
void neverGonna(A a) { ... }
...
B letYouDown;
neverGonna(letYouDown);
What is the reasoning behind it?
Objects in Java are referenced by pointers. (Java calls them "references".) If you write Apple a; in Java, the variable a is actually a pointer, pointing to an Apple object somewhere in memory. In C++ though, if you write Apple a; then the variable a contains the entire Apple object. To get pointers, you need to explicitly declare a to be a pointer variable, as in Apple* a;, or a reference, as in Apple& a;.
The same goes for function arguments. In Java, if you send an Apple to a method that expects a Fruit (assuming that Apple is a subclass of Fruit), what is actually sent is a pointer, and the object itself is stored somewhere else.
In C++, the object itself is copied, and sent to the function. However, if the function expects a Fruit object, and you send an Apple object with some extra member variables in it, the Apple won't fit in the space that the function has to receive the Fruit. The Apple object has to be converted to a Fruit object, and any apple-specific extra stuff is removed. This is called object slicing.
The reason is that sizeof(A) and sizeof(B) are not the same. If a function (in C++ or the like) takes a parameter by value, it must know how large the value for that parameter will be in order to interpret the values in memory correctly. As a rough (but technically sketchy) example, suppose I have a function that takes an A and an int. Maybe it expects the incoming values to be stored like
AAAAAAiiii
where the first 6 bytes are the A object, and the last 4 bytes are the integer value. But I create a B, which looks something like AAAAAABBBBB... so now the function receives
AAAAAABBBBBiiii
and that's no good. Passing either a pointer or a reference allows the function to know how many of the bytes it's receiving represent that first parameter.
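The size argument can be checked directly. In this sketch the member arrays mirror the 6-byte and 5-byte pieces of the picture above; the struct layouts and the bytesReceivedByValue() function are assumptions made for illustration.

```cpp
#include <cstddef>

struct A { char a[6]; };
struct B : A { char b[5]; };

// A B carries extra bytes, so it cannot fit in the space reserved for an A.
static_assert(sizeof(B) > sizeof(A), "B is larger than A");

// The parameter is a complete A object of fixed size, whatever the caller passes.
std::size_t bytesReceivedByValue(A a) { return sizeof a; }
```

bytesReceivedByValue(B{}) still reports sizeof(A): the B was cut down to its A part on the way in.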
So why isn't this a thing in Java? In Java, objects are always handled "by reference"; that is, when you say
Fruit apple = new Apple();
you're creating a variable apple which is a reference to a Fruit. If a Java method says
public void iHaveAPenIHaveAn(Fruit fruit)
it's going to accept a reference to a fruit.
In other words, what your Java sample is doing actually is the same thing as the by-reference version of your C++ sample (the one with void neverGonna(A& a)).
If you pass the object by value, then a (temporary) copy of the object is pushed into the stack before the function is called.
Since the function prototype dictates that the input argument's type is the base class, that copy contains only the "base" part of the object.
In other words, the original object is "sliced" and your hope for polymorphism goes to waste...
Because passing by value will cause object slicing. Consider this example:
#include <iostream>
using namespace std;
class Base
{
public:
virtual void foo()
{
cout << "Base foo()" << endl;
}
};
class Derived : public Base
{
public:
void foo()
{
cout << "Derived foo()" << endl;
}
};
void print(Base b)
{
b.foo(); // object will be sliced here
}
int main()
{
Base* b = new Derived();
print(*b); // passing by value will cause object slicing;
return 0;
}
Since we passed b by value to print(), object slicing occurs and the non-Base part of the object is cut off. The output would look like:
Base foo()
instead of:
Derived foo()
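For comparison, here is a sketch of the same classes with both parameter styles side by side. The functions return strings instead of printing so the two results can be compared; byValue() and byReference() are names invented for this example.

```cpp
#include <string>

class Base {
public:
    virtual ~Base() = default;
    virtual std::string foo() const { return "Base foo()"; }
};

class Derived : public Base {
public:
    std::string foo() const override { return "Derived foo()"; }
};

// The parameter is a fresh Base copied from the argument: the object is sliced.
std::string byValue(Base b) { return b.foo(); }

// The parameter refers to the caller's object: its dynamic type is preserved.
std::string byReference(const Base& b) { return b.foo(); }
```

Given a Derived d, byValue(d) returns "Base foo()" while byReference(d) returns "Derived foo()".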
Heyo, I'm a little confused about how method overriding works when you involve calling objects as their parent type.
Here is my example code:
#include <iostream>
#include <cstdlib>
#include <vector>
using namespace std;
class A {
public:
A() {
std::cout << "Made A.\n";
}
void doThing() {
std::cout << "A did a thing.\n";
};
};
class B : public A {
public:
B() {
std::cout << "Made B.\n";
}
void doThing() {
std::cout << "B did a thing.\n";
};
};
class C : public A {
public:
C() {
std::cout << "Made C.\n";
}
void doThing() {
std::cout << "C did a thing.\n";
};
};
int main(int argc, char** argv) {
std::cout << "\n";
std::cout << "Make objects: \n";
A a;
B b;
C c;
std::cout << "Call objects normally: \n";
a.doThing();
b.doThing();
c.doThing();
std::cout << "Call objects as their parent type from a vector: \n";
vector<A> vect;
vect.push_back(a); vect.push_back(b); vect.push_back(c);
for(int i=0;i<vect.size();i++)
vect.data()[i].doThing();
return 0;
}
And here is the output I get:
Make objects:
Made A.
Made A.
Made B.
Made A.
Made C.
Call objects normally:
A did a thing.
B did a thing.
C did a thing.
Call objects as their parent type from a vector:
A did a thing.
A did a thing.
A did a thing.
This same code in another language (like Java) would produce this output:
Make objects:
Made A.
Made B.
Made C.
Call objects normally:
A did a thing.
B did a thing.
C did a thing.
Call objects as their parent type from a vector:
A did a thing.
B did a thing.
C did a thing.
In short, how do I achieve that second output in c++?
Whenever you pass a Derived object by value to a function taking a Base, something called "slicing" happens. Basically, only the Base part of the Derived object is being used.
You need to pass the object by reference or by pointer to avoid these issues. For example, declaring
f(Base&)
allows passing in a Derived object, i.e. allows you to write
f(Derived)
In addition, to enable run-time polymorphism, your function must be marked virtual. Java by default has everything implicitly marked virtual. However, this is C++ and you don't pay for what you don't use (virtual functions are an overhead).
PS: in your code, even if you want, you cannot use a std::vector of references. However, you can wrap the objects using std::reference_wrapper which allows you to "simulate" a std::vector of references:
std::vector<std::reference_wrapper<A>> vect
and use the get member function to retrieve the reference
for(int i=0;i<vect.size();i++)
vect[i].get().doThing();
Or, perhaps simpler, just use a std::vector<A*>
You need to use the virtual keyword to enable functions to be overridden in subclasses.
Ok So here is what's happening:
Make objects:
A is quite obvious I think. The A object gets constructed and its default constructor prints
Made A
When you instantiate a B object, its parent class gets constructed fully first. In this case the parent class is A, so it gets constructed with the default constructor, which prints out
Made A
After that the B class remaining part gets constructed and runs its constructor which prints out
Made B
The same thing happens when instantiating C
Calling functions on the objects:
It's just a simple function call. Since you redefine the function in each class, the version belonging to the object's own class gets called and not the parent's function.
When you create a vector of the objects, you copy the objects into it, since you store neither references nor pointers. You have not written a copy constructor, so the implicitly generated copy constructor of A runs, which copies only the A part of each object. This way, from a B object you get an A object, whose function prints "A did a thing." and not "B did a thing.". The same happens with C.
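Putting the pieces together (a virtual function plus a container that does not copy the objects as A), one possible sketch looks like this. It swaps in std::unique_ptr for the element type, assumes C++14, and returns the messages instead of printing them; callAll() is a hypothetical helper.

```cpp
#include <memory>
#include <string>
#include <vector>

class A {
public:
    virtual ~A() = default;
    virtual std::string doThing() const { return "A did a thing."; }
};

class B : public A {
public:
    std::string doThing() const override { return "B did a thing."; }
};

class C : public A {
public:
    std::string doThing() const override { return "C did a thing."; }
};

// Stores pointers to A, so each element keeps its real type.
std::vector<std::string> callAll() {
    std::vector<std::unique_ptr<A>> vect;
    vect.push_back(std::make_unique<A>());
    vect.push_back(std::make_unique<B>());
    vect.push_back(std::make_unique<C>());

    std::vector<std::string> out;
    for (const auto& p : vect) {
        out.push_back(p->doThing());  // virtual call: dispatches on the dynamic type
    }
    return out;
}
```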
In the artificial example below, if I static_cast to the base class, when I call the setSnapshot() function it still calls the actual object setSnapshot(). This is what I want to happen. My question is can I always rely on this to work?
In code I am working on, we have this class hierarchy and in the b class there are macros used which static cast to the b type. This is to downcast from a base type so that specialised function in b can be called.
#include <iostream>
class a {
};
class b: public a {
public:
virtual void setSnapshot() { std::cout << "setting b snapshot\n"; }
};
class c : public b {
public:
virtual void setSnapshot() { std::cout << "setting c snapshot\n"; }
};
int main() {
a* o = new c;
//specifically casting to b
static_cast<b*>(o)->setSnapshot(); //prints setting c snapshot - what I want to happen
delete o;
return 0;
}
The title suggests that you're misunderstanding what the cast does. new c creates an object of type c, and it will remain a c until it's destructed.
If you were to cast it to an a, you'd create a copy. But you're only casting pointers. That doesn't affect the original object. That's still a c, and that's why you end up calling c::setSnapshot().
As long as a function is virtual in the statically known type, a call of it will go to the override that is most derived.
For single inheritance this can be understood as a search for an implementation up the base class chain, starting in the most derived class.
In practice, for C++, the dynamic search is not done, and the effect of the search is instead implemented as a simple table lookup.
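A sketch of that behaviour, with two changes from the question's code that are assumptions of this example: the base class a gets a virtual destructor so the object can safely be destroyed through a pointer to a, and setSnapshot() returns its message so the result can be inspected. The callThroughB() helper is hypothetical.

```cpp
#include <memory>
#include <string>

class a {
public:
    virtual ~a() = default;  // added here so destruction via a* is well defined
};

class b : public a {
public:
    virtual std::string setSnapshot() { return "setting b snapshot"; }
};

class c : public b {
public:
    std::string setSnapshot() override { return "setting c snapshot"; }
};

// Casts the pointer to b* and calls the virtual function through it.
std::string callThroughB() {
    std::unique_ptr<a> o = std::make_unique<c>();
    // The static_cast is valid only because *o really is a b (in fact a c).
    return static_cast<b*>(o.get())->setSnapshot();
}
```

callThroughB() returns "setting c snapshot": the cast changes the pointer's static type, not the object it points to.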
I have a pointer to an object, I want to know if that object is either of type of a given class or of type that is a subclass of the given class in C++.
Use dynamic_cast:
class A {
public:
virtual ~A() = default;
};
class B : public A {
};
B * obj = new B();
auto obj2 = dynamic_cast<A*>(obj);
if (obj2 != nullptr) {
std::cout << "B is an A" << std::endl;
}
The pointer you start with must have a type. Let's say that type is T*. Let's say the "given class" is G. I think (although I may be wrong) that it's the complete type of the object that you want to know about, not the relation between the types T and G.
If T is a class type with at least one virtual function, then you can do the test you want on a pointer ptr like this:
if (dynamic_cast<G*>(ptr)) {
// then the complete type of your object is either G or a subclass
} else {
// it isn't
}
If T is not a class type, or if it doesn't have a virtual function, then what you want to do is not possible. You'll have to find a more useful static type for the pointer.
If all you want to know is whether G is "either a base of or the same as" T then you don't need dynamic_cast or for there to be a virtual function. You just need std::is_base_of.
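Both checks can be sketched side by side: std::is_base_of answers the compile-time question about the types, and dynamic_cast answers the run-time question about the object. The isInstanceOf() template is a hypothetical helper, not a standard facility.

```cpp
#include <type_traits>

class A {
public:
    virtual ~A() = default;  // at least one virtual function is required for dynamic_cast
};

class B : public A {};

// Compile-time relation between the types themselves.
static_assert(std::is_base_of<A, B>::value, "A is a base of B");
static_assert(std::is_base_of<A, A>::value, "a class counts as its own base here");
static_assert(!std::is_base_of<B, A>::value, "B is not a base of A");

// Run-time check: is the complete object behind ptr a G or a subclass of G?
template <typename G, typename T>
bool isInstanceOf(T* ptr) {
    return dynamic_cast<G*>(ptr) != nullptr;
}
```

With an A* that points to a B object, isInstanceOf<B>(ptr) is true; with an A* that points to a plain A, it is false.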