A coworker asked me today about code which looks somewhat like this:
#include <iostream>
template <class T>
class IBase {
public:
virtual ~IBase() {}
public:
virtual void foo() = 0;
};
template <class T>
class Base : public IBase<T> {
public:
virtual void bar() {
foo(); // compiler error
}
};
class Derived : public Base<int> {
public:
virtual void foo() {
std::cout << "Hello World!\n";
}
};
int main() {
Derived d;
d.bar();
}
At first he was getting a compiler error saying that "foo()" was not found. OK, so he tried changing it to IBase<T>::foo();. While that compiled, it resulted in a linker error. I immediately recalled that I'd seen this type of problem before and suggested that he write this->foo(); instead. Voilà! Problem solved!
Then he asked me why plain foo(); didn't work. Isn't this->x(); essentially the same as x();? Honestly, I have no idea, but he piqued my interest. So here we are:
In summary:
virtual void bar() {
this->foo(); // works
//IBase<T>::foo(); // linker error
//foo(); // compiler error
}
The question is: why is this-> required, and why won't the other options work?
Because the base class IBase<T> is a dependent type: what its members mean depends on the template parameter, and so isn't known until the template is instantiated. An unqualified name such as foo is not itself dependent, so it is looked up when the template is defined, and that lookup does not search the generic IBase template, since that might be specialised to give foo a different meaning before instantiation.
Qualifying it with IBase<T>:: calls the base-class function non-virtually; that's generally not what you want, especially if (as here) it's a pure virtual function with no implementation. Hence the linker error when you tried that.
Qualifying it with this-> tells the compiler that it's a member, and any further checking is deferred until the template is instantiated. The function is still called virtually.
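A minimal sketch of the two usual ways to reach a member of a dependent base class while keeping the call virtual. The classes mirror the ones in the question, except that foo() returns a string so the result is easy to check. BaseWithUsing and DerivedWithUsing are hypothetical names added for illustration, and override assumes C++11.

```cpp
#include <string>

template <class T>
class IBase {
public:
    virtual ~IBase() {}
    virtual std::string foo() = 0;
};

// Option 1: this-> makes the call depend on T, so foo is looked up
// only when Base<T> is instantiated.
template <class T>
class Base : public IBase<T> {
public:
    std::string bar() { return this->foo(); }
};

// Option 2 (hypothetical variant): a using-declaration names the base
// member explicitly, after which a plain foo() call compiles.
template <class T>
class BaseWithUsing : public IBase<T> {
public:
    using IBase<T>::foo;
    std::string bar() { return foo(); }
};

class Derived : public Base<int> {
public:
    std::string foo() override { return "Derived::foo"; }
};

class DerivedWithUsing : public BaseWithUsing<int> {
public:
    std::string foo() override { return "DerivedWithUsing::foo"; }
};
```

In both variants bar() still dispatches virtually, so it reaches the override in the most derived class.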
Imagine you are the compiler. You have just been reading through and compiling the code and now you have reached the bar function. In this function, you see that there is an attempt to do foo(). At this point, do you know what foo is? You don't. It could come from the base class, but you can't possibly know, because you don't know what T is yet. It's certainly possible that there might be a specialization of IBase for which foo is something else entirely.
When you stick this-> before the function call, it causes the compiler to treat it as a dependent name. That means the compiler will say "Okay, this depends on the type of this, which I do not know yet. I'll wait until later, when I know how the class is being instantiated, before I look for foo."
IBase<T>::foo(); gives a linker error because there simply is no definition for foo in IBase<T>.
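To see why the qualified call ends up at the linker, it may help to look at a non-template sketch where the base function does have a body. Shape, Circle and the two helper functions are hypothetical names, not taken from the question. A qualified call names one specific function and skips virtual dispatch; if that function were pure virtual with no definition, the call would refer to a symbol that does not exist.

```cpp
#include <string>

class Shape {
public:
    virtual ~Shape() {}
    virtual std::string name() { return "Shape"; }

    // Unqualified call: dispatched virtually to the most derived override.
    std::string virtual_call() { return name(); }

    // Qualified call: always Shape::name, no virtual dispatch.
    std::string qualified_call() { return Shape::name(); }
};

class Circle : public Shape {
public:
    std::string name() override { return "Circle"; }
};
```

For a Circle object, virtual_call() returns "Circle" while qualified_call() returns "Shape".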
#include <iostream>
template <class T>
class IBase {
public:
virtual ~IBase() {}
public:
virtual void foo() = 0;
};
void foo() { std::cout << "hello!\n"; } // free function with the same name
template <class T>
class Base : public IBase<T> {
public:
virtual void bar() {
foo(); // which foo?!
}
};
template <>
class IBase<int> {
public:
virtual ~IBase() {}
//virtual void foo() = 0; -- no foo()!
};
class Derived : public Base<int> {
public:
virtual void foo() {
std::cout << "Hello World!\n";
}
};
int main() {
Derived d;
d.bar();
}
The above illustrates why C++ does not allow members of dependent base classes to be found implicitly.
When you call foo() in Base, which foo() should be called? The one in IBase<T> or the free function foo()?
Either we put the decision off until later, or we go with the free function foo().
If we only go with the free function foo() if one is visible, then subtle changes in #include order can massively change what your program does. So if it should call the free function foo(), it must error if one is not found, or we are completely screwed.
If we defer the decision until later, it means less of the template can be parsed and understood until a later date. This moves more errors to the point of instantiation. It also results in some surprising behavior, like in the above case, where someone might think "I'm calling the method foo()", but actually ends up calling the free function foo() with no diagnostic.
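The silent-binding case can be sketched as follows, assuming a compiler that implements two-phase name lookup (current GCC and Clang do; older MSVC modes may behave differently). Unlike in the question, IBase::foo has a body here so that Base<int> can be instantiated directly. The member function names unqualified and through_this are hypothetical.

```cpp
#include <string>

std::string foo() { return "free foo"; }

template <class T>
class IBase {
public:
    virtual ~IBase() {}
    virtual std::string foo() { return "member foo"; }
};

template <class T>
class Base : public IBase<T> {
public:
    // Non-dependent name: bound to ::foo when the template is defined.
    std::string unqualified() { return foo(); }

    // Dependent name: resolved at instantiation, finds IBase<T>::foo.
    std::string through_this() { return this->foo(); }
};
```

With these definitions, Base<int>().unqualified() picks the free function without any diagnostic, while through_this() reaches the member.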
I am working with some code and I saw something odd: a method of a class "MyClass" (let's call it X()) is declared as:
virtual void X() = 0;
So MyClass is an abstract class and in MyClass.cpp X() has a proper implementation...
In derived classes of MyClass, this method is called via MyClass::X();
I thought that = 0 would invalidate its implementation... but it's not the case and it is, in fact, usable in derived classes.
Can you please tell me what the compiler really does when it encounters = 0?
From the standard (9.2 Class members [class.mem]):
= 0 is the pure-specifier
It tells the compiler that:
the class is abstract
the method need not be defined in this class
(it is usually overridden in a derived class)
Example 1 (build fails)
If I understand your question correctly, you have something like that:
class MyClass {
public:
virtual void X() = 0;
};
class MyDerivedClass : MyClass {
public:
virtual void X();
};
void MyDerivedClass::X() { MyClass::X(); }
int main()
{
MyDerivedClass mdc;
mdc.X();
return 0;
}
If so, the build should fail with:
Error:
undefined reference to 'MyClass::X()'
Example 2 (build succeeds)
However, even if the method MyClass::X() is declared as pure virtual,
you can provide a definition. The following would work. The class MyClass
is still abstract, but you can call the method MyClass::X().
#include <iostream>
class MyClass {
public:
virtual void X() = 0; // pure virtual method
};
class MyDerivedClass : MyClass {
public:
virtual void X();
};
void MyClass::X() { // pure virtual method definition
std::cout << "MyClass::X()" << std::endl;
}
void MyDerivedClass::X() {
MyClass::X();
std::cout << "MyDerivedClass::X()" << std::endl;
}
int main()
{
MyDerivedClass mdc;
mdc.X();
return 0;
}
Output:
MyClass::X()
MyDerivedClass::X()
The =0 thing tells the compiler two things:
A regular out-of-class function definition is not required (though allowed). If there is no such definition, and the function is actually called, this is a runtime error.
The class is abstract and cannot be instantiated, whether a definition from point 1 is present or not. Attempts to do so should be flagged as compile time errors. Derived classes that don't override the function are abstract too.
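Both points can be checked at compile time with std::is_abstract from <type_traits>. This is a small sketch assuming C++11, with X() returning a string (unlike the question's void X()) so the call chain is visible.

```cpp
#include <string>
#include <type_traits>

class MyClass {
public:
    virtual ~MyClass() {}
    virtual std::string X() = 0;  // pure virtual
};

// A pure virtual function may still have an out-of-class definition.
std::string MyClass::X() { return "MyClass::X()"; }

class MyDerivedClass : public MyClass {
public:
    // The qualified call reaches the base definition non-virtually.
    std::string X() override { return MyClass::X() + " then MyDerivedClass::X()"; }
};

// Providing a definition does not make MyClass instantiable.
static_assert(std::is_abstract<MyClass>::value,
              "MyClass is still abstract");
static_assert(!std::is_abstract<MyDerivedClass>::value,
              "the override makes MyDerivedClass concrete");
```

Calling X() on a MyDerivedClass object runs the base definition first and then the derived part.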
You cannot create an instance of a class with pure virtual methods, but in some cases you can still end up calling a pure virtual method, and that is an error.
I think the compiler creates a vtable with NULL pointers for pure virtual methods.
Here is the code I'm talking about
#include <stdio.h>
#include <stdlib.h>
class A {
public:
void method() {printf("method A\n");}
virtual void parentMethod() { printf("parentMethod\n"); method(); }
};
class B : public A {
public:
void method() {printf("method B\n");}
};
int main(void) {
A a;
B b;
a.parentMethod();
b.parentMethod();
}
My question is why this is happening? Why when b.parentMethod() is called it doesn't print method B. I realize that it has something to do with method() being in A and B as well as being non-virtual, but I can't get my head around it. Would someone be able to explain this behaviour?
Any help is appreciated.
Your code was missing a virtual keyword:
#include <stdio.h>
#include <stdlib.h>
class A {
public:
virtual void method() {printf("method A\n");} // virtual was missing
void parentMethod() { printf("parentMethod\n"); method(); } // unnecessary virtual keyword
};
class B : public A {
public:
void method() {printf("method B\n");}
};
int main(void) {
A a;
B b;
a.parentMethod();
b.parentMethod();
}
The declaration in the topmost base class must contain the virtual keyword. This is logical if you think about it: when you call method(), the compiler knows immediately that it has to do something more than a normal function call.
Otherwise, it would have to find and iterate on all the derived types to see if they contain a redefinition of method().
My question is why this is happening? Why when b.parentMethod() is called it doesn't print method B. I realize that it has something to
do with method() being in A and B as well as being non-virtual, but I
can't get my head around it. Would someone be able to explain this
behaviour?
C++ has two levels of indirection when it comes to classes/structures. You have your "plain functions" (including "overloaded", "lambdas", static, etc.) and you have your "virtual functions". First, let's explain "plain functions".
struct Foo {
void goo();
};
In this structure, goo is just a plain old function. If you were to write it in C, it would be analogous to declaring,
void goo(struct Foo *this);
Nothing magical, just a plain function with a "hidden" pointer (all c++ functions have that "hidden" this pointer passed to them). Now, let's re-implement this function in an inherited structure,
struct Goo : public Foo {
void goo();
};
...
Goo g;
g.goo();
Foo f;
f.goo();
Here, plain as day, g.goo() calls goo() in Goo structure, and f.goo() calls goo() in Foo structure. So, in C functions, this would be just,
void goo(struct Foo *this);
void goo(struct Goo *this);
provided C did parameter overloading. But still, just plain functions. Calling goo() in Foo object will call different function than calling goo() in Goo object. Compile time resolution only. But now, let's make our function "virtual"
struct Foo {
virtual void goo();
};
struct Goo : public Foo {
void goo(); // <- also virtual because Foo::goo() is virtual
// In C++11 you'll want to write
// void goo() override;
// which verifies that you spelled function name correctly
// and are not making *new* virtual functions! common error!!
};
...
Goo g;
g.goo();
Foo f;
f.goo();
What happens here is that Foo now contains a "virtual table". The compiler now creates a table of functions that maps location of "latest" goo() function. Namely, implicit Foo() constructor would do something like,
virt_table[goo_function_idx] = &Foo::goo;
and then the constructor in Goo() would update this table with,
virt_table[goo_function_idx] = &Goo::goo;
And then when you have,
Foo *f = new Goo();
f->goo();
what happens is akin to,
f->virt_table[goo_function_idx]();
The function location is looked up in the "virtual table", and that function is called. This means runtime resolution of functions, or polymorphism. And this is how Goo::goo() is called.
Without this table, the compiler can only call functions it knows for said object. So in your example, b.parentMethod() is looked up in the table and called. But method() is not part of that table, so only compile-time resolution is attempted. And since this pointer is A*, you get A::method called.
I hope this clears up the "virtual table" business - it's literally an internal lookup table, but only for functions marked as virtual!
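The lookup-table idea can be sketched by hand with plain function pointers. This is only an illustration under simplifying assumptions: the language does not mandate vtables, and typical implementations keep one table per class and have each constructor store a pointer to it in the object, rather than filling in slots per object. All names below (VTable, vptr, foo_goo, goo_goo, make_foo, make_goo, call_goo) are hypothetical, and a "Goo" is modelled simply as a Foo whose pointer refers to the Goo table.

```cpp
#include <string>

struct Foo;  // forward declaration for the function-pointer type

// One slot per virtual function; this sketch has a single slot for goo().
struct VTable {
    std::string (*goo)(const Foo*);
};

// The "object" carries a hidden pointer to its class's table.
struct Foo {
    const VTable* vptr;
};

std::string foo_goo(const Foo*) { return "Foo::goo"; }
std::string goo_goo(const Foo*) { return "Goo::goo"; }

// One table per class, shared by all objects of that class.
const VTable foo_vtable = { &foo_goo };
const VTable goo_vtable = { &goo_goo };

// Stand-ins for the constructors: each installs its own class's table.
Foo make_foo() { Foo f = { &foo_vtable }; return f; }
Foo make_goo() { Foo g = { &goo_vtable }; return g; }

// What f->goo() boils down to: look up the slot, then call through it.
std::string call_goo(const Foo& f) { return f.vptr->goo(&f); }
```

call_goo never needs to know which kind of object it was given; the table pointer stored in the object decides which function runs.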
PS. You may ask, "but the this pointer will get 'upcast' by the virtual table from Foo* to Goo*", and yes, it would. I'll leave it as an exercise for you to figure out why that would always be correct.
Well, you are correct. This happens because your method is not virtual. When you call it through the parent class, there is no way for the compiler to know that it was redefined in B, so A::method is always called. If you mark method as virtual, the call will be routed through the class vtable, so A::method will be replaced by B::method in the derived class.
virtual means that a method can be overridden in a subclass. I think you wanted method, not parentMethod, to be overridden for B. I've renamed parentMethod to foo to be less misleading.
#include <stdio.h>
#include <stdlib.h>
class A {
public:
virtual void method() {printf("method A\n");}
void foo() { printf("foo\n"); method(); }
};
class B : public A {
public:
void method() {printf("method B\n");}
};
int main(void) {
A a;
B b;
a.foo();
b.foo();
}
You'll see that this gives the expected output:
foo
method A
foo
method B
Here's the ideone.
The following does not compile (Apple LLVM version 4.2 (clang-425.0.28)):
class A {
public:
virtual void foo() {};
virtual void foo( int i ) {};
};
class B : public A {
public:
virtual void foo( int i ) override { foo(); }
};
The compiler error is "Too few arguments" for the call to foo() inside B::foo(int). The compiler apparently thinks that I want to recursively call B::foo(int) and does not recognize that I want to call A::foo(void).
The error goes away if I replace the call to foo() by A::foo().
But:
1) Why is this happening? It seems obvious to just resolve to A::foo() in this case (or an overridden function down the class hierarchy).
2) More importantly, if I want to use polymorphic late binding for foo(void) as well, specifying A::foo() is not what I want, B::foo() of course produces a compiler error as well.
Thanks in advance for any enlightenment!
A name in a derived class hides the same name in base classes. In other words, when resolving foo in the context of B, name lookup finds B::foo and stops there. A::foo is never found.
Add using A::foo; within B's definition.
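A sketch of that fix, with the functions returning strings so the path taken is visible. Class C is a hypothetical further subclass added to show that the unqualified foo() call inside B::foo(int) keeps its late binding, which was the second concern in the question.

```cpp
#include <string>

class A {
public:
    virtual ~A() {}
    virtual std::string foo() { return "A::foo()"; }
    virtual std::string foo(int) { return "A::foo(int)"; }
};

class B : public A {
public:
    using A::foo;  // make every A::foo overload visible in B
    std::string foo(int) override { return "B::foo(int) -> " + foo(); }
};

// Hypothetical further subclass, to show the foo() call is still virtual.
class C : public B {
public:
    std::string foo() override { return "C::foo()"; }
};
```

Called on a B, foo(1) ends in A::foo(); called on a C through a B reference, the same code ends in C::foo().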
Say I have the following code:
template <class Derived>
class Base {
public:
virtual void foo_impl() = 0;
void foo() {
static_cast<Derived*>(this)->foo_impl(); //A
(*static_cast<Derived*>(this)).foo_impl(); //B
}
};
class Derived : public Base<Derived> {
private:
void foo_impl() {
bar();
}
};
A few questions:
Will line A generate a virtual function call? Although the majority of what I can find on the internet recommends doing things this way, to me I don't see how the compiler can do static dispatch considering that a pointer to Derived could still actually point to an object of type Derived2 where Derived2 : public Derived.
Does line B fix the issue I brought up in my previous point (if applicable)? It seems like it would, considering that now the call is not on a pointer any more and thus using *. would avoid a virtual function call. But if the compiler treats the dereferenced cast as a reference type, it could still generate a virtual function call... in that case, what is the workaround?
Does adding the C++11 final keyword to foo_impl() change how the compiler would act in either (or any other relevant) case?
Will line A generate a virtual function call?
Yes. foo_impl() is virtual and Derived overrides it. Even though foo_impl() in Derived is not explicitly tagged as virtual, it is in the base class, and this is enough to make it a virtual function.
Does line B fix the issue I brought up in my previous point (if applicable)?
No. It does not matter if the call is on a pointer or on a reference: the compiler still won't know whether you are invoking the function foo_impl() on an instance of a class that derives from Derived, or on a direct instance of Derived. Thus, the call is performed through a vtable.
To see what I mean:
#include <iostream>
using namespace std;
template <class Derived>
class Base {
public:
virtual void foo_impl() = 0;
void foo() {
static_cast<Derived*>(this)->foo_impl();
(*static_cast<Derived*>(this)).foo_impl();
}
};
class Derived : public Base<Derived> {
public:
void foo_impl() {
cout << "Derived::foo_impl()" << endl;
}
};
class MoreDerived : public Derived {
public:
void foo_impl() {
cout << "MoreDerived::foo_impl()" << endl;
}
};
int main()
{
MoreDerived d;
d.foo(); // Will output "MoreDerived::foo_impl()" twice
}
Finally:
Does adding the C++11 final keyword to foo_impl() change how the compiler would act in either (or any other relevant) case?
In theory, yes. The final keyword would make it impossible to override that function in subclasses of Derived. Thus, when performing a function call to foo_impl() through a pointer to Derived, the compiler could resolve the call statically. However, to the best of my knowledge, compilers are not required to do so by the C++ Standard.
CONCLUSION:
In any case, I believe what you actually want to do is not to declare the foo_impl() function at all in the base class. This is normally the case when you use the CRTP. Additionally, you will have to declare class Base<Derived> a friend of Derived if you want it to access Derived's private function foo_impl(). Otherwise, you can make foo_impl() public.
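A minimal sketch of that variant, assuming foo_impl() returns a string instead of calling the question's bar(). There is no virtual function at all, and the friend declaration is what lets the base reach the private implementation.

```cpp
#include <string>

template <class Derived>
class Base {
public:
    // No virtual function: the cast selects Derived::foo_impl at compile time.
    std::string foo() { return static_cast<Derived*>(this)->foo_impl(); }
};

class Derived : public Base<Derived> {
    friend class Base<Derived>;  // lets Base call the private foo_impl
    std::string foo_impl() { return "Derived::foo_impl()"; }
};
```

If Derived did not provide foo_impl(), the error would only appear once Base<Derived>::foo() is actually used.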
The common idiom for the CRTP does not involve declaring the pure virtual functions in the base. As you mention in one of the comments, that means that the compiler will not enforce the definition of the member in the derived type (other than through use, if there is any use of foo in the base, that requires the presence of foo_impl in the derived type).
I would stick to the common idiom and not declare the pure virtual function in the base, but if you really feel you need to do it, you can disable dynamic dispatch by adding extra qualification:
template <class Derived>
class Base {
public:
virtual void foo_impl() = 0;
void foo() {
static_cast<Derived*>(this)->Derived::foo_impl();
// ^^^^^^^^^
}
};
The use of the extra qualification Derived:: disables dynamic dispatch, and that call will be statically resolved to Derived::foo_impl. Note that this comes with all of the usual caveats: you have a class with a virtual function and pay the cost of the virtual pointer per object, but you cannot override that virtual function in a most derived type, as the use in the CRTP base is blocking dynamic dispatch...
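The caveat can be made concrete by putting both call forms side by side. In this sketch foo_dynamic and foo_static are hypothetical names, foo_impl() returns a string for easy checking, and MoreDerived plays the role of the "most derived type" whose override gets bypassed.

```cpp
#include <string>

template <class Derived>
class Base {
public:
    virtual ~Base() {}
    virtual std::string foo_impl() = 0;

    // Ordinary member call: goes through virtual dispatch.
    std::string foo_dynamic() {
        return static_cast<Derived*>(this)->foo_impl();
    }

    // Qualified call: bound statically to Derived::foo_impl.
    std::string foo_static() {
        return static_cast<Derived*>(this)->Derived::foo_impl();
    }
};

class Derived : public Base<Derived> {
public:
    std::string foo_impl() override { return "Derived::foo_impl()"; }
};

class MoreDerived : public Derived {
public:
    std::string foo_impl() override { return "MoreDerived::foo_impl()"; }
};
```

On a MoreDerived object, foo_dynamic() reaches the MoreDerived override, while foo_static() stops at Derived::foo_impl().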
The extra verbiage in lines A and B has absolutely no effect on
the generated code. I don't know who recommends this (I've never seen
it), but in practice, the only time it might have an effect is
if the function isn't virtual. Just write foo_impl(), and be
done with it.
There is a means of avoiding the virtual function call if the
compiler knows the derived type. I've seen it used for
vector-like classes (where there are different implementations,
e.g. normal, sparse, etc. of the vector):
template <typename T>
class Base
{
private:
virtual T& getValue( int index ) = 0;
public:
T& operator[]( int index ) { return getValue( index ); }
};
template <typename T>
class Derived : public Base<T>
{
private:
virtual T& getValue( int index )
{
return operator[]( index );
}
public:
T& operator[]( int index ) // non-virtual, so calls on a Derived can be inlined
{
// find element and return it.
}
};
The idea here is that you normally only work through references
to the base class, but if performance becomes an issue, because
you're using [] in a tight loop, you can dynamic_cast to the
derived class before the loop, and use [] on the derived
class.
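A non-template sketch of that usage, assuming int elements and a hypothetical dense implementation backed by std::vector. The names Dense and sum are made up for illustration. The function casts once before the loop and falls back to the virtual path if the object is some other implementation.

```cpp
#include <cstddef>
#include <vector>

class Base {
public:
    virtual ~Base() {}
    // Each access through Base costs one virtual call.
    int& operator[](int index) { return getValue(index); }
private:
    virtual int& getValue(int index) = 0;
};

class Dense : public Base {
public:
    explicit Dense(int size) : data_(static_cast<std::size_t>(size)) {}
    // Non-virtual and inlinable; hides Base::operator[].
    int& operator[](int index) { return data_[static_cast<std::size_t>(index)]; }
private:
    int& getValue(int index) override { return (*this)[index]; }
    std::vector<int> data_;
};

int sum(Base& v, int size) {
    int total = 0;
    if (Dense* dense = dynamic_cast<Dense*>(&v)) {
        // Derived type known: direct calls to Dense::operator[].
        for (int i = 0; i < size; ++i) total += (*dense)[i];
    } else {
        // Unknown implementation: every access is a virtual call.
        for (int i = 0; i < size; ++i) total += v[i];
    }
    return total;
}
```

The single dynamic_cast replaces one virtual call per element inside the loop.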
In CRTP to avoid dynamic polymorphism, the following solution is proposed to avoid the overhead of virtual member functions and impose a specific interface:
template <class Derived>
struct base {
void foo() {
static_cast<Derived *>(this)->foo();
};
};
struct my_type : base<my_type> {
void foo() {}; // required to compile. < Don't see why
};
struct your_type : base<your_type> {
void foo() {}; // required to compile. < Don't see why
};
However it seems that the derived class does not require a definition to compile as it inherits one (the code compiles fine without defining a my_type::foo). In fact if a function is provided, the base function will not be called when using the derived class.
So the question is, is the following code replacement acceptable (and standard?):
template <class Derived>
struct base {
void foo() {
// Generate a meaningful error if called
(void)sizeof( Derived::foo_IS_MISSING );
};
};
struct my_type : base<my_type> {
void foo() {}; // required to compile.
};
struct your_type : base<your_type> {
void foo() {}; // required to compile.
};
int main() {
my_type my_obj;
my_obj.foo(); // will fail if foo missing in derived class
}
The whole point of this pattern is, as far as I understand, that you can pass arguments simply as template <typename T> base<T> & and your interface is defined by (non-virtual) functions in base<T>. If you don't have an interface that you want to define (as you are suggesting in the second part of your question), then there's no need for any of this in the first place.
Note that you are not "imposing" an interface like with pure virtual functions, but rather you are providing an interface. Since everything is resolved at compile time, "imposing" isn't such a strong requirement.
In your replacement code you can't "polymorphically" call foo on a base<T>.
However it seems that the derived class does not require a definition to compile as it inherits one (the code compiles fine without defining a my_type::foo).
C++ is lazy: it will not try to instantiate base<my_type>::foo() if you do not actually use it.
But if you try to use it, it will be instantiated, and if that fails, compilation errors will follow.
But in your case, base<my_type>::foo() can be instantiated just fine:
template <class Derived>
struct base {
void foo() {
static_cast<Derived *>(this)->foo();
};
};
struct my_type : base<my_type> {};
void func() {
my_type m;
static_cast<base<my_type>& >(m).foo();
}
will compile just fine. When the compiler is presented with
static_cast<Derived *>(this)->foo(), it will try to find a foo() that is accessible in my_type. And there is one: it's called base<my_type>::foo(), which is public from a publicly inherited class. So base<my_type>::foo() calls base<my_type>::foo(), and you get infinite recursion.
No, imagine the following situation:
template <typename T>
void bar(base<T> obj) {
obj.foo();
}
base<my_type> my_obj;
bar(my_obj);
Base's foo will be called instead of my_type's...
Do this, and you will get your error message:
template <class Derived>
struct base {
void foo() {
sizeof(Derived::foo);
static_cast<Derived *>(this)->foo();
};
};
But I must confess I am not sure how this will work in compilers other than GCC, tested only with GCC.
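A possibly more portable alternative to the sizeof trick is a static_assert on the type of the member pointer. This is a sketch assuming C++11 and assuming Derived::foo is public and not overloaded; it is not taken from any of the answers above. It relies on the fact that &Derived::foo has a pointer-to-member-of-base type when foo is merely inherited, and a pointer-to-member-of-Derived type when Derived declares its own. foo() returns a string here only to make the result checkable.

```cpp
#include <string>
#include <type_traits>

template <class Derived>
struct base {
    std::string foo() {
        // If Derived merely inherits foo, &Derived::foo has type
        // std::string (base::*)(); a foo declared in Derived itself has
        // type std::string (Derived::*)().
        static_assert(
            !std::is_same<decltype(&Derived::foo),
                          std::string (base::*)()>::value,
            "Derived must define its own foo()");
        return static_cast<Derived*>(this)->foo();
    }
};

struct my_type : base<my_type> {
    std::string foo() { return "my_type::foo"; }
};
```

With a derived type that omits foo(), the assertion should fire when base<Derived>::foo() is instantiated, instead of compiling into infinite recursion.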