Legally invoking a pure virtual function - c++

I'm sure we've all seen code that crashes due to a bug that results in a pure virtual function being called. One simple example is like this:
struct Base
{
Base() { method(); }
virtual void method() = 0;
};
struct Derived : Base
{
void method() {};
};
int main()
{
Derived d;
}
In this case, the call to method() in the Base constructor is specifically cited as undefined behaviour by section 10.4/6 of the C++ standard, so it's no surprise that we end up crashing. (Both g++ and Clang warn about this, and in fact linking fails with g++ with this example, though Clang succeeds.)
But, just for fun, can anybody come up with a way to invoke a pure virtual function which does not rely on undefined behaviour?
(I suppose you could argue that if such a method exists then there's a defect in the C++ standard, but I'm just curious...)
EDIT: Thank you for the answers so far, but I should have made it clear that I realise it's legal to make a non-virtual call to a pure virtual function (provided a definition exists somewhere). I was more wondering whether there is any clever loophole in the rules which could result in a virtual call, and thus most likely a crash in the common case of having no definition.
For example, perhaps via multiple inheritance one could perform some clever (legal) cast, but end up with the "wrong" (unimplemented) PV method() being called, that sort of thing. I just thought it was a fun brainteaser :-)

It's perfectly legal to call a pure virtual function non-virtually:
Derived d;
d.Base::method();
Of course, this requires the function to be defined, which isn't the case in your example.

Depends on what you mean by possible. Here's one which compiles successfully, but will most likely result in a linker error:
struct Base
{
virtual void method() = 0;
};
struct Derived : Base
{
void method() { Base::method(); };
};
int main()
{
Derived d;
d.method();
}
It compiles because nothing prevents a pure virtual function from actually having a body as well. That body can be provided in the same translation unit (in a separate definition), or in a different translation unit. That's why it's a linker error and not a compiler one: the fact that the function has no body here doesn't mean it doesn't have one elsewhere.

A pure virtual function can have an implementation. Best example: a pure virtual destructor must have an implementation, because every destructor in the hierarchy is called when an object is destroyed.
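A minimal sketch of the destructor case. The names `Shape`, `Circle`, `g_log` and `destroy_circle` are hypothetical and exist only for illustration; the log string records the order in which the destructors run.

```cpp
#include <string>

// Hypothetical log that records which destructors ran, in order.
std::string g_log;

struct Shape {
    virtual ~Shape() = 0;   // pure: Shape itself cannot be instantiated
};

// The definition is still required: ~Circle() calls ~Shape() implicitly.
Shape::~Shape() { g_log += "~Shape "; }

struct Circle : Shape {
    ~Circle() override { g_log += "~Circle "; }
};

// Creates and destroys a Circle through a base pointer, returning the log.
std::string destroy_circle() {
    g_log.clear();
    Shape* s = new Circle;
    delete s;
    return g_log;
}
```

Removing the definition of `Shape::~Shape()` would typically turn this into a link error, because the derived destructor refers to it.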

Taking out the linker error from @Angew's answer. Not sure about the undefined behavior happening here...
struct Base
{
virtual void method() = 0;
};
void Base::method(){}
struct Derived : Base
{
void method() { Base::method(); };
};
int main()
{
Derived d;
d.method();
}

Related

C++ virtual function inlining when derived class is final?

I'm using C++ in an embedded environment where the runtime of virtual functions does matter. I have read about the rare cases when virtual functions can be inlined, for example: Are inline virtual functions really a non-sense?
The accepted answer states that inlining is only possible when the exact class is known at runtime, for example when dealing with a local, global, or static object (not a pointer or reference to the base type). I understand the logic behind this, but I wonder if inlining would be also possible in the following case:
#include <iostream>
using std::cout;
class Base {
public:
    virtual void x() = 0;
    virtual ~Base() {}
};
class Derived final : public Base {
public:
    void x() override {
        cout << "inlined?";
    }
};
int main(){
    Base* a;
    Derived* b;
    b = new Derived();
    a = b;
    a->x(); //This can definitely not be inlined.
    b->x(); //Can this be inlined?
    delete b;
}
From my point of view the compiler should know the definitive type of b at compile time, as Derived is a final class. Is it possible to inline the virtual function in this case? If not, then why? If yes, do gcc and avr-gcc do so?
Thanks!
The first step is called devirtualization, where a function call does not go through virtual dispatch.
Compilers can and do devirtualize final methods and methods of final classes. That is almost the entire point of final.
Once devirtualized, methods can be inlined.
Some compilers can sometimes prove the dynamic type of *a and even devirtualize that call. This is less reliable. Godbolt's Compiler Explorer can be useful for understanding which specific optimizations happen and how they can fail.
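A small sketch of the two situations, using hypothetical helper functions `call_via_derived` and `call_via_base`. Whether a given compiler actually devirtualizes or inlines either call depends on the compiler and its optimization settings; the code only shows where it is permitted to.

```cpp
struct Base {
    virtual int x() const = 0;
    virtual ~Base() {}
};

struct Derived final : Base {
    int x() const override { return 42; }
};

// The static type is Derived, and Derived is final, so no further override
// can exist; a compiler is allowed (not required) to call Derived::x
// directly and inline it.
int call_via_derived(const Derived& d) { return d.x(); }

// Here the static type is Base; the call stays virtual unless the
// compiler can prove what the reference actually refers to.
int call_via_base(const Base& b) { return b.x(); }
```

Both functions return the same value for a `Derived` object; the difference is only in how the call may be compiled.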

Pure virtual function call interesting cases

Consider the following code:
#include <iostream>
using namespace std;
class A
{
public:
virtual void f() = 0;
A(){f();}
};
void A::f() {
cout<<"A"<<endl;
}
class B:public A{
public:
void f(){cout<<"B"<<endl;}
};
int main()
{
B b;
}
In this case I directly call the virtual function from constructor and get compiler warning which says:
warning: abstract virtual 'virtual void A::f()' called from constructor.
But it executes without termination and prints A.
If I wrap the call of the function like this:
class A
{
public:
virtual void f() = 0;
A(){g();}
void g(){f();}
};
void A::f(){cout<<"A"<<endl;}
class B:public A{
public:
void f(){cout<<"B"<<endl;}
};
int main()
{
B b;
}
The compiler does not output any warning during compilation but it crashes at runtime with the following message:
pure virtual method called
terminate called without active exception
Abort
Can anybody explain the behavior in both of these cases?
§ 10.4 Abstract classes [class.abstract] / p6
Member functions can be called from a constructor (or destructor) of an abstract class; the effect of making a virtual call (10.3) to a pure virtual function directly or indirectly for the object being created (or destroyed) from such a constructor (or destructor) is undefined.
In brief: the effect of making a virtual call to a pure virtual function, directly or indirectly, for the object being created from a constructor is undefined.
A pure virtual member function must not be called virtually from a constructor or a destructor, no matter whether the call is direct or indirect, because the result is undefined behavior.
The only useful example of providing the implementation of a pure virtual function is when calling it from a derived class:
struct A
{
virtual void f() = 0;
};
void A::f()
{
cout<<"A"<<endl;
}
struct B : A
{
void f()
{
A::f();
cout<<"B"<<endl;
}
};
In the first case, the compiler happens to save you by statically dispatching to A::f(), since inside A's constructor it knows that the object's dynamic type is A. But it's quite right that this is horribly undefined behaviour and you shouldn't do it.
In the second case, the compiler does not statically dispatch to A::f() since the call is not in the constructor so it must dynamically dispatch it. Different ABIs handle pure virtual calls differently, but both MSVC and Itanium have a dedicated pure virtual call handler which is placed in the vtable to catch these events. This is what produces the error message you see.
From a compiler's point of view, if you look at how the function f() is invoked:
Case-1: A's ctor => f() directly. The compiler knows precisely that this is the case and decides to issue a warning.
Case-2: A's ctor => g() => f(). There are entirely legitimate cases of calling f() from one of the class methods, so the compiler cannot say that this is illegal. The call graph could have been * => bar() => g() => f(), meaning the type of the object is not known. Because such paths are possible, dynamic dispatch is necessary, which leads to the runtime error.
As others pointed out, this is undefined usage and compilers only go so far in detecting and warning.
Undefined behaviour means that the compiler does not have to handle the situation in any particularly defined manner.
Here your compiler, which knew the actual type of the object inside A's constructor, was able to call the pure virtual method's definition directly rather than going through the v-table. This is what would happen if the method were a normal virtual, not pure virtual, and then it would be defined behaviour.
Whilst that would also be the behaviour via g() for a normal virtual function, the compiler did not do this for the pure virtual f(). It doesn't have to.
The simple moral is do not invoke undefined behaviour, and if you want to call f() from the constructor do not make it pure virtual.
If you want to enforce your sub-classes to implement f(), do not call it from the constructor of A but give that function you want to call a different name. Preferably not virtual at all.
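One way to follow that advice is sketched below, assuming the work can be deferred until the object is fully constructed. The names `init`, `description` and the factory `create` are hypothetical; the constructor makes no virtual calls, and the pure virtual f() is only reached through a non-virtual function that runs after construction.

```cpp
#include <memory>
#include <string>

class A {
public:
    virtual ~A() {}
    // Non-virtual and deliberately not called from the constructor.
    void init() { description_ = "A with " + f(); }
    const std::string& description() const { return description_; }
private:
    virtual std::string f() = 0;   // subclasses must implement this
    std::string description_;
};

class B : public A {
    std::string f() override { return "B"; }
};

// Hypothetical factory: builds the complete object first, then
// initialises it, so the virtual call reaches the final override.
template <class T>
std::unique_ptr<T> create() {
    std::unique_ptr<T> obj(new T);
    obj->init();
    return obj;
}
```

With this layout, `create<B>()` yields an object whose description is built from B's override, and no pure virtual call can happen during construction.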

Inline virtual function when called from another virtual function?

I have a class with two virtual member functions: foo and wrapper. foo is short and fast, and wrapper contains a loop that calls foo many times. My hope is that there is some way to inline the calls to foo inside the wrapper function, even when called from a pointer to an object:
MyClass *obj = getObject();
obj->foo(); // As I understand it, this cannot be inlined. That's okay.
obj->wrapper(); // Nor will this. However, I hope that the machine code
// for the wrapper function will contain inlined calls to
// foo().
Essentially, I want the compiler to generate multiple versions of the wrapper function -- one for each possible class -- and inline calls to the appropriate foo, which should be possible since the object type is determined before picking which wrapper function to execute. Is this possible? Do any compilers support this optimization?
Edit: I appreciate all of the feedback and answers so far, and I may end up picking one of them. However, most responses ignore the last part of my question where I explain why I think this optimization should be feasible. That is really the crux of my question and I am still hoping someone can address that.
Edit 2: I picked Vlad's answer since he both suggested the popular workaround and partially addressed my proposed optimization (in the comments of David's answer). Thanks to everyone who wrote an answer -- I read them all and there wasn't a clear "winner".
Also, I found an academic paper that proposes an optimization very similar to what I was suggesting: http://www.ebb.org/bkuhn/articles/cpp-opt.pdf.
In certain cases, the compiler can determine the virtual dispatch behavior at compile time and perform a non-virtual function invocation or even inline the function. It can only do that if it can figure out that your class is the "top" of the inheritance chain or that those two functions are not otherwise overridden. Oftentimes, this is simply impossible, especially if you don't have link-time optimization enabled for the whole program.
Unless you want to check the results of your compiler's optimizations, your best bet would be not to use a virtual function in the inner loop at all. For example, something like this:
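class Foo {
public:
    virtual void foo()
    {
        foo_impl();
    }
    virtual void bar()
    {
        // kManyTimes is a placeholder for "a very large number" of iterations
        for (int i = 0; i < kManyTimes; ++i) {
            foo_impl();
        }
    }
private:
    static const int kManyTimes = 1000000;
    void foo_impl() { /* do some nasty stuff here */ }
};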
But in that case you clearly give up the idea that somebody may come in, inherit from your class and throw in their own implementation of "foo" to be called by your "bar". They will essentially need to re-implement both.
On the other hand, it smells a bit like a premature optimization. Modern CPUs will most likely "lock" your loop, predict the exit from it and execute the same µOPs over and over, even if your method is virtually virtual. So I'd recommend you carefully determine this to be a bottleneck before spending your time optimizing it.
No, the function call will not be inlined if performed through a pointer or reference (this includes the this pointer). A new type can be created that extends from your current type and overrides foo without overriding wrapper.
If you want to enable the compiler to inline your function, you must disable virtual dispatch for that call:
void Type::wrapper() {
Type::foo(); // no dynamic dispatch
}
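To make the difference concrete, here is a small sketch with hypothetical names (`wrapper_virtual`, `wrapper_static`, `Extended`). It shows that the qualified call always reaches `Type::foo`, which is what lets the compiler resolve it at compile time, and also what it costs: an override in a derived class is ignored.

```cpp
#include <string>

struct Type {
    virtual ~Type() {}
    virtual std::string foo() { return "Type::foo"; }
    // Unqualified call: dispatched virtually through `this`.
    std::string wrapper_virtual() { return foo(); }
    // Qualified call: always Type::foo, so it can be resolved statically.
    std::string wrapper_static() { return Type::foo(); }
};

struct Extended : Type {
    std::string foo() override { return "Extended::foo"; }
};
```

For an `Extended` object, `wrapper_virtual()` returns "Extended::foo" while `wrapper_static()` returns "Type::foo".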
Imagine the following hierarchy:
class Base
{
virtual void foo();
virtual void wrapper();
};
class Derived1: public Base
{
virtual void foo() { cout << "Derived1::foo"; }
virtual void wrapper() { foo(); }
};
class Derived2: public Derived1
{
virtual void foo() { cout << "Derived2::foo"; }
};
Base * p1 = new Derived1;
p1->wrapper(); // calls Derived1::wrapper which calls Derived1::foo
Base * p2 = new Derived2;
p2->wrapper(); // calls Derived1::wrapper which calls Derived2::foo
Can you see the problem? Derived1::wrapper must call Derived2::foo. It can't know until runtime whether it will be calling Derived1::foo or Derived2::foo, so there's no way to inline it.
If you want to ensure that inlining is possible, make sure that the function you want to inline isn't virtual. It seems from your description that this might be possible, if every class in the hierarchy reimplements both foo and wrapper. A function doesn't need to be virtual to be redefined in a derived class, although the redefinition then hides the base version instead of overriding it.
This is not accurate. Virtual functions can be inlined, but only if the compiler knows the dynamic type of the object with certainty, and thus has the guarantee that polymorphism still works.
For example:
struct A
{
virtual void foo()
{
}
};
struct B : A
{
virtual void foo()
{
}
};
int main()
{
A a;
a.foo(); //this can be inlined
A* pa = new A;
pa->foo(); //so can this
}
void goo(A* pa)
{
pa->foo(); //this probably can't
}
That said, it appears that in your case this can't happen. What you can do is have another non-virtual function that actually implements the functionality and call it statically, so the call gets resolved at compile-time:
class MyClass
{
virtual void foo() = 0;
virtual void wrapper() = 0;
};
class Derived : MyClass
{
void fooImpl()
{
//keep the actual implementation here
}
virtual void foo()
{
fooImpl();
}
virtual void wrapper()
{
for ( int i = 0 ; i < manyTimes ; i++ )
fooImpl(); //this can get inlined
}
};
or simply Derived::foo() as pointed out by @avakar.
Your function is virtual and the object's type is not known until runtime, so how do you expect the compiler to inline the code of some class into it?
In a similar situation I had two functions: foo, which is virtual, and foo_impl, which is a normal function and is called from both foo and wrapper.

Why a virtual call to a pure virtual function from a constructor is UB and a call to a non-pure virtual function is allowed by the Standard?

From 10.4 Abstract Classes parag. 6 in the Standard :
"Member functions can be called from a constructor (or destructor) of an abstract class; the effect of making a virtual call to a pure virtual function directly or indirectly for the object being created (or destroyed) from such a constructor (or destructor) is undefined."
Assuming that a call to a non-pure virtual function from a constructor (or destructor), is allowed by the Standard, why the difference ?
[EDIT] More standards quotes about pure virtual functions:
§ 10.4/2 A virtual function is specified pure by using a pure-specifier (9.2) in the function declaration in the class definition. A pure virtual function needs be defined only if called with, or as if with (12.4), the qualified-id syntax (5.1). ... [ Note: A function declaration cannot provide both a pure-specifier and a definition —end note ]
§ 12.4/9 A destructor can be declared virtual (10.3) or pure virtual (10.4); if any objects of that class or any derived class are created in the program, the destructor shall be defined.
Some questions that need answering are:
Where the pure virtual function has not been given an implementation, should this not be a compiler or linker error instead?
Where the pure virtual function has been given an implementation, why can it not be well-defined in this case to invoke this function?
Because a virtual call can NEVER call a pure virtual function -- the only way to call a pure virtual function is with an explicit (qualified) call.
Now outside of constructors or destructors, this is enforced by the fact that you can never actually have objects of an abstract class. You must instead have an object of some non-abstract derived class which overrides the pure virtual function (if it didn't override it, the class would be abstract). While a constructor or destructor is running, however, you might have an object in an intermediate state. But since the standard says that trying to call a pure virtual function virtually in this state results in undefined behavior, the compiler does not have to special-case anything to get it right, which gives much more flexibility for implementing pure virtual functions. In particular, the compiler is free to implement pure virtuals the same way it implements non-pure virtuals (no special case needed), and crash or otherwise fail if you call the pure virtual from a ctor/dtor.
I think this code is an example of the undefined behaviour referenced by the standard. In particular, it is not easy for the compiler to notice that this is undefined.
(BTW, when I say 'compiler', I really mean 'compiler and linker'. Apologies for any confusion.)
struct Abstract {
virtual void pure() = 0;
virtual void foo() {
pure();
}
Abstract() {
foo();
}
~Abstract() {
foo();
}
};
struct X : public Abstract {
virtual void pure() { cout << " X :: pure() " << endl; }
virtual void impure() { cout << " X :: impure() " << endl; }
};
int main() {
X x;
}
If the constructor of Abstract directly called pure(), this would obviously be a problem and a compiler can easily see that there is no Abstract::pure() to be called, and g++ gives a warning. But in this example, the constructor calls foo(), and foo() is a non-pure virtual function. Therefore, there is no straightforward basis for the compiler or linker to give a warning or error.
As onlookers, we can see that foo is a problem if called from the constructor of Abstract. Abstract::foo() itself is defined, but it tries to call Abstract::pure and this doesn't exist.
At this stage, you might think that the compiler should issue a warning/error about foo on the grounds that it calls a pure virtual function. But instead you should consider the derived non-abstract class where pure has been given an implementation. If you call foo on that class after construction (and assuming you haven't overridden foo), then you will get well-defined behaviour. So again, there is no basis for a warning about foo. foo is well-defined as long as it isn't called in the constructor of Abstract.
Therefore, each method (the constructor and foo) is relatively OK if you look at it on its own. The only reason we know there is a problem is because we can see the big picture. A very smart compiler would put each particular implementation/non-implementation into one of three categories:
Fully-defined: It, and all the methods it calls are fully-defined at every level in the object hierarchy
Defined-after-construction. A function like foo that has an implementation but which might backfire depending on the status of the methods it calls.
Pure virtual.
It's a lot of work to expect a compiler and linker to track all this, and hence the standard allows compilers to compile it cleanly but give undefined behaviour.
(I haven't mentioned the fact that it is possible to give implementations to pure-virtual methods. This is new to me. Is it defined properly, or is it just a compiler-specific extension? void Abstract :: pure() { })
So, it's not merely undefined 'because the standard says so'. You have to ask yourself 'what behaviour would you define for the above code?'. The only sensible answer is either to leave it undefined or to mandate a run-time error. The compiler and linker won't find it easy to analyse all these dependencies.
And to make matters worse, consider pointers-to-member-functions! The compiler or linker can't really tell if the 'problematic' methods will ever be called - it might depend on a whole load of other things that happen at runtime. If the compiler sees (this->*mem_fun)() in the constructor, it can't be expected to know how well-defined mem_fun is.
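A short sketch of the pointer-to-member point, using a hypothetical helper `call`. The sketch only ever invokes it on a fully constructed X, where the behaviour is well defined; invoking `call(&Abstract::pure)` from Abstract's constructor would be the undefined case, and nothing in the body of `call` reveals which of the two situations applies.

```cpp
#include <string>

struct Abstract {
    virtual ~Abstract() {}
    virtual std::string pure() = 0;
    virtual std::string impure() { return "Abstract::impure"; }
    // A call through a pointer-to-member is still a virtual call; which
    // function mem_fun names is only known at run time.
    std::string call(std::string (Abstract::*mem_fun)()) {
        return (this->*mem_fun)();
    }
};

struct X : Abstract {
    std::string pure() override { return "X::pure"; }
    std::string impure() override { return "X::impure"; }
};
```

On a complete X object both `call(&Abstract::pure)` and `call(&Abstract::impure)` dispatch to the overrides in X.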
It is the way the classes are constructed and destructed.
Base is first constructed, then Derived. So in the constructor of Base, Derived has not yet been created. Therefore none of its member functions can be called. So if the constructor of Base calls a virtual function, it can't be the implementation from Derived, it must be the one from Base. But the function in Base is pure virtual and there is nothing to call.
In destruction, first Derived is destroyed, then Base. So once again in the destructor of Base there is no object of Derived to invoke the function, only Base.
Incidentally it is only undefined where the function is still pure virtual. So this is well-defined:
struct Base
{
virtual ~Base() { /* calling foo here would be undefined */}
virtual void foo() = 0;
};
struct Derived : public Base
{
~Derived() { foo(); }
virtual void foo() { }
};
The discussion has moved on to suggest alternatives:
It might produce a compiler error, just like trying to create an instance of an abstract class does.
The example code would no doubt be something like:
class Base
{
Base();
virtual ~Base();
// other stuff
virtual void init() = 0;
virtual void cleanup() = 0;
};
Base::Base()
{
init(); // pure virtual function
}
Base::~Base()
{
cleanup(); // which is a pure virtual function. You can't do that! shouts the compiler.
}
Here it is clear what you are doing is going to get you into trouble. A good compiler might issue a warning.
It might produce a link error.
The alternative is to look for a definition of Base::init() and Base::cleanup() and invoke that if it exists, otherwise invoke a link error, i.e. treat cleanup as non-virtual for the purpose of constructors and destructors.
The issue is that won't work if you have a non-virtual function calling the virtual function.
class Base
{
Base();
void init();
void cleanup();
// other stuff. Assume access given as appropriate in examples
virtual ~Base();
virtual void doinit() = 0;
virtual void docleanup() = 0;
};
Base::Base()
{
init(); // non-virtual function
}
Base::~Base()
{
cleanup();
}
void Base::init()
{
doinit();
}
void Base::cleanup()
{
docleanup();
}
This situation looks to me to be beyond the capability of both the compiler and linker. Remember that these definitions could be in any compilation unit. There is nothing illegal about the constructor and destructor calling init() or cleanup() here unless you know what they are going to do, and there is nothing illegal about init() and cleanup() calling the pure virtual functions unless you know from where they are invoked.
It is totally impossible for the compiler or linker to do this.
Therefore the standard must allow the compile and link and mark this as "undefined behaviour".
Of course if an implementation does exist, the compiler is free to use it if able. Undefined behaviour doesn't mean it has to crash. Just that the standard doesn't say it has to use it.
Note that in this case the destructor is calling a member function that calls the pure virtual, but how do you know it will do even this? It could be calling something in a completely different library that invokes the pure virtual function (assume access is there).
Base::~Base()
{
someCollection.removeMe( this );
}
void CollectionType::removeMe( Base* base )
{
base->cleanup(); // ouch
}
If CollectionType exists in a totally different library there is no way any link error can occur here. The simple matter is again that the combination of these calls is bad (but neither one individually is faulty). If removeMe is going to be calling pure-virtual cleanup() it cannot be called from Base's destructor, and vice-versa.
One final thing you have to remember about Base::init() and Base::cleanup() here is that even if they have implementations, they are never called through the virtual function mechanism (v-table). They would only ever be called explicitly (using full class-name qualification) which means that in reality they are not really virtual. That you are allowed to give them implementations is perhaps misleading, probably wasn't really a good idea and if you wanted such a function that could be called through derived classes, perhaps it is better being protected and non-virtual.
Essentially: if you want the function to have the behaviour of a non-pure virtual function, such that you give it an implementation and it gets called in the constructor and destructor phase, then don't define it as pure virtual. Why define it as something you don't want it to be?
If all you want to do is prevent instances being created you can do that in other ways, such as:
- Make the destructor pure virtual.
- Make the constructors all protected
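Both options can be sketched as follows. All class names here are hypothetical, and the `static_assert` lines use `<type_traits>` only to state what each option does and does not guarantee.

```cpp
#include <type_traits>

// Option 1: a pure virtual destructor makes the class abstract.
struct AbstractByDtor {
    virtual ~AbstractByDtor() = 0;
};
AbstractByDtor::~AbstractByDtor() {}   // still needs a definition

// Option 2: protected constructors; only derived classes can construct it.
class ProtectedCtor {
protected:
    ProtectedCtor() {}
public:
    virtual ~ProtectedCtor() {}
    // Ordinary virtual with a body: safe to call during construction.
    virtual int value() const { return 1; }
};

struct Concrete1 : AbstractByDtor {};
struct Concrete2 : ProtectedCtor {};

static_assert(std::is_abstract<AbstractByDtor>::value,
              "abstract because of the pure virtual destructor");
static_assert(!std::is_abstract<ProtectedCtor>::value,
              "not abstract in the language sense");
static_assert(!std::is_default_constructible<ProtectedCtor>::value,
              "but cannot be constructed from outside the hierarchy");
```

In both cases the derived classes are instantiable without overriding anything, and every virtual function other than the destructor keeps an ordinary, callable body.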
Before discussing why it's undefined, let's first clarify what the question is about.
#include<iostream>
using namespace std;
struct Abstract {
virtual void pure() = 0;
virtual void impure() { cout << " Abstract :: impure() " << endl; }
Abstract() {
impure();
// pure(); // would be undefined
}
~Abstract() {
impure();
// pure(); // would be undefined
}
};
struct X : public Abstract {
virtual void pure() { cout << " X :: pure() " << endl; }
virtual void impure() { cout << " X :: impure() " << endl; }
};
int main() {
X x;
x.pure();
x.impure();
}
The output of this is:
Abstract :: impure() // called while x is being constructed
X :: pure() // x.pure();
X :: impure() // x.impure();
Abstract :: impure() // called while x is being destructed.
The second and third lines are easy to understand; the methods were originally defined in Abstract, but the overrides in X take over. This result would have been the same even if x had been a reference or pointer of Abstract type instead of X type.
But the interesting thing is what happens inside the constructor and destructor of Abstract. The call to impure() in the constructor calls Abstract::impure(), not X::impure(), even though the object being constructed is of type X. The same happens in the destructor.
When an object of type X is being constructed, the first thing that is constructed is merely an Abstract object and, crucially, it is ignorant of the fact that it will ultimately be an X object. The same process happens in reverse for the destruction.
Now, assuming you understand that, it is clear why the behaviour must be undefined. There is no method Abstract :: pure which could be called by the constructor or destructor, and hence it wouldn't be meaningful to try to define this behaviour (except possibly as a compilation error.)
Update: I've just discovered that it is possible to give an implementation, in the abstract class, of a pure virtual method. The question is: Is this meaningful?
struct Abstract {
virtual void pure() = 0;
};
void Abstract :: pure() { cout << "How can I be called?!" << endl; }
There will never be an object whose dynamic type is Abstract, hence you'll never be able to execute this code with a normal call to abs.pure(); or anything like that. So, what is the point of allowing such a definition?
See this demo. The compiler gives warnings, but now the Abstract::pure() method is callable from the constructor. This is the only route by which Abstract::pure() can be called.
But, this is technically undefined. Another compiler is entitled to ignore the implementation of Abstract::pure, or even to do other crazy things. I'm not aware of why this isn't defined - but I wrote this up to try to help clear up the question.

Linking fails when missing implementation of a not-used pure virtual method

I have a class, which inherits from a class with a pure virtual functions.
Now I need to add another class, which doesn't need one of the methods. My idea was to leave this method unimplemented, instead of always throwing an exception when it is called, as in the following example:
#include <iostream>
class ibase {
public:
virtual void foo() = 0;
virtual void boo() = 0;
};
class base1 : public ibase {
public:
virtual void foo(){ std::cout<<"base1::foo"<<std::endl; }
virtual void boo(){ std::cout<<"base1::boo"<<std::endl; }
};
class base2 : public ibase {
public:
virtual void foo() { std::cout<<"base2::foo"<<std::endl; }
virtual void boo();
};
int main()
{
ibase *inst1 = new base1;
ibase *inst2 = new base2;
inst1->foo();
inst1->boo();
inst2->foo();
}
But when I tried to compile using the following compiler options:
g++ dfg.cpp -ansi -pedantic -Wall
this example produced the following output (using g++ 4.3.0):
/tmp/ccv6VUzm.o: In function `base2::base2()':
dfg.cpp:(.text._ZN5base2C1Ev[base2::base2()]+0x16): undefined reference to `vtable for base2'
collect2: ld returned 1 exit status
Can someone explain why the linking fails? The method boo() is not called.
The vtable for base2 is created - you use a base2. The vtable references boo() - so you need to define it.
10.3/8:
A virtual function declared in a class
shall be defined, or declared pure
(10.4) in that class, or both; but no
diagnostic is required (3.2).
The One-Definition Rule states that each function that is used must be defined exactly once. The definition of the term used includes the following line:
a virtual function is used if it is
not pure
That means that all non-pure virtual functions have to be defined even if they aren't called.
It fails because the internal vtable needs a place to point to. It doesn't matter if it is called, the vtable is still created.
Create an empty body for the method and you should be good to go.
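A sketch of what providing the body could look like. It is an adaptation, not the asker's code: the functions return strings instead of printing so the behaviour is easy to check, and the throwing body is just one possible choice (an empty body would link equally well).

```cpp
#include <stdexcept>
#include <string>

class ibase {
public:
    virtual ~ibase() {}
    virtual std::string foo() = 0;
    virtual std::string boo() = 0;
};

class base2 : public ibase {
public:
    std::string foo() override { return "base2::foo"; }
    std::string boo() override;   // declared, so it must be defined somewhere
};

// Providing the definition gives the vtable something to point at.
// Throwing is one choice; an empty body would link just as well.
std::string base2::boo() {
    throw std::logic_error("base2::boo is not supported");
}
```

With the definition present, base2 links and can be instantiated; calling boo() through an ibase reference reports the misuse at run time.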
You have to implement the method in base2; there is no way around it. Polymorphism is a run-time behavior, and there is no way for the linker to know that boo will never be called. You can simply provide a dummy implementation in the base class instead of making it pure virtual if it's not mandatory to implement the method in the derived class.
boo may not be invoked in reality but it is used to construct the v-table for base2.
You have to define what behaviour will happen if someone has a base2 and calls boo() on it (via its base class pointer), even if there is no point in the code where this is actually invoked. It is part of the contract of implementing an ibase.
The design is of course flawed, and if you want a class that allows a foo only there should be an interface for that.
If your particular instance is that a call is a no-op then that IS a behaviour for the class.
You just forgot to implement virtual void base2::boo (). You have to implement it in order to instantiate base2 class. Otherwise you can leave it pure-virtual by not declaring it in base2 class.