If several threads share a pointer to the same object, is it safe to call its virtual method? Of course we should assume that the method itself is thread-safe.
class Base {
public:
    virtual int test(int arg) = 0;
};

class Derived1 : public Base {
public:
    virtual int test(int arg) { return arg * 2; }
};

class Derived2 : public Base {
public:
    virtual int test(int arg) { return arg * 3; }
};

// call in threads:
Base* base = /* pointer to the same Derived1 object */;
int value = base->test(1);
Yes, that's fine, provided the lifetime of the object *base exceeds that of the function call.
Calling a virtual function is normally implemented by looking up an entry in the class's vtable. The vtable doesn't change after construction, so on such an implementation the call is thread-safe.
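For illustration, a minimal self-contained sketch (assuming C++11 and std::thread; the class names mirror the snippet above) of several threads making the virtual call through the same shared pointer value:

#include <cassert>
#include <thread>
#include <vector>

struct Base {
    virtual int test(int arg) = 0;
    virtual ~Base() {}
};

struct Derived1 : Base {
    virtual int test(int arg) { return arg * 2; }
};

int main() {
    Derived1 d;
    Base* base = &d;                    // all threads share this pointer
    std::vector<std::thread> threads;
    for (int i = 0; i != 4; ++i)
        threads.push_back(std::thread([base] {
            int value = base->test(1);  // concurrent virtual calls
            assert(value == 2);
        }));
    for (std::thread& t : threads)
        t.join();                       // d outlives every call
}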
I don't think there is a general guarantee.
I've started this answer about three times, based on Kerrek SB's comment on a previous answer. As I see it, without reading the spec until my eyes bleed, I can't see what part of a virtual call wouldn't be thread-safe: the "virtual function selection" mechanism (vtable or whatever it may be) should definitely not cause any problem, since it is just a pointer to the function, fetched by the index [or similar] of the function chosen.
I'm sorry, this isn't an answer, but I have a very hard time seeing any scenario, on any processor whose workings I know (x86, 68K, 29k and ARM are the ones I've worked with enough to understand how a virtual function is implemented in assembler), where this would go wrong because of threads, assuming the other code is safe. If, in the above example, a second thread were to modify which object base points at, you could potentially have some sort of race condition where base is pointing at the wrong set of virtual functions, or some such. But that's not the call itself; it's the code modifying base that is not thread-safe.
Of course, there may be some "amateur" compiler around that resolves virtual functions in some other way.
Of course, there wouldn't be a sensible workaround should there be a problem, short of blocking other threads for the entire duration of the virtual call. And if you have a class that implements threads by having a virtual run(void* args) as the "this is what to run inside the thread", which is something I've seen several times, that would pretty much kill that functionality completely!
So, basically, although I'm not able to refer to a spec section that says this is safe, I can't see any solution other than "it has to be".
If the method is threadsafe, then it is fine.
In short:
If not, C++ would not be usable at all for multithreaded programming.
In long:
After compiling, the program is constant. So it's thread-safe.
While loading (of modules), the runtime system changes the data structures for RTTI (dynamic_cast, ...). This is out of your scope, but it should be implemented in a thread-safe way.
After construction, the type of your object does not change (you can change it with very dirty tricks, but you shouldn't). So the virtual functions of all your objects do not change. So it's thread-safe.
But you should consider that class members which can be seen as a replacement for virtual functions (boost::function, Loki functors, ...) may have value semantics and can change while being called. In that case this should be documented, or better, a safe interface for using them should be provided.
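For example, a sketch (using std::function as a stand-in for boost::function or a Loki functor; Widget and its members are invented for illustration) of guarding such a value-semantic member:

#include <functional>
#include <mutex>

// Unlike a vtable entry, this member can be reassigned at runtime, so
// reassigning it while another thread invokes it is a data race unless
// the accesses are synchronized.
class Widget {
public:
    void setHandler(std::function<int(int)> h) {
        std::lock_guard<std::mutex> lock(m_);     // guard the reassignment
        handler_ = std::move(h);
    }
    int invoke(int arg) {
        std::function<int(int)> h;
        {
            std::lock_guard<std::mutex> lock(m_); // copy under the lock
            h = handler_;
        }
        return h(arg);                            // call outside the lock
    }
private:
    std::mutex m_;
    std::function<int(int)> handler_{[](int x) { return x; }};
};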
In my opinion you can safely call a virtual method. Even in the case where somebody tries to mimic virtual functions, you can expect normal (safe) behavior.
Related
I read that there is a minor performance hit when calling virtual functions from derived classes if called repeatedly. One thing that I am not clear about is whether this affects function calls from the base class itself. If I call a method in the base class with the virtual keyword, does this performance hit still occur?
If I call a method in the base class with the virtual keyword, does this performance hit still occur?
That the virtual function is being called from the base class will not prevent virtual lookup.
Consider this trivial example:
#include <iostream>

class Base
{
public:
    virtual int get_int() { return 1; }
    void print_int()
    {
        // Calling a virtual function from the base class
        std::cout << get_int();
    }
};

class Derived : public Base
{
public:
    virtual int get_int() { return 2; }
};
int main()
{
    Base().print_int();
    Derived().print_int();
}
Is print_int() guaranteed to print 1? It is not.
That the function print_int() exists in the base class does not prove that the object it's called on is not derived.
Yes, there will be a performance overhead.
This is because a virtual function in an inheritance hierarchy may or may not be overridden by a derived class.
This requires a lookup in a v-table, as the base class cannot know which class dynamically implements the function.
Edit: As mentioned, there may be some optimization, but this shouldn't be relied on
Virtual functions are implemented by a virtual function table. Each class has a table of the virtual functions' addresses. An instance of a class with a virtual function table has a pointer to the table, which is set by the constructor.
When the code calls a regular function, its address is hard-coded into it. When it calls a virtual function it has to compute *(*vtbl + 8 * function-offset) (assuming 8-byte pointers), which requires two memory accesses. That's the overhead, which can be avoided in the cases mentioned by others above.
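As a rough sketch of those two accesses (hypothetical, non-portable code; Base is the class from the first question above, and slot stands for the function's vtable index):

// Hypothetical sketch of what the compiler emits for base->test(1),
// assuming the object's first word is the vptr.
int emulated_virtual_call(Base* base, int slot) {
    typedef int (*TestFn)(Base*, int);                  // 'this' passed explicitly
    void** vtable = *reinterpret_cast<void***>(base);   // access 1: load the vptr
    TestFn fn = reinterpret_cast<TestFn>(vtable[slot]); // access 2: load the slot
    return fn(base, 1);                                 // indirect call
}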
Point is, if the same function is called repeatedly, much of the overhead might be avoided. The first call brings the virtual function table from RAM into the CPU's cache, so fetching it again costs as little as 1-2 CPU cycles. An integer shift and an addition are rather cheap. If the compiler knows it's the same function of the same class, it can compute the address once and reuse it.
Since the question is "Is there a performance hit?" and not "Can there be a performance hit?", it is surprisingly tricky to answer accurately. This is because compilers are given a lot of leeway when it comes time to optimize your code, and they often make use of it. The process of eliminating that specific overhead has a particular name: devirtualization.
Because of this, whether a cost is incurred or not will depend on:
Which compiler is being used
The compiler version
The compiler settings
The linker settings
How the class is used
Whether there are any subclasses that override the method.
So what should you do with so much uncertainty? Since the overhead is minor in the first place, the first thing is to not worry about it unless you need to improve performance in that specific area. Writing structurally sound and maintainable code trumps premature optimisation every single time.
But if you have to worry about it, then a good rule of thumb is that if you are calling a non-final virtual function from a pointer or reference (which includes this) to a non-final class, then you should write code with the assumption that the tiny overhead associated with an indirect lookup through a vtable will be paid.
That doesn't mean that the overhead will necessarily occur, just that there is a non-zero chance that it will be there.
So in your scenario, given:
#include <iostream>

class Base {
public:
    virtual void foo() { std::cout << "foo is called\n"; }
    void bar() { foo(); }
};

int main() {
    Base b;
    b.bar();
}
Base is not final
Base::foo is not final.
this is a pointer.
So you should operate under the assumption that the overhead might be present, regardless of whether or not it ends up being there in the final built application.
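For what it's worth, here is a hypothetical sketch (not from the question) of how C++11's final removes that uncertainty by handing the compiler a proof of the dynamic type:

#include <iostream>

struct Base {
    virtual void foo() { std::cout << "Base::foo\n"; }
    void bar() { foo(); } // virtual call through 'this'
};

struct Sealed final : Base { // no class can derive from Sealed
    void foo() override { std::cout << "Sealed::foo\n"; }
};

int main() {
    Sealed s;
    s.bar(); // once bar() is inlined here, the compiler can prove the dynamic
             // type is exactly Sealed, so it may devirtualize (and inline) foo()
}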
I have a problem. While planning my program, I am considering 2 versions:
Use static methods in classes, with an additional parameter (a pointer to the object, i.e. an explicit this).
Use virtual methods (vtable)
Which is faster, and why?
Edit: I need to implement the following: an array should store pointers to methods of different classes (they correspond to different game objects), for example a Draw() method.
The main task is storing and calling methods of different classes.
At this point, you probably shouldn't be thinking about micro-optimisations at all - focus on choosing efficient algorithms, and making your code clear and correct; then identify any bottlenecks that prevent it from performing as required. Having said that, here are some thoughts in the unlikely event that you do find that virtual dispatch becomes an issue.
The two are not equivalent - the first (if I understand what you're saying) is trying to emulate a non-virtual function by explicitly passing a this-like pointer to a static function, and is likely to be exactly as fast as the non-static equivalent. It will behave differently to a virtual function, and so can't be used if you need virtual dispatch.
A non-virtual function will (almost certainly) be a bit faster than a virtual function - it's more likely to be inlined, and if not inlined it can be called directly rather than looked up at runtime. Therefore, only declare functions virtual when you need virtual dispatch.
In extreme circumstances, you might be able to save a level of indirection over virtual functions, by storing function pointers in the object rather than using the compiler-generated virtual dispatch. You should only do this as a last resort, if you find that virtual dispatch is a serious bottle-neck, and you can't sensibly redesign your algorithm to avoid it.
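For the Draw() scenario from the question, the idiomatic starting point is plain virtual dispatch through a common base class; a minimal sketch (class names invented for illustration, assuming C++11):

#include <iostream>
#include <memory>
#include <vector>

struct GameObject {
    virtual void draw() const = 0;
    virtual ~GameObject() {}
};

struct Player : GameObject {
    void draw() const override { std::cout << "Player\n"; }
};

struct Enemy : GameObject {
    void draw() const override { std::cout << "Enemy\n"; }
};

int main() {
    // One container of base-class pointers; draw() is resolved per object
    // by the compiler-generated virtual dispatch.
    std::vector<std::unique_ptr<GameObject>> scene;
    scene.emplace_back(new Player);
    scene.emplace_back(new Enemy);
    for (const auto& obj : scene)
        obj->draw(); // one virtual call per object
}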
First, virtual functions and what you are proposing have different semantics. If you need different behavior because you have different types of objects, then it's highly unlikely that you can do better than the compiler's implementation of virtual functions. If you don't need it, then just don't declare the function virtual.
And don't worry about the performance issue until you know it is one.
If, after getting the code to work, you do find performance issues due to virtual function calls (usually because the compiler can't inline the function, so you lose all of the optimizations which would follow inlining), you can, at specific points, avoid the virtual function cost if you design the class correctly. Supposing the function in question is f():
class Base
{
    virtual void doF() = 0;       // private virtual: the customization point
public:
    void f() { doF(); }           // public non-virtual: dispatches
};

class Derived : public Base
{
    virtual void doF() { f(); }   // forwards to the non-virtual Derived::f()
public:
    void f() { /* ... */ }        // the real implementation, inlinable
};
If you do this, and you have, for example, a tight loop where you're constantly calling f() on the same object, you can do something like:
void
tightLoop( Base& object )
{
    Derived& o = dynamic_cast<Derived&>( object );
    for ( /* ... */ ) {
        o.f();  // direct, non-virtual call; a candidate for inlining
    }
}
If you do this, of course, tightLoop can only be called with an object which is actually a Derived.
Why doesn't C++ make destructors virtual by default for classes that have at least one other virtual function? In this case adding a virtual destructor costs me nothing, and not having one is (almost?) always a bug. Will C++0x address this?
You don't pay for what you don't need. If you never delete through a base pointer, you may not want the overhead of the indirected destructor call.
Perhaps you were thinking that the mere existence of the vtable is the only overhead. But each individual function dispatch has to be considered, too, and if I want to make my destructor call dispatch directly, I should be allowed to do so.
It would be nice of your compiler to warn you if you do ever delete a base pointer and that class has virtual methods, I suppose.
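For reference, a minimal sketch of the bug under discussion (names invented; deleting through the base pointer here is undefined behavior):

#include <iostream>

struct Base {
    virtual void f() {}                 // polymorphic class, but...
    ~Base() { std::cout << "~Base\n"; } // ...the destructor is not virtual
};

struct Derived : Base {
    ~Derived() { std::cout << "~Derived\n"; }
};

int main() {
    Base* p = new Derived;
    delete p; // undefined behavior: in practice ~Derived() typically never
              // runs, and any resources it manages leak
}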
Edit: Let me pull Simon's excellent comment in here: Check out this SO question on the code generated for destructors. As you can see, there's also code-bloat overhead to be considered.
Here's an example (not that I recommend writing such code):
#include <memory>

struct base {
    virtual void foo() const = 0;
    virtual void bar() const = 0;
};

struct derived : base {
    void foo() const {}
    void bar() const {}
};

std::shared_ptr<base>
make_base()
{
    return std::make_shared<derived>();
}
This is perfectly fine code that does not exhibit UB. This is possible because std::shared_ptr uses type-erasure; the final call to delete will delete a derived*, even if the last std::shared_ptr to trigger destruction is of type std::shared_ptr<void>.
Note that this behaviour of std::shared_ptr is not tailored to virtual destruction; it has a variety of other uses (e.g. std::shared_ptr<FILE> { std::fopen( ... ), std::fclose }). However since this technique already pays the cost of some indirection to work, some users may not be interested in having a virtual destructor for their base classes. That's what "pay only for what you need" means.
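A small sketch of that type-erasure at work (assuming C++11):

#include <iostream>
#include <memory>

struct base { /* no virtual destructor */ };
struct derived : base {
    ~derived() { std::cout << "~derived\n"; }
};

int main() {
    std::shared_ptr<void> p = std::make_shared<derived>();
    p.reset(); // prints "~derived": the deleter captured by make_shared
               // destroys a derived*, regardless of the pointer's static type
}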
By the letter of the standard, a polymorphic class with a non-virtual destructor is not a bug. One specific action performed on such an object results in undefined behavior, but everything else is perfectly kosher. So given the otherwise lenient behavior of the standard in terms of what mistakes it allows programmers to make, why should the destructor be given special treatment?
And such a change would have costs, albeit mostly trivial ones: the virtual table will be one element larger, and the virtual dispatch associated with destructor calls.
To the best of my knowledge, no, there is no change in the behavior of destructors in this regard in C++11. I imagine it would say something in the section on special member functions, but it does not, and there is similarly nothing in the section on virtual functions in general.
My library has two classes, a base class and a derived class. In the current version of the library the base class has a virtual function foo(), and the derived class does not override it. In the next version I'd like the derived class to override it. Does this break ABI? I know that introducing a new virtual function usually does, but this seems like a special case. My intuition is that it should be changing an offset in the vtbl, without actually changing the table's size.
Obviously since the C++ standard doesn't mandate a particular ABI this question is somewhat platform specific, but in practice what breaks and maintains ABI is similar across most compilers. I'm interested in GCC's behavior, but the more compilers people can answer for the more useful this question will be ;)
It might.
You're wrong regarding the offset. The offset in the vtable is determined already. What will happen is that the Derived class constructor will replace the function pointer at that offset with the Derived override (by switching the in-class v-pointer to a new v-table). So it is, normally, ABI compatible.
There might be an issue though, because of optimization, and especially the devirtualization of function calls.
Normally, when you call a virtual function, the compiler introduces a lookup in the vtable via the vpointer. However, if it can deduce (statically) what the exact type of the object is, it can also deduce the exact function to call and shave off the virtual lookup.
Example:
struct Base {
    virtual void foo();
    virtual void bar();
};

struct Derived: Base {
    virtual void foo();
};

int main(int argc, char* argv[]) {
    Derived d;
    d.foo(); // It is necessarily Derived::foo
    d.bar(); // It is necessarily Base::bar
}
And in this case... simply linking with your new library will not pick up Derived::bar.
This doesn't seem like something that could be particularly relied on in general - as you said C++ ABI is pretty tricky (even down to compiler options).
That said I think you could use g++ -fdump-class-hierarchy before and after you made the change to see if either the parent or child vtables change in structure. If they don't it's probably "fairly" safe to assume you didn't break ABI.
Yes, in some situations, adding a reimplementation of a virtual function will change the layout of the virtual function table. That is the case if you're reimplementing a virtual function from a base that isn't the first base class (multiple-inheritance):
// V1
struct A { virtual void f(); };
struct B { virtual void g(); };
struct C : A, B { virtual void h(); }; // does not reimplement f or g

// V2
struct C : A, B {
    virtual void h();
    virtual void g(); // added reimplementation of g()
};
This changes the layout of C's vtable by adding an entry for g() (thanks to "Gof" for bringing this to my attention in the first place, as a comment in http://marcmutz.wordpress.com/2010/07/25/bcsc-gotcha-reimplementing-a-virtual-function/).
Also, as mentioned elsewhere, you get a problem if the class you're overriding the function in is used by users of your library in a way where the static type is equal to the dynamic type. This can be the case after you new'ed it:
MyClass * c = new MyClass;
c->myVirtualFunction(); // not actually virtual at runtime
or created it on the stack:
MyClass c;
c.myVirtualFunction(); // not actually virtual at runtime
The reason for this is an optimisation called "de-virtualisation". If the compiler can prove, at compile time, what the dynamic type of the object is, it will not emit the indirection through the virtual function table, but instead call the correct function directly.
Now, if users compiled against an old version of you library, the compiler will have inserted a call to the most-derived reimplementation of the virtual method. If, in a newer version of your library, you override this virtual function in a more-derived class, code compiled against the old library will still call the old function, whereas new code or code where the compiler could not prove the dynamic type of the object at compile time, will go through the virtual function table. So, a given instance of the class may be confronted, at runtime, with calls to the base class' function that it cannot intercept, potentially creating violations of class invariants.
My intuition is that it should be changing an offset in the vtbl, without actually changing the table's size.
Well, your intuition is clearly wrong:
either there is a new entry in the vtable for the overrider, all following entries are moved, and the table grows,
or there is no new entry, and the vtable representation does not change.
Which one is true can depend on many factors.
Anyway: do not count on it.
Caution: see In C++, does overriding an existing virtual function break ABI? for a case where this logic doesn't hold true;
In my mind Mark's suggestion to use g++ -fdump-class-hierarchy would be the winner here, right after having proper regression tests
Overriding things should not change vtable layout[1]. The vtable entries themselves would be in the data segment of the library, IMHO, so a change to them should not pose a problem.
Of course, the applications need to be relinked; otherwise there is a potential for breakage if the consumer had been using a direct reference to &Derived::overriddenMethod;
I'm not sure whether a compiler would have been allowed to resolve that to &Base::overriddenMethod at all, but better safe than sorry.
[1] spelling it out: this presumes that the method was virtual to begin with!
I am currently debugging a crash log. The crash occurs because the vtable pointer of a (C++) object is 0x1, while the rest of the object seems to be OK as far as I can tell from the crash log.
The program crashes when it tries to call a virtual method.
My question: Under what circumstances can a vtable pointer become null? Does operator delete set the vtable pointer to null?
This occurs on OS X using gcc 4.0.1 (Apple Inc. build 5493).
Could be a memory trample - something writing over that vtable by mistake. There is a nearly infinite amount of ways to "achieve" this in C++. A buffer overflow, for example.
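A contrived sketch (everything here is deliberately undefined behavior, and the exact layout is implementation-specific) of an overflow trampling a neighbouring vptr:

#include <cstring>

struct Victim {
    virtual int f() { return 42; }
};

struct Frame {
    char buffer[8];
    Victim victim; // typically laid out right after buffer
};

int main() {
    Frame frame;
    // Buffer overflow (undefined behavior): writes 8 bytes past the end of
    // buffer, which may trample victim's vptr with 0x01 bytes.
    std::memset(frame.buffer, 0x01, sizeof frame.buffer + 8);
    return frame.victim.f(); // likely crashes: call through a garbage vtable
}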
Any kind of undefined behaviour you have may lead to this situation. For example:
Errors in pointer arithmetic or other that make your program write into invalid memory.
Uninitialized variables, invalid casts...
Treating an array polymorphically might cause this as a secondary effect.
Trying to use an object after delete.
See also the questions What’s the worst example of undefined behaviour actually possible? and What are all the common undefined behaviour that a C++ programmer should know about?.
Your best bet is to use a bounds and memory checker, as an aid to heavy debugging.
A very common case: trying to call a pure virtual method from the constructor...
Constructors
#include <iostream>

struct Interface
{
    Interface();
    virtual void logInit() const = 0;
};

struct Concrete: Interface
{
    virtual void logInit() const { std::cout << "Concrete" << std::endl; }
};
Now, suppose the following implementation of the Interface constructor:
Interface::Interface() {}
Then everything is fine:
Concrete myConcrete;
myConcrete.logInit(); // outputs "Concrete"
It's such a pain to have to call logInit after the constructor; it would be better to factorize the code, right?
Interface::Interface() { this->logInit(); } // DON'T DO THAT, REALLY ;)
Then we can do it in one line!!
Concrete myConcrete; // CRASHES VIOLENTLY
Why?
Because the object is built bottom up. Let's look at it.
Instructions to build a Concrete class (roughly)
Allocate enough memory (of course), including room for the _vtable pointer (the _vtable itself is per-class: 1 function pointer per virtual function, usually in the order they are declared, starting from the leftmost base)
Call Concrete constructor (the code you don't see)
a> Call the Interface constructor, which initializes the _vtable with its pointers
b> Execute the Interface constructor's body (you wrote that)
c> Override the pointers in the _vtable for those methods Concrete overrides
d> Execute the Concrete constructor's body (you wrote that)
So what's the problem? Well, look at the order of b> and c> ;)
When you call a virtual method from within a constructor, it doesn't do what you're hoping for. It does go to the _vtable to lookup the pointer, but the _vtable is not fully initialized yet. So, for all that matters, the effect of:
D() { this->call(); }
is in fact:
D() { this->D::call(); }
When calling a virtual method from within a constructor, you don't get the full dynamic type of the object being built; you have the static type of the constructor currently being invoked.
In my Interface / Concrete example, it means the Interface type, and the method is pure virtual, so the _vtable does not hold a real pointer (0x0 or 0x1, for example, if your compiler is friendly enough to set up debug values to help you there).
Destructors
Coincidentally, let's examine the destructor case ;)
struct Interface { ~Interface(); virtual void logClose() const = 0; };

Interface::~Interface() { this->logClose(); }

struct Concrete: Interface { ~Concrete(); virtual void logClose() const; char* m_data; };

Concrete::~Concrete() { delete[] m_data; } // It's all about being clean

void Concrete::logClose()
{
    std::cout << "Concrete referring to " << m_data << std::endl;
}
So what happens at destruction? The same mechanism applies, in reverse: by the time Interface's destructor runs, the _vtable has been switched back to Interface's, so this->logClose() is a call to a pure virtual function, which is undefined behavior (typically a "pure virtual method called" abort). And even if dispatch somehow reached Concrete::logClose, who knows what happened to m_data after it's been deleted and before the Interface destructor was invoked? I don't ;)
Conclusion
Never ever call virtual methods from within constructors or destructors.
If it's not that, you're left with a memory corruption, tough luck ;)
My first guess would be that some code is memset()'ing a class object.
This is totally implementation-dependent. However, it is quite safe to assume that after delete some other operation may set that memory to null.
Other possibilities include the memory being overwritten through some stray pointer -- actually, in my case it's almost always that...
That said, you should never try to use an object after delete.
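A minimal sketch of that last case (hypothetical names; what actually ends up in the freed memory depends on the allocator and any debug heap):

struct Obj {
    virtual void f() {}
    virtual ~Obj() {}
};

void broken(Obj* p) {
    delete p; // lifetime ends; the allocator or a debug heap may scribble here
    p->f();   // undefined behavior: the vptr may now be garbage (e.g. 0x1)
}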