Is the C++ linker smart about virtual methods used only from one class in a program? - c++

I work on a project with extremely low unit test culture. We have almost zero unit testing and every API is static.
To be able to unit test some of my code I create wrappers like
class ApiWrapper {
virtual int Call(foo, bar) {
return ApiCall(foo, bar);
}
};
Now in my functions instead:
int myfunc() {
APiCall(foo, bar);
}
I do:
int myfunc(ApiWrapper* wrapper) {
wrapper->Call(foo, bar);
}
This way I am able to mock such functionality. The problem is that some colleagues complain that production code should not be affected from testability needs - nonsense I know, but reality.
Anyway, I believe that I read somewhere at some point that compilers are actually smart about replacing unused polymorphic behavior with a direct call ... or if there is no class that overrides a virtual method it becomes "normal".
I experimented, and on gcc 4.8 it does not inline or directly call the virtual method, but instead creates vt.
I tried to google, but I did not find anything about this. Is this a thing or do I misremember ... or I have to do something to explain this to the linker, an optimization flag or something?
Note that while in production this class is final, in the test environment it is not. This is exactly what the linker has to be smart about and detect it.

The C++ compiler will only replace a polymorphic call with a direct call if it knows for certain what the actual type is.
So in the following snippet, it will be optimized:
void f() {
ApiWrapper x;
x.Call(); // Can be replaced
}
But in the general case, it can't:
void f(ApiWrapper* wrapper) {
wrapper->Call(); // Cannot be replaced
}
You also added two conditions to your question:
if there is no class that overrides a virtual method it becomes "normal".
This will not help. Neither the C++ compiler nor the linker will look at the totality of classes to search whether any inheritor exists. It's futile anyway, since you can always dynamically-load an instance of a new class.
By the way, this optimization is indeed performed by some JVMs (called devirtualization) - since in Java land there's a class loader which is aware of which classes are currently loaded.
in production this class is final
That will help! Clang, for example, will convert virtual calls to non-virtual calls if the method / method's class is marked final.

Related

Will a single derivation of an abstract interface have it's virtual functions inlined?

I have a pure virtual base class as an interface
struct Interface
{
virtual void foo() = 0;
};
I have 2 derivations of this interface, 1 for production use and 1 for test use
struct Production : Interface
{
void foo() override;
};
struct Test : Interface
{
void foo() override;
};
In my codebase I use Interface everywhere
void bar(Interface& i)
{
i.foo();
}
However, in my production binary, there is only a single derivation of Interface, namely Production.
When compiling my production binary, will the compiler/linker/optimiser see that for every call to Interface::foo(), it is actually a call to Production::foo(), and inline the virtual function calls?
If so, does it matter if I'm compiling these in different static libraries? ie: The static library only sees Interface, but when linking, the linker knows it's a Production.
I'm currently using gcc-4.9.2
About the only way to know in this specific case is to look at the code that's generated to find out.
That said, g++ 4.9 added a -fdevirtualize switch that's intended to do exactly this, at least under the right circumstances (and I've tested it and found cases where it definitely did work).
It was implemented in g++ by Honza Hubička, who wrote some about it in his blog:
http://hubicka.blogspot.com/2014/01/devirtualization-in-c-part-1.html
http://hubicka.blogspot.com/2014/01/devirtualization-in-c-part-2-low-level.html
http://hubicka.blogspot.com/2014/02/devirtualization-in-c-part-3-building.html
http://hubicka.blogspot.com/2014/02/devirtualization-in-c-part-4-analyzing.html
If you read these, you'll see why Crazy Eddie (for one example) was skeptical about anybody doing this--although it's been done, and does work, implementation is clearly non-trivial.
As far as writing your code goes: since you don't expect any runtime polymorphism anyway, you might consider using a template-based solution instead.
struct Production
{
void foo();
};
struct Test
{
void foo();
};
template <class Interface>
void bar(Interface& i)
{
i.foo();
}
Of course, in most cases, you end up with the Interface as a parameter to a class template, not a function template, but the basic idea remains the same--you define your Test and Production versions with the same interface (but not using a base class or virtual functions), and then instantiate over the Test version when testing and the Production version for production code. This, of course, works with essentially any compiler all the time, not just the right compiler under the right circumstances.
There's no accurate answer to this question except: maybe.
I would seriously doubt anyone's bothered to develop the matching for this optimization. It would have to be a "whole program optimization" or "link time optimization" for one. Most people don't build with that turned on.
Further it could be voided out if the library is not static. Meaning even abi-compatible library changes could affect the validity of the optimization. unique_ptr<Interface> fun() could suddenly start returning different derived classes based on some internal state of the library.
So my answer, and again it's not correct, is: I seriously doubt any compiler does this.

Mimicing C# 'new' (hiding a virtual method) in a C++ code generator

I'm developing a system which takes a set of compiled .NET assemblies and emits C++ code which can then be compiled to any platform having a C++ compiler. Of course, this involves some extensive trickery due to various things .NET does that C++ doesn't.
One such situation is the ability to hide virtual methods, such as the following in C#:
class A
{
virtual void MyMethod()
{ ... }
}
class B : A
{
override void MyMethod()
{ ... }
}
class C : B
{
new virtual void MyMethod()
{ ... }
}
class D : C
{
override void MyMethod()
{ ... }
}
I came up with a solution to this that seemed clever and did work, as in the following example:
namespace impdetails
{
template<class by_type>
struct redef {};
}
struct A
{
virtual void MyMethod( void );
};
struct B : A
{
virtual void MyMethod( void );
};
struct C : B
{
virtual void MyMethod( impdetails::redef<C> );
};
struct D : C
{
virtual void MyMethod( impdetails::redef<D> );
};
This does of course require that all the call sites for C::MyMethod and D::MyMethod construct and pass the dummy object, as in this example:
C *c_d = &d;
c_d->MyMethod( impdetails::redef<C>() );
I'm not worried about this extra source code overhead; the output of this system is mainly not intended for human consumption.
Unfortunately, it turns out this actually causes runtime overhead. Intuitively, one would expect that because impdetails::redef<> is empty, it would take no space and passing it would involve no code.
However, the C++ standard, for reasons I understand but don't totally agree with, mandates that objects cannot have zero size. This leaves us with a situation where the compiler actually emits code to create and pass the object.
In fact, at least on VC2008, I found that it even went to the trouble of zeroing the dummy byte, even in release builds! I'm not sure why that was necessary, but it makes me even more not want to do it this way.
If all else fails I could always change the actual name of the function, such as perhaps having MyMethod, MyMethod$1, and MyMethod$2. However, this causes more problems. For instance, $ is actually not legal in C++ identifiers (although compilers I've tested will allow it.) A totally acceptable identifier in the output program could also be an identifier in the input program, which suggests a more complex approach would be needed, making this a less attractive option.
It also so turns out that there are other situations in this project where it would be nice to be able to modify method signatures using arbitrary type arguments similar to how I'm passing a type to impdetails::redef<>.
Is there any other clever way to get around this, or am I stuck between adding overhead at every call site or mangling names?
After considering some other aspects of the system as well such as interfaces in .NET, I am starting to think maybe it's better - perhaps even more-or-less necessary - to not even use the C++ virtual calling mechanism at all. The more I consider, the messier using that mechanism is getting.
In this approach, each user object class would have a separate struct for the vtable (perhaps kept in a separate namespace like vtabletype::. The generated class would have a pointer member that would be initialized through some trickery to point to a static instance of the vtable. Virtual calls would explicitly use a member pointer from that vtable.
If done properly this should have the same performance as the compiler's own implementation would. I've confirmed it does on VC2008. (By contrast, just using straight C, which is what I was planning on earlier, would likely not perform as well, since compilers often optimize this into a register.)
It would be hellish to write code like this manually, but of course this isn't a concern for a generator. This approach does have some advantages in this application:
Because it's a much more explicit approach, one can be more sure that it's doing exactly what .NET specifies it should be doing with respect to newslot as well as selection of interface implementations.
It might be more efficient (depending on some internal details) than a more traditional C++ approach to interfaces, which would tend to invoke multiple inheritance.
In .NET, objects are considered to be fully constructed when their .ctor runs. This impacts how virtual functions behave. With explicit knowledge of the vtables, this could be achieved by writing it in during allocation. (Although putting the .ctor code into a normal member function is another option.)
It might avoid redundant data when implementing reflection.
It provides better control and knowledge of object layout, which could be useful for the garbage collector.
On the downside, it totally loses the C++ compiler's overloading feature with regard to the vtable entries: those entries are data members, not functions, so there is no overloading. In this case it would be tempting to just number the members (say _0, _1...) This may not be so bad when debugging, since once the pointer is followed, you'll see an actual, properly-named member function anyway.
I think I may end up doing it this way but by all means I'd like to hear if there are better options, as this is admittedly a rather complex approach (and problem.)

How does this code create an instance of a class which has only a private constructor?

I'm working on a sound library (with OpenAL), and taking inspiration from the interface provided by FMOD, you can see the interface at this link.
I've provided some concepts like: Sound, Channel and ChannelGroup, as you can see through FMOD interface, all of those classes have a private constructor and, for example, if you would create a Sound you mast use the function createSound() provided by the System class (the same if you would create a Channel or a ChannelGroup).
I'd like to provide a similar mechanism, but I don't understand how it work behind. For example, how can the function createSound() create a new istance of a Sound? The constructor is private and from the Sound interface there aren't any static methods or friendship. Are used some patterns?
EDIT: Just to make OP's question clear, s/he is not asking how to create a instance of class with private constructor, The question is in the link posted, how is instance of classes created which have private constructor and NO static methods or friend functions.
Thanks.
Hard to say without seeing the source code. Seems however that FMOD is 100% C with global variables and with a bad "OOP" C++ wrapper around it.
Given the absence of source code and a few of the bad tricks that are played in the .h files may be the code is compiled using a different header file and then just happens to work (even if it's clearly non-standard) with the compilers they are using.
My guess is that the real (unpublished) source code for the C++ wrapper is defining a static method or alternatively if everything is indeed just global then the object is not really even created and tricks are being played to fool C++ object system to think there is indeed an object. Apparently all dispatching is static so this (while not formally legal) can happen to work anyway with C++ implementations I know.
Whatever they did it's quite ugly and non-conforming from a C++ point of view.
They never create any instances! The factory function is right there in the header
/*
FMOD System factory functions.
*/
inline FMOD_RESULT System_Create(System **system)
{ return FMOD_System_Create((FMOD_SYSTEM **)system); }
The pointer you pass in to get a System object is immediately cast to a pointer to a C struct declared in the fmod.h header.
As it is a class without any data members who can tell the difference?
struct Foo {
enum Type {
ALPHA,
BETA_X,
BETA_Y
};
Type type () const;
static Foo alpha (int i) {return Foo (ALPHA, i);}
static Foo beta (int i) {return Foo (i<0 ? BETA_X : BETA_Y, i);}
private:
Foo (Type, int);
};
create_alpha could have been a free function declared friend but that's just polluting the namespace.
I'm afraid I can't access that link but another way could be a factory pattern. I'm guessing a bit, now.
It is the factory pattern - as their comment says.
/*
FMOD System factory functions.
*/
inline FMOD_RESULT System_Create(System **system) { return FMOD_System_Create((FMOD_SYSTEM **)system); }
It's difficult to say exactly what is happening as they don't publish the source for the FMOD_System_Create method.
The factory pattern is a mechanism for creating an object but the (sub)class produced depends on the parameters of the factory call. http://en.wikipedia.org/wiki/Factory_method_pattern

Segfault when calling a virtual function in another DLL

I have a two-part program where the main core needs to have an adapter registered and then call back into the register. The parts are in separate DLLs, and the core doesn't know the adapter's specifics (besides methods and parameters ahead of time). I've tried setting it up with the following code and recieve a segfault every time the core tries to call an adapter method:
Core.hpp/cpp (combined and simplified):
class Core
{
public:
Core()
: mAdapter(NULL)
{ }
void DoStuff(int param)
{
if ( this->mAdapter )
{
this->mAdapter->Prepare(param);
}
}
void Register(Adapter * adapter)
{
this->mAdapter = adapter;
}
private:
Adapter * mAdapter;
};
Adapter.hpp/cpp (in the core library):
class Adapter
{
public:
Adapter(Core * core) { }
virtual void Prepare(int param)
{
// For testing, this is defined here
throw("Non-overridden adapter function called.");
}
};
AdapterSpecific.hpp/cpp (second library):
class Adapter_Specific
: Adapter
{
public:
Adapter_Specific(Core * core)
{
core->Register(this);
}
void Prepare(int param) { ... }
};
All classes and methods are marked as DLL export while building the first module (core) and the core is marked as export, the adapter as import while building the adapter.
The code runs fine up until the point where Core::DoStuff is called. From walking through the assembly, it appears that it resolves the function from the vftable, but the address it ends up with is an 0x0013nnnn, and my modules are in the 0x0084nnnn and above range. Visual Studio's source debugger and the intellisense shows the the entries in the vftable are the same and the appropriate one does go to a very low address (one also goes to 0xF-something, which seems equally odd).
Edit for clarity: Execution never re-enters the adapter_specific class or module. The supposed address for the call is invalid and execution gets lost there, resulting in the segfault. It's not an issue with any code in the adapter class, which is why I left that out.
I've rebuilt both libraries more than once, in debug mode. This is pure C++, nothing fancy. I'm not sure why it won't work, but I need to call back into the other class and would rather avoid using a struct of function ptrs.
Is there a way to use neat callbacks like this between modules or is it somehow impossible?
You said all your methods are declared DLL exports. Methods (member functions) do not have to be marked that way ,exporting the class is sufficient. I don't know if it's harmfull if you do.
You are using a lot of pointers. Where are their lifetimes managed?
Adapter_Specific is inheriting private from Adapter in your example
Adapter has a virtual method so probably needs a virtual destructor too
Adapter also has no default constructor so the constructor of Adapter_Specific won't compile. You might want it to construct the base class with the same parameter. This does not happen automatically. However see point 6.
Constructors that take exactly one parameter (other than the copy constructor) should normally be declared explicit.
The base class Adapter takes a parameter it does not use.
The problem ended up being a stupid mistake on my part.
In another function in the code, I was accidentally feeding a function a Core * instead of the Adapter * and didn't catch it in my read-through. Somehow the compiler didn't catch it either (it should have failed, but no implicit cast warning was given, possibly because it was a reference-counted point).
That tried to turn the Core * into an Adapter and get the vftable from that mutant object, which failed miserably and resulted in a segfault. I fixed it to be the proper type and all works fine now.
__declspec(dllexport) on classes and class members is a really bad idea. It's much better to use an interface (base class containing only virtual functions, it's essentially the same as the "struct of function pointers" you didn't want, except the compiler handles all the details), and use __declspec(dllexport) only for global functions such as factory functions. Especially don't call constructors and destructors directly across DLL boundaries because you'll get mismatched allocators, expose an ordinary function which wraps the special functions.

fake/mock nonvirtual C++ methods

It known that in C++ mocking/faking nonvirtual methods for testing is hard. For example, cookbook of googlemock has two suggestion - both mean to modify original source code in some way (templating and rewriting as interface).
It appear this is very bad problem for C++ code. How can be done best if you can't modify original code that needs to be faked/mocked? Duplicating whole code/class (with it whole base class hierarchy??)
One way that we sometimes use is to split the original .cpp file into at least two parts.
Then the test apparatus can supply its own implementations; effectively using the linker to do the dirty work for us.
This is called the "Link Seam" in some circles.
I followed the Link Seam link from sdg's answer. There I read about different types of seams, but I was most impressed by Preprocessing Seams. This made me think about exploiting further the preprocessor. It turned out that it is possible to mock any external dependency without actually changing the calling code.
To do this, you have to compile the calling source file with a substitute dependency definition.
Here is an example how to do it.
dependency.h
#ifndef DEPENDENCY_H
#define DEPENDENCY_H
class Dependency
{
public:
//...
int foo();
//...
};
#endif // DEPENDENCY_H
caller.cpp
#include "dependency.h"
int bar(Dependency& dependency)
{
return dependency.foo() * 2;
}
test.cpp
#include <assert.h>
// block original definition
#define DEPENDENCY_H
// substitute definition
class Dependency
{
public:
int foo() { return 21; }
};
// include code under test
#include "caller.cpp"
// the test
void test_bar()
{
Dependency mockDependency;
int r = bar(mockDependency);
assert(r == 42);
}
Notice that the mock does not need to implement complete Dependency, just the minimum (used by caller.cpp) so the test can compile and execute.
This way you can mock non-virtual, static, global functions or almost any dependency without changing the productive code.
Another reason I like this approach is that everything related to the test is in one place. You don't have to tweak compiler and linker configurations here and there.
I have applied this technique successfully on a real world project with big fat dependencies.
I have described it in more detail in Include mock.
Code has to be written to be testable, by whatever test techniques you use. If you want to test using mocks, that means some form of dependency injection.
Non-virtual calls with no dependence on a template parameter pose the same problem as final and static methods in Java[*] - the code under test has explicitly said, "I want to call this code, not some unknown bit of code that's dependent in some way on an argument". You, the tester, want it to call different code under test from what it normally calls. If you can't change the code under test then you, the tester, will lose that argument. You might as well ask how to introduce a test version of line 4 of a 10-line function without changing the code under test.
If the class to be mocked is in a different TU from the class under test, you can write a mock with the same name as the original and link that instead. Whether you can generate that mock using your mocking framework in the normal way, I'm not so sure.
If you like, I suppose it's a "very bad problem for C++" that it's possible to write code that's hard to test. It shares this "problem" with a great number of other languages...
[*] My Java knowledge is quite low-power. There may be some clever way of mocking such methods in Java, which aren't applicable to C++. If so, please disregard them in order to see the analogy ;-)
I think it is not possible to do it with standard C++ right now (but lets hope that a powerful compile-time reflection will come to C++ soon...). However, there are a number of options for doing so.
You might have a look at Injector++. It is Windows only right now, but plans to add support for Linux & Mac.
Another option is CppFreeMock, which seems to work with GCC, but has no recent activities.
HippoMocks also provide such ability, but only for free functions. It doesn't support it for class member functions.
I'm not completely sure, but it seems that all the above achieve this with overwriting the target function at runtime so that it jumps to the faked function.
The there is C-Mock, which is an extension to Google Mock allowing you to mock non-virtual functions by redefining them, and relying on the fact that original functions are in dynamic libraries. It is limited to GNU/Linux platform.
Finally, you might also try PowerFake (for which, I'm the author) as introduced here.
It is not a mocking framework (currently) and it provides the possibility for replacing production functions with test ones. I hope to be able to integrate it to one or more mocking frameworks; if not, it'll become one.
Update: It has an integration with FakeIt.
Update 2: Added support for Google Mock
It also overrides the original function during linking (so, it won't work if a function is called in the same translation unit in which it is defined), but uses a different trick than C-Mock as it uses GNU ld's --wrap option. It also needs some changes to your build system for tests, but doesn't affect the main code in any way (except if you are forced to put a function in a separate .cpp file); but support for easily integrating it into CMake projects is provided.
But, it is currently limited to GCC/GNU ld (works also with MinGW).
Update: It supports GCC & Clang compilers, and GNU ld & LLVM lld linkers (or any compatible linker).
#zaharpopov you can use Typemock IsolatorPP to create mocks of non-virtual class and methods without changing your code (or legacy code).
for example if you have a non-virtual class called MyClass:
class MyClass
{
public:
int GetResult() { return -1; }
}
you can mock it with typemock like so:
MyClass* fakeMyClass = FAKE<MyClass>();
WHEN_CALLED(fakeMyClass->GetResult()).Return(10);
By the way the classes or methods that you want to test can also be private as typemock can mock them too, for example:
class MyClass
{
private:
int PrivateMethod() { return -1; }
}
MyClass* myClass = new MyClass();
PRIVATE_WHEN_CALLED(myClass, PrivateMethod).Return(1);
for more information go here.
You very specifically say "if you can't modify original code", which the techniques you mention in your question (and all the other current "answers") do.
Without changing that source, you can still generally (for common OSes/tools) preload an object that defines its own version of the function(s) you wish to intercept. They can even call the original functions afterwards. I provided an example of doing this in (my) question Good Linux TCP/IP monitoring tools that don't need root access?.
That is easier then you think. Just pass the constructed object to the constructor of the class you are testing. In the class store the reference to that object. Then it is easy to use mock classes.
EDIT :
The object that you are passing to the constructor needs an interface, and that class store just the reference to the interface.
struct Abase
{
virtual ~Abase(){}
virtual void foo() = 0;
};
struct Aimp : public Abase
{
virtual ~Aimp(){}
virtual void foo(){/*do stuff*/}
};
struct B
{
B( Aimp &objA ) : obja( objA )
{
}
void boo()
{
objA.foo();
}
Aimp &obja;
};
int main()
{
//...
Aimp realObjA;
B objB( realObjA );
// ...
}
In the test, you can pass the mock object easy.
I used to create an interface for the parts I needed to mock. Then I simply created a stub class that derived from this interface and passed this instance to my classes under test. Yes, it is a lot of hard work, but I found it worth it for some circumstances.
Oh, by interface I mean a struct with only pure virtual methods. Nothing else!