Segfault when calling a virtual function in another DLL - c++

I have a two-part program where the main core needs to have an adapter registered and then call back into it. The parts are in separate DLLs, and the core doesn't know the adapter's specifics (beyond its methods and parameters, which are known ahead of time). I've tried setting it up with the following code and receive a segfault every time the core tries to call an adapter method:
Core.hpp/cpp (combined and simplified):
class Core
{
public:
    Core()
        : mAdapter(NULL)
    { }

    void DoStuff(int param)
    {
        if ( this->mAdapter )
        {
            this->mAdapter->Prepare(param);
        }
    }

    void Register(Adapter * adapter)
    {
        this->mAdapter = adapter;
    }

private:
    Adapter * mAdapter;
};
Adapter.hpp/cpp (in the core library):
class Adapter
{
public:
    Adapter(Core * core) { }

    virtual void Prepare(int param)
    {
        // For testing, this is defined here
        throw("Non-overridden adapter function called.");
    }
};
AdapterSpecific.hpp/cpp (second library):
class Adapter_Specific
    : Adapter
{
public:
    Adapter_Specific(Core * core)
    {
        core->Register(this);
    }

    void Prepare(int param) { ... }
};
All classes and methods are marked as DLL export while building the first module (the core); while building the adapter, the core is marked as export and the adapter as import.
The code runs fine up until the point where Core::DoStuff is called. From walking through the assembly, it appears that the call resolves the function from the vftable, but the address it ends up with is around 0x0013nnnn, while my modules are in the 0x0084nnnn-and-above range. Visual Studio's source debugger and IntelliSense show that the entries in the vftable are the same, and the appropriate one does point to a very low address (another points to 0xF-something, which seems equally odd).
Edit for clarity: Execution never re-enters the Adapter_Specific class or module. The supposed address for the call is invalid and execution gets lost there, resulting in the segfault. It's not an issue with any code in the adapter class, which is why I left that out.
I've rebuilt both libraries more than once, in debug mode. This is pure C++, nothing fancy. I'm not sure why it won't work, but I need to call back into the other class and would rather avoid using a struct of function pointers.
Is there a way to use neat callbacks like this between modules or is it somehow impossible?

You said all your methods are declared DLL exports. Methods (member functions) do not have to be marked that way; exporting the class is sufficient. I don't know whether it's harmful if you do.

1. You are using a lot of pointers. Where are their lifetimes managed?
2. Adapter_Specific inherits privately from Adapter in your example.
3. Adapter has a virtual method, so it probably needs a virtual destructor too.
4. Adapter has no default constructor, so the constructor of Adapter_Specific won't compile. You might want it to construct the base class with the same parameter; this does not happen automatically. However, see point 6.
5. Constructors that take exactly one parameter (other than the copy constructor) should normally be declared explicit.
6. The base class Adapter takes a parameter it does not use.

The problem ended up being a stupid mistake on my part.
In another function in the code, I was accidentally feeding a function a Core * instead of the Adapter * and didn't catch it in my read-through. Somehow the compiler didn't catch it either (it should have failed, but no implicit-cast warning was given, possibly because it was a reference-counted pointer).
That tried to treat the Core * as an Adapter and get the vftable from that mutant object, which failed miserably and resulted in a segfault. I fixed it to be the proper type and all works fine now.
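For illustration, a minimal sketch of how a wrong pointer type can slip past the compiler through untyped storage (the Handle wrapper is hypothetical; the actual reference-counted pointer in my code is not shown):

class Core { /* no virtual functions */ };
class Adapter { public: virtual void Prepare(int) { } };

// Hypothetical handle that stores its payload as void*, as some
// reference-counted wrappers do -- this is where type checking is lost.
struct Handle { void * raw; };

void CallPrepare(Handle h)
{
    Adapter * a = static_cast<Adapter *>(h.raw);  // assumes an Adapter* was stored
    a->Prepare(0);  // if h.raw is really a Core*, the vtable fetch reads garbage
}                   // and the call jumps to a bogus address -- the segfault above

int main()
{
    Core core;
    Handle h;
    h.raw = &core;   // compiles silently: any object pointer converts to void*
    CallPrepare(h);  // undefined behavior at the virtual call
}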

__declspec(dllexport) on classes and class members is a really bad idea. It's much better to use an interface (a base class containing only virtual functions; it's essentially the same as the "struct of function pointers" you didn't want, except the compiler handles all the details) and to use __declspec(dllexport) only for global functions such as factory functions. In particular, don't call constructors and destructors directly across DLL boundaries, because you'll get mismatched allocators; expose an ordinary exported function which wraps those special functions.
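A minimal sketch of that pattern, assuming Visual C++ (the names IAdapter, AdapterSpecific, and CreateAdapter are illustrative, not from the question's code):

// adapter_api.h -- shared header; note the class itself carries no dllexport
class IAdapter
{
public:
    virtual void Prepare(int param) = 0;
    virtual void Destroy() = 0;    // frees the object inside the DLL that created it
protected:
    ~IAdapter() { }                // non-public: clients must go through Destroy()
};

// plugin.cpp -- the implementation lives entirely inside the plugin DLL
class AdapterSpecific : public IAdapter
{
public:
    void Prepare(int param) { /* ... */ }
    void Destroy() { delete this; }   // same allocator that performed the 'new'
};

// The only exported symbol is a plain factory function.
extern "C" __declspec(dllexport) IAdapter * CreateAdapter()
{
    return new AdapterSpecific();
}

This keeps the vtable as the only contract between the modules and ensures allocation and deallocation happen in the same CRT.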

Related

Is the C++ linker smart about virtual methods used only from one class in a program?

I work on a project with extremely low unit test culture. We have almost zero unit testing and every API is static.
To be able to unit test some of my code, I create wrappers like:
class ApiWrapper {
public:
    virtual int Call(int foo, int bar) {
        return ApiCall(foo, bar);
    }
};
Now in my functions, instead of:
int myfunc() {
    ApiCall(foo, bar);
}
I do:
int myfunc(ApiWrapper* wrapper) {
    wrapper->Call(foo, bar);
}
This way I am able to mock such functionality. The problem is that some colleagues complain that production code should not be affected by testability needs - nonsense, I know, but that's the reality.
Anyway, I believe I read somewhere at some point that compilers are actually smart about replacing unused polymorphic behavior with a direct call, or that if there is no class that overrides a virtual method, it becomes "normal".
I experimented, and on gcc 4.8 it does not inline or directly call the virtual method, but instead creates a vtable.
I tried to google this, but I did not find anything. Is this a thing, or do I misremember... or do I have to do something to explain it to the linker, an optimization flag or something?
Note that while in production this class is final, in the test environment it is not. This is exactly what the linker would have to be smart about and detect.
The C++ compiler will only replace a polymorphic call with a direct call if it knows for certain what the actual type is.
So in the following snippet, it will be optimized:
void f() {
    ApiWrapper x;
    x.Call(); // Can be replaced
}
But in the general case, it can't:
void f(ApiWrapper* wrapper) {
    wrapper->Call(); // Cannot be replaced
}
You also added two conditions to your question:
if there is no class that overrides a virtual method it becomes "normal".
This will not help. Neither the C++ compiler nor the linker will look at the totality of classes to search for whether any inheritor exists. It's futile anyway, since you can always dynamically load an instance of a new class.
By the way, this optimization is indeed performed by some JVMs (it's called devirtualization), since in Java land there's a class loader which is aware of which classes are currently loaded.
in production this class is final
That will help! Clang, for example, will convert virtual calls to non-virtual calls if the method or the method's class is marked final.
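A small sketch of the final-based devirtualization (production build; the test build would simply omit final):

// Marking the class final tells the compiler no override can exist.
class ApiWrapper final {
public:
    virtual int Call(int foo, int bar) { return ApiCall(foo, bar); }
};

int use(ApiWrapper* w) {
    // Clang (and recent gcc) may emit a direct call here -- no vtable lookup --
    // because the pointee's dynamic type can only be ApiWrapper.
    return w->Call(1, 2);
}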

c++ plugin : Is it ok to pass polymorphic objects?

When using dynamic libraries, I understand that we should only pass Plain Old Data structures across boundaries. So can we pass a pointer to a base class?
My idea is that the application and the library could both be aware of a common Interface (pure virtual methods, = 0).
The library could instantiate a subtype of that Interface,
and the application could use it.
For instance, is the following snippet safe?
// file interface.h
class IPrinter {
public:
    virtual void print(std::string str) = 0;
};
-
// file main.cpp
int main() {
    // load plugin...
    IPrinter* printer = plugin_get_printer();
    printer->print( std::string{"hello"} );
}
-
// file plugin.cpp (compiled by another compiler)
IPrinter* plugin_get_printer() {
    return new PrinterImpl{};
}
This snippet is not safe:
- The two sides of your DLL boundary do not use the same compiler. This means that the name mangling (for function names) and the vtable layout (for virtual functions) might not be the same (both are implementation specific).
- The heap on both sides may also be managed differently, so you run risks related to deleting your object if that's not done in the DLL.
This article presents the main challenges of binary-compatible interfaces very well.
You may, however, pass a pointer to the other side of the mirror as part of a POD, as long as the other side doesn't use it by itself. For example: your app passes a pointer to a configuration object to the DLL; later, another DLL function returns that pointer to your app; your app can then use it as expected (at least if it wasn't a pointer to a local object that no longer exists).
The presence of virtual functions in your class means that your class is going to have a vtable, and different compilers implement vtables differently.
So, if you use classes with virtual methods across DLL calls where the compiler used on the other side is different from the compiler that you are using, the result is likely to be spectacular crashes.
In your case, the PrinterImpl created by the DLL will have a vtable constructed in a certain way, but the printer->print() call in your main() will attempt to interpret the vtable of IPrinter in a different way in order to resolve the print() method call.
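When the two sides really are built with different compilers, a common workaround (sketched here; the printer_* names are invented) is to keep the boundary C-only and hide the C++ object behind an opaque handle:

// plugin_api.h -- C boundary: no vtables or mangled names cross the DLL line
extern "C" {
    typedef struct PrinterHandle PrinterHandle;   // opaque to the application

    PrinterHandle* printer_create(void);
    void printer_print(PrinterHandle* h, const char* str);
    void printer_destroy(PrinterHandle* h);       // frees on the DLL's own heap
}

// plugin.cpp -- inside the DLL, the handle wraps the real C++ object
struct PrinterHandle { PrinterImpl impl; };

extern "C" PrinterHandle* printer_create(void) { return new PrinterHandle(); }
extern "C" void printer_print(PrinterHandle* h, const char* str) { h->impl.print(str); }
extern "C" void printer_destroy(PrinterHandle* h) { delete h; }

Only C types cross the boundary (here const char* instead of std::string), and the object is created and destroyed by the same module.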

Prevent subclassing an abstract class interface in C++

I provide an SDK to my users, allowing them to write DLLs in C++ for expanding the software.
The SDK headers mostly contain interface class definitions. These classes are of two types:
- Some that the user must subclass and implement
- Some that are wrappers around core classes, passed by the app to the DLL functions as pointers, which can then be used as arguments by the DLL code for calling core functions. These interfaces should not be subclassed by the user and passed to the core functions, as those expect a specific core subclass.
I write in the manual which interfaces should not be subclassed and are only to be used through pointers to objects provided by the app. But in some places it's too tempting to subclass them if you haven't read the manual.
Would it be possible to prevent subclassing some interfaces in the SDK headers?
As long as the client doesn't need to use the pointer for anything but passing it back into your DLL, you can just use a forward declaration; you can't derive from an incomplete type. (When faced with a similar case recently, I went whole hog and designed a special wrapper type based on void*. There's a lot of casting in the interface code, but there's no way the client can do much other than pass the value back to me.)
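For instance (names invented for illustration):

// sdk.h -- shipped to clients
class CoreObject;                      // incomplete type: clients cannot derive from it

CoreObject* GetCoreObject();           // the app hands out opaque pointers
void DoCoreThing(CoreObject* obj);     // the client can do nothing but pass them back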
If the classes in question implement an interface which the client must also use, there are two solutions. The first is to change this, replacing each of the member functions with a free function which takes a pointer to the type, and just provide a forward declaration. The second is to use something like:
class InternallyVisibleInterface;

class ClientVisibleInterface
{
private:
    virtual void doSomething() = 0;
    ClientVisibleInterface() = default;
    friend class InternallyVisibleInterface;

protected: // Or public, depending on whether the client should
           // be able to delete instances or not.
    virtual ~ClientVisibleInterface() = default;

public:
    void something();
};
and in your DLL:
class InternallyVisibleInterface : public ClientVisibleInterface
{
protected:
    InternallyVisibleInterface() {}

    // And anything else you need. If there is only one class in
    // your application which should derive from the interface,
    // this is it. If there are several, they should derive from
    // this class, rather than ClientVisibleInterface, since this
    // is the only class which can construct the
    // ClientVisibleInterface base class.
};
void ClientVisibleInterface::something()
{
    assert( dynamic_cast<InternallyVisibleInterface*>( this ) != nullptr );
    doSomething();
}
This offers two levels of protection: first, although derivation directly from ClientVisibleInterface is possible, it's impossible for the resulting class to have a constructor, and so it cannot be instantiated. And secondly, if the client code does cheat somehow, there will be a runtime error.
You probably don't need both protections; one or the other should suffice. The private constructor will result in a compile-time error rather than a runtime one. On the other hand, without it, you don't even have to mention the name of InternallyVisibleInterface in the distributed headers.
As soon as a developer has a development environment, he can do almost anything, and you should not even try to control that.
IMHO the best you can do is to identify the boundary between the core application and the extension DLLs and ensure that objects received from those DLLs are of the correct class, aborting with a distinctive message if they are not.
Using RTTI and typeid is generally frowned upon because it is usually the sign of a bad OOP design: in the normal use case, calling a virtual method is enough to have the proper code invoked. But I think it can safely be considered in your use case.
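For example, such a check might look like this (IWrapper and CoreWrapperImpl are placeholder names for the SDK interface and the core's only accepted implementation):

#include <typeinfo>
#include <cstdio>
#include <cstdlib>

struct IWrapper { virtual ~IWrapper() { } };       // SDK-visible interface
struct CoreWrapperImpl : IWrapper { /* ... */ };   // the core's own subclass

void CheckIsCoreObject(IWrapper* obj)
{
    // Abort with a distinctive message if a DLL handed us a foreign subclass.
    if (typeid(*obj) != typeid(CoreWrapperImpl)) {
        std::fprintf(stderr, "SDK misuse: unexpected type %s\n", typeid(*obj).name());
        std::abort();
    }
}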

How do I add code automatically to a derived function in C++

I have code that's meant to manage operations on both a networked client and a server, since there is significant overlap between the two. However, there are a few functions here and there that are meant to be exclusively called by the client or server, and accidentally calling a client function on the server (or vice versa) is a significant source of bugs.
To reduce these sorts of programming errors, I'm trying to tag functions so that they'll raise a ruckus if they're misused. My current solution is a simple macro at the start of each function that calls an assert if the client or server accesses members they shouldn't. However, this runs into problems when there are multiple derived instances of classes, in that I have to tag the implementation as client or server side in EVERY child class.
What I'd like to be able to do is put a tag in the virtual member's signature in the base class, so that I only have to tag it once and not run into errors by forgetting to do it repeatedly. I've considered putting a check in a base class implementation and then referring to it with something like base::functionName, but that runs into the same issue as far as needing to manually add the function call to every implementation. Ideally, I'd be able to have parent versions of the function called automatically like default constructors do.
Does anybody know how to achieve something like this in C++? Is there an alternate approach I should be considering?
Thanks!
Another approach might be to override a different method than the one your callers actually call:
class Base {
public:
    void doit(const Something &);
protected:
    virtual void real_doit(const Something &);
};
class Derived : public Base {
protected:
    virtual void real_doit(const Something &);
};
The implementation of Base::doit() could do the check to make sure that it's being called in the right environment, and then call the virtual real_doit() function. Derived classes would override the protected virtual function, and users of either class wouldn't be able to call the protected function.
The Base::doit() function is not virtual so that derived classes can't accidentally override the wrong one. (People can try, but hopefully they'll notice soon enough when it's not called.)
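The check inside the non-virtual entry point might look something like this (IsServer() is a placeholder for however the code distinguishes the two roles):

#include <cassert>

void Base::doit(const Something &s)
{
    assert(IsServer() && "server-only operation invoked on the client");
    real_doit(s);   // dispatches virtually to the derived implementation
}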
What you've proposed is incredibly complex. It sounds like a simpler solution would be:
class CommonStuff {
    // all common code that anybody can safely call
};

class ServerBase : public CommonStuff {
    // only what the server is allowed to call; can safely be overridden
};

class ClientBase : public CommonStuff {
    // only what the client is allowed to call; can safely be overridden
};
Compile-time enforcements are much better than any sort of runtime enforcement.
There's not a way within the language (that I know of) to do what you're asking without redesigning your classes. The simplest solution may be to have a Client interface (pure virtual) class that does not declare server functions, and a Server interface class that doesn't declare client functions, and have your consolidated code inherit (publicly) from both interfaces. Then in your client program, use a reference (or pointer) to the Client interface, which does not allow access to any methods not declared in the Client interface. On the server, use the Server interface.
This will also allow you to use derived classes as Server or Client as well.
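A sketch of that split (names invented for illustration):

class ClientInterface {
public:
    virtual void clientOnlyOp() = 0;
    virtual ~ClientInterface() { }
};

class ServerInterface {
public:
    virtual void serverOnlyOp() = 0;
    virtual ~ServerInterface() { }
};

// The consolidated code implements both interfaces...
class Connection : public ClientInterface, public ServerInterface {
public:
    void clientOnlyOp() { /* ... */ }
    void serverOnlyOp() { /* ... */ }
};

// ...but each program sees only one face of it:
void clientCode(ClientInterface &c) { c.clientOnlyOp(); }   // serverOnlyOp is not even visible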
I would consider splitting this library into three libraries: A base library that has most everything, a server-only library, and a client-only library. As long as the client doesn't use the server library, you're good. You may end up adding a few extra classes (class Processor might split into BaseProcessor, ClientProcessor, and ServerProcessor, where each subclass has one additional function that the base doesn't.)
If that won't work, could you put the server/client check in the class constructor, and call the assertion there? (That would only work if the server-only or client-only is granular to the class, not to the method.)
If that won't work, would it make any sense to actually compile different versions of your library, based on whether it's a server or client build? Surround the methods, and their declarations, with #ifdef SERVERBUILD and #ifdef CLIENTBUILD, and include some checks to make sure they aren't both defined (#if defined(SERVERBUILD) && defined(CLIENTBUILD), #error Can't define both!).
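A compressed sketch of the conditional-compilation variant, using the macro names from the paragraph above:

#if defined(SERVERBUILD) && defined(CLIENTBUILD)
#error Can't define both!
#endif

class Processor {
public:
    void commonOp();         // always available
#ifdef SERVERBUILD
    void serverOnlyOp();     // exists only in the server build
#endif
#ifdef CLIENTBUILD
    void clientOnlyOp();     // exists only in the client build
#endif
};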
I voted up Greg Hewgill's answer, but it got me thinking about ways to add "aspects" such as you request. I used his naming convention here (class Base and method doit):
class Base {
protected:
    class Aspect {
    public:
        Aspect(int x) {
            std::cout << "aspect" << std::endl;
        }
    };

public:
    virtual void doit(const Something &arg, const Aspect hook = 0)
    {
        std::cout << "doit(" << arg << ")" << std::endl;
    }
};
Callers can just say base.doit(arg) since Aspect is a default argument. Its constructor runs before doit and its destructor (not pictured) runs after. Sadly my first idea to make the default argument hook = this is not allowed.
Children can override doit with the same signature and get the same effect.

If classes with virtual functions are implemented with vtables, how is a class with no virtual functions implemented?

In particular, wouldn't there have to be some kind of function pointer in place anyway?
I think that the phrase "classes with virtual functions are implemented with vtables" is misleading you.
The phrase makes it sound like classes with virtual functions are implemented "in way A" and classes without virtual functions are implemented "in way B".
In reality, classes with virtual functions are implemented the same way as classes without them; in addition, they have a vtable. Another way to see it is that "vtables implement the 'virtual function' part of a class".
More details on how they both work:
All classes (with virtual or non-virtual methods) are structs. The only difference between a struct and a class in C++ is that, by default, members are public in structs and private in classes. Because of that, I'll use the term class here to refer to both structs and classes. Remember, they are almost synonyms!
Data Members
Classes are (as are structs) just blocks of contiguous memory where each member is stored in sequence. Note that sometimes there will be gaps between members for CPU architectural reasons, so the block can be larger than the sum of its parts.
Methods
Methods or "member functions" are an illusion. In reality, there is no such thing as a "member function". A function is always just a sequence of machine code instructions stored somewhere in memory. To make a call, the processor jumps to that position of memory and starts executing. You could say that all methods and functions are 'global', and any indication of the contrary is a convenient illusion enforced by the compiler.
Obviously, a method acts like it belongs to a specific object, so clearly there is more going on. To tie a particular call of a method (a function) to a specific object, every member method has a hidden argument that is a pointer to the object in question. The argument is hidden in that you don't add it to your C++ code yourself, but there is nothing magical about it -- it's very real. When you say this:
void CMyThingy::DoSomething(int arg)
{
    // do something
}
The compiler really does this:
void CMyThingy_DoSomething(CMyThingy* this, int arg)
{
    // do something
}
Finally, when you write this:
myObj.doSomething(aValue);
the compiler says:
CMyThingy_DoSomething(&myObj, aValue);
No need for function pointers anywhere! The compiler knows already which method you are calling so it calls it directly.
Static methods are even simpler. They don't have a this pointer, so they are implemented exactly as you write them.
That's it! The rest is just convenient syntactic sugar: the compiler knows which class a method belongs to, so it makes sure it doesn't let you call the function without specifying which one. It also uses that knowledge to translate myItem to this->myItem when it's unambiguous to do so.
(yeah, that's right: member access in a method is always done indirectly via a pointer, even if you don't see one)
(Edit: Removed last sentence and posted separately so it can be criticized separately)
Non-virtual member functions are really just syntactic sugar, as they are almost like an ordinary function but with access checking and an implicit object parameter.
struct A
{
    void foo ();
    void bar () const;
};
is basically the same as:
struct A
{
};

void foo (A * this);
void bar (A const * this);
The vtable is needed so that we call the right function for our specific object instance. For example, if we have:
struct A
{
    virtual void foo ();
};
The implementation of 'foo' might approximate to something like:
void foo (A * this) {
    void (*realFoo)(A *) = lookupVtable (this->vtable, "foo");
    (realFoo)(this); // Make the call to the most derived version of 'foo'
}
Virtual methods are required when you want to use polymorphism. The virtual modifier puts the method in the vtable for late binding, and then at runtime it is decided which class's method is executed.
If the method is not virtual, it is decided at compile time which class's implementation will be invoked.
Function pointers are used mostly for callbacks.
If a class with a virtual function is implemented with a vtable, then a class with no virtual function is implemented without a vtable.
A vtable contains the function pointers needed to dispatch a call to the appropriate method. If the method isn't virtual, the call is resolved at compile time against the known type, and no indirection is needed.
For a non-virtual method the compiler can generate a normal function invocation (e.g., a CALL to a particular address with the this pointer passed as a parameter) or even inline it. For a virtual function, the compiler usually doesn't know at compile time at which address to invoke the code, so it generates code that looks up the address in the vtable at runtime and then invokes the method. True, even for virtual functions the compiler can sometimes resolve the right code at compile time (e.g., methods on local variables invoked without a pointer/reference).
(I pulled this section from my original answer so that it can be criticized separately. It is a lot more concise and to the point of your question, so in a way it's a much better answer)
No, there are no function pointers; instead, the compiler turns the problem inside-out.
The compiler calls a global function with a pointer to the object, instead of calling some pointed-to function inside the object.
Why? Because it's usually a lot more efficient that way. Indirect calls are expensive instructions.
There's no need for function pointers, as the call target can't change during runtime.
Branches are generated directly to the compiled code for the methods; just as with functions that aren't in a class at all, branches are generated straight to them.
The compiler/linker resolves directly which methods will be invoked. There is no need for a vtable indirection. BTW, what does that have to do with "stack vs. heap"?