Usually to create COM interface one should declare it in IDL file. In the project I work on I have one COM interface declared in *.h file in C++:
struct DECLSPEC_UUID("A67177F7-A4DD-4A80-8EE1-25CF12172068") ISomeService : public IUnknown
{
virtual ~ISomeService() {}
virtual HRESULT Initialize(const Settings& settings) = 0;
// ...
};
Moreover the method Initialize takes a struct that contains std::string fields as its parameter.
The corresponding COM class is implemented in C++ and it is used from another C++ module.
This works fine until I run the code under AppVerifier. It causes access violation exceptions to occur.
So my questions are
Is it right to sometimes declare COM-interface in *.h file?
If yes is it right to specify C++ types as parameters for COM interface methods? Or should I always use COM compliant types in such cases (BSTR etc)?
Sure, you can describe a COM interface without using IDL. But you won't be able to use such IDL features like type library and marshalling code generation. However if you are using the COM component as in-proc server only (DLL), and it's ok with you to distribute the .h file to the clients - then this approach will work fine.
Avoid using C++ types in the interface, because it may result in access violations when dealing with memory across DLL boundaries. Better use plain C types, or COM types
Related
I've read several times that passing STL objects like vector and string outside of a DLL boundary is bad practice because different compiler versions can generate different code for STL objects. Therefore, you should design a C-style interface and not pass STL objects at all. However, there are still some things unclear to me:
1. What is the 'boundary' of a DLL?
Is it right to say, that the boundary is where code is beeing compiled on DLL side? What if I define a .h file inside a DLL (f.e. to write a factory class) and use that header file in a different project? Is that .h file inside or outside the boundary of the DLL and why?
2. What is contained in a DLL?
Let' say I have a class Foo:
class Foo
{
public:
__declspec(dllexport) void f1(); //instantiates v1 inside function
private:
unique_ptr<vector<int>> v1 = nullptr;
}
If I only mark the function f1() with __declspec(dllexport), only this function should be contained in the DLL. How does the code inside f1() know what v1 is if v1 isn't contained in the DLL?
3. Passing objects out of a DLL-boundary using unique_ptr
I'm using unique_ptr almost everytime in my project. From what I understand, returning a unique_ptr from a DLL would be bad practice because unique_ptr is an STL object. How can I instantiate an object inside the DLL and return a unique_ptr to it?
4. Why does defining interfaces or using PIMPL help to define an DLL interface?
I still have to convert my STL classes to C-style objects. And in the project using the DLL, I would have to somehow wrap the C-style objects inside STL classes again. I don't see any advantage of using interfaces or PIMPL in this case.
Also, if I define an interface (class with pure virtual functions), wouldn't this have the same effect as just declaring the functions in my class with __declspec(dllexport)?
class IFoo
{
public:
virtual ~IFoo() = 0 {};
virtual void f1() = 0;
}
class Foo : public IFoo
{
public:
void f1();
//__declspec(dllexport) void f1(); //why use an interface if I can just declare the functions like this?
}
How is the DLL-STL problematic solved in modern C++ 11/14 libraries? Are there any modern open-source libraries that I can have a look at?
Unfortunately STL types aren't consistent across compilers. Even different versions of Visual Studio have differences.
The boundary is where the code is compiled. If you have an implementation in a header file in your library, then the compiler used to compile the EXE will compile the code. This is potentially very bad because what the code in the EXE thinks is the data is different to what the code in the DLL thinks is the data. (You need to look out for differences like this especially if you have #ifs in a struct definition and you need to be explicit about packing).
The only way to be sure is to define all your own types (being careful of packing) and not use STL. This is what DLL libraries usually do.
Interfaces can enable the user to dynamically link to the library. Using __declspec(dllexport) requires a static linking; that is the EXE has to link to the .lib generated when you compiled the DLL to be able to access all the functions. This means amongst other things you can't update the DLL without the EXE having to be recompiled (probably - you can get away with this in some circumstances, but it's not a good idea).
By dynamically linking you can update the DLL or add functionality to the DLL without relinking the EXE as long as you don't change your interfaces. The EXE might call LoadLibrary() on the DLL and GetProcAddress() to access one function that returns an interface. Everything else including data types passed as parameters are interfaces (i.e. contain only pure virtual functions) or simple structs. This is how the basic level of COM works.
To answer question 2, when you declare something as __declspec(dllexport) you are stating that this is part of the interface to the DLL - something that is accessible to the component that loads the DLL. Anything declared without __declspec(dllexport) should be present within the DLL but will not be available to be called/used by an external component.
I have a bunch of classes like this:
class A
{
public:
std::string s;
std::vector<int> v;
int a;
}
That I'm trying to export to be used in a dll. I want clients to be able to use the class like this:
A a;
a.s = "abc";
b.v.push_back(4);
But because vector and string cannot be exported, this isn't possible. Besides providing getters and setters for each member and using an abstract interface that has those getters and setters, is there any way around this?
Having STL classes at the DLL interface is a very fragile design: if you do so, all the clients of the DLL must be built using the same version of the Visual C++ compiler, and the same settings (e.g. same _ITERATOR_DEBUG_LEVEL, etc.), and the same flavor of the CRT.
This is highly constraining.
You may want to follow different paths. For example, you could build a pure-C interface API layer wrapping your C++ classes, and export that from the DLL. DLL with a pure C interface are just fine (for example, think of the Win32 APIs).
Then you can distribute an header file in which this pure-C interface is wrapped in C++ classes, ready to be used by clients (for example, think of ATL/WTL, which are convenient C++ wrappers around pure-C Win32 APIs).
Or you could follow a COM-ish approach, exporting C++ interfaces.
You can find more details about that in this article:
HowTo: Export C++ classes from a DLL
A DLL just provides references to global variables and functions, since you class uses neither, you cannot export anything from your class from the DLL. Typically you would put your class into a header file and share the header file between both your DLL and the harness executable just like you will do with a C++ project having multiple source files.
I'm looking to implement a custom implementation of COM in C++ on a UNIX type platform to allow me to dynamically load and link object oriented code. I'm thinking this would be based on a similar set of functionality that POSIX provides to load and call dll's ie dlopen, dlsym and dlclose.
I understand that the general idea of COM is that you link to a few functions ie QueryInterface, AddRef and Release in a common dll (Kernel32.dll) which then allows you to access interfaces which are just a table of function pointers encapsulated with a pointer to the object for which the function pointers should be called with. These functions are exposed through IUnknown which you must inherit off of.
So how does this all work? Is there a better way to dynamically link and load to object oriented code? How does inheritance from a dll work - does every call to the base class have to be to an exposed member function i.e private/protected/public is simply ignored?
I'm quite well versed in C++ and template meta-programming and already have a fully reflective C++ system i.e member properties, member functions and global/static functions that uses boost.
A couple of things to keep in mind:
The power of COM comes largely from the IDL and the midl compiler. It allows a verry succint definition of the objects and interfaces with all the C/C++ boilerplate generated for you.
COM registration. On Windows the class IDs (CLSID) are recorded in the registry where they are associated with the executable. You must provide similar functionality in the UNIX environment.
The whole IUnknown implementation is fairly trivial, except for QueryInterface which works when implemented in C (i.e. no RTTI).
A whole another aspect of COM is IDispatch - i.e. late bound method invocation and discovery (read only reflection).
Have a look at XPCOM as it is a multi-platform COM like environment. This is really one of those things you are better off leveraging other technologies. It can suck up a lot of the time better spent elsewhere.
I'm looking to implement a custom implementation of COM in C++ on a UNIX type platform to allow me to dynamically load and link object oriented code. I'm thinking this would be based on a similar set of functionality that POSIX provides to load and call dll's ie dlopen, dlsym and dlclose.
At its simplest level, COM is implemented with interfaces. In c++, if you are comfortable with the idea of pure virtual, or abstract base classes, then you already know how to define an interface in c++
struct IMyInterface {
void Method1() =0;
void Method2() =0;
};
The COM runtime provides a lot of extra services that apply to the windows environment but arn't really needed when implementing "mini" COM in a single application as a means to dynamically link to a more OO interface than traditionally allowed by dlopen, dlsym, etc.
COM objects are implemented in .dll, .so or .dylib files depending on your platform. These files need to export at least one function that is standardized: DllGetClassObject
In your own environment you can prototype it however you want but to interop with the COM runtime on windows obviously the name and parameters need to conform to the com standard.
The basic idea is, this is passed a pointer to a GUID - 16 bytes that uniquely are assigned to a particular object, and it creates (based on the GUID) and returns the IClassFactory* of a factory object.
The factory object is then used, by the COM runtime, to create instances of the object when the IClassFactory::CreateInstance method is called.
So, so far you have
a dynamic library exporting at least one symbol, named "DllGetClassObject" (or some variant thereof)
A DllGetClassObject method that checks the passed in GUID to see if and which object is being requested, and then performs a "new CSomeObjectClassFactory"
A CSomeObjectClassFactory implementation that implements (derives from) IClassFactory, and implements the CreateInstance method to "new" instances of CSupportedObject.
CSomeSupportedObject that implements a custom, or COM defined interface that derives from IUnknown. This is important because IClassFactory::CreateInstance is passed an IID (again, a 16byte unique id defining an interface this time) that it will need to QueryInterface on the object for.
I understand that the general idea of COM is that you link to a few functions ie QueryInterface, AddRef and Release in a common dll (Kernel32.dll) which then allows you to access interfaces which are just a table of function pointers encapsulated with a pointer to the object for which the function pointers should be called with. These functions are exposed through IUnknown which you must inherit off of.
Actually, COM is implemented by OLE32.dll which exposes a "c" api called CoCreateInstance. The app passed CoCreateInstance a GUID, which it looks up in the windows registry - which has a DB of GUID -> "path to dll" mappings. OLE/COM then loads (dlopen) the dll, calls its DllGetClassObject (dlsym) method, passing in the GUID again, presuming that succeeds, OLE/COM then calls the CreateInstance and returns the resulting interface to app.
So how does this all work? Is there a better way to dynamically link and load to object oriented code? How does inheritance from a dll work - does every call to the base class have to be to an exposed member function i.e private/protected/public is simply ignored?
implicit inheritance of c++ code from a dll/so/dylib works by exporting every method in the class as a "decorated" symbol. The method name is decorated with the class, and type of every parameter. This is the same way the symbols are exported from static libraries (.a or .lib files iirc). Static or dynamic libraries, "private, protected etc." are always enforced by the compiler, parsing the header files, never the linker.
I'm quite well versed in C++ and template meta-programming and already have a fully reflective C++ system i.e member properties, member functions and global/static functions that uses boost.
c++ classes can typically only be exported from dlls with static linkage - dlls that are loaded at load, not via dlopen at runtime. COM allows c++ interfaces to be dynamically loaded by ensuring that all datatypes used in COM are either pod types, or are pure virtual interfaces. If you break this rule, by defining an interface that tries to pass a boost or any other type of object you will quickly get into a situation where the compiler/linker will need more than just the header file to figure out whats going on and your carefully prepared "com" dll will have to be statically or implicitly linked in order to function.
The other rule of COM is, never pass ownership of an object accross a dynamic library boundary. i.e. never return an interface or data from a dll, and require the app to delete it. Interfaces all need to implement IUnknown, or at least a Release() method, that allows the object to perform a delete this. Any returned data types likewise must have a well known de-allocator - if you have an interface with a method called "CreateBlob", there should probably be a buddy method called "DeleteBlob".
To really understand how COM works, I suggest reading "Essential COM" by Don Box.
Look at the CORBA documentation, at System.ComponentModel in the sscli, the XPCOM parts of the Mozilla codebase. Miguel de Icaza implemented something like OLE in GNOME called Bonobo which might be useful as well.
Depending on what you're doing with C++ though, you might want to look at plugin frameworks for C++ like Yehia. I believe Boost also has something similar.
Edit: pugg seems better maintained than Yehia at the moment. I have not tried it though.
The basic design of COM is pretty simple.
All COM objects expose their functionality through one or more interfaces
All interfaces are derived from the IUnknown interface, thus all interfaces have
QueryInterface, AddRef & Release methods as the first 3 methods of their virtual
function table in a known order
All objects implement IUnknown
Any interface that an object supports can be queried from any other interface.
Interfaces are identified by Globally Unique Identifiers, these are IIDs GUIDs or CLSIDs, but they are all really the same thing. http://en.wikipedia.org/wiki/Globally_Unique_Identifier
Where COM gets complex is in how it deals with allowing interfaces to be called from outside the process where the object resides. COM marshalling is a nasty, hairy, beast. Made even more so by the fact that COM supports both single threaded and multi-threaded programming models.
The Windows implementaion of COM allows objects to be registered (the original use of the Windows registry was for COM). At a minimum the COM registry contains the mapping between the unique GUID for a COM object, and the library (dll) that contains it's code.
For this to work. DLLs that implement COM objects must have a ClassFactory - an entry point in the DLL with a standard name that can be called to create one of the COM objects the DLL implements. (In practice, Windows COM gets an IClassFactory object from this entry point, and uses that to create other COM objects).
so that's the 10 cent tour, but to really understand this, you need to read Essential COM by Don Box.
You may be interested in the (not-yet)Boost.Extension library.
Application is written in delphi 2010 and the underlying dll is a C++ dll.
In ideal case, when your application is in C++; The dll makes a callback to an application when an event occurs. The callback is implemented through an interface. Application developers implements the abstract c++ class and pass the object to the dll. The dll will then make a callback to a member function of your implemented class. A classic callback pattern it is.
But how do I pass a delphi object to the dll for it to make a callback.
I wouldn't really call that ideal. It is selfish and short-sighted to make a DLL that requires its consumers to use the same compiler as the DLL used. (Class layout is implementation-defined, and since both modules need to have the same notion of what a class is, they need to use the same compiler.)
Now, that doesn't mean other consumers of the DLL can't fake it. It just won't be as easy for them as the DLL's designer intended.
When you say the callback is implemented through an interface, do you mean a COM-style interface, where the C++ class has nothing but pure virtual methods, including AddRef, Release, and QueryInterface, and they all use the stdcall calling convention? If that's the case, then you can simply write a Delphi class that implements the same interface. There are many examples of that in the Delphi source code and other literature.
If you mean you have a non-COM interface, where the C++ class has only pure virtual methods, but not the three COM functions, then you can write a Delphi class with the same layout. Duplicate the method order, and make sure all the methods are virtual. The Delphi VMT has the same layout as most C++ vtables on Windows implementations, at least as far as the function-pointer order is concerned. (The Delphi VMT has a lot of non-method data as well, but that doesn't interfere with the method addresses.) Just be sure you maintain clear ownership boundaries. The DLL must never attempt to destroy the object; it won't have a C++-callable destructor that the delete operator could invoke.
If you mean that you have an arbitrary C++ class that could include data members, constructors, or non-pure methods, then your task is considerably more difficult. Follow up if this is the case; otherwise, I'd rather not address it right now.
Overall, I'll echo Mason's advice that the DLL should use plain C-style callback functions. A good rule of thumb is that if you stick to techniques you see in the Windows API, you'll be OK. If you're not in control of how to interact with the DLL, then so be it. But if you can make the DLL's external interface more C-like, that would be best. And that doesn't mean you need to abandon the C++-style interface; you could provide two interfaces, where the C-style interface serves as a wrapper for your already-working C++style interface.
You can't pass a Delphi object to C++, at least not without a very good understanding of how the object model works at the binary level. If you need callbacks, do them using C types only and plain functions and procedures (no methods) and you should be fine.
This question is related to "How to make consistent dll binaries across VS versions ?"
We have applications and DLLs built
with VC6 and a new application built
with VC9. The VC9-app has to use
DLLs compiled with VC6, most of
which are written in C and one in
C++.
The C++ lib is problematic due to
name decoration/mangling issues.
Compiling everything with VC9 is
currently not an option as there
appear to be some side effects.
Resolving these would be quite time
consuming.
I can modify the C++ library, however it must be compiled with VC6.
The C++ lib is essentially an OO-wrapper for another C library. The VC9-app uses some static functions as well as some non-static.
While the static functions can be handled with something like
// Header file
class DLL_API Foo
{
int init();
}
extern "C"
{
int DLL_API Foo_init();
}
// Implementation file
int Foo_init()
{
return Foo::init();
}
it's not that easy with the non-static methods.
As I understand it, Chris Becke's suggestion of using a COM-like interface won't help me because the interface member names will still be decorated and thus inaccessible from a binary created with a different compiler. Am I right there?
Would the only solution be to write a C-style DLL interface using handlers to the objects or am I missing something?
In that case, I guess, I would probably have less effort with directly using the wrapped C-library.
The biggest problem to consider when using a DLL compiled with a different C++ compiler than the calling EXE is memory allocation and object lifetime.
I'm assuming that you can get past the name mangling (and calling convention), which isn't difficult if you use a compiler with compatible mangling (I think VC6 is broadly compatible with VS2008), or if you use extern "C".
Where you'll run into problems is when you allocate something using new (or malloc) from the DLL, and then you return this to the caller. The caller's delete (or free) will attempt to free the object from a different heap. This will go horribly wrong.
You can either do a COM-style IFoo::Release thing, or a MyDllFree() thing. Both of these, because they call back into the DLL, will use the correct implementation of delete (or free()), so they'll delete the correct object.
Or, you can make sure that you use LocalAlloc (for example), so that the EXE and the DLL are using the same heap.
Interface member names will not be decorated -- they're just offsets in a vtable. You can define an interface (using a C struct, rather than a COM "interface") in a header file, thusly:
struct IFoo {
int Init() = 0;
};
Then, you can export a function from the DLL, with no mangling:
class CFoo : public IFoo { /* ... */ };
extern "C" IFoo * __stdcall GetFoo() { return new CFoo(); }
This will work fine, provided that you're using a compiler that generates compatible vtables. Microsoft C++ has generated the same format vtable since (at least, I think) MSVC6.1 for DOS, where the vtable is a simple list of pointers to functions (with thunking in the multiple-inheritance case). GNU C++ (if I recall correctly) generates vtables with function pointers and relative offsets. These are not compatible with each other.
Well, I think Chris Becke's suggestion is just fine. I would not use Roger's first solution, which uses an interface in name only and, as he mentions, can run into problems of incompatible compiler-handling of abstract classes and virtual methods. Roger points to the attractive COM-consistent case in his follow-on.
The pain point: You need to learn to make COM interface requests and deal properly with IUnknown, relying on at least IUnknown:AddRef and IUnknown:Release. If the implementations of interfaces can support more than one interface or if methods can also return interfaces, you may also need to become comfortable with IUnknown:QueryInterface.
Here's the key idea. All of the programs that use the implementation of the interface (but don't implement it) use a common #include "*.h" file that defines the interface as a struct (C) or a C/C++ class (VC++) or struct (non VC++ but C++). The *.h file automatically adapts appropriately depending on whether you are compiling a C Language program or a C++ language program. You don't have to know about that part simply to use the *.h file. What the *.h file does is define the Interface struct or type, lets say, IFoo, with its virtual member functions (and only functions, no direct visibility to data members in this approach).
The header file is constructed to honor the COM binary standard in a way that works for C and that works for C++ regardless of the C++ compiler that is used. (The Java JNI folk figured this one out.) This means that it works between separately-compiled modules of any origin so long as a struct consisting entirely of function-entry pointers (a vtable) is mapped to memory the same by all of them (so they have to be all x86 32-bit, or all x64, for example).
In the DLL that implements the the COM interface via a wrapper class of some sort, you only need a factory entry point. Something like an
extern "C" HRESULT MkIFooImplementation(void **ppv);
which returns an HRESULT (you'll need to learn about those too) and will also return a *pv in a location you provide for receiving the IFoo interface pointer. (I am skimming and there are more careful details that you'll need here. Don't trust my syntax) The actual function stereotype that you use for this is also declared in the *.h file.
The point is that the factory entry, which is always an undecorated extern "C" does all of the necessary wrapper class creation and then delivers an Ifoo interface pointer to the location that you specify. This means that all memory management for creation of the class, and all memory management for finalizing it, etc., will happen in the DLL where you build the wrapper. This is the only place where you have to deal with those details.
When you get an OK result from the factory function, you have been issued an interface pointer and it has already been reserved for you (there is an implicit IFoo:Addref operation already performed on behalf of the interface pointer you were delivered).
When you are done with the interface, you release it with a call on the IFoo:Release method of the interface. It is the final release implementation (in case you made more AddRef'd copies) that will tear down the class and its interface support in the factory DLL. This is what gets you correct reliance on a consistent dynamic stoorage allocation and release behind the interface, whether or not the DLL containing the factory function uses the same libraries as the calling code.
You should probably implement IUnknown:QueryInterface (as method IFoo:QueryInterface) too, even if it always fails. If you want to be more sophisticated with using the COM binary interface model as you have more experience, you can learn to provide full QueryInterface implementations.
This is probably too much information, but I wanted to point out that a lot of the problems you are facing about heterogeneous implementations of DLLs are resolved in the definition of the COM binary interface and even if you don't need all of it, the fact that it provides worked solutions is valuable. In my experience, once you get the hang of this, you will never forget how powerful this can be in C++ and C++ interop situations.
I haven't sketched the resources you might need to consult for examples and what you have to learn in order to make *.h files and to actually implement factory-function wrappers of the libraries you want to share. If you want to dig deeper, holler.
There are other things you need to consider too, such as which run-times are being used by the various libraries. If no objects are being shared that's fine, but that seems quite unlikely at first glance.
Chris Becker's suggestions are pretty accurate - using an actual COM interface may help you get the binary compatibility you need. Your mileage may vary :)
not fun, man. you are in for a lot of frustration, you should probably give this:
Would the only solution be to write a
C-style DLL interface using handlers
to the objects or am I missing
something? In that case, I guess, I
would probably have less effort with
directly using the wrapped C-library.
a really close look. good luck.