How To Make A Suitable .DLL for type-library conversion

How To Make A Suitable .DLL for type-library conversion - c++

I have incomplete types from my COM type library .tlb (which was generated by regasm OF .NET v4+) I am unsure if this is because cppc.exe derived the wrong header for it or if this is merely a symptom of it.
I know it's incomplete because all the generated classes come out a. as structs and b. with empty interface bodies (no methods defined).
I'm sure I've done something wrong when making my .dll suitable for regasm; I've changed the Assembly-Info and I've made default constructors for-all types. Was I meant to do something else maybe?

If you mark your assembly as COM-visible, all public classes with a default constructor are exposed with the help of a dispinterface. This interface type is designed for late bound clients, such as scripting languages, which invoke methods and properties via Dispatch identifiers (DISPIDs). An accompanying type library won't contain any entries for the interface.
In order to bind to a fixed layout, you’ll need to switch to a dual or IUnknown interface. Simply edit your AssemblyInfo.cs as follows:
// Setting ComVisible to false makes the types in this assembly not visible
// to COM components. If you need to access a type in this assembly from
// COM, set the ComVisible attribute to true on that type.
[assembly: ComVisible(true)]
[assembly: ClassInterface(ClassInterfaceType.AutoDual)]
Now, the methods of the interface are visible in the type library, as well as in IntelliSense.
Note however that with an AutoDual exposed class, you'll need to need to take care when updating the assembly. See this link for further info.

Related

Why should I use DECLARE_DYNAMIC instead of DECLARE_DYNCREATE?

DECLARE_DYNCREATE provides exactly the same feature of DECLARE_DYNAMIC along with its dynamic object creation ability. Then why should anyone use DECLARE_DYNAMIC instead of DECLARE_DYNCREATE?

The macros are documented to provide different functionality.
DECLARE_DYNAMIC:
Adds the ability to access run-time information about an object's class when deriving a class from CObject.
This provides the functionality for introspection, similar to RTTI (Run-Time Type Information) provided by C++. An application can query a CObject-derived class instance for its run-time type through the associated CRuntimeClass Structure. It is useful in situations where you need to check that an object is of a particular type, or has a specific base class type. The examples at CObject::IsKindOf should give you a good idea.
DECLARE_DYNCREATE:
Enables objects of CObject-derived classes to be created dynamically at run time.
This macro enables dynamic creation of class instances at run-time. The functionality is provided through the class factory method CRuntimeClass::CreateObject. It can be used when you need to create class instances at run-time based on the class type's string representation. An example would be a customizable GUI, that is built from an initialization file.
Both features are implemented through the same CRuntimeClass Structure, which may lead to the conclusion that they can be used interchangeably. In fact, code that uses an inappropriate macro will compile just fine, and expose the desired run-time behavior. The difference is purely semantic: The macros convey different intentions, and should be used according to the desired features, to communicate developer intent.
There's also a third related macro, DECLARE_SERIAL:
Generates the C++ header code necessary for a CObject-derived class that can be serialized.
It enables serialization of respective CObject-derived class instances, for example to a file, memory stream, or network socket. Since the deserialization process requires dynamic creation of objects from the serialized stream, it includes the functionality of DECLARE_DYNCREATE.
Put together, the following list should help you pick the right macro for your specific scenarios:
Use DECLARE_DYNAMIC if your code needs to retrieve an object's run-time type.
Use DECLARE_DYNCREATE if, in addition, you need to dynamically create class instances based on the type's string representation.
Use DECLARE_SERIAL if, in addition, you need to provide serialization support.

You're asking "why buy a Phillips screwdriver when I own a flathead?" The answer is that you should use the tool that suits your needs: if you need to drive only flathead screws, don't buy a Phillips driver. Otherwise, buy one.
If you need the features provided by DECLARE_DYNCREATE (e.g. because you're creating a view that's auto-created by the framework when a document is opened) then you should use DECLARE_DYNCREATE and if you don't and DECLARE_DYNAMIC works, you should use it.

Late Binding COM objects with C++Builder

We're interfacing to some 3rd party COM objects from a C++Builder 2010 application.
Currently we import the type library and generate component wrappers, and then are able to make method calls and access properties in a fairly natural way.
object->myProperty = 42;
object->doSomething(666);
However, we've been bitten by changes to the COM object's interface (which is still being extended and developed) causing our own app to fail because some method GUIDs seem to get invalidated - even if the only change to the interface has been the addition of a new method).
Late Binding has been suggested as a way of addressing this. I think this requires our code to be changed rather like this:
object.OlePropertySet("myProperty", 42);
object.OlePrcedure("doSomething", 666);
Obviously this is painful to read and write, so we'd have to write wrapper classes instead.
Is there any way of getting late binding wrappers generated automatically when we import the type library? And, if so, are they smart enough to only do the textual binding once when the object is created, rather than on every single method call?

When you import a TypeLibrary for a COM object that supports late-binding (when it implements the IDispatch interface), the importer can generate separate wrapper classes (not components) for both static-binding and late-binding.
Adding a new method to an existing interface should not invalidate your code. Methods do not have GUIDs. However, for an IDispatch-based interface, its methods do have DISPID values associated with them, and those DISPID values can be changed from one release to another. Though any respectable COM developer should never do that once an interface definition has been locked in.

After deep investigation of the code and headers generated by the TLIBIMP, this turns out to be fairly easy.
If your Type Library has a class Foo, then after importing the type library, you would typically use the auto-generated smart pointer classes IFooPtr.
{
IFooPtr f;
...
f->myMethod(1,2);
}
You should note that at this point that the bindings are static - that is, they depend not just on the GUIDs of the objects and the DISPIDs of the methods, but on the exact layout of the VTable in the DLL. Any changes that affect the vtable - for instance, adding an additional method to a base class of Foo will cause the method call to fail.
To use dynamic bindings, you can use the IFooDisp classes instead of IFooPtr. Again, these are smart wrappers, handling object lifetimes automatically. Note that with these classes you should use the . operator to access methods, not the indirection -> operator. Using the indirection operator will call the method, but via a static binding.
{
IFooDisp f;
...
f.myMethod(1,2);
}
By using these IDispatch-based wrappers, methods will be dispatched by their DISPIDs, even if the objects vtable layout is changed. I think these classes also give a way to dispatch by function name rather than DISPID, but haven't confirmed the details of that.

Module and classes handling (dynamic linking)

Run into a bit of an issue, and I'm looking for the best solution concept/theory.
I have a system that needs to use objects. Each object that the system uses has a known interface, likely implemented as an abstract class. The interfaces are known at build time, and will not change. The exact implementation to be used will vary and I have no idea ahead of time what module will be providing it. The only guarantee is that they will provide the interface. The class name and module (DLL) come from a config file or may be changed programmatically.
Now, I have all that set up at the moment using a relatively simple system, set up something like so (rewritten pseudo-code, just to show the basics):
struct ClassID
{
Module * module;
int number;
};
class Module
{
HMODULE module;
function<void * (int)> * createfunc;
static Module * Load(String filename);
IObject * CreateClass(int number)
{
return createfunc(number);
}
};
class ModuleManager
{
bool LoadModule(String filename);
IObject * CreateClass(String classname)
{
ClassID class = AvailableClasses.find(classname);
return class.module->CreateObject(class.number);
}
vector<Module*> LoadedModules;
map<String, ClassID> AvailableClasses;
};
Modules have a few exported functions to give the number of classes they provide and the names/IDs of those, which are then stored. All classes derive from IObject, which has a virtual destructor, stores the source module and has some methods to get the class' ID, what interface it implements and such.
The only issue with this is each module has to be manually loaded somewhere (listed in the config file, at the moment). I would like to avoid doing this explicitly (outside of the ModuleManager, inside that I'm not really concerned as to how it's implemented).
I would like to have a similar system without having to handle loading the modules, just create an object and (once it's all set up) it magically appears.
I believe this is similar to what COM is intended to do, in some ways. I looked into the COM system briefly, but it appears to be overkill beyond belief. I only need the classes known within my system and don't need all the other features it handles, just implementations of interfaces coming from somewhere.
My other idea is to use the registry and keep a key with all the known/registered classes and their source modules and numbers, so I can just look them up and it will appear that Manager::CreateClass finds and makes the object magically. This seems like a viable solution, but I'm not sure if it's optimal or if I'm reinventing something.
So, after all that, my question is: How to handle this? Is there an existing technology, if not, how best to set it up myself? Are there any gotchas that I should be looking out for?

COM very likely is what you want. It is very broad but you don't need to use all the functionality. For example, you don't need to require participants to register GUIDs, you can define your own mechanism for creating instances of interfaces. There are a number of templates and other mechanisms to make it easy to create COM interfaces. What's more, since it is a standard, it is easy to document the requirements.
One very important thing to bear in mind is that importing/exporting C++ objects requires all participants to be using the same compiler. If you think that ever could be a problem to you then you should use COM. If you are happy to accept that restriction then you can carry on as you are.

I don't know if any technology exists to do this.
I do know that I worked with a system very similar to this. We used XML files to describe the various classes that different modules made available. Our equivalent of ModuleManager would parse the xml files to determine what to create for the user at run time based on the class name they provided and the configuration of the system. (Requesting an object that implemented interface 'I' could give back any of objects 'A', 'B' or 'C' depending on how the system was configured.)
The big gotcha we found was that the system was very brittle and at times hard to debug/understand. Just reading through the code, it was often near impossible to see what concrete class was being instantiated. We also found that maintaining the XML created more bugs and overhead than expected.
If I was to do this again, I would keep the design pattern of exposing classes from DLL's through interfaces, but I would not try to build a central registry of classes, nor would I derive everything from a base class such as IObject.
I would instead make each module responsible for exposing its own factory functions(s) to instantiate objects.

Custom COM Implementation?

I'm looking to implement a custom implementation of COM in C++ on a UNIX type platform to allow me to dynamically load and link object oriented code. I'm thinking this would be based on a similar set of functionality that POSIX provides to load and call dll's ie dlopen, dlsym and dlclose.
I understand that the general idea of COM is that you link to a few functions ie QueryInterface, AddRef and Release in a common dll (Kernel32.dll) which then allows you to access interfaces which are just a table of function pointers encapsulated with a pointer to the object for which the function pointers should be called with. These functions are exposed through IUnknown which you must inherit off of.
So how does this all work? Is there a better way to dynamically link and load to object oriented code? How does inheritance from a dll work - does every call to the base class have to be to an exposed member function i.e private/protected/public is simply ignored?
I'm quite well versed in C++ and template meta-programming and already have a fully reflective C++ system i.e member properties, member functions and global/static functions that uses boost.

A couple of things to keep in mind:
The power of COM comes largely from the IDL and the midl compiler. It allows a verry succint definition of the objects and interfaces with all the C/C++ boilerplate generated for you.
COM registration. On Windows the class IDs (CLSID) are recorded in the registry where they are associated with the executable. You must provide similar functionality in the UNIX environment.
The whole IUnknown implementation is fairly trivial, except for QueryInterface which works when implemented in C (i.e. no RTTI).
A whole another aspect of COM is IDispatch - i.e. late bound method invocation and discovery (read only reflection).
Have a look at XPCOM as it is a multi-platform COM like environment. This is really one of those things you are better off leveraging other technologies. It can suck up a lot of the time better spent elsewhere.

I'm looking to implement a custom implementation of COM in C++ on a UNIX type platform to allow me to dynamically load and link object oriented code. I'm thinking this would be based on a similar set of functionality that POSIX provides to load and call dll's ie dlopen, dlsym and dlclose.
At its simplest level, COM is implemented with interfaces. In c++, if you are comfortable with the idea of pure virtual, or abstract base classes, then you already know how to define an interface in c++
struct IMyInterface {
void Method1() =0;
void Method2() =0;
};
The COM runtime provides a lot of extra services that apply to the windows environment but arn't really needed when implementing "mini" COM in a single application as a means to dynamically link to a more OO interface than traditionally allowed by dlopen, dlsym, etc.
COM objects are implemented in .dll, .so or .dylib files depending on your platform. These files need to export at least one function that is standardized: DllGetClassObject
In your own environment you can prototype it however you want but to interop with the COM runtime on windows obviously the name and parameters need to conform to the com standard.
The basic idea is, this is passed a pointer to a GUID - 16 bytes that uniquely are assigned to a particular object, and it creates (based on the GUID) and returns the IClassFactory* of a factory object.
The factory object is then used, by the COM runtime, to create instances of the object when the IClassFactory::CreateInstance method is called.
So, so far you have
a dynamic library exporting at least one symbol, named "DllGetClassObject" (or some variant thereof)
A DllGetClassObject method that checks the passed in GUID to see if and which object is being requested, and then performs a "new CSomeObjectClassFactory"
A CSomeObjectClassFactory implementation that implements (derives from) IClassFactory, and implements the CreateInstance method to "new" instances of CSupportedObject.
CSomeSupportedObject that implements a custom, or COM defined interface that derives from IUnknown. This is important because IClassFactory::CreateInstance is passed an IID (again, a 16byte unique id defining an interface this time) that it will need to QueryInterface on the object for.
I understand that the general idea of COM is that you link to a few functions ie QueryInterface, AddRef and Release in a common dll (Kernel32.dll) which then allows you to access interfaces which are just a table of function pointers encapsulated with a pointer to the object for which the function pointers should be called with. These functions are exposed through IUnknown which you must inherit off of.
Actually, COM is implemented by OLE32.dll which exposes a "c" api called CoCreateInstance. The app passed CoCreateInstance a GUID, which it looks up in the windows registry - which has a DB of GUID -> "path to dll" mappings. OLE/COM then loads (dlopen) the dll, calls its DllGetClassObject (dlsym) method, passing in the GUID again, presuming that succeeds, OLE/COM then calls the CreateInstance and returns the resulting interface to app.
So how does this all work? Is there a better way to dynamically link and load to object oriented code? How does inheritance from a dll work - does every call to the base class have to be to an exposed member function i.e private/protected/public is simply ignored?
implicit inheritance of c++ code from a dll/so/dylib works by exporting every method in the class as a "decorated" symbol. The method name is decorated with the class, and type of every parameter. This is the same way the symbols are exported from static libraries (.a or .lib files iirc). Static or dynamic libraries, "private, protected etc." are always enforced by the compiler, parsing the header files, never the linker.
I'm quite well versed in C++ and template meta-programming and already have a fully reflective C++ system i.e member properties, member functions and global/static functions that uses boost.
c++ classes can typically only be exported from dlls with static linkage - dlls that are loaded at load, not via dlopen at runtime. COM allows c++ interfaces to be dynamically loaded by ensuring that all datatypes used in COM are either pod types, or are pure virtual interfaces. If you break this rule, by defining an interface that tries to pass a boost or any other type of object you will quickly get into a situation where the compiler/linker will need more than just the header file to figure out whats going on and your carefully prepared "com" dll will have to be statically or implicitly linked in order to function.
The other rule of COM is, never pass ownership of an object accross a dynamic library boundary. i.e. never return an interface or data from a dll, and require the app to delete it. Interfaces all need to implement IUnknown, or at least a Release() method, that allows the object to perform a delete this. Any returned data types likewise must have a well known de-allocator - if you have an interface with a method called "CreateBlob", there should probably be a buddy method called "DeleteBlob".

To really understand how COM works, I suggest reading "Essential COM" by Don Box.

Look at the CORBA documentation, at System.ComponentModel in the sscli, the XPCOM parts of the Mozilla codebase. Miguel de Icaza implemented something like OLE in GNOME called Bonobo which might be useful as well.
Depending on what you're doing with C++ though, you might want to look at plugin frameworks for C++ like Yehia. I believe Boost also has something similar.
Edit: pugg seems better maintained than Yehia at the moment. I have not tried it though.

The basic design of COM is pretty simple.
All COM objects expose their functionality through one or more interfaces
All interfaces are derived from the IUnknown interface, thus all interfaces have
QueryInterface, AddRef & Release methods as the first 3 methods of their virtual
function table in a known order
All objects implement IUnknown
Any interface that an object supports can be queried from any other interface.
Interfaces are identified by Globally Unique Identifiers, these are IIDs GUIDs or CLSIDs, but they are all really the same thing. http://en.wikipedia.org/wiki/Globally_Unique_Identifier
Where COM gets complex is in how it deals with allowing interfaces to be called from outside the process where the object resides. COM marshalling is a nasty, hairy, beast. Made even more so by the fact that COM supports both single threaded and multi-threaded programming models.
The Windows implementaion of COM allows objects to be registered (the original use of the Windows registry was for COM). At a minimum the COM registry contains the mapping between the unique GUID for a COM object, and the library (dll) that contains it's code.
For this to work. DLLs that implement COM objects must have a ClassFactory - an entry point in the DLL with a standard name that can be called to create one of the COM objects the DLL implements. (In practice, Windows COM gets an IClassFactory object from this entry point, and uses that to create other COM objects).
so that's the 10 cent tour, but to really understand this, you need to read Essential COM by Don Box.

You may be interested in the (not-yet)Boost.Extension library.

Using C++ DLLs with different compiler versions

This question is related to "How to make consistent dll binaries across VS versions ?"
We have applications and DLLs built
with VC6 and a new application built
with VC9. The VC9-app has to use
DLLs compiled with VC6, most of
which are written in C and one in
C++.
The C++ lib is problematic due to
name decoration/mangling issues.
Compiling everything with VC9 is
currently not an option as there
appear to be some side effects.
Resolving these would be quite time
consuming.
I can modify the C++ library, however it must be compiled with VC6.
The C++ lib is essentially an OO-wrapper for another C library. The VC9-app uses some static functions as well as some non-static.
While the static functions can be handled with something like
// Header file
class DLL_API Foo
{
int init();
}
extern "C"
{
int DLL_API Foo_init();
}
// Implementation file
int Foo_init()
{
return Foo::init();
}
it's not that easy with the non-static methods.
As I understand it, Chris Becke's suggestion of using a COM-like interface won't help me because the interface member names will still be decorated and thus inaccessible from a binary created with a different compiler. Am I right there?
Would the only solution be to write a C-style DLL interface using handlers to the objects or am I missing something?
In that case, I guess, I would probably have less effort with directly using the wrapped C-library.

The biggest problem to consider when using a DLL compiled with a different C++ compiler than the calling EXE is memory allocation and object lifetime.
I'm assuming that you can get past the name mangling (and calling convention), which isn't difficult if you use a compiler with compatible mangling (I think VC6 is broadly compatible with VS2008), or if you use extern "C".
Where you'll run into problems is when you allocate something using new (or malloc) from the DLL, and then you return this to the caller. The caller's delete (or free) will attempt to free the object from a different heap. This will go horribly wrong.
You can either do a COM-style IFoo::Release thing, or a MyDllFree() thing. Both of these, because they call back into the DLL, will use the correct implementation of delete (or free()), so they'll delete the correct object.
Or, you can make sure that you use LocalAlloc (for example), so that the EXE and the DLL are using the same heap.

Interface member names will not be decorated -- they're just offsets in a vtable. You can define an interface (using a C struct, rather than a COM "interface") in a header file, thusly:
struct IFoo {
int Init() = 0;
};
Then, you can export a function from the DLL, with no mangling:
class CFoo : public IFoo { /* ... */ };
extern "C" IFoo * __stdcall GetFoo() { return new CFoo(); }
This will work fine, provided that you're using a compiler that generates compatible vtables. Microsoft C++ has generated the same format vtable since (at least, I think) MSVC6.1 for DOS, where the vtable is a simple list of pointers to functions (with thunking in the multiple-inheritance case). GNU C++ (if I recall correctly) generates vtables with function pointers and relative offsets. These are not compatible with each other.

Well, I think Chris Becke's suggestion is just fine. I would not use Roger's first solution, which uses an interface in name only and, as he mentions, can run into problems of incompatible compiler-handling of abstract classes and virtual methods. Roger points to the attractive COM-consistent case in his follow-on.
The pain point: You need to learn to make COM interface requests and deal properly with IUnknown, relying on at least IUnknown:AddRef and IUnknown:Release. If the implementations of interfaces can support more than one interface or if methods can also return interfaces, you may also need to become comfortable with IUnknown:QueryInterface.
Here's the key idea. All of the programs that use the implementation of the interface (but don't implement it) use a common #include "*.h" file that defines the interface as a struct (C) or a C/C++ class (VC++) or struct (non VC++ but C++). The *.h file automatically adapts appropriately depending on whether you are compiling a C Language program or a C++ language program. You don't have to know about that part simply to use the *.h file. What the *.h file does is define the Interface struct or type, lets say, IFoo, with its virtual member functions (and only functions, no direct visibility to data members in this approach).
The header file is constructed to honor the COM binary standard in a way that works for C and that works for C++ regardless of the C++ compiler that is used. (The Java JNI folk figured this one out.) This means that it works between separately-compiled modules of any origin so long as a struct consisting entirely of function-entry pointers (a vtable) is mapped to memory the same by all of them (so they have to be all x86 32-bit, or all x64, for example).
In the DLL that implements the the COM interface via a wrapper class of some sort, you only need a factory entry point. Something like an
extern "C" HRESULT MkIFooImplementation(void **ppv);
which returns an HRESULT (you'll need to learn about those too) and will also return a *pv in a location you provide for receiving the IFoo interface pointer. (I am skimming and there are more careful details that you'll need here. Don't trust my syntax) The actual function stereotype that you use for this is also declared in the *.h file.
The point is that the factory entry, which is always an undecorated extern "C" does all of the necessary wrapper class creation and then delivers an Ifoo interface pointer to the location that you specify. This means that all memory management for creation of the class, and all memory management for finalizing it, etc., will happen in the DLL where you build the wrapper. This is the only place where you have to deal with those details.
When you get an OK result from the factory function, you have been issued an interface pointer and it has already been reserved for you (there is an implicit IFoo:Addref operation already performed on behalf of the interface pointer you were delivered).
When you are done with the interface, you release it with a call on the IFoo:Release method of the interface. It is the final release implementation (in case you made more AddRef'd copies) that will tear down the class and its interface support in the factory DLL. This is what gets you correct reliance on a consistent dynamic stoorage allocation and release behind the interface, whether or not the DLL containing the factory function uses the same libraries as the calling code.
You should probably implement IUnknown:QueryInterface (as method IFoo:QueryInterface) too, even if it always fails. If you want to be more sophisticated with using the COM binary interface model as you have more experience, you can learn to provide full QueryInterface implementations.
This is probably too much information, but I wanted to point out that a lot of the problems you are facing about heterogeneous implementations of DLLs are resolved in the definition of the COM binary interface and even if you don't need all of it, the fact that it provides worked solutions is valuable. In my experience, once you get the hang of this, you will never forget how powerful this can be in C++ and C++ interop situations.
I haven't sketched the resources you might need to consult for examples and what you have to learn in order to make *.h files and to actually implement factory-function wrappers of the libraries you want to share. If you want to dig deeper, holler.

There are other things you need to consider too, such as which run-times are being used by the various libraries. If no objects are being shared that's fine, but that seems quite unlikely at first glance.
Chris Becker's suggestions are pretty accurate - using an actual COM interface may help you get the binary compatibility you need. Your mileage may vary :)

not fun, man. you are in for a lot of frustration, you should probably give this:
Would the only solution be to write a
C-style DLL interface using handlers
to the objects or am I missing
something? In that case, I guess, I
would probably have less effort with
directly using the wrapped C-library.
a really close look. good luck.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js