COM exe, C++, and MinGW

COM exe, C++, and MinGW - c++

Have abit of an odd question; I'm using a tool supplied by a large company that, for reasons I find somewhat baffling, uses a COM interface defined inside the exe itself. In the example code they provide, it looks alittle like this.
#import "C:\\Path_To_Exe\\the.exe" rename_namespace ("exe_namespace");
From what I understand, this is the way Microsoft Visual C++ compiler understands the COM and works with it, and I have had the example code working before (currently, it doesn't compile due to fiddling with my build environment).
My question is, is there a way to do the same with MinGW? The project I'm working on is mainly using that; we can use MSVC if required, but I'd ideally like to avoid using multiple compilers if possible. I'm currently using cmake to build with, but I'm willing to use a script to build the items that need the COM interface if needed.
Thanks for your time.

The answer to "is there a way to do the same with MinGW" is no. #import is an optional tool that reads a COM type library (embedded in a binary or not, the TLB corresponds in general to an .idl file, but that also is optional), and generates C/C++ code that's heavily dependent on .c and .h files that only Visual Studio provides.
The answer to "can I do COM with MinGW" is of course yes. I don't know much about MinGW and tools, but you can do COM with any compiler since COM is (just) a binary standard.
If you get rid of #import, you'll have to change the code that uses what was generated (in the .TLH file resulting of the #import directive), COM helper, wrappers, etc. It can be a lot of work, but it's technically possible.
Now, in your context, I suppose it really depends how big the .exe's type library (the description of your COM classes, interfaces, etc.) is. Visual Studio's #import adds value, so you'll have to assess how much value it added for you.
If it's just one class, one interface for example, then it can be interesting to get rid of the #import. If the .exe already has .h files that correspond to the tlb, then you can use them, otherwise you'll have to redeclare some by yourself (and again, change the code that was using generated wrappers).
The sole fact that you ask the question makes me wonder if you have enough knowledge of COM (no offense :-) to get rid of Visual Studio.

The COM subsystem is part of the Windows API, and you can access it using C calls to that API.
However there is a huge amount of boilerplate involved in this. The compilers which support COM "out of the box" have written all this boilerplate, and packaged it up in some combination of compiled libraries, template headers, and so on.
Another part of the usual suite of tools offered by these compilers is one that can read COM interface definitions out of an existing compiled object. COM objects usually contain a binary representation of their interface, for this reason.
There are a few ways you could proceed here in order to use g++; one option is following this broad outline:
Use your MSVC installation to read the COM object and produce a C header file describing the interface.
Pick out the enumerations and GUIDs from that header file.
In g++, use the Windows API to invoke the object, using those enumerations and GUIDs.
If you want to author objects in g++ then there is a lot more work to do as you need to implement a bunch of things, but it is possible.
I have done this successfully in the past with g++ (as part of testing COM objects I'd developed). Probably somebody could develop a nice open-source suite for using COM objects, or even for authoring, that does not depend on MSVC but I'm not aware of such a thing.
I would recommend reading the books by Don Box, they fill in a lot of gaps in understanding that you will have if you've only learned about COM by working with it and reading the internet.

Related

Using Excel through vtable interface

I am learning COM programming via C++. As I understand, on the client side of dual interfaces you have two choices:
Acquire an IDispatch interface, query DISPIDs with GetIDsOfNames, and use Invoke to access methods and properties.
Include the .h header files with interface definitions and the .c source files with GUIDs created by MIDL in your project and call the functions directly through the vtable, which is known for the compiler from the .h files.
I would like to create a quite complex Excel Workbook from a C++ program (and I insist on using C++ instead of C# or anything else). Using the 1. way I was able to write a program which runs correctly. However, I have two problems: (A) the code is quite clumsy because of the calls to Invoke, (B) it is quite fast but I would like it to be even faster.
So I would like to try the 2. way. I am just missing the .h and .c files because unlike in the examples in the books I read, these files are not created by another example project but by Microsoft.
My questions are:
Where can I find these files?
How much performance improvement can I hope from way 2. compared to way 1.?

I recommend to not do it. And, I have my reasons...
For simple things, just spinning up excel.exe and marshaling data from one process to another process eats up most of the time. Those things are a magnitude greater than what you might gain in using C++ interfaces.
However, the big reason is this: Sometimes Office doesn't get installed or registered correctly on a client machine...for whatever reason. What happens sometimes is that the interfaces do not get registered correctly. If the interfaces do not get registered correctly, you will be pulling your hair out trying to figure out why your program is failing. Eventually you might figure it out. Then your only recourse is to tell your customer to re-install Office and hope it installs correctly, or to create a .reg file and have the customer apply the reg file to fix the interfaces if he has administrator privileges and you know which are the missing interfaces.
If you use IDispatch, it doesn't matter if the interfaces are missing. I've learned this the hard way with Word. You already have it working...
If you insist, then you can try:
#import "progid:Excel.Sheet" // plus a bunch of other options like rename() etc...

How to retrieve C++ class member variable names?

I have a C++ header file containing class definition, and I want to retrieve the types and names of its member variables.
Editors like Eclipse and Visual Studio do this to visualize the code, so I am interested if they also provide API in some (maybe native) scripting language or maybe in Java, which will allow to get member variable types and names as strings and, say, write them to a file. If not, then maybe there are utilities to dump the class description to some XML-like file?

IDEs basically use a C++ compiler to extract this information because parsing C++ is hard. Even worse, the C++ compiler has to be fault-tolerant if it should work while you're writing the code.
Visual Studio creates a SQL Server database in your project folder. That is the database used for code completion aka IntelliSense. Look for the file "projectname.sdf". You can write a Visual Studio add in that opens this database and accesses its members. I have done so.
But: There is absolutely no guarantee that the information in this database is complete and correct. On small projects it will probably work ok, on large projects with several 100k LOC the database will almost certainly not be 100% complete. If you want to generate code automatically based on this database, be very, very careful.
The Visual Studio compiler generates a debug symbols database that you can query after you have compiled your project. There is an example project on MSDN to do that.
Clang provides an API to do that. See libclang et al.
From that page:
Clang provides infrastructure to write tools that need syntactic and semantic information about a program. This document will give a short introduction of the different ways to write clang tools, and their pros and cons.
This is certainly a better option than 1) because you actually access the intermediate compiler syntax tree. Several people (e.g. Apple, Google) have written refactoring tools and syntax checkers based on Clang. More information here.
One big caveat: Clang is currently not able to parse many Windows headers because it does not support all Microsoft compiler options.
GCC can dump its syntax tree too.

What problems can appear when using G++ compiled DLL (plugin) in VC++ compiled application?

I use and Application compiled with the Visual C++ Compiler. It can load plugins in form of a .dll. It is rather unimportant what exactly it does, fact is:
This includes calling functions from the .dll that return a pointer to an object of the Applications API, etc.
My question is, what problems may appear when the Application calls a function from the .dll, retrieves a pointer from it and works with it. For example, something that comes into my mind, is the size of a pointer. Is it different in VC++ and G++? If yes, this would probably crash the Application?
I don't want to use the Visual Studio IDE (which is unfortunately the "preferred" way to use the Applications SDK). Can I configure G++ to compile like VC++?
PS: I use MINGW GNU G++

As long as both application and DLL are compiled on the same machine, and as long as they both only use the C ABI, you should be fine.
What you can certainly not do is share any sort of C++ construct. For example, you mustn't new[] an array in the main application and let the DLL delete[] it. That's because there is no fixed C++ ABI, and thus no way in which any given compiler knows how a different compiler implements C++ data structures. This is even true for different versions of MSVC++, which are not ABI-compatible.

All C++ language features are going to be entirely incompatible, I'm afraid. Everything from the name-mangling to memory allocation to the virtual-call mechanism are going to be completely different and not interoperable. The best you can hope for is a quick, merciful crash.
If your components only use extern "C" interfaces to talk to one another, you can probably make this work, although even there, you'll need to be careful. Both C++ runtimes will have startup and shutdown code, and there's no guarantee that whichever linker you use to assemble the application will know how to include this code for the other compiler. You pretty much must link g++-compiled code with g++, for example.
If you use C++ features with only one compiler, and use that compiler's linker, then it gets that much more likely to work.

This should be OK if you know what you are doing. But there's some things to watch out for:
I'm assuming the interface between EXE and DLL is a "C" interface or something COM like where the only C++ classes exposed are through pure-virutal interfaces. It gets messier if you are exporting a concrete class through a DLL.
32-bit vs. 64bit. The 32-bit app won't load a 64-bit DLL and vice-versa. Make sure they match.
Calling convention. __cdecl vs __stdcall. Often times Visual Studio apps are compiled with flags that assuming __stdcall as the default calling convention (or the function prototype explicitly says so). So make sure that the g++ compilers generates code that matches the calling type expected by the EXE. Otherwise, the exported function might run, but the stack can get trashed on return. If you debug through a crash like this, there's a good chance the cdecl vs stdcall convention was incorrectly specified. Easy to fix.
C-Runtimes will not likely be shared between the EXE and DLL, so don't mix and match. A pointer allocated with new or malloc in the EXE should not be released with delete or free in the DLL (and vice versa). Likewise, FILE handles returned by fopen() can not be shared between EXE and DLL. You'll likely crash if any of this happens.... which leads me to my next point....
C++ header files with inline code cause enough headaches and are the source of issues I called out in #3. You'll be OK if the interface between DLL And EXE is a pure "C" interface.
Name mangling issues. If you run into issues where the function name exported doesn't match because of name mangling or because of a leading underscore, you can fix that up in a .DEF file. At least that's what I've done in the past with Visual Studio. Not sure if the equivalent exists in g++/MinGW. Example below. Learn to use "dumpbin.exe /exports" to you can validate your DLL is exporting function with the right name. Using extern "C" will also help fix this as well.
EXPORTS
FooBar=_Foobar#12
BlahBlah=??BlahBlah##QAE#XZ #236 NONAME
Those are the issues that I know of. I can't tell you much more since you didn't explain the interface between the DLL and EXE.

The size of a pointer won't vary; that is dependent on the platform and module bitness and not the compiler (32-bit vs 64-bit and so on).
What can vary is the size of basically everything else, and what will vary are templates.
Padding and alignment of structs tends to be compiler-dependent, and often settings-within-compiler dependent. There are so loose rules, like pointers typically being on a platform-bitness-boundary and bools having 3 bytes after them, but it's up to the compiler how to handle that.
Templates, particularly from the STL (which is different for each compiler) may have different members, sizes, padding, and mostly anything. The only standard part is the API, the backend is left to the STL implementation (there are some rules, but compilers can still compile templates differently). Passing templates between modules from one build is bad enough, but between different compilers it can often be fatal.
Things which aren't standardized (name mangling) or are highly specific by necessity (memory allocation) will also be incompatible. You can get around both of those issues by only destroying from the library that creates (good practice anyway) and using STL objects that take a deleter, for allocation, and exporting using undecorated names and/or the C style (extern "C") for exported methods.
I also seem to remember a catch with how the compilers handle virtual destructors in the vtable, with some small difference.
If you can manage to only pass references of your own objects, avoid externally visible templates entirely, work primarily with pointers and exported or virtual methods, you can avoid a vast majority of the issues (COM does precisely this, for compatibility with most compilers and languages). It can be a pain to write, but if you need that compatibility, it is possible.
To alleviate some, but not all, of the issues, using an alternate to the STL (like Qt's core library) will remove that particular problem. While throwing Qt into any old project is a hideous waste and will cause more bloat than the "boost ALL THE THINGS!!!" philosophy, it can be useful for decoupling the library and the compiler to a greater extent than using a stock STL can.

You can't pass C runtime objects between them. For example you can not open a FILE buffer in one and pass it to be used in the other. You can't free memory allocated on the other side.

The main problems are the function signatures and way parameters are passed to library code. I've had great difficulty getting VC++ dll's to work in gnu based compilers in the past. This was way back when VC++ always cost money and mingw was the free solution.
My experience was with DirectX API's. Slowly a subset got it's binaries modified by enthusiasts but it was never as up-to-date or reliable so after evaluating it I switched to a proper cross platform API, that was SDL.
This wikipedia article describes the different ways libraries can be compiled and linked. It is rather more in depth than I am able to summarise here.

How should I integrate with and package this third-party library in a Win32 C++ app?

We have a (very large) existing codebase for a custom ActiveX control, and I'd like to integrate libkml into it for the sake of interacting with KML mapping data, rather than reinventing the wheel. The problem is, I'm a relatively new Windows developer, and coming from the Linux world, I'm really not sure what the right way of integrating a third party library is. Thankfully, libkml does provide MSVCC projects for compiling it, so porting isn't a problem. I guess I have a couple choices that I can think of:
Build and link the library directly. We already have a solution with project files in it for the "main" project; I could add the libkml projects to that solution, but I'd rather not. It's very unlikely that the libkml code will change in relation to our app's code.
Statically link to the .lib files produced by the libkml build. This is unattractive, since there are six .lib files that come out of the libkml solution and it seems inelegant to manually specify them in the linker options, etc.
Package the code as-is in a DLL. Maybe with COM? It seems like if I did this without any translation, I'd end up with a lot of overhead, and since I'm fairly unfamiliar with COM, I don't know how much work would be involved in exposing all the functionality I'd like to use via COM. The library is fairly big, has a lot of classes it uses, and if I had to manually write code to expose it all, I'd be hesitant to go this route.
Write wrapper code to to abstract the functionality I need, package that in a COM DLL, and interact with that. This seems sensible, I suppose, but it's difficult to determine how much abstraction I need since I haven't written the code that would use libkml yet.
Let me reiterate: I haven't yet written the code that will interact with libkml yet, so this is mostly experimental. Options 1 and 2 are also complicated by the fact that libkml relies additionally on three more external libraries that are also in .lib files (that I had to recompile anyways to get the code generation flags to line up). The goal obviously is to get the code to work, but maintainability and source tree organization are also goals, so I'm leaning towards options 3 and 4, but I don't know the best way to approach those on Windows.

Typing six file names, or using the declarative style with #pragma comment(lib, "foo.lib") is small potatoes compared to the work you'll have to do to turn this into a DLL or COM server.
The distribution is heavily biased towards using this as a static link library. There are only spotty declarations available to turn this into a DLL with __declspec(dllexport). They exist only in the 3rd party dependencies. All using different #defines of course, you'll by typing a bunch of names in the preprocessor definitions for the projects.
Furthermore, you'll have a hard time actually getting this DLL loaded at runtime since you are using it in a COM server. The search path for DLLs will be the client app's when COM creates your control instance, not likely to be anywhere near close to the place you deployed the DLL.
Making it a COM server is a lot of work, you'll have to write all the interface glue yourself. Again, nothing already in the source code that helps with this at all.

You can also wrap all the functionality you need in a non-COM-dll. Visual studio supports creating a static wrapper library which, when linked, will make your program use the dll. This way you only have one dependency to specify instead of six.
Other than that, what is wrong with specifying six dependencies. I would assume that there is a good reason that these are six separate libraries instead of one, so it is prudent to specify exactly which parts you actually use.

Maybe I'm missing something here, but I really don't see what is wrong with (1). I think that even if you had multiple projects that were using libkml, just insert the project file for libkml into your solution file, specify the dependencies, and you should be done. It's dead simple. Even solution (2) is dead simple. If the libraries ever change, you rebuild - you're going to need to do that anyway.
I'm failing to see how (3) or (4) are necessary or even desired. To me, it sounds like a lot of work for goals (source tree organization and maintainability) that I'm not even sure that those options really meet. In fact, you said yourself that "It's very unlikely that the libkml code will change in relation to our app's code."
What I've found over the years is to just keep things simple. If rebuilding KML is potentially time consuming, grab the libs and just statically link to the libraries. Yes, there are other dependencies, but you'll set this up once and be done, hopefully never to worry about it again. Otherwise, stick it in the project and move on. I think that it's worthwhile to ask whether spending a lot of time on this issue is worth the trouble.

DLLs and STLs and static data (oh my!)

OK..... I've done all the reading on related questions, and a few MSDN articles, and about a day's worth of googling.
What's the current "state of the art" answer to this question:
I'm using VS 2008, C++ unmanaged code. I have a solution file with quite a few DLLs and quite a few EXEs. As long as I completely control the build environment, such that all pieces and parts are built with the same flags, and use the same runtime libaries, and no one has a statically linked CRT library, am I ok to pass STL objects around?
It seems like this should be OK, but depending on which article you read, there's lots of Fear, Uncertainty, and Doubt.
I know there's all sorts of problems with templates that produce static data behind the scenes (every dll would get their own copy, leading to heartache), but what about regular old STL?

As long as they ALL use the exact same version of runtime DLLs, there should be no problem with STL. But once you happen to have several around, they will use for instance different heaps - leading to no end of troubles.

We successfully pass STL objects around in our application which is made up from dozens of DLLs. To ensure it works one of our automated tests that runs at every build is to verify the settings for all projects. If you add a new project and misconfigure it, or break the configuration of an existing project, the build fails.
The settings we check are as follows. Note not all of these will cause issues, but we check them for consistency.
#defines
_WIN32_WINNT
STRICT
_WIN32_IE
NDEBUG
_DEBUG
_SECURE_SCL
Compiler options
DebugInformationFormat
WholeProgramOptimization
RuntimeLibrary

We use stl collections in our application and pass them to and from methods in different dlls (usually as references). This doesn't cause any trouble.
The only area where we have had trouble is where one dll allocates memory and another dll tries to delete it. This only is reported as bad, but I am not sure why. However it only seems to be a problem on Debug builds (where it is reported), but still works on release builds. Having said that where ever I come across this I do fix it.
If I was writing a 3rd party library I would think twice about using stl parameters in the api. Previously (VC6) we had to use the OCI (Oracles C api) as opposed to OCCI (Oracles C++ api) because it only worked with the Microsoft STL implementation and we were using stlport. Of course if you enable your clients to build the library with their own stl implementation this is not an issue.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js