Using new Windows features with fallback - C++

I've been using dynamic libraries and GetProcAddress for quite some time, but it always seems a tedious, IntelliSense-hostile, and ugly way to do things.
Does anyone know a clean way to import new features while staying compatible with older OSes?
Say I want to use an XML library which is part of Vista. I call LoadLibraryW, and if the returned HMODULE is non-null I can use the functions.
But I really don't want to go the typedef void* (*PFNFOOOBAR)(int, int, int); and PFNFOOOBAR foo = reinterpret_cast<PFNFOOOBAR>(GetProcAddress(GetModuleHandleW(L"coolxml.dll"), "somecoolfunction")); route, all that times 50.
Is there a non-hackish solution with which I could avoid this mess?
I was thinking of adding coolxml.lib in the project settings, then putting coolxml.dll in the delay-load DLL list and, maybe, copying the few function signatures I will use into the file that needs them. Then I'd check the LoadLibraryW return value, and if it's non-null, branch into the Vista code path like in regular program flow.
But I'm not sure whether LoadLibrary and delay-loading can work together, and whether branch prediction will mess things up in some cases.
I'm also not sure whether this approach will work at all, and whether it will cause problems after upgrading to the next SDK.

IMO, LoadLibrary and GetProcAddress are the best way to do it.
(Make some wrapper objects which take care of that for you, so you don't pollute your main code with that logic and ugliness.)
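For example, something along these lines (just a rough sketch; the DLL name, the function name, and its signature are invented):

#include <windows.h>

class CoolXml
{
public:
    typedef int (WINAPI *PFNSOMECOOLFUNCTION)(int, int, int);

    CoolXml()
        : m_dll(::LoadLibraryW(L"coolxml.dll"))
        , m_someCoolFunction(NULL)
    {
        if (m_dll)
        {
            m_someCoolFunction = reinterpret_cast<PFNSOMECOOLFUNCTION>(
                ::GetProcAddress(m_dll, "SomeCoolFunction"));
        }
    }

    ~CoolXml() { if (m_dll) ::FreeLibrary(m_dll); }

    // Callers just ask "is the Vista functionality there?" and branch on that.
    bool Available() const { return m_someCoolFunction != NULL; }

    int SomeCoolFunction(int a, int b, int c)
    {
        return m_someCoolFunction(a, b, c);   // only call if Available() returned true
    }

private:
    HMODULE m_dll;
    PFNSOMECOOLFUNCTION m_someCoolFunction;
};

All the LoadLibrary/GetProcAddress ugliness stays in this one class, and the rest of the code just checks Available().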
DelayLoad brings with it security problems (see this OldNewThing post) (edit: though not if you ensure you never call those APIs on older versions of Windows).
DelayLoad also makes it too easy to accidentally depend on an API which won't be available on all targets. Yes, you can use tools to check which APIs you call at runtime but it's better to deal with these things at compile time, IMO, and those tools can only check the code you actually exercise when running under them.
Also, avoid compiling different parts of your code against different Windows header versions, unless you are very careful to segregate that code and the objects that are passed to/from it.
It's not absolutely wrong -- and it's completely normal with things like plug-in DLLs where two entirely different teams probably worked on the two modules without knowing what SDK version each other targeted -- but it can lead to difficult problems if you aren't careful, so it's best avoided in general.
If you mix header versions you can get very strange errors. For example, we had a static object which contained an OS structure which changed size in Vista. Most of our project was compiled for XP, but we added a new .cpp file whose name happened to start with A and which was set to use the Vista headers. That new file then (arbitrarily) became the one which triggered the static object to be allocated, using the Vista structure sizes, but the actual code for that object was built using the XP structures. The constructor thought the object's members were in different places from the code which allocated the object. Strange things resulted!
Once we got to the bottom of that we banned the practice entirely; everything in our project uses the XP headers, and if we need anything from the newer headers we manually copy it out, renaming the structures if needed.
It is very tedious to write all the typedefs and GetProcAddress calls, and to copy structures and defines out of headers (which feels wrong, but they're a binary interface so they're not going to change) (don't forget to check for #pragma pack differences, too :(), but IMO that is the best way if you want the best compile-time notification of issues.
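For instance, a copied-out structure ends up looking something like this (the struct and its members are invented purely to illustrate the renaming and the explicit packing; match whatever #pragma pack value the real header uses):

#include <windows.h>

#pragma pack(push, 8)            // copy the packing value from the SDK header it came from
struct MY_COOLXML_OPTIONS_COPY   // renamed so it can never clash with the real one
{
    DWORD        cbSize;         // set to sizeof(MY_COOLXML_OPTIONS_COPY) before use
    DWORD        dwFlags;
    const WCHAR* pszEncoding;
};
#pragma pack(pop)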
I'm sure others will disagree!
PS: Somewhere I've got a little template I made to make the GetProcAddress stuff slightly less tedious... Trying to find it; will update this when/if I do. Found it, but it wasn't actually that useful. In fact, none of my code even used it. :)

Yes, use delay loading. That leaves the ugliness to the compiler. Of course you'll still have to ensure that you're not calling a Vista function on XP.
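For example, a minimal sketch of that guard (the Vista-only function and its DLL are hypothetical, and assumed to be on the delay-load list):

#include <windows.h>

// Hypothetical Vista-only API, declared in its SDK header and resolved by the
// delay-load helper the first time it is called:
// int WINAPI CoolXmlParse(const wchar_t* xml);

static bool IsVistaOrLater()
{
    OSVERSIONINFOW vi = { sizeof(vi) };
    return ::GetVersionExW(&vi) && vi.dwMajorVersion >= 6;   // 6.0 == Vista
}

void ParseDocument(const wchar_t* xml)
{
    (void)xml;
    if (IsVistaOrLater())
    {
        // CoolXmlParse(xml);   // safe: only reached on Vista+, so the delay-load
        //                      // helper can actually find coolxml.dll
    }
    else
    {
        // XP fallback path
    }
}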

Delay loading is the best way to avoid using LoadLibrary() and GetProcAddress() directly. Regarding the security issues mentioned, about the only thing you can do is use the delay-load hooks to make sure (and optionally force) that the desired DLL is loaded from the correct system path during the dliNotePreLoadLibrary notification, and not relative to your app folder. Using the callbacks also lets you substitute your own fallback implementations in the dliFailLoadLib/dliFailGetProc notifications when the desired API function(s) are not available. That way, the rest of your code does not have to worry about platform differences (or only very little).
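A rough sketch of what such a hook can look like (the DLL name and the fallback function are invented; this assumes coolxml.dll is on the /DELAYLOAD list with delayimp.lib linked in, and that the CRT's hook pointers are the older, writable kind -- newer CRTs declare them const):

#include <windows.h>
#include <delayimp.h>
#include <string.h>
#include <wchar.h>

static int WINAPI FallbackCoolFunction(int, int, int) { return -1; }   // hypothetical stand-in

static FARPROC WINAPI MyDelayHook(unsigned dliNotify, PDelayLoadInfo pdli)
{
    switch (dliNotify)
    {
    case dliNotePreLoadLibrary:
        // Force the DLL to load from the system directory, not relative to the app.
        if (_stricmp(pdli->szDll, "coolxml.dll") == 0)
        {
            wchar_t path[MAX_PATH];
            UINT len = ::GetSystemDirectoryW(path, MAX_PATH);
            if (len > 0 && len < MAX_PATH)
            {
                ::wcscat_s(path, MAX_PATH, L"\\coolxml.dll");
                return reinterpret_cast<FARPROC>(::LoadLibraryW(path));
            }
        }
        break;

    case dliFailGetProc:
        // The API isn't there on this OS: substitute our own implementation.
        return reinterpret_cast<FARPROC>(&FallbackCoolFunction);
    }
    return NULL;   // anything else: let the delay-load helper do its normal thing
}

PfnDliHook __pfnDliNotifyHook2  = MyDelayHook;
PfnDliHook __pfnDliFailureHook2 = MyDelayHook;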

Related

What problems can appear when using G++ compiled DLL (plugin) in VC++ compiled application?

I use an application compiled with the Visual C++ compiler. It can load plugins in the form of a .dll. It is rather unimportant what exactly it does; the fact is:
This includes calling functions from the .dll that return a pointer to an object of the application's API, etc.
My question is, what problems may appear when the application calls a function from the .dll, retrieves a pointer from it, and works with it? For example, something that comes to mind is the size of a pointer. Is it different in VC++ and G++? If so, would this crash the application?
I don't want to use the Visual Studio IDE (which is unfortunately the "preferred" way to use the Applications SDK). Can I configure G++ to compile like VC++?
PS: I use MINGW GNU G++
As long as both application and DLL are compiled on the same machine, and as long as they both only use the C ABI, you should be fine.
What you can certainly not do is share any sort of C++ construct. For example, you mustn't new[] an array in the main application and let the DLL delete[] it. That's because there is no fixed C++ ABI, and thus no way in which any given compiler knows how a different compiler implements C++ data structures. This is even true for different versions of MSVC++, which are not ABI-compatible.
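In practice that usually means the shared header looks something like this (names invented): an opaque handle plus plain extern "C" functions, with creation and destruction both done on the DLL's side:

/* plugin_api.h -- shared between the VC++ application and the g++ plugin */

#ifdef __cplusplus
extern "C" {
#endif

typedef struct PluginContext PluginContext;        /* opaque to the application  */

PluginContext* plugin_create(void);                /* allocated inside the DLL   */
int            plugin_do_work(PluginContext* ctx, const char* input);
void           plugin_destroy(PluginContext* ctx); /* freed inside the DLL too   */

#ifdef __cplusplus
}
#endif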
All C++ language features are going to be entirely incompatible, I'm afraid. Everything from the name-mangling to memory allocation to the virtual-call mechanism are going to be completely different and not interoperable. The best you can hope for is a quick, merciful crash.
If your components only use extern "C" interfaces to talk to one another, you can probably make this work, although even there, you'll need to be careful. Both C++ runtimes will have startup and shutdown code, and there's no guarantee that whichever linker you use to assemble the application will know how to include this code for the other compiler. You pretty much must link g++-compiled code with g++, for example.
If you use C++ features with only one compiler, and use that compiler's linker, then it gets that much more likely to work.
This should be OK if you know what you are doing. But there's some things to watch out for:
I'm assuming the interface between the EXE and the DLL is a "C" interface, or something COM-like where the only C++ classes exposed are through pure-virtual interfaces. It gets messier if you are exporting a concrete class through a DLL.
1. 32-bit vs. 64-bit. A 32-bit app won't load a 64-bit DLL and vice versa. Make sure they match.
2. Calling convention. __cdecl vs. __stdcall. Often Visual Studio apps are compiled with flags that assume __stdcall as the default calling convention (or the function prototype explicitly says so). So make sure that g++ generates code that matches the calling convention expected by the EXE. Otherwise, the exported function might run, but the stack can get trashed on return. If you debug through a crash like this, there's a good chance the cdecl vs. stdcall convention was incorrectly specified. Easy to fix.
3. C runtimes will not likely be shared between the EXE and the DLL, so don't mix and match. A pointer allocated with new or malloc in the EXE should not be released with delete or free in the DLL (and vice versa). Likewise, FILE handles returned by fopen() cannot be shared between the EXE and the DLL. You'll likely crash if any of this happens... which leads me to my next point...
4. C++ header files with inline code cause enough headaches and are the source of the issues I called out in #3. You'll be OK if the interface between the DLL and the EXE is a pure "C" interface.
5. Name mangling issues. If you run into issues where the exported function name doesn't match because of name mangling or a leading underscore, you can fix that up in a .DEF file. At least that's what I've done in the past with Visual Studio; I'm not sure if the equivalent exists in g++/MinGW. Learn to use "dumpbin.exe /exports" so you can validate that your DLL exports functions with the right names. Using extern "C" will help fix this as well. Example below:
EXPORTS
FooBar=_Foobar@12
BlahBlah=??BlahBlah@@QAE@XZ @236 NONAME
Those are the issues that I know of. I can't tell you much more since you didn't explain the interface between the DLL and EXE.
The size of a pointer won't vary; that is dependent on the platform and module bitness and not the compiler (32-bit vs 64-bit and so on).
What can vary is the size of basically everything else, and what will vary are templates.
Padding and alignment of structs tends to be compiler-dependent, and often settings-within-compiler dependent. There are some loose rules, like pointers typically being aligned on a platform-bitness boundary and bools often having 3 bytes of padding after them, but it's up to the compiler how to handle that.
Templates, particularly from the STL (which is different for each compiler) may have different members, sizes, padding, and mostly anything. The only standard part is the API, the backend is left to the STL implementation (there are some rules, but compilers can still compile templates differently). Passing templates between modules from one build is bad enough, but between different compilers it can often be fatal.
Things which aren't standardized (name mangling) or are highly specific by necessity (memory allocation) will also be incompatible. You can get around both of those issues by only destroying objects from the library that created them (good practice anyway), by using STL objects that take a deleter to handle allocation, and by exporting methods with undecorated names and/or C linkage (extern "C").
I also seem to remember a catch with how the compilers handle virtual destructors in the vtable, with some small difference.
If you can manage to only pass references to your own objects, avoid externally visible templates entirely, and work primarily with pointers and exported or virtual methods, you can avoid the vast majority of the issues (COM does precisely this, for compatibility with most compilers and languages). It can be a pain to write, but if you need that compatibility, it is possible.
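A bare-bones sketch of that pure-virtual-interface pattern (all names invented): only the factory crosses the boundary with C linkage, every call goes through the vtable, and Release() keeps destruction inside the module that did the construction.

struct IParser
{
    virtual int  Parse(const char* text) = 0;
    virtual void Release() = 0;          // "destroy where you create"
protected:
    virtual ~IParser() {}                // not deletable directly from the EXE side
};

extern "C" __declspec(dllexport) IParser* CreateParser();

// Inside the DLL:
class ParserImpl : public IParser
{
public:
    virtual int  Parse(const char* text) { return text ? 0 : -1; }
    virtual void Release() { delete this; }
};

extern "C" __declspec(dllexport) IParser* CreateParser() { return new ParserImpl(); }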
To alleviate some, but not all, of the issues, using an alternative to the STL (like Qt's core library) will remove that particular problem. While throwing Qt into any old project is a hideous waste and will cause more bloat than the "boost ALL THE THINGS!!!" philosophy, it can be useful for decoupling the library and the compiler to a greater extent than a stock STL can.
You can't pass C runtime objects between them. For example you can not open a FILE buffer in one and pass it to be used in the other. You can't free memory allocated on the other side.
The main problems are the function signatures and the way parameters are passed to library code. I've had great difficulty getting VC++ DLLs to work with GNU-based compilers in the past. This was way back when VC++ always cost money and MinGW was the free solution.
My experience was with the DirectX APIs. Slowly a subset got its binaries modified by enthusiasts, but it was never as up-to-date or reliable, so after evaluating it I switched to a proper cross-platform API: SDL.
This Wikipedia article describes the different ways libraries can be compiled and linked. It is rather more in-depth than I am able to summarise here.

Can changes to a dll be made, while keeping compatibility with pre-compiled executables?

We have a lot of executables that reference one of our dlls. We found a bug in one of our dlls and don't want to have to re-compile and redistribute all of our executables to fix it.
My understanding is that dlls will keep their compatibility with their executables so long as you don't change anything in the header file. So no new class members, no new functions, etc... but a change to the logic within a function should be fine. Is this correct? If it is compiler specific, please let me know, as this may be an issue.
Your understanding is correct. So long as you change the logic but not the interface then you will not run into compatibility issues.
Where you have to be careful is if the interface to the DLL is more than just the function signatures. For example if the original DLL accepted an int parameter but the new DLL enforced a constraint that the value of this parameter must be positive, say, then you would break old programs.
This will work. As long as the interface to the DLL remains the same, older executables can load it and use it just fine. That being said, you're starting down a very dangerous road. As time goes by and you patch more and more DLLs, you may start to see strange behaviour on customer installations that is virtually impossible to diagnose. This arises from unexpected interactions between different versions of your various components. Historically, this problem was known as DLL hell.
In my opinion, it is a much better idea to rebuild, retest, and redistribute the entire application. I would even go further and suggest that you use application manifests to ensure that your executables can only work with specific versions of your DLLs. It may seem like a lot of work now, but it can really save you a lot of headaches in the future.
It depends
In theory, yes: if you load the DLL with LoadLibrary and haven't changed the interface, you should be fine.
If, on the other hand, you link against the .dll through a .lib import stub, there is no guarantee it will work.
That is one of the reasons why COM was invented.

C++: Any way to 'jail function'?

Well, it's a kind of a web server.
I load .dll(.a) files and use them as program modules.
I recursively go through directories and put the '_main' functors from these libraries into a std::map under a name, which is recorded in special '.m' files.
The main directory has a few subdirectories, one for each host.
The problem is that I need to prevent usage of 'fopen' or any other filesystem functions from working with directories outside of this host directory.
The only way I can see is to write a wrapper for stdio.h (I mean, write an s_stdio.h that does a filename check), something like the sketch below.
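Roughly what I have in mind (the jail root and the helper name are just placeholders):

// s_stdio.h (sketch)
#include <cstdio>
#include <cstring>
#include <string>

inline FILE* s_fopen(const std::string& hostRoot, const char* path, const char* mode)
{
    // refuse absolute paths and anything trying to climb out with ".."
    if (path == NULL || path[0] == '/' || path[0] == '\\' || std::strstr(path, "..") != NULL)
        return NULL;
    const std::string full = hostRoot + "/" + path;
    return std::fopen(full.c_str(), mode);
}

// and in the module sources, before building them:
// #define fopen(p, m) s_fopen(g_hostDirectory, (p), (m))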
Maybe it could be a daemon, catching system calls and identifying something?
Added:
Well, and what about this kind of situation: I upload only sources and then compile them directly on my server after checking them? Well, that's the only way I have found (still keeping everything inside one address space).
As C++ is a low-level language and the DLLs are compiled to machine code, they can do anything. Even if you wrap the standard library functions, the code can make system calls directly, reimplementing the functionality you have wrapped.
Probably the only way to effectively sandbox such a DLL is some kind of virtualisation, so the code is not run directly but in a virtual machine.
The simpler solution is to use some higher-level language for the loadable modules that should be sandboxed. Some high-level languages are better at sandboxing (Lua, Java), others are not so good (e.g. AFAIK there is currently no official restricted environment implemented for Python).
If you are the one loading the module, you can perform a static analysis on the code to verify what APIs it calls, and refuse to link it if it doesn't check out (i.e. if it makes any kind of suspicious call at all).
Having said that, it's a lot of work to do this, and not very portable.

C++ internal code reuse: compile everything or share the library / dynamic library?

General question:
For unmanaged C++, what's better for internal code sharing?
Reuse code by sharing the actual source code? OR
Reuse code by sharing the library / dynamic library (+ all the header files)
Whichever it is: what's your strategy for reducing duplicate code (copy-paste syndrome), code bloat?
Specific example:
Here's how we share the code in my organization:
We reuse code by sharing the actual source code.
We develop on Windows using VS2008, though our project actually needs to be cross-platform. We have many projects (.vcproj) committed to the repository; some have their own repositories, some are part of a shared one. For each deliverable solution (.sln) (e.g. something that we deliver to the customer), it will svn:externals all the necessary projects (.vcproj) from the repository to assemble the "final" product.
This works fine, but I'm quite worried that the code size for each solution could eventually get quite huge (right now our total code size is about 75K SLOC).
Also, one thing to note is that we forbid all transitive dependencies. That is, each project (.vcproj) that is not an actual solution (.sln) is not allowed to svn:externals any other project, even if it depends on it. This is because you could have two projects (.vcproj) that depend on the same library (e.g. Boost) or project (.vcproj), so when you svn:externals both projects into a single solution, svn:externals would pull it in twice. So we carefully document all dependencies for each project, and it's up to the guy who creates the solution (.sln) to ensure all dependencies (including transitive ones) are pulled in via svn:externals as part of the solution.
If we instead reuse code via .lib or .dll files, this would obviously reduce the code size for each solution, as well as eliminate the transitive-dependency issue mentioned above where applicable (exceptions are, for example, third-party libraries/frameworks that ship as DLLs, like Intel TBB and the default Qt build).
Addendum: (read if you wish)
Another motivation to share source code might be summed up best by Dr. GUI:
On top of that, what C++ makes easy is not creation of reusable binary components; rather, C++ makes it relatively easy to reuse source code. Note that most major C++ libraries are shipped in source form, not compiled form. It's all too often necessary to look at that source in order to inherit correctly from an object—and it's all too easy (and often necessary) to rely on implementation details of the original library when you reuse it. As if that isn't bad enough, it's often tempting (or necessary) to modify the original source and do a private build of the library. (How many private builds of MFC are there? The world will never know . . .)
Maybe this is why, when you look at libraries like the Intel Math Kernel Library, their "lib" folder has "vc7", "vc8", and "vc9" subfolders, one for each Visual Studio version. Scary stuff.
Or how about this assertion:
C++ is notoriously non-accommodating when it comes to plugins. C++ is extremely platform-specific and compiler-specific. The C++ standard doesn't specify an Application Binary Interface (ABI), which means that C++ libraries from different compilers or even different versions of the same compiler are incompatible. Add to that the fact that C++ has no concept of dynamic loading and each platform provide its own solution (incompatible with others) and you get the picture.
What are your thoughts on the above assertion? Does something like Java or .NET face these kinds of problems? E.g. if I produce a JAR file from NetBeans, will it work if I import it into IntelliJ, as long as I ensure that both have compatible JRE/JDK versions?
People seem to think that C specifies an ABI. It doesn't, and I'm not aware of any standardised compiled language that does. To answer your main question, use of libraries is of course the way to go - I can't imagine doing anything else.
One good reason to share the source code: Templates are one of C++'s best features because they are an elegant way around the rigidity of static typing, but by their nature are a source-level construct. If you focus on binary-level interfaces instead of source-level interfaces, your use of templates will be limited.
We do the same. Trying to use binaries can be a real problem if you need to use shared code on different platforms or in different build environments, or even if you need different build options such as static vs. dynamic linking to the C runtime, different structure-packing settings, etc.
I typically set projects up to build as much from source on-demand as possible, even with third-party code such as zlib and libpng. For those things that must be built separately, e.g. Boost, I typically have to build 4 or 8 different sets of binaries for the various combinations of settings needed (debug/release, VS7.1/VS9, static/dynamic), and manage the binaries along with the debugging information files in source control.
Of course, if everyone sharing your code is using the same tools on the same platform with the same options, then it's a different story.
I never saw shared libraries as a way to reuse code from an old project into a new one. I always thought it was more about sharing a library between different applications that you're developing at about the same time, to minimize bloat.
As far as copy-paste syndrome goes, if I copy and paste it in more than a couple places, it needs to be its own function. That's independent of whether the library is shared or not.
When we reuse code from an old project, we always bring it in as source. There's always something that needs tweaking, and it's usually safer to tweak a project-specific version than to tweak a shared version that can wind up breaking the previous project. Going back and fixing the previous project is out of the question because 1) it worked (and shipped) already, 2) it's no longer funded, and 3) the test hardware needed may no longer be available.
For example, we had a communication library that had an API for sending a "message", a block of data with a message ID, over a socket, pipe, whatever:
void Foo::Send(unsigned messageID, const void* buffer, size_t bufSize);
But in a later project, we needed an optimization: the message needed to consist of several blocks of data in different parts of memory concatenated together, and we couldn't (and didn't want to, anyway) do the pointer math to create the data in its "assembled" form in the first place, and the process of copying the parts together into a unified buffer was taking too long. So we added a new API:
void Foo::SendMultiple(unsigned messageID, const void** buffer, size_t* bufSize);
Which would assemble the buffers into a message and send it. (The base class's method allocated a temporary buffer, copied the parts together, and called Foo::Send(); subclasses could use this as a default or override it with their own, e.g. the class that sent the message on a socket would just call send() for each buffer, eliminating a lot of copies.)
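The default implementation was roughly like this (reconstructed from memory, so treat the details as illustrative; here I'm assuming the buffer array is NULL-terminated rather than carrying an explicit count):

#include <cstddef>
#include <cstring>
#include <vector>

class Foo
{
public:
    virtual ~Foo() {}
    virtual void Send(unsigned messageID, const void* buffer, size_t bufSize) = 0;

    // Default: glue the pieces into one temporary buffer and send that.
    virtual void SendMultiple(unsigned messageID, const void** buffers, size_t* bufSizes)
    {
        size_t total = 0, count = 0;
        for (; buffers[count] != NULL; ++count)
            total += bufSizes[count];

        std::vector<char> assembled(total);
        char* dest = assembled.empty() ? NULL : &assembled[0];
        for (size_t i = 0; i < count; ++i)
        {
            if (bufSizes[i])
            {
                std::memcpy(dest, buffers[i], bufSizes[i]);
                dest += bufSizes[i];
            }
        }
        Send(messageID, assembled.empty() ? NULL : &assembled[0], total);
    }
};

// The socket subclass overrides SendMultiple() and just calls send() once per
// buffer, eliminating the copy.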
Now, by doing this, we have the option of backporting (copying, really) the changes to the older version, but we're not required to backport. This gives the managers flexibility, based on the time and funding constraints they have.
EDIT: After reading Neil's comment, I thought of something that we do that I need to clarify.
In our code, we do lots of "libraries". LOTS of them. One big program I wrote had something like 50 of them. Because, for us and with our build setup, they're easy.
We use a tool that auto-generates makefiles on the fly, taking care of dependencies and almost everything. If there's anything strange that needs to be done, we write a file with the exceptions, usually just a few lines.
It works like this: the tool finds everything in the directory that looks like a source file, generates dependencies if the file changed, and spits out the needed rules. Then it makes a rule to take everything and ar/ranlib it into a libxxx.a file, named after the directory. All the objects and the library are put in a subdirectory that is named after the target platform (this makes cross-compilation easy to support). This process is then repeated for every subdirectory (except the object-file subdirs). Then the top-level directory gets linked with all the subdirectories' libraries into the executable, and a symlink is created, again named after the top-level directory.
So directories are libraries. To use a library in a program, make a symbolic link to it. Painless. Ergo, everything's partitioned into libraries from the outset. If you want a shared lib, you put a ".so" suffix on the directory name.
To pull in a library from another project, I just use a Subversion external to fetch the needed directories. The symlinks are relative, so as long as I don't leave something behind it still works. When we ship, we lock the external reference to a specific revision of the parent.
If we need to add functionality to a library, we can do one of several things. We can revise the parent (if it's still an active project and thus testable), tell Subversion to use the newer revision and fix any bugs that pop up. Or we can just clone the code, replacing the external link, if messing with the parent is too risky. Either way, it still looks like a "library" to us, but I'm not sure that it matches the spirit of a library.
We're in the process of moving to Mercurial, which has no "externals" mechanism so we have to either clone the libraries in the first place, use rsync to keep the code synced between the different repositories, or force a common directory structure so you can have hg pull from multiple parents. The last option seems to be working pretty well.

DLLs and STLs and static data (oh my!)

OK..... I've done all the reading on related questions, and a few MSDN articles, and about a day's worth of googling.
What's the current "state of the art" answer to this question:
I'm using VS 2008, C++ unmanaged code. I have a solution file with quite a few DLLs and quite a few EXEs. As long as I completely control the build environment, such that all pieces and parts are built with the same flags and use the same runtime libraries, and nobody statically links the CRT, am I OK to pass STL objects around?
It seems like this should be OK, but depending on which article you read, there's lots of Fear, Uncertainty, and Doubt.
I know there are all sorts of problems with templates that produce static data behind the scenes (every DLL would get its own copy, leading to heartache), but what about regular old STL?
As long as they ALL use the exact same version of the runtime DLLs, there should be no problem with STL. But once you happen to have several different runtimes around, they will, for instance, each use their own heap - leading to no end of trouble.
We successfully pass STL objects around in our application, which is made up of dozens of DLLs. To ensure this keeps working, one of our automated tests, run at every build, verifies the settings of all projects. If you add a new project and misconfigure it, or break the configuration of an existing project, the build fails.
The settings we check are as follows. Note not all of these will cause issues, but we check them for consistency.
#defines
_WIN32_WINNT
STRICT
_WIN32_IE
NDEBUG
_DEBUG
_SECURE_SCL
Compiler options
DebugInformationFormat
WholeProgramOptimization
RuntimeLibrary
We use stl collections in our application and pass them to and from methods in different dlls (usually as references). This doesn't cause any trouble.
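For example, the typical shape of such a call is roughly this (names invented): the EXE owns the container and the DLL just fills it in through a reference, which works because every module links against the same dynamic CRT and therefore the same heap.

#include <vector>

// Exported from one of our DLLs:
__declspec(dllexport) void FillResults(std::vector<int>& out)
{
    out.push_back(42);   // allocates from the one shared CRT heap -- safe only
                         // because all modules use the same runtime DLL
}

// In the EXE:
//   std::vector<int> results;
//   FillResults(results);   // the vector is constructed and destroyed in the EXE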
The only area where we have had trouble is where one DLL allocates memory and another DLL tries to delete it. This is only reported as a problem on Debug builds (presumably because the debug CRT tracks which heap each block came from), and it still works on Release builds. Having said that, wherever I come across this I do fix it.
If I were writing a 3rd-party library, I would think twice about using STL parameters in the API. Previously (VC6) we had to use OCI (Oracle's C API) as opposed to OCCI (Oracle's C++ API), because the latter only worked with the Microsoft STL implementation and we were using STLport. Of course, if you let your clients build the library with their own STL implementation, this is not an issue.