DLLs and STLs and static data (oh my!)


OK..... I've done all the reading on related questions, and a few MSDN articles, and about a day's worth of googling.
What's the current "state of the art" answer to this question:
I'm using VS 2008, C++ unmanaged code. I have a solution file with quite a few DLLs and quite a few EXEs. As long as I completely control the build environment, such that all pieces and parts are built with the same flags and use the same runtime libraries, and no one has a statically linked CRT library, am I OK to pass STL objects around?
It seems like this should be OK, but depending on which article you read, there's lots of Fear, Uncertainty, and Doubt.
I know there's all sorts of problems with templates that produce static data behind the scenes (every dll would get their own copy, leading to heartache), but what about regular old STL?

As long as they ALL use the exact same version of the runtime DLLs, there should be no problem with STL. But once several runtime versions are loaded, each module will use, for instance, a different heap, which leads to no end of trouble.

We successfully pass STL objects around in our application, which is made up of dozens of DLLs. To ensure it works, one of the automated tests that runs at every build verifies the settings for all projects. If you add a new project and misconfigure it, or break the configuration of an existing project, the build fails.
The settings we check are as follows. Note not all of these will cause issues, but we check them for consistency.
#defines
_WIN32_WINNT
STRICT
_WIN32_IE
NDEBUG
_DEBUG
_SECURE_SCL
Compiler options
DebugInformationFormat
WholeProgramOptimization
RuntimeLibrary
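
The answer does not say how its settings test is implemented. As a complement to checking the project files, one possible runtime sanity check is sketched here: each module reports the macros it was compiled with, and the host compares that report with its own. The names build_fingerprint and same_build_settings are hypothetical, and only four of the macros listed above are covered.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: records some of the consistency-relevant macros
// that this translation unit was compiled with.
inline std::string build_fingerprint() {
    std::ostringstream fp;
#ifdef _DEBUG
    fp << "_DEBUG=1;";
#else
    fp << "_DEBUG=0;";
#endif
#ifdef NDEBUG
    fp << "NDEBUG=1;";
#else
    fp << "NDEBUG=0;";
#endif
#ifdef _SECURE_SCL
    fp << "_SECURE_SCL=" << _SECURE_SCL << ';';
#else
    fp << "_SECURE_SCL=unset;";
#endif
#ifdef _WIN32_WINNT
    fp << "_WIN32_WINNT=" << _WIN32_WINNT << ';';
#else
    fp << "_WIN32_WINNT=unset;";
#endif
    return fp.str();
}

// The host compares its own fingerprint with the one a module reports.
inline bool same_build_settings(const std::string& host,
                                const std::string& module) {
    assert(!host.empty() && !module.empty());
    return host == module;
}
```

In a real multi-DLL setup each module would compile its own copy of build_fingerprint and export the result, probably as a plain const char* so that the check itself does not depend on the settings it is checking.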

We use stl collections in our application and pass them to and from methods in different dlls (usually as references). This doesn't cause any trouble.
The only area where we have had trouble is where one dll allocates memory and another dll tries to delete it. The debug runtime reports this as an error, though I am not sure why. It only seems to be a problem in Debug builds (where it is reported); Release builds still work. Having said that, wherever I come across this I fix it.
If I were writing a 3rd party library I would think twice about using STL parameters in the API. Previously (VC6) we had to use OCI (Oracle's C API) as opposed to OCCI (Oracle's C++ API), because OCCI only worked with the Microsoft STL implementation and we were using STLport. Of course, if you enable your clients to build the library with their own STL implementation, this is not an issue.

Related

Can changes to a dll be made, while keeping compatibility with pre-compiled executables?

We have a lot of executables that reference one of our dlls. We found a bug in one of our dlls and don't want to have to re-compile and redistribute all of our executables to fix it.
My understanding is that dlls will keep their compatibility with their executables so long as you don't change anything in the header file. So no new class members, no new functions, etc... but a change to the logic within a function should be fine. Is this correct? If it is compiler specific, please let me know, as this may be an issue.

Your understanding is correct. So long as you change the logic but not the interface then you will not run into compatibility issues.
Where you have to be careful is if the interface to the DLL is more than just the function signatures. For example if the original DLL accepted an int parameter but the new DLL enforced a constraint that the value of this parameter must be positive, say, then you would break old programs.
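
A minimal sketch of that last point, with hypothetical function names: both versions have the same signature, so an old executable still links and loads, but the second version adds a precondition that an old caller can violate.

```cpp
#include <cassert>

// Version 1 of a hypothetical exported function: any int is accepted.
bool scale_v1(int value, int* out) {
    assert(out != nullptr);
    *out = value * 2;
    return true;
}

// Version 2 keeps the exact same signature but now rejects
// non-positive input, which changes the contract.
bool scale_v2(int value, int* out) {
    assert(out != nullptr);
    if (value <= 0) return false;
    *out = value * 2;
    return true;
}

// Stands in for an old executable that was written and tested against v1
// and passes a negative value.
bool old_caller(bool (*scale)(int, int*), int* out) {
    return scale(-4, out);
}
```

Against scale_v1 the old caller succeeds; against scale_v2 the very same call fails, even though nothing in the header changed.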

This will work. As long as the interface to the DLL remains the same, older executables can load it and use it just fine. That being said, you're starting down a very dangerous road. As time goes by and you patch more and more DLLs, you may start to see strange behaviour on customer installations that is virtually impossible to diagnose. This arises from unexpected interactions between different versions of your various components. Historically, this problem was known as DLL hell.
In my opinion, it is a much better idea to rebuild, retest, and redistribute the entire application. I would even go further and suggest that you use application manifests to ensure that your executables can only work with specific versions of your DLLs. It may seem like a lot of work now, but it can really save you a lot of headaches in the future.

It depends
In theory yes: if you load the DLL with LoadLibrary and haven't changed the interface, you should be fine.
If, on the other hand, you link with the DLL through a .lib stub, there is no guarantee it will work.
That is one of the reasons why COM was invented.

How should I integrate with and package this third-party library in a Win32 C++ app?

We have a (very large) existing codebase for a custom ActiveX control, and I'd like to integrate libkml into it for the sake of interacting with KML mapping data, rather than reinventing the wheel. The problem is, I'm a relatively new Windows developer, and coming from the Linux world, I'm really not sure what the right way of integrating a third party library is. Thankfully, libkml does provide MSVC projects for compiling it, so porting isn't a problem. I can think of a few choices:
1. Build and link the library directly. We already have a solution with project files in it for the "main" project; I could add the libkml projects to that solution, but I'd rather not. It's very unlikely that the libkml code will change in relation to our app's code.
2. Statically link to the .lib files produced by the libkml build. This is unattractive, since there are six .lib files that come out of the libkml solution and it seems inelegant to manually specify them in the linker options, etc.
3. Package the code as-is in a DLL. Maybe with COM? It seems like if I did this without any translation, I'd end up with a lot of overhead, and since I'm fairly unfamiliar with COM, I don't know how much work would be involved in exposing all the functionality I'd like to use via COM. The library is fairly big, has a lot of classes it uses, and if I had to manually write code to expose it all, I'd be hesitant to go this route.
4. Write wrapper code to abstract the functionality I need, package that in a COM DLL, and interact with that. This seems sensible, I suppose, but it's difficult to determine how much abstraction I need since I haven't written the code that would use libkml yet.
Let me reiterate: I haven't written the code that will interact with libkml yet, so this is mostly experimental. Options 1 and 2 are also complicated by the fact that libkml relies additionally on three more external libraries that are also in .lib files (that I had to recompile anyway to get the code generation flags to line up). The goal obviously is to get the code to work, but maintainability and source tree organization are also goals, so I'm leaning towards options 3 and 4, but I don't know the best way to approach those on Windows.

Typing six file names, or using the declarative style with #pragma comment(lib, "foo.lib") is small potatoes compared to the work you'll have to do to turn this into a DLL or COM server.
The distribution is heavily biased towards using this as a static link library. There are only spotty declarations available to turn this into a DLL with __declspec(dllexport). They exist only in the 3rd party dependencies. All of them use different #defines, of course, so you'll be typing a bunch of names into the preprocessor definitions for the projects.
Furthermore, you'll have a hard time actually getting this DLL loaded at runtime since you are using it in a COM server. The search path for DLLs will be the client app's when COM creates your control instance, not likely to be anywhere near close to the place you deployed the DLL.
Making it a COM server is a lot of work, you'll have to write all the interface glue yourself. Again, nothing already in the source code that helps with this at all.

You can also wrap all the functionality you need in a non-COM-dll. Visual studio supports creating a static wrapper library which, when linked, will make your program use the dll. This way you only have one dependency to specify instead of six.
Other than that, what is wrong with specifying six dependencies? I would assume that there is a good reason that these are six separate libraries instead of one, so it is prudent to specify exactly which parts you actually use.

Maybe I'm missing something here, but I really don't see what is wrong with (1). I think that even if you had multiple projects that were using libkml, just insert the project file for libkml into your solution file, specify the dependencies, and you should be done. It's dead simple. Even solution (2) is dead simple. If the libraries ever change, you rebuild - you're going to need to do that anyway.
I'm failing to see how (3) or (4) are necessary or even desired. To me, it sounds like a lot of work for goals (source tree organization and maintainability) that I'm not even sure that those options really meet. In fact, you said yourself that "It's very unlikely that the libkml code will change in relation to our app's code."
What I've found over the years is to just keep things simple. If rebuilding KML is potentially time consuming, grab the libs and just statically link to the libraries. Yes, there are other dependencies, but you'll set this up once and be done, hopefully never to worry about it again. Otherwise, stick it in the project and move on. I think that it's worthwhile to ask whether spending a lot of time on this issue is worth the trouble.

Using new Windows features with fallback

I've been using dynamic libraries and the GetProcAddress stuff for quite some time, but it always seems a tedious, IntelliSense-hostile, and ugly way to do things.
Does anyone know a clean way to import new features while staying compatible with older OSes?
Say I want to use an XML library which is a part of Vista. I call LoadLibraryW and then I can use the functions if the returned module handle is non-null.
But I really don't want to go the typedef void* (*PFNFOOOBAR)(int, int, int); and PFNFOOOBAR foo = reinterpret_cast<PFNFOOOBAR>(GetProcAddress(hModule, "somecoolfunction")); way (hModule being the handle returned by LoadLibraryW), all that times 50.
Is there a non-hackish solution with which I could avoid this mess?
I was thinking of adding coolxml.lib in the project settings, then including coolxml.dll in the delay-load DLL list and, maybe, copying the few function signatures I will use into the file that needs them. Then I would check whether LoadLibraryW returns non-null and, if so, branch to the Vista code path as in regular program flow.
But I'm not sure whether LoadLibrary and delay-load can work together, or whether some branch prediction will mess things up in some cases.
Also, I'm not sure whether this approach will work, or whether it will cause problems after upgrading to the next SDK.

IMO, LoadLibrary and GetProcAddress are the best way to do it.
(Make some wrapper objects which take care of that for you, so you don't pollute your main code with that logic and ugliness.)
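
One way such a wrapper object could look is sketched below. It is only an illustration, not the answerer's code: OptionalProc, Resolver and fake_resolver are hypothetical names, the sketch assumes a compiler with C++11 variadic templates (so not VS 2008 itself), and the symbol lookup is passed in as a function so the example stays portable. On Windows the resolver would call GetProcAddress on a handle obtained from LoadLibraryW, and the function pointer type would also need the right calling convention (for example WINAPI).

```cpp
#include <cassert>
#include <cstring>

// Generic "raw" function pointer and lookup signature. In a real Windows
// build the resolver would forward to GetProcAddress (assumption).
typedef void (*RawProc)();
typedef RawProc (*Resolver)(const char* name);

template <typename Signature>
class OptionalProc;

// Wraps one optional function: the cast happens once, in one place.
template <typename R, typename... Args>
class OptionalProc<R(Args...)> {
public:
    OptionalProc(Resolver resolve, const char* name)
        : fn_(reinterpret_cast<R (*)(Args...)>(resolve(name))) {}

    bool available() const { return fn_ != nullptr; }

    // Call the function; only valid when available() is true.
    R operator()(Args... args) const {
        assert(fn_ != nullptr);
        return fn_(args...);
    }

    // Call the function if present, otherwise return the fallback value.
    R call_or(R fallback, Args... args) const {
        return fn_ ? fn_(args...) : fallback;
    }

private:
    R (*fn_)(Args...);
};

// Stand-ins for a DLL export and its lookup, for demonstration only.
inline int add3(int a, int b, int c) { return a + b + c; }

inline RawProc fake_resolver(const char* name) {
    if (std::strcmp(name, "somecoolfunction") == 0)
        return reinterpret_cast<RawProc>(&add3);
    return nullptr;
}
```

With this, the calling code declares OptionalProc<int(int, int, int)> cool(fake_resolver, "somecoolfunction"); once and then either tests cool.available() or uses cool.call_or(...) with a fallback, so the typedef and cast noise stays out of the main code.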
DelayLoad brings with it security problems (see this OldNewThing post) (edit: though not if you ensure you never call those APIs on older versions of windows).
DelayLoad also makes it too easy to accidentally depend on an API which won't be available on all targets. Yes, you can use tools to check which APIs you call at runtime but it's better to deal with these things at compile time, IMO, and those tools can only check the code you actually exercise when running under them.
Also, avoid compiling some parts of your code with different Windows header versions, unless you are very careful to segregate code and the objects that are passed to/from it.
It's not absolutely wrong -- and it's completely normal with things like plug-in DLLs where two entirely different teams probably worked on the two modules without knowing what SDK version each other targeted -- but it can lead to difficult problems if you aren't careful, so it's best avoided in general.
If you mix header versions you can get very strange errors. For example, we had a static object which contained an OS structure which changed size in Vista. Most of our project was compiled for XP, but we added a new .cpp file whose name happened to start with A and which was set to use the Vista headers. That new file then (arbitrarily) became the one which triggered the static object to be allocated, using the Vista structure sizes, but the actual code for that object was built using the XP structures. The constructor thought the object's members were in different places to the code which allocated the object. Strange things resulted!
Once we got to the bottom of that we banned the practise entirely; everything in our project uses the XP headers and if we need anything from the newer headers we manually copy it out, renaming the structures if needed.
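
A sketch of what "copy it out and rename it" might look like, with compile-time guards on the layout. The structure names, fields and sizes here are made up for illustration; they do not correspond to any real Windows structure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for an OS structure as the older SDK headers declare it
// (hypothetical name and layout).
struct INFO_OLD_SDK {
    std::uint32_t cbSize;
    std::uint32_t flags;
};

// Manually copied and renamed version of the newer layout, so that it
// cannot collide with the declaration in the older headers. The packing
// is stated explicitly, as the original header would have done.
#pragma pack(push, 4)
struct MY_INFO_NEW {
    std::uint32_t cbSize;
    std::uint32_t flags;
    std::uint32_t extra[4];
};
#pragma pack(pop)

// Compile-time guards: the build fails if the copy's layout ever drifts.
static_assert(sizeof(MY_INFO_NEW) == 24, "copied structure has wrong size");
static_assert(offsetof(MY_INFO_NEW, extra) == 8, "copied structure has wrong layout");

// Fills in the size field the way versioned OS structures usually expect.
inline MY_INFO_NEW make_info() {
    MY_INFO_NEW info = MY_INFO_NEW();
    info.cbSize = static_cast<std::uint32_t>(sizeof(info));
    assert(info.cbSize > sizeof(INFO_OLD_SDK));  // new layout is the larger one
    return info;
}
```

Because every translation unit sees the same renamed declaration, the allocating code and the constructing code can no longer disagree about the structure's size.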
It is very tedious to write all the typedef and GetProcAddress stuff, and to copy structures and defines out of headers (which seems wrong, but they're a binary interface so not going to change) (don't forget to check for #pragma pack stuff, too :(), but IMO that is the best way if you want the best compile-time notification of issues.
I'm sure others will disagree!
PS: Somewhere I've got a little template I made to make the GetProcAddress stuff slightly less tedious... Trying to find it; will update this when/if I do. Found it, but it wasn't actually that useful. In fact, none of my code even used it. :)

Yes, use delay loading. That leaves the ugliness to the compiler. Of course you'll still have to ensure that you're not calling a Vista function on XP.

Delay loading is the best way to avoid using LoadLibrary() and GetProcAddress() directly. Regarding the security issues mentioned, about the only thing you can do about that is use the delay load hooks to make sure (and optionally force) the desired DLL is being loaded during the dliNotePreLoadLibrary notification using the correct system path, and not relative to your app folder. Using the callbacks will also allow you to substitute your own fallback implementations in the dliFailLoadLib/dliFailGetProc notifications when the desired API function(s) are not available. That way, the rest of your code does not have to worry about platform differences (or very little).

Any improvements on the GCC/Windows DLLs/C++ STL front?

Yesterday, I got bit by a rather annoying crash when using DLLs compiled with GCC under Cygwin. Basically, as soon as you run with a debugger, you may end up landing in a debugging trap caused by RtlFreeHeap() receiving an address to something it did not allocate.
This is a known bug with GCC 3.4 on Cygwin. The situation arises because the libstdc++ library includes a "clever" optimization for empty strings. I'll spare you the details (see the references throughout this post), but whenever you allocate memory in one DLL for an std::string object that "belongs" to another DLL, you end up giving one heap a chunk to free that came from another heap. Hence the SIGTRAP in RtlFreeHeap().
There are other problems reported when exceptions are thrown across DLL boundaries.
This makes GCC 3.4 on Windows an unacceptable solution as soon as your project is based on DLLs and the STL. I have a few options to move past this problem, many of which are very time-consuming and/or annoying:
I can patch my libstdc++ or rebuild it with the --enable-fully-dynamic-string configuration option
I can use static libraries instead, which increases my link time
I cannot (yet) switch to another compiler either, because of some other tools I'm using. The comment I find from some GCC people is that "it's almost never reported, so it's probably not a problem", which annoys me even more.
Does anyone have some news about this? I can't find any clear announcement that this has been fixed (the bug is still marked as "assigned"), except one comment on the GNU Radio bug tracker.
Thanks!

The general problem you're running into is that C++ was never really meant as a component language. It was really designed to be used to create complete standalone applications. Things like shared libraries and other such mechanisms were created by vendors on their own. Think of this example: suppose you created a C++ component that returns a C++ object. How does the C++ component know that it will be used by a C++ caller? And if the caller is a C++ application, why not just use the library directly?
Of course, the above information doesn't really help you.
Instead, I would create the shared libraries/DLLs such that you follow a couple of rules:
Any object created by a component is also destroyed by the same component.
A component can be safely unloaded when all of its created objects are destroyed.
You may have to create additional APIs in your component to enforce these rules, but following them ensures that problems like the one described won't happen.
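
The two rules can be sketched as a pair of exported functions plus a live-object counter. This is only an illustration under assumptions: Widget and the widget_* functions are hypothetical, and everything is shown in one file, whereas in practice the "component side" would be compiled into the DLL and the "client side" into the EXE.

```cpp
#include <cassert>
#include <memory>

// --- Component side (would live inside the DLL) ---
class Widget {
public:
    explicit Widget(int value) : value_(value) { ++live_count(); }
    ~Widget() { --live_count(); }
    int value() const { return value_; }
    static int& live_count() { static int n = 0; return n; }
private:
    int value_;
};

// Rule 1: the component both creates and destroys its objects.
extern "C" Widget* widget_create(int value) { return new Widget(value); }
extern "C" void widget_destroy(Widget* w) { delete w; }
extern "C" int widget_value(const Widget* w) {
    assert(w != nullptr);
    return w->value();
}

// Rule 2: the component can be unloaded once this reaches zero.
extern "C" int widget_live_count() { return Widget::live_count(); }

// --- Client side ---
// The deleter routes destruction back into the component, so the client
// never calls delete on memory it did not allocate.
struct WidgetDeleter {
    void operator()(Widget* w) const { widget_destroy(w); }
};
typedef std::unique_ptr<Widget, WidgetDeleter> WidgetPtr;

inline WidgetPtr make_widget(int value) {
    return WidgetPtr(widget_create(value));
}
```

The client treats Widget as an opaque handle and only talks to it through the exported functions, so no allocation ever crosses from one module's heap to another's.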

Static or dynamic linking the CRT, MFC, ATL, etc

Back in the 90s when I first started out with MFC I used to dynamically link my apps and shipped the relevant MFC DLLs. This caused me a few issues (DLL hell!) and I switched to statically linking instead - not just for MFC, but for the CRT and ATL. Other than larger EXE files, statically linking has never caused me any problems at all - so are there any downsides that other people have come across? Is there a good reason for revisiting dynamic linking again? My apps are mainly STL/Boost nowadays FWIW.

Most of the answers I hear about this involve sharing your dll's with other programs, or having those dll's be updated without the need to patch your software.
Frankly I consider those to be downsides, not upsides. When a third party dll is updated, it can change enough to break your software. And these days, hard drive space isn't as precious as it once was, an extra 500k in your executable? Who cares?
Being 100% sure of the version of dll that your software is using is a good thing.
Being 100% sure that the client is not going to have a dependency headache is a good thing.
The upsides far outweigh the downsides in my opinion

There are some downsides:
Bigger exe size (esp if you ship multiple exe's)
Problems using other DLL's which rely on or assume dynamic linking (eg: 3rd party DLL's which you cannot get as static libraries)
Different c-runtimes between DLL's with independent static linkage (no cross-module allocate/deallocate)
No automatic servicing of shared components (no ability to have 3rd party module supplier update their code to fix issues without recompiling and updating your application)
We do static linking for our Windows apps, primarily because it allows xcopy deployment, which is just not possible with installing or relying on SxS DLL's in a way which works, since the process and mechanism is not well documented or easily remotable. If you use local DLL's in the install directory it will kinda work, but it's not well supported. The inability to easily do remote installation without going through a MSI on the remote system is the primary reason why we don't use dynamic linking, but (as you pointed out) there are many other benefits to static linking. There are pros and cons to each; hopefully this helps enumerate them.

As long as you keep your usage limited to certain libraries and do not use any dll's then you should be good.
Unfortunately, there are some libraries that you cannot link statically. The best example I have is OpenMP. If you take advantage of Visual Studio's OpenMP support, you will have to make sure the runtime is installed (in this case vcomp.dll).
If you do use dll's then you can't pass some items back and forth without some serious gymnastics. std::strings come to mind. If your exe and dll both link the CRT dynamically, the allocation takes place in the shared CRT. Otherwise your program may try to allocate the string on one side and deallocate it on the other. Bad things ensue...
That said, I still statically link my exe's and dll's. It does reduce a lot of the variability in the install and I consider that well worth the few limitations.
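
One common form of those gymnastics is to keep std::string out of the exported signature and let the caller own the buffer. The sketch below assumes a hypothetical export named get_greeting; both sides are shown in one file for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// DLL side (hypothetical export): copies at most capacity-1 characters
// plus a terminator into the caller's buffer and returns the full length
// of the text, so the caller can size a buffer and call again.
extern "C" std::size_t get_greeting(char* buffer, std::size_t capacity) {
    static const char text[] = "hello from the dll";
    const std::size_t needed = sizeof(text) - 1;
    if (buffer != nullptr && capacity > 0) {
        const std::size_t n = needed < capacity - 1 ? needed : capacity - 1;
        std::memcpy(buffer, text, n);
        buffer[n] = '\0';
    }
    return needed;
}

// EXE side: every allocation and deallocation happens in the caller's
// own runtime; only plain characters cross the boundary.
inline std::string greeting() {
    const std::size_t needed = get_greeting(nullptr, 0);
    std::vector<char> buffer(needed + 1);
    const std::size_t written = get_greeting(buffer.data(), buffer.size());
    assert(written == needed);
    return std::string(buffer.data(), needed);
}
```

The string object is built on the EXE side from the copied characters, so it does not matter whether the two modules share a CRT.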

One good feature of using dll's is that if multiple processes load the same dll, its code can be shared between them. This can save memory and shorten loading times for an application loading a dll that's already used by another program.

No, nothing new on that front. Keep it that way.

Most definitely.
Allocation is done on a 'static' heap: with a statically linked CRT, each module gets its own. Since allocation and deallocation should be done on the same heap, this means that if you ship a library, you should take care that client code cannot call 'your' p = new LibClass() and then delete that object itself using delete p;.
My conclusion: either shield allocation and deallocation from client code, or dynamically link the CRT.
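
A minimal sketch of the "shield allocation and deallocation" option: the client only sees an abstract interface with a Release() method, and the destructor is not accessible, so delete p; in client code does not compile. LibClass is the name used above; CreateLibClass, LiveLibObjects and LibClassImpl are hypothetical.

```cpp
#include <cassert>

// Public header: what the client sees.
class LibClass {
public:
    virtual int value() const = 0;
    virtual void Release() = 0;   // destroys the object inside the library
protected:
    virtual ~LibClass() {}        // not accessible to client code
};

// Library-internal implementation.
namespace {
int g_live = 0;                   // objects currently alive

class LibClassImpl : public LibClass {
public:
    explicit LibClassImpl(int v) : v_(v) { ++g_live; }
    int value() const override { return v_; }
    // new and delete both execute in the library, on the library's heap.
    void Release() override { delete this; }
private:
    ~LibClassImpl() override {
        assert(g_live > 0);
        --g_live;
    }
    int v_;
};
}  // namespace

// Exported factory and a counter that is handy for leak checks.
LibClass* CreateLibClass(int v) { return new LibClassImpl(v); }
int LiveLibObjects() { return g_live; }
```

Because Release() is virtual, the call is dispatched into the library's own code, so the matching delete runs against the same heap that served the new, whichever CRT the client links.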

There are some software licenses such as LGPL that require you to either use a DLL or distribute your application as object files that the user can link together. If you are using such a library, you'll probably want to use it as a DLL.
