We have a C++ library. It has four static objects that are sensitive to initialization order (two of them are strings from the standard library).
We are using init_seg(lib) to control the order of initialization of C++ static objects in the library. The source file that is using it is compiled and used in either a dynamic link library or static library. According to the documentation for init_seg:
... It is particularly important to use the init_seg pragma in dynamic-link libraries (DLLs) or libraries requiring initialization. (emphasis mine)
The Visual Studio solution is organized into four projects. One is the static library, one is the dynamic library, one is the test driver for the static library, and one is the test driver for the dynamic library.
Under Visual Studio, compiling the source file with the init_seg results in warning C4073, with the text "initializers put in library initialization area". According to MSDN:
Only third-party library developers should use the library initialization area, which is specified by #pragma init_seg. The following sample generates C4073...
The code using the init_seg is used in the libraries only, and not used with the test drivers. I've verified the static library and dynamic library project settings, and they are clearly calling out library artifacts.
Why am I receiving the C4073 warning?
It's just warning you, not even a wag of the finger, it's more like "are you sure?" When you use a 3rd party library that uses this feature it warned its developer too - unless he turned it off with #pragma warning. You could do the same. Or you could use segment user not segment lib: your strings will still be constructed before your application code runs. Segment lib is really intended for things like ... oh, Qt or MFC or a framework like that that needs to be initialized before any application code runs, including your early initialization stuff in user.
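A minimal sketch of the "turn it off" option. The two pragmas are the MSVC ones discussed above; the LibraryConfig type and the g_config object are hypothetical stand-ins for the library's order-sensitive statics, and the _MSC_VER guard is only there so the file still compiles with other compilers.

```cpp
#include <string>

#ifdef _MSC_VER
// Silence "initializers put in library initialization area" for this file.
#pragma warning(disable : 4073)
// Statics defined in this translation unit are initialized in the "lib" group.
#pragma init_seg(lib)
#endif

namespace mylib {

// Hypothetical order-sensitive state (illustration only).
struct LibraryConfig {
    std::string name;
    std::string version;
};

LibraryConfig g_config{"mylib", "1.0"};

// Accessor used by the rest of the library.
const std::string& config_name() { return g_config.name; }

}  // namespace mylib
```

Swapping lib for user in the init_seg pragma should avoid the warning altogether, which is the other option mentioned above.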
Here's some more information:
Suppose you have a library for your own app. And it has some stuff that needs to be initialized before any of its code is run, because certain classes exposed by this library are intended to be (or, are allowed to be) statically allocated in your application code, and those classes do complicated stuff in their constructors that needs, say, some large bollix of precomputed (but not static) data. So for that precomputed stuff you precompute it in the constructor of a class (a different class) and you statically allocate an instance of that class, so that the initialization of that instance calls its constructor which precomputes all that stuff, and that static instance you mark with pragma init_seg(user). Now, before any of your application code runs - including any constructors of static instances of this library's class in your code - the library's init_seg(user) code will all run, and so, by the time your static instances of this library's classes get constructed the data they need will be there.
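The scenario above might look roughly like this. It is a sketch only: SquareTable, g_squares and square_of are made-up names, and the table of squares stands in for whatever "precomputed but not static" data the library really needs. The pragma is guarded so the file still compiles outside MSVC, where it has no equivalent.

```cpp
#include <array>
#include <cstddef>

#ifdef _MSC_VER
// Ask MSVC to run this file's static initializers in the "user" group,
// i.e. before ordinary (unmarked) static initializers in application code.
#pragma init_seg(user)
#endif

namespace mylib {

// Hypothetical helper whose constructor does the precomputation.
class SquareTable {
public:
    SquareTable() {
        for (std::size_t i = 0; i < values_.size(); ++i) {
            values_[i] = static_cast<int>(i * i);
        }
    }
    int at(std::size_t i) const { return values_[i]; }

private:
    std::array<int, 16> values_{};
};

// The static instance: constructing it fills the table.
SquareTable g_squares;

// Library code (including constructors of exported classes) reads the table.
int square_of(std::size_t i) { return g_squares.at(i); }

}  // namespace mylib
```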
Now consider stuff that really has got to be there early, of which static instances exist that you can call. E.g., std::cout. You can call methods on std::cout in constructors of your own classes that you have static instances of. Obviously, the object std::cout needs to be initialized before any of your code runs. And some of your code might run in stuff you've marked init_seg(user). So Microsoft places all such code - that is, the constructors for std::cout etc. - in init_seg(compiler). That stuff will run before anything else.
So what is init_seg(lib) for? Well, suppose you're a framework like MFC. You expose stuff like the Application object that the user will (is likely to, is expected to) create a static instance of. Some of the code in Application which will be run at static initialization time depends on other stuff in MFC that needs to be initialized. So obviously it needs to be init_segged, but where? compiler is for the compiler and runtimes alone, and your framework stuff (MFC) might be used in the init_seg(user) that the user is allowed to use ... so you get an intermediate level between compiler and user - that's lib.
Now it is fairly rare to need anything like that because most C++ libraries used by programs are not themselves used by multiple libraries and thus don't need to make sure they're initialized before all other "ordinary" libraries. MFC does, because you might have bought 3rd party MFC control or other libraries from one or more vendors, and those depend on MFC, and those might use MFC things in their constructors, and you might use objects from those libraries statically, so MFC needs to be initialized before those other libraries.
But in most cases, your library isn't going to be a dependency of other C++ libraries that people are using. Thus there isn't any kind of dependency chain where your library needs to be initialized before other libraries. Your libraries might need to be initialized before your user code, yes, but there isn't any order in which they need to be initialized. So init_seg(user) is fine for all of them.
And, Microsoft (and generally, most C++ experts) will tell you: If there is some kind of order dependency in which your separate libraries need to be initialized at the time of static initialization of your app then you're doing it wrong. Seriously. There's a design problem there.
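One common way to design such an order dependency away (not something this answer prescribes, just a widely used idiom) is to construct the shared object on first use inside a function, so that whoever needs it first triggers its initialization. A minimal sketch with hypothetical names:

```cpp
#include <string>

namespace mylib {

// Construct-on-first-use: the string is created the first time this
// function is called, regardless of static initialization order.
inline const std::string& default_name() {
    static const std::string name = "mylib-default";
    return name;
}

// A class whose constructor depends on that string.
class Session {
public:
    Session() : name_(default_name()) {}
    const std::string& name() const { return name_; }

private:
    std::string name_;
};

// A static instance is now safe without any init_seg pragma.
Session g_session;

}  // namespace mylib
```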
So, to respond to a comment: It isn't a compiler bug. It's a user warning. Fairly benign: Nothing's going to go wrong if you ignore it (unlike, say ignoring a warning about casting from long to int). But if you're using init_seg(lib) you might not really be understanding what that compiler feature is all about, so they'd like you to think about it. And after you've thought about it, if you still want to do it, go ahead and turn the warning off.
Related
Suppose now two C++ libraries are available: one library has all the functions that will be needed by the program (a C++ application program that will invoke the library), and the other one not only has the necessary functions that will be needed by the program but also has other functions that will not be used by the program. We assume that the common functions in both libraries are implemented in the same manner. My question is: when the program uses the library to perform a certain task, what's the effect of the library on the performance of the program?
The reason why I asked this question is because when developing a C++ library I often wrote some additional functions, which may not be invoked by the users of the library but are important for debugging. When the library is finished, I have two choices: one is to keep these auxiliary functions and the other is to remove them or use some other strategy to keep them (for example, defining a macro to disable these functions). If keeping these auxiliary functions will not deteriorate the performance, I would like to keep them.
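For reference, the macro strategy mentioned in the question could look something like the sketch below. MYLIB_ENABLE_DEBUG_HELPERS, mean and dump are hypothetical names chosen for illustration; the debugging helper is compiled only when the macro is defined at build time.

```cpp
#include <sstream>
#include <string>
#include <vector>

namespace mylib {

// A function the program actually uses.
inline double mean(const std::vector<double>& values) {
    if (values.empty()) return 0.0;
    double sum = 0.0;
    for (double v : values) sum += v;
    return sum / static_cast<double>(values.size());
}

#ifdef MYLIB_ENABLE_DEBUG_HELPERS
// Auxiliary function kept only in debugging builds.
inline std::string dump(const std::vector<double>& values) {
    std::ostringstream out;
    for (double v : values) out << v << ' ';
    return out.str();
}
#endif

}  // namespace mylib
```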
Everything else being the same, there will be no performance difference.
In addition, if the library is a static library, the linker pulls in only the object files that the program actually references (and, with function-level linking enabled, it can also discard unreferenced functions within them), so the unused functions generally do not end up in the executable and the executables will be about the same size.
Well, I guess you have written a static library. Then the only difference it will make is that the static library functionality will be part of your executable whether you use it or not.
I don't think it will hurt you in terms of speed, but yes, it will occupy a lot more space since a copy of the lib will be included in your executable.
Can a std::string be passed by value across DLL boundaries between DLLs built with different versions of Visual Studio?
No, because templated code is generated separately per module.
So, when your EXE instantiates a std::string and passes it to the DLL, the DLL will begin using a completely different implementation on it. The result is a total mess, but it often sorta almost works because implementations are very similar, or the mess is hard to detect because it's some kind of subtle heap corruption.
Even if they're both built with the same version of VS, it's very precarious / fragile, and I would not recommend it. Either use a C-style interface between modules (for example, COM), or just don't use a DLL.
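A C-style interface for handing a string across the boundary might look like the following sketch. The function name mylib_get_greeting and the "ask for the size, then fill a caller-owned buffer" convention are assumptions for illustration; in a real DLL the function would also carry the usual export decoration (for example __declspec(dllexport)).

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// DLL side: only plain C types cross the boundary. The caller owns the
// buffer, so no memory allocated in one module is freed in another.
// Returns the number of bytes required, including the terminating NUL.
extern "C" std::size_t mylib_get_greeting(char* buffer, std::size_t buffer_size) {
    static const char text[] = "hello from the DLL";
    const std::size_t needed = sizeof(text);
    if (buffer != nullptr && buffer_size >= needed) {
        std::memcpy(buffer, text, needed);
    }
    return needed;
}

// EXE side: wrap the C call back into the caller's own std::string.
inline std::string get_greeting() {
    const std::size_t needed = mylib_get_greeting(nullptr, 0);
    std::string result(needed, '\0');
    mylib_get_greeting(&result[0], result.size());
    result.resize(needed - 1);  // drop the terminating NUL
    return result;
}
```

Each module builds and destroys its own std::string with its own runtime, so the two implementations never have to agree on layout or allocator.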
More detailed explanation here: Creating c++ DLL without static methods
and here: How can I call a function of a C++ DLL that accepts a parameter of type stringstream from C#?
In general, you can not mix binary code built with different compilers, which includes different versions of the same compiler (and can even include the same compiler invoked with different commandline options), so the answer to what you are trying to do is a clear "No".
The reason is that different compilers might provide different implementations of std::string. For example, one implementation could have a fixed, static buffer, while another version doesn't, which already leads to different object sizes. There is a bunch of other things that can make the interfaces incompatible, like the underlying allocator, the internal representation. Some things will already fail to link, due to name mangling or different private APIs, which both protect you from doing something wrong.
Some notes:
Even if you didn't pass the object by value but by reference, the called code could have a different idea of what this object looks like.
It also doesn't matter that the type is supplied by the compiler, even if you defined the class yourself and compiled two DLLs with different versions of the class definition you would have problems. Also, if you change your standard library implementation, you make your binaries incompatible, too.
It doesn't even matter that the other code is in a DLL, it also applies to code in the same executable or DLL, although headers and automatic recompilation on change make this case very unlikely.
Specifically for MS Windows, you also have a debug heap and a release heap, and memory allocated in one must not be returned to the other. For that reason, you often have two DLLs, one with a 'd' suffix (the debug version) and one without. This is a case where the compiler settings already affect compatibility, but you can get around this using a parallel approach of providing two versions of your DLL, too.
To some degree, similar problems occur for C code, too, where compilers have to agree on e.g. struct layout and calling conventions. Due to greater age and lower complexity, different C compilers are effectively compatible though. This is also accepted as a necessary feature in C as opposed to C++.
I would like to know: is it possible to call code placed in a .dll which was built with a different tool-chain? And is it possible to use a .lib file built with an older compiler when building code with a newer one?
I know, the second one is not preferable, but I would like to know, is it impossible.
Precisely my case looks like this:
I have an a.exe file built with VC7.1 using a b.lib file which was also built with VC7.1. a.exe calls code from c.dll, which was also built using b.dll. Now I want to write a new c.dll, but compile it with VC9. (I want to do so because I need some libraries which do not support being built with VC7.1.) My c.dll also requires b.lib; I have the sources for it, so I can recompile it.
So, is it possible to make it work? If not, can you provide a brief explanations, what exactly disallows this?
It is not entirely impossible. The chief problem is that you'll inevitably end up with two distinct copies of the runtime library. Copies that each keep their own state and use their own memory allocator. The DLL interface has to be carefully designed to avoid the possible mishaps that can cause.
Hard rules are that you can never throw an exception from the code in the DLL and catch it in the EXE. And that you cannot return a standard C++ library object like std::string from your DLL code: the two sides have different implementations, and the EXE cannot properly destroy the object since it uses a different allocator. And the more general rule: the DLL can never return a pointer to an object that needs to be released by the caller. CRT state can cause subtle problems, like errno not returning the proper error code and locale being set wrong. All in all, plenty of misery that's very hard to diagnose and even harder to fix.
The COM programming model is an example of one that's safe. It never exposes implementation, only pure abstract interfaces. No exceptions, only error codes. Objects are allocated by a factory and reference counted. And where absolutely necessary, it uses a common heap to allocate from, CoTaskMemAlloc(). Not a popular programming model but that's what it takes.
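To make that shape concrete, here is a COM-flavoured sketch in plain C++. It is not actual COM (no IUnknown, no GUIDs); ICounter, CreateCounter and the error codes are hypothetical names, and the reference count is deliberately not thread-safe to keep the example short.

```cpp
#include <new>

// Error codes instead of exceptions (hypothetical values).
enum : int { kOk = 0, kInvalidArgument = 1, kOutOfMemory = 2 };

// What the EXE sees: a pure abstract interface, no data members.
struct ICounter {
    virtual void AddRef() = 0;
    virtual void Release() = 0;
    virtual int Increment(int* out_value) = 0;

protected:
    ~ICounter() = default;  // callers cannot delete; they must call Release()
};

namespace {

// What stays inside the DLL: the implementation and its allocation.
class CounterImpl final : public ICounter {
public:
    void AddRef() override { ++refs_; }
    void Release() override {
        if (--refs_ == 0) delete this;  // freed by the module that allocated it
    }
    int Increment(int* out_value) override {
        if (out_value == nullptr) return kInvalidArgument;
        *out_value = ++value_;
        return kOk;
    }

private:
    int refs_ = 1;   // the factory hands out the first reference
    int value_ = 0;
};

}  // namespace

// Factory exported from the DLL; reports failure through the return code.
extern "C" int CreateCounter(ICounter** out) {
    if (out == nullptr) return kInvalidArgument;
    *out = new (std::nothrow) CounterImpl();
    return *out != nullptr ? kOk : kOutOfMemory;
}
```

The caller only ever holds an ICounter* and ends its use with Release(), so allocation and deallocation both happen inside the DLL's runtime.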
I have a C++ class I'm writing now that will be used all over a project I'm working on. I have the option to put it in a static library, or export the class from a dll. What are the benefits/penalties for each approach. The only one I can think of is compiled code size which I don't really care about. Thanks!
Advantages of a DLL:
You can have multiple different exe's that access this functionality, so you will have a smaller project size overall.
You can dynamically update your component without replacing the whole exe. If you do this though be careful that the interface remains the same.
Sometimes like in the case of LGPL you are forced into using a DLL.
You could have some components as C#, Python or other languages that tie into your DLL.
You can build programs that consume your DLL that work with different versions of the DLL. For example you could check if a function exists in a certain operating system DLL and only call it if it exists, and otherwise do some other processing.
Advantages of Static library:
You cannot have DLL versioning problems that way.
Less to distribute, you aren't forced into a full installer if you only have a small application.
You don't have to worry about anyone else tying into your code that would have been accessible if it was a DLL.
Easier to develop a static library as you don't need to worry about exports and imports.
Memory management is easier.
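One of the most significant and often unnoted features of dynamic libraries on Windows is that a DLL can end up with its own heap, for example when it links its own copy of the C runtime. This can be an advantage or a disadvantage depending on your point of view, but you need to be aware of it: memory allocated on one side of the DLL boundary should be freed on the same side. Note also that on current (32-bit and 64-bit) Windows a global variable in a DLL is private to each process that loads the library; it is shared among processes only if you explicitly place it in a shared data section, which can be a useful form of de facto interprocess communication or the source of an obscure run-time error.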
Is it possible to implement monkey patching in C++?
Or any other similar approach to that?
Thanks.
Not portably so, and due to the dangers for larger projects you had better have a good reason.
The Preprocessor is probably the best candidate, due to its ignorance of the language itself. It can be used to rename attributes, methods and other symbol names - but the replacement is global at least for a single #include or sequence of code.
I've used that before to beat "library diamonds" into submission - Library A and B both importing an OS library S, but in different ways so that some symbols of S would be identically named but different. (namespaces were out of the question, for they'd have much more far-reaching consequences).
Similarly, you can replace symbol names with compatible-but-superior classes.
e.g. in VC, #import generates wrapper code that uses _bstr_t as a type adapter. In one project I've successfully replaced these _bstr_t uses with a compatible-enough class that interoperated better with other code, just by #define'ing _bstr_t as my replacement class for the #import.
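A self-contained sketch of that preprocessor trick, using made-up types instead of _bstr_t: VendorText stands in for the third-party class, BetterText for the replacement, and the lines between the #define and the #undef stand in for the #include (or #import) whose contents get retargeted.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Stand-in for the original third-party type (hypothetical).
struct VendorText {
    std::string value;
    VendorText(const char* s) : value(s) {}
};

// Compatible-enough replacement with an extra convenience method.
struct BetterText {
    std::string value;
    BetterText(const char* s) : value(s) {}
    std::size_t length() const { return value.size(); }
};

// Rename the symbol only for the span of the "header".
#define VendorText BetterText
// ---- begin: code that would normally come from the included header ----
inline VendorText make_label() { return VendorText("patched"); }
// ---- end ----
#undef VendorText

// After the #undef, VendorText means the original type again, but
// make_label() was compiled against the replacement.
static_assert(std::is_same<decltype(make_label()), BetterText>::value,
              "make_label() now returns the replacement type");
```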
Patching the Virtual Method Table - either replacing the entire VMT or individual methods - is something else I've come across. It requires good understanding of how your compiler implements VMTs. I wouldn't do that in a real life project, because it depends on compiler internals, and you don't get any warning when things have changed. It's a fun exercise to learn about the implementation details of C++, though. One application would be switching at runtime from an initializer/loader stub to a full - or even data-dependent - implementation.
Generating code on the fly is common in certain scenarios, such as forwarding/filtering COM Interface calls or mapping OS Window Handles to library objects. I'm not sure if this is still "monkey-patching", as it isn't really toying with the language itself.
To add to other answers, consider that any function exposed through a shared object or DLL (depending on platform) can be overridden at run-time. Linux provides the LD_PRELOAD environment variable, which can specify a shared object to load before all others, so that its definitions take precedence and can override arbitrary function definitions. It's actually about the best way to provide a "mock object" for unit-testing purposes, since it is not really invasive. However, unlike other forms of monkey-patching, be aware that a change like this is global. You can't specify one particular call to be different, without impacting other calls.
Considering the "guerilla third-party library use" aspect of monkey-patching, C++ offers a number of facilities:
const_cast lets you work around zealous const declarations.
#define private public prior to header inclusion lets you access private members.
subclassing and using Parent::protected_field lets you access protected members.
you can redefine a number of things at link time.
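Two of these facilities in a minimal sketch. legacy_length and VendorWidget are hypothetical stand-ins for third-party code; the const_cast is only safe here because the called function does not actually modify the buffer.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical third-party function that forgot to take const char*.
inline std::size_t legacy_length(char* text) { return std::strlen(text); }

// Hypothetical third-party class with a protected field.
class VendorWidget {
public:
    virtual ~VendorWidget() = default;

protected:
    int cache_size_ = 64;
};

// const_cast: call the non-const-correct API with read-only data.
inline std::size_t length_of(const std::string& s) {
    return legacy_length(const_cast<char*>(s.c_str()));
}

// Subclassing: a derived class may name the protected member, so it can
// form a pointer to it and read the field from any VendorWidget.
struct WidgetPeek : VendorWidget {
    static int cache_size(const VendorWidget& w) {
        return w.*(&WidgetPeek::cache_size_);
    }
};
```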
If the third party content you're working around is provided already compiled, though, most of the things feasible in dynamic languages aren't as easy, and often aren't possible at all.
I suppose it depends what you want to do. If you've already linked your program, you're gonna have a hard time replacing anything (short of actually changing the instructions in memory, which might be a stretch as well). However, before this happens, there are options. If you have a dynamically linked program, you can alter the way the linker operates (e.g. LD_LIBRARY_PATH environment variable) and have it link something else than the intended library.
Have a look at valgrind for example, which replaces (among a lot of other magic stuff it's dealing with) the standard memory allocation mechanisms.
As monkey patching refers to dynamically changing code, I can't imagine how this could be implemented in C++...