Convert from MinGW .a to VC++ .lib - c++

I have an old library (in C++) that I'm only able to build on MinGW+MSYS32 on Windows. From that I can produce a .a static library file generated from GNU libtool. My primary development is done in Visual Studio 2008. If I try to take the MinGW produced library and link it in Visual Studio, it complains about missing externals. This is most likely due to the C++ symbol mangling that is done, and that it doesn't match whats in the .a file. Is there any know way to convert a static .a file to VC++ library format?

If symbols are mangled differently, the compilers are using a different ABI and you won't be able to "convert" or whatever the compiled libraries: the reason names are mangled in different ways is intentional to avoid users from ever succeeding with building an executable from object files using different ABIs. The ABI defines how arguments are passed, how virtual function tables are represented, how exceptions are thrown, the size of basic data types, and quite a number of other things. The name mangling is also part of this and carefully chosen to be efficient and different from other ABIs on the same platform. The easiest way to see if the name mangling is indeed different is to compiler a file with a couple of functions using only basic types with both compilers and have a look at the symbols (on UNIX I would use nm for this; I'm not a Windows programmer and thus I don't know what tool to use there).
Based on Wikipedia it seems that MinGW also uses a different run-time library: the layout and the implementation of the standard C++ library used by MinGW (libstdc++) and Visual Studio (Dinkumware) is not compatible. Even if the ABI between the two compilers is the same, the standard C++ library used isn't. If the ABI is the same you would need to replace set things up so that one compiler uses the standard library of the respective other compiler. There is no way to do this with the already compiled library. I don't know if this is doable.
Personally I wouldn't bother with trying to link things between these two different compiler because it is bound not to work at all. Instead, you'd need to port the implementation to a common compiler and go from there.

Search for the def file and run the command e.g. lib /machine:i386 /def:test.def it will generate
an import lib file.

Related

Why can some libraries built by older compilers link against modern code, and others cannot?

We have a lot of prebuilt libraries (via CMake mostly), built using Visual Studio 2017 v141. When we try to use these against a project using Visual STudio 2019 v142 we see errors like:
Error C1047 The object or library file
‘boost_chrono-vc141-mt-gd-x32-1_68.lib’ was created by a different
version of the compiler than other objects...
On the other hand, we also use pre-compiled .libs from 3rd-party vendors which are over a decade old and these have worked just fine when linked against our codebase.
What determines whether a library needs to be rebuilt, and why can some ancient libraries still be used when others that are only one version behind cannot?
ABI incompatibilities could cause some issues. Even though the C++ standard calls for objects such as std::vector and std::mutex and that they need to have specific public/protected members, how these classes are made is left to the implementation.
In practice, it means that nothing prevents the GNU standard library from having their data fields in another orders than the LLVM standard library, or having completely different private members.
As such, if you try to use a function from a library built with the LLVM libc++ by sending it a GNU libstdc++ vector it causes UB. Even on the same standard library, different versions could have changed something and that could be a problem.
To avoid these issues, popular C++ libraries only use C data structures in their ABIs since (at least for now) every compiler produces the same memory layout for a char*, an int or a struct.
These ABI issues can appears in two places:
When you use dynamic libraries (.so and .dll files) your compiler probably won't say anything and you'll get undefined behavior when you call a function of the library using incompatible C++ objects.
When you use static libraries (.a and .lib files) I'm not really sure, I'm guessing it could either print an error if it sees there's gonna be a problem or successfully compile some Frankenstein monster of a binary that will behave like the above point
I will try to answer some integral parts, but be aware this answer could be incomplete. With more information from peers we will maybe be able to construct a full answer!
The simples kind of linking is linking towards a C library. Since there is no concept of classes and overloading function names, the compiler creators are able to create entry points to functions by their pure name. This seems to be pretty much quasi-standardized since, I myself, haven't encountered a pure C library not at least linkable to my projects. You can select this behaviour in C++ code by prepending a function declaration with extern "C". (This also makes it easy to link against a library from C# code) Here is a detailed explanation about extern "C". But as far as I am aware this behaviour is not standardized; it is just so simple - it seems - there is just one sane solution.
Going into C++ we start to encounter function, variable and struct names repeating. Lets just talk about overloaded functions here. For that compiler creators have to come up with some kind of mapping between void a(); void a(int x); void a(char x); ... and their respective library representation. Since this process also is not standardized (see this thread) and this process is far more complex than the 1 to 1 mapping of C, the ABIs of different compilers or even compiler versions can differ in any way.
Now given two compilers (or linkers I couldn't find a resource wich specifies wich one exactly is responsible for the mangling but since this process is not standardized it could be also outsourced to cthulhu) with different name mangling schemes, create following function entry points (simplified):
compiler1
_a_
_a_int_
_a_char_
compiler2
_a_NULL_
_a_++INT++_
_a_++CHAR++_
Different linkers will not understand the output of your particular process; linker1 will try to search for _a_int_ in a library containing only _a_++INT++_. Since linkers can't use fuzzy string comparison (this could lead to a apocalypse imho) it won't find your function in the library. Also don't be fooled by the simplicity of this example: For every feature like namespace, class, method etc. there has to be a method implemented to map a function name to a entry point or memory structure.
Given your example you are lucky you use libraries of the same publisher who coded some logic to detect old libraries. Usually you will get something along the lines of <something> could not be resolved or some other convoluted, irritating and/or unhelpful error message.
Some info and experience dump regarding Visual Studio and libraries in general:
In general the Visual C++ suite doesn't support crosslinked libs between different versions but you could be lucky and it works. Don't rely on it.
Since VC++ 2015 the ABI of the libraries is guaranteed by microsoft to be compatible as drescherjm commented: link to microsoft documentation
In general when using libraries from different suites you should always be cautious as n. 1.8e9-where's-my-share m. commented here (here is your share btw) about dependencies to other libraries and runtimes. In general in general not having the control over how libraries are built is a huge pita
Edit addressing memory layout incompatibilities in addition to Tzigs answer: different name mangling schemes seem to be partially intentional to protect users against linkage against incompatible libraries. This answer goes into detail about it. The relevant passage from gcc docs:
G++ does not do name mangling in the same way as other C++ compilers. This means that object files compiled with one compiler cannot be used with another.
This effect is intentional [...].
Error C1047
This is caused by /GL Global optimization or /LTGC Link Time Code Generation
These use information in the .obj, to perform global optimizations. When present, VS looks at the compiler which generated the original .lib, and if they are different emits the error. These compilation switches are for code from a single compiler, and not intended for cross version usage.
The other builds which work, don't have the switches, so are compatible.
Visual studio has started to use a new #pragma detect_mismatch
This causes an old build to identify it is incompatible with a new build, by detecting the version change.
Very old builds didn't have / support the pragma, so had no checking.
When you build a lib, its dependencies are loaded and satisified by the linker, but this is not a guarantee of working. The one-definition-rule signs the developer up to a contract, that within a compiled binary, all implementations of the same named function are the same. If this came from different compilers, that may not be true, and so the linker can choose any, causing latent bugs, where mixtures of old and new code are linkeded into the binary.
If the definition or implementation of std::string has changed, it may link, but have code which is flawed.
This new compiler check, causes a fail early, which I thoroughly approve of.

Distributed dll and c++ objects - how some libraries handle this?

Recently I found that passing/exposing objects to/from dll is dangerous if executable and dll was compiled with different compilers or even with different version of the same compiler or even with different settings of the same compiler! To say I was surprised is to say nothing - all this time I thought I can create a dynamic library and safely distribute it.
Now having this knowledge I wonder how certain libraries work? Let's take a dll in system32 folder. With dependency Walker I can see that in, say, d3d11.dll functions exposed with C parameter - I guess this is C convention which gives us a guarantee it will work. Also I know that d3d11 works with COM objects, so there's no risk of creating an object in dll and destroying it in exe.
But now let's take any Qt dll - the functions exported as c++ and have all their names mangled. I remember when I installed the framework I specified my compiler - MSVC2013, but I didn't specified a version! I can use a dll compiled with update1 compiler in update5 compiler and I can't remember any strange behavior. How it works then? How they handled mangled function name resolve, parameters passing?
The usual reason that a library under windows has to be compiled with the same compiler is that the C++ standard library is not ABI compatible between different version of MSVC++. If a library uses the C++ Standard library across the dll boundary this is ok as long as the C++ standard library is identical in the library and the compiler that uses the library.
Qt expose there own strings (QString) and vectors(QVector) etc, instead of std::string and std::vector, therefore Qt doesn't expose the C++ standard library through the dll boundary, hence the Qt dll works with all versions of Visual Studio.

Where is the standard library?

I have searched Google but haven't found quite a direct answer to my queries.
I have been reading C++ Primer and I'm still quite new to the language, but despite how good the book is it discusses the use of the standard library but doesn't really describe where it is or where it comes from (it hasn't yet anyway). So, where is the standard library? Where are the header files that let me access it? When I downloaded CodeBlocks, did the STL come with it? Or does it automatically come with my OS?
Somewhat related, but what exactly is MinGW that came with Cobeblocks? Here it says
MinGW is a C/C++ compiler suite which allows you to create Windows executables without dependency on such DLLs
So at the most basic level is it just a collection of "things" needed to let me make C++ programs?
Apologies for the quite basic question.
"When I downloaded CodeBlocks, did the STL come with it?"
Despite it's not called the STL, but the C++ standard library, it comes with your c++ compiler implementation (and optionally packaged with the CodeBlocks IDE).
You have to differentiate IDE and compiler toolchain. CodeBlocks (the Integrated Development Environment) can be configured to use a number of different compiler toolchains (e.g. Clang or MSVC).
"Or does it automatically come with my OS?"
No, usually not. Especially not for Windows OS
"So, where is the standard library? Where are the header files that let me access it?"
They come with the compiler toolchain you're currently using for your CodeBlocks project.
Supposed this is the MinGW GCC toolchain and it's installed in the default directory, you'll find the libraries under a directory like (that's what I have)
C:\MinGW\lib\gcc\mingw32\4.8.1
and the header files at
C:\MinGW\lib\gcc\mingw32\4.8.1\include\c++
"So at the most basic level is it just a collection of "things" needed to let me make C++ programs?"
It's the Minimalist GNU toolchain for Windows. It usually comes along with the GCC (GNU C/C++ compiler toolchain), plus the MSYS minimalist GNU tools environment (including GNU make, shell, etc.).
When you have installed a C++ implementation you'll have something which implements everything necessary to use C++ source files and turn them into something running. How that is done exactly depends on the specific C++ implementation. Most often, there is a compiler which processes individual source file and translates them into object files which are then combined by a linker to produce an actual running program. That is by no means required and, e.g., cling directly interprets C++ code without compiling.
All this is just to clarify that there is no one way how C++ is implemented although the majority of contemporary C++ implementations follow the approach of compiler/linker and provide libraries as a collection of files with declarations and library files providing implementations of these declarations.
Where the C++ standard library is located and where its declarations are to be found entirely depends on the C++ implementations. Oddly, all C++ implementations I have encountered so far except cling do use a compiler and all these compilers support a -E option (although it is spelled /E for MSVC++) which preprocesses a C++ file. The typically quite large output shows locations of included files pointing at the location of the declarations. That is, something like this executed on a command line yields a file with the information about the locations:
compiler -E input.cpp > input.ii
How the compiler compiler is actually named entirely depends on the C++ implementation and is something like g++, clang++, etc. The file input.cpp is supposed to contain a suitable include directive for one of the standard C++ library headers, e.g.
#include <iostream>
Searching in the output input.ii should reveal the location of this header. Of course, it is possible that the declarations are made available by the compiler without actually including a file but just making declarations visible. There used to be a compiler like this (TenDRA) but I'm not aware of any contemporary compiler doing this (with modules being considered for standardization these may come back in the future, though).
Where the actual library file with the objects implementing the various declarations is located is an entirely different question and locating these tends to be a bit more involved.
The C++ implementation is probably installed somehow when installing CodeBlocks. I think it is just one package. On systems with a package management system like dpkg on some Linuxes it would be quite reasonable to just have the IDE have a dependency on the compiler (e.g., gcc for CodeBlocks) and have the compiler have a dependency on the standard C++ library (libstdc++ for gcc) and have the package management system sort out how things are installed.
There are several implementations of the C++ standard library. Some of the more popular ones are libstdc++, which comes packaged with GCC, libc++, which can be used with Clang, or Visual Studio's implementation by Microsoft. They use a licensed version of Dinkumware's implementation. MinGW contains a port of GCC. CodeBlocks, an IDE, allows you to choose a setup which comes packaged with a version of MinGW, or one without. Either way, you can still set up the IDE to use a different compiler if you choose. Part of the standard library implementation will also be header files, not just binaries, because a lot of it is template code (which can only be implemented in header files.)
I recommend you read the documentation for the respective technologies because they contain a lot of information, more than a tutorial or book would:
libstdc++ faq
MinGW faq
MSDN

Are compiled .lib files interchangeable for different versions of Microsoft Visual C++?

Some projects provide a single set of "Windows" binaries for C (and possible C++ - not sure) libraries. For example, see the links on the right side of this libxml-related page.
I'm pretty sure there's no way to convert between VC++ .lib files and MinGW GCC .a files, so calling them "Windows" rather than "Microsoft" binaries seems a tad misleading. But I'm also surprised that there's no apparent need for different binaries for different VC++ versions.
I seem to remember, many years ago, having problems writing plugins for tracker-style music program (Jeskola Buzz) because that program was using VC++6, and I had upgraded to VC++7. I don't remember the exact issue - it may have been partly DLL related, but I know those don't need to care about VC++ version. I think the issue related to the .lib files provided and maybe also to the runtime libraries that they linked to. It was a long while ago, though, so it's all a bit vague.
Anyway, can libraries compiled by one version of MS VC++ be linked into projects built with another version? What limitations apply, if any?
I'm interested in both C and C++ libraries, which will be called from C++ projects (I rarely use C, except for C libraries called from C++).
The MS COFF format (.lib, .obj, etc) is the same for all VC++ versions and even for other languages.
The problem is that .obj files depend on other .obj (.lib) files.
For C++ code, there is a happy chance that code won't compile with new version of VC++ standard library implementation. For example, an old version of CRT used extern "C++" void internal_foo(int), and newer CRT uses extern "C++" void internal_foo(int, int), so linker will fail with "unresolved external symbol" error.
For C code, there is a chance that code will compile, because for extern "C", symbol names doesn't encode whole signature. But at runtime application will crash after this function will be called.
The same thing can happen if layout of some data structure will change, linker will not detect it.

MS vs Non-MS C++ compiler compatibility

Thinking of using MinGW as an alternative to VC++ on Windows, but am worried about compatibility issues. I am thinking in terms of behaviour, performance on Windows (any chance a MinGW compiled EXE might act up). Also, in terms of calling the Windows API, third-party DLLs, generatic and using compatible static libraries, and other issues encountered with mixing parts of the same application with the two compilers.
First, MinGW is not a compiler, but an environment, it is bundled with gcc.
If you think of using gcc to compile code and have it call the Windows API, it's okay as it's C; but for C++ DLLs generated by MSVC, you might have a harsh wake-up call.
The main issue is that in C++, each compiler has its own name mangling (or more generally ABI) and its own Standard library. You cannot mix two different ABI or two different Standard Libraries. End of the story.
Clang has a specific MSVC compatibility mode, allowing it to accept code that MSVC accepts and to emit code that is binary compatible with code compiled with MSVC. Indeed, it is even officially supported in Visual Studio.
Obviously, you could also simply do the cross-DLL communication in C to circumvent most issues.
EDIT: Kerrek's clarification.
It is possible to compile a large amount of C++ code developed for VC++ with the MinGW toolchain; however, the ease with which you complete this task depends significantly on how C++-standards-compliant the code is.
If the C++ code utilizes VC++ extensions, such as __uuidof, then you will need to rewrite these portions.
You won't be able to compile ATL & MFC code with MinGW because the ATL & MFC headers utilize a number of VC++ extensions and depend on VC++-specific behaviors:
try-except Statements
__uuidof
throw(...)
Calling a function without forward-declaring it.
__declspec(nothrow)
...
You won't be able to use VC++-generated LIB files, so you can't use MinGW's linker, ld, to link static libraries without recompiling the library code as a MinGW A archive.
You can link with closed-source DLLs; however, you will need to export the symbols of the DLL as a DEF file and use dlltool to make the corresponding A archive (similar to the VC++ LIB file for each DLL).
MinGW's inclusion of the w32api project basically means that code using the Windows C API will compile just fine, although some of the newer functions may not be immediately available. For example, a few months ago I was having trouble compiling code that used some of the "secure" functions (the ones with the _s suffix), but I got around this problem by exporting the symbols of the DLL as a DEF, preparing an up-to-date A archive, and writing forward declarations.
In some cases, you will need to adjust the arguments to the MinGW preprocessor, cpp, to make sure that all header files are properly included and that certain macros are predefined correctly.
What I recommend is just trying it. You will definitely encounter problems, but you can usually find a solution to each by searching on the Internet or asking someone. If for no other reason, you should try it to learn more about C++, differences between compilers, and what standards-compliant code is.