MinGW: Linking with LAPACK and BLAS causes C++ exceptions to become unhandled - c++

The situation is simple, but strange. When i compile my program without the LinearAlgebra.o source (which requires linking to LAPACK), C++ exceptions are caught and handled. When I do not include that compilation unit but still link to the libraries (-llapack -lblas), exceptions are caught and handled. But once I get it in there (the code from it runs just fine), C++ exceptions are no longer handled correctly, and I get Windows crash handler "Program has stopped responding report back to HQ" nonsense.
Here I shed light on what is going on inside this source file. I did keep it pretty simple, but I'm not sure if it's really Kosher.
I suspect it is something about invoking FORTRAN routines which causes C++ exceptions to stop working. But I have no idea how to go about fixing this.
UPDATE:
I am very glad to have found a temporary workaround for this issue: I am using MinGW's gfortran compiler to directly compile the LAPACK and BLAS routines I am currently using.
Linking those object files into my C++ project using -lgfortran with g++ works flawlessly, and my exceptions are still being correctly handled! As a bonus this allows me to only include what LAPACK routines I intend to use, so now I no longer have to link a ~4MB library.
Edit: I think if I statically link a library it only "grabs what it needs" so it being 4MB wouldn't matter in that case.

I have had great results with GotoBLAS2. Running the script included produces a massive 19MB static library optimized for my machine. It works flawlessly by simply linking it. All of my fortran style calls just work.

Related

Why would a C program compile and link with a C compiler against C++ libraries then SIGILL at runtime?

I recently had a problem with a 3rd party plugin that I wrote on IBM AIX. It took considerable time to track down. The plugin was in the form of a C executable. The executable was compiled against 3rd party libraries. A new version of those libraries was provided for a mandatory upgrade.
When compiled and linked with the IBM C compiler 12.1 my existing code would produce a binary that would work if the timezone was set using OLSON and crash with SIGILL and no backtrace available if the timezone was set using POSIX.
I was able to track down the crash to a call to the 3rd party API fairly quickly using debug printfs to a log file and flushing the log file. But it took some time for the 3rd party provider to mention that the new version of the API introduced the use of C++ libraries in their own code.
The problem was resolved by compiling with the IBM C compiler but linking with the IBM C++ linker (xlC).
So my questions are:
1) Why didn't the C linker fail to produce a valid executable?
A semi-valid binary must have been produced or the code would not have worked under OLSON timezone. That means all the symbols were present and any name mangling was handled (though possibly not correctly)
2) How do I determine if the binary produced by compiler and linker is valid?
The only way I can think to do this is to fully excercise the code as much as possible with unit testing.
3) How do I prevent similar from occuring for different code?
Should I always link with a C++ linker? That doesn't seem right to me.
I apologize that I can't post code, but I am not at liberty to do this.
Zsigmond has a lot of good statements. We need at least to see the link line and its options to help you much. Very often, open source links with "ignore unresolved symbols" because they don't understand the import files. So, the link will succeed in producing an utterly useless binary -- cause that is why it was asked to do.

Binary C++ library compatibility between VS2010 and VS2012?

I am confused about the binary compatibility of compiled libraries between VS2010 and VS2012. I would like to migrate to VS2012, however many closed-source binary-only SDKs are only out for VS2010, for example SDKs for interfacing hardware devices.
Traditionally, as far as I know Visual Studio was extremely picky about compiler versions, and in VS2010 you could not link to libraries which have been compiled for VS2008.
The reason I'm confused now, is that I'm in the process of migrating to VS2012, and I have tried a few projects, and for my biggest surprise, many of them work cross-versions with no problems.
Note: I'm not talking about the v100 mode, which as far as I know is just a VS2012 GUI over the VS2010 compiling engine.
I'm talking about opening a VS2010 solution in VS2012, click update, and seeing what happens.
When linking to some bigger libraries, like boost, the compiling didn't work, as there are checks for compiler version and they raise an error and abort compilation. Some other libraries just abort at missing functions. This is the behaviour what I was expecting.
On the other hand, many libraries work fine with no errors or additional warnings.
How is it possible? Was VS2012 made in a special way to keep binary compatibility with VS2010 libraries? Does it depend on dynamic vs. static linking?
And the most important question: even though there is no error raised at compile time, can I trust the compiler that there won't be any bugs when linking a VS2012 project to VS2010 compiled libraries?
“Many libraries work fine. How is it possible?”
1) the lib is compiled to use static RTL, so the code won't pull in a second RTL DLL that clashes.
2) the code only calls functions (and uses structures, etc.) that is completely in the header files
so doesn't cause a linker error, or
calls functions that are still present in the new RTL so doesn't cause a linker error,
3) doesn't call anything with structures that have changed layout or meaning so it doesn't crash.
#3 is the one to worry about. You can use imports to see what it uses and make a complete list, but there is no documentation or guarantee as to which ones are compatible. Just because it seems to run doesn't mean it doesn't have a latent bug.
There is also the possibility of
4) the driver SDK or other code that's fairly low level was written to avoid using standard library calls completely.
also (not your situation I think) DLLs can be isolated and have their own RTL and not pass stuff back and forth (like memory allocation and freeing) between different regimes. In-proc COM servers work this way. DLLs can do this in general if you're careful about what you pass and return and what you do with things like pointers. Crypto++ for example is initialized with the memory routines from the enclosing program and doesn't expose malloc'ed memory from the RTL version it was compiled with.

Changes in lines that will not execute breaks the build !

I have this job of implementing a library that provides a file-sharing feature.
This has already happened twice:
First, in a string in an if-else path, only the if path is being executed, but when i change a spelling in the else path, the software after a few minutes crashes in an std library. I verified with a debug attached that the lined changed was never being touched. When i reversed the change, it works nicely again.
Second, my software crashes on a std library again with the out-of-array check into a standard basic_string destructor.
I did everything, all library matched the _HAS_ITERATOR_DEBUGGING.
After 4 hours I discovered that the problematic file is TorrentFile.cpp/h.
If i add a function ( even though it is never called ), the program crashes at the end of that file, but if its not there, there's no bug. The code causing the problem:
std::vector<TorrentFileListPacket> TorrentFile::GetFileMap()
{
std::vector<TorrentFileListPacket> vFiles;
return vFiles;
};
If i comment this code out, the crash is gone.
This is really driving me crazy!
I've been a developer for 8 years, and I've never seen something like this before!
Additional Information
My memory is OK, I'm using Visual Studio 2010 with SP1 in Windows 7. The library is libTorrent from RasterBar and it links to boost. The software is using MFC.
This smells strongly of memory corruption in a totally different location from where you would expect from the crashes. Most likely adding and removing functions is changing the memory layout in such a way that causes the memory corruption's effects to be immediately visible or not.
Your best hope is something like Purify or Valgrind to hunt it down.
You probably want to make sure that all your object files and libraries are ABI compatible with each other.
Numerous compiler settings will change the ABI. Especially debug and release builds and iterator debugging. The struct layout for standard containers typically change when you enable iterator debugging (which I believe is on by default for all debug builds in msvc, and off for release builds).
So, if a single object file, static library or DLL that you link against is built with an incompatible configuration, you typically see very odd behaviors. With libtorrent you need to make sure you build the library with the same configuration as you link against it with. Many of the TORRENT_* defines will actually change some aspect of some struct layout or function call. Make sure you define the exact same set of those in your client as when building the library. One simple way of dealing with this problem is to simply pull all source files into your project and build everything together.
If you are using libtorrent as a DLL (or boost for that matter), are they compiled against the same C Run-Time?
Often when I run into this type of issue it is because I make a call into a library that was compiled with MinGW (which uses the CRT from VS6.0) or an older version of Visual Studio. If memory is allocated by the library and then free'd by your application, you will often get these types of errors in the destructor.
If you aren't sure, you can open the DLL in question in a tool like the Dependency Walker. Look for the dependency MSVCRT.DLL, MSVCR100.DLL, etc.

segfault during __cxa_allocate_exception in SWIG wrapped library

While developing a SWIG wrapped C++ library for Ruby, we came across an unexplained crash during exception handling inside the C++ code.
I'm not sure of the specific circumstances to recreate the issue, but it happened first during a call to std::uncaught_exception, then after a some code changes, moved to __cxa_allocate_exception during exception construction. Neither GDB nor valgrind provided any insight into the cause of the crash.
I've found several references to similar problems, including:
http://wiki.fifengine.de/Segfault_in_cxa_allocate_exception
http://forums.fifengine.de/index.php?topic=30.0
http://code.google.com/p/osgswig/issues/detail?id=17
https://bugs.launchpad.net/ubuntu/+source/libavg/+bug/241808
The overriding theme seems to be a combination of circumstances:
A C application is linked to more than one C++ library
More than one version of libstdc++ was used during compilation
Generally the second version of C++ used comes from a binary-only implementation of libGL
The problem does not occur when linking your library with a C++ application, only with a C application
The "solution" is to explicitly link your library with libstdc++ and possibly also with libGL, forcing the order of linking.
After trying many combinations with my code, the only solution that I found that works is the LD_PRELOAD="libGL.so libstdc++.so.6" ruby scriptname option. That is, none of the compile-time linking solutions made any difference.
My understanding of the issue is that the C++ runtime is not being properly initialized. By forcing the order of linking you bootstrap the initialization process and it works. The problem occurs only with C applications calling C++ libraries because the C application is not itself linking to libstdc++ and is not initializing the C++ runtime. Because using SWIG (or boost::python) is a common way of calling a C++ library from a C application, that is why SWIG often comes up when researching the problem.
Is anyone out there able to give more insight into this problem? Is there an actual solution or do only workarounds exist?
Thanks.
Following Michael Dorgan's suggestion, I'm copying my comment into an answer:
Found the real cause of the problem. Hopefully this will help someone else encountering this bug. You probably have some static data somewhere that is not being properly initialized. We did, and the solution was in boost-log for our code base. https://sourceforge.net/projects/boost-log/forums/forum/710022/topic/3706109. The real problem is the delay loaded library (plus statics), not the potentially multiple versions of C++ from different libraries. For more info: http://parashift.com/c++-faq-lite/ctors.html#faq-10.13
Since encountering this problem and its solution, I've learned that it's important to understand how statics are shared or not shared between your statically and dynamically linked libraries. On Windows this requires explicitly exporting the symbols for the shared statics (including things like singletons meant to be accessed across different libraries). The behavior is subtly different between each of the major platforms.
I recently ran into this problem as well. My process creates a shared object module that is used as a python C++ extension. A recent OS upgrade from RHEL 6.4 to 6.5 exposed the problem.
Following the tips here, I merely added -lstdc++ to my link switches and that solved the problem.
Having the same problem using SWIG for Python with a cpp library (Clipper), I found that using LD_PRELOAD as you suggested works for me too.
As another workaround which doesn't require LD_PRELOAD, I found that I can also link the libstdc++ into the .so library file of my module, e.g.
ld -shared /usr/lib/i386-linux-gnu/libstdc++.so.6 module.o module_wrap.o -o _module.so
I can then import it in python without any further options.
I realise that #lefticus accepted the answer relating to what I guess amounts to undefined static init order; however, I had a very similar problem, this time with boost::python.
I tried my damndest to find any static initilisation issues and couldn't - to the point that I refactored a major chunk of our codebase; and when that didn't work ended up removing exceptions altogether.
However, some more crept in and we started getting these segfaults again.
After some more investigation I came across this link which talks about custom allocators.
We do indeed use tcmalloc ourselves; and after I removed it from our library which is exported to boost::python we had no more issues!
So just an FYI to anyone who stumbles across this thread - if #lefticus's answer doesn't work, check if you're using a different allocator to that which python uses.

Adding Boost makes Debug build depend on "non-D" MSVC runtime DLLs

I have an annoying problem which I might be able to somehow circumvent, but on the other hand would much rather be on top of it and understand what exactly is going on, since it looks like this stuff is really here to stay.
Here's the story: I have a simple OpenGL app which works fine: never a major problem in compiling, linking, or running it. Now I decided to try to move some of the more intensive calculations into a worker thread, in order to possibly make the GUI even more responsive — using Boost.Thread, of course.
In short, if I add the following fragment in the beginning of my .cpp file:
#include <boost/thread/thread.hpp>
void dummyThreadFun() { while (1); }
boost::thread p(dummyThreadFun);
, then I start getting "This application has failed to start because MSVCP90.dll was not found" when trying to launch the Debug build. (Release mode works ok.)
Now looking at the executable using the Dependency Walker, who also does not find this DLL (which is expected I guess), I could see that we are looking for it in order to be able to call the following functions:
?max#?$numeric_limits#K#std##SAKXZ
?max#?$numeric_limits#_J#std##SA_JXZ
?min#?$numeric_limits#K#std##SAKXZ
?min#?$numeric_limits#_J#std##SA_JXZ
Next, I tried to convert every instance of min and max to use macros instead, but probably couldn't find all references to them, as this did not help. (I'm using some external libraries for which I don't have the source code available. But even if I could do this — I don't think it's the right way really.)
So, my questions — I guess — are:
Why do we look for a non-debug DLL even though working with the debug build?
What is the correct way to fix the problem? Or even a quick-and-dirty one?
I had this first in a pretty much vanilla installation of Visual Studio 2008. Then tried installing the Feature Pack and SP1, but they didn't help either. Of course also tried to Rebuild several times.
I am using prebuilt binaries for Boost (v1.36.0). This is not the first time I use Boost in this project, but it may be the first time that I use a part that is based on a separate source.
Disabling incremental linking doesn't help. The fact that the program is OpenGL doesn't seem to be relevant either — I got a similar issue when adding the same three lines of code into a simple console program (but there it was complaining about MSVCR90.dll and _mkdir, and when I replaced the latter with boost::create_directory, the problem went away!!). And it's really just removing or adding those three lines that makes the program run ok, or not run at all, respectively.
I can't say I understand Side-by-Side (don't even know if this is related but that's what I assume for now), and to be honest, I am not super-interested either — as long as I can just build, debug and deploy my app...
Edit 1: While trying to build a stripped-down example that anyway reproduces the problem, I have discovered that the issue has to do with the Spread Toolkit, the use of which is a factor common to all my programs having this problem. (However, I never had this before starting to link in the Boost stuff.)
I have now come up with a minimal program that lets me reproduce the issue. It consists of two compilation units, A.cpp and B.cpp.
A.cpp:
#include "sp.h"
int main(int argc, char* argv[])
{
mailbox mbox = -1;
SP_join(mbox, "foo");
return 0;
}
B.cpp:
#include <boost/filesystem.hpp>
Some observations:
If I comment out the line SP_join of A.cpp, the problem goes away.
If I comment out the single line of B.cpp, the problem goes away.
If I move or copy B.cpp's single line to the beginning or end of A.cpp, the problem goes away.
(In scenarios 2 and 3, the program crashes when calling SP_join, but that's just because the mailbox is not valid... this has nothing to do with the issue at hand.)
In addition, Spread's core library is linked in, and that's surely part of the answer to my question #1, since there's no debug build of that lib in my system.
Currently, I'm trying to come up with something that'd make it possible to reproduce the issue in another environment. (Even though I will be quite surprised if it actually can be repeated outside my premises...)
Edit 2: Ok, so here we now have a package using which I was able to reproduce the issue on an almost vanilla installation of WinXP32 + VS2008 + Boost 1.36.0 (still pre-built binaries from BoostPro Computing).
The culprit is surely the Spread lib, my build of which somehow requires a rather archaic version of STLPort for MSVC 6! Nevertheless, I still find the symptoms relatively amusing. Also, it would be nice to hear if you can actually reproduce the issue — including scenarios 1-3 above. The package is quite small, and it should contain all the necessary pieces.
As it turns out, the issue did not really have anything to do with Boost.Thread specifically, as this example now uses the Boost Filesystem library. Additionally, it now complains about MSVCR90.dll, not P as previously.
Boost.Thread has quite a few possible build combinations in order to try and cater for all the differences in linking scenarios possible with MSVC. Firstly, you can either link statically to Boost.Thread, or link to Boost.Thread in a separate DLL. You can then link to the DLL version of the MSVC runtime, or the static library runtime. Finally, you can link to the debug runtime or the release runtime.
The Boost.Thread headers try and auto-detect the build scenario using the predefined macros that the compiler generates. In order to link against the version that uses the debug runtime you need to have _DEBUG defined. This is automatically defined by the /MD and /MDd compiler switches, so it should be OK, but your problem description suggests otherwise.
Where did you get the pre-built binaries from? Are you explicitly selecting a library in your project settings, or are you letting the auto-link mechanism select the appropriate .lib file?
I believe I have had this same problem with Boost in the past. From my understanding it happens because the Boost headers use a preprocessor instruction to link against the proper lib. If your debug and release libraries are in the same folder and have different names the "auto-link" feature will not work properly.
What I have done is define BOOST_ALL_NO_LIB for my project(which prevents the headers from "auto linking") and then use the VC project settings to link against the correct libraries.
Looks like other people have answered the Boost side of the issue. Here's a bit of background info on the MSVC side of things, that may save further headache.
There are 4 versions of the C (and C++) runtimes possible:
/MT: libcmt.lib (C), libcpmt.lib (C++)
/MTd: libcmtd.lib, libcpmtd.lib
/MD: msvcrt.lib, msvcprt.lib
/MDd: msvcrtd.lib, msvcprtd.lib
The DLL versions still require linking to that static lib (which somehow does all of the setup to link to the DLL at runtime - I don't know the details). Notice in all cases debug version has the d suffix. The C runtime uses the c infix, and the C++ runtime uses the cp infix. See the pattern? In any application, you should only ever link to the libraries in one of those rows.
Sometimes (as in your case), you find yourself linking to someone else's static library that is configured to use the wrong version of the C or C++ runtimes (via the awfully annoying #pragma comment(lib)). You can detect this by turning your linker verbosity way up, but it's a real PITA to hunt for. The "kill a rodent with a bazooka" solution is to use the /nodefaultlib:... linker setting to rule out the 6 C and C++ libraries that you know you don't need. I've used this in the past without problem, but I'm not positive it'll always work... maybe someone will come out of the woodwork telling me how this "solution" may cause your program to eat babies on Tuesday afternoons.
This is a classic link error. It looks like you're linking to a Boost DLL that itself links to the wrong C++ runtime (there's also this page, do a text search for "threads"). It also looks like the boost::posix::time library links to the correct DLL.
Unfortunately, I'm not finding the page that discusses how to pick the correctly-built Boost DLL (although I did find a three-year-old email that seems to point to BOOST_THREAD_USE_DLL and BOOST_THREAD_USE_LIB).
Looking at your answer again, it appears you're using pre-built binaries. The DLL you're not able to link to is part of the TR1 feature pack (second question on that page). That feature pack is available on Microsoft's website. Or you'll need a different binary to link against. Apparently the boost::posix::time library links against the unpatched C++ runtime.
Since you've already applied the feature pack, I think the next step I would take would be to build Boost by hand. That's the path I've always taken, and it's very simple: download the BJam binary, and run the Boost Build script in the library source. That's it.
Now this got even a bit more interesting... If I just add this somewhere in the source:
boost::posix_time::ptime pt = boost::posix_time::microsec_clock::universal_time();
(together with the corresponding #include stuff), then it again works ok. So this is one quick and not even too dirty solution, but hey — what's going on here, really?
From memory various parts of the boost libraries need you to define some preprocessor flags in order to be able to compile correctly. Stuff like BOOST_THREAD_USE_DLL and so on.
The BOOST_THREAD_USE_DLL won't be what's causing this particular error, but it may be expecting you to define _DEBUG or something like that. I remember a few years ago in our boost C++ projects we had quite a few extra BOOST_XYZ preprocessor definitions declared in the visual studio compiler options (or makefile)
Check the config.hpp file in the boost thread directory. When you pull in the ptime stuff it's possibly including a different config.hpp file, which may then define those preprocessor things differently.