Crash when running application due to existence of unexecuted code in source file - c++

I'm working on a pretty tricky problem that I've been on for literally a week now. I've hit a very hard wall and my forehead hurts from banging it so I'm hoping someone can help me out.
I am using Visual Studio 2005 for this project - I have 2008 installed but was running into similar issues when I tried it.
We have an application that currently works when compiled against OpenCV 1.1, and I'm trying to update it to 2.2. When we switch over and statically link to the new libs, the application crashes, but only in release mode. Dynamic linking and debug builds both work fine.
The crash is in std::vector when calling push_back.
I then came up with a sample test application which runs some basic code in opencv which works fine and then took that exact same code and added it to our application. That code fails.
I then gutted the application so it didn't instantiate any code objects except the main gui and 1 class which called that code and it still crashed. However, if I ran that code directly in the main gui, it worked fine.
I then started commenting out huge amounts of the application (in components that should never be instantiated) and eventually I worked my way down down down until...
I have a class that has a method
void Foo()
{
std::vector<int> blah;
blah.begin();
}
If this method is defined in the header, the test code works, but if this code is defined in the cpp file, it crashes. Also, if I use std::vector<double> instead of int, it also works.
I then tried to play with the compiler options and if I have optimizations turned off (/Od) and Inline Function Expansion set to Only __inline (/Ob1) it works even with the code being in the cpp file.
Of course, if we go back to the ungutted application and change those compiler options by themselves, it crashes.
If anyone has any insights on this, please let me know.
Thanks,
Liron

ARGH! Solution figured out.
In our solution we had defined _SECURE_SCL = 0, but in the 3rd party libs we had built, it was left undefined (which means = 1). Setting _SECURE_SCL to 0 supposedly reduces runtimes drastically, but it has to be set the same way across all included libs, otherwise they will disagree about the size and layout of containers and their iterators.
http://msdn.microsoft.com/en-us/library/aa985896%28v=vs.80%29.aspx
That was a fun week.
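For anyone hitting the same thing, here is a minimal sketch of a runtime sanity check for this kind of mismatch. It is not from the original project: StlLayoutInfo, describe_stl_layout and layouts_match are hypothetical names, and the idea assumes each static library exports its own copy of the describe function (under its own name) so the application can compare the results at startup.

```cpp
#include <cstddef>
#include <vector>

// Value of _SECURE_SCL as seen by this translation unit.
// -1 means the macro is not defined here (for example on a non-MSVC compiler).
#ifdef _SECURE_SCL
static const int kSecureScl = _SECURE_SCL;
#else
static const int kSecureScl = -1;
#endif

// Hypothetical record of the settings that affect STL object layout.
struct StlLayoutInfo {
    int secure_scl;
    std::size_t vector_size;
    std::size_t iterator_size;
};

// Reports what the code containing this function was compiled with.
StlLayoutInfo describe_stl_layout() {
    StlLayoutInfo info;
    info.secure_scl = kSecureScl;
    info.vector_size = sizeof(std::vector<int>);
    info.iterator_size = sizeof(std::vector<int>::iterator);
    return info;
}

// True when two binaries agree on every recorded setting.
bool layouts_match(const StlLayoutInfo& a, const StlLayoutInfo& b) {
    return a.secure_scl == b.secure_scl &&
           a.vector_size == b.vector_size &&
           a.iterator_size == b.iterator_size;
}
```

If the application's record and a library's record do not match, passing STL objects between the two is not safe, and it is cheaper to fail loudly at startup than to chase a crash in push_back.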

The STL classes, like vector<>, have a layout mismatch between the release and the debug builds, caused by iterator debugging support. Your problem behaves exactly like the kind of trouble you get into when you link a debug build of a .lib or DLL in the release build of your application and exchange an STL object between them. Heap corruption and access violation exceptions are the result.
Triple check your build settings and ensure that you only ever link the release build of the .libs in your Release build and the debug build of the .libs in your Debug build.
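One way to have the toolchain do that check for you is sketched below. It assumes a library header you control (the mylib names are hypothetical) and relies on the MSVC detect_mismatch pragma, which only exists from VS2010 onwards, so it would not have helped on VS2005 or VS2008. On other compilers the pragma block is skipped.

```cpp
#include <string>

// Pick a label for the flavor this translation unit is being built as.
#if defined(_DEBUG)
#define MYLIB_BUILD_FLAVOR "debug"
#else
#define MYLIB_BUILD_FLAVOR "release"
#endif

// MSVC 2010 and later: the linker reports an error if two object files
// record different values under the same name.
#if defined(_MSC_VER) && _MSC_VER >= 1600
#  if defined(_DEBUG)
#    pragma detect_mismatch("mylib_build_flavor", "debug")
#  else
#    pragma detect_mismatch("mylib_build_flavor", "release")
#  endif
#endif

// Hypothetical helper so the flavor can also be logged at runtime.
inline std::string mylib_build_flavor() {
    return MYLIB_BUILD_FLAVOR;
}
```

With something like this in a header shared by the library and the application, linking a debug .lib into a release executable should fail at link time instead of corrupting the heap at run time.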

could you try:
void Foo()
{
std::vector<int> blah;
blah.reserve(5);
blah.begin();
}

Related

Access violation reading location before entering main

After upgrading from Visual Studio 2012 to Visual Studio 2015 my project gets heap corruption and access violation errors before even reaching the main function. There is simply no code that I can debug. I checked for static variables and anything that can possibly be created on a stack but I do not seem to have any. I tried to launch the program with empty main function:
int main(){return 0;};
but I still get the access violation error. The program is a GUI form with mixed C++/CLI code. I made sure I have rebuilt third party libraries (curl) from source with VS2015 and use the correct DLLs. I have no clue where these errors come from; everything worked fine before. How can I fix/debug it? I googled a lot and even tried running gflags, but the debugger does not stop on any human readable code. The error reads:
Exception thrown at 0x5A4C7988 (verifier.dll) in MyProgramName.exe: 0xC0000005: Access violation reading location 0xA46FEC13.
EDIT 1:
verifier.dll has nothing to do with the error. It turns out this library is loaded when I added image file to gflags.exe in attempt to debug the program. After discarding these changes I got back to the original error message which is heap corruption raised by ntdll.dll.
EDIT 2:
I stripped down the code to the bare minimum by deleting .h and .cpp files until the error went away. This is a very inefficient way to debug, yet since I did not know any better I did it anyway.
#include "MyGUI.h"
#include <boost/date_time/posix_time/posix_time.hpp>
[STAThread]
void main(array<String^>^ args) {
Application::EnableVisualStyles();
Application::SetCompatibleTextRenderingDefault(false);
Application::Run(gcnew MyProject::MyGUI);
}
If I then remove the line
#include <boost/date_time/posix_time/posix_time.hpp>
then the problem goes away and the gui starts with no errors.
The reason I got the access violation is that the posix_time library silently links a .lib file. When I migrated from VS2012 to VS2015 I rebuilt all the boost libraries in multiple variants, and the additional library directories were specified correctly in the project and contained the right variants of the libraries. I had used the posix_time library in other projects without problems, but those projects targeted x64, whereas the one with the problem is 32-bit. I was linking the 32-bit project against the lib32 directory of my boost build, which is correct. It turns out that the reason my application crashed was a bug in the 32-bit version of the posix_time library. Even linking my 32-bit application to the x64 folder resolves the problem. However, since I did not use microseconds, the
BOOST_DATE_TIME_NO_LIB
compiler directive was sufficient. According to boost documentation,
The nano-second resolution option uses 96 bits of underlying storage for each ptime while the micro-second resolution uses 64 bits per ptime.
That quote, together with the fact that the x64 library works fine, makes me think that the 32-bit implementation of the library is faulty.
For those who might find it helpful: this was the reason. As for my original question, which was how to debug such a crash when the heap is corrupted even before the entry point is reached (and therefore no source code or debug information is available) without wasting 7 hours stripping down the project until there is only one line left, I leave it open.
If someone can point out a better approach to finding this kind of error, that would be very instructive.
EDIT 3:
Apparently this is not the end. After fixing boost libraries and reverting to the full code I got the error again. Problem is caused by use of a static variable in a GUI windows form method callback:
System::Void comboBoxStart_SelectionChangeCommitted(System::Object^ sender,
System::EventArgs^ e) {
static wstring LastChoice; //This line is enough to reproduce the crash
}
Can anyone explain why this leads to access violation reading location? Similar symptoms are described in this post.
Before loading your main, the executable will load dependent libraries and execute their own "main". It's a bit more complicated than that, but the important thing to understand is that code gets executed in those libraries first. If there are errors there, it will crash.
Your error message mentions that verifier.dll is the source of the problem. You should investigate that DLL first. Did you build it yourself? Does it have all required dependencies to run?
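To illustrate the point about code running before main, here is a minimal sketch in standard C++. StartupTracer, g_tracer and startup_log are hypothetical names made up for this example. The constructor of a namespace-scope object runs during static initialization, so by the time main starts the log already has an entry.

```cpp
#include <string>
#include <vector>

// The log lives in a function-local static so that it is guaranteed to exist
// before any global object tries to write to it.
std::vector<std::string>& startup_log() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical type whose constructor has a visible side effect.
struct StartupTracer {
    explicit StartupTracer(const char* name) {
        startup_log().push_back(std::string(name) + " constructed");
    }
};

// Namespace-scope object: its constructor runs during static initialization,
// before the first statement of main().
StartupTracer g_tracer("g_tracer");

// Returns true if some constructor has already run.
bool ran_before_main() {
    return !startup_log().empty();
}
```

If such a constructor crashes, the debugger shows a failure with no user code from main on the stack, which matches the symptom described in the question. The same applies to global objects inside every DLL the executable loads.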
Access violation was caused by using static variables in managed code. I used a static variable of wstring type (native C++ type!) inside a windows form GUI callback (managed code). My guess is that the program tried to initialize the variable before the managed code initialization was over, causing the error before even reaching the program entry point.
After removing all static variables from managed code the problem disappeared.

What could possibly "break the debugger" in Visual Studio (maybe std::string?)

Consider setting a breakpoint on the following line, and stepping into it using the Visual Studio debugger (in a fully-cleaned and rebuilt debug build):
Poco::URI testUri( "http://somewhere.com/test/path" );
It turns out this step in will take you to this function:
URI::URI(const char* uri):
_port(0)
{
parse(std::string(uri));
}
And it turns out that when you take a few steps more and pause on the final line after the parse() call, all is well in the newly constructed URI object, specifically:
it has been parsed correctly;
one can expand the this pointer to see correctly assigned member variables (e.g. its _host, _path and _scheme members are set to "somewhere.com", "/test/path" and "http" respectively);
the this pointer at this stage points to a legitimate memory location (e.g. 0x002AEE20), and at that location one can see what I am confident is the URI object (a set of std::string variables, and one int as it happens).
However, after a single step more, one returns to the original code line, and suddenly:
expanding the testUri object in the "Autos" or "Watch" debugger windows leads to std::string members that cannot be read (there are "errors reading characters of string"), and yet...
the memory where the constructed object resides remains unchanged, and...
the address of testUri is confirmed to point to the unchanged memory
How can this be? Is the VS debugger broken? What broke it?
This is the latest in a sequence of weird issues trying to get POCO Libraries up and going in a multi-threaded MFC project. I have no idea if the MFC or the multi-threading should have any impact on Poco, but I have experienced a week of weirdness -- often with std::string objects involved -- and I'd like to get to the bottom of it. All suggestions for tracing what is occurring greatly appreciated. I'm running VS2015 Community if that makes a difference.
As mentioned in the comments, trying to mix different builds (i.e. both release and debug) within the same project can cause issues like this.
In this case, however, it was the mixing of different compilers -- most of the project was built under what seem to be VS2010 conditions, while the Poco libraries were built under VS2015 conditions.
I'm not 100% sure of the conditions under which the wider project was being compiled before, since it was recently upgraded from VS2010 to VS2015, and in the process, the Platform Toolset setting did not show up in the .vcxproj file. I have now (re-)introduced a Platform Toolset for each build configuration and set it to v100, and also rebuilt Poco with the build_vs100.cmd script. Everything seems to work as expected now.
The way I tracked this down was to observe that the application was being compiled with /MDd (multi-threaded debug DLL code generation), yet the linker was attempting to link to the "d" versions of the Poco libraries, not the "mdd" versions. When the compilers were brought into line, the linker correctly linked the "mdd" versions as one would expect.
Since all library linking in Poco is intended to be automatic (see the #pragma directives in PocoFoundation.h), the incorrect library selection was due to changed preprocessor definitions (POCO_STATIC was not being defined). I did not bother to check why this was.
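For readers unfamiliar with automatic library selection, the decision can be sketched as a small function. This is an illustration only, not Poco's actual code: pick_lib_suffix and its parameters are hypothetical, the "d" and "mdd" suffixes come from the answer above, and the "mt" variants are an assumption about how the naming scheme continues.

```cpp
#include <string>

// Hypothetical helper that mirrors how a header might choose which library
// variant to link, based on three preprocessor-driven facts:
//   static_lib  - the library is used as a static library
//   dll_runtime - the code is compiled against the DLL runtime (/MD or /MDd)
//   debug       - this is a debug build
std::string pick_lib_suffix(bool static_lib, bool dll_runtime, bool debug) {
    std::string suffix;
    if (static_lib) {
        // Static libraries are assumed to be named after the runtime they use.
        suffix = dll_runtime ? "md" : "mt";
    }
    if (debug) {
        suffix += "d";
    }
    return suffix;
}
```

With /MDd and the static define missing, this gives "d", which is the wrong variant the linker was picking up. With the define in place it gives "mdd".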

The required DLLs in a visual studio c++ project

I've done some searching and seen questions similar in nature to mine, but none that quite hit the nail on the head of the issue I'm having.
I'm making a C++ game in Visual Studio (with the Allegro 5 library) and encountering difficulty running it on other computers. I'm well aware of the 'MSVCR##.dll is missing from this computer' issue, but what I'm wondering is why I'm unable to run my Release build because I'm missing the MSVCR##'D'.dll on a certain computer, when I was under the impression that the 'D' suffixed .dll was exclusively required for running the debugger. I've checked in my configuration manager for release build settings and I have 'Generate Debug Info' set to No, which I thought was the only thing I needed to do. My question I guess is whether or not there are any other settings I need to configure to make sure my Release build isn't looking for the MSVCR##D.dll. Thanks in advance anyone who has any info!
You're a bit confused about the use of the *D libraries. They're indeed used for debug builds, but debug builds differ in multiple ways from release builds. For starters, debug builds by default come with a *.PDB file that contains all the function names (This is your "Generate Debug Info" option). A debugger looks into the .PDB file to find a readable name for a crash site.
Another debug option is to not inline code - this keeps your named functions intact. Inlining may put that single function inside three other functions, which complicates debugging a bit.
Finally the Debug CRT includes functions that perform extra error checking against bad arguments. Many functions exhibit Undefined Behavior when passed a null pointer, for instance. The Debug libraries will catch quite a few of those, whereas the Release versions assume you pass valid pointers only.
Now DLLs can reference each other; there's a whole dependency graph. That's why the Dependency Walker tool exists: it figures out which DLLs require which other DLLs, and this will tell you why you need the *D version.
Thank you very much for all your inputs, I was able to learn a fair bit from this. It turns out the issue was (of course) entirely my fault, as when setting up the Allegro 5 dependencies in the project settings (under General->Linker) I was accidentally including a dependency for the debug version of the Allegro monolith-md.dll as well as the non-debug version in my Release build, and that .dll was in turn referencing the *D version of the MSVCR .dll. The issue has been resolved by removing that dependency from the Release build of my game.
Install dependency walker on that machine. Load the exe. Check if any of the dependent dlls are missing.

DEP (/NXCOMPAT) causing segfault in LoadLibrary (down in DllMainCRTStartup)

In this case I do have the source for both the application and the dll.
When both are compiled without /NXCOMPAT, they work together fine. But when I compile both with /NXCOMPAT, I get a segfault deep in kernelspace.
If I compile the dll with /NXCOMPAT, and compile the executable without, it also works fine. (not surprising I suppose, since the DEP settings for the executable get forced on the loaded dll.)
I have previously seen a segfault in MainCRTStartup (Note: not the dll version), after enabling DEP, which was caused by another linker option. However, in this case that other linker option is NOT set, so I know that's not the answer.
Anyone have an idea where I should look for the cause?
Edit: Further strangeness. I've been running this in the debugger in VS 2008 the whole time, but when I tried running it without the debugger attached, the segfault disappears. I find this a very unsatisfactory solution, as I still don't know why it's been doing this.
Edit the 2nd: Also segfaults running in the debugger in VS 2013 Express.
Lacking code, we have to guess by the symptoms. My crystal ball says you're doing things inside LoadLibrary (i.e. inside DllMainCRTStartup) which are banned. And there is a very long list of things which are banned in LoadLibrary, including loading any other DLL.
Note that your global objects are created from DllMainCRTStartup and therefore have to respect the LoadLibrary rules as well.
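One common way to keep work out of DLL load time is to construct objects on first use instead of at namespace scope. The sketch below shows that pattern in standard C++. Registry and its functions are hypothetical names, and the sketch does not make it safe to call registry() from DllMain itself. Also note that on Visual Studio versions before 2015, initialization of a function-local static is not thread-safe.

```cpp
#include <string>

// Counts constructions so the timing can be observed.
int& registry_constructions() {
    static int count = 0;
    return count;
}

// Hypothetical type whose constructor is too heavy or too risky to run
// while the loader lock is held.
class Registry {
public:
    Registry() : name_("default") { ++registry_constructions(); }
    const std::string& name() const { return name_; }
private:
    std::string name_;
};

// Instead of `Registry g_registry;` at namespace scope (which would be
// constructed while the DLL is being loaded), construct on first use.
Registry& registry() {
    static Registry instance;
    return instance;
}
```

Nothing is constructed until the first call to registry(), which the application can make after LoadLibrary has returned.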

Visual Studio Debugging Errors in C++

For some reason the integrated debugger is causing an error as soon as I make reference to a third party vendor's dll class. This same code runs when it is built and run as a release, stand alone. The two properties for debug and release should be the same as I have not really altered them. I added the lib file to the path for both builds. I simply have:
ClassNameFromDll blah;
When it gets to here, I get this exception:
Unhandled exception at 0x78a3f623 (mfc90ud.dll) in MTGO SO Bot.exe:
0xC0000005: Access violation reading location 0xf78e4568.
It occurs in: afxtls.cpp, line 252.
This is an MFC app, but I am not really using any MFC other than a very simple gui which fires off an event that is all win32. I am using Visual Studio 2008 Express.
Looking at the afxtls.cpp file from my VC9 install, the crash is occurring here:
inline void* CThreadSlotData::GetThreadValue(int nSlot)
{
EnterCriticalSection(&m_sect);
ASSERT(nSlot != 0 && nSlot < m_nMax);
ASSERT(m_pSlotData != NULL);
ASSERT(m_pSlotData[nSlot].dwFlags & SLOT_USED); // <== crash
// ...
}
So the reason the crash doesn't occur in the release build is that ASSERT() is a no-op in that build. I'm not familiar with MFC's use of thread local storage, but this assertion indicates that something is asking for a value in a slot where nothing has been stored yet.
Whether the initialization of that TLS slot is your responsibility or the 3rd party DLL's responsibility, I don't know.
It looks like GetThreadValue() has some additional protections such that it'll return a NULL pointer in the release build for an uninitialized slot (though I'm not sure that this would be guaranteed) - I'd bet that the 3rd party DLL relies on that behavior (ie., it checks for a NULL return) so no crash occurs in release builds. Note that the vendor might be using the CThreadSlotData class indirectly (the stack trace would give a clue about this), so they might not be aware of its expectations.
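The debug-versus-release difference described above can be sketched with the standard assert macro, which compiles to nothing when NDEBUG is defined, much like MFC's ASSERT in release builds. This is not MFC's implementation: Slot and SlotTable are hypothetical stand-ins for the real slot data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical slot record: unused until something is stored in it.
struct Slot {
    bool used = false;
    void* value = nullptr;
};

class SlotTable {
public:
    explicit SlotTable(std::size_t count) : slots_(count) {}

    void set(std::size_t index, void* value) {
        assert(index < slots_.size());
        slots_[index].used = true;
        slots_[index].value = value;
    }

    void* get(std::size_t index) const {
        // Debug builds stop here if the slot is out of range or was never
        // stored to. With NDEBUG defined these two lines compile to nothing.
        assert(index < slots_.size());
        assert(slots_[index].used);

        // Release builds fall through to this check and return a null pointer.
        if (index >= slots_.size() || !slots_[index].used) {
            return nullptr;
        }
        return slots_[index].value;
    }

private:
    std::vector<Slot> slots_;
};
```

A caller that reads an unused slot and checks for a null return works in a release build but trips the assertion in a debug build, which is the behavior the answer suspects the 3rd party DLL depends on.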
engaging psychic debugging
The fact that it runs in release mode fine and crashes in debug mode leads me to believe that you've somehow managed to reference, specifically, the release version of that DLL (mfc90u.dll), rather than referencing the library itself and allowing the linker to decide which version to import.
You may not be using MFC for anything in this app, but if it's building as an MFC application, you will get all of the MFC stuff whether you want it or not (which means you also have to solve the MFC dependency problem and ship the MFC DLLs with your app).
Do you have a stack trace you can post? It might have some helpful information.
If the 3rd party DLL is still actively supported by the vendor, then the first thing you should do is see if you can have the same problem occur with a very simple program that you can send to the vendor and ask them to fix it.
If the vendor is not available or responsive enough:
If you have source of the 3rd party DLL and can easily build your own version, you have probably the best way to debug this (short of getting the vendor to support you). Even if you cannot easily build a source-debuggable DLL, you can trace into the constructor's assembly instructions and use the source as a map to help you understand what's going on.
Even if you don't have source for the 3rd party DLL then I think the best course of action is to trace through the constructor for ClassNameFromDll to try to figure out what's going wrong. It might help to compare the instruction path in the Debug build vs. the Release build.
MFC source is distributed with MSVC (probably not with the Express version, but I think with all other versions) so when you get in to the MFC DLL's code you might find the source to be useful in helping to figure out what's going on.