Access violation reading location before entering main - c++

After upgrading from Visual Studio 2012 to Visual Studio 2015, my project gets heap corruption and access violation errors before even reaching the main function. There is simply no code that I can debug. I checked for static variables and anything else that could be created before main, but I do not seem to have any. I tried to launch the program with an empty main function:
int main(){return 0;};
but I still get an access violation error. The program is a GUI form with mixed C++/CLI code. I made sure that I rebuilt the third-party libraries (curl) from source with VS2015 and that I use the correct DLLs. I have no clue where these errors come from; everything worked fine before. How can I fix or debug this? I googled a lot and even tried running gflags, but the debugger does not stop on any human-readable code. The error reads:
Exception thrown at 0x5A4C7988 (verifier.dll) in MyProgramName.exe: 0xC0000005: Access violation reading location 0xA46FEC13.
EDIT 1:
verifier.dll has nothing to do with the error. It turns out this library was loaded because I had added the image file to gflags.exe in an attempt to debug the program. After discarding these changes I got back to the original error message, which is heap corruption raised by ntdll.dll.
EDIT 2:
I stripped the code down to the bare minimum by deleting .h and .cpp files until the error went away. This is a very inefficient way to debug, but since I did not know any better I did it anyway.
#include "MyGUI.h"
#include <boost/date_time/posix_time/posix_time.hpp>
[STAThread]
void main(array<String^>^ args) {
Application::EnableVisualStyles();
Application::SetCompatibleTextRenderingDefault(false);
Application::Run(gcnew MyProject::MyGUI);
}
If I then remove the line
#include <boost/date_time/posix_time/posix_time.hpp>
then the problem goes away and the gui starts with no errors.
The reason I got the access violation error is that the posix_time library silently links a .lib file. When I migrated from VS2012 to VS2015 I rebuilt all the Boost libraries in multiple variants; the additional library directories were specified correctly in the project and contained the right variants of the libraries. I had used posix_time in other projects without problems, but those projects targeted x64, whereas the one with the problem is 32-bit. I was linking the 32-bit project against the lib32 directory of my Boost build, which is correct. It turns out that my application crashed because of what appears to be a bug in the 32-bit version of the posix_time library. Even linking my 32-bit application against the x64 folder resolves the problem. However, since I did not use microseconds, the
BOOST_DATE_TIME_NO_LIB
compiler directive was sufficient. According to boost documentation,
The nano-second resolution option uses 96 bits of underlying storage for each ptime while the micro-second resolution uses 64 bits per ptime.
The latter quote and the fact that the x64 library works fine make me think that the 32-bit implementation of the library is faulty.
For those who might find it helpful: this was the reason. As for my original question, how to debug such a crash when the heap is corrupted even before the entry point is reached (and therefore no source code or debug information is available) without wasting 7 hours stripping down the project until only one line is left, I leave that open.
If someone can point out a better approach to finding this kind of error, that would be very instructive.
EDIT 3:
Apparently this is not the end. After fixing the Boost libraries and reverting to the full code I got the error again. The problem is caused by the use of a static variable in a Windows Forms callback method:
System::Void comboBoxStart_SelectionChangeCommitted(System::Object^ sender,
System::EventArgs^ e) {
static wstring LastChoice; //This line is enough to reproduce the crash
}
Can anyone explain why this leads to access violation reading location? Similar symptoms are described in this post.

Before loading your main, the executable will load dependent libraries and execute their own "main". It's a bit more complicated than that, but the important thing to understand is that code gets executed in those libraries first. If there are errors there, it will crash.
Your error message mentions that verifier.dll is the source of the problem. You should investigate that DLL first. Did you build it yourself? Does it have all required dependencies to run?

Access violation was caused by using static variables in managed code. I used a static variable of wstring type (native C++ type!) inside a windows form GUI callback (managed code). My guess is that the program tried to initialize the variable before the managed code initialization was over, causing the error before even reaching the program entry point.
After removing all static variables from managed code the problem disappeared.

Related

Windows C++ MFC migration: AfxGetThread Assertion. Why does win32u.dll load before mfc140d.dll in some cases?

I have customer code written for Visual Studio 6.0 MFC which has a simple GUI and launches an EXE with arguments. This code was ported from VS6.0 to VS2019 about 2 years ago and works in a production environment on several systems. We now have a new system where the code fails to function... and I'm starting to dig.
The code is throwing an exception in appcore.cpp, line 196.
It is crashing at AfxGetThread(), now that I have been able to get VS2019 to find "appcore.cpp".
This is new information. I will be searching on AfxGetThread next, so this question is likely to be a duplicate now.
One difference I have detected is the order in which the Visual Studio 2019 debugger loads symbols. I can't say for certain that this is an indication of the actual DLL load order at runtime, but it appears to be. The screenshot below shows the symbol load order where a difference is detected between the working and non-working instances of the application.
In the image below we have a Tortoise SVN diff of two ASCII files. On the left is the DLL symbol load order when the application works. On the right is the DLL symbol load order when the application fails. Line 7 is the divergence, where in the failing case the library win32u.dll is pulled in before mfc140d.dll.
The customer code uses some Apache log4cxx libraries which I need to investigate, but at this point in the load sequence I'm not 100% sure differences in those libraries or *.h files used at build time could influence the DLL loading order.
So this is the puzzle I'm looking at.
I will include some links to relevant StackOverflow questions that are similar in my search for an answer to this question.
Possibly Useful Links:
https://learn.microsoft.com/en-us/cpp/porting/porting-guide-mfc-scribble?view=msvc-170
The DLLs are searched in order in different locations: Standard Search Order for Desktop Applications
Most likely the DLLs on the failing machine are missing or they are in the wrong location, so Windows grabs something else.
Make sure all the dependencies are installed in the correct folders.
In this case, the crash in appcore.cpp was due to the code having 2 CWinApp derived objects in the code. And the crash occurs at construction time.
The first hurdle is to get Visual Studio 2019 to find appcore.cpp and be able to step into this code. I browsed to the Visual Studio folder under C:\Program Files (x86) and searched for "appcore.cpp". This provided the trail of breadcrumbs to give Visual Studio 2019 the correct path when it asked for the file.
In my case the path is:
C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.29.30037\atlmfc\src\appcore.cpp
The second hurdle was to put a breakpoint at the ASSERT, because the FIRST time the program reaches it, this ASSERT is OK. At least in my case, the first constructor of a CWinApp object succeeded. So the offending code, which was unexpectedly constructing a CWinApp-derived object, ran first. The failure then happened on the second pass through the ASSERT.
By placing a breakpoint at the ASSERT, you can re-launch and look at the stack trace of the successful object to determine if it's expected or not.
In my case a *.h fix was required to get the ancient Visual Studio 6.0 MFC code to build and link successfully. I don't have the exact details, but it essentially comes down to a *.h specifying the proper minimum Windows version (WINVER) before afx.h is included. For the code that failed, an incorrect *.h was included, one that contained the fix PLUS objects derived from CWinApp.
As near as I can tell, the reason this worked on some sites and not others was due to a regression in the code that was built on that particular system.
The other "strange" behavior was that on the system with the bad *.h include the DLL symbol load order was different. I was able to replicate this bad behavior on more than one system running the "bad" Exe code. Then I did the same on working *.exe code. The strange thing was that the "bad" exe had reasonably correct sources in the workspace. So there were source control issues leading to this belief that it worked on one node and not others.
The runtime behavior I was able to catch matched obsolete code committed to source control.

VS2008 C++ breakpoint becomes permanently inactive after access violation error (no executable code associated with line)

I am reproducing the following behavior in VS2008 (native C++):
attach to an executable that consumes a custom dll (for which I have the source)
debug the code from the dynamic lib
encounter an access violation error (probably caused by the code in the executable - closed source)
break on access violation error with the attached debugger
After this, no matter how many times I reattach, rebuild, or restart the application or the computer, any breakpoint I set in the .dll source code becomes inactive ("No executable code associated with this line" is the alleged cause, according to VS).
I suspect this is an issue with VS2008, as I did the same on a different machine and now I have two machines where debugging is no longer possible.
Are there any recorded solutions of this issue? What can be done to overcome it?
What I have done:
deleting everything (the entire solution, pdbs, binaries, etc.) and starting with the code from scratch (cloning the latest version from the repository)
restarting the machine
changing the machine (it worked once, until the error occurred, then the other computer exhibited the same behavior)
What I cannot do:
change compiler/VS version
debug the executable (sadly no source code available and lack of assembly skills)
The root of the issue was more subtle. Although the project was intended to be native C++, I have found that on the configuration I was testing the code, the entire project was built with CLR support.
When attaching to the application the first time on any machine, in native debugging mode, the breakpoints would trigger. However, when encountering the native access violation error, these breakpoints became permanently inactive thereafter. After deciding to check what happens if I let the debugger attach in auto mode, I have discovered that the breakpoints became active and hence found out that all code had been compiled with the /clr flag except for the entrypoint in the consumed dll, which had no CLR support.
The question here is why VS2008 behaves like this and does not directly disable breakpoints whenever one attempts to debug a managed context using native debug settings.
TL;DR: check whether your C++ project is built with CLR support and attach either as native or managed, depending on your needs. Alternatively, if only some of your files require C++/CLI, enable the /clr flag only for those. This is often the better choice, since C++/CLI clashes with certain native libraries (e.g. no std::mutex support, problems linking against native static libs as described in "Linking unmanaged C++ DLL with managed C++ class library DLL", etc.).

What could possibly "break the debugger" in Visual Studio (maybe std::string?)

Consider setting a breakpoint on the following line, and stepping into it using the Visual Studio debugger (in a fully-cleaned and rebuilt debug build):
Poco::URI testUri( "http://somewhere.com/test/path" );
It turns out this step in will take you to this function:
URI::URI(const char* uri):
_port(0)
{
parse(std::string(uri));
}
And it turns out that when you take a few steps more and pause on the final line after the parse() call, all is well in the newly constructed URI object, specifically:
it has been parsed correctly;
one can expand the this pointer to see correctly assigned member variables (e.g. its _host, _path and _scheme members are set to "somewhere.com", "/test/path" and "http" respectively);
the this pointer at this stage points to a legitimate memory (e.g. 0x002AEE20) location, and at that location one can see what I am confident is the URI object (a set of std::string variables, and one int as it happens).
However, after a single step more, one returns to the original code line, and suddenly:
expanding the testUri object in the "Autos" or "Watch" debugger windows leads to std::string members that cannot be read (there are "errors reading characters of string"), and yet...
the memory where the constructed object resides remains unchanged, and...
the address of testUri is confirmed to point to the unchanged memory
How can this be? Is the VS debugger broken? What broke it?
This is the latest in a sequence of weird issues trying to get the POCO libraries up and running in a multi-threaded MFC project. I have no idea whether MFC or the multi-threading should have any impact on Poco, but I have experienced a week of weirdness -- often with std::string objects involved -- and I'd like to get to the bottom of it. All suggestions for tracing what is occurring are greatly appreciated. I'm running VS2015 Community, if that makes a difference.
As mentioned in the comments, trying to mix different builds (i.e. both release and debug) within the same project can cause issues like this.
In this case, however, it was the mixing of different compilers -- most of the project was built under what seem to be VS2010 conditions, while the Poco libraries were built under VS2015 conditions.
I'm not 100% sure of the conditions under which the wider project was being compiled before, since it was recently upgraded from VS2010 to VS2015, and in the process, the Platform Toolset setting did not show up in the .vcxproj file. I have now (re-)introduced a Platform Toolset for each build configuration and set it to v100, and also rebuilt Poco with the build_vs100.cmd script. Everything seems to work as expected now.
The way I tracked this down was to observe that the application was being compiled with /MDd (multi-threaded debug DLL code generation), yet the linker was attempting to link to the "d" versions of the Poco libraries, not the "mdd" versions. When the compilers were brought into line, the linker correctly linked the "mdd" versions as one would expect.
Since all library linking in Poco is intended to be automatic (see the #pragma directives in PocoFoundation.h), the incorrect library selection was due to changed preprocessor definitions (POCO_STATIC was not being defined). I did not bother to check why this was.

Crash when running application due to existence of unexecuted code in source file - c++

I'm working on a pretty tricky problem that I've been on for literally a week now. I've hit a very hard wall and my forehead hurts from banging it so I'm hoping someone can help me out.
I am using Visual Studio 2005 for this project - I have 2008 installed but was running into similar issues when I tried it.
We have an application that currently works when compiled against OpenCV 1.1, and I'm trying to update it to 2.2. When we switch over and statically link against the new libs, the application crashes, but only in release mode. Dynamic linking and debug builds both work fine.
The crash is in std::vector when calling push_back.
I then came up with a sample test application which runs some basic code in opencv which works fine and then took that exact same code and added it to our application. That code fails.
I then gutted the application so it didn't instantiate any code objects except the main gui and 1 class which called that code and it still crashed. However, if I ran that code directly in the main gui, it worked fine.
I then started commenting out huge amounts of the application (in components that should never be instantiated) and eventually I worked my way down down down until...
I have a class that has a method
void Foo()
{
std::vector<int> blah;
blah.begin();
}
If this method is defined in the header, the test code works, but if this code is defined in the cpp file, it crashes. Also, if I use std::vector<double> instead of int, it also works.
I then tried to play with the compiler options and if I have optimizations turned off (/Od) and Inline Function Expansion set to Only __inline (/Ob1) it works even with the code being in the cpp file.
Of course, if we go back to the ungutted application and change those compiler options by themselves, it crashes.
If anyone has any insights on this, please let me know.
Thanks,
Liron
ARGH! Solution figured out.
In our solution we had defined _SECURE_SCL = 0, but in the 3rd party libs we had built, it was left undefined (which means = 1). Setting _SECURE_SCL to 0 supposedly reduces run times drastically, but it has to be set the same way across all included libs; otherwise they will treat array sizes differently.
http://msdn.microsoft.com/en-us/library/aa985896%28v=vs.80%29.aspx
That was a fun week.
The STL classes, like vector<>, have a layout mismatch between the release and the debug builds, caused by iterator debugging support. Your problem behaves exactly like the kind of trouble you get into when you link a debug build of a .lib or DLL in the release build of your application and exchange an STL object between them. Heap corruption and access violation exceptions are the result.
Triple check your build settings and ensure that you only ever link the release build of the .libs in your Release build and the debug build of the .libs in your Debug build.
could you try:
void Foo()
{
std::vector<int> blah;
blah.reserve(5);
blah.begin();
}

Visual Studio Debugging Errors in C++

For some reason the integrated debugger raises an error as soon as I reference a class from a third-party vendor's DLL. The same code runs when it is built and run as a standalone release build. The properties for debug and release should be the same, as I have not really altered them. I added the lib file to the path for both builds. I simply have:
ClassNameFromDll blah;
When it gets to here, I get this exception:
Unhandled exception at 0x78a3f623 (mfc90ud.dll) in MTGO SO Bot.exe:
0xC0000005: Access violation reading location 0xf78e4568.
It occurs in: afxtls.cpp, line 252.
This is an MFC app, but I am not really using any MFC other than a very simple gui which fires off an event that is all win32. I am using Visual Studio 2008 Express.
Looking at the afxtls.cpp file from my VC9 install, the crash is occurring here:
inline void* CThreadSlotData::GetThreadValue(int nSlot)
{
EnterCriticalSection(&m_sect);
ASSERT(nSlot != 0 && nSlot < m_nMax);
ASSERT(m_pSlotData != NULL);
ASSERT(m_pSlotData[nSlot].dwFlags & SLOT_USED); // <== crash
// ...
}
So the reason the crash doesn't occur in the release build is that ASSERT() is a no-op in that build. I'm not familiar with MFC's use of thread local storage, but this assertion indicates that something is asking for a value in a slot where nothing has been stored yet.
Whether the initialization of that TLS slot is your responsibility or the 3rd party DLL's responsibility, I don't know.
It looks like GetThreadValue() has some additional protections such that it'll return a NULL pointer in the release build for an uninitialized slot (though I'm not sure that this would be guaranteed) - I'd bet that the 3rd party DLL relies on that behavior (ie., it checks for a NULL return) so no crash occurs in release builds. Note that the vendor might be using the CThreadSlotData class indirectly (the stack trace would give a clue about this), so they might not be aware of its expectations.
engaging psychic debugging
The fact that it runs in release mode fine and crashes in debug mode leads me to believe that you've somehow managed to reference, specifically, the release version of that DLL (mfc90u.dll), rather than referencing the library itself and allowing the linker to decide which version to import.
You may not be using MFC for anything in this app, but if it's building as an MFC application, you will get all of the MFC stuff whether you want it or not (which means you also have to solve the MFC dependency problem and ship the MFC DLLs with your app).
Do you have a stack trace you can post? It might have some helpful information.
If the 3rd party DLL is still actively supported by the vendor, then the first thing you should do is see if you can have the same problem occur with a very simple program that you can send to the vendor and ask them to fix it.
If the vendor is not available or responsive enough:
If you have source of the 3rd party DLL and can easily build your own version, you have probably the best way to debug this (short of getting the vendor to support you). Even if you cannot easily build a source-debuggable DLL, you can trace into the constructor's assembly instructions and use the source as a map to help you understand what's going on.
Even if you don't have source for the 3rd party DLL, I think the best course of action is to trace through the constructor for ClassNameFromDll to try to figure out what's going wrong. It might help to compare the instruction path in the Debug build vs. the Release build.
MFC source is distributed with MSVC (probably not with the Express version, but I think with all other versions), so when you get into the MFC DLL's code you might find the source useful in helping to figure out what's going on.