Protobuf SerializeAsString causing heap debug assertion in x64 - mfc

Sorry, this will not be easily reproducible, but maybe someone can help me along the way anyway!
I have a C++ MFC-based project (VS2019) that uses Google protobuf for communication with another C#-based application. When compiled under Win32, everything was working great. But we had to migrate to x64, and now the Google protobuf functions SerializeToString and SerializeAsString cause a heap debug assertion when the generated string goes out of scope. The proto files are autogenerated from contract classes in the C# app, and I have the same problem with all of them.
Code snippet that generates the error:
auto test = API::myScope::MyTestDto(); // does not matter which protobuf class is being used
test.set_my_data(5);
{
    std::string newString = test.SerializeAsString();
    // Or use SerializeToString for same error:
    // std::string newString;
    // test.SerializeToString(&newString);
    ASSERT(newString.length() > 0); // everything is fine here
} // Causes assertion when newString goes out of scope.
The heap debug assertion that is being thrown:
File: minkernel\crts\ucrt\src\appcrt\heap\debug_heap.cpp
Line: 996
Expression: __acrt_first_block == header
Here is the part of debug_heap that is throwing the assertion:
// Optionally reclaim memory:
if ((_crtDbgFlag & _CRTDBG_DELAY_FREE_MEM_DF) == 0)
{
    // Unlink this allocation from the global linked list:
    if (header->_block_header_next)
    {
        header->_block_header_next->_block_header_prev = header->_block_header_prev;
    }
    else
    {
        _ASSERTE(__acrt_last_block == header);
        __acrt_last_block = header->_block_header_prev;
    }
    if (header->_block_header_prev)
    {
        header->_block_header_prev->_block_header_next = header->_block_header_next;
    }
    else
    {
        _ASSERTE(__acrt_first_block == header); // THIS LINE THROWING ASSERTION FAULT
        __acrt_first_block = header->_block_header_next;
    }
    memset(header, dead_land_fill, sizeof(_CrtMemBlockHeader) + header->_data_size + no_mans_land_size);
    _free_base(header);
}
Some more remarks:
The string seems to look okay... until it leaves scope and causes the crash.
I have tried recompiling the whole protobuf library from latest version, and I have tried regenerating all the autogenerated c++ protobuf code, but still the same error.
The app is multithreaded, but there is really nothing else going on at the same time that should be able to compete about the memory...
Has anyone experienced this kind of error with the protobuf SerializeAsString function?
Or do you have any other debugging pointers that you could give me?
Since this is part of a huge project, I am actually afraid that there might be some other memory corruption error in the software that is causing this whole mess. But I cannot understand what could cause this sort of problem.

@IInspectable was pointing me in the correct direction. The problem was that libprotobuf was compiled as a DLL, and the DLL versioning was causing heap errors. I had been compiling libprotobuf as a DLL since that was what this project used before, but the protobuf readme file now says that compiling libprotobuf as a static LIB is to be preferred, to avoid potential heap errors caused by DLL compilation.
So I threw out the DLL and made a static LIB instead, problem solved!
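If a static LIB is not an option, one general way to sidestep this class of error is to make sure memory is never allocated in one module and freed in another. The following is only a minimal sketch of that idea, not something tested against the project above: `FakeMessage` and `SerializeIntoOwnedBuffer` are hypothetical names, and `FakeMessage` merely mimics the shape of protobuf's `ByteSizeLong()` and `SerializeToArray()` so that the example compiles without the library. The caller allocates the buffer, so the caller's heap is also the one that frees it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a generated protobuf message. It only mimics the
// shape of ByteSizeLong() and SerializeToArray(); it is not the real wire format.
struct FakeMessage {
    std::int32_t my_data = 0;

    std::size_t ByteSizeLong() const { return sizeof(my_data); }

    // Writes into memory owned by the caller; allocates nothing itself.
    bool SerializeToArray(void* data, int size) const {
        if (data == nullptr || size < static_cast<int>(sizeof(my_data))) {
            return false;
        }
        std::memcpy(data, &my_data, sizeof(my_data));
        return true;
    }
};

// The buffer is allocated here and later freed by the same (calling) module.
std::vector<char> SerializeIntoOwnedBuffer(const FakeMessage& msg) {
    std::vector<char> buffer(msg.ByteSizeLong());
    const bool ok = msg.SerializeToArray(buffer.data(),
                                         static_cast<int>(buffer.size()));
    assert(ok && "buffer was sized from ByteSizeLong()");
    (void)ok;  // silence unused-variable warnings when NDEBUG is defined
    return buffer;
}
```

Whether this helps in a given project depends on how the DLL and the EXE are linked against the runtime, so treat it as a direction to try rather than a guaranteed fix.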

Related

Why would calling MessageBox[etc]() without a return variable cause the program to crash?

So if I write the following code:
MessageBoxA(0, "Yo, wazzup!", "A Greeting From Earth", 0);
the program crashes with an access violation when it exits. When I write the code like this:
int a;
a = MessageBoxA(0, "Yo, wazzup!", "A Greeting From Earth", 0);
it doesn't crash. Now I know why it crashes when it crashes thanks to another question I asked, also regarding argument-mismatching, but I don't know why it crashes.
So why does this cause an APPCRASH? I was always under the impression that calling a function that had a return-type, without actually giving one was safe, example:
int SomeFunction (void) {
    std::cout << "Hello ya'll!\n";
    return 42;
}
int main (void) {
    int a;
    // "Correct" ?
    a = SomeFunction();
    a = MessageBoxA(0, "Yo, wazzup!", "A Greeting From Earth", 0);
    // "Incorrect" ?
    SomeFunction();
    MessageBoxA(0, "Yo, wazzup!", "A Greeting From Earth", 0);
}
When I run this kind of test "clean" (in a new file) I don't get any errors. It only seems to give an error with MessageBox/MessageBoxA when run in my program. Knowing the possible causes would help me pinpoint the error as the project code is too big to post (and I would need my friend's permission to post his code anyway).
Additional Info: Compiler = GCC, Platform = Windows
EDIT:
Update: Thanks everyone for your feedback so far. So I decided to run it through a debugger... Now Code::Blocks doesn't debug a project unless it is loaded from a project file (*.cbp) - AFAIK. So I created an actual project and copy-pasted our project's main file into the project. Then I ran in debug mode and didn't get so much as a warning. I then compiled in build mode and it ran fine. Next, I decided to open a new file in Dev-C++ and run it through the debugging and later the final build process, and again I got no errors for either build or debug. I cannot reproduce this error in Dev-C++, even with our main file (as in the one that causes the error in Code::Blocks).
Conclusion: The fault must lie in Code::Blocks. AFAIK, they both use GCC so I am pretty confused. The only thing I can think of is a version difference or perhaps my compiler settings or something equally obscure. Could optimizer settings or any other compiler settings somehow cause this kind of error?
The version with the return value does not crash because it has one more int on the stack. Your erroneous code reads over the bounds of the stack and then runs into an access violation. But if you have more on the stack you will not hit the guard page, because that is just enough extra stack. If the erroneous code only reads, it is sort of OK, but still broken.
We had one bit of WTF inducing code that was like so:
char dummy[52];
some_function();
There was thankfully a longish comment explaining that removing dummy makes some_function crash. It was in a very old application, so nobody dared touch it, and some_function was in a totally different module we had no control over. Oh yeah, and that application had been running smoothly in the field for over 20 years in industrial installations, like refineries or nuclear power plants... ^_^

Access Violation exception when running a release-built application

Recently I have been developing a small OpenGL game. Everything in it runs fine with the debug build but when I build the release, I get a strange Access Violation exception.
I searched across the code and it seems that the problem occurs when I try to open a file. Here is the function where I think the problem is coming from:
#define LOCAL_FILE_DIR "data\\"
#define GLOBAL_FILE_DIR "..\\data\\"
std::string FindFile(const std::string &baseName)
{
    std::string fileName = LOCAL_FILE_DIR + baseName;
    std::ifstream testFile(fileName.c_str()); // The code breaks here
    if(testFile.is_open())
        return fileName;
    fileName = GLOBAL_FILE_DIR + baseName;
    testFile.open(fileName.c_str());
    if(testFile.is_open())
        return fileName;
    throw std::runtime_error("Could not find the file " + baseName);
}
This code is associated with loading of GLSL shaders. A function takes the shader's file name and then passes it to FindFile in order to find the file needed.
Just as a general rule from personal (and teaching) experience: >90% of the cases where Debug works fine and Release crashes are due to uninitialized variables. That's a little harder to do in C++ than in C, but it is a very common problem. Make sure all your vars (like baseName) are initialized before using them.
I fixed the problem.
It all happened because I had made the Release build using glsdk's Debug build libraries. Changing to the Release build libraries fixed the problem.
Check that baseName is valid. Try printing it out. You may be getting a corrupted copy of baseName or your stack may have gotten trashed prior to that point (same result).

Boost serialization assertion fail

I use boost's binary serialization and it worked well until now. I have a std::list of pointers to serialize for output (oarchive), but serialization fails inside the object's serialize() function with MSVC's dialog:
R6010 - abort() has been called
and such string is printed into console window:
Assertion failed: 0 == static_cast<int>(t) || 1 == static_cast<int>(t), file c:\program files\boost\boost_1_44\boost\archive\basic_binary_oprimitive.hpp, line 91
What does it mean?
The project is pretty big and its sources are spread out, so I cannot post its code here. I tried to reproduce the error in a simple project, but there it works fine, which is strange.
P.S. I use boost 1.44 with MSVC2010EE on Windows XP. When I click "retry" on the "Debug Error!" window, the debugger shows the arrow on the code line next to the serialization line, archive << myList; so it seems like the error occurred in some destructor or something.
When I change the object's serialize() function, the changes only take effect when I rebuild the whole project (clean before compiling). If I just compile it (and the IDE shows that all sources which include the changed header are recompiled), nothing changes at runtime compared to the last version (I checked with printf()), which is strange.
Could I have accidentally set some critical definitions or something?
The line in question says:
// trap usage of invalid uninitialized boolean which would
// otherwise crash on load.
It looks like at some point you are trying to serialize a bool that hasn't been initialized. Without further code we can't help you find which one.
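The condition behind that assertion can be looked at in isolation. The sketch below is not Boost code; `IsValidBoolByte`, `RawByteOf` and `Settings` are hypothetical names used only for illustration. It mirrors the asserted condition (the stored value must be 0 or 1) and shows the usual remedy, which is to give every bool member an initializer so that it never holds an indeterminate value. It assumes a one-byte bool that stores true as 1, which holds on mainstream compilers.

```cpp
#include <cassert>
#include <cstring>

// Mirrors the condition in the assertion: a serialized bool must be 0 or 1.
bool IsValidBoolByte(unsigned char raw) {
    return raw == 0 || raw == 1;
}

// Hypothetical class: default member initializers guarantee defined values,
// even if a constructor forgets to set a member.
struct Settings {
    bool enabled = false;
    bool verbose = true;
};

// Reads the object representation of a bool without converting it first.
unsigned char RawByteOf(const bool& b) {
    static_assert(sizeof(bool) == 1, "sketch assumes a one-byte bool");
    unsigned char raw = 0;
    std::memcpy(&raw, &b, 1);
    return raw;
}
```

A byte such as 0xCD, which MSVC's debug heap writes into freshly allocated memory, fails the check, which is one way an uninitialized bool member of a heap object can trip this assertion in a debug build.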

Program crashes with 0xC000000D and no exceptions - how do I debug it?

I have a Visual C++ 9 Win32 application that uses a third-party library. When a function from that library is called with a certain set of parameters the program crashes with "exception code 0xC000000D".
I tried to attach Visual Studio debugger - no exceptions are thrown (neither C++ nor structured like access violations) and terminate() is not called either. Still the program just ends silently.
How does it happen that the program just ends abnormally but without stopping in the debugger? How can I localize the problem?
That's STATUS_INVALID_PARAMETER. Use WinDbg to track down who threw it (i.e. attach WinDbg, run sxe eh, then g).
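For reading codes like this one, it can help to split the value into its NTSTATUS fields. The following is a small portable sketch; `NtStatusFields` and `DecodeNtStatus` are hypothetical helper names, not Windows APIs. It relies on the documented NTSTATUS layout: 2 bits of severity, a customer flag, a reserved bit, 12 bits of facility and 16 bits of code.

```cpp
#include <cassert>
#include <cstdint>

// Fields of an NTSTATUS value, from the most significant bit downwards.
struct NtStatusFields {
    unsigned severity;  // 0 success, 1 informational, 2 warning, 3 error
    bool customer;      // set for codes not defined by Microsoft
    unsigned facility;  // 12-bit facility
    unsigned code;      // 16-bit status code
};

// Splits a raw 32-bit NTSTATUS value into its fields.
NtStatusFields DecodeNtStatus(std::uint32_t status) {
    NtStatusFields f;
    f.severity = (status >> 30) & 0x3u;
    f.customer = ((status >> 29) & 0x1u) != 0;
    f.facility = (status >> 16) & 0xFFFu;
    f.code = status & 0xFFFFu;
    return f;
}
```

For 0xC000000D this gives severity 3 (error), facility 0 and code 0xD.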
Other answers and comments to the question helped a lot. Here's what I did.
I noticed that if I run the program under the Visual Studio debugger it just ends silently, but if I run it without a debugger it crashes with a message box (the usual Windows message box saying that I lost my unsaved data and everyone is sooo sorry).
So I started the program without a debugger, let it crash and then - while the message box was still there - attached the debugger and hit "Break". Here's the call stack:
ntdll.dll!_KiFastSystemCallRet@0()
ntdll.dll!_ZwWaitForMultipleObjects@20() + 0xc bytes
kernel32.dll!_WaitForMultipleObjectsEx@20() - 0x48 bytes
kernel32.dll!_WaitForMultipleObjects@16() + 0x18 bytes
faultrep.dll!StartDWException() + 0x5df bytes
faultrep.dll!ReportFault() + 0x533 bytes
kernel32.dll!_UnhandledExceptionFilter@4() + 0x55c bytes
//SomeThirdPartyLibraryFunctionAddress
//SomeThirdPartyLibraryFunctionAddress
//SomeThirdPartyLibraryFunctionAddress
//SomeThirdPartyLibraryFunctionAddress
//OurCodeInvokingThirdPartyLibraryCode
so obviously that's some problem inside the third-party library. According to MSDN, UnhandledExceptionFilter() is called in fatal situations and clearly the call is done because of some problem in the library code. So we'll try to work the problem out with the library vendor first.
If you don't have source and debugging information for your 3rd party library, you will not be able to step into it with the debugger. As I see it, your choices are:
Put together a simple test case illustrating the crash and send it onto the library developer
Wrap that library function in your own code that checks for illegal parameters and throw an exception / return an error code when they are passed by your own application
Rewrite the parts of the library that do not work or use an alternative
Very difficult to fix code that is provided as object only
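The wrapper option from the list above could look roughly like this. It is only a sketch: `ThirdPartyScale` is a hypothetical stand-in for the opaque library function, `SafeScale` is a made-up wrapper name, and the valid range is an assumption chosen for illustration. In practice the check would encode whatever parameter combinations are known to bring the library down.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the third-party function; the real one is opaque.
int ThirdPartyScale(int value, int factor) {
    return value * factor;
}

// Wrapper that rejects arguments known to crash the library before calling it.
// The accepted range [1, 1000] is an assumption for illustration only.
int SafeScale(int value, int factor) {
    if (factor <= 0 || factor > 1000) {
        throw std::invalid_argument("factor must be in [1, 1000]");
    }
    return ThirdPartyScale(value, factor);
}
```

Calling code then only ever goes through `SafeScale`, so a bad argument surfaces as an ordinary C++ exception in your own code instead of a silent process exit inside the library.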
Edit: You might also be able to exit more gracefully using __try __finally around your main message loop, something like
int CMyApp::Run()
{
    __try
    {
        int i = CWinApp::Run();
        m_Exitok = MAGIC_EXIT_NO;
        return i;
    }
    __finally
    {
        if (m_Exitok != MAGIC_EXIT_NO)
            FaultHandler();
    }
}

LoadLibrary fails when including a specific file during DLL build

I'm getting really strange behavior in one of the DLLs of my C++ app. It works and loads fine until I include a single file using #include in the main file of the DLL. I then get this error message:
Loading components from D:/Targets/bin/MatrixWorkset.dll
Could not load "D:/Targets/bin/MatrixWorkset.dll": Cannot load library MatrixWorkset: Invalid access to memory location.
Now I've searched and searched through the code and Google and I can't figure out what is going on. Up till now everything was in a single DLL, and I've decided to split it into two smaller ones. The file that causes the problems is part of the second library (which loads fine).
Any ideas would really be appreciated.
Thanks,
Jaco
The likely cause is a global with class type. The constructor is run from DllMain(), and DllMain() in turn runs before LoadLibrary() returns. There are quite a few restrictions on what you can do until DllMain() has returned.
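One common way to take such a constructor out of the load path is to construct the object on first use instead of at namespace scope. The sketch below uses hypothetical names (`Registry`, `GetRegistry`) and is not taken from the project in the question; it only illustrates the pattern.

```cpp
#include <cassert>
#include <string>

// Hypothetical class whose constructor does non-trivial work.
class Registry {
public:
    Registry() { ++constructed_count; }
    std::string name = "default";
    static int constructed_count;  // counts constructor calls for the demo
};
int Registry::constructed_count = 0;

// Instead of `Registry g_registry;` at namespace scope (constructed while the
// DLL is still loading), build the object the first time it is needed.
Registry& GetRegistry() {
    static Registry instance;  // initialized on the first call, not at load time
    return instance;
}
```

With this arrangement nothing is constructed until some code calls `GetRegistry()`, which normally happens after LoadLibrary() has returned, and repeated calls return the same instance.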
Is it possible that header includes a #pragma comment(lib,"somelibrary.lib") statement somewhere? If so it's automatically trying to import a library.
To troubleshoot this I'd start by looking at the binary with depends (http://www.dependencywalker.com/), to see if there are any DLL dependencies you don't expect. If you do find something and you are in Visual Studio, you should turn on "Show Progress" AKA /VERBOSE on the linker.
Since you are getting the "Invalid access to memory location" error, it's possible there's something in DllMain or some static initializer that is crashing. Can you simplify MatrixWorkset.dll (assuming you wrote it)?
The error you describe sounds like a run-time error. Is this error displayed automatically by windows or is it one that your program emits?
I say attach a debugger to your application and trace where this error is coming from. Is Windows failing to load a dependency? Is your library somehow failing on load-up?
If you want to rule in/out this header file you're including, try pre-compiling your main source file both with and without this #include and diff the two results.
I'm still not getting it going. Let me answer some of the questions asked:
1) Windows is not failing to load a dependency, I think since Dependency Walker shows everything is ok.
2) I've attached a debugger which basically prints the following when it tries to load MatrixWorkset.dll:
10:04:19.234
stdout:&"warning: Loading components from D:/ScinericSoftware/VisualWorkspace/trunk/Targets/bin/MatrixWorkset.dll\n"
10:04:19.234
stdout:&"\n"
status:Stopped: "signal-received"
status:Stopped.
10:04:19.890
stdout:30*stopped,reason="signal-received",signal-name="SIGSEGV",signal-meaning="Segmentation fault",thread-id="1",frame={addr="0x7c919994",func="towlower",args=[],from="C:\\WINDOWS\\system32\\ntdll.dll"}
input:31info shared
input:32-stack-list-arguments 2 0 0
input:33-stack-list-locals 2
input:34-stack-list-frames
input:35-thread-list-ids
input:36-data-list-register-values x
10:04:19.890
3) MSalters: I'm not sure what you mean by a "global with class type". The file that is giving the problems has been included in a different DLL, in which it worked fine, and that DLL loaded successfully.
This is the top of the MatrixVariable.h file:
#include "QtSF/Variable.h" // Located in the depending DLL (the DLL in which this file always lived)
#include "Matrix.h" // File located in this DLL
#include "QList" // These are all files from the Qt Framework
#include "QModelIndex"
#include "QItemSelection"
#include "QObject"
using namespace Zenautics;
using namespace std;
class MatrixVariable : public Variable
{
Q_OBJECT
Q_PROPERTY(int RowCount READ rowCount WRITE setRowCount)
Q_PROPERTY(int ColumnCount READ columnCount WRITE setColumnCount)
Q_PROPERTY(int UndoPoints READ undoPoints WRITE setUndoPoints)
public:
//! Default constructor.
MatrixVariable(const QString& name, int rows, int cols, double fill_real = 0, double fill_complex = 0, bool isReal = true);
etc. etc. etc.
A possible solution is to put the MatrixVariable file back in the original DLL, but that defeats the whole idea of splitting the DLL into smaller parts, so it is not really an option.
I got that error from GetLastError() recently when I failed to load a DLL from a command-line EXE. It used to work, then I added some MFC code to the DLL. Now all bets are off.
I just had this exact same problem. A dll that had been working just fine, suddenly stopped working. I was taking an access violation in the CRT stuff that initializes static objects. Doing a rebuild all did not fix the problem. But when I manually commented out all the statics, the linker complained about a corrupt file. Link again: Worked. Now I can LoadLibrary. Then, one by one, I added the statics back in. Each time, I recompiled and tested a LoadLibrary. Each time it worked fine. Eventually, all my statics were back, and things working normally.
If I had to guess, some intermediate file used by the linker was corrupted (I see the ilk files constantly getting corrupted by link.exe). If you can, maybe wipe out all your files and do a clean build? But I'm guessing you've already figured things out since this is 6 months old ...