VC++ 6.0 vector access violation crash. Known bug? - c++

I'm trying to use a std::vector<>::const_iterator and I get an 'access violation' crash. It looks like the std::vector code is crashing when it uses its own internal First_ and Last_ pointers. Presumably this is a known bug. I'm hoping someone can point me to the correct workaround. It's probably relevant that the crashing function is called from an external library?
const Thing const* AClass::findThing (const std::string& label) const
{
//ThingList_.begin() blows up at run time. Compiles fine.
for (std::vector<Thing*>::const_iterator it = ThingList_.begin(); it != ThingList_.end(); ++it) {
//Irrelevant.
}
return 0;
}
Simply calling ThingList_.size() also crashes.
This is sp6, if it matters.

If you're passing C++ objects across external library boundaries, you must ensure that all libraries are using the same runtime library (in particular, the same heap allocator). In practice, this means that all libraries must be linked to the DLL version of MSVCRT.

It's almost certainly a bug in your code and not std::vector. This code is used by way too many projects to have such an easy to repro bug.
What's likely happening is that the ThnigList_ variable has been corrupted in some way. Was the underlying array accessed directly and/or modified?

I agree with Jared that it is probably in your code,
never the less, you should be sure your stl libs are up to date.
The dinkumware site has the patched files you need.
You should update just to be safe

Related

std::string is different when passed to method

I'm using an external library (Qpid Proton C++) in my Visual Studio project.
The API includes a method like:
container::connect(const std::string &url) {...}
I call it in my code this way:
container.connect("127.0.0.1");
but when debugging, stepping into the library's function, the string gets interpreted in the wrong way, with a size of some millions chars, and unintelligible content.
What could be the cause for this?
You need to put the breakpoint inside the function and not at the function declaration level, where the variable exists but is not yet initialized.
Just in case someone runs into a similar problem, as Alan Birtles was mentioning in his comment, one possible cause is having the library and your code using different C++ runtimes, and that turned out to be the case this time.
In general, as stated in this page from Visual C++ documentation,
If you're using CRT (C Runtime) or STL (Standard Template Library) types, don't pass them between binaries (including DLLs) that were compiled by using different versions of the compiler.
which is exactly what was going on.

AccessViolationException reading memory allocated in C++ application from C++/CLI DLL

I have a C++ client to a C++/CLI DLL, which initializes a series of C# dlls.
This used to work. The code that is failing has not changed. The code that has changed is not called before the exception is thrown. My compile environment has changed, but recompiling on a machine with an environment similar to my old one still failed. (EDIT: as we see in the answer this is not entirely true, I was only recompiling the library in the old environment, not the library and client together. The client projects had been upgraded and couldn't easily go back.)
Someone besides me recompiled the library, and we started getting memory management issues. The pointer passed in as a String must not be in the bottom 64K of the process's address space. I recompiled it, and all worked well with no code changes. (Alarm #1) Recently it was recompiled, and memory management issues with strings re-appeared, and this time they're not going away. The new error is Unhandled Exception: System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
I'm pretty sure the problem is not located where I see the exception, the code didn't change between the successful and failing builds, but we should review that to be complete. Ignore the names of things, I don't have much control over the design of what it's doing with these strings. And sorry for the confusion, but note that _bridge and bridge are different things. Lots of lines of code missing because this question is already too long.
Defined in library:
struct Config
{
std::string aye;
std::string bee;
std::string sea;
};
extern "C" __declspec(dllexport) BridgeBase_I* __stdcall Bridge_GetConfiguredDefaultsImplementationPointer(
const std::vector<Config> & newConfigs, /**< new configurations to apply **/
std::string configFolderPath, /**< folder to write config files in **/
std::string defaultConfigFolderPath, /**< folder to find default config files in **/
std::string & status /**< output status of config parse **/
);
In client function:
GatewayWrapper::Config bridge;
std::string configPath("./config");
std::string defaultPath("./config/default");
GatewayWrapper::Config gwtransport;
bridge.aye = "bridged.dll";
bridge.bee = "1.0";
bridge.sea = "";
configs.push_back(bridge);
_bridge = GatewayWrapper::Bridge_GetConfiguredDefaultsImplementationPointer(configs, configPath, defaultPath, status);
Note that call to library that is crashing is in the same scope as the vector declaration, struct declaration, string assignment and vector push-back
There are no threading calls in this section of code, but there are other threads running doing other things. There is no pointer math here, there are no heap allocations in the area except perhaps inside the standard library.
I can run the code up to the Bridge_GetConfiguredDefaultsImplementationPointer call in the debugger, and the contents of the configs vector look correct in the debugger.
Back in the library, in the first sub function, where the debugger don't shine, I've broken down the failing statement into several console prints.
System::String^ temp
List<CConfig^>^ configs = gcnew List<CConfig ^>((INT32)newConfigs.size());
for( int i = 0; i< newConfigs.size(); i++)
{
std::cout << newConfigs[i].aye<< std::flush; // prints
std::cout << newConfigs[i].aye.c_str() << std::flush; // prints
temp = gcnew System::String(newConfigs[i].aye.c_str());
System::Console::WriteLine(temp); // prints
std::cout << "Testing string creation" << std::endl; // prints
std::cout << newConfigs[i].bee << std::flush; // crashes here
}
I get the same exception on access of bee if I move the newConfigs[i].bee out above the assignment of temp or comment out the list declaration/assignment.
Just for reference that the std::string in a struct in a vector should have arrived at it's destination ok
Is std::vector copying the objects with a push_back?
std::string in struct - Copy/assignment issues?
http://www.cplusplus.com/reference/vector/vector/operator=/
Assign one struct to another in C
Why this exception is not caught by my try/catch
https://stackoverflow.com/a/918891/2091951
Generic AccessViolationException related questions
How to handle AccessViolationException
Programs randomly getting System.AccessViolationException
https://connect.microsoft.com/VisualStudio/feedback/details/819552/visual-studio-debugger-throws-accessviolationexception
finding the cause of System.AccessViolationException
https://msdn.microsoft.com/en-us/library/ms164911.aspx
Catching access violation exceptions?
AccessViolationException when using C++ DLL from C#
Suggestions in above questions
Change to .net 3.5, change target platform - these solutions could have serious issues with a large mult-project solution.
HandleProcessCorruptedStateExceptions - does not work in C++, this decoration is for C#, catching this error could be a very bad idea anyway
Change legacyCorruptedStateExceptionsPolicy - this is about catching the error, not preventing it
Install .NET 4.5.2 - can't, already have 4.6.1. Installing 4.6.2 did not help. Recompiling on a different machine that didn't have 4.5 or 4.6 installed did not help. (Despite this used to compile and run on my machine before installing Visual Studio 2013, which strongly suggests the .NET library is an issue?)
VSDebug_DisableManagedReturnValue - I only see this mentioned in relation to a specific crash in the debugger, and the help from Microsoft says that other AccessViolationException issues are likely unrelated. (http://connect.microsoft.com/VisualStudio/feedbackdetail/view/819552/visual-studio-debugger-throws-accessviolationexception)
Change Comodo Firewall settings - I don't use this software
Change all the code to managed memory - Not an option. The overall design of calling C# from C++ through C++/CLI is resistant to change. I was specifically asked to design it this way to leverage existing C# code from existing C++ code.
Make sure memory is allocated - memory should be allocated on the stack in the C++ client. I've attempted to make the vector be not a reference parameter, to force a vector copy into explicitly library controlled memory space, did not help.
"Access violations in unmanaged code that bubble up to managed code are always wrapped in an AccessViolationException." - Fact, not a solution.
but it was the mismatch, not the specific version that was the problem
Yes, that's black letter law in VS. You unfortunately just missed the counter-measures that were built into VS2012 to turn this mistake into a diagnosable linker error. Previously (and in VS2010), the CRT would allocate its own heap with HeapAlloc(). Now (in VS2013), it uses the default process heap, the one returned by the GetProcessHeap().
Which is in itself enough to trigger an AVE when you run your app on Vista or higher, allocating memory from one heap and releasing it from another triggers an AVE at runtime, a debugger break when you debug with the Debug Heap enabled.
This is not where it ends, another significant issue is that the std::string object layout is not the same between the versions. Something you can discover with a little test program:
#include <string>
#include <iostream>
int main()
{
std::cout << sizeof(std::string) << std::endl;
return 0;
}
VS2010 Debug : 32
VS2010 Release : 28
VS2013 Debug : 28
VS2013 Release : 24
I have a vague memory of Stephen Lavavej mentioning the std::string object size reduction, very much presented as a feature, but I can't find it back. The extra 4 bytes in the Debug build is caused by the iterator debugging feature, it can be disabled with _HAS_ITERATOR_DEBUGGING=0 in the Preprocessor Definitions. Not a feature you'd quickly want to throw away but it makes mixing Debug and Release builds of the EXE and its DLLs quite lethal.
Needless to say, the different object sizes seriously bytes when the Config object is created in a DLL built with one version of the standard C++ library and used in another. Many mishaps, the most basic one is that the code will simply read the Config::bee member from the wrong offset. An AVE is (almost) guaranteed. Lots more misery when code allocates the small flavor of the Config object but writes the large flavor of std::string, that randomly corrupts the heap or the stack frame.
Don't mix.
I believe 2013 introduced a lot of changes in the internal data formats of STL containers, as part of a push to reduce memory usage and improve perf. I know vector became smaller, and string is basically a glorified vector<char>.
Microsoft acknowledges the incompatibility:
"To enable new optimizations and debugging checks, the Visual Studio
implementation of the C++ Standard Library intentionally breaks binary
compatibility from one version to the next. Therefore, when the C++
Standard Library is used, object files and static libraries that are
compiled by using different versions can't be mixed in one binary (EXE
or DLL), and C++ Standard Library objects can't be passed between
binaries that are compiled by using different versions."
If you're going to pass std::* objects between executables and/or DLLs, you absolutely must ensure that they are using the same version of the compiler. It would be well-advised to have your client and its DLLs negotiate in some way at startup, comparing any available versions (e.g. compiler version + flags, boost version, directx version, etc.) so that you catch errors like this quickly. Think of it as a cross-module assert.
If you want to confirm that this is the issue, you could pick a few of the data structures you're passing back and forth and check their sizes in the client vs. the DLLs. I suspect your Config class above would register differently in one of the fail cases.
I'd also like to mention that it is probably a bad idea in the first place to use smart containers in DLL calls. Unless you can guarantee that the app and DLL won't try to free or reallocate the internal buffers of the other's containers, you could easily run into heap corruption issues, since the app and DLL each have their own internal C++ heap. I think that behavior is considered undefined at best. Even passing const& arguments could still result in reallocation in rare cases, since const doesn't prevent a compiler from diddling with mutable internals.
You seem to have memory corruption. Microsoft Application Verifier is invaluable in finding corruption. Use it to find your bug:
Install it to your dev machine.
Add your exe to it.
Only select Basics\Heaps.
Press Save. It doesn't matter if you keep application verifier open.
Run your program a few times.
If it crashes, debug it and this time, the crash will point to your problem, not just some random location in your program.
PS: It's a great idea to have Application Verifier enabled at all times for your development project.

C++ What could cause a line of code in one place to change behavior in an unrelated function?

Consider this mock-up of my situation.
in an external header:
class ThirdPartyObject
{
...
}
my code: (spread among a few headers and source files)
class ThirdPartyObjectWrapper
{
private:
ThirdPartyObject myObject;
}
class Owner
{
public:
Owner() {}
void initialize();
private:
ThirdPartyObjectWrapper myWrappedObject;
};
void Owner::initialize()
{
//not weird:
//ThirdPartyObjectWrapper testWrappedObject;
//weird:
//ThirdPartyObject testObject;
}
ThirdPartyObject is, naturally, an object defined by a third party (static precompiled) library I'm using. ThirdPartyObjectWrapper is a convenience class that eliminates a lot of boiler-plating for working with ThirdPartyObject. Owner::initialize() is called shortly after an instance of Owner is created.
Notice the two lines I have labeled as "weird" and "not weird" in Owner::initialize(). All I'm doing here is creating a couple of objects on the stack with their default constructors. I don't do anything with those objects and they get destroyed when they leave scope. There are no build or linker errors involved, I can uncomment either or both lines and the code will build.
However, if I uncomment "weird" then I get a segmentation fault, and (here's why I say it's weird) it's in a completely unrelated location. Not in the constructor of testObject, like you might expect, but in the constructor of Owner::myObjectWrapper::myObject. The weird line never even gets called, but somehow its presence or absence consistently changes the behavior of an unrelated function in a static library.
And consider that if I only uncomment "not weird" then it runs fine, executing the ThirdPartyObject constructor twice with no problems.
I've been working with C++ for a year so it's not really a surprise to me that something like this would be able happen, but I've about reached the limit of my ability to figure out how this gotcha is happening. I need the input of people with significantly more C++ experience than me.
What are some possibilities that could cause this to happen? What might be going on here?
Also, note, I'm not asking for advice on how to get rid of the segfault. Segfaults I understand, I suspect it's a simple race condition. What I don't understand is the behavior gotcha so that's the only thing I'm trying to get answers for.
My best lead is that it has to do with headers and macros. The third party library actually already has a couple of gotchas having to do with its headers and macros, for example the code won't build if you put your #include's in the wrong order. I'm not changing any #include's so strictly this still wouldn't make sense, but perhaps the compiler is optimizing includes based on the presence of a symbol here? (it would be the only mention of ThirdPartyObject in the file)
It also occurs to me that because I am using Qt, it could be that the Meta-Object Compiler (which generates supplementary code between compilations) might be involved in this. Very unlikely, as Qt has no knowledge of the third party library where the segfault is happening and this is not actually relevant to the functionality of the MOC (since at no point ThirdPartyObject is being passed as an argument), but it's worth investigating at least.
Related questions have suggested that it could be a relatively small buffer overflow or race condition that gets tripped up by compiler optimizations. Continuing to investigate but all leads are welcome.
Typical culprits:
Some build products are stale and not binary-compatible.
You have a memory bug that has corrupted the state of your process, and are seeing a manifestation of that in a completely unrelated location.
Fixing #1 is trivial: delete the build folder and build again. If you're not building in a shadow build folder, you've set yourself up for failure, hopefully you now know enough to stop :)
Fixing #2 is not trivial. View manual memory management and possible buffer overflows with suspicion. Use modern C++ programming techniques to leverage the compiler to help you out: store things by value, use containers, use smart pointers, and use iterators and range-for instead of pointers. Don't use C-style arrays. Abhor C-style APIs of the (Type * array, int count) kind - they don't belong in C++.
What fun. I've boiled this down to the bottom.
//#include <otherthirdpartyheader.h>
#include <thirdpartyobject.h>
int main(...)
{
ThirdPartyObject test;
return 0;
}
This code runs. If I uncomment the first include, delete all build artifacts, and build again, then it breaks. There's obviously a header/macro component, and probably some kind of compiler-optimization component. But, get this, according to the library documentation it should give me a segfault every time because I haven't been doing a required initialization step. So the fact that it runs at all indicates unexpected behavior.
I'm chalking this up to library-specific issues rather than broad spectrum C++ issues. I'll be contacting the vendor going forward from here, but thanks everyone for the help.

Upgrade to VS2012 resulting in crash due to different VC++ runtimes?

There is a large legacy project I have to maintain, which I recently upgraded from Visual Studio 2008 to Visual Studio 2012. As it is a COM server and a OCX control, creating all the typelib stuff etc. resulted in some problems that I managed to solve. However, when I run the Release build now I frequently get crashes.
I followed some advise I found here on SO and was able to track down the crash to the following piece of code:
int Phx2Preview::ClearOvlElementList() {
for (int i = 0; i < (int)m_vOvlElements.size(); i++) {
P_SAFE_DELETE(m_vOvlElements[i].pPolyOrig); // <- code crashes here
P_SAFE_DELETE(m_vOvlElements[i].pPolyDispl);
}
m_vOvlElements.clear();
m_vRefElemList.clear();
m_pRefElemSelected = NULL;
return PHXE_NO_ERROR;
}
P_SAFE_DELETE is a macro that checks if the pointer is null, and in case it's not deletes and sets it to null. The actual vector elements are created like this:
if (v1) {
tNew.pPolyOrig = new CInPolygon();
tNew.pPolyDispl = new CInPolygon();
tNew.pPolyOrig->FromSafeArray(v1);
tNew.pPolyOrig->Rotate(NULLPOINT, m_nTurnAngle*__pi/180.);
tNew.eType = (overlayET)type;
tNew.nImagenr = nImageNr;
m_vOvlElements.push_back(tNew);
}
Now, the thing is that CInPolygon is a class from an external library which is created with Visual C++ 7.1. The P_SAFE_DELETE is also defined in a header from that library. From here I know that mixing different runtime versions is bad, and this question lets me suspect that this mixing may be responsible from the crash.
My question is: Why does it happen? After all, since both new and delete are called from the same place, no actual objects are passed between the different CRTs. Also, when the OCX is compiled using Visual Studio 2008, no problems occur. Is this due to pure luck? I guess the basic issue is existing in that setting, too. And, well, what can I do to solve to problem? Switch back to VS2008?
Edit:
As asked: The destructor of CInPolygon is just
CInPolygon::~CInPolygon(void) {
m_vPoints.clear();
}
here the m_vPoints is a std::vector<..> defined in the class. Maybe I should mention that CInPolygon inherits from that:
interface IRoi {
virtual ~IRoi() {
return;
}
public:
// other stuff
};
(Didn't even know that interface was a valid keyword in plain C++...) Could it be that the fact that the base class destructor is defined in the header is causing the problem? After all, that header is also known to the host programm..
tNew.pPolyOrig = new CInPolygon();
Yes, this is guaranteed to fail. Short from having different allocators in your program, your host program cannot possibly compute the size of the CInPolygon object correctly. It uses an entirely different implementation of std::vector. It was significantly rewritten in VS2012 to take advantage of C++11. Inevitably, the code in the library using the old version of vector will corrupt the heap.
You must rebuild the library as well, using the exact same version of the compiler with the exact same settings.

accessing element of boost sparse_matrix seems to stall program

I've got a strange bug that I'm hoping a more experience programmer might have some insight into. I'm using the boost ublas sparse matrices, specifically mapped_matrix, and there is an intermittent bug that occurs eventually, but not in the initial phases of the program. This is a large program, so I cannot post all the code but the core idea is that I call a function which belongs to a particular class:
bool MyClass::get_cell(unsigned int i, unsigned int j) const
{
return c(i,j);
}
The variable c is defined as a member of the class
boost::numeric::ublas::mapped_matrix<bool> c;
When the bug occurs, the program seems to stop (but does not crash). Debugging with Eclipse, I can see that the program enters the boost mapped_matrix code and continues several levels down into std::map, std::_Rb_tree, and std::less. Also, the program occasionally traces down to std::map, std::_Rb_tree, and std::_Select1st. While code is executing and the active line what's in memory changes in _Rb_tree, execution never seems to return in the level of std::map. The line in std::map the program is stuck on is the return of the following function.
const_iterator
find(const key_type& __x) const
{ return _M_t.find(__x); }
It seems to me that there is some element in the c matrix that the program is looking for but somehow the underlying storage mechanism has "misplaced it". However, I'm not sure why or how to fix it. That could also be completely off base.
Any help you can provide would be greatly appreciated. If I have not included the right information in this question, please let me know what I'm missing. Thank you.
Some things to try to debug the code (not necessarily permanent changes):
Change the bool to an int in the matrix type for c, to see if the matrix expects numeric types.
Change the matrix type to another with a similar interface, possibly plain old matrix.
Valgrind the app (if you're on linux) to check you're not corrupting memory.
If that fails, you could try calling get_cell every time you modify the matrix to see what might be causing the problem.
Failing that, you may have to try reduce the problem to a much smaller subset of code which you can post here.
It might help if you tell us what compiler and OS you're using.
Is this part of a multithreaded program?
I ask, because usually when I see problems in STL, it ends up being a problem with unsynchronized access.