I have a small piece of C++ code that is making me insane. Whenever it runs, it throws a null reference exception in a very unexpected place.
void CSoundHandle::SetTarget(CSound* sound)
{
assert(_Target == nullptr);
if (sound == nullptr) { return; }
_Target = sound;
// This works just fine.
_Target->Play();
// This is the code that throws the exception. It doesn't seem possible, as
// we should not be able to get here if 'sound' is null.
_Target->Stop();
}
So what the heck is going on? The message in the output window is:
this->_Target-> was nullptr.
0xC0000005: Access violation reading location 0x00000014
I have also confirmed in the disassembly that the crash is not taking place inside the Stop function. How is this possible?
EDIT:
The pointer for sound is indeed initialized, and 'this' and 'this->_Target' are non-null.
EDIT 2:
I have somehow solved the problem by slightly changing the declaration of the Stop function:
// From this.
virtual void Stop();
// To this.
void Stop();
This seems especially odd since Play() is also virtual, but works without any trouble. I can't say I've ever seen anything like this before. There are no other functions named 'Stop' in the rest of the program, nor are there subclasses of CSound, so I'm a bit confused.
Reading location 0x00000014 implies you're trying to access a field located 0x14 bytes from the beginning of an object. The pointer to that object is set to null. So the problem is in the caller of your function: it passes a bad pointer to sound, which is neither null nor valid. This is why the null check in your function passes (0x14 isn't null), but you still crash.
Update:
The second edit to the question indicates that the problem is in calling a virtual function. The null here is then the virtual table pointer (vptr), and 0x14 is the offset of the Stop entry in the virtual table. The vptr is set (by compiler-generated code) during object construction and should never be 0. If it is, some part of the program is corrupting the object. An easy-to-detect case would be an attempt to reset the object, but memory corruption (e.g., an out-of-bounds write to an array) could also cause this issue.
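One way a vtable pointer can end up as 0 is a byte-wise reset (for example a memset) over a polymorphic object. Nothing in the question confirms that this is what happened here, so the following is only a sketch of a compile-time guard against that particular mistake. The names zero_fill, PlainConfig and SoundLike are hypothetical and do not come from the original code.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical plain struct: safe to zero byte by byte.
struct PlainConfig {
    int volume;
    int channel;
};

// Hypothetical polymorphic class: it carries a hidden vtable pointer,
// so overwriting its bytes would also overwrite that pointer.
struct SoundLike {
    virtual ~SoundLike() = default;
    virtual void Play() {}
    virtual void Stop() {}
    int volume = 0;
};

// Zero an object only if its type is trivially copyable. Classes with
// virtual functions are not, so zero_fill on a SoundLike fails to compile
// instead of silently wiping the vtable pointer.
template <typename T>
void zero_fill(T& object) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "zero_fill would clobber non-trivial object state");
    std::memset(&object, 0, sizeof(T));
}
```

With this in place, zero_fill on a PlainConfig compiles and works, while zero_fill on a SoundLike is rejected at compile time. It does not help against stray out-of-bounds writes, which need a memory checker.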
I have three classes relevant to this issue. I'm implementing a hardware service for an application. PAPI (Platform API) is a hardware service class that keeps track of various hardware interfaces. I have implemented an abstract HardwareInterface class, and a class that derives it called HardwareWinUSB.
Below are examples similar to what I've done. I've left out members that don't appear to be relevant to this issue, like functions to open the USB connection:
class PAPI {
HardwareInterface *m_pHardware;
PAPI() {
m_pHardware = new HardwareWinUSB();
}
~PAPI() {
delete m_pHardware;
}
ERROR_CODE WritePacket(void* WriteBuf)
{
return m_pHardware->write( WriteBuf);
}
};
class HardwareInterface {
virtual ERROR_CODE write( void* WriteBuf) = 0;
};
class HardwareWinUSB : public HardwareInterface
{
ERROR_CODE write( void* Params)
{
// Some USB writing code.
// This had worked just fine before attempting to refactor
// Into this more sustainable hardware management scheme
}
};
I've been wrestling with this for several hours now. It's a strange issue that is reproducible, though sometimes intermittent. If I step through in the debugger at a higher level, things execute fine. If I don't dig deep enough, I'm met with an error that reads
Exception thrown at 0x00000000 in <ProjectName.exe>: 0xC0000005: Access violation executing location 0x00000000
If I dig down into the PAPI code, I see bizarre behavior.
When I set a breakpoint in the body of WritePacket, everything appears normal. Then I do a "step over" in the debugger. After the return from the function call, my reference to 'this' is set to 0x00000000.
What is going on? It looks like a null value was pushed on the return stack? Has anyone seen something like this happen before? Am I using virtual methods incorrectly?
edit
After further dissection, I found that I was reading before calling write, and the buffer that I was reading into was declared in local scope. When new reads came in, they were being pushed into the stack, corrupting it. The next function called, write, would return to a destroyed stack.
A buffer overrun can trash the return address on the stack. You seem to be reading and writing packets with void pointers and without passing around explicit sizes, so a simple overrun bug seems quite likely. The Visual Studio compiler has options to add stack integrity checks to detect these kinds of bugs, but they're not 100% perfect. Nonetheless, make sure you have them switched on.
Also note that the Visual Studio debugger can occasionally (but rarely) show the wrong value for this, especially if you're trying to debug optimized code. If you're at the } at the end of a method, I wouldn't necessarily worry about the debugger showing a bizarre value for this.
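On the point about passing explicit sizes: here is a minimal sketch of a copy helper that never writes more than the destination can hold. copy_packet and its parameter names are hypothetical; the original read/write functions are not shown in the question, so this only illustrates the idea.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy at most dst_capacity bytes from src into dst
// and report how many bytes were actually copied. The caller can compare
// the result with src_len to detect truncation.
std::size_t copy_packet(const void* src, std::size_t src_len,
                        void* dst, std::size_t dst_capacity) {
    const std::size_t n = std::min(src_len, dst_capacity);
    if (n > 0) {
        std::memcpy(dst, src, n);
    }
    return n;
}
```

A caller with a local buffer would pass sizeof(buffer) as dst_capacity, so an oversized incoming packet is truncated instead of running over the stack frame.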
After further dissection, I found that I was reading before calling write, and the buffer that I was reading into was declared in local scope (in the read function).
When new reads came in, they were being pushed into the stack, corrupting it. The next function I called, write, would return to a destroyed stack.
Unhandled exception at 0x764F135D (kernel32.dll) in RFNReader_NFCP.exe.4448.dmp: 0xC0000005: Access violation writing location 0x00000001.
void Notify( const char* buf, size_t len )
{
for( auto it = m_observerList.begin(); it != m_observerList.end(); )
{
auto item = it->lock();
if( item )
{
item->Update( buf, len );
++it;
}
else
{
it = m_observerList.erase( it );
}
}
}
The value of the variable item in the debugger window:
item shared_ptr {m_interface="10.243.112.12" m_port="8889" m_clientSockets={ size=0 } ...} [3 strong refs, 2 weak refs] [default] std::tr1::shared_ptr
But inside item->Update(), item (this) becomes null! Why?
The problem here is most likely not the weak_ptr, which is used correctly.
In fact, the code you posted is completely fine, so the error must be elsewhere. The raw pointer and length arguments indicate a possible memory corruption.
Be aware that the debugger might lie to you if you accidentally mess up stack frames due to memory corruption. Since you seem to be debugging this from a minidump it might also be that the dumping swallowed some info here.
Mind you, the corrupted this pointer that you are seeing here is just a value on the stack! The underlying object is most probably still alive, as you are maintaining several shared_ptrs to it (you can verify this in a debug build by checking if the original memory location of the object was overwritten by magic numbers). It's really just your stack values that are bogus. I would definitely recommend you double check the stack manually using VS's memory and register windows. If you do have a memory corruption, it should become visible there.
Also consider temporarily cranking up the amount of data saved to the minidump if it threw away too much.
Finally, be sure you double check your buffer handling. It's very likely that you messed up there somewhere and an out-of-bounds buffer write caused the corruption.
Note that your this is invalid (0x00000001), i.e. the object got destroyed. The Notify member function was called on a destroyed object. This obviously crashes as soon as Notify tries to access an object member.
I have the following code (some code removed to strip it to the essentials; the couple methods/attributes used should be self explanatory):
void testApp::togglePalette(){
GraphicalEntity* palette= this->getEntityByName("palette-picker");
cerr << palette << endl;
}
GraphicalEntity* testApp::getEntityByName(string name){
list<GraphicalEntity*>::iterator j;
for(j=screenEntities.begin(); j!=screenEntities.end();++j){
if ((*j)->getTypetag() == name){
cerr << *j << endl;
return *j;
}
}
}
Which outputs the following:
0x54bda0
0
I'm confused: why is palette in togglePalette() 0, rather than the address returned from getEntityByName (0x54bda0 in this case)?
Thanks!
EDIT: As Fred pointed out in one of his comments, it was indeed an issue of the compiler being confused by the code reaching the end of the function without returning anything.
Adding:
return (GraphicalEntity*) NULL;
at the end of my getEntityByName method solved the problem. Thanks a lot!
I'm still confused by why the method would return 0 even if the object is found (as in the way I implement my code, it is known that there will always be something found) though- any explanation on that would be more than welcome!
Following on my comment, here's a more complete answer.
There is a path in your testApp::getEntityByName() method where control exits the method without returning a value. Depending on your compiler, architecture and calling convention, this could result in machine code that doesn't work even if your flow never goes through the erroneous path.
Depending on the calling convention, it is either the caller's or the called method's responsibility to clean up the stack before or after the method returns. The return value, and where it is allocated in memory, is part of that convention, and a compiler expects a function to always return the same type no matter what the control flow within the function is. Because of that, it can optimize some methods by rearranging things and generating specific clean-up code to restore the stack according to the calling convention. In any case, the missing return value can mess up that optimization or clean-up because it violates what the compiler took for granted when it processed your code, i.e. that every path returned a pointer to a GraphicalEntity object. Breaking that assumption corrupted the stack or its contents, and you ended up with a NULL pointer (it might as well have crashed or done just about anything else; it's all part of undefined behavior).
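To make the fix concrete, here is a minimal sketch in which every control path returns a value. It is a free-function variant, and the GraphicalEntity struct below is a hypothetical stand-in, since the real class is not shown in the question.

```cpp
#include <cassert>
#include <list>
#include <string>

// Hypothetical stand-in for the question's GraphicalEntity.
struct GraphicalEntity {
    std::string typetag;
    const std::string& getTypetag() const { return typetag; }
};

// Free-function version of getEntityByName: every control path returns
// a value, and nullptr signals "not found".
GraphicalEntity* getEntityByName(const std::list<GraphicalEntity*>& entities,
                                 const std::string& name) {
    for (GraphicalEntity* entity : entities) {
        if (entity->getTypetag() == name) {
            return entity;
        }
    }
    return nullptr;  // the path that was missing a return in the question
}
```

Compilers can usually flag the original mistake: GCC and Clang report it under -Wreturn-type (part of -Wall), and MSVC issues warning C4715.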
It could happen if screenEntities is accessed from another thread, so that the "palette-picker" entity has been removed or modified. Then the getEntityByName function would return NULL in debug mode.
So I have been debugging this error for hours now. I'm writing a program using Ogre3d, which is relevant only because it doesn't load symbols, so I can't get a stack trace, and that made finding the location of the crash even harder. Right before I call a specific function I print "Starting", then I call the function, and immediately afterwards I print "Stopping". Throughout the function I print the letters A-F, where F is printed right before the function returns (one line above the last '}'). The weird thing is that when the crash occurs, it is after 'F' is printed, but there is no "Stopping". Does this mean that the crash is happening somewhere in between? The only thing I can think of is something going wrong during the deallocation of memory allocated during the function. I've never had anything like this happen before; I will keep checking to make sure it's going wrong where I think it is.
Most of the time, when something weird and incomprehensible happens, it's because of something else.
You could have some dangling pointers in your code (even in a place far away from that function) pointing to random memory cells.
You might have used such a dangling pointer, and that might have overwritten some memory cells you need. The result is that you changed the behavior of your program by changing some variable defined elsewhere, some constants, or even some code!
I'd suggest debugging your application with a tool that can check for and report erroneous memory accesses, like Valgrind.
Anyway, if you are able to localize the source of your crash and write a really small piece of code that crashes, post it here. It could be just a simple error in your function, although that sounds unlikely from your description.
This probably means that the error is happening when the function returns and some destructor is firing. Chances are that you have some destructor trying to free memory it doesn't own, or writing off the end of some buffer in a log, etc.
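A small sketch of why code can still run after the last visible statement of a function: destructors of local objects fire at the closing brace, after "F" and before the caller continues. The event log and the Resource type are hypothetical and exist only to make the order visible.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical event log so the order of operations can be inspected.
std::vector<std::string> g_events;

// Hypothetical resource whose destructor records when it runs.
struct Resource {
    ~Resource() { g_events.push_back("~Resource"); }
};

void work() {
    Resource local;            // destroyed when work() returns
    g_events.push_back("F");   // last statement before the closing brace
}                              // ~Resource runs here, after "F"

void caller() {
    g_events.push_back("Starting");
    work();
    g_events.push_back("Stopping");
}
```

After caller() runs, the log reads Starting, F, ~Resource, Stopping. A crash inside a destructor would therefore show up after "F" but before "Stopping".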
Another possibility to be aware of might come up if you aren't flushing the output stream. It's possible that "Stopping" is getting printed, but is being buffered before hitting stdout. Make sure to check for this, since if that's what's going on you'll be barking up the wrong tree.
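To rule out buffering, each marker can be flushed as soon as it is written. The trace helper below is a hypothetical sketch, not something from the question.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical trace helper: write a marker and flush right away, so the
// text has left the stream's buffer before the next statement runs.
void trace(std::ostream& out, const std::string& marker) {
    out << marker << '\n' << std::flush;
}
```

The helper accepts any std::ostream, so it can be called as trace(std::cout, "Stopping") in the program and pointed at a std::ostringstream in a test. Writing to std::cerr is another option, since it is unit-buffered and flushes after each output operation.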
I had a similar problem, and it turned out that my function was not returning anything when the signature expected a return type of std::shared_ptr, even though I was not using the return anywhere.
The function had the following signature:
std::shared_ptr<blDataNode> blConditionBasedDataSelectionUI::selectData(std::shared_ptr<blDataNode> inputData)
{
// My error was due to the function
// not returning anything
}
I encountered the same problem, and it turned out I forgot to clear my vector before appending new items, which caused an error when my function compared the vector with another list.
std::vector<cv::Point> lefteyeCV;
void Init() {
// I need to add "lefteyeCV.clear();" here!
for (int i = 0; i < 8; i++) {
lefteyeCV.push_back(cv::Point(0, 0));
}
}
// the following comparison will crash after "return 0"
// because cl_ is of size 8, but if I run "Init()" twice, lefteyeCV.size() = 16
// then the comparison is out of range.
int irisTrack(){
for (int i = 0; i < lefteyeCV.size(); i++) {
cl_[i] = cv::Point(lefteyeCV[order[i]].x - leftRect.x, lefteyeCV[order[i]].y - leftRect.y);
}
return 0;
}
What's confusing is that I'm using Xcode, and the app crashes right after "return 0" with the indecipherable message "thread 13: signal SIGABRT". However, using Visual Studio instead showed me the line where the index is out of range.
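Two habits would have surfaced this earlier, sketched below under some assumptions: Point2i is a hypothetical stand-in for cv::Point so the example has no OpenCV dependency, and init and copy_points are simplified versions of the functions above.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for cv::Point.
struct Point2i {
    int x = 0;
    int y = 0;
};

std::vector<Point2i> lefteye;

// assign() replaces the contents, so calling init() twice still leaves
// exactly `count` elements instead of appending another batch.
void init(std::size_t count = 8) {
    lefteye.assign(count, Point2i{});
}

// at() throws std::out_of_range on a bad index instead of silently
// writing past the end of the destination.
void copy_points(const std::vector<Point2i>& src, std::vector<Point2i>& dst) {
    for (std::size_t i = 0; i < src.size(); ++i) {
        dst.at(i) = src[i];
    }
}
```

With at(), the failure is reported at the faulty access on every platform, rather than as a crash some time after the function returns.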
I'm getting a bad error. When I call delete on an object at the top of an object hierarchy (hoping to cause the deletion of its child objects), my program quits and I get this:
*** glibc detected *** /home/mossen/workspace/abbot/Debug/abbot: double free or corruption (out): 0xb7ec2158 ***
followed by what looks like a memory dump of some kind. I've searched for this error and from what I gather it seems to occur when you attempt to delete memory that has already been deleted. Impossible as there's only one place in my code that attempts this delete. Here's the wacky part: it does not occur in debug mode. The code in question:
Terrain::~Terrain()
{
if (heightmap != NULL) // 'heightmap' is a Heightmap*
{
cout << "heightmap& == " << heightmap << endl;
delete heightmap;
}
}
I have commented out everything in the heightmap destructor, and still this error. When the error occurs,
heightmap& == 0xb7ec2158
is printed. In debug mode I can step through the code slowly and
heightmap& == 0x00000000
is printed, and there is no error. If I comment out the 'delete heightmap;' line, error never occurs. The destructor above is called from another destructor (separate classes, no virtual destructors or anything like that). The heightmap pointer is new'd in a method like this:
Heightmap* HeightmapLoader::load() // a static method
{
// ....
Heightmap* heightmap = new Heightmap();
// ....other code
return heightmap;
}
Could it be something to do with returning a pointer that was initialized in the stack space of a static method? Am I doing the delete correctly? Any other tips on what I could check for or do better?
What happens if load() is never called? Does your class constructor initialise heightmap, or is it uninitialised when it gets to the destructor?
Also, you say:
... delete memory that has already been deleted. Impossible as there's only one place in my code that attempts this delete.
However, you haven't taken into consideration that your destructor might be called more than once during the execution of your program.
In debug mode pointers are often set to NULL and memory blocks zeroed out. That is the reason why you are experiencing different behavior in debug/release mode.
I would suggest you use a smart pointer instead of a raw pointer (note that auto_ptr was deprecated in C++11 and removed in C++17; std::unique_ptr is its replacement):
auto_ptr<Heightmap> HeightmapLoader::load() // a static method
{
// ....
auto_ptr<Heightmap> heightmap( new Heightmap() );
// ....other code
return heightmap;
}
That way you don't need to delete it later, as it will be done for you automatically.
See also boost::shared_ptr.
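For compilers that support C++11 or later, the same idea can be sketched with std::unique_ptr. This swaps in a different smart pointer from the one suggested above. The Heightmap fields and the loadHeightmap and hasHeightmap members are hypothetical, since the real classes are not shown in the question.

```cpp
#include <cassert>
#include <memory>

// Hypothetical minimal Heightmap; the real class is not shown.
struct Heightmap {
    int width = 0;
    int height = 0;
};

struct HeightmapLoader {
    // Returning std::unique_ptr makes the ownership transfer explicit.
    static std::unique_ptr<Heightmap> load() {
        std::unique_ptr<Heightmap> heightmap(new Heightmap());
        // ... fill in the heightmap here ...
        return heightmap;
    }
};

class Terrain {
public:
    void loadHeightmap() { heightmap_ = HeightmapLoader::load(); }
    bool hasHeightmap() const { return heightmap_ != nullptr; }
    // No destructor needed: heightmap_ starts out null and is released
    // exactly once when the Terrain is destroyed.
private:
    std::unique_ptr<Heightmap> heightmap_;
};
```

Because std::unique_ptr cannot be copied, Terrain becomes move-only, so two Terrain objects can no longer end up deleting the same Heightmap.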
It's quite possible that you're calling that dtor twice; in debug mode the pointer happens to be zeroed on delete, in optimized mode it's left alone. While not a clean resolution, the first workaround that comes to mind is setting heightmap = NULL; right after the delete -- it shouldn't be necessary but surely can't hurt while you're looking for the explanation of why you're destroying some Terrain instance twice!-) [[there's absolutely nothing in the tiny amount of code you're showing that can help us explain the reason for the double-destruction.]]
It looks like the classic case of an uninitialized pointer. As @Greg said, what if load() is not called from Terrain? I think you are not initializing the Heightmap* pointer inside the Terrain constructor. In debug mode, this pointer may be set to NULL, and C++ guarantees that deleting a NULL pointer is a valid operation, hence the code doesn't crash. However, in release mode, due to optimizations, the pointer is uninitialized, so you try to free some random block of memory and the above crash occurs.
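If the raw pointer has to stay, a minimal sketch of the fix looks like this. The setHeightmap and hasHeightmap members and the empty Heightmap struct are hypothetical. Deleting the copy operations is an extra precaution and an assumption on my part: the question does not show whether Terrain is ever copied, but a copy would share the pointer and lead to a double delete.

```cpp
#include <cassert>

struct Heightmap {};  // hypothetical stand-in for the real class

class Terrain {
public:
    Terrain() : heightmap(nullptr) {}             // never left uninitialized
    Terrain(const Terrain&) = delete;             // a copy would share the pointer
    Terrain& operator=(const Terrain&) = delete;

    ~Terrain() {
        delete heightmap;     // deleting a null pointer is a no-op
        heightmap = nullptr;
    }

    // Takes ownership of `loaded`, releasing any previous heightmap first.
    void setHeightmap(Heightmap* loaded) {
        delete heightmap;
        heightmap = loaded;
    }
    bool hasHeightmap() const { return heightmap != nullptr; }

private:
    Heightmap* heightmap;
};
```

A Terrain that never had load() called on its behalf now destructs cleanly in both debug and release builds.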