I have a very simple C++ code here:
char *s = new char[100];
strcpy(s, "HELLO");
delete [] s;
int n = strlen(s);
If I run this code from Visual C++ 2008 by pressing F5 (Start Debugging), it always results in a crash (Access Violation). However, starting the executable outside the IDE, or using the IDE's Ctrl+F5 (Start Without Debugging), doesn't result in any crash. What could be the difference?
I also want to know if it's possible to stably reproduce the Access Violation crash caused by accessing a deleted area? Is this kind of crash rare in real life?
Accessing memory through a deleted pointer is undefined behavior. You can't expect any reliable/repeatable behavior.
Most likely it "works" in the one case because the string is still "sitting there" in the now-available memory, but you cannot rely on that. VS fills memory with debug values to help force crashes and so help you find these errors.
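To make the "debug values" idea concrete, here is a minimal sketch of what a debug runtime conceptually does before releasing a block. The names poison and poison_and_delete and the fill byte 0xDD are assumptions chosen for illustration, not the actual implementation or value of any particular runtime.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Illustrative fill byte (an assumption, not a value any runtime guarantees).
constexpr unsigned char kDeadFill = 0xDD;

// Overwrite a block so that stale reads see an obvious pattern
// instead of the old contents (e.g. no '\0' terminator is left behind).
inline void poison(void* block, std::size_t size) {
    std::memset(block, kDeadFill, size);
}

// Poison the block, then release it. The caller must not touch it afterwards.
inline void poison_and_delete(char* block, std::size_t size) {
    if (block == nullptr) return;
    poison(block, size);
    delete[] block;
}
```

With a fill like this, a stale strlen(s) no longer finds the old terminator, so it is far more likely to run off into memory that faults. An optimizer may drop the fill right before the release, which is one reason such aids usually live in debug builds only.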
The difference is that a debugger, debug libraries, and code built in "debug" mode like to break stuff that should break. Your code should break (because it accesses memory it no longer technically owns), so it breaks more easily when compiled for debugging and run in the debugger.
In real life, you don't generally get such unsubtle notice. All that stuff that makes things break when they should in the debugger...that stuff's expensive. So it's not checked as strictly in release. You might be able 99 times out of 100 to get away with freeing some memory and accessing it right after, cause the runtime libs don't always hand the memory back to the OS right away. But that 100th time, either the memory's gone, or another thread owns it now and you're getting the length of a string that's no longer a string, but a 252462649-byte array of crap that runs headlong into unallocated (and thus non-existent, as far as you or the runtime should care) memory. And there's next to nothing to tell you what just happened.
So don't do that. Once you've deleted something, consider it dead and gone. Or you'll be wasting half your life tracking down heisenbugs.
Dereferencing a pointer after delete is undefined behavior - anything can happen, including but not limited to:
data corruption
access violation
no visible effects
Exact results will depend on multiple factors, most of which are out of your control. You'll be much better off not triggering undefined behavior in the first place.
Usually, there is no difference between allocated and freed memory from a process perspective. E.g., the process only has one large memory map that grows on demand.
An access violation is caused by reading or writing memory that is not available, usually because it is not mapped into the process. Various run-time memory debugging utilities use the paging mechanism to track invalid memory accesses without the severe run-time penalty that software memory checking would have.
Anyway your example proves only that an error is sometimes detected when running the program in one environment, but not detected in another environment, but it is still an error and the behaviour of the code above is undefined.
The executable with debug symbols is able to detect some cases of access violations. The code to detect this is contained in the executable, but will not be triggered by default.
Here you'll find an explanation of how you can control behaviour outside of a debugger: http://msdn.microsoft.com/en-us/library/w500y392%28v=VS.80%29.aspx
I also want to know if it's possible to stably reproduce the Access Violation crash caused by accessing a deleted area?
Instead of plain delete you could consider using an inline function that also sets the value of the deleted pointer to 0/NULL. This will typically crash if you reference it. However, it won't complain if you delete it a second time.
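A minimal sketch of such a helper, assuming an array allocated with new[] as in the question. The name delete_array_and_null is hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: release an array and reset the caller's pointer.
// The pointer is taken by reference so the caller's copy is the one nulled.
template <typename T>
inline void delete_array_and_null(T*& p) {
    delete[] p;   // delete[] on a null pointer is a no-op
    p = nullptr;  // a later dereference now hits address 0 instead of stale memory
}
```

Note that this only resets the one pointer passed in. Any other copies of the same address are still dangling.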
Is this kind of crash rare in real-life?
No, this kind of crash is probably behind the majority of the crashes you and I see in software.
Related
I have an std::string which appears to be getting corrupted somehow. Sometimes the string destructor will trigger an access violation, and sometimes printing it via std::cout will produce a crash.
If I pad the string in a struct as follows, the back_padding becomes slightly corrupted at a relatively consistent point in my code:
struct Test {
int front_padding[128] = {0};
std::string my_string;
int back_padding[128] = {0};
};
Is there a way to protect the front and back padding arrays so that writing to them will cause an exception or something? Or perhaps some tool which can be used to catch the culprit writing to this memory?
Platform: Windows x64 built with MSVC.
In general you have to solve the problem of code sanitization, which is quite a broad topic. It sounds like you may have an out-of-bounds write, a use of a dangling pointer, or even a race condition in using a pointer. In the latter case the bug's visibility is affected by observation, like the proverbial cat in a quantum superposition state.
A dirty way to debug the source of such a rogue write is to create a data breakpoint. It is especially effective if the bug appears to be deterministic and isn't a "heisenbug". It is possible in MSVS during a debug session. In gdb it is possible by using watchpoints (the watch command).
You can point it at the std::string storage or, in your experimental case, at the padding arrays, in an attempt to trigger the breakpoint where the write operation occurs.
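A data breakpoint needs a specific address, so it helps to know which padding element gets overwritten first. The following is a small sketch that reuses the Test struct from the question. The helper first_corrupted is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct Test {
    int front_padding[128] = {0};
    std::string my_string;
    int back_padding[128] = {0};
};

// Returns the index of the first non-zero element in a padding array,
// or -1 if the padding is still intact.
template <std::size_t N>
int first_corrupted(const int (&pad)[N]) {
    for (std::size_t i = 0; i < N; ++i) {
        if (pad[i] != 0) return static_cast<int>(i);
    }
    return -1;
}
```

Calling first_corrupted(t.back_padding) at a few checkpoints narrows down both when the write happens and which element to watch, e.g. &t.back_padding[i].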
How can you catch memory corruption in C++?
The best way with a modern compiler is to compile with an address sanitizer. This inserts exactly the sort of guard areas you describe around automatic (stack) and dynamic (heap) allocations, and detects when they're trampled. It's built into Clang, GCC and MSVC.
If you don't have compiler support, or need to diagnose the problem in an existing binary without recompiling, you can use Valgrind.
The sanitized executable runs at full speed, although it's doing more work and deliberately has a less cache-friendly memory layout; expect it to be about 2x slower than an equivalent un-instrumented build.
Running under valgrind is much slower (expect 10x-30x for memcheck), but will catch more types of error, and is your only option if you can't recompile.
I'm on amd64 architecture. The following code gets rejected by the g++ compiler at the "if" statement:
void * newmem=malloc(n);
if(newmem==0xefbeaddeefbeadde){
with the error message:
error: ISO C++ forbids comparison between pointer and integer [-fpermissive]
I can't seem to find the magic incantation needed to get it going (and I don't want to use -fpermissive). Any help appreciated.
Background:
I'm hunting an ugly bug which crashes my program while requesting memory in some STL new operation (at least gdb told me that). Thinking it could be some memory overrun of one of the allocated memory chunks being, by bad luck, adjacent to memory used by the OS to manage memory lists of my program, I quickly overrode new(), new plus their delete counterparts with own routines that added memory fencing; and while the application still crashes (all fences intact (sigh)), gdb now told me this:
Program received signal SIGSEGV, Segmentation fault.
0x0000000000604f5e in construct (__val=..., __p=0xefbeaddeefbeadde, this=<optimized out>)
at /usr/include/c++/4.6/ext/new_allocator.h:108
108 { ::new((void *)__p) _Tp(__val); }
I noted that the pointer __p has as value the pointer representation of the value I used for one of my memory fences (0xdeadbeef), hence my wish to catch this earlier in my new() to try and dump out some more complex values in my program.
Additional note: the function where it crashes runs flawlessly a couple of million times before it crashes (intermixed with dozens of other routines which all also run a couple of thousand to million times), using valgrind does not seem like an option atm because it takes 6hrs and 11 Gb before my program crashes.
One of them is an integer, and the other a pointer. Perhaps an ugly cast would do
if(newmem==(void*)0xefbeaddeefbeadde){
For your question: you have 2 options - cast the pointer to 64bit number or cast the number to void*
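A sketch of the first option, casting the pointer to an integer. It assumes a platform with 64-bit pointers (as on amd64). The constant is the value from the gdb output in the question, and the helper name is_fence_pattern is made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Pattern seen in the gdb output; only meaningful where pointers are 64 bits wide.
constexpr std::uint64_t kFencePattern = 0xefbeaddeefbeaddeULL;

// Compare the pointer's integer representation against the fence pattern.
inline bool is_fence_pattern(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) == kFencePattern;
}
```

Inside the replacement allocator this would read if (is_fence_pattern(newmem)) { ... }, which compiles without -fpermissive.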
For the background: as this takes too long, it could be a memory leak. So, try valgrind - no need to wait for crash, just run your program through valgrind and use the appropriate options. valgrind could help you to catch undefined behavior, too.
Another thing - compile your program with -O0 optimization level and with max level of debug symbols - -ggdb3. Also, if your executable does not generate a core dump, when this error occurs, make it generate. Then leave it working, while you're trying different approaches to catch the error.
In case you don't succeed, you'll at least have a (hopefully) nice core dump, that you can examine with gdb your_exe the_core_dump. Then, watch the core frame by frame, watch the variables, etc.
If it's not a memory leak, it could be a memory corruption somewhere, undefined behavior or something like this. If a good core dump is generated, you'll probably catch the error. Or, at least, it could give you useful information about the case.
Another thing that could be useful if you generate a core dump (no matter if it's a good or a bad one) is the size of the dump. If it's too large (what counts as "large" depends on your program), then you most probably have a memory leak (it could be some kind of cross-reference issue if you use smart pointers, and if so, valgrind will not catch it).
That probably won't work that way. Next time when your program runs, malloc may never return that value. There is something else you must do, rather than just checking on the return value of new!
I get a segmentation fault when attempting to delete this.
I know what you think about delete this, but it has been left over by my predecessor. I am aware of some precautions I should take, which have been validated and taken care of.
I don't get what kind of conditions might lead to this crash, only once in a while. About 95% of the time the code runs perfectly fine but sometimes this seems to be corrupted somehow and crash.
The destructor of the class doesn't do anything btw.
Should I assume that something is corrupting my heap somewhere else and that the this pointer is messed up somehow?
Edit : As requested, the crashing code:
long CImageBuffer::Release()
{
long nRefCount = InterlockedDecrement(&m_nRefCount);
if(nRefCount == 0)
{
delete this;
}
return nRefCount;
}
The object has been created with a new, it is not in any kind of array.
The most obvious answer is : don't delete this.
If you insist on doing that, then use the common ways of finding bugs:
1. use valgrind (or similar tool) to find memory access problems
2. write unit tests
3. use debugger (prepare for loooong staring at the screen - depends on how big your project is)
It seems like you've mismatched new and delete. Note that delete this; can only be used on an object which was allocated using new (and in case of overridden operator new, or multiple copies of the C++ runtime, the particular new that matches delete found in the current scope)
Crashes upon deallocation can be a pain: It is not supposed to happen, and when it happens, the code is too complicated to easily find a solution.
Note: The use of InterlockedDecrement makes me assume you are working on Windows.
Log everything
My own solution was to massively log the construction/destruction, as the crash could well never happen while debugging:
Log the construction, including the this pointer value, and other relevant data
Log the destruction, including the this pointer value, and other relevant data
This way, you'll be able to see if the this was deallocated twice, or even allocated at all.
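A rough sketch of that kind of logging is shown below. The class LifetimeLog and its method names are hypothetical; you would call on_construct(this) from the constructor and on_destruct(this) from the destructor of the class under suspicion.

```cpp
#include <cassert>
#include <cstdio>
#include <mutex>
#include <set>

// Hypothetical tracker: logs object addresses and remembers which are live.
class LifetimeLog {
public:
    // Returns false if the address is already recorded as live.
    bool on_construct(const void* obj) {
        std::lock_guard<std::mutex> lock(mutex_);
        bool fresh = live_.insert(obj).second;
        std::fprintf(stderr, "construct %p%s\n", const_cast<void*>(obj),
                     fresh ? "" : " (already live!)");
        return fresh;
    }

    // Returns false if the address is not live: destroyed twice or never constructed.
    bool on_destruct(const void* obj) {
        std::lock_guard<std::mutex> lock(mutex_);
        bool known = live_.erase(obj) == 1;
        std::fprintf(stderr, "destruct  %p%s\n", const_cast<void*>(obj),
                     known ? "" : " (not live!)");
        return known;
    }

private:
    std::mutex mutex_;              // objects may be created and destroyed on several threads
    std::set<const void*> live_;    // addresses of objects currently alive
};
```

A "not live" line in the log right before the crash points at a double release rather than at heap corruption elsewhere.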
... everything, including the stack
My problem happened in Managed C++/.NET code, meaning that I had easy access to the stack, which was a blessing. You seem to work on plain C++, so retrieving the stack could be a chore, but still, it remains very very useful.
You should try to find code on the internet that prints out the current stack for each log entry. I remember playing with http://www.codeproject.com/KB/threads/StackWalker.aspx for that.
Note that you'll need to either be in debug build, or have the PDB file along the executable file, to make sure the stack will be fully printed.
... everything, including multiple crashes
I believe you are on Windows: You could try to catch the SEH exception. This way, if multiple crashes are happening, you'll see them all, instead of seeing only the first, and each time you'll be able to mark "OK" or "CRASHED" in your logs. I went even as far as using maps to remember addresses of allocations/deallocations, thus organizing the logs to show them together (instead of sequentially).
I'm at home, so I can't provide you with the exact code, but here Google is your friend. The thing to remember is that you can't have a __try/__except handler everywhere (C++ unwinding and C++ exception handlers are not compatible with SEH in the same function), so you'll have to write an intermediary function to catch the SEH exception.
Is your crash thread-related?
Last, but not least, the "It happens only 5% of the time" symptom could be caused by different code paths being executed, or by the fact that you have multiple threads playing together with the same data.
The InterlockedDecrement part bothers me: Is your object living in multiple threads? And is m_nRefCount correctly aligned and volatile LONG?
The correctly aligned and LONG part are important, here.
If your variable is not a LONG (for example, it could be a size_t, which is not a LONG on a 64-bit Windows), then the function could well work the wrong way.
The same can be said for a variable not aligned on a 32-bit boundary. Are there any #pragma pack() directives in your code? Does your project file change the default alignment (I assume you're working in Visual Studio)?
For the volatile part, InterlockedDecrement seems to generate a read/write memory barrier, so the volatile part should not be mandatory (see http://msdn.microsoft.com/en-us/library/f20w0x5e.aspx).
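For comparison, here is a portable sketch of the same Release() pattern that swaps InterlockedDecrement for std::atomic<long>, which takes care of size, alignment and ordering by construction. The class RefCounted and its destroyed_flag parameter are made up for this example and are not the CImageBuffer from the question.

```cpp
#include <atomic>
#include <cassert>

class RefCounted {
public:
    // destroyed_flag is a hypothetical hook so a caller can observe destruction.
    explicit RefCounted(bool* destroyed_flag = nullptr) : destroyed_(destroyed_flag) {}

    long AddRef() { return ++count_; }

    long Release() {
        long remaining = --count_;   // atomic decrement, returns the new value
        if (remaining == 0) {
            delete this;             // no member may be touched after this line
        }
        return remaining;            // a local copy, so still safe to read
    }

private:
    // Private destructor: instances must be created with new and freed via Release().
    ~RefCounted() { if (destroyed_) *destroyed_ = true; }

    std::atomic<long> count_{1};     // the creator holds the first reference
    bool* destroyed_;
};
```

The counter itself is only half of the story: if any caller does one Release() too many, or keeps using the pointer after its own Release(), the delete this will still blow up intermittently.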
I get a segmentation fault accessing an object which looks valid and fully accessible in gdb. Isn't a segmentation fault always about inaccessible memory?
EDIT: more details.
The crash happened under gdb, so I could examine the object's memory. It had the members set to proper values, so there is no chance I was accessing read-only memory. The instruction where the crash happened is something like Var = Obj.GetMember(), where Var, GetMember and the corresponding member are short integers.
Misalignment? I suppose it would cause a bus error, not a segmentation fault. I'll try to rebuild everything. The problem is that this piece of code runs thousands of times a second and the segmentation fault happens once in several days.
Try a complete rebuild (make clean && make); this has helped me a couple of times when I encountered such weird errors.
Late UPD:
If this does fix the problem, it usually means that something is wrong with your makefile, usually screwed-up dependencies between .cpp and .h files, for example: a.cpp includes b.h, but b.h is not listed in a.cpp's dependencies.
You can get faults even if accessing "valid" memory under some circumstances:
you're attempting to modify memory but the specific mapping is readonly
you're attempting to execute code in a memory area that is no-execute
you're attempting to e.g. load/store at a misaligned address and your hardware issues alignment exceptions
Without a look at the coredump, to figure out what the faulting instruction (load/store/execute) was and what exactly the mapping permissions for the accessed memory were it's impossible to distinguish.
Basically, yes. Did you use the core dump to analyze your seg fault?
Code would very much help, but have you done a make clean? If you've increased the size of a class and your dependencies aren't right then there won't be enough space allocated for an instance and that class will then overrun and corrupt whatever it precedes in memory.
Platform : Win32
Language : C++
I get an error if I leave the program running for a while (~10 min).
Unhandled exception at 0x10003fe2 in ImportTest.exe: 0xC0000005: Access violation reading location 0x003b1000.
I think it could be a memory leak but I don't know how to find that out.
I'm also unable to 'free()' memory because it always causes this (maybe I shouldn't be using free() on variables):
Unhandled exception at 0x76e81f70 in ImportTest.exe: 0xC0000005: Access violation reading location 0x0fffffff.
at that stage the program isn't doing anything and it is just waiting for user input
dllHandle = LoadLibrary(L"miniFMOD.dll");
playSongPtr = (playSongT)GetProcAddress(dllHandle,"SongPlay");
loadSongPtr = (loadSongT)GetProcAddress(dllHandle,"SongLoadFromFile");
int songHandle = loadSongPtr("FILE_PATH");
// ... {just output , couldn't cause errors}
playSongPtr(songHandle);
getch(); // that is where it causes an error if i leave it running for a while
Edit 2:
playSongPtr(); causes the problem. but i don't know how to fix it
I think it's pretty clear that your program has a bug. If you don't know where to start looking, a useful technique is "divide and conquer".
Start with your program in a state where you can cause the exception to happen. Eliminate half your code, and try again. If the exception still happens, then you've got half as much code to look through. If the exception doesn't happen, then it might have been related to the code you just removed.
Repeat the above until you isolate the problem.
Update: You say "at that stage the program isn't doing anything" but clearly it is doing something (otherwise it wouldn't crash). Is your program a console mode program? If so, what function are you using to wait for user input? If not, then is it a GUI mode program? Have you opened a dialog box and are waiting for something to happen? Have you got any Windows timers running? Any threads?
Update 2: In light of the small snippet of code you posted, I'm pretty sure that if you try to remove the call to the playSongPtr(songHandle) function, then your problem is likely to go away. You will have to investigate what the requirements are for "miniFMOD.dll". For example, that DLL might assume that it's running in a GUI environment instead of a console program, and may do things that don't necessarily work in console mode. Also, in order to do anything in the background (including playing a song), that DLL probably needs to create a thread to periodically load the next bit of the song and queue it in the play buffer. You can check the number of threads being created by your program in Task Manager (or better, Process Explorer). If it's more than one, then there are other things going on that you aren't directly controlling.
The error tells you that memory is accessed which you have not allocated at the moment. It could be a pointer error like dereferencing NULL. Another possibility is that you use memory after you freed it.
The first step would be to check your code for NULL checks, i.e. make sure you have a valid pointer before you use it, and to check the lifecycle of all allocated and freed resources. Writing NULL over pointers you just freed might help find the problem spot.
I doubt this particular problem is a memory leak; the problem is dereferencing a pointer that does not point to something useful. To check for a memory leak, watch your process in your operating system's process list tool (task manager, ps, whatever) and see if the "used memory" value keeps growing.
On calling free: You should call free() once and only once on the non-null values returned from malloc(), calloc() or strdup(). Calling free() less than once will lead to a memory leak. Calling free() more than once will lead to memory corruption.
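In C++, one way to get "once and only once" without counting by hand is to let a smart pointer own the malloc'ed block. A minimal sketch follows. FreeDeleter, MallocString and copy_c_string are names invented for this example.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <memory>

// Deleter that releases memory obtained from malloc/calloc/strdup.
struct FreeDeleter {
    void operator()(void* p) const noexcept { std::free(p); }
};

// Owning pointer: free() runs exactly once, when the owner is reset or destroyed.
using MallocString = std::unique_ptr<char, FreeDeleter>;

// Hypothetical helper: copy a C string into malloc'ed storage.
inline MallocString copy_c_string(const char* text) {
    std::size_t size = std::strlen(text) + 1;          // include the terminator
    char* raw = static_cast<char*>(std::malloc(size));
    if (raw != nullptr) std::memcpy(raw, text, size);
    return MallocString(raw);                           // null if malloc failed
}
```

Only memory that actually came from malloc and friends should be handed to such a pointer. Calling free() on a stack variable or on a handle returned by a library is exactly the kind of thing that produces the access violation described in the question.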
You should get a stack trace to see what is going on when the process crashes. Based on my reading of the addresses involved you probably have a stack overflow or have an incorrect pointer calculation using a stack address (in C/C++ terms: an "auto" variable.) A stack trace will tell you how you got to the point where it crashed.