In the following C++ code, it should be impossible for an integer division by zero to occur:
// gradedUnits and totalGrades are both of type int
if (gradedUnits == 0) {
return 0;
} else {
return totalGrades/gradedUnits; //call stack points to this line
}
However, Visual Studio is popping up this error:
Unhandled exception at 0x001712c0 in DSA_asgn1.exe: 0xC0000094: Integer division by zero.
And the stack trace points to the line indicated in the code.
UPDATE: I may have just been doing something silly here. While messing around trying to get VS to pay attention to my debug breakpoints, I rebuilt the solution and the exception isn't happening any more. It seems likely to me that I was stopping in the middle of a debug session and resuming it, when I thought I was starting new sessions.
Thanks for the responses. Would it be appropriate here to delete my question since it's resolved and wasn't really what I thought it was?
It seems like VS might just do this with any integer division, without checking whether a divide by zero is possible. Do I need to catch this exception even though the code should never be able to throw it? If so, what's the best way to go about this?
This is for an assignment that specifies VS 2005/2008 with C++. I would prefer not to make things more complicated than I need to, but at the same time I like to do things properly where possible.
You should try stepping through this code with the VS debugger and see what the actual values of those variables are.
It turns out this problem was caused by the code I originally had, which did not have the divide-by-zero check, shown here:
return totalGrades/gradedUnits;
The problem was that although I'd updated the code, I was actually still in the same debug session that threw the original error, so the program was still running on the old code and threw the error again every time I restarted it.
The problem was solved by rebuilding the solution, which forced a new debug session. Simply terminating the debug session and restarting it with a rebuild would have solved it too (I just hadn't noticed I was still in the session).
Related
I've run into a strange problem. The Release version of my application seems to run fine, but recently when I switched to the Debug version, I got an access violation immediately on start-up. The access violation is occurring when a block of allocated memory is freed. All of this occurs in the constructor for a static variable.
I believe the problem doesn't occur in the Release version simply because I have defined NDEBUG there, which I believe disables assertions in the C runtime.
I've been able to narrow things down a bit. If I add the following code to the constructor before the usual calls, then I get the same error:
int *temp = new int[3];
delete[] temp;
This makes me think that something outside of this block of code is causing the problem, e.g., perhaps there is a problem with the way the C runtime is being linked. However, I'm at a loss to say what that problem might be, and after a day of poking at the problem I'm running out of ideas for where to poke next.
Any help would be greatly appreciated. I am using Visual Studio 2010 to compile the application and running Windows 7.
Thanks so much!
In Debug mode, additional checks are added, so it's not unusual for a program to run perfectly well in Release mode but give an access violation in Debug mode. This doesn't mean that the Release version is OK; it only means that some error is not caught when running in Release mode but is caught when running in Debug mode.
Debugging corrupted memory in C/C++ is very hard because the damage can be done by any instruction that writes to memory. For example, if you have two arrays that sit next to each other in allocated memory and the first array is overrun, the overrun corrupts the header stored just before the second array (each memory allocation is prefixed by a header, which the operators delete and delete[] use when deallocating the memory). The access violation then occurs only when you try to deallocate the second array, even though the bug in the code involves the first one.
Of course, you can see other symptoms with the second array. For example, you may find that some or all of its values have been corrupted when you try to read from it. But that's not always the case: on many occasions both arrays behave perfectly well for reads and writes. The fact that reading and writing an array appears to work doesn't mean you aren't overstepping its boundary and corrupting the memory above (or below) it. Sometimes the problem only shows up when you try to deallocate the array; other times it shows up differently, for example as corrupted values on display.
I was able to produce a minimal example by cutting away essentially all of my application code. I reduced the InitInstance function to the following:
BOOL CTempApp::InitInstance()
{
    int *temp = new int[3];
    delete[] temp;
    return FALSE;
}
Even with all of my application code stripped away, the problem persisted. After this, I created a new project in Visual Studio 2010, once again replaced InitInstance with my minimal version, and then added all of the Additional Dependencies from my original project in the linker options. With this configuration, the problem was reproduced.
Next, I started removing libraries from the list of dependencies. I managed to whittle the list down to a single third-party library which was causing the following linker warning despite being labeled as the debug version:
LINK : warning LNK4098: defaultlib 'msvcrtd.lib' conflicts with use of other libs; use /NODEFAULTLIB:library
I think what's happening here is that the vendor has linked his debug library against the non-debug runtime. Presumably when my application calls delete[] there is some confusion as to what the parameters are for the call, and it is trying to delete a portion of memory I have not allocated.
I have tried adding the following to my Ignore Specific Default Libraries list:
libc.lib, libcmt.lib, msvcrt.lib, libcd.lib, libcmtd.lib
as suggested here. However, the problem persists. I think the solution will be to contact the vendor and have them resolve the conflict.
Thanks again for your suggestions. I hope this answer will help someone else with debugging a similar problem.
So I have some code that looks like this, written in and compiled with Visual Studio 2010:
if ( outputFile.is_open() )
{
    outputFile.close();
}
if ( !outputFile.is_open() ) // condition for sanity-checking
{
    outputFile.open("errorOut.txt", ios::out);
}
This crashes on an access violation. Attaching a debugger shows that the first condition is false (outputFile is not open), the second condition is true (outputFile is closed, which is good since I just checked it). Then open() gets called, and eventually locale::getloc() attempts to dereference a null pointer, but I have no idea why that would be happening (since that's now three classes deep into the Standard Library).
Interestingly, the file "errorOut.txt" does get created, even though the open call crashes.
I've spent a few hours watching this in a debugger, but I honestly have no idea what's going on. Anyone have any ideas for even trying to determine what's wrong with the code? It's entirely possible that some code elsewhere is contributing to this situation (inherited code), but there's a lot of it and I don't even know where to look. Everything up to that point seems fine.
OK, I'm not really sure if this is the best way to handle this, but since this involved some truly strange behavior (crashing in the middle of an STL function, and some other oddities like hanging on exit(1); and the like), I'll leave an explanation here for the future.
In our case, the error seemed to derive from some memory corruption going on in some truly awful code that we inherited. Cleaning up the code in general eliminated this crash and other strange behaviors displayed by the program.
I don't know if this will be useful to anyone; maybe it would have been better to simply delete the question. I'm actually somewhat curious if I should have, if anyone wants to leave a comment.
I get a segmentation fault when attempting to delete this.
I know what you think about delete this, but it has been left over by my predecessor. I am aware of some precautions I should take, which have been validated and taken care of.
I don't get what kind of conditions might lead to this crash; it happens only once in a while. About 95% of the time the code runs perfectly fine, but sometimes this seems to be corrupted somehow and crashes.
The destructor of the class doesn't do anything btw.
Should I assume that something is corrupting my heap somewhere else and that the this pointer is messed up somehow?
Edit : As requested, the crashing code:
long CImageBuffer::Release()
{
    long nRefCount = InterlockedDecrement(&m_nRefCount);
    if (nRefCount == 0)
    {
        delete this;
    }
    return nRefCount;
}
The object has been created with a new, it is not in any kind of array.
The most obvious answer is: don't delete this.
If you insist on doing that, then use the common ways of finding bugs:
1. use valgrind (or a similar tool) to find memory access problems
2. write unit tests
3. use a debugger (prepare for loooong staring at the screen - depends on how big your project is)
It seems like you've mismatched new and delete. Note that delete this; can only be used on an object that was allocated with new (and, in the case of an overridden operator new or multiple copies of the C++ runtime, with the particular new that matches the delete found in the current scope).
Crashes upon deallocation can be a pain: It is not supposed to happen, and when it happens, the code is too complicated to easily find a solution.
Note: The use of InterlockedDecrement leads me to assume you are working on Windows.
Log everything
My own solution was to massively log the construction/destruction, as the crash could well never happen while debugging:
Log the construction, including the this pointer value, and other relevant data
Log the destruction, including the this pointer value, and other relevant data
This way, you'll be able to see if the this pointer was deallocated twice, or never allocated at all.
... everything, including the stack
My problem happened in Managed C++/.NET code, meaning that I had easy access to the stack, which was a blessing. You seem to work on plain C++, so retrieving the stack could be a chore, but still, it remains very very useful.
You could find code on the Internet to print out the current stack for each log entry. I remember playing with http://www.codeproject.com/KB/threads/StackWalker.aspx for that.
Note that you'll need either a debug build, or the PDB file alongside the executable file, to make sure the stack is fully printed.
... everything, including multiple crashes
I believe you are on Windows: you could try to catch the SEH exception. This way, if multiple crashes are happening, you'll see them all instead of only the first, and each time you'll be able to mark "OK" or "CRASHED" in your logs. I even went as far as using maps to remember the addresses of allocations/deallocations, organizing the logs to show them together (instead of sequentially).
I'm at home, so I can't provide you with the exact code, but Google is your friend here. The thing to remember is that you can't have a __try/__except handler everywhere (C++ unwinding and C++ exception handlers are not compatible with SEH), so you'll have to write an intermediary function to catch the SEH exception.
Is your crash thread-related?
Last, but not least, the "it happens only 5% of the time" symptom could be caused by different code paths being executed, or by multiple threads playing with the same data.
The InterlockedDecrement part bothers me: Is your object living in multiple threads? And is m_nRefCount correctly aligned and a volatile LONG?
The "correctly aligned" and "LONG" parts are important here.
If your variable is not a LONG (for example, it could be a size_t, which is not a LONG on a 64-bit Windows), then the function could well work the wrong way.
The same can be said for a variable not aligned on a 32-bit boundary. Are there #pragma pack() directives in your code? Does your project file change the default alignment (I assume you're working in Visual Studio)?
For the volatile part, InterlockedDecrement seems to generate a Read/Write memory barrier, so volatile should not be mandatory (see http://msdn.microsoft.com/en-us/library/f20w0x5e.aspx).
Sorry if this sounds like an "It compiles, so it must work!" question, but I want to understand why something is happening (or not happening, as the case may be).
In Project Settings, I set Basic Runtime Checks to Both. The debugger informs me that:
Run-Time Check Failure #2 - Stack around the variable 'beg' was corrupted.
But if I set it to the default, which is none, the program runs and completes normally, throwing no exceptions and causing no errors.
My question is, can I safely ignore this (because MSVC++ could be somehow wrong) or is this a real problem? I don't see how the program can continue successfully when the stack has been screwed up.
Edit:
The function that causes this error looks exactly like this:
int fun(list<int>::iterator&, const list<int>::iterator&);
int foo(list<int>& l) {
    list<int>::iterator beg = l.begin();
    list<int>::iterator end = l.end();
    return fun(beg, end);
}
fun increments and operates on beg, and when it returns, beg == end. When MSVC++ breaks, it points to the closing }.
Edit 2:
I have isolated the problem. In some situations, fun removes elements from the list that owns the items it iterates over. This is what causes the error.
Your question isn't answerable without code to reproduce the problem.
But to give a vague answer to your general problem - If the compiler or debugger detected a problem, you probably have one.
In C++, just because something "goes wrong" doesn't mean your program will crash - it might keep running with completely unpredictable results. It may even complete with the results you desired. But just because it ran well on your system doesn't give you any guarantee for other systems, compilers, times of day, or even for additional runs of the same program.
This is called undefined behavior, and is caused by using the language incorrectly (but not in a way that causes a compile failure). A buffer overrun is only one of dozens of examples.
It turned out something was wrong with my Visual Studio installation, so reinstalling it fixed the problem.
In the following case I'm calling a Func with pointer passed to it, but in the called function, the parameter shows the pointer value as something totally bogus. Something like below.
bool flag = Func(pfspara); // pfspara = 0x0091d910

bool Func(PFSPARA pfspara) // pfspara = 0x00000005
{
    return false;
}
Why does pfspara change to some bogus pointer? I can't reproduce the problem in debug, only in production.
Thanks.
If you are trying to debug optimized code in for example Visual Studio, you cannot always rely on the debugger properly showing the values of variables - especially not if the variable is unused so that the compiler probably optimizes it away.
Try running this instead:
bool Func(PFSPARA pfspara)
{
    printf("%p\n", (void*)pfspara); // %p is the correct conversion for pointers
    return false;
}
In general, this should never happen. Problems that can cause this type of symptom include incompatible compilation options between the calling and called modules, bad casts of member function pointers, or simply compiler bugs.
You need to provide a lot more details about your problem: Show the real code, specify your compiler, specify what are the debug vs. production compilation flags, etc.
In addition to Rasmus's comments, I find it is generally worth checking whether the problem occurs in a debug build as well as the release build. If you see genuine problems in a release build but not in the debug build, it is often down to a bug exposed by the optimization process, such as an uninitialized variable. There is a real danger in doing most of your testing in a debug build, to avoid the problem you are seeing here, and then shipping a release build. IMO, if you don't have a good regression test suite (preferably automated), I would avoid shipping optimized code.
It sounds like a buffer overflow problem to me -- something is overwriting that variable. But as mentioned in other answers, there's no way to tell for sure without some actual code to work with.
It sounds to me like you're scribbling on the stack... somewhere in your code a buffer on the stack is overflowing, or you're taking the address of an object on the stack and writing to it after the function returns. This is causing your stack to be corrupted.
It may only happen in release mode because the stack allocations are different due to optimization and exclusion of 'guard' blocks used to help check for this kind of condition.