Why would buffer overruns cause segmentation faults when accessing an integer? - c++

During a call to function B() from function A(), B() allocates a 100-char array and fills it several times, including once with a 101-character string and once with a 110-character string. This is an obvious mistake.
Later, function A() tries to access completely unrelated int variable i, and a segmentation fault occurs.
I understand why the buffer overrun occurs, but why do I get a segmentation fault when I access this integer? Why is it that I don't simply get garbage data?

A buffer overrun may clobber a previously saved version of the frame pointer on the stack.
When the function returns, this corrupt version is loaded into the frame pointer register, causing the behavior you describe.
Wikipedia's page contains a figure and definitions.

When A() calls B(), B's prologue instructions save A's frame pointer (the location on the stack where A keeps its local variables) before replacing it with B's own frame pointer.
When B overruns its local variables, it messes up the value which will be reloaded into the frame pointer. This is garbage as a frame pointer value, so all of A's local variables are trashed. Worse, future writes to local variables are messing with memory belonging to someone else.

The most likely explanation from your description is that the overrun in B corrupts the saved frame pointer on the stack for A. So after B returns, A has garbage in its frame pointer and crashes when it tries to access a local variable.

If you're accessing i through a pointer, then the problem is the pointer is garbage.

It is important to remember to allocate enough memory for the string plus one byte for the terminating nul character. (Astute readers will point out the spelling, which is there for a reason: nul with one 'l' is the character '\0' [thanks to Software Monkey for pointing out an error!], while null with two 'l's is a pointer pointing to nothing.)
Here's an example of how a seg fault can occur
int main(int argc, char **argv){
int *x = NULL;
*x = 5;
// boom
}
Since x is a null pointer, dereferencing it and assigning a value through it is undefined behavior. On most platforms this is a reliable way of generating a segmentation fault.
There is an old trick available in that you can actually trap the seg fault and get a stack trace, more common on unix environment, by setting up a signal handler to trap a SIGSEGV, and within your signal handler invoke a process like this:
char buf[250];
buf[0] = '\0';
snprintf(buf, sizeof buf, "gdb -p %d -batch -ex where > mysegfault.txt", (int)getpid());
system(buf);
This shells out to the debugger and attaches it to the currently executing C program. The where command shows the stack trace leading to the offending line that caused the seg fault, and the output is redirected to a file in the current directory.
Note: this depends on the installation. Under AIX, the GNU debugger is present and hence this will work; your mileage may vary.
Hope this helps,
Best regards,
Tom.

Related

weak-ptr become null, crash app 1 time every week

Unhandled exception at 0x764F135D (kernel32.dll) in RFNReader_NFCP.exe.4448.dmp: 0xC0000005: Access violation writing location 0x00000001.
void Notify( const char* buf, size_t len )
{
for( auto it = m_observerList.begin(); it != m_observerList.end(); )
{
auto item = it->lock();
if( item )
{
item->Update( buf, len );
++it;
}
else
{
it = m_observerList.erase( it );
}
}
}
variable item's value in debug window:
item shared_ptr {m_interface="10.243.112.12" m_port="8889" m_clientSockets={ size=0 } ...} [3 strong refs, 2 weak refs] [default] std::tr1::shared_ptr
but in item->Update():
the item (this) becomes null!
why??
The problem here is most likely not the weak_ptr, which is used correctly.
In fact, the code you posted is completely fine, so the error must be elsewhere. The raw pointer and length arguments indicate a possible memory corruption.
Be aware that the debugger might lie to you if you accidentally mess up stack frames due to memory corruption. Since you seem to be debugging this from a minidump it might also be that the dumping swallowed some info here.
Mind you, the corrupted this pointer that you are seeing here is just a value on the stack! The underlying object is most probably still alive, as you are maintaining several shared_ptrs to it (you can verify this in a debug build by checking if the original memory location of the object was overwritten by magic numbers). It's really just your stack values that are bogus. I would definitely recommend you double check the stack manually using VS's memory and register windows. If you do have a memory corruption, it should become visible there.
Also consider temporarily cranking up the amount of data saved to the minidump if it threw away too much.
Finally, be sure you double check your buffer handling. It's very likely that you messed up there somewhere and an out-of-bounds buffer write caused the corruption.
Note that your this pointer is invalid (0x00000001), which suggests the object got destroyed: the Notify member function was called on a destroyed object. This crashes as soon as Notify tries to access an object member.

General way of solving Error: Stack around the variable 'x' was corrupted

I have a program which prompts me the error in VS2010, in debug :
Error: Stack around the variable 'x' was corrupted
This gives me the function where a stack overflow likely occurs, but I can't visually see where the problem is.
Is there a general way to debug this error with VS2010? Would it be possible to identify which write operation is overwriting the incorrect stack memory?
thanks
Is there a general way to debug this error with VS2010?
No, there isn't. What you have done is to somehow invoke undefined behavior. The reason these behaviors are undefined is that the general case is very hard to detect/diagnose. Sometimes it is provably impossible to do so.
There are however, a somewhat smallish number of things that typically cause your problem:
Improper handling of memory:
Deleting something twice,
Using the wrong type of deletion (free for something allocated with new, etc.),
Accessing something after its memory has been deleted.
Returning a pointer or reference to a local.
Reading or writing past the end of an array.
This can be caused by several issues, that are generally hard to see:
double deletes
delete a variable allocated with new[] or delete[] a variable allocated with new
delete something allocated with malloc
delete an automatic storage variable
returning a local by reference
If it's not immediately clear, I'd get my hands on a memory debugger (I can think of Rational Purify for windows).
This message can also be due to an array bounds violation. Make sure that your function (and every function it calls, especially member functions for stack-based objects) is obeying the bounds of any arrays that may be used.
Actually, what you see is quite informative: you should check the code that touches memory near the variable x for any activity that might cause this error.
Below is how you can reproduce such an exception:
#include <cstring>
int main() {
char buffer1[10];
char buffer2[20];
memset(buffer1, 0, sizeof(buffer1) + 1);
return 0;
}
will generate (VS2010):
Run-Time Check Failure #2 - Stack around the variable 'buffer1' was corrupted.
Obviously memset has written one char more than it should. VS detects such stack buffer overflows through its compiler checks (the /GS option, and the /RTCs run-time checks that produce this particular message in debug builds), which you evidently have enabled. For more on /GS, read here: http://msdn.microsoft.com/en-us/library/Aa290051.
You can, for example, use the debugger and step through your code, watching how the contents of your variables change each time. You can also try your luck with data breakpoints: you set a breakpoint that triggers when some memory location changes, and the debugger stops at that moment, possibly showing you the call stack where the problem is located. But this might not actually work with the /GS flag.
For detecting heap overflows you can use gflags tool.
I was puzzled by this error for hours, I know the possible causes, and they are already mentioned in the previous answers, but I don't allocate memory, don't access array elements, don't return pointers to local variables...
Then finally found the source of the problem:
*x++;
The intent was to increment the pointed-to value. But due to operator precedence, ++ binds first and moves the pointer x forward, and the * then does nothing. Any later write through *x will corrupt the stack canary if the parameter points into the stack, making VS complain.
Changing it to (*x)++ solves the problem.
Hope this helps.
Here is what I do in this situation:
Set a breakpoint at a location where you can see the (correct) value of the variable in question, but before the error happens. You will need the memory address of the variable whose stack is being corrupted. Sometimes I have to add a line of code in order for the debugger to give me the address easily (int *x = &y)
At this point you can set a memory breakpoint (Debug->New Breakpoint->New Data Breakpoint)
Hit Play and the debugger should stop when the memory is written to. Look up the stack (mine usually breaks in some assembly code) to see what's being called.
I usually follow the variable declared just before the one being complained about, which usually helps me find the problem. But this can sometimes be very complex, with no clue, as you have seen. You could open Debug menu >> Exceptions and tick 'Win32 Exceptions' to catch all exceptions. This will still not catch this exception, but it could catch something else which could indirectly point to the problem.
In my case it was caused by a library I was using. It turned out the header file I was including in my project didn't quite match the actual header file in that library (by one line).
There is a different error which is also related:
0xC015000F: The activation context being deactivated is not the most recently activated one.
When I got tired of getting the mysterious stack corrupted message on my computer with no debugging information, I tried my project on another computer and it was giving me the above message instead. With the new exception I was able to work my way out.
I encountered this when I made a pointer array of 13 items, then tried to set the 14th item. Changing the array to 14 items solved the problem. Hope this helps some people ^_^
One relatively common source of "Stack around the variable 'x' was corrupted" problem is wrong casting. It is sometimes hard to spot. Here is an example of a function where such problem occurs and the fix. In the function assignValue I want to assign some value to a variable. The variable is located at the memory address passed as argument to the function:
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <memory>
using namespace std;
template<typename T>
void assignValue(uint64_t address, T value)
{
int8_t* begin_object = reinterpret_cast<int8_t*>(std::addressof(value));
// wrongly casted to (int*), produces the error (sizeof(int) == 4)
//std::copy(begin_object, begin_object + sizeof(T), (int*)address);
// correct cast to (int8_t*), assignment byte by byte, (sizeof(int8_t) == 1)
std::copy(begin_object, begin_object + sizeof(T), (int8_t*)address);
}
int main()
{
int x = 1;
int x2 = 22;
assignValue<int>((uint64_t)&x, x2);
assert(x == x2);
}

What does this error mean?

I am writing a C++ code in the ROOT platform. I am getting the following error:
*** Break *** segmentation violation
gdb not found, need it for stack trace
Root > Function main() busy flag cleared
I just want to know what this means (in general).
Generally, "segmentation violation" means you accessed a piece of memory that wasn't allocated to you. Usually a stray pointer is the reason for that.
The rest is a Linux-specific message about a missing gdb (which would be helpful for understanding the problem).
Typically that means you have written to (or maybe read) memory you don't have permission on. Either it's just invalid memory or (if the platform supports such a concept) it's outside of the memory you own.
A common cause of this is freeing a pointer but then using it again.
Foo * pFoo = new Foo();
pFoo->Bar(); // should be fine.
delete pFoo; // pFoo now points to memory that may or may not still be an actual Foo.
pFoo->Bar(); // undefined behavior.

why its not safe to return value on function stack

I came across the following paragraph while reading Bruce Eckel, where he was trying to explain why it's not safe for a function to return a value on the stack:
Now imagine what would happen if an ordinary function tried to return values on the stack. You can't touch any part of the stack that's above the return address, so the function would have to push the values below the return address. But when the assembly language RETURN is executed, the stack pointer must be pointing to the return address (or right below it, depending on your machine), so right before the RETURN, the function must move the stack pointer up, thus clearing off all its local variables. If you are trying to return values on the stack below the return address, you become vulnerable at that moment because an interrupt could come along. The ISR would move the stack pointer down to hold its return address and its local variables and overwrite your return value.
Could you help me understand the bold italic text?
Suppose that you have the following call stack somewhere in your application:
Main routine
Function1's local variables
Function2's local variables <-- STACK POINTER
In this case main calls function1, and function1 calls function2.
Now suppose that function2 calls function3, and the return value of function3 is returned on the stack:
Main routine
Function1's local variables
Function2's local variables
Function3's local variables, including the return value <-- STACK POINTER
Function3 stores the return value on the stack, and then returns. Returning means, decreasing the stack pointer again, so the stack becomes this:
Main routine
Function1's local variables
Function2's local variables <-- STACK POINTER
You see, function3's stack frame is not here anymore.
Well, actually I lied a bit. The stack frame is still there:
Main routine
Function1's local variables
Function2's local variables <-- STACK POINTER
Function3's local variables, including the return value
So it seems safe to still access the stack to get the return value.
But if there is an interrupt AFTER function3 has returned, but BEFORE function2 gets the return value from the stack, we get this:
Main routine
Function1's local variables
Function2's local variables
Interrupt function's local variables <-- STACK POINTER
And now the stack frame is really overwritten, and the return value that we desperately needed has gone.
That's why returning a return value on the stack is not safe.
The problem is similar to the one shown in this simple piece of C code:
char *buf = (char *)malloc(100*sizeof(char));
strcpy (buf, "Hello World");
free (buf);
printf ("Buffer is %s\n",buf);
Most of the time, the memory that was used for buf will still have the contents "Hello World", but it can go horribly wrong if someone is able to allocate memory after free has been called, but before printf is called. One such example is in multi-threaded applications (and we already encountered this problem internally), as shown here:
THREAD 1:                               THREAD 2:
---------                               ---------
char *buf = (char *)malloc(100);
strcpy (buf, "Hello World");
free (buf);
                                        char *mybuf = (char *)malloc(100);
                                        strcpy (mybuf, "This is my string");
printf ("Buffer is %s\n",buf);
The printf in Thread 1 may now print "Hello World", or it may print "This is my string". Anything can happen.
He is just trying to explain why you shouldn't return a pointer or reference to a local variable. Because it disappears as soon as the function returns!
Exactly what happens at the hardware level isn't that important, even if it might explain why the value sometimes seems to still be there and sometimes not.
When you call a function that has pass-by-stack arguments, those arguments get pushed onto the stack. When the function returns, that bit of stack memory it was using is released. Immediately thereafter, it's unsafe to access what was in those stack values, because something else may have overwritten them.
Let's say we're on a cpu where the stack pointer is kept in a register called SP, and it grows "upwards".
Your code is chugging along and comes to a function call. At this point, we'll say SP is 100.
The function is called, and your function takes two single byte arguments. Those two bytes worth of arguments get pushed onto the stack, and... and this is the important part - the address of the code from which you called the function (let's say it's 4bytes). Now SP is 106. The address to return to is at SP=100, and your two bytes are at 104 and 105.
Let's say the function modifies one of those arguments (SP=105) as a way to return the modified value
The function returns, the stack snaps back to where it was (SP=100), and continues on.
In a perfect world, there's nothing else going on in the system and your program has absolute control over the CPU... Until you do something else that requires the stack, that SP=105 value will stay there "forever".
However, with interrupts, there's no guarantee that something else won't come up. Let's say a hardware interrupt hits your app. This means an immediate jump to the interrupt service routine, so the address where the CPU was when the interrupt hit gets pushed onto the stack (4 bytes); now SP is 104. Let's say this ISR calls another subroutine, which means another return address gets pushed onto the stack. So now SP is 108... your original 105 value has now been overwritten.
Eventually these ISRs will return, control goes back to your code, and SP is 100 again... your app tries to retrieve that SP=105 value, blissfully unaware that it got trashed by the ISR, and now you're working with bad data.
The most important part of that paragraph is:
If you are trying to return values on the stack below the return address (...)
In other words, don't return pointers to data that is only valid within the scope of that function.
This was probably something you had to worry about before C standardized how a function returned a struct by value. Now, it's part of the C99 standard (6.8.6.4) and you shouldn't worry about it.
And return by value is fully supported in C++ for a while now. Otherwise, many STL implementation details would simply not work properly.

How to end up with a pointer to 0xCCCCCCCC

The program I'm working on crashes sometimes trying to read data at the address 0xCCCCCCCC. Google (and StackOverflow) being my friends I saw that it's the MSVC debug code for uninitialized stack variable. To understand where the problem can come from, I tried to reproduce this behavior: problem is I haven't been able to do it.
The question is: do you have a code snippet showing how a pointer can end up pointing to 0xCCCCCCCC?
Thanks.
int main()
{
int* p;
}
If you build with the Visual C++ debug runtime, put a breakpoint in main(), and run, you will see that p has a value of 0xcccccccc.
Compile your code with the /GZ compiler switch or /RTCs switch. Make sure that /Od switch is also used to disable any optimizations.
The s option (/RTCs) enables stack frame run-time error checking, as follows:
Initialization of local variables to a nonzero value. This helps identify bugs that do not appear when running in debug mode. There is a greater chance that stack variables will still be zero in a debug build compared to a release build because of compiler optimizations of stack variables in a release build. Once a program has used an area of its stack, it is never reset to 0 by the compiler. Therefore, subsequent, uninitialized stack variables that happen to use the same stack area can return values left over from the prior use of this stack memory.
Detection of overruns and underruns of local variables such as arrays. /RTCs will not detect overruns when accessing memory that results from compiler padding within a structure. Padding could occur by using align (C++), /Zp (Struct Member Alignment), or pack, or if you order structure elements in such a way as to require the compiler to add padding.
Stack pointer verification, which detects stack pointer corruption. Stack pointer corruption can be caused by a calling convention mismatch. For example, using a function pointer, you call a function in a DLL that is exported as __stdcall but you declare the pointer to the function as __cdecl.
I do not have MSVC, but this code should produce the problem and compile with no warnings.
In file f1.c:
void ignore(int **p) { }
In file f2.c:
void ignore(int **p);
int main(int c, char **v)
{
int *a;
ignore(&a);
return *a;
}
The call to ignore makes it look like a might be initialized. I doubt the compiler will warn in this case, because of the risk that the warning might be a false positive.
How about this? Ignore the warning that VC throws while running.
#include <iostream>
using namespace std;
struct A{
int *p;
};
int main(){
A a;
cout << (void *)a.p;
}