How to handle or avoid a stack overflow in C++ - c++

In C++ a stack overflow usually leads to an unrecoverable crash of the program. For programs that need to be really robust, this is an unacceptable behaviour, particularly because stack size is limited. A few questions about how to handle the problem.
Is there a way to prevent stack overflow by a general technique. (A scalable, robust solution, that includes dealing with external libraries eating a lot of stack, etc.)
Is there a way to handle stack overflows in case they occur? Preferably, the stack gets unwound until there's a handler to deal with that kinda issue.
There are languages out there, that have threads with expandable stacks. Is something like that possible in C++?
Any other helpful comments on the solution of the C++ behaviour would be appreciated.

Handling a stack overflow is not the right solution, instead, you must ensure that your program does not overflow the stack.
Do not allocate large variables on the stack (where what is "large" depends on the program). Ensure that any recursive algorithm terminates after a known maximum depth. If a recursive algorithm may recurse an unknown number of times or a large number of times, either manage the recursion yourself (by maintaining your own dynamically allocated stack) or transform the recursive algorithm into an equivalent iterative algorithm
A program that must be "really robust" will not use third-party or external libraries that "eat a lot of stack."
Note that some platforms do notify a program when a stack overflow occurs and allow the program to handle the error. On Windows, for example, an exception is thrown. This exception is not a C++ exception, though, it is an asynchronous exception. Whereas a C++ exception can only be thrown by a throw statement, an asynchronous exception may be thrown at any time during the execution of a program. This is expected, though, because a stack overflow can occur at any time: any function call or stack allocation may overflow the stack.
The problem is that a stack overflow may cause an asynchronous exception to be thrown even from code that is not expected to throw any exceptions (e.g., from functions marked noexcept or throw() in C++). So, even if you do handle this exception somehow, you have no way of knowing that your program is in a safe state. Therefore, the best way to handle an asynchronous exception is not to handle it at all(*). If one is thrown, it means the program contains a bug.
Other platforms may have similar methods for "handling" a stack overflow error, but any such methods are likely to suffer from the same problem: code that is expected not to cause an error may cause an error.
(*) There are a few very rare exceptions.

You can protect against stack overflows using good programming practices, like:
Be very carefull with recursion, I have recently seen a SO resulting from badly written recursive CreateDirectory function, if you are not sure if your code is 100% ok, then add guarding variable that will stop execution after N recursive calls. Or even better dont write recursive functions.
Do not create huge arrays on stack, this might be hidden arrays like a very big array as a class field. Its always better to use vector.
Be very carefull with alloca, especially if it is put into some macro definition. I have seen numerous SO resulting from string conversion macros put into for loops that were using alloca for fast memory allocations.
Make sure your stack size is optimal, this is more important in embeded platforms. If you thread does not do much, then give it small stack, otherwise use larger. I know reservation should only take some address range - not physical memory.
those are the most SO causes I have seen in past few years.
For automatic SO finding you should be able to find some static code analysis tools.

Re: expandable stacks. You could give yourself more stack space with something like this:
#include <iostream>
int main()
{
int sp=0;
// you probably want this a lot larger
int *mystack = new int[64*1024];
int *top = (mystack + 64*1024);
// Save SP and set SP to our newly created
// stack frame
__asm__ (
"mov %%esp,%%eax; mov %%ebx,%%esp":
"=a"(sp)
:"b"(top)
:
);
std::cout << "sp=" << sp << std::endl;
// call bad code here
// restore old SP so we can return to OS
__asm__(
"mov %%eax,%%esp":
:
"a"(sp)
:);
std::cout << "Done." << std::endl;
delete [] mystack;
return 0;
}
This is gcc's assembler syntax.

C++ is a powerful language, and with that power comes the ability to shoot yourself in the foot. I'm not aware of any portable mechanism to detect and correct/abort when stack overflow occurs. Certainly any such detection would be implementation-specific. For example g++ provides -fstack-protector to help monitor your stack usage.
In general your best bet is to be proactive in avoiding large stack-based variables and careful with recursive calls.

I don't think that that would work. It would be better to push/pop esp than move to a register because you don't know if the compiler will decide to use eax for something.

Here ya go:
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/resetstkoflw?view=msvc-160
You wouldn't catch the EXCEPTION_STACK_OVERFLOW structured exception yourself because the OS is going to catch it (in Windows' case).
Yes, you can safely recover from an structured exception (called "asynchronous" above) unlike what was indicated above. Windows wouldn't work at all if you couldn't. PAGE_FAULTs are structured exceptions that are recovered from.
I am not as familiar with how things work under Linux and other platforms.

Related

How to handle stack overflow while using recursive functions [duplicate]

Is it possible to catch a stack overflow exception in a recursive C++ function? If so, how?
so what will happen in this case
void doWork()
{
try() {
doWork();
}
catch( ... ) {
doWork();
}
}
I am not looking for an answer to specific OS. Just in general
It's not an exception per se, but if you just want to be able to limit your stack usage to a fixed amount, you could do something like this:
#include <stdio.h>
// These will be set at the top of main()
static char * _topOfStack;
static int _maxAllowedStackUsage;
int GetCurrentStackSize()
{
char localVar;
int curStackSize = (&localVar)-_topOfStack;
if (curStackSize < 0) curStackSize = -curStackSize; // in case the stack is growing down
return curStackSize;
}
void MyRecursiveFunction()
{
int curStackSize = GetCurrentStackSize();
printf("MyRecursiveFunction: curStackSize=%i\n", curStackSize);
if (curStackSize < _maxAllowedStackUsage) MyRecursiveFunction();
else
{
printf(" Can't recurse any more, the stack is too big!\n");
}
}
int main(int, char **)
{
char topOfStack;
_topOfStack = &topOfStack;
_maxAllowedStackUsage = 4096; // or whatever amount you feel comfortable allowing
MyRecursiveFunction();
return 0;
}
There's really no portable way to do it. An out of control recursive function will usually cause an invalid memory access when it tries to allocate a stack frame beyond the stack address space. This will usually just crash your program with a Segmentation Fault/Access Violation depending on the OS. In other words, it won't throw a c++ exception that can be handled in a standard way by the language.
Even if you can do this non-portably, as you can in Windows, it's still a very bad idea. The best strategy is to not overflow the stack in the first place. If you need isolation from some code you don't control, run that code in a different process and you can detect when it crashes. But you don't want to do that sort of thing in your own process, because you don't know what sort of nasty corruption of state the offending code is going to do, and that will make you unstable.
There's an interesting, somewhat related blog post by Microsoft's Raymond Chen about why you shouldn't try to check for valid pointers in a user mode application on Windows.
There isn't a portable way. However, there are a few nonportable solutions.
First, as others have mentioned, Windows provides a nonstandard __try and __except framework called Structured Exeption Handling (your specific answer is in the Knowledge Base).
Second, alloca -- if implemented correctly -- can tell you if the stack is about to overflow:
bool probe_stack(size_t needed_stack_frame_size)
{
return NULL != alloca(needed_stack_frame_size);
};
I like this approach, because at the end of probe_stack, the memory alloca allocated is released and available for your use. Unfortunately only a few operating systems implement alloca correctly. alloca never returns NULL on most operating systems, letting you discover that the stack has overflown with a spectacular crash.
Third, UNIX-like systems often have a header called ucontext.h with functions to set the size of the stack (or, actually, to chain several stacks together). You can keep track of where you are on the stack, and determine if you're about to overflow. Windows comes with similar abilities a la CreateFiber.
As of Windows 8, Windows has a function specifically for this (GetCurrentThreadStackLimits)
On what OS? Just for example, you can do it on Windows using Structured Exception Handling (or Vectored Exception Handling). Normally you can't do it with native C++ exception handling though, if that's what you're after.
Edit: Microsoft C++ can turn a structured exception into a C++ exception. That was enabled by default in VC++ 6. It doesn't happen by default with newer compilers, but I'm pretty sure with a bit of spelunking, you could turn it back on.
It's true that when this happens, you're out of stack space. That's part of why I mentioned vectored exception handling. Each thread gets its own stack, and a vectored exception handler can run in a separate thread from where the exception was thrown. Even SEH, however, you can handle a stack overflow exception -- it just has to manually spawn a thread to do most of the work.
I doubt so, when stack got overflow the program will not be able even to handle exception. Normally OS will close such program and report the error.
This happens mostly because of infinite recursions.
In Windows you can use structured exception handling (SEH), with __try and __except keywords to install your own exception handler routine that can catch stack overflows, access violation, etc etc.
It's pretty neat to avoid Windows' default crash dialog, and replace it with your own, if you need to.
This is done all the time by most modern operating systems. If you want to do it on your own, you'll have to know the maximum "safe" address for your stack (or likewise do some math to determine how many times you can safely call the function), but this can get very tricky if you aren't managing the call stack yourself, since the OS will usually (for good reason) be hiding this from you.
If you are programming in kernel space, this gets significantly easier, but still something I question why you're doing. If you have a stack overflow, it's probably because of a bad algorithmic decision or else an error in the code.
edit: just realized you want to "catch the exception" that results. I don't think my answer directly answers that at all (does this exception even exist? i would figure instead on a spectacular failure), but I'll leave it up for insight. If you want it removed, please let me know in the comments and I will do so.
You have to know always a level of your recursion and check it if greater than some threshold. Max level (threshold) is calclulated by ratio of stack size divided by the memory required one recursive call.
The memory required one recursive call is the memory for all arguments of the function plus the memory for all local variables plus the memory for return address + some bytes (about 4-8).
Of course, you could avoid the recursion problem by converting it to a loop.
Not sure if you're aware of this but any recursive solution can be translated to a loop-based solution, and vice-versa. It is usually desirable to use a loop based solution because it is easier to read and understand.
Regardless of use of recursion or loop, you need to make sure the exit-condition is well defined and will always be hit.
If you use Visual C++
Goto C/C++ , Code Generation
Choose "Both..." in "Basic Runtime Checks"
Then, run your application...

How to track down a SIGFPE/Arithmetic exception

I have a C++ application cross-compiled for Linux running on an ARM CortexA9 processor which is crashing with a SIGFPE/Arithmetic exception. Initially I thought that it's because of some optimizations introduced by the -O3 flag of gcc but then I built it in debug mode and it still crashes.
I debugged the application with gdb which catches the exception but unfortunately the operation triggering exception seems to also trash the stack so I cannot get any detailed information about the place in my code which causes that to happen. The only detail I could finally get was the operation triggering the exception(from the following piece of stack trace):
3 raise() 0x402720ac
2 __aeabi_uldivmod() 0x400bb0b8
1 __divsi3() 0x400b9880
The __aeabi_uldivmod() is performing an unsigned long long division and reminder so I tried the brute force approach and searched my code for places that might use that operation but without much success as it proved to be a daunting task. Also I tried to check for potential divisions by zero but again the code base it's pretty large and checking every division operation it's a cumbersome and somewhat dumb approach. So there must be a smarter way to figure out what's happening.
Are there any techniques to track down the causes of such exceptions when the debugger cannot do much to help?
UPDATE: After crunching on hex numbers, dumping memory and doing stack forensics(thanks Crashworks) I came across this gem in the ARM Compiler documentation(even though I'm not using the ARM Ltd. compiler):
Integer division-by-zero errors can be trapped and identified by
re-implementing the appropriate C library helper functions. The
default behavior when division by zero occurs is that when the signal
function is used, or
__rt_raise() or __aeabi_idiv0() are re-implemented, __aeabi_idiv0() is
called. Otherwise, the division function returns zero.
__aeabi_idiv0() raises SIGFPE with an additional argument, DIVBYZERO.
So I put a breakpoint at __aeabi_idiv0(_aeabi_ldiv0) et Voila!, I had my complete stack trace before being completely trashed. Thanks everybody for their very informative answers!
Disclaimer: the "winning" answer was chosen solely and subjectively taking into account the weight of its suggestions into my debugging efforts, because more than one was informative and really helpful.
My first suggestion would be to open a memory window looking at the region around your stack pointer, and go digging through it to see if you can find uncorrupted stack frames nearby that might give you a clue as to where the crash was. Usually stack-trashes only burn a couple of the stack frames, so if you look upwards a few hundred bytes, you can get past the damaged area and get a general sense of where the code was. You can even look down the stack, on the assumption that the dead function might have called some other function before it died, and thus there might be an old frame still in memory pointing back at the current IP.
In the comments, I linked some presentation slides that illustrate the technique on a PowerPC — look at around #73-86 for a case study in a similar botched-stack crash. Obviously your ARM's stack frames will be laid out differently, but the general principle holds.
(Using the basic idea from Fedor Skrynnikov, but with compiler help instead)
Compile your code with -pg. This will insert calls to mcount and mcountleave() in every function. Do not link against the GCC profiling lib, but provide your own. The only thing you want to do in your mcount and mcountleave() is to keep a copy of the current stack, so just copy the top 128 bytes or so of the stack to a fixed buffer. Both the stack and the buffer will be in cache all the time so it's fairly cheap.
You can implement special guards in functions that can cause the exception. Guard is a simple class, in constractor of this class you put the name of the file and line (_FILE_, _LINE_) into file/array/whatever. The main condition is that this storage should be the same for all instances of this class(kind of stack). In the destructor you remove this line. To make it works you need to put the creation of this guard on the first line of each function and to create it only on stack. When you will be out of current block deconstructor will be called. So in the moment of your exception you will know from this improvised callstack which function is causing a problem.
Ofcaurse you may put creation of this class under debug condition
Enable generation of core files, and open the core file with the debuger
Since it uses raise() to raise the exception, I would expect that signal() should be able to catch it. Is this not the case?
Alternatively, you can set a conditional breakpoint at __aeabi_uldivmod to break when divisor (r1) is 0.

Catching "Stack Overflow" exceptions in recursive C++ functions

Is it possible to catch a stack overflow exception in a recursive C++ function? If so, how?
so what will happen in this case
void doWork()
{
try() {
doWork();
}
catch( ... ) {
doWork();
}
}
I am not looking for an answer to specific OS. Just in general
It's not an exception per se, but if you just want to be able to limit your stack usage to a fixed amount, you could do something like this:
#include <stdio.h>
// These will be set at the top of main()
static char * _topOfStack;
static int _maxAllowedStackUsage;
int GetCurrentStackSize()
{
char localVar;
int curStackSize = (&localVar)-_topOfStack;
if (curStackSize < 0) curStackSize = -curStackSize; // in case the stack is growing down
return curStackSize;
}
void MyRecursiveFunction()
{
int curStackSize = GetCurrentStackSize();
printf("MyRecursiveFunction: curStackSize=%i\n", curStackSize);
if (curStackSize < _maxAllowedStackUsage) MyRecursiveFunction();
else
{
printf(" Can't recurse any more, the stack is too big!\n");
}
}
int main(int, char **)
{
char topOfStack;
_topOfStack = &topOfStack;
_maxAllowedStackUsage = 4096; // or whatever amount you feel comfortable allowing
MyRecursiveFunction();
return 0;
}
There's really no portable way to do it. An out of control recursive function will usually cause an invalid memory access when it tries to allocate a stack frame beyond the stack address space. This will usually just crash your program with a Segmentation Fault/Access Violation depending on the OS. In other words, it won't throw a c++ exception that can be handled in a standard way by the language.
Even if you can do this non-portably, as you can in Windows, it's still a very bad idea. The best strategy is to not overflow the stack in the first place. If you need isolation from some code you don't control, run that code in a different process and you can detect when it crashes. But you don't want to do that sort of thing in your own process, because you don't know what sort of nasty corruption of state the offending code is going to do, and that will make you unstable.
There's an interesting, somewhat related blog post by Microsoft's Raymond Chen about why you shouldn't try to check for valid pointers in a user mode application on Windows.
There isn't a portable way. However, there are a few nonportable solutions.
First, as others have mentioned, Windows provides a nonstandard __try and __except framework called Structured Exeption Handling (your specific answer is in the Knowledge Base).
Second, alloca -- if implemented correctly -- can tell you if the stack is about to overflow:
bool probe_stack(size_t needed_stack_frame_size)
{
return NULL != alloca(needed_stack_frame_size);
};
I like this approach, because at the end of probe_stack, the memory alloca allocated is released and available for your use. Unfortunately only a few operating systems implement alloca correctly. alloca never returns NULL on most operating systems, letting you discover that the stack has overflown with a spectacular crash.
Third, UNIX-like systems often have a header called ucontext.h with functions to set the size of the stack (or, actually, to chain several stacks together). You can keep track of where you are on the stack, and determine if you're about to overflow. Windows comes with similar abilities a la CreateFiber.
As of Windows 8, Windows has a function specifically for this (GetCurrentThreadStackLimits)
On what OS? Just for example, you can do it on Windows using Structured Exception Handling (or Vectored Exception Handling). Normally you can't do it with native C++ exception handling though, if that's what you're after.
Edit: Microsoft C++ can turn a structured exception into a C++ exception. That was enabled by default in VC++ 6. It doesn't happen by default with newer compilers, but I'm pretty sure with a bit of spelunking, you could turn it back on.
It's true that when this happens, you're out of stack space. That's part of why I mentioned vectored exception handling. Each thread gets its own stack, and a vectored exception handler can run in a separate thread from where the exception was thrown. Even SEH, however, you can handle a stack overflow exception -- it just has to manually spawn a thread to do most of the work.
I doubt so, when stack got overflow the program will not be able even to handle exception. Normally OS will close such program and report the error.
This happens mostly because of infinite recursions.
In Windows you can use structured exception handling (SEH), with __try and __except keywords to install your own exception handler routine that can catch stack overflows, access violation, etc etc.
It's pretty neat to avoid Windows' default crash dialog, and replace it with your own, if you need to.
This is done all the time by most modern operating systems. If you want to do it on your own, you'll have to know the maximum "safe" address for your stack (or likewise do some math to determine how many times you can safely call the function), but this can get very tricky if you aren't managing the call stack yourself, since the OS will usually (for good reason) be hiding this from you.
If you are programming in kernel space, this gets significantly easier, but still something I question why you're doing. If you have a stack overflow, it's probably because of a bad algorithmic decision or else an error in the code.
edit: just realized you want to "catch the exception" that results. I don't think my answer directly answers that at all (does this exception even exist? i would figure instead on a spectacular failure), but I'll leave it up for insight. If you want it removed, please let me know in the comments and I will do so.
You have to know always a level of your recursion and check it if greater than some threshold. Max level (threshold) is calclulated by ratio of stack size divided by the memory required one recursive call.
The memory required one recursive call is the memory for all arguments of the function plus the memory for all local variables plus the memory for return address + some bytes (about 4-8).
Of course, you could avoid the recursion problem by converting it to a loop.
Not sure if you're aware of this but any recursive solution can be translated to a loop-based solution, and vice-versa. It is usually desirable to use a loop based solution because it is easier to read and understand.
Regardless of use of recursion or loop, you need to make sure the exit-condition is well defined and will always be hit.
If you use Visual C++
Goto C/C++ , Code Generation
Choose "Both..." in "Basic Runtime Checks"
Then, run your application...

Cost of throwing C++0x exceptions

What are the performance implications of throwing exceptions in C++0x? How much is this compiler dependent? This is not the same as asking what is the cost of entering a try block, even if no exception is thrown.
Should we expect to use exceptions more for general logic handling like in Java?
#include <iostream>
#include <stdexcept>
struct SpaceWaster {
SpaceWaster(int l, SpaceWaster *p) : level(l), prev(p) {}
// we want the destructor to do something
~SpaceWaster() { prev = 0; }
bool checkLevel() { return level == 0; }
int level;
SpaceWaster *prev;
};
void thrower(SpaceWaster *current) {
if (current->checkLevel()) throw std::logic_error("some error message goes here\n");
SpaceWaster next(current->level - 1, current);
// typical exception-using code doesn't need error return values
thrower(&next);
return;
}
int returner(SpaceWaster *current) {
if (current->checkLevel()) return -1;
SpaceWaster next(current->level - 1, current);
// typical exception-free code requires that return values be handled
if (returner(&next) == -1) return -1;
return 0;
}
int main() {
const int repeats = 1001;
int returns = 0;
SpaceWaster first(1000, 0);
for (int i = 0; i < repeats; ++i) {
#ifdef THROW
try {
thrower(&first);
} catch (std::exception &e) {
++returns;
}
#else
returner(&first);
++returns;
#endif
}
#ifdef THROW
std::cout << returns << " exceptions\n";
#else
std::cout << returns << " returns\n";
#endif
}
Mickey Mouse benchmarking results:
$ make throw -B && time ./throw
g++ throw.cpp -o throw
1001 returns
real 0m0.547s
user 0m0.421s
sys 0m0.046s
$ make throw CPPFLAGS=-DTHROW -B && time ./throw
g++ -DTHROW throw.cpp -o throw
1001 exceptions
real 0m2.047s
user 0m1.905s
sys 0m0.030s
So in this case, throwing an exception up 1000 stack levels, rather than returning normally, takes about 1.5ms. That includes entering the try block, which I believe on some systems is free at execution time, on others incurs a cost each time you enter try, and on others only incurs a cost each time you enter the function which contains the try. For a more likely 100 stack levels, I upped the repeats to 10k because everything was 10 times faster. So the exception cost 0.1ms.
For 10 000 stack levels, it was 18.7s vs 4.1s, so about 14ms additional cost for the exception. So for this example we're looking at a pretty consistent overhead of 1.5us per level of stack (where each level is destructing one object).
Obviously C++0x doesn't specify performance for exceptions (or anything else, other than big-O complexity for algorithms and data structures). I don't think it changes exceptions in a way which will seriously impact many implementations, either positively or negatively.
Exception performance is very compiler dependent. You'll have to profile your application to see if it is a problem. In general, it should not be.
You really should use exceptions for "exceptional conditions", not general logic handling. Exceptions are ideal for separating normal paths through your code and error paths.
I basically think the wrong question has been asked.
What is the cost of exception is not usefull, more usefull is the cost of exceptions relative to the alternative. So you need to measure how much exceptions cost and compare that to returning error codes >>>AND<<< checking the error codes at each level of the stack unwind.
Also note using exceptions should not be done when you have control of everything. Within a class returning an error code is probably a better technique. Exceptions should be used to transfer control at runtime when you can not determine how (or in what context) your object will be utilised at runtime.
Basically it should be used to transfer control to higher level of context where an object with enough context will understand how to handlle the exceptional situation.
Given this usage princimple we see that exceptions will be used to transfer control up multiple levels in stack frame. Now consider the extra code you need to write to pass an error code back up the same call stack. Consider the extra complexity then added when error codes can come from multiple different directions and try and cordinate all the different types of error code.
Given this you can see how Exceptions can greatlly simplify the flow of the code, and you can see the enhirit complexity of the code flow. The question then becomes weather exceptions are more expensive than the complex error conditions tests that need to be performed at each stack frame.
The answer as always depends (do both profile and use the quickist if that is what you need).
But if speed is not the only cost.
Maintainability is a cost that can be measured. Using this metric of cost Exceptions alway win as they ultimately make the constrol flow of the code to just the task that needs to be done not the task and error control.
I once created an x86 emulation library and used exceptions for interrupts and such. Bad idea. Even when I was not throwing any exceptions, it was impacting my main loop a lot. Something like this was my main loop
try{
CheckInterrupts();
*(uint32_t*)&op_cache=ReadDword(cCS,eip);
(this->*Opcodes[op_cache[0]])();
//operate on the this class with the opcode functions in this class
eip=(uint16_t)eip+1;
}
//eventually, handle these and do CpuInts...
catch(CpuInt_excp err){
err.code&=0x00FF;
switch(err.code){
The overhead of containing that code in a try block made the exception functions the 2 of the top 5 users of CPU time.
That, to me is expensive
There's no reason why exceptions in C++0x should be either faster or slower than in C++03. And that means their performance is completely implementation-dependant. Windows uses completely different data structures to implement exception handling on 32-bit vs 64-bit, and on Itanium vs x86. Linux isn't guaranteed to stick with one and just one implementation either. It depends. There are several popular ways to implement exception handling, and all of them have advantages and downsides.
So it doesn't depend on language (c++03 vs 0x), but on compiler, runtime library, OS and CPU architecture.
I don't think C++0x adds to or changes anything in the way C++ exceptions work. For the general advise take a look here.
One would imagine they would have about the same performance as in C++03, which is "very slow!" And no, exceptions should be use in exceptional circumstances due to the try-catch-throw construct, in any language. You're doing something wrong if you are using throw for program flow control in java.
#Steve Jessop has posted the de facto answer, but I still think there is one more thing: to try different optimization levels like "-O2" which is often used by most products.
Exception handling is an expensive feature, in general, because throwing/catching implies additional code to be executed to ensure stack unwinding and catch condition evaluation.
As far as I understood from some readings, for Visual C++, for example, there are some pretty complex structures and logic embedded in the code to ensure that. Since most of the functions might call other functions that might throw exceptions, some overhead for stack unwinding may exist even in these cases.
However, before thinking on exception overhead, it is best to measure the impact of exception usage in your code before any optimization action. Avoiding exception overusage should prevent exception handling significant overheads.
Imagine a bell goes off, the computer stops accepting any input for three seconds, and then someone kicks the user in the head.
That's the cost of an exception. If it's preventing data loss or the machine from catching on fire, it's worth the cost. Otherwise, it probably isn't.
EDIT: And since this got a downvote (plus an upvote, so +8 for me!), I'll clarify the above with less humor and more information: exception, at least in C++ land, require RTTI and compiler and possibly operating system magic, which makes the performance of them a giant black hole of uncertainty. (You aren't even guaranteed that they will fire, but those cases happen during other more serious events, like running out of memory or the user just killing the process or the machine actually catching on fire.) So if you use them, it should be because you want to gracefully recover from a situation that otherwise would cause something horrible to happen, but that recovery cannot have any expectations of running performantly (whatever that may be for your specific application).
So if you are going to use exceptions, you cannot assume anything about the performance implications.

What can modify the frame pointer?

I have a very strange bug cropping up right now in a fairly massive C++ application at work (massive in terms of CPU and RAM usage as well as code length - in excess of 100,000 lines). This is running on a dual-core Sun Solaris 10 machine. The program subscribes to stock price feeds and displays them on "pages" configured by the user (a page is a window construct customized by the user - the program allows the user to configure such pages). This program used to work without issue until one of the underlying libraries became multi-threaded. The parts of the program affected by this have been changed accordingly. On to my problem.
Roughly once in every three executions the program will segfault on startup. This is not necessarily a hard rule - sometimes it'll crash three times in a row then work five times in a row. It's the segfault that's interesting (read: painful). It may manifest itself in a number of ways, but most commonly what will happen is function A calls function B and upon entering function B the frame pointer will suddenly be set to 0x000002. Function A:
result_type emit(typename type_trait<T_arg1>::take _A_a1) const
{ return emitter_type::emit(impl_, _A_a1); }
This is a simple signal implementation. impl_ and _A_a1 are well-defined within their frame at the crash. On actual execution of that instruction, we end up at program counter 0x000002.
This doesn't always happen on that function. In fact it happens in quite a few places, but this is one of the simpler cases that doesn't leave that much room for error. Sometimes what will happen is a stack-allocated variable will suddenly be sitting on junk memory (always on 0x000002) for no reason whatsoever. Other times, that same code will run just fine. So, my question is, what can mangle the stack so badly? What can actually change the value of the frame pointer? I've certainly never heard of such a thing. About the only thing I can think of is writing out of bounds on an array, but I've built it with a stack protector which should come up with any instances of that happening. I'm also well within the bounds of my stack here. I also don't see how another thread could overwrite the variable on the stack of the first thread since each thread has it's own stack (this is all pthreads). I've tried building this on a linux machine and while I don't get segfaults there, roughly one out of three times it will freeze up on me.
Stack corruption, 99.9% definitely.
The smells you should be looking carefully for are:-
Use of 'C' arrays
Use of 'C' strcpy-style functions
memcpy
malloc and free
thread-safety of anything using pointers
Uninitialised POD variables.
Pointer Arithmetic
Functions trying to return local variables by reference
I had that exact problem today and was knee-deep in gdb mud and debugging for a straight hour before occurred to me that I simply wrote over array boundaries (where I didn't expect it the least) of a C array.
So, if possible, use vectors instead because any decend STL implementation will give good compiler messages if you try that in debug mode (whereas C arrays punish you with segfaults).
I'm not sure what you're calling a "frame pointer", as you say:
On actual execution of that
instruction, we end up at program
counter 0x000002
Which makes it sound like the return address is being corrupted. The frame pointer is a pointer that points to the location on the stack of the current function call's context. It may well point to the return address (this is an implementation detail), but the frame pointer itself is not the return address.
I don't think there's enough information here to really give you a good answer, but some things that might be culprits are:
incorrect calling convention. If you're calling a function using a calling convention different from how the function was compiled, the stack may become corrupted.
RAM hit. Anything writing through a bad pointer can cause garbage to end up on the stack. I'm not familiar with Solaris, but most thread implementations have the threads in the same process address space, so any thread can access any other thread's stack. One way a thread can get a pointer into another thread's stack is if the address of a local variable is passed to an API that ultimately deals with the pointer on a different thread. unless you synchronize things properly, this will end up with the pointer accessing invalid data. Given that you're dealing with a "simple signal implementation", it seems like it's possible that one thread is sending a signal to another. Maybe one of the parameters in that signal has a pointer to a local?
There's some confusion here between stack overflow and stack corruption.
Stack Overflow is a very specific issue cause by try to use using more stack than the operating system has allocated to your thread. The three normal causes are like this.
void foo()
{
foo(); // endless recursion - whoops!
}
void foo2()
{
char myBuffer[A_VERY_BIG_NUMBER]; // The stack can't hold that much.
}
class bigObj
{
char myBuffer[A_VERY_BIG_NUMBER];
}
void foo2( bigObj big1) // pass by value of a big object - whoops!
{
}
In embedded systems, thread stack size may be measured in bytes and even a simple calling sequence can cause problems. By default on windows, each thread gets 1 Meg of stack, so causing stack overflow is much less of a common problem. Unless you have endless recursion, stack overflows can always be mitigated by increasing the stack size, even though this usually is NOT the best answer.
Stack Corruption simply means writing outside the bounds of the current stack frame, thus potentially corrupting other data - or return addresses on the stack.
At it's simplest:-
void foo()
{
char message[10];
message[10] = '!'; // whoops! beyond end of array
}
That sounds like a stack overflow problem - something is writing beyond the bounds of an array and trampling over the stack frame (and probably the return address too) on the stack. There's a large literature on the subject. "The Shell Programmer's Guide" (2nd Edition) has SPARC examples that may help you.
With C++ unitialized variables and race conditions are likely suspects for intermittent crashes.
Is it possible to run the thing through Valgrind? Perhaps Sun provides a similar tool. Intel VTune (Actually I was thinking of Thread Checker) also has some very nice tools for thread debugging and such.
If your employer can spring for the cost of the more expensive tools, they can really make these sorts of problems a lot easier to solve.
It's not hard to mangle the frame pointer - if you look at the disassembly of a routine you will see that it is pushed at the start of a routine and pulled at the end - so if anything overwrites the stack it can get lost. The stack pointer is where the stack is currently at - and the frame pointer is where it started at (for the current routine).
Firstly I would verify that all of the libraries and related objects have been rebuilt clean and all of the compiler options are consistent - I've had a similar problem before (Solaris 2.5) that was caused by an object file that hadn't been rebuilt.
It sounds exactly like an overwrite - and putting guard blocks around memory isn't going to help if it is simply a bad offset.
After each core dump examine the core file to learn as much as you can about the similarities between the faults. Then try to identify what is getting overwritten. As I remember the frame pointer is the last stack pointer - so anything logically before the frame pointer shouldn't be modified in the current stack frame - so maybe record this and copy it elsewhere and compare upon return.
Is something meaning to assign a value of 2 to a variable but instead is assigning its address to 2?
The other details are lost on me but "2" is the recurring theme in your problem description. ;)
I would second that this definitely sounds like a stack corruption due to out of bound array or buffer writing. Stack protector would be good as long as the writing is sequential, not random.
I second the notion that it is likely stack corruption. I'll add that the switch to a multi-threaded library makes me suspicious that what has happened is a lurking bug has been exposed. Possibly the sequencing the buffer overflow was occurring on unused memory. Now it's hitting another thread's stack. There are many other possible scenarios.
Sorry if that doesn't give much of a hint at how to find it.
I tried Valgrind on it, but unfortunately it doesn't detect stack errors:
"In addition to the performance penalty an important limitation of Valgrind is its inability to detect bounds errors in the use of static or stack allocated data."
I tend to agree that this is a stack overflow problem. The tricky thing is tracking it down. Like I said, there's over 100,000 lines of code to this thing (including custom libraries developed in-house - some of it going as far back as 1992) so if anyone has any good tricks for catching that sort of thing, I'd be grateful. There's arrays being worked on all over the place and the app uses OI for its GUI (if you haven't heard of OI, be grateful) so just looking for a logical fallacy is a mammoth task and my time is short.
Also agreed that the 0x000002 is suspect. It is about the only constant between crashes. Even weirder is the fact that this only cropped up with the multi-threaded switch. I think that the smaller stack as a result of the multiple-threads is what's making this crop up now, but that's pure supposition on my part.
No one asked this, but I built with gcc-4.2. Also, I can guarantee ABI safety here so that's also not the issue. As for the "garbage at the end of the stack" on the RAM hit, the fact that it is universally 2 (though in different places in the code) makes me doubt that as garbage tends to be random.
It is impossible to know, but here are some hints that I can come up with.
In pthreads you must allocate the stack and pass it to the thread. Did you allocate enough? There is no automatic stack growth like in a single threaded process.
If you are sure that you don't corrupt the stack by writing past stack allocated data check for rouge pointers (mostly uninitialized pointers).
One of the threads could overwrite some data that others depend on (check your data synchronisation).
Debugging is usually not very helpful here. I would try to create lots of log output (traces for entry and exit of every function/method call) and then analyze the log.
The fact that the error manifest itself differently on Linux may help. What thread mapping are you using on Solaris? Make sure you map every thread to it's own LWP to ease the debugging.
Also agreed that the 0x000002 is suspect. It is about the only constant between crashes. Even weirder is the fact that this only cropped up with the multi-threaded switch. I think that the smaller stack as a result of the multiple-threads is what's making this crop up now, but that's pure supposition on my part.
If you pass anything on the stack by reference or by address, this would most certainly happen if another thread tried to use it after the first thread returned from a function.
You might be able to repro this by forcing the app onto a single processor. I don't know how you do that with Sparc.