What is the machinery behind stack unwinding? - c++

I'm trying to understand the machinery behind stack unwinding in C++. In other words I'm interested in how this feature is implemented (and whether that is a part of the standard).
So, the thread executes some code until an exception is thrown. When the exception is thrown, what are the threads/interrupt handlers used to record the state and unwind the stack? What is guaranteed by the standard and what is implementation specific?

The thread executes some code until the exception is thrown, and it continues to do so. Exception handling still is C++ code.
The throw expression creates a C++ object, running in the context of the throwing function. While the constructor of the exception object is running, all objects in the scope of the throwing function are still alive.
Directly after, however, the stack unwind happens. The C++ compiler will have arranged for a return path that does not require a return object, but which does allow passing of the exception object. Just like a normal return, objects that are local to a function are being destroyed when the function returns. At binary level, this is pretty straightforward: it's just a bunch of destructor calls, and typically a stack pointer is also adjusted.
What the standard also does not specify is the mechanism used to determine how many scopes need to be exited. The standard describes behavior in terms of catch clauses, but a typical CPU has no direct equivalent. Hence, this is commonly part of the C++ ABI for a given platform, so that compilers sharing an ABI agree. ABI compatibility requires that a caller can catch the exceptions of a callee, even if the two were compiled with different compilers. And obviously, destructors have to be called, so the ABI also needs to arrange that mechanism. The intermediate functions could even have been compiled by a third compiler - as long as they all share an ABI, it's all supposed to work.
As noted in the comments, C++ has no notion of interrupts. If the OS needs something to happen with interrupts, the compiler needs to take care of that. It matters little what exactly the C++ code is doing at that time.

Related

Implementing std::malloc in a C++ standard-compliant manner

A quick thought experiment before getting to the question. Imagine someone is implementing std::malloc (say, one of the JEMalloc or TCMalloc folks). One of the very basic things they would need is the ability to know that the program will not call back into malloc once execution has entered the implementation of std::malloc.
For example,
void* malloc(std::size_t size) {
    auto lck = std::unique_lock{malloc_mutex};
    // .. memory allocation business logic
}
Now if there is a signal in between the lock and the business logic for the allocation, we can deadlock if the signal handler calls back into std::malloc. It is not designed to be re-entrant, and the C++ standard requires that a signal handler registered with std::signal not call back into operator new (which can in turn call malloc); therefore, a user-defined signal handler must not call back into malloc if it is to be portable across all implementations of the language.
§[support.signal]p3 in the most recent version of the standard outlines this requirement:
An evaluation is signal-safe unless it includes one of the following:
a call to any standard library function, except for plain lock-free atomic operations and functions explicitly identified as signal-safe. [ Note: This implicitly excludes the use of new and delete expressions that rely on a library-provided memory allocator. — end note ]
However, the C++ standard seemingly says nothing about how function stacks are to be implemented for threads of execution (see this question: C++ threads stack address range). This means that a function dispatch within std::malloc's implementation might itself call into operator new if the program is compiled with segmented stacks.
How can one possibly implement a function like std::malloc in that case? If indeed, the C++ standard offers no such guarantees, then what does? How can we know that the implementation of a regular function goes through the regular stack allocation process (stack pointer increment)? Which standard (eg. ABI, compiler, POSIX) covers this?
The implementation is required to use a signal-safe allocator for its stack frames. This follows from the fact that function calls (to non-library functions) in signal handlers are permitted. The implementation can use malloc or operator new, but only if those allocators are themselves signal-safe.
Under the logic of the C++ Standard, the implementation is considered as a whole. In particular, any part of an implementation may assume anything about any other part of the implementation.
That means for this question that std::malloc and the signal handler may assume things about each other. Some implementations may decide that their std::malloc implementation is async-safe, others may decide it's not. But there are a myriad of other assumptions that might exist - alignment, contiguity, recycling of freed addresses, etc. Since this is all internal to implementations, there's no Standard describing this.
That is a problem for "replacement mallocs". You can implement JE::malloc, but std:: is special. C++ at least acknowledged the possibility of a replacement operator new, but even that was never specified at this level of detail.

as-if rule and removal of allocation

The "as-if rule" gives the compiler the right to optimize out or reorder expressions that would not make a difference to the output and correctness of a program under certain rules, such as;
§1.9.5
A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible executions of the corresponding instance of the abstract machine with the same program and the same input.
The cppreference url I linked above specifically mentions special rules for the values of volatile objects, as well as for "new expressions", under C++14:
New-expression has another exception from the as-if rule: the compiler may remove calls to the replaceable allocation functions even if a user-defined replacement is provided and has observable side-effects.
I assume "replaceable" here is what is talked about for example in
§18.6.1.1.2
Replaceable: a C++ program may define a function with this function signature that displaces the default version defined by the C++ standard library.
Is it correct that mem below can be removed or reordered under the as-if rule?
{
    ... some conformant code // upper block of code
    auto mem = std::make_unique<std::array<double, 5000000>>();
    ... more conformant code, not using mem // lower block of code
}
Is there a way to ensure it's not removed, and stays between the upper and lower blocks of code? A well-placed volatile (either volatile std::array or volatile to the left of auto) comes to mind, but as there is no reading of mem, I think even that would not help under the as-if rule.
Side note: I've not been able to get Visual Studio 2015 to optimize out mem and the allocation at all.
Clarification: The way to observe this would be that the allocation call to the OS comes between any i/o from the two blocks. The point of this is for test cases and/or trying to get objects to be allocated at new locations.
Yes; No. Not within C++.
The abstract machine of C++ does not talk about system allocation calls at all. Only the side effects of such a call that impact the behavior of the abstract machine are fixed by C++, and even then the compiler is free to do something else, so long as-if it results in the same observable behavior on the part of the program in the abstract machine.
In the abstract machine, auto mem = std::make_unique<std::array<double, 5000000>>(); creates a variable mem. It, if used, gives you access to a large amount of doubles packed into an array. The abstract machine is free to throw an exception, or provide you with that large amount of doubles; either is fine.
Note that it is legal for a C++ compiler to replace all allocations through new with an unconditional throw of an allocation failure (or a nullptr return for the no-throw versions), but that would be a poor quality of implementation.
In the case where it is allocated, the C++ standard doesn't really say where it comes from. The compiler is free to use a static array, for example, and make the delete call a no-op (note it may have to prove it catches all ways to call delete on the buffer).
Next, if you have a static array, if nobody reads or writes to it (and the construction cannot be observed), the compiler is free to eliminate it.
That being said, much of the above relies on the compiler knowing what is going on.
So an approach is to make it impossible for the compiler to know. Have your code load a DLL, then pass a pointer to the unique_ptr to that DLL at the points where you want its state to be known.
Because the compiler cannot optimize over run-time DLL calls, the state of the variable has to basically be what you'd expect it to be.
Sadly, there is no standard way to dynamically load code like that in C++, so you'll have to rely upon your current system.
Said DLL can be separately written to be a noop; or, even, you can examine some external state, and conditionally load and pass the data to the DLL based on the external state. So long as the compiler cannot prove said external state will occur, it cannot optimize around the calls not being made. Then, never set that external state.
Declare the variable at the top of the block. Pass a pointer to it to the fake-external-DLL while uninitialized. Repeat just before initializing it, then after. Then finally, do it at the end of the block before destroying it, .reset() it, then do it again.

no-throw exception guarantee and stack overflow

There are several special functions which usually guarantee not to throw exceptions, e.g.:
Destructors
swap method
Consider the following swap implementation, as stated in this answer:
friend void swap(dumb_array& first, dumb_array& second)
{
    using std::swap;
    swap(first.mSize, second.mSize);
    swap(first.mArray, second.mArray); // What if stack overflow occurs here?
}
It uses two swap functions - one for an integer and one for a pointer. What if the second function causes a stack overflow? The objects become corrupted. I guess it is not a std::exception; it is some kind of system exception, like a Win32 exception. But now we cannot guarantee no-throwing, since we're calling a function.
But all authoritative sources just use swap like it's ok, no exceptions will ever be thrown here. Why?
In general you cannot handle running out of stack. The standard doesn't say what happens if you run out of stack, neither does it talk about what the stack is, how much is available, etc. OSes may let you control it at the time the executable is built or when it is run, all of which is fairly irrelevant if you're writing library code, since you have no control of how much stack the process has, or how much has already been used before the user calls into your library.
You can assume that stack overflow results in the OS doing something external to your program. A very simple OS might just let it go weird (undefined behavior), a serious OS might blow the process away, or if you're really unlucky it throws some implementation-defined exception. I actually don't know whether Windows offers an SEH exception for stack overflow, but if it does then it's probably best not to enable it.
If you're concerned, you can mark your swap function as noexcept. Then in a conforming implementation, any exception that tries to leave the function will cause the program to terminate(). That is to say, it fulfils the noexcept contract at the cost of taking out your program.
What if the second function will cause stack overflow?
Then your program is in an unrecoverable faulted state, and there is no practical way to handle the situation. Hopefully, the overflow has already caused a segmentation fault and terminated the program.
But now we cannot guarantee no-throwing
I've never encountered an implementation that would throw an exception in that state, and I'd be rather scared if it did.
But all authoritative sources just use swap like it's ok, no exceptions will ever be thrown here. Why?
The authoritative sources I've read (like this one, for example) don't "just use it like it's OK"; they say that if you have (for example) a non-throwing swap function, and a non-throwing destructor, then you can provide exception-safety guarantees from functions that use them.
It's useful to categorise functions according to their exception guarantees:
Basic: exceptions leave everything in a valid but unspecified state
Strong: exceptions leave the state unchanged
No-throw: no exceptions will be thrown.
Then a common approach to providing the "strong" guarantee is:
do the work that might throw on a temporary copy of the state
swap that copy with the live state (requiring a non-throwing swap operation)
destroy the old state (requiring a non-throwing destructor)
If you don't have a no-throw guarantee from those operations, then it's more difficult, and perhaps impossible, to provide a strong guarantee.

Under what circumstances are C++ destructors not going to be called?

I know that my destructors are called on normal unwind of stack and when exceptions are thrown, but not when exit() is called.
Are there any other cases where my destructors are not going to get called? What about signals such as SIGINT or SIGSEGV? I presume that for SIGSEGV they are not called, but for SIGINT they are; how do I know which signals will unwind the stack?
Are there any other circumstances where they will not be called?
Are there any other circumstances where they [destructors] will not be called?
Long jumps: these interfere with the natural stack unwinding process and often lead to undefined behavior in C++.
Premature exits (you already pointed these out, though it's worth noting that throwing while already stack unwinding as a result of an exception being thrown leads to undefined behavior and this is why we should never throw out of dtors)
Throwing from a constructor does not invoke the destructor for the class. This is why, if you allocate multiple memory blocks managed by several different raw pointers (not smart pointers) in a constructor, you need function-level try blocks, or you should avoid the initializer list and put a try/catch block in the constructor body (or better yet, just use a smart pointer like scoped_ptr, since any member successfully initialized so far in an initializer list will still be destroyed even though the class destructor is not called).
As pointed out, failing to make a dtor virtual when a class is deleted through a base pointer could fail to invoke the subclass dtors (undefined behavior).
Failing to call matching operator delete/delete[] for an operator new/new[] call (undefined behavior - may fail to invoke dtor).
Failing to manually invoke the dtor when using placement new with a custom memory allocator in the deallocate section.
Using functions like memcpy which only copies one memory block to another without invoking copy ctors. mem* functions are deadly in C++ as they bulldoze over the private data of a class, overwrite vtables, etc. The result is typically undefined behavior.
Instantiating some smart pointers (e.g. auto_ptr) on an incomplete type; see this discussion.
The C++ standard says nothing about how specific signals must be handled - many implementations may not support SIGINT, etc. Destructors will not be called if exit() or abort() or terminate() are called.
Edit: I've just had a quick search through the C++ Standard and I can't find anything that specifies how signals interact with object lifetimes - perhaps someone with better standards-fu than me could find something?
Further edit: While answering another question, I found this in the Standard:
On exit from a scope (however
accomplished), destructors (12.4) are
called for all constructed objects
with automatic storage duration
(3.7.2) (named objects or temporaries)
that are declared in that scope, in
the reverse order of their
declaration.
So it seems that destructors must be called on receipt of a signal.
Another case they won't be called is if you are using polymorphism and have not made your base destructors virtual.
A signal by itself won't affect the execution of the current thread and hence the invocation of destructors, because it is a different execution context with its own stack, where your objects do not exist. It's like an interrupt: it is handled somewhere outside of your execution context, and, if handled, the control is returned to your program.
As with multithreading, the C++ language itself has no notion of signals. The two are completely orthogonal to each other and are specified by two unrelated standards. How they interact is up to the implementation, as long as it does not break either of the standards.
As a side note, another case where an object's destructor won't be called is when its constructor throws an exception. The members' destructors will still be called, though.
abort terminates the program without executing destructors for objects of automatic or static storage duration, as the Standard says. For other situations you should read the implementation-specific documents.
If a function or method has an exception specification and throws something NOT covered by the specification, the stack is unwound only as far as that function; then unexpected() is called, which by default terminates the program. Destructors in scopes further up the stack are not called.
POSIX signals are an operating-system-specific construct and have no notion of C++ object scope. Generally you can't do anything with a signal except maybe trap it, set a global flag variable, and then handle it later on in your C++ code after the signal handler exits.
Recent versions of GCC allow you to throw an exception from within synchronous signal handlers, which does result in the expected unwinding and destruction process. This is very operating-system and compiler specific, though.
A lot of answers here but still incomplete!
I found another case where destructors are not executed: it can happen when an exception is thrown and caught across a library boundary.
See more details here:
Destructors not executed (no stack unwinding) when exception is thrown
There are basically two situations where destructors are called: during stack unwinding at the end of a function (or when an exception propagates), and when someone (or a reference counter) calls delete.
One special situation is static objects - they are destroyed at the end of the program via an atexit mechanism, but this is still the second situation.
Whether atexit processing still happens on a signal depends: kill -9 kills the process immediately, while other signals tell it to exit, and what happens exactly depends on the installed signal callback.

about throw() in C++

void MyFunction(int i) throw();
it just tells the compiler that the function does not throw any exceptions.
It can't make sure the function throw nothing, is that right?
So what's the use of throw()
Is it redundant? Why was this idea proposed?
First of all, when the compiler works right, it is enforced -- but at run time, not compile time. A function with an empty exception specification will not throw an exception; if something happens that would let an exception escape from it, unexpected() is called instead, which (in turn) calls abort. You can use set_unexpected to change what gets called, but about all that function is allowed to do is add extra "stuff" (e.g. cleanup) before aborting the program -- it can't return to the original execution path.
That said, at least one major compiler (VC++) parses exception specifications, but does not enforce them, though it can use empty exception specifications to improve optimization a little. In this case, an exception specification that isn't followed can/does result in undefined behavior instead of necessarily aborting the program.
It can't make sure the function throw nothing, is that right?
You are almost there. It is an exception specification. It means that, as an implementer, you guarantee to your client(s) that this piece of code will not throw an exception. This does not, however, stop functions called within MyFunction from throwing; if you do not handle their exceptions, they will bubble up and cause your/your client's program to behave in a way you did not intend. It does not even mean that you cannot have a throw expression inside.
It is best to avoid such a specification unless you are absolutely sure that your code will never throw -- which is difficult except for very basic functions (see the standard swap, pointer assignments, etc.).
Is it redundant? Why this idea is proposed?
Not exactly. When properly used, it can help the compiler for optimization purposes. See this article, which explains the history behind no-throw well.
Digging a bit more I found this excellent article from the Boost documentation. A must read. Read about the exception guarantees part.
As you said, it just tells the compiler that the function does not throw any exceptions.
When the compiler expects possible exceptions, it often has to generate the code in some specific form, which makes it less efficient. It also might have to generate some additional "housekeeping" code for the sole purpose of handling exceptions when and if they are thrown.
When you tell the compiler that this function never throws anything, it makes it much easier to the compiler to recognize the situations when all these additional exception-related expenses are completely unnecessary, thus helping the compiler to generate more efficient code.
Note, that if at run time you actually try to throw something out of a function that is declared with throw() specification, the exception will not be allowed to leave the function. Instead a so called unexpected exception handler will be invoked, which by default will terminate the program. So, in that sense it is actually enforced that a throw() function does not throw anything.
P.S. Since exception specifications are mostly affecting the run-time behavior of the program, in general they might not have any compile time effect. However, this specific exception specification - the empty one throw() - is actually recognized by some compilers at compile time and does indeed lead to generation of more efficient code. Some people (me included) hold the opinion that the empty exception specification is the only one that is really worth using in the real-life code.
To be honest, exception specifications in the real world have not turned out to be as useful as envisioned by the original designers. Also, the difference between C++'s runtime-checked exception specifications and Java's compile-time-checked exception specifications has caused a lot of problems.
The currently accepted norms for exception specifications are:
Don't use them.
Unless they are the empty form.
If you use the empty form, make sure that you actually catch all exceptions inside.
The main problem is that if you have a method with a throw spec, and something it uses changes underneath and now throws new types of exception, there is no warning or problem with the code at compile time. But at run time, if such an exception occurs (one that is not in the throw spec), your code will terminate(). To me, termination is a last resort and should never happen in a well-formed program; I would much rather unwind the stack all the way back to main with an exception, allowing a slightly cleaner exit.