C++ return value versus exception performance

Somewhere I have read that modern Intel processors have low-level hardware support for implementing exceptions, and that most compilers take advantage of it, to the effect that exceptions become faster than returning state through variables.
Is it true? Are exceptions faster than return values for reporting state? Reading Stack Overflow on the topic seems to contradict that.
Thank you

Be aware that there's ambiguity in the term "exception handler." I believe you'll find that when hardware folks talk about exceptions, they mean things like:
Hardware interrupts, aka signals, whose handlers are sometimes called exception handlers (see http://pages.cs.wisc.edu/~smoler/x86text/lect.notes/interrupts.html)
Machine check exceptions, which halt the computer if something in hardware goes wrong (see http://en.wikipedia.org/wiki/Machine_Check_Exception)
Neither of those has anything to do with C++'s exception handling facility.
As a counterexample, I have at least one anecdotal data point where exceptions were way slower than return codes: that was on Intel hardware all right, but with gcc 2.95 and a very large body of code with a very large exception table, which was constructed the first time an exception was thrown. Subsequent exceptions were fast, but by then the damage was usually done. Admittedly, gcc 2.95 is pretty ancient, but it should be enough to caution you against generalizations about the speed of C++ exception handling, even on Intel hardware.

I don't know where you read this, but it is surely incorrect. No hardware designer would make exceptional circumstances, which are by definition uncommon, work FASTER than normal ones. Also keep in mind that C, which according to TIOBE is the most popular systems language, does not even support exceptions. It seems EXTREMELY unlikely that processors are optimized for ONE language's exception handling, whose implementation is not even standardized among compilers.
Even if, somehow, exceptions were faster, you still should not use them outside their intended purpose, lest you confuse every other programmer in the world.

No. Nothing is going to be faster than sticking a variable into a register. Even with explicit hardware support, exceptions would still require things like memory accesses.
C++ exceptions couldn't be implemented that way for the most part, because C++ requires that the stack be unwound and objects destroyed.
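To make that concrete, here is a minimal sketch (all names are illustrative): the local object must be destroyed between the throw and the catch, so the implementation cannot simply jump to the handler the way hardware would.

#include <cstdio>

struct Guard {
    ~Guard() { std::puts("Guard destroyed during unwinding"); }
};

void inner() {
    Guard g;    // automatic object with a destructor
    throw 42;   // unwinding must destroy g before the handler runs
}

int main() {
    try {
        inner();
    } catch (int) {
        std::puts("caught");
    }
}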

The answer is technically correct, but highly misleading.
At the core of the issue is the observation that exceptions are exceptional: they usually do not happen. This is not the case when you return an error code; that happens always, even if there is no error. In that case the function still has to return 0, or true, or -1, or ...
Now this means that a CPU and a compiler can specifically optimize functions that fail by exception. But it's important to realize what they optimize, and that's the non-failure, non-exception case - at the cost of the exceptional cases.
Once we realize that, we can look at how the compiler and CPU optimize such cases. One common method is keeping the exception code separate from the normal code. As a result, that code will normally not end up in the CPU cache, which can then hold more useful code. In fact, the exception code might not end up in RAM at all and stay on disk.
Another supporting mechanism is the CPU's branch predictor. It will remember that the branches leading to exception code are usually not taken, and therefore predict that next time they're not taken either. The compiler can even encode this as a hint. However, this hint feature was abandoned after the Intel Pentium 4; modern CPUs predict branches well enough on their own.
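In standard C++ the hint survives at the source level: since C++20 you can mark the error branch with an attribute. A minimal sketch (the function is illustrative):

#include <stdexcept>

// [[unlikely]] tells the compiler the error branch is cold, so it can lay
// that code out away from the hot path and bias static branch prediction.
int divide(int a, int b) {
    if (b == 0) [[unlikely]] {
        throw std::invalid_argument("division by zero");
    }
    return a / b;
}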

Even if they were faster, you still should not use them for anything other than exceptional conditions. If you misuse them, you make your program much harder to debug. In gdb you can do a 'catch throw' and easily find out where your program is going wrong and throwing an exception - but not if you're throwing exceptions as part of your regular processing.

Your question is a little unclear, because "implementing exceptions" covers three things:
Entering a try block. This can have no cost, but tends to make a throw more expensive. There is a more specific question about this on SO.
Executing a throw. There is a more specific question about this on SO.
Unwinding the stack to get from a throw to its catch, and loading the error-handling code (in the catch) into the CPU cache. You should ignore this cost, because you must pay it with status codes just as with exceptions. A minimal sketch annotating all three points follows.
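Here it is; risky() is a placeholder for any function that may throw:

#include <stdexcept>

int risky() { throw std::runtime_error("boom"); }  // placeholder workload

int caller() {
    try {                               // 1. entering the try block: often free
        return risky();                 // 2. the throw inside pays the raise cost
    } catch (const std::exception&) {
        return -1;                      // 3. unwinding lands here, pulling this
    }                                   //    cold code into the cache
}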

Here is blog article where someone did some actual benchmarks: https://pspdfkit.com/blog/2020/performance-overhead-of-exceptions-in-cpp/
tl;dr: The throw/catch mechanism is about an order of magnitude slower than returning a value, so if you care about performance you should use it only in exceptional situations.

Related

When should I use VULKAN_HPP_NO_EXCEPTIONS?

This question is regarding exception handling in Vulkan-Hpp (official Vulkan C++ bindings).
I wrote a small application using Vulkan-Hpp without VULKAN_HPP_NO_EXCEPTIONS defined (and with exception handlers). But after coming across this Stack Overflow question (Are Exceptions in C++ really slow), I started worrying about the penalty for using exceptions there. Then I found out about the define VULKAN_HPP_NO_EXCEPTIONS, but it changes the syntax completely for all calls which could throw an exception (because of different return values). That means one has to decide before starting implementation whether to use VULKAN_HPP_NO_EXCEPTIONS or not (i.e. it can't be enabled for the "Debug" configuration and disabled for the "Release" configuration easily).
If exception handling is disabled by defining VULKAN_HPP_NO_EXCEPTIONS, ResultValue<SomeType>::type is a struct which contains the return value and the error code in the fields result and value. (Source)
i.e.
surface = instance.createWin32SurfaceKHR(surfaceCreateInfo);
becomes
vk::ResultValue<vk::SurfaceKHR> surfaceResult = instance.createWin32SurfaceKHR(surfaceCreateInfo);
if (surfaceResult.result == vk::Result::eSuccess) {
    surface = surfaceResult.value;
}
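For comparison, the exception-enabled variant handles failure with a catch; this is a hedged sketch, where vk::SystemError is the exception type Vulkan-Hpp throws when a call fails:

try {
    surface = instance.createWin32SurfaceKHR(surfaceCreateInfo);
} catch (const vk::SystemError& err) {
    // err.code() carries the vk::Result, e.g. eErrorOutOfHostMemory
}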
So given that it is not trivial to change the strategy regarding VULKAN_HPP_NO_EXCEPTIONS at a later stage in development, I wonder in which situations I should use VULKAN_HPP_NO_EXCEPTIONS for my project and in which situations I shouldn't?
I assume there must be some technical rationale behind it, other than just personal taste/opinion.
The principal reason exceptions can be disabled is that many game developers for various platforms turn off exception handling at the compiler level. On some platforms, exception handling is flat out not supported. Those platforms still need a reasonable means of dealing with errors, and that requires a different API.
Exceptions have been a hotly debated subject in C++ and likely always will be. While C++ programmers will agree that exceptions should only be used in exceptional circumstances, the line between "exceptional circumstances" and "expected behavior" is ultimately in the eye of the beholder.
Personally, I would consider Vulkan errors to be "exceptional circumstances". Device lost and OOM errors are not things you frequently expect to happen. Plus, your response to them will likely be decidedly non-local; code higher up in the call stack will be what actually deals with it.
Furthermore, many of the functions that error are not the functions commonly encountered in performance critical Vulkan code (vkCmd*, and such). After all, usage errors are supposed to be handled by validation layers and should be impossible at runtime. Errors are usually given for object creation/destruction, and allocations, which are not things you do in the middle of building command buffers.
The erroring function most likely to be found in performance-critical code is vkAllocateDescriptorSets. And while it can error out, it can only do so for memory fragmentation reasons. The standard actually requires this:
Any returned error other than VK_ERROR_OUT_OF_POOL_MEMORY_KHR or VK_ERROR_FRAGMENTED_POOL does not imply its usual meaning: applications should assume that the allocation failed due to fragmentation, and create a new descriptor pool.
Fragmentation is something that you can usually prevent, if you have firm control over your input data. Given such control, you can ensure that you never get errors when allocating from descriptor pools.
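A sketch of the recovery pattern that quote prescribes, using the C API and assuming the usual device/allocInfo/descriptorSets variables; createFreshDescriptorPool() is a hypothetical helper:

VkResult r = vkAllocateDescriptorSets(device, &allocInfo, descriptorSets);
if (r != VK_SUCCESS) {
    // Per the spec quote: assume fragmentation, whatever the error code says.
    allocInfo.descriptorPool = createFreshDescriptorPool();  // hypothetical helper
    r = vkAllocateDescriptorSets(device, &allocInfo, descriptorSets);
}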
vkBegin/EndCommandBuffer can error, but only for OOM reasons, which typically means there's little you can do to recover - so performance is irrelevant.
The commands that give you serious runtime errors that require actions are typically device commands. And you don't issue such commands in the middle of rendering; vkQueueSubmit is the one exception, and that's at the end of rendering (or beginning; however you want to see it).
This is probably why throwing is the default in Vulkan-Hpp.

Why do exceptions always incur overhead in non-leaf functions with destructible stack objects?

I came across the claim in the title here:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html
via here:
http://www.boost.org/doc/libs/1_57_0/doc/html/container/exception_handling.html
Exception handling violates the don't-pay-for-what-you-don't-use design of C++, as it incurs overhead in any non-leaf function that has destructible stack objects regardless of whether they use exception handling.
What is this referring to?
I take this bullet point to mean that any strategy for properly unwinding the stack in the event of an exception requires non-leaf functions to store some sort of information about destructible objects they placed on the stack. If that's correct, then my specific questions are:
What is this information that must be stored?
Why is it not possible to correctly unwind the stack given only an instruction address at which a throw occurred and tables of address ranges computed before run-time?
Modern exception handling is indeed table-based and zero-cost. Unfortunately that was not the case for Windows x86 - one of the most popular targets for game development. Most likely this was for binary compatibility reasons, but even Raymond Chen doesn't know the reason. On x64 they implemented it the way it should have been from the very beginning.
You pay in binary size.
All the code that deals with exceptions needs to be there whether you use exceptions or not, since in general a compiler cannot know if a function can throw, unless it is marked noexcept (noexcept exists mostly for this reason).
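A sketch of the effect (the function names are illustrative): a noexcept callee lets the compiler omit the exception cleanup path in callers, even ones with destructible locals.

#include <string>

int parse_fast(const char* s) noexcept { return s[0] - '0'; }   // cannot throw
int parse_slow(const std::string& s) { return std::stoi(s); }   // may throw

struct Logger { ~Logger() {} };

int use_fast(const char* s) {
    Logger log;   // nothing below can throw, so no unwind path for log is needed
    return parse_fast(s);
}

int use_slow(const std::string& s) {
    Logger log;   // an unwind path must exist to destroy log if stoi throws
    return parse_slow(s);
}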
The increased binary size might also hurt actual runtime performance if the code that contains the exception handling enters the CPU cache, wasting cache memory. A good compiler should be able to avoid this problem by storing all the code that performs the exception handling as far as possible from the "hot" runtime path.
Moreover, some ABIs (e.g. SJLJ) implement exceptions with some runtime overhead even on the non-exceptional path. The Itanium and Windows x64 ABIs both have zero overhead on the non-exceptional path (and hence on these ABIs you can expect exceptions to be faster than return-error-code error handling).
This LLVM doc is a good starting point if you are interested in the differences between exception handling in the various ABIs.

Are Exceptions still undesirable in Realtime environment?

A couple of years ago I was taught that in real-time applications, such as embedded systems or (non-Linux) kernel development, C++ exceptions are undesirable. (Maybe that lesson was from before gcc 2.95.) But I also know that exception handling has become better.
So, are C++-Exceptions in the context of real-time applications in practice
totally unwanted?
even to be switched off via a compiler switch?
or very carefully usable?
or handled so well now, that one can use them almost freely, with a couple of things in mind?
Does C++11 change anything w.r.t. this?
Update: Does exception handling really require RTTI to be enabled (as one answerer suggested)? Are there dynamic casts involved, or similar?
Exceptions are now well handled, and the strategies used to implement them in fact make them faster than testing return codes, because their cost (in terms of speed) is virtually zero as long as you do not throw anything.
However, they do cost: in code size. Exceptions usually work hand in hand with RTTI, and unfortunately RTTI is unlike any other C++ feature in that you either activate or deactivate it for the whole project; once activated, it will generate supplementary code for any class that happens to have a virtual method, thus defeating the "you don't pay for what you don't use" mindset.
Also, exceptions do require supplementary code for their handling.
Therefore the cost of exceptions should be measured not in terms of speed, but in terms of code growth.
EDIT:
From @Space_C0wb0y: This blog article gives a small overview and introduces two widespread methods for implementing exceptions: jumps and zero-cost. As the name implies, good compilers now use the zero-cost mechanism.
The Wikipedia article on exception handling talks about both mechanisms. The zero-cost mechanism is the table-driven one.
EDIT:
From @Vlad Lazarenko, whose blog I referenced above: the possibility of a thrown exception might prevent a compiler from inlining and from keeping values in registers.
Answer just to the update:
Does exception handling really require RTTI to be enabled?
Exception-handling actually requires something more powerful than RTTI and dynamic cast in one respect. Consider the following code:
try {
    some_function_in_another_TU();
} catch (const int &i) {
} catch (const std::logic_error &e) {
}
So, when the function in the other TU throws, it's going to look up the stack (either check all levels immediately, or check one level at a time during stack unwinding, that's up to the implementation) for a catch clause that matches the object being thrown.
To perform this match, it might not need the aspect of RTTI that stores the type in each object, since the type of a thrown exception is the static type of the throw expression. But it does need to compare types in an instanceof way, and it needs to do this at runtime, because some_function_in_another_TU could be called from anywhere, with any type of catch on the stack. Unlike dynamic_cast, it needs to perform this runtime instanceof check on types which have no virtual member functions, and for that matter types which are not class types. That last part doesn't add difficulty, because non-class types have no hierarchy, and so all that's needed is type equality, but you still need type identifiers that can be compared at runtime.
So, if you enable exceptions then you need the part of RTTI that does type comparisons, like dynamic_cast's type comparisons but covering more types. You don't necessarily need the part of RTTI that stores the data used to perform this comparison in each class's vtable, where it's reachable from the object -- the data could instead only be encoded at the point of each throw expression and each catch clause. But I doubt that's a significant saving, since typeid objects aren't exactly massive, they contain a name that's often needed anyway in a symbol table, plus some implementation-defined data to describe the type hierarchy. So probably you might as well have all of RTTI by that point.
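A small example of that runtime, instanceof-style match (imagine the callee living in another translation unit):

#include <iostream>
#include <stdexcept>

void some_function_in_another_TU() {
    throw std::out_of_range("index");   // static type of the throw expression
}

int main() {
    try {
        some_function_in_another_TU();
    } catch (const std::logic_error& e) {
        // Matched at runtime: std::out_of_range derives from std::logic_error.
        std::cout << "caught: " << e.what() << '\n';
    }
}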
The problem with exceptions is not necessarily the speed (which may differ greatly, depending on the implementation), but it's what they actually do.
In the real-time world, when you have a time constraint on an operation, you need to know exactly what your code does. Exceptions provide shortcuts that may influence the overall run time of your code (the exception handler may not fit into the real-time constraint, or, because of an exception, you might not return the query response at all, for example).
If by "real-time" you in fact mean "embedded", then code size, as mentioned, becomes an issue. Embedded code may not necessarily be real-time, but it can have size constraints (and often does).
Also, embedded systems are often designed to run forever, in an infinite event loop. An exception may take you somewhere out of that loop, and may also leave your memory and data in an inconsistent state (because of the stack unwinding) - again, it depends on what you do with them and how the compiler actually implements it.
So better safe than sorry: don't use exceptions. If you can sustain occasional system failures, if you're running in a separate task that can easily be restarted, if you're not really real-time but just pretend to be - then you can probably give it a try. If you're writing software for a pacemaker - I would prefer to check return codes.
C++ exceptions still aren't supported by every realtime environment in a way that makes them acceptable everywhere.
In the particular example of video games (which have a soft 16.6 ms deadline for every frame), the leading compilers implement C++ exceptions in such a way that simply turning on exception handling in your program will significantly slow it down and increase code size, regardless of whether you actually throw exceptions or not. Given that both performance and memory are critical on a game console, that's a dealbreaker: the PS3's SPU units, for example, have 256 KB of memory for both code and data!
On top of this, throwing exceptions is still quite slow (measure it if you don't believe me) and can cause heap deallocations which are also undesirable in cases where you haven't got microseconds to spare.
The one... er... exception I have seen to this rule is cases where the exception might get thrown once per app run -- not once per frame, but literally once. In that case, structured exception handling is an acceptable way to catch stability data from the OS when a game crashes and relay it back to the developer.
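A hedged sketch of that once-per-run pattern on Windows, using the Win32 unhandled-exception filter (the actual reporting is elided):

#include <windows.h>

// Install a top-level SEH filter that records crash data for the developer.
LONG WINAPI crashFilter(EXCEPTION_POINTERS* info) {
    // Record info->ExceptionRecord->ExceptionCode (and, say, a minidump)
    // to a report file here, then let the process terminate.
    return EXCEPTION_EXECUTE_HANDLER;
}

int main() {
    SetUnhandledExceptionFilter(crashFilter);
    // ... run the game ...
}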
The implementation of the exception mechanism is usually very slow when an exception is thrown; otherwise the cost of using them is almost zero. In my opinion exceptions are very useful if you use them correctly.
In RT applications, exceptions should be thrown only when something goes so wrong that the program has to stop and fix the issue (and possibly wait for user interaction). Under such circumstances, fixing the issue takes longer anyway.
Exceptions provide a hidden path for reporting an error. They make the code shorter and more readable, and therefore easier to maintain.
Typical implementations of C++ exception handling were still not ideal, and might render the entire language implementation almost unusable for some embedded targets with extremely limited resources, even if the user code does not explicitly use these features. This is referred to as a "zero-overhead principle violation" by recent WG21 papers; see N4049 and N4234 for details. In such environments, exception handling does not work as expected (consuming reasonable system resources), whether the application is real-time or not.
However, there should be real-time applications in embedded environments which can afford this overhead, e.g. a video player in a handheld device.
Exception handling should always be used carefully. Throwing and catching exceptions per frame in a real-time application on any platform (not only in embedded environments) is bad design/implementation and not acceptable in general.
There are generally three constraints in embedded/realtime development - especially when that implies kernel-mode development:
At various points - usually while handling hardware exceptions - operations MUST NOT throw more hardware exceptions. C++'s implicit data structures (vtables) and code (default constructors & operators & other implicitly generated code to support the C++ exception mechanism) are not placeable, and as a result cannot be guaranteed to reside in non-paged memory when executed in this context.
Code quality: C++ code in general can hide a lot of complexity in statements that look trivial, making code difficult to visually audit for errors. Exceptions decouple handling from location, making it difficult to prove test coverage of the code.
C++ exposes a very simple memory model: new allocates from an infinite free store until you run out, and then it throws an exception. In memory-constrained devices, more efficient code can be written that makes explicit use of fixed-size blocks of memory. C++'s implicit allocations on almost any operation make it impossible to audit memory use. Also, most C++ heaps exhibit the disturbing property that there is no computable upper limit on how long an allocation can take, which again makes it difficult to prove the response time of algorithms on realtime devices where fixed upper limits are desirable.

Performance when exceptions are not thrown (C++)

I have already read a lot about C++ exceptions, and from what I can see, exception performance in particular is a hard topic. I even tried to look under g++'s hood to see how exceptions are represented in assembly.
I'm a C programmer, because I prefer low-level languages. Some time ago I decided to use C++ over C because, at small cost, it can make my life much easier (classes over structures, templates, etc.).
Returning to my question: as I see it, exceptions do generate overhead, but only when they occur, because finding an appropriate exception handler requires a long sequence of jump and comparison instructions. In normal program execution (where there is no error), exception overhead equals that of normal return-code checking. Am I right?
Please see my detailed response to a similar question here.
Exception handling overhead is platform specific and depends on the OS, the compiler, and the CPU architecture you're running on.
For Visual Studio, Windows, and x86, there is a cost even when exceptions are not thrown. The compiler generates additional code to keep track of the current "scope" which is later used to determine what destructors to call and where to start searching for exception filters and handlers. Scope changes are triggered by try blocks and the creation of objects with destructors.
For Visual Studio, Windows, and x86-64, the cost is essentially zero when exceptions are not thrown. The x86-64 ABI has a much stricter protocol around exception handling than x86, and the OS does a lot of heavy lifting, so the program itself does not need to keep track of as much information in order to handle exceptions.
When exceptions occur, the cost is significant, which is why they should only happen in truly exceptional cases. Handling exceptions on x86-64 is more expensive than on x86, because the architecture is optimized for the more common case of exceptions not happening.
Here's a detailed review of the cost of exception handling when no exceptions are actually thrown:
http://www.nwcpp.org/old/Meetings/2006/10.html
In general, in every function that uses exception handling (i.e. has either try/catch blocks or automatic objects with non-trivial destructors), the compiler generates some extra prolog/epilog code to deal with the exception registration record.
Plus, after every automatic object is constructed or destroyed, a few more assembler instructions are added (to adjust the exception registration record).
In addition, some optimizations may be disabled. This is especially the case when you work in the so-called "asynchronous" exception handling model.
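A conceptual sketch of that bookkeeping; the comments mark where the generated x86 code would touch the per-function exception state (nothing here is literal compiler output):

#include <string>

void f() {
    // prologue: link an exception registration record into the thread's chain
    std::string a;   // state = 0: only a needs destroying if we unwind here
    std::string b;   // state = 1: both a and b need destroying
    // ... function body ...
}   // epilogue: unlink the registration record; destroy b, then a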

In what ways do C++ exceptions slow down code when there are no exceptions thrown?

I have read that there is some overhead to using C++ exceptions for exception handling, as opposed to, say, checking return values. I'm only talking about overhead incurred when no exception is thrown. I'm also assuming that you would need to implement the code that actually checks the return value and does the appropriate thing - whatever is equivalent to what the catch block would have done. And it's also not fair to compare code that throws exception objects with 45 state variables inside to code that returns a negative integer for every error.
I'm not trying to build a case for or against C++ exceptions solely based on which one might execute faster. I heard someone make the case recently that code using exceptions ought to run just as fast as code based on return codes, once you take into account all the extra bookkeeping code that would be needed to check the return values and handle the errors. What am I missing?
There is a cost associated with exception handling on some platforms and with some compilers.
Namely, Visual Studio, when building a 32-bit target, will register a handler in every function that has local variables with non-trivial destructors. Basically, it sets up a try/finally handler.
The other technique, employed by gcc and by Visual Studio targeting 64 bits, only incurs overhead when an exception is thrown (the technique involves traversing the call stack and table lookups). In cases where exceptions are rarely thrown, this can actually lead to more efficient code, as error codes don't have to be processed.
Only try/catch and try/except blocks take a few instructions to set up. The overhead should generally be negligible in every case except the tightest loops. But you wouldn't normally use try/catch/except in an inner loop anyway.
I would advise not to worry about this, and use a profiler instead to optimize your code where needed.
It's completely implementation-dependent, but many recent implementations have very little or no performance overhead when exceptions aren't thrown. In fact, you are right: correctly checking return codes from all functions in code that doesn't use exceptions can be slower than doing nothing in code that uses exceptions.
Of course, you would need to measure the performance for your particular requirements to be sure.
There is some overhead with exceptions (as the other answers pointed out).
But you do not have much of a choice nowadays. Try to disable exceptions in your project, and make sure that ALL dependent code and libraries can compile and run without them.
Do they work with exceptions disabled?
Let's assume they do! Then benchmark some cases, but note that you have to set a "disable exceptions" compile switch (e.g. -fno-exceptions for GCC and Clang). Without that switch you still have the overhead - even if the code never throws exceptions.
The only overhead is ~6 instructions, which add two SEH records at the start of the function and remove them at the end. No matter how many try/catches you have in a thread, it is always the same.
Also, what is this about local variables? I hear people always complaining about them when using try/catch. I don't get it, because the destructors would eventually be called anyway. Also, you shouldn't be letting an exception propagate up more than 1-3 calls.
I took Chip Uni's test code and expanded it a bit. I split the code into two source files (one with exceptions; one without). I made each benchmark run 1000 times, and I used clock_gettime() with CLOCK_REALTIME to record the start and end times of each iteration. Then I computed the mean and variance of the data. I ran this test with 64-bit versions of g++ 5.2.0 and clang++ 3.7.0 on an Intel Core i7 box with 16GB RAM that runs ArchLinux with kernel 4.2.5-1-ARCH. You can find the expanded code and the full results here.
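This is not the author's exact harness, but a sketch of the timing described, assuming a run_benchmark() workload:

#include <ctime>

void run_benchmark() { /* hypothetical: one iteration of the workload under test */ }

long long time_once_ns() {
    timespec start{}, end{};
    clock_gettime(CLOCK_REALTIME, &start);
    run_benchmark();
    clock_gettime(CLOCK_REALTIME, &end);
    return (end.tv_sec - start.tv_sec) * 1000000000LL
         + (end.tv_nsec - start.tv_nsec);   // elapsed nanoseconds
}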
g++
No exceptions: average 30,022,994 ns; standard deviation 1,253,270 ns
Exceptions: average 30,025,642 ns; standard deviation 1,834,220 ns
clang++
No exceptions: average 20,954,657 ns; standard deviation 426,662 ns
Exceptions: average 23,916,638 ns; standard deviation 1,725,830 ns
C++ Exceptions only incur a non-trivial performance penalty with clang++, and even that penalty is only ~14%.