Catch by reference when exception variable is not defined - c++

When catching an exception the standard guidance is to throw by value, catch by reference. As I understand it, this is for two reasons:
1. If the exception was thrown due to an out-of-memory condition, we won't call a copy constructor, which could potentially terminate the program.
2. If the exception is part of an inheritance hierarchy, we may potentially have object slicing on the exception.
If we have a scenario where we don't name the exception variable in the catch block, are these concerns (really 1., as slicing won't be an issue if we don't have a name for the variable) still valid?
For example:
catch(my_exception)
{ ... }
or
catch(my_exception &)
{ ... }
Is there still the possibility of the program terminating if the exception is caught by value in this case? My feeling is that it is technically still possible.
Note: I am asking this because I have had to review someone's code who put a catch by value in this case. As shown in the question, I am not entirely sure of the technical impact of either choice, but I think that in terms of consistency it is better to catch by reference in this case regardless (there is no downside to catching by reference in any case).

The standard does not require special optimization in the case of an unnamed exception object. On the contrary, it then requires an effect as if a temporary is copy-initialized. This copying can result in memory being dynamically allocated.
N3290 §15.3/16:
If the exception-declaration specifies a name, it declares a variable which is copy-initialized (8.5) from the
exception object. If the exception-declaration denotes an object type but does not specify a name, a temporary (12.2) is copy-initialized (8.5) from the exception object. The lifetime of the variable or temporary
ends when the handler exits, after the destruction of any automatic objects initialized within the handler.
The paragraph above does not mention catching by reference, and so one might reasonably conclude that it applies whether or not the exception object is caught by reference; that a copy is constructed anyway.
However, that is contradicted by the next paragraph:
N3290 §15.3/17:
When the handler declares a non-constant object, any changes to that object will not affect the temporary
object that was initialized by execution of the throw-expression. When the handler declares a reference to
a non-constant object, any changes to the referenced object are changes to the temporary object initialized
when the throw-expression was executed and will have effect should that object be rethrown.
So, declared type T& (with T non-const) is the single case where C++11 requires a reference directly to the thrown object, instead of copying. And it is also that way in C++03, except that C++03 has some additional wording about as-if optimization. So, formally, the preference should be for
catch( T& name )
and
catch( T& )
However, I have always caught exceptions like catch( T const& ). From a practical point of view one may assume that the compiler will optimize that to a direct reference to the thrown object, even though it is possible to devise cases where the observed program effect would then be non-standard. For example <evil grin>
#include <stdio.h>
#include <stdexcept>
struct Error
: std::runtime_error
{
public:
static Error* pThrown;
char const* pMessage_;
Error()
: std::runtime_error( "Base class message" )
, pMessage_( "Original message." )
{
printf( "Default-construction of Error object.\n" );
pThrown = this;
}
Error( Error const& other )
: std::runtime_error( other )
, pMessage_( other.pMessage_ )
{
printf( "Copy-construction of Error object.\n" );
}
char const* what() const throw() { return pMessage_; }
};
Error* Error::pThrown = 0;
int main()
{
printf( "Testing non-const ref:\n" );
try
{
throw Error();
}
catch( Error& x )
{
Error::pThrown->pMessage_ = "Modified message.";
printf( "%s\n", x.what() );
}
printf( "\n" );
printf( "Testing const ref:\n" );
try
{
throw Error();
}
catch( Error const& x )
{
Error::pThrown->pMessage_ = "Modified message";
printf( "%s\n", x.what() );
}
}
With both MinGW g++ 4.4.1 and Visual C++ 10.0 the output is …
Testing non-const ref:
Default-construction of Error object.
Modified message.
Testing const ref:
Default-construction of Error object.
Modified message
A pedantic formalist might say that both compilers are non-conforming, failing to create a copy for the Error const& case. A purely practical practitioner might say that hey, what else did you expect? And me, I say that the wording in the standard is very far from perfect here, and that if anything, one should expect a clarification to explicitly allow the output above, so that also catching by reference to const is both safe and maximally efficient.
Summing up wrt. the OP's question:
Catching by reference won’t call a copy constructor which could potentially terminate the program.
The standard only guarantees this for reference to non-const.
In practice, as shown, it is also guaranteed for reference to const, even when program results are then affected.

I would prefer catching by reference. The compiler could discard the exception as an optimization and do no copy, but that's only a possibility. Catching by reference gives you certainties.
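To see what a particular compiler actually does with an unnamed handler, one can count copy constructions. The following is only a minimal sketch: `Tracked` and the two helper functions are hypothetical names, not taken from the question, and the count for the by-value case depends on the compiler, so the sketch only checks that catching by reference never costs more copies than catching by value.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical exception type that counts its copy constructions.
struct Tracked {
    static int copies;
    Tracked() {}
    Tracked(Tracked const&) { ++copies; }
};
int Tracked::copies = 0;

// Throws one Tracked, catches it by value without naming it, and
// returns how many copy constructions happened along the way.
int copies_when_caught_by_value() {
    Tracked::copies = 0;
    try { throw Tracked(); }
    catch (Tracked) {}
    return Tracked::copies;
}

// Same as above, but the unnamed handler takes a reference.
int copies_when_caught_by_reference() {
    Tracked::copies = 0;
    try { throw Tracked(); }
    catch (Tracked&) {}
    return Tracked::copies;
}

// Prints both counts; the actual numbers are implementation-dependent.
void demo() {
    int by_value = copies_when_caught_by_value();
    int by_reference = copies_when_caught_by_reference();
    std::printf("copies, catch by value:     %d\n", by_value);
    std::printf("copies, catch by reference: %d\n", by_reference);
    assert(by_reference <= by_value);
}
```

Calling `demo()` from `main` prints the two counts for the compiler at hand. If the by-value count is higher, the handler really does copy-initialize a temporary, as the quoted §15.3/16 describes.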

There might be invisible data associated with your exception, e.g. a vtable.
The vtable is an invisible data structure that carries information about where certain polymorphic (i.e. virtual) member functions can be found. In the general case this table costs a bit of memory that is held in the object itself. This may be the size of a pointer into some external table, or even the complete table. As always, it depends.
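The hidden per-object cost can be made visible with `sizeof`. This is a rough sketch with two hypothetical types, `Plain` and `Polymorphic`. The exact sizes are not fixed by the standard; the comparison reflects what common ABIs do, where a polymorphic object carries a pointer to its vtable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical exception-like types with the same visible data.
struct Plain {
    int code;
};

struct Polymorphic {
    int code;
    virtual ~Polymorphic() {}   // makes the type polymorphic
};

// Prints both sizes; the values are implementation-defined.
void demo() {
    std::printf("sizeof(Plain)       = %lu\n",
                static_cast<unsigned long>(sizeof(Plain)));
    std::printf("sizeof(Polymorphic) = %lu\n",
                static_cast<unsigned long>(sizeof(Polymorphic)));
}
```

On mainstream implementations the polymorphic type is larger by roughly one pointer (plus padding), and that extra data is copied along with the object whenever the exception is copied.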


Calling non-static member function outside of object's lifetime in C++17

Does the following program have undefined behavior in C++17 and later?
struct A {
void f(int) { /* Assume there is no access to *this here */ }
};
int main() {
auto a = new A;
a->f((a->~A(), 0));
}
C++17 guarantees that a->f is evaluated to the member function of the A object before the call's argument is evaluated. Therefore the indirection from -> is well-defined. But before the function call is entered, the argument is evaluated and ends the lifetime of the A object (see however the edits below). Does the call still have undefined behavior? Is it possible to call a member function of an object outside its lifetime in this way?
The value category of a->f is prvalue by [expr.ref]/6.3.2, and [basic.life]/7 only disallows non-static member function calls on glvalues referring to the after-lifetime object. Does this imply the call is valid? (Edit: As discussed in the comments, I am likely misunderstanding [basic.life]/7 and it probably does apply here.)
Does the answer change if I replace the destructor call a->~A() with delete a or new(a) A (with #include<new>)?
Some elaborating edits and clarifications on my question:
If I were to separate the member function call and the destructor/delete/placement-new into two statements, I think the answers are clear:
a->~A(); a->f(0): UB, because of non-static member call on a outside its lifetime. (see edit below, though)
delete a; a->f(0): same as above
new(a) A; a->f(0): well-defined, call on the new object
However in all these cases a->f is sequenced after the first respective statement, while this order is reversed in my initial example. My question is whether this reversal does allow for the answers to change?
For standards before C++17, I initially thought that all three cases cause undefined behavior simply because the evaluation of a->f depends on the value of a but is unsequenced relative to the evaluation of the argument, which causes a side effect on a. However, this is undefined behavior only if there is an actual side effect on a scalar object, e.g. a write to it. No scalar object is written to here because A is trivial, so I would also be interested in exactly which constraint is violated in the case of standards before C++17. In particular, the case with placement-new seems unclear to me now.
I just realized that the wording about the lifetime of objects changed between C++17 and the current draft. In n4659 (C++17 draft) [basic.life]/1 says:
The lifetime of an object o of type T ends when:
if T is a class
type with a non-trivial destructor (15.4), the destructor call starts
[...]
while the current draft says:
The lifetime of an object o of type T ends when:
[...]
if T is a class type, the destructor call starts, or
[...]
Therefore, I suppose my example does have well-defined behavior in C++17, but not in the current (C++20) draft, because the destructor is trivial and so, under the C++17 wording, the lifetime of the A object isn't ended. I would appreciate clarification on that as well. My original question still stands even for C++17 for the case of replacing the destructor call with a delete or placement-new expression.
If f accesses *this in its body, then there may be undefined behavior for the cases of destructor call and delete expression, however in this question I want to focus on whether the call in itself is valid or not.
Note however that the variation of my question with placement-new would potentially not have an issue with member access in f, depending on whether the call itself is undefined behavior or not. But in that case there might be a follow-up question especially for the case of placement-new because it is unclear to me, whether this in the function would then always automatically refer to the new object or whether it might need to potentially be std::laundered (depending on what members A has).
While A does have a trivial destructor, the more interesting case is probably where it has some side effect about which the compiler may want to make assumptions for optimization purposes. (I don't know whether any compiler uses something like this.) Therefore, I welcome answers for the case where A has a non-trivial destructor as well, especially if the answer differs between the two cases.
Also, from a practical perspective, a trivial destructor call probably does not affect the generated code and (unlikely?) optimizations based on undefined behavior assumptions aside, all code examples will most likely generate code that runs as expected on most compilers. I am more interested in the theoretical, rather than this practical perspective.
This question intends to get a better understanding of the details of the language. I do not encourage anyone to write code like that.
It’s true that trivial destructors do nothing at all, not even end the lifetime of the object, prior to (the plans for) C++20. So the question is, er, trivial unless we suppose a non-trivial destructor or something stronger like delete.
In that case, C++17’s ordering doesn’t help: the call (not the class member access) uses a pointer to the object (to initialize this), in violation of the rules for out-of-lifetime pointers.
Side note: if just one order were undefined, so would be the “unspecified order” prior to C++17: if any of the possibilities for unspecified behavior are undefined behavior, the behavior is undefined. (How would you tell the well-defined option was chosen? The undefined one could emulate it and then release the nasal demons.)
The postfix expression a->f is sequenced before the evaluation of any arguments (which are indeterminately sequenced relative to one another). (See [expr.call])
The evaluation of the arguments is sequenced before the body of the function (even inline functions, see [intro.execution])
The implication, then is that calling the function itself is not undefined behavior. However, accessing any member variables or calling other member functions within would be UB per [basic.life].
So the conclusion is that this specific instance is safe per the wording, but a dangerous technique in general.
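The sequencing rule itself (postfix expression first, then the arguments) can be observed without ending any object's lifetime. The sketch below is only an illustration: `object`, `argument`, and `trace` are hypothetical names, and the order is guaranteed only when compiling as C++17 or later; earlier standards leave it unspecified.

```cpp
#include <cassert>
#include <string>

struct A {
    void f(int) {}
};

std::string trace;   // records the order in which the pieces are evaluated

// Stands in for the evaluation of the postfix expression (the "a" in a->f).
A* object(A* p) { trace += "object;"; return p; }

// Stands in for the evaluation of the call's argument.
int argument() { trace += "argument;"; return 0; }

// Performs one member call and returns the recorded evaluation order.
std::string call_and_trace() {
    A a;
    trace.clear();
    object(&a)->f(argument());
    return trace;
}

void demo() {
    std::string order = call_and_trace();
    // Both pieces are always evaluated; only their order may differ pre-C++17.
    assert(order == "object;argument;" || order == "argument;object;");
}
```

Under C++17 rules `call_and_trace()` yields `"object;argument;"`. This only shows the ordering; it says nothing about whether the call is valid once the argument has ended the object's lifetime, which is the actual point of dispute in this thread.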
You seem to assume that a->f(0) has these steps (in that order for most recent C++ standard, in some logical order for previous versions):
evaluating *a
evaluating a->f (a so called bound member function)
evaluating 0
calling the bound member function a->f on the argument list (0)
But a->f doesn't have either a value or a type. It's essentially a non-thing, a meaningless syntax element needed only because the grammar decomposes member access and function call, even for a member function call, which by definition combines member access and function call.
So asking when a->f is "evaluated" is a meaningless question: there is no such thing as a distinct evaluation step for the a->f value-less, type-less expression.
So any reasoning based on the order of evaluation of such a non-entity is also null and void.
EDIT:
Actually this is worse than what I wrote, the expression a->f has a phony "type":
E1.E2 is “function of parameter-type-list cv returning T”.
"function of parameter-type-list cv" isn't even something that would be a valid declarator outside a class: one cannot have f() const as a declarator as in a global declaration:
int ::f() const; // meaningless
And inside a class f() const doesn't mean "function of parameter-type-list=() with cv=const”, it means member-function (of parameter-type-list=() with cv=const). There is no proper declarator for proper "function of parameter-type-list cv". It can only exist inside a class; there is no type "function of parameter-type-list cv returning T" that can be declared or that real computable expressions can have.
In addition to what others said:
a->~A(); delete a;
This program has a memory leak which itself is technically not undefined behavior.
However, if you called delete a; to prevent it, that would be undefined behavior because delete would call a->~A() a second time [Section 12.4/14].
a->~A()
Otherwise in reality this is as others suggested - compiler generates machine code along the lines of A* a = malloc(sizeof(A)); a->A(); a->~A(); a->f(0);.
Since there are no member variables or virtual functions, all three member functions are empty ({return;}) and do nothing. Pointer a even still points to valid memory.
It will run, but a debugger may complain of a memory leak.
However, using any nonstatic member variables inside f() could have been undefined behavior because you are accessing them after they are (implicitly) destroyed by compiler-generated ~A(). That would likely result in a runtime error if it was something like std::string or std::vector.
delete a
If you replaced a->~A() with expression that invoked delete a; instead then I believe this would have been undefined behavior because pointer a is no longer valid at that point.
Despite that, the code should still run without errors because function f() is empty. If it accessed any member variables it may have crashed or led to random results because the memory for a is deallocated.
new(a) A
auto a = new A; new(a) A; is itself undefined behavior because you are calling A() a second time for the same memory.
In that case calling f() by itself would be valid because a exists but constructing a twice is UB.
It will run fine if A does not contain any objects with constructors allocating memory and such. Otherwise it could lead to memory leaks, etc, but f() would access the "second" copy of them just fine.
I'm not a language lawyer but I took your code snippet and modified it slightly. I wouldn't use this in production code but this seems to produce valid defined results...
#include <cstdlib> // EXIT_SUCCESS, EXIT_FAILURE
#include <exception>
#include <iostream>
struct A {
int x{5};
void f(int){}
int g() { std::cout << x << '\n'; return x; }
};
int main() {
try {
auto a = new A;
a->f((a->~A(), a->g()));
} catch(const std::exception& e) {
std::cerr << e.what();
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
I'm running Visual Studio 2017 CE (IDE version 15.9.16) with the compiler language flag set to /std:c++latest, and I get the following console output and program exit status:
console output
5
IDE exit status output
The program '[4128] Test.exe' has exited with code 0 (0x0).
So this does seem to behave predictably in the case of Visual Studio; I'm not sure how other compilers will treat this. The destructor is being invoked; however, the object a points to is still in dynamic heap memory.
Let's try another slight modification:
#include <cstdlib> // EXIT_SUCCESS, EXIT_FAILURE
#include <exception>
#include <iostream>
struct A {
int x{5};
void f(int){}
int g(int y) { x+=y; std::cout << x << '\n'; return x; }
};
int main() {
try {
auto a = new A;
a->f((a->~A(), a->g(3)));
} catch(const std::exception& e) {
std::cerr << e.what();
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
console output
8
IDE exit status output
The program '[4128] Test.exe' has exited with code 0 (0x0).
This time let's not change the class anymore, but let's make a call on a's member afterwards...
int main() {
try {
auto a = new A;
a->f((a->~A(), a->g(3)));
a->g(2);
} catch( const std::exception& e ) {
std::cerr << e.what();
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
console output
8
10
IDE exit status output
The program '[4128] Test.exe' has exited with code 0 (0x0).
Here it appears that a.x is maintaining its value after a->~A() is called since new was called on A and delete has not yet been called.
What's more, if I remove the new and use a pointer to a stack object instead of dynamically allocated heap memory:
int main() {
try {
A b;
A* a = &b;
a->f((a->~A(), a->g(3)));
a->g(2);
} catch( const std::exception& e ) {
std::cerr << e.what();
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
I'm still getting:
console output
8
10
IDE exit status output
When I change my compiler's language flag setting from /std:c++latest to /std:c++17 I'm getting the exact same results.
From what I'm seeing in Visual Studio, this appears to behave consistently, without any visible symptoms of UB, within the contexts of what I've shown. However, from a language perspective, as far as the standard is concerned, I wouldn't rely on this type of code either. The above also doesn't consider the case where the class has internal pointers to objects with automatic or dynamic storage, where the constructor calls new on those internal objects and the destructor calls delete on them.
There are also a bunch of other factors besides the compiler's language setting, such as optimizations, calling conventions, and various other compiler flags. It is hard to say, and I don't have a copy of the full latest draft standard available to investigate this any deeper. Maybe this can help you, others who are able to answer your question more thoroughly, and other readers to visualize this type of behavior in action.

c++ exceptions throw by value catch by reference

In C++, when throwing an object by value, as in throw Exception(), a temporary object is created. How can it be caught by reference? I know it works, but if it were a function return value or a function call argument, binding it to a reference would fail without adding const to the type. What is the difference?
First, when you write
throw Exception();
what's being thrown isn't actually the temporary object created by the prvalue expression Exception(). Conceptually, there's a separate object - the exception object - that's initialized from that temporary object, and it is the exception object that's actually thrown. (Compilers are allowed to elide the copy/move, though.)
Second, the language rules say that the exception object is always considered an lvalue. Hence it is allowed to bind to non-const lvalue references.
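A small sketch of the consequence: because the exception object is an lvalue, a handler can bind a non-const reference to it, modify it, and rethrow the same object. The `Exception` type and its `note` member here are hypothetical stand-ins for the class in the question.

```cpp
#include <cassert>
#include <string>

// Hypothetical exception type with one mutable field.
struct Exception {
    std::string note;
};

// Throws, modifies the exception object through a non-const reference,
// rethrows it, and returns what the outer handler observes.
std::string modify_and_rethrow() {
    try {
        try {
            throw Exception();
        } catch (Exception& e) {       // binds directly to the exception object
            e.note = "seen by inner handler";
            throw;                     // rethrows the same object, not a copy
        }
    } catch (Exception const& e) {
        return e.note;
    }
    return std::string();              // not reached; keeps compilers quiet
}

void demo() {
    assert(modify_and_rethrow() == "seen by inner handler");
}
```

The outer handler sees the modification made by the inner one, which is the behavior that would be impossible if the handler's reference were bound to a fresh temporary.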

Content of char pointer seems to get deleted while being passed to the catch block

I encountered a very strange (at least to me) behaviour of an exception class I have thrown. What I do is allocate memory via new for a string in the constructor of the exception class and fill it with characters. So far everything is fine. When debugging the code I can see in Visual Studio that the pointer actually has the right content.
Now the weird thing happens. My next breakpoint is in the catch block to which the exception is passed after being constructed, and here I can see in the debugger that the content of the string contained in the exception object is severely corrupted, even though the address didn't change at all! So it seems like the content of the string gets destroyed.
So I put a breakpoint into the exception's destructor and, indeed, it is being called before the catch block is entered. This confuses me a lot, since I learned to pass exceptions by reference to the catch block. But what good is that if the destructor gets called before I can access the dynamically created data...
I constructed a minimal example that shows the situation I am in:
#include <iostream>
#include <cstring>
class test_exception {
public:
test_exception();
~test_exception() {
delete[] _msg;
}
// Getter Functions
char* errorMessage() const {
return _msg;
}
private:
char* _msg;
};
test_exception::test_exception()
{
_msg = new char[22];
strcpy(_msg, "This is a test string");
}
int main(int argc, char* argv[])
{
try {
throw test_exception();
} catch (const test_exception& err) {
std::cout << err.errorMessage() << std::endl;
}
std::cin.get();
return 0;
}
It would be great if someone could tell me whether this is weird MS behaviour or whether I misunderstood how try-catch blocks should be used.
Exceptions are copied (or in C++11, possibly moved) when they're thrown. Quoting C++11, §15.1/3:
A throw-expression initializes a temporary object, called the exception object, the type of which is determined by removing any top-level cv-qualifiers from the static type of the operand of throw and adjusting the type from “array of T” or “function returning T” to “pointer to T” or “pointer to function returning T”, respectively. The temporary is an lvalue and is used to initialize the variable named in the matching handler. If the type of the exception object would be an incomplete type or a pointer to an incomplete type other than (possibly cv-qualified) void the program is ill-formed. Except for these restrictions and the restrictions on type matching mentioned in 15.3, the operand of throw is treated exactly as a function argument in a call or the operand of a return statement.
Because test_exception violates the rule-of-three (or for C++11, the rule-of-five), test_exception::_msg has already been deleted by the time you enter your catch block.
Since exceptions are copied, you should add a copy constructor to your test_exception object. The exception thrown is not the same one as the one received by the catch.
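One possible shape of the fix, as a sketch rather than the only option: give `test_exception` a deep-copying copy constructor and copy assignment operator so that every copy owns its own buffer. The private `duplicate` helper is a hypothetical addition, not part of the original class. Holding a `std::string` member instead would avoid the manual memory management altogether.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

class test_exception {
public:
    test_exception() : _msg(duplicate("This is a test string")) {}

    // Deep copy: the copy gets its own buffer, so both destructors are safe.
    test_exception(const test_exception& other) : _msg(duplicate(other._msg)) {}

    test_exception& operator=(const test_exception& other) {
        if (this != &other) {
            char* copy = duplicate(other._msg);  // allocate before releasing
            delete[] _msg;
            _msg = copy;
        }
        return *this;
    }

    ~test_exception() { delete[] _msg; }

    const char* errorMessage() const { return _msg; }

private:
    // Hypothetical helper: returns a freshly allocated copy of text.
    static char* duplicate(const char* text) {
        std::size_t size = std::strlen(text) + 1;   // include the terminator
        char* copy = new char[size];
        std::memcpy(copy, text, size);
        return copy;
    }

    char* _msg;
};

// Throws the exception and returns the message as the handler sees it.
std::string thrown_message() {
    try {
        throw test_exception();
    } catch (const test_exception& err) {
        return err.errorMessage();
    }
    return std::string();   // not reached
}

void demo() {
    assert(thrown_message() == "This is a test string");
}
```

With the copy operations in place, the message is intact in the handler whether or not the implementation copies the exception object on the way there.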

Rationale for throwing static type?

According to the C++ FAQ, when one throws an object, it's thrown using the static type of the expression. Hence, if you have:
catch ( some_exception const &e ) {
// ...
throw e; // throws static type, possibly causing "slicing"; should just "throw;" instead
}
and e is actually a reference to some class derived from some_exception, the above throw will cause the object to be "sliced" silently. Yes, I know the correct answer is simply to throw;, but the way things are seems like an unnecessary source of confusion and bugs.
What's the rationale for this? Why wouldn't you want it to throw by the dynamic type of the object?
When you throw something, a temporary object is constructed from the operand of the throw and that temporary object is the object that is caught.
C++ doesn't have built-in support for copying things or creating objects based on the dynamic type of an expression, so the temporary object is of the static type of the operand.
The "argument" to throw is an expression and it is the type of the expression that determines the type of the exception object thrown. The type of the expression thrown doesn't necessarily have to be a polymorphic type so there may not be a way to determine if the expression actually refers to a base class subobject of a more derived type.
The simpler "type of the expression" rule also means that the implementation doesn't have to dynamically determine the size and type of the exception object at runtime which might require more complex and less efficient code to be generated for exception handling. If it had to do this it would represent the only place in a language where a copy constructor for a type unknown at the call point was required. This might add significantly to the cost of implementation.
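The difference between `throw e;` and a bare `throw;` can be sketched as follows. `derived_exception` and the `name()` member are hypothetical additions for illustration; only `some_exception` appears in the question.

```cpp
#include <cassert>
#include <string>

struct some_exception {
    virtual ~some_exception() {}
    virtual std::string name() const { return "some_exception"; }
};

// Hypothetical derived type used to make slicing observable.
struct derived_exception : some_exception {
    std::string name() const { return "derived_exception"; }
};

// Rethrows with "throw e;": a new exception object of the static type
// some_exception is copy-constructed, so the derived part is sliced off.
std::string rethrow_with_throw_e() {
    try {
        try { throw derived_exception(); }
        catch (some_exception const& e) { throw e; }
    } catch (some_exception const& e) {
        return e.name();
    }
    return std::string();   // not reached
}

// Rethrows with "throw;": the original exception object continues to
// propagate, so its dynamic type is preserved.
std::string rethrow_with_bare_throw() {
    try {
        try { throw derived_exception(); }
        catch (some_exception const&) { throw; }
    } catch (some_exception const& e) {
        return e.name();
    }
    return std::string();   // not reached
}

void demo() {
    assert(rethrow_with_throw_e() == "some_exception");
    assert(rethrow_with_bare_throw() == "derived_exception");
}
```

The first function reports the base type because the thrown object was built from the static type of the expression `e`; the second reports the derived type because no new object was created.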
Consider that we can have references to objects where the static type of the reference is copyable, but the dynamic type of the object is not.
struct foo {};
struct ncfoo : foo
{
private:
ncfoo(ncfoo const&) {}
};
ncfoo g_ncfoo;
void fun()
{
foo& ref = g_ncfoo;
throw ref; // what should be thrown here?
}
If you say "in this case just throw the static type", then what are the exact rules? What does "in this case" mean? References we just caught are "re-thrown" without copying, everything else is copied? Hm...
But however you define the rule, it would still be confusing. Throwing by-reference would lead to different behavior, depending on where we got that reference from. Neh. C++ is already complicated and confusing enough :)

Destruction of string temporaries in thrown exceptions

Consider the following code:
std::string my_error_string = "Some error message";
// ...
throw std::runtime_error(std::string("Error: ") + my_error_string);
The string passed to runtime_error is a temporary returned by string's operator+. Suppose this exception is handled something like:
catch (const std::runtime_error& e)
{
std::cout << e.what() << std::endl;
}
When is the temporary returned by string's operator+ destroyed? Does the language spec have anything to say about this? Also, suppose runtime_error took a const char* argument and was thrown like this:
// Suppose runtime_error has the constructor runtime_error(const char* message)
throw std::runtime_error((std::string("Error: ") + my_error_string).c_str());
Now when is the temporary string returned by operator+ destroyed? Would it be destroyed before the catch block tries to print it, and is this why runtime_error accepts a std::string and not a const char*?
runtime_error is a class which contains a string. That string will be managed for you by the normal C++ construction and destruction mechanisms. If it contained a char *, then that would have to be explicitly managed, but you would still not have to do anything as a user of runtime_error.
Despite what you may read elsewhere on the internet, C++ is designed to almost always do "the reasonable thing" - you actually have to try fairly hard to break this reasonable behaviour, though of course it is not impossible to do so.
As a temporary object (12.2), the result of the + will be destroyed as the last step in the evaluation of the full-expression (1.9/9) that contains it. In this case the full-expression is the throw-expression.
A throw-expression constructs a temporary object (the exception-object) (15.1) (std::runtime_error in this case). All the temporaries in the throw-expression will be destroyed after the exception-object has been constructed. The exception is thrown only once the evaluation of the throw-expression has completed. As the destruction of temporaries is part of this evaluation, they will be destroyed before the destruction of automatic variables constructed since the try block was entered (15.2) and before the handler is entered.
The post-condition on runtime_error's constructor is that what() returns something that strcmp considers equal to what c_str() on the passed-in argument returns. It is a theoretical possibility that once the std::string passed as a constructor argument is destroyed, runtime_error's what() could return something different, although it would be a questionable implementation. Even then it would still have to be a null-terminated string of some sort; it couldn't return a pointer to a stale c_str() of a dead string.
Note that the runtime_error exception class makes a copy of the string passed into the constructor. So when you're calling .what() on the exception object, you're not getting back the same exact string instance you passed in.
So to answer your question, the temporary you're asking about gets destroyed "at the semicolon" of the expression that contains it (this is true in both your first and second version of the question), but as I said, this isn't that interesting, because a copy of it was already made.
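A short sketch of the first variant from the question, wrapped in a hypothetical `fail` helper so the temporary string is clearly gone by the time the handler runs:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Builds the message from a temporary std::string and throws.
// The temporary is destroyed at the end of the throw-expression.
void fail(const std::string& detail) {
    throw std::runtime_error(std::string("Error: ") + detail);
}

// Returns the text the handler sees via what().
std::string message_seen_by_handler() {
    try {
        fail("Some error message");
    } catch (const std::runtime_error& e) {
        return e.what();   // reads the exception's own copy of the message
    }
    return std::string();  // not reached
}

void demo() {
    assert(message_seen_by_handler() == "Error: Some error message");
}
```

The handler still sees the full message because `std::runtime_error` keeps its own copy. The second variant from the question, passing `c_str()` of the temporary, is also fine with the real `std::runtime_error`, since the copy is taken while the temporary is still alive.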