Is throwing a temporary value as reference undefined behavior? - c++

To my surprise, std::runtime_error only has a const std::string& constructor, but no value constructor. Thus I am wondering if throw std::runtime_error("Temporary" + std::to_string(100)); is defined. After all we are creating an exception object that refers to a temporary object inside the scope of a function that immediately exits (as we throw out of it). Further investigation showed that while returning a constant reference to a temporary value immediately causes a segfault on my system, throwing one doesn't:
const std::string& undefined_behavior() {
return "Undefined " + std::to_string(1);
}
void test() {
throw std::runtime_error("Hello : " + std::to_string(2));
}
void test2() {
const std::string& ref = "Hello : " + std::to_string(3);
throw ref;
}
int main(int argc, char** argv) {
//std::cout << undefined_behavior() << std::endl;
try {
test();
} catch(const std::runtime_error& ex) {
std::cout << ex.what() << std::endl;
}
try {
test2();
} catch(const std::string& ref) {
std::cout << ref << std::endl;
}
}
On my system, undefined_behavior() crashes immediately but test() and test2() run fine.
While I know the call to undefined_behavior() is undefined behavior, are test() and test2() undefined? If not, do thrown values get a specified treatment?
And if they are all undefined but happened to work on my computer by accident, what is the proper way to throw an exception with a variable message?

The const std::string& constructor doesn't cause the exception object to store a reference to the passed std::string. std::runtime_error will make an internal copy of the passed string (although the copy will be stored in an unusual way, probably reference-counted in an allocation external to the exception object; this is to give better exception guarantees when the exception object is copied, which the standard requires).
A function taking a const lvalue reference doesn't usually mean that it takes ownership or stores a reference to the object.
Prior to C++11 there was no reason not to use a const reference here, since it was always more efficient than passing by value. With C++11 move semantics that changed, and it is true that constructors of this kind usually had rvalue reference variants added in C++11, so that passing a temporary std::string object to the constructor would allow more efficient move-construction instead of copy-construction.
My guess is that this simply never was done for the exception classes because the performance of their construction should not be important in comparison to the overhead of actually throwing the exception.
So in test()'s case a temporary string object is created from the + operation, which is passed to the constructor and the exception object will make a copy of it. After the initialization of the exception object the temporary string is destroyed, but that is not a problem, since the exception object stores/references a copy.
In test2() the same would apply if you had written throw std::runtime_error(ref). However, here a reference is bound immediately to the temporary object materialized from the return value of +. This causes its lifetime to be extended to that of the reference variable. Therefore again, the temporary lives long enough for the std::runtime_error to make a copy from it.
But since you write simply throw ref that is not exactly what happens. throw ref will throw a std::string, because that is the type of the ref expression. However, throw doesn't actually use the object referenced by the expression as exception object. The exception object is initialized from the operand of the throw expression. So here, the exception object will be another std::string object (in unspecified storage) which will be initialized from the temporary that ref refers to.
undefined_behavior() has undefined behavior when the return value is used, because the temporary std::string object materialized from the + return value and which the return value of undefined_behavior() references is not lifetime-extended. Although it is again immediately bound by a reference, the lifetime-extension rule specifically doesn't apply to objects bound to a reference in a return value initialization. Instead the temporary is destroyed when the function returns, causing the function to always return a dangling reference.
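A minimal sketch of the test() case, for illustration only. The helper names fail_with_code, caught_message and check_caught_message are hypothetical and not from the question. The message is built as a temporary std::string, std::runtime_error copies it, and the handler can still read the text after the temporary is gone:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: the argument to the constructor is a temporary
// std::string. std::runtime_error copies the text, so the temporary may be
// destroyed as soon as the exception object has been initialized.
[[noreturn]] void fail_with_code(int code) {
    throw std::runtime_error("Hello : " + std::to_string(code));
}

// Returns the message that the handler observes.
std::string caught_message(int code) {
    try {
        fail_with_code(code);
    } catch (const std::runtime_error& ex) {
        return ex.what();  // copy the text out while the exception is alive
    }
    return {};  // not reached: fail_with_code always throws
}

// Expected behaviour expressed as an assertion.
void check_caught_message() {
    assert(caught_message(2) == "Hello : 2");
}
```

This is also a reasonable answer to the "variable message" part of the question: build the std::string however you like and hand it to the exception constructor.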

It is not undefined behavior.
In test(), the constructor of std::runtime_error makes an internal copy. There is even a note on this at cppreference:
Because copying std::runtime_error is not permitted to throw
exceptions, this message is typically stored internally as a
separately-allocated reference-counted string. This is also why
there is no constructor taking std::string&&: it would have to
copy the content anyway.
In test2() there is no std::runtime_error involved. Instead, as you can read e.g. here, in the following statement
throw expr;
the exception object is copy-initialized from expr. So there is a copy made of that temporary string also in this case.
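The copy-initialization described above can be observed directly. In this sketch (the function names are made up for illustration), a std::string is thrown through a reference, and the handler checks that it received a different object with the same value:

```cpp
#include <cassert>
#include <string>

// Throws through a reference. The exception object is a separate std::string,
// copy-initialized from the object that `ref` refers to.
void throw_via_reference(const std::string& ref) {
    throw ref;
}

// Returns true when the handler sees a distinct object with an equal value.
bool handler_sees_a_copy() {
    const std::string original = "Hello : 3";
    try {
        throw_via_reference(original);
    } catch (const std::string& caught) {
        // `original` is still in scope here, so both addresses are valid.
        return &caught != &original && caught == original;
    }
    return false;  // not reached
}

// Expected behaviour expressed as an assertion.
void check_handler_sees_a_copy() {
    assert(handler_sees_a_copy());
}
```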

Related

Why can't we return an object by reference from a function in C++?

What I've understood, the reason is that we unnecessarily call the copy constructor for a simple statement like a=b; (both are objects).
What I don't get is that in my book it's written that we should never pass an object by reference, because as soon as the function terminates, that reference ceases to exist.
So is the text written in my book wrong or am I missing something here?
ref: Overloading assignment operator in C++
There's nothing wrong with returning a reference from a function.
Indeed that's how the assignment operator operator= is normally defined (with return *this; for method chaining)!
What you shouldn't do is return a reference to an object that goes out of scope, e.g.
int& undefinedBehaviourServer()
{
int ub;
return ub;
}
In this case, ub has automatic storage duration and the returned reference to it will dangle.
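For contrast, here is a small sketch of the safe case mentioned above, operator= returning *this. The Counter type is a hypothetical example, not taken from the question:

```cpp
#include <cassert>

// Hypothetical example type.
struct Counter {
    int value = 0;

    // Returning *this is safe: the object the reference refers to was created
    // by the caller and outlives the call to operator=.
    Counter& operator=(const Counter& other) {
        value = other.value;
        return *this;
    }
};

// Chained assignment works because each operator= returns a reference
// to its left-hand operand.
int chained_assignment() {
    Counter a, b, c;
    c.value = 7;
    a = b = c;  // parsed as a = (b = c)
    return a.value + b.value;
}

// Expected behaviour expressed as an assertion.
void check_chained_assignment() {
    assert(chained_assignment() == 14);
}
```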
As soon as the function returns, all objects declared in it are destroyed. Therefore, by returning a reference from a function, you risk accessing an object that has already been destroyed. Let's look at a typical example:
// don't do that!!!
std::string& get_str()
{
std::string s = "abc";
return s;
}
int main()
{
std::string &s = get_str();
// the "abc" string is already destroyed at this point
std::cout << s; // access to a destroyed string: undefined behavior
}
Therefore, it is dangerous to return references to local objects from functions, because it may involve accessing a destroyed object (undefined behavior). Returning a reference to a non-local object, however, is perfectly possible and often done. For example:
std::string& get_s()
{
static std::string s = "abc";
return s;
}
int main()
{
std::string &s = get_s();
std::cout << s; // that's OK
}

Why throwing exception which is a reference calls copy constructor?

struct Error
{
Error() {}
Error(const Error&) = delete;
};
int main()
{
Error& error = *new Error;
throw error;
}
Compilation error:
error: declared here
Error(const Error&) = delete;
It does not happen when throwing pointer like:
int main()
{
Error* error = new Error;
throw error;
}
This is OK.
You cannot throw a reference. Throwing always copies the thrown expression value into a special area of storage set aside for thrown objects. Otherwise, you'd almost always be "catching" a dangling reference, as is [theoretically] the case in your code.
Your Error type cannot be copied, so the program is impossible.
However, a pointer can of course be copied, and the main problem in your final example is a memory leak. Also your program will simply terminate at the throw statement as you don't have any try/catch.
Before unwinding the stack, a throw expression (except throw; without an operand, which rethrows) creates an exception object in a special memory area. Depending on the circumstances, the object is initialized in different ways (constructor, copy constructor, or move constructor; see https://en.cppreference.com/w/cpp/language/copy_elision) from the operand given to throw. Providing a reference is fine, but three things matter:
if you provide a reference in an argument list, it depends on the receiving party whether what is actually received is a reference or a copy of the value;
the compiler needs to initialize an exception object, because the operand provided to throw will no longer be alive when the catch block runs (the stack will have been unwound by then; in the pointer case the pointer itself is copied, the object it points to is still alive in your code, and the catch block gets a copy of the pointer to that same object);
it is not possible to initialize a reference at run time;
so, in your case, the compiler requires the copy constructor in order to initialize the exception object from the reference you provided (a copy constructor usually takes a reference to the source object and initializes the new object from it).
When you pass a reference to Error to the throw operator, the type of the exception object is Error, and we need to initialize an Error exception object in that specific memory area.
When you pass a pointer to Error to a throw operator, the type of the exception object is a pointer to Error (Error *), so the pointer is copied, not the Error object which the pointer points to. The copying pointer to error has nothing to do with calling the copy constructor of Error, so you don’t have the error in that case.
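The pointer case can be sketched as follows. The Error type matches the question; the function names are hypothetical. Only the pointer value is copied into the exception object, so the handler receives the address of the very same Error and is responsible for freeing it:

```cpp
#include <cassert>

struct Error {
    Error() {}
    Error(const Error&) = delete;  // non-copyable, as in the question
};

// Throws a pointer: the exception object has type Error*, so only the pointer
// value is copied and Error's deleted copy constructor is never needed.
bool pointer_is_copied_not_object() {
    Error* original = new Error;
    bool same_object = false;
    try {
        throw original;
    } catch (Error* caught) {
        same_object = (caught == original);  // same address, same object
        delete caught;  // the handler must free it, otherwise it leaks
    }
    return same_object;
}

// Expected behaviour expressed as an assertion.
void check_pointer_is_copied_not_object() {
    assert(pointer_is_copied_not_object());
}
```

Throwing owning raw pointers is shown here only to mirror the question; throwing by value and catching by reference is the usual style.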

Throwing exception that has const reference to local variable

What happens to the local variables during stack unwinding, that are referenced in exception? Consider following code:
class bar;
class my_error
{
public:
my_error(const bar& bar) : _bar(bar) {}
const bar& get_bar() const { return _bar; }
private:
const bar& _bar;
};
...
bar some_local_object(...);
if (!foo()) {
throw my_error(some_local_object);
}
...
try {
g();
} catch (my_error& e) {
e.get_bar();
...
}
What happens to some_local_object? Shouldn't it be destroyed during stack unwinding? Is it safe to use it, as provided in example?
Additional question
As already answered, this code would lead to undefined behavior. My second question to it is:
If I am neither allowed to pass a reference to a local object nor should I try to make a copy of it, because in rare cases that could cause bad_alloc (which is why, I guess, the gcc standard library has no meaningful error messages, e.g. map.at throws an exception for which what() returns "map.at"), then what is a good strategy for passing additional information? Note that even joining multiple strings during construction of the error message could theoretically cause bad_alloc, e.g.:
void do_something(const key& k, ....)
{
...
if (!foo(k)) {
std::ostringstream os;
os << "Key " << k << " not found"; // could throw bad_alloc
throw std::runtime_error(os.str());
}
// another approach
if (!foo(k)) {
throw key_not_found(k); // also bad, because exception could outlive k
}
}
The behavior is the same as when returning a reference to a variable on the stack: the object is destroyed before you get to use it. That is, by the time the exception is caught, the referenced object is destroyed and all access to the reference result in undefined behavior.
The relevant clause in the standard is 15.2 [except.ctor] paragraph 1:
As control passes from the point where an exception is thrown to a handler, destructors are invoked for all automatic objects constructed since the try block was entered. The automatic objects are destroyed in the reverse order of the completion of their construction.
The local object is destroyed during stack unwinding. The reference then becomes invalid, a dangling reference. Which means that inspection of the exception object such that the reference is used, will have Undefined Behavior.
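One way to avoid the dangling reference, sketched under the assumption that copying bar is acceptable (the bar definition and the function names below are made up for illustration), is to let the exception own a copy instead of a reference:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the question's `bar`.
struct bar {
    std::string name;
};

class my_error {
public:
    explicit my_error(const bar& b) : _bar(b) {}  // copies b into the exception
    const bar& get_bar() const { return _bar; }
private:
    bar _bar;  // stored by value, not by reference
};

void g() {
    bar some_local_object{"local"};
    // some_local_object is destroyed during stack unwinding,
    // but the exception object carries its own copy.
    throw my_error(some_local_object);
}

std::string name_seen_by_handler() {
    try {
        g();
    } catch (const my_error& e) {
        return e.get_bar().name;  // reads the exception's own copy
    }
    return {};  // not reached
}

// Expected behaviour expressed as an assertion.
void check_name_seen_by_handler() {
    assert(name_seen_by_handler() == "local");
}
```

The trade-off raised in the question remains: the copy itself may allocate and could throw bad_alloc.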

Will temporary object be deleted if there is no const reference to it?

Let's take a look at these two functions:
std::string get_string()
{
std::string ret_value;
// Calculate ret_value ...
return ret_value;
}
void process_c_string(const char* s)
{
std::cout << s << std::endl;
}
And here are two possible calls of process_c_string with argument returned by get_string.
Without binding const reference to the returned object of get_string.
process_c_string(get_string().c_str());
With binding const reference to the returned object of get_string.
const std::string& tmp_str = get_string();
process_c_string(tmp_str.c_str());
I know that the second way is valid, but what about the first one? What does the standard say about this case? Will the temporary object returned by get_string be destroyed before process_c_string has finished, because there is no const reference bound to it?
Note: Both versions work in MSVC.
The lifetime of the temporary extends for the length of the full expression in which it was created. In your case, the temporary will be destroyed but only after the call to process_c_string completes. As long as the function does not store the pointer for later use, you are fine.
In the second case (binding of reference), the lifetime of that temporary is extended to the scope of the reference, but I would advise against that pattern in this particular case. You get the same effect by creating a local string initialized with the temporary, and the code is simpler. (From a performance point of view, compilers typically elide the potential extra copy in that code, so the cost would be the same.)
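The full-expression rule can be made visible with a type that logs its own destruction. This is an illustrative sketch: Tracked, events and run_first_variant are hypothetical names, and Tracked merely stands in for std::string so the destructor can be observed:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Event log shared by the sketch (hypothetical, for illustration only).
std::vector<std::string>& events() {
    static std::vector<std::string> log;
    return log;
}

// Stand-in for std::string that records when it is destroyed.
struct Tracked {
    std::string text = "payload";
    ~Tracked() { events().push_back("temporary destroyed"); }
};

Tracked get_tracked() { return {}; }  // initializes the result directly

void process_c_string(const char* s) {
    events().push_back(std::string("processed ") + s);
}

// Mirrors the first call style from the question.
std::vector<std::string> run_first_variant() {
    events().clear();
    // The temporary returned by get_tracked() lives until the end of this
    // full expression, i.e. until after process_c_string has returned.
    process_c_string(get_tracked().text.c_str());
    return events();
}

// Expected order of events expressed as assertions.
void check_run_first_variant() {
    const std::vector<std::string> log = run_first_variant();
    assert(log.size() == 2);
    assert(log[0] == "processed payload");
    assert(log[1] == "temporary destroyed");
}
```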

Thrown object copy constructs -- why?

I want to be able to throw a constructed object, but modify it just before it's thrown
(using the Named Parameter Idiom).
Given:
#include <iostream>
#include <exception>
using namespace std;
struct my_exception : std::exception {
my_exception() {
cout << "my_exception(): this=" << hex << (unsigned long)this << endl;
}
my_exception( my_exception const& ) {
cout << "my_exception( my_exception const& )" << endl;
}
~my_exception() throw() {
cout << "~my_exception()" << endl;
}
my_exception& tweak() {
return *this;
}
char const* what() const throw() { return "my_exception"; }
};
int main() {
try {
throw my_exception().tweak();
}
catch ( my_exception const &e ) {
cout << "&e=" << hex << (unsigned long)&e << endl;
}
}
When I run the program, I get:
my_exception(): this=7fff5fbfeae0
my_exception( my_exception const& )
~my_exception()
&e=1001000f0
~my_exception()
As you can see, the exception object caught is not the one that's originally thrown.
If I remove the call to tweak(), I instead get:
my_exception(): this=1001000f0
&e=1001000f0
~my_exception()
For the case where tweak() is called, why is the copy constructor called? I want tweak() to operate on the originally constructed object and no copy to be made. Is there any way to prevent the copy construction?
FYI: I'm using g++ 4.2.1 (part of Xcode on Mac OS X).
An exception is thrown by value. You can't throw a reference as a reference. When you try, the object gets copied (using the statically known type).
By the way, this is one reason why it's a good idea to make exceptions cloneable, and to give them a virtual rethrow method.
EDIT (see comments): For example, it's Undefined Behavior to propagate an exception through a C callback. But if you have defined a suitable exception class then you can clone it, and in C++-land again up the call chain rethrow via virtual method.
Cheers & hth.,
To add to Alf's answer, the fact that you aren't getting a copy operation when you don't call tweak() is because the standard permits (but doesn't require) eliding calls to the copy constructor to create the temporary exception object. From C++03 15.1/5 (Throwing an exception):
If the use of the temporary object can
be eliminated without changing the
meaning of the program except for the
execution of constructors and
destructors associated with the use of
the temporary object (12.2), then the
exception in the handler can be
initialized directly with the argument
of the throw expression. When the
thrown object is a class object, and
the copy constructor used to
initialize the temporary copy is not
accessible, the program is ill-formed
(even when the temporary object could
otherwise be eliminated).
If you make the copy constructor private, gcc will give you an error (even though when the constructor is public it doesn't get called). MSVC will not give an error, but it should I think.
AFAIK the following happens in your line throw my_exception().tweak(); :
A new my_exception object is created (locally, on the stack), and tweak() returns a reference to this local object. Then, when you throw this reference, you leave the scope and the local object gets destroyed. So the implementation copies the object to dynamic memory to keep the thrown exception valid.
In the second case you throw it by value and it is allocated in dynamic memory at once.
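The extra copy in the tweak() case can be counted. The sketch below is a trimmed variant of the question's class; the static counter and the function names are additions for illustration. Because tweak() returns an lvalue reference, the exception object has to be copy-constructed from it:

```cpp
#include <cassert>
#include <exception>

struct my_exception : std::exception {
    static int copies;  // counts copy-constructor calls (added for this sketch)

    my_exception() {}
    my_exception(const my_exception& other) : std::exception(other) { ++copies; }
    my_exception& tweak() { return *this; }
    const char* what() const noexcept override { return "my_exception"; }
};
int my_exception::copies = 0;

// Returns how many copies were made while throwing the tweaked object.
int copies_when_throwing_tweaked() {
    my_exception::copies = 0;
    try {
        // The operand is an lvalue referring to a temporary, so the exception
        // object is copy-constructed from it; this copy cannot be elided.
        throw my_exception().tweak();
    } catch (const my_exception&) {
        // Catching by reference makes no further copy.
        return my_exception::copies;
    }
    return -1;  // not reached
}

// Expected behaviour expressed as an assertion.
void check_copies_when_throwing_tweaked() {
    assert(copies_when_throwing_tweaked() == 1);
}
```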