Rationale for throwing static type? - c++

According to the C++ FAQ, when one throws an object, it's thrown using the static type of the expression. Hence, if you have:
catch ( some_exception const &e ) {
// ...
throw e; // throws static type, possibly causing "slicing"; should just "throw;" instead
}
and e is actually a reference to some class derived from some_exception, the above throw will cause the object to be "sliced" silently. Yes, I know the correct answer is simply to throw;, but the way things are seems like an unnecessary source of confusion and bugs.
What's the rationale for this? Why wouldn't you want it to throw by the dynamic type of the object?

When you throw something, a temporary object is constructed from the operand of the throw and that temporary object is the object that is caught.
C++ doesn't have built-in support for copying things or creating objects based on the dynamic type of an expression, so the temporary object is of the static type of the operand.
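The difference between `throw e;` and `throw;` can be observed directly. The following is a minimal sketch; the `derived_exception` class, the `name()` member and the helper functions are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical hierarchy, for illustration only.
struct some_exception {
    virtual ~some_exception() {}
    virtual std::string name() const { return "some_exception"; }
};

struct derived_exception : some_exception {
    std::string name() const override { return "derived_exception"; }
};

// "throw e;" copy-initializes a new exception object from e using the
// static type of the expression, so the derived part is sliced away.
std::string rethrow_by_copy() {
    try {
        try {
            throw derived_exception();
        } catch (some_exception const& e) {
            throw e;
        }
    } catch (some_exception const& e) {
        return e.name();
    }
    return std::string();  // not reached
}

// "throw;" rethrows the existing exception object, dynamic type intact.
std::string rethrow_in_place() {
    try {
        try {
            throw derived_exception();
        } catch (some_exception const&) {
            throw;
        }
    } catch (some_exception const& e) {
        return e.name();
    }
    return std::string();  // not reached
}

// Usage: the two rethrow forms produce different dynamic types.
void check_rethrow() {
    assert(rethrow_by_copy() == "some_exception");
    assert(rethrow_in_place() == "derived_exception");
}
```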

The "argument" to throw is an expression and it is the type of the expression that determines the type of the exception object thrown. The type of the expression thrown doesn't necessarily have to be a polymorphic type so there may not be a way to determine if the expression actually refers to a base class subobject of a more derived type.
The simpler "type of the expression" rule also means that the implementation doesn't have to dynamically determine the size and type of the exception object at runtime which might require more complex and less efficient code to be generated for exception handling. If it had to do this it would represent the only place in a language where a copy constructor for a type unknown at the call point was required. This might add significantly to the cost of implementation.
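Since the language will not copy by dynamic type on its own, a common workaround is to let each class throw itself through a virtual function, so that the `throw` expression always has the right static type. The sketch below is an assumption-laden illustration of that idiom, not something the answers above prescribe; `base_error`, `derived_error`, `raise()` and `thrown_type_via_raise()` are hypothetical names.

```cpp
#include <cassert>
#include <string>

struct base_error {
    virtual ~base_error() {}
    virtual std::string name() const { return "base_error"; }
    // Inside this function the static type of *this is base_error.
    virtual void raise() const { throw *this; }
};

struct derived_error : base_error {
    std::string name() const override { return "derived_error"; }
    // Each derived class must override raise(); here the static type of
    // *this is derived_error, so a derived_error is thrown.
    void raise() const override { throw *this; }
};

// Throws through a base reference and reports what was actually thrown.
std::string thrown_type_via_raise(base_error const& e) {
    try {
        e.raise();
    } catch (base_error const& caught) {
        return caught.name();
    }
    return std::string();  // not reached
}

// Usage: the dynamic type survives even though only a base reference is used.
void check_raise() {
    derived_error d;
    base_error b;
    assert(thrown_type_via_raise(d) == "derived_error");
    assert(thrown_type_via_raise(b) == "base_error");
}
```

The cost of this approach is exactly what the answer describes: every class in the hierarchy has to cooperate, because the compiler cannot generate the copy for a type it does not know at the throw site.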

Consider that we can have references to objects where the static type of the reference is copyable, but the dynamic type of the object is not.
struct foo {};
struct ncfoo : foo
{
ncfoo() {} // needed: declaring a copy constructor suppresses the implicit default constructor
private:
ncfoo(ncfoo const&) {}
};
ncfoo g_ncfoo;
void fun()
{
foo& ref = g_ncfoo;
throw ref; // what should be thrown here?
}
If you say "in this case just throw the static type", then what are the exact rules? What does "in this case" mean? References we just caught are "re-thrown" without copying, everything else is copied? Hm...
But however you define the rule, it would still be confusing. Throwing by-reference would lead to different behavior, depending on where we got that reference from. Neh. C++ is already complicated and confusing enough :)

Related

c++ exceptions throw by value catch by reference

In C++, when throwing an object by value, as in throw Exception(), a temporary object is created. How can it be caught by reference? I know it works, but if it were a function return value or a function call argument, it would fail without adding const to the type. What is the difference?
First, when you write
throw Exception();
what's being thrown isn't actually the temporary object created by the prvalue expression Exception(). Conceptually, there's a separate object - the exception object - that's initialized from that temporary object, and it is the exception object that's actually thrown. (Compilers are allowed to elide the copy/move, though.)
Second, the language rules say that the exception object is always considered an lvalue. Hence it is allowed to bind to non-const lvalue references.
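A minimal sketch of the second point: the handler binds a non-const lvalue reference to the exception object and modifies it. The `Exception` struct with a `code` member and the function name are assumptions made for this illustration.

```cpp
#include <cassert>

// Hypothetical exception type, for illustration only.
struct Exception {
    int code;
    explicit Exception(int c) : code(c) {}
};

// For comparison, this would not compile, because a prvalue cannot bind
// to a non-const lvalue reference:
//     Exception make();
//     Exception& r = make();

int catch_and_modify() {
    try {
        throw Exception(1);
    } catch (Exception& e) {  // fine: the exception object is an lvalue
        e.code = 42;          // modifies the exception object itself
        return e.code;
    }
    return -1;  // not reached
}

// Usage: the modification made through the reference is observable.
void check_catch_and_modify() {
    assert(catch_and_modify() == 42);
}
```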

"Safe" dynamic cast?

I'm familiar with how to do a dynamic cast in C++, as follows:
myPointer = dynamic_cast<Pointer*>(anotherPointer);
But how do you make this a "safe" dynamic cast?
When dynamic_cast cannot cast a pointer because the object it points to is not a complete object of the required class (or of a class derived from it), it returns a null pointer to indicate the failure.
If dynamic_cast is used to convert to a reference type and the conversion is not possible, an exception of type bad_cast is thrown instead.
Q But how do you make this a "safe" dynamic cast?
A It will be a safe dynamic cast as long as the argument to dynamic_cast is a valid pointer (including NULL). If you pass a dangling pointer or a value that is garbage, then the call to dynamic_cast is not guaranteed to be safe. In fact, the best case scenario is that the run time system throws an exception and you can deal with it. The worst case scenario is that it is undefined behavior. You can get one behavior now and a different behavior next time.
Most ways in which you might attempt to abuse dynamic_cast result in a compiler error (for example, trying to cast to a type that's not in a related polymorphic hierarchy).
There are also two runtime behaviours for times when you effectively use dynamic_cast to ask whether a particular pointer actually addresses an object of a specific derived type:
if (Derived* p = dynamic_cast<Derived*>(p_base))
{
// ...can use p in here...
}
else
{
// ...p_base doesn't point to an object of Derived type, nor anything further
// derived from Derived...
}
try
{
Derived& d = dynamic_cast<Derived&>(*p_base);
// ...use d...
}
catch (const std::bad_cast& e)
{
// ...wasn't Derived or further derived class...
}
The above is "safe" (defined behaviour) as long as p_base is either nullptr/0 or really does point to an object derived from Base, otherwise it's Undefined Behaviour.
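The two runtime behaviours can be wrapped into small helpers, as in the sketch below. `Base`, `Derived` and the helper names are hypothetical and stand in for whatever polymorphic hierarchy is in use.

```cpp
#include <cassert>
#include <typeinfo>  // std::bad_cast

// Hypothetical polymorphic hierarchy, for illustration only.
struct Base { virtual ~Base() {} };
struct Derived : Base {};

// Pointer form: failure is reported as a null pointer.
// Passing a null pointer is fine and simply yields a null result.
bool is_derived_ptr(Base* p_base) {
    return dynamic_cast<Derived*>(p_base) != nullptr;
}

// Reference form: failure is reported by throwing std::bad_cast.
bool is_derived_ref(Base& base) {
    try {
        Derived& d = dynamic_cast<Derived&>(base);
        (void)d;  // only the success of the cast matters here
        return true;
    } catch (const std::bad_cast&) {
        return false;
    }
}

// Usage: both forms agree on valid objects.
void check_dynamic_cast() {
    Base b;
    Derived d;
    assert(!is_derived_ptr(&b));
    assert(is_derived_ptr(&d));
    assert(!is_derived_ptr(nullptr));
    assert(!is_derived_ref(b));
    assert(is_derived_ref(d));
}
```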
Additionally, there is a runtime-unsafe thing you can do with a dynamic_cast<>, yielding Undefined Behaviour:
Standard 12.7/6: "If the operand of the dynamic_cast refers to the object under construction or destruction and the static type of the operand is not a pointer to or object of the constructor or destructor’s own class or one of its bases, the dynamic_cast results in undefined behavior.". The Standard provides an example to illustrate this.

Catch by reference when exception variable is not defined

When catching an exception the standard guidance is to throw by value, catch by reference. As I understand it, this is for two reasons:
1. If the exception was thrown due to an out of memory exception, we won't call a copy constructor which could potentially terminate the program.
2. If the exception is part of an inheritance hierarchy, we may potentially have object slicing on the exception.
If we have a scenario where we don't define the exception name in the catch block are these concerns (really 1., as slicing won't be an issue if we don't have a name for the variable) still valid?
For example:
catch(my_exception)
{ ... }
or
catch(my_exception &)
{ ... }
Is there still the possibility of the program terminating if the exception is caught by value in this case? My feeling is that it is technically still possible.
Note: I am asking this because I have had to review code in which someone put a catch by value in this case. As shown in the question I am not entirely sure on the technical impact of either choice, but I think that in terms of consistency it is better to catch by reference in this case regardless (there is no downside to catching by reference in any case).
The standard does not require special optimization in the case of an unnamed exception object. On the contrary, it then requires an effect as if a temporary is copy-initialized. This copying can result in memory being dynamically allocated.
N3290 §15.3/16:
If the exception-declaration specifies a name, it declares a variable which is copy-initialized (8.5) from the
exception object. If the exception-declaration denotes an object type but does not specify a name, a temporary (12.2) is copy-initialized (8.5) from the exception object. The lifetime of the variable or temporary
ends when the handler exits, after the destruction of any automatic objects initialized within the handler.
The paragraph above does not mention catching by reference, and so one might reasonably conclude that it applies whether or not the exception object is caught by reference; that a copy is constructed anyway.
However, that is contradicted by the next paragraph:
N3290 §15.3/17:
When the handler declares a non-constant object, any changes to that object will not affect the temporary
object that was initialized by execution of the throw-expression. When the handler declares a reference to
a non-constant object, any changes to the referenced object are changes to the temporary object initialized
when the throw-expression was executed and will have effect should that object be rethrown.
So, declared type T& (with T non-const) is the single case where C++11 requires a reference directly to the thrown object, instead of copying. And it is also that way in C++03, except that C++03 has some additional wording about as-if optimization. So, formally, the preference should be for
catch( T& name )
and
catch( T& )
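The behaviour described in §15.3/17 can be observed by modifying the caught object and then rethrowing. This is a minimal sketch under the assumption of a simple `Error` struct with a `code` member; all names in it are hypothetical.

```cpp
#include <cassert>

// Hypothetical exception type, for illustration only.
struct Error {
    int code;
    explicit Error(int c) : code(c) {}
};

// The handler declares an object: it is a copy, so the change is lost
// when the original exception object is rethrown.
int code_seen_after_modifying_copy() {
    try {
        try {
            throw Error(1);
        } catch (Error e) {
            e.code = 2;
            throw;
        }
    } catch (Error& e) {
        return e.code;
    }
    return -1;  // not reached
}

// The handler declares a reference to non-const: the change is made to
// the exception object itself and survives the rethrow.
int code_seen_after_modifying_reference() {
    try {
        try {
            throw Error(1);
        } catch (Error& e) {
            e.code = 2;
            throw;
        }
    } catch (Error& e) {
        return e.code;
    }
    return -1;  // not reached
}

// Usage: only the by-reference handler affects what is rethrown.
void check_rethrow_after_modification() {
    assert(code_seen_after_modifying_copy() == 1);
    assert(code_seen_after_modifying_reference() == 2);
}
```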
However, I have always caught exceptions like catch( T const& ). From a practical point of view one may assume that the compiler will optimize that to a direct reference to the thrown object, even though it is possible to devise cases where the observed program effect would then be non-standard. For example <evil grin>
#include <stdio.h>
#include <stdexcept>
struct Error
: std::runtime_error
{
public:
static Error* pThrown;
char const* pMessage_;
Error()
: std::runtime_error( "Base class message" )
, pMessage_( "Original message." )
{
printf( "Default-construction of Error object.\n" );
pThrown = this;
}
Error( Error const& other )
: std::runtime_error( other )
, pMessage_( other.pMessage_ )
{
printf( "Copy-construction of Error object.\n" );
}
char const* what() const throw() { return pMessage_; }
};
Error* Error::pThrown = 0;
int main()
{
printf( "Testing non-const ref:\n" );
try
{
throw Error();
}
catch( Error& x )
{
Error::pThrown->pMessage_ = "Modified message.";
printf( "%s\n", x.what() );
}
printf( "\n" );
printf( "Testing const ref:\n" );
try
{
throw Error();
}
catch( Error const& x )
{
Error::pThrown->pMessage_ = "Modified message";
printf( "%s\n", x.what() );
}
}
With both MinGW g++ 4.4.1 and Visual C++ 10.0 the output is …
Testing non-const ref:
Default-construction of Error object.
Modified message.
Testing const ref:
Default-construction of Error object.
Modified message
A pedantic formalist might say that both compilers are non-conforming, failing to create a copy for the Error const& case. A purely practical practitioner might say that hey, what else did you expect? And me, I say that the wording in the standard is very far from perfect here, and that if anything, one should expect a clarification to explicitly allow the output above, so that also catching by reference to const is both safe and maximally efficient.
Summing up wrt. the OP's question:
Catching by reference won’t call a copy constructor which could potentially terminate the program.
The standard only guarantees this for reference to non-const.
In practice, as shown, it is also guaranteed for reference to const, even when program results are then affected.
I would prefer catching by reference. The compiler could discard the exception as an optimization and do no copy, but that's only a possibility. Catching by reference gives you certainties.
There might be invisible data associated with your exception, e.g. a vtable.
The vtable is an invisible data structure that carries information about where certain, polymorphic (i.e. virtual) member functions can be found. This table in the general case costs a bit of memory that is held in the object itself. This may be the size of a pointer into some external table, or even the complete table. As always, it depends.
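The hidden per-object cost can usually be seen with sizeof. The sketch below assumes a typical implementation that stores a vtable pointer in each polymorphic object; the standard does not mandate this layout, and `Plain` and `WithVirtual` are hypothetical types.

```cpp
#include <cassert>

// Same visible data member in both types; only one is polymorphic.
struct Plain {
    int value;
};

struct WithVirtual {
    int value;
    virtual ~WithVirtual() {}
};

// On common implementations the polymorphic type is larger, because each
// object also holds a pointer to its class's vtable. This is an
// implementation detail, not a guarantee of the language.
bool virtual_adds_hidden_data() {
    return sizeof(WithVirtual) > sizeof(Plain);
}

// Usage: expected to hold on mainstream compilers.
void check_hidden_data() {
    assert(virtual_adds_hidden_data());
}
```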

so what is the type of "this" ? Why is "this" not a lvalue?

Say the object is
class A {
public : void Silly(){
this = 0x12341234; // error: "this" is not an lvalue
}
};
I know I will get compiler error ' "this" is not a lvalue.' But then it is not a temporary either. So what is the hypothetical declaration of "this" ?
Compiler : GCC 4.2 compiler on mac.
For some class X, this has the type X* this;, but you're not allowed to assign to it, so even though it doesn't actually have the type X *const this, it acts almost like it was as far as preventing assignment goes. Officially, it's a prvalue, which is the same category as something like an integer literal, so trying to assign to it is roughly equivalent to trying to assign a different value to 'a' or 10.
Note that in early C++, this was an lvalue -- assigning to this was allowed -- you did that to handle the memory allocation for an object, vaguely similar to overloading new and delete for the class (which wasn't supported yet at that time).
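The type of `this` can be checked with decltype, which for a prvalue expression yields the plain type without any reference. The following is a small sketch; the member function names are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

class A {
public:
    // In a non-const member function, `this` has type A* (not A* const).
    bool type_in_non_const_member() {
        return std::is_same<decltype(this), A*>::value;
    }

    // In a const member function, `this` has type const A*.
    bool type_in_const_member() const {
        return std::is_same<decltype(this), const A*>::value;
    }

    // Copying the value of `this` into a real variable is fine.
    A* self() {
        A* copy = this;
        return copy;
    }

    // Neither of these compiles, because `this` is a prvalue:
    //     void silly()   { this = nullptr; }
    //     A** address()  { return &this; }
};

// Usage: the checks hold for any object of type A.
void check_this_type() {
    A a;
    assert(a.type_in_non_const_member());
    assert(a.type_in_const_member());
    assert(a.self() == &a);
}
```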
It is impossible to provide a "declaration" for this. There's no way to "declare" an rvalue in C++. And this is an rvalue, as you already know.
Lvalueness and rvalueness are the properties of expressions that produce these values, not the properties of declarations or objects. In that regard, one can even argue that it is impossible to declare an lvalue either. You declare an object. Lvalue is what is produced when you use the name of that object as an expression. In that sense both "to declare an rvalue" and "to declare an lvalue" are oxymoron expressions.
Your question also seems to suggest that the properties of "being an lvalue" and "being a temporary" are somehow complementary, i.e. everything is supposedly an lvalue or a temporary. In reality, the property of "being a temporary" has no business being here. All expressions are either lvalues or rvalues. And this happens to be an rvalue.
Temporaries, on the other hand, can be perceived as rvalues or as lvalues, depending on how you access the temporary.
P.S. Note, BTW, that in C++ (as opposed to C) ordinary functions are lvalues.
For one thing, this is not a variable - it's a keyword. When used as an rvalue, its type is A * or A const *. In modern C++, assigning to this is prohibited. You cannot take the address of this, either. In other words, it's not a valid lvalue.
To answer the second part, "why is this not an lvalue", I'm speculating as to the committee's actual motivation, but advantages include:
assigning to this doesn't make much logical sense, so there's no particular need for it to appear on the left-hand-side of assignments. Making it an rvalue emphasises that this doesn't make much sense by forbidding it, and means that the standard doesn't have to define what happens if you do it.
making it an rvalue prevents you taking a pointer to it, which in turn relieves the implementation of any need to furnish it with an address, just like a register-modified automatic variable. It could for example devote a register in non-static member functions to storing this. If you take a const reference, then unless the use permits cunning optimization it needs to be copied somewhere that has an address, but at least it needn't be the same address if you do it twice in quick succession, as it would need to be if this were a declared variable.
You get a compiler error because this behaves like a const pointer to the instance on which the member function was called. You can't assign to it, although you can use it to change other class members in non-const qualified methods and to call methods and operators. Also note that, because this refers to an instance, static methods do not have a this pointer.
Hypothetical:
class Whatever
{
// your error because this is Whatever* const this;
void DoWhatever(const Whatever& obj) { this = &obj; }
// this is ok
void DoWhatever(const Whatever& obj) { *this = obj; }
// error because this is now: const Whatever* const this;
void DoWhatever(const Whatever& obj) const { *this = obj; }
// error because this doesn't exist in this scope
static void DoWhatever(const Whatever& obj) { *this = obj; }
};

Weird use of `?:` in `typeid` code

In one of the projects I'm working on, I'm seeing this code
struct Base {
virtual ~Base() { }
};
struct ClassX {
bool isHoldingDerivedObj() const {
return typeid(1 ? *m_basePtr : *m_basePtr) == typeid(Derived);
}
Base *m_basePtr;
};
I have never seen typeid used like that. Why does it do that weird dance with ?:, instead of just doing typeid(*m_basePtr)? Could there be any reason? Base is a polymorphic class (with a virtual destructor).
EDIT: At another place of this code, I'm seeing this and it appears to be equivalently "superfluous"
template<typename T> T &nonnull(T &t) { return t; }
struct ClassY {
bool isHoldingDerivedObj() const {
return typeid(nonnull(*m_basePtr)) == typeid(Derived);
}
Base *m_basePtr;
};
I think it is an optimisation! A little known and rarely (you could say "never") used feature of typeid is that a null dereference of the argument of typeid throws an exception instead of the usual UB.
What? Are you serious? Are you drunk?
Indeed. Yes. No.
Base *p = 0; // Base is a polymorphic class, as in the question
*p; // UB
typeid (*p); // throws std::bad_typeid
Yes, this is ugly, even by the C++ standard of language ugliness.
OTOH, this does not work anywhere inside the argument of typeid, so adding any clutter will cancel this "feature":
Base *p = 0;
typeid(1 ? *p : *p); // UB
typeid(identity(*p)); // UB
For the record: I am not claiming in this message that automatic checking by the compiler that a pointer is not null before doing a dereference is necessarily a crazy thing. I am only saying that doing this check when the dereference is the immediate argument of typeid, and not elsewhere, is totally crazy. (Maybe it was a prank inserted in some draft, and never removed.)
For the record: I am not claiming in the previous "For the record" that it makes sense for the compiler to insert automatic checks that a pointer is not null, and to throw an exception (as in Java) when a null is dereferenced: in general, throwing an exception on a null dereference is absurd. This is a programming error so an exception will not help. An assertion failure is called for.
The only effect I can see is that 1 ? X : X gives you X as an rvalue instead of plain X which would be an lvalue. This can matter to typeid() for things like arrays (decaying to pointers) but I don't think it would matter if Derived is known to be a class. Perhaps it was copied from someplace where the rvalue-ness did matter? That would support the comment about "cargo cult programming".
Regarding the comment below I did a test and sure enough typeid(array) == typeid(1 ? array : array), so in a sense I'm wrong, but my misunderstanding could still match the misunderstanding that led to the original code!
This behaviour is covered by [expr.typeid]/2 (N3936):
When typeid is applied to a glvalue expression whose type is a polymorphic class type, the result refers to a std::type_info object representing the type of the most derived object (that is, the dynamic type) to which the glvalue refers. If the glvalue expression is obtained by applying the unary * operator to a pointer and the pointer is a null pointer value, the typeid expression throws an exception of a type that would match a handler of type std::bad_typeid exception.
The expression 1 ? *p : *p is always an lvalue. This is because *p is an lvalue, and [expr.cond]/4 says that if the second and third operand to the ternary operator have the same type and value category, then the result of the operator has that type and value category also.
Therefore, 1 ? *m_basePtr : *m_basePtr is an lvalue with type Base. Since Base has a virtual destructor, it is a polymorphic class type.
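Both claims can be checked at compile time. The sketch below relies on the rule that decltype applied to a non-id expression yields T& exactly when the expression is an lvalue; the `Base` definition repeats the one from the question and the helper function name is hypothetical.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>  // std::declval

struct Base {
    virtual ~Base() {}
};

// The conditional expression with two lvalue operands of type Base is
// itself an lvalue of type Base, so decltype yields Base&.
static_assert(
    std::is_same<decltype(true ? *std::declval<Base*>()
                               : *std::declval<Base*>()),
                 Base&>::value,
    "conditional of two Base lvalues is a Base lvalue");

// A class with a virtual destructor is a polymorphic class type.
static_assert(std::is_polymorphic<Base>::value, "Base is polymorphic");

// Runtime view of the same fact: the result designates the very same
// object, so its address can be taken and compared.
bool conditional_keeps_lvalue() {
    Base b;
    Base* p = &b;
    return &(true ? *p : *p) == &b;
}

// Usage
void check_conditional_lvalue() {
    assert(conditional_keeps_lvalue());
}
```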
Therefore, this code is indeed an example of "When typeid is applied to a glvalue expression whose type is a polymorphic class type" .
Now we can read the rest of the above quote. The glvalue expression was not "obtained by applying the unary * operator to a pointer" - it was obtained via the ternary operator. Therefore the standard does not require that an exception be thrown if m_basePtr is null.
The behaviour in the case that m_basePtr is null would be covered by the more general rules about dereferencing a null pointer (which are a bit murky in C++ actually but for practical purposes we'll assume that it causes undefined behaviour here).
Finally: why would someone write this? I think curiousguy's answer is the most plausible suggestion so far: with this construct, the compiler does not have to insert a null pointer test and code to generate an exception, so it is a micro-optimization.
Presumably the programmer is either happy enough that this will never be called with a null pointer, or happy to rely on a particular implementation's handling of null pointer dereference.
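For contrast, here is a sketch of the direct form, where [expr.typeid]/2 does require an exception for a null pointer. `Base` and `Derived` mirror the question's types; `holds_derived()` and `classify()` are hypothetical helpers introduced for this illustration.

```cpp
#include <cassert>
#include <typeinfo>  // std::bad_typeid

struct Base {
    virtual ~Base() {}
};
struct Derived : Base {};

// Direct form: the operand of typeid is obtained by applying unary * to
// a pointer, so a null pointer must produce std::bad_typeid.
bool holds_derived(const Base* p) {
    return typeid(*p) == typeid(Derived);
}

// Returns 1 for Derived, 0 for any other dynamic type, -1 for null.
int classify(const Base* p) {
    try {
        return holds_derived(p) ? 1 : 0;
    } catch (const std::bad_typeid&) {
        return -1;
    }
}

// Usage
void check_classify() {
    Base b;
    Derived d;
    assert(classify(&d) == 1);
    assert(classify(&b) == 0);
    assert(classify(nullptr) == -1);
}
```

The null test and the throwing path are what the `?:` and `nonnull` variants in the question avoid, at the price of undefined behaviour if the pointer ever is null.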
I suspect some compiler was, for the simple case of
typeid(*m_basePtr)
returning typeid(Base) always, regardless of the runtime type. But turning it to an expression/temporary/rvalue made the compiler give the RTTI.
Question is which compiler, when, etc. I think GCC had problems with typeid early on, but it is a vague memory.