I'd like to do this:
struct Derived;
struct Base{
Derived const& m_ref;
Base(Derived const& ref) : m_ref(ref){}
};
struct Derived: Base{
Derived(): Base(*this){}
};
But I seem to get unreliable behaviour (when used later on, m_ref points to things that aren't valid Derived).
Is it permissible to construct a reference to Derived from *this before the class has been initialised?
I appreciate that it is not valid to use such a reference until it has been initialised, but I don't see how changes to the initialisation of a class can affect references to it (since initialising it doesn't move it around in memory...).
I'm not sure what to call what I'm trying to do, so my search for information on this has drawn a blank...
Update:
I can't reproduce my problems with a simple test case, so it looks like it is probably okay (though I can't prove it, and would still welcome a definitive answer).
Suspect my problems arose from a broken copy-assignment operator. That's another matter altogether though!
Update 2
My copy constructor and copy-assignment operators were indeed to blame, and now this seems to work reliably. Still interested in whether or not it is well-defined behaviour though.
3.8/1 says:
The lifetime of an object of type T begins when:
— storage with the proper alignment and size for type T is obtained, and
— if T is a class type with a non-trivial constructor (12.1), the constructor call has completed.
3.8/5 says:
Before the lifetime of an object has started but after the storage which the object will occupy has been allocated
or, after the lifetime of an object has ended and before the storage which the object occupied is
reused or released, any pointer that refers to the storage location where the object will be or was located
may be used but only in limited ways. Such a pointer refers to allocated storage (3.7.3.2), and using the
pointer as if the pointer were of type void*, is well-defined. Such a pointer may be dereferenced but the
resulting lvalue may only be used in limited ways, as described below.
"Below" is 3.8/6:
Such an lvalue refers to allocated storage (3.7.3.2), and using the properties of the lvalue which do not
depend on its value is well-defined.
...and then a list of things you can't do. Binding to a reference to the same, derived type is not among them.
I can't find anything elsewhere that might make your code invalid. Notably, despite the following phrase in 8.3.2/4:
A reference shall be initialized to
refer to a valid object or function.
there doesn't seem to be any definition of "valid object" to speak of.
So, after much to-ing and fro-ing, I must conclude that it is legal.
Of course, that's not to say that it's in any way a good idea! It still looks like a bad design.
For example, if you later change your base constructor and any of the following become relevant (again from 3.8/6):
the lvalue is used to access a non-static data member or call a non-static member function of the object
the lvalue is implicitly converted (4.10) to a reference to a base class type
the lvalue is used as the operand of a static_cast (5.2.9) (except when the conversion is ultimately to char& or unsigned char&)
the lvalue is used as the operand of a dynamic_cast (5.2.7) or as the operand of typeid.
...then your program will be undefined, and the compiler may emit no diagnostic for this!
Random other observations
I noticed a couple of other interesting things whilst compiling this answer, and this is as good a place as any to share them.
First, 9.3.2 appears to leave the type of this in a ctor-initializer accidentally unspecified. Bizarre!
Second, the criteria set on a pointer by 3.8/5 (not the same list that I quoted from 3.8/6) include:
If the object will be or was of a
non-POD class type, the program has
undefined behavior if [..] the
pointer is implicitly converted (4.10)
to a pointer to a base class type.
I believe that this renders the following innocuous-looking code undefined:
struct A {
A(A* ptr) {}
};
struct B : A {
B() : A(this) {}
};
int main() {
B b;
}
I think in general you're OK doing this, but be very careful in constructors and destructors. In particular, in Base::~Base, the Derived part of the object has already been destroyed so don't use m_ref there.
3.8/6 says what you can do with a pointer/reference to memory for an object that has been allocated but not yet constructed. Your code doesn't provoke an lvalue-to-rvalue conversion, or otherwise break the rules, so I'd think that you're fine. Since you're observing bad values, though, I may well have missed something. Or your code might be otherwise bad.
Even if you did break those rules, 12.6.2 and 12.7 list additional things that you can do during construction and destruction.
Edit: ah, 8.3.2/4: "A reference shall be initialized to refer to a valid object or function." You initialize m_ref to refer to an object whose constructor hasn't even been entered yet. I don't know without further research whether an object under construction is "valid" or not, and in particular whether the object of most-derived type is "valid" at the time of construction of the base class. This could perhaps be the problem, though.
You might think that no unconstructed object is "valid", but then this would be invalid:
class Foo {
Foo() {
Foo &self = *this; // reference initialized to refer to unconstructed object!
}
};
So, is that invalid? If not, does the most-derived object become valid somewhere between the start of the base class constructor call and the start of the derived class constructor call? I dunno, sorry.
I think the big problem in this is that you think you want to do one thing, when in reality you actually want to do something else. Strange, huh?
Is it permissible to construct a reference to Derived from *this before the class has been initialised?
Yes, as long as you don't use it (for anything but storing a reference to it) in the scope of the Base constructor and remember in ~Base that Derived is destroyed before Base.
But why on earth do you think that Base wants to know of Derived? If it's static polymorphism you are after, then the curiously recurring template pattern is what you want:
template <typename T>
class Base {};
class Derived : public Base<Derived> {};
But I don't really think that's what you're aiming at.
Maybe you want a way for Base to communicate with a client and think that should be done with inheritance? If so, then this observer-ish idiom is what you need:
class Client
{
public:
virtual void Behavior() = 0;
protected:
~Client() {}
};
class Base
{
Client& client_;
public:
Base(Client& client) : client_(client) {}
};
class Implementor : public Client, public Base // Base must be a base class for Base(*this) to compile
{
public:
Implementor() : Base(*this) {}
virtual void Behavior() { ... }
};
If not even that is what you want, then you need to rethink your design.
I'm actually implementing a generic base class that takes a template parameter class and derives from it, and adds a "safe bool" conversion based on the result of a function call on the derived type. I'd like to avoid using virtual functions, if possible, because I really do care about performance in some of the places I'd like to use this. – Autopulated
You don't need a reference to the Derived class. Your class is deriving from a template parameter. Just use the common method.
#include <iostream>
template <class T>
class Base : public T
{
public:
bool operator!() const
{
return !this->isOk();
}
};
class TemplateClass
{
public:
bool isOk() const
{
return true;
}
};
int main (int argc, char* argv[])
{
Base<TemplateClass> myClass;
if (!!myClass)
{
std::cout << "ok" << std::endl;
}
else
{
std::cout << "not ok" << std::endl;
}
return 0;
}
You can even use template specialization if you know ahead of time of derived classes that don't implement a common bool check.
Related
We ran into this scenario in our codebase at my work, and we had a big debate over whether this is valid C++ or not. Here is the simplest code example I could come up with:
#include <iostream>
template <class T>
class A {
public:
A() { subclass = static_cast<T*>(this); }
virtual void Foo() = 0;
protected:
T* subclass;
};
class C : public A<C> {
public:
C(int i) : i(i) { }
virtual void Foo() { subclass->Bar(); }
void Bar() { std::cout << "i is " << i << std::endl; }
private:
int i;
};
int main() {
C c(5);
c.Foo();
return 0;
}
This code works 100% of the time in practice (as long as the template parameter type matches the subclass type), but if we run it through a runtime analyzer, it tells us that the static_cast is invalid because we're casting this to a C* but the C constructor hasn't run yet. Sure enough, if we change the static_cast to a dynamic_cast, it returns nullptr and this program will fail and crash when accessing i in Bar().
My intuition is that it should always be possible to replace static_cast with dynamic_cast without breaking your code, suggesting that the original code in fact is depending on compiler-specific undefined behavior. However, on cppreference it says:
If the object expression refers or points to is actually a base class subobject of an object of type D, the result refers to the enclosing object of type D.
The question being, is it a base class subobject of an object of type D before the object of type D has finished being constructed? Or is this undefined behavior? My level of C++ rules lawyering is not strong enough to work this out.
In my opinion this is well-defined according to the current wording of the standard: the C object exists at the time of the static_cast, although it is under construction and its lifetime has not yet begun. This would seem to make the static_cast well-defined according to [expr.static.cast]/11, which reads in part:
... If the prvalue of type “pointer to cv1 B” points to a B that is actually a base class subobject of an object of type D, the resulting pointer points to the enclosing object of type D. Otherwise, the behavior is undefined.
It doesn't say that the D object's lifetime must have begun.
We might also want to look at the explicit rule about when it becomes legal to perform an implicit conversion from derived to base, [class.cdtor]/3:
To explicitly or implicitly convert a pointer (a glvalue) referring to an object of class X to a pointer (reference) to a direct or indirect base class B of X, the construction of X and the construction of all of its direct or indirect bases that directly or indirectly derive from B shall have started and the destruction of these classes shall not have completed, otherwise the conversion results in undefined behavior. To form a pointer to (or access the value of) a direct non-static member of an object obj, the construction of obj shall have started and its destruction shall not have completed, otherwise the computation of the pointer value (or accessing the member value) results in undefined behavior.
According to this rule, as soon as the compiler starts constructing the base class A<C>, it is well-defined to implicitly convert from C* to A<C>*. Before that point, it results in UB. The reason for this, basically, has to do with virtual base classes: if the path by which A<C> is inherited by C contains any virtual inheritance, the conversion may rely on data that are set up by one of the constructors in the chain. For a conversion from base to derived, if there is indeed any virtual inheritance on the chain, static_cast will not compile, so we don't really need to ask ourselves the question, but are those data sufficient for going the other way?
I really can't see anything in the text of the standard, nor any rationale, for the static_cast in your example not being well-defined, nor in any other case of static_casting from base to derived when the reverse implicit conversion (or static_cast) would be allowed (excepting the case of virtual inheritance, which as I said before, leads to a compile error anyway).
(Would it be well-defined to do it even earlier? In most cases this won't be possible; how could you possibly attempt to static_cast from B* to D* before the conversion from D* to B* is allowed, without having obtained the B* pointer precisely by doing the latter? If the answer is that you got from D* to B* through an intermediate base class C1 whose constructor has started, but there is another intermediate base class C2 sharing the same B base class subobject and its construction hasn't started yet, then B is a virtual base class, and again, this means the compiler will stop you from then trying to static_cast from B* back down to D*. So I think there are no issues left to resolve here.)
Inspired by my (currently deleted) answer to this question (but there's a summary in my comment thereto), I was wondering whether the constructor for the Derived class in the code below exhibits undefined behaviour.
#include <iostream>
class Base {
public:
Base(int test) {
std::cout << "Base constructor, test: " << test << std::endl;
}
};
class Derived : public Base {
private:
int variable;
public:
Derived() : Base(variable = 50) { // Is this undefined behaviour?
}
};
int main()
{
Derived derived;
return 0;
}
I know that, when the base c'tor is called, the derived object has not yet (formally) been constructed, so the lifetime of the variable member has not yet started. However, this excerpt from the C++ Standard (thanks to NathanOliver for the reference) suggests (maybe) that the object may be used "in limited ways" (bolding mine):
7 Similarly, before the lifetime of an object has started
but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before
the storage which the object occupied is reused or released, any
glvalue that refers to the original object may be used but only in
limited ways. For an object under construction or destruction, see
[class.cdtor]. Otherwise, such a glvalue refers to allocated storage
([basic.stc.dynamic.allocation]), and using the properties of the
glvalue that do not depend on its value is well-defined. …
Clearly, if variable were an object which itself had a non-trivial constructor, there would (almost certainly) be undefined behaviour here. However, for a primitive (or POD) type like an int, can we assume that the storage for it has been allocated? And, if so, does the last phrase of the above quote hold, or is this still UB?
As an aside, neither clang-cl nor MSVC, even with full warnings enabled, gives any diagnostic for the code shown, and it runs as expected. However, I appreciate that neither of those tools qualifies as a formal interpretation/enforcement of the C++ Standard.
The behavior is undefined, regardless of whether or not the Base constructor accepts the parameter by reference.
When control reaches Base(variable = 50), the lifetime of variable hasn't started yet, because data members are initialized after base classes.
So first, writing to it causes UB because the lifetime hasn't started yet. Then, because you pass by value, reading from it is also UB.
[class.base.init]/13
In a non-delegating constructor, initialization proceeds in the following order:
— First, and only for the constructor of the most derived class ..., virtual base classes are initialized ...
— Then, direct base classes are initialized ...
— Then, non-static data members are initialized in the order they were declared in the class definition ...
— Finally, the ... constructor body is executed.
Idea by @Jarod42: as an experiment, you can try this in a constexpr context, which is supposed to catch UB.
#include <type_traits>
#include <iostream>
struct Base
{
int x;
constexpr Base(int x) : x(x) {}
};
struct Derived : Base
{
int variable;
constexpr Derived() : Base(variable = 42) {}
};
constexpr Derived derived;
Clang rejects this with:
error: constexpr variable 'derived' must be initialized by a constant expression
note: assignment to object outside its lifetime is not allowed in a constant expression
while GCC and MSVC accept it.
I had this question (beginner C++) in a C++ quiz. My answer was incorrect, and I want to understand the explanation behind the correct answer: "Undefined behavior".
Question:
What will happen in the following code after the function foo() returns?
class base
{
public:
base() { }
~base() { }
};
class derived : public base
{
private:
int *p_pi_values;
public:
derived() : p_pi_values(new int[100]) { }
~derived() { delete [] p_pi_values; }
};
void foo(void)
{
derived *p_derived = new derived();
base *p_base = p_derived;
// Do some other stuff here.
delete p_base;
}
I gave this answer which turned out wrong ==> integer array will not be properly deleted.
Correct Answer ==> The behavior is undefined.
The destructor of your base class isn't virtual. It's simply a rule of the language that if you delete an object through a pointer to a base subobject, the corresponding base class must have a virtual destructor, or otherwise it is undefined behaviour.
(In practice, if your base class doesn't have a virtual destructor, the compiler may not emit the necessary code to perform all the necessary clean-up for the derived object. It will just assume that your object is of the same type as the pointer and not bother to look further, as indeed the polymorphic lookup of the most derived object comes at a cost that you don't want to impose needlessly.)
§5.3.5/3:
In the first alternative (delete object), if the static type of the operand is different from its dynamic type, the static type shall be a base class of the operand’s dynamic type and the static type shall have a virtual destructor or the behavior is undefined
You should make your destructor virtual in a base class. The problem with the code as it is now is that delete p_base will cause a destructor of base class to be called. The one from the derived class won't be called and the memory allocated for the array of integers won't be freed.
This happens because if a method isn't virtual in a base class, compiler simply looks at a pointer type and calls a method located in this type (in this case - it's a base class) i.e. a decision what method to call is made during compilation time based on the type of the pointer and not the real type of the object the pointer is referring to.
Out of curiosity I checked the C++ specs. The answer to the question is item 3 in section 5.3.5:
In the first alternative (delete object), if the static type of the
object to be deleted is different from its dynamic type, the static
type shall be a base class of the dynamic type of the object to be
deleted and the static type shall have a virtual destructor or the
behavior is undefined.
Personally, I would have answered the same way you did. If you ignore the compiler's warning, the most likely outcome is that the destructor of the derived class won't get called.
I guess the compiler is allowed to optimize this code so that the assignment of p_derived to p_base never happens.
To be more specific, the compiler may optimize the code down to one line:
delete new derived();
Hence the behavior is regarded as undefined, since the outcome can change depending on how the compiler actually optimizes the code.
I would like to know standard's view on dereferencing pointer to base, but I'm not making any progress finding it. Take these two classes for example:
class Base
{
public:
virtual void do_something() = 0;
};
class Derived : public Base
{
public:
virtual void do_something();
};
void foo2(Base *b)
{
Base &b1 = *b; // how this work by standard?
}
void foo()
{
Derived *d = new Derived();
foo2(d); // does this work by standard?
}
So, basically, if a pointer to Base that points to an object of type Derived is dereferenced, will slicing happen in place, or will a temporary emerge? I'm inclined to believe that a temporary is not an option, because that would mean the temporary is an instance of an abstract class.
Whatever the truth, I would appreciate any pointers to the ISO standard that says one or the other. (Or third, for that matter. :) )
EDIT:
I raised the point about a temporary not being an option as a possible line of reasoning for why it behaves the way it does, which is quite logical, but I can't find confirmation in the standard, and I'm not a regular reader of it.
EDIT2:
Through discussion, it became obvious that my question was actually about the mechanism of dereferencing a pointer, and not about slicing or temporaries. I thank everyone for trying to dumb it down for me, and I finally got an answer to the question that puzzled me the most: why I can't find anything in the standard about this... Obviously it was the wrong question, but I've got the right answer.
Thnx
Base &b = *static_cast<Base *>(d); // does this work by standard?
Yes.
But you can simply do this:
Base &b = *d;
//use b polymorphically!
b.do_something(); //calls Derived::do_something()
No need to use static_cast. After all, Derived is derived from Base.
Reply to your edit:
foo2(d); // does this work by standard?
Yes. Pointer of type Base* can be initialized with pointer of type Derived*.
--
Base &b = *b; // how this work by standard?
No. They have the same name. If you mean Base &b1 = *b, then yes, that works. b1 refers to the object pointed to by b.
Object slicing only occurs when the copy constructor or the assignment operator of the base class gets involved somehow, like in parameter passing by value. You can easily avoid these errors by inheriting from Boost's noncopyable for example, even if only in DEBUG mode.
Neither casting pointers or references nor dereferencing involve any copy construction or assignment. Making a Base reference from a Derived reference is perfectly safe, it's even a standard implicit conversion.
In my C++11 draft, 10 [class.derived] /1 says
[ Note: The scope resolution operator :: (5.1) can be used to refer to
a direct or indirect base member explicitly. This allows access to a
name that has been redeclared in the derived class. A derived class
can itself serve as a base class subject to access control; see 11.2.
A pointer to a derived class can be implicitly converted to a pointer
to an accessible unambiguous base class (4.10). An lvalue of a
derived class type can be bound to a reference to an accessible
unambiguous base class (8.5.3). —end note ]
In most implementations, your foo2 function will store Base& b as a Base*. It obviously can't be a Base itself, because that would be a copy, not a reference. Since it acts (at runtime, not syntactically) like a pointer instead of a copy, there are no slicing concerns.
In your code before your edit, the compiler would know that Base& b was actually d, it would be syntactic sugar, and wouldn't even generate a pointer in the assembly.
Can I pass "this" to a function as a pointer, from within the class constructor, and use it to point at the object's members before the constructor returns?
Is it safe to do this, so long as the accessed members are properly initialized before the function call?
As an example:
#include <iostream>
class Stuff
{
public:
static void print_number(void *param)
{
std::cout << reinterpret_cast<Stuff*>(param)->number;
}
int number;
Stuff(int number_)
: number(number_)
{
print_number(this);
}
};
int main() {
Stuff stuff(12345);
}
I thought this wouldn't work, but it seems to. Is this standard behavior, or just undefined behavior going my way?
When you instantiate an object in C++, the body of the constructor is the last thing executed. All other initialization, including memory allocation, base class construction, and member initialization, happens beforehand. The constructor body just performs additional setup once those parts are in place. So it is valid to use the "this" pointer in a class's constructor body and rely on that class's own bases and members having been constructed; only the parts belonging to further-derived classes are not constructed yet.
Of course, you still need to beware of uninitialized member variables, if you haven't already initialized them in your constructor code.
You can find a good answer to this here (C++ FAQ).
All inherited members and members of the calling class are guaranteed to have been constructed at the start of the constructor's code execution and so can be referenced safely within it.
The main gotcha is that you should not call virtual functions on this. Most times I've tried this it just ends up calling the base class's function, but I believe the standard says the result is undefined.
As a side-note on the presented code, I would instead templatize the void*:
class Stuff
{
public:
template <typename T>
static void print_number(const T& t)
{
std::cout << t.number;
}
int number;
Stuff(int number_)
: number(number_)
{
print_number(*this);
}
};
Then you'd get a compile error if the type of t doesn't have a number member.
Andy, I think you're wrong about the undefined part of the standard.
When you're in the constructor, "this" is a pointer to an object whose type is the base class of the object you're creating, which means that virtual functions partially implemented in the base class will be called and the pointers in the virtual table won't be followed.
More info in the C++ Faq Lite...