Is such a downcast safe? - c++

Suppose we have the following code:
#include <memory>
#include <vector>
struct BaseComponent
{
template <typename T>
T * as()
{
return static_cast<T*>(this);
}
virtual ~BaseComponent() {}
};
template <typename T>
struct Component : public BaseComponent
{
virtual ~Component() {}
};
struct PositionComponent : public Component<PositionComponent>
{
float x, y, z;
virtual ~PositionComponent() {}
};
int main()
{
std::vector<std::unique_ptr<BaseComponent>> mComponents;
mComponents.emplace_back(new PositionComponent);
auto *pos = mComponents[0]->as<PositionComponent>();
pos->x = 1337;
return 0;
}
In the T * as() method, should I use a static_cast or a dynamic_cast? Are there times when the conversion will fail? Do I need to dynamic_cast like this instead?
auto *ptr = dynamic_cast<T*>(this);
if(ptr == nullptr)
throw std::runtime_error("D'oh!");
return ptr;

In your case there is no way to tell statically whether this is the right type or not.
What you may want is a CRTP (Curiously recurring template pattern):
template <class T>
struct BaseComponent
{
T* as()
{
return static_cast<T*>(this);
}
virtual ~BaseComponent() {}
};
template <typename T>
struct Component : public BaseComponent<T>
{
virtual ~Component() {}
};
struct PositionComponent : public Component<PositionComponent>
{
float x, y, z;
virtual ~PositionComponent() {}
};
This way you can do:
auto x = yourBaseComponent.as();
and have the right child type statically.

The code that you present is correct and well formed, but the cast in general is not safe. If the actual object was not a PositionComponent, then the compiler would very gladly assume that it is and you would be causing undefined behavior.
If you replace the cast with dynamic_cast, then the compiler will generate code that at runtime verifies that the conversion is valid.
The real question is why you would need this. There are reasons, but more often than not the use of casts is an indication of issues with your design. Reconsider whether you can do better (i.e. redesign your code so that you don't need to convert types explicitly).

Since you are using unique_ptr<BaseComponent>, naturally there could be times when the conversion fails: the insertion of new data in the vector and consumption of that data are done in unrelated places, and in such a way that the compiler cannot enforce it.
Here is an example of an invalid cast:
struct AnotherComponent : public Component<AnotherComponent>
{
virtual ~AnotherComponent () {}
};
std::vector<std::unique_ptr<BaseComponent>> mComponents;
mComponents.emplace_back(new AnotherComponent);
// !!! This code compiles, but it is fundamentally broken !!!
auto *pos = mComponents[0]->as<PositionComponent>();
pos->x = 1337;
In this respect, using dynamic_cast would provide better protection against incorrect usage of the as<T> function. Note that the incorrect usage may not be intentional: any time the compiler cannot check the type for you, and you have a potential type mismatch, you should prefer dynamic_cast<T>.
Here is a small demo to illustrate how dynamic_cast would offer you a degree of protection.

You should always use dynamic_cast when casting polymorphic objects that are derived from a baseclass.
In a case where mComponents[0] is not PositionComponent (or a class derived therefrom), the above code would fail. Since the whole purpose of having mComponents hold a pointer to BaseComponent is so that you can put other things than PositionComponent objects into the vector, I'd say you need to care for that particular scenario.
In general, it's a "bad smell" when you are using dynamic_cast (or generally casting objects that are derived from a common baseclass). Typically it means the objects should not be held in a common container, because they are not closely enough related.

Related

Static cast base to derived pointer and construct derived members

A minimal example of what I want: I have two classes
template<typename T>
struct General;
template<typename T>
struct Specific;
with Specific inheriting from General, but General calling a Specific member function in its constructor:
template<typename T>
struct Specific: public General<T> {
int y;
void compute() {
y = this->x;
this->x++;
}
};
template<typename T>
struct General {
int x;
General(int x) : x(x) { static_cast<Specific<T>*>(this)->compute(); };
};
The idea is that there is one General class and multiple specific ones, and which one to use is decided either at run time or compile time.
The code above compiles and runs correctly, even though y is never actually constructed. I think I get away with it because it's a primitive type, but if I use a more complicated y instead:
template<typename T>
struct Specific: public General<T> {
std::vector<int> y;
void compute() {
y={0}; //anything here
this->x++;
}
};
then the code still compiles, but when executing the line with the comment I get a read access violation error because y hasn't been constructed.
There is a way out of it, and that's to have the variable y in General as opposed to Specific. This works, but different implementations need different variables, and it's not very elegant to include lots of protected variables in General that are only used by one implementation each (although from a performance standpoint, I guess compilers remove unused member variables as long as that can be detected in compile-time correct?). So is there a better way to do it?
Edit: One possible way suggested by Scheff:
General(int x) : x(x) {
Specific<T> B;
B = static_cast<Specific<T>&>(*this);
B.compute();
*this = static_cast<General<T>&>(B);
}
"Upgrading" a base class to a derived class is not going to happen. If I say
General<int> g(5);
I asked for a General<int> and I will get a General<int>. A Specific<int> is not a General<int>, and it would be wrong for me to get one. Indeed, since your Specific is larger than General, constructing a Specific where a General is expected corrupts the stack, even in the first example with two ints.
Deciding which subclass of General to instantiate belongs outside of its constructor, period. The most obvious way to get there is to untangle the initialization into normal constructors:
template<typename T>
struct General {
int x;
General(int x) : x(x) { }
virtual ~General() { }
};
template<typename T>
struct Specific : General<T> {
int y;
Specific(int x) : General<T>(x), y(this->x++) { }
};
and to decide which one to actually instantiate in a free function:
template<typename T>
std::unique_ptr<General<T>> DecideGeneral(int x) {
if(foo) return std::make_unique<Specific<T>>(x);
// other cases...
else return std::make_unique<General<T>>(x);
}
int main() {
// interesting tidbit:
// a const std::unique_ptr<T> is basically just a T, as the lifetime of the
// pointed-to object is exactly the same as the pointer's (it can't move out)
// all accesses are still indirect though
const auto g = DecideGeneral<int>(5);
}

Implicit downcast of shared_ptr in CRTP

I built a class Interface to be used with CRTP for static polymorphism and a Client class having a shared_ptr to the Interface. I would like to return from the Client the shared_ptr to the Implementation, something that I can (safely?) do through static_pointer_cast within the Client. Is there a way to allow an implicit downcasting from the shared_ptr to Interface to a shared_ptr to Implementation?
template<class Implementation>
struct Interface {
int foo() { return static_cast<Implementation*>(this)->fooimpl();}
};
template<class Implementation>
struct Client {
Client(std::shared_ptr<Implementation> const & pt) : pt_{pt} {}
std::shared_ptr<Interface<Implementation>> pt_;
std::shared_ptr<Implementation> getPt() {
//Can I avoid this static_pointer_cast?<
return std::static_pointer_cast<Implementation>(pt_);
}
};
One possible solution to avoid all this mess is to keep a shared_ptr to Implementation within the Client class. In this way, however, I am nowhere stating that Implementation in Client has the Interface.
template<class Implementation>
struct Client {
Client(std::shared_ptr<Implementation> const & pt) : pt_{pt} {}
std::shared_ptr<Implementation> pt_;
std::shared_ptr<Implementation> getPt() { return pt_;}
};
Is there a way to allow an implicit downcasting from the shared_ptr to Interface to a shared_ptr to Implementation?
Simple answer: No! As the compiler has no idea of the "reverse" inheritance, it cannot give you direct support for it. CRTP is a general pattern to work around the underlying problem. Here the downcast is hand coded, but hidden behind the interface of the CRTP implementation.
The CRTP is because I want Client to use foo() and at the same time independent of the implementation
As you get it as a compile-time implementation, it is not really independent at the moment. You will see this at the latest when you want to point to something of that CRTP type.
The shared_ptr is because the Implementation may be shared among Clients
Your idea has a circular problem!
If you write as given in your example:
template<class Implementation>
struct Client {
Client(std::shared_ptr<Implementation> const & pt) : pt_{pt} {}
std::shared_ptr<Implementation> pt_;
std::shared_ptr<Implementation> getPt() { return pt_;}
};
the code which calls getPt() must know the type of the returned pointer! Even if you use auto you will get the type of the returned pointer. So you simply can't hide it from your using code at all.
I ended up just putting a shared_ptr to the Implementation as class member and added a "static_assert + is_base_of" to ensure this.
Seems to be also a circular problem.
If you write:
template < typename T>
class CRTP: public T
{
public:
void Check()
{
static_assert( std::is_base_of_v < T, CRTP<T> > );
}
};
class A {};
int main()
{
CRTP<A> x;
x.Check();
}
What is the assert helping here? It is only a check that you wrote 4 lines above "class CRTP: public T". For me that makes no sense.
I still have no real idea what you want to achieve more than simply using CRTP.

Contiguous storage of polymorphic types

I'm interested to know if there is any viable way to contiguously store an array of polymorphic objects, such that virtual methods on a common base can be legally called (and would dispatch to the correct overridden method in a subclass).
For example, considering the following classes:
struct B {
int common;
int getCommon() { return common; }
virtual int getVirtual() const = 0;
};
struct D1 : B {
virtual int getVirtual() const final { return 5; }
};
struct D2 : B {
int d2int;
virtual int getVirtual() const final { return d2int; }
};
I would like to allocate a contiguous array of D1 and D2 objects, and treat them as B objects, including calling getVirtual() which will delegate to the appropriate method depending on the object type. Conceptually this seems possible: each object knows its type, typically via an embedded vtable pointer, so you could imagine, storing n objects in an array of n * max(sizeof(D1), sizeof(D2)) unsigned char, and using placement new and delete to initialize the objects, and casting the unsigned char pointer to B*. I'm pretty sure a cast is not legal, however.
One could also imagine creating a union like:
union Both {
D1 d1;
D2 d2;
};
and then creating an array of Both, and using placement new to create the objects of the appropriate type. This again doesn't seem to offer a way to actually call B::getVirtual() safely, however. You don't know the last stored type for the elements, so how are you going to get your B*? You need to use either &u.d1 or &u.d2 but you don't know which! There are actually special rules about the "common initial sequence" which let you do some things on unions whose members share some common traits, but this only applies to standard-layout types. Classes with virtual methods are not standard-layout types.
Is there any way to proceed? Ideally, a solution would look something like a non-slicing std::vector<B> that can actually contain polymorphic subclasses of B. Yes, if required one might stipulate that all possible subclasses are known up front, but a better solution would only need to know the maximum likely size of any subclass (and fail at compile time if someone tries to add a "too big" object).
If it isn't possible to do with the built-in virtual mechanism, other alternatives that offer similar functionality would also be interesting.
Background
No doubt someone will ask "why", so here's a bit of motivation:
It seems generally well-known that using virtual functions to implement runtime polymorphism comes at a moderate overhead when actually calling virtual methods.
Not as often discussed, however, is the fact that using classes with virtual methods to implement polymorphism usually implies a totally different way of managing the memory for the underlying objects. You cannot just add objects of different types (but a common base) to a standard container: if you have subclasses D1 and D2, both derived from base B, a std::vector<B> would slice any D1 or D2 objects added. Similarly for arrays of such objects.
The usual solution is to instead use containers or arrays of pointers to the base class, like std::vector<B*> or perhaps std::vector<unique_ptr<B>> or std::vector<shared_ptr<B>>. At a minimum, this adds an extra indirection when accessing each element [1], and in the case of the smart pointers, it breaks common container optimizations. If you are actually allocating each object via new and delete (including indirectly), then the time and memory cost of storing your objects just increased by a large amount.
Conceptually it seems like various subclasses of a common base can be stored consecutively (each object would consume the same amount of space: that of the largest supported object), and that a pointer to an object could be treated as a base-class pointer. In some cases, this could greatly simplify and speed up the use of such polymorphic objects. Of course, in general, it's probably a terrible idea, but for the purposes of this question let's assume it has some niche application.
[1] Among other things, this indirection pretty much prevents any vectorization of the same operation applied to all elements and harms locality of reference, with implications both for caching and prefetching.
You were almost there with your union. You can use either a tagged union (add an if to discriminate in your loop) or a std::variant (it introduces a kind of double dispatching through std::visit to get the object out of it) to do that. In neither case do you have allocations on the dynamic storage, so data locality is guaranteed.
Anyway, as you can see, in any case you can replace an extra level of indirection (the virtual call) with a plain direct call. You need to erase the type somehow (polymorphism is nothing more than a kind of type erasure, think of it) and you cannot get out directly from an erased object with a simple call. ifs or extra calls to fill the gap of the extra level of indirection are required.
Here is an example that uses std::variant and std::visit:
#include<vector>
#include<variant>
struct B { virtual void f() = 0; };
struct D1: B { void f() override {} };
struct D2: B { void f() override {} };
void f(std::vector<std::variant<D1, D2>> &vec) {
for(auto &&v: vec) {
std::visit([](B &b) { b.f(); }, v);
}
}
int main() {
std::vector<std::variant<D1, D2>> vec;
vec.push_back(D1{});
vec.push_back(D2{});
f(vec);
}
Since it's really close to this one, it isn't worth also posting an example that uses tagged unions.
Another way to do that is by means of separate vectors for the derived classes and a support vector to iterate them in the right order.
Here is a minimal example that shows it:
#include<vector>
#include<functional>
struct B { virtual void f() = 0; };
struct D1: B { void f() override {} };
struct D2: B { void f() override {} };
void f(std::vector<std::reference_wrapper<B>> &vec) {
for(auto &w: vec) {
w.get().f();
}
}
int main() {
std::vector<std::reference_wrapper<B>> vec;
std::vector<D1> d1;
std::vector<D2> d2;
d1.push_back({});
vec.push_back(d1.back());
d2.push_back({});
vec.push_back(d2.back());
f(vec);
}
Here is an attempt to implement what you want without memory overhead:
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>
#include <utility>
template <typename Base, std::size_t MaxSize, std::size_t MaxAlignment>
struct PolymorphicStorage
{
public:
template <typename D, typename ...Ts>
D* emplace(Ts&&... args)
{
static_assert(std::is_base_of<Base, D>::value, "Type should inherit from Base");
auto* d = new (&buffer) D(std::forward<Ts>(args)...);
assert(&buffer == reinterpret_cast<void*>(static_cast<Base*>(d)));
return d;
}
void destroy() { get().~Base(); }
const Base& get() const { return *reinterpret_cast<const Base*>(&buffer); }
Base& get() { return *reinterpret_cast<Base*>(&buffer); }
private:
std::aligned_storage_t<MaxSize, MaxAlignment> buffer;
};
Demo
The problem is that the copy/move constructors (and assignments) are incorrect, and I don't see a correct way to implement them without memory overhead (or additional restrictions on the class).
I cannot =delete them, since then you could not use the type in a std::vector.
With memory overhead, variant seems then simpler.
So, this is really ugly, but if you're not using multiple inheritance or virtual inheritance, a Derived * in most implementations is going to have the same bit-level value as a Base *.
You can test this with a static_assert so things fail to compile if that's not the case on a particular platform, and use your union idea.
#include <cstdint>
class Base {
public:
virtual bool my_virtual_func() {
return true;
}
};
class DerivedA : public Base {
};
class DerivedB : public Base {
};
namespace { // Anonymous namespace to hide all these pointless names.
constexpr DerivedA a;
constexpr const Base *bpa = &a;
constexpr DerivedB b;
constexpr const Base *bpb = &b;
constexpr bool test_my_hack()
{
using ::std::uintptr_t;
{
const uintptr_t dpi = reinterpret_cast<uintptr_t>(&a);
const uintptr_t bpi = reinterpret_cast<uintptr_t>(bpa);
static_assert(dpi == bpi, "Base * and Derived * !=");
}
{
const uintptr_t dpi = reinterpret_cast<uintptr_t>(&b);
const uintptr_t bpi = reinterpret_cast<uintptr_t>(bpb);
static_assert(dpi == bpi, "Base * and Derived * !=");
}
// etc...
return true;
}
}
const bool will_the_hack_work = test_my_hack();
The only problem is that constexpr rules will forbid your objects from having virtual destructors because those will be considered 'non-trivial'. You'll have to destroy them by calling a virtual function that must be defined in every derived class that then calls the destructor directly.
But, if this bit of code succeeds in compiling, then it doesn't matter if you get a Base * from the DerivedA or DerivedB member of your union. They're going to be the same anyway.
Another option is to embed a pointer to a struct full of member function pointers at the beginning of a struct that contains that pointer and a union with your derived classes in it and initialize it yourself. Basically, implement your own vtable.
There was a talk at CppCon 2017, "Runtime Polymorphism - Back to the Basics", that discussed doing something like what you are asking for. The slides are on github and a video of the talk is available on youtube.
The speaker's experimental library for achieving this, "dyno", is also on github.
It seems to me that you're looking for a variant, which is a tagged union with safe access.
C++17 has std::variant. For prior versions, Boost offers an equivalent: boost::variant.
Note that the polymorphism is no longer necessary. In this case I have used signature-compatible methods to provide the polymorphism, but you can also provide it through signature-compatible free functions and ADL.
#include <variant> // use boost::variant if you don't have c++17
#include <vector>
#include <algorithm>
struct B {
int common;
int getCommon() const { return common; }
};
struct D1 : B {
int getVirtual() const { return 5; }
};
struct D2 : B {
int d2int;
int getVirtual() const { return d2int; }
};
struct d_like
{
using storage_type = std::variant<D1, D2>;
int get() const {
return std::visit([](auto&& b)
{
return b.getVirtual();
}, store_);
}
int common() const {
return std::visit([](auto&& b)
{
return b.getCommon();
}, store_);
};
storage_type store_;
};
bool operator <(const d_like& l, const d_like& r)
{
return l.get() < r.get();
}
struct by_common
{
bool operator ()(const d_like& l, const d_like& r) const
{
return l.common() < r.common();
}
};
int main()
{
std::vector<d_like> vec;
std::sort(begin(vec), end(vec));
std::sort(begin(vec), end(vec), by_common());
}

Value to type runtime mapping

Consider this code
enum Types
{
t1,
t2
};
struct Base
{
Types typeTag;
Base(Types t) : typeTag(t){}
};
template<typename T>
struct Derived : Base
{
using Base::Base;
T makeT() { return T(); }
};
int main()
{
Base *b = new Derived<std::string>(t1);
auto d = getDerivedByTag(b); // How ??
d->makeT();
return 0;
}
Is it possible to restore Derived type parameter by Base::typeTag value in runtime? Obviously, some external preliminarily prepared mapping is needed, but I can't figure out the exact way.
What you want is basically reflection, which is not (yet) supported in C++. There are ways to simulate it or work around it, but they are often verbose and not elegant. I would suggest rethinking your design, particularly your use of auto. It is not supposed to substitute for "any type" as you seem to imply by your code. It is meant as a simplification of code when the actual type is long or obfuscated (as often happens with templates), nested, etc. Not when you do not know the type! Because then you cannot really use it, can you.
So what you will have to do in one way or another is check the typeTag directly and continue based on that information. Alternatively you would need to use polymorphism using the Base directly (calling virtual methods propagated to Derived). For type unions you could use boost::variant (if you do not care what type Derived template argument is) or some other framework/library alternative like QVariant in Qt.
I'm not sure if my understanding is correct.
#include <string>
enum Types
{
t1,
t2
};
template<typename T>
struct Base
{
typedef T DerivedType;
Types typeTag;
Base(Types t) : typeTag(t){}
DerivedType* operator()() {
return static_cast<DerivedType*>(this);
}
};
template<typename T>
struct Derived : Base<Derived<T>>
{
Derived(Types t): Base<Derived<T>>(t) {}
T makeT() { return T(); }
};
int main()
{
Base<Derived<std::string>> *b = new Derived<std::string>(t1);
auto d = (*b)();
d->makeT();
return 0;
}
https://godbolt.org/g/uBsFD8
My implementation has nothing to do with typeTag.
Do you mean getDerivedByTag(b->typeTag) rather than getDerivedByTag(b)?

What is an appropriate interface for dealing with meta-aspects of classes?

I'm looking for some advice of what would be an appropriate interface for dealing with aspects about classes (that deal with classes), but which are not part of the actual class they are dealing with (meta-aspects). This needs some explanation...
In my specific example I need to implement a custom RTTI system that is a bit more complex than the one offered by C++ (I won't go into why I need that). My base object is FooBase and each child class of this base is associated a FooTypeInfo object.
// Given a base pointer that holds a derived type,
// I need to be able to find the actual type of the
// derived object I'm holding.
FooBase* base = new FooDerived;
// The obvious approach is to use virtual functions...
const FooTypeInfo& info = base->typeinfo();
Using virtual functions to deal with the run-time type of the object doesn't feel right to me. I tend to think of the run-time type of an object as something that goes beyond the scope of the class, and as such it should not be part of its explicit interface. The following interface makes me feel a lot more comfortable...
FooBase* base = new FooDerived;
const FooTypeInfo& info = foo::typeinfo(base);
However, even though the interface is not part of the class, the implementation would still have to use virtual functions, in order for this to work:
class FooBase
{
protected:
virtual const FooTypeInfo& typeinfo() const = 0;
friend const FooTypeInfo& ::foo::typeinfo(const FooBase*);
};
namespace foo
{
const FooTypeInfo& typeinfo(const FooBase* ptr) {
return ptr->typeinfo();
}
}
Do you think I should use this second interface (that feels more appropriate to me) and deal with the slightly more complex implementation, or should I just go with the first interface?
@Seth Carnegie:
This is a difficult problem if you don't even want derived classes to know about being part of the RTTI ... because you can't really do anything in the FooBase constructor that depends on the runtime type of the class being instantiated (for the same reason you can't call virtual methods in a ctor or dtor).
FooBase is the common base of the hierarchy. I also have a separate CppFoo<> class template that reduces the amount of boilerplate and makes the definition of types easier. There's another PythonFoo class that works with Python-derived objects.
template<typename FooClass>
class CppFoo : public FooBase
{
private:
const FooTypeInfo& typeinfo() const {
return ::foo::typeinfo<FooClass>();
}
};
class SpecificFoo : public CppFoo<SpecificFoo>
{
// The class can now be implemented agnostic of the
// RTTI system that works behind the scenes.
};
A few more details about how the system works can be found here:
► https://stackoverflow.com/a/8979111/627005
You can tie the dynamic type to the static type via the typeid keyword and use the returned std::type_info objects as a means of identification. Furthermore, if you apply typeid to a separate class created specially for the purpose, it will be totally non-intrusive for the classes you are interested in, although their names still have to be known in advance. It is important that typeid is applied to a type which supports dynamic polymorphism - it has to have some virtual function.
Here is example:
#include <typeinfo>
#include <cstdio>
class Base;
class Derived;
template <typename T> class sensor { virtual ~sensor(); };
extern const std::type_info& base = typeid(sensor<Base>);
extern const std::type_info& derived = typeid(sensor<Derived>);
template <const std::type_info* Type> struct type
{
static const char* name;
static void stuff();
};
template <const std::type_info* Type> const char* type<Type>::name = Type->name();
template<> void type<&base>::stuff()
{
std::puts("I know about Base");
}
template<> void type<&derived>::stuff()
{
std::puts("I know about Derived");
}
int main()
{
std::puts(type<&base>::name);
type<&base>::stuff();
std::puts(type<&derived>::name);
type<&derived>::stuff();
}
Needless to say, since std::type_info are proper objects and they are unique and ordered, you can manage them in a collection and thus erase type queried from the interface:
template <typename T> struct sensor {virtual ~sensor() {}};
struct type
{
const std::type_info& info;
template <typename T>
explicit type(sensor<T> t) : info(typeid(t))
{};
};
bool operator<(const type& lh, const type& rh)
{
return lh.info.before(rh.info);
}
int main()
{
std::set<type> t;
t.insert(type(sensor<Base>()));
t.insert(type(sensor<Derived>()));
for (std::set<type>::iterator i = t.begin(); i != t.end(); ++i)
std::puts(i->info.name());
}
Of course you can mix and match both, as you see fit.
Two limitations:
there is no actual introspection here. You can add it to the sensor template via clever metaprogramming; it's a very wide subject (and mind-bending, sometimes).
names of all types you want to support have to be known in advance.
One possible variation is adding an RTTI "framework hook" such as static const sensor<MyClass> rtti_MyClass; to implementation files where class names are already known, and letting the constructor do the work. The types would also have to be complete at this point to enable introspection in sensor.