Mixing smart pointers to class objects with class objects - C++

I have code where, by design, some classes use a factory function to generate the actual class and others do not. Many classes implement functions with the same names, and those functions are called sequentially (see below). This design results in mixing smart pointers to objects with the objects themselves. Is the code below bad design, and should I use smart pointers everywhere?
#include <iostream>
#include <memory>
#include <stdexcept>
class A
{
public:
void print_name() { std::cout << "A\n"; }
};
class B
{
public:
virtual ~B() = default; // needed: B1/B2 are destroyed through unique_ptr<B>
virtual void print_name() = 0;
static std::unique_ptr<B> factory(const int n);
};
class B1 : public B
{
public:
void print_name() { std::cout << "B1\n"; }
};
class B2 : public B
{
public:
void print_name() { std::cout << "B2\n"; }
};
std::unique_ptr<B> B::factory(const int n)
{
if (n == 1)
return std::make_unique<B1>();
else if (n == 2)
return std::make_unique<B2>();
else
throw std::runtime_error("Illegal option");
}
int main()
{
A a;
std::unique_ptr<B> b1 = B::factory(1);
std::unique_ptr<B> b2 = B::factory(2);
// The block below disturbs me because of mixed . and ->
a.print_name();
b1->print_name();
b2->print_name();
return 0;
}
EDIT
I have added smart pointers to the example following the comments below.

This looks like a reasonable design to me. In the client code you will work through the base class interface.
class Base {};
class A: public Base {};
class B: public Base {};
class B1: public B {};
class B2: public B {};
class Factory {
public:
std::unique_ptr<Base> create(const int n) {
// Instantiate a concrete class based on n, e.g.:
return std::make_unique<A>();
}
};

Generally, you should avoid using pointers when they are not needed. Features like return value optimization (RVO) make them optional in most cases. Also, don't use C-style pointers for ownership: C++11 introduced the <memory> header, which provides several useful smart pointers.
And yeah, I feel like using those pointers in such a program is a bad design decision.

Other than the mix, why does the code disturb you? What do you think could actually go badly because of it? Do you think the difference really hurts readability?
Yes the lines that call print look different, and that difference tells you that the objects are being managed in a different way.
There's no obvious benefit to them being made the same, if a developer can't understand both . and -> then you have much bigger problems.
I've worked on a codebase where nearly everything was handled by shared pointers for consistency, even though some things shouldn't be pointers and most aren't really shared. Needless inconsistency should be avoided but it really isn't helpful to be consistent in cases where that hides legitimate difference.
One benefit of your code over using pointers everywhere is that since a is not a pointer, you can be sure that a is not null.

Related

Check intermediate ancestor type of object

Some places in our code make extensive use of dynamic_cast to check the type of a given object:
if (dynamic_cast<Foo*>(bar))
return "foo";
else
return "not-foo";
In some specific section of the code, we decided to switch to typeid, but we ran into a problem: we're checking an object against an arbitrary ancestor, not to its concrete type:
#include <typeinfo>
#include <iostream>
struct Base
{
virtual ~Base() {} // enable vtable
};
struct Derived : Base{};
struct DerivedAgain : Derived{};
struct OtherDerived : Derived{};
int main() {
Base* b = new DerivedAgain;
// if (typeid(*b) == typeid(Derived)) // will print false
if (dynamic_cast<Derived*>(b))
std::cout << "true\n";
else
std::cout << "false\n";
return 0;
}
Is there a way to check if b is Derived* without dynamic_cast?
P.S.: I'm aware this might be indicative of some larger problem with the design of the code, but I want to know specifically how to make this kind of check.
A quick way to get around dynamic_cast, with some manual work, if you have relatively few types:
add a static constexpr/const int mytypeno member to each class with a distinct prime number (this can be done automatically)
add another static constexpr/const int mytypecompatibility member, which is:
for base (topmost classes) the same as mytypeno
for classes that have base classes, the product of their own mytypeno and all base classes' mytypeno values
Then you can check whether a class is a descendant of another as descendant.mytypecompatibility % base.mytypeno == 0. This idea was originally proposed by Bjarne Stroustrup as a quick implementation of dynamic_cast<> for the case where you have full access to all sources (vs. local solutions). It is also relatively unintrusive: you only add two static constexpr members, so there is no per-object overhead and you don't force a virtual table to be used.
It seems that dynamic_cast is what I want in this specific case, since it is the only thing in the language that will actually examine the dependency chain of the object.
That said, it's worth mentioning that, as I said in the question and many people said in the comments, having to use dynamic_cast may be indicative of a larger design problem in the code. One possible solution is to use a polymorphic method instead:
// with checking
int doThing(Base* b)
{
if (auto d = dynamic_cast<Derived*>(b))
return stuff(d);
if (auto d = dynamic_cast<OtherDerived*>(b))
return otherStuff(d);
return 0;
}
// with polymorphism
int doThing(Base* b)
{
return b->stuff();
}
// stuff is a virtual method defined in Base, Derived and OtherDerived;
// ::stuff and ::otherStuff are the free functions used above
struct Base
{
virtual int stuff() { return 0; }
};
struct Derived : Base
{
// qualified call: the member name would otherwise hide the free function
int stuff() override { return ::stuff(this); }
};
struct OtherDerived : Base
{
int stuff() override { return ::otherStuff(this); }
};

Contiguous storage of polymorphic types

I'm interested to know if there is any viable way to contiguously store an array of polymorphic objects, such that virtual methods on a common base can be legally called (and would dispatch to the correct overridden method in a subclass).
For example, considering the following classes:
struct B {
int common;
int getCommon() const { return common; }
virtual int getVirtual() const = 0;
};
struct D1 : B {
int getVirtual() const final { return 5; }
};
struct D2 : B {
int d2int;
int getVirtual() const final { return d2int; }
};
I would like to allocate a contiguous array of D1 and D2 objects, and treat them as B objects, including calling getVirtual(), which will dispatch to the appropriate method depending on the object type. Conceptually this seems possible: each object knows its type, typically via an embedded vtable pointer, so you could imagine storing n objects in an array of n * max(sizeof(D1), sizeof(D2)) unsigned char, using placement new to initialize the objects and explicit destructor calls to destroy them, and casting the unsigned char pointer to B*. I'm pretty sure such a cast is not legal, however.
One could also imagine creating a union like:
union Both {
D1 d1;
D2 d2;
};
and then creating an array of Both, and using placement new to create the objects of the appropriate type. This again doesn't seem to offer a way to actually call B::getVirtual() safely, however. You don't know the last stored type for each element, so how are you going to get your B*? You need to use either &u.d1 or &u.d2, but you don't know which! There are actually special rules about "common initial sequences" which let you do some things with unions whose members share some common traits, but these only apply to standard-layout types, and classes with virtual methods are not standard-layout types.
Is there any way to proceed? Ideally, a solution would look something like a non-slicing std::vector<B> that can actually contain polymorphic subclasses of B. Yes, if required one might stipulate that all possible subclasses are known up front, but a better solution would only need to know the maximum likely size of any subclass (and fail at compile time if someone tries to add a "too big" object).
If it isn't possible to do with the built-in virtual mechanism, other alternatives that offer similar functionality would also be interesting.
Background
No doubt someone will ask "why", so here's a bit of motivation:
It seems generally well-known that using virtual functions to implement runtime polymorphism comes at a moderate overhead when actually calling virtual methods.
Not as often discussed, however, is the fact that using classes with virtual methods to implement polymorphism usually implies a totally different way of managing the memory for the underlying objects. You cannot just add objects of different types (but a common base) to a standard container: if you have subclasses D1 and D2, both derived from base B, a std::vector<B> would slice any D1 or D2 objects added. Similarly for arrays of such objects.
The usual solution is to instead use containers or arrays of pointers to the base class, like std::vector<B*> or perhaps std::vector<unique_ptr<B>> or std::vector<shared_ptr<B>>. At a minimum, this adds an extra indirection when accessing each element1, and in the case of the smart pointers, it breaks common container optimizations. If you are actually allocating each object via new and delete (including indirectly), then the time and memory cost of storing your objects just increased by a large amount.
Conceptually it seems like various subclasses of a common base could be stored consecutively (each object would consume the same amount of space: that of the largest supported object), and that a pointer to such an object could be treated as a base-class pointer. In some cases, this could greatly simplify and speed up the use of such polymorphic objects. Of course, in general it's probably a terrible idea, but for the purposes of this question let's assume it has some niche application.
1 Among other things, this indirection pretty much prevents any vectorization of the same operation applied to all elements and harms locality of reference with implications both for caching and prefetching.
You were almost there with your union. You can use either a tagged union (add an if to discriminate in your loop) or a std::variant (it introduces a kind of double dispatch through std::visit to get the object out of it) to do that. In neither case do you have allocations on the dynamic storage, so data locality is guaranteed.
Anyway, as you can see, in any case you can replace an extra level of indirection (the virtual call) with a plain direct call. You need to erase the type somehow (polymorphism is nothing more than a kind of type erasure, think of it) and you cannot get out directly from an erased object with a simple call. ifs or extra calls to fill the gap of the extra level of indirection are required.
Here is an example that uses std::variant and std::visit:
#include<vector>
#include<variant>
struct B { virtual void f() = 0; };
struct D1: B { void f() override {} };
struct D2: B { void f() override {} };
void f(std::vector<std::variant<D1, D2>> &vec) {
for(auto &&v: vec) {
std::visit([](B &b) { b.f(); }, v);
}
}
int main() {
std::vector<std::variant<D1, D2>> vec;
vec.push_back(D1{});
vec.push_back(D2{});
f(vec);
}
Since a tagged-union version is really close to this one, it isn't worth also posting an example that uses tagged unions.
Another way to do that is by means of separate vectors for the derived classes, plus a support vector of references to iterate over them in the right order.
Here is a minimal example that shows it:
#include<vector>
#include<functional>
struct B { virtual void f() = 0; };
struct D1: B { void f() override {} };
struct D2: B { void f() override {} };
void f(std::vector<std::reference_wrapper<B>> &vec) {
for(auto &w: vec) {
w.get().f();
}
}
int main() {
std::vector<std::reference_wrapper<B>> vec;
std::vector<D1> d1;
std::vector<D2> d2;
d1.push_back({});
vec.push_back(d1.back());
d2.push_back({});
vec.push_back(d2.back());
f(vec);
}
Here is an attempt to implement what you want without memory overhead:
#include <cassert>
#include <cstddef>
#include <new>
#include <type_traits>
template <typename Base, std::size_t MaxSize, std::size_t MaxAlignment>
struct PolymorphicStorage
{
public:
template <typename D, typename ...Ts>
D* emplace(Ts&&... args)
{
static_assert(std::is_base_of<Base, D>::value, "Type should inherit from Base");
auto* d = new (&buffer) D(std::forward<Ts>(args)...);
assert(&buffer == reinterpret_cast<void*>(static_cast<Base*>(d)));
return d;
}
void destroy() { get().~Base(); }
const Base& get() const { return *reinterpret_cast<const Base*>(&buffer); }
Base& get() { return *reinterpret_cast<Base*>(&buffer); }
private:
std::aligned_storage_t<MaxSize, MaxAlignment> buffer;
};
The problem is that the copy/move constructors (and assignments) are incorrect, and I don't see a correct way to implement them without memory overhead (or additional restrictions on the classes).
I cannot =delete them, or else you cannot use the type in a std::vector.
If memory overhead is acceptable, a variant then seems simpler.
So, this is really ugly, but if you're not using multiple inheritance or virtual inheritance, a Derived * in most implementations is going to have the same bit-level value as a Base *.
You can test this with a static_assert so things fail to compile if that's not the case on a particular platform, and use your union idea.
#include <cstdint>
class Base {
public:
virtual bool my_virtual_func() {
return true;
}
};
class DerivedA : public Base {
};
class DerivedB : public Base {
};
namespace { // Anonymous namespace to hide all these pointless names.
constexpr DerivedA a;
constexpr const Base *bpa = &a;
constexpr DerivedB b;
constexpr const Base *bpb = &b;
constexpr bool test_my_hack()
{
using ::std::uintptr_t;
{
const uintptr_t dpi = reinterpret_cast<uintptr_t>(&a);
const uintptr_t bpi = reinterpret_cast<uintptr_t>(bpa);
static_assert(dpi == bpi, "Base * and Derived * !=");
}
{
const uintptr_t dpi = reinterpret_cast<uintptr_t>(&b);
const uintptr_t bpi = reinterpret_cast<uintptr_t>(bpb);
static_assert(dpi == bpi, "Base * and Derived * !=");
}
// etc...
return true;
}
}
const bool will_the_hack_work = test_my_hack();
The only problem is that constexpr rules will forbid your objects from having virtual destructors because those will be considered 'non-trivial'. You'll have to destroy them by calling a virtual function that must be defined in every derived class that then calls the destructor directly.
But, if this bit of code succeeds in compiling, then it doesn't matter if you get a Base * from the DerivedA or DerivedB member of your union. They're going to be the same anyway.
Another option is a struct containing a pointer to a table of function pointers, followed by a union of your derived classes, all initialized by hand. Basically, implement your own vtable.
There was a talk at CppCon 2017, "Runtime Polymorphism - Back to the Basics", that discussed doing something like what you are asking for. The slides are on github and a video of the talk is available on youtube.
The speaker's experimental library for achieving this, "dyno", is also on github.
It seems to me that you're looking for a variant, which is a tagged union with safe access.
C++17 has std::variant. For prior versions, Boost offers one: boost::variant.
Note that virtual polymorphism is no longer necessary. In this case I have used signature-compatible methods to provide the polymorphism, but you can also provide it through signature-compatible free functions and ADL.
#include <variant> // use boost::variant if you don't have c++17
#include <vector>
#include <algorithm>
struct B {
int common;
int getCommon() const { return common; }
};
struct D1 : B {
int getVirtual() const { return 5; }
};
struct D2 : B {
int d2int;
int getVirtual() const { return d2int; }
};
struct d_like
{
using storage_type = std::variant<D1, D2>;
int get() const {
return std::visit([](auto&& b)
{
return b.getVirtual();
}, store_);
}
int common() const {
return std::visit([](auto&& b)
{
return b.getCommon();
}, store_);
};
storage_type store_;
};
bool operator <(const d_like& l, const d_like& r)
{
return l.get() < r.get();
}
struct by_common
{
bool operator ()(const d_like& l, const d_like& r) const
{
return l.common() < r.common();
}
};
int main()
{
std::vector<d_like> vec;
std::sort(begin(vec), end(vec));
std::sort(begin(vec), end(vec), by_common());
}

How can I keep const-correctness and RAII?

I have situation similar to included:
class A
{
public:
A(shared_ptr<B>);
};
class B : public enable_shared_from_this<B>
{
const shared_ptr<A> a;
};
I can't have a shared_ptr to B before construction, i.e. before a is initialized. So I either need to initialize my constant field after construction (which I think breaks RAII), or construct it later (so it can't be const, which breaks const-correctness and also seems inconsistent with RAII).
This seems like a fairly common situation. Is there a clean way to handle it? How would you do this?
I would solve this by not having const members, plain and simple. They are generally much more trouble than they're worth (they make the class non-assignable, not even move-assignable, for example).
a is private, so only the class itself can access it. Thus it should be enough to document "a should never be modified after being initialised!!!". If you fear that won't be enough (or the class has friends outside your control), you can make this even more obvious like this:
class B : public enable_shared_from_this<B>
{
const std::shared_ptr<A>& a() const { return _use_this_ONLY_for_initialising_a; }
std::shared_ptr<A> _use_this_ONLY_for_initialising_a;
};
Such a situation is a good indicator that you should refactor your code. Think about whether B should actually inherit from A or be a member of A before finding a way around this problem...
...because the answer is probably to remove the constness of your member, and probably not to use shared_ptr at all (you have a cyclical reference there, so reference counting alone will never be able to destroy your objects!).
If A doesn't actually keep a copy of the shared_ptr<B> you pass it (just uses the B briefly) then you can make this work (see below) but:
If A doesn't keep a reference to the B then it could just take a B& or B* argument, not a shared_ptr<B>, so you should change the design.
If A does keep a reference then you're going to have a circular reference, so you should change the design.
This works, but is really, really horrible and would be easy to introduce bugs, and is generally a bad idea, I probably shouldn't even be showing it, just change your design to avoid circular dependencies:
#include <memory>
#include <iostream>
class B;
class A
{
public:
A(std::shared_ptr<B> b);
};
class B : public std::enable_shared_from_this<B>
{
// return a shared_ptr<B> that owns `this` but with a
// null deleter. This does not share ownership with
// the result of shared_from_this(), but is usable
// in the B::B constructor.
std::shared_ptr<B> non_owning_shared_from_this()
{
struct null_deleter {
void operator()(void*) const { }
};
return std::shared_ptr<B>(this, null_deleter());
}
public:
B(int id)
: m_id(id), a(std::make_shared<A>(non_owning_shared_from_this()))
{ }
int id() const { return m_id; }
private:
int m_id;
const std::shared_ptr<A> a;
};
A::A(std::shared_ptr<B> b)
{
std::cout << b->id() << std::endl;
}
int main()
{
auto b = std::make_shared<B>(42);
}

What is a "capability query" in the dynamic_cast context and why is it useful?

I am reading some C++ material on dynamic_cast and there the following practice is considered bad:
class base{};
class derived1 : public base{};
class derived2 : public base
{
public:
void foo(){}
};
void baz(base *b)
{
if (derived2 *d2= dynamic_cast<derived2 *> (b) )
{
d2->foo();
}
}
The remedy to this is to use the "capability query" using an empty pure virtual base class like following:
class capability_query
{
public:
virtual void foo()= 0;
};
class base{};
class derived1 : public base{};
class derived2 : public base, public capability_query
{
public:
virtual void foo(){}
};
void baz(base *b)
{
if (capability_query *cq= dynamic_cast<capability_query *> (b) )
{
cq->foo();
}
}
My 1st question is why is the first code block considered bad?
The way I see it foo is only executed if d2 can be successfully downcasted from b in the baz function. So what is the issue here?!
My 2nd question is why is the second code block considered good? and how does this fix the issue, which I don't understand in the first place.
FYI, my google search for capability query returned http://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Capability_Query
which seems to be basically code block1 and not code block2. I still don't get why an additional empty base class is considered a better practice?
EDIT:
Here is the best possible answer I can think of. Since inside baz I am downcasting to a pointer type and not a reference, when the downcast fails I will get a null pointer rather than a std::bad_cast. So, assuming the cast goes wrong and I do get a null pointer: what if I am not supposed to execute null->foo but forget to test for NULL? Code block 1 could then be a problem.
The way code block 2 fixes this is by adding an empty base class. Even if
dynamic_cast<capability_query *> (b)
fails and I get a null pointer, you cannot execute
null->foo, since inside the capability_query class the foo method is pure virtual. This is just a conjecture, but maybe I am on the right path?
The academic answer would be that in object-oriented design you should not depend on the implementation, i.e. concrete classes. Instead you should depend on high-level components like interfaces and abstract base classes. You can read more about this design principle on Wikipedia.
The reason for this is to decouple the design which makes the code more manageable and maintainable.
Let's look at an example. You have a base class and a derived class:
struct Duck {
virtual ~Duck() {}
};
struct MallardDuck : public Duck {
void quack() const {
std::cout << "Quack!" << std::endl;
}
};
Let's say you have another class with a function taking a parameter Duck.
struct SoundMaker {
void makeSound(const Duck* d) {
if (const MallardDuck* md = dynamic_cast<const MallardDuck*>(d)) {
md->quack();
}
}
};
You can use the classes like this:
MallardDuck md;
SoundMaker sm;
sm.makeSound(&md);
Which outputs Quack!.
Now let's add another derived class, RubberDuck:
struct RubberDuck : public Duck {
void squeak() const {
std::cout << "Squeak!" << std::endl;
}
};
If you want SoundMaker to use the class RubberDuck you must make changes in makeSound:
void makeSound(const Duck* d) {
if (const MallardDuck* md = dynamic_cast<const MallardDuck*>(d)) {
md->quack();
} else if (const RubberDuck* rd = dynamic_cast<const RubberDuck*>(d)) {
rd->squeak();
}
}
What if you need to add another type of duck and produce its sound? For every new type of duck you add, you will have to make changes in both the code of the new duck class and in SoundMaker. This is because you depend on concrete implementation. Wouldn't it be better if you could just add new ducks without having to change SoundMaker? Look at the following code:
struct Duck {
virtual ~Duck() {}
virtual void makeSound() const = 0;
};
struct MallardDuck : public Duck {
void makeSound() const override {
quack();
}
void quack() const {
std::cout << "Quack!" << std::endl;
}
};
struct RubberDuck : public Duck {
void makeSound() const override {
squeak();
}
void squeak() const {
std::cout << "Squeak!" << std::endl;
}
};
struct SoundMaker {
void makeSound(const Duck* d) {
d->makeSound(); // No dynamic_cast, no dependencies on implementation.
}
};
Now you can use both duck types in the same way as before:
MallardDuck md;
RubberDuck rd;
SoundMaker sm;
sm.makeSound(&md);
sm.makeSound(&rd);
And you can add as many duck types as you wish without having to change anything in SoundMaker. This is a decoupled design and is much easier to maintain. This is why it is bad practice to down-cast and depend on concrete classes; in the general case, depend only on high-level interfaces.
In your second example you're using a separate class to evaluate if the requested behaviour of the derived class is available. This might be somewhat better as you separate (and encapsulate) the behaviour-control code. It still creates dependencies to your implementation though and every time the implementation changes you may need to change the behaviour-control code.
The first example, where foo is called as d2->foo(), violates the Open-Closed Principle, which in this case means that you should be able to add or remove functionality in d2 without changing code in baz (or anywhere else). The code:
void baz(base *b)
{
if (derived2 *d2 = dynamic_cast<derived2 *>(b))
{
d2->foo();
}
}
shows that baz depends on the definition of the class derived2. If one day the function derived2::foo() is removed, the function baz will also have to be modified, otherwise you'll get a compiler error.
However, in the improved version, if an author decides to remove the foo capability of derived2 by removing the base class capability_query (or indeed if the foo capability were added to class derived1), the function baz needs no modification, and the runtime behavior will automatically be correct.

How do I rewrite this typecast so nonsense is a compile-time error?

This code compiles. It's obviously wrong because a B can never be a WTF. Is there a way I can write the typecast so this is a compile-time error?
class B{ public: virtual void abc(){} };
class D1 : public B{};
class WTF{ };
template<class T, class TT>
T DoSomething(TT o){
return dynamic_cast<T>(o);
}
B*Factory() { return new D1; }
int main(){
DoSomething<D1*, B*>(Factory());
DoSomething<WTF*, B*>(Factory());
}
No, you can't rewrite the cast to be a compile-time error, principally because your assertion that a B can never be a WTF is false.
E.g.
class Combine : public B, public WTF
{
};
int main()
{
Combine c;
std::cout << (void*)&c << '\n';
std::cout << (void*)dynamic_cast<WTF*>(&c) << '\n';
return 0;
}
There are two solutions to this problem, depending on what you want the code to be doing.
The first solution: Change the dynamic_cast into static_cast. This solution assumes that you never use class WTF in a class hierarchy with multiple inheritance. See below.
The second solution: You realize that C++ allows multiple inheritance. There is no way for you nor for the compiler to predict whether an arbitrary instance of WTF created sometime in the future will inherit from class B or not.
#include <stdio.h>
class B{ public: virtual void abc(){} };
class D1 : public B{};
class WTF{ };
class WTF_B : public WTF, public B {};
template<class T, class TT>
T DoSomething(TT o){
return dynamic_cast<T>(o);
}
B*Factory1() { return new D1; }
B*Factory2() { return new WTF_B; }
int main(){
printf("%p\n", DoSomething<D1*, B*>(Factory1()));
printf("%p\n", DoSomething<WTF*, B*>(Factory1()));
printf("%p\n", DoSomething<WTF*, B*>(Factory2()));
return 0;
}
The output of the above program looks like this:
0x861a008
(nil)
0x861a028
In either case, the choice between solution 1 and solution 2 is yours - but the choice is yours only if you know what you are doing.
If your compiler/library support is behind the times and you don't have traits available, this will perform the simple check:
template<class T, class TT>
T DoSomething(TT o) {
enum { Concept_SimpleTypeRelationCheck = static_cast<T>(0) == static_cast<TT>(0) };
return dynamic_cast<T>(o);
}
With multiple inheritance, dynamic_cast<T>(o) could still fail (or succeed despite compilation error, as others have noted).
Note that this variant differs from the accepted answer (downcast via static_cast) in that its use of dynamic_cast preserves type safety when downcasting because it can return 0 if the type does not match the destination. static_cast is fine when upcasting through single inheritance hierarchies, but unsafe for downcasting (unless you have manually ensured correctness in each case).
There is no way to make the compiler give an error about an impossible cast in this situation because it is not an impossible cast. There is the possibility that somewhere there is a class that has multiple inheritance to derive from both WTF and B.