Thou shalt not inherit from std::vector - c++

Ok, this is really difficult to confess, but I do have a strong temptation at the moment to inherit from std::vector.
I need about 10 customized algorithms for vector and I want them to be directly members of the vector. But naturally I want also to have the rest of std::vector's interface. Well, my first idea, as a law-abiding citizen, was to have an std::vector member in MyVector class. But then I would have to manually reprovide all of the std::vector's interface. Too much to type. Next, I thought about private inheritance, so that instead of reproviding methods I would write a bunch of using std::vector::member's in the public section. This is tedious too actually.
And here I am, I really do think that I can simply inherit publicly from std::vector, but provide a warning in the documentation that this class should not be used polymorphically. I think most developers are competent enough to understand that this shouldn't be used polymorphically anyway.
Is my decision absolutely unjustifiable? If so, why? Can you provide an alternative which would have the additional members actually members but would not involve retyping all of vector's interface? I doubt it, but if you can, I'll just be happy.
Also, apart from the fact that some idiot can write something like
std::vector<int>* p = new MyVector
is there any other realistic peril in using MyVector? By saying realistic I discard things like imagine a function which takes a pointer to vector ...
Well, I've stated my case. I have sinned. Now it's up to you to forgive me or not :)

Actually, there is nothing wrong with public inheritance of std::vector. If you need this, just do that.
I would suggest doing that only if it is really necessary. Only if you can't do what you want with free functions (e.g. should keep some state).
The problem is that MyVector is a new entity. It means a new C++ developer should know what the hell it is before using it. What's the difference between std::vector and MyVector? Which one is better to use here and there? What if I need to move std::vector to MyVector? May I just use swap() or not?
Do not produce new entities just to make something to look better. These entities (especially, such common) aren't going to live in vacuum. They will live in mixed environment with constantly increased entropy.

The whole STL was designed in such way that algorithms and containers are separate.
This led to a concept of different types of iterators: const iterators, random access iterators, etc.
Therefore I recommend you to accept this convention and design your algorithms in such way that they won't care about what is the container they're working on - and they would only require a specific type of iterator which they'd need to perform their operations.
Also, let me redirect you to some good remarks by Jeff Attwood.

The main reason for not inheriting from std::vector publicly is an absence of a virtual destructor that effectively prevents you from polymorphic use of descendants. In particular, you are not allowed to delete a std::vector<T>* that actually points at a derived object (even if the derived class adds no members), yet the compiler generally can't warn you about it.
Private inheritance is allowed under these conditions. I therefore recommend using private inheritance and forwarding required methods from the parent as shown below.
class AdVector: private std::vector<double>
{
typedef double T;
typedef std::vector<double> vector;
public:
using vector::push_back;
using vector::operator[];
using vector::begin;
using vector::end;
AdVector operator*(const AdVector & ) const;
AdVector operator+(const AdVector & ) const;
AdVector();
virtual ~AdVector();
};
You should first consider refactoring your algorithms to abstract the type of container they are operating on and leave them as free templated functions, as pointed out by majority of answerers. This is usually done by making an algorithm accept a pair of iterators instead of container as arguments.

If you're considering this, you've clearly already slain the language pedants in your office. With them out of the way, why not just do
struct MyVector
{
std::vector<Thingy> v; // public!
void func1( ... ) ; // and so on
}
That will sidestep all the possible blunders that might come out of accidentally upcasting your MyVector class, and you can still access all the vector ops just by adding a little .v .

What are you hoping to accomplish? Just providing some functionality?
The C++ idiomatic way to do this is to just write some free functions that implement the functionality. Chances are you don't really require a std::vector, specifically for the functionality you're implementing, which means you're actually losing out on reusability by trying to inherit from std::vector.
I would strongly advise you to look at the standard library and headers, and meditate on how they work.

I think very few rules should be followed blindly 100% of the time. It sounds like you've given it quite a lot of thought, and are convinced that this is the way to go. So -- unless someone comes up with good specific reasons not to do this -- I think you should go ahead with your plan.

There is no reason to inherit from std::vector unless one wants to make a class that works differently than std::vector, because it handles in its own way the hidden details of std::vector's definition, or unless one has ideological reasons to use the objects of such class in place of std::vector's ones. However, the creators of the standard on C++ did not provide std::vector with any interface (in the form of protected members) that such inherited class could take advantage of in order to improve the vector in a specific way. Indeed, they had no way to think of any specific aspect that might need extension or fine-tune additional implementation, so they did not need to think of providing any such interface for any purpose.
The reasons for the second option can be only ideological, because std::vectors are not polymorphic, and otherwise there is no difference whether you expose std::vector's public interface via public inheritance or via public membership. (Suppose you need to keep some state in your object so you cannot get away with free functions). On a less sound note and from the ideological point of view, it appears that std::vectors are a kind of "simple idea", so any complexity in the form of objects of different possible classes in their place ideologically makes no use.

In practical terms: If you do not have any data members in your derived class, you do not have any problems, not even in polymorphic usage. You only need a virtual destructor if the sizes of the base class and the derived class are different and/or you have virtual functions (which means a v-table).
BUT in theory: From [expr.delete] in the C++0x FCD: In the first alternative (delete object), if the static type of the object to be deleted is different from its dynamic type, the static type shall be a base class of the dynamic type of the object to be deleted and the static type shall have a virtual destructor or the behavior is undefined.
But you can derive privately from std::vector without problems.
I have used the following pattern:
class PointVector : private std::vector<PointType>
{
typedef std::vector<PointType> Vector;
...
using Vector::at;
using Vector::clear;
using Vector::iterator;
using Vector::const_iterator;
using Vector::begin;
using Vector::end;
using Vector::cbegin;
using Vector::cend;
using Vector::crbegin;
using Vector::crend;
using Vector::empty;
using Vector::size;
using Vector::reserve;
using Vector::operator[];
using Vector::assign;
using Vector::insert;
using Vector::erase;
using Vector::front;
using Vector::back;
using Vector::push_back;
using Vector::pop_back;
using Vector::resize;
...

If you follow good C++ style, the absence of virtual function is not the problem, but slicing (see https://stackoverflow.com/a/14461532/877329)
Why is absence of virtual functions not the problem? Because a function should not try to delete any pointer it receives, since it does not have an ownership of it. Therefore, if following strict ownership policies, virtual destructors should not be needed. For example, this is always wrong (with or without virtual destructor):
void foo(SomeType* obj)
{
if(obj!=nullptr) //The function prototype only makes sense if parameter is optional
{
obj->doStuff();
}
delete obj;
}
class SpecialSomeType:public SomeType
{
// whatever
};
int main()
{
SpecialSomeType obj;
doStuff(&obj); //Will crash here. But caller does not know that
// ...
}
In contrast, this will always work (with or without virtual destructor):
void foo(SomeType* obj)
{
if(obj!=nullptr) //The function prototype only makes sense if parameter is optional
{
obj->doStuff();
}
}
class SpecialSomeType:public SomeType
{
// whatever
};
int main()
{
SpecialSomeType obj;
doStuff(&obj);
// The correct destructor *will* be called here.
}
If the object is created by a factory, the factory should also return a pointer to a working deleter, which should be used instead of delete, since the factory may use its own heap. The caller can get it form of a share_ptr or unique_ptr. In short, do not delete anything you didn't get directly from new.

Yes it's safe as long as you are careful not to do the things that are not safe... I don't think I've ever seen anyone use a vector with new so in practice you'll likely be fine. However, it's not the common idiom in c++....
Are you able to give more information on what the algorithms are?
Sometimes you end up going down one road with a design and then can't see the other paths you might have taken - the fact that you claim to need to vector with 10 new algorithms rings alarm bells for me - are there really 10 general purpose algorithms that a vector can implement, or are you trying to make an object that is both a general purpose vector AND which contains application specific functions?
I'm certainly not saying that you shouldn't do this, it's just that with the information you've given alarm bells are ringing which makes me think that maybe something is wrong with your abstractions and there is a better way to achieve what you want.

I also inherited from std::vector recently, and found it to be very useful and so far I haven't experienced any problems with it.
My class is a sparse matrix class, meaning that I need to store my matrix elements somewhere, namely in an std::vector. My reason for inheriting was that I was a bit too lazy to write interfaces to all the methods and also I am interfacing the class to Python via SWIG, where there is already good interface code for std::vector. I found it much easier to extend this interface code to my class rather than writing a new one from scratch.
The only problem I can see with the approach is not so much with the non-virtual destructor, but rather some other methods, which I would like to overload, such as push_back(), resize(), insert() etc. Private inheritance could indeed be a good option.
Thanks!

This question is guaranteed to produce breathless pearl-clutching, but in fact there is no defensible reason for avoiding, or "unnecessarily multiplying entities" to avoid, derivation from a Standard container. The simplest, shortest possible expression is clearest, and best.
You do need to exercise all the usual care around any derived type, but there is nothing special about the case of a base from the Standard. Overriding a base member function could be tricky, but that would be unwise to do with any non-virtual base, so there is not much special here. If you were to add a data member, you would need to worry about slicing if the member had to be kept consistent with contents of the base, but again that is the same for any base.
The place where I have found deriving from a standard container particularly useful is to add a single constructor that does precisely the initialization needed, with no chance of confusion or hijacking by other constructors. (I'm looking at you, initialization_list constructors!) Then, you can freely use the resulting object, sliced -- pass it by reference to something expecting the base, move from it to an instance of the base, what have you. There are no edge cases to worry about, unless it would bother you to bind a template argument to the derived class.
A place where this technique will be immediately useful in C++20 is reservation. Where we might have written
std::vector<T> names; names.reserve(1000);
we can say
template<typename C>
struct reserve_in : C {
reserve_in(std::size_t n) { this->reserve(n); }
};
and then have, even as class members,
. . .
reserve_in<std::vector<T>> taken_names{1000}; // 1
std::vector<T> given_names{reserve_in<std::vector<T>>{1000}}; // 2
. . .
(according to preference) and not need to write a constructor just to call reserve() on them.
(The reason that reserve_in, technically, needs to wait for C++20 is that prior Standards don't require the capacity of an empty vector to be preserved across moves. That is acknowledged as an oversight, and can reasonably be expected to be fixed as a defect in time for '20. We can also expect the fix to be, effectively, backdated to previous Standards, because all existing implementations actually do preserve capacity across moves; the Standards just haven't required it. The eager can safely jump the gun -- reserving is almost always just an optimization anyway.)
Some would argue that the case of reserve_in is better served by a free function template:
template<typename C>
auto reserve_in(std::size_t n) { C c; c.reserve(n); return c; }
Such an alternative is certainly viable -- and could even, at times, be infinitesimally faster, because of *RVO. But the choice of derivation or free function should be made on its own merits, and not from baseless (heh!) superstition about deriving from Standard components. In the example use above, only the second form would work with the free function; although outside of class context it could be written a little more concisely:
auto given_names{reserve_in<std::vector<T>>(1000)}; // 2

Here, let me introduce 2 more ways to do want you want. One is another way to wrap std::vector, another is the way to inherit without giving users a chance to break anything:
Let me add another way of wrapping std::vector without writing a lot of function wrappers.
#include <utility> // For std:: forward
struct Derived: protected std::vector<T> {
// Anything...
using underlying_t = std::vector<T>;
auto* get_underlying() noexcept
{
return static_cast<underlying_t*>(this);
}
auto* get_underlying() const noexcept
{
return static_cast<underlying_t*>(this);
}
template <class Ret, class ...Args>
auto apply_to_underlying_class(Ret (*underlying_t::member_f)(Args...), Args &&...args)
{
return (get_underlying()->*member_f)(std::forward<Args>(args)...);
}
};
Inheriting from std::span instead of std::vector and avoid the dtor problem.

Related

Proper way to upcast unique_ptr when returning from a factory

I learned C++ in the 1990s, but haven't used it much since then. I'm trying to catch up on the last couple of decades by using modern techniques in a small library I maintain at work. I've come across this stylistic question, and I'd like to know what modern consensus is.
I have a pure virtual base class Base, with two implementations A and B. The implementations are not in the header file; their public API is entirely part of Base. I have factory functions to construct them, which need to call a common initializer. These classes don't support copying, and can change owners during their lifetime, so I use a unique_ptr to keep things more sane.
In combination, that leads to a somewhat awkward construct:
unique_ptr<Base> rv = make_unique<A>();
I haven't seen cases where make_unique is explicitly given a template class. I also haven't seen cases where make_unique can't be used with auto. However, if I want RVO to apply, then rv (AFAICT) needs to have type unique_ptr<Base> rather than unique_ptr<A>.
Here, I have two deviations from "what I've seen before" in one line. That set off flags in my head, and made me think that I should ask the community for collective wisdom.
Ok, enough English; here's some code:
class Base {
public:
static std::unique_ptr<Base> MakeA();
static std::unique_ptr<Base> MakeB();
private:
void Initialize();
};
class A : public Base {};
class B : public Base {};
std::unique_ptr<Base> Base::MakeA() {
std::unique_ptr<Base> rv = std::make_unique<A>();
rv->Initialize();
return rv;
}
std::unique_ptr<Base> Base::MakeB() {
std::unique_ptr<Base> rv = std::make_unique<B>();
rv->Initialize();
return rv;
}
Again, here's some constraints I'm adding for myself to keep things super-clean:
Since the classes A and B are implementation details, I don't make them visible externally.
As a consequence, the factory functions need to return a unique_ptr<Base> rather than unique_ptr<A> and unique_ptr<B>.
I'm trying to make this NRVO-friendly, since I have other move-only classes to deal with and want to practice making RVO-friendly functions.
Since the factory functions have to call the Initialize method, I have to return an lvalue.
I want to use make_unique instead of unique_ptr(new ...), for the usual reasons.
What's the usual way to construct this sort of thing?
if I want RVO to apply, then rv (AFAICT) needs to have type unique_ptr rather than unique_ptr
Since you are not actually returning or assigning the constructed object (just a (unique_)pointer to it), RVO is irrelevant here. unique_ptr is basically zero-overhead, it's a single pointer (i.e. one register). It may make a difference in terms of the C++ abstract machine (the constructors/operators of std::unique_ptr), but in practice this aspect will have no impact.
More to the point: At no point is the (move) assignment operator or the move/copy constructor of A or B called, with or without (N)RVO.
Note that this is not specific to unique_ptr - if you were to return a raw pointer, the same idea would apply.
Your design is good. Still if you care to eliminate move/copy on return which is the case with your design (of course if your compiler supports such an optimization), you might also want to eliminate it at the creation site. To this end you need to use
std::unique_ptr<Base> rv(new Derived(...));
Here you should be careful with what you place instead of ellipsis. About your comment, what usage of new is discouraged to use it in modern C++. It is not, see, make_unique internally uses new and it is written in modern C++.
But again, your design is good, and you could resort to my proposal only if you really care to avoid moves.

Is C++'s default copy-constructor inherently unsafe? Are iterators fundamentally unsafe too?

I used to think C++'s object model is very robust when best practices are followed.
Just a few minutes ago, though, I had a realization that I hadn't had before.
Consider this code:
class Foo
{
std::set<size_t> set;
std::vector<std::set<size_t>::iterator> vector;
// ...
// (assume every method ensures p always points to a valid element of s)
};
I have written code like this. And until today, I hadn't seen a problem with it.
But, thinking about it a more, I realized that this class is very broken:
Its copy-constructor and copy-assignment copy the iterators inside the vector, which implies that they will still point to the old set! The new one isn't a true copy after all!
In other words, I must manually implement the copy-constructor even though this class isn't managing any resources (no RAII)!
This strikes me as astonishing. I've never come across this issue before, and I don't know of any elegant way to solve it. Thinking about it a bit more, it seems to me that copy construction is unsafe by default -- in fact, it seems to me that classes should not be copyable by default, because any kind of coupling between their instance variables risks rendering the default copy-constructor invalid.
Are iterators fundamentally unsafe to store? Or, should classes really be non-copyable by default?
The solutions I can think of, below, are all undesirable, as they don't let me take advantage of the automatically-generated copy constructor:
Manually implement a copy constructor for every nontrivial class I write. This is not only error-prone, but also painful to write for a complicated class.
Never store iterators as member variables. This seems severely limiting.
Disable copying by default on all classes I write, unless I can explicitly prove they are correct. This seems to run entirely against C++'s design, which is for most types to have value semantics, and thus be copyable.
Is this a well-known problem, and if so, does it have an elegant/idiomatic solution?
C++ copy/move ctor/assign are safe for regular value types. Regular value types behave like integers or other "regular" values.
They are also safe for pointer semantic types, so long as the operation does not change what the pointer "should" point to. Pointing to something "within yourself", or another member, is an example of where it fails.
They are somewhat safe for reference semantic types, but mixing pointer/reference/value semantics in the same class tends to be unsafe/buggy/dangerous in practice.
The rule of zero is that you make classes that behave like either regular value types, or pointer semantic types that don't need to be reseated on copy/move. Then you don't have to write copy/move ctors.
Iterators follow pointer semantics.
The idiomatic/elegant around this is to tightly couple the iterator container with the pointed-into container, and block or write the copy ctor there. They aren't really separate things once one contains pointers into the other.
Yes, this is a well known "problem" -- whenever you store pointers in an object, you're probably going to need some kind of custom copy constructor and assignment operator to ensure that the pointers are all valid and point at the expected things.
Since iterators are just an abstraction of collection element pointers, they have the same issue.
Is this a well-known problem?
Well, it is known, but I would not say well-known. Sibling pointers do not occur often, and most implementations I have seen in the wild were broken in the exact same way than yours is.
I believe the problem to be infrequent enough to have escaped most people's notice; interestingly, as I follow more Rust than C++ nowadays, it crops up there quite often because of the strictness of the type system (ie, the compiler refuses those programs, prompting questions).
does it have an elegant/idiomatic solution?
There are many types of sibling pointers situations, so it really depends, however I know of two generic solutions:
keys
shared elements
Let's review them in order.
Pointing to a class-member, or pointing into an indexable container, then one can use an offset or key rather than an iterator. It is slightly less efficient (and might require a look-up) however it is a fairly simple strategy. I have seen it used to great effect in shared-memory situation (where using pointers is a no-no since the shared-memory area may be mapped at different addresses).
The other solution is used by Boost.MultiIndex, and consists in an alternative memory layout. It stems from the principle of the intrusive container: instead of putting the element into the container (moving it in memory), an intrusive container uses hooks already inside the element to wire it at the right place. Starting from there, it is easy enough to use different hooks to wire a single elements into multiple containers, right?
Well, Boost.MultiIndex kicks it two steps further:
It uses the traditional container interface (ie, move your object in), but the node to which the object is moved in is an element with multiple hooks
It uses various hooks/containers in a single entity
You can check various examples and notably Example 5: Sequenced Indices looks a lot like your own code.
Is this a well-known problem
Yes. Any time you have a class that contains pointers, or pointer-like data like an iterator, you have to implement your own copy-constructor and assignment-operator to ensure the new object has valid pointers/iterators.
and if so, does it have an elegant/idiomatic solution?
Maybe not as elegant as you might like, and probably is not the best in performance (but then, copies sometimes are not, which is why C++11 added move semantics), but maybe something like this would work for you (assuming the std::vector contains iterators into the std::set of the same parent object):
class Foo
{
private:
std::set<size_t> s;
std::vector<std::set<size_t>::iterator> v;
struct findAndPushIterator
{
Foo &foo;
findAndPushIterator(Foo &f) : foo(f) {}
void operator()(const std::set<size_t>::iterator &iter)
{
std::set<size_t>::iterator found = foo.s.find(*iter);
if (found != foo.s.end())
foo.v.push_back(found);
}
};
public:
Foo() {}
Foo(const Foo &src)
{
*this = src;
}
Foo& operator=(const Foo &rhs)
{
v.clear();
s = rhs.s;
v.reserve(rhs.v.size());
std::for_each(rhs.v.begin(), rhs.v.end(), findAndPushIterator(*this));
return *this;
}
//...
};
Or, if using C++11:
class Foo
{
private:
std::set<size_t> s;
std::vector<std::set<size_t>::iterator> v;
public:
Foo() {}
Foo(const Foo &src)
{
*this = src;
}
Foo& operator=(const Foo &rhs)
{
v.clear();
s = rhs.s;
v.reserve(rhs.v.size());
std::for_each(rhs.v.begin(), rhs.v.end(),
[this](const std::set<size_t>::iterator &iter)
{
std::set<size_t>::iterator found = s.find(*iter);
if (found != s.end())
v.push_back(found);
}
);
return *this;
}
//...
};
Yes, of course it's a well-known problem.
If your class stored pointers, as an experienced developer you would intuitively know that the default copy behaviours may not be sufficient for that class.
Your class stores iterators and, since they are also "handles" to data stored elsewhere, the same logic applies.
This is hardly "astonishing".
The assertion that Foo is not managing any resources is false.
Copy-constructor aside, if a element of set is removed, there must be code in Foo that manages vector so that the respective iterator is removed.
I think the idiomatic solution is to just use one container, a vector<size_t>, and check that the count of an element is zero before inserting. Then the copy and move defaults are fine.
"Inherently unsafe"
No, the features you mention are not inherently unsafe; the fact that you thought of three possible safe solutions to the problem is evidence that there is no "inherent" lack of safety here, even though you think the solutions are undesirable.
And yes, there is RAII here: the containers (set and vector) are managing resources. I think your point is that the RAII is "already taken care of" by the std containers. But you need to then consider the container instances themselves to be "resources", and in fact your class is managing them. You're correct that you're not directly managing heap memory, because this aspect of the management problem is taken care of for you by the standard library. But there's more to the management problem, which I'll talk a bit more about below.
"Magic" default behavior
The problem is that you are apparently hoping that you can trust the default copy constructor to "do the right thing" in a non-trivial case such as this. I'm not sure why you expected the right behavior--perhaps you're hoping that memorizing rules-of-thumb such as the "rule of 3" will be a robust way to ensure that you don't shoot yourself in the foot? Certainly that would be nice (and, as pointed out in another answer, Rust goes much further than other low-level languages toward making foot-shooting much harder), but C++ simply isn't designed for "thoughtless" class design of that sort, nor should it be.
Conceptualizing constructor behavior
I'm not going to try to address the question of whether this is a "well-known problem", because I don't really know how well-characterized the problem of "sister" data and iterator-storing is. But I hope that I can convince you that, if you take the time to think about copy-constructor-behavior for every class you write that can be copied, this shouldn't be a surprising problem.
In particular, when deciding to use the default copy-constructor, you must think about what the default copy-constructor will actually do: namely, it will call the copy-constructor of each non-primitive, non-union member (i.e. members that have copy-constructors) and bitwise-copy the rest.
When copying your vector of iterators, what does std::vector's copy-constructor do? It performs a "deep copy", i.e., the data inside the vector is copied. Now, if the vector contains iterators, how does that affect the situation? Well, it's simple: the iterators are the data stored by the vector, so the iterators themselves will be copied. What does an iterator's copy-constructor do? I'm not going to actually look this up, because I don't need to know the specifics: I just need to know that iterators are like pointers in this (and other respect), and copying a pointer just copies the pointer itself, not the data pointed to. I.e., iterators and pointers do not have deep-copying by default.
Note that this is not surprising: of course iterators don't do deep-copying by default. If they did, you'd get a different, new set for each iterator being copied. And this makes even less sense than it initially appears: for instance, what would it actually mean if uni-directional iterators made deep-copies of their data? Presumably you'd get a partial copy, i.e., all the remaining data that's still "in front of" the iterator's current position, plus a new iterator pointing to the "front" of the new data structure.
Now consider that there is no way for a copy-constructor to know the context in which it's being called. For instance, consider the following code:
using iter = std::set<size_t>::iterator; // use typedef pre-C++11
std::vector<iter> foo = getIters(); // get a vector of iterators
useIters(foo); // pass vector by value
When getIters is called, the return value might be moved, but it might also be copy-constructed. The assignment to foo also invokes a copy-constructor, though this may also be elided. And unless useIters takes its argument by reference, then you've also got a copy constructor call there.
In any of these cases, would you expect the copy constructor to change which std::set is pointed to by the iterators contained by the std::vector<iter>? Of course not! So naturally std::vector's copy-constructor can't be designed to modify the iterators in that particular way, and in fact std::vector's copy-constructor is exactly what you need in most cases where it will actually be used.
However, suppose std::vector could work like this: suppose it had a special overload for "vector-of-iterators" that could re-seat the iterators, and that the compiler could somehow be "told" only to invoke this special constructor when the iterators actually need to be re-seated. (Note that the solution of "only invoke the special overload when generating a default constructor for a containing class that also contains an instance of the iterators' underlying data type" wouldn't work; what if the std::vector iterators in your case were pointing at a different standard set, and were being treated simply as a reference to data managed by some other class? Heck, how is the compiler supposed to know whether the iterators all point to the same std::set?) Ignoring this problem of how the compiler would know when to invoke this special constructor, what would the constructor code look like? Let's try it, using _Ctnr<T>::iterator as our iterator type (I'll use C++11/14isms and be a bit sloppy, but the overall point should be clear):
template <typename T, typename _Ctnr>
std::vector< _Ctnr<T>::iterator> (const std::vector< _Ctnr<T>::iterator>& rhs)
: _data{ /* ... */ } // initialize underlying data...
{
for (auto i& : rhs)
{
_data.emplace_back( /* ... */ ); // What do we put here?
}
}
Okay, so we want each new, copied iterator to be re-seated to refer to a different instance of _Ctnr<T>. But where would this information come from? Note that the copy-constructor can't take the new _Ctnr<T> as an argument: then it would no longer be a copy-constructor. And in any case, how would the compiler know which _Ctnr<T> to provide? (Note, too, that for many containers, finding the "corresponding iterator" for the new container may be non-trivial.)
Resource management with std:: containers
This isn't just an issue of the compiler not being as "smart" as it could or should be. This is an instance where you, the programmer, have a specific design in mind that requires a specific solution. In particular, as mentioned above, you have two resources, both std:: containers. And you have a relationship between them. Here we get to something that most of the other answers have stated, and which by this point should be very, very clear: related class members require special care, since C++ does not manage this coupling by default. But what I hope is also clear by this point is that you shouldn't think of the problem as arising specifically because of data-member coupling; the problem is simply that default-construction isn't magic, and the programmer must be aware of the requirements for correctly copying a class before deciding to let the implicitly-generated constructor handle copying.
The elegant solution
...And now we get to aesthetics and opinions. You seem to find it inelegant to be forced to write a copy-constructor when you don't have any raw pointers or arrays in your class that must be manually managed.
But user-defined copy constructors are elegant; allowing you to write them is C++'s elegant solution to the problem of writing correct non-trivial classes.
Admittedly, this seems like a case where the "rule of 3" doesn't quite apply, since there's a clear need to either =delete the copy-constructor or write it yourself, but there's no clear need (yet) for a user-defined destructor. But again, you can't simply program based on rules of thumb and expect everything to work correctly, especially in a low-level language such as C++; you must be aware of the details of (1) what you actually want and (2) how that can be achieved.
So, given that the coupling between your std::set and your std::vector actually creates a non-trivial problem, solving the problem by wrapping them together in a class that correctly implements (or simply deletes) the copy-constructor is actually a very elegant (and idiomatic) solution.
Explicitly defining versus deleting
You mention a potential new "rule of thumb" to follow in your coding practices: "Disable copying by default on all classes I write, unless I can explicitly prove they are correct." While this might be a safer rule of thumb (at least in this case) than the "rule of 3" (especially when your criterion for "do I need to implement the 3" is to check whether a deleter is required), my above caution against relying on rules of thumb still applies.
But I think the solution here is actually simpler than the proposed rule of thumb. You don't need to formally prove the correctness of the default method; you simply need to have a basic idea of what it would do, and what you need it to do.
Above, in my analysis of your particular case, I went into a lot of detail--for instance, I brought up the possibility of "deep-copying iterators". You don't need to go into this much detail to determine whether or not the default copy-constructor will work correctly. Instead, simply imagine what your manually-created copy constructor will look like; you should be able to tell pretty quickly how similar your imaginary explicitly-defined constructor is to the one the compiler would generate.
For example, a class Foo containing a single vector data will have a copy constructor that looks like this:
Foo::Foo(const Foo& rhs)
: data{rhs.data}
{}
Without even writing that out, you know that you can rely on the implicitly-generated one, because it's exactly the same as what you'd have written above.
Now, consider the constructor for your class Foo:
Foo::Foo(const Foo& rhs)
: set{rhs.set}
, vector{ /* somehow use both rhs.set AND rhs.vector */ } // ...????
{}
Right away, given that simply copying vector's members won't work, you can tell that the default constructor won't work. So now you need to decide whether your class needs to be copyable or not.

Is it safe to create a class based on vector/list of its instances? [duplicate]

Ok, this is really difficult to confess, but I do have a strong temptation at the moment to inherit from std::vector.
I need about 10 customized algorithms for vector and I want them to be directly members of the vector. But naturally I want also to have the rest of std::vector's interface. Well, my first idea, as a law-abiding citizen, was to have an std::vector member in MyVector class. But then I would have to manually reprovide all of the std::vector's interface. Too much to type. Next, I thought about private inheritance, so that instead of reproviding methods I would write a bunch of using std::vector::member's in the public section. This is tedious too actually.
And here I am, I really do think that I can simply inherit publicly from std::vector, but provide a warning in the documentation that this class should not be used polymorphically. I think most developers are competent enough to understand that this shouldn't be used polymorphically anyway.
Is my decision absolutely unjustifiable? If so, why? Can you provide an alternative which would have the additional members actually members but would not involve retyping all of vector's interface? I doubt it, but if you can, I'll just be happy.
Also, apart from the fact that some idiot can write something like
std::vector<int>* p = new MyVector
is there any other realistic peril in using MyVector? By saying realistic I discard things like imagine a function which takes a pointer to vector ...
Well, I've stated my case. I have sinned. Now it's up to you to forgive me or not :)
Actually, there is nothing wrong with public inheritance of std::vector. If you need this, just do that.
I would suggest doing that only if it is really necessary. Only if you can't do what you want with free functions (e.g. should keep some state).
The problem is that MyVector is a new entity. It means a new C++ developer should know what the hell it is before using it. What's the difference between std::vector and MyVector? Which one is better to use here and there? What if I need to move std::vector to MyVector? May I just use swap() or not?
Do not produce new entities just to make something to look better. These entities (especially, such common) aren't going to live in vacuum. They will live in mixed environment with constantly increased entropy.
The whole STL was designed in such way that algorithms and containers are separate.
This led to a concept of different types of iterators: const iterators, random access iterators, etc.
Therefore I recommend you to accept this convention and design your algorithms in such way that they won't care about what is the container they're working on - and they would only require a specific type of iterator which they'd need to perform their operations.
Also, let me redirect you to some good remarks by Jeff Attwood.
The main reason for not inheriting from std::vector publicly is an absence of a virtual destructor that effectively prevents you from polymorphic use of descendants. In particular, you are not allowed to delete a std::vector<T>* that actually points at a derived object (even if the derived class adds no members), yet the compiler generally can't warn you about it.
Private inheritance is allowed under these conditions. I therefore recommend using private inheritance and forwarding required methods from the parent as shown below.
class AdVector: private std::vector<double>
{
typedef double T;
typedef std::vector<double> vector;
public:
using vector::push_back;
using vector::operator[];
using vector::begin;
using vector::end;
AdVector operator*(const AdVector & ) const;
AdVector operator+(const AdVector & ) const;
AdVector();
virtual ~AdVector();
};
You should first consider refactoring your algorithms to abstract the type of container they are operating on and leave them as free templated functions, as pointed out by majority of answerers. This is usually done by making an algorithm accept a pair of iterators instead of container as arguments.
If you're considering this, you've clearly already slain the language pedants in your office. With them out of the way, why not just do
struct MyVector
{
std::vector<Thingy> v; // public!
void func1( ... ) ; // and so on
}
That will sidestep all the possible blunders that might come out of accidentally upcasting your MyVector class, and you can still access all the vector ops just by adding a little .v .
What are you hoping to accomplish? Just providing some functionality?
The C++ idiomatic way to do this is to just write some free functions that implement the functionality. Chances are you don't really require a std::vector, specifically for the functionality you're implementing, which means you're actually losing out on reusability by trying to inherit from std::vector.
I would strongly advise you to look at the standard library and headers, and meditate on how they work.
I think very few rules should be followed blindly 100% of the time. It sounds like you've given it quite a lot of thought, and are convinced that this is the way to go. So -- unless someone comes up with good specific reasons not to do this -- I think you should go ahead with your plan.
There is no reason to inherit from std::vector unless one wants to make a class that works differently than std::vector, because it handles in its own way the hidden details of std::vector's definition, or unless one has ideological reasons to use the objects of such class in place of std::vector's ones. However, the creators of the standard on C++ did not provide std::vector with any interface (in the form of protected members) that such inherited class could take advantage of in order to improve the vector in a specific way. Indeed, they had no way to think of any specific aspect that might need extension or fine-tune additional implementation, so they did not need to think of providing any such interface for any purpose.
The reasons for the second option can be only ideological, because std::vectors are not polymorphic, and otherwise there is no difference whether you expose std::vector's public interface via public inheritance or via public membership. (Suppose you need to keep some state in your object so you cannot get away with free functions). On a less sound note and from the ideological point of view, it appears that std::vectors are a kind of "simple idea", so any complexity in the form of objects of different possible classes in their place ideologically makes no use.
In practical terms: If you do not have any data members in your derived class, you do not have any problems, not even in polymorphic usage. You only need a virtual destructor if the sizes of the base class and the derived class are different and/or you have virtual functions (which means a v-table).
BUT in theory: From [expr.delete] in the C++0x FCD: In the first alternative (delete object), if the static type of the object to be deleted is different from its dynamic type, the static type shall be a base class of the dynamic type of the object to be deleted and the static type shall have a virtual destructor or the behavior is undefined.
But you can derive privately from std::vector without problems.
I have used the following pattern:
class PointVector : private std::vector<PointType>
{
typedef std::vector<PointType> Vector;
...
using Vector::at;
using Vector::clear;
using Vector::iterator;
using Vector::const_iterator;
using Vector::begin;
using Vector::end;
using Vector::cbegin;
using Vector::cend;
using Vector::crbegin;
using Vector::crend;
using Vector::empty;
using Vector::size;
using Vector::reserve;
using Vector::operator[];
using Vector::assign;
using Vector::insert;
using Vector::erase;
using Vector::front;
using Vector::back;
using Vector::push_back;
using Vector::pop_back;
using Vector::resize;
...
If you follow good C++ style, the absence of virtual function is not the problem, but slicing (see https://stackoverflow.com/a/14461532/877329)
Why is absence of virtual functions not the problem? Because a function should not try to delete any pointer it receives, since it does not have an ownership of it. Therefore, if following strict ownership policies, virtual destructors should not be needed. For example, this is always wrong (with or without virtual destructor):
void foo(SomeType* obj)
{
if(obj!=nullptr) //The function prototype only makes sense if parameter is optional
{
obj->doStuff();
}
delete obj;
}
class SpecialSomeType:public SomeType
{
// whatever
};
int main()
{
SpecialSomeType obj;
doStuff(&obj); //Will crash here. But caller does not know that
// ...
}
In contrast, this will always work (with or without virtual destructor):
void foo(SomeType* obj)
{
if(obj!=nullptr) //The function prototype only makes sense if parameter is optional
{
obj->doStuff();
}
}
class SpecialSomeType:public SomeType
{
// whatever
};
int main()
{
SpecialSomeType obj;
doStuff(&obj);
// The correct destructor *will* be called here.
}
If the object is created by a factory, the factory should also return a pointer to a working deleter, which should be used instead of delete, since the factory may use its own heap. The caller can get it form of a share_ptr or unique_ptr. In short, do not delete anything you didn't get directly from new.
Yes it's safe as long as you are careful not to do the things that are not safe... I don't think I've ever seen anyone use a vector with new so in practice you'll likely be fine. However, it's not the common idiom in c++....
Are you able to give more information on what the algorithms are?
Sometimes you end up going down one road with a design and then can't see the other paths you might have taken - the fact that you claim to need to vector with 10 new algorithms rings alarm bells for me - are there really 10 general purpose algorithms that a vector can implement, or are you trying to make an object that is both a general purpose vector AND which contains application specific functions?
I'm certainly not saying that you shouldn't do this, it's just that with the information you've given alarm bells are ringing which makes me think that maybe something is wrong with your abstractions and there is a better way to achieve what you want.
I also inherited from std::vector recently, and found it to be very useful and so far I haven't experienced any problems with it.
My class is a sparse matrix class, meaning that I need to store my matrix elements somewhere, namely in an std::vector. My reason for inheriting was that I was a bit too lazy to write interfaces to all the methods and also I am interfacing the class to Python via SWIG, where there is already good interface code for std::vector. I found it much easier to extend this interface code to my class rather than writing a new one from scratch.
The only problem I can see with the approach is not so much with the non-virtual destructor, but rather some other methods, which I would like to overload, such as push_back(), resize(), insert() etc. Private inheritance could indeed be a good option.
Thanks!
This question is guaranteed to produce breathless pearl-clutching, but in fact there is no defensible reason for avoiding, or "unnecessarily multiplying entities" to avoid, derivation from a Standard container. The simplest, shortest possible expression is clearest, and best.
You do need to exercise all the usual care around any derived type, but there is nothing special about the case of a base from the Standard. Overriding a base member function could be tricky, but that would be unwise to do with any non-virtual base, so there is not much special here. If you were to add a data member, you would need to worry about slicing if the member had to be kept consistent with contents of the base, but again that is the same for any base.
The place where I have found deriving from a standard container particularly useful is to add a single constructor that does precisely the initialization needed, with no chance of confusion or hijacking by other constructors. (I'm looking at you, initialization_list constructors!) Then, you can freely use the resulting object, sliced -- pass it by reference to something expecting the base, move from it to an instance of the base, what have you. There are no edge cases to worry about, unless it would bother you to bind a template argument to the derived class.
A place where this technique will be immediately useful in C++20 is reservation. Where we might have written
std::vector<T> names; names.reserve(1000);
we can say
template<typename C>
struct reserve_in : C {
reserve_in(std::size_t n) { this->reserve(n); }
};
and then have, even as class members,
. . .
reserve_in<std::vector<T>> taken_names{1000}; // 1
std::vector<T> given_names{reserve_in<std::vector<T>>{1000}}; // 2
. . .
(according to preference) and not need to write a constructor just to call reserve() on them.
(The reason that reserve_in, technically, needs to wait for C++20 is that prior Standards don't require the capacity of an empty vector to be preserved across moves. That is acknowledged as an oversight, and can reasonably be expected to be fixed as a defect in time for '20. We can also expect the fix to be, effectively, backdated to previous Standards, because all existing implementations actually do preserve capacity across moves; the Standards just haven't required it. The eager can safely jump the gun -- reserving is almost always just an optimization anyway.)
Some would argue that the case of reserve_in is better served by a free function template:
template<typename C>
auto reserve_in(std::size_t n) { C c; c.reserve(n); return c; }
Such an alternative is certainly viable -- and could even, at times, be infinitesimally faster, because of *RVO. But the choice of derivation or free function should be made on its own merits, and not from baseless (heh!) superstition about deriving from Standard components. In the example use above, only the second form would work with the free function; although outside of class context it could be written a little more concisely:
auto given_names{reserve_in<std::vector<T>>(1000)}; // 2
Here, let me introduce 2 more ways to do want you want. One is another way to wrap std::vector, another is the way to inherit without giving users a chance to break anything:
Let me add another way of wrapping std::vector without writing a lot of function wrappers.
#include <utility> // For std:: forward
struct Derived: protected std::vector<T> {
// Anything...
using underlying_t = std::vector<T>;
auto* get_underlying() noexcept
{
return static_cast<underlying_t*>(this);
}
auto* get_underlying() const noexcept
{
return static_cast<underlying_t*>(this);
}
template <class Ret, class ...Args>
auto apply_to_underlying_class(Ret (*underlying_t::member_f)(Args...), Args &&...args)
{
return (get_underlying()->*member_f)(std::forward<Args>(args)...);
}
};
Inheriting from std::span instead of std::vector and avoid the dtor problem.

abstract class and using array polymorphically

i'm just reading meyers "More Effective C++ 35 New Ways" - item 33, and he suggest there
always to inherit from an abstract base class, and not a concrete.
one of the reason he claims, which i can't quite get , is that with inheriting from an abstract class, treating array polymorphically (item 3 in the book) is not a problem.
can someone suggest how is that ?
In addition i would like to hear if it's really always a good thing never to let the client instantiate a class which other derives from ? (meyers in his book is showing a problem with the assignment operator for example )
code example as requested:
CLASS BST {.... };
CLASS BlanacedBST:: public BST {....}
void printBSTArray(ostream& s, const BST array[],int Numelements)
{
for(int i=0;i < Numelements;i++)
{
s << array[i];
}
}
BST BSTArray[10];
printBSTArray(BSTArray); // works fine
BlanacedBST bBSTArray[10];
printBSTArray(bBSTArray); // undefined behaviour (beacuse the subscript operator advances the pointer according to BST chunk size)
then, he addes that avoiding concreate class (BlanacedBST) inheriting from another concreat class(BST) usually avoids this problem - this i don't get how.
While I think that avoiding inheritance from non-abstract classes is a good design guideline and something that should make you think twice about your design, I definitely do not think that it's in the category of 'never do this'.
I will say that classes designed to be inherited from that have data in them should probably be hiding their assignment operator because of the slicing issue.
I think there's a way to categorize classes that isn't often thought of, and I think that causes a lot of confusion. I think there are classes that are designed to be used by value, and classes that are designed to always be used by reference (meaning via a reference or a pointer or something like that).
In most object oriented languages user defined classes can only be used by reference, and there are a special class of 'primitive' types that can be used by value. One of C++'s big strengths is that you can create user defined classes that can be used by value. This can lead to some huge efficiency wins. In Java, for example, all of your points (to pick a random simple class) are heap allocated and need to be garbage collected, even though they're basically just two or three doubles stuck together with some nice 'final' support functions.
So classes that are designed to be used by reference should disable assignment and should seriously consider disabling copy construction and require people to use a 'make a copy of this' virtual function for that purpose. Notice that Java classes generally don't have anything like an assignment operator or standard copy constructor.
Classes that are designed to be used by value should generally not have virtual functions, though it may be very useful to have them be a part of an inheritance hierarchy. They can still be rather complex though because they can contain references to objects of classes designed to be used by reference.
If you need to treat a by reference class as being used by value you should use the handle/body design pattern or a smart pointer. The STL containers are all designed to be used on by value objects, so this is a fairly common problem.
Meyers does not say that you can create the array with no problems; he says that it will be more difficult for you to try to create it. The compiler will complain as soon as you try to initialise it, because you cannot create objects of the base class if it is abstract.

Is there any real risk to deriving from the C++ STL containers?

The claim that it is a mistake ever to use a standard C++ container as a base class surprises me.
If it is no abuse of the language to declare ...
// Example A
typedef std::vector<double> Rates;
typedef std::vector<double> Charges;
... then what, exactly, is the hazard in declaring ...
// Example B
class Rates : public std::vector<double> {
// ...
} ;
class Charges: public std::vector<double> {
// ...
} ;
The positive advantages to B include:
Enables overloading of functions because f(Rates &) and f(Charges &) are distinct signatures
Enables other templates to be specialized, because X<Rates> and X<Charges> are distinct types
Forward declaration is trivial
Debugger probably tells you whether the object is a Rates or a Charges
If, as time goes by, Rates and Charges develop personalities — a Singleton for Rates, an output format for Charges — there is an obvious scope for that functionality to be implemented.
The positive advantages to A include:
Don't have to provide trivial implementations of constructors etc
The fifteen-year-old pre-standard compiler that's the only thing that will compile your legacy doesn't choke
Since specializations are impossible, template X<Rates> and template X<Charges> will use the same code, so no pointless bloat.
Both approaches are superior to using a raw container, because if the implementation changes from vector<double> to vector<float>, there's only one place to change with B and maybe only one place to change with A (it could be more, because someone may have put identical typedef statements in multiple places).
My aim is that this be a specific, answerable question, not a discussion of better or worse practice. Show the worst thing that can happen as a consequence of deriving from a standard container, that would have been prevented by using a typedef instead.
Edit:
Without question, adding a destructor to class Rates or class Charges would be a risk, because std::vector does not declare its destructor as virtual. There is no destructor in the example, and no need for one. Destroying a Rates or Charges object will invoke the base class destructor. There is no need for polymorphism here, either. The challenge is to show something bad happening as a consequence of using derivation instead of a typedef.
Edit:
Consider this use case:
#include <vector>
#include <iostream>
void kill_it(std::vector<double> *victim) {
// user code, knows nothing of Rates or Charges
// invokes non-virtual ~std::vector<double>(), then frees the
// memory allocated at address victim
delete victim ;
}
typedef std::vector<double> Rates;
class Charges: public std::vector<double> { };
int main(int, char **) {
std::vector<double> *p1, *p2;
p1 = new Rates;
p2 = new Charges;
// ???
kill_it(p2);
kill_it(p1);
return 0;
}
Is there any possible error that even an arbitrarily hapless user could introduce in the ??? section which will result in a problem with Charges (the derived class), but not with Rates (the typedef)?
In the Microsoft implementation, vector<T> is itself implemented via inheritance. vector<T,A> is a publicly derived from _Vector_Val<T,A> Should containment be preferred?
The standard containers do not have virtual destructors, thus you cannot handle them polymorphically. If you will not, and everyone who uses your code doesn't, it's not "wrong", per se. However, you are better off using composition anyway, for clarity.
Because you need a virtual destructor and the std containers don't have it. The std containers are not designed to act as base class.
For more information read the article "Why shouldn't we inherit a class from STL classes?"
Guideline
A base class must have:
a public virtual destructor
or a protected non-virtual destructor
One strong counter-argument, in my opinion, is that you are imposing an interface and the implementation onto your types. What happens when you find out that vector memory allocation strategy does not fit your needs? Will you derive from std:deque? What about those 128K lines of code that already use your class? Will everybody need to recompile everything? Will it even compile?
The issue isn't a philisophical one, it's an implementation issue. The standard containers' destructors aren't virtual, which means there's no way to use runtime polymorphisim on them to get the proper desctructor.
I've found in practice it really isn't that much of a pain to create my own custom list classes with just the methods my code needs defined (and a private member of the "parent" class). In fact, it often leads to better-designed classes.
Aside from the fact that a base class needs a virtual destructor or a protected non-virtual destructor you are making the following assertion in your design:
Rates, and Charges for that matter, ARE THE SAME AS a vector of doubles in your example above. By your own assertion "...as time goes by, Rates and Charges develop personalities..." then is the assertion that Rates ARE STILL THE SAME AS a vector of doubles at this point? A vector of doubles is not a singleton for example therefore if I use your Rates to declare my vector of doubles for Widgets I may incur some headache from your code. What else about Rates and Charges are subject to change? Are any of the base class changes safely insulated from clients of your design should they change in a fundamental way?
The point is a class is an element, of many in C++, to express design intentions. Saying what you mean and meaning what you say is the reason against using inheritance in this manner.
...Or just posted more succinctly before my response: Substitution.
Also, in most cases, you should prefer composition or aggregation over inheritance if possible.
Is there any possible error that even an arbitrarily hapless user could introduce in the ??? section which will result in a problem with Charges (the derived class), but not with Rates (the typedef)?
Firstly, there's Mankarse's excellent point:
The comment in kill_it is wrong. If the dynamic type of victim is not std::vector, then the delete simply invokes undefined behaviour. The call to kill_it(p2) causes this to happen, and so nothing needs to be added to the //??? section for this to have undefined behaviour. – Mankarse Sep 3 '11 at 10:53
Secondly, say they call f(*p1); where f is specialised for std::vector<double>: that vector specialisation won't be found - you may end up matching the template specialisation differently - typically running (slower or otherwise less efficient) generic code, or getting a linker error if an un-specialised version isn't actually defined. Not often a significant concern.
Personally, I consider destruction through a pointer to base to be crossing the line - it may only be a "hypothetical" problem (as far as you can tell) given your current compiler, compiler flags, program, OS version etc. - but it could break at any time for no "good" reason.
If you are confident you can avoid deletion via a base-class pointer, go for it.
That said, a few notes on your assessment:
"providing trivial implementations of constructors" - that's a hassle, but one tip for C++03: template <typename A> Classname(const A& a) : Base(a) { } template <typename A, typename B> Classname(const A& a, const B& b) : Base(a, b) { } ... can sometimes be easier than enumerating all the overloads, but doesn't handle non-const parameters, default values, explicit-vs-non-explicit constructors, nor scale to huge numbers of arguments. C++11 provides a better general solution.
Without question, adding a destructor to class Rates or class Charges would be a risk, because std::vector does not declare its destructor as virtual. There is no destructor in the example, and no need for one. Destroying a Rates or Charges object will invoke the base class destructor. There is no need for polymorphism here, either.
There is no risk posed by a derived class destructor if the object is not deleted polymorphically; if it is there undefined behaviour whether or not your derived class has a user-defined destructor. That said, you cross from "probably-ok-for-a-cowboy" to "almost-certainly-not-ok" when you add data members or further bases with destructors that perform clean-up (memory deallocation, mutex unlocking, file handle closing etc.)
Saying "will invoke the base class destructor" makes it sound like that's done directly with no implicitly-defined derived-class destructor involved or making the call - all an optimisation detail and not specified by the Standard.