Suppose I have a class Widget with a container data member d_members, and another container data member d_special_members containing pointers to distinguished elements of d_members. The special members are determined in the constructor:
#include <vector>
struct Widget
{
std::vector<int> d_members;
std::vector<int*> d_special_members;
Widget(std::vector<int> members) : d_members(members)
{
for (auto& member : d_members)
if (member % 2 == 0)
d_special_members.push_back(&member);
}
};
What is the best way to implement the copy constructor and operator=() for such a class?
The d_special_members in the copy should point to the copy of d_members.
Is it necessary to repeat the work that was done in the constructor? I hope this can be avoided.
I would probably like to use the copy-and-swap idiom.
I guess one could use indices instead of pointers, but in my actual use case d_members has a type like std::vector< std::pair<int, int> > (and d_special_members is still just std::vector<int*>, so it refers to elements of pairs), so this would not be very convenient.
Only the existing contents of d_members (as given at construction time) are modified by the class; there is never any reallocation (which would invalidate the pointers).
It should be possible to construct Widget objects with d_members of arbitrary size at runtime.
Note that the default assignment/copy just copies the pointers:
#include <iostream>
using namespace std;
int main()
{
Widget w1({ 1, 2, 3, 4, 5 });
cout << "First special member of w1: " << *w1.d_special_members[0] << "\n";
Widget w2 = w1;
*w2.d_special_members[0] = 3;
cout << "First special member of w1: " << *w1.d_special_members[0] << "\n";
}
yields
First special member of w1: 2
First special member of w1: 3
What you are asking for is an easy way to maintain associations as data is moved to new memory locations. Pointers are far from ideal for this, as you have discovered. What you should be looking for is something relative, like a pointer-to-member. That doesn't quite apply in this case, so I would go with the closest alternative I see: store indices into your sub-structures. So store an index into the vector and a flag indicating the first or second element of the pair (and so on, if your structure gets even more complex).
The other alternative I see is to traverse the data in the old object to figure out which element a given special pointer points to -- essentially computing the indices on the fly -- then find the corresponding element in the new object and take its address. (Maybe you could use a calculation to speed this up, but I'm not sure that would be portable.) If there is a lot of lookup and not much copying, this might be better for overall performance. However, I would rather maintain the code that stores indices.
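For the std::vector<std::pair<int, int>> case from the question, a minimal sketch of that index-plus-flag bookkeeping might look like the following (the MemberRef and resolve names are mine, purely illustrative):
#include <cstddef>
#include <utility>
#include <vector>

// Illustrative sketch only: store (element index, which half of the pair)
// instead of raw pointers, so the compiler-generated copy/move/assignment
// remain correct automatically.
struct Widget
{
    std::vector<std::pair<int, int>> d_members;

    struct MemberRef { std::size_t index; bool second; };   // which pair, which half
    std::vector<MemberRef> d_special_members;

    Widget(std::vector<std::pair<int, int>> members) : d_members(std::move(members))
    {
        for (std::size_t i = 0; i != d_members.size(); ++i) {
            if (d_members[i].first % 2 == 0)
                d_special_members.push_back({i, false});
            if (d_members[i].second % 2 == 0)
                d_special_members.push_back({i, true});
        }
    }

    // Resolve a stored entry back to an int& on demand.
    int& resolve(const MemberRef& r)
    {
        return r.second ? d_members[r.index].second : d_members[r.index].first;
    }
};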
The best way is to use indices. Honestly. It makes moves and copies just work; this is a very useful property because it's so easy to get silently wrong behavior with hand written copies when you add members. A private member function that converts an index into a reference/pointer does not seem very onerous.
That said, there may still be similar situations where indices aren't such a good option. If you, for example, have an unordered_map instead of a vector, you could of course still store the keys rather than pointers to the values, but then every lookup goes through a comparatively expensive hash.
If you really insist on using pointers rather than indices, I'd probably do this:
struct Widget
{
    std::vector<int> d_members;
    std::vector<int*> d_special_members;

    Widget(std::vector<int> members) : d_members(members)
    {
        for (auto& member : d_members)
            if (member % 2 == 0)
                d_special_members.push_back(&member);
    }

    Widget(const Widget& other)
    : d_members(other.d_members)
    , d_special_members(new_special(other))
    {}

    Widget& operator=(const Widget& other)
    {
        d_members = other.d_members;
        d_special_members = new_special(other);
        return *this;
    }

private:
    std::vector<int*> new_special(const Widget& other)
    {
        std::vector<int*> v;
        v.reserve(other.d_special_members.size());
        std::size_t special_index = 0;
        for (std::size_t i = 0;
             i != d_members.size() && special_index != other.d_special_members.size();
             ++i)
        {
            if (&other.d_members[i] == other.d_special_members[special_index]) {
                v.push_back(&d_members[i]);
                ++special_index;
            }
        }
        return v;
    }
};
My implementation runs in linear time and uses no extra space, but exploits the fact (based on your sample code) that there are no repeats in the pointers, and that the pointers are ordered the same as the original data.
I avoid copy-and-swap because it's not necessary here to avoid code duplication, and there just isn't any reason for it: it's a possible performance hit taken to get the strong exception guarantee, that's all. However, writing a generic copy-and-swap helper that gives you strong exception safety with any correctly implemented class is trivial. Class writers should usually not use copy-and-swap for the assignment operator (there are, no doubt, exceptions).
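A minimal sketch of such a generic helper (the name strong_assign is mine, not from the answer):
#include <utility>

// Illustrative only: give any copyable and swappable type a strongly
// exception-safe assignment by copying first, then swapping.
template <typename T>
T& strong_assign(T& dest, const T& src)
{
    T tmp(src);          // may throw; dest is untouched if it does
    using std::swap;
    swap(dest, tmp);     // assumed noexcept for well-behaved types
    return dest;
}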
This works for me for a vector of pairs, though it's terribly ugly and I would never use it in real code:
std::vector<std::pair<int, int>> d_members;
std::vector<int*> d_special_members;
Widget(const Widget& other) : d_members(other.d_members) {
d_special_members.reserve(other.d_special_members.size());
for (const auto p : other.d_special_members) {
ptrdiff_t diff = (char*)p - (char*)(&other.d_members[0]);
d_special_members.push_back((int*)((char*)(&d_members[0]) + diff));
}
}
For the sake of brevity I used only C-style casts; reinterpret_cast would be better. I am not sure whether this solution avoids undefined behavior; in fact, I suspect it does not, but I dare say that most compilers will generate a working program.
I think using indexes instead of pointers is just perfect. You don't need any custom copy code then.
For convenience you may want to define a member function converting the index to the actual pointer you want. Then your members can be of arbitrary complexity.
private:
int* getSpecialMemberPointerFromIndex(int specialIndex)
{
return &d_members[specialIndex];
}
Related
Can we overload the push_back() method in std::vector to allow non-duplicate elements? I know std::set and std::unordered_set are supposed to avoid duplicate elements, but std::set sorts the elements and std::unordered_set stores the elements in no particular order. I need to retrieve the elements in the order they are inserted, while ensuring duplicate elements are not inserted.
Edit: There's a possible duplicate for this question here. The best solution to that duplicate proposes an auxiliary data structure and a custom "add" method. That doesn't look good to me, since users inserting data into the std::vector rarely refer to the documentation for any custom functions (and I would have to document it separately). If there's no efficient way though, this can be a last resort.
Many people advise against inheriting from std::vector, but it seems there's some kind of urban legend going around that doing so will cause the universe to undergo vacuum decay and reality as we know it will dissolve.
You can publicly inherit from std::vector. But you have to think about what you can do with that.
If you inherit from vector, it is highly recommended that you don't add any data members to it. This can cause object slicing (google "c++ object slicing".) You also need to keep in mind that vector is not using virtual functions. That means you cannot override member functions. You can only shadow them, so it's not guaranteed that it will always be your push_back() function that gets called. The original will get called if you pass an object of your class to something that takes a reference to a vector, for example.
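A small hypothetical example of that shadowing pitfall (UniqueIntVector and append_twice are made-up names for illustration):
#include <algorithm>
#include <iostream>
#include <vector>

// Illustrative only: push_back() in the derived class merely shadows the
// base version; it is not virtual, so it is bypassed through a base reference.
struct UniqueIntVector : std::vector<int>
{
    void push_back(int value)                   // shadows, does not override
    {
        if (std::find(begin(), end(), value) == end())
            std::vector<int>::push_back(value);
    }
};

void append_twice(std::vector<int>& v, int value)
{
    v.push_back(value);                         // always the base push_back
    v.push_back(value);
}

int main()
{
    UniqueIntVector u;
    u.push_back(42);                            // derived version: rejects duplicates...
    append_twice(u, 42);                        // ...but the base version is called here
    std::cout << u.size() << '\n';              // prints 3, not 1
}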
So in the end, you'd need to add a push_back_unique() function instead. But that in turn means the job can be done by a simple free function instead. So inheriting from vector isn't needed. This of course means there's never a guarantee that the elements in the vector will be unique; other code might still use push_back() somewhere.
Inheriting vector makes sense if you want to add completely new convenience functions that don't impose or lift any restrictions that vector has. If you want something that looks like a vector but really isn't (because it has different behavior and/or restrictions), you should implement your own type that delegates the container functionality to vector by either inheriting privately from it, or by having it as a private data member, and then replicate the vector API through public wrapper functions.
But this is very tedious to implement. Usually, you don't really need all the API from vector. So I'd say just write a smaller class around vector that only provides the functionality you need. And that functionality sounds like it's going to be pretty much read-only, since allowing write access to the elements allows for setting an element to the same value as another, breaking the container's uniqueness. So you could do something like:
#include <algorithm>   // std::find
#include <cstddef>     // size_t
#include <vector>

template<typename T>
class UniqueVector
{
public:
void push_back(T&& elem)
{
if (std::find(vec_.begin(), vec_.end(), elem) == vec_.end()) {
vec_.push_back(std::forward<T>(elem));
}
}
const T& operator[](size_t index) const
{
return vec_[index];
}
auto begin() const
{
return vec_.cbegin();
}
auto end() const
{
return vec_.cend();
}
private:
std::vector<T> vec_;
};
If you still want to allow write access to individual elements, then you can provide non-const functions that check if the value that is passed is already in the vector. Like:
void assign_if_unique(size_t index, T&& value)
{
if (std::find(vec_.begin(), vec_.end(), value) == vec_.end()) {
vec_[index] = std::forward<T>(value);
}
}
This is a minimal example. You should obviously add the functions you actually want. Like size(), empty(), and whatever else you need.
You should first define a free function¹ to implement your feature:
template<class T>
std::vector<T>&
push_back_unique(std::vector<T>& dest, T const& src)
{
    if (std::find(dest.begin(), dest.end(), src) == dest.end())   // needs <algorithm>
        dest.push_back(src);
    return dest;
}
If you use this a lot, and if make sense regarding your program, you might want to define an operator to do so:
template<class T>
std::vector<T>& operator<<(std::vector<T>& dest, T const& src)
{ return push_back_unique(dest, src); }
This allows:
std::vector<int> data;
data << 5 << 8 << 13 << 5 << 21;
for (auto n : data) std::cout << n << " "; // prints 5 8 13 21
¹ This is because inheriting from standard containers is often bad practice and brings pitfalls.
I'm new to C++. I came across some code and got confused:
vector<int> vec{3,1,4,1,5};
vector<int> &vecRef = vec;
auto vecCopy = vecRef; // makes copy of vec
auto &vecRef2 = vecRef; // reference
I read about the usage of reference types in C++ and I understand why they're useful for immutable types. But for mutable types like vectors, what's the difference between vector vecCopy = vec and vector& vecRef = vec? Aren't they both aliases of vec?
But for mutable types like vectors, what's the difference between vector vecCopy = vec and vector& vecRef = vec? Aren't they both aliases of vec?
No. One is a copy of the entire vector. The other is a reference to the same.
Your example code is contrived. I can't think of any reasons why you would do this:
vector<int> vec{3,1,4,1,5};
vector<int> &vecRef = vec;
You pass variables by reference all the time. But I can't imagine a reason why I'd make a reference to a local variable like this, other than to illustrate an example of references as opposed to copies.
So: vecCopy is a whole DIFFERENT vector with its own contents. At the end of your code, it's identical in contents to vec, but after that, you can add to one or the other and they begin to diverge. vecRef is a reference to the exact same data. If you think of them as (under the hood) pointers, they point to the same object.
Difference between references and values.
One of the features of C++ is that it distinguishes between references and values. A lot of other languages don't do this. Let's say you have a vector:
std::vector<int> v1 = {1, 2, 3};
Creating a deep copy of this vector is really simple:
auto copy_of_v1 = v1;
We can prove it by changing copy_of_v1:
std::cout << (v1 == copy_of_v1) << '\n'; // Prints 1, for true
copy_of_v1[1] = 20; // copy_of_v1 == {1, 20, 3} now
std::cout << (v1 == copy_of_v1) << '\n'; // Prints 0, for false
Use cases for references.
References have three big use cases:
- Avoiding a copy by storing/using a reference
- Getting additional information out of a function (by passing it a reference, and letting it modify the reference)
- Writing data structures / container classes
We've seen the first case already, so let's look at the other two.
Using references to write functions that modify their input. Let's say you wanted to add the ability to append elements to vectors using +=. An operator is a function, so if it's going to modify the vector, it needs to have a reference to it:
// We take a reference to the vector, and return the same reference
template<class T>
std::vector<T>& operator +=(std::vector<T>& vect, T const& thing) {
vect.push_back(thing);
return vect;
}
This allows us to append elements to the vector just like it was a string:
int main() {
std::vector<int> a;
((a += 1) += 2) += 3; // Appends 1, then 2, then 3
for(int i : a) {
std::cout << i << '\n';
}
}
If we didn't take the vector by reference, the function wouldn't be able to change it. This means that we wouldn't be able to append anything.
Using references to write containers.
References make it easy to write mutable containers in C++. When we want to provide access to something in the container, we just return a reference to it. This provides direct access to elements, even primitives.
#include <algorithm>  // std::copy_n
#include <cstddef>    // size_t
#include <memory>     // std::unique_ptr

template<class T>
class MyArray {
    std::unique_ptr<T[]> array;
    size_t count = 0;
public:
    T* data() {
        return array.get();
    }
    T const* data() const {
        return array.get();
    }
    MyArray() = default;          // Default constructor
    MyArray(size_t count)         // Constructs array with given size
        : array(new T[count])
        , count(count) {}
    MyArray(MyArray const& m)     // Copy constructor
        : MyArray(m.count) {
        std::copy_n(m.data(), count, data());
    }
    MyArray(MyArray&&) = default; // Move constructor
    // By returning a reference, we can access elements directly
    T& operator[](size_t index) {
        return array[index];
    }
};
Now, when using MyArray, we can directly change and modify elements, even if they're primitives:
MyArray<int> m(10); // Create with 10 elements
m[0] = 1; // Modify elements directly
m[0]++; // Use things like ++ directly
Using references in c++ is the same as just using the name of the object itself. Therefore, you might consider a reference an alias.
vector<int> vec = {1, 2, 3};
vector<int>& vecRef = vec;
cout << vec.size() << '\n'; // Prints '3'
cout << vecRef.size() << '\n'; // Also prints '3'
It's worth noting that nobody really uses references to simply have another name for an existing object.
They are primarily used instead of pointers to pass objects without copying them.
C++ uses value semantics by default. Objects are values unless you specifically declare them to be references. So:
auto vecCopy = vecRef;
will create a value object called vecCopy which will contain a deep copy of vec since vecRef is an alias for vec. In Python, this would roughly translate to:
import copy
vec = [3, 1, 4, 1, 5]
vecCopy = copy.deepcopy(vec)
Note that it only "roughly" translates to that. How the copy is performed depends on the type of the object. For built-in types (like int and char, for example) it's a straightforward copy of the data they contain. For class types, it invokes either the copy constructor or the copy assignment operator (in your example code, it's the copy constructor), so it's up to these special member functions to actually perform the copy. The default copy constructor and assignment operator copy each class member, which in turn might invoke that member's copy constructor or assignment operator if it has one, and so on, until everything has been copied.
Value semantics in C++ allow for certain code generation optimizations by the compiler that would be difficult to perform when using reference semantics. Obviously if you copy large objects around, the performance benefit of values will get nullified by the performance cost of copying data. In these cases, you would use references. And obviously you need to use references if you need to modify the passed object rather than a copy of it.
In general, value semantics are preferred unless there is a reason to use a reference. For example, a function should take parameters by value, unless the passed argument needs to be modified, or it's too big.
Also, using references can increase the risk of running into undefined behavior (pointers carry the same risks). You can have dangling references, for example: a reference that refers to an object that has already been destroyed. But you can't have dangling values.
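For instance, a minimal sketch of how a reference into a vector can dangle (illustrative only):
#include <iostream>
#include <vector>

// Illustrative sketch: growing a vector can reallocate its storage,
// leaving an earlier reference to an element dangling.
int main()
{
    std::vector<int> v{1, 2, 3};
    int& first = v[0];          // refers into the vector's current buffer
    v.resize(10000);            // may reallocate; 'first' may now dangle
    // Using 'first' after a reallocation would be undefined behavior.
    std::cout << v[0] << '\n';  // access through the vector itself is fine
}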
References can also decrease your ability to reason about what is going on in the program because objects can get modified through references by non-local code.
In any event, it's a rather big subject. Things start to become more clear as you use the language and gain more experience with it. If there's a very general rule of thumb to take away from all this: use values unless there's a reason not to (mostly object size, requiring mutability of a passed function argument, or runtime-polymorphic classes, since those need to be accessed through a reference or pointer when that access is meant to be polymorphic).
You can also find beginner articles and talks about the subject. Here's one to get you started:
https://www.youtube.com/watch?v=PkyD1iv3ATU
I would like to create an object, put the object into a vector, and still be able to modify the same object by accessing only the vector. However, I understand that when an object is push_back() to a vector, the object is actually copied into the vector. As a result, accessing the object in the vector will merely access a similar, but different object.
I have a beginner's knowledge in C, so I know that I can create a pointer to the object, and make a vector of pointers. e.g. vector<Object *>. However, it seems as if pointers are discouraged in C++, and references are preferred. Yet, I cannot make a vector of references.
I wish to use only the standard libraries, so boost is off limits to me.
I heard of smart pointers. However, it appears as if there are multiple types of smart pointers. Would it not be overkill for this purpose? If smart pointers are indeed the answer, then how do I determine which one to use?
So my question is: What is the standard practice for creating a vector of references/pointers to objects?
In other words, would like the below (pseudo-)code to work.
#include <iostream>
#include <cstdlib>
#include <vector>
using namespace std;
class Object
{
public:
int field;
};
vector<Object> addToVector(Object &o)
{
vector<Object> v;
v.push_back(o);
v[0].field = 3; // I want this to set o.field to 3.
return v;
}
int main()
{
Object one;
one.field = 1;
cout << one.field << endl; // 1 as expected
Object &refone = one;
refone.field = 2;
cout << one.field << endl; // 2 as expected
vector<Object> v = addToVector(one);
cout << v[0].field << endl; // 3 as expected
cout << one.field << endl; // I want to get 3 here as well, as opposed to 2.
return 0;
}
I would like to create an object, put the object into a vector, and still be able to modify the same object by accessing only the vector. However, I understand that when an object is push_back() to a vector, the object is actually copied into the vector. As a result, accessing the object in the vector will merely access a similar, but different object.
I'm almost certain that this is not what you want or "should" want. Forgive me that direct opening of my answer, but unless you have a very good reason to do this, you probably don't want to do it.
For that (a vector with references) to work, you must guarantee that the referenced objects won't get moved or destroyed while you hold references to them. If you have them in a vector, make sure that vector isn't resized. If you have them on the stack like in your example, then don't let the vector of references or a copy of it leave that stack frame. If you want to store them in some container, use a std::list (its iterators, which are basically pointers, don't get invalidated when inserting or removing elements).
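A minimal sketch of the std::list property relied on here (illustrative only):
#include <iostream>
#include <list>

// Illustrative sketch: pointers and iterators into a std::list remain valid
// when other elements are inserted, unlike pointers into a std::vector
// that reallocates.
int main()
{
    std::list<int> values{1, 2, 3};
    int* special = &values.front();      // keep a pointer to one element

    for (int i = 0; i < 1000; ++i)
        values.push_back(i);             // never invalidates 'special'

    std::cout << *special << '\n';       // still prints 1
}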
You already noticed that you cannot have a vector of "real" references. The reason for that is that references aren't assignable. Consider the following code:
int a = 42;
int b = 21;
int & x = a; // initialisation only way to bind to something
int & y = b;
x = y;
b = 0;
After that, the value you obtain from x will be 21, because the assignment didn't rebind the reference (to b) but assigned to the referenced object, a. But std::vector requires its element type to be assignable.
You could now set out and write a wrapper around a pointer like ...
template<typename T>
struct my_ref {
T * target;
// don't let one construct a my_ref without valid object to reference to
my_ref(T & t) : target(&t) {}
// implicit conversion into an real reference
operator T &(void) {
return *target;
}
// default assignment works as expected with pointers
my_ref & operator=(my_ref const &) = default;
// a moved from reference doesn't make sense, it would be invalid
my_ref & operator=(my_ref &&) = delete;
my_ref(my_ref &&) = delete;
// ...
};
... but this is pretty pointless since std::reference_wrapper already provides exactly that:
#include <functional>  // std::reference_wrapper
#include <iostream>
#include <vector>
using namespace std;

int main (int, char**) {
int object = 21; // half of the answer
vector<reference_wrapper<int>> v;
v.push_back(object);
v[0].get() = 42; // assignment needs explicit conversion of lhs to a real reference
cout << "the answer is " << object << endl;
return 0;
}
Now one could ask why use a wrapper around a pointer like std::reference_wrapper when one could also directly use a pointer. IMO a pointer, having the ability to be nullptr, changes the semantics of the code: when you have a raw pointer, it could be invalid. Sure, you can just assume that it's not, or note it somewhere in comments, but in the end you then rely on something that's not guaranteed by the code (and that usually leads to bugs).
If an element of your vector could "reference" an object or be invalid, then still raw pointers aren't the first choice (for me): When you use an element from your vector which is valid, then the object referenced by it is actually referenced from multiple places on your code; it's shared. The "main" reference to the object then should be a std::shared_ptr and the elements of your vector std::weak_ptrs. You can then (thread safe) acquire a valid "reference" (a shared pointer) when you need to and drop it when done:
auto object = make_shared<int>(42);
vector<weak_ptr<int>> v;
v.push_back (object);
// ... somewhere later, potentially on a different thread
if (auto ref = v[0].lock()) {
// no one "steals" the object now while it's used here
}
// let them do what they want with the object, we're done with it ...
Finally, please take my answer with a grain of salt, much of it is based on my opinion (and experience) and might not count as "standard practice".
If you don't need dynamic growth and don't know the size of the buffer at compile time, when should unique_ptr<int[]> be used instead of vector<int> if at all?
Is there a significant performance loss in using vector instead of unique_ptr?
There is no performance loss in using std::vector vs. std::unique_ptr<int[]>. The alternatives are not exactly equivalent, though, since the vector can be grown while the pointer allocation cannot (this can be an advantage or a disadvantage: did the vector grow by mistake?).
There are other differences, like the fact that the values will be initialized in the std::vector, but they won't be if you new the array (unless you use value-initialization...).
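A short sketch of that initialization difference (illustrative only):
#include <memory>
#include <vector>

int main()
{
    std::unique_ptr<int[]> a(new int[100]);    // elements left uninitialized
    std::unique_ptr<int[]> b(new int[100]());  // value-initialized: all zeros
    std::vector<int>       c(100);             // all zeros as well
    // Reading a[0] before writing to it is undefined behavior;
    // b[0] and c[0] are guaranteed to be 0.
}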
At the end of the day, I personally would opt for std::vector<>, but I still code in C++03 without std::unique_ptr.
If you're in a position where vector<int> is even a possibility, you probably want to go with that except in extreme and rare circumstances. And even then, a custom type instead of unique_ptr<int[]> may well be the best answer.
So what the heck is unique_ptr<int[]> good for? :-)
unique_ptr<T[]> really shines in two circumstances:
1. You need to handle a malloc/free resource from some legacy function and you would like to do it in a modern exception safe style:
#include <cstdlib>   // std::free
#include <cstring>   // strdup (POSIX)
#include <iostream>
#include <memory>

void
foo()
{
std::unique_ptr<char[], void(*)(void*)> p(strdup("some text"), std::free);
for (unsigned i = 0; p[i]; ++i)
std::cout << p[i];
std::cout << '\n';
}
2. You need to temporarily secure a new[] resource before transferring it to another owner:
class X
{
int* data_;
std::string name_;
static void validate(const std::string& nm);
public:
~X() {delete [] data_;}
X(int* data, const std::string& name_of_data)
: data_(nullptr),
name_()
{
std::unique_ptr<int[]> hold(data); // noexcept
name_ = name_of_data; // might throw
validate(name_); // might throw
data_ = hold.release(); // noexcept
}
};
In the above scenario, X owns the pointer passed to it, whether or not the constructor succeeds. This particular example assumes a noexcept default constructor for std::string which is not mandated. However:
This point is generalizable to circumstances not involving std::string.
A std::string default constructor that throws is lame.
std::vector stores its size and capacity along with the pointer to the data itself, whereas std::unique_ptr stores just the pointer, so there may be a small space saving in using std::unique_ptr.
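A quick way to see the footprint difference; the exact numbers are implementation-specific (a sketch, not guaranteed output):
#include <iostream>
#include <memory>
#include <vector>

int main()
{
    // Typical results on a 64-bit implementation (not guaranteed by the standard):
    std::cout << sizeof(std::unique_ptr<int[]>) << '\n';  // usually 8  (one pointer)
    std::cout << sizeof(std::vector<int>) << '\n';         // usually 24 (pointer, size, capacity)
}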
No one has yet mentioned that vector provides iterators and functions such as size(), whereas unique_ptr does not. So if iterators are needed, use std::vector.
std::dynarray was proposed for C++14 for exactly that purpose, but it was moved out into a separate technical specification and never made it into the standard.
Now, compare these two constructions:
std::unique_ptr<int[]> buffer(new int[someCount]);   // note: std::make_unique<int[]>(someCount) would value-initialize instead
auto buffer = std::vector<int>( someCount, someValue );
The first gives you an uninitialized array of int, while the second initializes every element with a value (0 if none is provided). So if you do not need the memory to be initialized, because you will later overwrite it with something more complex than std::fill, choose the first; otherwise choose the second.
Objective Part:
No, there probably shouldn't be a significant performance difference between the two (though I suppose it depends on the implementation and you should measure if it's critical).
Subjective Part:
std::vector is going to give you a well known interface with .size() and .at() and iterators, which will play nicely with all sorts of other code. Using std::unique_ptr gives you a more primitive interface and makes you keep track of details (like the size) separately. Therefore, barring other constraints, I would prefer std::vector.
As a general rule, I prefer using value rather than pointer semantics in C++ (i.e. using vector<Class> instead of vector<Class*>). Usually the slight loss in performance is more than made up for by not having to remember to delete dynamically allocated objects.
Unfortunately, value collections don't work when you want to store a variety of object types that all derive from a common base. See the example below.
#include <iostream>
using namespace std;
class Parent
{
public:
Parent() : parent_mem(1) {}
virtual void write() { cout << "Parent: " << parent_mem << endl; }
int parent_mem;
};
class Child : public Parent
{
public:
Child() : child_mem(2) { parent_mem = 2; }
void write() { cout << "Child: " << parent_mem << ", " << child_mem << endl; }
int child_mem;
};
int main(int, char**)
{
// I can have a polymorphic container with pointer semantics
vector<Parent*> pointerVec;
pointerVec.push_back(new Parent());
pointerVec.push_back(new Child());
pointerVec[0]->write();
pointerVec[1]->write();
// Output:
//
// Parent: 1
// Child: 2, 2
// But I can't do it with value semantics
vector<Parent> valueVec;
valueVec.push_back(Parent());
valueVec.push_back(Child()); // gets turned into a Parent object :(
valueVec[0].write();
valueVec[1].write();
// Output:
//
// Parent: 1
// Parent: 2
}
My question is: Can I have have my cake (value semantics) and eat it too (polymorphic containers)? Or do I have to use pointers?
Since the objects of different classes will have different sizes, you would end up running into the slicing problem if you store them as values.
One reasonable solution is to store container-safe smart pointers. I normally use boost::shared_ptr, which is safe to store in a container. Note that std::auto_ptr is not.
vector<shared_ptr<Parent>> vec;
vec.push_back(shared_ptr<Parent>(new Child()));
shared_ptr uses reference counting so it will not delete the underlying instance until all references are removed.
I just wanted to point out that vector<Foo> is usually more efficient than vector<Foo*>. In a vector<Foo>, all the Foos will be adjacent to each other in memory. Assuming a cold TLB and cache, the first read will add the page to the TLB and pull a chunk of the vector into the CPU caches; subsequent reads will use the warm cache and loaded TLB, with occasional cache misses and less frequent TLB faults.
Contrast this with a vector<Foo*>: as you fill the vector, you obtain Foo*'s from your memory allocator. Assuming your allocator is not extremely smart (tcmalloc?), or that you fill the vector slowly over time, the locations of the individual Foos are likely to be far apart from each other: maybe just hundreds of bytes, maybe megabytes apart.
In the worst case, as you scan through a vector<Foo*> and dereferencing each pointer you will incur a TLB fault and cache miss -- this will end up being a lot slower than if you had a vector<Foo>. (Well, in the really worst case, each Foo has been paged out to disk, and every read incurs a disk seek() and read() to move the page back into RAM.)
So, keep on using vector<Foo> whenever appropriate. :-)
Yes, you can.
The boost.ptr_container library provides polymorphic value-semantic versions of the standard containers. You only have to pass in a pointer to a heap-allocated object; the container takes ownership, and all further operations provide value semantics (except for reclaiming ownership), which gives you almost all of the benefits of value semantics.
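A minimal usage sketch (illustrative, not from the answer; it assumes the Parent/Child classes from the question are already defined):
#include <boost/ptr_container/ptr_vector.hpp>

// Illustrative sketch, reusing Parent/Child from the question. The container
// takes ownership of the heap-allocated objects and exposes them as
// references, so no manual delete is needed.
int main()
{
    boost::ptr_vector<Parent> vec;
    vec.push_back(new Parent());
    vec.push_back(new Child());

    for (Parent& p : vec)   // dereferenced, value-like access
        p.write();
}                           // elements are deleted automatically here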
You might also consider boost::any. I've used it for heterogeneous containers. When reading the value back, you need to perform an any_cast. It will throw a bad_any_cast if it fails. If that happens, you can catch and move on to the next type.
I believe it will throw a bad_any_cast if you try to any_cast a derived class to its base. I tried it:
// But you sort of can do it with boost::any.
vector<any> valueVec;
valueVec.push_back(any(Parent()));
valueVec.push_back(any(Child())); // remains a Child, wrapped in an Any.
Parent p = any_cast<Parent>(valueVec[0]);
Child c = any_cast<Child>(valueVec[1]);
p.write();
c.write();
// Output:
//
// Parent: 1
// Child: 2, 2
// Now try casting the child as a parent.
try {
Parent p2 = any_cast<Parent>(valueVec[1]);
p2.write();
}
catch (const boost::bad_any_cast &e)
{
cout << e.what() << endl;
}
// Output:
// boost::bad_any_cast: failed conversion using boost::any_cast
All that being said, I would also go the shared_ptr route first! Just thought this might be of some interest.
While searching for an answer to this problem, I came across both this and a similar question. In the answers to the other question you will find two suggested solutions:
Use std::variant or boost::variant and a visitor pattern. This solution makes it hard to add new types, but easy to add new functionality. (A minimal sketch of this option follows right after this list.)
Use a wrapper class similar to what Sean Parent presents in his talk. This solution makes it hard to add new functionality, but easy to add new types.
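As promised, a minimal sketch of the first, variant-based option (the Circle, Rectangle and Draw names here are illustrative and mirror the shape types used further down in this answer):
#include <iostream>
#include <variant>
#include <vector>

// Illustrative sketch of option 1: a closed set of shape types held in a
// std::variant, with each operation written as a visitor.
struct Circle    { double radius = 4.0; };
struct Rectangle { double width = 2.0, height = 3.0; };

using Shape = std::variant<Circle, Rectangle>;

struct Draw
{
    void operator()(const Circle& c) const
    { std::cout << "Drew circle with radius " << c.radius << '\n'; }
    void operator()(const Rectangle& r) const
    { std::cout << "Drew rectangle with width " << r.width << '\n'; }
};

int main()
{
    std::vector<Shape> shapes{Circle{}, Rectangle{}};
    for (const auto& shape : shapes)
        std::visit(Draw{}, shape);   // adding an operation = adding a visitor
}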
The wrapper defines the interface you need for your classes and holds a pointer to one such object. The implementation of the interface is done with free functions.
Here is an example implementation of the wrapper pattern:
class Shape
{
public:
template<typename T>
Shape(T t)
: container(std::make_shared<Model<T>>(std::move(t)))
{}
friend void draw(const Shape &shape)
{
shape.container->drawImpl();
}
// add more functions similar to draw() here if you wish
// remember also to add a wrapper in the Concept and Model below
private:
struct Concept
{
virtual ~Concept() = default;
virtual void drawImpl() const = 0;
};
template<typename T>
struct Model : public Concept
{
Model(T x) : m_data(std::move(x)) { }
void drawImpl() const override
{
draw(m_data);
}
T m_data;
};
std::shared_ptr<const Concept> container;
};
Different shapes are then implemented as regular structs/classes. You are free to choose if you want to use member functions or free functions (but you will have to update the above implementation to use member functions). I prefer free functions:
struct Circle
{
const double radius = 4.0;
};
struct Rectangle
{
const double width = 2.0;
const double height = 3.0;
};
void draw(const Circle &circle)
{
cout << "Drew circle with radius " << circle.radius << endl;
}
void draw(const Rectangle &rectangle)
{
cout << "Drew rectangle with width " << rectangle.width << endl;
}
You can now add both Circle and Rectangle objects to the same std::vector<Shape>:
int main() {
std::vector<Shape> shapes;
shapes.emplace_back(Circle());
shapes.emplace_back(Rectangle());
for (const auto &shape : shapes) {
draw(shape);
}
return 0;
}
The downside of this pattern is that it requires a large amount of boilerplate in the interface, since each function needs to be defined three times.
The upside is that you get copy-semantics:
int main() {
Shape a = Circle();
Shape b = Rectangle();
b = a;
draw(a);
draw(b);
return 0;
}
This produces:
Drew rectangle with width 2
Drew rectangle with width 2
If you are concerned about the shared_ptr, you can replace it with a unique_ptr.
However, it will no longer be copyable and you will have to either move all objects or implement copying manually.
Sean Parent discusses this in detail in his talk and an implementation is shown in the above mentioned answer.
Take a look at static_cast and reinterpret_cast
In C++ Programming Language, 3rd ed, Bjarne Stroustrup describes it on page 130. There's a whole section on this in Chapter 6.
You can cast your Parent objects back to Child. This requires you to know which one is which. In the book, Dr. Stroustrup talks about different techniques to avoid this situation.
Do not do this. This negates the polymorphism that you're trying to achieve in the first place!
Most container types want to abstract the particular storage strategy, be it linked list, vector, tree-based or what have you. For this reason, you're going to have trouble with both possessing and consuming the aforementioned cake (i.e., the cake is a lie (NB: someone had to make this joke)).
So what to do? Well there are a few cute options, but most will reduce to variants on one of a few themes or combinations of them: picking or inventing a suitable smart pointer, playing with templates or template templates in some clever way, using a common interface for containees that provides a hook for implementing per-containee double-dispatch.
There's a basic tension between your two stated goals, so you should decide what you want, then try to design something that gets you basically what you want. It is possible to do some nice and unexpected tricks to get pointers to look like values with clever enough reference counting and clever enough implementations of a factory. The basic idea is to use reference counting and copy-on-demand and constness and (for the factory) a combination of the preprocessor, templates, and C++'s static initialization rules to get something that is as smart as possible about automating pointer conversions.
I have, in the past, spent some time trying to envision how to use Virtual Proxy / Envelope-Letter / that cute trick with reference counted pointers to accomplish something like a basis for value semantic programming in C++.
And I think it could be done, but you'd have to provide a fairly closed, C#-managed-code-like world within C++ (though one from which you could break through to underlying C++ when needed). So I have a lot of sympathy for your line of thought.
Just to add one thing to all that 1800 INFORMATION already said.
You might want to take a look at "More Effective C++" by Scott Meyers, "Item 3: Never treat arrays polymorphically", in order to better understand this issue.
I'm using my own templated collection class with exposed value-type semantics, but internally it stores pointers. It uses a custom iterator class that, when dereferenced, yields a value reference instead of a pointer. Copying the collection makes deep copies of the items instead of duplicating pointers, and this is where most of the overhead lies (a really minor issue, considering what I get in return).
That's an idea that could suit your needs.