C++ Vectors, Lists, Pointers, and Confusion - c++

I have an Entity class with a variety of subclasses for particular types of entities. The objects of the subclasses have various relationships with each other, so I was using Entity pointers and static_casts to describe these relationships and to allow entities to access information about the other entities they have relationships with.
However, I found that using raw pointers was a source of a lot of difficulty & hard to debug errors. For instance, I wasted several hours before realizing I was making pointers to an object before copying it into a vector, which invalidated the pointers.
I'm using vectors and lists & stl containers like that to manage all of my memory - I am really trying to avoid worrying about memory management and low-level issues.
Say Cow is a subclass of Entity, and I have a vector of Cows (or some data structure of Cows). This vector might move itself around in memory if it needs to resize or something. I don't really want to care about those details, but I know they might invalidate my pointers. So if a Pig has a relationship with a cow and the cow vector is resized, the Pig loses track of where its friend is located in memory & has a dangerous incorrect pointer.
Here are my questions, although I think someone experienced might be able to help me more just by reading the situation...
Should I try to slip something into the constructor/destructor of my Entity objects so that they automatically invalidate relationships other entities have with them when they are deleted or update memory locations if they are moved? If this is a good approach, any help on the semantics would be useful. I have been writing all of my classes with no constructors or destructors at all so that they are extremely easy to use in things like vectors.
Is the better option to just keep track of the containers & manually update the pointers whenever I know the containers are going to move? So the strategy there would be to iterate through any container that may have moved right after it moves and update all of the pointers immediately.

As I understand it, you're doing something like this:
cow florence, buttercup;
std::vector<cow> v = { florence, buttercup };
cow* buttercup_ptr = &v[1];
// do something involving the cow*
And your buttercup_ptr is getting invalidated because, eg. florence was removed from the vector.
It is possible to resolve this by using smart pointers, like this:
std::shared_ptr<cow> florence = std::make_shared<cow>();
std::vector<std::shared_ptr<cow>> v;
v.push_back(florence);
You can now freely share florence... regardles off how the vector gets shuffled around, it remains valid.
If you want to let florence be destroyed when you pop her out of the vector, that presents a minor problem: anyone holding a copy of the shared_ptr will prevent this particular cow instance being cleaned up. You can avoid that by using a weak_ptr.
std::weak_ptr<cow> florence_ref = florence;
In order to use a weak_ptr you call lock() on it to convert it to a full shared_ptr. If the underlying cow instance was already destroyed because it had no strong references, an excetion is throw (you can also call expired() to check this before lock(), of course!)
The other possibility (as I suggested in my comment on the original question) is to use iterators pointing into the container holding the cow instances. This is a little stranger, but iterators are quite like pointers in many ways. You just have to be careful in your choice of container, to ensure your iterators aren't going to be invalidate (which mean syou can't use a vector for this purpose);
std::set<cow> s;
cow florence;
auto iter = s.insert(florence);
// do something with iter, and not worry if other cows are added to
// or removed from s
I don't think I'd necessarily advise this option, though I'm sure it would work. Using smart pointers is a better way to show future maintainance programmers your intent and has the advantage that getting dangling references is much harder.

You have a misconception about pointers. A pointer is (for the sake of this discussion!) just an address. Why would copying an address invalidate anything? Let's say your object is created in memory at address 12345. new therefore gives you a pointer whose value is 12345:
Entity *entity = new Cow; // object created at address 12345
// entity now has value 12345
Later, when you copy that pointer into a vector (or any other standard container), you copy nothing but the value 12345.
std::vector<Entity *> entities;
entities.push_back(entity);
// entities[0] now has value 12345, entity does not change and remains 12345
In fact, there's no difference with regards to this compared to, say, a std::vector<int>:
int i = 12345;
std::vector<int> ints;
ints.push_back(i);
// ints[0] now has value 12345, i does not change and remains 12345
When the vector is copied, the values inside do not change, either:
std::vector<int> ints2 = ints;
std::vector<Entity *> entities2 = entities;
// i, ints[0] and ints[0] are 12345
// entity, entities[0] and entities[0] are 12345
So if a Pig has a relationship with a cow and the cow vector is
resized, the Pig loses track of where its friend is located in memory
& has a dangerous incorrect pointer.
This fear is unfounded. The Pig does not lose track of its friend.
The dangerous thing is keeping a pointer to a deleted object:
delete entity;
entities[0]->doSometing(); // undefined behaviour, may crash
How to solve this problem is subject of thousands of questions, websites, blogs and books in C++. Smart pointers may help you, but you should first truly understand the concept of ownership.

My two cent. The code below is a working example (C++11, translates with g++ 4.8.2) with smart pointers and factory function create. Should be quite failsafe. (You can always destroy something in C++ if you really want to.)
#include <memory>
#include <vector>
#include <algorithm>
#include <iostream>
#include <sstream>
class Entity;
typedef std::shared_ptr<Entity> ptr_t;
typedef std::vector<ptr_t> animals_t;
class Entity {
protected:
Entity() {};
public:
template<class E, typename... Arglist>
friend ptr_t create(Arglist... args) {
return ptr_t(new E(args...));
}
animals_t my_friends;
virtual std::string what()=0;
};
class Cow : public Entity {
double m_milk;
Cow(double milk) :
Entity(),
m_milk(milk)
{}
public:
friend ptr_t create<Cow>(double);
std::string what() {
std::ostringstream os;
os << "I am a cow.\nMy mother gave " << m_milk << " liters.";
return os.str();
}
};
class Pig : public Entity {
Pig() : Entity() {}
public:
friend ptr_t create<Pig>();
std::string what() {
return "I am a pig.";
}
};
int main() {
animals_t animals;
animals.push_back(create<Cow>(1000.0));
animals.push_back(create<Pig>());
animals[0]->my_friends.push_back(animals[1]);
std::for_each(animals.begin(),animals.end(),[](ptr_t v){
std::cout << '\n' << v->what();
std::cout << "\nMy friends are:";
std::for_each(v->my_friends.begin(),v->my_friends.end(),[](ptr_t f){
std::cout << '\n' << f->what();
});
std::cout << '\n';
});
}
/*
Local Variables:
compile-command: "g++ -std=c++11 test.cc -o a.exe && ./a.exe"
End:
*/
By declaring the Entity constructor as protected one forces that derived objects of type Entity must be created through the factory function create.
The factory makes sure that objects always to into smart pointers (in our case std::shared_ptr).
It is written as template to make it re-usable in the subclasses. More precisely it is a variadic template to allow constructors with any numbers of arguments in the derived classes. In the modified version of the program one passes the milk production of the mother cow to the newly created cow as a constructor argument.

Related

Prevent memory leaks and easy/idiomatic programming in C++

Is there a compromise between managing memory and easy/idiomatic programming in C++?
For example, let's say I have the following classes. We have an Item that someone may purchase, and a ShoppingList of those Items. Let's just say Item and ShoppingList are big data structures with more features than shown and that we would not want to pass them by value normally (but they are stripped down to the basics features in this example).
class Item {
private:
std::string name;
float cost;
public:
Item(std::string name, float cost) : name{name}, cost{cost} { }
std::string toString(void) {
return name + " = " + std::to_string(cost);
}
~Item() { std::cout << "~Item()" << std::endl; }
};
class ShoppingList {
private:
std::vector<Item *> items;
public:
ShoppingList() {}
void addItem(Item &item) {
items.push_back(&item);
}
std::vector<Item *> getItems() {
return items;
}
~ShoppingList() {
std::cout << "~ShoppingList()" << std::endl;
}
};
Now, I want to be able to conveniently do something like the following (inline the argument).
ShoppingList shoppingList{};
shoppingList.add(Item{"soap", 1.99}); // no
C++17 will not me allow to try this either.
ShoppingList shoppingList{};
shoppingList.add(&Item{"soap", 1.99}); // no
shoppingList.add(new Item{"soap", 1.99}); // no
Instead, I have to do something like this.
ShoppingList shoppingList{};
Item soap{"soap", 1.99};
shoppingList.add(soap);
The above is "ok" for adding 1 element, but things get "worse" when I try to use a for loop. The item created on each loop ends with the same address and I do not end up with the intended effect.
ShoppingList shoppingList{};
auto products = std::vector<std::string>{"brush", "comb"};
for (auto product : products) {
Item item{product, 0.99};
shoppingList.addItem(item);
}
It seems unreasonable, to me, to do the following.
ShoppingList shoppingList{};
Item item1{"item1", 0.99};
Item item2{"item2", 0.99};
Item item3{"item3", 0.99};
// and so on
shoppingList.addItem(item1);
shoppingList.addItem(item2);
shoppingList.addItem(item3);
// and so on
There's just so many ways of thinking about how to make it easier to use the code. One way would be to use raw pointers. So I may change the method signature of ShoppingList.addItem(...) to as follows.
void addItem(Item *item);
Then I can do some less difficult use of the objects as follows.
ShoppingList shoppingList{};
shoppingList.addItem(new Item{"item1", 0.99});
shoppingList.addItem(new Item{"item2", 0.99});
shoppingList.addItem(new Item{"item3", 0.99});
With this new method signature, for loops should work.
ShoppingList shoppingList{};
auto products = std::vector<std::string>{"brush", "comb"};
for (auto product : products) {
shoppingList.addItem(new Item{product, 0.99});
}
Ok, so this approach makes it easier to use/write the code, but valgrind analysis says Leak_DefintelyLost where I start using new.
Another approach is smart pointers. So let's wrap the objects into smart pointers; in particular shared_pointer. Here's the new ShoppingList using smart pointers.
class ShoppingList {
private:
std::vector<std::shared_ptr<Item>> items;
public:
ShoppingList() {}
void addItem(std::shared_ptr<Item> item) {
items.push_back(item);
}
std::vector<std::shared_ptr<Item>> getItems() {
return items;
}
~ShoppingList() {
std::cout << "~ShoppingList()" << std::endl;
}
};
Usage will look like the following.
ShoppingList shoppingList{};
shoppingList.addItem(std::make_shared<Item>("soap", 32.0));
shoppingList.addItem(std::make_shared<Item>("shampoo", 32.0));
valgrind does not detect a single memory leak issue.
The only problem I have with using shared_ptr is verbosity and nested types. This declaration looks bad to me. That is a lot of : and < with >.
std::vector<std::shared_ptr<Item>> items;
In one situation where I had a map of maps of big data structures, I considered using shared pointers and ended up with a declaration as follows. If another coder looks at what the intention is (or if I myself come back later to troubleshoot), what is the intention here?
std::map<std::string, std::map<std::string, std::vector<std::shared_ptr<SomeClass>>>> dataBag;
It seems to me that the : and < and > are polluting the code and intent all for the purpose of managing memory leaks. I am not sure what's good or bad or idiomatic to write C++17 while being concise but also preventing memory leaks boostrapped by the language constructs.
Now, I did encounter the idea of RAII. But mocking some classes with that approach (raw pointers only), valgrind still shows Leak_DefinitelyLost.
Any general guidance on I suppose what I would refer to as idiomatic C++ that is memory leakage sensitive/preventative would be helpful. I was going to go down the road of creating factories to store all the pointers that I will create across my API, but I realized that approach would cause the memory to continue to build up even when there are no longer references to some pointers (and smart pointers already handle that part).
At this point, I am almost resigned to the acceptance of living with smart pointers (though there may be diamond and colon operators all over the place) to avoid memory leaks.
One way or another, if you want a collection of items, you're going to have to create each item in the collection. Storing pointers to items mostly helps if they're really so tremendously huge that copying them is unacceptable, even if it happens only a few times, such as when a vector reallocates its memory.
As for passing by reference: right now, you're defining the parameter as an lvalue reference. That means the reference can only refer to an lvalue. Thus the requirement to create a named object (an lvalue) then pass it.
You apparently don't want that. In your case, you have a couple of choices. One is a reference to const. Unlike a (non-const) lvalue reference, this can bind to a temporary object.
Another choice would be to use an rvalue reference. An rvalue reference has the advantage that it can bind to an rvalue (such as a temporary object), and also that it's "aware" that it's dealing with an rvalue, so it can "steal" the contents of that object. This is particularly useful if the object mostly contains a pointer to the actual data, in which case you can just copy the pointer (and modify the existing object so it won't destroy the data when it's destroyed).
In most cases, you're best off starting with the simplest code that could work. Create the vector of objects, and live with the fact that they'll get copied. Sometime later, if you find that you have a real bottleneck from copying those objects as that vector is reallocated, it's soon enough to change your code--such as to store std::unique_ptrs instead of storing objects directly. But that's not your first choice. Chances are pretty good that when you profile the code, you'll find that copying those objects isn't a bottleneck at all, so putting work into keeping them from being copied would have been wasted.
A long time ago I get c++ advice.
Never use raw pointers.
As soon as you are not low-level library developer. You don't need raw pointers. Don't need. Period.

When should I use references in C++?

I've been programming C++ for a while now and I'm starting to doubt that the rule use references whenever possible should be applied everywhere.
Unlike this related SO post I'm interested in a different kind of thing.
In my experience the reference/pointer mix messes up your code:
std::vector<Foo *> &x = get_from_somewhere(); // OK? reference as return value
some_func_pass_by_ref(x); // OK reference argument and reference variable
some_func_by_pointer(x[4]); // OK pointer arg, pointer value
some_func_pass_elem_by_ref(*x[5]); // BAD: pointer value, reference argument
some_func_that_requires_vec_ptr(&x); // BAD: reference value, pointer argument
One option would be to replace & with * const like Foo & with Foo * const
void some_func_by_ref(const std::vector<Foo * const> * const); // BAD: verbose!
this way at least the traversals are gone. and me rewriting function headers is gone, because all arguments will be pointers... at the price of polluting the code with const instead of pointer arithmetic (mainly & and *).
I would like to know how and when you apply use references whenever possible rule.
considering:
minimal rewriting of function prototypes (i.e.: oh damn I need need to rewrite alot of prototypes because I want to put this referenced element into a container)
increasing readability
avoid application of * to transform Foo* to Foo& and vice versa
avoid excessive const usage as in * const
NOTES: one thing I figured is to use pointers whenever I intend to ever put the element into an STL container (see boost::ref)
I don't think this is something C++03 specific but C++11 solutions are fine if they can be backported to C++03 (i.e.: NRVO instead of move-semantics).
When should I use references in C++?
When you need to treat a variable like the object itself (most cases when you don't explicitly need pointers and don't want to take ownership of an object).
I would like to know how and when you apply use references whenever possible rule.
whenever possible, except when you need to:
work on the address (log the address, diagnose or write custom memory allocation, etc)
take ownership of parameter (pass by value)
respect an interface that requires a pointer (C interoperability code and legacy code).
Bjarne Stroustrup stated in his book that he introduced references to the language because operators needed to be called without making a copy of the object (that would mean "by pointer") and he needed to respect syntax similar to calling by value (that would mean "not by pointer") (and thus references were born).
In short, you should use pointers as little as possible:
if the value is optional ("can be null") then use a std::optional around it, not a pointer
if you need to take ownership of the value, receive parameter by value (not a pointer)
if you need to read a value without modifying it, receive parameter by const &
if you need to allocate dynamically or return newly/dynamically allocated object, transmit value by one of: std::shared_ptr, std::unique_ptr, your_raii_pointer_class_here - not by (raw) pointer
if you need to pass a pointer to C code, you should still use the std::xxx_ptr classes, and get the pointer using .get() for getting the raw pointer.
one thing I figured is to use pointers whenever I intend to ever put the element into an STL container (or can I get rid of this?)
You can use Boost Pointer Container library.
IMHO the rule stands because raw pointers are dangerous because ownership and destruction responsibility becomes rapidly unclear. Hence the multiple encapsulations around the concept (smart_ptr, auto_ptr, unique_ptr, ...).
First, consider using such encapsulations instead of raw pointer in your container.
Second, why do you need to put pointers in a container ? I mean, they're meant to contain full objects ; they have an allocator as template argument for precise memory allocation after all. Most of the time, you want pointers because you have an OO-approach making heavy use of polymorphism. You should reconsider this approach. For example you can replace:
struct Animal {virtual std::string operator()() = 0;};
struct Dog : Animal {std::string operator()() {return "woof";}};
struct Cat : Animal {std::string operator()() {return "miaow";}};
// can not have a vector<Animal>
By something like this, using Boost.Variant :
struct Dog {std::string operator()() {return "woof";}};
struct Cat {std::string operator()() {return "miaow";}};
typedef boost::variant<Dog, Cat> Animal;
// can have a vector<Animal>
This way when you add a new animal, you inherit nothing, you just add it to the variant.
You can also consider, a little bit more complicated, but far more generic, using Boost.Fusion :
struct Dog {std::string talk; Dog() : talk("wook"){}};
struct Cat {std::string talk; Cat() : talk("miaow"){}};
BOOST_FUSION_ADAPT_STRUCT(Dog, (std::string, talk))
BOOST_FUSION_ADAPT_STRUCT(Cat, (std::string, talk))
typedef boost::fusion::vector<std::string> Animal;
int main()
{
vector<Animal> animals;
animals.push_back(Dog());
animals.push_back(Cat());
using boost::fusion::at;
using boost::mpl::int_;
for(auto a : animals)
{
cout << at<int_<0>>(a) << endl;
}
}
This way you do not even modify an aggregate like variant nor the algorithms on animals, you just need to provide a FUSION_ADAPT matching the used algorithms prerequisites. Both versions (variant and fusion) let you define orthogonal object groups, a useful thing you can not do with inheritance trees.
The following ways seem reasonable dealing with this:
boost and C++11 have a class that can cheaply be used to store references in a container: Reference Wrapper
A good advice is to use the handle/body idiom more often instead of passing around raw pointers. This also solves the ownership issue of the memory that is governed by the reference or the pointer. Sean Parent from Adobe has pointed this out at a talk at going native 2013.
I chose to use the Handle/Body Idiom approach because it gives pointers automatically copy/assign behaviour while hiding the underlying implementation and ownership semantics. It also acts as kind of a compile time firewall reducing header file inclusion.

C++ - Proper way of using std::vector & related memory management

Hy, I would like to ask a question that puzzles me.
I've a class like this:
class A {
private:
std::vector<Object*>* my_array_;
...
public
std::vector<Object*>& my_array(); // getter
void my_array(const std::vector<Object*>& other_array); // setter
};
I wanted to ask you, based on your experience, what is the correct way of implementing the setter and getter in a (possible) SAFE manner.
The first solution came to my mind is the following.
First, when I do implement the setter, I should:
A) check the input is not a referring to the data structure I already hold;
B) release the memory of ALL objects pointed by my_array_
C) copy each object pointed by other_array and add its copy to my_array_
D) finally end the function.
The getter may produce a copy of the inner array, just in case.
The questions are many:
- is this strategy overkilling?
- does it really avoid problems?
- somebody really uses it or are there better approaches?
I've tried to look for the answer to this question, but found nothing so particularly focused on this problem.
That of using smart pointers is a very good answer, i thank you both.. it seems I can not give "useful answer" to more than one so I apologize in advance. :-)
From your answers however a new doubt has raised.
When i use a vector containing unique_ptr to objects, I will have to define a deep copy constructor. Is there a better way than using an iterator to copy each element in the vector of objects, given that now we are using smart pointers?
I'd normally recommend not using a pointer to a vector as a member, but from your question it seems like it's shared between multiple instances.
That said, I'd go with:
class A {
private:
std::shared_ptr<std::vector<std::unique_ptr<Object> > > my_array_;
public
std::shared_ptr<std::vector<std::unique_ptr<Object> > > my_array(); // getter
void my_array(std::shared_ptr<std::vector<std::unique_ptr<Object> > > other_array); // setter
};
No checks necessary, no memory management issues.
If the inner Objects are also shared, use a std::shared_ptr instead of the std::unique_ptr.
I think you are overcomplicating things having a pointer to std::vector as data member; remember that C++ is not Java (C++ is more "value" based than "reference" based).
Unless there is a strong reason to use a pointer to a std::vector as data member, I'd just use a simple std::vector stored "by value".
Now, regarding the Object* pointers in the vector, you should ask yourself: are those observing pointers or are those owning pointers?
If the vector just observes the Objects (and they are owned by someone else, like an object pool allocator or something), you can use raw pointers (i.e. simple Object*).
But if the vector has some ownership semantics on the Objects, you should use shared_ptr or unique_ptr smart pointers. If the vector is the only owner of Object instances, use unique_ptr; else, use shared_ptr (which uses a reference counting mechanism to manage object lifetimes).
class A
{
public:
// A vector which owns the pointed Objects
typedef std::vector<std::shared_ptr<Object>> ObjectArray;
// Getter
const ObjectArray& MyArray() const
{
return m_myArray
}
// Setter
// (new C++11 move semantics pattern: pass by value and move from the value)
void MyArray(ObjectArray otherArray)
{
m_myArray = std::move(otherArray);
}
private:
ObjectArray m_myArray;
};

Example to use shared_ptr?

Hi I asked a question today about How to insert different types of objects in the same vector array and my code in that question was
gate* G[1000];
G[0] = new ANDgate() ;
G[1] = new ORgate;
//gate is a class inherited by ANDgate and ORgate classes
class gate
{
.....
......
virtual void Run()
{ //A virtual function
}
};
class ANDgate :public gate
{.....
.......
void Run()
{
//AND version of Run
}
};
class ORgate :public gate
{.....
.......
void Run()
{
//OR version of Run
}
};
//Running the simulator using overloading concept
for(...;...;..)
{
G[i]->Run() ; //will run perfectly the right Run for the right Gate type
}
and I wanted to use vectors so someone wrote that I should do that :
std::vector<gate*> G;
G.push_back(new ANDgate);
G.push_back(new ORgate);
for(unsigned i=0;i<G.size();++i)
{
G[i]->Run();
}
but then he and many others suggested that I would better use Boost pointer containers
or shared_ptr. I have spent the last 3 hours reading about this topic, but the documentation seems pretty advanced to me . ****Can anyone give me a small code example of shared_ptr usage and why they suggested using shared_ptr. Also are there other types like ptr_vector, ptr_list and ptr_deque** **
Edit1: I have read a code example too that included:
typedef boost::shared_ptr<Foo> FooPtr;
.......
int main()
{
std::vector<FooPtr> foo_vector;
........
FooPtr foo_ptr( new Foo( 2 ) );
foo_vector.push_back( foo_ptr );
...........
}
And I don't understand the syntax!
Using a vector of shared_ptr removes the possibility of leaking memory because you forgot to walk the vector and call delete on each element. Let's walk through a slightly modified version of the example line-by-line.
typedef boost::shared_ptr<gate> gate_ptr;
Create an alias for the shared pointer type. This avoids the ugliness in the C++ language that results from typing std::vector<boost::shared_ptr<gate> > and forgetting the space between the closing greater-than signs.
std::vector<gate_ptr> vec;
Creates an empty vector of boost::shared_ptr<gate> objects.
gate_ptr ptr(new ANDgate);
Allocate a new ANDgate instance and store it into a shared_ptr. The reason for doing this separately is to prevent a problem that can occur if an operation throws. This isn't possible in this example. The Boost shared_ptr "Best Practices" explain why it is a best practice to allocate into a free-standing object instead of a temporary.
vec.push_back(ptr);
This creates a new shared pointer in the vector and copies ptr into it. The reference counting in the guts of shared_ptr ensures that the allocated object inside of ptr is safely transferred into the vector.
What is not explained is that the destructor for shared_ptr<gate> ensures that the allocated memory is deleted. This is where the memory leak is avoided. The destructor for std::vector<T> ensures that the destructor for T is called for every element stored in the vector. However, the destructor for a pointer (e.g., gate*) does not delete the memory that you had allocated. That is what you are trying to avoid by using shared_ptr or ptr_vector.
I will add that one of the important things about shared_ptr's is to only ever construct them with the following syntax:
shared_ptr<Type>(new Type(...));
This way, the "real" pointer to Type is anonymous to your scope, and held only by the shared pointer. Thus it will be impossible for you to accidentally use this "real" pointer. In other words, never do this:
Type* t_ptr = new Type(...);
shared_ptr<Type> t_sptr ptrT(t_ptr);
//t_ptr is still hanging around! Don't use it!
Although this will work, you now have a Type* pointer (t_ptr) in your function which lives outside the shared pointer. It's dangerous to use t_ptr anywhere, because you never know when the shared pointer which holds it may destruct it, and you'll segfault.
Same goes for pointers returned to you by other classes. If a class you didn't write hands you a pointer, it's generally not safe to just put it in a shared_ptr. Not unless you're sure that the class is no longer using that object. Because if you do put it in a shared_ptr, and it falls out of scope, the object will get freed when the class may still need it.
Learning to use smart pointers is in my opinion one of the most important steps to become a competent C++ programmer. As you know whenever you new an object at some point you want to delete it.
One issue that arise is that with exceptions it can be very hard to make sure a object is always released just once in all possible execution paths.
This is the reason for RAII: http://en.wikipedia.org/wiki/RAII
Making a helper class with purpose of making sure that an object always deleted once in all execution paths.
Example of a class like this is: std::auto_ptr
But sometimes you like to share objects with other. It should only be deleted when none uses it anymore.
In order to help with that reference counting strategies have been developed but you still need to remember addref and release ref manually. In essence this is the same problem as new/delete.
That's why boost has developed boost::shared_ptr, it's reference counting smart pointer so you can share objects and not leak memory unintentionally.
With the addition of C++ tr1 this is now added to the c++ standard as well but its named std::tr1::shared_ptr<>.
I recommend using the standard shared pointer if possible. ptr_list, ptr_dequeue and so are IIRC specialized containers for pointer types. I ignore them for now.
So we can start from your example:
std::vector<gate*> G;
G.push_back(new ANDgate);
G.push_back(new ORgate);
for(unsigned i=0;i<G.size();++i)
{
G[i]->Run();
}
The problem here is now that whenever G goes out scope we leak the 2 objects added to G. Let's rewrite it to use std::tr1::shared_ptr
// Remember to include <memory> for shared_ptr
// First do an alias for std::tr1::shared_ptr<gate> so we don't have to
// type that in every place. Call it gate_ptr. This is what typedef does.
typedef std::tr1::shared_ptr<gate> gate_ptr;
// gate_ptr is now our "smart" pointer. So let's make a vector out of it.
std::vector<gate_ptr> G;
// these smart_ptrs can't be implicitly created from gate* we have to be explicit about it
// gate_ptr (new ANDgate), it's a good thing:
G.push_back(gate_ptr (new ANDgate));
G.push_back(gate_ptr (new ORgate));
for(unsigned i=0;i<G.size();++i)
{
G[i]->Run();
}
When G goes out of scope the memory is automatically reclaimed.
As an exercise which I plagued newcomers in my team with is asking them to write their own smart pointer class. Then after you are done discard the class immedietly and never use it again. Hopefully you acquired crucial knowledge on how a smart pointer works under the hood. There's no magic really.
The boost documentation provides a pretty good start example:
shared_ptr example (it's actually about a vector of smart pointers) or
shared_ptr doc
The following answer by Johannes Schaub explains the boost smart pointers pretty well:
smart pointers explained
The idea behind(in as few words as possible) ptr_vector is that it handles the deallocation of memory behind the stored pointers for you: let's say you have a vector of pointers as in your example. When quitting the application or leaving the scope in which the vector is defined you'll have to clean up after yourself(you've dynamically allocated ANDgate and ORgate) but just clearing the vector won't do it because the vector is storing the pointers and not the actual objects(it won't destroy but what it contains).
// if you just do
G.clear() // will clear the vector but you'll be left with 2 memory leaks
...
// to properly clean the vector and the objects behind it
for (std::vector<gate*>::iterator it = G.begin(); it != G.end(); it++)
{
delete (*it);
}
boost::ptr_vector<> will handle the above for you - meaning it will deallocate the memory behind the pointers it stores.
Through Boost you can do it
>
std::vector<boost::any> vecobj;
boost::shared_ptr<string> sharedString1(new string("abcdxyz!"));
boost::shared_ptr<int> sharedint1(new int(10));
vecobj.push_back(sharedString1);
vecobj.push_back(sharedint1);
>
for inserting different object type in your vector container. while for accessing you have to use any_cast, which works like dynamic_cast, hopes it will work for your need.
#include <memory>
#include <iostream>
class SharedMemory {
public:
SharedMemory(int* x):_capture(x){}
int* get() { return (_capture.get()); }
protected:
std::shared_ptr<int> _capture;
};
int main(int , char**){
SharedMemory *_obj1= new SharedMemory(new int(10));
SharedMemory *_obj2 = new SharedMemory(*_obj1);
std::cout << " _obj1: " << *_obj1->get() << " _obj2: " << *_obj2->get()
<< std::endl;
delete _obj2;
std::cout << " _obj1: " << *_obj1->get() << std::endl;
delete _obj1;
std::cout << " done " << std::endl;
}
This is an example of shared_ptr in action. _obj2 was deleted but pointer is still valid.
output is,
./test
_obj1: 10 _obj2: 10
_obj2: 10
done
The best way to add different objects into same container is to use make_shared, vector, and range based loop and you will have a nice, clean and "readable" code!
typedef std::shared_ptr<gate> Ptr
vector<Ptr> myConatiner;
auto andGate = std::make_shared<ANDgate>();
myConatiner.push_back(andGate );
auto orGate= std::make_shared<ORgate>();
myConatiner.push_back(orGate);
for (auto& element : myConatiner)
element->run();

Can I have polymorphic containers with value semantics in C++?

As a general rule, I prefer using value rather than pointer semantics in C++ (ie using vector<Class> instead of vector<Class*>). Usually the slight loss in performance is more than made up for by not having to remember to delete dynamically allocated objects.
Unfortunately, value collections don't work when you want to store a variety of object types that all derive from a common base. See the example below.
#include <iostream>
using namespace std;
class Parent
{
public:
Parent() : parent_mem(1) {}
virtual void write() { cout << "Parent: " << parent_mem << endl; }
int parent_mem;
};
class Child : public Parent
{
public:
Child() : child_mem(2) { parent_mem = 2; }
void write() { cout << "Child: " << parent_mem << ", " << child_mem << endl; }
int child_mem;
};
int main(int, char**)
{
// I can have a polymorphic container with pointer semantics
vector<Parent*> pointerVec;
pointerVec.push_back(new Parent());
pointerVec.push_back(new Child());
pointerVec[0]->write();
pointerVec[1]->write();
// Output:
//
// Parent: 1
// Child: 2, 2
// But I can't do it with value semantics
vector<Parent> valueVec;
valueVec.push_back(Parent());
valueVec.push_back(Child()); // gets turned into a Parent object :(
valueVec[0].write();
valueVec[1].write();
// Output:
//
// Parent: 1
// Parent: 2
}
My question is: Can I have have my cake (value semantics) and eat it too (polymorphic containers)? Or do I have to use pointers?
Since the objects of different classes will have different sizes, you would end up running into the slicing problem if you store them as values.
One reasonable solution is to store container safe smart pointers. I normally use boost::shared_ptr which is safe to store in a container. Note that std::auto_ptr is not.
vector<shared_ptr<Parent>> vec;
vec.push_back(shared_ptr<Parent>(new Child()));
shared_ptr uses reference counting so it will not delete the underlying instance until all references are removed.
I just wanted to point out that vector<Foo> is usually more efficient than vector<Foo*>. In a vector<Foo>, all the Foos will be adjacent to each other in memory. Assuming a cold TLB and cache, the first read will add the page to the TLB and pull a chunk of the vector into the L# caches; subsequent reads will use the warm cache and loaded TLB, with occasional cache misses and less frequent TLB faults.
Contrast this with a vector<Foo*>: As you fill the vector, you obtain Foo*'s from your memory allocator. Assuming your allocator is not extremely smart, (tcmalloc?) or you fill the vector slowly over time, the location of each Foo is likely to be far apart from the other Foos: maybe just by hundreds of bytes, maybe megabytes apart.
In the worst case, as you scan through a vector<Foo*> and dereferencing each pointer you will incur a TLB fault and cache miss -- this will end up being a lot slower than if you had a vector<Foo>. (Well, in the really worst case, each Foo has been paged out to disk, and every read incurs a disk seek() and read() to move the page back into RAM.)
So, keep on using vector<Foo> whenever appropriate. :-)
Yes, you can.
The boost.ptr_container library provides polymorphic value semantic versions of the standard containers. You only have to pass in a pointer to a heap-allocated object, and the container will take ownership and all further operations will provide value semantics , except for reclaiming ownership, which gives you almost all the benefits of value semantics by using a smart pointer.
You might also consider boost::any. I've used it for heterogeneous containers. When reading the value back, you need to perform an any_cast. It will throw a bad_any_cast if it fails. If that happens, you can catch and move on to the next type.
I believe it will throw a bad_any_cast if you try to any_cast a derived class to its base. I tried it:
// But you sort of can do it with boost::any.
vector<any> valueVec;
valueVec.push_back(any(Parent()));
valueVec.push_back(any(Child())); // remains a Child, wrapped in an Any.
Parent p = any_cast<Parent>(valueVec[0]);
Child c = any_cast<Child>(valueVec[1]);
p.write();
c.write();
// Output:
//
// Parent: 1
// Child: 2, 2
// Now try casting the child as a parent.
try {
Parent p2 = any_cast<Parent>(valueVec[1]);
p2.write();
}
catch (const boost::bad_any_cast &e)
{
cout << e.what() << endl;
}
// Output:
// boost::bad_any_cast: failed conversion using boost::any_cast
All that being said, I would also go the shared_ptr route first! Just thought this might be of some interest.
While searching for an answer to this problem, I came across both this and a similar question. In the answers to the other question you will find two suggested solutions:
Use std::optional or boost::optional and a visitor pattern. This solution makes it hard to add new types, but easy to add new functionality.
Use a wrapper class similar to what Sean Parent presents in his talk. This solution makes it hard to add new functionality, but easy to add new types.
The wrapper defines the interface you need for your classes and holds a pointer to one such object. The implementation of the interface is done with free functions.
Here is an example implementation of this pattern:
class Shape
{
public:
template<typename T>
Shape(T t)
: container(std::make_shared<Model<T>>(std::move(t)))
{}
friend void draw(const Shape &shape)
{
shape.container->drawImpl();
}
// add more functions similar to draw() here if you wish
// remember also to add a wrapper in the Concept and Model below
private:
struct Concept
{
virtual ~Concept() = default;
virtual void drawImpl() const = 0;
};
template<typename T>
struct Model : public Concept
{
Model(T x) : m_data(move(x)) { }
void drawImpl() const override
{
draw(m_data);
}
T m_data;
};
std::shared_ptr<const Concept> container;
};
Different shapes are then implemented as regular structs/classes. You are free to choose if you want to use member functions or free functions (but you will have to update the above implementation to use member functions). I prefer free functions:
struct Circle
{
const double radius = 4.0;
};
struct Rectangle
{
const double width = 2.0;
const double height = 3.0;
};
void draw(const Circle &circle)
{
cout << "Drew circle with radius " << circle.radius << endl;
}
void draw(const Rectangle &rectangle)
{
cout << "Drew rectangle with width " << rectangle.width << endl;
}
You can now add both Circle and Rectangle objects to the same std::vector<Shape>:
int main() {
std::vector<Shape> shapes;
shapes.emplace_back(Circle());
shapes.emplace_back(Rectangle());
for (const auto &shape : shapes) {
draw(shape);
}
return 0;
}
The downside of this pattern is that it requires a large amount of boilerplate in the interface, since each function needs to be defined three times.
The upside is that you get copy-semantics:
int main() {
Shape a = Circle();
Shape b = Rectangle();
b = a;
draw(a);
draw(b);
return 0;
}
This produces:
Drew rectangle with width 2
Drew rectangle with width 2
If you are concerned about the shared_ptr, you can replace it with a unique_ptr.
However, it will no longer be copyable and you will have to either move all objects or implement copying manually.
Sean Parent discusses this in detail in his talk and an implementation is shown in the above mentioned answer.
Take a look at static_cast and reinterpret_cast
In C++ Programming Language, 3rd ed, Bjarne Stroustrup describes it on page 130. There's a whole section on this in Chapter 6.
You can recast your Parent class to Child class. This requires you to know when each one is which. In the book, Dr. Stroustrup talks about different techniques to avoid this situation.
Do not do this. This negates the polymorphism that you're trying to achieve in the first place!
Most container types want to abstract the particular storage strategy, be it linked list, vector, tree-based or what have you. For this reason, you're going to have trouble with both possessing and consuming the aforementioned cake (i.e., the cake is lie (NB: someone had to make this joke)).
So what to do? Well there are a few cute options, but most will reduce to variants on one of a few themes or combinations of them: picking or inventing a suitable smart pointer, playing with templates or template templates in some clever way, using a common interface for containees that provides a hook for implementing per-containee double-dispatch.
There's basic tension between your two stated goals, so you should decide what you want, then try to design something that gets you basically what you want. It is possible to do some nice and unexpected tricks to get pointers to look like values with clever enough reference counting and clever enough implementations of a factory. The basic idea is to use reference counting and copy-on-demand and constness and (for the factor) a combination of the preprocessor, templates, and C++'s static initialization rules to get something that is as smart as possible about automating pointer conversions.
I have, in the past, spent some time trying to envision how to use Virtual Proxy / Envelope-Letter / that cute trick with reference counted pointers to accomplish something like a basis for value semantic programming in C++.
And I think it could be done, but you'd have to provide a fairly closed, C#-managed-code-like world within C++ (though one from which you could break through to underlying C++ when needed). So I have a lot of sympathy for your line of thought.
Just to add one thing to all 1800 INFORMATION already said.
You might want to take a look at "More Effective C++" by Scott Mayers "Item 3: Never treat arrays polymorphically" in order to better understand this issue.
I'm using my own templated collection class with exposed value type semantics, but internally it stores pointers. It's using a custom iterator class that when dereferenced gets a value reference instead of a pointer. Copying the collection makes deep item copies, instead of duplicated pointers, and this is where most overhead lies (a really minor issue, considered what I get instead).
That's an idea that could suit your needs.