Reference counting - internal references problem

Reference counting - internal references problem - c++

I have implemented my own smart pointers, and everything worked fine, untill I realized a fatal flaw with my implementation. The problem was the fact that an object can have a smart pointer that potentially holds a reference to itself. the problem would be easy to avoid if this was a one layer problem - what could easly happen is a ref counted class would indirectly (through one of its members) holds a referenece to itself. this would mean that a object would never be removed deleted. Is there any way/method I could solve this?
simplest example:
class Derived : public Object {
public:
static ref<Object> Create() { return ref<Object>(new Derived()); }
private:
Derived() : m_ref(this) // m_ref now holds a reference to Derived instance
{
// SOME CODE HERE
}
ref<Object> m_ref;
};
Object is base class contains reference counter, ref is smart pointer that holds a reference to its assigned object

There is no easy way to handle this issue. It is a fundamental problem with reference counting.
To build intuition as to why this is the case, note that the difficulty of detecting cycles of smart pointers is similar to the difficulty of dealing with the cycles. To detect cycles you need to be able to traverse the pointers from "root pointers". If you could do that, you could mark the ones you see during traversal. If you could mark them you could implement mark-and-sweep, which is garbage collection.

Related

Non owning reference to deleteable object

What would the best practice to hold a non owning reference to a object, that can be deleted?
The first part is fairly simple, I simply using the stupid-smart pointer: observer_ptr. However, the last part makes it somewhat more difficult.
Example
Having this setup, to illustrate the need of my vector unique ptr
class Object
{
};
class Derrived : public Object
{
};
With the implementation of
vector<nonstd::observer_ptr<Object>> _observers;
vector<unique_ptr<Object>> _objects;
auto t = make_unique<Derrived>();
_observers.push_back(nonstd::make_observer(t.get()));
_objects.push_back(move(t));
// Same objects
cout << (_observers.at(0).get() == _objects.at(0).get()) << endl;
Issue
Now at any time, somewhere, one of the objects in _objects might be deleted.
I will simply illustrate this by deleting the first object in the vector:
_objects.erase(_objects.begin());
This will result in the _objects vector is empty. However, the _observers vector now points to a freed memory space.
Of course, I can simply delete the observer from _observers, but imagine having such observing references in different parts of my program.
Would there be any cleaner solution for this, and it this the right way to observe different objects?
Please let me know if the example at hand does not illustrate the problem (or any problem for that matter) that I described.

Your use-case sounds like a std::weak_ptr<Object> would be suitable non-owning representation. Of course, for a std::weak_ptr<T> the owning representation is std::shared_ptr<T>. However, since you’ll need to “pin” the object before you could access a std::weak_ptr<T> you’d have more than one owner anyway while accessing the pointer.

As stated in the comments, this is a typical use-case for std::weak_ptr:
std::weak_ptr is a smart pointer that holds a non-owning ("weak")
reference to an object that is managed by std::shared_ptr. It must be
converted to std::shared_ptr in order to access the referenced object.
Example:
vector<shared_ptr<Object>> objects;
objects.push_back(make_shared<Derived>());
weak_ptr<Object> ptr{ objects.back() };
auto sh_ptr = ptr.lock(); // increase reference count if object is still alive
if(sh_ptr) { // if object was not deleted yet
sh_ptr->doStuff(); // safely access the object, as this thread holds a valid reference
}

Today there is no way to make non-owning relationship to be enforced by compiler:
1. weak_ptr could be converted to shared_ptr
2. Everything else could be deleted.
3. Wrappers around weak_ptr that would be non convertible to shared_ptr would not work also: once reference to an object is retrieved it could be deleted too.

unique_ptr and polymorphism

I have some code that currently uses raw pointers, and I want to change to smart pointers. This helps cleanup the code in various ways. Anyway, I have factory methods that return objects and its the caller's responsibility to manager them. Ownership isn't shared and so I figure unique_ptr would be suitable. The objects I return generally all derive from a single base class, Object.
For example,
class Object { ... };
class Number : public Object { ... };
class String : public Object { ... };
std::unique_ptr<Number> State::NewNumber(double value)
{
return std::unique_ptr<Number>(new Number(this, value));
}
std::unique_ptr<String> State::NewString(const char* value)
{
return std::unique_ptr<String>(new String(this, value));
}
The objects returned quite often need to be passed to another function, which operates on objects of type Object (the base class). Without any smart pointers the code is like this.
void Push(const Object* object) { ... } // push simply pushes the value contained by object onto a stack, which makes a copy of the value
Number* number = NewNumber(5);
Push(number);
When converting this code to use unique_ptrs I've run into issues with polymorphism. Initially I decided to simply change the definition of Push to use unique_ptrs too, but this generates compile errors when trying to use derived types. I could allocate objects as the base type, like
std::unique_ptr<Object> number = NewNumber(5);
and pass those to Push - which of course works. However I often need to call methods on the derived type. In the end I decided to make Push operate on a pointer to the object stored by the unique_ptr.
void Push(const Object* object) { ... }
std::unique_ptr<Object> number = NewNumber(5);
Push(number.get());
Now, to the reason for posting. I'm wanting to know if this is the normal way to solve the problem I had? Is it better to have Push operate on the unique_ptr vs the object itself? If so how does one solve the polymorphism issues? I would assume that simply casting the ptrs wouldn't work. Is it common to need to get the underlying pointer from a smart pointer?
Thanks, sorry if the question isn't clear (just let me know).
edit: I think my Push function was a bit ambiguous. It makes a copy of the underlying value and doesn't actually modify, nor store, the input object.

Initially I decided to simply change the definition of Push to use
unique_ptrs too, but this generates compile errors when trying to use
derived types.
You likely did not correctly deal with uniqueness.
void push(std::unique_ptr<int>);
int main() {
std::unique_ptr<int> i;
push(i); // Illegal: tries to copy i.
}
If this compiled, it would trivially break the invariant of unique_ptr, that only one unique_ptr owns an object, because both i and the local argument in push would own that int, so it is illegal. unique_ptr is move only, it's not copyable. It has nothing to do with derived to base conversion, which unique_ptr handles completely correctly.
If push owns the object, then use std::move to move it there. If it doesn't, then use a raw pointer or reference, because that's what you use for a non-owning alias.

Well, if your functions operate on the (pointed to) object itself and don't need its address, neither take any ownership, and, as I guess, always need a valid object (fail when passed a nullptr), why do they take pointers at all?
Do it properly and make them take references:
void Push(const Object& object) { ... }
Then the calling code looks exactly the same for raw and smart pointers:
auto number = NewNumber(5);
Push(*number);
EDIT: But of course no matter if using references or pointers, don't make Push take a std::unique_ptr if it doesn't take ownership of the passed object (which would make it steal the ownership from the passed pointer). Or in general don't use owning pointers when the pointed to object is not to be owned, std::shared_ptr isn't anything different in this regard and is as worse a choice as a std::unique_ptr for Push's parameter if there is no ownership to be taken by Push.

If Push does not take owenrship, it should probably take reference instead of pointer. And most probably a const one. So you'll have
Push(*number);
Now that's obviously only valid if Push isn't going to keep the pointer anywhere past it's return. If it does I suspect you should try to rethink the ownership first.

Here's a polymorphism example using unique pointer:
vector<unique_ptr<ICreature>> creatures;
creatures.emplace_back(new Human);
creatures.emplace_back(new Fish);
unique_ptr<vector<string>> pLog(new vector<string>());
for each (auto& creature in creatures)
{
auto state = creature->Move(*pLog);
}

List of smart pointers - Managing object lifetime and pointer validity

I have a list of smart pointers where each pointer points to a separate Entity class.
std::list<std::unique_ptr<Entity>> m_entities;
I would like the constructor to handle the assigning of each pointer to a std::list class as it is "automatically" handled by the code on class instantiation. However, if this design is bad then I would welcome a better alternative as it only makes sense to me coming from a C# background.
Entity::Entity(Game &game)
: m_game(game),
m_id(m_game.g_idGenerator->generateNewID())
{
m_game.m_entities.push_back(std::unique_ptr<Entity>(this));
}
The main problem I have encountered with this method is that the Entity class' lifetime is unmanaged by the Entity class.
For example if I allocate an Entity class on the stack it will call the Entity destructor after leaving the method in which it was allocated and the pointer will no longer be valid.
I therefore considered the alternative of creating a smart pointer, allocating the Entity class to the heap and then explicitly adding the pointer to the list.
std::unique_ptr<Entity> b(new Entity(*this));
m_entities.push_back(b); // ERROR
This produces the following error
error C2664: 'void std::list<_Ty>::push_back(_Ty &&)' : cannot convert parameter 1 from 'std::unique_ptr<_Ty>' to 'std::unique_ptr<_Ty> &&'
What would be considered the best approach for allocating each pointer to the list and is a constructor based version possible?
I'm currently thinking that it is the list of smart pointers that should handle the lifetime for each Entity class and that assigning pointers in a constructor is not a good design choice. In that case I should probably create a CreateEntity method that adds the pointer to list rather than let the constructor handle it. Is this better?
I considered what type of smart pointer would be appropriate for this operation after reading through questions found here, here and here (offsite). It is difficult to get an exact answer based on what I've read so far though as they all offer somewhat conflicting advice.

Using constructor this way is definitely not good idea because constructor has no information about how object is created and controlled - on the stack, statically, dynamically by some smart pointer, dynamically by dumb pointer?
To solve this problem you could use static factory method to create Entity instances:
class Entity
{
public:
// Variant with unique ownership
static void CreateGameEntity(Game& game)
{
std::unique_ptr<Entity> p(new Entity());
game.m_entities.push_back(std::move(p));
}
// OR (you cannot use both)
// Variant with shared ownership
static std::shared_ptr<Entity> CreateGameEntity(Game& game)
{
std::shared_ptr<Entity> p(new Entity());
game.m_entities.push_back(p);
return p;
}
private:
// Declare ctors private to avoid possibility to create Entity instances
// without CreateGameEntity() method, e.g. on stack.
Entity();
Entity(const Entity&);
};
Which smart pointer to use? Well, this depends on your design. If Game object solely owns Entity instances and completely manages their lifetime, using std::unique_ptr is OK. If you need some kind of shared ownership (e.g. you have several Game objects that can share same Entity objects) you shall use std::shared_ptr.
Also in case of unique ownership you may use Boost Pointer Container library. It contains specialized owning pointer containers like ptr_vector, ptr_list, ptr_map etc.

I won't comment on your design questions, but to fix your error, change your code to either:
m_entities.push_back(std::unique_ptr<Boundary>(new Boundary(*this, body)));
or:
std::unique_ptr<Boundary> b(new Boundary(*this, body));
m_entities.push_back(std::move(b));
The reason is that b in your code is an lvalue, but std::unique_ptr<> is a move-only type (i.e. has no copy constructor).

The problem in your code is that you try to move a std::unique_ptr<T> from an l-value. The instantiations of std::unique_ptr<T> are non-copyable and are only movable. To move from an l-value you need to explicitly do so:
this->m_entities.push_back(std::move(b));
The call to std::move() won't really move anything but it does yield a type which indicates to the compiler that the object can be moved.

To address the issue with the stack-created instance, you could simply add a parameter to the constructor that tells it to not add the new instance to the list, eg:
Entity::Entity(Game &game, bool AddToList = true)
: m_game(game),
m_id(m_game.g_idGenerator->generateNewID())
{
if (AddToList) m_game.m_entities.push_back(this);
}
.
{
...
Entity e(game, false);
...
}
Another option might be to add a destructor to Entity that removes it from the list if it is still present, but that might get a little complex trying to avoid conflicts between direct Entity destructions and unique_ptr destructions.

Composition using reference or pointer?

Please let me know which one is good
Consider I have a class like this
class Shape
{
public:
virtual void Display()
{
cout << "Shape::Display()" << endl;
}
};
class Line : public Shape
{
public:
void Display()
{
cout << "Line::Display()" << endl;
}
};
class Circle : public Shape
{
public:
void Display()
{
cout << "Circle::Display()" << endl;
}
};
1)
class Container
{
public:
Container(Shape& shape) : m_Shape(shape)
{
}
void Draw()
{
m_Shape.Display();
}
private:
Shape& m_Shape;
};
Here he is taking reference and assigning it to the base class object.
2)
class ShapeContainer
{
public:
ShapeContainer(Shape* pShape) : m_pShape(pShape)
{
}
void Draw()
{
m_pShape->Display();
}
private:
Shape* m_pShape;
};
Here he is taking the pointer and assigning it to the base class object.
Please let me know which one is good or when to use them

The decision has to be taken in terms of what you want to do with your object. Since you are not including the option of storing a value, I will assume that you need the container to refer to an external object. If this is not a requirement, consider using what standard containers do: store a copy of the object.
By using a reference you avoid having to check for NULL on creation (after creation both solutions share the same problem there: the external object might be destroyed, leaving you with a dangling reference and Undefined Behavior). The semantics of a reference are a little stronger, and anyone that reads the code will immediately understand: This container is meant to hold a reference to this object for the whole lifetime of the container, the object lifetime is managed externally.
On the negative side of it, references are not resettable, once you set a reference it becomes an alias to the object, and you cannot reset it to refer to a different object. This means that a container that holds a reference will not be assignable by default as you cannot make it refer to a different object. Even if you provide a custom assignment operator, the reference cannot be changed and it will keep referring to the old object. You will not be able to provide member functions in the container to refer to a different object: once you set a reference, it is set forever.
Pointers on the other hand, can be NULL, so you might want/need to check for validity of the pointer before use. They are resettable, and you can offer operations to have your container refer to a different object at any point, by default the container will be assignable. On the negative side, pointers can be used in different scenarios, so the semantics are not so clear. In particular, you will have to carefully document on the interface whether the container takes ownership of the pointer (and the responsibility to release the memory) or not. If you are going to handle ownership inside the container, then consider using a smart pointer (or else just make a copy and forget the original if you can, which will be less error prone).

I perosnally prefer using references over pointers whenever possible. This avoids having to do a NULL check before using the pointer as references can not be NULL. BTW, your shape class requires a virtual destructor.

I'd rather say that there is no good or bad in this case, it's just a matter of what you want to do.
If the user is not allowed to store NULL pointers in your vector because it's a requirement, then use by reference. But if the user wants to reserve space for 10 objects but only wishes to have the first 3 initialized, then you'd want to be able to store NULL to be able to differentiate between objects and empty space.
But as said, it's a matter of requirement for the container.
In your case i'd choose by reference.

I would go for the reference approach if there is no need to explicitly control the lifetime of the contained object in the container class and you can be sure that the passed reference object is always valid during the lifetime of the container class.

How to handle 'this' pointer in constructor?

I have objects which create other child objects within their constructors, passing 'this' so the child can save a pointer back to its parent. I use boost::shared_ptr extensively in my programming as a safer alternative to std::auto_ptr or raw pointers. So the child would have code such as shared_ptr<Parent>, and boost provides the shared_from_this() method which the parent can give to the child.
My problem is that shared_from_this() cannot be used in a constructor, which isn't really a crime because 'this' should not be used in a constructor anyways unless you know what you're doing and don't mind the limitations.
Google's C++ Style Guide states that constructors should merely set member variables to their initial values. Any complex initialization should go in an explicit Init() method. This solves the 'this-in-constructor' problem as well as a few others as well.
What bothers me is that people using your code now must remember to call Init() every time they construct one of your objects. The only way I can think of to enforce this is by having an assertion that Init() has already been called at the top of every member function, but this is tedious to write and cumbersome to execute.
Are there any idioms out there that solve this problem at any step along the way?

Use a factory method to 2-phase construct & initialize your class, and then make the ctor & Init() function private. Then there's no way to create your object incorrectly. Just remember to keep the destructor public and to use a smart pointer:
#include <memory>
class BigObject
{
public:
static std::tr1::shared_ptr<BigObject> Create(int someParam)
{
std::tr1::shared_ptr<BigObject> ret(new BigObject(someParam));
ret->Init();
return ret;
}
private:
bool Init()
{
// do something to init
return true;
}
BigObject(int para)
{
}
BigObject() {}
};
int main()
{
std::tr1::shared_ptr<BigObject> obj = BigObject::Create(42);
return 0;
}
EDIT:
If you want to object to live on the stack, you can use a variant of the above pattern. As written this will create a temporary and use the copy ctor:
#include <memory>
class StackObject
{
public:
StackObject(const StackObject& rhs)
: n_(rhs.n_)
{
}
static StackObject Create(int val)
{
StackObject ret(val);
ret.Init();
return ret;
}
private:
int n_;
StackObject(int n = 0) : n_(n) {};
bool Init() { return true; }
};
int main()
{
StackObject sObj = StackObject::Create(42);
return 0;
}

Google's C++ programming guidelines have been criticized here and elsewhere again and again. And rightly so.
I use two-phase initialization only ever if it's hidden behind a wrapping class. If manually calling initialization functions would work, we'd still be programming in C and C++ with its constructors would never have been invented.

Depending on the situation, this may be a case where shared pointers don't add anything. They should be used anytime lifetime management is an issue. If the child objects lifetime is guaranteed to be shorter than that of the parent, I don't see a problem with using raw pointers. For instance, if the parent creates and deletes the child objects (and no one else does), there is no question over who should delete the child objects.

KeithB has a really good point that I would like to extend (in a sense that is not related to the question, but that will not fit in a comment):
In the specific case of the relation of an object with its subobjects the lifetimes are guaranteed: the parent object will always outlive the child object. In this case the child (member) object does not share the ownership of the parent (containing) object, and a shared_ptr should not be used. It should not be used for semantic reasons (no shared ownership at all) nor for practical reasons: you can introduce all sorts of problems: memory leaks and incorrect deletions.
To ease discussion I will use P to refer to the parent object and C to refer to the child or contained object.
If P lifetime is externally handled with a shared_ptr, then adding another shared_ptr in C to refer to P will have the effect of creating a cycle. Once you have a cycle in memory managed by reference counting you most probably have a memory leak: when the last external shared_ptr that refers to P goes out of scope, the pointer in C is still alive, so the reference count for P does not reach 0 and the object is not released, even if it is no longer accessible.
If P is handled by a different pointer then when the pointer gets deleted it will call the P destructor, that will cascade into calling the C destructor. The reference count for P in the shared_ptr that C has will reach 0 and it will trigger a double deletion.
If P has automatic storage duration, when it's destructor gets called (the object goes out of scope or the containing object destructor is called) then the shared_ptr will trigger the deletion of a block of memory that was not new-ed.
The common solution is breaking cycles with weak_ptrs, so that the child object would not keep a shared_ptr to the parent, but rather a weak_ptr. At this stage the problem is the same: to create a weak_ptr the object must already be managed by a shared_ptr, which during construction cannot happen.
Consider using either a raw pointer (handling ownership of a resource through a pointer is unsafe, but here ownership is handled externally so that is not an issue) or even a reference (which also is telling other programmers that you trust the referred object P to outlive the referring object C)

A object that requires complex construction sounds like a job for a factory.
Define an interface or an abstract class, one that cannot be constructed, plus a free-function that, possibly with parameters, returns a pointer to the interface, but behinds the scenes takes care of the complexity.
You have to think of design in terms of what the end user of your class has to do.

Do you really need to use the shared_ptr in this case? Can the child just have a pointer? After all, it's the child object, so it's owned by the parent, so couldn't it just have a normal pointer to it's parent?

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js