Copy on write proper usage? - c++

I'm trying to understand how COW works. I found the following class on Wikibooks, but I don't understand the code.
template <class T>
class CowPtr
{
public:
typedef boost::shared_ptr<T> RefPtr;
private:
RefPtr m_sp;
void detach()
{
T* tmp = m_sp.get();
if( !( tmp == 0 || m_sp.unique() ) ) {
m_sp = RefPtr( new T( *tmp ) );
}
}
public:
CowPtr(T* t)
: m_sp(t)
{}
CowPtr(const RefPtr& refptr)
: m_sp(refptr)
{}
CowPtr(const CowPtr& cowptr)
: m_sp(cowptr.m_sp)
{}
CowPtr& operator=(const CowPtr& rhs)
{
m_sp = rhs.m_sp; // no need to check for self-assignment with boost::shared_ptr
return *this;
}
const T& operator*() const
{
return *m_sp;
}
T& operator*()
{
detach();
return *m_sp;
}
const T* operator->() const
{
return m_sp.operator->();
}
T* operator->()
{
detach();
return m_sp.operator->();
}
};
I would like to use it in my multithreaded application on a map object that is shared between threads.
map<unsigned int, LPOBJECT> map;
So I've wrapped it in the template and now I have:
CowPtr<map<unsigned int, LPOBJECT>> map;
And now my questions:
How should I properly obtain the map in a thread that only wants to read map objects?
How should I modify the map from a thread, e.g. insert a new object or erase one?

The code you posted is poor to the point of being unusable; the
author doesn't seem to understand how const works in C++.
Practically speaking: CoW requires some knowledge of the
operations being done on the class. The CoW wrapper has to
trigger the copy when an operation on the wrapped object might
modify; in cases where the wrapped object can "leak" pointers
or iterators which allow modification, it also has to be able to
memorize this, to require deep copy once anything has been
leaked. The code you posted triggers the copy depending on
whether the pointer is const or not, which isn't at all the same
thing. Thus, with an std::map, calling std::map<>::find on
the map should not trigger copy on write, even if the pointer
is not const, and calling std::map<>::insert should, even if
the pointer is const.
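One way to follow this advice is to make the caller state the intent instead of inferring it from constness. The sketch below is only an illustration under assumptions: the class name ExplicitCow and its read(), write() and shares_with() members are hypothetical, and std::shared_ptr stands in for boost::shared_ptr. A reference returned by write() can still outlive the call, so it does not solve the leak problem described above.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <utility>

// Hypothetical wrapper: copying is tied to the operation (read vs. write),
// not to whether the handle happens to be const.
template <class T>
class ExplicitCow {
public:
    explicit ExplicitCow(T value)
        : sp_(std::make_shared<T>(std::move(value))) {}

    // Reading never copies, even through a non-const handle.
    const T& read() const { return *sp_; }

    // Writing detaches first if the payload is shared with another handle.
    T& write() {
        if (sp_.use_count() > 1) {
            sp_ = std::make_shared<T>(*sp_);
        }
        return *sp_;
    }

    // True if both handles still point at the same payload.
    bool shares_with(const ExplicitCow& other) const { return sp_ == other.sp_; }

private:
    std::shared_ptr<T> sp_;
};

// Usage: find() keeps the payload shared, insert() forces a private copy.
inline bool explicit_cow_example() {
    typedef std::map<unsigned, int> Map;
    ExplicitCow<Map> a(Map{{1, 10}});
    ExplicitCow<Map> b = a;                        // shares the payload
    bool found = a.read().find(1) != a.read().end();
    bool still_shared = a.shares_with(b);
    b.write().insert({2, 20});                     // b detaches here
    return found && still_shared && !a.shares_with(b) && a.read().size() == 1;
}
```

This version is not thread safe either; it only moves the copy decision to the operation.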
With regards to threading: it is very difficult to make a CoW
class thread safe without grabbing a lock for every operation
which may mutate, because it's very difficult to know when
the actual objects are shared between threads. And it's even
more difficult if the object allows pointers or iterators to
leak, as do the standard library objects.
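If the real goal is to let many threads read one map while writers occasionally replace it, one lock-based arrangement is sketched below. This is an assumption about the use case, not something stated in the question: SnapshotMap, snapshot() and update() are made-up names, and int stands in for LPOBJECT. Readers get an immutable snapshot; writers copy, modify and publish under the same mutex, so a lock is taken for every operation, as described above.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <utility>

class SnapshotMap {
public:
    typedef std::map<unsigned int, int> Map;   // int is a stand-in for LPOBJECT

    SnapshotMap() : current_(std::make_shared<const Map>()) {}

    // Readers: grab a reference-counted, read-only snapshot.
    std::shared_ptr<const Map> snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return current_;
    }

    // Writers: copy the current map, apply the change, publish the copy.
    // Snapshots handed out earlier keep seeing the old contents.
    template <class F>
    void update(F&& mutate) {
        std::lock_guard<std::mutex> lock(mutex_);
        std::shared_ptr<Map> next = std::make_shared<Map>(*current_);
        mutate(*next);
        current_ = std::move(next);
    }

private:
    mutable std::mutex mutex_;
    std::shared_ptr<const Map> current_;
};
```

Every update copies the whole map, so this only pays off when reads vastly outnumber writes.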
You don't explain why you want a thread-safe CoW map. What's
the point of the map if each time you add or remove an element,
you end up with a new copy, which isn't visible in other
instances? If it's just to start individual instances with
a copy of some existing map, std::map has a copy constructor
which does the job just fine, and you don't need any fancy
wrapper.

How does this work?
The class CowPtr holds a shared pointer to the underlying object. It has a private method, void detach(), which copy-constructs a new object and assigns it to the local shared pointer if any other object holds a reference to the same instance.
The relevant part of this code is that each accessor is declared twice: once as
const return_type&
method_name() const
and once without const. The const after a method guarantees that the method does not modify the object; such a method is called a const method. Because the reference it returns to the underlying object is const too, this is the overload that gets called whenever you access the object through a const CowPtr.
If however you chose to modify the Object behind the reference, for example:
CowPtr<std::map<unsigned int, LPOBJECT>> map;
map->clear();
the non-const method T* operator->() is called, which calls detach(). By doing so, a copy is made if any other CowPtr or shared_ptr is referencing the same underlying object (the instance of std::map<unsigned int, LPOBJECT> in this case).
How to use it?
Just how you would use a std::shared_ptr or boost::shared_ptr. The cool thing about that implementation is that it does everything automatically.
Remarks
This is not true COW, though: a copy is made even if you do not write. It is more of a "copy unless you guarantee that you will not write" implementation.
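To see that remark in action, here is a reduced version of the class, assuming std::shared_ptr instead of boost::shared_ptr. The raw() helper and the two demo functions are hypothetical additions for observing when a copy happens.

```cpp
#include <cassert>
#include <map>
#include <memory>

// Reduced CowPtr: only the two operator-> overloads and detach().
template <class T>
class CowPtr {
public:
    explicit CowPtr(T* t) : m_sp(t) {}
    const T* operator->() const { return m_sp.get(); }
    T* operator->() { detach(); return m_sp.get(); }
    const T* raw() const { return m_sp.get(); }   // hypothetical demo helper
private:
    void detach() {
        if (m_sp && m_sp.use_count() > 1) m_sp = std::make_shared<T>(*m_sp);
    }
    std::shared_ptr<T> m_sp;
};

typedef std::map<unsigned, int> Map;

// Returns true if a lookup through a NON-const handle caused a copy.
inline bool read_via_nonconst_copies() {
    CowPtr<Map> a(new Map());
    CowPtr<Map> b = a;
    const Map* before = b.raw();
    (void)b->find(1);     // read-only call, but the non-const operator-> runs
    return b.raw() != before;
}

// Returns true if the same lookup through a const reference caused a copy.
inline bool read_via_const_copies() {
    CowPtr<Map> a(new Map());
    CowPtr<Map> b = a;
    const CowPtr<Map>& cb = b;
    const Map* before = b.raw();
    (void)cb->find(1);    // const operator->, no detach
    return b.raw() != before;
}
```

The first function reports a copy and the second does not, although both only call find().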

Related

How to use getters and setters without generating a copy?

I would like to know how to use getters and setters for a member variable that takes a lot of memory. Usually I would do it as below:
class A
{
private:
BigObject object;
public:
BigObject getObject() const
{
return object;
}
void setObject(const BigObject& object)
{
this->object = object;
}
};
However, I believe this getter and setter will copy the BigObject, which I do not want. Is there a better way to do this?
I thought of doing it this way, but I read on the internet that it's not a good idea because it can lead to a segmentation fault if used badly:
BigObject& getObject()
{
return object;
}
(If you do not care about encapsulation in this case, meaning the A::object member should be modifiable by anyone without restriction, then look at SergeyA's answer).
Return by const reference in order to avoid copying and still maintain encapsulation (meaning the caller can't modify the member by mistake):
const BigObject& getObject() const
{
return object;
}
If the caller actually wants a copy, they can do so easily themselves.
If you want to prevent dangling references (the segfault you mentioned) when the getter is used on a temporary, you can return a copy only when the getter is actually called on a temporary:
BigObject getObject() const &&
{
return object;
}
const BigObject& getObject() const &
{
return object;
}
This will return a copy when calling getObject() on a temporary. Alternatively, you can completely prevent calling the getter on a temporary by deleting that particular overload:
BigObject getObject() const && = delete;
const BigObject& getObject() const &
{
return object;
}
Keep in mind that this is not a guaranteed safety net. It prevents some mistakes, but not all of them. The caller of the function should still be aware of object lifetimes.
You can also improve your setter, btw. Right now, it will always copy the object regardless of how the caller passes the argument. You should take it by value instead and move it into the member:
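A compilable sketch of the ref-qualified getters, under some assumptions: BigObject here is a made-up struct holding a vector, makeA() is a hypothetical helper, and the rvalue overload is a non-const && that moves the member out instead of copying it (a variation on the const && overload shown above).

```cpp
#include <cassert>
#include <utility>
#include <vector>

struct BigObject { std::vector<int> data; };   // stand-in for the real type

class A {
public:
    explicit A(BigObject o) : object(std::move(o)) {}

    // Called on lvalues: no copy, the caller must not outlive *this.
    const BigObject& getObject() const & { return object; }

    // Called on temporaries: hand the member out by value (moved, not copied),
    // so nothing can dangle once the temporary is gone.
    BigObject getObject() && { return std::move(object); }

private:
    BigObject object;
};

inline A makeA() { return A(BigObject{{1, 2, 3}}); }
```

With this, makeA().getObject() yields an independent BigObject, while a.getObject() on a named A yields a reference to the member.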
void setObject(BigObject object)
{
this->object = std::move(object);
}
This requires that BigObject is movable though. If it's not, then this will be even worse than before.
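To check what the by-value setter costs, the sketch below uses a hypothetical BigObject that merely counts its copies and moves; the counters are an illustration device, not part of the original class.

```cpp
#include <cassert>
#include <utility>

// Stand-in type that counts how often it is copied or moved.
struct BigObject {
    static int copies;
    static int moves;
    BigObject() = default;
    BigObject(const BigObject&) { ++copies; }
    BigObject(BigObject&&) noexcept { ++moves; }
    BigObject& operator=(const BigObject&) { ++copies; return *this; }
    BigObject& operator=(BigObject&&) noexcept { ++moves; return *this; }
};
int BigObject::copies = 0;
int BigObject::moves = 0;

class A {
public:
    // The parameter is built by copy or move, depending on the argument;
    // the member is then always move-assigned.
    void setObject(BigObject value) { object = std::move(value); }
private:
    BigObject object;
};
```

Passing an lvalue costs one copy plus one move; passing std::move(x) costs two moves and no copy.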
Best solution: make code of your class exactly that:
struct A
{
BigObject object;
};
Explanation - avoid trivial setters and getters. If you find yourself putting those into your classes, expose the member directly and be done with it.
Do not ever listen to people who'd say "But what if in the future we add non-trivial logic"? I have seen more than a healthy dose of trivial setters and getters, been around for decades, and never replaced with something non-trivial.
The common practice is to:
class A
{
public:
// this method is const
const BigObject& getObject() const
{
return object;
}
// this method is not const
void setObject(const BigObject& object)
{
this->object = object; // the parameter shadows the member, so qualify it
}
private:
BigObject object;
};
If you need to get a read-only object - it's perfectly fine. Otherwise, consider changes in architecture.
An alternative would be to store a std::shared_ptr and return a std::shared_ptr or std::weak_ptr.
Instead of returning a copy of the member, you can return a reference to it. This way there is no need to copy the member.
I thought of doing it this way but i read on the internet that it's not a good idea because it can lead to a segmentation fault if used badly
The solution is to not use it badly.
Returning a reference to a member is fairly common pattern and not generally discouraged. Although, for types that are fast to copy, returning a copy is generally superior when there is no need to refer to the member itself.
There is a solution that avoids both copying and breakage of encapsulation: Use a shared pointer, and return a copy of that shared pointer from the getter. However, this approach has a runtime cost and requires dynamic allocation, so it is not ideal for all use cases.
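A minimal sketch of that shared pointer approach, assuming a made-up BigObject struct. The member is held as std::shared_ptr<const BigObject>, so callers can keep it alive but cannot modify it.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct BigObject { std::vector<int> data; };   // stand-in for the real type

class A {
public:
    explicit A(BigObject o)
        : object(std::make_shared<const BigObject>(std::move(o))) {}

    // Copies only the pointer (one atomic increment), never the BigObject.
    std::shared_ptr<const BigObject> getObject() const { return object; }

    // Replaces the whole object; earlier callers keep the old one alive.
    void setObject(BigObject o) {
        object = std::make_shared<const BigObject>(std::move(o));
    }

private:
    std::shared_ptr<const BigObject> object;
};
```

A caller that stored the result of getObject() keeps seeing the old value after setObject(), which may or may not be what you want.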
In case of setter, you may use move assignment instead, which is more efficient than copy assignment for some types.

How to use shared_ptr to manage already ref-count managed objects?

If an object is already reference-counted (like GLib objects in C), with obj_ref and obj_unref functions, all we have is a pointer like obj *p.
How can we use C++'s shared_ptr to manage the object so that we have a uniform interface?
OK, it seems that a lot of people have misunderstood my intention.
The main issue here is not the deleter. It is about informing the original manager that I increased the refcount.
If I assign or copy, only std::shared_ptr increases its refcount; the original one does not. Is there any way to inform it? The same goes for the unref operation.
std::shared_ptr allows you to pass a custom deleter which is called when the owned object should be destroyed. You could use it to call obj_unref.
obj* p = create_obj();
p->obj_ref();
std::shared_ptr<obj> sp(p, [](auto p) {
p->obj_unref();
});
/* use sp normally, obj will be 'obj_unref'ed and deleted when sp goes out of scope */
I don't know how an obj is created and whether it gets destroyed by obj_unref() when the count reaches 0, but I hope you see what I mean.
The idea is to increment obj's internal reference count just once at the beginning, and decrement it just once when the last shared_ptr is destroyed.
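Here is a runnable sketch of that idea with free functions in the C style. The obj struct and the bodies of obj_new, obj_ref and obj_unref are hypothetical stand-ins for the real library; share() is a made-up helper name.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a C-style reference-counted type.
struct obj { int refcount; };
inline obj* obj_new() { return new obj{1}; }
inline void obj_ref(obj* o) { ++o->refcount; }
inline void obj_unref(obj* o) { if (--o->refcount == 0) delete o; }

// Takes ONE intrusive reference for the whole family of shared_ptr copies.
inline std::shared_ptr<obj> share(obj* p) {
    if (!p) return std::shared_ptr<obj>();
    obj_ref(p);
    // The deleter runs once, when the last shared_ptr copy goes away.
    return std::shared_ptr<obj>(p, [](obj* q) { obj_unref(q); });
}
```

Copies of the shared_ptr bump only its own control block; the intrusive count changes by exactly one over the lifetime of all copies.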
Don't try to somehow duct tape std::shared_ptr's refcounting to your custom one, that won't end well. Just write a custom pointer:
struct objPtr {
objPtr()
: _ptr{nullptr} { }
objPtr(obj *ptr)
: _ptr{ptr} {
if(_ptr)
_ptr->obj_ref();
}
~objPtr() {
if(_ptr)
_ptr->obj_unref();
}
objPtr(objPtr const &orig)
: objPtr{orig._ptr} { }
objPtr &operator = (objPtr const &orig) {
// Take the new reference first so that self-assignment stays safe.
if(orig._ptr)
orig._ptr->obj_ref();
obj *const oPtr = std::exchange(_ptr, orig._ptr);
if(oPtr)
oPtr->obj_unref();
return *this;
}
obj &operator * () { return *_ptr; }
obj const &operator * () const { return *_ptr; }
obj *operator -> () { return _ptr; }
obj const *operator -> () const { return _ptr; }
operator bool() const { return _ptr; }
bool operator ! () const { return !_ptr; }
private:
obj *_ptr;
};
Add move construction and assignment if you so wish.
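For what it's worth, the move operations could look like the sketch below. It is a condensed variant, not the class above verbatim: the obj type is a hypothetical test double that only counts references (it never deletes itself), get() is an added accessor, and a single by-value operator= covers both copy and move assignment.

```cpp
#include <cassert>
#include <utility>

// Test double: counts references, does not free itself at zero.
struct obj {
    int refs = 0;
    void obj_ref() { ++refs; }
    void obj_unref() { --refs; }
};

class objPtr {
public:
    objPtr() = default;
    explicit objPtr(obj* p) : ptr_(p) { if (ptr_) ptr_->obj_ref(); }
    objPtr(const objPtr& o) : objPtr(o.ptr_) {}

    // Move: steal the pointer, leave the source empty, touch no refcount.
    objPtr(objPtr&& o) noexcept : ptr_(std::exchange(o.ptr_, nullptr)) {}

    // By-value parameter is copy- or move-constructed, then swapped in;
    // the old pointer is released when the parameter is destroyed.
    objPtr& operator=(objPtr o) noexcept {
        std::swap(ptr_, o.ptr_);
        return *this;
    }

    ~objPtr() { if (ptr_) ptr_->obj_unref(); }

    obj* get() const { return ptr_; }

private:
    obj* ptr_ = nullptr;
};
```

Moving never calls obj_ref or obj_unref, which is the whole point of adding it.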
When you want a shared_ptr, start with a unique_ptr. Then build up.
struct cleanup_obj {
// not called with nullptr:
void operator()(obj* t)const {
obj_unref(t);
}
};
template<class T>
using obj_unique_ptr = std::unique_ptr<T, cleanup_obj>;
template<class T>
using obj_shared_ptr = std::shared_ptr<T>;
template<class T>
obj_unique_ptr<T> make_unique_refcount( T* t ) {
using ptr=obj_unique_ptr<T>;
if (!t) return ptr();
obj_ref(t);
return ptr(t);
}
template<class T>
obj_shared_ptr<T> make_shared_refcount( T* t ) {
return make_unique_refcount(t); // implicit convert does right thing
}
What did I do?
First, I wrote a unique_ptr wrapper, because we may as well be complete, and it solves the shared_ptr case via the unique_ptr->shared_ptr implicit conversion.
For unique_ptr, we have to say we aren't using the default object destroyer. In this case, we are using a stateless function object that knows how to obj_unref an obj*. The stateless function object keeps the overhead at zero.
For the null case, we don't first add a reference, as that is rude.
For shared_ptr, the fact that we have a working unique_ptr makes it a free function. shared_ptr will happily store the destroyer function that unique_ptr has. It doesn't have to be told it has a special object destroyer, because shared_ptr type erases object destruction by default. (This is because unique_ptr<T> is zero-overhead over a naked pointer, while shared_ptr<T> has unavoidable overhead of the reference counting block; the designers figured once you have that reference counting block, adding in a type-erased destruction function was not really expensive).
Note that our obj_unique_ptr<T> is also zero overhead over a naked pointer. Quite often you'll want one of these instead of the shared one.
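A non-template, runnable reduction of the above, to show the unique_ptr to shared_ptr conversion carrying the deleter along. The obj struct and the bodies of obj_ref and obj_unref are hypothetical stand-ins for the real reference-counted library.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the reference-counted C object.
struct obj { int refcount; };
inline void obj_ref(obj* o) { ++o->refcount; }
inline void obj_unref(obj* o) { if (--o->refcount == 0) delete o; }

// Stateless deleter; unique_ptr never calls it with nullptr.
struct cleanup_obj {
    void operator()(obj* o) const { obj_unref(o); }
};
using obj_unique_ptr = std::unique_ptr<obj, cleanup_obj>;

inline obj_unique_ptr make_unique_refcount(obj* o) {
    if (!o) return obj_unique_ptr();
    obj_ref(o);                      // the smart pointer owns this reference
    return obj_unique_ptr(o);
}

inline std::shared_ptr<obj> make_shared_refcount(obj* o) {
    // shared_ptr's converting constructor takes over pointer and deleter.
    return make_unique_refcount(o);
}
```

Each smart pointer family created this way holds exactly one intrusive reference, no matter how many shared_ptr copies exist.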
Now, you can upgrade the obj_unique_ptr to a full on intrusive pointer, with less overhead than a shared_ptr, if you want.
template<class T>
struct obj_refcount_ptr : obj_unique_ptr<T> // public
{
// from unique ptr:
obj_refcount_ptr(obj_unique_ptr<T> p):obj_unique_ptr<T>(std::move(p)){}
obj_refcount_ptr& operator=(obj_unique_ptr<T> p){
static_cast<obj_unique_ptr<T>&>(*this)=std::move(p);
return *this;
}
obj_refcount_ptr(obj_refcount_ptr&&)=default;
obj_refcount_ptr& operator=(obj_refcount_ptr&&)=default;
obj_refcount_ptr()=default;
obj_refcount_ptr(obj_refcount_ptr const& o):
obj_refcount_ptr(make_unique_refcount(o.get()))
{}
obj_refcount_ptr& operator=(obj_refcount_ptr const& o) {
*this = make_unique_refcount(o.get());
return *this;
}
};
which I think covers it. Now it is a zero-overhead reference counting intrusive smart pointer. These intrusive smart pointers can be converted to a std::shared_ptr<T> via implicit conversion, as they are still unique_ptrs. They are just unique_ptrs we have taught to copy themselves!
It does require moving from an obj_refcount_ptr to get a shared_ptr. We can fix this:
operator std::shared_ptr<T>() const {
return obj_refcount_ptr(*this);
}
which creates an obj_refcount_ptr copy of *this and moves it into the shared_ptr. Only one add ref is called, and the remove ref is only called when the shared_ptr count goes to zero.
The general approach is to start with the simplest smart pointer (unique_ptr), get it right, then exploit its implementation to get us the shared_ptr and eventually the refcount_ptr. We can test the unique_ptr implementation in isolation, and its correctness makes testing the richer pointers easier.
The simplest approach, the least invasive one with the minimal possibility of breaking something, is to simply write your own facade for the object, with the underlying object as a private member and providing simple wrappers to access it.
Then use a std::shared_ptr to that.
It's an incredibly bad idea to have the same objects in multiple smart pointer implementations as their ref counts can't know about each other. As soon as the ref count hits zero in one it will delete the object even if the other still holds refs.
If you really had to you could construct your smart pointers with custom deleters (that do nothing), but I really wouldn't recommend this approach.
Pick one implementation and stick to it.

unique/shared_ptr with custom operator=

I'm working with a polymorphic type (that is, I always interact with it through a pointer) that has methods "Destroy()" and "Clone()", and I would like to wrap it in a resource safe type.
Now, if "Destroy()" was all I had to worry about, then I could use unique_ptr with a custom deleter and it would be easy. But, this resource handle type is used as a member in another type that is otherwise copyable using default generated copy and assign operations. Ideally, I would like to customize the resource handle's copy constructor and assignment to invoke "Clone()", just like I already customize the resource handle's destructor to invoke "Destroy()". But looking through the docs on unique_ptr and shared_ptr, I don't see anything that would let me do that.
Did I miss something in the docs? Is there a ready-made std way to do this?
If not, should I extend unique_ptr and override the copy operations?
Alternatively, should I just write my own resource handle with all the usual pointer semantics from scratch?
You may create wrapper around std::unique_ptr
// To handle your custom Destroy
struct DestroyDeleter
{
void operator()(Interface* o) const {
o->Destroy();
delete o; // drop this line if Destroy() already frees the object
}
};
using InterfacePtr = std::unique_ptr<Interface, DestroyDeleter>;
// To handle the copy with clone:
class wrapper
{
public:
explicit wrapper(InterfacePtr o) : data(std::move(o)) {}
wrapper(const wrapper& rhs) : data(rhs.data->Clone()) {}
wrapper(wrapper&& rhs) = default;
wrapper& operator =(const wrapper& rhs) { data = InterfacePtr(rhs.data->Clone()); return *this; }
wrapper& operator =(wrapper&& rhs) = default;
const Interface* operator ->() const { return data.get(); }
Interface* operator ->() { return data.get(); }
const Interface& operator *() const { return *data; }
Interface& operator *() { return *data; }
private:
InterfacePtr data;
};
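A self-contained sketch of the same idea that actually compiles, with several assumptions spelled out: Interface, Widget, CloningHandle and Owner are hypothetical names, Clone() is assumed to return a raw owning pointer, and Destroy() is assumed to free the object itself (so the deleter does not call delete). The live counter is only there to observe lifetimes.

```cpp
#include <cassert>
#include <memory>

struct Interface {
    virtual ~Interface() {}
    virtual Interface* Clone() const = 0;
    virtual void Destroy() = 0;
    virtual int value() const = 0;   // hypothetical payload accessor
};

// Hypothetical concrete type; here Destroy() frees the object itself.
struct Widget : Interface {
    static int live;
    int v;
    explicit Widget(int x) : v(x) { ++live; }
    Widget(const Widget& o) : Interface(o), v(o.v) { ++live; }
    ~Widget() { --live; }
    Interface* Clone() const { return new Widget(*this); }
    void Destroy() { delete this; }
    int value() const { return v; }
};
int Widget::live = 0;

struct DestroyDeleter {
    void operator()(Interface* o) const { o->Destroy(); }
};

// Copying the handle clones the object; destroying it calls Destroy().
class CloningHandle {
public:
    explicit CloningHandle(Interface* p) : data_(p) {}
    CloningHandle(const CloningHandle& rhs)
        : data_(rhs.data_ ? rhs.data_->Clone() : nullptr) {}
    CloningHandle(CloningHandle&&) noexcept = default;
    CloningHandle& operator=(const CloningHandle& rhs) {
        if (this != &rhs) data_.reset(rhs.data_ ? rhs.data_->Clone() : nullptr);
        return *this;
    }
    CloningHandle& operator=(CloningHandle&&) noexcept = default;
    Interface* operator->() const { return data_.get(); }
    Interface* get() const { return data_.get(); }
private:
    std::unique_ptr<Interface, DestroyDeleter> data_;
};

// The enclosing type keeps its compiler-generated copy operations.
struct Owner { CloningHandle h; };
```

Copying an Owner then produces two independent Widgets, and both are destroyed through Destroy() when the Owners go out of scope.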
One of the reasons for having a unique_ptr and a shared_ptr is that copying and juggling these pointers around is a cheap operation that does not involve copying the underlying object.
And that's why smart pointers do not implement any means for you to install your own operator=. As I understand your question, you want to wedge in a call to your custom Clone() and Destroy() methods, when a smart pointer gets copied.
Well, unique_ptr and shared_ptr implement their own operator= that does the right thing.
For starters whatever is being done by your Destroy() should really be done in your class's destructor. That's what a destructor is for. Then, your clone() method simply needs to clone its own object, and return a new unique_ptr, so, when you already have an existing unique_ptr or a shared_ptr, using it to invoke the clone() method clones the object and returns a smart pointer to the cloned object:
std::unique_ptr<myClass> p; // Or std::shared_ptr
// p is initialized, populated from there.
std::unique_ptr<myClass> new_p=p->clone();
Having a distinct Destroy() method that must be invoked before destroying an object is an invitation to spawn an endless parade of bugs. It's only a matter of time before you forget to call it when destroying a cloned object.
The best way to avoid creating and wasting time with bugs in your code is to make it logically impossible for them to happen. If something needs to be done in order to destroy an object, this needs to be done in the destructor. If something extra needs to be done in order to destroy an object that was cloned from another object, the object needs to have an internal flag that indicates that it's a cloned object, and have its destructor do the right thing, based on that.
Once that's done, it will become logically impossible for your code to screw up, by forgetting to do something to get rid of a cloned object. You can congratulate yourself for saving yourself, in the future, from wasting time searching for a bunch of bugs that will never be created, now. Your future self will thank your past self. Just use smart pointers as they were intended to be used, and have clone() give you a smart pointer right from the start.

How to write handles for classes that don't have clone() member?

I'm following an example in Accelerated C++ and writing a simple Handle class that will act as a smart pointer. This uses the virtual ctor idiom using a virtual clone() function. So far so good. But what to do when I want to use my Handle for classes that I don't control that don't provide clone()?
The method suggested in the book is to create a global clone function and use template specialization (something I'm seeing for the first time) so that if clone() is called with a particular argument, one can write code to handle that case.
My question is: This means that I have to write a clone() version for every type of class that I envision my user can use Handle with. This seems quite hard! Is there a more elegant and/or simple way to solve this issue? How is it possible that things like auto_ptr or boost::shared_ptr are able to provide this functionality without the tedious clone() definitions?
For completeness, here's my Handle class implementation:
template <class T> class Handle
{
public:
Handle() : p(0) {}
Handle(T* t) : p(t) {}
Handle( const Handle& s ) :p(0) { if (s.p) p = s.p->clone(); }
const Handle& operator=( const Handle& );
~Handle() { delete p; }
operator bool() { return p; }
T& operator*() { if (p) return *p; else throw std::runtime_error("Handle not bound"); }
T* operator->() { if (p) return p; else throw std::runtime_error("Handle not bound"); }
private:
T* p;
};
Thanks!
The solution to this problem is to simply not write Handles for these kinds of classes. No. Really.
auto_ptr (deprecated as of C++11) never needs to clone the underlying object, because auto_ptr never copies the object. An auto_ptr only ever has one copy of the object, and when the auto_ptr is copied, control of the object is transferred -- that object is not copied.
unique_ptr never needs to clone the underlying object, because there is only ever one unique_ptr that owns the object. unique_ptr is noncopyable, and is only movable.
shared_ptr never needs to clone because it also only ever controls one copy of the object. Copying the shared_ptr only increments the reference count, and that single object is destroyed when the reference count is zero.
In general, if there's no way to deep copy the resource your class is controlling, then you should just make the class noncopyable. If clients need to pass references to your class around, they can place the class in an auto_ptr, unique_ptr, or shared_ptr themselves.
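If you do decide to keep a deep-copying Handle anyway, the free-function approach from the book does not need one specialization per type: a template can fall back to the copy constructor, and only polymorphic hierarchies need an overload. The sketch below assumes that reading of the book's suggestion; Shape and Square are hypothetical example types, and the Handle is trimmed to what the demo needs.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Default: anything copy-constructible is "cloned" with its copy constructor.
template <class T>
T* clone(const T* p) { return new T(*p); }

// A polymorphic hierarchy with its own virtual clone()...
struct Shape {
    virtual ~Shape() {}
    virtual Shape* clone() const = 0;
    virtual int sides() const = 0;
};
struct Square : Shape {
    Shape* clone() const { return new Square(*this); }
    int sides() const { return 4; }
};
// ...gets one non-template overload, which wins over the template.
inline Shape* clone(const Shape* p) { return p->clone(); }

template <class T>
class Handle {
public:
    Handle() : p(0) {}
    explicit Handle(T* t) : p(t) {}
    Handle(const Handle& s) : p(s.p ? clone(s.p) : 0) {}
    Handle& operator=(const Handle& rhs) {
        if (this != &rhs) {
            T* copy = rhs.p ? clone(rhs.p) : 0;   // clone before deleting
            delete p;
            p = copy;
        }
        return *this;
    }
    ~Handle() { delete p; }
    T& operator*() const {
        if (!p) throw std::runtime_error("Handle not bound");
        return *p;
    }
private:
    T* p;
};
```

Handle<std::string> then works with no extra code, while Handle<Shape> copies through the virtual clone().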

auto_ptr question in c++

I am new here.
I am also new on C++
So here is the class and function I wrote, but I got a compiler error.
My class:
class fooPlayer
{
public:
void fooPlayerfunc(){}//doing something here
char askYesNo(std::string question);
};
class fooPlayerFactory
{
public:
virtual std::auto_ptr<fooPlayer> MakePlayerX() const;
virtual std::auto_ptr<fooPlayer> MakePlayerO() const;
private:
std::auto_ptr<fooPlayer> MakePlayer(char letter) const;
std::auto_ptr<fooPlayer> my_player;
};
Implement my class:
auto_ptr<fooPlayer> fooPlayerFactory:: MakePlayer(char letter) const
{
my_player->fooPlayerfunc();
return my_player;
}
auto_ptr<fooPlayer> fooPlayerFactory::MakePlayerX() const
{
char go_first = my_player->askYesNo("Do you require the first move?");
MakePlayer(go_first);
return my_player;
}
auto_ptr<fooPlayer> fooPlayerFactory::MakePlayerO() const
{
return my_player;
}
My main() function here:
int main()
{
fooPlayerFactory factory;
factory.MakePlayerX();
factory.MakePlayerO();
}
I got the error:
error C2558: class 'std::auto_ptr<_Ty>' : no copy constructor available or copy constructor is declared 'explicit'
I do not know how to change it even after reading the document on this link:
The reason for the error is that you are calling the copy constructor of auto_ptr on my_player in fooPlayerFactory::MakePlayerO(), which is a const method. That means that it cannot modify its members.
However, the copy constructor of auto_ptr DOES modify the right hand side, so returning my_player tries to change its pointer to 0 (NULL) while assigning the original pointer to the auto_ptr in the return value.
The signature of the copy constructor is
auto_ptr<T>::auto_ptr<T>(auto_ptr<T> & rhs)
not
auto_ptr<T>::auto_ptr<T>(const auto_ptr<T> & rhs)
The copy constructor of auto_ptr assigns ownership of the pointer to the left hand side, the right hand side then holds nothing.
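The same ownership transfer is easier to see with std::unique_ptr, the C++11 replacement for auto_ptr, because there the transfer must be written out with std::move. The sketch below swaps in unique_ptr for that reason; TakePlayer() and HasPlayer() are hypothetical names, and fooPlayer is reduced to an empty struct.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct fooPlayer {};

class fooPlayerFactory {
public:
    fooPlayerFactory() : my_player(new fooPlayer()) {}

    // Would not compile: copying is disabled, and a const method could not
    // move from its own member anyway.
    // std::unique_ptr<fooPlayer> MakePlayerO() const { return my_player; }

    // Transferring ownership has to be spelled out in a non-const method.
    std::unique_ptr<fooPlayer> TakePlayer() { return std::move(my_player); }

    bool HasPlayer() const { return static_cast<bool>(my_player); }

private:
    std::unique_ptr<fooPlayer> my_player;
};
```

After TakePlayer() the factory holds nothing, which is exactly what "the right hand side then holds nothing" means.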
I don't think you want to use auto_ptr here; you probably want boost::shared_ptr.
It looks like you have mixed up two uses for auto_ptr
The first is as a poor man's boost::scoped_ptr. This is to manage a single instance of a pointer in a class, where the class manages the lifetime of the pointer. In this case you don't normally return this pointer outside your class (you can, it is legal, but boost::shared_ptr / boost::weak_ptr would be better so that clients can participate in the lifetime of the pointer).
The second is its main purpose which is to return a newly created pointer to the caller of a function in an exception safe way.
eg
auto_ptr<T> foo() {
return auto_ptr<T>(new T); // the constructor from a raw pointer is explicit
}
void bar() {
auto_ptr<T> t = foo();
}
As I said, I think you have mixed up these two uses. auto_ptr is a subtle beast; you should read the auto_ptr docs carefully. It is also covered very well in Effective STL by Scott Meyers.
In your code:
auto_ptr<fooPlayer> fooPlayerFactory:: MakePlayer(char letter) const
{
my_player->fooPlayerfunc();
return my_player;
}
This is a const function, but fooPlayerfunc is not const - my compiler reports this error rather than the one you say you are getting. Are you posting the real code?
I don't think you actually want to construct dynamic objects here.
A factory object creates and returns an object; it normally does not keep a reference to it after creation (unless you are sharing it), and I don't actually see anywhere that you are creating the player.
If you only ever create one player internally in your fooPlayerFactory, then create an object and return references to it.
Edit: in response to the comment (which is correct, my bad), I left only the advice part.
Best practice is to have the factory methods just return a plain old pointer to the underlying object, and let the caller decide how to manage ownership (auto_ptr, scoped_ptr, or whatever).
Also your code is buggy, any class that implements virtual methods should have a virtual destructor.
I'm not seeing anywhere you construct my_player, so I have a feeling that some of the code is missing. Specifically, I think your constructor has this line:
my_player = new fooPlayer()
A fooPlayer* (which is what new returns) is not the same thing as an auto_ptr<fooPlayer> object, and auto_ptr is intentionally designed to prevent assigning from one to the other because, frankly, the alternative is worse. For the details, look up (1) conversion constructors, (2) the explicit keyword, and (3) copy constructors and destructive copy semantics.
You should change the constructor to either:
class fooPlayerFactory {
public:
fooPlayerFactory()
{
my_player = std::auto_ptr<fooPlayer>(new fooPlayer());
}
Or (using a member initializer list):
class fooPlayerFactory {
public:
fooPlayerFactory() : my_player(std::auto_ptr<fooPlayer>(new fooPlayer())) { }
The solution isn't pretty but, like I said, the alternative is worse due to some really arcane details.
As a bit of advice, though, you're making life harder than it needs to be; and may in fact be causing strange bugs. auto_ptr exists to manage the lifetime of an object, but the only reason you need to worry about the lifetime of my_player is that you've allocated it with new. But there's no need to call new, and in fact there's no need to keep my_player. And unless fooPlayerFactory is meant to be the base class for some other factory, there's no need to mark functions virtual.
Originally I thought you could get away with simply returning copies of the my_player object, but there's a problem: before returning my_player from MakePlayer() you call a method on it, and I assume that method changes the internal state of my_player. Further calls to MakePlayer() will change the state again, and I think you're going to eventually have my_player in the wrong state. Instead, return a different fooPlayer object with each request. Don't do memory management, just promise to construct the object. That way the user can decide on memory allocation:
fooPlayerFactory factory;
fooPlayer on_stack = factory.MakePlayerX();
fooPlayer* on_heap_raw_pointer = new fooPlayer(factory.MakePlayerO());
std::auto_ptr<fooPlayer> on_heap_managed_scope
= std::auto_ptr<fooPlayer>(new fooPlayer(factory.MakePlayerX()));
I would change fooPlayerFactory to look like this:
class fooPlayerFactory
{
private:
fooPlayer MakePlayer(const char letter) const
{
fooPlayer result;
result.fooPlayerfunc();
return result;
}
public:
fooPlayer MakePlayerX() const
{
fooPlayer prompt; // askYesNo is a fooPlayer member, so it needs an instance
char go_first = prompt.askYesNo("Do you require the first move?");
return MakePlayer(go_first);
}
fooPlayer MakePlayerO() const
{
return fooPlayer();
}
};
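A compilable reduction of that design, with the interactive askYesNo call left out. The letter member and its accessor are hypothetical additions so that the two players can be told apart, and std::unique_ptr is used where the snippet above uses auto_ptr.

```cpp
#include <cassert>
#include <memory>

class fooPlayer {
public:
    explicit fooPlayer(char letter) : letter_(letter) {}
    char letter() const { return letter_; }   // hypothetical accessor
private:
    char letter_;
};

// The factory keeps no state: every call builds a fresh player by value.
class fooPlayerFactory {
public:
    fooPlayer MakePlayerX() const { return fooPlayer('X'); }
    fooPlayer MakePlayerO() const { return fooPlayer('O'); }
};

// The caller decides where each player lives.
inline bool storage_example() {
    fooPlayerFactory factory;
    fooPlayer on_stack = factory.MakePlayerX();
    std::unique_ptr<fooPlayer> on_heap(new fooPlayer(factory.MakePlayerO()));
    return on_stack.letter() == 'X' && on_heap->letter() == 'O';
}
```

Because nothing is shared between calls, both factory methods can stay const without any of the ownership problems from the question.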