c++ vector with strict ownership semantics

This is not a question about putting std::auto_ptr into std::vector.
Is there some vector equivalent of std::auto_ptr in std::, std::tr1:: or boost::? I use std::auto_ptr in function parameters and return values to declare the ownership semantics of these functions. But this way I can pass only single objects. As a temporary solution for vectors I have this:
std::auto_ptr<std::vector<std::tr1::shared_ptr<ClassExample> > > fx(....);
which, I suppose, by introducing boost, I will be able to change into this:
std::auto_ptr<std::vector<boost::unique_ptr<ClassExample> > > f(...);
in order to define strict ownership passing, but it seems to be quite complicated. To simplify it, I can use
std::vector<boost::unique_ptr<ClassExample> > f(...);
as the price for deep copy of the vector is not high, but I am still curious if there is something that I can use like this:
auto_vector<ClassExample> f(...);
meaning that the function is releasing ownership of all the objects and the vector internal data array is not deeply copied.

There's a C++11 solution - it requires an implementation that provides r-value references and has a standard library updated to include move constructors and std::unique_ptr.
Just return std::vector<std::unique_ptr<T>> - that type can't be copied, as std::unique_ptr<T> isn't copyable. The compiler will either use the move constructor when returning, which will not invoke a deep copy or will apply the RVO and elide the construction of a new object.
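A minimal sketch of that approach, assuming a C++11 compiler. ClassExample here is a hypothetical stand-in for the question's class, and make_examples and consume are illustrative names, not part of any library:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's ClassExample.
struct ClassExample {
    explicit ClassExample(int v) : value(v) {}
    int value;
};

typedef std::vector<std::unique_ptr<ClassExample>> OwnedVector;

// The element type cannot be copied, so the vector cannot be deep-copied either.
static_assert(!std::is_copy_constructible<std::unique_ptr<ClassExample>>::value,
              "unique_ptr is move-only");

// Returning by value hands the vector and every element to the caller.
OwnedVector make_examples(std::size_t n) {
    OwnedVector result;
    result.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        result.push_back(std::unique_ptr<ClassExample>(
            new ClassExample(static_cast<int>(i))));
    }
    return result;  // moved or elided, never deep-copied
}

// Taking the parameter by value makes the callee the new owner.
std::size_t consume(OwnedVector examples) {
    return examples.size();  // elements are destroyed when 'examples' goes out of scope
}
```

A caller would write consume(std::move(v)) to give the vector away; passing v without std::move does not compile, which is what makes the ownership transfer explicit at the call site.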

Without any particular reason as to why you need to maintain a vector of pointers, the first solution would be to ditch all the complexity and go for the simple:
std::vector<Type> f();
Ownership of the objects is unique (the vector owns them), and while the code looks like it is copying the vector on return it will be optimized away in most cases.
If some requirement that you are not showing forces the objects inside the vector to be dynamically allocated (they must be created through a factory, they are pointers to derived types, or they must not move when the vector grows because other code holds references or pointers to them), then I would go for a boost::ptr_vector, which maintains ownership of the contained objects. Again, return by value:
boost::ptr_vector<Type> f();

Related

How I can use emplace for unordered_set that holds shared_ptr to the object?

Say I have an object:
struct Foo {
Foo(const std::string& str1, const std::string& str2)
: mStr1(str1), mStr2(str2)
{}
std::string mStr1;
std::string mStr2;
};
And set
typedef std::unordered_set<std::shared_ptr<Foo> , Hash, Compare> Set;
I have custom hasher and compare. But when I say:
Set set;
set.emplace(str1, str2);
I receive compile error, because the constructor of Foo is obviously not a constructor of std::shared_ptr<Foo>. What I would like is when emplace needs to construct a pointer to use std::make_shared<Foo>(str1, str2)
It seems that I also need a custom allocator for that, but I did not manage to implement one that satisfy the compiler.
My question is: is what I want possible? If yes, how? Is the allocator the right way to go? If so, can you point me to an example?

Use set.insert(std::make_shared<Foo>(str1, str2));. emplace is generally not a gain for containers with unique keys when duplicate keys are an issue, because of the way it operates:
It must construct the object first before it can compare it to existing keys in the container to determine if it can be inserted.
Once the object is constructed, it cannot be copied or moved, because the object is not required to be copyable or movable. Moreover, there's no reliable way for emplace to detect when it can copy or move something, because there are plenty of types for which is_copy_constructible returns true but cannot be actually copied.
Object construction can only happen once, since the constructor may move from the arguments or have other side effects.
A typical implementation of emplace thus always allocates memory for the node up-front, constructs the object inside that memory, compares it with existing elements in the container, and then either links it in, or destroys it and deallocates the memory.
insert, on the other hand, has the key readily available. It can therefore first decide whether the node should be inserted, and only allocate memory if it should be.
In theory, implementations might special-case emplace for the "one argument with the same type as the element type" case. But I know of no implementation that actually does this.
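A sketch of the insert-based approach. The question does not show its Hash and Compare, so the two functors below are assumptions made up for illustration (they hash and compare the pointed-to strings), as is the helper name add_foo:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_set>

struct Foo {
    Foo(const std::string& str1, const std::string& str2)
        : mStr1(str1), mStr2(str2) {}
    std::string mStr1;
    std::string mStr2;
};

// Hypothetical hasher: combines the hashes of the two pointed-to strings.
struct Hash {
    std::size_t operator()(const std::shared_ptr<Foo>& p) const {
        std::hash<std::string> h;
        return h(p->mStr1) ^ (h(p->mStr2) << 1);
    }
};

// Hypothetical comparison: equal when the pointed-to strings are equal.
struct Compare {
    bool operator()(const std::shared_ptr<Foo>& a,
                    const std::shared_ptr<Foo>& b) const {
        return a->mStr1 == b->mStr1 && a->mStr2 == b->mStr2;
    }
};

typedef std::unordered_set<std::shared_ptr<Foo>, Hash, Compare> Set;

// Returns true if a new element was inserted, false if an equal one already existed.
bool add_foo(Set& set, const std::string& str1, const std::string& str2) {
    return set.insert(std::make_shared<Foo>(str1, str2)).second;
}
```

The Foo is still constructed before the lookup, since the hasher needs it, but for a duplicate key insert can skip the node allocation that emplace typically performs up-front.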

You can just use std::make_shared directly in the argument list of emplace.
set.emplace(std::make_shared<Foo>(str1, str2));
No custom allocator required.

I receive compile error, because the constructor of Foo is obviously
not a constructor of std::shared_ptr<Foo>. What I would like is when
emplace needs to construct a pointer to use
std::make_shared<Foo>(str1, str2)
emplace is implemented as a function that uses perfect forwarding to invoke the constructor of the contained element (in this case shared_ptr). The contained element's constructor accepts a pointer to Foo, therefore you should be able to do this (just like you would construct a shared_ptr<Foo> object):
set.emplace(new Foo("x", "y")); //or
set.emplace(new Foo(str1, str2));
It seems that I also need a custom allocator for that, but I did not
manage to implement one that satisfy the compiler.
A custom allocator is total overkill if all you want to do is add a shared_ptr in the most efficient way (by invoking a forwarding constructor on some pre-allocated element), or I'm totally misunderstanding your question. You would typically use an allocator if you don't want the element to be constructed using the default allocator (which uses operator new). In this case, shared_ptr itself will be the element that will be constructed on the heap. You would only use an allocator if you are concerned that heap allocations are for some reason inefficient for your purposes (e.g. if you allocate millions of small objects).
Note (as commented by @Yakk) that, in this case it is possible that the instantiation of shared_ptr may throw (I can only think of bad_alloc as possibility), in which case the pointer passed to emplace would cause a leak. For this reason I too think std::make_shared would be a better option (as mentioned in another answer).

Why a unique_ptr can be used with std containers, vectors<> for example?

I understand that auto_ptr cannot be used with vectors since auto_ptr does not meet the requirement of being copy constructible. Since the auto_ptr being copied is modified, copying does not result in two exact copies, thereby violating the copy constructible requirement.
unique_ptr also seems to do the same; it modifies the object being copied - the pointer member of the object being copied is set to nullptr.
Then, how is it possible to use unique_ptr with vectors and not auto_ptr?
Is my understanding correct or am I missing something here?
auto_ptr <int> autoPtr(new int);
vector < auto_ptr <int> > autoVec;
autoVec.push_back(autoPtr); //compiler error..why?
unique_ptr <int> uniquePtr(new int);
vector < unique_ptr <int> > uniqueVec;
uniqueVec.push_back(std::move(uniquePtr)); //okay..why?

The problem with auto_ptr was that an operation that wasn't expected to modify an object did modify it. The copy constructor modified the source.
unique_ptr doesn't have a copy constructor. It has a move constructor. The move constructor also modifies the source, but this is expected, and it's possible for generic code to distinguish between situations that use the copy constructor and those that use the move constructor.
vector (and the other containers) didn't just magically work with unique_ptr once it was written. They had to be updated so that they could work with move-only types. This was done in C++11. It was possible to make this modification because moving and copying are different operations. It wasn't possible to do the same for auto_ptr because that class had a weird copy constructor.
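A small sketch of that distinction, assuming C++11. The type traits show that generic code can tell the two operations apart, and push_and_check is a hypothetical helper written only for this illustration:

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

// Generic code can ask which operations a type supports.
static_assert(!std::is_copy_constructible<std::unique_ptr<int>>::value,
              "unique_ptr cannot be copied");
static_assert(std::is_move_constructible<std::unique_ptr<int>>::value,
              "unique_ptr can be moved");

// Moves 'p' into the vector and reports whether the source is now null.
bool push_and_check(std::vector<std::unique_ptr<int>>& vec,
                    std::unique_ptr<int>& p) {
    // vec.push_back(p);          // would not compile: the copy constructor is deleted
    vec.push_back(std::move(p));  // explicit move: the caller asked for the transfer
    return p == nullptr;          // a moved-from unique_ptr is guaranteed to be null
}
```

The source is modified here too, but only because the call site spelled out std::move; nothing that looks like a copy changes its argument.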

The compilation error is because of an attempt to modify the auto_ptr const& formal argument (by copying it) in the push_back implementation.
Instead of
autoVec.push_back(autoPtr);
you can do
autoVec.emplace_back( autoPtr.release() );
or more directly (not using the autoPtr variable)
autoVec.emplace_back( new int(42) );
However, while this works in a technical sense (disclaimer: code not even glanced at by any compiler) you still have the problem of auto_ptr items nulling themselves as a result of attempted copying. So this is all very very unsafe. Additionally, auto_ptr was deprecated in C++11; I do not know whether it has been altogether removed now, but that's not unlikely.

Which C++ object copies more quickly?

Two instances of this C++ object exist.
my_type
{
public:
std::vector<unsigned short> a;
}
One where the std::vector is empty and the other where it contains 50 elements.
Which instance copies most quickly or do they copy in the same time?

When a std::vector is copied all of its elements are also copied - so the time taken should be proportional to vector.size().
In c++0x so called move semantics are introduced, allowing a move constructor and move assignment operator to be defined for types. These are defined for standard library containers (such as std::vector) and should allow for vectors to be moved in O(1) time. If you're worried about performance, maybe you could re-cast your operations to make use of these new features.
EDIT: Based on the linked question, if you're worried about the extra copies potentially done when calling vector::push_back you have a few options:
In c++0x use the new vector::emplace_back instead. This allows for your objects to be constructed in-place in the container.
In c++0x use move semantics, via something like vector.push_back(std::move(object_to_push)). For POD types this will still do more copying than the emplace_back option.
Store a container of pointers to objects rather than objects themselves. The only thing that will get copied by the container in this case is the pointer itself - which is cheap. You potentially want to use some variant of smart pointers with this option.
Hope this helps.
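To make the copy versus move difference above concrete, here is a minimal sketch assuming C++11 and the default allocator. The helper names copy_of and moved_from are hypothetical:

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Mirrors the question's my_type.
struct my_type {
    std::vector<unsigned short> a;
};

// Copying duplicates every element: work proportional to a.size().
my_type copy_of(const my_type& src) {
    return src;
}

// Moving takes over the source's internal buffer: constant time regardless of size.
my_type moved_from(my_type& src) {
    return std::move(src);
}
```

Comparing a.data() before and after shows the difference: a copy ends up with a new buffer, while a move keeps the original one and allocates nothing.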

Should I use shared_ptr or unique_ptr

I've been making some objects using the pimpl idiom, but I'm not sure whether to use std::shared_ptr or std::unique_ptr.
I understand that std::unique_ptr is more efficient, but this isn't so much of an issue for me, as these objects are relatively heavyweight anyway so the cost of std::shared_ptr over std::unique_ptr is relatively minor.
I'm currently going with std::shared_ptr just because of the extra flexibility. For example, using a std::shared_ptr allows me to store these objects in a hashmap for quick access while still being able to return copies of these objects to callers (as I believe any iterators or references may quickly become invalid).
However, these objects in a way really aren't being copied, as changes affect all copies, so I was wondering that perhaps using std::shared_ptr and allowing copies is some sort of anti-pattern or bad thing.
Is this correct?

I've been making some objects using the pimpl idiom, but I'm not sure whether to use shared_ptr or unique_ptr.
Definitely unique_ptr or scoped_ptr.
Pimpl is not a pattern, but an idiom, which deals with compile-time dependency and binary compatibility. It should not affect the semantics of the objects, especially with regard to its copying behavior.
You may use whatever kind of smart pointer you want under the hood, but those 2 guarantee that you won't accidentally share the implementation between two distinct objects, as they require a conscious decision about the implementation of the copy constructor and assignment operator.
However, these objects in a way really aren't being copied, as changes affect all copies, so I was wondering that perhaps using shared_ptr and allowing copies is some sort of anti-pattern or bad thing.
It is not an anti-pattern, in fact, it is a pattern: Aliasing. You already use it, in C++, with bare pointers and references. shared_ptr offers an extra measure of "safety" to avoid dead references, at the cost of extra complexity and new issues (beware of cycles which create memory leaks).
Unrelated to Pimpl
I understand unique_ptr is more efficient, but this isn't so much of an issue for me, as these objects are relatively heavyweight anyway so the cost of shared_ptr over unique_ptr is relatively minor.
If you can factor out some state, you may want to take a look at the Flyweight pattern.

If you use shared_ptr, it's not really the classical pimpl
idiom (unless you take additional steps). But the real question
is why you want to use a smart pointer to begin with; it's very
clear where the delete should occur, and there's no issue of
exception safety or other to be concerned with. At most,
a smart pointer will save you a line or two of code. And the
only one which has the correct semantics is boost::scoped_ptr,
and I don't think it works in this case. (IIRC, it requires
a complete type in order to be instantiated, but I could be
wrong.)
An important aspect of the pimpl idiom is that its use should be
transparent to the client; the class should behave exactly as if
it were implemented classically. This means either inhibiting
copy and assignment or implementing deep copy, unless the class
is immutable (no non-const member functions). None of the usual
smart pointers implement deep copy; you could implement one, of
course, but it would probably still require a complete type
whenever the copy occurs, which means that you'd still have to
provide a user defined copy constructor and assignment operator
(since they can't be inline). Given this, it's probably not
worth the bother using the smart pointer.
An exception is if the objects are immutable. In this case, it
doesn't matter whether the copy is deep or not, and shared_ptr
handles the situation completely.
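For comparison, a sketch of a pimpl class built on std::unique_ptr with a user-defined deep copy, so that copies behave as if the class were implemented classically. Widget and its members are hypothetical names, and the header and source file are shown together only to keep the example self-contained:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// --- widget.h (hypothetical) ---
class Widget {
public:
    explicit Widget(std::string name);
    ~Widget();                               // defined where Impl is complete
    Widget(const Widget& other);             // deep copy
    Widget& operator=(const Widget& other);  // deep copy
    void rename(std::string name);
    const std::string& name() const;
private:
    struct Impl;                             // incomplete type in the header
    std::unique_ptr<Impl> impl_;
};

// --- widget.cpp (hypothetical) ---
struct Widget::Impl {
    std::string name;
};

Widget::Widget(std::string name) : impl_(new Impl{std::move(name)}) {}
Widget::~Widget() = default;
Widget::Widget(const Widget& other) : impl_(new Impl(*other.impl_)) {}
Widget& Widget::operator=(const Widget& other) {
    *impl_ = *other.impl_;  // copy the implementation, not the pointer
    return *this;
}
void Widget::rename(std::string name) { impl_->name = std::move(name); }
const std::string& Widget::name() const { return impl_->name; }
```

The destructor and copy operations are declared in the header but defined in the source file because std::unique_ptr needs the complete Impl type wherever it destroys or copies the pointee.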

When you use a shared_ptr (for example in a container, then look this up and return it by-value), you are not causing a copy of the object it points to, simply a copy of the pointer with a reference count.
This means that if you modify the underlying object from multiple points, then you affect changes on the same instance. This is exactly what it is designed for, so not some anti-pattern!
When passing a shared_ptr (as the comments say), it's better to pass by const reference and copy (thereby incrementing the reference count) where needed. As for return, case-by-case.
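A short sketch of the sharing behaviour described in this answer. Record, Cache, lookup and touch are hypothetical names chosen for illustration:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical heavyweight object.
struct Record {
    int hits = 0;
};

typedef std::unordered_map<std::string, std::shared_ptr<Record>> Cache;

// Returning by value copies only the pointer and bumps the reference count.
std::shared_ptr<Record> lookup(const Cache& cache, const std::string& key) {
    Cache::const_iterator it = cache.find(key);
    if (it == cache.end()) {
        return std::shared_ptr<Record>();  // null when the key is absent
    }
    return it->second;
}

// Passing by const reference avoids touching the reference count
// when the callee does not keep its own copy.
void touch(const std::shared_ptr<Record>& r) {
    ++r->hits;
}
```

A change made through the returned pointer is visible through the one stored in the map, because both refer to the same Record.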

Yes, please use them. Simply put, the shared_ptr is an implementation of smart pointer. unique_ptr is an implementation of automatic pointer:

unique_ptr - major improvement?

In the current C++ standard, creating collections that satisfy the following rules is hard, if not impossible:
1. exception safety,
2. cheap internal operations (in actual STL containers: the operations are copies),
3. automatic memory management.
To satisfy (1), a collection can't store raw pointers. To satisfy (2), a collection must store raw pointers. To satisfy (3), a collection must store objects by value.
Conclusion: the three items conflict with each other.
Item (2) will not be satisfied when shared_ptrs are used because when a collection will need to move an element, it will need to make two calls: to a constructor and to a destructor. No massive, memcpy()-like copy/move operations are possible.
Am I correct that the described problem will be solved by unique_ptr and std::move()? Collections utilizing the tools will be able to satisfy all 3 conditions:
When a collection will be deleted as a side effect of an exception, it will call unique_ptr's destructors. No memory leak.
unique_ptr does not need any extra space for a reference counter; therefore its body should be exactly the same size as the wrapped pointer,
I am not sure, but it looks like this allows to move groups of unique_ptrs by using memmove() like operations (?),
even if it's not possible, the std::move() operator will allow to move each unique_ptr object without making the constructor/destructor pair calls.
unique_ptr will have exclusive ownership of given memory. No accidental memory leaks will be possible.
Is this true? What are other advantages of using unique_ptr?

I agree entirely. There's at last a natural way of handling heap allocated objects.
In answer to:
I am not sure, but it looks like this allows to move groups of unique_ptrs by using memmove() like operations,
there was a proposal to allow this, but it hasn't made it into the C++11 Standard.

Yes, you are right. I would only add this is possible thanks to r-value references.

When a collection will be deleted as a side effect of an exception, it will call unique_ptr's destructors. No memory leak.
Yes, a container of unique_ptr will satisfy this.
unique_ptr does not need any extra space for a reference counter; therefore its body should be exactly the same size as the wrapped pointer
unique_ptr's size is implementation-defined. While all reasonable implementations of unique_ptr using its default deleter will likely only be a pointer in size, there is no guarantee of this in the standard.
I am not sure, but it looks like this allows to move groups of unique_ptrs by using memmove() like operations (?),
Absolutely not. unique_ptr is not a trivial class; therefore, it cannot be memmoved around. Even if it were, you can't just memmove them, because the destructors for the originals need to be called. It would have to be a memmove followed by a memset.
even if it's not possible, the std::move() operator will allow to move each unique_ptr object without making the constructor/destructor pair calls.
Also incorrect. Movement does not make constructors and destructors not be called. The unique_ptrs that are being destroyed need to be destroyed; that requires a call to their destructors. Similarly, the new unique_ptrs need to have their constructors called; that's how an object's lifetime begins.
There's no avoiding that; it's how C++ works.
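A small sketch of these two points, assuming C++11 and a standard library that provides std::is_trivially_copyable. CountingDeleter and deletions_after_moves are hypothetical names used only to observe what happens:

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// unique_ptr has a non-trivial destructor and move constructor,
// so it is not a candidate for memmove-style relocation.
static_assert(!std::is_trivially_copyable<std::unique_ptr<int>>::value,
              "unique_ptr is not trivially copyable");

// Hypothetical deleter that counts how many objects are actually freed.
struct CountingDeleter {
    int* count;
    void operator()(int* p) const {
        ++*count;
        delete p;
    }
};

typedef std::unique_ptr<int, CountingDeleter> CountedPtr;

// Moves one object through three owners and returns the number of deletions.
int deletions_after_moves() {
    int count = 0;
    {
        CountedPtr a(new int(1), CountingDeleter{&count});
        CountedPtr b(std::move(a));  // move constructor runs; 'a' becomes null
        CountedPtr c(std::move(b));  // and again; 'b' becomes null
    }  // three destructors run here, but only 'c' still owns the object
    return count;
}
```

Every unique_ptr in the chain is constructed and destroyed, yet the destructor of a null unique_ptr does not invoke the deleter, so the object is freed once.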
However, that's not what you should be worried about. Honestly, if you're concerned about a simple constructor/destructor call, you're either in code that you should be hand-optimizing (and thus writing your own code for), or you're prematurely optimizing your code. What matters is not whether constructors/destructors are called; what matters is how fast the resulting code is.
unique_ptr will have exclusive ownership of given memory. No accidental memory leaks will be possible.
Yes, it will.
Personally, I'd say you're doing one of the following:
Being excessively paranoid about copying objects. This is evidenced by the fact that you consider putting a shared_ptr in a container to be too costly a copy. This is an all-too-common malady among C++ programmers. That's not to say that copying is always good or something, but obsessing over copying a shared_ptr in a container is ridiculous outside of exceptional circumstances.
Not aware of how to properly use move semantics. If your objects are expensive to copy but cheap to move... then move them into the container. There's no reason to have a pointer indirection when your objects already contain pointer indirections. Just use movement with the objects themselves, not unique_ptrs to objects.
Disregarding the alternatives. Namely, Boost's pointer containers. They seem to have everything you want. They own pointers to their objects, but externally they have value semantics rather than pointer semantics. They're exception safe, and any copying happens with pointers. No unique_ptr constructor/destructor "overhead".

It looks like the three conditions I've enumerated in my post are possible to obtain by using Boost Pointer Container Library.

This question illustrates why I so love the Boehm garbage collector (libgc). There's never a need to copy anything for reasons of memory management, and indeed, ownership of memory no longer needs to be mentioned as part of APIs. You have to buy a little more RAM to get the same CPU performance, but you save hundreds of hours of programmers' time. You decide.
