std::unique_ptr::release() vs std::move()

std::unique_ptr::release() vs std::move() - c++

I have a class that represents a runtime context and builds a tree, the tree root is held in a unique_ptr. When building the tree is done I want to extract the tree. This is how it looks (not runnable, this is not a debugging question):
class Context {
private:
std::unique_ptr<Node> root{new Node{}};
public:
// imagine a constructor, attributes and methods to build a tree
std::unique_ptr<Node> extractTree() {
return std::move(this->root);
}
};
So I used std::move() to extract the root node from the Context instance.
However there are alternatives to using std::move() e.g.:
std::unique_ptr<Node> extractTree() {
// This seems less intuitive to me
return std::unique_ptr<Node>{this->root.release()};
}
Is std::move() the best choice?

You should definitely go with the first version as the second one basically does the everything the first version does with more code and less readability.
Philosophically speaking, the second version moves the unique pointer, no more, no less. so why going around the table instead of using the already existing, more readable and more idiomatic std::unique_ptr(std::unique_ptr&&) ?
And lastly, if your smart pointer somewhere in the future will hold custom deleter , the first version will make sure the deleter moves as well, the second version does not moves the deleter. I can definitely imagine a programmer building non-custom deleter unique pointer out of custom-deleter unique_pointer with using release. With move semantics, the program will fail to compile.

What you're doing is dangerous. Once you've called getTree(), you must not call it a second time, which is not clear from the interface. You may want to reconsider your design (e.g. a shared_ptr might do a better job, or simply store the root as a raw pointer and manually take care of the deallocation).
Anyway, using std::move is the better option of the two if you want to stick with your design as it makes your intent more clear.
EDIT: apparently 'must not' has a special meaning of forbideness in English I was not aware of. It is fine to call the function twice or as many times as you want, but will not return a pointer to a valid object if done consecutively.

Related

How to avoid unnecessary copies with `boost::variant` recursive XML structure

Let me refer to the following example from the boost spirit tutorial about parsing
a "mini XML" data structure. My question doesn't actually have anything to do with
spirit, it's really about boost::variant, and making efficient "recursive" variant
structures.
Here's the code:
struct mini_xml;
typedef
boost::variant<
boost::recursive_wrapper<mini_xml>
, std::string
>
mini_xml_node;
struct mini_xml
{
std::string name; // tag name
std::vector<mini_xml_node> children; // children
};
In the tutorial, they go on to show how to "adapt" the struct mini_xml for boost::fusion,
and then write spirit grammars that load data into it.
However, it occurred to me recently that there's a subtle issue in this example that may lead to significant overhead.
The issue is that the variant type, mini_xml_node, is not no-throw move constructible in this example. The reason is that it contains boost::recursive_wrapper. The recursive_wrapper<T>
always represents a heap-allocated instance of T, and it does not have an empty state. When a
variant containing recursive_wrapper is moved from, a new dynamic allocation is made and a new T
is move constructed from the old T -- we can't simply take ownership of the old T, because that would leave the old variant in an empty state. (I only looked into this after I was trying to implement my own variant type -- this is indeed what boost::variant does afaict, and it is indeed not no-throw move constructible in this example.)
Because mini_xml_node is not no-throw move constructible, the container std::vector<mini_xml_node> children will be on the "slow path" -- when it uses std::move_if_noexcept, it will elect to copy the elements rather
than move them. This happens for instance every time the vector capacity increases. And when we copy a mini_xml_node, we copy all of its children.
So for instance, if I want to parse from disk an xml tree which has depth d and branching factor b, I estimate that we will end up copying each leaf of the tree about d * log b times, since pushing back into a vector b times causes reallocation about log b times, and it will happen for each ancestor of the leaf.
I don't have an actual application right now where I care about this overhead, but it's easy for me to imagine that
I might. For instance I might want to write a high-performance parser using spirit for some small utility program
that will check that e.g. a data file has a certain form, or count some statistic of it or something. And then it
might very well be that the time to parse dominates the overall runtime of the utility, so these copies might be performance critical.
The question is, what is the simplest and cleanest way to change the code to prevent these copies from happening? I thought of a few approaches
but maybe you see a better way.
Quick and dirty: Write a wrapper class that contains a mini_xml_node but the move constructor is explicitly marked
noexcept, even though it isn't really noexcept. This will cause termination if an exception is thrown, but that should be pretty rare...
Use an alternative to std::vector that doesn't use std::move_if_noexcept, and instead just uses move. If I understand right, boost::vector does this. The tradeoff is it doesn't have strong exception safety, but it's not clear if that's really a problem here.
(This one doesn't actually work.) Add boost::blank as one of the options for mini_xml_node.
When blank is one of the value types, boost::variant has special code that adapts its behavior -- it will use
blank as a natural empty state when default constructing and when making type-changing assignments.
However, it appears that putting blank into boost::variant doesn't go so far as allowing me to no-throw move construct
the variant even though it contains a boost::recursive_wrapper type. Tested on coliru with boost 1.63:
http://coliru.stacked-crooked.com/a/87620e443470b70c
(Maybe this would be a useful feature for boost::variant in the future? What I'd like is that the move constructor transfers ownership of the recursively-wrapped guy, and puts the old variant in the blank state, and marks the move constructor as noexcept.)

Program design concerning polymorphism & resource management

I have a class called Widget. This class is abstract and has virtual methods. To avoid object slicing all Widgets are stored as references or pointers. I have several classes with constructors that store internally the widget given to them; thus the Widget stored must have been initialized outside the constructor and cannot be destroyed before the object is, therefore usually the Widget is allocated via dynamic memory. My question is regarding how to handle this dynamic memory; I have compiled a list of options (feel free to suggest others.) Which is the most idiomatic?
1. Smart pointers. Smart pointers seem like the right choice, but since I'm using C++98 I have to write my own. I also think that writing smart_pointer<Widget> all the time is a little ugly.
2. Copy Widgets when stored. Another course of action is to store a copy of the passed-in Widget instead of the original. This might cause object-slicing, but I'm not sure. Also, users might want to write classes themselves that store passed-in Widgets, and I wouldn't want to make it too complicated.
3. Let the user handle everything. I could perhaps make the user make sure that the Widget is deleted on time. This seems to be what Qt does (?). However, this again complicates things for the user.

I personally like this approach (it is not always applicable, but I used it successfully multiple times):
class WidgetOwner
{
vector<Widget*> m_data;
public:
RegisterWidget(Widget *p) { m_data.push_back(p); }
~WidgetOwner() { for (auto &p : m_data) delete p; }
};
This simple class just stores pointers. This class can store any derivatives of Widget provided that Widget has virtual destructor. For a polymorphic class this should not be a problem.
Note that once Widget is registered, it cannot be destroyed unless everything is destroyed.
The advantage of this approach is that you can pass around pointers freely. They all will be valid until the storage will be destroyed. This is sort of hand made pool.

Which is the most idiomatic?
The most idiomatic would certainly be what next versions of c++ decided to be "the way to go", and that would be smart pointers (You can find/use an implementation on boost for example, also other ones on the internet might be simpler for inspiration).
You can also decide that since you are using c++98 (that's a huge factor to take into consideration), you take what's idiomatic for that context, and since that was pretty much no man's land, the answer is most likely whatever home made design is the most appealing to you.

I think smart pointer is best choice. And if you feel template is ugly, try the typedef

Is there a way to optimize shared_ptr for the case of permanent objects?

I've got some code that is using shared_ptr quite widely as the standard way to refer to a particular type of object (let's call it T) in my app. I've tried to be careful to use make_shared and std::move and const T& where I can for efficiency. Nevertheless, my code spends a great deal of time passing shared_ptrs around (the object I'm wrapping in shared_ptr is the central object of the whole caboodle). The kicker is that pretty often the shared_ptrs are pointing to an object that is used as a marker for "no value"; this object is a global instance of a particular T subclass, and it lives forever since its refcount never goes to zero.
Using a "no value" object is nice because it responds in nice ways to various methods that get sent to these objects, behaving in the way that I want "no value" to behave. However, performance metrics indicate that a huge amount of the time in my code is spent incrementing and decrementing the refcount of that global singleton object, making new shared_ptrs that refer to it and then destroying them. To wit: for a simple test case, the execution time went from 9.33 seconds to 7.35 seconds if I stuck nullptr inside the shared_ptrs to indicate "no value", instead of making them point to the global singleton T "no value" object. That's a hugely important difference; run on much larger problems, this code will soon be used to do multi-day runs on computing clusters. So I really need that speedup. But I'd really like to have my "no value" object, too, so that I don't have to put checks for nullptr all over my code, special-casing that possibility.
So. Is there a way to have my cake and eat it too? In particular, I'm imagining that I might somehow subclass shared_ptr to make a "shared_immortal_ptr" class that I could use with the "no value" object. The subclass would act just like a normal shared_ptr, but it would simply never increment or decrement its refcount, and would skip all related bookkeeping. Is such a thing possible?
I'm also considering making an inline function that would do a get() on my shared_ptrs and would substitute a pointer to the singleton immortal object if get() returned nullptr; if I used that everywhere in my code, and never used * or -> directly on my shared_ptrs, I would be insulated, I suppose.
Or is there another good solution for this situation that hasn't occurred to me?

Galik asked the central question that comes to mind regarding your containment strategy. I'll assume you've considered that and have reason to rely on shared_ptr as a communical containment strategy for which no alternative exists.
I have to suggestions which may seem controversial. What you've defined is that you need a type of shared_ptr that never has a nullptr, but std::shared_ptr doesn't do that, and I checked various versions of the STL to confirm that the customer deleter provided is not an entry point to a solution.
So, consider either making your own smart pointer, or adopting one that you change to suit your needs. The basic idea is to establish a kind of shared_ptr which can be instructed to point it's shadow pointer to a global object it doesn't own.
You have the source to std::shared_ptr. The code is uncomfortable to read. It may be difficult to work with. It is one avenue, but of course you'd copy the source, change the namespace and implement the behavior you desire.
However, one of the first things all of us did in the middle 90's when templates were first introduced to the compilers of the epoch was to begin fashioning containers and smart pointers. Smart pointers are remarkably easy to write. They're harder to design (or were), but then you have a design to model (which you've already used).
You can implement the basic interface of shared_ptr to create a drop in replacement. If you used typedefs well, there should be a limited few places you'd have to change, but at least a search and replace would work reasonably well.
These are the two means I'm suggesting, both ending up with the same feature. Either adopt shared_ptr from the library, or make one from scratch. You'd be surprised how quickly you can fashion a replacement.
If you adopt std::shared_ptr, the main theme would be to understand how shared_ptr determines it should decrement. In most implementations shared_ptr must reference a node, which in my version it calls a control block (_Ref). The node owns the object to be deleted when the reference count reaches zero, but naturally shared_ptr skips that if _Ref is null. However, operators like -> and *, or the get function, don't bother checking _Ref, they just return the shadow, or _Ptr in my version.
Now, _Ptr will be set to nullptr (or 0 in my source) when a reset is called. Reset is called when assigning to another object or pointer, so this works even if using assignment to nullptr. The point is, that for this new type of shared_ptr you need, you could simply change the behavior such that whenever that happens (a reset to nullptr), you set _Ptr, the shadow pointer in shared_ptr, to the "no value global" object's address.
All uses of *,get or -> will return the _Ptr of that no value object, and will correctly behave when used in another assignment, or reset is called again, because those functions don't rely upon the shadow pointer to act upon the node, and since in this special condition that node (or control block) will be nullptr, the rest of shared_ptr would behave as though it was pointing to nullptr correctly - that is, not deleting the global object.
Obviously this sounds crazy to alter std::pointer to such application specific behavior, but frankly that's what performance work tends to make us do; otherwise strange things, like abandoning C++ occasionally in order to obtain the more raw speed of C, or assembler.
Modifying std::shared_ptr source, taken as a copy for this special purpose, is not what I would choose (and, factually, I've faced other versions of your situation, so I have made this choice several times over decades).
To that end, I suggest you build a policy based smart pointer. I find it odd I suggested this earlier on another post today (or yesterday, it's 1:40am).
I refer to Alexandrescu's book from 2001 (I think it was Modern C++...and some words I don't recall). In that he presented loki, which included a policy based smart pointer design, which is still published and freely available on his website.
The idea should have been incorporated into shared_ptr, in my opinion.
Policy based design is implemented as the paradigm of a template class deriving from one or more of it's parameters, like this:
template< typename T, typename B >
class TopClass : public B {};
In this way, you can provide B, from which the object is built. Now, B may have the same construction, it may also be a policy level which derives from it's second parameter (or multiple derivations, however the design works).
Layers can be combined to implement unique behaviors in various categories.
For example:
std::shared_ptr and std::weak_ptrare separate classes which interact as a family with others (the nodes or control blocks) to provide smart pointer service. However, in a design I used several times, these two were built by the same top level template class. The difference between a shared_ptr and a weak_ptr in that design was the attachment policy offered in the second parameter to the template. If the type is instantiated with the weak attachment policy as the second parameter, it's a weak pointer. If it's given a strong attachment policy, it's a smart pointer.
Once you create a policy designed template, you can introduce layers not in the original design (expanding it), or to "intercept" behavior and specialize it like the one you currently require - without corrupting the original code or design.
The smart pointer library I developed had high performance requirements, along with a number of other options including custom memory allocation and automatic locking services to make writing to smart pointers thread safe (which std::shared_ptr doesn't provide). The interface and much of the code is shared, yet several different kinds of smart pointers could be fashioned simply by selecting different policies. To change behavior, a new policy could be inserted without altering the existing code. At present, I use both std::shared_ptr (which I used when it was in boost years ago) and the MetaPtr library I developed years ago, the latter when I need high performance or flexible options, like yours.
If std::shared_ptr had been a policy based design, as loki demonstrates, you'd be able to do this with shared_ptr WITHOUT having to copy the source and move it to a new namespace.
In any event, simply creating a shared pointer which points the shadow pointer to the global object on reset to nullptr, leaving the node pointing to null, provides the behavior you described.

How do I hand out weak_ptrs to this in my constructor?

I'm creating a class that will be part of a DAG. The constructor will take pointers to other instances and use them to initialize a dependency list.
After the dependency list is initialized, it can only ever be shortened - the instance can never be added as a dependency of itself or any of its children.
::std::shared_ptr is a natural for handling this. Reference counts were made for handling DAGs.
Unfortunately, the dependencies need to know their dependents - when a dependency is updated, it needs to tell all of its dependents.
This creates a trivial cycle that can be broken with ::std::weak_ptr. The dependencies can just forget about dependents that go away.
But I cannot find a way for a dependent to create a ::std::weak_ptr to itself while it's being constructed.
This does not work:
object::object(shared_ptr<object> dependency)
{
weak_ptr<object> me = shared_from_this();
dependency->add_dependent(me);
dependencies_.push_back(dependency);
}
That code results in the destructor being called before the constructor exits.
Is there a good way to handle this problem? I'm perfectly happy with a C++11-only solution.

Instead of a constructor, use a function to build the nodes of your graph.
std::shared_ptr<Node> mk_node(std::vector<std::shared_ptr<Node>> const &dependencies)
{
std::shared_ptr<Node> np(new Node(dependencies));
for (size_t i=0; i<dependencies.size(); i++)
dependencies[i].add_dependent(np); // makes a weak_ptr copy of np
return np;
}
If you make this a static member function or a friend of your Node class, you can make the actual constructor private.

Basically, you can't. You need a shared_ptr or weak_ptr to make a weak_ptr and obviously self can only be aware of of its own shared_ptr only in form of weak_ptr (otherwise there's no point in counting references). And, of course, there could be no self-shared_ptr when the object isn't yet constructed.

You can't.
The best I've come up with is to make the constructor private and have a public factory function that returns a shared_ptr to a new object; the factory function can then call a private method on the object post-construction that initialises the weak_ptr.

I'm understanding your question as conceptually related to garbage collection issues.
GC is a non-modular feature: it deals with some global property of the program (more precisely, being a live data is a global, non-modular, property inside a program - there are situations where you cannot deal with that in a modular & compositional way.). AFAIK, STL or C++ standard libraries does not help much for global program features.
A possible answer might be to use (or implement yourself) a garbage collector; Boehm's GC could be useful to you, if you are able to use it in your entire program.
And you could also use GC algorithms (even copying generational ones) to deal with your issue.

Unfortunately, the dependencies need to know their dependents. This is because when a dependency is updated, it needs to tell all of its dependents. And there is a trivial cycle. Fortunately, this cycle can be broken with ::std::weak_ptr. The dependencies can just forget about dependents that go away.
It sounds like a dependency can't reference a dependent unless the dependent exists. (E.g. if the dependent is destroyed, the dependency is destroyed too -- that's what a DAG is after all)
If that's the case, you can just hand out plain pointers (in this case, this). You're never going to need to check inside the dependency if the dependent is alive, because if the dependent died then the dependency should have also died.
Just because the object is owned by a shared_ptr doesn't mean that all pointers to it themselves must be shared_ptrs or weak_ptrs - just that you have to define clear semantics as to when the pointers become invalidated.

It sounds to me like you're trying to conflate to somewhat different items: a single object (a node in the DAG), and managing a collection of those objects.
class DAG {
class node {
std::vector<std::weak_ptr<node> > dependents;
public:
node(std::vector<weak_ptr<node> > d) : dependents(d) {}
};
weak_ptr<node> root;
};
Now, it may be true that DAG will only ever hold a weak_ptr<node> rather than dealing with an instance of a node directly. To the node itself, however, this is more or less irrelevant. It needs to maintain whatever key/data/etc., it contains, along with its own list of dependents.
At the same time, by nesting it inside of DAG (especially if we make the class definition of node private to DAG), we can minimize access to node, so very little other code has to be concerned with anything about a node. Depending on the situation, you might also want to do things like deleting some (most?) of the functions in node that the compiler will generate by default (e.g., default ctor, copy ctor, assignment operator).

This has been driving me nuts as well.
I considered adopting the policy of using pointers to break cycles... But I'm really not found of this because I really like how clear the intent of the weak_ptr is when you see it in your code (you know it's there to break cycles).
Right now I'm leaning toward writing my own weak_ptr class.

Maybe this will help:
inherit from enable_shared_from_this which basically holds a weak_ptr.
This will allow you to use this->shared_from_this();
shared_ptr's know to look if the class inherits from the class and use the classes weak_ptr when pointing to the object (prevents 2 shared pointers from counting references differently)
More about it: cppreference

Best practice when calling initialize functions multiple times?

This may be a subjective question, but I'm more or less asking it and hoping that people share their experiences. (As that is the biggest thing which I lack in C++)
Anyways, suppose I have -for some obscure reason- an initialize function that initializes a datastructure from the heap:
void initialize() {
initialized = true;
pointer = new T;
}
now When I would call the initialize function twice, an memory leak would happen (right?). So I can prevent this is multiple ways:
ignore the call (just check wether I am initialized, and if I am don't do anything)
Throw an error
automatically "cleanup" the code and then reinitialize the thing.
Now what is generally the "best" method, which helps keeping my code manegeable in the future?
EDIT: thank you for the answers so far. However I'd like to know how people handle this is a more generic way. - How do people handle "simple" errors which can be ignored. (like, calling the same function twice while only 1 time it makes sense).

You're the only one who can truly answer the question : do you consider that the initialize function could eventually be called twice, or would this mean that your program followed an unexpected execution flow ?
If the initialize function can be called multiple times : just ignore the call by testing if the allocation has already taken place.
If the initialize function has no decent reason to be called several times : I believe that would be a good candidate for an exception.
Just to be clear, I don't believe cleanup and regenerate to be a viable option (or you should seriously consider renaming the function to reflect this behavior).

This pattern is not unusual for on-demand or lazy initialization of costly data structures that might not always be needed. Singleton is one example, or for a class data member that meets those criteria.
What I would do is just skip the init code if the struct is already in place.
void initialize() {
if (!initialized)
{
initialized = true;
pointer = new T;
}
}
If your program has multiple threads you would have to include locking to make this thread-safe.

I'd look at using boost or STL smart pointers.

I think the answer depends entirely on T (and other members of this class). If they are lightweight and there is no side-effect of re-creating a new one, then by all means cleanup and re-create (but use smart pointers). If on the other hand they are heavy (say a network connection or something like that), you should simply bypass if the boolean is set...
You should also investigate boost::optional, this way you don't need an overall flag, and for each object that should exist, you can check to see if instantiated and then instantiate as necessary... (say in the first pass, some construct okay, but some fail..)

The idea of setting a data member later than the constructor is quite common, so don't worry you're definitely not the first one with this issue.
There are two typical use cases:
On demand / Lazy instantiation: if you're not sure it will be used and it's costly to create, then better NOT to initialize it in the constructor
Caching data: to cache the result of a potentially expensive operation so that subsequent calls need not compute it once again
You are in the "Lazy" category, in which case the simpler way is to use a flag or a nullable value:
flag + value combination: reuse of existing class without heap allocation, however this requires default construction
smart pointer: this bypass the default construction issue, at the cost of heap allocation. Check the copy semantics you need...
boost::optional<T>: similar to a pointer, but with deep copy semantics and no heap allocation. Requires the type to be fully defined though, so heavier on dependencies.
I would strongly recommend the boost::optional<T> idiom, or if you wish to provide dependency insulation you might fall back to a smart pointer like std::unique_ptr<T> (or boost::scoped_ptr<T> if you do not have access to a C++0x compiler).

I think that this could be a scenario where the Singleton pattern could be applied.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

std::unique_ptr::release() vs std::move() - c++

Related

How to avoid unnecessary copies with `boost::variant` recursive XML structure

Program design concerning polymorphism & resource management

Is there a way to optimize shared_ptr for the case of permanent objects?

How do I hand out weak_ptrs to this in my constructor?

Best practice when calling initialize functions multiple times?

Categories

Resources