I am trying to understand which of the following usages of a shared pointer makes more sense when it is inserted into a vector.
bar takes a const reference to a shared pointer, whereas foo takes a copy. In both cases, the passed-in parameter gets moved into a vector. The interesting part is that the use_count of sp in the caller is 2 for both foo and bar, which implies that the vector stores a "copy"?
If a shared_ptr is passed by reference, its count doesn't increment. As soon as it's "moved" into a vector, it does. Does that mean the vector isn't storing a reference to the original object sp?
class A
{
std::vector<std::shared_ptr<int>> _vec;
public:
void bar(const std::shared_ptr<int>& ptr)
{
_vec.emplace_back(std::move(ptr));
}
void foo(std::shared_ptr<int> ptr)
{
_vec.emplace_back(std::move(ptr));
}
};
int main()
{
auto sp = std::make_shared<int>();
A a;
// a.foo(sp); // use_count = 2
a.bar(sp); // use_count = 2
}
You're passing the shared_ptr to bar by reference to const. That means that the original shared_ptr can't be modified via that reference.
Moving from a shared_ptr requires modifying the moved-from shared_ptr to set it to point to nothing.
See the issue? bar can't move from ptr, so it instead ends up copying it into the vector. ptr/sp remains valid and continues to point to the int that you allocated in main and a._vec also holds a shared_ptr to that same int. Thus use_count must be 2.
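The fallback from move to copy can be checked directly. The following is a minimal sketch; the helper name use_count_after_const_move is made up for illustration and mimics what bar does with its parameter.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical helper: does what bar() does and reports the caller's use_count.
long use_count_after_const_move()
{
    auto sp = std::make_shared<int>(42);
    const std::shared_ptr<int>& ptr = sp;   // same kind of parameter as bar()

    // std::move on a const lvalue yields a const rvalue reference ...
    static_assert(std::is_same<decltype(std::move(ptr)),
                               const std::shared_ptr<int>&&>::value,
                  "std::move keeps the const qualifier");

    std::vector<std::shared_ptr<int>> vec;
    // ... which cannot bind to shared_ptr's move constructor (it takes a
    // non-const rvalue reference), so the copy constructor is selected.
    vec.emplace_back(std::move(ptr));

    assert(sp != nullptr);                  // sp was not moved from
    return sp.use_count();                  // counts sp and the copy in vec
}
```

Calling use_count_after_const_move() should return 2, matching the count observed in the question.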
If you want to actually move from sp then you should change bar to accept some sort of mutable reference. Really you should make it accept an rvalue reference though, since a.bar(sp) causing sp to become invalid would violate most programmers' expectations:
class A
{
std::vector<std::shared_ptr<int>> _vec;
public:
void bar(std::shared_ptr<int>&& ptr)
{
_vec.emplace_back(std::move(ptr));
}
};
int main()
{
auto sp = std::make_shared<int>();
A a;
a.bar(std::move(sp));
// Here sp.use_count() is 0 because it was moved from
// a._vec.back().use_count() will be 1 though
}
This limits the caller to always moving their shared_ptr into A. Simply accepting the parameter by value as you do in foo will likely result in no measurable performance difference when the caller wants to give up ownership and provides greater flexibility when they don't.
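To illustrate the flexibility of the by-value version, here is a small sketch. Holder is a hypothetical stand-in for class A from the question, and the two helper functions are invented for this example only.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

class Holder   // hypothetical stand-in for class A from the question
{
    std::vector<std::shared_ptr<int>> _vec;
public:
    void foo(std::shared_ptr<int> ptr) { _vec.emplace_back(std::move(ptr)); }
};

// The caller keeps its shared_ptr: one copy is made at the call site.
long use_count_when_copied()
{
    auto sp = std::make_shared<int>(1);
    Holder h;
    h.foo(sp);
    assert(sp != nullptr);      // the caller still owns the int
    return sp.use_count();      // counts sp and the element in the vector
}

// The caller gives up ownership: no copy is made and sp is left empty.
bool caller_is_empty_when_moved()
{
    auto sp = std::make_shared<int>(1);
    Holder h;
    h.foo(std::move(sp));
    return sp == nullptr;
}
```

The same foo serves both callers; only the call site decides whether a copy or a move happens.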
In both cases, a shared pointer ends up in the vector, but only foo actually moves one there. The vector then shares ownership of the dynamically allocated int with sp in the caller, which keeps managing it.
The difference between foo and bar is that foo takes a copy of the shared pointer, while bar takes a const reference to it. Since foo takes a copy, the reference count is incremented when sp is passed to foo; that copy is then moved into the vector, which leaves the count unchanged. The reference count is therefore 2 after foo is called.
In contrast, bar takes a const reference, so the reference count is not incremented when sp is passed to bar. Inside bar, however, std::move(ptr) yields a const rvalue that cannot be moved from, so emplace_back copies it. The reference count is therefore also 2 after bar is called.
Overall, both versions cost one reference-count increment when the caller passes an lvalue, so neither is more efficient in that case; foo additionally lets the caller pass std::move(sp) and avoid the increment altogether. The difference in performance is likely to be small in practice. It is more important to choose the approach that is more readable and maintainable for your specific use case.
Related
Are smart pointers considered as pointers? And thus can they implicitly used as pointers?
Let's say I have the following class:
class MyClass {
//...
std::shared_ptr<AnotherClass> foo() { /*whatever*/ };
void bar(AnotherClass* a) { /*whatever too*/ };
//...
};
Then can I use MyClass the following way?
// m is an instance of MyClass
m.bar(m.foo());
No, they can't be used interchangeably. You would get a compiler error in your example. But you can always get the raw pointer with shared_ptr::get().
NO! It would be a terrible API. Yes, you could easily implement it within shared_ptr, but just because you could doesn't mean you should.
Why is it such a bad idea? The plain-pointer-based interface of bar doesn't retain an instance of the shared pointer. If bar happens to store the raw pointer somewhere and then exit, there's nothing that guarantees that the pointer it had stored won't become dangling in the future. The only way to guarantee that would be to retain an instance of the shared pointer, not the raw pointer (that's the whole point of shared_ptr!).
It gets worse: the following code is undefined behavior if foo() returns a pointer instance that had only one reference when foo() returned (e.g. if foo is a simple factory of new objects):
AnotherClass *ptr = m.foo().get();
// The shared_ptr instance returned by foo() is destroyed at this point
m.bar(ptr); // undefined behavior: ptr is likely a dangling pointer here
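One way to avoid the dangling pointer is to keep a named shared_ptr alive for as long as the raw pointer is in use. The sketch below assumes a trivial AnotherClass with a single value member and a bar that returns that value; both are invented here for illustration.

```cpp
#include <cassert>
#include <memory>

struct AnotherClass { int value = 7; };   // hypothetical payload

struct MyClass {
    // Simple factory: the returned shared_ptr is the only owner.
    std::shared_ptr<AnotherClass> foo() { return std::make_shared<AnotherClass>(); }
    // Uses the pointer only for the duration of the call; does not store it.
    int bar(AnotherClass* a) { return a->value; }
};

int call_bar_safely()
{
    MyClass m;
    auto keep = m.foo();            // named owner outlives the call below
    assert(keep.use_count() == 1);
    return m.bar(keep.get());       // the raw pointer is valid while keep exists
}
```

Calling m.bar(m.foo().get()) in a single expression would also work here, since the temporary lives until the end of the full expression, but only as long as bar does not store the pointer.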
Here are the options; consider those listed earlier first before considering their successors.
If bar(AnotherClass *) is an external API, then you need to wrap it in a safe way, i.e. the code that would have called Original::bar should be calling MyWrapped::bar, and the wrapper should do whatever lifetime management is necessary. Suppose that there is startUsing(AnotherClass *) and finishUsing(AnotherClass *), and the code expects the pointer to remain valid between startUsing and finishUsing. Your wrapper would be:
class WithUsing {
std::unique_ptr<AnotherClass> owner; /* or shared_ptr if the ownership is shared */
std::shared_ptr<User> user;
public:
WithUsing(std::unique_ptr<AnotherClass> owner_, std::shared_ptr<User> user_) :
owner(std::move(owner_)), user(std::move(user_)) {
// use the members here: the parameters have already been moved from
user->startUsing(owner.get());
}
void bar() const {
user->bar(owner.get());
}
~WithUsing() {
user->finishUsing(owner.get());
}
};
You would then use WithUsing as a handle to the User object, and any uses would be done through that handle, ensuring the existence of the object.
If AnotherClass is copyable and is very cheap to copy (e.g. it consists of a pointer or two), then pass it by value:
void bar(AnotherClass)
If the implementation of bar doesn't need to change the value, it can be defined to take a const-value (the declaration can be without the const as it doesn't matter there):
void bar(const AnotherClass a) { ... }
If bar doesn't store a pointer, then don't pass it a pointer: pass a const reference by default, or a non-const reference if necessary.
void bar(const AnotherClass &a);
void bar_modifies(AnotherClass &a);
If it makes sense to invoke bar with "no object" (a.k.a. "null"), then:
If passing AnotherClass by value is OK, then use std::optional:
void bar(std::optional<AnotherClass> a);
Otherwise, if bar takes ownership, passing unique_ptr works fine since it can be null.
Otherwise, passing shared_ptr works fine since it can be null.
If foo() creates a new object (vs. returning an object that exists already), it should be returning unique_ptr anyway, not a shared_ptr. Factory functions should be returning unique pointers: that's idiomatic C++. Doing otherwise is confusing, since returning a shared_ptr is meant to express existing shared ownership.
std::unique_ptr<AnotherClass> foo();
If bar should take ownership of the value, then it should be accepting a unique pointer - that's the idiom for "I'm taking over managing the lifetime of that object":
void bar(std::unique_ptr<const AnotherClass> a);
void bar_modifies(std::unique_ptr<AnotherClass> a);
If bar should retain shared ownership, then it should be taking shared_ptr, and you will be immediately converting the unique_ptr returned from foo() to a shared one:
struct MyClass {
std::unique_ptr<AnotherClass> foo();
void bar(std::shared_ptr<const AnotherClass> a);
void bar_modifies(std::shared_ptr<AnotherClass> a);
};
void test() {
MyClass m;
std::shared_ptr<AnotherClass> p{m.foo()};
m.bar(p);
}
shared_ptr<const Type> and shared_ptr<Type> will share the ownership; they provide a constant view and a modifiable view of the object, respectively. shared_ptr<Foo> is also convertible to shared_ptr<const Foo> (but not the other way round; you'd use const_pointer_cast for that, with caution). You should always default to accessing objects as constants, and only work with non-constant types when there's an explicit need for it.
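A short sketch of the two views sharing one object follows. The Foo type with a single member n and the function name are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <memory>

struct Foo { int n = 0; };   // hypothetical type

long owners_after_const_view()
{
    std::shared_ptr<Foo> writable = std::make_shared<Foo>();
    std::shared_ptr<const Foo> readonly = writable;   // implicit conversion

    writable->n = 5;              // change through the modifiable view
    assert(readonly->n == 5);     // visible through the constant view
    // readonly->n = 6;           // would not compile: the view is const

    return writable.use_count();  // both views share one reference count
}
```

The function should return 2, since both pointers own the same Foo.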
If a method doesn't modify something, make it self-document that fact by having it accept a reference/pointer to const something instead.
Smart pointers are used to make sure that an object is deleted when it is no longer used (referenced).
Smart pointers are there to manage the lifetime of the object they own or share.
You can think of one as a wrapper that has a pointer inside. So the answer is no. However, you can access the pointer it owns via the get() method.
Please note that it is easy to create dangling pointers with get(), so be extra cautious if you use it.
C++ references are still confusing to me. Suppose I have a function/method which creates an object of type Foo and returns it by reference. (I assume that if I want to return the object, it cannot be a local variable allocated on the stack, so I must allocate it on the heap with new):
Foo& makeFoo() {
...
Foo* f = new Foo;
...
return *f;
}
When I want to store the object created in a local variable of another function, should the type be Foo
void useFoo() {
Foo f = makeFoo();
f.doSomething();
}
or Foo&?
void useFoo() {
Foo& f = makeFoo();
f.doSomething();
}
Since both are correct syntax: Is there a significant difference between the two variants?
Yes, the first one will make a copy of the object that the returned reference refers to, while the second will be a reference to the object returned by makeFoo.
Note that using the first version will result in a memory leak (most likely), unless you do some dark magic inside the copy constructor.
Well, the second will result in a leak as well unless you call delete &f;.
Bottom line: don't. Just follow the crowd and return by value. Or a smart pointer.
Your first code does a lot of work:
void useFoo() {
Foo f = makeFoo(); // line 2
f.doSomething();
}
Thinking of line 2, some interesting things happen. makeFoo() is called, creates a new Foo object on the heap, and returns a reference to it. The compiler then emits code that copy-constructs f from that object (no default construction takes place, because this is an initialization rather than an assignment). Once line 2 is done, f.doSomething() is called on the copy. Just before useFoo() returns, the object at f is destroyed, since it is going out of scope, but the heap object that makeFoo() allocated is never deleted, so it leaks.
Your second code example is much more efficient, but it's actually probably wrong:
void useFoo() {
Foo& f = makeFoo(); // line 2
f.doSomething();
}
Thinking of line 2 in that example, we realize that we don't create an object for f since it is just a reference. The makeFoo() function returns an object that it has newly allocated, and we keep a reference to it. We call doSomething() through that reference. But when the useFoo() function returns, we don't ever destroy the new object that makeFoo() created for us and it leaks.
There are a few different ways to fix this. You could just use the copy mechanism you have in your first code fragment, provided makeFoo() no longer allocates with new, if you don't mind the extra construction, copying and destruction. (If you have trivial constructors and destructors, and little or no state to copy, then it doesn't matter much.) You could just return a pointer, which has the strong implication that the caller is responsible for managing the life cycle of the referenced object.
If you return a pointer, you've implied that the caller must manage the life cycle of the object, but you haven't enforced it. Someone, someday, somewhere will get it wrong. So you might consider making a wrapper class that manages the reference and provides accessors to encapsulate the management of the objects. (You could even bake that into the Foo class itself, if you wanted to.) A wrapper class of this type is called a "smart pointer" in its generic form. If you're using the STL, you'll find a smart pointer implementation in the std::unique_ptr template class.
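As a rough sketch of the smart-pointer route, makeFoo could return a std::unique_ptr<Foo>. The alive counter below is not part of the original code; it is added only to make the absence of a leak observable.

```cpp
#include <cassert>
#include <memory>

struct Foo {
    static int alive;            // counts live Foo objects, for illustration only
    Foo()  { ++alive; }
    ~Foo() { --alive; }
    void doSomething() {}
};
int Foo::alive = 0;

// The return type states that the caller becomes the owner.
std::unique_ptr<Foo> makeFoo()
{
    return std::unique_ptr<Foo>(new Foo);
}

int useFoo()
{
    {
        std::unique_ptr<Foo> f = makeFoo();
        f->doSomething();
        assert(Foo::alive == 1);
    }   // f goes out of scope here and deletes the Foo
    return Foo::alive;           // number of Foo objects still alive
}
```

With this signature the caller cannot forget the delete: useFoo() should return 0 because the Foo is destroyed when f goes out of scope.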
A function should never return a reference to a new object that gets created. When you are making a new value, you should return a value or a pointer. Returning a value is almost always preferred, since almost any compiler will use RVO/NRVO to get rid of the extra copy.
Returning a value:
Foo makeFoo(){
Foo f;
// do something
return f;
}
// Using it
Foo f = makeFoo();
Returning a pointer:
Foo* makeFoo(){
std::unique_ptr<Foo> p(new Foo()); // use a smart pointer for exception-safety
// do something
return p.release();
}
// Using it
Foo* foo1 = makeFoo(); // Can do this
std::unique_ptr<Foo> foo2(makeFoo()); // This is better
How to get a reference to an object having shared_ptr<T> to it? (for a simple class T)
operator* already returns a reference:
T& ref = *ptr;
Or, I suppose I could give a more meaningful example:
void doSomething(std::vector<int>& v)
{
v.push_back(3);
}
auto p = std::make_shared<std::vector<int>>();
//This:
doSomething(*p);
//Is just as valid as this:
std::vector<int> v;
doSomething(v);
(Note that it is of course invalid to use a reference that references a freed object though. Keeping a reference to an object does not have the same effect as keeping a shared_ptr instance. If the count of the shared_ptr instances falls to 0, the object will be freed regardless of how many references are referencing it.)
Is there a concept of shared reference?
Yes actually! shared_ptr provides an "aliasing constructor" that can be used exactly for this purpose. It returns a shared_ptr that uses the same reference count as the input shared_ptr but points to a different object, typically a field or value derived from the backing data.
It is the responsibility of the programmer to make sure that this ptr remains valid as long as this shared_ptr exists, such as in the typical use cases where ptr is a member of the object managed by r or is an alias (e.g., downcast) of r.get()
What is shared_ptr's aliasing constructor for? goes into this in more detail, including an example of how to use it.
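A minimal sketch of the aliasing constructor follows. The Person type and the function name are hypothetical; the constructor itself, shared_ptr(const shared_ptr<Y>& r, element_type* ptr), is the standard one.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Person {            // hypothetical type
    std::string name;
    int age;
};

int age_via_alias()
{
    auto person = std::make_shared<Person>(Person{"Ada", 36});
    // Aliasing constructor: shares ownership with person, points at person->age.
    std::shared_ptr<int> age(person, &person->age);
    assert(person.use_count() == 2);

    person.reset();        // the Person stays alive because age still owns it
    assert(age.use_count() == 1);
    return *age;
}
```

Here age behaves like a "shared reference" to the member: reading *age after person.reset() is still valid, and the function should return 36.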
I have some confusion about the shared_ptr copy constructor. Please consider the following 2 lines:
It is a "constant" reference to a shared_ptr object, that is passed to the copy constructor so that another shared_ptr object is initialized.
The copy constructor is supposed to also increment a member data - "reference counter" - which is also shared among all shared_ptr objects, due to the fact that it is a reference/pointer to some integer telling each shared_ptr object how many of them are still alive.
But, if the copy constructor attempts to increment the reference counting member data, does it not "hit" the const-ness of the shared_ptr passed by reference? Or, does the copy constructor internally use the const_cast operator to temporarily remove the const-ness of the argument?
The phenomenon you're experiencing is not special to the shared pointer. Here's a typical primeval example:
struct Foo
{
int * p;
Foo() : p(new int(1)) { }
};
void f(Foo const & x) // <-- const...?!?
{
*x.p = 12; // ...but this is fine!
}
It is true that x.p has type int * const inside f, but it is not an int const * const! In other words, you cannot change x.p, but you can change *x.p.
This is essentially what's going on in the shared pointer copy constructor (where *p takes the role of the reference counter).
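The same effect can be observed with the real std::shared_ptr. In this small sketch (the function name is made up), the source is declared const, yet copying from it still raises the shared count.

```cpp
#include <cassert>
#include <memory>

long use_count_after_copying_const()
{
    const std::shared_ptr<int> original = std::make_shared<int>(7);
    assert(original.use_count() == 1);

    // Copying from a const shared_ptr compiles and increments the count,
    // because the count lives outside the shared_ptr object itself.
    std::shared_ptr<int> copy(original);
    assert(copy.get() == original.get());
    return original.use_count();
}
```

The function should return 2 even though original was never modifiable.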
Although the other answers are correct, it may not be immediately apparent how they apply. What we have is something like this:
template <class T>
struct shared_ptr_internal {
T *data;
size_t refs;
};
template <class T>
class shared_ptr {
shared_ptr_internal<T> *ptr;
public:
shared_ptr(shared_ptr const &p) {
ptr = p.ptr;
++(ptr->refs);
}
// ...
};
The important point here is that the shared_ptr just contains a pointer to the structure that contains the reference count. The fact that the shared_ptr itself is const doesn't affect the object it points at (what I've called shared_ptr_internal). As such, even when/if the shared_ptr itself is const, manipulating the reference count isn't a problem (and doesn't require a const_cast or mutable either).
I should probably add that in reality, you'd probably structure the code a bit differently than this -- in particular, you'd normally put more (all?) of the code to manipulate the reference count into the shared_ptr_internal (or whatever you decide to call it) itself, instead of messing with those in the parent shared_ptr class.
You'll also typically support weak_ptrs. To do this, you have a second reference count for the number of weak_ptrs that point to the same shared_ptr_internal object. You destroy the final pointee object when the shared_ptr reference count goes to 0, but only destroy the shared_ptr_internal object when both the shared_ptr and weak_ptr reference counts go to 0.
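The two lifetimes can be observed from the outside with std::weak_ptr. The Tracked type below is hypothetical and only counts live instances; the sketch shows that the pointee is destroyed when the last shared_ptr goes away, while the weak_ptr can still be queried afterwards.

```cpp
#include <cassert>
#include <memory>

struct Tracked {                 // hypothetical type that counts live instances
    static int alive;
    Tracked()  { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

bool pointee_gone_while_weak_ptr_remains()
{
    std::weak_ptr<Tracked> weak;
    {
        auto strong = std::make_shared<Tracked>();
        weak = strong;
        assert(Tracked::alive == 1 && !weak.expired());
    }   // last shared_ptr destroyed: the Tracked object is destroyed here
    // weak can still be queried, so the bookkeeping it refers to still exists
    return weak.expired() && weak.lock() == nullptr && Tracked::alive == 0;
}
```

The function should return true: the object is gone, but the weak_ptr can still report that fact.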
It uses an internal pointer which doesn't inherit the constness of the argument, so something like:
(*const_ref.member)++;
is valid.
The pointer is constant, but not the value pointed to.
Wow, what an eye opener this has all been! Thanks to everyone, I have been able to pin down the source of my confusion: I always assumed that the following ("a" contains the address of "b") were all equivalent.
int const *a = &b; // option1
const int *a = &b; // option2
int * const a = &b; // option3
But I was wrong! Only the first two options are equivalent. The third is totally different.
With option1 or option2, "a" can point to anything it wants but cannot change the contents of what it points to.
With option3, once decided what "a" points to, it cannot point to anything else. But it is free to change the contents of what it is pointing to. So, it makes sense that shared_ptr uses option3.
Suppose I have the following code:
class B { /* */ };
class A {
vector<B*> vb;
public:
void add(B* b) { vb.push_back(b); }
};
int main() {
A a;
B* b(new B());
a.add(b);
}
Suppose that in this case, all raw pointers B* can be handled through unique_ptr<B>.
Surprisingly, I wasn't able to find how to convert this code using unique_ptr. After a few tries, I came up with the following code, which compiles:
class A {
vector<unique_ptr<B>> vb;
public:
void add(unique_ptr<B> b) { vb.push_back(move(b)); }
};
int main() {
A a;
unique_ptr<B> b(new B());
a.add(move(b));
}
So my simple question: is this the way to do it and in particular, is move(b) the only way to do it? (I was thinking of rvalue references but I don't fully understand them.)
And if you have a link with complete explanations of move semantics, unique_ptr, etc. that I was not able to find, don't hesitate to share it.
EDIT According to http://thbecker.net/articles/rvalue_references/section_01.html, my code seems to be OK.
Actually, std::move is just syntactic sugar. With object x of class X, move(x) is just the same as:
static_cast <X&&>(x)
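The equivalence can be tried out with a small sketch like the following; the function name is made up for illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

bool cast_moves_like_std_move()
{
    std::unique_ptr<int> a(new int(1));
    std::unique_ptr<int> b(new int(2));
    assert(a != nullptr && b != nullptr);

    // Both initializations select unique_ptr's move constructor.
    std::unique_ptr<int> from_move = std::move(a);
    std::unique_ptr<int> from_cast = static_cast<std::unique_ptr<int>&&>(b);

    // Both sources are now empty and both targets own the original ints.
    return a == nullptr && b == nullptr && *from_move == 1 && *from_cast == 2;
}
```

In both cases the cast itself moves nothing; the move happens in the constructor that the resulting rvalue selects.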
These 2 move functions are needed because casting to a rvalue reference:
prevents function "add" from copying its argument when it is passed by value (a unique_ptr cannot be copied)
makes push_back use the move constructor of unique_ptr<B>
Apparently, I do not need the second std::move in my main() if I change my "add" function to pass by reference (ordinary lvalue ref).
I would like some confirmation of all this, though...
I am somewhat surprised that this is not answered very clearly and explicitly here, nor on any place I easily stumbled upon. While I'm pretty new to this stuff, I think the following can be said.
The situation is a calling function that builds a unique_ptr<T> value (possibly by casting the result from a call to new), and wants to pass it to some function that will take ownership of the object pointed to (by storing it in a data structure for instance, as happens here into a vector). To indicate that ownership has been obtained by the caller, and that it is ready to relinquish it, passing a unique_ptr<T> value is appropriate. There are, as far as I can see, three reasonable modes of passing such a value.
1. Passing by value, as in add(unique_ptr<B> b) in the question.
2. Passing by non-const lvalue reference, as in add(unique_ptr<B>& b)
3. Passing by rvalue reference, as in add(unique_ptr<B>&& b)
Passing by const lvalue reference would not be reasonable, since it does not allow the called function to take ownership (and const rvalue reference would be even more silly than that; I'm not even sure it is allowed).
As far as valid code goes, options 1 and 3 are almost equivalent: they force the caller to write an rvalue as argument to the call, possibly by wrapping a variable in a call to std::move (if the argument is already an rvalue, i.e., unnamed as in a cast from the result of new, this is not necessary). In option 2 however, passing an rvalue (possibly from std::move) is not allowed, and the function must be called with a named unique_ptr<T> variable (when passing a cast from new, one has to assign to a variable first).
When std::move is indeed used, the variable holding the unique_ptr<T> value in the caller is conceptually emptied (converted to an rvalue, that is, cast to an rvalue reference), and ownership is given up at this point. In option 1 the emptying is real, and the value is moved to a temporary that is passed to the called function (if the called function were to inspect the variable in the caller, it would find that it already holds a null pointer). Ownership has been transferred, and there is no way the called function could decide not to accept it (doing nothing with the argument causes the pointed-to value to be destroyed at function exit; calling the release method on the argument would prevent this, but would just result in a memory leak). Surprisingly, options 2 and 3 are semantically equivalent during the function call, although they require different syntax for the caller. If the called function passes the argument on to another function taking an rvalue (such as the push_back method), std::move must be inserted in both cases, which will transfer ownership at that point. Should the called function forget to do anything with the argument, then the caller will find itself still owning the object if it holds a name for it (as is obligatory in option 2); this is so even in option 3, although there the function prototype asked the caller to agree to the release of ownership (by either calling std::move or supplying a temporary). In summary, the methods do the following:
1. Forces the caller to give up ownership, and is sure to actually claim it.
2. Forces the caller to possess ownership, and to be prepared (by supplying a non-const reference) to give it up; however, this is not explicit (no call of std::move is required or even allowed), nor is taking away ownership assured. I would consider this method rather unclear in its intention, unless it is explicitly intended that taking ownership or not is at the discretion of the called function (some use can be imagined, but callers need to be aware).
3. Forces the caller to explicitly indicate giving up ownership, as in 1 (but the actual transfer of ownership is delayed until after the moment of the function call).
Option 3 is fairly clear in its intention; provided ownership is actually taken, it is for me the best solution. It is slightly more efficient than 1 in that no pointer values are moved to temporaries (the calls to std::move are in fact just casts and cost nothing); this might be especially relevant if the pointer is handed through several intermediate functions before its contents are actually moved.
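The difference between options 1 and 3 when the called function does nothing with its argument can be sketched as follows. The function names are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical callees that "forget" to do anything with their argument.
void ignore_by_value(std::unique_ptr<int>) {}
void ignore_by_rvalue_ref(std::unique_ptr<int>&&) {}

bool caller_still_owns_after_by_value()
{
    std::unique_ptr<int> p(new int(1));
    assert(p != nullptr);
    ignore_by_value(std::move(p));       // moved into the parameter, then destroyed
    return p != nullptr;
}

bool caller_still_owns_after_rvalue_ref()
{
    std::unique_ptr<int> p(new int(1));
    assert(p != nullptr);
    ignore_by_rvalue_ref(std::move(p));  // only a reference was bound; nothing moved
    return p != nullptr;
}
```

The first function should return false and the second true: with the rvalue reference, the caller still owns the object after the call.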
Here is some code to experiment with.
#include <iostream>
#include <memory>
#include <vector>
class B
{
unsigned long val;
public:
B(const unsigned long& x) : val(x)
{ std::cout << "storing " << x << std::endl;}
~B() { std::cout << "dropping " << val << std::endl;}
};
typedef std::unique_ptr<B> B_ptr;
class A {
std::vector<B_ptr> vb;
public:
void add(B_ptr&& b)
{ vb.push_back(std::move(b)); } // or even better use emplace_back
};
void f() {
A a;
B_ptr b(new B(123)),c;
a.add(std::move(b));
std::cout << "---" <<std::endl;
a.add(B_ptr(new B(4567))); // unnamed argument does not need std::move
}
As written, output is
storing 123
---
storing 4567
dropping 123
dropping 4567
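Note that the values are destroyed in the order in which they are stored in the vector (this is what the implementation used here does; the standard does not prescribe the order in which a vector destroys its elements). Try changing the prototype of the method add (adapting other code if necessary to make it compile), and whether or not it actually passes on its argument b. Several permutations of the lines of output can be obtained.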
Yes, this is how it should be done. You are explicitly transferring ownership from main to A. This is basically the same as your previous code, except it's more explicit and vastly more reliable.
So my simple question: is this the way to do it and in particular, is this "move(b)" the only way to do it? (I was thinking of rvalue references but I don't fully understand it so...)
And if you have a link with complete explanations of move semantics, unique_ptr... that I was not able to find, don't hesitate.
Shameless plug, search for the heading "Moving into members". It describes exactly your scenario.
Your code in main could be simplified a little, since C++14:
a.add( make_unique<B>() );
where you can put arguments for B's constructor inside the inner parentheses.
You could also consider a class member function that takes ownership of a raw pointer:
void take(B *ptr) { vb.emplace_back(ptr); }
and the corresponding code in main would be:
a.take( new B() );
Another option is to use perfect forwarding for adding vector members:
template<typename... Args>
void emplace(Args&&... args)
{
vb.emplace_back( std::make_unique<B>(std::forward<Args>(args)...) );
}
and the code in main:
a.emplace();
where, as before, you could put constructor arguments for B inside the parentheses.