Why std::find doesn't work on std::shared_ptr

We came across something at work that we cannot explain, and even though we found a solution, I would like to know exactly why the first code was fishy.
Here is a minimal code example:
#include <algorithm>
#include <iostream>
#include <memory>
#include <vector>
int main() {
    std::vector<std::shared_ptr<int>> r;
    r.push_back(std::make_shared<int>(42));
    r.push_back(std::make_shared<int>(1337));
    r.push_back(std::make_shared<int>(13));
    r.push_back(std::make_shared<int>(37));
    int* s = r.back().get();
    // 1 - compilation error:
    // auto it = std::find(r.begin(), r.end(), s);
    // 2 - runtime error:
    // auto it = std::find(r.begin(), r.end(), std::shared_ptr<int>(s));
    // 3 - works fine:
    auto it = std::find_if(r.begin(), r.end(), [s](std::shared_ptr<int> i) {
        return i.get() == s;
    });
    if (it == r.end())
        std::cout << "oups" << std::endl;
    else
        std::cout << "found" << std::endl;
    return 0;
}
So what I want to know is why the find calls are not working.
For the first one, it seems that shared_ptr does not have a comparison operator with raw pointers. Can someone explain why?
The second one seems to be a problem of ownership, with multiple deletes (when my local shared_ptr goes out of scope it deletes my pointer), but what I don't understand is why the runtime error occurs during the find execution; the double delete should happen only on the vector destruction. Any thoughts?
I have the working solution with find_if, so what I really want is to understand why the first two are not working, not another working solution (but if you have a more elegant one, feel free to post).

For the first one, it seems that shared_ptr do not have a comparison
operator with raw pointers, can someone explain why ?
Subjective, but I certainly don't consider it a good idea for shared pointers to be comparable to raw pointers, and I think the authors of std::shared_ptr and the standard's committee agree with that sentiment.
The second one seems to be a problem of ownership, multiple delete
(when my local shared_ptr goes out of scope it delete my pointer), but
what i don't understand is why the runtime error is during the find
execution, the double delete should happen only on the vector
destruction, any thoughts ?
s is a pointer to an int that was allocated by make_shared as part of a block, together with the reference counting information. How it was actually allocated is up to the implementation, but you can be sure it was not with a simple unadorned new expression, because that would allocate a separate int in its own memory location. I.e. it was not allocated in any of these ways:
p = new int;
p = new int(value);
p = new int{value};
Then you passed s to the constructor of a new shared_ptr (the shared_ptr you passed as an argument to std::find). Since you didn't pass a special deleter along with the pointer, the default deleter will be used. The default deleter will simply call delete on the pointer.
Since the pointer was not allocated with an unadorned new expression, calling delete on it is undefined behavior. Since the temporary shared_ptr will be destroyed at the end of the statement, and it believes it is the sole owner of the integer, delete will be called on the integer at the end of the statement. This is likely the cause of your runtime error.
Try the following, easier to reason about snippet, and you will likely run into the same problem:
auto p = std::make_shared<int>(10);
delete p.get(); // This will most likely cause the same error.
// It is undefined behavior though, so there
// are no guarantees on that.
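If a plain std::find call is really wanted, one possible way to make case 2 well-defined is to give the temporary shared_ptr a deleter that does nothing, so destroying it never frees the int. The following is only a sketch under that assumption; the helper name find_raw is made up for illustration and does not come from the question.

```cpp
#include <algorithm>
#include <memory>
#include <vector>

// Hypothetical helper: looks up a raw address with std::find.
// The temporary key uses a no-op deleter, so it never calls delete on `s`.
inline std::vector<std::shared_ptr<int>>::const_iterator
find_raw(const std::vector<std::shared_ptr<int>>& v, int* s) {
    std::shared_ptr<int> key(s, [](int*) { /* not an owner: do nothing */ });
    // operator== on two shared_ptrs compares the stored pointers (get()).
    return std::find(v.begin(), v.end(), key);
}
```

The key still allocates its own control block, so the find_if version from the question is cheaper and remains the simpler choice.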

The smart pointer class template std::shared_ptr<> only supports operators for comparison against other std::shared_ptr<> objects; not raw pointers. Specifically, these are supported in that case:
operator== - Equivalence
operator!= - Negated equivalence
operator< - Less-than
operator<= - Less-than or equivalent
operator> - Greater-than
operator>= - Greater-than or equivalent
Regarding why in the first case: it isn't just a question of value; it's a question of equivalence. A std::shared_ptr<> cannot be considered equivalent or comparable to a raw address simply because that raw address may not be held within a shared pointer. And even if the addresses are value equivalent, that doesn't mean the source of the latter came from a properly reference-counted equivalence (i.e. another shared pointer). Interestingly, your second example exposes what happens when you try to rig that system.
Regarding the second case, constructing a shared pointer as you are will proclaim two independent shared pointers having independent ownership of the same dynamic resource. So ask yourself, which one gets to delete it ? Um... yeah. Only when you replicate the std::shared_ptr<> itself will the reference count material shared among shared pointers holding the same datum reference be properly managed, so your code in this case is just-plain wrong.
If you want to hunt a raw address down in a collection of shared pointers, your third method is exactly how you should do it.
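As a sketch of that third method wrapped into a reusable function (the name find_by_address is hypothetical), the predicate can take each element by const reference, which avoids copying the shared_ptr and touching its reference count on every comparison:

```cpp
#include <algorithm>
#include <memory>
#include <vector>

// Hypothetical helper: returns an iterator to the element whose stored
// pointer equals `raw`, or v.end() if there is none.
template <typename T>
typename std::vector<std::shared_ptr<T>>::const_iterator
find_by_address(const std::vector<std::shared_ptr<T>>& v, const T* raw) {
    return std::find_if(v.begin(), v.end(),
                        [raw](const std::shared_ptr<T>& p) {
                            return p.get() == raw;  // compare addresses only
                        });
}
```

No ownership is created or transferred here, so nothing is deleted when the lookup finishes.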
Edit: Why does the ownership issue in case 2 render where it does?
OK, I did some hunting, and it turns out it's a runtime thing (at least on my implementation). I would have to check to know for sure whether this behavior (of std::make_shared) is mandated by the standard, but I doubt it. The bottom line is this. These two things:
r.push_back(std::shared_ptr<int>(new int(42)));
and
r.push_back(std::make_shared<int>(42));
can do very different things. The former dynamically allocates a new int, then sends its address off to the matching constructor of std::shared_ptr<int>, which allocates its own shared reference data that manages reference counting for the provided address. I.e. there are two distinct blocks of data from separate allocations.
But the latter does something different. It allocates the object and the shared reference data in the same memory block, constructing the object in place (placement-new) from the arguments provided. The result is that there is one memory allocation, and it holds both the reference data and the object, the latter at an offset within the allocated memory. Therefore the pointer you're sending to your shared_ptr did not come from an allocation return value.
Try the first one, and I bet you'll see your runtime error relocate to the destruction of the vector rather than the conclusion of the find.

bool operator ==(const std::shared_ptr<T>&, const T*) doesn't exist.
It is a bad usage of std::shared_ptr
it is like you do:
int* p = new int(42);
std::shared_ptr<int> sp1(p);
std::shared_ptr<int> sp2(p); // Incorrect, should be sp2(sp1)
// each sp1 and sp2 will delete p at end of scope -> double delete ...

Related

Thread-safety of reference count in std::shared_ptr

Looking at this implementation of std::shared_ptr https://thecandcppclub.com/deepeshmenon/chapter-10-shared-pointers-and-atomics-in-c-an-introduction/781/ :
Question 1 : I can see that we're using std::atomic<int*> to store the pointer to the reference count associated with the resource being managed. Now, in the destructor of the shared_ptr, we're changing the value of the ref-count itself (like --(*reference_count)). Similarly, when we make a copy of the shared_ptr, we increment the ref-count value. However, in both these operations, we're not changing the value of the pointer to the ref-count but rather ref-count itself. Since the pointer to ref-count is the "atomic thing" here, I was wondering how would ++ / -- operations to the ref-count be thread-safe? Is std::atomic implemented internally in a way such that in case of pointers, it ensures changes to the underlying object itself are also thread-safe?
Question 2 : Do we really need this nullptr check in default_deleter class before calling delete on ptr? As per Is it safe to delete a NULL pointer?, it is harmless to call delete on nullptr.

Question 1:
The implementation linked to is not thread-safe at all. You are correct that the shared reference counter should be atomic, not pointers to it. std::atomic<int*> here makes no sense.
Note that just changing std::atomic<int*> to std::atomic<int>* won't be enough to fix this either. For example the destructor is decrementing the reference count and checking it against 0 non-atomically. So another thread could get in between these two operations and then they will both think that they should delete the object causing undefined behavior.
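To illustrate the point about decrementing and testing in one step, here is a minimal sketch of a counter whose release operation is a single atomic read-modify-write. The class name RefCount and its interface are assumptions made for this example; it is not the linked article's code and not how any particular standard library implements its control block.

```cpp
#include <atomic>

// Hypothetical minimal reference counter.
class RefCount {
public:
    explicit RefCount(int initial = 1) : count_(initial) {}

    // Taking another reference only needs an atomic increment.
    void acquire() { count_.fetch_add(1, std::memory_order_relaxed); }

    // fetch_sub returns the value held *before* the decrement, so the
    // decrement and the "was I the last one?" test are one atomic operation.
    // Exactly one caller can observe the previous value 1.
    bool release() {
        return count_.fetch_sub(1, std::memory_order_acq_rel) == 1;
    }

    int value() const { return count_.load(std::memory_order_relaxed); }

private:
    std::atomic<int> count_;  // the counter itself is atomic, not a pointer to it
};
```

Only the caller for which release() returns true would go on to delete the managed object, which closes the window described above.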
As mentioned by @fabian in the comments, it is also far from a correct non-thread-safe shared pointer implementation. For example with the test case
{
Shared_ptr<int> a(new int);
Shared_ptr<int> b(new int);
b = a;
}
it will leak the second allocation. So it doesn't even do the basics correctly.
Even more, in the simple test case
{
Shared_ptr<int> a(new int);
}
it leaks the allocated memory for the reference counter (which it always leaks).
Question 2:
There is no reason to have a null pointer check there except to avoid printing the message. In fact, if we want to adhere to the standard's specification of std::default_delete for default_deleter, then at best it is wrong to check for nullptr, since that is specified to call delete unconditionally.
But the only possible edge case where this could matter is if a custom operator delete would be called that causes some side effect for a null pointer argument. However, it is anyway unspecified whether delete will call operator delete if passed a null pointer, so that's not practically relevant either.

A shared pointer which is conceptually owned by one, unique, object

What is the canonical way to deal with shared pointers in C++ when there is a clear case to argue that "one, unique object owns the pointer"?
For example, if a shared_ptr is a member of a particular class, which is responsible for initializing the pointer, then it could be argued that this class should also have the final say on when the pointer is deleted.
In other words, it may be the case that when the owning class goes out of scope (or is itself deleted), any remaining references to the pointer no longer make sense. This may be due to related variables which were members of the destroyed class.
Here is a sketch of an example
class Owner
{
public:
    Owner()
    {
        p.reset(malloc_object(arguments), free_object);
    }
    std::shared_ptr<type> get() { return p; }
    // seems strange because now something somewhere
    // else in the code can hold up the deletion of p
    // unless a manual destructor is written
    ~Owner()
    {
        p.reset(); // arduous
    }
    std::shared_ptr<type> p;
    int a, b, c; // some member variables which are logically attached to p
    // such that neither a, b, c nor p make sense without each other
};
One cannot use a unique_ptr as this would not permit the pointer to be returned by the get function, unless a raw pointer is returned. (Is this an acceptable solution?)
A unique_ptr in combination with returning weak_ptr from the get function might make sense. But this is not valid C++: weak_ptr can only be used in conjunction with shared_ptr.
A shared_ptr with the get function returning weak_ptr is better than a raw pointer because in order to use the weak pointer, it has to be converted to a shared pointer. This will fail if the reference count is already zero and the object has been deleted.
However using a shared_ptr defeats the point, since ideally a unique_ptr would be chosen because there can then only be one thing which "owns" the pointed to data.
I hope the question is clear enough, it was quite difficult to explain since I can't copy the code I am working with.

It is OK to return the shared_ptr there; the consequence is that the pointer can still be held somewhere outside the Owner class. Note that resetting p in the destructor only drops Owner's own reference: whoever still holds a copy of that shared_ptr keeps the object alive, so it is not deleted until the last copy goes away.
Using weak_ptr with shared_ptr is also a good solution, the problem is the same which is the fact that the best class to represent p is unique_ptr as you described.
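A minimal sketch of the shared_ptr plus weak_ptr variant, assuming int as a stand-in for the question's type; the names SharedOwner and read_or_default are hypothetical:

```cpp
#include <memory>

// Hypothetical owner: keeps the only long-lived shared_ptr and hands out
// weak_ptr observers.
class SharedOwner {
public:
    SharedOwner() : p(std::make_shared<int>(7)) {}
    std::weak_ptr<int> get() const { return p; }

private:
    std::shared_ptr<int> p;
};

// Returns the pointed-to value while the owner is alive, or -1 afterwards.
inline int read_or_default(const std::weak_ptr<int>& w) {
    if (auto locked = w.lock()) {  // empty shared_ptr if the object is gone
        return *locked;
    }
    return -1;
}
```

One caveat of this design: while the result of lock() is alive, it is a real shared_ptr and does extend the object's lifetime, so callers should keep it only for the duration of the access.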
The path I would choose is to hold a unique_ptr which seems more adequate and to implement the get() function like this:
type* get() { return p.get(); }
The behaviour is the same and the code is clearer since having p as unique_ptr will give clarity on how it should be used.
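As a rough sketch of that suggestion: the question's malloc_object and free_object are not shown, so std::malloc and std::free are used here as assumed stand-ins, int replaces type, and the names UniqueOwner and FreeDeleter are made up for the example.

```cpp
#include <cstdlib>
#include <memory>

// Deleter matching the allocation function (assumption: std::malloc/std::free
// stand in for the question's malloc_object/free_object).
struct FreeDeleter {
    void operator()(int* ptr) const { std::free(ptr); }
};

class UniqueOwner {
public:
    UniqueOwner() : p(static_cast<int*>(std::malloc(sizeof(int)))) {
        if (p) {
            *p = 0;  // initialise the freshly allocated int
        }
    }

    // Non-owning access: callers may use the pointer but must not free it
    // or keep it beyond the lifetime of this UniqueOwner.
    int* get() const { return p.get(); }

private:
    std::unique_ptr<int, FreeDeleter> p;  // released automatically, no destructor needed
};
```

Because the member is a unique_ptr, the class is not copyable, which makes the single-owner intent visible in the type.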

C++ multiple unique pointers from same raw pointer

Consider my code below. My understanding of unique pointers was that only one unique pointer can be used to reference one variable or object. In my code I have more than one unique_ptr accessing the same variable.
It's obviously not the correct way to use smart pointers, I know, in that the pointer should have complete ownership from creation. But still, why is this valid and why is there no compilation error? Thanks.
#include <iostream>
#include <memory>
using namespace std;
int main()
{
int val = 0;
int* valPtr = &val;
unique_ptr <int> uniquePtr1(valPtr);
unique_ptr <int> uniquePtr2(valPtr);
*uniquePtr1 = 10;
*uniquePtr2 = 20;
return 0;
}

But still, why is this valid
It is not valid! It's undefined behaviour, because the destructor of std::unique_ptr will free an object with automatic storage duration.
Practically, your program tries to destroy the int object three times. First through uniquePtr2, then through uniquePtr1, and then through val itself.
and not having a compilation error?
Because such errors are not generally detectable at compile time:
unique_ptr <int> uniquePtr1(valPtr);
unique_ptr <int> uniquePtr2(function_with_runtime_input());
In this example, function_with_runtime_input() may perform a lot of complicated runtime operations which eventually return a pointer to the same object valPtr points to.
If you use std::unique_ptr correctly, then you will almost always use std::make_unique, which prevents such errors.

Just an addition to Christian Hackl's excellent answer:
std::unique_ptr was introduced to ensure RAII for pointers; this means that, in contrast to raw pointers, you no longer have to take care of destruction yourself. The whole management of the raw pointer is done by the smart pointer. Leaks caused by a forgotten delete cannot happen anymore.
If a std::unique_ptr could only be created by std::make_unique, it would be absolutely safe regarding allocation and deallocation, and of course that would also be detectable at compile time.
But that's not the case: std::unique_ptr is also constructible from a raw pointer. The reason is that being constructible from a raw pointer makes std::unique_ptr much more useful. If this were not possible, the pointer returned by, e.g., Christian Hackl's function_with_runtime_input() could not be integrated into a modern RAII environment, and you would have to take care of destruction yourself.
Of course, the downside is that errors like yours can happen: forgetting destruction is not possible with std::unique_ptr, but erroneous multiple destructions are always possible (and impossible for the compiler to track, as C.H. already said) if you created it with a raw pointer constructor argument. Always be aware that std::unique_ptr logically takes "ownership" of the raw pointer, which means that no one else may delete the pointer except that one std::unique_ptr itself.
As rules of thumb it can be said:
Always create a std::unique_ptr with std::make_unique if possible.
If it needs to be constructed with a raw pointer, never touch the raw pointer after creating the std::unique_ptr with it.
Always be aware that the std::unique_ptr takes ownership of the supplied raw pointer.
Only supply raw pointers to the heap. NEVER use raw pointers which point to local stack variables (because they will unavoidably be destroyed automatically, like val in your example).
Create a std::unique_ptr only with raw pointers which were created by new, if possible.
If the std::unique_ptr needs to be constructed with a raw pointer which was created by something other than new, add a custom deleter to the std::unique_ptr which matches the function that created the raw pointer. An example is image pointers in the (C based) FreeImage library, which always have to be destroyed by FreeImage_Unload().
Some examples to these rules:
// Safe
std::unique_ptr<int> p = std::make_unique<int>();
// Safe, but not advisable. No accessible raw pointer exists, but should use make_unique.
std::unique_ptr<int> p(new int());
// Handle with care. No accessible raw pointer exists, but it has to be sure
// that function_with_runtime_input() allocates the raw pointer with 'new'
std::unique_ptr<int> p( function_with_runtime_input() );
// Safe. No accessible raw pointer exists,
// the raw pointer is created by a library, and has a custom
// deleter to match the library's requirements
struct FreeImageDeleter {
void operator() (FIBITMAP* _moribund) { FreeImage_Unload(_moribund); }
};
std::unique_ptr<FIBITMAP,FreeImageDeleter> p( FreeImage_Load(...) );
// Dangerous. Your class method gets a raw pointer
// as a parameter. It can not control what happens
// with this raw pointer after the call to MyClass::setMySomething()
// - if the caller deletes it, you're lost.
void MyClass::setMySomething( MySomething* something ) {
// m_mySomethingP is a member std::unique_ptr<MySomething>
m_mySomethingP.reset( something );
}
// Dangerous. A raw pointer variable exists, which might be erroneously
// deleted multiple times or assigned to a std::unique_ptr multiple times.
// Don't touch iPtr after these lines!
int* iPtr = new int();
std::unique_ptr<int> p(iPtr);
// Wrong (Undefined behaviour) and a direct consequence of the dangerous declaration above.
// A raw pointer is assigned to a std::unique_ptr<int> twice, which means
// that it will be attempted to delete it twice.
// This couldn't have happened if iPtr wouldn't have existed in the first
// place, like shown in the 'safe' examples.
int* iPtr = new int();
std::unique_ptr<int> p(iPtr);
std::unique_ptr<int> p2(iPtr);
// Wrong. (Undefined behaviour)
// An unique pointer gets assigned a raw pointer to a stack variable.
// Erroneous double destruction is the consequence
int val;
int* valPtr = &val;
std::unique_ptr<int> p(valPtr);
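The custom-deleter rule can also be tried without FreeImage. The sketch below applies the same pattern to the C standard library's std::fopen and std::fclose; the names FileCloser, FilePtr and open_file are hypothetical and only serve this illustration.

```cpp
#include <cstdio>
#include <memory>

// Deleter that matches the creator: handles from std::fopen must be
// released with std::fclose, not with delete.
struct FileCloser {
    void operator()(std::FILE* f) const { std::fclose(f); }
};

using FilePtr = std::unique_ptr<std::FILE, FileCloser>;

// Hypothetical helper: the raw FILE* never exists as a named variable,
// so it cannot be closed twice by accident.
inline FilePtr open_file(const char* path, const char* mode) {
    return FilePtr(std::fopen(path, mode));
}
```

If std::fopen fails and returns a null pointer, the resulting FilePtr is simply empty; unique_ptr does not invoke the deleter for a null pointer.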

This example of code is a bit artificial. unique_ptr is not usually initialized this way in real world code. Use std::make_unique or initialize unique_ptr without storing raw pointer in a variable:
unique_ptr <int> uniquePtr2(new int);

Are std::shared_ptrs aware of each other?

That is, if I don't use the copy constructor, assignment operator, or move constructor etc.
int* number = new int();
auto ptr1 = std::shared_ptr<int>( number );
auto ptr2 = std::shared_ptr<int>( number );
Will there be two strong references?

According to the standard, use_count() returns 1 immediately after a shared_ptr is constructed from a raw pointer (§20.7.2.2.1/5). We can infer from this that, no, two shared_ptr objects constructed from raw pointers are not "aware" of each other, even if the raw pointers are the same.
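This can be observed directly. In the sketch below, both shared_ptr objects get a no-op deleter and point at a local int, purely so that the demonstration does not run into the double delete of the question's code; the struct and function names are hypothetical.

```cpp
#include <memory>

struct Counts {
    long first;
    long second;
};

// Two shared_ptrs constructed independently from the same address:
// each gets its own control block.
inline Counts independent_counts() {
    int value = 0;
    auto no_op = [](int*) { /* never delete: `value` is a local variable */ };
    std::shared_ptr<int> a(&value, no_op);
    std::shared_ptr<int> b(&value, no_op);
    return {a.use_count(), b.use_count()};
}

// A copy shares the control block of the original.
inline Counts shared_counts() {
    auto a = std::make_shared<int>(0);
    auto b = a;
    return {a.use_count(), b.use_count()};
}
```

independent_counts() reports one owner on each side, while shared_counts() reports two on both, since the copy knows about the original.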

Yes, there will be two strong references. There's no global record of all shared pointers that is consulted to see whether the pointer you're trying to cover is already covered by another smart pointer (it's not impossible to build something like this yourself, but it's not something you should have to do).
Each smart pointer creates its own reference counter, so in your case there would be two separate ones keeping track of the same pointer.
So either smart pointer may delete the content without being aware of the fact that it is also held in another smart pointer.

Your code is asking for a crash!
You cannot have two independently constructed smart pointers owning the same object, because each will try to call its destructor and release the memory when its own reference counter goes to 0.
So, if you want to have two smart pointers pointing to the same object you must do:
auto ptr1 = std::make_shared<int>(10);
auto ptr2 = ptr1;

How to destroy a vector of pointers in c++?

I have the following code in one of my methods:
vector<Base*> units;
Base *a = new A();
Base *b = new B();
units.push_back(a);
units.push_back(b);
Should I destroy the a and b pointers before I exit the method?
Or should I somehow just destroy the units vector of pointers?
Edit 1:
This is another interesting case:
vector<Base*> units;
A a;
B b;
units.push_back(&a);
units.push_back(&b);
What about this case? Now I don't have to use delete nor smart pointers.
Thanks

If you exit the method, units will be destroyed automatically. But not a and b. Those you need to destroy explicitly.
Alternatively, you could use std::shared_ptr to do it for you, if you have C++11.
std::vector<std::shared_ptr<Base>> units;
And you just use the vector almost as you did before, but without worrying about memory leaks when the function exits. I say almost, because you'll need to use std::make_shared to create the elements you put into the vector.
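A minimal sketch of that approach, assuming simple stand-ins for the question's Base, A and B; the static counter destroyed is a hypothetical addition that only makes the cleanup observable:

```cpp
#include <memory>
#include <vector>

struct Base {
    virtual ~Base() { ++destroyed; }  // virtual: safe to destroy via Base
    static int destroyed;             // hypothetical counter for illustration
};
int Base::destroyed = 0;

struct A : Base {};
struct B : Base {};

inline void build_and_drop() {
    std::vector<std::shared_ptr<Base>> units;
    units.push_back(std::make_shared<A>());  // shared_ptr<A> converts to shared_ptr<Base>
    units.push_back(std::make_shared<B>());
    // No delete needed: when `units` goes out of scope, each shared_ptr
    // releases its object.
}
```

If the elements are never shared with other code, a std::vector<std::unique_ptr<Base>> expresses the same cleanup with single ownership.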

A rather old-fashioned solution, that works with all compilers:
for ( vector<Base*>::iterator i = units.begin(); i != units.end(); ++i )
delete *i;
In C++11 this becomes as simple as:
for ( auto p : units )
delete p;
Your second example doesn't require pointer deallocation; actually it would be a bad error to do it. However it does require care in ensuring that a and b remain valid at least as long as units does. For this reason I would advise against that approach.

You need to iterate over the vector and delete each pointer it contains. Destroying only the vector will result in memory leaks, as the objects pointed to by its elements are not deleted.
TL;DR: The objects remain, the pointers are lost == memory leak.

Yes you should destroy those pointers (assuming you aren't returning the vector elsewhere).
You could easily do it with a std::for_each as follows:
std::for_each( units.begin(), units.end(), []( Base* p ) { delete p; } );

You should not delete the objects in these two situations:
The vector is returned to the caller of the function.
The vector was created outside of the function and is supposed to be accessed from other functions.
In other situations you should delete the memory pointed to by the pointers in the vector. Otherwise, once the pointers are gone, there is no way to refer to these memory locations, which is called a memory leak.
vector<Base*>::iterator it;
for ( it = units.begin(); it != units.end(); ++it ){
delete *it;
}

I would suggest that you use smart pointers in the vector. Using smart pointers is better practice than using raw pointers. You should use std::unique_ptr or std::shared_ptr (with std::weak_ptr for non-owning observers), or the Boost equivalents if you don't have C++11. See the Boost library documentation for these smart pointers.
In the context of this question, yes you have to delete the pointers that are added to the vector. Else it would cause a memory leak.

You have to delete them, otherwise you will have a memory leak. In the following code, if I comment out the two delete lines, the destructors are never called. You also have to declare the destructor of the Base class as virtual. As others mentioned, it is better to use smart pointers.
#include <iostream>
#include <vector>
class Base
{
public:
virtual ~Base(){std::cout << "Base destructor" << std::endl;};
};
class Derived : public Base
{
~Derived(){std::cout << "Derived destructor" << std::endl;};
};
int main()
{
std::vector<Base*> v;
Base *p=new Base();
Base *p2=new Derived();
v.push_back(p);
v.push_back(p2);
delete v.at(0);
delete v.at(1);
};
Output:
Base destructor
Derived destructor
Base destructor
Output with non-virtual base destructor (formally undefined behaviour; here the Derived destructor is skipped, which can leak):
Base destructor
Base destructor

Yes and no. You don't need to delete them inside the function, but for other reasons than you might think.
You are essentially giving ownership of the objects to the vector, but the vector is not aware of that and therefore won't call delete on the pointers automatically. So if you store owning raw pointers in a vector, you have to manually call delete on them at some point. But:
If you give the vector out of your function, you should not destroy the objects inside the function, or the vector full of pointers to freed memory would be pretty useless, so no. But in that case, you should make sure the objects are destroyed after the vector has been used outside the function.
If you don't give the vector out of the function, you should destroy the objects inside the function, but there would be no need to allocate them on the free store, so don't use pointers and new. You just push/emplace the objects themselves into the vector; it then takes care of the destruction, and therefore you don't need delete.
And besides that: don't use plain new. Use smart pointers. Regardless of what you do with them, the smart pointers will take care of the proper destruction of the objects contained. No need to use new, no need to use delete. Ever. (Except when you are writing your own low-level data structures, e.g. smart pointers.) So if you want to have a vector full of owning pointers, these should be smart pointers. That way you won't have to worry about whether, when and how to destroy the objects and free the memory.

The best way to store pointers in a vector is to use smart pointers instead of raw pointers. When the vector's destructor runs, every smart pointer it holds is destroyed, which releases the pointed-to object (for a reference-counted pointer, once its count reaches zero). You then no longer have to worry about memory leaks.

In the first example, you will eventually have to delete a and b, but not necessarily when units goes out of scope. Usually you will do that just before units goes out of scope, but that is not the only possible case. It depends on what is intended.
You might (later in the same function) alias a or b, or both, because you want them to outlive units or the function scope. You might put them into two unit objects at the same time. Or, many other possible things.
What's important is that destroying the vector (automatic at scope end in this case) destroys the elements held by the vector, nothing more. The elements are pointers, and destroying a pointer does nothing. If you also want to destroy what the pointer points to (as to not leak memory), you must do that manually (for_each with a lambda would do).
If you don't want to do this work explicitly, a smart pointer can automate that for you.
The second example (under Edit1) does not require you to delete anything (in fact that's not even possible, you would likely see a crash attempting to do that) but the approach is possibly harmful.
That code will work perfectly well as long as you never reference anything in units any more after a and b left scope. Woe if you do.
Technically, such a thing might even happen invisibly, since units is destroyed after a, but luckily, ~vector does not dereference pointer elements. It merely destroys them, which for a pointer doesn't do anything (trivial destructor).
But imagine someone was so "smart" as to extend the vector class, or maybe you apply this pattern some day in the future (because it "works fine") to another object which does just that. Bang, you're dead. And you don't even know where it came from.
What I really don't like about the code, even though it is strictly "legal", is the fact that it may lead to a condition which will crash or exhibit broken, unreproducible behaviour. However, it does not crash immediately. Code that is "broken" should crash immediately, so you see that something is wrong, and you are forced to fix it. Unluckily that's not the case here.
This will appear to work, possibly for years, until one day it doesn't. Eventually you'll have forgotten that a and b live on the current stack frame and reference the non-existing objects in the vector from some other location. Maybe you dynamically allocate the vector in a future revision of your code, since you pass it to another function. And maybe it will continue to appear working.
And then, you'll spend hours of your time (and likely the time of others) trying to find why a section of code that cannot possibly fail produces wrong results or crashes.

Warning against your second example.
This simple extension leads to undefined behavior:
#include <iostream>
#include <vector>
class A {
public:
    int m;
    A(int _m): m(_m) {}
};
int main(){
    std::vector<A*> units;
    for (int i = 0; i < 3; ++i) {
        A a(i);
        units.push_back(&a); // a is destroyed at the end of this iteration
    }
    // undefined behaviour: every pointer dangles (printed "2 2 2" with my compiler)
    for (auto i : units) std::cout << i->m << " ";
    return 0;
}
In each iteration, the pointer to that iteration's a is saved in units, but the object it points to goes out of scope at the end of the iteration. In the case of my compiler, the memory address of each a was re-used each time, resulting in units holding three identical memory addresses, all pointing to where the final a object had been.
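One way to avoid the dangling pointers in that loop is to let the vector own the objects by value. This is a sketch under the assumption that the objects do not need to be polymorphic; make_units is a hypothetical helper name.

```cpp
#include <cstddef>
#include <vector>

struct A {
    int m;
    explicit A(int m_) : m(m_) {}
};

// The vector stores the A objects themselves, so nothing refers to a
// loop-local variable after the loop ends.
inline std::vector<A> make_units(int n) {
    std::vector<A> units;
    units.reserve(static_cast<std::size_t>(n));
    for (int i = 0; i < n; ++i) {
        units.emplace_back(i);  // constructs A(i) inside the vector
    }
    return units;
}
```

With this version each element keeps its own value, and the vector destroys the objects when it is destroyed.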
