Threading a Shared Model with pointers - c++

I have a vector of pointers to objects created with new. Multiple threads access this vector in a safe manner with various gets/sets. However, a thread may delete one of the objects, in which case another thread's pointer to the object is no longer valid. How can a method know if the pointer is valid? Options 1 and 2 actually seem to work well. I don't know how they will scale. What is the best approach? Is there a portable version 3?
Testing for pointer validity examples that work:
1. Use integers instead of pointers. A lookup in a std::map checks whether the object behind the key is still valid. Public methods look like:
get(size_t iKey)
{
if((it = mMap.find(iKey)) != mMap.end())
{
TMyType * pMyType = it->second;
// do something with pMyType
}
}
2. Have a vector of shared_ptr. Each thread tries to call lock() on its weak_ptr. If the returned shared_ptr is null we know someone deleted it while we were waiting. Public methods look like:
get(boost::weak_ptr<TMyType> pMyType)
{
boost::shared_ptr<TMyType> pGOOD = pMyType.lock();
if (pGOOD != NULL)
{
// Do something with pGOOD
}
}
3. Test for null on plain raw pointers? Is this possible?
get(TMyType * pMyType)
{
if(pMyType != NULL){ /* do something */ }
}

#3 will not work. Deleting a pointer and setting it to NULL doesn't affect other pointers that are pointing to the same object. There is nothing you can do with raw pointers to detect if the object is deleted.
#1 is effectively a pointer to a pointer. It is safe only if you always access the object through that key and hold a lock for as long as you use it. If not, what happens if it's deleted in another thread after you successfully get it?
#2 is the standard implementation of this kind of idea, and the pattern is used in many libraries. Lock a handle, getting a pointer. If you get it, use it. If not, it's gone.
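As a rough illustration of how options 1 and 2 can be combined, here is a minimal sketch. It assumes the container owns the objects through std::shared_ptr and hands out integer keys; the names Registry, add and remove are made up for this example, and TMyType is reduced to a single field.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>

// Hypothetical payload type standing in for TMyType.
struct TMyType {
    int value;
};

// Hypothetical registry: owns the objects and hands out integer keys.
class Registry {
public:
    // Stores a new object and returns its key.
    std::size_t add(int value) {
        std::lock_guard<std::mutex> lock(mMutex);
        std::shared_ptr<TMyType> obj = std::make_shared<TMyType>();
        obj->value = value;
        std::size_t key = mNextKey++;
        mMap[key] = obj;
        return key;
    }

    // Drops the registry's ownership. The object is destroyed once the
    // last shared_ptr copy handed out by get() goes away.
    void remove(std::size_t key) {
        std::lock_guard<std::mutex> lock(mMutex);
        mMap.erase(key);
    }

    // Returns a shared_ptr copy made under the lock, or an empty pointer
    // if the key is no longer present.
    std::shared_ptr<TMyType> get(std::size_t key) const {
        std::lock_guard<std::mutex> lock(mMutex);
        std::map<std::size_t, std::shared_ptr<TMyType> >::const_iterator it = mMap.find(key);
        if (it == mMap.end()) {
            return std::shared_ptr<TMyType>();
        }
        return it->second;
    }

private:
    mutable std::mutex mMutex;
    std::map<std::size_t, std::shared_ptr<TMyType> > mMap;
    std::size_t mNextKey = 0;
};
```

A caller checks the result of get() before using it. Because the caller holds its own shared_ptr copy, a concurrent remove() cannot destroy the object while it is still in use. The mutex only protects the map, not the fields of TMyType.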

Related

Avoiding null pointer crashes in C++ by overloading operators - bad practice?

I'm starting to write a rather large Qt application and instead of using raw pointers I want to use smart pointers, as well as Qt's own guarded pointer called QPointer.
With both standard library smart pointers and Qt's pointers the application crashes when a NULL pointer is dereferenced.
My idea was that I could add a custom overload to the dereference operators * and -> of these pointer types that check if the pointer is NULL.
Below is a short example that works fine so far. If a NULL pointer was dereferenced, a temporary dummy object would be created so that the application does not crash. How this dummy object would be processed might not always be correct, but at least there would be no crash and I could even react to this and show a warning or write it to a log file.
template <class T>
class Ptr : public std::shared_ptr<T> {
private:
T * m_temp;
public:
Ptr<T>(T * ptr) : std::shared_ptr<T>(ptr), m_temp(NULL) {}
~Ptr() {
if (m_temp) {
delete m_temp;
}
}
T * operator->() {
if (!std::shared_ptr<T>::get()) {
if (m_temp) {
delete m_temp;
}
m_temp = new T();
return m_temp;
} else {
return std::shared_ptr<T>::get();
}
}
T & operator*() {
return *operator->();
}
};
Of course I'll be doing NULL checks and try to eliminate the source of NULL pointers as much as possible, but for the rare case that I forget a NULL check and the exception occurs, could this be a good way of handling it? Or is this a bad idea?
I would say this is a bad idea for a few reasons:
You should not derive from standard library types such as std::shared_ptr. They have no virtual destructor and are not designed to be base classes. It may work until you change something benign in your code and then it breaks. There are various things you can do to make this more acceptable, but the easiest thing is to just not do this.
There are more ways to create a shared_ptr than just a constructor call. Duplicating the pointer value in your m_temp variable is likely just to lead things to be out of sync and cause more problems. By the time you cover all the bases, you will have probably re-implemented the whole shared_ptr class.
m_temp = new T(); seems like a frankly crazy thing to do if the old pointer is null. What about all the state stored in the object that was previously null? What about constructor parameters? Any initialization for the pointer? Sure, you could maybe handle all of these, but by that point you might as well handle the nullptr check elsewhere where things will be clearer.
You don't want to hide values being nullptr. If you have code using a pointer, it should care about the value of that pointer. If it is null and that is unexpected, then something further up the chain likely went wrong and you should be handling that appropriately (exceptions, error codes, logging, etc.). Silently allocating a new pointer will just hide the original source of the error. Whenever there is something wrong in a program, you want to stop or address the problem as close to the source as possible - it makes debugging the problem simpler.
A side note, if you are confident that your pointers are not null and don't want to have to deal with nullptr in a block of code, you may be able to use references instead. For example:
void fun1(MyObject* obj) {}
void fun2(MyObject& obj) {}
In fun1, the code might need to check for nullptr to be well written. In fun2, there is no need to check for nullptr because if someone converts a nullptr to a reference they have already broken the rules. fun2 pushes any responsibility for checking the pointer value higher up the stack. This can be good in some cases (just don't try and store the reference for later). Note that you can use operator * on a shared_ptr/unique_ptr to get a reference directly.
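To make the "fail close to the source" idea concrete, here is a small sketch. The helper deref_or_throw and the function name_length are hypothetical names invented for this example, and MyObject is reduced to one field. Instead of fabricating a dummy object, the helper reports the null pointer at the point of dereference and hands a reference to code that no longer needs a null check.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical type used only for illustration.
struct MyObject {
    std::string name;
};

// Hypothetical helper: turns a possibly-null shared_ptr into a reference,
// throwing instead of silently creating a replacement object.
template <class T>
T& deref_or_throw(const std::shared_ptr<T>& ptr) {
    if (!ptr) {
        throw std::logic_error("null pointer dereferenced");
    }
    return *ptr;
}

// Takes a reference, so it needs no null check of its own.
std::size_t name_length(const MyObject& obj) {
    return obj.name.size();
}
```

The caller decides once, at the boundary, what a null pointer means (log it, show a warning, abort the operation), and everything below that point works with references.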

Reference type return function: how to return (optional) object

I've a multithreaded C++ application that could call from any thread a function like the following, to get an Object from a list/vector.
class GlobalClass{
public:
MyObject* getObject(int index) const
{
/* mutex lock & unlock */
if (m_list.hasValueAt(index))
return m_list[index];
else
return 0;
}
List<MyObject*> m_list;
};
//Thread function
MyObject* obj = globalClass->getObject(0);
if (!obj) return;
obj->doSomething();
Note: the scope here is to understand some best practice related to function returns by reference, value or pointer, so forgive some pseudo-code or missing declarations (I make use of lock/unlock, GlobalClass is a global singleton, etc...).
The issue here is that if the MyObject at that index is deleted inside GlobalClass, at a certain point I'm using a bad pointer (obj).
So I was thinking about returning a copy of the object:
MyObject GlobalClass::getObject(int index) const
{
/* mutex lock & unlock */
if (m_list.hasValueAt(index))
return MyObject(*m_list[index]);
else
return MyObject();
}
The issue here is that the object (MyObject) being returned is a large enough object that returning a copy is not efficient.
Finally, I would like to return a reference to that object (better a const reference):
const MyObject& GlobalClass::getObject(int index) const
{
/* mutex lock & unlock */
if (m_list.hasValueAt(index))
return *m_list[index];
else{
MyObject* obj = new MyObject();
return *obj ;
}
}
Considering that my list might not contain the object at that index, I'm introducing a memory leak.
What's the best solution to deal with this?
Must I fall back to returning a copy even if it is less efficient, or is there something I'm missing about returning a reference?
You have multiple choices:
1. Use a std::shared_ptr if "Get" passes (shared) ownership of the object to the caller. This way the object cannot be destroyed while the caller still holds it. Of course, the caller will not notice if the object is removed from the list in the meantime.
2. Use a std::weak_ptr. This works like 1., but the object can be destroyed while the caller holds the weak_ptr. In this case the caller can detect that the object was deleted.
3. Use std::optional as suggested in a comment, and return a copy or a reference (std::optional cannot hold a reference directly, so this needs something like std::reference_wrapper). Wrapping a reference in an optional doesn't avoid the problem of the object being deleted, so the reference can become invalid as well. A copy would avoid this, but it may be too expensive, as said.
Reading between the lines, you seem to suggest that the caller will use the pointer immediately after the call, and for a limited span of time. So 1. and 2. are equivalent and seem to fit your needs.
See this introduction to smart pointers for more details.
If you want to avoid copying the object, there are only two possible cases:
The m_list entry that is returned by getObject is/can be deleted concurrently by another thread. If you don't copy that object beforehand, there is nothing you can do within getObject to prevent another thread from suddenly having a reference/pointer dangle. However, you could make each entry of m_list be a std::shared_ptr<MyObject> and return that directly. The memory management will happen automatically (but beware of the potential overhead in the reference counting of shared_ptr, as well as the possibility of deadlocks).
You have (or add) some mechanism to ensure that objects can only be deleted from m_list if no other thread currently holds some pointer/reference to them. This very much depends on your algorithm, but it might e.g. be possible to mark objects for deletion only and then delete them later in a synchronous section.
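The "mark for deletion, delete later" idea from the second case could look roughly like the sketch below. Everything here is an assumption made for illustration: the class name DeferredList, its methods, and the rule that sweep() is only called from a synchronous section in which no worker thread holds a pointer obtained from getObject().

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

struct MyObject { int id; };  // hypothetical stand-in

// Hypothetical list that separates "logically deleted" from "destroyed".
class DeferredList {
public:
    std::size_t add(int id) {
        std::lock_guard<std::mutex> lock(m_mutex);
        Entry e;
        e.object.reset(new MyObject{id});
        e.marked = false;
        m_list.push_back(std::move(e));
        return m_list.size() - 1;
    }

    // Hides the object from new callers but does not destroy it yet.
    void markForDeletion(std::size_t index) {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (index < m_list.size()) m_list[index].marked = true;
    }

    // Returns nullptr for missing or marked entries.
    MyObject* getObject(std::size_t index) const {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (index >= m_list.size() || m_list[index].marked) return nullptr;
        return m_list[index].object.get();
    }

    // Destroys all marked objects and returns how many were destroyed.
    // Assumption: called only when no thread still uses an old pointer.
    std::size_t sweep() {
        std::lock_guard<std::mutex> lock(m_mutex);
        std::size_t destroyed = 0;
        for (std::size_t i = 0; i < m_list.size(); ++i) {
            if (m_list[i].marked && m_list[i].object) {
                m_list[i].object.reset();
                ++destroyed;
            }
        }
        return destroyed;
    }

private:
    struct Entry {
        std::unique_ptr<MyObject> object;
        bool marked;
    };
    mutable std::mutex m_mutex;
    std::vector<Entry> m_list;
};
```

The sketch does not enforce the "no outstanding pointers" rule itself. That guarantee has to come from the structure of your program, for example a barrier between work phases.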
Your issue seems to stem from the fact that your program is multithreaded. Another way forward (and, for the raw pointer or the std::optional reference-returning version, perhaps the only way forward short of a complete redesign) is to expose the mutex outside the function scope. You can accomplish this in multiple ways; the simplest way to illustrate it is the following:
/*mutex lock*/
const MyObject& obj = globalClass.get(index);
/*do stuff with obj*/
/*mutex unlock*/
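One way to hold the lock for the whole time the object is in use, without handing the mutex itself to callers, is to pass the work in as a callback. This is only a sketch under assumptions: LockedList, withObject, add and removeAt are invented names, and MyObject is reduced to one field.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <mutex>
#include <vector>

struct MyObject { int id; };  // hypothetical stand-in

// Hypothetical list whose objects are only reachable while the lock is held.
class LockedList {
public:
    void add(int id) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_list.push_back(std::unique_ptr<MyObject>(new MyObject{id}));
    }

    void removeAt(std::size_t index) {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (index < m_list.size()) m_list[index].reset();
    }

    // Runs fn on the object at index while the mutex is held.
    // Returns false, without calling fn, if there is no such object.
    bool withObject(std::size_t index,
                    const std::function<void(const MyObject&)>& fn) const {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (index >= m_list.size() || !m_list[index]) return false;
        fn(*m_list[index]);
        return true;
    }

private:
    mutable std::mutex m_mutex;
    std::vector<std::unique_ptr<MyObject> > m_list;
};
```

The reference never escapes the locked region, so it cannot dangle. The price is that fn runs under the lock: it should be short, and it must not call back into the same LockedList, because std::mutex is not recursive.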

Using boost::shared_ptr to refer to iterator without copying data?

Inside a loop I need to call a function which has an argument of type pcl::PointIndicesPtr. This is actually a boost::shared_ptr< ::pcl::PointIndices>. Is there a way to do this without having to copy the underlying data? I could only get it to work by using make_shared, which copies the object if I understand it correctly.
for (std::vector<pcl::PointIndices>::const_iterator it = cluster_indices.begin (); it != cluster_indices.end (); ++it)
{
pcl::PointIndicesPtr indices_ptr2 =boost::make_shared<pcl::PointIndices>(*it);
}
For example this will crash at runtime:
for (std::vector<pcl::PointIndices>::const_iterator it = cluster_indices.begin (); it != cluster_indices.end (); ++it)
{
pcl::PointIndices test = *it;
pcl::PointIndicesPtr indices_ptr3(&test);
}
The answer depends on the implementation of the function you are calling and what else your code does with the object. There is no "one right answer".
For example, if the function can't possibly access the object after it returns, the right answer might be to wrap the existing object in a shared_ptr with a dummy (no-op) deleter. But if the function stashes the shared_ptr, that won't work.
If your own code never modifies the object, constructing the object with make_shared in the first place may be the right answer. But if your code modifies the object while the function expects it not to change later, that won't work.
You have to make a decision based on all the information.
The most important question to answer -- why does the function you are calling take a shared_ptr? Does it have a good reason? If so, what is that reason? If not, why not change it to take a reference?
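The no-op deleter approach mentioned above might look like the following sketch. It uses std::shared_ptr and a made-up Indices struct as stand-ins for boost::shared_ptr and pcl::PointIndices (boost::shared_ptr also accepts a custom deleter as a second constructor argument). It also assumes the callee accepts a pointer-to-const and does not keep the pointer after it returns; count_indices, borrow and total_indices are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for pcl::PointIndices.
struct Indices {
    std::vector<int> indices;
};
typedef std::shared_ptr<const Indices> IndicesConstPtr;

// Hypothetical callee that uses the pointer only during the call.
std::size_t count_indices(const IndicesConstPtr& p) {
    return p->indices.size();
}

// Wraps an existing object without copying it and without taking ownership:
// the deleter does nothing, so the vector stays the owner.
IndicesConstPtr borrow(const Indices& obj) {
    return IndicesConstPtr(&obj, [](const Indices*) { /* no-op deleter */ });
}

std::size_t total_indices(const std::vector<Indices>& clusters) {
    std::size_t total = 0;
    for (std::vector<Indices>::const_iterator it = clusters.begin();
         it != clusters.end(); ++it) {
        total += count_indices(borrow(*it));
    }
    return total;
}
```

This avoids the crash from the question, where the shared_ptr tried to delete a stack object. It is only safe while the borrowed pointer does not outlive the element it points at, which is exactly the condition described in the answer.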
Could you use BOOST_FOREACH?
i.e.
BOOST_FOREACH(const pcl::PointIndices& indices, cluster_indices)
{
// indices refers to the vector element in place; no copy is made
}

Replacing raw pointers in vectors with std::shared_ptr

I have the following structure:
typedef Memory_managed_data_structure T_MYDATA;
std::vector<T_MYDATA *> object_container;
std::vector<T_MYDATA *> multiple_selection;
T_MYDATA * simple_selection;
Edit: this may be very important: the Memory_managed_data_structure contains, among other things, a bitter, raw pointer to some other data.
It aims to be a very simple representation of an original container of memory managed objects (object_container) and then a "multiple_selection" array (for selecting many objects in the range and doing various operations with them) and a "simple_selection" pointer (for doing these operations on a single object).
The lifetime of all objects is managed by the object_container while multiple_selection and simple_selection just point to some of them. multiple_selection and simple_selection can be nullified as needed and only object_container objects can be deleted.
The system works just fine but I am trying to get into shared_ptrs right now and would like to change the structure to something like:
typedef Memory_managed_data_structure T_MYDATA;
std::vector<std::shared_ptr<T_MYDATA> > object_container;
std::vector<std::shared_ptr<T_MYDATA> > multiple_selection;
std::shared_ptr<T_MYDATA> simple_selection;
Again, the object container would be the "owner" and the rest would just point to them. My question is, would this scheme wreak havoc in the application? Is there something I should know before snowballing into these changes? Is shared_ptr not the appropriate kind of pointer here?
I can somewhat guarantee that no object would exist in multiple_selection or simple_selection if it is not in object_container first. Of course, no delete is ever called in multiple_selection or simple_selection.
Thanks for your time.
Edit: Forgot to mention, never used any of these automated pointers before so I may be wildly confused about their uses. Any tips and rules of thumb will be greatly appreciated.
You say that the object container would be the "owner" of the objects in question. In that case, where you have a clear ownership relationship, using std::shared_ptr is not ideal. Rather, stick with what you have.
However, if you cannot guarantee that a pointer has been removed from multiple_selection and/or simple_selection before it is deleted, you have to act. One possible action is to use shared_ptr. In that case, an object could continue to exist in one of the selections even if it is removed (via shared_ptr::reset or by assigning a null value) from object_container.
The other alternative is to make sure that objects get removed thoroughly: if something is to be deleted, remove ALL references to it from the selections and from the object_container, and THEN delete it. If you strictly follow this scheme, you don't need the overhead of shared_ptr.
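The "remove ALL references, THEN delete" scheme can be kept in one place so it is hard to get wrong. The sketch below is an illustration under assumptions: Model, add and destroy are invented names, T_MYDATA is reduced to one field, and std::unique_ptr is my substitution for the raw owning pointers in the question, used only to make the single owner explicit.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

struct T_MYDATA { int id; };  // hypothetical stand-in for the real structure

struct Model {
    std::vector<std::unique_ptr<T_MYDATA> > object_container;  // sole owner
    std::vector<T_MYDATA*> multiple_selection;                 // non-owning
    T_MYDATA* simple_selection = nullptr;                      // non-owning

    T_MYDATA* add(int id) {
        object_container.push_back(std::unique_ptr<T_MYDATA>(new T_MYDATA{id}));
        return object_container.back().get();
    }

    // Removes every reference to obj from the selections first,
    // then erases the owning entry, which destroys the object.
    void destroy(T_MYDATA* obj) {
        multiple_selection.erase(
            std::remove(multiple_selection.begin(), multiple_selection.end(), obj),
            multiple_selection.end());
        if (simple_selection == obj) simple_selection = nullptr;
        object_container.erase(
            std::remove_if(object_container.begin(), object_container.end(),
                [obj](const std::unique_ptr<T_MYDATA>& p) { return p.get() == obj; }),
            object_container.end());
    }
};
```

As long as destroy() is the only way an object leaves object_container, the selections can never hold a dangling pointer, and no reference counting is needed.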
"I can somewhat guarantee that no object would exist in multiple_selection or simple_selection if it is not in object_container first."
If you are 150% sure, then there is no need for smart pointers.
I think the reason you might still want them in this situation is debugging.
In the case you describe, multiple_selection and simple_selection would hold weak_ptr, not shared_ptr.
Code with error:
std::vector<int*> owner_vector;
std::vector<int*> weak_vector;
int* a = new int(3);
owner_vector.push_back(a);
weak_vector.push_back(a);
std::for_each(
owner_vector.begin(),
owner_vector.end(),
[](int* ptr) {
delete ptr;
}
);
std::for_each(
weak_vector.begin(),
weak_vector.end(),
[](int* ptr) {
*ptr = 3; // oops... usage of deleted pointer
}
);
You can catch it with smart pointers:
std::vector<std::shared_ptr<int>> owner_vector;
std::vector<std::weak_ptr<int>> weak_vector;
{
auto a = std::make_shared<int>();
owner_vector.push_back(a);
weak_vector.push_back(a);
}
std::for_each(
owner_vector.begin(),
owner_vector.end(),
[](std::shared_ptr<int>& ptr) {
ptr.reset(); // memory delete
}
);
std::for_each(
weak_vector.begin(),
weak_vector.end(),
[](std::weak_ptr<int>& ptr) {
assert(!ptr.expired()); // guarantee to be alive
auto shared_ptr = ptr.lock();
*shared_ptr = 3;
}
);
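In the last example the assert fails, instead of the code running into undefined behavior or a segmentation fault. Note that assert is compiled out when NDEBUG is defined, so a release build of this code would dereference the empty shared_ptr. In a non-debug build you can drop the shared_ptr overhead again.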
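If the check should also hold in release builds, the result of lock() can be tested directly instead of relying on assert. A minimal sketch, with assign_to_live as a hypothetical helper name:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Writes value through every selection entry that is still alive and
// returns how many entries were skipped because their object is gone.
std::size_t assign_to_live(std::vector<std::weak_ptr<int> >& selection, int value) {
    std::size_t skipped = 0;
    for (std::size_t i = 0; i < selection.size(); ++i) {
        // lock() yields an empty shared_ptr once the owner has released the object.
        if (std::shared_ptr<int> p = selection[i].lock()) {
            *p = value;
        } else {
            ++skipped;
        }
    }
    return skipped;
}
```

A non-zero return value tells you that a selection outlived its object, which you can log or treat as an error depending on how strict you want to be.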

How to manage object life time using Boost library smart pointers?

There is a scenario that I need to solve with shared_ptr and weak_ptr smart pointers.
Two threads, thread 1 & 2, are using a shared object called A. Each of the threads has a reference to that object. Thread 1 decides to delete object A, but at the same time thread 2 might be using it. If I use shared_ptr to hold object A's references in each thread, the object won't get deleted at the right time.
What should I do to be able to delete the object when it's supposed to be deleted and prevent an error in other threads that are using that object at the same time?
There are two cases:
One thread owns the shared data
If thread1 is the "owner" of the object and thread2 needs to just use it, store a weak_ptr in thread2. Weak pointers do not participate in reference counting, instead they provide a way to access a shared_ptr to the object if the object still exists. If the object doesn't exist, weak_ptr will return an empty/null shared_ptr.
Here's an example:
class CThread2
{
private:
boost::weak_ptr<T> weakPtr;
public:
void SetPointer(boost::shared_ptr<T> ptrToAssign)
{
weakPtr = ptrToAssign;
}
void UsePointer()
{
boost::shared_ptr<T> basePtr;
basePtr = weakPtr.lock();
if (basePtr)
{
// pointer was not deleted by thread1 and still exists,
// so it can be used.
}
else
{
// thread1 must have deleted the pointer
}
}
};
My answer to this question (link) might also be useful.
The data is truly owned by both
If either of your threads can perform deletion, then you cannot have what I describe above. Since both threads need to know the state of the pointer in addition to the underlying object, this may be a case where a "pointer to a pointer" is useful.
boost::shared_ptr< boost::shared_ptr<T> >
or (via a raw ptr)
shared_ptr<T>* sharedObject;
or just
T** sharedObject;
Why is this useful?
You only have one referrer to T (in fact shared_ptr is pretty redundant)
Both threads can check the status of the single shared pointer (is it NULL? Was it deleted by the other thread?)
Pitfalls:
- Think about what happens when both sides try to delete at the same time, you may need to lock this pointer
Revised Example:
class CAThread
{
private:
boost::shared_ptr<T>* sharedMemory;
public:
void SetPointer(boost::shared_ptr<T>* ptrToAssign)
{
assert(ptrToAssign != NULL);
sharedMemory = ptrToAssign;
}
void UsePointer()
{
// lock as needed
if (sharedMemory->get() != NULL)
{
// pointer was not deleted by the other thread and still exists,
// so it can be used.
}
else
{
// other thread must have deleted the pointer
}
}
void AssignToPointer()
{
// lock as needed
sharedMemory->reset(new T);
}
void DeletePointer()
{
// lock as needed
sharedMemory->reset();
}
};
I'm ignoring all the concurrency issues with the underlying data, but that's not really what you're asking about.
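For the "lock as needed" comments above, one possible shape is to put the shared pointer and its mutex into a single object that both threads reference. This sketch uses std::shared_ptr and std::mutex in place of the Boost types, and SharedSlot with its methods is a hypothetical name, not an existing library class.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical slot that both threads hold a reference to.
template <class T>
class SharedSlot {
public:
    // Replaces whatever the slot held with obj.
    void assign(std::shared_ptr<T> obj) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_ptr = std::move(obj);
    }

    // Either thread may call this to "delete" the object.
    void clear() {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_ptr.reset();
    }

    // Returns a copy taken under the lock; empty if the slot was cleared.
    std::shared_ptr<T> get() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_ptr;
    }

private:
    mutable std::mutex m_mutex;
    std::shared_ptr<T> m_ptr;
};
```

A thread that is in the middle of using a copy from get() keeps the object alive until it lets go of that copy, so clear() from the other thread takes effect for new callers immediately while the actual destruction waits for the last user. Calling clear() twice is harmless.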
Qt has a QPointer class that does this for QObject-derived classes. A QPointer is automatically set to 0 if the object it points at is deleted.
(Of course, this would only work if you're interested in integrating Qt into your project.)