simple thread safe vector for connections in grpc service - c++

I'm trying to learn about concurrency, and I'm implementing a small connection pool in a grpc service that needs to make many connections to a postgres database.
I'm trying to implement a basic connectionPool to prevent creating a new connection for each request. To start, I attempted to create a thread safe std::vector. When I run the grpc server, a single transaction is made, and then the server blocks, but I can't reason out what's going on. Any help would be appreciated
class SafeVector {
std::vector<pqxx::connection*> pool_;
int size_;
int max_size_;
std::mutex m_;
std::condition_variable cv_;
public:
SafeVector(int size, int max_size) : size_(size), max_size_(max_size) {
assert(size_ <= max_size_);
for (size_t i = 0; i < size; ++i) {
pool_.push_back(new pqxx::connection("some conn string"));
}
}
SafeVector(SafeVector const&)=delete; // to be implemented
SafeVector& operator=(SafeVector const&)=delete; // no assignment keeps things simple
std::shared_ptr<pqxx::connection> borrow() {
std::unique_lock<std::mutex> l(m_);
cv_.wait(l, [this]{ return !pool_.empty(); });
std::shared_ptr<pqxx::connection> res(pool_.back());
pool_.pop_back();
return res;
}
void surrender(std::shared_ptr<pqxx::connection> connection) {
std::lock_guard<std::mutex> l(m_);
pool_.push_back(connection.get());
cv_.notify_all();
}
};
In main, I then pass a SafeVector* s = new SafeVector(4, 10); into my service ServiceA(s)
Inside ServiceA, I use the connection as follows:
std::shared_ptr<pqxx::connection> c = safeVector_->borrow();
c->perform(SomeTransactorImpl);
safeVector_->surrender(c);
I put a bunch of logging statements everywhere, and I'm pretty sure I have a fundamental misunderstanding of the core concept of either (1) shared_ptr or (2) the various locking structures.
In particular, it seems that after 4 connections are used (the maximum number of hardware threads on my machine), a seg fault (error 11) happens when attempting to return a connection in the borrow() method.
Any help would be appreciated. Thanks.

smart pointers in C++ are about object ownership.
Object ownership is about who gets to delete the object and when.
A shared pointer means that who gets to delete and when is a shared concern. Once you have said "no one bit of code is permitted to delete this object", you cannot take it back.
In your code, you try to take an object with shared ownership and claim it for your SafeVector in surrender. This is not permitted. You try it anyhow with a call to .get(), but the right to delete that object remains owned by shared pointers.
They proceed to delete it (maybe right away, maybe tomorrow) and your container has a dangling pointer to a deleted object.
Change your shared ptrs to unique ptrs. Add move as required to make it compile.
In surrender, assert the supplied unique ptr is non-empty.
And whike you are in there,
cv_.notify_one();
I would also
std::vector<std::unique_ptr<pqxx::connection>> pool_;
and change:
pool_.push_back(std::move(connection));
if you don't update the type of pool_, instead change .get() to .release(). Unlike shared ptr, unique ptr can give up ownership.

Related

std::shared_ptr::unique(), copying and thread safety

I have a shared_ptr stored in a central place which can be accessed by multiple threads through a method getPointer(). I want to make sure that only one thread uses the pointer at one time. Thus, whenever a thread wants to get the pointer I test if the central copy is the only one via std::shared_ptr::unique() method. If it returns yes, I return the copy assuming that unique()==false as long as that thread works on the copy. Other threads trying to access the pointer at the same time receive a nullptr and have to try again in the future.
Now my question:
Is it theoretically possible that two different threads calling getPointer() can get mutual access to the pointer despite the mutex guard and the testing via unique() ?
std::shared_ptr<int> myPointer; // my pointer is initialized somewhere else but before the first call to getPointer()
std::mutex myMutex;
std::shared_ptr<int> getPointer()
{
std::lock_guard<std::mutex> guard(myMutex);
std::shared_ptr<int> returnValue;
if ( myPointer.unique() )
returnValue = myPointer;
else
returnValue = nullptr;
return returnValue;
}
Regards
Only one "active" copy can exist at a time.
It is protected by the mutex until after a second shared_ptr is created at which point a subsequent call (once it gets the mutex after the first call has exited) will fail the unique test until the initial caller's returned shared_ptr is destroyed.
As noted in the comments, unique is going away in c++20, but you can test use_count == 1 instead, as that is what unique does.
Your solution seems overly complicated. It exploits the internal workings of the shared pointer to deduce a flag value. Why not just make the flag explicit?
std::shared_ptr<int> myPointer;
std::mutex myMutex;
bool myPointerIsInUse = false;
bool GetPermissionToUseMyPointer() {
std::lock_guard<std::mutex guard(myMutex);
auto r = (! myPointerIsInUse);
myPointerIsInUse ||= myPointerIsInUse;
return r;
}
bool RelinquishPermissionToUseMyPointer() {
std::lock_guard<std::mutex guard(myMutex);
myPointerIsInUse = false;
}
P.S., If you wrap that in a class with a few extra bells and whistles, it'll start to look a lot like a semaphore.

Is this inter-thread object sharing strategy sound?

I'm trying to come up with a fast way of solving the following problem:
I have a thread which produces data, and several threads which consume it. I don't need to queue produced data, because data is produced much more slowly than it is consumed (and even if this failed to be the case occasionally, it wouldn't be a problem if a data point were skipped occasionally). So, basically, I have an object that encapsulates the "most recent state", which only the producer thread is allowed to update.
My strategy is as follows (please let me know if I'm completely off my rocker):
I've created three classes for this example: Thing (the actual state object), SharedObject<Thing> (an object that can be local to each thread, and gives that thread access to the underlying Thing), and SharedObjectManager<Thing>, which wraps up a shared_ptr along with a mutex.
The instance of the SharedObjectManager (SOM) is a global variable.
When the producer starts, it instantiates a Thing, and tells the global SOM about it. It then makes a copy, and does all of it's updating work on that copy. When it is ready to commit it's changes to the Thing, it passes the new Thing to the global SOM, which locks it's mutex, updates the shared pointer it keeps, and then releases the lock.
Meanwhile, the consumer threads all intsantiate SharedObject<Thing>. these objects each keep a pointer to the global SOM, as well as a cached copy of the shared_ptr kept by the SOM... It keeps this cached until update() is explicitly called.
I believe this is getting hard to follow, so here's some code:
#include <mutex>
#include <iostream>
#include <memory>
class Thing
{
private:
int _some_member = 10;
public:
int some_member() const { return _some_member; }
void some_member(int val) {_some_member = val; }
};
// one global instance
template<typename T>
class SharedObjectManager
{
private:
std::shared_ptr<T> objPtr;
std::mutex objLock;
public:
std::shared_ptr<T> get_sptr()
{
std::lock_guard<std::mutex> lck(objLock);
return objPtr;
}
void commit_new_object(std::shared_ptr<T> new_object)
{
std::lock_guard<std::mutex> lck (objLock);
objPtr = new_object;
}
};
// one instance per consumer thread.
template<typename T>
class SharedObject
{
private:
SharedObjectManager<T> * som;
std::shared_ptr<T> cache;
public:
SharedObject(SharedObjectManager<T> * backend) : som(backend)
{update();}
void update()
{
cache = som->get_sptr();
}
T & operator *()
{
return *cache;
}
T * operator->()
{
return cache.get();
}
};
// no actual threads in this test, just a quick sanity check.
SharedObjectManager<Thing> glbSOM;
int main(void)
{
glbSOM.commit_new_object(std::make_shared<Thing>());
SharedObject<Thing> myobj(&glbSOM);
std::cout<<myobj->some_member()<<std::endl;
// prints "10".
}
The idea for use by the producer thread is:
// initialization - on startup
auto firstStateObj = std::make_shared<Thing>();
glbSOM.commit_new_object(firstStateObj);
// main loop
while (1)
{
// invoke copy constructor to copy the current live Thing object
auto nextState = std::make_shared<Thing>(*(glbSOM.get_sptr()));
// do stuff to nextState, gradually filling out it's new value
// based on incoming data from other sources, etc.
...
// commit the changes to the shared memory location
glbSOM.commit_new_object(nextState);
}
The use by consumers would be:
SharedObject<Thing> thing(&glbSOM);
while(1)
{
// think about the data contained in thing, and act accordingly...
doStuffWith(thing->some_member());
// re-cache the thing
thing.update();
}
Thanks!
That is way overengineered. Instead, I'd suggest to do following:
Create a pointer to Thing* theThing together with protection mutex. Either a global one, or shared by some other means. Initialize it to nullptr.
In your producer: use two local objects of Thing type - Thing thingOne and Thing thingTwo (remember, thingOne is no better than thingTwo, but one is called thingOne for a reason, but this is a thing thing. Watch out for cats.). Start with populating thingOne. When done, lock the mutex, copy thingOne address to theThing, unlock the mutex. Start populating thingTwo. When done, see above. Repeat untill killed.
In every listener: (make sure the pointer is not nullptr). Lock the mutex. Make a copy of the object pointed two by the theThing. Unlock the mutex. Work with your copy. Burn after reading. Repeat untill killed.

Updating cache without blocking

I currently have a program that has a cache like mechanism. I have a thread listening for updates from another server to this cache. This thread will update the cache when it receives an update. Here is some pseudo code:
void cache::update_cache()
{
cache_ = new std::map<std::string, value>();
while(true)
{
if(recv().compare("update") == 0)
{
std::map<std::string, value> *new_info = new std::map<std::string, value>();
std::map<std::string, value> *tmp;
//Get new info, store in new_info
tmp = cache_;
cache_ = new_cache;
delete tmp;
}
}
}
std::map<std::string, value> *cache::get_cache()
{
return cache_;
}
cache_ is being read from many different threads concurrently. I believe how I have it here I will run into undefined behavior if one of my threads call get_cache(), then my cache updates, then the thread tries to access the stored cache.
I am looking for a way to avoid this problem. I know I could use a mutex, but I would rather not block reads from happening as they have to be as low latency as possible, but if need be, I can go that route.
I was wondering if this would be a good use case for a unique_ptr. Is my understanding correct in that if a thread calls get_cache, and that returns a unique_ptr instead of a standard pointer, once all threads that have the old version of cache are finished with it(i.e leave scope), the object will be deleted.
Is using a unique_ptr the best option for this case, or is there another option that I am not thinking of?
Any input will be greatly appreciated.
Edit:
I believe I made a mistake in my OP. I meant to use and pass a shared_ptr not a unique_ptr for cache_. And when all threads are finished with cache_ the shared_ptr should delete itself.
A little about my program: My program is a webserver that will be using this information to decide what information to return. It is fairly high throughput(thousands of req/sec) Each request queries the cache once, so telling my other threads when to update is no problem. I can tolerate slightly out of date information, and would prefer that over blocking all of my threads from executing if possible. The information in the cache is fairly large, and I would like to limit any copies on value because of this.
update_cache is only run once. It is run in a thread that just listens for an update command and runs the code.
I feel there are multiple issues:
1) Do not leak memory: for that never use "delete" in your code and stick with unique_ptr (or shared_ptr in specific cases)
2) Protect accesses to shared data, for that either using locking (mutex) or lock-free mecanism (std::atomic)
class Cache {
using Map = std::map<std::string, value>();
std::unique_ptr<Map> m_cache;
std::mutex m_cacheLock;
public:
void update_cache()
{
while(true)
{
if(recv().compare("update") == 0)
{
std::unique_ptr<Map> new_info { new Map };
//Get new info, store in new_info
{
std::lock_guard<std::mutex> lock{m_cacheLock};
using std::swap;
swap(m_cache, new_cache);
}
}
}
}
Note: I don't like update_cache() being part of a public interface for the cache as it contains an infinite loop. I would probably externalize the loop with the recv and have a:
void update_cache(std::unique_ptr<Map> new_info)
{
{ // This inner brace is not useless, we don't need to keep the lock during deletion
std::lock_guard<std::mutex> lock{m_cacheLock};
using std::swap;
swap(m_cache, new_cache);
}
}
Now for the reading to the cache, use proper encapsulation and don't leave the pointer to the member map escape:
value get(const std::string &key)
{
// lock, fetch, and return.
// Depending on value type, you might want to allocate memory
// before locking
}
Using this signature you have to throw an exception if the value is not present in the cache, another option is to return something like a boost::optional.
Overall you can keep a low latency (everything is relative, I don't know your use case) if you take care of doing costly operations (memory allocation for instance) outside of the locking section.
shared_ptr is very reasonable for this purpose, C++11 has a family of functions for handling shared_ptr atomically. If the data is immutable after creation, you won't even need any additional synchronization:
class cache {
public:
using map_t = std::map<std::string, value>;
void update_cache();
std::shared_ptr<const map_t> get_cache() const;
private:
std::shared_ptr<const map_t> cache_;
};
void cache::update_cache()
{
while(true)
{
if(recv() == "update")
{
auto new_info = std::make_shared<map_t>();
// Get new info, store in new_info
// Make immutable & publish
std::atomic_store(&cache_,
std::shared_ptr<const map_t>{std::move(new_info)});
}
}
}
auto cache::get_cache() const -> std::shared_ptr<const map_t> {
return std::atomic_load(&cache_);
}

A thread-safe implementation of a generic container of type pair<unsigned int, boost::any> using shared_ptrs

I have created a generic message queue for use in a multi-threaded application. Specifically, single producer, multi-consumer. Main code below.
1) I wanted to know if I should pass a shared_ptr allocated with new into the enqueue method by value, or is it better to have the queue wrapper allocate the memory itself and just pass in a genericMsg object by const reference?
2) Should I have my dequeue method return a shared_ptr, have a shared_ptr passed in as a parameter by reference (current strategy), or just have it directly return a genericMsg object?
3) Will I need signal/wait in enqueue/dequeue or will the read/write locks suffice?
4) Do I even need to use shared_ptrs? Or will this depend solely on the implementation I use? I like that the shared_ptrs will free memory once all references are no longer using the object. I can easily port this to regular pointers if that's recommended, though.
5) I'm storing a pair here because I'd like to discriminate what type of message I'm dealing with else w/o having to do an any_cast. Every message type has a unique ID that refers to a specific struct. Is there a better way of doing this?
Generic Message Type:
template<typename Message_T>
class genericMsg
{
public:
genericMsg()
{
id = 0;
size = 0;
}
genericMsg (unsigned int &_id, unsigned int &_size, Message_T &_data)
{
id = _id;
size = _size;
data = _data;
}
~genericMsg()
{}
unisgned int id;
unsigned int size;
Message_T data; //All structs stored here contain only POD types
};
Enqueue Methods:
// ----------------------------------------------------------------
// -- Thread safe function that adds a new genericMsg object to the
// -- back of the Queue.
// -----------------------------------------------------------------
template<class Message_T>
inline void enqueue(boost::shared_ptr< genericMsg<Message_T> > data)
{
WriteLock w_lock(myLock);
this->qData.push_back(std::make_pair(data->id, data));
}
VS:
// ----------------------------------------------------------------
// -- Thread safe function that adds a new genericMsg object to the
// -- back of the Queue.
// -----------------------------------------------------------------
template<class Message_T>
inline void enqueue(const genericMsg<Message_T> &data_in)
{
WriteLock w_lock(myLock);
boost::shared_ptr< genericMsg<Message_T> > data =
new genericMsg<Message_T>(data_in.id, data_in.size, data_in.data);
this->qData.push_back(std::make_pair(data_in.id, data));
}
Dequeue Method:
// ----------------------------------------------------------------
// -- Thread safe function that grabs a genericMsg object from the
// -- front of the Queue.
// -----------------------------------------------------------------
template<class Message_T>
void dequeue(boost::shared_ptr< genericMsg<Message_T> > &msg)
{
ReadLock r_lock(myLock);
msg = boost::any_cast< boost::shared_ptr< genericMsg<Message_T> > >(qData.front().second);
qData.pop_front();
}
Get message ID:
inline unsigned int getMessageID()
{
ReadLock r_lock(myLock);
unsigned int tempID = qData.front().first;
return tempID;
}
Data Types:
std::deque < std::pair< unsigned int, boost::any> > qData;
Edit:
I have improved upon my design. I now have a genericMessage base class that I directly subclass from in order to derive the unique messages.
Generic Message Base Class:
class genericMessage
{
public:
virtual ~genericMessage() {}
unsigned int getID() {return id;}
unsigned int getSize() {return size;}
protected:
unsigned int id;
unsigned int size;
};
Producer Snippet:
boost::shared_ptr<genericMessage> tmp (new derived_msg1(MSG1_ID));
theQueue.enqueue(tmp);
Consumer Snippet:
boost::shared_ptr<genericMessage> tmp = theQueue.dequeue();
if(tmp->getID() == MSG1_ID)
{
boost::shared_ptr<derived_msg1> tObj = boost::dynamic_pointer_cast<derived_msg1>(tmp);
tObj->printData();
}
New Queue:
std::deque< boost::shared_ptr<genericMessage> > qData;
New Enqueue:
void mq_class::enqueue(const boost::shared_ptr<genericMessage> &data_in)
{
boost::unique_lock<boost::mutex> lock(mut);
this->qData.push_back(data_in);
cond.notify_one();
}
New Dequeue:
boost::shared_ptr<genericMessage> mq_class::dequeue()
{
boost::shared_ptr<genericMessage> ptr;
{
boost::unique_lock<boost::mutex> lock(mut);
while(qData.empty())
{
cond.wait(lock);
}
ptr = qData.front();
qData.pop_front();
}
return ptr;
}
Now, my question is am I doing dequeue correctly? Is there another way of doing it? Should I pass in a shared_ptr as a reference in this case to achieve what I want?
Edit (I added answers for parts 1, 2, and 4).
1) You should have a factory method that creates new genericMsgs and returns a std::unique_ptr. There is absolutely no good reason to pass genericMsg in by const reference and then have the queue wrap it in a smart pointer: Once you've passed by reference you have lost track of ownership, so if you do that the queue is going to have to construct (by copy) the entire genericMsg to wrap.
2) I can't think of any circumstance under which it would be safe to take a reference to a shared_ptr or unique_ptr or auto_ptr. shared_ptrs and unique_ptrs are for tracking ownership and once you've taken a reference to them (or the address of them) you have no idea how many references or pointers are still out there expecting the shared_ptr/unique_ptr object to contain a valid naked pointer.
unique_ptr is always preferred to a naked pointer, and is preferred to a shared_ptr in cases where you only have a single piece of code (validly) pointing to an object at a time.
https://softwareengineering.stackexchange.com/questions/133302/stdshared-ptr-as-a-last-resort
http://herbsutter.com/gotw/_103/
Bad practice to return unique_ptr for raw pointer like ownership semantics? (the answer explains why it is good practice not bad).
3) Yes, you need to use a std::condition_variable in your dequeue function. You need to test whether qData is empty or not before calling qData.front() or qData.pop_front(). If qData is empty you need to wait on a condition variable. When enqueue inserts an item it should signal the condition variable to wake up anyone who may have been waiting.
Your use of reader/writer locks is completely incorrect. Don't use reader/writer locks. Use std::mutex. A reader lock can only be used on a method that is completely const. You are modifying qData in dequeue, so a reader lock will lead to data races there. (Reader writer locks are only applicable when you have stupid code that is both const and holds locks for extended period of time. You are only keeping the lock for the period of time it takes to insert or remove from the queue, so even if you were const the added overhead of reader/writer locks would be a net lose.)
An example of implementing a (bounded) buffer using mutexes and condition_variables can be found at: Is this a correct way to implement a bounded buffer in C++.
4) unique_ptr is always preferred to naked pointers, and usually preferred to shared_ptr. (The main exception where shared_ptr might be better is for graph-like data structures.) In cases like yours where you are reading something in on side, creating a new object with a factory, moving the ownership to the queue and then moving ownership out of the queue to the consumer it sounds like you should be using unique_ptr.
5) You are reinventing tagged unions. Virtual functions were added to c++ specifically so you wouldn't need to do this. You should subclass your messages from a class that has a virtual function called do_it() (or better yet, operator()() or something like that). Then instead of tagging each struct, make each struct a subclass of your message class. When you dequeue each struct (or ptr to struct) just call do_it() on it. Strong static typing, no casts. See C++ std condition variable covering a lot of share variables for an example.
Also: if you are going to stick with the tagged unions: you can't have separate calls to get the id and the data item. Consider: If thread A calls to get the id, then thread B calls to get the id, then thread B retrieves the data item, now what happens when thread A calls to retrieve a data item? It gets a data item, but not with the type that it expected. You need to retrieve the id and the data item under the same critical section.
First of all, it's better to use 3rd-party concurrency containers than to implement them yourself, except it's for education purpose.
Your messages doesn't look to have costly constructors/destructor, so you can store them by value and forget about all your other questions. Use move semantics (if available) for optimizations.
If your profiler says "by value" is bad idea in your particular case:
I suppose your producer creates messages, puts them into your queue and loses any interest in them. In this case you don't need shared_ptr because you don't have shared ownership. You can use unique_ptr or even a raw pointer. It's implementation details and better to hide them inside the queue.
From performance point of view, it's better to implement lock-free queue. "locks vs. signals" depends completely on your application. For example, if you use thread pool and kind of a scheduler it's better to allow your clients to do something useful while queue is full/empty. In simpler cases reader/writer lock is just fine.
If I want to be thread safe, I usually use const objects and modify only on copy or create constructor. In this way you don't need to use any lock mechanism. In a threaded system, it is usually more effective than use mutexes on a single instance.
In your case only deque would need lock.

Read-write thread-safe smart pointer in C++, x86-64

I develop some lock free data structure and following problem arises.
I have writer thread that creates objects on heap and wraps them in smart pointer with reference counter. I also have a lot of reader threads, that work with these objects. Code can look like this:
SmartPtr ptr;
class Reader : public Thread {
virtual void Run {
for (;;) {
SmartPtr local(ptr);
// do smth
}
}
};
class Writer : public Thread {
virtual void Run {
for (;;) {
SmartPtr newPtr(new Object);
ptr = newPtr;
}
}
};
int main() {
Pool* pool = SystemThreadPool();
pool->Run(new Reader());
pool->Run(new Writer());
for (;;) // wait for crash :(
}
When I create thread-local copy of ptr it means at least
Read an address.
Increment reference counter.
I can't do these two operations atomically and thus sometimes my readers work with deleted object.
The question is - what kind of smart pointer should I use to make read-write access from several threads with correct memory management possible? Solution should exist, since Java programmers don't even care about such a problem, simply relying on that all objects are references and are deleted only when nobody uses them.
For PowerPC I found http://drdobbs.com/184401888, looks nice, but uses Load-Linked and Store-Conditional instructions, that we don't have in x86.
As far I as I understand, boost pointers provide such functionality only using locks. I need lock free solution.
boost::shared_ptr have atomic_store which uses a "lock-free" spinlock which should be fast enough for 99% of possible cases.
boost::shared_ptr<Object> ptr;
class Reader : public Thread {
virtual void Run {
for (;;) {
boost::shared_ptr<Object> local(boost::atomic_load(&ptr));
// do smth
}
}
};
class Writer : public Thread {
virtual void Run {
for (;;) {
boost::shared_ptr<Object> newPtr(new Object);
boost::atomic_store(&ptr, newPtr);
}
}
};
int main() {
Pool* pool = SystemThreadPool();
pool->Run(new Reader());
pool->Run(new Writer());
for (;;)
}
EDIT:
In response to comment below, the implementation is in "boost/shared_ptr.hpp"...
template<class T> void atomic_store( shared_ptr<T> * p, shared_ptr<T> r )
{
boost::detail::spinlock_pool<2>::scoped_lock lock( p );
p->swap( r );
}
template<class T> shared_ptr<T> atomic_exchange( shared_ptr<T> * p, shared_ptr<T> r )
{
boost::detail::spinlock & sp = boost::detail::spinlock_pool<2>::spinlock_for( p );
sp.lock();
p->swap( r );
sp.unlock();
return r; // return std::move( r )
}
With some jiggery-pokery you should be able to accomplish this using InterlockedCompareExchange128. Store the reference count and pointer in a 2 element __int64 array. If reference count is in array[0] and pointer in array[1] the atomic update would look like this:
while(true)
{
__int64 comparand[2];
comparand[0] = refCount;
comparand[1] = pointer;
if(1 == InterlockedCompareExchange128(
array,
pointer,
refCount + 1,
comparand))
{
// Pointer is ready for use. Exit the while loop.
}
}
If an InterlockedCompareExchange128 intrinsic function isn't available for your compiler then you may use the underlying CMPXCHG16B instruction instead, if you don't mind mucking around in assembly language.
The solution proposed by RobH doesn't work. It has the same problem as the original question: when accessing the reference count object, it might already have been deleted.
The only way I see of solving the problem without a global lock (as in boost::atomic_store) or conditional read/write instructions is to somehow delay the destruction of the object (or the shared reference count object if such thing is used). So zennehoy has a good idea but his method is too unsafe.
The way I might do it is by keeping copies of all the pointers in the writer thread so that the writer can control the destruction of the objects:
class Writer : public Thread {
virtual void Run() {
list<SmartPtr> ptrs; //list that holds all the old ptr values
for (;;) {
SmartPtr newPtr(new Object);
if(ptr)
ptrs.push_back(ptr); //push previous pointer into the list
ptr = newPtr;
//Periodically go through the list and destroy objects that are not
//referenced by other threads
for(auto it=ptrs.begin(); it!=ptrs.end(); )
if(it->refCount()==1)
it = ptrs.erase(it);
else
++it;
}
}
};
However there are still requirements for the smart pointer class. This doesn't work with shared_ptr as the reads and writes are not atomic. It almost works with boost::intrusive_ptr. The assignment on intrusive_ptr is implemented like this (pseudocode):
//create temporary from rhs
tmp.ptr = rhs.ptr;
if(tmp.ptr)
intrusive_ptr_add_ref(tmp.ptr);
//swap(tmp,lhs)
T* x = lhs.ptr;
lhs.ptr = tmp.ptr;
tmp.ptr = x;
//destroy temporary
if(tmp.ptr)
intrusive_ptr_release(tmp.ptr);
As far as I understand the only thing missing here is a compiler level memory fence before lhs.ptr = tmp.ptr;. With that added, both reading rhs and writing lhs would be thread-safe under strict conditions: 1) x86 or x64 architecture 2) atomic reference counting 3) rhs refcount must not go to zero during the assignment (guaranteed by the Writer code above) 4) only one thread writing to lhs (using CAS you could have several writers).
Anyway, you could create your own smart pointer class based on intrusive_ptr with necessary changes. Definitely easier than re-implementing shared_ptr. And besides, if you want performance, intrusive is the way to go.
The reason this works much more easily in java is garbage collection. In C++, you have to manually ensure that a value is not just starting to be used by a different thread when you want to delete it.
A solution I've used in a similar situation is to simply delay the deletion of the value. I create a separate thread that iterates through a list of things to be deleted. When I want to delete something, I add it to this list with a timestamp. The deleting thread waits until some fixed time after this timestamp before actually deleting the value. You just have to make sure that the delay is large enough to guarantee that any temporary use of the value has completed.
100 milliseconds would have been enough in my case, I chose a few seconds to be safe.