C++ Destruction Dependencies

I have a C++ class that needs to track resource usage, and when the last instance referencing the specific resource is destructed, the resource must be released. Due to the nature of how this resource can be acquired by multiple different objects (and not just through a copy / move constructor), I've had to implement my own reference tracking.
This reference tracking works great with very little performance hit, but when I started adding threading, I had to add a critical section to guard accesses to the reference counting structure. The critical section is a static member of the class that uses it. This works fine until the process begins exiting, and it's time for everything's destructor to be called.
What's happening is that the critical section's destructor (which calls DeleteCriticalSection) is being called before the last destructor of my objects. The result is that I'm stuck with either possible race conditions on my reference counter or a crash from trying to enter an invalid critical section.
These objects have a clear dependency on this critical section, and I don't see a great way to prevent it from being destroyed until the last object is gone. My thinking is to change the critical section from a static member to a std::shared_ptr<CriticalSection> belonging to each instance of the class, but that seems like it'd have an unnecessary performance hit.
Is there some other way to express this dependency? Or is there a better way to do what I'm trying to do that avoids the dependency in the first place?
EDIT: To be clear, I tried using std::shared_ptr to handle reference tracking. Unfortunately, that doesn't work. Here's a trivialized example of how it causes issues.
Object Get() {
    Object o1{ GetResourceIdentifier(3) };
    Object o2{ o1 }; // copy constructor
    Object o3{ GetResourceIdentifier(3) };
    return o3;
}

int main() {
    auto test{ Get() };
    test.DoStuff();
}
Necessarily, when an object is instantiated, it just reuses the resource if it's already open. So o1, o2, and o3 all refer to the same underlying resource in this example. But with std::shared_ptr, when Get returns, the shared pointer that o1 and o2 share thinks there are no references left and releases the resource. Since o3 refers to the same resource, its resource also gets freed here, meaning the call to DoStuff will go awry.
If you'd like to see the actual code (file is rather large), the source is here and the header is here

If I understand your use case correctly, you can use shared_ptr, but the trick is to store weak_ptrs; otherwise your resource will not be freed until the end of the program:
struct Resource
{
    int id;
    Resource(int id) : id{id} {}
    ~Resource() { std::cout << "~" << id << std::endl; }
    void foo() { std::cout << "foo" << id << std::endl; }
};

std::shared_ptr<Resource> get_resource(int resource_id)
{
    static std::mutex mutex{};
    static std::unordered_map<int, std::weak_ptr<Resource>> resource_map{};
    std::scoped_lock lock{mutex};
    auto& weak = resource_map[resource_id];
    if (!weak.lock())
    {
        auto shared = std::make_shared<Resource>(resource_id);
        weak = shared;
        return shared;
    }
    return weak.lock();
}
This works as follows. When resource i is requested:
- if there is no resource i with active owners: a new resource is created, a new shared_ptr to it is returned, and a weak_ptr to it is stored
- if there is a resource i with active owners: a new shared_ptr to the existing resource is returned

For a given resource i:
- it is created on the first get_resource(i) call
- it is deleted once there are no more owners (get_resource itself does not hold ownership)
- once deleted, a new resource is created on the next get_resource(i) call
- there is never more than one live resource i at a time
This seems to work, but be advised I have done only summary testing:
auto test()
{
    auto r0 = get_resource(1);
    auto r1 = get_resource(24);
    auto r2 = r1;
    auto r3 = get_resource(24);
    return r3;
}

int main()
{
    auto r = test();
    r->foo();
}
Outputs
~1
foo24
~24


How to use `boost::thread_specific_ptr` with `for_each`

In the code below, Bar is supposed to model a thread-unsafe object that is moderately expensive to create. Foo contains a Bar and is multi-threaded, so it uses a thread_specific_ptr<Bar> to make a per-thread Bar that can be re-used across multiple calls to loop for the same Foo (therefore amortizing the cost of creating a Bar for each thread). Foo always creates a Bar with the same num, so the sanity check is supposed to always pass, yet it fails.
The reason for this is (I think) explained in the requirement for the thread_specific_ptr destructor:
All the thread specific instances associated to this thread_specific_ptr (except maybe the one associated to this thread) must be null.
So the problem is caused by a combination of three things:
Bar objects created in worker threads are not cleaned up when the Foo's thread_specific_ptr is cleaned up, and therefore persist across iterations of the loop in main (essentially, a memory leak)
The C++ runtime is re-using threads in for_each between iterations of the loop in main
The C++ runtime is re-allocating each Foo in the main loop to the same memory address
The way that thread_specific_ptrs are indexed (by the thread_specific_ptr's memory address and the thread ID) results in old Bars being accidentally reused. I understand the problem; what I don't understand is what to do about it. Note the remark from the docs:
The requirement is due to the fact that in order to delete all these instances, the implementation should be forced to maintain a list of all the threads having an associated specific ptr, which is against the goal of thread specific data.
I'd like to avoid this complexity as well.
How can I use for_each for simple thread management, but also avoid the memory leak? Solution requirements:
It should only create one Bar per thread per Foo (i.e., don't create a new Bar inside the for_each)
Assume Bar is not thread-safe.
If possible, use for_each to make the parallel loop as simple as possible
The loop should actually run in parallel (i.e., no mutex around a single Bar)
Bar objects created by loop should be available for use until the Foo object that created them is destructed, at which point all Bar objects should also be destructed.
The following code compiles and should exit with return code 1 with high probability on a machine with sufficient cores.
#include <boost/thread/tss.hpp>
#include <cstdlib>
#include <execution>
#include <iostream>
#include <numeric>
#include <vector>

using namespace std;

class Bar {
public:
    // models a thread-unsafe object
    explicit Bar(int i) : num(i) { }
    int num;
};

class Foo {
public:
    explicit Foo(int i) : num(i) { }

    void loop() {
        vector<int> idxs(32);
        iota(begin(idxs), end(idxs), 0);
        for_each(execution::par, begin(idxs), end(idxs), [&](int) {
            if (ptr.get() == nullptr) {
                // no `Bar` exists for this thread yet, so create one
                Bar *tmp = new Bar(num);
                ptr.reset(tmp);
            }
            // Get the thread-local Bar
            Bar &b = *ptr;
            // Sanity check: we ALWAYS create a `Bar` with the same num as `Foo`;
            // see the `if` block above.
            // Therefore, this condition shouldn't ever be true (but it is!)
            if (b.num != num) {
                cout << "NOT THREAD SAFE: Foo index is " << num
                     << ", but Bar index is " << b.num << endl;
                exit(1);
            }
        });
    }

    boost::thread_specific_ptr<Bar> ptr;
    int num;
};

int main() {
    for (int i = 0; i < 100; i++) {
        Foo f(i);
        f.loop();
    }
    return 0;
}
According to the documentation
~thread_specific_ptr();
Requires:
All the thread specific instances associated to this thread_specific_ptr
(except maybe the one associated to this thread) must be null.
This means you are not allowed to destroy a Foo until all of its Bars have been destroyed. This is a problem because execution::par does not have to operate on a fresh thread pool, nor does it have to terminate its threads once the for_each() is done.
This gives us enough to answer the question as asked: You can only use thread_specific_ptr alongside execution::par to share data between various iterations on the same thread if:
The thread_specific_ptr is never destroyed. This is required because there is no way to know whether a given iteration of the for_each will be the last one for its assigned thread, and that thread might never get scheduled again.
You are comfortable leaking one instance of the pointed object per thread until the end of the program.
What's going on in your code
We are already in Undefined Behavior land, but the behavior you are seeing can still be explained a bit further. Considering that:
Boost.Thread uses the address of the thread_specific_ptr instance as key of the thread specific pointers. This avoids to create/destroy a key which will need a lock to protect from race conditions. This has a little performance liability, as the access must be done using an associative container.
... and that all 100 instances of Foo will most likely sit at the same place in memory, you end up seeing instances of Bar from the previous Foo when the worker threads are recycled, which causes your (inaccurate, see below) check to trigger.
Solution: What I think you should do
I would suggest you just drop thread_specific_ptr altogether and manually manage the pool of per-thread/per-Foo Bar instances with an associative container. This makes managing the lifetime of the Bar objects a lot more straightforward:
class per_thread_bar_pool {
    std::map<std::thread::id, Bar> bars_;
    // alternatively:
    // std::map<std::thread::id, std::unique_ptr<Bar>> bars_;
    std::mutex mtx_;

public:
    Bar& get(int num) {
        auto tid = std::this_thread::get_id();
        std::unique_lock l{mtx_};
        auto found = bars_.find(tid);
        if (found == bars_.end()) {
            l.unlock(); // Let other threads access the map while `Bar` is being built.
            Bar new_bar(num);
            // auto new_bar = std::make_unique<Bar>(num);
            l.lock();
            assert(bars_.find(tid) == bars_.end());
            found = bars_.emplace(tid, std::move(new_bar)).first;
        }
        return found->second;
        // return *found->second;
    }
};
void loop() {
    per_thread_bar_pool bars;
    vector<int> idxs(32);
    iota(begin(idxs), end(idxs), 0);
    for_each(std::execution::par, begin(idxs), end(idxs), [&](int) {
        Bar& current_bar = bars.get(num);
        // ...
    });
}
thread_specific_ptr already uses std::map<> under the hood (it maintains one per thread). So introducing one here is not that big of a deal.
We do introduce a mutex, but it only comes into play for a quick lookup/insertion into a map, and since constructing a Bar is supposed to be expensive, the lock will most likely have very little impact. It also has the benefit that multiple instances of Foo no longer interact with each other, so you avoid surprising bugs that could occur if you ever end up calling Foo::loop() from multiple threads.
N.B.: if (b.num != num) { is not a valid test since all instances of Bar from a given Foo share the same num. That should only cause false-negatives though.
Solution: Making your code work (almost)
All this being said, if you are absolutely gung-ho about using thread_specific_ptr and execution::par at the same time, you'll have to do the following:
void loop() {
    static boost::thread_specific_ptr<Bar> ptr; // lives till the end of the program
    vector<int> idxs(32);
    iota(begin(idxs), end(idxs), 0);
    for_each(std::execution::par, begin(idxs), end(idxs), [&](int) {
        if (ptr.get() == nullptr || ptr->num != num) {
            // no `Bar` exists for this thread yet, or it's from a previous run
            Bar *tmp = new Bar(num);
            ptr.reset(tmp);
        }
        // Get the thread-local Bar
        Bar &b = *ptr;
    });
}
However, this will leak up to one Bar per thread, as cleanup only ever happens when we try to reuse a Bar from a previous run. There is no way around this.

In C++ threads, should I pass shared_ptr by value or reference?

This page on Thread Safety by Microsoft says shared_ptr should be used even if there are multiple copies sharing the same object.
So does this mean that both of the following are acceptable? I've tried both and they appear to work fine.
EDIT: The actual business objective is to get string updates from the long running thread to the main thread. I figured I should use shared_ptr since string is not thread safe. Don't care about ownership honestly.
Option 1 (Passing reference):
auto status = std::make_shared<std::string>();
auto f = [&status]() {
    ...
    *status = "current status";
    ...
};
std::thread t{f};
while (true) {
    std::cout << *status << std::endl;
    std::this_thread::sleep_for(1000ms);
    if (*status == "completed") break;
}
t.join();
Option 2 (Making a copy):
auto status = std::make_shared<std::string>();
auto f = [](std::shared_ptr<std::string> s) {
    ...
    *s = "current status";
    ...
};
std::thread t{f, status};
while (true) {
    std::cout << *status << std::endl;
    std::this_thread::sleep_for(1000ms);
    if (*status == "completed") break;
}
t.join();
EDIT2: So apparently both these approaches are wrong for what I'm trying to achieve. I need to use std::mutex (cppreference) and not muck around with shared_ptr. See second half of this answer.
Typically, threads may outlive the scope where they are created. In such case, any local variable captured by reference may be destroyed while the thread is still running. If this is the case, then you should not capture by reference.
Furthermore, modifying a shared pointer object in one thread and accessing in another without synchronisation results in undefined behaviour. If that is what you're doing, then you should access the pointer using std::atomic_load/atomic_store functions, or simply copy the pointer into each thread. Note that you can capture by copy:
auto f = [status]() { /* ... */ };
Furthermore, the shared pointer provides no extra thread safety to accessing the pointed object beyond keeping the ownership alive and ensuring it gets deleted exactly once. If the pointed type is not atomic, then modifying it in one thread and accessing in another without synchronisation results in undefined behaviour. If that is what you're doing, you need to use mutexes or something similar. Or copy the pointed object itself into each thread.
Regarding the edited question: Your examples apply to this last case. Both of them have undefined behaviour. You need synchronisation.
It is weird to accept shared_ptr by reference as you lose the whole point of using shared_ptr in the first place. You may just use a raw pointer instead.
There are cases where accepting a shared_ptr by reference is legitimate, but if you hand a reference to a thread, you get UB as soon as that shared_ptr instance is destroyed while the thread is still using it.
Primary purpose of shared_ptr is to manage lifetime of the object. If you pass a reference of it to a thread then you throw away the whole purpose and advantages of the shared_ptr.
If you use a reference, you also can't detach the thread.
For example, this program will likely crash:
#include <thread>
#include <chrono>
#include <string>
#include <iostream>
#include <memory>

void f1()
{
    auto status = std::make_shared<std::string>();
    auto f = [&status]()
    {
        std::this_thread::sleep_for(std::chrono::seconds(1));
        *status = "current status"; // dangling: the local status is long gone
    };
    std::thread t{f};
    t.detach();
}

int main() {
    f1();
    std::string status = "other status"; // may reuse f1's dead stack frame
    std::this_thread::sleep_for(std::chrono::seconds(1));
    std::cout << status << std::endl;    // observe the corruption
}
When you pass a reference to status to the lambda, you need to make sure yourself that the lambda does not outlive the status variable.
Imagine that we would want to move the thread creation to a separate function:
std::thread spawn_thread(std::shared_ptr<std::string> status)
{
    auto f = [&status]() {
        // Adding a sleep to make sure this gets executed after we exit spawn_thread.
        std::this_thread::sleep_for(std::chrono::milliseconds(500));
        *status = "current status";
    };
    std::thread t{f};
    return t;
}

int main()
{
    auto status = std::make_shared<std::string>();
    auto thread = spawn_thread(status);
    while (true)
    {
        std::cout << *status << std::endl;
        std::this_thread::sleep_for(std::chrono::milliseconds(1000));
        if (*status == "completed")
        {
            break;
        }
    }
    thread.join();
}
Running this code will most likely result in a crash, since the variable status (not the shared data behind it) goes out of scope before we access it within the lambda.
Of course we could pass the reference to status to the spawn_thread function, but then we propagate the problem further - now the caller of spawn_thread needs to make sure that this variable outlives the thread.
std::shared_ptr is designed for the cases when you do not want to manually control the lifetime of the object passed around, but in order for it to work you need to pass it by value so that the internal mechanism keeps count of the number of shared_ptr instances.
Keep in mind that while passing around and copying shared_ptr is thread safe, concurrent reads and writes of the value stored inside it is not.

Is this inter-thread object sharing strategy sound?

I'm trying to come up with a fast way of solving the following problem:
I have a thread which produces data, and several threads which consume it. I don't need to queue produced data, because it is produced much more slowly than it is consumed (and even if that occasionally failed to hold, it wouldn't be a problem to skip a data point now and then). So, basically, I have an object that encapsulates the "most recent state", which only the producer thread is allowed to update.
My strategy is as follows (please let me know if I'm completely off my rocker):
I've created three classes for this example: Thing (the actual state object), SharedObject<Thing> (an object that can be local to each thread, and gives that thread access to the underlying Thing), and SharedObjectManager<Thing>, which wraps up a shared_ptr along with a mutex.
The instance of the SharedObjectManager (SOM) is a global variable.
When the producer starts, it instantiates a Thing and tells the global SOM about it. It then makes a copy and does all of its updating work on that copy. When it is ready to commit its changes to the Thing, it passes the new Thing to the global SOM, which locks its mutex, updates the shared pointer it keeps, and then releases the lock.
Meanwhile, the consumer threads all instantiate SharedObject<Thing>. These objects each keep a pointer to the global SOM, as well as a cached copy of the shared_ptr kept by the SOM. Each keeps this cached copy until update() is explicitly called.
I believe this is getting hard to follow, so here's some code:
#include <mutex>
#include <iostream>
#include <memory>

class Thing
{
private:
    int _some_member = 10;
public:
    int some_member() const { return _some_member; }
    void some_member(int val) { _some_member = val; }
};

// one global instance
template<typename T>
class SharedObjectManager
{
private:
    std::shared_ptr<T> objPtr;
    std::mutex objLock;
public:
    std::shared_ptr<T> get_sptr()
    {
        std::lock_guard<std::mutex> lck(objLock);
        return objPtr;
    }

    void commit_new_object(std::shared_ptr<T> new_object)
    {
        std::lock_guard<std::mutex> lck(objLock);
        objPtr = new_object;
    }
};

// one instance per consumer thread.
template<typename T>
class SharedObject
{
private:
    SharedObjectManager<T>* som;
    std::shared_ptr<T> cache;
public:
    SharedObject(SharedObjectManager<T>* backend) : som(backend)
    { update(); }

    void update()
    {
        cache = som->get_sptr();
    }

    T& operator*()
    {
        return *cache;
    }

    T* operator->()
    {
        return cache.get();
    }
};

// no actual threads in this test, just a quick sanity check.
SharedObjectManager<Thing> glbSOM;

int main(void)
{
    glbSOM.commit_new_object(std::make_shared<Thing>());
    SharedObject<Thing> myobj(&glbSOM);
    std::cout << myobj->some_member() << std::endl;
    // prints "10".
}
The idea for use by the producer thread is:
// initialization - on startup
auto firstStateObj = std::make_shared<Thing>();
glbSOM.commit_new_object(firstStateObj);

// main loop
while (1)
{
    // invoke copy constructor to copy the current live Thing object
    auto nextState = std::make_shared<Thing>(*(glbSOM.get_sptr()));
    // do stuff to nextState, gradually filling out its new value
    // based on incoming data from other sources, etc.
    ...
    // commit the changes to the shared memory location
    glbSOM.commit_new_object(nextState);
}
The use by consumers would be:
SharedObject<Thing> thing(&glbSOM);
while (1)
{
    // think about the data contained in thing, and act accordingly...
    doStuffWith(thing->some_member());

    // re-cache the thing
    thing.update();
}
Thanks!
That is way overengineered. Instead, I'd suggest doing the following:
Create a pointer Thing* theThing together with a protecting mutex, either global or shared by some other means. Initialize it to nullptr.
In your producer: use two local Thing objects, thingOne and thingTwo, as alternating buffers. Start by populating thingOne. When done, lock the mutex, copy thingOne's address into theThing, and unlock the mutex. Then start populating thingTwo; when done, publish it the same way. Repeat until killed.
In every listener: lock the mutex, check that the pointer is not nullptr, make a copy of the object pointed to by theThing, and unlock the mutex. Work with your copy and discard it when done. Repeat until killed.

Read-write thread-safe smart pointer in C++, x86-64

I'm developing a lock-free data structure, and the following problem arises.
I have a writer thread that creates objects on the heap and wraps them in a smart pointer with a reference counter. I also have a lot of reader threads that work with these objects. The code can look like this:
SmartPtr ptr;

class Reader : public Thread {
    virtual void Run() {
        for (;;) {
            SmartPtr local(ptr);
            // do smth
        }
    }
};

class Writer : public Thread {
    virtual void Run() {
        for (;;) {
            SmartPtr newPtr(new Object);
            ptr = newPtr;
        }
    }
};

int main() {
    Pool* pool = SystemThreadPool();
    pool->Run(new Reader());
    pool->Run(new Writer());
    for (;;) {} // wait for crash :(
}
When I create a thread-local copy of ptr, at least two steps are involved:
Read the address.
Increment the reference counter.
I can't do these two operations atomically, and thus sometimes my readers work with a deleted object.
The question is: what kind of smart pointer should I use to make read-write access from several threads, with correct memory management, possible? A solution should exist, since Java programmers don't even think about such a problem, simply relying on the fact that all objects are references and are deleted only when nobody uses them.
For PowerPC I found http://drdobbs.com/184401888, looks nice, but uses Load-Linked and Store-Conditional instructions, that we don't have in x86.
As far as I understand, boost pointers provide such functionality only by using locks. I need a lock-free solution.
boost::shared_ptr has atomic_store, which uses a "lock-free" spinlock that should be fast enough for 99% of possible cases.
boost::shared_ptr<Object> ptr;

class Reader : public Thread {
    virtual void Run() {
        for (;;) {
            boost::shared_ptr<Object> local(boost::atomic_load(&ptr));
            // do smth
        }
    }
};

class Writer : public Thread {
    virtual void Run() {
        for (;;) {
            boost::shared_ptr<Object> newPtr(new Object);
            boost::atomic_store(&ptr, newPtr);
        }
    }
};

int main() {
    Pool* pool = SystemThreadPool();
    pool->Run(new Reader());
    pool->Run(new Writer());
    for (;;) {}
}
EDIT:
In response to comment below, the implementation is in "boost/shared_ptr.hpp"...
template<class T> void atomic_store( shared_ptr<T> * p, shared_ptr<T> r )
{
    boost::detail::spinlock_pool<2>::scoped_lock lock( p );
    p->swap( r );
}

template<class T> shared_ptr<T> atomic_exchange( shared_ptr<T> * p, shared_ptr<T> r )
{
    boost::detail::spinlock & sp = boost::detail::spinlock_pool<2>::spinlock_for( p );
    sp.lock();
    p->swap( r );
    sp.unlock();
    return r; // return std::move( r )
}
With some jiggery-pokery you should be able to accomplish this using InterlockedCompareExchange128. Store the reference count and pointer in a 2 element __int64 array. If reference count is in array[0] and pointer in array[1] the atomic update would look like this:
while (true)
{
    __int64 comparand[2];
    comparand[0] = refCount;
    comparand[1] = pointer;
    if (1 == InterlockedCompareExchange128(array,
                                           pointer,
                                           refCount + 1,
                                           comparand))
    {
        // Pointer is ready for use. Exit the while loop.
        break;
    }
}
If an InterlockedCompareExchange128 intrinsic function isn't available for your compiler then you may use the underlying CMPXCHG16B instruction instead, if you don't mind mucking around in assembly language.
The solution proposed by RobH doesn't work. It has the same problem as the original question: when accessing the reference count object, it might already have been deleted.
The only way I see of solving the problem without a global lock (as in boost::atomic_store) or conditional read/write instructions is to somehow delay the destruction of the object (or the shared reference count object if such thing is used). So zennehoy has a good idea but his method is too unsafe.
The way I might do it is by keeping copies of all the pointers in the writer thread so that the writer can control the destruction of the objects:
class Writer : public Thread {
    virtual void Run() {
        list<SmartPtr> ptrs; // list that holds all the old ptr values
        for (;;) {
            SmartPtr newPtr(new Object);
            if (ptr)
                ptrs.push_back(ptr); // push previous pointer into the list
            ptr = newPtr;
            // Periodically go through the list and destroy objects that are
            // not referenced by other threads
            for (auto it = ptrs.begin(); it != ptrs.end(); )
                if (it->refCount() == 1)
                    it = ptrs.erase(it);
                else
                    ++it;
        }
    }
};
However there are still requirements for the smart pointer class. This doesn't work with shared_ptr as the reads and writes are not atomic. It almost works with boost::intrusive_ptr. The assignment on intrusive_ptr is implemented like this (pseudocode):
// create temporary from rhs
tmp.ptr = rhs.ptr;
if (tmp.ptr)
    intrusive_ptr_add_ref(tmp.ptr);
// swap(tmp, lhs)
T* x = lhs.ptr;
lhs.ptr = tmp.ptr;
tmp.ptr = x;
// destroy temporary
if (tmp.ptr)
    intrusive_ptr_release(tmp.ptr);
As far as I understand the only thing missing here is a compiler level memory fence before lhs.ptr = tmp.ptr;. With that added, both reading rhs and writing lhs would be thread-safe under strict conditions: 1) x86 or x64 architecture 2) atomic reference counting 3) rhs refcount must not go to zero during the assignment (guaranteed by the Writer code above) 4) only one thread writing to lhs (using CAS you could have several writers).
Anyway, you could create your own smart pointer class based on intrusive_ptr with necessary changes. Definitely easier than re-implementing shared_ptr. And besides, if you want performance, intrusive is the way to go.
The reason this works much more easily in Java is garbage collection. In C++, you have to manually ensure that a value is not just starting to be used by a different thread when you want to delete it.
A solution I've used in a similar situation is to simply delay the deletion of the value. I create a separate thread that iterates through a list of things to be deleted. When I want to delete something, I add it to this list with a timestamp. The deleting thread waits until some fixed time after this timestamp before actually deleting the value. You just have to make sure that the delay is large enough to guarantee that any temporary use of the value has completed.
100 milliseconds would have been enough in my case, I chose a few seconds to be safe.

Deleting pointer sometimes results in heap corruption

I have a multithreaded application that runs using a custom thread pool class. The threads all execute the same function, with different parameters.
These parameters are given to the threadpool class the following way:
// jobParams is a struct of int, double, etc...
jobParams* params = new jobParams;
params->value1 = 2;
params->value2 = 3;

int jobId = 0;
threadPool.addJob(jobId, params);
As soon as a thread has nothing to do, it gets the next parameters and runs the job function. I decided to take care of the deletion of the parameters in the threadpool class:
ThreadPool::~ThreadPool() {
    for (int i = 0; i < this->jobs.size(); ++i) {
        delete this->jobs[i].params;
    }
}
However, when doing so, I sometimes get a heap corruption error:
Invalid Address specified to RtlFreeHeap
The strange thing is that in one case it works perfectly, but in another program it crashes with this error. I tried deleting the pointer at other places: in the thread after the execution of the job function (I get the same heap corruption error) or at the end of the job function itself (no error in this case).
I don't understand how deleting the same pointers (I checked, the addresses are the same) from different places changes anything. Does this have anything to do with the fact that it's multithreaded?
I do have a critical section that handles access to the parameters. I don't think the problem is synchronized access. Anyway, the destructor is called only once all threads are done, and I don't delete any pointer anywhere else. Can pointers be deleted automatically?
As for my code. The list of jobs is a queue of a structure, composed of the id of a job (used to be able to get the output of a specific job later) and the parameters.
getNextJob() is called by the threads (they have a pointer to the ThreadPool) each time they finish executing their last job.
void ThreadPool::addJob(int jobId, void* params) {
    jobData job; // jobData is a simple struct { int, void* }
    job.ID = jobId;
    job.params = params;
    // insert parameters in the list
    this->jobs.push(job);
}

jobData* ThreadPool::getNextJob() {
    // get the data of the next job
    jobData* job = NULL;
    // we don't want to start the same job twice,
    // so we make sure only one thread at a time is in this part
    WaitForSingleObject(this->mutex, INFINITE);
    if (!this->jobs.empty())
    {
        job = &(this->jobs.front());
        this->jobs.pop();
    }
    // we're done with the exclusive part!
    ReleaseMutex(this->mutex);
    return job;
}
Let's turn this on its head: Why are you using pointers at all?
class Params
{
public:
    int value1, value2; // etc...
};

class ThreadJob
{
public:
    ThreadJob(int job, const Params& p) : jobID(job), params(p) {}
    int jobID; // or whatever...
    Params params;
};

class ThreadPool
{
    std::list<ThreadJob> jobs;

public:
    void addJob(int job, const Params& p)
    {
        jobs.push_back(ThreadJob(job, p));
    }
};
No new, delete or pointers... Obviously some of the implementation details may be cocked, but you get the overall picture.
Thanks for extra code. Now we can see a problem -
in getNextJob
if (!this->jobs.empty())
{
    job = &(this->jobs.front());
    this->jobs.pop();
After the "pop", the memory pointed to by 'job' is undefined. Don't use a reference, copy the actual data!
Try something like this (it's still generic, because JobData is generic):
jobData ThreadPool::getNextJob() // get the data of the next job
{
    jobData job;
    WaitForSingleObject(this->mutex, INFINITE);
    if (!this->jobs.empty())
    {
        job = this->jobs.front();
        this->jobs.pop();
    }
    // we're done with the exclusive part!
    ReleaseMutex(this->mutex);
    return job;
}
Also, while you're adding jobs to the queue you must ALSO lock the mutex, to prevent list corruption. AFAIK std::lists are NOT inherently thread-safe...?
Using operator delete on pointer to void results in undefined behavior according to the specification.
Section 5.3.5, paragraph 3, of the C++ standard draft:
In the first alternative (delete object), if the static type of the operand is different from its dynamic type, the static type shall be a base class of the operand’s dynamic type and the static type shall have a virtual destructor or the behavior is undefined. In the second alternative (delete array) if the dynamic type of the object to be deleted differs from its static type, the behavior is undefined.73)
And corresponding footnote.
This implies that an object cannot be deleted using a pointer of type void*, because there are no objects of type void.
All access to the job queue must be synchronized, i.e. performed only from 1 thread at a time by locking the job queue prior to access. Do you already have a critical section or some similar pattern to guard the shared resource? Synchronization issues often lead to weird behaviour and bugs which are hard to reproduce.
It's hard to give a definitive answer with this amount of code. But generally speaking, multithreaded programming is all about synchronizing access to data that might be accessed from multiple threads. If there is no lock or other synchronization primitive protecting access to the threadpool class itself, then you can potentially have multiple threads reaching your deletion loop at the same time, at which point you're pretty much guaranteed to be double-freeing memory.
The reason you're getting no crash when you delete a job's params at the end of the job function might be because access to a single job's params is already implicitly serialized by your work queue. Or you might just be getting lucky. In either case, it's best to think of locks and synchronization primitives not as protecting code, but as protecting data (I've always thought the term "critical section" was a bit misleading here, as it tends to lead people to think of a section of lines of code rather than in terms of data access). In this case, since you want to access your jobs data from multiple threads, you need to protect it via a lock or some other synchronization primitive.
If you delete an object twice, the second delete is undefined behaviour; typically it fails because that heap block has already been freed.
Now, since you are in a multithreading context, it might be that the deletions are done "almost" in parallel, which can mask the error on the second deletion because the first one is not yet finalized.
Use smart pointers or other RAII to handle your memory.
If you have access to boost or tr1 lib you can do something like this.
class ThreadPool
{
    typedef pair<int, function<void (void)> > Job;

    list<Job> jobList;
    HANDLE mutex;

public:
    void addJob(int jobid, const function<void (void)>& job) {
        jobList.push_back(make_pair(jobid, job));
    }

    Job getNextJob() {
        struct MutexLocker {
            HANDLE& mutex;
            MutexLocker(HANDLE& mutex) : mutex(mutex) {
                WaitForSingleObject(mutex, INFINITE);
            }
            ~MutexLocker() {
                ReleaseMutex(mutex);
            }
        };

        Job job = make_pair(-1, function<void (void)>());
        const MutexLocker locker(this->mutex);
        if (!this->jobList.empty()) {
            job = this->jobList.front();
            this->jobList.pop_front(); // std::list has pop_front, not pop
        }
        return job;
    }
};
void workWithDouble(double value);
void workWithInt(int value);
void workWithValues(int, double);

void test() {
    ThreadPool pool;
    //...
    pool.addJob(0, bind(&workWithDouble, 0.1));
    pool.addJob(1, bind(&workWithInt, 1));
    pool.addJob(2, bind(&workWithValues, 1, 0.1));
}