How do I use an arbitrary string as a lock in C++?

Let's say I have a multithreaded C++ program that handles requests in the form of a function call to handleRequest(string key). Each call to handleRequest occurs in a separate thread, and there are an arbitrarily large number of possible values for key.
I want the following behavior:
Simultaneous calls to handleRequest(key) are serialized when they have the same value for key.
Global serialization is minimized.
The body of handleRequest might look like this:
void handleRequest(string key) {
    KeyLock lock(key);
    // Handle the request.
}
Question: How would I implement KeyLock to get the required behavior?
A naive implementation might start off like this:
KeyLock::KeyLock(string key) {
    global_lock->Lock();
    internal_lock_ = global_key_map[key];
    if (internal_lock_ == NULL) {
        internal_lock_ = new Lock();
        global_key_map[key] = internal_lock_;
    }
    global_lock->Unlock();
    internal_lock_->Lock();
}

KeyLock::~KeyLock() {
    internal_lock_->Unlock();
    // Remove internal_lock_ from global_key_map iff no other threads are waiting for it.
}
...but that requires a global lock at the beginning and end of each request, and the creation of a separate Lock object for each request. If contention is high between calls to handleRequest, that might not be a problem, but it could impose a lot of overhead if contention is low.

You could do something similar to what you have in your question, but instead of a single global_key_map have several (probably in an array or vector) - which one is used is determined by some simple hash function on the string.
That way instead of a single global lock, you spread that out over several independent ones.
This is a pattern that is often used in memory allocators (it is sometimes called lock striping). When a request comes in, something determines which pool the allocation will come from (usually the size of the request, but other parameters can factor in as well), and then only that pool needs to be locked. If an allocation request comes in from another thread that will use a different pool, there's no lock contention.
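For illustration, a minimal sketch of that idea, using std::hash as the "simple hash function" (NUM_SHARDS and the shard layout are illustrative choices, not prescribed by the answer; Lock is the question's per-key lock type):

#include <functional>
#include <map>
#include <mutex>
#include <string>

const size_t NUM_SHARDS = 16;

struct Shard {
    std::mutex map_mutex;                  // guards only this shard's key map
    std::map<std::string, Lock*> key_map;  // per-key locks, as in the question
};

Shard shards[NUM_SHARDS];

Shard& shardFor(const std::string& key) {
    // Two different keys only contend if they hash to the same shard.
    return shards[std::hash<std::string>()(key) % NUM_SHARDS];
}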

It will depend on the platform, but the two techniques that I'd try would be:
Use named mutex/synchronization objects, where the object name = key.
Use filesystem-based locking, where you try to create a non-shareable temporary file with the key name. If it already exists (= already locked), this will fail and you'll have to poll to retry.
Both techniques will depend on the details of your OS. Experiment and see which works.
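For the first technique, a Windows-flavoured sketch (illustrative only: error handling is omitted, and real kernel object names have restrictions, e.g. no backslashes, so the key may need escaping):

#include <windows.h>
#include <string>

void handleRequest(const std::string& key) {
    // The OS serializes everyone who opens a mutex with the same name,
    // even across processes.
    HANDLE h = CreateMutexA(NULL, FALSE, ("myapp_" + key).c_str());
    WaitForSingleObject(h, INFINITE);
    // Handle the request.
    ReleaseMutex(h);
    CloseHandle(h);
}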

Perhaps an std::map<std::string, MutexType> would be what you want, where MutexType is the type of the mutex you want. You will probably have to wrap accesses to the map in another mutex in order to ensure that no other thread is inserting at the same time (and remember to perform the check again after the mutex is locked to ensure that another thread didn't add the key while waiting on the mutex!).
The same principle could apply to any other synchronization method, such as a critical section.
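A minimal sketch of that suggestion, assuming std::mutex as MutexType (std::map is node-based, so its elements have stable addresses and can be locked after the map's own mutex is released):

#include <map>
#include <mutex>
#include <string>

std::mutex map_mutex;                           // guards key_mutexes itself
std::map<std::string, std::mutex> key_mutexes;  // one mutex per key

std::mutex& mutexFor(const std::string& key) {
    std::lock_guard<std::mutex> guard(map_mutex);
    return key_mutexes[key];  // default-constructs the mutex on first use
}

void handleRequest(const std::string& key) {
    std::lock_guard<std::mutex> lock(mutexFor(key));
    // Handle the request. Note this sketch never erases entries, so the
    // map grows with the number of distinct keys ever seen.
}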

Raise granularity and lock entire key-ranges
This is a variation on Mike B's answer, where instead of having several fluid lock maps you have a single fixed array of locks that apply to key-ranges instead of single keys.
Simplified example: create array of 256 locks at startup, then use first byte of key to determine index of lock to be acquired (i.e. all keys starting with 'k' will be guarded by locks[107]).
To sustain optimal throughput you should analyze distribution of keys and contention rate. The benefits of this approach are zero dynamic allocations and simple cleanup; you also avoid two-step locking. The downside is potential contention peaks if key distribution becomes skewed over time.
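For illustration, a minimal sketch of the fixed-array variant (the 256-stripe, first-byte scheme described above):

#include <array>
#include <mutex>
#include <string>

std::array<std::mutex, 256> locks;  // created once at startup, never grows

void handleRequest(const std::string& key) {
    unsigned char idx = key.empty() ? 0 : static_cast<unsigned char>(key[0]);
    std::lock_guard<std::mutex> guard(locks[idx]);
    // Handle the request; all keys sharing a first byte serialize here.
}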

After thinking about it, another approach might go something like this:
In handleRequest, create a Callback that does the actual work.
Create a multimap<string, Callback*> global_key_map, protected by a mutex.
If a thread sees that key is already being processed, it adds its Callback* to the global_key_map and returns.
Otherwise, it calls its callback immediately, and then calls the callbacks that have shown up in the meantime for the same key.
Implemented something like this:
LockAndCall(string key, Callback* callback) {
    global_lock.Lock();
    if (!global_key_map.contains(key)) {
        // Nobody is processing this key yet: insert our callback and
        // become the worker thread for this key.
        iterator iter = global_key_map.insert(key, callback);
        while (true) {
            global_lock.Unlock();
            iter->second->Call();
            global_lock.Lock();
            global_key_map.erase(iter);
            iter = global_key_map.find(key);
            if (iter == global_key_map.end()) {
                global_lock.Unlock();
                return;
            }
        }
    } else {
        // Key is already being processed: queue the callback and return;
        // the worker thread will call it.
        global_key_map.insert(key, callback);
        global_lock.Unlock();
    }
}
This has the advantage of freeing up threads that would otherwise be waiting for a key lock, but apart from that it's pretty much the same as the naive solution I posted in the question.
It could be combined with the answers given by Mike B and Constantin, though.

#include <pthread.h>
#include <list>
#include <map>
#include <string>

using std::list;
using std::map;
using std::string;

/**
 * StringLock class for a string-based locking mechanism.
 * (Event is assumed to be a simple wait/signal wrapper from the
 * author's codebase, e.g. around a condition variable.)
 * e.g. usage:
 *   StringLock strLock;
 *   strLock.Lock("row1");
 *   strLock.UnLock("row1");
 */
class StringLock {
public:
    /**
     * Constructor
     * Initializes the global mutex
     */
    StringLock() {
        pthread_mutex_init(&mtxGlobal, NULL);
    }
    /**
     * Destructor
     * Releases the global mutex
     */
    ~StringLock() {
        pthread_mutex_destroy(&mtxGlobal);
    }
    /**
     * Lock Function
     * Returns immediately if the string is not locked;
     * otherwise waits until it is this thread's turn.
     * @param lockString the string to lock
     */
    void Lock(string lockString) {
        pthread_mutex_lock(&mtxGlobal);
        TWaiter *wtr = new TWaiter;
        wtr->evPtr = NULL;
        wtr->threadId = pthread_self();
        if (lockMap.find(lockString) == lockMap.end()) {
            // Nobody holds this string: create the waiter list and take it.
            TListIds *listId = new TListIds();
            listId->push_back(wtr);
            lockMap[lockString] = listId;
            pthread_mutex_unlock(&mtxGlobal);
        } else {
            // String is held: queue up and block on our own event.
            wtr->evPtr = new Event(false);
            lockMap[lockString]->push_back(wtr);
            pthread_mutex_unlock(&mtxGlobal);
            wtr->evPtr->Wait();
        }
    }
    /**
     * UnLock Function
     * @param lockString the string to unlock
     */
    void UnLock(string lockString) {
        pthread_mutex_lock(&mtxGlobal);
        if (lockMap.find(lockString) != lockMap.end()) {
            TListIds *listID = lockMap[lockString];
            // Free the departing holder's entry (its event may be NULL
            // if it never had to wait).
            TWaiter *done = listID->front();
            listID->pop_front();
            delete done->evPtr;
            delete done;
            if (!listID->empty()) {
                // Wake the next waiter in FIFO order.
                listID->front()->evPtr->Signal();
            } else {
                lockMap.erase(lockString);
                delete listID;
            }
        }
        pthread_mutex_unlock(&mtxGlobal);
    }
protected:
    struct TWaiter {
        Event *evPtr;
        pthread_t threadId;
    };
    StringLock(StringLock &);
    void operator=(StringLock &);
    typedef list<TWaiter *> TListIds;
    typedef map<string, TListIds *> TMapLockWaiters;
private:
    pthread_mutex_t mtxGlobal;
    TMapLockWaiters lockMap;
};

Related

C++ Threading using 2 Containers

I have the following problem. I use a vector that gets filled with values from a temperature sensor; this function runs in one thread. Another thread is responsible for publishing all the values into a database, and runs once every second. The publishing thread locks the vector with a mutex, so the function that fills it with values gets blocked. However, while the publishing thread is using the vector, I want to save the temperature values into another vector so that I don't lose any values while the data is getting published. How do I get around this problem? I thought about using a pointer to the containers and switching it to the other container once the first gets locked, so I can keep saving values, but I don't quite know how.
I tried to add a minimal reproducible example; I hope it kind of explains my situation.
void publish(std::vector<temperature> &inputVector)
{
    //this function would publish the values into a database
    //via mqtt and also runs in a thread.
}

int main()
{
    std::vector<temperature> testVector;
    std::vector<temperature> testVector2;
    while (1)
    {
        //I am repeatedly saving values into the vector.
        //I want to do this in a thread, but if the vector is locked by a mutex
        //I want to switch over to the other vector
        testVector.push_back(testSensor.getValue());
    }
}
Assuming you are using std::mutex, you can use mutex::try_lock on the producer side. Something like this:
while (1)
{
    if (myMutex.try_lock()) {
        // locking succeeded - move all queued values and push the new value
        std::move(testVector2.begin(), testVector2.end(), std::back_inserter(testVector));
        testVector2.clear();
        testVector.push_back(testSensor.getValue());
        myMutex.unlock();
    } else {
        // locking failed - queue the value
        testVector2.push_back(testSensor.getValue());
    }
}
Of course publish() needs to lock the mutex, too.
void publish(std::vector<temperature> &inputVector)
{
    std::lock_guard<std::mutex> lock(myMutex);
    //this function would publish the values into a database
    //via mqtt and also runs in a thread.
}
This seems like the perfect opportunity for an additional (shared) buffer or queue, that's protected by the lock.
main would be essentially as it is now, pushing your new values into the shared buffer.
The other thread would, when it can, lock that buffer and take the new values from it. This should be very fast.
Then, it does not need to lock the shared buffer while doing its database things (which take longer), as it's only working on its own vector during that procedure.
Here's some pseudo-code:
std::mutex pendingTempsMutex;
std::vector<temperature> pendingTemps;

void thread2()
{
    std::vector<temperature> temps;
    while (1)
    {
        // Get new temps if we have any
        {
            std::scoped_lock l(pendingTempsMutex);
            temps.swap(pendingTemps);
        }
        if (!temps.empty())
        {
            publish(temps);
            temps.clear(); // otherwise the old values get swapped back in and republished
        }
    }
}

void thread1()
{
    while (1)
    {
        std::scoped_lock l(pendingTempsMutex);
        pendingTemps.push_back(testSensor.getValue());
        /*
        Or, if getValue() blocks:
        temperature newValue = testSensor.getValue();
        std::scoped_lock l(pendingTempsMutex);
        pendingTemps.push_back(newValue);
        */
    }
}
Usually you'd use a std::queue for pendingTemps, though. I don't think it really matters in this example, because you're always consuming everything in thread 2, but it's more conventional and can be more efficient in some scenarios. It shouldn't cost you much, as it's backed by a std::deque. But you can measure/test to see what's best for you.
This solution is pretty much what you already proposed/explored in the question, except that the producer shouldn't be in charge of managing the second vector.
You can improve it by having thread2 wait to be "informed" that there are new values, with a condition variable, otherwise you're going to be doing a lot of busy-waiting. I leave that as an exercise to the reader ;) There should be an example and discussion in your multi-threaded programming book.
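For illustration, here is a minimal sketch of that condition-variable variant (it reuses temperature, testSensor, and publish from the pseudo-code above; only the condition variable and the wait/notify calls are new):

std::mutex pendingTempsMutex;
std::condition_variable pendingTempsCv;
std::vector<temperature> pendingTemps;

void thread2()
{
    std::vector<temperature> temps;
    while (true)
    {
        {
            std::unique_lock<std::mutex> l(pendingTempsMutex);
            // Sleep until the producer signals that there is data.
            pendingTempsCv.wait(l, [] { return !pendingTemps.empty(); });
            temps.swap(pendingTemps);
        }
        publish(temps);
        temps.clear();
    }
}

void thread1()
{
    while (true)
    {
        temperature newValue = testSensor.getValue();
        {
            std::lock_guard<std::mutex> l(pendingTempsMutex);
            pendingTemps.push_back(newValue);
        }
        pendingTempsCv.notify_one();
    }
}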

I don't understand how optimistic concurrency can be implemented in C++11

I'm trying to implement a protected variable that does not use locks in C++11. I have read a little about optimistic concurrency, but I can't understand how it can be implemented, either in C++ or in any other language.
The way I'm trying to implement the optimistic concurrency is by using a 'last modification id'. The process I'm doing is:
Take a copy of the last modification id.
Modify the protected value.
Compare the local copy of the modification id with the current one.
If the above comparison is true, commit the changes.
The problem I see is that, after comparing the 'last modification ids' (local copy and current one) and before committing the changes, there is no way to ensure that no other threads have modified the value of the protected variable.
Below there is a example of code. Lets suppose that are many threads executing that code and sharing the variable var.
/**
 * This struct is intended to implement a protected variable,
 * but using optimistic concurrency instead of locks.
 */
struct ProtectedVariable final {
    ProtectedVariable() : var(0), lastModificationId(0) { }

    int getValue() const {
        return var.load();
    }

    void setValue(int val) {
        // This method is not atomic; another thread could change the value
        // of var before we are able to increment the 'last modification id'.
        var.store(val);
        lastModificationId.store(lastModificationId.load() + 1);
    }

    size_t getLastModificationId() const {
        return lastModificationId.load();
    }

private:
    std::atomic<int> var;
    std::atomic<size_t> lastModificationId;
};
ProtectedVariable var;

/**
 * Suppose this method writes a value in some sort of database.
 */
int commitChanges(int val, size_t currModifId) {
    // Now, if nobody has changed the value of 'var', commit its value,
    // retry the transaction otherwise.
    if (var.getLastModificationId() == currModifId) {
        // Here is one of the problems. After comparing the value of both ids,
        // another thread could modify the value of 'var', hence I would be
        // performing the commit with a corrupted value.
        var.setValue(val);
        // Again, the same problem as above.
        writeToDatabase(val);
        // Return 'ok' in case everything has gone ok.
        return 0;
    } else {
        // If someone has changed the value of var while calculating
        // and committing it, return an error;
        return -1;
    }
}
/**
 * This method is intended to be atomic, but without using locks.
 */
void modifyVar() {
    // Get the modification id for checking whether or not some
    // thread has modified the value of 'var' before committing it.
    size_t currModifId = var.getLastModificationId();
    // Get a local copy of 'var'.
    int currVal = var.getValue();
    // Perform some operations based on the current value of 'var'.
    int newVal = currVal + 1 * 2 / 3;
    if (commitChanges(newVal, currModifId) != 0) {
        // If someone has changed the value of var while calculating
        // and committing it, retry the transaction.
        modifyVar();
    }
}
I know that the above code is buggy, but I don't understand how to implement something like the above in a correct way, without bugs.
Optimistic concurrency doesn't mean that you don't use the locks, it merely means that you don't keep the locks during most of the operation.
The idea is that you split your modification into three parts:
Initialization, like getting the lastModificationId. This part may need locks, but not necessarily.
Actual computation. All expensive or blocking code goes here (including any disk writes or network code). The results are written in such a way that they do not obscure the previous version. The likely way it works is by storing the new values next to the old ones, indexed by a not-yet-committed version.
Atomic commit. This part is locked, and must be short, simple, and non-blocking. The likely way it works is that it just bumps the version number, after confirming that no other version was committed in the meantime. No database writes at this stage.
The main assumption here is that the computation part is much more expensive than the commit part. If your modification is trivial and the computation cheap, then you can just use a lock, which is much simpler.
Some example code structured into these 3 parts could look like this:
struct Data {
    ...
};

...

std::mutex lock;
volatile const Data* value; // The protected data
volatile int current_value_version = 0;

...

bool modifyProtectedValue() {
    // Initialize.
    int version_on_entry = current_value_version;

    // Compute the new value, using the current value.
    // We don't have any lock here, so it's fine to make heavy
    // computations or block on I/O.
    Data* new_value = new Data;
    compute_new_value(value, new_value);

    // Commit or fail.
    bool success;
    lock.lock();
    if (current_value_version == version_on_entry) {
        value = new_value;
        current_value_version++;
        success = true;
    } else {
        success = false;
    }
    lock.unlock();

    // Roll back in case of failure.
    if (!success) {
        delete new_value;
    }

    // Inform caller about success or failure.
    return success;
}

// It's cleaner to keep retry logic separately.
bool retryModification(int retries = 5) {
    for (int i = 0; i < retries; ++i) {
        if (modifyProtectedValue()) {
            return true;
        }
    }
    return false;
}
// It's cleaner to keep retry logic separately.
bool retryModification(int retries = 5) {
for (int i = 0; i < retries; ++i) {
if (modifyProtectedValue()) {
return true;
}
}
return false;
}
This is a very basic approach, and especially the rollback is trivial. In a real-world example, re-creating the whole Data object (or its counterpart) would likely be infeasible, so the versioning would have to be done somewhere inside, and the rollback could be much more complex. But I hope it shows the general idea.
The key here is acquire-release semantics and test-and-increment. Acquire-release semantics are how you enforce an order of operations. Test-and-increment is how you choose which thread wins in case of a race.
Your problem therefore is the .store(lastModificationId+1). You'll need .fetch_add(1). It returns the old value. If that's not the expected value (from before your read), then you lost the race and retry.
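A minimal sketch of that test-and-increment, reusing the question's lastModificationId (tryCommit is a hypothetical helper; a losing thread's stray increment is harmless for correctness, it only forces other in-flight threads to retry too):

#include <atomic>

std::atomic<size_t> lastModificationId{0};

bool tryCommit(size_t idSeenAtRead) {
    // fetch_add atomically bumps the version and returns the previous value.
    // If that previous value is not the one we saw when reading the data,
    // another thread committed in between: we lost the race and must retry.
    return lastModificationId.fetch_add(1) == idSeenAtRead;
}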
If I understand your question, you mean to make sure var and lastModificationId are either both changed, or neither is.
Why not use std::atomic<T> where T would be structure that hold both the int and the size_t?
struct VarWithModificationId {
    int var;
    size_t lastModificationId;
};

class ProtectedVariable {
private:
    std::atomic<VarWithModificationId> protectedVar;
    // Add your public setter/getter methods here
    // You should be guaranteed that if two threads access protectedVar, they'll
    // each get a 'consistent' view of that variable, but the setter will need
    // to use a lock
};
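As a hedged alternative to the lock mentioned in that comment, the setter could also be a compare-and-swap loop on the combined struct (note that std::atomic<VarWithModificationId> may fall back to an internal lock if the struct is too wide for the platform's lock-free atomics):

void setValue(int val) {
    VarWithModificationId expected = protectedVar.load();
    VarWithModificationId desired;
    do {
        desired.var = val;
        desired.lastModificationId = expected.lastModificationId + 1;
        // On failure, compare_exchange_weak reloads 'expected', so we
        // recompute 'desired' against the latest value and retry.
    } while (!protectedVar.compare_exchange_weak(expected, desired));
}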
Optimistic concurrency is used in database engines when it's expected that different users will rarely access the same data. It could go like this:
The first user reads the data and a timestamp. The user handles the data for some time, then checks that the timestamp in the DB hasn't changed since the data was read; if it hasn't, the user updates the data and the timestamp.
But internally the DB engine uses locks for the update anyway: during this lock it checks whether the timestamp has changed and, if it hasn't, updates the data. It's just that the time for which the data is locked is smaller than with pessimistic concurrency. And you still need to use some kind of locking.

Concurrently processing data. What do I need to watch out for?

I have a routine that is meant to load and parse data from a file. There is a possibility that the data from the same file might need to be retrieved from two places at once, i.e. during a background caching process and from a user request.
Specifically I am using C++11 thread and mutex libraries. We compile with Visual C++ 11 (aka 2012), so are limited by whatever it lacks.
My naive implementation went something like this:
map<wstring, weak_ptr<DataStruct>> data_cache;
mutex data_cache_mutex;

shared_ptr<DataStruct> ParseDataFile(wstring file_path) {
    auto data_ptr = make_shared<DataStruct>();
    /* Parses and processes the data, may take a while */
    return data_ptr;
}

shared_ptr<DataStruct> CreateStructFromData(wstring file_path) {
    lock_guard<mutex> lock(data_cache_mutex);
    auto cache_iter = data_cache.find(file_path);
    if (cache_iter != end(data_cache)) {
        auto data_ptr = cache_iter->second.lock();
        if (data_ptr)
            return data_ptr;
        // reference died, remove it
        data_cache.erase(cache_iter);
    }
    auto data_ptr = ParseDataFile(file_path);
    if (data_ptr)
        data_cache.emplace(make_pair(file_path, data_ptr));
    return data_ptr;
}
My goals were two-fold:
Allow multiple threads to load separate files concurrently
Ensure that a file is only processed once
The problem with my current approach is that it doesn't allow concurrent parsing of multiple files at all. If I understand correctly what will happen, the threads will each hit the lock and end up processing linearly, one at a time. The order in which the threads pass through the lock may change from run to run, but the end result is the same.
One solution I've considered was to create a second map:
map<wstring, mutex> data_parsing_mutex;

shared_ptr<DataStruct> ParseDataFile(wstring file_path) {
    lock_guard<mutex> lock(data_parsing_mutex[file_path]);
    /* etc. */
    data_parsing_mutex.erase(file_path);
}
But now I have to be concerned with how data_parsing_mutex is being updated. So I guess I need another mutex?
map<wstring, mutex> data_parsing_mutex;
mutex data_parsing_mutex_mutex;

shared_ptr<DataStruct> ParseDataFile(wstring file_path) {
    unique_lock<mutex> super_lock(data_parsing_mutex_mutex);
    lock_guard<mutex> lock(data_parsing_mutex[file_path]);
    super_lock.unlock();
    /* etc. */
    super_lock.lock();
    data_parsing_mutex.erase(file_path);
}
In fact, looking at this, it's not necessarily going to avoid double-processing a file if the background process hasn't completed it when the user requests it, unless I check the cache yet again.
But by now my spidey senses are saying There must be a better way. Is there? Would futures, promises, or atomics help me at all here?
From what you described, it sounds like you're trying to do a form of lazy initialization of the DataStruct using a thread pool, along with a reference counted cache. std::async should be able to provide a lot of the dispatch and synchronization necessary for something like this.
Using std::async, the code would look something like this...
map<wstring, weak_ptr<DataStruct>> cache;
map<wstring, shared_future<shared_ptr<DataStruct>>> pending;
mutex cache_mutex, pending_mutex;

shared_ptr<DataStruct> ParseDataFromFile(wstring file) {
    auto data_ptr = make_shared<DataStruct>();
    /* Parses and processes the data, may take a while */
    return data_ptr;
}

shared_ptr<DataStruct> CreateStructFromData(wstring file) {
    shared_future<shared_ptr<DataStruct>> pf;
    shared_ptr<DataStruct> ce;
    {
        lock_guard<mutex> lock(cache_mutex);
        auto ci = cache.find(file);
        if (!(ci == cache.end() || ci->second.expired()))
            return ci->second.lock();
    }
    {
        lock_guard<mutex> lock(pending_mutex);
        auto fi = pending.find(file);
        if (fi == pending.end()) {
            pf = async(ParseDataFromFile, file).share();
            pending.insert(fi, make_pair(file, pf));
        } else {
            pf = fi->second;
        }
    }
    pf.wait();
    ce = pf.get();
    {
        lock_guard<mutex> lock(cache_mutex);
        auto ci = cache.find(file);
        if (ci == cache.end() || ci->second.expired())
            cache.insert(ci, make_pair(file, ce));
    }
    {
        lock_guard<mutex> lock(pending_mutex);
        auto pi = pending.find(file);
        if (pi != pending.end())
            pending.erase(pi);
    }
    return ce;
}
This can probably be optimized a bit, but the general idea should be the same.
On a typical computer there is little point in trying to load files concurrently, since disk access will be the bottleneck. Instead, it's better to have a single thread load files (or use asynchronous I/O) and dish out the parsing to a thread pool. Then store the results in a shared container.
Regarding preventing double work, you should consider if this is really necessary. If you are only doing this out of premature optimization, you'd probably make users happier by focussing on making the program responsive, rather than efficient. That is, make sure the user gets what they ask for quickly, even if it means doing double work.
OTOH, if there is a technical reason for not parsing a file twice, you can keep track of the status of each file (loading, parsing, parsed) in the shared container.
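For illustration, a rough sketch of that status tracking, reusing the question's map and mutex (FileStatus and CacheEntry are hypothetical names):

enum class FileStatus { Loading, Parsing, Parsed };

struct CacheEntry {
    FileStatus status;
    weak_ptr<DataStruct> data;  // set once status reaches FileStatus::Parsed
};

map<wstring, CacheEntry> data_cache;  // guarded by data_cache_mutex, as before

// A thread that finds an entry in the Loading or Parsing state can wait
// (e.g. on a condition variable) instead of parsing the file a second time.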

Updating cache without blocking

I currently have a program that has a cache like mechanism. I have a thread listening for updates from another server to this cache. This thread will update the cache when it receives an update. Here is some pseudo code:
void cache::update_cache()
{
    cache_ = new std::map<std::string, value>();
    while (true)
    {
        if (recv().compare("update") == 0)
        {
            std::map<std::string, value> *new_info = new std::map<std::string, value>();
            std::map<std::string, value> *tmp;
            //Get new info, store in new_info
            tmp = cache_;
            cache_ = new_info;
            delete tmp;
        }
    }
}

std::map<std::string, value> *cache::get_cache()
{
    return cache_;
}
cache_ is being read from many different threads concurrently. I believe how I have it here I will run into undefined behavior if one of my threads call get_cache(), then my cache updates, then the thread tries to access the stored cache.
I am looking for a way to avoid this problem. I know I could use a mutex, but I would rather not block reads from happening as they have to be as low latency as possible, but if need be, I can go that route.
I was wondering if this would be a good use case for a unique_ptr. Is my understanding correct that, if a thread calls get_cache and is returned a unique_ptr instead of a raw pointer, then once all threads that have the old version of the cache are finished with it (i.e. leave scope), the object will be deleted?
Is using a unique_ptr the best option for this case, or is there another option that I am not thinking of?
Any input will be greatly appreciated.
Edit:
I believe I made a mistake in my OP. I meant to use and pass a shared_ptr not a unique_ptr for cache_. And when all threads are finished with cache_ the shared_ptr should delete itself.
A little about my program: My program is a webserver that will be using this information to decide what information to return. It is fairly high throughput(thousands of req/sec) Each request queries the cache once, so telling my other threads when to update is no problem. I can tolerate slightly out of date information, and would prefer that over blocking all of my threads from executing if possible. The information in the cache is fairly large, and I would like to limit any copies on value because of this.
update_cache is only run once. It is run in a thread that just listens for an update command and runs the code.
I feel there are multiple issues:
1) Do not leak memory: to that end, never use "delete" in your code and stick with unique_ptr (or shared_ptr in specific cases).
2) Protect accesses to shared data, either by locking (mutex) or with a lock-free mechanism (std::atomic).
class Cache {
    using Map = std::map<std::string, value>;
    std::unique_ptr<Map> m_cache;
    std::mutex m_cacheLock;
public:
    void update_cache()
    {
        while (true)
        {
            if (recv().compare("update") == 0)
            {
                std::unique_ptr<Map> new_info { new Map };
                //Get new info, store in new_info
                {
                    std::lock_guard<std::mutex> lock{m_cacheLock};
                    using std::swap;
                    swap(m_cache, new_info);
                }
            }
        }
    }
};
Note: I don't like update_cache() being part of a public interface for the cache as it contains an infinite loop. I would probably externalize the loop with the recv and have a:
void update_cache(std::unique_ptr<Map> new_info)
{
    { // This inner brace is not useless: we don't need to keep the lock during deletion
        std::lock_guard<std::mutex> lock{m_cacheLock};
        using std::swap;
        swap(m_cache, new_info);
    }
}
Now for the reading to the cache, use proper encapsulation and don't leave the pointer to the member map escape:
value get(const std::string &key)
{
    // lock, fetch, and return.
    // Depending on value type, you might want to allocate memory
    // before locking
}
Using this signature you have to throw an exception if the value is not present in the cache, another option is to return something like a boost::optional.
Overall you can keep a low latency (everything is relative, I don't know your use case) if you take care of doing costly operations (memory allocation for instance) outside of the locking section.
shared_ptr is very reasonable for this purpose, C++11 has a family of functions for handling shared_ptr atomically. If the data is immutable after creation, you won't even need any additional synchronization:
class cache {
public:
    using map_t = std::map<std::string, value>;
    void update_cache();
    std::shared_ptr<const map_t> get_cache() const;
private:
    std::shared_ptr<const map_t> cache_;
};

void cache::update_cache()
{
    while (true)
    {
        if (recv() == "update")
        {
            auto new_info = std::make_shared<map_t>();
            // Get new info, store in new_info
            // Make immutable & publish
            std::atomic_store(&cache_,
                              std::shared_ptr<const map_t>{std::move(new_info)});
        }
    }
}

auto cache::get_cache() const -> std::shared_ptr<const map_t> {
    return std::atomic_load(&cache_);
}
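For illustration, hypothetical reader-side usage of that class: each request takes an immutable snapshot, and the old map stays alive until the last reader releases its shared_ptr, so updates never block readers:

void handle_request(const cache& c, const std::string& key)
{
    auto snapshot = c.get_cache();  // atomic load; no lock held afterwards
    auto it = snapshot->find(key);
    if (it != snapshot->end()) {
        // Use it->second; the snapshot cannot change underneath us.
    }
}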

Deleting pointer sometimes results in heap corruption

I have a multithreaded application that runs using a custom thread pool class. The threads all execute the same function, with different parameters.
These parameters are given to the threadpool class the following way:
// jobParams is a struct of int, double, etc...
jobParams* params = new jobParams;
params->value1 = 2;
params->value2 = 3;
int jobId = 0;
threadPool.addJob(jobId, params);
As soon as a thread has nothing to do, it gets the next parameters and runs the job function. I decided to take care of the deletion of the parameters in the threadpool class:
ThreadPool::~ThreadPool() {
    for (int i = 0; i < this->jobs.size(); ++i) {
        delete this->jobs[i].params;
    }
}
However, when doing so, I sometimes get a heap corruption error:
Invalid Address specified to RtlFreeHeap
The strange thing is that in one case it works perfectly, but in another program it crashes with this error. I tried deleting the pointer at other places: in the thread after the execution of the job function (I get the same heap corruption error) or at the end of the job function itself (no error in this case).
I don't understand how deleting the same pointers (I checked, the addresses are the same) from different places changes anything. Does this have anything to do with the fact that it's multithreaded?
I do have a critical section that handles the access to the parameters. I don't think the problem is about synchronized access. Anyway, the destructor is called only once all threads are done, and I don't delete any pointer anywhere else. Can pointers be deleted automatically?
As for my code. The list of jobs is a queue of a structure, composed of the id of a job (used to be able to get the output of a specific job later) and the parameters.
getNextJob() is called by the threads (they have a pointer to the ThreadPool) each time they finish executing their last job.
void ThreadPool::addJob(int jobId, void* params) {
    jobData job; // jobData is a simple struct { int, void* }
    job.ID = jobId;
    job.params = params;
    // insert parameters in the list
    this->jobs.push(job);
}

jobData* ThreadPool::getNextJob() {
    // get the data of the next job
    jobData* job = NULL;
    // we don't want to start a same job twice,
    // so we make sure that we are only one at a time in this part
    WaitForSingleObject(this->mutex, INFINITE);
    if (!this->jobs.empty())
    {
        job = &(this->jobs.front());
        this->jobs.pop();
    }
    // we're done with the exclusive part !
    ReleaseMutex(this->mutex);
    return job;
}
Let's turn this on its head: Why are you using pointers at all?
class Params
{
    int value1, value2; // etc...
};

class ThreadJob
{
    int jobID; // or whatever...
    Params params;
};

class ThreadPool
{
    std::list<ThreadJob> jobs;

    void addJob(int job, const Params & p)
    {
        ThreadJob j(job, p);
        jobs.push_back(j);
    }
};
No new, delete or pointers... Obviously some of the implementation details may be cocked, but you get the overall picture.
Thanks for the extra code. Now we can see a problem in getNextJob:

    if (!this->jobs.empty())
    {
        job = &(this->jobs.front());
        this->jobs.pop();

After the "pop", the memory pointed to by 'job' is undefined. Don't keep a pointer into the queue, copy the actual data!
Try something like this (it's still generic, because JobData is generic):
jobData ThreadPool::getNextJob() // get the data of the next job
{
    jobData job;
    WaitForSingleObject(this->mutex, INFINITE);
    if (!this->jobs.empty())
    {
        job = (this->jobs.front());
        this->jobs.pop();
    }
    // we're done with the exclusive part !
    ReleaseMutex(this->mutex);
    return job;
}
Also, while you're adding jobs to the queue you must ALSO lock the mutex, to prevent list corruption. AFAIK std::lists are NOT inherently thread-safe...?
Using operator delete on pointer to void results in undefined behavior according to the specification.
Chapter 5.3.5 of the draft of the C++ specification. Paragraph 3.
In the first alternative (delete object), if the static type of the operand is different from its dynamic type, the static type shall be a base class of the operand’s dynamic type and the static type shall have a virtual destructor or the behavior is undefined. In the second alternative (delete array) if the dynamic type of the object to be deleted differs from its static type, the behavior is undefined.73)
And corresponding footnote.
This implies that an object cannot be deleted using a pointer of type void* because there are no objects of type void.
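Applied to the question's destructor, a minimal sketch of the corresponding fix (assuming the queued parameters really are jobParams objects):

// Cast back to the real type before deleting; deleting through void* is UB.
delete static_cast<jobParams*>(this->jobs[i].params);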
All access to the job queue must be synchronized, i.e. performed only from 1 thread at a time by locking the job queue prior to access. Do you already have a critical section or some similar pattern to guard the shared resource? Synchronization issues often lead to weird behaviour and bugs which are hard to reproduce.
It's hard to give a definitive answer with this amount of code. But generally speaking, multithreaded programming is all about synchronizing access to data that might be accessed from multiple threads. If there is no lock or other synchronization primitive protecting access to the threadpool class itself, then you can potentially have multiple threads reaching your deletion loop at the same time, at which point you're pretty much guaranteed to be double-freeing memory.
The reason you're getting no crash when you delete a job's params at the end of the job function might be that access to a single job's params is already implicitly serialized by your work queue. Or you might just be getting lucky. In either case, it's best to think of locks and synchronization primitives not as protecting code, but as protecting data (I've always thought the term "critical section" is a bit misleading here, as it tends to make people think of a 'section of lines of code' rather than of data access). In this case, since you want to access your jobs data from multiple threads, you need to protect it via a lock or some other synchronization primitive.
If you try to delete an object twice, the second delete will fail, because that heap block has already been freed. This is the expected behavior.
Now, since you are in a multithreading context... it might be that the deletions are done "almost" in parallel, which might mask the error on the second deletion, because the first one is not yet finalized.
Use smart pointers or other RAII to handle your memory.
If you have access to boost or tr1 lib you can do something like this.
class ThreadPool
{
    typedef pair<int, function<void (void)> > Job;
    list<Job> jobList;
    HANDLE mutex;
public:
    void addJob(int jobid, const function<void (void)>& job) {
        jobList.push_back( make_pair(jobid, job) );
    }

    Job getNextJob() {
        struct MutexLocker {
            HANDLE& mutex;
            MutexLocker(HANDLE& mutex) : mutex(mutex) {
                WaitForSingleObject(mutex, INFINITE);
            }
            ~MutexLocker() {
                ReleaseMutex(mutex);
            }
        };

        Job job = make_pair(-1, function<void (void)>());
        const MutexLocker locker(this->mutex);
        if (!this->jobList.empty()) {
            job = this->jobList.front();
            this->jobList.pop_front();
        }
        return job;
    }
};
void workWithDouble( double value );
void workWithInt( int value );
void workWithValues( int, double);

void test() {
    ThreadPool pool;
    //...
    pool.addJob( 0, bind(&workWithDouble, 0.1));
    pool.addJob( 1, bind(&workWithInt, 1));
    pool.addJob( 2, bind(&workWithValues, 1, 0.1));
}