How to use a lock to manage access to different types of resources? - concurrency

I need to synchronize execution of a piece of code for several different types of resources.
For a given resource type, only one thread should be able to execute the code at a time. The same piece of code can be executed by multiple threads as long as each thread is working on a different type of resource.
Let me know if the following approach is right or wrong.
The function GetResourceTypeLockObject returns a different lock object for each type passed as a parameter. Each object is a previously created readonly object.
private object GetResourceTypeLockObject(TypeEnum type)
{
    if (type == TypeEnum.Type1)
        return type1LockObject;
    else if (type == TypeEnum.Type2)
        return type2LockObject;
    else
        return type3LockObject;
}

private void WorkOnResource(TypeEnum type)
{
    var lockObject = GetResourceTypeLockObject(type);
    lock (lockObject)
    {
        // Manipulate resource of the given type
    }
}
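In C++ terms, the same pattern looks like this (a minimal sketch with illustrative names): one std::mutex per resource type, selected by the enum and held for the duration of the work, so threads serialize within a type but never across types:

    #include <mutex>

    enum class ResourceType { Type1, Type2, Type3 };

    // One pre-created mutex per resource type.
    std::mutex type1Mutex, type2Mutex, type3Mutex;

    std::mutex& GetResourceTypeMutex(ResourceType type)
    {
        switch (type)
        {
        case ResourceType::Type1: return type1Mutex;
        case ResourceType::Type2: return type2Mutex;
        default:                  return type3Mutex;
        }
    }

    void WorkOnResource(ResourceType type)
    {
        // Threads working on the same type serialize here;
        // threads working on different types do not contend.
        std::lock_guard<std::mutex> lock(GetResourceTypeMutex(type));
        // Manipulate resource of the given type
    }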

Related

Analyzing the concurrency::task() API & why do we need it?

I'm trying to understand the syntax of concurrency::task in the code snippet below.
How do we analyze this:
What is "getFileOperation" here? Is it an object of type StorageFile? What does the "then" keyword mean here? Why is there a "{" after then(...)?
Also, why do we need this concurrency::task().then() pattern? What is the use case?
concurrency::task<Windows::Storage::StorageFile^> getFileOperation(installFolder->GetFileAsync("images\\test.png"));
getFileOperation.then([](Windows::Storage::StorageFile^ file) { /* ... */ });
The full example, taken from the MSDN concurrency::task API documentation:
void MainPage::DefaultLaunch()
{
    auto installFolder = Windows::ApplicationModel::Package::Current->InstalledLocation;
    concurrency::task<Windows::Storage::StorageFile^> getFileOperation(installFolder->GetFileAsync("images\\test.png"));
    getFileOperation.then([](Windows::Storage::StorageFile^ file)
    {
        if (file != nullptr)
        {
            // Set the option to show the picker
            auto launchOptions = ref new Windows::System::LauncherOptions();
            launchOptions->DisplayApplicationPicker = true;

            // Launch the retrieved file
            concurrency::task<bool> launchFileOperation(Windows::System::Launcher::LaunchFileAsync(file, launchOptions));
            launchFileOperation.then([](bool success)
            {
                if (success)
                {
                    // File launched
                }
                else
                {
                    // File launch failed
                }
            });
        }
        else
        {
            // Could not find file
        }
    });
}
getFileOperation is an object that will return a StorageFile^ (or an error) at some point in the future. It is a C++ task<T> wrapper around the WinRT IAsyncOperation<T> object returned from GetFileAsync.
The implementation of GetFileAsync can (but isn't required to) execute on a different thread, allowing the calling thread to continue doing other work (like animating the UI or responding to user input).
The then method lets you pass a continuation function that will be called once the asynchronous operation has completed. In this case you're passing a lambda (an inline anonymous function), which is introduced by the [] square brackets, followed by the lambda's parameter list (a StorageFile^, the object that GetFileAsync will return), and then the function body. That body will execute once the GetFileAsync operation completes its work, sometime in the future.
The code inside the continuation function passed to then typically (but not always) executes after the code that follows the call to create_task() (or, in your case, the task constructor).
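A minimal non-WinRT sketch of the same ordering behaviour, using concurrency::create_task from the same PPL headers (the lambda and the printed messages are illustrative):

    #include <ppltasks.h>
    #include <iostream>

    void Demo()
    {
        // Wrap a piece of work in a task; it may run on another thread.
        auto t = concurrency::create_task([] { return 42; });

        // The continuation runs only after the task has completed.
        t.then([](int result)
        {
            std::cout << "continuation sees " << result << std::endl;
        });

        // This line typically (but not always) runs before the continuation.
        std::cout << "after then()" << std::endl;
    }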

Multithreaded function restart pattern/solution

I have the following scenario:
One thread is iterating over a list data structure in method A of class X. That data structure is a cache. At any time, we can get a call to method B of class X saying our cache is out of date. In that case, we need to restart method A if we are currently inside it, since iterating over that data structure could cause us to find data that is no longer present. We can count on method B not being called twice at the same time (method A will have time to complete).
Is this possible? I am working with C++. Note that simply locking the cache is not enough: if we lock the cache and then get a call saying the cache is out of date, we need to restart method A right away for correct behaviour.
This wouldn't work correctly, but I will attempt to show what I need:
class X
{
    Method A
    {
        for each item in dataStructure
        {
            // do processing
            // check if our cache is out of date
            if (mRestart)
            {
                while (!mReadyToRestart)
                    ; // wait
                mRestart = false;
                mReadyToRestart = false;
                // Break, and call something that will re-invoke this function.
            }
        }
        return result; // returning means we never needed to restart
    }

    Method B
    {
        mRestart = true;
        // Do processing
        mReadyToRestart = true;
    }

    bool mRestart;        // Init to false in constructor
    bool mReadyToRestart; // Init to false in constructor
}
You need to use synchronization mechanisms to protect the mRestart and mReadyToRestart members, and the data structures you're working on, from concurrent access.
Depending on your particular needs, OS and build environment, you could use the C++11 standard mutexes and condition variables, lower-level OS primitives, or a framework (e.g. boost::thread) to realize this in C++.
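A minimal C++11 sketch of one way to implement this (illustrative names; the cache is reduced to a vector of ints): Method B sets an atomic flag after updating the cache, and Method A polls the flag on every iteration and starts over when it is set:

    #include <atomic>
    #include <mutex>
    #include <vector>

    class X
    {
    public:
        void MethodA()
        {
            bool needRestart;
            do
            {
                needRestart = false;
                mRestart = false;
                std::size_t i = 0;
                for (;;)
                {
                    int item;
                    {
                        // Hold the lock only long enough to read one element.
                        std::lock_guard<std::mutex> lock(mCacheMutex);
                        if (i >= mCache.size())
                            break;
                        item = mCache[i++];
                    }
                    (void)item; // ... process 'item' here ...
                    if (mRestart) // the cache was invalidated mid-iteration
                    {
                        needRestart = true; // start over on the updated cache
                        break;
                    }
                }
            } while (needRestart);
        }

        void MethodB()
        {
            {
                std::lock_guard<std::mutex> lock(mCacheMutex);
                // ... update the cache ...
            }
            mRestart = true; // tell MethodA to start over
        }

    private:
        std::vector<int> mCache;
        std::mutex mCacheMutex;
        std::atomic<bool> mRestart{false};
    };

Note that the restart is cooperative: Method A notices the flag at the next iteration boundary rather than being interrupted mid-element.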

Is it safe to modify data of a pointer in a vector from another thread?

Things seem to be working, but I'm unsure if this is the best way to go about it.
Basically I have an object which does asynchronous retrieval of data. This object has a vector of pointers which are allocated and de-allocated on the main thread. Using boost functions, a process-results callback is bound to one of the pointers in this vector. When it fires it will be running on some arbitrary thread and will modify the data behind the pointer.
Now I have critical sections around the parts that push into the vector and erase from it, in case the async retrieval object receives more requests, but I'm wondering if I need some kind of guard in the callback that modifies the pointer data as well.
Hopefully this slimmed-down pseudocode makes things clearer:
class CAsyncRetriever
{
    // typedefs of boost functions

    class DataObject
    {
        // methods and members
    };

public:
    // Start single async retrieve with completion callback
    void Start(SomeArgs)
    {
        SetupRetrieve(SomeArgs);
        LaunchRetrieves();
    }

protected:
    void SetupRetrieve(SomeArgs)
    {
        // ...
        { // scope for data lock
            boost::lock_guard<boost::mutex> lock(m_dataMutex);
            m_inProgress.push_back(SmartPtr<DataObject>(new DataObject));
            m_callback = boost::bind(&CAsyncRetriever::ProcessResults, this, _1, m_inProgress.back());
        }
        // ...
    }

    void ProcessResults(DataObject* data)
    {
        // CALLED ON ANOTHER THREAD ... IS THIS SAFE?
        data->m_SomeMember.SomeMethod();
        data->m_SomeOtherMember = SomeStuff;
    }

    void Cleanup()
    {
        // ...
        { // scope for data lock
            boost::lock_guard<boost::mutex> lock(m_dataMutex);
            while (!m_inProgress.empty() && m_inProgress.front()->IsComplete())
                m_inProgress.erase(m_inProgress.begin());
        }
        // ...
    }

private:
    std::vector<SmartPtr<DataObject>> m_inProgress;
    boost::mutex m_dataMutex;
    // other members
};
Edit: This is the actual code for the ProcessResults callback (plus comments for your benefit):
void ProcessResults(CRetrieveResults* pRetrieveResults, CRetData* data)
{
    // pRetrieveResults is delayed binding that the server passes in when invoking the callback in the thread pool
    // data is a raw pointer to a ref-counted object in the main thread's vector (the DataObject* in question)

    // if there was an error, set the code on the atomic int in the object
    data->m_nErrorCode.Store_Release(pRetrieveResults->GetErrorCode());

    // generic iterator of results bindings for a generic storage class item
    TPackedDataIterator<GenItem::CBind> dataItr(&pRetrieveResults->m_DataIter);

    // namespace function which will iterate results and initialize generic storage
    // (potentially time consuming depending on the number of results and the number
    // of columns bound in the storage class definition, e.g. about 8 seconds for a
    // million equipment items in release)
    GenericStorage::InitializeItems<GenItem>(&data->m_items, dataItr, pRetrieveResults->m_nTotalResultsFound);

    // atomic uint32_t that is incremented when kicking off an async retrieve
    m_nStarted.Decrement(); // this one is done processing

    // boost function completion callback bound to the interface that requested results
    data->m_complete(data->m_items);
}
As it stands, it appears that the Cleanup code can destroy an object for which a callback to ProcessResults is in flight. That will cause problems when you dereference the pointer in the callback.
My suggestion would be to extend the semantics of your m_dataMutex to encompass the callback, though if the callback is long-running, or can happen inline within SetupRetrieve (sometimes this does happen, though here you state the callback is on a different thread, in which case you are OK), then things are more complex. Currently m_dataMutex is a bit confused about whether it controls access to the vector, its contents, or both. With its scope clarified, ProcessResults could then be enhanced to verify the validity of its payload within the lock.
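Another way to make the object's lifetime explicit, sketched under the assumption that SmartPtr is a shared-ownership pointer (boost::shared_ptr is used as a stand-in below, and ProcessResults would then receive the smart pointer from the binder instead of a raw DataObject*): bind the smart pointer by value, so the bound copy co-owns the object for the callback's duration even if Cleanup() erases it from m_inProgress first:

    void SetupRetrieve(SomeArgs)
    {
        boost::shared_ptr<DataObject> obj(new DataObject);
        {
            boost::lock_guard<boost::mutex> lock(m_dataMutex);
            m_inProgress.push_back(obj);
        }
        // The copy of 'obj' captured by the binder keeps the DataObject
        // alive until the callback has finished running.
        m_callback = boost::bind(&CAsyncRetriever::ProcessResults, this, _1, obj);
    }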
No, it isn't safe.
ProcessResults operates on the data structure passed to it through DataObject. That means you have state shared between different threads, and if both threads operate on the data structure concurrently, you might have some trouble coming your way.
Updating the pointer itself should be an atomic operation, but you can use InterlockedExchangePointer (on Windows) to be sure. I'm not sure what the Linux equivalent would be.
The only consideration then is whether one thread is using an obsolete pointer. Does the other thread delete the object pointed to by the original pointer? If so, you have a definite problem.
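For what it's worth, std::atomic is a portable way to get the same guarantee as InterlockedExchangePointer (a minimal sketch with illustrative names; it deliberately leaves open the question of when the old object may be deleted):

    #include <atomic>

    struct DataObject { /* ... */ };

    std::atomic<DataObject*> g_current{nullptr};

    void Publish(DataObject* fresh)
    {
        // Atomically swap in the new pointer, like InterlockedExchangePointer.
        DataObject* old = g_current.exchange(fresh);
        // 'old' may still be in use by another thread; deciding when it is
        // safe to delete it is exactly the hard part of the original question.
        (void)old;
    }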

Difficult concurrent design

I have a class called Root which serves as a kind of phonebook for dynamic method calls: it holds a dictionary of URL keys pointing to objects. When a command wants to execute a given method, it calls the Root instance with a URL and some parameters:
root_->call("/some/url", ...);
Actually, the call method in Root looks close to this:
// Version 0
const Value call(const Url &url, const Value &val) {
    // A. find object
    if (!objects_.get(url.path(), &target))
        return ErrorValue(NOT_FOUND_ERROR, url.path());

    // B. trigger the object's method
    return target->trigger(val);
}
From the code above, you can see that this "call" method is not thread-safe: the "target" object could be deleted between A and B, and we have no guarantee that the "objects_" member (a dictionary) is not altered while we read it.
The first solution that occurred to me was:
// Version I
const Value call(const Url &url, const Value &val) {
    // Lock the Root object with a mutex
    ScopedLock lock(mutex_);

    // A. find object
    if (!objects_.get(url.path(), &target))
        return ErrorValue(NOT_FOUND_ERROR, url.path());

    // B. trigger the object's method
    return target->trigger(val);
}
This is fine until "target->trigger(val)" is a method that needs to alter Root, either by changing an object's URL or by inserting new objects. Narrowing the lock's scope and using a read/write (RW) mutex can help (there are far more reads than writes on Root):
// Version II
const Value call(const Url &url, const Value &val) {
    // A. find object
    {
        // Use a RW lock with a smaller scope
        ScopedRead lock(mutex_);
        if (!objects_.get(url.path(), &target))
            return ErrorValue(NOT_FOUND_ERROR, url.path());
    }

    // ? What happens to 'target' here ?

    // B. trigger the object's method
    return target->trigger(val);
}
What happens to 'target'? How do we ensure it won't be deleted between finding it and calling it?
Some ideas: object deletion could be postponed via a message queue in Root. But then we would need another RW mutex, read-locked over the full method scope to block deletion, plus a separate thread to process the delete queue.
All this seems very convoluted to me, and I'm not sure whether concurrent design has to look like this or whether I just don't have the right ideas.
PS: the code is part of an open source project called oscit (OpenSoundControl it).
To avoid the deletion of 'target', I had to write a thread-safe reference-counted smart pointer. It is not that hard to do. The only thing you need to ensure is that the reference count is accessed within a critical section. See this post for more information.
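In sketch form, with std::shared_ptr standing in for such a pointer (and assuming objects_ can store and return std::shared_ptr<Object>, where Object is the illustrative element type), copying the pointer under the read lock keeps target alive after the lock is released:

    // Version III (sketch)
    const Value call(const Url &url, const Value &val) {
        std::shared_ptr<Object> target;
        {
            ScopedRead lock(mutex_);
            if (!objects_.get(url.path(), &target))
                return ErrorValue(NOT_FOUND_ERROR, url.path());
        }
        // The local copy co-owns the object, so a concurrent removal from
        // objects_ cannot delete it before trigger() returns.
        return target->trigger(val);
    }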
You are on the wrong track with this. Keep in mind: you can't lock data, you can only block code. You cannot protect the "objects_" member with a mutex known only to the call() method. You need the exact same mutex in the code that alters the objects collection: it must block that code while another thread is executing the call() method. The mutex must be defined at least at class scope.
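For instance (a sketch; the mutator names and dictionary methods are illustrative), every method of Root that alters objects_ must take a write lock on the very same mutex_ that call() read-locks:

    // Any mutation of objects_ write-locks the same mutex_ used by call().
    void Root::register_object(const std::string &path, Object *object) {
        ScopedWrite lock(mutex_); // blocks while any call() holds a read lock
        objects_.set(path, object);
    }

    void Root::unregister_object(const std::string &path) {
        ScopedWrite lock(mutex_);
        objects_.remove(path);
    }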

Deleting a pointer sometimes results in heap corruption

I have a multithreaded application that runs using a custom thread pool class. The threads all execute the same function, with different parameters.
These parameters are given to the threadpool class the following way:
// jobParams is a struct of int, double, etc...
jobParams* params = new jobParams;
params->value1 = 2;
params->value2 = 3;
int jobId = 0;
threadPool.addJob(jobId, params);
As soon as a thread has nothing to do, it gets the next parameters and runs the job function. I decided to take care of the deletion of the parameters in the threadpool class:
ThreadPool::~ThreadPool() {
    while (!this->jobs.empty()) {
        delete this->jobs.front().params;
        this->jobs.pop();
    }
}
However, when doing so, I sometimes get a heap corruption error:
Invalid Address specified to RtlFreeHeap
The strange thing is that in one case it works perfectly, but in another program it crashes with this error. I tried deleting the pointers in other places: in the thread, after the execution of the job function (same heap corruption error), or at the end of the job function itself (no error in this case).
I don't understand how deleting the same pointers (I checked, the addresses are the same) from different places changes anything. Does this have anything to do with it being multithreaded?
I do have a critical section that handles access to the parameters, so I don't think the problem is synchronized access. Anyway, the destructor is called only once all threads are done, and I don't delete any pointer anywhere else. Can a pointer be deleted automatically?
As for my code: the list of jobs is a queue of a structure composed of the id of a job (used to be able to get the output of a specific job later) and the parameters.
getNextJob() is called by the threads (they have a pointer to the ThreadPool) each time they finish executing their last job.
void ThreadPool::addJob(int jobId, void* params) {
    jobData job; // jobData is a simple struct { int, void* }
    job.ID = jobId;
    job.params = params;
    // insert parameters in the list
    this->jobs.push(job);
}

jobData* ThreadPool::getNextJob() {
    // get the data of the next job
    jobData* job = NULL;
    // we don't want to start the same job twice,
    // so we make sure only one thread at a time is in this part
    WaitForSingleObject(this->mutex, INFINITE);
    if (!this->jobs.empty())
    {
        job = &(this->jobs.front());
        this->jobs.pop();
    }
    // we're done with the exclusive part!
    ReleaseMutex(this->mutex);
    return job;
}
Let's turn this on its head: Why are you using pointers at all?
class Params
{
public:
    int value1, value2; // etc...
};

class ThreadJob
{
public:
    ThreadJob(int id, const Params &p) : jobID(id), params(p) {}
    int jobID; // or whatever...
    Params params;
};

class ThreadPool
{
    std::list<ThreadJob> jobs;
public:
    void addJob(int job, const Params &p)
    {
        ThreadJob j(job, p);
        jobs.push_back(j);
    }
};
No new, delete or pointers... Obviously some of the implementation details may be cocked, but you get the overall picture.
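A hypothetical usage, to show the ownership model (values illustrative):

    ThreadPool pool;
    Params p;
    p.value1 = 2;
    p.value2 = 3;
    pool.addJob(0, p); // p is copied into the job; there is nothing to delete later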
Thanks for the extra code. Now we can see a problem, in getNextJob:

    if (!this->jobs.empty())
    {
        job = &(this->jobs.front());
        this->jobs.pop();
    }

After the pop, the memory pointed to by 'job' is undefined. Don't keep a pointer into the queue; copy the actual data!
Try something like this (it's still generic, because jobData is generic):

jobData ThreadPool::getNextJob() // get the data of the next job
{
    jobData job;
    WaitForSingleObject(this->mutex, INFINITE);
    if (!this->jobs.empty())
    {
        job = this->jobs.front();
        this->jobs.pop();
    }
    // we're done with the exclusive part!
    ReleaseMutex(this->mutex);
    return job;
}
Also, while you're adding jobs to the queue you must ALSO lock the mutex, to prevent list corruption. AFAIK std::lists are NOT inherently thread-safe...?
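A sketch of addJob holding the same mutex, in the same WinAPI style as getNextJob above:

    void ThreadPool::addJob(int jobId, void* params) {
        jobData job;
        job.ID = jobId;
        job.params = params;
        WaitForSingleObject(this->mutex, INFINITE);
        this->jobs.push(job); // queue mutation now serialized with getNextJob
        ReleaseMutex(this->mutex);
    }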
Using operator delete on a pointer to void results in undefined behavior according to the specification.
Chapter 5.3.5 of the draft C++ specification, paragraph 3:
In the first alternative (delete object), if the static type of the operand is different from its dynamic type, the static type shall be a base class of the operand’s dynamic type and the static type shall have a virtual destructor or the behavior is undefined. In the second alternative (delete array) if the dynamic type of the object to be deleted differs from its static type, the behavior is undefined.73)
And the corresponding footnote:
This implies that an object cannot be deleted using a pointer of type void* because there are no objects of type void.
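For the destructor shown above, that means delete this->jobs.front().params deletes through a void*, which is undefined; a minimal fix is to cast back to the concrete type first (assuming every params really points at a jobParams):

    ThreadPool::~ThreadPool() {
        while (!this->jobs.empty()) {
            // Cast back to the real type so the delete is well-defined
            // and the correct destructor runs.
            delete static_cast<jobParams*>(this->jobs.front().params);
            this->jobs.pop();
        }
    }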
All access to the job queue must be synchronized, i.e. performed from only one thread at a time, by locking the job queue prior to access. Do you already have a critical section or some similar pattern to guard the shared resource? Synchronization issues often lead to weird behaviour and bugs which are hard to reproduce.
It's hard to give a definitive answer with this amount of code. But generally speaking, multithreaded programming is all about synchronizing access to data that might be accessed from multiple threads. If there is no lock or other synchronization primitive protecting access to the thread pool class itself, then you can potentially have multiple threads reaching your deletion loop at the same time, at which point you're pretty much guaranteed to be double-freeing memory.
The reason you get no crash when you delete a job's params at the end of the job function might be that access to a single job's params is already implicitly serialized by your work queue. Or you might just be getting lucky. In either case, it's best to think of locks and synchronization primitives not as protecting code, but as protecting data (I've always thought the term "critical section" was a bit misleading here, as it tends to make people think of a section of lines of code rather than of data access). In this case, since you want to access your jobs data from multiple threads, you need to protect it with a lock or some other synchronization primitive.
If you try to delete an object twice, the second delete is invalid: that heap block has already been freed, which is a classic cause of errors like the one you're seeing.
Now, since you are in a multithreading context, the deletions might happen "almost" in parallel, which might let the second deletion slip through without an error because the first one is not yet finalized.
Use smart pointers or other RAII to handle your memory.
If you have access to boost or the TR1 libraries you can do something like this:
#include <list>
#include <utility>
#include <windows.h>
#include <boost/function.hpp>
#include <boost/bind.hpp>

class ThreadPool
{
    typedef std::pair<int, boost::function<void (void)> > Job;
    std::list<Job> jobList;
    HANDLE mutex;
public:
    ThreadPool() : mutex(CreateMutex(NULL, FALSE, NULL)) {}

    void addJob(int jobid, const boost::function<void (void)>& job) {
        jobList.push_back(std::make_pair(jobid, job));
    }

    Job getNextJob() {
        // RAII guard: releases the mutex even on early return.
        struct MutexLocker {
            HANDLE& mutex;
            MutexLocker(HANDLE& mutex) : mutex(mutex) {
                WaitForSingleObject(mutex, INFINITE);
            }
            ~MutexLocker() {
                ReleaseMutex(mutex);
            }
        };

        Job job = std::make_pair(-1, boost::function<void (void)>());
        const MutexLocker locker(this->mutex);
        if (!this->jobList.empty()) {
            job = this->jobList.front();
            this->jobList.pop_front(); // std::list has pop_front(), not pop()
        }
        return job;
    }
};

void workWithDouble(double value);
void workWithInt(int value);
void workWithValues(int, double);

void test() {
    ThreadPool pool;
    //...
    pool.addJob(0, boost::bind(&workWithDouble, 0.1));
    pool.addJob(1, boost::bind(&workWithInt, 1));
    pool.addJob(2, boost::bind(&workWithValues, 1, 0.1));
}