Most puzzling C++ heap allocation bug [closed]


Closed. This question needs debugging details. It is not currently accepting answers.
Closed 7 years ago.
I cannot post the source code, but I can explain parts of it on a conceptual level and hope I can be helped to understand why my solution works.
I have an application that has 3 threads: A, B and C (main thread).
Thread B has a list of Foo objects.
Each Foo object contains exactly one Mutex object, which is a wrapper over a recursive mutex, plus a set of methods that get and set various attributes in a synchronized manner using that Mutex. Two of those methods get and set the markedForDelete attribute.
All that Thread B does is iterate over said list using iterators, deleting the Foo objects that are marked for delete and executing other instructions on the rest. It is the only thread in charge of destroying Foo objects, using basic code similar to this:
while (running)
{
    fooListLock->Lock();
    for (vector<Foo*>::iterator it = fooList.begin(); it != fooList.end(); )
    {
        if ((*it)->isMarkedForDelete())
        {
            // erase() returns the iterator that follows the removed element
            it = fooList.erase(it);
        }
        else
        {
            (*it)->execute();
            ++it;
        }
    }
    fooListLock->Unlock();
    sleep(sleepVariable);
}
Thread A and C will create and add Foo objects to list and they can also mark them to be deleted and it is done in a synchronized manner using other mutexes.
Thread C will occasionally be closed and always afterwards be restarted, but in a controlled manner and never during memory allocation / deallocation and will always release locked mutexes.
The problem is that when Foo's Mutex is allocated in heap memory (via the new operator), the application reaches a deadlock: Thread C wants to access resources locked by Thread A, Thread A wants to access a resource locked by Thread B, and Thread B is blocked on Foo's Mutex, which is locked but has no owner. Using GDB I found that the owner value in the Mutex's pthread_mutex_t is 0 or a negative number, which does not correspond to the id of any thread. Thread B blocks at the isMarkedForDelete() check inside its loop.
My very intuitive solution is to allocate Foo's Mutex on the stack, and it works without any other modifications! The application never reaches a deadlock state this way.
The compilation is done using g++ 4.8 with the O2 flag set.
I know it's not much to go on, but can someone help me understand why my solution works?

I certainly believe it has nothing to do with heap bugs. Most likely, you are not initializing your mutex properly. Are you calling pthread_mutex_init?
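Since the question does not show the Mutex wrapper, the following is only a minimal sketch of what explicit initialization could look like, assuming the wrapper holds a raw pthread_mutex_t. The class layout and member names are hypothetical. The point is that the constructor calls pthread_mutex_init with a recursive attribute, so the mutex state does not depend on whatever bytes happened to be in the memory it was allocated in.

```cpp
#include <pthread.h>

// Hypothetical stand-in for the question's Mutex wrapper (the real class was not posted).
class Mutex {
public:
    Mutex() {
        pthread_mutexattr_t attr;
        pthread_mutexattr_init(&attr);
        pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
        // Without this call, handle_ would hold indeterminate bytes.
        pthread_mutex_init(&handle_, &attr);
        pthread_mutexattr_destroy(&attr);
    }
    ~Mutex() { pthread_mutex_destroy(&handle_); }

    int Lock()   { return pthread_mutex_lock(&handle_); }
    int Unlock() { return pthread_mutex_unlock(&handle_); }

private:
    Mutex(const Mutex&);             // not copyable: copying a pthread_mutex_t is not meaningful
    Mutex& operator=(const Mutex&);
    pthread_mutex_t handle_;
};
```

With a wrapper like this, a Mutex created with new behaves the same as one with automatic storage, because both go through the same constructor.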

Related

Does a mutex lock itself, or the memory positions in question?

Let's say we've got a global variable, and a global non-member function.
int GlobalVariable = 0;
void GlobalFunction();
and we have
std::mutex MutexObject;
then inside one of the threads, we have this block of code:
{
    std::lock_guard<std::mutex> lock(MutexObject);
    GlobalVariable++;
    GlobalFunction();
}
now, inside another thread running in parallel, what happens if we do something like this:
{
    //std::lock_guard<std::mutex> lock(MutexObject);
    GlobalVariable++;
    GlobalFunction();
}
So the question is, does a mutex lock only itself from getting owned while being owned by another thread, not caring in the process about what is being tried to be accessed in the critical code? or does the compiler, or in run-time, the OS actually designate the memory location being accessed in the critical code as blocked for now by MutexObject?
My guess is the former, but I need to hear from an experienced programmer; Thanks for taking the time to read my question.

It’s the former. There’s no relationship between the mutex and the objects you’re protecting with the mutex. (In general, it's not possible for the compiler to deduce exactly which objects a given block of code will modify.) The magic behind the mutex comes entirely from the temporal ordering guarantees it makes: that everything the thread does before releasing the mutex is visible to the next thread after it’s grabbed the mutex. But the two threads both need to actively use the mutex for that to happen.
A system which actually cares about what memory locations a thread has accessed and modified, and builds safely on top of that, is “transactional memory”. It’s not widely used.
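As a small illustration of "both threads need to actively use the mutex", here is a sketch in which every thread that touches the shared counter takes the same std::mutex. The names incrementMany and runTwoThreads are made up for this example. If either thread skipped the lock_guard, the mutex would do nothing to stop it, and the increments would race.

```cpp
#include <mutex>
#include <thread>

int globalVariable = 0;   // shared data; the mutex knows nothing about it
std::mutex mutexObject;

// Every thread that touches globalVariable takes the same mutex first.
void incrementMany(int times) {
    for (int n = 0; n < times; ++n) {
        std::lock_guard<std::mutex> lock(mutexObject);
        ++globalVariable;
    }
}

// Runs two cooperating threads and returns the final counter value.
int runTwoThreads(int timesEach) {
    globalVariable = 0;
    std::thread a(incrementMany, timesEach);
    std::thread b(incrementMany, timesEach);
    a.join();
    b.join();
    return globalVariable;
}
```

Because both threads cooperate, runTwoThreads(n) returns 2 * n; the protection comes from the convention, not from any link between mutexObject and globalVariable.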

Destroying a global object [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 5 years ago.
I have a global object which is shared by different threads. To provide the synchronization, I have added a mutex inside the global object.
The mutex is locked by a thread before it accesses the data inside the object.
Everything is fine except deletion.
Suppose a thread locks the mutex and then deletes the object.
How can it unlock the mutex afterwards? (The memory for the data, and hence for the mutex, will have been released.)
How can a safe delete be implemented with this approach, i.e. keeping the mutex inside the object?

Typically, you have a global object like that (using std::mutex as example):
std::mutex theGlobalMutex; // no pointer, no reference!
This way, the object will be initialised before the program starts running, and it will be cleaned up automatically as soon as you are leaving the program (destructor being called is guaranteed by the standard). No problem at all - at first. There are situations when the actual initialisation can be delayed (see e.g. here mentioned somewhere in the middle of the article), but you should be safe from that if you define your global object in the file containing the main function.
Alternatively, if you really want to control object creation yourself:
std::unique_ptr<std::mutex> the_mutex;
int main()
{
    // before any thread is created:
    the_mutex.reset(new std::mutex());
    // ...
    return 0;
}
Again, the mutex will be cleaned up automatically, this time via the smart pointer. I assume you are aware that you should never replace the object held by the smart pointer while threads are running; otherwise you break your protection against race conditions.
One last point:
[...], I have added a mutex inside the global object.
OK, your mutex is part of the global object. If you want to retain thread safety, the global object now must exist during the whole life time of your program (or at least as long as there are multiple threads running). If you cannot assure this by your program design - then you need to move the mutex out of the class! So either have a separate mutex as above or make the mutex a static member of your class. The latter option again provides automatic cleanup as the former already does.
Edit:
According to your comment, what you want to achieve is protecting smaller parts of a larger tree against race conditions such that the nodes can be used independently, providing smaller lock ranges/durations.
With the approach you are planning, you get into trouble as soon as you try to modify the whole tree: Imagine you are going to delete one node in thread A. A gets the mutex, but is then interrupted. Another thread B tries to lock the mutex, too, to modify the object in question, but fails to do so and has to wait. A now deletes the object, and B operates on invalid data from now on!
So on modification, you need to protect the whole tree! If you are on the newest C++ standard, you can use a global/static std::shared_mutex to secure your tree: Every read access to a node is protected with a shared_lock to the entire tree, each write access (adding or deleting nodes) with a normal lock.
Otherwise, boost offers a similar facility and you find solutions here on stackoverflow (e. g. here or here).
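A rough sketch of the shared_mutex idea, assuming C++17: the Tree class and its flat map of nodes are hypothetical stand-ins for the real tree, and the lock is a member of the container rather than a global, but it plays the same role of one lock guarding the whole structure. Reads take a shared lock, and anything that adds or removes nodes takes the lock exclusively.

```cpp
#include <map>
#include <mutex>
#include <shared_mutex>
#include <string>

// Hypothetical tree stand-in: a flat map of node name -> value.
class Tree {
public:
    // Read access: many readers may hold the shared lock at the same time.
    bool read(const std::string& name, int& out) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        auto it = nodes_.find(name);
        if (it == nodes_.end()) return false;
        out = it->second;
        return true;
    }
    // Structural changes take the lock exclusively.
    void add(const std::string& name, int value) {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        nodes_[name] = value;
    }
    bool remove(const std::string& name) {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        return nodes_.erase(name) == 1;
    }
private:
    mutable std::shared_mutex mutex_;   // one lock for the whole structure
    std::map<std::string, int> nodes_;
};
```

The mutex lives in the long-lived container, not in the nodes, so deleting a node never destroys a lock that another thread might be waiting on.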

Valgrind: finding conflicting store/load in multithread program [duplicate]

This question already has an answer here:
Is there a bug in the boost asio HTTP Server 3 example or boost bug?
(1 answer)
Closed 9 years ago.
I am trying to write a program with boost::asio and multiple threads. The program seems to be working fine, but when I run it with the Valgrind thread tool DRD, I get messages about conflicting store and load operations.
==13740== Thread 2:
==13740== Conflicting store by thread 2 at 0x06265ff0 size 4
==13740== at 0x40F2B8: boost::asio::detail::epoll_reactor::descriptor_state::set_ready_events(unsigned int) (epoll_reactor.hpp:68)
==13740== by 0x410097: boost::asio::detail::epoll_reactor::run(bool, boost::asio::detail::op_queue&) (epoll_reactor.ipp:430)
etc.
The error messages are rather lengthy due to all the involved boost calls and seem not to include my functions directly. As I said, the program seems to work, but leaving these errors inside the code leaves me with a bad feeling. Is there any good way, to find the problematic locations in the code?
Thanks for the advice

A related bug report landed in Ubuntu's bug trackers (not the right place IYAM):
https://bugs.launchpad.net/ubuntu/+source/boost1.53/+bug/1243570
This details a lock order violation detected with Helgrind (not DRD). This is more tangible.
Some old but perhaps interesting discussion about this here: http://lists.boost.org/Archives/boost/2010/06/167818.php
My comments on this source code are as follows:
The comments at the bottom of class epoll_reactor say that any
access of registered_descriptors_ should be protected by
registered_descriptors_mutex_. However, the method shutdown_service()
modifies the container registered_descriptors_ but doesn't lock
registered_descriptors_mutex_.
The method epoll_reactor::register_descriptor() modifies its second
argument (descriptor_data) such that it points to the newly created
descriptor_state object. All data members of the struct
descriptor_state are public, but all accesses must be guarded by a
lock on descriptor_state::mutex_. So all callers of
register_descriptor() must be checked in order to verify whether or
not there are any thread-unsafe accesses of
descriptor_state::op_queue_ or descriptor_state::shutdown_. Personally
I never recommended such a class design.
While all accesses of the members of struct descriptor_state should
be protected by locking descriptor_state::mutex_, no lock is held on
this last mutex by register_descriptor() when it sets
descriptor_data::shutdown_ nor by shutdown_service() while it modifies
descriptor_state::op_queue_ and descriptor_state::shutdown_. The
former is easy to fix: move the "descriptor_data->shutdown_ = false"
statement to somewhere before the epoll_ctl() system call.
Does one of the above scenarios explain the race report you have observed ?
Of course, many versions have passed since then (1.43.0 to 1.55.0), so chances are this has been addressed or otherwise changed, but it could give you a lead for finding more information in the Boost trackers.

Mutex example / tutorial? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 2 years ago.
I was trying to understand how mutexes work. Did a lot of Googling but it still left some doubts of how it works because I created my own program in which locking didn't work.
One absolutely non-intuitive syntax of the mutex is pthread_mutex_lock( &mutex1 );, where it looks like the mutex is being locked, when what I really want to lock is some other variable. Does this syntax mean that locking a mutex locks a region of code until the mutex is unlocked? Then how do threads know that the region is locked? [UPDATE: Threads know that the region is locked, by Memory Fencing ]. And isn't such a phenomenon supposed to be called critical section? [UPDATE: Critical section objects are available in Windows only, where the objects are faster than mutexes and are visible only to the thread which implements it. Otherwise, critical section just refers to the area of code protected by a mutex]
What's the simplest possible mutex example program and the simplest possible explanation on the logic of how it works?

Here goes my humble attempt to explain the concept to newbies around the world: (a color coded version on my blog too)
A lot of people run to a lone phone booth (they don't have mobile phones) to talk to their loved ones. The first person to catch the door-handle of the booth, is the one who is allowed to use the phone. He has to keep holding on to the handle of the door as long as he uses the phone, otherwise someone else will catch hold of the handle, throw him out and talk to his wife :) There's no queue system as such. When the person finishes his call, comes out of the booth and leaves the door handle, the next person to get hold of the door handle will be allowed to use the phone.
A thread is : Each person
The mutex is : The door handle
The lock is : The person's hand
The resource is : The phone
Any thread which has to execute some lines of code which should not be modified by other threads at the same time (using the phone to talk to his wife), has to first acquire a lock on a mutex (clutching the door handle of the booth). Only then will a thread be able to run those lines of code (making the phone call).
Once the thread has executed that code, it should release the lock on the mutex so that another thread can acquire a lock on the mutex (other people being able to access the phone booth).
[The concept of having a mutex is a bit absurd when considering real-world exclusive access, but in the programming world I guess there was no other way to let the other threads 'see' that a thread was already executing some lines of code. There are concepts of recursive mutexes etc, but this example was only meant to show you the basic concept. Hope the example gives you a clear picture of the concept.]
With C++11 threading:
#include <iostream>
#include <thread>
#include <mutex>
std::mutex m;//you can use std::lock_guard if you want to be exception safe
int i = 0;
void makeACallFromPhoneBooth()
{
    m.lock();//man gets a hold of the phone booth door and locks it. The other men wait outside
    //man happily talks to his wife from now....
    std::cout << i << " Hello Wife" << std::endl;
    i++;//no other thread can access variable i until m.unlock() is called
    //...until now, with no interruption from other men
    m.unlock();//man lets go of the door handle and unlocks the door
}
int main()
{
    //This is the main crowd of people uninterested in making a phone call
    //man1 leaves the crowd to go to the phone booth
    std::thread man1(makeACallFromPhoneBooth);
    //Although man2 appears to start second, there's a good chance he might
    //reach the phone booth before man1
    std::thread man2(makeACallFromPhoneBooth);
    //And hey, man3 also joined the race to the booth
    std::thread man3(makeACallFromPhoneBooth);
    man1.join();//man1 finished his phone call and joins the crowd
    man2.join();//man2 finished his phone call and joins the crowd
    man3.join();//man3 finished his phone call and joins the crowd
    return 0;
}
Compile and run using g++ -std=c++0x -pthread -o thread thread.cpp;./thread
Instead of explicitly using lock and unlock, you can use brackets as shown here, if you are using a scoped lock for the advantage it provides. Scoped locks have a slight performance overhead though.
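For comparison, here is a sketch of the scoped-lock style using std::lock_guard. The names boothMutex, callsMade and makeACall are made up for this example, and the lineDrops parameter exists only to show what happens when the function exits through an exception.

```cpp
#include <mutex>
#include <stdexcept>

std::mutex boothMutex;   // hypothetical name; plays the role of m above
int callsMade = 0;       // plays the role of i above

void makeACall(bool lineDrops) {
    std::lock_guard<std::mutex> lock(boothMutex);   // locks here
    if (lineDrops) {
        // Leaving the scope by exception still unlocks boothMutex.
        throw std::runtime_error("line dropped");
    }
    ++callsMade;
}   // boothMutex is unlocked when lock is destroyed
```

With explicit m.lock() and m.unlock(), the throw would skip the unlock and leave the booth locked forever; with the guard, the next caller can still get in.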

While a mutex may be used to solve other problems, the primary reason they exist is to provide mutual exclusion and thereby solve what is known as a race condition. When two (or more) threads or processes are attempting to access the same variable concurrently, we have potential for a race condition. Consider the following code
//somewhere long ago, we have i declared as int
void my_concurrently_called_function()
{
i++;
}
The internals of this function look so simple. It's only one statement. However, a typical pseudo-assembly language equivalent might be:
load i from memory into a register
add 1 to i
store i back into memory
Because the equivalent assembly-language instructions are all required to perform the increment operation on i, we say that incrementing i is a non-atomic operation. An atomic operation is one that can be completed on the hardware with a guarantee of not being interrupted once the instruction execution has begun. Incrementing i consists of a chain of 3 atomic instructions. In a concurrent system where several threads are calling the function, problems arise when a thread reads or writes at the wrong time. Imagine we have two threads running simultaneously and one calls the function immediately after the other. Let's also say that we have i initialized to 0. Also assume that we have plenty of registers and that the two threads are using completely different registers, so there will be no collisions. The actual timing of these events may be:
thread 1 load 0 into register from memory corresponding to i //register is currently 0
thread 1 add 1 to a register //register is now 1, but memory is still 0
thread 2 load 0 into register from memory corresponding to i
thread 2 add 1 to a register //register is now 1, but memory is still 0
thread 1 write register to memory //memory is now 1
thread 2 write register to memory //memory is now 1
What's happened is that we have two threads incrementing i concurrently, our function gets called twice, but the outcome is inconsistent with that fact. It looks like the function was only called once. This is because the atomicity is "broken" at the machine level, meaning threads can interrupt each other or work together at the wrong times.
We need a mechanism to solve this. We need to impose some ordering to the instructions above. One common mechanism is to block all threads except one. Pthread mutex uses this mechanism.
Any thread that has to execute lines of code which may unsafely modify values shared with other threads (using the phone to talk to his wife) should first be made to acquire a lock on a mutex. In this way, any thread that requires access to the shared data must pass through the mutex lock. Only then will a thread be able to execute the code. This section of code is called a critical section.
Once the thread has executed the critical section, it should release the lock on the mutex so that another thread can acquire a lock on the mutex.
The concept of having a mutex seems a bit odd when considering humans seeking exclusive access to real, physical objects but when programming, we must be intentional. Concurrent threads and processes don't have the social and cultural upbringing that we do, so we must force them to share data nicely.
So technically speaking, how does a mutex work? Doesn't it suffer from the same race conditions that we mentioned earlier? Isn't pthread_mutex_lock() a bit more complex than a simple increment of a variable?
Technically speaking, we need some hardware support to help us out. The hardware designers give us machine instructions that do more than one thing but are guaranteed to be atomic. A classic example of such an instruction is the test-and-set (TAS). When trying to acquire a lock on a resource, we might use the TAS to check whether a value in memory is 0. If it is not, that is our signal that the resource is in use, and we do nothing (or more accurately, we wait by some mechanism. A pthreads mutex will put us into a special queue in the operating system and will notify us when the resource becomes available. Dumber systems may require us to do a tight spin loop, testing the condition over and over). If the value in memory is 0, the TAS sets the location to something other than 0 without using any other instructions. It's like combining two assembly instructions into 1 to give us atomicity. Thus, testing and changing the value (if changing is appropriate) cannot be interrupted once it has begun. We can build mutexes on top of such an instruction.
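To make the test-and-set idea concrete, here is a toy spin lock built on std::atomic_flag, whose test_and_set() member atomically sets the flag and returns the previous value. This is only an illustration of the principle, not how a pthreads mutex is actually implemented; the SpinLock class and the bump and runBumpers helpers are invented for this sketch.

```cpp
#include <atomic>
#include <thread>

// Toy spin lock built on an atomic test-and-set; for illustration only.
class SpinLock {
public:
    void lock() {
        // test_and_set atomically sets the flag and returns its previous value.
        // A previous value of true means another thread already holds the lock.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();   // the "tight spin loop" case
        }
    }
    void unlock() { flag_.clear(std::memory_order_release); }
private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

SpinLock spin;
int counter = 0;

void bump(int times) {
    for (int n = 0; n < times; ++n) {
        spin.lock();
        ++counter;        // load, add, store: now safe from interleaving
        spin.unlock();
    }
}

// Runs two threads through the spin lock and returns the final counter value.
int runBumpers(int timesEach) {
    counter = 0;
    std::thread a(bump, timesEach);
    std::thread b(bump, timesEach);
    a.join();
    b.join();
    return counter;
}
```

A real mutex typically adds a way to put waiting threads to sleep instead of spinning, which is the operating-system queue mentioned above.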
Note: some sections may appear similar to an earlier answer. I accepted his invite to edit, he preferred the original way it was, so I'm keeping what I had which is infused with a little bit of his verbiage.

I stumbled upon this post recently and think that it needs an updated solution for the standard library's c++11 mutex (namely std::mutex).
I've pasted some code below (my first steps with a mutex - I learned concurrency on win32 with HANDLE, SetEvent, WaitForMultipleObjects etc).
Since it's my first attempt with std::mutex and friends, I'd love to see comments, suggestions and improvements!
#include <condition_variable>
#include <mutex>
#include <algorithm>
#include <thread>
#include <queue>
#include <chrono>
#include <iostream>
int main()
{
    // these vars are shared among the following threads
    std::queue<unsigned int> nNumbers;
    std::mutex mtxQueue;
    std::condition_variable cvQueue;
    bool m_bQueueLocked = false;
    std::mutex mtxQuit;
    std::condition_variable cvQuit;
    bool m_bQuit = false;
    std::thread thrQuit(
        [&]()
        {
            using namespace std;
            this_thread::sleep_for(chrono::seconds(5));
            // set event by setting the bool variable to true (under the mutex
            // that the waiting threads use), then notifying via the condition variable
            {
                lock_guard<mutex> lock( mtxQuit );
                m_bQuit = true;
            }
            cvQuit.notify_all();
        }
    );
    std::thread thrProducer(
        [&]()
        {
            using namespace std;
            int nNum = 13;
            unique_lock<mutex> lock( mtxQuit );
            // the predicate overload returns false on timeout while m_bQuit is still
            // false, so a missed or spurious wakeup cannot break or stall the loop
            while( ! cvQuit.wait_for( lock, chrono::milliseconds(75), [&] { return m_bQuit; } ) )
            {
                nNum = nNum + 13 / 2;
                unique_lock<mutex> qLock(mtxQueue);
                cout << "Produced: " << nNum << "\n";
                nNumbers.push( nNum );
            }
        }
    );
    std::thread thrConsumer(
        [&]()
        {
            using namespace std;
            unique_lock<mutex> lock(mtxQuit);
            while( ! cvQuit.wait_for( lock, chrono::milliseconds(150), [&] { return m_bQuit; } ) )
            {
                unique_lock<mutex> qLock(mtxQueue);
                if( nNumbers.size() > 0 )
                {
                    cout << "Consumed: " << nNumbers.front() << "\n";
                    nNumbers.pop();
                }
            }
        }
    );
    thrQuit.join();
    thrProducer.join();
    thrConsumer.join();
    return 0;
}

For those looking for the shortest mutex example:
#include <mutex>
int main() {
std::mutex m;
m.lock();
// do thread-safe stuff
m.unlock();
}

The function pthread_mutex_lock() either acquires the mutex for the calling thread or blocks the thread until the mutex can be acquired. The related pthread_mutex_unlock() releases the mutex.
Think of the mutex as a queue: every thread that attempts to acquire the mutex is placed on the end of the queue. When a thread releases the mutex, the next thread in the queue comes off and is now running. (This is a mental model only; POSIX does not promise strict first-come, first-served order.)
A critical section refers to a region of code where non-determinism is possible. Often this is because multiple threads are attempting to access a shared variable. The critical section is not safe until some sort of synchronization is in place. A mutex lock is one form of synchronization.
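A minimal sketch of the two calls in use, assuming a statically initialized mutex named mutex1 as in the question. The worker and runWorkers functions and the sharedCounter variable are invented for illustration.

```cpp
#include <pthread.h>

pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;
long sharedCounter = 0;   // the data mutex1 protects, by convention only

void* worker(void* arg) {
    long times = *static_cast<long*>(arg);
    for (long n = 0; n < times; ++n) {
        pthread_mutex_lock(&mutex1);     // blocks until the mutex is free
        ++sharedCounter;                 // critical section
        pthread_mutex_unlock(&mutex1);   // lets the next waiting thread in
    }
    return nullptr;
}

// Starts two workers and returns the final counter value.
long runWorkers(long timesEach) {
    sharedCounter = 0;
    pthread_t t1, t2;
    pthread_create(&t1, nullptr, worker, &timesEach);
    pthread_create(&t2, nullptr, worker, &timesEach);
    pthread_join(t1, nullptr);
    pthread_join(t2, nullptr);
    return sharedCounter;
}
```

The region between the lock and unlock calls is the critical section; at most one thread is inside it at any moment.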

You are supposed to check the mutex variable before using the area protected by the mutex. pthread_mutex_lock() waits until mutex1 is released, whereas pthread_mutex_trylock() returns immediately with a value (EBUSY) indicating that the lock could not be obtained if someone else has already locked it.
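The non-blocking check can be sketched with pthread_mutex_trylock, which returns 0 on success and EBUSY when the mutex is already locked. The helper name tryCriticalSection is made up for this example.

```cpp
#include <pthread.h>
#include <cerrno>

pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;

// Returns true if the critical section ran, false if the mutex could not be taken.
bool tryCriticalSection(int& value) {
    int rc = pthread_mutex_trylock(&mutex1);
    if (rc == EBUSY) return false;   // someone else holds it; leave value alone
    if (rc != 0) return false;       // any other error: also do nothing
    ++value;                         // protected area
    pthread_mutex_unlock(&mutex1);
    return true;
}
```

A caller that gets false can do other work and try again later instead of sleeping inside pthread_mutex_lock().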
Mutex is really just a simplified semaphore. If you read about them and understand them, you understand mutexes. There are several questions regarding mutexes and semaphores in SO. Difference between binary semaphore and mutex, When should we use mutex and when should we use semaphore and so on. The toilet example in the first link is about as good an example as one can think of. All code does is to check if the key is available and if it is, reserves it. Notice that you don't really reserve the toilet itself, but the key.

SEMAPHORE EXAMPLE ::
sem_t m;
sem_init(&m, 0, 1); // initialize semaphore to 1 so that the first sem_wait succeeds
sem_wait(&m);
// critical section here
sem_post(&m);
Reference : http://pages.cs.wisc.edu/~remzi/Classes/537/Fall2008/Notes/threads-semaphores.txt

Multithreading with STL container

I have an unordered map which stores a pointer of objects. I am not sure whether I am doing the correct thing to maintain the thread safety.
typedef std::unordered_map<string, classA*>MAP1;
MAP1 map1;
pthread_mutex_lock(&mutexA);
if(map1.find(id) != map1.end())
{
    pthread_mutex_unlock(&mutexA); //already exist, not adding items
}
else
{
    classA* obj1 = new classA;
    map1[id] = obj1;
    obj1->obtainMutex(); //Should I create a mutex for each object so that I could obtain mutex when I am going to update fields for obj1?
    pthread_mutex_unlock(&mutexA); //release mutex for unordered_map so that other threads could access other object
    obj1->field1 = 1;
    performOperation(obj1); //takes some time
    obj1->releaseMutex(); //release mutex after updating obj1
}

Several thoughts.
If you do have one mutex per stored object, then you should try to create that mutex in the constructor for the stored object. In other words, to maintain encapsulation, you should avoid having external code manipulate that mutex. I would convert "field1" into a setter "SetField1" that handles the mutex internally.
Next, I agree with the comment that you could move pthread_mutex_unlock(&mutexA); to occur before obj1->obtainMutex();
Finally, I don't think you need obtainMutex at all. Your code looks as if only one thread will ever be allowed to create an object, and therefore only one thread will manipulate the contents during object creation. So if I consider only what little code you've shown here, it does not seem that mutex-per-object is needed at all.
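If a per-object mutex does turn out to be needed, the encapsulation suggested above might look roughly like this. ClassA, SetField1 and GetField1 are hypothetical names, and std::mutex is used here in place of the question's pthread mutex for brevity.

```cpp
#include <mutex>

// Hypothetical version of classA with the lock kept private.
class ClassA {
public:
    void SetField1(int value) {
        std::lock_guard<std::mutex> lock(mutex_);
        field1_ = value;
    }
    int GetField1() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return field1_;
    }
private:
    mutable std::mutex mutex_;   // created and destroyed together with the object
    int field1_ = 0;
};
```

External code never sees the mutex, so it cannot forget to lock it or unlock it twice.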

One problem I see with the code is that it will lead to problems especially when exceptions occur.
obj1->obtainMutex(); //Should I create a mutex for each object so that I could obtain mutex when I am going to update fields for obj1?
pthread_mutex_unlock(&mutexA); //release mutex for unordered_map so that other threads could access other object
obj1->field1 = 1;
performOperation(obj1);
If performOperation throws an exception then obj1->releaseMutex(); will never get called thus leaving the object locked and potentially leading to deadlocks sometime in the future.
And even if you do not use exceptions yourself some library code you use in performOperation might. Or you might mistakenly sometime in the future insert a return and forget to unlock all owned locks before and so on...
The same goes for the pthread_mutex_lock and pthread_mutex_unlock calls.
I would recommend using RAII for locking / unlocking.
I.e. the code could look like this:
typedef std::unordered_map<string, classA*>MAP1;
MAP1 map1;
Lock mapLock(&mutexA); //automatic variable. The destructor of the Lock class
//automatically calls pthread_mutex_unlock if it "owns" the
//mutex
if(map1.find(id) == map1.end())
{
    classA* obj1 = new classA;
    map1[id] = obj1;
    Lock objLock(obj1);
    mapLock.release(); //we explicitly release mapLock here
    obj1->field1 = 1;
    performOperation(obj1); //takes some time
}
For a reference on minimalistic RAII threading support, please refer to "Modern C++ design: generic programming and design patterns applied" by Andrei Alexandrescu (see here). Other resources also exist (here)
Finally, I will try to describe one other problem I see with the code: having obtainMutex and releaseMutex as methods and calling them explicitly. Imagine Thread 1 locks the map, creates an object, calls obtainMutex and unlocks the map. Another thread (let's call it Thread 2) gets scheduled, locks the map, obtains an iterator to the map1[id] entry and calls releaseMutex() on the pObject (say that, due to a bug, the code does not call obtainMutex first). Now Thread 1 gets scheduled and at some point also calls releaseMutex(). So the object got locked once but released twice. What I am trying to say is that it is going to be hard work making sure the calls are always correctly paired in the face of exceptions, early returns that do not unlock, and incorrect usage of the object locking interface. Also, Thread 2 might just delete the pObject it obtained from the map and erase it from the map. Thread 1 will then go on and work with an already deleted object.
When used judiciously RAII would make the code simpler to understand (even shorter if you compare our versions) and also help a lot with some of the problems I enumerated above.
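The answer does not show the Lock class it uses, so the following is only a guess at a minimal version for a pthread_mutex_t, with the release() member the example code relies on.

```cpp
#include <pthread.h>

// Hypothetical Lock guard; the answer does not show its definition.
class Lock {
public:
    explicit Lock(pthread_mutex_t* m) : mutex_(m) { pthread_mutex_lock(mutex_); }
    ~Lock() { release(); }

    // Unlocks early; safe to call more than once.
    void release() {
        if (mutex_) {
            pthread_mutex_unlock(mutex_);
            mutex_ = nullptr;
        }
    }

    Lock(const Lock&) = delete;
    Lock& operator=(const Lock&) = delete;

private:
    pthread_mutex_t* mutex_;   // null once the lock has been released
};
```

Because the destructor unlocks whatever is still held, an exception or an early return inside the guarded block cannot leave the mutex locked.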

Thought of combining my comments into an answer:
1) When you are adding an entry, and therefore are modifying the container, you should not allow read access from other threads, as the container may be in a transition between legal states. Conversely, you should not modify the container when other threads are reading it. This calls for the use of a read-write lock. The pseudo-code is something like:
set read lock
search container
if found
    release read lock
    operate on the found object
else
    release read lock
    set write lock
    search container again (another thread may have added the entry in between)
    add entry if still missing
    release write lock
endif
(it's been some time since I've done multi-threaded programming, so I may be rusty on details)
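One way to sketch the read-write lock approach with POSIX primitives is shown below. It releases the read lock before taking the write lock, because a pthread_rwlock_t cannot be upgraded in place, and it re-checks the container under the write lock. The names mapLock, container and findOrAdd, and the int payload, are assumptions made for this example.

```cpp
#include <pthread.h>
#include <string>
#include <unordered_map>

pthread_rwlock_t mapLock = PTHREAD_RWLOCK_INITIALIZER;
std::unordered_map<std::string, int> container;   // hypothetical payload type

// Returns true if this call inserted the entry, false if it already existed.
bool findOrAdd(const std::string& id, int value) {
    pthread_rwlock_rdlock(&mapLock);              // many readers may search at once
    bool found = container.find(id) != container.end();
    pthread_rwlock_unlock(&mapLock);
    if (found) return false;

    pthread_rwlock_wrlock(&mapLock);              // exclusive while modifying
    // emplace re-checks for us: it does nothing if another writer
    // added the same id between the two locks.
    bool inserted = container.emplace(id, value).second;
    pthread_rwlock_unlock(&mapLock);
    return inserted;
}
```

Operating on a found object after the read lock is released still needs its own protection, for example a per-object lock, if other threads can modify or delete it.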
2) When I worked on MSVC some years ago we used the multi-threaded (i.e. thread-safe) version of the standard libraries. It could save you all this trouble. Didn't bother (yet) to check if thread-safe std exists also on gcc/Linux.
