What is the reason for the name `weak_ptr::lock()`? - c++

Many of our developers don't understand what creating a shared_ptr from a weak_ptr has to do with locking things. To them, the term 'lock' is associated first and foremost with mutexes.
It could have been called use or safeguard or lease or promote for instance... but hey, it isn't, and it's our responsibility to learn the standard.
But to them, this is so bad an issue that another class that copied this idiom had to be renamed, causing numerous lines of code to follow the rename. Now we have sacrificed consistency with the standard for just a little intuitiveness.
Does anyone know how the choice for the name lock() was made?

It locks the shared object in memory, preventing it from being deleted for as long as the returned shared_ptr exists.
It has nothing to do with locking a mutex or anything like that.
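A minimal sketch of what that means in practice. The Widget type and the function names are made up for illustration. lock() returns either a shared_ptr that keeps the object alive, or an empty shared_ptr if the object is already gone:

```cpp
#include <cassert>
#include <memory>

struct Widget { int value = 42; };  // hypothetical type, for illustration only

// Returns the widget's value if it is still alive, or -1 if it has expired.
int read_value(const std::weak_ptr<Widget>& weak) {
    // While 'strong' exists, the Widget cannot be destroyed.
    if (std::shared_ptr<Widget> strong = weak.lock()) {
        return strong->value;
    }
    return -1;  // the last owning shared_ptr was already released
}

// Walks through the object's lifetime and reports whether lock() behaved as described.
bool demo_lock() {
    std::shared_ptr<Widget> owner = std::make_shared<Widget>();
    std::weak_ptr<Widget> weak = owner;

    std::shared_ptr<Widget> pinned = weak.lock();   // second owner
    assert(pinned && owner.use_count() == 2);

    owner.reset();                                   // original owner lets go
    bool alive_while_pinned = (read_value(weak) == 42);

    pinned.reset();                                  // last owner gone, Widget destroyed
    bool gone_after_release = weak.expired() && (read_value(weak) == -1);

    return alive_while_pinned && gone_after_release;
}
```

Calling demo_lock() should return true: the object survives the reset of its original owner because the shared_ptr obtained from lock() still holds it.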

Related

Why should singletons be avoided in C++?

People use singletons everywhere. I recently read some threads on Stack Overflow saying that singletons should be avoided in C++, but it is not clear to me why.
Some might worry about memory leaks from undeleted pointers, for example when an exception skips the cleanup code. But wouldn't auto_ptr solve this problem?
In general, as mentioned in another answer, you should avoid mutable global data. It introduces a difficulty in tracking code side effects.
However your question is specifically about C++. You could, for instance, have global immutable data that is worth sharing in a singleton. In C++, specifically, it's nearly impossible to safely initialize a singleton in a multithreaded environment.
Multithreaded Environment
You can use the "construct on first use" idiom to make sure the singleton is properly initialized exactly by the time it is needed: http://www.parashift.com/c++-faq-lite/static-init-order.html.
However, what happens if you have 2 (or more) threads which all try to access the singleton for the first time, at exactly the same time? This scenario is not as far-fetched as it seems if the shared immutable data is required by your calculateSomeData thread and you start several of these threads at the same time.
Reading the discussion linked above in the C++ FAQ Lite, you can see that it's a complex question in the first place. Adding threads makes it much harder.
On Linux, with gcc, the compiler solves this problem for you: statics are initialized within a mutex and the code is made safe for you. Before C++11 this was an enhancement that the standard did not require; since C++11 the standard does require initialization of function-local statics to be thread-safe.
In MSVC before Visual Studio 2015, the compiler does not provide this utility for you and you get a crash instead. You may think "that's ok, I'll just put a mutex around my first use initialization!" However, the mutex itself suffers from exactly the same problem, itself needing to be static.
Without that guarantee, the only way to make sure your singleton is safe for threaded use is to initialize it very early in the program before any threads are started. This can be accomplished with a trick that causes the singleton to be initialized before main is called.
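A minimal sketch of the construct on first use idiom, assuming a C++11 or later compiler, where initialization of a function-local static is required to be thread-safe. The Config class and its contents are made up for illustration:

```cpp
#include <cassert>
#include <string>

// Hypothetical immutable configuration shared by the whole program.
class Config {
public:
    // Construct on first use: the instance is created the first time
    // instance() is called, not at some unspecified point before main().
    static const Config& instance() {
        // Since C++11 this initialization happens exactly once,
        // even if several threads call instance() concurrently.
        static const Config config;
        return config;
    }

    const std::string& name() const { return name_; }

    Config(const Config&) = delete;
    Config& operator=(const Config&) = delete;

private:
    Config() : name_("default") {}
    std::string name_;
};

// Small self-check: every call yields the same object.
bool config_is_shared() {
    const Config& a = Config::instance();
    const Config& b = Config::instance();
    assert(&a == &b);
    return &a == &b;
}
```

On a pre-C++11 compiler without this guarantee, the same function would still need to be called once before any threads are started.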
Singletons That Rely on Other Singletons
This problem can be mostly solved with the construct on first use idiom, but if you have the problem of initializing them before any threads are initialized, you can potentially introduce new problems.
Cross-platform Compatibility
If you plan on using your code on more than one platform, and compile shared libraries, expect some issues. Because the C++ standard does not specify an ABI, each compiler and platform handles global statics differently. For instance, unless the symbols are explicitly exported in MSVC, each DLL will have its own instance of the singleton. On Linux, the singletons will be implicitly shared between shared libraries.
Avoid mutable global variables whether they're singletons or not, since they introduce unconstrained lines of communication: you don't know what part of the code is affecting what other parts, or when that happens.

When to use mutexes?

I've been playing around with gtkmm and multi-threaded GUIs and stumbled into the concept of a mutex. From what I've been able to gather, it serves the purpose of locking access to a variable for a single thread in order to avoid concurrency issues. This I understand, and it seems rather natural; however, I still don't get how and when one should use a mutex. I've seen several uses where the mutex is only locked to access particular variables (e.g. like this tutorial). For which type of variables/data should a mutex be used?
PS: Most of the answers I've found on this subject are rather technical, and since I am far from an expert on this I was looking more for a conceptual answer.
If you have data that is accessed from more than a single thread, you probably need a mutex. You usually see something like
theMutex.lock();
do_something_with_data();
theMutex.unlock();
or a better idiom in C++ would be:
{
    MutexGuard m(theMutex);
    do_something_with_data();
}
where the MutexGuard c'tor does the lock() and the d'tor does the unlock().
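A sketch of what such a guard could look like, assuming C++11's std::mutex. MutexGuard is written out by hand here only to show the mechanism; std::lock_guard does the same job and is what you would normally use. shared_data is a hypothetical stand-in for the protected data:

```cpp
#include <cassert>
#include <mutex>

// Hand-written guard matching the description above.
class MutexGuard {
public:
    explicit MutexGuard(std::mutex& m) : mutex_(m) { mutex_.lock(); }  // c'tor locks
    ~MutexGuard() { mutex_.unlock(); }                                 // d'tor unlocks
    MutexGuard(const MutexGuard&) = delete;
    MutexGuard& operator=(const MutexGuard&) = delete;

private:
    std::mutex& mutex_;
};

std::mutex theMutex;
int shared_data = 0;  // hypothetical data protected by theMutex

void do_something_with_data() { ++shared_data; }

// Runs two guarded blocks back to back and returns how much shared_data grew.
// The reads outside the lock are fine only because this demo is single-threaded.
int guarded_update_twice() {
    const int before = shared_data;
    {
        MutexGuard m(theMutex);
        do_something_with_data();
    }  // m is destroyed here, so theMutex is released on every exit path
    {
        MutexGuard m(theMutex);  // would block forever if the first guard had not unlocked
        do_something_with_data();
    }
    assert(shared_data == before + 2);
    return shared_data - before;
}
```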
This general rule has a few exceptions:
If the data you are using can be accessed in an atomic manner, you don't need a lock. In Visual Studio you have functions like InterlockedIncrement() that do this. gcc has its own facilities to do this.
If you are accessing the data to only ever read it and never change it, it's usually safe to do without locking. But if even a single thread makes any change to the data, all the other threads need to make sure they don't try to read the data while it is being changed. You can also read about reader-writer locks for this kind of situation.
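As a sketch of the first exception, assuming a C++11 compiler: std::atomic is a portable way to get the kind of atomic increment that InterlockedIncrement() provides. The function name and parameters are illustrative only:

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Increments a shared counter from several threads without any mutex.
int count_with_atomics(int num_threads, int increments_per_thread) {
    assert(num_threads >= 0 && increments_per_thread >= 0);

    std::atomic<int> counter{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                counter.fetch_add(1);  // atomic read-modify-write, no lock needed
            }
        });
    }
    for (std::thread& w : workers) {
        w.join();  // wait for every worker before reading the result
    }
    return counter.load();
}
```

With a plain int instead of std::atomic<int> the same code would be a data race, and increments could be lost.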
A mutex should be used for variables that are changed among multiple threads. So data that is not modified (immutable) or data that is not shared does not need one.

How do I make any C++ library I make thread safe?

First of all, I'm fairly experienced with C++ and understand the basics of threading and thread synchronization. I also want to write a custom memory allocator as a pet project of mine and have read that they should be thread-safe.
I understand what the term "thread-safe" means, but I have no idea on how to make C++ code thread-safe.
Are there any practical examples or tutorials on how to make code thread-safe?
In a memory allocator scenario, is it essentially ensuring that all mutating functions are marked as critical sections? Or is there something more to it?
Same as all threading issues: make sure that when one thread is changing something, no other thread is accessing it. For a memory allocation system, I would imagine you would need a way of making sure you don't allocate the same block of memory to 2 threads at the same time. Whether that is by wrapping the entire search, or by allowing multiple searches but locking when the allocation table is to be updated (which could then cause the result of the search to become invalid, necessitating another search) would be up to you.
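A rough sketch of the first option (wrapping the entire search and update in one lock), assuming C++11. BlockPool is a hypothetical fixed-size block pool, not a general-purpose allocator, and it ignores alignment beyond what the block size happens to give you:

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical fixed-size block pool; one mutex guards the free list.
class BlockPool {
public:
    BlockPool(std::size_t block_size, std::size_t block_count)
        : storage_(block_size * block_count) {
        assert(block_size > 0);
        free_list_.reserve(block_count);
        for (std::size_t i = 0; i < block_count; ++i) {
            free_list_.push_back(storage_.data() + i * block_size);
        }
    }

    // Returns a free block, or nullptr if the pool is exhausted.
    void* allocate() {
        // Finding a block and removing it from the free list form one
        // critical section, so two threads can never receive the same block.
        std::lock_guard<std::mutex> guard(mutex_);
        if (free_list_.empty()) {
            return nullptr;
        }
        void* block = free_list_.back();
        free_list_.pop_back();
        return block;
    }

    // Returns a block previously obtained from allocate() to the pool.
    void deallocate(void* block) {
        std::lock_guard<std::mutex> guard(mutex_);
        free_list_.push_back(static_cast<unsigned char*>(block));
    }

private:
    std::mutex mutex_;
    std::vector<unsigned char> storage_;     // backing memory for all blocks
    std::vector<unsigned char*> free_list_;  // blocks currently available
};
```

The single lock is the simplest design; finer-grained schemes (per-thread pools, lock-free free lists) reduce contention at the cost of considerably more complexity.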

Standard way to make STL objects threadsafe?

I need several STL containers, threadsafe.
Basically I was thinking I just need 2 methods added to each of the STL container objects,
.lock()
.unlock()
I could also break it into
.lockForReading()
.unlockForReading()
.lockForWriting()
.unlockForWriting()
The way that would work is any number of locks for parallel reading are acceptable, but if there's a lock for writing then reading AND writing are blocked.
An attempt to lock for writing waits until the lockForReading semaphore drops to 0.
Is there a standard way to do this?
Is how I'm planning on doing this wrong or shortsighted?
This is really kind of bad. External code will not recognize or understand your threading semantics, and the ease of availability of aliases to objects in the containers makes them poor thread-safe interfaces.
Thread-safety occurs at design time. You can't solve thread safety by throwing locks at the problem. You solve thread safety by not having two threads writing to the same data at the same time, in the general case, of course. However, it is not the responsibility of a specific object to handle thread safety, except for the threading synchronization primitives themselves.
You can have concurrent containers, designed to allow concurrent use. However, their interfaces are vastly different to what's offered by the Standard containers. Fewer aliases to objects in the container, for example, and each individual operation is encapsulated.
The standard way to do this is to acquire the lock in a constructor and release it in the destructor. This is more commonly known as Resource Acquisition Is Initialization, or RAII. I strongly suggest you use this methodology rather than
.lock()
.unlock()
which is not exception safe. You can easily forget to unlock the mutex prior to throwing, resulting in a deadlock the next time a lock is attempted.
There are several synchronization types in the Boost.Thread library that will be useful to you, notably boost::mutex::scoped_lock. Rather than add lock() and unlock() methods to whatever container you wish to access from multiple threads, I suggest you use a boost::mutex or equivalent and instantiate a boost::mutex::scoped_lock whenever accessing the container.
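A sketch of the exception-safety point, using std::mutex and std::lock_guard from C++11 as the "or equivalent" of boost::mutex and its scoped_lock. The container, the mutex name and the validation rule are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <stdexcept>
#include <vector>

std::mutex container_mutex;  // hypothetical mutex guarding 'values'
std::vector<int> values;

// Appends a value; throws on negative input while the lock is held.
void append_checked(int v) {
    std::lock_guard<std::mutex> lock(container_mutex);  // locked in the constructor
    if (v < 0) {
        throw std::invalid_argument("negative value");
    }
    values.push_back(v);
}  // unlocked in the destructor, on normal return and on throw alike

// Returns the number of stored values after one failed and one successful append.
std::size_t demo_exception_safety() {
    try {
        append_checked(-1);  // throws while holding the lock
    } catch (const std::invalid_argument&) {
        // The lock was already released during stack unwinding.
    }
    append_checked(7);  // would block forever if the mutex were still held
    assert(!values.empty());
    return values.size();
}
```

With manual .lock() and .unlock() calls, the throw would skip the unlock and the second append would never get the mutex.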
Is there a standard way to do this?
No, and there's a reason for that.
Is how I'm planning on doing this wrong or shortsighted?
It's not necessarily wrong to want to synchronize access to a single container object, but the interface of the container class is very often the wrong place to put the synchronization (like DeadMG says: object aliases, etc.).
Personally I think both TBB and stuff like concurrent_vector may either be overkill or still the wrong tools for a "simple" synchronization problem.
I find that ofttimes just adding a (private) Lock object (to the class holding the container) and wrapping up the 2 or 3 access patterns to the one container object will suffice and will be much easier to grasp and maintain for others down the road.
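A sketch of that approach, assuming C++11. NameRegistry and its three operations are hypothetical; the point is that the lock is private, each access pattern is one complete operation, and nothing hands out references into the container:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical class that owns a container and the lock that protects it.
class NameRegistry {
public:
    void add(const std::string& name) {
        assert(!name.empty());  // illustrative precondition
        std::lock_guard<std::mutex> lock(mutex_);
        names_.push_back(name);
    }

    bool contains(const std::string& name) const {
        std::lock_guard<std::mutex> lock(mutex_);
        return std::find(names_.begin(), names_.end(), name) != names_.end();
    }

    // Returns a copy, so callers never hold an alias into the guarded container.
    std::vector<std::string> snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return names_;
    }

private:
    mutable std::mutex mutex_;        // mutable so const readers can lock it
    std::vector<std::string> names_;  // only touched while mutex_ is held
};
```

If reads dominate, C++17's std::shared_mutex with std::shared_lock for the readers gives the read/write split described in the question without changing this interface.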
Sam: You don't want a .lock() method because something could go awry that prevents calling the .unlock() method at the end of the block, but if .unlock() is called as a consequence of object destruction of a stack allocated variable then any kind of early return from the function that calls .lock() will be guaranteed to free the lock.
DeadMG:
Intel's Threading Building Blocks (open source) may be what you're looking for.
There are also Microsoft's concurrent_vector and concurrent_queue, which already come with Visual Studio 2010.

C++ volatile required when spinning on boost::shared_ptr operator bool()? [duplicate]

Possible Duplicate:
When to use volatile with multi threading?
I have two threads referencing the same boost::shared_ptr:
boost::shared_ptr<Widget> shared;
One thread is spinning, waiting for the other thread to reset the boost::shared_ptr:
while (shared)
    boost::thread::yield();
And at some point the other thread will call:
shared.reset();
My question is whether or not I need to declare the shared pointer as volatile to prevent the compiler from optimizing the call to shared.operator bool() out of the loop, so that the change is never detected. I know that if I were simply looping on a variable, waiting for it to reach 0, I would need volatile, but I'm not sure if boost::shared_ptr is implemented in such a way that it is not necessary here.
EDIT: I'm fully aware that condition variables can be used to solve this problem in a different way. But in this case, the busy loop is very uncommon and contending for the lock on the condition variable is an overhead we would rather not incur.
Rant 1:
That code probably won't do what you think it will. When you write code like that, you're introducing a data race into your code. This is almost certainly a bug that will result in your program non-deterministically failing.
Data structures (including shared_ptr) are generally not meant to be accessed concurrently. Do not modify the same structure at the same time in more than one thread. That could corrupt the structure. Do not modify it in one thread and read it in another thread. The reader could see inconsistent data. Multiple threads can probably read it at the same time.
If you think you really want to do some of the above, find out if the data structure allows some of these behaviors in a section probably titled "Thread Safety." If it does allow them, take a second look at whether your performance really needs this, and then use it. (The documentation on shared_ptr does NOT allow what you're doing.)
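If a spin loop really is what you want, here is a sketch of a race-free version, assuming C++11: spin on a std::atomic<bool> rather than on the shared_ptr itself. The function name and the value 123 are made up for illustration:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Spins until another thread sets the flag; returns the value the worker published.
int wait_for_worker() {
    std::atomic<bool> done{false};
    int result = 0;  // written before 'done' is set, read only after it is seen

    std::thread worker([&] {
        result = 123;                                 // ordinary write...
        done.store(true, std::memory_order_release);  // ...published by the release store
    });

    // Each iteration performs a real atomic load; no volatile is involved.
    while (!done.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    assert(done.load());

    worker.join();
    return result;  // the acquire load above makes the worker's write visible
}
```

Note that volatile does not appear anywhere: it neither makes an access atomic nor orders it with respect to other memory operations, which is what the atomic flag provides.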
Rant 2:
Now, for a higher-level concern, you probably shouldn't be doing thread synchronization by waiting for a pointer to be set to NULL. Really, look at condition variables, barriers, or futures as a way of getting one thread to wait until another is finished with something. It's a nicer interface, and whoever looks at your code next (that includes you in 6 months) will thank you.
I know you're concerned about the performance cost of real synchronization. Don't worry about this. It'll be fine. If you're worried about lock contention, use barriers or futures, which don't have a big shared lock to contend for.
Caveat: there is a time for writing code that avoids locks at all cost. But unless you're looking at profiler data that says your synch ops are too slow for your target workload, this isn't the time.
Rant 3:
I hope that shared in your example is global. Otherwise, you have multiple threads with local references to the same shared_ptr that points to the real object you're interested in. It kind of defeats the purpose of having a reference-counted pointer. Just please tell me it's global.
What you actually should do is to use condition variables. Busy waits are evil.
Edit: also, depending on your task, futures may be an even cleaner way to achieve what you want.
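A minimal sketch of the condition variable approach, assuming C++11's std::condition_variable (Boost.Thread has an equivalent). The function and variable names are illustrative:

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Blocks until the worker signals completion; returns true once signalled.
bool wait_with_condition_variable() {
    std::mutex m;
    std::condition_variable cv;
    bool finished = false;  // protected by m

    std::thread worker([&] {
        {
            std::lock_guard<std::mutex> lock(m);
            finished = true;  // change the state under the lock
        }
        cv.notify_one();      // then wake the waiting thread
    });

    {
        std::unique_lock<std::mutex> lock(m);
        // Sleeps instead of spinning; the predicate guards against
        // spurious wakeups and against a notification sent before the wait.
        cv.wait(lock, [&] { return finished; });
        assert(finished);
    }

    worker.join();
    return finished;
}
```

The waiting thread only takes the lock briefly around the wait, so contention is limited to the moment of the hand-off.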