Mistake in cpp specification about mutex try_lock? - c++

The page at the following link (http://www.cplusplus.com/reference/mutex/mutex/try_lock/) states that the sample can output only values from 1 to 100000. Is it really guaranteed that 0 can't appear in the output?
// mutex::try_lock example
#include <iostream> // std::cout
#include <thread> // std::thread
#include <mutex> // std::mutex
volatile int counter (0); // non-atomic counter
std::mutex mtx; // locks access to counter
void attempt_10k_increases () {
for (int i=0; i<10000; ++i) {
if (mtx.try_lock()) { // only increase if currently not locked:
++counter;
mtx.unlock();
}
}
}
int main ()
{
std::thread threads[10];
// spawn 10 threads:
for (int i=0; i<10; ++i)
threads[i] = std::thread(attempt_10k_increases);
for (auto& th : threads) th.join();
std::cout << counter << " successful increases of the counter.\n";
return 0;
}
In any case, it's easy to answer 'How do you get 2?', but it's really not clear how you can get 1 and never get 0.
The try_lock can "fail spuriously when no other thread has a lock on the mutex, but repeated calls in these circumstances shall succeed at some point", but if that is true, then the sample can output 0 (and can also output 1 in some cases).
But if the claim on that page is true and 0 cannot appear in the output, then maybe the words about "fail spuriously" are not true?

The Standard says the following:
30.4.1.2/14 [thread.mutex.requirements.mutex]
An implementation
may fail to obtain the lock even if it is not held by any other thread. [ Note: This spurious failure is
normally uncommon, but allows interesting implementations based on a simple compare and exchange
(Clause 29). —end note ]
So you can even get 0 if every try_lock call fails.
Also, please do not use cplusplus.com; it has a long history of mistakes.
It's safer to use cppreference.com, which is much closer to the Standard.
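A minimal sketch of the same experiment, reworked so that the count is returned instead of printed. The helper name count_successful_try_locks is made up for illustration. Going by the Standard wording quoted above, the only bounds that can be relied on are 0 and threads × attempts.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: runs `num_threads` threads that each call try_lock()
// `attempts` times, and returns how many of those calls succeeded.
int count_successful_try_locks(int num_threads, int attempts) {
    std::mutex mtx;
    int counter = 0;  // only modified while mtx is held

    auto worker = [&] {
        for (int i = 0; i < attempts; ++i) {
            if (mtx.try_lock()) {  // may fail, even when nobody holds the mutex
                ++counter;
                mtx.unlock();
            }
        }
    };

    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) threads.emplace_back(worker);
    for (auto& th : threads) th.join();
    return counter;  // join() makes every increment visible here
}
```

Calling count_successful_try_locks(10, 10000) corresponds to the sample in the question; the result varies from run to run.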

try_lock can fail if another thread held a lock and just released it, for example. You read that "repeated calls in these circumstances shall succeed at some point". Doing 10,000 calls to try_lock will count as "repeated calls" and one of them will succeed.

You can never get 0 (as long as try_lock does not fail spuriously):
When the first call to try_lock() happens (it doesn't matter which thread gets there first), the mutex is unlocked. This means that one of the 10 threads will manage to lock the mutex, meaning try_lock() will succeed.
You can get 1:
Let's say thread 0 manages to lock the mutex:
void attempt_10k_increases () {
for (int i=0; i<10000; ++i) {
if (mtx.try_lock()) { // thread 1 to 9 are here
++counter; // thread 0 is here
mtx.unlock();
}
}
}
Now suppose the OS scheduler chooses not to run thread 0 for a little while. In the meantime, threads 1 to 9 keep running, calling try_lock() and failing, because thread 0 holds the mutex.
Threads 1 to 9 are now done. They failed to acquire the mutex even once.
Thread 0 gets rescheduled. It unlocks the mutex and finishes.
Counter is now 1.
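The schedule described above can be forced rather than left to the scheduler. In this sketch (the helper name force_single_increment is hypothetical), the calling thread plays the role of thread 0. It uses lock() instead of try_lock() so that the first acquisition cannot fail spuriously, and it keeps the mutex until all other threads have used up their attempts.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper that forces the "counter == 1" schedule: one thread
// holds the mutex while all other threads exhaust their attempts.
int force_single_increment(int other_threads, int attempts) {
    std::mutex mtx;
    int counter = 0;

    mtx.lock();  // plays the role of "thread 0" winning the first attempt
    ++counter;

    std::vector<std::thread> threads;
    for (int t = 0; t < other_threads; ++t) {
        threads.emplace_back([&] {
            for (int i = 0; i < attempts; ++i) {
                if (mtx.try_lock()) {  // always false here: another thread holds the mutex
                    ++counter;
                    mtx.unlock();
                }
            }
        });
    }
    for (auto& th : threads) th.join();  // the other threads are now done

    mtx.unlock();  // "thread 0" is finally rescheduled and unlocks
    return counter;
}
```

With 9 other threads and 10000 attempts each, the function returns 1, matching the walkthrough.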

Related

Mutex in c++ is not running properly

I am trying to use a mutex to order the output of two threads: print the message from thread 1, then print the output from thread 2.
But the messages are printed in random order, so it seems I am not using the mutex correctly.
std::mutex mu;
void share_print(string msg, int id)
{
mu.lock();
cout << msg << id << endl;
mu.unlock();
}
void func1()
{
for (int i = 0; i > -50; i--)
{
share_print(string("From Func 1: "), i);
}
}
int main()
{
std::thread t1(func1);
for (int i = 0; i < 50; i++)
{
share_print(string("From Main: "), i);
}
t1.join();
return 0;
}
the output is:
Your usage of mutexes is 100% correct. It's your expectation of mutex behavior, and execution thread behavior, that misses the mark. For example, C++ execution threads give you no guarantees whatsoever that any line in func1 will be executed before main() completely finishes executing its for loop.
As far as mutexes are concerned, the only guarantees that matter here are:
Only one execution thread can hold a given std::mutex at a time.
When an execution thread attempts to lock a std::mutex, one of two things will happen: a) if the mutex is not locked, the thread locks it; b) if another thread already has it locked, or manages to lock it first, the thread blocks until the mutex is no longer locked, and then attempts to lock it again.
It is very important to understand all the implications of these rules. Even if your execution thread has a mutex locked, then proceeds to unlock it, and then lock it again, it may end up re-locking the mutex immediately even if another execution thread is also waiting to lock the mutex. Mutexes do not impose any kind of a queueing, a locking order, or a priority between different execution threads that are trying to lock it. It's a free-for-all.
Even if mutexes worked the way you expected them to work, that still gives you no guarantees whatsoever:
std::thread t1 (func1 );
Your only guarantee here is that func1 will be called by a new execution thread at some point on or after this std::thread object's construction finishes.
for (int i = 0; i < 50; i++)
{
share_print(string("From Main: "), i);
}
This entire for loop can finish even before a single line from func1 gets executed. It'll lock and unlock the mutex 50 times and call it a day, before func1 wakes up and does the same.
Or, alternatively, it's possible for func1 to run to completion before main enters the for loop.
You can have no expectations about the order of execution of multiple execution threads unless explicit synchronization takes place.
In order to achieve your interleaved output, a lot more work is needed. In addition to a mutex, there will need to be some kind of condition variable, and a separate variable that indicates whose "turn" it is. Each execution thread, both main and func1, will not only need to lock the mutex, but also block on the condition variable until the shared variable indicates that its turn is up, then do its printing, set the shared variable to indicate that it's the other thread's turn, signal the condition variable, and only then unlock the mutex (or always keep the mutex locked and always spin on the condition variable).
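A minimal sketch of the turn-taking scheme just described, assuming two participants labelled 'M' (main) and 'F' (func1). The helper name and the use of a string log in place of cout are illustrative choices, not part of the original code.

```cpp
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical helper: two threads append 'M' and 'F' to a log in strict
// alternation, `rounds` times each, starting with 'M'.
std::string alternate_with_condition_variable(int rounds) {
    std::mutex mu;
    std::condition_variable cv;
    char turn = 'M';  // whose turn it is; protected by mu
    std::string log;  // stands in for the printed output; protected by mu

    auto worker = [&](char me, char other) {
        for (int i = 0; i < rounds; ++i) {
            std::unique_lock<std::mutex> lock(mu);
            cv.wait(lock, [&] { return turn == me; });  // block until it is our turn
            log += me;                                   // the "printing" step
            turn = other;                                // hand the turn over
            cv.notify_all();                             // wake the other thread
        }
    };

    std::thread t1(worker, 'F', 'M');  // plays the role of func1
    worker('M', 'F');                  // the calling thread plays the role of main
    t1.join();
    return log;
}
```

Because each thread only proceeds when the shared variable names it, the order no longer depends on which thread wins the race for the mutex.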

Do locks in C++ 11 guarantee freshness of accessed data?

Usually, when using std::atomic types accessed concurrently by multiple threads, there's no guarantee that a thread will read the "up to date" value when accessing them; a thread may get a stale value from cache or any older value. The only way to get the up-to-date value is to use functions such as compare_exchange_XXX. (See questions here and here)
#include <atomic>
#include <mutex>
std::atomic<int> cancel_work{0};
std::mutex mutex;
//Thread 1 executes this function
void thread1_func()
{
cancel_work.store(1, <some memory order>);
}
// Thread 2 executes this function
void thread2_func()
{
//No guarantee tmp will be 1, even when thread1_func is executed first
int tmp = cancel_work.load(<some memory order>);
}
However my question is, what happens when using a mutex and lock instead? Do we have any guarantee of the freshness of shared data accessed?
For example, assuming both thread 1 and thread 2 are run concurrently and thread 1 obtains the lock first (executes first). Does it guarantee that thread 2 will see the modified value and not an old value?
Does it matter whether the shared data "cancel_work" is atomic or not in this case?
#include <atomic>
#include <mutex>
#include <thread>
int cancel_work = 0; //any difference if replaced with std::atomic<int> in this case?
std::mutex mutex;
// Thread 1 executes this function
void thread1_func()
{
//Assuming Thread 1 enters lock FIRST
std::lock_guard<std::mutex> lock(mutex);
cancel_work = 1;
}
// Thread 2 executes this function
void thread2_func()
{
std::lock_guard<std::mutex> lock(mutex);
int tmp = cancel_work; //Will tmp be 1 or 0?
}
int main()
{
std::thread t1(thread1_func);
std::thread t2(thread2_func);
t1.join(); t2.join();
return 0;
}
Yes, using the mutex/lock guarantees that thread2_func() will see the modified value, provided thread 1 acquired the lock first (as the question assumes).
However, according to the std::atomic specification:
The synchronization is established only between the threads releasing
and acquiring the same atomic variable. Other threads can see
different order of memory accesses than either or both of the
synchronized threads.
So your code will work correctly using acquire/release logic, too.
#include <atomic>
std::atomic<int> cancel_work = 0;
void thread1_func()
{
cancel_work.store(1, std::memory_order_release);
}
void thread2_func()
{
// tmp will be 1, when thread1_func is executed first
int tmp = cancel_work.load(std::memory_order_acquire);
}
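Note that an acquire load only sees the writes made before a release store once it actually reads the stored value. A single load that runs before the store can still return 0. A common way to use the pairing is to publish ordinary data through an atomic flag and have the reader wait for the flag. The sketch below assumes that pattern; the names payload, ready and publish_and_consume are made up for illustration.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: publishes a plain int through an atomic flag.
int publish_and_consume() {
    int payload = 0;                 // ordinary, non-atomic data
    std::atomic<bool> ready{false};  // the flag both threads synchronize on
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                                   // written before the release store
        ready.store(true, std::memory_order_release);
    });
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_acquire))  // wait until the store is observed
            std::this_thread::yield();
        seen = payload;  // the acquire load that read `true` makes payload == 42 visible
    });

    producer.join();
    consumer.join();
    return seen;
}
```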
The C++ standard only constrains the observable behavior of the abstract machine in well formed programs without undefined behavior anywhere during the abstract machine's execution.
It provides no guarantees about mapping between the physical hardware actions the program executes and behavior.
In your case, on the abstract machine there is no ordering between the execution of thread1 and thread2. Even if the physical hardware were to schedule and run thread1 before thread2, that places zero constraints (in your simple example) on the output the program generates. The program's output is only constrained by what legal outputs the abstract machine could produce.
A C++ compiler can legally:
1. Eliminate your program completely as equivalent to return 0;
2. Prove that the read of cancel_work in thread2 is unsequenced relative to all modification of cancel_work away from 0, and change it to a constant read of 0.
3. Actually run thread1 first then run thread2, but prove it can treat the operations in thread2 as-if they occurred before thread1 ran, so it doesn't bother forcing a cache line refresh in thread2 and reads stale data from cancel_work.
What actually happens on the hardware does not impact what the program can legally do. And what the program can legally do in threading situations is restricted by the observable behavior of the abstract machine, and by the behavior of synchronization primitives and their use in different threads.
For an actual happens before relationship to occur, you need something like:
std::thread(thread1_func).join();
std::thread(thread2_func).join();
and now we do know that everything in thread1_func happens before thread2_func.
We can still rewrite your program as return 0; and similar changes. But we now have a guarantee that thread1_func happens before thread2_func code does.
Note that we can eliminate (1) above via:
std::lock_guard<std::mutex> lock(mutex);
int tmp = cancel_work; //Will tmp be 1 or 0?
std::cout << tmp;
and cause tmp to actually be printed.
The program can then be converted to one that prints 1 or 0 and has no threading at all. It could keep the threading, but change thread2_func to print a constant 0. Etc.
So we rewrite your program to look like this:
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>
std::condition_variable cv;
bool writ = false;
int cancel_work = 0; //any difference if replaced with std::atomic<int> in this case?
std::mutex mutex;
// Thread 1 executes this function
void thread1_func()
{
{
std::lock_guard<std::mutex> lock(mutex);
cancel_work = 1;
}
{
std::lock_guard<std::mutex> lock(mutex);
writ = true;
cv.notify_all();
}
}
// Thread 2 executes this function
void thread2_func()
{
std::unique_lock<std::mutex> lock(mutex);
cv.wait(lock, []{ return writ; } );
int tmp = cancel_work;
std::cout << tmp; // will print 1
}
int main()
{
std::thread t1(thread1_func);
std::thread t2(thread2_func);
t1.join(); t2.join();
return 0;
}
and now thread2_func happens after thread1_func and all is good. The read is guaranteed to be 1.

Execution not switching between thread (c++11)

I am a beginner in C++11 multithreading. I am working with small codes and came into this problem. Here is the code:
#include <iostream>
#include <thread>
#include <vector>
#include <mutex>
std::mutex print_mutex;
void function1()
{
std::cout << "Thread1 started" << std::endl;
while (true)
{
std::unique_lock<std::mutex> lock(print_mutex);
for (size_t i = 0; i<= 1000000000; i++)
continue;
std::cout << "This is function1" << std::endl;
lock.unlock();
}
}
void function2()
{
std::cout << "Thread2 started" << std::endl;
while (true)
{
std::unique_lock<std::mutex> lock(print_mutex);
for (size_t i = 0; i <= 1000000000; i++)
continue;
std::cout << "This is function2" << std::endl;
lock.unlock();
}
}
int main()
{
std::thread t1(function1);
std::thread t2(function2);
t1.join();
t2.join();
return 0;
}
I have written code with the intuition of expecting the following output:
Thread1 started
Thread2 started
This is function1
This is function2
This is function1
. . . .
But the output shown is as follows:
Thread1 started
Thread2 started
This is function1
This is function1
This is function1
. . .
Where am I going wrong?
Unlocking a mutex does not guarantee that another thread that's waiting to lock the same mutex will immediately acquire a lock.
It only guarantees that the other thread will TRY to acquire the lock.
In this case, after you unlock the mutex in one thread, the same thread will immediately try to lock it again. Even though another thread was waiting patiently, for the mutex, it's not a guarantee that the other thread will win this time. The same thread that just locked it can succeed in immediately locking it again.
Today, you're seeing that the same thread always wins the locking race. Tomorrow, you may find that it's always the other thread that does. You have no guarantees whatsoever about which thread will acquire the mutex when more than one thread is going after the same mutex at the same time. The winner depends on your CPU and other hardware architecture, how heavily the system is loaded at the time, and many other factors.
Both of your threads are doing the following steps:
Lock
Long empty loop
Print
Unlock
Lock
Long empty loop
(and so on)
Practically, you haven't left any time for context switching: there is a lock right after the unlock. Solution: swap the "lock" and the "long empty loop" steps, so that only the "print" step is locked and the scheduler can switch to the other thread during the "long empty loop".
Welcome to threads!
Edit: Pro tip: Debugging multithreaded programs is hard. But sometimes it's worth inserting a plain printf() to indicate locks and unlocks (in the right order: lock, then printf; and printf, then unlock), even when the program seems correct. In this case you could see the zero gap between unlock and lock.
This is a valid result. Your code does not try to control the execution order in any way, so as long as all threads execute at some point, there is no problem and it's a legitimate result.
This could happen even if you swapped the order of the loop and the lock (see here), because, again, you haven't written anything that attempts to control the order, e.g. using condition variables or just a silly atomic_bool flag to alternate the runs (it is a silly solution, meant only to demonstrate how you can actually make the threads alternate and be sure they will).
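For illustration, the "silly" atomic-flag approach might look like the sketch below. It uses an atomic int holding the id of the thread that may run next instead of a bool, busy-waits with yield(), and collects the output in a string instead of printing. All names are hypothetical.

```cpp
#include <atomic>
#include <string>
#include <thread>

// Hypothetical helper: threads 1 and 2 append their id to a log in strict
// alternation, `rounds` times each, starting with thread 1.
std::string alternate_with_atomic_turn(int rounds) {
    std::atomic<int> turn{1};  // id of the thread allowed to run next
    std::string log;           // only touched by the thread whose turn it is

    auto worker = [&](int me, int other) {
        for (int i = 0; i < rounds; ++i) {
            while (turn.load() != me)  // busy-wait until it is our turn
                std::this_thread::yield();
            log += static_cast<char>('0' + me);  // stands in for "This is functionN"
            turn.store(other);                    // pass the turn on
        }
    };

    std::thread t1(worker, 1, 2);
    std::thread t2(worker, 2, 1);
    t1.join();
    t2.join();
    return log;
}
```

Busy-waiting burns CPU time, which is why a condition variable is usually the better tool; the point here is only that the alternation has to be coded explicitly.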

C++ Holding a number of threads

I'm new to C++ (on Windows) and threading and I'm currently trying to find a solution to my problem using mutexes, semaphores and events.
I'm trying to create a Barrier class with a constructor and a method called Enter. The Barrier class, with its only method Enter, is supposed to hold off any thread that enters it until a given number of threads have reached that method. The number of threads to wait for is received in the constructor.
My problem is: how do I use the locks to create that effect? What I need is something like a reversed semaphore, one that holds threads until a count has been reached, unlike a regular semaphore, which lets threads in until a count is reached.
Any ideas as to how to go about this would be great.
Thanks,
Netanel.
Maybe:
In the ctor, store the limit count and create an empty semaphore.
When a thread calls Enter, lock a mutex first so you can twiddle inside safely. Increment a thread count toward the limit count. If the limit has not yet been reached, release the mutex and wait on the semaphore. If the limit is reached, signal the semaphore [limit-1] times in a loop, zero the thread count (ready for next time), release the mutex and return from Enter(). Any threads that were waiting on the semaphore, and are now ready/running, should just return from their 'Enter' call.
The mutex prevents any released thread that loops around from 'getting in again' until all the threads that called 'Enter' and waited have been set running and the barrier is reset.
You can implement it with condition variable.
Here is an example:
I declare 25 threads and launch them doing the WorkerThread function.
The condition I am checking to block/unblock the threads is whether the number of threads in the section is less than 2.
(I have added some asserts to prove what my code does.)
My code simply sleeps in the critical section and afterwards decreases the number of threads in the critical section.
I also added a mutex for the cout to have clean messages.
#include <iostream>
#include <thread>
#include <mutex>
#include <condition_variable>
#include <atomic>
#include <vector>
#include <chrono>
#include <assert.h> /* assert */
using namespace std;
std::mutex m;
atomic<int> NumThreadsInCritialSection=0;
int MaxNumberThreadsInSection=2;
std::condition_variable cv;
mutex coutMutex;
int WorkerThread()
{
// Wait until fewer than MaxNumberThreadsInSection threads are in the section
{
std::unique_lock<std::mutex> lk(m);
cv.wait(lk, []{return NumThreadsInCritialSection<MaxNumberThreadsInSection;});
assert (NumThreadsInCritialSection<MaxNumberThreadsInSection);
assert (NumThreadsInCritialSection>=0);
// increment while still holding the lock, so the check and the update cannot be separated
NumThreadsInCritialSection++;
}
{
std::unique_lock<std::mutex> lk(coutMutex);
cout<<"NumThreadsInCritialSection= "<<NumThreadsInCritialSection<<endl;
}
std::this_thread::sleep_for(std::chrono::seconds(5));
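{
// decrement under the same mutex the waiters use, so no wakeup is lost
std::lock_guard<std::mutex> lk(m);
NumThreadsInCritialSection--;
}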
{
std::unique_lock<std::mutex> lk(coutMutex);
cout<<"NumThreadsInCritialSection= "<<NumThreadsInCritialSection<<endl;
}
cv.notify_one();
return 0;
}
int main()
{
vector<thread> vWorkers;
for (int i=0;i<25;++i)
{
vWorkers.push_back(thread(WorkerThread));
}
for (auto j=vWorkers.begin(); j!=vWorkers.end(); ++j)
{
j->join();
}
return 0;
}
Hope that helps, tell me if you have any questions, I can comment or change my code.
Pseudocode outline might look like this:
void Enter()
{
Increment counter (atomically or with mutex)
if(counter >= desired_count)
{
condition_met = true; (protected if bool writes aren't atomic on your architecture)
cond_broadcast(blocking_cond_var);
}
else
{
Do a normal cond_wait loop-plus-predicate-check (waiting for the broadcast and checking condition_met each iteration to protect for spurious wakeups).
}
}
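One way the pseudocode outline could be turned into standard C++ is sketched below. It departs from the outline in one respect, which is an assumption on my part: instead of a single condition_met flag it uses a generation counter, so the barrier can be reused for further rounds. The count_early_releases function is only a hypothetical check that no thread gets through early.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

class Barrier {
public:
    explicit Barrier(int count) : threshold_(count) {}

    // Blocks until `count` threads have called Enter(), then releases them all.
    void Enter() {
        std::unique_lock<std::mutex> lock(mutex_);
        const unsigned long my_generation = generation_;
        if (++waiting_ == threshold_) {
            waiting_ = 0;    // reset for the next round
            ++generation_;   // marks this round as complete
            cv_.notify_all();
        } else {
            // The predicate protects against spurious wakeups.
            cv_.wait(lock, [&] { return generation_ != my_generation; });
        }
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    const int threshold_;
    int waiting_ = 0;
    unsigned long generation_ = 0;
};

// Hypothetical check: returns how many threads left the barrier before
// all `n` threads had arrived (expected to be 0).
int count_early_releases(int n) {
    Barrier barrier(n);
    std::atomic<int> arrived{0};
    std::atomic<int> early{0};
    std::vector<std::thread> threads;
    for (int t = 0; t < n; ++t) {
        threads.emplace_back([&] {
            ++arrived;
            barrier.Enter();
            if (arrived.load() != n) ++early;  // would mean we were released too soon
        });
    }
    for (auto& th : threads) th.join();
    return early.load();
}
```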

Semaphore Vs Mutex

I was reading a bit about mutexes and semaphores.
I have this piece of code:
int func()
{
i++;
return i;
}
i is declared somewhere outside as a global variable.
If I create a counting semaphore with a count of 3, won't it have a race condition? Does that mean I should be using a binary semaphore or a mutex in this case?
Can somebody give me some practical scenarios where a mutex, a critical section and semaphores can be used?
I have probably read a lot, and in the end I am a bit confused. Can somebody clear this up?
P.S.: I have understood that the primary difference between a mutex and a binary semaphore is ownership, and that a counting semaphore should be used as a signaling mechanism.
Differences between mutex and semaphore (I never worked with CriticalSection):
When using condition variables, its lock must be a mutex.
When using more than 1 available resources, you must use a semaphore initialized with the number of available resources, so when you're out of resources, the next thread blocks.
When using 1 resource or some code that may only be executed by 1 thread, you have the choice of using a mutex or a semaphore initialized with 1 (this is the case for OP's question).
When letting a thread wait until signaled by another thread, you need a semaphore initialized with 0 (the waiting thread does sem.p(), the signalling thread does sem.v()).
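To make the p()/v() semantics in the list above concrete, here is a minimal counting semaphore built from a mutex and a condition variable. It is a sketch, not a replacement for a platform semaphore. The peak_concurrency function is a hypothetical check for the "several available resources" case: it reports the largest number of threads that held a permit at the same time.

```cpp
#include <algorithm>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

class Semaphore {
public:
    explicit Semaphore(int initial) : count_(initial) {}

    void p() {  // wait: blocks while no permit is available
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [&] { return count_ > 0; });
        --count_;
    }

    void v() {  // signal: returns a permit and wakes one waiter
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++count_;
        }
        cv_.notify_one();
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    int count_;
};

// Hypothetical check: the peak number of threads holding a permit at once.
int peak_concurrency(int permits, int num_threads) {
    Semaphore sem(permits);
    std::mutex stats_mutex;  // protects inside and peak
    int inside = 0, peak = 0;
    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) {
        threads.emplace_back([&] {
            sem.p();
            {
                std::lock_guard<std::mutex> lock(stats_mutex);
                peak = std::max(peak, ++inside);
            }
            std::this_thread::yield();  // pretend to use the resource
            {
                std::lock_guard<std::mutex> lock(stats_mutex);
                --inside;
            }
            sem.v();
        });
    }
    for (auto& th : threads) th.join();
    return peak;
}
```

With an initial count of 1 the same class behaves like the binary semaphore mentioned in the list; with an initial count of 0, p() blocks until another thread calls v().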
A critical section object is the easiest way here. It is a lightweight synchronisation object.
Here is some code as example:
#include <windows.h>
#include <stdio.h>
// WaitForMultipleObjects accepts at most MAXIMUM_WAIT_OBJECTS (64) handles
#define NUMBER_OF_THREADS 64
// global
CRITICAL_SECTION csMyCriticalSectionObject;
int i = 0;
HANDLE hThread[NUMBER_OF_THREADS];
DWORD WINAPI func( LPVOID lpvParam ); // thread function, defined below
int main(int argc, char *argv[])
{
// initialize the critical section object
InitializeCriticalSection(&csMyCriticalSectionObject);
// create the threads and keep their handles:
for (int n = 0; n < NUMBER_OF_THREADS; n++)
{
hThread[n] = CreateThread(NULL,0,func,NULL,0,NULL);
if (!hThread[n])
{
fprintf(stderr,"Failed to create thread\n");
return 1;
}
}
// wait for all threads:
WaitForMultipleObjects(NUMBER_OF_THREADS,hThread,TRUE,INFINITE);
// this can be made more detailed/complex to find each thread ending with its
// exit code. See documentation for that
DeleteCriticalSection(&csMyCriticalSectionObject);
return 0;
}
Links: CreateThread function and WaitForMultipleObjects function
With the thread:
// i is global, no need for i to returned by the thread
DWORD WINAPI func( LPVOID lpvParam )
{
EnterCriticalSection(&csMyCriticalSectionObject);
i++;
LeaveCriticalSection(&csMyCriticalSectionObject);
return GetLastError();
}
A mutex and/or semaphore is going too far for this purpose.
Edit: A semaphore is basically a mutex which can be released multiple times. It stores the number of release operations and can therefore release the same number of waits on it.
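For readers not on Windows, a portable sketch of the same idea using std::mutex from the C++ standard library follows. The names i_mutex and run_threads are made up for illustration; func keeps the shape it has in the question.

```cpp
#include <mutex>
#include <thread>
#include <vector>

int i = 0;           // global counter, as in the question
std::mutex i_mutex;  // hypothetical name; guards every access to i

int func() {
    std::lock_guard<std::mutex> lock(i_mutex);  // released automatically on return
    i++;
    return i;
}

// Hypothetical driver: calls func() once from each of `num_threads` threads
// and returns the final value of i.
int run_threads(int num_threads) {
    std::vector<std::thread> threads;
    for (int n = 0; n < num_threads; ++n) threads.emplace_back([] { func(); });
    for (auto& th : threads) th.join();
    std::lock_guard<std::mutex> lock(i_mutex);
    return i;
}
```

Because the increment and the read in func happen under the same lock, every caller gets a distinct return value, and no increment is lost.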