C++: Printing/cout overlaps in multithreading?

I was wondering how I could handle printing when using multiple threads.
I thought it would be pretty simple:
#include <iostream>
#include <pthread.h>

using namespace std;

bool printing = false;

struct argumentStruct {
    int a;
    float b;
};

void *ThreadFunction(void *arguments) {
    struct argumentStruct *args = (struct argumentStruct *)arguments;
    int a = args->a;
    float b = args->b;
    while (printing) {}
    printing = true;
    cout << "Some text...." << a << b << endl;
    printing = false;
    return NULL;
}

int main() {
    pthread_t threads[3];
    struct argumentStruct argStruct[3];
    argStruct[0].a = 1;
    argStruct[0].b = 1.1;
    pthread_create(&threads[0], NULL, &ThreadFunction, (void *)&argStruct[0]);
    argStruct[1].a = 2;
    argStruct[1].b = 2.2;
    pthread_create(&threads[1], NULL, &ThreadFunction, (void *)&argStruct[1]);
    argStruct[2].a = 3;
    argStruct[2].b = 3.3;
    pthread_create(&threads[2], NULL, &ThreadFunction, (void *)&argStruct[2]);
    getchar();
    return 0;
}
But this doesn't really work that well. Some couts are just skipped (or maybe overwritten?).
So what am I doing wrong? How can I handle this properly?

The problem is that the statements that test and set the printing variable are not atomic, i.e., they don't execute without being interrupted by the OS scheduler, which switches the CPU among threads. You should use a mutex to stop other threads while printing. Here you have a nice example:
http://sourcecookbook.com/en/recipes/70/basic-and-easy-pthread-mutex-lock-example-c-thread-synchronization
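Applied to the code in the question, that recipe boils down to something like the following sketch. The helper `run_demo` and the `printedLines` counter are added here only to make the behaviour checkable; they are not part of the original code:

```cpp
#include <pthread.h>
#include <iostream>

pthread_mutex_t printMutex = PTHREAD_MUTEX_INITIALIZER;
int printedLines = 0; // only touched while printMutex is held

struct argumentStruct {
    int a;
    float b;
};

void *ThreadFunction(void *arguments) {
    argumentStruct *args = static_cast<argumentStruct *>(arguments);
    pthread_mutex_lock(&printMutex);   // blocks until no other thread holds the lock
    std::cout << "Some text...." << args->a << " " << args->b << "\n";
    ++printedLines;
    pthread_mutex_unlock(&printMutex); // let the next waiting thread print
    return NULL;
}

// Launches three threads like the question's main() and waits for them,
// instead of relying on getchar() to keep the process alive.
int run_demo() {
    pthread_t threads[3];
    argumentStruct argStruct[3] = {{1, 1.1f}, {2, 2.2f}, {3, 3.3f}};
    for (int i = 0; i < 3; ++i)
        pthread_create(&threads[i], NULL, &ThreadFunction, &argStruct[i]);
    for (int i = 0; i < 3; ++i)
        pthread_join(threads[i], NULL);
    return printedLines; // safe to read: all writers have been joined
}
```

Because lock/unlock happen as single indivisible operations inside the mutex, no two threads can be between them at the same time, which is exactly what the hand-rolled `printing` flag failed to guarantee.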

You have a race condition, where two (or more) threads can both set printing to true.
This is because assignment is not an atomic operation, it's done in multiple steps by the CPU, and if the thread is interrupted before the actual setting of the variable to true, and another thread starts running, then you can have two threads running simultaneously both believing the variable is true. For more clarity:
Thread A sees that printing is false
Thread A is interrupted
Thread B starts running
Thread B sees that printing is false
Thread B sets printing to true
Thread B is interrupted
Thread A is scheduled and starts running again
Thread A sets printing to true
Now both thread A and B are running full speed ahead.
That's why there are threading primitives such as semaphores and mutexes that handle these things.
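For completeness, C++11 also exposes exactly the missing primitive: std::atomic_flag::test_and_set performs the test and the set as one indivisible step, so the interleaving above cannot happen. A minimal spinlock sketch (the names `critical`, `shared_counter`, and `run_demo` are illustrative, not from the question):

```cpp
#include <atomic>
#include <thread>
#include <vector>

// The "test and set in one uninterruptible step" described above:
// test_and_set atomically reads the old value and sets the flag.
std::atomic_flag printing = ATOMIC_FLAG_INIT;
int shared_counter = 0; // only touched while the flag is held

void critical() {
    while (printing.test_and_set(std::memory_order_acquire))
        ; // spin: another thread got there first
    ++shared_counter; // safe: we hold the "lock"
    printing.clear(std::memory_order_release);
}

int run_demo() {
    std::vector<std::thread> threads;
    for (int i = 0; i < 4; ++i)
        threads.emplace_back([] {
            for (int j = 0; j < 1000; ++j)
                critical();
        });
    for (auto &t : threads)
        t.join();
    return shared_counter; // 4 threads x 1000 increments, none lost
}
```

With the non-atomic bool flag from the question, some of the 4000 increments would be lost; with test_and_set, none are.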


Which types of memory_order should be used for non-blocking behaviour with an atomic_flag?

I'd like, instead of having my threads wait, doing nothing, for other threads to finish using data, to do something else in the meantime (like checking for input, or re-rendering the previous frame in the queue, and then returning to check to see if the other thread is done with its task).
I think this code that I've written does that, and it "seems" to work in the tests I've performed, but I don't really understand exactly how std::memory_order_acquire and std::memory_order_release work, so I'd like some expert advice on whether I'm using them correctly to achieve the behaviour I want.
Also, I've never seen multithreading done this way before, which makes me a bit worried. Are there good reasons not to have a thread do other tasks instead of waiting?
/*test program
intended to test if atomic flags can be used to perform other tasks while shared
data is in use, instead of blocking
each thread enters the flag protected part of the loop 20 times before quitting
if the flag indicates that the if block is already in use, the thread is intended to
execute the code in the else block (only up to 5 times to avoid cluttering the output)
debug note: this doesn't work with std::cout because all the threads are using it at once
and it's not thread safe so it all gets garbled. at least it didn't crash
real world usage
one thread renders and draws to the screen, while the other checks for input and
provides frameData for the renderer to use. neither thread should ever block*/
#include <fstream>
#include <atomic>
#include <thread>
#include <string>

struct ThreadData {
    int numTimesToWriteToDebugIfBlockFile;
    int numTimesToWriteToDebugElseBlockFile;
};

class SharedData {
public:
    SharedData() {
        threadData = new ThreadData[10];
        for (int a = 0; a < 10; ++a) {
            threadData[a] = { 20, 5 };
        }
        flag.clear();
    }
    ~SharedData() {
        delete[] threadData;
    }
    void runThread(int threadID) {
        while (this->threadData[threadID].numTimesToWriteToDebugIfBlockFile > 0) {
            if (!this->flag.test_and_set(std::memory_order_acquire)) {
                std::string fileName = "debugIfBlockOutputThread#";
                fileName += std::to_string(threadID);
                fileName += ".txt";
                std::ofstream writeFile(fileName.c_str(), std::ios::app);
                writeFile << threadID << ", running, output #" << this->threadData[threadID].numTimesToWriteToDebugIfBlockFile << std::endl;
                writeFile.close();
                writeFile.clear();
                this->threadData[threadID].numTimesToWriteToDebugIfBlockFile -= 1;
                this->flag.clear(std::memory_order_release);
            }
            else {
                if (this->threadData[threadID].numTimesToWriteToDebugElseBlockFile > 0) {
                    std::string fileName = "debugElseBlockOutputThread#";
                    fileName += std::to_string(threadID);
                    fileName += ".txt";
                    std::ofstream writeFile(fileName.c_str(), std::ios::app);
                    writeFile << threadID << ", standing by, output #" << this->threadData[threadID].numTimesToWriteToDebugElseBlockFile << std::endl;
                    writeFile.close();
                    writeFile.clear();
                    this->threadData[threadID].numTimesToWriteToDebugElseBlockFile -= 1;
                }
            }
        }
    }
private:
    ThreadData* threadData;
    std::atomic_flag flag;
};

void runThread(int threadID, SharedData* sharedData) {
    sharedData->runThread(threadID);
}

int main() {
    SharedData sharedData;
    std::thread thread[10];
    for (int a = 0; a < 10; ++a) {
        thread[a] = std::thread(runThread, a, &sharedData);
    }
    for (int a = 0; a < 10; ++a) {
        thread[a].join();
    }
    return 0;
}
The memory ordering you're using here is correct.
The acquire memory order when you test and set your flag (to take your hand-written lock) has the effect, informally speaking, of preventing any memory accesses of the following code from becoming visible before the flag is tested. That's what you want, because you want to ensure that those accesses are effectively not done if the flag was already set. Likewise, the release order on the clear at the end prevents any of the preceding accesses from becoming visible after the clear, which is also what you need so that they only happen while the lock is held.
However, it's probably simpler to just use a std::mutex. If you don't want to wait to take the lock, but instead do something else if you can't, that's what try_lock is for.
class SharedData {
    // ...
private:
    std::mutex my_lock;
};

// ...
if (my_lock.try_lock()) {
    // lock was taken, proceed with critical section
    my_lock.unlock();
} else {
    // lock not taken, do non-critical work
}
This may have a bit more overhead, but avoids the need to think about atomicity and memory ordering. It also gives you the option to easily do a blocking wait if that later becomes useful. If you've designed your program around an atomic_flag and later find a situation where you must wait to take the lock, you may find yourself stuck with either spinning while continually retrying the lock (which is wasteful of CPU cycles), or something like std::this_thread::yield(), which may wait for longer than necessary after the lock is available.
It's true this pattern is somewhat unusual. If there is always non-critical work to be done that doesn't need the lock, commonly you'd design your program to have a separate thread that just does the non-critical work continuously, and then the "critical" thread can just block as it waits for the lock.
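A fleshed-out sketch of that try_lock pattern, with counters standing in for the critical and non-critical work (the names `critical_done`, `other_done`, and `run_demo` are illustrative, not from the question):

```cpp
#include <atomic>
#include <mutex>
#include <thread>

std::mutex my_lock;
std::atomic<int> critical_done{0}; // units of "flag protected" work completed
std::atomic<int> other_done{0};    // units of fallback work done while the lock was busy

void worker() {
    int remaining = 100; // like numTimesToWriteToDebugIfBlockFile
    while (remaining > 0) {
        if (my_lock.try_lock()) {
            ++critical_done; // lock taken: do the critical-section work
            --remaining;
            my_lock.unlock();
        } else {
            ++other_done;    // lock busy: do something else instead of blocking
        }
    }
}

int run_demo() {
    std::thread a(worker), b(worker);
    a.join();
    b.join();
    return critical_done.load(); // each worker completes exactly 100 critical units
}
```

Each thread eventually finishes all 100 of its critical units, and whatever it did while waiting shows up in `other_done` instead of being burned in a spin loop.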

C++ cancelling a pthread using a secondary thread stuck in function call

I'm having trouble instituting a timeout in one of my pthreads. I've simplified my code here and I've isolated the issue to be the CNF algorithm I'm running in the thread.
int main() {
    pthread_t t1;
    pthread_t t2;
    pthread_t t3; //Running multiple threads, the others work fine and do not require a timeout.
    pthread_create(&t1, nullptr, thread1, &args);
    pthread_join(t1, nullptr);
    std::cout << "Thread should exit and print this\n"; //This line never prints, which I've figured is due to a lack of cancellation points in the actual function running in the thread.
    return 0;
}
void* to(void* args) {
    int timeout{120};
    int count{0};
    while (count < timeout) {
        sleep(1);
        count++;
    }
    std::cout << "Killing main thread" << std::endl;
    pthread_cancel(*(pthread_t *)args);
    return nullptr;
}
void *thread1(void *arguments) {
    //Create the timeout thread within the CNF thread to wait 2 minutes and then exit this whole thread
    pthread_t time;
    pthread_t cnf = pthread_self();
    pthread_create(&time, nullptr, &to, &cnf);
    //This part runs and prints that the thread has started
    std::cout << "CNF running\n";
    auto *args = (struct thread_args *) arguments;
    int start = args->vertices;
    int end = 1;
    while (start >= end) {
        //This is where the issue lies
        cover = find_vertex_cover(args->vertices, start, args->edges_a, args->edges_b);
        start--;
    }
    pthread_cancel(time); //If the algorithm executes in the required time then the timeout is not needed and that thread is cancelled.
    std::cout << "CNF END\n";
    return nullptr;
}
I tried commenting out the find_vertex_cover call and adding an infinite loop instead, and I was able to create a timeout and end the thread that way. The function is actually working exactly the way it should: under the conditions I'm running it at, it is expected to take forever, which is precisely why I need a timeout.
//This was a test thread function that I used to validate that implementing the timeout using `pthread_cancel()` this way works. The thread will exit once the timeout is reached.
void *thread1(void *args) {
    pthread_t x1;
    pthread_t x2 = pthread_self();
    pthread_create(&x1, nullptr, to, &x2);
    /*
    for (int i = 0; i < 100; i++) {
        sleep(1);
        std::cout << i << std::endl;
    }
    */
    return nullptr;
}
Using this function I was able to validate that my timeout thread approach worked. The issue is when I actually run the CNF algorithm (using Minisat under the hood) once find_vertex_cover runs, there is no way to end the thread. The algorithm is expected to fail in the situation I'm implementing which is why a timeout is being implemented.
I've read up on using pthread_cancel() and while it isn't a great way it's the only way I could implement a timeout.
Any help on this issue would be appreciated.
I've read up on using pthread_cancel() and while it isn't a great way [..]
That's right. pthread_cancel should be avoided. It's especially bad in C++, as it's incompatible with exception handling. You should use std::thread, and for thread termination you can use a condition variable or an atomic variable that terminates the "infinite loop" when set.
That aside, cancellation via pthread_cancel depends on two things: 1) cancellation state 2) cancellation type.
The default cancellation state is enabled. But the default cancellation type is deferred, meaning the cancellation request will be delivered only at the next cancellation point. I suspect there aren't any cancellation points in find_vertex_cover. So you could set the cancellation type to asynchronous via the call:
pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, NULL);
from the thread(s) you want to be able to cancel immediately.
But again, I suggest to not go for pthread_cancel approach at all and instead rewrite the "cancel" logic so that it doesn't involve pthread_cancel.
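A minimal sketch of that cooperative approach, with a trivial loop standing in for the long-running algorithm (the names `stop_requested`, `worker_loop`, and `run_demo` are illustrative, not from the question):

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Cooperative cancellation: the worker polls an atomic flag between
// units of work instead of being pthread_cancel'ed from outside.
std::atomic<bool> stop_requested{false};

int worker_loop() {
    int iterations = 0;
    while (!stop_requested.load()) {       // check between units of work
        ++iterations;                      // stand-in for one step of the algorithm
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    return iterations;                     // normal return: destructors run, no UB
}

bool run_demo() {
    int result = -1;
    std::thread worker([&] { result = worker_loop(); });
    std::this_thread::sleep_for(std::chrono::milliseconds(50)); // the "timeout"
    stop_requested = true;                 // request cancellation
    worker.join();                         // worker exits cleanly
    return result >= 0;                    // worker wrote its result before exiting
}
```

The caveat: this only works if the algorithm can check the flag between units of work. A single long-running library call can't be stopped this way; for that you'd need whatever interruption hook the underlying solver itself exposes.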

Linux Multithreading - threads do not produce any output as expected

I am learning multi-threading on the Linux platform. I wrote this small program to get comfortable with the concepts. On running the executable, I could not see any error, nor does it print Hi. Hence I made the thread sleep, but I still could not see the prints on the console.
I also want to know which thread prints at run time. Can anyone help me?
#include <iostream>
#include <unistd.h>
#include <pthread.h>
using std::cout;
using std::endl;
void* print(void* data)
{
    cout << "Hi" << endl;
    sleep(10000000);
}

int main(int argc, char* argv[])
{
    int t1 = 1, t2 = 2, t3 = 3;
    pthread_t thread1, thread2, thread3;
    int thread_id_1, thread_id_2, thread_id_3;
    thread_id_1 = pthread_create(&thread1, NULL, print, 0);
    thread_id_2 = pthread_create(&thread2, NULL, print, 0);
    thread_id_3 = pthread_create(&thread3, NULL, print, 0);
    return 0;
}
Your main thread probably exits and thus the entire process dies. So, the threads don't get a chance to run. It's also possible (quite unlikely but still possible) that you'd see the output from the threads even with your code as-is if the threads complete execution before main thread exits. But you can't rely on that.
Call pthread_join(), which suspends the calling thread until the thread (specified by the thread ID) returns, on the threads after the pthread_create() calls in main():
pthread_join(thread1, NULL);
pthread_join(thread2, NULL);
pthread_join(thread3, NULL);
You can also use an array of pthread_t which would allow you to use a for loop over the pthread_create() and pthread_join() calls.
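That array-plus-loop form could look roughly like this (the `greetings` counter is added here only to make the result checkable; it is not part of the original code):

```cpp
#include <pthread.h>
#include <atomic>
#include <iostream>

std::atomic<int> greetings{0};

void *print(void *) {
    std::cout << "Hi\n";
    ++greetings;
    return NULL; // the thread function returns a pointer
}

int run_demo() {
    pthread_t threads[3];
    for (pthread_t &t : threads)
        pthread_create(&t, NULL, print, NULL);
    for (pthread_t &t : threads)
        pthread_join(t, NULL); // without these joins, main may exit first
    return greetings.load();   // all three threads got a chance to print
}
```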
Or exit only the main thread using pthread_exit(0), which would exit only the calling thread and the remaining threads (the ones you created) will continue execution.
Note that your thread function should return a pointer or NULL:
void* print (void* data)
{
cout << "Hi" << endl;
return NULL;
}
I'm also not sure about the huge sleep right before the threads exit; it's unnecessary and only delays the threads from exiting. Probably not something you wanted.

Why does Sleep function disable my Mutex

I found code online that demonstrates how to use threads, from a tutorial by redKyle. In the 'Race Condition' tutorial, he basically shows how two threads are sent to a function. The objective of the function is to print '.' and '#' in sequence, one hundred times each. He provides the code to get this to work, but he does NOT provide the code for the mutex. I have modified the code to include a mutex, to prevent one thread from accessing the variable that holds the last character printed while another thread is accessing it.
I got the code to work. Great! However, I kept changing the sleep value between 1 and 50, and the mutex code works fine. But when I set sleep to 0 (or just comment it out), the mutex no longer works and the values are no longer printed in the correct manner (I no longer see 200 characters of strictly alternating '#' and '.').
The following is the code:
#include "stdafx.h"
#include <iostream>
#include <windows.h>
using namespace std;
static char lastChar = '#';
//define a mutex
HANDLE mutexHandle = NULL;
//flag to specify if thread has begun
bool threadStarted = false;

void threadProc(int *sleepVal, int *threadID)
{
    cout << "sleepVal: " << *sleepVal << endl;
    for (int i = 0; i < 100; i++)
    {
        char currentChar;
        threadStarted = true;
        while (!threadStarted) {}
        //lock mutex
        WaitForSingleObject(mutexHandle, INFINITE);
        if (lastChar == '#')
            currentChar = '.';
        else
            currentChar = '#';
        Sleep(*sleepVal);
        lastChar = currentChar;
        ReleaseMutex(mutexHandle);
        threadStarted = false;
        // cout<<"\nSleepVal: "<<*sleepVal<<" at: "<<currentChar;
        cout << currentChar;
    }//end for
}//end threadProc

int main()
{
    cout << "Race conditions by redKlyde \n";
    int sleepVal1 = 50;
    int sleepVal2 = 30;
    //create mutex
    mutexHandle = CreateMutex(NULL, false, NULL);
    //create thread1
    HANDLE threadHandle;
    threadHandle = CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE) threadProc, &sleepVal1, 0, NULL);
    //create thread2
    HANDLE threadHandle2;
    threadHandle2 = CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE) threadProc, &sleepVal2, 0, NULL);
    WaitForSingleObject(threadHandle, INFINITE);
    WaitForSingleObject(threadHandle2, INFINITE);
    cout << endl << endl;
    CloseHandle(mutexHandle);
    system("pause");
    return 0;
}
So my question is: why does setting the sleep to 0 defeat the mutex code?
Take notice that your print statement is not protected by the mutex, so one thread is free to print while the other is free to modify. By not sleeping, you're allowing the scheduler to determine the print order based upon the quantum of the thread.
There are some things wrong:
1) You should not be sleeping inside a held lock. This is almost never correct.
2) Any place your data is shared, you should be guarding with a lock. This means that the print statement should be in the lock, too.
Also, as a tip for future use of mutual exclusion: on Windows the best usermode mutex is the SRWLock, followed by the CriticalSection. Using a handle-based synch object is much slower.
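To illustrate point 2, here is a sketch with both the read-modify of lastChar and the output inside one lock. std::mutex is used for portability, but the structure is the same with an SRWLock or CRITICAL_SECTION; the `output` string and `strictly_alternating` helper are illustrative stand-ins for cout so the result can be checked:

```cpp
#include <mutex>
#include <string>
#include <thread>

std::mutex m;
char lastChar = '#';
std::string output; // stands in for the console; only touched under the lock

void threadProc() {
    for (int i = 0; i < 100; ++i) {
        std::lock_guard<std::mutex> lock(m);  // covers the "print" as well
        char currentChar = (lastChar == '#') ? '.' : '#';
        lastChar = currentChar;
        output += currentChar;                // flip and emit as one atomic unit
    }
}

bool strictly_alternating() {
    std::thread a(threadProc), b(threadProc);
    a.join();
    b.join();
    for (size_t i = 1; i < output.size(); ++i)
        if (output[i] == output[i - 1])
            return false;                     // two identical chars in a row
    return output.size() == 200;
}
```

Because flipping lastChar and emitting the character happen under the same lock, the 200 characters strictly alternate regardless of how the scheduler interleaves the threads, with or without any Sleep.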

critical section problem in Windows 7

Why does the code sample below cause one thread to execute way more than another but a mutex does not?
#include <windows.h>
#include <conio.h>
#include <process.h>
#include <iostream>
using namespace std;
typedef struct _THREAD_INFO_ {
    COORD coord;        // a structure containing x and y coordinates
    INT threadNumber;   // each thread has its own number
    INT count;
} THREAD_INFO, *PTHREAD_INFO;

void gotoxy(int x, int y);

BOOL g_bRun;
CRITICAL_SECTION g_cs;

unsigned __stdcall ThreadFunc(void* pArguments)
{
    PTHREAD_INFO info = (PTHREAD_INFO)pArguments;
    while (g_bRun)
    {
        EnterCriticalSection(&g_cs);
        //if(TryEnterCriticalSection(&g_cs))
        //{
        gotoxy(info->coord.X, info->coord.Y);
        cout << "T" << info->threadNumber << ": " << info->count;
        info->count++;
        LeaveCriticalSection(&g_cs);
        //}
    }
    ExitThread(0);
    return 0;
}

int main(void)
{
    // OR unsigned int
    unsigned int id0, id1;  // a place to store the thread ID returned from CreateThread
    HANDLE h0, h1;          // handles to threads
    THREAD_INFO tInfo[2];   // only one of these - not optimal!
    g_bRun = TRUE;
    ZeroMemory(&tInfo, sizeof(tInfo)); // win32 function - memset(&buffer, 0, sizeof(buffer))
    InitializeCriticalSection(&g_cs);
    // setup data for the first thread
    tInfo[0].threadNumber = 1;
    tInfo[0].coord.X = 0;
    tInfo[0].coord.Y = 0;
    h0 = (HANDLE)_beginthreadex(
        NULL,         // no security attributes
        0,            // default stack size
        &ThreadFunc,  // pointer to function
        &tInfo[0],    // each thread gets its own data to output
        0,            // 0 for running or CREATE_SUSPENDED
        &id0 );       // return thread id - reused here
    // setup data for the second thread
    tInfo[1].threadNumber = 2;
    tInfo[1].coord.X = 15;
    tInfo[1].coord.Y = 0;
    h1 = (HANDLE)_beginthreadex(
        NULL,         // no security attributes
        0,            // default stack size
        &ThreadFunc,  // pointer to function
        &tInfo[1],    // each thread gets its own data to output
        0,            // 0 for running or CREATE_SUSPENDED
        &id1 );       // return thread id - reused here
    _getch();
    g_bRun = FALSE;
    return 0;
}

void gotoxy(int x, int y) // x=column position and y=row position
{
    HANDLE hdl;
    COORD coords;
    hdl = GetStdHandle(STD_OUTPUT_HANDLE);
    coords.X = x;
    coords.Y = y;
    SetConsoleCursorPosition(hdl, coords);
}
That may not answer your question but the behavior of critical sections changed on Windows Server 2003 SP1 and later.
If you have bugs related to critical sections on Windows 7 that you can't reproduce on an XP machine you may be affected by that change.
My understanding is that on Windows XP critical sections used a FIFO based strategy that was fair for all threads while later versions use a new strategy aimed at reducing context switching between threads.
There's a short note about this on the MSDN page about critical sections
You may also want to check this forum post
Critical sections, like mutexes are designed to protect a shared resource against conflicting access (such as concurrent modification). Critical sections are not meant to replace thread priority.
You have artificially introduced a shared resource (the screen) and made it into a bottleneck. As a result, the critical section is highly contended. Since both threads have equal priority, that is no reason for Windows to prefer one thread over another. Reduction of context switches is a reason to pick one thread over another. As a result of that reduction, the utilization of the shared resource goes up. That is a good thing; it means that one thread will be finished a lot earlier and the other thread will finish a bit earlier.
To see the effect graphically, compare
A B A B A B A B A B
to
AAAAA BBBBB
The second sequence is shorter because there's only one switch from A to B.
In hand-wavy terms:
A CriticalSection is the thread saying it wants control to do several things together.
A mutex is making a marker to show 'being busy' so others can wait, and notifying of completion so somebody else can start. Somebody else already waiting for the mutex will grab it before you can start the loop again and take it back.
So what you are getting with the CriticalSection is a failure to yield between loop iterations. You might see a difference if you had Sleep(0); after LeaveCriticalSection.
I can't say why you're observing this particular behavior, but it's probably to do with the specifics of the implementation of each mechanism. What I CAN say is that unlocking then immediately locking a mutex is a bad thing. You will observe odd behavior eventually.
From some MSDN docs (http://msdn.microsoft.com/en-us/library/ms682530.aspx):
Starting with Windows Server 2003 with Service Pack 1 (SP1), threads waiting on a critical section do not acquire the critical section on a first-come, first-serve basis. This change increases performance significantly for most code.