C++ threads and variables - c++

I have a problem in a program I am writing. I have functions returning pointers, and in main() I want to run them in threads.
I'm able to execute the functions in threads:
double* SplitFirstArray_1st(double *arr0){
const UI arrSize = baseElements/4;
std::cout << "\n1st split: \n";
double *arrSplited1=nullptr;
arrSplited1 = new double [arrSize];
for(UI i=0; i<arrSize; i++){
arrSplited1 = arr0;
}
for(UI j=0; j< arrSize; ++j){
std::cout << arrSplited1[j] << " ";
}
return arrSplited1;
delete [] arrSplited1, arr0;
}
in main()
std::thread _th1(SplitFirstArray_1st, rootArr);
_th1.join();
The above is not what I'm after. I have another pointer:
double *arrTh1=nullptr;
I would like to use it in a thread so that it is assigned the value returned by my function SplitFirstArray_1st:
arrTh1 = SplitFirstArray_1st(xxx);
Is it possible to do this in a thread?

Don't return the variable; pass a pointer to the variable and set the value at what it points to.
i.e.:
void set_int(int* toset) {
*toset = 4;
}
This works fine with things that are already pointers:
void set_ptr(int** toset) {
*toset = new int[4];
// ...
(*toset)[0] = 2; // parentheses needed: [] binds tighter than *
}
You know the data is safe to use once the function has returned (for a function run in a thread, once the thread has been joined).
Completely unrelated note:
return foo;
// No point placing code here unless you used goto as it won't get executed.
// Also: don't use goto.
}

Something like this:
std::thread _th1([&]() { arrTh1 = SplitFirstArray_1st(rootArr); });
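A slightly fuller sketch of the same idea, including the join() that has to happen before the pointer is read. The helper names copyFirstElements and copyInThread are hypothetical stand-ins, not the asker's functions:

```cpp
#include <cstddef>
#include <thread>

// Hypothetical stand-in for SplitFirstArray_1st: copies the first n
// elements of src into a freshly allocated array and returns it.
double* copyFirstElements(const double* src, std::size_t n) {
    double* out = new double[n];
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = src[i];
    }
    return out;
}

// Runs the copy on a worker thread and hands back the result.
// The caller owns the returned array and must delete[] it.
double* copyInThread(const double* src, std::size_t n) {
    double* result = nullptr;
    // The lambda captures result by reference and assigns to it.
    std::thread worker([&]() { result = copyFirstElements(src, n); });
    // join() waits for the worker; only after it returns is result safe to read.
    worker.join();
    return result;
}
```

Because the lambda captures by reference, the captured variables must outlive the thread, which is why the join happens in the same scope here.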

Functions run as a thread's entry point cannot return values in the normal way, because std::thread ignores the return value. Therefore they should be declared as void.
A common way is to assign the result to a protected global variable. You should protect it with a mutex (or some other method) to avoid races.
mutex m;
double *arrTh1 = nullptr;
void SplitFirstArray_1st(double *arr0){
...
m.lock();
arrTh1 = arrSplited1;
m.unlock();
}
When you use the pointer in other threads (including the main one), you need to protect that usage with the same mutex as well (or choose another method).
And please, do not delete arrSplited1 and arr0: doing so would make the arrTh1 pointer unusable.
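As a rough sketch of the mutex-protected global, here is a variant that uses std::lock_guard instead of explicit lock()/unlock() calls, so the mutex is released even if an exception is thrown. The names resultMutex, sharedResult, produce and readResult are made up for illustration:

```cpp
#include <mutex>
#include <thread>

std::mutex resultMutex;          // guards sharedResult
double* sharedResult = nullptr;  // written by the worker, read by other threads

// Intended to run in the worker thread: publishes the pointer under the lock.
void produce(double* data) {
    std::lock_guard<std::mutex> lock(resultMutex);
    sharedResult = data;
}

// Any thread (including main) reads the pointer under the same lock.
double* readResult() {
    std::lock_guard<std::mutex> lock(resultMutex);
    return sharedResult;
}
```

A caller would start the worker with std::thread worker(produce, ptr); and call readResult() from wherever the pointer is needed.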
Note, if you use async functions, you could use futures to return values.
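A minimal sketch of the future-based approach, assuming the data can be held in a std::vector rather than a raw array. The functions firstElements and firstElementsAsync are hypothetical:

```cpp
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical helper: returns a copy of the first n elements of src.
// Assumes n <= src.size().
std::vector<double> firstElements(const std::vector<double>& src, std::size_t n) {
    return std::vector<double>(src.begin(),
                               src.begin() + static_cast<std::ptrdiff_t>(n));
}

std::vector<double> firstElementsAsync(const std::vector<double>& src, std::size_t n) {
    // std::launch::async asks for the call to run on a separate thread.
    std::future<std::vector<double>> fut =
        std::async(std::launch::async, [&src, n] { return firstElements(src, n); });
    // get() blocks until the task has finished, then yields its return value.
    return fut.get();
}
```

The future carries the return value across threads, so no global variable or explicit mutex is needed for this simple case.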

Related

What is RF() here? Is it absolutely necessary here?

class RF
{
public:
bitset<32> ReadData1, ReadData2;
RF()
{
Registers.resize(32);
Registers[0] = bitset<32> (0);
}
void ReadWrite(bitset<5> RdReg1, bitset<5> RdReg2, bitset<5> WrtReg, bitset<32> WrtData, bitset<1> WrtEnable)
{
// implement this function yourself.
}
void OutputRF() // write RF results to file
{
ofstream rfout;
rfout.open("RFresult.txt",std::ios_base::app);
if (rfout.is_open())
{
rfout<<"A state of RF:"<<endl;
for (int j = 0; j<32; j++)
{
rfout << Registers[j]<<endl;
}
}
else cout<<"Unable to open file";
rfout.close();
}
private:
vector<bitset<32> >Registers;
};
RF() is the constructor, but since all it does is resize Registers to 32, you can remove it if you specify that initialization on the member directly, like this:
vector<bitset<32> > Registers = vector<bitset<32> >(32);
Then Registers will be constructed with size 32x32 bits by default, and all the bits will be zero as well, so you can remove the entire RF() function.
Note: At first I thought you could use vector<bitset<32> > Registers{32} but due to vagaries of C++ syntax that does the wrong thing. Thanks to Fureeish for that.
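The difference between the two spellings can be seen in a small sketch. The struct names BraceInit and ParenInit are made up for illustration:

```cpp
#include <bitset>
#include <cstddef>
#include <vector>

struct BraceInit {
    // Braces select the initializer-list constructor:
    // ONE element, a bitset holding the value 32.
    std::vector<std::bitset<32>> regs{32};
};

struct ParenInit {
    // Copy-initialization from a temporary built with (32):
    // 32 elements, each with all bits zero.
    std::vector<std::bitset<32>> regs = std::vector<std::bitset<32>>(32);
};
```

With BraceInit, the loop in OutputRF() would index past the end of the vector, which is why the parenthesized form is the one to use.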
The short answer to your question is that, yes, for your current program, it is necessary.
The RF() function in this case is the function called when we initialize the RF object, eg.
RF new_RF;
Would run the RF() function and set things up. For this reason, it is called a 'constructor', because it helps you 'construct' your class.
In your case, the constructor is necessary for your program because it sets up your Registers variable, so that the code below from your OutputRF() function can run.
for (int j = 0; j<32; j++)
{
rfout << Registers[j]<<endl;
}
It's also useful because we can use it to set up many things, for example, if our RF() constructor looked like this:
RF(int a)
{
Registers.resize(a);
Registers[0] = bitset<32> (0); // the bitset size must be a compile-time constant
}
It would instead resize Registers to a elements. You can look here for a more in-depth tutorial about constructors.
Hope that helps!

Reset thread-local variables in OpenMP

I need a consistent way of resetting all thread-local variables my program creates. The problem lies in that the thread-local data is created in places different from where they are used.
My program outline is the following:
struct data_t { /* ... */ };
// 1. Function that fetches the "global" thread-local data
data_t& GetData()
{
static data_t *d = NULL;
#pragma omp threadprivate(d); // !!!
if (!d) { d = new data_t(); }
return *d;
}
// 2 example function that uses the data
void user(int *elements, int num, int *output)
{
#pragma omp parallel for shared(elements, output) if (num > 1000)
for (int i = 0; i < num; ++i)
{
// computation is a heavy calculation, on memoized data
computation(GetData());
}
}
Now, my problem is I need a function that resets data, i.e. every thread-local object created must be accounted for.
For now, my solution is to use a parallel region that hopefully uses as many threads as the "parallel for", or more, so every object is "iterated" through:
void ClearThreadLocalData()
{
#pragma omp parallel
{
// assuming data_t has a "clear()" method
GetData().clear();
}
}
Is there a more idiomatic / safe way to implement ClearThreadLocalData() ?
You can create and use a global version number for your data. Increment it every time you need to clear the existing caches. Then modify GetData to check the version number if there is an existing data object, discarding the existing one and creating a new one if it is out of date. (The version number for the allocated data_t object can be stored within data_t if you can modify the class, or in a second thread local variable if not.) You'd end up with something like
static int dataVersion;
data_t& GetData()
{
static data_t *d = NULL;
#pragma omp threadprivate(d); // !!!
if (d && d->myDataVersion != dataVersion) {
delete d;
d = nullptr;
}
if (!d) {
d = new data_t();
d->myDataVersion = dataVersion;
}
return *d;
}
This doesn't depend on the existence of a Clear method in data_t, but if you have one replace the delete-and-reset with a call to Clear. I'm using d = nullptr to avoid duplicating the call to new data_t().
The global dataVersion could be a static member of data_t if you want to avoid the global variable, and it can be atomic if necessary although GetData would need changes to handle that.
When it comes time to reset the data, just change the global version number:
++dataVersion;
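For the atomic variant mentioned above, one possible sketch follows. It swaps in C++11 thread_local for the OpenMP threadprivate pragma, which is an assumption about what the compiler and OpenMP runtime support together, and it uses a simplified stand-in for data_t:

```cpp
#include <atomic>
#include <memory>

// Hypothetical stand-in for data_t, with a field recording the version
// it was created under.
struct data_t {
    int value = 0;
    unsigned version = 0;
};

// Global version counter; atomic so that any thread may bump or read it.
std::atomic<unsigned> dataVersion{0};

data_t& GetData() {
    // One cache object per thread, freed automatically at thread exit.
    thread_local std::unique_ptr<data_t> d;
    const unsigned current = dataVersion.load(std::memory_order_acquire);
    if (!d || d->version != current) {
        d.reset(new data_t());  // drop the stale object, build a fresh one
        d->version = current;
    }
    return *d;
}

void ClearThreadLocalData() {
    // No need to visit every thread: each one notices the bump
    // on its next call to GetData().
    dataVersion.fetch_add(1, std::memory_order_release);
}
```

Note that a reference obtained from GetData() before a reset becomes dangling once the same thread calls GetData() again afterwards, so callers should not hold on to it across a reset.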

Creating an array of objects causes an issue

I create the two following objects:
bool Reception::createNProcess()
{
for (int y = 0; y < 3; ++y)
{
Process *pro = new Process; // forks() at construction
Thread *t = new Thread[5];
this->addProcess(pro); // Adds the new process to a vector
if (pro->getPid() == 0)
{
for (int i = 0; i < 5; ++i)
{
pro->addThread(&t[i]); // Adds the new thread to a vector
t[i].startThread();
}
}
}
Where I create 3 processes (that I have encapsulated in Process) and create 5 threads in each of these processes.
But I'm not sure the following line is correct:
Thread *t = new Thread[5];
Because my two functions addProcess and addThread both take a pointer to Process and Thread respectively and yet the compiler asks for a reference to t[i] for addThread and I don't understand why.
void Process::addThread(Thread *t)
{
this->threads_.push_back(t);
}
void Reception::addProcess(Process *p)
{
this->createdPro.push_back(p);
}
createdPro is defined in the Reception class as follows:
std::vector<Process *> createdPro;
and threads_ in the Process class like such:
std::vector<Thread *> threads_;
And the error message (as obvious as it is) is as follows:
error: no matching function for call to ‘Process::addThread(Thread&)’
pro->addThread(t[i]);
process.hpp:29:10: note: candidate: void Process::addThread(Thread*)
void addThread(Thread *);
process.hpp:29:10: note: no known conversion for argument 1 from ‘Thread’ to ‘Thread*’
Even though I defined my Thread as a pointer.
You have defined the member function to take a pointer:
void Process::addThread(Thread *t)
{
...
}
You then invoke this function for &t[i], which is a pointer and should work perfectly:
pro->addThread(&t[i]); // Adds the new thread to a vector
You could also invoke it with t+i and it would still be ok. However your error message tells us something different: the compiler doesn't find a match for pro->addThread(t[i]); (i.e. the & is missing).
Either you made a typo in your question, or you made a typo in your code. Or you have another invocation somewhere where you've forgotten the ampersand: t[i] would of course designate an object (it's equivalent to *(t+i)) and not a pointer, and would cause the error message you have.
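A reduced sketch of the situation, with an empty stand-in for the Thread class and a hypothetical addAll helper, shows which call compiles:

```cpp
#include <vector>

struct Thread {};  // hypothetical empty stand-in for the asker's Thread class

struct Process {
    std::vector<Thread*> threads_;
    void addThread(Thread* t) { threads_.push_back(t); }
};

// Registers every element of a new[]-allocated array with the process.
void addAll(Process& pro, Thread* t, int count) {
    for (int i = 0; i < count; ++i) {
        pro.addThread(&t[i]);    // OK: &t[i] is a Thread* (same as t + i)
        // pro.addThread(t[i]);  // error: t[i] is a Thread object, not a pointer
    }
}
```

Uncommenting the second call reproduces the "no matching function" error from the question.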

Can a function return the same value inside a loop, and return different values outside of loops?

It acts like this.
fun();//return 1;
for (int i=0;i<100;i++)
fun();//return 2;
fun();//return 3;
I don't want to do it manually, like:
static int i=0;
int fun(){return i;}
int main()
{
i++;
fun();//return 1;
i++;
for (int j=0;j<100;j++)
fun();//return 2;
i++;
fun();//return 3;
}
New classes and static variables are allowed.
I am trying to design a cache replacement algorithm. Most of the time I use the LRU algorithm, but, if I use LRU algorithm inside a loop I would very likely get a cache thrashing.
https://en.wikipedia.org/wiki/Thrashing_(computer_science)
I need to know if I am inside a loop. Then I can use the LFU algorithm to avoid thrashing.
An obvious way of doing this would be to pass the __LINE__ macro at each call site. It expands to the source code line number, which differs between the call before the loop, the call inside it, and the call after it.
It is not possible in C++ for a function to know, 100% of the time, whether or not it is being called inside a loop. However, if you are happy to do some manual coding to tell the function that it is inside a loop, you can implement a simple solution using C++'s default parameters. For more information on default parameters see http://www.learncpp.com/cpp-tutorial/77-default-parameters/. Also, because global variables are generally frowned upon, I have placed them in a separate namespace to prevent clashes.
namespace global_variables {
int i = 0;
}
int func(bool is_in_loop = false) {
if (is_in_loop)
{
//do_something;
return global_variables::i;
}
else
{
//do_something_else;
return global_variables::i++;
}
}
int main()
{
// Calling function outside of loop
std::cout << func();
// Calling function inside of loop
for (int j=0;j<100;j++)
{
// is_in_loop will be overridden to be true.
std::cout << func(true);
}
return 0;
}
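Since the question allows new classes and static variables, another option is a small RAII marker that is created just before a loop. This is only a sketch under that assumption: LoopScope and the loop_tracking namespace are hypothetical names, and the code is not thread-safe.

```cpp
namespace loop_tracking {
int counter = 0;  // value handed out by fun()
int depth = 0;    // how many LoopScope objects are currently alive
}

// Hypothetical RAII marker: create one just before a loop, in its own scope.
class LoopScope {
public:
    LoopScope() {
        // Entering the outermost marked loop moves to a new value once.
        if (loop_tracking::depth++ == 0) {
            ++loop_tracking::counter;
        }
    }
    ~LoopScope() { --loop_tracking::depth; }
    LoopScope(const LoopScope&) = delete;
    LoopScope& operator=(const LoopScope&) = delete;
};

int fun() {
    if (loop_tracking::depth > 0) {
        return loop_tracking::counter;  // same value for every call in the loop
    }
    return ++loop_tracking::counter;    // new value for each call outside a loop
}
```

Wrapping the loop in a block that starts with LoopScope scope; gives 1 before the loop, 2 for every call inside it, and 3 after it. The function still does not detect loops by itself; the marker just replaces the manual increments.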

Multithreaded matrix multiplication in C++

I've been having trouble with this parallel matrix multiplication code: I keep getting an error when trying to access a data member in my structure.
This is my main function:
struct arg_struct
{
int* arg1;
int* arg2;
int arg3;
int* arg4;
};
int main()
{
pthread_t allthreads[4];
int A [N*N];
int B [N*N];
int C [N*N];
randomMatrix(A);
randomMatrix(B);
printMatrix(A);
printMatrix(B);
struct arg_struct *args = (arg_struct*)malloc(sizeof(struct arg_struct));
args.arg1 = A;
args.arg2 = B;
int x;
for (int i = 0; i < 4; i++)
{
args.arg3 = i;
args.arg4 = C;
x = pthread_create(&allthreads[i], NULL, &matrixMultiplication, (void*)args);
if(x!=0)
exit(1);
}
return 0;
}
and the matrixMultiplication method used from another C file:
void *matrixMultiplication(void* arguments)
{
struct arg_struct* args = (struct arg_struct*) arguments;
int block = args.arg3;
int* A = args.arg1;
int* B = args.arg2;
int* C = args->arg4;
free(args);
int startln = getStartLineFromBlock(block);
int startcol = getStartColumnFromBlock(block);
for (int i = startln; i < startln+(N/2); i++)
{
for (int j = startcol; j < startcol+(N/2); j++)
{
setMatrixValue(C,0,i,j);
for(int k = 0; k < N; k++)
{
C[i*N+j] += (getMatrixValue(A,i,k) * getMatrixValue(B,k,j));
usleep(1);
}
}
}
}
Another error I am getting is when creating the thread: "invalid conversion from ‘void ()(int, int*, int, int*)’ to ‘void* ()(void)’ [-fpermissive]"
Can anyone please tell me what I'm doing wrong?
First, you mix C and C++ very badly. Either use plain C or use C++; in C++ you can simply use new and delete.
But the reason for your error is that you allocate arg_struct in one place and free it in 4 threads. You should allocate one arg_struct for each thread.
Big Boss has correctly identified the problem; this answer adds to his reply.
Option 1:
Just create an arg_struct in the loop and set the members, then pass it through:
for(...)
{
struct arg_struct *args = (arg_struct*)malloc(sizeof(struct arg_struct));
args->arg1 = A;
args->arg2 = B; //set up args as now...
...
x = pthread_create(&allthreads[i], NULL, &matrixMultiplication, (void*)args);
....
}
Keep the free call in the thread. You can now also use the passed struct directly rather than copying its members into locals in your thread.
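A compact sketch of Option 1 with one heap-allocated argument block per thread. The struct is deliberately simplified (the fields out and block are hypothetical), the work is a placeholder for the real multiplication, and new/delete are used in place of malloc/free:

```cpp
#include <pthread.h>

// Hypothetical, simplified argument block: each thread gets its own copy.
struct arg_struct {
    int* out;   // shared output array
    int block;  // index of the block this thread handles
};

void* worker(void* arguments) {
    arg_struct* args = static_cast<arg_struct*>(arguments);
    args->out[args->block] = args->block * 10;  // placeholder for the real work
    delete args;  // this thread owns its argument block
    return nullptr;
}

// Starts count threads (at most 16 in this sketch), one arg_struct per thread,
// and waits for all of them. Returns false if a thread could not be created.
bool runWorkers(int* out, int count) {
    pthread_t threads[16];
    if (count > 16) return false;
    int started = 0;
    bool ok = true;
    for (; started < count; ++started) {
        arg_struct* args = new arg_struct{out, started};
        if (pthread_create(&threads[started], nullptr, &worker, args) != 0) {
            delete args;  // the thread never started, so clean up here
            ok = false;
            break;
        }
    }
    for (int i = 0; i < started; ++i) {
        pthread_join(threads[i], nullptr);
    }
    return ok;
}
```

Each thread writes to a different element of out, so no locking is needed for the output in this sketch.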
Option 2:
It looks like you want to copy the params from the struct internally to the thread anyway so you don't need to dynamically allocate.
Just create an arg_struct and set the members, then pass it through:
arg_struct args;
//set up args as now...
for(...)
{
...
x = pthread_create(&allthreads[i], NULL, &matrixMultiplication, (void*)&args);
}
Then remove the free call.
However, as James pointed out, you would need to synchronize between the thread and the parent on the structure to make sure that it isn't changed while a thread is still reading it. That would mean a mutex or some other mechanism. So moving the allocation into the for loop is probably easier to begin with.
Part 2:
I'm working on Windows (so I can't experiment currently), but parameter 3 of pthread_create refers to the thread function matrixMultiplication, which is defined as void* matrixMultiplication( void* ); judging by the online man pages, which give void* fn (void* ), the signature looks correct to me.
I think I'll have to defer to someone else on your second error. I made this post a community wiki entry so an answer can be added to it if desired.
It's not clear to me what you are trying to do. You start some threads,
then you return from main (exiting the process) before getting any
results from them.
In this case, I'd probably not use any dynamic allocation, directly.
(I would use std::vector for the matrices, which would use dynamic
allocation internally.) There's no reason to dynamically allocate the
arg_struct, since it can safely be copied. Of course, you'll have to
wait until each thread has successfully extracted its data before
looping to construct the next thread. This would normally be done using
a condition variable: the new thread would signal it once it has
extracted the arguments from the arg_struct (or even better, you could
use boost::thread, which does this part for you). Alternatively, you
could use an array of arg_struct, but there is absolutely no reason to
allocate them dynamically. (If for some reason you cannot use
std::vector for A, B and C, you will want to allocate these
dynamically, in order to avoid any risk of stack overflow. But
std::vector is a much better solution.)
Finally, of course, you must wait for all of the threads to finish
before leaving main. Otherwise, the threads will continue working on
data that doesn't exist any more. In this case, you should
pthread_join all of the threads before exiting main. Presumably,
too, you want to do something with the results of the multiplication,
but in any case, exiting main before all of the threads have finished
accessing the matrices will cause undefined behavior.
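The points above (std::vector for the matrices, arguments copied per thread, joining before leaving) might look roughly like this with std::thread. This is a sketch, not the asker's code: the function names are hypothetical, numThreads is assumed to be at least 1, and the work is split by rows rather than into the 2x2 blocks of the question.

```cpp
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Computes rows [rowBegin, rowEnd) of C = A * B for n x n row-major matrices.
void multiplyRows(const std::vector<int>& A, const std::vector<int>& B,
                  std::vector<int>& C, std::size_t n,
                  std::size_t rowBegin, std::size_t rowEnd) {
    for (std::size_t i = rowBegin; i < rowEnd; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            int sum = 0;
            for (std::size_t k = 0; k < n; ++k) {
                sum += A[i * n + k] * B[k * n + j];
            }
            C[i * n + j] = sum;
        }
    }
}

std::vector<int> multiplyParallel(const std::vector<int>& A,
                                  const std::vector<int>& B,
                                  std::size_t n, std::size_t numThreads) {
    std::vector<int> C(n * n, 0);
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < numThreads; ++t) {
        // The row range is passed by value, so each thread has its own copy.
        const std::size_t begin = t * n / numThreads;
        const std::size_t end = (t + 1) * n / numThreads;
        workers.emplace_back(multiplyRows, std::cref(A), std::cref(B),
                             std::ref(C), n, begin, end);
    }
    // Wait for every thread before C is read or goes out of scope.
    for (std::thread& w : workers) {
        w.join();
    }
    return C;
}
```

Because std::thread copies its arguments into the new thread, there is no shared argument struct to protect, and each thread writes a disjoint set of rows of C.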