OpenMP - parallel code has unexpected results

#include <omp.h>
#include <iostream>
#include <list>
using namespace std;
int main()
{
    list<int> lst;
    for(int i=0;i<5;i++)
        lst.push_back(i);
    #pragma omp parallel for
    for(int i=0;i<5;i++)
    {
        cout<<i<<" "<<omp_get_thread_num()<<endl;
    }
}
I expect to get something like this:
0 0
1 0
2 0
3 1
4 1
However, sometimes I get this result:
30 0
1 0
2 0
1
4 1
or even this kind of result:
30 1 0
4 1
1 0
2 0
I know this happens because the output statement:
cout<<i<<" "<<omp_get_thread_num()<<endl;
is split into several smaller operations, and their relative order across threads is not fixed.
How can I prevent this from happening?
Thanks.

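Standard output streams do NOT synchronize whole output statements across threads!
Since C++11 the standard guarantees only that concurrent writes to cout do not cause a data race (as long as it stays synchronized with stdio); characters written by different threads may still interleave.
You need either a lock, which serializes the output and works against the parallelization, or fewer separate insertions per line: dropping the "<< i" narrows the window for interleaving but does not guarantee ordered output.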
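A minimal sketch of the lock-based option, assuming OpenMP's critical construct is used as the lock. The helper name print_iterations, its std::ostream& parameter, and its return value are illustrative choices that do not come from the original post. The #ifdef _OPENMP fallback is only there so the sketch also builds without -fopenmp.

```cpp
#include <cassert>
#include <iostream>
#include <ostream>
#ifdef _OPENMP
#include <omp.h>
#else
// Fallback so the sketch still builds without -fopenmp (single-threaded).
static int omp_get_thread_num() { return 0; }
#endif

// Hypothetical helper: writes one "i thread" line per iteration to `out`
// and returns the number of lines written.
int print_iterations(std::ostream& out, int n)
{
    assert(n >= 0);
    int lines = 0;
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
    {
        const int tid = omp_get_thread_num();
        // Only one thread at a time may execute this block, so the pieces
        // of one line cannot interleave with another thread's line.
        #pragma omp critical
        {
            out << i << " " << tid << '\n';
            ++lines;
        }
    }
    return lines;
}
```

Calling print_iterations(std::cout, 5) prints five intact lines. The order of the lines is still not fixed, because the critical section protects each line, not the sequence of iterations.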

The loop iterations run in no fixed order. This is why you have unordered output.
If your problem is the 30 in
30 0
1 0
2 0
1
4 1
then don't worry: there is no 30, only a 3 followed by a 0. You still have, as expected, an unordered sequence of the values [0..4]:
3 0 0
1 0
2 0
1
4 1
The only thing you can't tell is which of the 0s and 1s are thread numbers and which are not.

Your code
#pragma omp parallel for
for(int i = 0; i < 5; i++)
{
    cout << i << " " << omp_get_thread_num() << endl;
}
is equivalent to
#pragma omp parallel for
for(int i = 0; i < 5; i++)
{
    cout << i;
    cout << " ";
    cout << omp_get_thread_num();
    cout << endl;
}
Calls to << in different threads may be executed in any order. For instance, cout << i; for i = 3 may be followed by cout << i; for i = 0 in another thread, which may be followed by cout << " "; for i = 3, and so on, resulting in the garbled output 30 ....
One way to avoid this is to rewrite the code so that each iteration builds its whole line first and calls cout << only once (this needs #include <sstream>). In practice a single insertion is usually written in one piece, although the standard does not strictly guarantee it:
#pragma omp parallel for
for(int i = 0; i < 5; i++)
{
    stringstream ss;
    ss << i << " " << omp_get_thread_num() << '\n';
    cout << ss.str();
}

You can create an array (of size 5) that records which thread handled which index, and then print it after the parallel loop has finished.
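A rough sketch of this idea, assuming a std::vector is used as the array. The helper names record_owners and print_owners are hypothetical, and the #ifdef _OPENMP fallback is only there so the sketch also builds without -fopenmp.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#else
// Fallback so the sketch still builds without -fopenmp (single-threaded).
static int omp_get_thread_num() { return 0; }
#endif

// Hypothetical helper: owner[i] is the number of the thread that ran iteration i.
std::vector<int> record_owners(int n)
{
    assert(n >= 0);
    std::vector<int> owner(n, -1);
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
    {
        // Each iteration writes only its own slot, so no lock is needed.
        owner[i] = omp_get_thread_num();
    }
    return owner;
}

// Printing happens sequentially, after the parallel loop has finished.
void print_owners(const std::vector<int>& owner)
{
    for (std::size_t i = 0; i < owner.size(); i++)
        std::cout << i << " " << owner[i] << '\n';
}
```

With print_owners(record_owners(5)) the lines come out in index order and cannot be garbled, since only one thread touches cout.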

Related

Locally compiled c++ code is improperly looping

The following never terminates on my system.
#include <iostream>
using namespace std;
int main(){
    int solutions[1000][4] = {};
    for(int a=0; 3*a<=1000; a++){
        for(int b=0; 5*b<=1000; b++){
            for(int c=0; 7*c<=1000; c++){
                cout << "enter" << "\t" << a << "\t" << b << "\t" << c << endl;
                if (3*a+5*b+7*c > 1000) {break;}
                solutions[3*a+5*b+7*c][0] = a;
                solutions[3*a+5*b+7*c][1] = b;
                solutions[3*a+5*b+7*c][2] = c;
                solutions[3*a+5*b+7*c][3] = 1;
                cout << "exit" << "\t" << a << "\t" << b << "\t" << c << endl << endl;
            }
        }
    }
}
I'm completely stumped, so I decided to print a log of variable changes. It gets as far as b = 4, and then when c hits 140, c jumps back to 0. The log looks like this:
...
enter 0 4 137
exit 0 4 137
enter 0 4 138
exit 0 4 138
enter 0 4 139
exit 0 4 139
enter 0 4 140
exit 0 4 0
enter 0 4 1
exit 0 4 1
enter 0 4 2
exit 0 4 2
enter 0 4 3
exit 0 4 3
...
I compiled this using g++ B.cpp -o B.exe, and then just ran the executable. The exact code (with logging commented out) terminates properly online at http://cpp.sh/. My compiler version is g++ (i686-posix-dwarf-rev0, Built by MinGW-W64 project) 5.3.0. What could be going wrong here?

When a = 0, b = 4, c = 140, 3*a+5*b+7*c becomes 1000, so the code writes to solutions[1000], which is out of bounds: the valid row indices are 0 to 999. An out-of-bounds write is undefined behavior, and here it seems to have overwritten the loop counter.
Allocate one more element to avoid this out-of-bounds write.
int solutions[1001][4] = {};
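As an illustration of the same fix, here is a small sketch that sizes the table from the limit itself and uses bounds-checked access. The helper name reachable_sums is hypothetical, and it only marks which sums occur instead of storing a, b and c as the original code does.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: marks every value 3a + 5b + 7c that lies in [0, limit].
std::vector<bool> reachable_sums(int limit)
{
    assert(limit >= 0);
    // limit + 1 slots, so that index `limit` itself is valid.
    std::vector<bool> reachable(limit + 1, false);
    for (int a = 0; 3 * a <= limit; a++)
        for (int b = 0; 3 * a + 5 * b <= limit; b++)
            for (int c = 0; 3 * a + 5 * b + 7 * c <= limit; c++)
                // at() throws std::out_of_range instead of silently
                // corrupting memory if the index is ever wrong.
                reachable.at(3 * a + 5 * b + 7 * c) = true;
    return reachable;
}
```

Deriving the size from limit keeps the loop bound and the allocation in one place, so an off-by-one like 1000 versus 1001 cannot creep in between them.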

OpenMP only using one thread

I am having a bit of a frustrating problem with OpenMP. When I run the following code it only seems to be running on one thread.
omp_set_num_threads(8);
#pragma omp parallel for schedule(dynamic)
for(size_t i = 0; i < jobs.size(); i++) //jobs is a vector
{
    std::cout << omp_get_thread_num() << "\t" << omp_get_num_threads() << "\t" << omp_in_parallel() << std::endl;
    jobs[i].run();
}
This prints...
0 1 1
for every line.
I can see using top that OpenMP is spawning as many threads as I have the process taskset to. They are mostly idle while it runs. The program is both compiled and linked with the -fopenmp flag with gcc. I am using Red Hat 6. I also tried using the num_threads(8) clause in the pragma, which made no difference. The program is linked with another library which also uses OpenMP, so maybe this is the issue. Does anyone know what might cause this behavior? In all my past OpenMP experience it has just worked.

Can you print your jobs.size()?
I made a quick test and it does work:
#include <omp.h>
#include <iostream>
int main()
{
    omp_set_num_threads(2);
    #pragma omp parallel for ordered schedule(dynamic)
    for(size_t i = 0; i < 4; i++)
    {
        #pragma omp ordered
        std::cout << i << "\t" << omp_get_thread_num() << "\t" << omp_get_num_threads() << "\t" << omp_in_parallel() << std::endl;
    }
    return 0;
}
I got:
icpc -qopenmp test.cpp && ./a.out
0 0 2 1
1 1 2 1
2 0 2 1
3 1 2 1

simple for loop finishes and fails the run

For some unknown reason this simple code runs, does what it's expected to do and then crashes the run. I am using NetBeans IDE, which overlapped my arrays before (tends to be buggy), so I was wondering if someone gets the same error - that would mean I certainly have to change the IDE environment.
#include <iostream>
using namespace std;
int main ()
{
    int first[4][4];
    for (int a = 0; a < 5; a++)
    {
        for (int b = 0; b < 5;b++)
        {
            cout << a << " " << b << " ";
            if (first [a][b] != 0)
            {
                first[a][b] = 0;
            }
            cout << first[a][b] << " ";
        }
        cout << endl << endl << endl;
    }
    return 0;
};

Here you are declaring an array with 4 elements per dimension. In C/C++, index numbering starts at 0.
In your code you are saying:
int first[4][4];
That means the valid indices are 0 1 2 3; the array length in each dimension is 4.
But in the for loop you are saying
for (int a = 0; a < 5; a++) {
....
}
so you are trying to access indices 0 1 2 3 4 in turn. But remember, there is no index 4. C++ does not check array bounds, so instead of an out-of-bounds error you get undefined behavior, which here shows up as a crash after the loops finish. Change both loop conditions to a < 4 and b < 4.
Also, you have a semicolon after the closing brace of main. It is harmless but unnecessary, so remove it:
main () {
....
};
Hope this solves the problem. Next time, please provide details about the errors your IDE is giving you; that makes it easier for the people answering.
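A minimal sketch of the corrected loops, assuming std::array is acceptable in place of the raw array. The names N, Grid and make_zero_grid are illustrative and do not come from the question.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <iostream>

constexpr std::size_t N = 4;  // single source of truth for the array size
using Grid = std::array<std::array<int, N>, N>;

// Hypothetical helper: fills the grid with zeros while printing it,
// using loop bounds that match the array size.
Grid make_zero_grid()
{
    Grid first{};  // value-initialized, so no element is read uninitialized
    for (std::size_t a = 0; a < N; a++)
    {
        for (std::size_t b = 0; b < N; b++)
        {
            // Documents the invariant that the original a < 5 loop violated.
            assert(a < first.size() && b < first[a].size());
            first[a][b] = 0;
            std::cout << a << " " << b << " " << first[a][b] << " ";
        }
        std::cout << '\n';
    }
    return first;
}
```

Using the same constant N for the declaration and for both loop conditions means the bounds cannot drift apart when the size changes.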

Convert an Eigen matrix to Triplet form C++

I think Eigen uses compressed formats to store sparse matrices. Is there any way that I can extract the triplet-format vectors of an Eigen sparse matrix in the form of std::vectors?
Thanks.
More info (an example of triplet format)
Triplet format of matrix :
A=
3 0 4 0
0 0 1 0
0 2 0 5
4 0 0 0
i = 1 1 2 3 3 4 // row
j = 1 3 3 2 4 1 // column
S = 3 4 1 2 5 4 // values

The answer to the question, which is:
// Is there some method such as:
std::vector<Eigen::Triplet<double>> T = SparseMat.to_triplets();
// in Eigen?
is no, there does not appear to be such a function.
Instead, you can write one yourself:
std::vector<Eigen::Triplet<double>> to_triplets(Eigen::SparseMatrix<double> & M){
    std::vector<Eigen::Triplet<double>> v;
    for(int i = 0; i < M.outerSize(); i++)
        for(typename Eigen::SparseMatrix<double>::InnerIterator it(M,i); it; ++it)
            v.emplace_back(it.row(),it.col(),it.value());
    return v;
}
auto t = to_triplets(SparseMat);
There is little point in trying to make this faster by reaching for the internal data arrays: the matrix is sparse, and this copy is already linear in the number of nonzero elements.

Simply as shown in the tutorial:
#include <Eigen/Sparse>
#include <iostream>
#include <vector>
using namespace Eigen;
using std::cout;
using std::endl;
typedef Triplet<int> Trip;
int main(int argc, char *argv[]){
    std::vector<Trip> trp;
    // I subtracted 1 from the indices so that the output matches your question
    trp.push_back(Trip(1-1,1-1,3));
    trp.push_back(Trip(1-1,3-1,4));
    trp.push_back(Trip(2-1,3-1,1));
    trp.push_back(Trip(3-1,2-1,2));
    trp.push_back(Trip(3-1,4-1,5));
    trp.push_back(Trip(4-1,1-1,4));
    int rows, cols;
    rows = cols = 4;
    SparseMatrix<int> A(rows,cols);
    A.setFromTriplets(trp.begin(), trp.end());
    cout << "Matrix from triplets:" << endl;
    cout << A << endl;
    cout << endl << "Triplets:" << endl;
    cout << "Row\tCol\tVal" <<endl;
    for (int k=0; k < A.outerSize(); ++k)
    {
        for (SparseMatrix<int>::InnerIterator it(A,k); it; ++it)
        {
            cout << 1+it.row() << "\t"; // row index
            cout << 1+it.col() << "\t"; // col index (here it is equal to k)
            cout << it.value() << endl;
        }
    }
    return 0;
}

Why are addresses of vector elements not consecutive when assigned using push_back()?

Please look at the small test code + output provided below. It seems that when using push_back() on an std::vector within a loop, C++ allocates the memory at 'random' addresses, and then re-copies the data into consecutive memory addresses after the loop is finished.
Is this to do with the fact that the size of the vector is not known before the loop?
What is the correct way of doing what I do in the test code? Do I have to assign the pointers in another loop after the first one exits? Note that I cannot define the size of the vector before the first loop, because in reality it is actually a vector of class objects that require initialization.
Thank you for your help.
std::vector<int> MyVec;
std::vector<int *> MyVecPtr;
for (int i = 0; i < 10; i++)
{
    MyVec.push_back(i);
    MyVecPtr.push_back(&MyVec.back());
    std::cout << MyVec.back() << " "
              << &MyVec.back() << " "
              << MyVecPtr.back() << " "
              << *MyVecPtr.back() << std::endl;
}
std::cout << std::endl;
for (int i = 0; i < MyVec.size(); i++)
{
    std::cout << MyVec[i] << " "
              << &MyVec[i] << " "
              << MyVecPtr[i] << " "
              << *MyVecPtr[i] << std::endl;
}
0 0x180d010 0x180d010 0
1 0x180d054 0x180d054 1
2 0x180d038 0x180d038 2
3 0x180d03c 0x180d03c 3
4 0x180d0b0 0x180d0b0 4
5 0x180d0b4 0x180d0b4 5
6 0x180d0b8 0x180d0b8 6
7 0x180d0bc 0x180d0bc 7
8 0x180d140 0x180d140 8
9 0x180d144 0x180d144 9
0 0x180d120 0x180d010 25219136
1 0x180d124 0x180d054 0
2 0x180d128 0x180d038 2
3 0x180d12c 0x180d03c 3
4 0x180d130 0x180d0b0 4
5 0x180d134 0x180d0b4 5
6 0x180d138 0x180d0b8 6
7 0x180d13c 0x180d0bc 7
8 0x180d140 0x180d140 8
9 0x180d144 0x180d144 9

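push_back() reallocates the vector's storage whenever the capacity is exceeded, which moves the elements and invalidates every pointer taken earlier; that is why the saved addresses stop matching. If you know how many insertions you will be performing, you should use reserve() on your vector accordingly. This eliminates the reallocations that would otherwise happen when the capacity is exceeded. reserve() only allocates memory and does not construct any elements, so it also works for class objects that require initialization.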
MyVec.reserve(10);
for (int i = 0; i < 10; i++)
{
MyVec.push_back(i);
//...
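A self-contained sketch of the reserve() approach, assuming the number of insertions n is known before the loop. The helper name fill_and_collect is hypothetical and not part of the original answer.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: fills `values` with 0..n-1 and returns a pointer
// to each element, taken while the vector is being filled.
std::vector<int*> fill_and_collect(std::vector<int>& values, int n)
{
    assert(n >= 0);
    values.clear();
    values.reserve(n);  // one allocation up front, no elements constructed
    std::vector<int*> pointers;
    pointers.reserve(n);
    for (int i = 0; i < n; i++)
    {
        values.push_back(i);
        // Safe only because size() never exceeds the reserved capacity,
        // so push_back does not reallocate and move the elements.
        pointers.push_back(&values.back());
    }
    return pointers;
}
```

If the count is not known in advance, the pointers can instead be collected in a second loop after the vector is complete, or indices can be stored in place of pointers, since indices stay valid across reallocations.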
