I am getting a segmentation fault when creating and starting threads here. It happens on the line t[j] = thread(getMax, A); and I cannot see why. threadMax[] holds the maximum found by each thread, and getMax() finds the maximum value of an array.
#include <iostream>
#include <stdlib.h>
#include <sys/time.h>
#include <thread>
#define size 10
#define numThreads 10
using namespace std;
int threadMax[numThreads] = {0};
int num =0;
void getMax(double *A){
num += 1;
double max = A[0];
double min = A[0];
for (int i =0; i<size; i++){
if(A[i] > max){
max = A[i];
}
}
threadMax[num] = max;
}
int main(){
int max =0;
double S,E;
double *A = new double[size];
srand(time(NULL));
thread t[numThreads];
//Assign random values to array
for(int i = 0; i<size; i++){
A[i] = (double(rand()%100));
}
//create Threads
for(int j =0; j <numThreads; j++){
cout << A[j] << " " << j << "\n";
t[j] = thread(getMax, A);
}
//join threads
for(int i =0; i< numThreads; i++){
t[i].join();
}
//Find Max from all threads
for(int i =0; i < numThreads; i++){
if(threadMax[i] > max){
max = threadMax[i];
}
}
cout <<max;
delete [] A;
return 0;
}
The behavior of this code is undefined:
void getMax(double *A){
num += 1;
double max = A[0];
double min = A[0];
for (int i =0; i<size; i++){
if(A[i] > max){
max = A[i];
}
}
threadMax[num] = max;
}
The num += 1 is an unsynchronized read-modify-write, so multiple threads can modify num at the same time. Worse, by the time a thread reads num in threadMax[num] = max;, other threads may already have changed it.
You need to assign each thread a number in some safe way.
Here are three ways it can fail:
Two threads do num += 1; at exactly the same time and as a result, num only increments once.
Every thread does num += 1; before any thread does threadMax[num] = max;. All threads then write to the same entry, threadMax[10], which is out of bounds because the array has numThreads = 10 elements with valid indices 0 to 9.
The code crashes because its behavior is undefined.
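One possible way to assign each thread a number safely is an atomic counter. The following is only a sketch under assumptions: the names kSize, kNumThreads, nextSlot and maxWithThreads are hypothetical and not from the question, and std::atomic is one option among several.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

constexpr std::size_t kSize = 10;        // hypothetical names, not from the question
constexpr std::size_t kNumThreads = 10;

std::atomic<std::size_t> nextSlot{0};    // shared counter that is safe to bump concurrently
double threadMax[kNumThreads] = {};

void getMax(const double* A) {
    // fetch_add returns the value before the increment, so every thread
    // receives a distinct slot in [0, kNumThreads).
    const std::size_t slot = nextSlot.fetch_add(1);
    assert(slot < kNumThreads);
    threadMax[slot] = *std::max_element(A, A + kSize);
}

// Runs kNumThreads workers over the same array and returns the overall maximum.
double maxWithThreads(const std::vector<double>& data) {
    assert(data.size() >= kSize);
    nextSlot = 0;
    std::vector<std::thread> workers;
    for (std::size_t j = 0; j < kNumThreads; ++j) {
        workers.emplace_back(getMax, data.data());
    }
    for (std::thread& w : workers) {
        w.join();
    }
    return *std::max_element(threadMax, threadMax + kNumThreads);
}
```

Each thread writes to its own slot, and main only reads threadMax after join(), so there is no data race on the array.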
As others have stated, your num variable is not protected from race conditions inside of getMax(), which can lead to it being corrupted, thus causing getMax() to access the threadMax[] array out of bounds.
You can avoid that by getting rid of the num variable altogether and passing the array index as an input parameter to std::thread instead.
Try something more like this:
#include <iostream>
#include <vector>
#include <array>
#include <thread>
#include <algorithm>
#include <cstdlib>
#include <ctime>
using namespace std;
const size_t size = 10;
const size_t numThreads = 10;
double threadMax[numThreads] = {};
void getMax(int idx, double *A){
threadMax[idx] = *max_element(A, A + size);
}
int main(){
srand(time(nullptr));
vector<double> A(size);
array<thread, numThreads> t;
//Assign random values to array
generate_n(A.begin(), size, [](){ return double(rand() % 100); });
/* or:
for(double &d : A){
d = double(rand() % 100);
}
*/
//create Threads
for(int j = 0; j < numThreads; ++j){
cout << A[j] << " " << j << "\n";
t[j] = thread(getMax, j, A.data());
}
//join threads
for(thread &thd : t){
thd.join();
}
//Find Max from all threads
double max = *max_element(begin(threadMax), end(threadMax)); // threadMax is a plain array, so it has no .begin()/.end() members
cout << max;
return 0;
}
I am having trouble with the push_back() function in C++. For a reason I don't understand, push_back seems to "not accept" the value I am telling it to append. The code runs until I try to display the values I want (end of code). I have checked the value type, which is double, but it still won't append it.
The value I am trying to insert comes from a function which calculates the mean of a vector after removing the NaN values. When I try to display the values I want, I always get: Segmentation fault (core dumped).
The mean function first iterates over a range of rows and collects the values into a vector, then removes the NaN values, and finally calculates the mean. I have spent quite some time trying to figure out where the error comes from without success, so any help would be highly appreciated. This is the mean function:
double mean_func(double **arr, int iterations, int header, int start){
std::vector<double> vec;
for (int i=start-iterations; i < start; i++){
vec.push_back(arr[i][header]);
}
vec.erase(std::remove_if(std::begin(vec),
std::end(vec),
[](const auto& value) { return std::isnan(value); }),
std::end(vec));
double sum = std::accumulate(vec.begin(), vec.end(), 0.0);
double mean = sum / vec.size();
return mean;
}
Here is the whole code, where transf_vec_2_array transforms a vector of vectors into an array.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <omp.h>
#include <cmath>
#include <vector>
#include <algorithm>
#include <iostream>
#include <numeric>
#include <typeinfo>
double** transf_vec_2_array(std::vector<std::vector<double> > vals, int N, int M)
{
double** temp;
temp = new double*[N];
#pragma omp parallel for
for(int i=0; (i < N); i++)
{
temp[i] = new double[M];
for(int j=0; j < M; j++)
{
temp[i][j] = vals[i][j];
}
}
return temp;
}
double mean_func(double **arr, int iterations, int header, int start){
std::vector<double> vec;
for (int i=start-iterations; i < start; i++){
vec.push_back(arr[i][header]);
}
vec.erase(std::remove_if(std::begin(vec),
std::end(vec),
[](const auto& value) { return std::isnan(value); }),
std::end(vec));
double sum = std::accumulate(vec.begin(), vec.end(), 0.0);
double mean = sum / vec.size();
return mean;
}
double** cfun(double **indata, unsigned int rows, unsigned int cols, int max_inputs, int daily_inputs, int weekly_inputs, int inputs_short_term, const char **header_1, const char **header_2, unsigned int size_header_1, unsigned int size_header_2, double **outdata) {
std::vector< std::vector <double>> seq_collec;
std::vector<double>seq;
unsigned int i, j, k, l;
unsigned int temp= 5080;
int num = omp_get_thread_num();
omp_set_num_threads(num);
//#pragma omp parallel for //private(seq)
for(i = max_inputs + weekly_inputs; i < temp ; i++) {
for(j = 0; j < size_header_1; j++ ){
for(k = 0; k < size_header_2 ; k++){
for(l = i-max_inputs; l < i; l++){
if((strcmp(header_2[k],"price")==0)|| (strcmp(header_2[k], "change")==0)){
seq.push_back(indata[l][j+k]);
seq.push_back(mean_func(indata, inputs_short_term, j*size_header_2, l));
//std::cout << i << " " << j << " " << k <<std::endl;
//std::cout << header_1[j] << " " << header_2[k] << std::endl;
//std::cout << typeid(mean).name() << std::endl;
//std::cout << mean_func(indata, inputs_short_term, j*size_header_2, l) << std::endl;
}else{
seq.push_back(indata[l][j+k]);
//std::cout << header_1[k] << " " << header_2[k] << std::endl;
}
}
}
seq.push_back(mean_func(indata, daily_inputs, j+k, l));
seq.push_back(mean_func(indata, weekly_inputs, j+k, l));
}
//std::cout << seq.size() << std::endl;
seq_collec.push_back(seq);
seq.clear();
}
outdata = transf_vec_2_array(seq_collec, seq_collec.size(), seq_collec[0].size());
//std::cout << seq_collec.size() << std::endl;
//std::cout << seq_collec[0].size() << std::endl;
return outdata;
}
int main(){
int rows = 10846, cols = 12, max_inputs = 20, daily_inputs = 1000, weekly_inputs=5000, inputs_short_term=4;
unsigned int size_header_1 = 3, size_header_2 = 4;
const char *header_1[size_header_1] = {"CH:SMI","DJIA","RUI"};
const char *header_2[size_header_2] = {"change","delta_vol","price","volume"};
double* *indata = new double*[rows];
double* *outdata = new double*[rows];
for (int i=0; i < rows; i++){
indata[i] = new double[cols];
outdata[i] = new double[cols];
}
for (int i=0; i < rows; i++){
for (int j=0; j < cols; j++){
indata[i][j]=i + j ;
}
}
outdata = cfun(indata, rows, cols, max_inputs, daily_inputs, weekly_inputs, inputs_short_term, header_1, header_2, size_header_1, size_header_2, outdata);
for(int j = 0; j < 1; j++){
for(int i = 0; i < 366; i++){
std::cout << outdata[i][j] << std::endl; // PROBLEM HERE !!!
}
}
return 0;
}
I created a matrix[10][10] with random numbers
for (int i = 0; i < 10; i++)
{
for (int j = 0; j < 10; j++)
{
matrix[i][j] = rand() % 100 ;
}
}
But I need to use a bool function to check for duplicate numbers and, if a number is already present, generate a new random one. How can I do it?
One way to test for duplicates is to store the elements that have already been inserted into the matrix in a std::vector and to use std::find. This makes it possible to check whether a newly generated random number is already among the previously stored elements. If it is found, another random number should be generated and the test repeated.
#include <iostream>
#include <cstdlib>
#include <vector>
#include <algorithm>
bool alreadySelected(int n, int nvalues, int values[][10]) {
std::vector<int> v(&values[0][0], &values[0][0] + nvalues );
return (std::find(v.begin(), v.end(), n) != v.end());
}
int main() {
int matrix[10][10];
for (int i = 0; i < 10; i++) {
int n;
bool dupe;
for (int j = 0; j < 10; j++) {
int nvalues = i * 10 + j;
do {
n = std::rand() % 100 ;
dupe = alreadySelected( n, nvalues, matrix );
} while ( dupe );
matrix[i][j] = n;
std::cout << matrix[i][j] << " ";
}
std::cout << "\n";
}
}
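A much simpler way to generate such a matrix would be to fill it with the values 0 to 99 and shuffle them. Note that std::random_shuffle was deprecated in C++14 and removed in C++17; std::shuffle is its replacement.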
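A minimal sketch of the shuffle idea, assuming C++11 or later. The function name makeUniqueMatrix and its seed parameter are hypothetical, added here only so that a run can be repeated.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <numeric>
#include <random>

using Matrix10 = std::array<std::array<int, 10>, 10>;

// Fills a 10x10 matrix with the values 0..99, each exactly once, in random order.
Matrix10 makeUniqueMatrix(unsigned seed) {
    std::array<int, 100> values;
    std::iota(values.begin(), values.end(), 0);       // 0, 1, ..., 99
    std::mt19937 gen(seed);
    std::shuffle(values.begin(), values.end(), gen);  // random permutation
    Matrix10 matrix{};
    for (int i = 0; i < 10; ++i) {
        for (int j = 0; j < 10; ++j) {
            matrix[i][j] = values[i * 10 + j];
        }
    }
    return matrix;
}
```

Because the values are a permutation of 0..99, no duplicate check or retry loop is needed. In a real program the seed could come from std::random_device instead of a fixed number.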
There are multiple ways to achieve this.
Write a function which returns bool and takes the 10*10 matrix. Compute the sum of all its elements and compare it with the sum of the numbers 0...99, which is n(n+1)/2 = 4950 for n = 99. If the matrix has no duplicates the two sums match, so a mismatch proves there is a duplicate. Note that a matching sum does not by itself prove uniqueness: for example, 1, 1, 2, 2 has the same sum as 0, 1, 2, 3.
In the function, create an array of size 100 and initialize all of its elements to 0. Iterate over the matrix, using each matrix element as an index into the array. If the array already contains 1 at that position, you have found a duplicate; otherwise set the array element at that index to 1.
Implementation of the first approach:
#include <iostream>
#include <cstdlib>
#include <iomanip>
#define ROW 10
#define COL 10
#define MOD 100
int main()
{
int mat[ROW][COL];
int sum = 0;
int range_sum = ((MOD-1)*(MOD))/2; // n = MOD-1, sum = n(n+1)/2
while(true){
sum = 0;
for(int i = 0; i < ROW; i++){
for(int j = 0; j < COL; j++){
mat[i][j] = rand()%MOD;
sum += mat[i][j];
}
}
if(sum==range_sum){
break;
}
}
for(int i = 0; i < ROW; i++){
for(int j = 0; j < COL; j++){
std::cout << std::setw(2) << mat[i][j] << " ";
}
std::cout << std::endl;
}
return 0;
}
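The second approach (the array of flags) can be sketched as follows. This is only an illustration: the function name hasDuplicate is hypothetical, and it assumes every element lies in the range 0..99, as produced by rand() % 100.

```cpp
#include <array>
#include <cassert>

// Returns true if any value appears more than once in the matrix.
// Assumes every element lies in [0, 100).
bool hasDuplicate(const int (&matrix)[10][10]) {
    std::array<bool, 100> seen{};  // all flags start as false
    for (const auto& row : matrix) {
        for (int value : row) {
            assert(value >= 0 && value < 100);
            if (seen[value]) {
                return true;       // this value was already encountered
            }
            seen[value] = true;
        }
    }
    return false;
}
```

Unlike the sum comparison, this check cannot be fooled by a matrix such as one containing 1, 1, 2, 2 in place of 0, 1, 2, 3, because it tracks each value individually.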
I'm trying to perform matrix multiplication using openMP as follows and I compile it using GCC : g++ -std=gnu++11 -g -Wall -fopenmp -o parallel_not_opt parallel_not_opt.cpp
But when I try to run it by using parallel_not_opt.exe, it aborts giving the typical Windows error parallel_not_opt.exe has stopped working...
Am I missing something?
#include "includes/stdafx.h"
#include <iostream>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <vector>
# include <omp.h>
#include <chrono>
#include <fstream>
#include <algorithm>
#include <immintrin.h>
#include <cfloat>
#include <limits>
#include <math.h>
using namespace std::chrono;
using namespace std;
//populate matrix with random values.
double** generateMatrix(int n){
double max = DBL_MAX;
double min = DBL_MIN;
double** matA = new double*[n];
for (int i = 0; i < n; i++) {
matA[i] = new double[n];
for (int j = 0; j < n; j++) {
double randVal = (double)rand() / RAND_MAX;
matA[i][j] = min + randVal * (max - min);
}
}
return matA;
}
//generate matrix for final result.
double** generateMatrixFinal(int n){
double** matA = new double*[n];
for (int i = 0; i < n; i++) {
matA[i] = new double[n];
for (int j = 0; j < n; j++) {
matA[i][j] = 0;
}
}
return matA;
}
//matrix multiplication - parallel
double matrixMultiplicationParallel(double** A, double** B, double** C, int n){
int i, j, k;
clock_t begin_time = clock();
# pragma omp parallel shared ( A,B,C,n ) // private ( i, j, k )
{
# pragma omp for
for (i = 0; i < n; i++) {
// cout<< i << ", " ;
for (j = 0; j < n; j++) {
for (k = 0; k < n; k++) {
C[i][j] += A[i][k] * B[k][j];
}
}
}
}
double t = float(clock() - begin_time);
return t;
}
int _tmain(int argc, _TCHAR* argv[])
{
ofstream out("output.txt", ios::out | ios::app);
out << "--------------STARTED--------------" << "\n";
int start = 200, stop = 2000, step = 200;
for (int n = start; n <= stop; n += step)
{
srand(time(NULL));
cout << "\nn: " << n << "\n";
double t1 = 0;
int my_size = n;
double **A = generateMatrix(my_size);
double **B = generateMatrix(my_size);
double **C = generateMatrixFinal(my_size);
double single_sample_time = matrixMultiplicationParallel(A, B, C, n);
t1 += single_sample_time;
for (int i = 0; i < n; i++) {
delete[] A[i];
delete[] B[i];
delete[] C[i];
}
delete[] A;
delete[] B;
delete[] C;
}
out << "-----------FINISHED-----------------" << "\n";
out.close();
return 0;
}
The private ( i, j, k ) declaration is not optional. Add it back, otherwise the inner loop variables j and k are shared, which completely messes up the inner loops.
It is better to declare variables as locally as possible. That makes reasoning about OpenMP code much easier:
clock_t begin_time = clock();
# pragma omp parallel
{
# pragma omp for
for (int i = 0; i < n; i++) {
for (int j = 0; j < n; j++) {
for (int k = 0; k < n; k++) {
C[i][j] += A[i][k] * B[k][j];
}
}
}
}
return float(clock() - begin_time);
In that case, A, B, and C are shared by default because they are declared outside the parallel region, and j and k are private because they are declared inside it. The loop variable of a parallel for is always implicitly private.
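As a self-contained illustration of the same scoping rule, here is a small sketch. It uses std::vector instead of the raw pointers from the question, and the names Matrix and multiplyParallel are hypothetical. It should be compiled with -fopenmp; without that flag the pragma is ignored and the loop simply runs serially.

```cpp
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Multiplies two square n x n matrices and returns the product.
Matrix multiplyParallel(const Matrix& A, const Matrix& B) {
    const int n = static_cast<int>(A.size());
    Matrix C(n, std::vector<double>(n, 0.0));
    // i is the parallel loop variable (implicitly private); j, k and acc are
    // declared inside the parallel region, so each thread has its own copies.
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            double acc = 0.0;
            for (int k = 0; k < n; ++k) {
                acc += A[i][k] * B[k][j];
            }
            C[i][j] = acc;  // each thread writes only to the rows it owns
        }
    }
    return C;
}
```

Accumulating into the local acc and storing it once per element also avoids repeated writes to the shared matrix C in the innermost loop.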
int temp;
for (int j = 0; j < vecsize - 1; ++j) {
int min = sort.at(j);
for (int i = j+1; i < vecsize; ++i) {
if (min > sort.at(i)) {
min = sort.at(i);
temp = i;
}
}
swap(sort.at(j), sort.at(temp));
}
I am trying to sort (in ascending order) the vector of: 23 42 4 16 8 15
However, my attempt at using selection sort outputs: 4 8 15 23 16 42
What am I doing wrong?
When you define min, you assign it the value of sort at index j. However, the extra variable temp that you use for the swap is not reset before the inner for loop the way min is. If none of the remaining elements is smaller than sort[j], temp is never assigned in that iteration of the outer loop, so it is either uninitialized or still holds the index from a previous iteration, and the swap moves the wrong element.
int temp;
for (int j = 0; j < vecsize - 1; ++j) {
int min = sort.at(j);
temp = j; // HERE'S WHAT'S NEW
for (int i = j+1; i < vecsize; ++i) {
if (min > sort.at(i)) {
min = sort.at(i);
temp = i;
}
}
swap(sort.at(j), sort.at(temp));
}
With this change the code produces the desired output: 4 8 15 16 23 42.
Try this corrected code:
#include <iostream>
#include <vector>
using namespace std;
void print (vector<int> & vec) {
for (int i =0 ; i < vec.size(); ++i) {
cout << vec[i] << " ";
}
cout << endl;
}
int main() {
int temp;
vector<int> sort;
sort.push_back(23);
sort.push_back(42);
sort.push_back( 4);
sort.push_back( 16);
sort.push_back( 8);
sort.push_back(15);
print(sort);
int vecsize = sort.size();
for (int j = 0; j < vecsize - 1; ++j) {
int min = j;
for (int i = j+1; i < vecsize; ++i) {
if (sort.at(min) > sort.at(i)) {
min = i;
}
}
if (min != j)
swap(sort.at(j), sort.at(min));
}
print(sort);
return 0;
}
If you can use C++11, you can also let the standard library do the sorting with std::sort, optionally passing a lambda as the comparison. It is well optimized and less error-prone than a hand-written loop, so it is worth trying in the future.
A short example:
// Example program
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
int main()
{
std::vector<int> myVector;
myVector.emplace_back(23);
myVector.emplace_back(42);
myVector.emplace_back(4);
myVector.emplace_back(16);
myVector.emplace_back(8);
myVector.emplace_back(15);
std::sort(myVector.begin(), myVector.end(),
[](int a, int b) -> bool
{
return a < b;
});
}
I'm trying to parallelize a "for" loop with OpenMP.
However, the parallel code gives a different result from the non-parallel code. I believe this is related to the sum variable being defined outside of the loop, but I don't know how to solve the problem.
What I want is to parallelize the first "for" loop.
Here is the simplest example I could find.
//g++ -o test2 test2.cpp -fopenmp
//
//
#include <cmath>
#include <iostream>
using namespace std;
double f(double i, double j)
{
return i + j;
}
int main()
{
const int size = 256;
double sum = 0;
//will use openmp
#pragma omp parallel for
for(int i = 0; i < size; i = i + 1)
{
for(int j = 0; j < size; j=j+1)
{
if(i != j)
{
sum = sum + f(i,j);
}
}
}
cout << "sum = " << sum << endl;
//not using openmp
sum = 0;
for(int i = 0; i < size; i = i + 1)
{
for(int j = 0; j < size; j=j+1)
{
if(i != j)
{
sum = sum + f(i,j);
}
}
}
cout << "sum = " << sum << endl;
}
Your problem is that several threads access sum at the same time. That is, when the first thread reaches
sum=sum+f(i,j);
it reads sum, does the calculation, and writes the result back to sum. If another thread arrives at that line in the meantime, it reads the old value of sum and later writes its own result, overwriting the first thread's update.
A solution is to compute the increment first and protect the update with a critical section:
double increment=f(i,j);
#pragma omp critical
sum+=increment;
Also note that the results of your original code are not predictable and can change from run to run.
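A different technique from the critical section above, and the one usually suggested for sums, is OpenMP's reduction clause: each thread accumulates into a private copy of sum, and the copies are combined at the end of the loop. The sketch below assumes the same f as in the question; the function name sumOffDiagonal is hypothetical.

```cpp
double f(double i, double j) {
    return i + j;
}

// Sums f(i, j) over all pairs with i != j, for 0 <= i, j < size.
double sumOffDiagonal(int size) {
    double sum = 0.0;
    // reduction(+:sum) gives every thread its own private sum and adds
    // the partial sums together when the parallel loop finishes.
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < size; ++i) {
        for (int j = 0; j < size; ++j) {
            if (i != j) {
                sum += f(i, j);
            }
        }
    }
    return sum;
}
```

This avoids entering a critical section on every iteration. With floating-point data in general the order of additions may differ between runs, which can change the last digits; here all terms are small integers, so the result is exact.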
Thank you for your answer, it finally works.
The following is working code using Christoph's solution.
//g++ -o test2 test2.cpp -fopenmp
#include <cmath>
#include <iostream>
using namespace std;
double f(double i, double j)
{
return i + j;
}
int main()
{
const int size = 256;
double sum = 0;
//will use openmp
#pragma omp parallel for
for(int i = 0; i < size; i = i + 1)
{
for(int j = 0; j < size; j=j+1)
{
if(i != j)
{
double increment = f(i,j);
#pragma omp critical
sum = sum + increment;
}
}
}
cout << "sum = " << sum << endl;
//not using openmp
sum = 0;
for(int i = 0; i < size; i = i + 1)
{
for(int j = 0; j < size; j=j+1)
{
if(i != j)
{
sum = sum + f(i,j);
}
}
}
cout << "sum = " << sum << endl;
}