I have some code written in C++. When I compile and run it on my laptop, it produces the expected results. However, when I compile and run the same code on the RPI, I get the error:
Segmentation fault
How the program (currently) works:
Reads in a (.wav) file into a vector of doubles ("rawData")
Splits the rawData into blocks (blockked)
The segmentation fault happens when I try to split the data into blocks. The sizes:
rawData - 57884
blockked - 112800
Now, I know the RPI only has 256MB of memory and this could possibly be the problem; alternatively, I'm not handling the data properly. I have included some code as well, to help demonstrate how things are running:
(main.cpp):
int main()
{
int N = 600;
int M = 200;
float sumthresh = 0.035;
float zerocorssthres = 0.060;
Wav sampleWave;
if(!sampleWave.readAudio("repositry/example.wav", DOUBLE))
{
cout << "Cannot open the file BOOM";
}
// Return the data
vector<double> rawData = sampleWave.returnRaw();
// THIS segments (typedef vector<double> iniMatrix;)
vector<iniMatrix> blockked = sampleWave.something(rawData, N, M);
cout << rawData.size();
return EXIT_SUCCESS;
}
(function: something)
int n = theData.size();
int maxblockstart = n - N;
int lastblockstart = maxblockstart - (maxblockstart % M);
int numblocks = (lastblockstart)/M + 1;
vector< vector<double> > subBlock;
vector<double> temp;
this->width = N;
this->height = numblocks;
subBlock.resize(600*187);
for(int i=0; (i < 600); i++)
{
subBlock.push_back(vector<double>());
for(int j=0; (j < 187); j++)
{
subBlock[i].push_back(theData[i*N+j]);
}
}
return subBlock;
Any suggestions would be greatly appreciated :)! Hopefully this is enough description.
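You're probably overrunning an array somewhere (maybe not even in the code you posted). In the posted loop, for example, theData[i*N+j] is read up to index 599*600 + 186 = 359586, far beyond the 57884 samples in rawData. I'm not really sure what you're trying to do with the blocking either, but I guess you want to split your wave file into 600-sample chunks?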
If so, I think you want something more like the following:
std::vector<std::vector<double>>
SimpleWav::something(const std::vector<double>& data, int N) {
//How many blocks of size N can we get?
int num_blocks = data.size() / N;
//Create the vector with enough empty slots for num_blocks blocks
std::vector<std::vector<double>> blocked(num_blocks);
//Loop over all the blocks
for(int i = 0; i < num_blocks; i++) {
//Resize the inner vector to fit this block
blocked[i].resize(N);
//Pull each sample for this block
for(int j = 0; j < N; j++) {
blocked[i][j] = data[i*N + j];
}
}
return blocked;
}
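If the blocks are meant to overlap, as the N = 600 and M = 200 in the question suggest, a minimal sketch could look like the following. It assumes M is the hop between consecutive block starts, which is how the question's numblocks formula reads. The names splitIntoBlocks, block_len and hop are hypothetical and not part of the original Wav class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical free function: split `data` into blocks of `block_len`
// samples whose start positions are `hop` samples apart.
// Every index read is guaranteed to lie inside `data`.
std::vector<std::vector<double>>
splitIntoBlocks(const std::vector<double>& data,
                std::size_t block_len, std::size_t hop)
{
    assert(block_len > 0 && hop > 0);

    std::vector<std::vector<double>> blocks;
    if (data.size() < block_len) {
        return blocks; // not enough samples for even one block
    }

    // Same count as the question's formula: floor((n - N) / M) + 1
    const std::size_t num_blocks = (data.size() - block_len) / hop + 1;
    blocks.reserve(num_blocks);

    for (std::size_t b = 0; b < num_blocks; ++b) {
        const std::size_t start = b * hop;
        // Copy samples [start, start + block_len) into a new inner vector
        blocks.emplace_back(data.begin() + start,
                            data.begin() + start + block_len);
    }
    return blocks;
}
```

Because the number of blocks is derived from data.size() rather than hard-coded, the loop cannot run past the end of the input.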
Related
I wrote this knapsack problem solution in C++; however, when I run it, it gives me a segmentation fault.
I have tried everything and the program always gives me the segmentation fault error.
#include<iostream>
#include<algorithm>
int knapsack(int v[],int w[],int n,int W)
{
int V[n][W];
for(int i = 0; i<=W;i++)
{
V[0][i] = 0;
}
for(int i = 0; i <= n; i++){
for(int j = 1; j<=W; j++)
{
if(w[i]<=W)
{
V[i][j] = std::max(V[i-1][j], v[i]+V[i-1][j-w[i]]);
}
else
{
V[i][j] = V[i-1][j];
}
}
}
return V[n][W];
}
int main()
{
int v[4] = {10,40,30,50};
int w[4] = {5,4,6,3};
int n = 3;
int W = 10;
std::cout<<"item value:"<<knapsack(v,w,n,W);
}
Don't use VLAs. In standard C++ the size of an array must be known at compile time; variable-length arrays are compiler extensions that are not portable and introduce some hidden costs.
Array indices go from 0 to length-1. In your loop
for(int i = 0; i<=W;i++)
i can reach W, so V[0][W] is out of bounds, which can cause the seg fault. You have to use < instead of <=:
for(int i = 0; i < W; i++)
n should probably be 4 if it's meant to represent the size of the array. A std::vector would make your life easier here, because a vector knows its size.
In general, don't use C-style arrays or raw pointers at all in this day and age; use std::vector instead.
int V[n][W];
for(int i = 0; i<=W;i++)
{
V[0][i] = 0;
}
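Note that V's indexes go from V[0][0] to V[0][W-1]. Your for loop will try to access V[0][W].
The same error is repeated in other places. The end condition in your for loops should be < (strictly less than) instead of <= (less than or equal to). Note also that the second loop starts at i = 0 but reads V[i-1], and the function returns V[n][W]; if the table is meant to cover rows 0..n and columns 0..W inclusive, it needs n+1 rows and W+1 columns.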
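Putting the points from both answers together, a minimal sketch of a bounds-safe version might look like this. It is not the asker's code: it swaps the VLA for std::vector, sizes the table as (n+1) x (W+1), and assumes item weights are non-negative. The parameter names are illustrative.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// 0/1 knapsack using a (n+1) x (capacity+1) table.
// best[i][c] = best value achievable with the first i items and capacity c.
// Assumes all weights are non-negative.
int knapsack(const std::vector<int>& values,
             const std::vector<int>& weights,
             int capacity)
{
    assert(values.size() == weights.size() && capacity >= 0);
    const std::size_t n = values.size();

    // Row 0 (no items) is all zeros, so no separate init loop is needed.
    std::vector<std::vector<int>> best(n + 1,
                                       std::vector<int>(capacity + 1, 0));

    for (std::size_t i = 1; i <= n; ++i) {
        for (int c = 0; c <= capacity; ++c) {
            // Option 1: skip item i-1
            best[i][c] = best[i - 1][c];
            // Option 2: take item i-1 if it fits in the remaining capacity c
            if (weights[i - 1] <= c) {
                best[i][c] = std::max(
                    best[i][c],
                    values[i - 1] + best[i - 1][c - weights[i - 1]]);
            }
        }
    }
    return best[n][capacity];
}
```

Here <= is valid in both loops because the table was allocated with one extra row and column, and the item index is i - 1 so row 0 is never read with a negative index.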
I'm new to Pthreads and C++ and am trying to parallelize an image-flipping program. Obviously it isn't working. I'm told I need to port some code from an Image class, but I'm not really sure what porting means. I just copied and pasted the code, but I guess that's wrong.
I get the general idea: allocate the workload, initialize the threads, create the threads, join the threads, and define a callback function.
I'm not totally sure what cells_per_thread should be. I'm pretty sure it should be the image width * height / threads. Does that seem correct?
I'm getting multiple errors when compiling with CMake.
It's saying m_thread_number, getWidth, getHeight, getPixel, and temp are not defined in the scope. I assume that's because the Image class code isn't ported?
PthreadImage.cxx
//Declare a callback function for horizontal flip
void* H_flip_callback_function(void* aThreadData);
PthreadImage PthreadImage::flipHorizontally() const
{
if (m_thread_number == 0 || m_thread_number == 1)
{
return PthreadImage(Image::flipHorizontally(), m_thread_number);
}
else
{
PthreadImage temp(getWidth(), getHeight(), m_thread_number);
//Workload allocation
//Create a vector of ThreadData (the struct declared at the top of the class), one entry per thread.
vector<ThreadData> p_thread_data(m_thread_number);
//create an integer to hold the last element. Initialize it as -1.
int last_element = -1;
//create an unsigned int to hold how many cells we need per thread. For the image we want the width and height divided by the number of threads.
unsigned int cells_per_thread = getHeight() * getWidth() / m_thread_number;
//Next create a variable to hold the remainder of the division.
unsigned int remainder = getHeight() * getWidth() % m_thread_number;
//print the number of cells per thread to the console
cout << "Default number for cells per thread: " << cells_per_thread << endl;
//initialize the threads with a for loop to iterate through each thread and populate it
for (int i = 0; i < m_thread_number; i++)
{
//thread ids correspond with the for loop index values.
p_thread_data[i].thread_id = i;
//start is last element + 1 i.e -1 + 1 start = 0.
p_thread_data[i].start_id = ++last_element;
p_thread_data[i].end_id = last_element + cells_per_thread - 1;
p_thread_data[i].input = this;
p_thread_data[i].output = &temp;
//if the remainder is > 0, add 1 to the end, then remove 1 from the remainder.
if (remainder > 0)
{
p_thread_data[i].end_id++;
--remainder;
}
//update last_element to the end index of this thread.
last_element = p_thread_data[i].end_id;
//print to the console which cells this thread starts and ends on
cout << "Thread[" << i << "] starts with " << p_thread_data[i].start_id << " and stops on " << p_thread_data[i].end_id << endl;
}
//create the threads with another for loop
for (int i = 0; i < m_thread_number; i++)
{
pthread_create(&p_thread_data[i].thread_id, NULL, H_flip_callback_function, &p_thread_data[i]);
}
//Wait for each thread to complete;
for (int i = 0; i < m_thread_number; i++)
{
pthread_join(p_thread_data[i].thread_id, NULL);
}
return temp;
}
}
Callback function
//Define the callback function for horizontal flip
void* H_flip_callback_function(void* aThreadData)
{
//convert void to Thread data
ThreadData* p_thread_data = static_cast<ThreadData*>(aThreadData);
int tempHeight = temp(getHeight());
int tempWidth = temp(getWidth());
for (int i = p_thread_data->start_id; i <= p_thread_data->end_id; i++)
{
// Process every row of the image
for (unsigned int j = 0; j < m_height; ++j)
{
// Process every column of the image
for (unsigned int i = 0; i < m_width / 2; ++i)
{
(*(p_thread_data->output))( i, j) = getPixel(m_width - i - 1, j);
(*(p_thread_data->output))(m_width - i - 1, j) = getPixel( i, j);
}
}
}
}
Image class
#include <sstream> // Header file for stringstream
#include <fstream> // Header file for filestream
#include <algorithm> // Header file for min/max/fill
#include <numeric> // Header file for accumulate
#include <cmath> // Header file for abs and pow
#include <vector>
#include "Image.h"
//-----------------
Image::Image():
//-----------------
m_width(0),
m_height(0)
//-----------------
{}
//----------------------------------
Image::Image(const Image& anImage):
//----------------------------------
m_width(anImage.m_width),
m_height(anImage.m_height),
m_p_image(anImage.m_p_image)
//----------------------------------
Image class code to be ported
//-----------------------------------
Image Image::flipHorizontally() const
//-----------------------------------
{
// Create an image of the right size
Image temp(getWidth(), getHeight());
// Process every row of the image
for (unsigned int j = 0; j < m_height; ++j)
{
// Process every column of the image
for (unsigned int i = 0; i < tempWidth / 2; ++i)
{
temp(i, j) = getPixel(tempWidth - i - 1, j);
temp(tempWidth - i - 1, j) = getPixel(i, j);
}
}
return 0;
}
I feel like it's pretty close. Any help greatly appreciated!
EDIT
OK, so this is the correct code for anyone wasting their time on this.
There were obviously a fair few things wrong.
I don't know why there were 3 for loops. There should be 2: 1 for rows and 1 for columns.
The cells_per_thread should be pixels_per_thread, and it should be rows/threads as @Larry B suggested, not ALL the pixels per thread.
You can use -> to access members through a pointer, i.e. setPixel(), getPixel(), etc. Who knew that!?
There was a data structure that was pretty important for you guys, but I forgot to include it.
struct ThreadData
{
pthread_t thread_id;
unsigned int start_id;
unsigned int end_id;
const Image* input;
Image* output;
};
Correct Callback
void* H_flip_callback_function(void* aThreadData)
{
//convert void to Thread data
ThreadData* p_thread_data = static_cast<ThreadData*>(aThreadData);
int width = p_thread_data->input->getWidth();
// Process every row of the image
for (unsigned int j = p_thread_data->start_id; j <=p_thread_data->end_id; ++j)
{
// Process every column of the image
for (unsigned int i = 0; i < width / 2; ++i)
{
p_thread_data->output->setPixel(i,j, p_thread_data->input->getPixel(width - i - 1, j));
p_thread_data->output->setPixel(width - i - 1, j, p_thread_data->input->getPixel(i, j));
}
}
return 0;
}
So now this code compiles and flips.
Thanks!
The general strategy for porting single-threaded code to a multi-threaded version is essentially to rewrite the existing code so that the work is divided into self-contained units that you can hand off to a thread for execution.
With that in mind, I don't agree with your implementation of H_flip_callback_function:
void* H_flip_callback_function(void* aThreadData)
{
//convert void to Thread data
ThreadData* p_thread_data = static_cast<ThreadData*>(aThreadData);
// Create an image of the right size
PthreadImage temp(getWidth(), getHeight(), m_thread_number);
int tempHeight = temp(getHeight());
int tempWidth = temp(getWidth());
for (int i = p_thread_data->start_id; i <= p_thread_data->end_id; i++)
{
// Process every row of the image
for (unsigned int j = 0; j < tempHeight; ++j)
{
// Process every column of the image
for (unsigned int i = 0; i < tempWidth / 2; ++i)
{
temp(i, j) = getPixel(tempWidth - i - 1, j);
temp(tempWidth - i - 1, j) = getPixel(i, j);
}
}
}
}
At face value, it looks like all your threads will be operating on the whole image. If this is the case, there is no real difference between your single- and multi-thread versions, as you're just doing the same work multiple times in the multi-thread version.
I would argue that the smallest self-contained unit of work would be to horizontally flip a single row of the image. However, if you have fewer threads than rows, then you could allocate (Num rows / Num threads) rows to each thread. Each thread would then flip the rows assigned to it and the main thread would collect the results and assemble the final image.
With regard to your build warnings and errors, you'll have to provide the complete source code, build settings, environment, etc.
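The row-based split described above can be sketched without any threading code, which makes the arithmetic easy to check on its own. This is only an illustration under stated assumptions: partitionRows and flipRows are hypothetical helpers, the image is modelled as a row-major std::vector<int> rather than the Image class, and each returned range would be what one thread's callback processes.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Split num_rows rows into num_threads contiguous half-open ranges
// [first, second). The first (num_rows % num_threads) ranges get one
// extra row, so every row is assigned exactly once.
std::vector<std::pair<std::size_t, std::size_t>>
partitionRows(std::size_t num_rows, std::size_t num_threads)
{
    assert(num_threads > 0);
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    ranges.reserve(num_threads);

    const std::size_t base = num_rows / num_threads;
    std::size_t remainder = num_rows % num_threads;
    std::size_t start = 0;

    for (std::size_t t = 0; t < num_threads; ++t) {
        std::size_t count = base;
        if (remainder > 0) { // hand out leftover rows one at a time
            ++count;
            --remainder;
        }
        ranges.emplace_back(start, start + count);
        start += count;
    }
    return ranges;
}

// Flip rows [row_begin, row_end) of a row-major image horizontally,
// reading from `in` and writing to `out`. Different row ranges touch
// disjoint parts of `out`, so ranges can be processed independently.
void flipRows(const std::vector<int>& in, std::vector<int>& out,
              std::size_t width,
              std::size_t row_begin, std::size_t row_end)
{
    for (std::size_t row = row_begin; row < row_end; ++row) {
        for (std::size_t col = 0; col < width; ++col) {
            out[row * width + col] = in[row * width + (width - 1 - col)];
        }
    }
}
```

In a Pthreads version, each pair from partitionRows would fill the start and end fields of one ThreadData entry, and the callback would do what flipRows does for its own range only.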
So I have a vector of vectors of type double. I basically need to set 360 numbers in cosY and put those 360 numbers into cosineY[0], then calculate another 360 numbers with a different a and put them into cosineY[1], and so on. Technically my vector is going to be cosineY[a]. I then need to be able to take out just the cosY for an a that I specify...
My code is saying this:
for (int a = 0; a < 8; a++)
{
for int n=0; n <= 360; n++
{
cosY[n] = cos(a*vectorOfY[n]);
}
cosineY.push_back(cosY);
}
which I hope is the correct way of actually setting it.
But then I need to take cosY for an a that I specify, and calculate another 360-element vector, which will again be stored in a vector of vectors.
Right now I've got:
for (int a = 0; a < 8; a++
{
for (int n = 0; n <= 360; n++)
{
cosProductPt[n] = (VectorOfY[n]*cosY[n]);
}
CosProductY.push_back(cosProductPt);
}
The VectorOfY is basically the amplitude of an input wave. What I am doing is trying to create a cosine wave with different frequencies (a). I am then calculating the product of the input and cosine wave at each frequency. I need to be able to access these 360 points for each frequency later on in the program, and right now I also need to calculate the sum of all elements in cosProductPt for every frequency (stored in CosProductY), and store it in a vector dotProductCos[a].
I've been trying to work it out but I don't know how to access all the elements in a vector of vectors to add them. I've been trying to do this for the whole day without any results. Right now I know so little that I don't even know how I would display or access a vector inside a vector, but I need to use that access point for the addition.
Thank you for your help.
for (int a = 0; a < 8; a++)
{
for (int n=0; n < 360; n++) // note: traded in <= for <. I think you had an off-by-one
// error here.
{
cosY[n] = cos(a*vectorOfY[n]);
}
cosineY.push_back(cosY);
}
Is sound so long as cosY has been pre-allocated to contain at least 360 elements. You could
std::vector<std::vector<double>> cosineY;
std::vector<double> cosY(360); // strongly consider replacing the 360 with a well-named
// constant
for (int a = 0; a < 8; a++) // same with that 8
{
for (int n=0; n < 360; n++)
{
cosY[n] = cos(a*vectorOfY[n]);
}
cosineY.push_back(cosY);
}
for example, but this hangs on to cosY longer than you need to and could cause problems later, so I'd probably scope cosY by throwing the above code into a function.
std::vector<std::vector<double>> buildStageOne(std::vector<double> &vectorOfY)
{
std::vector<std::vector<double>> cosineY;
std::vector<double> cosY(NumDegrees);
for (int a = 0; a < NumVectors; a++)
{
for (int n=0; n < NumDegrees; n++)
{
cosY[n] = cos(a*vectorOfY[n]); // take radians into account if needed.
}
cosineY.push_back(cosY);
}
return cosineY;
}
This looks horrible, returning the vector by value, but the vast majority of compilers will take advantage of copy elision (or, failing that, move the vector instead of copying it) to eliminate the copying.
Then I'd do almost the exact same thing for the second step.
std::vector<std::vector<double>> buildStageTwo(std::vector<double> &vectorOfY,
std::vector<std::vector<double>> &cosineY)
{
std::vector<std::vector<double>> CosProductY;
std::vector<double> cosProductPt(NumDegrees); // scratch buffer, reused for each a
for (int a = 0; a < NumVectors; a++)
{
for (int n = 0; n < NumDegrees; n++)
{
cosProductPt[n] = (vectorOfY[n]*cosineY[a][n]);
}
CosProductY.push_back(cosProductPt);
}
return CosProductY;
}
But we can make a couple of optimizations:
std::vector<std::vector<double>> buildStageTwo(std::vector<double> &vectorOfY,
std::vector<std::vector<double>> &cosineY)
{
std::vector<std::vector<double>> CosProductY;
std::vector<double> cosProductPt(NumDegrees); // scratch buffer, reused for each a
for (int a = 0; a < NumVectors; a++)
{
// why risk constantly looking up cosineY[a]? grab it once and cache it
std::vector<double> & cosY = cosineY[a]; // note the reference
for (int n = 0; n < NumDegrees; n++)
{
cosProductPt[n] = (vectorOfY[n]*cosY[n]);
}
CosProductY.push_back(cosProductPt);
}
return CosProductY;
}
And the next is kind of an extension of the first:
std::vector<std::vector<double>> buildStageTwo(std::vector<double> &vectorOfY,
std::vector<std::vector<double>> &cosineY)
{
std::vector<std::vector<double>> CosProductY;
std::vector<double> cosProductPt(NumDegrees);
for (std::vector<double> & cosY: cosineY) // range-based for; no index a needed
{
for (int n = 0; n < NumDegrees; n++)
{
cosProductPt[n] = (vectorOfY[n]*cosY[n]);
}
CosProductY.push_back(cosProductPt);
}
return CosProductY;
}
We could do the same range-based for trick for the for (int n = 0; n < NumDegrees; n++), but since we are iterating multiple arrays here it's not all that helpful.
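The question also asks how to add up the 360 products for each frequency and store the result in dotProductCos[a]. One possible sketch, assuming the standard std::inner_product is acceptable here, computes that sum of products directly from the input and each row of cosineY. The function name dotProducts is hypothetical; an individual element of a vector of vectors is simply cosineY[a][n].

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// For every row cosY in cosineY, compute sum over n of
// vectorOfY[n] * cosY[n] and collect the results, one per frequency.
std::vector<double>
dotProducts(const std::vector<double>& vectorOfY,
            const std::vector<std::vector<double>>& cosineY)
{
    std::vector<double> dotProductCos;
    dotProductCos.reserve(cosineY.size());

    for (const std::vector<double>& cosY : cosineY) {
        // Each row must have one value per input sample
        assert(cosY.size() == vectorOfY.size());
        dotProductCos.push_back(
            std::inner_product(vectorOfY.begin(), vectorOfY.end(),
                               cosY.begin(), 0.0));
    }
    return dotProductCos;
}
```

If the per-point products in CosProductY are already stored, std::accumulate(row.begin(), row.end(), 0.0) over each of its rows gives the same sums. In both calls the initial value is written as 0.0 rather than 0 so the accumulation is done in double.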
Dear Stack Community,
I'm doing a DSP exercise to complement my C++ FIR lowpass filter with filter coefficients designed in and exported from Matlab. The DSP exercise in question is the act of decimating the output array of the FIR lowpass filter to a lower sample rate by a factor of 'M'. In C++ I made a successful but extremely simple implementation within a .cpp file and I've been trying hard to convert it to a function to which I can give the output array of the FIR filter. Here is the very basic version of the code:
int n = 0;
int length = 50;
int M = 12;
float array[length];
float array2[n];
for (int i = 0 ; i<length; i++) {
array[i] = std::rand();
}
for (int i = 0; i<length; i=i+M) {
array2[n++] = array[i];
}
for (int i = 0; i<n; i++) {
std::cout << i << " " << array2[i] << std::endl;
}
As you can see, it's very simple. My attempt to convert this to a function is unfortunately not working. Here is the function as is:
std::vector<float> decimated_array(int M,std::vector<float> arr){
size_t n_idx = 0;
std::vector<float> decimated(n_idx);
for (int i = 0; i<(int)arr.size(); i = i + M) {
decimated[n_idx++] = arr[i];
}
return decimated;
}
This produces a very common Xcode error of EXC_BAD_ACCESS when using this section of code in the .cpp file. The error occurs in the line 'decimated[n_idx++] = arr[i];' specifically:
int length = 50;
int M = 3;
std::vector<float> fct_array(length);
for (int i = 0 ; i<length; i++) {
fct_array[i] = std::rand();
}
FIR_LPF test;
std::vector<float> output;
output = test.decimated_array(M,fct_array);
I'm trying to understand what is incorrect with my application of or perhaps just my translation of the algorithm into a more general setting. Any help with this matter would be greatly appreciated and hopefully this is clear enough for the community to understand.
Regards, Vhaanzeit
The issue:
size_t n_idx = 0;
std::vector<float> decimated(n_idx);
You did not size the vector before you used it, thus you were invoking undefined behavior when assigning to element 0, 1, etc. of the decimated vector.
What you could have done is in the loop, call push_back:
std::vector<float> decimated_array(int M,std::vector<float> arr)
{
std::vector<float> decimated;
for (size_t i = 0; i < arr.size(); i = i + M) {
decimated.push_back(arr[i]);
}
return decimated;
}
The decimated vector starts out empty, but a new item is added with the push_back call.
Also, you should pass the arr vector by const reference, not by value.
std::vector<float> decimated_array(int M, const std::vector<float>& arr);
Passing by (const) reference does not invoke a copy.
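Combining both suggestions, a minimal sketch might look like the following. It is written as a free function named decimate (a hypothetical name, not the FIR_LPF member from the question), and it additionally reserves the output capacity up front, which is an optional refinement rather than part of the fix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Keep every `factor`-th sample of `input`, starting with index 0.
std::vector<float> decimate(const std::vector<float>& input,
                            std::size_t factor)
{
    assert(factor > 0); // a factor of 0 would never advance the loop

    std::vector<float> output;
    // ceil(input.size() / factor) elements will be produced
    output.reserve((input.size() + factor - 1) / factor);

    for (std::size_t i = 0; i < input.size(); i += factor) {
        output.push_back(input[i]); // grows the vector, never out of bounds
    }
    return output;
}
```

The call site from the question would then read something like output = decimate(fct_array, M); with M converted to an unsigned type.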
Edit: Changed loop counter to correct type, thus not needing the cast.
I recently finished writing what I consider my "main.cpp" code in a Win32 Console project. It builds the solution perfectly and the external release version runs and completes within like 30 seconds, which is fast for the number of calculations it does.
When I use my MFC built UI made with just 1 standard dialog box for some simple float inputs, the program that ran fine by itself gets hung up when it has to create and calculate some 2D-vectors.
std::mt19937 generator3(time(0));
static uniform_01<std::mt19937> dist3(generator3);
std::vector<int> e_scatter;
for (int i = 0; i <= n; i++)
{
if (dist3() >= perc_e)
{
e_scatter.push_back(1);
// std::cout << e_scatter[i] << '\n';
// system("pause");
}
else
{
e_scatter.push_back(0);
// std::cout << e_scatter[i] << '\n';
// system("pause");
}
}
string fileName_escatter = "escatter.dat";
FILE* dout4 = fopen(fileName_escatter.c_str(), "w");
for (int i = 0; i <= n; i++)
{
fprintf(dout4, "%d", e_scatter[i]);
fprintf(dout4, "\n");
// fprintf(dout2, "%f", e_scatter[i]);
// fprintf(dout2, "\n");
};
fclose(dout4);
std::vector<vector<float>> electron;
// std::vector<float> angle;
randutils::mt19937_rng rng2;
std::vector<float> rand_scatter;
for (int i = 0; i <= n; i++)
{
std::vector<float> w;
electron.push_back(w);
rand_scatter.push_back(rng2.uniform(0.0, 1.0));
for (int j = 0; j <= 2000; j++)
{
if (e_scatter[i] == 0)
{
electron[i].push_back(linspace[j] * (cos((rand_scatter[i] * 90) * (PI / 180))));
//electron[i][j] == abs(electron[i][j]);
}
else
{
electron[i].push_back(linspace[j]);
};
};
};
More specifically, it does not get past a specific for loop and I am forced to close it. I've let it run for 20 minutes to see if it was just computing things slower, but still got 0 output from it. I am not that great at debugging code when I have this GUI from MFC, since I don't have the console popping up.
Is there something that I am missing when I try to use MFC for the gui and large 2D vectors?
The first loop calculates and spits out an output file 'escatter.dat' after its finished but the second set of loops never finishes and the memory usage keeps ramping up.
linspace is calculated before all of this code and is just a vector of 2001 numbers that is used to populate the std::vector<vector<float>> electron vector in the nested for loops.
I've included this http://pastebin.com/i8A7t38K link to the MFC part of the code that I was using, to keep this post from getting really long.
Thank you.
I agree that the debugging checks are the major problem.
But if your program is running 30 seconds, n must be big.
You may consider optimizing your code to reduce memory allocations by preallocating memory using vector::reserve:
std::vector<vector<float>> electron;
// std::vector<float> angle;
randutils::mt19937_rng rng2;
std::vector<float> rand_scatter;
electron.reserve(n+1); // worth for big n
rand_scatter.reserve(n+1); // worth for big n
for (int i = 0; i <= n; i++)
{
std::vector<float> w;
electron.push_back(w);
rand_scatter.push_back(rng2.uniform(0.0, 1.0));
electron[i].reserve(2000+1); // really worth for big n
for (int j = 0; j <= 2000; j++)
{
if (e_scatter[i] == 0)
{
electron[i].push_back(linspace[j] * (cos((rand_scatter[i] * 90) * (PI / 180))));
//electron[i][j] == abs(electron[i][j]);
}
else
{
electron[i].push_back(linspace[j]);
};
};
};
or rewrite by not using push_back (since you know all sizes)
std::vector<vector<float>> electron(n+1);
// std::vector<float> angle;
randutils::mt19937_rng rng2;
std::vector<float> rand_scatter(n+1);
for (int i = 0; i <= n; i++)
{
std::vector<float>& w=electron[i];
w.resize(2000+1); // resize, not reserve: w[j] below needs the elements to exist
float r=rng2.uniform(0.0, 1.0);
rand_scatter[i]=r;
for (int j = 0; j <= 2000; j++)
{
float f;
if (e_scatter[i] == 0)
{
f=linspace[j] * (cos((r * 90) * (PI / 180)));
// f=abs(f);
}
else
{
f=linspace[j];
};
w[j]=f;
};
};
After that, the runtime might decrease to a few seconds at most.
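One detail worth spelling out for the version without push_back: reserve only allocates capacity and leaves size() at 0, so assigning through operator[] is only valid after resize (or after constructing the vector with a size). A minimal sketch, using a hypothetical helper makeTable and placeholder values rather than the electron calculation:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Build a rows x cols table and fill it through operator[].
// The stored values are placeholders for the real calculation.
std::vector<std::vector<float>> makeTable(std::size_t rows, std::size_t cols)
{
    // The outer vector is constructed with `rows` empty inner vectors
    std::vector<std::vector<float>> table(rows);

    for (std::size_t i = 0; i < rows; ++i) {
        std::vector<float>& row = table[i]; // reference, no copy
        row.resize(cols);                   // size() == cols, so row[j] is valid
        for (std::size_t j = 0; j < cols; ++j) {
            row[j] = static_cast<float>(i * cols + j);
        }
    }
    return table;
}
```

With reserve in place of resize, the inner loop would write to elements that do not exist yet, which is undefined behavior even though the memory has been allocated.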
Another optimization
string fileName_escatter = "escatter.dat";
FILE* dout4 = fopen(fileName_escatter.c_str(), "w");
for (int i = 0; i <= n; i++)
{
fprintf(dout4, "%d\n", e_scatter[i]); // save one method call
// fprintf(dout2, "%f\n", e_scatter[i]);
};
fclose(dout4);
BTW: ofstream is the standard C++ way of writing files, like
ofstream dout4("escatter.dat", std::ofstream::out);
for (int i = 0; i <= n; i++)
{
dout4 << e_scatter[i] << std::endl;
};
dout4.close();