Efficient algorithm to copy tiles in a big matrix

I have N square matrices, all of the same size MxM, that have to be copied into a big matrix made of NxN such blocks, arranged symmetrically. The upper and lower halves contain transposed versions of the same matrices, as in this scheme (m' is the transpose of m).
N = 4
m1  m2  m3  m4
m2' m1  m2  m3
m3' m2' m1  m2
m4' m3' m2' m1
The algorithm that produces the data initially fills only the top row and the first column, leaving the rest empty.
m1  m2  m3  m4
m2' 0   0   0
m3' 0   0   0
m4' 0   0   0
I would like to find an efficient indexing scheme to fill the whole big matrix, starting from the row and column that have already been filled. Remember that m1...mN are square matrices of size MxM, and that the big matrix is stored in column-major order. The matrix is not very big, so there is no need to worry much about locality and cache effects.
The trivial algorithm is shown below, where X is the big matrix.
int toX = 0, fromX = 0, toY = 0, fromY = 0;
// dim is the tile stride (not declared in this snippet; presumably dim == M)
for (int i = 1; i < N; ++i) {
    for (int j = 1; j < N; ++j) {
        // copy tile (i-1, j-1) into tile (i, j), one element at a time
        for (int ii = 0; ii < M; ++ii) {
            for (int jj = 0; jj < M; ++jj) {
                fromX = (i - 1) * dim + ii;
                fromY = (j - 1) * dim + jj;
                toX = i * dim + ii;
                toY = j * dim + jj;
                X(toX, toY) = X(fromX, fromY);
            }
        }
    }
}
Can you find a better way?

Depending on your application, it might be unnecessary to store all those transposed matrices. If m1 is symmetric, you could even drop the lower half of each m1 block.
In fact, it might even be feasible to leave all those matrices alone and do your matrix operations block-wise (addition and multiplication by a scalar are simple; multiplication by a vector would be a bit more complicated).
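To make the block-wise idea concrete, here is a minimal sketch of a matrix-vector product that never builds the big matrix. It assumes the N tiles are kept in a `std::vector` (`tiles[0]` is m1, `tiles[N-1]` is mN), each stored column-major; the function name `blockMatVec` and this storage layout are illustrative assumptions, not something from the question.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: y = X * v for the block layout in the question, using only
// the N tiles (each MxM, column-major) and never materialising X itself.
std::vector<double> blockMatVec(const std::vector<std::vector<double>>& tiles,
                                std::size_t M,
                                const std::vector<double>& v) {
    const std::size_t N = tiles.size();
    std::vector<double> y(N * M, 0.0);
    for (std::size_t br = 0; br < N; ++br) {        // block row
        for (std::size_t bc = 0; bc < N; ++bc) {    // block column
            // Block (br, bc) is tile |bc - br|, transposed below the diagonal.
            const bool upper = bc >= br;
            const std::vector<double>& t = tiles[upper ? bc - br : br - bc];
            for (std::size_t r = 0; r < M; ++r) {
                double acc = 0.0;
                for (std::size_t c = 0; c < M; ++c) {
                    // Column-major element (r, c) is t[c*M + r]; the transpose swaps r and c.
                    acc += (upper ? t[c * M + r] : t[r * M + c]) * v[bc * M + c];
                }
                y[br * M + r] += acc;
            }
        }
    }
    return y;
}
```

This trades memory (N tiles instead of N*N) for a little index arithmetic per block; whether that is worthwhile depends on how often the product is needed.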
If you really need the whole matrix, you might get a slightly lower operation count by filling the matrix diagonally, i.e. by doing something like this:
int toX = 0, fromX = 0, toY = 0, fromY = 0;
// m1 (note that this part can be sped up further if m1 is symmetric)
for (int ii = 0; ii < M; ii++) {
    for (int jj = 0; jj < M; jj++) {
        fromX = ii;
        fromY = jj;
        toX = fromX;
        toY = fromY;
        // walk down the main block diagonal
        for (int k = 1; k < N; k++) {
            toX += dim;
            toY += dim;
            X(toX, toY) = X(fromX, fromY);
        }
    }
}
// m2 to m(N-1); mN sits in the corners and needs no further copies
for (int i = 2; i < N; i++) {
    for (int ii = 0; ii < M; ii++) {
        for (int jj = 0; jj < M; jj++) {
            fromX = (i - 1) * dim + ii;  // m_i starts at block offset i-1
            fromY = jj;
            toX = fromX;
            toY = fromY;
            // N-i copies along the block diagonal, mirrored across the main diagonal
            for (int k = i; k < N; k++) {
                toX += dim;
                toY += dim;
                X(toX, toY) = X(fromX, fromY);
                X(toY, toX) = X(fromX, fromY);
            }
        }
    }
}
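An alternative that avoids tracking source and destination offsets is to compute, for every block, which tile it holds: block (br, bc) is tile |bc - br|, transposed when it lies below the diagonal. The sketch below assumes the tiles are available separately as column-major vectors and that X is a plain column-major buffer; `fillBlockToeplitz` is a made-up name, and the question's `X(…, …)` accessor is not used.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch: build the (N*M)x(N*M) column-major matrix X directly from
// the tiles m1..mN (tiles[0]..tiles[N-1], each MxM, column-major).
std::vector<double> fillBlockToeplitz(const std::vector<std::vector<double>>& tiles,
                                      std::size_t M) {
    const std::size_t N = tiles.size();
    const std::size_t ld = N * M;                  // leading dimension of X
    std::vector<double> X(ld * ld, 0.0);
    for (std::size_t bc = 0; bc < N; ++bc) {        // block column
        for (std::size_t br = 0; br < N; ++br) {    // block row
            const bool upper = bc >= br;
            const std::vector<double>& t = tiles[upper ? bc - br : br - bc];
            for (std::size_t c = 0; c < M; ++c) {
                for (std::size_t r = 0; r < M; ++r) {
                    // Blocks below the diagonal hold the transposed tile.
                    const double value = upper ? t[c * M + r] : t[r * M + c];
                    X[(bc * M + c) * ld + (br * M + r)] = value;
                }
            }
        }
    }
    return X;
}
```

The inner loops walk X column by column, which matches the column-major storage mentioned in the question.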

Related

Convolution algorithm for image processing

I've come up with this code for applying a 3x3 kernel to my image:
double sum;
for (int i = 1; i < src.rows - 1; i++) {
    for (int j = 1; j < src.cols - 1; j++)
        for (int k = 0; k < 3; k++) {
            sum = 0.0;
            dst.at<cv::Vec3b>(i, j)[k] = 0.0;
            for (int x = -1; x <= 1; x++) {
                for (int y = -1; y <= 1; y++) {
                    sum += (Kernel_Matrix[y+1][x+1] * src.at<cv::Vec3b>(i - x, j - y)[k]);
                }
            }
            dst.at<cv::Vec3b>(i, j)[k] = cv::saturate_cast<uchar>(sum);
        }
}
Now I have two questions:
1. According to https://en.wikipedia.org/wiki/Kernel_(image_processing), there are various matrices for various filters. Say I want to control the intensity of my blur filter with a GUI slider that gives a value from x to whatever: what kind of operation should I apply to my blur matrix (a sum, a multiplication...)? (I would like to do the same with sharpness.)
2. Is there a specific matrix for noise reduction?
If you also have any changes to suggest for my algorithm, please let me know!
Thanks!
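One possible way to tie a slider to the blur strength (a suggestion, not something from the original thread) is to interpolate between the identity kernel and the blur kernel. With t = 0 the image is unchanged, with t = 1 you get the full 3x3 box blur, and the weights always sum to 1 so overall brightness is preserved. The name `blendedBlurKernel` and the choice of a box blur are assumptions made for the sketch.

```cpp
#include <array>

using Kernel3 = std::array<std::array<double, 3>, 3>;

// Hypothetical helper: interpolate between the identity kernel (t = 0) and a
// 3x3 box blur (t = 1). The weights sum to 1 for every t.
Kernel3 blendedBlurKernel(double t) {
    Kernel3 k{};
    const double box = 1.0 / 9.0;
    for (int y = 0; y < 3; ++y) {
        for (int x = 0; x < 3; ++x) {
            const double identity = (x == 1 && y == 1) ? 1.0 : 0.0;
            k[y][x] = (1.0 - t) * identity + t * box;
        }
    }
    return k;
}
```

Negative values of t push the kernel away from the blur, which gives a sharpening kernel (the idea behind unsharp masking), so the same slider could cover both effects.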

Neural Network not learning. Stuck at 50%

I'm a beginner with NNs. I'm trying to create a NN for the XOR function, but it's not learning; it's stuck at 50%.
Can anyone give me some advice? Thanks.
Here's the code:
/// Matrix.cpp
#include "pch.h"
#include "Matrix.h"
....
Matrix Matrix::sigmoidDerivate(const Matrix &m) {
    assert(m.rows >= 1 && m.cols >= 1);
    Matrix tmp(m.rows, m.cols);
    for (ushort i = 0; i < tmp.rows; i++) {
        for (ushort j = 0; j < tmp.cols; j++) {
            tmp.mat[i][j] = m.mat[i][j]*(1-m.mat[i][j]);
        }
    }
    return tmp;
}
Matrix Matrix::sigmoid(const Matrix &m) {
    assert(m.rows >= 1 && m.cols >= 1);
    Matrix tmp(m.rows, m.cols);
    for (ushort i = 0; i < tmp.rows; i++) {
        for (ushort j = 0; j < tmp.cols; j++) {
            tmp.mat[i][j]= 1 / (1 + exp(-m.mat[i][j]));
        }
    }
    return tmp;
}
Matrix Matrix::randomMatrix(ushort rows, ushort cols) {
    assert(rows>=1 && cols>=1);
    Matrix tmp(rows,cols);
    const int range_from = -3;
    const int range_to = 3;
    std::random_device rand_dev;
    std::mt19937 generator(rand_dev());
    std::uniform_real_distribution<double> distr(range_from, range_to);
    for (ushort i = 0; i < rows; i++) {
        for (ushort j = 0; j < cols; j++) {
            tmp.mat[i][j] = distr(generator);
        }
    }
    return tmp;
}
And this is main():
vector<vector<double>> in = {
    {0,0},
    {1,0},
    {0,1},
    {1,1}
};
vector<double> out = { 0,1,1,0 };
const ushort inputNeurons = 2;
const ushort hiddenNeurons = 3;
const ushort outputNeurons = 1;
const double learningRate = 0.03;
Matrix w_0_1 = Matrix::randomMatrix(inputNeurons, hiddenNeurons);
Matrix w_1_2 = Matrix::randomMatrix(hiddenNeurons, outputNeurons);
unsigned int epochs = 100000;
for (int i = 0; i < epochs; i++) {
    for (int j = 0; j < in.size(); j++) {
        Matrix Layer_0 = Matrix::createRowMatrix(in[j]);
        Matrix desired_output = Matrix::createRowMatrix({ out[j] });
        Matrix Layer_1 = Matrix::sigmoid(Matrix::multiply(Layer_0, w_0_1));
        Matrix Layer_2 = Matrix::sigmoid(Matrix::multiply(Layer_1, w_1_2));
        Matrix error = Matrix::POW2(Matrix::substract(Layer_2, desired_output));
        //backprop
        Matrix Layer_2_delta = Matrix::elementWiseMultiply(
            Matrix::substract(Layer_2, desired_output),
            Matrix::sigmoidDerivate(Layer_2)
        );
        Matrix Layer_1_delta = Matrix::elementWiseMultiply(
            Matrix::multiply(Layer_2_delta, Matrix::transpose(w_1_2)),
            Matrix::sigmoidDerivate(Layer_1)
        );
        Matrix w_1_2_delta = Matrix::multiply(Matrix::transpose(Layer_1), Layer_2_delta);
        Matrix w_0_1_delta = Matrix::multiply(Matrix::transpose(Layer_0), Layer_1_delta);
        //updating weights
        w_0_1 = Matrix::multiply(w_0_1_delta, learningRate);
        w_1_2 = Matrix::multiply(w_1_2_delta, learningRate);
    }
}
The NN architecture is: 2 -> 3 -> 1
If the number of neurons in the hidden layer is small, like 2-4, the output is 50%; with 8 neurons in the hidden layer the output becomes around 49%.
Some help please.

I'm not that into C++, so I'm not sure, but in the line:
Matrix::substract(Layer_2, desired_output),
you are subtracting the desired ("good") output from the layer's actual output. In my opinion that should be the other way round, so you have to multiply it by -1.
It works like that for me. If you like, I can send you my source code (it's in Java).
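To illustrate the sign convention, here is a minimal single-neuron sketch (hypothetical names, plain doubles instead of the question's `Matrix` class). With delta computed as (output - desired) the weight has to be decreased by learningRate * gradient; with (desired - output) it has to be increased. Either works as long as the update matches. Note also that the loop in the question assigns the scaled delta to `w_0_1` and `w_1_2` instead of adding it to or subtracting it from the existing weights.

```cpp
#include <cmath>

double sigmoid(double z) { return 1.0 / (1.0 + std::exp(-z)); }

// Squared-error loss of a single sigmoid neuron y = sigmoid(w * x).
double loss(double w, double x, double target) {
    const double y = sigmoid(w * x);
    return 0.5 * (y - target) * (y - target);
}

// One gradient-descent step. delta uses (output - target), so the update subtracts.
double trainStep(double w, double x, double target, double learningRate) {
    const double y = sigmoid(w * x);
    const double delta = (y - target) * y * (1.0 - y);  // dLoss/dz
    return w - learningRate * delta * x;                // accumulate, do not overwrite
}
```

Starting from w = 0.5 with x = 1 and target = 1, one step moves w upward and the loss goes down, which is a quick sanity check for the sign.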

From mathematic function to c++ code

I am trying to implement this F(S) function:
Below is my code, but it is not working:
double EnergyFunction::evaluate(vector<short> field) {
    double e = 0.0;
    for (int k = 1; k < field.size() - 1; k++){
        double c = 0.0;
        for (int i = 1; i < field.size() - k; i++) {
            c += field[i] * field[i + k];
        }
        e += pow(c, 2);
    }
    double f = pow(field.size(), 2) / ( 2 * e );
    return f;
}
For example, the F(S) function should return the value 8644 for this vector:
1,1,1,-1,-1,-1,1,-1,1,1,-1,1,-1,1,-1,1,-1,-1,1,1,1,1,-1,-1,-1,1,1,1,1,-1,1,-1,1,1,-1,-1,1,1,1,1,-1,-1,-1,1,-1,-1,1,-1,-1,1,1,-1,1,-1,-1,1,1,-1,1,-1,1,-1,1,-1,1,-1,1,1,-1,-1,-1,-1,-1,-1,1,-1,1,1,1,-1,1,1,-1,1,1,-1,1,-1,1,1,1,-1,-1,1,1,-1,-1,1,1,1,1,1,1,1,1,-1,1,-1,1,-1,1,-1,-1,1,-1,-1,1,-1,-1,1,-1,-1,-1,-1,-1,1,1,1,1,1,-1,-1,-1,1,-1,-1,1,-1,-1,1,-1,-1,1,-1,1,-1,-1,1,1,1,1,1,1,-1,1,-1,1,-1,1,1,1,1,1,1,-1,1,-1,-1,-1,1,-1,1,1,-1,-1,-1,-1,1,-1,-1,-1,1,1,-1,-1,1,1,1,-1,-1,1,1,1,1,-1,1,1,-1,1,-1,-1,1,-1,-1,-1,-1,1,-1,-1,-1,1,-1,-1,1,1,-1,-1,-1,-1,-1,1,-1,-1,-1,1,1,-1,1,1,-1,-1,-1,1,-1,-1,1,-1,-1,-1,1,1,1,-1,-1,-1,-1,1,1,1,-1,1,-1,-1,1,-1,1,1,-1,-1,-1,-1,1,-1,1,1,1,1,1,1,-1,1,1,1,-1,-1,-1,-1,1,-1,1,1,1,1,-1,1,1,1,1,1,-1,-1,-1,1,-1,-1,1,1,1,-1,1,1,1,-1,1,1
I need another pair of eyes to look at my code because I am a bit lost here. :)

After refactoring:
double EnergyFunction::evaluate(vector<short> field) {
    double e = 0.0;
    int l = field.size();
    for (int k = 1; k < l; k++){
        double c = 0.0;
        for (int i = 0, j = k; j < l; i++, j++) {
            c += field[i] * field[j];
        }
        e += c*c;
    }
    return l*l / ( e+e );
}
Explanation:
1. we need to iterate (L-1) times
2. we need to shift the base and offset indexes until we reach the last one
3. c*c and e+e are quicker and easier to read

You are mapping variables into different ranges using the same names, which is always going to be confusing. It is better to keep the ranges and names the same as in the math, and only subtract one for 0-based indexing at the point where you index. You might as well use L explicitly, too:
int L = field.size();
for (int k = 1; k <= L-1; k++){
    ...
    for (int i = 1; i <= L-k; i++) {
        c += field[i -1] * field[i+k -1];
        ...
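Filled in, that approach could look like the sketch below. It is written as a free function with the made-up name `evaluateF`, and it assumes the formula is F(S) = L^2 / (2 E(S)) with E(S) the sum over k = 1..L-1 of C_k^2 and C_k the sum over i = 1..L-k of s_i * s_(i+k), which is what the loops in both answers imply (the formula itself is not reproduced in the question text).

```cpp
#include <vector>

// Sketch under the assumption stated above; loop variables keep the 1-based
// ranges from the math and are shifted by one only when indexing the vector.
double evaluateF(const std::vector<short>& field) {
    const int L = static_cast<int>(field.size());
    double e = 0.0;
    for (int k = 1; k <= L - 1; k++) {
        double c = 0.0;
        for (int i = 1; i <= L - k; i++) {
            c += field[i - 1] * field[i + k - 1];
        }
        e += c * c;
    }
    // For sequences shorter than 2 elements e stays 0 and the division is not meaningful.
    return (L * L) / (2.0 * e);
}
```

For the short sequence {1, 1, -1} this gives C_1 = 0, C_2 = -1, so E = 1 and F = 9 / 2 = 4.5, which is easy to verify by hand. Taking the vector by const reference also avoids copying it on every call.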

C++ Pattern Matching with FFT cross-correlation (Images)

Hi everyone, I am trying to implement pattern matching with FFT, but I am not sure what the result should be. (I think I am missing something, even though I have read a lot about the problem and tried a lot of different implementations; this one is the best so far.) Here is my FFT correlation function.
void fft2d(fftw_complex**& a, int rows, int cols, bool forward = true)
{
    fftw_plan p;
    for (int i = 0; i < rows; ++i)
    {
        p = fftw_plan_dft_1d(cols, a[i], a[i], forward ? FFTW_FORWARD : FFTW_BACKWARD, FFTW_ESTIMATE);
        fftw_execute(p);
    }
    fftw_complex* t = (fftw_complex*)fftw_malloc(rows * sizeof(fftw_complex));
    for (int j = 0; j < cols; ++j)
    {
        for (int i = 0; i < rows; ++i)
        {
            t[i][0] = a[i][j][0];
            t[i][1] = a[i][j][1];
        }
        p = fftw_plan_dft_1d(rows, t, t, forward ? FFTW_FORWARD : FFTW_BACKWARD, FFTW_ESTIMATE);
        fftw_execute(p);
        for (int i = 0; i < rows; ++i)
        {
            a[i][j][0] = t[i][0];
            a[i][j][1] = t[i][1];
        }
    }
    fftw_free(t);
}
int findCorrelation(int argc, char* argv[])
{
    BMP bigImage;
    BMP keyImage;
    BMP result;
    RGBApixel blackPixel = { 0, 0, 0, 1 };
    const bool swapQuadrants = (argc == 4);
    if (argc < 3 || argc > 4) {
        cout << "correlation img1.bmp img2.bmp" << endl;
        return 1;
    }
    if (!keyImage.ReadFromFile(argv[1])) {
        return 1;
    }
    if (!bigImage.ReadFromFile(argv[2])) {
        return 1;
    }
    //Preparations
    const int maxWidth = std::max(bigImage.TellWidth(), keyImage.TellWidth());
    const int maxHeight = std::max(bigImage.TellHeight(), keyImage.TellHeight());
    const int rowsCount = maxHeight;
    const int colsCount = maxWidth;
    BMP bigTemp = bigImage;
    BMP keyTemp = keyImage;
    keyImage.SetSize(maxWidth, maxHeight);
    bigImage.SetSize(maxWidth, maxHeight);
    for (int i = 0; i < rowsCount; ++i)
        for (int j = 0; j < colsCount; ++j) {
            RGBApixel p1;
            if (i < bigTemp.TellHeight() && j < bigTemp.TellWidth()) {
                p1 = bigTemp.GetPixel(j, i);
            } else {
                p1 = blackPixel;
            }
            bigImage.SetPixel(j, i, p1);
            RGBApixel p2;
            if (i < keyTemp.TellHeight() && j < keyTemp.TellWidth()) {
                p2 = keyTemp.GetPixel(j, i);
            } else {
                p2 = blackPixel;
            }
            keyImage.SetPixel(j, i, p2);
        }
    //Here is where the transforms begin
    fftw_complex **a = (fftw_complex**)fftw_malloc(rowsCount * sizeof(fftw_complex*));
    fftw_complex **b = (fftw_complex**)fftw_malloc(rowsCount * sizeof(fftw_complex*));
    fftw_complex **c = (fftw_complex**)fftw_malloc(rowsCount * sizeof(fftw_complex*));
    for (int i = 0; i < rowsCount; ++i) {
        a[i] = (fftw_complex*)fftw_malloc(colsCount * sizeof(fftw_complex));
        b[i] = (fftw_complex*)fftw_malloc(colsCount * sizeof(fftw_complex));
        c[i] = (fftw_complex*)fftw_malloc(colsCount * sizeof(fftw_complex));
        for (int j = 0; j < colsCount; ++j) {
            RGBApixel p1;
            p1 = bigImage.GetPixel(j, i);
            a[i][j][0] = (0.299*p1.Red + 0.587*p1.Green + 0.114*p1.Blue);
            a[i][j][1] = 0.0;
            RGBApixel p2;
            p2 = keyImage.GetPixel(j, i);
            b[i][j][0] = (0.299*p2.Red + 0.587*p2.Green + 0.114*p2.Blue);
            b[i][j][1] = 0.0;
        }
    }
    fft2d(a, rowsCount, colsCount);
    fft2d(b, rowsCount, colsCount);
    result.SetSize(maxWidth, maxHeight);
    for (int i = 0; i < rowsCount; ++i)
        for (int j = 0; j < colsCount; ++j) {
            fftw_complex& y = a[i][j];
            fftw_complex& x = b[i][j];
            double u = x[0], v = x[1];
            double m = y[0], n = y[1];
            c[i][j][0] = u*m + n*v;
            c[i][j][1] = v*m - u*n;
            int fx = j;
            if (fx>(colsCount / 2)) fx -= colsCount;
            int fy = i;
            if (fy>(rowsCount / 2)) fy -= rowsCount;
            float r2 = (fx*fx + fy*fy);
            const double cuttoffCoef = (maxWidth * maxHeight) / 37992.;
            if (r2<128 * 128 * cuttoffCoef)
                c[i][j][0] = c[i][j][1] = 0;
        }
    fft2d(c, rowsCount, colsCount, false);
    const int halfCols = colsCount / 2;
    const int halfRows = rowsCount / 2;
    if (swapQuadrants) {
        for (int i = 0; i < halfRows; ++i)
            for (int j = 0; j < halfCols; ++j) {
                std::swap(c[i][j][0], c[i + halfRows][j + halfCols][0]);
                std::swap(c[i][j][1], c[i + halfRows][j + halfCols][1]);
            }
        for (int i = halfRows; i < rowsCount; ++i)
            for (int j = 0; j < halfCols; ++j) {
                std::swap(c[i][j][0], c[i - halfRows][j + halfCols][0]);
                std::swap(c[i][j][1], c[i - halfRows][j + halfCols][1]);
            }
    }
    for (int i = 0; i < rowsCount; ++i)
        for (int j = 0; j < colsCount; ++j) {
            const double& g = c[i][j][0];
            RGBApixel pixel;
            pixel.Alpha = 0;
            int gInt = 255 - static_cast<int>(std::floor(g + 0.5));
            pixel.Red = gInt;
            pixel.Green = gInt;
            pixel.Blue = gInt;
            result.SetPixel(j, i, pixel);
        }
    BMP res;
    res.SetSize(maxWidth, maxHeight);
    result.WriteToFile("result.bmp");
    return 0;
}
Sample output

This question would probably be more appropriately posted on another site like Cross Validated (metaoptimize.com used to be a good one too, but it appears to be gone).
That said:
There are two similar operations you can perform with the FFT: convolution and correlation. Convolution is used for determining how two signals interact with each other, whereas correlation can be used to express how similar two signals are to each other. Make sure you're doing the right operation, as both are commonly implemented through a DFT.
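The difference shows up in a single line of the frequency-domain code: correlation multiplies one spectrum by the complex conjugate of the other, whereas convolution multiplies the two spectra directly. The 1D sketch below uses a naive O(n^2) DFT instead of FFTW purely to stay self-contained; `dft` and `crossCorrelate` are illustrative names, and both inputs are assumed to have the same length.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cd = std::complex<double>;

// Naive O(n^2) DFT, standing in for a real FFT library.
std::vector<cd> dft(const std::vector<cd>& in, bool forward) {
    const std::size_t n = in.size();
    const double pi = std::acos(-1.0);
    const double sign = forward ? -1.0 : 1.0;
    std::vector<cd> out(n);
    for (std::size_t f = 0; f < n; ++f) {
        cd acc(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            const double angle =
                sign * 2.0 * pi * static_cast<double>(f * t) / static_cast<double>(n);
            acc += in[t] * std::polar(1.0, angle);
        }
        out[f] = forward ? acc : acc / static_cast<double>(n);  // scale the inverse
    }
    return out;
}

// Circular cross-correlation: corr[k] = sum over t of a[t] * b[(t + k) mod n].
// In the frequency domain this is conj(A) * B; convolution would be A * B.
std::vector<double> crossCorrelate(const std::vector<double>& a,
                                   const std::vector<double>& b) {
    std::vector<cd> A(a.begin(), a.end()), B(b.begin(), b.end());
    A = dft(A, true);
    B = dft(B, true);
    std::vector<cd> C(A.size());
    for (std::size_t f = 0; f < A.size(); ++f) C[f] = std::conj(A[f]) * B[f];
    C = dft(C, false);
    std::vector<double> corr(C.size());
    for (std::size_t k = 0; k < C.size(); ++k) corr[k] = C[k].real();
    return corr;
}
```

If b is a copy of a shifted circularly by s samples, the largest value of the result sits at index s, which is the 1D analogue of the peak you would look for in the image case.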
For this type of application of DFTs you usually wouldn't extract any useful information from the Fourier spectrum itself, unless you were looking for frequencies common to both data sources (e.g., if you were comparing two bridges to see if their supports are spaced similarly).
Your 3rd image looks a lot like the power domain; normally I see the correlation output entirely grey except where overlap occurred. Your code definitely appears to be computing the inverse DFT, so unless I'm missing something the only other explanation I've come up with for the fuzzy look could be some of the "fudge factor" code in there like:
if (r2<128 * 128 * cuttoffCoef)
    c[i][j][0] = c[i][j][1] = 0;
As for what you should expect: wherever there are common elements between the two images you'll see a peak. The larger the peak, the more similar the two images are near that region.
Some comments and/or recommended changes:
1) Convolution & correlation are not scale invariant operations. In other words, the size of your pattern image can make a significant difference in your output.
2) Normalize your images before correlation.
When you get the image data ready for the forward DFT pass:
a[i][j][0] = (0.299*p1.Red + 0.587*p1.Green + 0.114*p1.Blue);
a[i][j][1] = 0.0;
/* ... */
How you grayscale the image is your business (though I would've picked something like sqrt( r*r + b*b + g*g )). However, I don't see you doing anything to normalize the image.
The word "normalize" can take on a few different meanings in this context. Two common types:
- normalize the range of values to between 0.0 and 1.0
- normalize the "whiteness" of the images
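As a rough sketch of those two kinds of normalization, applied to a flat buffer of grayscale values: the first function rescales the range to [0, 1]; the second is one possible reading of normalizing the "whiteness", namely subtracting the mean and dividing by the standard deviation so that overall brightness and contrast differences do not dominate the correlation. That interpretation and both function names are assumptions.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Rescale linearly so the smallest value becomes 0.0 and the largest 1.0.
std::vector<double> normalizeRange(std::vector<double> px) {
    if (px.empty()) return px;
    const double lo = *std::min_element(px.begin(), px.end());
    const double hi = *std::max_element(px.begin(), px.end());
    const double span = hi - lo;
    for (double& p : px) p = span > 0.0 ? (p - lo) / span : 0.0;
    return px;
}

// Zero mean and unit standard deviation (assumed meaning of "whiteness").
std::vector<double> normalizeMeanStd(std::vector<double> px) {
    if (px.empty()) return px;
    double mean = 0.0;
    for (double p : px) mean += p;
    mean /= static_cast<double>(px.size());
    double var = 0.0;
    for (double p : px) var += (p - mean) * (p - mean);
    const double sd = std::sqrt(var / static_cast<double>(px.size()));
    for (double& p : px) p = sd > 0.0 ? (p - mean) / sd : 0.0;
    return px;
}
```

Either function would be applied to both images before filling the real parts of the FFT input arrays.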
3) Run your pattern image through an edge enhancement filter. I've personally made use of Canny, Sobel, and I think I messed with a few others. As I recall, Canny was "quick'n dirty" and Sobel was more expensive, but I got comparable results when it came time to do correlation. See chapter 24 of the "dsp guide" book that's freely available online. The whole book is worth your time, but if you're short on time then at a minimum chapter 24 will help a lot.
4) Rescale the output image to [0, 255]; if you want to implement thresholds, do it after this step because the thresholding step is lossy.
My memory on this one is hazy, but as I recall (edited for clarity):
- You can scale the final image pixels (before rescaling) to [-1.0, 1.0] by dividing the entire power spectrum by its largest value.
- The largest power spectrum value is, conveniently enough, the center-most value in the power spectrum (corresponding to the lowest frequency).
- If you divide it off the power spectrum, you'll end up doing twice the work; since FFTs are linear, you can delay the division until after the inverse DFT pass, when you're rescaling the pixels to [0, 255].
If after rescaling most of your values end up so black you can't see them, you can use a solution to the ODE y' = y(1 - y) (one example is the sigmoid f(x) = 1 / (1 + exp(-c*x)), for some scaling factor c that gives better gradations). This has more to do with improving your ability to interpret the results visually than with anything you might use to find peaks programmatically.
Edit: I said [0, 255] above. I suggest you rescale to [128, 255], or to some other range whose lower bound is gray rather than black.
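A small sketch of that last step, assuming the correlation values have already been scaled to roughly [-1.0, 1.0]: each value goes through the sigmoid with steepness c, and the result is stretched linearly onto the gray range. The function name `toGray`, the parameter names and the default bounds are illustrative only.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Map values to gray levels in [lo, hi]: sigmoid first (steepness c is a tuning
// knob), then a linear stretch so the smallest value lands on lo and the largest on hi.
std::vector<unsigned char> toGray(const std::vector<double>& v, double c,
                                  int lo = 128, int hi = 255) {
    std::vector<unsigned char> out(v.size());
    if (v.empty()) return out;
    std::vector<double> s(v.size());
    for (std::size_t i = 0; i < v.size(); ++i) {
        s[i] = 1.0 / (1.0 + std::exp(-c * v[i]));
    }
    const double mn = *std::min_element(s.begin(), s.end());
    const double mx = *std::max_element(s.begin(), s.end());
    const double span = mx - mn;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const double t = span > 0.0 ? (s[i] - mn) / span : 0.0;  // 0..1
        out[i] = static_cast<unsigned char>(std::lround(lo + t * (hi - lo)));
    }
    return out;
}
```

Because the sigmoid is monotonic, the ordering of the values (and therefore the location of the peaks) is unchanged; only the visual gradation differs. Any thresholding would come after this step.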

calculating euclidean distance for Luv values between 2 pixels

In OpenCV (C++) I'm trying to figure out how to calculate the Euclidean distance between the point (i, j) and all points within a 3x3 kernel around it. This is to create a contrast map of saliency from the Luv color space. I've also tried the norm function, to no avail. I'm very confused as to how to solve this problem and would appreciate some feedback.
Mat tmp1 = MeanShift_Luv.clone();
int big_theta = 3; // kernel size / neighborhood to perform convolution on
Mat gradient_1 = Mat::zeros(tmp1.rows, tmp1.cols, CV_64FC3);
for (int i = 0; i < tmp1.rows; i++){
    for (int j = 0; j < tmp1.cols; j++){
        double dist = 0;
        for (int m = -big_theta / 2; m < big_theta / 2; m++){
            for(int n = -big_theta /2; n < big_theta / 2; n++){
                if (m == 0 || n == 0) continue;
                if (i + m < 0 || i + m >= tmp1.rows) continue;
                if (j + n < 0 || j + n >= tmp1.cols) continue;
                /* unsure what to do at this part
                Point a(i,j);
                Point b(i+m, j+n);
                */
            }
        }
        gradient_1.at<Vec3d>(i,j) = dist;
    }
}
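A minimal sketch of the missing part, written against plain arrays rather than `cv::Mat` so that it stands on its own: the distance is taken between the colour values of the two pixels, not between their coordinates, and the distances to all in-bounds neighbours are summed. Two details differ from the loops above, as far as I can tell: with `m < big_theta / 2` the +1 offset is never visited, and `m == 0 || n == 0` skips the whole central row and column rather than only the centre, so the sketch uses `<=` and `&&`. The names `Luv`, `luvDistance` and `contrastAt` are made up. In OpenCV itself the per-pixel distance could presumably be obtained with `cv::norm` applied to the difference of two `Vec3d` values.

```cpp
#include <array>
#include <cmath>
#include <vector>

using Luv = std::array<double, 3>;

// Euclidean distance between two 3-channel pixel values.
double luvDistance(const Luv& a, const Luv& b) {
    double sum = 0.0;
    for (int c = 0; c < 3; ++c) sum += (a[c] - b[c]) * (a[c] - b[c]);
    return std::sqrt(sum);
}

// Sum of distances from pixel (i, j) to its in-bounds neighbours in a
// kernel x kernel window; img is a rows x cols image stored row by row.
double contrastAt(const std::vector<Luv>& img, int rows, int cols,
                  int i, int j, int kernel = 3) {
    const int half = kernel / 2;
    double dist = 0.0;
    for (int m = -half; m <= half; ++m) {
        for (int n = -half; n <= half; ++n) {
            if (m == 0 && n == 0) continue;            // skip the centre only
            if (i + m < 0 || i + m >= rows) continue;  // stay inside the image
            if (j + n < 0 || j + n >= cols) continue;
            dist += luvDistance(img[i * cols + j], img[(i + m) * cols + (j + n)]);
        }
    }
    return dist;
}
```

The result is a single number per pixel, so a one-channel output map would be a more natural destination than a three-channel one.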
