FFT scale power spectrum - C++

I have a problem scaling the power spectrum of an image computed with the FFT. The code is below:
void spectrumFFT(Complex<double> *f, Complex<double> *output, int width, int height){
    Complex<double> *temp = new Complex<double>[width * height];
    Complex<double> *singleValue = new Complex<double>();
    for(int j = 0; j < height; j++){
        for(int i = 0; i < width; i++){
            singleValue = &f[i + j * width];
            // tempSwap is assigned the magnitude value from singleValue
            Complex<double> tempSwap = singleValue->Mag();
            temp[i + j * width] = tempSwap;
        }
    }
Let's say the temp 1-D array is now filled with the magnitude values. My problem is how to scale those magnitudes so that their min and max map into the range [0, 255).
Note: the input *f already holds the computed 2D FFT values, and *output is to be filled with the magnitudes scaled between that min and max.
Any ideas on how to program this?
Thank you,
Regards,
Ichiro

Your question isn't 100% clear, so I might be off and this might not be what you're looking for - I'll do it in general, ignoring the value range you might actually get or use.
Assuming you've got the absolute minimum and maximum values, vmin and vmax, and you'd like to scale the whole range to [0; 255], you can do it this way:
// move the lower end to 0
double mod_add = -vmin;
double mod_mul = 255 / (vmax + mod_add);
Now, to rearrange one value to the range we calculated:
double scaled = (value + mod_add) * mod_mul;
mod_add will move negative values into the positive range (the absolute minimum becomes 0), and mod_mul will scale the whole range (from absolute minimum to absolute maximum) to fit into [0; 255]. Without negative values you can obviously skip mod_add. If you'd like to keep 0 in the center (i.e. at 127), skip mod_add and instead take the larger of |vmin| and |vmax|, scaling that to 127 instead of 255.
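For illustration, here is a minimal sketch of that scaling over a plain array of magnitudes (the array name mags, its length of width * height, and the in-place overwrite are assumptions, not taken from your code):
// Scale an array of magnitudes to [0, 255].
double vmin = mags[0], vmax = mags[0];
for (int k = 1; k < width * height; k++) {
    if (mags[k] < vmin) vmin = mags[k];
    if (mags[k] > vmax) vmax = mags[k];
}
double mod_add = -vmin;                     // move the lower end to 0
double mod_mul = 255.0 / (vmax + mod_add);  // stretch to [0, 255]; guard against vmax == vmin in real code
for (int k = 0; k < width * height; k++)
    mags[k] = (mags[k] + mod_add) * mod_mul;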
On a side note, I think you could simplify your loop a lot, possibly saving some processing time (might not be possible depending on other code being there):
const unsigned int num = width * height;
for (unsigned int i = 0; i < num; i++)
    temp[i] = f[i].Mag();
Also, as mentioned by Oli in the comments, you shouldn't assign any value to singleValue in the beginning, as it's overwritten later on anyway.

Related

Cache misses from random access array

I have an array of values corresponding to an integer indexed axis. I need to linearly interpolate from these values at specific double precision indices.
double indices[20];
double results[20];
double values[1000];
// ...
for (int i = 0; i < 20; i++)
{
    double index = indices[i];
    int indexInt = (int)index;
    double frac = index - indexInt;
    // Linear interpolation
    results[i] = values[indexInt] * (1.0 - frac) + values[indexInt + 1] * frac;
}
Profiling shows that the linear interpolation line is taking more of the program's run time than expected, and my suspicion is that this is due to cache misses. The indices are sorted but not guaranteed to be close to each other, and do not have a constant stride. Is there a way to mitigate this?
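One common mitigation for this access pattern (just a sketch of the general idea, not something from this post; __builtin_prefetch is a GCC/Clang builtin, and the look-ahead distance is a tuning parameter) is to issue a software prefetch for the value a few iterations ahead, since the indices are known in advance:
const int PREFETCH_AHEAD = 4; // tuning parameter
for (int i = 0; i < 20; i++)
{
    // Hint the cache about the value we will need a few iterations from now.
    if (i + PREFETCH_AHEAD < 20)
        __builtin_prefetch(&values[(int)indices[i + PREFETCH_AHEAD]]);

    double index = indices[i];
    int indexInt = (int)index;
    double frac = index - indexInt;
    // Linear interpolation
    results[i] = values[indexInt] * (1.0 - frac) + values[indexInt + 1] * frac;
}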

Trying to compute my own Histogram without opencv calcHist()

What I'm trying to do is write a function that calculates a histogram of a greyscale image with a given number of bins (anzBin) into which the histogram's range is divided. I then run through the image pixels, comparing their value against the different bins, and whenever a value fits a bin, I increase that bin's value by 1:
vector<int> calcuHisto(const IplImage *src_pic, int anzBin)
{
    CvSize size = cvGetSize(src_pic);
    int binSize = (size.width / 256)*anzBin;
    vector<int> histogram(anzBin,0);

    for (int y = 0; y<size.height; y++)
    {
        const uchar *src_pic_point =
            (uchar *)(src_pic->imageData + y*src_pic->widthStep);

        for (int x = 0; x<size.width; x++)
        {
            for (int z = 0; z < anzBin; z++)
            {
                if (src_pic_point[x] <= z*binSize)
                {
                    histogram[src_pic_point[x]]++;
                }
            }
        }
    }
    return histogram;
}
But unfortunately it's not working...
What is wrong here?
Please help
There are a few issues I can see:
Your binSize calculation is wrong
Your binning algorithm is one-sided, and should be two-sided
You aren't incrementing the proper bin when you find a match
1. binSize calculation
binSize = your range / number of bins
2. two sided binning
if (src_pic_point[x] <= z*binSize)
You need a two-sided range of values, not a one-sided inequality. Imagine you have 4 bins and values from 0 to 255. Your bins should have the following ranges:
bin low high
0 0 63.75
1 63.75 127.5
2 127.5 191.25
3 191.25 255
For example, a value of 57 should go in bin 0. Your code, however, puts it into every bin whose edge z*binSize is at least 57, because the only test is <= z*binSize. You need a check with both a lower and an upper bound.
3. Incrementing the appropriate bin
You are using z to loop over each bin, so when you find a match you should increment bin z; the actual pixel value is only used to determine which bin it belongs to.
As written, this is likely a buffer overrun: imagine again that you have 4 bins and the current pixel has a value of 57. This code says increment bin 57, but you only have 4 bins (0-3):
histogram[src_pic_point[x]]++;
You want to increment only the bin the pixel value falls into:
histogram[z]++;
CODE
With that in mind, here is revised code (it is untested, but should work):
vector<int> calcuHisto(const IplImage *src_pic, int anzBin)
{
    CvSize size = cvGetSize(src_pic);
    double binSize = 256.0 / anzBin; // new definition
    vector<int> histogram(anzBin, 0); // I don't know if this works, so I will leave it

    //goes through all rows
    for (int y = 0; y < size.height; y++)
    {
        //grabs an entire row of the imageData
        const uchar *src_pic_point = (uchar *)(src_pic->imageData + y*src_pic->widthStep);

        //goes through each column
        for (int x = 0; x < size.width; x++)
        {
            //for each bin
            for (int z = 0; z < anzBin; z++)
            {
                //check both upper and lower limits
                if (src_pic_point[x] >= z*binSize && src_pic_point[x] < (z+1)*binSize)
                {
                    //increment the index that contains the point
                    histogram[z]++;
                }
            }
        }
    }
    return histogram;
}
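As a side note, with uniform bins the inner loop over z isn't strictly necessary; the bin index can be computed directly. A minimal, untested sketch of the inner body:
int z = (int)(src_pic_point[x] / binSize); // binSize = 256.0 / anzBin, as above
if (z >= anzBin) z = anzBin - 1;           // defensive clamp; with binSize = 256.0/anzBin the index is always < anzBin
histogram[z]++;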

sobel filter algorithm thresholding (no external libs used)

I am writing my own implementation of Sobel edge detection. My function's interface is
void sobel_filter(volatile PIXEL * pixel_in, FLAG *EOL, volatile PIXEL * pixel_out, int rows, int cols)
(PIXEL being an 8-bit greyscale pixel)
For testing I changed the interface to:
void sobel_filter(PIXEL pixels_in[MAX_HEIGHT][MAX_WIDTH],PIXEL
pixels_out[MAX_HEIGHT][MAX_WIDTH], int rows,int cols);
But still, the thing is that I read one pixel at a time, which brings me to the problem of managing the Sobel output values when they are bigger than 255 or smaller than 0. If I had the whole picture from the start, I could normalize all the Sobel output values with their min and max values. But this is not possible for me.
This is my Sobel operator code, ver. 1:
PIXEL sobel_op(PIXEL_CH window[KERNEL_SIZE][KERNEL_SIZE]){

    const char x_op[KERNEL_SIZE][KERNEL_SIZE] = { {-1,0,1},
                                                  {-2,0,2},
                                                  {-1,0,1}};

    const char y_op[KERNEL_SIZE][KERNEL_SIZE] = { {1,2,1},
                                                  {0,0,0},
                                                  {-1,-2,-1}};

    short x_weight=0;
    short y_weight=0;
    PIXEL ans;

    for (short i=0; i<KERNEL_SIZE; i++){
        for(short j=0; j<KERNEL_SIZE; j++){
            x_weight+=window[i][j]*x_op[i][j];
            y_weight+=window[i][j]*y_op[i][j];
        }
    }

    short val=ABS(x_weight)+ABS(y_weight);

    //make sure the pixel value is between 0 and 255 and add thresholds
    if(val>200)
        val=255;
    else if(val<100)
        val=0;

    ans=255-(unsigned char)(val);
    return ans;
}
This is ver. 2; the changes are made only after summing up the weights:
short val=ABS(x_weight)+ABS(y_weight);
unsigned char char_val=(255-(unsigned char)(val));

//make sure the pixel value is between 0 and 255 and add thresholds
if(char_val>200)
    char_val=255;
else if(char_val<100)
    char_val=0;

ans=char_val;
return ans;
Now, for a 3x3 sobel both seem to be giving OK results:
But when I try with a 5x5 sobel
const char x_op[KERNEL_SIZE][KERNEL_SIZE] = { {1,2,0,-2,-1},
{4,8,0,-8,-4},
{6,12,0,-12,-6},
{4,8,0,-8,-4},
{1,2,0,-2,-1}};
const char y_op[KERNEL_SIZE][KERNEL_SIZE] = { {-1,-4,-6,-4,-1},
{-2,-8,-12,-8,-2},
{0,0,0,0,0},
{2,8,12,8,2},
{1,4,6,4,1}};
it gets tricky:
As you can see, for the 5x5 the results are quite bad and I don't know how to normalize the values. Any ideas?
Think about the range of values that your filtered values can take.
For the Sobel 3x3, the highest X/Y value is obtained when the pixels with a positive coefficient are white (255), and the ones with a negative coefficient are black (0), which gives a total of 1020. Symmetrically, the lowest value is -1020. After taking the absolute value, the range is from 0 to 1020 = 4 x 255.
For the magnitude, Abs(X)+Abs(Y), the computation is a little more complicated as the two components cannot reach 1020 at the same time. If I am right, the range is from 0 to 1530 = 6 x 255.
Similar figures for the 5x5 are 48 x 255 and 66 x 255.
Knowing that, you should rescale the values to a smaller range (apply a reduction coefficient) and adjust the thresholds. Logically, if you apply a coefficient of 6/66 to the Sobel 5x5, you will return to conditions similar to the 3x3.
It all depends on the effect that you want to achieve.
Anyway, the true question is: how are the filtered values statistically distributed for typical images ? Because it is unnecessary to keep the far tails of the distribution.
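For instance, a rough, untested sketch of such a rescaling inside the 5x5 version of sobel_op (the factor 11 = 66/6 follows from the ranges above; the thresholds may still need tuning):
short val = (ABS(x_weight) + ABS(y_weight)) / 11; // 66*255 / 11 = 6*255, the same range as the 3x3
//thresholds comparable to the 3x3 version can then be applied
if (val > 200)
    val = 255;
else if (val < 100)
    val = 0;
ans = 255 - (unsigned char)(val);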
You have to normalize the results of your computation. For that you have to find out how "big" the filter is when you sum all its absolute values. So I do this:
for(int i = 0; i < mask.length; i++)
    for(int j = 0; j < mask[i].length; j++)
        size += Math.abs(mask[i][j]);
Where mask is my Sobel filter of the given size. So after applying your Sobel filter you have to normalize the values; in your code it should look like:
for (short i=0; i<KERNEL_SIZE; i++){
    for(short j=0; j<KERNEL_SIZE; j++){
        x_weight+=window[i][j]*x_op[i][j];
        y_weight+=window[i][j]*y_op[i][j];
    }
}

x_weight /= size;
y_weight /= size;
After that, shift the values by 128 for visualization. Only do this if you want to visualize the image; otherwise it causes problems with later calculations (gradients, for example).
x_weight += 128;
y_weight += 128;
Hope it works and helps.

detection of the darkest fixed-size square from a picture

I have a 2600x2600 picture in grayscale; it can also be seen as a matrix of unsigned short.
I would like to find the darkest (or the brightest, by computing the inverse picture) square area of a fixed size N. N should be a parameter (if there is more than one darkest square, I would like them all).
I read detection-of-rectangular-bright-area-in-a-image-using-opencv, but it needs a threshold value I don't have, and furthermore I am looking for a fixed size.
Does anyone have a way to find it in C++ or Python?
For each row of the image:
    Add up each run of N consecutive pixels, so you get W - N + 1 sums per row.
For each column of the new image:
    For each consecutive sequence of N sums (there are H - N + 1):
        Add them up and compare to the current best.
To add up each consecutive sequence of pixels, you can subtract the pixel that leaves the window and add the one that enters it (a sliding-window sum).
You could also reuse the image array as storage, if it can be modified. If not, a memory optimization would be to store just the latest column and go through it for each step in the first loop.
Runtime: O(w·h)
Here is some code in C#, to demonstrate this (ignoring the pixel format, and any potential overflows):
List<Point> FindBrightestSquare(int[,] image, int N, out int squareSum)
{
    int width = image.GetLength(0);
    int height = image.GetLength(1);
    squareSum = 0;
    if (width < N || height < N)
    {
        return new List<Point>();
    }

    // First pass: replace image[x,y] with the sum of the N pixels
    // starting at (x,y) in row y (sliding-window sum along each row).
    int currentSum;
    for (int y = 0; y < height; y++)
    {
        currentSum = 0;
        for (int x = 0; x < width; x++)
        {
            currentSum += image[x, y];
            if (x >= N - 1)
            {
                int windowSum = currentSum;          // sum of pixels x-N+1 .. x
                currentSum -= image[x - N + 1, y];   // slide the window one pixel right
                image[x - N + 1, y] = windowSum;     // safe: this cell is not read again in this row
            }
        }
    }

    // Second pass: slide an N-row window down each column of horizontal sums.
    int? bestSum = null;
    List<Point> bestCandidates = new List<Point>();
    for (int x = 0; x <= width - N; x++)
    {
        currentSum = 0;
        for (int y = 0; y < height; y++)
        {
            currentSum += image[x, y];
            if (y >= N - 1)
            {
                int windowSum = currentSum;          // sum of the NxN square with top-left corner (x, y-N+1)
                currentSum -= image[x, y - N + 1];
                if (bestSum == null || windowSum > bestSum)
                {
                    bestSum = windowSum;
                    bestCandidates.Clear();
                    bestCandidates.Add(new Point(x, y - N + 1));
                }
                else if (windowSum == bestSum)
                {
                    bestCandidates.Add(new Point(x, y - N + 1));
                }
            }
        }
    }

    squareSum = bestSum.Value;
    return bestCandidates;
}
You could increment the threshold until you find a square, and use a 2D FSM to detect the square.
This will produce a match in O(width * height * bpp) (binary search on the lowest possible threshold, assuming a power-of-two range):
- set threshold to its maximum value
- for every bit of the threshold
  - clear the bit in the threshold
  - if there is a match
    - record the set of matches as a result
  - else
    - set the bit
- if there is no record, then the threshold is its maximum.
to detect a square:
- for every pixel:
  - if the pixel is too bright, set its line-len to 0
  - else if it's the first column, set its line-len to 1
  - else set its line-len to the line-len of the pixel to the left, plus one
  - if the pixel line-len is less than N, set its rect-len to 0
  - else if it's the first row, set its rect-len to 1
  - else set its rect-len to the rect-len of the pixel above, plus one
  - if the rect-len is at least N, record a match.
line-len represents the number of consecutive pixels that are dark enough.
rect-len represents the number of consecutive rows of dark pixels that are long enough and aligned.
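As a rough C++ sketch of the match test for a single threshold, following the pseudocode above (the row-major layout, the unsigned short pixel type from the question, and the function name are assumptions):
#include <vector>

// Returns true if some N x N square has every pixel <= threshold.
bool hasDarkSquare(const std::vector<unsigned short>& image,
                   int width, int height, int N, unsigned short threshold)
{
    std::vector<int> lineLen(width * height, 0); // length of the dark run ending at this pixel, per row
    std::vector<int> rectLen(width * height, 0); // consecutive rows with lineLen >= N, per column

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            int idx = y * width + x;

            if (image[idx] > threshold)
                lineLen[idx] = 0;                                   // pixel too bright
            else
                lineLen[idx] = (x == 0) ? 1 : lineLen[idx - 1] + 1;

            if (lineLen[idx] < N)
                rectLen[idx] = 0;                                   // dark run too short
            else
                rectLen[idx] = (y == 0) ? 1 : rectLen[idx - width] + 1;

            if (rectLen[idx] >= N)
                return true;                                        // found an N x N dark square
        }
    }
    return false;
}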
For video-capture, replace the binary search by a linear search from the threshold for the previous frame.
Obviously, you can't get better than theta(width/N * height/N) best case (as you'll have to rule out every possible position for a darker square) and the bit depth can be assumed constant, so this algorithm is asymptotically optimal for a fixed N. It's probably asymptotically optimal for N as a part of the input as well, as (intuitively) you have to consider almost every pixel in the average case.

Histogram approximation for streaming data

This question is a slight extension of the one answered here. I am working on re-implementing a version of the histogram approximation found in Section 2.1 of this paper, and I would like to get all my ducks in a row before beginning this process again. Last time, I used boost::multi_index, but performance wasn't the greatest, and I would like to avoid a std::set's insert/find complexity, which is logarithmic in the number of buckets. Because of the number of histograms I'm using (one per feature per class per leaf node of a random tree in a random forest), the computational complexity must be as close to constant as possible.
A standard technique used to implement a histogram involves mapping the input real value to a bin number. To accomplish this, one method is to:
initialize a standard C array of size N, where N = number of bins; and
multiply the input value (real number) by some factor and floor the result to get its index in the C array.
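In code, that scheme is roughly the following sketch (the names and bin count are illustrative, not from any particular implementation):
const int N = 64;                  // number of bins
int counts[N] = {0};
double factor = N / range;         // 'range' is the assumed width of the value range
int bin = (int)(value * factor);   // multiply and floor (truncation, since value is assumed non-negative)
if (bin >= 0 && bin < N) counts[bin]++;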
This works well for histograms with uniform bin size, and is quite efficient. However, Section 2.1 of the above-linked paper provides a histogram algorithm without uniform bin sizes.
Another issue is that simply multiplying the input real value by a factor and using the resulting product as an index fails with negative numbers. To resolve this, I considered identifying a '0' bin somewhere in the array. This bin would be centered at 0.0; the bins above/below it could be calculated using the same multiply-and-floor method just explained, with the slight modification that the floored product is added to or subtracted from the index of the '0' bin as necessary.
This then raises the question of merges: the algorithm in the paper merges the two closest bins, as measured from center to center. In practice, this creates a 'jagged' histogram approximation, because some bins would have extremely large counts and others would not. Of course, this is due to non-uniform-sized bins, and doesn't result in any loss of precision. A loss of precision does, however, occur if we try to normalize the non-uniform-sized bins to make them uniform. This is because of the assumption that m/2 samples fall to the left and right of the bin center, where m = bin count. We could model each bin as a Gaussian, but this will still result in a loss of precision (albeit minimal).
So that's where I'm stuck right now, leading to this major question: What's the best way to implement a histogram accepting streaming data and storing each sample in bins of uniform size?
Keep four variables.
int N; // assume for simplicity that N is even
int count[N];
double lower_bound;
double bin_size;
When a new sample x arrives, compute i = floor((x - lower_bound) / bin_size). If i >= 0 && i < N, then increment count[i]. If i >= N, then repeatedly double bin_size until x - lower_bound < N * bin_size. On every doubling, adjust the counts (optimize this by exploiting sparsity for multiple doublings).
for (int j = 0; j < N / 2; j++) count[j] = count[2 * j] + count[2 * j + 1];
for (int j = N / 2; j < N; j++) count[j] = 0;
The case i < 0 is trickier, since we need to decrease lower_bound as well as increase bin_size (again, optimize for sparsity or adjust the counts in one step).
while (lower_bound > x) {
    lower_bound -= N * bin_size;
    bin_size += bin_size;
    for (int j = N - 1; j > N / 2 - 1; j--) count[j] = count[2 * j - N] + count[2 * j - N + 1];
    for (int j = 0; j < N / 2; j++) count[j] = 0;
}
The exceptional cases are expensive but happen only a logarithmic number of times in the range of your data over the initial bin size.
If you implement this in floating-point, be mindful that floating-point numbers are not real numbers and that statements like lower_bound -= N * bin_size may misbehave (in this case, if N * bin_size is much smaller than lower_bound). I recommend that bin_size be a power of the radix (usually two) at all times.
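Putting the pieces together, an untested sketch of the insert path could look like this (the struct and its member names are illustrative; it simply combines the steps above):
#include <cmath>
#include <vector>

struct StreamingHistogram {
    int N;                   // number of bins; assumed even, as above
    std::vector<int> count;  // size N
    double lower_bound;
    double bin_size;         // ideally a power of the radix, as recommended above

    void insert(double x) {
        // Grow the range downward until x >= lower_bound (the top edge stays fixed).
        while (lower_bound > x) {
            lower_bound -= N * bin_size;
            bin_size += bin_size;
            for (int j = N - 1; j > N / 2 - 1; j--)
                count[j] = count[2 * j - N] + count[2 * j - N + 1];
            for (int j = 0; j < N / 2; j++)
                count[j] = 0;
        }
        // Grow the range upward until x falls below the top edge.
        while (x - lower_bound >= N * bin_size) {
            bin_size += bin_size;
            for (int j = 0; j < N / 2; j++)
                count[j] = count[2 * j] + count[2 * j + 1];
            for (int j = N / 2; j < N; j++)
                count[j] = 0;
        }
        int i = (int)std::floor((x - lower_bound) / bin_size);
        if (i == N) i = N - 1;  // guard against x landing exactly on the top edge in floating point
        count[i]++;
    }
};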