Detection of the darkest fixed-size square from a picture - C++

I have a 2600x2600 grayscale picture.
It can also be seen as a matrix of unsigned short.
I would like to find the darkest (or the brightest, by computing the inverse picture) square area of a fixed size N. N could be parametrized (and if there is more than one darkest square, I would like them all).
I read detection-of-rectangular-bright-area-in-a-image-using-opencv
but it needs a threshold value I don't have, and furthermore I am searching for a fixed size.
Does anyone have a way to find it in C++ or Python?

For each row of the image,
add up each run of N consecutive pixels, so you get W - N + 1 sums per row.
For each column of the resulting image,
for each consecutive sequence of N sums (there are H - N + 1 of them),
add them up and compare to the current best.
To add up each consecutive sequence, you can subtract the pixel leaving the window and add the pixel entering it, instead of re-summing the whole window.
You could also reuse the image array as storage, if it can be modified. If not, a memory optimization would be to store just the latest column and go through it for each step of the first loop.
Runtime: O(w·h)
Here is some code in C#, to demonstrate this (ignoring the pixel format, and any potential overflows):
List<Point> FindBrightestSquare(int[,] image, int N, out int squareSum)
{
    int width = image.GetLength(0);
    int height = image.GetLength(1);
    if (width < N || height < N)
    {
        squareSum = 0;
        return new List<Point>();
    }
    int currentSum;
    // Horizontal pass: replace image[x,y] with the sum of the N pixels
    // starting at column x. This is done in place, using a small ring buffer
    // of original values so nothing is overwritten before it is subtracted.
    int[] window = new int[N];
    for (int y = 0; y < height; y++)
    {
        currentSum = 0;
        for (int x = 0; x < width; x++)
        {
            int entering = image[x, y];
            currentSum += entering;
            if (x >= N)
            {
                currentSum -= window[x % N]; // original pixel leaving the window
            }
            window[x % N] = entering;
            if (x >= N - 1)
            {
                image[x - N + 1, y] = currentSum; // sum of columns x-N+1 .. x
            }
        }
    }
    // Vertical pass: slide a window of N row-sums down each valid column.
    int? bestSum = null;
    List<Point> bestCandidates = new List<Point>();
    for (int x = 0; x <= width - N; x++)
    {
        currentSum = 0;
        for (int y = 0; y < height; y++)
        {
            currentSum += image[x, y];
            if (y >= N)
            {
                currentSum -= image[x, y - N];
            }
            if (y >= N - 1)
            {
                // currentSum is now the sum of the NxN square
                // whose top-left corner is (x, y - N + 1).
                if (bestSum == null || currentSum > bestSum)
                {
                    bestSum = currentSum;
                    bestCandidates.Clear();
                    bestCandidates.Add(new Point(x, y - N + 1));
                }
                else if (currentSum == bestSum)
                {
                    bestCandidates.Add(new Point(x, y - N + 1));
                }
            }
        }
    }
    squareSum = bestSum.Value;
    return bestCandidates;
}

You could increment the threshold until you find a square, and use a 2D FSM to detect the square.
This will produce a match in O(width * height * bpp) (binary search on the lowest possible threshold, assuming a power-of-two range):
- set threshold to its maximum value
- for every bit of the threshold
- clear the bit in the threshold
- if there is a match
- record the set of matches as a result
- else
- set the bit
- if there is no record, then the threshold is its maximum.
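For illustration, a minimal C++ sketch of that bit-wise search, assuming 16-bit pixels; hasMatch(threshold) is a stand-in for the square detection described next:

#include <cstdint>

bool hasMatch(uint16_t threshold); // the detection scan described below

// Finds the smallest threshold that still yields a match, assuming a match
// exists at the maximum threshold (hasMatch is monotone in the threshold).
uint16_t lowestMatchingThreshold()
{
    uint16_t threshold = 0xFFFF;                // set threshold to its maximum
    for (int bit = 15; bit >= 0; --bit) {
        threshold &= (uint16_t)~(1u << bit);    // clear the bit in the threshold
        if (!hasMatch(threshold))
            threshold |= (uint16_t)(1u << bit); // no match: set the bit again
    }
    return threshold;
}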
to detect a square:
- for every pixel:
- if the pixel is too bright, set its line-len to 0
- else if it's the first column, set its line-len to 1
- else set its line-len to the line-len of the pixel to the left, plus one
- if the pixel line-len is less than N, set its rect-len to 0
- else if it's the first row, set its rect-len to 1
- else set its rect-len to the rect-len of the pixel above, plus one
- if the rect-len is at least N, record a match.
line-len represents the number of consecutive pixels that are dark enough.
rect-len represents the number of consecutive rows of dark pixels that are long enough and aligned.
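Here is a minimal C++ sketch of that scan (the names are mine; it assumes a row-major unsigned short image). A non-empty result means there is a match, so it can serve as the hasMatch predicate above:

#include <cstdint>
#include <utility>
#include <vector>

// Returns the top-left corners of all NxN squares whose pixels are all <= threshold.
std::vector<std::pair<int, int>> findDarkSquares(const std::vector<uint16_t> &img,
                                                 int width, int height, int N,
                                                 uint16_t threshold)
{
    std::vector<int> lineLen(width * height, 0);
    std::vector<int> rectLen(width * height, 0);
    std::vector<std::pair<int, int>> matches;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            int i = y * width + x;
            if (img[i] > threshold)
                lineLen[i] = 0;                                  // too bright: run broken
            else
                lineLen[i] = (x == 0) ? 1 : lineLen[i - 1] + 1;  // extend the dark run
            if (lineLen[i] < N)
                rectLen[i] = 0;                                  // row run too short here
            else
                rectLen[i] = (y == 0) ? 1 : rectLen[i - width] + 1;
            if (rectLen[i] >= N)                                 // N stacked runs of length >= N
                matches.emplace_back(x - N + 1, y - N + 1);
        }
    }
    return matches;
}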
For video-capture, replace the binary search by a linear search from the threshold for the previous frame.
Obviously, you can't get better than Θ(width/N · height/N) in the best case (as you'll have to rule out every possible position for a darker square), and the bit depth can be assumed constant, so this algorithm is asymptotically optimal for a fixed N. It's probably asymptotically optimal with N as part of the input as well, since (intuitively) you have to consider almost every pixel in the average case.


How to determine pixel intensity with respect to pixel range in x-axis?

I want to see the distribution of a color with respect to image width. That is, if a (black and white) image has a width of 720 px, then I want to conclude that a specific range (e.g. pixels [500,720]) contains more white than the rest of the image. What I thought is, I need a 720x1 px slice of the image, then I need to check the values and distribute them w.r.t. the 720 px width. But I don't know how to apply this in a suitable way.
edit: I use OpenCV 4.0.0 with C++.
Example Case: In the first image, it is obvious that the right-hand-side pixels are white. I want to get approximate coordinates of this dense line or zone. The light pink zone is the one I am interested in, and the red borders mark the range where I want to find it.
If you want to find the minimum continuous range of image columns which contains more white than the rest of the image, then you first need to count the number of white pixels in each column. Let's assume we have a 720x500 image (500 pixels high and 720 pixels wide). Then you will get an array Arr of 720 elements, each holding the number of white pixels in the corresponding 1x500 column.
const int Width = img.cols;
std::vector<int> Arr(Width, 0); // std::vector avoids leaking a raw new[]
for (int x = 0; x < Width; x++) {
    for (int y = 0; y < img.rows; y++) {
        if (img.at<cv::Vec3b>(y, x) == cv::Vec3b(255, 255, 255)) {
            Arr[x]++;
        }
    }
}
You need to find a minimum range [A;B] in this array that satisfies condition Sum(Arr[0 to A-1]) + Sum(Arr[B+1 to Width-1]) < Sum(Arr[A to B]).
// minimum range width is guaranteed to be less than or equal to (Width/2 + 1)
int bestA = 0, minimumWidth = Width/2 + 1;
int total = RangeSum(Arr, 0, Width - 1);
for (int i = 0; i < Width; i++) {
    for (int j = i; j < Width && j < i + minimumWidth; j++) {
        int rangeSum = RangeSum(Arr, i, j);
        if (rangeSum > total - rangeSum) {
            bestA = i;
            minimumWidth = j - i + 1;
            break;
        }
    }
}
std::cout << "Most white minimum range - [" << bestA << ";" << bestA + minimumWidth - 1 << "]\n";
You can optimize the code if you precalculate the sums of all ranges [0; i], for i from 0 to Width - 1. Then you can calculate RangeSum(Arr, A, B) as PrecalculatedSums[B] - PrecalculatedSums[A-1] (treating PrecalculatedSums[-1] as 0), in O(1) complexity.
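A minimal sketch of that precalculation (the container and function names are my own assumptions):

#include <vector>

// sums[i] holds Sum(Arr[0..i]).
std::vector<long long> buildPrefixSums(const std::vector<int> &Arr)
{
    std::vector<long long> sums(Arr.size());
    long long running = 0;
    for (size_t i = 0; i < Arr.size(); i++) {
        running += Arr[i];
        sums[i] = running;
    }
    return sums;
}

// Sum(Arr[A..B]) in O(1); note the A-1 index, since the range is inclusive.
long long RangeSum(const std::vector<long long> &sums, int A, int B)
{
    return sums[B] - (A > 0 ? sums[A - 1] : 0);
}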

Procedurally generate seamless fractal noise textures

I have been generating noise textures to use as height maps for terrain generation. In this application, initially there is a 256x256 noise texture that is used to create a block of land that the user is free to roam around. When the user reaches a certain boundary in-game the application generates a new texture and thus another block of terrain.
In the code, a table of 64x64 random values is generated, and the values in the texture are the result of interpolating between these points at various 'frequencies' and 'wavelengths' using a smoothstep function; these are combined to form the final noise texture, and finally the values in the texture are divided by the largest value to effectively normalize it. When the player is at the boundary and a new texture is created, the random number table re-uses the values from the appropriate edge of the previous texture (e.g. if the new texture is for a block of land on the +X side of the previous one, the last value in every row of the previous table is used as the first value in every row of random numbers in the next).
My problem is this: even though the same values are being used across the edges of adjacent textures, they are nowhere near seamless - some neighboring points on the terrain are mismatched by many, many metres. My guess is that the changing frequencies used to sample the random number table are having a significant effect on all areas of the texture. So how might one generate fractal noise procedurally, i.e. as needed, AND have it look continuous with adjacent values?
Here is a section of the code that returns a value interpolated between the points on the random number table given a point P:
float MainApp::assessVal(glm::vec2 P){
    //Integer component of P
    int xi = (int)P.x;
    int yi = (int)P.y;
    //Decimal component of P
    float xr = P.x - xi;
    float yr = P.y - yi;
    //Find the grid square P lies inside of
    int x0 = xi % randX;
    int x1 = (xi + 1) % randX;
    int y0 = yi % randY;
    int y1 = (yi + 1) % randY;
    //Get random values for the 4 nodes
    float r00 = randNodes->randNodes[y0][x0];
    float r10 = randNodes->randNodes[y0][x1];
    float r01 = randNodes->randNodes[y1][x0];
    float r11 = randNodes->randNodes[y1][x1];
    //Smoother interpolation so
    //texture appears less blocky
    float sx = smoothstep(xr);
    float sy = smoothstep(yr);
    //Find the weighted value of the 4
    //random values. This will be the
    //final value in the noise texture
    float sx0 = mix(r00, r10, sx);
    float sx1 = mix(r01, r11, sx);
    return mix(sx0, sx1, sy);
}
Where randNodes is a 2 dimensional array containing the random values.
And here is the code that takes all the values returned from the above function and constructs texture data:
int layers = 5;
float wavelength = 1, frequency = 1;
for (int k = 0; k < layers; k++) {
    for (int i = 0; i < stepsY; i++) {
        for (int j = 0; j < stepsX; j++) {
            //Compute value for (stepsX * stepsY) interpolation points
            //across the grid of random numbers
            glm::vec2 P = glm::vec2((float)j/stepsX * randX, (float)i/stepsY * randY);
            buf[i * stepsX + j] += assessVal(P * wavelength) * frequency; // row-major: the stride is stepsX
        }
    }
    //repeat (layers) times with different signals
    wavelength *= 0.5;
    frequency *= 2;
}
for (int i = 0; i < buf.size(); i++) {
    //divide all data by the largest value.
    //this normalises the data to avoid saturation
    buf[i] /= largestVal;
}
Finally, here is an example of two textures generated by these functions that should be seamless, but aren't:
The 2 images placed side by side as they are now are obviously mis-matched.
Your code wraps the values only in the domain of the noise texture you read from, but not in the domain of the texture being generated.
For the texture T of size stepX to be repeatable (let's consider 1-d case for simplicity) you must have
T(0) == T(stepX)
Or in your case (substitute j = 0 and j = stepX):
assessVal(0) == assessVal(randX * wavelength)
For k >= 1 this is clearly not true in your code, because
(randX / pow(2, k)) % randX != 0
One solution is to decrease randX and randY while you go up the frequencies.
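For illustration, a hedged sketch of that first fix (the function and parameter names are mine): wrap the lattice with a per-octave period that shrinks along with the wavelength, so every octave tiles across the generated texture.

#include <glm/glm.hpp>
#include <vector>

// Like assessVal, but wraps the lattice to a per-octave period instead of
// the full table size (randX, randY).
float assessValPeriodic(glm::vec2 P, const std::vector<std::vector<float>> &nodes,
                        int periodX, int periodY)
{
    int xi = (int)P.x, yi = (int)P.y;
    float xr = P.x - xi, yr = P.y - yi;
    int x0 = xi % periodX, x1 = (xi + 1) % periodX;
    int y0 = yi % periodY, y1 = (yi + 1) % periodY;
    float r00 = nodes[y0][x0], r10 = nodes[y0][x1];
    float r01 = nodes[y1][x0], r11 = nodes[y1][x1];
    float sx = glm::smoothstep(0.0f, 1.0f, xr);
    float sy = glm::smoothstep(0.0f, 1.0f, yr);
    return glm::mix(glm::mix(r00, r10, sx), glm::mix(r01, r11, sx), sy);
}

// In the layer loop, halve the period together with the wavelength:
//   int periodX = randX, periodY = randY;
//   for (int k = 0; k < layers; k++) {
//       ... assessValPeriodic(P * wavelength, nodes, periodX, periodY) ...
//       periodX = std::max(1, periodX / 2);
//       periodY = std::max(1, periodY / 2);
//   }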
But my typical approach would rather be to start from a 2x2 random texture, upscale it to 4x4 with GL_REPEAT, add a bit more per-pixel noise, continue upscaling to 8x8, etc., until I get to the desired size.
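For illustration, a hedged CPU-side sketch of that upscale-and-add-noise idea (GL-free; the names are mine, and rand() merely stands in for a real noise source):

#include <cstdlib>
#include <vector>

// Double the side of an n x n tileable grid using wrapped bilinear sampling
// (the CPU equivalent of GL_REPEAT), then add per-pixel noise of amplitude amp.
// Repeating this from a 2x2 grid up to the target size stays seamless by construction.
std::vector<float> upscaleWrapped(const std::vector<float> &src, int n, float amp)
{
    int m = n * 2;
    std::vector<float> dst(m * m);
    for (int y = 0; y < m; ++y) {
        for (int x = 0; x < m; ++x) {
            float fx = x * 0.5f, fy = y * 0.5f;
            int x0 = (int)fx % n, y0 = (int)fy % n;
            int x1 = (x0 + 1) % n, y1 = (y0 + 1) % n;  // wrap == GL_REPEAT
            float tx = fx - (int)fx, ty = fy - (int)fy;
            float a = src[y0*n + x0] * (1-tx) + src[y0*n + x1] * tx;
            float b = src[y1*n + x0] * (1-tx) + src[y1*n + x1] * tx;
            float noise = ((float)rand() / RAND_MAX - 0.5f) * amp;
            dst[y*m + x] = a * (1-ty) + b * ty + noise;
        }
    }
    return dst;
}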
The root cause of course is that your smoothing changes pixels to match their neighbors, but you later add new neighbors and do not re-smooth the pixels who got new neighbors.
One simple and common workaround is to keep an edge of invisible pixels, the width of which is half that of your smoothing kernel. Now, when expanding the area, you can resmooth those invisible pixels just before they're revealed. Don't forget to add a new edge of invisible pixels!

Trying to compute my own Histogram without opencv calcHist()

What I'm trying to do is write a function that calculates a histogram of a greyscale image, given a number of bins (anzBin) into which the histogram's range is divided. I run through the image pixels, comparing their values to the ranges of the different bins, and when a value fits a bin, I increase that bin's value by 1.
vector<int> calcuHisto(const IplImage *src_pic, int anzBin)
{
    CvSize size = cvGetSize(src_pic);
    int binSize = (size.width / 256)*anzBin;
    vector<int> histogram(anzBin,0);
    for (int y = 0; y < size.height; y++)
    {
        const uchar *src_pic_point =
            (uchar *)(src_pic->imageData + y*src_pic->widthStep);
        for (int x = 0; x < size.width; x++)
        {
            for (int z = 0; z < anzBin; z++)
            {
                if (src_pic_point[x] <= z*binSize)
                {
                    histogram[src_pic_point[x]]++;
                }
            }
        }
    }
    return histogram;
}
But unfortunately it's not working...
What is wrong here?
Please help
There are a few issues I can see:
- Your binSize calculation is wrong
- Your binning algorithm is one sided, and should be two sided
- You aren't incrementing the proper bin when you find a match
1. binSize calculation
binSize = your range / number of bins
2. two sided binning
if (src_pic_point[x] <= z*binSize)
You need a two-sided range of values, not a one-sided inequality. Imagine you have 4 bins and values from 0 to 255. Your bins should have the following ranges:
bin    low       high
0      0         63.75
1      63.75     127.5
2      127.5     191.25
3      191.25    255
For example: a value of 57 should go in bin 0. But your code says the value goes in several bins, because it only checks that it is <= z*binSize. You need a test with both a lower and an upper bound.
3. Incrementing the appropriate bin
You are using z to loop over each bin, so when you find a match you should increment bin z; you don't use the actual pixel value except to determine which bin it belongs to.
The line below would likely cause a buffer overrun. Imagine again that you have 4 bins, and the current pixel has a value of 57: this code says increment bin 57, but you only have 4 bins (0-3).
histogram[src_pic_point[x]]++;
You want to increment only the bin the pixel value falls into:
histogram[z]++;
CODE
With that in mind, here is revised code (it is untested, but should work):
vector<int> calcuHisto(const IplImage *src_pic, int anzBin)
{
    CvSize size = cvGetSize(src_pic);
    double binSize = 256.0 / anzBin; // new definition
    vector<int> histogram(anzBin, 0); // i don't know if this works, so I will leave it
    // goes through all rows
    for (int y = 0; y < size.height; y++)
    {
        // grabs an entire row of the imageData
        const uchar *src_pic_point = (uchar *)(src_pic->imageData + y*src_pic->widthStep);
        // goes through each column
        for (int x = 0; x < size.width; x++)
        {
            // for each bin
            for (int z = 0; z < anzBin; z++)
            {
                // check both upper and lower limits
                if (src_pic_point[x] >= z*binSize && src_pic_point[x] < (z+1)*binSize)
                {
                    // increment the bin that contains the pixel
                    histogram[z]++;
                }
            }
        }
    }
    return histogram;
}

How to filter given width of lines in a image?

I need to filter lines of a given width in an image.
I am coding a program which will detect lines in a road image, and I found something like that but can't understand its logic. My function has to do this:
I will send an image and a line width in terms of pixel size (e.g. 30 pixels wide), and the function will filter just these lines in the image.
I found this code:
void filterWidth(const Mat &quad, Mat &quadDst, int tau) // tau = width of line I want to filter
{
    int aux = 0;
    for (int j = 0; j < quad.rows; ++j)
    {
        const unsigned char *ptRowSrc = quad.ptr<uchar>(j);
        unsigned char *ptRowDst = quadDst.ptr<uchar>(j);
        for (int i = tau; i < quad.cols - tau; ++i)
        {
            if (ptRowSrc[i] != 0)
            {
                aux = 2 * ptRowSrc[i];
                aux += -ptRowSrc[i - tau];
                aux += -ptRowSrc[i + tau];
                aux += -abs((int)(ptRowSrc[i - tau] - ptRowSrc[i + tau]));
                aux = (aux < 0) ? (0) : (aux);
                aux = (aux > 255) ? (255) : (aux);
                ptRowDst[i] = (unsigned char)aux;
            }
        }
    }
}
What is the mathematical explanation of that code? And how does that work?
Read up about convolution filters. This code is a particular case of a 1-dimensional convolution filter (it only convolves with other pixels on the currently processed line).
The value of aux starts at 2 * the current pixel value; then the pixels at distance tau on either side of it are subtracted from that value. Next, the absolute difference of those two pixels is also subtracted. Finally the result is clamped to the range 0...255 before being stored in the output image.
If you have an image (with tau = 2):
0011100
This convolution will cause the centre 1 to gain the value:
2 * 1
- 0
- 0
- abs(0 - 0)
= 2
The first '1' will become:
2 * 1
- 0
- 1
- abs(0 - 1)
= 0
And so will the third '1' (it's a mirror image).
And of course the 0 values will always stay zero or become negative, which will be capped back to 0.
This is a rather weird filter. It takes the pixel values three by three on the same line, with a tau spacing. Let these values be Vl, V and Vr.
The filter computes - Vl + 2 V - Vr, which can be seen as a second derivative, and deducts |Vl - Vr|, which can be seen as a first derivative (also called gradient). The second derivative gives a maximum response in case of a maximum configuration (Vl < V > Vr); the first derivative gives a minimum response in case of a symmetric configuration (Vl = Vr).
So the global filter will give a maximum response for a symmetric maximum (like a light road on a dark background, vertical, with a width less than 2·tau).
By rearranging the terms, you can see that the filter yields twice the smaller of the left and right gradients, V - Vl and V - Vr (clamped to zero).
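To spell out that rearrangement (using the identity a + b - |a - b| = 2·min(a, b)):

2V - Vl - Vr - |Vl - Vr|
    = (V - Vl) + (V - Vr) - |(V - Vl) - (V - Vr)|
    = 2 · min(V - Vl, V - Vr)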

Optimized float Blur variations

I am looking for optimized functions in C++ for calculating areal averages of floats. The function is passed a source float array, a destination float array (same size as the source array), the array width and height, and the "blurring" area width and height.
The function should "wrap around" the edges for the blurring/average calculations.
Here is example code that blurs with a rectangular shape:
/*****************************************
* Find averages extended variations
*****************************************/
void findaverages_ext(float *floatdata, float *dest_data, int fwidth, int fheight, int scale, int aw, int ah, int weight, int xoff, int yoff)
{
    printf("findaverages_ext scale: %d, width: %d, height: %d, weight: %d \n", scale, aw, ah, weight);
    float total = 0.0;
    int spos = scale * fwidth * fheight;
    int apos;
    int w = aw;
    int h = ah;
    float* f_temp = new float[fwidth * fheight];
    // Horizontal
    for (int y = 0; y < fheight; y++)
    {
        Sleep(10); // Do not burn your processor
        total = 0.0;
        // Process entire window for first pixel (including wrap-around edge)
        for (int kx = 0; kx <= w; ++kx)
            if (kx >= 0 && kx < fwidth)
                total += floatdata[y*fwidth + kx];
        // Wrap
        for (int kx = (fwidth-w); kx < fwidth; ++kx)
            if (kx >= 0 && kx < fwidth)
                total += floatdata[y*fwidth + kx];
        // Store first window
        f_temp[y*fwidth] = (total / (w*2+1));
        for (int x = 1; x < fwidth; x++) // x width changes with y
        {
            // Subtract pixel leaving window (wrapping around the left edge)
            if (x-w-1 >= 0)
                total -= floatdata[y*fwidth + x-w-1];
            else
                total -= floatdata[y*fwidth + x-w-1+fwidth];
            // Add pixel entering window
            if (x+w < fwidth)
                total += floatdata[y*fwidth + x+w];
            else
                total += floatdata[y*fwidth + x+w-fwidth];
            // Store average
            apos = y * fwidth + x;
            f_temp[apos] = (total / (w*2+1));
        }
    }
    // Vertical
    for (int x = 0; x < fwidth; x++)
    {
        Sleep(10); // Do not burn your processor
        total = 0.0;
        // Process entire window for first pixel
        for (int ky = 0; ky <= h; ++ky)
            if (ky >= 0 && ky < fheight)
                total += f_temp[ky*fwidth + x];
        // Wrap
        for (int ky = fheight-h; ky < fheight; ++ky)
            if (ky >= 0 && ky < fheight)
                total += f_temp[ky*fwidth + x];
        // Store first window
        dest_data[spos + x] = (total / (h*2+1));
        for (int y = 1; y < fheight; y++) // y width changes with x
        {
            // Subtract pixel leaving window (wrapping around the top edge)
            if (y-h-1 >= 0)
                total -= f_temp[(y-h-1)*fwidth + x];
            else
                total -= f_temp[(y-h-1+fheight)*fwidth + x];
            // Add pixel entering window
            if (y+h < fheight)
                total += f_temp[(y+h)*fwidth + x];
            else
                total += f_temp[(y+h-fheight)*fwidth + x];
            // Store average
            apos = y * fwidth + x;
            dest_data[spos+apos] = (total / (h*2+1));
        }
    }
    delete[] f_temp; // array delete, not scalar delete
}
What I need are similar functions that, for each pixel, find the average (blur) of pixels from shapes other than rectangular.
The specific shapes are: "S" (sharp edges), "O" (rectangular but hollow), "+" and "X", where the average float is stored at the center pixel of the destination data array. The size of the blur shape should be variable in width and height.
The functions do not need to be pixel-perfect, only optimized for performance. There could be separate functions for each shape.
I am also happy if anyone can give me tips on how to optimize the example function above for rectangular blurring.
What you are trying to implement are various sorts of digital filters for image processing. This is equivalent to convolving two signals, where the 2nd one would be the filter's impulse response. So far, you recognized that a "rectangular average" is separable. By separable I mean that you can split the filter into two parts: one that operates along the X axis and one that operates along the Y axis -- in each case a 1D filter. This is nice and can save you lots of cycles. But not every filter is separable. Averaging along other shapes (S, O, +, X) is not separable. You need to actually compute a 2D convolution for these.
As for performance, you can speed up your 1D averages by properly implementing a "moving average". A proper "moving average" implementation only requires a fixed, small amount of work per pixel regardless of the size of the averaging "window". This can be done by recognizing that neighbouring pixels of the target image are computed as an average of almost the same source pixels. You can reuse the sum for the neighbouring target pixel by adding one new pixel intensity and subtracting an older one (for the 1D case).
In case of arbitrary non-separable filters, your best bet performance-wise is "fast convolution", which is FFT-based. Check out www.dspguide.com. If I recall correctly, there is even a chapter on how to properly do "fast convolution" using the FFT algorithm. Although they explain it for 1-dimensional signals, it also applies to 2-dimensional signals. For images you have to perform 2D FFT/iFFT transforms.
To add to sellibitze's answer, you can use a summed area table for your O, S and + kernels (not for the X one though). That way you can convolve a pixel in constant time, and it's probably the fastest method to do it for kernel shapes that allow it.
Basically, a SAT is a data structure that lets you calculate the sum of any axis-aligned rectangle. For the O kernel, after you've built a SAT, you'd take the sum of the outer rect's pixels and subtract the sum of the inner rect's pixels. The S and + kernels can be implemented similarly.
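For illustration, here is a minimal summed-area-table sketch (the names and layout are mine; it ignores the wrap-around requirement):

#include <vector>

// sat has size (h+1) x (w+1); sat[y][x] = sum of src over the rectangle [0..x) x [0..y).
std::vector<std::vector<double>> buildSAT(const std::vector<std::vector<float>> &src)
{
    int h = (int)src.size(), w = (int)src[0].size();
    std::vector<std::vector<double>> sat(h + 1, std::vector<double>(w + 1, 0.0));
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            sat[y+1][x+1] = src[y][x] + sat[y][x+1] + sat[y+1][x] - sat[y][x];
    return sat;
}

// Sum over the rectangle [x0, x1) x [y0, y1) in O(1).
double rectSum(const std::vector<std::vector<double>> &sat,
               int x0, int y0, int x1, int y1)
{
    return sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0];
}

// Example: the hollow "O" kernel is the outer rectangle minus the inner one.
// The caller must keep both rectangles inside the image (no wrapping here).
double oKernelSum(const std::vector<std::vector<double>> &sat,
                  int cx, int cy, int outer, int inner)
{
    return rectSum(sat, cx - outer, cy - outer, cx + outer + 1, cy + outer + 1)
         - rectSum(sat, cx - inner, cy - inner, cx + inner + 1, cy + inner + 1);
}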
For the X kernel you can use a different approach. A skewed box filter is separable:
You can convolve with two long, thin skewed box filters, then add the two resulting images together. The center of the X will be counted twice, so you will need to convolve with another skewed box filter, and subtract that.
Apart from that, you can optimize your box blur in many ways.
Remove the two ifs from the inner loop by splitting that loop into three loops - two short ones that do the wrap-around checks, and one long one that doesn't (see the sketch after this list). Or you could pad your array with extra elements in all directions - that way you can simplify your code.
Calculate values like h * 2 + 1 outside the loops.
An expression like f_temp[ky*fwidth + x] does two adds and one multiplication. You can initialize a pointer to &f_temp[ky*fwidth] outside the loop, and just increment that pointer in the loop.
Don't do the division by w * 2 + 1 in the horizontal step. Instead, divide by (w*2+1)*(h*2+1) once, in the vertical step.
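As a hedged illustration of those tips, here is how the horizontal pass over one row might look with the wrap-around checks hoisted out of the main loop (the names are mine; it assumes fwidth > 2*w):

void blurRowWrapped(const float *src, float *dst, int fwidth, int w)
{
    const int win = w * 2 + 1;       // computed once, outside all loops
    const float norm = 1.0f / win;
    float total = 0.0f;
    // initial window around x = 0, wrapping on both sides
    for (int k = -w; k <= w; ++k)
        total += src[(k + fwidth) % fwidth];
    dst[0] = total * norm;
    // prologue: the leaving pixel wraps around the left edge
    for (int x = 1; x <= w; ++x) {
        total += src[x + w] - src[x - w - 1 + fwidth];
        dst[x] = total * norm;
    }
    // main loop: no wrapping, no ifs
    for (int x = w + 1; x < fwidth - w; ++x) {
        total += src[x + w] - src[x - w - 1];
        dst[x] = total * norm;
    }
    // epilogue: the entering pixel wraps around the right edge
    for (int x = fwidth - w; x < fwidth; ++x) {
        total += src[x + w - fwidth] - src[x - w - 1];
        dst[x] = total * norm;
    }
    // (per the last tip, the multiplication by norm could be deferred to the
    // vertical pass and folded into a single normalization there)
}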