Local thresholding implementation in C++ / OpenCV

I would like to implement a local thresholding algorithm and I require your expertise. My images are resized to 600x400, grayscale.
Basic thought process on localizing:
Segment the image using a 9x9 ROI taken at each pixel, calculating the maximum intensity in the region.
Create a 9x9 kernel.
Condition: if the center pixel of the mask is above 50% of the maximum intensity, set the center pixel to true (apply the mask).
My question to you: how should I pick my kernel/mask?
cv::Mat ROI;
cv::Mat mask(input.size(), CV_8UC1, cv::Scalar::all(0)); // create mask of 0s at first
const int kerneldepth = 1;
const int kernelsize = 9;
cv::Mat kernel = cv::Mat::ones(kernelsize, kernelsize, CV_8UC1);

// take a 9x9 ROI and apply a threshold
for (int x = 9; x < input.cols - 9; x++) {
    for (int y = 9; y < input.rows - 9; y++) {
        try {
            double x_left  = x - 4;
            double x_right = x + 4;
            double y_up    = y + 4;
            double y_down  = y - 4;
            double maxVal;
            double minVal;
            cv::Point anchor(kernelsize / 2, kernelsize / 2);
            cv::Rect ROI = cv::Rect(x_left, y_down, 9, 9);
            cv::Mat ROI_Mat = input(ROI);                     // a new matrix for the ROI
            cv::Scalar avgPixelIntensity = cv::mean(ROI_Mat); // calculate mean
            cv::minMaxLoc(ROI_Mat, &minVal, &maxVal);
            if (input.at<uchar>(x, y) >= 0.5 * maxVal) {
                cv::filter2D(input, mask, -1, kernel, anchor, 0);
            } else { break; }
        }
        catch (cv::Exception &e) {
            e.what();
        }
    }
}
UPDATED CODE:
void applyLocalThresh(cv::Mat &src, cv::Mat &out) {
    double maxVal, minVal;
    cv::Mat output;
    int top, bottom, left, right;
    int borderType = cv::BORDER_CONSTANT;
    cv::Scalar value;
    top = 9; bottom = 9;
    left = 9; right = 9;
    output = src;
    out = src;
    value = 0;
    cv::copyMakeBorder(src, output, top, bottom, left, right, borderType, value);
    for (int y = 9; y < src.rows; y++) {
        for (int x = 9; x < src.cols; x++) {
            cv::Mat ROI = src(cv::Rect(cv::Point(x - 4, y - 4), cv::Size(9, 9)));
            cv::minMaxLoc(ROI, &minVal, &maxVal);
            if (src.at<uchar>(cv::Point(x - 4, y - 4)) >= 0.6 * maxVal) {
                out.at<uchar>(cv::Point(x - 4, y - 4)) = 255;
            } else {
                out.at<uchar>(cv::Point(x - 4, y - 4)) = 0;
            }
        }
    }
}

You can do this with a dilation followed by a comparison in OpenCV:
im = load image here;
di = dilate im with a 9x9 kernel;
bw = im > (di * 0.5); // in OpenCV, pixels of bw are set to 255 or 0
A simple example to illustrate this with a 4x6 image and a 3x3 kernel in Matlab/Octave:
im =
1 2 3 4 5 6
2 3 4 5 6 7
3 4 5 6 7 8
4 5 6 7 8 9
di =
3 4 5 6 7 7
4 5 6 7 8 8
5 6 7 8 9 9
5 6 7 8 9 9
th = di * .5
th =
1.5000 2.0000 2.5000 3.0000 3.5000 3.5000
2.0000 2.5000 3.0000 3.5000 4.0000 4.0000
2.5000 3.0000 3.5000 4.0000 4.5000 4.5000
2.5000 3.0000 3.5000 4.0000 4.5000 4.5000
bw = im > th
bw =
0 0 1 1 1 1
0 1 1 1 1 1
1 1 1 1 1 1
1 1 1 1 1 1

I fear this approach is not entirely correct. Let me explain: for operations involving a kernel, one must be careful to place the center of the kernel on top of the pixel that is going to be transformed. That's because a 3x3, 5x5, 7x7, 9x9 (...) kernel just computes the value for one pixel in the image, which is the one positioned at the center [0,0] of the kernel.
If you think about how to compute the value for the first pixel of the image, the center of a 9x9 kernel is going to be placed at coordinate [0,0]. That means that 3/4 of the kernel are going to be placed at negative coordinates, i.e. coordinates that refer to pixels that don't exist:
[-4,-4][-3,-4][-2,-4][-1,-4][ 0,-4][ 1,-4][ 2,-4][ 3,-4][ 4,-4]
[-4,-3][-3,-3][-2,-3][-1,-3][ 0,-3][ 1,-3][ 2,-3][ 3,-3][ 4,-3]
[-4,-2][-3,-2][-2,-2][-1,-2][ 0,-2][ 1,-2][ 2,-2][ 3,-2][ 4,-2]
[-4,-1][-3,-1][-2,-1][-1,-1][ 0,-1][ 1,-1][ 2,-1][ 3,-1][ 4,-1]
[-4, 0][-3, 0][-2, 0][-1, 0][ 0, 0][ 1, 0][ 2, 0][ 3, 0][ 4, 0]
[-4, 1][-3, 1][-2, 1][-1, 1][ 0, 1][ 1, 1][ 2, 1][ 3, 1][ 4, 1]
[-4, 2][-3, 2][-2, 2][-1, 2][ 0, 2][ 1, 2][ 2, 2][ 3, 2][ 4, 2]
[-4, 3][-3, 3][-2, 3][-1, 3][ 0, 3][ 1, 3][ 2, 3][ 3, 3][ 4, 3]
[-4, 4][-3, 4][-2, 4][-1, 4][ 0, 4][ 1, 4][ 2, 4][ 3, 4][ 4, 4]
This is always going to happen with pixels near the border of the image. So for the computation of the first pixel, we would have to restrict the computation to 1/4 of the kernel, which refers to valid coordinates in the target image:
[ ][ ][ ][ ][ ][ ][ ][ ][ ]
[ ][ ][ ][ ][ ][ ][ ][ ][ ]
[ ][ ][ ][ ][ ][ ][ ][ ][ ]
[ ][ ][ ][ ][ ][ ][ ][ ][ ]
[ ][ ][ ][ ][ 0, 0][ 1, 0][ 2, 0][ 3, 0][ 4, 0]
[ ][ ][ ][ ][ 0, 1][ 1, 1][ 2, 1][ 3, 1][ 4, 1]
[ ][ ][ ][ ][ 0, 2][ 1, 2][ 2, 2][ 3, 2][ 4, 2]
[ ][ ][ ][ ][ 0, 3][ 1, 3][ 2, 3][ 3, 3][ 4, 3]
[ ][ ][ ][ ][ 0, 4][ 1, 4][ 2, 4][ 3, 4][ 4, 4]
So the problem with your current approach is that at some point you will set up a ROI with negative coordinates, and when these instructions are executed you will see a nice crash:
cv::Mat ROI_Mat = input(ROI); // crash
The solution is not to use a ROI but to implement the algorithm yourself; I just can't see this custom computation working with cv::filter2D(). Here's a little something to help you get started:
void local_threshold(const cv::Mat& input, cv::Mat& output)
{
    if (input.channels() != 1)
    {
        std::cout << "local_threshold !!! input image must be single channel" << std::endl;
        return;
    }
    output = cv::Mat(input.rows, input.cols, CV_8UC1);
    double min_val = 0, max_val = 0;
    for (int i = 0; i < input.rows; i++)
        for (int j = 0; j < input.cols; j++)
        {
            cv::Mat kernel = cv::Mat::zeros(9, 9, output.type());
            // Implement logic to fill the 9x9 kernel with
            // values from the input Mat, respecting boundaries.
            cv::Scalar avg_intensity = cv::mean(kernel);
            cv::minMaxLoc(kernel, &min_val, &max_val);
            if (input.at<uchar>(i, j) > (max_val / 2))
                output.at<unsigned char>(i, j) = 255;
            else
                output.at<unsigned char>(i, j) = 0;
        }
}

After further thinking and working out how to apply my basic programming knowledge, I came up with this code. It isn't the most efficient, but it gets the job done.
What was the main problem with my approach?
Boundary pixels were one of the main problems, and the whole indexing operation between kernel and mask caused a slight headache.
What was my approach to solving that matter?
My threshold requires a relatively high intensity level to set true pixels. Therefore I padded the image with imaginary zero-intensity border pixels and made my algorithm start at the first pixel of the original, saving the results to a mask.
Result:
SUCCESS!
Code:
void applyLocalThresh(cv::Mat &src, cv::Mat &out) {
    double maxVal, minVal;
    cv::Mat output;
    int top, bottom, left, right;
    int borderType = cv::BORDER_CONSTANT;
    cv::Scalar value;
    top = 4; bottom = 4;
    left = 4; right = 4;
    out = src.clone(); // clone so writes to the mask don't feed back into src
    value = 0;
    cv::copyMakeBorder(src, output, top, bottom, left, right, borderType, value);
    for (int y = 4; y < output.rows - 4; y++) {
        for (int x = 4; x < output.cols - 4; x++) {
            // apply local ROI
            cv::Mat ROI = output(cv::Rect(cv::Point(x - 4, y - 4), cv::Size(9, 9)));
            cv::minMaxLoc(ROI, &minVal, &maxVal); // extract max intensity in the ROI
            if (src.at<uchar>(cv::Point(x - 4, y - 4)) >= 0.5 * maxVal) { // local threshold w.r.t. highest intensity
                out.at<uchar>(cv::Point(x - 4, y - 4)) = 255; // set mask pixel if true
            } else {
                out.at<uchar>(cv::Point(x - 4, y - 4)) = 0;
            }
        }
    }
}
It needs some clean-up, I know, but hopefully this will help others get the idea.

Related

I'm having trouble with changing parts of an array

I'm currently working on a small game to play in the console. I'm trying to make player movement, but when I try to replace a certain element within the level array, it deletes the rest of the array.
The only movement in the code right now is moving right (type 2 in the console to move right)
#include <iostream>
using namespace std;
#define con std::cout <<
#define newline std::cout << '\n'
#define text std::cin >>
#define end return 0
#define repeat while (true)
int width, height;
int rprog;
int x, y, z;
int playerpos;
int input;
double level[] =
{1, 1, 1, 1, 1, 1,
1, 0, 0, 0, 0, 1,
1, 0, 2, 0, 0, 1,
1, 0, 0, 0, 0, 1,
1, 0, 0, 0, 0, 1,
1, 1, 1, 1, 1, 1};
const char *display[] = {" ", "[ ]", "[X]"};
int render () {
x = 1;
y = 1;
while (x < 37) {
z = level[x - 1];
con display[z];
x = x + 1;
y = y + 1;
if (y == 7) {
y = 1;
newline;
}
}
end;
}
int player () {
con "Please make your next move : w: 1, a: 2, s: 3, d: 4";
newline;
con "Current position: " << playerpos;
newline;
text input;
if (input == 2) {
level[playerpos] = 0;
playerpos = playerpos - 1;
level[playerpos] = 3;
}
end;
}
int main() {
playerpos = 15;
while (true) {
render ();
player ();
}
end;
}
I'm using this website for coding currently: https://www.programiz.com/cpp-programming/online-compiler/
This is the output:
[ ][ ][ ][ ][ ][ ]
[ ] [ ]
[ ] [X] [ ]
[ ] [ ]
[ ] [ ]
[ ][ ][ ][ ][ ][ ]
Please make your next move : w: 1, a: 2, s: 3, d: 4
Current position: 15
2
[ ][ ][ ][ ][ ][ ]
[ ] [ ]
[ ]
And then it cuts off rendering the level.
I'm confused. What am I doing wrong?
Arrays
Array indices start with 0 in C++.
You set the item at the new position to 3:
level[playerpos] = 3;
However, your array for the display types has only 3 elements (0, 1, 2):
const char *display[] = {" ", "[ ]", "[X]"};
Thus, you encounter undefined behaviour, as you have an out of bounds access.
Note also that your initial array correctly uses a 2 for the player position, and thus works.
However, it also has an off-by-one error: you initialize playerpos = 15, but place the 2 at index 14. Thus the initial rendering is wrong, so the first movement will not look correct and will seem to stay on the same position.
Types
As #RemyLebeau mentions, why do you use a double array for the game state? Not only would other types be more appropriate; double in particular can lead to serious, hard-to-debug problems. Not all integers are perfectly representable by a double, and type conversions could lead to different results.
Just for an example: if you add states 4 and 5, imagine a double could not represent 5 exactly but stored it as 4.99999999999999999 instead. When accessing the array, integer conversion could then yield state 4 instead.
Check this question and answer for details
Defines
As #KarenMelikyan mentioned in a comment, those #defines are a bad idea. They make your code much harder for others to read, and they are a bad habit to develop. Better to get acquainted with correct C++ syntax and use it.

Matrix masking operation in OpenCV(C++) and in Matlab

I would like to do the following operation (which is at the current state in Matlab) using cv::Mat variables.
I have matrix mask:
mask =
1 0 0
1 0 1
then matrix M:
M =
1
2
3
4
5
6
3
and samples = M(mask,:)
samples =
1
2
6
My question is, how can I perform the same operation like, M(mask,:), with OpenCV?
To my knowledge, the closest function to this is OpenCV's copyTo, which takes a matrix and a mask as inputs. But that function keeps the original structure of your matrix, as you can test.
I think there is no problem with using a for loop in OpenCV (in C++) because it's fast. I propose the loop below:
Mat M = (Mat_<uchar>(2,3) << 1, 2, 3, 4, 5, 6); // create M
cout << M << endl;
Mat mask = (Mat_<uchar>(2,3) << 1, 0, 0, 1, 0, 1); // create mask (CV_8U)
cout << mask << endl;
Mat samples;

for (int i = 0; i < M.total(); i++)
{
    if (mask.at<uchar>(i))
        samples.push_back(M.at<uchar>(i));
}
cout << samples << endl;
The code above produces the output below:
[ 1, 2, 3;
4, 5, 6]
[ 1, 0, 0;
1, 0, 1]
[ 1;
4;
6]
Using copyTo, your output would instead look like this:
[1 0 0
4 0 6];

calculate gradient directions

I want to calculate the angles of gradients from a depth map and group them into some directions (8 sectors).
But my function only ever fills the first 3 directions.
cv::Mat calcAngles(cv::Mat dimg)//dimg is depth map
{
const int directions_num = 8;//number of directions
const int degree_grade = 360;
int range_coeff = 255 / (directions_num + 1);//just for visualize
cv::Mat x_edge, y_edge, full_edge, angles;
dimg.copyTo(x_edge);
dimg.copyTo(y_edge);
dimg.copyTo(full_edge);
//compute gradients
Sobel( dimg, x_edge, CV_8U, 1, 0, 5, 1, 19, 4 );
Sobel( dimg, y_edge, CV_8U, 0, 1, 5, 1, 19, 4 );
Sobel( dimg, full_edge, CV_8U, 1, 1, 5, 1, 19, 4 );
float freq[directions_num + 1];//for collect direction's frequency
memset(freq, 0, sizeof(freq));
angles = cv::Mat::zeros(dimg.rows, dimg.cols, CV_8U);//store directions here
for(int i = 0; i < angles.rows; i++)
{
for(int j = 0; j < angles.cols; j++)
{
angles.at<uchar>(i, j) = (((int)cv::fastAtan2(y_edge.at<uchar>(i, j), x_edge.at<uchar>(i, j))) / (degree_grade/directions_num) + 1
) * (dimg.at<uchar>(i, j) ? 1 : 0); // fastAtan2 returns values from 0 to 360, if I'm not mistaken. I want to group angles into directions_num sectors. I use the first 'direction' (zero value) for zero values from the depth map (a zero value in my depth map marks a bad pixel)
freq[angles.at<uchar>(i, j)] += 1;
}
}
for(int i = 0; i < directions_num + 1; i++)
{
printf("%2.2f\t", freq[i]);
}
printf("\n");
angles *= range_coeff;//for visualization
return angles;
}
Output from one of the frames:
47359.00 15018.00 8199.00 6224.00 0.00 0.00 0.00 0.00 0.00
(the first value counts "zero pixels"; the following values count the gradients falling in each sector, but only 3 are non-zero)
Visualization
Is there a way out? Or is this result OK?
PS Sorry for my writing mistakes. English is not my native language.
You used the CV_8U type for the Sobel output. It is an unsigned 8-bit integer, so it can store only non-negative values. That's why fastAtan2 returns values less than or equal to 90. Change the type to CV_16S and use the short type for accessing the elements:
cv::Sobel(dimg, x_edge, CV_16S, 1, 0, 5, 1, 19, 4);
cv::Sobel(dimg, y_edge, CV_16S, 0, 1, 5, 1, 19, 4);
cv::fastAtan2(y_edge.at<short>(i, j), x_edge.at<short>(i, j))

OpenGl glTexImage2D data

I am trying to read floating-point numbers from a CSV file that contains a precomputed texture, store them in a one-dimensional array, and then put that data into a 2D texture.
I need to make sure the following code does that, because I have problems accessing the data and I cannot figure out where the error is:
// Allocate memory
float * image = new float [width * height * 3 ];
for( int i = 0; i < height; i++)
{
for( int j = 0; j < width-1; j++)
{
fscanf( fDataFile, "%f,", &fData );
image[ 4 * i * j + 0 ] = fData;
image[ 4 * i * j + 1 ] = fData;
image[ 4 * i * j + 2 ] = fData;
}
fscanf( fDataFile, "%f", &fData );
image[ 4 * i * width-1 + 0 ] = fData;
image[ 4 * i * width-1 + 1 ] = fData;
image[ 4 * i * width-1 + 2 ] = fData;
}
There shouldn't be a problem here, but what troubles me is the following:
// create the texture
glGenTextures(1, &texHandle);
glBindTexture(GL_TEXTURE_2D, texHandle);
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, width, height, 0, GL_RGB, GL_FLOAT, &image[0]);
Is it okay to just give glTexImage2D the pointer to my one-dimensional array?
The size of the array is width*height*3 and the texture's format should be width*height with 3 channels... so the size should be okay, I guess?!
Still, my program won't work as expected, and this is one potential source of the error.
I solved my messed-up texture reading. I don't know what got into me, but the initialization of my array was pure nonsense. Here is the corrected code; I found the problem when trying to write a test texture:
// Allocate memory
float * image = new float [width * height * 3 ];
for( int i = 0; i < height; i++)
{
for( int j = 0; j < width-1; j++)
{
fscanf( fDataFile, "%f,", &fData );
image[ 3 * (i * width + j) + 0 ] = fData;
image[ 3 * (i * width + j) + 1 ] = fData;
image[ 3 * (i * width + j) + 2 ] = fData;
//image[ 4 * i * j + 2 ] = 1.0f;
}
fscanf( fDataFile, "%f", &fData );
image[ 3 * (i * width + width-1) + 0 ] = fData;
image[ 3 * (i * width + width-1) + 1 ] = fData;
image[ 3 * (i * width + width-1) + 2 ] = fData;
//image[ 4 * i * width-1 + 2 ] = 1;
}
Furthermore, it now works independent of the internal format: GL_RGB, GL_RGBA, GL_RGB32F and GL_RGBA32F all work fine without changing the way I read my texture.
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, width, height, 0, GL_RGB, GL_FLOAT, &image[0]);
You should be using a floating-point internal format. For example, GL_RGB32F. That should be the third parameter.

Copying upper MatrixXd to lower MatrixXd (Eigen3) C++ library

I've got a lower triangular MatrixXd and I want to copy its lower values to the upper side as it'll become a symmetric matrix. How can I do it?
So far I've done:
MatrixXd m(n,n);
.....
//do something with m
for(j=0; j < n; j++)
{
for(i=0; i<j; i++)
{
m(i,j) = m(j,i);
}
}
Is there a faster way to do it? I was thinking of some internal method able to "copy" the lower triangular part to the upper.
Say I've got this matrix, we call m:
1 2 3
4 5 6
7 8 9
what I need to obtain in m is :
1 4 7
4 5 8
7 8 9
I also know you can get the upper or the lower part of the matrix to do something:
MatrixXd m1(n,n);
m1 = m.triangularView<Eigen::Upper>();
cout << m1 <<endl;
1 2 3
0 5 6
0 0 9
But I can't yet get what I want...
I assume here that you are referring to the Eigen3 C++ library; this is not clear from your question. If not, you should consider it. In any case, within Eigen there is no need to actually copy the triangular part to get a selfadjoint matrix. Eigen has the concept of views, and you can use a selfadjoint view to perform an operation like, e.g.:
using namespace Eigen;
MatrixXd m(n,n);
...
(generate upper triangular entries in m)
...
VectorXd r(n), p(n);
r = m.selfadjointView<Upper>() * p;
here is a small example to illustrate using fixed size matrices:
#include <Eigen/Core>
using namespace std;
using namespace Eigen;
int main()
{
Matrix2d m,c;
m << 1, 2,
0, 1;
Vector2d x(0,2), r;
// perform copy operation
c = m.selfadjointView<Upper>();
cout << c << endl;
// directly apply selfadjoint view in matrix operation
// (no entries are copied)
r = m.selfadjointView<Upper>() * x;
}
the output will be
[1, 2,
2, 1].
Now the result in r is the same as if you had used c * x instead; there is just no need to copy values out of the original matrix to make it selfadjoint.
In case the selfadjointView is not an option for you, the solution is to use triangularView on the destination matrix:
m.triangularView<Lower>() = m.transpose();
The simplest way I can think of is copying the lower part of the m matrix, transposed, onto the upper part:
m.triangularView<Upper>() = m.transpose();
For example, the following code:
MatrixXd m(3,3);
m << 1, 2, 3, 4, 5, 6, 7, 8, 9;
m.triangularView<Upper>() = m.transpose();
std::cout << m << std::endl;
Gives the output you asked for:
1 4 7
4 5 8
7 8 9
Regards.
Simply:
m = m.selfadjointView<Upper>();
I think you are doing it the right way. If you knew some details about the memory layout of the data in the matrix, you could use some low-level optimizations; one such technique is loop tiling.
If speed is a big issue, I would not copy anything: just decorate/wrap the matrix object with a coordinate-inverting object that flips (x,y) to (y,x). If you make the () operator an inline function, it will not incur any significant cost when you use it.
This works; you can cut some of the work, but you still need at least about n*m/2 assignments, so it is only roughly a 2x saving.
edit: I see that you use this MatrixXd object... the syntax is different, but the algorithm is the same anyway.
#include <stdio.h>

int main()
{
    int mat[4][4] = {
        {  0,  1,  2,  3 },
        {  4,  5,  6,  7 },
        {  8,  9, 10, 11 },
        { 12, 13, 14, 15 }
    };
    int i, j;

    for (i = 0; i < 4; i++)
    {
        for (j = 0; j < 4; j++)
            printf("%02d", mat[i][j]);
        printf("\n");
    }
    printf("\n");

    /* copy the lower triangle onto the upper triangle */
    for (i = 1; i < 4; i++)
    {
        for (j = 0; j < i; j++)
            mat[j][i] = mat[i][j];
    }

    for (i = 0; i < 4; i++)
    {
        for (j = 0; j < 4; j++)
            printf("%02d ", mat[i][j]);
        printf("\n");
    }
    printf("\n");
    scanf("%d", &i); /* pause before exit */
}