HOG feature based recognition - C++

Scene
I used a Haar cascade to detect the face, and that face ROI was passed on to estimate its HOG features. While calculating the HOG features I used 8x8 patches of the image (ensuring the image was always a square matrix).
Problem
I create a 9-bin histogram for each patch. For example, for an image ROI of 177x177 I obtain 484 8x8 patches, and each patch yields a 9-bin histogram (a 9-dimensional vector). So for the whole image I obtain 484 concatenated histograms, i.e. a 4356-dimensional vector. I don't want to apply PCA or any other kind of dimensionality reduction at the moment. The problem is that a single image already generates such a large vector; what happens if I feed 5 images to the network? And the dimension changes with the picture size, so it isn't even fixed! Feels like boom!
So what should I do?
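One common remedy, offered here as a hedged editorial sketch rather than anything from the thread: resize every detected face ROI to a fixed window before computing HOG, so the descriptor length is identical for every image. With OpenCV's built-in cv::HOGDescriptor (the 64x64 window and the function name are assumptions) that could look like this:
#include <opencv2/opencv.hpp>
#include <vector>

// Sketch: fixed-length HOG by resizing the ROI to a fixed window first.
// winSize 64x64, blockSize 16x16, blockStride 8x8, cellSize 8x8, 9 bins
// -> ((64-16)/8+1)^2 = 49 blocks * 4 cells * 9 bins = 1764 floats, always.
std::vector<float> hogOfFace(const cv::Mat& faceROI)
{
    cv::Mat fixed;
    cv::resize(faceROI, fixed, cv::Size(64, 64)); // force a fixed input size
    cv::HOGDescriptor hog(cv::Size(64, 64), cv::Size(16, 16),
                          cv::Size(8, 8), cv::Size(8, 8), 9);
    std::vector<float> descriptor;
    hog.compute(fixed, descriptor);
    return descriptor; // constant length, so the network input is fixed
}
With a fixed descriptor length, 5 images simply give a 5x1764 training matrix instead of rows of varying size.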

Related

Depth/Disparity Map from a moving camera in OpenCV

Is it possible to get the depth/disparity map from a moving camera? Let's say I capture an image at location x, then travel, say, 5 cm and capture another picture, and from those two I calculate the depth map of the scene.
I have tried using block matching in OpenCV, but the result is not good. The first and second images are as follows:
first image, second image,
disparity map (colour), disparity map
My code is as follows:
GpuMat leftGPU;
GpuMat rightGPU;
leftGPU.upload(left);
rightGPU.upload(right);
GpuMat disparityGPU;
GpuMat disparityGPU2;
Mat disparity, disparity1, disparity2;
Ptr<cuda::StereoBM> stereo = cuda::createStereoBM(256, 3); // numDisparities, blockSize
stereo->setMinDisparity(-39);
stereo->setPreFilterCap(61);
stereo->setPreFilterSize(3);
stereo->setSpeckleRange(1);
stereo->setUniquenessRatio(0);
stereo->compute(leftGPU, rightGPU, disparityGPU);
cuda::drawColorDisp(disparityGPU, disparityGPU2, 256);
disparityGPU.download(disparity);
disparityGPU2.download(disparity2);
imshow("display img", disparity); // show the downloaded Mat, not the GpuMat
How can I improve on this? In the colour disparity map there are quite a lot of errors (e.g. the tall circle is red, the same colour as some parts of the table). Also, the disparity map contains small noise (all the black dots in the picture); how can I fill those black dots with nearby disparities?
It is possible if the object is static.
To properly do stereo matching, you first need to rectify your images! If you don't have calibrated cameras, you can do this from detected feature points. Also note that for cuda::StereoBM the minimum default disparity is 0. (I have never used cuda, but I don't think your setMinDisparity is doing anything; see this answer.)
Now, in your example images corresponding points are only about 1 row apart, so your disparity map actually doesn't look too bad. Maybe a larger blockSize would already be enough in this special case.
Finally, your objects have very low texture, therefore the block matching algorithm can't detect much.
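As a hedged illustration of the rectification-from-feature-points suggestion (an editorial sketch, not code from the answer; ORB, the match handling, and all names are assumptions):
#include <opencv2/opencv.hpp>
using namespace cv;

// Rectify an uncalibrated stereo pair from matched feature points.
void rectifyPair(const Mat& left, const Mat& right, Mat& leftRect, Mat& rightRect)
{
    Ptr<ORB> orb = ORB::create(2000);
    std::vector<KeyPoint> kpL, kpR;
    Mat descL, descR;
    orb->detectAndCompute(left,  noArray(), kpL, descL);
    orb->detectAndCompute(right, noArray(), kpR, descR);

    BFMatcher matcher(NORM_HAMMING, true); // cross-check matching
    std::vector<DMatch> matches;
    matcher.match(descL, descR, matches);

    std::vector<Point2f> ptsL, ptsR;
    for (size_t i = 0; i < matches.size(); i++) {
        ptsL.push_back(kpL[matches[i].queryIdx].pt);
        ptsR.push_back(kpR[matches[i].trainIdx].pt);
    }

    // Fundamental matrix with RANSAC, then uncalibrated rectification.
    Mat F = findFundamentalMat(ptsL, ptsR, FM_RANSAC, 3.0, 0.99);
    Mat H1, H2;
    stereoRectifyUncalibrated(ptsL, ptsR, F, left.size(), H1, H2);
    warpPerspective(left,  leftRect,  H1, left.size());
    warpPerspective(right, rightRect, H2, right.size());
}
After this, corresponding points lie on (approximately) the same row, which is what block matching assumes.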

How to project IR image on a 2D plane using OpenCV and PCL

I have a Kinect and I'm using OpenCV and the Point Cloud Library. I would like to project the IR image onto a 2D plane for forklift pallet detection. How would I do that?
I'm trying to detect the pallet for the forklift; here is an image:
Where is the RGB data? You can use it to help with the detection. You do not need to project the image onto any plane to detect a pallet. There are basically 2 approaches used for detection:
non-deterministic based on neural networks, fuzzy logic, machine learning, etc.
This approach needs a training dataset to recognize the object. Much experience is needed for proper training-set and classifier architecture/topology selection. But other than that you do not need to program it, as usually some readily available lib/tool is used; just configure it and pass the data.
deterministic based on distance or correlation coefficients
I would start with detecting specific features like:
pallet has specific size
pallet has sharp edges and specific geometry shape in depth data
pallet has specific range of colors (yellowish wood +/- lighting and dirt)
wood has specific texture patterns
So compute, for each feature, a coefficient describing how close the object is to a real pallet, and then just threshold the combined distance over all coefficients (possibly weighted, as some features are more robust).
I do not use the #1 approach, so I would go for #2: combine the RGB and depth data (they have to be matched exactly), then segment the image (based on depth and color), and after that classify each found object as pallet or not...
[Edit1]
Your colored image does not correspond to the depth data. The aligned gray-scale image has poor quality and the depth data image is also very poor. Is the depth data processed somehow (losing precision)? If you look at your data from different sides:
You can see how poor it is, so I doubt you can use the depth data for detection at all...
PS: I used my Align already captured rgb and depth images answer for the visualization.
The only thing left is the colored image, so detect areas with matching color only, then detect the features and classify. The color of your pallet in the image is almost white. Here the image is HSV-reduced to 16 basic colors (too lazy to segment):
You should obtain the range of pallet colors possible in your setup to ease the detection. Then check the found objects for features like size, shape, area, circumference...
[Edit2]
So I would start with image preprocessing:
convert to HSV
threshold only pixels close to the pallet color
I chose (H=40, S=18, V>100) as the pallet color. My HSV ranges are <0,255> per channel, so the hue angle difference can be at most <-180deg,+180deg>, which corresponds to <-128,+128> in my ranges.
remove too thin areas
Just scan all horizontal and vertical lines, count consecutive set pixels, and if the run is too small, recolor it to black...
This is the result:
On the left is the original image (downsized to fit this page), in the middle is the color threshold result, and on the right is the result after filtering out small areas. You can play with the thresholds and pallet color to change the behavior to suit your needs.
Here is the C++ code:
int tr_d=10;    // min size of pallet [pixels]
int h,s,v,x,y,xx;
color c;
pic1=pic0;
pic1.pf=_pf_rgba;
pic2.resize(pic1.xs*3,pic1.ys); xx=0;
pic2.bmp->Canvas->Draw(xx,0,pic0.bmp); xx+=pic1.xs;
// [color selection]
for (y=0;y<pic1.ys;y++)
 for (x=0;x<pic1.xs;x++)
    {
    // get color from image
    c=pic0.p[y][x];
    rgb2hsv(c);
    // distance to white-yellowish color in HSV (H=40,S=18,V>100)
    h=c.db[picture::_h]-40;
    s=c.db[picture::_s]-18;
    v=c.db[picture::_v];
    // hue is cyclic angular so use only the shorter angle
    if (h<-128) h+=256;
    if (h>+128) h-=256;
    // abs value
    if (h< 0) h=-h;
    if (s< 0) s=-s;
    // threshold close colors
    c.dd=0;
    if (h<25)
     if (s<25)
      if (v>100)
       c.dd=0x00FFFFFF;
    pic1.p[y][x]=c;
    }
pic2.bmp->Canvas->Draw(xx,0,pic1.bmp); xx+=pic1.xs;
// [remove too thin areas]
for (y=0;y<pic1.ys;y++)
 for (x=0;x<pic1.xs;)
    {
    for (   ;x<pic1.xs;x++) if ( pic1.p[y][x].dd) break; // find set pixel
    for (h=x;x<pic1.xs;x++) if (!pic1.p[y][x].dd) break; // find unset pixel
    if (x-h<tr_d) for (;h<x;h++) pic1.p[y][h].dd=0;      // if run too short recolor to zero
    }
for (x=0;x<pic1.xs;x++)
 for (y=0;y<pic1.ys;)
    {
    for (   ;y<pic1.ys;y++) if ( pic1.p[y][x].dd) break; // find set pixel
    for (h=y;y<pic1.ys;y++) if (!pic1.p[y][x].dd) break; // find unset pixel
    if (y-h<tr_d) for (;h<y;h++) pic1.p[h][x].dd=0;      // if run too short recolor to zero
    }
pic2.bmp->Canvas->Draw(xx,0,pic1.bmp); xx+=pic1.xs;
See How to extract the borders of an image (OCT/retinal scan image) for the description of the picture and color classes, or look at any of my DIP/CV tagged answers. The code is well commented and straightforward, but I just need to add:
You can ignore the pic2 stuff; it is just the composite image posted above, so I do not need to manually print-screen and merge the sub-results in Paint... To improve robustness you should add dynamic-range enhancement (so the thresholds see the same conditions for any input image). Also you should compare against more than just a single color (if more wood types of pallets are present).
Now you should segment or label the areas:
loop through the entire image
find the first pixel set to the pallet color
flood fill the area with some distinct ID color different from the set pallet color
I use black 0x00000000 for empty space and white 0x00FFFFFF as the pallet pixel color, so use ID = {1,2,3,4,5,...}. Also remember the number of filled pixels (that is your area) so you do not need to compute it again. You can also compute the bounding box directly while filling.
compute and compare features
You need to experiment with more than one image to find out which properties are good for detection. I would go for the circumference-vs-area ratio and/or the bounding box size... The circumference can be extracted by simply selecting all pixels with the proper ID color that neighbor a black pixel.
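As a hedged alternative to the custom picture class above (an editorial sketch, not part of the original answer): plain OpenCV can do the labeling, area, and bounding-box bookkeeping in one call, assuming the thresholded result is available as an 8-bit binary mask:
#include <opencv2/opencv.hpp>

// mask: 8-bit binary image, 255 = pallet-colored pixel (assumed input)
void labelAndMeasure(const cv::Mat& mask)
{
    cv::Mat labels, stats, centroids;
    int n = cv::connectedComponentsWithStats(mask, labels, stats, centroids);
    for (int id = 1; id < n; id++) // label 0 is the background
    {
        int area = stats.at<int>(id, cv::CC_STAT_AREA);
        cv::Rect bbox(stats.at<int>(id, cv::CC_STAT_LEFT),
                      stats.at<int>(id, cv::CC_STAT_TOP),
                      stats.at<int>(id, cv::CC_STAT_WIDTH),
                      stats.at<int>(id, cv::CC_STAT_HEIGHT));
        // circumference: count pixels of this label that touch the background,
        // then compare area, circumference/area ratio, and bbox size against
        // values measured on known pallets to classify the blob
    }
}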
See also the similar question Fracture detection in hand using image proccessing.
Good luck and have fun ...

Create the voronoi diagram with openCv and C++

I have a little problem. I need to create the Voronoi diagram of a BW image using OpenCV and C++. The result should be something like the output of the Matlab function voronoin.
The goal is to create a mask for each region of the diagram.
This is an example I made in Matlab:
matlab voronoi diagram
So, for each region I should create a mask or give it a different color.
I tried the OpenCV function distanceTransform in order to get the Voronoi labels.
Mat bwCoresGoodInv = 255 - bwCoresGood;
Mat distTr, voronoiLabels;
distanceTransform(bwCoresGoodInv, distTr, voronoiLabels,
                  DIST_L2, DIST_MASK_PRECISE, DIST_LABEL_PIXEL);
namedWindow("voronoiDistLab", WINDOW_AUTOSIZE);
voronoiLabels = voronoiLabels * 5; // scale label IDs so regions are distinguishable on screen
imshow("voronoiDistLab", voronoiLabels);
The result is the following image:
voronoi labels openCV
As you can see, in each region there are different colors (in particular, there is something at the location of each cell); is there a way to have just one color per region?
Thank you in advance.
If you are asking how to get different colors than the gray-scale values produced by displaying the labels, one approach (probably not the most efficient) is to run cv::findContours on an edge-detected version of the label image, then iterate through each contour found and draw it onto a new image, either filled or outlined. It's not super exact and can leave gaps; some dilation on the edge image may be required.
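A hedged sketch of that contour approach (an editorial illustration; the Canny thresholds and names are assumptions):
// voronoiLabels: CV_32S label image from distanceTransform (assumed available)
Mat lab8u, edges;
voronoiLabels.convertTo(lab8u, CV_8U, 5);   // squeeze label IDs into the 8-bit range
Canny(lab8u, edges, 1, 3);                  // edges = region boundaries
dilate(edges, edges, Mat());                // close small gaps in the boundaries
std::vector<std::vector<Point>> contours;
findContours(edges, contours, RETR_LIST, CHAIN_APPROX_SIMPLE);
Mat out = Mat::zeros(edges.size(), CV_8UC3);
RNG rng(12345);
for (size_t i = 0; i < contours.size(); i++)
    drawContours(out, contours, (int)i,
                 Scalar(rng.uniform(0, 256), rng.uniform(0, 256), rng.uniform(0, 256)),
                 FILLED);                   // one random color per region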
It would be very nice if distanceTransform returned a data structure that mapped the range of intensity values in the label image to every pixel having that value, for example a vector of binary images where the nth image in the vector is a binary mask isolating the nth label region; but as it stands, this has to be done by the user.
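Doing that by hand turns out to be short; a hedged sketch (assuming voronoiLabels from the question's code), since comparing the label image against each label ID yields one binary mask per region:
// One 8-bit mask per Voronoi region: 255 where the pixel carries that label.
double minVal, maxVal;
minMaxLoc(voronoiLabels, &minVal, &maxVal);   // labels run from 1 to maxVal
std::vector<Mat> regionMasks;
for (int id = 1; id <= (int)maxVal; id++)
    regionMasks.push_back(voronoiLabels == id);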

using OpenCV and SVM with images

I am having difficulty with reading an image, extracting features for training, and testing on new images in OpenCV using SVMs. Can someone please point me to a good link? I have looked at the OpenCV Introduction to Support Vector Machines, but it doesn't help with reading in images, and I am not sure how to incorporate it.
My goal is to classify pixels in an image; these pixels would belong to curves. I understand forming the training matrix (for instance,
image A
1,1 1,2 1,3 1,4 1,5
2,1 2,2 2,3 2,4 2,5
3,1 3,2 3,3 3,4 3,5
I would form my training matrix as [3][2] = { {1,1}, {1,2}, {1,3}, {1,4}, {1,5}, {2,1}, ... }.
However, I am a little confused about the labels. From my understanding, I have to specify which row (image) in the training matrix corresponds to a curve or non-curve. But how can I label a training matrix row (image) if some of its pixels belong to the curve and some do not? For example, in my training matrix [3][2] = { {1,1}, {1,2}, {1,3}, {1,4}, {1,5}, {2,1}, ... }, pixels {1,1} and {1,4} belong to the curve but the rest do not.
I've had to deal with this recently, and here's what I ended up doing to get SVM to work for images.
To train your SVM on a set of images, first you have to construct the training matrix for the SVM. This matrix is specified as follows: each row of the matrix corresponds to one image, and each element in that row corresponds to one feature of the class -- in this case, the color of the pixel at a certain point. Since your images are 2D, you will need to convert them to a 1D matrix. The length of each row will be the area of the images (note that the images must be the same size).
Let's say you wanted to train the SVM on 5 different images, and each image was 4x3 pixels. First you would have to initialize the training matrix. The number of rows in the matrix would be 5, and the number of columns would be the area of the image, 4*3 = 12.
int num_files = 5;
int img_area = 4*3;
Mat training_mat(num_files,img_area,CV_32FC1);
Ideally, num_files and img_area wouldn't be hardcoded, but obtained from looping through a directory and counting the number of images and taking the actual area of an image.
The next step is to "fill in" the rows of training_mat with the data from each image. Below is an example of how this mapping would work for one row.
I've numbered each element of the image matrix with where it should go in the corresponding row in the training matrix. For example, if that were the third image, this would be the third row in the training matrix.
You would have to loop through each image and set the value in the output matrix accordingly. Here's an example for multiple images:
As for how you would do this in code, you could use reshape(), but I've had issues with that due to matrices not being continuous. In my experience I've done something like this:
Mat img_mat = imread(imgname, 0); // I used 0 for greyscale
int ii = 0; // Current column in training_mat
for (int i = 0; i < img_mat.rows; i++) {
    for (int j = 0; j < img_mat.cols; j++) {
        training_mat.at<float>(file_num, ii++) = img_mat.at<uchar>(i, j);
    }
}
Do this for every training image (remembering to increment file_num). After this, you should have your training matrix set up properly to pass into the SVM functions. The rest of the steps should be very similar to examples online.
Note that while doing this, you also have to set up labels for each training image. So for example if you were classifying eyes and non-eyes based on images, you would need to specify which row in the training matrix corresponds to an eye and a non-eye. This is specified as a 1D matrix, where each element in the 1D matrix corresponds to each row in the 2D matrix. Pick values for each class (e.g., -1 for non-eye and 1 for eye) and set them in the labels matrix.
Mat labels(num_files,1,CV_32FC1);
So if the 3rd element in this labels matrix were -1, it means the 3rd row in the training matrix is in the "non-eye" class. You can set these values in the loop where you evaluate each image. One thing you could do is to sort the training data into separate directories for each class, and loop through the images in each directory, and set the labels based on the directory.
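A hedged sketch of that directory-based labeling idea (eye_files and non_eye_files are assumed lists of image paths, not names from the original answer):
// Hypothetical labeling loop: one list of files per class.
int file_num = 0;
for (size_t k = 0; k < eye_files.size(); k++, file_num++) {
    // ... fill training_mat row file_num from eye_files[k] as shown above ...
    labels.at<float>(file_num, 0) = 1.0f;   // "eye" class
}
for (size_t k = 0; k < non_eye_files.size(); k++, file_num++) {
    // ... fill training_mat row file_num from non_eye_files[k] as shown above ...
    labels.at<float>(file_num, 0) = -1.0f;  // "non-eye" class
}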
The next thing to do is set up your SVM parameters. These values will vary based on your project, but basically you would declare a CvSVMParams object and set the values:
CvSVMParams params;
params.svm_type = CvSVM::C_SVC;
params.kernel_type = CvSVM::POLY;
params.gamma = 3;
// ...etc
There are several examples online on how to set these parameters, like in the link you posted in the question.
Next, you create a CvSVM object and train it based on your data!
CvSVM svm;
svm.train(training_mat, labels, Mat(), Mat(), params);
Depending on how much data you have, this could take a long time. After it's done training, however, you can save the trained SVM so you don't have to retrain it every time.
svm.save("svm_filename"); // saving
svm.load("svm_filename"); // loading
To test your images using the trained SVM, simply read an image, convert it to a 1D matrix, and pass that in to svm.predict():
svm.predict(img_mat_1d);
It will return a value based on what you set as your labels (e.g., -1 or 1, based on my eye/non-eye example above). Alternatively, if you want to test more than one image at a time, you can create a matrix that has the same format as the training matrix defined earlier and pass that in as the argument. The return value will be different, though.
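For a single test image, a hedged sketch (the file name is a placeholder; the flattening mirrors the training loop above):
// Flatten one test image exactly like a training row, then classify it.
Mat test_img = imread("test.png", 0);        // hypothetical path, greyscale
Mat img_mat_1d(1, img_area, CV_32FC1);
int jj = 0;
for (int i = 0; i < test_img.rows; i++)
    for (int j = 0; j < test_img.cols; j++)
        img_mat_1d.at<float>(0, jj++) = test_img.at<uchar>(i, j);
float response = svm.predict(img_mat_1d);    // e.g. 1 = eye, -1 = non-eye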
Good luck!

how to fill image with gradient of 3 random colors?

I wonder how to fill an OpenCV Mat (image) with a radial gradient of 3 random colors at 3 random positions with 3 random radii, using the OpenCV 2 API. Is it possible in one line of code?
OpenCV is not the appropriate tool for that, because it's not an image manipulation library...
ImageMagick is better suited for that job. Check out this page: http://www.imagemagick.org/Usage/canvas/ After you have created the gradient image you can use it with OpenCV (for whatever you want to accomplish).