I'm doing some image processing to learn; I'm using OpenCV with the new C++ interface.
I have an RGB image. I read it in as grayscale, equalize the histogram, then use Canny to detect edges.
The question is: does the Canny function return a grayscale image or a binary one?
I use a double for loop to check the pixel values of the image that results from applying the Canny edge detector, and I get lots of completely different values. A binary image should contain only two values, 0 or 1 (or 0 or 255)... How can I get that?
My code:
imagen = imread(nombre_imagen,0); // read the image in grayscale
equalizeHist(imagen,imagen);
Canny(binary,binary,0.66*threshold,1.33*threshold,3,true);
for(int i=0;i<binary.rows;i++){
    for(int j=0;j<binary.cols;j++){
        std::cout << binary.at<float>(i,j) << ",";
    }
    std::cout << "\n";
}
This is a 183x183 matrix and I can see a lot of values, even some NaNs.
How can I make this binary?
For the Canny function I use a threshold obtained from the mean value of the grayscale image; I saw the technique here: http://www.kerrywong.com/2009/05/07/canny-edge-detection-auto-thresholding/.
Binary images have type uchar, which you need to cast to int (or unsigned int) if you want to print the 0-255 range:
for(int i=0;i<edges.rows;i++) {
    for(int j=0;j<edges.cols;j++) {
        std::cout << (int)edges.at<uchar>(i,j) << ",";
    }
    std::cout << "\n";
}
You should avoid the C-style cast, but I'll leave that as an exercise to the reader.
The documentation (http://opencv.willowgarage.com/documentation/cpp/imgproc_feature_detection.html?highlight=canny#Canny) says:
void Canny(const Mat& image, Mat& edges, double threshold1, double threshold2, int apertureSize=3, bool L2gradient=false)
Second parameter, edges: "The output edge map. It will have the same size and the same type as image."
In your call you pass the same Mat as both the input and the output edge map, which, I think, is what causes the strange results.
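Putting that fix together with the uchar access shown above, a minimal corrected sketch might look like this (the file name is a placeholder, and the mean-based auto-threshold follows the article linked in the question):

#include <opencv2/opencv.hpp>
#include <iostream>

int main()
{
    // Placeholder file name; substitute your own image.
    cv::Mat imagen = cv::imread("imagen.png", 0); // read the image in grayscale
    cv::equalizeHist(imagen, imagen);

    // Auto-threshold from the image mean, as in the linked article.
    double threshold = cv::mean(imagen)[0];

    cv::Mat edges; // separate output Mat; it will be CV_8UC1, same as the input
    cv::Canny(imagen, edges, 0.66 * threshold, 1.33 * threshold, 3, true);

    // Every pixel is now either 0 or 255:
    for (int i = 0; i < edges.rows; i++) {
        for (int j = 0; j < edges.cols; j++)
            std::cout << (int)edges.at<uchar>(i, j) << ",";
        std::cout << "\n";
    }
    return 0;
}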
I'm working on code that computes dense SIFT features from a set of images, based on SIFT flow: http://people.csail.mit.edu/celiu/SIFTflow/
I'd like to try building a FLANN index on these images by comparing the "energy" between each image in SIFT flow representation.
I have the code to compute the energy from here: http://richardt.name/publications/video-deanaglyph/
Is there a way to create my own distance function for the indexing?
RELATED NOTE:
I was finally able to get an alternate (but not custom) distance function working with flann::Index. The trick is you need to use flann::GenericIndex like so:
flann::GenericIndex<cvflann::ChiSquareDistance<int>> flannIndex(descriptors, cvflann::KDTreeIndexParams());
But you need to give it CV_32S descriptors.
And if you use knnSearch with this distance function, you have to provide a CV_32S results Mat and a CV_32F distances Mat.
Here's my full code in case it's helpful (not a lot of documentation out there):
Mat samples;
loadDescriptors(samples); // loading descriptors from .yml file
samples *= 100000; // scaling up my descriptors to be int
samples.convertTo(samples, CV_32S); // convert float to int
// create flann index
flann::GenericIndex<cvflann::ChiSquareDistance<int>> flannIndex(samples, cvflann::KDTreeIndexParams());
// NOTE lack of distance type in constructor parameters
// (unlike flann::index)
// now try knnSearch
int k=10; // find 10 nearest neighbors
Mat results(1,10,CV_32S), dists(1,10,CV_32F);
// (1,10) Mats for the output, types CV_32S and CV_32F
Mat responseHistogram;
responseHistogram = samples.row(60);
// choose a random row from the descriptors Mat
// to find nearest neighbors
flannIndex.knnSearch(responseHistogram, results, dists, k, cvflann::SearchParams(200) );
cout << results << endl;
cout << dists << endl;
flannIndex.save(ofToDataPath("indexChi2.txt"));
Using Chi Squared actually seems to work better for me than L2 distance. My feature vectors are BoW histograms in this case.
I'm trying to multiply two images in different color models, in my case HSV and YCrCb.
I get a "vector out of bounds" error every time.
I have checked the sizes of the input images being multiplied (the number of rows and columns). I know the values can exceed 255.
I tried to implement the method from "opencv - image multiplication", but that code has way too many Mats that have to be initialized. This also leads me to ask whether images with more than one channel can be multiplied at all. I also tried direct multiplication and it doesn't work, so I tried multiplying channel-wise. To make things easier I used the loop method, but then the error occurred.
A short summary of the code and the reason for it: I'm using it for skin detection but want to further reduce noise. I think this can be done by multiplying the two output images generated by the threshold operations (for HSV and YCrCb). Since these images contain different noise, the product should contain even less (I have viewed the outputs on different screens; the overlapping noise regions are very small). This way skin color can be detected almost all the time with minimal noise, which will help in tracking skin better.
The code given below is not complete because it never executes to the end. After this point there are only morphological and dilation operations.
This is my first time asking a question on Stack Overflow and I'm still learning OpenCV. Sorry if I've been over-descriptive; all suggestions are welcome. Thank you.
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv/cv.h>
#include <opencv/highgui.h>
#include <iostream>
#include <opencv2/imgproc/imgproc.hpp>
using namespace cv;
using namespace std;
char key;
Mat image,hsv,ycr;
vector<Mat> channels,ycrs,threshold_output;
int main()
{
    VideoCapture cap(0); // open the default camera
    if (!cap.isOpened()) // check if we succeeded
    {
        cout << "Cannot open the web cam" << endl;
        return -1;
    }
    while (1)
    {
        cap >> image;
        cvtColor(image, ycr, CV_BGR2YCrCb); // converts into YCrCb
        cvtColor(image, hsv, CV_BGR2HSV);   // converts into HSV
        Mat imgThresholded;
        Mat imgThresholded1;
        inRange(ycr, Scalar(0, 140, 105), Scalar(255, 165, 135), imgThresholded1); // YCrCb range
        inRange(hsv, Scalar(0, 48, 150), Scalar(20, 150, 255), imgThresholded);    // HSV range
        split(imgThresholded1, channels);
        split(imgThresholded, ycrs);
        for (int i = 0; i < 3; i++)
        {
            multiply(channels[i], ycrs[i], threshold_output[i], 1, -1);
        } // code breaks here
Even if the inputs to inRange are multi-channel, the output of inRange will be a single-channel CV_8UC1 image.
The reason is that inRange computes a Cartesian intersection:
Result (x, y) is true (uchar of 255) if ALL of these are true:
For first channel, lower[0] <= img(x, y)[0] <= upper[0], AND
For second channel, lower[1] <= img(x, y)[1] <= upper[1], AND
And so on.
In other words, after it has checked each channel's pixel values against the lower and upper bounds, the logical result is then "boiled down" with a logical-AND operation over the channels of the image.
"Boiled down" is my colloquial way of referring to a reduction, or fold, where a function accepts an arbitrary number of arguments and "reduces" them to a single value: summation, multiplication, string concatenation, etc.
It is therefore not necessary to use cv::split on the output of cv::inRange. In fact, because the output has only one channel, accessing channels[1] or ycrs[1] is undefined behavior: it will typically trigger an assertion in a debug build, and a crash or memory corruption in a release build.
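For what the question is ultimately after, a sketch (reusing imgThresholded and imgThresholded1 from the posted code) can combine the two masks directly, with no split or per-channel loop at all:

// Both inRange outputs are single-channel CV_8UC1 masks (0 or 255),
// so they can be ANDed together directly; no split() is needed.
Mat combined;
bitwise_and(imgThresholded, imgThresholded1, combined);

// multiply() also works; with 8-bit masks the 255*255 products
// saturate to 255, though a 1.0/255 scale factor is cleaner:
// multiply(imgThresholded, imgThresholded1, combined, 1.0 / 255.0);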
I have used the Canny edge detector to successfully identify the edges of a given image. I'm struggling with finding specific points on this detected edge line.
My approach:
I used the cv::Canny function in OpenCV and stored the output in a cv::Mat. I want to iterate through all the values of the matrix and identify the pixels where an edge is present, so that I can detect specific points on the detected edge line.
Function used:
cv::Canny(frame_gray,contours,50,150);
The output is stored in contours and it is of type CV_8UC3.
To access the pixel values, I have tried
contours.at<int>(i,j) != 0
and also
contours.at<uchar>(i,j) != 0
I will greatly appreciate help with the above: whether the approach is correct and I'm missing something, or whether I should try another approach.
Thanks
Edit:
for(int i=0;i<img_width;i++)
{
if((int)contours.at<uchar>(i,neckcenter.y) > 0 )
{
Point multipoints(i,neckcenter.y);
circle( contours, multipoints, neckpoint, Scalar( 255, 0, 0 ),4, 8, 0 );
cout << (int)contours.at<uchar>(i,neckcenter.y) << endl;
}
}
I am using the above code, which draws a small circle (radius neckpoint, here 1) wherever it detects a point on an edge. The neckcenter.y is a constant value derived from an earlier calculation. What am I doing wrong here?
Output of the code -
You probably want a grayscale pass before applying Canny:
Mat gray;
cvtColor(bgr,gray,CV_BGR2GRAY); // now gray is a 8bit, uchar Mat
Mat contours;
cv::Canny(gray,contours,50,150);
// now you're safe to use:
uchar value = contours.at<uchar>(i,j);
The syntax:
contours.at<uchar>(i,j)
Is correct for your case in terms of data type (i.e. a grayscale image). The problem is possibly hinted at by this line:
for(int i=0;i<img_width;i++)
When you access OpenCV pixels using at, you must specify the pixel position as (row, col), so your indexing is the wrong way round. Try this in all places where you access pixels:
contours.at<uchar>(j,i)
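Applied to the loop from the edit, a corrected sketch (keeping the question's contours, neckcenter, and neckpoint variables) would be:

// at<>(row, col): the y coordinate selects the row, x the column.
for (int x = 0; x < contours.cols; x++)
{
    if (contours.at<uchar>(neckcenter.y, x) > 0) // (row, col) order
    {
        cv::Point p(x, neckcenter.y); // cv::Point, by contrast, is (x, y)
        cv::circle(contours, p, neckpoint, cv::Scalar(255), 4, 8, 0);
    }
}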
You have a 3 channel image of the type unsigned char. To access it you should use the cv::Vec3b type. Here is how to do it:
int channel = 0;//or 1 or 2
contours.at<cv::Vec3b>(i,j)[channel]
To check if all elements are 0:
contours.at<cv::Vec3b>(i,j)[0]==0 && contours.at<cv::Vec3b>(i,j)[1]==0 && contours.at<cv::Vec3b>(i,j)[2]==0
But where did you get the information that the type of contours is CV_8UC3?
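If in doubt, you can ask the Mat itself at runtime, for example:

// Sketch: verify the actual type of the edge map before indexing it.
std::cout << "channels: " << contours.channels()        // 1 for CV_8UC1, 3 for CV_8UC3
          << ", 8-bit: " << (contours.depth() == CV_8U) << std::endl;
CV_Assert(contours.type() == CV_8UC3); // throws if the map is really CV_8UC1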
The plan
My project is able to capture the bitmap of a target window and convert it into an IplImage, and then display that image in a cvNamedWindow, where further processing can take place.
For the sake of testing, I've loaded an image into MSPaint like so:
The user is then allowed to click and drag the mouse over any number of pixels within the image to create a vector<cv::Scalar_<BYTE>> containing these RGB color values.
Then, with the help of ColorRGBToHLS(), this array is then sorted from left to right by hue, like so:
// PixelColor is just a cv::Scalar_<BYTE>
bool comparePixelColors( PixelColor& pc1, PixelColor& pc2 ) {
WORD h1 = 0, h2 = 0;
WORD s1 = 0, s2 = 0;
WORD l1 = 0, l2 = 0;
ColorRGBToHLS(RGB(pc1.val[2], pc1.val[1], pc1.val[0]), &h1, &l1, &s1);
ColorRGBToHLS(RGB(pc2.val[2], pc2.val[1], pc2.val[0]), &h2, &l2, &s2);
return ( h1 < h2 );
}
//..(elsewhere in code)
std::sort(m_colorRange.begin(), m_colorRange.end(), comparePixelColors);
...and then shown in a new cvNamedWindow, which looks something like:
The problem
Now, the idea here is to create a binary threshold image (or "mask") where this selected range of colors becomes white and the rest of the source image becomes black... similar to the way the "Select By Color" tool operates in GIMP, or the "magic wand" tool works in Photoshop... except that instead of limiting ourselves to a specific contoured selection, we are literally operating on the image as a whole.
I've read into cvInRangeS, and it sounds like it's precisely what I need.
However, for whatever reason, the thresholded image always ends up totally black...
VOID ShowThreshedImage(const IplImage* src, const PixelColor& min, const PixelColor& max)
{
IplImage* imgHSV = cvCreateImage(cvGetSize(src), IPL_DEPTH_8U, 3);
cvCvtColor(src, imgHSV, CV_RGB2HLS);
cvNamedWindow("T1");
cvShowImage("T1", imgHSV); // <-- Shows up like the image below
IplImage* imgThreshed = cvCreateImage(cvGetSize(src), IPL_DEPTH_8U, 1);
cvInRangeS(imgHSV, min, max, imgThreshed);
cvNamedWindow("T2");
cvShowImage("T2", imgThreshed); // <-- SHOWS UP PITCH BLACK!
}
This is what the "T1" window ends up looking like (which I suppose is correct?):
Bearing in mind that the color range vector is stored as RGB (and that OpenCV internally reverses this order into BGR), I have converted the min/max values into HLS before passing them into ShowThreshedImage() like so:
CvScalar rgbPixelToHSV(const PixelColor& pixelColor)
{
WORD h = 0, s = 0, l = 0;
ColorRGBToHLS(RGB(pixelColor.val[2], pixelColor.val[1], pixelColor.val[0]), &h, &l, &s);
return PixelColor(h, s, l);
}
//...(elsewhere in code)
if(m_colorRange.size() > 0)
m_minHSV = rgbPixelToHSV(m_colorRange[0]);
if(m_colorRange.size() > 1)
m_maxHSV = rgbPixelToHSV(m_colorRange[m_colorRange.size() - 1]);
ShowThreshedImage(m_imgSrc, m_minHSV, m_maxHSV);
...But even without this conversion, simply passing RGB values instead, the result is still an entirely black image. I've even tried manually plugging in certain min/max values, and the best result I got was a few lit pixels (albeit the incorrect ones).
The question:
What am I doing wrong here?
Is there something that I don't understand about the cvInRangeS method?
Do I need to step through each and every single color in order to properly threshold the selected range out of the source image?
Are there any other ways of accomplishing this?
Thank you for your time.
Update:
I have discovered that cvInRangeS expects all values of min to be lower than those of max. But when a range of colors is selected, there doesn't appear to be any guarantee that this will be the case, which often results in a black thresholded image.
And swapping values to enforce this rule may result in unwanted colors within the new range (in some cases, this could include all colors instead of just the desired ones).
So I suppose the real question here would be:
"How do you segment an array of RGB colors, and use them to threshold an image?"
Your problem might be caused by the simple fact that OpenCV maintains different value ranges than, for instance, MS Paint. For instance, the HSV color space in Paint is (360, 100, 100), while in OpenCV it is (180, 255, 255). Check your input values in OpenCV by printing the pixel value when clicking on a certain pixel. inRangeS should be the correct tool for the job. That said, in RGB it should work just as well, because the range is the same as in Paint.
cvSetMouseCallback("MyWindow", mouseEvent, (void*) &myImage);

void mouseEvent(int evt, int x, int y, int flags, void *param) {
    if (evt == CV_EVENT_LBUTTONDOWN) {
        printf("%d %d\n", x, y);
        IplImage* imageSource = (IplImage*) param;
        Mat image(imageSource);
        cout << "Image cols " << image.cols << " rows " << image.rows << endl;
        Mat imageHSV;
        cvtColor(image, imageHSV, CV_BGR2HSV);
        Vec3b p = imageHSV.at<Vec3b>(y, x);
        char text[32]; // 20 bytes was one too few for "H=255, S=255, V=255" plus the terminator
        sprintf(text, "H=%d, S=%d, V=%d", p[0], p[1], p[2]);
        cout << text << endl;
    }
}
When you have an idea of the HSV values obtained this way, use them as the lower and upper bounds for the inRange method, after converting the image to HSV using cvtColor(image, imageHSV, CV_BGR2HSV). That should let you get the desired result.
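A minimal sketch of that final step, with hypothetical bounds of the kind you would read off from the callback's output (image stands in for your BGR frame):

// Hypothetical HSV bounds gathered by clicking around with the callback above.
Scalar lowerHSV(100, 50, 50), upperHSV(140, 255, 255);

Mat imageHSV, mask;
cvtColor(image, imageHSV, CV_BGR2HSV);       // convert BGR -> HSV first
inRange(imageHSV, lowerHSV, upperHSV, mask); // white where the pixel is in range
imshow("mask", mask);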
It is not going to be too inefficient to iterate through every pixel. That is exactly what cvInRangeS does; see this: http://docs.opencv.org/doc/tutorials/core/how_to_scan_images/how_to_scan_images.html#the-efficient-way (I do this all the time and it is instantaneous for reasonable-size images).
I would treat the colors in the array as points in 3D RGB space. Find two color points that specify a prism that includes all the other color points; that is just the min and max of all the r, g, and b values. If this idea doesn't work for your data, you might have to check every image pixel against every pixel in the vector.
Then, for each pixel in the image: the result is black if (pixel.r < min.r) || (pixel.r > max.r) || (pixel.g < min.g) || (pixel.g > max.g) || (pixel.b < min.b) || (pixel.b > max.b); otherwise the result is the pixel value.
This all should be very easy, so long as it is actually what you want.
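As a sketch, the min/max prism and the threshold might look like this, assuming the question's m_colorRange vector of cv::Scalar_<BYTE> values and a hypothetical 8-bit BGR source Mat imageBGR (needs <algorithm> for std::min/std::max):

// Find the axis-aligned bounding "prism" of the selected colors.
cv::Vec3b lo(255, 255, 255), hi(0, 0, 0);
for (const PixelColor &c : m_colorRange) {
    for (int k = 0; k < 3; k++) {
        lo[k] = std::min(lo[k], (uchar)c.val[k]);
        hi[k] = std::max(hi[k], (uchar)c.val[k]);
    }
}

// cv::inRange performs exactly the per-channel test described above,
// producing a white mask wherever every channel lies inside [lo, hi].
cv::Mat mask;
cv::inRange(imageBGR, cv::Scalar(lo[0], lo[1], lo[2]),
            cv::Scalar(hi[0], hi[1], hi[2]), mask);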
I'm converting code from Matlab to C++, and one of the functions that I don't understand is imtransform. I need to "register" an image, which basically means stretching, skewing, and rotating my image so that it overlaps correctly with another image.
Matlab's imtransform does the registration for you, but as I'm programming this in C++ I need to know what's been abstracted away. What is the usual math involved in image registration? How can I go from two arrays of data (which make up the images) to one array, the combined, overlapped image?
I recommend using OpenCV from C++; it has a lot of image processing tools and functions you can call and use.
The Registration module implements parametric image registration. The implemented method is direct alignment, that is, it uses directly the pixel values for calculating the registration between a pair of images, as opposed to feature-based registration.
The OpenCV constants that represent these models have the prefix MOTION_ and are shown inside the brackets.
Translation ( MOTION_TRANSLATION ) : The first image can be shifted ( translated ) by (x , y) to obtain the second image. There are only two parameters x and y that we need to estimate.
Euclidean ( MOTION_EUCLIDEAN ) : The first image is a rotated and shifted version of the second image. So there are three parameters: x, y and angle. Note that when a square undergoes a Euclidean transformation, its size does not change, parallel lines remain parallel, and right angles remain unchanged after transformation.
Affine ( MOTION_AFFINE ) : An affine transform is a combination of rotation, translation ( shift ), scale, and shear. This transform has six parameters. When a square undergoes an Affine transformation, parallel lines remain parallel, but lines meeting at right angles no longer remain orthogonal.
Homography ( MOTION_HOMOGRAPHY ) : All the transforms described above are 2D transforms. They do not account for 3D effects. A homography transform on the other hand can account for some 3D effects ( but not all ). This transform has 8 parameters. A square when transformed using a Homography can change to any quadrilateral.
Reference: https://docs.opencv.org/3.4.2/db/d61/group__reg.html
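For reference, the MOTION_ constants above are what cv::findTransformECC (in the video module) accepts for direct, pixel-based alignment; here is a minimal sketch with placeholder file names:

#include <opencv2/opencv.hpp>

int main()
{
    // Placeholder file names.
    cv::Mat tmpl   = cv::imread("template.jpg", cv::IMREAD_GRAYSCALE);
    cv::Mat moving = cv::imread("moving.jpg", cv::IMREAD_GRAYSCALE);

    // 2x3 warp for MOTION_AFFINE (MOTION_HOMOGRAPHY would need a 3x3).
    cv::Mat warp = cv::Mat::eye(2, 3, CV_32F);

    // Estimate the warp relating the two images with the ECC criterion.
    cv::findTransformECC(tmpl, moving, warp, cv::MOTION_AFFINE);

    // Warp with WARP_INVERSE_MAP, matching findTransformECC's convention.
    cv::Mat aligned;
    cv::warpAffine(moving, aligned, warp, tmpl.size(),
                   cv::INTER_LINEAR + cv::WARP_INVERSE_MAP);
    cv::imwrite("aligned_ecc.jpg", aligned);
    return 0;
}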
This is an example I found very useful for image registration:
#include <opencv2/opencv.hpp>
#include "opencv2/xfeatures2d.hpp"
#include "opencv2/features2d.hpp"
using namespace std;
using namespace cv;
using namespace cv::xfeatures2d;
const int MAX_FEATURES = 500;
const float GOOD_MATCH_PERCENT = 0.15f;
void alignImages(Mat &im1, Mat &im2, Mat &im1Reg, Mat &h)
{
Mat im1Gray, im2Gray;
cvtColor(im1, im1Gray, CV_BGR2GRAY);
cvtColor(im2, im2Gray, CV_BGR2GRAY);
// Variables to store keypoints and descriptors
std::vector<KeyPoint> keypoints1, keypoints2;
Mat descriptors1, descriptors2;
// Detect ORB features and compute descriptors.
Ptr<Feature2D> orb = ORB::create(MAX_FEATURES);
orb->detectAndCompute(im1Gray, Mat(), keypoints1, descriptors1);
orb->detectAndCompute(im2Gray, Mat(), keypoints2, descriptors2);
// Match features.
std::vector<DMatch> matches;
Ptr<DescriptorMatcher> matcher = DescriptorMatcher::create("BruteForce-Hamming");
matcher->match(descriptors1, descriptors2, matches, Mat());
// Sort matches by score
std::sort(matches.begin(), matches.end());
// Remove not so good matches
const int numGoodMatches = matches.size() * GOOD_MATCH_PERCENT;
matches.erase(matches.begin()+numGoodMatches, matches.end());
// Draw top matches
Mat imMatches;
drawMatches(im1, keypoints1, im2, keypoints2, matches, imMatches);
imwrite("matches.jpg", imMatches);
// Extract location of good matches
std::vector<Point2f> points1, points2;
for( size_t i = 0; i < matches.size(); i++ )
{
points1.push_back( keypoints1[ matches[i].queryIdx ].pt );
points2.push_back( keypoints2[ matches[i].trainIdx ].pt );
}
// Find homography
h = findHomography( points1, points2, RANSAC );
// Use homography to warp image
warpPerspective(im1, im1Reg, h, im2.size());
}
int main(int argc, char **argv)
{
// Read reference image
string refFilename("form.jpg");
cout << "Reading reference image : " << refFilename << endl;
Mat imReference = imread(refFilename);
// Read image to be aligned
string imFilename("scanned-form.jpg");
cout << "Reading image to align : " << imFilename << endl;
Mat im = imread(imFilename);
// Registered image will be stored in imReg.
// The estimated homography will be stored in h.
Mat imReg, h;
// Align images
cout << "Aligning images ..." << endl;
alignImages(im, imReference, imReg, h);
// Write aligned image to disk.
string outFilename("aligned.jpg");
cout << "Saving aligned image : " << outFilename << endl;
imwrite(outFilename, imReg);
// Print estimated homography
cout << "Estimated homography : \n" << h << endl;
}
Raw C++ does not have any of the concepts you refer to built into it. However, there are many image processing libraries for C++ that can do various transforms. DevIL and FreeImage should be able to do layering, as well as some transforms.