Accessing RGB values of all pixels in a certain image in OpenCV - C++

I have searched the internet and Stack Overflow thoroughly, but I didn't find exactly what I'm looking for!
How can I get the RGB (BGR, actually) values of all pixels of a certain image in OpenCV? I'm using C++, and the image is stored in a cv::Mat variable.
Here are some of my efforts so far. I tried this code from another Stack Overflow answer, but every time I re-run the code the hexadecimal value changes! For example, once it is 00CD5D7C, on the next run it is 00C09D7C.
cv::Mat img_rgb = cv::imread("img6.jpg");
cv::Point3_<uchar>* p = img_rgb.ptr<cv::Point3_<uchar> >(10, 10);
p->x; //B
p->y; //G
p->z; //R
std::cout << p; // prints the pointer itself (an address), not the pixel values
In another attempt I used this code from another answer. Here the output is always -858993460.
img_rgb.at<cv::Vec3b>(10,10);
img_rgb.at<cv::Vec3b>(10,10)[0] = newval[0];
img_rgb.at<cv::Vec3b>(10,10)[1] = newval[1];
img_rgb.at<cv::Vec3b>(10,10)[2] = newval[2];
cout << newval[0]; // the result is the same for newval[1] and newval[2]
NOTE: I used (10,10) only as a test to read the RGB values; my goal is to get the RGB values of the whole image!

Since you are loading a color image (of type CV_8UC3), you need to access its elements with .at<Vec3b>(row, col). The elements are in BGR order:
Mat img_bgr = imread("path_to_img");
for (int r = 0; r < img_bgr.rows; ++r) {
    for (int c = 0; c < img_bgr.cols; ++c) {
        std::cout << "Pixel at position (x, y) : (" << c << ", " << r << ") = "
                  << img_bgr.at<Vec3b>(r, c) << std::endl;
    }
}
You can also simplify this by using Mat3b (aka Mat_<Vec3b>), so you don't need the .at function and can access pixels directly with parentheses:
Mat3b img_bgr = imread("path_to_img");
for (int r = 0; r < img_bgr.rows; ++r) {
    for (int c = 0; c < img_bgr.cols; ++c) {
        std::cout << "Pixel at position (x, y) : (" << c << ", " << r << ") = "
                  << img_bgr(r, c) << std::endl;
    }
}
To get each single channel, you can easily do:
Vec3b pixel = img_bgr(r,c); // or img_bgr.at<Vec3b>(r,c)
uchar blue = pixel[0];
uchar green = pixel[1];
uchar red = pixel[2];
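If you want to do something with all of those values rather than just print them, here is a minimal sketch along the same lines that accumulates per-channel means over the whole image (the path is a placeholder):
#include <opencv2/opencv.hpp>
#include <iostream>

int main() {
    cv::Mat3b img_bgr = cv::imread("path_to_img"); // placeholder path
    if (img_bgr.empty()) return 1;

    double sum[3] = {0.0, 0.0, 0.0};
    for (int r = 0; r < img_bgr.rows; ++r) {
        for (int c = 0; c < img_bgr.cols; ++c) {
            cv::Vec3b px = img_bgr(r, c);
            sum[0] += px[0]; // B
            sum[1] += px[1]; // G
            sum[2] += px[2]; // R
        }
    }
    double n = (double)img_bgr.total();
    std::cout << "mean B, G, R = " << sum[0] / n << ", "
              << sum[1] / n << ", " << sum[2] / n << std::endl;
    return 0;
}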

Related

Number and character recognition using ANN OpenCV 3.1

I have implemented a neural network using the OpenCV ANN library. I am a newbie in this field and I learned everything about it online (mostly from Stack Overflow).
I am using this ANN for number plate detection. I did the segmentation part using the OpenCV image processing library and it is working well. It performs character segmentation and passes the characters to the NN part of the project, which is supposed to recognize the number plate.
I have sample images of 20x30, so I have 600 neurons in the input layer. As there are 36 possibilities (0-9, A-Z) I have 36 output neurons. I kept 100 neurons in the hidden layer. The predict function of OpenCV gives me the same output for every segmented image, and that output also contains some large negative value (< -1). I have used cv::ml::ANN_MLP::SIGMOID_SYM as the activation function.
Please bear with the code; a lot of it is wrongly commented out (I am doing trial and error).
I need to find out what the output of the predict function means. Thank you for your help.
#include <opencv2/opencv.hpp>
int inputLayerSize = 1;
int outputLayerSize = 1;
int numSamples = 2;
Mat layers = Mat(3, 1, CV_32S);
layers.row(0) =Scalar(600) ;
layers.row(1) = Scalar(20);
layers.row(2) = Scalar(36);
vector<int> layerSizes = { 600,100,36 };
Ptr<ml::ANN_MLP> nnPtr = ml::ANN_MLP::create();
vector <int> n;
//nnPtr->setLayerSizes(3);
nnPtr->setLayerSizes(layers);
nnPtr->setTrainMethod(ml::ANN_MLP::BACKPROP);
nnPtr->setTermCriteria(TermCriteria(cv::TermCriteria::COUNT | cv::TermCriteria::EPS, 1000, 0.00001f));
nnPtr->setActivationFunction(cv::ml::ANN_MLP::SIGMOID_SYM, 1, 1);
nnPtr->setBackpropWeightScale(0.5f);
nnPtr->setBackpropMomentumScale(0.5f);
/*CvANN_MLP_TrainParams params = CvANN_MLP_TrainParams(
// terminate the training after either 1000
// iterations or a very small change in the
// network wieghts below the specified value
cvTermCriteria(CV_TERMCRIT_ITER + CV_TERMCRIT_EPS, 1000, 0.000001),
// use backpropogation for training
CvANN_MLP_TrainParams::BACKPROP,
// co-efficents for backpropogation training
// (refer to manual)
0.1,
0.1);*/
/* Mat samples(Size(inputLayerSize, numSamples), CV_32F);
samples.at<float>(Point(0, 0)) = 0.1f;
samples.at<float>(Point(0, 1)) = 0.2f;
Mat responses(Size(outputLayerSize, numSamples), CV_32F);
responses.at<float>(Point(0, 0)) = 0.2f;
responses.at<float>(Point(0, 1)) = 0.4f;
*/
//reading chaos image
// we will read the classification numbers into this variable as though it is a vector
// close the traning images file
/*vector<int> layerInfo;
layerInfo=nnPtr->get;
for (int i = 0; i < layerInfo.size(); i++) {
cout << "size of 0" <<layerInfo[i] << endl;
}*/
cv::imshow("chaos", matTrainingImagesAsFlattenedFloats);
// cout <<abc << endl;
matTrainingImagesAsFlattenedFloats.convertTo(matTrainingImagesAsFlattenedFloats, CV_32F);
//matClassificationInts.reshape(1, 496);
matClassificationInts.convertTo(matClassificationInts, CV_32F);
matSamples.convertTo(matSamples, CV_32F);
std::cout << matClassificationInts.rows << " " << matClassificationInts.cols << " ";
std::cout << matTrainingImagesAsFlattenedFloats.rows << " " << matTrainingImagesAsFlattenedFloats.cols << " ";
std::cout << matSamples.rows << " " << matSamples.cols;
imshow("Samples", matSamples);
imshow("chaos", matTrainingImagesAsFlattenedFloats);
Ptr<ml::TrainData> trainData = ml::TrainData::create(matTrainingImagesAsFlattenedFloats, ml::SampleTypes::ROW_SAMPLE, matSamples);
nnPtr->train(trainData);
bool m = nnPtr->isTrained();
if (m)
std::cout << "training complete\n\n";
// cv::Mat matCurrentChar = Mat(cv::Size(matTrainingImagesAsFlattenedFloats.cols, matTrainingImagesAsFlattenedFloats.rows), CV_32F);
// cout << "samples:\n" << samples << endl;
//cout << "\nresponses:\n" << responses << endl;
/* if (!nnPtr->train(trainData))
return 1;*/
/* cout << "\nweights[0]:\n" << nnPtr->getWeights(0) << endl;
cout << "\nweights[1]:\n" << nnPtr->getWeights(1) << endl;
cout << "\nweights[2]:\n" << nnPtr->getWeights(2) << endl;
cout << "\nweights[3]:\n" << nnPtr->getWeights(3) << endl;*/
//predicting
std::vector <cv::String> filename;
cv::String folder = "./plate/";
cv::glob(folder, filename);
if (filename.empty()) { // if unable to open image
std::cout << "error: image not read from file\n\n"; // show error message on command line
return(0); // and exit program
}
String strFinalString;
for (int i = 0; i < filename.size(); i++) {
cv::Mat matTestingNumbers = cv::imread(filename[i]);
cv::Mat matGrayscale; //
cv::Mat matBlurred; // declare more image variables
cv::Mat matThresh; //
cv::Mat matThreshCopy;
cv::Mat matCanny;
//
cv::cvtColor(matTestingNumbers, matGrayscale, CV_BGR2GRAY); // convert to grayscale
matThresh = cv::Mat(cv::Size(matGrayscale.cols, matGrayscale.rows), CV_8UC1);
for (int i = 0; i < matGrayscale.cols; i++) {
for (int j = 0; j < matGrayscale.rows; j++) {
if (matGrayscale.at<uchar>(j, i) <= 130) {
matThresh.at<uchar>(j, i) = 255;
}
else {
matThresh.at<uchar>(j, i) = 0;
}
}
}
// blur
cv::GaussianBlur(matThresh, // input image
matBlurred, // output image
cv::Size(5, 5), // smoothing window width and height in pixels
0); // sigma value, determines how much the image will be blurred, zero makes function choose the sigma value
// filter image from grayscale to black and white
/* cv::adaptiveThreshold(matBlurred, // input image
matThresh, // output image
255, // make pixels that pass the threshold full white
cv::ADAPTIVE_THRESH_GAUSSIAN_C, // use gaussian rather than mean, seems to give better results
cv::THRESH_BINARY_INV, // invert so foreground will be white, background will be black
11, // size of a pixel neighborhood used to calculate threshold value
2); */ // constant subtracted from the mean or weighted mean
// cv::imshow("thresh" + std::to_string(i), matThresh);
matThreshCopy = matThresh.clone();
std::vector<std::vector<cv::Point> > ptContours; // declare a vector for the contours
std::vector<cv::Vec4i> v4iHierarchy;// make a copy of the thresh image, this in necessary b/c findContours modifies the image
cv::Canny(matBlurred, matCanny, 20, 40, 3);
/*std::vector<std::vector<cv::Point> > ptContours; // declare a vector for the contours
std::vector<cv::Vec4i> v4iHierarchy; // declare a vector for the hierarchy (we won't use this in this program but this may be helpful for reference)
cv::findContours(matThreshCopy, // input image, make sure to use a copy since the function will modify this image in the course of finding contours
ptContours, // output contours
v4iHierarchy, // output hierarchy
cv::RETR_EXTERNAL, // retrieve the outermost contours only
cv::CHAIN_APPROX_SIMPLE); // compress horizontal, vertical, and diagonal segments and leave only their end points
/*std::vector<std::vector<cv::Point> > contours_poly(ptContours.size());
std::vector<cv::Rect> boundRect(ptContours.size());
for (int i = 0; i < ptContours.size(); i++)
{
approxPolyDP(cv::Mat(ptContours[i]), contours_poly[i], 3, true);
boundRect[i] = cv::boundingRect(cv::Mat(contours_poly[i]));
}*/
/*for (int i = 0; i < ptContours.size(); i++) { // for each contour
ContourWithData contourWithData; // instantiate a contour with data object
contourWithData.ptContour = ptContours[i]; // assign contour to contour with data
contourWithData.boundingRect = cv::boundingRect(contourWithData.ptContour); // get the bounding rect
contourWithData.fltArea = cv::contourArea(contourWithData.ptContour); // calculate the contour area
allContoursWithData.push_back(contourWithData); // add contour with data object to list of all contours with data
}
for (int i = 0; i < allContoursWithData.size(); i++) { // for all contours
if (allContoursWithData[i].checkIfContourIsValid()) { // check if valid
validContoursWithData.push_back(allContoursWithData[i]); // if so, append to valid contour list
}
}
//sort contours from left to right
std::sort(validContoursWithData.begin(), validContoursWithData.end(), ContourWithData::sortByBoundingRectXPosition);
// std::string strFinalString; // declare final string, this will have the final number sequence by the end of the program
*/
/*for (int i = 0; i < validContoursWithData.size(); i++) { // for each contour
// draw a green rect around the current char
cv::rectangle(matTestingNumbers, // draw rectangle on original image
validContoursWithData[i].boundingRect, // rect to draw
cv::Scalar(0, 255, 0), // green
2); // thickness
cv::Mat matROI = matThresh(validContoursWithData[i].boundingRect); // get ROI image of bounding rect
cv::Mat matROIResized;
cv::resize(matROI, matROIResized, cv::Size(RESIZED_IMAGE_WIDTH, RESIZED_IMAGE_HEIGHT)); // resize image, this will be more consistent for recognition and storage
*/
cv::Mat matROIFloat;
cv::resize(matThresh, matThresh, cv::Size(RESIZED_IMAGE_WIDTH, RESIZED_IMAGE_HEIGHT));
matThresh.convertTo(matROIFloat, CV_32FC1, 1.0 / 255.0); // convert Mat to float, necessary for call to find_nearest
cv::Mat matROIFlattenedFloat = matROIFloat.reshape(1, 1);
cv::Point maxLoc = { 0,0 };
cv::Point minLoc;
cv::Mat output = cv::Mat(cv::Size(36, 1), CV_32F);
vector<float>output2;
// cv::Mat output2 = cv::Mat(cv::Size(36, 1), CV_32F);
nnPtr->predict(matROIFlattenedFloat, output2);
// float max = output.at<float>(0, 0);
int fo = 0;
float m = output2[0];
imshow("predicted input", matROIFlattenedFloat);
// float b = output.at<float>(0, 0);
// cout <<"\n output0,0:"<<b<<endl;
// minMaxLoc(output, 0, 0, &minLoc, &maxLoc, Mat());
// cout << "\noutput:\n" << maxLoc.x << endl;
for (int j = 1; j < 36; j++) {
float value =output2[j];
if (value > m) {
m = value;
fo = j;
}
}
float * p = 0;
p = &m;
cout << "j value in output " << fo << " Max value " << p << endl;
//imshow("output image" + to_string(i), output);
// cout << "\noutput:\n" << minLoc.x << endl;
//float fltCurrentChar = (float)maxLoc.x;
output.release();
m = 0;
fo = 0;
}
// strFinalString = strFinalString + char(int(fltCurrentChar)); // append current char to full string
// cv::imshow("Predict output", output);
/*cv::Point maxLoc = {0,0};
Mat output=Mat (cv::Size(matSamples.cols,matSamples.rows),CV_32F);
nnPtr->predict(matTrainingImagesAsFlattenedFloats, output);
minMaxLoc(output, 0, 0, 0, &maxLoc, 0);
cout << "\noutput:\n" << maxLoc.x << endl;*/
// getchar();
/*for (int i = 0; i < 10;i++) {
for (int j = 0; j < 36; j++) {
if (matCurrentChar.at<float>(i, j) >= 0.6) {
cout << " "<<j<<" ";
}
}
}*/
waitKey(0);
return(0);
}
void gen() {
std::string dir, filepath;
int num, imgArea, minArea;
int pos = 0;
bool f = true;
struct stat filestat;
cv::Mat imgTrainingNumbers;
cv::Mat imgGrayscale;
cv::Mat imgBlurred;
cv::Mat imgThresh;
cv::Mat imgThreshCopy;
cv::Mat matROIResized=cv::Mat (cv::Size(RESIZED_IMAGE_WIDTH,RESIZED_IMAGE_HEIGHT),CV_8UC1);
cv::Mat matROI;
std::vector <cv::String> filename;
std::vector<std::vector<cv::Point> > ptContours;
std::vector<cv::Vec4i> v4iHierarchy;
int count = 0, contoursCount = 0;
matSamples = cv::Mat(cv::Size(36, 496), CV_32FC1);
matTrainingImagesAsFlattenedFloats = cv::Mat(cv::Size(600, 496), CV_32FC1);
for (int j = 0; j <= 35; j++) {
int tmp = j;
cv::String folder = "./Training Data/" + std::to_string(tmp);
cv::glob(folder, filename);
for (int k = 0; k < filename.size(); k++) {
count++;
// If the file is a directory (or is in some way invalid) we'll skip it
// if (stat(filepath.c_str(), &filestat)) continue;
//if (S_ISDIR(filestat.st_mode)) continue;
imgTrainingNumbers = cv::imread(filename[k]);
imgArea = imgTrainingNumbers.cols*imgTrainingNumbers.rows;
// read in training numbers image
minArea = imgArea * 50 / 100;
if (imgTrainingNumbers.empty()) {
std::cout << "error: image not read from file\n\n";
//return(0);
}
cv::cvtColor(imgTrainingNumbers, imgGrayscale, CV_BGR2GRAY);
//cv::equalizeHist(imgGrayscale, imgGrayscale);
imgThresh = cv::Mat(cv::Size(imgGrayscale.cols, imgGrayscale.rows), CV_8UC1);
/*cv::adaptiveThreshold(imgGrayscale,
imgThresh,
255,
cv::ADAPTIVE_THRESH_GAUSSIAN_C,
cv::THRESH_BINARY_INV,
3,
0);
*/
for (int i = 0; i < imgGrayscale.cols; i++) {
for (int j = 0; j < imgGrayscale.rows; j++) {
if (imgGrayscale.at<uchar>(j, i) <= 130) {
imgThresh.at<uchar>(j, i) = 255;
}
else {
imgThresh.at<uchar>(j, i) = 0;
}
}
}
// cv::imshow("imgThresh"+std::to_string(count), imgThresh);
imgThreshCopy = imgThresh.clone();
cv::GaussianBlur(imgThreshCopy,
imgBlurred,
cv::Size(5, 5),
0);
cv::Mat imgCanny;
// cv::Canny(imgBlurred,imgCanny,20,40,3);
cv::findContours(imgBlurred,
ptContours,
v4iHierarchy,
cv::RETR_EXTERNAL,
cv::CHAIN_APPROX_SIMPLE);
for (int i = 0; i < ptContours.size(); i++) {
if (cv::contourArea(ptContours[i]) > MIN_CONTOUR_AREA) {
contoursCount++;
cv::Rect boundingRect = cv::boundingRect(ptContours[i]);
cv::rectangle(imgTrainingNumbers, boundingRect, cv::Scalar(0, 0, 255), 2); // draw red rectangle around each contour as we ask user for input
matROI = imgThreshCopy(boundingRect); // get ROI image of bounding rect
std::string path = "./" + std::to_string(contoursCount) + ".JPG";
cv::imwrite(path, matROI);
// cv::imshow("matROI" + std::to_string(count), matROI);
cv::resize(matROI, matROIResized, cv::Size(RESIZED_IMAGE_WIDTH, RESIZED_IMAGE_HEIGHT)); // resize image, this will be more consistent for recognition and storage
std::cout << filename[k] << " " << contoursCount << "\n";
//cv::imshow("matROI", matROI);
//cv::imshow("matROIResized"+std::to_string(count), matROIResized);
// cv::imshow("imgTrainingNumbers" + std::to_string(contoursCount), imgTrainingNumbers);
int intChar;
if (j<10)
intChar = j + 48;
else {
intChar = j + 55;
}
/*if (intChar == 27) { // if esc key was pressed
return(0); // exit program
}*/
// if (std::find(intValidChars.begin(), intValidChars.end(), intChar) != intValidChars.end()) { // else if the char is in the list of chars we are looking for . . .
// append classification char to integer list of chars
cv::Mat matImageFloat;
matROIResized.convertTo(matImageFloat,CV_32FC1);// now add the training image (some conversion is necessary first) . . .
//matROIResized.convertTo(matImageFloat, CV_32FC1); // convert Mat to float
cv::Mat matImageFlattenedFloat = matImageFloat.reshape(1, 1);
//matTrainingImagesAsFlattenedFloats.push_back(matImageFlattenedFloat);// flatten
try {
//matTrainingImagesAsFlattenedFloats.push_back(matImageFlattenedFloat);
std::cout << matTrainingImagesAsFlattenedFloats.rows << " " << matTrainingImagesAsFlattenedFloats.cols;
//unsigned char* re;
int ii = 0; // Current column in training_mat
for (int i = 0; i<matImageFloat.rows; i++) {
for (int j = 0; j < matImageFloat.cols; j++) {
matTrainingImagesAsFlattenedFloats.at<float>(contoursCount-1, ii++) = matImageFloat.at<float>(i,j);
}
}
}
catch (std::exception &exc) {
f = false;
exc.what();
}
if (f) {
matClassificationInts.push_back((float)intChar);
matSamples.at<float>(contoursCount-1, j) = 1.0;
}
f = true;
// add to Mat as though it was a vector, this is necessary due to the
// data types that KNearest.train accepts
} // end if
//} // end if
} // end for
}//end i
}//end j
}
Output of predict function
Unfortunately, I don't have the necessary time to really review the code, but I can say off the top that to train a model that performs well for prediction with 36 classes, you will need several things:
A large number of good quality images. Ideally, you'd want thousands of images for each class. Of course, you can see somewhat decent results with less than that, but if you only have a few images per class, it's never going to be able to generalize adequately.
You need a model that is large and sophisticated enough to provide the necessary expressiveness to solve the problem. For a problem like this, a plain old multi-layer perceptron with one hidden layer with 100 units may not be enough. This is actually a problem that would benefit from using a Convolutional Neural Net (CNN) with a couple layers just to extract useful features first. But assuming you don't want to go down that path, you may at least want to tweak the size of your hidden layer.
To even get to a point where the training process converges, you will probably need to experiment and crucially, you need an effective way to test the accuracy of the ANN after each experiment. Ideally, you want to observe the loss as the training is proceeding, but I'm not sure whether that's possible using OpenCV's ML functionality. At a minimum, you should fully expect to have to play around with the various so-called "hyper-parameters" and run many experiments before you have a reasonable model.
Anyway, the most important thing is to make sure you have a solid mechanism for validating the accuracy of the model after training. If you aren't already doing so, set aside some images as a separate test set, and after each experiment, use the trained ANN to predict each test image to see the accuracy.
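As a rough illustration of such a check (this is not from the question's code; testData and testLabels are hypothetical Mats holding one flattened 20x30 test image per row and the corresponding class index 0-35), it could look like:
// Hedged sketch: accuracy of a trained cv::ml::ANN_MLP on a held-out test set
// testData: CV_32F, one flattened image per row; testLabels: CV_32S, one class index per row
int correct = 0;
for (int i = 0; i < testData.rows; ++i) {
    cv::Mat response;
    nnPtr->predict(testData.row(i), response);   // 1x36 row of scores
    cv::Point maxIdx;
    cv::minMaxLoc(response, nullptr, nullptr, nullptr, &maxIdx);
    if (maxIdx.x == testLabels.at<int>(i, 0)) ++correct; // predicted class = index of max score
}
std::cout << "test accuracy: " << (double)correct / testData.rows << std::endl;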
One final general note: what you're trying to do is complex. You will save yourself a huge number of headaches if you take the time early and often to refactor your code. No matter how many experiments you run, if there's some defect causing (for example) your training data to be fundamentally different in some way than your test data, you will never see good results.
Good luck!
EDIT: I should also point out that seeing the same result for every input image is a classic sign that training failed. Unfortunately, there are many reasons why that might happen and it will be very difficult for anyone to isolate that for you without some cleaner code and access to your image data.
I have solved the issue of not getting meaningful output from predict. The problem was that the input Mat used for training (i.e. matTrainingImagesAsFlattenedFloats) contained values of 255.0 for white pixels. This happened because I hadn't used convertTo() properly. You need to call convertTo(outputImage, CV_32FC1, 1.0 / 255.0); this scales pixel values of 255.0 down to 1.0, and after that I get the correct output.
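For reference, a hedged one-liner of that fix (matROIResized being the 8-bit character image that gets flattened for training or prediction):
// Convert to float and scale 0-255 into 0-1 in one step, as described above
cv::Mat matImageFloat;
matROIResized.convertTo(matImageFloat, CV_32FC1, 1.0 / 255.0);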
Thank you for all the help.
This is too broad to be a single question; sorry for the bad news. I tried this over and over and couldn't find a solution. I recommend that you implement a simple AND, OR or XOR first, just to make sure that the learning part is working and that you get better results the more passes you do. I also suggest trying the hyperbolic tangent as a transfer function instead of the sigmoid. And good luck!
Here are some of my own posts that might help you:
Exact results as yours: HERE
Some code: HERE
I don't want to say it, but several professors I have met said backpropagation just doesn't work, and they had (and I have had) to implement their own method of teaching the network.

Accessing elements in a multi-channel OpenCV Mat

This is my first post on Stack Overflow, so I hope I'm doing everything right; sorry if I'm not.
I'm writing a function to convert a single RGB value into the CIE L*a*b* color space. The function is supposed to take an array of 3 floats (RGB channels with values in [0-255]) and to output an array of 3 floats with the L*a*b* values. To do so, I'm using the cvtColor function available in OpenCV.
As suggested on the OpenCV website, I'm creating the Mat structures (needed by cvtColor) by constructor.
My problem is that, although I think the code runs properly and performs the conversion, I'm unable to get the values contained in the Mat structure back.
Here's my code:
float * rgb2lab(float rgb[3]) {
// bring input in range [0,1]
rgb[0] = rgb[0] / 255;
rgb[1] = rgb[1] / 255;
rgb[2] = rgb[2] / 255;
// copy rgb in Mat data structure and check values
cv::Mat rgb_m(1, 1, CV_32FC3, cv::Scalar(rgb[0], rgb[1], rgb[2]));
std::cout << "rgb_m = " << std::endl << " " << rgb_m << std::endl;
cv::Vec3f elem = rgb_m.at<cv::Vec3f>(1, 1);
float R = elem[0];
float G = elem[1];
float B = elem[2];
printf("RGB =\n [%f, %f, %f]\n", R, G, B);
// create lab data structure and check values
cv::Mat lab_m(1, 1, CV_32FC3, cv::Scalar(0, 0, 0));
std::cout << "lab_m = " << std::endl << " " << lab_m << std::endl;
// convert
cv::cvtColor(rgb_m, lab_m, CV_RGB2Lab);
// check lab value after conversion
std::cout << "lab_m2 = " << std::endl << " " << lab_m << std::endl;
cv::Vec3f elem2 = lab_m.at<cv::Vec3f>(1, 1);
float l = elem2[0];
float a = elem2[1];
float b = elem2[2];
printf("lab =\n [%f, %f, %f]\n", l, a, b);
// generate the output and return
static float lab[] = { l, a, b };
return lab;
}
As you can see, I'm extracting all channels from the Mat structure by the at function and then accessing them individually from the vector. This is proposed as the solution in many places (one of them).
But if I run this code (the input vector was {123, 10, 200}), on cout I correctly get the output of the Mat structures (from which I gather the algorithm is converting correctly), yet as you can see the extracted values are wrong:
rgb_m =
[0.48235294, 0.039215688, 0.78431374]
RGB =
[0.000000, 0.000000, -5758185472.000000]
lab_m =
[0, 0, 0]
lab_m2 =
[35.198029, 70.120964, -71.303688]
lab =
[0.000000, 0.000000, 4822177514157213323960797626368.000000]
Anyone have an idea of what I'm doing wrong?
Thank you so much for all your help!
The first element of a cv::Mat is always at (0, 0), so just replace cv::Vec3f elem = rgb_m.at<cv::Vec3f>(1, 1); with cv::Vec3f elem = rgb_m.at<cv::Vec3f>(0, 0); and cv::Vec3f elem2 = lab_m.at<cv::Vec3f>(1, 1); with cv::Vec3f elem2 = lab_m.at<cv::Vec3f>(0, 0);. Note that .at does no bounds checking in release builds, which is why the out-of-range reads return garbage instead of raising an error.
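Putting that fix into the original function, a minimal sketch (assuming the float path of cvtColor, which expects RGB values already scaled to [0, 1]) could look like this:
// Hedged sketch of the corrected conversion; the (0, 0) accesses are the actual fix
#include <opencv2/opencv.hpp>

void rgb2lab(const float rgb[3], float lab[3]) {
    cv::Mat rgb_m(1, 1, CV_32FC3,
                  cv::Scalar(rgb[0] / 255.0f, rgb[1] / 255.0f, rgb[2] / 255.0f));
    cv::Mat lab_m;
    cv::cvtColor(rgb_m, lab_m, CV_RGB2Lab);      // cv::COLOR_RGB2Lab in newer headers
    cv::Vec3f elem = lab_m.at<cv::Vec3f>(0, 0);  // the first (and only) element is at (0, 0)
    lab[0] = elem[0]; // L
    lab[1] = elem[1]; // a
    lab[2] = elem[2]; // b
}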

Different Pixel Values in MATLAB and C++ with OpenCV

I see there are similar questions to this, but they don't quite answer what I am asking, so here is my question.
In C++ with OpenCV I run the code provided below and it returns an average pixel value of 6.32. However, when I open the image and use the mean function in MATLAB, it returns an average pixel intensity of approximately 6.92. As you can see, I convert the OpenCV values to double to try to ease this issue, and I have found that OpenCV loads the image as a set of integers whereas MATLAB loads the image as decimal values that are approximately, but not quite, the same as those integers. So my question is, being new to coding, which is correct? I'm assuming MATLAB returns more accurate values, and if that is the case I would like to know whether there is a way to load the images in the same fashion to avoid the discrepancy.
Thank you. Code below:
Mat img = imread("Cells2.tif");
cv::cvtColor(img, img, CV_BGR2GRAY);
cv::imshow("stuff",img);
Mat dst;
if(img.channels() == 3)
{
img.convertTo(dst, CV_64FC1);
}
else if (img.channels() == 1)
{
img.convertTo(dst, CV_64FC1);
}
cv::imshow("output",dst/255);
int NumPixels = img.total();
double avg;
double c = 0;
double std;
for(int y = 0; y < dst.cols; y++)
{
for(int x = 0; x < dst.rows; x++)
{
c+=dst.at<double>(x,y)*255;
}
}
avg = c/NumPixels;
cout << "asfa = " << c << endl;
double deviation;
double var;
double z = 0;
double q;
//for(int a = 0; a<= img.cols; a++)
for(int y = 0; y< dst.cols; y++)
{
//for(int b = 0; b<= dst.rows; b++)
for(int x = 0; x< dst.rows; x++)
{
q=dst.at<double>(x,y);
deviation = q - avg;
z = z + pow(deviation,2);
//cout << "q = " << q << endl;
}
}
var = z/(NumPixels);
std = sqrt(var);
cv::Scalar avgPixel = cv::mean(dst);
cout << "Avg Value = " << avg << endl;
cout << "StdDev = " << std << endl;
cout << "AvgPixel =" << avgPixel;
cvWaitKey(0);
return 0;
}
According to your comment, the image seems to be stored with a 16-bit depth. MATLAB loads the TIFF image as is, while by default OpenCV will load images as 8-bit. This might explain the difference in precision that you are seeing.
Use the following to open the image in OpenCV:
cv::Mat img = cv::imread("file.tif", cv::IMREAD_ANYDEPTH|cv::IMREAD_ANYCOLOR);
In MATLAB, it's simply:
img = imread('file.tif');
Next, you need to be aware of the data type you are working with. In OpenCV it's CV_16U, in MATLAB it's uint16. Therefore you need to convert types accordingly.
For example, in MATLAB:
img2 = double(img) ./ double(intmax('uint16'));
would convert it to a double image with values in the range [0,1]
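A hedged OpenCV counterpart of that normalization (assuming the TIFF really is 16-bit) would be:
// Load at native depth, then scale to [0, 1] doubles, mirroring the MATLAB line above
cv::Mat img = cv::imread("file.tif", cv::IMREAD_ANYDEPTH | cv::IMREAD_ANYCOLOR);
cv::Mat img2;
img.convertTo(img2, CV_64F, 1.0 / 65535.0); // 65535 = intmax('uint16')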
When you load the image, you must use similar methods in both environments (MATLAB and OpenCV) to avoid conversions that either environment may perform by default.
In your code you convert the image only when certain conditions are met, which can change some color values, whereas MATLAB may choose not to convert the image at all and use the raw data.
Colors are mostly represented in hex format, with popular implementations in the form 0xAARRGGBB or 0xRRGGBBAA, so 32-bit integers will do (unsigned or signed doesn't matter; the hex value is the same). Create a 64-bit variable, add all the 32-bit values together, and then divide by the number of pixels; this will give you a quite accurate result (for images up to 16384 by 16384 pixels where a 32-bit value represents the color of one pixel; if larger, a 64-bit integer will not be enough).
// assumes a hypothetical image type with width/height members and packed 32-bit pixels
long long total = 0;
long long divisor = (long long)image.width * image.height;
for (int x = 0; x < image.width; ++x)
{
    for (int y = 0; y < image.height; ++y)
    {
        total += image.at(x, y).color;
    }
}
double avg = (double)total / divisor;
std::cout << "Average color value: " << avg << std::endl;
Not sure what difficulty you are having with the mean value in MATLAB versus OpenCV. If I understand your question correctly, your goal is to implement MATLAB's mean(image(:)) in OpenCV. For example, in MATLAB you do the following:
>> image = imread('sheep.jpg')
>> avg = mean(image(:))
ans =
119.8210
Here's how you do the same in OpenCV:
Mat image = imread("sheep.jpg");
Scalar avg_pixel = mean(image);
float avg = 0;
cout << "mean pixel (RGB): " << avg_pixel << endl;
for (int i = 0; i < image.channels(); ++i) {
    avg = avg + avg_pixel[i];
}
avg = avg / image.channels();
cout << "mean, that's equivalent to mean(image(:)) in Matlab: " << avg << endl;
OpenCV console output:
mean pixel (RGB): [77.4377, 154.43, 127.596, 0]
mean, that's equivalent to mean(image(:)) in Matlab: 119.821
So the results are the same in Matlab and OpenCV.
Follow up
Found some problems in your code.
OpenCV stores data differently from Matlab. Look at this answer for a rough explanation on how to access a pixel in OpenCV. For example:
// NOT a correct way to access a pixel in a CV_64FC3 image
double pixel = image.at<double>(x,y);
//The correct way (where the pixel value is stored in a vector)
// Note that Vec3d is defined as: typedef Vec<double, 3> Vec3d;
Vec3d pixel = image.at<Vec3d>(x, y);
Another error I found
if(img.channels() == 3)
{
img.convertTo(dst, CV_64FC1); //should be CV_64FC3, instead of CV_64FC1
}
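Putting the two fixes together, a hedged sketch of the averaging loop for a CV_64FC3 image (dst being your converted image) could be:
// Sum every channel of every pixel and divide by the total element count;
// this matches MATLAB's mean(image(:)) for a 3-channel image
double sum = 0.0;
for (int y = 0; y < dst.rows; ++y) {
    for (int x = 0; x < dst.cols; ++x) {
        cv::Vec3d px = dst.at<cv::Vec3d>(y, x); // (row, col) order
        sum += px[0] + px[1] + px[2];
    }
}
double avg = sum / (dst.total() * dst.channels());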
Accessing Mat elements may be confusing. I suggest getting a book on OpenCV to get started, for example this one, and read OpenCV tutorials and documentation. Hope this helps.

How to access the RGB values in OpenCV?

I am confused about the use of number of channels.
Which one is correct of the following?
// roi is the image matrix
for (int i = 0; i < roi.rows; i++)
{
    for (int j = 0; j < roi.cols; j += roi.channels())
    {
        int b = roi.at<cv::Vec3b>(i, j)[0];
        int g = roi.at<cv::Vec3b>(i, j)[1];
        int r = roi.at<cv::Vec3b>(i, j)[2];
        cout << r << " " << g << " " << b << endl;
    }
}
Or,
for (int i = 0; i < roi.rows; i++)
{
    for (int j = 0; j < roi.cols; j++)
    {
        int b = roi.at<cv::Vec3b>(i, j)[0];
        int g = roi.at<cv::Vec3b>(i, j)[1];
        int r = roi.at<cv::Vec3b>(i, j)[2];
        cout << r << " " << g << " " << b << endl;
    }
}
The second one is correct.
The rows and cols of the Mat represent the number of pixels, and the number of channels has nothing to do with them.
OpenCV uses BGR order by default, so assuming the Mat has not been converted to RGB, the code is correct.
Reference: personal experience and the OpenCV docs.
A quicker way to get color components from an image is to have the image represented as an IplImage structure and then make use of the pixel size and number of channels to iterate through it using pointer arithmetic.
For example, if you know that your image is a 3-channel image with 1 byte per pixel and its format is BGR (the default in OpenCV), the following code will get access to its components:
(In the following code, img is of type IplImage.)
for (int y = 0; y < img->height; y++) {
    for (int x = 0; x < img->width; x++) {
        uchar blue  = ((uchar*)(img->imageData + img->widthStep*y))[x*3];
        uchar green = ((uchar*)(img->imageData + img->widthStep*y))[x*3+1];
        uchar red   = ((uchar*)(img->imageData + img->widthStep*y))[x*3+2];
    }
}
For a more flexible approach, you can use the CV_IMAGE_ELEM macro defined in types_c.h:
/* get reference to pixel at (col,row),
for multi-channel images (col) should be multiplied by number of channels */
#define CV_IMAGE_ELEM( image, elemtype, row, col ) \
(((elemtype*)((image)->imageData + (image)->widthStep*(row)))[(col)])
I guess the 2nd one is correct; nevertheless, it is very time-consuming to get the data like that.
A quicker method would be to use the IplImage* data structure and advance a pointer by the size of the data contained in roi...
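With the C++ API, the same pointer-style traversal can be done with cv::Mat::ptr, which avoids the per-call index arithmetic of .at; a hedged sketch (roi assumed to be CV_8UC3, i.e. BGR):
// Row-pointer access: one ptr() call per row, then plain indexing within the row
for (int i = 0; i < roi.rows; ++i) {
    const cv::Vec3b* row = roi.ptr<cv::Vec3b>(i);
    for (int j = 0; j < roi.cols; ++j) {
        int b = row[j][0];
        int g = row[j][1];
        int r = row[j][2];
    }
}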

Camera motion compensation

I am using OpenCV to implement camera motion compensation for an application. I know I need to calculate the optical flow and then find the fundamental matrix between two frames to transform the image.
Here is what I have done so far:
void VideoStabilization::stabilize(Image *image) {
if (image->getWidth() != width || image->getHeight() != height) reset(image->getWidth(), image->getHeight());
IplImage *currImage = toCVImage(image);
IplImage *currImageGray = cvCreateImage(cvSize(width, height), IPL_DEPTH_8U, 1);
cvCvtColor(currImage, currImageGray, CV_BGRA2GRAY);
if (baseImage) {
CvPoint2D32f currFeatures[MAX_CORNERS];
char featuresFound[MAX_CORNERS];
opticalFlow(currImageGray, currFeatures, featuresFound);
IplImage *result = transformImage(currImage, currFeatures, featuresFound);
if (result) {
updateImage(image, result);
cvReleaseImage(&result);
}
}
cvReleaseImage(&currImage);
if (baseImage) cvReleaseImage(&baseImage);
baseImage = currImageGray;
updateGoodFeatures();
}
void VideoStabilization::updateGoodFeatures() {
const double QUALITY_LEVEL = 0.05;
const double MIN_DISTANCE = 5.0;
baseFeaturesCount = MAX_CORNERS;
cvGoodFeaturesToTrack(baseImage, eigImage,
tempImage, baseFeatures, &baseFeaturesCount, QUALITY_LEVEL, MIN_DISTANCE);
cvFindCornerSubPix(baseImage, baseFeatures, baseFeaturesCount,
cvSize(10, 10), cvSize(-1,-1), TERM_CRITERIA);
}
void VideoStabilization::opticalFlow(IplImage *currImage, CvPoint2D32f *currFeatures, char *featuresFound) {
const unsigned int WIN_SIZE = 15;
const unsigned int PYR_LEVEL = 5;
cvCalcOpticalFlowPyrLK(baseImage, currImage,
NULL, NULL,
baseFeatures,
currFeatures,
baseFeaturesCount,
cvSize(WIN_SIZE, WIN_SIZE),
PYR_LEVEL,
featuresFound,
NULL,
TERM_CRITERIA,
0);
}
IplImage *VideoStabilization::transformImage(IplImage *image, CvPoint2D32f *features, char *featuresFound) const {
unsigned int featuresFoundCount = 0;
for (unsigned int i = 0; i < MAX_CORNERS; ++i) {
if (featuresFound[i]) ++featuresFoundCount;
}
if (featuresFoundCount < 8) {
std::cout << "Not enough features found." << std::endl;
return NULL;
}
CvMat *points1 = cvCreateMat(2, featuresFoundCount, CV_32F);
CvMat *points2 = cvCreateMat(2, featuresFoundCount, CV_32F);
CvMat *fundamentalMatrix = cvCreateMat(3, 3, CV_32F);
unsigned int pos = 0;
for (unsigned int i = 0; i < featuresFoundCount; ++i) {
while (!featuresFound[pos]) ++pos;
cvSetReal2D(points1, 0, i, baseFeatures[pos].x);
cvSetReal2D(points1, 1, i, baseFeatures[pos].y);
cvSetReal2D(points2, 0, i, features[pos].x);
cvSetReal2D(points2, 1, i, features[pos].y);
++pos;
}
int fmCount = cvFindFundamentalMat(points1, points2, fundamentalMatrix, CV_FM_RANSAC, 1.0, 0.99);
if (fmCount < 1) {
std::cout << "Fundamental matrix not found." << std::endl;
return NULL;
}
std::cout << fundamentalMatrix->data.fl[0] << " " << fundamentalMatrix->data.fl[1] << " " << fundamentalMatrix->data.fl[2] << "\n";
std::cout << fundamentalMatrix->data.fl[3] << " " << fundamentalMatrix->data.fl[4] << " " << fundamentalMatrix->data.fl[5] << "\n";
std::cout << fundamentalMatrix->data.fl[6] << " " << fundamentalMatrix->data.fl[7] << " " << fundamentalMatrix->data.fl[8] << "\n";
cvReleaseMat(&points1);
cvReleaseMat(&points2);
IplImage *result = transformImage(image, *fundamentalMatrix);
cvReleaseMat(&fundamentalMatrix);
return result;
}
MAX_CORNERS is 100 and it usually finds around 70-90 features.
With this code, I get a weird fundamental matrix, like:
-0.000190809 -0.00114947 1.2487
0.00127824 6.57727e-05 0.326055
-1.22443 -0.338243 1
Since I just hold the camera with my hand and try not to shake it (and there weren't any objects moving), I expected the matrix to be close to identity. What am I doing wrong?
Also, I'm not sure what to use to transform the image. cvWarpAffine needs a 2x3 matrix; should I discard the last row or use another function?
What you're looking for is not the fundamental matrix but rather an affine or perspective transform.
The fundamental matrix describes the relation of two cameras having significantly different viewpoints. It is calculated such that if you have two points x (on one image) and x' (on the other) that are projections of the same point in space, then x'^T F x (the product) is zero. If x and x' are nearly identical... then the only solution is to make F nearly zero (and practically useless). That's why you've got what you have.
The matrix that should indeed be near identity is a transformation A that transforms the points x to x'= A x (the old image into the new one). Depending on what types of transformations you want to include (affine or perspective), you could (theoretically) use the functions cvGetAffineTransform or cvGetPerspectiveTransform to calculate the transform. For that, you would need 3 or 4 point pairs, respectively.
However, the best choice (I think) is cvFindHomography. It estimates a perspective transform based on all of the point pairs available, using outlier-filtering algorithms (RANSAC, for example), and gives you a 3x3 matrix.
Then you can use cvWarpPerspective to transform the images themselves.
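As a rough sketch of that pipeline with the C++ API (the function name and thresholds here are illustrative, not from the question's code):
// Estimate a perspective transform from current features back to the base frame,
// then warp the current frame into the base frame's coordinates
#include <opencv2/opencv.hpp>
#include <vector>

cv::Mat stabilizeFrame(const cv::Mat& currFrame,
                       const std::vector<cv::Point2f>& baseFeatures,
                       const std::vector<cv::Point2f>& currFeatures) {
    // RANSAC discards outlier matches; H should stay close to identity for small hand shake
    cv::Mat H = cv::findHomography(currFeatures, baseFeatures, cv::RANSAC, 3.0);
    cv::Mat stabilized;
    cv::warpPerspective(currFrame, stabilized, H, currFrame.size());
    return stabilized;
}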