Pre-processing before digit recognition with a KNN classifier - C++

Right now I'm trying to create a digit recognition system using OpenCV. There are many articles and examples on the web (and even on Stack Overflow). I decided to use a KNN classifier because it is the most popular solution on the web. I found a database of handwritten digits with a training set of 60k examples and an error rate of less than 5%.
I used this tutorial as an example of how to work with this database using OpenCV. I'm using exactly the same technique, and on the test data (t10k-images.idx3-ubyte) I get a 4% error rate. But when I try to classify my own digits the error is much bigger. For example:
is recognized as 7
and are recognized as 5
and are recognized as 1
is recognized as 8
And so on (I can upload all the images if needed).
As you can see, all the digits are of good quality and are easily recognizable to a human.
So I decided to do some pre-processing before classifying. From the table on the MNIST database site I found that people use deskewing, noise removal, blurring and pixel-shift techniques. Unfortunately, almost all the links to the articles are broken, so I decided to do such pre-processing myself, since I already know how to do it.
Right now, my algorithm is the following (a rough code sketch of these steps is included after the list):
Erode the image (I think my original digits are too rough).
Remove small contours.
Threshold and blur the image.
Center the digit (instead of shifting it).
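A rough illustrative sketch of those four steps (not the exact code used here - the kernel sizes, threshold value, minimum contour area and the centring via image moments are assumptions, written against the OpenCV 2.x API used later in this post):
#include <opencv2/opencv.hpp>
using namespace cv;

// Sketch of the pre-processing chain: erode, drop small contours,
// threshold/blur, then re-centre the digit on its centre of mass.
Mat preprocessDigit(const Mat& input /* BGR image of a single digit */)
{
    Mat gray;
    cvtColor(input, gray, CV_BGR2GRAY);

    // 1. Erode slightly to thin rough strokes
    erode(gray, gray, getStructuringElement(MORPH_RECT, Size(3, 3)));

    // 2. Threshold (digit becomes white) and remove small noise contours
    Mat bin;
    threshold(gray, bin, 128, 255, CV_THRESH_BINARY_INV);
    std::vector<std::vector<Point> > contours;
    findContours(bin.clone(), contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
    for (size_t i = 0; i < contours.size(); i++)
        if (contourArea(contours[i]) < 20.0)          // "small" is an arbitrary choice
            drawContours(bin, contours, (int)i, Scalar(0), CV_FILLED);

    // 3. Light blur to smooth jagged edges
    GaussianBlur(bin, bin, Size(3, 3), 0);

    // 4. Centre the digit: translate so its centre of mass is the image centre
    Moments m = moments(bin, true);
    if (m.m00 > 0) {
        double dx = bin.cols / 2.0 - m.m10 / m.m00;
        double dy = bin.rows / 2.0 - m.m01 / m.m00;
        Mat shift = (Mat_<double>(2, 3) << 1, 0, dx, 0, 1, dy);
        warpAffine(bin, bin, shift, bin.size());
    }
    return bin;
}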
I think deskewing is not needed in my situation because all the digits are rotated normally. Also, I have no idea how to find the right rotation angle.
So after this I've got these images:
is also 1
is 3 (not 5 as it used to be)
is 5 (not 8)
is 7 (profit!)
So, such pre-processing helped me a bit, but I need better results, because in my opinion digits like these should be recognized without problems.
Can anyone give me any advice on pre-processing? Thanks for any help.
P.S. I can upload my source code (C++).

I realized my mistake - it wasn't connected to pre-processing at all (thanks to #DavidBrown and #John). I had used a dataset of handwritten digits instead of printed ones. I didn't find such a database on the web, so I decided to create it myself. I have uploaded my database to Google Drive.
And here's how you can use it (train and classify):
int digitSize = 16;

// Returns the list of files in a specific directory
static vector<string> getListFiles(const string& dirPath)
{
    vector<string> result;
    DIR *dir;
    struct dirent *ent;
    if ((dir = opendir(dirPath.c_str())) != NULL)
    {
        while ((ent = readdir(dir)) != NULL)
        {
            if (strcmp(ent->d_name, ".") != 0 && strcmp(ent->d_name, "..") != 0)
            {
                result.push_back(ent->d_name);
            }
        }
        closedir(dir);
    }
    return result;
}

void DigitClassifier::train(const string& imagesPath)
{
    int num = 510;                      // total number of training images
    int size = digitSize * digitSize;   // number of features per sample
    Mat trainData = Mat(Size(size, num), CV_32FC1);
    Mat responces = Mat(Size(1, num), CV_32FC1);
    int counter = 0;
    for (int i = 1; i <= 9; i++)
    {
        char digit[4];                  // e.g. "1/" plus the terminating '\0'
        sprintf(digit, "%d/", i);
        string digitPath(digit);
        digitPath = imagesPath + digitPath;
        vector<string> images = getListFiles(digitPath);
        for (int j = 0; j < images.size(); j++)
        {
            // Load each sample as grayscale, scale it to digitSize x digitSize
            // and flatten it into one row of the training matrix
            Mat mat = imread(digitPath + images[j], 0);
            resize(mat, mat, Size(digitSize, digitSize));
            mat.convertTo(mat, CV_32FC1);
            mat = mat.reshape(1, 1);
            for (int k = 0; k < size; k++)
            {
                trainData.at<float>(counter * size + k) = mat.at<float>(k);
            }
            responces.at<float>(counter) = i;
            counter++;
        }
    }
    knn.train(trainData, responces);
}

int DigitClassifier::classify(const Mat& img) const
{
    Mat tmp = img.clone();
    resize(tmp, tmp, Size(digitSize, digitSize));
    tmp.convertTo(tmp, CV_32FC1);
    return knn.find_nearest(tmp.reshape(1, 1), 5);
}

5 & 6, 1 & 7 and 9 & 8 are recognized as the same because the central points of those classes are too similar. What about this?
Apply a connected component labeling method to the digits to get their real boundaries, and crop the images to these boundaries. That way you will work on a more accurate area and the central points will be normalized.
Then divide each digit horizontally into two parts. (For example, you will get two circles after dividing an "8".)
As a result, "9" and "8" are more recognizable as well as "5" and "6". Upper parts will be same but lower parts are different.

I cannot give you a better answer than your own, but I would like to contribute a piece of advice. You could improve your digit recognition system in the following way:
Apply a skeletonization process to the black-and-white patch.
After that, apply a distance transform.
This way you can improve the classifier's results when the digits are not exactly centered or are not exactly the same, morphologically speaking.
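OpenCV has no single skeletonization call, so the sketch below uses a common iterative morphological skeleton and then distanceTransform; the kernel shape and the distance type are assumptions rather than anything specified in this answer:
#include <opencv2/opencv.hpp>
using namespace cv;

// Morphological skeleton of a binary patch, followed by a distance transform.
// Assumes the foreground does not fill the entire image.
Mat skeletonThenDistance(const Mat& binaryDigit)
{
    Mat img = binaryDigit.clone();
    Mat skel = Mat::zeros(img.size(), CV_8UC1);
    Mat kernel = getStructuringElement(MORPH_CROSS, Size(3, 3));
    Mat eroded, opened, layer;

    // Peel the shape layer by layer, keeping what each opening would remove
    while (countNonZero(img) > 0) {
        erode(img, eroded, kernel);
        dilate(eroded, opened, kernel);
        subtract(img, opened, layer);
        bitwise_or(skel, layer, skel);
        img = eroded.clone();
    }

    // Distance transform of the skeleton (CV_32F output)
    Mat dist;
    distanceTransform(skel, dist, CV_DIST_L2, 3);
    return dist;
}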

Related

OpenCV homography - question about deringing Lanczos interpolation

I'm attempting to improve the performance of the OpenCV Lanczos interpolation algorithm when applying homography transformations to astronomical images, as it is prone to ringing artefacts around stars in some images.
My approach is to apply the homography twice, once using Lanczos and once using bilinear filtering, which is not susceptible to ringing but doesn't preserve detail as well. I then use the bilinear-interpolated output as a guide image, and clamp the Lanczos-interpolated output to the guide if it undershoots the guide by more than a given percentage.
I have working code (below) but have 2 questions:
It doesn't seem optimal to iterate across elements in the Mat. Is there a better way of doing the compare and replace loop using OpenCV Mat methods?
My overall approach is computationally expensive - I'm applying homography to the entire Mat twice. Is there an overall better approach to preventing deringing of lanczos interpolation? (Rewriting the entire algorithm plus all the various optimisations that OpenCV makes available is not an option for me.)
warpPerspective(in, out, H, Size(target_rx, target_ry), interpolation, BORDER_TRANSPARENT);
if (interpolation == OPENCV_LANCZOS4) {
    int count = 0;
    // factor sets how big an undershoot can be tolerated
    double factor = 0.75;
    // Create guide image
    warpPerspective(in, guide, H, Size(target_rx, target_ry), OPENCV_LINEAR, BORDER_TRANSPARENT);
    // Compare the two, replace out pixels with guide pixels if too far out
    for (int i = 0; i < out.rows; i++) {
        const double* outi = out.ptr<double>(i);
        const double* guidei = guide.ptr<double>(i);
        for (int j = 0; j < out.cols; j++) {
            if (outi[j] < guidei[j] * factor) {
                out.at<double>(i, j) = guidei[j];
                count++;
            }
        }
    }
}
With a steer from Christoph Rackwitz, the answer was surprisingly simple:
compare(out, (guide * factor), mask, CMP_LT);
guide.copyTo(out, mask);
Thanks :)
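For context, a sketch of how those two calls slot in place of the per-pixel loop; mask is a new CV_8U Mat, while every other name (including the OPENCV_LANCZOS4/OPENCV_LINEAR constants) is taken from the question's code:
warpPerspective(in, out, H, Size(target_rx, target_ry), interpolation, BORDER_TRANSPARENT);
if (interpolation == OPENCV_LANCZOS4) {
    // factor sets how big an undershoot can be tolerated
    double factor = 0.75;
    // Create guide image with ringing-free bilinear interpolation
    warpPerspective(in, guide, H, Size(target_rx, target_ry), OPENCV_LINEAR, BORDER_TRANSPARENT);
    // mask = 255 wherever the Lanczos result undershoots the guide
    Mat mask;
    compare(out, guide * factor, mask, CMP_LT);
    // Clamp those pixels to the bilinear result
    guide.copyTo(out, mask);
}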

Template Matching with Mask

I want to perform template matching with a mask. In general, template matching can be made faster by converting the image from the spatial domain into the frequency domain. But is there any method I can apply if I want to do the same with a mask? I'm using OpenCV C++. Is there any matching function already in OpenCV for this task?
My current Approach:
Bitwise XOR image A and image B with the mask.
Count the non-zero pixels.
Fill the resultant matrix with this count.
Search for maxima.
A few parameters I'm guessing at now are:
Skip the tile position if the matches are less than 25%.
Skip the tile position if the previous tile's matches are less than 50%.
My question: is there already an algorithm to do this kind of matching? Is there any mathematical operation that can speed up this process?
With binary images, you can directly use Hu moments and the Mahalanobis distance to determine whether image A is similar to image B. If the distance tends to 0, then the images are the same.
Of course you can also use feature detectors to see what matches, but for pictures like these, Hu moments and feature detectors will give approximately the same results, and Hu moments are more efficient.
Using findContours, you can extract the black regions inside the white star and fill them, in order to have image A = image B.
Another approach: using findContours on your mask and applying the result to image A (extracting the region of interest), you can extract what's inside the star and count how many black pixels you have (the mismatching ones).
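A minimal sketch of the Hu-moment comparison; it uses cv::matchShapes (which compares Hu moments internally) rather than an explicit Mahalanobis distance, so treat it as an approximation of the suggestion above:
#include <opencv2/opencv.hpp>
using namespace cv;

// Dissimilarity of two binary shapes based on their Hu moments;
// values close to 0 mean the shapes are essentially the same.
double shapeDistance(const Mat& binaryA, const Mat& binaryB)
{
    return matchShapes(binaryA, binaryB, CV_CONTOURS_MATCH_I1, 0);
}

// The raw Hu moments, if you prefer to feed them into your own
// distance measure (e.g. a Mahalanobis distance over a training set):
void huMomentsOf(const Mat& binaryImage, double hu[7])
{
    HuMoments(moments(binaryImage, true), hu);
}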
I have the same requirement and I have tried almost the same approach. As in the image, I want to match the castle. The castle has a different shield image, a variable-length clan name and a grass background (the image comes from the game Clash of Clans). The normal OpenCV matchTemplate does not work, so I wrote my own.
I follow the structure of matchTemplate to create a result image, but with a different algorithm.
The core idea is to count the matched pixels under the mask. The code follows; it is simple.
This works fine, but the time cost is high. As you can see, it costs 457 ms.
Now I am working on the optimization.
The source and template images are both CV_8UC3, and the mask image is CV_8U. Matching a single channel is OK; it is faster, but it still costs a lot.
Mat tmp(matTempl.rows, matTempl.cols, matTempl.type());   // note: Mat(rows, cols, type)
int matchCount = 0;
float maxVal = 0;
double areaInvert = 1.0 / countNonZero(matMask);
for (int j = 0; j < resultRows; j++)
{
    float* data = imgResult.ptr<float>(j);
    for (int i = 0; i < resultCols; i++)
    {
        // XOR the template against the current window, keep only the masked
        // pixels, and score the window by the fraction of matching pixels
        Mat matROI(matSource, Rect(i, j, matTempl.cols, matTempl.rows));
        tmp.setTo(Scalar(0));
        bitwise_xor(matROI, matTempl, tmp);
        bitwise_and(tmp, matMask, tmp);
        data[i] = 1.0f - float(countNonZero(tmp) * areaInvert);
        if (data[i] > matchingDegree)
        {
            SRect rc;
            rc.left = i;
            rc.top = j;
            rc.right = i + imgTemplate.cols;
            rc.bottom = j + imgTemplate.rows;
            rcOuts.push_back(rc);
            if (data[i] > maxVal)
            {
                maxVal = data[i];
                maxIndex = rcOuts.size() - 1;
            }
            if (++matchCount == maxMatchs)
            {
                Log_Warn("Too many matches, stopped at: " << matchCount);
                return true;
            }
        }
    }
}
(I don't have enough reputation to post images inline; the result is here: http://i.stack.imgur.com/mJrqU.png)
Newly added:
I successfully optimized the algorithm by using key points. Calculating all the points is costly, but it is much faster to calculate only several key points. As the picture shows, the cost decreases greatly; it is now about 7 ms. (Again, I cannot post the image inline; please visit: http://i.stack.imgur.com/ePcD9.png)
There is a technical formulation for template matching with a mask in the OpenCV documentation, and it works well. It can be used by calling cv::matchTemplate, and its source code is also available under the Intel License.
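A minimal sketch of that call; the mask parameter of matchTemplate is only available from OpenCV 3.x onward, and earlier 3.x releases honour it only for TM_SQDIFF and TM_CCORR_NORMED:
#include <opencv2/opencv.hpp>
using namespace cv;

// Masked template matching using the built-in mask parameter.
// mask must be the same size as templ; non-zero pixels are the ones compared.
Point matchWithMask(const Mat& image, const Mat& templ, const Mat& mask)
{
    Mat result;
    matchTemplate(image, templ, result, TM_CCORR_NORMED, mask);

    double maxVal;
    Point maxLoc;
    minMaxLoc(result, 0, &maxVal, 0, &maxLoc);
    return maxLoc;   // top-left corner of the best match
}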

Recognizing an image from a list with OpenCV SIFT using the FLANN matching

The point of the application is to recognize an image from an already established list of images. The images in the list have had their SIFT descriptors extracted and saved in files. Nothing interesting here:
std::vector<cv::KeyPoint> detectedKeypoints;
cv::Mat objectDescriptors;
// Extract data
cv::SIFT sift;
sift.detect(image, detectedKeypoints);
sift.compute(image, detectedKeypoints, objectDescriptors);
// Save the file
cv::FileStorage fs(file, cv::FileStorage::WRITE);
fs << "descriptors" << objectDescriptors;
fs << "keypoints" << detectedKeypoints;
fs.release();
Then the device takes a picture. SIFT descriptors are extracted in the same way. The idea now was to compare the descriptors to the ones from the files. I am doing that using the FLANN matcher from OpenCV. I am trying to quantify the similarity, image by image. After going through the whole list I should have the best match.
const cv::Ptr<cv::flann::IndexParams>& indexParams = new cv::flann::KDTreeIndexParams(1);
const cv::Ptr<cv::flann::SearchParams>& searchParams = new cv::flann::SearchParams(64);
// Match using Flann
cv::Mat indexMat;
cv::FlannBasedMatcher matcher(indexParams, searchParams);
std::vector< cv::DMatch > matches;
matcher.match(objectDescriptors, readDescriptors, matches);
After matching, I understand that I get a list of the closest distances found between the feature vectors. I find the minimum distance and, using it, I can count the "good matches" and even get a list of the corresponding points:
// Count the number of matches where the distance is less than 2 * min_dist
int goodCount = 0;
for (int i = 0; i < objectDescriptors.rows; i++)
{
    if (matches[i].distance < 2 * min_dist)
    {
        ++goodCount;
        // Save the points for the homography calculation
        obj.push_back(detectedKeypoints[matches[i].queryIdx].pt);
        scene.push_back(readKeypoints[matches[i].trainIdx].pt);
    }
}
I'm only showing the easy parts of the code to make this easier to follow; I know some of it doesn't need to be here.
Continuing, I was hoping that simply counting the number of good matches like this would be enough, but it turned out mostly to just point me to the image with the most descriptors. What I tried after this was computing the homography. The aim was to compute it and see whether it's a valid homography or not. The hope was that a good match, and only a good match, would have a homography that is a good transformation. The homography was created simply using cv::findHomography on obj and scene, which are std::vector<cv::Point2f>. I checked the validity of the homography using some code I found online:
bool niceHomography(cv::Mat H)
{
    std::cout << H << std::endl;
    const double det = H.at<double>(0, 0) * H.at<double>(1, 1) - H.at<double>(1, 0) * H.at<double>(0, 1);
    if (det < 0)
    {
        std::cout << "Homography: bad determinant" << std::endl;
        return false;
    }
    const double N1 = sqrt(H.at<double>(0, 0) * H.at<double>(0, 0) + H.at<double>(1, 0) * H.at<double>(1, 0));
    if (N1 > 4 || N1 < 0.1)
    {
        std::cout << "Homography: bad first column" << std::endl;
        return false;
    }
    const double N2 = sqrt(H.at<double>(0, 1) * H.at<double>(0, 1) + H.at<double>(1, 1) * H.at<double>(1, 1));
    if (N2 > 4 || N2 < 0.1)
    {
        std::cout << "Homography: bad second column" << std::endl;
        return false;
    }
    const double N3 = sqrt(H.at<double>(2, 0) * H.at<double>(2, 0) + H.at<double>(2, 1) * H.at<double>(2, 1));
    if (N3 > 0.002)
    {
        std::cout << "Homography: bad third row" << std::endl;
        return false;
    }
    return true;
}
I don't understand the math behind this, so while testing I sometimes replaced this function with a simple check that the determinant of the homography was positive. The problem is that I kept having issues here. The homographies were either all bad, or good when they shouldn't have been (when I was checking only the determinant).
I figured I should actually use the homography: for a number of points, compute their position in the destination image from their position in the source image, then compare the average distances, ideally getting an obviously smaller average distance for the correct image. This did not work at all. All the distances were colossal. I thought I might have applied the homography the wrong way around, but switching obj and scene gave similar results.
Other things I tried were SURF descriptors instead of SIFT, BFMatcher (brute force) instead of FLANN, taking the n smallest distances for every image instead of a number depending on the minimum distance, and taking distances relative to a global maximum distance. None of these approaches gave me definitively good results, and I feel stuck now.
My only remaining strategy would be to sharpen the images, or even turn them into binary images using some local threshold or a segmentation algorithm. I am looking for any suggestions or mistakes anyone can see in my work.
I don't know whether this is relevant, but I have added some of the images I am testing this on. In many of the test images most of the SIFT vectors come from the frame (higher contrast) rather than the painting. This is why I think sharpening the images might work, but I don't want to go deeper in case something I did earlier is wrong.
The gallery of images is here, with descriptions in the titles. The images are of quite high resolution; please take a look in case they give some hints.
You can test whether, when matching, the lines between the source image and the target image are relatively parallel. If it's not a correct match, you'll have a lot of noise and the lines won't be parallel.
See the attached image, which shows a correct match (using SURF and BF): all the lines are mostly parallel (though I should point out that this is an easy example).
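A rough sketch of how that check could be quantified - my own interpretation rather than the answerer's code: it measures the spread of the angles of the match lines as drawMatches would draw them side by side (a small spread suggests roughly parallel lines):
#include <opencv2/opencv.hpp>
#include <cmath>
#include <vector>
using namespace cv;

// Standard deviation of the match-line angles when the query image is drawn
// to the left of the train image (as in drawMatches).
double matchAngleSpread(const std::vector<KeyPoint>& queryKeypoints,
                        const std::vector<KeyPoint>& trainKeypoints,
                        const std::vector<DMatch>& matches,
                        int queryImageWidth)
{
    if (matches.empty())
        return 0.0;

    std::vector<double> angles;
    for (size_t i = 0; i < matches.size(); i++) {
        Point2f a = queryKeypoints[matches[i].queryIdx].pt;
        Point2f b = trainKeypoints[matches[i].trainIdx].pt;
        b.x += queryImageWidth;                 // shift as if the images sit side by side
        angles.push_back(std::atan2(b.y - a.y, b.x - a.x));
    }

    double mean = 0.0, var = 0.0;
    for (size_t i = 0; i < angles.size(); i++) mean += angles[i];
    mean /= angles.size();
    for (size_t i = 0; i < angles.size(); i++) var += (angles[i] - mean) * (angles[i] - mean);
    return std::sqrt(var / angles.size());      // small value -> mostly parallel lines
}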
You are going the correct way.
First, use the second-nearest-neighbour ratio instead of your "good match by 2*min_dist" criterion: https://stackoverflow.com/a/23019889/1983544 (a sketch of this and of the next point follows at the end of this answer).
Second, use the homography differently. When you find the homography, you get not only the H matrix but also the number of correspondences consistent with it. Check whether this is a reasonable number, say >= 15. If it is less, then the object is not matched.
Third, if you have a big viewpoint change, SIFT or SURF are unable to match the images. Try MODS instead (http://cmp.felk.cvut.cz/wbs/ has Windows and Linux binaries, as well as a paper describing the algorithm) or ASIFT (much slower and matches much worse, but open source): http://www.ipol.im/pub/art/2011/my-asift/
Or at least use an MSER or Hessian-Affine detector instead of SIFT (keeping SIFT as the descriptor).
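Not the answerer's code - a rough sketch of the first two suggestions. The 0.8 ratio is the usual value from Lowe's paper, the >= 15 inlier threshold follows the answer, and the helper names are mine:
#include <opencv2/opencv.hpp>
#include <vector>
using namespace cv;

// 1) Second-nearest-neighbour ratio test: keep a match only when its best
//    distance is clearly smaller than the distance to the second candidate.
std::vector<DMatch> ratioTest(FlannBasedMatcher& matcher,
                              const Mat& queryDescriptors,
                              const Mat& trainDescriptors,
                              float ratio = 0.8f)
{
    std::vector<std::vector<DMatch> > knnMatches;
    matcher.knnMatch(queryDescriptors, trainDescriptors, knnMatches, 2);

    std::vector<DMatch> good;
    for (size_t i = 0; i < knnMatches.size(); i++)
        if (knnMatches[i].size() == 2 &&
            knnMatches[i][0].distance < ratio * knnMatches[i][1].distance)
            good.push_back(knnMatches[i][0]);
    return good;
}

// 2) Count the correspondences consistent with a RANSAC homography;
//    fewer than ~15 inliers suggests the object is not in the scene.
int countHomographyInliers(const std::vector<Point2f>& obj,
                           const std::vector<Point2f>& scene)
{
    if (obj.size() < 4)
        return 0;
    std::vector<uchar> inlierMask;
    findHomography(obj, scene, RANSAC, 3.0, inlierMask);
    return countNonZero(Mat(inlierMask));
}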

How to compute 2D log-chromaticity?

My goal is to remove shadows from an image. I use C++ and OpenCV. I surely lack enough math background, and not being a native English speaker makes everything harder to understand.
After reading different approaches to shadow removal I found a method that should work for me, but it relies on something they call "2D chromaticity" and "2D log-chromaticity space", and even this term seems to be used inconsistently across sources. There are many papers on the topic; a few are listed here:
http://www.cs.cmu.edu/~efros/courses/LBMV09/Papers/finlayson-eccv-04.pdf
http://www2.cmp.uea.ac.uk/Research/compvis/Papers/DrewFinHor_ICCV03.pdf
http://www.cvc.uab.es/adas/publications/alvarez_2008.pdf
http://ivrgwww.epfl.ch/alumni/fredemba/papers/FFICPR06.pdf
I tore Google into strips searching for the right words and explanations. The best I found is Illumination invariant image, which did not help me much.
I tried to reproduce the formula log(G/R), log(B/R) described in the first paper (page 3) to get figures similar to Fig. 2b.
As input I used http://en.wikipedia.org/wiki/File:Gretag-Macbeth_ColorChecker.jpg
The output I get is:
My source code:
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include <iostream>
#include <stdio.h>
using namespace std;
using namespace cv;
int main( int argc, char** argv ) {
Mat src;
src = imread( argv[1], 1 );
if( !src.data )
{ return -1; }
Mat image( 600, 600, CV_8UC3, Scalar(127,127,127) );
int cn = src.channels();
uint8_t* pixelPtr = (uint8_t*)src.data;
for(int i=0 ; i< src.rows;i++) {
for(int j=0 ; j< src.cols;j++) {
Scalar_<uint8_t> bgrPixel;
bgrPixel.val[0] = pixelPtr[i*src.cols*cn + j*cn + 0]; // B
bgrPixel.val[1] = pixelPtr[i*src.cols*cn + j*cn + 1]; // G
bgrPixel.val[2] = pixelPtr[i*src.cols*cn + j*cn + 2]; // R
if(bgrPixel.val[2] !=0 ) { // avoid division by zero
float a= image.cols/2+50*(log((float)bgrPixel.val[0] / (float)bgrPixel.val[2])) ;
float b= image.rows/2+50*(log((float)bgrPixel.val[1] / (float)bgrPixel.val[2])) ;
if(!isinf(a) && !isinf(b))
image.at<Vec3b>(a,b)=Vec3b(255,2,3);
}
}
}
imshow("log-chroma", image );
imwrite("log-chroma.png", image );
waitKey(0);
}
What am I missing or misunderstanding?
From reading the paper Recovery of Chromaticity Image Free from Shadows via Illumination Invariance that you've posted, and your code, I guess the problem is that your coordinate system (x/y axes) is linear, while in the paper the coordinate system is log(R/G) by log(B/G).
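A minimal sketch of that change, with G as the reference channel so the axes become log(R/G) and log(B/G); the 50x scale, the centre offset and the plot size are kept from the question's code, the rest is illustrative:
#include <opencv2/opencv.hpp>
#include <cmath>
using namespace cv;

// Plot the 2D log-chromaticity of a BGR image with G as the reference channel.
Mat plotLogChromaticity(const Mat& src /* CV_8UC3, BGR */)
{
    Mat plot(600, 600, CV_8UC3, Scalar(127, 127, 127));
    for (int i = 0; i < src.rows; i++) {
        for (int j = 0; j < src.cols; j++) {
            Vec3b bgr = src.at<Vec3b>(i, j);
            if (bgr[1] == 0)                 // avoid dividing by a zero green channel
                continue;
            float x = plot.cols / 2 + 50 * std::log((float)bgr[2] / (float)bgr[1]); // log(R/G)
            float y = plot.rows / 2 + 50 * std::log((float)bgr[0] / (float)bgr[1]); // log(B/G)
            if (x >= 0 && x < plot.cols && y >= 0 && y < plot.rows)
                plot.at<Vec3b>((int)y, (int)x) = Vec3b(255, 2, 3);
        }
    }
    return plot;
}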
This is the closest I can figure. Reading through this:
http://www2.cmp.uea.ac.uk/Research/compvis/Papers/DrewFinHor_ICCV03.pdf
I came across the sentence:
"Fig. 2(a) shows log-chromaticities for the 24 surfaces of a Macbeth ColorChecker Chart, (the six neutral patches all belong to the same
cluster). If we now vary the lighting and plot median values
for each patch, we see the curves in Fig. 2(b)."
If you look closely at the log-chromaticity plot, you see 19 blobs: one for each of the 18 colors in the Macbeth chart, plus one shared by the 6 grayscale targets in the bottom row:
Explanation of Log Chromaticities
With one picture, we can only get one point of each blob: we take the median value inside each target and plot it. To get the plot from the paper, we would have to create multiple images under different lighting. We might be able to do this by varying the color temperature of the image in an image editor.
For now, I just looked at the color patches in the original image and plotted the points:
Input:
Color Patches Used
Output:
Log Chromaticity
The graph dots are not all in the same place as in the paper, but I figure it's fairly close. Would someone please check my work to see if this makes sense?
In that OpenCV code I got an "undefined identifier" error for the function isinf(), and I solved it by replacing it with _finite(). That might be an issue with the Visual Studio version.
if(!isinf(a) && !isinf(b)) ----> if(_finite(a) && _finite(b))
Include this header:
#include<float.h>

OpenCV: Output of the predict function of Expectation Maximization

Background:
I have 2 sets of color pixels from an image, one corresponding to the background and another to the foreground. Next, I train two Gaussian mixture models using EM from OpenCV, one for each set. My aim is to find the probability that a random pixel belongs to the foreground and to the background. Thus, I use the "predict" function of each EM on my pixel.
Question:
I don't understand the values returned by this function. In the documentation of OpenCV, it is written:
The method returns a two-element double vector. Zero element is a likelihood logarithm value for the sample. First element is an index of the most probable mixture component for the given sample.
http://docs.opencv.org/modules/ml/doc/expectation_maximization.html?highlight=predict#Vec2d%20EM::predict%28InputArray%20sample,%20OutputArray%20probs%29%20const
I don't understand what "likelihood logarithm" means. In my results, I sometimes have negative values and values > 1. Has anyone who used the same function gotten this kind of result, or results between 0 and 1? What can I conclude from my results?
How can I get the probability of a pixel belonging to the whole GMM (not the probability of belonging to each cluster of the GMM)?
Here is my code:
Mat mask = imread("mask.tif", 0);
Mat formerImage = imread("ImageFormer.tif");
Mat currentImage = imread("ImageCurrent.tif");

// number of clusters in the GMM
int nClusters = 5;

int countB=0, countF=0;
Vec3b color;
Vec2d probFg, probBg; // probabilities to belong to the foreground or background from GMMs

// count the number of pixels for each training set
for(int c=0; c<=40;c++) {
    for(int l=0; l<=40;l++) {
        if(mask.at<BYTE>(l, c)==255) {
            countF++;
        } else if(mask.at<BYTE>(l, c)==0) {
            countB++;
        }
    }
}
printf("countB %d countF %d \n", countB, countF);

Mat samplesForeground = Mat(countF,3, CV_64F);
Mat samplesBackground = Mat(countB,3, CV_64F);

// Expectation-Maximisation able to resolve the GMM and to predict the probability for a pixel to belong to the GMM.
EM em_foreground= EM(nClusters);
EM em_background= EM(nClusters);

countB=0;
countF=0;

// fill the training data from the former image depending on the mask
for(int c=0; c<=40;c++) {
    for(int l=0; l<=40;l++) {
        if(mask.at<BYTE>(l, c)==255) {
            color = formerImage.at<Vec3b>(l, c);
            samplesForeground.at<double>(countF,0)=color[0];
            samplesForeground.at<double>(countF,1)=color[1];
            samplesForeground.at<double>(countF,2)=color[2];
            countF++;
        } else if(mask.at<BYTE>(l, c)==0) {
            color = formerImage.at<Vec3b>(l, c);
            samplesBackground.at<double>(countB, 0)=color[0];
            samplesBackground.at<double>(countB, 1)=color[1];
            samplesBackground.at<double>(countB, 2)=color[2];
            countB++;
        }
    }
}
printf("countB %d countF %d \n", countB, countF);

em_foreground.train(samplesForeground);
em_background.train(samplesBackground);

Mat sample(1, 3, CV_64F);

// try every pixel of the current image and get the log likelihood
for(int c=0; c<=40;c++) {
    for(int l=0; l<=40;l++) {
        color = currentImage.at<Vec3b>(l,c);
        sample.at<double>(0)=color[0];
        sample.at<double>(1)=color[1];
        sample.at<double>(2)=color[2];
        probFg=em_foreground.predict(sample);
        probBg=em_background.predict(sample);
        if(probFg[0]>0 || probBg[0]>0)
            printf("probFg[0] %f probBg[0] %f \n", probFg[0], probBg[0]);
    }
}
EDIT
After #BrianL's explanation, I now understand the log likelihood.
My problem is that the log probability returned by the predict function is sometimes > 0, but it should be <= 0. Has anyone met this problem before?
I have edited the code above to show the problem. I have tried the program with images below:
The first image is the ImageCurrent.tif, the second is the ImageFormer.tif and the last one is mask.tif.
Can this be considered a bug in OpenCV? Should I open a ticket on the OpenCV bug tracker?
The "likelihood logarithm" means the log of the probability. Since for a probability p we expect 0 ≤ p ≤ 1, I would expect the values to be negative: log(p) ≤ 0. Larger negative numbers imply smaller probabilities.
This form is helpful when you are dealing with products of very small probabilities: if you multiplied the normal way, you could easily get underflow and lose precision because the probability becomes very small. But in log space the multiplication turns into an addition, which improves the accuracy and also potentially the speed of the calculation.
The predict function is for classifying a data point. If you want to give a point a score for how likely it is to belong to any component in the model, you can use the model parameters (see the get documentation) to calculate it yourself.
As I understand it, you have two separate GMMs for the foreground and background parts of the image. The total probability of a sample pixel x in the test image, when evaluated with the foreground GMM, is
P_fg(x) = sum_{j=1..k} ( Wj_fg * Pj_fg(x) )
where
k = number of clusters in the foreground GMM
x = test sample
Pj_fg(x) = probability that sample x is in the j-th cluster according to the foreground GMM
Wj_fg = weight of the j-th cluster in the foreground GMM
Also, the sum of all weights should be 1 for each GMM.
We can do a similar calculation for the background GMM.
From looking at the EM code in OpenCV, it looks like the first of the two values that EM returns is the log likelihood. For the foreground GMM this is
log(P_fg(x_i))
I implemented your algorithm, and for each pixel in the test image I compared the log-likelihoods returned by the two GMMs and assigned the pixel to the GMM with the higher value. I got decent results.
In that respect, yes, this value is an indication of the pixel belonging to the entire GMM.
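A minimal sketch of that per-pixel comparison, reusing the names from the question's code; predict's first element is the log-likelihood of the sample under the whole mixture, as noted above:
// Classify each pixel by whichever GMM gives it the higher log-likelihood.
Mat labels(currentImage.rows, currentImage.cols, CV_8UC1);
for (int l = 0; l < currentImage.rows; l++) {
    for (int c = 0; c < currentImage.cols; c++) {
        Vec3b color = currentImage.at<Vec3b>(l, c);
        Mat sample = (Mat_<double>(1, 3) << color[0], color[1], color[2]);
        double llFg = em_foreground.predict(sample)[0];
        double llBg = em_background.predict(sample)[0];
        labels.at<uchar>(l, c) = (llFg > llBg) ? 255 : 0;   // 255 = foreground
    }
}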
2) In my implementation of your problem, I always got the log likelihoods of all GMMs, for all test-sample pixels, below 0.
I notice that you are doing graph-cut-based image segmentation.
You might want to take a look at the following blog post, which uses OpenCV and its GMM class in a very similar way to what you are doing, to perform graph-cut-based image segmentation. The code is given in C++ with detailed explanations. Here is the link: link
Basically, I can only say that the log probability, whether it is correct or not, is not what you are looking for. Check out the above link for details.