Motion detection using OpenCV/C++, threshold always becomes zero - c++

I am working on C++ crowd-detection code with OpenCV that takes two frames and subtracts them, then compares the result with a threshold.
This is the first time I have used OpenCV with C++, and I don't know much about it yet.
These are the steps of the code:
Take two video frames with α minutes in between.
Convert frames to black and white images.
Subtract the two frames.
Compare the difference with the threshold.
If the difference <= threshold, then a crowd is detected; otherwise there is no crowd.
C++ code:
#include <iostream>
#include <opencv2/opencv.hpp>
using namespace std;
using namespace cv;
int main(int argc, const char* argv[])
{
    // first frame
    Mat current_frame = imread("image1.jpg", CV_LOAD_IMAGE_GRAYSCALE);
    // second frame
    Mat previous_frame = imread("image2.jpg", CV_LOAD_IMAGE_GRAYSCALE);
    // subtract the previous frame from the current frame and store the result
    Mat result = current_frame - previous_frame;
    // compare the difference with the threshold: if the difference < 70 there is a crowd,
    // if it is > 70 there is no crowd
    int threshold = cv::threshold(result, result, 0, 70, CV_THRESH_BINARY);
    if (threshold == 0) {
        cout << "crowd detected \n";
    }
    else {
        cout << " no crowd detected \n ";
    }
}
The problem is:
The threshold is always zero!
And the output is always:
crowd detected
even when there is no crowd.
We don't care about the output image because we won't use it; we just want to know the final value of the threshold.
My aim is to know how much difference there is between the two frames. I want to compare the difference with a threshold to detect a human crowd in a specific place.
I hope that one of you can help me.
Thank you

There are a couple of flaws in your usage of the threshold function:
It just returns the threshold value (not what you expected).
'We don't care about the output image' - well, you'd better!
If you want to threshold against a value of 70, that should be your 3rd arg, not the 4th (the 4th arg is the value that anything > thresh is set to).
What you probably wanted is:
cv::threshold(result,result,70,1, CV_THRESH_BINARY); // same as : result = result>70;
int on_pixels = countNonZero(result);
// or:
int on_pixels = sum(result)[0];
[Sidenote: if you really want to detect human crowds, you'll have to put much more sweat into this. Just diffing frames is prone to errors from illumination changes; there are also birds, cars and traffic lights.]
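Putting the advice above together, a minimal corrected sketch might look like the following. Note that absdiff and the pixel-count cutoff of 500 are illustrative choices, not part of the original answer; whether many changed pixels means "crowd" or "no crowd" depends on what your reference frame shows, so adapt the comparison accordingly.
#include <iostream>
#include <opencv2/opencv.hpp>
using namespace std;
using namespace cv;
int main()
{
    Mat current_frame  = imread("image1.jpg", CV_LOAD_IMAGE_GRAYSCALE);
    Mat previous_frame = imread("image2.jpg", CV_LOAD_IMAGE_GRAYSCALE);
    if (current_frame.empty() || previous_frame.empty())
        return -1;
    // absdiff avoids the unsigned underflow you get with a plain subtraction
    Mat diff;
    absdiff(current_frame, previous_frame, diff);
    // keep only pixels that changed by more than 70 grey levels
    cv::threshold(diff, diff, 70, 255, CV_THRESH_BINARY);
    // decide based on how many pixels actually changed (500 is an arbitrary example)
    int changed_pixels = countNonZero(diff);
    if (changed_pixels > 500)
        cout << "crowd detected\n";
    else
        cout << "no crowd detected\n";
    return 0;
}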

Related

colorbalance in an image using c++ and opencv

I'm trying to score the color balance of an image using C++ and OpenCV.
The easiest way to do this is to count the number of pixels of each color and then see whether one of the colors is more prevalent.
I figured I should probably use calcHist, and with the split function I can split an image into R, G, and B histograms. However, I am unsure what to do next. I could probably walk through all the bins and just count how many pixels are in there, but this seems like a lot of work (I currently use 256 bins).
Is there a faster way to count the pixels in a color range? Also, I am not sure how it would work if white or black were the more prevalent colors.
An automatic color balance algorithm is described in this link: http://web.stanford.edu/~sujason/ColorBalancing/simplestcb.html
For C++ code you can refer to this link: https://www.morethantechnical.com/2015/01/14/simplest-color-balance-with-opencv-wcode/
#include <opencv2/opencv.hpp>
#include <iostream>
using namespace std;
using namespace cv;

/// perform the Simplest Color Balancing algorithm
void SimplestCB(Mat& in, Mat& out, float percent) {
    assert(in.channels() == 3);
    assert(percent > 0 && percent < 100);
    float half_percent = percent / 200.0f;
    vector<Mat> tmpsplit; split(in, tmpsplit);
    for (int i = 0; i < 3; i++) {
        // find the low and high percentile values (based on the input percentile)
        Mat flat; tmpsplit[i].reshape(1, 1).copyTo(flat);
        cv::sort(flat, flat, CV_SORT_EVERY_ROW + CV_SORT_ASCENDING);
        int lowval  = flat.at<uchar>(cvFloor(((float)flat.cols) * half_percent));
        int highval = flat.at<uchar>(cvCeil(((float)flat.cols) * (1.0 - half_percent)));
        cout << lowval << " " << highval << endl;
        // saturate below the low percentile and above the high percentile
        tmpsplit[i].setTo(lowval, tmpsplit[i] < lowval);
        tmpsplit[i].setTo(highval, tmpsplit[i] > highval);
        // scale the channel
        normalize(tmpsplit[i], tmpsplit[i], 0, 255, NORM_MINMAX);
    }
    merge(tmpsplit, out);
}

// Usage example
int main() {
    Mat tmp, im = imread("lily.png");
    SimplestCB(im, tmp, 1);
    imshow("orig", im);
    imshow("balanced", tmp);
    waitKey(0);
    return 0;
}
Colour balance normally means looking at a white (or grey) surface and checking the ratios of red and blue to green. A perfectly balanced system would have equal signal levels in red and blue.
You can then simply work out the average red/blue scaling from the test grey-card image and apply the same scaling to your real image.
Doing it on a live image with no reference is trickier; you have to find areas that are probably white (i.e. bright and nearly r=g=b) and use them as the reference.
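For the grey-card case, a rough sketch of that scaling might look like this. It assumes 8-bit BGR images, and the function and variable names are illustrative, not from the answer above.
#include <opencv2/opencv.hpp>
#include <vector>

// Sketch: estimate per-channel gains from a photo of a grey card and apply
// the same scaling to the real image (green is used as the reference channel).
void applyGreyCardBalance(const cv::Mat& greyCard, cv::Mat& image)
{
    cv::Scalar m = cv::mean(greyCard);       // average B, G, R over the card
    double gainB = m[1] / m[0];              // scale blue so it matches green
    double gainR = m[1] / m[2];              // scale red so it matches green
    std::vector<cv::Mat> ch;
    cv::split(image, ch);
    ch[0].convertTo(ch[0], ch[0].type(), gainB);   // convertTo saturates to 0..255
    ch[2].convertTo(ch[2], ch[2].type(), gainR);
    cv::merge(ch, image);
}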
There's no definitive algorithm for colour balance, so anything you might implement, however good it is, will probably fail in some conditions.
One of the simplest algorithms is called Grey World, and assumes that statistically the average colour of a scene should be grey. And if it isn't, it means that it needs to be corrected to grey. So, very simply (in pseudo-python), if you have an image RGB:
cc[0] = np.mean(RGB[:,0]) # calculating channel-wise average
cc[1] = np.mean(RGB[:,1])
cc[2] = np.mean(RGB[:,2])
cc = cc / np.sqrt((cc**2).sum()) # normalise the light (you might want to
                                 # play with this a bit)
RGB /= cc # divide every pixel by the estimated light
Note that here I'm assuming that RGB is an array of floats with values between 0 and 1. Something else that helps is to exclude from the average pixels that contain values below and above certain thresholds (e.g., below 0.05 and above 0.95). This way you ignore pixels whose value is heavily influenced by noise (small values) and pixels that saturated the camera sensor and whose colour may not be reliable (large values).
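For completeness, a rough C++/OpenCV equivalent of the pseudo-code above could look like the following. This is only a sketch assuming an 8-bit BGR input; the exclusion of very dark and very bright pixels mentioned above is omitted for brevity.
#include <opencv2/opencv.hpp>
#include <vector>

// Grey World sketch: scale each channel so its mean matches the overall grey mean.
cv::Mat greyWorld(const cv::Mat& bgr)
{
    CV_Assert(bgr.type() == CV_8UC3);
    cv::Mat img;
    bgr.convertTo(img, CV_32FC3, 1.0 / 255.0);         // work with floats in [0, 1]
    cv::Scalar means = cv::mean(img);                  // per-channel averages
    double grey = (means[0] + means[1] + means[2]) / 3.0;
    std::vector<cv::Mat> ch;
    cv::split(img, ch);
    for (int i = 0; i < 3; i++)
        ch[i] = ch[i] * (grey / means[i]);             // divide every pixel by the estimated light
    cv::Mat out;
    cv::merge(ch, out);
    out.convertTo(out, CV_8UC3, 255.0);
    return out;
}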

How to compute 2D log-chromaticity?

My goal is to remove shadows from an image. I use C++ and OpenCV. I admittedly lack the necessary math background, and not being a native English speaker makes everything harder to understand.
After reading about different approaches to removing shadows, I found a method that should work for me, but it relies on something called "2D chromaticity" and "2D log-chromaticity space", and even this term seems to be used inconsistently across sources. There are many papers on the topic; a few are listed here:
http://www.cs.cmu.edu/~efros/courses/LBMV09/Papers/finlayson-eccv-04.pdf
http://www2.cmp.uea.ac.uk/Research/compvis/Papers/DrewFinHor_ICCV03.pdf
http://www.cvc.uab.es/adas/publications/alvarez_2008.pdf
http://ivrgwww.epfl.ch/alumni/fredemba/papers/FFICPR06.pdf
I have torn Google apart searching for the right words and explanations. The best I found is Illumination invariant image, which did not help me much.
I tried to reproduce the formula log(G/R), log(B/R) described in the first paper, page 3, to get figures similar to Fig. 2b.
As input I used http://en.wikipedia.org/wiki/File:Gretag-Macbeth_ColorChecker.jpg
The output I get is:
My source code:
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include <iostream>
#include <stdio.h>
using namespace std;
using namespace cv;
int main(int argc, char** argv) {
    Mat src = imread(argv[1], 1);
    if (!src.data)
        return -1;
    Mat image(600, 600, CV_8UC3, Scalar(127, 127, 127));
    int cn = src.channels();
    uint8_t* pixelPtr = (uint8_t*)src.data;
    for (int i = 0; i < src.rows; i++) {
        for (int j = 0; j < src.cols; j++) {
            Scalar_<uint8_t> bgrPixel;
            bgrPixel.val[0] = pixelPtr[i*src.cols*cn + j*cn + 0]; // B
            bgrPixel.val[1] = pixelPtr[i*src.cols*cn + j*cn + 1]; // G
            bgrPixel.val[2] = pixelPtr[i*src.cols*cn + j*cn + 2]; // R
            if (bgrPixel.val[2] != 0) { // avoid division by zero
                float a = image.cols/2 + 50*(log((float)bgrPixel.val[0] / (float)bgrPixel.val[2]));
                float b = image.rows/2 + 50*(log((float)bgrPixel.val[1] / (float)bgrPixel.val[2]));
                if (!isinf(a) && !isinf(b))
                    image.at<Vec3b>(a, b) = Vec3b(255, 2, 3);
            }
        }
    }
    imshow("log-chroma", image);
    imwrite("log-chroma.png", image);
    waitKey(0);
}
What am I missing or misunderstanding?
From reading the paper Recovery of Chromaticity Image Free from Shadows via Illumination Invariance that you've posted, and your code, I guess the problem is that your coordinate system (X/Y axes) is linear, while in the paper the coordinates are log(R/G) by log(B/G).
This is the closest I can figure. Reading through this:
http://www2.cmp.uea.ac.uk/Research/compvis/Papers/DrewFinHor_ICCV03.pdf
I came across the sentence:
"Fig. 2(a) shows log-chromaticities for the 24 surfaces of a Macbeth ColorChecker Chart, (the six neutral patches all belong to the same
cluster). If we now vary the lighting and plot median values
for each patch, we see the curves in Fig. 2(b)."
If you look closely at the log-chromaticity plot, you see 19 blobs: one for each of the 18 colors in the Macbeth chart, plus one where all 6 grayscale targets in the bottom row fall together:
Explanation of Log Chromaticities
With one picture, we can only get one point of each blob: we take the median value inside each target and plot it. To get the plot from the paper, we would have to create multiple images under different lighting. We might be able to do this by varying the temperature of the image in an image editor.
For now, I just looked at the color patches in the original image and plotted the points:
Input:
Color Patches Used
Output:
Log Chromaticity
The graph dots are not all in the same place as the paper, but I figure it's fairly close. Would someone please check my work to see if this makes sense?
In that OpenCV code I got an "undefined identifier" error for the function isinf(), and I solved it by replacing it with _finite(). That might be an issue with the Visual Studio version.
if(!isinf(a) && !isinf(b)) ----> if(_finite(a) && _finite(b))
Include this header:
#include <float.h>
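For what it's worth, here is a slightly cleaner version of the question's plotting loop. It is only a sketch: it keeps the question's 600x600 canvas and scale factor of 50, but uses cv::Mat accessors, explicit row/column indexing and a bounds check, with x = log(G/R) and y = log(B/R) as in the paper.
#include "opencv2/opencv.hpp"
#include <cmath>
using namespace cv;
int main(int argc, char** argv)
{
    if (argc < 2) return -1;
    Mat src = imread(argv[1], 1);
    if (src.empty()) return -1;
    Mat plot(600, 600, CV_8UC3, Scalar(127, 127, 127));
    for (int i = 0; i < src.rows; i++) {
        for (int j = 0; j < src.cols; j++) {
            Vec3b bgr = src.at<Vec3b>(i, j);
            if (bgr[0] == 0 || bgr[1] == 0 || bgr[2] == 0)
                continue;                               // avoid log(0) and division by zero
            float x = std::log((float)bgr[1] / bgr[2]); // log(G/R)
            float y = std::log((float)bgr[0] / bgr[2]); // log(B/R)
            int col = cvRound(plot.cols / 2 + 50 * x);
            int row = cvRound(plot.rows / 2 + 50 * y);
            if (row >= 0 && row < plot.rows && col >= 0 && col < plot.cols)
                plot.at<Vec3b>(row, col) = Vec3b(255, 2, 3);
        }
    }
    imshow("log-chroma", plot);
    imwrite("log-chroma.png", plot);
    waitKey(0);
    return 0;
}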

How to randomly choose sample points that maximize space occupation?

I would like to generate sample points that randomly fill/cover a space (like in the attached image). I think there is a method called "quasi-random" sampling that can generate such sample points, but it is a little beyond my knowledge. Can someone make suggestions, help me find a library that can do this, or suggest how to start writing such a program?
In the image, 256 sample points are placed at random positions so that they cover the whole given space.
Update:
I just tried some code from the Halton Quasi-random Sequence library and compared it with the pseudo-random result posted in an answer below. The result of Halton's method is better, in my opinion. I would like to share some results, as below.
The code I wrote is:
#include "halton.hpp"
#include "opencv2/opencv.hpp"
int main()
{
    int m_dim_num = 2;
    int m_n = 50;
    int m_seed[2], m_leap[2], m_base[2];
    double m_r[100];
    for (int i = 0; i < m_dim_num; i++)
    {
        m_seed[i] = 0;
        m_leap[i] = 1;
        m_base[i] = 2 + i;
    }
    cv::Mat out(100, 100, CV_8UC1, cv::Scalar(255)); // start from a blank canvas (the original left this uninitialized)
    i4_to_halton_sequence(m_dim_num, m_n, 0, m_seed, m_leap, m_base, m_r);
    int displaced = 100;
    for (int i = 0; i < 100; i = i + 2)
    {
        cv::circle(out, cv::Point2d(m_r[i] * displaced, m_r[i+1] * displaced), 1, cv::Scalar(0, 255, 0), 1, 8, 0);
    }
    cv::imshow("test", out);
    cv::waitKey(0);
    return 0;
}
As I am a little familiar with OpenCV, I wrote this code to plot onto an OpenCV matrix (Mat). The i4_to_halton_sequence() function comes from the library I mentioned above.
The result is not great, but it might be usable for my work somehow. Does anyone have another idea?
I am going to give an answer that will seem half-assed. However, this topic has been studied extensively in the literature, so I will just refer you to some summaries from Wikipedia and other places online.
What you want is also called a low-discrepancy sequence (or quasi-random, as you pointed out). You can read more about it here: http://en.wikipedia.org/wiki/Low-discrepancy_sequence. It's useful for a number of things, which include numerical integration and, more recently, simulating retinal ganglion mosaics.
There are many ways to generate low-discrepancy sequences (or pseudo quasi random sequences :p). Some of these are in ACM Collected Algorithms (http://www.netlib.org/toms/index.html).
The most common of these, I think, is the Sobol sequence (algorithm 659 from the ACM collection). You can get some details on this here: http://en.wikipedia.org/wiki/Sobol_sequence
For the most part, unless you are really into it, that stuff looks pretty scary. For quick results, I would use GNU's GSL (GNU Scientific Library): http://www.gnu.org/software/gsl/
This library includes code to generate quasi-random sequences (http://www.gnu.org/software/gsl/manual/html_node/Quasi_002dRandom-Sequences.html) including Sobol sequence (http://www.gnu.org/software/gsl/manual/html_node/Quasi_002drandom-number-generator-examples.html).
If you're still stuck, I can paste some code here, but you're better off digging into GSL.
Well, here's another way to do quasi-random sampling that covers the entire space.
Since you have 256 points to use, you can start by plotting those points as a 16x16 grid.
Then apply some function that gives a random offset to each point (say 0 to ±2 on each point's x and y coordinates).
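A minimal sketch of that jittered-grid idea (the canvas size, jitter range, and point radius are arbitrary illustrative values):
#include <cstdlib>
#include <ctime>
#include "opencv2/opencv.hpp"
int main()
{
    const int grid = 16;                  // 16x16 = 256 points
    const int size = 512;                 // illustrative canvas size
    const int cell = size / grid;         // one point per grid cell
    cv::Mat out(size, size, CV_8UC3, cv::Scalar(255, 255, 255));
    srand((unsigned)time(0));
    for (int gy = 0; gy < grid; gy++) {
        for (int gx = 0; gx < grid; gx++) {
            // start at the cell centre, then jitter the point inside its cell
            int x = gx * cell + cell / 2 + (rand() % cell - cell / 2);
            int y = gy * cell + cell / 2 + (rand() % cell - cell / 2);
            cv::circle(out, cv::Point(x, y), 2, cv::Scalar(0, 0, 0), -1);
        }
    }
    cv::imshow("jittered grid", out);
    cv::waitKey(0);
    return 0;
}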
You could create equidistant points (all points having the same distance to their neighbours) and then, in a second step, move each point randomly a bit so that they appear 'random'.
The second idea I have is:
1. Start with one area.
2. Create a random point P_rand about the 'middle' of your area.
3. Divide the area into 4 sub-areas at that point. P_rand is the upper-right corner of the lower-left sub-area, the upper-left corner of the lower-right sub-area, and so on.
4. Repeat steps 2..4 for each of the 4 sub-areas. Of course, not forever, but until you're satisfied.
This algorithm ensures that each 'hole' (i.e. each new sub-area) is filled with a point.
Update: your initial area should be twice as large as your target area, because of step 2. This ensures you get points at the edges and corners as well.
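A rough sketch of this subdivision idea; the recursion depth and the 'middle third' jitter window are arbitrary choices, not part of the answer above:
#include <cstdlib>
#include <vector>
#include "opencv2/opencv.hpp"
// Drop a point near the middle of the current rectangle, split the rectangle
// into four sub-areas at that point, and recurse into each sub-area.
void subdivide(float x0, float y0, float x1, float y1, int depth,
               std::vector<cv::Point2f>& pts)
{
    if (depth == 0) return;
    float px = x0 + (x1 - x0) * (0.33f + 0.34f * (rand() / (float)RAND_MAX));
    float py = y0 + (y1 - y0) * (0.33f + 0.34f * (rand() / (float)RAND_MAX));
    pts.push_back(cv::Point2f(px, py));
    subdivide(x0, y0, px, py, depth - 1, pts);   // lower left
    subdivide(px, y0, x1, py, depth - 1, pts);   // lower right
    subdivide(x0, py, px, y1, depth - 1, pts);   // upper left
    subdivide(px, py, x1, y1, depth - 1, pts);   // upper right
}
// usage: std::vector<cv::Point2f> pts; subdivide(0, 0, 512, 512, 4, pts);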
This is called a "low discrepancy sequence". The linked Wiki page explains how you can generate them.
But I suspect you already knew this, as your image is very similar to the 2,3 Halton sequence example from Wikipedia.
You just need the library rand() function:
#include <stdlib.h>
#include <time.h>
unsigned int N = 256; //number of points
int RANGE_X = 100; //x range to put sample points in
int RANGE_Y = 100;
void PutSamplePoint(int x, int y)
{
    // your code for placing a sample point on the field
}
int main()
{
    srand((unsigned)time(0)); // initialize random generator - uses current time as seed
    for (unsigned int i = 0; i < N; i++)
    {
        int x = rand() % RANGE_X; // returns a random value in range [0, RANGE_X)
        int y = rand() % RANGE_Y;
        PutSamplePoint(x, y);
    }
    return 0;
}

OpenCV: Output of the predict function of Expectation Maximization

Background:
I have two sets of color pixels from an image, one corresponding to the background and another corresponding to the foreground. Next, I train two Gaussian Mixture Models using OpenCV's EM, one for each set. My aim is to find the probability of a random pixel belonging to the foreground and to the background. Thus, I use the "predict" function of each EM on my pixel.
Question:
I don't understand the values returned by this function. In the documentation of OpenCV, it is written:
The method returns a two-element double vector. Zero element is a likelihood logarithm value for the sample. First element is an index of the most probable mixture component for the given sample.
http://docs.opencv.org/modules/ml/doc/expectation_maximization.html?highlight=predict#Vec2d%20EM::predict%28InputArray%20sample,%20OutputArray%20probs%29%20const
I don't understand what "likelihood logarithm" means. In my results, I sometimes have negative values and values > 1. Has anyone who used the same function had this kind of result, or results between 0 and 1? What can I conclude from my results?
How can I get the probability of a pixel belonging to the whole GMM (not the probability of belonging to each cluster of the GMM)?
Here is my code:
Mat mask = imread("mask.tif", 0);
Mat formerImage = imread("ImageFormer.tif");
Mat currentImage = imread("ImageCurrent.tif");
// number of cluster in the GMM
int nClusters = 5;
int countB=0, countF=0;
Vec3b color;
Vec2d probFg, probBg; // probabilities to belong to the foreground or background from GMMs
// count the number of pixels for each training set
for (int c = 0; c <= 40; c++) {
    for (int l = 0; l <= 40; l++) {
        if (mask.at<BYTE>(l, c) == 255) {
            countF++;
        } else if (mask.at<BYTE>(l, c) == 0) {
            countB++;
        }
    }
}
printf("countB %d countF %d \n", countB, countF);
Mat samplesForeground = Mat(countF,3, CV_64F);
Mat samplesBackground = Mat(countB,3, CV_64F);
// Expectation-Maximisation is able to fit the GMM and to predict the probability that a pixel belongs to the GMM.
EM em_foreground= EM(nClusters);
EM em_background= EM(nClusters);
countB=0;
countF=0;
// fill the training data from the former image depending on the mask
for (int c = 0; c <= 40; c++) {
    for (int l = 0; l <= 40; l++) {
        if (mask.at<BYTE>(l, c) == 255) {
            color = formerImage.at<Vec3b>(l, c);
            samplesForeground.at<double>(countF, 0) = color[0];
            samplesForeground.at<double>(countF, 1) = color[1];
            samplesForeground.at<double>(countF, 2) = color[2];
            countF++;
        } else if (mask.at<BYTE>(l, c) == 0) {
            color = formerImage.at<Vec3b>(l, c);
            samplesBackground.at<double>(countB, 0) = color[0];
            samplesBackground.at<double>(countB, 1) = color[1];
            samplesBackground.at<double>(countB, 2) = color[2];
            countB++;
        }
    }
}
printf("countB %d countF %d \n", countB, countF);
em_foreground.train(samplesForeground);
em_background.train(samplesBackground);
Mat sample(1, 3, CV_64F);
// try every pixel of the current image and get the log likelihood
for (int c = 0; c <= 40; c++) {
    for (int l = 0; l <= 40; l++) {
        color = currentImage.at<Vec3b>(l, c);
        sample.at<double>(0) = color[0];
        sample.at<double>(1) = color[1];
        sample.at<double>(2) = color[2];
        probFg = em_foreground.predict(sample);
        probBg = em_background.predict(sample);
        if (probFg[0] > 0 || probBg[0] > 0)
            printf("probFg[0] %f probBg[0] %f \n", probFg[0], probBg[0]);
    }
}
EDIT
After #BrianL's explanation, I now understand the log likelihood.
My problem is that the log probability from the predict function is sometimes > 0, but it should be <= 0. Has anyone encountered this problem before?
I have edited the code above to show the problem. I have tried the program with the images below:
The first image is ImageCurrent.tif, the second is ImageFormer.tif and the last one is mask.tif.
Can this be considered a bug in OpenCV? Should I open a ticket on the OpenCV bug tracker?
The "likelihood logarithm" means the log of the probability. Since for a probability p we expect 0 ≤ p ≤ 1, I would expect the values to be negative: log(p) ≤ 0. Larger negative numbers imply smaller probabilities.
This form is helpful when you are dealing with products of very small probabilities: if you multiplied the normal way, you could easily get underflow and lose precision because the probability becomes very small. But in log space the multiplication turns into an addition, which improves the accuracy and also potentially the speed of the calculation.
The predict function is for classifying a data point. If you want to give a point a score for how likely it is to belong to any component in the model, you can use the model parameters (see the get documentation) to calculate it yourself.
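As an example, a sketch of that manual calculation could look like the following. It assumes the OpenCV 2.4-style EM class, where the model parameters are exposed through Algorithm::get under the documented names "weights", "means" and "covs" (check your version's documentation), that the covariances come back as full DxD matrices, and that x is a 1xD CV_64F sample; the function name gmmDensity is mine.
#include <cmath>
#include <vector>
#include <opencv2/opencv.hpp>

// Evaluate the full mixture density p(x) = sum_j w_j * N(x; mu_j, Sigma_j).
double gmmDensity(const cv::EM& em, const cv::Mat& x)
{
    cv::Mat weights = em.get<cv::Mat>("weights");                      // 1 x K
    cv::Mat means   = em.get<cv::Mat>("means");                        // K x D
    std::vector<cv::Mat> covs = em.get<std::vector<cv::Mat> >("covs"); // K matrices, D x D
    double p = 0.0;
    for (int j = 0; j < weights.cols; j++) {
        cv::Mat diff = x - means.row(j);
        cv::Mat icov = covs[j].inv();
        double maha = cv::Mat(diff * icov * diff.t()).at<double>(0, 0);
        double norm = std::pow(2.0 * CV_PI, -0.5 * x.cols)
                    / std::sqrt(cv::determinant(covs[j]));
        p += weights.at<double>(j) * norm * std::exp(-0.5 * maha);
    }
    return p;   // log(p) should be comparable to the value returned by predict()
}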
As I understand it, you have two separate GMMs for the foreground and background parts of the image. The total probability of a sample pixel x in the test image, when evaluated against the foreground GMM, is
P_fg(x) = sum over j = 1..k of ( Wj_fg * Pj_fg(x) )
where
k = number of clusters in the foreground GMM
x = test sample
Pj_fg(x) = probability that sample x is in the j-th cluster according to the foreground GMM
Wj_fg = weight of the j-th cluster in the foreground GMM
Also, the sum of all weights should be 1 for each GMM.
We can do a similar calculation for the background GMM.
From looking at the EM code in OpenCV, it looks like the first of the two values that EM returns is the log likelihood. For the foreground GMM this is
log(P_fg(x_i))
I implemented your algorithm and, for each pixel in the test image, I compared the log-likelihoods returned by the two GMMs and classified the pixel with the GMM that gave the higher value. I got decent results.
In that respect, yes, this value is an indication of the pixel belonging to the entire GMM.
2)
In my implementation of your problem, the log likelihoods of both GMMs were always below 0 for all test-sample pixels.
I notice that you are doing graphcut based image segmentation.
You might want to take a look at the following blog post, which uses OpenCV and its GMM class in a very similar way to what you are doing, to perform graph-cut-based image segmentation. Code is given in C++ with detailed explanations. Here is the link: link
Basically, I can only say that the log probability, whether it is correct or not, is not what you are looking for. Check out the above link for details.

Using time in OpenCV for frame processes and other tasks

I want to count the vehicles in a video. After frame differencing I get a grayscale, almost binary image. I have defined a region of interest to work on a specific area of the frames; the pixel values of the vehicles passing through the region of interest are higher than 0, or even higher than 40 or 50, because they are white.
My idea is that when a certain number of pixels are white within a specific interval of time (say 1-2 seconds), there must be a vehicle passing, so I will increment the counter.
What I want is to check whether white pixels are still coming after 1-2 seconds. If no more white pixels are coming, it means the vehicle has passed and the next vehicle is about to come, so the counter must be incremented.
One method that came to my mind is to count the frames of the video and store the count in a variable called No_of_frames. Using that variable, I think I can estimate the time that has passed. If the value of No_of_frames is greater than, let's say, 20, it means that nearly one second has passed, given that my video's frame rate is 25-30 fps.
I am using Qt Creator on Windows 7 with OpenCV 2.3.1.
My code is something like:
for (int i = 0; i < matFrame.rows; i++)
{
    for (int j = 0; j < matFrame.cols; j++)
        if (matFrame.at<uchar>(i, j) > 100) // values of pixels greater than 100
                                            // will be considered as white
        {
            whitePixels++;
        }
    if () // here I want to use time. The 'if' statement must be like:
          // if (total_no._of_whitepixels > 100 && no_white_pixel_came_after 2secs)
          // which means that a vehicle has just passed, so increment the counter.
    {
        counter++;
    }
}
Any other idea for counting the vehicles, better than mine, will be most welcomed. Thanks in advance.
For background segmentation I am using the following algorithm, but it is very slow and I don't know why. The whole code is as follows:
// opencv2/video/background_segm.hpp OPENCV header file must be included.
IplImage* tmp_frame = NULL;
CvCapture* cap = NULL;
bool update_bg_model = true;
Mat element = getStructuringElement( 0, Size( 2,2 ),Point() );
Mat eroded_frame;
Mat before_erode;
if (argc > 2)
    cap = cvCaptureFromCAM(0);
else
    // cap = cvCreateFileCapture( "C:\\4.avi" );
    cap = cvCreateFileCapture("C:\\traffic2.mp4");
if (!cap)
{
    printf("can not open camera or video file\n");
    return -1;
}
tmp_frame = cvQueryFrame(cap);
if (!tmp_frame)
{
    printf("can not read data from the video source\n");
    return -1;
}
cvNamedWindow("BackGround", 1);
cvNamedWindow("ForeGround", 1);
CvBGStatModel* bg_model = 0;
for (int fr = 1; tmp_frame; tmp_frame = cvQueryFrame(cap), fr++)
{
    if (!bg_model)
    {
        // create BG model
        bg_model = cvCreateGaussianBGModel(tmp_frame);
        // bg_model = cvCreateFGDStatModel( temp );
        continue;
    }
    double t = (double)cvGetTickCount();
    cvUpdateBGStatModel(tmp_frame, bg_model, update_bg_model ? -1 : 0);
    t = (double)cvGetTickCount() - t;
    printf("%d. %.1f\n", fr, t / (cvGetTickFrequency() * 1000.));
    before_erode = bg_model->foreground;
    cv::erode((Mat)bg_model->background, (Mat)bg_model->foreground, element);
    // eroded_frame = bg_model->foreground;
    // frame = (IplImage *)erode_frame.data;
    cvShowImage("BackGround", bg_model->background);
    cvShowImage("ForeGround", bg_model->foreground);
    // cvShowImage("ForeGround", bg_model->foreground);
    char k = cvWaitKey(5);
    if (k == 27) break;
    if (k == ' ')
    {
        update_bg_model = !update_bg_model;
        if (update_bg_model)
            printf("Background update is on\n");
        else
            printf("Background update is off\n");
    }
}
cvReleaseBGStatModel( &bg_model );
cvReleaseCapture(&cap);
return 0;
A great deal of research has been done on vehicle tracking and counting. The approach you describe appears to be quite fragile, and is unlikely to be robust or accurate. The main issue is using a count of pixels above a certain threshold, without regard for their spatial connectivity or temporal relation.
Frame differencing can be useful for separating a moving object from its background, provided the object of interest is the only (or largest) moving object.
What you really need is to first identify the object of interest, segment it from the background, and track it over time using an adaptive filter (such as a Kalman filter). Have a look at the OpenCV video reference. OpenCV provides background subtraction and object segmentation to do all the required steps.
I suggest you read up on OpenCV - Learning OpenCV is a great read. And also on more general computer vision algorithms and theory - http://homepages.inf.ed.ac.uk/rbf/CVonline/books.htm has a good list.
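As a starting point, here is a rough sketch using the MOG2 background subtractor (the OpenCV 2.4-style C++ API from opencv2/video/background_segm.hpp). The file name, the shadow-removal threshold of 200, and the minimum blob area of 500 are all illustrative assumptions, not values from the question or the answers.
#include <opencv2/opencv.hpp>
#include <vector>
using namespace cv;
int main()
{
    VideoCapture cap("traffic2.mp4");
    if (!cap.isOpened()) return -1;
    BackgroundSubtractorMOG2 mog2;       // adaptive Gaussian-mixture background model
    Mat frame, fgmask;
    Mat kernel = getStructuringElement(MORPH_ELLIPSE, Size(3, 3));
    while (cap.read(frame)) {
        mog2(frame, fgmask);                                 // update model, get foreground mask
        threshold(fgmask, fgmask, 200, 255, THRESH_BINARY);  // drop shadow pixels (marked with a lower value)
        erode(fgmask, fgmask, kernel);
        dilate(fgmask, fgmask, kernel);
        // connected foreground blobs are candidate vehicles
        std::vector<std::vector<Point> > contours;
        findContours(fgmask.clone(), contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
        int vehicles_in_frame = 0;
        for (size_t i = 0; i < contours.size(); i++)
            if (contourArea(contours[i]) > 500)              // ignore small noise blobs
                vehicles_in_frame++;
        imshow("foreground", fgmask);
        if (waitKey(30) == 27) break;                        // ESC to quit
    }
    return 0;
}
To actually count vehicles over time you would still need to associate blobs between frames (for example with a Kalman filter, as suggested above), so that each vehicle is counted only once.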
Normally they just put a small pneumatic pipe across the road (soft pipe semi filled with air). It is attached to a simple counter. Each vehicle passing over the pipe generates two pulses (first front, then rear wheels). The counter records the number of pulses in specified time intervals and divides by 2 to get the approx vehicle count.