Image based counting algorithm to count objects on a moving conveyor - c++

I am building a vision system which can count boxes moving on a variable speed conveyor belt.
Using OpenCV and C++, I was able to separate the blobs and extract their centroids.
Now I have to increment the count variable, if the centroid crosses the cutoff boundary line.
This is where I am stuck. I tried two alternatives.
1. Fixing a rectangular strip in which a centroid would stay for exactly one frame. But since the conveyor runs at multiple speeds, I could not fix a constant boundary value.
2. Comparing the centroid position between consecutive frames, something like:
centroid_prev = centroid_now;
centroid_now = posX;
if (centroid_now >= xLimit && centroid_prev < xLimit)
    count++;
This works fine if just a single box is present on the conveyor.
But for 2 or more blobs in the same frame, I do not know how to handle this using arrays of contours.
Can you please suggest a simple counting algorithm which can compare
blob properties between previous frame and current frame even if
multiple blobs are present per frame?
PS. Conveyor speed is around 50 boxes/second, so a lightweight algorithm would be very much appreciated; otherwise we may end up with a lower frame rate.

Assuming the images you pasted are representative, you can easily solve this by doing some kind of tracking.
The simplest way that comes to mind is to use goodFeaturesToTrack and calcOpticalFlowPyrLK to track the motion of the conveyor.
You'll probably need to do some filtering on the result, but I don't think that would be difficult, as the motion and images are very low in noise.
Once you have that motion, you can calculate for each centroid when it moved beyond a certain X threshold and count it.
With a low number of corners (<100) such as in this image, it should be fast.

Have you tried matching the centroid coordinates from the previous frame with the centroids from the new frame? You can use OpenCV's descriptor matchers for that. (The code samples all match feature vectors, but there's no reason why you shouldn't use them for coordinate matching.)
If you're worried about performance: matching 5-10 coordinate centers should be orders of magnitude faster than finding blobs in an image.
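As a rough illustration of the coordinate matching suggested above, here is a minimal sketch in plain C++ (no OpenCV). It pairs each centroid in the current frame with its nearest unused centroid from the previous frame and counts the pairs that crossed the line. The names `Centroid`, `countCrossings` and `maxJump` are made up for this example, and the greedy nearest-neighbour pairing is an assumption that should hold when boxes move less than their spacing between frames.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct Centroid { double x; double y; };

// Counts centroids that crossed the vertical line x = xLimit between two frames.
// Each current centroid is greedily paired with its nearest unused previous centroid;
// pairs farther apart than maxJump are treated as new objects and ignored.
int countCrossings(const std::vector<Centroid>& prev,
                   const std::vector<Centroid>& curr,
                   double xLimit, double maxJump)
{
    int crossings = 0;
    std::vector<bool> used(prev.size(), false);
    for (const Centroid& c : curr) {
        double bestDist = std::numeric_limits<double>::max();
        std::size_t bestIdx = prev.size();  // prev.size() means "no match yet"
        for (std::size_t i = 0; i < prev.size(); ++i) {
            if (used[i]) continue;
            double d = std::hypot(c.x - prev[i].x, c.y - prev[i].y);
            if (d < bestDist) { bestDist = d; bestIdx = i; }
        }
        if (bestIdx == prev.size() || bestDist > maxJump) continue;
        used[bestIdx] = true;
        // Same test as in the single-box case, applied per matched pair.
        if (prev[bestIdx].x < xLimit && c.x >= xLimit) ++crossings;
    }
    return crossings;
}
```

Call it once per frame with the previous and current centroid lists and add the result to the running count. With only a handful of blobs per frame the quadratic search is negligible.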

This is the algorithm for arrays. It's just an extension of what you are doing; you can adjust the specifics.
for (i = 0; i < centroid.length; i++)
    centroid_prev[i] = centroid[i].posX;
for (frame j = 0 to ...) {
    ... recompute centroids
    for (i = 0; i < centroid.length; i++) {
        centroid_now = centroid[i].posX;
        if (centroid_now >= xLimit && centroid_prev[i] < xLimit)
            count++;
    }
    for (i = 0; i < centroid.length; i++)
        centroid_prev[i] = centroid[i].posX;
} // end j
If the objects can move about (and they look about the same) you need to add additional info such as color to locate the same objects.


Match object between different video frames

I am trying to use OpenCV to detect the shift between consecutive video frames when the camera is unstable and moving in real time, as shown in the picture. To compensate for the effect of shaking or a change in angle, I want to match some objects in the image, for example the clock, and use the center of the same object in consecutive frames to detect the shift and compensate for it. I don't know how to do this in real time, or which methods are available and accurate.
Thank you in advance and I hope my question is clear.
This is a fairly standard operation, as it's actively used in MPEG-4 compression. It's called "motion estimation", and you don't do it on objects (too hard, requires image segmentation). In OpenCV, it's covered under Video Stabilization.
If you want to try writing the code yourself, one method is to first crop the frame to produce a sub-image slightly smaller than the full frame along each dimension. This will give you some room to move.
Next you want to be able to find and track shapes in OpenCV. Play around until you get a few geometric primitive shapes coming up on each frame.
Next you want to build some vectors between the centres of each shape - these are what will determine the movement of the camera - if in the next frame most of the vectors are displaced but parallel that is a good indicator that the camera has moved.
The last step is to calculate the displacement, which should be a matter of measuring the distance between the detected parallel vectors. If this is smaller than your sub-image cropping margin, then you can crop the original image to negate the displacement.
The pseudo code for each iteration would be something like -
image wholeFrame1, wholeFrame2, subImage, shapesFrame1, shapesFrame2
vectorArray vectorsFrame1, vectorsFrame2, parallelVectorList
vector cameraDisplacement = [0,0]
//Display image
subImage = cropImage(wholeFrame1, cameraDisplacement)
//Find shapes to track
shapesFrame1 = findShapes(wholeFrame1)
shapesFrame2 = findShapes(wholeFrame2)
//Store a list of parallel vectors
parallelVectorList = detectParallelVectors(shapesFrame1, shapesFrame2)
//Find the mean displacement of each pair of parallel vectors
cameraDisplacement = meanDisplacement(parallelVectorList)
//Crop the next image accounting for camera displacement
subImage = cropImage(wholeFrame2, cameraDisplacement)
There are better ways of doing it but this would be easy enough for someone doing their first attempt at image stabilisation with experience of OpenCV.
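To make the displacement step a bit more concrete, here is a minimal sketch in plain C++ under some assumptions: the features have already been matched between the two frames (so `before[i]` and `after[i]` are the same scene point), and the median displacement is used instead of the mean from the pseudo code, because it tolerates a few bad matches. `Point2`, `estimateShift` and `cropOrigin` are hypothetical names, not OpenCV functions.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct Point2 { double x; double y; };

// Median of a non-empty list (upper median when the count is even).
static double median(std::vector<double> v)
{
    std::size_t mid = v.size() / 2;
    std::nth_element(v.begin(), v.begin() + mid, v.end());
    return v[mid];
}

// Estimates the camera shift from matched feature positions in two frames.
Point2 estimateShift(const std::vector<Point2>& before, const std::vector<Point2>& after)
{
    std::size_t n = std::min(before.size(), after.size());
    if (n == 0) return {0.0, 0.0};
    std::vector<double> dx(n), dy(n);
    for (std::size_t i = 0; i < n; ++i) {
        dx[i] = after[i].x - before[i].x;
        dy[i] = after[i].y - before[i].y;
    }
    return {median(dx), median(dy)};
}

// Origin of the crop window along one axis: the window starts `margin` pixels
// inside the frame, follows the shift, and is clamped to stay inside the frame.
int cropOrigin(int margin, double shift)
{
    int origin = margin + static_cast<int>(std::lround(shift));
    return std::max(0, std::min(origin, 2 * margin));
}
```

When the clamp kicks in, the shake was larger than the cropping margin and cannot be fully compensated for that frame.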

Detect clusters of circular objects by iterative adaptive thresholding and shape analysis

I have been developing an application to count circular objects such as bacterial colonies from pictures.
What makes it easy is the fact that the objects are generally quite distinct from the background.
However, a few difficulties make the analysis tricky:
The background will present gradual as well as rapid intensity change.
In the edges of the container, the object will be elliptic rather than circular.
The edges of the objects are sometimes rather fuzzy.
The objects will cluster.
The object can be very small (6px of diameter)
Ultimately, the algorithms will be used (via GUI) by people that do not have deep understanding of image analysis, so the parameters must be intuitive and very few.
The problem has been addressed many times in the scientific literature and "solved", for instance, using circular Hough transform or watershed approaches, but I have never been satisfied by the results.
One simple approach that was described is to get the foreground by adaptive thresholding and split (as I described in this post) the clustered objects using distance transform.
I have successfully implemented this method, but it could not always deal with sudden changes in intensity. Also, I have been asked by peers to come up with a more "novel" approach.
I was therefore looking for a new method to extract the foreground, and investigated other thresholding/blob detection methods.
I tried MSERs but found that they were not very robust and quite slow in my case.
I eventually came up with an algorithm that, so far, gives me excellent results:
1. I split the three channels of my image and reduce their noise (blur/median blur). For each channel:
2. I apply a manual implementation of the first step of adaptive thresholding by calculating the absolute difference between the original channel and a blurred copy (convolved with a large kernel). Then, for all the relevant threshold values:
3. I apply a threshold to the result of 2).
4. I find contours.
5. I validate or invalidate contours on the basis of their shape (size, area, convexity...).
6. Only the valid continuous regions (i.e. delimited by contours) are then redrawn in an accumulator (one accumulator per channel).
7. After accumulating continuous regions over the threshold values, I end up with a map of "scores of regions". The regions with the highest intensity are those that fulfilled the morphology filter criteria most often.
8. The three maps (one per channel) are then converted to grey-scale and thresholded (the threshold is controlled by the user).
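For readers who want to see the absolute-difference step in isolation (the difference between a channel and a heavily blurred copy of itself), here is a minimal sketch in plain C++. It is not the implementation used above: it assumes a simple box blur with a window clipped at the image borders, and `Gray` and `localContrast` are names invented for this example.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Row-major 8-bit single-channel image.
struct Gray {
    int rows, cols;
    std::vector<std::uint8_t> px;
    std::uint8_t at(int r, int c) const { return px[r * cols + c]; }
};

// Returns |channel - boxBlur(channel)|, where the blur window is
// (2*radius+1) x (2*radius+1) and is clipped at the image borders.
Gray localContrast(const Gray& in, int radius)
{
    Gray out{in.rows, in.cols, std::vector<std::uint8_t>(in.px.size(), 0)};
    for (int r = 0; r < in.rows; ++r) {
        for (int c = 0; c < in.cols; ++c) {
            long sum = 0;
            int count = 0;
            for (int rr = std::max(0, r - radius); rr <= std::min(in.rows - 1, r + radius); ++rr) {
                for (int cc = std::max(0, c - radius); cc <= std::min(in.cols - 1, c + radius); ++cc) {
                    sum += in.at(rr, cc);
                    ++count;
                }
            }
            int mean = static_cast<int>(sum / count);  // integer mean of the window
            out.px[r * in.cols + c] = static_cast<std::uint8_t>(std::abs(in.at(r, c) - mean));
        }
    }
    return out;
}
```

Slow background gradients largely cancel out in this difference, while small objects that differ from their surroundings stand out, which is why a single global threshold can then be swept over the result.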
Just to show you the kind of image I have to work with:
This picture represents part of 3 sample images in the top and the result of my algorithm (blue = foreground) of the respective parts in the bottom.
Here is my C++ implementation of steps 3-7:
* cv::Mat dst[3] is the result of the absolute difference between original and convolved channel.
* MCF(std::vector<cv::Point>, int, int) is a filter function that returns a positive int only if the input contour is valid.
/* Allocate 3 matrices (1 per channel)*/
cv::Mat accu[3];
/* We define the maximal threshold to be tried as half of the absolute maximal value in each channel*/
int maxBGR[3];
for(unsigned int i=0; i<3;i++){
    double min, max;
    cv::minMaxLoc(dst[i], &min, &max);
    maxBGR[i] = max/2;
    /* In addition, we fill accumulators by zeros*/
    accu[i] = cv::Mat(dst[i].rows, dst[i].cols, CV_8U, cv::Scalar(0));
}
/* These loops are intended to be multithreaded using
#pragma omp parallel for collapse(2) schedule(dynamic)
For each channel */
for(unsigned int i=0; i<3;i++){
    /* For each value of threshold (m_step can be > 1 in order to save time)*/
    for(int j=0;j<maxBGR[i] ;j += m_step ){
        /* Temporary matrix*/
        cv::Mat tmp;
        std::vector<std::vector<cv::Point> > contours;
        /* Thresholds dst by j*/
        cv::threshold(dst[i],tmp, j, 255, cv::THRESH_BINARY);
        /* Finds continuous regions*/
        cv::findContours(tmp, contours, CV_RETR_LIST, CV_CHAIN_APPROX_TC89_L1);
        if(contours.size() > 0){
            /* Tests each contour*/
            for(unsigned int k=0;k<contours.size();k++){
                int valid = MCF(contours[k],m_minRad,m_maxRad);
                if(valid > 0){
                    /* I found that redrawing was very much faster if the given contour was copied in a smaller container.
                     * I do not really understand why though. For instance,
                    cv::drawContours(miniTmp,contours,k,cv::Scalar(1),-1,8,cv::noArray(), INT_MAX, cv::Point(-rect.x,-rect.y));
                    is slower especially if contours is very long. */
                    std::vector<std::vector<cv::Point> > tpv(1);
                    std::copy(contours.begin()+k, contours.begin()+k+1, tpv.begin());
                    /* We make a Roi here*/
                    cv::Rect rect = cv::boundingRect(tpv[0]);
                    cv::Mat miniTmp(rect.height,rect.width,CV_8U,cv::Scalar(0));
                    cv::drawContours(miniTmp,tpv,0,cv::Scalar(1),-1,8,cv::noArray(), INT_MAX, cv::Point(-rect.x,-rect.y));
                    accu[i](rect) = miniTmp + accu[i](rect);
                }
            }
        }
    }
}
/* Make the global scoreMap*/
/* Conditional noise removal*/
I have two questions:
What is the name of this kind of foreground extraction approach, and do you see any reason why it could be improper to use it in this case?
Since repeatedly finding and drawing contours is quite intensive, I would like to make my algorithm faster. Can you suggest any way to achieve this?
Thank you very much for your help,
Several years ago I wrote an application that detects cells in a microscope image. The code is written in Matlab, and I now think it is more complicated than it should be (it was my first CV project), so I will only outline the tricks that will actually be helpful for you. Btw, it was deadly slow, but it was really good at separating large groups of twin cells.
I defined a metric by which to evaluate the chance that a given point is the center of a cell:
- Luminosity decreases in a circular pattern around it
- The variance of the texture luminosity follows a given pattern
- a cell will not cover more than % of a neighboring cell
With it, I started to iteratively find the best cell, mark it as found, then look for the next one. Because such a search is expensive, I employed genetic algorithms to search faster in my feature space.
Some results are given below:

OpenCV Image Manipulation

I am trying to find out the difference in 2 images.
Scenario: Suppose that I have 2 images, one of a background and the other of a person in front of that background. I want to subtract the two images in such a way that I get the position of the person, that is, the program can detect where the person was standing and give the subtracted image as the output.
The code that I have managed to come up with is taking two images from the camera and re-sizing them and is converting both the images to gray scale. I wanted to know what to do after this. I checked the subtract function provided by OpenCV but it takes arrays as inputs so I don't know how to progress.
The code that I have written is:
cap>>frame; //gets the first image
cv::cvtColor(frame,frame,CV_BGR2GRAY); //converts it to gray scale (captured frames are in BGR order)
cv::resize(frame,frame,Size(30,30)); //re-sizes it
cap>>frame2;//gets the second image
cv::cvtColor(frame2,frame2,CV_BGR2GRAY); //converts it to gray scale
cv::resize(frame2,frame2,Size(30,30)); //re-sizes it
Now do I simply use the subtract function on the two frames, or do I apply some filters first and then use the subtract function?
As others have noticed, it's a tricky problem: easy to come up with a hack that will work sometimes, hard to come up with a solution that will work most of the time with minimal human intervention. Also, much easier to do if you can control tightly the material and illumination of the background. The professional applications are variously known as "chromakeying" (esp. in the TV industry), "bluescreening", "matting" or "traveling matte" (in cinematography), "background removal" in computer vision.
The groundbreaking work for matting quasi-uniform backdrops was done by Petro Vlahos many years ago. The patents on its basic algorithms have already expired, so you can go to town with them (and find open source implementations of various quality). Needless to say, IANAL, so do your homework on the patent subject.
Matting out more complex backgrounds is still an active research area, especially for the case when no 3D information is available. You may want to look into a few research papers that have come out of MS Research in the semi-recent past (A. Criminisi did some work in that area).
Using subtract directly would not be appropriate, because some differences would be negative (for 8-bit images OpenCV saturates them to 0), so it is only useful if you are trying to see whether there is a difference or not (a boolean true/false).
If you need to get the pixels where it is differing, you should do a pixel by pixel comparison - something like:
int rows = frame.rows;
int cols = frame.cols;
cv::Mat diffImage = cv::Mat::zeros(rows, cols, CV_8UC1);
for(int i = 0; i < rows; ++i)
for(int j = 0; j < cols; ++j)
if(frame.at<uchar>(i,j) != frame2.at<uchar>(i,j)) diffImage.at<uchar>(i,j) = 255;
Now you can either show or save diffImage. All pixels that differ will be white, while the identical ones will be black.
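Since the goal in the question is the position of the person, a small extension of the pixel-by-pixel comparison is to allow some tolerance for camera noise and to return the bounding box of the changed pixels. The following is a minimal sketch in plain C++ on raw grayscale buffers. `Box`, `changedRegion` and `tolerance` are hypothetical names, and the idea that a fixed tolerance is enough is an assumption that only holds with stable lighting.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

struct Box { int left, top, right, bottom; bool empty; };

// Bounding box of all pixels whose grayscale values differ by more than
// `tolerance` between two row-major images of identical size (rows x cols).
Box changedRegion(const std::vector<std::uint8_t>& background,
                  const std::vector<std::uint8_t>& current,
                  int rows, int cols, int tolerance)
{
    Box box{cols, rows, -1, -1, true};
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            std::size_t i = static_cast<std::size_t>(r) * cols + c;
            // Work in int so the difference cannot wrap around or saturate.
            int diff = std::abs(static_cast<int>(background[i]) - static_cast<int>(current[i]));
            if (diff <= tolerance) continue;
            box.empty = false;
            if (c < box.left) box.left = c;
            if (c > box.right) box.right = c;
            if (r < box.top) box.top = r;
            if (r > box.bottom) box.bottom = r;
        }
    }
    return box;
}
```

If `empty` is still true after the call, no pixel changed by more than the tolerance and the other fields should be ignored.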

Robustly find N circles with the same diameter: alternative to bruteforcing Hough transform threshold

I am developing an application to track small animals in Petri dishes (or other circular containers).
Before any tracking takes place, the first few frames are used to define areas.
Each dish will correspond to an independent, static circular area (i.e. one that will not be updated during tracking).
The user can request the program to try to find dishes from the original image and use them as areas.
Here are examples:
In order to perform this task, I am using Hough Circle Transform.
But in practice, different users will have very different settings and images and I do not want to ask the user to manually define the parameters.
I cannot just guess all the parameters either.
However, I have some additional information that I would like to use:
I know the exact number of circles to be detected.
All the circles have almost the same dimensions.
The circles cannot overlap.
I have a rough idea of the minimal and maximal size of the circles.
The circles must be entirely in the picture.
I can therefore narrow down the number of parameters to define to one: the threshold.
Using this information, and considering that I have N circles to find, my current solution is to test many threshold values and keep the set of circles whose radii have the smallest standard deviation (since all the circles should have a similar size):
//at this point, minRad and maxRad were calculated from the size of the image and the number of circles to find.
//assuming circles should altogether fill more than 1/3 of the images but cannot be altogether larger than the image.
//N is the integer number of circles to find.
//img is the picture of the scene (filtered).
//the vectors containing the detected circles and the --so far-- best circles found.
std::vector<cv::Vec3f> circles, bestCircles;
//the score of the --so far-- best set of circles
double bestSsem = 0;
for(int t=5; t<400 ; t=t+2){
    //Apply Hough Circles with the threshold t
    cv::HoughCircles(img, circles, CV_HOUGH_GRADIENT, 3, minRad*2, t,3, minRad, maxRad );
    if(circles.size() >= N){
        //call a routine to give a score to this set of circles according to the similarity of their radii
        double ssem = scoreSetOfCircles(circles,N);
        //if no circles are recorded yet, or if the score of this set of circles is higher than the former best
        if( bestCircles.size() < N || ssem > bestSsem){
            //this set becomes the temporary best set of circles
            bestCircles = circles;
            bestSsem = ssem;
        }
    }
}
//the method to assess how good a set of circles is (the more similar the circles are, the higher ssem is)
double scoreSetOfCircles(std::vector<cv::Vec3f> circles, int N){
    double ssem=0, sum = 0;
    double mean;
    for(int j=0;j<N;j++){
        sum = sum + circles[j][2];
    }
    mean = sum/N;
    for(int j=0;j<N;j++){
        double em = mean - circles[j][2];
        //accumulate the squared deviation from the mean radius
        ssem = ssem + em*em;
    }
    //invert so that a smaller spread of radii gives a higher score
    return 1/ssem;
}
I have reached a higher accuracy by performing a second pass in which I repeated this algorithm narrowing the [minRad:maxRad] interval using the result of the first pass.
For instance minRad2 = 0.95 * average radius of best circles and maxRad2 = 1.05 * average radius of best circles.
I had fairly good results using this method so far. However, it is slow and rather dirty.
My questions are:
Can you think of any alternative algorithm to solve this problem in a cleaner/faster manner?
Or what would you suggest to improve this algorithm?
Do you think I should investigate generalised Hough transform ?
Thank you for your answers and suggestions.
The following approach should work pretty well for your case:
Binarize your image (you might need to do this at several threshold levels to make the algorithm independent of the lighting conditions)
Find contours
For each contour calculate the moments
Filter them by area to remove too small contours
Filter contours by circularity:
double area = moms.m00;
double perimeter = arcLength(Mat(contours[contourIdx]), true);
double ratio = 4 * CV_PI * area / (perimeter * perimeter);
ratio close to 1 will give you circles.
Calculate radius and center of each circle
center = Point2d(moms.m10 / moms.m00, moms.m01 / moms.m00);
And you can add more filters to improve the robustness.
Actually you can find an implementation of the whole procedure in OpenCV. Look how the SimpleBlobDetector class and findCirclesGrid function are implemented.
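To show the circularity filter without any OpenCV dependency, here is a minimal sketch in plain C++ that treats a contour as a closed polygon. The area comes from the shoelace formula rather than from image moments, and the radius is derived from the area assuming the shape really is a disc. Both choices are assumptions made for this example, and `Pt`, `ShapeStats` and `describePolygon` are hypothetical names.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { double x; double y; };

struct ShapeStats { double area; double perimeter; double circularity; double radius; };

// Area (shoelace formula), perimeter, circularity 4*pi*A/P^2 and the radius of
// the disc with the same area, for a closed polygon given by its vertices.
ShapeStats describePolygon(const std::vector<Pt>& poly)
{
    const double pi = std::acos(-1.0);
    double twiceArea = 0.0, perimeter = 0.0;
    for (std::size_t i = 0; i < poly.size(); ++i) {
        const Pt& a = poly[i];
        const Pt& b = poly[(i + 1) % poly.size()];  // wrap around to close the polygon
        twiceArea += a.x * b.y - b.x * a.y;
        perimeter += std::hypot(b.x - a.x, b.y - a.y);
    }
    double area = std::fabs(twiceArea) / 2.0;
    double circularity = perimeter > 0.0 ? 4.0 * pi * area / (perimeter * perimeter) : 0.0;
    return {area, perimeter, circularity, std::sqrt(area / pi)};
}
```

A square scores pi/4 (about 0.785) and a finely sampled circle scores close to 1, so a cut-off somewhere between the two separates dishes from boxy clutter. The exact value has to be tuned on real images.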
Within the current algorithm, the biggest thing that sticks out is the for(int t=5; t<400; t=t+2) loop. Try recording score values for some test images and graph score(t) versus t. With any luck, it will either suggest a smaller range for t or be a smoothish curve with a single maximum. In the latter case you can change your loop over all t values into a smarter search using Hill Climbing methods.
Even if it's fairly noisy, you can first loop over multiples of, say, 30, and for the best 1 or 2 of those loop over nearby multiples of 2.
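The two-stage loop just described could look roughly like the sketch below (plain C++). The scoring function is passed in as a callback so the sketch stays independent of Hough circles. `coarseToFine` and its parameters are names invented here, and refining only around the single best coarse sample is a simplifying assumption.

```cpp
#include <algorithm>
#include <functional>

// Returns the threshold in [lo, hi] with the best score found by sampling
// every `coarse` steps first, then every `fine` steps around the best coarse sample.
int coarseToFine(const std::function<double(int)>& score,
                 int lo, int hi, int coarse, int fine)
{
    int best = lo;
    double bestScore = score(lo);
    for (int t = lo + coarse; t <= hi; t += coarse) {
        double s = score(t);
        if (s > bestScore) { bestScore = s; best = t; }
    }
    // Refine within one coarse step on either side of the coarse winner.
    int from = std::max(lo, best - coarse);
    int to = std::min(hi, best + coarse);
    for (int t = from; t <= to; t += fine) {
        double s = score(t);
        if (s > bestScore) { bestScore = s; best = t; }
    }
    return best;
}
```

With lo=5, hi=399, coarse=30 and fine=2 this evaluates the score 45 times instead of the 198 evaluations of the exhaustive loop, at the risk of missing a narrow peak that falls between coarse samples.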
Also, in your score function, you should disqualify any results with overlapping circles and maybe penalize overly spaced out circles.
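The overlap test itself is cheap: two circles overlap when the distance between their centres is smaller than the sum of their radii. A minimal sketch in plain C++, with `Circle` and `anyOverlap` as hypothetical names (a `cv::Vec3f` from HoughCircles holds the same three values: x, y, radius):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Circle { double x, y, r; };

// True if any two circles in the set overlap, i.e. their centres are closer
// than the sum of their radii.
bool anyOverlap(const std::vector<Circle>& circles)
{
    for (std::size_t i = 0; i < circles.size(); ++i) {
        for (std::size_t j = i + 1; j < circles.size(); ++j) {
            double d = std::hypot(circles[i].x - circles[j].x, circles[i].y - circles[j].y);
            if (d < circles[i].r + circles[j].r) return true;
        }
    }
    return false;
}
```

A candidate set for which this returns true can simply be skipped before it is scored.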
You don't explain why you are using a black background. Unless you are using a telecentric lens (which seems unlikely, given the apparent field of view), and ignoring radial distortion for the moment, the images of the dishes will be ellipses, so estimating them as circles may lead to significant errors.
All in all, it doesn't seem to me that you are following a good approach. If the goal is simply to remove the background, so you can track the bugs inside the dishes, then your goal should be just that: find which pixels are background and mark them. The easiest way to do that is to take a picture of the background without dishes, under the same illumination and camera, and directly detect differences from the picture with the dishes. A colored background would be preferable for that, with a color unlikely to appear in the dishes (e.g. green or blue velvet). You would then have reduced the problem to bluescreening (or chroma keying), a classic technique in machine vision as applied to visual effects. Do a Google search for "matte petro vlahos assumption" to find classic algorithms for solving this problem.

How do I fix eroded rectangles?

Basically, I have an image like this
or one with multiple rectangles within the same image. The rectangles are purely black and white but have "dirty" edges and gouges; still, it's pretty easy to tell they're rectangles. To be more precise, they are image masks. The white regions are parts of the image which are to be "left alone", but the black parts are to be made bitonal.
My question is, how do I make a nice and crisp rectangle out of this degraded one? I am a Python person, but I have to use Qt and C++ for this task. It would be preferable if no other libraries are used.
If the bounding box that contains all non-black pixels can do what you want, this should do the trick:
int boundLeft = INT_MAX;
int boundRight = -1;
int boundTop = INT_MAX;
int boundBottom = -1;
for(int y=0;y<imageHeight;++y) {
    bool hasNonMask = false;
    for(int x=0;x<imageWidth;++x) {
        if(isNotMask(x, y)) {
            hasNonMask = true;
            if(x < boundLeft) boundLeft = x;
            if(x > boundRight) boundRight = x;
        }
    }
    if(hasNonMask) {
        if(y < boundTop) boundTop = y;
        if(y > boundBottom) boundBottom = y;
    }
}
If the result has negative size, then there's no non-mask pixel in the image. The code can be more optimized but I haven't had enough coffee yet. :)
Usually you'd do that by repeatedly dilating and eroding the mask. I don't think Qt has premade functions for that, so you probably have to implement them yourself if you don't want to use libraries.
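In case it helps, here is a minimal sketch of dilation followed by erosion (a morphological "closing") in plain C++ with a 3x3 neighbourhood. It works on a grid of 0/1 values, so the pixels would have to be copied out of the QImage first. `Mask`, `morph` and `closeMask` are hypothetical names, and treating 1 as the rectangle colour is an assumption: swap the values if your rectangles are the other colour.

```cpp
#include <vector>

using Mask = std::vector<std::vector<int>>;  // 1 = rectangle pixel, 0 = everything else

// One 3x3 pass. Dilation sets a pixel when any in-bounds pixel of its 3x3
// neighbourhood (itself included) is set; erosion keeps it only when all are set.
static Mask morph(const Mask& in, bool dilate)
{
    int h = static_cast<int>(in.size());
    int w = h ? static_cast<int>(in[0].size()) : 0;
    Mask out = in;
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            bool any = false, all = true;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    int yy = y + dy, xx = x + dx;
                    if (yy < 0 || yy >= h || xx < 0 || xx >= w) continue;  // skip pixels outside the image
                    if (in[yy][xx]) any = true; else all = false;
                }
            }
            out[y][x] = dilate ? any : all;
        }
    }
    return out;
}

// Closing: `iterations` dilations followed by the same number of erosions.
// Fills holes and notches up to roughly 2*iterations pixels wide.
Mask closeMask(const Mask& in, int iterations)
{
    Mask m = in;
    for (int i = 0; i < iterations; ++i) m = morph(m, true);
    for (int i = 0; i < iterations; ++i) m = morph(m, false);
    return m;
}
```

Because pixels outside the image are skipped, a rectangle that touches the image border can grow along that border, so leave a margin of at least `iterations` pixels around the mask. Running the erosions first and the dilations second (an "opening") removes stray specks instead of filling gouges.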
For the moment, we'll assume they're all supposed to come out as rectangles with no rotation. In this case, you should be able to use a pretty simple approach. Starting from each pixel at the edge of the bitmap, start sampling pixels working your way inward until you encounter a transition. Record the distance from the edge for each transition (if there is one). Once you've done that from each edge, you basically "take a vote" -- the distance that occurred most often from that edge is what you treat as that edge of the rectangle. If the rectangle really is aligned, that should constitute a large majority of the distances.
If, instead you see a number of distances with nearly equal frequencies, chances are that the rectangle is rotated (or at least one edge is). In this case, you can divide the side in half (for example) and repeat. Once you've reached a large majority of points in each region agreeing on the distance, you can (attempt to) linearly interpolate between them to give a straight line (and limiting the minimum region size will limit the maximum rotation -- if you get to some size without reaching agreement, you're looking at a gouge, not the rectangle edge). Likewise, if you have a region (or more than one) that doesn't fit cleanly with the rest and won't fit with a line, you should probably ignore it as well -- again, you're probably looking at a gouge, not what's intended as an edge.
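For the axis-aligned case, the voting idea can be sketched in a few lines of plain C++. This example only handles the left edge; the other three edges work the same way with the scan direction changed. `Mask` and `voteLeftEdge` are hypothetical names, and the sketch assumes the first pixel of each row belongs to the background.

```cpp
#include <cstddef>
#include <map>
#include <vector>

using Mask = std::vector<std::vector<int>>;  // one row per scanline, values 0 or 1

// Scans every row from the left border and records the column of the first
// pixel that differs from the row's first pixel. Returns the column that
// received the most votes, or -1 if no row contains a transition.
int voteLeftEdge(const Mask& mask)
{
    std::map<int, int> votes;  // column -> number of rows voting for it
    for (const auto& row : mask) {
        for (std::size_t x = 1; x < row.size(); ++x) {
            if (row[x] != row[0]) {
                ++votes[static_cast<int>(x)];
                break;
            }
        }
    }
    int best = -1, bestVotes = 0;
    for (const auto& kv : votes) {
        if (kv.second > bestVotes) { bestVotes = kv.second; best = kv.first; }
    }
    return best;
}
```

If the winning column holds only a small share of the votes, that is the hint mentioned above that the edge is rotated or badly damaged, and the side should be split and voted on again.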