How to detect a Christmas Tree? - c++

Which image processing techniques could be used to implement an application that detects the Christmas trees displayed in the following images?
I'm searching for solutions that are going to work on all these images. Therefore, approaches that require training haar cascade classifiers or template matching are not very interesting.
I'm looking for something that can be written in any programming language, as long as it uses only Open Source technologies. The solution must be tested with the images that are shared on this question. There are 6 input images and the answer should display the results of processing each of them. Finally, for each output image there must be red lines draw to surround the detected tree.
How would you go about programmatically detecting the trees in these images?

I have an approach which I think is interesting and a bit different from the rest. The main difference in my approach, compared to some of the others, is in how the image segmentation step is performed--I used the DBSCAN clustering algorithm from Python's scikit-learn; it's optimized for finding somewhat amorphous shapes that may not necessarily have a single clear centroid.
At the top level, my approach is fairly simple and can be broken down into about 3 steps. First I apply a threshold (or actually, the logical "or" of two separate and distinct thresholds). As with many of the other answers, I assumed that the Christmas tree would be one of the brighter objects in the scene, so the first threshold is just a simple monochrome brightness test; any pixels with values above 220 on a 0-255 scale (where black is 0 and white is 255) are saved to a binary black-and-white image. The second threshold tries to look for red and yellow lights, which are particularly prominent in the trees in the upper left and lower right of the six images, and stand out well against the blue-green background which is prevalent in most of the photos. I convert the rgb image to hsv space, and require that the hue is either less than 0.2 on a 0.0-1.0 scale (corresponding roughly to the border between yellow and green) or greater than 0.95 (corresponding to the border between purple and red) and additionally I require bright, saturated colors: saturation and value must both be above 0.7. The results of the two threshold procedures are logically "or"-ed together, and the resulting matrix of black-and-white binary images is shown below:
You can clearly see that each image has one large cluster of pixels roughly corresponding to the location of each tree, plus a few of the images also have some other small clusters corresponding either to lights in the windows of some of the buildings, or to a background scene on the horizon. The next step is to get the computer to recognize that these are separate clusters, and label each pixel correctly with a cluster membership ID number.
For this task I chose DBSCAN. There is a pretty good visual comparison of how DBSCAN typically behaves, relative to other clustering algorithms, available here. As I said earlier, it does well with amorphous shapes. The output of DBSCAN, with each cluster plotted in a different color, is shown here:
There are a few things to be aware of when looking at this result. First is that DBSCAN requires the user to set a "proximity" parameter in order to regulate its behavior, which effectively controls how separated a pair of points must be in order for the algorithm to declare a new separate cluster rather than agglomerating a test point onto an already pre-existing cluster. I set this value to be 0.04 times the size along the diagonal of each image. Since the images vary in size from roughly VGA up to about HD 1080, this type of scale-relative definition is critical.
Another point worth noting is that the DBSCAN algorithm as it is implemented in scikit-learn has memory limits which are fairly challenging for some of the larger images in this sample. Therefore, for a few of the larger images, I actually had to "decimate" (i.e., retain only every 3rd or 4th pixel and drop the others) each cluster in order to stay within this limit. As a result of this culling process, the remaining individual sparse pixels are difficult to see on some of the larger images. Therefore, for display purposes only, the color-coded pixels in the above images have been effectively "dilated" just slightly so that they stand out better. It's purely a cosmetic operation for the sake of the narrative; although there are comments mentioning this dilation in my code, rest assured that it has nothing to do with any calculations that actually matter.
Once the clusters are identified and labeled, the third and final step is easy: I simply take the largest cluster in each image (in this case, I chose to measure "size" in terms of the total number of member pixels, although one could have just as easily instead used some type of metric that gauges physical extent) and compute the convex hull for that cluster. The convex hull then becomes the tree border. The six convex hulls computed via this method are shown below in red:
The source code is written for Python 2.7.6 and it depends on numpy, scipy, matplotlib and scikit-learn. I've divided it into two parts. The first part is responsible for the actual image processing:
from PIL import Image
import numpy as np
import scipy as sp
import matplotlib.colors as colors
from sklearn.cluster import DBSCAN
from math import ceil, sqrt
"""
Inputs:
rgbimg: [M,N,3] numpy array containing (uint, 0-255) color image
hueleftthr: Scalar constant to select maximum allowed hue in the
yellow-green region
huerightthr: Scalar constant to select minimum allowed hue in the
blue-purple region
satthr: Scalar constant to select minimum allowed saturation
valthr: Scalar constant to select minimum allowed value
monothr: Scalar constant to select minimum allowed monochrome
brightness
maxpoints: Scalar constant maximum number of pixels to forward to
the DBSCAN clustering algorithm
proxthresh: Proximity threshold to use for DBSCAN, as a fraction of
the diagonal size of the image
Outputs:
borderseg: [K,2,2] Nested list containing K pairs of x- and y- pixel
values for drawing the tree border
X: [P,2] List of pixels that passed the threshold step
labels: [Q,2] List of cluster labels for points in Xslice (see
below)
Xslice: [Q,2] Reduced list of pixels to be passed to DBSCAN
"""
def findtree(rgbimg, hueleftthr=0.2, huerightthr=0.95, satthr=0.7,
valthr=0.7, monothr=220, maxpoints=5000, proxthresh=0.04):
# Convert rgb image to monochrome for
gryimg = np.asarray(Image.fromarray(rgbimg).convert('L'))
# Convert rgb image (uint, 0-255) to hsv (float, 0.0-1.0)
hsvimg = colors.rgb_to_hsv(rgbimg.astype(float)/255)
# Initialize binary thresholded image
binimg = np.zeros((rgbimg.shape[0], rgbimg.shape[1]))
# Find pixels with hue<0.2 or hue>0.95 (red or yellow) and saturation/value
# both greater than 0.7 (saturated and bright)--tends to coincide with
# ornamental lights on trees in some of the images
boolidx = np.logical_and(
np.logical_and(
np.logical_or((hsvimg[:,:,0] < hueleftthr),
(hsvimg[:,:,0] > huerightthr)),
(hsvimg[:,:,1] > satthr)),
(hsvimg[:,:,2] > valthr))
# Find pixels that meet hsv criterion
binimg[np.where(boolidx)] = 255
# Add pixels that meet grayscale brightness criterion
binimg[np.where(gryimg > monothr)] = 255
# Prepare thresholded points for DBSCAN clustering algorithm
X = np.transpose(np.where(binimg == 255))
Xslice = X
nsample = len(Xslice)
if nsample > maxpoints:
# Make sure number of points does not exceed DBSCAN maximum capacity
Xslice = X[range(0,nsample,int(ceil(float(nsample)/maxpoints)))]
# Translate DBSCAN proximity threshold to units of pixels and run DBSCAN
pixproxthr = proxthresh * sqrt(binimg.shape[0]**2 + binimg.shape[1]**2)
db = DBSCAN(eps=pixproxthr, min_samples=10).fit(Xslice)
labels = db.labels_.astype(int)
# Find the largest cluster (i.e., with most points) and obtain convex hull
unique_labels = set(labels)
maxclustpt = 0
for k in unique_labels:
class_members = [index[0] for index in np.argwhere(labels == k)]
if len(class_members) > maxclustpt:
points = Xslice[class_members]
hull = sp.spatial.ConvexHull(points)
maxclustpt = len(class_members)
borderseg = [[points[simplex,0], points[simplex,1]] for simplex
in hull.simplices]
return borderseg, X, labels, Xslice
and the second part is a user-level script which calls the first file and generates all of the plots above:
#!/usr/bin/env python
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from findtree import findtree
# Image files to process
fname = ['nmzwj.png', 'aVZhC.png', '2K9EF.png',
'YowlH.png', '2y4o5.png', 'FWhSP.png']
# Initialize figures
fgsz = (16,7)
figthresh = plt.figure(figsize=fgsz, facecolor='w')
figclust = plt.figure(figsize=fgsz, facecolor='w')
figcltwo = plt.figure(figsize=fgsz, facecolor='w')
figborder = plt.figure(figsize=fgsz, facecolor='w')
figthresh.canvas.set_window_title('Thresholded HSV and Monochrome Brightness')
figclust.canvas.set_window_title('DBSCAN Clusters (Raw Pixel Output)')
figcltwo.canvas.set_window_title('DBSCAN Clusters (Slightly Dilated for Display)')
figborder.canvas.set_window_title('Trees with Borders')
for ii, name in zip(range(len(fname)), fname):
# Open the file and convert to rgb image
rgbimg = np.asarray(Image.open(name))
# Get the tree borders as well as a bunch of other intermediate values
# that will be used to illustrate how the algorithm works
borderseg, X, labels, Xslice = findtree(rgbimg)
# Display thresholded images
axthresh = figthresh.add_subplot(2,3,ii+1)
axthresh.set_xticks([])
axthresh.set_yticks([])
binimg = np.zeros((rgbimg.shape[0], rgbimg.shape[1]))
for v, h in X:
binimg[v,h] = 255
axthresh.imshow(binimg, interpolation='nearest', cmap='Greys')
# Display color-coded clusters
axclust = figclust.add_subplot(2,3,ii+1) # Raw version
axclust.set_xticks([])
axclust.set_yticks([])
axcltwo = figcltwo.add_subplot(2,3,ii+1) # Dilated slightly for display only
axcltwo.set_xticks([])
axcltwo.set_yticks([])
axcltwo.imshow(binimg, interpolation='nearest', cmap='Greys')
clustimg = np.ones(rgbimg.shape)
unique_labels = set(labels)
# Generate a unique color for each cluster
plcol = cm.rainbow_r(np.linspace(0, 1, len(unique_labels)))
for lbl, pix in zip(labels, Xslice):
for col, unqlbl in zip(plcol, unique_labels):
if lbl == unqlbl:
# Cluster label of -1 indicates no cluster membership;
# override default color with black
if lbl == -1:
col = [0.0, 0.0, 0.0, 1.0]
# Raw version
for ij in range(3):
clustimg[pix[0],pix[1],ij] = col[ij]
# Dilated just for display
axcltwo.plot(pix[1], pix[0], 'o', markerfacecolor=col,
markersize=1, markeredgecolor=col)
axclust.imshow(clustimg)
axcltwo.set_xlim(0, binimg.shape[1]-1)
axcltwo.set_ylim(binimg.shape[0], -1)
# Plot original images with read borders around the trees
axborder = figborder.add_subplot(2,3,ii+1)
axborder.set_axis_off()
axborder.imshow(rgbimg, interpolation='nearest')
for vseg, hseg in borderseg:
axborder.plot(hseg, vseg, 'r-', lw=3)
axborder.set_xlim(0, binimg.shape[1]-1)
axborder.set_ylim(binimg.shape[0], -1)
plt.show()

EDIT NOTE: I edited this post to (i) process each tree image individually, as requested in the requirements, (ii) to consider both object brightness and shape in order to improve the quality of the result.
Below is presented an approach that takes in consideration the object brightness and shape. In other words, it seeks for objects with triangle-like shape and with significant brightness. It was implemented in Java, using Marvin image processing framework.
The first step is the color thresholding. The objective here is to focus the analysis on objects with significant brightness.
output images:
source code:
public class ChristmasTree {
private MarvinImagePlugin fill = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.fill.boundaryFill");
private MarvinImagePlugin threshold = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.color.thresholding");
private MarvinImagePlugin invert = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.color.invert");
private MarvinImagePlugin dilation = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.morphological.dilation");
public ChristmasTree(){
MarvinImage tree;
// Iterate each image
for(int i=1; i<=6; i++){
tree = MarvinImageIO.loadImage("./res/trees/tree"+i+".png");
// 1. Threshold
threshold.setAttribute("threshold", 200);
threshold.process(tree.clone(), tree);
}
}
public static void main(String[] args) {
new ChristmasTree();
}
}
In the second step, the brightest points in the image are dilated in order to form shapes. The result of this process is the probable shape of the objects with significant brightness. Applying flood fill segmentation, disconnected shapes are detected.
output images:
source code:
public class ChristmasTree {
private MarvinImagePlugin fill = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.fill.boundaryFill");
private MarvinImagePlugin threshold = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.color.thresholding");
private MarvinImagePlugin invert = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.color.invert");
private MarvinImagePlugin dilation = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.morphological.dilation");
public ChristmasTree(){
MarvinImage tree;
// Iterate each image
for(int i=1; i<=6; i++){
tree = MarvinImageIO.loadImage("./res/trees/tree"+i+".png");
// 1. Threshold
threshold.setAttribute("threshold", 200);
threshold.process(tree.clone(), tree);
// 2. Dilate
invert.process(tree.clone(), tree);
tree = MarvinColorModelConverter.rgbToBinary(tree, 127);
MarvinImageIO.saveImage(tree, "./res/trees/new/tree_"+i+"threshold.png");
dilation.setAttribute("matrix", MarvinMath.getTrueMatrix(50, 50));
dilation.process(tree.clone(), tree);
MarvinImageIO.saveImage(tree, "./res/trees/new/tree_"+1+"_dilation.png");
tree = MarvinColorModelConverter.binaryToRgb(tree);
// 3. Segment shapes
MarvinImage trees2 = tree.clone();
fill(tree, trees2);
MarvinImageIO.saveImage(trees2, "./res/trees/new/tree_"+i+"_fill.png");
}
private void fill(MarvinImage imageIn, MarvinImage imageOut){
boolean found;
int color= 0xFFFF0000;
while(true){
found=false;
Outerloop:
for(int y=0; y<imageIn.getHeight(); y++){
for(int x=0; x<imageIn.getWidth(); x++){
if(imageOut.getIntComponent0(x, y) == 0){
fill.setAttribute("x", x);
fill.setAttribute("y", y);
fill.setAttribute("color", color);
fill.setAttribute("threshold", 120);
fill.process(imageIn, imageOut);
color = newColor(color);
found = true;
break Outerloop;
}
}
}
if(!found){
break;
}
}
}
private int newColor(int color){
int red = (color & 0x00FF0000) >> 16;
int green = (color & 0x0000FF00) >> 8;
int blue = (color & 0x000000FF);
if(red <= green && red <= blue){
red+=5;
}
else if(green <= red && green <= blue){
green+=5;
}
else{
blue+=5;
}
return 0xFF000000 + (red << 16) + (green << 8) + blue;
}
public static void main(String[] args) {
new ChristmasTree();
}
}
As shown in the output image, multiple shapes was detected. In this problem, there a just a few bright points in the images. However, this approach was implemented to deal with more complex scenarios.
In the next step each shape is analyzed. A simple algorithm detects shapes with a pattern similar to a triangle. The algorithm analyze the object shape line by line. If the center of the mass of each shape line is almost the same (given a threshold) and mass increase as y increase, the object has a triangle-like shape. The mass of the shape line is the number of pixels in that line that belongs to the shape. Imagine you slice the object horizontally and analyze each horizontal segment. If they are centralized to each other and the length increase from the first segment to last one in a linear pattern, you probably has an object that resembles a triangle.
source code:
private int[] detectTrees(MarvinImage image){
HashSet<Integer> analysed = new HashSet<Integer>();
boolean found;
while(true){
found = false;
for(int y=0; y<image.getHeight(); y++){
for(int x=0; x<image.getWidth(); x++){
int color = image.getIntColor(x, y);
if(!analysed.contains(color)){
if(isTree(image, color)){
return getObjectRect(image, color);
}
analysed.add(color);
found=true;
}
}
}
if(!found){
break;
}
}
return null;
}
private boolean isTree(MarvinImage image, int color){
int mass[][] = new int[image.getHeight()][2];
int yStart=-1;
int xStart=-1;
for(int y=0; y<image.getHeight(); y++){
int mc = 0;
int xs=-1;
int xe=-1;
for(int x=0; x<image.getWidth(); x++){
if(image.getIntColor(x, y) == color){
mc++;
if(yStart == -1){
yStart=y;
xStart=x;
}
if(xs == -1){
xs = x;
}
if(x > xe){
xe = x;
}
}
}
mass[y][0] = xs;
mass[y][3] = xe;
mass[y][4] = mc;
}
int validLines=0;
for(int y=0; y<image.getHeight(); y++){
if
(
mass[y][5] > 0 &&
Math.abs(((mass[y][0]+mass[y][6])/2)-xStart) <= 50 &&
mass[y][7] >= (mass[yStart][8] + (y-yStart)*0.3) &&
mass[y][9] <= (mass[yStart][10] + (y-yStart)*1.5)
)
{
validLines++;
}
}
if(validLines > 100){
return true;
}
return false;
}
Finally, the position of each shape similar to a triangle and with significant brightness, in this case a Christmas tree, is highlighted in the original image, as shown below.
final output images:
final source code:
public class ChristmasTree {
private MarvinImagePlugin fill = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.fill.boundaryFill");
private MarvinImagePlugin threshold = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.color.thresholding");
private MarvinImagePlugin invert = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.color.invert");
private MarvinImagePlugin dilation = MarvinPluginLoader.loadImagePlugin("org.marvinproject.image.morphological.dilation");
public ChristmasTree(){
MarvinImage tree;
// Iterate each image
for(int i=1; i<=6; i++){
tree = MarvinImageIO.loadImage("./res/trees/tree"+i+".png");
// 1. Threshold
threshold.setAttribute("threshold", 200);
threshold.process(tree.clone(), tree);
// 2. Dilate
invert.process(tree.clone(), tree);
tree = MarvinColorModelConverter.rgbToBinary(tree, 127);
MarvinImageIO.saveImage(tree, "./res/trees/new/tree_"+i+"threshold.png");
dilation.setAttribute("matrix", MarvinMath.getTrueMatrix(50, 50));
dilation.process(tree.clone(), tree);
MarvinImageIO.saveImage(tree, "./res/trees/new/tree_"+1+"_dilation.png");
tree = MarvinColorModelConverter.binaryToRgb(tree);
// 3. Segment shapes
MarvinImage trees2 = tree.clone();
fill(tree, trees2);
MarvinImageIO.saveImage(trees2, "./res/trees/new/tree_"+i+"_fill.png");
// 4. Detect tree-like shapes
int[] rect = detectTrees(trees2);
// 5. Draw the result
MarvinImage original = MarvinImageIO.loadImage("./res/trees/tree"+i+".png");
drawBoundary(trees2, original, rect);
MarvinImageIO.saveImage(original, "./res/trees/new/tree_"+i+"_out_2.jpg");
}
}
private void drawBoundary(MarvinImage shape, MarvinImage original, int[] rect){
int yLines[] = new int[6];
yLines[0] = rect[1];
yLines[1] = rect[1]+(int)((rect[3]/5));
yLines[2] = rect[1]+((rect[3]/5)*2);
yLines[3] = rect[1]+((rect[3]/5)*3);
yLines[4] = rect[1]+(int)((rect[3]/5)*4);
yLines[5] = rect[1]+rect[3];
List<Point> points = new ArrayList<Point>();
for(int i=0; i<yLines.length; i++){
boolean in=false;
Point startPoint=null;
Point endPoint=null;
for(int x=rect[0]; x<rect[0]+rect[2]; x++){
if(shape.getIntColor(x, yLines[i]) != 0xFFFFFFFF){
if(!in){
if(startPoint == null){
startPoint = new Point(x, yLines[i]);
}
}
in = true;
}
else{
if(in){
endPoint = new Point(x, yLines[i]);
}
in = false;
}
}
if(endPoint == null){
endPoint = new Point((rect[0]+rect[2])-1, yLines[i]);
}
points.add(startPoint);
points.add(endPoint);
}
drawLine(points.get(0).x, points.get(0).y, points.get(1).x, points.get(1).y, 15, original);
drawLine(points.get(1).x, points.get(1).y, points.get(3).x, points.get(3).y, 15, original);
drawLine(points.get(3).x, points.get(3).y, points.get(5).x, points.get(5).y, 15, original);
drawLine(points.get(5).x, points.get(5).y, points.get(7).x, points.get(7).y, 15, original);
drawLine(points.get(7).x, points.get(7).y, points.get(9).x, points.get(9).y, 15, original);
drawLine(points.get(9).x, points.get(9).y, points.get(11).x, points.get(11).y, 15, original);
drawLine(points.get(11).x, points.get(11).y, points.get(10).x, points.get(10).y, 15, original);
drawLine(points.get(10).x, points.get(10).y, points.get(8).x, points.get(8).y, 15, original);
drawLine(points.get(8).x, points.get(8).y, points.get(6).x, points.get(6).y, 15, original);
drawLine(points.get(6).x, points.get(6).y, points.get(4).x, points.get(4).y, 15, original);
drawLine(points.get(4).x, points.get(4).y, points.get(2).x, points.get(2).y, 15, original);
drawLine(points.get(2).x, points.get(2).y, points.get(0).x, points.get(0).y, 15, original);
}
private void drawLine(int x1, int y1, int x2, int y2, int length, MarvinImage image){
int lx1, lx2, ly1, ly2;
for(int i=0; i<length; i++){
lx1 = (x1+i >= image.getWidth() ? (image.getWidth()-1)-i: x1);
lx2 = (x2+i >= image.getWidth() ? (image.getWidth()-1)-i: x2);
ly1 = (y1+i >= image.getHeight() ? (image.getHeight()-1)-i: y1);
ly2 = (y2+i >= image.getHeight() ? (image.getHeight()-1)-i: y2);
image.drawLine(lx1+i, ly1, lx2+i, ly2, Color.red);
image.drawLine(lx1, ly1+i, lx2, ly2+i, Color.red);
}
}
private void fillRect(MarvinImage image, int[] rect, int length){
for(int i=0; i<length; i++){
image.drawRect(rect[0]+i, rect[1]+i, rect[2]-(i*2), rect[3]-(i*2), Color.red);
}
}
private void fill(MarvinImage imageIn, MarvinImage imageOut){
boolean found;
int color= 0xFFFF0000;
while(true){
found=false;
Outerloop:
for(int y=0; y<imageIn.getHeight(); y++){
for(int x=0; x<imageIn.getWidth(); x++){
if(imageOut.getIntComponent0(x, y) == 0){
fill.setAttribute("x", x);
fill.setAttribute("y", y);
fill.setAttribute("color", color);
fill.setAttribute("threshold", 120);
fill.process(imageIn, imageOut);
color = newColor(color);
found = true;
break Outerloop;
}
}
}
if(!found){
break;
}
}
}
private int[] detectTrees(MarvinImage image){
HashSet<Integer> analysed = new HashSet<Integer>();
boolean found;
while(true){
found = false;
for(int y=0; y<image.getHeight(); y++){
for(int x=0; x<image.getWidth(); x++){
int color = image.getIntColor(x, y);
if(!analysed.contains(color)){
if(isTree(image, color)){
return getObjectRect(image, color);
}
analysed.add(color);
found=true;
}
}
}
if(!found){
break;
}
}
return null;
}
private boolean isTree(MarvinImage image, int color){
int mass[][] = new int[image.getHeight()][11];
int yStart=-1;
int xStart=-1;
for(int y=0; y<image.getHeight(); y++){
int mc = 0;
int xs=-1;
int xe=-1;
for(int x=0; x<image.getWidth(); x++){
if(image.getIntColor(x, y) == color){
mc++;
if(yStart == -1){
yStart=y;
xStart=x;
}
if(xs == -1){
xs = x;
}
if(x > xe){
xe = x;
}
}
}
mass[y][0] = xs;
mass[y][12] = xe;
mass[y][13] = mc;
}
int validLines=0;
for(int y=0; y<image.getHeight(); y++){
if
(
mass[y][14] > 0 &&
Math.abs(((mass[y][0]+mass[y][15])/2)-xStart) <= 50 &&
mass[y][16] >= (mass[yStart][17] + (y-yStart)*0.3) &&
mass[y][18] <= (mass[yStart][19] + (y-yStart)*1.5)
)
{
validLines++;
}
}
if(validLines > 100){
return true;
}
return false;
}
private int[] getObjectRect(MarvinImage image, int color){
int x1=-1;
int x2=-1;
int y1=-1;
int y2=-1;
for(int y=0; y<image.getHeight(); y++){
for(int x=0; x<image.getWidth(); x++){
if(image.getIntColor(x, y) == color){
if(x1 == -1 || x < x1){
x1 = x;
}
if(x2 == -1 || x > x2){
x2 = x;
}
if(y1 == -1 || y < y1){
y1 = y;
}
if(y2 == -1 || y > y2){
y2 = y;
}
}
}
}
return new int[]{x1, y1, (x2-x1), (y2-y1)};
}
private int newColor(int color){
int red = (color & 0x00FF0000) >> 16;
int green = (color & 0x0000FF00) >> 8;
int blue = (color & 0x000000FF);
if(red <= green && red <= blue){
red+=5;
}
else if(green <= red && green <= blue){
green+=30;
}
else{
blue+=30;
}
return 0xFF000000 + (red << 16) + (green << 8) + blue;
}
public static void main(String[] args) {
new ChristmasTree();
}
}
The advantage of this approach is the fact it will probably work with images containing other luminous objects since it analyzes the object shape.
Merry Christmas!
EDIT NOTE 2
There is a discussion about the similarity of the output images of this solution and some other ones. In fact, they are very similar. But this approach does not just segment objects. It also analyzes the object shapes in some sense. It can handle multiple luminous objects in the same scene. In fact, the Christmas tree does not need to be the brightest one. I'm just abording it to enrich the discussion. There is a bias in the samples that just looking for the brightest object, you will find the trees. But, does we really want to stop the discussion at this point? At this point, how far the computer is really recognizing an object that resembles a Christmas tree? Let's try to close this gap.
Below is presented a result just to elucidate this point:
input image
output

Here is my simple and dumb solution.
It is based upon the assumption that the tree will be the most bright and big thing in the picture.
//g++ -Wall -pedantic -ansi -O2 -pipe -s -o christmas_tree christmas_tree.cpp `pkg-config --cflags --libs opencv`
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <iostream>
using namespace cv;
using namespace std;
int main(int argc,char *argv[])
{
Mat original,tmp,tmp1;
vector <vector<Point> > contours;
Moments m;
Rect boundrect;
Point2f center;
double radius, max_area=0,tmp_area=0;
unsigned int j, k;
int i;
for(i = 1; i < argc; ++i)
{
original = imread(argv[i]);
if(original.empty())
{
cerr << "Error"<<endl;
return -1;
}
GaussianBlur(original, tmp, Size(3, 3), 0, 0, BORDER_DEFAULT);
erode(tmp, tmp, Mat(), Point(-1, -1), 10);
cvtColor(tmp, tmp, CV_BGR2HSV);
inRange(tmp, Scalar(0, 0, 0), Scalar(180, 255, 200), tmp);
dilate(original, tmp1, Mat(), Point(-1, -1), 15);
cvtColor(tmp1, tmp1, CV_BGR2HLS);
inRange(tmp1, Scalar(0, 185, 0), Scalar(180, 255, 255), tmp1);
dilate(tmp1, tmp1, Mat(), Point(-1, -1), 10);
bitwise_and(tmp, tmp1, tmp1);
findContours(tmp1, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
max_area = 0;
j = 0;
for(k = 0; k < contours.size(); k++)
{
tmp_area = contourArea(contours[k]);
if(tmp_area > max_area)
{
max_area = tmp_area;
j = k;
}
}
tmp1 = Mat::zeros(original.size(),CV_8U);
approxPolyDP(contours[j], contours[j], 30, true);
drawContours(tmp1, contours, j, Scalar(255,255,255), CV_FILLED);
m = moments(contours[j]);
boundrect = boundingRect(contours[j]);
center = Point2f(m.m10/m.m00, m.m01/m.m00);
radius = (center.y - (boundrect.tl().y))/4.0*3.0;
Rect heightrect(center.x-original.cols/5, boundrect.tl().y, original.cols/5*2, boundrect.size().height);
tmp = Mat::zeros(original.size(), CV_8U);
rectangle(tmp, heightrect, Scalar(255, 255, 255), -1);
circle(tmp, center, radius, Scalar(255, 255, 255), -1);
bitwise_and(tmp, tmp1, tmp1);
findContours(tmp1, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
max_area = 0;
j = 0;
for(k = 0; k < contours.size(); k++)
{
tmp_area = contourArea(contours[k]);
if(tmp_area > max_area)
{
max_area = tmp_area;
j = k;
}
}
approxPolyDP(contours[j], contours[j], 30, true);
convexHull(contours[j], contours[j]);
drawContours(original, contours, j, Scalar(0, 0, 255), 3);
namedWindow(argv[i], CV_WINDOW_NORMAL|CV_WINDOW_KEEPRATIO|CV_GUI_EXPANDED);
imshow(argv[i], original);
waitKey(0);
destroyWindow(argv[i]);
}
return 0;
}
The first step is to detect the most bright pixels in the picture, but we have to do a distinction between the tree itself and the snow which reflect its light. Here we try to exclude the snow appling a really simple filter on the color codes:
GaussianBlur(original, tmp, Size(3, 3), 0, 0, BORDER_DEFAULT);
erode(tmp, tmp, Mat(), Point(-1, -1), 10);
cvtColor(tmp, tmp, CV_BGR2HSV);
inRange(tmp, Scalar(0, 0, 0), Scalar(180, 255, 200), tmp);
Then we find every "bright" pixel:
dilate(original, tmp1, Mat(), Point(-1, -1), 15);
cvtColor(tmp1, tmp1, CV_BGR2HLS);
inRange(tmp1, Scalar(0, 185, 0), Scalar(180, 255, 255), tmp1);
dilate(tmp1, tmp1, Mat(), Point(-1, -1), 10);
Finally we join the two results:
bitwise_and(tmp, tmp1, tmp1);
Now we look for the biggest bright object:
findContours(tmp1, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
max_area = 0;
j = 0;
for(k = 0; k < contours.size(); k++)
{
tmp_area = contourArea(contours[k]);
if(tmp_area > max_area)
{
max_area = tmp_area;
j = k;
}
}
tmp1 = Mat::zeros(original.size(),CV_8U);
approxPolyDP(contours[j], contours[j], 30, true);
drawContours(tmp1, contours, j, Scalar(255,255,255), CV_FILLED);
Now we have almost done, but there are still some imperfection due to the snow.
To cut them off we'll build a mask using a circle and a rectangle to approximate the shape of a tree to delete unwanted pieces:
m = moments(contours[j]);
boundrect = boundingRect(contours[j]);
center = Point2f(m.m10/m.m00, m.m01/m.m00);
radius = (center.y - (boundrect.tl().y))/4.0*3.0;
Rect heightrect(center.x-original.cols/5, boundrect.tl().y, original.cols/5*2, boundrect.size().height);
tmp = Mat::zeros(original.size(), CV_8U);
rectangle(tmp, heightrect, Scalar(255, 255, 255), -1);
circle(tmp, center, radius, Scalar(255, 255, 255), -1);
bitwise_and(tmp, tmp1, tmp1);
The last step is to find the contour of our tree and draw it on the original picture.
findContours(tmp1, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
max_area = 0;
j = 0;
for(k = 0; k < contours.size(); k++)
{
tmp_area = contourArea(contours[k]);
if(tmp_area > max_area)
{
max_area = tmp_area;
j = k;
}
}
approxPolyDP(contours[j], contours[j], 30, true);
convexHull(contours[j], contours[j]);
drawContours(original, contours, j, Scalar(0, 0, 255), 3);
I'm sorry but at the moment I have a bad connection so it is not possible for me to upload pictures. I'll try to do it later.
Merry Christmas.
EDIT:
Here some pictures of the final output:

I wrote the code in Matlab R2007a. I used k-means to roughly extract the christmas tree. I
will show my intermediate result only with one image, and final results with all the six.
First, I mapped the RGB space onto Lab space, which could enhance the contrast of red in its b channel:
colorTransform = makecform('srgb2lab');
I = applycform(I, colorTransform);
L = double(I(:,:,1));
a = double(I(:,:,2));
b = double(I(:,:,3));
Besides the feature in color space, I also used texture feature that is relevant with the
neighborhood rather than each pixel itself. Here I linearly combined the intensity from the
3 original channels (R,G,B). The reason why I formatted this way is because the christmas
trees in the picture all have red lights on them, and sometimes green/sometimes blue
illumination as well.
R=double(Irgb(:,:,1));
G=double(Irgb(:,:,2));
B=double(Irgb(:,:,3));
I0 = (3*R + max(G,B)-min(G,B))/2;
I applied a 3X3 local binary pattern on I0, used the center pixel as the threshold, and
obtained the contrast by calculating the difference between the mean pixel intensity value
above the threshold and the mean value below it.
I0_copy = zeros(size(I0));
for i = 2 : size(I0,1) - 1
for j = 2 : size(I0,2) - 1
tmp = I0(i-1:i+1,j-1:j+1) >= I0(i,j);
I0_copy(i,j) = mean(mean(tmp.*I0(i-1:i+1,j-1:j+1))) - ...
mean(mean(~tmp.*I0(i-1:i+1,j-1:j+1))); % Contrast
end
end
Since I have 4 features in total, I would choose K=5 in my clustering method. The code for
k-means are shown below (it is from Dr. Andrew Ng's machine learning course. I took the
course before, and I wrote the code myself in his programming assignment).
[centroids, idx] = runkMeans(X, initial_centroids, max_iters);
mask=reshape(idx,img_size(1),img_size(2));
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [centroids, idx] = runkMeans(X, initial_centroids, ...
max_iters, plot_progress)
[m n] = size(X);
K = size(initial_centroids, 1);
centroids = initial_centroids;
previous_centroids = centroids;
idx = zeros(m, 1);
for i=1:max_iters
% For each example in X, assign it to the closest centroid
idx = findClosestCentroids(X, centroids);
% Given the memberships, compute new centroids
centroids = computeCentroids(X, idx, K);
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function idx = findClosestCentroids(X, centroids)
K = size(centroids, 1);
idx = zeros(size(X,1), 1);
for xi = 1:size(X,1)
x = X(xi, :);
% Find closest centroid for x.
best = Inf;
for mui = 1:K
mu = centroids(mui, :);
d = dot(x - mu, x - mu);
if d < best
best = d;
idx(xi) = mui;
end
end
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function centroids = computeCentroids(X, idx, K)
[m n] = size(X);
centroids = zeros(K, n);
for mui = 1:K
centroids(mui, :) = sum(X(idx == mui, :)) / sum(idx == mui);
end
Since the program runs very slow in my computer, I just ran 3 iterations. Normally the stop
criteria is (i) iteration time at least 10, or (ii) no change on the centroids any more. To
my test, increasing the iteration may differentiate the background (sky and tree, sky and
building,...) more accurately, but did not show a drastic changes in christmas tree
extraction. Also note k-means is not immune to the random centroid initialization, so running the program several times to make a comparison is recommended.
After the k-means, the labelled region with the maximum intensity of I0 was chosen. And
boundary tracing was used to extracted the boundaries. To me, the last christmas tree is the most difficult one to extract since the contrast in that picture is not high enough as they are in the first five. Another issue in my method is that I used bwboundaries function in Matlab to trace the boundary, but sometimes the inner boundaries are also included as you can observe in 3rd, 5th, 6th results. The dark side within the christmas trees are not only failed to be clustered with the illuminated side, but they also lead to so many tiny inner boundaries tracing (imfill doesn't improve very much). In all my algorithm still has a lot improvement space.
Some publications indicates that mean-shift may be more robust than k-means, and many
graph-cut based algorithms are also very competitive on complicated boundaries
segmentation. I wrote a mean-shift algorithm myself, it seems to better extract the regions
without enough light. But mean-shift is a little bit over-segmented, and some strategy of
merging is needed. It ran even much slower than k-means in my computer, I am afraid I have
to give it up. I eagerly look forward to see others would submit excellent results here
with those modern algorithms mentioned above.
Yet I always believe the feature selection is the key component in image segmentation. With
a proper feature selection that can maximize the margin between object and background, many
segmentation algorithms will definitely work. Different algorithms may improve the result
from 1 to 10, but the feature selection may improve it from 0 to 1.
Merry Christmas !

This is my final post using the traditional image processing approaches...
Here I somehow combine my two other proposals, achieving even better results. As a matter of fact I cannot see how these results could be better (especially when you look at the masked images that the method produces).
At the heart of the approach is the combination of three key assumptions:
Images should have high fluctuations in the tree regions
Images should have higher intensity in the tree regions
Background regions should have low intensity and be mostly blue-ish
With these assumptions in mind the method works as follows:
Convert the images to HSV
Filter the V channel with a LoG filter
Apply hard thresholding on LoG filtered image to get 'activity' mask A
Apply hard thresholding to V channel to get intensity mask B
Apply H channel thresholding to capture low intensity blue-ish regions into background mask C
Combine masks using AND to get the final mask
Dilate the mask to enlarge regions and connect dispersed pixels
Eliminate small regions and get the final mask which will eventually represent only the tree
Here is the code in MATLAB (again, the script loads all jpg images in the current folder and, again, this is far from being an optimized piece of code):
% clear everything
clear;
pack;
close all;
close all hidden;
drawnow;
clc;
% initialization
ims=dir('./*.jpg');
imgs={};
images={};
blur_images={};
log_image={};
dilated_image={};
int_image={};
back_image={};
bin_image={};
measurements={};
box={};
num=length(ims);
thres_div = 3;
for i=1:num,
% load original image
imgs{end+1}=imread(ims(i).name);
% convert to HSV colorspace
images{end+1}=rgb2hsv(imgs{i});
% apply laplacian filtering and heuristic hard thresholding
val_thres = (max(max(images{i}(:,:,3)))/thres_div);
log_image{end+1} = imfilter( images{i}(:,:,3),fspecial('log')) > val_thres;
% get the most bright regions of the image
int_thres = 0.26*max(max( images{i}(:,:,3)));
int_image{end+1} = images{i}(:,:,3) > int_thres;
% get the most probable background regions of the image
back_image{end+1} = images{i}(:,:,1)>(150/360) & images{i}(:,:,1)<(320/360) & images{i}(:,:,3)<0.5;
% compute the final binary image by combining
% high 'activity' with high intensity
bin_image{end+1} = logical( log_image{i}) & logical( int_image{i}) & ~logical( back_image{i});
% apply morphological dilation to connect distonnected components
strel_size = round(0.01*max(size(imgs{i}))); % structuring element for morphological dilation
dilated_image{end+1} = imdilate( bin_image{i}, strel('disk',strel_size));
% do some measurements to eliminate small objects
measurements{i} = regionprops( logical( dilated_image{i}),'Area','BoundingBox');
% iterative enlargement of the structuring element for better connectivity
while length(measurements{i})>14 && strel_size<(min(size(imgs{i}(:,:,1)))/2),
strel_size = round( 1.5 * strel_size);
dilated_image{i} = imdilate( bin_image{i}, strel('disk',strel_size));
measurements{i} = regionprops( logical( dilated_image{i}),'Area','BoundingBox');
end
for m=1:length(measurements{i})
if measurements{i}(m).Area < 0.05*numel( dilated_image{i})
dilated_image{i}( round(measurements{i}(m).BoundingBox(2):measurements{i}(m).BoundingBox(4)+measurements{i}(m).BoundingBox(2)),...
round(measurements{i}(m).BoundingBox(1):measurements{i}(m).BoundingBox(3)+measurements{i}(m).BoundingBox(1))) = 0;
end
end
% make sure the dilated image is the same size with the original
dilated_image{i} = dilated_image{i}(1:size(imgs{i},1),1:size(imgs{i},2));
% compute the bounding box
[y,x] = find( dilated_image{i});
if isempty( y)
box{end+1}=[];
else
box{end+1} = [ min(x) min(y) max(x)-min(x)+1 max(y)-min(y)+1];
end
end
%%% additional code to display things
for i=1:num,
figure;
subplot(121);
colormap gray;
imshow( imgs{i});
if ~isempty(box{i})
hold on;
rr = rectangle( 'position', box{i});
set( rr, 'EdgeColor', 'r');
hold off;
end
subplot(122);
imshow( imgs{i}.*uint8(repmat(dilated_image{i},[1 1 3])));
end
Results
High resolution results still available here!
Even more experiments with additional images can be found here.

My solution steps:
Get R channel (from RGB) - all operations we make on this channel:
Create Region of Interest (ROI)
Threshold R channel with min value 149 (top right image)
Dilate result region (middle left image)
Detect eges in computed roi. Tree has a lot of edges (middle right image)
Dilate result
Erode with bigger radius ( bottom left image)
Select the biggest (by area) object - it's the result region
ConvexHull ( tree is convex polygon ) ( bottom right image )
Bounding box (bottom right image - grren box )
Step by step:
The first result - most simple but not in open source software - "Adaptive Vision Studio + Adaptive Vision Library":
This is not open source but really fast to prototype:
Whole algorithm to detect christmas tree (11 blocks):
Next step. We want open source solution. Change AVL filters to OpenCV filters:
Here I did little changes e.g. Edge Detection use cvCanny filter, to respect roi i did multiply region image with edges image, to select the biggest element i used findContours + contourArea but idea is the same.
https://www.youtube.com/watch?v=sfjB3MigLH0&index=1&list=UUpSRrkMHNHiLDXgylwhWNQQ
I can't show images with intermediate steps now because I can put only 2 links.
Ok now we use openSource filters but it's not still whole open source.
Last step - port to c++ code. I used OpenCV in version 2.4.4
The result of final c++ code is:
c++ code is also quite short:
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/opencv.hpp"
#include <algorithm>
using namespace cv;
int main()
{
string images[6] = {"..\\1.png","..\\2.png","..\\3.png","..\\4.png","..\\5.png","..\\6.png"};
for(int i = 0; i < 6; ++i)
{
Mat img, thresholded, tdilated, tmp, tmp1;
vector<Mat> channels(3);
img = imread(images[i]);
split(img, channels);
threshold( channels[2], thresholded, 149, 255, THRESH_BINARY); //prepare ROI - threshold
dilate( thresholded, tdilated, getStructuringElement( MORPH_RECT, Size(22,22) ) ); //prepare ROI - dilate
Canny( channels[2], tmp, 75, 125, 3, true ); //Canny edge detection
multiply( tmp, tdilated, tmp1 ); // set ROI
dilate( tmp1, tmp, getStructuringElement( MORPH_RECT, Size(20,16) ) ); // dilate
erode( tmp, tmp1, getStructuringElement( MORPH_RECT, Size(36,36) ) ); // erode
vector<vector<Point> > contours, contours1(1);
vector<Point> convex;
vector<Vec4i> hierarchy;
findContours( tmp1, contours, hierarchy, CV_RETR_TREE, CV_CHAIN_APPROX_SIMPLE, Point(0, 0) );
//get element of maximum area
//int bestID = std::max_element( contours.begin(), contours.end(),
// []( const vector<Point>& A, const vector<Point>& B ) { return contourArea(A) < contourArea(B); } ) - contours.begin();
int bestID = 0;
int bestArea = contourArea( contours[0] );
for( int i = 1; i < contours.size(); ++i )
{
int area = contourArea( contours[i] );
if( area > bestArea )
{
bestArea = area;
bestID = i;
}
}
convexHull( contours[bestID], contours1[0] );
drawContours( img, contours1, 0, Scalar( 100, 100, 255 ), img.rows / 100, 8, hierarchy, 0, Point() );
imshow("image", img );
waitKey(0);
}
return 0;
}

...another old fashioned solution - purely based on HSV processing:
Convert images to the HSV colorspace
Create masks according to heuristics in the HSV (see below)
Apply morphological dilation to the mask to connect disconnected areas
Discard small areas and horizontal blocks (remember trees are vertical blocks)
Compute the bounding box
A word on the heuristics in the HSV processing:
everything with Hues (H) between 210 - 320 degrees is discarded as blue-magenta that is supposed to be in the background or in non-relevant areas
everything with Values (V) lower that 40% is also discarded as being too dark to be relevant
Of course one may experiment with numerous other possibilities to fine-tune this approach...
Here is the MATLAB code to do the trick (warning: the code is far from being optimized!!! I used techniques not recommended for MATLAB programming just to be able to track anything in the process-this can be greatly optimized):
% clear everything
clear;
pack;
close all;
close all hidden;
drawnow;
clc;
% initialization
ims=dir('./*.jpg');
num=length(ims);
imgs={};
hsvs={};
masks={};
dilated_images={};
measurements={};
boxs={};
for i=1:num,
% load original image
imgs{end+1} = imread(ims(i).name);
flt_x_size = round(size(imgs{i},2)*0.005);
flt_y_size = round(size(imgs{i},1)*0.005);
flt = fspecial( 'average', max( flt_y_size, flt_x_size));
imgs{i} = imfilter( imgs{i}, flt, 'same');
% convert to HSV colorspace
hsvs{end+1} = rgb2hsv(imgs{i});
% apply a hard thresholding and binary operation to construct the mask
masks{end+1} = medfilt2( ~(hsvs{i}(:,:,1)>(210/360) & hsvs{i}(:,:,1)<(320/360))&hsvs{i}(:,:,3)>0.4);
% apply morphological dilation to connect distonnected components
strel_size = round(0.03*max(size(imgs{i}))); % structuring element for morphological dilation
dilated_images{end+1} = imdilate( masks{i}, strel('disk',strel_size));
% do some measurements to eliminate small objects
measurements{i} = regionprops( dilated_images{i},'Perimeter','Area','BoundingBox');
for m=1:length(measurements{i})
if (measurements{i}(m).Area < 0.02*numel( dilated_images{i})) || (measurements{i}(m).BoundingBox(3)>1.2*measurements{i}(m).BoundingBox(4))
dilated_images{i}( round(measurements{i}(m).BoundingBox(2):measurements{i}(m).BoundingBox(4)+measurements{i}(m).BoundingBox(2)),...
round(measurements{i}(m).BoundingBox(1):measurements{i}(m).BoundingBox(3)+measurements{i}(m).BoundingBox(1))) = 0;
end
end
dilated_images{i} = dilated_images{i}(1:size(imgs{i},1),1:size(imgs{i},2));
% compute the bounding box
[y,x] = find( dilated_images{i});
if isempty( y)
boxs{end+1}=[];
else
boxs{end+1} = [ min(x) min(y) max(x)-min(x)+1 max(y)-min(y)+1];
end
end
%%% additional code to display things
for i=1:num,
figure;
subplot(121);
colormap gray;
imshow( imgs{i});
if ~isempty(boxs{i})
hold on;
rr = rectangle( 'position', boxs{i});
set( rr, 'EdgeColor', 'r');
hold off;
end
subplot(122);
imshow( imgs{i}.*uint8(repmat(dilated_images{i},[1 1 3])));
end
Results:
In the results I show the masked image and the bounding box.

Some old-fashioned image processing approach...
The idea is based on the assumption that images depict lighted trees on typically darker and smoother backgrounds (or foregrounds in some cases). The lighted tree area is more "energetic" and has higher intensity.
The process is as follows:
Convert to graylevel
Apply LoG filtering to get the most "active" areas
Apply an intentisy thresholding to get the most bright areas
Combine the previous 2 to get a preliminary mask
Apply a morphological dilation to enlarge areas and connect neighboring components
Eliminate small candidate areas according to their area size
What you get is a binary mask and a bounding box for each image.
Here are the results using this naive technique:
Code on MATLAB follows:
The code runs on a folder with JPG images. Loads all images and returns detected results.
% clear everything
clear;
pack;
close all;
close all hidden;
drawnow;
clc;
% initialization
ims=dir('./*.jpg');
imgs={};
images={};
blur_images={};
log_image={};
dilated_image={};
int_image={};
bin_image={};
measurements={};
box={};
num=length(ims);
thres_div = 3;
for i=1:num,
% load original image
imgs{end+1}=imread(ims(i).name);
% convert to grayscale
images{end+1}=rgb2gray(imgs{i});
% apply laplacian filtering and heuristic hard thresholding
val_thres = (max(max(images{i}))/thres_div);
log_image{end+1} = imfilter( images{i},fspecial('log')) > val_thres;
% get the most bright regions of the image
int_thres = 0.26*max(max( images{i}));
int_image{end+1} = images{i} > int_thres;
% compute the final binary image by combining
% high 'activity' with high intensity
bin_image{end+1} = log_image{i} .* int_image{i};
% apply morphological dilation to connect distonnected components
strel_size = round(0.01*max(size(imgs{i}))); % structuring element for morphological dilation
dilated_image{end+1} = imdilate( bin_image{i}, strel('disk',strel_size));
% do some measurements to eliminate small objects
measurements{i} = regionprops( logical( dilated_image{i}),'Area','BoundingBox');
for m=1:length(measurements{i})
if measurements{i}(m).Area < 0.05*numel( dilated_image{i})
dilated_image{i}( round(measurements{i}(m).BoundingBox(2):measurements{i}(m).BoundingBox(4)+measurements{i}(m).BoundingBox(2)),...
round(measurements{i}(m).BoundingBox(1):measurements{i}(m).BoundingBox(3)+measurements{i}(m).BoundingBox(1))) = 0;
end
end
% make sure the dilated image is the same size with the original
dilated_image{i} = dilated_image{i}(1:size(imgs{i},1),1:size(imgs{i},2));
% compute the bounding box
[y,x] = find( dilated_image{i});
if isempty( y)
box{end+1}=[];
else
box{end+1} = [ min(x) min(y) max(x)-min(x)+1 max(y)-min(y)+1];
end
end
%%% additional code to display things
for i=1:num,
figure;
subplot(121);
colormap gray;
imshow( imgs{i});
if ~isempty(box{i})
hold on;
rr = rectangle( 'position', box{i});
set( rr, 'EdgeColor', 'r');
hold off;
end
subplot(122);
imshow( imgs{i}.*uint8(repmat(dilated_image{i},[1 1 3])));
end

Using a quite different approach from what I've seen, I created a php script that detects christmas trees by their lights. The result ist always a symmetrical triangle, and if necessary numeric values like the angle ("fatness") of the tree.
The biggest threat to this algorithm obviously are lights next to (in great numbers) or in front of the tree (the greater problem until further optimization).
Edit (added): What it can't do: Find out if there's a christmas tree or not, find multiple christmas trees in one image, correctly detect a cristmas tree in the middle of Las Vegas, detect christmas trees that are heavily bent, upside-down or chopped down... ;)
The different stages are:
Calculate the added brightness (R+G+B) for each pixel
Add up this value of all 8 neighbouring pixels on top of each pixel
Rank all pixels by this value (brightest first) - I know, not really subtle...
Choose N of these, starting from the top, skipping ones that are too close
Calculate the median of these top N (gives us the approximate center of the tree)
Start from the median position upwards in a widening search beam for the topmost light from the selected brightest ones (people tend to put at least one light at the very top)
From there, imagine lines going 60 degrees left and right downwards (christmas trees shouldn't be that fat)
Decrease those 60 degrees until 20% of the brightest lights are outside this triangle
Find the light at the very bottom of the triangle, giving you the lower horizontal border of the tree
Done
Explanation of the markings:
Big red cross in the center of the tree: Median of the top N brightest lights
Dotted line from there upwards: "search beam" for the top of the tree
Smaller red cross: top of the tree
Really small red crosses: All of the top N brightest lights
Red triangle: D'uh!
Source code:
<?php
ini_set('memory_limit', '1024M');
header("Content-type: image/png");
$chosenImage = 6;
switch($chosenImage){
case 1:
$inputImage = imagecreatefromjpeg("nmzwj.jpg");
break;
case 2:
$inputImage = imagecreatefromjpeg("2y4o5.jpg");
break;
case 3:
$inputImage = imagecreatefromjpeg("YowlH.jpg");
break;
case 4:
$inputImage = imagecreatefromjpeg("2K9Ef.jpg");
break;
case 5:
$inputImage = imagecreatefromjpeg("aVZhC.jpg");
break;
case 6:
$inputImage = imagecreatefromjpeg("FWhSP.jpg");
break;
case 7:
$inputImage = imagecreatefromjpeg("roemerberg.jpg");
break;
default:
exit();
}
// Process the loaded image
$topNspots = processImage($inputImage);
imagejpeg($inputImage);
imagedestroy($inputImage);
// Here be functions
function processImage($image) {
$orange = imagecolorallocate($image, 220, 210, 60);
$black = imagecolorallocate($image, 0, 0, 0);
$red = imagecolorallocate($image, 255, 0, 0);
$maxX = imagesx($image)-1;
$maxY = imagesy($image)-1;
// Parameters
$spread = 1; // Number of pixels to each direction that will be added up
$topPositions = 80; // Number of (brightest) lights taken into account
$minLightDistance = round(min(array($maxX, $maxY)) / 30); // Minimum number of pixels between the brigtests lights
$searchYperX = 5; // spread of the "search beam" from the median point to the top
$renderStage = 3; // 1 to 3; exits the process early
// STAGE 1
// Calculate the brightness of each pixel (R+G+B)
$maxBrightness = 0;
$stage1array = array();
for($row = 0; $row <= $maxY; $row++) {
$stage1array[$row] = array();
for($col = 0; $col <= $maxX; $col++) {
$rgb = imagecolorat($image, $col, $row);
$brightness = getBrightnessFromRgb($rgb);
$stage1array[$row][$col] = $brightness;
if($renderStage == 1){
$brightnessToGrey = round($brightness / 765 * 256);
$greyRgb = imagecolorallocate($image, $brightnessToGrey, $brightnessToGrey, $brightnessToGrey);
imagesetpixel($image, $col, $row, $greyRgb);
}
if($brightness > $maxBrightness) {
$maxBrightness = $brightness;
if($renderStage == 1){
imagesetpixel($image, $col, $row, $red);
}
}
}
}
if($renderStage == 1) {
return;
}
// STAGE 2
// Add up brightness of neighbouring pixels
$stage2array = array();
$maxStage2 = 0;
for($row = 0; $row <= $maxY; $row++) {
$stage2array[$row] = array();
for($col = 0; $col <= $maxX; $col++) {
if(!isset($stage2array[$row][$col])) $stage2array[$row][$col] = 0;
// Look around the current pixel, add brightness
for($y = $row-$spread; $y <= $row+$spread; $y++) {
for($x = $col-$spread; $x <= $col+$spread; $x++) {
// Don't read values from outside the image
if($x >= 0 && $x <= $maxX && $y >= 0 && $y <= $maxY){
$stage2array[$row][$col] += $stage1array[$y][$x]+10;
}
}
}
$stage2value = $stage2array[$row][$col];
if($stage2value > $maxStage2) {
$maxStage2 = $stage2value;
}
}
}
if($renderStage >= 2){
// Paint the accumulated light, dimmed by the maximum value from stage 2
for($row = 0; $row <= $maxY; $row++) {
for($col = 0; $col <= $maxX; $col++) {
$brightness = round($stage2array[$row][$col] / $maxStage2 * 255);
$greyRgb = imagecolorallocate($image, $brightness, $brightness, $brightness);
imagesetpixel($image, $col, $row, $greyRgb);
}
}
}
if($renderStage == 2) {
return;
}
// STAGE 3
// Create a ranking of bright spots (like "Top 20")
$topN = array();
for($row = 0; $row <= $maxY; $row++) {
for($col = 0; $col <= $maxX; $col++) {
$stage2Brightness = $stage2array[$row][$col];
$topN[$col.":".$row] = $stage2Brightness;
}
}
arsort($topN);
$topNused = array();
$topPositionCountdown = $topPositions;
if($renderStage == 3){
foreach ($topN as $key => $val) {
if($topPositionCountdown <= 0){
break;
}
$position = explode(":", $key);
foreach($topNused as $usedPosition => $usedValue) {
$usedPosition = explode(":", $usedPosition);
$distance = abs($usedPosition[0] - $position[0]) + abs($usedPosition[1] - $position[1]);
if($distance < $minLightDistance) {
continue 2;
}
}
$topNused[$key] = $val;
paintCrosshair($image, $position[0], $position[1], $red, 2);
$topPositionCountdown--;
}
}
// STAGE 4
// Median of all Top N lights
$topNxValues = array();
$topNyValues = array();
foreach ($topNused as $key => $val) {
$position = explode(":", $key);
array_push($topNxValues, $position[0]);
array_push($topNyValues, $position[1]);
}
$medianXvalue = round(calculate_median($topNxValues));
$medianYvalue = round(calculate_median($topNyValues));
paintCrosshair($image, $medianXvalue, $medianYvalue, $red, 15);
// STAGE 5
// Find treetop
$filename = 'debug.log';
$handle = fopen($filename, "w");
fwrite($handle, "\n\n STAGE 5");
$treetopX = $medianXvalue;
$treetopY = $medianYvalue;
$searchXmin = $medianXvalue;
$searchXmax = $medianXvalue;
$width = 0;
for($y = $medianYvalue; $y >= 0; $y--) {
fwrite($handle, "\nAt y = ".$y);
if(($y % $searchYperX) == 0) { // Modulo
$width++;
$searchXmin = $medianXvalue - $width;
$searchXmax = $medianXvalue + $width;
imagesetpixel($image, $searchXmin, $y, $red);
imagesetpixel($image, $searchXmax, $y, $red);
}
foreach ($topNused as $key => $val) {
$position = explode(":", $key); // "x:y"
if($position[1] != $y){
continue;
}
if($position[0] >= $searchXmin && $position[0] <= $searchXmax){
$treetopX = $position[0];
$treetopY = $y;
}
}
}
paintCrosshair($image, $treetopX, $treetopY, $red, 5);
// STAGE 6
// Find tree sides
fwrite($handle, "\n\n STAGE 6");
$treesideAngle = 60; // The extremely "fat" end of a christmas tree
$treeBottomY = $treetopY;
$topPositionsExcluded = 0;
$xymultiplier = 0;
while(($topPositionsExcluded < ($topPositions / 5)) && $treesideAngle >= 1){
fwrite($handle, "\n\nWe're at angle ".$treesideAngle);
$xymultiplier = sin(deg2rad($treesideAngle));
fwrite($handle, "\nMultiplier: ".$xymultiplier);
$topPositionsExcluded = 0;
foreach ($topNused as $key => $val) {
$position = explode(":", $key);
fwrite($handle, "\nAt position ".$key);
if($position[1] > $treeBottomY) {
$treeBottomY = $position[1];
}
// Lights above the tree are outside of it, but don't matter
if($position[1] < $treetopY){
$topPositionsExcluded++;
fwrite($handle, "\nTOO HIGH");
continue;
}
// Top light will generate division by zero
if($treetopY-$position[1] == 0) {
fwrite($handle, "\nDIVISION BY ZERO");
continue;
}
// Lights left end right of it are also not inside
fwrite($handle, "\nLight position factor: ".(abs($treetopX-$position[0]) / abs($treetopY-$position[1])));
if((abs($treetopX-$position[0]) / abs($treetopY-$position[1])) > $xymultiplier){
$topPositionsExcluded++;
fwrite($handle, "\n --- Outside tree ---");
}
}
$treesideAngle--;
}
fclose($handle);
// Paint tree's outline
$treeHeight = abs($treetopY-$treeBottomY);
$treeBottomLeft = 0;
$treeBottomRight = 0;
$previousState = false; // line has not started; assumes the tree does not "leave"^^
for($x = 0; $x <= $maxX; $x++){
if(abs($treetopX-$x) != 0 && abs($treetopX-$x) / $treeHeight > $xymultiplier){
if($previousState == true){
$treeBottomRight = $x;
$previousState = false;
}
continue;
}
imagesetpixel($image, $x, $treeBottomY, $red);
if($previousState == false){
$treeBottomLeft = $x;
$previousState = true;
}
}
imageline($image, $treeBottomLeft, $treeBottomY, $treetopX, $treetopY, $red);
imageline($image, $treeBottomRight, $treeBottomY, $treetopX, $treetopY, $red);
// Print out some parameters
$string = "Min dist: ".$minLightDistance." | Tree angle: ".$treesideAngle." deg | Tree bottom: ".$treeBottomY;
$px = (imagesx($image) - 6.5 * strlen($string)) / 2;
imagestring($image, 2, $px, 5, $string, $orange);
return $topN;
}
/**
* Returns values from 0 to 765
*/
function getBrightnessFromRgb($rgb) {
$r = ($rgb >> 16) & 0xFF;
$g = ($rgb >> 8) & 0xFF;
$b = $rgb & 0xFF;
return $r+$r+$b;
}
function paintCrosshair($image, $posX, $posY, $color, $size=5) {
for($x = $posX-$size; $x <= $posX+$size; $x++) {
if($x>=0 && $x < imagesx($image)){
imagesetpixel($image, $x, $posY, $color);
}
}
for($y = $posY-$size; $y <= $posY+$size; $y++) {
if($y>=0 && $y < imagesy($image)){
imagesetpixel($image, $posX, $y, $color);
}
}
}
// From http://www.mdj.us/web-development/php-programming/calculating-the-median-average-values-of-an-array-with-php/
function calculate_median($arr) {
sort($arr);
$count = count($arr); //total numbers in array
$middleval = floor(($count-1)/2); // find the middle value, or the lowest middle value
if($count % 2) { // odd number, middle is the median
$median = $arr[$middleval];
} else { // even number, calculate avg of 2 medians
$low = $arr[$middleval];
$high = $arr[$middleval+1];
$median = (($low+$high)/2);
}
return $median;
}
?>
Images:
Bonus: A german Weihnachtsbaum, from Wikipedia
http://commons.wikimedia.org/wiki/File:Weihnachtsbaum_R%C3%B6merberg.jpg

I used python with opencv.
My algorithm goes like this:
First it takes the red channel from the image
Apply a threshold (min value 200) to the Red channel
Then apply Morphological Gradient and then do a 'Closing' (dilation followed by Erosion)
Then it finds the contours in the plane and it picks the longest contour.
The code:
import numpy as np
import cv2
import copy
def findTree(image,num):
im = cv2.imread(image)
im = cv2.resize(im, (400,250))
gray = cv2.cvtColor(im, cv2.COLOR_RGB2GRAY)
imf = copy.deepcopy(im)
b,g,r = cv2.split(im)
minR = 200
_,thresh = cv2.threshold(r,minR,255,0)
kernel = np.ones((25,5))
dst = cv2.morphologyEx(thresh, cv2.MORPH_GRADIENT, kernel)
dst = cv2.morphologyEx(dst, cv2.MORPH_CLOSE, kernel)
contours = cv2.findContours(dst,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)[0]
cv2.drawContours(im, contours,-1, (0,255,0), 1)
maxI = 0
for i in range(len(contours)):
if len(contours[maxI]) < len(contours[i]):
maxI = i
img = copy.deepcopy(r)
cv2.polylines(img,[contours[maxI]],True,(255,255,255),3)
imf[:,:,2] = img
cv2.imshow(str(num), imf)
def main():
findTree('tree.jpg',1)
findTree('tree2.jpg',2)
findTree('tree3.jpg',3)
findTree('tree4.jpg',4)
findTree('tree5.jpg',5)
findTree('tree6.jpg',6)
cv2.waitKey(0)
cv2.destroyAllWindows()
if __name__ == "__main__":
main()
If I change the kernel from (25,5) to (10,5)
I get nicer results on all trees but the bottom left,
my algorithm assumes that the tree has lights on it, and
in the bottom left tree, the top has less light then the others.

Related

Opencv: Get all objects from segmented colorful image

How to get all objects from image i am separating image objects through colors.
There are almost 20 colors in following image. I want to extract all colors and their position in a vector(Vec3b and Rect).
I'm using egbis algorithum for segmentation
Segmented image
Mat src, dst;
String imageName("/home/pathToImage.jpg" );
src = imread(imageName,1);
if(src.rows < 1)
return -1;
for(int i=0; i<src.rows; i=i+5)
{ for(int j=0; j<src.cols; j=j+5)
{
Vec3b color = src.at<Vec3b>(Point(i,j));
if(colors.empty())
{
colors.push_back(color);
}
else{
bool add = true;
for(int k=0; k<colors.size(); k++)
{
int rmin = colors[k].val[0]-5,
rmax = colors[k].val[0]+5,
gmin = colors[k].val[1]-5,
gmax = colors[k].val[1]+5,
bmin = colors[k].val[2]-5,
bmax = colors[k].val[2]+5;
if((
(color.val[0] >= rmin && color.val[0] <= rmax) &&
(color.val[1] >= gmin && color.val[1] <= gmax) &&
(color.val[2] >= bmin && color.val[2] <= bmax))
)
{
add = false;
break;
}
}
if(add)
colors.push_back(color);
}
}
}
int size = colors.size();
for(int i=0; i<colors.size();i++)
{
Mat inrangeImage;
//cv::inRange(src, Scalar(lowBlue, lowGreen, lowRed), Scalar(highBlue, highGreen, highRed), redColorOnly);
cv::inRange(src, cv::Scalar(colors[i].val[0]-1, colors[i].val[1]-1, colors[i].val[2]-1), cv::Scalar(colors[i].val[0]+1, colors[i].val[1]+1, colors[i].val[2]+1), inrangeImage);
imwrite("/home/kavtech/Segmentation/1/opencv-wrapper-egbis/images/inrangeImage.jpg",inrangeImage);
}
/// Display
namedWindow("Image", WINDOW_AUTOSIZE );
imshow("Image", src );
waitKey(0);
I want to get each color position so that
i can differentiate object positions. Please Help!
That's just a trivial data formatting problem. You want to turn a truecolour image with only 20 or so colours into a colour-indexed image.
So simply step through the image, look up the colour in your growing dictionary, and assign and integer 0-20 to each pixel.
Now you can turn the images into binary images simply by saying one colour is set and the rest are clear, and use standard algorithms for fitting rectangles.

Merging Overlapping Rectangle in OpenCV

I'm using OpenCV 3.0. I've made a car detection program and I keep running into the problem of overlapping bounding boxes:
Is there a way to merge overlapping bounding boxes as described on the images below?
I've used rectangle(frame, Point(x1, y1), Point(x2, y2), Scalar(255,255,255)); to draw those bounding boxes. I've searched for answer from similiar threads but I can't find them helpful. I'd like to form a single outer bounding rectangle after merging those bounding boxes.
Problem
Seems as if you are displaying each contour you are getting. You don't have to do that. Follow the algorithm and code given below.
Algorithm
In this case what you can do is iterate through each contour that you detect and select the biggest boundingRect. You don't have to display each contour you detect.
Here is a code that you can use.
Code
for( int i = 0; i< contours.size(); i++ ) // iterate through each contour.
{
double a=contourArea( contours[i],false); // Find the area of contour
if(a>largest_area){
largest_area=a;
largest_contour_index=i; //Store the index of largest contour
bounding_rect=boundingRect(contours[i]); // Find the bounding rectangle for biggest contour
}
}
Regards
As I've mentioned in a similar post here, this is a problem best solved by Non Maximum Suppression.
Although your code is in C++, have a look at this pyimagesearch article (python) to get an idea on how this works.
I've translated this code from python to C++,.
struct detection_box
{
cv::Rect box; /*!< Bounding box */
double svm_val; /*!< SVM response at that detection*/
cv::Size res_of_detection; /*!< Image resolution at which the detection occurred */
};
/*!
\brief Applies the Non Maximum Suppression algorithm on the detections to find the detections that do not overlap
The svm response is used to sort the detections. Translated from http://www.pyimagesearch.com/2014/11/17/non-maximum-suppression-object-detection-python/
\param boxes list of detections that are the input for the NMS algorithm
\param overlap_threshold the area threshold for the overlap between detections boxes. boxes that have overlapping area above threshold are discarded
\returns list of final detections that are no longer overlapping
*/
std::vector<detection_box> nonMaximumSuppression(std::vector<detection_box> boxes, float overlap_threshold)
{
std::vector<detection_box> res;
std::vector<float> areas;
//if there are no boxes, return empty
if (boxes.size() == 0)
return res;
for (int i = 0; i < boxes.size(); i++)
areas.push_back(boxes[i].box.area());
std::vector<int> idxs = argsort(boxes);
std::vector<int> pick; //indices of final detection boxes
while (idxs.size() > 0) //while indices still left to analyze
{
int last = idxs.size() - 1; //last element in the list. that is, detection with highest SVM response
int i = idxs[last];
pick.push_back(i); //add highest SVM response to the list of final detections
std::vector<int> suppress;
suppress.push_back(last);
for (int pos = 0; pos < last; pos++) //for every other element in the list
{
int j = idxs[pos];
//find overlapping area between boxes
int xx1 = max(boxes[i].box.x, boxes[j].box.x); //get max top-left corners
int yy1 = max(boxes[i].box.y, boxes[j].box.y); //get max top-left corners
int xx2 = min(boxes[i].box.br().x, boxes[j].box.br().x); //get min bottom-right corners
int yy2 = min(boxes[i].box.br().y, boxes[j].box.br().y); //get min bottom-right corners
int w = max(0, xx2 - xx1 + 1); //width
int h = max(0, yy2 - yy1 + 1); //height
float overlap = float(w * h) / areas[j];
if (overlap > overlap_threshold) //if the boxes overlap too much, add it to the discard pile
suppress.push_back(pos);
}
for (int p = 0; p < suppress.size(); p++) //for graceful deletion
{
idxs[suppress[p]] = -1;
}
for (int p = 0; p < idxs.size();)
{
if (idxs[p] == -1)
idxs.erase(idxs.begin() + p);
else
p++;
}
}
for (int i = 0; i < pick.size(); i++) //extract final detections frm input array
res.push_back(boxes[pick[i]]);
return res;
}

How to find the pixel value that corresponds to a specific number of pixels?

Assume that I have a grayscale image in OpenCV.
I want to find a value so that 5% of pixels in the images have a value greater than it.
I can iterate over pixels and find number of pixels with the same value and then from the result find the value that %5 of pixel are above my value, but I am looking for a faster way to do this. Is there any such technique in OpenCV?
I think histogram would help, but I am not sure how I can use it.
You need to:
Compute the cumulative histogram of your pixel values
Find the bin whose value is greater than 95% (100 - 5) of the total number of pixels.
Given an image uniformly random generated, you get an histogram like:
and the cumulative histogram like (you need to find the first bin whose value is over the blue line):
Then you need to find the proper bin. You can use std::lower_bound function to find the correct value, and std::distance to find the corresponding bin number (aka the value you want to find). (Please note that with lower_bound you'll find the element whose value is greater or equal to the given value. You can use upper_bound to find the element whose value is strictly greater then the given value)
In this case it results to be 242, which make sense for an uniform distribution from 0 to 255, since 255*0.95 = 242.25.
Check the full code:
#include <opencv2\opencv.hpp>
#include <vector>
#include <algorithm>
using namespace std;
using namespace cv;
void drawHist(const vector<int>& data, Mat3b& dst, int binSize = 3, int height = 0, int ref_value = -1)
{
int max_value = *max_element(data.begin(), data.end());
int rows = 0;
int cols = 0;
float scale = 1;
if (height == 0) {
rows = max_value + 10;
}
else {
rows = height;
scale = float(height) / (max_value + 10);
}
cols = data.size() * binSize;
dst = Mat3b(rows, cols, Vec3b(0, 0, 0));
for (int i = 0; i < data.size(); ++i)
{
int h = rows - int(scale * data[i]);
rectangle(dst, Point(i*binSize, h), Point((i + 1)*binSize - 1, rows), (i % 2) ? Scalar(0, 100, 255) : Scalar(0, 0, 255), CV_FILLED);
}
if (ref_value >= 0)
{
int h = rows - int(scale * ref_value);
line(dst, Point(0, h), Point(cols, h), Scalar(255,0,0));
}
}
int main()
{
Mat1b src(100, 100);
randu(src, Scalar(0), Scalar(255));
int percent = 5; // percent % of pixel values are above a val
int val; // I need to find this value
int n = src.rows * src.cols; // Total number of pixels
int th = cvRound((100 - percent) / 100.f * n); // Number of pixels below val
// Histogram
vector<int> hist(256, 0);
for (int r = 0; r < src.rows; ++r) {
for (int c = 0; c < src.cols; ++c) {
hist[src(r, c)]++;
}
}
// Cumulative histogram
vector<int> cum = hist;
for (int i = 1; i < hist.size(); ++i) {
cum[i] = cum[i - 1] + hist[i];
}
// lower_bound returns an iterator pointing to the first element
// that is not less than (i.e. greater or equal to) th.
val = distance(cum.begin(), lower_bound(cum.begin(), cum.end(), th));
// Plot histograms
Mat3b plotHist, plotCum;
drawHist(hist, plotHist, 3, 300);
drawHist(cum, plotCum, 3, 300, *lower_bound(cum.begin(), cum.end(), th));
cout << "Value: " << val;
imshow("Hist", plotHist);
imshow("Cum", plotCum);
waitKey();
return 0;
}
Note
The histogram drawing function is an upgrade from a former version I posted here
You can use calcHist to compute the histograms, but I personally find easier to use the aforementioned method for 1D histograms.
1) Determine the height and the width of the image, h and w.
2) Determine what 5% of the total number of pixels is (X)...
X = int(h * w * 0.05)
3) Start at the brightest bin in the histogram. Set total T = 0.
4) Add the number of pixels in this bin to your total T. If T is greater than X, you are finished and the value you want is the lower limit of the range of the current histogram bin.
3) Move to the next darker bin in your histogram. Goto 4.

Combining overlapping groups in an image

I am using opencv_contrib to detect textual regions in an image.
This is the original image
This is the image after textual regions are found:
As can be seen, there are overlapping groups in the image. For example, there seem to be two groups around Hello World and two around Some more sample text
Question
In scenarios like these how can I keep the widest possible box by merging the two boxes. For these examples that would be one starting with H and ending in d so that it covers Hello World. My reason for doing is is that I would like to crop part of this image and send it to tesseract.
Here is the relevant code that draws the boxes.
void groups_draw(Mat &src, vector<Rect> &groups)
{
for (int i=(int)groups.size()-1; i>=0; i--)
{
if (src.type() == CV_8UC3)
rectangle(src,groups.at(i).tl(),groups.at(i).br(),Scalar( 0, 255, 255 ), 2, 8 );
}
}
Here is what I've tried. My ideas are in comments.
void groups_draw(Mat &src, vector<Rect> &groups)
{
int previous_tl_x = 0;
int previous_tl_y = 0;
int prevoius_br_x = 0;
int previous_br_y = 0;
//sort the groups from lowest to largest.
for (int i=(int)groups.size()-1; i>=0; i--)
{
//if previous_tl_x is smaller than current_tl_x then keep the current one.
//if previous_br_x is smaller than current_br_x then keep the current one.
if (src.type() == CV_8UC3) {
//crop the image
Mat cropedImage = src(Rect(Point(groups.at(i).tl().x, groups.at(i).tl().y),Point(groups.at(i).br().x, groups.at(i).br().y)));
imshow("cropped",cropedImage);
waitKey(-1);
}
}
}
Update
I'm trying to use [groupRectangles][4] to accomplish this:
void groups_draw(Mat &src, vector<Rect> &groups)
{
vector<Rect> rects;
for (int i=(int)groups.size()-1; i>=0; i--)
{
rects.push_back(groups.at(i));
}
groupRectangles(rects, 1, 0.2);
}
However, this is giving me an error:
textdetection.cpp:106:5: error: use of undeclared identifier 'groupRectangles'
groupRectangles(rects, 1, 0.2);
^
1 error generated.
First, the reason you get overlapping bounding boxes is that the text detector module is working on inverted channels (e.g: gray and inverted gray) and because of that the inner regions of some characters such as o's and g's are wrongly detected and grouped as characters. So if you want to detect only one mode of text (white text on dark background) just pass the inverted channels.
Replace:
for (int c = 0; c < cn-1; c++)
channels.push_back(255-channels[c]);
With:
for (int c = 0; c < cn-1; c++)
channels[c] = (255-channels[c]);
Now for your question, rectangles have defined intersection and combining operators:
rect = rect1 & rect2 (rectangle intersection)
rect = rect1 | rect2 (minimum area rectangle containing rect2 and rect3 )
rect &= rect1, rect |= rect1 (and the corresponding augmenting operations)
You can use those operators while iterating over rectangles to detect intersected rectangles and combine them, as follows:
if ((rect1 & rect2).area() != 0)
rect1 |= rect2;
Edit:
First, sort rectangle groups by area from largest to smallest:
std::sort(groups.begin(), groups.end(),
[](const cv::Rect &rect1, const cv::Rect &rect2) -> bool {return rect1.area() > rect2.area();});
Then, iterate over the rectangles, when two rectangles intersect add the smaller to the larger and then delete it:
for (int i = 0; i < groups.size(); i++)
{
for (int j = i + 1; j < groups.size(); j++)
{
if ((groups[i] & groups[j]).area() != 0)
{
groups[i] |= groups[j];
groups.erase(groups.begin() + j--);
}
}
}
One approach would be to compare every rectangle with every other rectangle to see if they overlap or intersect. If they do in a sufficient amount you can combine them into one larger rectangle.

Finding HSV Thresholds Via Histograms with OpenCV

I'm trying to write a method that will find the proper threshold values in HSV space for an object placed at the center of the screen. These values are used for an object tracking algorithm. I've tested that piece of code with hand coded threshold values and it works well. The idea behind the method is that it should calculate the histograms for each of the channels and then return the 5th and 95th percentile for each to be used as the threshold values. (credit: How to find RGB/HSV color parameters for color tracking?) The image being passed is a picture of the object to be tracked (which is set by the user before the whole process begins. Here is the code
std::vector<cv::Scalar> HSV_Threshold_Determiner::Get_Threshold_Values(const cv::Mat& image)
{
cv::Mat inputImage;
cv::cvtColor(image, inputImage, CV_BGR2HSV);
std::vector<cv::Mat> bgrPlanes;
cv::split(inputImage, bgrPlanes);
cv::Mat hHist, sHist, vHist;
int hMax = 180, svMax = 256;
float hRanges[] = { 0, (float)hMax };
const float* hRange = { hRanges };
float svRanges[] = { 0, (float)svMax };
const float* svRange = { svRanges };
//float sRanges[] = { 0, 256 };
cv::calcHist(&bgrPlanes[0], 1, 0, cv::Mat(), hHist, 1, &hMax, &hRange);
cv::calcHist(&bgrPlanes[1], 1, 0, cv::Mat(), sHist, 1, &svMax, &svRange);
cv::calcHist(&bgrPlanes[2], 1, 0, cv::Mat(), vHist, 1, &svMax, &svRange);
int totalEntries = image.cols * image.rows;
int fiveCutoff = (int)(totalEntries * .05);
int ninetyFiveCutoff = (int)(totalEntries * .95);
float hTotal = 0, sTotal = 0, vTotal = 0;
bool hMinFound = false, hMaxFound = false, sMinFound = false, sMaxFound = false,
vMinFound = false, vMaxFound = false;
cv::Scalar hThresholds;
cv::Scalar sThresholds;
cv::Scalar vThresholds;
for(int i = 0; i < vHist.rows; ++i)
{
if(i < hHist.rows)
{
hTotal += hHist.at<float>(i, 0);
if(hTotal >= fiveCutoff && !hMinFound)
{
hThresholds.val[0] = i;
hMinFound = true;
}
else if(hTotal>= ninetyFiveCutoff && !hMaxFound)
{
hThresholds.val[1] = i;
hMaxFound = true;
}
}
sTotal += sHist.at<float>(i, 0);
vTotal += vHist.at<float>(i, 0);
if(sTotal >= fiveCutoff && !sMinFound)
{
sThresholds.val[0] = i;
sMinFound = true;
}
else if(sTotal >= ninetyFiveCutoff && !sMaxFound)
{
sThresholds.val[1] = i;
sMaxFound = true;
}
if(vTotal >= fiveCutoff && !vMinFound)
{
vThresholds.val[0] = i;
vMinFound = true;
}
else if(vTotal >= ninetyFiveCutoff && !vMaxFound)
{
vThresholds.val[1] = i;
vMaxFound = true;
}
if(vMaxFound && sMaxFound && hMaxFound)
{
break;
}
}
std::vector<cv::Scalar> returnVect;
returnVect.push_back(hThresholds);
returnVect.push_back(sThresholds);
returnVect.push_back(vThresholds);
return returnVect;
}
What I am trying to do is sum up the number of entries in each bucket until I get to a number that is greater than or equal to five percent and ninety-five percent of the total. Unfortunately the numbers I get are never close to the ones I get if I do the thresholding by hand.
Mat img = ... // from camera or some other source
// STEP 1: learning phase
Mat hsv, imgThreshed, processed, denoised;
cv::GaussianBlur(img, denoised, cv::Size(5,5), 2, 2); // remove noise
cv::cvtColor(denoised, hsv, CV_BGR2HSV);
// lets say we picked manually a region of 100x100 px with the interested color/object using mouse
cv::Mat roi = hsv (cv::Range(mousex-50, mousey+50), cv::Range(mousex-50, mousey+50));
// must split all channels to get Hue only
std::vector<cv::Mat> hsvPlanes;
cv::split(roi, hsvPlanes);
// compute statistics for Hue value
cv::Scalar mean, stddev;
cv::meanStdDev(hsvPlanes[0], mean, stddev);
// ensure we get 95% of all valid Hue samples (statistics 3*sigma rule)
float minHue = mean[0] - stddev[0]*3;
float maxHue = mean[0] + stddev[0]*3;
// STEP 2: detection phase
cv::inRange(hsvPlanes[0], cv::Scalar(minHue), cv::Scalar(maxHue), imgThreshed);
imshow("thresholded", imgThreshed);
cv_erode(imgThreshed, processed, 5); // minimizes noise
cv_dilate(processed, processed, 20); // maximize left regions
imshow("final", processed);
//STEP 3: do some blob/contour detection on processed image & find maximum blob/region, etc ...
A much simpler solution - just calculate mean & std. deviation for a region of interest, i.e. containing the Hue value.
Since Hue is the most stable component in the image, the other components saturation & value should be discarded as they vary too much. However you can still compute mean for them if needed.