image with two circles
I have an image that includes two fibers (appearing as two circles in the image). How can I calculate the distance between the two fibers?
I find it hard to detect the position of the fibers. I have tried the HoughCircles function, but its parameters are hard to optimize and it fails to locate the circles precisely most of the time. Should I subtract the background first, or are there other methods? Many thanks!
Unfortunately, you haven't shown your preprocessing steps. In my approach, I'll do the following:
Convert the input image to grayscale (see cvtColor).
Median blurring, which maintains the "edges" (see medianBlur).
Adaptive thresholding (see adaptiveThreshold).
Morphological opening to get rid of small noise (see morphologyEx).
Find circles using HoughCircles.
Not done here: possible refinements of the found circles. Exclude circles that are too small or too large, and use all the prior information you have! For example, how large can the circles be at all? (A sketch of one such filter is included at the TODO in the code below.)
Here's my whole code:
// Read image.
cv::Mat img = cv::imread("images/i7aJJ.jpg", cv::IMREAD_COLOR);
// Convert to grayscale for processing.
cv::Mat blk;
cv::cvtColor(img, blk, cv::COLOR_BGR2GRAY);
// Median blurring to improve following thresholding.
cv::medianBlur(blk, blk, 11);
// Adaptive thresholding.
cv::adaptiveThreshold(blk, blk, 255, cv::ADAPTIVE_THRESH_GAUSSIAN_C, cv::THRESH_BINARY, 51, -2);
// Morphological opening to get rid of small noise.
cv::morphologyEx(blk, blk, cv::MORPH_OPEN, cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(3, 3)));
// Find circles using Hough transform.
std::vector<cv::Vec4f> circles;
cv::HoughCircles(blk, circles, cv::HOUGH_GRADIENT, 1.0, 300, 50, 25, 100); // dp, minDist, param1 (Canny), param2 (accumulator), minRadius
// TODO: Refinement of found circles, if there are more than two.
// For example, calculate areas: Neglect too small or too large areas.
// Compare all areas, and keep the two with nearly matching areas and
// suitable areas.
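// Sketch of one such refinement (a heuristic assumption, not part of the
// original pipeline): if more than two circles were found, keep the pair
// with the most similar radii. Requires <cmath> and <limits>.
if (circles.size() > 2) {
    size_t bestI = 0, bestJ = 1;
    float bestDiff = std::numeric_limits<float>::max();
    for (size_t i = 0; i < circles.size(); ++i)
        for (size_t j = i + 1; j < circles.size(); ++j)
            if (std::fabs(circles[i][2] - circles[j][2]) < bestDiff) {
                bestDiff = std::fabs(circles[i][2] - circles[j][2]);
                bestI = i;
                bestJ = j;
            }
    circles = { circles[bestI], circles[bestJ] };
}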
// Draw circles in input image.
for (cv::Vec4f& c : circles) {
    cv::circle(img, cv::Point(c[0], c[1]), c[2], cv::Scalar(0, 0, 255), 4);
    cv::circle(img, cv::Point(c[0], c[1]), 5, cv::Scalar(0, 255, 0), cv::FILLED);
}
// --- Assuming there are only the two right circles left from here. --- //
// Draw some debug output in input image.
const cv::Point c1 = cv::Point(circles[0][0], circles[0][1]);
const cv::Point c2 = cv::Point(circles[1][0], circles[1][1]);
cv::line(img, c1, c2, cv::Scalar(255, 0, 0), 2);
// Calculate distance, and put in input image.
double dist = cv::norm(c1 - c2);
cv::putText(img, std::to_string(dist), cv::Point((c1.x + c2.x) / 2 + 20, (c1.y + c2.y) / 2 + 20), cv::FONT_HERSHEY_COMPLEX, 1.0, cv::Scalar(255, 0, 0));
The final output looks like this:
The intermediate image right before the HoughCircles operation looks like this:
In general, I'm not that skeptical about HoughCircles. You "just" have to pay attention to your preprocessing.
Hope that helps!
It's possible using Hough circle detection, but you should provide more images if you want a more stable detection. I just denoise and go straight to circle detection. Non-local means denoising is good at preserving edges, which in turn helps the Canny edge detector used inside the Hough circle algorithm.
My code is written in Python but can easily be translated into C++.
import cv2
from matplotlib import pyplot as plt
IM_PATH = 'your image path'
DS = 2 # downsample the image
orig = cv2.imread(IM_PATH, cv2.IMREAD_GRAYSCALE)
orig = cv2.resize(orig, (orig.shape[1] // DS, orig.shape[0] // DS))
img = cv2.fastNlMeansDenoising(orig, h=3, templateWindowSize=20 // DS + 1, searchWindowSize=40 // DS + 1)
plt.imshow(orig, cmap='gray')
circles = cv2.HoughCircles(img, cv2.HOUGH_GRADIENT, dp=1, minDist=200 // DS, param1=40 // DS, param2=40 // DS, minRadius=210 // DS, maxRadius=270 // DS)
if circles is not None:
for x, y, r in circles[0]:
c = plt.Circle((x, y), r, fill=False, lw=1, ec='C1')
plt.gca().add_patch(c)
plt.gcf().set_size_inches((12, 8))
plt.show()
Important
Doing a bit of image processing is only the first step in a good (and stable!) object detection. You have to leverage every detail and property that you can get your hands on and apply some statistics to improve your results. For example:
Use Yves' approach as an addition and filter all detected circles that do not intersect the joints.
Is one circle always underneath the other? Filter out horizontally aligned pairs.
Can you reduce the ROI (are the circles always in a specific area in your image or can they be everywhere)?
Are both circles always the same size? Filter out pairs with different sizes.
...
If you can use multiple metrics, you can apply a statistical model (e.g. majority voting or k-NN) to find the best pair of circles.
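For illustration, a minimal sketch (the function name, the metrics, and their equal weighting are my own assumptions) that scores every candidate pair by radius similarity and vertical alignment and keeps the best one:
import itertools

def best_circle_pair(circles):
    # circles: iterable of (x, y, r) candidates, at least two of them
    def score(a, b):
        radius_diff = abs(a[2] - b[2])   # similar sizes score better
        horiz_offset = abs(a[0] - b[0])  # assume the pair is stacked vertically
        return radius_diff + horiz_offset
    return min(itertools.combinations(circles, 2), key=lambda p: score(*p))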
Again: always think of what you know about your object, the environment and its behavior and take advantage of that knowledge.
I am trying to find the bounding boxes of text in an image and am currently using this approach:
// calculate the local variances of the grayscale image
Mat t_mean, t_mean_2;
Mat grayF;
outImg_gray.convertTo(grayF, CV_32F);
int winSize = 35;
blur(grayF, t_mean, cv::Size(winSize,winSize));
blur(grayF.mul(grayF), t_mean_2, cv::Size(winSize,winSize));
Mat varMat = t_mean_2 - t_mean.mul(t_mean);
varMat.convertTo(varMat, CV_8U);
// threshold the high variance regions
Mat varMatRegions = varMat > 100;
When given an image like this:
Then when I show varMatRegions I get this image:
As you can see, it somewhat combines the left block of text with the header of the card. For most cards this method works great, but on busier cards it can cause problems.
The reason it is bad for those contours to connect is that it makes the bounding box of the contour nearly take up the entire card.
Can anyone suggest a different way I can find the text to ensure proper detection of text?
200 points to whoever can find the text in the card above these two.
I used a gradient-based method in the program below, and added the resulting images. Please note that I'm using a scaled-down version of the image for processing.
C++ version
The MIT License (MIT)
Copyright (c) 2014 Dhanushka Dangampola
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
#include "stdafx.h"
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <iostream>
using namespace cv;
using namespace std;
#define INPUT_FILE "1.jpg"
#define OUTPUT_FOLDER_PATH string("")
int _tmain(int argc, _TCHAR* argv[])
{
Mat large = imread(INPUT_FILE);
Mat rgb;
// downsample and use it for processing
pyrDown(large, rgb);
Mat small;
cvtColor(rgb, small, CV_BGR2GRAY);
// morphological gradient
Mat grad;
Mat morphKernel = getStructuringElement(MORPH_ELLIPSE, Size(3, 3));
morphologyEx(small, grad, MORPH_GRADIENT, morphKernel);
// binarize
Mat bw;
threshold(grad, bw, 0.0, 255.0, THRESH_BINARY | THRESH_OTSU);
// connect horizontally oriented regions
Mat connected;
morphKernel = getStructuringElement(MORPH_RECT, Size(9, 1));
morphologyEx(bw, connected, MORPH_CLOSE, morphKernel);
// find contours
Mat mask = Mat::zeros(bw.size(), CV_8UC1);
vector<vector<Point>> contours;
vector<Vec4i> hierarchy;
findContours(connected, contours, hierarchy, CV_RETR_CCOMP, CV_CHAIN_APPROX_SIMPLE, Point(0, 0));
// filter contours
for(int idx = 0; idx >= 0; idx = hierarchy[idx][0])
{
Rect rect = boundingRect(contours[idx]);
Mat maskROI(mask, rect);
maskROI = Scalar(0, 0, 0);
// fill the contour
drawContours(mask, contours, idx, Scalar(255, 255, 255), CV_FILLED);
// ratio of non-zero pixels in the filled region
double r = (double)countNonZero(maskROI)/(rect.width*rect.height);
if (r > .45 /* assume at least 45% of the area is filled if it contains text */
&&
(rect.height > 8 && rect.width > 8) /* constraints on region size */
/* these two conditions alone are not very robust. better to use something
like the number of significant peaks in a horizontal projection as a third condition */
)
{
rectangle(rgb, rect, Scalar(0, 255, 0), 2);
}
}
imwrite(OUTPUT_FOLDER_PATH + string("rgb.jpg"), rgb);
return 0;
}
Python version
The MIT License (MIT)
Copyright (c) 2017 Dhanushka Dangampola
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.
import cv2
import numpy as np
large = cv2.imread('1.jpg')
rgb = cv2.pyrDown(large)
small = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
grad = cv2.morphologyEx(small, cv2.MORPH_GRADIENT, kernel)
_, bw = cv2.threshold(grad, 0.0, 255.0, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 1))
connected = cv2.morphologyEx(bw, cv2.MORPH_CLOSE, kernel)
# using RETR_EXTERNAL instead of RETR_CCOMP
contours, hierarchy = cv2.findContours(connected.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
#For opencv 3+ comment the previous line and uncomment the following line
#_, contours, hierarchy = cv2.findContours(connected.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
mask = np.zeros(bw.shape, dtype=np.uint8)
for idx in range(len(contours)):
x, y, w, h = cv2.boundingRect(contours[idx])
mask[y:y+h, x:x+w] = 0
cv2.drawContours(mask, contours, idx, (255, 255, 255), -1)
r = float(cv2.countNonZero(mask[y:y+h, x:x+w])) / (w * h)
if r > 0.45 and w > 8 and h > 8:
cv2.rectangle(rgb, (x, y), (x+w-1, y+h-1), (0, 255, 0), 2)
cv2.imshow('rects', rgb)
cv2.waitKey(0)
You can detect text by finding close edge elements (inspired by a license plate detection approach):
#include "opencv2/opencv.hpp"
std::vector<cv::Rect> detectLetters(cv::Mat img)
{
std::vector<cv::Rect> boundRect;
cv::Mat img_gray, img_sobel, img_threshold, element;
cvtColor(img, img_gray, CV_BGR2GRAY);
cv::Sobel(img_gray, img_sobel, CV_8U, 1, 0, 3, 1, 0, cv::BORDER_DEFAULT);
cv::threshold(img_sobel, img_threshold, 0, 255, CV_THRESH_OTSU+CV_THRESH_BINARY);
element = getStructuringElement(cv::MORPH_RECT, cv::Size(17, 3) );
cv::morphologyEx(img_threshold, img_threshold, CV_MOP_CLOSE, element); //Does the trick
std::vector< std::vector< cv::Point> > contours;
cv::findContours(img_threshold, contours, 0, 1); // 0 = RETR_EXTERNAL, 1 = CHAIN_APPROX_NONE
std::vector<std::vector<cv::Point> > contours_poly( contours.size() );
for( int i = 0; i < contours.size(); i++ )
if (contours[i].size()>100)
{
cv::approxPolyDP( cv::Mat(contours[i]), contours_poly[i], 3, true );
cv::Rect appRect( boundingRect( cv::Mat(contours_poly[i]) ));
if (appRect.width>appRect.height)
boundRect.push_back(appRect);
}
return boundRect;
}
Usage:
int main(int argc,char** argv)
{
//Read
cv::Mat img1=cv::imread("side_1.jpg");
cv::Mat img2=cv::imread("side_2.jpg");
//Detect
std::vector<cv::Rect> letterBBoxes1=detectLetters(img1);
std::vector<cv::Rect> letterBBoxes2=detectLetters(img2);
//Display
for(int i=0; i< letterBBoxes1.size(); i++)
cv::rectangle(img1,letterBBoxes1[i],cv::Scalar(0,255,0),3,8,0);
cv::imwrite( "imgOut1.jpg", img1);
for(int i=0; i< letterBBoxes2.size(); i++)
cv::rectangle(img2,letterBBoxes2[i],cv::Scalar(0,255,0),3,8,0);
cv::imwrite( "imgOut2.jpg", img2);
return 0;
}
Results:
a. element = getStructuringElement(cv::MORPH_RECT, cv::Size(17, 3) );
b. element = getStructuringElement(cv::MORPH_RECT, cv::Size(30, 30) );
Results are similar for the other image mentioned.
Here is an alternative approach that I used to detect the text blocks:
Converted the image to grayscale
Applied threshold (simple binary threshold, with a handpicked value of 150 as the threshold value)
Applied dilation to thicken lines in the image, leading to more compact objects and fewer white-space fragments. Used a high number of iterations, so the dilation is very heavy (13 iterations, also handpicked for optimal results).
Identified contours of objects in the resulting image using the OpenCV findContours function.
Drew a bounding box (rectangle) circumscribing each contoured object - each of them frames a block of text.
Optionally discarded areas that are unlikely to be the object you are searching for (e.g. text blocks) given their size, as the algorithm above can also find intersecting or nested objects (like the entire top area for the first card) some of which could be uninteresting for your purposes.
Below is the code, written in Python with pyopencv; it should be easy to port to C++.
import cv2
image = cv2.imread("card.png")
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY) # grayscale
_,thresh = cv2.threshold(gray,150,255,cv2.THRESH_BINARY_INV) # threshold
kernel = cv2.getStructuringElement(cv2.MORPH_CROSS,(3,3))
dilated = cv2.dilate(thresh,kernel,iterations = 13) # dilate
_, contours, hierarchy = cv2.findContours(dilated,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_NONE) # get contours
# for each contour found, draw a rectangle around it on original image
for contour in contours:
# get rectangle bounding contour
[x,y,w,h] = cv2.boundingRect(contour)
# discard areas that are too large
if h>300 and w>300:
continue
# discard areas that are too small
if h<40 or w<40:
continue
# draw rectangle around contour on original image
cv2.rectangle(image,(x,y),(x+w,y+h),(255,0,255),2)
# write original image with added contours to disk
cv2.imwrite("contoured.jpg", image)
The original image is the first image in your post.
After preprocessing (grayscale, threshold and dilate - so after step 3) the image looked like this:
Below is the resulting image ("contoured.jpg" in the last line); the final bounding boxes for the objects in the image look like this:
You can see the text block on the left is detected as a separate block, delimited from its surroundings.
Using the same script with the same parameters (except for the thresholding type, which was changed for the second image as described below), here are the results for the other 2 cards:
Tuning the parameters
The parameters (threshold value, dilation parameters) were optimized for this image and this task (finding text blocks) and can be adjusted, if needed, for other cards images or other types of objects to be found.
For thresholding (step 2), I used an inverse binary threshold (cv2.THRESH_BINARY_INV), which suits dark text on a light background. For images where the text is lighter than the background, such as the second image in your post, use cv2.THRESH_BINARY instead. For the second image I also used a slightly higher threshold value (180). Varying the threshold value and the number of dilation iterations will result in different degrees of sensitivity in delimiting objects in the image.
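For reference, the modified call for the second image would look like this (only this line changes; variable names as in the script above):
_, thresh = cv2.threshold(gray, 180, 255, cv2.THRESH_BINARY)  # for text lighter than the background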
Finding other object types:
For example, decreasing the dilation to 5 iterations in the first image gives us a finer delimitation of objects, roughly finding all words in the image (rather than text blocks):
Knowing the rough size of a word, I discarded areas that were too small (below 20 pixels in width or height) or too large (above 100 pixels in width or height), ignoring objects that are unlikely to be words; this gives the results in the above image.
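A sketch of that adjusted filter (reusing thresh, kernel, and image from the script above; the OpenCV 4 two-value findContours return is an assumption, add a leading underscore for OpenCV 3):
dilated = cv2.dilate(thresh, kernel, iterations=5)  # lighter dilation: words, not blocks
contours, _ = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
for contour in contours:
    [x, y, w, h] = cv2.boundingRect(contour)
    # keep only word-sized regions, per the 20-100 px estimate above
    if 20 <= w <= 100 and 20 <= h <= 100:
        cv2.rectangle(image, (x, y), (x + w, y + h), (255, 0, 255), 2)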
@dhanushka's approach showed the most promise, but I wanted to play around in Python, so I went ahead and translated it for fun:
import cv2
import numpy as np
from cv2 import boundingRect, countNonZero, cvtColor, drawContours, findContours, getStructuringElement, imread, morphologyEx, pyrDown, rectangle, threshold
large = imread(image_path)
# downsample and use it for processing
rgb = pyrDown(large)
# apply grayscale
small = cvtColor(rgb, cv2.COLOR_BGR2GRAY)
# morphological gradient
morph_kernel = getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
grad = morphologyEx(small, cv2.MORPH_GRADIENT, morph_kernel)
# binarize
_, bw = threshold(src=grad, thresh=0, maxval=255, type=cv2.THRESH_BINARY+cv2.THRESH_OTSU)
morph_kernel = getStructuringElement(cv2.MORPH_RECT, (9, 1))
# connect horizontally oriented regions
connected = morphologyEx(bw, cv2.MORPH_CLOSE, morph_kernel)
mask = np.zeros(bw.shape, np.uint8)
# find contours
im2, contours, hierarchy = findContours(connected, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 3 signature
# filter contours
for idx in range(0, len(hierarchy[0])):
    rect = x, y, rect_width, rect_height = boundingRect(contours[idx])
    # zero the bounding region, then fill the contour (as in the C++ original)
    mask[y:y+rect_height, x:x+rect_width] = 0
    mask = drawContours(mask, contours, idx, (255, 255, 255), cv2.FILLED)
    # ratio of non-zero pixels in the filled region
    r = float(countNonZero(mask[y:y+rect_height, x:x+rect_width])) / (rect_width * rect_height)
    if r > 0.45 and rect_height > 8 and rect_width > 8:
        rgb = rectangle(rgb, (x, y+rect_height), (x+rect_width, y), (0, 255, 0), 3)
Now to display the image:
from PIL import Image
Image.fromarray(rgb).show()
Not the most Pythonic of scripts, but I tried to stay as close to the original C++ code as possible for readers to follow.
It works almost as well as the original. I'll be happy to read suggestions on how it could be improved/fixed to match the original results fully.
You can try this method, developed by Chucai Yi and Yingli Tian.
They also share software (based on OpenCV 1.0; it should run under Windows) that you can use, though no source code is available. It will generate all the text bounding boxes (shown in color shadows) in the image. Applying it to your sample images, you will get the following results:
Note: to make the result more robust, you can further merge adjacent boxes together.
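For example, here is one simple merging heuristic (my own sketch, not from the paper): pad each (x, y, w, h) box a little and union any boxes that then overlap, repeating until stable. The pad value is an assumed tolerance.
def merge_adjacent_boxes(boxes, pad=10):
    # boxes: list of (x, y, w, h); pad: max gap in pixels to still merge
    boxes = [list(b) for b in boxes]
    merged = True
    while merged:
        merged = False
        out = []
        while boxes:
            x, y, w, h = boxes.pop()
            for i, (x2, y2, w2, h2) in enumerate(boxes):
                # do the padded boxes overlap?
                if (x - pad < x2 + w2 and x2 - pad < x + w and
                        y - pad < y2 + h2 and y2 - pad < y + h):
                    nx, ny = min(x, x2), min(y, y2)
                    w = max(x + w, x2 + w2) - nx
                    h = max(y + h, y2 + h2) - ny
                    x, y = nx, ny
                    boxes.pop(i)
                    merged = True
                    break
            out.append([x, y, w, h])
        boxes = out
    return boxes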
Update: If your ultimate goal is to recognize the text in the image, you can further check out gttext, a free OCR and ground-truthing tool for color images with text. Source code is also available.
With this, you can get recognized texts like:
Java version of the above code (thanks @William):
public static List<Rect> detectLetters(Mat img){
List<Rect> boundRect=new ArrayList<>();
Mat img_gray =new Mat(), img_sobel=new Mat(), img_threshold=new Mat(), element=new Mat();
Imgproc.cvtColor(img, img_gray, Imgproc.COLOR_RGB2GRAY);
Imgproc.Sobel(img_gray, img_sobel, CvType.CV_8U, 1, 0, 3, 1, 0, Core.BORDER_DEFAULT);
// Mat src, Mat dst, double thresh, double maxval, int type
Imgproc.threshold(img_sobel, img_threshold, 0, 255, 8); // type 8 = THRESH_OTSU
element=Imgproc.getStructuringElement(Imgproc.MORPH_RECT, new Size(15,5));
Imgproc.morphologyEx(img_threshold, img_threshold, Imgproc.MORPH_CLOSE, element);
List<MatOfPoint> contours = new ArrayList<MatOfPoint>();
Mat hierarchy = new Mat();
Imgproc.findContours(img_threshold, contours, hierarchy, 0, 1); // 0 = RETR_EXTERNAL, 1 = CHAIN_APPROX_NONE
List<MatOfPoint> contours_poly = new ArrayList<MatOfPoint>(contours.size());
for( int i = 0; i < contours.size(); i++ ){
MatOfPoint2f mMOP2f1=new MatOfPoint2f();
MatOfPoint2f mMOP2f2=new MatOfPoint2f();
contours.get(i).convertTo(mMOP2f1, CvType.CV_32FC2);
Imgproc.approxPolyDP(mMOP2f1, mMOP2f2, 2, true);
mMOP2f2.convertTo(contours.get(i), CvType.CV_32S);
Rect appRect = Imgproc.boundingRect(contours.get(i));
if (appRect.width>appRect.height) {
boundRect.add(appRect);
}
}
return boundRect;
}
And use this code in practice:
System.loadLibrary(Core.NATIVE_LIBRARY_NAME);
Mat img1=Imgcodecs.imread("abc.png");
List<Rect> letterBBoxes1=Utils.detectLetters(img1);
for(int i=0; i< letterBBoxes1.size(); i++)
Imgproc.rectangle(img1,letterBBoxes1.get(i).br(), letterBBoxes1.get(i).tl(),new Scalar(0,255,0),3,8,0);
Imgcodecs.imwrite("abc1.png", img1);
This is a C# version of the answer from dhanushka using OpenCVSharp
Mat large = new Mat(INPUT_FILE);
Mat rgb = new Mat(), small = new Mat(), grad = new Mat(), bw = new Mat(), connected = new Mat();
// downsample and use it for processing
Cv2.PyrDown(large, rgb);
Cv2.CvtColor(rgb, small, ColorConversionCodes.BGR2GRAY);
// morphological gradient
var morphKernel = Cv2.GetStructuringElement(MorphShapes.Ellipse, new OpenCvSharp.Size(3, 3));
Cv2.MorphologyEx(small, grad, MorphTypes.Gradient, morphKernel);
// binarize
Cv2.Threshold(grad, bw, 0, 255, ThresholdTypes.Binary | ThresholdTypes.Otsu);
// connect horizontally oriented regions
morphKernel = Cv2.GetStructuringElement(MorphShapes.Rect, new OpenCvSharp.Size(9, 1));
Cv2.MorphologyEx(bw, connected, MorphTypes.Close, morphKernel);
// find contours
var mask = new Mat(Mat.Zeros(bw.Size(), MatType.CV_8UC1), Range.All);
Cv2.FindContours(connected, out OpenCvSharp.Point[][] contours, out HierarchyIndex[] hierarchy, RetrievalModes.CComp, ContourApproximationModes.ApproxSimple, new OpenCvSharp.Point(0, 0));
// filter contours
var idx = hierarchy.Length > 0 ? 0 : -1;
while (idx >= 0)
{
OpenCvSharp.Rect rect = Cv2.BoundingRect(contours[idx]);
var maskROI = new Mat(mask, rect);
maskROI.SetTo(new Scalar(0, 0, 0));
// fill the contour
Cv2.DrawContours(mask, contours, idx, Scalar.White, -1);
// ratio of non-zero pixels in the filled region
double r = (double)Cv2.CountNonZero(maskROI) / (rect.Width * rect.Height);
if (r > .45 /* assume at least 45% of the area is filled if it contains text */
&&
(rect.Height > 8 && rect.Width > 8) /* constraints on region size */
/* these two conditions alone are not very robust. better to use something
like the number of significant peaks in a horizontal projection as a third condition */
)
{
Cv2.Rectangle(rgb, rect, new Scalar(0, 255, 0), 2);
}
    idx = hierarchy[idx].Next;
}
rgb.SaveImage(Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "rgb.jpg"));
Python implementation of @dhanushka's solution:
import cv2
import numpy as np

def process_rgb(rgb):
hasText = False
gray = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)
morphKernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3,3))
grad = cv2.morphologyEx(gray, cv2.MORPH_GRADIENT, morphKernel)
# binarize
_, bw = cv2.threshold(grad, 0.0, 255.0, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
# connect horizontally oriented regions
morphKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 1))
connected = cv2.morphologyEx(bw, cv2.MORPH_CLOSE, morphKernel)
# find contours
mask = np.zeros(bw.shape[:2], dtype="uint8")
    _, contours, hierarchy = cv2.findContours(connected, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 3 signature
# filter contours
idx = 0
while idx >= 0:
x,y,w,h = cv2.boundingRect(contours[idx])
# fill the contour
cv2.drawContours(mask, contours, idx, (255, 255, 255), cv2.FILLED)
        # approximate fill ratio: contour area over bounding-box area
        r = cv2.contourArea(contours[idx])/(w*h)
if(r > 0.45 and h > 5 and w > 5 and w > h):
cv2.rectangle(rgb, (x,y), (x+w,y+h), (0, 255, 0), 2)
hasText = True
idx = hierarchy[0][idx][0]
return hasText, rgb
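A hypothetical call (the file names are my assumptions):
has_text, annotated = process_rgb(cv2.imread('card.png'))
cv2.imwrite('annotated.jpg', annotated)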
You can use SWTloc, a Python implementation of the Stroke Width Transform.
Full disclosure: I am the author of this library.
To do that:
First and Second Image
Notice that the text_mode here is 'lb_df', which stands for Light Background Dark Foreground, i.e. the text in this image will be in a darker color than the background.
import cv2
import numpy as np
from swtloc import SWTLocalizer
from swtloc.utils import imgshowN, imgshow
swtl = SWTLocalizer()
# Stroke Width Transform
swtl.swttransform(imgpaths='img1.jpg', text_mode = 'lb_df',
save_results=True, save_rootpath = 'swtres/',
minrsw = 3, maxrsw = 20, max_angledev = np.pi/3)
imgshow(swtl.swtlabelled_pruned13C)
# Grouping
respacket=swtl.get_grouped(lookup_radii_multiplier=0.9, ht_ratio=3.0)
grouped_annot_bubble = respacket[2]
maskviz = respacket[4]
maskcomb = respacket[5]
# Saving the results
_=cv2.imwrite('img1_processed.jpg', swtl.swtlabelled_pruned13C)
imgshowN([maskcomb, grouped_annot_bubble], savepath='grouped_img1.jpg')
Third Image
Notice that the text_mode here is 'db_lf', which stands for Dark Background Light Foreground, i.e. the text in this image will be in a lighter color than the background.
import cv2
import numpy as np
from swtloc import SWTLocalizer
from swtloc.utils import imgshowN, imgshow
swtl = SWTLocalizer()
# Stroke Width Transform
swtl.swttransform(imgpaths=imgpaths[1], text_mode = 'db_lf',
save_results=True, save_rootpath = 'swtres/',
minrsw = 3, maxrsw = 20, max_angledev = np.pi/3)
imgshow(swtl.swtlabelled_pruned13C)
# Grouping
respacket=swtl.get_grouped(lookup_radii_multiplier=0.9, ht_ratio=3.0)
grouped_annot_bubble = respacket[2]
maskviz = respacket[4]
maskcomb = respacket[5]
# Saving the results
_=cv2.imwrite('img1_processed.jpg', swtl.swtlabelled_pruned13C)
imgshowN([maskcomb, grouped_annot_bubble], savepath='grouped_img1.jpg')
You will also notice that the grouping is not fully accurate. Since images vary, tune the grouping parameters in the swtl.get_grouped() function to get the desired results.
This is a VB.NET version of the answer from dhanushka, using EmguCV.
A few functions and structures in EmguCV require different handling than the C# version with OpenCVSharp.
Imports Emgu.CV
Imports Emgu.CV.Structure
Imports Emgu.CV.CvEnum
Imports Emgu.CV.Util
Dim input_file As String = "C:\your_input_image.png"
Dim large As Mat = New Mat(input_file)
Dim rgb As New Mat
Dim small As New Mat
Dim grad As New Mat
Dim bw As New Mat
Dim connected As New Mat
Dim morphanchor As New Point(0, 0)
'//downsample and use it for processing
CvInvoke.PyrDown(large, rgb)
CvInvoke.CvtColor(rgb, small, ColorConversion.Bgr2Gray)
'//morphological gradient
Dim morphKernel As Mat = CvInvoke.GetStructuringElement(ElementShape.Ellipse, New Size(3, 3), morphanchor)
CvInvoke.MorphologyEx(small, grad, MorphOp.Gradient, morphKernel, New Point(0, 0), 1, BorderType.Isolated, New MCvScalar(0))
'// binarize
CvInvoke.Threshold(grad, bw, 0, 255, ThresholdType.Binary Or ThresholdType.Otsu)
'// connect horizontally oriented regions
morphKernel = CvInvoke.GetStructuringElement(ElementShape.Rectangle, New Size(9, 1), morphanchor)
CvInvoke.MorphologyEx(bw, connected, MorphOp.Close, morphKernel, morphanchor, 1, BorderType.Isolated, New MCvScalar(0))
'// find contours
Dim mask As Mat = Mat.Zeros(bw.Size.Height, bw.Size.Width, DepthType.Cv8U, 1) '' MatType.CV_8UC1
Dim contours As New VectorOfVectorOfPoint
Dim hierarchy As New Mat
CvInvoke.FindContours(connected, contours, hierarchy, RetrType.Ccomp, ChainApproxMethod.ChainApproxSimple, Nothing)
'// filter contours
Dim idx As Integer
Dim rect As Rectangle
Dim maskROI As Mat
Dim r As Double
For idx = 0 To contours.Size - 1
rect = CvInvoke.BoundingRectangle(contours(idx))
maskROI = New Mat(mask, rect)
maskROI.SetTo(New MCvScalar(0, 0, 0))
'// fill the contour
CvInvoke.DrawContours(mask, contours, idx, New MCvScalar(255), -1)
'// ratio of non-zero pixels in the filled region
r = CvInvoke.CountNonZero(maskROI) / (rect.Width * rect.Height)
'/* assume at least 45% of the area Is filled if it contains text */
'/* constraints on region size */
'/* these two conditions alone are Not very robust. better to use something
'Like the number of significant peaks in a horizontal projection as a third condition */
If r > 0.45 AndAlso rect.Height > 8 AndAlso rect.Width > 8 Then
'draw green rectangle
CvInvoke.Rectangle(rgb, rect, New MCvScalar(0, 255, 0), 2)
End If
Next
rgb.Save(IO.Path.Combine(Application.StartupPath, "rgb.jpg"))
Is there a way of doing deconvolution with OpenCV?
I'm just impressed by the improvement shown here
and would like to add this feature to my software as well.
EDIT (Additional information for bounty.)
I still have not figured out how to implement deconvolution.
This code helps me sharpen the image, but I think deconvolution could do it better.
void ImageProcessing::sharpen(QImage & img)
{
IplImage* cvimg = createGreyFromQImage( img );
if ( !cvimg ) return;
IplImage* gsimg = cvCloneImage(cvimg );
IplImage* dimg = cvCreateImage( cvGetSize(cvimg), IPL_DEPTH_8U, 1 );
IplImage* outgreen = cvCreateImage( cvGetSize(cvimg), IPL_DEPTH_8U, 3 );
IplImage* zeroChan = cvCreateImage( cvGetSize(cvimg), IPL_DEPTH_8U, 1 );
cvZero(zeroChan);
cv::Mat smat( gsimg, false );
cv::Mat dmat( dimg, false );
cv::GaussianBlur(smat, dmat, cv::Size(0, 0), 3);
cv::addWeighted(smat, 1.5, dmat, -0.5 ,0, dmat);
cvMerge( zeroChan, dimg, zeroChan, NULL, outgreen);
img = IplImage2QImage( outgreen );
cvReleaseImage( &gsimg );
cvReleaseImage( &cvimg );
cvReleaseImage( &dimg );
cvReleaseImage( &outgreen );
cvReleaseImage( &zeroChan );
}
Hoping for helpful hints!
Sure, you can write deconvolution code using OpenCV, but there are no ready-to-use functions (yet).
To get started, you can look at this example, which shows an implementation of Wiener deconvolution in Python using OpenCV.
Here is another example using C, but it is from 2012, so it may be outdated.
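For a flavor of what such an implementation looks like, here is a minimal frequency-domain Wiener deconvolution sketch in Python with numpy (the Gaussian PSF, the constant K, and the file name are my assumptions; the linked examples are more complete):
import numpy as np
import cv2

def wiener_deconvolve(img, psf, K=0.01):
    # Wiener filter in Fourier space: F = conj(H) / (|H|^2 + K) * G
    img = np.float64(img) / 255.0
    # pad the PSF to the image size and shift its center to the origin
    psf_pad = np.zeros_like(img)
    kh, kw = psf.shape
    psf_pad[:kh, :kw] = psf / psf.sum()
    psf_pad = np.roll(psf_pad, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    H = np.fft.fft2(psf_pad)
    G = np.fft.fft2(img)
    F = np.conj(H) / (np.abs(H) ** 2 + K) * G
    return np.abs(np.fft.ifft2(F))

# e.g. deconvolve with an assumed 21x21 Gaussian PSF, sigma 3
psf = cv2.getGaussianKernel(21, 3)
psf = psf @ psf.T
sharp = wiener_deconvolve(cv2.imread('blurred.png', cv2.IMREAD_GRAYSCALE), psf)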
Nearest neighbor deconvolution is a technique which is used typically on a stack of images in the Z plane in optical microscopy. This review paper: Jean-Baptiste Sibarita. Deconvolution Microscopy. Adv Biochem Engin/Biotechnol (2005) 95: 201–243 covers quite a lot of the techniques used, including the one you are interested in. This is also a nice intro: http://blogs.fe.up.pt/BioinformaticsTools/microscopy/
This numpy+scipy Python example shows how it works:
from pylab import *
import numpy
import scipy.ndimage
width = 100
height = 100
depth = 10
imgs = zeros((height, width, depth))
# prepare test input, a stack of images which is zero except for a point which has been blurred by a 3D gaussian
#sigma = 3
#imgs[height/2,width/2,depth/2] = 1
#imgs = scipy.ndimage.filters.gaussian_filter(imgs, sigma)
# read real input from stack of images img_0000.png, img_0001.png, ... (total number = depth)
# these must have the same dimensions equal to width x height above
# if imread reads them as having more than one channel, they need to be converted to one channel
for k in range(depth):
imgs[:,:,k] = scipy.ndimage.imread( "img_%04d.png" % (k) )
# prepare output array, top and bottom image in stack don't get filtered
out_imgs = zeros_like(imgs)
out_imgs[:,:,0] = imgs[:,:,0]
out_imgs[:,:,-1] = imgs[:,:,-1]
# apply nearest neighbor deconvolution
alpha = 0.4 # adjustable parameter, strength of filter
sigma_estimate = 3 # estimate, just happens to be same as the actual
for k in range(1, depth-1):
# subtract blurred neighboring planes in the stack from current plane
# doesn't have to be gaussian, any other kind of blur may be used: this should approximate PSF
out_imgs[:,:,k] = (1+alpha) * imgs[:,:,k] \
- (alpha/2) * scipy.ndimage.filters.gaussian_filter(imgs[:,:,k-1], sigma_estimate) \
- (alpha/2) * scipy.ndimage.filters.gaussian_filter(imgs[:,:,k+1], sigma_estimate)
# show result, original on left, filtered on right
compare_img = copy(out_imgs[:,:,depth//2])
compare_img[:,:width//2] = imgs[:,:width//2,depth//2]
imshow(compare_img)
show()
The sample image you provided is actually a very good example for Lucy-Richardson deconvolution. There is no built-in function in the OpenCV libraries for this deconvolution method. In Matlab, you can use the "deconvlucy.m" function. You can actually see the source code for some of the Matlab functions by typing "open <function_name>" or "edit <function_name>".
Below, I tried to simplify the Matlab code in OpenCV.
// Lucy-Richardson Deconvolution Function
// input-1 img: NxM matrix image (single channel, CV_64F)
// input-2 num_iterations: number of iterations
// input-3 sigmaG: sigma of the Gaussian point spread function (PSF)
// output result: deconvolution result
void lucyRichardson(const Mat& img, int num_iterations, double sigmaG, Mat& result)
{
    const double EPSILON = 2.2204e-16; // Matlab's eps
    // Window size of PSF (must be odd for GaussianBlur)
    int winSize = 2 * int(5 * sigmaG) + 1;
// Initializations
Mat Y = img.clone();
Mat J1 = img.clone();
Mat J2 = img.clone();
Mat wI = img.clone();
Mat imR = img.clone();
Mat reBlurred = img.clone();
Mat T1, T2, tmpMat1, tmpMat2;
T1 = Mat(img.rows,img.cols, CV_64F, 0.0);
T2 = Mat(img.rows,img.cols, CV_64F, 0.0);
// Lucy-Rich. Deconvolution CORE
double lambda = 0;
for(int j = 0; j < num_iterations; j++)
{
if (j>1) {
// calculation of lambda
multiply(T1, T2, tmpMat1);
multiply(T2, T2, tmpMat2);
lambda=sum(tmpMat1)[0] / (sum( tmpMat2)[0]+EPSILON);
// calculation of lambda
}
Y = J1 + lambda * (J1-J2);
Y.setTo(0, Y < 0);
// 1)
GaussianBlur( Y, reBlurred, Size(winSize,winSize), sigmaG, sigmaG );//applying Gaussian filter
reBlurred.setTo(EPSILON , reBlurred <= 0);
// 2)
divide(wI, reBlurred, imR);
imR = imR + EPSILON;
// 3)
GaussianBlur( imR, imR, Size(winSize,winSize), sigmaG, sigmaG );//applying Gaussian filter
// 4)
J2 = J1.clone();
multiply(Y, imR, J1);
T2 = T1.clone();
T1 = J1 - Y;
}
    // output
    result = J1.clone();
}
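// A hypothetical call (file name and parameter values are assumptions):
// Mat img = imread("blurred.png", IMREAD_GRAYSCALE);
// img.convertTo(img, CV_64F, 1.0 / 255.0);
// Mat result;
// lucyRichardson(img, 10, 2.0, result); // 10 iterations, sigmaG = 2.0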
Here are some examples and results.
Example results with Lucy-Richardson deconvolution
Visit my blog here, where you can access the whole code.
I'm not sure you understand what deconvolution is. The idea behind deconvolution is to remove the detector response from the image. This is commonly done in astronomy.
For instance, if you have a CCD mounted to a telescope, then any image you take is a convolution of what you are looking at in the sky and the response of the optical system. The telescope (or camera lens, or whatever) will have some point spread function (PSF). That is, if you look at a point source that is very far away, like a star, when you take an image of it, the star will be blurred over several pixels. This blurring, the point spread, is what you would like to remove. If you know the point spread function of your optical system very well, then you can deconvolve the PSF from your image and obtain a sharper image.
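A toy illustration of that model in Python (all values made up): a synthetic point source imaged through a Gaussian PSF ends up spread over many pixels:
import numpy as np
import cv2

sky = np.zeros((64, 64), np.float32)
sky[32, 32] = 1.0                                      # a point source (a distant star)
observed = cv2.GaussianBlur(sky, (0, 0), sigmaX=2.5)   # convolution with the PSF
print(np.count_nonzero(observed > 1e-3))               # the point now covers many pixels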
Unless you happen to know the PSF of your optics (nontrivial to measure!), you should seek out some other option for sharpening your image. I doubt OpenCV has anything like a Richardson-Lucy algorithm built-in.