How best to approach a localised thresholding opengl function - opengl

I would like to take a photo of some text and make the text easier to read. The tricky part is that the initial photo may have dark regions as well as light regions and I want the opengl function to enhance the text in all these regions.
Here is an example. On top is the original image. On bottom is the processed images.
[edited]
I have added in a better example picture of what is happening. I am able to enhance the text, but in areas where I have no text, this simple thresholding is creating speckled noise (image bottom left).
If I wind back the threshold, then I lose the text in the darker region (bottom right).
At the moment, the processed image only picks up some of the text, not all the text. The original algorithm I used was pretty simple:
- sample 8 pixels around the current pixel (pixels about 4-5 distant away seem to work best)
- figure out the lightest and darkest pixels from this sample
- if the current pixel is closer to the darkest threshold, then make black, and vice versa
This seemed to work very well for around text, but when it came to non-text, then it provided a very noisy image (even when I provided an initial rejection threshold)
I modified this algorithm to assume that text was always close to black. This provided the bottom image above, but once again I am not able to pull out all the text features I want.

Before implementing this as a program, you might want to take source photo and play with it in a GIMP or another editor to see what you can do.
One way to deal with shadows is to run high pass filter before thresolding.
This is how you do it in image editor (manually, without "highpass" filter plugin):
1. Convert image to grayscale and save it to "layer_A"
2. make a copy of "layer_A" into "Layer_B"
3. Invert colors in "Layer_B"
4. Gaussian blur "Layer_B" with radius that is larger than largest feature you want to preserve. (blur radius larger than letter)
5. Merge "Layer_A" with "Layer_B" where result = "Layer_A" * 0.5 + "Layer_B" * 0.5.
6. Increase contrast in resulting image.
7. Run thresold.
In opengl it'll be done in same fashion (and without multiple layers)
It won't work well with strong/crisp shadows (obviously), but it will exterminate huge smooth shadows that occurs due to page being bent, etc.
The technique (high pass filter) is frequently used for making seamless textures, and you should be able to find several such tutorials and additional info with google (GIMP seamless texture high pass or GIMP high pass).
By the way, if you want to improve "readability", then you might want to keep it grayscale (while improving contrast) instead of converting it to "black and white" (1 bit color). Sharp letter edges make text harder to read.

thanks for your help.
In the end I went for quite a basic approach.
Taking a sample of 8 nearby pixels, determining the max and min. Determined the local threshold (max - min). Then
smooth = dot(vec3(1.0/3.0), smoothstep(currentMin, currentMax, p11).rgb);
smooth = (localthreshold < threshold) ? 1.0 : smooth;
return vec4(smooth, smooth, smooth, 1);
This does not show me the text nicely in both the dark and light region, which is the ideal, but it nicely cleans up the text in the lighter region.
Mike

Related

Python: Reduce rectangles on images to their border

I have many grayscale input images which contain several rectangles. Some of them overlap and some go over the border of the image. An example image could look like this:
Now i have to reduce the rectangles to their border. My idea was to make all non-white pixels which are less than N (e.g. 3) pixels away from the border or a white pixel (using the Manhatten distance) white. The output should look like this (sorry for the different-sized borders):
It is not very hard to implement this. Unfortunately the implementation must be fast, because the input may contain extremly many images (e.g. 100'000) and the user has to wait until this step is finished.
I thought about using fromimage and do then everything with numpy, but i did not find a good solution.
Maybe someone has an idea or a hint how this problem may be solved very efficient?
Calculate the distance transform of the image (opencv distanceTrasform http://docs.opencv.org/2.4/modules/imgproc/doc/miscellaneous_transformations.html)
In the resulted image zero all the pixels that have value bigger than 3

Infrared images segmentation using OpenCV

Let's say I have a series of infrared pictures and the task is to isolate human body from other objects in the picture. The problem is a noise from other relatively hot objects like lamps and their 'hot' shades.
Simple thresholding methods like binary and/or Otsu didn't give good results on difficult (noisy) pictures, so I've decided to do it manually.
Here are some samples
The results are not terrible, but I think they can be improved. Here I simple select pixels by hue value of HSV. More or less, hot pixels are located in this area: hue < 50, hue > 300. My main concern here is these pink pixels which sometimes are noise from lamps but sometimes are parts of human body, so I can't simply discard them without causing significant damage to the results: e.g. on the left picture this will 'destroy' half of the left hand and so on.
As the last resort I could use some strong filtering and erosion but I still believe there's a way somehow to told to OpenCV: hey, I don't need these pink areas unless they are part of a large hot cluster.
Any ideas, keywords, techniques, good articles? Thank in advance
FIR data is presumably monotonically proportional (if not linear) to temperature, and this should yield a grayscale image.
Your examples are colorized with a color map - the color only conveys a single channel of actual information. It would be best if you could work directly on the grayscale image (maybe remap the images to grayscale).
Then, see if you can linearize the images to an actual temperature scale such that the pixel value represents the temperature. Once you do this you can should be able to clamp your image to the temperature range that you expect a person to appear in. Check the datasheets of your camera/imager for the conversion formula.

How can I detect TV Screen from an Image with OpenCV or Another Library?

I've working on this some time now, and can't find a decent solution for this.
I use OpenCV for image processing and my workflow is something like this:
Took a picture of a tv.
Split image in to R, G, B planes - I'm starting to test using H, S, V too and seems a bit promising.
For each plane, threshold image for a range values in 0 to 255
Reduce noise, detect edges with canny, find the contours and approximate it.
Select contours that contains the center of the image (I can assume that the center of the image is inside the tv screen)
Use convexHull and HougLines to filter and refine invalid contours.
Select contours with certain area (area between 10%-90% of the image).
Keep only contours that have only 4 points.
But this is too slow (loop on each channel (RGB), then loop for the threshold, etc...) and is not good enought as it not detects many tv's.
My base code is the squares.cpp example of the OpenCV framework.
The main problems of TV Screen detection, are:
Images that are half dark and half bright or have many dark/bright items on screen.
Elements on the screen that have the same color of the tv frame.
Blurry tv edges (in some cases).
I also have searched many SO questions/answers on Rectangle detection, but all are about detecting a white page on a dark background or a fixed color object on a contrast background.
My final goal is to implement this on Android/iOS for near-real time tv screen detection. My code takes up to 4 seconds on a Galaxy Nexus.
Hope anyone could help . Thanks in advance!
Update 1: Just using canny and houghlines, does not work, because there can be many many lines, and selecting the correct ones can be very difficult. I think that some sort of "cleaning" on the image should be done first.
Update 2: This question is one of the most close to the problem, but for the tv screen, it didn't work.
Hopefully these points provide some insight:
1)
If you can properly segment the image via foreground and background, then you can easily set a bounding box around the foreground. Graph cuts are very powerful methods of segmenting images. It appears that OpenCV provides easy to use implementations for it. So, for example, you provide some brush strokes which cover "foreground" and "background" pixels, and your image is converted into a digraph which is sliced optimally to split the two. Here is a fun example:
http://docs.opencv.org/trunk/doc/py_tutorials/py_imgproc/py_grabcut/py_grabcut.html
This is a quick something I put together to illustrate its effectiveness:
2)
If you decide to continue down the edge detection route, then consider using Mathematical Morphology to "clean up" the lines you detect before trying to fit a bounding box or contour around the object.
http://en.wikipedia.org/wiki/Mathematical_morphology
3)
You could train across a dataset containing TVs and use the viola jones algorithm for object detection. Traditionally it is used for face detection but you can adapt it for TVs given enough data. For example you could script downloading images of living rooms with TVs as your positive class and living rooms without TVs as your negative class.
http://en.wikipedia.org/wiki/Viola%E2%80%93Jones_object_detection_framework
http://docs.opencv.org/trunk/doc/py_tutorials/py_objdetect/py_face_detection/py_face_detection.html
4)
You could perform image registration using cross correlation, like this nice MATLAB example demonstrates:
http://www.mathworks.com/help/images/examples/registering-an-image-using-normalized-cross-correlation.html
As for your template TV image which would be slid across the search image, you could obtain a bunch of pictures of TVs and create "Eigenscreens" similar to how Eigenfaces are used for facial recognition and generate an average TV image:
http://jeremykun.com/2011/07/27/eigenfaces/
5)
It appears OpenCV has plenty of fun tools for describing shape and structure features, which appears to be mainly what you're interested in. Worth a look if you haven't seen this already:
http://docs.opencv.org/modules/imgproc/doc/structural_analysis_and_shape_descriptors.html
Best of luck.

Image recognition of well defined but changing angle image

PROBLEM
I have a picture that is taken from a swinging vehicle. For simplicity I have converted it into a black and white image. An example is shown below:
The image shows the high intensity returns and has a pattern in it that is found it all of the valid images is circled in red. This image can be taken from multiple angles depending on the rotation of the vehicle. Another example is here:
The intention here is to attempt to identify the picture cells in which this pattern exists.
CURRENT APPROACHES
I have tried a couple of methods so far, I am using Matlab to test but will eventually be implementing in c++. It is desirable for the algorithm to be time efficient, however, I am interested in any suggestions.
SURF (Speeded Up Robust Features) Feature Recognition
I tried the default matlab implementation of SURF to attempt to find features. Matlab SURF is able to identify features in 2 examples (not the same as above) however, it is not able to identify common ones:
I know that the points are different but the pattern is still somewhat identifiable. I have tried on multiple sets of pictures and there are almost never common points. From reading about SURF it seems like it is not robust to skewed images anyway.
Perhaps some recommendations on pre-processing here?
Template Matching
So template matching was tried but is definitely not ideal for the application because it is not robust to scale or skew change. I am open to pre-processing ideas to fix the skew. This could be quite easy, some discussion on extra information on the picture is provided further down.
For now lets investigate template matching: Say we have the following two images as the template and the current image:
The template is chosen from one of the most forward facing images. And using it on a very similar image we can match the position:
But then (and somewhat obviously) if we change the picture to a different angle it won't work. Of course we expect this because the template no-longer looks like the pattern in the image:
So we obviously need some pre-processing work here as well.
Hough Lines and RANSAC
Hough lines and RANSAC might be able to identify the lines for us but then how do we get the pattern position?
Other that I don't know about yet
I am pretty new to the image processing scene so i would love to hear about any other techniques that would suit this simple yet difficult image rec problem.
The sensor and how it will help pre-processing
The sensor is a 3d laser, it has been turned into an image for this experiment but still retains its distance information. If we plot with distance scaled from 0 - 255 we get the following image:
Where lighter is further away. This could definitely help us to align the image, some thoughts on the best way?. So far I have thought of things like calculating the normal of the cells that are not 0, we could also do some sort of gradient descent or least squares fitting such that the difference in the distance is 0, that could align the image so that it is always straight. The problem with that is that the solid white stripe is further away? Maybe we could segment that out? We are sort of building algorithms on our algorithms then so we need to be careful so this doesn't become a monster.
Any help or ideas would be great, I am happy to look into any serious answer!
I came up with the following program to segment the regions and hopefully locate the pattern of interest using template matching. I've added some comments and figure titles to explain the flow and some resulting images. Hope it helps.
im = imread('sample.png');
gr = rgb2gray(im);
bw = im2bw(gr, graythresh(gr));
bwsm = imresize(bw, .5);
dism = bwdist(bwsm);
dismnorm = dism/max(dism(:));
figure, imshow(dismnorm, []), title('distance transformed')
eq = histeq(dismnorm);
eqcl = imclose(eq, ones(5));
figure, imshow(eqcl, []), title('histogram equalized and closed')
eqclbw = eqcl < .2; % .2 worked for samples given
eqclbwcl = imclose(eqclbw, ones(5));
figure, imshow(eqclbwcl, []), title('binarized and closed')
filled = imfill(eqclbwcl, 'holes');
figure, imshow(filled, []), title('holes filled')
% -------------------------------------------------
% template
tmpl = zeros(16);
tmpl(3:4, 2:6) = 1;tmpl(11:15, 13:14) = 1;
tmpl(3:10, 7:14) = 1;
st = regionprops(tmpl, 'orientation');
tmplAngle = st.Orientation;
% -------------------------------------------------
lbl = bwlabel(filled);
stats = regionprops(lbl, 'BoundingBox', 'Area', 'Orientation');
figure, imshow(label2rgb(lbl), []), title('labeled')
% here I just take the largest contour for convenience. should consider aspect ratio and any
% other features that can be used to uniquely identify the shape
[mx, id] = max([stats.Area]);
mxbb = stats(id).BoundingBox;
% resize and rotate the template
tmplre = imresize(tmpl, [mxbb(4) mxbb(3)]);
tmplrerot = imrotate(tmplre, stats(id).Orientation-tmplAngle);
xcr = xcorr2(double(filled), double(tmplrerot));
figure, imshow(xcr, []), title('template matching')
Resized image:
Segmented:
Template matching:
Given the poor image quality (low resolution + binarization), I would prefer template matching because it is based on a simple global measure of similarity and does not attempt to do any feature extraction (there are no reliable features in your samples).
But you will need to apply template matching with rotation. One way is to precompute rotated instances of the template, perform matchings for every angle and keep the best.
It is possible to integrate depth information in the comparison (if that helps).
This is quite similar to the problem of recognising hand-sketched characters that we tackle in our lab, in the sense that the target pattern is binary, low resolution, and liable to moderate deformation.
Based on our experiences I don't think SURF is the right way to go as pointed out elsewhere this assumes a continuous 2D image not binary and will break in your case. Template matching is not good for this kind of binary image either - your pixels need to be only slightly misaligned to return a low match score, as there is no local spatial coherence in the pixel values to mitigate minor misalignments of the window.
Our approach is this scenario is to try to "convert" the binary image into a continuous or "greyscale" image. For example see below:
These conversions are made by running a 1st derivative edge detector e.g. convolve 3x3 template [0 0 0 ; 1 0 -1 ; 0 0 0] and it's transpose over image I to get dI/dx and dI/dy.
At any pixel we can get the edge orientation atan2(dI/dy,dI/dx) from these two fields. We treat this information as known at the sketched pixels (the white pixels in your problem) and unknown at the black pixels. We then use a Laplacian smoothness assumption to extrapolate values for the black pixels from the white ones. Details are in this paper:
http://personal.ee.surrey.ac.uk/Personal/J.Collomosse/pubs/Hu-CVIU-2013.pdf
If this is a major hassle you could try using a distance transform instead, convenient in Matlab using bwdist, but it won't give as accurate results.
Now we have the "continuous" image (as per right hand column of images above). The greyscale patterns encode the local structure in the image, and are much more amenable to gradient based descriptors like SURF and template matching.
My hunch would be to try template match first, but since this is affine sensitive I would go the whole way and use a HOG/Bag of Visual words approach again just as in our above paper, to match those patterns.
We have found this pipeline to give state of the art results in sketch based shape recognition, and my PhD student has successfully used in subsequent work for matching hieroglyphs, so I think it could have a good shot at working the kind of pattern you pose in your example images.
I do not think SURF is the right approach to use here. SURF is designed to work on regular 2D intensity images, but what you have here is a 3D point cloud. There is an algorithm for point cloud registration called Iterative Closed Point (ICP). There are several implementations on MATLAB File Exchange, such as this one.
Edit
The Computer Vision System Toolbox now (as of the R2015b release) includes point cloud processing functionality. See this example for point cloud registration and stitching.
I would:
segment image
by Z coordinates (distance from camera/LASER) where Z coordinate jumps more then threshold there is border between object and background (if neighboring Z value is big or out of range) or another object (if neighboring Z value is different) or itself (if neighboring Z value is different but can be connected to itself). This will give you set of objects
align to viewer
compute boundary points of each object (most outer edges), compute direction via atan2 rotate back to face camera perpendicular.
Your image looks like flag marker so in that case rotation around Y axis should suffice. Also you can scale size of the object to predefined distance (if the target is always the same size)
You will need to know the FOV of your camera system and have calibrated Z axis for this.
now try to identify object
here use what you have by now and also can add filter like skip objects with not matching size or aspect ratio ... you can use DFT/DCT or compare histograms of normalized/equalized image etc. ...
[PS]
for features is not a good idea to use BW-Bit image because you loose too much info. Use gray-scale or color instead (gray-scale is usually enough). I usually add few simplified histograms of small area (with few different radius-es) around point of interest which is invariant on rotation.
Have a look a log-polar template matching, it is rotation and scale invariant:
http://etd.lsu.edu/docs/available/etd-07072005-113808/unrestricted/Thunuguntla_thesis.pdf

Which filter can best remove horizontal vertical banding noises (hvbn) from image

I would like to know if among the OpenCV filter functions, is there a particular one that is suited to remove horizontal/vertical banding noise in an image? Like, for salt and pepper noise, Gaussian blur works best. Or, is the process of removing banding noise a combination of different filters?
Did try several filters already like bilateral, gaussian, median and blur(homogenous), but the vertical banding lines don't seem to disappear or significantly reduce, or sometimes the resulting image is just so smoothed. I'd prefer the image to be retained, only that the noise is removed.
I am interested to do the removal of noise in OpenCV, c++ in particular. Any basic steps, concepts or tutorial would be great. Thanks
The sample images are here which shows obvious marks of vertical bands of noise. These images are outputs of a scanner by the way.
https://drive.google.com/file/d/0B1aXcXzD_OADMkNuNnJxNGw2NTA/edit?usp=sharing
https://drive.google.com/file/d/0B1aXcXzD_OADNkFBbGgxU20yMjg/edit?usp=sharing
I would do it like this:
1.find your lines
for example search for repetitive bumps/edges on first derivation by x or y;
then select only those which are in average period place
and neighbouring lines/rows has also this bump near the same place
2.remove pixels of selected lines
just take color before and after bump
and interpolate between them (avg color for example)
also in some case you can substract constant color from all pixels of the line
depends on the color mixing mechanism which created those lines
You can reduce vertical or horizontal noise by adjusting the kernel size using simple blur operation
Fir example for your vertical noised image you could use
blur(src,dst,Size(11,1));
Note that here the kernel size in vertical direction is higher value compared to horizontal.
See the image before and after filtering
Similarly for horizontal just use reverse kernel size like
blur(src,dst,Size(1,11));
See the result