Speed up image comparison - OpenGL

I'm looking for an algorithm that can compare images in real time, basically on images acquired from a webcam (around 30 frames/second). My current implementation is pretty slow; I tried to improve it by dropping a few frames and reducing the resolution, but with no success.
So I'm exploring options such as better algorithms like key-point matching. On a different note, I'm also looking for sample implementations of GPU-based image comparison (using either the DirectX or OpenGL APIs).

Have you tried Perceptual Image Diff?

I didn't read the entire thread but it may help you somehow
Image comparison - fast algorithm

FFT based image registration (optionally using OpenCV) in cpp?

I'm trying to align two images taken from a handheld camera.
At first, I tried OpenCV's warpPerspective with a homography estimated from SIFT/SURF feature points. The problem is that feature extraction and matching can be extremely slow when the image resolution is high (3000x4000). I tried scaling the image down before finding feature points, but the result was not as good as before. (The Mat returned by findHomography shouldn't be affected by scaling down the image, right?) And sometimes, due to a lack of good feature point matches, the result is quite strange.
After searching on this topic, it seems that solving the problem in the Fourier domain will speed up the registration process. And I've found this question which leads me to the code here.
The only problem is that the code is written in Python with NumPy (not even using OpenCV), which makes it quite hard to rewrite in C++ with OpenCV. (In OpenCV I can only find dft; there is no fftshift or fft function. I'm not very familiar with NumPy, and I'm not brave enough to simply ignore the missing methods.) So I'm wondering why there is no such Fourier-domain image registration implementation in C++.
Can you give me some suggestions on how to implement one, or a link to an existing C++ version? Or help me turn the Python code into C++?
Big thanks!
I'm fairly certain that the FFT method can only recover a similarity transform, that is, only a (2D) rotation, translation and scale. Your results might not be that great with a handheld camera, because handheld shots usually differ in perspective as well.
This is not quite a direct answer to your question, but, as a suggestion for a speed improvement, have you tried using a faster feature detector and descriptor? In OpenCV, SIFT and SURF are some of the slowest methods available for feature extraction and matching. You could try some of the other methods first; they all work quite well and are faster than SIFT/SURF, especially if you use the FLANN-based matcher.
I've had to do this in the past with similar sized imagery, and using the binary descriptors OpenCV provides increased the speed significantly.
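To illustrate why binary descriptors tend to be cheap to match, here is a minimal pure-Python sketch of brute-force Hamming matching with a cross-check. It is not OpenCV code: the descriptors are toy 16-bit integers (real binary descriptors such as ORB's are 256 bits), and the names hamming, match_descriptors and max_distance are made up for this example.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two descriptors stored as ints."""
    return bin(a ^ b).count("1")


def match_descriptors(query, train, max_distance=64):
    """Brute-force nearest-neighbour matching with a cross-check.

    Returns a list of (query_index, train_index, distance) tuples.
    """
    def nearest(d, pool):
        return min(range(len(pool)), key=lambda j: hamming(d, pool[j]))

    matches = []
    for i, d in enumerate(query):
        j = nearest(d, train)
        # Cross-check: keep the pair only if each is the other's best match.
        if nearest(train[j], query) == i:
            dist = hamming(d, train[j])
            if dist <= max_distance:
                matches.append((i, j, dist))
    return matches


# Toy 16-bit descriptors (hypothetical values, for illustration only).
query = [0b1111000011110000, 0b0000111100001111, 0b1010101010101010]
train = [0b0000111100001101, 0b1111000011110001, 0b0101010101010101]
matches = match_descriptors(query, train, max_distance=4)
print(matches)
```

The distance is just an XOR plus a bit count, which is far cheaper than the floating-point distances needed for SIFT/SURF descriptors.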
If you only need the shift (translation), you can use OpenCV's phaseCorrelate.
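For readers who want to see what phase correlation does, here is a rough pure-Python sketch of the idea. It is not OpenCV's implementation: it uses a naive DFT that is only practical for tiny images, recovers integer circular shifts only, and the function names (dft, dft2, phase_correlate) are made up for this example.

```python
import cmath
import random


def dft(seq, inverse=False):
    """Naive O(n^2) 1-D DFT; fine for tiny inputs, use an FFT in practice."""
    n = len(seq)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t, x in enumerate(seq))
        out.append(s / n if inverse else s)
    return out


def dft2(img, inverse=False):
    """2-D DFT computed as row transforms followed by column transforms."""
    rows = [dft(row, inverse) for row in img]
    cols = [dft(col, inverse) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def phase_correlate(a, b):
    """Return (dy, dx) such that b is a, circularly shifted by dy rows and dx columns."""
    fa, fb = dft2(a), dft2(b)
    cross = []
    for row_a, row_b in zip(fa, fb):
        row = []
        for va, vb in zip(row_a, row_b):
            c = vb * va.conjugate()
            mag = abs(c)
            # Keep only the phase; bins with no energy contribute nothing.
            row.append(c / mag if mag > 1e-12 else 0j)
        cross.append(row)
    corr = dft2(cross, inverse=True)
    h, w = len(a), len(a[0])
    _, py, px = max((corr[y][x].real, y, x) for y in range(h) for x in range(w))
    # Peaks past the midpoint correspond to negative shifts.
    dy = py - h if py > h // 2 else py
    dx = px - w if px > w // 2 else px
    return dy, dx


rng = random.Random(0)
a = [[rng.random() for _ in range(8)] for _ in range(8)]
# b is a moved down by 2 rows and left by 3 columns (with wrap-around).
b = [[a[(y - 2) % 8][(x + 3) % 8] for x in range(8)] for y in range(8)]
shift = phase_correlate(a, b)
print(shift)
```

The normalised cross-power spectrum of two shifted images is a pure phase ramp, so its inverse transform is a single peak located at the shift.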

Best algorithm for feature detection in urban environment - OpenCV

I'm using the OpenCV library (C++) to extract feature points from 2 images coming from a video stream taken from an aerial camera, in order to find the matching points in successive images afterwards. I'm wondering which algorithm is best for finding robust features in an urban environment?
P.S. I'm currently using SURF, but when the images change a little (because the camera is translating very slowly), there are very few matches between the descriptors!
If you want to try different approaches, give RoboRealm a try. They have a trial version: you just plug in the algorithms and see the results. For testing purposes it is fine, even if you end up using OpenCV.

Fastest deskew algorithm?

I am a little overwhelmed by my task at hand. We have a toolkit which we use for TWAIN scanning. Some of our customers are complaining about slower scan speeds when the deskew option is set. This is because if their scanner does not support a hardware deskew, it is done in post-processing on the CPU. I was wondering if anyone knows of a good (i.e. fast) algorithm to achieve this. It is hard for me to say what algorithm we are using now. What algorithms are out there for this, and how do they rank as far as speed/accuracy? If I knew the names of the algorithms, it could be easier for me to do a google search on them.
Thank You.
-Tom
Are you scanning in color or B/W?
Deskew is processor intensive. A Group4 tiff or JPEG must be decompressed, skew angle determined, deskewed and then compressed.
There are many image processing algorithms out there with deskew and I have evaluated many over the years. There are some huge differences in processing speed between the different libraries and a lot of it comes down to how well it is coded rather than the algorithm used. There is a huge difference in commercial libraries just reading and writing images.
The fastest commercial deskew I have used by far comes from Unisoft Imaging (www.unisoftimaging.com). I assume much of it is written in assembler. Unisoft has been around for many years, and its library is very fast and efficient. It supports many different deskew options, including black border removal and color and B/W deskew. The Group4 routines are very solid and very fast. The library comes with many other image processing options as well as TWAIN and native SCSI scanner support. It also supports Unix.
If you want a free deskew then you might want to have a look at Leptonica. It does not come with too much documentation but is very stable and well written. http://www.leptonica.com/
Developing code from scratch could be quite time consuming and may be quite buggy and prone to errors.
The other option is to process the document in a separate process so that scanning can run at the speed of the scanner. At the moment you are probably processing everything serially, one task after another, hence the slowdown.
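The decoupling suggested above might look roughly like the following producer/consumer sketch. It is only an illustration: deskew is a hypothetical stand-in for the real post-processing step, pages are plain strings, and threads are used to keep the example short. In CPython, CPU-bound pure-Python work does not speed up with threads, so a real deskew would run in a separate process or in native code.

```python
import queue
import threading


def deskew(page):
    """Hypothetical stand-in for the expensive post-processing step."""
    return page.upper()


def worker(pending, results):
    """Consume scanned pages until a None sentinel arrives."""
    while True:
        item = pending.get()
        if item is None:
            break
        index, page = item
        results[index] = deskew(page)


def scan_and_process(pages, num_workers=2):
    pending = queue.Queue()
    results = {}
    threads = [threading.Thread(target=worker, args=(pending, results))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    # The "scanner" loop only enqueues pages, so it never waits for deskew.
    for index, page in enumerate(pages):
        pending.put((index, page))
    for _ in threads:
        pending.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return [results[i] for i in range(len(pages))]


print(scan_and_process(["page one", "page two", "page three"]))
```

The scanning loop finishes as fast as pages arrive, and the queue absorbs the difference between scan speed and processing speed.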
Consider doing it as post-processing, because deskew cannot be done in real time (unless it's hardware accelerated).
Deskew consists of two steps: skew detection and rotation. Detecting the skew angle can usually be done faster on a B&W (1-bit) image. Rotation speed depends on the quality of the interpolation. A good quality deskew will take a lot of time to run, much more than scanning pages.
A good high speed scanner can do 120 double-sided pages per minute, if it has hardware JPEG or TIFF Group 4 compression, and your TWAIN library takes advantage of it (hint: do not use native mode). You barely have enough time to save the file to the hard drive at that speed, let alone decompress, skew detect, rotate, re-compress. Quality deskew takes several seconds per page, unless you can use the video card's hardware accelerator to rotate and compress.
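As an illustration of the skew detection step on a 1-bit image, here is a minimal sketch of the projection-profile approach, a swapped-in textbook technique and not necessarily what any of the libraries mentioned here use. The function names, the angle range and the scoring (sum of squared row counts) are assumptions for this example. The angle is measured in image coordinates, so text lines follow y = y0 + x * tan(angle).

```python
import math


def skew_score(pixels, angle_deg):
    """Sharpness of the projection profile for one candidate angle."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    hist = {}
    for x, y in pixels:
        # Coordinate perpendicular to lines running at the candidate angle.
        r = round(y * c - x * s)
        hist[r] = hist.get(r, 0) + 1
    # Text lines aligned with the candidate angle pile up in few bins.
    return sum(n * n for n in hist.values())


def detect_skew(image, max_angle=5.0, step=1.0):
    """Return the candidate angle (degrees) with the sharpest profile."""
    pixels = [(x, y) for y, row in enumerate(image)
              for x, v in enumerate(row) if v]
    steps = int(round(max_angle / step))
    candidates = [i * step for i in range(-steps, steps + 1)]
    return max(candidates, key=lambda ang: skew_score(pixels, ang))


# Synthetic 1-bit page: four "text lines" skewed by 3 degrees.
width, height, true_angle = 200, 100, 3.0
image = [[0] * width for _ in range(height)]
slope = math.tan(math.radians(true_angle))
for y0 in (20, 40, 60, 80):
    for x in range(width):
        image[round(y0 + x * slope)][x] = 1

estimated = detect_skew(image, max_angle=5.0, step=1.0)
print(estimated)
```

Only the black pixels are visited, which is one reason detection on a 1-bit image is cheap; the costly part is the rotation that follows.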
Do I understand correctly that you already have such an algorithm implemented? If so, are you sure there is no room for optimization? I'd start by profiling the existing solution.
Anyway, I guess you should look for a fast digital Radon transform algorithm.
Take a look at http://pagetools.sourceforge.net. They have an implementation of a deskew algorithm.

Assessing the quality of an image with respect to compression?

I have images that I am using for a computer vision task. The task is sensitive to image quality. I'd like to remove all images that are below a certain threshold, but I am unsure if there is any method/heuristic to automatically detect images that are heavily compressed via JPEG. Anyone have an idea?
Image Quality Assessment is a rapidly developing research field. As you don't mention being able to access the original (uncompressed) images, you are interested in no-reference image quality assessment. This is actually a pretty hard problem, but here are some points to get you started:
Since you mention JPEG, there are two major degradation features that manifest themselves in JPEG-compressed images: blocking and blurring
No-reference image quality assessment metrics typically look for those two features
Blocking is fairly easy to pick up, as it appears only on macroblock boundaries. Macroblocks are a fixed size -- 8x8 or 16x16 depending on what the image was encoded with
Blurring is a bit more difficult. It occurs because higher frequencies in the image have been attenuated (removed). You can break up the image into blocks, DCT (Discrete Cosine Transform) each block and look at the high-frequency components of the DCT result. If the high-frequency components are lacking for a majority of blocks, then you are probably looking at a blurry image
Another approach to blur detection is to measure the average width of edges of the image. Perform Sobel edge detection on the image and then measure the distance between local minima/maxima on each side of the edge. Google for "A no-reference perceptual blur metric" by Marziliano -- it's a famous approach. "No Reference Block Based Blur Detection" by Debing is a more recent paper
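The block-DCT blur check described in the points above could be sketched as follows. This is a minimal pure-Python illustration, not a published metric: the cutoff (u + v >= 4), the 5% threshold and the function names are assumptions that would need tuning, and the image is assumed to be a greyscale list of rows at least 8x8 in size.

```python
import math

N = 8  # block size used by JPEG's DCT

_COS = [[math.cos((2 * n + 1) * k * math.pi / (2 * N)) for n in range(N)]
        for k in range(N)]
_ALPHA = [math.sqrt(1 / N)] + [math.sqrt(2 / N)] * (N - 1)


def dct2_block(block):
    """Orthonormal 2-D DCT-II of an N x N block (list of rows)."""
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += block[y][x] * _COS[u][y] * _COS[v][x]
            out[u][v] = _ALPHA[u] * _ALPHA[v] * s
    return out


def high_freq_ratio(block, cutoff=4):
    """Share of non-DC energy in coefficients with u + v >= cutoff."""
    coeffs = dct2_block(block)
    total = high = 0.0
    for u in range(N):
        for v in range(N):
            if u == 0 and v == 0:
                continue  # ignore the DC term (average brightness)
            energy = coeffs[u][v] ** 2
            total += energy
            if u + v >= cutoff:
                high += energy
    # Flat blocks have no detail at all; report them as ratio 0.
    return high / total if total > 1e-9 else 0.0


def blurry_fraction(image, threshold=0.05, cutoff=4):
    """Fraction of N x N blocks whose high-frequency share is below threshold."""
    rows, cols = len(image) // N, len(image[0]) // N
    blurry = 0
    for by in range(rows):
        for bx in range(cols):
            block = [row[bx * N:(bx + 1) * N]
                     for row in image[by * N:(by + 1) * N]]
            if high_freq_ratio(block, cutoff) < threshold:
                blurry += 1
    return blurry / (rows * cols)


# Two test blocks side by side: a smooth ramp and a sharp checkerboard.
ramp = [[10 * x for x in range(N)] for _ in range(N)]
checker = [[255 * ((x + y) % 2) for x in range(N)] for y in range(N)]
image = [r + c for r, c in zip(ramp, checker)]
print(blurry_fraction(image))
```

Note that flat regions such as sky are counted as blurry here, which is one way image content can skew such a metric.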
Regardless of what metric you use, think about how you will deal with false positives/negatives. As opposed to simple thresholding, I'd use the metric result to sort the images and then snip the end of the list that looks like it contains only blurry images.
Your task will be a lot simpler if your image set contains fairly similar content (e.g. faces only). This is because the image quality assessment metrics can often be influenced by image content, unfortunately.
Google Scholar is truly your friend here. I wish I could give you a concrete solution, but I don't have one yet -- if I did, I'd be a very successful Masters student.
UPDATE:
Just thought of another idea: for each image, re-compress the image with JPEG and examine the change in file size before and after re-compression. If the file size after re-compression is significantly smaller than before, then it's likely the image is not heavily compressed, because it had some significant detail that was removed by re-compression. Otherwise (very little difference or file size after re-compression is greater) it is likely that the image was heavily compressed.
The quality setting used during re-compression lets you define what exactly "heavily compressed" means.
If you're on Linux, this shouldn't be too hard to implement using bash and ImageMagick's convert utility.
You can try other variations of this approach:
Instead of JPEG compression, try another form of degradation, such as Gaussian blurring
Instead of merely comparing file-sizes, try a full reference metric such as SSIM -- there's an OpenCV implementation freely available. Other implementations (e.g. Matlab, C#) also exist, so look around.
Let me know how you go.
I had taken many photos of an ancient book (so they share a similar layout, two pages per image), but some were badly blurred, to the point that the text could not be read. I searched for a ready-made batch script to find the most blurred ones but didn't find anything useful. So I took part of another script I found on the net (based on ImageMagick, but no longer working; I couldn't track down the author to credit them), which assesses the blur level of a single image, tweaked it, and automated it over a whole folder. I uploaded it here:
https://gist.github.com/888239
hoping it will be useful to someone else. It works on a Linux system and uses ImageMagick (plus some commonly installed command-line tools such as gawk, sort, grep, etc.).
One simple heuristic could be to look at width * height * color depth < sigma * file size. You would have to determine a good value for sigma, of course. sigma would be dependent on the expected entropy of the images you are looking at.
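A tiny sketch of that check, under two assumptions that the answer does not spell out: colour depth is expressed in bytes per pixel, and images that fail the inequality are the ones treated as heavily compressed. The function name and the sigma value of 20 are placeholders.

```python
def passes_size_heuristic(width, height, file_size, bytes_per_pixel=3, sigma=20.0):
    """True when width * height * colour depth < sigma * file size.

    Under the assumption stated above, images that fail this test are
    candidates for "heavily compressed".
    """
    raw_size = width * height * bytes_per_pixel
    return raw_size < sigma * file_size


# A 1000x1000 RGB image is 3,000,000 bytes uncompressed.
print(passes_size_heuristic(1000, 1000, file_size=300_000))  # compression ratio 10
print(passes_size_heuristic(1000, 1000, file_size=50_000))   # compression ratio 60
```

In effect sigma is the largest compression ratio you are willing to accept.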

High Quality Image Magnification on GPU

I'm looking for interesting algorithms for image magnification that can be implemented on a GPU for real-time scaling of video. Linear and bicubic interpolation algorithms are not good enough.
Suggestions?
Here are some papers I've found; I'm unsure about their suitability for GPU implementation.
Adaptive Interpolation
Level Set
I've seen some demos of the Cell processor used in TVs for scaling which had some impressive results; no link, unfortunately.
Lanczos3 is a very nice interpolation algorithm (you can test it in GIMP or VirtualDub). It generally performs better than cubic interpolation and can be parallelized.
A GPU based version is implemented in Chromium:
http://code.google.com/p/chromium/issues/detail?id=47447
Check out the Chromium source code.
It may still be too slow for real-time video processing, but it may be worth trying if your resolution is not too high.
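For reference, the Lanczos3 kernel mentioned above can be sketched in a few lines. This is a 1-D CPU illustration for magnification only, with border clamping and weight normalisation chosen for simplicity; it is not the Chromium code, and the function names are made up. A 2-D resize applies the same filter along rows and then along columns.

```python
import math


def lanczos(x, a=3):
    """Lanczos kernel: sinc(x) * sinc(x / a) for |x| < a, else 0."""
    if x == 0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)


def resample_1d(samples, scale, a=3):
    """Magnify a 1-D signal by `scale` (>= 1) with a Lanczos-a filter."""
    n = len(samples)
    out = []
    for i in range(int(round(n * scale))):
        # Centre of output sample i, expressed in input coordinates.
        src = (i + 0.5) / scale - 0.5
        left = math.floor(src) - a + 1
        acc = wsum = 0.0
        for j in range(left, left + 2 * a):
            w = lanczos(src - j, a)
            k = min(max(j, 0), n - 1)  # clamp indices at the borders
            acc += w * samples[k]
            wsum += w
        out.append(acc / wsum)  # normalise so flat areas stay flat
    return out


signal = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]
print(resample_1d(signal, 2))
```

Every output sample depends only on six neighbouring input samples, which is why the filter maps well onto one GPU thread or fragment per output pixel.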
You may also want to try out CUVI Lib, which offers a good set of GPU-accelerated image processing algorithms. Find out more at: http://www.cuvilib.com
Disclosure: I am part of the team that developed CUVI.
It is still slightly a 'work in progress', but GpuCV is a drop-in replacement for the OpenCV image processing functions, implemented in OpenCL on a GPU.
Prefiltered cubic b-spline interpolation delivers good results (you can have a look here for some theoretical background).
CUDA source code can be downloaded here.
WebGL examples can be found here.
edit: The cubic interpolation code is now available on github: CUDA version and WebGL version.
You may want to have a look at Super Resolution Algorithms. Starting Point on CiteseerX