C++: find a minimum set of rectangles that cover a binary image

Task
I have a binary image, and I want to extract a relatively small number of rectangles that cover the non-zero area.
I don't really need the smallest set. Of course, finding it at least some of the time wouldn't hurt, but average speed on non-pathological cases is much more important to me.
If it helps, you can think of the binary image as a set of 1x1 rectangles. Indeed, such a set is trivial to find, and this solution basically already works for me; however, finding any smaller set would speed up other things considerably.
Yet another alternative (more or less equivalent to the above, at least from my point of view) is to have a set of arbitrary contours (including holes) as the input.
Related work
This question has been asked and answered many times; for example, the accepted answer here is great.
There is even a JavaScript implementation here.
So the question is not about whether an algorithm exists, or which algorithm is optimal.
The question
Instead, the question is this:
Given the fact that I'm using C++, and OpenCV by the way, is there an implementation that I could easily integrate into my code?
Even better, is there an implementation with a permissive license?
Ideal solution
Ideally, I could just do:
const cv::Mat binary_image = ...; // get input image from somewhere
const std::vector<cv::Rect> rects = dissect(binary_image);
Or:
const std::vector<cv::Rect> initial_rects = ...;
const std::vector<cv::Rect> rects = find_small_dissection(initial_rects);
Of course, a solution that doesn't use OpenCV is also fine; it's no problem at all to convert back and forth. But since OpenCV already has types for images (cv::Mat) and rectangles (cv::Rect), it was convenient to use them in the definitions above.
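In the absence of a ready-made library, a simple greedy heuristic already produces far fewer rectangles than the trivial 1x1 decomposition. Below is a minimal sketch of such a dissect() (an assumption of mine, not the optimal algorithm from the linked answers): it grows a rectangle from each uncovered non-zero pixel, first rightward along the row, then downward while the whole row segment below stays non-zero.

#include <opencv2/opencv.hpp>
#include <vector>

std::vector<cv::Rect> dissect(const cv::Mat& binary_image)
{
    CV_Assert(binary_image.type() == CV_8UC1);
    cv::Mat todo = binary_image.clone(); // pixels still to be covered
    std::vector<cv::Rect> rects;
    for (int y = 0; y < todo.rows; ++y) {
        uchar* row = todo.ptr<uchar>(y);
        for (int x = 0; x < todo.cols; ++x) {
            if (!row[x]) continue;
            int w = 1; // extend to the right as far as possible
            while (x + w < todo.cols && row[x + w]) ++w;
            int h = 1; // extend downward while the full-width row below is non-zero
            while (y + h < todo.rows &&
                   cv::countNonZero(todo(cv::Rect(x, y + h, w, 1))) == w)
                ++h;
            const cv::Rect r(x, y, w, h);
            rects.push_back(r);
            todo(r).setTo(0); // mark the rectangle as covered
            x += w - 1;       // skip the run we just consumed
        }
    }
    return rects;
}

This does not find a minimal set, but on non-pathological inputs it typically returns far fewer rectangles than the 1x1 set, and it only needs OpenCV.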
Motivation for this question
It's not that I can't take the pseudocode, or the JavaScript code, and start porting it to C++ and OpenCV. However, it'll probably take quite a while before my implementation runs bug-free. I'd much rather spend that same time testing, and possibly fixing, an existing open-source implementation. If none can be found, then maybe I'll need to write the code myself; however, I might not then be able to open-source the result (because I'm developing a commercial product for my client).

Related

FastMarching method from MATLAB alternative in OpenCV C++

I am currently porting image processing software from MATLAB code to C++ with OpenCV.
I came upon the imsegfmm function in MATLAB (Link), which uses the internal MEX function fastMarchingMex, which returns a distance map.
There seems to be no equivalent in OpenCV to this day, and implementations using OpenCV are not very common (in fact, I found none on the web).
My question is: do other functions exist that do the same exact job, even with worse computation time? (Computation time is not an issue in my case.)
EDIT: as pointed out in the comments, my question is not very clear.
Let's say that, for me, performance is not important; I am only looking for a function that outputs the exact same results as the fast marching method.
I tested the distanceTransform function, which did not return the same results at all on my test sample.
There seems to be something I don't understand about the fast marching method that makes it differ from other distance calculation methods.
My MATLAB code uses fastMarching, providing it with a binary image of zeros and ones and the indices of the non-zero points (in linear array coordinates). It returns a distance map.
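For comparison, this is roughly what the closest OpenCV built-in looks like; it is only a sketch under my assumptions, and cv::distanceTransform computes the plain Euclidean distance to the nearest zero pixel (no speed map, no eikonal solve), which is presumably exactly why its results differ from fast marching:

#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>
#include <vector>

// Distance from a set of seed points given as MATLAB-style linear indices
// (column-major, 1-based). distanceTransform measures the distance to the
// nearest ZERO pixel, so the seeds are written as zeros into the mask.
cv::Mat distance_from_seeds(cv::Size size, const std::vector<int>& seed_indices)
{
    cv::Mat mask(size, CV_8UC1, cv::Scalar(255));
    for (int idx : seed_indices) {
        const int i = idx - 1;           // 1-based -> 0-based
        const int col = i / size.height; // column-major layout
        const int row = i % size.height;
        mask.at<uchar>(row, col) = 0;    // seed pixel
    }
    cv::Mat dist;
    cv::distanceTransform(mask, dist, cv::DIST_L2, cv::DIST_MASK_PRECISE);
    return dist; // CV_32F map: Euclidean distance to the nearest seed
}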
Please also consider that this is the first time I have posted a question on Stack Overflow; I hope I have not been annoying, and that my question, if it gets an answer, will help other people find an alternative to this function.

Least Median of Squares robust regression C++

I have a set of data z(0), z(1), z(2), ..., z(n) that I am currently fitting with a two-variable polynomial of the form p(x,y) = a(1)*x^2 + a(2)*y^2 + a(3)*x*y + a(4). I have coordinates (x(i), y(i)), i = 1, ..., n, on which I impose p(x(i), y(i)) = z(i). This gives me an overdetermined system that I can solve using Eigen's SVD. I am looking for a more sophisticated method that can take care of outliers, like a Least Median of Squares (LMedS) robust regression (as described here), but I haven't found a C++ implementation for two variables. I looked in GSL, but it seems there is nothing for two-variable functions. The only other solution I can think of is using a TGraph2D in ROOT. Do you know any other solution? Numerical Recipes maybe? Since I am writing C++ code, I would prefer C or C++ implementations.
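For concreteness, here is a minimal sketch of the plain least-squares step described above (assuming Eigen 3.3+; this is the non-robust baseline that the question wants to harden against outliers):

#include <Eigen/Dense>
#include <vector>

// Fits p(x,y) = a1*x^2 + a2*y^2 + a3*x*y + a4 to samples (x_i, y_i, z_i) by
// solving the overdetermined system A*a = z in the least-squares sense.
Eigen::Vector4d fit_quadric(const std::vector<double>& x,
                            const std::vector<double>& y,
                            const std::vector<double>& z)
{
    const int n = static_cast<int>(z.size());
    Eigen::MatrixXd A(n, 4);
    Eigen::VectorXd b(n);
    for (int i = 0; i < n; ++i) {
        A(i, 0) = x[i] * x[i];
        A(i, 1) = y[i] * y[i];
        A(i, 2) = x[i] * y[i];
        A(i, 3) = 1.0;
        b(i) = z[i];
    }
    return A.bdcSvd(Eigen::ComputeThinU | Eigen::ComputeThinV).solve(b);
}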
Since no answer has been given yet, but I am still working on this problem, I will share my progress here.
The ROOT class TLinearFitter has a fit method that allows you to select robust fitting with Least Trimmed Squares (LTS) regression:
https://root.cern.ch/root/html532/TLinearFitter.html
Another possible solution, perhaps more time-consuming but maybe more effective in the long run, is to write my own function to be minimized and then use https://projects.coin-or.org/Ipopt to minimize it. This approach involves a bigger step, though: I don't know how to use the library, and I haven't (yet?) found a nice tutorial for understanding it.
Here, https://wis.kuleuven.be/stat/robust/software, there is a Fortran implementation of the LMedS algorithm called PROGRESS. So another possible solution could be to port this software to C/C++ and make a library out of it (a bare-bones sketch of the underlying idea follows).
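Until such a port exists, the core random-sampling idea behind LMedS (which PROGRESS implements much more carefully) fits in a few lines on top of the least-squares sketch above. This is only an illustration of the criterion, not a substitute for PROGRESS:

#include <algorithm>
#include <limits>
#include <random>

// Repeatedly fit on a minimal random subset (4 points determine the 4
// coefficients) and keep the fit whose median squared residual over ALL
// points is smallest: the Least Median of Squares criterion.
Eigen::Vector4d fit_quadric_lmeds(const std::vector<double>& x,
                                  const std::vector<double>& y,
                                  const std::vector<double>& z,
                                  int trials = 500)
{
    const int n = static_cast<int>(z.size());
    std::mt19937 rng(42);
    std::uniform_int_distribution<int> pick(0, n - 1);
    Eigen::Vector4d best = Eigen::Vector4d::Zero();
    double best_median = std::numeric_limits<double>::max();
    for (int t = 0; t < trials; ++t) {
        std::vector<double> xs, ys, zs;
        for (int k = 0; k < 4; ++k) {
            const int i = pick(rng);
            xs.push_back(x[i]); ys.push_back(y[i]); zs.push_back(z[i]);
        }
        const Eigen::Vector4d a = fit_quadric(xs, ys, zs); // sketch above
        std::vector<double> r2(n);
        for (int i = 0; i < n; ++i) {
            const double p = a[0]*x[i]*x[i] + a[1]*y[i]*y[i] + a[2]*x[i]*y[i] + a[3];
            r2[i] = (p - z[i]) * (p - z[i]);
        }
        std::nth_element(r2.begin(), r2.begin() + n / 2, r2.end());
        if (r2[n / 2] < best_median) { best_median = r2[n / 2]; best = a; }
    }
    return best;
}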

Support Vector Machine works in Matlab, doesn't work in C++

I'm writing an application that uses an SVM to do classification on some images (specifically these). My Matlab implementation works really well. Using a SIFT bag-of-words approach, I'm able to get near 100% accuracy with a linear kernel.
I need to implement this in C++ for speed/portability reasons, and so I've tried using both libsvm and dlib. I've tried multiple SVM types (c_svm, nu_svm, one_class) and multiple kernels (linear, polynomial, rbf). The best I've been able to achieve is around 50% accuracy, even on the same samples that I've trained on. I've confirmed that my feature generators are working, because when I export my C++-generated features to Matlab and train on those, I'm able to get near-perfect results again.
Is there something magical about Matlab's SVM implementation? Are there any common pitfalls or areas that I might look into that would explain the behavior I'm seeing? I know this is a little vague, but part of the problem is that I don't know where to go. Please let me know in the comments if there is other info I can provide that would be helpful.
There is nothing magical about the Matlab version of the libraries, other than that it runs in Matlab, which makes it harder to shoot yourself in the foot.
A checklist:
- Are you normalizing your data, making all values lie between 0 and 1 (or between -1 and 1), either linearly or using the mean and the standard deviation? (A scaling sketch follows this answer.)
- Are you searching for a good value of C (or C and gamma in the case of an RBF kernel), using cross-validation or a hold-out set?
- Are you sure that you're handling NaN and all other floating-point nastiness? Matlab is very good at hiding this from you; C++, not so much.
- Could it be that you're loading your data incorrectly, reading a "%s" into a double or something else that is adding noise to your input data?
- Could it be that libsvm/dlib expects the data in row-major order and you're sending it in column-major order (or the other way around)? Again, Matlab makes this almost impossible; C++, not so much.
- 32/64-bit nastiness: one version of the library, the executable compiled with the other?
Some other things:
- Could it be that in Matlab you're somehow leaking the class (y) into the preprocessing? No one does this on purpose, but I've seen it happen. If you make almost any f(y) a feature, you'll get almost 100% every time.
- Sometimes it helps to verify that everything is numerically identical by printing to a file before training, both in C++ and in Matlab.
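To make the first checklist item concrete, here is a sketch of min/max scaling to [-1, 1] (my own illustration of what libsvm's svm-scale does; the crucial point is that the test set must be scaled with the parameters computed on the training set):

#include <algorithm>
#include <vector>

void scale_features(std::vector<std::vector<double>>& train,
                    std::vector<std::vector<double>>& test)
{
    if (train.empty()) return;
    const size_t dims = train[0].size();
    for (size_t d = 0; d < dims; ++d) {
        double lo = train[0][d], hi = train[0][d];
        for (const auto& v : train) { lo = std::min(lo, v[d]); hi = std::max(hi, v[d]); }
        const double range = hi - lo;
        if (range == 0.0) continue; // constant feature: nothing to scale
        auto rescale = [lo, range](double v) { return -1.0 + 2.0 * (v - lo) / range; };
        for (auto& v : train) v[d] = rescale(v[d]);
        for (auto& v : test)  v[d] = rescale(v[d]); // same parameters as training!
    }
}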
I'm very happy with libsvm using the RBF kernel. carlosdc pointed out the most common errors in the correct order :-). For libsvm: did you use the Python tools shipped with libsvm? If not, I recommend doing so. Write your feature vectors to a file (from Matlab and/or C++) and do a meta-training for the RBF kernel with easy.py. You get the parameters and a prediction for the generated model. If this prediction is OK, continue with C++. From training you also get a scaled feature file (min/max transformed to -1.0/1.0 for every feature); compare this to your C++ implementation as well.
Some libsvm issues: a nasty habit is (if I remember correctly) that values scaling to 0 (zero) are omitted in the scaled file. In grid.py there is a parameter "nr_local_worker" which defines the number of threads; you might wish to increase it.

CBIR with SIFT-like features, discrete vs. continuous approach

Currently I'm implementing a CBIR system for object recognition (object classification, to be specific), and now that I have some working feature detectors and descriptors, I'm trying to find the best way to handle these features for the task of content-based image retrieval.
As far as I know there are two main trends for this task: the discrete and the continuous approach. Discrete stands for methods like bag-of-visual-words and codebooks, used to build inverted indices so that text-retrieval methods can be applied; continuous stands for methods like best-bin-first search with k-d trees and nearest-neighbor classification.
So one main difference between the two approaches is that one works with an extra representation of the features, such as visual words, while the other works directly with the n-dimensional features computed from the descriptor.
My question now is: is there any comparison of the two methods for CBIR that could help me find the best approach for my task?
The full answer to this question would be quite complex and long.
But generally, a continuous method can give you more accurate results; however, it is slower, since you cannot build a search index as effectively and you need to work with large descriptors.
You should consider a combination that uses discrete features (visual words) for the initial results, and afterwards filters the result set using continuous methods (a sketch of that second stage follows).
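As an illustration of that second (continuous) stage, here is a sketch that re-ranks candidate images returned by the visual-word index using OpenCV's FLANN matcher and Lowe's ratio test (function name and threshold are my assumptions):

#include <opencv2/features2d.hpp>
#include <vector>

// Score a candidate by the number of distinctive descriptor matches against
// the query; higher scores move the candidate up in the final ranking.
double continuous_score(const cv::Mat& query_desc, const cv::Mat& candidate_desc)
{
    cv::FlannBasedMatcher matcher;
    std::vector<std::vector<cv::DMatch>> knn;
    matcher.knnMatch(query_desc, candidate_desc, knn, 2);
    int good = 0;
    for (const auto& m : knn)
        if (m.size() == 2 && m[0].distance < 0.75f * m[1].distance) // ratio test
            ++good;
    return static_cast<double>(good);
}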

How should I go about implementing algorithms to be used with black/white and color images?

I thought about:
1) Implement everything for b/w images, then make wrappers for the methods that check whether the input is a color image. If it is, split the channels, perform the operations on each channel individually, and then merge them.
2) Use functors to correctly update the values depending on what I'm dealing with. The problem is that the compiler errors would be really complicated, I'm not used to them, and I think I may end up needing quite a few functors. Not sure if this is a good idea, to be honest.
There might also be an appropriate design pattern here that I'm not seeing. There could also be a channel/color-agnostic way to do this in OpenCV, though I haven't found it yet, and so far the book I'm reading (OpenCV 2 Computer Vision Application Programming Cookbook) hasn't shown me such a possibility.
If speed is important, don't.
It sounds like you're trying to encapsulate or abstract away the type of pixel using OO techniques or the like. This could add an extra level of indirection for every pixel access, killing your performance.
Calling a function directly vs. through a pointer (e.g., a delegate, an overridden method, a functor) can still be faster for the CPU, but if you're doing function calls per pixel at all, reconsider; they're still extra work. If you can nest everything inside the outer for loop, it will look ugly and functional-programming snobs will sneer at you, but remember: this isn't a big LOB app that will get hard to maintain. That's why engineers can still perfectly well maintain 30-year-old QuickBasic code; the problem space doesn't need anything smarter (although usually their problems themselves need something a lot smarter than I!).
If you want speed, it's best to implement simple things (e.g., a threshold op or resizing) optimized for each kind of image. You can also look into transformation matrices and see whether you can accomplish your work that way. That way you write the transform algorithms only once (for b&w) and, using a similar (or the same) matrix, do the same thing for both types of pictures.
Hence you accomplish the major goals of abstraction anyway: seamless reuse and separation of concerns. And speed to boot (but hopefully not reboot!). Good luck.
Splitting the channels can work well with algorithms that treat the channels independently; not all of them do, so this will be quite limiting. You'll also spend a fair amount of time and space making all those copies.
By functors I presume you mean making templates out of your algorithm functions, with a pixel type as the template parameter. That could also work, but it means defining your basic pixel operations in such a way that they can be implemented as functions or operators on a generic pixel type. This is harder than it looks and should be attempted only after you've had some experience implementing the algorithms.
A third option not mentioned is to promote the b/w images to full color, process them, and convert back to b/w. This optimizes the full color processing at the expense of the b/w.
For most algorithms it is not necessary to worry about monochrome vs. colour images. You either use the grey value of the monochrome image, or you calculate the luminance/intensity/whatever of the colour image and use that. You choose the measure (luminance, etc.) by looking at which colour space will give you the result you want.
When you have worked out how you are going to modify your images, you use some pixel-aware processing; e.g., blending two pixels might be pixel_a*0.5 + pixel_b*0.5, and your pixel class will sort out how to apply that to the different colour channels, i.e. Pixel::operator+(const Pixel&), Pixel::operator*(float) and so on (a minimal sketch follows this answer).
There are algorithms that are applied to each colour channel individually, but they are not as common, and often there is some correlation between the spatiotemporal changes in the colours, so you wouldn't do something as basic as processing each channel totally independently of the others.
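A minimal sketch of such a pixel class (illustrative names, not from any particular library):

// The operator overloads let channel-agnostic code such as
//   Pixel blended = a * 0.5f + b * 0.5f;
// work per channel without the algorithm knowing the pixel layout.
struct Pixel {
    float r = 0, g = 0, b = 0;
    Pixel operator+(const Pixel& o) const { return {r + o.r, g + o.g, b + o.b}; }
    Pixel operator*(float s) const { return {r * s, g * s, b * s}; }
};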
My own Image class uses a planar structure (that is, colour channels are stored separately) instead of an interleaved structure. However, this is VERY limiting when it comes to image quantization and other joint colour processing tasks.
I am planning to rewrite it to use the other approach: simply a two-dimensional array of pixels. At the moment I am not sure exactly how I will implement it (a template pixel class, a Pixel base class, or a simple three-dimensional array).
I also plan to write a planar wrapper for this interleaved image structure to mitigate any disadvantages I might encounter. One thing is sure: this wrapper will be much more efficient than a pixel wrapper would be for planar images.
Frankly, I believe splitting planes is rather inefficient, since you pay for various overheads several times. For example, if you want to resize an image, computing the various filter coefficients is very expensive, and it would be MUCH better to compute them just once and apply Pixel::operator* and operator+ than to repeat the work for each of the underlying subpixel components.
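A sketch of that last point, reusing the illustrative Pixel struct from the earlier sketch: the expensive filter weights are computed once per output position, and the Pixel operators then apply them to all channels in one pass.

#include <vector>

// Horizontal filtering step of a resize: 'weights' was computed once for this
// output position; the Pixel operators apply it to every channel at once.
Pixel filter_span(const std::vector<Pixel>& row,
                  const std::vector<float>& weights, int start)
{
    Pixel acc; // zero-initialized accumulator
    for (std::size_t k = 0; k < weights.size(); ++k)
        acc = acc + row[start + k] * weights[k];
    return acc;
}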