Illumination invariant image - c++

I am trying to create an illumination-invariant image with OpenCV, as described in this paper: http://www.cvc.uab.es/adas/publications/alvarez_2008.pdf
Does anyone have an idea how to create that image from the log-log plot in OpenCV?

+1 for the link to an interesting paper.
I guess I would build a function to convert to log, divide the channels, rotate by theta, and project onto one axis. Then I would build a function to measure the quality of the resulting invariant image. Then I would set up a search over theta to optimize the quality. That looks like what Alvarez is doing.
But first, I would study the Luv color space. It might be the closest approximation to this scheme that is possible without the special narrowband camera. Project the uv space onto a vector at angle theta, and see what happens.
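The "convert to log, divide the channels, project onto one axis" step could look roughly like this for a single pixel. This is only a sketch: it assumes G as the reference channel and strictly positive channel values, and the names LogChroma, logChromaticity and projectOntoAxis are made up for illustration, not OpenCV functions.

```cpp
#include <cassert>
#include <cmath>

// 2D log-chromaticity coordinates of one pixel, with G as the reference channel.
struct LogChroma {
    double x;  // log(R/G)
    double y;  // log(B/G)
};

LogChroma logChromaticity(double r, double g, double b) {
    // log is undefined at zero, so zero-valued pixels need an offset beforehand.
    assert(r > 0.0 && g > 0.0 && b > 0.0);
    LogChroma p = { std::log(r / g), std::log(b / g) };
    return p;
}

// Project a log-chromaticity point onto the axis at angle theta (radians).
// The rotation and the projection collapse into a single dot product.
double projectOntoAxis(const LogChroma& p, double theta) {
    return p.x * std::cos(theta) + p.y * std::sin(theta);
}
```

Running this over every pixel for a given theta gives one candidate 1D invariant image, which can then be scored by whatever quality measure is used in the search.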

As far as I can understand the two papers, they are proceeding from a false premise and arriving at an interesting method for getting 1D illumination invariant information from 2D (such as uv from Luv, HS from HSV, etc) color space.
They say illumination invariant, but they show a method of obtaining Color Temperature invariant information from log ratio of color pairs, say {log(R/G),log(B/G)}. You can imagine the setup, with a lamp on a dimmer, and they plot the color ratios: dim the lights, yes, the illumination changes, but so does the color temperature T.
Not to mention that light is not all blackbody color temperature Lambertian. How in the world can this method work? But their results look good.
So, on to the interesting method: entropy minimization
As in the answer above, project the (log of) uv space onto a vector at angle theta. What should theta be? Search over theta to minimize the entropy of the result, that is, to get the sharpest peaks in the 1D histogram. Sort of like an auto-focus.
To answer your question though, use calcHist in opencv. After computing the log, of course.
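Here is a rough sketch of the histogram-plus-entropy search in plain C++, so it runs without OpenCV (in OpenCV, calcHist would build the histogram). A sharply peaked histogram has low Shannon entropy, so the search keeps the angle with the smallest value. The fixed bin width, the 1-degree step and the function names are my own assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <map>
#include <utility>
#include <vector>

// Shannon entropy (bits) of a histogram with a fixed bin width.
// The same bin width is used for every angle so the values are comparable.
double histogramEntropy(const std::vector<double>& values, double binWidth) {
    assert(!values.empty() && binWidth > 0.0);
    std::map<long long, std::size_t> counts;
    for (double v : values) {
        ++counts[static_cast<long long>(std::floor(v / binWidth))];
    }
    double h = 0.0;
    for (const auto& kv : counts) {
        const double p = static_cast<double>(kv.second) / values.size();
        h -= p * std::log2(p);
    }
    return h;
}

// Brute-force search over theta in [0, pi): project the 2D log-chromaticity
// points onto each axis and return the angle whose histogram is sharpest.
double sharpestAngle(const std::vector<std::pair<double, double> >& pts,
                     double binWidth, int steps = 180) {
    assert(!pts.empty() && steps > 0);
    const double pi = std::acos(-1.0);
    double bestTheta = 0.0;
    double bestEntropy = std::numeric_limits<double>::infinity();
    for (int k = 0; k < steps; ++k) {
        const double theta = pi * k / steps;
        std::vector<double> proj;
        proj.reserve(pts.size());
        for (const auto& p : pts) {
            proj.push_back(p.first * std::cos(theta) + p.second * std::sin(theta));
        }
        const double h = histogramEntropy(proj, binWidth);
        if (h < bestEntropy) {
            bestEntropy = h;
            bestTheta = theta;
        }
    }
    return bestTheta;
}
```

The bin width matters: too coarse and every angle looks equally sharp, too fine and every angle looks equally noisy, so expect to tune it for your data.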

Related

Is camera pose estimation with SOLVEPNP_EPNP sensitive to outliers and can this be rectified?

I have an assignment in which I should compare solvePnP() used with SOLVEPNP_EPNP against solvePnPRansac() used with SOLVEPNP_ITERATIVE. The goal is to calculate a warped image from an input image.
To do this, I get an RGB input image, the same image as a 16-bit depth image, the camera intrinsics, and a list of feature matches between the given image and the desired warped image (which shows the same scene from a different perspective).
This is how I went about this task so far:
Calculate a list of 3D object points from the depth image and the intrinsics, one for each feature match.
Use solvePnP() and solvePnPRansac() with the respective algorithms, where the calculated 3D object points and the feature match points of the resulting image are the inputs. As a result I get a rotation and a translation vector for both methods.
As a sanity check, I calculate the average reprojection error by using projectPoints() on all feature match points and comparing the resulting projected points to the feature match points of the resulting image.
Finally I calculate 3D object points for each pixel of the input image and again project them using the rotation and translation vector from before. Each projected point will get the color from the corresponding pixel in the input image resulting in the final warped image.
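The back-projection and reprojection-error steps above can be sketched without OpenCV, assuming a plain pinhole model with no lens distortion and a depth value already converted to metric units. Intrinsics, backProject, project and meanReprojectionError are illustrative names, not OpenCV API, and the sketch leaves out the rotation and translation returned by solvePnP(), so it only covers the camera-frame part.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Intrinsics { double fx, fy, cx, cy; };  // focal lengths and principal point, in pixels
struct Point2 { double u, v; };
struct Point3 { double x, y, z; };

// Back-project pixel (u, v) with depth z into camera coordinates.
Point3 backProject(const Intrinsics& K, Point2 px, double z) {
    assert(K.fx > 0.0 && K.fy > 0.0);
    Point3 p = { (px.u - K.cx) * z / K.fx, (px.v - K.cy) * z / K.fy, z };
    return p;
}

// Project a camera-frame point onto the image plane (no distortion).
Point2 project(const Intrinsics& K, const Point3& p) {
    assert(p.z > 0.0);  // the point must lie in front of the camera
    Point2 px = { K.fx * p.x / p.z + K.cx, K.fy * p.y / p.z + K.cy };
    return px;
}

// Mean Euclidean distance, in pixels, between projected and observed points.
double meanReprojectionError(const std::vector<Point2>& projected,
                             const std::vector<Point2>& observed) {
    assert(!projected.empty() && projected.size() == observed.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < projected.size(); ++i) {
        sum += std::hypot(projected[i].u - observed[i].u,
                          projected[i].v - observed[i].v);
    }
    return sum / projected.size();
}
```

Everything is kept in double here, which sidesteps any question about float precision in these intermediate steps.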
These are my inputs:
Using steps described above I get the following output with the Ransac Method:
This looks pretty much like the reference solution I have, so this should be mostly correct.
However with the solvePnP() method using SOLVEPNP_EPNP the resulting rotation and translation vectors look like this, which doesn't make sense at all:
================ solvePnP using SOVLEPNP_EPNP results: ===============
Rotation: [-4.3160208e+08; -4.3160208e+08; -4.3160208e+08]
Translation: [-4.3160208e+08; -4.3160208e+08; -4.3160208e+08]
The assignment sheet states that the list of feature matches contains some mismatches, so basically outliers. As far as I know, RANSAC handles outliers better, but can this be the reason for these weird results from the other method? I was expecting some anomalies, but this is completely wrong, and the resulting image is completely black since no points fall inside the image area.
Maybe someone can point me in the right direction.
OK, I solved the issue. I had been using float for all calculations (3D object points, matches, ...) and changed everything to double, which did the trick.
The warped perspective is still off and I get a rather high reprojection error, but this should be due to the nature of the algorithm itself, which doesn't handle outliers very well.
The weird thing is that the OpenCV documentation on solvePnP() states that vector<Point3f> and vector<Point2f> can be passed as arguments for object points and image points, respectively.

3D reconstruction from 2 images with baseline and single camera calibration

My semester project is to calibrate stereo cameras with a large baseline (~2 m).
My approach is to work without a precisely defined calibration pattern such as a chessboard, because it would have to be huge and would be hard to handle.
My problem is similar to this one: 3d reconstruction from 2 images without info about the camera
Program so far:
Corner detection in the left image: goodFeaturesToTrack
Refine the corners: cornerSubPix
Find the corner locations in the right image: calcOpticalFlowPyrLK
Calculate the fundamental matrix F: findFundamentalMat
Calculate the rectification homographies H1, H2: stereoRectifyUncalibrated
Rectify the images: warpPerspective
Calculate the disparity map: sgbm
So far so good. It works passably, but the rectified images "jump" in perspective if I change the number of corners.
I don't know if this comes from imprecision or from mistakes I made, or if it can't be calculated because there are no known camera parameters and no lens distortion compensation (but it also happens on the Tsukuba pictures).
Suggestions are welcome :)
But that is not my main problem. Now I want to reconstruct the 3D points.
reprojectImageTo3D needs the Q matrix, which I don't have so far. So my question is how to calculate it. I have the baseline, the distance between the two cameras. My feeling is that if I convert the disparity map into a 3D point cloud, the only thing I'm missing is the scale, right? So if I put in the baseline, I get the correct 3D reconstruction, right? Then how do I do that?
I'm also planning to compensate lens distortion as a first step, for each camera separately, with a chessboard (small and close to one camera at a time, so I don't have to be 10-15 m away with a big pattern in the overlapping area of both). So if it helps, I could also use the camera parameters.
Is there documentation besides http://docs.opencv.org where I can see and understand what the Q matrix is and how it is calculated? Or can I open the source code (probably hard for me to understand ^^)? If I press F2 in Qt I only see the function with its parameter types. (Sorry, I'm really new to all of this.)
left: input with found corners
top h1, h2: rectify images (looks good with this corner count ^^)
SGBM: Disparity map
So I found out what the Q matrix contains here:
Using OpenCV to generate 3d points (assuming frontal parallel configuration)
all these parameters are given by the single camera calibration:
c_x , c_y , f
and the baseline is what i have measured:
T_x
So this works for now, only the units are not that clear to me. I used the values from the single-camera calibration, which are in px, set the baseline in meters, and divided the disparity map by 16, but it does not seem to be the right scale.
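On the units question, for a frontal parallel setup the reprojection roughly boils down to Z = f * T_x / d. With f and d both in pixels, the pixel units cancel and Z comes out in whatever unit the baseline is given in. A minimal sketch of that arithmetic follows, assuming the raw SGBM output is fixed-point with 4 fractional bits (hence the division by 16); StereoParams and disparityTo3D are made-up names.

```cpp
#include <cassert>
#include <cstdint>

struct Point3 { double x, y, z; };

// f, cx, cy in pixels (from the single-camera calibration), baseline in meters.
struct StereoParams { double f, cx, cy, baseline; };

// Convert one raw SGBM disparity value at pixel (u, v) into a 3D point.
// The result is in meters because the baseline is in meters.
Point3 disparityTo3D(const StereoParams& s, double u, double v,
                     std::int16_t rawDisparity) {
    assert(rawDisparity > 0);                // zero or negative means no valid match
    const double d = rawDisparity / 16.0;    // disparity in pixels
    const double z = s.f * s.baseline / d;   // px * m / px = m
    Point3 p = { (u - s.cx) * z / s.f, (v - s.cy) * z / s.f, z };
    return p;
}
```

Note that f, c_x and c_y have to describe the rectified images. After stereoRectifyUncalibrated the images have been warped by H1 and H2, so the values from the original calibration may no longer match exactly, which is one possible reason for a scale that looks off.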
By the way, the disparity map above was wrong ^^ and now it looks better. You have to apply an anti-shearing transform because stereoRectifyUncalibrated shears your image (not documented?).
This is described in section "7 Shearing Transform" of this paper by Charles Loop and Zhengyou Zhang:
http://research.microsoft.com/en-us/um/people/Zhang/Papers/TR99-21.pdf
Result:
http://i.stack.imgur.com/UkuJi.jpg

OpenCV Image stitching when camera parameters are known

We have pictures taken from a plane flying over an area with 50% overlap and are using the OpenCV stitching algorithm to stitch them together. This works fine for our version 1. In our next iteration we want to look into a few extra things that I would appreciate some comments on.
Currently the stitching algorithm estimates the camera parameters. We do have camera parameters and a lot of information available from the plane about camera angle, position (GPS), etc. Would we benefit from this information, compared to just letting the algorithm estimate everything from matched feature points?
These images are taken in high resolution and the algorithm takes up quite a lot of RAM at this point. That is not a big problem, as we just spin up large machines in the cloud. But in our next iteration I would like to compute the homography from downsampled images and apply it to the large images later. This will also give us more options to manipulate and visualize other information on the original images, and to go back and forth between the original and stitched images.
If, for question 1, we are going to take the stitching algorithm apart to put in the known information, is it just a matter of using the findHomography method, or are there better alternatives for creating the homography when we actually know the plane position and angles and the camera parameters?
I have a basic understanding of OpenCV and am fine with C++ programming, so it's not a problem to write our own customized stitcher, but the theory is a bit rusty here.
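On reusing a homography estimated from downsampled images: if full-resolution pixel coordinates are simply s times the downsampled ones, the full-resolution homography is S * H * S^-1 with S = diag(s, s, 1). That means only the translation and perspective entries change. Below is a sketch under that assumption (it ignores any half-pixel offset introduced by the resampling); H3, scaleHomography and applyHomography are illustrative names.

```cpp
#include <cassert>

struct H3 { double m[3][3]; };   // row-major 3x3 homography
struct Pt { double x, y; };

// Convert a homography estimated on images downsampled by factor s
// (s = 4 means the small image is 1/4 the size) to full resolution.
H3 scaleHomography(const H3& h, double s) {
    assert(s > 0.0);
    H3 out = h;
    out.m[0][2] *= s;   // translation grows with the image
    out.m[1][2] *= s;
    out.m[2][0] /= s;   // perspective terms shrink per pixel
    out.m[2][1] /= s;
    return out;
}

// Apply a homography to a point, including the perspective divide.
Pt applyHomography(const H3& h, Pt p) {
    const double w = h.m[2][0] * p.x + h.m[2][1] * p.y + h.m[2][2];
    assert(w != 0.0);
    Pt q = { (h.m[0][0] * p.x + h.m[0][1] * p.y + h.m[0][2]) / w,
             (h.m[1][0] * p.x + h.m[1][1] * p.y + h.m[1][2]) / w };
    return q;
}
```

A quick check is to map a point in the small image, multiply the result by s, and compare it with mapping s times the point through the scaled matrix; the two should agree.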
Since you are using homographies to warp your imagery, I assume you are capturing areas small enough that you don't have to worry about Earth curvature effects. Also, I assume you don't use an elevation model.
Generally speaking, you will always want to tighten your (homography) model using matched image points, since your final output is a stitched image. If you have the RAM and CPU budget, you could refine your linear model using a max likelihood estimator.
Having a prior motion model (e.g. from GPS + IMU) could be used to initialize the feature search and match. With a good enough initial estimation of the feature apparent motion, you could dispense with expensive feature descriptor computation and storage, and just go with normalized crosscorrelation.
If I understand correctly, the images are taken vertically and overlap by a known number of pixels. In that case calculating a homography is a bit overkill: you're just talking about a translation matrix, and using more powerful algorithms can only give you badly conditioned matrices.
In 2D, if H is a generalised homography matrix representing a perspective transformation,
H=[[a1 a2 a3] [a4 a5 a6] [a7 a8 a9]]
then the submatrices R and T represent rotation and translation, respectively, if a9==1.
R= [[a1 a2] [a4 a5]], T=[[a3] [a6]]
while [a7 a8] holds the perspective terms, which make the scale vary across the image. (All of this is a bit approximate since when all effects are present they'll influence each other).
So, if you know the lateral displacement, you can create a 3x3 identity matrix with a3 and a6 set to that displacement (and a9=1) and pass it to cv::warpPerspective, or pass its top two rows as a 2x3 matrix to cv::warpAffine.
As a criterion of matching correctness you can, for example, calculate a normalized difference between pixels.
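A small sketch of both ideas, with made-up names. The matrix is the identity with the known shift in a3 and a6. For the "normalized diff" I'm assuming a mean absolute difference over the overlapping pixels, divided by the maximum intensity, which is just one reasonable reading of it.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct H3 { double m[3][3]; };   // row-major 3x3 matrix

// Pure translation by (dx, dy) pixels: identity plus a3 = dx, a6 = dy.
H3 translationHomography(double dx, double dy) {
    H3 h = {{{1.0, 0.0, dx}, {0.0, 1.0, dy}, {0.0, 0.0, 1.0}}};
    return h;
}

// Mean absolute difference between two equally sized pixel buffers,
// scaled to [0, 1]. 0 means the overlap regions are identical.
double normalizedDifference(const std::vector<double>& a,
                            const std::vector<double>& b,
                            double maxValue = 255.0) {
    assert(!a.empty() && a.size() == b.size() && maxValue > 0.0);
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += std::fabs(a[i] - b[i]);
    }
    return sum / (a.size() * maxValue);
}
```

You could evaluate the difference for a few candidate shifts around the predicted one and keep the shift with the lowest score.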

Calculating the precision of homography on 2D plane

I am trying to find a way to parametrize the precision of my homography calculation. I would like to obtain a value that describes the precision of the homography calculation for a measurement taken at a certain position.
I have successfully calculated the homography (with cv::findHomography) and I can use it to map a point in my camera image onto a 2D map (using cv::perspectiveTransform). Now I want to track these objects on my 2D map, and to do this I want to take into account that objects in the back of my camera image have a less precise position on my 2D map than objects all the way in the front.
I have looked at the following example on this website that mentions plane fitting but I don't really understand how to fill the matrices correctly using this method. The visualisation of the result does seem to fit my needs. Is there any way to do this with standard OpenCV functions?
EDIT:
Thanks, Francesco, for your recommendations. But I think I am looking for something different from your answer. I am not looking to test the precision of the homography itself, but the relation between the density of measurements in one real camera view and the actual size on the map I create. I want to know, when my detection is 1 pixel off in the camera image, how many meters that will be on my map at that point.
I can of course take some pixels around my measurement in the camera image and use the homography to see how many meters on my map they represent, but I don't want to calculate this every time. What I would like is a formula that gives the relation between pixels in my image and pixels on my map, so I can take it into account for my tracking on the map.
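One candidate for such a formula is the Jacobian of the homography at the measured pixel: its entries say how far the map point moves per pixel of image error, and its largest singular value gives the worst case over all directions. The sketch below assumes H maps image pixels to map coordinates; H3, Jac2, homographyJacobian and worstCaseScale are illustrative names, not OpenCV functions.

```cpp
#include <cassert>
#include <cmath>

struct H3 { double m[3][3]; };         // row-major homography, image -> map
struct Jac2 { double a, b, c, d; };    // [[a, b], [c, d]] = d(map) / d(image)

// Analytic Jacobian of (x, y) -> H(x, y) at image point (x, y).
Jac2 homographyJacobian(const H3& h, double x, double y) {
    const double w = h.m[2][0] * x + h.m[2][1] * y + h.m[2][2];
    assert(w != 0.0);
    const double u = (h.m[0][0] * x + h.m[0][1] * y + h.m[0][2]) / w;
    const double v = (h.m[1][0] * x + h.m[1][1] * y + h.m[1][2]) / w;
    Jac2 j = { (h.m[0][0] - u * h.m[2][0]) / w, (h.m[0][1] - u * h.m[2][1]) / w,
               (h.m[1][0] - v * h.m[2][0]) / w, (h.m[1][1] - v * h.m[2][1]) / w };
    return j;
}

// Largest singular value of the 2x2 Jacobian: map units moved per pixel of
// image error in the least favourable direction.
double worstCaseScale(const Jac2& j) {
    return 0.5 * (std::hypot(j.a + j.d, j.c - j.b) +
                  std::hypot(j.a - j.d, j.c + j.b));
}
```

If the map is in meters, worstCaseScale is in meters per pixel. Since it depends only on H and the pixel position, it could also be precomputed once per homography as a lookup image instead of being evaluated per detection.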
What you are looking for is called "predictive error bars" or "prediction uncertainty". You should definitely consult a good introductory book on estimation theory for details (e.g. this one). But briefly, the predictive uncertainty is the probability that...
A certain pixel p in image 1 is the mapping H(p') of a pixel p' in image 2 under the homography H...
Given the uncertainty in H which is due to the errors in the matched pairs (q0, q0'), (q1, q1'), ..., that have been used to estimate H, ...
But assuming the model is correct, that is, that the true map between images 1 and 2 is, in fact, a homography (although the estimated parameters of the homography itself may be affected by errors).
In order to estimate this probability distribution you'll need a model for the errors in the measurements, and a model for how they propagate through the (homography) model.

Convert Polar Image to a Cartesian Image

I am attempting to convert an image in polar coordinates (axes are angle x radius) to an image in cartesian coordinates (axes are x and y).
This is simple enough in MATLAB using pcolor(), but the issue is that I must do this in a MEX file (the C++ interface to MATLAB). This seems easy enough, except that MATLAB only uses array containers, so I can't think of a clever or elegant way of doing this.
I do have access to the image dimensions, and I can imagine repackaging the input image array as a matrix in C++ and carrying out the conversion, but this would be messy and problematic.
Also, I need to be able to interpolate gaps between points in the xy plane.
Any ideas?
This is reasonably standard in image processing, particularly in registration. However, it takes some thought and isn't "obvious". It wasn't obvious to me the first time either.
I'm assuming you have two images, in different "domains", in your case a source image in polar coordinates and a target image in Cartesian coordinates. I'm assuming you know the region in the target image you want to populate.
The usual approach in image processing is to loop over the coordinates in the area of the target image that you want to populate. For each of these positions (x,y), you'll have some conversion to polar. It's probably r = sqrt(x*x+y*y) and theta = atan2(y,x) or something like that. Then you sample the polar image at that (r, theta) position with interpolation.
Among choices of interpolation are:
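Nearest neighbor - you just round to the nearest r and theta and choose the value of that.
Bilinear - you blend the four samples surrounding (r, theta), weighted by distance.
Bi-cubic - you fit a cubic through the surrounding 4x4 samples.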
...
Of course you should take care of boundary conditions and what happens if your r and theta go out of your image.
This procedure (looping over the target image, sampling from the source image, and doing lookups based on the reverse transform) is similar for all kinds of coordinate transformations. The nice thing is that you don't leave holes where your source image is relevant.
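Here is a rough sketch of that loop in plain C++, working directly on a flat column-major array, which is how MATLAB stores matrices, so no repackaging into a matrix type is needed. The layout is my assumption: rows index radius from 0 to the maximum radius, columns index angle from 0 to 2*pi with wrap-around, and the output is an n x n square covering the disc. samplePolar and polarToCartesian are made-up names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Bilinear sample of a column-major polar image at fractional indices.
// ri is the radius index (clamped at the edge), ti the angle index (wraps).
double samplePolar(const std::vector<double>& src, std::size_t nR,
                   std::size_t nTheta, double ri, double ti) {
    assert(ri >= 0.0 && ti >= 0.0);
    const std::size_t r0 = std::min(static_cast<std::size_t>(ri), nR - 1);
    const std::size_t r1 = std::min(r0 + 1, nR - 1);
    const std::size_t t0 = static_cast<std::size_t>(ti) % nTheta;
    const std::size_t t1 = (t0 + 1) % nTheta;
    const double fr = ri - std::floor(ri);
    const double ft = ti - std::floor(ti);
    const double inner = (1.0 - ft) * src[r0 + t0 * nR] + ft * src[r0 + t1 * nR];
    const double outer = (1.0 - ft) * src[r1 + t0 * nR] + ft * src[r1 + t1 * nR];
    return (1.0 - fr) * inner + fr * outer;
}

// Loop over the target (Cartesian) image and sample the source (polar) image.
// Pixels outside the disc keep the fill value.
std::vector<double> polarToCartesian(const std::vector<double>& src,
                                     std::size_t nR, std::size_t nTheta,
                                     std::size_t n, double fill = 0.0) {
    assert(nR >= 2 && nTheta >= 1 && n >= 2 && src.size() == nR * nTheta);
    const double twoPi = 2.0 * std::acos(-1.0);
    const double c = (n - 1) / 2.0;              // centre of the output image
    std::vector<double> dst(n * n, fill);
    for (std::size_t col = 0; col < n; ++col) {
        for (std::size_t row = 0; row < n; ++row) {
            const double x = (col - c) / c;      // normalised to [-1, 1]
            const double y = (row - c) / c;
            const double r = std::hypot(x, y);
            if (r > 1.0) continue;               // outside the polar data
            double theta = std::atan2(y, x);
            if (theta < 0.0) theta += twoPi;     // map to [0, 2*pi)
            dst[row + col * n] = samplePolar(src, nR, nTheta,
                                             r * (nR - 1),
                                             theta / twoPi * nTheta);
        }
    }
    return dst;
}
```

In a MEX function you would wrap the input and output buffers rather than copy them into std::vector, but the indexing (row + col * numRows) stays the same.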
Hope this helps with the image part.
As for the mex part, here's some links:
Mex tutorial
Mex tutorial
Can you be more specific about what you need about the mex part?