3D reconstruction C++ with OpenCV..Fundamental Matrix too large - c++

Ok I am posting my conundrums of life to stackoverflow after 4 days of mindless programming when nothing seems to get things right or atleast close to right. sorry for being a little dramatic but I feel like a lousy programmer today.
Anyway, my problem is:
To obtain Fundamental matrix using RANSAC (N>8).
I have two images with wide baseline but sufficient overlap so that adequate amount of SURF keypoints (~308) are matched correctly (i plot them).
Now lies the problem. I pass the 2D points to cv::findFindamentalMat but I get completly baseless results. The function returns:
FundMat=[2.05148e-13 3.72341 -2.03671e+10
1.6701e+26 -4.17712 4.59533e+29
3.32414e+18 2.8843 1.91069e-26]
To circumvent the large dynamic range of the matrix, Hartley suggested to normalise the data points (in euclidean space and not the projection space normalization)....Even after doing that the result is the almost the same. (10^-9 to 10^9)
I understand that FundMat is accurate only upto scale but a difference of 10^-9 to 10^+9 is too much.
I referred to other questions here but i dont seem to get any leads:findfundamentalmatrix-doesnt-find-fundamental-matrix
how-to-calculate-the-fundamental-matrix-for-stereo-vision
Any ideas would be great. This is a very important step when considering uncalibrated images for the rest of the software pipeline.
n case the code is helpful. (its not indented and colored though..space is too less here.)
https://sites.google.com/site/3drecon124/

its solved...silly human error. there was a data type conversion from double to float and it caused data to be fetched from incorrect locations in memory. now its smooth and epipolar constraint is satisfied upto scale.

Related

Bundle adjustment with focal length correction not converging

I am trying to add a new feature to our existing implementation of the bundle adjustment in code.
The algorithm uses the Gauss-Newton method and has been working for well over a decade. The least squares "A" matrix is populated using initial approximations of the image exterior orientations, as well as the object points. The book from Kraus - "Photogrammetry: Fundamental and Standard Processes" - was used for this.
A while ago, self calibration was added to this algorithm, however, only the formulae by Ebner and Gruen were added (formula for Ebner here). I am now trying to add the "Brown-Conrady" formula which is well documented in this paper (final algorithm under "concluding remarks"). It uses 10 parameters to determine deltaX and deltaY.
When I include all the parameters except for deltaC (the correction to the focal length/camera constant), our algorithm works and the adjustment converges and produces the desired residuals. However, as soon as I introduce deltaC (which mathematically I see as "allowing" the image points to scale by some amount in X and Y) the adjustment diverges.
The input to the algorithm is a large set of already undistorted aerial images, along with their control points and a large number of image points. We are therefore expecting the distortion/correction parameters to be close to zero, since the images are already undistorted. This is indeed the case for Ebner and Grun.
For Brown, however, some of the parameters (and therefore the delta corrections) grow uncontrollably. I have tried scaling these parameters (the principle points and focal length correction deltaC) so that they are closer in magnitude to the other parameters (K1,K2,K3,P1,P2) however this did not help - the adjustment diverges all the same.
Is there any reason for this? Could it perhaps be because the images are already undistorted? Or something to do with this aerial job in particular?
I have not provided code as it is simply too complex, however I feel it is maybe an understanding of the implementation as opposed to specific code where I am going wrong.
Thanks!

Perspective projection based on 4 points in 2D

I'm writing to ask about homography and perspective projection.
I'm trying to write a piece of code, that will "warp" my image so that its corners align with 4 reference points that are in the 3D space - however, the game engine that I'm running it in, already allows me to get the screen position of them, so I already have their screen-space coordinates of both xi,yi and ui,vi, normalized to values between 0 and 1.
I have to mention that I don't have a degree in mathematics, which seems to be a requirement in the posts I've seen on this topic so far, but I'm hoping there is actually a solution to this problem that one can comprehend. I never had a chance to take classes in Computer Vision.
The reason I came here is that in all the posts I've seen online, the simple explanation that I came across is that each point must be put into a 1x3 matrix and multiplied by a 3x3 homography, which consists of 9 components h1,h2,h3...h9, and this transformation matrix will transform each point to the correct perspective. And that's where I'm hitting a brick wall - how do I calculate the transformation matrix? It feels like it should be a relatively simple algebraic task, but apparently it's not.
At this point I spent days reading on the topic, and the solutions I've come across are either based on matlab (which have a ton of mathematical functions built into them), or include elaborations and discussions that don't really explain much; sometimes they suggest tons of different parameters and simplifications, but rarely explain why and what's their purpose, or they are referencing books and studies that have been since removed from the web, and I found myself more confused than I was in the beginning. Most of the resources I managed to find online are also made in a different context - image stitching and 3d engine development.
I also want to mention that I need to run this code each frame on the CPU, and I'm fairly concerned about the effect of having to run too many matrix transformations and solving a ton of linear algebra equations.
I apologize for not asking about any specific code, but my general question is - can anyone point me in the right direction with this issue?
Limit the problem you deal with.
For example, if you always warp the entire rectangular image, you can treat that the coordinates of the image corners are {(0,0), (1,0), (0,1), (1,1)}.
This can simplify the equation, and you'll be able to solve the equation by yourself.
So you'll be able to implement the answer.
Note : Homograpy is scale invariant. So you can decrease the freedom to 8. (e.g. you can solve the equation under h9=1).
Best advice I can give: read a good book on the subject. For example, "Multiple View Geometry" by Hartley and Zisserman

How to detect anomalies in opencv (c++) if threshold is not good enought?

I have grayscale images like this:
I want to detect anomalies on this kind of images. On the first image (upper-left) I want to detect three dots, on the second (upper-right) there is a small dot and a "Foggy area" (on the bottom-right), and on the last one, there is also a bit smaller dot somewhere in the middle of the image.
The normal static tresholding does't work ok for me, also Otsu's method is always the best choice. Is there any better, more robust or smarter way to detect anomalies like this? In Matlab I was using something like Frangi Filtering (eigenvalue filtering). Can anybody suggest good processing algorithm to solve anomaly detection on surfaces like this?
EDIT: Added another image with marked anomalies:
Using #Tapio 's tophat filtering and contrast adjustement.
Since #Tapio provide us with great idea how to increase contrast of anomalies on the surfaces like I asked at the begining, I provide all you guys with some of my results. I have and image like this:
Here is my code how I use tophat filtering and contrast adjustement:
kernel = getStructuringElement(MORPH_ELLIPSE, Size(3, 3), Point(0, 0));
morphologyEx(inputImage, imgFiltered, MORPH_TOPHAT, kernel, Point(0, 0), 3);
imgAdjusted = imgFiltered * 7.2;
The result is here:
There is still question how to segment anomalies from the last image?? So if anybody have idea how to solve it, just take it! :) ??
You should take a look at bottom-hat filtering. It's defined as the difference of the original image and the morphological closing of the image and it makes small details such as the ones you are looking for flare out.
I adjusted the contrast to make both images visible. The anomalies are much more pronounced when looking at the intensities and are much easier to segment out.
Let's take a look at the first image:
The histogram values don't represent the reality due to scaling caused by the visualization tools I'm using. However the relative distances do. So now the thresholding range is much larger, the target changed from a window to a barn door.
Global thresholding ( intensity > 15 ) :
Otsu's method worked poorly here. It segmented all the small details to the foreground.
After removing noise by morphological opening :
I also assumed that the black spots are the anomalies you are interested in. By setting the threshold lower you include more of the surface details. For example the third image does not have any particularly interesting features to my eye, but that's for you to judge. Like m3h0w said, it's a good heuristic to know that if something is hard for your eye to judge it's probably impossible for the computer.
#skoda23, I would try unsharp masking with fine tuned parameters for the blurring part, so that the high frequencies get emphasized and test it thoroughly so that no important information is lost in the process. Remember that it is usually not good idea to expect computer to do super-human work. If a human has doubts about where the anomalies are, computer will have to. Thus it is important to first preprocess the image, so that the anomalies are obvious for the human eye. Alternative for unsharp masking (or addition) might be CLAHE. But again: remember to fine tune it very carefully - it might bring out the texture of the board too much and interfere with your task.
Alternative approach to basic thresholding or Otsu's, would be AdaptiveThreshold() which might be a good idea since there is a difference in intensity values between different regions you want to find.
My second guess would be first using fixed value thresholding for the darkest dots and then trying Sobel, or Canny. There should exist an optimal neighberhood where texture of the board will not shine as much and anomalies will. You can also try bluring before edge detection (if you've detected the small defects with the thresholding).
Again: it is vital for the task to experiment a lot on every step of this approach, because fine tuning the parameters will be crucial for eventual success. I'd recommend making friends with the trackbar, to speed up the process. Good luck!
You're basically dealing with the unfortunate fact that reality is analog. A threshold is a method to turn an analog range into a discrete (binary) range. Any threshold will do that. So what exactly do you mean with a "good enough" threshold?
Let's park that thought for a second. I see lots of anomalies - sort of thin grey worms. Apparently, you ignore them. I'm applying a different threshold then you are. This may be reasonable, but you're applying domain knowledge that I don't have.
I suspect these grey worms will be throwing off your fixed value thresholding. That's not to say the idea of a fixed threshold is bad. You can use it to find some artifacts and exclude those. Somewhat darkish patches will be missed, but can be brought out by replacing each pixel with the median value of its neighborhood, using a neighborhood size that's bigger than the width of those worms. In the dark patch, this does little, but it wipes out small local variations.
I don't pretend these two types of abnormalities are the only two, but that is really an application domain question and not about techniques. E.g. you don't appear to have ligthing artifacts (reflections), at least not in these 3 samples.

Detecting/correcting Photo Warping via Point Correspondences

I realize there are many cans of worms related to what I'm asking, but I have to start somewhere. Basically, what I'm asking is:
Given two photos of a scene, taken with unknown cameras, to what extent can I determine the (relative) warping between the photos?
Below are two images of the 1904 World's Fair. They were taken at different levels on the wireless telegraph tower, so the cameras are more or less vertically in line. My goal is to create a model of the area (in Blender, if it matters) from these and other photos. I'm not looking for a fully automated solution, e.g., I have no problem with manually picking points and features.
Over the past month, I've taught myself what I can about projective transformations and epipolar geometry. For some pairs of photos, I can do pretty well by finding the fundamental matrix F from point correspondences. But the two below are causing me problems. I suspect that there's some sort of warping - maybe just an aspect ratio change, maybe more than that.
My process is as follows:
I find correspondences between the two photos (the red jagged lines seen below).
I run the point pairs through Matlab (actually Octave) to find the epipoles. Currently, I'm using Peter Kovesi's
Peter's Functions for Computer Vision.
In Blender, I set up two cameras with the images overlaid. I orient the first camera based on the vanishing points. I also determine the focal lengths from the vanishing points. I orient the second camera relative to the first using the epipoles and one of the point pairs (below, the point at the top of the bandstand).
For each point pair, I project a ray from each camera through its sample point, and mark the closest covergence of the pair (in light yellow below). I realize that this leaves out information from the fundamental matrix - see below.
As you can see, the points don't converge very well. The ones from the left spread out the further you go horizontally from the bandstand point. I'm guessing that this shows differences in the camera intrinsics. Unfortunately, I can't find a way to find the intrinsics from an F derived from point correspondences.
In the end, I don't think I care about the individual intrinsics per se. What I really need is a way to apply the intrinsics to "correct" the images so that I can use them as overlays to manually refine the model.
Is this possible? Do I need other information? Obviously, I have little hope of finding anything about the camera intrinsics. There is some obvious structural info though, such as which features are orthogonal. I saw a hint somewhere that the vanishing points can be used to further refine or upgrade the transformations, but I couldn't find anything specific.
Update 1
I may have found a solution, but I'd like someone with some knowledge of the subject to weigh in before I post it as an answer. It turns out that Peter's Functions for Computer Vision has a function for doing a RANSAC estimate of the homography from the sample points. Using m2 = H*m1, I should be able to plot the mapping of m1 -> m2 over top of the actual m2 points on the second image.
The only problem is, I'm not sure I believe what I'm seeing. Even on an image pair that lines up pretty well using the epipoles from F, the mapping from the homography looks pretty bad.
I'll try to capture an understandable image, but is there anything wrong with my reasoning?
A couple answers and suggestions (in no particular order):
A homography will only correctly map between point correspondences when either (a) the camera undergoes a pure rotation (no translation) or (b) the corresponding points are all co-planar.
The fundamental matrix only relates uncalibrated cameras. The process of recovering a camera's calibration parameters (intrinsics) from unknown scenes, known as "auto-calibration" is a rather difficult problem. You'd need these parameters (focal length, principal point) to correctly reconstruct the scene.
If you have (many) more images of this scene, you could try using a system such as Visual SFM: http://ccwu.me/vsfm/ It will attempt to automatically solve the Structure From Motion problem, including point matching, auto-calibration and sparse 3D reconstruction.

Generating contour lines from regularly spaced data

I am currently working on a data visualization project.My aim is to produce contour lines ,in other words iso-lines, from gridded data.Data can be temperature, weather data or any kind of other environmental parameters but only condition is it must be regularly spaced.
I searched in internet , however i could not find a good algorithm, pseudo-code or source code for producing contour lines from grids.
Does anybody knows a library, source code or an algorithm for producing contour lines from gridded data?
it will be good if your suggestion has a good run time performance, i don't want to wait my users so much :)
Edit: thanks for response but isolines have some constrains like they should not intersects
so just generating bezier curves does not accomplish my goal.
See this question: How to approximate a vector contour from an elevation raster?
It's a near duplicate, but uses quite different terminology. You'll find that cartography and computer graphics solve many of the same problems, but use different terminology for them.
there's some reasonably good contouring available in GNUplot - if you're able to use GPL code that may help.
If your data is placed at regular intervals, this can be done fairly easily (assuming I understand your problem correctly). First you need to determine at what interval you want your contours. Next create the grid you are going to use to store the contour information (i'm assuming just a simple on/off or elevation at this contour level type of data), which should be one interval smaller than the source data.
Now the trick here is to offset the 2 grids by 1/2 an interval (won't actually show up in code like this, but its the concept I'm dealing with here) and compare the 4 coordinates surrounding the current point in the contour data grid you are calculating. If any of the 4 points are in a different interval range, then that 'pixel' in the contour grid should be set to true (or the value of the contour range being crossed).
With this method, there will be a problem when the interval is too fine which will cause several contours to overlap onto each other.
As the link from Paul Tomblin suggests, Bezier curves (which are a subset of B-splines) are a ripe solution for your problem. If runtime performance is an issue, Bezier curves have the added benefit of being constructable via the very fast de Casteljau algorithm, instead of drawing them according to the parametric equations. On the off chance you're working with DirectX, it has a library function for the de Casteljau, but it should not be challenging to brew one yourself using the 1001 web pages that describe it.