How to get the corresponding pixel color for a disparity map coordinate - C++

I'm currently working on an assignment consisting of camera calibration, stereo calibration and finally stereo matching. For this I am allowed to use the available OpenCV samples and tutorials and adapt them to our needs.
While the first two parts weren't much of a problem, I have a question about the stereo matching part:
We should create a colored point cloud in .ply format from the disparity map of two provided images.
I'm using this code as a template:
https://github.com/opencv/opencv/blob/master/samples/cpp/stereo_match.cpp
I get the intrinsic and extrinsic files from the first two parts of the assignment.
My question is: how do I get the corresponding color for each 3D point from the original images and the disparity map?
I'm guessing that each coordinate of the disparity map corresponds to a pixel that both input images share. But how do I get those pixel values?
EDIT: I know that the value of each element of the disparity map represents the disparity of the corresponding pixel between the left and right image. But how do I get the corresponding pixels from the coordinates of the disparity map?
Example:
My disparity value at coordinates (x, y) is 128, and 128 represents the depth. But how do I know which pixel in the original left or right image this corresponds to?
Additional Questions
I have further questions about StereoSGBM and which parameters make sense.
Here are my (downscaled for upload) input images:
left: [image]
right: [image]
These give me the following rectified images:
left: [image]
right: [image]
From these I get this disparity image: [image]
For the disparity image: this is the best result I could achieve, using blockSize=3 and numDisparities=512. However, I'm not at all sure whether those parameters make any sense. Are these values sensible?

My question is: how do I get the corresponding color for each 3D point from the original images and the disparity map?
A disparity map is nothing but the distance between matching pixels along the epipolar line in the left and right images. This means you only need pixel intensities to compute the disparity, which in turn implies you could do this computation either on the grey-scale left/right images or on any single channel of the left/right images.
I am pretty sure the disparity image you are computing operates on grey-scale images obtained from the original RGB images. If you want to compute a color disparity image, you just need to extract the individual color channels of the left and right images and compute a disparity map for each channel. The outcome is then a 3-channel disparity map, as sketched below.
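A minimal sketch of that per-channel idea, assuming already-rectified 8-bit color images; the variable names and the SGBM settings are placeholders, not values from the original post:

    #include <opencv2/opencv.hpp>
    #include <vector>

    // Compute one disparity map per color channel of a rectified image pair.
    cv::Mat perChannelDisparity(const cv::Mat& leftImg, const cv::Mat& rightImg)
    {
        std::vector<cv::Mat> leftCh, rightCh, dispCh;
        cv::split(leftImg, leftCh);    // B, G, R channels of the left image
        cv::split(rightImg, rightCh);  // B, G, R channels of the right image

        // Placeholder SGBM settings; tune numDisparities/blockSize for your data.
        cv::Ptr<cv::StereoSGBM> sgbm = cv::StereoSGBM::create(0   /*minDisparity*/,
                                                              128 /*numDisparities*/,
                                                              5   /*blockSize*/);
        for (size_t c = 0; c < leftCh.size(); ++c) {
            cv::Mat disp;
            sgbm->compute(leftCh[c], rightCh[c], disp); // CV_16S, fixed-point (disparity * 16)
            dispCh.push_back(disp);
        }

        cv::Mat disp3;             // 3-channel disparity map, one channel per color channel
        cv::merge(dispCh, disp3);
        return disp3;
    }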
Additional Questions: I have further questions about StereoSGBM and which parameters make sense. Here are my (downscaled for upload) input images:
There is never a good answer to this for the most general case. You need a parameter tuner for this; see https://github.com/guimeira/stereo-tuner as an example. You should be able to write your own in OpenCV pretty easily if you want.
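If you would rather roll your own quick tuner, here is a rough sketch using OpenCV trackbars; the window name, slider ranges, and the way the sliders map to numDisparities/blockSize are my own assumptions:

    #include <opencv2/opencv.hpp>
    #include <algorithm>

    // Globals kept simple for the sketch; the images must be rectified grayscale frames.
    cv::Mat leftImg, rightImg;
    int numDispSlider = 8;   // numDisparities = 16 * slider value
    int blockSlider   = 2;   // blockSize = 2 * slider value + 1 (kept odd)

    static void onChange(int, void*)
    {
        int numDisparities = std::max(16, 16 * numDispSlider);
        int blockSize      = 2 * blockSlider + 1;

        cv::Ptr<cv::StereoSGBM> sgbm = cv::StereoSGBM::create(0, numDisparities, blockSize);
        cv::Mat disp, disp8;
        sgbm->compute(leftImg, rightImg, disp);                      // CV_16S, disparity * 16
        disp.convertTo(disp8, CV_8U, 255.0 / (numDisparities * 16.0));
        cv::imshow("disparity", disp8);
    }

    int main(int argc, char** argv)
    {
        leftImg  = cv::imread(argv[1], cv::IMREAD_GRAYSCALE);
        rightImg = cv::imread(argv[2], cv::IMREAD_GRAYSCALE);

        cv::namedWindow("disparity");
        cv::createTrackbar("numDisp/16", "disparity", &numDispSlider, 32, onChange);
        cv::createTrackbar("blockSize",  "disparity", &blockSlider,   10, onChange);

        onChange(0, nullptr);
        cv::waitKey();
        return 0;
    }

The sliders are remapped rather than used directly because SGBM requires numDisparities to be a multiple of 16 and blockSize to be odd.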

OK, the solution to this problem is to use the projectPoints() function from OpenCV.
Basically, calculate 3D points from the disparity image, project them onto the 2D image, and use the color you hit.
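A rough sketch of that approach, assuming the fixed-point CV_16S disparity from StereoSGBM has already been converted to float (divided by 16), and that you have the Q matrix from stereoRectify, the rectified left camera matrix K, and the rectified left color image; all names below are placeholders:

    #include <opencv2/opencv.hpp>
    #include <vector>
    #include <cmath>

    // disp32: disparity as CV_32F, Q: 4x4 reprojection matrix from stereoRectify,
    // K: 3x3 camera matrix of the rectified left camera, leftColor: rectified left BGR image.
    void colorizePointCloud(const cv::Mat& disp32, const cv::Mat& Q,
                            const cv::Mat& K, const cv::Mat& leftColor)
    {
        cv::Mat xyz;
        cv::reprojectImageTo3D(disp32, xyz, Q, true); // one 3D point per disparity pixel

        std::vector<cv::Point3f> points;
        for (int y = 0; y < xyz.rows; ++y)
            for (int x = 0; x < xyz.cols; ++x) {
                cv::Vec3f p = xyz.at<cv::Vec3f>(y, x);
                if (std::isfinite(p[2]) && p[2] < 10000.0f)   // drop invalid/missing points
                    points.push_back(cv::Point3f(p[0], p[1], p[2]));
            }

        // Project the 3D points back into the rectified left camera:
        // identity rotation, zero translation, no distortion after rectification.
        std::vector<cv::Point2f> pix;
        cv::projectPoints(points, cv::Vec3d(0, 0, 0), cv::Vec3d(0, 0, 0),
                          K, cv::noArray(), pix);

        for (size_t i = 0; i < points.size(); ++i) {
            int px = cvRound(pix[i].x), py = cvRound(pix[i].y);
            if (px < 0 || py < 0 || px >= leftColor.cols || py >= leftColor.rows)
                continue;
            cv::Vec3b bgr = leftColor.at<cv::Vec3b>(py, px); // color for this 3D point
            // ... write points[i] and bgr to your .ply file here ...
        }
    }

Since reprojectImageTo3D produces one 3D point per disparity pixel, the projection should land back at (approximately) the same (x, y); reading the color of the rectified left image directly at the disparity map coordinate gives the same result with less work.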

Related

Decode a 2D circle colour barcode

I am new to OpenCV, coding in C++. I have been given a task to decode a 2D circle barcode using an encoded array. I am up to the point where I am able to center the figure and get the line using Hough transforms.
I need help with how to read the colours in the image; note that each pair of adjacent blocks corresponds to a letter.
Any pointers will be highly appreciated. Thanks.
First, you need to load the image. I suspect this isn't a problem because you are already using Hough transforms on it, but:
    Mat img = imread(filename);
Once the image is loaded, you can grab any pixel. For a grayscale image:
    Scalar intensity = img.at<uchar>(y, x);
and for a 3-channel (BGR) color image:
    Vec3b bgr = img.at<Vec3b>(y, x);
However, what you need to do is threshold the image. As I mentioned in the comments, the image colors are either 0 or 255 in each RGB channel. This is on purpose, so the data stays decodable in case of image artifacts. If a channel is above a certain value, you consider it 'on'; if below, 'off'.
Threshold the image, e.g. with adaptiveThreshold. I would threshold down to binary 1 or 0. This produces RGB triplets that are one of eight (2^3) possible combinations, from (0,0,0) to (1,1,1).
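A minimal sketch of that binarization; for simplicity it uses a fixed cv::threshold per channel instead of adaptiveThreshold, and the cutoff of 128 is an assumed value:

    #include <opencv2/opencv.hpp>
    #include <vector>

    // Turn a BGR image into per-channel binary 0/1 values.
    cv::Mat binarizeChannels(const cv::Mat& bgr)
    {
        std::vector<cv::Mat> ch;
        cv::split(bgr, ch);                                     // B, G, R
        for (cv::Mat& c : ch)
            cv::threshold(c, c, 128, 1, cv::THRESH_BINARY);     // >128 -> 1, else 0

        cv::Mat bin;                                            // triplets (0,0,0) ... (1,1,1)
        cv::merge(ch, bin);
        return bin;
    }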
Then you need to walk the pixels. This is where it gets interesting.
You say each two adjacent blocks form a single letter. That's 2^6, or 64, different letters. The next question is: are the letters arranged in scan lines, left-to-right, top-to-bottom? If yes, then it will be important to orient the image using the crosshair in the center.
If the image is encoded radially (using polar coordinates), then things get a little trickier. You need to use cvLinearPolar to remap the image, as sketched below.
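In the C++ API the equivalent of cvLinearPolar is cv::linearPolar (newer releases also have cv::warpPolar); a minimal sketch, assuming the crosshair center and the circle radius are already known:

    #include <opencv2/opencv.hpp>

    // Unwrap the circular barcode into a rectangular (radius, angle) image.
    cv::Mat unwrapToPolar(const cv::Mat& img, cv::Point2f center, float radius)
    {
        cv::Mat polar;
        cv::linearPolar(img, polar, center, radius,
                        cv::INTER_LINEAR + cv::WARP_FILL_OUTLIERS);
        return polar;   // rows correspond to angle, columns to distance from the center
    }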
Otherwise you need to walk the whole image, stepping by the size of the RGB blocks, and discard any pixel whose distance from the center is greater than the radius of the circle. After reading all of the pixels into an array, group them in pairs; a sketch of this follows.
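A hedged sketch of that walk, assuming the blocks lie on a regular grid of known blockSize and that the center and radius of the circle have been found; none of these names come from the original post:

    #include <opencv2/opencv.hpp>
    #include <vector>
    #include <utility>

    // bin: per-channel 0/1 image from the thresholding step, blockSize: side length of one
    // color block in pixels, center/radius: circle found earlier (all assumed to be known).
    std::vector<std::pair<cv::Vec3b, cv::Vec3b>> readLetterPairs(const cv::Mat& bin,
                                                                 int blockSize,
                                                                 cv::Point2f center,
                                                                 float radius)
    {
        std::vector<cv::Vec3b> blocks;
        for (int y = blockSize / 2; y < bin.rows; y += blockSize)
            for (int x = blockSize / 2; x < bin.cols; x += blockSize) {
                float dx = x - center.x, dy = y - center.y;
                if (dx * dx + dy * dy > radius * radius) continue; // outside the circle
                blocks.push_back(bin.at<cv::Vec3b>(y, x));         // sample the block center
            }

        std::vector<std::pair<cv::Vec3b, cv::Vec3b>> letters;      // two blocks per letter
        for (size_t i = 0; i + 1 < blocks.size(); i += 2)
            letters.push_back({blocks[i], blocks[i + 1]});
        return letters;
    }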
At some point, I would say that using OpenCV for this is heading towards machine learning. There has to be a point where you can cut in and use neural networks to decode the image for you. Once you have the circle (cutoff radius) and the image centered, you can convert to polar coordinates and discard everything outside the circle by cutting off everything beyond its radius. Remember, polar coordinates are (r, theta), so you should be able to cut off the right part of the polar image.
Then you could train a neural network to take the polar image as input and spit out the paragraph.
You would have to provide lots of training data, and the trained model would still rely on your ability to pre-process the image. This includes any affine transforms in case the image is tilted or rotated. At that point you would say to yourself that you've done all the heavy lifting and the last little bit really isn't that hard.
However, once you have a process working for a clean image, you can start adding steps that introduce ML to handle dirty images. HoughCircles can be used to detect the part of an image to run detection on. Next, you need to decide whether the image inside the circle is a barcode or not.
A good barcode system will have parity bits or some other form of error correction, but you can also use machine learning to clean up the output.
My 2 cents, anyway.

Built-in function for interpolating single pixels and small blobs

Problem
Is there a built-in function for interpolating single pixels?
Given a normal image as a Mat and a Point, e.g. an anomaly of the sensor or an outlier, is there some function to repair this point?
Furthermore, if I have more than one connected point (let's say a blob with an area smaller than 10x10 pixels), is there a possibility to fix them too?
Attempts, but not really solutions
Interpolation is implemented in the geometric transformations, including resizing images, and borderInterpolate can extrapolate pixels outside of the image, but I haven't found a possibility for single pixels or small clusters of pixels.
A solution with medianBlur, as suggested here, does not seem appropriate, as it changes the whole image.
Alternative
If there isn't a built-in function, my idea would be to look at all 8-connected surrounding pixels that are not part of the blob and calculate their mean or a weighted mean. If done iteratively, all missing or erroneous pixels should be filled. But this method would depend on the order in which the pixels are corrected. Are there other suggestions? (A sketch of this idea follows below.)
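A rough sketch of this iterative idea, assuming an 8-bit single-channel image and a mask whose non-zero pixels mark the blob to repair (both names are placeholders):

    #include <opencv2/opencv.hpp>

    // Iteratively replace masked pixels by the mean of their valid 8-connected neighbours.
    // img: CV_8UC1 image repaired in place, mask: CV_8UC1, non-zero marks broken pixels.
    void fillByNeighbourMean(cv::Mat& img, cv::Mat mask)
    {
        bool changed = true;
        while (changed && cv::countNonZero(mask) > 0) {
            changed = false;
            cv::Mat nextMask = mask.clone();
            for (int y = 0; y < img.rows; ++y) {
                for (int x = 0; x < img.cols; ++x) {
                    if (!mask.at<uchar>(y, x)) continue;        // only touch broken pixels
                    int sum = 0, n = 0;
                    for (int dy = -1; dy <= 1; ++dy)
                        for (int dx = -1; dx <= 1; ++dx) {
                            int yy = y + dy, xx = x + dx;
                            if (yy < 0 || xx < 0 || yy >= img.rows || xx >= img.cols) continue;
                            if (mask.at<uchar>(yy, xx)) continue; // neighbour is broken too
                            sum += img.at<uchar>(yy, xx);
                            ++n;
                        }
                    if (n > 0) {                                  // at least one valid neighbour
                        img.at<uchar>(y, x) = static_cast<uchar>(sum / n);
                        nextMask.at<uchar>(y, x) = 0;             // mark as repaired
                        changed = true;
                    }
                }
            }
            mask = nextMask;
        }
    }

Because each pass only reads pixels that were valid before the pass started, the result does not depend on the order in which the broken pixels are visited.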
Update
Here is an image to illustrate the problem: on the left, the original image with a contour marking the pixels to fix; on the right, the fixed pixels. I hope to find a more sophisticated algorithm to fix the pixels.
The built-in function inpaint of OpenCV does the desired interpolation of chosen pixels. Simply create a mask with all pixels to be repaired.
See the OpenCV 3.2 documentation for inpaint.
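A minimal usage sketch of inpaint, assuming an 8-bit image and a CV_8UC1 mask whose non-zero pixels mark the region to repair:

    #include <opencv2/opencv.hpp>

    // img: 8-bit 1- or 3-channel image, mask: CV_8UC1 with non-zero at the broken pixels.
    cv::Mat repairPixels(const cv::Mat& img, const cv::Mat& mask)
    {
        cv::Mat fixed;
        cv::inpaint(img, mask, fixed, 3 /*inpaintRadius*/, cv::INPAINT_TELEA);
        return fixed;
    }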

reverse UndistortRectifyMap

I'm making a multi-camera stereo calibration program.
My idea is to rectify each pair of cameras separately.
For example: for 3 given cameras I compute undistortion and rectification maps (using stereoRectify() and initUndistortRectifyMap()) separately for {camera[1], camera[2]}, {camera[2], camera[3]} and {camera[1], camera[3]}.
Using remap(), I can transform any original image (from, let's say, camera[1]) into one of two different rectified images: rectified[1][2] and rectified[1][3].
Now, also using remap(), for any point from that original image I can compute its new coordinates separately in the rectified[1][2] and rectified[1][3] images.
This works well, but now I need to compute these coordinates in the opposite direction: for any point in any of the rectified images, I need to find its original coordinates in its original image.
How can I do this?

Finding individual center points of circles in an image

I am using OpenCV and C++. I have a completely dark image with 3 colored points on it, and I need their center coordinates. If there is only one colored point in the dark image, the program automatically displays its center coordinate. However, if I take the dark image with the 3 colored points as input, my program averages those 3 coordinates and returns the center of the 3 colored points together, which is exactly my problem. I need their individual center coordinates.
Can anyone suggest a method to do that, please? Thanks.
Here is the code http://pastebin.com/RM7chqBE
Found a solution! The steps (a sketch follows below):
load the original image
convert the original image to gray
set a range of intensity values depending on the color that needs to be detected
create vectors for the contours and hierarchy
run findContours
create vectors for the moments and points
iterate through each contour to find its coordinates
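A hedged sketch of those steps; the intensity cutoff of 40 and the file handling are assumptions, not part of the original solution:

    #include <opencv2/opencv.hpp>
    #include <iostream>
    #include <vector>

    int main(int argc, char** argv)
    {
        cv::Mat img = cv::imread(argv[1]);          // dark image with a few colored dots
        cv::Mat gray;
        cv::cvtColor(img, gray, cv::COLOR_BGR2GRAY);

        // Keep only pixels bright enough to belong to a dot; 40 is a placeholder cutoff.
        cv::Mat bin;
        cv::inRange(gray, cv::Scalar(40), cv::Scalar(255), bin);

        std::vector<std::vector<cv::Point>> contours;
        std::vector<cv::Vec4i> hierarchy;
        cv::findContours(bin, contours, hierarchy, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);

        // One centroid per contour, so each dot gets its own center.
        for (const auto& c : contours) {
            cv::Moments m = cv::moments(c);
            if (m.m00 == 0) continue;
            cv::Point2f center(static_cast<float>(m.m10 / m.m00),
                               static_cast<float>(m.m01 / m.m00));
            std::cout << "center: " << center << std::endl;
        }
        return 0;
    }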
One of the easy ways to do this is to use the findContours and drawContours functions.
In the documentation there is a bit of code that explains how to retrieve the connected components of an image, which is what you are actually trying to do.
For example, you could draw every connected component you find (that means every dot) on its own image and use the code you already have on every image.
This may not be the most efficient way to do it, but it's really simple.
Here is how I would do it:
http://pastebin.com/y1Ae3e2V
I'm not sure this works, however, as I don't have time to test it, but you can try it.

Disparity map from 2 consecutive frames of a SINGLE calibrated camera. Is it possible?

The stereo_match.cpp example converts L and R images into a disparity map and point cloud. I want to adapt this example to compute the disparity and point cloud from 2 consecutive frames of a single calibrated camera. Is it possible? If this example isn't suited to my purpose, what are the steps to obtain what I want?
A disparity map, in stereo systems, is used to obtain depth information, i.e. the distance to objects in the scene. For that, you need the distance between the cameras in order to convert disparity information into real dimensions.
On the other hand, if you have consecutive frames from a static camera, I suppose you want the differences between them. You can obtain them with an optical flow algorithm. Dense optical flow is calculated for each pixel in the image, in the same way as disparity, and outputs the movement direction and magnitude; a sketch is below. The most common optical flow algorithms are sparse: they track only a set of "strong", or well-defined, points.
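A minimal dense optical flow sketch using calcOpticalFlowFarneback on two consecutive grayscale frames; the parameter values below are commonly used defaults, not something from the original answer:

    #include <opencv2/opencv.hpp>

    // prev, next: consecutive CV_8UC1 frames from the same camera.
    cv::Mat denseFlow(const cv::Mat& prev, const cv::Mat& next)
    {
        cv::Mat flow;   // CV_32FC2: per-pixel (dx, dy) displacement
        cv::calcOpticalFlowFarneback(prev, next, flow,
                                     0.5,  // pyramid scale
                                     3,    // pyramid levels
                                     15,   // window size
                                     3,    // iterations per level
                                     5,    // pixel neighbourhood for polynomial expansion
                                     1.2,  // Gaussian sigma for the expansion
                                     0);   // flags
        return flow;
    }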
It may make sense to use disparity algorithms if you have a static scene but move the camera, simulating the two cameras in a stereo rig.
Yes, if the camera (or the scene) is moving.
I suppose we cannot calculate an accurate disparity map from a single camera. In computing the disparity map we basically assume that the vertical pixel coordinate of a point is the same in both images of a stereo rig and only the horizontal pixel coordinate changes, but in a monocular image sequence this may not hold true, as the camera is moving between the two consecutive frames.