colmap localization: get camera position

I'm using COLMAP. I managed to build and visualize a 3D sparse reconstruction from a video.
Now I have some new images of the same scene and I want to (only) localize them: I want the (x, y, z, angles) of the camera for each one.
Following the documentation, I used the commands colmap feature_extractor and colmap vocab_tree_matcher.
Everything seemed to go well; the output is:
Indexing image [1/23] in 0,022s
...
Indexing image [23/23] in 0,077s
Matching image [1/16] in 0,078s
...
Matching image [16/16] in 0,003s
Elapsed time: 0,043 [minutes]
But now what?
How do I query the COLMAP database to get the (x, y, z, angles) of image, say, 12?
I want to programmatically get the information.
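One thing to note before querying anything: feature_extractor and vocab_tree_matcher only produce keypoints and matches, and the poses themselves never end up in database.db; they live in the sparse model. Assuming the new images are registered into the existing sparse model (e.g. with colmap image_registrator, as described in the COLMAP FAQ) and the model is exported as text with colmap model_converter --output_type TXT, every registered image gets a line in images.txt of the form IMAGE_ID QW QX QY QZ TX TY TZ CAMERA_ID NAME, i.e. the world-to-camera rotation (as a quaternion) and translation; the camera position in world coordinates is then C = -R^T * t. A minimal C++ sketch that parses images.txt and prints position and orientation for each image:

// Parse COLMAP's images.txt (exported with `colmap model_converter --output_type TXT`)
// and print each camera's world position C = -R^T * t, where (qw,qx,qy,qz,tx,ty,tz)
// is the world-to-camera pose stored by COLMAP.
#include <cmath>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

int main(int argc, char** argv) {
    std::ifstream file(argc > 1 ? argv[1] : "images.txt");
    std::string line;
    while (std::getline(file, line)) {
        if (line.empty() || line[0] == '#') continue;   // skip the header comments

        std::istringstream iss(line);
        int image_id, camera_id;
        double qw, qx, qy, qz, tx, ty, tz;
        std::string name;
        if (!(iss >> image_id >> qw >> qx >> qy >> qz >> tx >> ty >> tz >> camera_id >> name))
            continue;

        // Rotation matrix of the (normalized) world-to-camera quaternion, row-major.
        const double n = std::sqrt(qw*qw + qx*qx + qy*qy + qz*qz);
        qw /= n; qx /= n; qy /= n; qz /= n;
        const double R[3][3] = {
            {1 - 2*(qy*qy + qz*qz), 2*(qx*qy - qz*qw),     2*(qx*qz + qy*qw)},
            {2*(qx*qy + qz*qw),     1 - 2*(qx*qx + qz*qz), 2*(qy*qz - qx*qw)},
            {2*(qx*qz - qy*qw),     2*(qy*qz + qx*qw),     1 - 2*(qx*qx + qy*qy)}};

        // Camera center in world coordinates: C = -R^T * t.
        const double t[3] = {tx, ty, tz};
        double C[3] = {0.0, 0.0, 0.0};
        for (int i = 0; i < 3; ++i)
            for (int j = 0; j < 3; ++j)
                C[i] -= R[j][i] * t[j];

        std::cout << name << " (image_id " << image_id << "): position = ("
                  << C[0] << ", " << C[1] << ", " << C[2]
                  << "), world-to-camera quaternion (w, x, y, z) = ("
                  << qw << ", " << qx << ", " << qy << ", " << qz << ")\n";

        std::getline(file, line);   // skip the second line (the 2D point observations)
    }
    return 0;
}

Filtering on image_id == 12 (or on the file name) gives the single image asked about, and Euler angles, if that is what "angles" means here, can be derived from the same quaternion.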

Related

Building an object detector for a small dataset with a single class

I have a dataset of a single class (a rectangular object) with a size of 130 images. My goal is to detect the object and draw a circle/dot/mark at its centre.
Because the objects are rectangular, my idea is to take the dimensions of the predicted bounding box and place the circle/dot/mark at (width/2, height/2).
However, if I were to do transfer learning, would YOLO be a good choice to detect a single class of objects in a small dataset?
YOLO should be fine; however, the original version is old now, so try YOLOv4 for better results.
People have used transfer learning with Faster R-CNN to detect single objects with 300 images and it worked fine (link). However, 130 images is a bit smaller, so try augmenting the images (flipping, rotating, etc.) if you get poor results.
Apply the same augmentation to the annotations whenever you do translation, rotation, or flip augmentations. For example, in PyTorch, for segmentation, I use:
import random
import torchvision.transforms as T

if random.random() < 0.5:   # horizontal flip
    image = T.functional.hflip(image)
    mask = T.functional.hflip(mask)
if random.random() < 0.25:  # rotation
    rotation_angle = random.randrange(-10, 11)
    image = T.functional.rotate(image, angle=rotation_angle)
    mask = T.functional.rotate(mask, angle=rotation_angle)
For bounding boxes you will have to transform the coordinates yourself; for a horizontal flip, x becomes width − x (see the sketch below for boxes stored as (x, y, w, h)).
Augmentations where the object position does not change (e.g. gamma/intensity transformations) do not require changing the annotations.
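If a box is stored as (x, y, w, h) rather than as individual corner points, the flipped top-left x is width − x − w, because the old right edge becomes the new left edge. A minimal sketch of that arithmetic, with a hypothetical Box struct and continuous pixel coordinates:

// Horizontal flip of an axis-aligned bounding box stored as (x, y, w, h).
// A single corner coordinate flips as x -> width - x; the top-left corner of
// an (x, y, w, h) box flips to width - x - w, since the right edge becomes the left edge.
#include <iostream>

struct Box { float x, y, w, h; };

Box hflipBox(const Box& b, float image_width) {
    return Box{image_width - b.x - b.w, b.y, b.w, b.h};
}

int main() {
    Box b{10.f, 20.f, 30.f, 40.f};               // example box in a 100-px-wide image
    Box f = hflipBox(b, 100.f);
    std::cout << f.x << " " << f.y << " " << f.w << " " << f.h << "\n";   // 60 20 30 40
    return 0;
}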

Get the polygon coordinates of predicted output mask in YOLACT/YOLACT++

I am using YOLACT (https://github.com/dbolya/yolact), an instance segmentation algorithm which outputs the test image with a mask over each detected object. Since the input images are annotated with the coordinates of polygons around the object classes in annotations.json, I would like to get the same kind of polygon output, but I can't figure out how to extract the coordinates of those contours/polygons.
As far as I understood from the script https://github.com/dbolya/yolact/blob/master/eval.py, the output is a list of tensors for the detected objects; it contains the classes, scores, boxes and masks for the evaluated image. The eval.py script returns the recognized image with all this information. The predictions are saved in 'preds' in the evalimage function (line 595), and post-processing of the prediction result happens in "def prep_display" (line 135).
Now how do I extract those polygon coordinates and save them in a .json file (or any other format)?
I also tried to look at these issues but sadly couldn't figure it out:
https://github.com/dbolya/yolact/issues/286
and
https://github.com/dbolya/yolact/issues/256
You need to create a complete post-processing pipeline that is specific to your task. Here is some small pseudocode that could be added to prep_display() in eval.py:
with timer.env('Copy'):
    if cfg.eval_mask_branch:
        # Add the line below to get all the predicted object masks as a list
        all_objects_mask = t[3][:args.top_k]
        # Convert each object mask to binary and then
        # use OpenCV's findContours() method to extract the contour points for each object
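The mask-to-polygon step itself is plain OpenCV: binarize the mask, call findContours(), optionally simplify the contour with approxPolyDP(), then write out the point coordinates. Inside eval.py the same calls are available through cv2 on each mask in t[3], after moving it to the CPU and converting it to uint8. Below is a standalone C++ sketch of that step, assuming a single mask has been exported as an 8-bit image; mask.png, the 127 threshold and polygons.json are placeholders:

// Binarize one exported mask, extract its contours and write the polygon
// coordinates as a JSON array of [x, y] points.
#include <fstream>
#include <vector>
#include <opencv2/opencv.hpp>

int main() {
    cv::Mat mask = cv::imread("mask.png", cv::IMREAD_GRAYSCALE);
    if (mask.empty()) return 1;

    // YOLACT's soft masks become 0/255 after thresholding at half the 8-bit range.
    cv::Mat binary;
    cv::threshold(mask, binary, 127, 255, cv::THRESH_BINARY);

    // Outer contour(s) of the object.
    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(binary, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);

    std::ofstream out("polygons.json");
    out << "[";
    for (size_t i = 0; i < contours.size(); ++i) {
        // Optionally simplify the contour into a coarser polygon.
        std::vector<cv::Point> poly;
        cv::approxPolyDP(contours[i], poly, 2.0, /*closed=*/true);

        out << (i ? "," : "") << "[";
        for (size_t j = 0; j < poly.size(); ++j)
            out << (j ? "," : "") << "[" << poly[j].x << "," << poly[j].y << "]";
        out << "]";
    }
    out << "]\n";
    return 0;
}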

Arrayfire - rendering a heatmap as an image/array with the available colormaps

I'm using ArrayFire to make a 2D heat-transfer simulation. My dataset is a matrix of temperatures and I want to visualize it as a heatmap. I need to produce frames of the colored dataset and save them as images on disk, so each temperature in my dataset has to be mapped to a color according to a certain color scheme.
I found that you can render the dataset in a window with a colormap using fig():
http://blog.accelereyes.com/blog/2013/07/03/arrayfire-examples-part-7-of-8-pde/
I also found the colormaps that are available:
http://arrayfire.org/docs/defines_8h.htm#a553ceda8a1d8946efac3b08e642574ae
My plan so far has been to render the colored dataset using window.image() in a hidden window and then extract an array/image from the result so I can save this result using saveImage(). But I cannot find a way to extract the image rendered by the window.
Is there a better way to do this using the image processing functions? I would like to avoid defining my own color scheme (i.e. writing my own function that maps a temperature to a color).
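One way to avoid grabbing pixels back from a (hidden) window is to do the coloring outside ArrayFire: normalize the temperature matrix, copy it to the host, and let OpenCV's applyColorMap() produce the colored frame before writing it to disk. This substitutes OpenCV's built-in colormaps (COLORMAP_JET below) for ArrayFire's, and saveHeatmapFrame is only an illustrative name; a minimal sketch:

#include <arrayfire.h>
#include <opencv2/opencv.hpp>
#include <string>
#include <vector>

void saveHeatmapFrame(const af::array& temperatures, const std::string& filename) {
    // Scale this frame's temperatures to the 0..255 range.
    const float lo = af::min<float>(temperatures);
    const float hi = af::max<float>(temperatures);
    af::array scaled = 255.0f * (temperatures - lo) / (hi - lo + 1e-12f);

    // ArrayFire stores data column-major; transpose so the host copy is row-major.
    af::array rowMajor = af::transpose(scaled).as(f32);
    std::vector<float> host(rowMajor.elements());
    rowMajor.host(host.data());

    // Wrap the host buffer, convert to 8 bit and apply a ready-made colormap.
    cv::Mat gray32((int)temperatures.dims(0), (int)temperatures.dims(1), CV_32F, host.data());
    cv::Mat gray8, colored;
    gray32.convertTo(gray8, CV_8U);
    cv::applyColorMap(gray8, colored, cv::COLORMAP_JET);
    cv::imwrite(filename, colored);
}

The temperature-to-color mapping is then entirely OpenCV's, so no custom color scheme has to be written; to keep colors comparable across frames, replace the per-frame min/max with fixed bounds.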

OpenCV/C++ - Convert a gray picture to a BGR picture after some image processing

I'm working on a project which consists of stabilizing a video.
During the stabilization process, I had to convert my frames to grayscale in order to use methods like goodFeaturesToTrack() or opticalFlow().
But at the end of the process, after applying my last transformation using warpAffine(), I would like to recover the colour information of the frame, but I'm not able to do this. I tried a few things.
I tried cvtColor(outFrame, outFrame, CV_GRAY2BGR), but it does not work (obviously); the frame is still black and white.
At the beginning of the loop, I split my original picture into its three colour channels B, G, R like this:
Mat channel[3];
split(frameColor, channel);
And then at the end of the process, I do this:
merge(channel,3,outFrame);
So I get the colours of my frame, but not stabilized; it is as if merging the channels had removed all the transformations.
I also tried to use warpAffine() on the colour picture, but I get the same result as above.
Please help me.
Thanks.
I solved my problem.
Actually, when you apply a transformation like warpAffine(), you have to apply it to the previous frame and not the current one. I didn't notice that I was applying it to my current colour frame instead of the previous colour frame, and therefore there was no change.
By applying it to my previous colour frame, the image comes out in colour and stabilized.
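Put differently: keep the grayscale frames only for the measurement side (feature detection and optical flow) and hand a colour frame to warpAffine(), so the warped output keeps its three channels. A minimal sketch of one loop iteration under that scheme, using calcOpticalFlowPyrLK() and estimateAffinePartial2D() as stand-ins for whatever tracking and transform-estimation calls the real pipeline uses; the names prevColor/currColor are illustrative:

#include <opencv2/opencv.hpp>
#include <vector>

cv::Mat stabilizeFrame(const cv::Mat& prevColor, const cv::Mat& currColor) {
    // Grayscale copies are used only to measure the motion.
    cv::Mat prevGray, currGray;
    cv::cvtColor(prevColor, prevGray, cv::COLOR_BGR2GRAY);
    cv::cvtColor(currColor, currGray, cv::COLOR_BGR2GRAY);

    std::vector<cv::Point2f> prevPts, currPts;
    cv::goodFeaturesToTrack(prevGray, prevPts, 200, 0.01, 30);
    if (prevPts.empty()) return currColor.clone();

    std::vector<uchar> status;
    std::vector<float> err;
    cv::calcOpticalFlowPyrLK(prevGray, currGray, prevPts, currPts, status, err);

    // Keep only the successfully tracked points.
    std::vector<cv::Point2f> p0, p1;
    for (size_t i = 0; i < status.size(); ++i)
        if (status[i]) { p0.push_back(prevPts[i]); p1.push_back(currPts[i]); }

    // The transform is estimated from the gray frames but applied to the
    // COLOUR frame, so the stabilized output keeps its three channels.
    cv::Mat T = cv::estimateAffinePartial2D(p1, p0);
    if (T.empty()) return currColor.clone();

    cv::Mat stabilized;
    cv::warpAffine(currColor, stabilized, T, currColor.size());
    return stabilized;
}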

How to detect image location before stitching with OpenCV / C++

I'm trying to merge/stitch 2 images together but found that the default stitcher class in OpenCV could not handle my images.
So I started to write my own.
Unfortunately the images are too large to attach to this message (they are both 12600x9000 pixels in size), so I'll try to explain as well as possible.
The two images are not pictures taken by a camera but TIFF files extracted from a PDF file.
The images themselves are actually CAD drawings, so there are not many gradients in them, which I think is why the default stitcher class could not handle them.
So far, I managed to extract the features and match them.
Also I used the following well known example to stitch them together:
Mat WarpedImage;
cv::warpPerspective(img_2,WarpedImage,homography,cv::Size(2*img_2.cols,2*img_2.rows));
Mat half(WarpedImage,Rect(0,0,img_1.cols,img_1.rows));
img_1.copyTo(half);
I sort of made it fit, but my problem is that in my case the two images could be aligned vertically or horizontally.
By default, all stitching examples on the internet assume the first image is the left image and the second image is the right image.
So my first question would be:
How can I detect whether the second image is to the left, right, above or below the first image, and create a properly sized new image?
Secondly:
Currently I'm getting the proper image; however, because I don't have decent code to determine the ideal width and height of the new image, I end up with a lot of black/empty space in it.
What would be the best C++ code to remove those black areas?
(I'm seeing a lot of Python scripts on the net, but no C++ examples of this, and I have zero Python skills.)
Thank you very much in advance for your help.
Greetings,
Floris.
You can reproject the corners of the second image with perspectiveTransform. With the transformed points you can find the relative position of your images and calculate a new image size that fits both. This also lets you deal with the black areas, since you have the boundaries of the two images.
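A minimal C++ sketch of that suggestion, assuming homography maps img_2 into img_1's coordinate frame (as in the warpPerspective snippet above) and is the CV_64F matrix returned by findHomography; stitchPair is just an illustrative name. The reprojected corners of img_2 and img_1's own corners are collected into one bounding rectangle, everything is shifted by that rectangle's (possibly negative) top-left offset, and the output canvas is allocated at exactly that size, which covers left/right/above/below placement:

#include <opencv2/opencv.hpp>
#include <vector>

cv::Mat stitchPair(const cv::Mat& img_1, const cv::Mat& img_2, const cv::Mat& homography) {
    // Corners of img_2, reprojected into img_1's coordinate frame.
    std::vector<cv::Point2f> corners2 = {
        {0.f, 0.f},
        {(float)img_2.cols, 0.f},
        {(float)img_2.cols, (float)img_2.rows},
        {0.f, (float)img_2.rows}};
    std::vector<cv::Point2f> warped;
    cv::perspectiveTransform(corners2, warped, homography);

    // Bounding rectangle of both images in img_1's frame tells us the relative
    // position (left/right/above/below) and the required canvas size.
    std::vector<cv::Point2f> allCorners = warped;
    allCorners.push_back({0.f, 0.f});
    allCorners.push_back({(float)img_1.cols, (float)img_1.rows});
    cv::Rect bounds = cv::boundingRect(allCorners);

    // Translation that moves the bounding rectangle's top-left corner to (0, 0).
    cv::Mat shift = (cv::Mat_<double>(3, 3) << 1, 0, -bounds.x,
                                               0, 1, -bounds.y,
                                               0, 0, 1);

    cv::Mat canvas;
    cv::warpPerspective(img_2, canvas, shift * homography, bounds.size());
    img_1.copyTo(canvas(cv::Rect(-bounds.x, -bounds.y, img_1.cols, img_1.rows)));
    return canvas;
}

Because the canvas is sized from the actual corner positions, there is no oversized border left to crop; any remaining black pixels are just the parts of the bounding rectangle that neither image covers.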