I am new to visual SLAM (with ORB SLAM). I have setup ORB SLAM3
https://github.com/UZ-SLAMLab/ORB_SLAM3
with a monocular camera. Once I have generated the map, how can I use it? For example, how can I view the 3D points of the map in some software? And how can I get the positions in order to move the robot through some detected features?
Thanks
Karim
Description of the "issue":
I want to use the keypoints (a.k.a. tie points) matched between two successive images from an Apple smartphone using ARKit, but I can't find them.
I can find 3D values in the world reference frame from the ARPointCloud or rawFeaturePoints, but I cannot find the 2D values (i.e. in the image reference frame) for each image of the pair where they were actually detected (probably using some modified SIFT detector or whatever algorithm... in fact, I'd like to know which algorithm is used as well).
Question:
Do you know in which object they are stored or how I can retrieve them?
I'd like to reproject them onto the images taken by the camera in other software (Python, scikit-image, or even OpenCV) to do some processing.
Whatever algorithm ARKit uses to generate feature points is internal and private to ARKit. As such, any intermediary results are equally hidden from public API, and both the algorithm and results are subject to change between iOS releases or across different devices.
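That said, the 3D rawFeaturePoints are expressed in the world frame, and each frame's camera exposes its intrinsics and transform, so you can reproject the points into a given image yourself. Below is a minimal Python/OpenCV sketch of that reprojection, assuming you have exported those values from your app; the file names and array layouts are placeholders, not anything ARKit provides.

```python
# Minimal sketch: reproject ARKit world-space feature points into an image
# with OpenCV. The .npy files are hypothetical exports of rawFeaturePoints,
# camera.intrinsics and camera.transform for one frame.
import numpy as np
import cv2

points_world = np.load("feature_points.npy")      # (N, 3) rawFeaturePoints
K = np.load("intrinsics.npy")                     # (3, 3) camera intrinsics
T_cam_to_world = np.load("camera_transform.npy")  # (4, 4) camera transform

# Invert the camera-to-world pose to get a world-to-camera rotation/translation.
R_wc = T_cam_to_world[:3, :3]
t_wc = T_cam_to_world[:3, 3]
R = R_wc.T
t = -R @ t_wc

# ARKit's camera looks down -Z with +Y up, while OpenCV expects +Z forward
# and +Y down, so flip the Y and Z axes before projecting.
flip = np.diag([1.0, -1.0, -1.0])
R = flip @ R
t = flip @ t

rvec, _ = cv2.Rodrigues(R)
image_points, _ = cv2.projectPoints(points_world.astype(np.float64),
                                    rvec, t.reshape(3, 1),
                                    K.astype(np.float64), None)
print(image_points.reshape(-1, 2))  # pixel coordinates in that frame's image
```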
I have the following problem: given a 3D point cloud, its set of views V with known poses, and a view v ∉ V (i.e. with completely unknown pose), how can I estimate the camera pose matrix of v without running the reconstruction again on V ∪ {v}?
I am trying to solve this in OpenCV 3.2, however any idea, intuition or pseudocode that you can provide me would be very useful. Thanks!
Well, you obviously need to establish image point correspondences between the new view and the old ones via the point cloud, e.g. by taking the image descriptors (SURF, ORB, ...) associated with the projections of the cloud points in the old images and matching them to interest points extracted in the new one.
You can then go through the usual process of removing outliers using the 5- or 8-point algorithm. Once you have good correspondences, you can just use solvePnP on the cloud points and their matched locations in the new image.
Note that this is essentially what VSLAM algorithms do for all "new" images when there is no need to relocalize.
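A rough sketch of this pipeline in Python/OpenCV (just one way to set it up): it assumes you have kept, for every cloud point, an ORB descriptor taken from its projection in one of the old views, and here the RANSAC inside solvePnPRansac stands in for the explicit 5-/8-point outlier rejection step. The arrays and file names below are placeholders.

```python
# Sketch: estimate the pose of a new view from 2D-3D matches against the cloud.
import numpy as np
import cv2

cloud_points = np.load("cloud_points.npy")            # (N, 3) float32 cloud points
cloud_descriptors = np.load("cloud_descriptors.npy")  # (N, 32) uint8 ORB descriptors
K = np.load("K.npy")                                  # (3, 3) camera intrinsics

new_image = cv2.imread("new_view.png", cv2.IMREAD_GRAYSCALE)

# Detect and describe interest points in the new view.
orb = cv2.ORB_create(4000)
keypoints, descriptors = orb.detectAndCompute(new_image, None)

# Match the cloud descriptors against the new view's descriptors.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = matcher.match(cloud_descriptors, descriptors)

object_points = np.float32([cloud_points[m.queryIdx] for m in matches])
image_points = np.float32([keypoints[m.trainIdx].pt for m in matches])

# RANSAC-based PnP rejects the remaining outliers and returns the pose of v.
ok, rvec, tvec, inliers = cv2.solvePnPRansac(object_points, image_points, K, None)
R, _ = cv2.Rodrigues(rvec)
print("world-to-camera rotation:\n", R, "\ntranslation:\n", tvec)
```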
I'm working with SDK 1.8 and I'm getting the depth stream from the Kinect. Now, I want to hold a paper of A4 size in front of the camera and want to get the co-ordinates of the corners of that paper so I can project an image onto it.
How can I detect the corners of the paper and get the co-ordinates? Does Kinect SDK 1.8 provide that option?
Thanks
Kinect SDK 1.8 does not provide this feature itself (to my knowledge). Depending on the language you are coding in, there most certainly are libraries that allow such an operation if you break it down into steps.
OpenCV for example is quite useful in image-processing. When I once worked with the Kinect for object recognition, I used AForge with C#.
I recommend tackling the challenge as follows (a rough OpenCV sketch follows the steps):
Edge Detection:
You will apply an edge detection algorithm such as the Canny filter to the image. First you will probably - depending on the library - transform your depth picture into a greyscale picture. The resulting image will be greyscale as well, and the intensity of a pixel correlates with the probability of it belonging to an edge. Using a threshold, you then binarize this picture to black/white.
Hough Transformation: used to get the position and parameters of a line within an image, which allows further calculation. The Hough transformation is VERY sensitive to its parameters and you will spend a lot of time tuning them to get good results.
Calculation of edge points: Assuming that your Hough transformation was successful, you can now calculate all intersections of the given lines, which will yield the points that you are looking for.
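A rough Python/OpenCV sketch of these three steps; the depth-frame loading, file name, and all thresholds are placeholders you will have to adapt to your setup:

```python
# Sketch: corner candidates from a depth frame via Canny + Hough + intersections.
import numpy as np
import cv2

depth = np.load("depth_frame.npy")  # hypothetical (480, 640) uint16 Kinect frame

# 1. Edge detection: scale to 8-bit greyscale, smooth (the Kinect is noisy),
#    then run the Canny filter (its output is already binarized).
grey = cv2.convertScaleAbs(depth, alpha=255.0 / depth.max())
grey = cv2.GaussianBlur(grey, (5, 5), 0)
edges = cv2.Canny(grey, 50, 150)

# 2. Hough transform: expect a lot of parameter tuning here.
lines = cv2.HoughLines(edges, 1, np.pi / 180, 120)

# 3. Intersect every pair of detected lines to get candidate corner points.
def intersect(l1, l2):
    r1, t1 = l1[0]
    r2, t2 = l2[0]
    A = np.array([[np.cos(t1), np.sin(t1)], [np.cos(t2), np.sin(t2)]])
    if abs(np.linalg.det(A)) < 1e-6:  # nearly parallel lines: no useful corner
        return None
    return np.linalg.solve(A, np.array([r1, r2]))

corners = []
if lines is not None:
    for i in range(len(lines)):
        for j in range(i + 1, len(lines)):
            p = intersect(lines[i], lines[j])
            if p is not None:
                corners.append(p)
print(corners)  # (x, y) pixel coordinates of the line intersections
```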
All of these steps (especially Edge Detection and Hough Transformation) have been asked/answered/discussed in this forum.
If you provide code, intermediate results, or further questions, you can get a more detailed answer.
P.S.
I remember that the Kinect was not that accurate and that noise was an issue. Therefore you might consider applying a filter before doing these operations.
I am new to OpenCV and I am trying to track some moving objects (e.g. cars) in an image sequence. I have computed the optical flow and used it with k-means to attempt something like background subtraction, i.e. separating moving objects from stationary ones. I have also used the intensity of the video as additional information. The following screenshots show the result of the flow and the k-means segmentation, respectively:
The results are not good, but not bad either. How could I proceed from here? I am thinking of trying SURF feature extraction and a SURF detector. Any ideas are welcome.
It seems you are using dense optical flow. I would advise trying some feature detection (SURF, FAST, whatever) followed by sparse optical flow tracking (in my experience it works better than feature matching for this task). Then, once you have the feature correspondences over some frames, you can use the fundamental matrix, the trifocal tensor, plane+parallax, or some other method to detect moving objects. You can later cluster the moving objects into different motion groups that represent different objects.
Also, it seems that your camera is fixed. In this case you can drop the movement detection step, consider only tracks with enough displacement, and then do the clustering into motion groups.
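A minimal Python/OpenCV sketch of that pipeline for the fixed-camera case; the video file name, feature count, and displacement threshold are placeholders:

```python
# Sketch: Shi-Tomasi features + sparse Lucas-Kanade tracking, then keep tracks
# whose displacement is large enough (camera assumed fixed).
import numpy as np
import cv2

cap = cv2.VideoCapture("traffic.mp4")  # placeholder video
ok, prev = cap.read()
prev_grey = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)

# Feature detection (Shi-Tomasi corners; FAST or SURF keypoints would also do).
p0 = cv2.goodFeaturesToTrack(prev_grey, maxCorners=500, qualityLevel=0.01,
                             minDistance=7)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    grey = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Sparse optical flow for the detected features.
    p1, status, err = cv2.calcOpticalFlowPyrLK(prev_grey, grey, p0, None)
    good_new = p1[status.ravel() == 1]
    good_old = p0[status.ravel() == 1]

    # With a fixed camera, tracks with enough displacement belong to moving
    # objects; these points can then be clustered into motion groups.
    displacement = np.linalg.norm(good_new - good_old, axis=-1).ravel()
    moving = good_new[displacement > 2.0]
    print(len(moving), "moving feature points in this frame")

    prev_grey, p0 = grey, good_new.reshape(-1, 1, 2)
```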
I am currently working on a robotics project: a robot must grab a cube using a Kinect camera that performs the cube detection and calculates its coordinates.
I am new to computer vision. I first worked on a static image of a square in order to get a basic understanding. Using C++ and OpenCV, I managed to get the corners (and their x, y pixel coordinates) of the square using smoothing (noise removal), edge detection (the Canny function), line detection (Hough transform), and line intersection (mathematical calculation) on a simplified picture (uniform background).
By adjusting some thresholds I can achieve corner detection, assuming that I have only one square and no line features in the background.
Now to my question: do you have any directions/recommendations/advice/literature about cube recognition algorithms?
What I have found so far involves shape detection combined with texture detection and/or a learning stage. Moreover, in their applications they often use GPU/parallel computing, which I don't have...
The Kinect also provides a depth camera which gives the distance of each pixel from the camera. Maybe I can use this to bypass the "complicated" image processing?
Thanks in advance.
OpenCV 3.0 with contrib includes the surface_matching module.
Cameras and similar devices with the capability of sensation of 3D structure are becoming more common. Thus, using depth and intensity information for matching 3D objects (or parts) are of crucial importance for computer vision. Applications range from industrial control to guiding everyday actions for visually impaired people. The task in recognition and pose estimation in range images aims to identify and localize a queried 3D free-form object by matching it to the acquired database.
http://docs.opencv.org/3.0.0/d9/d25/group__surface__matching.html
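If you build OpenCV with the contrib modules, the surface_matching (point pair features) pipeline is also exposed in Python. A rough sketch, assuming you already have a point-cloud model of the cube and a Kinect scene cloud saved as PLY files with normals; the file names are placeholders and the parameter values are just common starting points:

```python
# Sketch: match a 3D cube model against a Kinect scene with the PPF detector
# from the opencv-contrib surface_matching module.
import cv2

model = cv2.ppf_match_3d.loadPLYSimple("cube_model.ply", 1)    # 1 = with normals
scene = cv2.ppf_match_3d.loadPLYSimple("kinect_scene.ply", 1)

# Train the detector on the model (relative sampling and distance steps).
detector = cv2.ppf_match_3d_PPF3DDetector(0.025, 0.05)
detector.trainModel(model)

# Match against the scene and look at the best pose hypotheses.
results = detector.match(scene, 1.0 / 40.0, 0.05)
for result in results[:5]:
    print("residual:", result.residual)
    print("pose:\n", result.pose)  # 4x4 transform of the cube model in the scene
```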