We have modified sample code for the C API so Tango pose data (position (x,y,z) and quaternion (x,y,z,w)) is published as PoseStamped ROS messages.
We are attempting to visualize the pose using Rviz. The pose data appears to need some transformation as the rotation of the Rviz arrow does not match the behavior of the Tango when we move it around.
We realize that in the sample code, before visualization on the Tango screen, the pose data is transformed into a 4x4 Pose matrix (function PoseData::GetExtrinsicsAppliedOpenGLWorldFrame), which is then multiplied left and right by various matrices representing changes of coordinate frames (for instance, Tango to OpenGL).
Ideally, we would be able to apply a similar transformation to the pose data before publishing it for visualization. However we must keep the pose data in the position (x,y,z) and quaternion (x,y,z,w) format in order to publish it in a PoseStamped message, and we do not see what transform to apply.
We have looked at the Tango coordinate systems conventions but the transformations the Tango developers suggest we apply are only suited for pose data in a Pose matrix format. We have also attempted to apply transformations applied by Ologic in their code to no avail.
Does anyone have any suggestions on how to transform Tango pose data, without changing its format, for correct visualization on the Rviz OpenGL interface?
If it's the OpenGL convention, you basically need to apply a transformation on the left-hand side of the pose data. The C++ motion tracking example has a line doing this operation here. You could ignore the rotation part and just apply the following code:
glm::mat4 opengl_world_T_opengl_camera = tango_gl::conversions::opengl_world_T_tango_world() * start_service_T_device;
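Since the question asks to keep the pose as position (x,y,z) plus quaternion (x,y,z,w), here is a minimal sketch of the same left-multiplication done without a 4x4 matrix. It assumes the frame change is a pure rotation with no translation. The helper names and the example rotation (-90 degrees about X, which I believe is what opengl_world_T_tango_world encodes) are assumptions for illustration, so check them against your own frames:

```python
import math

def quat_multiply(a, b):
    """Hamilton product of two quaternions given as (x, y, z, w)."""
    ax, ay, az, aw = a
    bx, by, bz, bw = b
    return (
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
        aw * bw - ax * bx - ay * by - az * bz,
    )

def rotate_vector(q, v):
    """Rotate the 3-vector v by the unit quaternion q (computes q * v * q^-1)."""
    x, y, z, w = q
    q_conj = (-x, -y, -z, w)
    rx, ry, rz, _ = quat_multiply(quat_multiply(q, (v[0], v[1], v[2], 0.0)), q_conj)
    return (rx, ry, rz)

def change_world_frame(q_frame, position, orientation):
    """Left-multiply a pose by a fixed, rotation-only frame change.

    Returns the new (position, orientation), still in (x,y,z) / (x,y,z,w) form.
    """
    return rotate_vector(q_frame, position), quat_multiply(q_frame, orientation)

# Assumed example frame change: -90 degrees about the X axis.
HALF_ANGLE = math.radians(-90.0) / 2.0
Q_FRAME = (math.sin(HALF_ANGLE), 0.0, 0.0, math.cos(HALF_ANGLE))
```

With this example rotation, a point one unit along +Y in the source frame ends up one unit along -Z in the target frame. The transformed position and quaternion can go straight into a PoseStamped message.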
I know this is a late answer, but it may help other people.
If you want to visualize any data with Rviz, I assume that you want to use ROS. In that case, maybe the best way is to use the rosjava library to build your Tango Android app. It works well for me. You just have to use the PoseStamped, odometry and tf publishers on your Tango device and then display the topics with Rviz. Moreover, it is one of the best ways to keep the real-time aspect.
Here are two good resources for learning how to use rosjava:
https://github.com/ologic/Tango
https://github.com/rosjava/android_core/tree/master
I'm currently trying to "undistort" fisheye imagery using OpenCV in C++. I know the exact lens and camera model, so I figured that I would be able to use this information to calculate some parameters and ultimately convert fisheye images to rectilinear images. However, all the tutorials I've found online encourage using auto-calibration with checkerboards. Is there a way to calibrate the fisheye camera by just using camera + lens parameters and some math? Or do I have to use the checkerboard calibration technique?
I am trying to avoid having to use the checkerboard calibration technique because I am just receiving some images to undistort, and it would be undesirable to have to ask for images of checkerboards if possible. The lens is assumed to retain a constant zoom/focal length for all images.
Thank you so much!
To undistort an image, you need to know the intrinsic parameters of the camera, which describe the distortion.
You can't compute them from datasheet values, because they depend on how the lens is manufactured, and two lenses of the same vendor and model might have different distortion coefficients, especially if they are cheap ones.
Some raster graphics editors embed a lens database from which you can query distortion coefficients. But there is no magic: they built it by measuring the lens distortion and then interpolating the measurements.
But you can still use an empirical method to correct at least the barrel effect.
There are plenty of shaders to do so, and you can always do your own maths to build a distortion map.
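As a rough illustration of "doing your own maths", here is a minimal sketch of one entry of such a distortion map. It assumes an ideal equidistant fisheye (r = f * theta) and an ideal rectilinear output (r = f * tan(theta)) sharing the same focal length in pixels and the same image centre. The function name and parameters are hypothetical, and a real lens will deviate from this model:

```python
import math

def fisheye_source_pixel(u, v, cx, cy, f):
    """For an output (rectilinear) pixel (u, v), return the fisheye pixel to sample.

    Assumes an ideal equidistant fisheye and an ideal pinhole output that share
    the focal length f (in pixels) and the image centre (cx, cy).
    """
    dx, dy = u - cx, v - cy
    r_rect = math.hypot(dx, dy)
    if r_rect == 0.0:
        return (cx, cy)  # the centre maps to itself
    theta = math.atan2(r_rect, f)   # angle of the incoming ray from the optical axis
    r_fish = f * theta              # equidistant model: radius grows linearly with angle
    scale = r_fish / r_rect
    return (cx + dx * scale, cy + dy * scale)
```

Evaluating this for every output pixel gives the two coordinate maps that a remapping routine such as OpenCV's remap expects. You can then tune f by eye until straight lines look straight.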
I am studying about visual odometry and watched Prof. Dr. Cyrill Stachniss' video recordings which are available as YouTube 2015/16 Playlist about Photogrammetry I & II .
First, if I want to create my own dataset (like the KITTI dataset for VO or the Oxford campus dataset), what should the properties of the images that I take with a camera be?
Are they just images? Or do they have some special properties? That is, how can I create my own dataset with a monocular or stereo camera?
Thank you.
To get extrinsic and intrinsic parameters from images, you must have a set of images of a known shape taken from varying views. It's not a trivial task to do on your own, but common CV libraries and solutions have built-in utilities for camera calibration (I have dealt with the OpenCV library and the Matlab CV package, and they are generally the same). Usually it's done with a black and white checkerboard or another simple geometric pattern.
Then with known camera parameters you can manipulate your own dataset.
Matlab camera calibration reference
OpenCV camera calibration tutorials
If you want to benchmark some visual odometry algorithms with your dataset, you will definitely need the intrinsic parameters of your camera as well as its pose.
As said in @f4f's answer, the intrinsic calibration is typically done with some images of a checkerboard that you tilt and rotate (see OpenCV).
This will give you parameters such as focal length, optical center but also the distortion coefficients which can be important depending on your camera.
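To make the role of these parameters concrete, here is a minimal sketch of the ideal pinhole projection they define, ignoring distortion. The function name and the example values are hypothetical:

```python
def project_point(point_cam, fx, fy, cx, cy):
    """Project a 3D point given in the camera frame onto the image plane.

    fx, fy: focal lengths in pixels; (cx, cy): optical centre in pixels.
    Lens distortion is ignored in this sketch.
    """
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (fx * x / z + cx, fy * y / z + cy)
```

For example, with fx = fy = 500 and centre (320, 240), the point (1, 0.5, 2) lands at pixel (570, 365).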
Getting the pose of the camera (i.e. the extrinsic parameters) at each frame is probably trickier. Usually the ground truth is obtained using information from additional sensors (tracking system, IMU, GPS, ...). You can have a look at the TUM RGB-D SLAM Dataset and the corresponding paper. They explain how they used a motion-capture system to get the ground-truth pose.
Recording the time of acquisition of the camera frames can also be interesting (one timestamp per frame).
Creating your own visual odometry dataset is not trivial. If you just want to create a dataset "for fun" or to do some experiments, and you only have a camera available, I would say you can just try some methods that are known to work well (like ORB-SLAM). This will give you a good approximation of the camera poses (you may have to manually fix the unknown scale).
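One simple way to fix the unknown scale by hand, sketched below, is to measure the real distance between two camera positions and rescale the whole estimated trajectory by the ratio. The function and its parameters are hypothetical, and it assumes the trajectory is only wrong by a single global scale factor:

```python
import math

def rescale_trajectory(positions, index_a, index_b, true_distance):
    """Scale estimated camera positions so that two of them are true_distance apart.

    positions: list of (x, y, z) tuples from a monocular method (arbitrary scale).
    index_a, index_b: indices of two frames whose real separation was measured.
    """
    estimated = math.dist(positions[index_a], positions[index_b])
    if estimated == 0.0:
        raise ValueError("the two reference positions coincide")
    scale = true_distance / estimated
    return [tuple(scale * c for c in p) for p in positions]
```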
I want to build a depth camera that can measure the distance to whatever is in the image. I have already read the following links.
http://www.i-programmer.info/news/194-kinect/7641-microsoft-research-shows-how-to-turn-any-camera-into-a-depth-camera.html
https://jahya.net/blog/how-depth-sensor-works-in-5-minutes/
But I couldn't clearly understand which hardware is required and how to integrate it all together.
Thanks
A conventional depth sensor typically relies on an IR sensor, as in the Kinect, the Asus Xtion and other available cameras that provide a depth or range image. However, Microsoft Research has shown, using machine learning techniques and algorithmic modifications, how to do this with an ordinary camera; you can find the research here. Also, here is a video link which shows a mobile camera that has been modified to get depth rendering. But some hardware changes might be necessary if you turn a standalone 2D camera into a depth device. So I would suggest you look at the hardware design of existing devices on the market as well.
One way or another, you need to observe the same points from two angles to get depth. So search for depth sensors and examples, e.g. Kinect with ROS or OpenCV, or here.
You could also convert two camera streams into a point cloud, but that's another story.
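To illustrate why two views of the same point give depth, here is a minimal sketch of the standard relation for a rectified stereo pair, depth = focal length * baseline / disparity. The function name and example numbers are hypothetical:

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth (metres) of a point seen by two parallel, rectified cameras.

    focal_px: focal length in pixels; baseline_m: distance between the cameras
    in metres; disparity_px: horizontal pixel shift of the point between views.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px
```

For example, a 700 px focal length, a 0.12 m baseline and a 42 px disparity give a depth of about 2 m. Nearby points have large disparity and distant points have small disparity.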
Here's what I know:
3D Cameras
RGBD and stereoscopic cameras are popular for these applications but are not always practical or available. I've prototyped with Kinects (v1, v2) and Intel cameras (R200, D435). Those are certainly preferred even today.
2D Cameras
IF YOU WANT TO USE RGB DATA FOR DEPTH INFO then you need to have an algorithm that will process the math for each frame; try an RGB SLAM. A good algo will not process ALL the data every frame but it will process all the data once and then look for clues to support evidence of changes to your scene. A number of BIG companies have already done this (it's not that difficult if you have a big team w big money) think Google, Apple, MSFT, etc etc.
Good luck out there, make something amazing!
Some background:
Hi all! I have a project which involves cloud imaging. I take pictures of the sky using a camera mounted on a rotating platform. I then need to compute the amount of cloud present based on some color threshold. I am able to do this individually for each picture. To completely achieve my goal, I need to do the computation on the whole image of the sky. So my problem lies with stitching several images (about 44-56 images). I've tried using the stitch function on all and on some subsets of the image set, but it returns an incomplete image (some images were not stitched). This could be because of a lack of overlap or something, I dunno. Also, the output image has been distorted weirdly (I am actually expecting the output to be something similar to a picture taken with a fish-eye lens).
The actual problem:
So now I'm trying to figure out the opencv stitching pipeline. Here is a link:
http://docs.opencv.org/modules/stitching/doc/introduction.html
Based on what I have researched, I think this is what I want to do. I want to map all the images to a circular shape, mainly because of the way my camera rotates, or to something else that uses a fairly simple coordinate transformation. So I think I need to get some sort of fixed coordinate transform thing for the images. Is this what they call the homography? If so, does anyone have any idea how I can go about my problem? After this, I believe I need to get a mask for blending the images. Will I need to get a fixed mask like the one I want for my homography?
Am I going through a possible path? I have some background in programming but almost none in image processing. I'm basically lost. T.T
"So I think I need to get some sort of fixed coordinate transform thing for the images. Is this what they call the homography?"
Yes, the homography matrix is the transformation matrix between an original image and the ideal result. It warps an image in perspective so it can fit in stitching to the other image.
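As a rough illustration of what the homography does, here is a minimal sketch that maps one pixel through a 3x3 matrix using homogeneous coordinates and a perspective divide. The function name and the example matrices are hypothetical:

```python
def apply_homography(H, x, y):
    """Map pixel (x, y) through the 3x3 homography H (row-major nested lists)."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity")
    # The divide by w is what produces the perspective warp.
    return (xh / w, yh / w)
```

A matrix whose last row is (0, 0, 1) only shifts, rotates, scales or shears the image. Non-zero entries in the first two positions of the last row introduce the perspective effect. Stitching applies such a mapping to every pixel of one image so that it lines up with its neighbour.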
"If so, does anyone have any idea how I can go about my problem?"
Not with the limited information you provided. It would ease the problem a lot if you knew the order of the pictures (which borders which, i.e. the row and column position of each).
If you have no experience in image processing, I would recommend you use a tutorial covering stitching in detail using more basic functions. There is some important work behind the scenes, and it's not THAT much harder to actually do it yourself.
Start with this example. It stitches two pictures.
http://ramsrigoutham.com/2012/11/22/panorama-image-stitching-in-opencv/
This is a followup to Matt's previous question about camera orientation. I'm working with him on a javascript interface for a python analysis code for 3D hydro simulations.
We've successfully used xtk to build a 3D model of the mesh structure in our simulation. The resulting demo looks a lot like the simple cube demo on the xtk website so your advice based on that demo should be readily portable to our use case.
We were able to infer the view matrix at runtime from the XTK camera object. After a lot of poking and some trial and error, we figured out that the view matrix is really (in OpenGL nomenclature) the model-view matrix: it combines the camera's view and translation with the orientation and translation of the model the camera is looking at.
We are trying to infer the orientation of the camera relative to the (from our point of view) fixed model as we click, drag, and zoom with respect to the model. In the end, we'd like to save a set of keyframes from which we can generate a camera path that will eventually be exported to python to make a 3D volume rendering movie of the simulation data. We've tried a bunch of things but have been unable to invert the model-view matrix to infer the camera's orientation with respect to the model.
Do you have any insight into how this might be done? Is our inference about the view matrix correct or is it actually tracking something different from what I described above?
From our point of view it would be really great if xtk kept track of the camera's up, look, and position vectors with respect to the model so that we could just query for and use the values directly.
Thanks very much for your help with this and for making your visualization toolkit freely available.
This page might be useful, as long as I'm understanding your needs: http://3dgep.com/?p=1700 . It gives a very good explanation of the view matrix, which I needed myself, and about halfway down the page there is a bit of info you could use: http://3dgep.com/?p=1700#Converting_between_Camera_Transformation_and_View_Matrix .
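A minimal sketch of that conversion, assuming the matrix is a rigid transform (rotation plus translation, no scaling) in the OpenGL convention where the camera looks down its own -Z axis. The function name is hypothetical, and the matrix is taken as row-major nested lists, so a column-major array would need transposing first:

```python
def camera_from_view_matrix(view):
    """Recover camera position, up and look vectors from a rigid 4x4 view matrix.

    view: row-major nested lists, with the rotation R in the upper-left 3x3
    block and the translation t in the last column.
    """
    r = [row[:3] for row in view[:3]]
    t = [view[0][3], view[1][3], view[2][3]]
    # Inverting [R | t] gives the camera position -R^T t in model/world space.
    position = tuple(-sum(r[k][i] * t[k] for k in range(3)) for i in range(3))
    up = tuple(r[1])                  # camera +Y axis expressed in model space
    look = tuple(-c for c in r[2])    # camera looks along its own -Z axis
    return position, up, look
```

For a camera placed at (0, 0, 5) and looking at the origin, the view matrix is a translation by (0, 0, -5), and the function returns position (0, 0, 5), up (0, 1, 0) and look (0, 0, -1). If the model is fixed, applying this to the model-view matrix gives the camera pose relative to the model.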