light field rendering from camera array - c++

I'm trying to implement something similar to "Dynamically Reparameterized Light Fields" (Isaksen, McMillan, & Gortler), where the light field is a series of cameras placed on a plane:
In the paper, it is explained that we can find the corresponding camera and pixels with the mapping M^{F→D}_{s,t} = P_{s,t} ∘ T_F, i.e. the desired ray is first mapped onto the focal surface F by T_F, and the resulting point is then projected into data camera (s,t) by P_{s,t}.
The dataset that I'm using doesn't contain any camera parameters or any information about the spacing between the cameras; I only know that they are placed uniformly on a plane. I have a free-moving camera and I render a view quad, so I can get the 3D position of a point on the focal surface, but I don't know how to derive the (s,t,u,v) parameters from it. Once I have those parameters I can render correctly.
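For what it's worth, here is a minimal sketch of the two-plane lookup I have in mind, assuming the camera (s,t) plane is z = 0 spanning [0,1] × [0,1], the focal surface is a parallel plane at z = focalDepth, and GLM for the vector math. None of those conventions come from the dataset, so the extents and the focal depth are values that would have to be chosen or tuned:

```cpp
// Sketch only: two-plane (s,t,u,v) lookup for a camera-array light field.
// Assumes the camera (s,t) plane is z = 0 spanning [0,1] x [0,1] and the
// focal (u,v) plane is z = focalDepth -- both are assumptions, not values
// taken from the dataset.
#include <glm/glm.hpp>

struct RaySample { glm::vec2 st; glm::vec2 uv; };

// Intersect a ray with the plane z = planeZ (assumes dir.z != 0).
static glm::vec3 hitPlaneZ(const glm::vec3& org, const glm::vec3& dir, float planeZ)
{
    float t = (planeZ - org.z) / dir.z;
    return org + t * dir;
}

// For a ray leaving the free-moving camera through the view quad, find where
// it crosses the camera plane and the focal plane.
RaySample parameterizeRay(const glm::vec3& rayOrigin,
                          const glm::vec3& rayDir,
                          float focalDepth)
{
    glm::vec3 onCameraPlane = hitPlaneZ(rayOrigin, rayDir, 0.0f);
    glm::vec3 onFocalPlane  = hitPlaneZ(rayOrigin, rayDir, focalDepth);

    RaySample s;
    s.st = glm::vec2(onCameraPlane); // use this to pick the nearby data cameras
    s.uv = glm::vec2(onFocalPlane);  // point on the focal surface to reproject
    return s;
}
```

For an N × N array, multiplying st by (N - 1) and flooring gives the indices of the four nearest data cameras; projecting the focal-surface point into each of them (which is where the missing intrinsics have to be assumed, e.g. a fixed field of view per camera) plays the role of P_{s,t} ∘ T_F, and blending the four looked-up samples gives the output color.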

Related

How to find overlapping fields of view?

If I have two cameras and I'm given their positions and orientations in the same coordinate system, is there any way I could detect overlapping fields of view? In other words, how could I tell if something that's displayed in the frame of one camera is also displayed in the frame of the other? I'm also given the view and projection matrices of the two cameras.
To detect two overlapping fields of view you'll want to do a collision check between the two viewing frustums (viewing volumes).
A frustum is a convex polyhedron, so you can use the separating axis theorem to do it.
See here.
However, if you just want to know whether an object that is displayed in the frame of one camera is also displayed in the frame of the other, the best way is to transform the object's world-space coordinates into the viewport space of both cameras. If the resulting coordinates land within the range [0, width] × [0, height] for both cameras and the depth is positive (the point lies in front of each camera), then the object is in view of both.
This page has a great diagram of the 3D viewing transformation pipeline if you want to read more on what view space and world space are.
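A minimal sketch of that per-point test, assuming GLM and OpenGL-style projection matrices (clip w positive in front of the camera, NDC in [-1, 1]); the function names are just illustrative:

```cpp
// Test whether a world-space point falls inside a camera's view volume,
// given that camera's view and projection matrices (OpenGL conventions).
#include <glm/glm.hpp>

bool visibleToCamera(const glm::vec3& worldPoint,
                     const glm::mat4& view,
                     const glm::mat4& proj)
{
    glm::vec4 clip = proj * view * glm::vec4(worldPoint, 1.0f);
    if (clip.w <= 0.0f)                        // behind the camera
        return false;

    glm::vec3 ndc = glm::vec3(clip) / clip.w;  // normalized device coordinates
    return ndc.x >= -1.0f && ndc.x <= 1.0f &&  // left/right planes
           ndc.y >= -1.0f && ndc.y <= 1.0f &&  // bottom/top planes
           ndc.z >= -1.0f && ndc.z <= 1.0f;    // near/far planes
}

// The object (or a sample point on it) is seen by both cameras if both tests pass.
bool seenByBoth(const glm::vec3& p,
                const glm::mat4& view1, const glm::mat4& proj1,
                const glm::mat4& view2, const glm::mat4& proj2)
{
    return visibleToCamera(p, view1, proj1) && visibleToCamera(p, view2, proj2);
}
```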

Retrieving occluded faces given a rectangular region

I am trying to do a click-and-drag selection to select all the visible faces of a model (similar to those in 3D modelling software such as Blender).
Initially I was thinking of using line intersection to find all the occluded faces in the scene: for every pixel in the viewport, trace a ray into the scene and find the first intersection. The occluded faces would then be the ones that never got intersected. After experimenting, though, I realized that this method is very slow.
I heard of another method which goes something like:
Assign a unique color to each primitive.
Project all of them onto a virtual plane that coincides with the viewport.
From the projected pixels, if the color corresponding to a primitive is not present, then that primitive is occluded.
The problem is that I have no idea how to go about creating such a "virtual" plane while not revealing it to the end user. Any help, or a better idea for solving this?
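One common way to realize that "virtual plane" is an offscreen framebuffer object: render the ID colors into it with depth testing, read the pixels back, and never present it on screen, so the user never sees it. A rough sketch assuming OpenGL 3+; drawSceneWithIdColors is a placeholder for your own draw call that writes each primitive's ID color, and the byte layout of the IDs is an assumption that has to match it:

```cpp
// Render primitive-ID colors into an offscreen FBO, then read back the pixels
// and collect which IDs survived the depth test (i.e. are visible).
#include <GL/glew.h>
#include <cstdint>
#include <set>
#include <vector>

std::set<uint32_t> visiblePrimitiveIds(int width, int height,
                                       void (*drawSceneWithIdColors)())
{
    GLuint fbo = 0, colorTex = 0, depthRb = 0;
    glGenFramebuffers(1, &fbo);
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);

    // Color attachment that receives the ID colors.
    glGenTextures(1, &colorTex);
    glBindTexture(GL_TEXTURE_2D, colorTex);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, width, height, 0,
                 GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                           GL_TEXTURE_2D, colorTex, 0);

    // Depth attachment so occluded primitives lose their pixels.
    glGenRenderbuffers(1, &depthRb);
    glBindRenderbuffer(GL_RENDERBUFFER, depthRb);
    glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH_COMPONENT24, width, height);
    glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT,
                              GL_RENDERBUFFER, depthRb);

    glViewport(0, 0, width, height);
    glEnable(GL_DEPTH_TEST);
    glClearColor(0.0f, 0.0f, 0.0f, 0.0f);      // 0 is reserved for "background"
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    drawSceneWithIdColors();

    std::vector<uint8_t> pixels(size_t(width) * height * 4);
    glReadPixels(0, 0, width, height, GL_RGBA, GL_UNSIGNED_BYTE, pixels.data());

    std::set<uint32_t> ids;
    for (size_t i = 0; i < pixels.size(); i += 4) {
        // Assumes the draw call packed the ID as R = low byte, then G and B.
        uint32_t id = pixels[i] | (pixels[i + 1] << 8) | (pixels[i + 2] << 16);
        if (id != 0)
            ids.insert(id);
    }

    glBindFramebuffer(GL_FRAMEBUFFER, 0);      // back to the visible framebuffer
    glDeleteRenderbuffers(1, &depthRb);
    glDeleteTextures(1, &colorTex);
    glDeleteFramebuffers(1, &fbo);
    return ids;
}
```

Faces whose ID is missing from the returned set are the occluded (or off-screen) ones; for the click-and-drag selection you would only scan the pixels inside the dragged rectangle instead of the whole buffer.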

How to get 3d coordinates from a 3d object file.

I am using 3 ArUco Markers stuck on a 3D head phantom model to do pose estimation using OpenCV in C++. My algorithm for pose estimation is giving me the translation with respect to the camera, but I want to now know the coordinates of the marker with respect to the model coordinate system. Therefore I have scanned the head model using a 3D scanner and have an object file and the texture file with me. My question is what is the easiest or best way to get the coordinates of the markers with respect to the head model. Should I use OpenGL, blender or some other software for it? Looking for some pointers or advice.
Sounds like you have the coordinates of the markers with the camera as the coordinate system, i.e. coordinates in "eye space" or camera space, which is the space in which the camera sits at the origin.
This article has a brilliant diagram which explains the different spaces and how to transform between them:
http://antongerdelan.net/opengl/raycasting.html
If you want these same coordinates but in model space you need the matrices that will get you in to that space.
In this case you are going from eye/camera space to model space, so you need to multiply those coordinates by the inverse view matrix and then by the inverse model matrix. Your coordinates will then be in model space.
But this is a lot more difficult when you are using a physical camera rather than a software camera such as the one in OpenGL.
To do that you will need to use OpenCV to obtain your camera's intrinsic and extrinsic parameters.
See this tutorial for more details:
https://docs.opencv.org/3.1.0/dc/dbb/tutorial_py_calibration.html
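As a small sketch of the matrix chain described above (assuming GLM, and that you already have the view and model matrices of the virtual scene; for a physical camera they come from the calibration linked above):

```cpp
// Transform a point given in eye/camera space back into the model's own
// coordinate system: inverse view first, then inverse model.
#include <glm/glm.hpp>

glm::vec3 eyeToModelSpace(const glm::vec3& pointInEyeSpace,
                          const glm::mat4& view,
                          const glm::mat4& model)
{
    glm::vec4 world = glm::inverse(view)  * glm::vec4(pointInEyeSpace, 1.0f);
    glm::vec4 local = glm::inverse(model) * world;
    return glm::vec3(local);   // coordinates relative to the head model
}
```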

How to create views from a 360 degree panorama. (like street view)

Given a sphere like this one from Google Street View.
If I wanted to create 4 views (front, left, right and back), how do I do the transformations needed to straighten the image out, as if I were viewing it in Google Street View? Notice the green line I drew in: in the raw image it is bent, but in Street View it is straight. How can I do this?
The Street View image is a spherical (equirectangular) map. Street View and Google Earth work by rendering the scene as if you were standing at the center of a giant sphere. This sphere is textured with an image like the one in your question: the longitude on the sphere corresponds to the x coordinate of the texture and the latitude to the y coordinate.
One way to create the pictures you need is to render the texture on a sphere, like Google Earth does, and then take a screenshot of each of the sides.
A way to do it purely mathematically is to envision yourself at the center of a cube and a sphere at the same time. The images you are looking for are the sides of the cube. If you want to know how a specific pixel in the cube map relates to a pixel in the spherical map, make a vector that points from the center of the cube to that pixel, and then see where that same vector points to on the sphere (latitude and longitude).
I'm sure if you search the web for spherical map cube map conversion you will be able to find more examples and implementations. Good luck!
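A small sketch of that cube-to-sphere lookup for the front face, assuming an equirectangular panorama of panoW × panoH pixels, a square output face with a 90° field of view, and a copyPixel callback standing in for whatever image library you use:

```cpp
// Build the front (+Z) cube face from an equirectangular panorama by mapping
// each face pixel to a direction, then to longitude/latitude, then to a
// source pixel in the panorama.
#include <cmath>

void renderFrontFace(int faceSize, int panoW, int panoH,
                     void (*copyPixel)(int srcX, int srcY, int dstX, int dstY))
{
    const double pi = 3.14159265358979323846;
    for (int y = 0; y < faceSize; ++y) {
        for (int x = 0; x < faceSize; ++x) {
            // Direction from the cube center through this face pixel.
            double dx = 2.0 * (x + 0.5) / faceSize - 1.0;   // [-1, 1] left..right
            double dy = 1.0 - 2.0 * (y + 0.5) / faceSize;   // [-1, 1] bottom..top
            double dz = 1.0;                                 // front face sits at z = 1

            // Direction -> longitude/latitude on the sphere.
            double lon = std::atan2(dx, dz);                        // [-pi, pi]
            double lat = std::atan2(dy, std::sqrt(dx*dx + dz*dz));  // [-pi/2, pi/2]

            // Longitude/latitude -> pixel in the equirectangular panorama.
            int srcX = int((lon / (2.0 * pi) + 0.5) * panoW);
            int srcY = int((0.5 - lat / pi) * panoH);

            copyPixel(srcX, srcY, x, y);
        }
    }
}
```

The left, right and back faces come out of the same loop with the direction vector rotated by ±90° or 180° around the vertical axis before the longitude/latitude conversion.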

How to use a chessboard to find the rotation/translation between 2 cameras

I am using opencv with C, and I am trying to get the extrinsic parameters (Rotation and translation) between 2 cameras.
I'm told that a checkerboard pattern can be used to calibrate, but I can't find any good samples on this. How do I go about doing this?
edit
The suggestions given are for calibrating a single camera with a checkerboard. How would you find the rotation and translation between 2 cameras given the checkerboard images from both views?
I was using code from http://www.starlino.com/opencv_qt_stereovision.html. It has some useful information, and the author's code is pretty easy to understand and analyze; it covers both chessboard calibration and getting a depth image from stereo cameras. I think it's based on this OpenCV book.
The OpenCV library here and about 3 chapters of the OpenCV book cover this.
A picture from a camera is just a projection of a bunch of color samples onto a plane. Assuming that the camera itself creates pictures with square pixels, each sample corresponds to a vector from the camera's origin through the plane the pixel was projected onto. We'll refer to that plane as the picture plane.
One sample doesn't give you that much information. Two samples tell you a little bit more: the position of the camera relative to the plane defined by three points, namely the two sample points and the camera position. A third sample tells you the relative position of the camera in the world; this narrows it down to a single point in space.
If you find the same three samples in another picture taken from a different camera, you can determine the relative position of the cameras from them (and their orientations from the right and up vectors of the picture plane). To get the correct distance (scale), you need to know the real-world distance between the sample points; in the case of a checkerboard, that's the physical dimensions of the checkerboard squares.
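In practice OpenCV does all of this for you: cv::stereoCalibrate takes matched chessboard corners from both cameras and returns R and T between them. A rough sketch with the C++ API (you mention the C API, which has equivalents such as cvStereoCalibrate); the board size, square size and image lists are placeholders for your own data:

```cpp
// Recover the rotation R and translation T from camera 1 to camera 2 using
// chessboard images seen by both cameras at the same time.
#include <opencv2/calib3d.hpp>
#include <vector>

void calibrateStereoPair(const std::vector<cv::Mat>& leftImages,   // grayscale shots from camera 1
                         const std::vector<cv::Mat>& rightImages,  // the same shots from camera 2
                         cv::Size boardSize,                       // inner corner count, e.g. {9, 6}
                         float squareSize,                         // physical square size, e.g. in mm
                         cv::Mat& R, cv::Mat& T)
{
    // The board's own 3D corner coordinates (all on the z = 0 plane), reused per view.
    std::vector<cv::Point3f> board;
    for (int y = 0; y < boardSize.height; ++y)
        for (int x = 0; x < boardSize.width; ++x)
            board.emplace_back(x * squareSize, y * squareSize, 0.0f);

    std::vector<std::vector<cv::Point3f>> objectPoints;
    std::vector<std::vector<cv::Point2f>> pointsLeft, pointsRight;
    for (size_t i = 0; i < leftImages.size(); ++i) {
        std::vector<cv::Point2f> cl, cr;
        bool okL = cv::findChessboardCorners(leftImages[i],  boardSize, cl);
        bool okR = cv::findChessboardCorners(rightImages[i], boardSize, cr);
        if (okL && okR) {                      // keep only views where both cameras saw the board
            pointsLeft.push_back(cl);
            pointsRight.push_back(cr);
            objectPoints.push_back(board);
        }
    }

    // Calibrate each camera on its own first, then solve for the relative pose
    // with the intrinsics held fixed (as the OpenCV docs recommend).
    cv::Size imgSize = leftImages[0].size();
    cv::Mat K1, D1, K2, D2, E, F;
    std::vector<cv::Mat> rvecs, tvecs;
    cv::calibrateCamera(objectPoints, pointsLeft,  imgSize, K1, D1, rvecs, tvecs);
    cv::calibrateCamera(objectPoints, pointsRight, imgSize, K2, D2, rvecs, tvecs);

    cv::stereoCalibrate(objectPoints, pointsLeft, pointsRight,
                        K1, D1, K2, D2, imgSize,
                        R, T, E, F, cv::CALIB_FIX_INTRINSIC);
}
```

The returned R and T bring points from the first camera's coordinate frame into the second camera's, which is exactly the extrinsic relation between the two cameras you are after.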