Virtual PTZ camera via physical fisheye camera and OpenCV - C++

I'm trying to implement a virtual Pan-Tilt-Zoom (PTZ) camera based on data from a physical fisheye camera (180-degree FOV).
I think I need to implement the following sequence:
1. Get the coordinates of the center of the fisheye circle in the coordinate system of the fisheye sensor matrix.
2. Get the radius of the fisheye circle in the same coordinate system.
3. Generate a sphere equation with the same center and radius as the flat fisheye image on the flat camera sensor.
4. Project all colored points from the flat image onto the upper hemisphere.
5. Choose angles in the XY plane and the XZ plane to describe the direction of view of the virtual PTZ.
6. Choose a view angle and mark it with a circle, drawn on the surface of the hemisphere, around the virtual PTZ view vector.
7. Generate the equation of the plane whose intersection with the hemisphere is that circle around the direction of view.
8. Move all colored points from the circle to the plane of that circle, using the direction from the hemisphere surface to the hemisphere center for the projection.
9. Paint all unpainted points inside the projected circle using interpolation (as implemented in cv::remap).
In my opinion the most important step is raising the colored points from the flat image onto the 3-D hemisphere.
My question is:
Would it be correct to simply set the Z coordinate of every colored point of the flat image according to the hemisphere equation, in order to raise the points from the image plane to the hemisphere surface?
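To make the idea concrete, here is a rough sketch of how I imagine the whole pipeline could be collapsed into a single cv::remap call: for every pixel of the virtual PTZ view, compute its 3-D ray on the unit hemisphere and look up the fisheye pixel it came from. The pan/tilt angles, the virtual focal length fv, and the ideal "orthographic" fisheye model (which is what setting z = sqrt(R² − x² − y²) amounts to) are assumptions here, not calibrated values.

```cpp
#include <opencv2/opencv.hpp>
#include <cmath>

// Render one virtual PTZ view from a fisheye frame via cv::remap.
// 'center' / 'R' are the fisheye circle centre and radius on the sensor (pixels),
// 'pan' / 'tilt' the view direction in radians, 'fv' the virtual focal length in pixels.
cv::Mat renderVirtualPTZ(const cv::Mat& fisheye,
                         cv::Point2f center, float R,
                         float pan, float tilt,
                         float fv, cv::Size outSize)
{
    cv::Mat mapX(outSize, CV_32F), mapY(outSize, CV_32F);

    // Rotation turning the optical axis (0,0,1) towards the chosen pan/tilt direction.
    cv::Matx33f Rpan ( std::cos(pan), 0.f, std::sin(pan),
                       0.f,           1.f, 0.f,
                      -std::sin(pan), 0.f, std::cos(pan));
    cv::Matx33f Rtilt(1.f, 0.f,             0.f,
                      0.f, std::cos(tilt), -std::sin(tilt),
                      0.f, std::sin(tilt),  std::cos(tilt));
    cv::Matx33f Rot = Rpan * Rtilt;

    for (int v = 0; v < outSize.height; ++v)
        for (int u = 0; u < outSize.width; ++u)
        {
            // Ray through pixel (u, v) of the virtual pinhole camera.
            cv::Vec3f d(u - outSize.width * 0.5f, v - outSize.height * 0.5f, fv);
            d = cv::normalize(Rot * d);          // point on the unit hemisphere

            // Orthographic fisheye model: sensor position = circle centre + R * (x, y).
            // A real lens (equidistant, equisolid, ...) needs its own model here.
            mapX.at<float>(v, u) = center.x + R * d[0];
            mapY.at<float>(v, u) = center.y + R * d[1];
        }

    cv::Mat out;
    cv::remap(fisheye, out, mapX, mapY, cv::INTER_LINEAR);  // interpolation of step 9
    return out;
}
```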

Related

How to find an object's 3D coordinates (triangulation) given two images and camera positions/orientations

I am given:
Camera intrinsics: the focal length of the pinhole camera in pixels and the resolution of the camera in pixels.
Camera extrinsics: the 3D coordinates (X, Y, Z) of the 2 points where pictures of the object were taken, the heading of the camera at both positions (rotation, in degrees, from the y axis; the camera is level with the x-y plane), and the camera pixel coordinates of the object in each image.
I am not given the rotation and translation matrices for the camera (I have tried figuring these out, but I'm confused about how to do so without knowing the translation of specific points in the camera frame to the 3D coordinate frame).
PS: this is theoretical, so I am not able to use OpenCV, etc.
I tried following the process described in this post: How to triangulate a point in 3D space, given coordinate points in 2 image and extrinsic values of the camera,
but I do not have access to the translation and rotation matrices that all the sources I've looked at use.
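For reference, here is a small plain-C++ sketch (no OpenCV) of one common way to do this without explicit [R|T] matrices: build a world-space viewing ray per camera from the pixel coordinates, focal length and heading, then take the midpoint of the shortest segment between the two rays. The axis conventions used here (y forward at heading 0, z up, heading measured clockwise from +y, principal point at the image centre) are assumptions and may need sign adjustments for your setup.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

static Vec3   sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3   add(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec3   mul(Vec3 a, double s){ return {a.x * s, a.y * s, a.z * s}; }
static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// World-space ray direction of pixel (u, v): image size (w, h), focal length f in
// pixels, heading 'yaw' in radians, measured clockwise from +y (compass style).
Vec3 pixelRay(double u, double v, double w, double h, double f, double yaw)
{
    // Camera frame: x right, y forward (optical axis), z up; image v grows downward.
    Vec3 cam{ u - w / 2.0, f, -(v - h / 2.0) };
    // Rotate about the vertical (z) axis by the heading.
    return { cam.x * std::cos(yaw) + cam.y * std::sin(yaw),
            -cam.x * std::sin(yaw) + cam.y * std::cos(yaw),
             cam.z };
}

// Midpoint of the shortest segment between rays p1 + t*d1 and p2 + s*d2.
Vec3 triangulate(Vec3 p1, Vec3 d1, Vec3 p2, Vec3 d2)
{
    Vec3   r = sub(p1, p2);
    double a = dot(d1, d1), b = dot(d1, d2), c = dot(d2, d2);
    double d = dot(d1, r),  e = dot(d2, r);
    double den = a * c - b * b;            // ~0 when the rays are (nearly) parallel
    double t = (b * e - c * d) / den;
    double s = (a * e - b * d) / den;
    Vec3 q1 = add(p1, mul(d1, t));
    Vec3 q2 = add(p2, mul(d2, s));
    return mul(add(q1, q2), 0.5);          // object position estimate
}
```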

estimation of the ground plane in pinhole camera model

I am trying to understand the pinhole camera model and the geometry behind some computer vision and camera calibration stuff that I am looking at.
So, if I understand correctly, the pinhole camera model maps 3D real-world coordinates to pixel coordinates. So, the model looks like:
y = K [R|T]x
Here y is the pixel position in homogeneous coordinates, [R|T] is the extrinsic transformation matrix, and x is the 3D world point, also in homogeneous coordinates.
Now, I am looking at a presentation which says
project the center of the focus region onto the ground plane using [R|T]
Now, the center of the focus region is just taken to be the center of the image. I am not sure how I can estimate the ground plane. Assuming the point to be projected is in input space, should the projection be computed by inverting the [R|T] matrix and multiplying that point by the inverted matrix?
EDIT
Source here on page 29: http://romilbhardwaj.github.io/static/BuildSys_v1.pdf
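For reference, here is a hedged sketch of one common reading of "project the centre of the focus region onto the ground plane": back-project the image-centre pixel through K⁻¹ into a viewing ray, rotate it into world coordinates with Rᵀ, and intersect the ray with an assumed ground plane Z = 0. The convention x_cam = R·X_world + T (so the camera centre is −Rᵀ·T) and the choice of Z = 0 as the ground plane are assumptions, not something taken from the presentation.

```cpp
#include <opencv2/opencv.hpp>

// Project a pixel (e.g. the image centre) onto the assumed ground plane Z = 0.
cv::Point3d projectPixelToGround(const cv::Matx33d& K,   // intrinsics
                                 const cv::Matx33d& R,   // world-to-camera rotation
                                 const cv::Vec3d&   T,   // world-to-camera translation
                                 cv::Point2d pixel)
{
    // Viewing ray of the pixel in camera coordinates (K^-1 * homogeneous pixel).
    cv::Vec3d rayCam = K.inv() * cv::Vec3d(pixel.x, pixel.y, 1.0);

    // Rotate the ray into world coordinates and compute the camera centre C = -R^T * T.
    cv::Vec3d rayWorld = R.t() * rayCam;
    cv::Vec3d C        = R.t() * (-T);

    // Intersect C + s * rayWorld with the ground plane Z = 0
    // (undefined if the ray is parallel to the ground).
    double s = -C[2] / rayWorld[2];
    cv::Vec3d P = C + rayWorld * s;
    return cv::Point3d(P[0], P[1], P[2]);
}
```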

How to calculate near and far plane for glOrtho in OpenGL

I am using an orthographic projection (glOrtho) for my scene. I implemented a virtual trackball to rotate an object; besides that, I also implemented zoom in/out on the view matrix. Say I have a cube of size 100 units located at position (0, -40000, 0), far from the origin. If the center of rotation is located at the origin, then once the user rotates the cube and zooms in or out, it could end up positioned somewhere around (0, 0, 2500000) (this position is just an assumption; it is calculated after multiplying by the view matrix). Currently I define a very big range for the near (-150000) and far (150000) planes, but sometimes the object still lies outside either the near or the far plane and just turns invisible. If I define even larger near and far clipping planes, say -1000000 and 1000000, it produces ugly z-buffer artifacts. So my question is: how do I correctly calculate the near and far planes while the user rotates the object in real time? Thanks in advance!
Update:
I have implemented a bounding sphere for the cube. I use the inverse of the view matrix to calculate the camera position and compute the distance of the camera position from the center of the bounding sphere (the center of the bounding sphere is transformed by the view matrix). But I couldn't get it to work. Can you further explain what the relationship is between the camera position and the near plane?
A simple way is to use a "bounding sphere". If you know the data's bounding box, its maximum diagonal length is the diameter of the bounding sphere.
Let's say you calculate the distance 'dCC' from the camera position to the center of the sphere. Let 'r' be the radius of that sphere. Then:
Near = dCC - r - smallMargin
Far = dCC + r + smallMargin
'smallMargin' is a value used just to avoid clipping points on the surface of the sphere due to numerical precision issues.
The center of the sphere should be the center of rotation. If not, the diameter should grow so as to cover all data.
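A minimal sketch of that recipe, assuming GLM for the vector math (any vector/matrix library works) and placeholder ortho extents:

```cpp
#include <glm/glm.hpp>
#include <GL/gl.h>

// Set up glOrtho so the near/far planes tightly enclose a bounding sphere.
// 'view' is the current view matrix; (sphereCenter, sphereRadius) is the bounding
// sphere in world space; left/right/bottom/top are your usual ortho extents.
void applyOrtho(const glm::mat4& view,
                const glm::vec3& sphereCenter, float sphereRadius,
                double left, double right, double bottom, double top)
{
    // Camera position in world space = translation column of the inverted view matrix.
    glm::vec3 camPos = glm::vec3(glm::inverse(view)[3]);

    // Distance from the camera to the sphere centre. (Strictly, glOrtho measures depth
    // along the view direction, so -(view * glm::vec4(sphereCenter, 1.0f)).z is an
    // alternative for dCC when the centre is far off-axis.)
    float dCC = glm::distance(camPos, sphereCenter);

    float smallMargin = 0.01f * sphereRadius;   // absorbs numerical precision issues
    float zNear = dCC - sphereRadius - smallMargin;
    float zFar  = dCC + sphereRadius + smallMargin;

    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    glOrtho(left, right, bottom, top, zNear, zFar);
}
```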

Screen space bounding box computation in OpenGL

I'm trying to implement the tiled deferred rendering method and now I'm stuck. I'm computing min/max depth for each tile (32x32) and storing it in a texture. Then I want to compute a screen-space bounding box (bounding square), represented by the lower-left and upper-right coordinates of a rectangle, for every point light (sphere) in my scene (see the picture from my app). This, together with the min/max depth, will be used to check whether a light affects the current tile.
Problem is I have no idea how to do this. Any idea, source code or exact math?
Update
Screen-space is basically a 2D entity, so instead of a bounding box think of a bounding rectangle.
Here is a simple way to compute it:
1. Project the 8 corner points of your world-space bounding box onto the screen using your ModelViewProjection matrix.
2. Find the bounding rectangle of these projected points (which is just the min/max X and Y coordinates of the points).
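A short sketch of this simple 8-corner method for a point light of radius r, assuming GLM for the math; the light's world-space box is taken as centre ± radius, and corners behind the near plane are simply skipped to keep the sketch small:

```cpp
#include <glm/glm.hpp>
#include <algorithm>

struct Rect2D { float minX, minY, maxX, maxY; };

// Screen-space bounding rectangle of a point light's world-space AABB (centre +/- radius).
Rect2D lightScreenRect(const glm::mat4& mvp, glm::vec3 center, float radius,
                       float screenW, float screenH)
{
    Rect2D r{ 1e30f, 1e30f, -1e30f, -1e30f };

    for (int i = 0; i < 8; ++i)
    {
        // One of the 8 corners of the light's world-space bounding box.
        glm::vec3 corner = center + radius * glm::vec3((i & 1) ? 1.f : -1.f,
                                                       (i & 2) ? 1.f : -1.f,
                                                       (i & 4) ? 1.f : -1.f);
        glm::vec4 clip = mvp * glm::vec4(corner, 1.0f);

        // Corners behind the near plane need proper clipping; skipped here for brevity.
        if (clip.w <= 0.0f) continue;

        glm::vec2 ndc = glm::vec2(clip) / clip.w;                      // [-1, 1]
        glm::vec2 win = (ndc * 0.5f + 0.5f) * glm::vec2(screenW, screenH);

        r.minX = std::min(r.minX, win.x);  r.minY = std::min(r.minY, win.y);
        r.maxX = std::max(r.maxX, win.x);  r.maxY = std::max(r.maxY, win.y);
    }
    return r;
}
```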
A more sophisticated way can be used to compute a screen-space bounding rectangle for a point light source. We calculate four planes that pass through the camera position and are tangent to the light's sphere of illumination (the light radius). The intersection of each tangent plane with the image plane gives us 4 lines on the image plane. These lines define the resulting bounding rectangle.
Refer to this article for math details: http://www.altdevblogaday.com/2012/03/01/getting-the-projected-extent-of-a-sphere-to-the-near-plane/

How do you find the vertices of a rotated rectangle from its center?

I have a rectangle that I've rotated around its center by an angle. How can I derive the vertices of the rectangle?
Apply the rotation matrix to the vertices.
For example, suppose the origin is at the center of your rectangle and the coordinates of a vertex are given by v.x and v.y.
Then the new coordinates for this vertex are given by:
v_new.x = v.x * cos(angle) - v.y * sin(angle)
v_new.y = v.x * sin(angle) + v.y * cos(angle)
(assuming counter-clockwise rotation)
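A small self-contained sketch of that formula, assuming the rectangle is described by its centre, width, height, and a counter-clockwise angle in radians:

```cpp
#include <array>
#include <cmath>

struct Pt { double x, y; };

// Four vertices of a rectangle centred at (cx, cy), rotated CCW by 'angle' radians.
std::array<Pt, 4> rotatedRectVertices(double cx, double cy,
                                      double w, double h,
                                      double angle)
{
    const double c = std::cos(angle), s = std::sin(angle);
    const std::array<Pt, 4> local = { Pt{-w / 2, -h / 2}, Pt{ w / 2, -h / 2},
                                      Pt{ w / 2,  h / 2}, Pt{-w / 2,  h / 2} };
    std::array<Pt, 4> out{};
    for (int i = 0; i < 4; ++i)
    {
        // v_new = R(angle) * v, then translate back to the original centre.
        out[i].x = cx + local[i].x * c - local[i].y * s;
        out[i].y = cy + local[i].x * s + local[i].y * c;
    }
    return out;
}
```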
Translate so that the rectangle's center is at (0,0). View each vertex not as part of the rectangle, but as a point on a circle centered at the rectangle's center, with the segment from the center to the vertex as its radius. Then you are solving a different problem: given a circle at the origin and a point on that circle, what is the point rotated (angle) degrees around the circle? I'll leave looking up the appropriate algorithm for that one to you :)