Draw camera position using OpenCV - c++

I am quite new to OpenCV so please forgive me if I am asking something obvious.
I have a program that gives me position and rotation of moving camera. But to be sure if my program works correctly I want to draw those results in 3 coordinate system.
I also have camera projection matrix
Camera matrix: [1135,52 0 1139,49
0 1023,50 543,50
0 0 1]
example how my result looks (calculated camera position):
Position = [ 0,92725 0,041710 -0,372177 0,0803997
-0,0279857 -0,983288 -0,1179896 -0,0466907
0,373459 -0,177219 0,910561, 1,19969
0 0 0 1 ]
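For reference, a minimal sketch of how such a pose could be turned into something drawable, using plain OpenCV types. It assumes the 4x4 matrix above maps camera coordinates to world coordinates (if it is the world-to-camera transform, invert it first); the 0.1 axis length is an arbitrary choice:

#include <opencv2/core.hpp>
#include <iostream>

int main() {
    // The calculated camera pose from above (decimal points instead of commas).
    cv::Matx44d pose( 0.92725,    0.041710, -0.372177,   0.0803997,
                     -0.0279857, -0.983288, -0.1179896, -0.0466907,
                      0.373459,  -0.177219,  0.910561,   1.19969,
                      0.0,        0.0,        0.0,        1.0);

    // Camera centre in world coordinates = translation column.
    cv::Vec3d centre(pose(0, 3), pose(1, 3), pose(2, 3));

    // The columns of the rotation block are the camera's x/y/z axes expressed
    // in world coordinates; short segments from the centre along them give the
    // usual axis glyph.
    for (int a = 0; a < 3; ++a) {
        cv::Vec3d axis(pose(0, a), pose(1, a), pose(2, a));
        cv::Vec3d tip(centre[0] + 0.1 * axis[0],
                      centre[1] + 0.1 * axis[1],
                      centre[2] + 0.1 * axis[2]);   // 0.1 = arbitrary glyph length
        std::cout << "axis " << a << ": " << centre << " -> " << tip << std::endl;
    }
    return 0;
}

The three segments (centre to tip) can then be drawn with any 3D plotting tool, or with OpenCV's viz module if it is available in your build.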

Related

GLSL fragment shader - draw simple thick curve

I am trying to draw a very simple curve in just a fragment shader where there is a horizontal section, a transition section, then another horizontal section. It looks like the following:
My approach:
Rather than using Bezier curves (which would make handling thickness more complicated), I tried to take a shortcut. Basically, I just use one smoothstep to transition between the horizontal segments, which gives a decent curve. To compute the thickness of the curve, for any given fragment x I compute the y, and ultimately the coordinate (x, y) of where on the line we should be. Unfortunately, this isn't computing the shortest distance to the curve, as seen below.
Below is a diagram to perhaps help explain the function I am having trouble with.
// start is a 2D point where the line will start
// end is a 2D point where the line will end
// transition_x is the x position where we use a smoothstep to transition between points
float CurvedLine(vec2 start, vec2 end, float transition_x) {
    // Set up variables for positioning the line
    float curve_width_frac = bendWidth;                        // how wide the S bend should be
    float thickness = abs(end.x - start.x) * curve_width_frac; // normalize
    float start_blend = transition_x - thickness;
    float end_blend = transition_x + thickness;

    // for the current fragment, if you draw a line straight up, what's the first point it hits?
    float progress_along_line = smoothstep(start_blend, end_blend, frag_coord.x);
    vec2 point_on_line_from_x = vec2(frag_coord.x, mix(start.y, end.y, progress_along_line)); // given an x, this is the y

    // convert to application-specific units since they are a little odd
    vec2 nearest_coord = point_on_line_from_x * dimensions;
    vec2 rad_as_coord = rad * dimensions;

    // return pseudo distance function where 1 is inside and 0 is outside
    return 1.0 - smoothstep(lineWidth * dimensions.y, lineWidth * 1.2 * dimensions.y,
                            distance(nearest_coord, rad_as_coord));
    // return mix(vec4(1.0), vec4(0.0), s);
}
I am familiar with computing the shortest distance to a line or line segment, but I am not sure how to tackle it with this curved segment. Any suggestions would be greatly appreciated.
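For reference, the usual point-to-segment distance the question alludes to, written here as a plain C++ sketch (hypothetical helper names; the same math ports one-to-one to GLSL with vec2, dot, clamp and length). The difficulty is precisely that the smoothstep curve is not a straight segment, which is what the answer below works around:

#include <algorithm>
#include <cmath>

struct Vec2 { float x, y; };

static float Dot(Vec2 a, Vec2 b) { return a.x * b.x + a.y * b.y; }
static Vec2  Sub(Vec2 a, Vec2 b) { return { a.x - b.x, a.y - b.y }; }

// Shortest distance from point p to the segment [a, b]:
// project p onto the segment's direction, clamp to the segment, measure.
float DistToSegment(Vec2 p, Vec2 a, Vec2 b) {
    Vec2 ab = Sub(b, a), ap = Sub(p, a);
    float t = std::clamp(Dot(ap, ab) / Dot(ab, ab), 0.0f, 1.0f);
    Vec2 closest = { a.x + t * ab.x, a.y + t * ab.y };
    return std::hypot(p.x - closest.x, p.y - closest.y);
}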
I would do this in 2 passes:
render thin curve
do not use the target colors yet, but black-and-white/grayscale instead ... a black background with white lines will make the next step easier.
smooth the original image and threshold
So simply use any FIR smoothing or Gaussian blur that will bleed the colors up to half of your thickness distance. After this, just threshold the result against the background and recolor to the wanted colors. The smoothing needs the rendered image from #1 as input. You can use a simple convolution with a circular mask:
0 0 0 1 1 1 0 0 0
0 0 1 1 1 1 1 0 0
0 1 1 1 1 1 1 1 0
1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1
0 1 1 1 1 1 1 1 0
0 0 1 1 1 1 1 0 0
0 0 0 1 1 1 0 0 0
Btw. the color intensity after a convolution like this will be a function of the distance from the center, so it can be used as a texture coordinate or shading parameter if you want ...
Instead of a convolution matrix you can also use 2 nested for loops:
// convolution over a circular neighborhood of radius r (in pixels)
vec4 col = vec4(0.0);
for (int y = -r; y <= r; y++)
    for (int x = -r; x <= r; x++)
        if ((x*x) + (y*y) <= r*r)
            col += texture2D(sampler, vec2(x0 + float(x)*mx, y0 + float(y)*my));

// threshold & recolor
if (col.r > threshold) col = col_curve;      // assuming the 1st pass used the red channel
else                   col = col_background;
where x0,y0 is your fragment position in the texture and mx,my scale from pixels to texture coordinates. You also need to handle the edge cases where x0+x and y0+y fall outside your texture.
Beware: the thicker the curve, the slower this gets ... For higher thicknesses it is faster to apply a smaller-radius smoothing a few times (more passes).
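If it helps to prototype the blur-and-threshold idea outside the shader, here is a rough CPU-side equivalent with OpenCV (the function name and the threshold value are illustrative assumptions, not part of the answer):

#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>

// thinBW: 8-bit single-channel image with the thin curve drawn white on black.
cv::Mat ThickenCurve(const cv::Mat& thinBW, int thicknessPx,
                     const cv::Scalar& curveColor, const cv::Scalar& bgColor) {
    // Pass 2a: blur so the white line bleeds out to roughly the wanted thickness.
    cv::Mat blurred;
    int k = thicknessPx | 1;                       // Gaussian kernel size must be odd
    cv::GaussianBlur(thinBW, blurred, cv::Size(k, k), 0);

    // Pass 2b: threshold against the background and recolor.
    cv::Mat mask = blurred > 10;                   // 10 = arbitrary threshold
    cv::Mat out(thinBW.size(), CV_8UC3, bgColor);
    out.setTo(curveColor, mask);
    return out;
}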
Here are some related QAs that cover some of the steps:
OpenGL Scale Single Pixel Line for multi pass (old api)
How to implement 2D raycasting light effect in GLSL scanning input texture

How do we extract view up vector, view angle and focal point from a view transform matrix?

I have a sensor (a tracker) whose rotation and position with respect to world coordinates are known. I am rendering a 3D model of the world using a VTK camera.
I have already registered the model in world coordinates.
Based on the tracker's rotation and position with respect to world coordinates, the relative pose between the tracker and the world is known. So if I can reorient the VTK camera to put it in the same relative position and orientation to the rendered 3D model as the tracker has with respect to the world, I will get a 3D model aligned with the world.
My intuition says that this can be done by changing the VTK camera's view up vector, view angle, focal point and position.
trackerMat is a vtkMatrix4x4 that holds the rotation and position of the tracker in world coordinates. camera is the vtkCamera and renderer is the vtkRenderer.
record.x, record.y and record.z are real time positions of the tracker in world coordinates. record.a, record.e and record.r are the azimuth, elevation and roll of the tracker.
Basically this is what I want to do:
vtkSmartPointer<vtkMatrix4x4> trackerMat = vtkSmartPointer<vtkMatrix4x4>::New();
vtkSmartPointer<vtkCamera> camera = vtkSmartPointer<vtkCamera>::New();
vtkSmartPointer<vtkRenderer> renderer = vtkSmartPointer<vtkRenderer>::New();
vtkSmartPointer<vtkRenderWindow> renderWindow = vtkSmartPointer<vtkRenderWindow>::New();
camera->SetPosition(record.x, record.y, record.z);
camera->SetFocalPoint(f.x, f.y, f.z);
camera->SetViewUp(viewUp.x, viewUp.y, viewUp.z);
camera->SetViewAngle(viewAngle);
renderer->SetActiveCamera(camera);
renderWindow->AddRenderer(renderer);
renderWindow->Render();
I tried setting the vtkCamera azimuth, elevation and roll to the same values as the tracker's, but it gave bizarre results.
This is why I thought I should set the focal point, view up vector and view angle of the vtkCamera rather than the angles. The vtkCamera position can easily be set to the same position as the tracker. Is there some way I can extract the view up vector, focal point and view angle from trackerMat? My renderWindow size is 1280x720.
UPDATE:
I ran a series of trials, setting the vtkCamera focal point to different positions to see how the view matrix changed:
Case 1:
camera->SetPosition(0,0,-5);
camera->SetFocalPoint(0,0,0);
vtkSmartPointer<vtkMatrix4x4> viewMat = vtkSmartPointer<vtkMatrix4x4>::New();
viewMat = camera->GetViewTransformMatrix();
In this case viewMat is:
-1  0  0  0
 0  1  0  0
 0  0 -1 -5
 0  0  0  1
Case 2:
camera->SetPosition(0,0,-5);
camera->SetFocalPoint(0,0,3);
vtkSmartPointer<vtkMatrix4x4> viewMat = vtkSmartPointer<vtkMatrix4x4>::New();
viewMat = camera->GetViewTransformMatrix();
In this case viewMat is:
-1  0  0  0
 0  1  0  0
 0  0 -1 -5
 0  0  0  1
So when the focal point is changed only in the z-direction, the viewMat remains the same. Makes sense.
Case 3:
camera->SetPosition(0,0,-5);
camera->SetFocalPoint(3,3,3);
vtkSmartPointer<vtkMatrix4x4> viewMat = vtkSmartPointer<vtkMatrix4x4>::New();
viewMat = camera->GetViewTransformMatrix();
In this case viewMat is:
-0.936 0 0.351 1.755
-0.116 0.943 -0.310 -1.551
-0.331 -0.331 -0.883 -4.417
0 0 0 1
If I index the viewMat rows by 0 to 3 and the columns by 0 to 3, then what I see is that viewMat(0,3), viewMat(1,3) and viewMat(2,3) do not always correspond to the vtkCamera position in world coordinates.
Direction of projection = unit vector pointing in the direction from camera position to focal point.
It seems that -viewMat(2,0), -viewMat(2,1) and -viewMat(2,2) always correspond to the direction of projection.
If you orthogonalize your view up vector so that it is always perpendicular to the direction of projection by calling:
camera->OrthogonalizeViewUp();
then viewMat(1,0), viewMat(1,1) and viewMat(1,2) always correspond to the viewUp vector of the vtkCamera.
As far as I know, the translation vectors, viewMat(0,3), viewMat(1,3) and viewMat(2,3), should give your world origin in camera coordinates. But it does not seem so in vtk.
Transformation matrices are essentially the basis vectors of the destination space as seen from the source space. So the information you're interested in is available, ready to use, in the view matrix, in the upper-left 3×3 submatrix to be precise, as either its rows or its columns, depending on which mapping you want (for an orthogonal matrix, and the upper-left 3×3 of a view matrix should be orthogonal, the transpose is the inverse, so the rows are the inverse of the columns).
Note that there's no such thing as a "focal point" in a view transformation; there are just the directions Right, Up, and View. But that's exactly what you need.
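As a concrete sketch of how that could look with the vtkCamera from the question: which tracker axis is treated as "forward" and which as "up" is an assumption here and may need to be permuted or negated for a particular tracker, and focalDistance is just an arbitrary working distance:

#include <vtkCamera.h>
#include <vtkMatrix4x4.h>

// trackerMat is assumed to be the tracker-to-world transform: its upper-left
// 3x3 columns are the tracker axes in world coordinates, its last column the
// tracker position.
void ApplyTrackerPose(vtkMatrix4x4* trackerMat, vtkCamera* camera,
                      double focalDistance) {
    double px = trackerMat->GetElement(0, 3);
    double py = trackerMat->GetElement(1, 3);
    double pz = trackerMat->GetElement(2, 3);

    // Assumption: the tracker's +z column is the viewing direction, -y is "up".
    double dir[3] = {  trackerMat->GetElement(0, 2),
                       trackerMat->GetElement(1, 2),
                       trackerMat->GetElement(2, 2) };
    double up[3]  = { -trackerMat->GetElement(0, 1),
                      -trackerMat->GetElement(1, 1),
                      -trackerMat->GetElement(2, 1) };

    camera->SetPosition(px, py, pz);
    camera->SetFocalPoint(px + focalDistance * dir[0],
                          py + focalDistance * dir[1],
                          pz + focalDistance * dir[2]);
    camera->SetViewUp(up);
    camera->OrthogonalizeViewUp();

    // The view angle is independent of the pose; for a 720-pixel-high window it
    // would come from the vertical focal length fy (in pixels) of the camera
    // being emulated, e.g. camera->SetViewAngle(2.0 * atan(0.5 * 720.0 / fy) * 180.0 / M_PI);
}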

OpenGL custom rendering pipeline: Perspective matrix

I am attempting to work in LWJGL to display a simple quad using my own matrices. I've been looking around for a while and have found a few perspective matrix implementations, these two in particular:
[cot(fov/2)/a 0 0 0]
[0 cot(fov/2) 0 0]
[0 0 -f/(f-n) -1]
[0 0 -f*n/(f-n) 0]
and:
[cot(fov/2)/a 0 0 0]
[0 cot(fov/2) 0 0]
[0 0 -(f+n)/(f-n) -1]
[0 0 -(2*f*n)/(f-n) 0]
Both of these provide the same effect, as expected (I got them from here and here, respectively). The issue is in my understanding of how multiplying this by the modelview matrix, then by a vertex, then dividing each x, y, and z value by its w value gives a screen coordinate. More specifically, if I multiply either of these by the modelview matrix and then by the vertex (10, 10, 0, 1), it gives w=0. That in itself is a big smack in the face. I conclude that either the matrices are wrong, or I am missing something completely. In my actual test program, the vertices don't even end up on screen, even though a camera position of (0,0,0) with no rotation should make them visible. I have even tried many different z values, positive and negative, to see if it was just a clipping plane. Am I missing something here?
EDIT: After a lot of checking over, I've narrowed down the problem I am facing. The biggest issue is that the z-axis does not appear to be remapped to the range I specify (n to f). Any object just zooms in or out a little bit when I translate it along the z-axis, then pops out of existence as it moves outside the range [-1, 1]. I think this is also making me more confused. I set my far plane to 100 and my near plane to 0.1, and it behaves like anything but.
Both of these provide the same effect, as expected
While the second projection matrix form is very standard, the first one gives a different effect. If you have z==1 and w==0, the projection will be:
Matrix 1: (-f/(f-n)) / (-f*n/(f-n)) = f / (f*n) = 1 / n
Matrix 2: (-(f+n)/(f-n)) / (-(2*f*n)/(f-n)) = (f+n) / (2*f*n)
The result is clearly different. You should always use the second form.
if I multiply either of these by the modelview matrix then by a vertex (10, 10, 0, 1), it gives a w=0. That in itself is a big smack in the face
For a focal length d the projection is computed as (ignoring aspect ratio):
x'= d*x/z = x / w
y'= d*y/z = y / w
where
w = z / d
If you have z==0, this means you are trying to project a point that is at the eye; only points beyond d are visible. In practice this point will be clipped, because z is not within the range between n (near) and f (far) (n is expected to be a positive constant).
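To make the divide concrete, here is a tiny standalone sketch. It uses the second matrix in the column-vector convention p' = P * p (i.e. the transpose of the matrices as written in the question) and applies it to the eye-space point (10, 10, 0, 1), showing that w_clip = -z_eye, which is 0 here:

#include <cmath>
#include <cstdio>

int main() {
    const float fov = 60.0f * 3.14159265f / 180.0f;   // vertical field of view
    const float a = 16.0f / 9.0f, n = 0.1f, f = 100.0f;
    const float c = 1.0f / std::tan(fov * 0.5f);       // cot(fov/2)

    // Second (standard) projection matrix, row-major, column-vector convention.
    const float P[4][4] = {
        { c / a, 0,  0,                   0                      },
        { 0,     c,  0,                   0                      },
        { 0,     0, -(f + n) / (f - n),  -2.0f * f * n / (f - n) },
        { 0,     0, -1,                   0                      }
    };

    const float v[4] = { 10.0f, 10.0f, 0.0f, 1.0f };   // eye-space point with z = 0
    float clip[4] = { 0, 0, 0, 0 };
    for (int r = 0; r < 4; ++r)
        for (int k = 0; k < 4; ++k)
            clip[r] += P[r][k] * v[k];

    // The last row gives w_clip = -z_eye, so z_eye == 0 yields w_clip == 0 and
    // the perspective divide is undefined; only points with z_eye in [-f, -n]
    // (in front of the camera, between the near and far planes) survive clipping.
    std::printf("clip = (%g, %g, %g, %g)\n", clip[0], clip[1], clip[2], clip[3]);
    return 0;
}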

Model matrix in 3D graphics / OpenGL

I'm following some tutorials to learn OpenGL (from www.opengl-tutorial.org if it makes any difference) and there is an exercise that asks me to draw a cube and a triangle on the screen. As a hint, it says I'm supposed to calculate two MVP matrices, one for each object. The MVP matrix is given by Projection*View*Model, and as far as I understand, the projection and view matrices are the same for all the objects on the screen (they are only affected by my choice of "camera" location and settings). However, the model matrix should change, since it's supposed to give me the position and rotation of the object in global coordinates. Following the tutorials, for my cube the model matrix is just the identity matrix, since it is located at the origin and there's no rotation or scaling. Then I draw my triangle so that its vertices are at (2,2,0), (2,3,0) and (3,2,0). Now my question is: what is the model matrix for my triangle?
My own reasoning says that if I don't want to rotate or scale it, the model matrix should be just a translation matrix. But what determines the translation coordinates here? Should it be the location of one of the vertices, or the center of the triangle, or what? Or have I completely misunderstood what the model matrix is?
The model matrix is, like the other matrices (projection, view), a 4x4 matrix with the same layout. Depending on whether you're using column or row vectors, the matrix consists of the x, y, z axes of your local frame and a vector t1, t2, t3 specifying the translation part.
So for a column vector p the transformation matrix M looks like:
x1, x2, x3, t1,
y1, y2, y3, t2,
z1, z2, z3, t3,
0, 0, 0, 1
p' = M * p
For row vectors you can work out yourself how the matrix layout must be; also note that with row vectors, p' = p * M.
If you have no rotational component, your local frame has the usual x, y, z axes as the rows of the 3x3 submatrix of the model matrix:
1 0 0 t1 -> x axis
0 1 0 t2 -> y axis
0 0 1 t3 -> z axis
0 0 0 1
The fourth column specifies the translation vector (t1,t2,t3). If you have a point p =
1,
0,
0,
1
in a local coordinate system and you want to translate it by +1 in the z direction to place it in the world coordinate system, the model matrix is simply:
1 0 0 0
0 1 0 0
0 0 1 1
0 0 0 1
p' = M * p, where p' is the transformed point in world coordinates.
For your example above, you could simply specify the triangle at (2,2,0), (2,3,0) and (3,2,0) in your local coordinate system; then the model matrix is trivially the identity. Otherwise you have to work out how to compute the rotation etc. I recommend reading the first few chapters of Mathematics for 3D Game Programming and Computer Graphics. It's a very accessible 3D math book, and it gives you the minimal information you need to handle most of the 3D graphics math.
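To make that concrete, here is a short sketch of the two equivalent options for the triangle, using GLM as the opengl-tutorial.org code does (the local vertex positions in option B are just one possible choice):

#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// Option A: bake the world positions into the vertex data, identity model matrix.
const glm::vec3 triWorld[3] = { glm::vec3(2, 2, 0), glm::vec3(2, 3, 0), glm::vec3(3, 2, 0) };
const glm::mat4 modelA = glm::mat4(1.0f);

// Option B: author the triangle around the origin and move it into place with a
// translation-only model matrix (any local frame works, as long as the model
// matrix maps it to the wanted world positions).
const glm::vec3 triLocal[3] = { glm::vec3(0, 0, 0), glm::vec3(0, 1, 0), glm::vec3(1, 0, 0) };
const glm::mat4 modelB = glm::translate(glm::mat4(1.0f), glm::vec3(2.0f, 2.0f, 0.0f));

// Either way the MVP for this object is projection * view * model, and the
// cube keeps its own (identity) model matrix.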

Why does sign matter in opengl projection matrix

I'm working on a computer vision problem which requires rendering a 3d model using a calibrated camera. I'm writing a function that breaks the calibrated camera matrix into a modelview matrix and a projection matrix, but I've run into an interesting phenomenon in opengl that defies explanation (at least by me).
The short description is that negating the projection matrix results in nothing being rendered (at least in my experience). I would expect that multiplying the projection matrix by any scalar would have no effect, because it transforms homogeneous coordinates, which are unaffected by scaling.
Below is my reasoning why I find this to be unexpected; maybe someone can point out where my reasoning is flawed.
Imagine the following perspective projection matrix, which gives correct results:
    [ a  b  c  0 ]
P = [ 0  d  e  0 ]
    [ 0  0  f  g ]
    [ 0  0  h  0 ]
Multiplying this by camera coordinates gives homogeneous clip coordinates:
[x_c] [ a b c 0 ] [X_e]
[y_c] = [ 0 d e 0 ] * [Y_e]
[z_c] [ 0 0 f g ] [Z_e]
[w_c] [ 0 0 h 0 ] [W_e]
Finally, to get normalized device coordinates, we divide x_c, y_c, and z_c by w_c:
[x_n] [x_c/w_c]
[y_n] = [y_c/w_c]
[z_n] [z_c/w_c]
Now, if we negate P, the resulting clip coordinates should be negated, but since they are homogeneous coordinates, multiplying by any scalar (e.g. -1) shouldn't have any effect on the resulting normalized device coordinates. However, in OpenGL, negating P results in nothing being rendered. I can multiply P by any non-negative scalar and get exactly the same rendered results, but as soon as I multiply by a negative scalar, nothing renders. What is going on here??
Thanks!
Well, the gist of it is that clipping testing is done through:
-w_c < x_c < w_c
-w_c < y_c < w_c
-w_c < z_c < w_c
Multiplying by a negative value breaks this test.
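A tiny numeric sketch of that test (the clip-space values are made up for illustration): the normalized device coordinates after the divide are identical for P and -P, but the clip test only passes when w_c is positive:

#include <cstdio>

int main() {
    // Clip coordinates of some visible point as produced by a "correct" P.
    const float x = 0.2f, y = -0.1f, z = 0.5f, w = 1.0f;

    for (int s = 1; s >= -1; s -= 2) {               // s = +1: P,  s = -1: -P
        float xc = s * x, yc = s * y, zc = s * z, wc = s * w;
        bool inside = (-wc <= xc && xc <= wc) &&
                      (-wc <= yc && yc <= wc) &&
                      (-wc <= zc && zc <= wc);
        std::printf("w_c = %+g  NDC = (%g, %g, %g)  clip test: %s\n",
                    wc, xc / wc, yc / wc, zc / wc, inside ? "pass" : "fail (clipped)");
    }
    return 0;
}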
I just found this tidbit, which makes progress toward an answer:
From Red book, appendix G:
Avoid using negative w vertex coordinates and negative q texture coordinates. OpenGL might not clip such coordinates correctly and might make interpolation errors when shading primitives defined by such coordinates.
Negating the projection matrix results in a negative w clip coordinate, and apparently OpenGL doesn't like this. But can anyone explain WHY OpenGL doesn't handle this case?
reference: http://glprogramming.com/red/appendixg.html
Reasons I can think of:
By negating the projection matrix, the coordinates will no longer fall between the zNear and zFar planes of the view frustum (which are necessarily greater than 0).
To create window coordinates, the normalized device coordinates are translated/scaled by the viewport. So, if you've used a negative scalar for the clip coordinates, the normalized device coordinates (now inverted) translate the viewport to window coordinates that are... off of your window (to the left and below, if you will)
Also, since you mentioned using a camera matrix and that you have negated the projection matrix, I have to ask... which parts of the camera matrix are you applying to which matrices? Operating on the projection matrix with anything other than near/far/fovy/aspect causes all sorts of problems in the depth buffer, including anything that uses z (depth testing, face culling, etc.).
The OpenGL FAQ section on transformations has some more details.