How to clip a 3D image in XTK

We have a 3D image composed of a NIfTI (.nii) volume and several OBJ meshes. We want to clip it to expose a partial view of the interior.
The viewpoint option at http://www.usc.edu/programs/face/3Dmodel/C57BL6-E185.html is the target. Is this possible in XTK?

I have figured out how to do this. In camera3D.js, in the call that creates the _perspective matrix,
X.matrix.makePerspective(X.matrix.identity(), this._fieldOfView, (width/height), 1, 10000);
adjust the fourth parameter (the near plane, here 1) to introduce a clipping plane.
If you want to use an orthographic projection instead, in order to unflatten the cube in example 00, the near value is again the 1 just before the far value of 10000:
X.matrix.makeOrtho(X.matrix.identity(), -(width/2), (width/2), -(height/2), (height/2), 1, 10000);
(There is also a _perspective creation call in renderer.js.)
It would be great if XTK offered an easy way to adjust that parameter on the fly, but for now I will embed a handle to it.

Related

How to place an object in video with OpenCV

OK, so I've used a chessboard image to calibrate my camera. This calibrates both the intrinsic and extrinsic parameters of the scene. Now I want to draw an object so that it sits on the table on which I had placed the chessboard. My question is: do I need the rotation and translation vectors of the camera to do this? And if so, how can I get them after doing the calibration? Or are these vectors already taken into account by the calibrateCamera function?
Basically, after I calibrate the camera, how can I draw into the scene on top of a surface?
Thanks!
The calibrateCamera function provides you with both the extrinsic parameters (rotation + translation for each calibration view) and the intrinsic parameters (the camera matrix K and the distortion coefficients).
This allows you to project 3D points, expressed in the coordinate system of your chessboard, into an image acquired by the camera, for example using the projectPoints function. Using this approach, you can draw wireframe objects directly into the image by projecting their 3D edges with projectPoints.
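For example, here is a minimal sketch (assuming the OpenCV C++ API; the drawCube helper and the cube geometry are just illustrative) of projecting a wireframe cube that sits on the chessboard, using one of the rvec/tvec pairs returned by calibrateCamera (or by solvePnP for a new frame):

#include <opencv2/calib3d.hpp>
#include <opencv2/imgproc.hpp>
#include <vector>

void drawCube(cv::Mat& image, const cv::Mat& cameraMatrix, const cv::Mat& distCoeffs,
              const cv::Mat& rvec, const cv::Mat& tvec, float side)
{
    // Cube corners expressed in the chessboard's coordinate system.
    std::vector<cv::Point3f> corners = {
        {0, 0, 0}, {side, 0, 0}, {side, side, 0}, {0, side, 0},
        {0, 0, -side}, {side, 0, -side}, {side, side, -side}, {0, side, -side}};

    // Project the 3D corners into the image using the calibration results.
    std::vector<cv::Point2f> imgPts;
    cv::projectPoints(corners, rvec, tvec, cameraMatrix, distCoeffs, imgPts);

    // Draw the 12 edges of the cube.
    const int edges[12][2] = {{0,1},{1,2},{2,3},{3,0},{4,5},{5,6},{6,7},{7,4},
                              {0,4},{1,5},{2,6},{3,7}};
    for (const auto& e : edges)
        cv::line(image, imgPts[e[0]], imgPts[e[1]], cv::Scalar(0, 255, 0), 2);
}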
If you would like to render more complex objects (i.e. including textures, etc.), you will need an auxiliary rendering engine, since OpenCV does not provide such functionality. You would use a rendering engine, say OpenGL, provide the projection details to it, render the object from the camera viewpoint into a buffer, and overlay this buffer on top of your initial image.
However, note that the result of this, which is called augmented reality, may sometimes look odd, because it does not take into account the occlusions between your rendered object and the scene observed in the image. Handling occlusions properly would require a 3D model of the scene.
AldourDisciple's post is a great answer.
If I can add my 2 cents, this is a problem I have worked on, and it is the basic problem of augmented reality and, more generally, of how to integrate OpenCV and OpenGL (because you are going to use OpenGL to draw the 3D content and an OpenGL texture to show the underlying OpenCV images from the video). You can find some (IMHO) good links, documentation, tutorials, and references in my previous answer on that.

Determine if 3d coordinate is inside frustum

I am trying to implement an overlay for a particular screen saver that I inject my application into.
This screen saver uses 3d objects and a dynamic camera.
I have the camera position and direction, the 3D object positions, and the FOV value, and I'd like to create an audio-based overlay that adds icons to the objects.
I have the 2D overlay in place and can successfully iterate over the objects; however, I can't figure out how to calculate a frustum from the data I have.
Basically: how can one create a frustum from the camera direction? Is a world-to-screen matrix required to create the frustum? I don't have the W2S matrix, so would that make the problem impossible to solve?
There is an excellent resource at http://web.archive.org/web/20120531231005/http://crazyjoke.free.fr/doc/3D/plane%20extraction.pdf which should explain everything you need; it outlines how to extract the view-frustum planes from the combined view and projection matrices.
As for the world-to-screen (W2S) matrix, you can calculate it on the fly using a variant of the following:
ScreenPosition = ProjectionMatrix * (ViewMatrix * WorldPosition)
That way you can compute it as required.
There is also a frustum tutorial at Lighthouse3D aimed at frustum culling, which explains the implementation quite nicely.
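For reference, here is a minimal sketch of the plane extraction those resources describe (the Gribb/Hartmann method), assuming the combined clip matrix M = ProjectionMatrix * ViewMatrix in a column-vector convention, stored row-major here:

#include <cmath>

struct Plane { float a, b, c, d; };   // plane equation a*x + b*y + c*z + d = 0

// Builds the six frustum planes from the rows of M = Projection * View.
void extractFrustumPlanes(const float m[4][4], Plane planes[6])
{
    for (int i = 0; i < 3; ++i) {
        // row3 + rowi gives left/bottom/near, row3 - rowi gives right/top/far.
        planes[2 * i]     = { m[3][0] + m[i][0], m[3][1] + m[i][1],
                              m[3][2] + m[i][2], m[3][3] + m[i][3] };
        planes[2 * i + 1] = { m[3][0] - m[i][0], m[3][1] - m[i][1],
                              m[3][2] - m[i][2], m[3][3] - m[i][3] };
    }
    for (int i = 0; i < 6; ++i) {      // normalize so plane distances are metric
        float len = std::sqrt(planes[i].a * planes[i].a +
                              planes[i].b * planes[i].b +
                              planes[i].c * planes[i].c);
        planes[i].a /= len; planes[i].b /= len;
        planes[i].c /= len; planes[i].d /= len;
    }
}

// A 3D point is inside the frustum if it lies on the positive side of all planes.
bool pointInFrustum(const Plane planes[6], float x, float y, float z)
{
    for (int i = 0; i < 6; ++i)
        if (planes[i].a * x + planes[i].b * y + planes[i].c * z + planes[i].d < 0.0f)
            return false;
    return true;
}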

How to find out how many units across the screen plane in OpenGL

How would one get the relative size of the viewing plane in OpenGL's own units? I need to find out the width and height in "OpenGL units". Is there a function which will retrieve this information?
I assume that one unit (let us say 1.0f) in Z would be equivalent to one unit in X, even if conversion to a real measurement system is meaningless.
I know I can get the screen size either by use of GetSystemMetrics(SM_CXSCREEN) or glutGet(GLUT_SCREEN_WIDTH), but this is in pixels.
To handle the graphical window calls, I am using freeglut on non-windows OSes and the WinAPI on Windows.
Assuming you want to draw something like a UI, set your projection matrix to an orthographic matrix with glOrtho; then you don't have any perspective, and there is a direct orthographic mapping between world coordinates and screen coordinates. The arguments to your glOrtho call determine how wide and how high your viewport is in world coordinates.
If you want to draw both a UI and a 3D scene, draw the UI with glOrtho and draw the scene with gluPerspective, using a clipping mask to make sure you don't ruin your UI.
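A minimal sketch of that two-pass setup, assuming the fixed-function pipeline with GLU available; drawScene() and drawUI() are hypothetical placeholders for your own drawing code:

#include <GL/glu.h>

void drawScene();   // placeholder: your 3D scene
void drawUI();      // placeholder: your UI, drawn in pixel units

void renderFrame(int winW, int winH)
{
    // 3D pass: perspective projection.
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    gluPerspective(60.0, (double)winW / winH, 0.1, 1000.0);
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    drawScene();

    // UI pass: orthographic projection where one world unit == one pixel.
    glMatrixMode(GL_PROJECTION);
    glLoadIdentity();
    glOrtho(0.0, winW, 0.0, winH, -1.0, 1.0);
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    glDisable(GL_DEPTH_TEST);   // keep the UI on top of the scene
    drawUI();
    glEnable(GL_DEPTH_TEST);
}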
If, on the other hand, you want to know the width of the view in a 3D scene with perspective, so that you know how big to draw your object, then you'll have to deal with the perspective projection. You need to decide at which Z coordinate you want to know the width/height of the view. You can use gluUnProject to calculate the world coordinate corresponding to a given screen coordinate and Z plane.
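For instance, a minimal sketch using gluProject/gluUnProject (fixed-function GL + GLU, assuming the camera looks roughly down the world Z axis so that the chosen plane has a constant depth across the screen):

#include <GL/glu.h>

// Returns the visible width, in world units, of the view at the world plane z = zPlane.
double visibleWidthAtZ(double zPlane)
{
    GLdouble model[16], proj[16];
    GLint view[4];
    glGetDoublev(GL_MODELVIEW_MATRIX, model);
    glGetDoublev(GL_PROJECTION_MATRIX, proj);
    glGetIntegerv(GL_VIEWPORT, view);

    // Project a point lying on the plane to find its window-space depth.
    GLdouble winX, winY, winZ;
    gluProject(0.0, 0.0, zPlane, model, proj, view, &winX, &winY, &winZ);

    // Unproject the left and right viewport edges at that depth and measure.
    GLdouble lx, ly, lz, rx, ry, rz;
    gluUnProject(view[0],           winY, winZ, model, proj, view, &lx, &ly, &lz);
    gluUnProject(view[0] + view[2], winY, winZ, model, proj, view, &rx, &ry, &rz);
    return rx - lx;
}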
However, it would probably be better to do it the other way around: always draw your object with a given size, and then calculate what your projection matrix should be so that the object appears properly in your viewport.

Here is a volume render result; how can it interact with other 3D objects?

I've implemented a volume renderer using ray casting in CUDA. Now I need to add other 3D objects (like 3D terrain, in my case) to the scene and make them interact with the volume-render result. For example, when I move the volume-render result so that it overlaps the terrain, I wish to modulate it, for instance by clipping the overlapping part of the volume-render result.
However, the volume-render result comes from a ray accumulating color, so it is a 2D picture with no depth. How to implement the interaction confuses me. Can somebody give me a hint?
First you render your rasterized 3D objects. Then you take the depth buffer and use it in the volume ray caster as an additional constraint on the integration limits.
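A minimal sketch of the conversion this implies, assuming a standard perspective projection with near/far planes zNear/zFar and a depth buffer storing values in [0,1]; the helper names are just illustrative:

// Invert the standard OpenGL depth mapping back to an eye-space distance.
float eyeDepthFromDepthBuffer(float d, float zNear, float zFar)
{
    float ndc = 2.0f * d - 1.0f;
    return 2.0f * zNear * zFar / (zFar + zNear - ndc * (zFar - zNear));
}

// Clamp the far integration limit of one volume ray so it stops at the
// rasterized surface. rayDirDotViewDir is the cosine between the ray
// direction and the camera's view direction.
float clampRayFar(float tFar, float depthSample, float zNear, float zFar,
                  float rayDirDotViewDir)
{
    float tScene = eyeDepthFromDepthBuffer(depthSample, zNear, zFar) / rayDirDotViewDir;
    return tFar < tScene ? tFar : tScene;
}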
Actually, I think the result of ray casting is a 2D image, so it cannot interact with other 3D objects in the usual way. So my solution is to take the 2D ray-casting image as a texture and blend it into the 3D scene. If I can control the view position and direction, we can map the ray-casting result to the exact place in the 3D scene. I'm still trying to implement this solution, but I think the idea is sound!

OpenGL define vertex position in pixels

I've been writing a basic 2D game engine in OpenGL/C++ and learning everything as I go along. I'm still rather confused about defining vertices and their "position". That is, I'm still trying to understand the vertex-to-pixel conversion mechanism of OpenGL. Can it be explained briefly, or can someone point me to an article or something that will explain this? Thanks!
This is rather basic knowledge that your favourite OpenGL learning resource should teach you as one of the first things. But anyway the standard OpenGL pipeline is as follows:
1. The vertex position is transformed from object space (local to some object) into world space (with respect to some global coordinate system). This transformation specifies where your object (to which the vertices belong) is located in the world.
2. Now the world-space position is transformed into camera/view space. This transformation is determined by the position and orientation of the virtual camera through which you see the scene. In OpenGL these two transformations are actually combined into one, the modelview matrix, which directly transforms your vertices from object space to view space.
3. Next the projection transformation is applied. Whereas the modelview transformation should consist only of affine transformations (rotation, translation, scaling), the projection transformation can be a perspective one, which basically distorts the objects to realize a real perspective view (with farther-away objects being smaller). But in your case of a 2D view it will probably be an orthographic projection, which does nothing more than a translation and scaling. This transformation is represented in OpenGL by the projection matrix.
4. After these 3 (or 2) transformations (and the subsequent perspective division by the w component, which actually realizes the perspective distortion, if any) what you have are normalized device coordinates. This means that after these transformations the coordinates of the visible objects should be in the range [-1,1]. Everything outside this range is clipped away.
5. In a final step the viewport transformation is applied, and the coordinates are transformed from the [-1,1] range into the [0,w]x[0,h]x[0,1] cube (assuming a glViewport(0, 0, w, h) call), which gives the vertex's final position in the framebuffer and therefore its pixel coordinates.
When using a vertex shader, steps 1 to 3 are actually done in the shader and can therefore be done in any way you like, but usually one conforms to this standard modelview -> projection pipeline, too.
The main thing to keep in mind is that after the modelview and projection transforms, every vertex with coordinates outside the [-1,1] range will be clipped away. So the [-1,1] box determines your visible scene after these two transformations.
So from your question I assume you want to use a 2D coordinate system with units of pixels for your vertex coordinates and transformations? In that case this is best done with glOrtho(0.0, w, 0.0, h, -1.0, 1.0), with w and h being the dimensions of your viewport. This basically counters the viewport transformation and therefore maps your vertices from the [0,w]x[0,h]x[-1,1] box into the [-1,1] box, which the viewport transformation then maps back into the [0,w]x[0,h]x[0,1] box.
These have been quite general explanations, without mentioning that the actual transformations are done by matrix-vector multiplications and without talking about homogeneous coordinates, but they should have explained the essentials. The documentation of gluProject might also give you some insight, as it actually models the transformation pipeline for a single vertex. But in that documentation they actually forgot to mention the division by the w component (v'' = v' / v'(3)) after the v' = P x M x v step.
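For illustration, here is a minimal C++ sketch of that per-vertex pipeline (essentially what gluProject computes), assuming column-major 4x4 matrices as OpenGL stores them; the mul/project names are just for this example:

#include <array>

using Vec4 = std::array<double, 4>;
using Mat4 = std::array<double, 16>;   // column-major, as in OpenGL

// Multiply a 4x4 column-major matrix by a column vector.
Vec4 mul(const Mat4& m, const Vec4& v)
{
    Vec4 r{};
    for (int row = 0; row < 4; ++row)
        for (int col = 0; col < 4; ++col)
            r[row] += m[col * 4 + row] * v[col];
    return r;
}

// Returns window (pixel) x, y and a [0,1] depth for an object-space vertex.
Vec4 project(const Mat4& modelview, const Mat4& projection,
             int vpX, int vpY, int vpW, int vpH, const Vec4& objVertex)
{
    Vec4 eye  = mul(modelview,  objVertex);   // object space -> eye space
    Vec4 clip = mul(projection, eye);         // eye space -> clip space
    double ndcX = clip[0] / clip[3];          // perspective division
    double ndcY = clip[1] / clip[3];
    double ndcZ = clip[2] / clip[3];
    return { vpX + (ndcX + 1.0) * 0.5 * vpW,  // viewport transformation
             vpY + (ndcY + 1.0) * 0.5 * vpH,
             (ndcZ + 1.0) * 0.5,
             1.0 };
}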
EDIT: Don't forget to look at the first link in epatel's answer, which explains the transformation pipeline a bit more practically and in more detail.
It is called transformation.
Vertices are specified in 3D coordinates, which are transformed into viewport coordinates (into your window view). This transformation can be set up in various ways. An orthogonal (orthographic) transformation is the easiest to understand as a starter.
http://www.songho.ca/opengl/gl_transform.html
http://www.opengl.org/wiki/Vertex_Transformation
http://www.falloutsoftware.com/tutorials/gl/gl5.htm
Firstly, be aware that OpenGL does not use standard pixel coordinates. I mean by that that for a particular resolution, e.g. 800x600, you don't have horizontal coordinates in the range 0-799 or 1-800 stepped by one. Instead you have coordinates ranging from -1 to 1, which are later sent to the graphics card's rasterizing unit and then mapped to the particular resolution.
I omitted one step here: before all that you have a ModelViewProjection matrix (or a ViewProjection matrix in some simple cases) which casts the coordinates you use onto a projection plane. Its default use is to implement a camera that converts the 3D space of the world (View places the camera in the right position, Projection casts 3D coordinates onto the screen plane; in ModelViewProjection there is also the step of placing a model in the right place in the world).
Another use (and you can use the Projection matrix this way to achieve what you want) is to use these matrices to convert one range of resolutions to another.
And here is the trick you will need. You should read about the ModelViewProjection matrix and the camera in OpenGL if you want to get serious, but for now I will tell you that with the proper matrix you can simply map your own coordinate system (e.g. ranges 0-799 horizontally and 0-599 vertically) to the standardized -1:1 range. That way you will not notice that the underlying OpenGL API uses its own -1 to 1 system.
The easiest way to achieve this is the glOrtho function. Here's the link to the documentation:
http://www.opengl.org/sdk/docs/man/xhtml/glOrtho.xml
This is an example of proper usage:
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glOrtho(0, 800, 600, 0, 0, 1);   // left, right, bottom, top, near, far: the y axis points down
glMatrixMode(GL_MODELVIEW);
Now you can use your own modelview matrix, e.g. for translating (moving) objects, but don't touch your projection matrix. This code should be executed before any drawing commands. (In fact it can go right after initializing OpenGL if you won't use 3D graphics.)
And here's working example: http://nehe.gamedev.net/tutorial/2d_texture_font/18002/
Just draw your figures instead of drawing text. And there is another thing: glPushMatrix and glPopMatrix for the chosen matrix (in this example the projection matrix); you won't need those until you combine 3D with 2D rendering.
And you can still use the model matrix (e.g. for placing tiles somewhere in the world) and the view matrix (for example for zooming the view, or scrolling through the world; in this case your world can be larger than the resolution, and you can crop the view with simple translations), as sketched below.
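A minimal sketch of that idea, assuming the pixel-space glOrtho projection above is already set; drawTileQuad() is a hypothetical placeholder for your own tile drawing:

#include <GL/gl.h>

void drawTileQuad();   // placeholder: draws one tile at the current origin

void drawWorld(float cameraX, float cameraY, float tileX, float tileY)
{
    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    glTranslatef(-cameraX, -cameraY, 0.0f);   // "view": scroll through the world

    glPushMatrix();
    glTranslatef(tileX, tileY, 0.0f);         // "model": place one tile
    drawTileQuad();
    glPopMatrix();
}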
Looking back at my answer, I see it's a little chaotic, but if you're confused, just read about the Model, View, and Projection matrices and try the example with glOrtho. If you're still confused, feel free to ask.
MSDN has a great explanation. It may be in terms of DirectX but OpenGL is more-or-less the same.
Google for "opengl rendering pipeline". The first five articles all provide good expositions.
The key transition from vertices to pixels (actually, fragments, but you won't be too far off if you think "pixels") is in the rasterization stage, which occurs after all vertices have been transformed from world-coordinates to screen coordinates and clipped.