I only know OpenGL slightly, and all the docs and tutorials are hard for me to read, so they don't help much. I do have some vision of how it could work, though, and I would just like some clarification or validation of that vision.
I assume a 3D world is built from 3D meshes, and each mesh may be held in one array or a few arrays (storing the geometry for that mesh). I also assume that some meshes may be, so to speak, cloned and used more than once in the scene. So in my vision I have, say, 50 meshes, but some of them are used more than once... Let's call those clones instances of a mesh (each mesh may have 0 instances, 1 instance or more instances).
Is this vision okay? Is there more that should be added?
I understand that each instance should have its own position and orientation, so do we have some array of instances, each element containing one position-orientation matrix? Or do those matrices only exist in the code branches (you know what I mean: I set such a matrix, then send a mesh, then modify the position matrix, then send the mesh again, until all the instances are sent)?
Does this exhaust the geometry (non-shader) part of things?
(Then the shader part comes, which I also don't quite understand; there is a tremendous amount of hype around shaders, whereas this geometry part seems more important to me, but whatever.)
Can someone validate the vision I've laid out here?
So you have a model which will contain one or more meshes, a mesh that will contain one or more groups, and a group that will contain vertex data.
There is only a small difference between a model and a mesh: a model will also contain other data, such as textures, which will be used by its mesh (or meshes).
A mesh will also contain data on how to draw its groups, such as a matrix.
A group is a part of the mesh which is generally used to move a part of the model using sub matrices. Take a look at "skeletal animation".
So, as the traditional fixed pipeline suggests, you will usually have a stack of matrices which can be pushed and popped to define "sub-positions", so to speak. Imagine having a model representing a dragon. The model would most likely consist of a single mesh, a texture and maybe some other drawing data. At runtime this model would have some matrix defining the model's basic position, rotation, even scale. Then when the dragon needs to fly you would move its wings. Since the wings may be identical there may be only 1 group, but the mesh would contain data to draw it twice with a different matrix. So the model has the matrix which is then multiplied with the wing group matrix to draw the wing itself:
// In fixed-function OpenGL terms (dragonMatrix etc. are the application's own
// column-major float[16] arrays, drawWing() issues the wing geometry):
glPushMatrix();                    // push model matrix
glMultMatrixf(dragonMatrix);       // multiply with the dragon matrix
glPushMatrix();                    // push again for the first wing
glMultMatrixf(wingMatrix);         // multiply with the wing matrix
drawWing();                        // draw wing
glPopMatrix();                     // back to the dragon's matrix
glPushMatrix();
glMultMatrixf(secondWingMatrix);   // multiply with the second wing matrix
drawWing();                        // draw second wing (same group, different matrix)
glPopMatrix();
// ... draw other parts of the dragon
glPopMatrix();                     // back to whatever was on the stack before
You can probably imagine the wing then being divided into multiple parts, each again containing an internal relative matrix, achieving a deeper level of matrix nesting and drawing.
The same procedures would then be used on other parts of the model/mesh.
So the idea is to put as little data as possible on the GPU and reuse it. When a model is loaded, all the textures and vertex data should be sent to the GPU and prepared for use. The CPU must be aware of those buffers and how they are used. A whole model may have a single vertex buffer where each of the draw calls reuses a different part of the buffer, but for now just imagine there is a buffer for every major part of the model, such as a wing, a head, a body, a leg...
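As a minimal sketch of that "upload once at load time, reuse every frame" idea with plain OpenGL buffer objects (wingVertices and its layout are just placeholders here):

// At model load time: upload the wing geometry once.
GLuint wingVbo = 0;
glGenBuffers(1, &wingVbo);
glBindBuffer(GL_ARRAY_BUFFER, wingVbo);
glBufferData(GL_ARRAY_BUFFER, sizeof(wingVertices), wingVertices, GL_STATIC_DRAW);

// Every frame: just bind the buffer and draw; the vertex data never leaves the GPU.
glBindBuffer(GL_ARRAY_BUFFER, wingVbo);
// ... set up the vertex attribute pointers, then glDrawArrays(...) / glDrawElements(...)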
In the end we usually come up with something like a shared object containing all the data needed to draw a dragon, which would be textures and vertex buffers. Then we have another dragon object which points to that shared model and contains all the data necessary to draw one specific dragon in the scene. That would include the matrix for its position in the scene, the matrices for the groups to animate the wings and other parts, maybe some size or even some basic color to combine with the original model... Some state is usually stored here as well, such as speed, some AI parameters or maybe even hit points.
So in the end what we want to do is something like foreach(dragon in dragons) dragon.draw(), which will use the dragon's internal data to set up the basic model matrices and any additional data needed. The draw method will then call out to all the groups and meshes in the model to be drawn as well, until the "recursion" is done and the whole model is drawn.
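A rough C++ sketch of that split between shared model data and per-dragon instance data (every type and member name here is made up for illustration, and the heavy lifting is left as declarations):

#include <vector>

// Matrix4 stands in for whatever 4x4 matrix type the engine uses.
struct Matrix4 { float m[16]; };
Matrix4 operator*(const Matrix4& a, const Matrix4& b);   // matrix composition, defined elsewhere

// Shared, loaded once: the heavy GPU resources.
struct DragonModel {
    unsigned texture      = 0;    // GL texture handle
    unsigned vertexBuffer = 0;    // GL vertex buffer handle (or one per group)
    void drawGroup(int group, const Matrix4& matrix);     // binds buffers and issues the draw call
};

enum { WING_LEFT, WING_RIGHT /* , HEAD, BODY, ... */ };

// Per-instance, one per dragon in the scene: lightweight state only.
struct Dragon {
    DragonModel* model;           // points to the shared data
    Matrix4 worldMatrix;          // position/rotation/scale in the scene
    Matrix4 wingMatrix[2];        // per-instance animation state for the wing group
    float speed     = 0.0f;       // gameplay state lives here too
    int   hitPoints = 100;

    void draw() const {
        model->drawGroup(WING_LEFT,  worldMatrix * wingMatrix[0]);
        model->drawGroup(WING_RIGHT, worldMatrix * wingMatrix[1]);
        // ... other groups of the model
    }
};

void drawAll(const std::vector<Dragon>& dragons) {
    for (const Dragon& d : dragons) d.draw();             // foreach(dragon in dragons) dragon.draw()
}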
So yes, the structure of the data is quite complicated in the end, but if you start with the smaller parts and work your way outwards it all fits together quite nicely.
There are other runtime systems that need to be handled as well to get smooth loading. For instance, if you are in a game and there are no dragons in the vicinity, you will not have the dragon model loaded. When a dragon enters the vicinity the model should be loaded in the background if possible, but drawn only when needed (in visual range). Then when the dragon is gone you may not simply unload the model; you must be sure all of the dragons are gone, and maybe even wait a little bit in case one returns. This then leads to something much like a garbage collector.
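One simple way to get that garbage-collector-like behaviour in C++ is reference counting; a minimal sketch (the names and the lack of a grace period are just for illustration):

#include <map>
#include <memory>
#include <string>

struct Model { /* the shared textures and vertex buffers from above */ };

class ModelCache {
    std::map<std::string, std::weak_ptr<Model>> cache;
public:
    // Every dragon in the vicinity holds a shared_ptr; once the last one lets go,
    // the Model is freed automatically (a real cache might keep it around a bit longer).
    std::shared_ptr<Model> load(const std::string& name) {
        if (auto existing = cache[name].lock())
            return existing;                       // already loaded, reuse it
        auto fresh = std::make_shared<Model>();    // really: load from disk, upload to the GPU
        cache[name] = fresh;
        return fresh;
    }
};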
I hope this will help you to a better understanding.
So I'm building a plugin in Nuke (from The Foundry) that will mimic Maya's animation constraint behaviours. I have a parent, a child, and then options for point, orientation, aim and parent constraints. This is all working pretty well; however, my biggest issue at the moment is the aim constraint.
Some background:
Working with the Nuke Matrix4 class. It's worth noting this matrix is 4x4, in which the first 3 columns of the first 3 rows apply to rotation/scale, and the last column of the first 3 rows is translation (X, Y, Z).
In Vector3 classes:
I get the source and target positions; target - source = ST.
Then I set up a Y plane (one inverted, one not).
Then I take the cross product of my ST vector and the Y plane, and another cross product of ST and the inverted Y plane (for when the parent is behind the child, to invert it).
I then take the cross product of ST and the result of ST.cross(y_plane).
The aim constraint actually works quite well, but I get a lot of Z rotation in my camera (the child) when the parent is in certain positions. I want to be able to avoid this Z rotation. Would anyone happen to know how to do so?
If you're emulating Maya's constraint system, Maya handles Z rotation through the up vector, which adjusts your Z rotation to align with one of five options:
scene up aims the top of your camera to +Y
object up aims the top of your camera toward a third object
object rotation up matches the camera's Z rotation to the XYZ rotation of a third object
vector aims the top of your camera at that vector
none doesn't attempt to orient the top of your camera with anything. This must be what you have currently.
Additionally, there's an up vector which defines what is the "top of your camera" just like the aim vector defines where the camera should point.
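A minimal sketch of the usual fix, building the aim rotation from the aim direction plus an explicit up vector. This uses a throwaway Vec3 type rather than Nuke's Vector3/Matrix4, the "scene up" choice of (0,1,0) is just one of the options above, and sign conventions (whether the camera looks down +Z or -Z) depend on the package, so treat it as a starting point only:

#include <cmath>

struct Vec3 { float x, y, z; };

Vec3 sub(Vec3 a, Vec3 b)   { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 cross(Vec3 a, Vec3 b) { return {a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x}; }
Vec3 normalize(Vec3 v) {
    float len = std::sqrt(v.x*v.x + v.y*v.y + v.z*v.z);
    return {v.x/len, v.y/len, v.z/len};
}

// Build the three rotation axes of the child so it aims at 'target' while keeping
// its top as close to 'up' as possible, which removes the free Z roll.
// (Degenerate if the aim direction is parallel to 'up'.)
void aim(Vec3 source, Vec3 target, Vec3 up /* e.g. {0,1,0} for "scene up" */,
         Vec3& xAxis, Vec3& yAxis, Vec3& zAxis) {
    zAxis = normalize(sub(target, source));   // aim axis
    xAxis = normalize(cross(up, zAxis));      // right, perpendicular to aim and up
    yAxis = cross(zAxis, xAxis);              // the recomputed "top of the camera"
    // Write xAxis, yAxis, zAxis into the first three columns of the 4x4,
    // and 'source' into the translation column described in the question.
}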
Am I correct in thinking that you would ultimately account for differences in coordinate systems in the model view matrix (via scaling)?
Most examples I have seen describe all of the original coordinates as being between 0 and 1.
Indeed, it doesn't matter. For example, Minecraft models have the unit length be the length of a block, and most mobs are larger than that. Also, Kerbal Space Program models have the unit length be roughly the height of a kerbal, but all the rocket parts are much larger.
Once you position the model and apply some perspective, it doesn't matter what the original coordinate system was. It does make things easier to have different-size models share their unit length so you don't need to mess with scale matrices.
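If you do need to reconcile two unit conventions, a scale folded into the model matrix is enough. A quick sketch with GLM, where the 0.01 factor is just an example (a model authored in centimetres placed in a metre-based world) and worldPosition and view are assumed to exist elsewhere:

#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// Bake the unit change into the model matrix, then proceed as usual.
glm::mat4 model = glm::translate(glm::mat4(1.0f), worldPosition)   // place it in the world
                * glm::scale(glm::mat4(1.0f), glm::vec3(0.01f));   // cm -> m
glm::mat4 modelView = view * model;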
I have a webcam pointed at a table at a slant and with it I track markers.
I have a transformationMatrix in OpenSceneGraph and its translation part contains the relative coordinates from the tracked object to the camera.
Because the camera is pointed at a slant, when I move the marker across the table both the Y and Z values are updated, although all I want to be updated is the Z value, because the height of the marker doesn't change, only its distance to the camera.
This has the effect that when I project a model onto the marker in OpenSceneGraph, the model is slightly off, and when I move the marker around the Y and Z values are updated incorrectly.
So my guess is I need a transformation matrix with which I multiply each point so that I have a new coordinate system which lies orthogonal to the table surface.
Something like this: A * v1 = v2, with v1 being the camera coordinates and v2 being my "table coordinates".
So what I did was choose 4 points to "calibrate" my system. I placed the marker at the top left corner of the screen and defined v1 as the current camera coordinates and v2 as (0,0,0), and I did that for 4 different points.
Then, taking the linear equations I get from having an unknown matrix and two known vectors, I solved for the matrix.
I thought the values I would get for the matrix would be the values I needed to multiply the camera coordinates by so the model would update correctly on the marker.
But when I multiply the known camera coordinates I gathered before by the matrix, I don't get anything close to what my "table coordinates" were supposed to be.
Is my approach completely wrong, or did I just mess something up in the equations? (I solved them with the help of wolframalpha.com.) Is there an easier or better way of doing this?
Any help would be greatly appreciated, as I am kind of lost and under some time pressure :-/
Thanks,
David
when I move the marker across the table both the Y and Z values are updated, although all I want to be updated is the Z value, because the height of the marker doesn't change, only its distance to the camera.
This is only true when your camera's view direction is aligned with your Y axis (or Z axis). If the camera is not aligned with Y, it means the transform applies a rotation around the X axis, hence modifying both the Y and Z coordinates of the marker.
So my guess is I need a transformation matrix with which I multiply each point so that I have a new coordinate system which lies orthogonal to the table surface.
Yes, that's right. After that, you will have 2 transforms:
T_table to express the marker's coordinates in the table frame of reference,
T_camera to express table coordinates in the camera frame of reference.
Finding T_camera from a single 2d image is hard because there's no depth information.
This is known as the pose problem; it has been studied by, among others, Daniel DeMenthon, who developed a fast and robust algorithm to find the pose of an object:
articles available on his research homepage, section 4 "Model Based Object Pose" (and particularly "Model-Based Object Pose in 25 Lines of Code", 1995);
code at the same place, section "POSIT (C and Matlab)".
Note that the OpenCV library offers an implementation of DeMenthon's algorithm. The library also offers a convenient and easy-to-use interface to grab images from a webcam. It's worth a try: OpenCV homepage
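For reference, here is a rough sketch of the same idea with the current OpenCV C++ API, using cv::solvePnP (which has since superseded the old POSIT entry point). The marker corner coordinates and camera intrinsics below are invented placeholders; you would substitute your own calibration values:

#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>
#include <vector>

int main() {
    // 3D positions of 4 marker corners in "table" units (here a 10 cm square lying on the table).
    std::vector<cv::Point3f> objectPoints = { {0,0,0}, {0.1f,0,0}, {0.1f,0.1f,0}, {0,0.1f,0} };
    // Where those corners were detected in the webcam image (pixels); example values only.
    std::vector<cv::Point2f> imagePoints  = { {102,118}, {401,131}, {388,402}, {109,377} };

    // Example intrinsics; use the values from your own camera calibration.
    double fx = 800, fy = 800, cx = 320, cy = 240;
    cv::Mat cameraMatrix = (cv::Mat_<double>(3,3) << fx, 0, cx,  0, fy, cy,  0, 0, 1);
    cv::Mat distCoeffs   = cv::Mat::zeros(5, 1, CV_64F);   // or your lens distortion coefficients

    cv::Mat rvec, tvec;   // rotation (as a Rodrigues vector) and translation of the marker
    cv::solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs, rvec, tvec);

    cv::Mat R;
    cv::Rodrigues(rvec, R);   // R and tvec now express table/marker coordinates in camera coordinates
    return 0;
}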
If you know the location in the physical world of your four markers and you've recorded the positions as they appear on the camera, you ought to be able to derive some sort of transform.
When you do the calibration, surely you'd want to put the marker at the four corners of the table, not the screen? If you're just doing the corners of the screen, I imagine you're probably not taking into account the slant of the table.
Is the table literally just slanted relative to the camera or is it also rotated at all?
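Either way, if you record the camera-space position of the marker at four known table positions, you can let OpenCV fit the transform for you rather than solving the equations by hand. A rough sketch under that assumption (the sample coordinates are invented, and with only four correspondences there is no redundancy, so more points give a more robust fit):

#include <opencv2/calib3d.hpp>
#include <opencv2/core.hpp>
#include <vector>

int main() {
    // v1: marker positions as reported in camera coordinates by the tracker (example values).
    std::vector<cv::Point3f> cameraPts = { {0.02f,0.31f,1.10f}, {0.45f,0.30f,1.12f},
                                           {0.44f,0.12f,1.55f}, {0.01f,0.13f,1.53f} };
    // v2: where those same spots lie on the table, in "table coordinates".
    std::vector<cv::Point3f> tablePts  = { {0,0,0}, {0.4f,0,0}, {0.4f,0.5f,0}, {0,0.5f,0} };

    cv::Mat A, inliers;                          // A comes back as the 3x4 affine transform [R|t]
    cv::estimateAffine3D(cameraPts, tablePts, A, inliers);
    // A * (x, y, z, 1)^T now maps camera coordinates to table coordinates,
    // i.e. the "A * v1 = v2" the question is after.
    return 0;
}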
So I'm reading the "3D Math Primer For Graphics And Game Development" book and, coming from a pretty much non-math background, I'm finally starting to grasp vector/matrix math, which is a relief.
But, and yes there's always a but, I'm having trouble understanding the translation of an object from one coordinate space to another. In the book the author takes an example of a gun shooting at a car (image) that is turned 20 degrees (just a 2D space for simplicity) in "world space". So we have three spaces: world space, gun object space and car object space, correct? The book then states this:
"In this figure, we have introduced a rifle that is firing a bullet at the car. As indicated by the
coordinate space on the left, we would normally begin by knowing about the gun and the trajectory
of the bullet in world space. Now, imagine transforming the coordinate space in line with the
car’s object space while keeping the car, the gun, and the trajectory of the bullet still. Now we
know the position of the gun and the trajectory of the bullet in the object space of the car, and we
could perform intersection tests to see if and where the bullet would hit the car."
And I follow this explanation, and when I know beforehand that the car is rotated 20 degrees in world space this isn't a problem. But how does this translate to a situation where, say, I have an archer in a game shooting from a hill down at someone else? I don't know the angle at which everything is displaced there.
And which object space is rotated here? The world space or the gun space? Yeah, as you can see, I'm a bit confused.
I think the ideal response would be one that uses the car and gun example with arbitrary variables for positions, angles, etc.
You should read up on how to change basis, and think in vectors, not arrays but the math ones :P
I used to be a game programmer and I did that time after time. Eventually, I got away from using angles. For every object, I had a forward-facing vector and an up vector. You can get the right-facing vector, then, from a cross-product. And all the conversions between spaces become dot products.
Do you understand how the notion of coordinate spaces and transforms works in 2D? I find that coordinate spaces and transforms are a lot easier to visualize in 2D before trying to move to 3D. That way you can work "what-if" scenarios out on paper, which helps you to just grok the major concepts.
In the image you posted I think the interpretation is that the car itself has not changed in its internal coordinate system, but that its system has been rotated with respect to the World's system.
You have to understand that the car has its own local coordinate system. The geometry of the car is defined in terms of its local coordinate system. So the length of the car always extends along the x-axis in its own local system regardless of its orientation in the World. The car can be oriented by transforming its local coordinate system.
Coordinate systems are always defined relative to another system, except for the root, in this case the World. So the gun has its own system, the car has its own system and they are both embedded into the World's system. If I rotate or move the car's system with respect to the World then the car will appear to rotate even though the geometry is unchanged.
This is something that is very hard to explain without being able to draw out visual scenarios and my google-fu is failing to find good descriptions of the basics.
As a previous reply suggests, keeping an up, forward and right vector is a good way to define a (Euclidean) coordinate space. It's even better if you add an origin as well, since you can then represent a wider range of spaces.
Let's say we have two spaces A and B. In A, up, forward and right are (0,1,0), (0,0,1) and (1,0,0) respectively, and the origin is at zero; this gives the usual left-handed xyz coordinates for A. Say for B we have u=(ux,uy,uz), f=(fx,fy,fz) and r=(rx,ry,rz) with origin o=(ox,oy,oz). Then a point at p=(x,y,z) in B has, in A, the coordinates (x*rx + y*ux + z*fx + ox, x*ry + y*uy + z*fy + oy, x*rz + y*uz + z*fz + oz).
This can be arrived at by inspection. Observe that, since the right, up and forward vectors for B have components in each axis of A, a component of some coordinates in B must contribute to all three components of the coordinates in A. I.e. since (0,1,0) in B is equal to (ux,uy,uz), then (x,y,z) = y*u + (some other stuff). If we do this for each coordinate we have that (x,y,z) = x*r + y*u + z*f + (some other stuff). If we then observe that at the origin these terms vanish except for (some other stuff), we realise that (some other stuff) must in fact be o, which gives the coordinates in A as x*r + y*u + z*f + o; this is (x*rx + y*ux + z*fx + ox, x*ry + y*uy + z*fy + oy, x*rz + y*uz + z*fz + oz) once the vector operations are expanded.
This operation can be reversed as well: we just set the coordinates in A and solve equations to find them in B. E.g. (1,1,1) in A is equal to x*r + y*u + z*f + o for some (x,y,z) in B. This gives three equations in three unknowns and can be solved by the method of simultaneous equations. I won't bother explaining that here... but here is a link if you get stuck: link
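Here is a small C++ sketch of that change of basis in both directions. The B-to-A direction is the expansion above; the A-to-B direction uses dot products against r, u and f instead of solving the simultaneous equations, which works whenever those three vectors are unit length and mutually perpendicular:

struct Vec3 { float x, y, z; };

Vec3  add(Vec3 a, Vec3 b)    { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3  sub(Vec3 a, Vec3 b)    { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3  scale(Vec3 v, float s) { return {v.x * s, v.y * s, v.z * s}; }
float dot(Vec3 a, Vec3 b)    { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Space B described by its right/up/forward axes and origin, all expressed in A's coordinates.
struct Space { Vec3 r, u, f, o; };

// (x,y,z) in B  ->  coordinates in A:  x*r + y*u + z*f + o
Vec3 toA(const Space& B, Vec3 p) {
    return add(add(add(scale(B.r, p.x), scale(B.u, p.y)), scale(B.f, p.z)), B.o);
}

// coordinates in A  ->  (x,y,z) in B: remove the origin, then project onto each axis
Vec3 toB(const Space& B, Vec3 p) {
    Vec3 d = sub(p, B.o);
    return { dot(d, B.r), dot(d, B.u), dot(d, B.f) };
}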
How does all of this relate to your original example of a bullet and a car? Well, if you rotate a set of up/right/forward vectors with the car, and update the origin as the car is translated you can move from world space to the car's local space and make some tests easier. e.g instead of transforming vertices for a collision model, you can transform the bullet into 'car local' space and use the local coordinates. This is handy if you are going to transform the car's vertices for rendering on a GPU, but don't want to suffer the overhead of reading that information back to use for physics calculations on the CPU.
In other uses it can save work: instead of applying x transformations to every one of a large number of points, you apply them to just the three axis vectors (and the origin) and then perform this operation once, which lets you combine x transformations on a large number of points without a significant performance hit over a single transformation across the same number of points.
In a game situation you generally wouldn't know the car was rotated 20 degrees, per se; instead, your positioning information for the car would implicitly contain that knowledge. So in this two-dimensional example, you'd know the x,y coordinates of the center of the car and the x,y vector the car is pointing along (both pieces of information in world space), otherwise you wouldn't be able to draw it. Those two pieces of information are all you need to find the matrix to transform between world space and the car's object space. (A person could then look at that matrix in this example and say: oh, look, a rotation by 20 degrees. But that's not a piece of information you'd normally worry about in the game.)
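A small 2D sketch of deriving that transform from just the car's centre and facing direction (no angles anywhere; the names are made up and forward is assumed to be unit length):

// World-space description of the car: where it is and which way it points.
struct Vec2 { float x, y; };
struct Car  { Vec2 center; Vec2 forward; };   // forward should be a unit vector

// Express a world-space point (e.g. the gun, or a point on the bullet's path)
// in the car's object space: x along the car's length, y toward its side.
Vec2 worldToCar(const Car& car, Vec2 p) {
    Vec2 side = { -car.forward.y, car.forward.x };          // forward rotated 90 degrees
    Vec2 d    = { p.x - car.center.x, p.y - car.center.y };
    return { d.x * car.forward.x + d.y * car.forward.y,     // dot with forward
             d.x * side.x        + d.y * side.y };          // dot with side
}

// The "20 degrees" never appears explicitly; it is hidden inside the forward vector,
// e.g. forward = (cos(20 degrees), sin(20 degrees)) for the car in the book's figure.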
The problem of the gun and the car can be solved in any of the three spaces. So the question is, which is it easiest in? Presumably the gun's space is set up so that the bullet is fired down the X axis. So it's easy to translate that into either of the other spaces. A 2D car is probably going to be represented in its own object space -- maybe as a set of 2D line segments or 2D pixels or something. You certainly could translate those into world space or the gun's object space, but if you solve the problem in car object space you don't have to translate them at all, so that's the easiest one to work in for this problem.
It's sort of like relativity: from its own perspective, none of the spaces are rotated. Unlike relativity, though, we treat the world space as a privileged fixed frame of reference. So the objects' model spaces are rotated, mirrored, scaled, translated, etc with respect to the world space.