Model doesn't have tangent or binormal values - FBX SDK, DirectX, C++

I'm using a translator, so please understand that I may be inexperienced. I'm currently trying to load a 3D model in DX12 using the FBX SDK.
There is no problem with the model used in the current example code, but the moment I use a 3D model from sites such as Mixamo or Turbo, the iTangentCnt or iBinormalCnt value is zero, so the program stops working.
Is this because there are no tangents and binormals in the model?
Other models I tried have no values either, so I wonder whether models simply don't contain them. How did others solve this problem?
void CFBXLoader::GetTangent(FbxMesh* _pMesh, tContainer* _pContainer, int _iIdx, int _iVtxOrder)
{
    int iTangentCnt = _pMesh->GetElementTangentCount();
    if (1 != iTangentCnt)
        assert(NULL); // fires when the mesh carries no tangent element

    FbxGeometryElementTangent* pTangent = _pMesh->GetElementTangent();

    UINT iTangentIdx = 0;
    if (pTangent->GetMappingMode() == FbxGeometryElement::eByPolygonVertex)
    {
        if (pTangent->GetReferenceMode() == FbxGeometryElement::eDirect)
            iTangentIdx = _iVtxOrder;
        else
            iTangentIdx = pTangent->GetIndexArray().GetAt(_iVtxOrder);
    }
    else if (pTangent->GetMappingMode() == FbxGeometryElement::eByControlPoint)
    {
        if (pTangent->GetReferenceMode() == FbxGeometryElement::eDirect)
            iTangentIdx = _iIdx;
        else
            iTangentIdx = pTangent->GetIndexArray().GetAt(_iIdx);
    }

    FbxVector4 vTangent = pTangent->GetDirectArray().GetAt(iTangentIdx);
    // Y and Z are swapped to convert from the FBX coordinate system
    _pContainer->vecTangent[_iIdx].x = (float)vTangent.mData[0];
    _pContainer->vecTangent[_iIdx].y = (float)vTangent.mData[2];
    _pContainer->vecTangent[_iIdx].z = (float)vTangent.mData[1];
}
void CFBXLoader::GetBinormal(FbxMesh* _pMesh, tContainer* _pContainer, int _iIdx, int _iVtxOrder)
{
    int iBinormalCnt = _pMesh->GetElementBinormalCount();
    if (1 != iBinormalCnt)
        assert(NULL); // fires when the mesh carries no binormal element

    FbxGeometryElementBinormal* pBinormal = _pMesh->GetElementBinormal();

    UINT iBinormalIdx = 0;
    if (pBinormal->GetMappingMode() == FbxGeometryElement::eByPolygonVertex)
    {
        if (pBinormal->GetReferenceMode() == FbxGeometryElement::eDirect)
            iBinormalIdx = _iVtxOrder;
        else
            iBinormalIdx = pBinormal->GetIndexArray().GetAt(_iVtxOrder);
    }
    else if (pBinormal->GetMappingMode() == FbxGeometryElement::eByControlPoint)
    {
        if (pBinormal->GetReferenceMode() == FbxGeometryElement::eDirect)
            iBinormalIdx = _iIdx;
        else
            iBinormalIdx = pBinormal->GetIndexArray().GetAt(_iIdx);
    }

    FbxVector4 vBinormal = pBinormal->GetDirectArray().GetAt(iBinormalIdx);
    // Y and Z are swapped to convert from the FBX coordinate system
    _pContainer->vecBinormal[_iIdx].x = (float)vBinormal.mData[0];
    _pContainer->vecBinormal[_iIdx].y = (float)vBinormal.mData[2];
    _pContainer->vecBinormal[_iIdx].z = (float)vBinormal.mData[1];
}

If your run-time scenario requires tangents/binormals to operate, then you need to handle the case where they are not already defined in the FBX. If iBinormalCnt is 0, you can generate tangents/binormals in your own program from the surface normal and texture coordinates.
Lengyel, Eric. "7.5 Tangent Space", Foundations of Game Engine Development: Volume 2 - Rendering, Terathon Software LLC (2019). link
Mittring, Martin. "Triangle Mesh Tangent Space Calculation", ShaderX^4: Advanced Rendering Techniques (2006).
See DirectXMesh's ComputeTangentFrame function.
For an example of a complete model exporter using Autodesk FBX, see the DirectX SDK Sample Content Exporter.
Note that another option for rendering with tangent space, without exporting the tangents/binormals, is to compute them in the pixel shader at render time. I use this technique in the DirectX Tool Kit.
Christian Schüler, "Normal Mapping without Precomputed Tangents", ShaderX 5, Chapter 2.6, pp. 131-140, and this blog post.
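If you go the do-it-yourself route, here is a minimal sketch of the idea from those references: accumulate per-triangle tangents from positions and UVs, then orthonormalize against the vertex normal. The Vertex layout and ComputeTangents name are assumptions for illustration, and the bitangent handedness sign that Lengyel's method also computes is omitted for brevity.
#include <DirectXMath.h>
#include <cmath>
#include <cstdint>
#include <vector>
using namespace DirectX;

struct Vertex { XMFLOAT3 pos, normal, tangent, binormal; XMFLOAT2 uv; };

// Accumulate per-triangle tangents, then Gram-Schmidt orthonormalize each
// result against the vertex normal; the binormal is the cross product.
void ComputeTangents(std::vector<Vertex>& verts, const std::vector<uint32_t>& indices)
{
    std::vector<XMFLOAT3> accum(verts.size(), XMFLOAT3(0.f, 0.f, 0.f));

    for (size_t i = 0; i + 2 < indices.size(); i += 3)
    {
        const Vertex& v0 = verts[indices[i]];
        const Vertex& v1 = verts[indices[i + 1]];
        const Vertex& v2 = verts[indices[i + 2]];

        XMVECTOR e1 = XMLoadFloat3(&v1.pos) - XMLoadFloat3(&v0.pos);
        XMVECTOR e2 = XMLoadFloat3(&v2.pos) - XMLoadFloat3(&v0.pos);
        float du1 = v1.uv.x - v0.uv.x, dv1 = v1.uv.y - v0.uv.y;
        float du2 = v2.uv.x - v0.uv.x, dv2 = v2.uv.y - v0.uv.y;

        float det = du1 * dv2 - du2 * dv1;
        if (fabsf(det) < 1e-8f) continue; // degenerate UVs, skip triangle

        XMVECTOR t = (e1 * dv2 - e2 * dv1) / det;
        for (int k = 0; k < 3; ++k)
        {
            XMFLOAT3& a = accum[indices[i + k]];
            XMStoreFloat3(&a, XMLoadFloat3(&a) + t);
        }
    }

    for (size_t v = 0; v < verts.size(); ++v)
    {
        XMVECTOR n = XMLoadFloat3(&verts[v].normal);
        XMVECTOR t = XMLoadFloat3(&accum[v]);
        // Gram-Schmidt: make the tangent orthogonal to the normal.
        t = XMVector3Normalize(t - n * XMVector3Dot(n, t));
        XMStoreFloat3(&verts[v].tangent, t);
        XMStoreFloat3(&verts[v].binormal, XMVector3Cross(n, t));
    }
}
For production use, prefer DirectXMesh's ComputeTangentFrame, which handles the handedness sign and degenerate cases more carefully.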

If tangents and bitangents are not present in your FBX file (you still need normals and one set of texture coordinates for this to work), you can use the GenerateTangentsData method of the FbxMesh object to build them.
bool result = _pMesh->GenerateTangentsData(uvSetIndex, overwrite, ignoreTangentFlip);
You will want overwrite set to false and ignoreTangentFlip set to false most of the time.
uvSetIndex will be 0 most of the time as well (unless you have multiple UV sets on your model).
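Wired into the loader from the question, that could look like the following guard, run once per mesh before GetTangent/GetBinormal are called (a sketch; GenerateTangentsData is the real FBX SDK call, the surrounding logic is illustrative):
// If the mesh carries no tangent/binormal element, ask the FBX SDK to
// build them from the normals and the first UV set before reading them.
if (_pMesh->GetElementTangentCount() == 0 || _pMesh->GetElementBinormalCount() == 0)
{
    // uvSetIndex = 0, overwrite = false, ignoreTangentFlip = false
    bool ok = _pMesh->GenerateTangentsData(0, false, false);
    assert(ok && "mesh lacks the normals/UVs needed to generate tangents");
}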

Related

Assimp GLTF: Meshes not properly scaled (?)

I am currently trying to load the Sponza model for my PBR renderer. It is in the glTF format. I have been able to load smaller models quite successfully, but when I try to load the Sponza scene, some meshes aren't properly scaled. This image demonstrates my issue (I am not actually doing the PBR calculations here, that's just the albedo of the model). The wall is there, but it's only a 1x1 quad with the wall texture, even though it's supposed to be a lot bigger and stretch across the entire model. The same goes for every wall and every floor in the model. The model is not broken, as Blender and the default Windows model viewer can load it correctly. I am even applying mNode->mTransformation, but it still doesn't work. My model loading code looks kind of like this:
void Model::LoadModel(const fs::path& filePath)
{
    Assimp::Importer importer;
    std::string pathString = filePath.string();
    m_Scene = importer.ReadFile(pathString, aiProcess_Triangulate | aiProcess_GenNormals | aiProcess_CalcTangentSpace);
    if (!m_Scene || m_Scene->mFlags & AI_SCENE_FLAGS_INCOMPLETE || !m_Scene->mRootNode)
    {
        // Error handling
    }
    ProcessNode(m_Scene->mRootNode, m_Scene, glm::mat4(1.0f));
}

void Model::ProcessNode(aiNode* node, const aiScene* m_Scene, glm::mat4 parentTransformation)
{
    glm::mat4 transformation = AiMatrix4x4ToGlm(&node->mTransformation);
    glm::mat4 globalTransformation = transformation * parentTransformation;

    for (int i = 0; i < node->mNumMeshes; i++)
    {
        aiMesh* assimpMesh = m_Scene->mMeshes[node->mMeshes[i]];
        // This will just get the vertex data, indices, tex coords, etc.
        Mesh neroMesh = ProcessMesh(assimpMesh, m_Scene);
        neroMesh.SetTransformationMatrix(globalTransformation);
        m_Meshes.push_back(neroMesh);
    }

    for (int i = 0; i < node->mNumChildren; i++)
    {
        ProcessNode(node->mChildren[i], m_Scene, globalTransformation);
    }
}

glm::mat4 Model::AiMatrix4x4ToGlm(const aiMatrix4x4* from)
{
    glm::mat4 to;
    // Transpose: assimp matrices are row-major, glm matrices are column-major.
    to[0][0] = (GLfloat)from->a1; to[0][1] = (GLfloat)from->b1; to[0][2] = (GLfloat)from->c1; to[0][3] = (GLfloat)from->d1;
    to[1][0] = (GLfloat)from->a2; to[1][1] = (GLfloat)from->b2; to[1][2] = (GLfloat)from->c2; to[1][3] = (GLfloat)from->d2;
    to[2][0] = (GLfloat)from->a3; to[2][1] = (GLfloat)from->b3; to[2][2] = (GLfloat)from->c3; to[2][3] = (GLfloat)from->d3;
    to[3][0] = (GLfloat)from->a4; to[3][1] = (GLfloat)from->b4; to[3][2] = (GLfloat)from->c4; to[3][3] = (GLfloat)from->d4;
    return to;
}
I don't think the ProcessMesh function is necessary for this, but if it is I can post it as well.
Does anyone see any issues? I am really getting desperate over this...
Turns out I am the dumbest human being on earth. I implemented parallax mapping in my code but disabled it immediately after, as it was not working correctly on every model. BUT I still had this line in my shader code:
if (texCoords.x > 1.0 || texCoords.y > 1.0 || texCoords.x < 0.0 || texCoords.y < 0.0)
    discard;
which ended up removing some parts of my model.
If I had to rate my stupidity on a scale from 1 to 10, I'd give it a 100000000000.

Vertex skinning artifacts with specific glTF models

I recently implemented vertex skinning in my own Vulkan engine. Pretty much all models render properly; however, I find that with Mixamo models I get skinning artifacts. The main difference I found between "regular" glTF models and Mixamo models is that the Mixamo models share inverse bind matrices between multiple meshes, but I highly doubt that this causes the issue.
Here you can see that for some reason the vertices are pulled towards one specific point, which seems to be at (0, 0, 0). I know for sure that this is not caused by the vertex and index loading, as the model renders properly without vertex skinning.
Calculation of joint matrices
void Object::updateJointsByNode(Node *node) {
    if (node->mesh && node->skinIndex > -1) {
        auto inverseTransform = glm::inverse(node->getLocalMatrix());
        auto skin = this->skinLookup[node->skinIndex];
        auto numberOfJoints = skin->jointsIndices.size();

        std::vector<glm::mat4> jointMatrices(numberOfJoints);
        for (size_t i = 0; i < numberOfJoints; i++) {
            jointMatrices[i] =
                this->getNodeByIndex(skin->jointsIndices[i])->getLocalMatrix() * skin->inverseBindMatrices[i];
            jointMatrices[i] = inverseTransform * jointMatrices[i];
        }
        this->inverseBindMatrices = jointMatrices;
    }
    for (auto &child : node->children) {
        this->updateJointsByNode(child);
    }
}
Calculation of vertex displacement in GLSL Vertex Shader
mat4 skinMat =
    inWeight0.x * model.jointMatrix[int(inJoint0.x)] +
    inWeight0.y * model.jointMatrix[int(inJoint0.y)] +
    inWeight0.z * model.jointMatrix[int(inJoint0.z)] +
    inWeight0.w * model.jointMatrix[int(inJoint0.w)];
localPosition = model.model * model.local * skinMat * vec4(inPosition, 1.0);
So basically the mistake I made was relatively simple: I was casting the joint indices to the wrong type. When a glTF model uses a byte stride of 4 for the vertex joints, the implicit component type is uint8_t (unsigned byte), not the uint16_t (unsigned short) I was using.
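A sketch of the component-type dispatch that avoids this mistake (the function and buffer layout are illustrative, not from a particular loader; 5121 and 5123 are the componentType constants the glTF specification defines for unsigned byte and unsigned short):
#include <array>
#include <cstddef>
#include <cstdint>

// glTF accessor componentType values (from the specification).
constexpr int GLTF_UNSIGNED_BYTE  = 5121;
constexpr int GLTF_UNSIGNED_SHORT = 5123;

// Read one JOINTS_0 attribute (4 indices) for vertex i from a raw buffer.
// 'data' points at the joints of vertex 0; 'componentType' comes from the
// accessor. Illustrative only; real loaders must also honor byteStride.
std::array<uint32_t, 4> readJoints(const uint8_t* data, int componentType, size_t i)
{
    std::array<uint32_t, 4> joints{};
    if (componentType == GLTF_UNSIGNED_BYTE) {
        const uint8_t* p = data + i * 4;   // 4 one-byte indices per vertex
        for (int k = 0; k < 4; ++k) joints[k] = p[k];
    } else if (componentType == GLTF_UNSIGNED_SHORT) {
        const uint16_t* p = reinterpret_cast<const uint16_t*>(data) + i * 4;
        for (int k = 0; k < 4; ++k) joints[k] = p[k];
    }
    return joints;
}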

Kinect for Windows v2 depth to color image misalignment

Currently I am developing a tool for the Kinect for Windows v2 (similar to the one in the Xbox One). I tried to follow some examples and have a working example that shows the camera image, the depth image, and an image that maps the depth to the RGB using OpenCV. But I see that it duplicates my hand when doing the mapping, and I think it is due to something wrong in the coordinate mapper part.
Here is an example of it:
And here is the code snippet that creates the image (rgbd image in the example)
void KinectViewer::create_rgbd(cv::Mat& depth_im, cv::Mat& rgb_im, cv::Mat& rgbd_im)
{
    HRESULT hr = m_pCoordinateMapper->MapDepthFrameToColorSpace(cDepthWidth * cDepthHeight, (UINT16*)depth_im.data, cDepthWidth * cDepthHeight, m_pColorCoordinates);

    rgbd_im = cv::Mat::zeros(depth_im.rows, depth_im.cols, CV_8UC3);

    double minVal, maxVal;
    cv::minMaxLoc(depth_im, &minVal, &maxVal);

    for (int i = 0; i < cDepthHeight; i++){
        for (int j = 0; j < cDepthWidth; j++){
            if (depth_im.at<UINT16>(i, j) > 0 && depth_im.at<UINT16>(i, j) < maxVal * (max_z / 100) && depth_im.at<UINT16>(i, j) > maxVal * min_z / 100){
                ColorSpacePoint colorPoint = m_pColorCoordinates[i * cDepthWidth + j];
                int colorX = (int)(floor(colorPoint.X + 0.5));
                int colorY = (int)(floor(colorPoint.Y + 0.5));
                if ((colorX >= 0) && (colorX < cColorWidth) && (colorY >= 0) && (colorY < cColorHeight))
                {
                    rgbd_im.at<cv::Vec3b>(i, j) = rgb_im.at<cv::Vec3b>(colorY, colorX);
                }
            }
        }
    }
}
Does anyone have a clue of how to solve this? How to prevent this duplication?
Thanks in advance
UPDATE:
If I do a simple depth image thresholding, I obtain the following image:
This is more or less what I expected to happen, rather than having a duplicate hand in the background. Is there a way to prevent this duplicate hand in the background?
I suggest you use the BodyIndexFrame to identify whether a specific value belongs to a player or not. This way, you can reject any RGB pixel that does not belong to a player and keep the rest. I do not think that the CoordinateMapper is lying.
A few notes:
Include the BodyIndexFrame source in your frame reader
Use MapColorFrameToDepthSpace instead of MapDepthFrameToColorSpace; this way, you'll get the HD image for the foreground
Find the corresponding DepthSpacePoint and depthX, depthY, instead of ColorSpacePoint and colorX, colorY
Here is my approach when a frame arrives (it's in C#):
depthFrame.CopyFrameDataToArray(_depthData);
colorFrame.CopyConvertedFrameDataToArray(_colorData, ColorImageFormat.Bgra);
bodyIndexFrame.CopyFrameDataToArray(_bodyData);
_coordinateMapper.MapColorFrameToDepthSpace(_depthData, _depthPoints);
Array.Clear(_displayPixels, 0, _displayPixels.Length);

for (int colorIndex = 0; colorIndex < _depthPoints.Length; ++colorIndex)
{
    DepthSpacePoint depthPoint = _depthPoints[colorIndex];
    if (!float.IsNegativeInfinity(depthPoint.X) && !float.IsNegativeInfinity(depthPoint.Y))
    {
        int depthX = (int)(depthPoint.X + 0.5f);
        int depthY = (int)(depthPoint.Y + 0.5f);
        if ((depthX >= 0) && (depthX < _depthWidth) && (depthY >= 0) && (depthY < _depthHeight))
        {
            int depthIndex = (depthY * _depthWidth) + depthX;
            byte player = _bodyData[depthIndex];

            // Identify whether the point belongs to a player
            if (player != 0xff)
            {
                int sourceIndex = colorIndex * BYTES_PER_PIXEL;
                _displayPixels[sourceIndex] = _colorData[sourceIndex++]; // B
                _displayPixels[sourceIndex] = _colorData[sourceIndex++]; // G
                _displayPixels[sourceIndex] = _colorData[sourceIndex++]; // R
                _displayPixels[sourceIndex] = 0xff;                      // A
            }
        }
    }
}
Here is the initialization of the arrays:
BYTES_PER_PIXEL = (PixelFormats.Bgr32.BitsPerPixel + 7) / 8;
_colorWidth = colorFrame.FrameDescription.Width;
_colorHeight = colorFrame.FrameDescription.Height;
_depthWidth = depthFrame.FrameDescription.Width;
_depthHeight = depthFrame.FrameDescription.Height;
_bodyIndexWidth = bodyIndexFrame.FrameDescription.Width;
_bodyIndexHeight = bodyIndexFrame.FrameDescription.Height;
_depthData = new ushort[_depthWidth * _depthHeight];
_bodyData = new byte[_depthWidth * _depthHeight];
_colorData = new byte[_colorWidth * _colorHeight * BYTES_PER_PIXEL];
_displayPixels = new byte[_colorWidth * _colorHeight * BYTES_PER_PIXEL];
_depthPoints = new DepthSpacePoint[_colorWidth * _colorHeight];
Notice that the _depthPoints array has a 1920x1080 size.
Once again, the most important thing is to use the BodyIndexFrame source.
Finally I got some time to write the long-awaited answer.
Let's start with some theory to understand what is really happening, and then a possible answer.
We should start by knowing the way to pass from a 3D point cloud, which has the depth camera as its coordinate system origin, to an image in the image plane of the RGB camera. To do that, it is enough to use the camera pinhole model:

s · [u v 1]^T = K · [R|t] · [X Y Z 1]^T

Here, u and v are the coordinates in the image plane of the RGB camera. The first matrix on the right side of the equation, K, is the camera matrix, AKA the intrinsics of the RGB camera. The following matrix, [R|t], is the rotation and translation of the extrinsics, or better said, the transformation needed to go from the depth camera coordinate system to the RGB camera coordinate system. The last part is the 3D point.
Basically, something like this is what the Kinect SDK does. So, what could go wrong that makes the hand get duplicated? Well, actually, more than one point projects to the same pixel...
To put it in other words, in the context of the problem in the question:
The depth image is a representation of an ordered point cloud, and I am querying the u, v values of each of its pixels, which in reality can easily be converted to 3D points. The SDK gives you the projection, but several points can map to the same pixel (usually, a large distance along the z axis between two neighboring points produces this problem quite easily).
Now, the big question: how can you avoid this? Well, I am not sure it can be done with the Kinect SDK, since you do not know the Z value of the points AFTER the extrinsics are applied, so it is not possible to use a technique like Z-buffering... However, you may assume the Z values will be quite similar and use those from the original point cloud (at your own risk).
If you were doing it manually, and not with the SDK, you could apply the extrinsics to the points and then project them into the image plane, marking in another matrix which point is mapped to which pixel; if a point is already mapped there, compare the z values and always keep the point closest to the camera. Then you will have a valid mapping without any problems. This is a kind of naive way; you can probably find better ones, since the problem is now clear :)
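As a sketch of that manual z-buffered mapping (everything here is illustrative: projectToColor stands in for applying the extrinsics and intrinsics yourself and is not a Kinect SDK call):
#include <opencv2/core.hpp>
#include <limits>
#include <vector>

// Hypothetical projection of a 3D point (depth-camera space) into the color
// image: applies the extrinsics, then the intrinsics. Assumed, implemented elsewhere.
struct Projected { int x, y; float z; };
Projected projectToColor(const cv::Point3f& p);

// For each color pixel, keep only the index of the nearest projected point.
std::vector<int> mapWithZBuffer(const std::vector<cv::Point3f>& cloud,
                                int width, int height)
{
    std::vector<float> zbuf(width * height, std::numeric_limits<float>::max());
    std::vector<int>   nearest(width * height, -1);

    for (size_t i = 0; i < cloud.size(); ++i) {
        Projected pr = projectToColor(cloud[i]);
        if (pr.x < 0 || pr.x >= width || pr.y < 0 || pr.y >= height)
            continue;
        int idx = pr.y * width + pr.x;
        if (pr.z < zbuf[idx]) {       // closer than what is already there?
            zbuf[idx] = pr.z;
            nearest[idx] = (int)i;    // occluded background points get overwritten
        }
    }
    return nearest; // -1 where nothing mapped; otherwise the visible point
}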
I hope it is clear enough.
P.S.:
I do not have a Kinect 2 at the moment, so I can't check whether there is an update relative to this issue or whether the same thing still happens. I used the first released version (not the pre-release) of the SDK, so a lot of changes may have happened... If someone knows whether this was solved, just leave a comment :)

libgdx changing the texture on a pre textured model

I've exported a model from Blender, but I want some instances to use a different texture:
if (x % 2 == 0) {
    shipInstance.materials.clear();
    shipInstance.materials.add(new Material());
    shipInstance.materials.get(0).set(new TextureAttribute(TextureAttribute.Diffuse, enemyTexture));
}
Unfortunately, this doesn't work!
In a similar way I want to be able to change things like shininess and smoothing.
(I'm guessing you can change things like this when using the default shader?)
I've also (later) tried this...
Material mat = shipInstance.materials.get(m);
for (Iterator<Attribute> ai = mat.iterator(); ai.hasNext();) {
    Attribute att = ai.next();
    if (att.type == TextureAttribute.Diffuse) {
        ((TextureAttribute)att).textureDescription.set(enemyTexture, TextureFilter.Linear, TextureFilter.Linear, TextureWrap.ClampToEdge, TextureWrap.ClampToEdge);
    }
}
amongst other things...
argh!
for (int m = 0; m < shipInstance.materials.size; m++) {
    Material mat = shipInstance.materials.get(m);
    for (Iterator<Attribute> ai = mat.iterator(); ai.hasNext();) {
        Attribute att = ai.next();
        if (att.type == TextureAttribute.Diffuse) {
            ((TextureAttribute)att).textureDescription.set(enemyTexture, TextureFilter.Linear, TextureFilter.Linear, TextureWrap.ClampToEdge, TextureWrap.ClampToEdge);
        }
    }
}
My mistake was to subtract 1 from materials.size! The last material in the model happened to be the most obvious one, so in many of the things I tried the code was probably working, except for that last material. D'oh!

Projection Mapping with Kinect and OpenGL

I'm currently using a JavaCV program called procamcalib to calibrate a Kinect-projector setup, which has the Kinect RGB camera as its origin. This setup consists solely of a Kinect RGB camera (I'm roughly using the Kinect just as an ordinary camera at the moment) and one projector. This calibration software uses libfreenect (OpenKinect) as the Kinect driver.
Once the software completes its process, it gives me the intrinsic and extrinsic parameters of both the camera and the projector, which are fed into an OpenGL program to validate the calibration, and this is where a few problems begin. Once the projection and modelview matrices are correctly set, I should be able to fit what is seen by the Kinect to what is being projected, but in order to achieve this I have to do a manual translation along all three axes, and this last part doesn't make any sense to me! Could you guys please help me sort this out?
The SDK used to retrieve Kinect data is OpenNI (not the latest 2.x version; it should be 1.5.x).
I'll explain exactly what I'm doing to reproduce this error. The calibration parameters are used as follows:
The Projection matrix is set as ( based on http://sightations.wordpress.com/2010/08/03/simulating-calibrated-cameras-in-opengl/ ):
r = width/2.0f; l = -width/2.0f;
t = height/2.0f; b = -height/2.0f;
alpha = fx; beta = fy;
xo = cx; yo = cy;
X = kinectCalibration.c_near + kinectCalibration.c_far;
Y = kinectCalibration.c_near*kinectCalibration.c_far;
d = kinectCalibration.c_near - kinectCalibration.c_far;
float* glOrthoMatrix = (float*)malloc(16*sizeof(float));
glOrthoMatrix[0] = 2/(r-l); glOrthoMatrix[4] = 0.0f; glOrthoMatrix[8] = 0.0f; glOrthoMatrix[12] = (r+l)/(l-r);
glOrthoMatrix[1] = 0.0f; glOrthoMatrix[5] = 2/(t-b); glOrthoMatrix[9] = 0.0f; glOrthoMatrix[13] = (t+b)/(b-t);
glOrthoMatrix[2] = 0.0f; glOrthoMatrix[6] = 0.0f; glOrthoMatrix[10] = 2/d; glOrthoMatrix[14] = X/d;
glOrthoMatrix[3] = 0.0f; glOrthoMatrix[7] = 0.0f; glOrthoMatrix[11] = 0.0f; glOrthoMatrix[15] = 1;
printM( glOrthoMatrix, 4, 4, true, "glOrthoMatrix" );
float* glCameraMatrix = (float*)malloc(16*sizeof(float));
glCameraMatrix[0] = alpha; glCameraMatrix[4] = skew; glCameraMatrix[8] = -xo; glCameraMatrix[12] = 0.0f;
glCameraMatrix[1] = 0.0f; glCameraMatrix[5] = beta; glCameraMatrix[9] = -yo; glCameraMatrix[13] = 0.0f;
glCameraMatrix[2] = 0.0f; glCameraMatrix[6] = 0.0f; glCameraMatrix[10] = X; glCameraMatrix[14] = Y;
glCameraMatrix[3] = 0.0f; glCameraMatrix[7] = 0.0f; glCameraMatrix[11] = -1; glCameraMatrix[15] = 0.0f;
float* glProjectionMatrix = algMult( glOrthoMatrix, glCameraMatrix );
And the Modelview matrix is set as:
proj_loc = new Vec3f( proj_RT[12], proj_RT[13], proj_RT[14] );
proj_fwd = new Vec3f( proj_RT[8], proj_RT[9], proj_RT[10] );
proj_up = new Vec3f( proj_RT[4], proj_RT[5], proj_RT[6] );
proj_trg = new Vec3f( proj_RT[12] + proj_RT[8],
proj_RT[13] + proj_RT[9],
proj_RT[14] + proj_RT[10] );
gluLookAt( proj_loc[0], proj_loc[1], proj_loc[2],
proj_trg[0], proj_trg[1], proj_trg[2],
proj_up[0], proj_up[1], proj_up[2] );
And finally the camera is displayed and moved around with:
glPushMatrix();
glTranslatef(translateX, translateY, translateZ);
drawRGBCamera();
glPopMatrix();
where the translation values are manually adjusted with the keyboard until I have a visual match (I'm projecting onto the calibration board what the Kinect RGB camera is seeing, so I manually adjust the OpenGL camera until the projected pattern matches the printed one).
My question here is: WHY do I have to make this manual adjustment? The modelview and projection setup should take care of it.
I was also wondering if there are any problems when switching drivers like that, since OpenKinect is used for calibration and OpenNI for validation. This came to mind when researching another popular calibration tool called RGBDemo, which says that a Kinect calibration is needed when using the libfreenect backend.
So, will a calibration go wrong if it is made with one driver and displayed with another?
Does anyone think it'll be easier to achieve success if this is done with OpenCV rather than OpenGL ?
JavaCV Reference: https://code.google.com/p/javacv/
Procamcalib "short paper": http://www.ok.ctrl.titech.ac.jp/~saudet/research/procamcalib/
Procamcalib source code: https://code.google.com/p/javacv/source/browse?repo=procamcalib
RGBDemo calibration Reference: http://labs.manctl.com/rgbdemo/index.php/Documentation/Calibration
I can upload more things if necessary, just let me know what you guys need to be able to help me out :)
I'm the author of the article you linked to, and I think I can help.
The problem is in how you're setting your modelview matrix. You're using the translation part of proj_RT (elements 12-14) as the camera's position when you call gluLookAt(), but it isn't the camera's position; it's the position of the world origin in camera coordinates. I wrote an article for my new blog that might help clear this up. It describes three different (equivalent) ways of interpreting the extrinsic matrix, with WebGL demos of each:
http://ksimek.github.io/2012/08/22/extrinsic/
If you must use gluLookAt, that article will show you how, but it's much simpler to just call glLoadMatrixf(proj_RT).
tl;dr: replace gluLookAt() with glLoadMatrixf(proj_RT)
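A sketch of that replacement (assuming proj_RT is a 16-float array already in OpenGL's column-major layout; if your calibration tool emits row-major data, transpose it first):
#include <GL/gl.h>

// Load the extrinsic matrix directly as the modelview matrix instead of
// decomposing it into eye/target/up vectors for gluLookAt.
void setModelviewFromExtrinsics(const float proj_RT[16])
{
    glMatrixMode(GL_MODELVIEW);
    glLoadMatrixf(proj_RT); // reads the array in column-major order
    // If proj_RT came out row-major, use glLoadTransposeMatrixf(proj_RT)
    // (OpenGL 1.3+) or transpose it manually before loading.
}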
For Kinect calibration, take a look at the latest 0.7 release of RGBDemo (http://labs.manctl.com/rgbdemo) and the corresponding Freenect calibration source.
From the v0.7.0 ChangeLogs:
New features since v0.6.1:
New demo to acquire object models using markers
Simple calibration mode for rgbd-multikinect
Much faster grabbing in rgbd-multikinect
Add timestamps and camera serials when saving to disk
Compatibility with PCL 1.4
Various bug fixes
A very good book to follow is Jason McKesson's Learning Modern 3D Graphics Programming. You may also read the Kinect's ROS page and Nicolas' Kinect Calibration page.