Parsing the Wavefront OBJ file format - C++

I'd like to import obj models into my opengl program. I have a class / data format that I use to pass attribute data into shaders:
class CustomVertex : public IVtxFmt
{
public:
    float m_Position[3];   // x, y, z;     offset 0,  size = 3 * sizeof(float)
    float m_Normal[3];     // nx, ny, nz;  offset 3
    float m_TexCoords[2];  // u, v;        offset 6
    float m_Colour[4];     // r, g, b, a;  offset 8
    float m_Tangent[3];    // r, g, b;     offset 12
    float m_Bitangent[3];  // r, g, b;     offset 15
};
So I'm working with a model of a log cabin I downloaded from the Internet.
The log cabin has several vertices, normals, and texture coord definitions, followed by a list of face definitions.
So my first instinct was to parse the obj file and end up with
vector<vertex>
vector<Normal>
vector<TexCoord>
That's not straightforward to translate into my CustomVertex format, since there might be 210 vertices, 100 tex coords and 80 normals defined in the file.
After a list of ~390 faces in this format:
f 83/42/1 67/46/1 210/42/1
I encounter the following in the file:
#
# object tile00
#
followed by more vertex definitions.
So from this, I have inferred that a model might consist of several sub-objects, each defined by a number of faces, and each face defined by three v/vn/vt index triples.
So in order to arrive with a vector of CustomVertex, I'm thinking that I need to do the following:
create and populate:
vector <vertex>
vector <normal>
vector <texcoord>
vector <indices>
I need to create a CustomVertex for each unique v/vn/vt triple in the face definitions.
So I thought about creating a map:
std::vector<CustomVertex> and
std::map< nHashId, CustomVertex_index >
So my idea is that for each v/vn/vt I encounter, I create a hash of this string e.g. nHashId = hash("80/50/1")* and search the map for the hash. If none exists, I create a CustomVertex and add it to the vector, then I add the newly created hash and the CustomVertex_index into the map.
*: By creating a hash of the v/vn/vt string, I'm creating a unique numeric value that corresponds to that string, which I'm hoping is faster to search/compare in the map than the equivalent text.
If I come across a match to the hash, I consider that the CustomVertex already exists, and instead of creating a new CustomVertex, I just add the CustomVertex_index entry to the indices vector and move on.
Since this seems like a computationally expensive exercise, I guess I'll be dumping my CustomVertex arrays (and corresponding indices arrays) to disk for later retrieval, rather than parse the obj file every time.
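For example, a minimal sketch of such a binary cache, assuming CustomVertex stays a flat, trivially copyable struct (the file layout and function name here are purely illustrative):

#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical cache layout: [vertex count][index count][vertices][indices]
bool SaveMeshCache(const std::string &path,
                   const std::vector<CustomVertex> &verts,
                   const std::vector<uint32_t> &indices)
{
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;
    const uint32_t nv = static_cast<uint32_t>(verts.size());
    const uint32_t ni = static_cast<uint32_t>(indices.size());
    out.write(reinterpret_cast<const char *>(&nv), sizeof nv);
    out.write(reinterpret_cast<const char *>(&ni), sizeof ni);
    out.write(reinterpret_cast<const char *>(verts.data()), nv * sizeof(CustomVertex));
    out.write(reinterpret_cast<const char *>(indices.data()), ni * sizeof(uint32_t));
    return static_cast<bool>(out);
}

Loading is the mirror image: read the two counts, resize the vectors, and read the raw blocks back in.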
Before I ask my questions, may I point out that, due to time constraints and not wanting to have to redesign my Vbo class (a non-trivial task), I'm stuck with the CustomVertex format. I know it's possible to supply attributes in separate arrays to my shaders, but I had read that interleaving the data as I have with CustomVertex can enhance performance.
So to my questions:
1. Does my method seem sound or crazy? If crazy, please point out where I'm going wrong.
2. Can you spot any potential issues?
3. Has anyone done this before and can recommend a simpler way to achieve what I'm trying to?

Can you spot any potential issues?
You mean besides hash collisions? Because I don't see the part of your algorithm that handles that.
Has anyone done this before and can recommend a simpler way to achieve what I'm trying to?
There's a much simpler way: just compare the indices and not use hashes.

Instead of creating a string hash of "v/vn/vt", the idea is to hash only v as an integer. That gives you a bucket containing all the "v/vn/vt" combinations that share the same v index.
If a hash collision happens (the same v is encountered), you compare the colliding combination with those already in the bucket to see whether it is really a duplicate. If not, remember to add the colliding combination to the bucket.
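A sketch of that bucket scheme in C++ (ResolveIndex and BuildCustomVertex are made-up names, and the triple simply holds the three face indices in whatever order they were parsed):

#include <array>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

using Triple = std::array<int, 3>;  // the three indices of one face corner

// Buckets keyed on the position index v alone; each bucket holds the
// triples (and their output indices) that share that v.
std::unordered_map<int, std::vector<std::pair<Triple, uint32_t>>> buckets;
std::vector<CustomVertex> vertices;
std::vector<uint32_t>     indices;

uint32_t ResolveIndex(const Triple &t)
{
    auto &bucket = buckets[t[0]];
    for (const auto &entry : bucket)          // compare whole triples, no string hashing
        if (entry.first == t)
            return entry.second;              // this CustomVertex already exists

    const uint32_t newIndex = static_cast<uint32_t>(vertices.size());
    vertices.push_back(BuildCustomVertex(t)); // hypothetical helper that copies pos/normal/uv
    bucket.emplace_back(t, newIndex);
    return newIndex;
}

For each corner of each face you would then call indices.push_back(ResolveIndex(corner));.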

Related

Efficient data structure for storing 3d points

I'm looking for an efficient data structure for storing 3D points (x, y, z). Storing the points in such a structure should give a more memory-efficient representation and a faster search for a specific set of coordinates. Each 3D point maps to a specific ID, so the structure should keep track of each set of coordinates. I'm looking for any implementation which is available.
x, y, z give the Cartesian coordinates of each node.
id  x          y          z
 1  14.566132  34.873772  7.857000
 2  16.022520  33.760513  7.047000
 3  17.542000  32.604973  6.885001
 4  19.163984  32.022469  5.913000
 5  20.448090  30.822802  4.860000
 6  21.897903  28.881084  3.402000
 7  18.461960  30.289471  8.586000
 8  19.420759  28.730757  9.558000
The number of coordinates will be huge, maybe around 1,000,000.
Thanks in advance!
a more memory efficient structure
More memory efficient than what? A list? You'd need compression for that.
a faster search for a specific set of coordinates
If you want to find the k closest points from a set of coordinates, a ball tree is a good option.
If you want to search a volume, a quad tree (or octree) works better.
I'm hearing that the coords you're looking up will be exact matches for those in the structure already. Depending perhaps on your spatial distribution, you could create a hash function that takes the coord and attempts to produce something fairly unique, then just use a standard hash map. Most modern languages provide some kind of hash map implementation, so all you'd need to do is provide those appropriate hash values for your coords.
If you need to look up coords near the test coord, then a balltree or octree or something, but it doesn't sound like that's what you need.
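A sketch of that idea in C++ (the bit-level hash combine is illustrative, and it assumes you look up exactly the same double values that were stored):

#include <cstdint>
#include <cstring>
#include <functional>
#include <unordered_map>

struct Point3 { double x, y, z; };

bool operator==(const Point3 &a, const Point3 &b)
{
    return a.x == b.x && a.y == b.y && a.z == b.z;
}

struct Point3Hash
{
    size_t operator()(const Point3 &p) const
    {
        // Hash the raw bits of each coordinate, then combine the three hashes.
        auto h = [](double d) {
            uint64_t bits;
            std::memcpy(&bits, &d, sizeof bits);
            return std::hash<uint64_t>{}(bits);
        };
        size_t seed = h(p.x);
        seed ^= h(p.y) + 0x9e3779b97f4a7c15ULL + (seed << 6) + (seed >> 2);
        seed ^= h(p.z) + 0x9e3779b97f4a7c15ULL + (seed << 6) + (seed >> 2);
        return seed;
    }
};

// Maps each coordinate to its ID; lookup of an exact coordinate is O(1) on average.
std::unordered_map<Point3, int, Point3Hash> pointToId;

For instance, pointToId[{14.566132, 34.873772, 7.857000}] = 1; would store the first row from the table above.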
You can use a struct:
struct coordinate
{
    double x;
    double y;
    double z;
} points[1000000];

2-dimensional geometric layout of objects

so I have this class:
class Piece {
    int width;
    int height;
};
My problem is that I need to make a container-type class that can somehow store the layout of multiple, differently sized "Piece" objects (note that a Piece can only represent rectangles).
Example:
 _________
| t       |
| t  jj   |
| t  jj   |
 _________
My goal with this is to be able to "fill" an empty rectangle with multiple "Piece" objects, while being able to know whether a given "Piece" can still fit.
I'm developing this in C++. I started with what seemed the most logical solution, a "matrix" of vectors (vector< vector< Piece * > > mat), but that doesn't work because, as I said, "Piece" objects can have different sizes.
I hope you can give me some hints on how to solve this, or links to a library or open-source project that does.
Thank you.
EDIT
I forgot this: I know the dimensions of the container beforehand, and insertion (after validation) is sequential (Piece after Piece) with no predefined orientation.
You can use Piece p[width][height] and memset it to all zeros, or use a std::vector if you don't know the size of the grid beforehand. Then, while adding a new Piece at some position (x, y), you can check whether any of its cells is already covered by some other Piece.
Edit: You can use a matrix char mem[sqrt(width)][sqrt(height)]; and a single vector of Pieces. Use the matrix to check whether there is a probable collision; if not, just add the Piece. Otherwise, iterate through all the existing ones and check for a collision.
If you want to make the procedure faster (this one is reasonable only for small grids), then you will need more "advanced" data structures. What I suggest is that you learn about 2D BIT (Fenwick) trees (there are a lot of resources on Google). You can also use 2D segment trees. Then, when adding a new Piece at position (x, y), check the sum of all the cells it would cover (e.g. from (x, y) to (x + width, y + height)). If that sum is zero, the new Piece won't collide with previous ones, and you then update the grid by adding 1 to all the cells inside your Piece (that is, to the corresponding values in the 2D segment tree). Otherwise, if the sum is greater than zero, there will be some overlap and you must discard the new Piece.
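For reference, a minimal sketch of the plain occupancy-grid version described above (class and member names are made up; the 2D BIT/segment-tree variant would replace the per-cell loops with range queries and range updates):

#include <vector>

struct Piece { int width; int height; };

class Container
{
public:
    Container(int w, int h) : w_(w), h_(h), occupied_(w * h, false) {}

    // Returns true and marks the cells if the piece fits at (x, y); returns false otherwise.
    bool TryPlace(const Piece &p, int x, int y)
    {
        if (x < 0 || y < 0 || x + p.width > w_ || y + p.height > h_)
            return false;                       // would stick out of the container
        for (int i = x; i < x + p.width; ++i)
            for (int j = y; j < y + p.height; ++j)
                if (occupied_[j * w_ + i])
                    return false;               // overlaps an existing piece
        for (int i = x; i < x + p.width; ++i)
            for (int j = y; j < y + p.height; ++j)
                occupied_[j * w_ + i] = true;   // claim the cells
        return true;
    }

private:
    int w_, h_;
    std::vector<bool> occupied_;   // one flag per unit cell of the container
};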

Vertex Specification Best Practices using OpenGL (Windows)

I wonder what is the best practice concerning cache management of the vertices.
Actually, I have read numerous articles on this topic, but I'm still not convinced about which choice I should make.
I'm coding a small 3D rendering engine and my goal is to optimize this rendering by limiting the number of draw calls and of course the number of buffer bindings!
So far, I gather into a single batch all the vertices of the objects sharing the same material properties (same lighting-model properties and texture). If a VBO reallocation fails (GL_OUT_OF_MEMORY), then I create a new VBO to store the vertices of my object. Finally, I attach each batch to a VAO.
Pseudo-code:
for_each vbo in vbo_list
{
    vbo->Bind();
    for_each batch in vbo->getAttachedBatchList()
    {
        batch->BindVAO();
        {
            glDrawXXX(batch->GetOffset(), batch->GetLength());
        }
    }
}
Everything works well, but is the technique I use the most efficient one?
I took a look at the following article (OpenGL ES - Mac):
https://developer.apple.com/library/ios/documentation/3DDrawing/Conceptual/OpenGLES_ProgrammingGuide/TechniquesforWorkingwithVertexData/TechniquesforWorkingwithVertexData.html
This article advises storing interleaved vertex data like below (for 3 given vertices):
VNTVNTVNT
In my case I use the following pattern:
VVVNNNTTT
Another article on the subject: https://www.opengl.org/wiki/Vertex_Specification_Best_Practices
In your opinion, what is the best choice in my case?
And finally, I have another question regarding vertex data alignment (the topic is covered in the first article). It says "Avoid Misaligned Vertex Data". Apparently, the advice is only relevant for the VNTVNTVNT case.
For example, if that layout is the best choice, the following structure declaration should be correct:
struct Vertex
{
    float x, y, z;    // 12 bytes
    float nx, ny, nz; // 12 bytes
    float s, t;       //  8 bytes
};
In this case, sizeof(Vertex) = 32 bytes, which is a multiple of 4 bytes!
If I add the color components r, g, b:
struct Vertex
{
    float x, y, z;    // 12 bytes
    float nx, ny, nz; // 12 bytes
    float r, g, b;    // 12 bytes
    float s, t;       //  8 bytes
};
We now have 44 bytes, which is also a multiple of 4 bytes!
So, according to the article, if I store my data this way I shouldn't have the misaligned-data problem.
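As a rough illustration of how the interleaved 44-byte struct above would be wired up with generic vertex attributes (attribute locations 0-3 and the buffer name vbo are assumptions, not something from the articles):

#include <cstddef>  // offsetof

// Assumes a GL 3+ context, the Vertex struct above, and a VBO already
// filled with interleaved data; the stride is simply sizeof(Vertex) = 44 bytes.
glBindBuffer(GL_ARRAY_BUFFER, vbo);
const GLsizei stride = sizeof(Vertex);
glEnableVertexAttribArray(0);  // position
glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, stride, (void *)offsetof(Vertex, x));
glEnableVertexAttribArray(1);  // normal
glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, stride, (void *)offsetof(Vertex, nx));
glEnableVertexAttribArray(2);  // colour
glVertexAttribPointer(2, 3, GL_FLOAT, GL_FALSE, stride, (void *)offsetof(Vertex, r));
glEnableVertexAttribArray(3);  // texture coordinates
glVertexAttribPointer(3, 2, GL_FLOAT, GL_FALSE, stride, (void *)offsetof(Vertex, s));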
In conclusion: is my pseudo-code correct? Is it correct to wait for a GL_OUT_OF_MEMORY error before creating a new VBO? Is it better to respect a maximum allocation size? And finally, which is the best way to store the data (interleaved or not), and is my data alignment proposal correct?
UPDATE:
The second article says: "1MB to 4MB is a nice size according to one nVidia document". That seems too small! I wonder if there is a mistake, because I know it's possible to store a much larger amount of data (much more than 100 MB without any problem on current hardware). Plus, some famous models like the Dragon could not be stored in a single VBO (it would mean that the geometry of this mesh would have to be split across 4 to 5 VBOs in the best case, so 5 draw calls to render it; I can't imagine, in that case, the number of VBOs allocated in a real video-game scene!). What do you think of that?

Transformation of sensor data

I want to transform this data (I was told to do it from the object perspective). A list of the data is:
[0, -20.790001, -4.49] make up the acceleration xyz coordinates - accel(x,y,z).
[-0.762739, -3.364226, -8.962189] make up angle xyz coordinates - angle(x,y,z).
Should I use Rodrigues' rotation formula or a linear transformation (rotation) matrix? Is this any different with sensor data?
I am able to read the data from the .csv, but I am unsure how to implement the transformation in C++, and how to create a matrix in C++.
As long as you have a formula for transforming the data, you just need to apply it. As for creating a matrix, there are multiple ways, either by using a two-dimensional array:
float matrix[][]   (or float **matrix if you want to use pointers)
or by using a class (or struct, up to you) which contains the rows and columns:
class Matrix
    float rows[]
    float columns[]
Good luck!
Note: this is just pseudocode; it definitely won't work out of the box, obviously.
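For reference, a minimal, self-contained sketch of Rodrigues' rotation formula, v' = v*cos(theta) + (k x v)*sin(theta) + k*(k . v)*(1 - cos(theta)) for a unit axis k, which could be applied to a vector such as the accelerometer reading above:

#include <cmath>

struct Vec3 { double x, y, z; };

Vec3 cross(const Vec3 &a, const Vec3 &b)
{
    return { a.y * b.z - a.z * b.y,
             a.z * b.x - a.x * b.z,
             a.x * b.y - a.y * b.x };
}

double dot(const Vec3 &a, const Vec3 &b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Rotate v around the unit axis k by angle theta (in radians).
Vec3 rotate(const Vec3 &v, const Vec3 &k, double theta)
{
    const double c = std::cos(theta), s = std::sin(theta);
    const Vec3   kxv = cross(k, v);
    const double kdv = dot(k, v);
    return { v.x * c + kxv.x * s + k.x * kdv * (1.0 - c),
             v.y * c + kxv.y * s + k.y * kdv * (1.0 - c),
             v.z * c + kxv.z * s + k.z * kdv * (1.0 - c) };
}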

Parsing a Wavefront .obj file using C++

While trying to parse a Wavefront .obj file, I thought of two approaches:
1. Create a 2D array the size of the number of vertices. When a face uses a vertex, get its coordinates from the array.
2. Get the starting position of the vertex list, and then when a face uses a vertex, scan the lines until you reach it.
IMO, option 1 will be very memory intensive, but much faster.
Since option 2 involves extensive file reading (and because the number of vertices in most objects becomes very large), it will be much slower, but less memory intensive.
The question is: Comparing the tradeoff between memory and speed, which option would be better suited to an average computer?
And, is there an alternative method?
I plan to use OpenGL along with GLFW to render the object.
IMO, Option 1 will be very memory intensive, but much faster.
You must get those vertices into memory anyway. But there's no need for a 2D array, which BTW would cause two pointer indirections, thus a major performance hit. Just use a simple std::vector<Vertex> for your data, the vector index is the index for the accompanying face list.
EDIT due to comment
class Vertex
{
public:
    // Anonymous structs inside unions are a common compiler extension;
    // they let you access the same storage by name or by array index.
    union { struct { float x, y, z; };    float pos[3]; };
    union { struct { float nx, ny, nz; }; float normal[3]; };
    union { struct { float s, t; };       float texcoord[2]; };

    Vertex &operator=(const Vertex &) = default;
};

std::vector<Vertex> vertices;
Generally you read the list of vertices into an array. Parsing ASCII text is extremely slow; do it only once when loading the file and then store everything in arrays in memory.
Same goes with the triangles / faces. Each triangle generally is composed of a list of three vertex indexes. That should also be stored in an array.
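For illustration, a bare-bones sketch of that read-once-into-arrays step (error handling, polygons with more than three corners, and negative or partial indices are ignored; names are illustrative):

#include <array>
#include <cstdio>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

struct Vec3 { float x, y, z; };
struct Vec2 { float u, v; };

std::vector<Vec3> positions, normals;
std::vector<Vec2> texcoords;
std::vector<std::array<int, 3>> faceCorners;  // v/vt/vn indices, one entry per face corner

void LoadObj(const std::string &path)
{
    std::ifstream in(path);
    std::string line;
    while (std::getline(in, line))
    {
        std::istringstream ss(line);
        std::string tag;
        ss >> tag;
        if (tag == "v")       { Vec3 p; ss >> p.x >> p.y >> p.z; positions.push_back(p); }
        else if (tag == "vn") { Vec3 n; ss >> n.x >> n.y >> n.z; normals.push_back(n); }
        else if (tag == "vt") { Vec2 t; ss >> t.u >> t.v; texcoords.push_back(t); }
        else if (tag == "f")
        {
            std::string corner;
            while (ss >> corner)                        // "v/vt/vn", 1-based in the file
            {
                std::array<int, 3> idx = {0, 0, 0};
                std::sscanf(corner.c_str(), "%d/%d/%d", &idx[0], &idx[1], &idx[2]);
                faceCorners.push_back(idx);
            }
        }
    }
}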
You may find the OBJ reader in the VTK open source library to be useful: http://www.vtk.org/doc/nightly/html/classvtkOBJReader.html. We use it and have had no reason to write our own... Use VTK directly, or you may find studying the source code to be a good source of inspiration for your own reader.
In my opinion, one of the major shortcomings with OBJ files is the use of ASCII. 3D ASCII files (be it STL, PLY, OBJ, etc.) are very slow to load if they are ASCII due to the string parsing. Binary format files are much faster and should always be used if performance is an issue: the load time for a good binary format is instantaneous.
Just load them into arrays. Memory should not be an issue. Your system (usually) has way more memory than your GPU. If you are running into memory problems, you are probably loading a model that is too detailed. (I am semi-assuming that you are going to make a game in OpenGL. If you have a specific need for such large model files, you will still have to work out a way to load the appropriate chunks.)
You shouldn't need a 2-dimensional array. Your models should be triangulated, and then you can simply load the obj file using GLUT's obj loader. Simply store points, faces and normals in 3 separate arrays/buffers. There is an example of how you can do it here, but if you want to do it fast you should go for a binary format.
This is a pretty decent solution for prototyping: run a script that generates the arrays for use in OpenGL or your preferred rendering API. obj2opengl.pl is a Perl script, so you'll need Perl installed, which you can find here. The GitHub link is here.
While running the Perl script you may get a runtime error on line 154 concerning if(defined(@center)). Replace it with if(@center).
From the example, once the header file is generated with the data, you can use it as shown:
/*
created with obj2opengl.pl
source file    : ./banana.obj
vertices       : 4032
faces          : 8056
normals        : 4032
texture coords : 4420
*/

// include generated arrays
#import "./banana.h"

// set input data to arrays
glVertexPointer(3, GL_FLOAT, 0, bananaVerts);
glNormalPointer(GL_FLOAT, 0, bananaNormals);
glTexCoordPointer(2, GL_FLOAT, 0, bananaTexCoords);

// draw data
glDrawArrays(GL_TRIANGLES, 0, bananaNumVerts);