I'm trying to implement a picking routine using transform feedback. It currently works, but it is very slow (slower than GL_SELECT).
How it works now:
1) Bind the TBO using glBindBufferRange() with an offset (0 at the beginning).
2) Reset the memory (the size of the TF varyings structure) using glBufferSubData(), to be sure the picking result will be correct. The main problem is here.
3) Draw the object with a geometry shader that checks for an intersection with the picking ray. If an intersection is found, the shader writes it to a TF varying (initially there is no intersection, see step 2).
4) Increase the offset and go to step 1 with the next object.
So, at the end I have an array of picking data for each object.
The question is how to avoid calling glBufferSubData() on each iteration? Possible solutions (but I don't know how to implement them) are:
Write only one TF varying, so that the others do not need to be reset
Reset the data in some other way
Any ideas?
If all you want to do is clear a region of a buffer, use glClearBufferSubData. That being said, it's not clear why you need to clear it, instead of just overwriting what's there.
FYI: Picking is best implemented by rendering the scene, assigning objects different "colors", and reading the pixel of interest back. Your method is always going to be slower.
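The color-based approach needs a reversible mapping between object IDs and colors. A minimal sketch of such a mapping, assuming an 8-bit RGB framebuffer and that ID 0 is reserved for "nothing hit" (the names idToColor and colorToId are hypothetical, not part of OpenGL):

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// One 8-bit RGB pixel, as it would come back from a GL_RGB / GL_UNSIGNED_BYTE readback.
using Rgb8 = std::array<std::uint8_t, 3>;

// Pack a 24-bit object ID into an RGB triple (red holds the low byte).
// Assumption: ID 0 is kept free to mean "background / no object".
Rgb8 idToColor(std::uint32_t id) {
    assert(id <= 0xFFFFFFu);  // only 24 bits fit into three 8-bit channels
    return { static_cast<std::uint8_t>(id & 0xFFu),
             static_cast<std::uint8_t>((id >> 8) & 0xFFu),
             static_cast<std::uint8_t>((id >> 16) & 0xFFu) };
}

// Recover the object ID from the pixel that was read back.
std::uint32_t colorToId(const Rgb8& c) {
    return static_cast<std::uint32_t>(c[0])
         | (static_cast<std::uint32_t>(c[1]) << 8)
         | (static_cast<std::uint32_t>(c[2]) << 16);
}
```

In use, each object would be drawn in the flat color returned by idToColor (with lighting, blending and anti-aliasing disabled so the color arrives unchanged), the pixel under the cursor would be read with glReadPixels, and colorToId would turn it back into the object ID.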
I have been using OpenGL for a while now and continue to stay positive about making progress. However, I now have an issue that I have been unable to solve and it's taking a while. So, the issue is that I would like to:
Create points on screen sequentially (to appear every second for example)
Move these points independently
So far I have two methods on paper. The first is to upload all vertices to a VBO and make each point visible (draw it) in turn. The other is to create an empty VBO (data set to NULL) and upload the data per point.
Note that I want to transform these points independently of each other. Can a uniform still be used? If so, how can I set this up to draw point - transform - draw point - transform?
If I'm going about this completely wrong, or there is a better method, then please say so.
Many thanks!
I have an OBJ file that I've parsed but, not surprisingly, the indexing for vertex positions and texture coordinates is separate.
Here are a couple of OBJ lines to show what I mean by different indexing. These are quads, where the first index references the XYZ position and the second index references the UV coords:
f 3899/8605 3896/8606 720/8607 3897/8608
f 3898/8609 3899/8610 3897/8611 721/8612
I know that one solution is to do some duplication, but what's the cleverest way to proceed?
For now I have these two options in mind:
1) Use the indexing to create two big sets of vertices and vertex texture coordinates. This means that I duplicate everything so that I will end up with a vertex for each couple v/vt in the faces blindly. If I have for example 1/3 in first face and the same 1/3 in a different face, I will end up with two separate vertices. Proceed then with glDrawArrays without using indices anymore, but the newly created sets (full of duplicates)
2) Examine each face vertex to come up with unique "GL vertices" (a vertex is unique when both position and texture coordinate match, in my specific case) and figure out a way of indexing these. Unlike 1), here I will not treat the same pair found multiple times as separate vertices. I'll then create a new index list for these new vertices and finally use glDrawElements with the new indices for the draw call.
Now, I believe that the first option is way easier, but I guess each glDrawArrays call will be a bit slower than a glDrawElements call, right? How big would the advantage be?
The second option, at first thought, looks pretty slow in the preprocessing step and more complicated to implement. But will it give me much better performance overall?
Is there any other way to deal with this issue?
If you have a few low-poly models, go for option #1: it's way easier to implement and the performance difference will be unnoticeable.
Option #2 would be the proper way if you have some high-poly models (looking at the sample, you have at least 9k vertices in there).
Generally you should not worry about model loading time, because that is done only once, and after that you can convert and save the model in whatever format is optimal for you (serialize it just the way it is stored in your code).
Where's the dividing line between these two approaches? It's impossible to say without real-life profiling on the target hardware and your vertex rendering pipe (skeletal animation, shadows, everything adds its toll).
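Option #2 does not have to be complicated. A minimal sketch, assuming the face corners have already been parsed into 1-based (v, vt) pairs; the types and the function buildIndexedMesh are hypothetical names chosen for illustration:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

struct Vec3 { float x, y, z; };
struct Vec2 { float u, v; };
struct GlVertex { Vec3 position; Vec2 texcoord; };

struct IndexedMesh {
    std::vector<GlVertex> vertices;      // unique position + texcoord combinations
    std::vector<std::uint32_t> indices;  // one entry per face corner, for glDrawElements
};

// corners holds one (v, vt) pair per face corner, 1-based as in the OBJ file.
IndexedMesh buildIndexedMesh(const std::vector<Vec3>& positions,
                             const std::vector<Vec2>& texcoords,
                             const std::vector<std::pair<int, int>>& corners) {
    IndexedMesh mesh;
    std::map<std::pair<int, int>, std::uint32_t> seen;  // (v, vt) -> new index
    for (const auto& corner : corners) {
        assert(corner.first >= 1 && corner.first <= static_cast<int>(positions.size()));
        assert(corner.second >= 1 && corner.second <= static_cast<int>(texcoords.size()));
        auto it = seen.find(corner);
        if (it == seen.end()) {
            // First time this pair appears: emit a new GL vertex for it.
            const auto index = static_cast<std::uint32_t>(mesh.vertices.size());
            mesh.vertices.push_back({positions[corner.first - 1],
                                     texcoords[corner.second - 1]});
            it = seen.emplace(corner, index).first;
        }
        mesh.indices.push_back(it->second);  // repeated pairs reuse the same index
    }
    return mesh;
}
```

The lookup costs O(log n) per corner, so even a mesh with tens of thousands of corners is processed quickly at load time. The index order follows the face order of the file, so quads still have to be triangulated (or drawn as quads where that is available) by the caller.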
I have OpenGL code that renders some objects and displays text labels for some of them. A label is displayed by projecting the appropriate vertex to the screen using gluProject, and then adding a small offset so the label is beside the vertex. This way each label is the same distance from its vertex on the screen.
I didn't originally use a display list for this (apart from the display lists for the glyphs), and it worked correctly (if somewhat slowly). Now I build a display list for the entire scene, and find that the labels are placed incorrectly.
It took me a while, but I think I have basically found the problem: gluProject takes as parameters the projection matrix, model-view matrix, and the viewport. I see no way to provide them other than calling glGetDoublev(GL_MODELVIEW_MATRIX, ...), etc. But glGet functions are "not allowed" in a display list, which - empirically - seems to mean that they don't cause an error, but rather execute immediately. So the matrix data being compiled into the display list is from list compilation time instead of list execution time (which is a problem because I need to precompile the list, not execute it immediately). At least this is my current theory.
Can anyone confirm or deny that this would cause the problem?
How does one solve this? I just want to do what gluProject does, but using the list's current matrices.
Note: I'm aware that various functions/approaches are deprecated in recent versions of OpenGL; please spare me answers along the lines of "you shouldn't be doing that" ;-)
Think about it: glGet… places some data in your process memory, possibly on the stack. There is absolutely no way a display list could reproduce calculations performed on data that is not even within its reach. Add to this that GLU (note the U) functions are not part of OpenGL, and hence don't make it into the display list. GLU functions are also not GPU accelerated: all the calculations happen on the CPU and, due to the API design, data transfer is rather inefficient.
Oddities like those, which, as you found out, make display lists rather impractical, are among the reasons why they have been stripped from later versions of OpenGL. Or in other words: don't use them.
Instead use Vertex Buffer Objects and index buffers. A labeling system like yours can be implemented using instancing, fed by a list of the target positions. If instancing is not available, you need to supply redundant position attributes in the label's vertex attribute vector.
Anyway: In your case making proper use of shaders and VBOs will easily outperform any display list based solution (because you can't display list everything).
A rather odd approach that would nevertheless work: put the calls to glRasterPos and glBitmap (and hence the glutBitmap text calls) in a display list, and apply the offset in the projection matrix before the actual projection mapping, i.e.
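/* Pass 1: the scene with its normal projection */
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
scene_projection();
draw_scene();
/* Pass 2: the labels, with the offset translation multiplied in before the projection */
glMatrixMode(GL_PROJECTION);
glLoadIdentity();
glTranslatef(...); /* for the offset */
scene_projection();
draw_labels();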
Though this is how I'd have done it 12 years ago. Definitely not today.
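For reference, the arithmetic behind gluProject is small enough to do by hand on the CPU at draw time, with whatever matrices are in effect for that frame. The following is only a sketch of that computation, not GLU's actual source; the names Mat4, transform and projectToWindow are hypothetical, and matrices are assumed to be column-major, as OpenGL stores them:

```cpp
#include <array>
#include <cassert>

using Mat4 = std::array<double, 16>;  // column-major: element (row, col) is m[col * 4 + row]
using Vec4 = std::array<double, 4>;

// Multiply a column-major 4x4 matrix by a column vector.
Vec4 transform(const Mat4& m, const Vec4& v) {
    Vec4 r{};  // zero-initialized accumulator
    for (int row = 0; row < 4; ++row)
        for (int col = 0; col < 4; ++col)
            r[row] += m[col * 4 + row] * v[col];
    return r;
}

// Map an object-space point to window coordinates.
// Returns false if the point cannot be projected (clip-space w is zero).
bool projectToWindow(double x, double y, double z,
                     const Mat4& modelview, const Mat4& projection,
                     const std::array<int, 4>& viewport,  // x, y, width, height
                     std::array<double, 3>& win) {
    assert(viewport[2] > 0 && viewport[3] > 0);
    const Vec4 object = {x, y, z, 1.0};
    const Vec4 clip = transform(projection, transform(modelview, object));
    if (clip[3] == 0.0) return false;
    // Perspective divide to normalized device coordinates in [-1, 1].
    const double ndcX = clip[0] / clip[3];
    const double ndcY = clip[1] / clip[3];
    const double ndcZ = clip[2] / clip[3];
    // Map NDC to the viewport rectangle and to a depth in [0, 1].
    win[0] = viewport[0] + viewport[2] * (ndcX * 0.5 + 0.5);
    win[1] = viewport[1] + viewport[3] * (ndcY * 0.5 + 0.5);
    win[2] = ndcZ * 0.5 + 0.5;
    return true;
}
```

Because this runs in application code, the label positions it produces would have to be computed outside the display list, each time the matrices change.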
As far as I know it is common practice to call glColor4f or the like each time before drawing an object or primitive.
But what about point and line style properties?
Is it normal to call glLineWidth and glPointSize very often?
Should I store a backup of the current point size and set it back after drawing, or simply call glPointSize before drawing any point, even ones which use the default size?
Unless you are drawing tens to hundreds of thousands of lines, it really won't matter. And even then, you should profile and verify that this actually matters to performance. But let's assume you did that.
Minimizing the number of state changes could improve your performance. This means that you should sort your lines by line size and your points by point size. That way, lines that are all the same size can be drawn at the same time. This of course assumes that you could draw the lines in any order. If you need the lines to be drawn in a certain order, then you will have to live with the state changes.
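The effect of sorting can be sketched on the CPU side, assuming the draw order is free. The Line struct and both functions are hypothetical names used only for illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Line {
    float width;           // desired line width
    float x0, y0, x1, y1;  // endpoints
};

// Number of width changes needed when drawing the lines in their current order
// (the first line always counts as one change).
std::size_t countWidthChanges(const std::vector<Line>& lines) {
    std::size_t changes = 0;
    for (std::size_t i = 0; i < lines.size(); ++i)
        if (i == 0 || lines[i].width != lines[i - 1].width)
            ++changes;
    return changes;
}

// Group lines of equal width together; stable so that lines of the same
// width keep their relative order.
void sortByWidth(std::vector<Line>& lines) {
    std::stable_sort(lines.begin(), lines.end(),
                     [](const Line& a, const Line& b) { return a.width < b.width; });
}
```

After sorting, the width only has to be set once per distinct width rather than potentially once per line.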
Avoid glGet* functions to determine the current line width / point size. They are big performance eaters.
Instead, store the current property locally and update it when necessary (preferred), or use glPushAttrib(GL_LINE_BIT) / glPopAttrib.
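Storing the property locally can be as simple as a small wrapper that remembers the last value it set. A minimal sketch, with a hypothetical class name and an injected setter so that it does not depend on a GL context:

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Remembers the last value passed to the setter and skips redundant calls.
// In real code the setter would forward to glLineWidth (or glPointSize).
class CachedFloatState {
public:
    explicit CachedFloatState(std::function<void(float)> setter)
        : setter_(std::move(setter)) {}

    void set(float value) {
        if (valid_ && value == current_) return;  // already set: nothing to do
        setter_(value);
        current_ = value;
        valid_ = true;
    }

    // Call this after code outside the wrapper may have changed the state.
    void invalidate() { valid_ = false; }

private:
    std::function<void(float)> setter_;
    float current_ = 0.0f;
    bool valid_ = false;  // nothing is assumed about the initial GL state
};
```

The cache only stays correct if every change to that state goes through the wrapper; invalidate() is the escape hatch for third-party code that touches it directly.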
OpenGL is a state machine, so the only important rule is that you set state when you need it and whenever it changes. You need a certain line width? Set it and be done with it. You need a number of different line widths in a single code segment? Sort by line width and set it once for every line width.
Some states are expensive to switch, so it's a good idea to keep track of those; in particular the states in question are anything related to texture binding (either to a texture unit or as FBO attachment) and shaders. Everything else is actually quite cheap to change.
In general it's a good idea to set OpenGL state explicitly and not to assume that certain states are still set from earlier. This also covers the transformation matrices and setup: do a full viewport and projection setup at the beginning of every call to the display function; advanced applications have to change those multiple times while drawing a single frame anyway (so no glViewport, glMatrixMode(GL_PROJECTION), ... in a reshape handler).
I have a question about something I am currently working on.
I have an OpenGL view in which I would like to display points.
So far, this is something I can handle ;)
For every point, I have its coordinates (X ; Y ; Z) and a value (unsigned char).
I have a color array giving the link between one value and a color.
For example, 255 is red, 0 is blue, and so on...
I want to display those points in an OpenGL view.
I want to use a threshold value so that I can modify the transparency of a point's color depending on that point's value.
I also want performance to stay acceptable even with a lot of points (5 billion in the worst case, but 1 to 2 million in a standard case).
I am now looking for the effective way to handle this.
I am interested in the VBO. I have read that it will allow some good performance and also that I can modify the buffer as I want without recalculating it from scratch (as with display list).
So that I can solve the threshold issue.
However, doing this dynamically on a million points will require some heavy calculations (at least a pretty bad for loop), no?
I am open to any suggestions and would like to discuss any of your ideas!
Trying to display a billion points or more is generally (forgive the pun) pointless.
Even an extremely high resolution screen has only a few million pixels. Nothing you can do will get it to display more points than that.
As such, your first step is almost undoubtedly to figure out a way to restrict your display to a number of points that's at least halfway reasonable. OpenGL can (and will) oblige if you ask it to display more, but your monitor won't, and neither will mine or much of anybody else's.
Not directly related to the OpenGL part of your question, but if you are looking at rendering massive point clouds you might want to read up on space partitioning hierarchies such as octrees to keep performance in check.
Put everything into one VBO. Draw it as an array of points: glDrawArrays(GL_POINTS,0,num). Calculate alpha in a pixel shader (using threshold passed as uniform).
If you want to change a small subset of points - you can map a sub-range of the VBO. If you need to update large parts frequently - you can use Transform Feedback to utilize GPU.
If you need to simulate something for the updates, you should consider using CUDA or OpenCL to run the update completely on the GPU. This will give you the best performance. Otherwise, you can use a single VBO and update it once per frame from the CPU. If this gets too slow, you could try multiple buffers and distribute the updates across several frames.
For the threshold, you should use a shader uniform variable instead of modifying the vertex buffer. This allows you to set a value per-frame which can be then combined with the data from the vertex buffer (for instance, you set a float minVal; and every vertex with some attribute less than minVal gets discarded in the geometry shader.)
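The per-point logic that such a shader would apply can be written down as a plain C++ reference. This is only a sketch: the blue-to-red ramp stands in for the real color array, the rule "below the threshold means fully transparent" is an assumption, and both function names are hypothetical:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Rgba = std::array<float, 4>;

// Stand-in for the color array: a linear ramp from blue (value 0) to red (value 255).
Rgba paletteColor(std::uint8_t value) {
    const float t = static_cast<float>(value) / 255.0f;
    return {t, 0.0f, 1.0f - t, 1.0f};
}

// What the shader would compute per point: look up the color, then derive
// alpha from the threshold, which is supplied once per frame (the "uniform").
Rgba shadePoint(std::uint8_t value, std::uint8_t threshold) {
    Rgba color = paletteColor(value);
    color[3] = (value < threshold) ? 0.0f : 1.0f;  // assumed rule: hide values below threshold
    return color;
}
```

Since the threshold enters only as a parameter, changing it never requires touching the per-point data, which is the reason for keeping it out of the vertex buffer.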