OpenGL: which strategy to render a tree menu? - opengl

So, what we need to implement is a tree menu with several nodes (up to hundreds). A node can have children and can then be expanded/collapsed. There is also a background highlight on mouse-over and a background highlight on selection.
Each node has a box, an icon and a text, which can be very long, occupying the whole screen width.
This is an example of an already working solution:
Basically I am:
rendering the text a first time, just to get the length of a possible background highlight
rendering the boxes and icon textures (yeah I know, they are upside down at the moment)
rendering the text a second time, first all the bold text and then all the normal text
This solution actually has a reasonably acceptable performance impact.
Then we tried another way: using Java's Graphics object to draw the tree menu into a BufferedImage, then creating one big texture from it at the end and rendering that. All of this is obviously done at every node collapse/expand and at each mouse movement.
This performed much better, but Java seems to have some big trouble handling the old BufferedImages. Indeed, RAM consumption increases constantly, and forcing a garbage collection only slows the growth slightly rather than stopping it.
Moreover, performance falls, since the garbage collector is called every time and does not seem light at all.
So what I am going to ask you is: which is the best strategy for my needs?
Would it maybe also be feasible to render each node to a different texture (actually three: one normal, one with a light background for mouse-over and a last one with a highlight background for selection) and then, at each display(), just combine all these textures according to the current tree-menu state?

For the Java-approach: If the BufferedImage hasn't changed in size (the width/height of your tree control), can't you reuse it to avoid garbage collection?
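That reuse idea might look like the following sketch (the class and method names are made up for illustration): the image is only reallocated when the control's size actually changes, and is cleared in place before each redraw instead of being replaced.

```java
import java.awt.AlphaComposite;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Sketch: keep one BufferedImage alive and only reallocate when the
// tree control changes size, instead of creating a fresh image (and
// garbage for the collector) on every repaint.
public class ReusableCanvas {
    private BufferedImage image;

    public BufferedImage acquire(int width, int height) {
        if (image == null || image.getWidth() != width || image.getHeight() != height) {
            image = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
        }
        Graphics2D g = image.createGraphics();
        // Clear the old contents before redrawing the tree menu into it.
        g.setComposite(AlphaComposite.Clear);
        g.fillRect(0, 0, width, height);
        g.dispose();
        return image;
    }
}
```

As long as the control's size is stable, every repaint reuses the same backing array, so the steady-state allocation rate drops to zero.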
For the GL-approach, make sure you minimize texture switches. How do you render text? You can have a single large texture that contains all the normal and bold letters and just use different texture coordinates for each letter.
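A single-atlas approach can be as simple as deriving texture coordinates from the character code. This sketch assumes a fixed 16x16 grid of equally sized glyph cells, which is an assumption for illustration, not something from the question; real font atlases usually pack variable-width glyphs and use a lookup table. Bold glyphs could live in a second block of the same texture, so normal and bold text never force a texture switch.

```java
// Sketch of per-glyph texture coordinates in one big font atlas,
// laid out as a fixed 16x16 grid of cells (an assumed layout).
public class GlyphAtlas {
    static final int COLS = 16, ROWS = 16;

    /** Returns {u0, v0, u1, v1} for the cell holding this character. */
    public static float[] uv(char c) {
        int cell = c % (COLS * ROWS);
        int col = cell % COLS, row = cell / COLS;
        float w = 1.0f / COLS, h = 1.0f / ROWS;
        return new float[] { col * w, row * h, col * w + w, row * h + h };
    }
}
```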


Voxel engine (like Minecraft) optimization suggestions?
As a fun project (and to get my Minecraft-addicted son excited about programming) I am building a 3D Minecraft-like voxel engine using C# .NET 4.5.1, OpenGL and GLSL 4.x.
Right now my world is built using chunks. Chunks are stored in a dictionary, where I can select them based on a 64-bit X | Z<<32 key. This allows creating an 'infinite' world that can cache chunks in and out.
Every chunk consists of an array of 16x16x16-block segments. Starting from level 0, bedrock, it can go as high as you want (unlike Minecraft, where the limit is 256, I think).
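The 64-bit dictionary key above can be sketched like this (in Java rather than the project's C#, but the bit manipulation is identical). Note the masking needed so that negative X coordinates don't sign-extend over the Z half:

```java
// Sketch of the 64-bit chunk key: low 32 bits hold X, high 32 bits
// hold Z. Masking the X half is essential for negative coordinates.
public class ChunkKey {
    public static long pack(int x, int z) {
        return (x & 0xFFFFFFFFL) | ((long) z << 32);
    }

    public static int unpackX(long key) { return (int) key; }         // truncation restores the sign
    public static int unpackZ(long key) { return (int) (key >> 32); } // arithmetic shift restores the sign
}
```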
Chunks are queued for generation on a separate thread when they come in view and need to be rendered. This means that chunks might not show right away. In practice you will not notice this. NOTE: I am not waiting for them to be generated, they will just not be visible immediately.
When a chunk needs to be rendered for the first time, a VBO (glGenBuffers, GL_STREAM_DRAW, etc.) for that chunk is generated containing the possibly visible/outside faces (neighboring chunks are checked as well). [This means that a chunk potentially needs to be re-tessellated when a neighbor has been modified.] When tessellating, first the opaque faces are tessellated for every segment and then the transparent ones. Every segment knows where it starts within that vertex array and how many vertices it has, both for opaque faces and transparent faces.
Textures are taken from an array texture.
When rendering:
I first take the bounding box of the frustum and map that onto the chunk grid. Using that knowledge I pick every chunk that is within the frustum and within a certain distance of the camera.
Now I do a distance sort on the chunks.
After that I determine the ranges (index, length) of the chunks-segments that are actually visible. NOW I know exactly what segments (and what vertex ranges) are 'at least partially' in view. The only excess segments that I have are the ones that are hidden behind mountains or 'sometimes' deep underground.
Then I start rendering ... first I render the opaque faces [culling and depth test enabled, alpha test and blend disabled] front to back using the known vertex ranges. Then I render the transparent faces back to front [blend enabled]
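The distance sort above might look like the following sketch (Java rather than C#; chunk centers and the camera are plain float triples here): opaque geometry is sorted front to back for early-z rejection, transparent geometry back to front for correct blending.

```java
import java.util.Comparator;
import java.util.List;

// Sketch of the per-frame distance sort of visible chunks.
public class ChunkSort {
    public static void sortByDistance(List<float[]> centers, float[] cam, boolean frontToBack) {
        Comparator<float[]> byDist = Comparator.comparingDouble(c -> {
            double dx = c[0] - cam[0], dy = c[1] - cam[1], dz = c[2] - cam[2];
            return dx * dx + dy * dy + dz * dz; // squared distance is enough for ordering
        });
        centers.sort(frontToBack ? byDist : byDist.reversed());
    }
}
```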
Now... does anyone know a way of improving this while still allowing dynamic generation of an infinite world? I am currently reaching ~80 fps @ 1920x1080 and ~120 fps @ 1024x768 (screenshots: http://i.stack.imgur.com/t4k30.jpg, http://i.stack.imgur.com/prV8X.jpg) on an average 2.2 GHz i7 laptop with an ATI HD8600M gfx card. I think it must be possible to increase the number of frames. And I think I have to, as I want to add entity AI, sound, and do bump and specular mapping. Could using occlusion queries help me out? ... which I can't really imagine, based on the nature of the segments. I already minimized the creation of objects, so there is no 'new Object' all over the place. Also, as the performance doesn't really change between Debug and Release mode, I don't think it's the code but more the approach to the problem.
edit: I have been thinking of using GL_SAMPLE_ALPHA_TO_COVERAGE, but it doesn't seem to be working:
gl.Enable(GL.DEPTH_TEST);
gl.Disable(GL.BLEND);                   // alpha-to-coverage replaces alpha blending
gl.Enable(GL.MULTISAMPLE);              // only takes effect on a multisampled framebuffer
gl.Enable(GL.SAMPLE_ALPHA_TO_COVERAGE);
To render a lot of similar objects, I strongly suggest you take a look into instanced drawing: glDrawArraysInstanced and/or glDrawElementsInstanced.
It made a huge difference for me. I'm talking about going from 2 fps to over 60 fps rendering 100,000 similar icosahedra.
You can parametrize your cubes by using attributes (glVertexAttribDivisor and friends) to make them different. Hope this helps.
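The GL calls themselves can't be sketched runnably here, but the CPU side of instancing is mostly about how little per-instance data you upload: the mesh goes up once, and each instance adds only a few floats. A hypothetical packing of one xyz offset per cube, to be bound as an attribute with glVertexAttribDivisor(attrib, 1), could look like:

```java
// Sketch: per-instance attribute data for instanced cube rendering.
// Each instance contributes just one vec3 offset to the buffer that
// would be uploaded alongside the shared cube mesh.
public class InstanceData {
    /** Packs one xyz offset per instance for an nx * ny * nz grid of cubes. */
    public static float[] packOffsets(int nx, int ny, int nz, float spacing) {
        float[] data = new float[nx * ny * nz * 3];
        int i = 0;
        for (int x = 0; x < nx; x++)
            for (int y = 0; y < ny; y++)
                for (int z = 0; z < nz; z++) {
                    data[i++] = x * spacing;
                    data[i++] = y * spacing;
                    data[i++] = z * spacing;
                }
        return data;
    }
}
```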
It's at ~200 fps currently, which should be OK. The 3 main things that I've done are:
1) generating the chunks on a separate thread.
2) tessellating the chunks on a separate thread.
3) using a deferred rendering pipeline.
I don't really think the last one contributed much to the overall performance, but I had to start using it because of some of the shaders. Now the CPU is sort of falling asleep @ ~11%.
This question is pretty old, but I'm working on a similar project. I approached it almost exactly the same way as you, however I added in one additional optimization that helped out a lot.
For each chunk, I determine which sides are completely opaque. I then use that information to do a flood fill through the chunks to cull out the ones that are underground. Note, I'm not checking individual blocks when I do the flood fill, only a precomputed bitmask for each chunk.
When I'm computing the bitmask, I also check to see if the chunk is entirely empty, since empty chunks can obviously be ignored.
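That chunk-level flood fill can be sketched like this (Java, on a 2D grid for brevity; the real thing runs in 3D with per-face opacity masks): starting from the camera's chunk, only chunks reachable through non-opaque neighbors are kept, and everything unvisited is buried and skipped entirely.

```java
import java.util.ArrayDeque;

// Sketch of flood-fill chunk culling: each cell is a chunk with a
// precomputed "completely opaque" flag. Opaque chunks are marked
// visible (their near faces may show) but are never expanded through.
public class ChunkFloodFill {
    public static boolean[][] visibleFrom(boolean[][] opaque, int sx, int sy) {
        int w = opaque.length, h = opaque[0].length;
        boolean[][] seen = new boolean[w][h];
        ArrayDeque<int[]> queue = new ArrayDeque<>();
        queue.add(new int[] { sx, sy });
        seen[sx][sy] = true;
        int[][] dirs = { {1, 0}, {-1, 0}, {0, 1}, {0, -1} };
        while (!queue.isEmpty()) {
            int[] c = queue.poll();
            if (opaque[c[0]][c[1]]) continue; // can't see through this chunk
            for (int[] d : dirs) {
                int nx = c[0] + d[0], ny = c[1] + d[1];
                if (nx >= 0 && nx < w && ny >= 0 && ny < h && !seen[nx][ny]) {
                    seen[nx][ny] = true;
                    queue.add(new int[] { nx, ny });
                }
            }
        }
        return seen;
    }
}
```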

Fastest way of plotting a point on screen in MFC C++ app

I have an application that contains many millions of 3D RGB points that form an image when plotted. What is the fastest way of getting them to the screen in an MFC application? I've tried CDC::SetPixelV in conjunction with a bitmap, which seems quite slow, and am looking towards a Direct3D or OpenGL window in my MFC view class. Any other good places to look?
Double buffering is your solution. There are many examples on codeproject. Check this one for example
Sounds like a point cloud. You might find some good information searching on that term.
3D hardware is the fastest way to take 3D points and get them into a 2D display, so either Direct3D or OpenGL seem like the obvious choices.
If the number of points is much greater than the number of pixels in your display, then you'll probably first want to cull points that are trivially outside the view. You put all your points in some sort of spatial partitioning structure (like an octree) and omit the points inside any node that's completely outside the viewing frustum. This reduces the amount of data you have to push from system memory to GPU memory, which will likely be the bottleneck. (If your point cloud is static, and you're just building a fly-through, and if your GPU has enough memory, you could skip the culling, send all the data at once, and then just update the transforms for each frame.)
If you don't want to use the GPU and instead write a software renderer, you'll want to render to a bitmap that's in the same pixel format as your display (to eliminate the chance of the blit needing to do any pixel-format conversion as it blasts the bitmap to the display). For reasonable window sizes, blitting at 30 frames per second is feasible, but it might not leave much time for the CPU to do the rendering.
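A minimal sketch of that coarse cull (Java for illustration, with a plain grid instead of an octree and an axis-aligned view box standing in for the frustum): points are bucketed into cells once, and per frame whole cells outside the view are rejected without touching the points inside them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: bucket points into fixed-size cells, then cull at cell
// granularity against a view box instead of testing every point.
public class PointCells {
    final float size;
    final Map<String, List<float[]>> cells = new HashMap<>();

    public PointCells(float size) { this.size = size; }

    public void add(float[] p) {
        int cx = (int) Math.floor(p[0] / size);
        int cy = (int) Math.floor(p[1] / size);
        int cz = (int) Math.floor(p[2] / size);
        cells.computeIfAbsent(cx + "," + cy + "," + cz, k -> new ArrayList<>()).add(p);
    }

    /** Count of points surviving the cell-level cull against a view box. */
    public int visibleCount(float[] lo, float[] hi) {
        int n = 0;
        for (Map.Entry<String, List<float[]>> e : cells.entrySet()) {
            String[] s = e.getKey().split(",");
            float x = Integer.parseInt(s[0]) * size;
            float y = Integer.parseInt(s[1]) * size;
            float z = Integer.parseInt(s[2]) * size;
            boolean overlaps = x <= hi[0] && x + size >= lo[0]
                            && y <= hi[1] && y + size >= lo[1]
                            && z <= hi[2] && z + size >= lo[2];
            if (overlaps) n += e.getValue().size(); // these are the points you'd upload
        }
        return n;
    }
}
```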

How can I change an object's rate of rotation in OpenGL?

I'm working with a simple rotating polygon that should speed up when the user clicks and drags upwards and slow down when the user clicks and drags downwards. Unfortunately, I have searched EVERYWHERE and cannot seem to find any specific GL function or variable that will easily let me manipulate the speed (I've searched for "frame rate," too...)
Is there an easy call/series of calls to do something like this, or will I actually need to do things with timers at different segments of the code?
OpenGL draws stuff exactly where you tell it to draw the stuff. It has no notion of what was rendered the previous frame or what will be rendered the next frame. OpenGL has no concept of time, and without time, you cannot have speed.
Speed is something you have to manage. Generally, this is done by taking your intended speed, multiplying it by the time interval since you last rendered, and adding that into the current rotation (or position). Then render at that new orientation/position. This is all up to you however.
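Concretely, that bookkeeping can be as small as this sketch (the class and method names are made up): the drag gesture changes the speed, and the angle advances by speed times elapsed time each frame before being handed to glRotate.

```java
// Sketch of time-based rotation: OpenGL only ever sees the resulting
// angle; speed and time live entirely in application code.
public class Spinner {
    double angleDeg = 0.0;        // pass this to glRotate each frame
    double speedDegPerSec = 90.0; // current rotation speed

    /** Call once per frame with the seconds elapsed since the last frame. */
    public void update(double dtSeconds) {
        angleDeg = (angleDeg + speedDegPerSec * dtSeconds) % 360.0;
    }

    /** Call from the mouse-drag handler; dragging up speeds it up. */
    public void onDrag(double deltaY) {
        speedDegPerSec += deltaY * 10.0; // sensitivity factor is arbitrary
    }
}
```

Because the angle is scaled by the frame interval, the rotation speed stays constant regardless of the frame rate.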

Selection / glRenderMode(GL_SELECT)

In order to do object picking in OpenGL, do I really have to render the scene twice?
I realize rendering the scene is supposed to be cheap, running at 30 fps.
But if every selection requires an additional call to RenderScene(),
then if I click 30 times a second, the GPU has to render twice as many frames?
One common trick is to have two separate functions to render your scene. When you're in picking mode, you can render a simplified version of the world, without the things you don't want to pick. So terrain, inert objects, etc, don't need to be rendered at all.
The time to render a stripped-down scene should be much less than the time to render a full scene. Even if you click 30 times a second (!), your frame rate should not be impacted much.
First of all, the only way you're going to get 30 mouse clicks per second is if you have some other code simulating mouse clicks. For a person, 10 clicks a second would be pretty fast -- and at that, they wouldn't have any chance to look at what they'd selected -- that's just clicking the button as fast as possible.
Second, when you're using GL_SELECT, you normally want to use gluPickMatrix to give it a small area to render, typically a (say) 10x10 pixel square, centered on the click point. At least in a typical case, the vast majority of objects will fall entirely outside that area, and be culled immediately (won't be rendered at all). This speeds up that rendering pass tremendously in most cases.
There have been some good suggestions already about how to optimize picking in GL. That will probably work for you.
But if you need more performance than you can squeeze out of gl-picking, then you may want to consider doing what most game-engines do. Since most engines already have some form of a 3D collision detection system, it can be much faster to use that. Unproject the screen coordinates of the click and run a ray-vs-world collision test to see what was clicked on. You don't get to leverage the GPU, but the volume of work is much smaller. Even smaller than the CPU-side setup work that gl-picking requires.
Select based on simpler collision hulls, or even just bounding boxes. The performance scales by number of objects/hulls in the scene rather than by the amount of geometry plus the number of objects.
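The ray-vs-bounds test at the heart of that approach is small. Here is a sketch of the standard slab method against an axis-aligned box, assuming the click has already been unprojected into a ray origin and direction; run it over every object's box and keep the nearest hit.

```java
// Sketch of ray-vs-AABB picking via the slab method.
public class RayPick {
    /** Returns the hit distance along the ray, or -1 if the box is missed. */
    public static float rayVsAabb(float[] o, float[] d, float[] min, float[] max) {
        float tNear = Float.NEGATIVE_INFINITY, tFar = Float.POSITIVE_INFINITY;
        for (int i = 0; i < 3; i++) {
            if (Math.abs(d[i]) < 1e-8f) {
                // Ray parallel to this slab: miss if origin is outside it.
                if (o[i] < min[i] || o[i] > max[i]) return -1f;
            } else {
                float t1 = (min[i] - o[i]) / d[i];
                float t2 = (max[i] - o[i]) / d[i];
                tNear = Math.max(tNear, Math.min(t1, t2));
                tFar = Math.min(tFar, Math.max(t1, t2));
            }
        }
        return (tNear <= tFar && tFar >= 0) ? Math.max(tNear, 0) : -1f;
    }
}
```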

OpenGL game development - scenes that span far into view

I am working on a 2d game. Imagine a XY plane and you are a character. As your character walks, the rest of the scene comes into view.
Imagine that the XY plane is quite large and there are other characters outside of your current view.
Here is my question, with opengl, if those objects aren't rendered outside of the current view, do they eat up processing time?
Also, what are some approaches to avoid rendering parts of the scene that aren't in view? If I have a cube that is 1000 units away from my current position, I don't want that object rendered. How can I tell OpenGL not to render it?
I guess the easiest approach is to calculate the position and then not draw that cube/object if it is too far away.
OpenGL faq on "Clipping, Culling and Visibility Testing" says this:
OpenGL provides no direct support for determining whether a given primitive will be visible in a scene for a given viewpoint. At worst, an application will need to perform these tests manually. The previous question contains information on how to do this.
Go ahead and read the rest of that link, it's all relevant.
If you've set up your scene graph correctly objects outside your field of view should be culled early on in the display pipeline. It will require a box check in your code to verify that the object is invisible, so there will be some processing overhead (but not much).
If you organise your objects into a sensible hierarchy then you could cull large sections of the scene with only one box check.
Typically your application must perform these optimisations - OpenGL is literally just the rendering part, and doesn't perform object management or anything like that. If you pass in data for something invisible it still has to transform the relevant coordinates into view space before it can determine that it's entirely off-screen or beyond one of your clip planes.
There are several ways of culling invisible objects from the pipeline. Checking if an object is behind the camera is probably the easiest and cheapest check to perform, since you can reject half your data set on average with a simple calculation per object. It's not much harder to perform the same sort of test against the actual view frustum to reject everything that isn't visible at all.
Obviously in a complex game you won't want to have to do this for every tiny object, so it's typical to group them, either hierarchically (eg. you wouldn't render a gun if you've already determined that you're not rendering the character that holds it), spatially (eg. dividing the world up into a grid/quadtree/octree and rejecting any object that you know is within a zone that you have already determined is currently invisible), or more commonly a combination of both.
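The grouping idea can be sketched like this (Java for illustration, in 2D to match the question's XY-plane game, with circles rather than boxes for simplicity): one cheap test against the group's bounds can reject all of its members at once.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch of hierarchical culling: if the group's bounding circle is
// outside the view, none of its members are even tested.
public class CullGroup {
    float cx, cy, radius;                      // group bounds
    List<float[]> members = new ArrayList<>(); // member positions (x, y)

    /** Members to draw for a circular view around the camera. */
    public List<float[]> visible(float camX, float camY, float viewRadius) {
        float dx = cx - camX, dy = cy - camY;
        float reach = viewRadius + radius;
        if (dx * dx + dy * dy > reach * reach)
            return Collections.emptyList(); // whole group rejected with one test
        List<float[]> out = new ArrayList<>();
        for (float[] m : members) {
            float mx = m[0] - camX, my = m[1] - camY;
            if (mx * mx + my * my <= viewRadius * viewRadius)
                out.add(m); // member individually in view
        }
        return out;
    }
}
```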
"the only winning move is not to play"
Every glVertex etc. is going to be a performance hit regardless of whether it ultimately gets rendered on your screen. The only way to get around that is to not draw (i.e. cull) objects which won't ever be rendered anyway.
The most common method is to have a viewing frustum tied to your camera. Couple that with an octree or quadtree, depending on whether your game is 3D or 2D, so you don't need to check every single game object against the frustum.
The underlying driver may do some culling behind the scenes, but you can't depend on that since it's not part of the OpenGL standard. Maybe your computer's driver does it, but maybe someone else's (who might run your game) doesn't. It's best for you do to your own culling.