OpenGL 3.2+ Drawing Cubes around existing vertices - c++

So I have a cool program that renders a pretty cube in the centre of the screen.
I'm trying to now create a tiny cube on each corner of the existing cube (so 8 tiny cubes), centred on each of the existing cubes corners (or vertices).
I'm assuming an efficient way to implement this would be with a loop of some kind, to minimise the amount of code.
My query is, how does this affect the VAO/VBO's? Even in a loop, would each one need it's own buffer or could they all be sent at the same time...
Secondly, if it can be done, what would the structure of this loop be like, in terms of focusing on separate vertices given that each vertex has different coordinates...

As Vaughn Cato said, each object (using the same VBOs) can simply be drawn at different locations in world space, so you do not need to define separate VBO's for each object.
To complete this task, you simply need a loop to modify the given matrix before each one is rendered to the screen to change the origins of where each cube is drawn.

Related

Select object in OpenGL when doing transformations in the vertex shader

I'm pretty new to OpenGL and am trying to implement a simple program where I can draw cubes, move them around with the mouse, and delete them.
Previously I had done my drag operations by translating on the CPU. In this way I was able to use ray-tracing to pick out the element I wanted because the vertices themselves were being updated.
However, I'm trying to move all of the transformations to the GPU and in doing so realized that I would then be giving up updated access to the vertices on the CPU (as the CPU still thinks the vertices are the un-transformed ones). How does one do this communication so that I wouldn't have to manually do transformations on the CPU as well as in the Vertex Shader?
No matter where you're doing your transformations, you will typically have a model matrix that describes where each object is in the scene. Instead of transforming each object into world space just so you can check for intersection with a world-space ray, you can also transform the ray into the object space of each object by transforming the ray with the inverse model matrix.
One general issue with ray-tracing is that, as your scene gets larger, brute force testing of each object will get increasingly slow. You can use acceleration structures like an Octree or a Bounding Volume Hierarchy to speed things up. A completely different approach when it comes to picking would be just render an ID buffer, i.e. a buffer that has the same resolution as your currently rendered frame and for each pixel saves the ID of the object that is visible at that pixel. Then you can simply read back the value of the pixel underneath the cursor to find out what object you hit without the need to do any raytracing. Rendering the ID buffer could be done as a separate pass or can likely just be added as an additional render target to a pass you're already doing, e.g., prefilling the depth buffer or just when rendering the scene in case you only do one pass.

OpenGL - How to render many different models?

I'm currently struggling with finding a good approach to render many (thousands) slightly different models. The model itself is a simple cube with some vertex offset, think of a skewed quad face. Each 'block' has a different offset of its vertices, so basically I have a voxel engine on steroids as each block is not a perfect cube but rather a skewed cuboid. To render this shape 48 vertices are needed but can be cut to 24 vertices as only 3 faces are visible. With indexing we are at 12 vertices (4 for each face).
But, now that I have the vertices for each block in the world, how do I render them?
What I've tried:
Instanced Rendering. Sounds good, doesn't work as my models are not the same.
I could simplify distant blocks to a cube and render them with glDrawArraysInstanced/glDrawElementsInstanced.
Put everything in one giant VBO. This has a better performance than rendering each cube individually, but has the downside of having one large mesh. This is not desireable as I need every cube to have different textures, lighting, etc... Selecting a single cube within that huge mesh is not possible.
I am aware of frustum culling and occlusion culling, but I already have problems with some cubes in front of me (tested with a 128x128 world).
My requirements:
Draw some thousand models.
Each model has vertices offsets to make the block less cubic, stored in another VBO.
Each block has to be an individual object, as you should be able to place/remove blocks.
Any good performance advices?
This is not desireable as I need every cube to have different textures, lighting, etc... Selecting a single cube within that huge mesh is not possible.
Programmers should avoid declaring that something is "impossible"; it limits your thinking.
Giving each face of these cubes different textures has many solutions. The Minecraft approach uses texture atlases. Each "texture" is really just a sub-section of one large texture, and you use texture coordinates to select which sub-section a particular face uses. But you can get more complex.
Array textures allow for a more direct way to solve this problem. Here, the texture coordinates would be the same, but you use a per-vertex integer to select the correct texture for a face. All of the vertices for a particular face would have an index. And if you're clever, you don't even really need texture coordinates. You can generate them in your vertex shader, based on per-vertex values like gl_VertexID and the like.
Lighting parameters would work the same way: use some per-vertex data to select parameters from a UBO or SSBO.
As for the "individual object" bit, that's merely a matter of how you're thinking about the problem. Do not confuse what happens in the player's mind with what happens in your code. Games are an elaborate illusion; just because something appears to the user to be an "individual object" doesn't mean it is one to your rendering engine.
What you need is the ability to modify your world's data to remove and add new blocks. And if you need to show a block as "selected" or something, then you simply need another per-block value (like the lighting parameters and index for the texture) which tells you whether to draw it as a "selected" block or as an "unselected" one. Or you can just redraw that specific selected block. There are many ways of handling it.
Any decent graphics card (since about 2010) is able to render a few millions vertices in a blinking.
The approach is different depending on how many changes per frame. In other words, how many data must be transferred to the GPU per frame.
For the case of small number of changes, storing the data in one big VBO or many smaller VBOs (and their VAOs), sending the changes by uniforms, and calling several glDraw***, shows similar performance. Different hardwares behave with little difference. Indexed data may improve the speed.
When most of the data changes in every frame and these changes are hard or impossible to do in the shaders, then your app is memory-transfer bound. Streaming is a good advise.

OpenGL VBO storage and templates

I have a question about VBO's. Let's say just as an example I'm trying to build a voxel style engine that makes even a 16x16x16 chunk.
Do I store the map information in the VBO? How do I get the verticies for a cube? The way I'm thinking about it, the VBO would require 24 vector3 variables (vectors for each cube at each location). That seems like a lot.
is there some way to have a single 'cube' VBO template, then somehow change the coordinates for each cube I want to draw, calling the template (i hope that makes sense) and using bufferdata to update that template for every location, do I have to actually store those 24 vectors for every single location in the 16x16x16, or would I just store the map coordinates, then have the cube and polygons drawn through a shader?
I hope that makes sense. it seems expensive memory wise loading up something that stores 24 vectors per location, and it seems resource intensive to me calling bufferdata 16x16x16 times per frame... so the last option using the vertex shader seems the most viable, but I'm new to shaders so is something like that possible?
What is the most common method used?
Geometry shaders can, indeed, emit multiple primitives for a single input primitive. So drawing all 6 faces of a cube from a single input point is certainly possible. Though for "voxel" engines you might be better served by point sprites, as often the orientation of the cube isn't useful. A point sprite draws a single screen-aligned quad from an input point. Beyond that you'll need to be more specific about what you're doing.

3d Occlusion Culling

I'm writing a Minecraft like static 3d block world in C++ / openGL. I'm working at improving framerates, and so far I've implemented frustum culling using an octree. This helps, but I'm still seeing moderate to bad frame rates. The next step would be to cull cubes that are hidden from the viewpoint by closer cubes. However I haven't been able to find many resources on how to accomplish this.
Create a render target with a Z-buffer (or "depth buffer") enabled. Then make sure to sort all your opaque objects so they are rendered front to back, i.e. the ones closest to the camera first. Anything using alpha blending still needs to be rendered back to front, AFTER you rendered all your opaque objects.
Another technique is occlusion culling: You can cheaply "dry-render" your geometry and then find out how many pixels failed the depth test. There is occlusion query support in DirectX and OpenGL, although not every GPU can do it.
The downside is that you need a delay between the rendering and fetching the result - depending on the setup (like when using predicated tiling), it may be a full frame. That means that you need to be creative there, like rendering a bounding box that is bigger than the object itself, and dismissing the results after a camera cut.
And one more thing: A more traditional solution (that you can use concurrently with occlusion culling) is a room/portal system, where you define regions as "rooms", connected via "portals". If a portal is not visible from your current room, you can't see the room connected to it. And even it is, you can click your viewport to what's visible through the portal.
The approach I took in this minecraft level renderer is essentially a frustum-limited flood fill. The 16x16x128 chunks are split into 16x16x16 chunklets, each with a VBO with the relevant geometry. I start a floodfill in the chunklet grid at the player's location to find chunklets to render. The fill is limited by:
The view frustum
Solid chunklets - if the entire side of a chunklet is opaque blocks, then the floodfill will not enter the chunklet in that direction
Direction - the flood will not reverse direction, e.g.: if the current chunklet is to the north of the starting chunklet, do not flood into the chunklet to the south
It seems to work OK. I'm on android, so while a more complex analysis (antiportals as noted by Mike Daniels) would cull more geometry, I'm already CPU-limited so there's not much point.
I've just seen your answer to Alan: culling is not your problem - it's what and how you're sending to OpenGL that is slow.
What to draw: don't render a cube for each block, render the faces of transparent blocks that border an opaque block. Consider a 3x3x3 cube of, say, stone blocks: There is no point drawing the center block because there is no way that the player can see it. Likewise, the player will never see the faces between two adjacent stone blocks, so don't draw them.
How to draw: As noted by Alan, use VBOs to batch geometry. You will not believe how much faster they make things.
An easier approach, with minimal changes to your existing code, would be to use display lists. This is what minecraft uses.
How many blocks are you rendering and on what hardware? Modern hardware is very fast and is very difficult to overwhelm with geometry (unless we're talking about a handheld platform). On any moderately recent desktop hardware you should be able to render hundreds of thousands of cubes per frame at 60 frames per second without any fancy culling tricks.
If you're drawing each block with a separate draw call (glDrawElements/Arrays, glBegin/glEnd, etc) (bonus points: don't use glBegin/glEnd) then that will be your bottleneck. This is a common pitfall for beginners. If you're doing this, then you need to batch together all triangles that share texture and shading parameters into a single call for each setup. If the geometry is static and doesn't change frame to frame, you want to use one Vertex Buffer Object for each batch of triangles.
This can still be combined with frustum culling with an octree if you typically only have a small portion of your total game world in the view frustum at one time. The vertex buffers are still loaded statically and not changed. Frustum cull the octree to generate only the index buffers for the triangles in the frustum and upload those dynamically each frame.
If you have surfaces close to the camera, you can create a frustum which represents an area that is not visible, and cull objects that are entirely contained in that frustum. In the diagram below, C is the camera, | is a flat surface near the camera, and the frustum-shaped region composed of . represents the occluded area. The surface is called an antiportal.
.
..
...
....
|....
|....
|....
|....
C |....
|....
|....
|....
....
...
..
.
(You should of course also turn on depth testing and depth writing as mentioned in other answer and comments -- it's very simple to do in OpenGL.)
The use of a Z-Buffer ensures that polygons overlap correctly.
Enabling the depth test makes every drawing operation check the Z-buffer before placing pixels onto the screen.
If you have convex objects you must (for performance) enable backface culling!
Example code:
glEnable(GL_CULL_FACE);
glEnable(GL_DEPTH_TEST);
glDepthMask(GL_TRUE);
You can change the behaviour of glCullFace() passing GL_FRONT or GL_BACK...
glCullFace(...);
// Draw the "game world"...

Mesh deformation and VBOs

In my current project I render a series of basically cubic 3D models arranged in a grid. These 3D tiles form the walls of a dungeon level in a game, so they're not perfectly cubic, but I pay special attention to be certain all the edges line up and everything tiles correctly.
I'm interested in implementing a height-map deformation, which seems like it'd require me to manually deform the vertices of the 3D tiles, first by raising or lowering a corner, then by calculating a line between two corners and shifting all the vertices based on the height of that line. Seems pretty straightforward.
My current issue is this: I'm using OpenGL, which provides an optimization called VBOs, which basically are (to my understanding) static copies of the mesh kept in GPU memory for speed. I render using VBOs because I only use three basic models (L-corner, straight-wall, and a cap to join walls when they don't meet in an L). If I have to manually fiddle with the vertices of my models, it seems like I'd have to replace the content of the VBO every tile, which pretty much negates the point of using them.
It seems to me that I might be able to use simple rotation and translation transforms to achieve a similar effect, but I can't figure out how to do it without leaving gaps between the tiles. Any thoughts?
You may be able to use a vertex program on your GPU. The main difficulty (if I understand your problem correctly) is that vertex programs must rely on either global or per-vertex parameters, and there is a strictly limited amount of space available for each.
Without more details, I can only suggest being clever about how you set up the parameters...