I am starting a new project in C++ using GLFW and GLEW.
The plan is to have a fairly big Low Poly terrain. It will NOT be randomly generated, I am planning on making it in Blender.
My problem is that I cannot create a huge Low Poly terrain in Blender, because the program becomes really slow with the number of vertices the terrain has. I created a 500m x 500m terrain and subdivided it by 1000. That gave me a lot of vertices, making the program unusable.
What would be the best approach to creating a huge terrain?
I'm not sure how I would go about creating chunks of the terrain, since I have to model them.
How do I create a big Low Poly terrain without the program becoming slow?
Another concern of mine is obviously loading the world into a custom game engine of mine. I suppose a big world like this would have huge problems with the load times.
Terrain in game engines like Unity, Unreal Engine and CryEngine is treated differently from your average static or skeletal mesh. The different levels of detail are usually created at runtime, whereas ordinary meshes have their LODs pre-created. Loading a mesh from a 3D program like Blender or 3DS Max as your entire terrain just isn't doable.
The Direct3D tutorials at rastertek are very good for learning, but they obviously aren't OpenGL. Here is a basic tutorial on creating a basic terrain in Java OpenGL (I don't think it goes into LOD handling).
Java OpenGL terrain
The approach I've seen most often, I think, is a quad tree system, where you have terrain patches and each patch is subdivided into four other patches depending on a condition (such as distance to the camera or screen-space size).
This is what a standard quad-tree LOD system looks like, in particular for the game Kerbal Space Program.
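A minimal sketch of the subdivision rule described above, assuming a square patch in the XZ plane and a plain distance test. The names `Patch`, `selectPatches` and `splitFactor` are made up for illustration; this is not Kerbal Space Program's code, and it does not limit neighbours to one LOD level of difference.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// A square terrain patch in the XZ plane (hypothetical layout).
struct Patch {
    float centerX, centerZ; // patch center
    float size;             // edge length
    int   level;            // 0 = root, larger = finer
};

// Recursively split a patch into four children while the camera is within
// `splitFactor` patch sizes of its center and the depth limit is not reached.
// Leaves are appended to `out`; those are the patches you would draw.
void selectPatches(const Patch& p, float camX, float camZ,
                   float splitFactor, int maxLevel, std::vector<Patch>& out)
{
    assert(p.size > 0.0f && splitFactor > 0.0f);
    const float dx = camX - p.centerX;
    const float dz = camZ - p.centerZ;
    const float dist = std::sqrt(dx * dx + dz * dz);

    if (p.level >= maxLevel || dist > splitFactor * p.size) {
        out.push_back(p); // far enough (or deep enough): keep as a leaf
        return;
    }
    const float half = p.size * 0.5f;
    const float quarter = p.size * 0.25f;
    for (int i = 0; i < 4; ++i) {
        const float ox = (i & 1) ? quarter : -quarter;
        const float oz = (i & 2) ? quarter : -quarter;
        selectPatches({p.centerX + ox, p.centerZ + oz, half, p.level + 1},
                      camX, camZ, splitFactor, maxLevel, out);
    }
}
```

Calling it once per frame on the root patch gives the list of leaves to render; a screen-space size test could replace the distance test without changing the structure.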
Along the way you'll need to figure out how to solve some problems, like how to get rid of the cracks and gaps in between two terrain patches that are of different LOD levels. Kerbal Space Program solved this by treating the edge vertices differently to line up, and not allowing any two adjacent terrain patches to be more than one LOD level of difference.
One method I tried was to upload two vertex positions for each vertex, the current LOD position and the position of the LOD vertex from one level down, and linearly interpolate between the two based on camera distance. Yet I'm pretty sure there are more elegant ways than this.
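The interpolation idea can be sketched on the CPU like this. In practice the blend would probably live in a vertex shader; the names `morphFactor`, `morphStart` and `morphEnd` are assumptions for illustration, not part of any library.

```cpp
#include <algorithm>
#include <cassert>

struct Vec3 { float x, y, z; };

// Blend factor from camera distance: 0 at or before morphStart (use the
// current LOD position), 1 at or beyond morphEnd (use the coarser position).
float morphFactor(float cameraDistance, float morphStart, float morphEnd)
{
    assert(morphEnd > morphStart);
    const float t = (cameraDistance - morphStart) / (morphEnd - morphStart);
    return std::min(1.0f, std::max(0.0f, t));
}

// Linear interpolation between the two positions uploaded per vertex.
Vec3 morphVertex(const Vec3& current, const Vec3& coarser, float t)
{
    return { current.x + (coarser.x - current.x) * t,
             current.y + (coarser.y - current.y) * t,
             current.z + (coarser.z - current.z) * t };
}
```

When `t` reaches 1 the patch already matches the coarser level, so switching LODs at that point causes no visible pop.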
I've posted a video from a while ago of me messing around with this stuff. It shows the basic quad tree pattern, the problem of cracks, and then the vertex interpolation method. Some people create the patches on the CPU and others create them on the GPU and read back any necessary info (for physics, for example) using transform feedback. There are lots of ways of doing things, and I hope to get back into it.
TerrainPatches
I did something similar many years ago and found this tutorial very helpful:
http://www.rastertek.com/tertut05.html
It describes creating a quad tree in which the triangles of your terrain mesh are partitioned into AABBs. Using frustum culling, huge parts of your terrain can then be culled at runtime, and your application's performance should improve. As long as you are comfortable importing meshes exported from Blender (are they in .obj?), you should easily be able to partition the triangles using the strategies outlined in the tutorial.
A further optimization could be to have various LODs for the nodes in your quadtree depending on their distance from the camera, i.e. if a node is beyond a set distance from the camera, render a lower-poly mesh by skipping certain vertices so that the smaller triangles "collapse" into larger ones. To save memory, I'd recommend generating specific index lists for this and reusing the same vertex data, as opposed to having separate pre-generated chunks of mesh.
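As a rough sketch of the index-list idea, assuming the node's vertices form a square grid stored row-major (an imported Blender mesh would first have to be resampled or sorted into such a grid). The function name `buildLodIndices` and the `step` parameter are made up for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Build a triangle index list for a grid of vertsPerSide x vertsPerSide
// vertices (row-major), visiting only every `step`-th vertex.
// step = 1 is full detail; step = 2 draws a quarter of the triangles, etc.
// All LODs index into the same vertex buffer.
std::vector<std::uint32_t> buildLodIndices(std::uint32_t vertsPerSide,
                                           std::uint32_t step)
{
    assert(vertsPerSide >= 2 && step >= 1);
    assert((vertsPerSide - 1) % step == 0); // step must divide the cell count

    std::vector<std::uint32_t> indices;
    for (std::uint32_t z = 0; z + step < vertsPerSide; z += step) {
        for (std::uint32_t x = 0; x + step < vertsPerSide; x += step) {
            const std::uint32_t i0 = z * vertsPerSide + x;     // this corner
            const std::uint32_t i1 = i0 + step;                // next in x
            const std::uint32_t i2 = i0 + step * vertsPerSide; // next in z
            const std::uint32_t i3 = i2 + step;                // diagonal
            indices.insert(indices.end(), {i0, i2, i1, i1, i2, i3});
        }
    }
    return indices;
}
```

For a 5 x 5 grid, `step` values of 1, 2 and 4 give 96, 24 and 6 indices respectively, all referring to the same 25 vertices.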
I have been researching different approaches to terrain systems in game engines for a bit now, trying to familiarize myself with the work. A number of the details seem straightforward, but I am getting hung up on a single detail.
For performance reasons many terrain solutions utilize shaders to generate parts or all of the geometry, such as vertex shaders to generate positions or tessellation shaders for LoD. At first I figured those approaches were exclusively for renders that weren't concerned about physics simulations.
The reason I say that is because as I understand shaders at the moment, the results of a shader computation generally are discarded at the end of the frame. So if you rely on shaders heavily then the geometry information will be gone before you could access it and send it off to another system (such as physics running on the CPU).
So, am I wrong about shaders? Can you store the results of them generating geometry to be accessed by other systems? Or am I forced to keep the terrain geometry on CPU and leave the shaders to the other details?
Shaders
You understand parts of the shaders correctly, that is: after a frame, the data is stored as a final composed image in the backbuffer.
BUT: Using transform feedback it is possible to capture transformed geometry into a vertex buffer and reuse it. Transform Feedback happens AFTER the vertex/geometry/tessellation shader, so you could use the geometry shader to generate a terrain (or visible parts of it once), push it through transform-feedback and store it.
This way, you potentially could use CPU collision detection with your terrain! You can even combine this with tessellation.
You will love this: A Framework for Real-Time, Deformable Terrain.
For the LOD and tessellation: LOD is not a prerequisite for tessellation. You can use tessellation for more sophisticated effects, such as adding detail by recursively subdividing rough geometry. Linking it with LOD is simply a very good optimization that avoids keeping LOD mesh levels in RAM, since you just have your "base mesh" and subdivide it (although this will be an unsatisfying optimization imho).
Now some deeper info on GPU and CPU exclusive terrain.
GPU Generated Terrain (Procedural)
As written in the NVidia article Generating Complex Procedural Terrains Using the GPU:
1.2 Marching Cubes and the Density Function
Conceptually, the terrain surface can be completely described by a single function, called the
density function. For any point in 3D space (x, y, z), the function
produces a single floating-point value. These values vary over
space—sometimes positive, sometimes negative. If the value is
positive, then that point in space is inside the solid terrain.
If the value is negative, then that point is located in empty space
(such as air or water). The boundary between positive and negative
values—where the density value is zero—is the surface of the terrain.
It is along this surface that we wish to construct a polygonal mesh.
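To make the sign convention concrete, here is a small sketch with a made-up density function (gentle hills built from sine and cosine). It is not the function from the NVidia article; `density` and `isInsideTerrain` are hypothetical names.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Hypothetical density: rolling hills around y = 0.
// Positive = inside solid ground, negative = air, zero = the surface.
float density(const Vec3& p)
{
    const float groundHeight = 2.0f * std::sin(p.x * 0.1f) * std::cos(p.z * 0.1f);
    return groundHeight - p.y;
}

// Evaluating the density at a point tells you which side of the surface
// that point is on.
bool isInsideTerrain(const Vec3& p)
{
    return density(p) > 0.0f;
}
```

Any code that can evaluate the same function, on the CPU or the GPU, can answer this inside/outside question without needing the generated mesh.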
Using Shaders
The density function used for generating the terrain must be available to the collision-detection shader, and you have to fill an output buffer containing the collision locations, if any...
CUDA
See: https://www.youtube.com/watch?v=kYzxf3ugcg0
Here someone used CUDA, based on the NVidia article, which however implies the same:
In CUDA, performing collision detection, the density function must be shared.
This will however make the transform feedback techniques a little harder to implement.
Both shaders and CUDA imply resampling or recalculating the density at one or more locations, just for the collision detection of a single object.
CPU Terrain
Usually, this implies a RAM-memory stored set of geometry in the form of vertex/index-buffer pairs, which are regularly processed by the shader-pipeline. As you have the data available here, you will also most likely have a collision mesh, which is a simplified representation of your terrain, against which you perform collision.
Alternatively, you could give your terrain a set of colliders marking the allowed paths, which imho is what the early PS1 Final Fantasy games do (they don't really have terrain in the sense we understand it today).
This short answer is neither extensively deep nor complete. I just tried to give you some insight into some concepts used in dozens of solutions.
Some more reading: http://prideout.net/blog/?tag=opengl-transform-feedback.
Two questions:
How do modern games set up their terrain vertices? Do they attach a height map image to a texture and then use it to set each vertex position, or do they just use a 3D software (like Blender) to create a file that contains these vertices and then read it to a VBO? Please correct me if my grasp is incorrect.
How important are tessellation shaders to this process? Do they just save performance or do they also change the viewer's scene?
The two most common approaches I have seen are heightmaps, in which the RGB channels store the surface normal and the alpha channel stores the height, and procedural terrain generation using a method such as Perlin noise, which uses a random function and samples the surrounding vertices to even out the height.
Tessellation shaders are used primarily to decrease the workload by simplifying faraway meshes in which you would not notice the extra detail. They do change the viewer's scene, but in a way that tries not to be noticed.
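For the heightmap approach, a minimal CPU-side sketch of displacing a regular grid by height samples. It assumes the heights have already been extracted into one byte per sample (for example from the alpha channel); `buildGridVertices`, `cellSize` and `maxHeight` are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vertex { float x, y, z; };

// Turn width * depth 8-bit height samples (row-major) into grid vertices.
// A sample of 0 maps to y = 0 and a sample of 255 maps to y = maxHeight.
std::vector<Vertex> buildGridVertices(const std::vector<std::uint8_t>& heights,
                                      std::size_t width, std::size_t depth,
                                      float cellSize, float maxHeight)
{
    assert(heights.size() == width * depth);
    std::vector<Vertex> vertices;
    vertices.reserve(heights.size());
    for (std::size_t z = 0; z < depth; ++z) {
        for (std::size_t x = 0; x < width; ++x) {
            const float h = heights[z * width + x] / 255.0f * maxHeight;
            vertices.push_back({ x * cellSize, h, z * cellSize });
        }
    }
    return vertices;
}
```

The resulting array can be uploaded to a VBO once; alternatively, the same lookup can be done in a vertex shader by sampling the heightmap as a texture.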
Generally, the heights of the vertices are generated procedurally in shaders.
In computer graphics, "procedurally" means "by some mathematical algorithm". Perlin noise is one of the methods for this kind of procedural generation. There are several strategies that keep the height map small and produce varied heights procedurally; this is done because the height map is a texture, and reading it uses bandwidth.
Tessellation shaders are used for adaptive tessellation. You can think of it as a kind of level-of-detail mechanism. The smoothness of the terrain depends on how many triangles are used to represent a patch of terrain. Depending on the distance from the camera, developers can decide the tessellation level on the fly and generate more triangles for patches close to the user. This is a way to improve detail on the terrain. Everything here happens on the GPU, so it's extremely efficient.
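The distance-to-level decision could look roughly like this. It is written as plain C++ for clarity; in a real pipeline this arithmetic would normally sit in the tessellation control shader. The linear falloff and all parameter names are assumptions for illustration.

```cpp
#include <algorithm>
#include <cassert>

// Map camera distance to a tessellation level: maxLevel at or before
// nearDist, minLevel at or beyond farDist, linear in between.
float tessLevelForDistance(float distance, float nearDist, float farDist,
                           float minLevel, float maxLevel)
{
    assert(farDist > nearDist && maxLevel >= minLevel);
    const float t = std::min(1.0f, std::max(0.0f,
                        (distance - nearDist) / (farDist - nearDist)));
    return maxLevel + (minLevel - maxLevel) * t;
}
```

With `nearDist = 10`, `farDist = 110` and levels between 1 and 64, a patch 60 units away gets level 32.5, halfway between the two extremes.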
Before tessellation shaders were available, there were algorithms like ROAM which did adaptive tessellation on the CPU.
Please have a look at the http://vterrain.org/ project. You will see all the state-of-the-art terrain techniques implemented there.
I have a pretty large terrain mesh (heightmap), and I'd like to be able to divide this into smaller chunks... After reading posts and articles, I've found this about terrain LOD:
No you don't. In your typical terrain renderer the data is subdivided
into tiles. And usually those tiles subdivide again, and again to
implement level of detail. What sets the tiles apart are the vertices
they reference. So you'd have one large vertex array for the terrain
data, and a lot of index arrays for the tiles. By calling
glDrawElements with the right index arrays you can select which tiles
to draw at which level of detail.
Answer by datenwolf, link to the post:
OpenGL: Are VAOs and VBOs practical for large polygon rendering tasks?
EDIT:
I read my heightmap from a file, usually a .BMP image, and I displace a regular grid with these height samples. I'm using VBOs, VAOs, DrawElements(), triangles (not strips) and shaders (still without a tessellation shader; I will implement that next week).
Is there a good algorithm that uses this, or could somebody share an article about this method?
I searched Google for "quadtree terrain rendering" (I think you were missing the keyword quadtree) and this came up:
http://vterrain.org/LOD/Papers/
A lot of publications, the second one already looks very interesting:
Continuous Distance-Dependent Level of Detail for Rendering Heightmaps
I'm making a voxel engine in C++ and OpenGL (à la Minecraft) and can't get decent fps on my 3GHz with ATI X1600... I'm all out of ideas.
When I have about 12000 cubes on the screen it falls to under 20fps - pathetic.
So far the optimizations I have are: frustum culling, back face culling (via OpenGL's glEnable(GL_CULL_FACE)), the engine draws only the visible faces (except the culled ones of course) and they're in an octree.
I've tried VBOs, I don't like them and they do not significantly increase the fps.
How can Minecraft's engine be so fast... I struggle with 10000 cubes, whereas Minecraft can easily draw many more at a higher fps.
Any ideas?
@genpfault: I analyze the connectivity and just generate faces for the outer, visible surface. The VBO had a single cube that I glTranslate()d
I'm not an expert at OpenGL, but as far as I understand this is going to save very little time because you still have to send every cube to the card.
Instead what you should do is generate faces for all of the outer visible surface, put that in a VBO, and send it to the card and continue to render that VBO until the geometry changes. This saves you a lot of the time your card is actually waiting on your processor to send it the geometry information.
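A rough sketch of the face-generation step, assuming a fixed-size chunk of boolean voxels. `Chunk`, `Face`, `kChunkSize` and `buildVisibleFaces` are hypothetical names; turning each face into vertices and uploading them to the VBO is left out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int kChunkSize = 16; // hypothetical chunk edge length, in voxels

struct Chunk {
    std::array<bool, kChunkSize * kChunkSize * kChunkSize> solid{}; // all empty

    static std::size_t index(int x, int y, int z) {
        return static_cast<std::size_t>((z * kChunkSize + y) * kChunkSize + x);
    }
    static bool inside(int x, int y, int z) {
        return x >= 0 && y >= 0 && z >= 0 &&
               x < kChunkSize && y < kChunkSize && z < kChunkSize;
    }
    // Anything outside the chunk counts as empty in this sketch.
    bool isSolid(int x, int y, int z) const {
        return inside(x, y, z) && solid[index(x, y, z)];
    }
    void setSolid(int x, int y, int z) {
        assert(inside(x, y, z));
        solid[index(x, y, z)] = true;
    }
};

struct Face { int x, y, z, dir; }; // dir is an index into kNeighbors

constexpr int kNeighbors[6][3] = {
    {1, 0, 0}, {-1, 0, 0}, {0, 1, 0}, {0, -1, 0}, {0, 0, 1}, {0, 0, -1}};

// Emit one face for every side of a solid voxel that touches an empty cell.
// Rebuild this list only when the chunk's voxels change.
std::vector<Face> buildVisibleFaces(const Chunk& chunk)
{
    std::vector<Face> faces;
    for (int z = 0; z < kChunkSize; ++z)
        for (int y = 0; y < kChunkSize; ++y)
            for (int x = 0; x < kChunkSize; ++x) {
                if (!chunk.isSolid(x, y, z)) continue;
                for (int d = 0; d < 6; ++d) {
                    if (!chunk.isSolid(x + kNeighbors[d][0],
                                       y + kNeighbors[d][1],
                                       z + kNeighbors[d][2]))
                        faces.push_back({x, y, z, d});
                }
            }
    return faces;
}
```

Two adjacent cubes produce 10 faces instead of 12, because the two faces between them are never generated.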
You should profile your code to find out whether the bottleneck in your application is on the CPU or the GPU. For instance, it might be that your culling/octree algorithms are slow, and in that case it is not an OpenGL problem at all.
I would also keep count of the number of cubes you draw on each frame and display that on screen. Just so you know your culling routines work as expected.
Finally you don't mention if your cubes are textured. Try using smaller textures or disable textures and see how much the framerate increases.
gDEBugger is a great tool that will help you find bottlenecks with OpenGL.
I don't know if it's ok here to "bump" an old question, but a few things came to mind:
If your voxels are static, you can speed up the whole rendering process by using an octree for frustum culling, etc. Furthermore, you can also compile a static scene into a potentially visible set (PVS) in the octree. The main principle of a PVS is to precompute, for every node in the tree, which other nodes are potentially visible from it and to store pointers to them in a vector. When it comes to rendering, you first check which node the camera is in and then run frustum culling against all nodes in that node's PVS vector. (Carmack used something like that in the Quake engines, but with Binary Space Partitioning trees.)
If the shading of your voxels is fairly complex, it is also fast to do a depth-only pre-pass, without writing to the color buffer, just to fill the depth buffer. After that you render a second pass: disable writing to the depth buffer and render only to the color buffer while testing against the depth buffer. That way you avoid expensive shader computations that would later be overwritten by a new fragment closer to the viewer. (Carmack used that in Quake3.)
Another thing that will definitely speed things up is instancing. You store only the position of each voxel and, if necessary, its scale and other parameters in a texture buffer object (TBO). In the vertex shader you can then read the positions of the voxels to be spawned and create an instance of the voxel (i.e. a cube that is given to the shader in a vertex buffer object). So you send the 8 vertices + 8 normals (3 * sizeof(float) * 8 + 3 * sizeof(float) * 8 + floats for color/texture etc.) to the card only once in the VBO, and then only the positions of the cube instances (3 * sizeof(float) * number of voxels) in the TBO.
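The byte counts above can be worked through as a small calculation. This is only the arithmetic from the paragraph, ignoring the color/texture floats; the function names are made up.

```cpp
#include <cstddef>

constexpr std::size_t kFloatsPerVec3 = 3;
constexpr std::size_t kCubeCorners = 8;

// Shared cube sent once in the VBO: 8 positions + 8 normals.
constexpr std::size_t sharedCubeBytes()
{
    return 2 * kCubeCorners * kFloatsPerVec3 * sizeof(float);
}

// Per-instance data in the TBO: one position per voxel.
constexpr std::size_t instanceBytes(std::size_t voxelCount)
{
    return voxelCount * kFloatsPerVec3 * sizeof(float);
}
```

With 4-byte floats, the shared cube is 192 bytes, and 12000 voxels need 144000 bytes of instance positions.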
Maybe it is possible to parallelize things between the GPU and CPU by combining all 3 steps in 2 threads: in the CPU thread you check the octree's PVS and update a TBO for instancing in the next frame, while the GPU thread renders the 2 passes using a TBO for instancing that the CPU thread created in the previous step. After that you swap the TBOs. If the camera has not moved, you don't even have to redo the CPU calculations.
Another kind of tree you may be interested in is the so-called k-d tree, which is more general than an octree.
PS: sorry for my english, it's not the clearest....
There are 3rd-party libraries you could use to make the rendering more efficient. For example the C++ PolyVox library can take a volume and generate the mesh for you in an efficient way. It has built-in methods for reducing triangle count and helping to generate things like ambient occlusion. It's got a good community around it so getting support on the forum should be easy.
Have you used a common display list for all your cubes?
Do you skip calling the drawing code for cubes which are not visible to the user?
In my current project I render a series of basically cubic 3D models arranged in a grid. These 3D tiles form the walls of a dungeon level in a game, so they're not perfectly cubic, but I pay special attention to be certain all the edges line up and everything tiles correctly.
I'm interested in implementing a height-map deformation, which seems like it'd require me to manually deform the vertices of the 3D tiles, first by raising or lowering a corner, then by calculating a line between two corners and shifting all the vertices based on the height of that line. Seems pretty straightforward.
My current issue is this: I'm using OpenGL, which provides an optimization called VBOs, which basically are (to my understanding) static copies of the mesh kept in GPU memory for speed. I render using VBOs because I only use three basic models (L-corner, straight-wall, and a cap to join walls when they don't meet in an L). If I have to manually fiddle with the vertices of my models, it seems like I'd have to replace the content of the VBO every tile, which pretty much negates the point of using them.
It seems to me that I might be able to use simple rotation and translation transforms to achieve a similar effect, but I can't figure out how to do it without leaving gaps between the tiles. Any thoughts?
You may be able to use a vertex program on your GPU. The main difficulty (if I understand your problem correctly) is that vertex programs must rely on either global or per-vertex parameters, and there is a strictly limited amount of space available for each.
Without more details, I can only suggest being clever about how you set up the parameters...
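One possible way to keep the parameters small, assuming each tile only needs a height offset at each of its four corners, is to interpolate bilinearly between those corners. The sketch below is CPU-side C++ with made-up names; the same arithmetic could run in a vertex program with the four offsets passed per tile, so the shared VBO stays untouched.

```cpp
#include <cassert>

// Height offsets at the tile corners (x, z): (0,0), (1,0), (0,1), (1,1).
struct CornerHeights { float h00, h10, h01, h11; };

// u and v in [0, 1] give the vertex's horizontal position within the tile.
float heightOffsetAt(const CornerHeights& c, float u, float v)
{
    assert(u >= 0.0f && u <= 1.0f && v >= 0.0f && v <= 1.0f);
    const float edge0 = c.h00 + (c.h10 - c.h00) * u; // along the z = 0 edge
    const float edge1 = c.h01 + (c.h11 - c.h01) * u; // along the z = 1 edge
    return edge0 + (edge1 - edge0) * v;
}
```

Along any tile edge the result depends only on that edge's two corners, so neighbouring tiles that share corner heights line up without gaps.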