Large 3D scene streaming - opengl

I'm working on a 3D engine suitable for very large scene display.
Appart of the rendering itself (frustum culling, occlusion culling, etc.), I'm wondering what is the best solution for scene management.
Data is given as a huge list of 3D meshs, with no relation between them, so I can't generate portals, I think...
The main goal is to be able to run this engine on systems with low RAM (500MB-1GB), and the scenes loaded into it are very large and can contain millions of triangles, which leads to very intensive memory usage. I'm actually working with a loose octree right now, constructed on loading, it works well on small and medium scenes, but many scenes are just to huge to fit entirely in memory, so here come my question:
How would you handle scenes to load and unload chunks dynamically (and ideally seamlessly), and what would you base on to determine if a chunk should be loaded/unloaded? If needed, I can create a custom file format, as scenes are being exported using a custom exporter on known 3D authoring tools.
Important information: Many scenes can't be effectively occluded, because of their construction.
Example: A very huge pipe network, so there isn't so much occlusion but very high number of elements.

I think that the best solution will be a "solution pack", a pack of different techniques.
Level of detail(LOD) can reduce memory footprint if unused levels are not loaded. It can be changed more or less seamlessly by using an alpha mix between the old and the new detail. The easiest controller will use mesh distance to camera.
Freeing the host memory(RAM) when the object has been uploaded to the GPU (device), and obviously free all unsued memory (OpenGL resources too). Valgrind can help you with this one.
Use low quality meshes and use tessellation to increase visual quality.
Use VBO indexing, this should reduce VRAM usage and increase performance
Don't use meshes if possible, terrain can be rendered using heightmaps. Some things can be procedurally generated.
Use bump or/and normalmaps. This will improve quality, then you can reduce vertex count.
Divide those "pipes" into different meshes.
Fake 3D meshes with 2D images: impostors, skydomes...

If the vast amount of ram is going to be used by textures, there are commercial packages available such as the GraniteSDK that offer seamless LOD-based texture streaming using a virtual texture cache. See http://graphinesoftware.com/granite . Alternatively you can look at http://ir-ltd.net/
In fact you can use the same technique to construct poly's on the fly from texture data in the shader, but it's going to be a bit more complicated.
For voxels there is a techniques to construct oct-trees entirely in GPU memory, and page in/out the parts you really need. The rendering can then be done using raycasting. See this post: Use octree to organize 3D volume data in GPU , http://www.icare3d.org/research/GTC2012_Voxelization_public.pdf and http://www.cse.chalmers.se/~kampe/highResolutionSparseVoxelDAGs.pdf
It comes down to how static the scene is going to be, and following from that, how well you can pre-bake the data according to your vizualization needs. It would already help if you can determine visibility constraints up front (e.g. google Potential Visiblity Sets) and organize it so that you can stream it at request. Since the visualizer will have limits, you always end up with a strategy to fit a section of the data into GPU memory as quickly and accurately as possible.

Related

drawing time series of millions of vertices using OpenGL

I'm working on a data visualisation application where I need to draw about 20 different time series overlayed in 2D, each consisting of a few million data points. I need to be able to zoom and pan the data left and right to look at it and need to be able to place cursors on the data for measuring time intervals and to inspect data points. It's very important that when zoomed out all the way, I can easily spot outliers in the data and zoom in to look at them. So averaging the data can be problematic.
I have a naive implementation using a standard GUI framework on linux which is way too slow to be practical. I'm thinking of using OpenGL instead (testing on a Radeon RX480 GPU), with orthogonal projection. I searched around and it seems VBOs to draw line strips might work, but I have no idea if this is the best solution (would give me the best frame rate).
What is the best way to send data sets consisting of millions of vertices to the GPU, assuming the data does not change, and will be redrawn each time the user interacts with it (pan/zoom/click on it)?
In modern OpenGL (versions 3/4 core profile) VBOs are the standard way to transfer geometric / non-image data to the GPU, so yes you will almost certainly end up using VBOs.
Alternatives would be uniform buffers, or texture buffer objects, but for the application you're describing I can't see any performance advantage in using them - might even be worse - and it would complicate the code.
The single biggest speedup will come from having all the data points stored on the GPU instead of being copied over each frame as a 2D GUI might be doing. So do the simplest thing that works and then worry about speed.
If you are new to OpenGL, I recommend the book "OpenGL SuperBible" 6th or 7th edition. There are many good online OpenGL guides and references, just make sure you avoid the older ones written for OpenGL 1/2.
Hope this helps.

Reducing RAM usage with regard to Textures

Currently, My app is using a large amount of memory after loading textures (~200Mb)
I am loading the textures into a char buffer, passing it along to OpenGL and then killing the buffer.
It would seem that this memory is used by OpenGL, which is doing its own texture management internally.
What measures could I take to reduce this?
Is it possible to prevent OpenGL from managing textures internally?
One typical solution is to keep track of which textures you are needing at a given position of your camera or time-frame, and only load those when you need (opposed to load every single texture at the loading the app). You will have to have a "manager" which controls the loading-unloading and bounding of the respective texture number (e.g. a container which associates a string, name of the texture, with an integer) assigned by the glBindTexture)
Other option is to reduce the overall quality/size of the textures you are using.
It would seem that this memory is used by OpenGL,
Yes
which is doing its own texture management internally.
No, not texture management. It just need to keep the data somewhere. On modern systems the GPU is shared by several processes running simultanously. And not all of the data may fit into fast GPU memory. So the OpenGL implementation must be able to swap data out. The GPU fast memory is not storage, it's just another cache level. Just like the system memory is cache for system storage.
Also GPUs may crash and modern drivers reset them in situ, without the user noticing. For this they need a full copy of the data as well.
Is it possible to prevent OpenGL from managing textures internally?
No, because this would either be tedious to do, or break things. But what you can do, is loading only the textures you really need for drawing a given scene.
If you look through my writings about OpenGL, you'll notice that for years I tell people not to writing silly things like "initGL" functions. Put everything into your drawing code. You'll go through a drawing scheduling phase anyway (you must sort translucent objects far-to-near, frustum culling, etc.). That gives you the opportunity to check which textures you need, and to load them. You can even go as far and load only lower resolution mipmap levels so that when a scene is initially shown it has low detail, and load the higher resolution mipmaps in the background; this of course requires appropriate setting of minimum and maximum mip levels to be set as either texture or sampler parameter.

OpenGL models complexity level

A mobile application that I'm working on is expanding in scope. The client would like to have actual 3D objects in a product viewer within the app that a potential customer/dealer could zoom and rotate. I'm concerned about bringing a model into an OpenGL environment within a mobile device.
My biggest concern is complexity. I've looked at some of the engineering models for the products and some of them contain more than 360K faces! Does anyone know of any guidelines which would discuss how complex of an object OpenGL is able to handle?
Does anyone know of any guidelines which would discuss how complex of an object OpenGL is able to handle?
OpenGL is just a specification and doesn't deal with geometrical complexity. BTW: OpenGL doesn't treat geometry as coherent objects. For OpenGL it's just a bunch of loose points, lines or triangles that it throws (i.e. renders) to a framebuffer, one at a time.
Any considerations regarding performance make only sense with respect to an actual implementation. For example a low end GPU may be able to process as little as 500k vertices per second, while high end GPUs process several 10 million vertices per second with ease.

C++ GUI Development - Bitmap vs. Vector Graphics CPU Usage

I'm currently in the process of designing and developing GUI's for some audio applications made in C++ (using the Juce framework).
So far I've been playing with using bitmap graphics to create custom sliders and dials, by using 'film strip' style images to animate the components (meaning when the user interacts with a slider it triggers a method that changes the offset of a film-strip image to change the components appearance). Depending on the size of the original image and the number of 'frames', the CPU usage level changes quite dramatically.
Firstly, what would be the most efficient bitmap file format to use in terms of CPU consumption? At the moment I'm using PNG images.
Secondly, would it be more efficient to use vector graphics for these kind of graphical components? I understand the main differences between bitmap and vector graphics, but I haven't found any information regarding their CPU usage levels with regard to GUI interaction.
Or would CPU usage be down to the particular methods/functions/libraries/frameworks being used?
Thanks!
Or would CPU consumption be down to the particular methods/functions/libraries/frameworks being used?
Any of these things could influence it.
Pixel based images might take a while to read off of disk the bigger they are. Compressed types might take more time to uncompress. Vector might take more time to render when are loaded.
That being said, I would definitely not expect that your choice of image type to have any impact on its performance. Since you didn't provide a code example it is hard to speculate beyond that.
In general, you would expect that the run-time costs of the images to happen when they are loaded. So whenever you create an image object. If you create an images all over the place, then maybe its expensive. It is possible that your film strip is recreating the images instead of loading them once and caching them.
Before choosing bitmap vs. vector graphics, investigate if your graphics processor supports vector or bitmap graphics. Some things take a long time to draw as vectors.
Have you tried double-bufferring?
This is where you write to a buffer in memory while the display (graphics processor) is loading another.
Load your bitmaps from the resource once. Store them as memory snapshots to avoid the additional cost of translating them from a format.
Does your graphic processor support "blitting"?
Blitting is where the graphics processor can copy a rectangular area in memory (bitmap) and display it along with apply optional operations before displaying (such as XOR with existing bits).
Summary:
To improve your rendering speed, only convert images from the file into a bitmap form once. Store this somewhere. Refer to this converted bitmap as needed. Next, investigate and implement double buffering. Lastly, investigate and use bit-blitting or blitting.
Other optimization rules apply here too, such as reviewing the design, removing requirements, loop unrolling, passing images via pointer vs. copying them, and reduce "if" statements by using boolean logic and Karnaugh (sp?) maps.
In general, calculations for rendering vector graphics are going to take longer than blitting a rectangular region of a bitmap to the screen. But for basic UI stuff, neither should be particularly intensive.
You probably should do some profiling. Perhaps you're redrawing much more frequently than necessary. Or perhaps the PNG is being decoded each time you try to draw from it. (I'm not familiar with Juce.)
For a straight Windows app, I'd probably render vector graphics into a device-dependent bitmap once on startup and then just blit from the bitmap to the screen. Using vector gives you DPI independence, and blitting from a device-dependent bitmap is about the fastest way to paint a block of pixels. I believe the color matching is done when you render to the device-dependent bitmap, so you don't even have the ICM overhead on the screen drawing.
Vector graphics was ditched long ago - bitmap graphics are more performant. The thing is that you can send a bitmap to the GPU once and then render it forever more by a simple copy.
Secondly, the GPU uses it's own texture compression. DirectX is DXT5, I believe, but when the GPU sees the texture, it doesn't care what you loaded it from.
However, a modern CPU even with a crappy integrated GPU should have absolutely no problem with simple GUI rendering. If you're struggling, then it's time to look again at the technique you're using. Perhaps your framework is slow or your use of it is suboptimal.

Full HD 2D Texture Memory OpenGL

I am in the process of writing a full HD capable 2D engine for a company of artists which will hopefully be cross platform and is written in OpenGL and C++.
The main problem i've been having is how to deal with all those HD sprites. The artists have drawn the graphics at 24fps and they are exported as png sequences. I have converted them into DDS (not ideal, because it needs the directx header to load) DXT5 which reduces filesize alot. Some scenes in the game can have 5 or 6 animated sprites at a time, and these can consist of 200+ frames each. Currently I am loading sprites into an array of pointers, but this is taking too long to load, even with compressed textures, and uses quite a bit of memory (approx 500mb for a full scene).
So my question is do you have any ideas or tips on how to handle such high volumes of frames? There are a couple of ideas i've thought've of:
Use the swf format for storing the frames from Flash
Implement a 2D skeletal animation system, replacing the png sequences (I have concerns about the joints being visible tho)
How do games like Castle Crashers load so quickly with great HD graphics?
Well the first thing to bear in mind is that not all platforms support DXT5 (mobiles specifically).
Beyond that have you considered using something like zlib to compress the textures? The textures will likely have a fair degree of self similarity which will mean that they will compress down a lot. In this day and age decompression is cheap due to the speed of processors and the time saved getting the data off the disk can be far far more useful than the time lost to decompression.
I'd start there if i were you.
24 fps hand-drawn animations? Have you considered reducing the framerate? Even cinema-quality cel animation is only rarely drawn at the full 24-fps. Even going down to 18 fps will get rid of 25% of your data.
In any case, you didn't specify where your load times were long. Is the load from harddisk to memory the problem, or is it the memory to texture load that's the issue? Are you frequently swapping sets of texture data into the GPU, or do you just build a bunch of textures out of it at load time?
If it's a disk load issue, then your only real choice is to compress the texture data on the disk and decompress it into memory. S3TC-style compression is not that compressed; it's designed to be a useable compression technique for texturing hardware. You can usually make it smaller by using a standard compression library on it, such as zlib, bzip2, or 7z. Of course, this means having to decompress it, but CPUs are getting faster than harddisks, so this is usually a win overall.
If the problem is in texture upload bandwidth, then there aren't very many solutions to that. Well, depending on your hardware of interest. If your hardware of interest supports OpenCL, then you can always transfer compressed data to the GPU, and then use an OpenCL program to decompress it on the fly directly into GPU memory. But requiring OpenCL support will impact the minimum level of hardware you can support.
Don't dismiss 2D skeletal animations so quickly. Games like Odin Sphere are able to achieve better animation of 2D skeletons by having several versions of each of the arm positions. The one that gets drawn is the one that matches up the closest to the part of the body it is attached to. They also use clever art to hide any defects, like flared clothing and so forth.