OpenSceneGraph memory usage when resetting scene - c++

I have spent a great deal of time trying to figure out OSG's memory management.
I have a scene graph with several children (actually a LOD based on an octree).
However, when I need to reset my scene (I just want to wipe ALL nodes from the scene and also free the memory), I use:
// Clear main osg::Group root node
m_rootNode->removeChildren(0, m_rootNode->getNumChildren());
m_rootNode->dirtyBound();
// Clear Main view scene data from osg::Viewer
m_viewer->setSceneData(nullptr);
BEFORE I do this, I check all my nodes with a NodeVisitor pattern, and found out that ALL my nodes have reference count of 1, i.e, after clearing them from the scene, I expect my memory to be freed. However, this does not happen: my scene is actually reset, all the nodes disappear from the viewer, but the memory remains occupied.
Nonetheless, when I load another scene to my viewer, the memory is rewritten somehow (i.e., the memory usage does not increase, hence there is no memory leak, but used memory is always the same)
I can't have this behaviour, as I need to closely control memory usage. How can I do this?

Looks like OSG keeps cached instances of your data, either as CPU-side or GPU-side objects.
You could have a look at osgDB's options to disable caching in the first place (CACHE_NONE, CACHE_ALL & ~CACHE_ARCHIVES), but this can actually increase your memory consumption, as data is no longer re-used and may be loaded multiple times.
You could instruct osg::Texture to free the CPU-side texture data after it has been uploaded to OpenGL, in case you don't need it any more. This can be done conveniently via the osgUtil::Optimizer::TextureVisitor, which you would set up to change the AutoUnref setting of each texture to true. I think running osgUtil::Optimizer with OPTIMIZE_TEXTURE_SETTINGS achieves the same effect.
Then, after closing down your scene, as you did in your Question's code, you could explicitly instruct OSG's database pager to wipe its caches:
// allYourViews: any container of osgViewer::View* (placeholder name)
for (osgViewer::View* v : allYourViews)
{
    v->getDatabasePager()->cancel();
    v->getDatabasePager()->clear();
}
To finally get rid of all pre-allocated GPU-side objects and their CPU-side representations, you would need to destroy your views and GL contexts.
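To illustrate the kind of situation this answer has in mind, here is a minimal sketch that is NOT OSG code: std::shared_ptr stands in for osg::ref_ptr, and Node and ObjectCache are made-up names. It only shows why clearing the scene does not free anything while a cache still holds its own reference.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Stand-in for a scene node (hypothetical); counts live instances so the effect is visible.
struct Node {
    static int& liveCount() { static int n = 0; return n; }
    Node() { ++liveCount(); }
    ~Node() { --liveCount(); }
    Node(const Node&) = delete;
    Node& operator=(const Node&) = delete;
};

// Hypothetical loader-side cache that keeps one reference per loaded file.
class ObjectCache {
public:
    std::shared_ptr<Node> load(const std::string& file) {
        auto it = entries_.find(file);
        if (it != entries_.end()) return it->second;  // re-use the cached instance
        auto node = std::make_shared<Node>();
        entries_[file] = node;                        // the cache keeps its own reference
        return node;
    }
    // Drops the cache's references; nodes die once nobody else holds them.
    void clear() { entries_.clear(); }
    std::size_t size() const { return entries_.size(); }
private:
    std::map<std::string, std::shared_ptr<Node>> entries_;
};
```

With this model, dropping the scene's pointer leaves Node::liveCount() unchanged, and only ObjectCache::clear() brings it down. Whether the freed memory then goes back to the operating system is a separate question that depends on the allocator.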

Related

How to update vertex buffer data frequently in DirectX 11?

I am trying to update my vertex buffer data with the Map function in DX. It does update the data once, but if I iterate over it the model disappears. I am trying to manipulate vertices in real time by user input, and to do so I have to update the vertex buffer every frame while a vertex is selected.
Perhaps this happens because the Map function disables GPU access to the vertices until the Unmap function is called. So if the access is blocked every frame, it kind of makes sense that it cannot render the mesh. However, when I update the vertex every frame and then stop after some time, theoretically the mesh should show up again, but it doesn't.
I know that the proper way to update data every frame is to use constant buffers, but manipulating vertices with constant buffers might not be a good idea, and I don't think there is any other way to update the vertex data. I expect dynamic vertex buffers to be able to handle being updated every frame.
D3D11_MAPPED_SUBRESOURCE mappedResource;
ZeroMemory(&mappedResource, sizeof(D3D11_MAPPED_SUBRESOURCE));
// Disable GPU access to the vertex buffer data.
pRenderer->GetDeviceContext()->Map(pVBuffer, 0, D3D11_MAP_WRITE_DISCARD, 0, &mappedResource);
// Update the vertex buffer here.
memcpy((Vertex*)mappedResource.pData + index, pData, sizeof(Vertex));
// Reenable GPU access to the vertex buffer data.
pRenderer->GetDeviceContext()->Unmap(pVBuffer, 0);
As has already been answered, the key issue is that you are using Discard (which means you won't be able to retrieve the previous contents from the GPU). I thought I would add a little in terms of options.
The question I have is whether you require performance or the convenience of having the data in one location?
There are a few configurations you can try.
1. Set up your buffer to have both CPU read and write access. This means, though, that you will be pushing and pulling your buffer up and down the bus. In the end, it also causes performance issues on the GPU, such as blocking (waiting for the data to be moved back onto the GPU). I personally don't use this in my editor.
2. If memory is not the issue, keep a copy of your buffer on the CPU side; each frame, map with Discard and block-copy the data across. This is performant, but also memory intensive. You obviously have to manage the data partitioning and indexing into this space. I don't use this, but I toyed with it. Too much effort!
3. You bite the bullet: you map the buffer as per 2 and write each vertex object into the mapped buffer. I do this, and unless the buffer is freaking huge, I haven't had an issue with it in my own editor.
4. Use a compute shader to update the buffer: create a resource view and an access view, and pass the updates via a constant buffer. A bit of a sledgehammer to crack a walnut. And it still doesn't change the fact that you may need to pull the data back off the GPU, as per item 1.
There are some variations on managing the buffer that you can also play with, such as interleaving (2 copies, one on the GPU while the other is being written to). There are also some rather ornate mechanisms, such as building the content of the buffer in another thread and then flagging the update.
At the end of the day, DX 11 doesn't offer the ability (someone might know better) to edit the data in GPU memory directly; there is a lot of shifting between CPU and GPU.
Good luck with whichever technique you choose.
Mapping a buffer with the D3D11_MAP_WRITE_DISCARD flag causes the entire buffer content to become invalid. You cannot use it to update just a single vertex. Keep a copy of the buffer on the CPU side instead, and then update the entire buffer on the GPU side once per frame.
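The CPU-side copy idea can be sketched without any D3D headers. Everything below is an assumption for illustration: FakeDynamicBuffer is a made-up stand-in for a dynamic vertex buffer, and its mapDiscard() overwrites the storage with a sentinel to mimic the fact that the old contents are gone after a discard map (on real hardware they are simply undefined).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Vertex { float x, y, z; };

// Hypothetical stand-in for a D3D11 dynamic vertex buffer.
class FakeDynamicBuffer {
public:
    explicit FakeDynamicBuffer(std::size_t count) : data_(count) {}
    // Models Map(..., D3D11_MAP_WRITE_DISCARD, ...): previous contents are lost.
    Vertex* mapDiscard() {
        for (Vertex& v : data_) v = Vertex{-1.0f, -1.0f, -1.0f};
        return data_.data();
    }
    void unmap() {}
    const Vertex& at(std::size_t i) const { return data_[i]; }
private:
    std::vector<Vertex> data_;
};

// Keeps the authoritative vertices on the CPU and re-uploads all of them.
class ShadowedVertexBuffer {
public:
    explicit ShadowedVertexBuffer(std::vector<Vertex> initial)
        : shadow_(std::move(initial)), gpu_(shadow_.size()) { upload(); }

    // Edits only the CPU copy; nothing is mapped here.
    void setVertex(std::size_t index, const Vertex& v) {
        assert(index < shadow_.size());
        shadow_[index] = v;
        dirty_ = true;
    }
    // Call once per frame before drawing; uploads the whole buffer if needed.
    void flush() { if (dirty_) upload(); }
    const FakeDynamicBuffer& gpu() const { return gpu_; }
private:
    void upload() {
        Vertex* dst = gpu_.mapDiscard();
        std::copy(shadow_.begin(), shadow_.end(), dst);  // full copy, not one vertex
        gpu_.unmap();
        dirty_ = false;
    }
    std::vector<Vertex> shadow_;
    FakeDynamicBuffer gpu_;
    bool dirty_ = false;
};
```

In a real renderer the body of upload() would be the Map / copy / Unmap sequence from the question, except that the copy covers the entire buffer instead of a single Vertex at an offset.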
If you develop for UWP - use of map/unmap may result in sync problems. ID3D11DeviceContext methods are not thread safe: https://learn.microsoft.com/en-us/windows/win32/direct3d11/overviews-direct3d-11-render-multi-thread-intro.
If you update buffer from one thread and render from another - you may get different errors. In this case you must use some synchronization mechanism, such as critical sections. Example is here https://developernote.com/2015/11/synchronization-mechanism-in-directx-11-and-xaml-winrt-application/

OpenGL 3.3 vertex buffer deletion before frame finished

This is an advanced OpenGL question and, to be honest, it seems more like a driver bug. I know the standard explicitly states that deleting an object only deletes its name, so a generator function can return the same name again. However, it's not clear how to deal with this...
The situation is the following: I have a so called "transient" (C++) object (TO from now on), which generates GL objects, enqueues commands using them, then deletes them.
Now consider that I use more than one of this kind before I call SwapBuffers(). The following happens:
TO 1. generates a vertex buffer named VBO1, along with a VAO1 and other things
TO 1. calls some mapping/drawing commands with VBO1
TO 1. deletes the VAO1 and VBO1 (therefore the name VBO1 is freed)
TO 2. generates a vertex buffer object, now of course with the same name (VBO1) as the name 1 is deleted and available, along with another VAO (probably 1)
TO 2. calls some other mapping/drawing commands with this new VBO1 (different vertex positions, etc.)
TO 2. deletes the new VBO1
SwapBuffers()
And the result is: only the modifications performed by TO 1. are in effect. In a nutshell: I wanted to render a triangle, then a square, but I only got the triangle.
Workaround: not deleting the VBO, so I get a new name in TO 2. (VBO2)
I would like to ask for your help in this matter. I'm aware that I shouldn't delete/generate objects mid-frame, but aside from that, this "buggy" mechanism really disturbs me (I mean, how can I trust GL then?... short answer: I can't...)
(side note: I've been programming 3D graphics for 12 years, but this thing really gave me the creeps...)
I have similar problems with my multithreaded rendering code. I use a double buffering system for the render commands, so when I delete an object, it might be used in the next frame.
The short of it is that TO shouldn't directly delete the GL objects. It needs to submit the handle to a manager to queue for deletion between frames. With my double buffering, I add a small timer to count down 2 frames before releasing.
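A rough sketch of such a deletion manager, with no GL calls in it: DeferredDeleter and its callback are hypothetical names, and in a real renderer the callback would be something like a call to glDeleteBuffers for the queued name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical helper: queues GL object names and deletes them a few frames later.
class DeferredDeleter {
public:
    using Handle = std::uint32_t;
    using DeleteFn = std::function<void(Handle)>;

    DeferredDeleter(DeleteFn deleteFn, int framesToWait)
        : deleteFn_(std::move(deleteFn)), framesToWait_(framesToWait) {
        assert(deleteFn_ && framesToWait_ >= 0);
    }

    // Called by a transient object instead of deleting the buffer itself.
    void release(Handle h) { pending_.push_back(Entry{h, framesToWait_}); }

    // Called once per frame, after the buffer swap. An entry is deleted on the
    // first call where its countdown has already reached zero.
    void endFrame() {
        std::vector<Entry> keep;
        for (Entry& e : pending_) {
            if (e.framesLeft <= 0) {
                deleteFn_(e.handle);
            } else {
                --e.framesLeft;
                keep.push_back(e);
            }
        }
        pending_.swap(keep);
    }

    std::size_t pendingCount() const { return pending_.size(); }

private:
    struct Entry { Handle handle; int framesLeft; };
    DeleteFn deleteFn_;
    int framesToWait_;
    std::vector<Entry> pending_;
};
```

With framesToWait set to 2, a name released during frame N survives the endFrame() calls of frames N and N+1 and is deleted in the call that ends frame N+2, so the driver never hands the same name out again within the frame that released it.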
For my transient verts, I have a large chunk of memory that I write to for storage, and skip the VBO submission. I don't know what your setup is or how many vertices you are pushing, but you may not benefit from VBOs if you 1) regenerate every frame or 2) push small sets of verts. Definitely perf test with and without VBOs.
I found the cause of the problem, I think it's worth mentioning (so that other developers won't fall into the same hole). The actual problem is the VAO, or more precisely the caching of the VAO.
In Metal and Vulkan the input layout is completely independent of the actual buffers used: you only specify the binding point (location) where the buffer is going to be.
But not in OpenGL... the VAO actually holds a strong reference to the vertex buffer which was bound during its creation. Therefore the following thing happened:
VBO1 was created, VAO1 was created
VAO1 was cached in the pipeline cache
VBO1 was deleted, but only the name was freed, not the object
glGenBuffers() returns 1 again as the name is available
but VAO1 in the cache still references the old VBO1
the driver gets confused and doesn't let me modify the new VBO1
And the solution...well... For now when a vertex buffer gets deleted I delete any cached pipelines that reference that buffer.
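A minimal sketch of that eviction step, assuming each cached pipeline records the buffer names its VAO captured. PipelineCache, CachedPipeline and the string key are invented for illustration, and no GL calls are made.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <utility>

using BufferName = std::uint32_t;

// Hypothetical cached pipeline: remembers which buffer names its VAO captured.
struct CachedPipeline {
    std::uint32_t vao = 0;
    std::set<BufferName> referencedBuffers;
};

class PipelineCache {
public:
    void insert(const std::string& key, CachedPipeline p) {
        entries_[key] = std::move(p);
    }
    bool contains(const std::string& key) const {
        return entries_.count(key) != 0;
    }
    // Call right before a buffer is deleted. Returns the number of evicted entries.
    std::size_t evictReferencing(BufferName buffer) {
        std::size_t evicted = 0;
        for (auto it = entries_.begin(); it != entries_.end();) {
            if (it->second.referencedBuffers.count(buffer) != 0) {
                it = entries_.erase(it);  // real code would also delete the VAO here
                ++evicted;
            } else {
                ++it;
            }
        }
        return evicted;
    }
private:
    std::map<std::string, CachedPipeline> entries_;
};
```

The point is only the ordering: evict every cache entry that mentions the buffer first, then delete the buffer, so that a recycled name can never be matched against a stale VAO.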
On the long term tho: I'm going to maintain a separate cache for input layouts (even if it's part of the pipeline state), and move the transient object further up, so that it becomes less transient.
Welcome to the world of OpenGL...

How should I allocate/populate/update memory on GPU for different type of scene objects?

I'm trying to write my first DX12 app. I have no previous experience in DX11. I would like to display some rigid and some soft objects. Without textures for now. So I need to place into GPU some vertex/index buffers which I will never change later and some which I will change. And the scene per se isn't static, so some new objects can appear and some can vanish.
How should I allocate/populate/update memory on GPU for it? I would like to see high level overview easy to read and understand, not real code. Hope the question isn't too broad.
You said you are new to DirectX, so I strongly recommend that you stay away from DX12 and stick with DX11. DX12 is only useful for people who are already Experts (with a big E) and for projects that have to push very far or that have edge cases needing a DX12 feature not available in DX11.
But anyway, on DX12, as an example, to initialize a buffer you have to create instances of ID3D12Resource. You will need two, one in an upload heap and one in the default heap. You fill the first one on the CPU using Map. Then you need to use a command list to copy it to the second one. Of course, you have to manage the state of your resources with barriers (copy destination, shader resource, ...). You then need to execute the command list on the command queue. You also need to add a fence to wait for the GPU to complete before you can destroy the resource in the upload heap.
On DX11, you call ID3D11Device::CreateBuffer, providing the description struct with a SRV binding flag and a pointer to the CPU data you want to put in it… Done
It is slightly more complex for textures, as you have to deal with memory layout. So, as I stated above, you should focus on DX11. It is not degrading at all; both have their roles.
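The last step, keeping the upload-heap resource alive until the fence says the copy is done, can be sketched without any D3D12 types. UploadRetireQueue is a made-up name; the assumption is that the caller passes in the fence value it signalled after submitting the copy, and later the value the fence reports as completed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical queue of upload-heap resources waiting for the GPU copy to finish.
class UploadRetireQueue {
public:
    using ReleaseFn = std::function<void()>;

    // fenceValue: the value signalled on the command queue after the copy.
    // Values are expected to be non-decreasing, as fence values normally are.
    void retireAfter(std::uint64_t fenceValue, ReleaseFn release) {
        assert(release);
        assert(pending_.empty() || fenceValue >= pending_.back().fenceValue);
        pending_.push_back(Item{fenceValue, std::move(release)});
    }

    // completedValue: what the fence currently reports as completed.
    // Releases every resource whose copy the GPU has finished.
    void collect(std::uint64_t completedValue) {
        while (!pending_.empty() && pending_.front().fenceValue <= completedValue) {
            pending_.front().release();
            pending_.pop_front();
        }
    }

    std::size_t pendingCount() const { return pending_.size(); }

private:
    struct Item { std::uint64_t fenceValue; ReleaseFn release; };
    std::deque<Item> pending_;
};
```

In a real application the release callback would drop the last reference to the upload resource, and collect() would be called once per frame with the fence's completed value, so nothing has to block the CPU waiting for the GPU.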

Cocos2dx - Clearing specific textures from the cache

I am getting crashes on iPhone 4s due to "memory pressure." My game is set up like this:
Main scene sprite sheet that always stays in memory.
Individual game scene levels load from individual textures (not sprite sheets).
When a level is done and we return to the main scene, I would like those cached game scene textures to get dumped. What ends up happening is that when you play through 3-4 levels, it crashes as it runs out of memory as it never releases this memory after a level. I don't want the level textures to be cached past the lifespan of the game scene. When returning to the main scene, it needs to release this memory.
I have tried removing all of the game scene children which does nothing to memory. I have tried looking for a specific way to clear just these textures I have loaded in this game scene from the cache.
Any suggestions?
Are you using cocos2d v2? You probably have memory leaks since unused textures are removed when necessary. Try profiling your app to see if you have leaks and where they are at.
You also can call these methods yourself at the appropriate time, although I doubt you'd have to:
[[CCTextureCache sharedTextureCache] removeUnusedTextures];
[[CCSpriteFrameCache sharedSpriteFrameCache] removeUnusedSpriteFrames];
But again what you describe sounds more like memory leaks. When your application receives a memory warning the cached data is purged. During this purge the remove unused textures method is called on the texture cache, among other things. If you have 3/4 levels of data still lurking around long after you've exited those scenes, that sounds like a memory leak.
I'm assuming this only happens after visiting multiple scenes, and that the problem isn't simply that your 4th scene is trying to load more data than the device can handle.
You can remove specific textures calling:
In cocos2D-x v3.x :
Director::getInstance()->getTextureCache()->removeTextureForKey(ImageKeyName);
In cocos2D-x v2.x :
CCTextureCache::sharedTextureCache()->removeTextureForKey(ImageKeyName);
Where ImageKeyName is just the path of the image (the same you used to load the texture)
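To drop all of a level's textures at once, one option is to remember the keys as the level loads them. The sketch below is plain C++ with no cocos2d-x dependency: LevelTextureSet is a hypothetical helper, and the callback is where the removeTextureForKey call shown above would go.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <set>
#include <string>
#include <utility>

// Hypothetical helper: remembers which texture keys a level loaded
// so they can all be removed from the cache when the level ends.
class LevelTextureSet {
public:
    using RemoveFn = std::function<void(const std::string&)>;

    explicit LevelTextureSet(RemoveFn removeFn) : removeFn_(std::move(removeFn)) {
        assert(removeFn_);
    }

    // Call with the same path that was used to load the texture.
    void track(const std::string& key) { keys_.insert(key); }

    // Call when leaving the game scene; invokes the callback once per key.
    void releaseAll() {
        for (const std::string& key : keys_) removeFn_(key);
        keys_.clear();
    }

    std::size_t size() const { return keys_.size(); }

private:
    RemoveFn removeFn_;
    std::set<std::string> keys_;
};
```

The main sprite sheet is never tracked, so it stays cached. Note that removing a key only helps once the level's sprites themselves have been released, which is why the leak check suggested earlier is still worth doing.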

Reducing RAM usage with regard to Textures

Currently, My app is using a large amount of memory after loading textures (~200Mb)
I am loading the textures into a char buffer, passing it along to OpenGL and then killing the buffer.
It would seem that this memory is used by OpenGL, which is doing its own texture management internally.
What measures could I take to reduce this?
Is it possible to prevent OpenGL from managing textures internally?
One typical solution is to keep track of which textures you need at a given camera position or time frame, and to load only those when you need them (as opposed to loading every single texture when the app starts). You will need a "manager" which controls the loading, unloading and binding of each texture (e.g. a container which associates a string, the name of the texture, with the integer texture name passed to glBindTexture).
Another option is to reduce the overall quality/size of the textures you are using.
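The "manager" idea could look roughly like this. It is a sketch with assumptions: TextureManager and its two callbacks are invented names, the load callback is expected to create the texture and return its integer name, and the unload callback is expected to delete it. No GL calls are made here.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical on-demand texture manager: string name -> integer texture name.
class TextureManager {
public:
    using LoadFn = std::function<unsigned(const std::string&)>;
    using UnloadFn = std::function<void(unsigned)>;

    TextureManager(LoadFn load, UnloadFn unload)
        : load_(std::move(load)), unload_(std::move(unload)) {
        assert(load_ && unload_);
    }

    // Returns the texture id for `name`, loading it on first use.
    unsigned acquire(const std::string& name) {
        auto it = entries_.find(name);
        if (it == entries_.end())
            it = entries_.emplace(name, Entry{load_(name), true}).first;
        it->second.usedThisFrame = true;
        return it->second.id;
    }

    // Unloads every texture that was not acquired since the previous call.
    void endFrame() {
        for (auto it = entries_.begin(); it != entries_.end();) {
            if (!it->second.usedThisFrame) {
                unload_(it->second.id);
                it = entries_.erase(it);
            } else {
                it->second.usedThisFrame = false;
                ++it;
            }
        }
    }

    std::size_t residentCount() const { return entries_.size(); }

private:
    struct Entry { unsigned id; bool usedThisFrame; };
    std::map<std::string, Entry> entries_;
    LoadFn load_;
    UnloadFn unload_;
};
```

Unloading after a single unused frame is deliberately aggressive to keep the sketch short; a real manager would more likely wait several frames or work against a memory budget to avoid reloading textures that flicker in and out of view.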
"It would seem that this memory is used by OpenGL,"
Yes.
"which is doing its own texture management internally."
No, not texture management. It just needs to keep the data somewhere. On modern systems the GPU is shared by several processes running simultaneously. And not all of the data may fit into fast GPU memory. So the OpenGL implementation must be able to swap data out. The GPU fast memory is not storage, it's just another cache level. Just like the system memory is a cache for system storage.
Also, GPUs may crash, and modern drivers reset them in situ without the user noticing. For this they need a full copy of the data as well.
"Is it possible to prevent OpenGL from managing textures internally?"
No, because this would either be tedious to do, or break things. But what you can do is load only the textures you really need for drawing a given scene.
If you look through my writings about OpenGL, you'll notice that for years I have been telling people not to write silly things like "initGL" functions. Put everything into your drawing code. You'll go through a drawing scheduling phase anyway (you must sort translucent objects far-to-near, do frustum culling, etc.). That gives you the opportunity to check which textures you need, and to load them. You can even go as far as loading only the lower-resolution mipmap levels, so that a scene initially shows with low detail, and load the higher-resolution mipmaps in the background; this of course requires setting the minimum and maximum mip levels appropriately, either as a texture or as a sampler parameter.