I've run into an issue with drawing a texture. The situation is as follows:
I've got a Linux box with ATI hardware and a proprietary ATI driver that is two or three years old, because ATI ditched support for the old hardware. I've got a custom application with a dedicated (mostly) 2D engine based on OpenGL. (It was built over years, is quite mature, and has never had problems like this.)
The problem happens when VRAM (which is taken from system memory, 2 GB in this particular case) is filled almost to the maximum with textures. When the scene contains a quad textured with a texture larger than 2048x2048, it is not drawn. When I time the individual surfaces, the one that takes the most time to draw is not the one textured with the big texture (about 87 us), but the next one drawn after it (~900 ms!).
The scene being drawn doesn't use all the textures in VRAM, only about 8% of them, but unfortunately I cannot free even a small part of the rest. The application usually runs under this kind of VRAM-stressed condition and has never behaved like this.
glGetError() returns no error.
All other textures are drawn normally.
I'm making an isometric (2D) game with SFML. I handle the drawing order (depth) by sorting all drawables by their Y position and it works perfectly well.
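For reference, a minimal sketch of the sort I'm doing (assuming the drawables are sf::Sprites; the names here are illustrative, not my actual code):

    #include <SFML/Graphics.hpp>
    #include <algorithm>
    #include <vector>

    // Sort by Y so objects lower on the screen (closer to the camera) draw last.
    void drawSorted(sf::RenderTarget& target, std::vector<sf::Sprite*>& drawables)
    {
        std::sort(drawables.begin(), drawables.end(),
                  [](const sf::Sprite* a, const sf::Sprite* b) {
                      return a->getPosition().y < b->getPosition().y;
                  });
        for (const sf::Sprite* s : drawables)
            target.draw(*s);
    }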
The game uses an enormous amount of art assets, such that the NPC, monster and player graphics alone are contained in their own 4k texture atlas. It is logistically not possible for me to put everything into one atlas; the target devices would not be able to handle textures of that size. Please do not focus on WHY it's impossible, and understand that I simply MUST use separate files for my textures in this case.
This causes a problem. Let's say I have a level with 2 NPCs and 2 pillars. The NPCs are in NPCs.png and the pillars are in CastleLevel.png. Depending on where the NPCs move, the drawing order (hence the OpenGL texture binding order) can be different. Let's say the Y positions are sorted like this:
npc1, pillar1, npc2, pillar2
This would mean that OpenGL has to switch between the 2 textures twice. My question is, should I:
a) keep the texture atlases, OR
b) divide them all into smaller png files (1 png per NPC, 1 png per pillar, etc.). Since the textures must be changed multiple times anyway, would it improve performance if OpenGL had to bind smaller textures instead?
Is it worth keeping the texture atlases because it will SOMETIMES reduce the number of draw calls?
Since the textures must be changed multiple times anyway, would it improve performance if OpenGL had to bind smaller textures instead?
Almost certainly not. The cost of a texture bind is fixed; it isn't based on the texture's size.
It would be better for you to either:
Properly batch your rendering. That is, when you say "draw NPC1", you don't actually draw it yet. You stick some data in an array, and later on, you execute "draw NPCs", which draws all of the NPCs you've buffered in one go.
Use a bigger texture atlas, probably involving array textures. Each layer of the array texture would be one of the atlases you load. This way, you only ever bind one texture to render your scene (see the sketch after this list).
Deal with it. 2D games aren't exactly stressful on the GPU or CPU. The overhead from the additional state changes will not be what knocks you down from 60FPS to 30FPS.
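To make the array-texture option concrete, here is a rough sketch of uploading two atlases as layers of a GL_TEXTURE_2D_ARRAY. The sizes, variable names and pixel pointers are placeholders, and as far as I know SFML's sf::Texture doesn't expose array textures, so this would mean dropping to raw OpenGL:

    GLuint atlasArray;
    glGenTextures(1, &atlasArray);
    glBindTexture(GL_TEXTURE_2D_ARRAY, atlasArray);

    // Allocate one 4096x4096 RGBA8 layer per atlas (2 here: NPCs and CastleLevel).
    glTexImage3D(GL_TEXTURE_2D_ARRAY, 0, GL_RGBA8,
                 4096, 4096, 2, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);

    // Upload each decoded PNG into its own layer.
    glTexSubImage3D(GL_TEXTURE_2D_ARRAY, 0, 0, 0, /*layer*/ 0,
                    4096, 4096, 1, GL_RGBA, GL_UNSIGNED_BYTE, npcPixels);
    glTexSubImage3D(GL_TEXTURE_2D_ARRAY, 0, 0, 0, /*layer*/ 1,
                    4096, 4096, 1, GL_RGBA, GL_UNSIGNED_BYTE, castlePixels);

    glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MAG_FILTER, GL_LINEAR);

    // In the shader, sample with a sampler2DArray and pass the layer index per
    // vertex; the whole scene then renders with this single texture binding.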
I have a problem when I render my skybox. I am using DirectX 11 with C++. The picture is too blurry. I think it might be that I'm using textures with too low a resolution. Currently the resolution is 1024x1024 for every face of the skybox, and my screen resolution is 1920x1080. On average I will be staring into one face of the skybox at all times, which means the 1024x1024 picture is stretched to fill my screen, and that is why it is blurry. I'm considering using 2048x2048 textures; I created a simple skybox texture at that size and it is not blurry anymore. But my problem is that it takes too much memory! Almost 100MB loaded to the GPU just for the background.
My question is: is there a better way to render skyboxes? I've looked around on the internet without much luck. Some say that the norm is 512x512 per face, but the blurriness then is unacceptable. I'm wondering how commercial games do their skyboxes. Do they use huge texture sizes? In particular, for those who have seen it, I love the Dead Space 3 space environment. I would like to create something like that. So how did they do it?
Firstly, the pixel density will depend not only on the resolution of your texture and the screen, but also the field of view. A narrow field of view will result in less of the skybox filling the screen, and thus will zoom into the texture more, requiring higher resolution. You don't say exactly what FOV you're using, but I'm a little surprised a 1k texture is particularly blurry, so maybe it's a bit on the narrow side?
Oh, and before I forget - you should be using compressed textures... 2k textures shouldn't be that scary.
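To put rough numbers on that (assuming no mipmaps):

    6 faces x 2048 x 2048 texels x 4 bytes (uncompressed RGBA8)     ~ 96 MB
    6 faces x 2048 x 2048 texels x 0.5 bytes (BC1/DXT1 compression) ~ 12 MB

That 96 MB is roughly the 100MB you're seeing, and it's why block compression makes 2k faces affordable.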
However, aside from changing the resolution, which obviously does start to burn through memory fairly quickly, I generally combine the skybox with some simple distant objects.
For example, in a space scene I would probably render a fairly simple skybox which only contained things like nebula, etc., where resolution wasn't critical. I'd perhaps render at least some of the stars as sprites, where the texture density can be locally higher. A planet could be textured geometry.
If I was rendering a more traditional outdoor scene, I could render sky and clouds on the skybox, but a distant horizon as geometry. A moon in the sky might be an overlay.
There is no one standard answer - a variety of techniques can be employed, depending on the situation.
I know that OpenGL deprecated and removed GL_QUADS in newer releases. I have heard this is because modern GPUs only render triangles, so submitting a quad would just make the GPU work harder to break it into two triangles (that's what I have heard, anyway; I am not much of an expert on this topic).
I was wondering whether it is better (assuming the average person's CPU is faster, relatively speaking, than their GPU) to manually break quads into two triangles yourself, or to just let OpenGL do it. Again, I have absolutely no real experience with OpenGL as I am just starting. I would rather know which is better for most machines these days so I can focus my attention on one rendering method*. Thanks.
*Yet I will probably utilize the 'triangle method' for the sake of it.
Even if you feed OpenGL quads, the triangulation is done by the driver on the CPU side before it even hits the GPU. Modern GPUs eat nothing except triangles. (Well, and lines and points.) So something will be triangulating, whether it's you or the driver; it doesn't matter too much where it happens.
It would be less efficient if, say, you don't reuse your vertex buffers and instead refill them with quads every time (in which case the driver has to re-triangulate every vertex buffer), rather than refilling them with pre-triangulated triangles every time. But that's pretty contrived, and the problem you should be fixing in that case is the fact that you're refilling your vertex buffers at all.
I would say, if you have the choice, stick with triangles, since that's what most content pipelines put out anyway, and you're less likely to run into problems with non-planar quads and the like. If you get to choose what format your content comes in, then use triangles for sure, and the triangulation step gets skipped altogether.
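If you do end up triangulating quad data yourself, the index pattern is mechanical; here's a minimal sketch (it assumes each quad's four vertices are stored consecutively and wound consistently):

    #include <cstdint>
    #include <vector>

    // Quad (v0, v1, v2, v3) becomes triangles (v0, v1, v2) and (v0, v2, v3).
    std::vector<uint32_t> triangulateQuads(uint32_t quadCount)
    {
        std::vector<uint32_t> indices;
        indices.reserve(quadCount * 6);
        for (uint32_t q = 0; q < quadCount; ++q)
        {
            const uint32_t base = q * 4;
            indices.push_back(base + 0);
            indices.push_back(base + 1);
            indices.push_back(base + 2);
            indices.push_back(base + 0);
            indices.push_back(base + 2);
            indices.push_back(base + 3);
        }
        return indices; // draw with glDrawElements(GL_TRIANGLES, ...)
    }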
Any geometry can be represented with triangles, and that is why it was decided to use triangles instead of quads. Another reason is that a triangle is always planar, while the four vertices of a quad do not have to be co-planar (the two triangles that make up a quad need not lie in the same plane).
Yes, you can choose to render quads, but the driver will convert each quad into two triangles.
Therefore, choosing to render quads will not make the GPU work less; it will make your CPU work more, because it has to do the conversion.
I've been working on a project, and the lab computers are packed with ATI series 6700 cards. The project is in C++ with OpenGL, so for the shaders I use GLSL. It is basically a planet with level of detail, so the number of polygons being sent to the shaders depends on the camera position: the closer you are, the more polygons there are for better definition. Bear in mind that I have indeed put a limit on the level of detail, otherwise everything would crash when the camera got too close, due to the insane amount of subdivisions.
However, when running the program on the lab computers, there's weird twitching of the mesh (some vertices appear at other positions) and, most importantly, there's a drop in framerate when I get too close. Eventually the whole thing freezes, both my screens turn black, and when Windows is visible again there's a message saying the drivers encountered a problem and had to recover.
I tried running my project on my personal laptop which has an nvidia GT555M and none of the above problems appear. No twitching, no drop in framerate and no crashing. Also, when I disable the shaders in the lab computers, the program doesn't crash, nor is there twitching, but there's still the drop in framerate. All of this makes me think it has something to do with the way NVidia and ATI interpret and compile GLSL.
First of all, I should say I'm only creating a VBO with 17x17 vertices, where each vertex has only 16 bytes, 3 floats for position and 4 bytes of padding for performance. So when creating the VBO and drawing it, I do not enable COLOR, TEXTURE or NORMAL, since there's only information about the vertex position.
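For illustration, the layout is roughly this (a simplified sketch, not the actual code):

    // One 16-byte vertex: position only, padded for alignment.
    struct Vertex {
        float position[3]; // 12 bytes
        char  padding[4];  // pad to 16 bytes
    };

    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER, 17 * 17 * sizeof(Vertex), vertices, GL_STATIC_DRAW);

    // Only the position array is enabled; no color, normal or texcoord arrays.
    glEnableClientState(GL_VERTEX_ARRAY);
    glVertexPointer(3, GL_FLOAT, sizeof(Vertex), 0);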
On the GPU side, however, I might be using non-standard GLSL or something like that, since I've read that ATI cards can only work with standardized GLSL. So here's my vertex shader and fragment shader:
http://pastebin.com/DE6ijidq
http://pastebin.com/Q8xUAguw
Bear in mind that what the C++ does is create 6 grids of 16 by 16 squares, one for each side of the cube, all centered on the origin. In the first line of main I displace them accordingly to create a cube, then I use the exploding-cube technique to turn it into a sphere and work on that. I don't specify a GLSL version and I'm using varying variables, which might be where the problem is. Is there a way to make this compatible with the ATI cards as well? Should I specify a version, and if so, which one? And should I use in and out variables instead, but then how? I've been testing my shaders in RenderMonkey and for some reason it still uses varying.
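For context, my current shaders are written roughly in this style (an illustrative fragment, not the actual pastebin code): no #version line, varying variables, fixed-function built-ins.

    const char* vertexShaderSrc =
        "varying vec3 worldPos;\n"
        "void main() {\n"
        "    worldPos    = gl_Vertex.xyz;\n"
        "    gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;\n"
        "}\n";

    const char* fragmentShaderSrc =
        "varying vec3 worldPos;\n"
        "void main() {\n"
        "    gl_FragColor = vec4(normalize(worldPos) * 0.5 + 0.5, 1.0);\n"
        "}\n";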
I would like to draw voxels using OpenGL, but it doesn't seem to be supported directly. I made a cube-drawing function that uses 24 vertices (4 vertices per face), but the frame rate drops when drawing 2500 cubes. I was hoping there was a better way. Ideally I would just like to send a position, edge size, and color to the graphics card. I'm not sure if I can do this by using GLSL to compile instructions as part of the fragment shader or vertex shader.
I searched Google and found out about point sprites and billboard sprites (same thing?). Could those be used as an alternative way to draw a cube more quickly? If I use 6, one for each face, it seems like that would send much less information to the graphics card and hopefully gain me a better frame rate.
Another thought: maybe I can draw multiple cubes using one glDrawElements call?
Maybe there is a better method altogether that I don't know about? Any help is appreciated.
Drawing voxels with cubes is almost always the wrong way to go (the exceptional case is ray-tracing). What you usually want to do is put the data into a 3D texture and render slices depending on camera position. See this page: https://developer.nvidia.com/gpugems/GPUGems/gpugems_ch39.html and you can find other techniques by searching for "volume rendering gpu".
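If you do go the 3D-texture route, the upload side is small; a minimal sketch (the dimensions and the voxelData pointer are placeholders):

    GLuint volumeTex;
    glGenTextures(1, &volumeTex);
    glBindTexture(GL_TEXTURE_3D, volumeTex);

    // One byte of density per voxel; voxelData points to dimX * dimY * dimZ bytes.
    glTexImage3D(GL_TEXTURE_3D, 0, GL_R8,
                 dimX, dimY, dimZ, 0,
                 GL_RED, GL_UNSIGNED_BYTE, voxelData);

    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
    glTexParameteri(GL_TEXTURE_3D, GL_TEXTURE_WRAP_R, GL_CLAMP_TO_EDGE);

    // The volume is then drawn as a stack of view-aligned slices, each sampling
    // the texture at its depth, blended back to front.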
EDIT: When writing the above answer I didn't realize that the OP was most likely interested in how Minecraft does it. For techniques to speed up Minecraft-style rasterization, check out "Culling techniques for rendering lots of cubes". Though with recent advances in graphics hardware, rendering Minecraft through ray tracing may become a reality.
What you're looking for is called instancing. You could take a look at glDrawElementsInstanced and glDrawArraysInstanced for a couple of possibilities. Note that these were only added as core operations relatively recently (OGL 3.1), but have been available as extensions quite a while longer.
nVidia's OpenGL SDK has an example of instanced drawing in OpenGL.
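A rough sketch of the idea, assuming a 36-index unit-cube mesh is already set up in a bound VAO at attribute 0, and that instanceVbo, instances and the InstanceData struct are your per-cube data (the names are illustrative; glVertexAttribDivisor is core in 3.3 or available via ARB_instanced_arrays):

    #include <cstddef>
    #include <vector>

    // Per-cube data uploaded once; one entry per instance.
    struct InstanceData {
        float         position[3];
        unsigned char color[4];
    };

    // instances: one InstanceData per cube, filled elsewhere.
    glBindBuffer(GL_ARRAY_BUFFER, instanceVbo);
    glBufferData(GL_ARRAY_BUFFER, instances.size() * sizeof(InstanceData),
                 instances.data(), GL_STATIC_DRAW);

    glEnableVertexAttribArray(1); // per-instance position
    glVertexAttribPointer(1, 3, GL_FLOAT, GL_FALSE, sizeof(InstanceData),
                          (void*)offsetof(InstanceData, position));
    glVertexAttribDivisor(1, 1); // advance once per instance, not per vertex

    glEnableVertexAttribArray(2); // per-instance color, normalized to 0..1
    glVertexAttribPointer(2, 4, GL_UNSIGNED_BYTE, GL_TRUE, sizeof(InstanceData),
                          (void*)offsetof(InstanceData, color));
    glVertexAttribDivisor(2, 1);

    // One call draws every cube; the vertex shader offsets each cube vertex by
    // the instance position (and can scale by an edge-size uniform).
    glDrawElementsInstanced(GL_TRIANGLES, 36, GL_UNSIGNED_INT, nullptr,
                            (GLsizei)instances.size());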
First, you really should be looking at OpenGL 3+ using GLSL; this has been the standard for quite some time. Second, most Minecraft-esque implementations use mesh creation on the CPU side. This technique involves looking at all of the block positions and creating a vertex buffer object that renders the triangles of all of the exposed faces. The VBO is only generated when the voxels change and is persisted between frames. An ideal implementation would combine coplanar faces of the same texture into larger faces.
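A minimal sketch of the exposed-face test that drives such a mesher (a dense occupancy grid is assumed; face emission, texturing and the coplanar merging are left out):

    #include <vector>

    // Simple dense voxel grid: true = solid block.
    struct VoxelGrid {
        int sx, sy, sz;
        std::vector<bool> solid; // sx * sy * sz entries

        bool isSolid(int x, int y, int z) const {
            if (x < 0 || y < 0 || z < 0 || x >= sx || y >= sy || z >= sz)
                return false; // outside the chunk counts as empty
            return solid[(z * sy + y) * sx + x];
        }
    };

    // Rebuild the chunk mesh: emit a face only where the neighboring cell is
    // empty. Called only when the voxels change; the result is kept in a VBO.
    template <typename EmitFace>
    void buildExposedFaces(const VoxelGrid& grid, EmitFace emitFace)
    {
        static const int dirs[6][3] = {
            { 1, 0, 0 }, { -1, 0, 0 },
            { 0, 1, 0 }, { 0, -1, 0 },
            { 0, 0, 1 }, { 0, 0, -1 },
        };

        for (int z = 0; z < grid.sz; ++z)
            for (int y = 0; y < grid.sy; ++y)
                for (int x = 0; x < grid.sx; ++x) {
                    if (!grid.isSolid(x, y, z))
                        continue;
                    for (int d = 0; d < 6; ++d)
                        if (!grid.isSolid(x + dirs[d][0], y + dirs[d][1], z + dirs[d][2]))
                            emitFace(x, y, z, d); // appends 4 vertices / 6 indices
                }
    }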