I just implemented basic OpenGL rendering in my pygame application, thinking hardware acceleration would make the program run faster. Instead, it is much, much slower.
Looks like the problem is the drawing function. Here is my OpenGL drawing function:
def draw(self, screen):
    rect = self.texture.imagerect.copy()
    rect.x += self.xoffset
    rect.y += self.yoffset
    halfWidth = self.getWidth() / 2
    halfHeight = self.getHeight() / 2
    glEnable(GL_TEXTURE_2D)
    glBindTexture(GL_TEXTURE_2D, self.texture.getTexID())
    self.color.setGLColor()
    glPushMatrix()
    glTranslatef(rect.x, rect.y, 0)
    glRotatef(self.angle, 0, 0, 1)
    glBegin(GL_QUADS)
    glTexCoord2d(0, 0)
    glVertex2f(-halfWidth + self.pivot.x, -halfHeight + self.pivot.y)
    glTexCoord2d(0, 1)
    glVertex2f(-halfWidth + self.pivot.x, -halfHeight + self.getHeight() + self.pivot.y)
    glTexCoord2d(1, 1)
    glVertex2f(-halfWidth + self.getWidth() + self.pivot.x, -halfHeight + self.getHeight() + self.pivot.y)
    glTexCoord2d(1, 0)
    glVertex2f(-halfWidth + self.getWidth() + self.pivot.x, -halfHeight + self.pivot.y)
    glEnd()
    glPopMatrix()
What my profiler gives for the draw function:
ncalls tottime percall cumtime percall filename:lineno(function)
312792 20.395 0.000 34.637 0.000 image.py:61(draw)
the rest of my profiler text: (expires in 1 month)
http://pastebin.com/ApfiCQzw
my sourcecode
https://bitbucket.org/claysmithr/warbots/src
Note: when I set it to not draw any tiles, I get 60 fps! I also get 20 fps if I limit it to drawing only the tiles that appear on the screen, but that is still much slower than blitting.
Number of tiles I'm trying to draw (64x64 each): 15,625
Is there any way to test whether I am really getting hardware acceleration?
Should I just go back to blitting?
Edit: Does blitting automatically skip tiles that are not on the screen? That could be the reason why OpenGL is being so slow!
If I've understood correctly, you need to draw thousands of these textured quadrangles every frame. The slowdown comes from the multiple OpenGL calls you are making for each quadrangle - at least 16 in the code above.
What you need to do is draw in batches, many more primitives at a time.
To start with, can you merge tiles into bigger units? Any graphics card made in the last decade can handle 16K x 16K texture maps. The more quads you can draw without having to bind a new texture, the better.
Replace the glBegin .. glEnd blocks with vertex arrays. Even in the worst case of still drawing one quad at a time, you can replace the current 10 OpenGL calls with just three: glVertexPointer, glTexCoordPointer, and a single glDrawArrays, as in the sketch below.
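For example, here is a minimal PyOpenGL sketch of the one-quad case (assuming numpy; x0, y0, x1, y1 are placeholders for the corner positions your draw() currently computes):

import numpy as np
from OpenGL.GL import *

# 4 corners of one quad and their texture coordinates, built once per quad;
# x0, y0, x1, y1 stand for the values computed in draw() above
verts = np.array([[x0, y0], [x0, y1], [x1, y1], [x1, y0]], dtype=np.float32)
texcoords = np.array([[0, 0], [0, 1], [1, 1], [1, 0]], dtype=np.float32)

glEnableClientState(GL_VERTEX_ARRAY)
glEnableClientState(GL_TEXTURE_COORD_ARRAY)
glVertexPointer(2, GL_FLOAT, 0, verts)
glTexCoordPointer(2, GL_FLOAT, 0, texcoords)
glDrawArrays(GL_QUADS, 0, 4)  # replaces the whole glBegin..glEnd block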
Then start merging tile quads into bigger vertex arrays. Instead of having a glTranslatef, add the rect.x and rect.y values directly to each vertex.
The glRotatef is a problem if it really does have to be different for each tile. If it's limited to multiples of 90 degrees then you don't need it, instead just swap the texture coords around. For other values, work out how to use sin and cos to directly rotate the texture coords.
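For arbitrary angles, the math is just the standard 2D rotation about the origin; it works the same whether you apply it to vertex positions or texture coords, as long as the coords are expressed relative to the pivot (a small sketch):

from math import sin, cos, radians

def rotate_corner(x, y, angle_degrees):
    # rotate (x, y) about the origin; add the pivot/offset back afterwards
    a = radians(angle_degrees)
    return x * cos(a) - y * sin(a), x * sin(a) + y * cos(a)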
Once you've eliminated the translate and rotate per tile, you can stick all the calculated quadrangle vertex and texture coords into giant vertex arrays and draw the entire map with just two calls.
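Continuing the sketch above (same imports), the batched version might look roughly like this, assuming all tiles share one texture and each tile object stores its pre-translated corners and texture coords (illustrative names, not your actual classes):

all_verts, all_tex = [], []
for tile in tiles:
    all_verts.extend(tile.corners)    # 4 (x, y) pairs, rect.x/rect.y already added in
    all_tex.extend(tile.texcoords)    # 4 (u, v) pairs into the shared texture
verts = np.array(all_verts, dtype=np.float32)
tex = np.array(all_tex, dtype=np.float32)

glVertexPointer(2, GL_FLOAT, 0, verts)
glTexCoordPointer(2, GL_FLOAT, 0, tex)
glDrawArrays(GL_QUADS, 0, len(verts))  # the entire map in one draw call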
Hope this helps.
(For real hyper performance you'd probably want to use GPU side vertex buffer objects and shaders, but from your coding style I assume you want to stick with OpenGL 1/2.)
Related
I am using modern OpenGL to render a 3D scene with a texture atlas which holds all of my textures.
The texture coordinate calculation is done very simply and gets loaded into a VBO for the shader. Here is some basic calculation in pseudo code:
int posX = slot % texturesInAtlasX, posY = slot / texturesInAtlasX;
float startX = 1f / texturesInAtlasX * posX;
float startY = 1f / texturesInAtlasY * posY;
Normally I use this to calculate the 4 texture coordinates of a rectangle in order to apply a part of the atlas to it. But the complexity of my scene forces me to simplify my meshes by merging faces that share the same texture into one. Normally this is not a big deal, because in OpenGL you can easily tile a texture by raising your coords above 1. But since I am using a texture atlas, this cannot be done simply by changing the way the texture coords are calculated.
My idea was to pass the atlas slot to use and the size of the face to the shader, and somehow tile that part of the atlas across the face. But I have absolutely no idea how to do that. As my objects move individually across the scene, loading the objects into one VAO per texture or batching the objects per texture is not an option, because that absolutely kills the performance.
So my question is: is there any feature in OpenGL that can help me tile only part of a texture?
Ok, after hours of research I finally found a solution to my problem.
The feature I found is called "texture arrays"; it has been in core OpenGL since version 3.0.
So it's safe to use and has no big impact on performance. In my case it actually gave me a big performance improvement. It was also really simple to implement.
If anyone has a similar problem, here is a link that explains how texture arrays work: https://www.khronos.org/opengl/wiki/Array_Texture
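For reference, creating one looks roughly like this (a PyOpenGL sketch for brevity; the GL calls map 1:1 to any binding, and width, height, layer_images are placeholders for your slot size and per-slot images):

tex = glGenTextures(1)
glBindTexture(GL_TEXTURE_2D_ARRAY, tex)
# allocate storage: one layer per former atlas slot
glTexImage3D(GL_TEXTURE_2D_ARRAY, 0, GL_RGBA8, width, height, len(layer_images),
             0, GL_RGBA, GL_UNSIGNED_BYTE, None)
for i, image in enumerate(layer_images):
    glTexSubImage3D(GL_TEXTURE_2D_ARRAY, 0, 0, 0, i,
                    width, height, 1, GL_RGBA, GL_UNSIGNED_BYTE, image)
glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_MIN_FILTER, GL_LINEAR)
# the point of the exercise: GL_REPEAT now wraps within a single layer,
# so raising the coords above 1 tiles again
glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_WRAP_S, GL_REPEAT)
glTexParameteri(GL_TEXTURE_2D_ARRAY, GL_TEXTURE_WRAP_T, GL_REPEAT)

In the fragment shader you then sample through a sampler2DArray with a vec3 coordinate whose third component is the layer index.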
I'm programming a prototype of a tile-based game to learn SDL. My map is just a 257x257 array of tiles; each tile is 1x1 to 60x60 pixels (different zoom levels). The SDL window has a resolution of 1024x768, so I can display anywhere from 18x13 up to 1024x768 tiles.
So far I have tried two approaches.
1st: Render from Tiles
//for (at worst) 1024*768 Tiles
SDL_Rect Tile;
SDL_SetRenderDrawColor(gRenderer, /*some color*/ , 255);
Tile = { Tile_size * x, Tile_size * y, Tile_size, Tile_size };
SDL_RenderFillRect(gRenderer, &(Tile));
Con: it is way too time-consuming, and the game starts lagging when I try to move the map.
2nd: create a texture before the game starts
with: SDL_CreateRGBSurface, SDL_FillRect, SDL_CreateTextureFromSurface
Con: the texture would be (257x257 tiles) x (60x60 pixels/tile) x (32 bits/pixel) ≈ 951 MB, and with multiple textures for different zoom steps it is way too huge to handle.
I'd appreciate any tips to improve the performance.
The first example just draws a single filled rectangle... That can't be slow, I'd have to see more to give a better answer.
In general you'll want to render only the tiles that are visible on the screen, not every tile on the map. With 60x60 tiles you can get away with just using SDL's 2D drawing functions then. When you add different zoom levels, I'm afraid you won't be able to keep the same approach with 1x1-pixel tiles - you would end up issuing a function call per pixel!
So once you add different zoom levels you'll have to figure out how to get that on screen - and what that's supposed to mean to the player anyway :)
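To make the "only visible tiles" idea concrete, the loop bounds come straight from the camera position (a sketch in Python for brevity; camera_x/camera_y are the top-left of the view in pixels, and all names are illustrative):

def visible_tiles(camera_x, camera_y, screen_w, screen_h, tile_size, map_size):
    # first row/column that overlaps the screen
    first_col = max(0, camera_x // tile_size)
    first_row = max(0, camera_y // tile_size)
    # +1 so a partially visible tile at the far edge is still drawn
    last_col = min(map_size, (camera_x + screen_w) // tile_size + 1)
    last_row = min(map_size, (camera_y + screen_h) // tile_size + 1)
    for row in range(first_row, last_row):
        for col in range(first_col, last_col):
            yield col, row, col * tile_size - camera_x, row * tile_size - camera_y

At 1024x768 with 60x60 tiles that is roughly 18x14 ≈ 250 rects per frame instead of 257x257 = 66,049.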
I've never been able to understand the best practice in this context. I usually want to ship my game at the minimum possible size, so wherever possible I try to use scaling of graphics. Suppose I have to draw a 1000 x 300 px wall of yellow color in my game. I usually just use a 3 x 3 px yellow image and stretch it in-game (using a nearest-neighbor filter). Is this the right approach?
Let us consider another situation. Suppose I wish to render rain in my game: basically 2 x 30 px blue-white gradient streaks, with at most 200 drops rendered at any time. If I just ship a 2 x 6 px streak with the game and scale it at runtime, will it affect performance?
In short how does scaling affect performance in OpenGL?
It seems you are asking whether scaling the textures will affect rendering performance.
For your wall example there will be no difference in the vertex shader. In the fragment shader you will sample the texture: the GPU multiplies the texture coordinate by the size of the texture, rounds the resulting coordinates, and grabs the corresponding pixel from the texture buffer. It doesn't matter how large the texture is; the operations are the same.
The same goes for the rain, except that linear filtering will fetch a few more texels and blend them according to how close the texture coordinate is to each of them.
Beyond that, you should think of the memory required to store the textures on the GPU: 9 pixels need a lot less memory than 300,000 pixels.
Let me give you a little advice. If your target is to minimize app size, I would recommend you generate a one-pixel white texture and draw it tinted with a different color for every case where possible (e.g. a wall of monochrome green).
import com.badlogic.gdx.graphics.Color;
import com.badlogic.gdx.graphics.Pixmap;
import com.badlogic.gdx.graphics.Texture;

public static Texture createPixelTexture() {
    Pixmap pixmap = new Pixmap(1, 1, Pixmap.Format.RGBA8888);
    // Color.rgba8888() matches the pixmap's RGBA8888 format; toIntBits()
    // packs ABGR and only happens to give the same bits for white.
    pixmap.drawPixel(0, 0, Color.rgba8888(Color.WHITE));
    Texture texture = new Texture(pixmap);
    pixmap.dispose(); // the Texture keeps its own copy of the pixel data
    return texture;
}
Be aware that this method gives you an unmanaged texture: you have to recreate it each time your app loses the GL context, and of course you have to call dispose() on it.
This is suitable when you just need to draw something monochrome; if you need a gradient, you have another problem... In that situation you can still use the same one-pixel white texture, but with a custom fragment shader. That, however, is a path for serious guys who love to get into trouble and then solve it (sometimes very slowly), because managing different shaders (you'll need at least 2, I'm sure) will complicate your drawing cycle, and you have to manage them somehow...
So, I just wanted give you a point. Good luck!
This is a beginner's question, but I am a little confused about how this works in OpenGL. Also, I am aware that in practice, given the figures I'm using, this won't make a real difference in terms of framerate, but I would like to understand how this works.
My problem
Let's say I want to display a starry night sky, i.e. a set of white points on a black background. Assuming the stars do not move, I can think of two options to do that in OpenGL:
Define each star as an OpenGL vertex and call glDrawElements to render them.
Use a full-screen texture, rendered on an OpenGL quad.
Let's say my screen is 1920x1080 and I want to draw 1000 stars. Then, if we quickly compare the workload associated with each option: the first one has to draw 1000 vertices, whereas the second one uses only 4 vertices but must uselessly render 1920x1080 ≈ 2x10^6 pixels.
My questions
Should I conclude that the first option is the most efficient? If not, why?
I'm particularly interested in OpenGL ES (2.0), so is the answer the same for OpenGL and OpenGL ES (2.0)?
It totally depends on the resolution. You're right that you'd limit the vertex count, but you have to understand the graphics pipeline.
Even though the texture is only black and white, OpenGL has to work with every texel of it, and it gets even more expensive if you don't use mipmapping (auto-generated lower-resolution variants of the texture used at a distance). Let's say you're using a 640x480 texture for the stars, applied to the quad in the sky. Then OpenGL has to process 4 vertices and 307,200 texels for your sky, each texel having four components (r, g, b, a).
Indeed, you'd only have to compute 4 vertices, but a huge amount of texels on top of that. So if you really have this black sky with ~1000 stars, it should be more efficient to draw a vertex array with glDrawElements. And yes, it should be the same for OpenGL and GLES.
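To illustrate the point-based option, a minimal fixed-function sketch (PyOpenGL with numpy; glDrawArrays suffices here since unindexed points need no element list, and under ES 2.0 the same data would go into a VBO with a trivial shader):

import numpy as np
from OpenGL.GL import *

# 1000 random star positions in clip space (-1..1); identity matrices assumed
stars = (np.random.rand(1000, 2) * 2.0 - 1.0).astype(np.float32)

glClear(GL_COLOR_BUFFER_BIT)            # clear color defaults to black
glColor3f(1.0, 1.0, 1.0)                # white stars
glEnableClientState(GL_VERTEX_ARRAY)
glVertexPointer(2, GL_FLOAT, 0, stars)
glDrawArrays(GL_POINTS, 0, len(stars))  # 1000 vertices, no full-screen fill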
I was wondering how to create a wall in OpenGL that continuously appears at the top of the screen and disappears at the bottom. I am able to construct a wall from GL_QUADS with texture mapping, but I do not know how to generate it dynamically as the player climbs up.
You have several possibilities.
Create one quad for, say, one meter. Render it 100 times, from floor(playerPos.z) to 100 meters ahead. Repeat for the opposite wall.
Create one quad for 100 meters. Set the U texture coordinate of the quad to playerPos.z and playerPos.z + 100. Set the texture mapping to GL_REPEAT.
The second one is faster (only 2 quads) but doesn't let you choose different textures for different parts of the wall.
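The second option in code form, as a rough PyOpenGL sketch (wall_tex and player_z are placeholders; one texture repeat per meter of wall):

z0, z1 = player_z, player_z + 100.0
glBindTexture(GL_TEXTURE_2D, wall_tex)
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT)
glBegin(GL_QUADS)
# U runs from player_z to player_z + 100, so the texture stays glued
# to the world while the player moves
glTexCoord2f(z0, 0.0); glVertex3f(-1.0, 0.0, -z0)
glTexCoord2f(z1, 0.0); glVertex3f(-1.0, 0.0, -z1)
glTexCoord2f(z1, 1.0); glVertex3f(-1.0, 2.0, -z1)
glTexCoord2f(z0, 1.0); glVertex3f(-1.0, 2.0, -z0)
glEnd()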
You don't have to make a "dynamic wall" (i.e. change glVertex* values every frame). Just change your camera position (modelview matrix) with the glTranslatef function.
(I hope I understood your question correctly)