I am using FFmpeg to decode a high-resolution video (6480*1920) and OpenGL to display it.
After decoding, I get three pointers that point to the Y, U and V planes.
At first I used swscale to convert the frame to RGB and display it, but that turned out to be too slow, so I now work with the YUV data directly. My second attempt was to create three single-channel textures and convert them to RGB in the fragment shader. That is faster, but it still cannot reach 60 fps.
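For reference, the per-pixel arithmetic that such a fragment shader performs can be sketched on the CPU. This is only a sketch under assumptions the question does not confirm: full-range BT.601 coefficients, 4:2:0 chroma subsampling, and tightly packed planes with no row padding. The names Rgb, toByte, yuvToRgb and pixelAt are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

struct Rgb { std::uint8_t r, g, b; };

// Clamp to [0, 255] and round to the nearest byte.
inline std::uint8_t toByte(float v) {
    return static_cast<std::uint8_t>(std::min(255.0f, std::max(0.0f, v)) + 0.5f);
}

// Full-range BT.601 conversion (assumption: the video may use another
// matrix or limited range, in which case the constants differ).
inline Rgb yuvToRgb(std::uint8_t y, std::uint8_t u, std::uint8_t v) {
    const float Y = y;
    const float U = u - 128.0f;
    const float V = v - 128.0f;
    return { toByte(Y + 1.402f * V),
             toByte(Y - 0.344136f * U - 0.714136f * V),
             toByte(Y + 1.772f * U) };
}

// Read one pixel from three planes, assuming 4:2:0 subsampling (one U and
// one V sample per 2x2 block of Y samples) and no padding between rows.
inline Rgb pixelAt(const std::uint8_t* yPlane, const std::uint8_t* uPlane,
                   const std::uint8_t* vPlane, int width, int height,
                   int x, int y) {
    assert(x >= 0 && x < width && y >= 0 && y < height);
    const int chromaWidth = (width + 1) / 2;
    const int chromaIndex = (y / 2) * chromaWidth + (x / 2);
    return yuvToRgb(yPlane[y * width + x], uPlane[chromaIndex], vPlane[chromaIndex]);
}
```

A neutral chroma value of 128 leaves the luma untouched, so yuvToRgb(128, 128, 128) yields a mid gray.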
I found that the bottleneck is this call: texture(texy, tex_coord.xy). When the texture is large, it costs a lot of time. So instead of calling it three times, my idea is to put Y, U and V into one single texture, since a texture can have four channels. But I don't know how to update a single channel of a texture.
I tried the following code, but it does not work. Instead of updating one channel, glTexSubImage2D changes the whole texture:
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, frame->width, frame->height,0, GL_RED, GL_UNSIGNED_BYTE, Y);
glTexSubImage2D(GL_TEXTURE_2D,0,0,0,frame->width, frame->height, GL_GREEN,U);
glTexSubImage2D(GL_TEXTURE_2D,0,0,0,frame->width, frame->height, GL_BLUE,V);
So how can I use one texture to pass the YUV data? I also tried gathering the YUV data into one array and then creating the texture from it, but that does not help because building the array takes a lot of time.
Any good ideas?
You're approaching this from the wrong angle, since you don't actually understand what is causing the poor performance in the first place. Yes, texture access is a rather expensive operation. But it is not that expensive; just think about the amount of texture data that gets pushed around in modern games at very high frame rates.
The problem is not the channel format of the texture, nor is it the call to the GLSL texture function.
Your problem is this:
(…) high resolution video (6480*1920)
Plain and simple, the dimensions of the frame are outside the range the GPU is comfortable working with. Try breaking the picture down into a set of smaller textures. Using the glPixelStorei parameters GL_UNPACK_ROW_LENGTH, GL_UNPACK_SKIP_PIXELS and GL_UNPACK_SKIP_ROWS you can select the rectangle inside your source picture to copy.
You don't have to make several draw calls BTW, just select the texture inside the shader based on the target fragment position or texture coordinate.
Unfortunately OpenGL doesn't offer a convenient function to determine the sweet spot. For most GPUs these days the maximum comfortable size in either direction for dense textures is 2048; go above it and, in my experience, performance tanks.
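As a rough sketch of the tiling idea, the rectangles to upload can be computed up front. The Tile struct and splitIntoTiles are hypothetical names, and the 2048 limit is simply the rule of thumb from this answer passed in as a parameter. The GL calls appear only as comments because they need a live context.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One rectangle of the source frame. Its fields map onto the pixel-store state:
//   GL_UNPACK_ROW_LENGTH  = full frame width in pixels (same for every tile)
//   GL_UNPACK_SKIP_PIXELS = x
//   GL_UNPACK_SKIP_ROWS   = y
struct Tile { int x, y, width, height; };

// Cover the frame with tiles no larger than maxTile in either direction.
std::vector<Tile> splitIntoTiles(int frameWidth, int frameHeight, int maxTile) {
    assert(frameWidth > 0 && frameHeight > 0 && maxTile > 0);
    std::vector<Tile> tiles;
    for (int y = 0; y < frameHeight; y += maxTile) {
        for (int x = 0; x < frameWidth; x += maxTile) {
            tiles.push_back({x, y,
                             std::min(maxTile, frameWidth - x),
                             std::min(maxTile, frameHeight - y)});
        }
    }
    return tiles;
}

// Per tile t, with that tile's texture bound, the upload would look like:
//   glPixelStorei(GL_UNPACK_ROW_LENGTH, frameWidth);
//   glPixelStorei(GL_UNPACK_SKIP_PIXELS, t.x);
//   glPixelStorei(GL_UNPACK_SKIP_ROWS, t.y);
//   glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, t.width, t.height,
//                   GL_RED, GL_UNSIGNED_BYTE, plane);
```

For the 6480*1920 frame and a limit of 2048 this gives four tiles in a single row, the last one 336 pixels wide.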
Sparse textures are an entirely different chapter, and irrelevant for this problem.
And just for the sake of completeness: I take it that you don't reinitialize the texture for each and every frame with a call to glTexImage2D. Do that only once at the start of the video, then just update the texture(s) with glTexSubImage2D.
I am here because I'm working on an OpenGL program and I have some performance issues. I work with OpenGL ES 3.0 on an iMX6 SoC.
Here is my algorithm:
I get an image from the camera, which is mapped directly to a texture.
Using an FBO, I render to a texture to map the image onto a specific shape.
I do the same thing (with a second FBO) for another image, which is sent via shared memory by another application. This step is performed only when the image is updated, which happens only once per second.
I blend these two textures in the default framebuffer to render the result to the screen.
If I perform these three steps separately, it works well and the screen is updated at 30 FPS. But when I combine the three steps in one program, rendering is very slow and I get only 0.5 FPS.
I am wondering whether the GPU on the iMX6 is powerful enough, but I don't think it is a complex algorithm. I think I am doing something wrong, but what?
I use 3 different framebuffers; is that a good approach, or should I use only one?
Can someone give me an answer, clues, anything that can help me? :-)
My images are 1280x1024, RGBA. I am also doing some conversions from floating-point texture to integer and back to float; this is done to perform bitwise operations on pixels.
Thanks to #Columbo: the problem came from all the conversions. I now work with floating-point textures and do the conversion only for the bitwise operations, which improves the performance of the algorithm a lot.
Another point that decreased performance was the texture format. For the first step, the image was 1280x1024 but with only one component (a grayscale image). To keep only the grayscale component and not use too much memory, I worked with a GL_RED texture, but this wasn't a good idea: when I changed it to GL_RGB, the render framerate doubled as well.
I want to create an OpenGL 2D texture and set the RGBA values of every pixel individually. Can someone explain how to do this? I didn't find an explanation on the internet.
If you're just looking to write the pixels of a 2D texture, you can simply use glTexImage2D, which takes a buffer specifying the pixel data you wish to upload to the texture (https://www.opengl.org/sdk/docs/man/html/glTexImage2D.xhtml). Alternatively, you can use glTexSubImage2D to write a portion of the texture's pixels (https://www.khronos.org/opengles/sdk/docs/man/xhtml/glTexSubImage2D.xml). If you're instead looking to do the analogous thing with the framebuffer, you can use glDrawPixels (https://www.opengl.org/sdk/docs/man2/xhtml/glDrawPixels.xml).
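A minimal sketch of the CPU side of that approach: build a tightly packed RGBA buffer, set individual pixels, then hand the buffer to glTexImage2D. The Image struct and setPixel are illustrative names, not part of any library, and the upload call is shown as a comment because it needs a GL context.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Image {
    int width;
    int height;
    std::vector<std::uint8_t> pixels; // RGBA, 4 bytes per pixel, row-major

    Image(int w, int h)
        : width(w), height(h),
          pixels(static_cast<std::size_t>(w) * static_cast<std::size_t>(h) * 4, 0) {}

    // Write one pixel; row 0 is the first row handed to OpenGL.
    void setPixel(int x, int y, std::uint8_t r, std::uint8_t g,
                  std::uint8_t b, std::uint8_t a) {
        assert(x >= 0 && x < width && y >= 0 && y < height);
        const std::size_t i =
            (static_cast<std::size_t>(y) * static_cast<std::size_t>(width) + x) * 4;
        pixels[i + 0] = r;
        pixels[i + 1] = g;
        pixels[i + 2] = b;
        pixels[i + 3] = a;
    }
};

// With a texture bound, the whole buffer would be uploaded with:
//   glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, img.width, img.height, 0,
//                GL_RGBA, GL_UNSIGNED_BYTE, img.pixels.data());
```

Because each RGBA pixel is 4 bytes, every row already satisfies the default unpack alignment of 4, so no glPixelStorei call is needed for this layout.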
It is also possible to write exact pixel values to a target (the backbuffer, or a texture bound as a framebuffer attachment) by rendering a textured quad that completely covers it. However, this process is subject to blending and potentially pixel-center issues, whereas glDrawPixels is not.
I did something like this some time ago, when playing around with OpenGL.
Have a look at the code here, on GitHub.
You can find it in main.cpp.
Basically, my idea was to create an array of floats, set the values, copy to GPU with glBufferData and draw with glDrawElements.
As I remember it, doing it often was very bad in terms of performance, so it's probably not the best direction.
Please also note that this code is just my sandbox, and may not be the best possible example to be copied.
as we all know, openGL uses a pixel-data orientation that has 0/0 at left/bottom, whereas the rest of the world (including virtually all image formats) uses left/top.
this has been a source of endless worries (at least for me) for years, and i still have not been able to come up with a good solution.
in my application i want to support following image data as textures:
image data from various image sources (including still-images, video-files and live-video)
image data acquired via copying the framebuffer to main memory (glReadPixels)
image data acquired via grabbing the framebuffer to texture (glCopyTexImage)
(case #1 delivers images with top-down orientation (in about 98% of the cases; for the sake of simplicity let's assume that all "external images" have top-down orientation); #2 and #3 have bottom-up orientation)
i want to be able to apply all of these textures onto various arbitrarily complex objects (e.g. 3D-models read from disk, that have texture coordinate information stored).
thus i want a single representation of the texture_coords of an object. when rendering the object, i do not want to be bothered with the orientation of the image source.
(until now, i have always carried a topdown-flag alongside the texture id, that gets used when the texture coordinates are actually set. i want to get rid of this clumsy hack!)
basically i see three ways to solve the problem.
make sure all image data is in the "correct" (in openGL terms this is upside down) orientation, converting all the "incorrect" data, before passing it to openGL
provide different texture-coordinates depending on the image-orientation (0..1 for bottom-up images, 1..0 for top-down images)
flip the images on the gfx-card
in the olde times i've been doing #1, but it turned out to be too slow. we want to avoid the copy of the pixel-buffer at all cost.
so i've switched to #2 a couple of years ago, but it is way too complicated to maintain. i don't really understand why i should carry metadata of the original image around, once i transferred the image to the gfx-card and have a nice little abstract "texture"-object.
i'm in the process of finally converting my code to VBOs, and would like to avoid having to update my texcoord arrays, just because i'm using an image of the same size but with different orientation!
which leaves #3, which i never managed to get working (but i believe it must be quite simple).
intuitively i thought about using something like glPixelZoom().
this works great with glDrawPixels() (but who is using that in real life?), and afaik it should work with glReadPixels().
the latter is great as it allows me to at least force a reasonably fast homogeneous pixel orientation (top-down) for all images in main memory.
however, it seems that glPixelZoom() has no effect on data transferred via glTexImage2D, let alone glCopyTexImage2D(), so the textures generated from main-memory pixels will all be upside down (which i could live with, as this only means that i have to convert all incoming texcoords to top-down when loading them).
now the remaining problem is, that i haven't found a way yet to copy a framebuffer to a texture (using glCopyTex(Sub)Image) that can be used with those top-down texcoords (that is: how to flip the image when using glCopyTexImage())
is there a solution for this simple problem? something that is fast, easy to maintain and runs on openGL-1.1 through 4.x?
ah, and ideally it would work with both power-of-two and non-power-of-two (or rectangle) textures. (as far as this is possible...)
is there a solution for this simple problem? something that is fast, easy to maintain and runs on openGL-1.1 through 4.x?
No.
There is no method to change the orientation of pixel data at pixel upload time. There is no method to change the orientation of a texture in-situ. The only method for changing the orientation of a texture (besides downloading, flipping and re-uploading) is to use an upside-down framebuffer blit from a framebuffer containing a source texture to a framebuffer containing a destination texture. And glBlitFramebuffer is not available on any hardware that's so old it doesn't support GL 2.x.
So you're going to have to do what everyone else does: flip your textures before uploading them. Or better yet, flip the textures on disk, then load them without flipping them.
However, if you really, really want to not flip data, you could simply have all of your shaders take a uniform that tells them whether or not to invert the Y of their texture coordinate data. Inversion shouldn't be anything more than a multiply/add operation. This could be done in the vertex shader to minimize processing time.
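The multiply/add can be sketched on the CPU as a scale and an offset applied to the V coordinate. TexCoordTransform, makeTransform and applyV are hypothetical names, as are the uniforms u_scale and u_offset in the comment.

```cpp
#include <cassert>

// Scale and offset applied to the V texture coordinate.
struct TexCoordTransform {
    float scale;
    float offset;
};

// flipY == true maps v to 1 - v; otherwise v is passed through unchanged.
inline TexCoordTransform makeTransform(bool flipY) {
    return flipY ? TexCoordTransform{-1.0f, 1.0f}
                 : TexCoordTransform{1.0f, 0.0f};
}

// The same expression a vertex shader would evaluate, for example:
//   v_texcoord.y = a_texcoord.y * u_scale + u_offset;
inline float applyV(TexCoordTransform t, float v) {
    return v * t.scale + t.offset;
}
```

Only the two uniform values change per texture, so the texture coordinates stored with the mesh stay the same for both orientations.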
Or, if you're coding in the dark ages of fixed-function, you can apply a texture matrix that inverts the Y.
Why don't you change the way you map the texture to the polygon?
I use the mapping coordinates { 0, 1, 1, 1, 0, 0, 1, 0 } for a top-left origin
and the mapping coordinates { 0, 0, 1, 0, 0, 1, 1, 1 } for a bottom-left origin.
Then you don't need to flip your pictures manually.
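The two coordinate sets above differ only in the V component of each (u, v) pair, so one can be derived from the other. A small sketch, with QuadTexCoords and flipV as made-up names:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Four (u, v) pairs, one per corner of the quad.
using QuadTexCoords = std::array<float, 8>;

// Replace v with 1 - v in every pair; the u components are left alone.
inline QuadTexCoords flipV(QuadTexCoords coords) {
    for (std::size_t i = 1; i < coords.size(); i += 2) {
        coords[i] = 1.0f - coords[i];
    }
    return coords;
}
```

Applying flipV to the top-left-origin set yields exactly the bottom-left-origin set, and applying it twice returns the original.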
More details about mapping textures to a polygon can be found here:
http://iphonedevelopment.blogspot.de/2009/05/opengl-es-from-ground-up-part-6_25.html
In each frame (as in frames per second) I render, I make a smaller version of it with just the objects that the user can select (and any selection-obstructing objects). In that buffer I render each object in a different color.
When the user has mouseX and mouseY, I then look up in that buffer which color corresponds to that position, and find the corresponding object.
I can't work with FBOs, so I just render this buffer to a texture, rescale the texture orthogonally to the screen, and use glReadPixels to read a "hot area" around the mouse cursor. I know, not the most efficient, but performance is OK for now.
Now I have the problem that this buffer with "colored objects" has some accuracy problems. Of course I disable all lighting and frame shaders, but somehow I still get artifacts. Obviously I really need clean sheets of color without any variances.
Note that here I put all the color information in an unsigned byte in GL_RED (assuming for now that I have at most 255 selectable objects).
Are these caused by rescaling the texture? (I could replace this by looking up scaled coordinates in the small texture.) Or do I need to disable some other flag to really get the colors that I want?
Can this technique even be used reliably?
It looks like you're using GL_LINEAR for your GL_TEXTURE_MAG_FILTER. Use GL_NEAREST instead if you don't want interpolated colors.
I could replace this by looking up scaled coordinates in the small texture.
You should. Rescaling is more expensive than converting the coordinates for sure.
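Converting the coordinates amounts to one integer scale per axis. The sketch below assumes the small buffer has already been read back into main memory, that it uses the same origin as the mouse coordinates, and that ID 0 means "no object". None of this is stated in the question, and PickBuffer and pickObject are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct PickBuffer {
    int width;
    int height;
    std::vector<std::uint8_t> ids; // one byte per pixel; 0 = no object (assumption)
};

// Map a screen position into the smaller pick buffer and return the ID there.
// Integer division picks exactly one texel, so no filtering is involved.
inline std::uint8_t pickObject(const PickBuffer& buf, int mouseX, int mouseY,
                               int screenWidth, int screenHeight) {
    assert(mouseX >= 0 && mouseX < screenWidth);
    assert(mouseY >= 0 && mouseY < screenHeight);
    const int bx = mouseX * buf.width / screenWidth;
    const int by = mouseY * buf.height / screenHeight;
    return buf.ids[by * buf.width + bx];
}
```

Since the lookup never blends neighbouring texels, the returned value is always one of the IDs that was actually rendered.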
That said, scaling a uniform texture should not introduce artifacts if you keep an integer ratio (like upscale 2x), with no fancy filtering. It looks blurry on the polygon edges, so I'm assuming that's not what you use.
Also, the rescaling should introduce variations only at the polygon boundaries. Did you check that there are no variations in the unscaled texture? That would confirm whether it's the scaling that introduces your "artifacts".
What exactly do you mean by "variance"? Please explain in more detail.
Now a suggestion: if your rendering doesn't depend on stencil buffer operations, you could put the object ID into the stencil buffer in the render pass to the window itself, instead of taking the detour over a separate texture. On current hardware you usually get 8 bits of stencil. Of course the best solution, if you want to use an index buffer approach, is to use multiple render targets and render the object ID into an index buffer together with color and the other stuff in one pass. See http://www.opengl.org/registry/specs/ARB/draw_buffers.txt
Now that my OpenGL application is getting larger and more complex, I am noticing that it's also getting a little slow on very low-end systems such as netbooks. In Java, I am able to get around this by drawing to a BufferedImage, then drawing that to the screen and updating the cached render every once in a while. How would I go about doing this in OpenGL with C++?
I found a few guides, but they seem to work only on newer hardware or specific Nvidia cards. Since the cached rendering operations will only be updated every once in a while, I can sacrifice speed for compatibility.
glBegin(GL_QUADS);
setColor(DARK_BLUE);
glVertex2f(0, 0); //TL
glVertex2f(appWidth, 0); //TR
setColor(LIGHT_BLUE);
glVertex2f(appWidth, appHeight); //BR
glVertex2f(0, appHeight); //BL
glEnd();
This is something that I am especially concerned about. A gradient that takes up the entire screen is being re-drawn many times per second. How can I cache it to a texture then just draw that texture to increase performance?
Also, a trick I use in Java is to render it to a 1 X height texture then scale that to width x height to increase the performance and lower memory usage. Is there such a trick with openGL?
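The 1 x height idea carries over: the pixel data for a one-pixel-wide column can be built on the CPU and uploaded once. This is only a sketch; Color and makeGradientColumn are made-up names, and the values to pass for the two blues are whatever DARK_BLUE and LIGHT_BLUE stand for in the application. RGBA is used so that each one-pixel row is 4 bytes and matches the default unpack alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Color { std::uint8_t r, g, b; };

// Build a 1 x height RGBA column that blends linearly from top to bottom.
std::vector<std::uint8_t> makeGradientColumn(Color top, Color bottom, int height) {
    assert(height >= 2);
    std::vector<std::uint8_t> pixels(static_cast<std::size_t>(height) * 4);
    for (int y = 0; y < height; ++y) {
        const float t = static_cast<float>(y) / static_cast<float>(height - 1);
        // Linear interpolation, rounded to the nearest byte.
        auto lerp = [t](std::uint8_t a, std::uint8_t b) {
            return static_cast<std::uint8_t>(a + (b - a) * t + 0.5f);
        };
        const std::size_t i = static_cast<std::size_t>(y) * 4;
        pixels[i + 0] = lerp(top.r, bottom.r);
        pixels[i + 1] = lerp(top.g, bottom.g);
        pixels[i + 2] = lerp(top.b, bottom.b);
        pixels[i + 3] = 255; // opaque
    }
    return pixels;
}

// With a texture bound, the column would be uploaded once with:
//   glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, 1, height, 0,
//                GL_RGBA, GL_UNSIGNED_BYTE, pixels.data());
// and then drawn on a full-screen quad, which stretches it horizontally.
```

A non-power-of-two height may not be accepted by very old hardware, so rounding the height up to a power of two is a reasonable precaution on such systems.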
If you don't want to use Framebuffer Objects for compatibility reasons (but they are pretty widely available), you don't want to use the legacy (and non portable) Pbuffers either. That leaves you with the simple possibility of reading the contents of the framebuffer with glReadPixels and creating a new texture with that data using glTexImage2D.
Let me add that I don't really think you are going to gain much in your case. Drawing a texture onscreen requires at least one texel access per pixel, which is not really a huge saving if the alternative is just interpolating a color as you are doing now!
I sincerely doubt drawing from a texture is less work than drawing a gradient.
In drawing a gradient:
Color is interpolated at every pixel
In drawing a texture:
Texture coordinate is interpolated at every pixel
Color is still interpolated at every pixel
Texture lookup for every pixel
Multiply lookup color with current color
Not that either of these is slow, but drawing untextured polygons is pretty much as fast as it gets.
Hey there, thought I'd give you some insight in to this.
There are essentially two ways to do it.
Frame Buffer Objects (FBOs) for more modern hardware, and the back buffer as a fallback.
The article from one of the previous posters is a good one to follow, and there are plenty of tutorials on Google for FBOs.
In my 2d Engine (Phoenix), we decided we would go with just the back buffer method. Our class was fairly simple and you can view the header and source here:
http://code.google.com/p/phoenixgl/source/browse/branches/0.3/libPhoenixGL/PhRenderTexture.h
http://code.google.com/p/phoenixgl/source/browse/branches/0.3/libPhoenixGL/PhRenderTexture.cpp
Hope that helps!
Consider using a display list rather than a texture. Texture reads (especially for large ones) are a good deal slower than 8 or 9 function calls.
Before doing any optimization you should make sure you fully understand the bottlenecks. You'll probably be surprised at the result.
Look into FBOs - framebuffer objects. It's an extension that lets you render to arbitrary rendertargets, including textures. This extension should be available on most recent hardware. This is a fairly good primer on FBOs: OpenGL Frame Buffer Object 101