I'm porting an old Qt 4.8 application to Qt 5.2.1. Back then, I used QImage to render some raw data on the screen, in a QLabel.
I am grabbing images from a camera, so I want to display those images in real time. Until now, with QImage, I achieve over 20 FPS (the camera is able to grab at 30 FPS).
I'm wondering whether rendering this data with OpenGL (maybe in a new Qt Quick / Qt Widgets application) would be faster than my current method.
With the following assumptions in mind:
your implementation in OpenGL is using HW acceleration
your implementation is using optimal texture parameters to display the image (i.e. the driver is not doing any conversion)
you may achieve better performance using OpenGL. QImage still has to hold the data both in system memory and on the GPU, meaning at least one additional copy is needed when updating a QImage. With OpenGL, you can copy the data directly to GPU memory and do not need to keep another copy in system memory.
However, what is optimal on one GPU doesn't have to be optimal on another. So, if you are implementing something that needs to run on various hardware, I would advise going with QImage.
But as said, the only way to know for sure is to implement both and measure.
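For illustration, a minimal sketch of that "copy straight to GPU memory" path with a QGLWidget: the texture storage is allocated once and each new camera frame is streamed into it with glTexSubImage2D. The 640x480 RGB frame format and all names here are my assumptions, not something from the question.

```cpp
#include <QGLWidget>
#include <QOpenGLFunctions>

// Sketch: stream raw camera frames straight into an OpenGL texture.
// The frame size and RGB layout are assumptions about the camera pipeline.
class FrameView : public QGLWidget, protected QOpenGLFunctions
{
public:
    // Called from the camera callback with a pointer to the newest raw frame.
    void setFrame(const uchar *data) { latestFrame = data; update(); }

protected:
    void initializeGL() override
    {
        initializeOpenGLFunctions();
        glGenTextures(1, &tex);
        glBindTexture(GL_TEXTURE_2D, tex);
        // Allocate GPU storage once; each frame is streamed into it later.
        glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, frameW, frameH, 0,
                     GL_RGB, GL_UNSIGNED_BYTE, nullptr);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
        glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
    }

    void paintGL() override
    {
        if (latestFrame) {
            glBindTexture(GL_TEXTURE_2D, tex);
            // Copy the newest frame directly into the existing GPU texture.
            glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, frameW, frameH,
                            GL_RGB, GL_UNSIGNED_BYTE, latestFrame);
        }
        // ... draw a textured quad filling the widget here ...
    }

private:
    GLuint tex = 0;
    const uchar *latestFrame = nullptr;
    static const int frameW = 640, frameH = 480; // assumed camera resolution
};
```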
Related
I have a stream of YUV data (from a video file) that I want to stream to a screen in real time. (Basically, I want to write a program that plays the video in real time.)
As such, I am looking for a portable way to send YUV data to the screen. I would ideally like to use something portable so I don't have to reimplement it for every major platform.
I have found a few options, but all of them seem to have significant issues. They are:
Use OpenGL directly, converting the YUV data to RGB on the CPU (and using the single-quad-for-the-whole-screen trick).
This obviously won't work, because converting YUV to RGB on the CPU is going to be too slow for displaying images in real time.
Use OpenGL, but use a shader to convert the YUV stream to RGB.
This option is a bit better, although the problem here is that (AFAICT) it will involve making two streams and splicing them together. It might work, but may have issues with larger resolutions.
Instead use SDL, which has the option of creating a YUV context directly.
The problem with this is that I am already using a cross-platform widget library for other aspects of my program (such as playback controls). As far as I can tell, SDL only opens up its own (possibly borderless) window. I would ideally like my controls and drawing context to be in the same window, which I can do with OpenGL, but not with SDL.
Use SDL, and also use something like Qt for the on-screen widgets, with a message-passing protocol to communicate between the two libraries. Have the (borderless) SDL window constantly move itself on top of the Qt window.
While this approach is clever, it seems like the two windows could easily get out of sync, making the user experience sub-optimal.
Forget a cross-platform library; do things OS-specifically, making use of hardware acceleration where present.
This is a fine solution, although it's not cross-platform.
As such, is there any good way to draw YUV data to a screen that ideally is:
Portable (at least to the major platforms).
Fast enough to be real time.
Allows other widgets in the same window.
Use option number 2. There's no problem in doing the YUV to RGB conversion in a shader, and there's no other comparably portable way to do it.
Think of it like this: no matter how big or small your video is, the fragment shader (where the conversion is done) executes per pixel at display time, so whether you display a small video full screen or a big one, the computation for the shaders is the same, because the same number of pixels is being displayed.
Any video card in normal conditions will be able to run this kind of shader without any problem.
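For reference, the conversion shader can be as small as the following sketch. It assumes planar YUV 4:2:0 uploaded as three single-channel textures and uses full-range BT.601 coefficients; the sampler and varying names are made up.

```cpp
// GLSL fragment shader source (sketch): sample the Y, U and V planes and
// convert to RGB with BT.601, full-range coefficients.
const char *yuvToRgbFragmentShader = R"(
    varying vec2 texCoord;      // passed from the vertex shader
    uniform sampler2D texY;     // full-resolution luma plane
    uniform sampler2D texU;     // half-resolution chroma planes
    uniform sampler2D texV;

    void main()
    {
        float y = texture2D(texY, texCoord).r;
        float u = texture2D(texU, texCoord).r - 0.5;
        float v = texture2D(texV, texCoord).r - 0.5;

        gl_FragColor = vec4(y + 1.402 * v,
                            y - 0.344 * u - 0.714 * v,
                            y + 1.772 * u,
                            1.0);
    }
)";
```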
I'm writing some code that uses SDL2 to display an image with moving markers layered on it, and I think I'd like to use the new (?) 2D hardware accelerated rendering. As I understand it, I have to load an image and convert it to a texture -- but what's the difference? Searching for 'image texture 2d sdl' only gets me tutorials on how to load textures and I'm looking for more of the background rather than the how-to.
So, some questions:
What's a texture versus an image? Aren't they the same thing?
Am I correct in assuming that I need to load the static background image as a texture if I want hardware accelerated rendering? In fact, it sounds like all the bits need to be textures for this to work.
Speaking of OpenGL, are SDL textures actually OpenGL textures?
I'm writing the main app for a single-purpose machine with limited resources (dual-core ARM CPU, dual-core Mali 400 GPU, 4GB RAM: Olimex A20 LIME2). All I need to do is render a 480x800 (yes, portrait layout) image and put markers on it. I expect the markers to have one opaque layer and two transparency layers, to be updated at around 15 fps, and I expect about 125 of them, tops. Is it worth my while to use 2D hardware acceleration or should I just do it in software?
To understand the basics of textures, I advise you to have a look at a simpler library's documentation. Here, the term pixmap is used in the same way as SDL's texture. Essentially, those are already converted and uploaded into your GPU's memory, which makes operations quite a bit faster, but also more complex to deal with.
OpenGL textures are another beast, but we could basically say that they are the same, that is, images in video memory. When binding a texture in OpenGL, you need to upload it to the GPU memory, which is somewhat similar to this texture transformation.
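For concreteness, in SDL2 that conversion/upload step typically looks like this sketch (error handling omitted; the file name and function names are made up):

```cpp
#include <SDL.h>

// Sketch: load an image into CPU memory once, convert it into a GPU texture,
// then draw that texture every frame.
SDL_Texture *loadBackground(SDL_Renderer *renderer)
{
    SDL_Surface *surface = SDL_LoadBMP("background.bmp");   // pixels in system RAM
    SDL_Texture *texture = SDL_CreateTextureFromSurface(renderer, surface);
    SDL_FreeSurface(surface);                                // CPU copy no longer needed
    return texture;                                          // lives in GPU memory
}

void drawFrame(SDL_Renderer *renderer, SDL_Texture *background)
{
    SDL_RenderClear(renderer);
    SDL_RenderCopy(renderer, background, NULL, NULL);        // GPU-side copy to the window
    SDL_RenderPresent(renderer);
}
```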
At 125 objects, I think using the 2D acceleration becomes worth the hassle, especially if you have to move them around. If it were just a static image, I guess you could go the regular-image route.
As a general rule, I encourage you to use 2D acceleration (or just acceleration, for that matter) whenever possible, if only for the battery improvements. That said, if the images are static, the outcome will be exactly the same, maybe just via a slightly different code path. As such, I suppose you could load the static background image just as a regular image without any downsides (note that I am not an SDL expert, so this mixed approach might not work here, but it is worth trying since it works on most 2D toolkits).
I hope I answered all of your questions :)
I'd like to know if it is possible to:
Take an image stored in system memory, for example cv::Mat.
Transfer it to the graphics card.
Do some processing on the graphics card using OpenGL / DirectX / a 3D engine, for example rendering a mesh.
(I'm aware OpenCV has this kind of functionality for some of its algorithms, but that is not what I'm looking for.)
Then transfer the data back to a cv::Mat.
I'd like to know a good platform-independent way of doing this.
With OpenGL there are 2 essential functions:
Take an image stored in system memory, for example cv::Mat.
Transfer it to the graphics card.
glTexImage…
Then transfer the data back to a cv::Mat.
glReadPixels
All the rest, like FBOs, PBOs and such, is just there to make things more efficient or to allow reading back from certain resources that don't offer a direct way to access them.
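As a sketch of those two calls with a cv::Mat, assuming a current desktop OpenGL context (with headers providing at least GL 1.2 for GL_BGR) and a continuous 8-bit, 3-channel BGR matrix:

```cpp
#include <opencv2/core.hpp>
#include <GL/gl.h>

// Sketch: round-trip image data through the GPU with plain OpenGL calls.
cv::Mat uploadRenderReadBack(const cv::Mat &mat, int width, int height)
{
    // Upload: system memory -> GPU texture.
    GLuint tex;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    glPixelStorei(GL_UNPACK_ALIGNMENT, 1);             // rows are tightly packed here
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, mat.cols, mat.rows, 0,
                 GL_BGR, GL_UNSIGNED_BYTE, mat.data);  // OpenCV stores pixels as BGR

    // ... render the mesh (to the default framebuffer or an FBO) here ...

    // Read back: framebuffer -> system memory.
    cv::Mat result(height, width, CV_8UC3);
    glPixelStorei(GL_PACK_ALIGNMENT, 1);
    glReadPixels(0, 0, width, height, GL_BGR, GL_UNSIGNED_BYTE, result.data);
    // OpenGL's origin is bottom-left, so the rows come back vertically
    // flipped compared to cv::Mat's top-left origin (cv::flip fixes that).

    glDeleteTextures(1, &tex);
    return result;
}
```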
I'm currently in the process of designing and developing GUIs for some audio applications made in C++ (using the Juce framework).
So far I've been playing with using bitmap graphics to create custom sliders and dials, using 'film strip' style images to animate the components (meaning that when the user interacts with a slider, it triggers a method that changes the offset of a film-strip image to change the component's appearance). Depending on the size of the original image and the number of 'frames', the CPU usage level changes quite dramatically.
Firstly, what would be the most efficient bitmap file format to use in terms of CPU consumption? At the moment I'm using PNG images.
Secondly, would it be more efficient to use vector graphics for these kinds of graphical components? I understand the main differences between bitmap and vector graphics, but I haven't found any information about their relative CPU usage for GUI interaction.
Or would CPU usage be down to the particular methods/functions/libraries/frameworks being used?
Thanks!
Or would CPU consumption be down to the particular methods/functions/libraries/frameworks being used?
Any of these things could influence it.
Pixel-based images might take a while to read off disk, the bigger they are. Compressed formats might take more time to decompress. Vector graphics might take more time to render when they are loaded.
That being said, I would definitely not expect your choice of image type to have any impact on performance. Since you didn't provide a code example, it is hard to speculate beyond that.
In general, you would expect the run-time cost of the images to be incurred when they are loaded, i.e. whenever you create an image object. If you create images all over the place, then that may be expensive. It is possible that your film strip is recreating the images instead of loading them once and caching them.
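As a sketch of that "load once and cache" idea in Juce; the class, the file handling, and the assumption of square frames stacked vertically in the strip are mine, not something from the question:

```cpp
#include <JuceHeader.h>

// Sketch: decode the film strip once (ImageCache keeps it around) and, on each
// repaint, draw only the sub-rectangle that corresponds to the current frame.
class FilmStripKnob : public juce::Component
{
public:
    explicit FilmStripKnob(const juce::File &stripFile)
        : strip(juce::ImageCache::getFromFile(stripFile)) {}

    void setFrame(int newFrame) { frame = newFrame; repaint(); }

    void paint(juce::Graphics &g) override
    {
        const int frameH = strip.getWidth();   // assumes square frames stacked vertically
        g.drawImage(strip,
                    0, 0, getWidth(), getHeight(),                // destination rectangle
                    0, frame * frameH, strip.getWidth(), frameH); // source frame in the strip
    }

private:
    juce::Image strip;  // loaded and decoded only once
    int frame = 0;
};
```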
Before choosing bitmap vs. vector graphics, investigate if your graphics processor supports vector or bitmap graphics. Some things take a long time to draw as vectors.
Have you tried double-bufferring?
This is where you write to a buffer in memory while the display (graphics processor) is loading another.
Load your bitmaps from the resource once. Store them as in-memory bitmaps to avoid the recurring cost of decoding them from a file format.
Does your graphic processor support "blitting"?
Blitting is where the graphics processor can copy a rectangular area of memory (a bitmap) and display it, optionally applying operations (such as XOR with the existing bits) before displaying.
Summary:
To improve your rendering speed, convert images from the file into bitmap form only once. Store the result somewhere and refer to the converted bitmaps as needed. Next, investigate and implement double buffering. Lastly, investigate and use bit-blitting (blitting).
Other optimization rules apply here too, such as reviewing the design, removing requirements, loop unrolling, passing images by pointer instead of copying them, and reducing "if" statements by using Boolean logic and Karnaugh maps.
In general, calculations for rendering vector graphics are going to take longer than blitting a rectangular region of a bitmap to the screen. But for basic UI stuff, neither should be particularly intensive.
You probably should do some profiling. Perhaps you're redrawing much more frequently than necessary. Or perhaps the PNG is being decoded each time you try to draw from it. (I'm not familiar with Juce.)
For a straight Windows app, I'd probably render vector graphics into a device-dependent bitmap once on startup and then just blit from the bitmap to the screen. Using vector gives you DPI independence, and blitting from a device-dependent bitmap is about the fastest way to paint a block of pixels. I believe the color matching is done when you render to the device-dependent bitmap, so you don't even have the ICM overhead on the screen drawing.
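A sketch of that Windows approach, assuming a plain Win32/GDI window: render into a memory DC once at startup, then just BitBlt the cached device-dependent bitmap in WM_PAINT (error handling omitted, names made up):

```cpp
#include <windows.h>

// Sketch: pre-render the UI into a device-dependent bitmap once, then blit it
// to the window on every WM_PAINT.
static HDC     g_memDC;       // memory DC owning the cached rendering
static HBITMAP g_memBitmap;

void createBackBuffer(HWND hwnd, int width, int height)
{
    HDC screen = GetDC(hwnd);
    g_memDC = CreateCompatibleDC(screen);
    g_memBitmap = CreateCompatibleBitmap(screen, width, height);
    SelectObject(g_memDC, g_memBitmap);
    ReleaseDC(hwnd, screen);
    // ... render the vector graphics into g_memDC once, at startup ...
}

void onPaint(HWND hwnd, int width, int height)    // call this from WM_PAINT
{
    PAINTSTRUCT ps;
    HDC hdc = BeginPaint(hwnd, &ps);
    // Cheap copy of the cached pixels to the screen; no re-rasterization.
    BitBlt(hdc, 0, 0, width, height, g_memDC, 0, 0, SRCCOPY);
    EndPaint(hwnd, &ps);
}
```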
Vector graphics were ditched long ago; bitmap graphics are more performant. The thing is that you can send a bitmap to the GPU once and then render it from then on with a simple copy.
Secondly, the GPU uses its own texture compression (DXT5 in DirectX, I believe), but once the GPU has the texture, it doesn't care what you loaded it from.
However, a modern CPU even with a crappy integrated GPU should have absolutely no problem with simple GUI rendering. If you're struggling, then it's time to look again at the technique you're using. Perhaps your framework is slow or your use of it is suboptimal.
Is it possible to allocate some memory on the GPU without CUDA?
I'm adding some more details...
I need to get the video frames decoded by VLC and run some compositing functions on the video; I'm doing so using the new SDL rendering capabilities.
All works fine until I have to send the decoded data to the SDL texture... that part of the code is handled by a standard malloc, which is slow for video operations.
Right now I'm not even sure that using the GPU for video will actually help me.
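For reference, the texture update I have in mind looks roughly like this sketch, using SDL's streaming-texture path so the decoded frame can be written straight into memory owned by the texture instead of a separately malloc'd buffer (the ARGB format and the helper names are just placeholders):

```cpp
#include <SDL.h>

// Sketch: a streaming texture lets the decoder write directly into
// texture-owned memory, avoiding an extra malloc'd frame buffer.
SDL_Texture *createFrameTexture(SDL_Renderer *renderer, int w, int h)
{
    return SDL_CreateTexture(renderer, SDL_PIXELFORMAT_ARGB8888,
                             SDL_TEXTUREACCESS_STREAMING, w, h);
}

void presentFrame(SDL_Renderer *renderer, SDL_Texture *tex)
{
    void *pixels = nullptr;
    int pitch = 0;
    SDL_LockTexture(tex, NULL, &pixels, &pitch);
    // Let the decoder (e.g. via libvlc_video_set_callbacks) write the frame
    // directly into 'pixels', honouring 'pitch' bytes per row.
    SDL_UnlockTexture(tex);

    SDL_RenderClear(renderer);
    SDL_RenderCopy(renderer, tex, NULL, NULL);
    SDL_RenderPresent(renderer);
}
```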
Let's be clear: are you trying to accomplish real-time video processing? Since your latest update changed the problem considerably, I'm adding another answer.
The "slowness" you are experiencing could be due to several reasons. In order to get the "real-time" effect (in the perceptual sense), you must be able to process the frame and display it within roughly 33 ms (for a 30 fps video). This means you must decode the frame, run the compositing functions (as you call them) on it, and display it on the screen within this time frame.
If the compositing functions are too CPU-intensive, then you might consider writing a GPU program to speed up this task. But the first thing you should do is determine exactly where the bottleneck of your application is. You could temporarily strip your application down to just decoding the frames and displaying them on the screen (without executing the compositing functions), just to see how it goes. If it's still slow, then the decoding process could be using too many CPU/RAM resources (maybe a bug on your side?).
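A minimal way to find that bottleneck is to time each stage per frame, along these lines (decodeFrame, composite and display are placeholders for your own functions):

```cpp
#include <chrono>
#include <cstdio>

// Placeholders for the real pipeline stages.
void decodeFrame();
void composite();
void display();

// Sketch: time each stage of the per-frame pipeline to see where the 33 ms go.
void processOneFrame()
{
    using clock = std::chrono::steady_clock;
    const auto t0 = clock::now();
    decodeFrame();
    const auto t1 = clock::now();
    composite();
    const auto t2 = clock::now();
    display();
    const auto t3 = clock::now();

    auto ms = [](clock::time_point a, clock::time_point b) -> long long {
        return std::chrono::duration_cast<std::chrono::milliseconds>(b - a).count();
    };
    std::printf("decode %lld ms, composite %lld ms, display %lld ms\n",
                ms(t0, t1), ms(t1, t2), ms(t2, t3));
}
```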
I once used FFmpeg and SDL for a similar project and was very happy with the result. This tutorial shows how to build a basic video player using both libraries. Basically, it opens a video file, decodes the frames, and renders them to a surface for display.
You can do this via Direct3D 11 Compute Shaders or OpenCL. These are similar in spirit to CUDA.
Yes, it is. You can allocate memory in the GPU through OpenGL textures.
Only indirectly through a graphics framework.
You can use OpenGL which is supported by virtually every computer.
You could use a vertex buffer to store your data. Vertex buffers are usually used to store vertices for rendering, but you can just as easily use them to store an array of any kind. Unlike textures, their capacity is limited only by the amount of graphics memory available.
http://www.songho.ca/opengl/gl_vbo.html has a good tutorial on how to read and write data in vertex buffers; you can ignore everything about drawing the vertex buffer.
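A sketch of that generic-storage use of a buffer object under desktop OpenGL (GLEW is just one possible way to get the buffer-object entry points, and the float payload is a placeholder):

```cpp
#include <GL/glew.h>   // any loader providing the buffer-object functions works
#include <vector>

// Sketch: use a buffer object purely as GPU-side storage for arbitrary data.
void roundTripThroughGpu(const std::vector<float> &input)
{
    const GLsizeiptr bytes = static_cast<GLsizeiptr>(input.size() * sizeof(float));

    GLuint buf;
    glGenBuffers(1, &buf);
    glBindBuffer(GL_ARRAY_BUFFER, buf);

    // Allocate GPU memory and copy the data into it.
    glBufferData(GL_ARRAY_BUFFER, bytes, input.data(), GL_STATIC_DRAW);

    // ... use the buffer on the GPU here ...

    // Copy the contents back into system memory.
    std::vector<float> readback(input.size());
    glGetBufferSubData(GL_ARRAY_BUFFER, 0, bytes, readback.data());

    glDeleteBuffers(1, &buf);
}
```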