Performance Considerations And Suggestions article says:
As an application developer, you must strive to allow the rendering
engine to achieve a consistent 60 frames-per-second refresh rate. 60
FPS means that there is approximately 16 milliseconds between each
frame in which processing can be done, which includes the processing
required to upload the draw primitives to the graphics hardware.
Is there an event or signal or any form of callback to make the code be called with that refresh?
The goal is to eliminate the need for handling the signal from the rendering thread in the UI thread slot. If the new data arrived then it will be drawn or marked for the next refresh to be drawn (with update() call).
QQuickWindow has a bunch of signals for the purpose of synchronization - beforeRendering(), afterRendering(), beforeSynchronizing(), afterSynchronizing(), frameSwapped(). Take your pick.
Related
I am experimenting with Qt for a new layout for an instrument simulation program at work. Our current sim is running everything in a single window (we've used both glut (old) and fltk), it uses glViewport(...) and glScissor(...) to split instrument readouts into their own views, and then it uses some form of "ortho2D" calls to create their own virtual pixel space. The simulator currently updates the instruments and then draws each in their own viewport one by one, all in the same thread.
We want to find a better approach, and we settled on Qt. I am working under a few big constraints:
Each of the instrument panels still need to be in their OpenGL viewport. There are a lot of buttons and a lot of instruments. My tentative solution is to use a QOpenGLWidget for each. I have made progress on this.
The sim is not just a pretty readout, but also simulates many of the instruments as feedback for the instrument designers, so it sometimes has a hefty CPU load. It isn't a full hardware emulator, but it does simulate the logic. I don't think that it's feasible to tell the instruments to update themselves at the beginning of its associated widget's paintEvent(...) method, so I want simulation updates to run in a separate thread.
Our customers may have old computers and thus more recent versions of OpenGL have been ruled out. We are still using glBegin() and glEnd() and everything in between, and the instruments draw a crap ton of variable symbols, so drawing is takes a lot of time and I want to split drawing off into it's own thread. I don't yet know if OpenGL 3 is on the table, which will be necessary (I think) for rendering to off-screen buffers.
Problem: QOpenGLWidget does not have on overrideable "update" method, and it only draws during the widgets' paintEvent(...) and paintGL(...) calls.
Tentative Solution: Split the simulator into three threads:
GUI: Runs user input, paintEvent(...), and paintGL(...).
Simulator: Runs all instrument logic and updates values for symbology.
Drawing: Renders latest symbology to an offscreen buffer (will use a frame buffer object (FBO)).
In this design, cross-thread talking is cyclic and one-way, with the GUI thread providing input, the simulator thread taking that input into account on its next loop, the drawing thread reading the latest symbology and rendering it to the FBO and setting a "next frame available" flag to true (or maybe emitting a signal), and then the paintGL(...) method will take that FBO and spit it out to the widget, thus keeping event processing down and GUI responsiveness up. Continue this cycle.
Bottom line question: I've read here that GUI operations cannot be done in a separate thread, so is my approach even feasible?
If feasible, any other caution or suggestions would be appreciated.
Each OpenGL widget has its own OpenGL context, and these contexts are QObjects and thus can be moved to other threads. As with any otherwise non-threadsafe object, you should only access them from their thread().
Additionally - and this is also portable to QML - you could use worker functors to compute display lists that are then submitted to the render thread to be converted into draw calls. The render thread doesn't do any logic and doesn't compute anything: it takes data (vertex arrays, etc.) and submits it for drawing. The worker functors would be submitted for execution on a thread pool using QtConcurrent::run.
You can thus have a main thread, a render thread (perhaps one per widget, but not necessarily), and functors that run your simulation steps.
In any case, convoluting logic and rendering is a very bad idea. Whether you're doing drawing using QPainter on a raster widget, or using QPainter on an QOpenGLWidget, or using direct OpenGL calls, the thread that does the drawing should not have to compute what's to be drawn.
If you don't want to mess with OpenGL calls, and you can represent most of your work as array-based QPainter calls (e.g. drawRects, drawPolygons), these translate almost directly into OpenGL draw calls and the OpenGL backend will render them just as quickly as if you hand-coded the draw calls. QPainter does all this for you if you use it on a QOpenGLWidget!
In a display application we do use a large Window painting area. The display application gets so many updates for painting realtime data that all CPU time of the PC is used for painting. We do use InvalidateRect() and then paint the items in WM_PAINT message.
So we decided to use a dirty flag for each item to paint for reducing painting it.
How to know when the application can paint the items so that not all CPU time is consumed. Is there anything telling us that we can do our paint stuff now ?
If the data is updating so fast that painting each update is too much, you can use a timer. Every (say) quarter second, the timer fires, and if any items are dirty, the timer handler calls InvalidateRect(). Updating the data no longer invalidates; only the timer handler does that.
Edit: You could query Windows for the CPU load and if it's low, do the Invalidate immediately; see How to get system cpu/ram usage in c++ on Windows
One method I've used is to make sure that only one paint event is on the event queue at a time. You can use a boolean flag to mark when you begin updating and then reset the flag at the end of the WM_PAINT message (the end of the update process). Of course, if you try to update the window again and the flag is already set, then don't do anything. This will keep extra events from being piled into the queue, which can bog down your system. It looks like you may have thought of this, but do this with the entire update in addition to the individual items. Keep in mind that I'm only thinking of the updating of the windows themselves and not any underlying data.
One other thing I had to do was to "pump" (or process) the message queue during my (application) updates because updating a window (in my case) took several messages, ending with the WM_PAINT.
Another thing to watch out for is to not use idle messages for updating your interface. This is a quick and dirty way of having the update happen automatically, but ends up being a really bad idea because the idling only happens when there are no other events on the message queue. Of course, any time you move the mouse or press keys those events are placed onto the event queue and causes a "stall" of the update process. The idle events can end up coming so fast that it causes your application to use most of the CPU processing power just for displaying data that hasn't even changed. It's better to have your GUI only update when the underlying data it displays actually updates.
I had data coming in at 60Hz and updating lots of lists with columns of data as well as 3D stuff going on. I finally had to prioritize the updates and just not update the lists for each cycle, but DO update the 3D data each cycle. Updaing the lists at about 1-5 Hz was good enough for me and when combined with the techniques above resulted in a much improved and responsive system.
Has anyone figured out how to display smooth video (i.e. a series of bitmaps) in a FireMonkey application, HD or 3D? In VCL you could write to a canvas from a thread and this would work perfectly, but this does not work in FMX. To make things worse, the apparently only reliable way is to use TImage, and that seems to be updated from the main thread (open a menu and video freezes temporarily). All EMB examples I could find all either write to TImage from the main thread, or use Synchronize(). These limitations make FMX unusable for decent video display so I am looking for a hack or possibly bypass of FMX. I use XE5/C++ but welcome any suggestions. Target OS is both Windows 7+ & OS X. Thanks!
How about putting a TPaintbox on your form to hold the video. In the OnPaint method you simply draw the next frame to the paintbox canvas. Now put a TTimer on the form, set the interval to the frame rate required. In the OnTimer event for the timer just write paintbox1.repaint
This should give you regular frames no matter what else the program is doing.
For extra safety, you could increment a frame number in the OnTimer event. Now in the paintbox paint method you know which frame to paint. This means you won't jump frames if something else calls the paint method as well as the timer - you will just end up repainting the same frame for the extra call to OnPaint.
I use this for marching ants selections although I go one step further and use an overlaid canvas so I can draw independently to the selection and the underlying paintbox canvas to remove the need to repaint the main canvas when the selection changes. That requires calls to API but I guess you won't need it unless you are doing videos with a transparent colour.
Further research, including some talks with the Itinerant developer, has unfortunately made it clear that, due to concurrency restrictions, FM has been designed so that all GPU access goes through the main thread and therefore painting will always be limited. As a result I have decided FM is not suitable for my needs and I am re-evaluating my options.
I had perfomances issues with CUDA in my program. The time taken for the same task (aligning clouds of 3D points) wasn't stable and could be 30 times higher sometimes.
I use Qt for the main interface, which initialize a thread with my worker class. The purpose of this class is to launch cuda computing on my data and to send Qt signals which will be captured by the GUI for updating the display of an OpenGl widget.
I had resolved my performances issues by removing a QBasicTimer in my OpenGL widget, it was used like this :
void SWGLCloudWidget::initializeGL()
{
// ...
m_oTimer->start(5, this);
}
It has no use at all, but i forget to delete it after some refactoring.
In Qt documentation it says :
The QBasicTimer class provides timer events for objects.
This is a fast, lightweight, and low-level class used by Qt internally. We recommend using the higher-level QTimer class rather than this class if you want to use timers in your >applications. Note that this timer is a repeating timer that will send subsequent timer events >unless the stop() function is called.
I was wondering how this low-level call could cause such a mess with CUDA, just for my curiosity.
The way I interpret:
As described in the documentation update()\updateGL()
does not cause an immediate repaint; instead it schedules a paint
event for processing when Qt returns to the main event loop. This
permits Qt to optimize for more speed and less flicker than a call to
repaint() does.
If for some reason (other threads, monitor refresh rate limitations,time spent computing new images, other signals and slots,etc...) the screen can be refreshed only every X milliseconds and you ask for a refresh rate of Y > X, then Qt will keep queuing paint events to the detriment of other events. Thus the system will be even less responsive as you observed.
This is an issue of congestion as it happens in network systems, where the throughput (average successful rate) is lower and lower than the requested rate.
Here is the situation:
You have one long-running calculation running in a background thread.
This calculation is sending out a signal to, for example, refresh a GUI element, every 100 msec.
Let's say it sends out 100 such signals.
The widget being redrawn takes more than 100 msec to redraw; let's say 1 second.
What happens in the event loop? Do the signal calls "pile up" until they are all executed (i.e. 100 seconds)? Is there any mechanism for "dropping" events?
User events are never discarded. If you queue emitted signal events faster than you can process them, your event queue will grow until you run out of memory and your program will crash. It's worth noting, though, that QTimer will skip timeout events if the system is under heavy load. To some extent, that may help regulate your throughput.
You could also consider sending feedback from one thread to the other (an acknowledgement, perhaps), and manually adjust your timing in the producer thread based on how far behind the consumer thread is. Or, you could use a metaphorical sledgehammer and switch to a blocking queued connection.
In your example, you could measure the drawing time in the widget. If the drawing takes for example 240 ms, then you could process the next 2 signals quickly without drawing anything at all. That way the signals wouldn't pile up.
Edit:
Actually there is a slight problem in my solution. The last signal should always cause a redraw, otherwise the widget would show wrong data when the calculation is finished.
When a signal is skipped, a single shot timer could be started for example with a 150 ms interval. When a redraw is done because of a signal, this timer would be stopped. So after the last redraw signal, this single shot timer would cause the drawing of the final state. I guess this would work, but it would be quite complicated.
Starting a simple timer to do the redrawing when the calculation starts would quite probably be a better approach. If the drawing of the widget takes a lot of time, the timer interval could be dynamically adjusted according to the draw time.