QBasicTimer interference with cuda - c++

I had performance issues with CUDA in my program. The time taken for the same task (aligning clouds of 3D points) was not stable and was sometimes 30 times higher.
I use Qt for the main interface, which starts a thread running my worker class. The purpose of this class is to launch the CUDA computation on my data and to emit Qt signals, which the GUI captures to update the display of an OpenGL widget.
I resolved my performance issues by removing a QBasicTimer from my OpenGL widget. It was used like this:
void SWGLCloudWidget::initializeGL()
{
    // ...
    // Starts a repeating timer that delivers a timer event to this widget every 5 ms.
    m_oTimer->start(5, this);
}
It served no purpose at all; I had forgotten to delete it after some refactoring.
The Qt documentation says:
The QBasicTimer class provides timer events for objects.
This is a fast, lightweight, and low-level class used by Qt internally. We recommend using the higher-level QTimer class rather than this class if you want to use timers in your applications. Note that this timer is a repeating timer that will send subsequent timer events unless the stop() function is called.
Out of curiosity, I was wondering how this low-level call could cause such a mess with CUDA.

The way I interpret it:
As described in the documentation, update()/updateGL()
does not cause an immediate repaint; instead it schedules a paint
event for processing when Qt returns to the main event loop. This
permits Qt to optimize for more speed and less flicker than a call to
repaint() does.
If for some reason (other threads, monitor refresh rate limitations, time spent computing new images, other signals and slots, etc.) the screen can be refreshed only every X milliseconds and you request a refresh every Y < X milliseconds, then Qt will keep queuing paint events to the detriment of other events. The system therefore becomes even less responsive, as you observed.
This is a congestion problem of the kind seen in network systems, where the throughput (the average rate of successful delivery) falls further and further below the requested rate.
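As a rough illustration of the congestion argument, here is a minimal sketch in plain C++ (not Qt code). It assumes a deliberately simplified model in which every timer tick adds one request and a single consumer needs a fixed time per request. The name backlogAfter and its parameters are hypothetical, and Qt's real event loop is more involved than this.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Simplified model (an assumption, not Qt's actual event loop):
// one request arrives every arrivalMs, and a single consumer needs
// serviceMs to handle each request. Returns how many requests are
// still waiting after durationMs.
std::int64_t backlogAfter(std::int64_t arrivalMs, std::int64_t serviceMs,
                          std::int64_t durationMs) {
    assert(arrivalMs > 0 && serviceMs > 0);
    const std::int64_t arrived = durationMs / arrivalMs;
    // The consumer cannot finish more requests than have arrived.
    const std::int64_t served = std::min(arrived, durationMs / serviceMs);
    return arrived - served;
}
```

Under this model, a 5 ms timer feeding a consumer that needs 16 ms per request leaves backlogAfter(5, 16, 1000) == 138 requests pending after one second, while swapping the two numbers keeps the backlog at zero.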

Related

QML Rendering Engine: frame refresh event

Performance Considerations And Suggestions article says:
As an application developer, you must strive to allow the rendering
engine to achieve a consistent 60 frames-per-second refresh rate. 60
FPS means that there is approximately 16 milliseconds between each
frame in which processing can be done, which includes the processing
required to upload the draw primitives to the graphics hardware.
Is there an event, signal, or any other form of callback that lets my code be called on each refresh?
The goal is to eliminate the need to handle the signal from the rendering thread in a UI thread slot. If new data has arrived, it will be drawn, or marked to be drawn on the next refresh (with an update() call).
QQuickWindow has a bunch of signals for the purpose of synchronization - beforeRendering(), afterRendering(), beforeSynchronizing(), afterSynchronizing(), frameSwapped(). Take your pick.

C++ Win32 realtime painting performance - how to know when the application can paint without using all CPU time

In a display application we use a large window painting area. The application receives so many updates for painting real-time data that all of the PC's CPU time is spent painting. We call InvalidateRect() and then paint the items in the WM_PAINT handler.
So we decided to use a dirty flag for each item to reduce the amount of painting.
How can we know when the application can paint the items so that not all CPU time is consumed? Is there anything that tells us we can do our painting now?
If the data is updating so fast that painting each update is too much, you can use a timer. Every (say) quarter second, the timer fires, and if any items are dirty, the timer handler calls InvalidateRect(). Updating the data no longer invalidates; only the timer handler does that.
Edit: You could query Windows for the CPU load and if it's low, do the Invalidate immediately; see How to get system cpu/ram usage in c++ on Windows
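A minimal sketch of the timer-plus-dirty-flag idea from this answer, in plain C++ so it stays independent of the Win32 API. DirtyTracker and its method names are hypothetical; in a real program the timer handler would call InvalidateRect() whenever takeDirty() returns a non-empty list.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: data updates only mark items dirty; a periodic
// timer handler decides whether a repaint request is needed.
class DirtyTracker {
public:
    explicit DirtyTracker(std::size_t itemCount) : dirty_(itemCount, false) {}

    // Called from the data-update path. Cheap: no repaint is requested here.
    void markDirty(std::size_t index) {
        assert(index < dirty_.size());
        dirty_[index] = true;
    }

    // Called from the timer handler (for example every 250 ms).
    // Returns the indices of the dirty items and clears their flags; the
    // caller would invalidate the window only if the result is non-empty.
    std::vector<std::size_t> takeDirty() {
        std::vector<std::size_t> result;
        for (std::size_t i = 0; i < dirty_.size(); ++i) {
            if (dirty_[i]) {
                result.push_back(i);
                dirty_[i] = false;
            }
        }
        return result;
    }

private:
    std::vector<bool> dirty_;
};
```

However many times an item is updated between two timer ticks, it is reported (and painted) only once.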
One method I've used is to make sure that only one paint event is on the event queue at a time. You can use a boolean flag to mark when you begin updating and then reset the flag at the end of the WM_PAINT message (the end of the update process). Of course, if you try to update the window again and the flag is already set, then don't do anything. This will keep extra events from being piled into the queue, which can bog down your system. It looks like you may have thought of this, but do this with the entire update in addition to the individual items. Keep in mind that I'm only thinking of the updating of the windows themselves and not any underlying data.
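The "only one paint event on the queue" flag could look roughly like this. It is a sketch with hypothetical names (PaintGate, requestPaint, paintFinished), and it assumes that all calls happen on the UI thread, so no locking is shown.

```cpp
#include <cassert>

// Hypothetical gate that allows at most one outstanding paint request.
// Not thread-safe: assumes every call is made from the UI thread.
class PaintGate {
public:
    // Returns true if the caller should post a paint request now (for
    // example by calling InvalidateRect()); false if one is already pending.
    bool requestPaint() {
        if (pending_) return false;
        pending_ = true;
        return true;
    }

    // Call at the end of the paint handler (for example at the end of WM_PAINT).
    void paintFinished() { pending_ = false; }

    bool isPending() const { return pending_; }

private:
    bool pending_ = false;
};
```

Any update that arrives while a paint is pending is simply dropped at the gate instead of adding another event to the queue.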
One other thing I had to do was to "pump" (or process) the message queue during my (application) updates because updating a window (in my case) took several messages, ending with the WM_PAINT.
Another thing to watch out for is to not use idle messages for updating your interface. This is a quick and dirty way of having the update happen automatically, but it ends up being a really bad idea because the idling only happens when there are no other events on the message queue. Of course, any time you move the mouse or press keys, those events are placed onto the event queue and cause a "stall" of the update process. The idle events can end up coming so fast that your application uses most of the CPU processing power just for displaying data that hasn't even changed. It's better to have your GUI update only when the underlying data it displays actually updates.
I had data coming in at 60 Hz and updating lots of lists with columns of data, as well as 3D stuff going on. I finally had to prioritize the updates and just not update the lists for each cycle, but DO update the 3D data each cycle. Updating the lists at about 1-5 Hz was good enough for me and, when combined with the techniques above, resulted in a much improved and responsive system.
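One simple way to get the slower list rate out of the 60 Hz data cycle is a counter that fires every Nth cycle. This is only a sketch; RateDivider is a hypothetical name and the divisor is something you would tune.

```cpp
#include <cassert>

// Hypothetical rate divider: reports "due" once every `divisor` cycles.
class RateDivider {
public:
    explicit RateDivider(unsigned divisor) : divisor_(divisor) {
        assert(divisor > 0);
    }

    // Call once per incoming data cycle; returns true when the
    // low-priority view (for example the lists) should be refreshed.
    bool tick() {
        if (++count_ >= divisor_) {
            count_ = 0;
            return true;
        }
        return false;
    }

private:
    unsigned divisor_;
    unsigned count_ = 0;
};
```

With 60 Hz input and a divisor of 12, the lists are refreshed 5 times per second while the 3D view can still be updated on every cycle.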

Exact delay in screen draw and keyboard keypress event in Qt

I am working on a Qt project in which exact time at which certain events occur is of prime importance. To be specific: I have a very simple animation that must be drawn to the screen at certain time say t1. Once I issue the QWidget update to start the animation, it will take a small time dt (depending on screen refresh rates etc.) to actually show the update on screen. I need to measure this extra time dt. I am unsure as to how to do it.
I thought of using QTime and QElapsedTimer object in the paint event of the QWidget but I'm not sure if that would achieve my goal.
Similarly, when the user presses a key, it is registered after a small delay that depends on the polling rate of the keyboard. I need to account for this delay as well. If I could get the polling rate, I would know how long the delay is on average.
What you're asking for is--by definition--not possible from within the computer.
How would you expect to be able to tell when a pixel "actually showed up" on the screen, without a sensor stuck to the monitor and synchronized to an atomic clock the computer has access to also? :-)
The odds are stacked even further against Qt because it's generally used as an abstraction layer on top of Win/OSX/Linux. Those weren't Real-Time Operating Systems of any kind in the first place.
All you can know is when you asked for something to happen. Then you can time how long it takes for you to get back control to make another request. You can set some expectations on your basic "frame rate" throughput by doing this, but there are countless factors that could lead to wide variations in performance at any moment in time.
If you can dig through to the kernel/driver level you can find out a closer-to-the-metal measure of when the actual effect went to the hardware. But that's not Qt's domain, and still doesn't tell you the "actual" answer of when the effect manifested in the outside world.
About the best you're going to get out of Qt is a periodic QTimer. It can make a callback at (roughly) millisecond resolution. If that's not good enough... you're going to need a smaller boat. :-)
You might get a little boost from stuff related to the search term "high resolution timer":
Qt high-resolution timer
http://qt-project.org/forums/viewthread/31941
I thought of using QTime and QElapsedTimer object in the paint event of the QWidget but I'm not sure if that would achieve my goal.
This is, in fact, the only way to do it, and is all you can actually do. There is nothing further that can be done without resorting to a real-time operating system, custom drivers, or external hardware.
You may not need both - the QElapsedTimer measuring the time passed since the last update is sufficient.
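As a sketch of measuring the time since the last update, the helper below uses std::chrono::steady_clock as a stand-in for QElapsedTimer so that it has no Qt dependency. PaintIntervalMeter and mark() are hypothetical names; in a widget you would call mark() at the start of paintEvent().

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;

// Hypothetical helper: records the interval between successive paint calls.
class PaintIntervalMeter {
public:
    // Call at the start of each paint handler. Returns the time since the
    // previous call, or a zero duration on the first call.
    Clock::duration mark(Clock::time_point now = Clock::now()) {
        Clock::duration interval = Clock::duration::zero();
        if (hasLast_) interval = now - last_;
        last_ = now;
        hasLast_ = true;
        return interval;
    }

private:
    Clock::time_point last_{};
    bool hasLast_ = false;
};
```

This only tells you when the paint code ran, not when the pixels actually appeared on the screen.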
Do note that when the event loop is empty, the delay between invocation of widget.update() and the paintEvent executing is under a microsecond, assuming that your process wasn't preempted.
it is a reaction time experiment for some studies. A visual input is presented to which the user responds via keyboard or mouse. To be able to find the reaction time precisely I need to know when was the stimulus presented on the screen and when was the key pressed.
There is essentially only one way of doing it right without resorting to a realtime operating system or a custom driver, and a whole lot of ways of doing it wrong. So, what's the right way?
A small area of the screen needs to change color or brightness coincidentally with the presentation of the visual stimulus. You attach a fiber optic to the screen, and feed it into a receiver attached to an external event timer. The contact closure in the keyboard is also fed to the same event timer. This lets you precisely time the latency of the response with no regard for operating system latencies, thread preemption, etc. The event timer can be something as cheap as an Arduino, if you are willing to do a bit more development work.
If you are showing the stimulus repetitively and need a certain timing between stimulus presentations, you simply repeat the presentation often and collect both response latency and stimulus-to-stimulus timing in your data. You can then discard the presentations that were outside of desired tolerances.
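Discarding the out-of-tolerance presentations afterwards could be sketched like this. The Trial struct, its field names, and withinTolerance are all hypothetical; the data is assumed to come from the external event timer's log.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical record of one stimulus presentation, as logged by the
// external event timer. Field names are illustrative only.
struct Trial {
    double intervalMs;  // time since the previous stimulus
    double latencyMs;   // stimulus-to-response latency
};

// Keeps only the trials whose stimulus-to-stimulus interval is within
// toleranceMs of targetIntervalMs.
std::vector<Trial> withinTolerance(const std::vector<Trial>& trials,
                                   double targetIntervalMs,
                                   double toleranceMs) {
    assert(toleranceMs >= 0.0);
    std::vector<Trial> kept;
    for (const Trial& t : trials) {
        if (std::fabs(t.intervalMs - targetIntervalMs) <= toleranceMs) {
            kept.push_back(t);
        }
    }
    return kept;
}
```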
This approach is screen-agnostic and you can use it even on a mobile device, as long as it can somehow interface with your timer hardware. The timer hardware can of course be networked, making interfacing easy.

Many QPropertyAnimation instances VS single one?

I use QPropertyAnimation simply as a source of ticks. I set up a 1-second animation with an infinite number of loops, call start(), and then watch QElapsedTimer::elapsed() in a "tick handler" to know how much time has elapsed since the animation started. So I don't depend on the loop count or the animation's start and end values, and I don't care about the property value being animated. Just a source of ticks!
Before that I was using QTimer, which gives different results on Linux and Windows: for the animation to be smooth on Linux, I had to use a QTimer interval of 1000/30, but on Windows 1000/60 was a minimum. So I had to use #ifndef, but that's dirty code. In addition, QTimer uses the signal-slot machinery but QPropertyAnimation doesn't, so my QApplication event loop is not busy with animation events (am I correct?).
Now I need to animate N widgets (a different kind of animation for each), and I am going to use QPropertyAnimation in the same way, as the same stupid source of ticks.
What is the CPU-cost difference between these variants:
N running QPropertyAnimation instances, each connected to its own widget. The Qt documentation says that QPropertyAnimation fires ticks at about 60 fps, i.e. ~17 ms between ticks. But Qt cannot fire the ticks from N different QPropertyAnimation instances simultaneously, because the animations may have been started at different times; let's say there were 8 ms between the QPropertyAnimation::start() calls.
1 single running QPropertyAnimation instance connected to some kind of proxy object that forwards the ticks to N widgets. All such widgets have a member 'animTick(void)' for that.
If all you want is a source of "ticks", then all you need is a QVariantAnimation, not even QPropertyAnimation.
The more animations, the higher the CPU cost. All you want is one animation, whose valueChanged(QVariant) signal is connected to multiple widgets.
Note that a QBasicTimer is not a source of anything, it's a very thin wrapper around the timer id returned by QObject::startTimer(). Thus it only works within a QObject instance, and only when you reimplement timerEvent(...).
The QVariantAnimation is simply a source of nicely timed ticks so that you don't need to reinvent the wheel.
If you want a general purpose timer that sends signals to multiple objects, you really want a QTimer. That's what it's for. That way you don't need proxy objects, as you can connect one signal to many slots. You can also connect a signal to a signal, if you so wish - thus you can forward or alias signals. A QTimer is simply a QObject with timerEvent(...) that emits a signal. That's all there's to it. It'd be silly to write your own.
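The "one signal, many slots" shape can be sketched without Qt as a single tick source that forwards each tick to every registered receiver. TickSource, connect, and tick are hypothetical names used as a plain C++ stand-in; in Qt the connect() calls on a single QTimer or QVariantAnimation would do this for you.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for one timer signal connected to many slots.
class TickSource {
public:
    using Handler = std::function<void(long long elapsedMs)>;

    // Registers one receiver; empty handlers are rejected.
    void connect(Handler handler) {
        assert(static_cast<bool>(handler));
        handlers_.push_back(std::move(handler));
    }

    // Would be driven by the single timer; forwards one tick to every receiver.
    void tick(long long elapsedMs) {
        for (const Handler& handler : handlers_) handler(elapsedMs);
    }

private:
    std::vector<Handler> handlers_;
};
```

Each widget decides for itself what to do with the elapsed time, so N different animations can share the one source.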

Drawing from multiple threads in Qt

I'm writing a program in Qt which runs 10 worker threads that calculate the trajectory of an object in space. They also have to draw the path of the object. I have a "Body" class deriving from QGraphicsEllipseItem, and it has a QPainterPath in it. The "Simulation" class takes a list of obstacles in the world and the body to simulate, and runs until the body collides with something. The Simulation runs in a separate thread (done with moveToThread, not by subclassing QThread). When the body collides, the Simulation emits a signal saying that it has finished. When all threads have finished, I'd like to draw the paths (I do it by invoking a method in "Body" which enables path drawing in its draw method).
Unfortunately I get ASSERT errors :
ASSERT: "!unindexedItems.contains(item)" in file graphicsview\qgraphicsscenebsptreeindex.cpp, line 364
They happen seemingly randomly. I've tried different connection types, to no result.
I'm starting the threads in a loop.
I'm using Qt 5.0
Generally speaking, with Qt you can't do any GUI operations outside of the GUI thread (i.e. the thread that is executing QApplication::exec(), which is typically the main() thread).
So if you have multiple threads manipulating QGraphicsItems (especially QGraphicsItems that are currently part of a QGraphicsScene), that is likely the cause of your assertion failures. That is, when the Qt GUI thread is doing its window refresh, it is reading data from the various QGraphicsItem objects as part of its calculations, and it expects the QGraphicsItems to remain constant for the duration of the refresh operation. If a QGraphicsItem is changed (by another thread) while the refresh routine is executing, then the calculations made by the main thread can become wrong/corrupted, and that occasionally causes an assertion failure (and/or other unwanted behaviors).
If you really need to use multiple threads, what you'll probably need to do is have the threads do all their calculations on their own private data structures that the Qt GUI thread has no access to. Then when the threads have computed their results, they should send the results back to the Qt GUI thread (via queued connection or QApplication::postEvent()). The GUI thread can then look at the results and use them to update the QGraphicsItems, etc; this will be "safe" because this update can't happen in the middle of a window update.
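The pattern described above can be sketched with the standard library alone. Here a mutex-protected queue stands in for queued connections or QApplication::postEvent(), and the calling thread stands in for the GUI thread. SimResult, ResultQueue, and runSimulations are hypothetical names, and the "simulation" is a placeholder loop.

```cpp
#include <cassert>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical result of one simulation; in the real program this would
// carry the computed path rather than a single number.
struct SimResult {
    int bodyId;
    double pathLength;
};

// Thread-safe mailbox standing in for queued connections / postEvent().
class ResultQueue {
public:
    void push(const SimResult& result) {
        std::lock_guard<std::mutex> lock(mutex_);
        results_.push(result);
    }

    // Called only from the "GUI" thread; returns false when the queue is empty.
    bool tryPop(SimResult& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (results_.empty()) return false;
        out = results_.front();
        results_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::queue<SimResult> results_;
};

// Runs workerCount workers on private data, then collects the results on
// the calling thread (the stand-in for the GUI thread).
std::vector<SimResult> runSimulations(int workerCount) {
    assert(workerCount >= 0);
    ResultQueue queue;
    std::vector<std::thread> workers;
    for (int id = 0; id < workerCount; ++id) {
        workers.emplace_back([id, &queue] {
            double length = 0.0;  // private to this worker
            for (int step = 0; step < 100; ++step) length += 0.5;
            queue.push(SimResult{id, length});
        });
    }
    for (std::thread& worker : workers) worker.join();

    // Only this thread would touch the QGraphicsItems when applying results.
    std::vector<SimResult> applied;
    SimResult result{};
    while (queue.tryPop(result)) applied.push_back(result);
    return applied;
}
```

The workers never see any GUI object; they only hand over finished results, so nothing can change underneath a repaint.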
If that sounds like too much work, then you might consider just doing everything in the GUI thread; it will be much easier and simpler to make everything work reliably that way.
As mentioned by Jeremy, Qt rendering must be done on the main thread.
While you could move it all to the main thread, you've likely chosen to create separate ones for efficiency, especially as collision detection can be processor intensive. The best way to handle this is to split the modelling of the objects and their physics from their rendering, as you would in a Model / View / Controller pattern.
Create representations of the body instances that are not derived from any QGraphicsItem/Objects. These can then do their calculations on separate threads and have signals to graphics objects that are running in the main thread, which updates each body instance's graphic representation, allowing real-time rendering of the trajectories.