Multithreaded input with mutex not smooth (as expected) - c++

I am currently developing a 3D engine from scratch (again) as I wanted to use more modern techniques (and frankly, my previous design was crap). Now I am in the process of implementing my input thread.
Now that I am more experienced, I know that if I write to the same variable from my input thread and my rendering/main thread I will get data races, so I decided to use mutexes (mutices?) to lock data that could be written to from different threads. But this is causing an unacceptable bug: mouse input isn't smooth any more :/
I did kind of expect that though, I just thought my thinking might be off.
Now I am stuck at a crossroads because I don't know how to go about fixing this issue!
The variables that I am writing to from both threads are x_rel and y_rel, which hold the mouse position relative to the last position at which I received an event.
The input thread sets the variables and the rendering/main thread resets them to 0.0 when it is finished with them. This works fine, but as I said, this gives me very rigid mouse motion.
My question here is, what can I do to get smooth input while still being race safe across threads?
Here is my mutex definition (it is global):
std::mutex mouse_mutex;
Here is the code that I use to get the mouse events:
void input_thread_func(application &app, const bool &running, double &x_rel, double &y_rel){
    while(running){
        application::event ev = app.get_input();
        switch(ev.type){
        case events::mouse_motion: {
            if(mouse_mutex.try_lock()){
                x_rel = ev.xrel;
                y_rel = ev.yrel;
                mouse_mutex.unlock();
            }
            break;
        }
        default:
            break;
        }
    }
}
And here is my main function:
int main(int argc, char *argv[]){
    /* all my init stuff */
    application app;
    bool running = true;
    double x_rel = 0.0, y_rel = 0.0;
    std::thread input_thread(
        input_thread_func,
        std::ref(app), std::cref(running),
        std::ref(x_rel), std::ref(y_rel)
    );
    double multiplier = /* whatever I like */;
    while(running){
        /* check input data */
        if(mouse_mutex.try_lock()){
            update_camera(x_rel * multiplier, y_rel * multiplier);
            app.set_mouse_pos(0, 0);
            x_rel = 0.0; y_rel = 0.0;
            mouse_mutex.unlock();
        }
        /* do rendering stuff */
    }
    input_thread.join(); // join before exiting, otherwise std::terminate is called
}

This is my understanding of your problem:
Basically, you have two threads, one dealing with mouse events coming from the system, and one taking care of rendering duties. As far as I understand, the latter needs the most recent mouse position in order to compute an accurate set of camera matrices. However, because of the nature of the input device, the event thread is flooded with mouse position events: the system polls the device fast enough to get many updates per rendering frame, and these updates get pushed to the event thread. That thread, having only that task to do, will constantly lock/unlock the mouse mutex while processing these events, and the odds are high that the lock is held by the event thread when the rendering thread actually wants to get ahold of it.
There are two possible problems coming out of your current setup:
either the rendering thread needs all the mouse updates, in which case you'd need to implement an event queue between that thread and the event thread to keep track of all of them; you'd essentially be filtering mouse events and dispatching them to rendering:
set up a mouse event queue between those threads, and just push mouse events from the input queue to that new queue;
have rendering check the mouse queue at each frame, and do as many camera updates as necessary, or better yet, as suggested in the comments, combine all these events into one single camera update, if possible. You should try putting that computation in the event thread, in order to reduce the load on rendering;
or rendering only needs the latest event (as it seems to be doing), in which case you need to update the mutexed data with the latest values only (thus reducing contention on that mutex).
To recap: depending on how the camera update function works with respect to mouse events, you will probably have to change it to work with only one set of absolute coordinates built cumulatively from all the relative events (maybe you can get them directly from the mouse event structure?), and only update the mutexed data once per event stream.
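If rendering only needs the net motion since the last frame, a common way to implement that second option is to accumulate the relative motion under the lock and let the renderer swap it out once per frame. A minimal sketch, reusing the global mouse_mutex from the question (the helper names are mine):

#include <mutex>

std::mutex mouse_mutex;               // same global mutex as in the question
double acc_x = 0.0, acc_y = 0.0;      // motion accumulated since the last frame

// Input thread: fold every motion event into the accumulator.
// A blocking lock_guard instead of try_lock means no event is ever
// dropped; the critical section is two additions, so contention is tiny.
void on_mouse_motion(double xrel, double yrel){
    std::lock_guard<std::mutex> lock(mouse_mutex);
    acc_x += xrel;
    acc_y += yrel;
}

// Render thread: take the accumulated motion once per frame and reset it.
void consume_mouse(double &x_out, double &y_out){
    std::lock_guard<std::mutex> lock(mouse_mutex);
    x_out = acc_x;
    y_out = acc_y;
    acc_x = acc_y = 0.0;
}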
Note:
You may also check how other engines do it: the IdTech2 (Quake I/II) series of engines are single-threaded, but they're still a good source of inspiration. Each frame, these engines deal with all mouse inputs in one go. During a frame render, a first routine is called (In_Move, HandleEvents, or some other function, depending on the backend; see the sys_* files) to check all the events and update all the related structures. When the rendering code (R_RenderFrame) is invoked, there is no contention on these structures anymore. You probably want to "emulate" the same behaviour, by making sure that rendering isn't held back by one or more mutexes. A possible solution has been described above for mouse input, and it can certainly be extended to handle other types of input devices.

Related

OpenGL render loop

I have an application which renders a 3d object using OpenGL, allowing the user to rotate and zoom and inspect the object. Currently, this is driven directly by received mouse messages (it's a Windows MFC MDI application). When a mouse movement is received, the viewing matrix is updated, and the scene re-rendered into the back buffer, and then SwapBuffers is called. For a spinning view, I start a 20ms timer and render the scene on the timer, with small updates to the viewing matrix each frame. This is OK, but is not perfectly smooth. It sometimes pauses or skips frames, and is not linked to vsync. I would love to make it smoother and smarter with the rendering.
It's not like a game where it needs to be rendered every frame though. There are long periods where the object is not moved, and does not need to be re-rendered.
I have come across GLFW library and the glfwSwapInterval function. Is this a commonly used solution?
Should I create a separate thread for the render loop, rather than being message/timer driven?
Are there other solutions I should investigate?
Are there any good references for how to structure a suitable render loop? I'm OK with all the rendering code - just looking for a better structure around the rendering code.
So, I assume you are using GLFW for creating / operating your window.
If you don't have to update your window on each frame, I suggest using glfwWaitEvents() or glfwWaitEventsTimeout(). The first one puts the calling thread (not the window) to sleep until any event happens (a mouse press, a resize event, etc.). The second one is similar, but you can specify a timeout for the sleep: the function waits until any event happens OR until the specified time runs out.
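To illustrate, an on-demand redraw loop built around glfwWaitEvents() could look like the following sketch; redraw_needed and render_scene() are placeholders for your own flag and drawing code:

// Sketch: redraw only when something actually changed.
// glfwWaitEvents() puts the thread to sleep until an event arrives,
// so this loop burns no CPU while the scene is idle.
while(!glfwWindowShouldClose(window))
{
    glfwWaitEvents(); // block until input / resize / etc.

    if(redraw_needed) // flag set by your input callbacks (placeholder)
    {
        render_scene(); // your existing drawing code (placeholder)
        glfwSwapBuffers(window);
        redraw_needed = false;
    }
}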
As for glfwSwapInterval(), this is probably not the solution you are looking for. This function sets the number of screen refreshes the video card waits for (skips) when glfwSwapBuffers() is called.
If you, for example, use glfwSwapInterval(1) (assuming you have a valid OpenGL context), this will sync your context to the framerate of your monitor (aka v-sync, but I'm not sure if it is valid to call it so).
If you use glfwSwapInterval(0), this will basically disable synchronisation with the monitor, and the video card will swap buffers with glfwSwapBuffers() instantly, without waiting.
If you use glfwSwapInterval(2), this will double the time that glfwSwapBuffers() waits after (or before?) flushing the framebuffer to the screen. So, if you have, for instance, a 60 Hz display, using glfwSwapInterval(2) will result in 30 fps in your program (assuming you use glfwSwapBuffers() to flush the framebuffer).
glfwSwapInterval(3) will give you 20 fps, glfwSwapInterval(4) will give you 15 fps, and so on.
As for a separate render thread, this is good if you want to divide your "thinking" and rendering processes, but it comes with its own advantages, disadvantages and difficulties. Tip: some window events can't be handled "properly" without a separate thread (see this question).
The usual render loop looks like this (as far as I've learned from the learnopengl lessons):
// Setup process before...
while(!window_has_to_close) // <-- Run the game loop until the window is marked
                            //     "has to close". In GLFW this is done using
                            //     glfwWindowShouldClose():
                            //     https://www.glfw.org/docs/latest/group__window.html#ga24e02fbfefbb81fc45320989f8140ab5
{
    // Prepare for handling input events (e.g. callbacks in GLFW)
    prepare();
    // Handle events (if there are none, this is just skipped)
    glfwPollEvents(); // <-- You can also use glfwWaitEvents()
    // "Thinking" step of your program
    tick();
    // Clear the window framebuffer (better to also put this in a separate func)
    glClearColor(0.f, 0.f, 0.f, 1.f);
    glClear(GL_COLOR_BUFFER_BIT);
    // Render everything
    render();
    // Swap buffers (you can also put this in a separate function)
    glfwSwapBuffers(window); // <-- Flush the framebuffer to the screen
}
// Exiting operations after...
See this ("Ready your engines" part) for additional info. Wish you luck!

Strategy for creating a fadable child window

I'm attempting to create a window class that supports fading in and out, even for child windows. Basically it adds the WS_EX_LAYERED style to the window, and then it calls SetLayeredWindowAttributes on a timer, gradually changing the alpha value.
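For context, the core of that approach is roughly the following sketch; the timer id, step size, and the alpha variable are arbitrary choices of mine, not the actual implementation:

// Sketch of the timer-driven fade described above. In a real window
// class, g_alpha would be stored per window rather than globally.
const UINT_PTR FADE_TIMER_ID = 1; // arbitrary timer id
int g_alpha = 0;

void StartFadeIn(HWND hwnd)
{
    SetWindowLongPtr(hwnd, GWL_EXSTYLE,
                     GetWindowLongPtr(hwnd, GWL_EXSTYLE) | WS_EX_LAYERED);
    g_alpha = 0;
    SetLayeredWindowAttributes(hwnd, 0, 0, LWA_ALPHA); // start fully transparent
    SetTimer(hwnd, FADE_TIMER_ID, 15, nullptr);        // roughly 60 steps per second
}

// Inside the window procedure:
case WM_TIMER:
    if (wParam == FADE_TIMER_ID)
    {
        g_alpha = (g_alpha + 16 > 255) ? 255 : g_alpha + 16;
        SetLayeredWindowAttributes(hwnd, 0, (BYTE)g_alpha, LWA_ALPHA);
        if (g_alpha == 255)
            KillTimer(hwnd, FADE_TIMER_ID); // fade finished
    }
    break;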
That approach is okay, but of course the fading will become temporarily interrupted if there are higher priority messages that come through the thread's message queue. So, for example, if there's some resize event going on somewhere, the fading will slow or temporarily stop.
I'm wondering if there's a strategy to somehow avoid this. So far my only solution is to create the fadable window on its own thread, so the timer messages don't get interrupted by anything. That solution is feasible, but it does add some additional threading complexity, so I was hoping to avoid it if possible. Thanks for any input.

C++ Win32 realtime painting performance - how to know when the application can paint without using all CPU time

In a display application we use a large window painting area. The display application gets so many updates for painting realtime data that all the CPU time of the PC is used for painting. We use InvalidateRect() and then paint the items in the WM_PAINT handler.
So we decided to use a dirty flag for each item to paint, to reduce the painting.
How can we know when the application can paint the items so that not all CPU time is consumed? Is there anything telling us that we can do our painting now?
If the data is updating so fast that painting each update is too much, you can use a timer. Every (say) quarter second, the timer fires, and if any items are dirty, the timer handler calls InvalidateRect(). Updating the data no longer invalidates; only the timer handler does that.
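A sketch of that timer approach; the timer id and the AnyItemDirty() helper are hypothetical names of mine:

const UINT_PTR PAINT_TIMER_ID = 2; // arbitrary timer id
// Started once, e.g. at window creation:
// SetTimer(hwnd, PAINT_TIMER_ID, 250, nullptr); // fire every 250 ms

// Data updates only mark items dirty; they no longer invalidate.
void OnDataUpdate(Item &item){
    item.dirty = true; // cheap: no InvalidateRect here
}

// In the window procedure, the timer decides when to actually paint:
case WM_TIMER:
    if (wParam == PAINT_TIMER_ID && AnyItemDirty()){ // AnyItemDirty(): your scan over the items
        InvalidateRect(hwnd, nullptr, FALSE);        // many updates coalesce into one WM_PAINT
    }
    break;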
Edit: You could query Windows for the CPU load and if it's low, do the Invalidate immediately; see How to get system cpu/ram usage in c++ on Windows
One method I've used is to make sure that only one paint event is on the event queue at a time. You can use a boolean flag that is set when you begin updating and reset at the end of the WM_PAINT handler (the end of the update process). Of course, if you try to update the window again while the flag is already set, don't do anything. This keeps extra events from piling up in the queue, which can bog down your system. It looks like you may have thought of this, but do it with the entire update in addition to the individual items. Keep in mind that I'm only thinking of the updating of the windows themselves and not any underlying data.
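A sketch of that single-pending-paint flag (identifiers are mine):

bool paint_pending = false; // at most one queued paint at a time

void RequestRepaint(HWND hwnd){
    if(!paint_pending){                       // no WM_PAINT queued yet
        paint_pending = true;
        InvalidateRect(hwnd, nullptr, FALSE); // queue exactly one paint
    }
}

// In the window procedure:
case WM_PAINT:
    DoPainting(hwnd);      // your BeginPaint/EndPaint drawing code (placeholder)
    paint_pending = false; // allow the next invalidation
    break;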
One other thing I had to do was to "pump" (or process) the message queue during my (application) updates because updating a window (in my case) took several messages, ending with the WM_PAINT.
Another thing to watch out for is to not use idle messages for updating your interface. This is a quick and dirty way of having the update happen automatically, but ends up being a really bad idea because the idling only happens when there are no other events on the message queue. Of course, any time you move the mouse or press keys those events are placed onto the event queue and causes a "stall" of the update process. The idle events can end up coming so fast that it causes your application to use most of the CPU processing power just for displaying data that hasn't even changed. It's better to have your GUI only update when the underlying data it displays actually updates.
I had data coming in at 60 Hz and was updating lots of lists with columns of data, as well as 3D stuff going on. I finally had to prioritize the updates and just not update the lists on each cycle, but DO update the 3D data each cycle. Updating the lists at about 1-5 Hz was good enough for me, and when combined with the techniques above this resulted in a much improved and responsive system.

SDL_PollEvent in depth

Here is a part of SDL2 code
SDL main function
int main(int argc, char *argv[])
{
    ...
    ...
    bool quit = false;
    SDL_Event e;
    while(!quit) ///First while (say)
    {
        while(SDL_PollEvent(&e)) ///Second while (say)
        {
            if(e.type == SDL_QUIT)
            {
                quit = true;
            }
            handleEvent(e); ///Function for executing certain event
        }
        ...
        SDL_RenderPresent(renderer);
    }
}
My question is: what does SDL_PollEvent() actually do? And suppose an event occurs: does execution leave the second while() and call SDL_RenderPresent(), or does it wait for all the events to be polled before SDL_RenderPresent() is called? I am totally confused.
The above is a very common single-threaded event loop:
Basically the application is constantly inside the outer while loop. To get the smoothest user experience we try to keep this loop under 17 ms (for a rate of 60 frames per second).
Every 'frame' starts by responding to all the events that are waiting in the queue (the inner while):
while(SDL_PollEvent(&e)) ///Second while (say)
{
    if(e.type == SDL_QUIT)
    {
        quit = true;
    }
    handleEvent(e); ///Function for executing certain event
}
Events are notifications from the operating system that something happened. It might be that the window is closing (SDL_QUIT) or that the mouse was moved.
You must respond to these events for the application to be responsive. Usually the response is to change the state of the application.
For example, when we see a left-mouse-button-down event, we might find what is "under" the mouse cursor and indicate that it is now selected. This is normally just finding the object and calling a function that will change its state. All that changes is the boolean value indicating that the object is now selected.
Maybe moving the mouse needs to change the point of view for the next frame, so we update the vector that stores the direction we are looking in. So we update the vector in memory.
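In code, that kind of state-only event handling might look like the sketch below; Object, pick_object(), camera and sensitivity are placeholders for your own scene structures:

#include <SDL.h>

// Sketch: events only mutate state; all drawing happens later in the frame.
struct Object { bool selected = false; };
struct Camera { double yaw = 0.0, pitch = 0.0; } camera;
const double sensitivity = 0.005;

Object *pick_object(int x, int y); // hypothetical hit test

void handleEvent(const SDL_Event &e){
    switch(e.type){
    case SDL_MOUSEBUTTONDOWN:
        if(e.button.button == SDL_BUTTON_LEFT){
            Object *obj = pick_object(e.button.x, e.button.y);
            if(obj) obj->selected = true; // just flip a flag, no drawing here
        }
        break;
    case SDL_MOUSEMOTION:
        camera.yaw   += e.motion.xrel * sensitivity; // update the view direction;
        camera.pitch += e.motion.yrel * sensitivity; // rendering reads it later
        break;
    }
}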
You may have long stretches where the event queue is empty and the application does not have any events to handle. And there might be flurries of activity (for instance the user moving the mouse) where you will get lots of events to respond to.
SDL_PollEvent will not "wait" for events. If there is an event in the queue you will get the information. If there is no event it will return false.
Handling events should be done quickly (remember, we have to be finished in 17 ms), but don't worry, that is quite a lot of time on a PC.
Once you are done with all the events and out of the inner loop you are ready to move on to updating the world and rendering.
At this point you will normally do stuff like AI and calling the physics engine. For instance, you might iterate over the objects and change their positions based on their velocities.
The next step is to actually do the drawing.
SDL_RenderClear(renderer);
...
SDL_RenderPresent(renderer);
The first call will clear the screen. You then do the rendering based on the state of the different objects. For instance, maybe because we changed the object's state to selected, we will now draw a glowing border around it.
Your final call, SDL_RenderPresent(renderer), presents the new screen to the user.
If you are using vsync (quite common) then this final call will hide a small wait that synchronizes the screen update with the graphics card. This produces smoother graphics. Assuming a 60 Hz refresh rate (60 frames per second), and assuming your frame rendering logic runs in under 16.6 ms, the app will wait for the remaining time.
Now the application is ready to go back to the start of the loop and check if there are any events in SDL_PollEvent. Since the entire loop typically only takes a few milliseconds the application will always feel responsive.

Qt: what happens if you send out signals too quickly?

Here is the situation:
You have one long-running calculation running in a background thread.
This calculation is sending out a signal to, for example, refresh a GUI element, every 100 msec.
Let's say it sends out 100 such signals.
The widget being redrawn takes more than 100 msec to redraw; let's say 1 second.
What happens in the event loop? Do the signal calls "pile up" until they are all executed (i.e. 100 seconds)? Is there any mechanism for "dropping" events?
User events are never discarded. If you queue emitted signal events faster than you can process them, your event queue will grow until you run out of memory and your program will crash. It's worth noting, though, that QTimer will skip timeout events if the system is under heavy load. To some extent, that may help regulate your throughput.
You could also consider sending feedback from one thread to the other (an acknowledgement, perhaps), and manually adjust your timing in the producer thread based on how far behind the consumer thread is. Or, you could use a metaphorical sledgehammer and switch to a blocking queued connection.
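The blocking variant is a one-line change at connect time: the worker then stalls on every emit until the GUI thread has finished running the slot, so signals cannot pile up. A sketch (Worker and Widget are hypothetical classes; only the connection type matters):

// The fifth argument makes emit block until the slot has run in the
// receiver's thread. Only safe when sender and receiver live in
// different threads; otherwise it deadlocks.
QObject::connect(worker, &Worker::progress,
                 widget, &Widget::refreshView,
                 Qt::BlockingQueuedConnection);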
In your example, you could measure the drawing time in the widget. If the drawing takes for example 240 ms, then you could process the next 2 signals quickly without drawing anything at all. That way the signals wouldn't pile up.
Edit:
Actually there is a slight problem in my solution. The last signal should always cause a redraw, otherwise the widget would show wrong data when the calculation is finished.
When a signal is skipped, a single-shot timer could be started, for example with a 150 ms interval. When a redraw is done because of a signal, this timer would be stopped. So after the last redraw signal, this single-shot timer would cause the drawing of the final state. I guess this would work, but it would be quite complicated.
Starting a simple timer to do the redrawing when the calculation starts would quite probably be a better approach. If the drawing of the widget takes a lot of time, the timer interval could be adjusted dynamically according to the draw time.
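That simpler timer-driven approach could look like this sketch (the widget variable and the interval are placeholders):

#include <QTimer>
#include <QWidget>

// Sketch: decouple the redraw rate from the calculation's signal rate.
// The widget repaints at a fixed interval and just reads the latest data.
QTimer *redrawTimer = new QTimer(widget);
QObject::connect(redrawTimer, &QTimer::timeout,
                 widget, QOverload<>::of(&QWidget::update)); // schedule a repaint
redrawTimer->start(100); // 10 redraws per second; tune to taste

// When the calculation finishes: stop the timer and force one last
// update(), so the widget always ends up showing the final state.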