Windows Aero Rendering Bug - c++

So, I have stumbled upon an interesting bug with the Windows API and I'm wondering if anyone has some insight on how to work around it. It seems that even Google has struggled with it. It should be noted that while I will be fixing this in Qt source itself, the problem is with Windows default message handling, not Qt. All of the files that I will mention can be found online as they are all open source libraries. Below is somewhat of a complex problem and I will try and give as much context as possible. I've put a lot of time and effort into fixing this myself, but being that I've only been an engineer for about 8 months, I'm still quite inexperienced and could very well have missed something obvious.
The context:
I have written a program that uses Qt to skin my windows with custom skins. These skins go over the system default non-client UI skins. In other words, I use custom painted frames (supported by Qt). Since Qt5 I've been having issues with my program when it is run on any pre-Windows Aero OS ( less than XP and greater than Vista with Windows Aero disabled). Unfortunately, Qt devs have all but confirmed that they do not really support XP anymore, so I will not rely on them to fix the bug.
The Bug:
Clicking anywhere in the non-client area while running a machine with composition disabled (Windows Aero disabled or not existing) will cause Windows to repaint its system default non-client UI on top of my custom skin.
My Research
A bit of debugging and investigation led me to qWindowsProc in qwindowscontext.cpp. I was able to determine that the last windows message to be handled before my window's skin was painted over was WM_NCLBUTTONDOWN. This seemed strange, so I took to the internets.
Sure enough, I found a file called hwnd_message_Handler.cc that comes from Google's Chromium Embedded Framework (CEF). In that file are many comments about how various windows messages, for some insane reason, cause repaints of the system default non-client frames over custom frames. The following is one such comment.
// A scoping class that prevents a window from being able to redraw in response
// to invalidations that may occur within it for the lifetime of the object.
//
// Why would we want such a thing? Well, it turns out Windows has some
// "unorthodox" behavior when it comes to painting its non-client areas.
// Occasionally, Windows will paint portions of the default non-client area
// right over the top of the custom frame. This is not simply fixed by handling
// WM_NCPAINT/WM_PAINT, with some investigation it turns out that this
// rendering is being done *inside* the default implementation of some message
// handlers and functions:
// . **WM_SETTEXT**
// . **WM_SETICON**
// . **WM_NCLBUTTONDOWN**
// . EnableMenuItem, called from our WM_INITMENU handler
// The solution is to handle these messages and **call DefWindowProc ourselves**,
// but prevent the window from being able to update itself for the duration of
// the call. We do this with this class, which automatically calls its
// associated Window's lock and unlock functions as it is created and destroyed.
// See documentation in those methods for the technique used.
//
// The lock only has an effect if the window was visible upon lock creation, as
// it doesn't guard against direct visiblility changes, and multiple locks may
// exist simultaneously to handle certain nested Windows messages.
//
// IMPORTANT: Do not use this scoping object for large scopes or periods of
// time! IT WILL PREVENT THE WINDOW FROM BEING REDRAWN! (duh).
//
// I would love to hear Raymond Chen's explanation for all this. And maybe a
// list of other messages that this applies to ;-)
Also in that file exists several custom message handlers to prevent this bug from occurring. For example, another message I found that causes this bug is WM_SETCURSOR. Sure enough, they have a handler for that which, when ported to my program, worked wonderfully.
One of the common ways they handle these messages is with a ScopedRedrawLock. Essentially, this just locks redrawing at the beginning of the hostile message's default handling (via DefWindowProc) and remains locked for the duration of the call, unlocking itself when it comes out of scope (hence, ScopedRedrawLock). This will not work for WM_NCLBUTTONDOWN for the following reason:
Stepping through qWindowsWndProc during the default handling of WM_NCLBUTTONDOWN, I saw that WM_SYSCOMMAND is handled in the same call stack directly after WM_NCLBUTTONDOWN. The wParam for this particular WM_SYSCOMMAND is 0xf012 - another officially undocumented value**. Luckily in the remarks section of the MSDN WM_SYSCOMMAND page somebody commented about it. Turns out, it is the SC_DRAGMOVE code.
For reasons that may seem obvious, we cannot simply lock redrawing for the handling of WM_NCLBUTTONDOWN because Windows automatically assumes that the user is trying to drag the window if he clicks on a non-client area (in this case, HTCAPTION). Locking here will cause the window to never redraw for the duration of the drag- until Windows receives a button up message(WM_NCLBUTTONUP or WM_LBUTTONUP).
And sure enough, I find this comment in their code,
if (!handled && message == WM_NCLBUTTONDOWN && w_param != HTSYSMENU &&
delegate_->IsUsingCustomFrame()) {
// TODO(msw): Eliminate undesired painting, or re-evaluate this workaround.
// DefWindowProc for WM_NCLBUTTONDOWN does weird non-client painting, so we
// need to call it inside a ScopedRedrawLock. This may cause other negative
// side-effects (ex/ stifling non-client mouse releases).
DefWindowProcWithRedrawLock(message, w_param, l_param);
handled = true;
}
This makes it seem as though they had the same problem, but didn't quite get around to solving it.
The only other place CEF handles WM_NCLBUTTONDOWN in the same scope as this problem is here:
else if (message == WM_NCLBUTTONDOWN && delegate_->IsUsingCustomFrame()) {
switch (w_param) {
case HTCLOSE:
case HTMINBUTTON:
case HTMAXBUTTON: {
// When the mouse is pressed down in these specific non-client areas,
// we need to tell the RootView to send the mouse pressed event (which
// sets capture, allowing subsequent WM_LBUTTONUP (note, _not_
// WM_NCLBUTTONUP) to fire so that the appropriate WM_SYSCOMMAND can be
// sent by the applicable button's ButtonListener. We _have_ to do this
// way rather than letting Windows just send the syscommand itself (as
// would happen if we never did this dance) because for some insane
// reason DefWindowProc for WM_NCLBUTTONDOWN also renders the pressed
// window control button appearance, in the Windows classic style, over
// our view! Ick! By handling this message we prevent Windows from
// doing this undesirable thing, but that means we need to roll the
// sys-command handling ourselves.
// Combine |w_param| with common key state message flags.
w_param |= base::win::IsCtrlPressed() ? MK_CONTROL : 0;
w_param |= base::win::IsShiftPressed() ? MK_SHIFT : 0;
}
}
And while that handler addresses a similar problem, its not quite the same.
The Question
So at this point I'm stuck. I'm not quite sure where to look. Maybe I'm reading the code incorrectly? Maybe the answer is there in CEF and I'm just overlooking it? It seems like CEF engineers encountered this problem and have yet to come up with the solution, given the TODO: comment. Does anybody have any idea what else I could do? Where do I go from here? Not solving this bug is not an option. I'm willing to dig deeper but at this point I'm contemplating actually handling Windows drag events myself rather than having the DefWindowProc handle it. Though, that might still cause the bug in the case where the user is actually dragging the window.
Links
I have included a list of links that I have been using in my research. Personally, I downloaded CEF source myself so that I could better navigate the code. If you are truly interested in solving this problem, you might need to do the same.
WM_NCLBUTTONDOWN
WM_NCHITTEST
WM_SYSCOMMAND
DefWindowProc
hwnd_message_handler.cc
hwnd_message_handler.h
qwindowscontext.cpp
Tangent
Just to bring validation to CEF's code, if you look in the header of hwnd_message_handler, you will also notice that there are two undocumented windows messages of value 0xAE and 0xAF. I was seeing 0xAE during the default handling of WM_SETICON that was causing problems, and this code helped confirm that what I was seeing was indeed real.

I found this page which suggests hiding your window by removing WS_VISIBLE immediately before calling DefWindowProc(), then showing it immediately after. I haven't tried it, but it's something to look at.

So, the actual way this fix was achieved was by removing the WS_CAPTION flag during NC_LBUTTONDOWN and adding it back during NC_LBUTTONUP message handling. However, because of the way Windows calculates its size before rendering, it could miscalculate since it removes the caption area from consideration. So, you will need to offset this while handling the WM_NCCALCSIZE message.
Keep in mind that the amount of pixels you will need to offset will vary depending on which windows theme or OS you are in. i.e. Vista has a different theme than XP. So you will need to decide on a scale factor to keep it clean.

Related

Simulate mouse click in background window

I'm trying to use SendMessage to post mouse clicks to a background window (Chrome), which works fine, but brings the window to front after every click. Is there any way to avoid that?
Before anyone says this is a duplicate question, please make sure that the other topic actually mentions not activating the target window, because I couldn't find any.
Update: aha, hiding the window does the trick, almost. It receives simulated mouse/keyboard events as intended, and doesn't show up on screen. However, I can just barely use my own mouse to navigate around the computer, and keyboard input is completely disrupted.
So my question is, how does sending messages to a window affect other applications? Since I'm not actually simulating mouse/keyboard events, shouldn't the other windows be completely oblivious to this?
Is it possibly related to the window calling SetCapture when it receives WM_LBUTTONDOWN? And how would I avoid that, other than hooking the API call (which would be very, very ugly for such a small task)?
The default handling provided by the system (via DefWindowProc) causes windows to come to the front (when clicked on) as a response to the WM_MOUSEACTIVATE message, not WM_LBUTTONDOWN.
The fact that Chrome comes to the front in response to WM_LBUTTONDOWN suggests that it's something Chrome is specifically doing, rather than default system behaviour that you might be able to prevent in some way.
The source code to Chrome is available; I suggest you have a look at it and see if it is indeed something Chrome is doing itself. If so, the only practical way you would be able to prevent it (short of compiling your own version of Chrome) is to inject code into Chrome's process and sub-class its main window procedure.

Properly Handling Alt-Enter / Alt-Tab Fullscreen Resolution

The MSDN page on DXGI gives instructions on how to handle fullscreen resolutions different from the desktop resolution. It says to call IDXGISwapChain::ResizeTargets() before calling IDXGISwapChain::SetFullscreenState() to prevent flickering, among other things.
It does not say how to handle Alt-Enter, which calls IDXGISwapChain::SetFullscreenState() before the program is given a chance to make its own call to IDXGISwapChain::ResizeTargets(). If the latter method is called upon a WM_SIZE message, another WM_SIZE message will be sent, possibly causing an infinite loop. How can I ensure that the latter will be called before the former when alt-enter or alt-tab are pressed, and that mode switching occurs painlessly in general?
This is going to be really tricky ... the right way how this is supposed to be handled is IDXGIFactory::MakeWindowAssociation, which, as far as I know, nobody has managed to use successfully. You may want to try it anyway.
The "right" answer is to manually handle Alt+Enter. So, disable Alt+Enter using MakeWindowAssociation and get your hands dirty. First, there is no need to capture WM_SIZE. Instead, listen on WM_ENTERSIZEMOVE, WM_CAPTURECHANGED, WM_WINDOWPOSCHANGED and WM_EXITSIZEMOVE. This will prevent you from having to deal with WM_SIZE and still get all relevant window resizing events. (When doing this, read this question as well: WM_ENTERSIZEMOVE / WM_EXITSIZEMOVE - when using menu, not always paired)
Ok, so assuming everything went fine, for Alt+Enter, you have to do the following: You set your swap chain to full screen using IDXGISwapChain::SetFullscreenState and then resize your swap chain (IDXGISwapChain::ResizeBuffers). By default, you'll get a swap chain which is as close as possible to the current resolution of your window before resizing. The way you do this properly is to enumerate the full screen resolutions first, and upon going fullscreen, forcing the resolution you want to have. This sounds ugly, but it seems to be the most robust way to solve the problem.
In general, real, exclusive fullscreen mode is not worth the trouble, as you will always get flickering when someone goes Alt+Tab (you can't avoid it if a mode switch happens, as the screen itself will have to readjust.) A much better solution is to use a fullscreen borderless window. You simply create a window class without any decoration, make it full-screen, place it such that it covers the whole screen and be done with it. Then you don't have to worry about Alt+Enter and Alt+Tab at all. It also allows people to continue working on a second screen without flickering. Performance wise, this is pretty ok'ish (most new games support this as "borderless fullscreen".)
There might be a silver bullet which solves all of this correctly, but I haven't seen it yet. If there is a cleaner/nicer solution, I'd be really curious to hear it. "Borderless fullscreen" seems to be the current standard though, IIRC, Unity 5 will only allow "borderless fullscreen" for Direct3D 11.
I just want to add an update on this issue - I've written a small windowing library that I believe handles DXGI pretty well - no debug messages, no error messages, and everything behaves as intended, at least on my Windows environment. The full solution to this problem is much too complex to explain in a single answer, since it requires a lot of precisely placed method calls (DXGI is really, really rigid as it turns out), but I have my code up on github if anyone wants to take a look at it. Specifically, this file and this file are the ones you want to look at - the latter being an aggregate object of the former.
Note that I have disabled ALT+ENTER in favor of F11, but the functionality is exactly the same.
If you want to use that library, by the way, I am releasing it as free software and will provide documentation soon.

What function is called when Alt-Enter is pressed?

I have a game app that has the ability to go fullscreen and back to windowed when Alt-Enter is pressed. However, when it goes fullscreen, I get the following warning from DirectX:
DXGI Warning: IDXGISwapChain::Present: Fullscreen presentation inefficiencies incurred due to application not using IDXGISwapChain::ResizeBuffers appropriately, specifying a DXGI_MODE_DESC not available in IDXGIOutput::GetDisplayModeList, or not using DXGI_SWAP_CHAIN_FLAG_ALLOW_MODE_SWITCH.
I've already ruled out the second two possibilities through testing, so I know the only reasons left for the warning to pop up are either IDXGISwapChain::ResizeBuffers isn't being used right, or Windows is just bugged. Since I can't debug the 2nd possibility, I'm sticking with the ResizeBuffers problem. To debug this, I want to look at what happens when Alt-Enter is pressed going from windowed to fullscreen. However, the app does not seem to be calling my ResizeDXGIBuffers method; in fact, it seems that Alt-Enter is embedded into windows or DirectX somewhere, and I don't know how to find the chain of function calls that go off when it is pressed. EDIT: When my method is put in the WM_ACTIVATEAPP handler, it is called, but this is not what i meant. If i take it out of that message handler, the window STILL goes to fullscreen, even though I am not calling any functions to make the window fullscreen myself. So Alt+Enter must be automatically calling some internal function to do this.
So that is my question: Does anyone know what function is called by windows and/or DirectX 11 when Alt-Enter is pressed?
EDIT: As the tags for this question say, I am using DirectX 11 on a Windows machine. Specifically, Windows 7 64-bit.
EDIT 2: I now completely eat the Alt+Enter keystroke and manually store the state of Alt+Enter being pressed so that I know for certain only my code is being called. The warning I spoke of above persists, however. I am following the MSDN best practices as well, so I don't know where to go from here.
Try handling the WM_ACTIVATEAPP message.
I do not know which framework you use to create your windows, so I can't tell how to concretely handle this message.
After looking at the MSDN best practices page and re-working my code to reflect all of the practices described, the warning has disappeared. I hope this helps anyone else that has the same problem.
Also, thanks to Hans Passant for the link. I already fixed it by the time you posted it, but thanks anyways.

A single MFC/Win32 control seems to be making my whole desktop repaint

I have a custom control, which owns an edit box and moves it around, etc. The edit-box is typically modified with a wodge of code like this:
edit.MoveWindow( &rc );
edit.SetWindowText( text );
edit.SetLimitText( N );
edit.ShowWindow(SW_SHOW);
edit.SetFocus();
edit.SetSel(0, CB_ERR);
RECT rc is in coordinates local to the custom control, edit is created with the custom control as parent. I'm not even sure this is definitely the problem, but when triggering this code sometimes it is nice and smooth, other times my entire desktop appears to flicker like it's being redrawn. I can't see I'm explicitly calling Invalidate(Rect) anywhere.
Any ideas?
It's not going to be any of the code that you are showing us. A whole desktop flash is nearly always somewhere in your code that is calling InvalidateRect(NULL,...) so keep digging.
Several of these calls will result in messages being sent to the parent window of the edit, most likely the InvalidateRect is happening while handling that message.
If I was a betting man, I'd bet on the SetFocus() call as the one that's triggering the repaint.

Managing Window Z-Order Like Photoshop CS

So I've got an application whose window behavior I would like to behave more like Photoshop CS. In Photoshop CS, the document windows always stay behind the tool windows, but are still top level windows. For MDI child windows, since the document window is actually a child, you can't move it outside of the main window. In CS, though, you can move your image to a different monitor fine, which is a big advantage over docked applications like Visual Studio, and over regular MDI applications.
Anyway, here's my research so far. I've tried to intercept the WM_MOUSEACTIVATE message, and use the DeferWindowPos commands to do my own ordering of the window, and then return MA_NOACTIVATEANDEAT, but that causes the window to not be activated properly, and I believe there are other commands that "activate" a window without calling WM_MOUSEACTIVATE (like SetFocus() I think), so that method probably won't work anyway.
I believe Windows' procedure for "activating" a window is
1. notify the unactivated window with the WM_NCACTIVATE and WM_ACTIVATE messages
2. move the window to the top of the z-order (sending WM_POSCHANGING, WM_POSCHANGED and repaint messages)
3. notify the newly activated window with WM_NCACTIVATE and WM_ACTIVATE messages.
It seems the cleanest way to do it would be to intercept the first WM_ACTIVATE message, and somehow notify Windows that you're going to override their way of doing the z-ordering, and then use the DeferWindowPos commands, but I can't figure out how to do it that way. It seems once Windows sends the WM_ACTIVATE message, it's already going to do the reordering its own way, so any DeferWindowPos commands I use are overridden.
Right now I've got a basic implementation quasy-working that makes the tool windows topmost when the app is activated, but then makes them non-topmost when it's not, but it's very quirky (it sometimes gets on top of other windows like the task manager, whereas Photoshop CS doesn't do that, so I think Photoshop somehow does it differently) and it just seems like there would be a more intuitive way of doing it.
Anyway, does anyone know how Photoshop CS does it, or a better way than using topmost?
I havn't seen anything remarkable about Photoshop CS that requries anything close to this level of hacking that can't instead be done simply by specifying the correct owner window relationships when creating windows. i.e. any window that must be shown above some other window specifies that window as its owner when being created - if you have multiple document windows, each one gets its own set of owned child windows that you can dynamically show and hide as the document window gains and looses activation.
You can try handle WM_WINDOWPOSCHANGING event to prevent overlaping another windows (with pseudo-topmost flag). So you are avoiding all problems with setting/clearing TopMost flag.
public class BaseForm : Form
{
public virtual int TopMostLevel
{
get { return 0; }
}
[DllImport("user32.dll")]
[return: MarshalAs(UnmanagedType.Bool)]
static extern bool EnumThreadWindows(uint dwThreadId, Win32Callback lpEnumFunc, IntPtr lParam);
/// <summary>
/// Get process window handles sorted by z order from top to bottom.
/// </summary>
public static IEnumerable<IntPtr> GetWindowsSortedByZOrder()
{
List<IntPtr> handles = new List<IntPtr>();
EnumThreadWindows(GetCurrentThreadId(),
(hWnd, lparam) =>
{
handles.Add(hWnd);
return true;
}, IntPtr.Zero);
return handles;
}
protected override void WndProc(ref Message m)
{
if (m.Msg == (int)WindowsMessages.WM_WINDOWPOSCHANGING)
{
//Looking for Window at the bottom of Z-order, but with TopMostLevel > this.TopMostLevel
foreach (IntPtr handle in GetWindowsSortedByZOrder().Reverse())
{
var window = FromHandle(handle) as BaseForm;
if (window != null && this.TopMostLevel < window.TopMostLevel)
{
//changing hwndInsertAfter field in WindowPos structure
if (IntPtr.Size == 4)
{
Marshal.WriteInt32(m.LParam, IntPtr.Size, window.Handle.ToInt32());
}
else if (IntPtr.Size == 8)
{
Marshal.WriteInt64(m.LParam, IntPtr.Size, window.Handle.ToInt64());
}
break;
}
}
}
base.WndProc(ref m);
}
}
public class FormWithLevel1 : BaseForm
{
public override int TopMostLevel
{
get { return 1; }
}
}
So, FormWithLevel1 will be always over any BaseForm. You can add any number of Z-order Levels. Windows on the same level behave as usual, but will be always under Windows with level Current+1 and over Windows with level Current-1.
Not being familiar to Photoshop CS it is bit hard to know exactly what look and feel you are trying to achieve.
But I would have thought if you created a modeless dialog window as you tool window and you made sure it had the WS_POPUP style then the resulting tool window would not be clipped to the main parent window and Windows would automatically manage the z-Order making sure that the tool window stayed on top of the parent window.
And as the tool window dialog is modeless it would not interfere with the main window.
Managing Window Z-Order Like Photoshop CS
You should create the toolwindow with the image as the parent so that windows manage the zorder. There is no need to set WS_POPUP or WS_EX_TOOLWINDOW. Those flags only control the rendering of the window.
Call CreateWindowEx with the hwnd of the image window as the parent.
In reply to Chris and Emmanuel, the problem with using the owner window feature is that a window can only be owned by one other window, and you can't change who owns a window. So if tool windows A and B always need to be on top of document windows C and D, then when doc window C is active, I want it to own windows A and B so that A and B will always be on top of it. But when I activate doc window D, I would have to change ownership of tool windows A and B to D, or else they will go behind window D (since they're owned by window C). However, Windows doesn't allow you to change ownership of a window, so that option isn't available.
For now I've got it working with the topmost feature, but it is a hack at best. I do get some consolation in the fact that GIMP has tried themselves to emulate Photoshop with their version 2.6, but even their implementation occasionally acts quirky, which leads me to believe that their implementation was a hack as well.
Have you tried to make the tool windows topmost when the main window receives focus, and non-topmost when it loses focus? It sounds like you've already started to look at this sort of solution... but much more elaborate.
As a note, it seems to be quite well documented that tool windows exhibit unexpected behavior when it comes to z-ordering. I haven't found anything on MSDN to confirm it, but it could be that Windows manages them specially.
I would imagine they've, since they're not using .NET, rolled their own windowing code over the many years of its existence and it is now, like Amazon's original OBIDOS, so custom to their product that off-the-shelf (aka .NET's MDI support) just aren't going to come close.
I don't like answering without a real answer, but likely you'd have to spend a lot of time and effort to get something similar if Photoshop-like is truly your goal. Is it worth your time? Just remember many programmers over many years and versions have come together to get Photoshop's simple-seeming windowing behavior to work just right and to feel natural to you.
It looks like you're already having to delve pretty deep into Win32 API functions and values to even glimpse at a "solution" and that should be your first red flag. Is it eventually possible? Probably. But depending on your needs and your time and a lot of other factors only you could decide, it may not be practical.