We see that the application hangs when we try to close it by sending WM_CLOSE to all of its windows. Note that WM_CLOSE is sent by a different application. We use EnumChildWindows to enumerate all windows, compare each window's process ID (obtained with GetWindowThreadProcessId) with the process ID of the application that needs to be closed, and send WM_CLOSE to every window that belongs to that process. From the dump, we see that the process is waiting on some handles. However, we are not sure which handle it is waiting on.
Call Stack:
ntdll.dll!NtWaitForMultipleObjects() Unknown
KERNELBASE.dll!WaitForMultipleObjectsEx() Unknown
KERNELBASE.dll!WaitForMultipleObjects() Unknown
CoreMessaging.dll!Microsoft::CoreUI::Messaging::MessageSession::WaitOnHandleCollection(struct Microsoft::CoreUI::Support::Win32Handle *,unsigned int) Unknown
CoreMessaging.dll!Microsoft::CoreUI::Messaging::MessageSession::ProcessPendingAlpcConnections() Unknown
CoreMessaging.dll!Microsoft::CoreUI::Messaging::MessageSession::OnFinalRelease() Unknown
CoreMessaging.dll!Cn::Com::ExportAdapter::Destroy(void) Unknown
CoreMessaging.dll!Cn::Com::ExportAdapter::Release() Unknown
TextInputFramework.dll!CTextInputClientFreeThread::~CTextInputClientFreeThread(void) Unknown
TextInputFramework.dll!CTextInputClientFreeThread::`vector deleting destructor'(unsigned int) Unknown
TextInputFramework.dll!Microsoft::WRL::Details::RuntimeClassImpl<struct Microsoft::WRL::RuntimeClassFlags<2>,1,0,0,struct ITextInputClient,struct IRemoteTextInputClient,struct IInputLanguageProvider,struct IKeyEventProcessor,struct IMessageProxyReconnectAdapterOwner,struct ITextInputClientInternal,struct IConnectionMonitor>::Release(void) Unknown
msctf.dll!CTextInputClientWrapper::~CTextInputClientWrapper(void) Unknown
msctf.dll!CTextInputClientWrapper::Release() Unknown
msctf.dll!ATL::AtlComPtrAssign(struct IUnknown * *,struct IUnknown *) Unknown
msctf.dll!OnTextInputClientWrapperReleased(void) Unknown
msctf.dll!CTextInputClientWrapper::Release() Unknown
TextInputFramework.dll!Microsoft::WRL::ComPtr<struct IInputLanguageProvider>::InternalRelease(void) Unknown
TextInputFramework.dll!TextInputHost::~TextInputHost() Unknown
TextInputFramework.dll!TextInputHost::`vector deleting destructor'(unsigned int) Unknown
TextInputFramework.dll!Microsoft::WRL::Details::RuntimeClassImpl<struct Microsoft::WRL::RuntimeClassFlags<2>,1,0,0,struct IRemoteTextInputHost,struct ITextInputHost,struct IMessageProxyReconnectAdapterOwner,struct IMessageProxyListener>::Release(void) Unknown
msctf.dll!ATL::AtlComPtrAssign(struct IUnknown * *,struct IUnknown *) Unknown
msctf.dll!CThreadInputMgr::Suspend() Unknown
msctf.dll!CThreadInputMgr::OnActivationChange() Unknown
msctf.dll!CThreadInputMgr::Deactivate() Unknown
msctf.dll!CicBridge::DeactivateIMMX() Unknown
msctf.dll!_CtfImeDestroyThreadMgr() Unknown
msctf.dll!CtfImeDestroyThreadMgr() Unknown
imm32.dll!ActivateOrDeactivateTIM() Unknown
msctf.dll!TF_Notify() Unknown
user32.dll!CtfHookProcWorker(int,unsigned __int64,__int64,unsigned __int64) Unknown
user32.dll!CallHookWithSEH(struct _GENERICHOOKHEADER *,void *,unsigned long *,unsigned __int64) Unknown
user32.dll!fnHkINDWORD() Unknown
ntdll.dll!KiUserCallbackDispatcherContinue() Unknown
win32u.dll!NtUserMessageCall() Unknown
user32.dll!RealDefWindowProcWorker() Unknown
user32.dll!DefWindowProcW() Unknown
user32.dll!ImeWndProcWorker() Unknown
user32.dll!ImeWndProcW(struct HWND *,unsigned int,unsigned __int64,__int64) Unknown
user32.dll!UserCallWinProcCheckWow() Unknown
user32.dll!DispatchMessageWorker() Unknown
Any idea how to debug this issue? Is there any logging that could help identify the problem?
First, strictly speaking, sending a WM_CLOSE message to a window doesn't necessarily destroy it. The message is handled by the window's handler (its window procedure), which may, but doesn't have to, destroy the window.
Second, do NOT try to destroy child windows created by an application. It may not expect this, and may not work properly (may crash). You should only destroy the top-level window.
And last but not least, sending a message to a window that belongs to another thread (and, in your case, another process) will BLOCK your thread until the thread that handles messages for that window processes it. If that thread never processes messages, you will be blocked forever.
In addition, if that thread waits for yours (for instance, it could also send a message to your thread), then you have a deadlock.
If your goal is to "ask" another application to close, then a conventional way to do this is to find the target thread (what you already did), and then post (not send!!!) a WM_QUIT message to it.
That is, call PostThreadMessage with uMsg == WM_QUIT. But note: the target application may, but strictly speaking doesn't have to, quit.
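The difference between sending and posting can be illustrated with a toy model. The sketch below is not the Win32 API: ToyQueue, post, send and pumpOne are hypothetical names chosen for illustration. It only mimics the behaviour described above, where a post returns at once while a send waits for the owning thread to handle the message. The timeout on send is an added assumption, loosely analogous to what SendMessageTimeout offers, so that the caller cannot block forever.

```cpp
#include <chrono>
#include <condition_variable>
#include <deque>
#include <functional>
#include <memory>
#include <mutex>
#include <utility>

// Toy stand-in for a thread's message queue (hypothetical, not Win32).
class ToyQueue {
public:
    // Like posting: enqueue the message and return immediately.
    void post(std::function<void()> handler) {
        std::lock_guard<std::mutex> lock(m_);
        q_.push_back({std::move(handler), nullptr});
    }

    // Like sending with a timeout: enqueue, then wait until the owning
    // thread has handled the message. Returns false if the owner does not
    // pump the queue in time (the situation that hangs a plain send).
    bool send(std::function<void()> handler, std::chrono::milliseconds timeout) {
        auto done = std::make_shared<bool>(false);
        std::unique_lock<std::mutex> lock(m_);
        q_.push_back({std::move(handler), done});
        return cv_.wait_for(lock, timeout, [&] { return *done; });
    }

    // Called by the owning thread: handle one pending message, if any.
    bool pumpOne() {
        Item item;
        {
            std::lock_guard<std::mutex> lock(m_);
            if (q_.empty()) return false;
            item = std::move(q_.front());
            q_.pop_front();
        }
        item.handler();  // run the handler outside the lock
        if (item.done) {
            std::lock_guard<std::mutex> lock(m_);
            *item.done = true;  // wake a sender that is still waiting
            cv_.notify_all();
        }
        return true;
    }

private:
    struct Item {
        std::function<void()> handler;
        std::shared_ptr<bool> done;  // null for posted messages
    };
    std::mutex m_;
    std::condition_variable cv_;
    std::deque<Item> q_;
};
```

If nothing ever calls pumpOne, post still returns immediately, while send can only give up after its timeout. A send without a timeout would wait indefinitely in that situation.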
My project uses QTcpSocket and the function setSocketDescriptor(). The code is quite ordinary:
QTcpSocket *socket = new QTcpSocket();
socket->setSocketDescriptor(this->m_socketDescriptor);
This code worked fine most of the time, until I ran a performance test on Windows Server 2016 and the crash occurred. I debugged the crash dump; here is the stack:
0000004f`ad1ff4e0 : ucrtbase!abort+0x4e
00000000`6ed19790 : Qt5Core!qt_logging_to_console+0x15a
000001b7`79015508 : Qt5Core!QMessageLogger::fatal+0x6d
0000004f`ad1ff0f0 : Qt5Core!QEventDispatcherWin32::installMessageHook+0xc0
00000000`00000000 : Qt5Core!QEventDispatcherWin32::createInternalHwnd+0xf3
000001b7`785b0000 : Qt5Core!QEventDispatcherWin32::registerSocketNotifier+0x13e
000001b7`7ad57580 : Qt5Core!QSocketNotifier::QSocketNotifier+0xf9
00000000`00000001 : Qt5Network!QLocalSocket::socketDescriptor+0x4cf7
00000000`00000000 : Qt5Network!QAbstractSocket::setSocketDescriptor+0x256
In the stderr log, I see these messages:
CreateWindow() for QEventDispatcherWin32 internal window failed (Not enough storage is available to process this command.)
Qt: INTERNAL ERROR: failed to install GetMessage hook: 8, Not enough storage is available to process this command.
Here is the function in the Qt codebase where the code stopped:
void QEventDispatcherWin32::installMessageHook()
{
    Q_D(QEventDispatcherWin32);
    if (d->getMessageHook)
        return;
    // setup GetMessage hook needed to drive our posted events
    d->getMessageHook = SetWindowsHookEx(WH_GETMESSAGE, (HOOKPROC) qt_GetMessageHook, NULL, GetCurrentThreadId());
    if (Q_UNLIKELY(!d->getMessageHook)) {
        int errorCode = GetLastError();
        qFatal("Qt: INTERNAL ERROR: failed to install GetMessage hook: %d, %s",
               errorCode, qPrintable(qt_error_string(errorCode)));
    }
}
I did some research: the error "Not enough storage is available to process this command." may mean that the OS (Windows) does not have enough resources to process this function (SetWindowsHookEx), so it fails to create the hook. Qt then raises a fatal error, and finally my app is killed.
I tested this on Windows Server 2019: the app works fine and no crashes appear.
I just want to know more about the meaning of the error message (stderr), because I don't really know what "Not enough storage" refers to. Is it perhaps a limit or a bug of Windows Server 2016? If so, is there any way to overcome this issue on Windows Server 2016?
The error 'Not enough storage is available to process this command' usually occurs on Windows servers when a registry value is set incorrectly, or when the configuration has not been set correctly after a recent reset or reinstallation.
Below is a verified procedure for this issue:
Click Start > Run, type regedit, and press Enter.
Find the key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\LanmanServer\Parameters
Locate IRPStackSize.
If this value does not exist, right-click the Parameters key, click New > DWORD Value, and type IRPStackSize as the name.
The name of the value must be exactly the same as what I have above, including the combination of uppercase and lowercase letters.
Right-click IRPStackSize and click Modify.
Select Decimal, enter a value higher than 15 (the maximum value is 50 decimal), and click OK.
Close the registry editor and restart your computer.
After researching for a few days, I was finally able to configure a Windows Server 2016 setting (in the registry) to prevent the crash.
So basically it is a limitation of the OS itself, called the desktop heap limitation.
https://learn.microsoft.com/en-us/troubleshoot/windows-server/performance/desktop-heap-limitation-out-of-memory
(The funny thing is that the error message is "Not enough storage is available to process this command", but the real problem comes down to the desktop heap limitation.)
So for the solution, follow the steps in this link: https://learn.microsoft.com/en-us/troubleshoot/system-center/orchestrator/increase-maximum-number-concurrent-policy-instances
I increased the third parameter of SharedSection to 2048, and it fixed the issue.
Summary steps:
Desktop Heap for the non-interactive desktops is identified by the third parameter of the SharedSection= segment of the following registry value:
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\Session Manager\SubSystems\Windows
The default data for this registry value will look something like the following:
%SystemRoot%\system32\csrss.exe ObjectDirectory=\Windows SharedSection=1024,3072,512 Windows=On SubSystemType=Windows ServerDll=basesrv,1 ServerDll=winsrv:UserServerDllInitialization,3 ServerDll=winsrv:ConServerDllInitialization,2 ProfileControl=Off MaxRequestThreads=16
The value to be entered into the Third Parameter of the SharedSection= segment should be based on the calculation of:
(number of desired concurrent policies) * 10 = (third parameter value)
Example: If you want 200 concurrent policy instances, then 200 * 10 = 2000; rounding up to a nice memory number gives you 2048 as the third parameter, resulting in the following update to the registry value:
SharedSection=1024,3072,2048
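The calculation above can be sketched in a few lines of C++. This is only an illustration: thirdSharedSectionValue and withThirdSharedSection are hypothetical helper names, and "a nice memory number" is assumed here to mean the next power of two, which the quoted steps do not state explicitly. The helper edits the string only; writing it back to the registry is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Third SharedSection parameter for a desired number of concurrent policy
// instances, using the "instances * 10" rule quoted above.
// ASSUMPTION: the result is rounded up to the next power of two.
std::uint64_t thirdSharedSectionValue(std::uint32_t instances) {
    const std::uint64_t needed = static_cast<std::uint64_t>(instances) * 10;
    std::uint64_t rounded = 1;
    while (rounded < needed) rounded *= 2;
    return rounded;
}

// Returns `value` with the third number of "SharedSection=a,b,c" replaced.
// The input is returned unchanged if no such segment is found.
std::string withThirdSharedSection(const std::string& value, std::uint64_t third) {
    const std::string key = "SharedSection=";
    const std::size_t keyPos = value.find(key);
    if (keyPos == std::string::npos) return value;

    // The segment runs from just after '=' to the next space (or the end).
    const std::size_t segBegin = keyPos + key.size();
    std::size_t segEnd = value.find(' ', segBegin);
    if (segEnd == std::string::npos) segEnd = value.size();
    const std::string segment = value.substr(segBegin, segEnd - segBegin);

    const std::size_t firstComma = segment.find(',');
    if (firstComma == std::string::npos) return value;
    const std::size_t secondComma = segment.find(',', firstComma + 1);
    if (secondComma == std::string::npos) return value;

    // Keep "a,b," and replace everything after the second comma.
    return value.substr(0, segBegin) + segment.substr(0, secondComma + 1) +
           std::to_string(third) + value.substr(segEnd);
}
```

For the example above, thirdSharedSectionValue(200) gives 2048, and applying withThirdSharedSection to a value containing SharedSection=1024,3072,512 yields SharedSection=1024,3072,2048 with the rest of the string untouched.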
I am compiling code with an FLTK interface. I need to be able to fork a callback, because it takes parameters from a form and launches work when 'run' is hit.
I cannot use a fork at the start of the function to have one thread return to the UI instantly. It is said that XInitThreads() is called without arguments and returns zero on failure; any other value means success.
My check doesn't show XInitThreads returning 0, so this part is working. Yet I still get an error:
[xcb] Unknown sequence number while processing queue
[xcb] Most likely this is a multi-threaded client and XInitThreads has not been called
[xcb] Aborting, sorry about that.
rc: ../../src/xcb_io.c:260: poll_for_event: Assertion `!xcb_xlib_threads_sequence_lost' failed.
And this appears twice, once per launched thread.
I check the call with:
if(XInitThreads() == 0)
{
    fprintf(stderr, "Warning ! No forking available.\n");
}
The warning doesn't appear.
Using Ubuntu 20.10
g++
FLTK-1.1
ARCH amd64
// Using std::thread.
Fl_Button b;
b.callback(wrapper, data);
void wrapper(Fl_Widget *w, void *v) {
    std::thread th(task, v);  // run the work on a new thread
    th.detach();              // the thread is never joined
}
The wrapper function starts a thread, and we don't join it inside the callback function, or at all.
To debug a locked file problem, we're calling Sysinternals' Handle64.exe 4.11 from a .NET process (via Process.Start with asynchronous output redirection). The calling process hangs on Process.WaitForExit because the Handle64 process doesn't exit (for more than two hours).
We took a dump of the corresponding Handle64 process and checked it in the Visual Studio 2017 debugger. It shows two threads ("Main Thread" and "ntdll.dll!TppWorkerThread").
Main thread's call stack:
ntdll.dll!NtWaitForSingleObject () Unknown
ntdll.dll!LdrpDrainWorkQueue() Unknown
ntdll.dll!RtlExitUserProcess() Unknown
kernel32.dll!ExitProcessImplementation () Unknown
handle64.exe!000000014000664c() Unknown
handle64.exe!00000001400082a5() Unknown
kernel32.dll!BaseThreadInitThunk () Unknown
ntdll.dll!RtlUserThreadStart () Unknown
Worker thread's call stack:
ntdll.dll!NtWaitForSingleObject() Unknown
ntdll.dll!LdrpDrainWorkQueue() Unknown
ntdll.dll!LdrpInitializeThread() Unknown
ntdll.dll!_LdrpInitialize() Unknown
ntdll.dll!LdrInitializeThunk() Unknown
My question is: Why would a process hang in LdrpDrainWorkQueue? From https://stackoverflow.com/a/42789684/62838, I gather that this is the Windows 10 parallel loader at work, but why would it get stuck while exiting the process? Can this be caused by how we invoke Handle64 from another process? I.e., are we doing something wrong or is this rather a bug in Handle64?
How long did you wait?
According to this analysis,
The worker thread idle timeout is set to 30 seconds. Programs which
execute in less than 30 seconds will appear to hang due to
ntdll!TppWorkerThread waiting for the idle timeout before the process
terminates.
I would recommend setting the registry key specified in that article to disable the parallel loader, and seeing whether this resolves the issue.
Parent Key: HKLM\Software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\handle64.exe
Value Name: MaxLoaderThreads
Type: DWORD
Value: 1 to disable
I called WSARecv(), which returned WSA_IO_PENDING. I then sent an RST packet from the other end. GetQueuedCompletionStatus(), which runs in another thread, returned FALSE as expected, but when I called WSAGetLastError() I got 64 instead of WSAECONNRESET.
So why did WSAGetLastError() not return WSAECONNRESET?
Edit:
I forgot to mention that when I call WSAGetLastError() directly after a failing WSARecv() (because of an RST packet being received), the error code returned is WSAECONNRESET and not 64.
So it looks like the error code returned depends on whether WSARecv() has failed directly after calling it, or has failed later when retrieving a completion packet.
This is a generic issue with IOCP: you are making a low-level call to the TCP/IP driver stack, which, as all drivers do in Windows, reports failure with NTSTATUS error codes. The expected error here is STATUS_CONNECTION_RESET.
These native error codes need to be translated to a winapi error code. This translation is normally context-sensitive; it depends on which winapi library issued the driver command. In other words, you can only ever get a WSAECONNRESET error back if it was the Winsock library that did the translation. But that's not what happened in your program: it was GetQueuedCompletionStatus() that handled the error.
That is a generic helper function that handles IOCP for any device driver. There is no context; the OVERLAPPED structure is not nearly enough to indicate how the I/O request got started. Turn to this KB article, which documents the default mapping from NTSTATUS error codes to winapi error codes, the mapping that GetQueuedCompletionStatus() uses. Relevant entries in the list are:
STATUS_NETWORK_NAME_DELETED ERROR_NETNAME_DELETED
STATUS_LOCAL_DISCONNECT ERROR_NETNAME_DELETED
STATUS_REMOTE_DISCONNECT ERROR_NETNAME_DELETED
STATUS_ADDRESS_CLOSED ERROR_NETNAME_DELETED
STATUS_CONNECTION_DISCONNECTED ERROR_NETNAME_DELETED
STATUS_CONNECTION_RESET ERROR_NETNAME_DELETED
These were, ahem, not fantastic choices. Probably goes back to very early Windows, back when Lanman was the network layer of choice. WSAGetLastError() is pretty powerless to map ERROR_NETNAME_DELETED back to a WSA specific error, the NTSTATUS code was lost when GetQueuedCompletionStatus() set the "last error" code for the thread. So it doesn't, it just returns what it can.
What you'd expect is a WSAGetQueuedCompletionStatus() function so this error translation can happen correctly, using Winsock rules. There isn't one. These days I prefer to use the ultimate authority on how to write Windows code properly, the .NET Framework source as available from the Reference Source. I linked to the source for SocketAsyncEventArgs.CompletionCallback() method. Which contains the key:
// The Async IO completed with a failure.
// here we need to call WSAGetOverlappedResult() just so Marshal.GetLastWin32Error() will return the correct error.
bool success = UnsafeNclNativeMethods.OSSOCK.WSAGetOverlappedResult(
    m_CurrentSocket.SafeHandle,
    m_PtrNativeOverlapped,
    out numBytes,
    false,
    out socketFlags);
socketError = (SocketError)Marshal.GetLastWin32Error();
Or in other words, you have to make an extra call to WSAGetOverlappedResult() to get the proper return value from GetLastError(). This is not very intuitive :)
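Why the generic error code cannot be mapped back can be shown with a small table. The sketch below is only an illustration built from the entries listed above: defaultTranslations and possibleCauses are hypothetical helper names, and the NTSTATUS codes are kept as names rather than numeric values. The two constants are the documented values of ERROR_NETNAME_DELETED (64) and WSAECONNRESET (10054).

```cpp
#include <string>
#include <utility>
#include <vector>

constexpr int kErrorNetnameDeleted = 64;  // ERROR_NETNAME_DELETED
constexpr int kWsaEConnReset = 10054;     // WSAECONNRESET, never produced by this table

// The default NTSTATUS -> winapi translations quoted above, by name.
const std::vector<std::pair<std::string, int>>& defaultTranslations() {
    static const std::vector<std::pair<std::string, int>> table = {
        {"STATUS_NETWORK_NAME_DELETED", kErrorNetnameDeleted},
        {"STATUS_LOCAL_DISCONNECT", kErrorNetnameDeleted},
        {"STATUS_REMOTE_DISCONNECT", kErrorNetnameDeleted},
        {"STATUS_ADDRESS_CLOSED", kErrorNetnameDeleted},
        {"STATUS_CONNECTION_DISCONNECTED", kErrorNetnameDeleted},
        {"STATUS_CONNECTION_RESET", kErrorNetnameDeleted},
    };
    return table;
}

// Every NTSTATUS name in the table that could have produced `winapiError`.
std::vector<std::string> possibleCauses(int winapiError) {
    std::vector<std::string> names;
    for (const auto& entry : defaultTranslations()) {
        if (entry.second == winapiError) names.push_back(entry.first);
    }
    return names;
}
```

possibleCauses(kErrorNetnameDeleted) returns all six names, so error 64 alone cannot tell a reset apart from an ordinary disconnect. In native code, the counterpart of the C# snippet above is to call WSAGetOverlappedResult() on the socket and OVERLAPPED in question and then read WSAGetLastError().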
We are building an RTB (real-time bidding) platform. We use nginx as the HTTP server; the bidder is written in Lua, with Google Protocol Buffers for serializing data and Zlog for logs. After test runs, we got three error messages in the nginx error log:
"[libprotobuf Error, google/protobuf/wire_format.cc:1053]
String field contains invalid UTF-8 data when parsing a protocol buffer.
Use the 'bytes' type if you intend to send raw bytes."
So we went back to check the Protocol Buffers source code and found that this check is controlled by a macro (-DNDEBUG; according to the comment, it means NOT debug mode?). We think -DNDEBUG disables GOOGLE_PROTOBUF_UTF8_VALIDATION. So we enabled this macro (-DNDEBUG) in the configuration. However, after testing, we still got the same error message. We then changed all the "string" types to "bytes" in XXX.proto. After testing, the same error message still showed up.
worker process 53574 exited on signal 11(core dumped),then process died.
lua entry thread aborted: runtime error:/home/bilin/rtb/src/lua/shared/log.lua:34: 'short' is not callable"
Hope somebody can help us solve these problems.
Thank you.