I have a bug I am chasing (I think it's a deadlock). When I run the code it hangs without the debugger flagging an error, so after a while I try pressing the pause (break all) button. The debugger then reports "The process appears to be deadlocked...". I can then see that all the threads are held up at lines saying EnterCriticalSection, except for one which is already inside a critical section. When I look at the thread that is inside the C.S. with the debugger, I see a green arrow accompanied by a tiny blue circle pointing at a line with GetWindowText, as below:
// stuff A
{
GetWindowText(editwin[a].child_window_handle,existing_text,MAX_TEXT_SIZE-1);
}
// stuff B
If I hover the mouse over the green arrow I see the text "this is the next statement to execute when this thread returns from the current function". Now this has stumped me because I don't know whether it means that it is stuck inside "stuff A" and is waiting to come back, or that it is stuck inside GetWindowText. The arguments to GetWindowText all look sensible to me. If I click on "step into" I get the message "Unable to step. The process has been soft broken".
EDIT: stuff A is in fact the statement:
if (buf_ptr != NULL)
Usually a green arrow beside a line of code means "this is the next line that would be executed, if not for the fact we're stuck somewhere in a deeper stack frame." However, VS makes it impossible to say for sure based on the info provided so far...
[EDIT - of course, deep knowledge of Win32 can provide a very good guess - see the answer by "mos" for a likely explanation based on the GetWindowText() API's known pitfalls]
As mentioned, what Visual Studio shows you is sometimes misleading. To get a closer view of exactly what is happening you need to turn off some non-helpful "features" that VS enables by default. In Tools -> Options -> Debugging -> General, make sure:
Enable address-level debugging = ON
Enable Just My Code = OFF
Enable Source Server support = ON
This should allow you to:
1) break on / step over / etc the exact instruction that's causing the deadlock
2) see the full stack trace up to that point, regardless of module(s)
3) see source code whenever available, assuming your symbol & source servers are configured correctly
Your problem is that GetWindowText actually sends a message (WM_GETTEXT) to the window and waits for the reply. If that window is owned by another thread that is itself waiting for the critical section, GetWindowText will wait forever.
You're stuck inside GetWindowText, and have created a deadlock.
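The shape of this deadlock can be sketched in portable C++, without any Win32 calls. This is only a model under stated assumptions: `std::timed_mutex` stands in for the critical section, `window_thread_answers` is a hypothetical stand-in for the window-owning thread handling the text request, and the 50 ms timeout exists only so the sketch terminates instead of hanging.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <future>
#include <mutex>

// Stand-in for the thread that owns the window: to answer a text request it
// first needs the lock. A real EnterCriticalSection would block forever; the
// timeout here only exists so the sketch terminates and can report the outcome.
bool window_thread_answers(std::timed_mutex& cs) {
    if (!cs.try_lock_for(std::chrono::milliseconds(50))) {
        return false;  // never got the lock: the synchronous caller would hang
    }
    cs.unlock();
    return true;
}

// Problem pattern: make the synchronous request while still holding the lock.
bool request_while_holding_lock(std::timed_mutex& cs) {
    std::lock_guard<std::timed_mutex> guard(cs);
    auto reply = std::async(std::launch::async, window_thread_answers, std::ref(cs));
    return reply.get();  // blocks until the "window thread" finishes
}

// Safer pattern: finish the work that needs the lock, release it, then ask.
bool request_after_releasing_lock(std::timed_mutex& cs) {
    {
        std::lock_guard<std::timed_mutex> guard(cs);
        // ... read or update the shared state here ...
    }
    auto reply = std::async(std::launch::async, window_thread_answers, std::ref(cs));
    return reply.get();
}
```

In the first function the "window thread" can never get the lock, so the caller never gets an answer. In the real program the equivalent fix would be to leave the critical section before calling GetWindowText, or to avoid taking that critical section on the thread that owns the window.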
As the previous responses suggest, your code is stuck inside "Stuff A".
Can I suggest another tool for your tool-belt?
I usually find it much easier to debug native synchronization problems using WinDbg.
Just launch your program in WinDbg, point it to the correct symbols, and all the info will be right there for your investigation using the !locks, !cs and k commands.
If you're new to WinDbg, you'll find that the internet is full of information about it. I recommend reading Advanced Windows Debugging as well.
It's a little difficult to get started with compared to the user-friendly VS debugger, but every minute you invest in learning how to use it will save you hours of debugging further down the road.
Assuming your question is "Is this normal", then yes, the debugger usually shows the statement after the one stuck on a critical section.
My application started life as a C++ console application in VS2019. The code was provided as part of an SDK. It worked perfectly, with a great response from the manufacturer's USB device. Later, I wanted to graduate it to a GUI application, much as I've been doing in VB and C#. Lo and behold, I managed to reconstruct the application in both Qt and Win32, but I'm running into a situation where the application becomes unresponsive and I have no way to tell what's going on.
In the Console application, I have to execute this code to interface with the device AFTER sending a "TakeMeasurement" command :
if (SDK_SUCCESSFUL(sdkError)) {
printf("\nWaiting for measurement to complete...\n");
while (!isMeasureWait) {
if (isDisConnect) break;
this_thread::sleep_for(chrono::milliseconds(1000));
}
}
This code works like a charm! After one or two iterations, the device has completed the measurement and I can get to the data easily.
On the Win32 side, I use the exact same code. Only, once control enters the loop, it never returns.
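One way to keep a loop like this from hanging indefinitely while diagnosing the problem is to bound it with a timeout. The sketch below is an assumption-laden illustration, not the SDK's method: it assumes the two flags can be declared as `std::atomic<bool>` (so that a write from another thread is reliably seen), and `wait_for_flag` is a hypothetical helper name.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Polls `done` until it becomes true, `cancelled` becomes true, or `timeout`
// elapses. Returns true only if `done` was observed before giving up.
bool wait_for_flag(const std::atomic<bool>& done,
                   const std::atomic<bool>& cancelled,
                   std::chrono::milliseconds timeout,
                   std::chrono::milliseconds poll = std::chrono::milliseconds(10)) {
    assert(poll.count() > 0);
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (!done.load()) {
        if (cancelled.load()) return false;  // e.g. the device disconnected
        if (std::chrono::steady_clock::now() >= deadline) return false;
        std::this_thread::sleep_for(poll);
    }
    return true;
}
```

A false return then at least distinguishes "the signal never arrived" from "the application froze", and the thread that called it gets control back.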
Any idea how I could diagnose the error? I have the impression that the "timing" is critical, between the exact moment where the Measurement command is initiated to the exact moment the instrument signals that it's done, and the data ready to be picked up.
My naive hypothesis is that, in debug mode on both 'platforms', I must be getting some timing differences? Sadly, I can't get more information from the manufacturer in this regard but I suspect I have a small window of time within which the instrument's response can be acted on? And I begin to suspect that, on Win32, that "time" is too long? Compared to on the Console side?
I was thinking of, perhaps, "measuring" that time, in milliseconds? First, on the Console side, to see what kind of delay "works", and then, to see how the delay compares with the Win32 side.
I may be wasting my time and I sure don't mean to waste yours.
How would I go about getting an idea of the time elapsed in a C++ application? I'll take a look around VS2019; it has all kinds of "performance" tools that pop up at run time.
Any help is appreciated.
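For the elapsed-time part of the question, a minimal sketch using the standard `std::chrono::steady_clock` could look like this. `elapsed_ms` is a hypothetical helper name; it simply times whatever callable it is given.

```cpp
#include <cassert>
#include <chrono>
#include <thread>
#include <utility>

// Runs `work` once and returns how long it took, in whole milliseconds.
// steady_clock is used because it never jumps backwards.
template <typename F>
long long elapsed_ms(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<F>(work)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}
```

Wrapping the code that runs from sending the measurement command to the end of the wait loop in a lambda and passing it to `elapsed_ms` in both the console and the Win32 build would give two numbers to compare.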
I am not sure I completely understand what is going on.
Execution of the thread wait loop was not the culprit.
I'm not 100% sure but what happens is that, in my 'Export data to CSV TEXT file', if I tried to execute the call to :
SetWindowText(hEditMeasure, wMeasurements);
The application always hung. I placed breakpoints right before the call in the code to trace execution, and it did not strike me at first, but in the VS toolbar there was a "thread" combo box showing the value DEVICE.DLL and, to its right, the name of my Export function as the stack frame. While searching for additional information on the SetWindowText function, I came across the advice to use WM_SETTEXT when sending to a "different application". Could it be that I was unknowingly sending a message to "another thread", the DLL thread, and that's why it hung? I did not know enough to tell. So I started to move the SetWindowText line around, ultimately into the code that is called by the "Measure" button, and it worked!
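A common way to avoid touching a window from a foreign thread is to hand the update over to the thread that owns the window. The sketch below shows the idea in portable C++ under stated assumptions: `UiTaskQueue` is a hypothetical class, not part of Win32 or the SDK. In a Win32 program the same hand-off is typically done by posting a custom message to the window with PostMessage and calling SetWindowText in the handler.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <utility>

// Hypothetical hand-off queue: any thread may post work, but only the thread
// that owns the window calls drain(), so UI calls stay on that thread.
class UiTaskQueue {
public:
    void post(std::function<void()> task) {
        std::lock_guard<std::mutex> guard(mutex_);
        tasks_.push(std::move(task));
    }

    // Runs everything queued so far and returns how many tasks ran.
    std::size_t drain() {
        std::queue<std::function<void()>> pending;
        {
            std::lock_guard<std::mutex> guard(mutex_);
            std::swap(pending, tasks_);  // take the work, release the lock
        }
        std::size_t ran = 0;
        while (!pending.empty()) {
            pending.front()();  // executed without holding the lock
            pending.pop();
            ++ran;
        }
        return ran;
    }

private:
    std::mutex mutex_;
    std::queue<std::function<void()>> tasks_;
};
```

The posting thread returns immediately instead of waiting for the window to respond, which is what removes the opportunity to hang.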
I'm not out of the woods yet but I feel I'm making progress. Thank you all for your help and patience.
I have a Qt GUI program in which I can click a button to load/unload many dock widgets.
The problem is that when I click the button to load/unload the dock widgets, the program sometimes crashes with:
Debug Assertion Failed, Expression: _BLOCK_TYPE_IS_VALID(pHead->nBlockUse)
It doesn't happen every time. (Actually very rare to happen.)
And when I check the Windows event log, it says the application hung with a cross-thread deadlock.
But most people online say that _BLOCK_TYPE_IS_VALID(pHead->nBlockUse) means a memory error.
I just don't know what's going on...
It is a very big program written by someone else, and the bug happens very rarely...
What can I do now to locate the bug?
EDIT:
Hi, I have got the crash dump file, and I can see that my program stopped in a worker thread with the call stack: > ntdll.dll!_NtWaitForMultipleObjects#20() + 0x15 bytes
How can I trace back to the place in the source code where the program actually stopped?
That usually means you're trying to access an illegal memory block inside an std container.
To debug this properly, just look at the stack in the Call Stack window, look up the stack until you get to your code, and see why the value is invalid.
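One frequent cause of this assertion in the debug CRT is freeing the same heap block twice, for example when "unload" runs twice on the same raw pointer. The sketch below shows how single ownership through `std::unique_ptr` makes a repeated unload harmless. `Panel` and `load_then_unload_twice` are hypothetical names, and `Panel` is a plain C++ object: Qt widgets that have a parent are deleted by that parent, so this does not carry over to them unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical panel type that reports its own destruction through a counter.
struct Panel {
    explicit Panel(int* counter) : destroyed(counter) {}
    ~Panel() { ++*destroyed; }
    int* destroyed;
};

// Creates `count` panels, then "unloads" them twice. Each panel has exactly
// one owner, so the second unload finds nothing left to free.
int load_then_unload_twice(std::size_t count) {
    int destroyed = 0;
    {
        std::vector<std::unique_ptr<Panel>> panels;
        for (std::size_t i = 0; i < count; ++i) {
            panels.push_back(std::make_unique<Panel>(&destroyed));
        }
        panels.clear();  // first unload: every panel is destroyed once
        panels.clear();  // second unload: no-op, no double free
    }
    return destroyed;
}
```

With raw pointers and manual `delete`, the second unload would free already-freed blocks, which is the kind of heap misuse the assertion reports.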
It's hard to describe it, but briefly, here is what you should do:
Install Application Verifier and run it.
Ctrl+A, select your executable.
Deselect all tests in the right pane, select only Basic->Heaps.
Ensure you have 'Full heap' enabled and 'Traces' enabled (properties via right click on 'Heaps' item).
Save. You may close Application Verifier now.
Launch WinDBG of proper architecture (the same as your app).
Ctrl+E, select your executable.
The program will be stopped on the first instruction; run it using F5.
The probability that you'll hit the bug will be much higher. You may also find memory access issues you were not aware of before. When you hit one of them, the debugger will stop with one of the 'Verifier stops' and you'll see a message in the console telling you which command you can use to investigate further. Usually you'll be able to see detailed info about the heap using !heap -p -a <address>, including allocation and deallocation stacks.
Remember that the Application Verifier checks stay enabled even when the Application Verifier application is not running. You need to run Application Verifier, disable the checks and press 'Save' to actually disable them.
Hope this will help, at least a bit. Read more about Application Verifier techniques on the Internet.
We have a program which is proving difficult to debug. The MFC application runs fine for a few hours, but over a day it will crash. Sometimes it will not throw any errors and simply exits the "dlg.DoModal()" section of our code without the user (or any obvious part of the code) closing the dialog. Other times it will crash and throw an error outside of our code, with a horrendous call stack that has nothing to do with our code but contains a lot of calls to system DLLs.
A bit of background to our problem.
We are trying to develop an MFC-based C++ application (with a dialog). We have multiple threads and the code is rather large, which makes debugging a nightmare. We have been experiencing intermittent crashes whose source we have been unable to locate so far.
We are well past the use of breakpoints for debugging as we are pretty sure it is an issue somewhere in MFC, maybe not a bug but more a problem with the way we are using MFC.
Now we've tried simple things to help us like:
Enable all debug exceptions:
Debug -> Exceptions and then checking all the boxes so that we can
trap silly mistakes.
This proved helpful but we've now corrected all the errors it throws
within a few hours of running.
Search for memory leaks
We then tried Visual Leak Detector (which works beautifully by the
way) located here: http://vld.codeplex.com/ Our code now reports no
memory leaks so it is not an obvious memory leak issue. We have
included vld.h in the very top of our code near the entry point.
Adding Microsoft Symbol Server to obtain debug symbol files.
We then tried making our call stacks more human readable by using MS
Symbol servers shown in these tutorials:
http://social.msdn.microsoft.com/Forums/vstudio/en-US/3f1825e1-6770-48c0-91b0-12d8946ab259/2-how-do-i-configure-visual-studio-to-use-microsoft-symbol-server?forum=vsdebug and http://support.microsoft.com/kb/311503
This ultimately did nothing as it still doesn't tell you enough about any errors.
Using the Thread window to see all call stacks, and using the
Parallel Stack window
We have been using the Thread and Parallel stack windows to aid our
debugging but ultimately they have proved nothing more than pretty
pictures and fancily formatted call stacks that makes you feel good
more than anything. We have been using the tutorial here which has
been very handy
http://www.codeproject.com/Articles/79508/Mastering-Debugging-in-Visual-Studio-A-Beginn
Now for the more interesting things we've tried that do not throw errors straight away but can be detected as problems:
GDI Objects not being destroyed
This one is not obvious in VS2010 as an issue. Basically if you use a call like "CDC* pDC = lChild->GetDC();" and do not use the call "ReleaseDC(pDC);" then you have just created a GDI Object that will not be destroyed until your program terminates. VS2010 is a bit dumb in this regard and will keep creating these objects until your program crashes, and the call stack will look horrible and you will probably have no idea why it has crashed.
To find this issue, start Windows Task Manager -> Click Processes Tab -> Click the View Toolbar item -> Select Columns. Now check Handles, Threads, User Objects, and GDI Objects. Now start your program, find its process in the list under Image Name, and watch to see if the GDI Objects column keeps growing or stabilizes.
Objects Not being destroyed
This is another not so obvious error, if you create a bitmap like this: "reinterpret_cast<HBITMAP>(LoadImage( GetModuleHandle(NULL), MAKEINTRESOURCE(IDB_BITMAPNPR), IMAGE_BITMAP, 0, 0, 0))" and assign it straight to a picture control, the bitmap will not be destroyed if you assign another bitmap to this picture control using similar code. Instead you need to assign the above to a HBITMAP variable which you then need to destroy when you are done.
This situation can also arise if you create a font or colour in a similar fashion.
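Both kinds of leak described above come from a handle whose release call is forgotten. A small RAII guard makes the release automatic when the guard goes out of scope. The sketch below is generic and portable: `ScopedHandle` is a hypothetical helper, and the deleter is whatever release call matches the handle. On Windows that would be DeleteObject for an HBITMAP or a font, and ReleaseDC for a device context obtained with GetDC.

```cpp
#include <cassert>
#include <utility>

// Hypothetical RAII guard: owns one handle and calls the supplied deleter
// exactly once, either on reset() or when the guard goes out of scope.
template <typename Handle, typename Deleter>
class ScopedHandle {
public:
    ScopedHandle(Handle handle, Deleter deleter)
        : handle_(handle), deleter_(std::move(deleter)), owns_(true) {}
    ~ScopedHandle() { reset(); }

    // Non-copyable, so two guards can never release the same handle.
    ScopedHandle(const ScopedHandle&) = delete;
    ScopedHandle& operator=(const ScopedHandle&) = delete;

    Handle get() const { return handle_; }

    void reset() {
        if (owns_) {
            deleter_(handle_);
            owns_ = false;
        }
    }

private:
    Handle handle_;
    Deleter deleter_;
    bool owns_;
};
```

When replacing the bitmap on a picture control, the previous bitmap's guard would be reset (or destroyed) after the new bitmap has been assigned, so the old GDI object does not accumulate.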
Now with all that being said, we have tried all the methods above and we still can't find our issue. Sometimes our program will exit normally and we won't be given any debug info (this is usually after it has been running overnight), other times our program will lock up the PC (tested on multiple PCs), and other times it will throw an error but we can't locate the culprit because it simply points to the ".DoModal()" part of our code and the rest is native Windows DLLs, which is useless for debugging purposes.
We suspect something is being created and not destroyed properly, but we aren't sure what, and VS2010 is not telling us anything useful to point us in the right direction.
Does anyone have any ideas? How do we trap errors that aren't obvious to VS2010? Or rather how do we easily trap "GDI leaks" and the like?
Thanks in advance
Edit:
We've been using Microsoft's Application Verifier, it's found a few errors so far. To use it download it here http://www.microsoft.com/en-us/download/details.aspx?id=20028 run Application Verifier, add your .exe file in your Debug or Release directories and run the program in VS2010 as normal. VS2010 will break when Application Verifier 'sees' an error. It hasn't found anything too outrageous yet so I assume that we still have issues with our code.
OK, so there are numerous questions around, asking for a "Visual Studio equivalent on Linux" or a variation of this question. (here, here, here, ...)
I would like to focus on one aspect and ask how the debugging workflow possibly differs on different systems, specifically the full-integrated-IDE approach used by Visual Studio (like) systems and a possibly more "separate" toolchain oriented approach.
To this end, let me present what I consider a short description of the "Visual Studio Debugging Workflow":
Given an existing project
I open up the project (one single step from a user perspective)
I navigate to the code I want to debug (possibly by searching my project files, which is simply done by opening the Find in Files dialog box.)
I put a breakpoint at line (a), simply by putting the cursor on the line and hitting F9
I put a "tracepoint" at line (b), by adding a breakpoint there and then changing the breakpoint properties so that the debugger doesn't stop, but instead traces the value of a local variable.
I hit F5, which automatically compiles my executable and starts it under the debugger, and then I wait until the program stops at (a), meanwhile monitoring the trace window for output of (b)
When the debugger finally stops at (a), my screen automatically shows me the following information in (one-time preconfigured windows) side-by-side at the same time:
Current call stack
values of the most recently changed local variables
loaded modules (DLLs)
a list of all active breakpoints with their locations
a watch window with the last watch expressions I entered
A memory window to examine raw memory contents
a small window displaying current register values
Plus/minus some features, this is what I would expect under Eclipse/CDT under Linux also.
How would this workflow and presented information be retrieved when developing with VIM, Emacs, gdb/DDD and the likes?
This question isn't really about if some tool has one feature or not, it's about seeing that development/debugging work is using a combination of features and having a multitude of options available at your fingertips and how you access this information when not using a fully integrated IDE.
I think your answer isn't just about which software you use, but also what methodology you use. I use Emacs and depend on TDD for most of my debugging. When I see something fail, I usually write tests filling in the gap which I (obviously) have missed, and check every expectation that way. So a long time passes between each time I use the debugger.
When I do run into problems I have several options. In some cases I use valgrind first; it can tell me right away if there are memory-related problems, eliminating the need for the debugger. It will point straight to the line where I overwrite or delete memory that should be left alone. If I suspect a race condition, valgrind is pretty good at that too.
When I use the debugger I often use it right in Emacs, through GUD mode. It will give me a view with the stack, local variables, the source code, breakpoints and a window where I can command the debugger. It usually involves setting a couple of breakpoints, watching some memory or some evaluation, and stepping through the code. It is pretty much like using the debugger in an IDE. The GDB debugger is a powerful beast, but my problems have never been large enough to need to invoke its power.
I just fired up TotalView on my "hello world" application (C++) and I only get to view the assembly code.
Are there any settings/flags I need to set to view the source code? Menubar->View->Source As->Source does not work for me.
The application I'm trying to debug is just a cout << "Hello World" application, just to get the debugger up and running.
Let's start with the simple stuff.
Did you compile your application with the '-g' debugging flag? The debugger relies on the compiler to provide it with a symbol table and line number table to map what happens in the executable back to your source code. Without that -g flag (or if you subsequently strip your application) that info won't be present and assembly debugging is the best you can hope for.
If you did compile with -g, are the source and the executable together in the same directory, or have they been moved since you compiled them? The compiler only knows the locations of the source and executable at the time they are created; if you move them around, the debugger sometimes won't be able to locate the source code file. In that case you might need to give it some help by defining a source code search path.
Write back here and let me know if -g fixed your problem. If not we can look into the search path and such.
Cheers,
Chris
I realize that Jason94 has almost certainly solved his problem some other way, but I figured I could chime in here to answer this since it is a good question.
For this particular case it would be interesting to know if the program is multi-threaded. TotalView is designed to let you work with multi-threaded programs and it has a characteristic that may be surprising to users. By default it won't always focus you on the thread that hits the breakpoint. So your program might actually have stopped at your second breakpoint in another thread.
Imagine you have 6 threads (we'll number them 0 - 5) and you set a breakpoint in a routine. Thread 0 is the one you are focused on and you hit "go". The program runs and thread 4 hits the breakpoint first. By default the breakpoint will stop the whole process when the breakpoint is hit. In the debugger you might see assembly representing where thread 0 was when thread 4 hit the breakpoint.
You can check the root window or the thread pane to see what the status of the other threads is, and you might see that one of them says "B2" (for breakpoint 2). Then you can click on that thread and TotalView will refocus you to that thread and you'll see it sitting at the breakpoint.
Why do we do that? Well, because we think it is confusing/disconcerting to have your focus "ripped away from you" just because another thread hit a breakpoint. So by default we leave the user in control of their thread focus.
There is a preference that you can change which will tell TotalView to refocus the process window to the "site of the event". You can set that if you would prefer to have TotalView refocus your attention to the breakpoint, but be aware that as you do that you may be bouncing from one thread to the next.
The other possibility is that TotalView stopped the process for some reason other than a breakpoint being hit. Did the program segfault? Check the status bar at the top of the process window to see what the status of the thread and process are.
Anyway -- just wanted to post this for the record.