stepping through program with debugger takes a long time - c++

When I debug my program by stepping through it, it sometimes takes a long time for the step to finish. This was not happening in the beginning of the project so most likely it is due to something I have added. Could you give me pointers as to how to remedy this. I did notice one of the problems was due to the main thread trying to paint a widget. My application is multi-threaded (1 background thread and 1 main thread) so I am wondering if it has something to do with that. Your comments are appreciated.

With gdb just set scheduler-locking mode to desired behaviour.
In this case: "The step mode optimizes for single-stepping. It stops other threads from "seizing the prompt" by preempting the current thread while you are stepping. Other threads will only rarely (or never) get a chance to run when you step."

A guess: Is your "background thread" pegged at near 100% CPU utilization?
Between lines of of your main thread, while stepping, the debugger is going to allow the background thread to also "step". If the background thread is pegged it can be running a lot more than a few instructions, causing things to appear unresponsive.
Probably if your second thread is doing that much computation continuously it indicates you've got another problem in your application that you need to fix. If you get that thread under control you will probably see your debugger handling things a lot better.

I asked a very similar question regarding visual studio: VS2010 debugger takes an unreasonable amount of time
No real answer came about. You'll find similar questions for past versions of the IDE here as well.

Related

How does thread waiting affect the execution time of the program?

In my C++ program, I am using boost libraries for parallel programming. Several threads are made to join() on other threads in a part of the program.
The program runs pretty slow for some inputs... In an attempt to improve my program, I tried finding hotspots using Intel VTune. The most time-consuming hotspot is shown to occur due to boost::this_thread::interruptible_wait:
When I checked the portion of the source code where this hotspot occurs, it shows the call to join(). I was under the impression that waiting threads do not take CPU Time. Can someone help me understand why does the thread join() operation take up so much CPU time?
Any insights on how to fix such a hotspot will be very helpful too! One way I can think of to fix such a hotspot would be to somehow detach() the threads and not join() them.
Thanks in advance!
I was under the impression that waiting threads do not take CPU Time
It really depends on how the threads wait. They may be busy waiting (i.e. spinning) to react as quickly as possible to whatever they are waiting for. The alternative of yielding execution after every check means potentially higher delays from operating system scheduling (and thread switching overhead).
VTune will mercilessly pick up on all your threading library overhead, you will need to filter appropriately to figure out where your serial hotspots are and if your parallelization has mitigated them.
If your threads spend a significant amount of time waiting on the join, your parallel section is probably not well-balanced. Without more information on your problem it's hard to tell what the reason is or how to mitigate it, but you should probably try to distribute the work more evenly.
On another note, the recent spectre/meltdown fixes appear to have increased VTune's profiling overhead. I would be careful taking the results at face value (does your program run close to the same amount of time with and without profiling?).
Edit: Related material here and here. Following the instructions in the linked page for disabling the kernel protections helped in my case, although I have not tested it on the latest VTune update.

Stack corruption, unexpected jumps in backtrace

For the sake of NDA I may not be able to paste any code here for this question.
The software I work upon is in c++. In which we use a lot of STL maps, vectors and other standard c++ features, like inheritance and others.
Recently I have been observing a SIGSEGV occur in the software. It is occurring very consistently.
The backtrace is confusing. There are many threads(related to this software) running in the system. The backtrace starts from a thread say THREAD1. It shows us that it is executing some functions in THREAD1, and it proceeds along and suddenly it jumps to THREAD2(which is running in the system), and it starts executing somewhere in the middle it and not from the beginning of that thread instance. It now takes two or three steps more, and then goes for a sigsegv. The threads THREAD1 and THREAD2 are always the same threads.
I have tried to make sure all the checked in code is proper and got them reviewed by many people.
My questions are as follows,
Are such jumps possible? If yes, what could be the root causes?
Are there any debugging steps I can take to find out what is happening with those threads?

Can I Always debug multiple instances of a same object that is of type thread with GDB?

program runs fine. When I put a breakpoint a segmentation fault is generated. Is it me or GDB? At run time this never happens and if I instantiate only one object then no problems.
Im using QtCreator on ubuntu x86_64 karmic koala.
UPDATE1:
I have made a small program containing a simplified version of that class. You can download it at:
example program
simply put a breakpoint on the first line of the function called drawChart() and step into to see the segfault happen
UPDATE2: This is another small program but it is practically the same as the mandlebrot example and it is still happening. You can diff it with mandlebrot to see the small difference.
almost the same as mandlebrot example program
To answer your question: Yes, you should be able to debug multiple threads using GDB. This depends on the concurrent design to be sound.
There is a chance you have a race condition on data that your threads access. It is possible that the problem does not show when you run the program normally, but attaching a debugger changes timing and scheduling. Even so, you should be able to use the debugger to break when the segfault happens. Understanding where this happens can inform you about the race condition or corruption, whatever the case may be.
It is worth looking into because even if it doesn't happen under most 'run time' conditions, it may manifest under different system load conditions.
Are you Calling into Qt's drawing code from multiple threads? (particularly widget methods)
http://doc.qt.nokia.com/4.3/threads.html#reentrancy-and-thread-safety
Seems like Qt is like GTK+ and you should only be touching GUI stuff from one thread (in particular the main one)
I'm not familiar enough with Qt to give you advice on how to change your code, but I'd suggest changing it to be event based (ie rendering starts in response to an event, then triggers an event in the main thread when it's done, every thread has it's own mainloop) that way you can probably completely avoid mutexes and synchronization.

Detecting application hang

I have a very large, complex (million+ LOC) Windows application written in C++. We receive a handful of reports every day that the application has locked up, and must be forcefully shut down.
While we have extensive reporting about crashes in place, I would like to expand this to include these hang scenarios -- even with heavy logging in place, we have not been able to track down root causes for some of these. We can clearly see where activity stopped - but not why it stopped, even in evaluating output of all threads.
The problem is detecting when a hang occurs. So far, the best I can come up with is a watchdog thread (as we have evidence that background threads are continuing to run w/out issues) which periodically pings the main window with a custom message, and confirms that it is handled in a timely fashion. This would only capture GUI thread hangs, but this does seem to be where the majority of them are occurring. If a reply was not received within a configurable time frame, we would capture a memory and stack dump, and give the user the option of continuing to wait or restarting the app.
Does anyone know of a better way to do this than such a periodic polling of the main window in this way? It seems painfully clumsy, but I have not seen alternatives that will work on our platforms -- Windows XP, and Windows 2003 Server. I see that Vista has much better tools for this, but unfortunately that won't help us.
Suffice it to say that we have done extensive diagnostics on this and have been met with only limited success. Note that attaching windbg in real-time is not an option, as we don't get the reports until hours or days after the incident. We would be able to retrieve a memory dump and log files, but nothing more.
Any suggestions beyond what I'm planning above would be appreciated.
The answer is simple: SendMessageTimeout!
Using this API you can send a message to a window and wait for a timeout before continuing; if the application responds before timeout the is still running otherwise it is hung.
One option is to run your program under your own "debugger" all the time. Some programs, such as GetRight, do this for copy protection, but you can also do it to detect hangs. Essentially, you include in your program some code to attach to a process via the debugging API and then use that API to periodically check for hangs. When the program first starts, it checks if there's a debugger attached to it and, if not, it runs another copy of itself and attaches to it - so the first instance does nothing but act as the debugger and the second instance is the "real" one.
How you actually check for hangs is another whole question, but having access to the debugging API there should be some way to check reasonably efficiently whether the stack has changed or not (ie. without loading all the symbols). Still, you might only need to do this every few minutes or so, so even if it's not efficient it might be OK.
It's a somewhat extreme solution, but should be effective. It would also be quite easy to turn this behaviour on and off - a command-line switch will do or a #define if you prefer. I'm sure there's some code out there that does things like this already, so you probably don't have to do it from scratch.
A suggestion:
Assuming that the problem is due to locking, you could dump your mutex & semaphore states from a watchdog thread. With a little bit of work (tracing your call graph), you can determine how you've arrived at a deadlock, which call paths are mutually blocking, etc.
While a crashdump analysis seems to provide a solution for identifying the problem, in my experience this rarely bears much fruit since it lacks sufficient unambiguous detail of what happened just before the crash. Even with the tool you propose, it would provide little more than circumstantial evidence of what happened. I bet the cause is unprotected shared data, so a lock trace wouldn't show it.
The most productive way of finding this—in my experience—is distilling the application's logic to its essence and identifying where conflicts must be occurring. How many threads are there? How many are GUI? At how many points do the threads interact? Yep, this is good old desk checking. Leading suspect interactions can be identified in a day or two, then just convince a small group of skeptics that the interaction is correct.

Debugging multitheaded programs

I have been a C programmer for many years and my favorite "debugger" has always been the printf() function - I only resort to visual studio's debugger when absolutely forced and so have never been very proficient in using it. Recently I have had to modify a program from C to C++ (although of course printf still works fine) and and parts of the program are now farmed out in to multiple threads (one for each core on a multicore machine) to make the program run faster. Now i will no doubt come up against awkward multi-thread related bugs like deadlocks and I wonder what debugging methodology I can turn to. Does visual studio (2008) have everything I could reasonably need to help me resolve thread related bugs? Should I take some time out now to learn how to use some third party debugger? Could I solve most problems using my good old printf?
Could I for example write code which, if kept waiting on entry to a critical section would print something like "Thread X waiting to enter ... but blocked because its being used by thread Y"?
Visual Studio supports thread debugging to some extend. Via the Threads Window you can select threads, suspend and resume threads etc. When you switch between threads the Call Stack Window is updated accordingly so you can inspect what each thread is doing. You may also restrict breakpoints to specific threads.
If you want an alternative WinDbg (which is part of the free Debugging Tools for Windows package from Microsoft) offers lots of options as well but with a slightly more esoteric user interface.
As for using printf, there's the problem of synchronizing output. If you don't do it you output will most likely be gibberish. If you do synchronize it you basically change the concurrency of the application, which may or may not affect the problem you're trying to solve.
If you could port your project to Linux, Valgrind (especially the 'helgrind' tool) would do exactly what you ask. http://valgrind.org/
I'm not sure if this is exactly what you are asking, but, To help in debugging, you can write code that gives each thread a "name", so that debug messages printed to the debug window, (or a log file or whatever) include that thread "name" along with whatever other info you prescribe. The code below is in C# but this is available even in unmanaged C++
Thread T = new Thread(RunSchedule);
T.Name = "Scheduler"; // <=== Thread given a name here...
T.Start();
Intel provides several tools to find out threading-related issues: data races, deadlocks, performance penalties. These tools are: Intel Thread Checker, Intel Thread Profiler.