Distributed software debug with gdb - c++

I am currently developing a distributed software in C++ using linux which is executed in more than 20 nodes simultaneously. So one of the most challenging issue that I found is how to debug it.
I heard that is possible to manage in a single gdb session multiple remote sessions (e.g. in the master node I create the gdb session and in every other node I launch the program using gdbserver), is it possible? If so can you give an example? Do you know any other way to do it?
Thanks

You can try to do it like this:
First start nodes with gdbserver on remote hosts. It is even possible to start it without a program to debug, if you start it with --multi flag. When server is in multi mode, you can control it from your local session, I mean that you can make it start a program you want to debug.
Then, start multiple inferiors in your gdb session
gdb> add-inferior -copies <number of servers>
switch them to a remote target and connect them to remote servers
gdb> inferior 1
gdb> target extended-remote host:port // use extended to switch gdbserver to multi mode
// start a program if gdbserver was started in multi mode
gdb> inferior 2
...
Now you have them all attached to one gdb session. The problem is that, AFAIK, it is not much better than to start multiple gdb's from different console tabs. On the other hand you can write some scripts or auto tests this way. See the gdb tutorial: server and inferiors.

I don't believe there is one, simple, answer to debugging "many remote applications". Yes, you can attach to a process on another machine, and step through it in GDB. But it's quite awkward to debug a large number of interdependent processes, especially when the problem is complicated.
I believe a good set of logging capabilities in the code, supplemented with additional logs for specific debugging as needed, is more likely to give you a good/fast result.
Another option might be to run the processes on one machine, rather than on multiple machines. Perhaps even use threads within one process, to simulate the behaviour of multiple machines, simplifying the debugging process. Of course, this doesn't prevent bugs that appear ONLY when you run 20 processes on 20 different machines. But the basic idea is to reduce the number of those bugs to a minimum, and debug most things in a "simpler environment".
Aggressive use of defensive programming paradigms, such as liberal use of assert is clearly a good idea (perhaps with a macro to turn it off for the production runs, but make sure that you don't just leave error paths completely unchecked - it is MUCH harder to detect that the reason something crashes is that a memory allocation failed than to track down where that NULL pointer came from some 20 function calls away from a failed allocation.

Related

How to develop a debugger

I am looking for a way to do things such as attach to a process, set breakpoints, view memory, and other things that gdb/lldb can do. I cannot, however, find a way to do these things.
This question is similar to this one, but for MacOS instead of Windows. Any help is appreciated!
Note: I want to make a debugger, not use one.
Another thing is that i dont want this debugger to be super complicated, all i need is just reading/writing memory, breakpoint handling, and viewing the GPR
If you really want to make your own debugger, another way to start would be to figure out how to cons up and parse the gdb-remote protocol packets (e.g. https://sourceware.org/gdb/onlinedocs/gdb/Remote-Protocol.html). That's the protocol gdb uses when remote debugging and lldb uses for everything but Windows debugging. On MacOS, lldb spawns a debugserver instance which does the actual debugging and controls it with gdb-remote protocol packets. On Linux, it uses the lldb-server tool that's part of the Linux lldb distribution for the same purpose.
The gdb-remote protocol has primitives for most of the operations you want to perform, launch a process, attach to a process, set breakpoints, read memory & registers and isolates you from a lot of the low-level details of controlling processes.
You can help yourself out by observing how lldb uses this protocol by running an lldb debug session with:
(lldb) log enable gdb-remote packets
But you might also have a look at the SB API's in lldb. The documentation is not as advanced as it should be but there are a bunch of examples in the examples/python directory of the lldb sources to get you started, and in general the API's are pretty straightforward and self-explanatory.
LLDB has an API that can be consumed from C++ or Python. Maybe this is what you’re looking for.
Unfortunately the documentation is fairly threadbare, and there don’t seem to be usage examples. This will therefore entail some reading of the source and a lot of trial and error.
If you want to write your own debugger, you'll need to obtain a task port to the process (task_for_pid), then you can read/write/iterate to virtual memory (mach_vm_read, mach_vm_write, mach_vm_region). To do breakpoints, you need to first set up an exception handler then you can manipulate the debug registers on threads (task_threads, thread_get_state, thread_set_state) and handle them in the exception handler.
Reference to some not all that correct debugger code I've written because breakpoints (especially instruction ones) are a bit involved.
MacDBG may be another lightweight reference but I haven't used it myself.
Well, if you want to write a debugger, take a look at the gdb/lldb source code. I'd suggest the latter, due to historical legacy in gdb that might cloud whatever is actually going on.
Use a debugger. Such as gdb or lldb for example. They have plenty of documentation to teach you the how bit, but for example gdb -p <pid of process> will attach gdb to a running process.
If you want to drive gdb for example from a C++ program, then launch it in a separate process (see fork and exec) with the aguments it needs (probably including the one to enable its machine parsable interface). Make sure you set up pipes to its stdin/stdout so you can read its output and send it commands.
If instead you want to write your own debugger from scratch then that is a huge undertaking. I suggest starting by reading the source of an existing open source debugger.
Whilst you could look at source code of another debugger, you may find it difficult to understand without the knowledge of the underlying concepts. Therefore, I recommend you start by obtaining a good grounding of the concepts with the following resources:
Mac OS X Sys Internals
Rather outdated now, but the original bible for the internals of Mac
Mac OS X and iOS Internals
Again, outdated but useful. This link is Jonathan Levin's (the author's) own site and he's now providing it for free, due to issues he had with the publisher. He's since purchased back the rights, making it available to all.
*OS Internals
The current bible of Mac Internals, also by Jonathan Levin. Books III and I have been published, with book II to follow shortly!

How can you trace execution of an embedded system emulted in QEMU?

I've built OpenWrt for x86 and I'm using QEMU to run it virtually.I'm trying to debug this system in real time. I need to see things like network traffic flowing etc.
I can attach gdb remotely and execute (mostly) step by step with break points. I really want trace points though. I don't want to pause execution and loose network flow. When I tried setting trace points using tstart, I see the message "Target does not support this command". I did a bit of reading of the gdb documentation and from what I can tell the gdb stub that runs to intercept normal execution in QEMU does not support trace points.
From here I started looking at other tools and ran across PANDA (https://github.com/panda-re/panda). As I understand PANDA will capture a complete system trace in a log and allow for replay. I think this tool is supposed to do what I need, but I cannot seem to replay the results. I see the logs, I just can't replay them.
Now, I'm a bit stuck on what other tools/options I might have to actually trace a running embedded system. Are there any good tools you can recommend? Or perhaps another method I've missed?
If you want to see the system calls and signals use strace.
Strace can also be used with running process and it can put the output in a log file if required.
In OpenWrt it is possible to build with ftrace. Ftrace has much of the functionality I required but not all.
To build with ftrace, the option for ftrace must be selected in the build menu. Additionally there are a variety of tracer options that must also be enabled.
The trace-cmd (ftrace) is located in menuconfig/Development
Tracing support is under menuconfig/Global build settings/Compile the kernel with tracing support and includes: Trace system calls, Trace process context switches and events, and Function tracer (Function graph tracer, Enable/disable function tracing dynamically, and Function profiler)
I'm also planning to build a custom GDB stub to do this a little bit better as I also want to see the data passed to the functions not just the function calls.

gdb: How do I pause during loop execution?

I'm writing a software renderer in g++ under mingw32 in Windows 7, using NetBeans 7 as my IDE.
I've been needing to profile it of late, and this need has reached critical mass now that I'm past laying down the structure. I looked around, and to me this answer shows the most promise in being simultaneously cross-platform and keeping things simple.
The gist of that approach is that possibly the most basic (and in many ways, the most accurate) way to profile/optimise is to simply sample the stack directly every now and then by halting execution... Unfortunately, NetBeans won't pause. So I'm trying to find out how to do this sampling with gdb directly.
I don't know a great deal about gdb. What I can tell from the man pages though, is that you set breakpoints before running your executable. That doesn't help me.
Does anyone know of a simple approach to getting gdb (or other gnu tools) to either:
Sample the stack when I say so (preferable)
Take a whole bunch of samples at random intervals over a given period
...give my stated configuration?
Have you tried simply running your executable in gdb, and then just hitting ^C (Ctrl+C) when you want to interrupt it? That should drop you to gdb's prompt, where you can simply run the where command to see where you are, and then carry on execution with continue.
If you find yourself in a irrelevant thread (e.g. a looping UI thread), use thread, info threads and thread n to go to the correct one, then execute where.

How to find why my application becomes unresponsive at unaccessible customer sites

My application is deployed at customer sites, that I can not access, and has no internet connection.
There are complains that in several sites, once in a week or so, the application become unresponsive, so that the operators need to kill and restart it.
We were unable to observe it in our site.
Is there something I can do that may help me find the problem?
It is a VC2008 Win32 MFC applications.
The application is quite complex, and includes many threads, synchronization mechanisms, database access, HMI, communication channels...
Note: The custmer can send us log files.
Note: The application does not crash. It just hangs. Since I don't know what is the nature of the problem, I have no way to know programmatically that something went wrong (or do I?)
I have had great success with ADplus and WinDBG in the past. You may check it out. Especially check out the Hang mode in ADplus.
I would start with some questions - is the CPU hogged during these unresponsive times? Is there a specific process that's hogging it? (You can use PerfMon to get the answers). Depending on the answers I would probably proceed by taking a dump of the process at this stage (ProcDump by sysinternals is great for these purposes) and investigate it offline.
In similar situation on a non-windows platform we have the capability to gather system dumps. Get a thread dump of the entire system for off-site analysis. This enables us to find deadlocks quite easily. For slow problems rather than stop a single dump is not enough. Then we need a sequence of dumps, and some good luck.
Another, rather messier technique is to have enough trace, and enough fine-grained control of trace in the app. Then turn on some trace and hope to spot where the delays are happening.
My experience with finding bugs in installations on the other side of the planet shows three helpful techniques: Logging, logging, and logging.
What do those log files say your customers sent you? If they aren't detailed enough, send them a version that logs more. Use binary approximation to home in on the error.
To know where the process is hung is better to start with the stack trace at that instant.
Now since your program is installed remotely and you can't access it, you can write a monitoring program which can periodically check the stack of your program and log it. This information along with your logging mechanism will make things easier to identify and debug.
Since I am not a windows programmer, i don't know much about such tools availability in windows, however i think you need something similar to this http://www.codeproject.com/KB/threads/StackWalker.aspx

How to use gprof to profile a daemon process without terminating it gracefully?

Need to profile a daemon written in C++, gprof says it need to terminate the process to get the gmon.out. I'm wondering anyone has ideas to get the gmon.out with ctrl-c? I want to find out the hot spot for cpu cycle
Need to profile a daemon written in C++, gprof says it need to terminate the process to get the gmon.out.
That fits the normal practice of debugging daemon processes: provision a switch (e.g. with command line option) which would force the daemon to run in foreground.
I'm wondering anyone has ideas to get the gmon.out with ctrl-c?
I'm not aware of such options.
Though in case of gmon, call to exit() should suffice: if you for example intend to test say processing 100K messages, you can add in code a counter incremented on every processed message. When the counter exceeds the limit, simply call exit().
You also can try to add a handler for some unused signal (like SIGUSR1 or SIGUSR2) and call exit() from there. Thought I do not have personal experience and cannot be sure that gmon would work properly in the case.
I want to find out the hot spot for cpu cycle
My usual practice is to create a test application, using same source code as the daemon but different main() where I simulate precise scenario (often with a command line switch many scenarios) I need to debug or test. For the purpose, I normally create a static library containing the whole module - except the file with main() - and link the test application with the static library. (That helps keeping Makefiles tidy.)
I prefer the separate test application to hacks inside of the code since especially in case of performance testing I can sometimes bypass or reduce calls to expensive I/O (or DB accesses) which often skews the profiler's sampling and renders the output useless.
As a first suggestion I would say you might try to use another tool. If the performance of that daemon is not an issue in your test you could give a try to valgrind. It is a wonderful tool, I really love it.
If you want to make the daemon go as fast as possible, you can use lsstack with this technique. It will show you what's taking time that you can remove. If you're looking for hot spots, you are probably looking for the wrong thing. Typically there are function calls that are not absolutely needed, and those don't show up as hot spots, but they do show up on stackshots.
Another good option is RotateRight/Zoom.