Minimizing the size of debugging information for testing at a remote location - c++

I am trying to create a way to transfer the debug information of a C++ project to a remote location for testing. In the current development cycle, small changes to the code require the entire binary (100s MB in size and mostly debug info) to be transferred.
Currently my approach to addressing this is by splitting the debugging information from the object files (the size of which without the debugging info is manageable on my connection) using -gsplit-dwarf and then diffing the debug files against a copy of the build currently on the remote box.
The aim is to have a set of patches for the debug files of a project so that new code can be debugged at a remote location. The connection between the remote location and the local machine is slow and so minimization of the size of the patches is paramount but it should also be balanced with the run time of the tool. I have looked into bsdiff and xdelta as potential solutions and have run into a conundrum where xdetla is fast but too large and bsdiff is perfect in terms of size but the run time and memory requirements are a little higher than I would like it to be.
Is there a tool or approach I am missing or am I just going about this the wrong way? Some alternative to bsdiff and xdelta perhaps? I know that a tool like gbdserver won't work in this situation because of some of the requirements we have with the actual debugging. Could some alteration of bsdiff help the performance? And indeed if the approach I'm using is sound, what would be a good way to keep a copy of the build on the remote machine around to diff against.

The simplest way is to use "strip" to copy the debuginfo into a separate ".debug" file, and then use "strip" again to remove the debug info from the executable that you will deploy. The "strip" manual explains how to do this, look for the "--only-keep-debug" option.
After you do this, you can tell gdb about the separate debug info in various ways. The very best way is to use the "build-id" feature. This is what modern Linux distros do. However there are other ways as well. There's a whole section in the gdb manual about separate debug files.
The key point here is that you can start gdb on the stripped executable and it will find the separate debug info automatically. This data can all be local, so you won't need to deploy the debug info.
If you still care about shrinking debug info even when this is done, you can look at the "dwz" tool. This is a DWARF compressor. However this usually only matters if you plan to ship the debug info somewhere -- distros use it to make it easier to download debug info, but ordinary users won't really see the need.

Related

How can you trace execution of an embedded system emulted in QEMU?

I've built OpenWrt for x86 and I'm using QEMU to run it virtually.I'm trying to debug this system in real time. I need to see things like network traffic flowing etc.
I can attach gdb remotely and execute (mostly) step by step with break points. I really want trace points though. I don't want to pause execution and loose network flow. When I tried setting trace points using tstart, I see the message "Target does not support this command". I did a bit of reading of the gdb documentation and from what I can tell the gdb stub that runs to intercept normal execution in QEMU does not support trace points.
From here I started looking at other tools and ran across PANDA (https://github.com/panda-re/panda). As I understand PANDA will capture a complete system trace in a log and allow for replay. I think this tool is supposed to do what I need, but I cannot seem to replay the results. I see the logs, I just can't replay them.
Now, I'm a bit stuck on what other tools/options I might have to actually trace a running embedded system. Are there any good tools you can recommend? Or perhaps another method I've missed?
If you want to see the system calls and signals use strace.
Strace can also be used with running process and it can put the output in a log file if required.
In OpenWrt it is possible to build with ftrace. Ftrace has much of the functionality I required but not all.
To build with ftrace, the option for ftrace must be selected in the build menu. Additionally there are a variety of tracer options that must also be enabled.
The trace-cmd (ftrace) is located in menuconfig/Development
Tracing support is under menuconfig/Global build settings/Compile the kernel with tracing support and includes: Trace system calls, Trace process context switches and events, and Function tracer (Function graph tracer, Enable/disable function tracing dynamically, and Function profiler)
I'm also planning to build a custom GDB stub to do this a little bit better as I also want to see the data passed to the functions not just the function calls.

GRPC: how to open GRPC profiling macro GPR_TIMER_BEGIN/GPR_TIMER_END

We're using GRPC to sending message that met performance issues. We found there're lots of GPR_TIMER_BEGIN/GPR_TIMER_END in most critical path to measure time consuming of the functions.
But just define GRPC_STAP_PROFILER in Makefile would trigger build errors.
Anyone who knows how to open the GRPC's performance profiling macros?
The stap profiler hasn't been properly maintained unfortunately. You'd get more chance with the basic profiler. No need to update the Makefile, you can simply run make CONFIG=basicprof.
Most of our profiling however nowadays happens using microbenchmarks. You can browse the directory with our microbenchmarks, which are actively used to create automated github comments to notify us of performance differences caused by pull requests.

How can I read a windows BSOD generated memory.dmp using C++

I need to read information, code, flags, address, etc from a memory.dmp file generated from a windows BSOD through C++. The basic idea is that status info can be requested from a remote site and one of the requested pieces of information is some basic info from the last BSOD that occured on the machine thus I need to open the kernel/memory dump file through C++ (Im using MSVC 2005).
Start here, then realize using scripted commands in WinDBG is much easier.
Note: you only need WinDBG on the analysis machine, not the crashing one. You retrieve the minidump and analyse it externally. The only difficulty you will have is getting the right symbols - for Windows, Microsoft makes them available via their symbol servers, but applications that caused the crash may not supply the right symbols you need. IF they are you own applications causing the crash, get a symbol server and use it.
I would configure Windows to create small kernel memory dumps which will include the parameter of the bugcheck you are after.
On XP it was 64KB on my Win8.1 x64 it is 256KB. These files compress well. You should be able to get away with a zip file of 10-60KB size depending on the bitness of the OS. If bandwidth is of of utmost importance to you, you can use 7z which compresses about 50% better than the plain zip algo at the expense of much longer compression times (5-6 longer) but for such small files the CPU time difference should be irrelevant.
If you do not want your users to configure dump reporting you need to set the DWORD
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\CrashControl
to 3 for a small kernel dump programatically.
For an explanation of the values see http://technet.microsoft.com/en-us/library/cc976050.aspx
0 Debugging information is not written to a file.
1 Complete crash dump is written to a file.
2 Kernel memory dump is written to a file.
3 Small memory dump is written to a file.
You will then get by default a small kernel dump in %SystemRoot%\MEMORY.DMP.

Distributed software debug with gdb

I am currently developing a distributed software in C++ using linux which is executed in more than 20 nodes simultaneously. So one of the most challenging issue that I found is how to debug it.
I heard that is possible to manage in a single gdb session multiple remote sessions (e.g. in the master node I create the gdb session and in every other node I launch the program using gdbserver), is it possible? If so can you give an example? Do you know any other way to do it?
Thanks
You can try to do it like this:
First start nodes with gdbserver on remote hosts. It is even possible to start it without a program to debug, if you start it with --multi flag. When server is in multi mode, you can control it from your local session, I mean that you can make it start a program you want to debug.
Then, start multiple inferiors in your gdb session
gdb> add-inferior -copies <number of servers>
switch them to a remote target and connect them to remote servers
gdb> inferior 1
gdb> target extended-remote host:port // use extended to switch gdbserver to multi mode
// start a program if gdbserver was started in multi mode
gdb> inferior 2
...
Now you have them all attached to one gdb session. The problem is that, AFAIK, it is not much better than to start multiple gdb's from different console tabs. On the other hand you can write some scripts or auto tests this way. See the gdb tutorial: server and inferiors.
I don't believe there is one, simple, answer to debugging "many remote applications". Yes, you can attach to a process on another machine, and step through it in GDB. But it's quite awkward to debug a large number of interdependent processes, especially when the problem is complicated.
I believe a good set of logging capabilities in the code, supplemented with additional logs for specific debugging as needed, is more likely to give you a good/fast result.
Another option might be to run the processes on one machine, rather than on multiple machines. Perhaps even use threads within one process, to simulate the behaviour of multiple machines, simplifying the debugging process. Of course, this doesn't prevent bugs that appear ONLY when you run 20 processes on 20 different machines. But the basic idea is to reduce the number of those bugs to a minimum, and debug most things in a "simpler environment".
Aggressive use of defensive programming paradigms, such as liberal use of assert is clearly a good idea (perhaps with a macro to turn it off for the production runs, but make sure that you don't just leave error paths completely unchecked - it is MUCH harder to detect that the reason something crashes is that a memory allocation failed than to track down where that NULL pointer came from some 20 function calls away from a failed allocation.

Visual Studio profiler uses huge amount of RAM

I'm trying to do an Instrumentation Profiling of a quite big project (around 40'000 source files in the whole solution, but the project under profiling has around 200 source files), written in C++.
Each time I run the profiling, it creates a huge report of around 34GB, and then, when it's going to analyze it, it's trying (I think) to load the whole file into RAM.
Obviously, it renders the computer unusable, and I have to stop the analyzer before it completes.
Any suggestions?
Hi there hope this response isn't too late. This is Andre Hamilton from Visual Studio profiler team. Analyzing such a large report file does take some time. Instrumentation produces that much data because all of your functions are instrumented. By instrumenting a few functions or specific binary you might be able to speed things up if you don't mind profiling via the command line. This will product a vsp file which you can then open in VS and use as normal. Lets say that your project requires n binaries to run. Let us assume that of these binaries you are interested in the performance of binary ni
Open up a VisualStudio command prompt
1) Do vsinstr ni.dllto instrument the entire binary or use the /include or /exclude options of vsinstr to further restrict which functions are instrumented. N.B if your binary was signed, you will need to resign after instrumenting
2)Start the profiler in instrumentation mode via the given command
vsperf /start:trace /output:myinstrumentedtrace.vsp
3)Launch Your Application
4)When you are ready to stop profiling
vsperf /shutdown
Hope this helps
(Notice, I assume you have a licensed copy of VS to both collect and analyze the data).
This is a general problem when profiling large or "dense" programs. You need to restrict the profiler to collect data only from certain units of your code base. In Microsoft's profilers this is done by using Include/Exclude switches either at command line or in the IDE.
There is a bug in VS, the reason is most of the profiling work is done in UI thread it make the VS unusable, as mentioned in http://channel9.msdn.com/Forums/TechOff/260091-Visual-Studio-Performance-Analysis-in-10-minutes
You may give try to VS 2012 to see if the problem is resolved, but there is no doubt loading a 34 GB file is not a simple task and its also the reason behind resulting in system unusable, so as John suggested above in comment section, break your code in smaller component and then do profiling, hope it help!