What is the difference between dprintf vs break + commands + continue? - gdb

For example:
dprintf main,"hello\n"
run
Generates the same output as:
break main
commands
silent
printf "hello\n"
continue
end
run
Is there a significant advantage to using dprintf over commands, e.g. is it considerably faster (if so, why?), or does it have some different functionality?
I imagine that dprintf could in theory be faster, since it could compile and inject code with a mechanism analogous to the compile code GDB command.
Or is it mostly a convenience command?
Source
In the 7.9.1 source, breakpoint.c:dprintf_command, which defines dprintf, calls create_breakpoint which is also what break_command calls, so they both seem to use the same underlying mechanism.
The main difference is that dprintf passes the dprintf_breakpoint_ops structure, which has different callbacks and gets initialized at initialize_breakpoint_ops.
dprintf stores a list of command strings, much like the commands command does, depending on the settings. They are:
set at update_dprintf_command_list
which gets called after a type == bp_dprintf check inside init_breakpoint_sal
which gets called by create_breakpoint.
When a breakpoint is reached:
bpstat_stop_status gets called and invokes b->ops->after_condition_true (bs); for the breakpoint reached
after_condition_true for dprintf is dprintf_after_condition_true
bpstat_do_actions_1 runs the commands

There are two main differences.
First, dprintf has some additional output modes that can be used to make it work in other ways. See help set dprintf-channel, or the manual, for more information. I think these modes are the reason that dprintf was added as a separate entity; though at the same time they are fairly specialized and unlikely to be of general interest.
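For example (a rough sketch based on the manual's description; check help set dprintf-style for the exact syntax, and note that the channel is evaluated in the inferior, so stderr must be visible there), the call style makes the program itself do the printing through a function such as fprintf:
set dprintf-style call
set dprintf-function fprintf
set dprintf-channel stderr
dprintf main,"hello\n"
With the default gdb style, the printing is done by gdb itself, as in the examples above.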
More usefully, though, dprintf doesn't interfere with next. If you write a breakpoint and use commands, and then next over such a breakpoint, gdb will forget about the next and act as if you had typed continue. This is a longstanding oddity in the gdb scripting language. dprintf doesn't suffer from this problem. (If you need similar functionality from an ordinary breakpoint, you can do this from Python.)

Related

How to start a REPL while debugging C++?

How can I get a REPL to evaluate simple things and test things out in the middle of a program's execution in C++?
I have some debugging tools at my disposal, such as lldb, but as far as I can tell I can only view things, not test new things on the fly, even though it seems like it should be able to do that; maybe that functionality of the lldb debugger only works for Swift.
It doesn't have to be a complete REPL, but the ability to do some basic things would be incredibly helpful. I am working with a massive binary in a new codebase, with all sorts of types, classes, composition, and inheritance going on, which is overwhelming and undocumented. Every test-and-repeat cycle of the 50GB binary takes about 10 minutes, which is painful (running the unit tests takes the same 10 minutes). It would be great if I could test statements out in a REPL at a breakpoint in the middle of execution, one statement after another, to play around and figure things out faster.
I'm coming from a drop-into-ipython/ipdb background, as well as using the same lldb with Swift. I'm very new to C++.
The lldb command expr - aliased to p - should do what you want. expr will run the code you provide "as though it had been inserted in the current context" and will return the expression's result. So for instance, if you have a function and want to see what it returns for various inputs, you can do:
(lldb) expr my_new_function(10, 20, 30)
You can even debug functions while handling such interesting inputs by doing:
(lldb) break set -n my_new_function
(lldb) expr -i 0 -- my_new_function(10, 20, 30)
The -i 0 tells lldb to stop at breakpoints hit while running the expression; by default lldb ignores breakpoints in hand-called functions.
You can also define new objects & structs in the expression command, but unlike in the Swift REPL mode(*), user-defined objects, functions & variables have to be named with an initial $ to keep them from getting confused with names from your code. For instance, if you have a class A:
(lldb) expr A $myA()
will make a default-constructed object of type A, then you can call methods on it:
(lldb) expr $myA().doSomething()
Note that C++ templates are poorly described in debug information - the current standard only describes instantiations and not the abstract template form. So YMMV when trying to create objects of complex template types. The other danger with C++ is that some functions you want to call may only exist as inlined code, which the debugger can't invoke. This happens for some of the STL types, e.g. the size() method is often not callable on vectors & the like because it's always inlined. That's less of a problem with -O0 code, however.
(*) When you run swift in Terminal w/o arguments to bring up the Swift REPL, you are actually just running lldb in a fancy input mode that feeds what you type to the expr command under the covers. The main difference between the bare expr and the full Swift REPL is that the REPL is focused on making new code, whereas expr is more meant for exploring the code you are stopped in right now, so the name lookup rules are different. There isn't a full C++ REPL mostly because we don't know how to do jobs like "build C++ classes incrementally", which would be required for a real REPL.

Create an executable that calls another executable?

I want to make a small application that runs another application multiple times for different input parameters.
Is this already done?
Is it wrong to use system("myAp param") for each call (of course with a different param value)?
I am using kdevelop on Linux-Ubuntu.
From your comments, I understand that instead of:
system("path/to/just_testing p1 p2");
I shall use:
execl("path/to/just_testing", "path/to/just_testing", "p1", "p2", (char *) 0);
Is that right? Are you saying that execl is safer than system and better to use?
In the non-professional field, using system() is perfectly acceptable, but be warned: people will always tell you that it's "wrong." It's not wrong; it's a way of solving your problem without getting too complicated. It's a bit sloppy, yes, but it is still a usable (if somewhat less portable) option. The value returned by the system() call lets you recover the exit status of the application you're calling. Based on the limited information in your post, I assume that's all you really want to know.
DIFFERENCES BETWEEN SYSTEM AND EXEC
system() will invoke the default command shell, which will execute the command passed as argument.
Your program will wait until the command has finished executing, then it will continue.
The value you get back is not about the success of the command itself; it mainly tells you whether the command shell could be opened correctly.
A plus of system() is that it's part of the standard library.
With exec(), your process (the calling process) is replaced. Moreover, you cannot invoke a script or an internal shell command. You could follow a commonly used technique (fork first, then exec in the child, as sketched below): Differences between fork and exec
So they are quite different (for further details you could see: Difference between "system" and "exec" in Linux?).
A fairer comparison is between posix_spawn() and system(): posix_spawn() is more complex, but it allows you to read the external command's return code.
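For illustration, here is a minimal sketch of that fork/exec/wait pattern, reusing the just_testing example from the question (the run_just_testing wrapper name is just for illustration, and error handling is kept to a minimum):
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>

// Runs "path/to/just_testing p1 p2" and returns its exit status, or -1 on failure.
int run_just_testing(const char* p1, const char* p2) {
    pid_t pid = fork();                       // duplicate the calling process
    if (pid < 0) { perror("fork"); return -1; }
    if (pid == 0) {                           // child: replace its image with the program
        execl("path/to/just_testing", "just_testing", p1, p2, (char*)nullptr);
        perror("execl");                      // only reached if execl itself failed
        _exit(127);
    }
    int status = 0;
    waitpid(pid, &status, 0);                 // parent: wait for the child, collect its status
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}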
SECURITY
system() (or popen()) can be a security risk since certain environment variables (like $IFS / $PATH) can be modified so that your program will execute external programs you never intended it to (i.e. a command is specified without a path name and the command processor path name resolution mechanism is accessible to an attacker).
Also the system() function can result in exploitable vulnerabilities:
when passing an unsanitized or improperly sanitized command string originating from a tainted source;
if a relative path to an executable is specified and control over the current working directory is accessible to an attacker;
if the specified executable program can be spoofed by an attacker.
For further details: ENV33-C. Do not call system()
Anyway... I like Somberdon's answer.

Any way to break only if another break is encountered in Visual C++ 2008?

I have found myself in this situation many times: I need to break into a function that is called hundreds of times, but only after a particular breakpoint has been hit.
So let's say there is a function that updates the status of objects and is called multiple times per frame. I am testing a function that edits an object. As soon as that function is hit, I then want to be able to break into the UpdateStatus function. Obviously, if I just put a breakpoint in UpdateStatus it will always break and I will never be able to interact with the program. What would be great is if I could set a condition on the breakpoint to break only if the breakpoint in the other function has been hit. Note that this is just an example.
I am using Visual C++ 2008.
I remember running into a situation like this myself. I believe you can combine Visual Studio tracepoints with Visual Studio macros to do this pretty easily. This page describes how to write a macro that turns on breakpoints: http://weseetips.com/tag/enable-breakpoint/ Since you want to enable only a single breakpoint, you'll want to use some combination of file and line number in your macro to enable only the breakpoint you want--you can find the breakpoint object members here: http://msdn.microsoft.com/en-us/library/envdte.breakpoint.aspx (File and FileLine look particularly useful)
This page describes how to use a 'tracepoint' to run a macro: http://msdn.microsoft.com/en-us/library/232dxah7.aspx (This page has some nice screenshots of setting up tracepoints: http://weblogs.asp.net/scottgu/archive/2010/08/18/debugging-tips-with-visual-studio-2010.aspx in VS2010)
So you could create a macro that will enable your breakpoint, and create a tracepoint that executes your macro. (You can even add a second tracepoint that disables the breakpoint afterward, so that you can easily repeat the process.)
You might be able to place a conditional breakpoint within the UpdateStatus function itself.
Alternatively, place a conditional breakpoint at the call-site of UpdateStatus then perform the step-in manually.
Whether you'll be able to do one or the other (or any at all) depends on how complex the breakpoint condition is and whether input for that condition is "reachable" from the particular stack frame.
You could have the first breakpoint set a flag when hit, then let the second breakpoint check that flag as a condition.
You can set the flag by specifying a "message" to be printed when hit, like so:
{flag = 1;}
flag needs to exist in scope, of course.
(It would be nice if it were possible to declare a variable that only exists during the debugging session, but I'm not aware of a way to do this.)
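As a tiny sketch of that idea (the flag name below is made up; it only has to be a variable the debugger can see and modify):
// Debug-only flag shared by the two breakpoints (hypothetical name).
#ifdef _DEBUG
volatile bool g_updateStatusArmed = false;
#endif
// Tracepoint "message" on the breakpoint in the editing function: {g_updateStatusArmed = true}
// Condition on the UpdateStatus breakpoint:                       g_updateStatusArmed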

Logging code execution in C++

Having used gprof and callgrind many times, I have reached the (obvious) conclusion that I cannot use them efficiently when dealing with large (as in a CAD program that loads a whole car) programs. I was thinking that maybe, I could use some C/C++ MACRO magic and somehow build a simple (but nice) logging mechanism. For example, one can call a function using the following macro:
#define CALL_FUN(fun_name, ...) \
fun_name (__VA_ARGS__);
We could add some clocking/timing stuff before and after the function call, so that every function called with CALL_FUN gets timed, e.g.:
#define CALL_FUN(fun_name, ...) \
time(&t0); \
fun_name (__VA_ARGS__); \
time(&t1);
The variables t0, t1 could be found in a global logging object. That logging object can also hold the calling graph for each function called through CALL_FUN. Afterwards, that object can be written in a (specifically formatted) file, and be parsed from some other program.
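As a rough sketch of this idea (the Logger type and g_log name are made up, and std::chrono is used instead of time_t for better resolution; like the original, the macro discards the called function's return value):
#include <chrono>
#include <cstdio>

// Hypothetical global logging object; a real one would also record the call graph.
struct Logger {
    void record(const char* name, double ms) { std::printf("%s: %.3f ms\n", name, ms); }
};
Logger g_log;

#define CALL_FUN(fun_name, ...)                                            \
    do {                                                                   \
        auto t0 = std::chrono::steady_clock::now();                        \
        fun_name(__VA_ARGS__);                                             \
        auto t1 = std::chrono::steady_clock::now();                        \
        g_log.record(#fun_name,                                            \
            std::chrono::duration<double, std::milli>(t1 - t0).count());   \
    } while (0)

// Usage: CALL_FUN(solve_system, matrix, rhs); instead of solve_system(matrix, rhs);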
So here comes my (first) question: Do you find this approach tractable? If yes, how can it be enhanced, and if not, can you propose a better way to measure time and log call graphs?
A colleague proposed another approach to deal with this problem: annotate each function we care to log with a specific comment. Then, during the make process, a special preprocessor runs, parses each source file, adds logging logic for each function we care to log, creates a new source file with the added logging code, and builds that instead. I guess that having CALL_FUN... macros (my proposal) all over the place is not the best approach, and his approach would solve this problem. So what is your opinion of this approach?
PS: I am not well versed in the pitfalls of C/C++ macros, so if this can be done using another approach, please say so.
Thank you.
Well, you could do some C++ magic to embed a logging object. Something like:
class CDebug
{
public:
    CDebug()  { /* ... log entry somehow ... */ }
    ~CDebug() { /* ... log exit somehow ... */ }
};
In your functions you then simply write:
void foo()
{
    CDebug dbg;
    ...
    // you could add some debug info
    dbg.heythishappened();
    ...
} // now the dtor is called, even if the function is exited early from somewhere else.
I am a bit late, but here is what I am doing for this:
On Windows there is a /Gh compiler switch which makes the compiler insert a hidden call to a _penter function at the start of each function. There is also a corresponding switch (/GH) for getting a _pexit call at the end of each function.
You can utilize this to get callbacks on each function call. Here is an article with more details and sample source code:
http://www.johnpanzer.com/aci_cuj/index.html
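To give an idea of the shape of it, here is a rough x86-only sketch (the LogEnter name is made up; the article above has the real sample, and the translation unit containing these hooks must itself be built without /Gh to avoid infinite recursion):
extern "C" void __cdecl LogEnter(void* fn);   // hypothetical: records fn in the ring buffer

// _penter must preserve all registers, hence the naked function and pushad/popad.
extern "C" void __declspec(naked) __cdecl _penter()
{
    __asm {
        pushad                 // save all general-purpose registers (8 x 4 bytes)
        mov  eax, [esp + 32]   // return address = a point just inside the instrumented function
        push eax
        call LogEnter          // hand that address to the logger
        add  esp, 4            // LogEnter is __cdecl, so the caller cleans up
        popad
        ret
    }
}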
I am using this approach in my custom logging system for storing the last few thousand function calls in a ring buffer. This turned out to be useful for crash debugging (in combination with MiniDumps).
Some notes on this:
The performance impact very much depends on your callback code. You need to keep it as simple as possible.
You just need to store the function address and module base address in the log file. You can then later use the Debug Interface Access SDK to get the function name from the address (via the PDB file).
All this works surprisingly well for me.
Many nice industrial libraries have function declarations and definitions wrapped in no-op macros, just in case. If your code is already like that, go ahead and debug your performance problems with a simple asynchronous trace logger. If not, post-insertion of such macros can be unacceptably time-consuming.
I can understand the pain of running a 1Mx1M matrix solver under valgrind, so I would suggest starting with the so-called "Monte Carlo profiling method": start the process and, in parallel, run pstack repeatedly, say once per second. As a result you will have N stack dumps (N can be quite significant). The mathematical approach is then to count the relative frequency of each stack and draw a conclusion about the most frequent ones. In practice you either immediately see the bottleneck or, if not, you switch to bisection, gprof, and finally to valgrind's toolset.
Let me assume the reason you are doing this is that you want to locate any performance problems (bottlenecks) so you can fix them to get higher performance, as opposed to measuring speed or getting coverage info.
It seems you're thinking the way to do this is to log the history of function calls and measure how long each call takes.
There's a different approach. It's based on the idea that the program mainly walks a big call tree. If time is being wasted, it is because the call tree is more bushy than necessary, and during the time that's being wasted, the code that's doing the wasting is visible on the stack. It can be terminal instructions, but more likely function calls, at almost any level of the stack.
Simply pausing the program under a debugger a few times will eventually display it. Anything you see it doing, on more than one stack sample, if you can improve it, will speed up the program. It works whether or not the time is being spent in CPU, I/O, or anything else that consumes wall clock time.
What it doesn't show you is tons of stuff you don't need to know. The only way it can not show you bottlenecks is if they are very small, in which case the code is pretty near optimal.
Here's more of an explanation.
Although I think it will be hard to do anything better than gprof, you can create a special class, LOG for instance, and instantiate it at the beginning of each function you want to log.
class LOG {
public:
    LOG(const char* func, const char* file /* , ... */) {
        // remember the arguments and the time_t of the beginning of the call
    }
    ~LOG() {
        // calculate the total time spent,
        // by difference between the current time and that saved in the constructor
    }
};
void somefunction() {
    LOG log(__FUNCTION__, __FILE__ /* , ... */);
    // .. do other things
}
Now you can integrate this approach with the preprocessing one you mentioned. Just add something like this in the beginning of each function you want to log:
// ### LOG
and then you replace the string automatically in debug builds (shouldn't be hard).
Maybe you should use a profiler. AQTime is a relatively good one for Visual Studio. (If you have VS2010 Ultimate, you already have a profiler.)

How do I get a print of the functions called in a C++ program under Linux?

What I want is a mix of what can be obtained by static code analysis like Doxygen and the stack frame you can see when using GDB. I know which problematic function I'm debugging, and I want to see the neighbourhood of the function calls that guided the execution to this function call. For instance, running a simple HelloWorld! would output something like:
main:
Greeter::Greeter()
Greeter::printHello()
Greeter::printWorld()
denoting that from the main function, the constructor was called and then the printHello and printWorld functions were called. Notice that in GDB, if I break at printWorld, I won't be able to see in the stack frame that printHello was called.
Any ideas about how to trace function calls without going through the pain of inserting log messages in a myriad of source files?
Thanks!!
The -finstrument-functions option to gcc instructs the compiler to call a user-provided profiling function at every function entry and exit.
You could use this to write a function that just logs every function entry and exit.
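A minimal sketch of such hooks (link this into the program and compile the code you want traced with -finstrument-functions; the output format is just an example, and resolving the raw addresses to names can be done afterwards with tools like addr2line or dladdr):
#include <cstdio>

// Called on every entry/exit of instrumented functions; the attribute keeps
// the hooks themselves from being instrumented (which would recurse).
extern "C" {

__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void* this_fn, void* call_site) {
    std::fprintf(stderr, "enter %p (from %p)\n", this_fn, call_site);
}

__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void* this_fn, void* call_site) {
    std::fprintf(stderr, "exit  %p (from %p)\n", this_fn, call_site);
}

} // extern "C"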
From reading the question I understand that you want a list of all relevant functions, in the order in which they're executed.
Unfortunately there is no application to generate this list automatically, but there are helper macros to save you a lot of time. Define a single macro called LOGFUNCTION (or whatever you want) as:
#define LOGFUNCTION printf("In %s (%s:%d)\n", __PRETTY_FUNCTION__, __FILE__, __LINE__);
Now you do have to paste the line LOGFUNCTION wherever you want a trace to be added.
see http://gcc.gnu.org/onlinedocs/gcc/Function-Names.html and http://gcc.gnu.org/onlinedocs/cpp/Standard-Predefined-Macros.html
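For instance, pasted into the printHello function from the question's Greeter example, it would look like this:
void Greeter::printHello() {
    LOGFUNCTION
    // ... the rest of the function ...
}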
GDB features a stack trace; it does what you ask for.
What he wants is to obtain that info (for example, a backtrace from gdb) but printed in a 'nicer' format than gdb does.
I think you can't. I mean, maybe there is some kind of app that traces your application and does something like that, but I have never heard of one.
The best thing you can do is use GDB, and maybe create some kind of bash script that uses gdb to obtain the info and print it out in the way you like.
Of course, your application MUST be compiled with debug symbols (-g param to gcc).
I'm not entirely sure what the problem is with gdb's backtrace, but maybe a profiler is closer to what you want? For example, using valgrind:
valgrind --tool=callgrind ./myprogram
kcachegrind callgrind.out.NNNN
Have you tried to use gprof to generate a call graph? You can also convert gprof output to something easier on the eye with gprof2dot for example.