GDB break on exception thrown when called from specific function - c++

I would like to use GDB to break when an exception is thrown only when the stack goes through a specific function.
My use case is that I have a Thread class whose doRun() function is called in a new thread. That thread catches any exception that bubbles up, but I would like to be able to break when the exception is thrown (not caught).
I know GDB can do "reverse debugging" (awesome concept) so this could potentially be used, but I'd like something more general purpose -- in fact, I'd like this solution to find its way to my .gdbinit file.

Recent versions of gdb have some convenience functions that are useful for this, e.g., "$_any_caller_matches". These are written in Python, so even if your gdb doesn't have them built-in, you might be able to grab the code and just drop it into your gdb.
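You would use it like this (untested, but you get the idea). Note that the optional second argument, the number of frames to look up the stack, defaults to 1, so a deep call chain needs a larger value: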
catch throw if $_any_caller_matches("Thread::doRun")

It doesn't appear that $caller_matches (or its sister function $caller_is) is included in the base installation of older GDB versions (newer releases ship them as $_caller_matches and $_caller_is).
If your GDB lacks them, add the source code to your GDB Python functions folder.
This folder is usually found in /usr/share/gdb/python/gdb/function; the filename should be caller_is.py.
Note that $caller_matches is implemented with re.match, so make sure the string you pass works with that function (the pattern is matched from the start of the function name).
Both functions also take an optional second parameter, defaulting to 1, that specifies how far up the stack to look. If you omit it, only the direct caller of the current function is checked. To check a specific stack position (e.g. the grandparent caller), use 2, 3, etc.
I've included the source below.
# Caller-is functions.
# Copyright (C) 2008 Free Software Foundation, Inc.
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
import gdb
import re

class CallerIs (gdb.Function):
    """Return True if the calling function's name is equal to a string.
This function takes one or two arguments.
The first argument is the name of a function; if the calling function's
name is equal to this argument, this function returns True.
The optional second argument tells this function how many stack frames
to traverse to find the calling function. The default is 1."""

    def __init__ (self):
        super (CallerIs, self).__init__ ("caller_is")

    def invoke (self, name, nframes = 1):
        frame = gdb.selected_frame ()
        while nframes > 0:
            frame = frame.older ()
            nframes = nframes - 1
        return frame.name () == name.string ()

class CallerMatches (gdb.Function):
    """Return True if the calling function's name matches a string.
This function takes one or two arguments.
The first argument is a regular expression; if the calling function's
name is matched by this argument, this function returns True.
The optional second argument tells this function how many stack frames
to traverse to find the calling function. The default is 1."""

    def __init__ (self):
        super (CallerMatches, self).__init__ ("caller_matches")

    def invoke (self, name, nframes = 1):
        frame = gdb.selected_frame ()
        while nframes > 0:
            frame = frame.older ()
            nframes = nframes - 1
        return re.match (name.string (), frame.name ()) is not None

CallerIs()
CallerMatches()
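The frame-walking logic in the listing above can be tried outside GDB. What follows is a minimal sketch, not GDB's own code: FakeFrame is a hypothetical stand-in for gdb.Frame that offers only name() and older(), the helper names and the frame names in the sample stack are made up for illustration, and any_caller_matches here simply checks each of the nearest nframes callers. Unlike the listing, it guards against walking off the top of the stack, and it shows that re.match only matches at the start of the function name.

```python
import re


class FakeFrame:
    """Hypothetical stand-in for gdb.Frame: only name() and older()."""

    def __init__(self, name, older=None):
        self._name = name
        self._older = older

    def name(self):
        return self._name

    def older(self):
        return self._older


def caller_matches(frame, pattern, nframes=1):
    """Test the single frame that is exactly nframes above `frame`."""
    while nframes > 0 and frame is not None:
        frame = frame.older()
        nframes -= 1
    # Guard against running past the outermost frame or an unnamed frame.
    if frame is None or frame.name() is None:
        return False
    return re.match(pattern, frame.name()) is not None


def any_caller_matches(frame, pattern, nframes=1):
    """Test each of the nearest nframes callers; True if any one matches."""
    for _ in range(nframes):
        frame = frame.older()
        if frame is None:
            return False
        name = frame.name()
        if name is not None and re.match(pattern, name) is not None:
            return True
    return False


# Illustrative stack, innermost frame first (all names are assumptions).
stack = FakeFrame("__cxa_throw",
                  FakeFrame("Worker::step",
                            FakeFrame("Thread::doRun",
                                      FakeFrame("start_thread"))))

if __name__ == "__main__":
    print(caller_matches(stack, "Thread::doRun", 2))
    print(caller_matches(stack, "doRun", 2))      # re.match anchors at the start
    print(any_caller_matches(stack, "Thread::doRun", 5))
```

Because the match is anchored at the start of the name, a pattern such as ".*doRun" is needed to match on a suffix of a qualified name.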

Related

C++11 - Optional callback fully optimized if not given inside of a linkable library

I'm making a library that is meant to run on an embedded device. This library does some specific memory accesses and I want to let the user register a pre-access callback to prepare the memory (such as flushing a cache, for instance). I want this callback to be optional. Ideally, if no callback is provided, I'd like the library to optimize that and go straight to its task without calling an empty func or even testing for the existence of the callback. How can I achieve that?
Here's what I considered:
1. Defining the callback at runtime and passing a function pointer: I do not think I can optimize that, as I will need to test the function pointer for null.
2. Defining a default empty callback as a weak symbol: that works, but I end up with a call to an empty function that doesn't get optimized away. I've been able to strip that with the gcc -s option, but that does more than just speed optimization.
3. Preprocessor macro: that could work, but the user will need to recompile the library depending on whether they want to add a callback or not; not so convenient.
Ideally, option #2 would be the way to go. What bothers me is the definition of the -s option:
-s : Remove all symbol table and relocation information from the executable.
That seems to do way more than I am asking for. I'm not exactly sure what the consequences of removing all that are. My goal is simply to get a correct optimization for something that is obviously optimizable when disassembling the code. By that, I mean a call/ret sequence.

How to make a C++ class gdb-friendly?

Consider the following example:
std::string s = "Hello!!";
(gdb) p s
$1 = "Hello!!"
Essentially, just providing the variable name is good enough to display the string. I don't have to type, for example, "p s.c_str()".
Is gdb using any implicit operator to get the display string? I need to do something similar for my class. Here is a trivial example of my class:
class MyClass {
private:
    std::string _name;
};
You need to write a pretty-printer for your class. It is not something you do in your C++ class, but something you do in gdb (although matching your C++ class). The easiest way to do that is through gdb's Python API (you can also use the Guile language).
GDB already comes with pretty-printers for most of the standard library classes and that is why you can easily see an std::string object, an std::vector, etc. If you type info pretty-printer in gdb it will tell you about the pretty-printers it currently knows about and you will notice many std::something pretty printers.
If you pass /r to the print command in gdb, it will print the variable without using any registered pretty-printer that matches it. Try that with an std::string to see how it would be printed if gdb didn't come with a pretty-printer for it.
So, how can you write your own pretty-printers? For that, you should read GDB's documentation on this topic. But I find it much easier to start by reading and tweaking some existing pretty-printer you can find and then read gdb's documentation for the details.
For instance, I have a Coordinate class in one of my projects as below
class Coordinate {
private:
    double x;
    double y;
    double z;
public:
    ...
};
It's very easy to write a pretty-printer for this class. You create a python file with the following code
import gdb.printing


class CoordinatePrinter:
    def __init__(self, val):
        # val is the Python representation of your C++ variable.
        # It is a "gdb.Value" object and you can query the member
        # attributes of the C++ object as below. Since the result is
        # another "gdb.Value", I'm converting it to a Python float.
        self.x = float(val['x'])
        self.y = float(val['y'])
        self.z = float(val['z'])

    # Whatever the `to_string` method returns is what will be printed in
    # gdb when this pretty-printer is used
    def to_string(self):
        return "Coordinate(x={:.2G}, y={:.2G}, z={:.2G})".format(self.x, self.y, self.z)


# Create a "collection" of pretty-printers.
# Note that the argument passed to "RegexpCollectionPrettyPrinter" is the
# name of the collection and you can choose your own.
pp = gdb.printing.RegexpCollectionPrettyPrinter('cppsim')
# Register a pretty-printer for the Coordinate class. The second argument is a
# regular expression and my Coordinate class is in a namespace called `cppsim`
pp.add_printer('Coordinate', '^cppsim::Coordinate$', CoordinatePrinter)
# Register our collection into GDB
gdb.printing.register_pretty_printer(gdb.current_objfile(), pp, replace=True)
Now all we need to do is to source this python file in gdb. For that, write in your .gdbinit file
source full_path_to_your_python_file_with_pretty_printers.py
When you start gdb it will run your .gdbinit file, which will load your pretty-printers. Note that these pretty-printers will often also work inside IDEs that use gdb.
If you are interested in more examples, I have created pretty-printers to some classes in the Armadillo library (vector, matrices and general linear algebra) which are available here.
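Applied to the MyClass from the question, the same pattern might look like the sketch below. It assumes MyClass lives in the global namespace and that its only interesting member is _name; the collection name 'myproject' is made up. The import guard is only there so the file can also be loaded outside GDB (for example to unit-test to_string with a plain dict); inside GDB, how the std::string member is rendered depends on the libstdc++ printers being loaded.

```python
try:
    import gdb
    import gdb.printing
except ImportError:  # not running inside GDB
    gdb = None


class MyClassPrinter:
    """Sketch of a pretty-printer for the question's MyClass (assumed to be
    in the global namespace)."""

    def __init__(self, val):
        # Inside GDB, val is a gdb.Value; indexing by member name yields
        # another gdb.Value for the std::string member.
        self.name = val['_name']

    def to_string(self):
        # Formatting the member with str() defers to GDB's own rendering
        # of that value when running inside GDB.
        return "MyClass(name={})".format(self.name)


def build_printers():
    # 'myproject' is an arbitrary collection name chosen for this sketch.
    pp = gdb.printing.RegexpCollectionPrettyPrinter('myproject')
    pp.add_printer('MyClass', '^MyClass$', MyClassPrinter)
    return pp


if gdb is not None:
    gdb.printing.register_pretty_printer(
        gdb.current_objfile(), build_printers(), replace=True)
```

Outside GDB, MyClassPrinter({'_name': '"Hello!!"'}).to_string() exercises the formatting with a dict standing in for the gdb.Value.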
If you have the libstdc++ pretty-printers installed (you most likely already do), your class is already gdb-friendly, because they will be invoked when printing the members of your class. If you have many other class members besides _name, you can also use set print pretty to distinguish between them more easily:
(gdb) p my_class
$1 = {_name = ""}
(gdb) set print pretty
(gdb) p my_class
$2 = {
  _name = ""
}
(gdb)

How to convert function insertion module pass to intrinsic to inline

PROBLEM:
I currently have a traditional module instrumentation pass that
inserts new function calls into a given IR according to some logic
(inserted functions are external from a small lib that is later linked
to given program). Running experiments, my overhead is from
the cost of executing a function call to the library function.
What I am trying to do:
I would like to inline these function bodies into the IR of
the given program to get rid of this bottleneck. I assume an intrinsic
would be a clean way of doing this, since an intrinsic function would
be expanded to its function body when being lowered to ASM (please
correct me if my understanding is incorrect here, this is my first
time working with intrinsics/LTO).
Current Status:
My original library call definition:
void register_my_mem(void *user_vaddr){
... C code ...
}
So far:
I have created a def in: llvm-project/llvm/include/llvm/IR/IntrinsicsX86.td
let TargetPrefix = "x86" in {
def int_x86_register_mem : GCCBuiltin<"__builtin_register_my_mem">,
Intrinsic<[], [llvm_anyint_ty], []>;
}
Added another def in:
otwm/llvm-project/clang/include/clang/Basic/BuiltinsX86.def
TARGET_BUILTIN(__builtin_register_my_mem, "vv*", "", "")
Added my library source (*.c, *.h) to the compiler-rt/lib/test_lib
and added to CMakeLists.txt
Replaced the function insertion with trying to insert the intrinsic
instead in: llvm/lib/Transforms/Instrumentation/myModulePass.cpp
WAS:
FunctionCallee sm_func =
curr_inst->getModule()->getOrInsertFunction("register_my_mem",
func_type);
ArrayRef<Value*> args = {
builder.CreatePointerCast(sm_arg_val, currType->getPointerTo())
};
builder.CreateCall(sm_func, args);
NEW:
Intrinsic::ID aREGISTER(Intrinsic::x86_register_my_mem);
Function *sm_func = Intrinsic::getDeclaration(currFunc->getParent(),
aREGISTER, func_type);
ArrayRef<Value*> args = {
builder.CreatePointerCast(sm_arg_val, currType->getPointerTo())
};
builder.CreateCall(sm_func, args);
Questions:
If my logic for inserting the intrinsic functions shouldn't be a
module pass, where do I put it?
Am I confusing LTO with intrinsics?
Do I put my library function definitions into the following files, as mentioned in
http://lists.llvm.org/pipermail/llvm-dev/2017-June/114322.html, for example as EmitRegisterMyMem()?
clang/lib/CodeGen/CodeGenFunction.cpp - define llvm::Intrinsic::ID
clang/lib/CodeGen/CodeGenFunction.h - declare llvm::Intrinsic::ID
My LLVM compiles, so it is semantically correct, but currently when
trying to insert this function call, LLVM segfaults saying "Not a valid type for function argument!"
I'm seeing multiple issues here.
Indeed, you're confusing LTO with intrinsics. Intrinsics are special "functions" that are either expanded into special instructions by a backend or lowered to library function calls. That is not what you are trying to achieve. You don't need an intrinsic at all; you just need to inline the function call in question, either by hand (from your module pass) or via LTO.
The particular error arises because you're declaring your intrinsic as taking an integer argument (and this is what the declaration would look like), but you are:
requesting the declaration of an overloaded intrinsic with an invalid type (I'd assume your func_type is a non-integer type)
passing a pointer argument
Hope this makes the issue clear.
See also: https://llvm.org/docs/LinkTimeOptimization.html
Thank you for clearing up the issue, @Anton Korobeynikov.
After reading your explanation, I also believe that I have to use LTO to accomplish what I am trying to do. I found this link especially useful: https://llvm.org/docs/LinkTimeOptimization.html. It seems that I am now on the right path.

How can OCaml values be printed outside the toplevel?

The OCaml repl ("toplevel") has rich printing for any types, user-defined or otherwise. Is it possible to access this functionality outside the toplevel, rather than having to write an entirely custom set of value printers for one's own entire set of types?
The pretty-printing facility is part of the toplevel library. You'll find the source in toplevel/genprintval.ml. It's understandable, considering that it needs type information: you can't just throw any value at it, the choice of pretty-printer is based on the type.
If you want to use this code in your program, you'll need to link with the toplevel library (toplevellib.cma) or compile in genprintval (which means bringing in enough bits of the type checker to analyse the type, it can get pretty big).
There is a similar facility (but not sharing the code, I think) in the debugger (debugger/printval.ml and debugger/loadprinter.ml).
There are third-party libraries that you can directly link against and that provide pretty-printing facilities. Extlib's Std.dump provides a very crude facility (not based on the type). Deriving by Jeremy Yallop and Jake Donham is another approach. This Caml Weekly News item offers more suggestions.
The OCaml Batteries Included library contains the dump function in its BatPervasives module. It converts any value to a string and returns it. You can see its source code here. The output will not be identical to the toplevel's, because some information is lost at runtime, e.g. abstract data type constructors will become integers.
No. As of OCaml 4.06, the compiler doesn't make type information available at runtime. It is therefore not possible to have standalone programs that nicely print any OCaml data without some compromises. The two main avenues are:
Some form of preprocessing which derives printers from type definitions. Today, the best approach might be the show plugin of ppx-deriving. This requires annotating each type definition.
Relying only on the runtime representation of values. This requires no effort from the programmer and works out-of-the-box on data produced by external libraries. However it doesn't show things like record field names or any other information that was lost during compilation. An instance of this approach is detailed below.
The function Dum.to_stdout from the dum package will take any OCaml value, including cyclic ones, and print their physical representation in a human-readable form given the data available at runtime only.
Simple things give more or less what one would expect:
# Dum.to_stdout ("Hello", 42, Some `Thing, [1;2;3]);;
("Hello" 42 (582416334) [ 1 2 3 ])
Cyclic—and in general, shared—values are shown using labels and references. This is a circular list:
# let rec cyc = 1 :: 2 :: cyc;;
# Dum.to_stdout cyc;;
#0: (1 (2 #0))
We can also look into the runtime representation of functions, modules and other things. For example, the Filename module can be inspected as follows:
# module type Filename = module type of Filename;;
# Dum.to_stdout (module Filename : Filename);;
(
#0: "."
".."
#1: "/"
#2: closure (#1 #3: closure ())
#4: closure ()
closure (#4)
closure ()
closure ()
closure (#5: closure (#3))
closure (#5)
closure (#5)
closure (closure () #3 #0)
closure (closure () #3 #0)
closure (#6: closure (#2 <lazy>) #7: (#8))
closure (#6 #7)
closure (#7)
closure (#7)
#8: "/tmp"
closure (closure () "'\\''")
)
I know you want this outside of the toplevel, but since that turns out not to be trivial, I think it's worth mentioning how to do it in the toplevel, for people who just want to print values any way they can:
Load your file in the toplevel:
utop
#use "datatypes.ml";;
Then evaluate the variable inside the toplevel:
utop # let nada = Nothing;;
utop # nada;;
- : foo = Nothing
ref: https://discuss.ocaml.org/t/how-does-one-print-any-type/4362/16?u=brando90

Finding invocations of a certain function in a c++ file using python

I need to find all occurrences of a function call in a C++ file using python, and extract the arguments for each call.
I'm playing with the pygccxml package, and extracting the arguments given a string with the function call is extremely easy:
from pygccxml.declarations import call_invocation

def test_is_call_invocation(call):
    # `call` is a string such as "foo(1, 2)"
    if call_invocation.is_call_invocation(call):
        print(call_invocation.name(call))
        for arg in call_invocation.args(call):
            print(" ", arg)
    else:
        print("not a function invocation")
What I couldn't find is a way of getting the calls parsing a file:
from pygccxml import parser
from pygccxml import declarations
decls = parser.parse( ['main.cpp'] )
# ...
Is there a way to find the calls to a certain function using the pygccxml package?
Or maybe that package is an overkill for what I'm trying to do :) and there's a much simpler way? Finding the function calls with a regular expression is, I'm afraid, much trickier than it might look at a first sight...
GCC-XML can't do that, because it only reports the data types (and function signatures). It ignores the function bodies. To see that, create a.cc:
void foo()
{}
void bar()
{
foo();
}
and then run gccxml a.cc -fxml=a.xml. Look at the generated a.xml to see that the only mention of foo (or its id) is in the declaration of foo.
An alternative might be available in codeviz (http://www.csn.ul.ie/~mel/projects/codeviz/). It consists of a patch to gcc 3.4.6 that generates call dependency information - plus some perl scripts that generate graphviz input; the latter you can safely ignore.
As yet another alternative (which doesn't need gcc modifications) you could copy the approach from egypt (http://www.gson.org/egypt/); this parses GCC RTL dumps. It should work with any recent GCC, however, it might be that you don't get calls to inline functions.
In any case, with these approaches you won't get "calls" to macros, but that might actually be the better choice.
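If a compiler-based approach is more than you need, a purely textual scan is possible, with the caveats the question already suspects. The following is a rough heuristic sketch, not a parser: find_calls is a made-up helper that locates the function name followed by an opening parenthesis and splits the arguments on top-level commas by tracking bracket depth. It also reports declarations and definitions of the function, and it is confused by parentheses or commas inside string literals and comments, by template argument lists, and by macros.

```python
import re


def find_calls(source, func_name):
    """Return one argument list per textual occurrence of func_name(...)."""
    calls = []
    # \b keeps 'foo' from matching inside a longer identifier like 'myfoo'.
    pattern = re.compile(r'\b' + re.escape(func_name) + r'\s*\(')
    for m in pattern.finditer(source):
        args, current, depth = [], [], 1
        i = m.end()
        while i < len(source):
            ch = source[i]
            if ch in '([{':
                depth += 1
            elif ch in ')]}':
                depth -= 1
                if depth == 0:
                    break  # closing parenthesis of this call
            if ch == ',' and depth == 1:
                # A comma at the top level separates two arguments.
                args.append(''.join(current).strip())
                current = []
            else:
                current.append(ch)
            i += 1
        if depth != 0:
            continue  # unbalanced brackets: skip this occurrence
        last = ''.join(current).strip()
        if last or args:
            args.append(last)
        calls.append(args)
    return calls


# Illustrative input; log_msg and fmt are hypothetical function names.
SAMPLE = 'int main() { log_msg("a", 1); log_msg(fmt(x, y), z[2]); }'

if __name__ == "__main__":
    for call_args in find_calls(SAMPLE, "log_msg"):
        print(call_args)
```

On the a.cc example above, this scan reports foo twice, once for the definition and once for the call inside bar, which is exactly the kind of false positive a compiler-based tool avoids.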