Invoking function with string argument with lldb: how? - c++

I am unable to use lldb to invoke simple, non-templated functions that take string arguments. Is there any way to get lldb to understand the C++ datatype "string", which is a commonly used datatype in C++ programs?
The sample source code here just creates a simple class with a few constructors, and then calls them (includes of "iostream" and "string" omitted):
using namespace std;
struct lldbtest{
int bar=5;
lldbtest(){bar=6;}
lldbtest(int foo){bar=foo;}
lldbtest(string fum){bar=7;}
};
int main(){
string name="fum";
lldbtest x,y(3);
cout<<x.bar<<y.bar<<endl;
return 0;
}
When compiled on OS X Mavericks with
g++ -g -std=c++11 -o testconstructor testconstructor.cpp
the program runs and prints the expected output of "63".
However, when a breakpoint is set in main just before the return statement, an attempt to invoke the constructor fails with a cryptic error message:
p lldbtest(string("hey there"))
error: call to a function 'lldbtest::lldbtest(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >)' ('_ZN8lldbtestC1ENSt3__112basic_stringIcNS0_11char_traitsIcEENS0_9allocatorIcEEEE') that is not present in the target
error: The expression could not be prepared to run in the target
Possibly relevant as well, the command:
p lldbtest(name)
prints nothing at all.
I also tried calling the constructor the standard way, with a string literal:
p lldbtest("foo")
gives a similar long error:
error: call to a function 'lldbtest::lldbtest(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >)' ('_ZN8lldbtestC1ENSt3__112basic_stringIcNS0_11char_traitsIcEENS0_9allocatorIcEEEE') that is not present in the target
error: The expression could not be prepared to run in the target
Is there any way to get lldb to understand and use the C++ "string" datatype? I have a number of functions taking string arguments and need a way to invoke these functions from the debugger. On a Mac.

THE PROBLEM
This is due to a subtle problem with your code, that boils down to the following wording from the C++ Standard:
7.1.2p3-4 Function specifiers [dcl.fct.spec]
A function defined within a class definition is an inline function.
...
An inline function shall be defined in every translation unit in which it is odr-used, and shall have exactly the same definition in every case (3.2).
Your constructor, lldbtest(std::string), is defined within the body of lldbtest, which means that it is implicitly inline, which in turn means that the compiler need not generate any code for it unless it is used in the translation unit.
Since the definition must be present in every translation unit that potentially calls it, we can imagine the compiler saying: "Heck, I don't need to do this. If someone else uses it, they will generate the code."
lldb then looks for a function definition that doesn't exist: the compiler didn't generate one because you never used that constructor.
THE SOLUTION
If we change the definition of lldbtest to the following I bet it will work as you intended:
struct lldbtest{
int bar=5;
lldbtest();
lldbtest(int foo);
lldbtest(string fum);
};
lldbtest::lldbtest() { bar=6; }
lldbtest::lldbtest(int foo) { bar=foo; }
lldbtest::lldbtest(string fum) { bar=7; }
But.. what about p lldbtest(name)?
The p command in lldb is used to print information, but it can also be used to evaluate expressions.
lldbtest(name) will not call the constructor of lldbtest with a variable called name; it is equivalent to declaring a variable called name of type lldbtest, i.e. lldbtest name; is semantically equivalent.

Going to answer the asked question here instead of addressing the problem with the op's code. Especially since this took me a while to figure out.
Use a string in a function invocation in lldb in C++
(This post helped greatly, and is a good read: Dancing in The Debugger)

Related

Check if function is user defined in LLVM-IR or not

I am writing a LLVM pass which prints function name only if it is user-defined (which are defined by the user in the source file).
I cannot find any way to distinguish user-defined functions from initialization functions (or static constructors). I tried checking whether the function is only declared or also defined, but that does not work because some init functions are defined as well (like __cxx_global_var_init).
At pass-time, I know of no way to accomplish what you're trying to do.
That said, Clang provides a way to determine this during initial compilation. See: clang::SourceManager::isInSystemHeader(). You would have to write a Clang plugin or a libTooling-based program to take advantage of this as the information is gone once opt is executed. Here is a contrived example of how to do so using an AST visitor:
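bool VisitFunctionDecl(clang::FunctionDecl* funcDecl)
{
    // Skip functions that come from system headers.
    // (Newer Clang releases spell getLocStart() as getBeginLoc().)
    if (sourceManager.isInSystemHeader(funcDecl->getLocStart()))
    {
        return true;
    }
    // funcDecl was written by the user; handle it here.
    return true; // keep traversing the AST
}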

How can I find all places a given member function or ctor is called in g++ code?

I am trying to find all places in a large and old code base where certain constructors or functions are called. Specifically, these are certain constructors and member functions in the std::string class (that is, basic_string<char>). For example, suppose there is a line of code:
std::string foo(fiddle->faddle(k, 9).snark);
In this example, it is not obvious looking at this that snark may be a char *, which is what I'm interested in.
Attempts To Solve This So Far
I've looked into some of the dump features of gcc, and generated some of them, but I haven't been able to find any that tell me that the given line of code will generate a call to the string constructor taking a const char *. I've also compiled some code with -S to save the generated equivalent assembly code. But this suffers from two things: the function names are "mangled," so it's impossible to know what is being called in C++ terms; and there are no line numbers of any sort, so even finding the equivalent place in the source file would be tough.
Motivation and Background
In my project, we're porting a large, old code base from HP-UX (and their aCC C++ compiler) to RedHat Linux and gcc/g++ v.4.8.5. The HP tool chain allowed one to initialize a string with a NULL pointer, treating it as an empty string. The code generated by the GNU tools fails with some flavor of a null dereference error. So we need to find all of the potential cases of this, and remedy them. (For example, by adding code to check for NULL and using a pointer to a "" string instead.)
So if anyone out there has had to deal with the base problem and can offer other suggestions, those, too, would be welcomed.
Have you considered using static analysis?
Clang has one, the Clang Static Analyzer, which is extensible.
You can write a custom plugin that checks for this particular behavior by implementing a Clang AST visitor that looks for string variable declarations and checks whether they are initialized from a null pointer.
There is a manual for that here.
See also: https://github.com/facebook/facebook-clang-plugins/blob/master/analyzer/DanglingDelegateFactFinder.cpp
First I'd create a header like this:
#include <string>
class dbg_string : public std::string {
public:
using std::string::string;
dbg_string(const char*) = delete;
};
#define string dbg_string
Then modify your makefile and add "-include dbg_string.h" to the compiler flags to force-include the header in each source file without modifying the sources.
You could also check how NULL is defined on your platform and add a specific overload for it (e.g. dbg_string(int)).
You can try CppDepend and its CQLinq, a powerful code query language, to detect where particular constructors/methods/fields/types are used.
from m in Methods where m.IsUsing ("CClassView.CClassView()") select new { m, m.NbLinesOfCode }

Make BFD library find the location of a class member function

I am using the function bfd_find_nearest_line to find the source location of a function (from an executable with debugging symbols --compiled with -g). Naturally one of the arguments is a pointer to the function I want to locate:
boolean
_bfd_elf_find_nearest_line (abfd,
section,
symbols,
offset,
filename_ptr,
functionname_ptr, // <- HERE!
line_ptr)
https://sourceware.org/ml/binutils/2000-08/msg00248.html
After quite a bit of (pure C) boilerplate, I managed to get this to work with normal functions (where the normal function pointer is cast to void*).
For example, this works:
int my_function(){return 5;}
int main(){
_bfd_elf_find_nearest_line (...,
(void*)(&my_function),
...);
}
The question is if bfd_find_nearest_line can be used to locate the source code of a class member function.
struct A{
int my_member_function(){return 5.;}
};
_bfd_elf_find_nearest_line (...,
what_should_I_put_here??,
...)
Pointers to class member functions (in this case of type int (A::*)()) are not function pointers, and in particular cannot be cast to any function pointer, not even to void*. See here: https://isocpp.org/wiki/faq/pointers-to-members#cant-cvt-memfnptr-to-voidptr
I completely understand the logic behind this; however, the member-function pointer is the only handle I have from which to obtain information about a member function so that BFD can identify it. I don't want to use this pointer to call the function.
I know more or less how C++ works, the compiler will generate silently an equivalent free-C function,
__A_my_member_function(A* this){...}
But I don't know how to access the address of this free function, or whether that is even possible, and whether the bfd library will be able to locate the source location of the original my_member_function via this pointer.
(For the moment at least I am not interested in virtual functions.)
In other words,
1) I need to know if bfd will be able to locate a member function,
2) and if it can how can I map the member function pointer of type int (A::*)() to an argument that bfd can take (void*).
I know by other means (stack trace) that the pointer exists, for example I can get that the free function is called in this case _ZN1A18my_member_functionEv, but the problem is how I can get this from &(A::my_member_function).
Okay, there's good news and bad news.
The good news: It is possible.
The bad news: It's not straightforward.
You'll need the c++filt utility.
And, some way to read the symbol table of your executable, such as readelf. If you can enumerate the [mangled] symbols with a bfd_* call, you may be able to save yourself a step.
Also, here is a biggie: You'll need the c++ name of your symbol in a text string. So, for &(A::my_member_function), you'll need it in a form: "A::my_member_function()" This shouldn't be too difficult since I presume you have a limited number of them that you care about.
You'll need to get a list of symbols and their addresses from readelf -s <executable>. Be prepared to parse this output. You'll need to decode the hex address from the string to get its binary value.
These will be the mangled names. For each symbol, do c++filt -n mangled_name and capture the output (i.e. a pipe) into something (e.g. nice_name). It will give you back the demangled name (i.e. the nice c++ name you'd like).
Now, if nice_name matches "A::my_member_function()", you have a match: you already have the mangled name but, more importantly, the hex address of the symbol. Feed this hex value [suitably cast] to bfd where you were stuffing functionname_ptr.
Note: The above works but can be slow with repeated invocations of c++filt.
A faster way to do this is to capture the piped output of:
readelf -s <executable> | c++filt
It's also [probably] easier to do it this way since you only have to parse the filtered output and look for the matching nice name.
Also, if you had multiple symbols that you cared about, you could get all the addresses in a single invocation.
OK, I found a way. First, I discovered that bfd is perfectly happy to find debug information for member functions from member pointers, as long as the pointer can be converted to void*.
I was using clang, which wouldn't allow me to cast the member function pointer to any kind of pointer or integer.
GCC allows this but emits a warning.
There is even a flag, -Wno-pmf-conversions, that silences the warning about pointer-to-member-function conversions.
With that information in mind I did my best to convert a member function pointer into void*, and I ended up doing this using a union.
struct A{
int my_member_function(){return 5.;}
};
union void_caster_t{
    int (A::*value)(void); // pointer to member function of A
    void* casted_value;    // same bytes read back as a plain pointer (non-portable)
};
void_caster_t void_caster = {&A::my_member_function};
_bfd_elf_find_nearest_line (...,
void_caster.casted_value,
...)
Finally bfd is able to give me debug information of a member function.
What I didn't figure out yet, is how to get the pointer to the constructor and the destructor member functions.
For example
void_caster_t void_caster = {&A::~A};
Gives compiler error: "you can't take the address of the destructor".
For the constructor I wasn't even able to find the correct syntax, since this fails as a syntax error.
void_caster_t void_caster = {&A::A};
Again, I understand that the reasoning behind these restrictions is that calling through such a pointer would be nonsensical, but my case is different: I want the pointer (or address) only to get debug information, not to use it as a callback.

Force separate copy of gnu_inline function with mingw 4.8

I have a program which I'd like to profile. However, it has some functions declared with attribute gnu_inline. If I try to build the program with -finstrument-functions flag, I get linker errors, for example:
#define always_inline __attribute__((always_inline, gnu_inline))
static int inline always_inline f()
{
return 0;
}
int main(int argc, char *argv[])
{
int i = f();
return i;
}
gives me "undefined reference to f()" error.
The problem is that the compiler tries to pass the address of f() to profiling functions, but the function is inlined, so there is no body of function, and no address of f().
I tried to build my program with -fkeep-inline-functions flag, but it apparently has no effect on functions declared with gnu_inline.
Is it somehow possible to force the compiler to emit a separate copy of f() for the linker? Or are such functions unprofileable?
My program uses Qt 5, and these functions are located in Qt headers, so I'd prefer not to alter function declaration, if it is possible.
The following two options, at least, could make your scenario work, even though there might be a prettier solution:
-finstrument-functions-exclude-file-list=file,file,...
Set the list of functions that are excluded from instrumentation (see the description of "-finstrument-functions"). If the file that contains a function definition matches with one of file, then that function is not instrumented. The match is done on substrings: if the file parameter is a substring of the file name, it is considered to be a match.
For example:
-finstrument-functions-exclude-file-list=/bits/stl,include/sys
excludes any inline function defined in files whose pathnames contain "/bits/stl" or "include/sys".
If, for some reason, you want to include letter ',' in one of sym, write '\,'. For example, "-finstrument-functions-exclude-file-list='\,\,tmp'" (note the single quote surrounding the option).
-finstrument-functions-exclude-function-list=sym,sym,...
This is similar to "-finstrument-functions-exclude-file-list", but this option sets the list of function names to be excluded from instrumentation. The function name to be matched is its user-visible name, such as "vector blah(const vector &)", not the internal mangled name (e.g., "_Z4blahRSt6vectorIiSaIiEE"). The match is done on substrings: if the sym parameter is a substring of the function name, it is considered to be a match. For C99 and C++ extended identifiers, the function name must be given in UTF-8, not using universal character names

how to start the execution of a program in c/c++ from a different function,but not main() [duplicate]

Possible Duplicate:
Does the program execution always start from main in C?
I want to start the execution of my program, which contains 2 functions (excluding main):
void check(void)
void execute(void)
I want to start execution from check(). Is that possible in C/C++?
You can do this with a simple wrapper:
int main()
{
check();
}
You can't portably do it in any other way since the standard explicitly specifies main as the program entry point.
EDIT for comment: Don't ever do this. In C++ you could abuse static initialization to have check called before main, but you still can't legally call main from check. You can only have check run first. As noted in a comment, this doesn't work in C because C requires constant initializers.
// At file scope.
bool abuse_the_language = (check(), true);
int main()
{
// No op if desired.
}
Various linkers have options to specify the entry point. For example, the Microsoft linker uses /ENTRY:function:
The /ENTRY option specifies an entry point function as the starting
address for an .exe file or DLL.
GNU ld uses the -e option, or the ENTRY() command in a linker script.
Needless to say, modifying the entry point is a very advanced feature, and you must understand exactly how it works before using it. For one, it may skip the initialization of the standard libraries.
int main()
{
check();
return 0;
}
Calling check from main seems like the most logical solution, but you could still explore using /ENTRY to define another entry point for your application. See here for more info.
You cannot start in something other than main, although there are ways to have some code execute before main.
Putting code in a static initialization block will have the code run prior to main; however, it won't be 100% controllable. While you can be assured that it runs prior to main, you cannot specify the order in which two static initialization blocks run relative to each other; you only know that both execute before main.
Linkers and loaders both have the concept of main held as a shared "understood" start of a C / C++ program; however, there is code that runs prior to main. This code is responsible for "setting up the environment" of the program (things like setting up stdin or cin). By putting code in a static initialization block, you effectively say, "hey you need to do this too to have the right environment". Generally, this should be something small, that can stand independently in execution order of other items.
If you need two or three things to execute in order before main, then make them into proper functions and call them at the beginning of main.
There is a contrived way to achieve that, but it is nothing more than a hack.
The idea is to create a static library containing the main function, and make it call your "check" function.
The linker will resolve the symbol when linking against your "program", and your "program" code will indeed not have a main by itself.
This is NOT recommended, unless you have very specific needs (an example that pops to mind is Windows Screensavers, as the helper library that comes with the Windows SDK has a main function that performs specific initialization like parsing the command line).
It may be supported by the compiler. For example, with gcc you can use -nostartfiles and --entry=xxx to set the entry point of the program. The default entry point is _start, which will call the function main.
You can "intercept" the call to main by creating an object before the main starts. The constructor needs to execute your function.
#include <iostream>
void foo()
{
// do stuff
std::cout<<"exiting from foo" <<std::endl;
}
struct A
{
A(){ foo(); };
};
static A a;
int main()
{
// something
std::cout<<"starting main()" <<std::endl;
}
I have found a solution to my own question.
We can simply use
#pragma startup function-name <priority>
#pragma exit function-name <priority>
These two pragmas (a Borland/Turbo C++ extension; other compilers such as GCC do not implement them) allow the program to specify function(s) that should be called either upon program startup (before the main function is called), or program exit (just before the program terminates through _exit).
The specified function-name must be a previously declared function taking no arguments and returning void; in other words, it should be declared as:
void func(void);
The optional priority parameter should be an integer in the range 64 to 255. The highest priority is 0. Functions with higher priorities are called first at startup and last at exit. If you don't specify a priority, it defaults to 100.
thanks!