llvm function wrapper for timing - llvm

I would like to add a function wrapper in order to record the entry and exit times of certain functions. It seems that LLVM would be a good tool to accomplish this. However, I've been having trouble finding a tutorial on how to write function wrappers. Any suggestions?
p.s. my target language is C

Assuming you need to call func_start when entering each function and func_return when returning, the easiest way is to do the following:
for each function F
insert a call to func_start(F) before the first instruction in the entry block
for each block B in function F
get the terminator instruction T
if T is a return instruction
insert a call to func_return(F) before T
All in all, including boilerplate code for your FunctionPass, wou'll have to write about 40 lines of code for this.
If you really want to go with the wrapper approach you have to do:
for each function F
clone function F (call it G)
delete all instructions in F
insert a call to func_start(F) in F
insert a call to G in F (forwarding the arguments), put the return value in R
insert a call to func_return(F) in F
insert a return instruction returning R in F
The code complexity in this case will be slightly higher and you'll likely incur in a higher compile- and run-time overhead.

I like doing this and use several approaches, depending on the circumstance.
The easiest if you are on a Linux platform is to use the wonderful ltrace utility. You provide the C program you are timing as an argument to ltrace. The "-T" option will output the elapsed call time. If you want a summary of call times use the "-c" option. You can control the amount of output by using the "-e" and "--library" options. Other platforms have somewhat similar tools (like dtrace) but they are not quite as easy to use.
Another, slightly hackish approach is to use macros to redefine the function names. This has all the potential pitfalls of macros but can work well in a controlled environment for smallish programs. The C preprocessor will not recursively expand macros so you can just call the actual function from inside your wrapper macro at the point of call. This avoids the difficulty of placing the "stop timing" code before each potential return in the function body.
#define foo(a,b,c) ({long t0 = now(); int retval = foo(a,b,c); long elapsed = now() - t0; retval;})
Notice the use of the non-standard code block inside an expression. This avoids collisions of the temporary names used for timing and retval. Also by placing retval as the last expression in the statement list this code will time function calls that are embedded in assignments or other expressional contexts (you need to change the type of "retval" to whatever is appropriate for your function).
You must be very careful NOT to include the #define before prototypes and such.
Use your favorite timer function and its appropriate data type (double, long long, whatever). I like <chrono> in C++11 myself.

Related

Can I automatically build an argument list with a macro?

I've been writing a little hooking library, which uses MS Detours and can attach to any MS API in any process and log what it's doing. One of things I've tried to do in my library is to remove all the boiler plate from hooking so that each hook can be written very simply.
I've got it down to this being all that's required to hook the CreateFile function and print its first parameter to the debugger:
Hooker hooker;
Hook(hooker, dllnames::Kernel32, apis::T_CreateFileW, CreateFileW, [](auto lpFilename, auto ...rest) -> auto
{
APILOG(L"%s", lpFilename);
return TruePtr(lpFilename, rest...);
});
The only boiler plate required for this, is the function pointer definition apis::T_CreateFileW. However, it strikes me for many cases, I could go further even than this. I'm wondering if I could us a macro to write the above as:
APILOG(hooker, dllnames::Kernel32, CreateFileW, L"%s", lpFilename);
This would mean that could almost write a printf that would allow me to log what any API was doing provided its parameters could be reasonably captured with a format string.
But is this possible as a macro? I guess some problems are:
I have to expand the ... of the macro into both auto param1, auto param2 as well as param1, param2.
I'd have to specify and print at least the first N parameters of the real API in my format string whether I want them or not, since only at that point can I pass rest...
If I want to print all the parameters, I wouldn't be able to pass rest...
It's possible I've already simplified this as far as I can, I'm just curious to see if anything approaching it is possible.
I don't think this is possible without using multiple macros and duplicating the parameter names at least one time; you can pass a multiple-args "bundle" using parens, but you can't do something with each argument like adding the auto prefix.
I'd instead consider more advanced code generation techniques using something other than the preprocessor. m4 comes to mind. Even a simple shell script could be used.

How to convert function insertion module pass to intrinsic to inline

PROBLEM:
I currently have a traditional module instrumentation pass that
inserts new function calls into a given IR according to some logic
(inserted functions are external from a small lib that is later linked
to given program). Running experiments, my overhead is from
the cost of executing a function call to the library function.
What I am trying to do:
I would like to inline these function bodies into the IR of
the given program to get rid of this bottleneck. I assume an intrinsic
would be a clean way of doing this, since an intrinsic function would
be expanded to its function body when being lowered to ASM (please
correct me if my understanding is incorrect here, this is my first
time working with intrinsics/LTO).
Current Status:
My original library call definition:
void register_my_mem(void *user_vaddr){
... C code ...
}
So far:
I have created a def in: llvm-project/llvm/include/llvm/IR/IntrinsicsX86.td
let TargetPrefix = "x86" in {
def int_x86_register_mem : GCCBuiltin<"__builtin_register_my_mem">,
Intrinsic<[], [llvm_anyint_ty], []>;
}
Added another def in:
otwm/llvm-project/clang/include/clang/Basic/BuiltinsX86.def
TARGET_BUILTIN(__builtin_register_my_mem, "vv*", "", "")
Added my library source (*.c, *.h) to the compiler-rt/lib/test_lib
and added to CMakeLists.txt
Replaced the function insertion with trying to insert the intrinsic
instead in: llvm/lib/Transforms/Instrumentation/myModulePass.cpp
WAS:
FunctionCallee sm_func =
curr_inst->getModule()->getOrInsertFunction("register_my_mem",
func_type);
ArrayRef<Value*> args = {
builder.CreatePointerCast(sm_arg_val, currType->getPointerTo())
};
builder.CreateCall(sm_func, args);
NEW:
Intrinsic::ID aREGISTER(Intrinsic::x86_register_my_mem);
Function *sm_func = Intrinsic::getDeclaration(currFunc->getParent(),
aREGISTER, func_type);
ArrayRef<Value*> args = {
builder.CreatePointerCast(sm_arg_val, currType->getPointerTo())
};
builder.CreateCall(sm_func, args);
Questions:
If my logic for inserting the intrinsic functions shouldnt be a
module pass, where do i put it?
Am I confusing LTO with intrinsics?
Do I put my library function definitions into the following files as mentioned in
http://lists.llvm.org/pipermail/llvm-dev/2017-June/114322.html as for example EmitRegisterMyMem()?
clang/lib/CodeGen/CodeGenFunction.cpp - define llvm::Instrinsic::ID
clang/lib/CodeGen/CodeGenFunction.h - declare llvm::Intrinsic::ID
My LLVM compiles, so it is semantically correct, but currently when
trying to insert this function call, LLVM segfaults saying "Not a valid type for function argument!"
I'm seeing multiple issues here.
Indeed, you're confusing LTO with intrinsics. Intrinsics are special "functions" that are either expanded into special instructions by a backend or lowered to library function calls. This is certainly not something you're going to achieve. You don't need an intrinsic at all, you'd just need to inline the function call in question: either by hands (from your module pass) or via LTO, indeed.
The particular error comes because you're declaring your intrinsic as receiving an integer argument (and this is how the declaration would look like), but:
asking the declaration of variadic intrinsic with invalid type (I'd assume your func_type is a non-integer type)
passing pointer argument
Hope this makes an issue clear.
See also: https://llvm.org/docs/LinkTimeOptimization.html
Thanks you for clearing up the issue #Anton Korobeynikov.
After reading your explanation, I also believe that I have to use LTO to accomplish what I am trying to do. I especially found this link very useful: https://llvm.org/docs/LinkTimeOptimization.html. It seems that I am now on a right path.

Removal of unused or redundant code [duplicate]

This question already has answers here:
Listing Unused Symbols
(2 answers)
Closed 7 years ago.
How do I detect function definitions which are never getting called and delete them from the file and then save it?
Suppose I have only 1 CPP file as of now, which has a main() function and many other function definitions (function definition can also be inside main() ). If I were to write a program to parse this CPP file and check whether a function is getting called or not and delete if it is not getting called then what is(are) the way(s) to do it?
There are few ways that come to mind:
I would find out line numbers of beginning and end of main(). I can do it by maintaining a stack of opening and closing braces { and }.
Anything after main would be function definition. Then I can parse for function definitions. To do this I can parse it the following way:
< string >< open paren >< comma separated string(s) for arguments >< closing paren >
Once I have all the names of such functions as described in (2), I can make a map with its names as key and value as a bool, indicating whether a function is getting called once or not.
Finally parse the file once again to check for any calls for functions with their name as in this map. The function call can be from within main or from some other function. The value for the key (i.e. the function name) could be flagged according to whether a function is getting called or not.
I feel I have complicated my logic and it could be done in a smarter way. With the above logic it would be hard to find all the corner cases (there would be many). Also, there could be function pointers to make parsing logic difficult. If that's not enough, the function pointers could be typedefed too.
How do I go about designing my program? Are a map (to maintain filenames) and stack (to maintain braces) the right data structures or is there anything else more suitable to deal with it?
Note: I am not looking for any tool to do this. Nor do I want to use any library (if it exists to make things easy).
I think you should not try to build a C++ parser from scratch, becuse of other said in comments that is really hard. IMHO, you'd better start from CLang libraries, than can do the low-level parsing for you and work directly with the abstract syntax tree.
You could even use crange as an example of how to use them to produce a cross reference table.
Alternatively, you could directly use GNU global, because its gtags command directly generates definition and reference databases that you have to analyse.
IMHO those two ways would be simpler than creating a C++ parser from scratch.
The simplest approach for doing it yourself I can think of is:
Write a minimal parser that can identify functions. It just needs to detect the start and ending line of a function.
Programmatically comment out the first function, save to a temp file.
Try to compile the file by invoking the complier.
Check if there are compile errors, if yes, the function is called, if not, it is unused.
Continue with the next function.
This is a comment, rather than an answer, but I post it here because it's too long for a comment space.
There are lots of issues you should consider. First of all, you should not assume that main() is a first function in a source file.
Even if it is, there should be some functions header declarations before the main() so that the compiler can recognize their invocation in main.
Next, function's opening and closing brace needn't be in separate lines, they also needn't be the only characters in their lines. Generally, almost whole C++ code can be put in a single line!
Furthermore, functions can differ with parameters' types while having the same name (overloading), so you can't recognize which function is called if you don't parse the whole code down to the parameters' types. And even more: you will have to perform type lists matching with standard convertions/casts, possibly considering inline constructors calls. Of course you should not forget default parameters. Google for resolving overloaded function call, for example see an outline here
Additionally, there may be chains of unused functions. For example if a() calls b() and b() calls c() and d(), but a() itself is not called, then the whole four is unused, even though there exist 'calls' to b(), c() and d().
There is also a possibility that functions are called through a pointer, in which case you may be unable to find a call. Example:
int (*testfun)(int) = whattotest ? TestFun1 : TestFun2; // no call
int testResult = testfun(paramToTest); // unknown function called
Finally the code can be pretty obfuscated with #defineā€“s.
Conclusion: you'll probably have to write your own C++ compiler (except the machine code generator) to achieve your goal.
This is a very rough idea and I doubt it's very efficient but maybe it can help you get started. First traverse the file once, picking out any function names (I'm not entirely sure how you would do this). But once you have those names, traverse the file again, looking for the function name anywhere in the file, inside main and other functions too. If you find more than 1 instance it means that the function is being called and should be kept.

Eliminating inherited overlong MACRO

I have inherited a very long set of macros from some C algorithm code.They basically call free on a number of structures as the function exits either abnormally or normally. I would like to replace these with something more debuggable and readable. A snippet is shown below
#define FREE_ALL_VECS {FREE_VEC_COND(kernel);FREE_VEC_COND(cirradCS); FREE_VEC_COND(pixAccum).....
#define FREE_ALL_2D_MATS {FREE_2D_MAT_COND(circenCS); FREE_2D_MAT_COND(cirradCS_2); }
#define FREE_ALL_IMAGES {immFreeImg(&imgC); immFreeImg(&smal.....
#define COND_FREE_ALLOC_VARS {FREE_ALL_VECS FREE_ALL_2D_MATS FREE_ALL_IMAGES}
What approach would be best? Should I just leave well alone if it works? This macro set is called twelve times in one function. I'm on Linux with gcc.
Usually I refactor such macros to functions, using inline functions when the code is really performance critical. Also I try to move allocation, deallocation and clean up stuff into C++ objects, to get advantage of the automatic destruction.
If they are broken then fix them by converting to functions.
If they're aren't broken then leave them be.
If you are determined to change them, write unit-tests to check you don't inadvertently break something.
Ideally, I would use inline functions instead of using macros to eliminate function call overhead. However, basing from your snippet, the macros you have would call several nested functions. Inlining them may not have any effect, thus I would just suggest to refactor them into functions to make them more readable and maintainable. Inlining improves performance only if the function to be inlined is simple (e.g. accessors, mutators, no loops).
I believe this is your decision. If the macros are creating problems when debugging, I believe it is best to create some functions that do the same things as the macros. In general you should avoid complicated macros. By complicated I mean macros that do something more than a simple value definition.
Recommended:
// it is best to use only this type of macro
#define MAX_VALUE 200
The rest is not recommended (see example below):
// this is not recommended
#define min(x,y) ( (x)<(y) ? (x) : (y) )
// imagine using min with some function arguments like this:
//
// val = min(func1(), func2())
//
// this means that one of functions is called twice which is generally
// not very good for performance

Does an arbitrary instruction pointer reside in a specific function?

I have a very difficult problem I'm trying to solve: Let's say I have an arbitrary instruction pointer. I need to find out if that instruction pointer resides in a specific function (let's call it "Foo").
One approach to this would be to try to find the start and ending bounds of the function and see if the IP resides in it. The starting bound is easy to find:
void *start = &Foo;
The problem is, I don't know how to get the ending address of the function (or how "long" the function is, in bytes of assembly).
Does anyone have any ideas how you would get the "length" of a function, or a completely different way of doing this?
Let's assume that there is no SEH or C++ exception handling in the function. Also note that I am on a win32 platform, and have full access to the win32 api.
This won't work. You're presuming functions are contigous in memory and that one address will map to one function. The optimizer has a lot of leeway here and can move code from functions around the image.
If you have PDB files, you can use something like the dbghelp or DIA API's to figure this out. For instance, SymFromAddr. There may be some ambiguity here as a single address can map to multiple functions.
I've seen code that tries to do this before with something like:
#pragma optimize("", off)
void Foo()
{
}
void FooEnd()
{
}
#pragma optimize("", on)
And then FooEnd-Foo was used to compute the length of function Foo. This approach is incredibly error prone and still makes a lot of assumptions about exactly how the code is generated.
Look at the *.map file which can optionally be generated by the linker when it links the program, or at the program's debug (*.pdb) file.
OK, I haven't done assembly in about 15 years. Back then, I didn't do very much. Also, it was 680x0 asm. BUT...
Don't you just need to put a label before and after the function, take their addresses, subtract them for the function length, and then just compare the IP? I've seen the former done. The latter seems obvious.
If you're doing this in C, look first for debugging support --- ChrisW is spot on with map files, but also see if your C compiler's standard library provides anything for this low-level stuff -- most compilers provide tools for analysing the stack etc., for instance, even though it's not standard. Otherwise, try just using inline assembly, or wrapping the C function with an assembly file and a empty wrapper function with those labels.
The most simple solution is maintaining a state variable:
volatile int FOO_is_running = 0;
int Foo( int par ){
FOO_is_running = 1;
/* do the work */
FOO_is_running = 0;
return 0;
}
Here's how I do it, but it's using gcc/gdb.
$ gdb ImageWithSymbols
gdb> info line * 0xYourEIPhere
Edit: Formatting is giving me fits. Time for another beer.