LLVM IR: Branch instruction with variable BasicBlock target? - c++

Does LLVM support branch instructions with a variable BasicBlock target?
More specifically, suppose I convert all unconditional br instructions into function calls to some function f. Is it then possible to provide the target label as an argument to f, and then use this label in an unconditional branch within f?
Or is the only solution to make a switch in f, map all BBs to unique IDs, and then call f with the ID corresponding to the target BB?

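From what I can see, non-local indirect branches to labels aren't possible: LLVM's indirectbr instruction takes a blockaddress value, but every possible destination must be a block in the same function.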
http://blog.llvm.org/2010/01/address-of-label-and-indirect-branches.html?m=1
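For illustration, here is a minimal sketch of what a variable branch target looks like when it stays inside one function. It uses the GNU "labels as values" extension (&&label and goto *), which the linked post describes as the source-level feature behind blockaddress and indirectbr. The function name run_blocks and the labels bb_add, bb_double and bb_exit are hypothetical and chosen only for this example; it assumes GCC or Clang, since the extension is not standard C++.

```cpp
// Sketch only: relies on the GNU "labels as values" extension (GCC/Clang).
// All names here are made up for illustration.
int run_blocks(int start) {
    // Table of block addresses. Every entry is a label in THIS function;
    // taking the address of a label in another function is not possible.
    static void *const targets[] = { &&bb_add, &&bb_double, &&bb_exit };

    int acc = 1;
    int next = start;
    goto *targets[next];   // indirect branch: the target is a runtime value

bb_add:
    acc += 3;
    next = 1;
    goto *targets[next];

bb_double:
    acc *= 2;
    next = 2;
    goto *targets[next];

bb_exit:
    return acc;
}
```

Starting at index 0 runs all three blocks in order, while starting at index 2 jumps straight to the exit block. Passing an integer index rather than an address is essentially the "unique ID per block" scheme from the question, and a switch over that ID would be the portable equivalent.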

Related

LLVM Pass: to change the function call's argument values

As part of my project, based on some analysis, I have to change a function call's arguments. I am doing this at the LLVM IR level. Something like this:
doWork("work",functionBefore)
Based on my results, my LLVM pass should replace the function pointer passed to the call, like this:
doWork("work",functionAfter)
Assume both functionBefore and functionAfter have the same return type.
Is it possible to change the arguments using an LLVM pass?
Or should I delete the instruction and recreate the one I need?
Please give some suggestions or directions on how to do this.
The LLVM IR for the call would be something like this:
invoke void @_Z7processNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEPFvS4_E(%"class.std::__cxx11::basic_string"* nonnull %1, void (%"class.std::__cxx11::basic_string"*)* nonnull @_Z9functionBNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE) to label %7 unwind label %13

Given an LLVM Instruction, how can we obtain the pointer to its BasicBlock?

Suppose I have an llvm::Instruction* inst. How can I obtain the pointer to its basic block? I searched the LLVM API and found no interface like inst.getBasicBlock(). Any help?
In well-formed LLVM IR, each Instruction is embedded in a BasicBlock. You can get the BasicBlock from getParent().
getParent() will always go one step up in the LLVM IR hierarchy, i.e., you get a Function as parent from a BasicBlock, and the Module from a Function.

Can llvm emit code that jumps to a given address within a function?

Following up on this question, is it possible for llvm to generate code that may jump to an arbitrary address within a function in the same address space?
i.e.
void func1() {
...
<code that jumps to addr2>
...
}
void func2() {
...
addr2:
<some code in func2()>
...
}
Yes, no, yes, no, (yes). It depends on the level you look at and on what you mean by "possible":
Yes, because the LLVM backend produces target-specific assembly instructions, and those instructions can set the program counter to an arbitrary value.
No, because, as far as I know, LLVM IR (the intermediate representation into which a frontend like clang compiles your C code) has no instruction that allows arbitrary jumps between (LLVM IR) functions.
Yes, because the frontend COULD certainly produce code that simulates that behaviour (by breaking up func2 into multiple separate functions).
No, because C and C++ don't allow jumps to ARBITRARY positions, so clang will not compile any program that tries to do that (e.g. via goto).
(Yes) The C longjmp function jumps back to a place in the control flow that you have already visited (where you called setjmp) and also restores most of the system state. EDIT: However, this is UB if func2 isn't somewhere up in the current call stack from where you jump.
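As a small sketch of the last point, the following shows the one well-defined shape of such a jump: func2 calls setjmp and is still on the call stack when func1 calls longjmp. The names checkpoint and stage are assumptions made for this example; only setjmp, longjmp and jmp_buf come from the standard library. The local variable is volatile because it is modified between the setjmp and the longjmp.

```cpp
#include <csetjmp>

// Hypothetical names for illustration; only trivially destructible
// objects are live across the jump.
static std::jmp_buf checkpoint;

static void func1() {
    // Transfers control back into func2, to the point where setjmp was called.
    std::longjmp(checkpoint, 1);
}

int func2() {
    volatile int stage = 0;   // volatile: changed between setjmp and longjmp
    if (setjmp(checkpoint) == 0) {
        stage = 1;            // first pass: setjmp returned 0
        func1();              // does not return normally
        return -1;            // not reached
    }
    // Second pass: we arrive here via longjmp, with func2 still active.
    return stage + 1;
}
```

Calling longjmp after func2 has already returned would be the undefined case mentioned above.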

Branch instruction with direct jump in LLVM

In LLVM, how can I generate a branch instruction that jumps directly, rather than an if-else? I know there is the llvm::BranchInst class, but I don't know how to use it for this purpose. Or do I need to use some other class?
You need an unconditional branch:
static BranchInst *llvm::BranchInst::Create(BasicBlock *IfTrue,
                                            Instruction *InsertBefore = 0)
static BranchInst *llvm::BranchInst::Create(BasicBlock *IfTrue,
                                            BasicBlock *InsertAtEnd)
Use this method:
static BranchInst * Create (BasicBlock *IfTrue, BasicBlock *InsertAtEnd)
The first argument is the block you are jumping to, and the second is the block at whose end the new instruction is placed.

llvm function wrapper for timing

I would like to add a function wrapper in order to record the entry and exit times of certain functions. It seems that LLVM would be a good tool to accomplish this. However, I've been having trouble finding a tutorial on how to write function wrappers. Any suggestions?
p.s. my target language is C
Assuming you need to call func_start when entering each function and func_return when returning, the easiest way is to do the following:
for each function F
    insert a call to func_start(F) before the first instruction in the entry block
    for each block B in function F
        get the terminator instruction T
        if T is a return instruction
            insert a call to func_return(F) before T
All in all, including the boilerplate code for your FunctionPass, you'll have to write about 40 lines of code for this.
If you really want to go with the wrapper approach you have to do:
for each function F
    clone function F (call it G)
    delete all instructions in F
    insert a call to func_start(F) in F
    insert a call to G in F (forwarding the arguments), put the return value in R
    insert a call to func_return(F) in F
    insert a return instruction returning R in F
The code complexity in this case will be slightly higher, and you'll likely incur higher compile-time and run-time overhead.
I like doing this and use several approaches, depending on the circumstance.
The easiest if you are on a Linux platform is to use the wonderful ltrace utility. You provide the C program you are timing as an argument to ltrace. The "-T" option will output the elapsed call time. If you want a summary of call times use the "-c" option. You can control the amount of output by using the "-e" and "--library" options. Other platforms have somewhat similar tools (like dtrace) but they are not quite as easy to use.
Another, slightly hackish approach is to use macros to redefine the function names. This has all the potential pitfalls of macros but can work well in a controlled environment for smallish programs. The C preprocessor will not recursively expand macros so you can just call the actual function from inside your wrapper macro at the point of call. This avoids the difficulty of placing the "stop timing" code before each potential return in the function body.
#define foo(a,b,c) ({long t0 = now(); int retval = foo(a,b,c); long elapsed = now() - t0; retval;})
Notice the use of the non-standard code block inside an expression (a GCC/Clang statement expression). This avoids collisions of the temporary names used for timing and retval. Also, because retval is the last expression in the statement list, this code can time function calls that are embedded in assignments or other expression contexts (you need to change the type of "retval" to whatever is appropriate for your function).
You must be very careful NOT to include the #define before prototypes and such.
Use your favorite timer function and its appropriate data type (double, long long, whatever). I like <chrono> in C++11 myself.
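A slightly fuller sketch of the macro approach, using <chrono> as the timer, might look like the following. The function foo, the variable last_elapsed_ns and the helper timed_sum are all hypothetical stand-ins, and the statement expression is a GCC/Clang extension rather than standard C++. Note that the #define comes after the real definition of foo, as cautioned above.

```cpp
#include <chrono>

// Hypothetical place to store the most recent measurement.
static long long last_elapsed_ns = 0;

// The real function. It must be declared BEFORE the macro below.
static int foo(int a, int b, int c) { return a + b + c; }

// Wrapper macro. Inside its own expansion, foo is not expanded again,
// so the inner foo(a, b, c) calls the real function.
#define foo(a, b, c) __extension__ ({                                      \
    auto t0_ = std::chrono::steady_clock::now();                           \
    int retval_ = foo(a, b, c);                                            \
    last_elapsed_ns = std::chrono::duration_cast<std::chrono::nanoseconds>( \
        std::chrono::steady_clock::now() - t0_).count();                   \
    retval_;  /* value of the whole statement expression */                \
})

// The timed call can sit inside a larger expression.
int timed_sum() {
    int r = foo(1, 2, 3) * 2;
    return r;
}
```

Storing the elapsed time in a single global is only the simplest choice for a sketch; a real setup would more likely accumulate per-function totals or log each call.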