My goal is to determine all the possible exit points from an LLVM function. I know that terminator instructions end basic blocks, either to exit the function or to branch to another part of the same function. Among the terminator instructions, I am clear on most of them:
ret and resume exit the function
br, switch, indirectbr branch to other blocks in the same function
invoke and catchswitch are related to exception control flow and should also not exit the function
(unreachable can be ignored for this purpose)
I would like to seek clarification on catchret and cleanupret. I have compiled example exception handling code (clang++ on Mac and Ubuntu) and do not see these instructions in the compiled LLVM IR. Are these only used for specific ABIs?
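To make the classification above concrete, here is a minimal self-contained sketch. It does not use the LLVM API: the enum and the function name classifyTerminator are hypothetical, and the mapping only restates what the question lists. catchret and cleanupret are deliberately left unresolved, since that is the open part of the question.

```cpp
#include <string>

// Hypothetical names for illustration; these are not LLVM's own types.
enum class TerminatorEffect { ExitsFunction, StaysInFunction, Ignored, Unresolved };

// Maps a terminator mnemonic to the effect stated in the question.
inline TerminatorEffect classifyTerminator(const std::string &mnemonic) {
  if (mnemonic == "ret" || mnemonic == "resume")
    return TerminatorEffect::ExitsFunction;
  if (mnemonic == "br" || mnemonic == "switch" || mnemonic == "indirectbr" ||
      mnemonic == "invoke" || mnemonic == "catchswitch")
    return TerminatorEffect::StaysInFunction;
  if (mnemonic == "unreachable")
    return TerminatorEffect::Ignored;
  // catchret, cleanupret, and anything unknown: left open, as in the question.
  return TerminatorEffect::Unresolved;
}
```

With this table, the exit points of a function would be the blocks whose terminator maps to ExitsFunction.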
Suppose I have an llvm::Instruction* inst. How can I obtain a pointer to its basic block? I searched the LLVM API and found no interface such as inst.getBasicBlock(). Any help?
In well-formed LLVM IR, each Instruction is embedded in a BasicBlock. You can get the BasicBlock from getParent().
getParent() will always go one step up in the LLVM IR hierarchy, i.e., you get a Function as parent from a BasicBlock, and the Module from a Function.
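The "one step up" chain can be illustrated with a toy model. The structs below are stand-ins written for this example and are not LLVM's classes; only the getParent() naming mirrors the API described above, and enclosingModule is a hypothetical helper.

```cpp
#include <string>

// Toy stand-ins, NOT LLVM's classes: each level stores a pointer one step up.
struct Module {
  std::string name;
};

struct Function {
  Module *parent;
  Module *getParent() const { return parent; }
};

struct BasicBlock {
  Function *parent;
  Function *getParent() const { return parent; }
};

struct Instruction {
  BasicBlock *parent;
  BasicBlock *getParent() const { return parent; }
};

// Walks Instruction -> BasicBlock -> Function -> Module, one getParent() per step.
inline Module *enclosingModule(const Instruction &inst) {
  return inst.getParent()->getParent()->getParent();
}
```

In real LLVM code the same walk reads inst->getParent()->getParent()->getParent().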
LLVM used to have this great function (I don't know which version they are using here):
const unsigned* llvm::TargetRegisterInfo::getSuperRegisters(unsigned RegNo)
http://legup.eecg.utoronto.ca/doxygen/classllvm_1_1TargetRegisterInfo.html#90b85b889ff636c6bdd40b7543343473
Unfortunately I am using llvm 3.4, where this function does not exist. Is there something with similar functionality? Or is there an easy workaround to get all the parent registers of a given register?
I should have read the documentation more carefully. Here is the answer:
http://llvm.org/docs/doxygen/html/classllvm_1_1MCSuperRegIterator.html
llvm::MCSuperRegIterator expects a physical register (together with the target's MCRegisterInfo) in its constructor and then iterates over all of its super-registers.
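The iteration style (construct on a register, test isValid(), advance with ++) can be sketched without LLVM. Everything below is a toy: SuperRegTable, SuperRegWalker and allSuperRegs are hypothetical names, and the register table is supplied by the caller rather than by a real target description. The table is assumed to be free of cycles.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical register table: maps a register to its immediate super-register.
// A register with no entry has no super-register.
using SuperRegTable = std::map<std::string, std::string>;

// Toy iterator in the isValid()/operator++ style: it starts on the first
// super-register of `reg` and moves outward until none is left.
class SuperRegWalker {
public:
  SuperRegWalker(const std::string &reg, const SuperRegTable &table)
      : table_(table) {
    advanceFrom(reg);
  }

  bool isValid() const { return valid_; }
  const std::string &operator*() const { return current_; }
  SuperRegWalker &operator++() {
    advanceFrom(current_);
    return *this;
  }

private:
  // Looks up the immediate super-register of `reg` and makes it current.
  void advanceFrom(const std::string &reg) {
    auto it = table_.find(reg);
    valid_ = (it != table_.end());
    if (valid_)
      current_ = it->second;
  }

  const SuperRegTable &table_;
  std::string current_;
  bool valid_ = false;
};

// Collects every super-register of `reg`, innermost first.
inline std::vector<std::string> allSuperRegs(const std::string &reg,
                                             const SuperRegTable &table) {
  std::vector<std::string> out;
  for (SuperRegWalker w(reg, table); w.isValid(); ++w)
    out.push_back(*w);
  return out;
}
```

The loop in allSuperRegs has the same shape as a loop over the real iterator would have.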
I want to get the line number of an instruction (and also of a variable declaration - alloca and global). The instruction is saved in an array of instructions. I have the function:
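Constant* metadata::getLineNumber(Instruction* I){
  // The "dbg" node is only attached when the module carries debug info.
  if (MDNode *N = I->getMetadata("dbg")) { // this if is never executed
    DILocation Loc(N);
    unsigned Line = Loc.getLineNumber();
    return ConstantInt::get(Type::getInt32Ty(I->getContext()), Line);
  }
  // No debug location attached: return NULL explicitly instead of falling off the end.
  return NULL;
}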
and in my main() I have:
if (Constant *Line = metadata::getLineNumber(allocas[p])) // guard against a NULL result
  errs() << "\nLine number is " << *Line;
The result is NULL since I->getMetadata("dbg") returns null.
Is it possible to enable debug info in LLVM without rebuilding the LLVM framework, for example with a flag when compiling the target program or when running my pass (I used -debug)?
Compiling a program with "-O3 -g" should give full debug information, but I still get the same result. I am aware of http://llvm.org/docs/SourceLevelDebugging.html, from which I can see that it is quite easy to take the source line number from a metadata field.
PS: for allocas, it seems that I have to use the findDbgDeclare method from DbgInfoPrinter.cpp.
Thank you in advance !
LLVM provides debugging information if you specify the -g flag to Clang. You don't need to rebuild LLVM to enable/disable it - any LLVM will do (including a pre-built one from binaries or binary packages).
The problem may be that you're trying to have debug information in highly optimized code (-O3). This is not necessarily possible, since LLVM simply optimizes some code away in such cases and there's not much meaning to debug information. LLVM tries to preserve debug info during optimizations, but it's not an easy task.
Start by generating unoptimized code with debug info (-O0 -g) and write your code/passes to work with that. Then graduate to optimized code, and try to examine what specifically gets lost. If you think that LLVM is being stupid, don't hesitate to open a bug.
Some random tips:
Generate IR from clang (-emit-llvm) and see the debug metadata nodes in it. Then you can run through opt with optimizations and see what remains.
The -debug option to llc and other LLVM tools is quite unrelated to debug info in the source.
I want to read (parse) LLVM IR code (which is saved in a text file) and add some of my own code to it. I need an example of how this is done using the libraries LLVM provides for this purpose. So basically what I want is to read the IR code from a text file into memory (perhaps the LLVM library represents it in AST form, I don't know), make modifications such as adding some more nodes to the AST, and finally write the AST back to the IR text file.
Although I need to both read and modify the IR code, I would greatly appreciate it if someone could provide or refer me to an example that just reads (parses) it.
First, to fix an obvious misunderstanding: LLVM is a framework for manipulating code in IR format. There are no ASTs in sight (*) - you read IR, transform/manipulate/analyze it, and you write IR back.
Reading IR is really simple:
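// Header paths and the ParseIRFile/getGlobalContext names match LLVM 3.3-3.5;
// other releases differ.
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/IRReader/IRReader.h"
#include "llvm/Support/SourceMgr.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

int main(int argc, char** argv)
{
  if (argc < 2) {
    errs() << "Expected an argument - IR file name\n";
    return 1;
  }
  LLVMContext &Context = getGlobalContext();
  SMDiagnostic Err;
  // Parse the IR file into an in-memory Module; the result is NULL on failure.
  Module *Mod = ParseIRFile(argv[1], Err, Context);
  if (!Mod) {
    // Report the parse diagnostic to stderr.
    Err.print(argv[0], errs());
    return 1;
  }
  [...]
}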
This code accepts a file name, which should be a textual LLVM IR file. It then parses the file into a Module, which represents a module of IR in LLVM's internal in-memory format. The Module can then be manipulated with the various passes LLVM provides or with passes you write yourself. Take a look at some examples in the LLVM code base (such as lib/Transforms/Hello/Hello.cpp) and read this - http://llvm.org/docs/WritingAnLLVMPass.html.
Spitting IR back into a file is even easier. The Module class just writes itself to a stream:
some_stream << *Mod;
That's it.
Now, if you have any specific questions about specific modifications you want to do to IR code, you should really ask something more focused. I hope this answer shows you how to parse IR and write it back.
(*) IR doesn't have an AST representation inside LLVM, because it's a simple assembly-like language. If you go one step up, to C or C++, you can use Clang to parse that into ASTs, and then do manipulations at the AST level. Clang then knows how to produce LLVM IR from its AST. However, you do have to start with C/C++ here, and not LLVM IR. If LLVM IR is all you care about, forget about ASTs.
This is usually done by implementing an LLVM pass/transform. This way you don't have to parse the IR at all, because LLVM will do it for you and you will operate on an object-oriented in-memory representation of the IR.
This is the entry point for writing an LLVM pass. Then you can look at any of the already implemented standard passes that come bundled with LLVM (look into lib/Transforms).
The opt tool takes LLVM IR code, runs a pass on it, and then spits out transformed LLVM IR on the other side.
The easiest place to start hacking is lib\Transforms\Hello\Hello.cpp. Hack it, run it through opt with your source file as input, and inspect the output.
Apart from that, the docs for writing passes are really quite good.
The easiest way to do this is to look at one of the existing tools and steal code from it. In this case, you might want to look at the source for llc. It can take either a bitcode or .ll file as input. You can modify the input file any way you want and then write out the file using something similar to the code in llvm-dis if you want a text file.
As mentioned above, the best way is to write a pass. But if you simply want to iterate through the instructions and do something with them, LLVM provides an InstVisitor class. It is a class that implements the visitor pattern for instructions. It is very straightforward to use, so if you want to avoid learning how to implement a pass, you could resort to that.
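For readers who have not met this style of visitor, here is a minimal self-contained sketch of the idea. It is a toy and not LLVM's InstVisitor: Kind, Inst, ToyInstVisitor and LoadCounter are hypothetical names. What it shares with the real class is the shape: a base template parameterized on the subclass, with default handlers that do nothing and a dispatcher that calls the handler matching each instruction.

```cpp
#include <vector>

// Toy instruction kinds; hypothetical, not LLVM's opcodes.
enum class Kind { Add, Load, Store };
struct Inst {
  Kind kind;
};

// Visitor base parameterized on the subclass, so dispatch needs no virtual calls.
template <typename Derived>
struct ToyInstVisitor {
  // Default handlers do nothing; a subclass overrides only what it needs.
  void visitAdd(const Inst &) {}
  void visitLoad(const Inst &) {}
  void visitStore(const Inst &) {}

  // Dispatches one instruction to the handler for its kind.
  void visit(const Inst &inst) {
    Derived &self = static_cast<Derived &>(*this);
    switch (inst.kind) {
    case Kind::Add:   self.visitAdd(inst);   break;
    case Kind::Load:  self.visitLoad(inst);  break;
    case Kind::Store: self.visitStore(inst); break;
    }
  }

  // Visits every instruction in order.
  void visitAll(const std::vector<Inst> &insts) {
    for (const Inst &inst : insts)
      visit(inst);
  }
};

// Example client: counts loads and ignores everything else.
struct LoadCounter : ToyInstVisitor<LoadCounter> {
  int loads = 0;
  void visitLoad(const Inst &) { ++loads; }
};
```

A client creates a LoadCounter, calls visitAll on a sequence of instructions, and reads the loads field afterwards.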