How to get LLVM IR AST

There's lots of information on how to use Clang / LLVM to get the C or C++ AST. I'd like the AST of the IR instead. That is:
given xor i1 %1, true, how do I get something like
xor : i1 # operator, binary
%1 # operand 1
true # operand 2
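One way to picture the requested structure is the toy sketch below. It is not LLVM's API: IRNode, parseSimpleInstruction and dump are hypothetical names invented for illustration, and the splitter only understands the simple "opcode type operand, operand" shape from the question. In LLVM itself an llvm::Instruction already carries this information, and as far as I know it can be read with getOpcodeName(), getType(), getNumOperands() and getOperand(i) after the module has been parsed.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical node type for this sketch; not an LLVM class.
struct IRNode {
    std::string opcode;                 // e.g. "xor"
    std::string type;                   // e.g. "i1"
    std::vector<std::string> operands;  // e.g. {"%1", "true"}
};

// Remove leading and trailing spaces/tabs.
static std::string trim(const std::string& s) {
    const auto first = s.find_first_not_of(" \t");
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}

// Split "<opcode> <type> <op1>, <op2>, ..." into an IRNode.
// Only handles this simple shape; real IR needs LLVM's own parser.
IRNode parseSimpleInstruction(const std::string& text) {
    IRNode node;
    std::istringstream in(text);
    in >> node.opcode >> node.type;
    std::string operand;
    while (std::getline(in, operand, ',')) {
        operand = trim(operand);
        if (!operand.empty()) node.operands.push_back(operand);
    }
    return node;
}

// Render the node as an operator line followed by one line per operand.
std::string dump(const IRNode& node) {
    std::ostringstream out;
    out << node.opcode << " : " << node.type << "\n";
    for (const auto& op : node.operands) out << "  " << op << "\n";
    return out.str();
}
```

Calling dump(parseSimpleInstruction("xor i1 %1, true")) yields the operator line followed by the two operand lines. The same walk over a real llvm::Instruction would replace the string splitting with the accessor calls named above.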

Related

LLVM IRBuilder If-Then Codegen

I have modified Kaleidoscope's if-then-else codegen to support if-then (no else clause) for a non-functional language. I'm unable to get the merge block (after the 'then') to be created in the IR.
I have removed the phi node creation because (0) the phi node needs to merge two branches of execution and I only really have one branch, (1) I don't think it's really required in a non-functional language context, and (2) the phi requires two legs of the same type, and the statement above the 'if' statement might return a different type.
The if-then codegen is:
static IRBuilder<> Builder(TheContext);
:
Value* IfThenExprAST::Codegen() {
    Value* CondV = Cond->Codegen();
    CondV = Builder.CreateICmpNE(CondV, Builder.getInt1(0), "ifcond");
    Function *TheFunction = Builder.GetInsertBlock()->getParent();
    BasicBlock* ThenBB = BasicBlock::Create(TheContext, "then", TheFunction);
    BasicBlock* MergeBB = BasicBlock::Create(TheContext, "ifcont");
    Builder.CreateCondBr(CondV, ThenBB, MergeBB);
    Builder.SetInsertPoint(ThenBB);
    llvm::Value* ThenV;
    for (std::vector<ExprAST*>::iterator it = Then.begin(); it != Then.end(); ++it)
        ThenV = (*it)->Codegen();
    Builder.CreateBr(MergeBB);
    return CondV;
}
The resulting IR is:
:
  br i1 true, label %then, label %ifcont
then: ; preds = %entry
  br label %ifcont
:
  ret double 1.000000e+01
Notice that there is no 'ifcont' label generated in the IR.
How do I tell the builder to insert the ifcont label after the 'then' clause?
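I solved this by adding the following two lines just before 'return CondV;'. MergeBB was created without a parent function, so it has to be appended to the function explicitly: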
TheFunction->getBasicBlockList().push_back(MergeBB);
Builder.SetInsertPoint(MergeBB);

Is there LLVM API for setting funclet operand bundle in LLVM/MSVC exception handling?

I'm using LLVM 3.9.1 Core libraries in implementation of a compiler front-end that targets Windows x64. I'm currently implementing exception handling.
For the Windows x64 target, LLVM generates so-called funclets for catch and cleanup code. To catch an exception, you generate a catchswitch instruction and then a catchpad instruction for each exception type caught:
...
catchsection0: ; preds = %entry
  %8 = catchswitch within none [label %catch0] unwind to caller
catch0: ; preds = %catchsection0
  %9 = catchpad within %8 [i8** @exceptionClassDescriptor, i32 8, i8** %1]
  %10 = load i8*, i8** %1
  br label %target10
target10: ; preds = %catch0
  %11 = load i8*, i8** @__CD0
  %12 = call i1 @RtHandleException(i8* %10, i8* %11) [ "funclet"(token %9) ]
  br i1 %12, label %thisHandlerTarget10, label %nextHandlerTarget10
...
To call a function in a catch block, you need to attach a funclet operand bundle to the call instruction, passing the token returned by the catchpad instruction as its argument:
%12 = call i1 @RtHandleException(i8* %10, i8* %11) [ "funclet"(token %9) ]
I have searched the LLVM doxygen documentation and the LLVM library source code for an API that sets the funclet operand bundle on a call or invoke instruction, without success.
Does anyone know how to create an operand bundle and attach it to a call instruction?
I found a solution that was simpler than I thought:
The CallInst::Create and InvokeInst::Create functions have overloads that take a vector/array of OperandBundleDef as an argument. Problem solved.
Seppo

julia llvmcall failed to parse llvm assembly

I am trying to add support for new intrinsics in Julia. For this I use the llvmcall fork. However, when using this in a Julia function, I always encounter the following problem:
Declare the function:
julia> function ptx_test(a,b,c)
           idx = Base.llvmcall("""%1 = tail call i32 @llvm.nvvm.read.ptx.sreg.tid.x() readnone nounwind
                                  ret i32 %1""", Int32, ())
           c[idx] = a[idx] + b[idx]
           return nothing
       end
ptx_test (generic function with 1 method)
Retrieve the LLVM IR of this function:
julia> code_llvm(ptx_test, (Array{Int32,1},Array{Int32,1},Array{Int32,1}))
PTX function found: ptx_test
ERROR: error compiling ptx_test: Failed to parse LLVM Assembly:
julia: <string>:3:20: error: use of undefined value '@llvm.nvvm.read.ptx.sreg.tid.x'
%1 = tail call i32 @llvm.nvvm.read.ptx.sreg.tid.x() readnone nounwind
^
in _dump_function at reflection.jl:132
in code_llvm at reflection.jl:139
However, if I execute the same code_llvm command again directly after this, it works (the correct IR is returned). How can I avoid this error when compiling for the first time?

Input in LLVM, I think I do not understand dominance and the location of phi nodes

My goal is to do something simple in LLVM: using the C library function getchar, I want to define an LLVM function that reads an input from the command line. Here is my algorithm in pseudocode:
getInt:
    get a character, set the value to VAL
    check if VAL is '-'
        if yes then set SGN to -1 and set VAL to the next character else set SGN to 1
    set NV to the next char minus 48
    while (NV >= 0) // 48 is the first ASCII character that represents a number
        set VAL = VAL*10
        set VAL = VAL + NV
        set NV to the next char minus 48
    return SGN*VAL
The LLVM IR I came up with is, in my head, the most straightforward translation of the above. However, I get the error "PHI nodes not grouped at the top of the basic block." If I move some things around to fix this error, I get errors about dominance. Below is the LLVM IR code that gives me the PHI nodes error. I believe I am misunderstanding something basic about LLVM IR, so any help you can give is much appreciated.
define i32 @getIntLoop() {
_L1:
  %0 = call i32 @getchar()
  %1 = phi i32 [ %0, %_L1 ], [ %3, %_L2 ], [ %8, %_L4 ]
  %2 = icmp eq i32 %1, 45
  br i1 %2, label %_L2, label %_L5
_L2: ; preds = %_L1
  %3 = call i32 @getchar()
  br label %_L3
_L3: ; preds = %_L4, %_L2
  %4 = call i32 @getchar()
  %5 = icmp slt i32 %4, 40
  br i1 %5, label %_L5, label %_L4
_L4: ; preds = %_L3
  %6 = sub i32 %4, 48
  %7 = mul i32 %1, 10
  %8 = add i32 %6, %7
  br label %_L3
_L5: ; preds = %_L3, %_L1
  br i1 %2, label %_L6, label %_L7
_L6: ; preds = %_L5
  %9 = mul i32 -1, %1
  ret i32 %9
_L7: ; preds = %_L5
  ret i32 %1
}
You're getting a very clear error, though. According to the LLVM IR language reference:
There must be no non-phi instructions between the start of a basic
block and the PHI instructions: i.e. PHI instructions must be first in
a basic block.
You have a phi in _L1 that violates this: it comes after the call to getchar.
Why does it have %_L1 as one of its sources? There are no jumps to %_L1 anywhere else, and _L1 is the entry block, which has no predecessors, so a phi there has nothing to merge. I think you should first understand how phi works, possibly by compiling small pieces of C code into LLVM IR with Clang and seeing what gets generated.
Put simply, a phi is needed to keep SSA form consistent while still being able to assign one of several values to the same register. Make sure you read about SSA; it explains phi nodes as well. An additional good resource is the LLVM tutorial, which you should go through. In particular, part 5 covers phis. As suggested above, running small pieces of C through Clang is a great way to understand how things work. This is in no way "hacky"; it's the scientific method! You read the theory, think hard about it, form hypotheses about how things work, and then verify those hypotheses by running Clang and seeing what it generates for real-life control flow.

Expected top-level entity

How did you manage to get past the "expected top-level entity" error when running lli from the LLVM framework?
This error usually means that you copy-pasted part of some IR code that doesn't count as a top-level entity. In other words, it's not a function, not a type, not a global variable, etc. The same error can happen in C, just for comparison:
x = 8;
This is not valid content for a C file, because an assignment statement isn't a valid top-level entity. To make it valid, put it in a function:
void foo() {
x = 8; /* assuming x is global and visible here */
}
The same error happens in LLVM IR.
My issue: the .ll file was encoded as "UTF-8 with BOM" instead of "UTF-8 without BOM".
Fix: in Notepad++, open the Encoding menu, select "UTF-8 without BOM", then save.
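If you'd rather not depend on an editor, the same fix can be scripted. The sketch below is an assumption-laden illustration, not part of any LLVM tool: stripUtf8Bom and stripBomFromFile are hypothetical helper names, and the code simply drops the three BOM bytes (EF BB BF) when they appear at the start of the file, since those bytes sit in front of the first real token.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Return `text` without a leading UTF-8 byte order mark (EF BB BF), if present.
std::string stripUtf8Bom(const std::string& text) {
    static const std::string bom = "\xEF\xBB\xBF";
    if (text.compare(0, bom.size(), bom) == 0) return text.substr(bom.size());
    return text;
}

// Rewrite the file at `path` in place without the BOM.
// Returns false if the file cannot be read or written.
bool stripBomFromFile(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    std::string text((std::istreambuf_iterator<char>(in)),
                     std::istreambuf_iterator<char>());
    in.close();

    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out << stripUtf8Bom(text);
    return static_cast<bool>(out);
}
```

For example, stripBomFromFile("hello.ll") would rewrite hello.ll so that it starts directly with the first IR token. Files that have no BOM are written back unchanged.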
Quick setup (for LLVM 3.4.0 .ll files on Windows):
Advanced text editor from https://notepad-plus-plus.org/
LLVM binaries from https://github.com/CRogers/LLVM-Windows-Binaries
hello.ll saved as "UTF-8 without BOM" (this code is in LLVM 3.4.0 format):
@msg = internal constant [13 x i8] c"Hello World!\00"
declare i32 @puts(i8*)
define i32 @main() {
  call i32 @puts(i8* getelementptr inbounds ([13 x i8]* @msg, i32 0, i32 0))
  ret i32 0
}
In command prompt:
lli hello.ll
Quick setup (for LLVM 3.8.0 .ll files on Windows):
Advanced text editor from https://notepad-plus-plus.org/
Clang binaries from http://llvm.org/releases/download.html#3.8.0
hello.ll saved as "UTF-8 without BOM" (this code is in LLVM 3.8.0 format):
@msg = internal constant [13 x i8] c"Hello World!\00"
declare i32 @puts(i8*)
define i32 @main() {
  call i32 @puts(i8* getelementptr inbounds ([13 x i8], [13 x i8]* @msg, i32 0, i32 0))
  ret i32 0
}
In command prompt:
clang hello.ll -o hello.exe
hello.exe
Errors about char16_t, u16string, etc. mean clang needs: -fms-compatibility-version=19