Setting a variable to 0 in LLVM IR - llvm

Is it possible to set a variable to 0 (or any other number) in LLVM IR? My searches turned up the following three-line snippet, but is there anything simpler?
%ptr = alloca i32 ; yields i32*:ptr
store i32 3, i32* %ptr ; yields void
%val = load i32, i32* %ptr ; yields i32:val = i32 3

To set a value to zero (or null in general) you can use
Constant::getNullValue(Type)
and to set a value with an arbitrary constant number you can use ConstantInt::get(), but you need to identify the context first, like this:
LLVMContext &context = function->getContext();
/* or BB->getContext(), BB can be any basic block in the function */
Value* constVal = ConstantInt::get(Type::getInt32Ty(context), 3);

LLVM IR is in static single assignment (SSA) form, so each variable is assigned only once. A constant does not need a variable at all; it can be used directly as an operand, e.g. add i32 %x, 3. If you want to assign a value to a memory region, you can simply use a store operation as you showed in your example:
store i32 3, i32* %ptr
The second operand has type i32*, which means that it is a pointer to a 32-bit integer.

Related

How to get the variable, which the value is assigned to, in an instruction in LLVM?

For example,
If I want to save the variables %1 and %3 from the instructions
%1 = alloca i32, align 4
%3 = load i32, i32* %2, align 4
into a Value * variable for further use, how can I get them?
It looks like getOperand() can only get the operands, but not the variables being assigned.

LLVM IR : C++ API : Typecast from i1 to i32 and i32 to i1

I am writing a compiler for a self-made language that can handle only int values, i.e. i32. Conditions and expressions are similar to those in C. Thus, I treat conditional statements as expressions, i.e. they return an int value. They can also be used in expressions, e.g. (2 > 1) + (3 > 2) will return 2. But LLVM comparisons produce an i1 value.
Now, I want each i1 result of a comparison to be converted into i32, so that I can carry out binary operations on it.
Also, I want to use variables and expression results as conditions, e.g. if(variable) or if(a + b). For that I need to convert i32 to i1.
In short, I want a way to cast from i1 to i32 and from i32 to i1. My code currently produces these kinds of errors:
For statement like if(variable) :
error: branch condition must have 'i1' type
br i32 %0, label %ifb, label %else
^
For statement like a = b > 3
error: stored value and pointer type do not match
store i1 %gttmp, i32* @a
^
Any suggestion on how to do that ?
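I figured it out. To convert from i1 to i32, as pointed out by Ismail Badawi, I used IRBuilder::CreateIntCast. So if v is a Value * pointing to an expression resulting in i1, I did the following to convert it to i32:
v = Builder.CreateIntCast(v, Type::getInt32Ty(getGlobalContext()), false);
The last argument is isSigned. It has to be false here so that the i1 is zero-extended and true becomes 1; with true the value would be sign-extended and true would become -1.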
But the same can't be applied to convert i32 to i1: it truncates the value to its least significant bit, so i32 2 results in i1 0. I needed i1 1 for any non-zero i32. If v is a Value * pointing to an expression resulting in i32, I did the following to convert it to i1:
v = Builder.CreateICmpNE(v, ConstantInt::get(Type::getInt32Ty(getGlobalContext()), 0, true));
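The difference between the two i32-to-i1 conversions, and between zero-extending and sign-extending an i1, can be sketched in plain C++ without LLVM. This is only a model of the semantics, not LLVM API code; the function names (trunc_to_i1, icmp_ne_zero, zext_i1, sext_i1, sum_of_comparisons) are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Models `trunc i32 %v to i1`: keeps only the least significant bit.
bool trunc_to_i1(std::int32_t v) { return (v & 1) != 0; }

// Models `icmp ne i32 %v, 0`: true for every non-zero value.
bool icmp_ne_zero(std::int32_t v) { return v != 0; }

// Models `zext i1 %b to i32`: true becomes 1.
std::int32_t zext_i1(bool b) { return b ? 1 : 0; }

// Models `sext i1 %b to i32`: the single bit is the sign bit, so true becomes -1.
std::int32_t sext_i1(bool b) { return b ? -1 : 0; }

// Evaluates (2 > 1) + (3 > 2) with zero-extended comparison results.
std::int32_t sum_of_comparisons() {
    std::int32_t result = zext_i1(2 > 1) + zext_i1(3 > 2);
    assert(result == 2);
    return result;
}
```

Here trunc_to_i1(2) is false while icmp_ne_zero(2) is true, which is why the comparison against zero is the conversion that matches C's if(variable).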

How to change the type of operands in the Callinst in llvm?

I am trying to implement a transformation of CallInst and perform the following:
Change the type of the arguments of function calls
Change the type of the return value
For example, I want to change the following IR:
%call = call double @add(double %0, double %1)
define double @add(double %x, double %y) #0 {
entry:
%x.addr = alloca double, align 8
%y.addr = alloca double, align 8
store double %x, double* %x.addr, align 8
store double %y, double* %y.addr, align 8
%0 = load double, double* %x.addr, align 8
%1 = load double, double* %y.addr, align 8
%add = fadd double %0, %1
ret double %add
}
To IR_New:
%call = call x86_fp80 @new_add(x86_fp80 %0, x86_fp80 %1)
define x86_fp80 @new_add(x86_fp80 %x, x86_fp80 %y) #0 {
entry:
%x.addr = alloca x86_fp80, align 16
%y.addr = alloca x86_fp80, align 16
store x86_fp80 %x, x86_fp80* %x.addr, align 16
store x86_fp80 %y, x86_fp80* %y.addr, align 16
%0 = load x86_fp80, x86_fp80* %x.addr, align 16
%1 = load x86_fp80, x86_fp80* %y.addr, align 16
%add = fadd x86_fp80 %0, %1
ret x86_fp80 %add
}
I have finished changing the type of AllocaInst, StoreInst, LoadInst, BinaryOperator and ReturnInst.
I am now very confused about how to deal with CallInst.
My original idea is when iterating all the instructions, if I find a CallInst,
if (CallInst *call = dyn_cast<CallInst>(it)){
do the following three steps:
1. Construct the new FunctionType
x86_fp80(x86_fp80, x86_fp80)
using
std::vector<Type*> ParamTys;
ParamTys.push_back(Type::getX86_FP80Ty(context));
ParamTys.push_back(Type::getX86_FP80Ty(context));
FunctionType *new_fun_type = FunctionType::get(Type::getX86_FP80Ty(context), ParamTys, true);
2. Construct function with new type in Step 1, i.e. construct new_add in the example
Function *fun = call->getCalledFunction();
Function *new_fun = Function::Create(new_fun_type,fun->getLinkage(), "", fun->getParent());
3. Construct a new CallInst with the new function obtained from step 2.
CallInst *new_call = CallInst::Create(new_fun, *arrayRefOperands, "newCall", call);
new_call->takeName(call);
}
However, in this way, I got the following IR instead of the IR_New I want:
%call = call x86_fp80 (x86_fp80, x86_fp80, ...) @0(x86_fp80 %5, x86_fp80 %7)
declare x86_fp80 @new_add(x86_fp80, x86_fp80, ...)
A new declaration of the called function is constructed (declare x86_fp80 @new_add(x86_fp80, x86_fp80, ...)), but the body of this new function is empty. I am very confused about how to add the body and get the IR_New I want. My naive idea is:
for (Instruction i : called function(add in the example)){
create new_i with type x86_fp80;
insert new_i in the new function constructed(new_add in the example);
}
Is this a good way to achieve my goal?
Any advice will be greatly appreciated :)
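You can use llvm::Value::mutateType(llvm::Type *) to change the values of type double to x86_fp80, provided the original function is no longer used anywhere else.
Go to the function definition via CallInst->getCalledFunction(), iterate over all of its values, and mutate each double-typed value to x86_fp80. Note that mutateType performs no checks, so it is easy to produce invalid IR this way.
Also, the trailing ... in your output comes from the third argument of FunctionType::get, which is isVarArg; pass false to get a non-variadic signature.
ref: http://llvm.org/docs/doxygen/html/classllvm_1_1Value.html#ac0f09c2c9951158f9eecfaf7068d7b20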

incrementing a ptr in llvm ir

I am trying to understand the getelementptr instruction in llvm IR, but not fully understanding it.
I have a struct like below -
struct Foo {
int32_t* p;
}
I want to do this -
foo.p++;
What would be the right code for this?
%0 = getelementptr %Foo* %fooPtr, i32 0, i32 0
%1 = getelementptr i32* %0, i8 1
store i32* %1, i32* %0
I am wondering if value in %0 needs to be first loaded using "load" before executing 2nd line.
Thanks!
You can think of the GEP instruction as an operation that performs arithmetic on pointers. In LLVM IR, GEP is the instruction of choice for this: you don't have to calculate the sizes of your types and the offsets by hand.
In your case:
%0 = getelementptr %Foo* %fooPtr, i32 0, i32 0
selects the member inside the structure. It uses the pointer operand %fooPtr to calculate %0 = ((fooPtr + 0) + 0). GEP does not know that fooPtr points to just one Foo, which is why two indices are used: the first steps over the pointer itself (as if it pointed to an array of Foo) and the second selects the member.
%1 = getelementptr i32* %0, i8 1
This is where your snippet goes wrong. %0 is the address of the member p (type i32**), not the value of p, and GEP only computes addresses; it never reads memory. So you do need to load p first, and then apply GEP to the loaded pointer to get p + 1:
%p = load i32** %0
%1 = getelementptr i32* %p, i32 1
Now you can store the new pointer back to the position of the p member inside the Foo struct pointed to by fooPtr:
store i32* %1, i32** %0
For further reading: The Often Misunderstood GEP Instruction
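As a rough analogy, foo.p++ can be written out in plain C++ as separate address, load, add and store steps, each of which corresponds to one IR instruction. This is a sketch that assumes GEP only computes an address and that reading p is a separate load; the names increment_p and demo are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Foo {
    std::int32_t* p;
};

// Mirrors foo.p++ as explicit address, load, add and store steps.
void increment_p(Foo* fooPtr) {
    std::int32_t** field = &fooPtr->p;  // getelementptr: address of member p, no memory access
    std::int32_t* old_p = *field;       // load: read the current pointer value
    std::int32_t* new_p = old_p + 1;    // getelementptr: advance by one i32 (4 bytes)
    *field = new_p;                     // store: write the new pointer back
}

// Hypothetical driver: returns how many elements p advanced.
std::ptrdiff_t demo() {
    std::int32_t data[2] = {10, 20};
    Foo foo{data};
    increment_p(&foo);
    assert(*foo.p == 20);  // p now points at the second element
    return foo.p - data;
}
```

The first and third steps only compute addresses; memory is touched only by the load and the store.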

Writing llvm byte code

I have just discovered LLVM and don't know much about it yet. I have been trying it out using llvm in browser. I can see that any C code I write is converted to LLVM byte code which is then converted to native code. The page shows a textual representation of the byte code. For example for the following C code:
int array[] = { 1, 2, 3};
int foo(int X) {
return array[X];
}
It shows the following byte code:
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
target triple = "x86_64-linux-gnu"
@array = global [3 x i32] [i32 1, i32 2, i32 3] ; <[3 x i32]*> [#uses=1]
define i32 @foo(i32 %X) nounwind readonly {
entry:
%0 = sext i32 %X to i64 ; <i64> [#uses=1]
%1 = getelementptr inbounds [3 x i32]* @array, i64 0, i64 %0 ; <i32*> [#uses=1]
%2 = load i32* %1, align 4 ; <i32> [#uses=1]
ret i32 %2
}
My question is: Can I write the byte code and give it to the llvm assembler to convert to native code skipping the first step of writing C code altogether? If yes, how do I do it? Does any one have any pointers for me?
One very important feature (and design goal) of the LLVM IR language is its 3-way representation:
The textual representation you can see here
The bitcode representation (the binary form, often loosely called bytecode)
The in-memory representation
All 3 are indeed completely interchangeable. Nothing that can be expressed in one cannot be expressed in the 2 others as well.
Therefore, as long as you conform to the syntax, you can indeed write the IR yourself. It is rather pointless though, unless you do it as an exercise to familiarize yourself with the format, whether to get better at reading (and diagnosing) the IR or to produce your own compiler :)
Yes, you certainly can. You can write LLVM IR by hand: tools such as llc (which generates native code for you) and opt (the LLVM IR to LLVM IR optimizer) accept the textual representation of LLVM IR as input.