How to get the variable that a value is assigned to in an LLVM instruction? - llvm

For example, if I want to save the variables %1 and %3 defined by the instructions
%1 = alloca i32, align 4
%3 = load i32, i32* %2, align 4
into Value * variables for further use, how can I get them?
It looks like getOperand() can only get the operands, but not the variables being assigned.

Related

LLVM IR : C++ API : Typecast from i1 to i32 and i32 to i1

I am writing a compiler for a self-made language that can handle only int values, i.e. i32. Conditions and expressions are similar to those in C. Thus, I treat conditional statements as expressions, i.e. they return an int value. They can also be used in expressions, e.g. (2 > 1) + (3 > 2) will return 2. But LLVM comparisons produce an i1 value.
Now, I want each i1 result of a condition to be converted to i32, so that I can carry out binary operations on it.
Also, I want to use variables and expression results as conditions, e.g. if(variable) or if(a + b). For that I need to convert i32 to i1.
In short, I want a way to cast from i1 to i32 and from i32 to i1. My code currently gives these kinds of errors:
For a statement like if(variable):
error: branch condition must have 'i1' type
br i32 %0, label %ifb, label %else
^
For a statement like a = b > 3:
error: stored value and pointer type do not match
store i1 %gttmp, i32* @a
^
Any suggestions on how to do that?
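I figured it out. To convert from i1 to i32, as pointed out by Ismail Badawi, I used IRBuilder::CreateIntCast. So if v is a Value * pointing to an expression resulting in i1, I did the following to convert it to i32. Note that the last argument is isSigned: it should be false here so that the i1 is zero-extended and true becomes 1; with true, the value is sign-extended and i1 1 becomes i32 -1.
v = Builder.CreateIntCast(v, Type::getInt32Ty(getGlobalContext()), false);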
But the same can't be applied for converting i32 to i1, because the cast truncates the value to its least significant bit. So i32 2 would result in i1 0, while I needed i1 1 for any non-zero i32. If v is a Value * pointing to an expression resulting in i32, I did the following to convert it to i1:
v = Builder.CreateICmpNE(v, ConstantInt::get(Type::getInt32Ty(getGlobalContext()), 0, true));
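The difference between the casts can be checked with a small model in plain C++. This is only a sketch of the arithmetic, not LLVM API code: the helper names below are hypothetical, and each one mimics the IR instruction named in its comment.

```cpp
#include <cstdint>

// Plain C++ models of the conversions discussed above.
// All function names are hypothetical and exist only for illustration.

// Models `trunc i32 %v to i1`: keeps only the least significant bit.
inline bool trunc_to_i1(int32_t v) { return (v & 1) != 0; }

// Models `icmp ne i32 %v, 0`: true for every non-zero value.
inline bool icmp_ne_zero(int32_t v) { return v != 0; }

// Models `zext i1 %b to i32` (an integer cast with isSigned == false).
inline int32_t zext_i1(bool b) { return b ? 1 : 0; }

// Models `sext i1 %b to i32` (an integer cast with isSigned == true).
inline int32_t sext_i1(bool b) { return b ? -1 : 0; }
```

With these models, trunc_to_i1(2) is false while icmp_ne_zero(2) is true, and zext_i1(2 > 1) + zext_i1(3 > 2) evaluates to 2, whereas the sign-extending variant would give -2.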

How to change the type of operands in the Callinst in llvm?

I am trying to implement a transformation of CallInst and perform the following:
Change the type of the arguments of function calls
Change the type of the return value
For example, I want to change the following IR:
%call = call double @add(double %0, double %1)
define double @add(double %x, double %y) #0 {
entry:
%x.addr = alloca double, align 8
%y.addr = alloca double, align 8
store double %x, double* %x.addr, align 8
store double %y, double* %y.addr, align 8
%0 = load double, double* %x.addr, align 8
%1 = load double, double* %x.addr, align 8
%add = fadd double %0, %1
ret double %add
}
To IR_New:
%call = call x86_fp80 @new_add(x86_fp80 %0, x86_fp80 %1)
define x86_fp80 @new_add(x86_fp80 %x, x86_fp80 %y) #0 {
entry:
%x.addr = alloca x86_fp80, align 16
%y.addr = alloca x86_fp80, align 16
store x86_fp80 %x, x86_fp80* %x.addr, align 16
store x86_fp80 %y, x86_fp80* %y.addr, align 16
%0 = load x86_fp80, x86_fp80* %x.addr, align 16
%1 = load x86_fp80, x86_fp80* %x.addr, align 16
%add = fadd x86_fp80 %0, %1
ret x86_fp80 %add
}
I have finished changing the type of AllocaInst, StoreInst, LoadInst, BinaryOperator and ReturnInst.
I am now very confused about how to deal with CallInst.
My original idea is when iterating all the instructions, if I find a CallInst,
if (CallInst *call = dyn_cast<CallInst>(it)){
do the following three steps:
Construct the new FunctionType
x86_fp80(x86_fp80, x86_fp80)
using
std::vector<Type*> ParamTys;
ParamTys.push_back(Type::getX86_FP80Ty(context));
ParamTys.push_back(Type::getX86_FP80Ty(context));
FunctionType *new_fun_type = FunctionType::get(Type::getX86_FP80Ty(context), ParamTys, true);
Construct function with new type in Step 1, i.e. construct new_add in the example
Function *fun = call->getCalledFunction();
Function *new_fun = Function::Create(new_fun_type,fun->getLinkage(), "", fun->getParent());
Construct a new CallInst with the new function obtained from step 2.
CallInst *new_call = CallInst::Create(new_fun, *arrayRefOperands, "newCall", call);
new_call->takeName(call);
}
However, in this way, I got the following IR instead of the IR_New I want:
%call = call x86_fp80 (x86_fp80, x86_fp80, ...) @0(x86_fp80 %5, x86_fp80 %7)
declare x86_fp80 @new_add(x86_fp80, x86_fp80, ...)
A new declaration of the called function is constructed (declare x86_fp80 @new_add(x86_fp80, x86_fp80, ...)), but the body of this new function is empty. I am very confused about how to add the body and get the IR_New I want. My naive idea is:
for (Instruction i : called function(add in the example)){
create new_i with type x86_fp80;
insert new_i in the new function constructed(new_add in the example);
}
Is this a good way to achieve my goal?
Any advice will be greatly appreciated :)
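You can use llvm::Value::mutateType(llvm::Type *) to change a value of type double to x86_fp80, provided the original function is no longer used anywhere else. Be aware that mutateType changes a value's type in place without updating its users, so every dependent value has to be mutated consistently.
Go to the function definition using CallInst->getCalledFunction(), iterate over all of its values, and mutate the double types to x86_fp80.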
ref: http://llvm.org/docs/doxygen/html/classllvm_1_1Value.html#ac0f09c2c9951158f9eecfaf7068d7b20

Setting a variable to 0 in LLVM IR

Is it possible to set a variable to 0 (or any other number) in LLVM IR? My searches turned up the following three-line snippet, but is there anything simpler?
%ptr = alloca i32 ; yields i32*:ptr
store i32 3, i32* %ptr ; yields void
%val = load i32, i32* %ptr ; yields i32:val = i32 3
To set a value to zero (or null in general) you can use
Constant::getNullValue(Type)
and to set a value with an arbitrary constant number you can use ConstantInt::get(), but you need to identify the context first, like this:
LLVMContext &context = function->getContext();
/* or BB->getContext(), BB can be any basic block in the function */
Value* constVal = ConstantInt::get(Type::getInt32Ty(context), 3);
LLVM-IR is in static single assignment (SSA) form, so each variable is only assigned once. If you want to assign a value to a memory region you can simply use a store operation as you showed in your example:
store i32 3, i32* %ptr
The type of the second argument is i32* which means that it is a pointer to an integer that is 32 bit long.

incrementing a ptr in llvm ir

I am trying to understand the getelementptr instruction in LLVM IR, but I don't fully understand it.
I have a struct like below -
struct Foo {
int32_t* p;
};
I want to do this -
foo.p++;
What would be the right code for this?
%0 = getelementptr %Foo* %fooPtr, i32 0, i32 0
%1 = getelementptr i32* %0, i8 1
store i32* %1, i32* %0
I am wondering if value in %0 needs to be first loaded using "load" before executing 2nd line.
Thanks!
You can see the GEP instruction as an operation that performs arithmetic on pointers. In LLVM IR, GEP is the instruction of choice for pointer arithmetic: you don't have to calculate the sizes of your types and the offsets by hand.
In your case:
%0 = getelementptr %Foo* %fooPtr, i32 0, i32 0
selects the member inside the structure. It uses the pointer operand %fooPtr to calculate %0 = ((fooPtr + 0) + 0). GEP does not know that fooPtr points to just one element of Foo, which is why two indices are used to select the member.
%1 = getelementptr i32* %0, i8 1
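As mentioned above, GEP performs pointer arithmetic, and the intent here is %1 = (p + 1). However, GEP only computes addresses and never accesses memory, so it will not load p for you: %0 is the address of the member p (an i32**), not the value of p, and as written this line is ill-typed.
So you do need to load the current value of p from %0 first, and then apply the GEP to the loaded i32* to advance it by one element.
Now you can store the incremented pointer back to the position of the p member inside the Foo struct pointed to by fooPtr, i.e. store an i32* through %0.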
For further reading: The Often Misunderstood GEP Instruction
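As a rough illustration, foo.p++ can be spelled out in plain C++ as the separate steps an IR sequence needs. getelementptr only computes addresses and never reads memory, so the two address computations below play the role of GEP, while reading and writing the member correspond to explicit load and store instructions. The function name increment_p is hypothetical; only Foo comes from the question.

```cpp
#include <cstdint>

struct Foo {
    int32_t* p;
};

// Spells out `foo.p++` step by step (hypothetical helper, for illustration).
inline void increment_p(Foo* fooPtr) {
    // Address of the member p inside *fooPtr: pure address computation,
    // like the first getelementptr. No memory is read here.
    int32_t** field_addr = &fooPtr->p;

    // Explicit load of the current pointer value stored in the member.
    int32_t* old_p = *field_addr;

    // Advance by one int32_t element: pure address computation again,
    // like the second getelementptr applied to the loaded pointer.
    int32_t* new_p = old_p + 1;

    // Explicit store of the new pointer back into the member.
    *field_addr = new_p;
}
```

Calling increment_p on a Foo whose p points at the first element of an array leaves p pointing at the second element, which is the effect of foo.p++.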

bit layout of vector after bitcasting in llvm

%0 = bitcast i16 %arg1 to <2 x i8>
%2 = extractelement <2 x i8> %0, i32 1
%arg1 in memory:
00000000 11111111
|--8bit--||--8bit--|
After bitcasting, %0 is a pointer to vector.
So is %0 also the address of the first element of the vector?
And what is %2 exactly? Is it the second element of vector(11111111) or,
00000000?
After bitcasting, %0 is a Value of type <2 x i8>. It is not "a pointer". The vector may very well be stored in a register when code generation to machine code happens.
%2 is i8, because extractelement is defined as:
<result> = extractelement <n x <ty>> <val>, i32 <idx> ; yields <ty>
The vector has two elements, each of type i8. %2 is a Value that holds the second element in the vector.
Note that how the vector is laid out in memory or registers is target dependent. The LLVM IR level doesn't care about that. It sees the vector as an abstract container of two values.
(I would like to post this even though the question was discussed a fairly long time ago, after reading about when it is appropriate to answer an old question, since this reply provides another way to verify the answer and references a recent documentation update.)
The value of %2 depends on the target and its endianness (the vector's memory layout depends on both, as Eli Bendersky mentioned); https://gcc.godbolt.org/z/vfcGhb4EK could be a good way to visualize that.
By reading the language reference, these two are helpful to reason about the above IR (at least to me):
1. the semantics of bitcast
2. the memory layout of vector types
Both 1 and 2 were updated from https://reviews.llvm.org/D94964 (credit to the authors and reviewers)
bitcast <type1> <value> to <type2> converts the value, %arg1 in your case, from type1 to type2 without changing the bits, given that the number of bits in the two types is the same.
%0 = bitcast i16 %arg1 to <2 x i8>
This means that %0 is now an array/vector of two 8-bit integers instead of a single 16-bit integer. Looking over the linked documentation, this appears to only be a value.
extractelement <n x <type>> <value>, i32 <index> extracts the element in the n-length array of typed values as the given type using the 32-bit integer index.
%2 = extractelement <2 x i8> %0, i32 1
This means that %2 is now an 8-bit integer with the value of element 1 (the second/last 8-bit element). Assuming little endian hardware, I would expect the value of %2 to be 0.
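One way to check this on a given machine is a small host-side model in C++. It is a sketch under two assumptions: that the bitcast behaves as if the i16 were stored to memory and read back as <2 x i8>, which is how the language reference describes it, and that the host's byte order stands in for the target's. The helper names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Returns true when the host stores the least significant byte first.
inline bool host_is_little_endian() {
    const uint16_t probe = 1;
    uint8_t first_byte;
    std::memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

// Host-side model of:
//   %0 = bitcast i16 %arg1 to <2 x i8>
//   %r = extractelement <2 x i8> %0, i32 idx
// The two lanes are the bytes of `value` in host memory order.
// `idx` must be 0 or 1.
inline uint8_t extract_lane(uint16_t value, unsigned idx) {
    uint8_t lanes[2];
    std::memcpy(lanes, &value, sizeof lanes);
    return lanes[idx];
}
```

For %arg1 = 0x00FF, extract_lane(0x00FF, 1) yields 0x00 on a little-endian host and 0xFF on a big-endian one, which matches the expectation above that %2 is 0 on little-endian hardware.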