Address Problems in LLVM Interpreter - llvm

I'm using the LLVM interpreter. I instrumented ExecutionEngine.cpp and Execution.cpp to read variable values, and I can see the current values, but I have another problem: different binary operations report the same operand addresses. I assume these are temporary addresses, but why are they identical? I need them to be distinct to get my results.
To be clearer, here are two instructions from different basic blocks, together with the interpreter information I collected for each.
The first instruction is:
About to interpret: %BB1 = add i32 %4, 1
Basic Block Name: CBB1
arg0: %4 = load i32, i32* @BB1
arg1: i32 1
visitBinaryOperator
Source1 Current Input ::: 10
Source1 Current Address ::: 0x7fff4582cd90
Source1 [ 0, 41 ]
Source1 Current Input::: 1
Source1 Current Input ::: 10
Sourc2 Current Address::: 0x7fff4582cdc0
The second instruction is:
About to interpret: %10 = add nsw i32 %9, %8
BasicBlock Name: CBB2
arg0: %9 = load i32, i32* %sum, align 4
arg1: %8 = load i32, i32* %7, align 4
visitBinaryOperator
Source1 Current Input ::: 0
Source1 Current Address ::: 0x7fff4582cd90
Source2 Current Input ::: 0
Source2 Current Address::: 0x7fff4582cdc0
Part of BasicBlock CBB2
%8 = load i32, i32* %7, align 4
%9 = load i32, i32* %sum, align 4
%10 = add nsw i32 %9, %8
store i32 %10, i32* %sum, align 4
I need to run an analysis that uses the current and previous values of each instruction's inputs and outputs.
I got the values from the function Interpreter::visitBinaryOperator:
ExecutionContext &SF = ECStack.back();
Type *Ty = I.getOperand(0)->getType();
GenericValue Src1 = getOperandValue(I.getOperand(0), SF);
GenericValue Src2 = getOperandValue(I.getOperand(1), SF);
GenericValue R; // Result
Getting addresses y1, y2, r and values v1, v2, rv:
uint8_t *y1 = reinterpret_cast<uint8_t *>(
    const_cast<uint64_t *>(Src1.IntVal.getRawData()));
int v1 = *reinterpret_cast<int *>(y1);
uint8_t *y2 = reinterpret_cast<uint8_t *>(
    const_cast<uint64_t *>(Src2.IntVal.getRawData()));
int v2 = *reinterpret_cast<int *>(y2);
uint8_t *r = reinterpret_cast<uint8_t *>(
    const_cast<uint64_t *>(R.IntVal.getRawData()));
int rv = *reinterpret_cast<int *>(r);
I have another idea: if I cannot solve this for the add instruction, could I use the load instructions before the add and the store instruction after it to get my results? I can already get valid results from each of them individually, but how do I connect them as the arguments of the binary operation?
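One possible direction, offered as a sketch under assumptions rather than something taken from the interpreter sources: Src1 and Src2 in the snippet above are local variables of visitBinaryOperator, so pointers taken from them refer to per-call temporaries and not to storage owned by a particular instruction. A history of current and previous values could instead be keyed by something stable, such as the address of the instruction being visited (&I) together with an operand index. ValueTracker, OperandHistory, record and lookup are hypothetical names; the sketch uses only the standard library and a plain const void * key in place of an llvm::Instruction pointer.

```cpp
#include <cstdint>
#include <map>

// Current and previous value seen for one operand of one instruction.
struct OperandHistory {
    std::int64_t current = 0;
    std::int64_t previous = 0;
    bool hasPrevious = false;  // false until the operand was seen twice
};

// Hypothetical helper: keeps value history per (instruction, operand index).
class ValueTracker {
public:
    // 'inst' stands in for a stable identity such as &I in the interpreter.
    void record(const void *inst, unsigned operandIndex, std::int64_t value) {
        std::map<unsigned, OperandHistory> &perInst = history_[inst];
        auto it = perInst.find(operandIndex);
        if (it == perInst.end()) {
            OperandHistory h;
            h.current = value;
            perInst[operandIndex] = h;
            return;
        }
        // Shift the old value into 'previous' before storing the new one.
        it->second.previous = it->second.current;
        it->second.hasPrevious = true;
        it->second.current = value;
    }

    // Returns nullptr when nothing was recorded for this operand yet.
    const OperandHistory *lookup(const void *inst, unsigned operandIndex) const {
        auto outer = history_.find(inst);
        if (outer == history_.end()) return nullptr;
        auto inner = outer->second.find(operandIndex);
        return inner == outer->second.end() ? nullptr : &inner->second;
    }

private:
    std::map<const void *, std::map<unsigned, OperandHistory>> history_;
};
```

With this shape, two add instructions in different basic blocks get separate histories even though the temporaries that hold their operands share stack addresses.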

Related

How to create instruction in function without basic block by LLVM C++ API?

I want to insert instructions into a function without a basic block, for example:
define void @_Z2f2v() nounwind {
%a = alloca i32, align 4
%b = alloca i32, align 4
store i32 2, i32* %a, align 4
%1 = load i32* %a, align 4
%2 = icmp sgt i32 %1, 0
ret void
}
But according to the LLVM documentation, the only C++ APIs available are:
BasicBlock *bb = BasicBlock::Create(...);
irBuilder.SetInsertPoint(bb);
irBuilder.CreateXXXInst(...);
or
Instruction *inst = new XXXInst(..., Instruction *insertBefore);
Instruction *inst = new XXXInst(..., BasicBlock *insertAtEnd);
It seems that I must create a BasicBlock at the beginning of a function.
How can I create an instruction in a function without a BasicBlock using the C++ API?
I want to insert instructions into a function without a basic block, for example:
define void @_Z2f2v() nounwind {
%a = alloca i32, align 4
%b = alloca i32, align 4
store i32 2, i32* %a, align 4
%1 = load i32* %a, align 4
%2 = icmp sgt i32 %1, 0
ret void
}
That function contains exactly one basic block, not zero. To create a function like that, you add all of your instructions to the function's entry block.
How can I create an instruction in a function without a BasicBlock using the C++ API?
You can't - neither using the C++ API nor any other way. Every instruction has to be part of a basic block by definition.
Basic blocks are the nodes in the CFG, so if you had an instruction without a basic block, it would not be part of the CFG and could therefore never be executed, which would be pointless.

Is LLVM dependence analysis able to output dependence between two instructions other than store & load?

I am using LLVM's DependenceAnalysisWrapperPass to obtain the dependence between two IR instructions. However, this analysis seems to report only dependences between load and store instructions, and not, say, the dependence between a load and an arithmetic instruction. Is there a pass in LLVM that can report more comprehensive dependences among instructions?
For example:
%retval = alloca i32, align 4
%a = alloca i32, align 4
%b = alloca i32, align 4
%r = alloca i32, align 4
store i32 0, i32* %retval, align 4
store i32 1, i32* %a, align 4
store i32 2, i32* %b, align 4
%0 = load i32, i32* %a, align 4
%1 = load i32, i32* %b, align 4
%add = add nsw i32 %0, %1
store i32 %add, i32* %r, align 4
%2 = load i32, i32* %r, align 4
ret i32 %2
By using the DependenceAnalysisWrapperPass, it outputs the following dependency graph
[Figure: dependency graph]
It shows that the two load instructions depend on the two store instructions, respectively. However, it does not show the dependence between, say, the two load instructions and the add instruction that follows them. This is expected, since the code of DependenceAnalysisWrapperPass says that it only reports dependences between store and load instructions. Is there a pass available that shows the other dependences as well?
The IR itself shows the information you want.
Each instruction's operands are precisely those instructions (or other values) on which it depends. This is a general principle of LLVM. The pass you've seen exists because loads and stores are an exception: they can also depend on each other through memory, which their operands do not show.
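To illustrate the operand principle, here is a minimal sketch on a toy model of a basic block. It is not LLVM code: Inst and operandDependencies are hypothetical names, and values are plain strings. In LLVM itself the same walk would go over an instruction's operands() and keep those that are themselves instructions.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for an IR instruction (hypothetical, not an LLVM type).
struct Inst {
    std::string result;                 // SSA name it defines, empty for store/ret
    std::string opcode;
    std::vector<std::string> operands;  // SSA names or literals
};

// For each instruction, lists the positions of the earlier instructions
// whose results it uses as operands.
std::vector<std::vector<std::size_t>>
operandDependencies(const std::vector<Inst> &block) {
    std::map<std::string, std::size_t> definedAt;  // SSA name -> position
    std::vector<std::vector<std::size_t>> deps(block.size());
    for (std::size_t i = 0; i < block.size(); ++i) {
        for (const std::string &op : block[i].operands) {
            auto it = definedAt.find(op);
            if (it != definedAt.end())   // literals such as "1" are skipped
                deps[i].push_back(it->second);
        }
        if (!block[i].result.empty())
            definedAt[block[i].result] = i;
    }
    return deps;
}
```

Applied to the example above, the add instruction is reported as depending on the two loads, and each load on the alloca that produced its pointer. The load-after-store relation through memory is not found this way, which is the gap the dependence pass fills.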

Erasing redundant expression with llvm and local value numbering algorithm

So my C code is:
#include <stdio.h>
void main(){
int a, b,c, d;
b = 18, c = 112;
b = a - d;
d = a - d;
}
and part of its IR is:
%5 = load i32, i32* %1, align 4
%6 = load i32, i32* %4, align 4
%7 = sub nsw i32 %5, %6
store i32 %7, i32* %2, align 4
%8 = load i32, i32* %1, align 4
%9 = load i32, i32* %4, align 4
%10 = sub nsw i32 %8, %9
store i32 %10, i32* %4, align 4
I have implemented the LVN algorithm to detect the redundant expression, which is d = a - d. Now, for the optimization, I need to manipulate the instruction and turn it into d = b. I am not sure how to do this with LLVM or how to manipulate the IR.
I am new to LLVM, so this might be a silly question, but I am really confused. Since LLVM works on IR, I understand that when it sees "d = a - d" it will first load a and d, but the binary operation and the store instruction in the IR need to be changed so that %4 gets its value from %2. Can anyone help me check whether I understand this correctly, and explain how to manipulate the IR to optimize the code?
First of all, let's replace your example program with one that does not invoke undefined behaviour (due to accessing uninitialized variables), so that the UB does not confuse the issue:
void f(int a, int b, int c, int d){
b = a - d;
d = a - d;
// Code that uses b and d
}
(I've also removed the two assignments as they didn't have any effect and will disappear after mem2reg anyway.)
Now to actually answer your question: Most optimizations run after the mem2reg pass, which converts memory accesses to registers where possible. This is important because, unlike memory locations, LLVM registers can only be assigned from a single point in the source, so mem2reg turns the code into SSA form, which is required for many optimizations to work.
If we apply mem2reg to the example code, we get:
define void @f(i32, i32, i32, i32) #0 {
%5 = sub nsw i32 %0, %3
%6 = sub nsw i32 %0, %3
; Code that uses b and d
}
So now we'd apply your analysis to find out that %6 is equivalent to %5. With that information we can remove the definition of %6 and replace all occurrences of %6 with %5 (note that this would be more complicated if %5 and %6 were in different basic blocks where neither dominated the other). To do that, you can find all uses of %6 with the uses() method, which tells you which instructions have %6 as which operand. Then you can just set that operand to refer to %5 instead.
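The analysis step can be sketched on a toy model as follows. This is an illustration under assumptions, not the asker's implementation: Expr and findRedundant are hypothetical names, values are plain strings, and the block is assumed to be in SSA form so no name is ever redefined. In LLVM the rewrite itself is usually done with Value::replaceAllUsesWith followed by Instruction::eraseFromParent on the redundant instruction.

```cpp
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Toy stand-in for a binary instruction: dest = op lhs, rhs.
struct Expr {
    std::string dest, op, lhs, rhs;
};

// Local value numbering over one block in SSA form. Returns a map from
// each redundant destination to the earlier value that computes the same
// expression.
std::map<std::string, std::string>
findRedundant(const std::vector<Expr> &block) {
    std::map<std::tuple<std::string, std::string, std::string>, std::string>
        available;                               // (op, lhs, rhs) -> first dest
    std::map<std::string, std::string> replacement;
    for (const Expr &e : block) {
        // Look through replacements found so far, so that chains of
        // redundant expressions are also detected.
        auto canon = [&](const std::string &v) -> std::string {
            auto it = replacement.find(v);
            return it == replacement.end() ? v : it->second;
        };
        auto key = std::make_tuple(e.op, canon(e.lhs), canon(e.rhs));
        auto hit = available.find(key);
        if (hit != available.end())
            replacement[e.dest] = hit->second;   // same expression seen before
        else
            available.emplace(key, e.dest);
    }
    return replacement;
}
```

For the two sub instructions above, the result maps %6 to %5. The sketch does not treat commutative operators specially, so add %a, %b and add %b, %a would count as different expressions.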

LLVM IR: getting the value of an address

I'm trying to write a LLVM pass to analyse the following IR:
%d = alloca i32, align 4
store i32 0, i32* %d, align 4
%1 = load i32* %d, align 4
%2 = add nsw i32 %1, 2
store i32 %2, i32* %d, align 4
What I need to do is to figure out the final value of d.
For the store i32 0, i32* %d, align 4 I used ConstantInt casting for the operand 0 and found the assigned value for d (which is 0). But I'm struggling with how to find the value for the d in last store instruction:
store i32 %2, i32* %d, align 4
As far as I know, %2 refers to the result of the instruction %2 = add nsw i32 %1, 2, and the same holds for %1.
Do I need to backtrack from %2 to find its value, or is there a simpler method?
EDIT:
Following is the code I used so far:
void analyse(BasicBlock *BB)
{
    for (auto &I : *BB)
    {
        // Only store instructions are of interest here.
        if (auto *SI = dyn_cast<StoreInst>(&I))
        {
            Value *v = SI->getValueOperand();    // the value being stored
            Value *p = SI->getPointerOperand();  // the destination, e.g. %d
            // dyn_cast yields null when the stored value is not a constant.
            if (auto *CI = dyn_cast<ConstantInt>(v))
            {
                uint64_t value = CI->getZExtValue();
                std::string ope = p->getName().str();
                std::cout << "ope " << ope << " " << value << " \n";
            }
        }
    }
}
The way to solve this is to backtrack. In this case:
store i32 %2, i32* %d, align 4
%2 = add nsw i32 %1, 2
%1 = load i32* %d, align 4
So you check whether the operand is an instruction and, if it is, check which kind of instruction it is (i.e. isa<T>(v) for each instruction class T you want to handle), and then find the value from that instruction's own operands.
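The backtracking can be sketched on a toy model of the block. This is an illustration under assumptions, not LLVM code: Op, Inst, resolve and lastStored are hypothetical names, operands are strings, only store, load and add are modelled, and everything is assumed to live in one basic block.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

enum class Op { Store, Load, Add };

// Toy instruction. Store: a = value, b = pointer. Load: a = pointer.
// Add: a and b are the two operands. Literals are written as "0", "2", ...
struct Inst {
    Op op;
    std::string result;  // SSA name defined, empty for Store
    std::string a, b;
};

bool lastStored(const std::vector<Inst> &block, const std::string &pointer,
                std::size_t before, std::int64_t &out);

// Resolves 'operand' to a constant by walking back from position 'before'.
bool resolve(const std::vector<Inst> &block, const std::string &operand,
             std::size_t before, std::int64_t &out) {
    if (operand.empty()) return false;
    if (operand[0] != '%') {             // a literal: nothing to backtrack
        out = std::stoll(operand);
        return true;
    }
    for (std::size_t i = before; i-- > 0;) {
        const Inst &def = block[i];
        if (def.result != operand) continue;
        if (def.op == Op::Add) {         // backtrack into both operands
            std::int64_t l = 0, r = 0;
            if (!resolve(block, def.a, i, l) || !resolve(block, def.b, i, r))
                return false;
            out = l + r;
            return true;
        }
        if (def.op == Op::Load)          // value is whatever was stored last
            return lastStored(block, def.a, i, out);
        return false;
    }
    return false;                        // defined outside this block
}

// Finds the most recent store to 'pointer' before 'before' and resolves it.
bool lastStored(const std::vector<Inst> &block, const std::string &pointer,
                std::size_t before, std::int64_t &out) {
    for (std::size_t i = before; i-- > 0;)
        if (block[i].op == Op::Store && block[i].b == pointer)
            return resolve(block, block[i].a, i, out);
    return false;
}
```

Calling lastStored on the whole block with pointer "%d" follows the chain store, add, load, store from the question and yields 2. A real pass would also have to give up when the pointer may be written through another name or in another basic block.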

How to know the type of a variable in an llvm code

Is there any method to know the type of the variables in the LLVM code?
For example, I have the following code:
%i = alloca i32, align 4
store i32 1, i32* %i, align 4
%n = add i32 6, 1
br label %2
I want a function that returns the type of each of the variables %i, %n and %2, i.e. i32*, i32 and label respectively.
Any suggestions?
Type* var_type = cur_instruction->getType();
The lines %i = alloca i32, align 4; store i32 1, i32* %i, align 4; and %n = add i32 6, 1 are instructions. You can query their type via their getType method.
%2 is a basic block and has label type. You can check whether a value is a basic block by using isa<BasicBlock>.