So my C code is:
#include <stdio.h>
void main(){
int a, b,c, d;
b = 18, c = 112;
b = a - d;
d = a - d;
}
and part of its IR is:
%5 = load i32, i32* %1, align 4
%6 = load i32, i32* %4, align 4
%7 = sub nsw i32 %5, %6
store i32 %7, i32* %2, align 4
%8 = load i32, i32* %1, align 4
%9 = load i32, i32* %4, align 4
%10 = sub nsw i32 %8, %9
store i32 %10, i32* %4, align 4
I have implemented LVN algorithm to detect the redundant expression which is d = a - d. Now for optimization, I need to manipulate the instruction and make it d = b. I am not sure how to do it with llvm and how I can manipulate the IR.
I am new in llvm so it might be a silly question but I am really confused. Since, llvm works on IR, I understand that when it see "d = a - d" it will first load a and d, but the binary operation and store instruction in IR needs to be changed so that %4 gets the value from %2. Can anyone help me checking if I am understanding this correctly and how I can manipulate the IR to optimize the code.
First of all, let's replace your example program with one that does not invoke undefined behaviour (due to accessing uninitialized variables), so that the UB does not confuse the issue:
void f(int a, int b, int c, int d){
b = a - d;
d = a - d;
// Code that uses b and d
}
(I've also removed the two assignments as they didn't have any effect and will disappear after mem2reg anyway.)
Now to actually answer your question: Most optimizations run after the mem2reg pass, which converts memory accesses to registers where possible. This is important because, unlike memory locations, LLVM registers can only be assigned from a single point in the source, so mem2reg turns the code into SSA form, which is required for many optimizations to work.
If we apply mem2reg to the example code, we get:
define void #f(i32, i32, i32, i32) #0 {
%5 = sub nsw i32 %0, %3
%6 = sub nsw i32 %0, %3
; Code that uses b and d
}
So now we'd apply your analysis to find out that %6 is equivalent to %5. With that information we can remove the definition of %6 and replace all the occurrences of %6 with %5 (note that this would be more complicated if %5 and %6 were in the different basic blocks where one didn't dominate the other). To do that you can find all uses of %6 using the uses() method, which tells you which instructions have %6 as which operand. Then you can just set that operand to be a reference to %5 instead.
Related
I want to insert instructions into function without basic block, for example:
define void #_Z2f2v() nounwind {
%a = alloca i32, align 4
%b = alloca i32, align 4
store i32 2, i32* %a, align 4
%1 = load i32* %a, align 4
%2 = icmp sgt i32 %1, 0
ret void
}
But I read LLVM document, all C++ API I have are:
BasicBlock *bb = BasicBlock::Create(...);
irBuilder.setInsertPoint(bb);
irBuilder.CreateXXXInst(...);
or
Instruction *inst = new XXXInst(..., Instruction *insertBefore);
Instruction *inst = new XXXInst(..., BasicBlock *insertAtEnd);
It seems that I must create a BasicBlock at the beginning of a function.
How could I create instruction into function without BasicBlock by C++ API ?
I want to insert instructions into function without basic block, for example:
define void #_Z2f2v() nounwind {
%a = alloca i32, align 4
%b = alloca i32, align 4
store i32 2, i32* %a, align 4
%1 = load i32* %a, align 4
%2 = icmp sgt i32 %1, 0
ret void
}
That function contains exactly one basic block, not zero. To create a function like that, you add all of your instructions to the function's entry block.
How could I create instruction into function without BasicBlock by C++ API ?
You can't - neither using the C++ API nor any other way. Every instruction has to be part of a basic block by definition.
Basic blocks are the nodes in the CFG, so if you had an instruction without a basic block, it would not be part of the CFG and could therefore never be executed, which would be pointless.
int func(int i){
int j;
j = i + 0;
return j;
}
I want to practise and learn LLVM transformation pass.
For the above simple c function, I want to implement algebraic identification optimization X+0 -> X
I expect the optimized program to be
int func(int i){
int j;
j = i // remove the add instruction
return j;
}
I read about the IRBuilder, I can create Add/Sub/Mul.... a lot of instructions. But for handling the above case, I can not find any matches.
what should I do to handle the above case?
I also think if I can just remove the instruction.
And the program would be
int func(int i){
return i;
}
I am not sure if llvm will do this automatically, once I remove the useless add instruction.
Running clang -O0 -S -emit-llvm -o - test.c on your code produces following IR:
define i32 #func(i32 %i) #0 {
entry:
%i.addr = alloca i32, align 4
%j = alloca i32, align 4
store i32 %i, i32* %i.addr, align 4
%0 = load i32, i32* %i.addr, align 4
%add = add nsw i32 %0, 0
store i32 %add, i32* %j, align 4
%1 = load i32, i32* %j, align 4
ret i32 %1
}
As you can see, there is add nsw i32 %0, 0 instruction. This means that clang doesn't optimize it right away (at least on -O0) and this is instruction we are going to process by our pass.
I'll omit boilerplate code that is required to add your own pass, as it is thoroughly described in the LLVM documentation.
The pass should do something like (pseudo-code)
runOnFunction(Function& F)
{
for(each instruction in F)
if(isa<BinaryOperator>(instruction))
if(instruction.getOpcode() == BinaryInstruction::Add)
if(isa<ConstantInt>(instruction.getOperand(1))
if(extract value from constant operand and check if it is 0)
instruction.eraseFromParent()
}
To implement the avoid add zero optimization
The required things to do are:
Find the instruction where y=x+0
Replace ALL the use of y with x
Record the pointer of the instruction
Remove it afterward
I'm trying to find out whether LLVM IR temporaries can be used outside a loop in which they were defined. For that, I compiled the following simple C code:
while (*s == 'a')
{
c = *s++;
}
*s = c;
and like I suspected, the final write outside the loop (*s = c) is done
with another temporary (%tmp5) than the one read to inside the loop (%tmp4)
while.body: ; preds = %while.cond
%tmp3 = load i8*, i8** %s.addr, align 8
%incdec.ptr = getelementptr inbounds i8, i8* %tmp3, i32 1
store i8* %incdec.ptr, i8** %s.addr, align 8
%tmp4 = load i8, i8* %tmp3, align 1
store i8 %tmp4, i8* %c, align 1
br label %while.cond
while.end: ; preds = %while.cond
%tmp5 = load i8, i8* %c, align 1
%tmp6 = load i8*, i8** %s.addr, align 8
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; store i8 %tmp4, i8* %tmp6, align 1 ;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
store i8 %tmp5, i8* %tmp6, align 1
When I edit the *.ll file and manually replace %tmp5 with %tmp4,
then llvm-as is unhappy:
$ llvm-as modified.ll
Instruction does not dominate all uses!
%tmp4 = load i8, i8* %tmp3, align 1
store i8 %tmp4, i8* %tmp6, align 1
Is there any example where a temporary will be defined
inside a loop and used outside of it? Thanks!
LLVM doesn't really have temporaries, it uses SSA. That's short for static single assignment, and the key word here is single. Everything is a value and the value must always be assigned once.
Anything can use any value which necessarily has been assigned by the time it's used. "Dominates" means "provably comes before" in the error message you got, ie. LLVM sees that the input string is "b", the code will jump straight from while.cond to while.end, past while.body.
When you do use values from within loop after the end of the loop, things can get a little confusing. You may need to think hard and close the Slack and Facebook tabs. But LLVM doesn't mind.
The while.end basic block has only while.cond block as its predecessor. Thus, you can't access variables defined in while.body. It is like you want to access a variable defined in one branch from another:
if(...)
int x = ...;
else
print(x);
Instead, declare whatever variables you need in loop entry block and then use it from both while.body and while.end.
Assume a simple partial evaluation scenario:
#include <vector>
/* may be known at runtime */
int someConstant();
/* can be partially evaluated */
double foo(std::vector<double> args) {
return args[someConstant()] * someConstant();
}
Let's say that someConstant() is known and does not change at runtime (e.g. given by the user once) and can be replaced by the corresponding int literal. If foo is part of the hot path, I expect a significant performance improvement:
/* partially evaluated, someConstant() == 2 */
double foo(std::vector<double> args) {
return args[2] * 2;
}
My current take on that problem would be to generate LLVM IR at runtime, because I know the structure of the partially evaluated code (so I would not need a general purpose partial evaluator).
So I want to write a function foo_ir that generates IR code that does the same thing as foo, but not calling someConstant(), because it is known at runtime.
Simple enough, isn't it? Yet, when I look at the generated IR for the code above:
; Function Attrs: uwtable
define double #_Z3fooSt6vectorIdSaIdEE(%"class.std::vector"* %args) #0 {
%1 = call i32 #_Z12someConstantv()
%2 = sext i32 %1 to i64
%3 = call double* #_ZNSt6vectorIdSaIdEEixEm(%"class.std::vector"* %args, i64 %2)
%4 = load double* %3
%5 = call i32 #_Z12someConstantv()
%6 = sitofp i32 %5 to double
%7 = fmul double %4, %6
ret double %7
}
; Function Attrs: nounwind uwtable
define linkonce_odr double* #_ZNSt6vectorIdSaIdEEixEm(%"class.std::vector"* %this, i64 %__n) #1 align 2 {
%1 = alloca %"class.std::vector"*, align 8
%2 = alloca i64, align 8
store %"class.std::vector"* %this, %"class.std::vector"** %1, align 8
store i64 %__n, i64* %2, align 8
%3 = load %"class.std::vector"** %1
%4 = bitcast %"class.std::vector"* %3 to %"struct.std::_Vector_base"*
%5 = getelementptr inbounds %"struct.std::_Vector_base"* %4, i32 0, i32 0
%6 = getelementptr inbounds %"struct.std::_Vector_base<double, std::allocator<double> >::_Vector_impl"* %5, i32 0, i32 0
%7 = load double** %6, align 8
%8 = load i64* %2, align 8
%9 = getelementptr inbounds double* %7, i64 %8
ret double* %9
}
I see, that the [] was included from the STL definition (function #_ZNSt6vectorIdSaIdEEixEm) - fair enough. The problem is: It could as well be some member function, or even a direct data access, I simply cannot assume the data layout to be the same everywhere, so at development-time, I do not know the concrete std::vector layout of the host machine.
Is there some way to use C++ metaprogramming to get the required information at compile time? i.e. is there some way to ask llvm to provide IR for std::vector's [] method?
As a bonus: I would prefer to not enforce the compilation of the library with clang, instead, LLVM shall be a runtime-dependency, so just invoking clang at compile time (even if I do not know how to do this) is a second-best solution.
Answering my own question:
While I still have no solution for the general case (e.g. std::map), there exists a simple solution for std::vector:
According to the C++ standard, the following holds for the member function data()
Returns a direct pointer to the memory array used internally by the
vector to store its owned elements.
Because elements in the vector are guaranteed to be stored in
contiguous storage locations in the same order as represented by the
vector, the pointer retrieved can be offset to access any element in
the array.
So in fact, the object-level layout of std::vector is fixed by the standard.
From an llvm pass, I need to print an llvm instruction (Type llvm::Instruction) on the screen, just like as it appears in the llvm bitcode file. Actually my compilation is crashing, and does not reach the point where bitcode file is generated. So for debugging I want to print some instructions to know what is going wrong.
Assuming I is your instruction
I.print(errs());
By simply using the print method.
For a simple Hello World program, using C++'s range-based loops, you can do something like this:
for(auto& B: F){
for(auto& I: B){
errs() << I << "\n";
}
}
This gives the output:
%3 = alloca i32, align 4
%4 = alloca i8**, align 8
store i32 %0, i32* %3, align 4
store i8** %1, i8*** %4, align 8
%5 = call i32 (i8*, ...) #printf(i8* getelementptr inbounds ([15 x i8], [15 x i8]* #.str, i64 0, i64 0))
ret i32 0