In a basic block, I want to find all the values used by its instructions that are not computed in the same basic block.
Example:
for.body5:
%i.015 = phi i32 [ 0, %for.body.lr.ph ], [ %inc, %for.body ]
%add1 = add nsw i32 %2, %i.015
%arrayidx = getelementptr inbounds [100 x i32]* %b, i32 0, i32 %i.015
store i32 %add1, i32* %arrayidx, align 4, !tbaa !0
%arrayidx2 = getelementptr inbounds [100 x i32]* %a, i32 0, i32 %i.015
store i32 %add1, i32* %arrayidx2, align 4, !tbaa !0
%inc = add nsw i32 %i.015, 1
%cmp = icmp slt i32 %inc, %3
br i1 %cmp, label %for.body, label %for.cond3.preheader
In the above example I should get:
%2
%b
%a
%3
These are declared and/or assigned in other basic blocks.
Please suggest a method.
Thanks in advance.
Hi, I haven't tested this out, but I would do something like this:
std::vector<Value *> values;
// Iterate over all of the instructions in the block.
for (BasicBlock::iterator it = block->begin(); it != block->end(); ++it) {
  // Iterate over the operands used by the instruction. op_begin() is defined in the llvm::User class.
  for (User::op_iterator operand_it = it->op_begin(); operand_it != it->op_end(); ++operand_it) {
    Value *operand = operand_it->get();
    // If this operand is an argument, it was not defined in the block.
    if (isa<Argument>(operand)) {
      values.push_back(operand);
    }
    // If it is an instruction, check whether its parent is not the block in question.
    else if (Instruction *inst = dyn_cast<Instruction>(operand)) {
      if (inst->getParent() != block)
        values.push_back(operand);
    }
    // Otherwise it is a constant value, a global, a block label, etc., so skip it.
  }
}
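The filtering rule from this answer can also be tried out without an LLVM build. The following is a minimal sketch in plain C++ over a hypothetical toy model: `ToyValue`, `ToyInst`, and `externalOperands` are names invented here for illustration and are not LLVM classes. Each value records the block that defines it (an empty string stands in for a function argument), and constants are flagged so they are skipped.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Toy stand-in for llvm::Value (hypothetical, not LLVM API).
struct ToyValue {
    std::string name;
    std::string definingBlock; // empty for function arguments
    bool isConstant;
};

// Toy stand-in for llvm::Instruction: only its operands matter here.
struct ToyInst {
    std::vector<ToyValue> operands;
};

// Collect the names of operands that are neither constants nor defined in `block`.
// Each name is reported once, in order of first use.
std::vector<std::string> externalOperands(const std::vector<ToyInst>& insts,
                                          const std::string& block) {
    std::vector<std::string> result;
    for (const ToyInst& inst : insts) {
        for (const ToyValue& op : inst.operands) {
            if (op.isConstant) continue;               // constants are never "defined" anywhere
            if (op.definingBlock == block) continue;   // computed in this block
            if (std::find(result.begin(), result.end(), op.name) == result.end())
                result.push_back(op.name);
        }
    }
    return result;
}
```

Unlike the answer's loop, this sketch skips a value it has already recorded; that is a design choice for readable output, and a `std::set` would serve as well if order does not matter.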
I have LLVM IR like below :
for.body: ; preds = %for.cond
%add = add nsw i32 %i.0, 3
%idxprom = sext i32 %add to i64
%arrayidx = getelementptr inbounds i32, i32* %arr, i64 %idxprom
%0 = load i32, i32* %arrayidx, align 4
%add1 = add nsw i32 %sum1.0, %0
%add2 = add nsw i32 %i.0, 2
%idxprom3 = sext i32 %add2 to i64
%arrayidx4 = getelementptr inbounds i32, i32* %arr, i64 %idxprom3
%1 = load i32, i32* %arrayidx4, align 4
%add5 = add nsw i32 %sum2.0, %1
%add6 = add nsw i32 %i.0, 1
%idxprom7 = sext i32 %add6 to i64
%arrayidx8 = getelementptr inbounds i32, i32* %arr, i64 %idxprom7
%2 = load i32, i32* %arrayidx8, align 4
%add9 = add nsw i32 %sum3.0, %2
%idxprom10 = sext i32 %i.0 to i64
%arrayidx11 = getelementptr inbounds i32, i32* %arr, i64 %idxprom10
%3 = load i32, i32* %arrayidx11, align 4
%add12 = add nsw i32 %sum4.0, %3
br label %for.inc
I want to rearrange the GEP instructions above. For this example, they should be arranged like below:
%arrayidx11 = getelementptr inbounds i32, i32* %arr, i64 %idxprom10
%arrayidx8 = getelementptr inbounds i32, i32* %arr, i64 %idxprom7
%arrayidx4 = getelementptr inbounds i32, i32* %arr, i64 %idxprom3
%arrayidx = getelementptr inbounds i32, i32* %arr, i64 %idxprom
I know that the uses of each array access also have to be moved after this rearrangement, so I am trying to get the use chain for each GEP instruction using the code below:
// Get all the use chain instructions
for (Value::use_iterator i = inst1->use_begin(),e = inst1->use_end(); i!=e;++i) {
dyn_cast<Instruction>(*i)->dump();
}
But with this code I am getting only the declaration instruction. I was expecting to get all of the instructions below for %arrayidx4:
%arrayidx4 = getelementptr inbounds i32, i32* %arr, i64 %idxprom3
%1 = load i32, i32* %arrayidx4, align 4
Please help me out here. Thanks in advance.
I don't really like this question, but I should be doing paperwork for my taxes today...
Your first task is to find the GEPs and sort them into the order you want. When doing this, you need a separate list. LLVM's BasicBlock class does provide a list, but as a general rule, never modify that list while you're iterating over it. That's permitted but too error-prone.
So at the start:
std::vector<GetElementPtrInst *> geps;
for(auto & i : *block)
  if(GetElementPtrInst * g = dyn_cast<GetElementPtrInst>(&i))
    geps.push_back(g);
You can use any container class; your project's code standard will probably suggest using either std::whatever or an LLVM class.
Next, sort geps into the order you prefer. I leave that part out.
After that, move each GEP to the latest permissible point in the block. Which point is that? Well, if the block was valid, then each GEP is already after the values it uses and before the instructions that use it, so moving it to a possibly later point while keeping it before its users will do.
for(auto g : geps) {
  Instruction * firstUser = nullptr;
  for(auto u : g->users()) {
    Instruction * i = dyn_cast<Instruction>(u);
    if(i &&
       i->getParent() == g->getParent() &&
       (!firstUser ||
        i->comesBefore(firstUser)))
      firstUser = i;
  }
  if(firstUser)
    g->moveBefore(firstUser);
}
For each user, check that it is an instruction within the same basic block, and if it is so, check whether it's earlier in the block than the other users seen so far. Finally, move the GEP.
You may prefer a different approach. Several are possible. For example, you could reorder the GEPs after sorting them (using moveAfter() to move each GEP after the previous one) and then use a combination of users() and moveAfter() to make sure all users are after the instructions they use.
for(auto u : foo->users()) {
  Instruction * i = dyn_cast<Instruction>(u);
  if(i &&
     i->getParent() == foo->getParent() &&
     i->comesBefore(foo))
    i->moveAfter(foo);
}
Note again that this code never modifies the basic block's list while iterating over it. If you have any mysterious errors, check that first.
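To see the "move it to just before its first user" step in isolation, here is a minimal sketch in plain C++ that models a basic block as a `std::list` of instruction names. The names `Block` and `sinkToFirstUser` are hypothetical and exist only for this illustration; nothing here is LLVM API. As in the answer, the list is searched first and modified only after the search has finished.

```cpp
#include <algorithm>
#include <list>
#include <string>
#include <vector>

// Toy model of a basic block: instruction names in program order (not LLVM API).
using Block = std::list<std::string>;

// Move `def` so that it sits immediately before the earliest of its users in `block`.
// Does nothing if `def` or all of its users are missing from the block.
void sinkToFirstUser(Block& block, const std::string& def,
                     const std::vector<std::string>& users) {
    auto defIt = std::find(block.begin(), block.end(), def);
    if (defIt == block.end()) return;

    // Earliest position in block order that holds one of the users.
    auto firstUser = std::find_if(block.begin(), block.end(),
        [&](const std::string& name) {
            return std::find(users.begin(), users.end(), name) != users.end();
        });
    if (firstUser == block.end()) return;

    // splice relinks the node in place; it is a no-op if def is already right before firstUser.
    block.splice(firstUser, block, defIt);
}
```

With a block holding `gepA, gepB, loadB, loadA`, where `loadA` is the only user of `gepA`, calling `sinkToFirstUser(block, "gepA", {"loadA"})` moves `gepA` down so it sits directly before `loadA`.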
I'm new to LLVM IR and wondering how I can check the result of a function call. I'm trying to use the online compiler https://godbolt.org/
I wrote the code below, which is meant to do something like this:
if (a > b) {
return a + 20;
} else {
return a + b;
}
Am I writing it correctly? How can I verify the value that is returned by the function call?
define i32 @ssa2() {
entry:
%a = alloca i32;
%b = alloca i32;
store i32 3, i32* %a;
store i32 5, i32* %b;
%va = load i32, i32* %a;
%vb = load i32, i32* %b;
%cond = icmp ugt i32 %va, %vb;
br i1 %cond, label %IfGreater, label %IfNotGreater;
IfGreater:
%add1 = add nsw i32 %va, 20;
ret i32 %add1;
IfNotGreater:
%add2 = add nsw i32 %va, %vb;
ret i32 %add2;
}
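One way to know which value to expect is to run the same logic as ordinary C++ first and compare against it. This is a minimal sketch, assuming the constants 3 and 5 that the IR above stores into %a and %b; the function name simply mirrors the IR. Note that `>` on `int` is a signed comparison while the IR uses `icmp ugt` (unsigned); for 3 and 5 both agree, but for negative inputs `icmp sgt` would be the matching predicate.

```cpp
// Plain C++ version of the logic in the question, with a = 3 and b = 5.
int ssa2() {
    int a = 3;
    int b = 5;
    if (a > b) {
        return a + 20;
    } else {
        return a + b; // taken here, since 3 > 5 is false
    }
}
```

Since the else branch is taken, the function returns 3 + 5 = 8, which is the value the IR version should produce as well.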
How can I identify an annotated variable in an LLVM pass?
#include <stdio.h>
int main (){
int x __attribute__((annotate("my_var")))= 0;
int a,b;
x = x + 1;
a = 5;
b = 6;
x = x + a;
return x;
}
For example, I want to identify the instructions which have the annotated variable (x in this case) and print them out (x = x+1; and x = x+a)
How can I achieve this?
This is the .ll file generated using LLVM
; ModuleID = 'test.c'
source_filename = "test.c"
target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
target triple = "aarch64"
@.str = private unnamed_addr constant [7 x i8] c"my_var\00", section "llvm.metadata"
@.str.1 = private unnamed_addr constant [7 x i8] c"test.c\00", section "llvm.metadata"
; Function Attrs: noinline nounwind optnone
define i32 @main() #0 {
%1 = alloca i32, align 4
%2 = alloca i32, align 4
%3 = alloca i32, align 4
%4 = alloca i32, align 4
store i32 0, i32* %1, align 4
%5 = bitcast i32* %2 to i8*
call void @llvm.var.annotation(i8* %5, i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.s$
store i32 0, i32* %2, align 4
%6 = load i32, i32* %2, align 4
%7 = add nsw i32 %6, 1
store i32 %7, i32* %2, align 4
store i32 5, i32* %3, align 4
store i32 6, i32* %4, align 4
%8 = load i32, i32* %2, align 4
%9 = load i32, i32* %3, align 4
%10 = add nsw i32 %8, %9
store i32 %10, i32* %2, align 4
%11 = load i32, i32* %2, align 4
ret i32 %11
}
; Function Attrs: nounwind
declare void @llvm.var.annotation(i8*, i8*, i8*, i32) #1
attributes #0 = { noinline nounwind optnone "correctly-rounded-divide-sqrt-fp-math"="false" $
attributes #1 = { nounwind }
!llvm.module.flags = !{!0}
!llvm.ident = !{!1}
!0 = !{i32 1, !"wchar_size", i32 4}
I recently ran into a similar problem, and searching Google did not turn up a solution.
In the end I found Utils.cpp in the "ollvm" project, which solved my problem.
In your case,
%5 = bitcast i32* %2 to i8*
call void @llvm.var.annotation(i8* %5, i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.s$
As we can see, there is a call to @llvm.var.annotation. In our pass, we can loop over the instructions of the function and look for call instructions.
Then get the called function's name:
Function *fn = callInst->getCalledFunction();
StringRef fn_name = fn->getName();
and compare the called function's name with "llvm.var.annotation".
If they match, we have found the location of "int x" in your case.
The intrinsic "llvm.var.annotation" is documented in the LLVM Language Reference:
http://llvm.org/docs/LangRef.html#llvm-var-annotation-intrinsic
From its prototype you can see that its second argument is a pointer, which in your case points to "my_var\00". If you simply try to cast that argument to a GlobalVariable, you will not get what you want. The actual second argument passed to "llvm.var.annotation" in your case is
i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.s$
It is a constant expression, not a GlobalVariable! Knowing this, we can finally get the annotation of our target variable with:
// Use dyn_cast here: cast<> asserts on a mismatch instead of returning null.
ConstantExpr *ce =
    dyn_cast<ConstantExpr>(callInst->getOperand(1));
if (ce) {
  if (ce->getOpcode() == Instruction::GetElementPtr) {
    if (GlobalVariable *annoteStr =
            dyn_cast<GlobalVariable>(ce->getOperand(0))) {
      if (ConstantDataSequential *data =
              dyn_cast<ConstantDataSequential>(
                  annoteStr->getInitializer())) {
        if (data->isString()) {
          errs() << "Found data " << data->getAsString();
        }
      }
    }
  }
}
Hope you have already solved the problem.
Have a nice day.
You have to loop on instructions and identify calls to llvm.var.annotation
First argument is a pointer to the annotated variable (i8*).
To get the actual annotated variable, you then need to find what this pointer points to.
In your case, this is the source operand of the bitcast instruction.
There is a branch in the IR that I want to delete completely (condition + branch + true_basic_block + false_basic_block). It looks like this:
%4 = icmp sge i32 %2, %3
br i1 %4, label %5, label %7
; <label>:5 ; preds = %0
%6 = load i32* %x, align 4
store i32 %6, i32* %z, align 4
br label %9
; <label>:7 ; preds = %0
%8 = load i32* %y, align 4
store i32 %8, i32* %z, align 4
br label %9
; <label>:9 ; preds = %7, %5
%10 = call dereferenceable(140) %"class.std::basic_ostream"* @_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc(%"class.std::basic_ostream"* dereferenceable(140) @_ZSt4cout, i8* getelementptr inbounds ([5 x i8]* @.str, i32 0, i32 0))
%11 = load i32* %z, align 4
%12 = call dereferenceable(140) %"class.std::basic_ostream"* @_ZNSolsEi(%"class.std::basic_ostream"* %10, i32 %11)
%13 = call dereferenceable(140) %"class.std::basic_ostream"* @_ZNSolsEPFRSoS_E(%"class.std::basic_ostream"* %12, %"class.std::basic_ostream"* (%"class.std::basic_ostream"*)* @_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_)
ret i32 0
Now, to delete it, is there a removeBranch function, or do I need to delete the instructions one by one? I have been trying the latter, but I have seen every error from "Basic block in main does not have an terminator" to "use remains when def is destroyed", and many more. I have used eraseFromParent, ReplaceInstWithValue, ReplaceInstWithInst, removeFromParent, etc.
Can anyone be kind enough to point me in the correct direction?
This is my function_pass :
bool runOnFunction(Function &F) override {
for (auto& B : F)
for (auto& I : B)
if(auto* brn = dyn_cast<BranchInst>(&I))
if(brn->isConditional()){
Instruction* cond = dyn_cast<Instruction>(brn->getCondition());
if(cond->getOpcode() == Instruction::ICmp){
branch_vector.push_back(brn);
//removeConditionalBranch(dyn_cast<BranchInst>(brn));
}
}
/*For now just delete the branches in the vector.*/
for(auto b : branch_vector)
removeConditionalBranch(dyn_cast<BranchInst>(b));
return true;
}
I don't know of any RemoveBranch utility function, but something like this should work. The idea is to delete the branch instruction, then delete anything that becomes dead as a result, and then merge the initial block with the join block.
// for DeleteDeadBlock, MergeBlockIntoPredecessor
#include "llvm/Transforms/Utils/BasicBlockUtils.h"
// for RecursivelyDeleteTriviallyDeadInstructions
#include "llvm/Transforms/Utils/Local.h"
// for IRBuilder
#include "llvm/IR/IRBuilder.h"
void removeConditionalBranch(BranchInst *Branch) {
  assert(Branch &&
         Branch->isConditional() &&
         Branch->getNumSuccessors() == 2);
  BasicBlock *Parent = Branch->getParent();
  BasicBlock *ThenBlock = Branch->getSuccessor(0);
  BasicBlock *ElseBlock = Branch->getSuccessor(1);
  BasicBlock *ThenSuccessor = ThenBlock->getUniqueSuccessor();
  BasicBlock *ElseSuccessor = ElseBlock->getUniqueSuccessor();
  assert(ThenSuccessor && ElseSuccessor && ThenSuccessor == ElseSuccessor);
  // Save the condition first: Branch must not be touched after eraseFromParent().
  Value *Condition = Branch->getCondition();
  Branch->eraseFromParent();
  RecursivelyDeleteTriviallyDeadInstructions(Condition);
  DeleteDeadBlock(ThenBlock);
  DeleteDeadBlock(ElseBlock);
  // Give the parent block a terminator again by jumping straight to the join block.
  IRBuilder<> Builder(Parent);
  Builder.CreateBr(ThenSuccessor);
  bool Merged = MergeBlockIntoPredecessor(ThenSuccessor);
  assert(Merged);
  (void)Merged; // avoid an unused-variable warning when asserts are disabled
}
This code only handles the simple case you've shown, with the then and else blocks both jumping unconditionally to a common join block (it will fail with an assertion error for anything more complicated). More complicated control flow will be a bit trickier to handle, but you should still be able to use this code as a starting point.
As in, say my header file is:
class A
{
void Complicated();
};
And my source file
void A::Complicated()
{
...really long function...
}
Can I split the source file into
void DoInitialStuff(pass necessary vars by ref or value)
{
...
}
void HandleCaseA(pass necessary vars by ref or value)
{
...
}
void HandleCaseB(pass necessary vars by ref or value)
{
...
}
void FinishUp(pass necessary vars by ref or value)
{
...
}
void A::Complicated()
{
...
DoInitialStuff(...);
switch ...
HandleCaseA(...)
HandleCaseB(...)
...
FinishUp(...)
}
Entirely for readability and without any fear of impact in terms of performance?
You should mark the functions static so that the compiler knows they are local to that translation unit.
Without static the compiler cannot assume (barring LTO / WPA) that the function is only called once, so is less likely to inline it.
Demonstration using the LLVM Try Out page.
That said, code for readability first, micro-optimizations (and such tweaking is a micro-optimization) should only come after performance measures.
Example:
#include <cstdio>
static void foo(int i) {
int m = i % 3;
printf("%d %d", i, m);
}
int main(int argc, char* argv[]) {
for (int i = 0; i != argc; ++i) {
foo(i);
}
}
Produces with static:
; ModuleID = '/tmp/webcompile/_27689_0.bc'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
target triple = "x86_64-unknown-linux-gnu"
@.str = private constant [6 x i8] c"%d %d\00" ; <[6 x i8]*> [#uses=1]
define i32 @main(i32 %argc, i8** nocapture %argv) nounwind {
entry:
%cmp4 = icmp eq i32 %argc, 0 ; <i1> [#uses=1]
br i1 %cmp4, label %for.end, label %for.body
for.body: ; preds = %for.body, %entry
%0 = phi i32 [ %inc, %for.body ], [ 0, %entry ] ; <i32> [#uses=3]
%rem.i = srem i32 %0, 3 ; <i32> [#uses=1]
%call.i = tail call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([6 x i8]* @.str, i64 0, i64 0), i32 %0, i32 %rem.i) nounwind ; <i32> [#uses=0]
%inc = add nsw i32 %0, 1 ; <i32> [#uses=2]
%exitcond = icmp eq i32 %inc, %argc ; <i1> [#uses=1]
br i1 %exitcond, label %for.end, label %for.body
for.end: ; preds = %for.body, %entry
ret i32 0
}
declare i32 @printf(i8* nocapture, ...) nounwind
Without static:
; ModuleID = '/tmp/webcompile/_27859_0.bc'
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
target triple = "x86_64-unknown-linux-gnu"
@.str = private constant [6 x i8] c"%d %d\00" ; <[6 x i8]*> [#uses=1]
define void @foo(int)(i32 %i) nounwind {
entry:
%rem = srem i32 %i, 3 ; <i32> [#uses=1]
%call = tail call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([6 x i8]* @.str, i64 0, i64 0), i32 %i, i32 %rem) ; <i32> [#uses=0]
ret void
}
declare i32 @printf(i8* nocapture, ...) nounwind
define i32 @main(i32 %argc, i8** nocapture %argv) nounwind {
entry:
%cmp4 = icmp eq i32 %argc, 0 ; <i1> [#uses=1]
br i1 %cmp4, label %for.end, label %for.body
for.body: ; preds = %for.body, %entry
%0 = phi i32 [ %inc, %for.body ], [ 0, %entry ] ; <i32> [#uses=3]
%rem.i = srem i32 %0, 3 ; <i32> [#uses=1]
%call.i = tail call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([6 x i8]* @.str, i64 0, i64 0), i32 %0, i32 %rem.i) nounwind ; <i32> [#uses=0]
%inc = add nsw i32 %0, 1 ; <i32> [#uses=2]
%exitcond = icmp eq i32 %inc, %argc ; <i1> [#uses=1]
br i1 %exitcond, label %for.end, label %for.body
for.end: ; preds = %for.body, %entry
ret i32 0
}
Depends on aliasing (pointers to that function) and function length (a large function inlined in a branch could throw the other branch out of cache, thus hurting performance).
Let the compiler worry about that, you worry about your code :)
A complicated function is likely to have its speed dominated by the operations within the function; the overhead of a function call won't be noticeable even if it isn't inlined.
You don't have much control over the inlining of a function, the best way to know is to try it and find out.
A compiler's optimizer might be more effective with shorter pieces of code, so you might find it getting faster even if it's not inlined.
If you split up your code into logical groupings the compiler will do what it deems best: If it's short and easy, the compiler should inline it and the result is the same. If however the code is complicated, making an extra function call might actually be faster than doing all the work inlined, so you leave the compiler the option to do that too. On top of all that, the logically split code can be far easier for a maintainer to grok and avoid future bugs.
I suggest you create a helper class to break your complicated function into method calls, much like you were proposing, but without the long, boring and unreadable task of passing arguments to each and every one of these smaller functions. Pass these arguments only once by making them member variables of the helper class.
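A minimal sketch of that helper-class idea follows. The class name `ComplicatedHelper`, its members, and the arithmetic inside each step are all hypothetical placeholders made up for this illustration; the point is only that the shared state is passed once, to the constructor, and the small steps read it from member variables instead of taking long argument lists.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: holds the state that the pieces of a long function share.
class ComplicatedHelper {
public:
    explicit ComplicatedHelper(int input) : input_(input), total_(0) {}

    // The former long function, now a short sequence of named steps.
    int run() {
        doInitialStuff();
        if (input_ % 2 == 0) {
            handleCaseA();
        } else {
            handleCaseB();
        }
        finishUp();
        return total_;
    }

    // Records which steps ran, purely so the flow can be inspected.
    const std::vector<std::string>& log() const { return log_; }

private:
    void doInitialStuff() { total_ = input_; log_.push_back("init"); }
    void handleCaseA()    { total_ *= 2;     log_.push_back("caseA"); }
    void handleCaseB()    { total_ += 1;     log_.push_back("caseB"); }
    void finishUp()       { total_ += 100;   log_.push_back("finish"); }

    int input_;                     // passed once, shared by every step
    int total_;                     // working state shared by every step
    std::vector<std::string> log_;
};
```

With something like this in the source file, `A::Complicated()` could shrink to constructing the helper and calling `run()`, and the helper never needs to appear in the header.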
Don't focus on optimization at this point, make sure your code is readable and you'll be fine 99% of the time.