llvm alloca dependencies - llvm

I am trying to determine for certain Load instructions from my pass their corresponding Alloca instructions (that can be in other previous blocks). The chain can be something like : TargetLoad(var) -> other stores/loads that use var (or dependencies on var) -> alloca(var). , linked on several basic blocks. Do you know how can I do it?
I tried to use the methods from DependenceAnalysis and MemoryDependenceAnalysis, but the results were not correct. For instance, MemoryDependenceAnalysis::getDependency should be good with option "Def", but works only for stores, not for loads. Also I have a segfault when trying to use MemoryDependenceAnalysis::getNonLocalPointerDependency or MemoryDependenceAnalysis::getPointerDependencyFrom . When I try to check my result using MemDepResult::getDef(), the result for Load instructions is the same instruction ! So its depending on itself, that being weird since it is using a variable that is previously defined in the code.
The alternative of making the intersection for identifying common parts between all the variables used by target_load_instructions and all the allocated variables is not an option. Because there might be something like : alloca(a) ... c=a*b+4 .... load(c).
It seems also that DependenceAnalysis::depends() is not ok for my pass. The next line of code is only for reference: if(DA.depends(allocaInstrArray[i],loadInstrArray[j],true)) is always false. And it should be true in several cases. I think I am not using it correctly.
However, I made the assumption that maybe depends() does not work for Alloca. So I checked the dependencies among all Load instructions kept in an array. Some results are not based on the loaded variable as they should. For example: LOAD %3 = load i32* %c, align 4 IS DEPENDENT ON %1 = load i32* %j, align 4. As you can see, one is loading c and one is loading j. In my Test.cpp target code there is no dependence between j and c. Maybe the dependence is not based on variables/memory locations used?
Thank you for any suggestion !

First, use getOperand(0) or getOperand(1) of the ICMP instructions. If there is isa<LoadInst> valid, then cast them to LoadInst. getPointerOperand() will get the Value* that is the actual variable which is searched.
Second, do the same procedure between Load instructions and Alloca instructions. getOperand(0) applied on Load gives the corresponding Alloca instruction.
Finally, link the two results together, by checking the dependencies.

Related

Are MachineBasicBlocks supposed to implicitly fall through to their successors?

I'm debugging an LLVM target backend, and I am chasing a problem where a certain basic block ends up jumping to "nothing", i.e. just after the end of the function, when compiled with optimizations turned on.
One thing I noticed is that after instruction selection, the machine basic block has a successor but no instruction to actually jump there:
BB#1: derived from LLVM BB %switch.lookup
Predecessors according to CFG: BB#0
%vreg5<def> = SEXT %vreg2, %SREG<imp-def,dead>; DLDREGS:%vreg5 GPR8:%vreg2
%vreg6<def,tied1> = ANDIWRdK %vreg5<tied0>, -2, %SREG<imp-def,dead>; DLDREGS:%vreg6,%vreg5
%vreg7<def> = LDIWRdK 4; DLDREGS:%vreg7
%vreg8<def> = LDIRdK 0; LD8:%vreg8
%vreg9<def> = LDIRdK 1; LD8:%vreg9
CPWRdRr %vreg6<kill>, %vreg7<kill>, %SREG<imp-def>; DLDREGS:%vreg6,%vreg7
%vreg0<def> = Select8 %vreg9<kill>, %vreg8<kill>, 1, %SREG<imp-use>; GPR8:%vreg0 LD8:%vreg9,%vreg8
Successors according to CFG: BB#2(?%)
I see similar ISel results from the x86 LLVM backend and the end result doesn't have a jump-to-nothingness, so I assume this, on its own, is not a problem:
BB#1: derived from LLVM BB %switch.lookup
Predecessors according to CFG: BB#0
%vreg7<def> = MOVSX32rr8 %vreg3; GR32:%vreg7 GR8:%vreg3
%vreg8<def,tied1> = AND32ri %vreg7<tied0>, 65534, %EFLAGS<imp-def,dead>; GR32:%vreg8,%vreg7
%vreg9<def,tied1> = SUB32ri8 %vreg8<tied0>, 4, %EFLAGS<imp-def>; GR32:%vreg9,%vreg8
%vreg0<def> = SETNEr %EFLAGS<imp-use>; GR8:%vreg0
Successors according to CFG: BB#2(?%)
So my question is: What is the mechanism by which these CFG-specified successors are supposed to be turned into real jumps? Does the x86 backend implement something special for this to work that the backend I'm debuggig doesn't?
Should I change my ISelLowering class to lower Select8 into something that ends with an explicit jump, or is that unnecessary (maybe potentially even detrimental for some optimization to kick in) and there's some other magic that I need to do so that these implicit successors are correctly lowered?
It is perfectly valid for a MachineBasicBlock to fall through to the next Block:
That is valid. Passes that want to reorder basic blocks should only do
so if the AnalyzeBranch and related target hooks (Insert/Remove) allow
it.

Extracting LLVM IR Identifiers

I have an instruction visitor implemented that inspects FCmpInst. In my IR, I have a couple lines generated from clang on a c++ file:
%2 = load float, float* %x, align 4
%3 = fcmp ogt float %2, 1.0000e+00
Calling getOperand(0) during the FCmpInst visit returns the load instruction above. Then, if I call getPointerOperand() on the load instruction, it points back to the alloca instruction that first sets aside %x. I do not want the pointer - instead, I want the identifier name "%x". How do we extract these names from the IR? I see that calling dump() on any instruction shows the identifier, but I have not found an API call that could pull out the identifier itself. Thanks!
You can use the getName method on Value.
Note that not every value is named -- in particular, you won't be able to retrieve names like %1, %2, etc. as those are generated on the fly while the IR is being written out.
I was trying to do the same thing. I needed to detect global identifiers.
isa<GlobalValue>(mem_address) did it for me.

Using global variables in JIT yields garbage result

I'm using the llvm-c API and want to use the JIT. I've created the following module
; ModuleID = '_tmp'
#a = global i64 5
define i64 #__tempfunc() {
entry:
%a_val = load i64* #a
ret i64 %a_val
}
This output is generated by LLVMDumpModule just before I call LLVMRunFunction. Which yields a LLVMGenericValueRef. However converting the result to a 64bit integer via LLVMGenericValueToInt(gv, true), it results in 360287970189639680 or something similar - not 5. Converting via LLVMGenericValueToInt(gv, false) didn't help either.
How can I use global variables in a JIT situation? Is anything wrong with the IR?
Edit: Well, i figured out that it has to do with the datalayout, since 360287970189639680 actually is 0x50...0. So I'd like to change the question to "How do I set the correct datalayout for a module? I've tried: LLVMSetDataLayout(mod, "x86_64-pc-linux") which aborts my program.
The data layout format is described in http://llvm.org/docs/LangRef.html#data-layout. And it's certainly not a target triple. Best, if you would simply feed dummy .c file to clang for your target, compile via -S -emit-llvm and grab the full data layout string from there.

llvm.stackprotect of LLVM

I just get started with LLVM. I am reading the code for stack protection which is located in lib/CodeGen/StackProtector.cpp. In this file, the InsertStackProtectors function will insert a call to llvm.stackprotect to the code:
// entry:
// StackGuardSlot = alloca i8*
// StackGuard = load __stack_chk_guard
// call void #llvm.stackprotect.create(StackGuard, StackGuardSlot)
// ...(Skip some lines)
CallInst::
Create(Intrinsic::getDeclaration(M, Intrinsic::stackprotector),
Args, "", InsPt);
This llvm.strackprotect(http://llvm.org/docs/LangRef.html#llvm-stackprotector-intrinsic) seems to be an intrinsic function of llvm, so I tried to find the source code of this function. However, I cannot find it...
I do find one line definition of this function in include/llvm/IR/Intrinsics.td, but it does not tell how it is implemented.
So my questions are:
Where can I find the code for this llvm.strackprotect function?
What is the purpose of these *.td files?
Thank you very much!
The .td file is LLVM's use of code-generation to reduce the amount of boilerplate code. In this particular case, ./include/llvm/IR/Intrinsics.gen is generated in the build directory and contains code describing the intrinsics specified in the .td file.
As for stackprotector, there's a bunch of code in the backend for handling it. See for instance lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp - in SelectionDAGBuilder::visitIntrinsicCall it generates the actual DAG nodes that implement this intrinsic

llvm dependencies alloca-load

I have some problems of finding dependencies. I want to get the corresponding Alloca from every Load (corresponding from the point of view of the variable used, meaning that the Load is using a variable based/dependent on the Alloca or Allocas).
Hence, I have a chain like : Alloca -> Load(1) -> ... -> Computation where the variable might be changed -> Store(new_var) -> ... -> Load(n)
"Computation where the variable is changed" means that : I might have Alloca(a), c=a+7000*b[32], Load(c).
First, I tried to use methods from AliasAnalysis class.
The plan was the following: after I get all the must-aliases, I categorize them into 2 :
category A : aliases with allocas
category B: aliases without allocas
For category A, is straight ahead for what I need.
For category B, I check if there is an instruction where the variable is used also from an instruction from the aliases of category A. If it is, it is ok.
For some methods I cannot use Alloca, so I try to find dependencies among Loads (I have all the Load instructions in an array loadInstrArray) and then check if some Load use the same variable as an Alloca.
But the following gave me no result at all (and they should, I have dependencies in my target test code - meaning that Load j is used multiple times in my code, so the pointers should be must-alias) :
if( AA.isMustAlias(Loci,Locj) ) - no results
if( AA.alias(Loci,Locj) ) - wrong results, like "LOAD %2 = load i32* %j, align 4 IS DEPENDENT ON %3 = load i32* %c, align 4"
where j is totally independent from c
3". if( AA.getModRefInfo(allocaInstrArray[j],Loci) ) - no results
where Loci is AliasAnalysis::Location from a Load, allocaInstrArray is an array with all allocas
Second, I tried to use methods from DependencyAnalysis class.
if (DA.depends(loadInstrArray[i],loadInstrArray[j],false)) - no results
Third, I tried methods from MemoryDependenceAnalysis class - the existing pass http://llvm.org/docs/doxygen/html/MemDepPrinter_8cpp_source.html .
I have wrong results, like : %1 = load i32* %j, align 4 IS Clobber FROM: store i32 0, i32* %c, align 4
then I tried to get only DEF (not clobber), since I only need to see the correspondent Alloca for every Load. I don't have any results for Loads. I also checked with getDef() method, Loads being dependent on themselves.
I have to mention that I commented line 00131, since I had an unsolved segfault and not the parameters are the problem.
What do you think I should focus on, what approach is better to take into account and what to eliminate?
Thank you a lot for your time !
In addition, you should check if all the time the ICMP operands are referring to a Load instruction. If not, seek recursive for Load instructions from both ICMP operands (0 and 1). There might be other intermediate operations between Loads and ICMP.
Use getOperand(0)/getOperand(1) of the ICMP instructions. If there is isa<LoadInst> valid, then cast them to LoadInst. getPointerOperand() will get the Value* that is the actual variable which is searched.
Do the same procedure between Load instructions and Alloca instructions. getOperand(0) applied on Load gives the corresponding Alloca instruction.
Link the two results together, by checking the dependencies. The result of doing it manually passes the tests.