Using global variables in JIT yields garbage result

I'm using the llvm-c API and want to use the JIT. I've created the following module:
; ModuleID = '_tmp'
@a = global i64 5
define i64 @__tempfunc() {
entry:
%a_val = load i64* @a
ret i64 %a_val
}
This output is generated by LLVMDumpModule just before I call LLVMRunFunction, which yields an LLVMGenericValueRef. However, converting the result to a 64-bit integer via LLVMGenericValueToInt(gv, true) gives 360287970189639680 or something similar, not 5. Converting via LLVMGenericValueToInt(gv, false) didn't help either.
How can I use global variables in a JIT situation? Is anything wrong with the IR?
Edit: Well, I figured out that it has to do with the data layout, since 360287970189639680 is actually 0x50...0. So I'd like to change the question to "How do I set the correct data layout for a module?" I've tried LLVMSetDataLayout(mod, "x86_64-pc-linux"), which aborts my program.

The data layout format is described in http://llvm.org/docs/LangRef.html#data-layout, and it is certainly not a target triple. The simplest approach is to feed a dummy .c file to clang for your target, compile it with -S -emit-llvm, and copy the full data layout string from the output.
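The value reported in the question can be checked by hand: it is 5 with its eight bytes in reverse order, which is consistent with a byte-order mismatch between the module and the host. The following is a minimal standalone sketch of that arithmetic only; it does not use the LLVM API, and byteswap64 is a hypothetical helper name introduced for illustration.

```cpp
#include <cstdint>

// Reverse the byte order of a 64-bit value (portable, no compiler builtins).
// Hypothetical helper for illustration; not part of LLVM.
std::uint64_t byteswap64(std::uint64_t v) {
    std::uint64_t out = 0;
    for (int i = 0; i < 8; ++i) {
        out = (out << 8) | (v & 0xFF);  // append the current lowest byte of v
        v >>= 8;                        // then drop it from v
    }
    return out;
}
```

Here byteswap64(360287970189639680ULL) evaluates to 5, and 360287970189639680 written in hexadecimal is 0x0500000000000000.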

Related

LLVM IR Loading type name mangling

I'm having a hard time fixing this problem with loading LLVM-IR modules.
First of all here is the problem.
I have this LLVM-IR File with this instruction:
%7 = getelementptr inbounds %struct.hoge, %struct.hoge* %6, i32 0, i32 0, !dbg !34
Now, when I load it using my tool, it turns to this:
%7 = getelementptr inbounds %struct.hoge.0, %struct.hoge.0* %6, i32 0, i32 0, !dbg !34
My goal is to get rid of this .0 at the end of all struct types.
Here is some code.
Module loading:
Module *FPR_::load(const path &input) const {
// Construct module
SMDiagnostic err;
unique_ptr<Module> module{parseIRFile(input.string(), err, ctx, true, "")};
...
Module printing:
ofstream ofs("hoge.ll");
string str;
raw_string_ostream rso(str);
module->print(rso, NULL, true, true);
ofs << str;
Now, let me tell you what I've done so far.
I've debugged my instrumentation tool and found out that this .0 gets added when the module is loaded, not when it is printed. I know this because my tool outputs some instructions for debugging purposes, and they already have the .0 suffixes right after loading.
I've tried loading and printing the same module multiple times, and this is the result:
%7 = getelementptr inbounds %struct.hoge.0.0.0.0.0.0, %struct.hoge.0.0.0.0.0.0* %6, i32 0, i32 0, !dbg !34
This would be funny if it weren't stopping me from developing my tool.
Verified that this does not happen with the opt tool.
[user@host]opt hoge.ll -o foo.ll
[user@host]diff hoge.ll foo.ll
[user@host]
This means there must be something opt does that I'm missing.
Read opt's source code.
Some of the code looked relevant, and here is what I ended up with:
LLVMContext ctx;
ctx.setDiscardValueNames(false);
ctx.enableDebugTypeODRUniquing();
FPR_::FPR_ FPR_(path(src_ll.getValue()), path(dep_ll.getValue()), ctx, options); // Module gets loaded in this constructor.
The options on the context are the only thing I found that opt seemed to do differently, but they have made no difference so far.
Frankly, I'm out of ideas on how to fix this. So any help would be greatly appreciated.
And as a side note, I know that it is recommended to use the opt tool itself for static analysis or instrumentation.
But the problem is, the tool I am trying to make uses the static analysis for dynamic analysis as well. So I do not want my program to end when it outputs the instrumented LLVM-IR.
I could do some dirty hack and not return from the runOnModule method and just print the Module from there. But I do want to avoid that if possible.
Thank you in advance!

Short Answer:
Do not use the same LLVMContext to read modules if they are NOT supposed to be in the same program.
Long Answer:
By reading multiple Modules with the same LLVMContext, you imply that those Modules are supposed to be within the same program.
So when multiple Modules contain the same type name, the names get indexed to avoid conflicts.
Thank you for all the help in the comments. :)
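To illustrate why a shared context leads to suffixed names, here is a toy model of a context that owns all named struct types. It is a simplified sketch and not the LLVM API: ToyContext and registerStructName are hypothetical names, and the exact suffix scheme (a dot followed by a running counter) is an assumption chosen to match the .0 seen in the question.

```cpp
#include <string>
#include <unordered_set>

// Toy stand-in for a context that owns every named struct type.
// NOT the LLVM API; it only models renaming on a name collision.
class ToyContext {
public:
    // Registers a struct type name and returns the name actually used.
    // If the name is already taken in this context, ".<counter>" is
    // appended to the requested name until an unused name is found.
    std::string registerStructName(const std::string& name) {
        std::string candidate = name;
        while (!names_.insert(candidate).second) {
            candidate = name + "." + std::to_string(nextSuffix_++);
        }
        return candidate;
    }

private:
    std::unordered_set<std::string> names_;  // all names owned by this context
    unsigned nextSuffix_ = 0;                // assumed context-wide counter
};
```

In this model, registering struct.hoge twice in one ToyContext yields struct.hoge and then struct.hoge.0, whereas registering it once in each of two separate ToyContext objects leaves both names unchanged. That mirrors the advice above to use separate contexts for modules that do not belong to the same program.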

Extracting LLVM IR Identifiers

I have an instruction visitor implemented that inspects FCmpInst. In my IR, I have a couple of lines generated by clang from a C++ file:
%2 = load float, float* %x, align 4
%3 = fcmp ogt float %2, 1.0000e+00
Calling getOperand(0) during the FCmpInst visit returns the load instruction above. Then, if I call getPointerOperand() on the load instruction, it points back to the alloca instruction that first sets aside %x. I do not want the pointer - instead, I want the identifier name "%x". How do we extract these names from the IR? I see that calling dump() on any instruction shows the identifier, but I have not found an API call that could pull out the identifier itself. Thanks!

You can use the getName method on Value.
Note that not every value is named -- in particular, you won't be able to retrieve names like %1, %2, etc. as those are generated on the fly while the IR is being written out.

I was trying to do the same thing. I needed to detect global identifiers.
isa<GlobalValue>(mem_address) did it for me.

LLVM get operand and lvalue name of an instruction

For an LLVM IR instruction like %cmp7 = icmp eq i32 %6, %7, I want to get all three register/symbol names (i.e. %cmp7, %6 and %7).
I can get the string %cmp7 by calling pi->getName(), where pi is an Instruction pointer. But when I try to get the operand names with pi->getOperand(0)->getName(), I get an empty string.
I tried isa<Instruction>(pi->getOperand(0)) to check whether the operand is an instruction, and it returned true, but pi->getOperand(0)->hasName() returned false. What puzzles me is why both pi and pi->getOperand(0) are instructions, yet only pi has a name.
Is there a way to get the operand names (the strings %6 and %7 here) using the API?
The LLVM version I'm using is 3.4.2.

Names are optional for LLVM instructions, and indeed the two operands of your icmp instruction in this case don't have a name, hence the empty string.
When you print an LLVM module to an .ll file, the writer assigns a %<num> name to each unnamed instruction to make the output human-readable. This happens only during printing; that string does not exist in the actual module.
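The distinction between stored names and printed numbers can be sketched with a toy model. This is not the LLVM API: ToyValue and printedLabels are hypothetical names, and the numbering scheme (a single counter over unnamed values, in order) is a simplifying assumption.

```cpp
#include <string>
#include <vector>

// Toy value whose name is optional. NOT the LLVM API.
struct ToyValue {
    std::string name;  // empty means the value is unnamed
};

// Returns the label a printer would show for each value, in order.
// Named values keep their name; unnamed ones get the next free number.
// The number is computed here and never stored back into the value.
std::vector<std::string> printedLabels(const std::vector<ToyValue>& values) {
    std::vector<std::string> labels;
    unsigned nextSlot = 0;
    for (const ToyValue& v : values) {
        if (!v.name.empty()) {
            labels.push_back("%" + v.name);
        } else {
            labels.push_back("%" + std::to_string(nextSlot++));
        }
    }
    return labels;
}
```

After printing, an unnamed ToyValue still has an empty name, which is the toy equivalent of getName() returning an empty string for an operand that shows up as a number in the .ll file.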

llvm dependencies alloca-load

I am having trouble finding dependencies. I want to get the corresponding Alloca for every Load (corresponding from the point of view of the variable used, meaning that the Load uses a variable based on, or dependent on, the Alloca or Allocas).
Hence, I have a chain like: Alloca -> Load(1) -> ... -> Computation where the variable might be changed -> Store(new_var) -> ... -> Load(n)
"Computation where the variable is changed" means that I might have Alloca(a), c=a+7000*b[32], Load(c).
First, I tried to use methods from the AliasAnalysis class.
The plan was the following: after getting all the must-aliases, I categorize them into two groups:
category A: aliases with allocas
category B: aliases without allocas
Category A is straightforward and gives me what I need.
For category B, I check whether there is an instruction where the variable is also used by an instruction from the aliases in category A. If there is, it is OK.
For some methods I cannot use an Alloca, so I try to find dependencies among Loads (I have all the Load instructions in an array, loadInstrArray) and then check whether some Load uses the same variable as an Alloca.
But the following gave me no results at all, although they should: my target test code has dependencies, in that Load j is used multiple times, so the pointers should be must-alias:
1. if( AA.isMustAlias(Loci,Locj) ) - no results
2. if( AA.alias(Loci,Locj) ) - wrong results, like "LOAD %2 = load i32* %j, align 4 IS DEPENDENT ON %3 = load i32* %c, align 4", where j is totally independent from c
3. if( AA.getModRefInfo(allocaInstrArray[j],Loci) ) - no results
where Loci is an AliasAnalysis::Location from a Load and allocaInstrArray is an array with all allocas
Second, I tried to use methods from the DependenceAnalysis class.
if (DA.depends(loadInstrArray[i],loadInstrArray[j],false)) - no results
Third, I tried methods from the MemoryDependenceAnalysis class, using the existing pass http://llvm.org/docs/doxygen/html/MemDepPrinter_8cpp_source.html .
I got wrong results, like: %1 = load i32* %j, align 4 IS Clobber FROM: store i32 0, i32* %c, align 4
Then I tried to get only Def results (not Clobber), since I only need the corresponding Alloca for every Load. I get no results for Loads. I also checked with the getDef() method, and the Loads come out as dependent on themselves.
I should mention that I commented out line 00131 because of an unresolved segfault; the parameters are not the problem.
What do you think I should focus on? Which approach is worth pursuing, and which should I drop?
Thank you very much for your time!

In addition, you should check whether the ICMP operands always refer to a Load instruction. If not, search recursively for Load instructions from both ICMP operands (0 and 1). There might be other intermediate operations between the Loads and the ICMP.

Use getOperand(0)/getOperand(1) on the ICMP instructions. If isa<LoadInst> holds for an operand, cast it to LoadInst. getPointerOperand() then returns the Value* for the variable you are looking for.
Follow the same procedure between Load instructions and Alloca instructions. getOperand(0) applied to a Load gives the corresponding Alloca instruction.
Link the two results together by checking the dependencies. Doing this manually passes the tests.
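The manual walk described above (from a comparison, back through any intermediate operations and Loads, to the Allocas) can be sketched on a toy instruction graph. This is not the LLVM API: Op, ToyInst, collectAllocas and allocasFor are hypothetical names, and the sketch only follows operands, so it does not track values that flow through memory via a Store.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Toy opcodes and instruction type. NOT the LLVM API.
enum class Op { Alloca, Load, Add, ICmp };

struct ToyInst {
    Op op;
    std::string name;                      // label, used here for allocas only
    std::vector<const ToyInst*> operands;  // for a Load, operands[0] is the pointer
};

// Follows operands backwards from inst and appends every alloca reached.
// The visited set prevents reporting or walking the same instruction twice.
void collectAllocas(const ToyInst* inst,
                    std::unordered_set<const ToyInst*>& visited,
                    std::vector<const ToyInst*>& out) {
    if (inst == nullptr || !visited.insert(inst).second) {
        return;
    }
    if (inst->op == Op::Alloca) {
        out.push_back(inst);
        return;
    }
    for (const ToyInst* operand : inst->operands) {
        collectAllocas(operand, visited, out);
    }
}

// Convenience wrapper: all allocas that inst transitively reads from.
std::vector<const ToyInst*> allocasFor(const ToyInst* inst) {
    std::unordered_set<const ToyInst*> visited;
    std::vector<const ToyInst*> out;
    collectAllocas(inst, visited, out);
    return out;
}
```

For a comparison whose operands are an Add of two Loads (of j and c) and a Load of j, allocasFor returns the allocas for j and c, each once. In a real pass the same recursion would be written with getOperand, isa<LoadInst> and getPointerOperand as described in the answer.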

llvm alloca dependencies

I am trying to determine, for certain Load instructions in my pass, their corresponding Alloca instructions (which can be in other, earlier blocks). The chain can be something like: TargetLoad(var) -> other stores/loads that use var (or depend on var) -> alloca(var), spread over several basic blocks. Do you know how I can do it?
I tried to use the methods from DependenceAnalysis and MemoryDependenceAnalysis, but the results were not correct. For instance, MemoryDependenceAnalysis::getDependency should work with the "Def" option, but it works only for stores, not for loads. I also get a segfault when trying to use MemoryDependenceAnalysis::getNonLocalPointerDependency or MemoryDependenceAnalysis::getPointerDependencyFrom. When I check my result using MemDepResult::getDef(), the result for a Load instruction is the same instruction! So it depends on itself, which is weird, since it uses a variable that is defined earlier in the code.
The alternative of intersecting all the variables used by the target load instructions with all the allocated variables is not an option, because there might be something like: alloca(a) ... c=a*b+4 .... load(c).
It also seems that DependenceAnalysis::depends() does not work for my pass. For reference, if(DA.depends(allocaInstrArray[i],loadInstrArray[j],true)) is always false, although it should be true in several cases. I think I am not using it correctly.
However, I assumed that maybe depends() does not work for Alloca, so I checked the dependencies among all Load instructions kept in an array. Some results are not based on the loaded variable as they should be. For example: LOAD %3 = load i32* %c, align 4 IS DEPENDENT ON %1 = load i32* %j, align 4. As you can see, one loads c and the other loads j. In my Test.cpp target code there is no dependence between j and c. Maybe the dependence is not based on the variables/memory locations used?
Thank you for any suggestions!

First, use getOperand(0) or getOperand(1) on the ICMP instructions. If isa<LoadInst> holds for an operand, cast it to LoadInst. getPointerOperand() then returns the Value* for the variable you are looking for.
Second, follow the same procedure between Load instructions and Alloca instructions. getOperand(0) applied to a Load gives the corresponding Alloca instruction.
Finally, link the two results together by checking the dependencies.
