How can I check if two GEP instructions are semantically equal or not? - llvm

I have two GEP instructions looking like below:
%size = getelementptr inbounds %struct.ArrayInfo, %struct.ArrayInfo* %0, i32 0, i32 0
...
%size = getelementptr inbounds %struct.ArrayInfo, %struct.ArrayInfo* %1, i32 0, i32 0
Essentially these two are accessing the same struct field. Is there a way to check if these two instructions are equivalent in llvm? I tried comparing pointers of GEPOperator (GEPOperator*), but it looks like they are different.

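Try Instruction::isSameOperationAs(). If you cast both %size values in your example to llvm::Instruction and call this method on one with the other as the argument, it returns true. Note that it compares the opcode, result type, and operand types rather than the operand values, so two GEPs with different constant indices also compare as the same operation; check the indices separately if that matters to you.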

llvm - Access And Call Function Pointer In A Global Array Without Horrible Pointer Hacking

I am having quite some trouble accessing a function pointer in a global array programmatically. I have a global array of function pointers, my "lookup table", which I am basically using for "overloads". Every time I try to getelementptr (GEP) an element in this array with the desired type, I get a runtime assertion:
warp_compiler: /root/.conan/data/llvm-core/13.0.0/_/_/package /6efbb14f313e71b5e1dbf77c1c011f47614b7c7c/include/llvm/IR/
Instructions.h:960: static llvm::GetElementPtrInst* llvm::GetElementPtrInst::Create(
llvm::Type*, llvm::Value*, llvm::ArrayRef<llvm::Value*>, const llvm::Twine&, llvm::Instruction*):
Assertion `cast<PointerType>(Ptr->getType()->getScalarType()) ->isOpaqueOrPointeeTypeMatches(PointeeType)'
failed.
Aborted (core dumped)
The type of the array when compiled is [3 x i32 (i32)*]. By default it tries to do a GEP on [3 x i32 (i32)*]* with element type [3 x i32 (i32)*], which does not work.
If I manually edit the code to be:
%option_address = getelementptr i32 (i32)*, [3 x i32 (i32)*]* @my_function_1_table, i32 %7
Or:
%option_address = getelementptr i32 (i32)*, [3 x i32 (i32)*] @my_function_1_table, i32 %7
it works just dandy; the latter is really what I am looking to do. But I can't seem to do it programmatically because of this assertion.
I have tried casting the array to i32 (i32)* with:
auto first_element = context->builder.CreatePointerBitCastOrAddrSpaceCast(
(llvm::Value*) lookup_table_global,
(llvm::Type*) function->getType(),
"cast"
);
Then trying to access the elements with something like:
auto element = context->builder.CreateGEP(
(llvm::Type*) function->getType(),
first_element,
index_array,
"option_address"
);
But I get that assertion again, even though it works if I type it manually into the IR:
%option_address = getelementptr i32 (i32)*, i32 (i32)* @my_function_1_table, i32 %7
Seems like a pretty regular way to access an array, right?
But I can't seem to do it programmatically because of the assertion. I even tried to work around it by inheriting from GetElementPtrInst directly and omitting the assertion, but couldn't (because its constructor is private).
Currently, my solution is to cast the array to an i32 (i32)*, then to a [1 x i32 (i32)*]*, and then do the GEP on the [1 x i32 (i32)*]* with element type [1 x i32 (i32)*]:
%option_address = getelementptr [1 x i32 (i32)*], [1 x i32 (i32)*]* bitcast ([3 x i32 (i32)*]* @my_function_1_table to [1 x i32 (i32)*]*), i32 %7
This is horrible.
Does anyone know how I can simply access the function pointers I need from a global (constant) array so they can be called?
Also is my current solution portable?
Thank you!
Sorry you've run into this challenging aspect of LLVM. It definitely causes confusion.
There is an entire webpage dedicated to trying to help folks understand the counter-intuitive design of this instruction. While the design is well motivated from within LLVM, it causes lots of folks confusion and frustration when they first encounter it.
The challenge you're hitting is because a GEP instruction in LLVM always operates on a pointer, and with global variables, that pointer is to the variable. When the global variable is an array as in your case, this is extra confusing -- GEP has to go through an extra layer of pointer before it gets to the array you're trying to index with it.
The first section of the GEP site I mentioned above specifically explains how the first index to a GEP works -- it indexes the base pointer directly.
The second section then specifically clarifies why global variables end up surprising here. The name of the global variable, @my_function_1_table in your case, is a pointer to the array, not the array itself. You'll have to index that pointer with a simple i32 0 index first. Then you can add an additional index into the array that the global variable points to.
So for a global variable with type [3 x i32 (i32)*], if you want to extract the third element of the array (index 2), you need:
%fptr = getelementptr [3 x i32 (i32)*], [3 x i32 (i32)*]* @my_function_1_table, i32 0, i32 2
The first i32 0 here indexes the pointer to the global itself. The second index of i32 2 indexes into the array.
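The role of the two indices can also be mirrored in plain C++. The following is a minimal sketch, not LLVM API code: `table`, `gep_offset`, and `cpp_offset` are hypothetical names, and `gep_offset` only models the byte offset that a two-index GEP over an array type adds to its base pointer.

```cpp
#include <cassert>
#include <cstddef>

using FPtrT = int (*)(int);

// Hypothetical stand-in for the global lookup table.
static FPtrT table[3] = {nullptr, nullptr, nullptr};

// Models the byte offset of `getelementptr [N x T], [N x T]* base, idx0, idx1`:
// idx0 steps over whole arrays, idx1 steps over elements inside one array.
constexpr std::ptrdiff_t gep_offset(std::ptrdiff_t array_len,
                                    std::ptrdiff_t elem_size,
                                    std::ptrdiff_t idx0,
                                    std::ptrdiff_t idx1) {
    return idx0 * array_len * elem_size + idx1 * elem_size;
}

// The equivalent C++ pointer arithmetic, starting from a pointer to the
// whole array (the analogue of [3 x T]* @table) and using first index 0.
inline std::ptrdiff_t cpp_offset(std::ptrdiff_t idx) {
    FPtrT (*base)[3] = &table;
    const char *start = reinterpret_cast<const char *>(base);
    const char *elem = reinterpret_cast<const char *>(&(*base)[idx]);
    return elem - start;
}
```

With a first index of 0, `gep_offset(3, sizeof(FPtrT), 0, 2)` matches `cpp_offset(2)`; a first index of 1 would instead skip past the entire three-element array.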
You can also use Clang to get example LLVM IR that can help explain how to do things. For example, here is some C++ that does something similar to what you're trying to do:
using FPtrT = int (*)(int);
extern FPtrT function_ptrs[3];
int test(int i) {
FPtrT fptr = function_ptrs[i];
return (*fptr)(42);
}
And this turns into the following LLVM IR after some basic optimizations (-O1):
@function_ptrs = external dso_local local_unnamed_addr global [3 x i32 (i32)*], align 16
define dso_local i32 @_Z4testi(i32 %0) local_unnamed_addr #0 {
%2 = sext i32 %0 to i64
%3 = getelementptr inbounds [3 x i32 (i32)*], [3 x i32 (i32)*]* @function_ptrs, i64 0, i64 %2
%4 = load i32 (i32)*, i32 (i32)** %3, align 8, !tbaa !4
%5 = call i32 %4(i32 42)
ret i32 %5
}
attributes #0 = { mustprogress uwtable "frame-pointer"="none" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
Here you can see %3 is doing a dynamic (and inbounds, but that's orthogonal) version of this indexing.
You can play with this kind of IR generation using Compiler Explorer: https://cpp.compiler-explorer.com/z/ETa8nvTvh
Once you're using the API to create this two index GEP it should start working for you.
Also, just so you (or others) reading this don't get confused: the LLVM IR syntax changed here recently, so the latest versions of LLVM don't look quite the same. You can switch from Clang v13 to a more recent one to see what it looks like, for example here: https://cpp.compiler-explorer.com/z/Kc4er413G

Break constant GEPs

I need to break constant GEPs. I found an old BreakConstantGEPs pass and I am trying to use it with a newer LLVM version. For the code:
int tab[1]={1};
void fun()
{
int val=tab[0];
}
without running the pass, I get the following .ll file:
@tab = dso_local global [1 x i32] [i32 1], align 4
; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @mult() #0 {
%1 = alloca i32, align 4
%2 = load i32, i32* getelementptr inbounds ([1 x i32], [1 x i32]* @tab, i64 0, i64 0), align 4
store i32 %2, i32* %1, align 4
ret void
}
Then I run the pass. The GEP is properly recognized. Here are the most important parts of the pass (the full code is in the link above).
Creating new GEP instruction:
GetElementPtrInst::CreateInBounds(CE->getOperand(0), Indices, CE->getName(), InsertPt) //in original code GetElementPtrInst::Create() is used
Replacing:
I->replaceUsesOfWith (CE, NewInst);
I->removeFromParent();
Unfortunately, the module verifier outputs errors:
Instruction referencing instruction not embedded in a basic block!
%2 = getelementptr inbounds [1 x i32], [1 x i32]* @tab, i64 0, i64 0
<badref> = load i32, i32* %2, align 4
Instruction does not dominate all uses!
<badref> = load i32, i32* %2, align 4
store i32 <badref>, i32* %1, align 4
in function fun
LLVM ERROR: Broken function found, compilation aborted!
What am I doing wrong?
Remove I->removeFromParent(); and it should work.
LLVM uses linked lists to represent most of its data: a Module holds global values (global variables, aliases and functions) in a linked list, a Function holds BasicBlocks in a linked list, and a BasicBlock holds Instructions in a linked list. In terms of memory ownership, an Instruction is owned by its BasicBlock and will be deleted when the BasicBlock is deleted. If you delete a Function, it deletes all the BasicBlocks it owns, which deletes all their Instructions, and so on. Linked lists are used instead of vectors to make moving instructions cheap: you can hoist an instruction by detaching it from its parent basic block and inserting it elsewhere.
When you called GetElementPtrInst::CreateInBounds([...], InsertPt), you created a new getelementptr instruction and it was inserted into your code at the InsertPt insertion point (any existing instruction will do as an insertion point). Perfect.
Then you called I->removeFromParent(), which removes the instruction from its basic block. It still exists as a C++ object, but it has no parent basic block: it does not belong to any block, function, or module, and it never runs. That is what the verifier is complaining about. Why did you do that? You probably didn't mean to. Or maybe you wanted to insert it somewhere other than InsertPt?
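The ownership and hoisting behaviour described above can be illustrated with a toy model built on std::list. This is only a sketch under that assumption: `Block`, `insert_before`, and `hoist` are hypothetical names, and LLVM's real instruction lists are intrusive lists with their own API rather than std::list.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>

// Toy stand-in for a basic block: an ordered list that owns its instructions.
using Block = std::list<std::string>;

// Insert a new instruction immediately before `pos`, similar to how an
// insert-before argument places a freshly created instruction.
inline Block::iterator insert_before(Block &block, Block::iterator pos,
                                     const std::string &inst) {
    return block.insert(pos, inst);
}

// Hoist: detach the node at `it` from `from` and relink it before `pos` in `to`.
// splice relinks the existing node without copying it, which is the cheap
// move that a linked list makes possible.
inline void hoist(Block &from, Block::iterator it, Block &to,
                  Block::iterator pos) {
    to.splice(pos, from, it);
}
```

In this model, an instruction that has been detached but never relinked belongs to no block at all, which corresponds to the state the verifier rejects.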

C++/LLVM: Runtime code generation and STL container

Assume a simple partial evaluation scenario:
#include <vector>
/* may be known at runtime */
int someConstant();
/* can be partially evaluated */
double foo(std::vector<double> args) {
return args[someConstant()] * someConstant();
}
Let's say that someConstant() is known and does not change at runtime (e.g. given by the user once) and can be replaced by the corresponding int literal. If foo is part of the hot path, I expect a significant performance improvement:
/* partially evaluated, someConstant() == 2 */
double foo(std::vector<double> args) {
return args[2] * 2;
}
My current take on that problem would be to generate LLVM IR at runtime, because I know the structure of the partially evaluated code (so I would not need a general purpose partial evaluator).
So I want to write a function foo_ir that generates IR code that does the same thing as foo, but not calling someConstant(), because it is known at runtime.
Simple enough, isn't it? Yet, when I look at the generated IR for the code above:
; Function Attrs: uwtable
define double @_Z3fooSt6vectorIdSaIdEE(%"class.std::vector"* %args) #0 {
%1 = call i32 @_Z12someConstantv()
%2 = sext i32 %1 to i64
%3 = call double* @_ZNSt6vectorIdSaIdEEixEm(%"class.std::vector"* %args, i64 %2)
%4 = load double* %3
%5 = call i32 @_Z12someConstantv()
%6 = sitofp i32 %5 to double
%7 = fmul double %4, %6
ret double %7
}
; Function Attrs: nounwind uwtable
define linkonce_odr double* @_ZNSt6vectorIdSaIdEEixEm(%"class.std::vector"* %this, i64 %__n) #1 align 2 {
%1 = alloca %"class.std::vector"*, align 8
%2 = alloca i64, align 8
store %"class.std::vector"* %this, %"class.std::vector"** %1, align 8
store i64 %__n, i64* %2, align 8
%3 = load %"class.std::vector"** %1
%4 = bitcast %"class.std::vector"* %3 to %"struct.std::_Vector_base"*
%5 = getelementptr inbounds %"struct.std::_Vector_base"* %4, i32 0, i32 0
%6 = getelementptr inbounds %"struct.std::_Vector_base<double, std::allocator<double> >::_Vector_impl"* %5, i32 0, i32 0
%7 = load double** %6, align 8
%8 = load i64* %2, align 8
%9 = getelementptr inbounds double* %7, i64 %8
ret double* %9
}
I see that the [] was included from the STL definition (function @_ZNSt6vectorIdSaIdEEixEm), fair enough. The problem is that it could just as well be some other member function, or even a direct data access. I simply cannot assume the data layout to be the same everywhere, so at development time I do not know the concrete std::vector layout of the host machine.
Is there some way to use C++ metaprogramming to get the required information at compile time? i.e. is there some way to ask llvm to provide IR for std::vector's [] method?
As a bonus: I would prefer to not enforce the compilation of the library with clang, instead, LLVM shall be a runtime-dependency, so just invoking clang at compile time (even if I do not know how to do this) is a second-best solution.
Answering my own question:
While I still have no solution for the general case (e.g. std::map), there exists a simple solution for std::vector:
According to the C++ standard, the following holds for the member function data()
Returns a direct pointer to the memory array used internally by the
vector to store its owned elements.
Because elements in the vector are guaranteed to be stored in
contiguous storage locations in the same order as represented by the
vector, the pointer retrieved can be offset to access any element in
the array.
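So in fact, the way std::vector stores its elements is fixed by the standard (contiguous and reachable through data()), even though the layout of the vector object itself is up to the implementation. Generated code can therefore take the pointer returned by data() and index it directly.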
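A minimal sketch of what this buys you, written in plain C++ rather than generated IR: the specialized function only receives the element pointer, so it never touches the vector's internal members. The names `foo_specialized` and `is_contiguous` are hypothetical, and the constant 2 is the example value from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Specialized version of foo for someConstant() == 2.
// It only sees a raw element pointer, so it does not depend on how
// std::vector lays out its own members.
inline double foo_specialized(const double *data) {
    return data[2] * 2;
}

// Checks the contiguity guarantee: element i lives at data() + i.
inline bool is_contiguous(const std::vector<double> &v) {
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (&v[i] != v.data() + i) return false;
    }
    return true;
}
```

The host program would call `args.data()` in ordinary compiled C++ and pass the resulting `double*` to the runtime-generated function, so the generated IR needs only a pointer offset and a load.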

Is LLVM GEP safe if I don't know the size of an array?

I am trying to use GEP to get a pointer of i32 from an array.
But the problem is: I don't know the size of the array.
The IR document on llvm.org said GEP just adds the offsets to the base address with silently-wrapping two’s complement arithmetic.
So, I want to ask for some advice.
Is it safe like this:
%v1 = alloca i32
store i32 5, i32* %v1
%6 = load i32* %v1
%7 = bitcast i32* %v0 to [1 x i32]*
%8 = getelementptr [1 x i32]* %7, i32 0, i32 %6
%9 = load i32* %8
store i32 %9, i32* %v0
Type of %v0 is i32*, and I know %v0 is pointing to an array in mem, but the size is 9, not 1.
Then I "GEP" from %7, which I treat as a [1 x i32]* rather than a [9 x i32]*, but the "offset" is 5 (%6).
So, is there any problem? Not safe, or just not good but basically OK?
First of all, the entire code you wrote is equivalent to:
%x = getelementptr i32* %v0, i32 5
%y = load i32* %x
store i32 %y, i32* %v0
There's no reason to bitcast the pointer to [1 x i32]*, just use it as-is.
Regarding your question - using a gep to get the pointer is always safe (in the sense that it's well-defined and will never crash), however there's nothing stopping it from evaluating to a pointer beyond the bounds of the array; and in such a case, accessing the memory (as you do in the subsequent load instruction) is undefined.
Also, this link might be of interest: http://llvm.org/docs/GetElementPtr.html#what-happens-if-an-array-index-is-out-of-bounds
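The wrapping address computation mentioned in the question can be sketched as ordinary unsigned arithmetic. This is an illustrative model only: `gep_address` is a hypothetical helper, and it assumes 64-bit pointers and a GEP without the inbounds keyword.

```cpp
#include <cassert>
#include <cstdint>

// Models a single-index GEP without `inbounds`, assuming 64-bit pointers:
// address = base + index * elem_size, computed modulo 2^64.
constexpr std::uint64_t gep_address(std::uint64_t base, std::int64_t index,
                                    std::uint64_t elem_size) {
    // Unsigned arithmetic wraps silently, matching two's complement behaviour.
    return base + static_cast<std::uint64_t>(index) * elem_size;
}
```

Computing such an address never faults by itself, whatever the index; only a later load or store through a pointer outside the underlying allocation is the problem.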

How much space for a LLVM trampoline

I'm trying to figure out how to use the trampoline intrinsics in LLVM. The documentation makes mention of some amount of storage that's needed to store the trampoline in, which is platform dependent. My question is, how do I figure out how much is needed?
I found this example, that picks 32 bytes for apparently no reason. How does one choose a good value?
declare void @llvm.init.trampoline(i8*, i8*, i8*);
declare i8* @llvm.adjust.trampoline(i8*);
define i32 @foo(i32* nest %ptr, i32 %val)
{
%x = load i32* %ptr
%sum = add i32 %x, %val
ret i32 %sum
}
define i32 @main(i32, i8**)
{
%closure = alloca i32
store i32 13, i32* %closure
%closure_ptr = bitcast i32* %closure to i8*
%tramp_buf = alloca [32 x i8], align 4
%tramp_ptr = getelementptr [32 x i8]* %tramp_buf, i32 0, i32 0
call void @llvm.init.trampoline(
i8* %tramp_ptr,
i8* bitcast (i32 (i32*, i32)* @foo to i8*),
i8* %closure_ptr)
%ptr = call i8* @llvm.adjust.trampoline(i8* %tramp_ptr)
%fp = bitcast i8* %ptr to i32(i32)*
%val2 = call i32 %fp (i32 13)
; %val = call i32 @foo(i32* %closure, i32 42);
ret i32 %val2
}
Yes, trampolines are used to generate some code "on the fly". It's unclear why you need these intrinsics at all, because they are used to implement GCC's nested functions extension (in particular, when the address of the nested function is captured and the function accesses variables of the enclosing function).
The best way to figure out the necessary size and alignment of the trampoline buffer is to grep the gcc sources for "TRAMPOLINE_SIZE" and "TRAMPOLINE_ALIGNMENT".
As far as I can see, at the time of this writing, a buffer of 72 bytes with an alignment of 16 bytes will be enough for all the platforms GCC / LLVM support.