How to use CreateInBoundsGEP in cpp api of llvm to access the element of an array? - c++

I am new to llvm programming, and I am trying to write cpp to generate llvm ir for a simple C code like this:
int a[10];
a[0] = 1;
I want to generate something like this to store 1 into a[0]
%3 = getelementptr inbounds [10 x i32], [10 x i32]* %2, i64 0, i64 0
store i32 1, i32* %3, align 16
And I tried CreateGEP: auto arrayPtr = builder.CreateInBoundsGEP(var, num); where var and
num are both of type llvm::Value*
but I only get
%1 = getelementptr inbounds [10 x i32], [10 x i32]* %0, i32 0
store i32 1, [10 x i32]* %1
I searched google for a long time and looked the llvm manual but still don't know what Cpp api to use and how to use it.
Really appreciate it if you can help!

Note that the 2nd argument to IRBuilder::CreateInBoundsGEP (1st overload) is actually ArrayRef<Value *>, which means it accepts an array of Value * values (including C-style array, std::vector<Value *> and std::array<Value *, LEN> and others).
To generate a GEP instruction with multiple (child) addresses, pass an array of Value * to the second argument:
Value *i32zero = ConstantInt::get(contexet, APInt(32, 0));
Value *indices[2] = {i32zero, i32zero};
builder.CreateInBoundsGEP(var, ArrayRef<Value *>(indices, 2));
Which will yield
%1 = getelementptr inbounds [10 x i32], [10 x i32]* %0, i32 0, i32 0
You can correctly identify that %1 is of type i32*, pointing to the first item in the array pointed to by %0.
LLVM documentation on GEP instruction: https://llvm.org/docs/GetElementPtr.html

Related

LLVM: passing alloca as a function argument

I want an LLVM IR code equivalent to
double x = 4.93;
printf("hello world: %f", x);
In my LLVM IR code, I got from godbolt and after neating the code, it will become something like
%x = alloca double, align 8
store double 4.930000e+00, double* %x, align 8
%1 = load double, double* %x, align 8
%2 = call i32 (i8*, ...) #printf(i8* getelementptr inbounds ([20 x i8], [20 x i8]* #.str, i32 0, i32 0), double %1)
No error. No problem so far.
But to me, %1 look redundant. However, it looks like I cannot remove this medium. The following code
%x = alloca double, align 8
store double 4.930000e+00, double* %x, align 8
%2 = call i32 (i8*, ...) #printf(i8* getelementptr inbounds ([20 x i8], [20 x i8]* #.str, i32 0, i32 0), double %x)
leads to error
llvm-as-9: test.ll:26:113: error: '%x' defined with type 'double*' but expected 'double'
%2 = call i32 (i8*, ...) #printf(i8* getelementptr inbounds ([20 x i8], [20 x i8]* #.str, i32 0, i32 0), double %x)
^
Now I am wondering if what I'm doing to remove %1 is really possible?
The alloca instruction returns a pointer, so in your code %x is of double* type. You have to load from a double* to get a double.
In your case it should be possible to create a double constant with 4.930000e+00 value and use this value as printf argument directly.
The same effect can be achieved by running optimization passes on your first snippet.

LLVM: How to create char array reference

I am trying to implement an opt pass that will compress string constants in ROM and then, at the cost of CPU + RAM, re-materialize the values at runtime. Before implementing compression, I just want to place all strings in a table, and do a lookup.
Example:
printf("Hello");
Would become the equivalent of
char placeholder[6];
int strID = 0;
tableLookup(placeholder, 0 /*ID*/); // Fill array
printf(placeholder);
The LLVM IR I was able to generate looks like the following:
%fakeString = alloca [10 x i8], align 1
call void #llvm.dbg.declare(metadata [10 x i8]* %fakeString, metadata !60, metadata !25), !dbg !64
%arraydecay = getelementptr inbounds [10 x i8], [10 x i8]* %fakeString, i32 0, i32 0, !dbg !65
call void #tableLookup(i8* %arraydecay, i32 0) #2, !dbg !66
How would I be able to create this programatically? The two main pieces I am missing:
1. How to create the array reference (after creating the alloca instruction)
2. How to get result from tableLookup and replace the operand in the old printf()
Any help would be much appreciated!

Get pointer to first element of array in LLVM-IR

I want to implement a string type in LLVM-IR, and my plan is as follows:
When a string variable is declared, allocate memory for a i8*.
When the variable is initialized, store the string somewhere, store the pointer to the first element at the formerly allocated address, and save the length of the string in a member variable.
The problem is, that I can't get the pointer to the first element.
Using the IRBuilder (C++ API), I have yet created the following code:
%str2 = alloca i8*
%1 = alloca [4 x i8]
store [4 x i8] c"foo\00", [4 x i8]* %1
%2 = getelementptr [4 x i8], [4 x i8]* %1, i32 0
store [4 x i8]* %2, i8** %str2
But calling llc on this gives the following error:
error: stored value and pointer type do not match
store [4 x i8]* %2, i8** %str2
^
When using clang to emit llvm-ir on the same (char* instead of string) code, it emits the following:
%str2 = alloca i8*, align 8
store i8* %str, i8** %1, align 8
store i8* getelementptr inbounds ([4 x i8], [4 x i8]* #.str, i32 0, i32 0), i8** %str2, align 8
Which looks very similar imho, but is not the same.
Clang makes the string a global constant (which is hopefully not necessary), uses the offset parameter of getelementptr and gives the type i8* to the store instruction.
Unfortunately I wasn't able to find any API Methods to explicitly give a type to the store instruction, or use the offset parameter (which wouldn't even help, I guess).
So finally my question is: How can I properly get a pointer to the first array element and store it?
Thanks in advance
As in the link in the comment, the solution is to bitcast the [n x i8]* to an i8*
%str2 = alloca i8*
%1 = alloca [4 x i8]
store [4 x i8] c"foo\00", [4 x i8]* %1
%2 = bitcast [4 x i8]* %1 to i8*
store i8* %2, i8** %str2

Is LLVM GEP safe if not know the size of an array?

I am trying to use GEP to get a pointer of i32 from an array.
But the problem is: I don't know the size of the array.
The IR document on llvm.org said GEP just adds the offsets to the base address with silently-wrapping two’s complement arithmetic.
So, I want to ask for some advice.
Is it safe like this:
%v1 = alloca i32
store i32 5, i32* %v1
%6 = load i32* %v1
%7 = bitcast i32* %v0 to [1 x i32]*
%8 = getelementptr [1 x i32]* %7, i32 0, i32 %6
%9 = load i32* %8
store i32 %9, i32* %v0
Type of %v0 is i32*, and I know %v0 is pointing to an array in mem, but the size is 9, not 1.
Then I "GEP" from %7 which I treat it as a [1 x i32], not [9 x i32] , but the "offset" is 5(%6).
So, is there any problem? Not safe, or just not good but basically OK?
First of all, the entire code you wrote is equivalent to:
%x = getelementptr i32* %v0, i32 5
%y = load i32* %x
store i32* %y, %v0
There's no reason to bitcast the pointer to [1 x i32]*, just use it as-is.
Regarding your question - using a gep to get the pointer is always safe (in the sense that it's well-defined and will never crash), however there's nothing stopping it from evaluating to a pointer beyond the bounds of the array; and in such a case, accessing the memory (as you do in the subsequent load instruction) is undefined.
Also, this link might be of interest: http://llvm.org/docs/GetElementPtr.html#what-happens-if-an-array-index-is-out-of-bounds

Passing an array to an external function

I am new to LLVM, and I am learning how to use LLVM for profiling. I need to pass an array to an external method, and insert a call instruction to the method in the code. I am currently using the following code, which on execution gives a segmentation fault.
std::vector<Value*> Args(1);
//Vector with array values
SmallVector<Constant*, 2> counts;
counts.push_back(ConstantInt::get(Type::getInt32Ty(BB->getContext()),32, false));
counts.push_back(ConstantInt::get(Type::getInt32Ty(BB->getContext()),12, false));
//Array with 2 integers
Args[0]= ConstantArray::get(llvm::ArrayType::get(llvm::Type::getInt32Ty(BI->getContext()),2), counts);
Here, the external function 'hook' is defined as M.getOrInsertFunction("hook", Type::getVoidTy(M.getContext()),
llvm::ArrayType::get(llvm::Type::getInt32Ty(BI->getContext()),2)
(Type*)0);
After reading a few source files, I've tried using GetElementPtrInst to pass the array
std::vector<Value*> ids(1);
ids.push_back(ConstantInt::get(Type::getInt32Ty(BB->getContext()),0));
Constant* array = ConstantArray::get(llvm::ArrayType::get(llvm::Type::getInt32Ty(BI->getContext()),2), counts);
Args[0] = ConstantExpr::getGetElementPtr(&(*array), ids, false);
but it fails with
7 opt 0x00000000006c59f5 bool llvm::isa<llvm::Constant, llvm::Value*>(llvm::Value* const&) + 24
8 opt 0x00000000006c5a0f llvm::cast_retty<llvm::Constant, llvm::Value*>::ret_type llvm::cast<llvm::Constant, llvm::Value*>(llvm::Value* const&) + 24
9 opt 0x0000000000b2b22f
10 opt 0x0000000000b2a4fe llvm::ConstantFoldGetElementPtr(llvm::Constant*, bool, llvm::ArrayRef<llvm::Value*>) + 55
11 opt 0x0000000000b33df2 llvm::ConstantExpr::getGetElementPtr(llvm::Constant*, llvm::ArrayRef<llvm::Value*>, bool) + 82
Also, in this case, 'hook' is defined as M.getOrInsertFunction("hook", Type::getVoidTy(M.getContext()),
PointerType::get(Type::getInt32PtrTy(M.getContext()),0), //when using GEP
(Type*)0);
Could someone kindly keep me a few pointers on passing arrays to an external function (say with the signature void hook(int abc[]) ). I am probably wrong all the way through, and would really appreciate some help.
A good place to start with "how do I do this c-like thing in LLVM IR" questions is to first write what you want to do in C, then compile it to LLVM IR via Clang and take a look at the result.
In your particular instance, the file:
void f(int a[2]);
void g() {
int x[2];
x[0] = 1;
x[1] = 3;
f(x);
}
Will compile to:
define void #g() nounwind {
%x = alloca [2 x i32], align 4
%1 = getelementptr inbounds [2 x i32]* %x, i32 0, i32 0
store i32 1, i32* %1, align 4
%2 = getelementptr inbounds [2 x i32]* %x, i32 0, i32 1
store i32 3, i32* %2, align 4
%3 = getelementptr inbounds [2 x i32]* %x, i32 0, i32 0
call void #f(i32* %3)
ret void
}
declare void #f(i32*)
So we can see the clang compiled g to receive i32*, not an array. That means you need a way to get an address to the first element of the array from the array itself, and a getelementptr instruction is a straightforward way of doing that.
Notice, however, that you want to generate a GEP (getelementptr instruction), for example via GetElementPtrInst::create. A gep constant expression, which is what you're trying to generate here, is something else, and will only work on compile-time constants.
You should use Clang to compile it. Then, check the boundaries of the array and if all the elements are defined.