Get pointer to first element of array in LLVM-IR - c++

I want to implement a string type in LLVM-IR, and my plan is as follows:
When a string variable is declared, allocate memory for a i8*.
When the variable is initialized, store the string somewhere, store the pointer to the first element at the formerly allocated address, and save the length of the string in a member variable.
The problem is, that I can't get the pointer to the first element.
Using the IRBuilder (C++ API), I have yet created the following code:
%str2 = alloca i8*
%1 = alloca [4 x i8]
store [4 x i8] c"foo\00", [4 x i8]* %1
%2 = getelementptr [4 x i8], [4 x i8]* %1, i32 0
store [4 x i8]* %2, i8** %str2
But calling llc on this gives the following error:
error: stored value and pointer type do not match
store [4 x i8]* %2, i8** %str2
^
When using clang to emit llvm-ir on the same (char* instead of string) code, it emits the following:
%str2 = alloca i8*, align 8
store i8* %str, i8** %1, align 8
store i8* getelementptr inbounds ([4 x i8], [4 x i8]* #.str, i32 0, i32 0), i8** %str2, align 8
Which looks very similar imho, but is not the same.
Clang makes the string a global constant (which is hopefully not necessary), uses the offset parameter of getelementptr and gives the type i8* to the store instruction.
Unfortunately I wasn't able to find any API Methods to explicitly give a type to the store instruction, or use the offset parameter (which wouldn't even help, I guess).
So finally my question is: How can I properly get a pointer to the first array element and store it?
Thanks in advance

As in the link in the comment, the solution is to bitcast the [n x i8]* to an i8*
%str2 = alloca i8*
%1 = alloca [4 x i8]
store [4 x i8] c"foo\00", [4 x i8]* %1
%2 = bitcast [4 x i8]* %1 to i8*
store i8* %2, i8** %str2

Related

LLVM: passing alloca as a function argument

I want an LLVM IR code equivalent to
double x = 4.93;
printf("hello world: %f", x);
In my LLVM IR code, I got from godbolt and after neating the code, it will become something like
%x = alloca double, align 8
store double 4.930000e+00, double* %x, align 8
%1 = load double, double* %x, align 8
%2 = call i32 (i8*, ...) #printf(i8* getelementptr inbounds ([20 x i8], [20 x i8]* #.str, i32 0, i32 0), double %1)
No error. No problem so far.
But to me, %1 look redundant. However, it looks like I cannot remove this medium. The following code
%x = alloca double, align 8
store double 4.930000e+00, double* %x, align 8
%2 = call i32 (i8*, ...) #printf(i8* getelementptr inbounds ([20 x i8], [20 x i8]* #.str, i32 0, i32 0), double %x)
leads to error
llvm-as-9: test.ll:26:113: error: '%x' defined with type 'double*' but expected 'double'
%2 = call i32 (i8*, ...) #printf(i8* getelementptr inbounds ([20 x i8], [20 x i8]* #.str, i32 0, i32 0), double %x)
^
Now I am wondering if what I'm doing to remove %1 is really possible?
The alloca instruction returns a pointer, so in your code %x is of double* type. You have to load from a double* to get a double.
In your case it should be possible to create a double constant with 4.930000e+00 value and use this value as printf argument directly.
The same effect can be achieved by running optimization passes on your first snippet.

How to use CreateInBoundsGEP in cpp api of llvm to access the element of an array?

I am new to llvm programming, and I am trying to write cpp to generate llvm ir for a simple C code like this:
int a[10];
a[0] = 1;
I want to generate something like this to store 1 into a[0]
%3 = getelementptr inbounds [10 x i32], [10 x i32]* %2, i64 0, i64 0
store i32 1, i32* %3, align 16
And I tried CreateGEP: auto arrayPtr = builder.CreateInBoundsGEP(var, num); where var and
num are both of type llvm::Value*
but I only get
%1 = getelementptr inbounds [10 x i32], [10 x i32]* %0, i32 0
store i32 1, [10 x i32]* %1
I searched google for a long time and looked the llvm manual but still don't know what Cpp api to use and how to use it.
Really appreciate it if you can help!
Note that the 2nd argument to IRBuilder::CreateInBoundsGEP (1st overload) is actually ArrayRef<Value *>, which means it accepts an array of Value * values (including C-style array, std::vector<Value *> and std::array<Value *, LEN> and others).
To generate a GEP instruction with multiple (child) addresses, pass an array of Value * to the second argument:
Value *i32zero = ConstantInt::get(contexet, APInt(32, 0));
Value *indices[2] = {i32zero, i32zero};
builder.CreateInBoundsGEP(var, ArrayRef<Value *>(indices, 2));
Which will yield
%1 = getelementptr inbounds [10 x i32], [10 x i32]* %0, i32 0, i32 0
You can correctly identify that %1 is of type i32*, pointing to the first item in the array pointed to by %0.
LLVM documentation on GEP instruction: https://llvm.org/docs/GetElementPtr.html

LLVM IR: initialize and cast [20 x i8]

I am trying to initialize and then cast a number of LLVM IR variables in the following way:
store i64 %content, i64* %5
%tt2 = load i64, i64* %5
%ttt2 = trunc i64 %tt2 to i32
While this seems trivial and works fine, I am trapped to do the same thing for a [20 * i8] typed variable. Something like:
store [20 x i8] %content, [20 x i8]* %5
%tt2 = load [20 x i8], [20 x i8]* %5
%ttt2 = trunc [20 x i8] %tt2 to i32
Currently I got the following error msg for the third line:
invalid cast opcode for cast from [20 x i8] to i32
Could anyone shed some lights on this issue? Thanks!
You can trunc from one int to another, but not from an array to an int. That's just how trunc is defined — if the input isn't an int, then trunc would need to do something markedly different from "drop the higher-order bits and preserve the lower-order bits".
I think the most common approach is to cast the pointer and then load/store from a pointer that already matches the type you want to load/store.
(Note that %ttt2 etc. aren't LLVM variables, they're LLVM values. They don't vary, ever.)

LLVM: How to create char array reference

I am trying to implement an opt pass that will compress string constants in ROM and then, at the cost of CPU + RAM, re-materialize the values at runtime. Before implementing compression, I just want to place all strings in a table, and do a lookup.
Example:
printf("Hello");
Would become the equivalent of
char placeholder[6];
int strID = 0;
tableLookup(placeholder, 0 /*ID*/); // Fill array
printf(placeholder);
The LLVM IR I was able to generate looks like the following:
%fakeString = alloca [10 x i8], align 1
call void #llvm.dbg.declare(metadata [10 x i8]* %fakeString, metadata !60, metadata !25), !dbg !64
%arraydecay = getelementptr inbounds [10 x i8], [10 x i8]* %fakeString, i32 0, i32 0, !dbg !65
call void #tableLookup(i8* %arraydecay, i32 0) #2, !dbg !66
How would I be able to create this programatically? The two main pieces I am missing:
1. How to create the array reference (after creating the alloca instruction)
2. How to get result from tableLookup and replace the operand in the old printf()
Any help would be much appreciated!

Is LLVM GEP safe if not know the size of an array?

I am trying to use GEP to get a pointer of i32 from an array.
But the problem is: I don't know the size of the array.
The IR document on llvm.org said GEP just adds the offsets to the base address with silently-wrapping two’s complement arithmetic.
So, I want to ask for some advice.
Is it safe like this:
%v1 = alloca i32
store i32 5, i32* %v1
%6 = load i32* %v1
%7 = bitcast i32* %v0 to [1 x i32]*
%8 = getelementptr [1 x i32]* %7, i32 0, i32 %6
%9 = load i32* %8
store i32 %9, i32* %v0
Type of %v0 is i32*, and I know %v0 is pointing to an array in mem, but the size is 9, not 1.
Then I "GEP" from %7 which I treat it as a [1 x i32], not [9 x i32] , but the "offset" is 5(%6).
So, is there any problem? Not safe, or just not good but basically OK?
First of all, the entire code you wrote is equivalent to:
%x = getelementptr i32* %v0, i32 5
%y = load i32* %x
store i32* %y, %v0
There's no reason to bitcast the pointer to [1 x i32]*, just use it as-is.
Regarding your question - using a gep to get the pointer is always safe (in the sense that it's well-defined and will never crash), however there's nothing stopping it from evaluating to a pointer beyond the bounds of the array; and in such a case, accessing the memory (as you do in the subsequent load instruction) is undefined.
Also, this link might be of interest: http://llvm.org/docs/GetElementPtr.html#what-happens-if-an-array-index-is-out-of-bounds