bit layout of vector after bitcasting in llvm - c++

%0 = bitcast i16 %arg1 to <2 x i8>
%2 = extractelement <2 x i8> %0, i32 1
%arg1 in memory:
00000000 11111111
|--8bit--||--8bit--|
After bitcasting, %0 is a pointer to vector.
So is %0 also the address of the first element of the vector?
And what is %2 exactly? Is it the second element of vector(11111111) or,
00000000?

After bitcasting, %0 is a Value of type <2 x i8>. It is not "a pointer". The vector may very well be stored in a register when code generation to machine code happens.
%2 is i8, because extractelement is defined as:
<result> = extractelement <n x <ty>> <val>, i32 <idx> ; yields <ty>
The vector has two elements, each of type i8. %2 is a Value that holds the second element of the vector.
Note that how the vector is laid out in memory or registers is target dependent. LLVM IR level doesn't care about that. It sees the vector as an abstract container of two values.

(I am posting this even though the question is fairly old, after reading the guidance on when it is fine to answer an old question, because this reply offers another way to verify the answer and points to a recent documentation update.)
The value of %2 depends on the target and its endianness (as Eli Bendersky mentioned, the vector's memory layout is target dependent); https://gcc.godbolt.org/z/vfcGhb4EK is a good way to visualize that.
In the language reference, these two sections helped me reason about the IR above:
1. the semantics of bitcast
2. the memory layout of vector types
Both 1 and 2 were updated in https://reviews.llvm.org/D94964 (credit to the authors and reviewers).

bitcast <type1> <value> to <type2> converts the value, %arg1 in your case, from type1 to type2 without changing any bits, provided that both types have the same number of bits.
%0 = bitcast i16 %arg1 to <2 x i8>
This means that %0 is now a vector of two 8-bit integers instead of a single 16-bit integer. Judging by the linked documentation, it is only a value, not a pointer.
extractelement <n x <type>> <value>, i32 <index> extracts the element at the given 32-bit integer index from the n-element vector and yields it as a value of the element type.
%2 = extractelement <2 x i8> %0, i32 1
This means that %2 is now an 8-bit integer with the value of element 1 (the second/last 8-bit element). Assuming little endian hardware, I would expect the value of %2 to be 0.
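To make the endianness dependence concrete, here is a minimal sketch in plain C++ (no LLVM required) that models the rule from the language reference as I understand it: on little-endian targets element 0 sits in the least significant bits, on big-endian targets in the most significant bits. The helper names extract_le and extract_be are made up for illustration, and reading the question's diagram as the i16 value 0x00FF is an assumption.

```cpp
#include <cstdint>

// Models: extractelement <2 x i8> (bitcast i16 v to <2 x i8>), i32 idx
// idx must be 0 or 1. These helpers are illustrative, not LLVM API.

// Little-endian target: element 0 is the least significant byte.
constexpr std::uint8_t extract_le(std::uint16_t v, unsigned idx) {
    return static_cast<std::uint8_t>(v >> (8 * idx));
}

// Big-endian target: element 0 is the most significant byte.
constexpr std::uint8_t extract_be(std::uint16_t v, unsigned idx) {
    return static_cast<std::uint8_t>(v >> (8 * (1 - idx)));
}

// Assumed input: the i16 value 0x00FF.
static_assert(extract_le(0x00FF, 0) == 0xFF, "LE: element 0 is the low byte");
static_assert(extract_le(0x00FF, 1) == 0x00, "LE: element 1 is the high byte");
static_assert(extract_be(0x00FF, 0) == 0x00, "BE: element 0 is the high byte");
static_assert(extract_be(0x00FF, 1) == 0xFF, "BE: element 1 is the low byte");
```

Under that assumption, %2 is 0 on a little-endian target and 255 on a big-endian one, which matches the expectation stated above for little-endian hardware.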

Related

Is LLVM bitcast from vector of bool (i1) to i8, i16, etc. well defined?

In LLVM, can a value of type <8 x i1> be bitcasted to an i8? If so what is the expected bit order? The LLVM documentation on bitcast is not explicit on this. The claim it makes is
The bitcast instruction converts value to type ty2. It is always a no-op cast because no bits change with this conversion. The conversion is done as if the value had been stored to memory and read back as type ty2.
Tangentially, on the mailing list, it has been clarified that no-op cast does not mean what it sounds like. Back to the issue at hand, the problem I see with bitcasting <8 x i1> to any other type (not just i8) is that a value of type <8 x i1> cannot be stored to memory. I have confirmed this experimentally (code not included), and it is also well-documented on the mailing list. Since storing values of type <8 x i1> leads to undefined behavior, the specification "as if the value had been stored to memory and read back as type ty2" implies that any bitcast to or from <8 x i1> results in undefined behavior. Note that a very similar question has been asked before, but the answers to this question do not provide a satisfactory answer to the general soundness issue presented here. The author of the aforementioned issue resolved the issue by bitcasting <8 x i1> to <1 x i8>, but this cast involves an argument of type <8 x i1>, so I am not convinced that it is sound.
For what it's worth, in some of my own small tests with LLVM, I have confirmed that bitcasting <8 x i1> to i8 works. Below is a function that tests 8 i16s at a time for whether or not they are each equal to 42.
; Filename is equality-8x16.ll
define void @equals42(<8 x i16>* %src0,i8* %dst0,i64 %len0) { ; i32()*
entry:
%len = udiv exact i64 %len0, 8
br label %cond
cond:
%i = phi i64 [ 0, %entry ], [ %isucc, %loop ]
%src = phi <8 x i16>* [ %src0, %entry ], [ %srcsucc, %loop ]
%dst = phi i8* [ %dst0, %entry ], [ %dstsucc, %loop ]
%cmp = icmp slt i64 %i, %len
br i1 %cmp, label %loop, label %end
loop:
%isucc = add i64 %i, 1
%srcsucc = getelementptr <8 x i16>, <8 x i16>* %src, i64 1
%dstsucc = getelementptr i8, i8* %dst, i64 1
%val = load <8 x i16>, <8 x i16>* %src
%bits = icmp eq <8 x i16> %val, <i16 42,i16 42,i16 42,i16 42,i16 42,i16 42,i16 42,i16 42>
%res = bitcast <8 x i1> %bits to i8
store i8 %res, i8* %dst
br label %cond
end:
ret void;
}
And here's some C code (call-equality.c) that calls it:
#include <stdio.h>
#include <stdint.h>
#define SZ 8
void equals42(void*,void*,int64_t);
/* Prints the highest bit first and the lowest bit last */
void printbits(uint8_t x)
{
for(int i=sizeof(x)<<3; i; i--)
putchar('0'+((x>>(i-1))&1));
}
int main(){
uint16_t a[SZ * 8] = {0};
uint8_t b[8];
a[1] = 42;
a[15] = 42;
equals42(a,b,SZ * 8);
for(int i = 0; i < SZ; i++){
printf("Index %d:",i);
printbits(b[i]);
printf("\n");
}
}
Build, link, and run with:
llc-9 -O3 -mcpu=skylake -filetype=obj equality-8x16.ll
gcc call-equality.c equality-8x16.o
./a.out
And here's the results:
Index 0:00000010
Index 1:10000000
Index 2:00000000
Index 3:00000000
Index 4:00000000
Index 5:00000000
Index 6:00000000
Index 7:00000000
This works, and it even happens to do what I expect. The bits at positions 1 and 15 are set (interpreting byte 1, bit position 7 as bit position 15). However, it's not clear whether or not I would get the same results on a big-endian platform (I'm using a little-endian Skylake CPU). Again, I'd like to stress that LLVM's official documentation does not document the behavior of bitcasts involving <8 x i1>.
The question is not just "does this happen to work on your computer or mine". (Although if someone has a big-endian platform, I would be curious to see if the example program gives the same results). The real questions are:
Is there some quasi-authoritative source, even if it's just mailing list threads and issue trackers, that specifies the semantics of these bitcasts?
If these bitcasts are unsound, what is the idiomatic way to convert a <8 x i1> to an i8? It is possible to project out all eight bits individually (via extractelement) and then build an i8 with some ORs and bitshifts, but that seems both tedious and relies heavily on an optimization pass to get the shuffle operation I would expect. Is there something better?
The closest thing I've found so far is a mailing list thread from 2018 where a user notices a problem where bitcast <16 x i1> %a1 to i16 is poorly optimized. A maintainer responds, offering a fix in r348104 (which I'm not able to find on GitHub or Phabricator). But this seems to imply that bitcast <16 x i1> %a1 to i16 is understood to be well defined. But what is it actually supposed to mean? Is element 0 supposed to be bit position 0 in the resulting word? I think so, but it would be nice to see this spelled out somewhere.
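For reference, the manual alternative mentioned above (testing each lane and building the byte with shifts and ORs) can be sketched in plain C++. This only models the little-endian result observed in the experiment, where element i ends up in bit i; it is not an authoritative statement of what LLVM guarantees, and pack_mask is a hypothetical helper name.

```cpp
#include <cstdint>

// Compare eight 16-bit lanes against `needle` and pack the results into
// one byte, with lane i stored in bit i (the ordering observed on the
// little-endian Skylake run above). Illustrative helper, not LLVM API.
std::uint8_t pack_mask(const std::uint16_t* lanes, std::uint16_t needle) {
    std::uint8_t mask = 0;
    for (unsigned i = 0; i < 8; ++i) {
        if (lanes[i] == needle) {
            mask = static_cast<std::uint8_t>(mask | (1u << i));
        }
    }
    return mask;
}
```

Calling it on the first two groups of eight elements of the array from call-equality.c (with a[1] and a[15] set to 42) gives 0x02 and 0x80, the same bytes the program printed for indices 0 and 1.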

LLVM IR : C++ API : Typecast from i1 to i32 and i32 to i1

I am writing a compiler for a self-made language which can handle only int values i.e. i32. Conditions and expressions are similar to C language. Thus, I am considering conditional statements as expressions i.e. they return an int value. They can also be used in expressions e.g (2 > 1) + (3 > 2) will return 2. But LLVM conditions output i1 value.
Now, I want that after each conditional statement, i1 should be converted into i32, so that I can carry out binary operations
Also, I want to use variables and expression results as condition e.g. if(variable) or if(a + b). For that I need to convert i32 to i1
At the end, I want a way to typecast from i1 to i32 and from i32 to i1. My code is giving these kinds of errors as of now :
For statement like if(variable) :
error: branch condition must have 'i1' type
br i32 %0, label %ifb, label %else
^
For statement like a = b > 3
error: stored value and pointer type do not match
store i1 %gttmp, i32* @a
^
Any suggestion on how to do that ?
I figured it out. To convert from i1 to i32, as pointed out here by Ismail Badawi, I used IRBuilder::CreateIntCast. So if v is a Value* pointing to an expression of type i1, I did the following to convert it to i32:
v = Builder.CreateIntCast(v, Type::getInt32Ty(getGlobalContext()), /*isSigned=*/false); // zero-extend so that i1 1 becomes i32 1; isSigned=true would sign-extend it to -1
But the same approach can't be used to convert i32 to i1: the cast truncates the value to its least significant bit, so i32 2 becomes i1 0. I needed i1 1 for any non-zero i32. If v is a Value* pointing to an expression of type i32, I did the following to convert it to i1:
v = Builder.CreateICmpNE(v, ConstantInt::get(Type::getInt32Ty(getGlobalContext()), 0, true));
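The difference between these conversions is easy to check with a small plain C++ model (no LLVM needed). The four helpers below are hypothetical stand-ins for the IR instructions zext, sext, trunc and icmp ne; they only mirror the arithmetic, not the builder API.

```cpp
#include <cstdint>

// zext i1 to i32: false -> 0, true -> 1.
constexpr std::int32_t zext_i1(bool b) { return b ? 1 : 0; }

// sext i1 to i32: the single bit is treated as the sign bit, so true -> -1.
constexpr std::int32_t sext_i1(bool b) { return b ? -1 : 0; }

// trunc i32 to i1: keeps only the least significant bit.
constexpr bool trunc_to_i1(std::int32_t v) { return (v & 1) != 0; }

// icmp ne i32 v, 0: true for every non-zero value.
constexpr bool icmp_ne_zero(std::int32_t v) { return v != 0; }

// (2 > 1) + (3 > 2) should be 2, which only zero extension delivers.
static_assert(zext_i1(2 > 1) + zext_i1(3 > 2) == 2, "zext gives C semantics");
static_assert(sext_i1(2 > 1) + sext_i1(3 > 2) == -2, "sext gives -1 per true");

// if (2) should take the branch, which truncation gets wrong.
static_assert(!trunc_to_i1(2), "trunc drops the only set bit of 2");
static_assert(icmp_ne_zero(2), "icmp ne 0 treats any non-zero value as true");
```

So for C-like semantics the i1 to i32 direction wants a zero extension, and the i32 to i1 direction wants a comparison against zero rather than a truncation.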

Setting a variable to 0 in LLVM IR

Is it possible to set a variable to 0 (or any other number) in LLVM-IR ? My searches have found me the following 3 line snippet, but is there anything simpler than the following solution ?
%ptr = alloca i32 ; yields i32*:ptr
store i32 3, i32* %ptr ; yields void
%val = load i32, i32* %ptr ; yields i32:val = i32 3
To set a value to zero (or null in general) you can use
Constant::getNullValue(Type)
and to set a value with an arbitrary constant number you can use ConstantInt::get(), but you need to identify the context first, like this:
LLVMContext &context = function->getContext();
/* or BB->getContext(), BB can be any basic block in the function */
Value* constVal = ConstantInt::get(Type::getInt32Ty(context), 3);
LLVM-IR is in static single assignment (SSA) form, so each variable is only assigned once. If you want to assign a value to a memory region you can simply use a store operation as you showed in your example:
store i32 3, i32* %ptr
The type of the second argument is i32* which means that it is a pointer to an integer that is 32 bit long.

incrementing a ptr in llvm ir

I am trying to understand the getelementptr instruction in llvm IR, but not fully understanding it.
I have a struct like below -
struct Foo {
int32_t* p;
};
I want to do this -
foo.p++;
What would be the right code for this?
%0 = getelementptr %Foo* %fooPtr, i32 0, i32 0
%1 = getelementptr i32* %0, i8 1
store i32* %1, i32* %0
I am wondering if value in %0 needs to be first loaded using "load" before executing 2nd line.
Thanks!
You can think of the GEP instruction as an operation that performs arithmetic on pointers. In LLVM IR, GEP is the instruction of choice for this: you don't have to compute type sizes and offsets by hand.
In your case:
%0 = getelementptr %Foo* %fooPtr, i32 0, i32 0
selects the member inside the structure. It uses the pointer operand %fooPtr to calculate %0 = ((fooPtr + 0) + 0). GEP does not know that fooPtr points to just one Foo, which is why two indices are used: the first steps over whole Foo objects and the second selects the member.
%1 = getelementptr i32* %0, i8 1
As mentioned above, GEP performs pointer arithmetic, so applied to p it yields p + 1.
Note, however, that GEP only computes addresses and never reads memory. %0 is the address of the member p (its type is i32**), not the value of p, so you do need to load p from %0 first and apply the second GEP to the loaded pointer.
Then you can store the incremented pointer back to the position of the p member inside the Foo struct pointed to by fooPtr, that is, through %0 as an i32**.
For further reading: The Often Misunderstood GEP Instruction
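As far as I understand the GEP documentation, getelementptr only computes an address and never touches memory, so foo.p++ breaks down into four separate steps. The plain C++ sketch below spells them out, with comments naming the IR instruction each line would roughly correspond to; increment_p and the local variable names are made up for illustration.

```cpp
#include <cstdint>

struct Foo {
    std::int32_t* p;
};

// Performs foo.p++ one explicit step at a time.
void increment_p(Foo* fooPtr) {
    // Address of the member p (an i32** in IR terms). Pure address
    // computation, like: getelementptr %Foo* %fooPtr, i32 0, i32 0
    std::int32_t** field = &fooPtr->p;

    // Read the current value of p. This is the explicit load.
    std::int32_t* old_p = *field;

    // Advance by one element, like a getelementptr on the loaded pointer.
    std::int32_t* new_p = old_p + 1;

    // Write the new pointer back into the struct. This is the store.
    *field = new_p;
}
```

For example, if foo.p points at the first element of an int32_t array, then after increment_p(&foo) it points at the second element, and the array itself is unchanged.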

Writing llvm byte code

I have just discovered LLVM and don't know much about it yet. I have been trying it out using llvm in browser. I can see that any C code I write is converted to LLVM byte code which is then converted to native code. The page shows a textual representation of the byte code. For example for the following C code:
int array[] = { 1, 2, 3};
int foo(int X) {
return array[X];
}
It shows the following byte code:
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
target triple = "x86_64-linux-gnu"
@array = global [3 x i32] [i32 1, i32 2, i32 3] ; <[3 x i32]*> [#uses=1]
define i32 @foo(i32 %X) nounwind readonly {
entry:
%0 = sext i32 %X to i64 ; <i64> [#uses=1]
%1 = getelementptr inbounds [3 x i32]* @array, i64 0, i64 %0 ; <i32*> [#uses=1]
%2 = load i32* %1, align 4 ; <i32> [#uses=1]
ret i32 %2
}
My question is: Can I write the byte code and give it to the llvm assembler to convert to native code skipping the first step of writing C code altogether? If yes, how do I do it? Does any one have any pointers for me?
One very important feature (and design goal) of the LLVM IR language is its 3-way representation:
The textual representation you can see here
The bitcode representation (the binary form)
The in-memory representation
All 3 are indeed completely interchangeable. Nothing that can be expressed in one cannot be expressed in the 2 others as well.
Therefore, as long as you conform to the syntax, you can indeed write the IR yourself. It is rather pointless though, unless you do it as an exercise to get accustomed to the format, whether to get better at reading (and diagnosing) the IR or to produce your own compiler :)
Yes, surely you can. First, you can write LLVM IR by hand. All tools like llc (which will generate a native code for you) and opt (LLVM IR => LLVM IR optimizer) accept textual representation of LLVM IR as input.