something strange when using icmp->getOpcode() function - llvm

I tried to use the following code to get the opcode of an icmp instruction:
int opCode = icmp -> getOpcode();
When I run the statement on an icmp instruction like this:
%cmp = icmp eq i32 %0, 0
I get an opCode of 52. But for the 'eq' operation, I expected opCode to be 32:
ICMP_EQ = 32, ///< equal
Why does this happen, and how can I solve it?

I figured out how to solve this: getOpcode() returns the instruction's opcode (Instruction::ICmp for any icmp instruction), not the comparison predicate, so getPredicate() should be used instead.
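For illustration, a minimal sketch of the distinction; the function name inspect and the variable names are made up for the example:
#include "llvm/IR/Instructions.h"
using namespace llvm;

void inspect(ICmpInst *icmp) {
    // getOpcode() identifies the kind of instruction (Instruction::ICmp);
    // its numeric value (52 here) depends on the LLVM version.
    unsigned opCode = icmp->getOpcode();
    // getPredicate() identifies the comparison itself, e.g. CmpInst::ICMP_EQ (32).
    CmpInst::Predicate pred = icmp->getPredicate();
    if (pred == CmpInst::ICMP_EQ) {
        // this is the 'eq' case from the question
    }
}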

Related

Is LLVM bitcast from vector of bool (i1) to i8, i16, etc. well defined?

In LLVM, can a value of type <8 x i1> be bitcast to an i8? If so, what is the expected bit order? The LLVM documentation on bitcast is not explicit on this. The claim it makes is:
The bitcast instruction converts value to type ty2. It is always a no-op cast because no bits change with this conversion. The conversion is done as if the value had been stored to memory and read back as type ty2.
Tangentially, on the mailing list it has been clarified that "no-op cast" does not mean what it sounds like.
Back to the issue at hand: the problem I see with bitcasting <8 x i1> to any other type (not just i8) is that a value of type <8 x i1> cannot be stored to memory. I have confirmed this experimentally (code not included), and it is also well documented on the mailing list. Since storing values of type <8 x i1> leads to undefined behavior, the specification "as if the value had been stored to memory and read back as type ty2" implies that any bitcast to or from <8 x i1> results in undefined behavior.
Note that a very similar question has been asked before, but the answers to that question do not provide a satisfactory answer to the general soundness issue presented here. The author of that question resolved the issue by bitcasting <8 x i1> to <1 x i8>, but this cast still involves an argument of type <8 x i1>, so I am not convinced that it is sound.
For what it's worth, in some of my own small tests with LLVM, I have confirmed that bitcasting <8 x i1> to i8 works. Below is a function that tests 8 i16s at a time for whether or not they are each equal to 42.
; Filename is equality-8x16.ll
define void @equals42(<8 x i16>* %src0, i8* %dst0, i64 %len0) { ; i32()*
entry:
%len = udiv exact i64 %len0, 8
br label %cond
cond:
%i = phi i64 [ 0, %entry ], [ %isucc, %loop ]
%src = phi <8 x i16>* [ %src0, %entry ], [ %srcsucc, %loop ]
%dst = phi i8* [ %dst0, %entry ], [ %dstsucc, %loop ]
%cmp = icmp slt i64 %i, %len
br i1 %cmp, label %loop, label %end
loop:
%isucc = add i64 %i, 1
%srcsucc = getelementptr <8 x i16>, <8 x i16>* %src, i64 1
%dstsucc = getelementptr i8, i8* %dst, i64 1
%val = load <8 x i16>, <8 x i16>* %src
%bits = icmp eq <8 x i16> %val, <i16 42,i16 42,i16 42,i16 42,i16 42,i16 42,i16 42,i16 42>
%res = bitcast <8 x i1> %bits to i8
store i8 %res, i8* %dst
br label %cond
end:
ret void
}
And here's some C code (call-equality.c) that calls it:
#include <stdio.h>
#include <stdint.h>
#define SZ 8
void equals42(void*,void*,int64_t);
/* Prints the highest bit first and the lowest bit last */
void printbits(uint8_t x)
{
    for(int i = sizeof(x) << 3; i; i--)
        putchar('0' + ((x >> (i - 1)) & 1));
}
int main(){
    uint16_t a[SZ * 8] = {0};
    uint8_t b[8];
    a[1] = 42;
    a[15] = 42;
    equals42(a, b, SZ * 8);
    for(int i = 0; i < SZ; i++){
        printf("Index %d:", i);
        printbits(b[i]);
        printf("\n");
    }
}
Build, link, and run with:
llc-9 -O3 -mcpu=skylake -filetype=obj equality-8x16.ll
gcc call-equality.c equality-8x16.o
./a.out
And here's the results:
Index 0:00000010
Index 1:10000000
Index 2:00000000
Index 3:00000000
Index 4:00000000
Index 5:00000000
Index 6:00000000
Index 7:00000000
This works, and it even happens to do what I expect: the bits at positions 1 and 15 are set (interpreting byte 1, bit position 7 as bit position 15). However, it's not clear whether I would get the same results on a big-endian platform (I'm using a little-endian Skylake CPU). Again, I'd like to stress that LLVM's official documentation does not document the behavior of bitcasts involving <8 x i1>.
The question is not just "does this happen to work on your computer or mine?" (although if someone has a big-endian platform, I would be curious to see whether the example program gives the same results). The real questions are:
Is there some quasi-authoritative source, even if it's just mailing list threads and issue trackers, that specifies the semantics of these bitcasts?
If these bitcasts are unsound, what is the idiomatic way to convert an <8 x i1> to an i8? It is possible to project out all eight bits individually (via extractelement) and then build an i8 with some ORs and bit shifts (a sketch of this manual approach follows below), but that seems tedious and relies heavily on an optimization pass to produce the shuffle operation I would expect. Is there something better?
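For concreteness, a sketch of that manual approach for the first two elements, assuming element 0 maps to bit position 0 (which is exactly the assumption in question); the remaining six elements would follow the same pattern:
%b0 = extractelement <8 x i1> %bits, i32 0
%b1 = extractelement <8 x i1> %bits, i32 1
%z0 = zext i1 %b0 to i8
%z1 = zext i1 %b1 to i8
%s1 = shl i8 %z1, 1
%acc01 = or i8 %z0, %s1
; ...and so on through element 7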
The closest thing I've found so far is a mailing list thread from 2018 where a user notices a problem where bitcast <16 x i1> %a1 to i16 is poorly optimized. A maintainer responds, offering a fix in r348104 (which I'm not able to find on GitHub or Phabricator). But this seems to imply that bitcast <16 x i1> %a1 to i16 is understood to be well defined. But what is it actually supposed to mean? Is element 0 supposed to be bit position 0 in the resulting word? I think so, but it would be nice to see this spelled out somewhere.

LLVM IR : C++ API : Typecast from i1 to i32 and i32 to i1

I am writing a compiler for a self-made language which handles only int values, i.e. i32. Conditions and expressions are similar to the C language, so I am treating conditional statements as expressions, i.e. they return an int value. They can also be used in expressions, e.g. (2 > 1) + (3 > 2) will return 2. But LLVM comparisons produce an i1 value.
Now I want the i1 result of each comparison to be converted into i32, so that I can carry out binary operations on it.
Also, I want to use variables and expression results as conditions, e.g. if(variable) or if(a + b). For that I need to convert i32 to i1.
In short, I need a way to typecast from i1 to i32 and from i32 to i1. My code currently gives errors like these:
For a statement like if(variable):
error: branch condition must have 'i1' type
br i32 %0, label %ifb, label %else
^
For a statement like a = b > 3:
error: stored value and pointer type do not match
store i1 %gttmp, i32* @a
^
Any suggestions on how to do that?
I figured it out. To convert from i1 to i32, as pointed out here by Ismail Badawi, I used IRBuilder::CreateIntCast. So if v is a Value * pointing to an expression resulting in i1, I did the following to convert it to i32:
v = Builder.CreateIntCast(v, Type::getInt32Ty(getGlobalContext()), false); // unsigned (zext): true becomes 1 rather than -1
But the same can't be applied for converting i32 to i1: that would truncate the value to its least significant bit, so i32 2 would become i1 0, whereas I need i1 1 for any non-zero i32. If v is a Value * pointing to an expression resulting in i32, I did the following to convert it to i1:
v = Builder.CreateICmpNE(v, ConstantInt::get(Type::getInt32Ty(getGlobalContext()), 0, true));
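Putting both directions together, a minimal sketch; the helper names boolToInt and intToBool are made up for illustration, and Builder is assumed to be an IRBuilder<>:
#include "llvm/IR/IRBuilder.h"
using namespace llvm;

// i1 -> i32: zero-extend, so true becomes 1 (sign extension would give -1)
Value *boolToInt(IRBuilder<> &Builder, Value *v) {
    return Builder.CreateZExt(v, Type::getInt32Ty(Builder.getContext()));
}

// i32 -> i1: compare against zero, so any non-zero value becomes true
Value *intToBool(IRBuilder<> &Builder, Value *v) {
    return Builder.CreateICmpNE(v, ConstantInt::get(Type::getInt32Ty(Builder.getContext()), 0));
}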

Setting a variable to 0 in LLVM IR

Is it possible to set a variable to 0 (or any other number) in LLVM IR? My searches turned up the following three-line snippet, but is there anything simpler?
%ptr = alloca i32 ; yields i32*:ptr
store i32 3, i32* %ptr ; yields void
%val = load i32, i32* %ptr ; yields i32:val = i32 3
To set a value to zero (or null in general) you can use
Constant::getNullValue(Type)
and to create a value with an arbitrary constant number you can use ConstantInt::get(), but you need to obtain the context first, like this:
LLVMContext &context = function->getContext();
/* or BB->getContext(), BB can be any basic block in the function */
Value* constVal = ConstantInt::get(Type::getInt32Ty(context), 3);
LLVM-IR is in static single assignment (SSA) form, so each variable is only assigned once. If you want to assign a value to a memory region you can simply use a store operation as you showed in your example:
store i32 3, i32* %ptr
The type of the second argument is i32*, which means it is a pointer to a 32-bit integer.
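For completeness, here is a minimal C++ sketch that emits the snippet above through the API; it assumes an IRBuilder<> named Builder that is already positioned at a valid insertion point:
#include "llvm/IR/IRBuilder.h"
using namespace llvm;

Value *emitLocalInt(IRBuilder<> &Builder, int n) {
    Type *i32 = Type::getInt32Ty(Builder.getContext());
    Value *ptr = Builder.CreateAlloca(i32);             // %ptr = alloca i32
    Builder.CreateStore(ConstantInt::get(i32, n), ptr); // store i32 n, i32* %ptr
    return Builder.CreateLoad(i32, ptr);                // %val = load i32, i32* %ptr
}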

Why is my (re)implementation of strlen wrong?

I came up with this little code, but all the professionals said it's dangerous and that I should not write code like this. Can anyone highlight its vulnerabilities in more detail?
int strlen(char *s) {
    return (*s) ? 1 + strlen(s + 1) : 0;
}
It has no vulnerabilities per se; this is perfectly correct code. It is prematurely pessimized, of course. It will run out of stack space for anything but the shortest strings, and its performance will suck due to the recursive calls, but otherwise it's OK.
Tail-call optimization most likely won't cope with such code. If you want to live dangerously and depend on tail-call optimization, you should rephrase it to use a tail call:
// note: size_t is an unsigned integer type, from <stddef.h>
#include <stddef.h>

size_t strlen_impl(const char *s, size_t len) {
    if (*s == 0) return len;
    if (len + 1 < len) return len; // protect from overflow
    return strlen_impl(s + 1, len + 1);
}
size_t strlen(const char *s) {
    return strlen_impl(s, 0);
}
"Dangerous" is a bit of a stretch, but it is needlessly recursive and likely to be less efficient than the iterative alternative.
I suppose also that given a very long string there is a danger of a stack overflow.
There are two serious security bugs in this code:
Use of int instead of size_t for the return type. As written, strings longer than INT_MAX will cause this function to invoke undefined behavior via integer overflow. In practice, this could lead to computing strlen(huge_string) as some small value like 1, malloc'ing the wrong amount of memory, and then performing strcpy into it, causing a buffer overflow.
Unbounded recursion which can overflow the stack, i.e. Stack Overflow. :-) A compiler may choose to optimize the recursion into a loop (in this case, it's possible with current compiler technology), but there is no guarantee that it will. In a best case, stack overflow will simply crash the program. In a worst case (e.g. running on a thread with no guard page) it could clobber unrelated memory, possibly yielding arbitrary code execution.
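For comparison, a minimal iterative version that avoids both problems (named my_strlen here to avoid clashing with the standard library):
#include <stddef.h>

size_t my_strlen(const char *s) {
    size_t len = 0;
    while (s[len] != '\0') /* walk until the terminator */
        len++;
    return len;
}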
The problem of killing the stack that has been pointed out ought to be fixed by a decent compiler, which flattens the apparent recursive call into a loop. I verified this hypothesis by asking clang to translate your code:
// sl.c
unsigned sl(char const *s) {
    return (*s) ? (1 + sl(s + 1)) : 0;
}
Compiling and disassembling:
clang -emit-llvm -O1 -c sl.c -o sl.o
# ^^ Yes, O1 is already sufficient.
llvm-dis-3.2 sl.o
And this is the relevant part of the LLVM result (sl.o.ll):
define i32 @sl(i8* nocapture %s) nounwind uwtable readonly {
%1 = load i8* %s, align 1, !tbaa !0
%2 = icmp eq i8 %1, 0
br i1 %2, label %tailrecurse._crit_edge, label %tailrecurse
tailrecurse: ; preds = %tailrecurse, %0
%s.tr3 = phi i8* [ %3, %tailrecurse ], [ %s, %0 ]
%accumulator.tr2 = phi i32 [ %4, %tailrecurse ], [ 0, %0 ]
%3 = getelementptr inbounds i8* %s.tr3, i64 1
%4 = add i32 %accumulator.tr2, 1
%5 = load i8* %3, align 1, !tbaa !0
%6 = icmp eq i8 %5, 0
br i1 %6, label %tailrecurse._crit_edge, label %tailrecurse
tailrecurse._crit_edge: ; preds = %tailrecurse, %0
%accumulator.tr.lcssa = phi i32 [ 0, %0 ], [ %4, %tailrecurse ]
ret i32 %accumulator.tr.lcssa
}
I don't see a recursive call. Indeed, clang named the looping label tailrecurse, which gives us a hint as to what clang is doing here.
So, finally (tl;dr) yes, this code is perfectly safe and a decent compiler with a decent flag will iron the recursion out.

Writing llvm byte code

I have just discovered LLVM and don't know much about it yet. I have been trying it out using LLVM in the browser. I can see that any C code I write is converted to LLVM bitcode, which is then converted to native code. The page shows a textual representation of the bitcode. For example, for the following C code:
int array[] = { 1, 2, 3};
int foo(int X) {
    return array[X];
}
It shows the following bitcode:
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
target triple = "x86_64-linux-gnu"
@array = global [3 x i32] [i32 1, i32 2, i32 3] ; <[3 x i32]*> [#uses=1]
define i32 @foo(i32 %X) nounwind readonly {
entry:
%0 = sext i32 %X to i64 ; <i64> [#uses=1]
%1 = getelementptr inbounds [3 x i32]* @array, i64 0, i64 %0 ; <i32*> [#uses=1]
%2 = load i32* %1, align 4 ; <i32> [#uses=1]
ret i32 %2
}
My question is: can I write the IR by hand and have the LLVM tools convert it to native code, skipping the step of writing C code altogether? If yes, how do I do it? Does anyone have any pointers for me?
One very important feature (and design goal) of the LLVM IR language is its 3-way representation:
The textual representation you can see here
The bytecode representation (or binary form)
The in-memory representation
All three are completely interchangeable: anything that can be expressed in one can be expressed in the other two as well.
Therefore, as long as you conform to the syntax, you can indeed write the IR yourself. It is rather pointless, though, unless done as an exercise to familiarize yourself with the format, whether to get better at reading (and diagnosing) IR or to produce your own compiler :)
Yes, surely you can. First, you can write LLVM IR by hand. All the tools, like llc (which will generate native code for you) and opt (an LLVM IR => LLVM IR optimizer), accept the textual representation of LLVM IR as input.
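For example, assuming the IR above is saved as foo.ll, a typical pipeline looks like this:
llvm-as foo.ll -o foo.bc   # assemble textual IR into bitcode
llc foo.bc -o foo.s        # compile bitcode to native assembly
llc also accepts the .ll file directly, so the llvm-as step can be skipped if you only want native code.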