Checking the top bits of an i64 Value in LLVM IR

I am going to keep this short and to the point, but if further clarifications are needed please let me know.
I have an i64 Value and I want to check whether its top bits are all zero. If they are, I would do one thing; if they are not, I would do something else. How do I instrument the IR so that this check happens at runtime?
One thing I found is that LLVM has an intrinsic, llvm.ctlz, that counts the leading zeros and returns the count as an i64 Value, but how do I use its return value to do the checking? Or how do I instrument the IR so that the checking happens at runtime?
Any help or suggestions would be appreciated. Thanks!

You didn't say how many top bits, so I'll do an example with the top 32 bits. Given i64 %x, I'd check it with %result = icmp uge i64 %x, 4294967296, because 4294967296 is 2^32, which is the smallest value that has a 1 bit among the top 32 bits. %result is therefore true when the top bits are not all zero. If you want to check that the top two bits are zero, use 2^62 (4611686018427387904) instead.
In order to do two different things based on the value of %result, in general you'll want to branch on it. BasicBlock has a method splitBasicBlock that takes an instruction to split at. Use that to split your block into a before and an after block. Create new blocks for the true side and the false side, then add a branch on your result to the new blocks: br i1 %result, label %cond_true, label %cond_false. Make sure those two new blocks branch back to the after block.
Depending on what you want to do, you may not need an entire block. For instance, if you're only calculating a value and not doing any side-effecting operations, you might be able to use a select instruction instead of a branch and separate blocks.
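To double-check the constants, here is a minimal sketch of the same predicate in plain C++. It is a model of the arithmetic, not LLVM API code, and the helper name topBitsAreZero is made up for illustration. It also relates to the ctlz idea from the question: the top n bits are zero exactly when the leading-zero count is at least n.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns true when the top `n` bits of `x` are all
// zero, for 1 <= n <= 64. This is the negation of the IR comparison
// "icmp uge i64 %x, 2^(64-n)".
bool topBitsAreZero(std::uint64_t x, unsigned n) {
    assert(n >= 1 && n <= 64);
    // Shifting right by (64 - n) keeps only the top n bits.
    return (x >> (64 - n)) == 0;
}
```

With n = 32 the boundary sits at 2^32, and with n = 2 it sits at 2^62, matching the constants used in the icmp above.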

Related

Pre evaluate LLVM IR

Let's suppose we have expressions like:
%rem = srem i32 %i.0, 10
%mul = mul nsw i32 %rem, 2
%i.0 is an llvm::PHINode for which I can get the bounds.
The question is: is there a way to get the value of %mul at compile time? I'm writing an LLVM pass and I need to evaluate some expressions that use %i.0. I'm looking for a function, class or something else to which I can give a value for %i.0 and which will evaluate the expression and return the result.
You could clone the code (the containing function or the entire module, depending on how much context you need), then replace %i.0 with a constant value, run the constant propagation pass on the code, and finally check whether %mul is assigned to a constant value and if so, extract it.
It's not elegant, but I think it would work. Just pay attention to:
Make sure %mul is not elided out - for example, return it from the function, or store its value to memory, or something.
Be aware constant propagation assumes some things about the code, in particular that it already passed through mem2reg.
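For comparison, this is roughly what the folded result should look like for the two instructions in the question. It is a hand-written C++ model, not LLVM API code; the names evalMul and evalMulOverRange are hypothetical, and the closed range [lo, hi] stands in for whatever bounds you obtained for %i.0.

```cpp
#include <cstdint>
#include <vector>

// Mirrors:  %rem = srem i32 %i.0, 10
//           %mul = mul nsw i32 %rem, 2
// C++ '%' on signed integers truncates toward zero, like srem.
std::int32_t evalMul(std::int32_t i) {
    std::int32_t rem = i % 10;
    return rem * 2;  // |rem| <= 9, so this cannot overflow
}

// Evaluates %mul for every value of %i.0 in the closed range [lo, hi].
std::vector<std::int32_t> evalMulOverRange(std::int32_t lo, std::int32_t hi) {
    std::vector<std::int32_t> results;
    // A 64-bit counter avoids wrapping when hi is the largest i32 value.
    for (std::int64_t i = lo; i <= hi; ++i) {
        results.push_back(evalMul(static_cast<std::int32_t>(i)));
    }
    return results;
}
```

A model like this is useful as a cross-check on whatever the clone-and-fold approach produces, since it shows the values you should expect for a given input.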

Llvm how to access global array elements

Can someone please explain to me what is wrong with this code?
I think this should fetch the second element from the global array, but in fact it silently crashes somewhere inside the JIT compilation routine.
My suppositions:
GEP instruction calculates memory address of the element by applying offset and returns pointer.
load instruction loads value referenced by given pointer (it dereferences a pointer, in other words).
ret instruction exits function and passes given value to caller.
It seems I've missed something basic, but I'm past the point where I should have given up looking for the answer myself, so I have to ask for help.
@arr = common global [256 x i64], align 8
define i64 @iterArray() {
entry:
%0 = load i64* getelementptr inbounds ([256 x i64]* @arr, i32 1, i32 0)
ret i64 %0
}
You requested the 257th item in a 256-item array, and that's a problem.
The first index given to a gep instruction means how many steps are made through the value operand - and here the value operand is not an array but a pointer to an array. That means every step there skips the entire size of the array forward - and that's why the gep actually asks for the 257th item. Using 0 as the first gep index will probably fix the problem. Then using 1 as the 2nd index will get you the 2nd item in the array, which is what you wanted. Read more about it here: http://llvm.org/docs/GetElementPtr.html#what-is-the-first-index-of-the-gep-instruction
Alternatively, you could use the extractvalue instruction, which is similar to gep but implicitly uses 0 for the first index. There are a couple of other differences, notably that it operates on an aggregate value rather than on a pointer, so the array would have to be loaded first.
Regarding why the compiler crashes, I'm not sure. Normally such a memory access would compile fine and, at runtime, either generate a segfault or just return a bad value. Here, though, you specifically requested the gep to be inbounds. That is not a runtime bounds check: it is a promise to the optimizer, and a gep that breaks the promise yields a poison value. If that happens here, your function is effectively load undef. I'm not sure what LLVM does with load undef. It should probably be optimized out, with the function made to just return undef, but maybe it did something different that led to a rejection of your code.
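To make the index arithmetic concrete, here is a small C++ model of the offset this particular gep computes. It is a sketch for illustration only (the function names are made up, and i64 is assumed to occupy 8 bytes), not how LLVM implements the instruction.

```cpp
#include <cstddef>

constexpr std::size_t kArrayLen = 256;  // [256 x i64]
constexpr std::size_t kElemSize = 8;    // assumed size of i64 in bytes

// Byte offset for: getelementptr [256 x i64]* @arr, i32 idx0, i32 idx1
// idx0 steps over whole arrays, idx1 steps over elements inside one array.
std::size_t gepByteOffset(std::size_t idx0, std::size_t idx1) {
    return idx0 * (kArrayLen * kElemSize) + idx1 * kElemSize;
}

// The same offset expressed as an element index counted from @arr[0].
std::size_t gepElementIndex(std::size_t idx0, std::size_t idx1) {
    return gepByteOffset(idx0, idx1) / kElemSize;
}
```

With indices (1, 0) the element index comes out as 256, which is one past the last valid element, while (0, 1) gives index 1, the second element.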

Using the address from LLVM store instruction to create another

I'm working with LLVM to take a store instruction and replace it with another so that I can take something like
store i64 %0, i64* %a
and replace it with
store i64 <value>, i64* %a
I've used
llvm::Value *str = i->getOperand(1);
to get the address that my old instruction is using, and then I create a new store via (i is the current instruction location, so this store will be created before the store I'm replacing)
StoreInst *store = new StoreInst(value, str, i);
I then delete the store I've replaced with
i->eraseFromParent();
But I'm getting the error:
While deleting: i64%
Use still stuck around after Def is destroyed: store i64 , i64* %a
and a failure message saying that the assertion use_empty() && "Uses remain when a value is destroyed!" failed.
How could I get around this? I'd love to create a store instruction and then use LLVM's ReplaceInstWithInst, but I can't find a way to create a store instruction without giving it a location to insert itself. I'm also not 100% sure that will solve my issue.
I'll add that prior to my store replacement, I'm matching an instruction i, then getting the value I need before performing i->eraseFromParent, so I'm not sure if that is part of my problem; I'm assuming that eraseFromParent moves i along to the following store instruction.
eraseFromParent removes an instruction from the enclosing basic block (and consequently, from the enclosing function). It doesn't move it anywhere. Erasing an instruction this way without taking care of its uses first will leave your IR malformed, which is why you're getting the error - it's as if you deleted line 1 from the following C snippet:
1 int x = 3;
2 int y = x + 1;
Obviously you'll get an error on the remaining line: the definition of x is now missing!
ReplaceInstWithInst is probably the best way to replace one instruction with another. You don't need to supply the new instruction with an insertion point: pass NULL for the insert-before argument (or better yet, omit it) and the constructor will create a dangling instruction, which you can then place wherever you want.
For the reason above, by the way, the key method that ReplaceInstWithInst invokes is Value::replaceAllUsesWith, which ensures that you won't be left with missing values in your IR.
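If the def/use relationship is unfamiliar, the toy model below may help. It is plain C++ with a made-up ToyValue type, not the real llvm::Value class, and it only illustrates the ordering rule: redirect all uses first, and erase once the use list is empty.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy stand-in for a value in the IR (hypothetical, not llvm::Value).
struct ToyValue {
    std::vector<ToyValue*> operands;  // values this one reads
    std::vector<ToyValue*> users;     // values that read this one

    // Records that this value reads `v`.
    void addOperand(ToyValue* v) {
        operands.push_back(v);
        v->users.push_back(this);
    }

    // Analogue of replaceAllUsesWith: repoint every user at `other`.
    void replaceAllUsesWith(ToyValue* other) {
        assert(other != this);
        for (ToyValue* u : users) {
            std::replace(u->operands.begin(), u->operands.end(), this, other);
            other->users.push_back(u);
        }
        users.clear();
    }

    // Erasing is only safe once nothing refers to this value any more.
    bool safeToErase() const { return users.empty(); }
};
```

In this model, erasing a value while safeToErase() is false corresponds to the "uses remain" assertion from the question.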

How can I find the size of a type?

I'm holding a Type* in my hand. How do I find out its size (the size objects of this type will occupy in memory) in bits / bytes? I see all kinds of methods allowing me to get "primitive" or "scalar" size, but that won't help me with aggregate types...
If you only need the size because you are inserting it into the IR (e.g., so you can pass it to a call to malloc()), you can use the getelementptr instruction to do the dirty work (with a little casting), as described here (updated for modern LLVM):
Though LLVM does not contain a special purpose sizeof/offsetof instruction, the
getelementptr instruction can be used to evaluate these values. The basic idea
is to use getelementptr from the null pointer to compute the value as desired.
Because getelementptr produces the value as a pointer, the result is casted to
an integer before use.
For example, to get the size of some type, %T, we would use something like
this:
%Size = getelementptr %T* null, i32 1
%SizeI = ptrtoint %T* %Size to i32
This code is effectively pretending that there is an array of T elements,
starting at the null pointer. This gets a pointer to the 2nd T element
(element #1) in the array and treats it as an integer. This computes the
size of one T element.
The good thing about doing this is that it is useful in exactly the cases where you do not care what the value is; where you just need to pass the correct value from the IR to something. That's by far the most common case for my need for sizeof()-alike operations in the IR generation.
The page also goes on to describe how to do an offsetof() equivalent:
To get the offset of some field in a structure, a similar trick is used. For
example, to get the address of the 2nd element (element #1) of { i8, i32* }
(which depends on the target alignment requirement for pointers), something
like this should be used:
%Offset = getelementptr {i8,i32*}* null, i32 0, i32 1
%OffsetI = ptrtoint i32** %Offset to i32
This works the same way as the sizeof trick: we pretend there is an instance of
the type at the null pointer and get the address of the field we are interested
in. This address is the offset of the field.
Note that in both of these cases, the expression will be evaluated to a
constant at code generation time, so there is no runtime overhead to using this
technique.
The IR optimizer also converts the values to constants.
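The same two ideas can be tried out in ordinary C++ as an analogy. This is not LLVM code: it uses a real two-element array instead of a null pointer, the struct Pair is a hypothetical counterpart of { i8, i32* }, and the actual numbers depend on the target.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical C++ counterpart of the IR type { i8, i32* }.
struct Pair {
    std::int8_t first;
    std::int32_t* second;
};

// "sizeof" trick: distance in bytes from element 0 to element 1 of an array.
template <typename T>
std::size_t strideOf() {
    T elems[2] = {};
    const char* p0 = reinterpret_cast<const char*>(&elems[0]);
    const char* p1 = reinterpret_cast<const char*>(&elems[1]);
    return static_cast<std::size_t>(p1 - p0);
}

// "offsetof" trick: distance in bytes from the object to its second field.
std::size_t offsetOfSecond() {
    Pair p = {};
    const char* base = reinterpret_cast<const char*>(&p);
    const char* field = reinterpret_cast<const char*>(&p.second);
    return static_cast<std::size_t>(field - base);
}
```

On a given target, strideOf<Pair>() should agree with sizeof(Pair) and offsetOfSecond() with offsetof(Pair, second), which is the relationship the getelementptr versions rely on.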
The size depends on the target (for several reasons, alignment being one of them).
In LLVM versions 3.2 and above, you need to use DataLayout, in particular its getTypeAllocSize method. This returns the size in bytes; there's also a bit version named getTypeAllocSizeInBits. A DataLayout instance can be obtained by creating it from the current module: DataLayout* TD = new DataLayout(M).
With LLVM up to and including version 3.1, use TargetData instead of DataLayout. It exposes the same getTypeAllocSize methods.

Find loop termination condition variable

I want to find the variable that is used to check for termination of a loop.
For example, in the loop below I should get "%n":
for.body8: ; preds = %for.body8.preheader, %for.body8
%i.116 = phi i32 [ %inc12, %for.body8 ], [ 0, %for.body8.preheader ]
%inc12 = add nsw i32 %i.116, 1
.....
%6 = load i32* %n, align 4, !tbaa !0
%cmp7 = icmp slt i32 %inc12, %6
br i1 %cmp7, label %for.body8, label %for.end13.loopexit
Is there any direct method to get this value?
One way I can think of is iterating over the instructions and checking for the icmp instruction, but I don't think that is a proper method.
Please suggest a method.
Thanks in advance.
While there is no way to do this for general loops, it is possible to find this out in some cases. In LLVM there is a pass called '-indvars: Canonicalize Induction Variables' which is described as
This transformation analyzes and transforms the induction variables
(and computations derived from them) into simpler forms suitable for
subsequent analysis and transformation.
This transformation makes the following changes to each loop with an
identifiable induction variable:
All loops are transformed to have a single canonical induction variable which starts at zero and steps by one.
The canonical induction variable is guaranteed to be the first PHI node in the loop header block.
Any pointer arithmetic recurrences are raised to use array subscripts.
If the trip count of a loop is computable, this pass also makes the
following changes:
The exit condition for the loop is canonicalized to compare the induction value against the exit value. This turns loops like:
for (i = 7; i*i < 1000; ++i)
into
for (i = 0; i != 25; ++i)
Any use outside of the loop of an expression derived from the indvar is changed to compute the derived value outside of the loop,
eliminating the dependence on the exit value of the induction
variable. If the only purpose of the loop is to compute the exit value
of some derived expression, this transformation will make the loop
dead.
This transformation should be followed by strength reduction after all
of the desired loop transformations have been performed. Additionally,
on targets where it is profitable, the loop could be transformed to
count down to zero (the "do loop" optimization).
and sounds like it does just what you need.
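The loop rewrite quoted above is easy to check by hand. The sketch below simply counts the iterations of both forms in plain C++ (the function names are made up for illustration); it says nothing about how the pass derives the trip count.

```cpp
// Counts iterations of the original loop: for (i = 7; i*i < 1000; ++i)
int tripCountOriginal() {
    int n = 0;
    for (int i = 7; i * i < 1000; ++i) {
        ++n;
    }
    return n;
}

// Counts iterations of the canonical loop: for (i = 0; i != 25; ++i)
int tripCountCanonical() {
    int n = 0;
    for (int i = 0; i != 25; ++i) {
        ++n;
    }
    return n;
}
```

The original loop runs for i = 7 through 31, since 31*31 = 961 and 32*32 = 1024, so both forms execute 25 times.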
Unfortunately there is no general solution to this. Your question is an instance of the Halting Problem, proven to have no general solution: http://en.wikipedia.org/wiki/Halting_problem
If you're able to cut the problem space down to something extremely simple, using a subset of operations that is not Turing complete (http://en.wikipedia.org/wiki/Turing-complete), you may be able to come up with a solution. However, there is no general solution.