LLVM pass to count vector type instructions - c++

I am trying to write an LLVM pass that counts instructions of vector type.
For instructions like:
%24 = or <2 x i64> %21, %23
%25 = bitcast <16 x i8> %12 to <8 x i16>
%26 = shl <8 x i16> %25, <i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1>
%27 = bitcast <8 x i16> %26 to <2 x i64>
I wrote this code:
for (auto &F : M) {
  for (auto &B : F) {
    for (auto &I : B) {
      if (auto* VI = dyn_cast<InsertElementInst>(&I)) {
        Value* op = VI->getOperand(0);
        if (op->getType()->isVectorTy()){
          ++vcount;
        }
      }
    }
  }
}
But for some reason if (auto* VI = dyn_cast<InsertElementInst>(&I)) is never satisfied.
Any idea why?
Thanks in advance.

InsertElementInst is one specific instruction (the one that inserts an element into a vector), and there is none in your list of instructions.
You probably want to skip the dyn_cast to a specific subclass and use the Instruction in I as it is.
[I personally would use one of the function or module pass classes as a base, so you only need to implement the inner loops of your code, but that's more of a "it's how you're supposed to do things", not something you HAVE to do to make it work].

In LLVM, an instruction is the same thing as its result. So, for example, with
%25 = bitcast <16 x i8> %12 to <8 x i16>
when you cast the Instruction I to a Value you get %25:
Value* psVal = cast<Value>(&I);
You can then check whether it is of vector type with psVal->getType()->isVectorTy().
I also suggest you look at the inheritance diagram of llvm::Value for more clarification:
http://llvm.org/docs/doxygen/html/classllvm_1_1Value.html
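Putting the two answers together, the counting loop only needs the result type of each instruction. Below is a minimal sketch of that idea. MockType, MockInst, MockModule, countVectorInstructions and makeExampleModule are hypothetical names that are not part of LLVM; the mock types exist only so the snippet compiles without LLVM headers. The template relies on nothing but range iteration and getType()->isVectorTy(), so passing an llvm::Module instead should work, though that is an assumption rather than something tested here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for llvm::Type and llvm::Instruction. They mimic
// only the two calls the counting loop needs: getType() and isVectorTy().
struct MockType {
  bool IsVector;
  bool isVectorTy() const { return IsVector; }
};
struct MockInst {
  MockType Ty;
  const MockType *getType() const { return &Ty; }
};
using MockBlock = std::vector<MockInst>;
using MockFunction = std::vector<MockBlock>;
using MockModule = std::vector<MockFunction>;

// Counts instructions whose *result* type is a vector. No dyn_cast to a
// specific instruction subclass is involved.
template <typename ModuleT>
unsigned countVectorInstructions(const ModuleT &M) {
  unsigned Count = 0;
  for (const auto &F : M)
    for (const auto &B : F)
      for (const auto &I : B)
        if (I.getType()->isVectorTy())
          ++Count;
  return Count;
}

// Builds a tiny mock module: one function, one block, three instructions,
// two of which produce a vector.
inline MockModule makeExampleModule() {
  MockBlock B = {MockInst{{true}}, MockInst{{false}}, MockInst{{true}}};
  return MockModule{MockFunction{B}};
}
```

Note that this counts by result type, so an instruction that consumes vectors but produces a scalar or nothing (extractelement, or a store of a vector) is not counted. If those matter, the operand types would need checking as well.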

Related

How to use CreateInBoundsGEP in cpp api of llvm to access the element of an array?

I am new to llvm programming, and I am trying to write cpp to generate llvm ir for a simple C code like this:
int a[10];
a[0] = 1;
I want to generate something like this to store 1 into a[0]
%3 = getelementptr inbounds [10 x i32], [10 x i32]* %2, i64 0, i64 0
store i32 1, i32* %3, align 16
I tried CreateInBoundsGEP: auto arrayPtr = builder.CreateInBoundsGEP(var, num); where var and
num are both of type llvm::Value*,
but I only get
%1 = getelementptr inbounds [10 x i32], [10 x i32]* %0, i32 0
store i32 1, [10 x i32]* %1
I searched Google for a long time and looked through the LLVM manual, but I still don't know which C++ API to use or how to use it.
I would really appreciate it if you could help!
Note that the 2nd argument to IRBuilder::CreateInBoundsGEP (1st overload) is actually ArrayRef<Value *>, which means it accepts an array of Value * values (including C-style array, std::vector<Value *> and std::array<Value *, LEN> and others).
To generate a GEP instruction with multiple indices, pass an array of Value * as the second argument:
Value *i32zero = ConstantInt::get(context, APInt(32, 0));
Value *indices[2] = {i32zero, i32zero};
builder.CreateInBoundsGEP(var, ArrayRef<Value *>(indices, 2));
Which will yield
%1 = getelementptr inbounds [10 x i32], [10 x i32]* %0, i32 0, i32 0
Here %1 has type i32* and points to the first element of the array pointed to by %0.
LLVM documentation on GEP instruction: https://llvm.org/docs/GetElementPtr.html
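The difference between one index and two can be modeled with ordinary C++ pointer types. This is only an analogy, not LLVM API code: Array10, gepOneIndex and gepTwoIndices are hypothetical names made up for this sketch. The first index steps over whole [10 x i32] objects, and the second index steps into the array, which is why the second form yields a pointer to an element.

```cpp
#include <cassert>
#include <type_traits>

using Array10 = int[10];  // plays the role of [10 x i32]

// Models `getelementptr [10 x i32], [10 x i32]* %p, i32 0`:
// a single index moves between whole arrays, so the type stays Array10*.
inline Array10 *gepOneIndex(Array10 *P) { return P + 0; }

// Models `getelementptr [10 x i32], [10 x i32]* %p, i32 0, i32 0`:
// the second index selects an element, so the type becomes int*.
inline int *gepTwoIndices(Array10 *P) { return &(*(P + 0))[0]; }

static_assert(std::is_same<decltype(gepOneIndex(nullptr)), Array10 *>::value,
              "one index: still a pointer to the array");
static_assert(std::is_same<decltype(gepTwoIndices(nullptr)), int *>::value,
              "two indices: pointer to the first element");
```

Storing an int through the result of gepTwoIndices corresponds to the `store i32 1, i32* %3` from the question, whereas the result of gepOneIndex still has the array pointer type that produced the unwanted `store i32 1, [10 x i32]* %1`.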

LLVM IR: initialize and cast [20 x i8]

I am trying to initialize and then cast a number of LLVM IR variables in the following way:
store i64 %content, i64* %5
%tt2 = load i64, i64* %5
%ttt2 = trunc i64 %tt2 to i32
While this seems trivial and works fine, I am stuck trying to do the same thing for a [20 x i8] typed variable. Something like:
store [20 x i8] %content, [20 x i8]* %5
%tt2 = load [20 x i8], [20 x i8]* %5
%ttt2 = trunc [20 x i8] %tt2 to i32
Currently I get the following error message for the third line:
invalid cast opcode for cast from [20 x i8] to i32
Could anyone shed some light on this issue? Thanks!
You can trunc from one int to another, but not from an array to an int. That's just how trunc is defined — if the input isn't an int, then trunc would need to do something markedly different from "drop the higher-order bits and preserve the lower-order bits".
I think the most common approach is to cast the pointer and then load/store from a pointer that already matches the type you want to load/store.
(Note that %ttt2 etc. aren't LLVM variables, they're LLVM values. They don't vary, ever.)
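As a rough C++ analogy of "cast the pointer, then load the type you want", here is a sketch that reads the first four bytes of a 20-byte buffer as a 32-bit integer. loadFirstWord and storeFirstWord are hypothetical helper names. std::memcpy stands in for the pointer cast because dereferencing a reinterpreted pointer is not well defined in C++; in IR the equivalent would be a bitcast of the [20 x i8]* to i32* followed by a load.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reads bytes 0..3 of the buffer as one 32-bit value, the way a load
// through an i32* that aliases the start of the array would.
inline std::uint32_t loadFirstWord(const std::uint8_t (&Bytes)[20]) {
  std::uint32_t Word;
  std::memcpy(&Word, Bytes, sizeof(Word));
  return Word;
}

// Writes a 32-bit value into bytes 0..3, leaving bytes 4..19 untouched.
inline void storeFirstWord(std::uint8_t (&Bytes)[20], std::uint32_t Word) {
  std::memcpy(Bytes, &Word, sizeof(Word));
}
```

Unlike trunc on an integer, which always keeps the low-order bits, this picks the first four bytes in memory, so which part of the data that corresponds to depends on the target's byte order.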

Will GetElementPtr work as expected

I am writing llvm code using C++. I have a place in my code where the below scenario happens
1. %117 = phi <2 x double>* [ %105, %aligned ], [ %159, %116 ]
7. %123 = getelementptr <2 x double>* %117, i32 0
8. %127 = getelementptr <2 x double>* %123, i32 0
9. %128 = load <2 x double>* %127
10. %129 = getelementptr <2 x double>* %123, i32 1
11. %130 = load <2 x double>* %129
12. %131 = shufflevector <2 x double> %128, <2 x double> %130, <2 x i32> <i32 1, i32 3>
In lines 7 and 8 I compute the same address twice (it should point to the same data type both times), but with a different pointer operand each time. Is it safe to do this, or will it lead to undefined results?
The instruction
%x = getelementptr %anytype* %y, i32 0
is completely meaningless; it's as if you had written (the illegal):
%x = %y
So yes, both %123 and %127 will point to the same memory. It's safe, but redundant: you can just use %117 directly wherever %123 or %127 are used. The only problematic thing in your snippet is that the value numbering is not sequential, but I assume that's just from pasting just parts of the code here.
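The same reasoning in plain C++ pointer arithmetic, as a small sketch: Vec2 is a hypothetical stand-in for <2 x double>, and gep is a made-up helper that models a single-index getelementptr as P + N.

```cpp
#include <cassert>

// Hypothetical stand-in for the IR type <2 x double>.
struct Vec2 {
  double Lane[2];
};

// Models `getelementptr <2 x double>* %p, i32 N`: step N whole vectors.
// With N == 0 the result is the input pointer, unchanged.
inline Vec2 *gep(Vec2 *P, int N) { return P + N; }
```

With a two-element buffer, gep(gep(P, 0), 0) compares equal to P, matching %127 == %123 == %117, while gep(P, 1) is the address of the next vector, matching %129.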

LLVM IR: efficiently summing a vector

I'm writing a compiler that's generating LLVM IR instructions. I'm working extensively with vectors.
I would like to be able to sum all the elements in a vector. Right now I'm just extracting each element individually and adding them up manually, but it strikes me that this is precisely the sort of thing that the hardware should be able to help with (as it sounds like a pretty common operation). But there doesn't seem to be an intrinsic to do it.
What's the best way to do this? I'm using LLVM 3.2.
First of all, even without using intrinsics, you can generate log(n) vector additions (with n being the vector length) instead of n scalar additions. Here's an example with vector size 8:
define i32 @sum(<8 x i32> %a) {
%v1 = shufflevector <8 x i32> %a, <8 x i32> undef, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
%v2 = shufflevector <8 x i32> %a, <8 x i32> undef, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
%sum1 = add <4 x i32> %v1, %v2
%v3 = shufflevector <4 x i32> %sum1, <4 x i32> undef, <2 x i32> <i32 0, i32 1>
%v4 = shufflevector <4 x i32> %sum1, <4 x i32> undef, <2 x i32> <i32 2, i32 3>
%sum2 = add <2 x i32> %v3, %v4
%v5 = extractelement <2 x i32> %sum2, i32 0
%v6 = extractelement <2 x i32> %sum2, i32 1
%sum3 = add i32 %v5, %v6
ret i32 %sum3
}
If your target supports these vector additions, then it seems highly likely the above will be lowered to use those instructions, which should be faster than adding the elements one at a time.
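The halving pattern in the IR above can be written as a scalar C++ model to check the logic. This is only an illustration of the reduction order, not generated code; horizontalSum is a hypothetical name, and each pass of the outer loop corresponds to one shufflevector/add pair (or the final extractelement/add).

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Sums 8 lanes in log2(8) = 3 steps: each step adds the upper half of the
// still-active lanes onto the lower half, then halves the active width.
inline int horizontalSum(std::array<int, 8> V) {
  for (std::size_t Width = V.size() / 2; Width >= 1; Width /= 2)
    for (std::size_t I = 0; I < Width; ++I)
      V[I] += V[I + Width];
  return V[0];
}
```

For the lanes 1 through 8 the steps produce widths 4, 2 and 1, and the result is 36. Whether a C++ compiler vectorizes the inner loop is a separate matter; the point is only that three add steps replace seven sequential ones.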
Regarding intrinsics, there are no target-independent intrinsics to handle this in LLVM 3.2. If you're compiling to x86, though, you do have access to the hadd intrinsics (e.g. llvm.x86.ssse3.phadd.d.128, which horizontally adds adjacent pairs of elements from two <4 x i32> vectors). You'll still have to do something similar to the above, only the add instructions could be replaced.
For more information about this you can search for "horizontal sum" or "horizontal vector sum"; for instance, here are some relevant stackoverflow questions for a horizontal sum on x86:
horizontal sum of 8 packed 32bit floats
Fastest way to do horizontal vector sum with AVX instructions
Fastest way to do horizontal float vector sum on x86

Passing an array to an external function

I am new to LLVM, and I am learning how to use LLVM for profiling. I need to pass an array to an external method, and insert a call instruction to the method in the code. I am currently using the following code, which on execution gives a segmentation fault.
std::vector<Value*> Args(1);
//Vector with array values
SmallVector<Constant*, 2> counts;
counts.push_back(ConstantInt::get(Type::getInt32Ty(BB->getContext()),32, false));
counts.push_back(ConstantInt::get(Type::getInt32Ty(BB->getContext()),12, false));
//Array with 2 integers
Args[0]= ConstantArray::get(llvm::ArrayType::get(llvm::Type::getInt32Ty(BI->getContext()),2), counts);
Here, the external function 'hook' is defined as M.getOrInsertFunction("hook", Type::getVoidTy(M.getContext()),
llvm::ArrayType::get(llvm::Type::getInt32Ty(BI->getContext()),2),
(Type*)0);
After reading a few source files, I've tried using GetElementPtrInst to pass the array
std::vector<Value*> ids(1);
ids.push_back(ConstantInt::get(Type::getInt32Ty(BB->getContext()),0));
Constant* array = ConstantArray::get(llvm::ArrayType::get(llvm::Type::getInt32Ty(BI->getContext()),2), counts);
Args[0] = ConstantExpr::getGetElementPtr(&(*array), ids, false);
but it fails with
7 opt 0x00000000006c59f5 bool llvm::isa<llvm::Constant, llvm::Value*>(llvm::Value* const&) + 24
8 opt 0x00000000006c5a0f llvm::cast_retty<llvm::Constant, llvm::Value*>::ret_type llvm::cast<llvm::Constant, llvm::Value*>(llvm::Value* const&) + 24
9 opt 0x0000000000b2b22f
10 opt 0x0000000000b2a4fe llvm::ConstantFoldGetElementPtr(llvm::Constant*, bool, llvm::ArrayRef<llvm::Value*>) + 55
11 opt 0x0000000000b33df2 llvm::ConstantExpr::getGetElementPtr(llvm::Constant*, llvm::ArrayRef<llvm::Value*>, bool) + 82
Also, in this case, 'hook' is defined as M.getOrInsertFunction("hook", Type::getVoidTy(M.getContext()),
PointerType::get(Type::getInt32PtrTy(M.getContext()),0), //when using GEP
(Type*)0);
Could someone kindly give me a few pointers on passing arrays to an external function (say, with the signature void hook(int abc[]))? I am probably wrong all the way through, and would really appreciate some help.
A good place to start with "how do I do this c-like thing in LLVM IR" questions is to first write what you want to do in C, then compile it to LLVM IR via Clang and take a look at the result.
In your particular instance, the file:
void f(int a[2]);
void g() {
int x[2];
x[0] = 1;
x[1] = 3;
f(x);
}
Will compile to:
define void @g() nounwind {
%x = alloca [2 x i32], align 4
%1 = getelementptr inbounds [2 x i32]* %x, i32 0, i32 0
store i32 1, i32* %1, align 4
%2 = getelementptr inbounds [2 x i32]* %x, i32 0, i32 1
store i32 3, i32* %2, align 4
%3 = getelementptr inbounds [2 x i32]* %x, i32 0, i32 0
call void @f(i32* %3)
ret void
}
declare void @f(i32*)
So we can see that Clang compiled f to receive an i32*, not an array. That means you need a way to get the address of the first element of the array from the array itself, and a getelementptr instruction is a straightforward way of doing that.
Notice, however, that you want to generate a GEP instruction (getelementptr), for example via GetElementPtrInst::Create. A GEP constant expression, which is what you're trying to generate here, is something else, and will only work on compile-time constants.
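The C-level view of the same thing, as a small sketch: a parameter declared as int abc[] is really an int*, so the caller passes the address of element 0. The body of hook and the callHook wrapper are hypothetical and exist only so there is something observable; the values 32 and 12 are the ones from the question.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical body for the hook from the question. It receives only a
// pointer to the first element, never the array itself.
inline void hook(int Abc[]) { Abc[0] = Abc[0] + Abc[1]; }

// The array parameter is adjusted to a pointer: hook has type void(int*).
static_assert(std::is_same<decltype(&hook), void (*)(int *)>::value,
              "int abc[] in a parameter list means int *abc");

// Caller side: take the address of element 0 (what the GEP with indices
// 0, 0 computes on the alloca) and pass that.
inline int callHook() {
  int Counts[2] = {32, 12};
  int *First = &Counts[0];
  hook(First);
  return Counts[0];
}
```

This mirrors the IR shown earlier: the array lives in memory (the alloca), and the call argument is the i32* to its first element. A ConstantArray value has no address by itself, which is why it needs to be stored somewhere before a pointer to it can be passed.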
You should use Clang to compile it. Then, check the boundaries of the array and if all the elements are defined.