LLVM get values of array literal from GlobalVariable

Say an array is declared as follows in LLVM IR:
...
@values = local_unnamed_addr constant [6 x i32] [i32 0, i32 1, i32 8, i32 27, i32 64, i32 125], align 16
...
This will show up when calling getGlobalList() on a Module object. How do I get the literal values {0, 1, 8, 27, 64, 125} from the GlobalVariable* representing @values?

First, you get the [6 x i32] [i32 0, i32 1, i32 8, i32 27, i32 64, i32 125] constant by calling getInitializer() on the GlobalVariable, then cast it with cast<ConstantDataArray>(), and finally use the getElement* methods.
See http://llvm.org/doxygen/classllvm_1_1ConstantDataSequential.html
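For example, a minimal sketch in the C++ API (assuming gv is the GlobalVariable* for @values, and that its initializer really is a ConstantDataArray rather than, say, a ConstantAggregateZero):

#include "llvm/IR/Constants.h"
#include "llvm/IR/GlobalVariable.h"
#include "llvm/Support/raw_ostream.h"
using namespace llvm;

void printArrayLiteral(const GlobalVariable *gv) {
  if (!gv->hasInitializer())
    return;
  // The initializer is the [6 x i32] [...] constant.
  const Constant *init = gv->getInitializer();
  if (const auto *cda = dyn_cast<ConstantDataArray>(init)) {
    for (unsigned i = 0, n = cda->getNumElements(); i != n; ++i) {
      uint64_t v = cda->getElementAsInteger(i); // 0, 1, 8, 27, 64, 125
      errs() << v << "\n";
    }
  }
}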

Related

Why isn't std::unique_ptr optimized while std::variant is? [closed]

I tried to compare the overhead of std::visit (std::variant polymorphism) and virtual functions (std::unique_ptr polymorphism). (Please note that my question is not about overhead or performance, but about optimization.)
Here is my code.
https://quick-bench.com/q/pJWzmPlLdpjS5BvrtMb5hUWaPf0
#include <benchmark/benchmark.h>
#include <memory>
#include <variant>

struct Base
{
    virtual void Process() = 0;
};

struct Derived : public Base
{
    void Process() override { ++a; }
    int a = 0;
};

struct VarDerived
{
    void Process() { ++a; }
    int a = 0;
};

static std::unique_ptr<Base> ptr;
static std::variant<VarDerived> var;

static void PointerPolyMorphism(benchmark::State& state)
{
    ptr = std::make_unique<Derived>();
    for (auto _ : state)
    {
        for (int i = 0; i < 1000000; ++i)
            ptr->Process();
    }
}
BENCHMARK(PointerPolyMorphism);

static void VariantPolyMorphism(benchmark::State& state)
{
    var.emplace<VarDerived>();
    for (auto _ : state)
    {
        for (int i = 0; i < 1000000; ++i)
            std::visit([](auto&& x) { x.Process(); }, var);
    }
}
BENCHMARK(VariantPolyMorphism);
I know it's not a good benchmark test; it was only a draft during my testing.
But I was surprised at the result.
The std::visit benchmark was high (which means slow) without any optimization.
But when I turn on optimization (higher than O2), the std::visit benchmark is extremely low (which means extremely fast) while the std::unique_ptr one isn't.
I'm wondering why the same optimization can't be applied to the std::unique_ptr polymorphism?
I've compiled your code with Clang++ to LLVM IR (without your benchmarking) with -Ofast. Here's what you get for VariantPolyMorphism, unsurprisingly:
define void @_Z19VariantPolyMorphismv() local_unnamed_addr #2 {
ret void
}
On the other hand, PointerPolyMorphism does really execute the loop and all calls:
define void @_Z19PointerPolyMorphismv() local_unnamed_addr #2 personality i32 (...)* @__gxx_personality_v0 {
%1 = tail call dereferenceable(16) i8* @_Znwm(i64 16) #8, !noalias !8
tail call void @llvm.memset.p0i8.i64(i8* nonnull align 16 dereferenceable(16) %1, i8 0, i64 16, i1 false), !noalias !8
%2 = bitcast i8* %1 to i32 (...)***
store i32 (...)** bitcast (i8** getelementptr inbounds ({ [3 x i8*] }, { [3 x i8*] }* @_ZTV7Derived, i64 0, inrange i32 0, i64 2) to i32 (...)**), i32 (...)*** %2, align 8, !tbaa !11, !noalias !8
%3 = getelementptr inbounds i8, i8* %1, i64 8
%4 = bitcast i8* %3 to i32*
store i32 0, i32* %4, align 8, !tbaa !13, !noalias !8
%5 = load %struct.Base*, %struct.Base** getelementptr inbounds ({ { %struct.Base* } }, { { %struct.Base* } }* @_ZL3ptr, i64 0, i32 0, i32 0), align 8, !tbaa !4
store i8* %1, i8** bitcast ({ { %struct.Base* } }* @_ZL3ptr to i8**), align 8, !tbaa !4
%6 = icmp eq %struct.Base* %5, null
br i1 %6, label %7, label %8
7: ; preds = %8, %0
br label %11
8: ; preds = %0
%9 = bitcast %struct.Base* %5 to i8*
tail call void @_ZdlPv(i8* %9) #7
br label %7
10: ; preds = %11
ret void
11: ; preds = %7, %11
%12 = phi i32 [ %17, %11 ], [ 0, %7 ]
%13 = load %struct.Base*, %struct.Base** getelementptr inbounds ({ { %struct.Base* } }, { { %struct.Base* } }* @_ZL3ptr, i64 0, i32 0, i32 0), align 8, !tbaa !4
%14 = bitcast %struct.Base* %13 to void (%struct.Base*)***
%15 = load void (%struct.Base*)**, void (%struct.Base*)*** %14, align 8, !tbaa !11
%16 = load void (%struct.Base*)*, void (%struct.Base*)** %15, align 8
tail call void %16(%struct.Base* %13)
%17 = add nuw nsw i32 %12, 1
%18 = icmp eq i32 %17, 1000000
br i1 %18, label %10, label %11
}
The reason for this is that both your variables are static. This allows the compiler to infer that no code outside the translation unit has access to your variant instance. Therefore your loop doesn't have any visible effect and can safely be removed. However, although your smart pointer is static, the memory it points to could still change (as a side effect of the call to Process, for example). The compiler therefore cannot easily prove that it is safe to remove the loop, and doesn't.
If you remove the static from both variables, you get for VariantPolyMorphism:
define void @_Z19VariantPolyMorphismv() local_unnamed_addr #2 {
store i32 0, i32* getelementptr inbounds ({ { %"union.std::__1::__variant_detail::__union", i32 } }, { { %"union.std::__1::__variant_detail::__union", i32 } }* @var, i64 0, i32 0, i32 1), align 4, !tbaa !16
store i32 1000000, i32* getelementptr inbounds ({ { %"union.std::__1::__variant_detail::__union", i32 } }, { { %"union.std::__1::__variant_detail::__union", i32 } }* @var, i64 0, i32 0, i32 0, i32 0, i32 0, i32 0), align 4, !tbaa !18
ret void
}
Which isn't surprising once again. The variant can only contain VarDerived so nothing needs to be computed at run-time: The final state of the variant can already be determined at compile-time. The difference, though, now is that some other translation unit might want to access the value of var later on and the value must therefore be written.
Your variant can store only a single type, so it is the same as a single regular variable (it works more like an optional).
You are running the test without optimizations enabled.
The result is not protected from the optimizer, so it is free to throw your code away.
Your code does not actually use polymorphism; some compilers are able to figure out that there is only one implementation of the Base class and drop the virtual calls.
This is better but still not trustworthy:
ver 1, ver 2 with arrays.
Yes, polymorphism can be expensive when used in tight loops.
Writing benchmarks for such small, extremely fast features is hard and full of pitfalls, so it must be approached with extreme caution, since you are reaching the limitations of the benchmark tool.
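As a side note (not from the original answers): one common way to keep the optimizer from discarding the measured work in such micro-benchmarks is Google Benchmark's DoNotOptimize/ClobberMemory escapes. A sketch, reusing the var and VarDerived definitions from the question's code:

static void VariantPolyMorphismGuarded(benchmark::State& state)
{
    var.emplace<VarDerived>();
    for (auto _ : state)
    {
        for (int i = 0; i < 1000000; ++i)
            std::visit([](auto&& x) { x.Process(); }, var);
        benchmark::DoNotOptimize(var);  // treat the variant's final state as observed
        benchmark::ClobberMemory();     // force pending writes to memory
    }
}
BENCHMARK(VariantPolyMorphismGuarded);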

How to use CreateInBoundsGEP in the LLVM C++ API to access an element of an array?

I am new to LLVM programming, and I am trying to write C++ to generate LLVM IR for simple C code like this:
int a[10];
a[0] = 1;
I want to generate something like this to store 1 into a[0]
%3 = getelementptr inbounds [10 x i32], [10 x i32]* %2, i64 0, i64 0
store i32 1, i32* %3, align 16
And I tried CreateGEP: auto arrayPtr = builder.CreateInBoundsGEP(var, num); where var and num are both of type llvm::Value*,
but I only get
%1 = getelementptr inbounds [10 x i32], [10 x i32]* %0, i32 0
store i32 1, [10 x i32]* %1
I searched Google for a long time and looked through the LLVM manual, but I still don't know which C++ API to use and how to use it.
Really appreciate it if you can help!
Note that the 2nd argument to IRBuilder::CreateInBoundsGEP (1st overload) is actually ArrayRef<Value *>, which means it accepts an array of Value * values (including a C-style array, std::vector<Value *>, std::array<Value *, LEN> and others).
To generate a GEP instruction with multiple indices, pass an array of Value * as the second argument:
Value *i32zero = ConstantInt::get(context, APInt(32, 0));
Value *indices[2] = {i32zero, i32zero};
builder.CreateInBoundsGEP(var, ArrayRef<Value *>(indices, 2));
Which will yield
%1 = getelementptr inbounds [10 x i32], [10 x i32]* %0, i32 0, i32 0
You can correctly identify that %1 is of type i32*, pointing to the first item in the array pointed to by %0.
LLVM documentation on GEP instruction: https://llvm.org/docs/GetElementPtr.html
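For context, here is a minimal end-to-end sketch under the same assumptions as above (the older IRBuilder overload that deduces the pointee type from the pointer; names like fn, arr and a0 are made up for illustration). Newer LLVM releases instead take the [10 x i32] type as an explicit first argument to CreateInBoundsGEP:

#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Module.h"
#include "llvm/Support/raw_ostream.h"
using namespace llvm;

int main() {
  LLVMContext context;
  Module module("gep_example", context);
  IRBuilder<> builder(context);

  // Equivalent of: void f() { int a[10]; a[0] = 1; }
  Function *fn = Function::Create(FunctionType::get(builder.getVoidTy(), false),
                                  Function::ExternalLinkage, "f", &module);
  builder.SetInsertPoint(BasicBlock::Create(context, "entry", fn));

  ArrayType *arrTy = ArrayType::get(builder.getInt32Ty(), 10);   // [10 x i32]
  Value *arr = builder.CreateAlloca(arrTy, nullptr, "a");        // int a[10];

  Value *i32zero = builder.getInt32(0);
  Value *indices[] = {i32zero, i32zero};
  Value *a0 = builder.CreateInBoundsGEP(arr, indices, "a0");     // &a[0]
  builder.CreateStore(builder.getInt32(1), a0);                  // a[0] = 1;
  builder.CreateRetVoid();

  module.print(outs(), nullptr);                                 // dump the generated IR
  return 0;
}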

LLVM 3.9 ConstantExpr::getAsInstruction for Instruction::GetElementPtr gets an assertion

I migrated from LLVM 3.6.1 to LLVM 3.9.0. In LLVM 3.6 this code executed fine, but in LLVM 3.9 I get an assertion failure:
... include/llvm/IR/Instructions.h:866: static llvm::GetElementPtrInst* llvm::GetElementPtrInst::Create(llvm::Type*, llvm::Value*, llvm::ArrayRef<llvm::Value*>, const llvm::Twine&, llvm::Instruction*): Assertion `PointeeType == cast<PointerType>(Ptr->getType()->getScalarType())->getElementType()' failed.
My Code is:
pOperand = pStore->getValueOperand();
if (!pOperand)
    return;
pConstExpr = dyn_cast<ConstantExpr>(pOperand);
if (!pConstExpr)
    return;
if (pConstExpr->getOpcode() == Instruction::GetElementPtr)
{
    pGEPInst = dyn_cast<GetElementPtrInst>(pConstExpr->getAsInstruction()); // Assertion !!!
    if (!pGEPInst)
        return;
    ... other code ...
}
EDIT:
This problem appears only when the build type of LLVM 3.9.0 is DEBUG. The RELEASE build doesn't have this problem!
I found where my problem was; it was my own mistake. In LLVM 3.9.0, GetElementPtrInst gained a "Type *PointeeType" argument, which did not exist in LLVM 3.6.1, and that was the source of my error.
When I generate LLVM IR from a source file with LLVM 3.6.1, I get this IR line:
store i8* getelementptr inbounds ([14 x i8]* @.str1, i32 0, i32 0), i8** %2
In LLVM 3.9.0 this line is:
store i8* getelementptr inbounds ([14 x i8], [14 x i8]* @.str1, i32 0, i32 0), i8** %2
After this, in both versions I rename the global variable @.str1 to @globstr1 and replace it everywhere in the module. In LLVM 3.6.1 the line is transformed to:
store i8* getelementptr inbounds ([17 x i8]* @globstr1, i32 0, i32 0), i8** %2
and works fine, but in LLVM 3.9.0 the line is transformed to:
store i8* getelementptr inbounds ([14 x i8], [17 x i8]* @globstr1, i32 0, i32 0), i8** %2
The type [14 x i8] is not changed, and this is the place where the assertion failure occurs.
EDIT:
This problem appears only when LLVM is built with assertions enabled (the -DLLVM_ENABLE_ASSERTIONS=1 CMake parameter, which defaults to ON for DEBUG builds). In a RELEASE build it doesn't appear, because LLVM_ENABLE_ASSERTIONS defaults to OFF there.
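A hedged sketch of the kind of fix involved when rebuilding such a constant GEP by hand (the helper name and the newGlobal variable, i.e. @globstr1, are assumptions, not the poster's actual code): taking the pointee type from the new global itself keeps it consistent with the pointer type, which is exactly what the LLVM 3.9 assertion checks.

#include "llvm/IR/Constants.h"
#include "llvm/IR/GlobalVariable.h"
using namespace llvm;

// Hypothetical helper: builds
//   getelementptr inbounds ([17 x i8], [17 x i8]* @globstr1, i32 0, i32 0)
// with the pointee type taken from the replacement global, not the old [14 x i8].
static Constant *makeFirstElementGEP(GlobalVariable *newGlobal) {
  LLVMContext &ctx = newGlobal->getContext();
  Constant *zero = ConstantInt::get(Type::getInt32Ty(ctx), 0);
  Constant *idxs[] = {zero, zero};
  return ConstantExpr::getInBoundsGetElementPtr(
      newGlobal->getValueType(),          // [17 x i8], matches @globstr1
      newGlobal, ArrayRef<Constant *>(idxs));
}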

What does an overlong bitshift on a LLVM vector yield?

The LLVM documentation for 'shl' says that
<result> = shl i32 1, 32
is an undefined value because it's shifting by greater than or equal to the number of bits in an i32. However, it's not clear to me what happens with
<result> = shl <2 x i32> < i32 1, i32 1>, < i32 1, i32 32>
Is only the second element of the result undefined (result=<2 x i32> < i32 2, i32 undef>), or is the result as a whole undefined (result=<2 x i32> undef)?

Is LLVM GEP safe if the size of an array is not known?

I am trying to use GEP to get a pointer of i32 from an array.
But the problem is: I don't know the size of the array.
The IR document on llvm.org said GEP just adds the offsets to the base address with silently-wrapping two’s complement arithmetic.
So, I want to ask for some advice.
Is it safe like this:
%v1 = alloca i32
store i32 5, i32* %v1
%6 = load i32* %v1
%7 = bitcast i32* %v0 to [1 x i32]*
%8 = getelementptr [1 x i32]* %7, i32 0, i32 %6
%9 = load i32* %8
store i32 %9, i32* %v0
The type of %v0 is i32*, and I know %v0 points to an array in memory, but its size is 9, not 1.
Then I "GEP" from %7, which I treat as a [1 x i32], not a [9 x i32], but the "offset" is 5 (%6).
So, is there any problem? Is it unsafe, or just not good but basically OK?
First of all, the entire code you wrote is equivalent to:
%x = getelementptr i32* %v0, i32 5
%y = load i32* %x
store i32 %y, i32* %v0
There's no reason to bitcast the pointer to [1 x i32]*, just use it as-is.
Regarding your question - using a gep to get the pointer is always safe (in the sense that it's well-defined and will never crash), however there's nothing stopping it from evaluating to a pointer beyond the bounds of the array; and in such a case, accessing the memory (as you do in the subsequent load instruction) is undefined.
Also, this link might be of interest: http://llvm.org/docs/GetElementPtr.html#what-happens-if-an-array-index-is-out-of-bounds