In the following code what I try to do is load an array of doubles (represented as a pointer) from a nested struct inside a struct then store a value inside the final element. When I try to execute this, I get a segmentation fault (debugging below). Why am I getting a segmentation fault and how can I fix it?
LLVM IR Code
%7 = getelementptr { { double*, i32 }*, i32 }, { { double*, i32 }*, i32 }* %foo3, i32 0, i32 0
%load_array_ptr = load { double*, i32 }*, { double*, i32 }** %7
%8 = getelementptr { double*, i32 }, { double*, i32 }* %load_array_ptr, i32 0, i32 0
%load_elem_ptr = load double*, double** %8
%9 = getelementptr double, double* %load_elem_ptr, i32 0
; problematic line:
store double 1.000000e+00, double* %9
Debugging
I tried debugging this with lldb, but I did not really get anything useful:
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
frame #0: 0x0000000100000ee9 built`__anon_expr0 + 105
built`__anon_expr0:
-> 0x100000ee9 <+105>: movsd %xmm0, (%rax)
0x100000eed <+109>: movsd 0x10(%rsp), %xmm0 ; xmm0 = mem[0],zero
0x100000ef3 <+115>: addq $0x48, %rsp
0x100000ef7 <+119>: retq
Target 0: (built) stopped.
stop reason = EXC_BAD_ACCESS (code=1, address=0x0)
EXC_BAD_ACCESS means you're illegally accessing memory and address=0x0 means that the address you're trying to access is 0. So, if this occurs on store double 1.000000e+00, double* %9, %9 must be a null pointer. This means that the double* in your inner array was null.
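To make the fix concrete, here is a minimal C++ sketch of the same layout. The struct names Inner and Outer and the helpers init_inner and store_first are hypothetical stand-ins for { { double*, i32 }*, i32 }, not anything from your code generator. It only illustrates the point above: the inner double* has to be given storage before the store, and a null check turns the crash into a detectable failure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical C++ mirror of the IR type { { double*, i32 }*, i32 }.
struct Inner { double *data; int len; };
struct Outer { Inner *inner; int len; };

// Gives the inner array real storage so that data is non-null
// before anything is stored through it.
void init_inner(Inner *in, int len) {
    assert(len > 0);
    in->data = static_cast<double *>(
        std::calloc(static_cast<std::size_t>(len), sizeof(double)));
    in->len = len;
}

// Same access path as the IR: outer -> inner -> data -> element 0.
// Returns false instead of crashing when a pointer on the path is null.
bool store_first(Outer *o, double v) {
    if (o == nullptr || o->inner == nullptr || o->inner->data == nullptr)
        return false;
    o->inner->data[0] = v;  // corresponds to the store through %9
    return true;
}
```

In the generated IR, the equivalent of init_inner would be whatever allocation (alloca, malloc call, or similar) your front end emits for the array; the sketch assumes that step is what is currently missing.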
PS:
%9 = getelementptr double, double* %load_elem_ptr, i32 0
This isn't related to your problem, but here you're setting %9 to be equal to %load_elem_ptr + 0, which is a no-op.
Related
I'm new to LLVM, and I'm doing some experiments on it such as inserting an instruction.
My main.c is shown below:
int foo(int e, int a) {
  int b = a + 1;
  int c = b * 2;
  b = e << 1;
  int d = b / 4;
  return c * d;
}
I use the command below to generate the LLVM bytecode
clang-12 -O0 -Xclang -disable-O0-optnone -emit-llvm -c main.c -o main.bc
opt-12 -S -mem2reg main.bc -o main.ll
The bytecode is
; Function Attrs: noinline nounwind uwtable
define dso_local i32 @foo(i32 %0, i32 %1) #0 {
%3 = add nsw i32 %1, 1
%4 = mul nsw i32 %3, 2
%5 = shl i32 %0, 1
%6 = sdiv i32 %5, 4
%7 = mul nsw i32 %4, %6
ret i32 %7
}
And I use the code to insert an instruction after the first instruction:
bool runOnBasicBlock(BasicBlock &B) {
  // get the first instruction of the block
  Instruction &Inst1st = *B.begin();
  // build "op0 + op0" from the first instruction's first operand
  Instruction *NewInst = BinaryOperator::Create(
      Instruction::Add, Inst1st.getOperand(0), Inst1st.getOperand(0));
  // place the new add right after the first instruction
  NewInst->insertAfter(&Inst1st);
  ...
}
After I run this pass, the bytecode is changed to
; Function Attrs: noinline nounwind uwtable
define dso_local i32 @foo(i32 %0, i32 %1) #0 {
%3 = add nsw i32 %1, 1
%4 = add i32 %1, %1
%5 = mul nsw i32 %4, 2
%6 = shl i32 %0, 1
%7 = sdiv i32 %6, 4
%8 = mul nsw i32 %5, %7
ret i32 %8
}
It seems that the inserted instruction is equal to b = a + a;, so the instruction %4 = mul nsw i32 %3, 2 is changed to %5 = mul nsw i32 %4, 2. I cannot understand the reason. Any help?
As far as I know, NewInst->insertAfter(&Inst1st); turns the block
int b = a + 1;
int c = b * 2;
into the following block
int b = a + 1, a + a;
int c = b * 2;
so b drops its previous value %3 and takes the new value %4, and the following mul uses that new value of b.
I'm working with the LLVM IR in OCaml to build a toy language, and my problem now is how to turn a variable into a reference to that variable.
In other words, my simple program is this
int main(){
  int i;
  i = 2;
  int *p;
  p = &i;
  print(*p);
  return 0;
}
My problem is getting the pointer to the variable i in the instruction p = &i;. The IR I currently generate is
define i32 @main() {
entry:
%i = alloca i32
store i32 2, i32* %i
%p = alloca i32*
%0 = getelementptr i32, i32* %i, i32 0
store i32* %0, i32** %p
%1 = load i32*, i32** %p
%2 = load i32, i32* %1
call void @print(i32 %2)
ret i32 0
}
I don't like the line %0 = getelementptr i32, i32* %i, i32 0, and I think I'm just lucky that my code works as expected.
To summarize, my question is: what is good practice for this kind of memory operation on a variable, as in C? In particular, I need to do the following
i = 2;
int *p;
p = &i;
And also
int *p;
p = &i;
*p = *p + 2;
I'm missing something, because when I try to compile code like *p = *p + 2; I get a core dump.
I also noticed that clang doesn't use getelementptr for my first example, but generates code like this
; Function Attrs: noinline nounwind optnone uwtable
define dso_local i32 @main() #0 {
%1 = alloca i32, align 4
%2 = alloca i32, align 4
%3 = alloca i32*, align 8
store i32 0, i32* %1, align 4
store i32 2, i32* %2, align 4
store i32* %2, i32** %3, align 8
%4 = load i32*, i32** %3, align 8
%5 = load i32, i32* %4, align 4
%6 = call i32 (i32, ...) bitcast (i32 (...)* @print to i32 (i32, ...)*)(i32 %5)
ret i32 0
}
In my grammar, *p is a pointer, and I convert it into an LLVM pointer type in the IR.
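For reference, the behaviour the generated IR has to reproduce can be sketched in C++ as below. The function name deref_add_two is hypothetical, and the comments are only a rough mapping to loads and stores under the assumption that every local is an alloca, as in the clang output above, where the alloca result itself is stored as the address and no getelementptr is involved.

```cpp
// Returns 4: i is set to 2, then incremented by 2 through p.
int deref_add_two() {
    int i;        // %i = alloca i32
    i = 2;        // store 2 into %i
    int *p;       // %p = alloca i32*
    p = &i;       // store the alloca %i itself into %p
    *p = *p + 2;  // load p, load through it, add 2, store back through p
    return i;
}
```

The last assignment needs two loads and one store: the pointer is loaded from %p once, and that loaded pointer (not %p itself) is the address used for both the read and the write.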
I am trying to run this code but I am getting:
error: expected instruction opcode
label_3:
this is the relevant part of the code:
define void @main(){
%r1 = alloca [50 x i32]
%r7 = alloca i32
store i32 0 , i32* %r7
label_3:
%r9 = load i32 , i32* %r7
%r8 = getelementptr [258 x i32], [258 x i32]* %r6 , i32 0 , i32 %r9
store i32 0 , i32* %r8
%r10 = add i32 1 , %r9
store i32 %r10 , i32* %r7
%r11 = icmp eq i32 256 , i32 %r10
br i1 %r11 , label %label_4 , label %label_3
label_4:
.....
Thanks in advance!
I solved the problem. Before entering the loop (label_3), the previous block has to be closed explicitly, and that requires a terminator instruction, so I added br label %label_3 on the line before label_3:
For more details, read this:
https://zanopia.wordpress.com/2010/09/14/understanding-llvm-assembly-with-fractals-part-i/
Please consider following code:
float test(int len, int* tab)
{
  for(int i = 0; i<len; i++)
    tab[i] = i;
}
Obviously the return statement is missing. In this scenario, both clang and the NDK compiler for ARM generate an infinite loop. Disassembling shows that the compiler emits an unconditional branch instead of a conditional one.
mov r0, #0
.LBB0_1:
str r0, [r1, r0, lsl #2]
add r0, r0, #1
b .LBB0_1
The example with an error can be found here: https://godbolt.org/z/YDSFw-
Please note that the C++ specification states that a missing return is considered undefined behaviour, but it refers only to the returned value. It should not affect the preceding instructions.
Am I missing something here? Any thoughts?
No, you can't reason that way with undefined behaviour.
The compiler is free to use undefined behaviour and assumptions around it for optimizations. The compiler is free to assume your code will not contain undefined behaviour.
In this case, the compiler can assume that the code with undefined behaviour won't be reached. As the end of the function contains undefined behaviour, the compiler concludes that the end of the function actually never will be reached, and thus can optimize the loop.
If you remove the -Oz and add -emit-llvm to the compiler explorer command, you'll see what LLVM IR clang produces originally, when not doing optimizations:
https://godbolt.org/z/-dbeNj
define dso_local float @_Z4testiPi(i32 %0, i32* %1) #0 {
%3 = alloca i32, align 4
%4 = alloca i32*, align 4
%5 = alloca i32, align 4
store i32 %0, i32* %3, align 4
store i32* %1, i32** %4, align 4
store i32 0, i32* %5, align 4
br label %6
6: ; preds = %15, %2
%7 = load i32, i32* %5, align 4
%8 = load i32, i32* %3, align 4
%9 = icmp slt i32 %7, %8
br i1 %9, label %10, label %18
10: ; preds = %6
%11 = load i32, i32* %5, align 4
%12 = load i32*, i32** %4, align 4
%13 = load i32, i32* %5, align 4
%14 = getelementptr inbounds i32, i32* %12, i32 %13
store i32 %11, i32* %14, align 4
br label %15
15: ; preds = %10
%16 = load i32, i32* %5, align 4
%17 = add nsw i32 %16, 1
store i32 %17, i32* %5, align 4
br label %6
18: ; preds = %6
call void @llvm.trap()
unreachable
}
The end of the loop, label 18, contains unreachable. This can be used for further optimizations, getting rid of the branch and comparison at the start of the loop.
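A minimal sketch of two ways to remove the undefined behaviour, so the end of the function becomes reachable and the loop condition has to be kept. Both function names and the choice of return value are assumptions for illustration; the original code does not say what test() was meant to return.

```cpp
// Variant 1 (hypothetical name): no result is needed, so the function
// is declared void and control may legally reach its end.
void fill_indices(int len, int *tab) {
    for (int i = 0; i < len; i++)
        tab[i] = i;
}

// Variant 2 (hypothetical name): keep the float return type and return
// something on every path. Returning len is an arbitrary choice made
// only for this sketch.
float fill_and_count(int len, int *tab) {
    for (int i = 0; i < len; i++)
        tab[i] = i;
    return static_cast<float>(len);
}
```

Compiling with -Wall (or -Wreturn-type) makes clang and GCC warn about the original function, which catches this kind of mistake early.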
Edit:
There's an excellent blog post from John Regehr about how to reason around undefined behaviour in C and C++. It's a bit long but well worth a read.
I'm writing an LLVM IR pass that changes the index operand of GetElementPtr instructions to a value computed at runtime.
I succeeded in replacing the GEP index with constant integers. For example,
the code below replaces the last index of every GEP instruction with 0.
// For each instruction in the function
for(inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I){
  // Find GEP instruction
  if(auto *GI = dyn_cast<GetElementPtrInst>(&*I)){
    GI->setOperand(GI->getNumIndices(), ConstantInt::get(Type::getInt32Ty(I->getContext()), 0));
  }
}
the result IR is like this.
Original: %7 = getelementptr inbounds %struct.A, %struct.A* %6, i32 0, i32 0
Replace: %7 = getelementptr inbounds %struct.A, %struct.A* %6, i32 0, i32 0
Original: %9 = getelementptr inbounds %struct.A, %struct.A* %8, i32 0, i32 1
Replace: %9 = getelementptr inbounds %struct.A, %struct.A* %8, i32 0, i32 0
The problem is that when I try to set the index to the result of an instruction evaluated at runtime, it fails.
Modified pass:
for(inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I){
  // Find GEP instruction
  if(auto *GI = dyn_cast<GetElementPtrInst>(&*I)){
    // Insert the call right before the GEP
    IRBuilder<> Builder(GI);
    Instruction* X = Builder.CreateCall(...);
    // Use the call result as the last index
    GI->setOperand(GI->getNumIndices(), X);
  }
}
Result of the modified pass:
Original: %7 = getelementptr inbounds %struct.A, %struct.A* %6, i32 0, i32 0
Replace: %7 = getelementptr inbounds %struct.A, %struct.A* %6, i32 0, void <badref>
Original: %9 = getelementptr inbounds %struct.A, %struct.A* %8, i32 0, i32 1
Replace: %9 = getelementptr inbounds %struct.A, %struct.A* %8, i32 0, void <badref>
GEP indexes must be integers
%7 = getelementptr inbounds %struct.A, %struct.A* %6, i32 0, void <badref>
GEP indexes must be integers
%9 = getelementptr inbounds %struct.A, %struct.A* %8, i32 0, void <badref>
I also tried to get the constant integer value of the returned value by
I->setOperand(I->getNumIndices(), ConstantInt::get(Type::getInt32Ty(I->getContext()), cast<ConstantInt>(X)->getZExtValue()));
but also doesn't work.
Original: %7 = getelementptr inbounds %struct.A, %struct.A* %6, i32 0, i32 0
Replace: %7 = getelementptr inbounds %struct.A, %struct.A* %6, i32 0, i32 784505880
Original: %9 = getelementptr inbounds %struct.A, %struct.A* %8, i32 0, i32 1
Replace: %9 = getelementptr inbounds %struct.A, %struct.A* %8, i32 0, i32 784506264
Invalid indices for GEP pointer type!
%7 = getelementptr inbounds %struct.A, %struct.A* %6, i32 0, i32 784505880
Invalid indices for GEP pointer type!
%9 = getelementptr inbounds %struct.A, %struct.A* %8, i32 0, i32 784506264
I think the reason is that it is impossible to set a GEP index from a runtime result. What should I do, then, to change every GEP index at runtime?
Do I need to replace the GEP instruction with address additions and memory access instructions?
Note the error message: GEP indexes must be integers. The operand printed as void <badref> suggests that your call returns void, so it cannot serve as an index. If the call is to a function that returns int, then it can work. It doesn't always work, though: you can call foo() and use the result to get the foo()'th element of an array, but when you're retrieving a struct field, the index has to be a constant.
In your second case, you're asking for the 784505880th field of the struct. That's either a bug or an amazingly wide struct ;)
AFAIK, using setOperand() directly is unsafe. Instead, get a pointer to the operand you want to change and call GI->replaceUsesOfWith(oldOp, newOp).
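If a struct field really has to be chosen at runtime, one option is the route the question already suggests: replace the struct GEP with plain address arithmetic. The sketch below shows that idea in ordinary C++ rather than in the pass itself. The layout of struct A, the offset table kFieldOffsets and the helper load_field are all hypothetical, and it assumes every field has the same type (int).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical layout; the real %struct.A is not shown in the question.
struct A { int x; int y; };

// Byte offset of each field, indexed by field number.
static const std::size_t kFieldOffsets[] = { offsetof(A, x), offsetof(A, y) };

// Reads field number idx of *a using base address + byte offset,
// which is what a struct GEP with a runtime index cannot express.
int load_field(const A *a, unsigned idx) {
    assert(idx < sizeof(kFieldOffsets) / sizeof(kFieldOffsets[0]));
    int out;
    std::memcpy(&out,
                reinterpret_cast<const char *>(a) + kFieldOffsets[idx],
                sizeof out);
    return out;
}
```

In IR terms this would correspond to casting the struct pointer to i8*, indexing it with a runtime byte offset, and casting back before the load. That works only when the selected fields share a type; otherwise a switch over constant-index GEPs is the more direct choice.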