Does loop operation with variable assignment violate SSA principle? - llvm

I just started to learn LLVM IR and SSA, got a question about the SSA principle.
I found the following code block on the Internet. It seems to violate the SSA principle because variables are assigned a value several times. Is my understanding right?
; <label>:4: ; preds = %7, %0
%5 = load i32, i32* %3, align 4
%6 = icmp slt i32 %5, 10
br i1 %6, label %7, label %12
; <label>:7: ; preds = %4
%8 = load i32, i32* %3, align 4
%9 = add nsw i32 %8, 1
store i32 %9, i32* %3, align 4
%10 = load i32, i32* %2, align 4
%11 = mul nsw i32 %10, 2
store i32 %11, i32* %2, align 4
br label %4

LLVM uses a "partial SSA" form. LLVM's unlimited virtual registers are in SSA form, but memory and global variables are not. SSA only requires that each register has a single definition in the program text, and that holds here: %5 is defined once, but it can take on different values at run time because it is a load from memory.
Even in full SSA form, an SSA value defined in a loop ordinarily takes on different values across the loop iterations. It would look like %5 = phi i32 [ %start_val, %loopheader_bb ], [ %iteration_val, %backedge_bb ]. You should get phi nodes if you run opt -sroa over your code.
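As a rough illustration, the IR in the question has the shape clang would emit without optimizations for a loop like the one below. This is only a guess at the source, not the asker's actual code: the function name loop_example and the mapping of %3 to i and %2 to x are assumptions.

```cpp
#include <cassert>

// Hypothetical source for the IR in the question (names are assumptions).
// At -O0 every local lives in an alloca slot, so each use becomes a load
// and each assignment becomes a store. The registers themselves are still
// defined exactly once in the IR text.
int loop_example(int x) {
    int i = 0;            // assumed to be the slot %3
    while (i < 10) {      // %5 = load ...; %6 = icmp slt i32 %5, 10
        i = i + 1;        // %8 = load, %9 = add nsw, store %9
        x = x * 2;        // %10 = load, %11 = mul nsw, store %11 (slot %2)
    }
    return x;
}
```

In the C++ source, i and x are assigned many times. In the IR that is expressed as repeated stores to memory, which SSA does not restrict. Once the allocas are promoted to registers, i and x would each be represented by a phi node in the loop header instead.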

Related

What's the instruction for '&&' in LLVM IR?

I want to write an LLVM pass to reduce && in LLVM IR, but I can't find the specific instructions for it in IR. For example,
#include <iostream>
int main(){
bool a = true;
bool b = false;
bool c = a && b;
return 0;
}
and I get the IR,
define dso_local i32 @main() #4 {
%1 = alloca i32, align 4
%2 = alloca i8, align 1
%3 = alloca i8, align 1
%4 = alloca i8, align 1
store i32 0, i32* %1, align 4
store i8 1, i8* %2, align 1
store i8 0, i8* %3, align 1
%5 = load i8, i8* %2, align 1
%6 = trunc i8 %5 to i1
br i1 %6, label %7, label %10
7: ; preds = %0
%8 = load i8, i8* %3, align 1
%9 = trunc i8 %8 to i1
br label %10
10: ; preds = %7, %0
%11 = phi i1 [ false, %0 ], [ %9, %7 ]
%12 = zext i1 %11 to i8
store i8 %12, i8* %4, align 1
ret i32 0
}
but I tried this one,
#include <iostream>
int main(){
int a = 10;
int b = 10;
int c;
c = a && b;
return 0;
}
and I get this
define dso_local i32 @main() #4 {
%1 = alloca i32, align 4
%2 = alloca i32, align 4
%3 = alloca i32, align 4
%4 = alloca i32, align 4
store i32 0, i32* %1, align 4
store i32 10, i32* %2, align 4
store i32 10, i32* %3, align 4
%5 = load i32, i32* %2, align 4
%6 = icmp ne i32 %5, 0
br i1 %6, label %7, label %10
7: ; preds = %0
%8 = load i32, i32* %3, align 4
%9 = icmp ne i32 %8, 0
br label %10
10: ; preds = %7, %0
%11 = phi i1 [ false, %0 ], [ %9, %7 ]
%12 = zext i1 %11 to i32
store i32 %12, i32* %4, align 4
ret i32 0
}
I use LLVM 10 on Ubuntu. I'd appreciate any answers or suggestions.
There is no LLVM instruction that specifically corresponds to the && operator. It can and will be translated in different ways depending on the expression and the optimization settings.
When you have optimizations enabled, the operands are free of side effects (and not expensive to evaluate), and the whole expression can't be optimized away, clang will usually convert both operands to i1 and combine them with the and instruction.
When optimizations are disabled or the operands have side effects, it'll usually be translated using branch instructions. That's the case in the two examples you posted.
Note that expr1 && expr2 is semantically equivalent to expr1 ? expr2 : false and you'll generally get the same LLVM code for both.
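The equivalence can be checked at the source level with a small sketch. The helper rhs and the counter rhs_calls are hypothetical names introduced only to make the short-circuit behaviour observable.

```cpp
#include <cassert>

// Hypothetical helper: records how often the right-hand operand is evaluated.
static int rhs_calls = 0;
static bool rhs(bool v) {
    ++rhs_calls;
    return v;
}

// Both functions evaluate rhs(b) only when a is true, which is why a
// front end can lower them to the same branch-and-phi pattern.
bool with_and(bool a, bool b) { return a && rhs(b); }
bool with_ternary(bool a, bool b) { return a ? rhs(b) : false; }
```

Because the two forms behave identically, a pass working on the IR has no reliable way to tell which one the programmer wrote.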
If you're okay with treating expr1 ? expr2 : false and other equivalent code (for example using if statements) the same as &&, you can try to detect the branching pattern created by them. If you need your pass to also be applicable after optimizations, you'll also have to detect at least the pattern of converting to i1 and anding.
If you only want your transformation to apply to && and nothing else, you simply can't do it at the LLVM level. You'd need an AST transformation at the Clang level.

Compiler error in ndk and clang++ for ARM?

Please consider following code:
float test(int len, int* tab)
{
for(int i = 0; i<len; i++)
tab[i] = i;
}
Obviously the return statement is missing. In this scenario, both clang and the NDK compiler targeting ARM generate an infinite loop. The disassembly shows that the compiler emits an unconditional branch instead of a conditional one.
mov r0, #0
.LBB0_1:
str r0, [r1, r0, lsl #2]
add r0, r0, #1
b .LBB0_1
The example with an error can be found here: https://godbolt.org/z/YDSFw-
Please note that the C++ specification states that a missing return is undefined behaviour, but it refers only to the returned value. It should not affect the preceding instructions.
Am I missing something here? Any thoughts?
No, you can't reason that way with undefined behaviour.
The compiler is free to use undefined behaviour and assumptions around it for optimizations. The compiler is free to assume your code will not contain undefined behaviour.
In this case, the compiler can assume that the code with undefined behaviour won't be reached. As the end of the function contains undefined behaviour, the compiler concludes that the end of the function actually never will be reached, and thus can optimize the loop.
If you remove the -Oz and add -emit-llvm to the compiler explorer command, you'll see what LLVM IR clang produces originally, when not doing optimizations:
https://godbolt.org/z/-dbeNj
define dso_local float @_Z4testiPi(i32 %0, i32* %1) #0 {
%3 = alloca i32, align 4
%4 = alloca i32*, align 4
%5 = alloca i32, align 4
store i32 %0, i32* %3, align 4
store i32* %1, i32** %4, align 4
store i32 0, i32* %5, align 4
br label %6
6: ; preds = %15, %2
%7 = load i32, i32* %5, align 4
%8 = load i32, i32* %3, align 4
%9 = icmp slt i32 %7, %8
br i1 %9, label %10, label %18
10: ; preds = %6
%11 = load i32, i32* %5, align 4
%12 = load i32*, i32** %4, align 4
%13 = load i32, i32* %5, align 4
%14 = getelementptr inbounds i32, i32* %12, i32 %13
store i32 %11, i32* %14, align 4
br label %15
15: ; preds = %10
%16 = load i32, i32* %5, align 4
%17 = add nsw i32 %16, 1
store i32 %17, i32* %5, align 4
br label %6
18: ; preds = %6
call void @llvm.trap()
unreachable
}
The loop's exit block, label 18, contains unreachable. The optimizer can use this to conclude that the exit is never taken, which lets it get rid of the comparison and the conditional branch at the start of the loop.
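One way to avoid the problem, sketched under the assumption that the caller does not need a result, is to declare the function void so that flowing off the end is well defined. The name fill is hypothetical. If a float result is actually needed, adding an explicit return statement serves the same purpose.

```cpp
#include <cassert>

// Same loop as in the question, but with a void return type, so reaching
// the end of the function is no longer undefined behaviour and the
// compiler has to keep the loop's exit condition.
void fill(int len, int* tab) {
    for (int i = 0; i < len; i++)
        tab[i] = i;
}
```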
Edit:
There's an excellent blog post from John Regehr about how to reason around undefined behaviour in C and C++. It's a bit long but well worth a read.

getting block names for LLVM IR parser

I'm writing an LLVM parser to analyse whether a program adheres to a certain programming paradigm. To do that, I need to analyse each block of the IR and check certain instructions. When I created the .ll file, I didn't see label names but an address:
; <label>:4 ; preds = %0
%5 = load i32* %c, align 4
%6 = add nsw i32 %5, 10
store i32 %6, i32* %c, align 4
br label %10
; <label>:7 ; preds = %0
%8 = load i32* %c, align 4
%9 = add nsw i32 %8, 15
store i32 %9, i32* %c, align 4
br label %10
; <label>:10 ; preds = %7, %4
%11 = load i32* %1
ret i32 %11
What I need is to get these "labels" into a list. I have also seen that some .ll files have the following format:
if.then: ; preds = %entry
%5 = load i32* %c, align 4
%6 = add nsw i32 %5, 10
store i32 %6, i32* %c, align 4
br label %10
if.else: ; preds = %entry
%8 = load i32* %c, align 4
%9 = add nsw i32 %8, 15
store i32 %9, i32* %c, align 4
br label %10
if.end: ; preds = %if.else,
%11 = load i32* %1
ret i32 %11
With the second format, I can use getName() to get the name of the block, e.g. 'if.then', 'if.else', etc.
With the first format that is impossible, because the block doesn't have a name. However, I tested printAsOperand(errs(), true), which prints the addresses like '%4', '%7', '%10'. My question is: how can I add these addresses (or operands) to a list of strings, or obtain these values and assign them to a variable?
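Here's the way to do it: pass a raw_string_ostream to printAsOperand() to capture the required address in a variable.
The following is the method I used for the purpose:
#include "llvm/IR/BasicBlock.h"
#include "llvm/Support/raw_ostream.h"
#include <string>
using namespace llvm;

// Returns the block's operand form (e.g. "%4") as a string.
std::string get_block_reference(BasicBlock *BB){
std::string block_address;
raw_string_ostream string_stream(block_address);
// The second argument (PrintType) is false, so the "label" type is omitted.
BB->printAsOperand(string_stream, false);
return string_stream.str();
}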
Instruction and basic block names are a debugging feature that simplifies the development of IR-level passes, but no guarantees are made about them. For example, they could simply be stripped off, or they could be misleading. You should not rely on them for anything meaningful (and in general they may not have any connection to the original source code). Normally the names are not generated in Release builds of LLVM. You need to build everything in Debug (or Release+Assertions) mode.

About Variables Used Within BasicBlock

I want to ask a question about the LLVM IR language. Within a basic block, variables are always loaded before use and stored after use. Two example basic blocks are as follows:
%1 = alloca i32, align 4
%2 = alloca i32, align 4
%3 = alloca i8**, align 8
%i = alloca i32, align 4
%fact = alloca i32, align 4
%n = alloca i32, align 4
store i32 0, i32* %1
store i32 %argc, i32* %2, align 4
store i8** %argv, i8*** %3, align 8
%4 = load i8*** %3, align 8
%5 = getelementptr inbounds i8** %4, i64 1
%6 = load i8** %5, align 8
%7 = call i32 (i8*, ...)* bitcast (i32 (...)* @atoi to i32 (i8*, ...)*)(i8* %6)
store i32 %7, i32* %n, align 4
store i32 1, i32* %fact, align 4
store i32 1, i32* %i, align 4
br label %8
%9 = load i32* %i, align 4
%10 = load i32* %n, align 4
%11 = icmp sle i32 %9, %10
br i1 %11, label %12, label %19
Call the first basic block A and the second basic block B; control flows from A to B.
To use %7, the program stores %7 through the %n pointer in A, then loads from the %n pointer into %10 in B to get access to it, like this:
store i32 %7, i32* %n, align 4
%10 = load i32* %n, align 4
%11 = icmp sle i32 %9, %10
I wonder if I could just drop the store and load instructions and use the value %7 directly, as follows:
%11 = icmp sle i32 %9, %7
Is this OK? Could anyone explain the reasoning behind it?
My description may be unclear. I can explain it in more detail if you have questions about it.
Thanks
It is possible to refer to virtual registers from other basic blocks.
Since you provided an incomplete example, I can only speculate whether %7 can be used directly in the comparison:
If you optimize the code with LLVM's opt tool, the register will probably not be stored and reloaded, and the comparison will use %7 directly (or a phi function dependent on the value).
You can try the mem2reg pass. The -S flag makes opt write textual IR instead of bitcode, and recent LLVM releases spell the pass option -passes=mem2reg:
opt -S -mem2reg <your file>.ll -o <target file>.ll

How to change a do-while form loop into a while form loop in LLVM IR

How can I change a loop in do-while form into a loop in while-form in LLVM IR?
Here is a small loop example. The loops just run through a boolean array until they find the first occurrence of true. I compiled it with clang -emit-llvm to get the LLVM IR.
#include <stdio.h>
#include <string.h>
int foo(bool* start){
bool* cond = start;
while (*cond != true)
cond++;
return cond - start;
}
int bar(bool* start){
bool* cond = start;
do {
}while (*(++cond) != true);
return cond - start;
}
int main(){
bool cond[8];
memset(&cond, 0, sizeof(bool)*8);
cond[5] = true;
printf("%i %i\n", foo(cond), bar(cond));
}
The IR for the foo function (using just a while loop) looks like this:
; Function Attrs: nounwind uwtable
define i32 @_Z3fooPb(i8* %start) #0 {
%1 = alloca i8*, align 8
%cond = alloca i8*, align 8
store i8* %start, i8** %1, align 8
%2 = load i8** %1, align 8
store i8* %2, i8** %cond, align 8
br label %3
; <label>:3 ; preds = %9, %0
%4 = load i8** %cond, align 8
%5 = load i8* %4, align 1
%6 = trunc i8 %5 to i1
%7 = zext i1 %6 to i32
%8 = icmp ne i32 %7, 1
br i1 %8, label %9, label %12
; <label>:9 ; preds = %3
%10 = load i8** %cond, align 8
%11 = getelementptr inbounds i8* %10, i32 1
store i8* %11, i8** %cond, align 8
br label %3
; <label>:12 ; preds = %3
%13 = load i8** %cond, align 8
%14 = load i8** %1, align 8
%15 = ptrtoint i8* %13 to i64
%16 = ptrtoint i8* %14 to i64
%17 = sub i64 %15, %16
%18 = trunc i64 %17 to i32
ret i32 %18
}
and for bar, which is using a do while we get:
; Function Attrs: nounwind uwtable
define i32 @_Z3barPb(i8* %start) #0 {
%1 = alloca i8*, align 8
%cond = alloca i8*, align 8
store i8* %start, i8** %1, align 8
%2 = load i8** %1, align 8
store i8* %2, i8** %cond, align 8
br label %3
; <label>:3 ; preds = %4, %0
br label %4
; <label>:4 ; preds = %3
%5 = load i8** %cond, align 8
%6 = getelementptr inbounds i8* %5, i32 1
store i8* %6, i8** %cond, align 8
%7 = load i8* %6, align 1
%8 = trunc i8 %7 to i1
%9 = zext i1 %8 to i32
%10 = icmp ne i32 %9, 1
br i1 %10, label %3, label %11
; <label>:11 ; preds = %4
%12 = load i8** %cond, align 8
%13 = load i8** %1, align 8
%14 = ptrtoint i8* %12 to i64
%15 = ptrtoint i8* %13 to i64
%16 = sub i64 %14, %15
%17 = trunc i64 %16 to i32
ret i32 %17
}
The differences are very small: for bar we have one additional label and one additional br, because we jump straight to the body of the loop and execute it before we evaluate the condition.
So the first step in transforming a do-while loop is to get rid of that branch and jump to the condition instead. Now it is a while loop in which the condition is evaluated first. That part is easy. You then have two choices for handling the condition. You can try to modify the condition itself, which is a really hard task because you can put almost anything inside a loop's condition. The easy way is to copy the loop body once (everything from ; <label>:4 to ; <label>:11) and place the copy before the first branch of the loop. That way you won't change the correctness of your code, and your do-while loop becomes a while loop with one execution of the loop body in front of it.
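At the source level, copying the body once in front of the loop corresponds roughly to the sketch below. It is only an illustration of the idea, not the IR-level transformation itself, and the function names bar_do_while and bar_while are hypothetical.

```cpp
#include <cassert>

// do-while form, as in bar() from the question: increment, then test.
int bar_do_while(const bool* start) {
    const bool* cond = start;
    do {
    } while (*(++cond) != true);
    return static_cast<int>(cond - start);
}

// while form: the increment is executed once up front (the copied body),
// after which the loop tests its condition before every iteration.
int bar_while(const bool* start) {
    const bool* cond = start;
    ++cond;                      // copied first iteration
    while (*cond != true)        // condition is now evaluated first
        ++cond;
    return static_cast<int>(cond - start);
}
```

Both versions skip the first element and stop at the first true value after it, so they return the same result for any input on which the original loop terminates.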
You can copy the loop body with CloneBasicBlock from llvm/Transforms/Utils/Cloning.h:
/// CloneBasicBlock - Return a copy of the specified basic block, but without
/// embedding the block into a particular function. The block returned is an
/// exact copy of the specified basic block, without any remapping having been
/// performed. Because of this, this is only suitable for applications where
/// the basic block will be inserted into the same function that it was cloned
/// from (loop unrolling would use this, for example).
///
/// Also, note that this function makes a direct copy of the basic block, and
/// can thus produce illegal LLVM code. In particular, it will copy any PHI
/// nodes from the original block, even though there are no predecessors for the
/// newly cloned block (thus, phi nodes will have to be updated). Also, this
/// block will branch to the old successors of the original block: these
/// successors will have to have any PHI nodes updated to account for the new
/// incoming edges.
///
/// The correlation between instructions in the source and result basic blocks
/// is recorded in the VMap map.
///
/// If you have a particular suffix you'd like to use to add to any cloned
/// names, specify it as the optional third parameter.
///
/// If you would like the basic block to be auto-inserted into the end of a
/// function, you can specify it as the optional fourth parameter.
///
/// If you would like to collect additional information about the cloned
/// function, you can specify a ClonedCodeInfo object with the optional fifth
/// parameter.
///
BasicBlock *CloneBasicBlock(const BasicBlock *BB,
ValueToValueMapTy &VMap,
const Twine &NameSuffix = "", Function *F = nullptr,
ClonedCodeInfo *CodeInfo = nullptr);
I hope this is a little help. Have Fun!