It seems LLVM can now emit DDGs for each loop in a function (https://github.com/llvm/llvm-project/blob/llvmorg-12.0.1/llvm/lib/Analysis/DDGPrinter.cpp).
I can generate the CFG and the call graph with opt --dot-cfg foo.bc and opt --dot-callgraph foo.bc, but the analogous opt --dot-ddg-only foo.bc runs without generating a .dot file. I also tried opt -passes=dot-ddg foo.bc, without success.
Is there another way to invoke opt? Or does anyone have suggestions for similar tools?
C code used (foo.bc obtained with clang -c -emit-llvm foo.c -o foo.bc):
void foo(int b[], int c[], int n){
    for (int i = 1; i < n; i++) {
        b[i] = c[i] + b[i-1];
    }
}
int main(){
    int b[] = {1, 2, 3, 4, 5};
    int c[] = {1, 1, 1, 1, 1};
    foo(b, c, 5);
}
My answer is probably naive, as I am not an expert, but I came across the same problem and this is what I found. First, I compile your program to readable IR:
clang -c -S -emit-llvm foo.c -o foo.ll
Note that, with no -O flag, this is equivalent to the -O0 optimisation level.
The -debug-pass-manager option was key for me to understand why nothing was printed or written out. Call
% opt foo.ll -passes=dot-ddg -debug-pass-manager
(Note: this is using the new pass manager syntax)
In the output I noticed lines such as:
Skipping pass DDGDotPrinterPass on foo due to optnone attribute
Open the IR in foo.ll and, indeed, the optnone attribute is there, e.g.:
; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @foo(i32* noundef %b, i32* noundef %c, i32 noundef %n) #0 {
You can prevent this attribute from being added. I regenerate the IR with the option -disable-O0-optnone:
% clang -O0 -Xclang -disable-O0-optnone foo.c -emit-llvm -S -o foo.ll
and now the attribute is gone:
; Function Attrs: noinline nounwind uwtable
define dso_local void @foo(i32* noundef %b, i32* noundef %c, i32 noundef %n) #0 {
This time the pass will take effect:
% opt foo.ll -disable-output -passes=dot-ddg
Writing 'ddg.foo.for.cond.dot'
The final result, which I can't decipher yet, is the graph written to ddg.foo.for.cond.dot.
Consider the following simple function:
int foo() { return 42; }
Compiling this to LLVM via clang -emit-llvm -S foo.cpp produces the following module:
; ModuleID = 'foo.cpp'
source_filename = "foo.cpp"
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.13.0"
; Function Attrs: noinline nounwind ssp uwtable
define i32 @_Z3foov() #0 {
ret i32 42
}
attributes #0 = { noinline nounwind ssp uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="penryn" "target-features"="+cx16,+fxsr,+mmx,+sse,+sse2,+sse3,+sse4.1,+ssse3,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
!llvm.module.flags = !{!0}
!llvm.ident = !{!1}
!0 = !{i32 1, !"PIC Level", i32 2}
!1 = !{!"Apple LLVM version 9.0.0 (clang-900.0.37)"}
Why is the foo function declared as noinline? The flag is not added if an optimization level (other than -O0) is specified, but I would like to avoid that.
Is there another way / flag?
With -O0, you can't enable inlining globally, judging from Clang's source code
(Frontend\CompilerInvocation.cpp):
// At O0 we want to fully disable inlining outside of cases marked with
// 'alwaysinline' that are required for correctness.
Opts.setInlining((Opts.OptimizationLevel == 0)
? CodeGenOptions::OnlyAlwaysInlining
: CodeGenOptions::NormalInlining);
Depending on your requirements, you may:
Use -O1, which is the closest to -O0.
Use -O1 in conjunction with disabling the optimization flags that it enables. See the following answer for the optimization flags enabled with -O1: Clang optimization levels
Apply always_inline attribute selectively on functions that should be inlined.
For example: int __attribute__((always_inline)) foo() { return 42; }
The following lengthy C program generates a simple LLVM module containing a function that merely calls llvm.x86.sse41.round.ps. It emits the bitcode file and then runs the code generated by LLVM. My question is: how do I find out the target triple and the instruction extensions (such as SSE or AVX) of the host machine, and how do I add this information to the LLVM module or otherwise pass it to the LLVM execution engine? Here is what I do:
$ cat ctest/avx-instruction-selection.c
#include <llvm-c/Core.h>
#include <llvm-c/Target.h>
#include <llvm-c/ExecutionEngine.h>
#include <llvm-c/BitWriter.h>
#include <llvm-c/Transforms/Scalar.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#if 1
const int vectorSize = 4;
const char* roundName = "llvm.x86.sse41.round.ps";
#else
const int vectorSize = 8;
const char* roundName = "llvm.x86.avx.round.ps.256";
#endif
int main ()
{
LLVMModuleRef module;
LLVMExecutionEngineRef execEngine;
LLVMTargetDataRef targetData;
LLVMTypeRef floatType, vectorType, ptrType, voidType, funcType, roundType, int32Type;
LLVMValueRef func, roundFunc;
LLVMValueRef param, loaded, const1, callRound;
LLVMBuilderRef builder;
LLVMBasicBlockRef block;
const int false = 0;
LLVMInitializeX86TargetInfo();
LLVMInitializeX86Target();
LLVMInitializeX86TargetMC();
module = LLVMModuleCreateWithName("_module");
LLVMSetTarget(module, "x86_64-unknown-linux-gnu");
floatType = LLVMFloatType();
vectorType = LLVMVectorType(floatType, vectorSize);
ptrType = LLVMPointerType(vectorType, 0);
voidType = LLVMVoidType();
LLVMTypeRef roundParams[] = { ptrType };
roundType = LLVMFunctionType(voidType, roundParams, 1, false);
func = LLVMAddFunction(module, "round", roundType);
LLVMSetLinkage(func, LLVMExternalLinkage);
builder = LLVMCreateBuilder();
block = LLVMAppendBasicBlock(func, "_L1");
LLVMPositionBuilderAtEnd(builder, block);
param = LLVMGetParam(func, 0);
loaded = LLVMBuildLoad(builder, param, "");
int32Type = LLVMIntType(32);
LLVMTypeRef funcParams[] = { vectorType, int32Type } ;
funcType = LLVMFunctionType(vectorType, funcParams, 2, false);
roundFunc = LLVMAddFunction(module, roundName, funcType);
LLVMSetLinkage(roundFunc, LLVMExternalLinkage);
const1 = LLVMConstInt(int32Type, 1, false);
LLVMValueRef callParams [] = { loaded, const1 } ;
callRound = LLVMBuildCall(builder, roundFunc, callParams, 2, "");
LLVMSetInstructionCallConv(callRound, 0);
LLVMAddInstrAttribute(callRound, 0, 0);
LLVMBuildStore(builder, callRound, param);
LLVMBuildRetVoid(builder);
LLVMWriteBitcodeToFile(module, "round-avx.bc");
char *errorMsg;
LLVMCreateExecutionEngineForModule(&execEngine, module, &errorMsg);
targetData = LLVMGetExecutionEngineTargetData(execEngine);
size_t vectorSize0 = LLVMStoreSizeOfType(targetData, vectorType);
size_t vectorAlign = LLVMABIAlignmentOfType(targetData, vectorType);
float vector[vectorSize];
printf("%lx, size %lx, align %lx\n", (size_t)vector, vectorSize0, vectorAlign);
LLVMGenericValueRef genericVector = LLVMCreateGenericValueOfPointer(vector);
LLVMGenericValueRef runParams[] = { genericVector } ;
LLVMRunFunction(execEngine, func, 1, runParams);
return 0;
}
$ gcc -Wall -o ctest/avx-instruction-selection ctest/avx-instruction-selection.c `/usr/lib/llvm-3.4/bin/llvm-config --cflags --ldflags` -lLLVM-3.4
$ ctest/avx-instruction-selection
7fff590431c0, size 10, align 10
$ ls round-avx.bc
round-avx.bc
$ llvm-dis -o - round-avx.bc
; ModuleID = 'round-avx.bc'
target triple = "x86_64-unknown-linux-gnu"
define void @round(<4 x float>*) {
_L1:
%1 = load <4 x float>* %0
%2 = call <4 x float> @llvm.x86.sse41.round.ps(<4 x float> %1, i32 1)
store <4 x float> %2, <4 x float>* %0
ret void
}
; Function Attrs: nounwind readnone
declare <4 x float> @llvm.x86.sse41.round.ps(<4 x float>, i32) #0
attributes #0 = { nounwind readnone }
$ gcc -Wall -o ctest/avx-instruction-selection ctest/avx-instruction-selection.c `/usr/lib/llvm-3.5/bin/llvm-config --cflags --ldflags` -lLLVM-3.5
$ ctest/avx-instruction-selection
7ffed6170350, size 10, align 10
LLVM ERROR: Cannot select: intrinsic %llvm.x86.sse41.round.ps
$ gcc -Wall -o ctest/avx-instruction-selection ctest/avx-instruction-selection.c `/usr/lib/llvm-3.6/bin/llvm-config --cflags --ldflags` -lLLVM-3.6
$ ctest/avx-instruction-selection
7ffeae91eb40, size 10, align 10
LLVM ERROR: Target does not support MC emission!
$ gcc -Wall -o ctest/avx-instruction-selection ctest/avx-instruction-selection.c `/usr/lib/llvm-3.7/bin/llvm-config --cflags --ldflags` -lLLVM-3.7
$ ctest/avx-instruction-selection
7fffb6464ea0, size 10, align 10
LLVM ERROR: Target does not support MC emission!
$ gcc -Wall -o ctest/avx-instruction-selection ctest/avx-instruction-selection.c `/usr/lib/llvm-3.8/bin/llvm-config --cflags --ldflags` -lLLVM-3.8
$ ctest/avx-instruction-selection
7ffd5e233000, size 10, align 10
LLVM ERROR: Target does not support MC emission!
Summarized: With LLVM-3.4 the example works, with LLVM-3.5 the intrinsic function round.ps cannot be found and LLVM-3.6 and later say something about MC emissions that I do not understand.
As I understand, LLVM-3.5 does not find the round.ps intrinsic and I guess that it cannot find it because I have not told it about the existing SSE extension. When running llc I can add the option -mattr=sse4.1 but how can I tell it to the execution engine?
Second question: How can I find out about the available instruction extensions like SSE of the host machine via the LLVM-C API? On x86 I can call the CPUID instruction but is there a way that works uniformly on all platforms and can LLVM assist detection of extensions?
Third question: I have hard-coded the target triple into the C code. How can I find out the host target-triple via the LLVM-C API?
Last question: What about this MC emission error?
After a lot of experimenting, I think the answer is as follows:
Replace the lines
LLVMInitializeX86TargetInfo();
LLVMInitializeX86Target();
LLVMInitializeX86TargetMC();
by
LLVMInitializeNativeTarget();
LLVMInitializeNativeAsmPrinter();
LLVMInitializeNativeAsmParser();
Replace the call of LLVMCreateExecutionEngineForModule by a call to the custom function LLVMCreateExecutionEngineForModuleCPU. It is the original implementation of LLVMCreateExecutionEngineForModule plus a call of setMCPU.
#define LLVM_VERSION (LLVM_VERSION_MAJOR * 100 + LLVM_VERSION_MINOR)
LLVMBool LLVMCreateExecutionEngineForModuleCPU
(LLVMExecutionEngineRef *OutEE,
LLVMModuleRef M,
char **OutError) {
std::string Error;
#if LLVM_VERSION < 306
EngineBuilder builder(unwrap(M));
#else
EngineBuilder builder(std::unique_ptr<Module>(unwrap(M)));
#endif
builder.setEngineKind(EngineKind::Either)
.setMCPU(sys::getHostCPUName().data())
.setErrorStr(&Error);
if (ExecutionEngine *EE = builder.create()){
*OutEE = wrap(EE);
return 0;
}
*OutError = strdup(Error.c_str());
return 1;
}
I should also add
float vector[vectorSize] __attribute__((aligned(32)));
in order to align the array for AVX vectors.
According to an answer in the thread "crash JIT with AVX intrinsics", LLVMRunFunction is restricted to main-like prototypes (apparently only in MCJIT). Thus we should also replace the LLVMRunFunction call by
void (*funcPtr) (float *);
funcPtr = LLVMGetPointerToGlobal(execEngine, func);
funcPtr(vector);
I am compiling this:
int main(){
}
With clang, using this command line:
clang++.exe -S -o %OUTFILE%.clang -emit-llvm %INFILE% -I. -I%INCLUDE_PATH% -std=c++14 -ftemplate-depth=1000
Which gives me llvm byte-code.
Then I use llc like so, to convert the byte-code into c code:
llc "%IN_FILE%.clang" -march=c -o foo.c
And get this error:
error: unterminated attribute group
attributes #0 = { norecurse nounwind uwtable "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2" "unsafe-fp-math"="false" "use-soft-float"="false" }
What am I doing wrong?
This is what clang++ is giving me:
; ModuleID = 'C:\Users\Owner\Documents\C++\SVN\prototypeInd\util\StaticBigInt.cpp'
target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-windows-msvc18.0.0"
; Function Attrs: norecurse nounwind readnone uwtable
define i32 @main() #0 {
entry:
ret i32 0
}
attributes #0 = { norecurse nounwind readnone uwtable "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2" "unsafe-fp-math"="false" "use-soft-float"="false" }
!llvm.module.flags = !{!0}
!llvm.ident = !{!1}
!0 = !{i32 1, !"PIC Level", i32 2}
!1 = !{!"clang version 3.8.0 (branches/release_38)"}
Note:
I am using clang version 3.8 and llc version 3.4
When you run a command such as:
clang -S -emit-llvm ...
Then the compiler emits the IR not in bitcode form but in its human-readable textual representation.
That makes sense if all the tools you use are the same version, or if you just want to inspect the output manually.
However, the human-readable IR may be incompatible with older tools.
In this case I recommend using bitcode directly (note that there is no -S parameter anymore; -c makes clang stop before linking):
clang -c -emit-llvm ...
The C backend in LLVM was removed several years ago. It seems that you are trying to feed LLVM IR from a recent LLVM version to llc from an old one. This is not supported: the textual IR is not compatible between these versions.
I am trying to call printf to print a float number from LLVM. While it works fine with int, it segfaults when using double.
Here is the code (generated from clang but slightly modified so that it works fine with llc) :
@.str = private unnamed_addr constant [3 x i8] c"%f\00", align 1
; Function Attrs: nounwind uwtable
define i32 @main() #0 {
%1 = alloca i32, align 4
store i32 0, i32* %1
%2 = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([3 x i8]* @.str, i32 0, i32 0), double 3.140000e+00)
ret i32 0
}
declare i32 @printf(i8*, ...) #1
attributes #0 = { nounwind uwtable "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+sse,+sse2" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+sse,+sse2" "unsafe-fp-math"="false" "use-soft-float"="false" }
Here is how I produce an executable:
llc main.ll --filetype=obj
ld -lc -e main -dynamic-linker /lib64/ld-linux-x86-64.so.2 main.o -o main
When I run main within valgrind, I get:
Process terminating with default action of signal 11 (SIGSEGV): dumping core
General Protection Fault
I read somewhere on this site that I need to align the stack to use printf.
If this is the issue, how can I do this in LLVM?
Otherwise, what is causing this segfault?
I run Linux 64-bits.
This is not an LLVM issue.
When you ran
llc main.ll --filetype=obj
ld -lc -e main -dynamic-linker /lib64/ld-linux-x86-64.so.2 main.o -o main
you created the object file main.o and then linked it while telling the linker that main is the entry point for execution, which is not correct.
From the C programmer's point of view main is where the program starts, but in reality it is not. The toolchain adds startup code whose entry symbol is _start; that function runs first and in turn calls main. When main is entered directly, the stack is not in the state a called function expects, which is the likely cause of the fault in the floating-point path of printf.
The startup code lives in relocatable object files that the compiler driver passes to the linker (search for crti.o, crtn.o and crt1.o). In addition, printf lives in the C library (libc.so or libc.a), which you need to provide when linking.
If you want the easy solution, use gcc to turn the object file into an executable:
gcc main.o -o main
You can also have a look here for more clarification:
http://blog.techveda.org/building-executables-with-gnu-linker/
I noticed that llvm.read_register() can read the value of the stack pointer and that llvm.write_register() can set it. I added a main function to stackpointer.ll, which can be found in the LLVM source:
;stackpointer.ll
define i32 @get_stack() nounwind {
%sp = call i32 @llvm.read_register.i32(metadata !0)
ret i32 %sp
}
declare i32 @llvm.read_register.i32(metadata) nounwind
!0 = metadata !{metadata !"sp\00"}
define i32 @main() {
%1 = call i32 @get_stack()
ret i32 %1
}
I tested on an armv7 board running ubuntu 11.04:
lli stackpointer.ll
then, I get a stack dump:
ARMCodeEmitter::emitPseudoInstruction
UNREACHABLE executed at ARMCodeEmitter.cpp:847!
Stack dump:
0. Program arguments: lli stackpointer.ll
1. Running pass 'ARM Machine Code Emitter' on function '@main'
Aborted
I also tried llc:
llc stackpointer.ll -o stackpointer.s
The error message:
Can't get register for value!
UNREACHABLE executed at ARMCodeEmitter.cpp:1183!
Stack dump:
0. Program arguments: llc stackpointer.ll -o stackpointer.s
1. Running pass 'Function Pass Manager' on module 'stackpointer.ll'
2. Running pass 'ARM Instruction Selection' on function '@get_stack'
Aborted
I also tried on an x86-64 platform, and it didn't work either. What is the correct way to use these intrinsics?
My lli didn't like your metadata definition.
I changed your
!0 = metadata !{metadata !"sp\00"}
to
!0 = !{!"sp\00"}
And it worked. (Well, since I'm on x86-64, I also changed i32 to i64 and sp to rsp everywhere.)
Plus, there were bad whitespace characters in your formatting, but I think that might be due to StackOverflow/HTML or something.