What is the proper way to load a set of pre-generated LLVM IR and make it available to runtime JIT modules such that the same types aren't given new names and inlining and const propagation can still take place?
My attempt so far:
I compile those C functions to LLVM IR offline via clang -c -emit-llvm -S -ffast-math -msse2 -O3 -o MyLib.ll MyLib.c
For each runtime JIT module, I load the generated LLVM IR via llvm::parseIRFile() and "paste" it into the runtime JIT module via llvm::Linker::linkModules().
This works fine for the first JIT module, but not for subsequent JIT modules. Each time llvm::parseIRFile() is called, the resulting module's type definitions are given new names.
For example, offline MyLib.ll looks like this:
%struct.A = type { <4 x float> }
define <4 x float> @"Foo(A*)"(%struct.A* nocapture readonly)
{
...
}
The resulting module from the first call to llvm::parseIRFile() looks exactly like the offline version. The resulting module from the second call to llvm::parseIRFile() instead looks like:
%struct.A.1 = type { <4 x float> }
define <4 x float> @"Foo(A*)"(%struct.A.1* nocapture readonly)
{
...
}
Note that %struct.A was renamed to %struct.A.1. The runtime JIT module continues to generate code using %struct.A, and thus fails to call Foo(), which now takes %struct.A.1 instead.
Related
I would like to add a method to my class using assembly language. How can I do it?
example:
main.cpp
Struct ex {
int field1;
asm_method(char*);
}
add.asm
asm_method:
//some asm code
Get asm output the compiler generates for a non-inline definition of the C++ member function, and use that as a starting point for an asm source file. This works for any ISA with any compiler that can emit valid asm (which is most of them, although apparently MSVC emits a bunch of extra junk that you have to remove.)
Example with GCC (for x86-64 GNU/Linux, but works anywhere)
Also works with clang.
e.g. g++ -O3 -fverbose-asm -masm=intel -S -o foo_func.S foo.cpp (How to remove "noise" from GCC/clang assembly output?)
That .S file is now your asm source file. Remove the compiler-generated instruction lines and insert your own.
Obviously you need to know the calling convention and other stuff like that (e.g. for x86 see https://www.agner.org/optimize/#manuals for a calling convention guide), but this will get the compiler to do the name mangling for you, for that specific target platform's ABI.
struct ex { // lower case struct not Struct
int field1;
void *asm_method(char*); // methods need a return type
}; // struct declarations end with a ;
void *ex::asm_method(char*) {
return this; // easy way to find out what register `this` is passed in.
}
compiles as follows for x86-64 System V, with g++ -O3 (Godbolt with Linux gcc and Windows MSVC)
# x86-64 System V: GNU/Linux g++ -O3
# This is GAS syntax
.intel_syntax noprefix
.text # .text section is already the default at top of file
.align 2
.p2align 4 # aligning functions by 16 bytes is typical
.globl _ZN2ex10asm_methodEPc # the symbol is global, not private to this file
.type _ZN2ex10asm_methodEPc, @function # (optional) and it's a function.
_ZN2ex10asm_methodEPc: # a label defines the symbol
.cfi_startproc
## YOUR CODE GOES HERE ##
## RSP-8 is aligned by 16 in x86-64 SysV and Windows ##
mov rax, rdi # copy first arg (this) to return-value register.
ret # pop into program counter
.cfi_endproc
.size _ZN2ex10asm_methodEPc, .-_ZN2ex10asm_methodEPc # maybe non-optional for dynamic linking
It's probably fine to omit the .cfi stack-unwind directives from hand-written asm for leaf functions, since you're not going to be throwing C++ exceptions from hand-written asm (I hope).
This depends on your target platform and compiler/toolchain and is generally too broad a question for StackOverflow.
For example, the C++ compiler in the GCC toolchain actually generates assembly from C++, and then produces object files from that assembly. Then the linker links together multiple object files to produce an ELF module.
You can bypass the C++ compilation step for a single object file and directly write an assembly (.S) file.
You can assemble it the same way you compile a .c file: gcc -c myfile.S -o myfile.o.
Though you should take platform ABI into account such that you can accept function arguments and return values via the correct registers. The platform ABI also specifies the calling convention and which registers should be preserved across function calls. Finally, you need to produce correct function names according to C++ name mangling rules, or use C naming rules (which are simpler) and declare your function extern "C".
For more details see C++ to ASM linkage and for Linux ABI refer to System V ABI.
For Windows start here: calling conventions and compiling assembly in Visual Studio.
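The extern "C" route from the paragraph above can be sketched like this (the symbol name ex_asm_method is made up for illustration; for the sketch the "asm" body is written in C++, but in practice it would live in add.asm and simply be linked in):

```cpp
// The class keeps a normal C++ method...
struct ex {
    int field1;
    void *asm_method(char *);
};

// ...which forwards to an unmangled, plain-C symbol that the asm file
// can define without reproducing the C++ name-mangling scheme.
// `this` is passed explicitly as the first argument.
extern "C" void *ex_asm_method(ex *self, char *arg);

void *ex::asm_method(char *arg) { return ex_asm_method(this, arg); }

// Stand-in for the asm implementation: just returns `this`.
extern "C" void *ex_asm_method(ex *self, char *) { return self; }
```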
I am new to clang and LLVM. I'm trying to generate unoptimized bitcode from C source code. I found that the generated bitcode has constant folding applied, which I don't want.
I'm using this command: clang -O0 -Xclang -disable-O0-optnone test1.c -S -emit-llvm -o test1.ll
The test1.c file has the following code:
int test() {
int y;
y = 2 * 4;
return y;
}
The content of the test1.ll file:
Instead of generating an instruction that multiplies 2 and 4, it directly stores the value 8, the result of constant folding:
store i32 8, i32* %1, align 4
It would be really nice if someone could kindly let me know what I am missing and how I should turn off the constant folding optimization. The version of LLVM I am using is 6.0.0.
Thank you.
It is a Clang feature and can't be turned off, even with -O0. To work around it, try making the variables global, passing them as parameters to the function, or just writing the IR manually.
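The parameter workaround from the answer above can be sketched like this: once the operands reach the function as parameters, their values are unknown at compile time, so clang's built-in folder has to emit a real multiply even at -O0.

```cpp
// Unlike `y = 2 * 4`, this cannot be folded to a constant store:
// a and b are only known at run time.
int test(int a, int b) {
    int y;
    y = a * b;  // emitted as a mul instruction at -O0
    return y;
}
```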
I'm working on a compiler for a small language. Inside the compiler, I'm using the LLVM C++ API to generate LLVM code, similar to the LLVM Kaleidoscope tutorial. So I'm using TheModule, TheContext, BasicBlocks, and calls to Builder.Create...().
I can currently generate valid llvm code for arithmetic, control flow, and methods. However, I would also like my small language to support very simple OpenMP pragmas. For example,
#pragma omp parallel
{
print "Hello World"
}
I've tried writing a similar program in C++,
#include <iostream>
int main() {
#pragma omp parallel
{
std::cout << "Hi";
}
}
and generating llvm using clang++ -S -emit-llvm file.cpp -fopenmp. Along with the rest of the code, this generates the following lines which seem to implement the OpenMP functionality:
declare void @__kmpc_fork_call(%ident_t*, i32, void (i32*, i32*, ...)*, ...)
define internal void @.omp_outlined.(...)
From researching these statements, I found the Clang OpenMP API that contains calls like
OMPParallelDirective * OMPParallelDirective::Create(...)
I'm guessing this is what the Clang compiler uses to generate the statements above. However, it seems to be separate from the LLVM C++ API, as it doesn't reference TheContext, TheModule, etc...
So my question: is there any way to leverage the Clang OpenMP API calls together with my LLVM C++ API calls to generate the @__kmpc_fork_call and @.omp_outlined. IR needed for parallel computation?
I did try decompiling the llvm generated from the C++ code back into LLVM C++ API code using llc -march=cpp file.bc ... but was unsuccessful.
The APIs you found operate on the clang AST and are hardly usable outside clang. In fact, there are no OpenMP constructs at the LLVM IR level: by the time IR is emitted, everything has already been lowered to runtime calls, outlined functions, etc.
So you'd really need to implement code generation for OpenMP yourself, emitting the runtime calls as necessary (and per your language's semantics).
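The lowering the answer describes can be sketched in plain C++: the body of the parallel region is outlined into its own function (clang names it .omp_outlined.) and handed to a fork call. Here fork_call_stub is a local, deliberately simplified stand-in for __kmpc_fork_call that runs the body once per "thread" serially, just to show the shape of the generated code.

```cpp
#include <cstdio>

// Signature of the outlined region body (simplified: the real one also
// takes a bound-thread id and the shared variables).
typedef void (*outlined_fn)(int *tid);

static int run_count = 0;

// Stand-in for __kmpc_fork_call: invokes the outlined body nthreads times.
static void fork_call_stub(outlined_fn fn, int nthreads) {
    for (int tid = 0; tid < nthreads; ++tid)
        fn(&tid);
}

// What the body of `#pragma omp parallel { print "Hello World" }`
// becomes after outlining.
static void omp_outlined(int *tid) {
    std::printf("Hello World from %d\n", *tid);
    ++run_count;
}
```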
I'm writing my own language with LLVM, and I'm using external C functions from the standard library as well as custom ones. I'm currently adding declarations via the LLVM C++ API, like this:
void register_malloc(llvm::Module *module) {
    std::vector<llvm::Type*> arg_types;
    arg_types.push_back(llvm::Type::getInt32Ty(llvm::getGlobalContext()));

    llvm::FunctionType* type = llvm::FunctionType::get(
        llvm::Type::getInt8PtrTy(llvm::getGlobalContext()), arg_types, false);

    llvm::Function *func = llvm::Function::Create(
        type, llvm::Function::ExternalLinkage,
        llvm::Twine("malloc"),
        module
    );
    func->setCallingConv(llvm::CallingConv::C);
}

void register_printf(llvm::Module *module) {
    std::vector<llvm::Type*> printf_arg_types;
    printf_arg_types.push_back(llvm::Type::getInt8PtrTy(llvm::getGlobalContext()));

    llvm::FunctionType* printf_type = llvm::FunctionType::get(
        llvm::Type::getInt32Ty(llvm::getGlobalContext()), printf_arg_types, true);

    llvm::Function *func = llvm::Function::Create(
        printf_type, llvm::Function::ExternalLinkage,
        llvm::Twine("printf"),
        module
    );
    func->setCallingConv(llvm::CallingConv::C);
}
I'm going to define tens of external functions; is there an easy way to declare them all?
I've thought about "including" a C header (or an LLVM IR .ll file) into the module, but I couldn't find any example of how to do this...
Create an empty C source file and include every header you need, then compile it to LLVM IR with clang -S -emit-llvm. That IR will contain declarations for every function in those headers. Now run llc -march=cpp out.ll and it will produce C++ source that calls the LLVM API to generate the given IR. You can copy-paste this code into your program.
Make sure the cpp backend was enabled when LLVM was built. (Note: the cpp backend was removed in LLVM 3.9, so this only works with older releases.)
I'm trying to read and call a function parsed from LLVM bitcode in LLVM 2.8. I have everything working apart from the actual call, which crashes the program.
First I have this C code:
void hello() {}
I've compiled this with:
llvm-gcc -c -emit-llvm hello.c -o hello.bc
Here's a trimmed down version of the code that's supposed to read it:
using namespace std;
using namespace llvm;
void callFunction(string file, string function) {
    InitializeNativeTarget();
    LLVMContext context;
    string error;

    MemoryBuffer* buff = MemoryBuffer::getFile(file);
    Module* m = getLazyBitcodeModule(buff, context, &error);
    // Check the module parsed here.
    // ...

    ExecutionEngine* engine = ExecutionEngine::create(m);
    // Check the engine started up correctly here.
    // ...

    Function* func = m->getFunction(function);
    // Check the function was found here.
    // ...

    vector<GenericValue> args(0);
    // This is what crashes.
    engine->runFunction(func, args);
}
I've included plenty of LLVM headers, including ExecutionEngine/JIT.h, and the code checks at each step to make sure values aren't NULL. It parses the bitcode, and I have examined the function it finds to confirm it was as expected.
I've also tried building a module and function myself, which works as expected, so the problem definitely arises from the fact that the function is produced by the bitcode.
I've managed to get this running as expected. I was curious if the problem lay in the above process, but this is obviously not the case. The system I was running this as a part of was causing the crash, and the code above does work on its own.