LLVM Insert function call into another function - c++

I am trying to insert a function call into the main function so that the function runs automatically when I execute the generated binary. The language I am trying to "compile" looks like a scripting language:
function foo () begin 3 end;
function boo () begin 4 end;
writeln (foo()+boo()) ;
writeln (8) ;
writeln (9) ;
where writeln is a function available by default. After running the binary I expect to see 7 8 9. Is there a way to insert the last function call right before the return statement of the main function?
Right now I have
define i32 @main() {
entry:
ret i32 0
}
and I want to have something like
define i32 @main() {
entry:
%calltmp = call double @writeln(double 7.000000e+00)
%calltmp1 = call double @writeln(double 8.000000e+00)
%calltmp2 = call double @writeln(double 9.000000e+00)
ret i32 0
}
Editing the IR file manually and compiling it afterwards works, but I want to do it in the codegen part of my code.
Edit:
What I generate right now is
define double @__anon_expr() {
entry:
%main = call double @writeln(double 3.000000e+00)
ret double %main
}
define i32 @main() {
entry:
ret i32 0
}
so when I execute the binary, nothing happens.

Feel free to take inspiration from this:
// Create the function `i32 main()` with no parameters.
Type * returnType = Type::getInt32Ty(TheContext);
std::vector<Type *> argTypes;
FunctionType * functionType = FunctionType::get(returnType, argTypes, false);
Function * function = Function::Create(functionType, Function::ExternalLinkage, "main", TheModule.get());
// Add an entry block and point the builder at it.
BasicBlock * BB = BasicBlock::Create(TheContext, "entry", function);
Builder.SetInsertPoint(BB);
// Emit the call to writeln before the return.
std::vector<Value *> args;
args.push_back(ConstantFP::get(TheContext, APFloat(4.0)));
Builder.CreateCall(getFunction("writeln"), args, "call");
// Finish main with `ret i32 0`.
Value * returnValue = Builder.getInt32(0);
Builder.CreateRet(returnValue);
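One way to get every top-level statement into main is to buffer the statements while parsing and emit them all just before the return. The sketch below only illustrates that ordering by building IR text with plain strings instead of the IRBuilder; MainEmitter, addWriteln and emit are hypothetical names, and it assumes each top-level statement has already been folded to a constant double.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: remembers the argument of each top-level writeln
// and emits the calls inside @main, before the final ret.
struct MainEmitter {
    std::vector<double> pending;  // arguments in source order

    void addWriteln(double v) { pending.push_back(v); }

    // Build the textual IR for @main with all buffered calls.
    std::string emit() const {
        std::string ir = "define i32 @main() {\nentry:\n";
        for (std::size_t i = 0; i < pending.size(); ++i) {
            // Give each call result a unique name, as SSA form requires.
            ir += "  %call" + std::to_string(i) +
                  " = call double @writeln(double " +
                  std::to_string(pending[i]) + ")\n";
        }
        ir += "  ret i32 0\n}\n";
        return ir;
    }
};
```

With the IRBuilder the same idea applies: keep the insert point in main's entry block, create one call per buffered statement, and create the ret last.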

LLVM error: invalid redefinition of function

I'm writing an LLVM IR generator for a pseudocode language. This language should allow function overloading.
Here is a case where I have two functions, both named "f", with different parameters.
function f(int i, float r) returns int { return i; }
function f(float r, float r2) returns int {return i; }
I thought LLVM could distinguish them, but I get
error: invalid redefinition of function
And the code I generated is:
define i32 @f(i32 %i, float %r) {
%var.i.0 = alloca i32
store i32 %i, i32* %var.i.0
%var.r.1 = alloca float
store float %r, float* %var.r.1
%int.2 = load i32* %var.i.0
ret i32 %int.2
; -- 0 :: i32
%int.3 = add i32 0, 0
ret i32 %int.3
}
define i32 @f(float %r, float %r2) {
%var.r.2 = alloca float
store float %r, float* %var.r.2
%var.r2.3 = alloca float
store float %r2, float* %var.r2.3
%var.i.4 = alloca i32
%float.3 = load float* %var.r.2
%int.7 = fptosi float %float.3 to i32
store i32 %int.7, i32* %var.i.4
%int.8 = load i32* %var.i.4
ret i32 %int.8
; -- 0 :: i32
%int.9 = add i32 0, 0
ret i32 %int.9
}
So I take it LLVM does not allow function overloading? In that case, is it a good idea to generate a sequential counter and distinguish these functions by adding the counter as a suffix, i.e. define i32 @f.1() and define i32 @f.2()?
You're correct that LLVM IR doesn't have function overloading.
Using a sequential counter is probably not a good idea depending on how code in your language is organized. If you're just assigning incrementing integers, those may not be deterministic across the compilation of different files. For example, in C++, you might imagine something like
// library.cpp
int f(int i, float r) { ... }
int f(float r, float r2) { ... }
// user.cpp
extern int f(float r, float r2);
int foo() { return f(1.0, 2.0); }
When compiling user.cpp, there would be no way for the compiler to know that the f being referenced will actually be named f.2.
The typical way to implement function overloading is to use name mangling, somehow encoding the type signature of the function into its name so that it'll be unique in the presence of overloads.
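As a rough illustration of name mangling, the sketch below appends one letter per parameter type to the function name, so the two overloads of f get different, reproducible symbols. The Ty enum, the letter codes and the "." separator are all made up for this example and do not follow any established ABI.

```cpp
#include <string>
#include <vector>

// Hypothetical type tags for a toy language with two scalar types.
enum class Ty { Int, Float };

// Encode one parameter type as a single letter (assumed scheme).
char tag(Ty t) { return t == Ty::Int ? 'i' : 'f'; }

// Build a symbol such as "f.if" from the base name and the parameter types.
// The result depends only on the signature, not on declaration order.
std::string mangle(const std::string& name, const std::vector<Ty>& params) {
    std::string out = name + ".";
    for (Ty t : params) out += tag(t);
    return out;
}
```

Because the symbol is derived from the signature alone, a separately compiled file that only sees a declaration of f(float, float) computes the same name as the file that defines it.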
My generator is written in Java, so every time I parse a function definition, I increase the counter if the function name already exists in the scope table.
My table is a Map with the function name as the key and a list of function definitions as the value:
Map<String, ArrayList<FunctionSymbol>> functionTable = new HashMap<>(); // variable name assumed; the original snippet omitted it
and then the constructor will look like:
static int counter = 0;
public FunctionSymbol(String functionName, Type retType, List<Variable> paramList){
this.functionName = functionName+counter;
this.paramList = paramList;
this.retType = retType;
counter++;
}

LLVM IR - Can someone explain this behavior?

I'm trying to build a compiler for my language at the moment. In my language, I want to have implicit pointer usage for objects/structs just like in Java. In the program below, I am testing out this feature. However, the program does not run as I had expected. I do not expect you guys to read through my entire compiler code because that would be a waste of time. Instead I was hoping I could explain what I intended for the program to do and you guys could spot in the llvm ir what went wrong. That way, I can adjust the compiler to generate proper llvm ir.
Flow:
[Function] Main - [Return: Int] {
-> Allocates space for structure of one i32
-> Calls createObj function and stores the returning value inside previous allocated space
-> Returns the i32 of the structure
}
[Function] createObj - [Return: struct { i32 }] {
-> Allocates space for structure of one i32
-> Calls Object function on this space (pointer really)
-> Returns this space (pointer really)
}
[Function] Object - [Return: void] {
-> Stores the i32 value of 5 inside of the struct pointer argument
}
The problem is that main keeps returning some random number instead of 5. One such number is 159383856. I'm guessing that this is the decimal representation of a pointer address, but I'm not sure why it is returning the pointer address.
; ModuleID = 'main'
%Object = type { i32 }
define i32 @main() {
entry:
%0 = call %Object* @createObj()
%o = alloca %Object*
store %Object* %0, %Object** %o
%1 = load %Object** %o
%2 = getelementptr inbounds %Object* %1, i32 0, i32 0
%3 = load i32* %2
ret i32 %3
}
define %Object* @createObj() {
entry:
%0 = alloca %Object
call void @-Object(%Object* %0)
%o = alloca %Object*
store %Object* %0, %Object** %o
%1 = load %Object** %o
ret %Object* %1
}
define void @-Object(%Object* %this) {
entry:
%0 = getelementptr inbounds %Object* %this, i32 0, i32 0
store i32 5, i32* %0
ret void
}
This llvm ir is generated from this syntax.
func () > main > (int) {
Object o = createObj();
return o.id;
}
// Create an object and returns it
func () > createObj > (Object) {
Object o = make Object < ();
return o;
}
// Object decl
tmpl Object {
int id; // Property
// This is run every time an object is created.
constructor < () {
this.id = 5;
}
}
It seems like in createObj you're returning a pointer to a stack variable which will no longer be valid after function return.
If you're doing implicit object pointers like Java, at minimum you're going to need a call to a heap allocator such as malloc, which I don't think you have.
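In C++ terms, the fix amounts to something like the sketch below: createObj takes its storage from the heap, so the pointer it returns is still valid in the caller. Object_ctor is a hypothetical stand-in for the generated constructor, and freeing the object is left to the caller here, whereas a Java-like language would normally add a garbage collector or another ownership scheme.

```cpp
#include <cstdlib>

struct Object { int id; };

// Stand-in for the generated constructor: this.id = 5.
void Object_ctor(Object* self) { self->id = 5; }

// Allocate on the heap instead of the stack, so the returned pointer
// outlives this function's stack frame.
Object* createObj() {
    Object* o = static_cast<Object*>(std::malloc(sizeof(Object)));
    if (o) Object_ctor(o);  // only initialize if allocation succeeded
    return o;
}
```

In the generated IR this corresponds to replacing the alloca of %Object in createObj with a call to malloc (plus a cast to %Object*), while main can stay as it is.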

llvm - How to implement print function in my language?

I'm following llvm's tutorial for their own simple programming language "Kaleidoscope" and there's an obvious functionality in my language which this tutorial doesn't seem to cover. I simply want to print any double to standard output pretty much as C++ would do:
std::cout << 5.0;
my language would do something like
print(5.0);
Third chapter of llvm's tutorial covers function calls. The code they use is:
Value *CallExprAST::codegen() {
// Look up the name in the global module table.
Function *CalleeF = TheModule->getFunction(Callee);
if (!CalleeF)
return ErrorV("Unknown function referenced");
// If argument mismatch error.
if (CalleeF->arg_size() != Args.size())
return ErrorV("Incorrect # arguments passed");
std::vector<Value *> ArgsV;
for (unsigned i = 0, e = Args.size(); i != e; ++i) {
ArgsV.push_back(Args[i]->codegen());
if (!ArgsV.back())
return nullptr;
}
return Builder.CreateCall(CalleeF, ArgsV, "calltmp");
}
How could I implement the codegen() method for the specific function call print(any fp number)?
Below is the LLVM IR that clang generates for printf("%f", a);. The signature of printf is int printf(const char*, ...);
@.str = private unnamed_addr constant [3 x i8] c"%f\00", align 1
; Function Attrs: nounwind uwtable
define i32 @main() #0 {
%a = alloca double, align 8
%1 = load double* %a, align 8
%2 = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([3 x i8]* @.str, i32 0, i32 0), double %1)
ret i32 0
}
declare i32 @printf(i8*, ...) #1
To implement this in codegen, first check whether the function is already present in the module; if it is not, add the declaration. You can do both in one call.
// Depending on the LLVM version, getOrInsertFunction returns a Constant* or a
// FunctionCallee rather than a Function*, so a cast or getCallee() may be needed here.
Function *CalleeF = TheModule->getOrInsertFunction("printf",
FunctionType::get(IntegerType::getInt32Ty(Context), PointerType::get(Type::getInt8Ty(Context), 0), true /* this is a vararg function type */)
);
The call above gets or adds the function declaration
declare i32 @printf(i8*, ...) #1
Then you can call the function with matching arguments.
std::vector<Value *> ArgsV;
for (unsigned i = 0, e = Args.size(); i != e; ++i)
ArgsV.push_back(Args[i]->codegen());
return Builder.CreateCall(CalleeF, ArgsV, "printfCall");
You'd first check if Callee == "print" and then insert any instructions you want.
LLVM IR has no concept of "printing" since that's not really a language consideration -- it's a facility provided by the OS. Probably the simplest option for you would be to translate the call into a call to printf, so that e.g. print(5.0) becomes printf("%f\n", 5.0).
The tutorial you linked does show how external function calls work -- you'd have to insert a declaration for printf with the correct signature, then build a call to that.
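An alternative to emitting the printf call and its format string directly is to put the formatting into a small runtime helper and let the generated code call that helper as an ordinary external function taking one double. The sketch below assumes the helper is linked into, or otherwise visible to, the program that runs the generated code; the name print and the "%f\n" format are choices made for this example.

```cpp
#include <cstdio>

// Runtime helper for the language's print(x).
// extern "C" keeps the symbol name unmangled so generated code can
// refer to it simply as "print".
extern "C" double print(double x) {
    std::printf("%f\n", x);
    return 0.0;  // the language's functions all return double
}
```

With this in place, print(5.0) needs no special case in CallExprAST::codegen(): the module only needs a declaration of print taking and returning a double, and the existing call path handles the rest.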

LLVM extract i8* out of structure value

I'm writing a compiler using LLVM as a backend, I've written the front-end (parser, etc.) and now I've come to a crossroads.
I have a structure (%Primitive) which contains a single field, an i8* value, a pointer to a character array.
%Primitive = type { i8* }
In the compiler, instances of Primitive are passed around on the stack. I'm trying to write this character array to standard output using the puts function, but it isn't working quite like I was hoping.
declare i32 @puts(i8*) ; Declare the libc function 'puts'
define void @WritePrimitive(%Primitive) {
entry:
%1 = extractvalue %Primitive %0, 0 ; Extract the character array from the primitive.
%2 = call i32 @puts(i8* %1) ; Write it
ret void
}
When I try to run the code (either using an ExecutionEngine or the LLVM interpreter program lli), I get the same error; a segmentation fault.
The error lies in the fact that the address passed to puts is somehow the ASCII character code of the first character in the array. It seems the address passed, rather than being a pointer to an array of 8 bit chars, is instead an 8 bit wide pointer that equals the dereferenced string.
For example, if I call @WritePrimitive with a primitive where the i8* member points to the string "hello", puts is called with the string address being 0x68.
Any ideas?
Thanks
EDIT: You were right, I was initializing my Primitive incorrectly, my new initialization function is:
llvm::Value* PrimitiveHelper::getConstantPrimitive(const std::string& str, llvm::BasicBlock* bb)
{
ConstantInt* int0 = ConstantInt::get(Type::getInt32Ty(getGlobalContext()), 0);
Constant* strConstant = ConstantDataArray::getString(getGlobalContext(), str, true);
GlobalVariable* global = new GlobalVariable(module,
strConstant->getType(),
true, // Constant
GlobalValue::ExternalLinkage,
strConstant,
"str");
Value* allocated = new AllocaInst(m_primitiveType, "allocated", bb);
LoadInst* onStack1 = new LoadInst(allocated, "onStack1", bb);
GetElementPtrInst* ptr = GetElementPtrInst::Create(global, std::vector<Value*>(2,int0), "", bb);
InsertValueInst* onStack2 = InsertValueInst::Create(onStack1, ptr, std::vector<unsigned>(1, 0), "", bb);
return onStack2;
}
I missed that, Thank You!
There's nothing wrong with the code you pasted above; I just tried it myself and it worked fine. I'm guessing the issue is that you did not initialize the pointer properly, or did not set it properly into the struct.
The full code I used is:
@str = private unnamed_addr constant [13 x i8] c"hello world\0A\00"
; Your code
%Primitive = type { i8* }
declare i32 @puts(i8*) ; Declare the libc function 'puts'
define void @WritePrimitive(%Primitive) {
entry:
%1 = extractvalue %Primitive %0, 0 ; Extract the character array from the primitive.
%2 = call i32 @puts(i8* %1) ; Write it
ret void
}
; /Your code
define void @main() {
%allocated = alloca %Primitive
%onstack1 = load %Primitive* %allocated
%onstack2 = insertvalue %Primitive %onstack1, i8* getelementptr ([13 x i8]* @str, i64 0, i64 0), 0
call void @WritePrimitive(%Primitive %onstack2)
ret void
}

llvm exceptions; catch handler not handling, cleanup not called

I'm trying to create an exception handler inside JIT-compiled LLVM code. The current documentation on exception handling in LLVM is very hand-wavy, so I've been trying to reuse most of the snippets I get from http://llvm.org/demo to build a working example, but I'm not sure whether those are up to date with LLVM 2.9 (the version I am using).
This is what the module looks like after Module::dump();
; ModuleID = 'testModule'
declare i32 @myfunc()
define i32 @test_function_that_invokes_another() {
entryBlock:
%0 = alloca i8*
%1 = alloca i32
%someName = invoke i32 @myfunc()
to label %exitBlock unwind label %unwindBlock
exitBlock: ; preds = %entryBlock
ret i32 1
unwindBlock: ; preds = %entryBlock
%2 = call i8* @llvm.eh.exception()
store i8* %2, i8** %0
%3 = call i32 (i8*, i8*, ...)* @llvm.eh.selector(i8* %2, i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*), i8* null)
store i32 1, i32* %1
%4 = load i8** %0
%5 = call i32 (...)* @__cxa_begin_catch(i8* %4) nounwind
%cleanup_call = call i32 @myCleanup()
%6 = call i32 (...)* @__cxa_end_catch()
ret i32 1
}
declare i32 @__gxx_personality_v0(...)
declare i32 @__cxa_begin_catch(...)
declare i32 @__cxa_end_catch(...)
declare i8* @llvm.eh.exception() nounwind readonly
declare i32 @llvm.eh.selector(i8*, i8*, ...) nounwind
declare i32 @myCleanup()
And this is what happens when I try to execute the function:
inside JIT calling C/C++ call
terminate called after throwing an instance of 'int'
Aborted
This shows that the function that throws gets called and throws, but I never land in the cleanup call. (My cleanup call should have printed 'inside JIT calling C/C++ Cleanup'.)
The function that invokes it and attempts to catch the thrown exception is:
const inline llvm::FunctionType* getTestFunctionSignature(llvm::LLVMContext& context) {
return llvm::TypeBuilder< unsigned int(), false > ::get(context);
}
llvm::Function* createFunctionThatInvokesAnother( llvm::LLVMContext& ctx, llvm::Module* mod , llvm::Function* another ) {
llvm::Function* result = llvm::Function::Create(getTestFunctionSignature(ctx),
llvm::GlobalValue::ExternalLinkage,
"test_function_that_invokes_another",
mod);
llvm::BasicBlock* entry_block = llvm::BasicBlock::Create(ctx, "entryBlock", result);
llvm::BasicBlock* exit_block = llvm::BasicBlock::Create(ctx, "exitBlock", result);
llvm::BasicBlock* unwind_block = llvm::BasicBlock::Create(ctx, "unwindBlock", result);
llvm::IRBuilder<> builder(entry_block);
llvm::ConstantInt* ci = llvm::ConstantInt::get( mod->getContext() , llvm::APInt( 32 , llvm::StringRef("1"), 10));
llvm::PointerType* pty3 = llvm::PointerType::get(llvm::IntegerType::get(mod->getContext(), 8), 0);
llvm::AllocaInst* ptr_24 = new llvm::AllocaInst(pty3, "", entry_block);
llvm::AllocaInst* ptr_25 = new llvm::AllocaInst(llvm::IntegerType::get(mod->getContext(), 32), "", entry_block);
llvm::Twine name("someName");
builder.CreateInvoke( another , exit_block , unwind_block , "someName" );
builder.SetInsertPoint( exit_block );
builder.CreateRet(ci);
builder.SetInsertPoint( unwind_block );
llvm::Function* func___gxx_personality_v0 = func__gxx_personality_v0(mod);
llvm::Function* func___cxa_begin_catch = func__cxa_begin_catch(mod);
llvm::Function* func___cxa_end_catch = func__cxa_end_catch(mod);
llvm::Function* func_eh_ex = func_llvm_eh_exception(mod);
llvm::Function* func_eh_sel = func__llvm_eh_selector(mod);
llvm::Constant* const_ptr_17 = llvm::ConstantExpr::getCast(llvm::Instruction::BitCast, func___gxx_personality_v0, pty3);
llvm::ConstantPointerNull* const_ptr_18 = llvm::ConstantPointerNull::get(pty3);
llvm::CallInst* get_ex = llvm::CallInst::Create(func_eh_ex, "", unwind_block);
get_ex->setCallingConv(llvm::CallingConv::C);
get_ex->setTailCall(false);
new llvm::StoreInst(get_ex, ptr_24, false, unwind_block);
std::vector<llvm::Value*> int32_37_params;
int32_37_params.push_back(get_ex);
int32_37_params.push_back(const_ptr_17);
int32_37_params.push_back(const_ptr_18);
llvm::CallInst* eh_sel = llvm::CallInst::Create(func_eh_sel, int32_37_params.begin(), int32_37_params.end(), "", unwind_block);
eh_sel->setCallingConv(llvm::CallingConv::C);
eh_sel->setTailCall(false);
new llvm::StoreInst(ci, ptr_25, false, unwind_block);
llvm::LoadInst* ptr_29 = new llvm::LoadInst(ptr_24, "", false, unwind_block);
llvm::CallInst* ptr_30 = llvm::CallInst::Create(func___cxa_begin_catch, ptr_29, "", unwind_block);
ptr_30->setCallingConv(llvm::CallingConv::C);
ptr_30->setTailCall(false);
llvm::AttrListPtr ptr_30_PAL;
{
llvm::SmallVector<llvm::AttributeWithIndex, 4 > Attrs;
llvm::AttributeWithIndex PAWI;
PAWI.Index = 4294967295U;
PAWI.Attrs = 0 | llvm::Attribute::NoUnwind;
Attrs.push_back(PAWI);
ptr_30_PAL = llvm::AttrListPtr::get(Attrs.begin(), Attrs.end());
}
ptr_30->setAttributes(ptr_30_PAL);
llvm::Function* cleanup = call_myCleanup( mod );
builder.CreateCall( cleanup , "cleanup_call");
llvm::CallInst* end_catch = llvm::CallInst::Create(func___cxa_end_catch, "", unwind_block);
builder.CreateRet(ci);
//createCatchHandler( mod , unwind_block );
return result;
}
This gets called in the usual way:
testMain() {
llvm::LLVMContext ctx;
llvm::InitializeNativeTarget();
llvm::StringRef idRef("testModule");
llvm::Module* module = new llvm::Module(idRef, ctx);
std::string jitErrorString;
llvm::ExecutionEngine* execEngine = executionEngine( module , jitErrorString );
llvm::FunctionPassManager* OurFPM = new llvm::FunctionPassManager(module);
llvm::Function *thr = call_my_func_that_throws( module );
llvm::Function* result = createFunctionThatInvokesAnother(ctx, module ,thr);
std::string errorInfo;
llvm::verifyModule(* module, llvm::PrintMessageAction, & errorInfo);
module->dump();
void *fptr = execEngine->getPointerToFunction(result);
unsigned int (*fp)() = (unsigned int (*)())fptr;
try {
unsigned int value = fp();
} catch (...) {
std::cout << " handled a throw from JIT function" << std::endl;
}
}
where my function that throws is:
int myfunc() {
std::cout << " inside JIT calling C/C++ call" << std::endl;
throw 0;
};
llvm::Function* call_my_func_that_throws (llvm::Module* mod) {
std::vector< const llvm::Type* > FuncTy_ex_args;
llvm::FunctionType* FuncTy_ex = llvm::FunctionType::get( llvm::IntegerType::get( mod->getContext() , 32) , FuncTy_ex_args , false);
llvm::Function* result = llvm::Function::Create(FuncTy_ex, llvm::GlobalValue::ExternalLinkage, "myfunc", mod);
result->setCallingConv( llvm::CallingConv::C );
llvm::AttrListPtr PAL;
result->setAttributes( PAL );
llvm::sys::DynamicLibrary::AddSymbol( "myfunc" , (void*) &myfunc );
return result;
}
and my cleanup function is defined in a similar way:
int myCleanup() {
std::cout << " inside JIT calling C/C++ Cleanup" << std::endl;
return 18;
};
llvm::Function* call_myCleanup (llvm::Module* mod) {
std::vector< const llvm::Type* > FuncTy_ex_args;
llvm::FunctionType* FuncTy_ex = llvm::FunctionType::get( llvm::IntegerType::get( mod->getContext() , 32) , FuncTy_ex_args , false);
llvm::Function* result = llvm::Function::Create(FuncTy_ex, llvm::GlobalValue::ExternalLinkage, "myCleanup", mod);
result->setCallingConv( llvm::CallingConv::C );
llvm::AttrListPtr PAL;
result->setAttributes( PAL );
llvm::sys::DynamicLibrary::AddSymbol( "myCleanup" , (void*) &myCleanup );
return result;
}
I've also read this document regarding recent exception handling changes in LLVM, but it is not clear how those changes translate to actual, you know, code.
Right now the EH code is undergoing a large amount of revision. The demo, if I recall correctly, is not version 2.9, but current development sources - meaning trying to do something with 2.9 is going to be a world of hurt if you try that way.
That said, the EH representation is much better now and numerous patches have gone in to improve the documentation just this week. If you are trying to write a language that uses exceptions via llvm I highly suggest you migrate your code to current development sources.
All of that said, I'm not sure how well exception handling works in the JIT at all right now. It's nominally supported, but you may need to debug the unwind tables that are put into memory to make sure they're correct.