I am attempting to write a compiler that uses LLVM to generate intermediate code. Unfortunately, the LLVM documentation is not great and is somewhat confusing.
At the moment I have the lexer, grammar and AST implemented. I have also been following some examples found on the Internet. My current AST works as follows: it has the abstract base class Tree*, from which the other trees inherit (one for variable definitions, one for statement lists, one for binary expressions, etc.).
I am trying to implement the variable definition, so for the input
class Test{
int main()
{
int x;
}
}
I want LLVM output to be:
; ModuleID = "Test"
define i32 @main() {
entry:
%x = alloca i32
ret i32 0
}
However, although the alloca value makes it to the code that creates the main function, the actual output is missing the %x = alloca i32 line. The output I get is:
; ModuleID = "Test"
define i32 @main() {
entry:
ret i32 0
}
My Codegen() for variable declarations is shown below (the symbol table is just a list for now; I am trying to keep things as simple as possible at the moment):
llvm::Value *decafStmtList::Codegen() {
string name = SyandTy.back(); // Just a name of a variable
string type = SyandTy.front(); // and its type in string format
Type* typeVal = getLLVMType(decafType(str2DecafType(type))); // get LLVM::*Type representation
llvm::AllocaInst *Alloca = Builder.CreateAlloca(typeVal, 0, name.c_str());
Value *V = Alloca;
return Alloca;//Builder.CreateLoad(V, name.c_str());
}
The part where I generate @main is as follows:
Note: I have commented out the print_int function (this is the function I will use later to print things, but for now I don't need it). If I uncomment the print_int function, TheFunction does not pass the verifier (verifyFunction(*TheFunction)); it complains about the module being broken and parameters not matching the signature.
Function *gen_main_def(llvm::Value *RetVal, Function *print_int) {
if (RetVal == 0) {
throw runtime_error("something went horribly wrong\n");
}
// create the top-level definition for main
FunctionType *FT = FunctionType::get(IntegerType::get(getGlobalContext(), 32), false);
Function *TheFunction = Function::Create(FT, Function::ExternalLinkage, "main", TheModule);
if (TheFunction == 0) {
throw runtime_error("empty function block");
}
// Create a new basic block which contains a sequence of LLVM instructions
BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
// All subsequent calls to IRBuilder will place instructions in this location
Builder.SetInsertPoint(BB);
/*
Function *CalleeF = TheModule->getFunction(print_int->getName());
if (CalleeF == 0) {
throw runtime_error("could not find the function print_int\n");
}*/
// print the value of the expression and we are done
// Value *CallF = Builder.CreateCall(CalleeF, RetVal, "calltmp");
// Finish off the function.
// return 0 from main, which is EXIT_SUCCESS
Builder.CreateRet(ConstantInt::get(getGlobalContext(), APInt(32, 0)));
return TheFunction;
}
If someone knows why my Alloca object is not being generated, please help me out - any hints will be greatly appreciated.
Thank you
EDIT:
Codegen is called from the grammar:
start: program
program: extern_list decafclass
{
ProgramAST *prog = new ProgramAST((decafStmtList *)$1, (ClassAST *)$2);
if (printAST) {
cout << getString(prog) << endl;
}
Value *RetVal = prog->Codegen();
delete $1; // get rid of abstract syntax tree
delete $2; // get rid of abstract syntax tree
// we create an implicit print_int function call to print
// out the value of the expression.
Function *print_int = gen_print_int_def();
Function *TheFunction = gen_main_def(RetVal, print_int);
verifyFunction(*TheFunction);
}
EDIT: I figured it out. Basically, CreateAlloca has to be called after the basic block for main has been created and set as the builder's insert point.
There are two weird things here:
1. All you do is call Builder.CreateRet... I don't see how there could be any code in main unless you call something that creates the corresponding instructions. In particular, you never seem to call the Codegen part.
2. You pass a size of zero to CreateAlloca. I think the size should be one for a single variable (although, if that 0 is taken as a null ArraySize pointer, it already means a single element).
Also, make sure that you don't call any LLVM optimization passes after generating your code. Those passes would optimize the value away (it's never used, thus dead code).
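The fix described in the asker's edit comes down to ordering: an instruction only shows up in the printed function if the builder already points at a block when the instruction is created. The sketch below is a toy model in plain C++, not LLVM code. ToyBlock, ToyBuilder, emit and the two helper functions are hypothetical names made up for illustration, and the behaviour it imitates (an instruction created without an insert point is attached to no block) is an assumption based on the symptom in the question.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a basic block: an ordered list of instruction strings.
struct ToyBlock {
    std::vector<std::string> instructions;
};

// Toy stand-in for an IR builder (hypothetical, not the LLVM class).
struct ToyBuilder {
    ToyBlock *insertBlock = nullptr;    // no insert point until one is set
    std::vector<std::string> orphans;   // created, but attached to no block

    void setInsertPoint(ToyBlock *block) { insertBlock = block; }

    // Append to the current block if there is one; otherwise the
    // instruction exists but is not part of any function body.
    void emit(const std::string &text) {
        if (insertBlock)
            insertBlock->instructions.push_back(text);
        else
            orphans.push_back(text);
    }
};

// Order used in the question: the variable is generated before main's
// entry block exists, so the alloca never reaches the block.
ToyBlock allocaBeforeBlock() {
    ToyBuilder builder;
    builder.emit("%x = alloca i32");
    ToyBlock entry;
    builder.setInsertPoint(&entry);
    builder.emit("ret i32 0");
    return entry;
}

// Order described in the edit: create the block, set the insert point,
// then generate the variable.
ToyBlock allocaAfterBlock() {
    ToyBuilder builder;
    ToyBlock entry;
    builder.setInsertPoint(&entry);
    builder.emit("%x = alloca i32");
    builder.emit("ret i32 0");
    return entry;
}
```

In this model allocaBeforeBlock() returns a block holding only the ret, which matches the output in the question, while allocaAfterBlock() returns both instructions in order.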
I am playing with llvm (and antlr), working vaguely along the lines of the Kaleidoscope tutorial. I successfully created LLVM-IR code from basic arithmetic expressions both on top-level and as function definitions, which corresponds to the tutorial chapters up to 3.
Now I would like to incrementally add JIT support, starting with the top-level arithmetic expressions. Here is my problem:
Basic comparison makes it seem as if I follow the same sequence of function calls as the tutorial, only with a simpler code organization
The generated IR code looks good
The function definition is apparently found, since otherwise the code would exit (I verified this by intentionally looking up a misspelled function name)
However the call of the function pointer created by JIT evaluation always returns zero.
These snippets (excerpt) are executed as part of the antlr visitor of the main/entry-node of my grammar:
//Top node main -- top level expression
antlrcpp::Any visitMain(ExprParser::MainContext *ctx)
{
llvm::InitializeNativeTarget();
llvm::InitializeNativeTargetAsmPrinter();
llvm::InitializeNativeTargetAsmParser();
TheJIT = ExitOnErr( llvm::orc::KaleidoscopeJIT::Create() );
InitializeModuleAndPassManager();
// ... Code which visits the child nodes ...
}
InitializeModuleAndPassManager() is the same as in the tutorial:
static void InitializeModuleAndPassManager()
{
// Open a new context and module.
TheContext = std::make_unique<llvm::LLVMContext>();
TheModule = std::make_unique<llvm::Module>("commandline", *TheContext);
TheModule->setDataLayout(TheJIT->getDataLayout());
// Create a new builder for the module.
Builder = std::make_unique<llvm::IRBuilder<>>(*TheContext);
// Create a new pass manager attached to it.
TheFPM = std::make_unique<llvm::legacy::FunctionPassManager>(TheModule.get());
// Do simple "peephole" optimizations and bit-twiddling optzns.
TheFPM->add(llvm::createInstructionCombiningPass());
// Reassociate expressions.
TheFPM->add(llvm::createReassociatePass());
// Eliminate Common SubExpressions.
TheFPM->add(llvm::createGVNPass());
// Simplify the control flow graph (deleting unreachable blocks, etc).
TheFPM->add(llvm::createCFGSimplificationPass());
TheFPM->doInitialization();
}
This is the function which handles the top-level expression and which is also supposed to do JIT evaluation:
//Bare expression without function definition -- create anonymous function
antlrcpp::Any visitBareExpr(ExprParser::BareExprContext *ctx)
{
string fName = "__anon_expr";
llvm::FunctionType *FT = llvm::FunctionType::get(llvm::Type::getDoubleTy(*TheContext), false);
llvm::Function *F = llvm::Function::Create(FT, llvm::Function::ExternalLinkage, fName, TheModule.get());
llvm::BasicBlock *BB = llvm::BasicBlock::Create(*TheContext, "entry", F);
Builder->SetInsertPoint(BB);
llvm::Value* Expression=visit(ctx->expr()).as<llvm::Value* >();
Builder->CreateRet(Expression);
llvm::verifyFunction(*F);
//TheFPM->run(*F); // commented out because I wanted to try the JIT before optimization;
// it causes a compile error right now, probably because I am missing some related code.
// However, I do not assume that skipping the optimization run causes the problem I have.
F->print(llvm::errs());
// Create a ResourceTracker to track JIT'd memory allocated to our
// anonymous expression -- that way we can free it after executing.
auto RT = TheJIT->getMainJITDylib().createResourceTracker();
auto TSM = llvm::orc::ThreadSafeModule(move(TheModule), move(TheContext));
ExitOnErr(TheJIT->addModule(move(TSM), RT));
InitializeModuleAndPassManager();
// Search the JIT for the __anon_expr symbol.
auto ExprSymbol = ExitOnErr(TheJIT->lookup("__anon_expr"));
// Get the symbol's address and cast it to the right type (takes no
// arguments, returns a double) so we can call it as a native function.
double (*FP)() = (double (*)())(intptr_t)ExprSymbol.getAddress();
double ret = FP();
fprintf(stderr, "Evaluated to %f\n", ret);
// Delete the anonymous expression module from the JIT.
ExitOnErr(RT->remove());
return F;
}
Now this is what happens as an example:
[robert@robert-ux330uak test4_expr_llvm_2]$ ./testmain '3*4'
define double @__anon_expr() {
entry:
ret float 1.200000e+01
}
Evaluated to 0.000000
I would be thankful for any ideas about what I might be doing wrong.
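One detail visible in the output above is that the function is declared to return double while the printed body returns a float constant. Whether that mismatch is the cause here is an assumption, since no confirmed answer is given, but it would be consistent with the symptom. If a single-precision result ends up in the low 32 bits of the value that the caller reads as a double, the number the caller sees is a tiny subnormal that prints as zero. The sketch below is plain C++ without LLVM. lowFloatBitsAsDouble and formatLikeTheQuestion are hypothetical helpers, and the code models only the bit pattern, not the actual calling convention.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <string>

// Place the 32 bits of a float in the low half of a 64-bit pattern
// (upper half zero) and read that pattern back as a double.
double lowFloatBitsAsDouble(float value) {
    static_assert(sizeof(float) == 4 && sizeof(double) == 8,
                  "IEEE-754 float/double sizes assumed");
    std::uint32_t floatBits = 0;
    std::memcpy(&floatBits, &value, sizeof floatBits);
    std::uint64_t doubleBits = floatBits;   // upper 32 bits are zero
    double result = 0.0;
    std::memcpy(&result, &doubleBits, sizeof result);
    return result;
}

// Same format string as the fprintf call in the question.
std::string formatLikeTheQuestion(double value) {
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "Evaluated to %f", value);
    return buffer;
}
```

If this is what is happening, two things seem worth checking: that every value built for the expression has type double (so the ret matches the declared return type), and the return value of llvm::verifyFunction, which returns true when the function is broken and is currently ignored.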
I'm trying to use forward declaration of functions in LLVM, but I'm not able to do it... The reason for doing that is this error:
error: invalid forward reference to function 'f' with wrong type!
Right now I'm trying to do it with this code:
std::vector<Type *> args_type = f->get_args_type();
Module* mod = get_module();
std::string struct_name("struct.");
struct_name.append(f->get_name());
Type* StructTy = mod->getTypeByName(struct_name);
if (!StructTy) {
StructTy = Type::getVoidTy(getGlobalContext());
}
FunctionType *ftype = FunctionType::get(StructTy, args_type, false);
//Function *func = Function::Create(ftype, GlobalValue::InternalLinkage, f->get_name(), get_module());
Constant* c = mod->getOrInsertFunction(f->get_name(), ftype);
Function *func = cast<Function>(c);
But the declaration does not show up in the IR when I generate the code. When I create the function again using the same code shown above, it works. I wonder if that is because I insert a BasicBlock right afterwards, when I start inserting things into the function.
This is what my IR currently looks like:
define internal void @main() {
entry:
...
}
define internal %struct.f @f(i32* %x) {
entry:
...
}
I believe that putting a declare %struct.f @f(i32*) before the @main function would fix this issue, but I can't figure out how to do it...
Summary: I just want to emit a declare at the top of the file, so that I can define the function later and start inserting its instructions.
OK, it seems LLVM does that 'automatically'.
I just realized that the functions changed order when I ran the code again. So, if you create a function up front, even without inserting any code (a body), LLVM creates the prototype and waits for the body to be added later, as long as you refer to the function through the getOrInsertFunction() method of the Module class.
I don't know if this is the right answer or if it's clear, but it solved my problem...
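The "declare first, fill in the body later" behaviour described above can be pictured with a small toy model. This is plain C++, not LLVM: ToyModule, ToyFunction and declareThenDefine are hypothetical names, and the model only borrows the name getOrInsertFunction to mirror the call used in the question. It is a sketch of the idea that looking a function up by name returns the same object every time, so a body added later attaches to the earlier prototype.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy function: a name plus a list of instructions.
struct ToyFunction {
    std::string name;
    std::vector<std::string> body;   // empty body means "declaration only"
    bool isDeclaration() const { return body.empty(); }
};

// Toy module keyed by function name (hypothetical, not llvm::Module).
struct ToyModule {
    std::map<std::string, ToyFunction> functions;

    // Return the existing entry for this name, or create a body-less one.
    ToyFunction &getOrInsertFunction(const std::string &name) {
        auto it = functions.find(name);
        if (it == functions.end())
            it = functions.emplace(name, ToyFunction{name, {}}).first;
        return it->second;
    }
};

// Declare f, use it from main, and only then give f a body.
ToyModule declareThenDefine() {
    ToyModule module;
    module.getOrInsertFunction("f");                        // prototype only
    ToyFunction &mainFn = module.getOrInsertFunction("main");
    mainFn.body.push_back("call f");
    ToyFunction &f = module.getOrInsertFunction("f");       // same object as before
    f.body.push_back("ret");
    return module;
}
```

After declareThenDefine() the module holds exactly two functions, and f is no longer a declaration even though main referred to it before its body existed.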
I have an LLVM pass that traverses input IR code and performs analysis on called functions. My analysis function signature is functionTracer(const Function* pFunc) and I call it on a CallInst's getCalledFunction().
At the start of my analysis function I create a copy of the passed in function that I manipulate during the analysis:
Function* pFunctionToAnalyze = CloneFunction(pFunction,VMap,false);
I have a C++ main that calls a function f2(int i):
int main(){
int a = 3;
int b = f2(a);
int c = f2(b);
}
I turn this code into IR and submit to my pass. My code appears to execute and perform the manipulations I want but I get the following error output:
While deleting: i32 (i32)* %_Z2f2i
Use still stuck around after Def is destroyed: %call1 = call i32 @_Z2f2i(i32 %1)
Use still stuck around after Def is destroyed: %call = call i32 @_Z2f2i(i32 %0)
module: /home/src/extern/llvm/llvm-3.7.0.src/lib/IR/Value.cpp:82:
virtual llvm::Value::~Value(): Assertion `use_empty() && "Uses remain when a value is destroyed!"' failed.
Aborted (core dumped)
Do I need to perform manual clean up of the Cloned function, pFunctionToAnalyze, at the end of my analysis function to remove Uses before returning? Is there a better way to copy function contents for analysis that may modify it?
There's an example of that in lib/Transforms/IPO/PartialInlining.cpp:
// Clone the function, so that we can hack away on it.
ValueToValueMapTy VMap;
Function* duplicateFunction = CloneFunction(F, VMap,
/*ModuleLevelChanges=*/false);
And at the end of the pass:
duplicateFunction->replaceAllUsesWith(F);
duplicateFunction->eraseFromParent();
Isn't that what fixes your problem?
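The assertion in the question fires because something still refers to a value at the moment it is destroyed, and the two calls quoted above deal with exactly that: first redirect the remaining users, then erase. The toy model below is plain C++ with hypothetical names (ToyValue, safeToErase). It only illustrates the bookkeeping and assumes, as the error message suggests, that the two call instructions are the remaining users.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy value that remembers which instructions still refer to it.
struct ToyValue {
    std::string name;
    std::vector<std::string> users;

    bool useEmpty() const { return users.empty(); }

    // Point every user at `replacement` instead of this value.
    void replaceAllUsesWith(ToyValue &replacement) {
        replacement.users.insert(replacement.users.end(),
                                 users.begin(), users.end());
        users.clear();
    }
};

// A value may only be discarded once nothing refers to it any more;
// this mirrors the use_empty() condition in the assertion message.
bool safeToErase(const ToyValue &value) { return value.useEmpty(); }
```

With a clone that still has the users %call and %call1, safeToErase is false; after clone.replaceAllUsesWith(original) it becomes true and the original carries both users.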
I have a structure and it has a pointer to function as follows.
typedef struct
{
void (*p)();
int n;
} myStruct;
I use it as follows:
myStruct * a = malloc( sizeof(myStruct));
a->n=88;
a->p = &booooo;
a->p();
In LLVM, how can I get the name of the function (booooo) and of the struct element (a->p), so that I can save them in a symbol table and print them later?
I could find the name of the function in StoreInst.
When I print its value I got this result:
void (...)* bitcast (void ()* @booooo to void (...)*)
How can I get only the name (booooo) from the value?
There are (at least) two kinds of casts in LLVM IR: BitCastInst instructions and bitcast values (constant expressions). You have the latter. Fortunately, there is a method for retrieving the original value from within the bitcast: stripPointerCasts(). It took me some time to figure out this distinction.
Here is my usage of the routine, where I was trying to identify the function called (BasicBlock::iterator I):
if (CallInst *ci = dyn_cast<CallInst>(&*I)) {
Function *f = ci->getCalledFunction();
if (f == NULL)
{
Value* v = ci->getCalledValue();
f = dyn_cast<Function>(v->stripPointerCasts());
if (f == NULL)
{
continue;
}
}
const char* fname = f->getName().data();
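What stripPointerCasts() does for the value printed in the question can be pictured as peeling cast wrappers off until the underlying function is reached. The following is a toy model in plain C++, not the LLVM implementation. ToyNode and stripCasts are hypothetical names, and the model covers only the bitcast-around-a-function case shown above.

```cpp
#include <cassert>
#include <string>

// Toy value: either a named function or a bitcast wrapping another value.
struct ToyNode {
    enum Kind { Function, BitCast } kind;
    std::string name;          // only meaningful for Function
    const ToyNode *operand;    // only meaningful for BitCast
};

// Follow the operand chain until something that is not a cast is found.
const ToyNode *stripCasts(const ToyNode *value) {
    while (value && value->kind == ToyNode::BitCast)
        value = value->operand;
    return value;
}
```

For a node that models bitcast(bitcast(@booooo)), stripCasts returns the function node, whose name is "booooo". A node that is already a function is returned unchanged.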
As explained in the previous question asking the same thing [marginally different], you are better off using the AST form that the Clang compiler produces, rather than the LLVM IR form. It is a much more direct representation of the C or C++ code than the LLVM IR, and easier to work with in general.
But from the StoreInst you can use getValueOperand to get the value that is being stored, and then getName on that value. Of course, as I also said in the comments on the previous answer, it doesn't take much to write code from which it is hard to work out what the original stored value was.
In other words, if we have an llvm::Instruction *inst, we could do this:
if (llvm::StoreInst* si = llvm::dyn_cast<llvm::StoreInst>(inst))
{
std::string name = si->getValueOperand()->getName().str(); // getName() returns a StringRef
}
[Code is not tested, not compiled, no guarantee provided, I just wrote it as part of this answer with the intention that it may work]
I want to replace the call to malloc with a call to the cuMemHostAlloc function.
float *h_A=(float *)malloc(size);
should be replaced with
cuMemHostAlloc((void **)&h_A,size,2);
I use the following code for this:
if (dyn_cast<CallInst> (j))
{
Ip=cast<Instruction>(j);
CastInst* ci_hp = new BitCastInst(ptr_h_A, PointerTy_23, "" );
BB->getInstList().insert(Ip,ci_hp);
errs()<<"\n Cast instruction is inserted"<<*ci_hp;
li_size = new LoadInst(al_size, "", false);
li_size->setAlignment(4);
BB->getInstList().insert(Ip,li_size);
errs()<<"\n Load instruction is inserted"<<*li_size;
ConstantInt* const_int32_34 = ConstantInt::get(M->getContext(), APInt(32, StringRef("2"), 10));
std::vector<Value*> cumemhaparams;
cumemhaparams.push_back(ci_hp);
cumemhaparams.push_back(li_size);
cumemhaparams.push_back(const_int32_34);
CallInst* cumemha = CallInst::Create(func_cuMemHostAlloc, cumemhaparams, "");
cumemha->setCallingConv(CallingConv::C);
cumemha->setTailCall(false);
AttrListPtr cumemha_PAL;
cumemha->setAttributes(cumemha_PAL);
ReplaceInstWithInst(callinst->getParent()->getInstList(), j,cumemha);
}
But I get the following error:
/home/project/llvmfin/llvm-3.0.src/lib/VMCore/Value.cpp:287: void llvm::Value::replaceAllUsesWith(llvm::Value*): Assertion `New->getType() == getType() && "replaceAllUses of value with new value of different type!"' failed.
Is it because the call to malloc is replaced with a function that has a different signature?
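Almost. The call to malloc produces the pointer as its value, whereas your function does not: it hands the pointer back through its first argument. So you have to replace the uses of the malloc call with a load of that pointer, not with the new call itself.
Also, looking at your code:
1. Do not manipulate the instruction lists directly. Use IRBuilder and iterators instead.
2. You can check for a CallInst and declare the variable in the same statement; there is no need for an additional cast to Instruction.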
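The point about replacing the call with a load is easier to see at the source level. The sketch below is plain C++ and does not touch LLVM or CUDA. fakeHostAlloc is a hypothetical stand-in that only copies the shape of the call in the question (pointer returned through the first argument, plus a size and a flags value), and the two allocate functions are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in with the same shape as the call in the question:
// the pointer comes back through `out`; the return value is only a status.
int fakeHostAlloc(void **out, std::size_t size, unsigned flags) {
    (void)flags;                       // unused in this toy version
    *out = std::malloc(size);
    return *out ? 0 : 1;
}

// Before: the call itself produces the pointer.
float *allocateWithMalloc(std::size_t size) {
    return static_cast<float *>(std::malloc(size));
}

// After: the call produces a status code, so the pointer has to be read
// ("loaded") from the slot whose address was passed in.
float *allocateWithOutParam(std::size_t size) {
    void *slot = nullptr;
    int status = fakeHostAlloc(&slot, size, 2);
    if (status != 0)
        return nullptr;
    return static_cast<float *>(slot);
}
```

Every place that used the result of malloc corresponds, after the rewrite, to a read of slot rather than to the result of the new call. That is why substituting one call for the other trips the type check in replaceAllUsesWith.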