How to insert an inline assembly instruction using an LLVM pass

I tried to insert an assembly instruction into each basic block using an LLVM IR pass.
Update:
LLVMContext *Ctx = nullptr;
Ctx = &M.getContext();
BasicBlock::iterator IP = BB.getFirstInsertionPt();
IRBuilder<> IRB(&(*IP));
StringRef asmString = "int3";
StringRef constraints = "~{dirflag},~{fpsr},~{flags}";
llvm::InlineAsm *IA = llvm::InlineAsm::get(Ty,asmString,constraints,true,false,InlineAsm::AD_ATT);
ArrayRef<Value *> Args = None;
llvm::CallInst *Ptr = IRB.CreateCall(IA,Args);
Ptr->addAttribute(AttributeList::FunctionIndex, Attribute::NoUnwind);
However, when I ran the pass on one of the test files, test.bc, I found that no INT3 instructions were inserted into the file. I compared the call I created through Ptr:
call void asm sideeffect "int3", "~{dirflag},~{fpsr},~{flags}"() #4
And a real INT3 in the IR is:
call void asm sideeffect "int3", "~{dirflag},~{fpsr},~{flags}"() #2, !srcloc !2
How can I modify my code to make it work?

The type of the inline assembly certainly doesn't have to match the type of the function it is used in.
For an int3 inline assembly you probably want a void (void) type, that is, FunctionType::get(Type::getVoidTy(ctx), false).

Related

Why does my llvm function jit-evaluate to 0?

I am playing with llvm (and antlr), working vaguely along the lines of the Kaleidoscope tutorial. I successfully created LLVM-IR code from basic arithmetic expressions both on top-level and as function definitions, which corresponds to the tutorial chapters up to 3.
Now I would like to incrementally add JIT support, starting with the top-level arithmetic expressions. Here is my problem:
Basic comparison makes it seem as if I follow the same sequence of function calls as the tutorial, only with a simpler code organization
The generated IR code looks good
The function definition is apparently found, since otherwise the code would exit (I verified this by intentionally looking up a misspelled function name)
However the call of the function pointer created by JIT evaluation always returns zero.
These snippets (excerpt) are executed as part of the antlr visitor of the main/entry-node of my grammar:
//Top node main -- top level expression
antlrcpp::Any visitMain(ExprParser::MainContext *ctx)
{
llvm::InitializeNativeTarget();
llvm::InitializeNativeTargetAsmPrinter();
llvm::InitializeNativeTargetAsmParser();
TheJIT = ExitOnErr( llvm::orc::KaleidoscopeJIT::Create() );
InitializeModuleAndPassManager();
// ... Code which visits the child nodes ...
}
InitializeModuleAndPassManager() is the same as in the tutorial:
static void InitializeModuleAndPassManager()
{
// Open a new context and module.
TheContext = std::make_unique<llvm::LLVMContext>();
TheModule = std::make_unique<llvm::Module>("commandline", *TheContext);
TheModule->setDataLayout(TheJIT->getDataLayout());
// Create a new builder for the module.
Builder = std::make_unique<llvm::IRBuilder<>>(*TheContext);
// Create a new pass manager attached to it.
TheFPM = std::make_unique<llvm::legacy::FunctionPassManager>(TheModule.get());
// Do simple "peephole" optimizations and bit-twiddling optzns.
TheFPM->add(llvm::createInstructionCombiningPass());
// Reassociate expressions.
TheFPM->add(llvm::createReassociatePass());
// Eliminate Common SubExpressions.
TheFPM->add(llvm::createGVNPass());
// Simplify the control flow graph (deleting unreachable blocks, etc).
TheFPM->add(llvm::createCFGSimplificationPass());
TheFPM->doInitialization();
}
This is the function which handles the top-level expression and which is also supposed to do JIT evaluation:
//Bare expression without function definition -- create anonymous function
antlrcpp::Any visitBareExpr(ExprParser::BareExprContext *ctx)
{
string fName = "__anon_expr";
llvm::FunctionType *FT = llvm::FunctionType::get(llvm::Type::getDoubleTy(*TheContext), false);
llvm::Function *F = llvm::Function::Create(FT, llvm::Function::ExternalLinkage, fName, TheModule.get());
llvm::BasicBlock *BB = llvm::BasicBlock::Create(*TheContext, "entry", F);
Builder->SetInsertPoint(BB);
llvm::Value* Expression=visit(ctx->expr()).as<llvm::Value* >();
Builder->CreateRet(Expression);
llvm::verifyFunction(*F);
//TheFPM->run(*F); // commented out because I wanted to try the JIT before optimization;
// it causes a compile error right now, probably because I lack some related code.
// However, I do not assume that a missing optimization run causes the problem that I have.
F->print(llvm::errs());
// Create a ResourceTracker to track JIT'd memory allocated to our
// anonymous expression -- that way we can free it after executing.
auto RT = TheJIT->getMainJITDylib().createResourceTracker();
auto TSM = llvm::orc::ThreadSafeModule(move(TheModule), move(TheContext));
ExitOnErr(TheJIT->addModule(move(TSM), RT));
InitializeModuleAndPassManager();
// Search the JIT for the __anon_expr symbol.
auto ExprSymbol = ExitOnErr(TheJIT->lookup("__anon_expr"));
// Get the symbol's address and cast it to the right type (takes no
// arguments, returns a double) so we can call it as a native function.
double (*FP)() = (double (*)())(intptr_t)ExprSymbol.getAddress();
double ret = FP();
fprintf(stderr, "Evaluated to %f\n", ret);
// Delete the anonymous expression module from the JIT.
ExitOnErr(RT->remove());
return F;
}
Now this is what happens as an example:
[robert@robert-ux330uak test4_expr_llvm_2]$ ./testmain '3*4'
define double @__anon_expr() {
entry:
ret float 1.200000e+01
}
Evaluated to 0.000000
I would be thankful for any ideas about what I might be doing wrong.
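One thing that stands out in the printed IR is that the function is declared to return double while the ret instruction returns a float. This may or may not be the whole story, but it would be consistent with the result: if the callee leaves a 32-bit float in the return register and the caller reads 64 bits as a double, the value seen is a tiny subnormal number that %f prints as 0.000000. The sketch below is plain C++ with no LLVM involved; floatBitsReadAsDouble is a hypothetical helper, and it assumes a calling convention where the float occupies the low 32 bits of the return register and the upper 32 bits happen to be zero.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not an LLVM API): take the 32-bit pattern of a float
// and read it back as a double whose upper 32 bits are zero. This imitates a
// caller that expects a double from a function that actually returned a float.
double floatBitsReadAsDouble(float f) {
    std::uint32_t lo = 0;
    std::memcpy(&lo, &f, sizeof lo);    // raw bits of the float
    std::uint64_t bits = lo;            // widen; upper 32 bits stay zero
    double d = 0.0;
    std::memcpy(&d, &bits, sizeof d);   // reinterpret as a double
    return d;
}
```

For 12.0f the result is positive but far below anything %f can show. If this is what happens here, making the constants doubles (so the IR reads ret double) should help. It may also be worth checking the return value of llvm::verifyFunction(*F, &llvm::errs()), which returns true when the function is broken; the code above ignores it.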

How to call an inline function in llvm

I wrote an inline function and, in an LLVM pass, inserted a call to it into each basic block. However, compilation fails with the following error:
inline function call in a function with debug info must have a !dbg location
I wonder why? Am I missing a step or a parameter?
The call instruction is created like this:
std::vector<Type*> Vct;
Vct.push_back(Int32Ty);
Vct.push_back(Int32Ty);
ArrayRef<Type*> Args(Vct);
FunctionType *Ty = FunctionType::get(Type::getVoidTy(*Ctx), Args,false);
FunctionCallee callee_ = M.getOrInsertFunction("HH",Ty);
Value *Args_[] = {IRB.getInt32(a),IRB.getInt32(b)};
const Twine Name = "";
IRB.CreateCall(callee_,Args_,Name);

How should LLVM Function clones be cleaned up?

I have an LLVM pass that traverses input IR code and performs analysis on called functions. My analysis function signature is functionTracer(const Function* pFunc) and I call it on a CallInst's getCalledFunction().
At the start of my analysis function I create a copy of the passed in function that I manipulate during the analysis:
Function* pFunctionToAnalyze = CloneFunction(pFunction,VMap,false);
I have a C++ main that calls a function f2(int i):
int main(){
int a = 3;
int b = f2(a);
int c = f2(b);
}
I turn this code into IR and submit to my pass. My code appears to execute and perform the manipulations I want but I get the following error output:
While deleting: i32 (i32)* %_Z2f2i
Use still stuck around after Def is destroyed: %call1 = call i32 @_Z2f2i(i32 %1)
Use still stuck around after Def is destroyed: %call = call i32 @_Z2f2i(i32 %0)
module: /home/src/extern/llvm/llvm-3.7.0.src/lib/IR/Value.cpp:82:
virtual llvm::Value::~Value(): Assertion `use_empty() && "Uses remain when a value is destroyed!"' failed.
Aborted (core dumped)
Do I need to perform manual clean up of the Cloned function, pFunctionToAnalyze, at the end of my analysis function to remove Uses before returning? Is there a better way to copy function contents for analysis that may modify it?
There's an example of that in lib/Transforms/IPO/PartialInlining.cpp:
// Clone the function, so that we can hack away on it.
ValueToValueMapTy VMap;
Function* duplicateFunction = CloneFunction(F, VMap,
/*ModuleLevelChanges=*/false);
And in the end of the pass:
duplicateFunction->replaceAllUsesWith(F);
duplicateFunction->eraseFromParent();
Isn't that what fixes your problem?
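To illustrate why the order "replace all uses, then erase" matters, here is a toy model of a use list in plain C++. It is not LLVM code: Def, Use, setUse and safeToErase are hypothetical names made up for this sketch, and the real llvm::Value bookkeeping is more involved. The point is only that a definition may be destroyed once nothing refers to it any more.

```cpp
#include <cassert>
#include <vector>

struct Def;

// A "use" is a slot in some user (e.g. a call) that points at a definition.
struct Use {
    Def *target = nullptr;
};

// A definition that remembers every slot currently pointing at it.
struct Def {
    std::vector<Use *> uses;

    bool useEmpty() const { return uses.empty(); }

    // Redirect every use of this definition to `other`, leaving this one unused.
    void replaceAllUsesWith(Def &other) {
        for (Use *u : uses) {
            u->target = &other;
            other.uses.push_back(u);
        }
        uses.clear();
    }
};

// Point slot `u` at definition `d` and record the use.
void setUse(Use &u, Def &d) {
    u.target = &d;
    d.uses.push_back(&u);
}

// A definition is only safe to destroy when no uses remain.
bool safeToErase(const Def &d) { return d.useEmpty(); }
```

In this model, erasing a definition while safeToErase is false corresponds to the "Uses remain when a value is destroyed!" assertion from the question.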

How can I get the name of function from StoreInst's Value In LLVM

I have a structure and it has a pointer to function as follows.
typedef struct
{
void (*p)();
int n;
} myStruct;
I used it as follows:
myStruct * a = malloc( sizeof(myStruct));
a->n=88;
a->p = &booooo;
a->p();
In LLVM, how can I get the name of the function (booooo) and the struct element (a->p), so that I can save them in a symbol table and print them later?
I could find the name of the function in StoreInst.
When I print its value I got this result:
void (...)* bitcast (void ()* @booooo to void (...)*)
How can I get only the name (booooo) from the value?
There are (at least) two kinds of casts in LLVM IR: BitCastInst instructions and bitcast constant expressions that appear as operands. You have the latter. Fortunately, there is a method for retrieving the original value within the bitcast: stripPointerCasts(). It took me some time to figure out this distinction.
Here is my usage of the routine, where I was trying to identify the called function (I is a BasicBlock::iterator):
if (CallInst *ci = dyn_cast<CallInst>(&*I)) {
Function *f = ci->getCalledFunction();
if (f == NULL)
{
Value* v = ci->getCalledValue();
f = dyn_cast<Function>(v->stripPointerCasts());
if (f == NULL)
{
continue;
}
}
const char* fname = f->getName().data();
As explained in the answer to the previous, marginally different version of this question, you are better off using the AST that the Clang compiler produces rather than the LLVM IR. It is a much more direct representation of the C or C++ code than the LLVM IR, and easier to work with in general.
But from the StoreInst you can use getValueOperand to get the value that is being stored, and then getName on that value. Of course, as I also said in the comments on the previous answer, it is not hard to write code from which the originally stored value is difficult to derive.
In other words, if we have an llvm::Instruction *inst, we could do this:
if (llvm::StoreInst* si = llvm::dyn_cast<llvm::StoreInst>(inst))
{
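// Look through constant bitcasts first; getName() returns a StringRef, so convert explicitly.
std::string name = si->getValueOperand()->stripPointerCasts()->getName().str();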
}
[Code is not tested, not compiled, no guarantee provided, I just wrote it as part of this answer with the intention that it may work]

*Value is not being generated into the LLVM code

I am attempting to write a compiler and use LLVM to generate intermediate code. Unfortunately, I find the LLVM documentation not very good and even somewhat confusing.
At the moment I have the lexer, grammar and AST implemented. I was also following some examples found on the Internet. My current AST works as follows: it has the abstract base class Tree*, from which other trees inherit (for example, one for a variable definition, one for a statement list, one for a binary expression, etc.).
I am trying to implement the variable definition, so for the input
class Test{
int main()
{
int x;
}
}
I want LLVM output to be:
; ModuleID = "Test"
define i32 @main() {
entry:
%x = alloca i32
ret i32 0
}
However, right now I can pass the %x = alloca i32 value to the part where the main function is created, but the actual output is missing the %x = alloca i32. So the output I'm getting is as follows:
; ModuleID = "Test"
define i32 @main() {
entry:
ret i32 0
}
My Codegen() for a variable declaration is shown below (the symbol table is just a list for now; I am trying to keep things as simple as possible at the moment):
llvm::Value *decafStmtList::Codegen() {
string name = SyandTy.back(); // Just a name of a variable
string type = SyandTy.front(); // and its type in string format
Type* typeVal = getLLVMType(decafType(str2DecafType(type))); // get LLVM::*Type representation
llvm::AllocaInst *Alloca = Builder.CreateAlloca(typeVal, 0, name.c_str());
Value *V = Alloca;
return Alloca;//Builder.CreateLoad(V, name.c_str());
}
The part where I am generating my @main is as follows:
Note: I have commented out the print_int function (this is the function I will use later to print things, but for now I don't need it). If I uncomment the print_int function, TheFunction does not pass verifyFunction(*TheFunction): it complains about the module being broken and parameters not matching the signature.
Function *gen_main_def(llvm::Value *RetVal, Function *print_int) {
if (RetVal == 0) {
throw runtime_error("something went horribly wrong\n");
}
// create the top-level definition for main
FunctionType *FT = FunctionType::get(IntegerType::get(getGlobalContext(), 32), false);
Function *TheFunction = Function::Create(FT, Function::ExternalLinkage, "main", TheModule);
if (TheFunction == 0) {
throw runtime_error("empty function block");
}
// Create a new basic block which contains a sequence of LLVM instructions
BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
// All subsequent calls to IRBuilder will place instructions in this location
Builder.SetInsertPoint(BB);
/*
Function *CalleeF = TheModule->getFunction(print_int->getName());
if (CalleeF == 0) {
throw runtime_error("could not find the function print_int\n");
}*/
// print the value of the expression and we are done
// Value *CallF = Builder.CreateCall(CalleeF, RetVal, "calltmp");
// Finish off the function.
// return 0 from main, which is EXIT_SUCCESS
Builder.CreateRet(ConstantInt::get(getGlobalContext(), APInt(32, 0)));
return TheFunction;
}
If someone knows why my Alloca object is not being generated, please help me out - any hints will be greatly appreciated.
Thank you
EDIT:
Codegen is called from the grammar:
start: program
program: extern_list decafclass
{
ProgramAST *prog = new ProgramAST((decafStmtList *)$1, (ClassAST *)$2);
if (printAST) {
cout << getString(prog) << endl;
}
Value *RetVal = prog->Codegen();
delete $1; // get rid of abstract syntax tree
delete $2; // get rid of abstract syntax tree
// we create an implicit print_int function call to print
// out the value of the expression.
Function *print_int = gen_print_int_def();
Function *TheFunction = gen_main_def(RetVal, print_int);
verifyFunction(*TheFunction);
}
EDIT: I figured it out. Basically, CreateAlloca has to be called after the basic block has been created when generating main.
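The ordering issue from this edit can be pictured with a toy builder in plain C++. This is only a sketch, not the LLVM API: ToyBlock and ToyBuilder are hypothetical names, and the real IRBuilder behaves differently in detail. It shows that anything created before the insertion point is set does not end up in the block.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a basic block: an ordered list of instructions.
using ToyBlock = std::vector<std::string>;

// Hypothetical stand-in for an IR builder that appends to a chosen block.
struct ToyBuilder {
    ToyBlock *insertBlock = nullptr;  // where new instructions are appended

    void setInsertPoint(ToyBlock &bb) { insertBlock = &bb; }

    // Appends to the current block and returns true; with no insertion
    // point set, the instruction is dropped and false is returned.
    bool create(const std::string &inst) {
        if (!insertBlock) return false;
        insertBlock->push_back(inst);
        return true;
    }
};
```

With this model, creating "%x = alloca i32" before setInsertPoint leaves the entry block empty, while creating it afterwards and then adding "ret i32 0" gives the two-instruction body the question expects.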
There are two weird things here:
In gen_main_def, all you do is call Builder.CreateRet. I don't see how there could be any other code in main unless you call something that creates the corresponding instructions once the builder is positioned in main's entry block. In particular, you never seem to call the Codegen part from there.
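You pass 0 as the second argument to CreateAlloca. As far as I can tell, that parameter is the array size as a Value*, so a literal 0 is a null pointer and requests a single element rather than a zero-sized allocation; that part is probably fine.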
Also, make sure that you don't call any LLVM optimization passes after generating your code. Those passes would optimize the value away (it's never used, thus dead code).