I want to get the returned value and the debug location (!861) from a return instruction, for example ret i32 %3, !dbg !861, whose metadata is !861 = !DILocation(line: 8, column: 5, scope: !857). But it didn't work.
I am using clang and LLVM 13.0.0.
for (auto &B : F) {
    for (auto &I : B) {
        // get metadata
        if (auto *inst = dyn_cast<ReturnInst>(&I)) {
            // ret i32 %3, !dbg !861
            // !861 = !DILocation(line: 8, column: 5, scope: !857)
            errs() << "!!!return inst: " << *inst << "\n";
            DILocation *DILoc = inst->getDebugLoc().get();
            errs() << " " << DILoc << "." << "\n";
            Type *instTy = inst->getType();
            errs() << " " << *instTy << "." << "\n";
            Value* val = dyn_cast<Value>(inst);
            errs() << " val name: " << val->getName().str() << ".\n";
            if (auto constant_int = dyn_cast<ConstantInt>(val)) {
                int number = constant_int->getSExtValue();
                errs() << " val number: " << number << ".\n";
            }
        }
    }
}
and the result:
!!!return inst: ret i32 %3
0x0.
void.
val name: .
I got almost nothing! Problems:
1. The DILocation pointer is 0x0. Why? I want the information in !861 = !DILocation(line: 8, column: 5, scope: !857).
Update: I found the real problem.
The .ll file I was reading came from clang++ -O0 -g -S -emit-llvm test1.cpp -o test.ll, so it contained the debug metadata.
But when I compiled the code that my pass actually ran on, I called clang++ without -O0 -g, so no debug metadata was generated (it is -g that makes clang emit debug info).
That is why getDebugLoc() gave me an empty llvm::DebugLoc.
After adding those two flags, the code I wrote works!
2. The type of the return instruction is void. Why? I expected it to be the type shown in the ret (i32).
Nick Lewycky said: The return instruction, locally to the current function, is void: it does not produce a value that subsequent instructions in the same function can consume. %a = add i32 %b, %c makes sense, but %a = ret i32 %b does not.
A ret instruction itself always has void type, no matter what type the function is returning.
If you want the type of the returned value, you can ask for inst->getReturnValue()->getType() (getReturnValue() returns null for a bare ret void, so check it first).
Thanks, Nick Lewycky!
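Putting both points together, here is a minimal sketch of reading the returned value, its type, and the source location from a ret. It assumes LLVM 13 headers and a module compiled with -g; describeReturn is a hypothetical helper name, not part of LLVM.

```cpp
#include "llvm/IR/Constants.h"
#include "llvm/IR/DebugLoc.h"
#include "llvm/IR/Instructions.h"
#include "llvm/Support/Casting.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

// Hypothetical helper: print what a `ret` returns and where it came from.
void describeReturn(const ReturnInst &RI) {
    // The instruction itself has void type; the returned value is its operand.
    if (Value *RetVal = RI.getReturnValue()) {  // null for `ret void`
        errs() << "  returned value: " << *RetVal << "\n";
        errs() << "  returned type:  " << *RetVal->getType() << "\n";
        // Only a literal such as `ret i32 0` is a ConstantInt; `%3` is not.
        if (auto *CI = dyn_cast<ConstantInt>(RetVal))
            errs() << "  constant:       " << CI->getSExtValue() << "\n";
    }

    // The DebugLoc is empty unless debug info was emitted (clang -g).
    const DebugLoc &DL = RI.getDebugLoc();
    if (DL)
        errs() << "  line " << DL.getLine() << ", column " << DL.getCol() << "\n";
    else
        errs() << "  no debug location attached\n";
}
```

Inside the loop from the question, this would be called as describeReturn(*inst) after the dyn_cast<ReturnInst> succeeds.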
Recently, I've been trying to create a toy language based on an old tutorial. I had this idea almost half a year ago, but I didn't have the time to work on it until now. While following the tutorial, I modified the source code to get rid of a handful of compile errors (I believe most of them are backward-compatibility issues), but I'm stuck at feeding the "custom-defined code" to the parser.
The error:
eric@pop-os:~/Desktop/my_toy_compiler-master$ echo 'int do_math(int a){ int x = a * 5 + 3 } do_math(10)' | ./parser
0x55a3e10a7580
Generating code...
Generating code for 20NFunctionDeclaration
Creating variable declaration int a
Generating code for 20NVariableDeclaration
Creating variable declaration int x
Creating assignment for x
Creating binary operation 274
Creating integer: 3
Creating binary operation 276
Creating integer: 5
Creating identifier reference: a
parser: /home/eric/llvm-project/llvm/lib/IR/DataLayout.cpp:740: llvm::Align llvm::DataLayout::getAlignment(llvm::Type*, bool) const: Assertion `Ty->isSized() && "Cannot getTypeInfo() on a type that is unsized!"' failed.
Aborted (core dumped)
The problem starts right after 'Creating identifier reference: a', so the relevant code is the natural place to look. Two functions in codegen.cpp matter for this bug.
Excerpt of codegen.cpp:
...
static Type *typeOf(const NIdentifier& type)
{
if (type.name.compare("int") == 0) {
return Type::getInt64Ty(MyContext);
}
else if (type.name.compare("double") == 0) {
return Type::getDoubleTy(MyContext);
}
return Type::getVoidTy(MyContext);
}
...
Value* NIdentifier::codeGen(CodeGenContext& context)
{
std::cout << "Creating identifier reference: " << name << endl;
if (context.locals().find(name) == context.locals().end()) {
std::cerr << "undeclared variable " << name << endl;
return NULL;
}
return new LoadInst(typeOf(name), context.locals()[name], "", false, context.currentBlock());
}
...
The full version of codegen.cpp:
#include "node.h"
#include "codegen.h"
#include "parser.hpp"
using namespace std;
/* Compile the AST into a module */
void CodeGenContext::generateCode(NBlock& root)
{
std::cout << "Generating code...\n";
/* Create the top level interpreter function to call as entry */
vector<Type*> argTypes;
FunctionType *ftype = FunctionType::get(Type::getVoidTy(MyContext), makeArrayRef(argTypes), false);
mainFunction = Function::Create(ftype, GlobalValue::InternalLinkage, "main", module);
BasicBlock *bblock = BasicBlock::Create(MyContext, "entry", mainFunction, 0);
/* Push a new variable/block context */
pushBlock(bblock);
root.codeGen(*this); /* emit bytecode for the toplevel block */
ReturnInst::Create(MyContext, bblock);
popBlock();
/* Print the bytecode in a human-readable format
to see if our program compiled properly
*/
std::cout << "Code is generated.\n";
// module->dump();
legacy::PassManager pm;
pm.add(createPrintModulePass(outs()));
pm.run(*module);
}
/* Executes the AST by running the main function */
GenericValue CodeGenContext::runCode() {
std::cout << "Running code...\n";
ExecutionEngine *ee = EngineBuilder( unique_ptr<Module>(module) ).create();
ee->finalizeObject();
vector<GenericValue> noargs;
GenericValue v = ee->runFunction(mainFunction, noargs);
std::cout << "Code was run.\n";
return v;
}
/* Returns an LLVM type based on the identifier */
static Type *typeOf(const NIdentifier& type)
{
if (type.name.compare("int") == 0) {
return Type::getInt64Ty(MyContext);
}
else if (type.name.compare("double") == 0) {
return Type::getDoubleTy(MyContext);
}
return Type::getVoidTy(MyContext);
}
/* -- Code Generation -- */
Value* NInteger::codeGen(CodeGenContext& context)
{
std::cout << "Creating integer: " << value << endl;
return ConstantInt::get(Type::getInt64Ty(MyContext), value, true);
}
Value* NDouble::codeGen(CodeGenContext& context)
{
std::cout << "Creating double: " << value << endl;
return ConstantFP::get(Type::getDoubleTy(MyContext), value);
}
Value* NIdentifier::codeGen(CodeGenContext& context)
{
std::cout << "Creating identifier reference: " << name << endl;
if (context.locals().find(name) == context.locals().end()) {
std::cerr << "undeclared variable " << name << endl;
return NULL;
}
return new LoadInst(Type::getInt64Ty(MyContext), context.locals()[name], "", false, context.currentBlock());
}
Value* NMethodCall::codeGen(CodeGenContext& context)
{
Function *function = context.module->getFunction(id.name.c_str());
if (function == NULL) {
std::cerr << "no such function " << id.name << endl;
}
std::vector<Value*> args;
ExpressionList::const_iterator it;
for (it = arguments.begin(); it != arguments.end(); it++) {
args.push_back((**it).codeGen(context));
}
CallInst *call = CallInst::Create(function, makeArrayRef(args), "", context.currentBlock());
std::cout << "Creating method call: " << id.name << endl;
return call;
}
Value* NBinaryOperator::codeGen(CodeGenContext& context)
{
std::cout << "Creating binary operation " << op << endl;
Instruction::BinaryOps instr;
switch (op) {
case TPLUS: instr = Instruction::Add; goto math;
case TMINUS: instr = Instruction::Sub; goto math;
case TMUL: instr = Instruction::Mul; goto math;
case TDIV: instr = Instruction::SDiv; goto math;
/* TODO comparison */
}
return NULL;
math:
return BinaryOperator::Create(instr, lhs.codeGen(context),
rhs.codeGen(context), "", context.currentBlock());
}
Value* NAssignment::codeGen(CodeGenContext& context)
{
std::cout << "Creating assignment for " << lhs.name << endl;
if (context.locals().find(lhs.name) == context.locals().end()) {
std::cerr << "undeclared variable " << lhs.name << endl;
return NULL;
}
return new StoreInst(rhs.codeGen(context), context.locals()[lhs.name], false, context.currentBlock());
}
Value* NBlock::codeGen(CodeGenContext& context)
{
StatementList::const_iterator it;
Value *last = NULL;
for (it = statements.begin(); it != statements.end(); it++) {
std::cout << "Generating code for " << typeid(**it).name() << endl;
last = (**it).codeGen(context);
}
std::cout << "Creating block" << endl;
return last;
}
Value* NExpressionStatement::codeGen(CodeGenContext& context)
{
std::cout << "Generating code for " << typeid(expression).name() << endl;
return expression.codeGen(context);
}
Value* NReturnStatement::codeGen(CodeGenContext& context)
{
std::cout << "Generating return code for " << typeid(expression).name() << endl;
Value *returnValue = expression.codeGen(context);
context.setCurrentReturnValue(returnValue);
return returnValue;
}
Value* NVariableDeclaration::codeGen(CodeGenContext& context)
{
std::cout << "Creating variable declaration " << type.name << " " << id.name << endl;
AllocaInst *alloc = new AllocaInst(typeOf(type), NULL, id.name.c_str(), context.currentBlock());
context.locals()[id.name] = alloc;
if (assignmentExpr != NULL) {
NAssignment assn(id, *assignmentExpr);
assn.codeGen(context);
}
return alloc;
}
Value* NExternDeclaration::codeGen(CodeGenContext& context)
{
vector<Type*> argTypes;
VariableList::const_iterator it;
for (it = arguments.begin(); it != arguments.end(); it++) {
argTypes.push_back(typeOf((**it).type));
}
FunctionType *ftype = FunctionType::get(typeOf(type), makeArrayRef(argTypes), false);
Function *function = Function::Create(ftype, GlobalValue::ExternalLinkage, id.name.c_str(), context.module);
return function;
}
Value* NFunctionDeclaration::codeGen(CodeGenContext& context)
{
vector<Type*> argTypes;
VariableList::const_iterator it;
for (it = arguments.begin(); it != arguments.end(); it++) {
argTypes.push_back(typeOf((**it).type));
}
FunctionType *ftype = FunctionType::get(typeOf(type), makeArrayRef(argTypes), false);
Function *function = Function::Create(ftype, GlobalValue::InternalLinkage, id.name.c_str(), context.module);
BasicBlock *bblock = BasicBlock::Create(MyContext, "entry", function, 0);
context.pushBlock(bblock);
Function::arg_iterator argsValues = function->arg_begin();
Value* argumentValue;
for (it = arguments.begin(); it != arguments.end(); it++) {
(**it).codeGen(context);
argumentValue = &*argsValues++;
argumentValue->setName((*it)->id.name.c_str());
StoreInst *inst = new StoreInst(argumentValue, context.locals()[(*it)->id.name], false, bblock);
}
block.codeGen(context);
ReturnInst::Create(MyContext, context.getCurrentReturnValue(), bblock);
context.popBlock();
std::cout << "Creating function: " << id.name << endl;
return function;
}
Note that the original code called the LoadInst constructor with only 3 arguments. I checked the llvm::LoadInst class reference and saw that the constructor now requires at least 4. I figured out that I (and the author) were missing the Type *Ty parameter. Obviously, typeOf(name) in return new LoadInst(typeOf(name), context.locals()[name], "", false, context.currentBlock()); is not a solution: name is the variable's name ('a' in the failing run), not a type name such as "int" or "double", so typeOf(name) always returns void. I suspect this is what triggers Cannot getTypeInfo() on a type that is unsized!, as the assertion says.
In short, I believe I should be looking for something like this:
Value* NIdentifier::codeGen(CodeGenContext& context)
{
std::cout << "Creating identifier reference: " << name << endl;
if (context.locals().find(name) == context.locals().end()) {
std::cerr << "undeclared variable " << name << endl;
return NULL;
}
return new LoadInst(*some magic that return the llvm::type of name identifier*, context.locals()[name], "", false, context.currentBlock());
}
I'm still a noob in llvm, so excuse me if my guess isn't correct. Big thanks for any tips or ideas.
P.S. I tried return new LoadInst(Type::getInt64Ty(MyContext), context.locals()[name], "", false, context.currentBlock());. The terminal broke my heart again with the following:
eric@pop-os:~/Desktop/my_toy_compiler-master$ echo 'int do_math(int a){ int x = a * 5 + 3 } do_math(10)' | ./parser
0x562120785580
Generating code...
Generating code for 20NFunctionDeclaration
Creating variable declaration int a
Generating code for 20NVariableDeclaration
Creating variable declaration int x
Creating assignment for x
Creating binary operation 274
Creating integer: 3
Creating binary operation 276
Creating integer: 5
Creating identifier reference: a
Creating block
Creating function: do_math
Generating code for 20NExpressionStatement
Generating code for 11NMethodCall
Creating integer: 10
Creating method call: do_math
Creating block
Code is generated.
; ModuleID = 'main'
source_filename = "main"
@.str = private constant [4 x i8] c"%d\0A\00"
declare i32 @printf(i8*, ...)
define internal void @echo(i64 %toPrint) {
entry:
%0 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i32 0, i32 0), i64 %toPrint)
ret void
}
define internal void @main() {
entry:
%0 = call i64 @do_math(i64 10)
ret void
}
define internal i64 @do_math(i64 %a1) {
entry:
%a = alloca i64, align 8
store i64 %a1, i64* %a, align 4
%x = alloca i64, align 8
%0 = load i64, i64* %a, align 4
%1 = mul i64 %0, 5
%2 = add i64 %1, 3
store i64 %2, i64* %x, align 4
ret void
}
Running code...
Function context does not match Module context!
void (i64)* @echo
in function echo
LLVM ERROR: Broken function found, compilation aborted!
Aborted (core dumped)
It's sad that my core is dumped anyway.
I have the following:
#include <functional>
#include <iostream>
#include <memory>
#include <string>

void print_str(std::shared_ptr<std::string> str) {
    std::cout << str->c_str() << std::endl;
}
int main() {
    auto str = std::make_shared<std::string>("Hello");
    std::function<void()> f = std::bind(print_str, str);
    f(); // correctly prints: Hello
    return 0;
}
I thought the type of std::bind(print_str, str) would be std::function<void(std::shared_ptr<std::string>)>, yet the code above compiles and runs correctly. Is there some trick in std::bind?
env: centos, gcc82
What std::bind does is correct. It uses the value you provided (str) for the call to print_str, so you no longer need to pass it: that parameter is always filled in with the bound value.
#include <iostream>
#include <functional>
int sum(int value1, int value2) {
    return value1 + value2;
}
int main() {
    std::function<int(int, int)> f1 = std::bind(sum, std::placeholders::_1, std::placeholders::_2);
    std::function<int(int)> f2 = std::bind(sum, 10, std::placeholders::_1);
    std::function<int()> f3 = std::bind(sum, 100, 200);
    std::function<int(int)> f4 = std::bind(sum, std::placeholders::_1, 200);
    int a = 1;
    int b = 2;
    std::cout << "the sum of " << a << " and " << b << " is: " << f1(a, b) << std::endl;
    std::cout << "the sum of " << 10 << " and " << b << " is: " << f2(b) << std::endl;
    std::cout << "the sum of " << 100 << " and " << 200 << " is: " << f3() << std::endl;
    std::cout << "the sum of " << 200 << " and " << b << " is: " << f4(b) << std::endl;
    return 0;
}
output:
the sum of 1 and 2 is: 3
the sum of 10 and 2 is: 12
the sum of 100 and 200 is: 300
the sum of 200 and 2 is: 202
f1 binds no values but placeholders and returns an int(int, int) like function
f2 binds one value and one placeholder and returns an int(int) like function
f3 binds two values and no placeholder and returns an int() like function
f4 is like f2 except that the placeholder is now the first parameter instead of the second one.
Your code falls into the f3 case.
You wrote: "I think the type of std::bind(print_str, str) is std::function<void(std::shared_ptr<std::string>)>"
No, the type of std::bind(print_str, str) is an unspecified functor type, something like
class binder
{
    void(*f)(std::shared_ptr<std::string>);
    std::shared_ptr<std::string> p;
public:
    template<typename... Args>
    void operator()(Args... ) { f(p); }
};
Note that this is callable with any arguments or none.
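A minimal sketch of that last point, using the real std::bind instead of the hand-written binder. read_str is a hypothetical stand-in for print_str that returns the text instead of printing it, so the result can be checked:

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>

// Hypothetical stand-in for print_str: returns the text so it can be compared.
std::string read_str(std::shared_ptr<std::string> str) { return *str; }

// Calls one bound object with zero, one, and two extra arguments.
void demo_extra_arguments() {
    auto str = std::make_shared<std::string>("Hello");
    auto bound = std::bind(read_str, str);  // no placeholders at all

    assert(bound() == "Hello");                // no arguments
    assert(bound(42) == "Hello");              // the extra argument is discarded
    assert(bound("unused", 3.14) == "Hello");  // so are two of them
}
```

Since no placeholder refers to the call-time arguments, they are simply dropped. This is also why the same object converts to std::function<void()> without complaint.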
What you are experiencing here is correct and is precisely doing what std::bind was designed for.
Simply speaking:
It turns a function taking n parameters into a function taking m parameters (where n >= m).
In your particular case, you give it a function taking one parameter and get back a function taking zero parameters. This new function will internally call print_str and always pass str as argument.
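As a sketch of the n-to-m idea with n = 3 and m = 1 (clamp_to and make_percent_clamp are hypothetical names used only for illustration):

```cpp
#include <cassert>
#include <functional>

// Hypothetical three-parameter function.
int clamp_to(int low, int high, int value) {
    return value < low ? low : (value > high ? high : value);
}

// Fix the first two parameters; the remaining one becomes the only parameter.
std::function<int(int)> make_percent_clamp() {
    return std::bind(clamp_to, 0, 100, std::placeholders::_1);
}

void demo_partial_application() {
    auto clamp_percent = make_percent_clamp();
    assert(clamp_percent(-5) == 0);     // below the bound low value
    assert(clamp_percent(42) == 42);    // inside the range
    assert(clamp_percent(250) == 100);  // above the bound high value
}
```

The caller of clamp_percent never sees low or high; they were fixed when std::bind was called, just as str is fixed in the question.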
Side note:
Since C++11 also introduced lambdas, std::bind is more or less redundant.
What you are doing is exactly equivalent to this:
#include <functional>
#include <iostream>
#include <memory>
#include <string>

void print_str(std::shared_ptr<std::string> str) {
    std::cout << str->c_str() << std::endl;
}
int main() {
    auto str = std::make_shared<std::string>("Hello");
    std::function<void()> f = [=]() { print_str(str); };
    f(); // correctly prints: Hello
    return 0;
}
Hopefully this also helps you understand what std::bind does behind the scenes.
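One way to see the equivalence is to check that both forms keep their own copy of the shared_ptr. The sketch below uses hypothetical helper names; read_text returns the string instead of printing it:

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>

// Hypothetical stand-in for print_str that returns the text.
std::string read_text(std::shared_ptr<std::string> str) { return *str; }

// Owners of the string while a std::bind object holding it is alive.
long owners_with_bind() {
    auto str = std::make_shared<std::string>("Hello");
    auto f = std::bind(read_text, str);  // stores a copy of str
    assert(f() == "Hello");
    return str.use_count();              // str itself plus the stored copy
}

// Owners of the string while a [=] lambda capturing it is alive.
long owners_with_lambda() {
    auto str = std::make_shared<std::string>("Hello");
    auto f = [=]() { return read_text(str); };  // captures a copy of str
    assert(f() == "Hello");
    return str.use_count();                     // str itself plus the capture
}
```

Both functions report two owners: the callable keeps the string alive for as long as the callable exists, whichever way it was built.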
I am creating an LLVM pass and there is something I don't understand: when I look into the .ll file, the argument of a function call has a name:
call void @_ZNK2xi9spawnable9SpawnableIFvbdEEclEbd(%"class.xi::spawnable::Spawnable.0"* nonnull @_ZN2xi9spawnable2f2E, i1 zeroext %9, double %10)
So here the name of the first argument seems to be _ZN2xi9spawnable2f2E.
But in my pass, getName() returns an empty string. When I print the full argument I get: %"class.xi::spawnable::Spawnable.0"* %1
How can I obtain the same name as in the .ll file?
EDIT: This is a part of the code (I tried to clean it up a little so maybe there are some missing brackets)
virtual bool runOnFunction(Function &F){
    LoopInfo &LI = getAnalysis<LoopInfoWrapperPass>().getLoopInfo();
    std::string Name = demangle(F.getName ());
    outs() << "Function "<< *(F.getFunctionType()) <<" " << Name << " {\n";
    for(LoopInfo::iterator i = LI.begin(), e = LI.end(); i!=e; ++i)
        BlocksInLoop (*i,0);
    for( Function::iterator b = F.begin() , be = F.end() ;b != be; ++b){
        for(BasicBlock::iterator i = b->begin() , ie = b->end();i != ie; ++i){
            if(isa<CallInst>(&(*i)) || isa<InvokeInst>(&(*i))){
                if (!(i->getMetadata("seen"))){
                    Function * fct =NULL;
                    if (isa<CallInst>(&(*i)))
                        fct = cast<CallInst>(&(*i))->getCalledFunction();
                    if (isa<InvokeInst>(&(*i)))
                        fct = cast<InvokeInst>(&(*i))->getCalledFunction();
                    if (fct){
                        outs()<<"Call " << *(fct->getFunctionType()) <<" "<< demangle(fct->getName()) << "\n";
                        for(Function::arg_iterator argi=fct->arg_begin(),arge=fct->arg_end(); argi!=arge;argi++ )
                            outs()<< argi->getName()<<"\n";
                    }
                }
            }
        }
    }
    outs() << "}\n";
    return(false);
};
You are inspecting the called function, not the call site. When you look at the function, you only see its formal parameters and cannot know which values are passed to it.
Instead of calling ->getCalledFunction() and iterating over its args, you should iterate over the operands of cast<CallInst>(&(*i)). See the ->op_begin() and value_op_begin() methods. Note that the operand list of a call also contains the callee itself, not only the arguments.
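A minimal sketch of walking the call-site arguments, assuming LLVM 13 headers (printCallSiteArgs is a hypothetical helper name). It uses CallBase, the common base class of CallInst and InvokeInst, and its args() range, which covers only the actual arguments:

```cpp
#include "llvm/IR/InstrTypes.h"
#include "llvm/IR/Instruction.h"
#include "llvm/IR/Value.h"
#include "llvm/Support/Casting.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

// Hypothetical helper: print the values passed at one call or invoke.
void printCallSiteArgs(const Instruction &I) {
    const auto *CB = dyn_cast<CallBase>(&I);
    if (!CB)
        return;  // not a call site

    for (const Use &U : CB->args()) {
        const Value *Arg = U.get();
        if (Arg->hasName())
            outs() << Arg->getName() << "\n";  // e.g. a global such as _ZN2xi9spawnable2f2E
        else
            outs() << *Arg << "\n";            // unnamed temporaries like %9 have no name
    }
}
```

In the pass above, this would replace the loop over fct->arg_begin() and be called with *i. Unnamed values such as %9 and %10 only get their numbers when the IR is printed, so getName() is expected to be empty for them.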
I am trying to extract what operands are being used in an if instruction in LLVM IR.
For example: for an instruction like if(x==10), I want x and 10 as output.
Is this not how it should be done:
if (ICmpInst* iCmpInst = dyn_cast<ICmpInst>(&*i))
{
errs() << "Conditional Instruction found: ";
errs() << iCmpInst->getOpcodeName() << '\t';
errs() << iCmpInst->getPredicate() << '\t';
MDNode* metadata = iCmpInst->getMetadata("dbg");
llvm::MDNode::op_iterator o_begin = metadata->op_begin();
llvm::MDNode::op_iterator o_end = metadata->op_end();
for(; o_begin != o_end; ++o_begin)
{
errs() << o_begin << "\n";
}
}
For variables such as x, I think I have to scan the store instructions...
If you just want the operands, you can try:
Value* opl = iCmpInst->getOperand(0);
Value* opr = iCmpInst->getOperand(1);