What does CallInst::Create() return in LLVM?

Considering
static CallInst *Create(Value *Func,
ArrayRef<Value *> Args,
const Twine &NameStr = "",
Instruction *InsertBefore = 0)
this function, I wonder what its return value means.
For example, in the following code,
int foo(int a);
...
Function *foo_ptr = ~~; // say, foo is referred to through getOrInsertFunction()
CallInst *ptr = CallInst::Create(foo_ptr, .../* properly set */);
the CallInst *ptr is the return value. Abstractly, does ptr mean
1. an integer value returned by int foo(int); or
2. the CALL instruction?
I thought number 2 was the answer, but I started to get confused after looking at some code.

Both 1 and 2 are "true". It returns the call instruction, whose "value", when we execute the code, will be the return value of the function.
To illustrate, take this little Pascal program:
program p;
function f: integer;
begin
f := 42;
end; { f }
begin
writeln(f);
end.
Which translates to this LLVM-IR:
; ModuleID = 'TheModule'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
%text = type { i32, i8*, i32, i32 }
@input = global %text zeroinitializer, align 8
@output = global %text zeroinitializer, align 8
@UnitIniList = constant [1 x i8*] zeroinitializer
define i32 @P.f() #0 {
entry:
%f = alloca i32, align 4
store i32 42, i32* %f
%0 = load i32, i32* %f
ret i32 %0
}
define void @__PascalMain() #0 {
entry:
%calltmp = call i32 @P.f()
call void @__write_int(%text* @output, i32 %calltmp, i32 0)
call void @__write_nl(%text* @output)
ret void
}
declare void @__write_int(%text*, i32, i32)
declare void @__write_nl(%text*)
attributes #0 = { "no-frame-pointer-elim"="true" }
The call i32 @P.f() is generated by:
inst = builder.CreateCall(calleF, argsV, "calltmp");
The contents of inst are %calltmp = call i32 @P.f(), and that is a CallInst 'value'.
inst is then returned to the code that evaluates the expression for the argument to writeln.
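The idea that the returned pointer is both the instruction and the value it produces can be modelled without LLVM. The sketch below is a toy class hierarchy, not LLVM's real classes: ToyValue, ToyCall and ToyWrite are names made up for illustration. It only mirrors the point that a call instruction derives from a value type, so a pointer to it can be handed to a later instruction as an operand.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only; these are NOT LLVM's classes.
struct ToyValue {
    std::string name;  // SSA-style result name, e.g. "calltmp"
    explicit ToyValue(std::string n) : name(std::move(n)) {}
    virtual ~ToyValue() = default;
    virtual std::string print() const = 0;  // textual form of the definition
};

// A call is an instruction and, through inheritance, also a value.
struct ToyCall : ToyValue {
    std::string callee;
    ToyCall(std::string result, std::string fn)
        : ToyValue(std::move(result)), callee(std::move(fn)) {}
    std::string print() const override {
        return "%" + name + " = call i32 @" + callee + "()";
    }
};

// A second instruction that consumes previously created values as operands.
struct ToyWrite : ToyValue {
    std::vector<const ToyValue *> operands;
    explicit ToyWrite(std::vector<const ToyValue *> ops)
        : ToyValue(""), operands(std::move(ops)) {}
    std::string print() const override {
        std::string text = "call void @write_int(";
        for (std::size_t i = 0; i < operands.size(); ++i) {
            assert(operands[i] != nullptr && "operand must be a defined value");
            if (i != 0)
                text += ", ";
            text += "i32 %" + operands[i]->name;
        }
        return text + ")";
    }
};
```

Constructing a ToyCall named "calltmp" and passing its address to a ToyWrite plays the same role as passing inst to the code that emits the write call: the pointer identifies the call instruction, and using it as an operand refers to the result that call will produce at run time.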

Related

Pointer type is hidden or not shown on LLVM function type

I have a piece of code that constructs IR within a function, as shown below.
SmallVector<Type*, 24> argTypes;
Type *t_1xi32 = Type::getInt32Ty(TheContext);
Type *t_2xi32 = FixedVectorType::get(t_1xi32, 2); t_2xi32->dump();
Type *ptrType = t_2xi32->getPointerTo(0); ptrType->dump();
argTypes.push_back(ptrType);
FunctionType *funcType = FunctionType::get(TheBuilder.getInt32Ty(), argTypes, false);
Function *func = Function::Create(funcType, Function::ExternalLinkage, "test0", TheModule);
BasicBlock *entryBB = BasicBlock::Create(TheContext, "entry", func);
TheBuilder.SetInsertPoint(entryBB);
TheBuilder.CreateGEP(TheBuilder.getInt32Ty(), func->getArg(0), TheBuilder.getInt32(0));
TheBuilder.CreateRet(TheBuilder.getInt32(0));
But when I compile and dump the function, I get:
define i32 @test0(ptr %0) {
entry:
%1 = getelementptr i32, ptr %0, i32 0
%2 = getelementptr i32, ptr %0, i32 1
ret i32 0
}
It confuses me, because I would expect something like this:
%1 = getelementptr i32, <2 x i32>* %0, i32 0

LLVM IR How to pass struct to function

I'm making my own C-like language and I'm trying to pass a struct to a function. The struct represents an array (one member is a pointer to the array and the other member is the length).
If I call the function "test" like this: call void @test(%structintarray %a) I get the error: '%a' defined with type '%structintarray*' but expected '%structintarray = type { i32*, i32 }'.
But if I call "test" like this: call void @test(%structintarray* %a) I get the error: '@test' defined with type 'void (%structintarray)*' but expected 'void (%structintarray*)*'. I don't understand this second error.
What am I doing wrong here?
void test(int[] a) {
}
int main() {
int[] a = new int[5];
test(a);
return 0;
}
generates:
%structintarray = type { i32*, i32 }
define void @test(%structintarray %__p__a) {
entry:
%a = alloca %structintarray, align 4
store %structintarray %__p__a, %structintarray* %a, align 4
ret void
}
define i32 @main() {
entry:
%t0 = call noalias i8* @calloc(i32 5, i32 4)
%t1 = bitcast i8* %t0 to i32*
%a = alloca %structintarray, align 4
%t2 = getelementptr %structintarray, %structintarray* %a, i32 0, i32 0
store i32* %t1, i32** %t2, align 4 ; pointer to array
%t3 = getelementptr %structintarray, %structintarray* %a, i32 0, i32 1
store i32 5, i32* %t3, align 4 ; size of array
call void @test(%structintarray %a)
ret i32 0
}

llvm pass replaceAllUsesWith type not match?

I am trying to replace a GlobalVariable with an encrypted string, but the types do not match.
The GlobalVariable is a const char * string.
The code looks like this:
GlobalVariable* GV = *it;
//get clear text string
std::string clearstr = getGlobalStringValue(GV);
GlobalVariable::LinkageTypes lt = GV->getLinkage();
//encrypt current string
std::string encryptedString = stringEncryption(clearstr);
//create new global string with the encrypted string
std::ostringstream oss;
oss << ".encstr" << encryptedStringCounter << "_" << sys::Process::GetRandomNumber();
Constant *cryptedStr = ConstantDataArray::getString(M.getContext(), encryptedString, true);
GlobalVariable* gCryptedStr = new GlobalVariable(M, cryptedStr->getType(), true, GV->getLinkage(), cryptedStr, oss.str());
StringMapGlobalVars[oss.str()] = gCryptedStr;
//replace use of clear string with encrypted string
GV->replaceAllUsesWith(gCryptedStr);
but it fails with:
Assertion failed: (New->getType() == getType() && "replaceAllUses of value with new value of different type!"),
First of all, I recommend replacing everything with a value of the right type in LLVM IR; that is why this assertion is there.
However:
You get this assertion because your strings do not match in length. A global string is represented as an array of characters (i.e. i8 values), so the type of your string is [len x i8], where len is the length of your string including the terminating NUL.
@.str = private unnamed_addr constant [12 x i8] c"hello world\00", align 1
What you can do is write your own replacement function like this:
template<typename T>
void ReplaceUnsafe(T *from, T *to) {
while (!from->use_empty()) {
auto &U = *from->use_begin();
U.set(to);
}
from->eraseFromParent();
}
However, this is (as the function name indicates) unsafe and here is why:
Consider the following C/C++ code:
int main() {
return "hello world"[9];
}
which will just return the int representation of 'l'.
Compiled to IR it looks like this:
@.str = private unnamed_addr constant [12 x i8] c"hello world\00", align 1
; Function Attrs: nounwind
define i32 @main() #0 {
entry:
%retval = alloca i32, align 4
store i32 0, i32* %retval
%0 = load i8* getelementptr inbounds ([12 x i8]* @.str, i32 0, i64 9), align 1
%conv = sext i8 %0 to i32
ret i32 %conv
}
If the string is now replaced with something of a different type (e.g., something of type [7 x i8]), then you may end up with a problem, because your GEP instruction has 9 as a constant index. This will result in an out-of-bounds access. I don't know whether the LLVM verifier pass catches this when it looks at GEP instructions (if you run it).
Constant *cryptedStr = ConstantDataArray::getString(M.getContext(), encryptedString, true);
change to
Constant *cryptedStr = ConstantDataArray::getString(M.getContext(), encryptedString, false);
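Because the array type encodes the length, the replacement only type-checks when the encrypted text has exactly as many bytes as the clear text and the NUL terminator is treated the same way for both. The sketch below is a stand-alone illustration of that arithmetic, not the stringEncryption function from the question (whose body is not shown). xorEncrypt and arrayTypeFor are hypothetical helpers, and a single-byte XOR is used only because it trivially preserves length.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: XOR every byte with a key. The output always has the
// same number of bytes as the input, although it may contain embedded zeros.
std::string xorEncrypt(const std::string &clear, unsigned char key) {
    std::string out = clear;
    for (char &c : out)
        c = static_cast<char>(static_cast<unsigned char>(c) ^ key);
    assert(out.size() == clear.size() && "length must be preserved");
    return out;
}

// Hypothetical helper: spell out the IR array type a string of this size
// would get, counting one extra byte when a terminating NUL is appended.
std::string arrayTypeFor(const std::string &s, bool addNull) {
    std::size_t len = s.size() + (addNull ? 1 : 0);
    return "[" + std::to_string(len) + " x i8]";
}
```

For "hello world", both the clear text and its XOR-ed form map to [12 x i8] when a NUL is appended, so the two globals would have the same type. Passing false for the last argument of ConstantDataArray::getString drops that extra byte, which is why it changes the resulting array type.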

Need insights about writing a pass

For my source code, I have the following IR:
; ModuleID = '<stdin>'
@.str = private unnamed_addr constant [9 x i8] c"SOME_ENV_VAR\00", align 1
@.str1 = private unnamed_addr constant [26 x i8] c"Need to set $ENV_Variable.\0A\00", align 1
; Function Attrs: nounwind
define void @foo(i8* %bar) #0 {
entry:
%bar.addr = alloca i8*, align 4
%baz = alloca i8*, align 4
store i8* %bar, i8** %bar.addr, align 4
%call = call i8* @getenv(i8* getelementptr inbounds ([9 x i8]* @.str, i32 0, i32 0)) #2
store i8* %call, i8** %baz, align 4
%0 = load i8** %baz, align 4
%cmp = icmp eq i8* %0, null
br i1 %cmp, label %if.then, label %if.else
if.then: ; preds = %entry
%call1 = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([26 x i8]* @.str1, i32 0, i32 0))
br label %if.end
if.else: ; preds = %entry
%1 = load i8** %bar.addr, align 4
%2 = load i8** %baz, align 4
%call2 = call i8* @strcpy(i8* %1, i8* %2) #2
br label %if.end
if.end: ; preds = %if.else, %if.then
ret void
}
; Function Attrs: nounwind
declare i8* @getenv(i8*) #0
declare i32 @printf(i8*, ...) #1
; Function Attrs: nounwind
declare i8* @strcpy(i8*, i8*) #0
I intend to write a pass, which when compiled (using LLVM), produces bitcode where the call to strcpy(dest,src) is replaced with strncpy(dest,src,n).
I've written the following code so far:
#include <stdlib.h>
#include <stdio.h>
#include "llvm/Pass.h"
#include "llvm/IR/Function.h"
#include "llvm/Support/raw_ostream.h"
#include "llvm/IR/Module.h"
#include "llvm/PassManager.h"
#include "llvm/Analysis/Verifier.h"
#include "llvm/Assembly/PrintModulePass.h"
#include "llvm/IR/IRBuilder.h"
using namespace llvm;
namespace
{
Module* makeLLVMModule() {
Module* mod = new Module(llvm::StringRef("CustomPass"),getGlobalContext());
Constant* c = mod->getOrInsertFunction(llvm::StringRef("foo"),Type::getInt32Ty(getGlobalContext()),NULL);
Function* foo = cast<Function>(c);
Function::arg_iterator args =foo->arg_begin();
Value* bar = args++;
BasicBlock* Entry = BasicBlock::Create(getGlobalContext(),llvm::Twine("Entry"), foo);
BasicBlock* False = BasicBlock::Create(getGlobalContext(),llvm::Twine("False"), foo);
BasicBlock* True = BasicBlock::Create(getGlobalContext(),llvm::Twine("True"), foo);
char* pPath;
pPath = getenv("SOME_ENV_VAR");
IRBuilder<> builder(Entry);
Value* envVarDoesntExist = builder.CreateICmpEQ(llvm::StringRef(pPath),Constant::getNullValue(Value),llvm::Twine("temp"));
//---1
builder.CreateCondBr(envVarDoesntExist, False, True);
builder.SetInsertPoint(True);
builder.CreateCall3(strncpy,bar,llvm::StringRef(pPath),45,llvm::Twine("temp"));
//---2
builder.SetInsertPoint(False);
builder.CreateCall(printf,llvm::StringRef("Need to set $ENV_Variable.\n"),llvm::Twine("temp"));
//---1
return mod;
}
}
char funcP::ID = 0;
static RegisterPass<funcP> X("funcp", "funcP", false, false);
From ---1: How do I convert an llvm::StringRef to a Value*?
From ---2: How do I convert a char* to a Value*?
Could Constant::getNullValue(Value) be used to get a NULL value?
"I intend to write a pass, which when compiled (using LLVM), produces bitcode where the call to strcpy(dest,src) is replaced with strncpy(dest,src,n)."
Then what you need to do is to locate the call instruction and change it. There's no need to recreate the entire flow, it's already in your source code.
All you need to do is to create a function pass, iterate over all the instructions in the function, and if the instruction is a call instruction and the callee's name is strcpy then create a new call instruction to your new function, then replace the old instruction with the new instruction.
Also there seems to be some fundamental misunderstanding in your code between values in the compiler (such as 45 and all the StringRefs) and values in the code you are processing (instances of one of the subtypes of llvm::Value). Specifically, you can't just use 45 as a parameter to a function in the code you are processing - you have to create a constant int from that number, and then you can use that constant.
One final note - you can implicitly construct a StringRef from a const char*, you don't need to explicitly call the StringRef's constructor all over the place. Same with Twine.
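The shape of that rewrite can be sketched without LLVM. The code below is only a toy model: ToyCallInst and replaceStrcpy are hypothetical names, operands are plain strings rather than llvm::Value objects, and the i32 type for the length is an assumption. It shows the two points made above: the pass only walks existing instructions and rewrites the matching call, and a host-side number such as 45 has to be turned into an operand of the processed code before it can be passed to the callee.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a call instruction; NOT an LLVM type.
struct ToyCallInst {
    std::string callee;
    std::vector<std::string> args;  // operands, already in textual "IR" form
};

// Walk every instruction and rewrite strcpy(dest, src) into
// strncpy(dest, src, n). Returns how many calls were rewritten.
int replaceStrcpy(std::vector<ToyCallInst> &function, unsigned n) {
    int changed = 0;
    for (ToyCallInst &inst : function) {
        if (inst.callee != "strcpy")
            continue;  // leave every other call untouched
        assert(inst.args.size() == 2 && "strcpy takes dest and src");
        inst.callee = "strncpy";
        // n is a value inside the compiler; it becomes a constant operand
        // of the code being processed (i32 is assumed here).
        inst.args.push_back("i32 " + std::to_string(n));
        ++changed;
    }
    return changed;
}
```

In a real pass the loop would run over the instructions of each function, the name check would be made on the called function, and the constant would be created through LLVM's constant classes rather than by string concatenation.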

How to get the value of a string literal in LLVM IR?

I'm new to LLVM. I'm trying to write a basic Pass that will inspect the arguments of a printf call, when it is given the Intermediate Representation.
If the format string is not a string literal, then of course I can't inspect it. But quite often, it is.
The sample IR I'm trying to inspect is:
@.str = private unnamed_addr constant [7 x i8] c"Hi %u\0A\00", align 1
define i32 @main() nounwind {
entry:
%retval = alloca i32, align 4
store i32 0, i32* %retval
%call = call i32 (i8*, ...)* @printf(i8* getelementptr inbounds ([7 x i8]* @.str, i32 0, i32 0), i32 1)
ret i32 0
}
declare i32 @printf(i8*, ...)
I found the preexisting Pass called ExternalFunctionsPassedConstants, which seemed relevant:
struct ExternalFunctionsPassedConstants : public ModulePass {
static char ID; // Pass ID, replacement for typeid
ExternalFunctionsPassedConstants() : ModulePass(ID) {}
virtual bool runOnModule(Module &M) {
for (Module::iterator I = M.begin(), E = M.end(); I != E; ++I) {
if (!I->isDeclaration()) continue;
bool PrintedFn = false;
for (Value::use_iterator UI = I->use_begin(), E = I->use_end();
UI != E; ++UI) {
Instruction *User = dyn_cast<Instruction>(*UI);
if (!User) continue;
CallSite CS(cast<Value>(User));
if (!CS) continue;
...
So I added the code:
if (I->getName() == "printf") {
errs() << "printf() arg0 type: "
<< CS.getArgument(0)->getType()->getTypeID() << "\n";
}
So far, so good -- I see that the type ID is 14, which means it's a PointerTyID.
But now, how do I get the contents of the string literal that is being passed as an argument, so I can validate the number of expected arguments against the number actually given?
CS.getArgument(0) represents the GetElementPtrConstantExpr
i8* getelementptr inbounds ([7 x i8]* @.str, i32 0, i32 0)
which is a User object. The string you want (i.e. @.str) is this GetElementPtrConstantExpr's first operand.
So, you can get the string literal through
cast<User>(CS.getArgument(0))->getOperand(0)
(getArgument returns a Value*, so it has to be cast to User before getOperand can be called.)
However, I have not tested this code. If there are any mistakes, please tell me.
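The check the question is ultimately after, comparing the number of conversions in the format string with the number of arguments actually passed, needs no LLVM once the literal's text is in hand. As a rough sketch, assuming the string contents have already been extracted into a std::string (that step is not shown here), a hypothetical helper countFormatArgs could look like this. It is deliberately simplified and does not handle positional %n$ arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Count the arguments a printf-style format string consumes.
// Simplified: "%%" consumes nothing, each '*' width or precision consumes
// one extra argument, and positional "%n$" forms are not handled.
int countFormatArgs(const std::string &fmt) {
    // Flags, width, precision and length modifiers that may precede the
    // conversion character.
    const std::string skippable = "-+ #0123456789.*hlLqjzt";
    int count = 0;
    for (std::size_t i = 0; i < fmt.size(); ++i) {
        if (fmt[i] != '%')
            continue;
        ++i;  // move past '%'
        if (i < fmt.size() && fmt[i] == '%')
            continue;  // "%%" is a literal percent sign
        while (i < fmt.size() && skippable.find(fmt[i]) != std::string::npos) {
            if (fmt[i] == '*')
                ++count;  // '*' reads its value from the argument list
            ++i;
        }
        if (i < fmt.size())
            ++count;  // fmt[i] is the conversion character itself
    }
    return count;
}
```

For the sample IR, countFormatArgs("Hi %u\n") gives 1, which matches the single i32 1 passed after the format string.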