invalid getelementptr indices - llvm

I'm getting "invalid getelementptr indices" on the last line of this llvm-IR code:
%alc = alloca %mytype*
store %mytype* %obj, %mytype** %alc
%ldc = load %mytype** %alc
%gcs = getelementptr inbounds %mytype* %ldc, i32 0, i32 1
where mytype is defined as follows:
%mytype = type {i32, %tp1**, %tp1}
I have another similar type that indexing over it doesn't cause the above error and is defined as:
%mytype2 = type {i32, i16*, %tp1}
Any help to resolve this problem would be appreciated.

The error is caused because %mytype does not define a valid type. Normally LLVM reports an error on the type itself, but if the type definition appears later than a getelementptr (GEP) usage, then you only get an error from the GEP and not from the type.
If you move the definition of %mytype to appear before the GEP in the IR file you'll see a more appropriate error message.
In this case, I'm guessing the problem is that %mytype is incomplete: either the definition of %tp1 is missing, or the definition of a type it uses (e.g. %tp2, which your comment says it uses) is missing, or something along those lines.
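As a rough illustration of that point, here is a minimal sketch of a module in which every named type is fully defined before the GEP that indexes into it. It uses the same older typed-pointer syntax as the question. The body given to %tp1 and the function name @second_field are placeholders invented for the example, since the real definition of %tp1 is not shown in the question.

```llvm
; Assumption: the real body of %tp1 is not shown in the question;
; { i32 } is only a placeholder so that the type is complete.
%tp1 = type { i32 }
%mytype = type { i32, %tp1**, %tp1 }

; Hypothetical wrapper function around the four instructions from the question.
define %tp1*** @second_field(%mytype* %obj) {
entry:
  %alc = alloca %mytype*
  store %mytype* %obj, %mytype** %alc
  %ldc = load %mytype** %alc
  ; Field 1 has type %tp1**, so the GEP yields a %tp1***.
  %gcs = getelementptr inbounds %mytype* %ldc, i32 0, i32 1
  ret %tp1*** %gcs
}
```

Newer LLVM releases spell load and getelementptr with an explicit type operand and use the opaque ptr type, so the instructions would need adjusting there.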
By the way, you might want to use my IR editor, it would help you quickly find these sorts of errors.

Related

Detecting free function inside a "combined" LLVM IR instruction?

I can easily find free in this IR call instruction with getCalledFunction():
call void @free(i8* %10) #4, !dbg !53
However, I can't seem to know how to find it in this call instruction:
%call7 = call i32 bitcast (i32 (...)* @free to i32 (%struct.Bar*)*)(%struct.Bar* %7), !dbg !56
This instruction is combining a BitCast with a call instruction. I am not sure if "combining" is the proper phrase, but nevertheless, how can I detect free here?
I tried a dyn_cast to a BitCast, but the cast fails. I even called getCalledOperand() first and tried casting the Value it returns to a BitCast, and that still doesn't detect it. I would appreciate any help with this.
Thanks!
@arnt answered this in the comments, so I'm adding the answer for everyone else.
@arnt: The first argument to the call is a ConstantExpr, returned by getBitCast. cast<ConstantExpr>(foo)->getOperand(0) will return the free.

LLVM Pass: to change the function call's argument values

As part of my project, based on some analysis, I have to change a function call's arguments. I am doing it at the LLVM IR level, something like this:
doWork("work",functionBefore)
Based on my results, my LLVM pass should be able to replace the function pointer passed to the call, like this:
doWork("work",functionAfter)
Assume functionBefore and functionAfter have the same return type.
Is it possible to change the arguments using an LLVM pass?
Or should I delete the instruction and recreate the one I need?
Please give some suggestions or directions on how to do this.
The LLVM IR for the call would be something like this:
invoke void @_Z7processNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEPFvS4_E(%"class.std::__cxx11::basic_string"* nonnull %1, void (%"class.std::__cxx11::basic_string"*)* nonnull @_Z9functionBNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE) to label %7 unwind label %13

What does a ".number" following a function name mean in LLVM IR?

In LLVM IR, I see a "." and a number following a function name,
such as
@kmalloc.2670, @kmalloc.19
What does this number mean?
I often see the same function name followed by different numbers, even though the definitions of the two functions are identical.
Can anybody help me?
define internal i8* @kmalloc.2670(i64 %size, i32 %flags) #5 !dbg !436635
define internal i8* @kmalloc.19(i64 %size, i32 %flags) #5 !dbg !1202009
Is this right?
LLVM docs:
One nice thing about LLVM is that the name is just a hint. For
instance, if the code above emits multiple “addtmp” variables, LLVM
will automatically provide each one with an increasing, unique numeric
suffix. Local value names for instructions are purely optional, but it
makes it much easier to read the IR dumps.

LLVM IR jump to a basicblock within another function

I have generated IR with my pass. Inside a function in this IR, I would like to jump back to a basic block of the caller function. Inside the caller, the ext_callee function is invoked like this:
%4 = call i1 @ext_callee(i32 32, i32 %3, i32 -4, i8* blockaddress(@tobecalled, %5), i8* blockaddress(@tobecalled, %7))
The last two parameters are the basicblock addresses I would like to jump to inside this ext_callee function.
I tried to use an indirectbr instruction with one of the blockaddress parameters, but when I run the IR it crashes with a segmentation fault. I searched the LLVM documentation but didn't find how to jump to basic blocks of another function. Does anyone have a clue? Thanks very much!
You cannot do this.
Per http://llvm.org/docs/LangRef.html#i-indirectbr:
Control transfers to the block specified in the address argument. All possible destination blocks must be listed in the label list, otherwise this instruction has undefined behavior. This implies that jumps to labels defined in other functions have undefined behavior as well.
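For contrast, here is a minimal sketch of the supported case, where every possible destination belongs to the function containing the indirectbr and appears in its label list. The function @pick and its block names are made up for illustration, and the typed-pointer syntax follows the IR in the question.

```llvm
; Hypothetical function: chooses one of its own blocks at run time.
define i32 @pick(i1 %flag) {
entry:
  ; Both block addresses refer to blocks of @pick itself.
  %target = select i1 %flag, i8* blockaddress(@pick, %yes), i8* blockaddress(@pick, %no)
  ; Every block that %target can hold is listed as a possible destination.
  indirectbr i8* %target, [label %yes, label %no]

yes:
  ret i32 1

no:
  ret i32 0
}
```

One way to get a comparable effect across functions might be to have the callee return a selector value and let the caller feed it into its own switch or indirectbr.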

LLVM IR types being collapsed wrongly when linking (C++ API)

Straight to the point -- I'm trying to link two (or more) llvm modules together, and I'm facing a certain odd error from LLVM.
I don't want to post too much code, so I'll use a bunch of pseudo here.
I have 3 modules, let's say A, B, and C. A is the main module; I initialise llvm::Linker with it. B and C are secondary modules; I call linker.linkInModule(B and C).
All 3 modules have, among other things, these two types defined:
%String = type { i8*, i64 }
%Character = type { i8*, i64 }
Note that they have the same member types. Furthermore, a function foo is defined as such (in module B):
define i1 @_ZN9Character7hasDataEv(%Character*) { }
This function is declared in modules A and C. Now, all seems well and good -- this function is called from both modules A and C, and the IR looks normal, like so:
%21 = call i1 @_ZN9Character7hasDataEv(%Character* %4)
Here comes the problem: when all 3 modules are linked together, something happens to these types:
They lose their name, becoming %2 (%String) and %3 (%Character).
They appear to be merged together.
Strangely, while this transformation occurs in both modules A and C, the bug only occurs in C -- note that A is the so-called "main" module.
The function definition of the linked file is now
define i1 @_ZN9Character7hasDataEv(%2*)
Note how %Character, or %3, got turned into %2. Furthermore, at the callsite, in what is presumably an attempt to un-merge the types, I get this:
%10 = call i1 bitcast (i1 (%2*)* @_ZN9Character7hasDataEv to i1 (%3*)*)(%2* %2)
Curiously, although the function was cast from i1 (%2*) to i1 (%3*), the argument passed (arg. 1) is still of type %2. What's going on?
Note that in module A, whatever is going on is done properly, and there is no error. This happens for a number of functions, but only in module C.
I've tried reproducing it by copy-pasting these to .ll files and calling llvm-link followed by llvm-dis, but 1. the types are not merged, and 2. there is no such bug.
Thanks...?
Okay, turns out that, after some poking around in the llvm IRC channel, llvm::Linker was meant to be used with an empty llvm::Module as the starting module.
Also, in my use-case I am reusing the same llvm::Type (the actual thing in memory) across different modules that I link together. They said it wasn't illegal, but that it was never tested, so... ¯\_(ツ)_/¯
So anyway, the problem was fixed by starting with an empty module to pass to the linker.