How can I find the size of a type? - llvm

I'm holding a Type* in my hand. How do I find out its size (the size objects of this type will occupy in memory) in bits / bytes? I see all kinds of methods allowing me to get "primitive" or "scalar" size, but that won't help me with aggregate types...

If you only need the size because you are inserting it into the IR (e.g., so you can send it to a call to malloc()), you can use the getelementptr instruction to do the dirty work (with a little casting), as described here (with updating for modern LLVM):
Though LLVM does not contain a special purpose sizeof/offsetof instruction, the
getelementptr instruction can be used to evaluate these values. The basic idea
is to use getelementptr from the null pointer to compute the value as desired.
Because getelementptr produces the value as a pointer, the result is casted to
an integer before use.
For example, to get the size of some type, %T, we would use something like
this:
%Size = getelementptr %T* null, i32 1
%SizeI = ptrtoint %T* %Size to i32
This code is effectively pretending that there is an array of T elements,
starting at the null pointer. This gets a pointer to the 2nd T element
(element #1) in the array and treats it as an integer. This computes the
size of one T element.
The good thing about doing this is that it is useful in exactly the cases where you do not care what the value is; where you just need to pass the correct value from the IR to something. That's by far the most common case for my need for sizeof()-alike operations in the IR generation.
The page also goes on to describe how to do an offsetof() equivalent:
To get the offset of some field in a structure, a similar trick is used. For
example, to get the address of the 2nd element (element #1) of { i8, i32* }
(which depends on the target alignment requirement for pointers), something
like this should be used:
%Offset = getelementptr {i8,i32*}* null, i32 0, i32 1
%OffsetI = ptrtoint i32** %Offset to i32
This works the same way as the sizeof trick: we pretend there is an instance of
the type at the null pointer and get the address of the field we are interested
in. This address is the offset of the field.
Note that in both of these cases, the expression will be evaluated to a
constant at code generation time, so there is no runtime overhead to using this
technique.
The IR optimizer also converts the values to constants.
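To make the two tricks concrete, here is a minimal sketch of the layout arithmetic that the null-pointer getelementptr ends up evaluating. It is a standalone toy model, not LLVM code: the Field struct and the alignTo, fieldOffset and allocSize helpers are hypothetical names, and the sizes and alignments passed in are assumptions about the target (the real values come from the target's data layout).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical description of one struct field: size and ABI alignment in bytes.
struct Field {
    std::size_t size;
    std::size_t align;
};

// Round `offset` up to the next multiple of `align`.
std::size_t alignTo(std::size_t offset, std::size_t align) {
    assert(align != 0);
    return (offset + align - 1) / align * align;
}

// Byte offset of field `index`: what the offsetof-style GEP from null yields.
std::size_t fieldOffset(const std::vector<Field>& fields, std::size_t index) {
    assert(index < fields.size());
    std::size_t offset = 0;
    for (std::size_t i = 0; i <= index; ++i) {
        offset = alignTo(offset, fields[i].align);  // pad up to this field
        if (i == index) break;
        offset += fields[i].size;                   // skip over earlier fields
    }
    return offset;
}

// Distance between consecutive array elements: what the sizeof-style GEP
// (index 1 from null) yields. Includes tail padding.
std::size_t allocSize(const std::vector<Field>& fields) {
    std::size_t offset = 0;
    std::size_t maxAlign = 1;
    for (const Field& f : fields) {
        offset = alignTo(offset, f.align) + f.size;
        if (f.align > maxAlign) maxAlign = f.align;
    }
    return alignTo(offset, maxAlign);
}
```

Assuming 8-byte, 8-byte-aligned pointers, { i8, i32* } modeled as {{1, 1}, {8, 8}} gives fieldOffset(..., 1) == 8 and allocSize(...) == 16; assuming 4-byte pointers, the same calls give 4 and 8. That difference is why the quoted text says the result depends on the target.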

The size depends on the target (for several reasons, alignment being one of them).
In LLVM versions 3.2 and above, you need to use DataLayout, in particular its getTypeAllocSize method. This returns the size in bytes; there is also a bit version named getTypeAllocSizeInBits. A DataLayout instance can be obtained by creating it from the current module: DataLayout* TD = new DataLayout(M).
With LLVM up to and including version 3.1, use TargetData instead of DataLayout. It exposes the same getTypeAllocSize methods.

Related

[3.0]Question about how to use the store IR instruction to obtain the blockaddress

I have a question about the blockaddress constant.
When I read the IR generated from a C program, I found that taking the address of a label in C is translated into a store instruction in the IR:
store i8* blockaddress(@func_name, %label_name), i8** %val_name
However, this is how the official documentation describes blockaddress:
blockaddress(@function, %block)
The 'blockaddress' constant computes the address of the specified basic block in the specified function, and always has an i8* type. Taking the address of the entry block is illegal.
This value only has defined behavior when used as an operand to the 'indirectbr' instruction, or for comparisons against null. Pointer equality tests between labels addresses results in undefined behavior, though, again, comparison against null is ok, and no label is equal to the null pointer. This may be passed around as an opaque pointer sized value as long as the bits are not inspected. This allows ptrtoint and arithmetic to be performed on these values so long as the original value is reconstituted before the indirectbr instruction.
Finally, some targets may provide defined semantics when using the value as the operand to an inline assembly, but that is target specific.
So I want to understand how a store instruction can take a blockaddress as its value operand and write it to memory (to %5 in my IR).
What should I do if I want to use the C++ API to construct this store instruction and obtain the addresses of basic blocks?
I made some attempts, such as constructing an indirectbr:
irBuilder.SetInsertPoint(indirectbr_bb);
IndirectBrInst *indirect_br = IndirectBrInst::Create(BlockAddress::get(func, instr2_bb), 0, indirectbr_bb);
indirect_br->addDestination(instr1_bb);
indirect_br->addDestination(instr2_bb);
The IR program generated is as follows:
indirectbr_bb: ; preds = %dispatch_then_bb
indirectbr i8* blockaddress(@jit_func, %instr2_bb), [label %instr1_bb, label %instr2_bb]
In my tests this executes correctly. Therefore, I want to know how to construct a similar store in the IR that stores the address of a basic block into an array.
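BlockAddress::get(BasicBlock *BB) returns a BlockAddress*. BlockAddress is a subclass of Constant, which in turn derives from Value.
In the LLVM C++ API every operand is a Value, so a BlockAddress can be used wherever a Value (or a Constant) is expected, for example as the value operand of irBuilder.CreateStore or as an element of a global initializer.
So we can do this:
// Collect the block addresses first so the array type can match their count.
std::vector<Constant *> array_elems;
array_elems.push_back(BlockAddress::get(func, ret_bb));
array_elems.push_back(BlockAddress::get(func, instr1_bb));
array_elems.push_back(BlockAddress::get(func, instr2_bb));
// ConstantArray::get expects exactly as many elements as the array type declares.
ArrayType *arrayType = ArrayType::get(irBuilder.getInt8PtrTy(), array_elems.size());
module->getOrInsertGlobal("label_array", arrayType);
GlobalVariable *label_array = module->getNamedGlobal("label_array");
label_array->setInitializer(ConstantArray::get(arrayType, array_elems));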
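For context, the C-level construct that presumably produced the store in the question is the GNU "labels as values" extension, where &&label yields the address that shows up as blockaddress in the IR and goto *ptr becomes an indirectbr. The following is a minimal sketch of a label table used for dispatch. It assumes GCC or Clang (the extension is not standard C++), and the Op enum, fetch and runProgram are hypothetical names for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Opcodes of a hypothetical three-instruction bytecode.
enum Op { OP_RET = 0, OP_INC = 1, OP_DBL = 2 };

// Fetch the next opcode and check that it indexes the label table.
static int fetch(const std::vector<int>& code, std::size_t& pc) {
    assert(pc < code.size());
    int op = code[pc++];
    assert(op >= OP_RET && op <= OP_DBL);
    return op;
}

// Runs `code` with an accumulator starting at 0; returns it at OP_RET.
// Relies on the GNU extension: &&label and `goto *pointer`.
int runProgram(const std::vector<int>& code) {
    // Table of label addresses, indexed by opcode (the analogue of label_array).
    static void* const labels[] = { &&op_ret, &&op_inc, &&op_dbl };
    int acc = 0;
    std::size_t pc = 0;

    goto *labels[fetch(code, pc)];
op_inc:
    acc += 1;
    goto *labels[fetch(code, pc)];
op_dbl:
    acc *= 2;
    goto *labels[fetch(code, pc)];
op_ret:
    return acc;
}
```

For example, runProgram({OP_INC, OP_INC, OP_DBL, OP_RET}) increments twice, doubles, and returns 4.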

How LLVM mem2reg pass works

mem2reg is an important optimization pass in llvm. I want to understand how this optimization works but didn't find good articles, books, tutorials and similar.
I found these two links:
https://blog.katastros.com/a?ID=01300-3d6589c1-1993-4fb1-8975-939f10c20503
https://www.zzzconsulting.se/2018/07/16/llvm-exercise.html
Both links explain that one can use Cytron's classical SSA algorithm to implement this pass, but reading the original paper I didn't see how alloca instructions are converted to registers.
As alloca is an instruction specific to LLVM IR, I wonder whether the algorithm that converts alloca instructions to registers is an ad-hoc algorithm that only works for LLVM, or whether there is a general theoretical framework, whose name I don't know yet, that explains how to promote memory variables to register variables.
On the official documentation, it says:
mem2reg: This file promotes memory references to be register references. It promotes alloca instructions which only have loads and stores as uses. An alloca is transformed by using dominator frontiers to place phi nodes, then traversing the function in depth-first order to rewrite loads and stores as appropriate. This is just the standard SSA construction algorithm to construct “pruned” SSA form.
By this description, it seems that one just needs to check whether all users of the variable are load and store instructions and, if so, it can be promoted to a register.
So if you can link me to articles, books, algorithms and so on, I would appreciate it.
"Efficiently Computing Static Single Assignment Form and the Control Dependence Graph" by Cytron, Ferrante et al.
The alloca instruction is named after the alloca() function in C, the rarely used counterpart to malloc() that allocates memory on the stack instead of the heap, so that the memory disappears when the function returns without needing to be explicitly freed.
In the paper, figure 2 shows an example of "straight-line code":
V ← 4
← V + 5
V ← 6
← V + 7
That text isn't valid LLVM IR. If we wanted to rewrite the same example in pre-mem2reg LLVM IR, it would look like this:
; V ← 4
store i32 4, i32* %V.addr
; ← V + 5
%tmp1 = load i32* %V.addr
%tmp2 = add i32 %tmp1, 5
store i32 %tmp2, i32* %V.addr
; V ← 6
store i32 6, i32* %V.addr
; ← V + 7
%tmp3 = load i32* %V.addr
%tmp4 = add i32 %tmp3, 7
store i32 %tmp4, i32* %V.addr
It's easy enough to see in this example how you could always replace %tmp1 with i32 4 using store-to-load forwarding, but you can't always remove the final store. Knowing nothing about %V.addr means that we must assume it may be used elsewhere.
If you know that %V.addr is an alloca, you can simplify a lot of things. You can see the alloca instruction, so you know you can see all of its uses, and you never have to worry that a store to some other %ptr may alias with %V.addr. You see the allocation, so you know what its alignment is. You know that accesses through the pointer cannot fault anywhere in the function, because there is no free() equivalent for an alloca that anyone could call. If you see a store that isn't loaded before the end of the function, you can eliminate that store as a dead store, since you know the alloca's lifetime ends at the function return. Finally, you may delete the alloca itself once you've removed all its uses.
Thus if you start with an alloca whose only users are loads and stores, the problem has little in common with most memory optimization problems: there's no possibility of pointer aliasing nor concern about faulting memory accesses. What you do need to do is place the ϕ functions (phi instructions in LLVM) in the right places given the control flow graph, and that's what the paper describes how to do.
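For the straight-line example above, where no ϕ functions are needed, the rewrite reduces to remembering the most recent store and handing its value to each load. Here is a minimal sketch of just that step. It is a toy model with hypothetical names (Inst, forwardStores, loadValue), it only handles constants stored to a single slot in a single basic block, and it is not how LLVM represents instructions.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Toy instruction acting on one promotable stack slot.
struct Inst {
    enum Kind { Store, Load } kind;
    int value;  // constant being stored; ignored for Load
};

// For each Load, in program order, return the value it observes:
// the most recent Store before it, or nothing if the slot is still unwritten.
std::vector<std::optional<int>> forwardStores(const std::vector<Inst>& block) {
    std::vector<std::optional<int>> loads;
    std::optional<int> current;  // current definition of the variable
    for (const Inst& inst : block) {
        if (inst.kind == Inst::Store) {
            current = inst.value;      // a store becomes the new definition
        } else {
            loads.push_back(current);  // a load is replaced by that definition
        }
    }
    return loads;
}

// Value seen by the n-th load (0-based); the slot must have been written first.
int loadValue(const std::vector<Inst>& block, std::size_t n) {
    std::vector<std::optional<int>> loads = forwardStores(block);
    assert(n < loads.size() && loads[n].has_value());
    return *loads[n];
}
```

With the sequence store 4, load, store 6, load, the two loads are replaced by 4 and 6. Once every load has been rewritten this way, the stores and the alloca have no remaining readers and can be deleted. Handling control flow is where the ϕ placement from the paper comes in, which this sketch deliberately leaves out.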

LLVM : Store Values of all types inside an i8 array at compile time

An example of the title would be storing a symbol (e.g. a function pointer) inside the array next to, let's say, integers. This would allow the linker to place the right address of the symbols, which we do not know at optimization time, in the i8 array.
How can I store any LLVM Value, whatever its type is (as long as it's sized), inside an [<Sum of sizes> x i8] array? This would take place in an LLVM pass.
I am aware that this will require replacing every use of these values with a pointer cast and a load; this isn't an issue.
Thanks.
I don't quite follow what you're trying to do with the i8 array, but I assume this is the core of your question:
How can I store any LLVM Value, whatever its type is (as long as
it's sized), inside an [<Sum of sizes> x i8] array
To reinterpret an LLVM value as one with a different type, use bitcast. You could obtain a pointer to your data and then bitcast it to i8*, then use llvm.memcpy to write it to your array. For obtaining the runtime size of a type, see this answer.
Alternatively you could put all your objects in a giant struct, and then bitcast it to an i8 array where it is needed. This is probably preferable as it would more naturally compile to something that has no runtime overhead.
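As a rough illustration of the first suggestion (cast a pointer to the data to a byte pointer, then memcpy), here is what the equivalent looks like in plain C++ at run time. This is only a sketch of the idea, not pass code: storeBytes and loadBytes are hypothetical helpers, and the buffer stands in for the [N x i8] array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Copy the raw bytes of `value` into `buffer` at `offset`.
// Returns the offset just past the written bytes.
template <typename T>
std::size_t storeBytes(unsigned char* buffer, std::size_t capacity,
                       std::size_t offset, const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only plain data can be copied byte-wise");
    assert(offset + sizeof(T) <= capacity);
    std::memcpy(buffer + offset, &value, sizeof(T));
    return offset + sizeof(T);
}

// Read a T back out of `buffer` at `offset`. memcpy is used instead of a
// pointer cast so that unaligned offsets are fine.
template <typename T>
T loadBytes(const unsigned char* buffer, std::size_t capacity,
            std::size_t offset) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only plain data can be copied byte-wise");
    assert(offset + sizeof(T) <= capacity);
    T value;
    std::memcpy(&value, buffer + offset, sizeof(T));
    return value;
}
```

Each storeBytes call returns the next free offset, so values of different types can be packed back to back, and the running total is the "sum of sizes" from the question.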

Pre evaluate LLVM IR

Let's suppose we have expressions like:
%rem = srem i32 %i.0, 10
%mul = mul nsw i32 %rem, 2
%i.0 is an llvm::PHINode whose bounds I can obtain.
The question is: Is there a way to get the value of %mul during compile time? I'm writing a llvm Pass and I need to evaluate some expressions which use %i.0. I'm searching for a function, class or something else which I will give a value to %i.0 and it will evaluate the expression and return the result.
You could clone the code (the containing function or the entire module, depending on how much context you need), then replace %i.0 with a constant value, run the constant propagation pass on the code, and finally check whether %mul is assigned to a constant value and if so, extract it.
It's not elegant, but I think it would work. Just pay attention to two things:
Make sure %mul is not optimized away. For example, return it from the function, or store its value to memory.
Be aware that constant propagation assumes some things about the code, in particular that it has already been through mem2reg.
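If the expression is as small as the one in the question, it may be simpler to check what result to expect by mirroring the arithmetic directly. The sketch below does that in plain C++. It is not the clone-and-propagate approach described above and uses no LLVM API; evalMul and mulRange are hypothetical names, and the bounds passed to mulRange are assumed to be the known bounds of %i.0.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>

// Mirrors:  %rem = srem i32 %i.0, 10
//           %mul = mul nsw i32 %rem, 2
int evalMul(int i) {
    int rem = i % 10;  // C++ % truncates toward zero, like srem
    return rem * 2;    // |rem| <= 9, so this cannot overflow
}

// Smallest and largest value of %mul for %i.0 in the inclusive range [lo, hi].
std::pair<int, int> mulRange(int lo, int hi) {
    assert(lo <= hi);
    int minV = evalMul(lo);
    int maxV = minV;
    // long long loop counter so that hi == INT_MAX does not overflow.
    for (long long i = static_cast<long long>(lo) + 1; i <= hi; ++i) {
        int v = evalMul(static_cast<int>(i));
        minV = std::min(minV, v);
        maxV = std::max(maxV, v);
    }
    return {minV, maxV};
}
```

For instance, evalMul(13) is 6, and for %i.0 between 0 and 25 the result stays between 0 and 18.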

Llvm how to access global array elements

Can someone please explain to me what is wrong with this code?
I think this should fetch the second element from the global array, but in fact it silently crashes somewhere inside the JIT compilation routine.
My assumptions:
The GEP instruction calculates the memory address of the element by applying an offset, and returns a pointer.
The load instruction loads the value referenced by the given pointer (in other words, it dereferences the pointer).
The ret instruction exits the function and passes the given value to the caller.
It seems I've missed something basic, but I am past the point where I should have given up looking for the answer myself, so I have to ask for help.
@arr = common global [256 x i64], align 8
define i64 @iterArray() {
entry:
%0 = load i64* getelementptr inbounds ([256 x i64]* @arr, i32 1, i32 0)
ret i64 %0
}
You requested the 257th item in a 256-item array, and that's a problem.
The first index given to a gep instruction means how many steps are made through the value operand - and here the value operand is not an array but a pointer to an array. That means every step there skips the entire size of the array forward - and that's why the gep actually asks for the 257th item. Using 0 as the first gep index will probably fix the problem. Then using 1 as the 2nd index will get you the 2nd item in the array, which is what you wanted. Read more about it here: http://llvm.org/docs/GetElementPtr.html#what-is-the-first-index-of-the-gep-instruction
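The index arithmetic can be checked with a small standalone model. This is only a sketch of the address computation for this particular type, [256 x i64]*, assuming an 8-byte i64. The function names are hypothetical and nothing here calls into LLVM.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr long long kElemSize = 8;    // assumed size of i64 in bytes
constexpr long long kNumElems = 256;  // [256 x i64]

// Byte offset from @arr for: getelementptr [256 x i64]* @arr, idx0, idx1
// idx0 steps over whole arrays, idx1 steps over i64 elements.
long long gepByteOffset(long long idx0, long long idx1) {
    return idx0 * (kNumElems * kElemSize) + idx1 * kElemSize;
}

// The same address expressed as a 0-based i64 element number.
long long gepElementIndex(long long idx0, long long idx1) {
    return gepByteOffset(idx0, idx1) / kElemSize;
}

// True if that element lies inside the 256-element array.
bool isInsideArray(long long idx0, long long idx1) {
    long long e = gepElementIndex(idx0, idx1);
    return e >= 0 && e < kNumElems;
}

// Model of the load: only valid for addresses inside the array.
long long loadAt(const std::array<long long, 256>& arr,
                 long long idx0, long long idx1) {
    assert(isInsideArray(idx0, idx1));
    return arr[static_cast<std::size_t>(gepElementIndex(idx0, idx1))];
}
```

Indices (1, 0) give byte offset 2048, which is element number 256, i.e. the 257th item and one past the end of the array. Indices (0, 1) give byte offset 8, which is the second item.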
Alternatively, there is the extractvalue instruction, which is similar to gep but implicitly uses 0 for the first index. There are a couple of other differences, notably that it operates on an aggregate value rather than on a pointer, so the array would have to be loaded first.
Regarding why the compiler crashes, I'm not sure. I'm guessing that while normally such a memory access would compile fine (and at runtime either generate a segfault or just return a bad value), here you specifically marked the gep as inbounds. That does not insert a runtime check; it tells LLVM to treat an out-of-bounds result as a poison value, which means your function is now effectively load undef. I'm not sure what LLVM does with load undef. It should probably be optimized out and the function made to just return undef, but maybe it did something different which led to a rejection of your code.