I am writing an LLVM pass. For an instruction (llvm::Instruction Class), how can I check if an instruction is a PHI instruction?
I found the solution: you can check for a PHI node with isa<PHINode>(inst).
// I is assumed to point at the instruction being inspected.
Instruction* I;
if (I->getOpcode() == Instruction::PHI) {
    // I is a PHI node
}
I'm looking for a way to create an LLVM Instruction from its opcode.
The ideal function would be something such as Instruction *createInstruction(unsigned Opcode, ArrayRef<Value *> Operands);
Is there anything like it?
Thanks.
unsigned int lo = 0;
unsigned int hi = 0;
__asm__ __volatile__ (
"mfence;rdtsc" : "=a"(lo), "=d"(hi) : : "memory"
);
Is the mfence in the above code necessary?
In my test, I did not observe any CPU reordering.
The fragment of test code is included below.
inline uint64_t clock_cycles() {
unsigned int lo = 0;
unsigned int hi = 0;
__asm__ __volatile__ (
"rdtsc" : "=a"(lo), "=d"(hi)
);
return ((uint64_t)hi << 32) | lo;
}
uint64_t t1 = clock_cycles();
uint64_t t2 = clock_cycles();
assert(t2 > t1);
What you need to perform a sensible measurement with rdtsc is a serializing instruction.
As it is well known, a lot of people use cpuid before rdtsc.
rdtsc needs to be serialized from above and below (read: all instructions before it must be retired and it must be retired before the test code starts).
Unfortunately the second condition is often neglected because cpuid is a very bad choice for this task (it clobbers the output of rdtsc).
When looking for alternatives, people think that any instruction with "fence" in its name will do, but this is also untrue. Straight from Intel:
MFENCE does not serialize the instruction stream.
An instruction that is almost serializing and will do in any measurement where previous stores don't need to complete is lfence.
Simply put, lfence makes sure that no new instructions start before any prior instruction completes locally. See this answer of mine for a more detailed explanation on locality.
It also doesn't drain the Store Buffer like mfence does and doesn't clobber the registers like cpuid does.
So lfence / rdtsc / lfence is a better crafted sequence of instructions than mfence / rdtsc, where mfence is pretty much useless unless you explicitly want the previous stores to be completed before the test begins/ends (but not before rdtsc is executed!).
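A minimal sketch of the lfence / rdtsc / lfence sequence described above, assuming GCC or Clang inline-assembly syntax on x86. The names combine_tsc and fenced_rdtsc are hypothetical, and the steady_clock fallback for non-x86 targets is an addition for portability, not part of the original answer.

```cpp
#include <chrono>
#include <cstdint>

// Combine the two 32-bit halves that rdtsc returns in edx:eax.
inline std::uint64_t combine_tsc(std::uint32_t hi, std::uint32_t lo) {
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}

// Hypothetical helper: read the time-stamp counter, fenced on both sides.
inline std::uint64_t fenced_rdtsc() {
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    std::uint32_t lo = 0;
    std::uint32_t hi = 0;
    __asm__ __volatile__(
        "lfence\n\t"  // earlier instructions complete locally first
        "rdtsc\n\t"   // read the counter into edx:eax
        "lfence"      // later instructions do not start before rdtsc
        : "=a"(lo), "=d"(hi)
        :
        : "memory");  // compiler-level barrier for memory accesses
    return combine_tsc(hi, lo);
#else
    // Assumption: on targets without rdtsc, fall back to a monotonic clock.
    return static_cast<std::uint64_t>(
        std::chrono::steady_clock::now().time_since_epoch().count());
#endif
}
```

Calling fenced_rdtsc() before and after the code under test and subtracting the two values gives a tick count for that region; the fences themselves add some overhead that should be measured separately and subtracted.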
If your test to detect reordering is assert(t2 > t1) then I believe you will test nothing.
Leaving aside the call and the return, which may or may not prevent the CPU from seeing the second rdtsc in time to reorder it, it is unlikely (though possible!) that the CPU will reorder two rdtsc instructions even if one is right after the other.
Imagine we have a rdtsc2 that is exactly like rdtsc but writes ecx:ebx [1].
When executing
rdtsc
rdtsc2
it is highly likely that ecx:ebx > edx:eax, because the CPU has no reason to execute rdtsc2 before rdtsc.
Reordering doesn't mean random ordering; it means looking for another instruction to execute when the current one cannot be executed.
But rdtsc has no dependency on any previous instruction, so it's unlikely to be delayed when encountered by the OoO core.
However, peculiar internal micro-architectural details may invalidate my thesis, hence the word "likely" in my previous statement.
[1] We don't need this altered instruction: register renaming will do it, but in case you are not familiar with it, this will help.
mfence is there to force serialization in the CPU before rdtsc.
Usually you will find cpuid there (which is also a serializing instruction).
A quote from the Intel manuals about using rdtsc makes this clearer:
Starting with the Intel Pentium processor, most Intel CPUs support
out-of-order execution of the code. The purpose is to optimize the
penalties due to the different instruction latencies. Unfortunately
this feature does not guarantee that the temporal sequence of the
single compiled C instructions will respect the sequence of the
instruction themselves as written in the source C file. When we call
the RDTSC instruction, we pretend that that instruction will be
executed exactly at the beginning and at the end of code being
measured (i.e., we don’t want to measure compiled code executed
outside of the RDTSC calls or executed in between the calls
themselves).
The solution is to call a serializing instruction before
calling the RDTSC one. A serializing instruction is an instruction
that forces the CPU to complete every preceding instruction of the C
code before continuing the program execution. By doing so we guarantee
that only the code that is under measurement will be executed in
between the RDTSC calls and that no part of that code will be executed
outside the calls.
TL;DR version - without a serializing instruction before rdtsc, you have no idea when that instruction started to execute, which makes the measurements possibly incorrect.
HINT - use rdtscp when possible.
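As a rough illustration of the hint, the sketch below reads the counter with rdtscp through the __rdtscp intrinsic from <x86intrin.h> (GCC/Clang on x86-64) and follows it with lfence. It assumes the CPU supports rdtscp, which should be confirmed through CPUID on older hardware. TscSample and read_tsc_end are hypothetical names, and the steady_clock fallback is an added assumption for other targets.

```cpp
#include <chrono>
#include <cstdint>
#if defined(__x86_64__) && defined(__GNUC__)
#include <x86intrin.h>
#endif

// Hypothetical result type: the counter value plus the IA32_TSC_AUX value
// that rdtscp returns alongside it.
struct TscSample {
    std::uint64_t ticks;
    std::uint32_t aux;
};

// Hypothetical helper for the end of a measured region.
inline TscSample read_tsc_end() {
#if defined(__x86_64__) && defined(__GNUC__)
    unsigned int aux = 0;
    // rdtscp waits until all previous instructions have executed
    // before it reads the counter.
    std::uint64_t t = __rdtscp(&aux);
    // rdtscp does not hold back later instructions, so fence after it.
    _mm_lfence();
    return TscSample{t, aux};
#else
    // Assumption: on other targets, fall back to a monotonic clock.
    std::uint64_t t = static_cast<std::uint64_t>(
        std::chrono::steady_clock::now().time_since_epoch().count());
    return TscSample{t, 0u};
#endif
}
```

The aux value is set by the operating system; comparing it between the start and end samples is one way to notice that the thread migrated between the two reads.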
Based on my test, cpu reorder is not found.
There is still no guarantee that it won't happen. That's also why the original code had "memory" in the clobber list: it indicates a possible memory clobber, which prevents the compiler from reordering memory accesses around the asm statement.
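To make the role of the "memory" clobber concrete, here is a small sketch assuming GCC/Clang extended-asm syntax. An empty asm statement with a "memory" clobber emits no instruction; it only stops the compiler from moving memory accesses across it and does nothing about CPU reordering. The names compiler_barrier and sum_with_barrier are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: a compiler-only barrier. No machine instruction is
// emitted, so this does not constrain the CPU in any way.
inline void compiler_barrier() {
    __asm__ __volatile__("" ::: "memory");
}

// The loads of data[] cannot be hoisted above the first barrier or sunk
// below the second one by the compiler.
inline std::uint32_t sum_with_barrier(const std::uint32_t* data, std::size_t n) {
    std::uint32_t sum = 0;
    compiler_barrier();
    for (std::size_t i = 0; i < n; ++i) {
        sum += data[i];
    }
    compiler_barrier();
    return sum;
}
```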
I'm a beginner with LLVM, and I have a simple problem, but I can't find the solution in the documentation.
I'm doing a function pass that computes on instructions, and for this I need all 'data' from the instruction, I mean the operator, all operands, and the result.
My problem is, I can't get the result variable. For example, for the instruction:
%add1 = add nsw i32 %x, %y
I can get the names and values of x and y, I can get the opcode, and I can get the name add1, but I cannot get the add1 variable itself.
I read through all the functions on the Instruction page of the documentation, and I can't find anything that looks like what I'm looking for.
So what is the proper API that can solve my problem?
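Instruction inherits from Value and thus has the getName() method, which solves your problem. Note that the Instruction object itself is the result value: %add1 is the Instruction, and it can be used anywhere a Value is expected.
But remember that an instruction can be unnamed (such as %0), and getName() returns an empty string in that case.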
In LLVM, if we insert an instruction into the IR through a pass, is it necessary to also insert an instruction that uses its result, or to store the result into some variable already present in the IR, so that the inserted instruction is not useless?
For example, can't I insert the instruction
%result = add i32 4, 3
where %result is not used in any subsequent instruction?
You should be able to insert it, but if an optimization pass runs after your pass, it might be eliminated because it's unused and doesn't have side effects.
No, it's absolutely not necessary. If you insert the instruction properly (i.e. use the API correctly), it can be left unused.
As a matter of fact, unused values can be left around by various optimization passes as well. LLVM has other passes like DCE (dead code elimination) that will remove unused instructions.
When I use the command clang -emit-llvm -S test.c -o test.ll, there are no "phi" instructions in the IR file. How can I get them?
I know that I can use the passes "-mem2reg" or "-gvn" to get "phi" instructions, but they perform some optimization. I just want to get "phi" without any optimization.
I'm not sure what you mean by "do some optimization" but it seems to me that mem2reg is exactly what you need. Here is how it's described in the documentation:
This file promotes memory references to be register references. It
promotes alloca instructions which only have loads and stores as uses.
An alloca is transformed by using dominator frontiers to place phi
nodes, then traversing the function in depth-first order to rewrite
loads and stores as appropriate. This is just the standard SSA
construction algorithm to construct “pruned” SSA form.
Clang itself does not produce optimized LLVM IR. It produces fairly straightforward IR wherein locals are kept in memory (using allocas). The optimizations are done by opt at the LLVM IR level, and one of the most important optimizations is indeed mem2reg, which makes sure that locals are represented as LLVM SSA values instead of memory.