LLVM Preserving Loop Analysis - llvm

I am writing an LLVM FunctionPass that transforms certain functions fairly aggressively. It ultimately ends up deleting the old set of blocks and replacing them with totally different ones. However, the loop unroller (LoopUnrollPass), which runs after, fails to find loops in the transformed function. (The transformed version should have natural loops.)
Is there anything I have to poke after recreating a function? How do I trigger the loop detector to run again? Finally, are there other analyses that I have to also update when I transform functions?

First, just in case you skipped it, it's important to read and understand the Writing an LLVM Pass documentation page.
When your pass runs, it reports whether it modified the function/module (the boolean returned from runOnFunction / runOnModule). The pass manager is supposed to take that as a cue to re-run all analyses needed by the next passes, unless your pass declares that it preserves them (with addPreserved). You can see the list of analyses required by LoopUnroll in its getAnalysisUsage method:
/// This transformation requires natural loop information & requires that
/// loop preheaders be inserted into the CFG...
///
virtual void getAnalysisUsage(AnalysisUsage &AU) const {
  AU.addRequired<LoopInfo>();
  AU.addPreserved<LoopInfo>();
  AU.addRequiredID(LoopSimplifyID);
  AU.addPreservedID(LoopSimplifyID);
  AU.addRequiredID(LCSSAID);
  AU.addPreservedID(LCSSAID);
  AU.addRequired<ScalarEvolution>();
  AU.addPreserved<ScalarEvolution>();
  AU.addRequired<TargetTransformInfo>();
  // FIXME: Loop unroll requires LCSSA. And LCSSA requires dom info.
  // If loop unroll does not preserve dom info then LCSSA pass on next
  // loop will receive invalid dom info.
  // For now, recreate dom info, if loop is unrolled.
  AU.addPreserved<DominatorTree>();
}
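For comparison, here is a minimal sketch (legacy pass manager) of what the transforming pass itself could look like; RewriteBlocksPass and its registration name are made up. The two things that matter are returning true from runOnFunction whenever the IR changed and not listing LoopInfo (or anything else the rebuild invalidates) as preserved in getAnalysisUsage:

#include "llvm/Pass.h"
#include "llvm/IR/Function.h"

using namespace llvm;

namespace {
struct RewriteBlocksPass : public FunctionPass {
  static char ID;
  RewriteBlocksPass() : FunctionPass(ID) {}

  bool runOnFunction(Function &F) override {
    // ... delete the old blocks and build the replacement CFG here ...
    // Returning true tells the pass manager the IR changed, so analyses
    // that are not marked preserved (LoopInfo, DominatorTree, ...) get
    // recomputed before later passes such as -loop-unroll.
    return true;
  }

  void getAnalysisUsage(AnalysisUsage &AU) const override {
    // Deliberately empty: nothing is marked preserved, because the CFG
    // is rebuilt from scratch.
  }
};
} // namespace

char RewriteBlocksPass::ID = 0;
static RegisterPass<RewriteBlocksPass>
    X("rewrite-blocks", "Aggressive block rewriter (sketch)");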

Related

How to check if a container only has a specific element?

I apologize if the wording is completely off. I am very very new to C++ syntax. For background, I have an object named system that holds elements called actions. Like so below:
for (event::System system : Ev->getSys())
{
    for (event::Action actions : system.actions)
    {
        switch (thing)
I was wondering if there is a way to check whether system (in the first for loop) contains only one kind of element. For example:
If it has only RUN (or RUN, RUN, RUN, ...etc.), then execute a specific set of code. Or, if it has different types like RUN, STOP, WALK, then it can proceed. I know it will be an if/else statement, but I can't think of how to build the condition for it.
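One way to build that condition, assuming the actions container is iterable and Action values can be compared with == (e.g. an enum), is a std::all_of check against the first element. This is a sketch (C++14 or later); hasOnlyOneKind is a made-up helper name:

#include <algorithm>
#include <iterator>

// Generic helper: true when all elements in the range compare equal.
// An empty range counts as "only one kind".
template <typename Container>
bool hasOnlyOneKind(const Container &c)
{
    if (std::begin(c) == std::end(c))
        return true;
    const auto &first = *std::begin(c);
    return std::all_of(std::begin(c), std::end(c),
                       [&first](const auto &a) { return a == first; });
}

// Possible usage inside the outer loop:
//   if (hasOnlyOneKind(system.actions)) { /* all RUN: specific code */ }
//   else                                { /* mixed RUN/STOP/WALK: proceed */ }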

Apply LLVM pass to a specific basic block

Is it possible to apply LLVM transformation pass to a specific basic block, instead of the whole IR?
I know how to apply a pass to the whole IR:
$ opt -S -instcombine test.ll -o out.ll
But there might be several basic blocks inside test.ll and I want to apply -instcombine to just one of them.
Generally, no. Some LLVM passes are written to work on whole modules, others on whole functions. Some happen to be safe to run on single basic blocks (more by chance than by design), but LLVM's pass interface only deals in its design unit (functions in the case of function passes, modules in the case of module passes). That is, function passes are given a function by the pass manager, and nothing else.
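As far as I know there is no stock opt flag for restricting a pass like instcombine to one block, so the usual workaround is to do the filtering inside your own FunctionPass. A sketch under that assumption; OneBlockPass and the block name "target" are made up:

#include "llvm/Pass.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/BasicBlock.h"

using namespace llvm;

namespace {
struct OneBlockPass : public FunctionPass {
  static char ID;
  OneBlockPass() : FunctionPass(ID) {}

  bool runOnFunction(Function &F) override {
    bool Changed = false;
    for (BasicBlock &BB : F) {
      if (BB.getName() != "target")
        continue;               // skip every block except the chosen one
      // ... apply your own block-local rewrites to BB here ...
      Changed = true;
    }
    return Changed;
  }
};
} // namespace

char OneBlockPass::ID = 0;
static RegisterPass<OneBlockPass> X("one-block",
                                    "Transform a single block (sketch)");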

Instrumented code causing infinite recursion loop

How can I prevent a suspected infinite loop in the following scenario?
The entire C++ codebase is instrumented by clang at build time, using an LLVM pass that searches for llvm.memcpy intrinsics and inserts a post-call to the instrumentation runtime
The instrumentation runtime contains a std::map structure
The underlying libc++ code that implements std::map has been instrumented, and in turn calls the instrumentation runtime again
When I run the program, it freezes once the first instrumentation call is made. The suspected loop is trace_memcpy > std::map::operator[] > trace_memcpy > and so forth
Is there a way to short-circuit this loop, e.g. can the instrumentation library inspect the call stack to see that it is already in the call stack and return early from the trace_memcpy function?
Thanks :)
Quick & dirty & probably not bulletproof - add a static variable to the implementation of trace_memcpy to avoid nesting.
void trace_memcpy(void)
{
    static int nested;   // zero-initialized; non-zero while a trace is in flight
    if (nested)
    {
        return;          // re-entered via instrumented std::map code, bail out
    }

    nested = 1;
    // whatever your actual trace logic is
    nested = 0;
}
If you need something more sophisticated, use the appropriate concurrency object as provided by your system.
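For example, on a C++11 toolchain a thread_local flag gives each thread its own guard, so one thread's instrumentation does not silently suppress another's. Still a sketch, not a drop-in for your runtime:

void trace_memcpy(void)
{
    // thread_local instead of plain static: each thread gets its own guard,
    // so concurrent threads do not interfere with each other's traces.
    static thread_local bool in_trace = false;
    if (in_trace)
        return;          // already inside the runtime on this thread
    in_trace = true;
    // ... actual trace logic (may re-enter instrumented libc++ code safely) ...
    in_trace = false;
}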

ELF INIT section code to prepopulate objects used at runtime

I'm fairly new to C++ and am really interested in learning more. I have been reading quite a bit and recently discovered the init/fini ELF sections.
I started to wonder if and how one would use the init section to prepopulate objects that would be used at runtime. Say, for example, you wanted to add performance measurements to your code, recording the time, filename, line number, and maybe some ID (a monotonically increasing int, for example) or name.
You would place for example:
PROBE(0,"EventProcessing",__FILE__,__LINE__)
...... //process event
PROBE(1,"EventProcessing",__FILE__,__LINE__)
......//different processing on same event
PROBE(2,"EventProcessing",__FILE__,__LINE__)
The PROBE could be some macro that populates a struct containing this data (maybe in an array/list, etc., using the id as an index).
Would it be possible to have code in the init section that could prepopulate all of this data for each PROBE (except for the time, of course), so only the time would need to be retrieved/copied at runtime?
As far as I know, __attribute__((constructor)) cannot be applied to member functions?
My initial idea was to create some kind of linked list with each node pointing to each probe, so that code in the init section could iterate it, populating the id, file, line, etc. But that idea assumed I could use a member function that could run in the "init" section, which does not seem possible. Any tips appreciated!
As far as I understand it, you do not actually need an ELF constructor here. Instead, you could emit descriptors for your probes using extended asm statements (using data, instead of code). This also involves switching to a dedicated ELF section for the probe descriptors, say __probes.
The linker will concatenate all the probes into an array and generate special symbols __start___probes and __stop___probes, which you can use from your program to access these probes. See the last paragraph in Input Section Example.
Systemtap implements something quite similar for its userspace probes:
User Space Probe Implementation
Adding User Space Probing to an Application (heapsort example)
Similar constructs are also used within the Linux kernel for its self-patching mechanism.
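A rough sketch of that descriptor-in-a-section idea on a GNU toolchain (gcc/clang with GNU ld or a compatible linker), using a section attribute on the descriptor objects rather than hand-written extended asm. The struct layout, the PROBE macro, and all names other than the __probes section and its __start___probes/__stop___probes symbols are assumptions:

#include <cstdio>

struct ProbeDesc {
    int id;
    const char *name;
    const char *file;
    int line;
};

// Each PROBE site emits one descriptor into the __probes section as data;
// no constructor has to run at load time to build the table.
#define PROBE(id, name)                                            \
    static const ProbeDesc probe_##id                              \
        __attribute__((used, section("__probes")))                 \
        = { id, name, __FILE__, __LINE__ }

// The linker defines these because the section name is a valid C identifier.
extern const ProbeDesc __start___probes[];
extern const ProbeDesc __stop___probes[];

PROBE(0, "EventProcessing");
PROBE(1, "EventProcessing");

int main()
{
    for (const ProbeDesc *p = __start___probes; p != __stop___probes; ++p)
        std::printf("probe %d: %s (%s:%d)\n", p->id, p->name, p->file, p->line);
}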
There's a pretty simple way to have code run at module load time: use the constructor of a global variable:
struct RunMeSomeCode
{
    RunMeSomeCode()
    {
        // your code goes here
    }
} do_it;
The .init/.fini sections basically exist to implement global constructors/destructors as part of the ABI on some platforms. Other platforms may use different mechanisms, such as _start and _init functions or .init_array/.fini_array and .preinit_array. There are lots of subtle differences between all these methods, and which one to use for what is a question that can really only be answered by the documentation of your target platform. Not all platforms use ELF to begin with…
The main point to understand is that things like the .init/.fini sections in an ELF binary happen way below the level of C++ as a language. A C++ compiler may use these things to implement certain behavior on a certain target platform. On a different platform, a C++ compiler will probably have to use different mechanisms to implement that same behavior. Many compilers will give you tools in the form of language extensions like __attribute__ or #pragmas to control such platform-specific details. But those generally only make sense and will only work with that particular compiler on that particular platform.
You don't need a member function (which gets a this pointer passed as an arg); instead you can simply create constructor-like functions that reference a global array, like
#define PROBE(id, stuff, more_stuff) \
    __attribute__((constructor)) void \
    probeinit##id() { probes[id] = {id, stuff, 0 /*to be written later*/, more_stuff}; }
The trick is having this macro work in the middle of another function. GNU C allows nested functions as an extension (plain C++ does not), but I don't know if you can make them constructors.
You don't want to declare a static int dummy##id = something inside the function, because then you're adding overhead to the function you profile (gcc has to emit a thread-safe run-once guard for the initialization).
Really what you'd like is some kind of separate pass over the source that identifies all the PROBE macros and collects up their args to declare
struct probe global_probes[] = {
    {0, "EventName", 0 /*placeholder*/, filename, linenum},
    {1, "EventName", 0 /*placeholder*/, filename, linenum},
    ...
};
I'm not confident you can make that happen with CPP macros; I don't think it's possible to #define PROBE such that every time it expands, it redefines another macro to tack on more stuff.
But you could easily do that with an awk/perl/python / your fave scripting language program that scans your program and constructs a .c that declares an array with static storage.
Or better (for a single-threaded program): keep the runtime timestamps in one array, and the names and other metadata in a separate array, so the cache footprint of the probes is smaller. For a multi-threaded program, stores to the same cache line from different threads cause false sharing and cache-line ping-pong.
So you'd have #define PROBE(id, evname, blah blah) do { probe_times[id] = now(); }while(0)
and leave the handling of the later args to your separate preprocessing.
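A minimal sketch of that hot/cold split; now(), MAX_PROBES, and the metadata table are placeholders, and in practice the cold table would come from the separate preprocessing step described above:

#include <chrono>
#include <cstdint>
#include <cstdio>

constexpr int MAX_PROBES = 8;

// Hot path: the only array written at runtime.
static uint64_t probe_times[MAX_PROBES];

static uint64_t now()
{
    using namespace std::chrono;
    return duration_cast<nanoseconds>(
               steady_clock::now().time_since_epoch()).count();
}

// Only the timestamp store happens inline; the name is ignored here and
// handled offline by the preprocessing pass.
#define PROBE(id, name) do { probe_times[id] = now(); } while (0)

// Cold data: kept separate so the fast path touches fewer cache lines.
struct ProbeInfo { int id; const char *name; };
static const ProbeInfo probe_info[MAX_PROBES] = {
    {0, "EventProcessing"}, {1, "EventProcessing"}, {2, "EventProcessing"},
};

int main()
{
    PROBE(0, "EventProcessing");
    PROBE(1, "EventProcessing");
    std::printf("%s at %llu\n", probe_info[0].name,
                (unsigned long long)probe_times[0]);
}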

LLVM traverse CFG

I want to apply a DFS traversal algorithm to the CFG of a function. Therefore, I need the internal representation of the CFG. I need oriented (directed) edges and spotted MachineBasicBlock::const_succ_iterator. Is there a way to get the CFG with oriented edges by using a FunctionPass instead of a MachineFunctionPass? The reason why I want this is that I have problems using MachineFunctionPass. I have written several complex passes so far, but I cannot get a MachineFunctionPass to run.
I found this: "A MachineFunctionPass is a part of the LLVM code generator that executes on the machine-dependent representation of each LLVM function in the program. Code generator passes are registered and initialized specially by TargetMachine::addPassesToEmitFile and similar routines, so they cannot generally be run from the opt or bugpoint commands." So how can I run a MachineFunctionPass?
When I tried to run a simple MachineFunctionPass with opt, I got this error:
Pass 'mycfg' is not initialized.
Verify if there is a pass dependency cycle.
Required Passes:
opt: PassManager.cpp:638: void llvm::PMTopLevelManager::schedulePass(llvm::Pass*): Assertion `PI && "Expected required passes to be initialized"' failed.
So I have to initialize the pass. But in all my other passes I did not do any initialization, and I don't want to use INITIALIZE_PASS since I would have to recompile the LLVM file that keeps the pass registration... Is there a way to keep using static RegisterPass for a MachineFunctionPass? I mention that if I change to a FunctionPass, I have no problems, so indeed it might be an opt problem.
I have started another pass for the CallGraph. I am using CallGraph &CG = getAnalysis<CallGraph>(); successfully. Is there a similar way of getting CFGs? What I have found so far are succ_iterator/succ_begin/succ_end, which are from CFG.h, but I think I still have to get the CFG analysis somehow.
Thank you in advance !
I think you may have some terms mixed up. Basic blocks within each function are already arranged in a CFG, and LLVM provides you the tools to traverse it. See my answer to this question, for example.
MachineFunction lives on a different level, and unless you're doing something very special, this is not the level you should operate on. It's too low-level and too target-specific. There's some overview of the levels here.
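If staying at the IR level works for you, a DFS over a function's CFG needs no MachineFunctionPass at all. A sketch using the generic depth_first iterator; the pass name is made up, and the header paths shown are for recent LLVM (older releases keep the CFG graph traits in llvm/Support/CFG.h instead of llvm/IR/CFG.h):

#include "llvm/Pass.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/CFG.h"
#include "llvm/ADT/DepthFirstIterator.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

namespace {
struct CFGDFSPass : public FunctionPass {
  static char ID;
  CFGDFSPass() : FunctionPass(ID) {}

  bool runOnFunction(Function &F) override {
    // depth_first walks the directed CFG edges starting from the entry block.
    for (BasicBlock *BB : depth_first(&F.getEntryBlock())) {
      errs() << "visiting " << BB->getName() << "\n";
      // succ_begin(BB)/succ_end(BB) give the oriented out-edges of BB.
    }
    return false; // analysis only, nothing modified
  }
};
} // namespace

char CFGDFSPass::ID = 0;
static RegisterPass<CFGDFSPass> X("cfg-dfs", "DFS over the CFG (sketch)");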