About definition of llvm pass - llvm

I do not quite understand the definition of a pass in LLVM. Does it mean I can only use the opt command to run the program?
My situation is this: I want to find loops in a CFG of basic blocks, and I want to use the LLVM API instead of writing the code myself. I found a file called LoopInfo.h (http://llvm.org/docs/doxygen/html/LoopInfo_8h_source.html), which includes Pass.h and has a class passinfo that inherits from FunctionPass. Does that mean I can only invoke it through the opt command, rather than writing a normal project that uses some of the class's functions and can be built and executed on its own? I hope I have stated my question clearly.

You can analyze and manipulate LLVM IR just fine without knowing anything about passes. Just use the LLVM API and you'll be OK.
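For orientation, here is a rough sketch of the kind of computation a loop analysis performs on a CFG: it computes dominators, treats an edge whose target dominates its source as a back edge, and collects the natural loop of that edge. This is a toy model in plain standard C++, not the LLVM API; the Cfg alias and all function names are hypothetical, and it assumes block 0 is the entry and every block is reachable. With LLVM you would query LoopInfo instead of writing this yourself.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Toy CFG, NOT an LLVM type: block i jumps to the blocks in succ[i].
// Assumptions: block 0 is the entry and every block is reachable from it.
using Cfg = std::vector<std::vector<int>>;

// Build predecessor lists from successor lists.
std::vector<std::vector<int>> predecessors(const Cfg& succ) {
    std::vector<std::vector<int>> pred(succ.size());
    for (int b = 0; b < static_cast<int>(succ.size()); ++b)
        for (int s : succ[b]) pred[s].push_back(b);
    return pred;
}

// dom[b] = set of blocks that dominate b (simple iterative data-flow version).
std::vector<std::set<int>> computeDominators(const Cfg& succ) {
    const int n = static_cast<int>(succ.size());
    if (n == 0) return {};
    const auto pred = predecessors(succ);
    std::set<int> all;
    for (int b = 0; b < n; ++b) all.insert(b);
    std::vector<std::set<int>> dom(n, all);
    dom[0] = {0};  // the entry is dominated only by itself

    bool changed = true;
    while (changed) {
        changed = false;
        for (int b = 1; b < n; ++b) {
            // Intersect the dominator sets of all predecessors, then add b.
            std::set<int> next = all;
            for (int p : pred[b]) {
                std::set<int> kept;
                for (int d : next)
                    if (dom[p].count(d)) kept.insert(d);
                next = kept;
            }
            next.insert(b);
            if (next != dom[b]) { dom[b] = next; changed = true; }
        }
    }
    return dom;
}

// A back edge is an edge tail -> header where header dominates tail.
std::vector<std::pair<int, int>> findBackEdges(const Cfg& succ) {
    const auto dom = computeDominators(succ);
    std::vector<std::pair<int, int>> edges;
    for (int b = 0; b < static_cast<int>(succ.size()); ++b)
        for (int s : succ[b])
            if (dom[b].count(s)) edges.emplace_back(b, s);
    return edges;
}

// Natural loop of a back edge: the header plus every block that can reach
// the tail without passing through the header.
std::set<int> naturalLoop(const Cfg& succ, int tail, int header) {
    const int n = static_cast<int>(succ.size());
    assert(tail >= 0 && tail < n && header >= 0 && header < n);
    const auto pred = predecessors(succ);
    std::set<int> body = {header};
    std::vector<int> work;
    if (body.insert(tail).second) work.push_back(tail);
    while (!work.empty()) {
        const int b = work.back();
        work.pop_back();
        for (int p : pred[b])
            if (body.insert(p).second) work.push_back(p);
    }
    return body;
}
```

For a CFG such as `{{1}, {2, 3}, {1}, {}}`, calling `findBackEdges` and then `naturalLoop` on each returned edge yields the loop headers and bodies.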
So what's the deal with passes? Well, if you do write your analysis or transformation in the form of a pass, by following this guide, you can still use it as any regular C++ class [1], but you get some advantages:
You can use the opt tool to run your pass. It takes care of everything else for you (e.g. loading the IR), makes it very easy to run other passes before or after yours (including the useful verification pass), makes it easy to enable or disable debug mode, etc.
You can easily combine your pass with other passes using a pass manager, which is very convenient (it takes care of pass dependencies for you, for example).
So in general, writing things in the form of passes is recommended but not required.
[1] Well, if you declare requirements on other passes, then you'll have to run those yourself if you're not using opt or a pass manager.
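To picture what "taking care of pass dependencies" means, here is a minimal toy scheduler. It is not LLVM's pass manager API: the registry type, the pass names, and scheduleWithDependencies are all hypothetical, and a real pass manager does considerably more (caching and invalidating analysis results, for instance). The sketch only shows the ordering idea, namely that every required pass is run before the pass that needs it.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Toy registry, NOT an LLVM type: pass name -> names of passes it requires.
using PassRegistry = std::map<std::string, std::vector<std::string>>;

// Returns an order in which `target` and everything it (transitively)
// requires can run, with each requirement placed before its user.
std::vector<std::string> scheduleWithDependencies(const PassRegistry& registry,
                                                  const std::string& target) {
    std::vector<std::string> order;
    std::set<std::string> done;        // already scheduled
    std::set<std::string> inProgress;  // currently on the recursion stack

    std::function<void(const std::string&)> visit =
        [&](const std::string& name) {
            if (done.count(name)) return;
            auto it = registry.find(name);
            if (it == registry.end())
                throw std::runtime_error("unknown pass: " + name);
            if (!inProgress.insert(name).second)
                throw std::runtime_error("dependency cycle at: " + name);
            for (const std::string& dep : it->second) visit(dep);
            inProgress.erase(name);
            done.insert(name);
            order.push_back(name);
        };

    visit(target);
    // Postcondition: the requested pass always runs last.
    assert(!order.empty() && order.back() == target);
    return order;
}
```

If you use a pass as a plain C++ class outside opt or a pass manager, this ordering is the part you have to do by hand.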

The easiest way is to add a pass that is executed via the opt command. However, you should also be able to create a dedicated executable that reads LLVM bitcode, runs your pass, and writes the bitcode back.
See here for an example:
Parsing and Modifying LLVM IR code
The source of the opt command might also be useful:
https://llvm.org/svn/llvm-project/llvm/trunk/tools/opt/opt.cpp

Related

C++ Source-to-Source Transformation with Clang

I am working on a project for which I need to "combine" code distributed over multiple C++ files into one file. Due to the nature of the project, I only need one entry function (the function that will be defined as the top function in the Xilinx High-Level-Synthesis software -> see context below). The signature of this function needs to be preserved in the transformation. Whether other functions from other files simply get copied into the file and get called as a subroutine or are inlined does not matter. I think due to variable and function scopes simply concatenating the files will not work.
Since I did not write the C++ code myself and it is quite comprehensive, I am looking for a way to do the transformation automatically. The possibilities I am aware of to do this are the following:
Compile the code to LLVM IR with inlining flags and use a C++/C backend to turn the LLVM code back into the source language. This will result in poor source code and requires either an old release of Clang and LLVM or another backend such as the one from JuliaComputing.
The other option would be to develop a tool that uses the AST and a library like LibTooling to restructure the code. This option would probably result in better code and put everything into one file without the unnecessary inlining. However, this option seems too complicated just to put all the code into one file.
Hence my question: Are you aware of a better or simply alternative approach to solve this problem?
Context: The project aims to run some of the code on a Xilinx FPGA and the Vitis High-Level-Synthesis tool requires all code that is to be made into a single IP block to be contained in a single file. That is why I need to realise this transformation.

Can an LLVM Pass be used to change code?

I am writing a program optimization, which involves adding new functions, removing lines of code, inserting function calls and changing arguments to functions.
Is all this possible using an LLVM pass, and if so, how would I write the code for it?
I have had a look at the "how to write an LLVM pass" page on the LLVM website, but it does not explain anything about altering code.
This is a really good guide for getting started with writing a pass. It also has an example of how to change code.

A higher level tool for writing a pass in LLVM

Normally, if you want to modify LLVM IR, you need to write a pass. However, writing a pass yourself is sometimes overkill if a higher-level tool could do the job for you.
For example, someone might wish to log every load and store in the program. For that purpose, they would need to inject code that does the logging. If there were a higher-level tool, it could provide callbacks in which we write what we want. In this case, for example, it could provide OnLoad and OnStore functions that we fill in to tell the tool what to do on each load and store. Does such a tool exist?
So basically I want something similar to what is provided by Dynamic Binary Instrumentation tools but that works with LLVM, for compile time code injection.
I think you should consider using PIN instead of LLVM for such things: http://www.pintool.org/
PIN enables you to insert instrumentation/analysis code at several granularity levels: instruction, basic block, function, trace, and even load/unload of shared libraries. It may be far more practical, since you won't need to compile the application, so you can analyze programs that aren't open source, for example.
There are versions of PIN for Windows and Linux.
PS: Another tool that seems useful: http://eces.colorado.edu/~blomsted/llvmpin/llvmpin.html
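The OnLoad/OnStore callback idea from the question can be sketched on a toy instruction stream. Everything here is hypothetical (the Inst type, the Hooks struct, and the instrument function are not part of LLVM or PIN). It only illustrates the shape of such a tool: the user supplies callbacks, and the tool walks the code and injects whatever the callbacks return before each load or store. An LLVM pass doing this for real would walk the IR instructions and look for LoadInst and StoreInst instead.

```cpp
#include <functional>
#include <string>
#include <vector>

// Toy instruction stream, NOT LLVM IR.
enum class Op { Load, Store, Add, Call };

struct Inst {
    Op op;
    std::string operand;  // variable name, or callee text for Op::Call
};

// A hook receives the instruction being visited and returns the
// instructions to inject in front of it.
using Hook = std::function<std::vector<Inst>(const Inst&)>;

struct Hooks {
    Hook onLoad;   // called for every Op::Load (may be left empty)
    Hook onStore;  // called for every Op::Store (may be left empty)
};

// Returns a copy of `program` with the hook-provided instructions
// inserted before each load and store.
std::vector<Inst> instrument(const std::vector<Inst>& program,
                             const Hooks& hooks) {
    std::vector<Inst> out;
    for (const Inst& inst : program) {
        const Hook* hook = nullptr;
        if (inst.op == Op::Load) hook = &hooks.onLoad;
        else if (inst.op == Op::Store) hook = &hooks.onStore;

        if (hook != nullptr && *hook) {  // skip hooks the user did not set
            for (const Inst& injected : (*hook)(inst)) out.push_back(injected);
        }
        out.push_back(inst);
    }
    return out;
}
```

A user would set `hooks.onLoad` to a lambda that returns, say, a single Op::Call instruction representing a logging call, and leave `hooks.onStore` empty if stores are of no interest.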

LLVM: moving generated code around in a distributed/concurrent system

I'm using the LLVM C++ API mostly as a code generator for a scripting language that is parsed and evaluated (generating code, compiling, and executing it) at runtime. Currently I'm investigating future use cases in the context of a distributed/concurrent system and wonder if and how these use cases could be implemented. Maybe you can share your thoughts:
1) Is there a way to generate LLVM code on one node in a distributed system, serialize it to some wire format, send it to another node, compile or recompile it there, and then execute it? I'm already stuck finding methods to serialize a module/function.
2) Are there ways to enable multi-threaded code generation/compilation within the same LLVMContext, i.e., a pool of threads shares an LLVMContext and generates/executes code within this context simultaneously? What I have found out so far is that there should be one LLVMContext per thread in this case. However, can I then share a module between the different contexts and, relating to 1), how could I move generated code from one module to the other?
You can definitely use the LLVM bitcode format to forward the code from one node to another. See include/llvm/Bitcode/ReaderWriter.h (split into BitcodeReader.h and BitcodeWriter.h in later LLVM releases) and the surrounding files for more info. You can also check the sources of the LLVM tools to see how the bitcode is serialized and deserialized. You might find http://llvm.org/docs/BitCodeFormat.html useful.

llvm: strategies to build JIT content incrementally

I want my language backend to build functions and types incrementally without polluting the main module and context when functions and types fail to build successfully (due to problems with the user input).
I asked an earlier question regarding this.
One strategy I can see for this would be building everything in a temporary module and LLVMContext, and migrating to the main context only after success, but I am not sure whether that is possible with the current API. For instance, I wouldn't know how to migrate that content between different contexts, as they are supposed to represent isolated islands of LLVM functionality. But maybe there is always the alternative of saving everything to .bc and loading it somewhere else?
What other strategies would you suggest for achieving this?
Assuming you have two modules, source and destination, it's possible to copy a function from source to destination. The code in LLVM you can use as an example is the body of the LLVM linker, in lib/Linker/LinkModules.cpp.
In particular, look at the linkFunctionProto and linkFunctionBody methods in that file. linkFunctionBody copies the function definition, and uses the llvm::CloneFunctionInto utility for the heavy lifting.
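As for LLVMContext, unless you specifically need to run several LLVM instances simultaneously in different threads, don't worry about it too much and just use getGlobalContext() everywhere a context is required (this function was removed in LLVM 3.9; in later releases you create a single LLVMContext yourself and pass it around instead). Read this doc page for more information.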