How to apply llvm passes using CMake


We have implemented an LLVM pass and compiled it to a library (called libMyPass.so).
We want to apply this pass to a project (all of its source files) that is built with CMake. Is there a way to do so in CMake?
Generally, we use clang to emit LLVM bitcode from a source file, opt to apply the pass to the bitcode, llc to translate the new bitcode to assembly, and clang again to assemble and link that into an executable. Can I encapsulate this process using CMake?

You can have a look at, or even use, this repo, which implements the various steps as CMake commands.
The gist of it is that it creates various commands (using CMake's add_custom_command) that do exactly what you're looking for: they combine the LLVM subtools with the CMake target properties to generate IR from the sources and then native object code (i.e. .o).
For example, llvmir_attach_bc_target() attaches to a top-level CMake target and creates an (unoptimized) .bc file for each source file in its SOURCES property.
The same repo contains various examples that should be enough to get you started.
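For reference, the manual clang, opt, llc, clang flow from the question can be sketched as the four command lines that any build-system wrapper ends up generating per source file. This is only a sketch: the pass name `mypass` and the output file names are hypothetical, and it assumes the plugin registers its pass with the new pass manager (a legacy-pass-manager plugin would use `opt -load ./libMyPass.so -mypass` instead).

```python
import shlex
from pathlib import Path


def pass_pipeline(source, plugin="./libMyPass.so", pass_name="mypass", exe="a.out"):
    """Return the four commands of the manual flow as argv lists.

    `pass_name` and `exe` are hypothetical placeholders; adjust them to
    whatever your plugin registers and whatever your target is called.
    """
    stem = Path(source).with_suffix("")
    bc = f"{stem}.bc"          # unoptimized bitcode
    opt_bc = f"{stem}.opt.bc"  # bitcode after the custom pass
    asm = f"{stem}.s"          # assembly produced by llc
    return [
        # 1. source -> LLVM bitcode
        ["clang", "-emit-llvm", "-c", source, "-o", bc],
        # 2. run the custom pass (new pass manager syntax, an assumption)
        ["opt", f"-load-pass-plugin={plugin}", f"-passes={pass_name}", bc, "-o", opt_bc],
        # 3. bitcode -> assembly
        ["llc", opt_bc, "-o", asm],
        # 4. assemble and link
        ["clang", asm, "-o", exe],
    ]


if __name__ == "__main__":
    for cmd in pass_pipeline("foo.c"):
        print(shlex.join(cmd))
```

Each of these command lines maps naturally onto one add_custom_command invocation whose OUTPUT is the file named after `-o`.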

Related

Is it possible to compile LLIR to binary without clang?

I'm writing a compiler that embeds the LLVM API. By copying code from the llc tool, I can output assembly language or object files that I can turn into binaries using clang or an assembler.
But I want my compiler to be self contained. Is it possible to turn LLIR into binaries using LLVM? This seems like the sort of thing that should be in the LLVM toolkit.

Yes, it is possible, and this is what llc does when given the -filetype=obj argument.
You can consult the compileModule function to learn how to use the programmatic API.
Note that this will only generate an object file for a given translation unit. You will also need a linker to convert it into a proper executable or library. The LLVM linker, lld, can also be embedded into client applications as a library, so in the end you will be able to create a self-contained compiler.
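Before embedding anything, it can help to try the same two steps with the command-line tools. The sketch below only builds the command lines; the file names are hypothetical, and `extra_link_inputs` stands in for whatever startup files and libraries your platform needs at link time, which this example does not try to guess.

```python
def object_and_link_commands(bitcode, output, extra_link_inputs=()):
    """Build the llc and ld.lld command lines for one bitcode file.

    `extra_link_inputs` is a placeholder for platform-specific startup
    objects and libraries (an assumption left to the caller).
    """
    obj = bitcode.rsplit(".", 1)[0] + ".o"
    return [
        # Emit a native object file directly instead of assembly.
        ["llc", "-filetype=obj", bitcode, "-o", obj],
        # Link with the ELF flavour of lld.
        ["ld.lld", obj, *extra_link_inputs, "-o", output],
    ]


if __name__ == "__main__":
    for cmd in object_and_link_commands("prog.bc", "prog"):
        print(" ".join(cmd))
```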

How to generate llvm bitcode for large programs with many source code files and a huge Makefile (e.g. memcached)?

I have my pass that I tested on toy programs and now I want to run it on large programs, many of which are open source programs like memcached. Such programs have their own Makefile and a complicated compilation procedure. I want to generate a bitcode file for such programs to let my pass work on them. Help and suggestions will be appreciated!

Depending on what your pass is doing, you can:
Build with LTO: adding -flto to the CFLAGS and building your application with your own linker plugin is quite seamless from a build-system point of view. However, it requires some understanding of how to set up LTO.
Build with your own clang: statically add your pass to the LLVM pipeline and use the clang you built. Depending on the build system, exporting CC/CXX environment variables that point to your installed clang should be enough.
Build by loading your pass dynamically into clang, for example this is what Polly is (optionally) doing.

If you add -emit-llvm to your clang flags, it will emit BC files instead of object files or LL files instead of assembly.
You'll likely have to modify the makefile some more, but that should get you started in the right direction.
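The options above mostly come down to which variables the Makefile sees. A minimal sketch of that, assuming the project's Makefile honours CC, CXX, CFLAGS, CXXFLAGS and LDFLAGS from the environment (many do, some override them), and assuming a new-pass-manager plugin for the `-fpass-plugin=` case:

```python
import os


def make_environment(mode, plugin="./libMyPass.so", base_env=None):
    """Return an environment dict for running `make` with clang.

    mode is one of "lto", "plugin" or "emit-llvm". The plugin path is a
    hypothetical default taken from the first question on this page.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CC"] = "clang"
    env["CXX"] = "clang++"
    if mode == "lto":
        flags = "-flto"
    elif mode == "plugin":
        # Load the pass into clang's own pipeline (new pass manager).
        flags = f"-fpass-plugin={plugin}"
    elif mode == "emit-llvm":
        flags = "-emit-llvm"
    else:
        raise ValueError(f"unknown mode: {mode}")
    for var in ("CFLAGS", "CXXFLAGS"):
        env[var] = f"{env.get(var, '')} {flags}".strip()
    if mode == "lto":
        # LTO must also be enabled at link time.
        env["LDFLAGS"] = f"{env.get('LDFLAGS', '')} -flto".strip()
    return env


# Usage (not executed here):
#   subprocess.run(["make"], env=make_environment("lto"), check=True)
```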

How can I configure cmake to compile a file twice with two different compilers?

I'm adding a SYCL/OpenCL kernel to a parallel C++ program which is built with cmake. Using SYCL means I need to get cmake to compile my C++ source file twice: once with the SYCL compiler, and once with the project's default compiler, which is GCC. Both compilations produce outputs which need to be included when linking.
I'm completely new to cmake. I've added the GCC compile and link steps to the project's CMakeLists.txt, but what's the best way to add the SYCL compile step? I'm currently trying the "add_custom_command" option with "PRE_BUILD", but the command which is run doesn't seem to know about the paths which are provided to the normal compile and link steps: the current working directory, include directories, source directories, etc. I'm having to specify all of these manually, and I'm having to figure some of them out first.
It feels like I'm doing this the hard way. Is there a recommended (or at least better) way to get cmake to compile a file twice with two different compilers?
Also, there used to be a SYCL tag, but it's disappeared. Can someone recreate it, please?

Be aware that PRE_BUILD only behaves as PRE_BUILD with the Visual Studio generators (Visual Studio 7 or later); with other generators it is treated as PRE_LINK.
If you need to use two compilers on the same source file, add a dependency from the GCC-built target to the custom target that runs the SYCL compiler, so that GCC runs after the SYCL compiler.

I can think of a couple of other ways to do it:
Generate two build configurations
Write a script to call both compilers
The first method is probably the easiest. You might need to maintain two separate CMakeLists.txt files, or possibly just parameterize the compiler and options and pass them as arguments to CMake when you generate (CC=gcc, CXX=g++, CFLAGS/CXXFLAGS, etc.). You might be able to do the same with the underlying build system (e.g. make) and just run it twice.
The second method is a bit more complicated. Write a simple script that accepts both sets of compiler options and compiles each file with the two compilers in sequence. The script can then be configured as CC/CXX.
So, the command options would look something like this...
--cc1 sycl --cc2 gcc --cc1opts ... --cc2opts ...
I'm not familiar with SYCL though, so I don't know how it's normally used.
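A rough sketch of the core of such a wrapper script follows. The compiler name `sycl-cc`, the option strings and the file names are placeholders, not real SYCL tool names. Note that both commands receive the same shared arguments, including any `-o`, so a real wrapper would also have to give the two outputs distinct names.

```python
import shlex
import subprocess


def dual_compile_commands(cc1, cc1opts, cc2, cc2opts, shared_args):
    """Build one command per compiler for the same shared arguments.

    cc1opts/cc2opts are single strings, as in
    `--cc1opts "..." --cc2opts "..."`, and are split shell-style.
    """
    return [
        [cc1, *shlex.split(cc1opts), *shared_args],
        [cc2, *shlex.split(cc2opts), *shared_args],
    ]


def run_in_sequence(commands):
    """Run the commands in order and stop at the first failure."""
    for cmd in commands:
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return result.returncode
    return 0


if __name__ == "__main__":
    # "sycl-cc" is a hypothetical placeholder for the SYCL compiler driver.
    for cmd in dual_compile_commands("sycl-cc", "-O2", "g++", "-O2 -Wall",
                                     ["-c", "kernel.cpp"]):
        print(shlex.join(cmd))
```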

llvm and install time optimization

According to the official LLVM page, it is possible to have install-time optimization which, as I understand it, first compiles to bytecode on the build machine before distribution and then, on the target machine, converts the bytecode to native code at install time.
Is there any real-world example of this feature? More specifically, I am wondering if it is possible to take an arbitrary open source C/C++ project which uses autoconf (i.e. typically built and installed by ./configure && make && make install), and
on the build machine, run ./configure && make in a special way (e.g. setting some environment variables, or even modifying configure.ac or some other autoconf files) so that it generates executables and libraries as bytecode;
transfer the build tree to the target machine, and run make install in a special way so that it installs all files as usual, but converts the bytecode to native code for executables and libraries.

As @delnan indicated, this isn't possible in general. LLVM is a target independent IR, but it is not portable.
There have been a few attempts to construct a portable IR, PNaCl among them, but these are different from LLVM.

LLVM IR is target independent, meaning that it could be generated on one machine (compile time) and converted to bytecode (link time) on another and it would still generate the same bytecode as it would have on the first machine, provided that you were using the same version of LLVM with the same options. It does not mean that the IR that was generated would produce a valid binary on all machines.
The problem with this lies in the way that the ABI can vary between different systems.
This post addresses those differences in more detail:
LLVM bitcode cross-platform

Building autotooled software to LLVM bitcode

I would like to compile software using the autotools build system to LLVM bitcode; that is, I would like the executables obtained at the end to be LLVM bitcode, not actual machine code.
(The goal is to be able to run LLVM bitcode analysis tools on the whole program.)
I've tried specifying CC="clang -emit-llvm -use-gold-plugins" and variants to the configure script, to no avail. There is always something going wrong (e.g. the package builds .a static libraries, which are refused by the linker).
It seems to me that the correct way to do it would be for LLVM bitcode to be a cross-compilation target, set with --host=, but there is no such standard target (even though there is a target for Knuth's MMIX).
So far I've used kludges, such as compiling with CC="clang -emit-llvm -use-gold-plugins" and running linking lines (using llvm-ld or llvm-link) manually. This works for simple packages such as grep.
I would like a method that's robust and works with most, if not all, configure scripts, including when there are intermediate .a files, or intermediate targets.

There are some methods like this. But for simple builds where intermediate static libraries are not used, you can do something simpler. The things you will need are:
llvm, configured with gold plugin support. Refer to this
clang
dragonegg, if you need front-end for fortran, go, etc.
The key is to enable '-flto' for either clang or dragonegg (the front-end), both at compile time and at link time. It is straightforward for clang:
CC = clang
CLINKER = clang
CFLAGS = -flto -c
CLINKFLAGS = -flto -Wl,-plugin-opt=also-emit-llvm
If needed, add additional '-plugin-opt' options to specify LLVM-specific codegen options:
-Wl,-plugin-opt=also-emit-llvm,-plugin-opt=-disable-fp-elim
The dumped whole-program bytecode will sit alongside your final executable.
Two additional things are needed when using dragonegg.
First, dragonegg is not aware of the location of the LLVM gold plugin, so it needs to be specified in the linker flags like this: -Wl,-plugin=/path/to/LLVMgold.so,-plugin-opt=...
Second, dragonegg is only able to dump IR rather than bytecode. You need a wrapper script for that purpose. I created one here. Works fine for me.
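For an autotools package, the clang settings above can be passed to configure as VAR=VALUE arguments rather than edited into a Makefile. The sketch below only assembles that command line. The flags are copied from this answer, so whether your version of the LLVM gold plugin still accepts `also-emit-llvm` is something to check; the helper name is made up for illustration.

```python
def configure_command(extra_plugin_opts=()):
    """Build a ./configure invocation that enables LTO with clang.

    Each entry of `extra_plugin_opts` becomes one more -plugin-opt=...
    item on the same -Wl, linker option, as in the example above.
    """
    linker_opts = ["-plugin-opt=also-emit-llvm"]
    linker_opts += [f"-plugin-opt={opt}" for opt in extra_plugin_opts]
    ldflags = "-flto -Wl," + ",".join(linker_opts)
    return [
        "./configure",
        "CC=clang",
        "CFLAGS=-flto",
        f"LDFLAGS={ldflags}",
    ]


if __name__ == "__main__":
    print(configure_command(["-disable-fp-elim"]))
```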
