LLVM instrumentation

I have recently been doing some research with LLVM.
First, I want to write a pass that instruments a .bc file,
so that it records which basic blocks are executed, i.e. the execution path through my .bc file.
Then I want to turn this .bc file into an .exe file. Please give me your suggestions, and if you have any examples of instrumentation with LLVM, please show me.

LLVM already comes with a number of instrumentation tools built-in. Take a look in the lib/Transforms/Instrumentation directory in the source tree.
One of the best-known passes is Address Sanitizer, an instrumentation-based memory error detector (kinda like Valgrind, but significantly faster). Address Sanitizer has a runtime component plus an LLVM pass that inserts the instrumentation; the pass lives in lib/Transforms/Instrumentation/AddressSanitizer.cpp. There's some description of how it works on this page.
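To make the "runtime component + pass" split concrete, here is a minimal sketch of what basic-block tracing boils down to, written by hand in plain C++ rather than as a real LLVM pass. The hook name `trace_basic_block`, the block IDs, and `take_trace` are made up for this illustration; a real pass would insert equivalent calls into the IR automatically, and the hook would live in a small runtime library linked into the final executable.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical runtime component: stores the IDs of executed basic blocks.
static std::vector<std::uint32_t> g_trace;

// Hypothetical hook. An instrumentation pass would insert a call to it at
// the start of every basic block, passing an ID it assigned to that block.
extern "C" void trace_basic_block(std::uint32_t block_id) {
    g_trace.push_back(block_id);
}

// Hand-instrumented stand-in for what such a pass might produce from
//   int abs_value(int x) { if (x < 0) return -x; return x; }
int abs_value(int x) {
    trace_basic_block(0);      // entry block
    if (x < 0) {
        trace_basic_block(1);  // block taken for negative input
        return -x;
    }
    trace_basic_block(2);      // fall-through block
    return x;
}

// Returns the recorded path and clears it for the next run.
std::vector<std::uint32_t> take_trace() {
    std::vector<std::uint32_t> out;
    out.swap(g_trace);
    return out;
}
```

With this sketch, calling `abs_value(-5)` records blocks 0 and 1, while `abs_value(3)` records blocks 0 and 2.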

Related

How to link g++ itself to my project?

To make things clear, my goal is to write a program that can save crucial parts of its own code (like the main function, other classes, and so on),
which I will do by simply copying them into a class. Then I want my program to add or remove some code (functions, objects, classes, ...) based on user input. After all of that, I want it to regenerate the code, automatically recreate the class that holds the crucial parts, compile the result, and delete itself.
I have all of the above planned and figured out except the part where it compiles that code. Is there any way to link g++ into my code? I know that g++ has a main function; wouldn't that conflict with my own main function?
Also, I can't use a compiler installed on the existing system, and I can't ship the compiler as a separate executable...
You don't need to link your code with a compiler. You could package your executable with a compiler. Your code could generate C++ source code and then call the compiler to generate a new executable.
Keep in mind that most compilers are huge in size. Try installing G++ or MinGW on your system to get an understanding.
For more details, search the internet for "Compiler Design Theory" which will give you information about translating languages like C++ into an executable.
Also, you will need to have the Operating System launch the new executable (and kill the present instance). This will take some research into the Operating System's API.
IMHO, the best method is to use an interpreted or scripting language. The alternative is to have your C++ program execute a script or pieces of a script.
Edit 1: very low level
At the very lowest level, machine code (the instruction bytes that a processor executes) needs to be generated.
The steps would be:
1. Generate the machine code and place it in some known location in memory (one that you have access to and that has execute permission).
2. Transfer execution to the machine code, remembering to push the return address on the stack.
The hard part is generating the machine code, especially adjusting the target addresses of all the branch instructions (unless you use something called Position Independent Code).
You could spend many months or years writing code that generates the machine code, or figure out how to use pieces of compilers (like Clang or G++).

Generate binary code (shared library) from embedded LLVM in C++

I am working on a high performance system written in C++. The process needs to be able to understand some complex logic (rules) at runtime written in a simple language developed for this application. We have two options:
1. Interpret the logic: run an embedded interpreter and generate a dynamic function call which, when it receives data, works on the data according to the interpreted logic.
2. Compile the logic into a plugin.so shared library, then use dlopen and dlsym to load the plugin and call the logic function at runtime.
Option 2 looks really attractive, as it would be optimized machine code and would run much faster than an interpreter embedded in the process.
The options I am exploring are:
write a compile method string compile( string logic, list & errors, list & warnings )
here input logic is a string containing logic coded in our custom language
it generates llvm ir, return value of the compile method returns ir string
write link method bool link(string ir, string filename, list & errors, list & warnings)
for the link method I searched the LLVM documentation, but I have not been able to find out whether it is possible to write such a method
If I am correct, LLVM IR is converted to LLVM bitcode or assembly code. Then either the LLVM JIT is used to run it in JIT mode, or the GNU Assembler is used to generate native code.
Is there a function in LLVM that does that? It would be much nicer if it were all done from within the code, rather than using a system command from C++ to invoke "as" to generate the plugin.so file for my requirement.
Please let me know if you know of any way I can generate native shared library code from my process at runtime.
llc is an LLVM tool that translates LLVM IR to native code (assembly or an object file). I think that is all you need.
Basically you can produce your LLVM IR the way you want and then call llc over your IR.
You can call it from the command line or you can go to the implementation of llc and find out how it works to do that in your own programs.
Here is a useful link:
http://llvm.org/docs/CommandGuide/llc.html
I hope it helps.
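For the command-line route, a sketch of the two steps might look like the following. It assumes `llc` and `clang` are on the PATH and that the IR has already been written to a file; the helper name `shared_library_commands` and the file names are made up for the example. `-filetype=obj` asks llc for an object file instead of assembly, and `-relocation-model=pic` requests position-independent code, which a shared library generally needs.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns the two shell commands that turn an LLVM IR file into a shared
// library: llc emits a position-independent object file, then the compiler
// driver links that object with -shared.
// Assumes the file names contain no spaces (no quoting is done here).
std::vector<std::string> shared_library_commands(const std::string& ir_file,
                                                 const std::string& so_file) {
    const std::string object_file = so_file + ".o";
    return {
        "llc -filetype=obj -relocation-model=pic " + ir_file +
            " -o " + object_file,
        "clang -shared " + object_file + " -o " + so_file,
    };
}
```

Each returned string can be passed to `std::system` in order, stopping at the first non-zero exit code. Doing the same work in-process, as the question asks, means replicating what llc's own source does with the LLVM libraries.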

How can I map OCaml bytecode to its original source code location?

Is there some nice feature of the format or library for going from some part of the bytecode to the line of code it originally came from? This would obviously be useful for debugging and error messages.
In particular, I'm looking at how hard it would be to add support for source maps to js_of_ocaml.
When compiled with debug information enabled (option -g), the bytecode carries so-called "event" structures marking for example the function entry and return point, which provide source location and typing information.
As a proof of concept of how to inspect this information, I have created a small branch of the ocamlpp tool (a small utility by Benoît Vaugon to inspect bytecode files) that prints this debug information alongside the bytecode instructions.
I have no idea whether js_of_ocaml takes the necessary steps to preserve this location information throughout the compilation process. You should probably contact the maintainer, Jérôme Vouillon, to ask for more information.
js_of_ocaml -debuginfo uses the debug_event entries in the bytecode to write the original line of code as a comment in its output.

LLVM: moving generated code around in a distributed/concurrent system

I'm using the LLVM C++ API mostly as a code generator for a scripting language that is parsed and evaluated (generating code, compiling, and executing it) at runtime. Currently I'm investigating future use cases in the context of a distributed/concurrent system and wonder if and how these use cases could be implemented. Maybe you can share your thoughts:
1) Is there a way to generate LLVM code on one node in a distributed system, serialize it to some wire format, send it to another node, compile or recompile it there and then execute it? I'm already stuck finding methods to serialize a module/function.
2) Are there ways to enable multi-threaded code generation/compilation within the same LLVMContext, i.e., a pool of threads shares an LLVMContext and generates/executes code within this context simultaneously? What I have found out so far is that there should be one LLVMContext per thread in this case. However, can I then share a module between the different contexts, and, relating to 1), how could I move generated code from one module to another?
You can definitely use LLVM bitcode format to forward the code from one node to another. See include/llvm/Bitcode/ReaderWriter.h and around for more info. You can also check the sources of LLVM tools to see how the bitcode is serialized and deserialized. You might find http://llvm.org/docs/BitCodeFormat.html useful.

LLVM: what is it and how can I use it for cross-platform compilation?

I have been reading here and there that LLVM can be used to ease the pain of cross-platform compilation in C++. I tried to read the documentation, but I didn't understand how I can
use it for real-life development problems. Can someone please explain in simple words how I can use it?
The key concept of LLVM is a low-level "intermediate" representation (IR) of your program.
This IR is at about the level of assembler code, but it contains more information to facilitate optimization.
The power of LLVM comes from its ability to defer compilation of this intermediate representation to a specific target machine until just before the code needs to run. A just-in-time (JIT) compilation approach can be used for an application to produce the code it needs just before it needs it.
In many cases, you have more information at the time the program is running than you had back at head office (at build time), so the program can be optimized much further.
To get started, you could compile a C++ program to a single intermediate representation, then compile it to multiple platforms from that IR.
You can also try the Kaleidoscope demo, which walks you through creating a new language without having to actually write a compiler, just write the IR.
In performance-critical applications, the application can essentially write its own code that it needs to run, just before it needs to run it.
Why don't you go to the LLVM website and check out all the documentation there? They explain in great detail what LLVM is and how to use it. For example, they have a Getting Started page.
LLVM is, as its name says, a low-level virtual machine that has a code generator. If you want to compile to it, you can use either the GCC front end or Clang, a C/C++ compiler for LLVM that is still a work in progress.
It's important to note that a bunch of information about the target comes from the system header files that you use when compiling. LLVM does not defer resolving things like "size of pointer" or "byte layout", so if you compile with 64-bit headers for a little-endian platform, you cannot use that LLVM source code to target 32-bit big-endian assembly output later.
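A small sketch of why this happens: in C++, values such as `sizeof(void*)` and struct field offsets are compile-time constants, so the front end has already folded them into plain numbers by the time the IR is produced. The names `pointer_bits`, `Node`, and `next_offset` below are made up for the illustration.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Number of bits in a pointer on the target this file is compiled for.
// This is a compile-time constant, so the IR only ever sees the number.
constexpr std::size_t pointer_bits() { return sizeof(void*) * CHAR_BIT; }

// A struct whose size and field offsets depend on the target's pointer
// size and alignment rules.
struct Node {
    char tag;
    Node* next;
};

// Byte offset of `next` inside Node, also fixed at compile time.
constexpr std::size_t next_offset() { return offsetof(Node, next); }

// Holds on every target, but the concrete numbers differ between targets.
static_assert(sizeof(Node) >= sizeof(char) + sizeof(Node*),
              "Node must hold both of its fields");
```

On a typical 64-bit target `pointer_bits()` would be 64 and on a typical 32-bit target 32; IR generated for one of them carries that value as a literal, so it does not adapt when handed to a back end for the other.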
There is a good chapter in a book explaining everything nicely here: www.aosabook.org/en/llvm.html