How do I generate a while loop using the LLVM C++ API? I would like to understand how the branch and phi instructions work together.
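A minimal sketch of one way to do it with the C++ API (assuming a reasonably recent LLVM; the function name count_up and the block names are just placeholders). It builds the equivalent of i = 0; while (i < n) i = i + 1; return i;. The loop variable is carried by a phi node in the condition block, with one incoming value per predecessor: the initial value from the entry block and the updated value from the loop body. An unconditional branch enters the loop, and a conditional branch chooses between the body and the exit block:

    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/LLVMContext.h"
    #include "llvm/IR/Module.h"
    #include "llvm/IR/Verifier.h"
    #include "llvm/Support/raw_ostream.h"
    using namespace llvm;

    int main() {
      LLVMContext Ctx;
      Module M("while_demo", Ctx);
      IRBuilder<> B(Ctx);
      Type *I32 = Type::getInt32Ty(Ctx);

      // i32 count_up(i32 n)
      FunctionType *FT = FunctionType::get(I32, {I32}, false);
      Function *F = Function::Create(FT, Function::ExternalLinkage, "count_up", &M);
      Value *N = &*F->arg_begin();

      BasicBlock *Entry = BasicBlock::Create(Ctx, "entry", F);
      BasicBlock *Cond  = BasicBlock::Create(Ctx, "loop.cond", F);
      BasicBlock *Body  = BasicBlock::Create(Ctx, "loop.body", F);
      BasicBlock *Exit  = BasicBlock::Create(Ctx, "loop.exit", F);

      // entry: jump to the condition check.
      B.SetInsertPoint(Entry);
      B.CreateBr(Cond);

      // loop.cond: the phi merges i's value from the two predecessors.
      B.SetInsertPoint(Cond);
      PHINode *I = B.CreatePHI(I32, 2, "i");
      I->addIncoming(B.getInt32(0), Entry);          // i = 0 on first entry
      Value *Cmp = B.CreateICmpSLT(I, N, "cmp");
      B.CreateCondBr(Cmp, Body, Exit);               // while (i < n)

      // loop.body: compute the next value of i, then branch back to the check.
      B.SetInsertPoint(Body);
      Value *Next = B.CreateAdd(I, B.getInt32(1), "i.next");
      I->addIncoming(Next, Body);                    // second phi edge: value from the body
      B.CreateBr(Cond);

      // loop.exit: i still holds the value from the last condition check.
      B.SetInsertPoint(Exit);
      B.CreateRet(I);

      verifyFunction(*F, &errs());
      M.print(outs(), nullptr);
      return 0;
    }

An alternative many front-ends start with is to keep the loop variable in a stack slot created with CreateAlloca and let the mem2reg pass introduce the phi nodes for you; that sidesteps the phi bookkeeping entirely while you get the rest of the code generation working.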
Related
I read that LLVM is a backend with multiple frontends: C, C++, Swift, and so on.
So, can I convert IR generated from one language into IR generated from another?
Or can I just use the IR produced from one language in a project written in a different language?
I'm working on a toy programming language/compiler using OCaml and its LLVM bindings. I want to have hashtables/hashmaps as a built-in data structure in my language, but I'm confused as to how to go about them.
I know the LLVM C++ API has an ADT directory with a bunch of data structures that would suit my needs, but I don't know how to call them using the OCaml API.
Another option would be to implement them in C and link them in, but I would rather focus on the first idea.
It would be helpful if anyone has useful resources on how to use/implement these data structures in LLVM (either through the OCaml bindings or directly in the IR, not the C++ API).
I know the LLVM C++ API has an ADT directory with a bunch of data structures that would suit my needs, but I don't know how to call them using the OCaml API.
You can't call them from the OCaml API, but even if you could, they wouldn't solve your problem. They're just data structures, not a way to generate LLVM code. If you could use them in OCaml, you'd simply have a few more containers to choose from alongside OCaml's built-in lists, maps, sets, and arrays. You still wouldn't have a way to generate LLVM code implementing these data structures; that's not what those classes do.
Those data structures could just as well be a separate library that has nothing to do with LLVM. They're part of LLVM because they're used by the LLVM project, not because they're directly related to generating LLVM code.
Another option would be to implement them in C and link them in, but I would rather focus on the first idea.
That, or linking against an existing library that implements hashtables (with glue code added to make it fit your language's type system and memory model as appropriate), are your only real options.
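In practice the "link a library" route boils down to declaring the library's functions as external symbols in the module you generate and emitting calls to them; the linker later resolves those symbols against the C (or other) implementation. A rough sketch with the C++ API on a recent LLVM (the runtime functions ht_new and ht_put are hypothetical; the OCaml bindings expose equivalent operations such as Llvm.declare_function and Llvm.build_call):

    #include "llvm/IR/DerivedTypes.h"
    #include "llvm/IR/IRBuilder.h"
    #include "llvm/IR/LLVMContext.h"
    #include "llvm/IR/Module.h"
    #include "llvm/Support/raw_ostream.h"
    using namespace llvm;

    int main() {
      LLVMContext Ctx;
      Module M("ht_demo", Ctx);
      IRBuilder<> B(Ctx);
      Type *VoidPtr = PointerType::getUnqual(B.getInt8Ty()); // opaque handle to the table

      // Declarations only -- the bodies live in the runtime library you link against.
      // extern void *ht_new(void);
      FunctionCallee HtNew = M.getOrInsertFunction("ht_new", FunctionType::get(VoidPtr, false));
      // extern void ht_put(void *table, int64_t key, int64_t value);
      FunctionCallee HtPut = M.getOrInsertFunction(
          "ht_put",
          FunctionType::get(B.getVoidTy(), {VoidPtr, B.getInt64Ty(), B.getInt64Ty()}, false));

      // A function in the generated program that uses the table.
      Function *F = Function::Create(FunctionType::get(B.getVoidTy(), false),
                                     Function::ExternalLinkage, "use_table", &M);
      B.SetInsertPoint(BasicBlock::Create(Ctx, "entry", F));
      Value *Table = B.CreateCall(HtNew, {}, "table");
      B.CreateCall(HtPut, {Table, B.getInt64(1), B.getInt64(42)});
      B.CreateRetVoid();

      M.print(outs(), nullptr);
      return 0;
    }

The generated code only ever sees an opaque pointer; all of the hashing logic stays in the runtime library, which is exactly the glue-code arrangement described above.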
Let's say I have some logic written in a programming language that has an LLVM frontend available. I would like to reuse this logic in a C++ application. Can I generate some sort of library using the common LLVM backends and call it from my application without a significant loss of performance? Any hints on how to address this use case?
I see that more and more people are switching to LLVM, especially people with a background in C or C++, so there is a pattern in the kind of people approaching this compiler. What surprises me is the highly heterogeneous set of technologies that LLVM can handle, and I don't understand what pipeline this virtual machine follows or what the resulting benefits are.
I would like to stress the fact that I'm focusing on LLVM, not really on clang.
One example among many is this one (YouTube video), where the pipeline is not really obvious to me, or this other one; apparently there are a lot of totally different setups where, for example, LLVM is used in conjunction with a JIT solution.
In short, I see different syntaxes and semantics, and people using LLVM to produce GPU shaders or binary objects, but I can't see the common denominator.
What is the meaning of "LLVM-based compilation"? Considering LLVM as a black box, what kind of input does it take, what output does it produce, and what is the business logic in the middle?
I can't see the common denominator.
The common denominator is converting code in one language to code in another language. And that's exactly what compilers do. So if you want to convert a piece of code in a "source language" to one in a "target language", what you need to do is:
Write a "front-end" - a component that converts from your source language to what LLVM expects as input. That language is an LLVM-specific language called "LLVM Bitcode" or "LLVM IR".
Alternatively, reuse an existing front-end - for example Clang.
Write a "back-end" - a component that converts from what LLVM emits to your target language.
Or use an existing back-end, for example LLVM's x86 back-end.
That's it. Now you get to enjoy things like the optimizations LLVM performs on the code between its input and output, its common framework for "lowering" the code to something closer to machine code, etc.
GCC works the same way, by the way; it's just that LLVM is considered by many to be superior in some respects, particularly licensing and ease of modification.
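To make the "use an existing back-end" step concrete, here is a rough sketch, modeled on the approach in the official Kaleidoscope tutorial, of handing a finished Module to the native back-end so it emits an object file you can link like any other .o. Exact header paths and the CodeGenFileType spelling drift a little between LLVM releases, so treat this as an outline rather than copy-paste material:

    #include "llvm/IR/LegacyPassManager.h"
    #include "llvm/IR/Module.h"
    #include "llvm/MC/TargetRegistry.h"
    #include "llvm/Support/FileSystem.h"
    #include "llvm/Support/TargetSelect.h"
    #include "llvm/Support/raw_ostream.h"
    #include "llvm/Target/TargetMachine.h"
    #include "llvm/Target/TargetOptions.h"
    #include "llvm/TargetParser/Host.h"
    using namespace llvm;

    // Emit M as native object code to Path. Returns false on failure.
    bool emitObjectFile(Module &M, const std::string &Path) {
      // Register the back-end for the machine we're running on.
      InitializeNativeTarget();
      InitializeNativeTargetAsmPrinter();

      std::string TripleStr = sys::getDefaultTargetTriple();
      std::string Error;
      const Target *TheTarget = TargetRegistry::lookupTarget(TripleStr, Error);
      if (!TheTarget)
        return false;

      TargetMachine *TM = TheTarget->createTargetMachine(
          TripleStr, /*CPU=*/"generic", /*Features=*/"", TargetOptions(), Reloc::PIC_);
      M.setDataLayout(TM->createDataLayout());
      M.setTargetTriple(TripleStr);

      std::error_code EC;
      raw_fd_ostream Out(Path, EC, sys::fs::OF_None);
      if (EC)
        return false;

      // The back-end runs as a series of passes that end in object emission.
      legacy::PassManager PM;
      if (TM->addPassesToEmitFile(PM, Out, nullptr, CodeGenFileType::ObjectFile))
        return false; // this target can't emit object files
      PM.run(M);
      Out.flush();
      return true;
    }

From there a normal linker (or clang used as a driver) produces the executable or library; picking a different registered target instead of the native one is how the same front-end output gets cross-compiled.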
LLVM's advantage over other source-available compilers is that it is designed as a set of reusable libraries. That means to some degree you can pick and choose what to include in your tool. Not every language tool needs optimization and not every language tool needs code generation. LLVM is a very flexible system for language processing.
Generally, when people say "LLVM-based compilation," they mean using one or more of the LLVM libraries to implement their tool. They can leverage all of the work put into LLVM in understanding its IR and generating code for multiple targets.
The LLVM IR is the common representation used by most of the LLVM libraries. It is the interface you need to write to. For low-level stuff like machine code you will need to deal with some of the other LLVM representations (MachineInstr, MC, etc.).
As for writing a frontend to generate that LLVM IR, the tricky part is ensuring that the translation from your source language to the LLVM IR preserves the semantics of the source language. The LLVM IR has a well-defined but low-level set of semantics for each instruction. If your source language has higher-level semantics, you will have to lower them into LLVM IR instruction sequences. For example, there is no LLVM instruction that handles C-style bitfield access, so C language frontends must use a sequence of LLVM instructions to implement that functionality (generally shifts and bitwise operations).
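As a small illustration of that lowering (a sketch only: the i32 storage unit, the bit offsets, and the helper name are assumptions, and a real C frontend also deals with alignment, signedness, and write-back), reading the second field of struct S { unsigned a : 3; unsigned b : 5; } could be emitted with the C++ API roughly like this:

    #include "llvm/IR/IRBuilder.h"
    using namespace llvm;

    // Emit a read of a bitfield that sits FieldOffset bits into the i32 storage
    // unit pointed to by StoragePtr and is FieldWidth bits wide. There is no
    // "load bitfield" instruction, so we load the whole unit, shift, and mask.
    Value *emitBitfieldLoad(IRBuilder<> &B, Value *StoragePtr,
                            unsigned FieldOffset, unsigned FieldWidth) {
      Value *Word = B.CreateLoad(B.getInt32Ty(), StoragePtr, "bf.load");
      Value *Shifted = B.CreateLShr(Word, FieldOffset, "bf.shift"); // drop lower fields
      unsigned Mask = (1u << FieldWidth) - 1;
      return B.CreateAnd(Shifted, B.getInt32(Mask), "bf.value");    // clear higher fields
    }

For s.b in the struct above you would call this with FieldOffset = 3 and FieldWidth = 5; a store goes the other way, combining the masked-out old word with the shifted new value before writing it back.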
As long as you implement the semantics of your source language in the LLVM IR correctly, the LLVM libraries will have no problem performing correct code transformations. If some desired transformation requires higher-level semantics information than LLVM IR can provide, you either have to do the transformation in some stage before converting to LLVM IR (and so you will have the high-level information available) or you can pass attribute information in the LLVM IR to convey the high-level semantics and write a custom LLVM pass to implement the transformation. It is usually far cleaner to do the former than the latter.
I am creating an LLVM backend for a compiler. I am wondering if there is any downside to having my backend write IR code to files instead of using the APIs. The APIs are complicated (especially if one is using a language other than C++, in my case Haskell) and hard to use, while the IR is much easier to understand. I don't need JIT compilation; the output code will be compiled to machine code by the standard command-line tools.
The IR format changes from version to version; the API changes much less frequently. There have been cases in the past when the IR format changed dramatically, so you would need to invest a large amount of time to keep up with those changes.
Using the API is the preferable method. If it's not always clear to you which API calls you will need, you can use the cpp backend as a source of inspiration :)
As Anton said, there's a definite advantage in using the API as opposed to spitting out textual IR. I just want to address the point you raise regarding the complexity of the API and its usage from Haskell.
Note that LLVM has a C API, which (apart from being more stable) is suitable for foreign-language interfaces. Python bindings for LLVM exist on top of this API, as do Haskell bindings (easily found with a quick search), as well as bindings for other languages.
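For a feel of what that C API looks like, here is a minimal sketch (using the global context for brevity) that builds i32 @answer() { ret i32 42 } and prints the module; this flat, handle-based style is what the Haskell and Python bindings wrap:

    #include <llvm-c/Core.h>
    #include <stdio.h>

    int main(void) {
      LLVMModuleRef M = LLVMModuleCreateWithName("c_api_demo");

      // Declare and define: i32 @answer()
      LLVMTypeRef FnTy = LLVMFunctionType(LLVMInt32Type(), NULL, 0, 0);
      LLVMValueRef Fn = LLVMAddFunction(M, "answer", FnTy);

      LLVMBasicBlockRef Entry = LLVMAppendBasicBlock(Fn, "entry");
      LLVMBuilderRef B = LLVMCreateBuilder();
      LLVMPositionBuilderAtEnd(B, Entry);
      LLVMBuildRet(B, LLVMConstInt(LLVMInt32Type(), 42, /*SignExtend=*/0));

      char *IR = LLVMPrintModuleToString(M);
      printf("%s", IR);

      LLVMDisposeMessage(IR);
      LLVMDisposeBuilder(B);
      LLVMDisposeModule(M);
      return 0;
    }

Because it is a flat set of C functions over opaque handles, it is far easier to reach from Haskell's FFI (or any other language's FFI) than the template-heavy C++ headers.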