Using LLVM as a virtual machine - multiplatform and multiarchitecture coding - llvm

I'm currently working on a pet programming language (for learning purposes). I've done a lot of research over the past year, and I think it's time to finally start modelling the concepts of such a language. First of all, I want it to compile to some intermediate form, such as JVM or .NET bytecode, the goal being multiplatform/multiarchitecture compatibility. Second, I want it to be fast (I also have many other things in mind, but it's not the purpose of this topic to discuss those).
The best options that came to my mind were:
Compile to JVM bytecode and use OpenJDK as runtime environment,
Compile to .NET bytecode and use Mono as runtime environment,
Compile to LLVM IR and use LLVM as runtime environment.
As you may have imagined, I've chosen LLVM. Why? Because it's blazing fast. I did a little benchmark using the C++ N-Body code and got 7s on my machine with lli-JITted IR, in contrast with 27s with clang-compiled native code (I know clang first makes IR, then machine code).
So here is my question: is there any redistributable version of the basic LLVM toolset (I just need lli) that I can use, or must I compile my own? If the latter, can you give me any hints on how to do it? If I really must do it, I'm thinking of cross-compiling it from my machine (an Intel Mac) and generating some installers (say, an .msi for Windows, .rpm and .deb for popular Linux distros, and a .pkg for Macs). Remember, I only need a minimal subset of LLVM, one capable of acting like a VM by running "lli". The real question here is how to use LLVM as a typical virtual machine.

First, I think all 3 options - LLVM IR + LLVM, Java bytecode + OpenJDK, and .NET CIL + Mono - are excellent, and I agree that deciding between them is not easy.
If you go for LLVM and you just want to use lli, you can compile LLVM for your target platform and pack the resulting lli executable with your distribution; it should work.
Another way to write a JIT compiler via LLVM is to use an execution engine - see the handy examples in the Kaleidoscope tutorial. That means you write your own program that JIT-compiles your own language, compile that program for whatever platform you want while statically linking it with LLVM, and then distribute it.
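For the first approach, the end-to-end flow can be sketched as a shell session. The hello.ll module below is hand-written for illustration (your front end would emit it), and the IR syntax is version-dependent:

```shell
# Sketch of the "bundle lli" approach: the language front end emits textual
# LLVM IR, and the shipped lli executes it directly. The opaque-pointer
# syntax below needs a reasonably recent LLVM (15+); older releases spell
# the pointer type as i8*.
cat > hello.ll <<'EOF'
@msg = private constant [15 x i8] c"Hello from IR!\00"

declare i32 @puts(ptr)

define i32 @main() {
  call i32 @puts(ptr @msg)
  ret i32 0
}
EOF

# On a client machine, this single command is the whole "VM" step:
if command -v lli >/dev/null 2>&1; then
  lli hello.ll || echo "lli could not run this IR (LLVM version mismatch?)"
else
  echo "lli not installed; skipping execution"
fi
```

The same hello.ll can also be fed to a statically linked execution engine in your own binary, which is what the Kaleidoscope tutorial demonstrates.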
In any case, since a JIT compiler requires copying an LLVM binary to the client side, make sure to attach a copyright notice to your distribution (you don't have to open-source your distribution, though).

Related

What are the differences between MSIL and LLVM bitcode?

I'm new to .NET and I'm trying to understand the basics first. What is the difference between MSIL and LLVM bitcode?
Both LLVM bitcode and MSIL are intermediate languages. Essentially, they are generic assembly languages: not as high-level as most source languages (e.g., Swift, C#) but also not as low-level as real assembly (e.g., ARM, x86). There are a number of technical implementation differences between the two languages, but most developers don't need to know the small stuff*. They just need to know how they are used in their respective platforms' distribution models.
The LLVM bitcode format is a serialized version of the intermediate representation code used within the LLVM compiler. The "front end" of the compiler translates the source language (such as Swift) into LLVM bitcode, and then the "back end" of the compiler translates the bitcode into the target instruction set (such as ARM machine code). (Note: A previous version of this answer implied LLVM bitcode was processor-agnostic. That is not the case, because the source languages depend on the target processor.)
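That front-end/back-end split can be observed directly with the clang driver. A minimal sketch (the file names are arbitrary, and each step is skipped if the tool isn't installed):

```shell
# A tiny C source file to push through the LLVM pipeline.
cat > tiny.c <<'EOF'
int add(int a, int b) { return a + b; }
EOF

# Front end: C source -> LLVM IR, as textual .ll and as serialized .bc bitcode.
if command -v clang >/dev/null 2>&1; then
  clang -S -emit-llvm tiny.c -o tiny.ll
  clang -c -emit-llvm tiny.c -o tiny.bc
fi

# Back end: bitcode -> assembly for the host target.
if command -v llc >/dev/null 2>&1 && [ -f tiny.bc ]; then
  llc tiny.bc -o tiny.s
fi
```

Inspecting tiny.ll shows the intermediate form that, in Apple's bitcode scheme, is what gets submitted instead of finished ARM code.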
Apple allows iOS developers to submit their apps as either fully-compiled ARM code or as LLVM bitcode, the latter of which:
[...] will allow Apple to re-optimize your app binary in the future without the need to submit a new version of your app to the store.
Essentially, you run the LLVM front end on your development environment, pass the bitcode to Apple, who runs the LLVM back end on their servers. This process is known as ahead-of-time (AOT) compilation (the Wikipedia article is of two minds as to whether the non-bitcode case is also AOT or if that's just "standard" compilation).
But whether or not you use bitcode, iOS end users always get the app as ARM machine code.
Things are a bit different in .NET. Most .NET code is compiled to MSIL, which is packaged in files called assemblies. The .NET runtime on an end user's device loads and executes assemblies, compiling the MSIL to machine code for the device's processor at runtime. This is called just-in-time (JIT) compilation.
Normally, MSIL is processor-agnostic, so most developers can think of .NET apps as also being processor-agnostic. However, there are a number of ways that processor-specific code can be packaged before the end user runs the app through the JIT:
Some tools, like the Native Image Generator and .NET Native, allow AOT compilation. In fact, Universal Windows Platform (UWP) apps uploaded to the Microsoft Store are AOT compiled - you submit the MSIL version of your app to Microsoft, then their servers use .NET Native to compile it for the various architectures Windows 10 supports.
It's also possible to include native code with assemblies themselves; these are called mixed assemblies.
MSIL itself can be processor-specific, if the source language uses "unsafe" operations (e.g., pointer math in C#).
But these are typically the exception, rather than the rule. Usually, .NET apps are distributed in MSIL, and end users' devices are where the native code is generated.
So in summary:
LLVM bitcode is processor-specific, but not quite as low-level as actual machine code. Apple allows iOS developers to submit apps as bitcode, to allow for future re-compilations when optimizations can be introduced. The end user runs native executables.
MSIL is usually processor-agnostic. The end user typically runs this processor-agnostic code, with .NET compiling the MSIL to native code at runtime. However, there are some cases where some or all of the app could be native code.
* Of course, if you are interested in the technical details, there are standards for LLVM bitcode and for MSIL, under its ECMA name CIL. I'm moderately knowledgeable in the latter; after a cursory glance at the former, the most notable technical difference is the memory model: LLVM bitcode is register-based, while MSIL/CIL uses an evaluation stack.

c++ - llvm and runtime jit

Context
Linux 64-bit / OS X 64-bit. C++ (GCC 5.1, LLVM 3.6.1)
Up to now, I always used gcc for my projects.
The problem for the next thing I am creating is the licence. Hence, I decided to give clang/llvm a go.
My needs: runtime self-modifying code (and a very relaxed licence for compiler plugins, for static analysis and other things).
I played a lot with libgccjit and it works fine.
As for LLVM, I read the Kaleidoscope tutorial and some of the docs, but it is still unclear to me.
Question
I saw that LLVM has some JIT capabilities, but I am not sure whether they allow the code to modify (more precisely, extend) itself at runtime, as libgccjit does for the C++ language.
I just need a starter here, llvm is huge and new to me, so anyone expert enough is very welcome to guide me a bit.

Is it possible to cross-compile D source code for MIPS?

Is it possible to cross-compile D source code for MIPS?
For example, I want to compile a D "Hello, world." program that will run on TI AR7-based devices, which have MIPS32 processor and typically run Linux 2.4.17 kernel with MontaVista patches and uClibc (using the MIPS I generic target; ELF 32-bit LSB executable, MIPS, MIPS-I version 1 SYSV).
http://en.wikipedia.org/wiki/TI-AR7
The reference compiler, DMD, does not generate MIPS code, so you'll have to use GDC or LDC2, which can generate code for whatever architectures their backends support (GCC and LLVM, respectively).
However, it's not as simple as generating the code. To get all of D's features working, you'll need to port druntime and Phobos to MIPS, as druntime is quite architecture-specific. Without that, you'll be stuck without a GC and all the features that entails.
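As a sketch, the LDC2 side of this might look roughly as follows. The triple and flags are illustrative, and a MIPS-enabled ldc2 build plus a ported runtime and cross linker are assumed:

```shell
# A D hello-world to cross-compile.
cat > hello.d <<'EOF'
import std.stdio;

void main()
{
    writeln("Hello, world.");
}
EOF

if command -v ldc2 >/dev/null 2>&1; then
  # -mtriple selects the target; -c stops before linking, so no MIPS
  # cross linker or MIPS druntime/phobos is needed for this step.
  ldc2 -mtriple=mips-linux-gnu -c hello.d -of=hello.o \
    || echo "MIPS target not enabled in this ldc2 build"
else
  echo "ldc2 not installed; skipping"
fi
```

Producing a runnable executable additionally requires linking against the MIPS builds of druntime and Phobos mentioned above.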
So it is possible, but how possible definitely depends on how dedicated you are.

LLVM vs clang on OS X

I have a question concerning llvm, clang, and gcc on OS X.
What is the difference between the llvm-gcc 4.2, llvm 2.0 and clang? I know that they all build on llvm but how are they different?
Besides faster compiling, what is the advantage of llvm over gcc?
LLVM originally stood for "low-level virtual machine", though it now just stands for itself, as it has grown into something other than a traditional virtual machine. It is a set of libraries and tools, as well as a standardized intermediate representation, that can be used to help build compilers and just-in-time compilers. By itself it cannot compile anything other than its own intermediate representation; it needs a language-specific frontend to do so. If people just refer to LLVM, they probably mean just the low-level library and tools. Some people might incorrectly refer to Clang or llvm-gcc as "LLVM", which may cause some confusion.
llvm-gcc is a modified version of GCC, which uses LLVM as its backend instead of GCC's own. It is now deprecated, in favor of DragonEgg, which uses GCC's new plugin system to do the same thing without forking GCC.
Clang is a whole new C/C++/Objective-C compiler, which uses its own frontend, and LLVM as the backend. The advantages it provides are better error messages, faster compile time, and an easier way for other tools to hook into the compilation process (like the LLDB debugger and Clang static analyzer). It's also reasonably modular, and so can be used as a library for other software that needs to analyze C, C++, or Objective-C code.
Each of these approaches (plain GCC, GCC + LLVM, and Clang) has its advantages and disadvantages. The last few sets of benchmarks I've seen showed GCC producing slightly faster code in most test cases (though LLVM had a slight edge in a few), while LLVM and Clang gave significantly better compile times. GCC and the GCC/LLVM combos have the advantage that a lot more code has been tested against the GCC flavor of C; there are some compiler-specific extensions that only GCC has, and some places where the standard allows the implementation to vary but code depends on one particular implementation. A large amount of legacy C code is a lot more likely to work with GCC than with Clang, though this is improving over time.
There are 2 different things here.
LLVM is a compiler back end meant for building compilers on top of it. It deals with optimizations and producing code adapted to the target architecture.
Clang is a front end that parses C, C++, and Objective-C code and translates it into a representation suitable for LLVM.
llvm-gcc was an initial version of an LLVM-based C++ compiler based on GCC 4.2; it is now deprecated, since Clang can parse everything it could parse, and more.
Finally, the main difference between Clang and GCC lies not in the produced code but in the approach. While GCC is monolithic, Clang has been built as a suite of libraries. This modular design allows great reuse opportunities, for IDEs or completion tools for example.
At the moment, the code produced by GCC 4.6 is generally a bit faster, but Clang is closing the gap.
llvm-gcc-4.2 uses the GCC front-end to parse your code, then generates the compiled output using LLVM.
The "llvm compiler 2.0" uses the clang front-end to parse your code, and generates the compiled output using LLVM. "clang" is actually just the name for this front-end, but it is often used casually as a name for the compiler as a whole.

Compile C++ code for AIX on Ubuntu?

Question in one sentence: How can I compile code for AIX using G++ on Ubuntu? (Assuming it is possible)
I hope that it is as simple as adding an option to the makefile to specify the target processor. I am a novice when it comes to most things compiler-related.
Thank you in advance.
What you are looking for is a cross-compiling toolchain.
A toolchain includes a cross-compiler (a compiler that runs on the current platform but builds binary code to run on another - in your case, AIX), the C or C++ library, and some other interesting tools.
I have successfully used buildroot in the past, which is a tool that automates the process of creating a cross-compiling toolchain. I know they support several target platforms, maybe AIX is among them.
If you want to compile your toolchain by hand, take a look at the Roll-your-own section on this page.
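For a rough idea of the shape of a roll-your-own build, the configure step for a GCC cross-compiler targeting AIX might look something like this. The triple, prefix, and source path are all illustrative, and a real build also needs the AIX system headers and libraries available to the toolchain:

```shell
# Illustrative only: out-of-tree configure/build of a GCC cross-compiler.
GCC_SRC=../gcc-src
if [ -d "$GCC_SRC" ]; then
  "$GCC_SRC"/configure --target=powerpc-ibm-aix7.2 \
      --prefix="$HOME/cross-aix" --enable-languages=c,c++ --disable-nls
  make && make install
else
  echo "GCC sources not present; this is only a sketch"
fi
```

The resulting powerpc-ibm-aix7.2-g++ would then be used in place of the native g++ in your makefile.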
Another approach, probably easier in your case, would be to install an AIX system inside a virtual machine on Ubuntu. This way you would have access to a complete AIX system running inside your machine, giving you the opportunity to develop and test your application under real conditions (or for whatever other reasons you may find interesting).
You'll want to download the right version of g++ (i.e., one that generates code for POWER, or whatever you're running AIX on), and compile that to run under Ubuntu.
Try compiling with -maix32 or -maix64 (like g++ -maix32 file.cxx); if it doesn't work, it means the option isn't supported by your compiler.