I am trying to figure out what exactly the landingpad instruction in LLVM IR does and when to use it for exception handling.
I can't find any other instruction that landingpad is used together with, the way catchpad is used in combination with catchswitch and catchret.
Can someone give me a simple example of how to use it? Thanks!
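One way to see landingpad in context, offered as a rough sketch rather than a definitive answer, is to compile a small C++ try/catch and read the IR that clang emits. On targets using the Itanium C++ ABI (e.g. Linux), a call that may throw inside a try block is emitted as an invoke, and the invoke's unwind destination is a block that begins with landingpad; resume continues unwinding when no clause matches. catchswitch, catchpad and catchret belong to the Windows (MSVC-style) scheme instead. The file name, function names and fallback value below are made up for illustration.

```cpp
// Hypothetical file name: eh_example.cpp
// Assumed command to inspect the IR on an Itanium C++ ABI target:
//   clang++ -S -emit-llvm -O0 eh_example.cpp -o eh_example.ll
#include <stdexcept>

// May throw, so a call to it from inside a try block is emitted
// as `invoke` rather than a plain `call`.
int checked_divide(int numerator, int denominator) {
    if (denominator == 0) {
        throw std::invalid_argument("division by zero");
    }
    return numerator / denominator;
}

// The unwind edge of the `invoke` for checked_divide leads to a block
// that starts with `landingpad`; the catch clause below is what that
// landing pad selects on.
int safe_divide(int numerator, int denominator) {
    try {
        return checked_divide(numerator, denominator);
    } catch (const std::invalid_argument&) {
        return -1;  // fallback value chosen only for this sketch
    }
}
```

In the generated .ll file, searching for "landingpad" inside safe_divide should show the instruction together with its catch clause.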
I need to visualize the CFG of an LLVM Function, which I have in a .ll file. There is the opt tool, which has the --view-cfg option. However, the function is broken: the definition of a register does not dominate all its uses. I need to view the CFG to investigate why this is the case. Problem: opt does not take wrong LLVM functions, so I cannot view the CFG with it.
So, what is the best way to visualize the CFG of a broken LLVM function?
Problem: opt does not take wrong LLVM functions, so I cannot view the CFG with it.
That's not actually the case. The verifier is turned on by default, yes, but if the function in question is syntactically correct, then you can just turn it off:
$ opt -disable-verify -view-cfg foo.ll
You can even try to compile it with llc, run it with lli, etc. this way.
I am writing a program optimization that involves adding new functions, removing lines of code, inserting function calls and changing the arguments passed to functions.
Is all this possible using an LLVM pass, and if so, how would I write the code for it?
I have looked at the "Writing an LLVM Pass" page on the LLVM website, but it does not explain anything about altering code.
This is a really good guide for getting started with writing a pass. It also has an example of how to change code.
I have a hard time formulating this question. The reason I'm asking is that I want to convert some C++ code to JavaScript with emscripten, but I don't think I need to convert the whole code base.
Is it possible in C++ to find all the code that a particular function could reach into when executed? Then I would know which part of the code I need to convert and which one I can just ignore.
It is called "call hierarchy" as Eugene said. You can use automatic documentation tools to get this information.
I strongly recommend trying doxygen because it is really easy to use:
http://www.doxygen.nl/
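The "call hierarchy" question boils down to reachability in a call graph. As a minimal sketch, assuming you have already extracted the direct calls into a map (the CallGraph type and the reachable_from function are hypothetical names, not part of doxygen or emscripten), a breadth-first traversal gives the set of functions that need converting:

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Hypothetical representation: function name -> functions it calls directly.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Returns every function reachable from `root` (including `root` itself)
// by a breadth-first traversal of the call graph.
std::set<std::string> reachable_from(const CallGraph& graph,
                                     const std::string& root) {
    std::set<std::string> visited{root};
    std::queue<std::string> pending;
    pending.push(root);
    while (!pending.empty()) {
        std::string current = pending.front();
        pending.pop();
        auto it = graph.find(current);
        if (it == graph.end()) continue;  // leaf or external function
        for (const std::string& callee : it->second) {
            // insert() reports whether the callee was newly added.
            if (visited.insert(callee).second) pending.push(callee);
        }
    }
    return visited;
}
```

Keep in mind that a statically extracted call graph can miss calls made through function pointers or virtual dispatch, so the result is a starting point rather than a guarantee.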
Is it possible to add comments to a BasicBlock? I only want a few comments that help me when I print out the IR for debugging. That is, I fully expect them to be lost once I pass the IR to the optimizer.
No, it's not possible directly. Comments, by which you probably mean the lexical elements beginning with a semicolon (;) in the textual IR representation, have no representation in the in-memory IR (and binary bitcode). As you probably know, LLVM IR has three equivalent representations (in memory API level, textual "assembly" level, binary bitcode level). Once the LLVM assembly IR parser reads the code into memory, comments are lost.
What you could do, however, is use metadata for this purpose. You can create arbitrary metadata attached to any instruction, as well as global module-level metadata. This is a hack, for sure, but if you really think you need some sort of annotation, metadata is the way. LLVM uses metadata for a number of annotation needs, like debug info and alias analysis annotations.
I'm considering picking up some very rudimentary understanding of assembly. My current goal is simple: VERY BASIC understanding of GCC assembler output when compiling C/C++ with the -S switch for x86/x86-64.
Just enough to do simple things such as looking at a single function and verifying whether GCC optimizes away things I expect to disappear.
Does anyone have/know of a truly concise introduction to assembly, relevant to GCC and specifically for the purpose of reading, and a list of the most important instructions anyone casually reading assembly should know?
You should use GCC's -fverbose-asm option. It makes the compiler output additional information (in the form of comments) that makes it easier to understand the assembly code's relationship to the original C/C++ code.
If you're using gcc or clang, the -masm=intel argument tells the compiler to generate assembly with Intel syntax rather than AT&T syntax, and the --save-temps argument tells the compiler to save temporary files (preprocessed source, assembly output, unlinked object file) in the directory GCC is called from.
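To try these flags on something small, here is a sketch of a file you could compile. The file name and function names are made up, and what the optimizer actually does depends on the compiler version and flags, so treat the comments as expectations to check rather than guarantees.

```cpp
// Hypothetical file name: fold.cpp
// Assumed commands, using the flags mentioned above:
//   g++ -O2 -S -fverbose-asm -masm=intel fold.cpp -o fold.s
//   g++ -O2 --save-temps -c fold.cpp

// The loop bound is a compile-time constant, so at -O2 one would
// typically expect the whole loop to be folded into a single constant.
int sum_first_ten() {
    int total = 0;
    for (int i = 1; i <= 10; ++i) {
        total += i;
    }
    return total;
}

// Same loop with a runtime bound, for comparison in the assembly output.
int sum_up_to(int n) {
    int total = 0;
    for (int i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}
```

Comparing the two functions in fold.s is a quick way to check whether the constant case was optimized away as expected.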
Getting a superficial understanding of x86 assembly should be easy with all the resources out there. Here's one such resource: http://www.cs.virginia.edu/~evans/cs216/guides/x86.html .
You can also just use disasm and gdb to see what a compiled program is doing.
I usually hunt down the processor documentation when faced with a new device, and then just look up the opcodes as I encounter ones I don't know.
On Intel, thankfully the opcodes are somewhat sensible. PowerPC not so much in my opinion. MIPS was my favorite. For MIPS I borrowed my neighbor's little reference book, and for PPC I had some IBM documentation in a PDF that was handy to search through. (And for Intel, mostly I guess and then watch the registers to make sure I'm guessing right! heh)
The assembly itself is easy. It basically does three things: move data between memory and registers, operate on data in registers, and change the program counter. Mapping between your language of choice and the assembly will require some study (e.g. learning how to recognize a virtual function call), and for this an "integrated" source and disassembly view (like you can get in Visual Studio) is very useful.
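As a small exercise in that kind of mapping, here is a sketch (all type and function names are invented for illustration) that puts a virtual call next to a non-virtual computation so the two can be compared in a disassembly view. A virtual call usually shows up as a load of the vtable pointer followed by an indirect call, though an optimizer may devirtualize it when it can prove the dynamic type.

```cpp
// Hypothetical types used only for this comparison.
struct Shape {
    virtual ~Shape() = default;
    virtual int area() const = 0;
};

struct Square : Shape {
    explicit Square(int side) : side_(side) {}
    int area() const override { return side_ * side_; }
    int side_;
};

// No virtual dispatch: reads the member and multiplies.
int square_area_direct(const Square& square) {
    return square.side_ * square.side_;
}

// Virtual dispatch: the dynamic type is unknown here, so the call
// normally goes through the object's vtable.
int area_via_base(const Shape& shape) {
    return shape.area();
}
```

Looking at the two functions side by side should make the extra memory loads and the indirect call in area_via_base easy to spot.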
"casually reading assembly" lol (nicely)
I would start by following along in gdb at run time; you get a better feel for what's happening. But then maybe that's just me. It will disassemble a function for you (disass func), and then you can single-step through it.
If you are doing this solely to check the optimizations - do not worry.
a) the compiler does a good job
b) you won't be able to understand what it is doing anyway (nobody can)
Unlike higher-level languages, there's really not much (if any) difference between being able to read assembly and being able to write it. Instructions have a one-to-one relationship with CPU opcodes -- there's no complexity to skip over while still retaining an understanding of what the line of code does. (It's not like a higher-level language where you can see a line that says "print $var" and not need to know or care about how it goes about outputting it to screen.)
If you still want to learn assembly, try the book Assembly Language Step-by-Step: Programming with Linux, by Jeff Duntemann.
I'm sure there are introductory books and web sites out there, but a pretty efficient way of learning it is actually to get the Intel references, then try to do simple stuff (like integer math and Boolean logic) in your favorite high-level language and look at what the resulting binary code is.
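A few "simple stuff" functions of the kind suggested above might look like this. It is only a sketch; the function names are arbitrary, and the instructions you end up seeing depend on the compiler and optimization level.

```cpp
// Integer math: multiplication by a power of two is often turned
// into a shift or an address computation instead of a multiply.
int times_eight(int x) {
    return x * 8;
}

// Boolean logic on bits: tests the lowest bit of the value.
bool is_even(int x) {
    return (x & 1) == 0;
}

// Boolean logic with comparisons: look for compare and
// conditional-set or branch instructions in the output.
bool in_range(int x, int low, int high) {
    return low <= x && x <= high;
}
```

Compiling this with and without optimization and comparing the two outputs is a good way to look up only the opcodes you actually encounter.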