Just to note, I've read the questions and read the blog posts and I've also referenced the ABI.
What I completely don't understand is how that interacts with LLVM's EH intrinsics. The LLVM EH page gives a very vague overview, not exactly a checklist of "Implement X, Y, Z".
The LLVM EH page references the Itanium ABI directly. This would imply to me that LLVM only supports Itanium ABI exceptions. But I already know that Clang supports ARM and is developing support for Microsoft ABIs. So exactly how specific is LLVM's implementation of EH to the Itanium ABI?
As for the _Unwind stuff defined by the Itanium ABI, is a backend obliged to provide that, or would I have to implement it myself?
I also noticed that the LLVM IR generated by Clang does not reveal any language-specific tables, any exception frames, exception tables, or anything like that. In that case, how does LLVM know how to generate the language-specific data?
In short, how exactly do you go from LSDAs, EH contexts, and _Unwind_RaiseException to landingpad and resume?
Edit: Just for reference, I'm going to be JITting the resulting code on Windows.
Nowadays the Itanium C++ ABI is the de facto standard C++ ABI, used on many other
platforms. The Itanium C++ ABI specifies zero-cost exception handling,
which is the most widespread technique today.
To support exception handling, the semantics of a function call must change:
a call now forks the execution flow. One branch is taken when
everything goes fine, and the other is taken when an exception is thrown.
In LLVM IR there is the invoke instruction to call functions that may throw.
When the second branch is taken several kinds of actions may be performed:
call destructors (cleanup)
continue stack unwinding (resume)
enforce throw specifications (filter)
restore normal control flow (catch).
It's clear that some additional code must be generated to perform these actions.
That's why we also have the landingpad instruction: it is the first
instruction executed when an invoke's call ends up throwing an exception.
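To illustrate (a minimal sketch; the function names are made up, and the exact IR varies by compiler version), here is a C++ frame that needs both a cleanup and a catch. Compiling it with clang++ -S -emit-llvm shows the call turned into an invoke paired with a landingpad block:

    struct Guard {
        ~Guard();              // cleanup action belonging to this frame
    };

    void may_throw();          // any call that can throw

    int caller() {
        Guard g;               // requires a cleanup during unwinding
        try {
            may_throw();       // emitted as: invoke void @_Z9may_throwv()
                               //             to label %ok unwind label %lpad
        } catch (int e) {      // gives the landing pad a catch clause for int
            return e;
        }
        return 0;              // the "everything is fine" branch
    }
    // The %lpad block begins with a `landingpad` instruction listing the
    // catch clause (and a cleanup); exceptions of other types run ~Guard()
    // and then `resume` to keep unwinding into caller()'s caller.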
But the main magic happens at run time. After an exception is thrown, the
language-agnostic runtime unwinds the stack; for every frame it finds the language-specific
data area (LSDA) and calls the language-specific personality routine. The
personality routine inspects the program counter, the LSDA, and the current exception. It
determines whether any cleanup is necessary, whether a throw specification has been violated,
and whether the exception can be caught by this frame.
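The core of that language-agnostic interface, paraphrased from the Itanium C++ ABI exception-handling spec (types simplified; consult the spec or your platform's unwind.h for the authoritative declarations), looks roughly like this:

    #include <cstdint>

    // Result codes returned by the unwinder and the personality routine.
    typedef enum {
        _URC_NO_REASON = 0,
        _URC_FOREIGN_EXCEPTION_CAUGHT = 1,
        _URC_FATAL_PHASE2_ERROR = 2,
        _URC_FATAL_PHASE1_ERROR = 3,
        _URC_NORMAL_STOP = 4,
        _URC_END_OF_STACK = 5,
        _URC_HANDLER_FOUND = 6,
        _URC_INSTALL_CONTEXT = 7,
        _URC_CONTINUE_UNWIND = 8
    } _Unwind_Reason_Code;

    // Header that every in-flight exception carries; the C++ runtime wraps
    // its own exception object around this.
    struct _Unwind_Exception {
        std::uint64_t exception_class;                  // identifies language/runtime
        void (*exception_cleanup)(_Unwind_Reason_Code, _Unwind_Exception *);
        std::uint64_t private_1, private_2;             // reserved for the unwinder
    };

    struct _Unwind_Context;                             // opaque view of one stack frame
    typedef int _Unwind_Action;

    // Called by the C++ runtime (__cxa_throw) to start the two-phase unwind.
    _Unwind_Reason_Code _Unwind_RaiseException(_Unwind_Exception *exception_object);

    // Signature of the per-language personality routine (__gxx_personality_v0
    // for Itanium C++); the unwinder calls it for each frame that has an LSDA.
    typedef _Unwind_Reason_Code (*__personality_routine)(
        int version, _Unwind_Action actions, std::uint64_t exception_class,
        _Unwind_Exception *exception_object, _Unwind_Context *context);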
As you probably know, all of this data (personality routine, caught types, throw specifications, cleanup actions) is already specified in the landingpad
instruction, so no additional data has to be passed to the backend to generate
the exception-related sections in the object file.
The short answer is: LLVM effectively hardcodes support for every EH ABI it wants to support (ARM, Itanium, SEH, etc.). So while the landingpad machinery might look somewhat abstract in theory, in practice it is tightly coupled to a particular ABI, because the other half of the work has to be done by an Itanium-ABI EH support library that you must provide explicitly.
LLVM generates virtually all of the EH implementation details, but you must also link against an Itanium EH support library at runtime. Other than that, the IR really is just what it shows: no additional effort is required on the part of the programmer.
I imagine things get much more sticky if you want to use non-Itanium.
Related
I thought this would be a simple, general question, but it came up while I was reading about C++ exception specifications. One of the books said that C++11 now has a keyword, 'noexcept', which declares in the function header that no exception will be thrown from that function. The stated reason this keyword exists is that C++ exception specifications are checked at run time rather than at compile time, so they offer the programmer no guarantee that all exceptions have been handled. Hence the conclusion: either a function may throw an exception, or, if we are sure it will never throw, we declare it noexcept (hopefully for optimization):
void foo() noexcept;
Here is the main question: which system software performs those run-time checks (I hope it's not the compiler/linker/loader)? And which system software is responsible for allocating memory at run time (dynamic memory allocation), given that none of this is taken care of by the compiler and friends?
There is no active "system software" checking for exceptions, as you phrase it; rather, throwing an exception is an action taken by the program itself. The program passes the exception back up the stack until the exception matches an exception handler.
If no exception handler matches, then the exception is caught by the bootstrap code (main is not the actual entry point for a typical program, but is where the runtime hands control to the programmer) and the program terminates.
AFAIK this is done by the C++ runtime (libstdc++, for example). For exceptions, the compiler adds some guards around functions (this is necessary anyway to call destructors when an exception is thrown), and in the case of noexcept, if the function throws (or throws an exception other than those advertised by its throw() specification), terminate() is called by the C++ runtime and the application is shut down.
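A minimal illustration of that behaviour (the exact diagnostic printed before the abort is implementation-specific):

    #include <stdexcept>

    void bad() noexcept {
        // Throwing out of a noexcept function never unwinds to the caller;
        // the C++ runtime calls std::terminate(), which aborts the program.
        throw std::runtime_error("boom");
    }

    int main() {
        try {
            bad();                          // terminate() fires inside this call
        } catch (const std::exception &) {
            // never reached
        }
    }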
Memory heap allocations are also (by default) done by the C++ runtime libraries.
Typically the responsible software isn't one clearly identifiable piece of code, but small fragments of code sprinkled through the executable. The compiler translates your code into binary instructions, and noexcept is no exception ;).
Indeed, you would not say that the "standard library" handles this. Exceptions and exception specifications are rather a core language feature, more fundamental than the standard library.
You could similarly ask, what piece of software ensures that when I call a function in C++, that the caller actually receives the values that I pass in? What piece of software manipulates the stack frame pointers while my program is running?
From the point of view of the standard I would say "the implementation" is responsible for these details. In some languages, like Java for instance, there is a "Java Runtime Environment" which is very clearly responsible for these things, and you could try to study exactly how it does them. In C++ there is no universal runtime environment -- like others have said, the compiler is responsible to generate code that ensures that these things happen, and that code ends up sprinkled throughout your resulting executable. How exactly the compiler achieves its task is implementation-specific, you can't give a general answer beyond what the standard says, and generally it specifies the expected behavior, not the details under the hood.
When you ask
which system software is responsible for allocating memory at run time (dynamic memory allocation)
this is again an implementation detail; it will differ from compiler to compiler.
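To make "which software allocates?" visible: a plain new expression compiles to a call to ::operator new, which the C++ runtime typically implements on top of malloc, and the standard even lets you replace that layer yourself. A minimal sketch:

    #include <cstdio>
    #include <cstdlib>
    #include <new>

    // Replacing the global allocation functions is permitted by the standard;
    // every new/delete in the program now goes through these.
    void *operator new(std::size_t size) {
        void *p = std::malloc(size);        // typical runtimes bottom out in malloc
        if (!p) throw std::bad_alloc{};
        std::printf("operator new(%zu) -> %p\n", size, p);
        return p;
    }

    void operator delete(void *p) noexcept {
        std::printf("operator delete(%p)\n", p);
        std::free(p);
    }

    int main() {
        int *x = new int(42);               // routed through our operator new
        delete x;
    }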
Brief:
"Binary application" call another function from "dynamic library".
Is exception handling is the part of function ABI in reallife?
Detailed:
A calling convention includes things like:
How and where parameters are pushed before a function CALL
Which registers are used for passing parameters and returning the result
Callee-saved registers
Which registers are scratch and don't need to be saved
What to do with special registers (ST*, XMM*)
But what happens if an exception is thrown?
I'm mostly interested in an application and dynamic libraries written in C++
and compiled with identical or with different toolchains.
By exception I mean neither an ALU nor an MMU exception, but a program-level exception created via "throw" in C#/C++ or "raise" in Python.
Yes, this is part of an ABI for C++. Otherwise an exception couldn't safely be thrown across boundaries between binaries.
Here's an example: https://mentorembedded.github.io/cxx-abi/abi-eh.html
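A concrete (made-up) illustration: the throw and the catch live in different binaries, and this only works because both sides agree on the EH parts of the ABI (unwind tables, personality routine, RTTI layout):

    // libwidget.cpp -- built into the shared library
    #include <stdexcept>

    void widget_frobnicate(int n) {
        if (n < 0)
            throw std::invalid_argument("n must be non-negative");
    }

    // app.cpp -- the application linking against the library
    #include <iostream>
    #include <stdexcept>

    void widget_frobnicate(int n);          // implemented in the library

    int main() {
        try {
            widget_frobnicate(-1);          // exception crosses the .so/DLL boundary
        } catch (const std::invalid_argument &e) {
            std::cout << "caught: " << e.what() << '\n';
        }
    }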
Function parameters are placed on the stack, but compilers can optimize this task by the use of optional registers. It would make sense that this optimization will kick in if there are only 1-2 parameters, and not when there are 256 (not that one would want to have the max number of parameters).
How can one find out the parameter limit (number of parameters) for a certain compiler (such as gcc) where one can be sure that this optimization will be used?
Function parameters are placed on the stack, but compilers can optimize this task by the use of optional registers.
As FrankH says in his comments and as I'm going to say in my answer, the application binary interface for the system in question determines how arguments are passed to functions - this is called the calling convention for that platform.
To complicate matters, x86 32-bit actually has several. This is historical and comes from the fact that when Win32 arrived, everyone went crazy doing different things.
So, yes, you can "optimise" by writing function calls your own way, but no, you shouldn't. You should follow the standards for your platform. Because the honest truth is, the speed of stack access probably isn't slowing your code down to such an extent that you need to be binary-incompatible with everyone else on your system.
Why the need for ABIs/standard calling conventions? Well, in terms of using the processor registers, stack etc, applications must agree on what means what and where it should go. If one function decided all its arguments were in registers and another that some were on the stack, how would they be interoperable? Moreover, you might come across the term scratch registers to mean those registers you don't have to restore. What happens if you call a function expecting it to leave some registers alone?
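For example, under the x86-64 System V ABI (Linux, macOS) the first six integer/pointer arguments already travel in registers, so there is nothing left for you to "optimise" here:

    // x86-64 System V calling convention: integer/pointer arguments go in
    // rdi, rsi, rdx, rcx, r8, r9 (in that order); the return value comes
    // back in rax. No stack traffic is involved for a call like this.
    long add3(long a, long b, long c) {     // a=rdi, b=rsi, c=rdx
        return a + b + c;                   // result in rax
    }

    long caller() {
        return add3(1, 2, 3);               // constants loaded straight into rdi/rsi/rdx
    }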
Anyway, as for what you asked for, here's some ABI documentation:
The difference between x86 and x64 on windows.
x86_64 ABI used for Unix-like platforms.
Wikipedia's x86 calling conventions.
A document on compiler calling conventions.
The last one is my favourite. To quote it:
In the days of the old DOS operating system, it was often possible to combine development
tools from different vendors with few compatibility problems. With 32-bit Windows, the
situation has gone completely out of hand. Different compilers use different data
representations, different function calling conventions, and different object file formats.
While static link libraries have traditionally been considered compiler-specific, the
widespread use of dynamic link libraries (DLL's) has made the distribution of function
libraries in binary form more common.
So whatever you're trying to do with optimising via modifying the function calling method, don't. Find another way to optimise. Profile your code. Study the optimisation options your compiler offers (-OX) if you think that helps, and dump the assembly to check, if the speed really is that crucial.
For publicly visible functions, this is documented in the ABI standard. For functions that are not referenceable from the outside, all bets are off anyway.
You would have to read the fine manual for the compiler. If you were lucky, you would find it there in a description of function calling conventions. Otherwise, for an OSS compiler such as gcc you would probably have to read its source code.
I remember some rules from a while ago (pre-32-bit Intel processors), when it was quite common (at least for me) to have to analyze the assembly output generated by C/C++ compilers (in my case, Borland/Turbo at the time) to find performance bottlenecks and to safely mix assembly routines with C/C++ code. Things like using the SI register for the this pointer, AX being used for return values, which registers should be preserved when an assembly routine returns, etc.
Now I was wondering if there's some reference for the more popular C/C++ compilers (Visual C++, GCC, Intel...) and processors (Intel, ARM, ...), and if not, where to find the pieces to create one. Ideas?
You are asking about "application binary interface" (ABI) and calling conventions. These are typically set by operating systems and libraries, and enforced by compilers and linkers. Google for "ABI" or "calling convention." Some starting points from Wikipedia and Debian for ARM.
Agner Fog's "Calling Conventions" document summarizes, amongst other things, the Windows and Linux 64 and 32-bit ABIs: http://www.agner.org/optimize/calling_conventions.pdf. See Table 4 on p.10 for a summary of register usage.
One warning from personal experience: don't embed assumptions about the ABI in inline assembly. If you write a function in inline assembly that assumes return and/or parameter transfer in particular registers (e.g. eax, rdi, rsi), it will break if/when the function is inlined by the compiler.
Open Watcom C/C++ compiler supports two calling conventions, register-based (default) and stack-based (very close to what other compilers use). User's Guide for this compiler describes them both and is available for free online, together with the compiler itself. You may find these topics in the User's Guide especially helpful:
10.4.1 Passing Arguments Using Register-Based Calling Conventions
10.4.6 Using Stack-Based Calling Conventions
10.5 Calling Conventions for 80x87-based Applications
Well, today, if optimisation is turned on, there aren't any. But GCC allows you to declare that your assembly instruction should use a particular variable regardless of whether it's in a register or not, or even to force GCC to put that variable into a register usable by your instruction. You can also declare which registers your inline assembly block reserves for itself (so the compiler will generate appropriate save/restore code around your inline piece, if needed).
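A minimal sketch of those GCC extended-asm features (x86-64, AT&T syntax assumed):

    // Register-agnostic wrapper around a single instruction: GCC chooses
    // which registers to use, so no calling-convention knowledge is baked in.
    long add_one(long x) {
        long result;
        asm("lea 1(%1), %0"                 // result = x + 1
            : "=r"(result)                  // output: any general-purpose register
            : "r"(x));                      // input: force x into some register
        return result;
    }

    // If the asm block overwrites a specific register, declare it in the
    // clobber list so GCC saves or avoids it around the block.
    void zero_rcx() {
        asm("xor %%rcx, %%rcx"
            :                               // no outputs
            :                               // no inputs
            : "rcx", "cc");                 // clobbers: rcx and the flags
    }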
I believe, but am by no means sure, that GCC uses the Itanium ABI for most of its functionality; the incompatibilities between it and the ABI GCC actually uses are documented.
How does gcc implement stack unwinding for C++ exceptions on Linux? In particular, how does it know which destructors to call when unwinding a frame (i.e., what kind of information is stored, and where is it stored)?
See section 6.2 of the x86_64 ABI. This details the interface but not a lot of the underlying data. This is also independent of C++ and could conceivably be used for other purposes as well.
There are primarily two sections of the ELF binary as emitted by gcc which are of interest for exception handling. They are .eh_frame and .gcc_except_table.
.eh_frame follows the DWARF format (the debugging format that primarily comes into play when you're using gdb). It has essentially the same format as the .debug_frame section emitted when compiling with -g. Essentially, it contains the information necessary to pop back to the state of the machine registers and the stack at any point higher up the call stack. See the DWARF standard at dwarfstd.org for more information on this.
.gcc_except_table contains information about the exception handling "landing pads", i.e. the locations of the handlers. This is necessary in order to know when to stop unwinding. Unfortunately this section is not well documented. The only snippets of information I have been able to glean come from the gcc mailing list; see particularly this post.
The remaining piece is the code that actually interprets the information found in these data sections. The relevant code lives in libstdc++ and libgcc; I cannot remember at the moment which pieces live where. The interpreter for the DWARF call frame information can be found in the gcc source tree in the file gcc/unwind-dw2.c.
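To make the question concrete, here is the kind of frame that needs both pieces of information (the section names are what g++ emits on Linux; exact contents vary by version):

    #include <string>

    void may_throw();                       // any call that can throw

    void frame() {
        std::string s = "needs a destructor";
        may_throw();
        // If may_throw() throws:
        //  - .eh_frame tells the unwinder how to restore the caller's
        //    registers and stack pointer for this frame, and
        //  - .gcc_except_table (the LSDA) tells the personality routine that
        //    the region around this call has a cleanup action: run s's
        //    destructor before continuing to unwind into frame()'s caller.
    }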
There isn't much documentation currently available; however, the basic scheme is that GCC translates try/catch blocks into function calls and then links in a library with the needed runtime support (the documentation about the tree-building code includes the statement that "throwing an exception is not directly represented in GIMPLE, since it is implemented by calling a function").
Unfortunately I'm not familiar with these functions and can't tell you what to look at (other than the source for libgcc -- which includes the exception handling runtime).
There is an "Exception Handling for Newbies" document available.
Although this looks to be for Itanium, presumably the implementation is similar for x86: exception handling ABI