arm-none-eabi-g++ does not correctly handle weak alias with -flto - c++

I am programming an STM32F413 microcontroller with SystemWorkbench 4 stm32. The Interrupt vectors are defined in an assembly startup file as weak aliases like follows:
.weak TIM1_UP_TIM10_IRQHandler
.thumb_set TIM1_UP_TIM10_IRQHandler,Default_Handler
And referenced in an object like follows:
g_pfnVectors:
.word _estack
.word Reset_Handler
.word NMI_Handler
.....
.word TIM1_UP_TIM10_IRQHandler
.....
So that the g_pfnVectors is a list of the addresses of the IRQ Handler functions. They are declared as weak aliases, so that if they are not defined by the user, the default handler is used.
I have defined the handler like this:
extern "C" {
void TIM1_UP_TIM10_IRQHandler() {
if (SU_TIM->SR & TIM_SR_UIF) {
SU_TIM->SR &= ~TIM_SR_UIF;
...
}
}
}
This works fine with the normal compiler optimization flags, however I wanted to try if I get smaller and possibly faster code with -flto (mainly for trying it, don't really needed it). But when compiling with -flto, g++ ignores my implementation of the handler and just uses the default handler, my handler isn't in the code at all.
So I tried to force g++ to include the function by adding __attribute__((used)) to the function definition, but it was still not compiled. However if I give it another name, then it was included in the binary. Also if I remove the weak alias and just have a reference to the handler in the startup file, it works too.
So somehow the weak aliases don't work with g++ link time optimization. Maybe someone can tell me what the error is and what I'm doing wrong here.
EDIT:
I have looked at which symbols are created with nm on the resulting .elf File, and the TIM1_UP_TIM10_IRQHandler is exported as a weak symbol with the address of the DefaultHandler. However when viewing just the .o file from the compilation unit containing the TIM1_UP_TIM10_IRQHandler function, it is exported as a symbol in the text section (T). So the linker, for some reason, chooses to keep the weak symbol, even though there is a strong symbol with the same name.

I think you should inform the compiler that it the interrupt __attribute__ ((interrupt ("IRQ"))), which is not needed normally as F4 has the stack by default aligned to 8 by the hardware.
If it does not help the workaround is to have a function pointer assigned with the handler, which will prevent it from discarding (if the pointer itself will not be discarded itself - check with your debugger).
The last resort - change the .s file with the vector table definitions

For those looking for this, still, there is apparently a confirmed bug in GCC 7 related to link-time optimization (-flto):
https://bugs.launchpad.net/gcc-arm-embedded/+bug/1747966
I have just run into this, again, with GCC 8 (gcc-arm-none-eabi-8-2019-q3-update release), the behavior is still the same.
The workaround that also works for me (from https://github.com/ObKo/stm32-cmake/issues/78) is to remove or comment the weak definitions at the end of the startup_XXX.s file, so change, for example
.weak NMI_Handler
.thumb_set NMI_Handler,Default_Handler
to
/*
.weak NMI_Handler
.thumb_set NMI_Handler,Default_Handler
*/
and replace them with your own implementation in a source file:
void NMI_Handler(void)
{
//...
}
All weak handlers need to be removed that are being called, so for example if you have UART1_Handler() defined in the HAL/LL drivers, you need to remove the corresponding .weak entry from the startup_XXX.s file, otherwise the interrupt will lock up the MCU by getting stuck in the default infinite loop, without executing the intended interrupt handler and returning from the interrupt, allowing other code execution to resume.

This bug is still present in gcc-arm-none-eabi-9-2020-q3-update but only for C handlers. Weirdly enough, handlers written in C++ (and declared with extern "C" linkage) are not anymore affected by this bug.
As another workaround, rather than messing with the startup.s file, I found that putting the IRQ handlers in separate .c files and building those (and only those) without LTO does the trick.
For those using CubeIDE and generating IRQ/HAL handlers with CubeMX (aka. "Device Configuration Tool"), all auto-generated handlers are in Core\Src\stm32XXXX_it.c, you just have to edit the properties of this file and remove LTO from the compilation options.
This is sub optimal, but it fits well with auto-generated IRQ/HAL handlers: only the first call (from IRQ handler to HAL handler) is unoptimized, but the HAL code itself is correctly optimized.

Related

Overriding Weak ISR Handler from Assembly to C++ doesn't compile any code

I am writing code for embedded programming on an ARM 32-bit based SAM D51 microprocessor using the IDE SEGGER Studio. I'm new to embedded programming and am writing my first interrupt.
The vector table with a dummy handler is written in ARM assembly, here's one example, the following code is auto-generated on project creation but marked as weak so it can be overridden in C/C++
ldr r0, =__stack_end__
mov sp, r0
bl SystemInit
b _start
.thumb_func
.weak SystemInit
SystemInit:
bx lr
Anywhere I read online says to simply add a C/C++ function with identical name and it's magically used by the linker because it's not marked as weak. Below is where I'm overriding it.
void SystemInit()
{
printf("Here");
}
However the debugger states that it can't place a breakpoint there because there's no code and in the disassembler it reveals that the entire function has been made into a comment with no code.
I've tried other functions including many of the handler functions but they all do the exact same thing and I have no idea why.
I've even tried forward declaring a weak function and then overriding it or marking the function as volatile. Here is my latest attempt but with the same result:
extern void __attribute__((weak)) SystemInit();
void SystemInit()
{
printf("Here");
}
and another attempt
volatile void SystemInit()
{
printf("Here");
}
They all end in no code being generated for the function and it appearing as a comment in disassembly.
Identifiers in C++ source files have their names mangled. The resulting linker function name is different from SystemInit. To stop C++ compiler from mangling the function name, declare the function with extern "C". That way generated function name will match the function name expected by the linker.

AVR G++: Executing a function that is past the 128 Kb ROM boundary

AVR g++ has a pointer size of 16 bits. However, my particular chip (the ATMega2560) has 256 KB of RAM. To support this, the compiler automatically generates trampoline sections in the same section of ROM as the current executing code that then contains the extended assembly code to jump into high memory or back. In order for trampolines to be generated, you must take the address of something that sits in high memory.
In my scenario, I have a bootloader that I have written sitting in high memory. The application code needs to be able to call a function in the bootloader. I know the address of this function and need to be able to directly address it by hard-coding the address in my code.
How can I get the compiler/linker to generate the appropriate trampoline for an arbitrary address?
Compiler and linker will only generate trampoline code when the far address is a symbolic address rather than a literal constant number already in code. something like (assuming the address you want to jump to is 0x20000).
extern void (*farfun)() = 0x20000;
farfun ();
Will definitely not work, it doesn't cause the linker to do anything because the address is already resolved.
You should be able to inject the symbol address in the linker command line like so:
extern void farfun ();
farfun ();
compiling "normally" and linking with
-Wl,--defsym,farfun=0x20000
I think it's clear that you need to make sure yourself that something sensible sits at farfun.
You will most probably also need --relax.
EDIT
Never tried this myself, but maybe:
You could probably try to store the function address in a table in high memory and declare it like this:
extern void (*farfunctable [10])();
(farfunctable [0])();
and use the very same linker command to resolve the external symbol (now your table at 0x20000 (in the bootloader) needs to look like this:
extern void func1();
extern void func2();
void ((*farfunctab [10])() = {
func1,
func2,....
};
I would recommend to put func1() ... func10() in a different module from farfunctab in order to make the linker know it has to generate trampolins.
I was planning on putting a dispatch struct (that is, a struct with function pointers to all the various functions). Your solution works well, but requires knowing all of the locations of all of the functions ahead of time. Is there a way to execute a function call to a far address that isn't known at compile time?
[...] My goal was to put the struct with pointers to the functions in a fixed location. That way, it would be a single thing that needed a fixed address rather than every external function.
So you have two applications, let's call them App and Boot, where Boot provides some functionalities that App wants to use. The following problems have to be addressed:
How to get addresses from Boot into App.
How to build a jump table for Boot.
Avoid constructs that will crash when App tries to use code from Boot, like: Using indirect calls or jumps, using static constructors or using static storage in Boot.
App uses Addresses of boot.elf directly
Linking with -Wl,-R,boot.elf
A simple way would be to just link app.elf against boot.elf be means of -Wl,-R,boot.elf. Option -R instructs the linker to use symbol values from the specified file without dragging any code. Problem is that there's no way to specify which symbols to use, for example this might lead to a situation where App uses libgcc functions from Boot.
Defining Symbols by means of -Wl,--defsym,symbol=value
A bit more control over which symbols are being defined can be implemented by following a specific naming convention. Suppose that all symbols from Boot that have "boot" in their name should be "exported", then you could just
> avr-nm -g boot.elf | grep ' T ' | awk '/boot/ { printf("--defsym %s=0x%s\n",$3,$1) }' > syms.opt
This prints global symbol values, and grep filters out symbols in the text section. awk then transforms lines like 00020102 T boot1 to lines like
--defsym boot1=0x00020102 which are written to an option file syms.opt. The option file can then be provided to the linker by means of -Wl,#syms.opt.
The advantage of an option file is that it is easier to provide than plain options in a build environment like make: app.elf would depend (amongst others) on syms.opt, which in turn would depend on boot.elf.
Defining Symbols in a Linker Script Snippet
An alternative would be to define the symbols in a linker script augmentation, which you would provide by means of -T syms.ld during link and which would contain
"boot1"=ABSOLUTE(0x00020102);
"boot2"=...
...
INSERT AFTER .text
Defining Symbols in an Assembly Module
Yet another way to define the symbols would be by means of an assembly module which contains definitions like .global boot1 together with boot1 = 0x00020102.
All these approaches have in common that all symbols must be defined, or otherwise the linker will throw an undefined symbol error. This means boot.elf must be available, and it does not matter whether just one symbol is undefined or whether dozends of symbols are undefined.
Let Boot provide a Dispatch Table
The problem with using boot.elf directly, like lined out in the previous section, is that it introduces a direct dependency. This means that if Boot is improved or refactored, then you'll also have to re-compile App each time, even if the interface did not change.
A solution is to let Boot provide a dispatch table whose position and layout are known ahead of time. Only when the interface itself changes, App will have to be rebuilt. Just refactoring Boot will not require to re-build App.
The Assembly Module with the Jump Table
As explained in the "Crash" section below, addresses in a dispatch table (and hence indirect jumps) won't work because EIND has a wrong value. Therefore, let's assume we have a table of JMPs to the desired Boot functions, like in an assembly module boot-table.sx that reads:
;;; Linker description file boot.ld locates input section .boot.table
;;; right after .vectors, hence the address of .boot_table will be
;;; text-section-start + _VECTORS_SIZE, where the latter is
;;; #define'd in <avr/io.h>.
;;; No "x" section flag so that the linker won't relax JMPs to RJMPs.
.section .boot.table,"a",#progbits
.global .boot_table
.type .boot_table,#object
boot_table:
jmp boot1
jmp boot2
.size boot_table, .-boot_table
In this example, we are going to locate the jump table right after .vectors, so that its location is known ahead of time. The respective symbol definitions in App's syms.opt will then read
--defsym boot1=0x20000+vectors_size+0*4
--defsym boot2=0x20000+vectors_size+1*4
provided Boot is located at 0x20000. Symbol vectors_size can be defined in a C/C++ module, here by abusing avr-gcc attribute "address":
#include <avr/io.h>
__attribute__((__address__(_VECTORS_SIZE)))
char vectors_size;
Locating the Jump Table
In order to locate input section .boot.table, we need an own linker description file, which you might already use for Boot anyways. We start with a linker script from avr-gcc installation at ./avr/lib/ldscripts/avr6.xn, copy it to boot.ld, and add the following 2 lines after vectors:
...
.text :
{
*(.vectors)
KEEP(*(.vectors))
*(.boot.table)
KEEP(*(.boot.table))
/* For data that needs to reside in the lower 64k of progmem. */
*(.progmem.gcc*)
...
Auto-Generating Boot's Jump Table Module and the Symbols for App
It's highly advisable to have an interface description used by both App and Boot, say common.h. Moreover, in order to keep Boot's boot-table.sx and App's syms.opt in sync with the interface, it's agood idea to auto-generate these two files from common.h. To that end, assume that common.h reads:
#ifndef COMMON_H
#define COMMON_H
#define EX __attribute__((__used__,__externally_visible__))
EX int boot1 /* #boot_table:0 */ (int);
EX int boot2 /* #boot_table:1 */ (void);
#endif /* COMMON_H */
For the matter of simplicity, let's assume that this is C code or the interfaces are extern "C" so that the symbols in source code match the assembly names, and there's no need to use mangled names. It' easy enough to generate boot-table.sx and syms.opt from common.h using the magic comments. The magic comment follows directly after the symbol, so a regex would retrieve the token left of the magic comment, something like Python:
# ... symbol /* #boot_table:index */...
pat = re.compile (r".*(\b\w+\b)\s*/\* #boot_table:(\d+) \*/.*")
for line in sys.stdin.readlines():
match = re.match (pat, line)
if match:
index = int (match.group(2))
symbol = match.group(1)
Output template for syms.opt would be something like:
asm_line = "--defsym {symbol}=0x20000+vectors_size+4*{index}\n"
Code that will crash
Using Boot code from App is subject to several restrictions:
Indirect Calls and Jumps
These will crash because the start addresses of App resp. Boot are in different 128KiB segments of flash. When the address of a code symbol is taken, the compiler does this per gs(symbol) which instructs the linker to generate a stub and resolve gs() to that stub in .trampolines if the target address is outside the 128KiB segment where the trampolines are located. An explanation of gs() can be found in this answer, there is however more to it: The startup code will effectively initialize
EIND = __vectors >> 17;
see gcrt1.S, the AVR-LibC bits of start-up code crt<device>.o. The compiler assumes EIND never changes during execution, see EIND and more than 128KiB of Flash in the GCC documentation.
This means code in Boot assumes EIND = 1 but is called with EIND = 0 and hence EICALL resp. EIJMP will target the wrong address. This means common code must avoid indirect calls and jumps, and should be compiled with -fno-jump-tables so that switch/case won't generate such tables.
This also implies that the dispatch table described above won't work if it would just held gs(symbol) entries, because App and Boot will disagree on EIND.
Data in Static Storage
If common Boot code is using data in static storage, the data might collide with App's static storage. One way out is to avoid static storage in respective parts of Boot and pass addresses to, say, some data buffer by means of pointer erguments of respective functions.
One could have completely separate RAM areas; one for Boot and one for App, but that would be a waste of RAM because the applications will never run at the same time.
Static Constructors
Boot's static constructors will be bypassed if App uses code from Boot. This includes:
C++ code in Boot that explicitly or implicitly generates such constructors.
C/C++ code in Boot that relies on __attribute__((__constructor__)) or code in section .initN which is supposed to run prior to main.
Start-up code that initializes static storage, EIND etc., which is also run by locating it in some .initN sections, but will be bypassed if App calls Boot code.

declaring an extern C function on template instantiation

I'm working on an MCU (STM32F4).
In the current system, all the interrupts handlers are declared as weak symbols in a link file, and if we want to use one, we just declare a function with the same name in C and it replaces the weak one at link time.
I'm trying to convert my system to C++, I envision a system where instantiating a certain interrupt type would declare the corresponding C function in the module.
I have no clue how to achieve that considering that extern "C" is forbidden for member functions.
any idea or alternative ?
My aim is to try to statically check some things and try to use some modern C++ in the field.
here is the current situation in C.
I have a assembly file with this thing in it:
g_pfnVectors:
.word _estack
.word Reset_Handler
(...)
.word SysTick_Handler
(...)
/*******************************************************************************
*
* Provide weak aliases for each Exception handler to the Default_Handler.
* As they are weak aliases, any function with the same name will override
* this definition.
*
*******************************************************************************/
(...)
.weak SysTick_Handler
.thumb_set SysTick_Handler,Default_Handler
in my C code, I have that :
main() {
(...)
SysTick_Config(SystemCoreClock / cncMemory.parameters.clockFrequency);
while (1);
}
void SysTick_Handler(void) {
cncMemory.tick++;
}
And I envision something like:
int main() {
MCU<mySystickHandler, ...> mcu;
mcu.start();
}
static void mySystickHandler(void) {
cncMemory.tick++;
}
or something approaching (probably without the still global function, but I try to separate the problems).
I know of nothing standard for that.
If you want to stay in the language, you'll have to look at extensions as compilers have provided pragmas and attributes to control such things for a long time. In the case of gcc, asm labels seems designed for that problem. I've not used it and I've the a priori that it can't be used with templates excepted perhaps for explicit specializations.
The alternative is obviously playing with linker level tricks.
AFAIK a weak symbol doesn't make an object file providing it to be extracted from a static library if it is the only symbol provided by the object. You could arrange that the object file providing from C to your template provide also another unique symbol which is needed by the instantiation you want.
Linker scripts have a lot of power and if things haven't changed too much since the last time I took care of embedded systems (that's so long ago that they must changed, I just don't know if they have changed in that aspect) custom linker scripts are still pretty much mandatory in that field.

How to make Visual C++ 9 not emit code that is actually never called?

My native C++ COM component uses ATL. In DllRegisterServer() I call CComModule::RegisterServer():
STDAPI DllRegisterServer()
{
return _Module.RegisterServer(FALSE); // <<< notice FALSE here
}
FALSE is passed to indicate to not register the type library.
ATL is available as sources, so I in fact compile the implementation of CComModule::RegisterServer(). Somewhere down the call stack there's an if statement:
if( doRegisterTypeLibrary ) { //<< FALSE goes here
// do some stuff, then call RegisterTypeLib()
}
The compiler sees all of the above code and so it can see that in fact the if condition is always false, yet when I inspect the linker progress messages I see that the reference to RegisterTypeLib() is still there, so the if statement is not eliminated.
Can I make Visual C++ 9 perform better static analysis and actually see that some code is never called and not emit that code?
Do you have whole program optimization active [/GL]? This seems like the sort of optimization the compiler generally can't do on its own.
Are you sure the code isn't eliminated later in the compilation/linking process? Have you checked the generated ASM?
How is the RegisterTypeLib function defined? Of course anything marked dllexportcan't be eliminated by the linker, but also any function not marked static (or placed in an anonymous namespace) can be referenced by multiple translation units, so the compiler won't be able to eliminate the function.
The linker can do it, but that might be one of the last optimizations it performs (I have no clue of the order in which it applies optimizations), so the symbols might still be there in the messages you're looking at, even if they're eliminated afterwards.
any code that is successfully inlined will only be generated if called. That's a simple way to do it, as long as the compiler will take the hint. inline is only a suggestion though
The inner call to AtlComModuleRegisterServer has external linkage, which generally prevents the optimizer from propagating the bRegTypeLib value down the call graph. Some of this can be better reasoned with in the disassembly.
So DllInstall(...) calls CAtlDllModuleT::RegisterServer(0). This is the start of the problem:
push 0
call ?DllRegisterServer#?$CAtlDllModuleT#VCAtlTestModule###ATL##QAEJH#Z
Let's just say for arguments sake that the compiler has verified CAtlDllModuleT::DllRegisterServer is only called once and it's very safe to push the 0/FALSE down one more level... the external linkage prevents discarding AtlComModuleRegisterServer, inlining it has a high cost (code duplication) and doesn't allow any additional whole-program optimizations. It is probably safer to keep the signature as-is and bail out early with a regular cdecl call...
?DllRegisterServer#?$CAtlDllModuleT#VCAtlTestModule###ATL##QAEJH#Z proc near
<trimmed>
push 0
push edx
push offset ATL::_AtlComModule
call _AtlComModuleRegisterServer#12
This code can be improved in size due to the two constants, but it's likely to cost about the same amount of runtime. If performance is an issue consider explicitly setting the function layout order, you might save a page fault.
Turns out the key is to enable link-time code generator all the way through the compiler settings.
It must be enabled on the General tab - Whole program optimization must be set to "Use link-time code generation". It must also be enabled on the C++ -> Optimization tab - "Whole program optimization* must be set to "Enable link-time code generation". It must also be enabled on the Linker -> Optimization tab - Link Time Code Generation must be set to "Use Link Time Code Generation". Then /OPT:REF and /OPT:ICF (again, *Linker -> Optimization" tab) must both be enabled.
And this effectively removes the calls to RegisterTypeLib() - it is no longer in the list of symbols imported.

How exactly does __attribute__((constructor)) work?

It seems pretty clear that it is supposed to set things up.
When exactly does it run?
Why are there two parentheses?
Is __attribute__ a function? A macro? Syntax?
Does this work in C? C++?
Does the function it works with need to be static?
When does __attribute__((destructor)) run?
Example in Objective-C:
__attribute__((constructor))
static void initialize_navigationBarImages() {
navigationBarImages = [[NSMutableDictionary alloc] init];
}
__attribute__((destructor))
static void destroy_navigationBarImages() {
[navigationBarImages release];
}
It runs when a shared library is loaded, typically during program startup.
That's how all GCC attributes are; presumably to distinguish them from function calls.
GCC-specific syntax.
Yes, this works in C and C++.
No, the function does not need to be static.
The destructor runs when the shared library is unloaded, typically at program exit.
So, the way the constructors and destructors work is that the shared object file contains special sections (.ctors and .dtors on ELF) which contain references to the functions marked with the constructor and destructor attributes, respectively. When the library is loaded/unloaded the dynamic loader program (ld.so or somesuch) checks whether such sections exist, and if so, calls the functions referenced therein.
Come to think of it, there is probably some similar magic in the normal static linker so that the same code is run on startup/shutdown regardless if the user chooses static or dynamic linking.
.init/.fini isn't deprecated. It's still part of the the ELF standard and I'd dare say it will be forever. Code in .init/.fini is run by the loader/runtime-linker when code is loaded/unloaded. I.e. on each ELF load (for example a shared library) code in .init will be run. It's still possible to use that mechanism to achieve about the same thing as with __attribute__((constructor))/((destructor)). It's old-school but it has some benefits.
.ctors/.dtors mechanism for example require support by system-rtl/loader/linker-script. This is far from certain to be available on all systems, for example deeply embedded systems where code executes on bare metal. I.e. even if __attribute__((constructor))/((destructor)) is supported by GCC, it's not certain it will run as it's up to the linker to organize it and to the loader (or in some cases, boot-code) to run it. To use .init/.fini instead, the easiest way is to use linker flags: -init & -fini (i.e. from GCC command line, syntax would be -Wl -init my_init -fini my_fini).
On system supporting both methods, one possible benefit is that code in .init is run before .ctors and code in .fini after .dtors. If order is relevant that's at least one crude but easy way to distinguish between init/exit functions.
A major drawback is that you can't easily have more than one _init and one _fini function per each loadable module and would probably have to fragment code in more .so than motivated. Another is that when using the linker method described above, one replaces the original _init and _fini default functions (provided by crti.o). This is where all sorts of initialization usually occur (on Linux this is where global variable assignment is initialized). A way around that is described here
Notice in the link above that a cascading to the original _init() is not needed as it's still in place. The call in the inline assembly however is x86-mnemonic and calling a function from assembly would look completely different for many other architectures (like ARM for example). I.e. code is not transparent.
.init/.fini and .ctors/.detors mechanisms are similar, but not quite. Code in .init/.fini runs "as is". I.e. you can have several functions in .init/.fini, but it is AFAIK syntactically difficult to put them there fully transparently in pure C without breaking up code in many small .so files.
.ctors/.dtors are differently organized than .init/.fini. .ctors/.dtors sections are both just tables with pointers to functions, and the "caller" is a system-provided loop that calls each function indirectly. I.e. the loop-caller can be architecture specific, but as it's part of the system (if it exists at all i.e.) it doesn't matter.
The following snippet adds new function pointers to the .ctors function array, principally the same way as __attribute__((constructor)) does (method can coexist with __attribute__((constructor))).
#define SECTION( S ) __attribute__ ((section ( S )))
void test(void) {
printf("Hello\n");
}
void (*funcptr)(void) SECTION(".ctors") =test;
void (*funcptr2)(void) SECTION(".ctors") =test;
void (*funcptr3)(void) SECTION(".dtors") =test;
One can also add the function pointers to a completely different self-invented section. A modified linker script and an additional function mimicking the loader .ctors/.dtors loop is needed in such case. But with it one can achieve better control over execution order, add in-argument and return code handling e.t.a. (In a C++ project for example, it would be useful if in need of something running before or after global constructors).
I'd prefer __attribute__((constructor))/((destructor)) where possible, it's a simple and elegant solution even it feels like cheating. For bare-metal coders like myself, this is just not always an option.
Some good reference in the book Linkers & loaders.
This page provides great understanding about the constructor and destructor attribute implementation and the sections within within ELF that allow them to work. After digesting the information provided here, I compiled a bit of additional information and (borrowing the section example from Michael Ambrus above) created an example to illustrate the concepts and help my learning. Those results are provided below along with the example source.
As explained in this thread, the constructor and destructor attributes create entries in the .ctors and .dtors section of the object file. You can place references to functions in either section in one of three ways. (1) using either the section attribute; (2) constructor and destructor attributes or (3) with an inline-assembly call (as referenced the link in Ambrus' answer).
The use of constructor and destructor attributes allow you to additionally assign a priority to the constructor/destructor to control its order of execution before main() is called or after it returns. The lower the priority value given, the higher the execution priority (lower priorities execute before higher priorities before main() -- and subsequent to higher priorities after main() ). The priority values you give must be greater than100 as the compiler reserves priority values between 0-100 for implementation. Aconstructor or destructor specified with priority executes before a constructor or destructor specified without priority.
With the 'section' attribute or with inline-assembly, you can also place function references in the .init and .fini ELF code section that will execute before any constructor and after any destructor, respectively. Any functions called by the function reference placed in the .init section, will execute before the function reference itself (as usual).
I have tried to illustrate each of those in the example below:
#include <stdio.h>
#include <stdlib.h>
/* test function utilizing attribute 'section' ".ctors"/".dtors"
to create constuctors/destructors without assigned priority.
(provided by Michael Ambrus in earlier answer)
*/
#define SECTION( S ) __attribute__ ((section ( S )))
void test (void) {
printf("\n\ttest() utilizing -- (.section .ctors/.dtors) w/o priority\n");
}
void (*funcptr1)(void) SECTION(".ctors") =test;
void (*funcptr2)(void) SECTION(".ctors") =test;
void (*funcptr3)(void) SECTION(".dtors") =test;
/* functions constructX, destructX use attributes 'constructor' and
'destructor' to create prioritized entries in the .ctors, .dtors
ELF sections, respectively.
NOTE: priorities 0-100 are reserved
*/
void construct1 () __attribute__ ((constructor (101)));
void construct2 () __attribute__ ((constructor (102)));
void destruct1 () __attribute__ ((destructor (101)));
void destruct2 () __attribute__ ((destructor (102)));
/* init_some_function() - called by elf_init()
*/
int init_some_function () {
printf ("\n init_some_function() called by elf_init()\n");
return 1;
}
/* elf_init uses inline-assembly to place itself in the ELF .init section.
*/
int elf_init (void)
{
__asm__ (".section .init \n call elf_init \n .section .text\n");
if(!init_some_function ())
{
exit (1);
}
printf ("\n elf_init() -- (.section .init)\n");
return 1;
}
/*
function definitions for constructX and destructX
*/
void construct1 () {
printf ("\n construct1() constructor -- (.section .ctors) priority 101\n");
}
void construct2 () {
printf ("\n construct2() constructor -- (.section .ctors) priority 102\n");
}
void destruct1 () {
printf ("\n destruct1() destructor -- (.section .dtors) priority 101\n\n");
}
void destruct2 () {
printf ("\n destruct2() destructor -- (.section .dtors) priority 102\n");
}
/* main makes no function call to any of the functions declared above
*/
int
main (int argc, char *argv[]) {
printf ("\n\t [ main body of program ]\n");
return 0;
}
output:
init_some_function() called by elf_init()
elf_init() -- (.section .init)
construct1() constructor -- (.section .ctors) priority 101
construct2() constructor -- (.section .ctors) priority 102
test() utilizing -- (.section .ctors/.dtors) w/o priority
test() utilizing -- (.section .ctors/.dtors) w/o priority
[ main body of program ]
test() utilizing -- (.section .ctors/.dtors) w/o priority
destruct2() destructor -- (.section .dtors) priority 102
destruct1() destructor -- (.section .dtors) priority 101
The example helped cement the constructor/destructor behavior, hopefully it will be useful to others as well.
Here is a "concrete" (and possibly useful) example of how, why, and when to use these handy, yet unsightly constructs...
Xcode uses a "global" "user default" to decide which XCTestObserver class spews it's heart out to the beleaguered console.
In this example... when I implicitly load this psuedo-library, let's call it... libdemure.a, via a flag in my test target รก la..
OTHER_LDFLAGS = -ldemure
I want to..
At load (ie. when XCTest loads my test bundle), override the "default" XCTest "observer" class... (via the constructor function) PS: As far as I can tell.. anything done here could be done with equivalent effect inside my class' + (void) load { ... } method.
run my tests.... in this case, with less inane verbosity in the logs (implementation upon request)
Return the "global" XCTestObserver class to it's pristine state.. so as not to foul up other XCTest runs which haven't gotten on the bandwagon (aka. linked to libdemure.a). I guess this historically was done in dealloc.. but I'm not about to start messing with that old hag.
So...
#define USER_DEFS NSUserDefaults.standardUserDefaults
#interface DemureTestObserver : XCTestObserver #end
#implementation DemureTestObserver
__attribute__((constructor)) static void hijack_observer() {
/*! here I totally hijack the default logging, but you CAN
use multiple observers, just CSV them,
i.e. "#"DemureTestObserverm,XCTestLog"
*/
[USER_DEFS setObject:#"DemureTestObserver"
forKey:#"XCTestObserverClass"];
[USER_DEFS synchronize];
}
__attribute__((destructor)) static void reset_observer() {
// Clean up, and it's as if we had never been here.
[USER_DEFS setObject:#"XCTestLog"
forKey:#"XCTestObserverClass"];
[USER_DEFS synchronize];
}
...
#end
Without the linker flag... (Fashion-police swarm Cupertino demanding retribution, yet Apple's default prevails, as is desired, here)
WITH the -ldemure.a linker flag... (Comprehensible results, gasp... "thanks constructor/destructor"... Crowd cheers)
Here is another concrete example. It is for a shared library. The shared library's main function is to communicate with a smart card reader, but it can also receive 'configuration information' at runtime over UDP. The UDP is handled by a thread which MUST be started at init time.
__attribute__((constructor)) static void startUdpReceiveThread (void) {
pthread_create( &tid_udpthread, NULL, __feigh_udp_receive_loop, NULL );
return;
}
The library was written in C.