I'm beginning to develop for a Renesas RZ/A1L microcontroller. Renesas provide an IDE (e2 Studio - a modified version of Eclipse), set up to compile C / C++ with GCC. Everything works fine, but...
If I declare an object in file-scope (outside of any function), its constructor is never called. For instance:
class NewClass {
public:
int i;
NewClass() {
i = 4;
}
};
NewClass newInstance;
int main(void)
{
// My program...
}
I can tell that the constructor isn't being called because, using the in-circuit debugging setup supplied by Renesas, I can see that i is never set to 4 (even when I place further references to newInstance and i; I have optimisation switched off too). Sorry I can't do a simple cout of i's value - the code is being run in a microcontroller and I haven't worked out how to do that just yet.
If I instead place the NewClass newInstance; line inside of main(), then the problem goes away.
A further consequence of the problem is that, for inheriting classes, calling a virtual function on one (via a pointer of base-class type) causes a crash - I suspect due to the constructor having not executed and hence not written to memory some indicator of what class the object was.
By what mechanism would such a constructor normally be called? I did some Googling - would it be the ".ctors" list? (https://gcc.gnu.org/onlinedocs/gccint/Initialization.html)
Renesas's "template" C++ project does actually include code to call all of the ctors; however, from looking at my generated .map file for the project, I can see that no ctors are actually present. Does that narrow down the problem - is the GCC compiler not spitting them out when it should be?
Many thanks for your help.
Quoting the draft C++11 standard, N3337, we find that:
[basic.start.main]/1 A program shall contain a global function called
main, which is the designated start of the program. It is
implementation-defined whether a program in a freestanding environment
is required to define a main function. [ Note: In a freestanding
environment, start-up and termination is implementation-defined;
start-up contains the execution of constructors for objects of
namespace scope with static storage duration; termination contains the
execution of destructors for objects with static storage duration. —
end note ]
As you can see, it's implementation-defined in a free-standing environment. Therefore assuming you have a 32-bit x86 GCC toolchain...
It sounds like you're in a freestanding environment. If so, if you want to use global constructors, there is some boilerplate you need to implement. The initialization page you linked mentions a linker line, something that would look like this:
i686-elf-ld crt0.o crti.o crtbegin.o foo.o bar.o crtend.o crtn.o
Assuming foo.o and bar.o are part of your program, it's required that your linker line looks like this. Note that the compiler should provide its own crtbegin.o and crtend.o, so you can find the location of those using -print-file-name. In a Makefile, it'd look something like this:
CRTBEGIN_OBJ:=$(shell $(CC) $(CFLAGS) -print-file-name=crtbegin.o)
CRTEND_OBJ:=$(shell $(CC) $(CFLAGS) -print-file-name=crtend.o)
Now for the actual initialization function. In the same file as your kernel entry point, call _init before kernel_main (or whatever it's called.) Optionally _fini can be called after kernel_main but it's unlikely to be necessary. The exact code will depend on the architecture, but here is an example for 32-bit x86:
/* x86 crti.s */
.section .init
.global _init
.type _init, #function
_init:
push %ebp
movl %esp, %ebp
/* gcc will nicely put the contents of crtbegin.o's .init section here. */
.section .fini
.global _fini
.type _fini, #function
_fini:
push %ebp
movl %esp, %ebp
/* gcc will nicely put the contents of crtbegin.o's .fini section here. */
/* x86 crtn.s */
.section .init
/* gcc will nicely put the contents of crtend.o's .init section here. */
popl %ebp
ret
.section .fini
/* gcc will nicely put the contents of crtend.o's .fini section here. */
popl %ebp
ret
Related
I want to debug a code of a plain assembler project for the ATmega2560. I want to use the Microchip debugger for that purpose. The goal is to get source level debugging with all variables and functions, breakpoints etc.
I managed to create a C "stub" file with a main() function that calls the assembler code.
extern int foo(int a);
int main(void)
{
int a = 0;
while (1)
{
a = foo(a);
}
}
But the assembler code also includes the interrupt vector table including the reset vector.
.extern main
.section .vectors
.global RESET_
RESET_: jmp WARM_0
.section code
.global foo
foo:
ret
WARM_0:
call main
ret
.end
Now I want to run the code from the label RESET_. The linker has placed the code in the section .vectors. That's okay so far, but the vector table from the GCC startup files are in that section before the vector table in my code. The GCC startup code must be removed to get my vector to the address 0. Therefore I activate the linker option "Do not use standard start files (-nostartfiles)". That gives the desired result: The reset vector has a jump to RESET_.
But this has the important side effect, that the debugger is not able to debug at source code level anymore. The C file with the main() function is still linked. But the source level debug support is lost.
How can I debug a plan assembler project with Microchip debugger/simulator for AVR8?
Remark: The code in the assembler file is not sufficient for a valid program. It's reduced to get a minimal example that should work in the Microchip environment.
I have a compiled ELF file, libfoo.so, that exports some class methods, like:
struct Bar {
void f();
void g();
};
I know the exact declaration of these classes, but the definition is compiled into the .so. I'm using the same compiler (gcc >= 7) as the .so, so the name mangling and ABI match. This means that if I add the above declaration to my code, I'll be able to call the methods implemented in libfoo.so directly.
However, I don't want to pollute my namespace with toplevel libfoo stuff (and I don't control libfoo source). So I'd like to declare struct Bar inside namespace foo {}. Now, once I do that, the name mangling no longer matches and the loader will not be able to find the functions in question.
I do not want to write thunks for all of them as a) there's thousands and b) it causes issues with virtual functions and destructors.
I'm thinking of using objcopy and renaming all exported symbols in libfoo.so, but hoping there's a better solution here.
Both libfoo and my code are using C++14 (can be moved to C++17) and gcc 7 (can be moved to later). Compiled for 64bit ELF on Linux.
It ended up being a lot more complex than expected, but I did manage. If libfoo.so has the function Bar::f() (mangled as _ZN3Bar1fEv), and I want to define foo::Bar::f() (_ZN3foo3Bar1fEv), I first need to find the relative offset of the function in libfoo.so. With nm you get something like:
0000000000abcd00 T _ZN3Bar1fEv
I have a script that parses this output, and generates assembly code like this:
; Export start of the translation space
.global START
START:
nop
.global _ZN3foo3Bar1fEv ; Desired target name
_ZN3foo3Bar1fEv:
movq $0x0000abcd12345678, %rax ; arbitrary constant
addq $0xabcd00, %rax ; offset from libfoo
pushq %rax
ret
; same for every other needed symbol
; Export end of the translation space
.global END
END:
nop
This defines a "naked" function with the desired name, which adds an arbitrary constant to the offset and jumps to it. The extra jump is a perf hit I can live with.
The arbitrary constant is needed to deal with ASLR - the 0xabcd00 is the base offset, but when libfoo.so is loaded to some base address (e.g. 0x04000000) the actual address of the function (i.e. if you wanted a function pointer) is 0x04abde00. When generating the asm code, I used a temporary constant that I'll fix up after loading:
const uintptr_t offset = 0xabcd00; // Expected offset in libfoo.so
uintptr_t real = (uintptr_t) dlsym(RTLD_DEFAULT, "_ZN3Bar1fEv"); // real address with ASLR
uintptr_t aslr_base = real - offset; // This is the actual base to be applied.
// Search everything between START and END and update the arbitrary constant
extern "C" void START(); extern "C" void END();
uint8_t *p = (uint8_t*)&START;
uint8_t *end = (uint8_t*)&END;
mprotect(ALIGN_TO_PAGE(p), end-p+PAGE_SIZE, PROT_READ | PROT_WRITE | PROT_EXEC); // mark the memory as writable
// update all instances of the arbitrary constant
while (p != end)
{
if (*(uintptr_t*)p == 0x0000abcd12345678)
*(uintptr_t*)p = aslr_base;
p++;
}
You can then restore the memory protections. The asm can be optimized to not use rax by doing calculations using just push and mov with indirect addressing using rsp, but this works for my case.
What doesn't work
objcopy allows redefining or injecting static symbols, but has no support for editing dynamic symbol table. None of the tools I found do either.
Linker scripts allow defining aliases (PROVIDE(alias = real)), but only if the target is defined at the time of linking; you cannot create an alias to a dynamic symbol.
Sourcing the ASLR offset from a global variable does not work - The linker kept asking that all code be compiled with -fPIC, which it was. This would have removed the need to modify the code.
I am programming an STM32F413 microcontroller with SystemWorkbench 4 stm32. The Interrupt vectors are defined in an assembly startup file as weak aliases like follows:
.weak TIM1_UP_TIM10_IRQHandler
.thumb_set TIM1_UP_TIM10_IRQHandler,Default_Handler
And referenced in an object like follows:
g_pfnVectors:
.word _estack
.word Reset_Handler
.word NMI_Handler
.....
.word TIM1_UP_TIM10_IRQHandler
.....
So that the g_pfnVectors is a list of the addresses of the IRQ Handler functions. They are declared as weak aliases, so that if they are not defined by the user, the default handler is used.
I have defined the handler like this:
extern "C" {
void TIM1_UP_TIM10_IRQHandler() {
if (SU_TIM->SR & TIM_SR_UIF) {
SU_TIM->SR &= ~TIM_SR_UIF;
...
}
}
}
This works fine with the normal compiler optimization flags, however I wanted to try if I get smaller and possibly faster code with -flto (mainly for trying it, don't really needed it). But when compiling with -flto, g++ ignores my implementation of the handler and just uses the default handler, my handler isn't in the code at all.
So I tried to force g++ to include the function by adding __attribute__((used)) to the function definition, but it was still not compiled. However if I give it another name, then it was included in the binary. Also if I remove the weak alias and just have a reference to the handler in the startup file, it works too.
So somehow the weak aliases don't work with g++ link time optimization. Maybe someone can tell me what the error is and what I'm doing wrong here.
EDIT:
I have looked at which symbols are created with nm on the resulting .elf File, and the TIM1_UP_TIM10_IRQHandler is exported as a weak symbol with the address of the DefaultHandler. However when viewing just the .o file from the compilation unit containing the TIM1_UP_TIM10_IRQHandler function, it is exported as a symbol in the text section (T). So the linker, for some reason, chooses to keep the weak symbol, even though there is a strong symbol with the same name.
I think you should inform the compiler that it the interrupt __attribute__ ((interrupt ("IRQ"))), which is not needed normally as F4 has the stack by default aligned to 8 by the hardware.
If it does not help the workaround is to have a function pointer assigned with the handler, which will prevent it from discarding (if the pointer itself will not be discarded itself - check with your debugger).
The last resort - change the .s file with the vector table definitions
For those looking for this, still, there is apparently a confirmed bug in GCC 7 related to link-time optimization (-flto):
https://bugs.launchpad.net/gcc-arm-embedded/+bug/1747966
I have just run into this, again, with GCC 8 (gcc-arm-none-eabi-8-2019-q3-update release), the behavior is still the same.
The workaround that also works for me (from https://github.com/ObKo/stm32-cmake/issues/78) is to remove or comment the weak definitions at the end of the startup_XXX.s file, so change, for example
.weak NMI_Handler
.thumb_set NMI_Handler,Default_Handler
to
/*
.weak NMI_Handler
.thumb_set NMI_Handler,Default_Handler
*/
and replace them with your own implementation in a source file:
void NMI_Handler(void)
{
//...
}
All weak handlers need to be removed that are being called, so for example if you have UART1_Handler() defined in the HAL/LL drivers, you need to remove the corresponding .weak entry from the startup_XXX.s file, otherwise the interrupt will lock up the MCU by getting stuck in the default infinite loop, without executing the intended interrupt handler and returning from the interrupt, allowing other code execution to resume.
This bug is still present in gcc-arm-none-eabi-9-2020-q3-update but only for C handlers. Weirdly enough, handlers written in C++ (and declared with extern "C" linkage) are not anymore affected by this bug.
As another workaround, rather than messing with the startup.s file, I found that putting the IRQ handlers in separate .c files and building those (and only those) without LTO does the trick.
For those using CubeIDE and generating IRQ/HAL handlers with CubeMX (aka. "Device Configuration Tool"), all auto-generated handlers are in Core\Src\stm32XXXX_it.c, you just have to edit the properties of this file and remove LTO from the compilation options.
This is sub optimal, but it fits well with auto-generated IRQ/HAL handlers: only the first call (from IRQ handler to HAL handler) is unoptimized, but the HAL code itself is correctly optimized.
In C, let's say you have a variable called variable_name. Let's say it's located at 0xaaaaaaaa, and at that memory address, you have the integer 123. So in other words, variable_name contains 123.
I'm looking for clarification around the phrasing "variable_name is located at 0xaaaaaaaa". How does the compiler recognize that the string "variable_name" is associated with that particular memory address? Is the string "variable_name" stored somewhere in memory? Does the compiler just substitute variable_name for 0xaaaaaaaa whenever it sees it, and if so, wouldn't it have to use memory in order to make that substitution?
Variable names don't exist anymore after the compiler runs (barring special cases like exported globals in shared libraries or debug symbols). The entire act of compilation is intended to take those symbolic names and algorithms represented by your source code and turn them into native machine instructions. So yes, if you have a global variable_name, and compiler and linker decide to put it at 0xaaaaaaaa, then wherever it is used in the code, it will just be accessed via that address.
So to answer your literal questions:
How does the compiler recognize that the string "variable_name" is associated with that particular memory address?
The toolchain (compiler & linker) work together to assign a memory location for the variable. It's the compiler's job to keep track of all the references, and linker puts in the right addresses later.
Is the string "variable_name" stored somewhere in memory?
Only while the compiler is running.
Does the compiler just substitute variable_name for 0xaaaaaaaa whenever it sees it, and if so, wouldn't it have to use memory in order to make that substitution?
Yes, that's pretty much what happens, except it's a two-stage job with the linker. And yes, it uses memory, but it's the compiler's memory, not anything at runtime for your program.
An example might help you understand. Let's try out this program:
int x = 12;
int main(void)
{
return x;
}
Pretty straightforward, right? OK. Let's take this program, and compile it and look at the disassembly:
$ cc -Wall -Werror -Wextra -O3 example.c -o example
$ otool -tV example
example:
(__TEXT,__text) section
_main:
0000000100000f60 pushq %rbp
0000000100000f61 movq %rsp,%rbp
0000000100000f64 movl 0x00000096(%rip),%eax
0000000100000f6a popq %rbp
0000000100000f6b ret
See that movl line? It's grabbing the global variable (in an instruction-pointer relative way, in this case). No more mention of x.
Now let's make it a bit more complicated and add a local variable:
int x = 12;
int main(void)
{
volatile int y = 4;
return x + y;
}
The disassembly for this program is:
(__TEXT,__text) section
_main:
0000000100000f60 pushq %rbp
0000000100000f61 movq %rsp,%rbp
0000000100000f64 movl $0x00000004,0xfc(%rbp)
0000000100000f6b movl 0x0000008f(%rip),%eax
0000000100000f71 addl 0xfc(%rbp),%eax
0000000100000f74 popq %rbp
0000000100000f75 ret
Now there are two movl instructions and an addl instruction. You can see that the first movl is initializing y, which it's decided will be on the stack (base pointer - 4). Then the next movl gets the global x into a register eax, and the addl adds y to that value. But as you can see, the literal x and y strings don't exist anymore. They were conveniences for you, the programmer, but the computer certainly doesn't care about them at execution time.
A C compiler first creates a symbol table, which stores the relationship between the variable name and where it's located in memory. When compiling, it uses this table to replace all instances of the variable with a specific memory location, as others have stated. You can find a lot more on it on the Wikipedia page.
All variables are substituted by the compiler. First they are substituted with references and later the linker places addresses instead of references.
In other words. The variable names are not available anymore as soon as the compiler has run through
This is what's called an implementation detail. While what you describe is the case in all compilers I've ever used, it's not required to be the case. A C compiler could put every variable in a hashtable and look them up at runtime (or something like that) and in fact early JavaScript interpreters did exactly that (now, they do Just-In-TIme compilation that results in something much more raw.)
Specifically for common compilers like VC++, GCC, and LLVM: the compiler will generally assign a variable to a location in memory. Variables of global or static scope get a fixed address that doesn't change while the program is running, while variables within a function get a stack address-that is, an address relative to the current stack pointer, which changes every time a function is called. (This is an oversimplification.) Stack addresses become invalid as soon as the function returns, but have the benefit of having effectively zero overhead to use.
Once a variable has an address assigned to it, there is no further need for the name of the variable, so it is discarded. Depending on the kind of name, the name may be discarded at preprocess time (for macro names), compile time (for static and local variables/functions), and link time (for global variables/functions.) If a symbol is exported (made visible to other programs so they can access it), the name will usually remain somewhere in a "symbol table" which does take up a trivial amount of memory and disk space.
Does the compiler just substitute variable_name for 0xaaaaaaaa whenever it sees it
Yes.
and if so, wouldn't it have to use memory in order to make that substitution?
Yes. But it's the compiler, after it compiled your code, why do you care about memory?
The section $3.6.1/1 from the C++ Standard reads,
A program shall contain a global
function called main, which is the
designated start of the program.
Now consider this code,
int square(int i) { return i*i; }
int user_main()
{
for ( int i = 0 ; i < 10 ; ++i )
std::cout << square(i) << endl;
return 0;
}
int main_ret= user_main();
int main()
{
return main_ret;
}
This sample code does what I intend it to do, i.e printing the square of integers from 0 to 9, before entering into the main() function which is supposed to be the "start" of the program.
I also compiled it with -pedantic option, GCC 4.5.0. It gives no error, not even warning!
So my question is,
Is this code really Standard conformant?
If it's standard conformant, then does it not invalidate what the Standard says? main() is not start of this program! user_main() executed before the main().
I understand that to initialize the global variable main_ret, the use_main() executes first but that is a different thing altogether; the point is that, it does invalidate the quoted statement $3.6.1/1 from the Standard, as main() is NOT the start of the program; it is in fact the end of this program!
EDIT:
How do you define the word 'start'?
It boils down to the definition of the phrase "start of the program". So how exactly do you define it?
You are reading the sentence incorrectly.
A program shall contain a global function called main, which is the designated start of the program.
The standard is DEFINING the word "start" for the purposes of the remainder of the standard. It doesn't say that no code executes before main is called. It says that the start of the program is considered to be at the function main.
Your program is compliant. Your program hasn't "started" until main is started. The function is called before your program "starts" according to the definition of "start" in the standard, but that hardly matters. A LOT of code is executed before main is ever called in every program, not just this example.
For the purposes of discussion, your function is executed prior to the 'start' of the program, and that is fully compliant with the standard.
No, C++ does a lot of things to "set the environment" prior to the call of main; however, main is the official start of the "user specified" part of the C++ program.
Some of the environment setup is not controllable (like the initial code to set up std::cout; however, some of the environment is controllable like static global blocks (for initializing static global variables). Note that since you don't have full control prior to main, you don't have full control on the order in which the static blocks get initialized.
After main, your code is conceptually "fully in control" of the program, in the sense that you can both specify the instructions to be performed and the order in which to perform them. Multi-threading can rearrange code execution order; but, you're still in control with C++ because you specified to have sections of code execute (possibly) out-of-order.
Your program will not link and thus not run unless there is a main. However main() does not cause the start of the execution of the program because objects at file level have constructors that run beforehand and it would be possible to write an entire program that runs its lifetime before main() is reached and let main itself have an empty body.
In reality to enforce this you would have to have one object that is constructed prior to main and its constructor to invoke all the flow of the program.
Look at this:
class Foo
{
public:
Foo();
// other stuff
};
Foo foo;
int main()
{
}
The flow of your program would effectively stem from Foo::Foo()
You tagged the question as "C" too, then, speaking strictly about C, your initialization should fail as per section 6.7.8 "Initialization" of the ISO C99 standard.
The most relevant in this case seems to be constraint #4 which says:
All the expressions in an initializer for an object that
has static storage duration shall be constant expressions or string literals.
So, the answer to your question is that the code is not compliant to the C standard.
You would probably want to remove the "C" tag if you were only interested to the C++ standard.
Section 3.6 as a whole is very clear about the interaction of main and dynamic initializations. The "designated start of the program" is not used anywhere else and is just descriptive of the general intent of main(). It doesn't make any sense to interpret that one phrase in a normative way that contradicts the more detailed and clear requirements in the Standard.
The compiler often has to add code before main() to be standard compliant. Because the standard specifies that initalization of globals/statics must be done before the program is executed. And as mentioned, the same goes for constructors of objects placed at file scope (globals).
Thus the original question is relevant to C as well, because in a C program you would still have the globals/static initialization to do before the program can be started.
The standards assume that these variables are initialized through "magic", because they don't say how they should be set before program initialization. I think they considered that as something outside the scope of a programming language standard.
Edit: See for example ISO 9899:1999 5.1.2:
All objects with static storage
duration shall be initialized (set to
their initial values) before program
startup. The manner and timing of such
initialization are otherwise
unspecified.
The theory behind how this "magic" was to be done goes way back to C's birth, when it was a programming language intended to be used only for the UNIX OS, on RAM-based computers. In theory, the program would be able to load all pre-initialized data from the executable file into RAM, at the same time as the program itself was uploaded to RAM.
Since then, computers and OS have evolved, and C is used in a far wider area than originally anticipated. A modern PC OS has virtual addresses etc, and all embedded systems execute code from ROM, not RAM. So there are many situations where the RAM can't be set "automagically".
Also, the standard is too abstract to know anything about stacks and process memory etc. These things must be done too, before the program is started.
Therefore, pretty much every C/C++ program has some init/"copy-down" code that is executed before main is called, in order to conform with the initialization rules of the standards.
As an example, embedded systems typically have an option called "non-ISO compliant startup" where the whole initialization phase is skipped for performance reasons, and then the code actually starts directly from main. But such systems don't conform to the standards, as you can't rely on the init values of global/static variables.
Your "program" simply returns a value from a global variable. Everything else is initialization code. Thus, the standard holds - you just have a very trivial program and more complex initialization.
main() is a user function called by the C runtime library.
see also: Avoiding the main (entry point) in a C program
Seems like an English semantics quibble. The OP refers to his block of code first as "code" and later as the "program." The user writes the code, and then the compiler writes the program.
Ubuntu 20.04 glibc 2.31 RTFS + GDB
glibc does some setup before main so that some of its functionalities will work. Let's try to track down the source code for that.
hello.c
#include <stdio.h>
int main() {
puts("hello");
return 0;
}
Compile and debug:
gcc -ggdb3 -O0 -std=c99 -Wall -Wextra -pedantic -o hello.out hello.c
gdb hello.out
Now in GDB:
b main
r
bt -past-main
gives:
#0 main () at hello.c:3
#1 0x00007ffff7dc60b3 in __libc_start_main (main=0x555555555149 <main()>, argc=1, argv=0x7fffffffbfb8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffbfa8) at ../csu/libc-start.c:308
#2 0x000055555555508e in _start ()
This already contains the line of the caller of main: https://github.com/cirosantilli/glibc/blob/glibc-2.31/csu/libc-start.c#L308.
The function has a billion ifdefs as can be expected from the level of legacy/generality of glibc, but some key parts which seem to take effect for us should simplify to:
# define LIBC_START_MAIN __libc_start_main
STATIC int
LIBC_START_MAIN (int (*main) (int, char **, char **),
int argc, char **argv,
{
/* Initialize some stuff. */
result = main (argc, argv, __environ MAIN_AUXVEC_PARAM);
exit (result);
}
Before __libc_start_main are are already at _start, which by adding gcc -Wl,--verbose we know is the entry point because the linker script contains:
ENTRY(_start)
and is therefore is the actual very first instruction executed after the dynamic loader finishes.
To confirm that in GDB, we an get rid of the dynamic loader by compiling with -static:
gcc -ggdb3 -O0 -std=c99 -Wall -Wextra -pedantic -o hello.out hello.c
gdb hello.out
and then make GDB stop at the very first instruction executed with starti and print the first instructions:
starti
display/12i $pc
which gives:
=> 0x401c10 <_start>: endbr64
0x401c14 <_start+4>: xor %ebp,%ebp
0x401c16 <_start+6>: mov %rdx,%r9
0x401c19 <_start+9>: pop %rsi
0x401c1a <_start+10>: mov %rsp,%rdx
0x401c1d <_start+13>: and $0xfffffffffffffff0,%rsp
0x401c21 <_start+17>: push %rax
0x401c22 <_start+18>: push %rsp
0x401c23 <_start+19>: mov $0x402dd0,%r8
0x401c2a <_start+26>: mov $0x402d30,%rcx
0x401c31 <_start+33>: mov $0x401d35,%rdi
0x401c38 <_start+40>: addr32 callq 0x4020d0 <__libc_start_main>
By grepping the source for _start and focusing on x86_64 hits we see that this seems to correspond to sysdeps/x86_64/start.S:58:
ENTRY (_start)
/* Clearing frame pointer is insufficient, use CFI. */
cfi_undefined (rip)
/* Clear the frame pointer. The ABI suggests this be done, to mark
the outermost frame obviously. */
xorl %ebp, %ebp
/* Extract the arguments as encoded on the stack and set up
the arguments for __libc_start_main (int (*main) (int, char **, char **),
int argc, char *argv,
void (*init) (void), void (*fini) (void),
void (*rtld_fini) (void), void *stack_end).
The arguments are passed via registers and on the stack:
main: %rdi
argc: %rsi
argv: %rdx
init: %rcx
fini: %r8
rtld_fini: %r9
stack_end: stack. */
mov %RDX_LP, %R9_LP /* Address of the shared library termination
function. */
#ifdef __ILP32__
mov (%rsp), %esi /* Simulate popping 4-byte argument count. */
add $4, %esp
#else
popq %rsi /* Pop the argument count. */
#endif
/* argv starts just at the current stack top. */
mov %RSP_LP, %RDX_LP
/* Align the stack to a 16 byte boundary to follow the ABI. */
and $~15, %RSP_LP
/* Push garbage because we push 8 more bytes. */
pushq %rax
/* Provide the highest stack address to the user code (for stacks
which grow downwards). */
pushq %rsp
#ifdef PIC
/* Pass address of our own entry points to .fini and .init. */
mov __libc_csu_fini#GOTPCREL(%rip), %R8_LP
mov __libc_csu_init#GOTPCREL(%rip), %RCX_LP
mov main#GOTPCREL(%rip), %RDI_LP
#else
/* Pass address of our own entry points to .fini and .init. */
mov $__libc_csu_fini, %R8_LP
mov $__libc_csu_init, %RCX_LP
mov $main, %RDI_LP
#endif
/* Call the user's main function, and exit with its value.
But let the libc call main. Since __libc_start_main in
libc.so is called very early, lazy binding isn't relevant
here. Use indirect branch via GOT to avoid extra branch
to PLT slot. In case of static executable, ld in binutils
2.26 or above can convert indirect branch into direct
branch. */
call *__libc_start_main#GOTPCREL(%rip)
which ends up calling __libc_start_main as expected.
Unfortunately -static makes the bt from main not show as much info:
#0 main () at hello.c:3
#1 0x0000000000402560 in __libc_start_main ()
#2 0x0000000000401c3e in _start ()
If we remove -static and start from starti, we get instead:
=> 0x7ffff7fd0100 <_start>: mov %rsp,%rdi
0x7ffff7fd0103 <_start+3>: callq 0x7ffff7fd0df0 <_dl_start>
0x7ffff7fd0108 <_dl_start_user>: mov %rax,%r12
0x7ffff7fd010b <_dl_start_user+3>: mov 0x2c4e7(%rip),%eax # 0x7ffff7ffc5f8 <_dl_skip_args>
0x7ffff7fd0111 <_dl_start_user+9>: pop %rdx
By grepping the source for _dl_start_user this seems to come from sysdeps/x86_64/dl-machine.h:L147
/* Initial entry point code for the dynamic linker.
The C function `_dl_start' is the real entry point;
its return value is the user program's entry point. */
#define RTLD_START asm ("\n\
.text\n\
.align 16\n\
.globl _start\n\
.globl _dl_start_user\n\
_start:\n\
movq %rsp, %rdi\n\
call _dl_start\n\
_dl_start_user:\n\
# Save the user entry point address in %r12.\n\
movq %rax, %r12\n\
# See if we were run as a command with the executable file\n\
# name as an extra leading argument.\n\
movl _dl_skip_args(%rip), %eax\n\
# Pop the original argument count.\n\
popq %rdx\n\
and this is presumably the dynamic loader entry point.
If we break at _start and continue, this seems to end up in the same location as when we used -static, which then calls __libc_start_main.
When I try a C++ program instead:
hello.cpp
#include <iostream>
int main() {
std::cout << "hello" << std::endl;
}
with:
g++ -ggdb3 -O0 -std=c++11 -Wall -Wextra -pedantic -o hello.out hello.cpp
the results are basically the same, e.g. the backtrace at main is the exact same.
I think the C++ compiler is just calling into hooks to achieve any C++ specific functionality, and things are pretty well factored across C/C++.
TODO:
commented on concrete easy-to-understand examples of what glibc is doing before main. This gives some ideas: What happens before main in C++?
make GDB show the source itself without us having to look at it separately, possibly with us building glibc ourselves: How to compile my own glibc C standard library from source and use it?
understand how the above source code maps to objects such as crti.o that can be seen with gcc --verbose main.c and which end up getting added to the final link
main is called after initializing all the global variables.
What the standard does not specify is the order of initialization of all the global variables of all the modules and statically linked libraries.
Yes, main is the "entry point" of every C++ program, excepting implementation-specific extensions. Even so, some things happen before main, notably global initialization such as for main_ret.