Sometimes there is a function in my binary that I'm sure hasn't been optimized away, because it's called by another function:
(gdb) disassemble 'k3::(anonymous namespace)::BM_AwaitLongReadyChain(testing::benchmark::State&)'
Dump of assembler code for function k3::(anonymous namespace)::BM_AwaitLongReadyChain(testing::benchmark::State&):
[...]
0x00000000003a416d <+45>: call 0x3ad0e0 <k3::(anonymous namespace)::RecursivelyAwait<k3::(anonymous namespace)::Immediate17>(unsigned long, k3::(anonymous namespace)::Immediate17&&)>
End of assembler dump.
But if I ask GDB to disassemble it using the very same name that it refers to the function with, it claims the function doesn't exist:
(gdb) disassemble 'k3::(anonymous namespace)::RecursivelyAwait<k3::(anonymous namespace)::Immediate17>(unsigned long, k3::(anonymous namespace)::Immediate17&&)'
No symbol "k3::(anonymous namespace)::RecursivelyAwait<k3::(anonymous namespace)::Immediate17>(unsigned long, k3::(anonymous namespace)::Immediate17&&)" in current context.
However, if I disassemble it using its address, it works fine:
(gdb) disassemble 0x3ad0e0
Dump of assembler code for function k3::(anonymous namespace)::RecursivelyAwait<k3::(anonymous namespace)::Immediate17>(unsigned long, k3::(anonymous namespace)::Immediate17&&):
0x00000000003ad0e0 <+0>: push rbp
[...]
End of assembler dump.
This is terribly inconvenient, because I don't know the address a priori—I have to go disassemble a caller just to find the address of the callee. It's really cumbersome.
How can I get GDB to disassemble this function by name? I assume this is some issue with name mangling/canonicalization, probably around the rvalue references and/or anonymous namespaces, but I can't figure out what exactly is going on. I'm using GDB 10.0-gg5.
But if I ask GDB to disassemble it using the very same name that it refers to the function with, it claims the function doesn't exist
There are many possible mangling schemes; the relationship between mangled and unmangled names is not 1:1.
The parser built into GDB, which turns foo::bar(int) into something that can be used to look up the symbol in the symbol table, may have bugs.
This is terribly inconvenient, because I don't know the address a priori—I have to go disassemble a caller just to find the address of the callee.
If the called function is already on the stack (i.e. part of the active call chain), you can easily disassemble it via disas $any_address_in_fn; you don't need to give GDB the starting address. So you could do e.g. frame 5 followed by disas $pc, and GDB will find the enclosing function in the symbol table and disassemble it in its entirety.
Another option is to get the address from file:line info: info line foo.cc:123 followed by disas $addr_given_by_previous_command.
If you know that foo::bar() exists somewhere, but don't know its source location, another option is to set a breakpoint on it via e.g. rbreak 'foo::bar'. This will tell you the address where the breakpoint was set, and you can disassemble that address.
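For example, here is a rough sketch of that last route using the function from the question (rbreak takes a regex, so a fragment of the name is enough; the breakpoint output is abridged and illustrative, and the regex may match several instantiations):
(gdb) rbreak RecursivelyAwait
Breakpoint 1 at 0x3ad0e0: [...]
(gdb) disassemble 0x3ad0e0
Dump of assembler code for function k3::(anonymous namespace)::RecursivelyAwait<k3::(anonymous namespace)::Immediate17>(unsigned long, k3::(anonymous namespace)::Immediate17&&):
[...]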
In C, let's say you have a variable called variable_name. Let's say it's located at 0xaaaaaaaa, and at that memory address, you have the integer 123. So in other words, variable_name contains 123.
I'm looking for clarification around the phrasing "variable_name is located at 0xaaaaaaaa". How does the compiler recognize that the string "variable_name" is associated with that particular memory address? Is the string "variable_name" stored somewhere in memory? Does the compiler just substitute variable_name for 0xaaaaaaaa whenever it sees it, and if so, wouldn't it have to use memory in order to make that substitution?
Variable names don't exist anymore after the compiler runs (barring special cases like exported globals in shared libraries or debug symbols). The entire act of compilation is intended to take those symbolic names and algorithms represented by your source code and turn them into native machine instructions. So yes, if you have a global variable_name, and compiler and linker decide to put it at 0xaaaaaaaa, then wherever it is used in the code, it will just be accessed via that address.
So to answer your literal questions:
How does the compiler recognize that the string "variable_name" is associated with that particular memory address?
The toolchain (compiler and linker) works together to assign a memory location to the variable. It's the compiler's job to keep track of all the references, and the linker puts in the right addresses later.
Is the string "variable_name" stored somewhere in memory?
Only while the compiler is running.
Does the compiler just substitute variable_name for 0xaaaaaaaa whenever it sees it, and if so, wouldn't it have to use memory in order to make that substitution?
Yes, that's pretty much what happens, except it's a two-stage job with the linker. And yes, it uses memory, but it's the compiler's memory, not anything at runtime for your program.
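As a hedged aside on that two-stage job (the commands and output below are illustrative and abridged; the exact offset and relocation type depend on compiler defaults such as PIC), compiling a one-global program to an object file and listing its relocations shows the name x still present, as a placeholder the linker will later patch with a real address:
$ cat reloc.c
int x = 12;
int main(void) { return x; }
$ cc -O2 -fno-pic -c reloc.c
$ objdump -r reloc.o
RELOCATION RECORDS FOR [.text]:
OFFSET           TYPE              VALUE
0000000000000002 R_X86_64_PC32     x-0x4
After linking, that record becomes a fixed displacement inside the mov instruction, and the name x is no longer needed at run time.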
An example might help you understand. Let's try out this program:
int x = 12;
int main(void)
{
return x;
}
Pretty straightforward, right? OK. Let's take this program, and compile it and look at the disassembly:
$ cc -Wall -Werror -Wextra -O3 example.c -o example
$ otool -tV example
example:
(__TEXT,__text) section
_main:
0000000100000f60 pushq %rbp
0000000100000f61 movq %rsp,%rbp
0000000100000f64 movl 0x00000096(%rip),%eax
0000000100000f6a popq %rbp
0000000100000f6b ret
See that movl line? It's grabbing the global variable (in an instruction-pointer relative way, in this case). No more mention of x.
Now let's make it a bit more complicated and add a local variable:
int x = 12;
int main(void)
{
volatile int y = 4;
return x + y;
}
The disassembly for this program is:
(__TEXT,__text) section
_main:
0000000100000f60 pushq %rbp
0000000100000f61 movq %rsp,%rbp
0000000100000f64 movl $0x00000004,0xfc(%rbp)
0000000100000f6b movl 0x0000008f(%rip),%eax
0000000100000f71 addl 0xfc(%rbp),%eax
0000000100000f74 popq %rbp
0000000100000f75 ret
Now there are two movl instructions and an addl instruction. You can see that the first movl is initializing y, which it's decided will be on the stack (base pointer - 4). Then the next movl gets the global x into a register eax, and the addl adds y to that value. But as you can see, the literal x and y strings don't exist anymore. They were conveniences for you, the programmer, but the computer certainly doesn't care about them at execution time.
A C compiler first creates a symbol table, which stores the relationship between the variable name and where it's located in memory. When compiling, it uses this table to replace all instances of the variable with a specific memory location, as others have stated. You can find a lot more on it on the Wikipedia page.
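As a hedged illustration of the part of that table which survives into the object file (assuming a Linux/ELF toolchain; on macOS the names get a leading underscore, and exact columns vary), running nm on an object built from the two-variable example above shows an entry for the global x but nothing for the local y:
$ cc -c example.c
$ nm example.o
0000000000000000 T main
0000000000000000 D x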
All variable names are substituted by the compiler: first they are replaced with references, and later the linker places actual addresses instead of those references.
In other words, the variable names are no longer available once the compiler has run.
This is what's called an implementation detail. While what you describe is the case in all compilers I've ever used, it's not required to be the case. A C compiler could put every variable in a hashtable and look them up at runtime (or something like that), and in fact early JavaScript interpreters did exactly that (now they do Just-In-Time compilation that results in something much more raw).
Specifically for common compilers like VC++, GCC, and LLVM: the compiler will generally assign a variable to a location in memory. Variables of global or static scope get a fixed address that doesn't change while the program is running, while variables within a function get a stack address, that is, an address relative to the current stack pointer, which changes every time a function is called. (This is an oversimplification.) Stack addresses become invalid as soon as the function returns, but have the benefit of effectively zero overhead to use.
Once a variable has an address assigned to it, there is no further need for the name of the variable, so it is discarded. Depending on the kind of name, the name may be discarded at preprocess time (for macro names), compile time (for static and local variables/functions), or link time (for global variables/functions). If a symbol is exported (made visible to other programs so they can access it), the name will usually remain somewhere in a "symbol table", which does take up a trivial amount of memory and disk space.
Does the compiler just substitute variable_name for 0xaaaaaaaa whenever it sees it
Yes.
and if so, wouldn't it have to use memory in order to make that substitution?
Yes. But it's the compiler's memory while it compiles your code; once compilation is done, why do you care about that memory?
I'm reverse engineering a program using gdb and I'm getting confused in all the addresses I enter to various commands. Is there a way to create (and store) a custom variable so that I could say x/i my_addr_name instead of x/i 0xdeadbeef?
gdb has user-definable convenience variables to hold various values.
(gdb) set $my_addr_name=$pc
(gdb) x/i $my_addr_name
=> 0x400c7d <main+390>: lea -0xa0(%rbp),%rax
(gdb) ptype $my_addr_name
type = void (*)()
Convenience variables have a type, and the print command will make use of it, but the x command uses explicit or default formats and doesn't take the type of the expression into account.
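For instance, printing the convenience variable shows it with its pointer-to-function type (output illustrative, from the same hypothetical session as above):
(gdb) print $my_addr_name
$1 = (void (*)()) 0x400c7d <main+390>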
could I have x/i 0xdeadbeff say my_addr_name+16
I don't think so, unless some additional C or python code is written. gdb's C source code has a build_address_symbolic function, which looks through the symbol tables to find the symbol nearest to an address. Short of creating a custom symbol table, then loading it with the add-symbol-file command, or writing a python extension to implement an alternative to the x command, I don't think such a customization is possible, currently.
I have a scenario in GCC causing me problems. The behaviour I get is not the behaviour I expect. To summarise the situation, I am proposing several new instructions for x86-64 which are implemented in a hardware simulator. In order to test these instructions I am taking existing C source code and handcoding the new instructions using hexadecimal. Because these instructions interact with the existing x86-64 registers, I use the input/output/clobber lists to declare dependencies for GCC.
What's happening is that if I call a function e.g. printf, the dependent registers aren't saved and restored.
For example
register unsigned long r9 asm ("r9") = 101;
printf("foo %s\n", "bar");
asm volatile (".byte 0x00, 0x00, 0x00, 0x00" : /* no output */ : "q" (r9) );
101 was assigned to r9 and the inline assembly (fake in this example) is dependent on r9. This runs correctly in the absence of the printf, but when it is there GCC does not save and restore r9 and another value is there by the time my custom instruction is called.
I thought perhaps that GCC might have secretly changed the assignment to the variable r9, but when I do this
asm volatile (".byte %0" : /* no output */ : "q" (r9) );
and look at the assembly output, it is indeed using %r9.
I am using gcc 4.4.5. What do you think might be happening? I thought GCC will always save and restore registers on function calls. Is there some way I can enforce it?
Thanks!
EDIT: By the way, I'm compiling the program like this
gcc -static -m64 -mmmx -msse -msse2 -O0 test.c -o test
The ABI, section 3.2.1 says:
Registers %rbp, %rbx and %r12 through %r15 “belong” to the calling function and the called function is required to preserve their values. In other words, a called function must preserve these registers’ values for its caller. Remaining registers “belong” to the called function. If a calling function wants to preserve such a register value across a function call, it must save the value in its local stack frame.
so you shouldn't expect registers other than %rbp, %rbx and %r12 through %r15 to be preserved by a function call.
gcc will not make explicit-register variables like this callee-saved. Basically this register notation you're using makes the variable a direct alias for the register, with the assumption you want to be able to read back the value a callee leaves in the register. If you used a callee-saved register instead of a call-clobbered (caller-saved) register, the problem would go away.
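A minimal sketch of that suggestion under the question's setup (the .byte sequence here is just a 4-byte NOP encoding standing in for the proposed custom instruction):
#include <stdio.h>

int main(void)
{
    /* r12 is callee-saved in the x86-64 SysV ABI, so printf (and anything
       it calls) must restore it before returning; r9 is call-clobbered,
       which is why the original value did not survive the call. */
    register unsigned long r12 asm ("r12") = 101;

    printf("foo %s\n", "bar");

    /* Stand-in bytes for the custom instruction (a multi-byte NOP), with the
       register variable listed as an input so GCC keeps the value in r12. */
    asm volatile (".byte 0x0f, 0x1f, 0x40, 0x00"
                  : /* no output */
                  : "q" (r12));

    return 0;
}
Alternatively, keep r9 but copy the value into an ordinary variable before the printf and reload the register variable from it afterwards.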
The section $3.6.1/1 from the C++ Standard reads,
A program shall contain a global function called main, which is the designated start of the program.
Now consider this code,
#include <iostream>

int square(int i) { return i*i; }

int user_main()
{
    for ( int i = 0 ; i < 10 ; ++i )
        std::cout << square(i) << std::endl;

    return 0;
}

int main_ret = user_main();

int main()
{
    return main_ret;
}
This sample code does what I intend it to do, i.e. printing the square of the integers from 0 to 9, before entering the main() function, which is supposed to be the "start" of the program.
I also compiled it with -pedantic option, GCC 4.5.0. It gives no error, not even warning!
So my question is,
Is this code really Standard conformant?
If it's standard conformant, then does it not invalidate what the Standard says? main() is not the start of this program! user_main() executed before main().
I understand that to initialize the global variable main_ret, user_main() executes first, but that is a different thing altogether; the point is that it does invalidate the quoted statement $3.6.1/1 from the Standard, as main() is NOT the start of the program; it is in fact the end of this program!
EDIT:
How do you define the word 'start'?
It boils down to the definition of the phrase "start of the program". So how exactly do you define it?
You are reading the sentence incorrectly.
A program shall contain a global function called main, which is the designated start of the program.
The standard is DEFINING the word "start" for the purposes of the remainder of the standard. It doesn't say that no code executes before main is called. It says that the start of the program is considered to be at the function main.
Your program is compliant. Your program hasn't "started" until main is started. The function is called before your program "starts" according to the definition of "start" in the standard, but that hardly matters. A LOT of code is executed before main is ever called in every program, not just this example.
For the purposes of discussion, your function is executed prior to the 'start' of the program, and that is fully compliant with the standard.
No, C++ does a lot of things to "set the environment" prior to the call of main; however, main is the official start of the "user specified" part of the C++ program.
Some of the environment setup is not controllable (like the initial code to set up std::cout); however, some of the environment is controllable, like static global blocks (for initializing static global variables). Note that since you don't have full control prior to main, you don't have full control on the order in which the static blocks get initialized.
After main, your code is conceptually "fully in control" of the program, in the sense that you can both specify the instructions to be performed and the order in which to perform them. Multi-threading can rearrange code execution order; but, you're still in control with C++ because you specified to have sections of code execute (possibly) out-of-order.
Your program will not link, and thus not run, unless there is a main. However, main() does not cause the start of the execution of the program, because objects at file scope have constructors that run beforehand, and it would be possible to write an entire program that runs its lifetime before main() is reached and to let main itself have an empty body.
In reality, to enforce this you would have to have one object that is constructed prior to main, with its constructor invoking all the flow of the program.
Look at this:
class Foo
{
public:
    Foo() { /* the whole program effectively runs from here, before main() */ }
    // other stuff
};
Foo foo;
int main()
{
}
The flow of your program would effectively stem from Foo::Foo()
You tagged the question as "C" too; speaking strictly about C, your initialization should fail as per section 6.7.8 "Initialization" of the ISO C99 standard.
The most relevant in this case seems to be constraint #4 which says:
All the expressions in an initializer for an object that has static storage duration shall be constant expressions or string literals.
So, the answer to your question is that the code is not compliant to the C standard.
You would probably want to remove the "C" tag if you are only interested in the C++ standard.
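For illustration, here is a hedged, self-contained version of that difference (the quoted GCC diagnostic is typical wording, not a guarantee): compiled as C it is rejected, while compiled as C++ it is the same dynamic-initialization pattern as in the question.
int user_main(void) { return 42; }

/* Violates C99 6.7.8 constraint 4: an object with static storage duration
   must be initialized with constant expressions.  GCC typically says
   "initializer element is not constant".  In C++, this is legal dynamic
   initialization and user_main() runs before main(). */
int main_ret = user_main();

int main(void)
{
    return main_ret;
}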
Section 3.6 as a whole is very clear about the interaction of main and dynamic initializations. The "designated start of the program" is not used anywhere else and is just descriptive of the general intent of main(). It doesn't make any sense to interpret that one phrase in a normative way that contradicts the more detailed and clear requirements in the Standard.
The compiler often has to add code before main() to be standard compliant, because the standard specifies that initialization of globals/statics must be done before the program is executed. And as mentioned, the same goes for constructors of objects placed at file scope (globals).
Thus the original question is relevant to C as well, because in a C program you would still have the globals/static initialization to do before the program can be started.
The standards assume that these variables are initialized through "magic", because they don't say how they should be set before program initialization. I think they considered that as something outside the scope of a programming language standard.
Edit: See for example ISO 9899:1999 5.1.2:
All objects with static storage duration shall be initialized (set to their initial values) before program startup. The manner and timing of such initialization are otherwise unspecified.
The theory behind how this "magic" was to be done goes way back to C's birth, when it was a programming language intended to be used only for the UNIX OS, on RAM-based computers. In theory, the program would be able to load all pre-initialized data from the executable file into RAM, at the same time as the program itself was uploaded to RAM.
Since then, computers and OS have evolved, and C is used in a far wider area than originally anticipated. A modern PC OS has virtual addresses etc, and all embedded systems execute code from ROM, not RAM. So there are many situations where the RAM can't be set "automagically".
Also, the standard is too abstract to know anything about stacks and process memory etc. These things must be done too, before the program is started.
Therefore, pretty much every C/C++ program has some init/"copy-down" code that is executed before main is called, in order to conform with the initialization rules of the standards.
As an example, embedded systems typically have an option called "non-ISO compliant startup" where the whole initialization phase is skipped for performance reasons, and then the code actually starts directly from main. But such systems don't conform to the standards, as you can't rely on the init values of global/static variables.
Your "program" simply returns a value from a global variable. Everything else is initialization code. Thus, the standard holds - you just have a very trivial program and more complex initialization.
main() is a user function called by the C runtime library.
see also: Avoiding the main (entry point) in a C program
Seems like an English semantics quibble. The OP refers to his block of code first as "code" and later as the "program." The user writes the code, and then the compiler writes the program.
Ubuntu 20.04 glibc 2.31 RTFS + GDB
glibc does some setup before main so that some of its functionalities will work. Let's try to track down the source code for that.
hello.c
#include <stdio.h>
int main() {
puts("hello");
return 0;
}
Compile and debug:
gcc -ggdb3 -O0 -std=c99 -Wall -Wextra -pedantic -o hello.out hello.c
gdb hello.out
Now in GDB:
b main
r
bt -past-main
gives:
#0 main () at hello.c:3
#1 0x00007ffff7dc60b3 in __libc_start_main (main=0x555555555149 <main()>, argc=1, argv=0x7fffffffbfb8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffbfa8) at ../csu/libc-start.c:308
#2 0x000055555555508e in _start ()
This already contains the line of the caller of main: https://github.com/cirosantilli/glibc/blob/glibc-2.31/csu/libc-start.c#L308.
The function has a billion ifdefs as can be expected from the level of legacy/generality of glibc, but some key parts which seem to take effect for us should simplify to:
# define LIBC_START_MAIN __libc_start_main
STATIC int
LIBC_START_MAIN (int (*main) (int, char **, char **),
                 int argc, char **argv,
                 /* ... several more setup arguments elided ... */)
{
  /* Initialize some stuff. */
  result = main (argc, argv, __environ MAIN_AUXVEC_PARAM);
  exit (result);
}
Before __libc_start_main we are already at _start, which we know is the entry point because adding gcc -Wl,--verbose shows that the linker script contains:
ENTRY(_start)
and it therefore holds the very first instruction executed after the dynamic loader finishes.
To confirm that in GDB, we can get rid of the dynamic loader by compiling with -static:
gcc -ggdb3 -O0 -std=c99 -Wall -Wextra -pedantic -static -o hello.out hello.c
gdb hello.out
and then make GDB stop at the very first instruction executed with starti and print the first instructions:
starti
display/12i $pc
which gives:
=> 0x401c10 <_start>: endbr64
0x401c14 <_start+4>: xor %ebp,%ebp
0x401c16 <_start+6>: mov %rdx,%r9
0x401c19 <_start+9>: pop %rsi
0x401c1a <_start+10>: mov %rsp,%rdx
0x401c1d <_start+13>: and $0xfffffffffffffff0,%rsp
0x401c21 <_start+17>: push %rax
0x401c22 <_start+18>: push %rsp
0x401c23 <_start+19>: mov $0x402dd0,%r8
0x401c2a <_start+26>: mov $0x402d30,%rcx
0x401c31 <_start+33>: mov $0x401d35,%rdi
0x401c38 <_start+40>: addr32 callq 0x4020d0 <__libc_start_main>
By grepping the source for _start and focusing on x86_64 hits we see that this seems to correspond to sysdeps/x86_64/start.S:58:
ENTRY (_start)
/* Clearing frame pointer is insufficient, use CFI. */
cfi_undefined (rip)
/* Clear the frame pointer. The ABI suggests this be done, to mark
the outermost frame obviously. */
xorl %ebp, %ebp
/* Extract the arguments as encoded on the stack and set up
the arguments for __libc_start_main (int (*main) (int, char **, char **),
int argc, char *argv,
void (*init) (void), void (*fini) (void),
void (*rtld_fini) (void), void *stack_end).
The arguments are passed via registers and on the stack:
main: %rdi
argc: %rsi
argv: %rdx
init: %rcx
fini: %r8
rtld_fini: %r9
stack_end: stack. */
mov %RDX_LP, %R9_LP /* Address of the shared library termination
function. */
#ifdef __ILP32__
mov (%rsp), %esi /* Simulate popping 4-byte argument count. */
add $4, %esp
#else
popq %rsi /* Pop the argument count. */
#endif
/* argv starts just at the current stack top. */
mov %RSP_LP, %RDX_LP
/* Align the stack to a 16 byte boundary to follow the ABI. */
and $~15, %RSP_LP
/* Push garbage because we push 8 more bytes. */
pushq %rax
/* Provide the highest stack address to the user code (for stacks
which grow downwards). */
pushq %rsp
#ifdef PIC
/* Pass address of our own entry points to .fini and .init. */
mov __libc_csu_fini@GOTPCREL(%rip), %R8_LP
mov __libc_csu_init@GOTPCREL(%rip), %RCX_LP
mov main@GOTPCREL(%rip), %RDI_LP
#else
/* Pass address of our own entry points to .fini and .init. */
mov $__libc_csu_fini, %R8_LP
mov $__libc_csu_init, %RCX_LP
mov $main, %RDI_LP
#endif
/* Call the user's main function, and exit with its value.
But let the libc call main. Since __libc_start_main in
libc.so is called very early, lazy binding isn't relevant
here. Use indirect branch via GOT to avoid extra branch
to PLT slot. In case of static executable, ld in binutils
2.26 or above can convert indirect branch into direct
branch. */
call *__libc_start_main@GOTPCREL(%rip)
which ends up calling __libc_start_main as expected.
Unfortunately -static makes the bt from main not show as much info:
#0 main () at hello.c:3
#1 0x0000000000402560 in __libc_start_main ()
#2 0x0000000000401c3e in _start ()
If we remove -static and start from starti, we get instead:
=> 0x7ffff7fd0100 <_start>: mov %rsp,%rdi
0x7ffff7fd0103 <_start+3>: callq 0x7ffff7fd0df0 <_dl_start>
0x7ffff7fd0108 <_dl_start_user>: mov %rax,%r12
0x7ffff7fd010b <_dl_start_user+3>: mov 0x2c4e7(%rip),%eax # 0x7ffff7ffc5f8 <_dl_skip_args>
0x7ffff7fd0111 <_dl_start_user+9>: pop %rdx
By grepping the source for _dl_start_user, this seems to come from sysdeps/x86_64/dl-machine.h:L147:
/* Initial entry point code for the dynamic linker.
The C function `_dl_start' is the real entry point;
its return value is the user program's entry point. */
#define RTLD_START asm ("\n\
.text\n\
.align 16\n\
.globl _start\n\
.globl _dl_start_user\n\
_start:\n\
movq %rsp, %rdi\n\
call _dl_start\n\
_dl_start_user:\n\
# Save the user entry point address in %r12.\n\
movq %rax, %r12\n\
# See if we were run as a command with the executable file\n\
# name as an extra leading argument.\n\
movl _dl_skip_args(%rip), %eax\n\
# Pop the original argument count.\n\
popq %rdx\n\
and this is presumably the dynamic loader entry point.
If we break at _start and continue, this seems to end up in the same location as when we used -static, which then calls __libc_start_main.
When I try a C++ program instead:
hello.cpp
#include <iostream>
int main() {
std::cout << "hello" << std::endl;
}
with:
g++ -ggdb3 -O0 -std=c++11 -Wall -Wextra -pedantic -o hello.out hello.cpp
the results are basically the same, e.g. the backtrace at main is the exact same.
I think the C++ compiler is just calling into hooks to achieve any C++ specific functionality, and things are pretty well factored across C/C++.
TODO:
find commented, concrete, easy-to-understand examples of what glibc is doing before main. This gives some ideas: What happens before main in C++?
make GDB show the source itself without us having to look at it separately, possibly with us building glibc ourselves: How to compile my own glibc C standard library from source and use it?
understand how the above source code maps to objects such as crti.o that can be seen with gcc --verbose main.c and which end up getting added to the final link
main is called after initializing all the global variables.
What the standard does not specify is the order of initialization of all the global variables of all the modules and statically linked libraries.
Yes, main is the "entry point" of every C++ program, excepting implementation-specific extensions. Even so, some things happen before main, notably global initialization such as for main_ret.