Sometimes there is a function in my binary that I'm sure hasn't been optimized away, because it's called by another function:
(gdb) disassemble 'k3::(anonymous namespace)::BM_AwaitLongReadyChain(testing::benchmark::State&)'
Dump of assembler code for function k3::(anonymous namespace)::BM_AwaitLongReadyChain(testing::benchmark::State&):
[...]
0x00000000003a416d <+45>: call 0x3ad0e0 <k3::(anonymous namespace)::RecursivelyAwait<k3::(anonymous namespace)::Immediate17>(unsigned long, k3::(anonymous namespace)::Immediate17&&)>
End of assembler dump.
But if I ask GDB to disassemble it using the very same name that it refers to the function with, it claims the function doesn't exist:
(gdb) disassemble 'k3::(anonymous namespace)::RecursivelyAwait<k3::(anonymous namespace)::Immediate17>(unsigned long, k3::(anonymous namespace)::Immediate17&&)'
No symbol "k3::(anonymous namespace)::RecursivelyAwait<k3::(anonymous namespace)::Immediate17>(unsigned long, k3::(anonymous namespace)::Immediate17&&)" in current context.
However, if I disassemble it using its address, it works fine:
(gdb) disassemble 0x3ad0e0
Dump of assembler code for function k3::(anonymous namespace)::RecursivelyAwait<k3::(anonymous namespace)::Immediate17>(unsigned long, k3::(anonymous namespace)::Immediate17&&):
0x00000000003ad0e0 <+0>: push rbp
[...]
End of assembler dump.
This is terribly inconvenient, because I don't know the address a priori—I have to go disassemble a caller just to find the address of the callee. It's really cumbersome.
How can I get GDB to disassemble this function by name? I assume this is some issue with name mangling/canonicalization, probably around the rvalue references and/or anonymous namespaces, but I can't figure out what exactly is going on. I'm using GDB 10.0-gg5.
But if I ask GDB to disassemble it using the very same name that it refers to the function with, it claims the function doesn't exist
There are many possible mangling schemes; the relationship between mangled and unmangled names is not 1:1.
The parser built into GDB, which turns foo::bar(int) into something that can be used to look up the symbol in the symbol table, may have bugs.
This is terribly inconvenient, because I don't know the address a priori—I have to go disassemble a caller just to find the address of the callee.
If the called function is already on the stack (i.e. part of active call chain), you can easily disassemble it via disas $any_address_in_fn -- you don't need to give GDB the starting address. So you could do e.g. frame 5 followed by disas $pc -- GDB will find enclosing function in the symbol table and disassemble it in its entirety.
Another option is to get the address from file:line info: info line foo.cc:123 followed by disas $addr_given_by_previous_command.
If you know that foo::bar() exists somewhere, but don't know its source location, another option is to set a breakpoint on it via e.g. rbreak 'foo::bar'. This will tell you the address where the breakpoint was set, and you can disassemble that address.
Related
I'm reverse engineering a program using gdb and I'm getting confused in all the addresses I enter to various commands. Is there a way to create (and store) a custom variable so that I could say x/i my_addr_name instead of x/i 0xdeadbeef?
gdb has user-definable convenience variables to hold various values.
(gdb) set $my_addr_name=$pc
(gdb) x/i $my_addr_name
=> 0x400c7d <main+390>: lea -0xa0(%rbp),%rax
(gdb) ptype $my_addr_name
type = void (*)()
Convenience variables have a type, and the print command makes use of it, but the x command uses explicit or default formats and doesn't take the type of the expression into account.
could I have x/i 0xdeadbeff say my_addr_name+16
I don't think so, unless some additional C or python code is written. gdb's C source code has a build_address_symbolic function, which looks through the symbol tables to find the symbol nearest to an address. Short of creating a custom symbol table, then loading it with the add-symbol-file command, or writing a python extension to implement an alternative to the x command, I don't think such a customization is possible, currently.
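As a rough illustration of what such a Python extension would have to do, here is a minimal sketch of the nearest-name lookup on its own, outside GDB. The function name symbolize and the names dictionary are hypothetical and not part of GDB; wiring this into an actual replacement for the x command would still require GDB's Python API.

```python
import bisect

def symbolize(addr, names):
    """Format addr as 'name+offset' using the nearest user-defined name at or below it.

    `names` maps a label to its address. This is a hypothetical user-maintained table,
    not something GDB provides.
    """
    table = sorted((a, n) for n, a in names.items())
    addrs = [a for a, _ in table]
    # Index of the last named address that is <= addr.
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return hex(addr)  # no name at or below this address
    base, name = table[i]
    return name if addr == base else f"{name}+{addr - base}"

if __name__ == "__main__":
    print(symbolize(0xdeadbeff, {"my_addr_name": 0xdeadbeef}))
```

With the single entry my_addr_name at 0xdeadbeef, the address 0xdeadbeff is 16 bytes past it, so the lookup yields my_addr_name+16.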
I was trying to develop a better understanding of linkers and how they work, so I tried to call a simple function (printf) from the C library (MSVCRTD.lib), but from assembly code written in MASM.
I dumped the external symbols of the "MSVCRTD.lib" library, which contains many printf-related symbols such as:
__imp__printf
_printf
___imp___printf_l
;and more ...
I had 2 challenges (linking/building) and (running).
As for the first challenge, linking my assembly code to the library was not a problem at all. I could link my assembly code with a call to any external function of the library; all I needed was to mimic the decorated (mangled) name of the function so the linker could recognize it. I first tried the second one, "_printf", which looked shorter and nicer. After disassembling its code I saw that it takes 2 parameters on the stack and uses the cdecl calling convention, so I wrote the code it needs:
.386
.model flat,stdcall
.stack 4096
option casemap :none
Extern printf :PROC ; MASM will decorate it to be "_printf"
.data
message byte "Hello C library, this is MASM calling",0 ; NUL-terminated for printf
.code
main proc
push 0
push offset message
call printf
add esp,8 ; clean the stack
retn
main endp
end
And everything went smoothly.
But when I tried the same thing with "_imp__printf", the problems started.
BTW: this is the function that the C compiler calls when you write the famous hello world! C application.
The linker successfully built the program, but when I ran it, it crashed!
I read the linker output messages and everything looked normal except for the line that says: " Discarded _printf from MSVCRTD.lib(MSVCR100D.dll)".
I debugged the program with OllyDBG and found that the call instruction that should land on the function actually lands in an area that is recognized as DATA, in the .rdata section.
Why did the "_printf" function succeed when "__imp__printf" didn't? Any ideas?
Thanks to Mr. Jester and Mr. Raymond Chen,
who provided the solution to the problem in the comments.
The problem was the declaration of __imp__printf. I had declared it as a PROC, like the working _printf example, but the symbol is DATA (it holds the address of the function), so declaring it as
Extern _imp__printf :DWORD
makes it work like printf.
Thank you so much, both of you.
EDIT: this is not the solution, but I will leave it for later reference. The solution is in the next answer.
I think I'm close to figuring out why calling the _imp__printf external function failed: it is a problem with the jump instruction. Here is what I did.
I built the hello world! C program to see how the _imp__printf function looks in the symbol table when the file is compiled as a C program instead of assembly. I then dumped the OBJ files produced from both calling programs (the MASM one and the C one), and the results were very interesting. Here is _imp__printf in the OBJ compiled from the C file
and here is _imp__printf in the OBJ compiled from the MASM file
Interesting! After consulting the COFF documentation, it seems that the relocation type REL32 makes the linker not process the import address table correctly, so the jump instruction fails as it did before.
Now my question is: how can I tell MASM to assemble the file with the _imp__printf symbol using relocation type "DIR32"?
I want to get the address of function _dl_start (entry point of the dynamic linker). I am able to set a breakpoint using gdb. I expected to find the symbol using readelf but I did not. How can I get the address / how does the gdb resolve _dl_start?
The example source (main.cpp) to set the breakpoint using the gdb is
int main( int argc, char** argv, char** envp )
{
return 0;
}
I compiled it with
g++ main.cpp -o teststart
The gdb output when running the program was
(gdb) b _dl_start
Function "_dl_start" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (_dl_start) pending.
(gdb) r
Starting program: /tmp/teststart
Breakpoint 1, 0x00007fa7ee8c4fc4 in _dl_start () from /lib64/ld-linux-x86-64.so.2
The _dl_start symbol is in ld-linux-x86-64.so.2 (the dynamic loader), and that symbol is private to ld-linux. This means that the only way to find it from inside the program is to do the same thing GDB does: read the symbol table of ld-linux, and search it for the "_dl_start" function (by name). Linking to it directly (as Martin suggested) can not and will not work (as you've already discovered).
Reading ELF symbol tables is not very complicated -- you just have to find .symtab and .strtab sections, and read the .symtab as a table of Elf64_Sym entries. Or use libelf (start here).
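To make the table layout concrete, here is a minimal sketch of that lookup in Python, assuming a little-endian ELF64 file. The function name find_symbol is hypothetical. Instead of searching for sections by name, it finds the symbol table by section type (SHT_SYMTAB) and follows its sh_link field to the associated string table, which for a standard file are .symtab and .strtab.

```python
import struct

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr (64 bytes)
SHDR = struct.Struct("<IIQQQQIIQQ")        # Elf64_Shdr (64 bytes)
SYM = struct.Struct("<IBBHQQ")             # Elf64_Sym  (24 bytes)
SHT_SYMTAB = 2

def find_symbol(image, name):
    """Return st_value of `name` from the symbol table of an ELF64 image, or None."""
    # e_ident: magic, EI_CLASS == 2 (64-bit), EI_DATA == 1 (little-endian)
    if image[:4] != b"\x7fELF" or image[4] != 2 or image[5] != 1:
        raise ValueError("not a little-endian ELF64 image")
    fields = EHDR.unpack_from(image, 0)
    e_shoff, e_shentsize, e_shnum = fields[6], fields[11], fields[12]
    sections = [SHDR.unpack_from(image, e_shoff + i * e_shentsize)
                for i in range(e_shnum)]
    wanted = name.encode()
    for sec in sections:
        sh_type, sh_offset, sh_size, sh_link = sec[1], sec[4], sec[5], sec[6]
        if sh_type != SHT_SYMTAB:
            continue
        str_off = sections[sh_link][4]  # file offset of the linked string table
        for off in range(sh_offset, sh_offset + sh_size, SYM.size):
            st_name, _info, _other, _shndx, st_value, _size = SYM.unpack_from(image, off)
            start = str_off + st_name
            end = image.index(b"\0", start)  # names are NUL-terminated
            if image[start:end] == wanted:
                return st_value
    return None  # no symbol table (stripped) or name not present
```

A call would look like find_symbol(open("/lib64/ld-linux-x86-64.so.2", "rb").read(), "_dl_start"). For a shared object such as ld-linux, the returned value is relative to wherever the object is loaded, so the load address still has to be added to get a runtime address.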
An additional complication is that ld-linux could be stripped (the symbol table is not required for it to work). If it is stripped, neither GDB nor your program will be able to find _dl_start.
Finally, it is somewhat likely that your attempt to find _dl_start is pointless: you do realize that this function is called long before the first instruction of your program is executed. By the time you hit main, _dl_start has long finished, never to be called again.
Update:
I still wonder how gdb gets the address of _dl_start in ld-linux (it is stripped)
If ld-linux is stripped, GDB will not be able to find _dl_start in it. Since your GDB does find it, either
your ld-linux is not actually stripped, or
you have "separate debuginfo" package for glibc installed.
To verify that ld-linux is really fully stripped, run nm /lib64/ld-linux-x86-64.so.2 | grep _dl_start and readelf -S /lib64/ld-linux-x86-64.so.2 | grep symtab. Both commands should produce no output.
To see where GDB is loading symbols from, you can use set print symbol-loading on command (before running the executable).
I wanted to call _dl_start (after preparing the stack and adjusting the auxiliary vector) to create an executable image of a program stored already in memory (file representation)...
I don't see how that could possibly work. _dl_start expects certain state (e.g. its global variables to be zeroed out) before it is called, so calling it for a second time is very likely to result in assertion failure even if you don't adjust the aux vector. And assert is even more likely if you do adjust aux vector in some non-trivial way, which is (apparently) your goal.
_dl_start is not part of your program itself, it is contained in the runtime loader (as you can see from the output "..._dl_start () from /lib64/ld-linux-x86-64.so.2").
GDB initially cannot set a breakpoint, because _dl_start is not contained in your executable.
It is a bit unclear to me whether you want to know the address of _dl_start from inside the program or from outside. From the inside, you should be able to simply assign it, e.g. to a void* variable like this:
void* address = _dl_start;
I want to dump a backtrace from a C++ program in Linux in a similar format as it is done in gdb. I tried to use the backtrace() and backtrace_symbols() functions for this purpose. These returned function names and offsets. I can use the __cxa_demangle() function to get a readable function name.
Is there any way to get the file/line positions too, as it is done by gdb?
How it's better to invoke gdb from program to print its stacktrace?
Method #4 there shows a way to get the filename and line, but it uses an external program.
Dump of assembler code for function foo@plt:
0x0000000000400528 <foo@plt+0>: jmpq *0x2004d2(%rip) # 0x600a00 <_GLOBAL_OFFSET_TABLE_+40>
0x000000000040052e <foo@plt+6>: pushq $0x2
0x0000000000400533 <foo#plt+11>: jmpq 0x4004f8
(gdb) disas 0x4004f8
No function contains specified address.
I know 0x4004f8 is the entry point of the procedure linkage table, but why can't I disas it?
disas with one address needs to find the function that contains the address, in order to know how much to disassemble. Here no symbol covers 0x4004f8, so it gives up.
Use either disas with two arguments (a start and an end address, e.g. disas 0x4004f8,+16), or x/i.
Also see:
How can I force GDB to disassemble?